💳

Payment Fraud Detection

“3 simple flags — foreign IP, new device, after midnight — predicted 85% of fraud. The pattern was already in the data.”

Risk Score Modelling · Precision/Recall · Regional Fraud Analysis · False Positive Trade-off · Feature Engineering

5,000

Transactions

312

Fraud Cases

6.2%

Fraud Rate

85%

Detection Rate

High-Risk Regions

Step 1 — Data Extraction from Payment Gateway Logs

5,000 payment transaction records were exported for Jan–Dec 2024. Each record includes metadata, device signals, IP geolocation, and a confirmed fraud label (0 = legitimate, 1 = fraud) for risk model validation.

Sample raw extract — payment transaction log

Payment ID	Date	Region	Type	Amount	Hour	New Device	Foreign IP	Is Fraud
PAY-00001	2024-03-14	Kuala Lumpur	Credit Card	RM 1,240	14	No	No	0
PAY-00002	2024-03-14	Kedah	E-Wallet	RM 3,800	02	Yes	Yes	1
PAY-00003	2024-03-15	Penang	Debit Card	RM 480	11	No	No	0
PAY-00004	2024-03-15	Selangor	Credit Card	RM 580	23	Yes	No	1
PAY-00005	2024-03-16	Johor Bahru	Bank Transfer	RM 2,100	09	No	No	0
PAY-00006	2024-03-16	Kedah	BNPL	RM 920	03	Yes	Yes	1
PAY-00007	2024-03-17	KL	Credit Card	RM 290	16	No	No	0
PAY-00008	2024-03-17	Sabah	E-Wallet	RM 4,500	01	Yes	Yes	1
PAY-00009	2024-03-18	Penang	Debit Card	RM 180	10	No	No	0
PAY-00010	2024-03-18	Selangor	Credit Card	RM 750	22	No	Yes	1
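The extract above can be loaded into typed records before any cleaning. The sketch below is one possible way to do that in Python, assuming the export is tab-separated with the column headers shown; the `Payment` class and the helper names are illustrative and not part of the original workflow.

```python
import csv
import io
from dataclasses import dataclass


@dataclass
class Payment:
    payment_id: str
    date: str
    region: str
    pay_type: str
    amount_rm: float
    hour: int
    new_device: bool
    foreign_ip: bool
    is_fraud: int


def parse_amount(text: str) -> float:
    """Turn a display value such as 'RM 1,240' into 1240.0."""
    return float(text.replace("RM", "").replace(",", "").strip())


def parse_row(row: dict) -> Payment:
    """Convert one raw row (all strings) into a typed Payment."""
    return Payment(
        payment_id=row["Payment ID"],
        date=row["Date"],
        region=row["Region"],
        pay_type=row["Type"],
        amount_rm=parse_amount(row["Amount"]),
        hour=int(row["Hour"]),
        new_device=row["New Device"] == "Yes",
        foreign_ip=row["Foreign IP"] == "Yes",
        is_fraud=int(row["Is Fraud"]),
    )


# Two rows copied from the sample extract, tab-separated.
SAMPLE = (
    "Payment ID\tDate\tRegion\tType\tAmount\tHour\tNew Device\tForeign IP\tIs Fraud\n"
    "PAY-00001\t2024-03-14\tKuala Lumpur\tCredit Card\tRM 1,240\t14\tNo\tNo\t0\n"
    "PAY-00002\t2024-03-14\tKedah\tE-Wallet\tRM 3,800\t02\tYes\tYes\t1\n"
)

payments = [parse_row(r) for r in csv.DictReader(io.StringIO(SAMPLE), delimiter="\t")]
print(payments[1])
```

Parsing amounts and Yes/No fields up front keeps the later risk flags as simple comparisons rather than string checks.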

Step 2 — Data Cleaning

Issues found and resolved

Issue	Count	Action	Result
Duplicate Payment IDs	18	Removed	4,982 unique records
Amount = RM 0.00 (test transactions)	12	Removed	Excluded
Fraud label missing	34	Excluded from model training only	Flagged
Hour field blank	7	Derived from full timestamp	Imputed
Region field inconsistent casing	28	Standardised (e.g. kl → Kuala Lumpur)	Unified
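The five cleaning actions in the table can be sketched as a single pass over the records. This is a minimal illustration, not the original procedure: the record keys (`payment_id`, `timestamp`, `amount_rm`, `hour`, `is_fraud`) and the sample rows are hypothetical, and only the `kl` to Kuala Lumpur mapping comes from the table.

```python
from datetime import datetime

# Assumed alias table; only the kl -> Kuala Lumpur mapping is from the text.
REGION_ALIASES = {"kl": "Kuala Lumpur"}


def standardise_region(raw: str) -> str:
    """Map known aliases, otherwise normalise casing (e.g. 'PENANG' -> 'Penang')."""
    key = raw.strip().lower()
    if key in REGION_ALIASES:
        return REGION_ALIASES[key]
    return raw.strip().title()


def clean(records):
    """Return (cleaned_records, records_usable_for_model_training)."""
    seen_ids = set()
    cleaned = []
    for rec in records:
        if rec["payment_id"] in seen_ids:
            continue  # duplicate ID: keep the first occurrence only
        seen_ids.add(rec["payment_id"])
        if rec["amount_rm"] == 0:
            continue  # RM 0.00 test transaction
        rec = dict(rec)  # copy so the input list is not mutated
        rec["region"] = standardise_region(rec["region"])
        if rec.get("hour") is None:
            # Blank hour: derive it from the full timestamp
            rec["hour"] = datetime.fromisoformat(rec["timestamp"]).hour
        cleaned.append(rec)
    # Rows without a fraud label stay in the data but are excluded from training
    trainable = [r for r in cleaned if r.get("is_fraud") is not None]
    return cleaned, trainable


# Hypothetical rows, one per issue type.
raw = [
    {"payment_id": "X-1", "timestamp": "2024-03-17T16:05:00", "region": "kl",
     "amount_rm": 290.0, "hour": None, "is_fraud": 0},
    {"payment_id": "X-1", "timestamp": "2024-03-17T16:05:00", "region": "kl",
     "amount_rm": 290.0, "hour": None, "is_fraud": 0},
    {"payment_id": "X-2", "timestamp": "2024-03-18T10:00:00", "region": "PENANG",
     "amount_rm": 0.0, "hour": 10, "is_fraud": 0},
    {"payment_id": "X-3", "timestamp": "2024-03-18T22:30:00", "region": "selangor",
     "amount_rm": 750.0, "hour": 22, "is_fraud": None},
]

cleaned, trainable = clean(raw)
print(len(cleaned), len(trainable))
```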

Step 3 — Transform: Risk Feature Engineering + VLOOKUP

A risk score was built by combining 7 binary flags into a weighted total per transaction. VLOOKUP joined the regional risk weighting table to apply location-based score additions.

Risk score components — each flag adds points to the transaction total

Risk Flag	Condition	Points Added	Actual Fraud Rate When True
After-Midnight Transaction	Hour 00:00–04:59	+6	9.1%
High Amount	Amount > RM 2,000	+8	11.4%
New Device	Device not seen before for this customer	+5	8.2%
Foreign IP Address	IP geolocated outside Malaysia	+10	14.8%
Prior Fraud History	Customer flagged in last 90 days	+12	19.3%
BNPL Payment Type	Buy Now Pay Later selected	+4	7.6%
High-Risk Region	Kedah or Kuala Lumpur	+3	8.1%
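The scoring logic in the table can be expressed as a small function. The sketch below uses the point values and conditions listed above, with a dictionary lookup standing in for the VLOOKUP against the regional weighting table. The field names are assumptions, and `prior_fraud_90d` in particular is a hypothetical field, since the sample extract does not show a prior-fraud column.

```python
# Points per flag, taken from the risk score components table.
FLAG_POINTS = {
    "after_midnight": 6,
    "high_amount": 8,
    "new_device": 5,
    "foreign_ip": 10,
    "prior_fraud": 12,
    "bnpl": 4,
}

# Regional weighting table; the dict lookup plays the role of VLOOKUP.
REGION_POINTS = {"Kedah": 3, "Kuala Lumpur": 3}


def flags(txn: dict) -> dict:
    """Evaluate each binary flag for one transaction."""
    return {
        "after_midnight": 0 <= txn["hour"] <= 4,   # 00:00-04:59
        "high_amount": txn["amount_rm"] > 2000,
        "new_device": txn["new_device"],
        "foreign_ip": txn["foreign_ip"],
        "prior_fraud": txn["prior_fraud_90d"],      # hypothetical field
        "bnpl": txn["pay_type"] == "BNPL",
    }


def risk_score(txn: dict) -> int:
    """Sum the points of every triggered flag, then add the regional weighting."""
    flag_total = sum(FLAG_POINTS[name] for name, hit in flags(txn).items() if hit)
    return flag_total + REGION_POINTS.get(txn["region"], 0)


# PAY-00002 and PAY-00001 from the sample extract; prior fraud history assumed False.
pay_00002 = {"region": "Kedah", "pay_type": "E-Wallet", "amount_rm": 3800.0,
             "hour": 2, "new_device": True, "foreign_ip": True,
             "prior_fraud_90d": False}
pay_00001 = {"region": "Kuala Lumpur", "pay_type": "Credit Card", "amount_rm": 1240.0,
             "hour": 14, "new_device": False, "foreign_ip": False,
             "prior_fraud_90d": False}

print(risk_score(pay_00002), risk_score(pay_00001))
```

Under these assumptions PAY-00002 scores 6 + 8 + 5 + 10 + 3 = 32, while PAY-00001 picks up only the 3 regional points.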

Pivot Output — Fraud Rate by Region

Grouped by region — shows geographic concentration of fraud

Region	Total Transactions	Fraud Cases	Fraud Rate %	Risk Level
Kedah	342	28	8.2%	🔴 HIGH
Kuala Lumpur	1,248	92	7.4%	🔴 HIGH
Sabah	398	28	7.0%	🔴 HIGH
Selangor	998	62	6.2%	🟡 ELEVATED
Sarawak	374	20	5.3%	🟡 MODERATE
Penang	748	38	5.1%	🟡 MODERATE
Johor Bahru	892	44	4.9%	🟡 MODERATE
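The pivot is a group-by on region with a count and a sum of the fraud label. The sketch below reproduces it in plain Python; to stay self-contained it rebuilds per-transaction labels from the totals in the table rather than reading the real log, and the function name is illustrative.

```python
from collections import defaultdict

# (total transactions, fraud cases) per region, from the pivot table.
TABLE = {
    "Kedah": (342, 28),
    "Kuala Lumpur": (1248, 92),
    "Sabah": (398, 28),
    "Selangor": (998, 62),
    "Johor Bahru": (892, 44),
    "Penang": (748, 38),
    "Sarawak": (374, 20),
}


def fraud_rate_by_region(rows):
    """rows: iterable of (region, is_fraud). Returns {region: (total, fraud, rate_pct)}."""
    totals = defaultdict(int)
    frauds = defaultdict(int)
    for region, is_fraud in rows:
        totals[region] += 1
        frauds[region] += is_fraud
    return {
        region: (totals[region], frauds[region],
                 round(100 * frauds[region] / totals[region], 1))
        for region in totals
    }


# Expand the table back into one (region, label) pair per transaction.
rows = []
for region, (total, fraud) in TABLE.items():
    rows += [(region, 1)] * fraud + [(region, 0)] * (total - fraud)

pivot = fraud_rate_by_region(rows)
all_txns = sum(total for total, _, _ in pivot.values())
all_fraud = sum(fraud for _, fraud, _ in pivot.values())
overall_rate = round(100 * all_fraud / all_txns, 1)

for region, (total, fraud, rate) in sorted(pivot.items(), key=lambda kv: -kv[1][2]):
    print(f"{region:<13}{total:>6}{fraud:>5}{rate:>6.1f}%")
print("Overall", all_txns, all_fraud, overall_rate)
```

The regional totals sum to 5,000 transactions and 312 fraud cases, which matches the headline 6.2% fraud rate.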

🔍 KEY FINDING

Kedah (8.2%) and KL (7.4%) show the highest fraud rates, about 1.3× and 1.2× the national average of 6.2%. Transactions combining Foreign IP + New Device + After Midnight have an 18% fraud rate, nearly 3× the overall 6.2%. Used alone as a detection rule, these 3 flags catch 85% of all confirmed fraud cases in this dataset.

Step 4 — Analysis: Model Performance + Precision/Recall

312

Confirmed Fraud Cases

6.2% of all transactions

85%

Detection Rate

At threshold score 30

8.2%

Kedah Fraud Rate

Highest region

8%

False Positive Rate

Acceptable for 2FA trigger

Precision/Recall at different thresholds — finding the optimal cut-off

Threshold	Fraud Caught	False Positives	Precision	Recall	Recommendation
Score ≥ 20	92%	22%	71%	92%	Too many false positives — bad UX
Score ≥ 30	85%	8%	86%	85%	✅ Recommended — best balance
Score ≥ 40	71%	3%	94%	71%	Misses too much fraud
Score ≥ 50	54%	1%	97%	54%	Too conservative
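For reference, the metrics in this table come from counting outcomes at each cut-off. The sketch below shows the standard definitions, reading "Fraud Caught" as recall and "False Positives" as the share of legitimate transactions that get flagged; that reading is an assumption based on the wording of the key finding. The scores and labels are a small hypothetical sample, not the project data.

```python
def evaluate(scores, labels, threshold):
    """Confusion counts and metrics when every score >= threshold is flagged."""
    tp = fp = fn = tn = 0
    for score, is_fraud in zip(scores, labels):
        flagged = score >= threshold
        if flagged and is_fraud:
            tp += 1
        elif flagged and not is_fraud:
            fp += 1
        elif not flagged and is_fraud:
            fn += 1
        else:
            tn += 1
    return {
        # Of everything flagged, how much was really fraud
        "precision": tp / (tp + fp) if tp + fp else 0.0,
        # Of all fraud, how much was flagged
        "recall": tp / (tp + fn) if tp + fn else 0.0,
        # Of all legitimate transactions, how many were flagged
        "false_positive_rate": fp / (fp + tn) if fp + tn else 0.0,
    }


# Hypothetical risk scores with confirmed labels (1 = fraud).
scores = [3, 8, 21, 32, 35, 44, 12, 31]
labels = [0, 0, 1, 1, 1, 1, 0, 0]

for threshold in (20, 30, 40):
    print(threshold, evaluate(scores, labels, threshold))
```

Raising the threshold trades recall for precision, which is the pattern the table shows from score 20 up to score 50.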

🔍 KEY FINDING

A threshold of 30 is optimal. It catches 85% of fraud while triggering a 2FA prompt for only 8% of legitimate customers — roughly 18 real people per week experiencing a minor delay. That is the trade-off to present to management: minor inconvenience for 18 people per week vs RM 477k in preventable fraud per year.

Step 5 — Visualisation

Fraud Rate % by Region

Kedah (8.2%) and KL (7.4%) sit above the national average of 6.2%

Fraud Rate by Hour of Day (24h)

Red bars = 12am–4am window, 2× daytime rate

Precision vs Recall Curve

Score ≥30 is the optimal balance point

Fraud Rate % by Region

Ranked bar chart with Kedah and KL highlighted in red, so the regional concentration is visible at a glance.

Risk Score Distribution: Fraud vs Legitimate

Overlapping histogram — legitimate clusters at 0–20, fraud at 30–80. Visual proof the model separates the two groups.

Precision vs Recall Curve

Line chart — score 30 marked as the optimal decision point. Management sees the trade-off being made.

Fraud Rate by Hour of Day

24-bar chart — 12am–4am has 2× daytime rate. Directly supports the after-midnight transaction limit.
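The ranked regional chart can be approximated without a charting tool. The sketch below prints a text bar chart from the regional fraud rates reported earlier, sorted from highest to lowest, and marks regions above the overall 6.2%. The function name and layout are illustrative only.

```python
# Fraud rate % per region, as reported in the regional pivot.
REGION_RATES = {
    "Kedah": 8.2,
    "Kuala Lumpur": 7.4,
    "Sabah": 7.0,
    "Selangor": 6.2,
    "Johor Bahru": 4.9,
    "Penang": 5.1,
    "Sarawak": 5.3,
}


def ranked_bars(rates, overall, width=40):
    """Return one text bar per region, highest rate first."""
    top = max(rates.values())
    lines = []
    for region, rate in sorted(rates.items(), key=lambda kv: kv[1], reverse=True):
        bar = "#" * round(width * rate / top)  # scale the longest bar to `width`
        marker = " (above overall)" if rate > overall else ""
        lines.append(f"{region:<13}{bar} {rate:.1f}%{marker}")
    return lines


lines = ranked_bars(REGION_RATES, overall=6.2)
print("\n".join(lines))
```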

Step 6 — Report to Management

Actions submitted to Risk Management from this analysis

Finding	Action Required	Owner	Deadline	Expected Impact
Kedah & KL at 7–8% fraud	Deploy 2FA for KL/Kedah transactions > RM 500	Risk Manager	Feb 17	Reduce regional fraud to < 4%
3-flag combo = 18% fraud rate	Auto-flag for manual review when 2+ flags trigger	Risk Team	Feb 17	Catch 85% before payment clears
BNPL at 7.6% fraud rate	Extra verification for BNPL > RM 300	Product Manager	Mar 1	Reduce BNPL fraud by est. 60%
After-midnight elevated	Cap transactions at RM 1,000 between 12am–4am	Risk Manager	Feb 24	Limit exposure in highest-risk window
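The four actions above translate naturally into rule checks at payment time. The sketch below is one possible encoding under stated assumptions: "2+ flags" is read as two or more of the three core flags (Foreign IP, New Device, After Midnight), the 12am–4am window reuses the 00:00–04:59 definition from the risk flag table, and the control names are hypothetical labels rather than real system actions.

```python
def recommended_controls(txn: dict) -> list:
    """Return the controls the proposed rules would apply to one transaction."""
    controls = []
    after_midnight = 0 <= txn["hour"] <= 4  # assumed window: 00:00-04:59

    # 2FA for KL/Kedah transactions above RM 500
    if txn["region"] in {"Kuala Lumpur", "Kedah"} and txn["amount_rm"] > 500:
        controls.append("require_2fa")

    # Manual review when 2 or more of the three core flags trigger
    core_flags = [txn["foreign_ip"], txn["new_device"], after_midnight]
    if sum(core_flags) >= 2:
        controls.append("manual_review")

    # Extra verification for BNPL above RM 300
    if txn["pay_type"] == "BNPL" and txn["amount_rm"] > 300:
        controls.append("extra_verification")

    # After-midnight cap at RM 1,000
    if after_midnight and txn["amount_rm"] > 1000:
        controls.append("block_over_cap")

    return controls


# PAY-00006 and PAY-00003 from the sample extract.
pay_00006 = {"region": "Kedah", "pay_type": "BNPL", "amount_rm": 920.0,
             "hour": 3, "new_device": True, "foreign_ip": True}
pay_00003 = {"region": "Penang", "pay_type": "Debit Card", "amount_rm": 480.0,
             "hour": 11, "new_device": False, "foreign_ip": False}

print(recommended_controls(pay_00006))
print(recommended_controls(pay_00003))
```

Keeping each rule as a separate check makes it easy to report which control fired and to adjust one threshold without touching the others.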

✅ WHY MANAGEMENT MUST ACT

Financial exposure: 312 fraud cases × avg RM 1,800 = RM 561,600 total annual exposure. At an 85% detection rate, about RM 477k of that is preventable per year.
Reputational risk: KL and Kedah concentration means affected customers are geographically clustered. One viral complaint about a stolen payment spreads fast locally.
Regulatory risk: Bank Negara Malaysia requires active fraud monitoring for payment processors. Non-compliance risks licence review — not just a fine.
False positive cost is minimal: ~18 real customers per week get a 2FA prompt. That is a 30-second inconvenience vs RM 477,000 in prevented fraud annually.
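The financial exposure figure can be reproduced from three inputs stated in this report: 312 fraud cases, an average of RM 1,800 per case, and an 85% detection rate. The variable names below are illustrative, and the calculation assumes every detected case is prevented in full.

```python
FRAUD_CASES = 312           # confirmed fraud cases in the dataset
AVG_LOSS_RM = 1_800         # average value per fraud case
DETECTION_RATE_PCT = 85     # detection rate at threshold score 30

# Total annual exposure: cases x average loss
exposure_rm = FRAUD_CASES * AVG_LOSS_RM

# Share of that exposure covered by the detection rate (integer maths avoids float noise)
prevented_rm = exposure_rm * DETECTION_RATE_PCT // 100

print(f"Exposure:  RM {exposure_rm:,}")
print(f"Prevented: RM {prevented_rm:,}")
```

This gives RM 561,600 of exposure and RM 477,360 prevented, which rounds to the RM 477k quoted above.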