National Fraud Prevention Challenge
Catching Money Mules
ML-powered detection of fraudulent intermediary accounts across 160K accounts and 400M+ transactions
Divya Mohan & Kumkum Thakur · Team dmj.one · github.com/divyamohan1993/nfpc-mule-detection
Money Mules Are Invisible
200 Lakh Cr
processed through Indian payment systems in FY24 alone. Every undetected mule account weakens the integrity of UPI, NEFT, and RTGS.
Banks flag accounts after fraud. Rule-based systems catch only the obvious.
Detect Before the Damage
0.968 AUC
Our 3-model ensemble (LightGBM + XGBoost + CatBoost) detects mule accounts with 0.968 Public AUC-ROC across 160K accounts. 208 features, rank averaging, multi-seed CV. Phase 1 achieved 0.985 OOF AUC on 24K accounts.
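Rank averaging can be sketched in a few lines. This is a minimal illustration, not the team's actual code: each model's scores are converted to ranks scaled to [0, 1], and the ranks are averaged, so models with differently calibrated outputs contribute equally. The function names and the sample scores below are hypothetical.

```python
def rank_transform(scores):
    """Map scores to 0-based average ranks scaled to [0, 1]; ties share a rank."""
    n = len(scores)
    order = sorted(range(n), key=lambda i: scores[i])
    ranks = [0.0] * n
    i = 0
    while i < n:
        # Find the end of the current group of tied scores.
        j = i
        while j + 1 < n and scores[order[j + 1]] == scores[order[i]]:
            j += 1
        mean_rank = (i + j) / 2.0
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    denom = max(n - 1, 1)
    return [r / denom for r in ranks]


def rank_average(model_scores):
    """Average the scaled ranks of several models, account by account."""
    ranked = [rank_transform(s) for s in model_scores]
    return [sum(col) / len(ranked) for col in zip(*ranked)]


# Hypothetical per-account scores from three models (illustrative values only).
lgbm = [0.10, 0.80, 0.35, 0.90]
xgb = [0.02, 0.60, 0.30, 0.95]
cat = [0.20, 0.70, 0.10, 0.99]

blended = rank_average([lgbm, xgb, cat])
```

Because AUC-ROC depends only on the ordering of scores, averaging ranks rather than raw probabilities avoids one model dominating simply because its outputs sit on a wider scale.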
The Pipeline
400M+ Transactions Ingested
208 Features Engineered
3 Models Ensembled
0.968 Public AUC-ROC
4 feature passes: transaction core, extended patterns, static account metadata, and graph/network metrics (PageRank, Louvain, betweenness centrality).
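As an illustration of the graph pass, here is a minimal sketch of one of the named metrics, PageRank, computed by power iteration over a directed transfer graph. It assumes edges are (sender, receiver) account pairs; the function name, the damping value, and the toy accounts are hypothetical, and the project may well rely on a graph library instead.

```python
from collections import defaultdict


def pagerank(edges, damping=0.85, iterations=50):
    """Power-iteration PageRank over directed (sender, receiver) edges."""
    nodes = sorted({account for edge in edges for account in edge})
    out_links = defaultdict(set)
    for src, dst in edges:
        out_links[src].add(dst)

    n = len(nodes)
    rank = {node: 1.0 / n for node in nodes}
    for _ in range(iterations):
        # Accounts with no outgoing transfers spread their mass uniformly.
        dangling = sum(rank[v] for v in nodes if not out_links[v])
        base = (1.0 - damping) / n + damping * dangling / n
        new_rank = {v: base for v in nodes}
        for src in nodes:
            targets = out_links[src]
            if targets:
                share = damping * rank[src] / len(targets)
                for dst in targets:
                    new_rank[dst] += share
        rank = new_rank
    return rank


# Toy graph (hypothetical): three senders funnel money into M, which forwards to X.
transfers = [("A", "M"), ("B", "M"), ("C", "M"), ("M", "X")]
scores = pagerank(transfers)
```

In this toy graph the intermediary M scores higher than any of the senders, which is the kind of signal a per-account PageRank feature is meant to capture.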
Who Benefits
Banks & NBFCs
Pre-flag suspicious accounts before losses
Regulators (RBI/FIU-IND)
Strengthen STR framework, PMLA compliance
Payment Processors
Risk scoring for UPI/NEFT/RTGS flows
Insurance/Fintech
KYC enhancement, onboarding risk
API-based risk scoring — per-account inference, batch or real-time.
Reaching Users
1. RBIH network: direct access to Indian banking ecosystem
2. Open-source model + paper: academic validation, community trust
3. Regulatory sandbox: pilot with 2-3 banks under RBI oversight
4. Integration SDKs: drop-in for existing AML/KYC platforms
Landscape
| Approach | Accuracy (AUC-ROC) | Patterns Covered | Explainable |
|---|---|---|---|
| Rule-based (threshold) | Low | 2-3 | Yes |
| Single ML model | ~0.95 | Implicit | Partial |
| Ours (3-model ensemble) | 0.968 | 12/12 | Full SHAP |
Structural advantage: interpretability + accuracy. Not a black box.
Built By
DM: Divya Mohan
KT: Kumkum Thakur
CO: Claude Opus, AI Co-Architect & Build Partner (Anthropic)
Team dmj.one · RBIH x IIT Delhi TRYST 2025
Real Numbers
0.968
Public AUC-ROC
208
Features
160K
Accounts
400M+
Transactions
3x5x3
Model x Fold x Seed
6/7
Red Herrings Avoided
5-fold stratified cross-validation. All metrics out-of-fold. No leakage.
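The validation scheme can be sketched as follows. This is a minimal illustration under stated assumptions, not the project's code: folds are assigned per class so each fold keeps roughly the same mule ratio, every account is scored only by a model that never saw it in training, and predictions are averaged across seeds. With 3 models, 5 folds, and 3 seeds, this loop would run 45 fits in total. `mean_rate_model` is a hypothetical stand-in for a real learner.

```python
import random


def stratified_folds(labels, n_folds=5, seed=0):
    """Assign each index to a fold so every fold keeps a similar class ratio."""
    rng = random.Random(seed)
    fold_of = [0] * len(labels)
    for cls in sorted(set(labels)):
        idx = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(idx)
        for pos, i in enumerate(idx):
            fold_of[i] = pos % n_folds
    return fold_of


def oof_predictions(features, labels, fit_predict, n_folds=5, seeds=(0, 1, 2)):
    """Out-of-fold scores averaged over seeds; no row is scored by its own model."""
    n = len(labels)
    oof = [0.0] * n
    for seed in seeds:
        fold_of = stratified_folds(labels, n_folds, seed)
        for fold in range(n_folds):
            train = [i for i in range(n) if fold_of[i] != fold]
            valid = [i for i in range(n) if fold_of[i] == fold]
            preds = fit_predict(
                [features[i] for i in train],
                [labels[i] for i in train],
                [features[i] for i in valid],
                seed,
            )
            for i, p in zip(valid, preds):
                oof[i] += p / len(seeds)
    return oof


def mean_rate_model(train_x, train_y, valid_x, seed):
    """Hypothetical placeholder learner: predicts the training positive rate."""
    rate = sum(train_y) / len(train_y)
    return [rate] * len(valid_x)


# Synthetic data (illustrative only): 10 positives among 100 accounts.
labels = [1] * 10 + [0] * 90
features = [[float(i)] for i in range(100)]
folds = stratified_folds(labels, n_folds=5, seed=0)
oof = oof_predictions(features, labels, mean_rate_model)
```

Swapping `mean_rate_model` for a real gradient-boosting fit, and repeating the loop once per model, gives one out-of-fold score vector per model that can then be blended.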
What's Next
✓ Phase 1: EDA + 125-feature pipeline + 0.985 AUC
✓ Phase 2: 208-feature pipeline + 3-model ensemble + 0.968 AUC
✓ Label cleaning, red herring analysis, temporal windows
✓ Showcase deployed at nfpc.dmj.one
2. Real-time feature computation (<200 ms)
3. Regulatory sandbox pilot with partner bank