
Feb 28 – Mar 11, 2026 · 11 days

How We Built Varaksha in a Sprint

Two workstreams converging on a single system. Security in pink, ML in blue, shared decisions in gradient.

Security Expert
ML Expert
Together

01

The Problem

India's Unified Payments Interface processes over 500 million transactions daily. Legacy fraud detection operates on batch cycles, introducing delays that allow mule networks to execute and disperse before a single alert is raised. Real-time classification at the transaction layer is a structural necessity, not an optimisation.

02

The Architecture

Varaksha is a five-layer detection pipeline: a Rust privacy gateway that hashes identifiers at ingress, a Random Forest ML engine trained on a 111K-row transaction dataset, a graph topology analyser for network-pattern fraud, a multilingual alert agent covering 8 Indian languages (with all 22 scheduled languages planned), and a real-time operations dashboard.

03

The Outcome

85.24% detection accuracy. ROC-AUC 0.9546. Recall 0.9229: roughly 92 of every 100 fraudulent transactions are caught. Sub-5ms P99 gateway latency. Four BIS money-mule typologies detected autonomously. Fraud alerts in 8 Indian languages, with legal citations embedded. Built, trained, and deployed to a global edge network in 11 days.

Together
Feb 28

Defining the Architecture

Demonstrability is a first-class design constraint.

Five-layer architecture scoped: Rust privacy gateway, ML classifier, graph topology analyser, multilingual alert agent, and ops dashboard. The system was designed to be comprehensible in under a minute, because it would be evaluated under time pressure.

5-Layer Design · Architecture · Core Pipeline
Security
Mar 1

Privacy Gateway in Rust

Sensitive identifiers must not persist beyond the perimeter. Everything downstream operates on hashes.

The Actix-Web 4 gateway is the sole component that handles raw Virtual Payment Addresses (VPAs). SHA-256 hashing is applied at ingress, so all downstream services receive only derived identifiers. DashMap provides a sharded concurrent risk cache shared across the Actix worker pool (it uses fine-grained per-shard locks rather than being strictly lock-free), and the score_to_verdict() threshold logic maps each score to an ALLOW, FLAG, or BLOCK verdict.

Rust · SHA-256 · DashMap · Actix-Web 4
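The hash-at-ingress and verdict steps described in this entry can be sketched as follows. This is a minimal illustration in Python, not the Rust gateway code: the two thresholds, the lowercase/trim normalisation, and the sample VPA are assumptions, and a plain dict stands in for the DashMap cache.

```python
import hashlib
from enum import Enum


class Verdict(Enum):
    ALLOW = "ALLOW"
    FLAG = "FLAG"
    BLOCK = "BLOCK"


# Hypothetical thresholds; the source does not state the real cut-offs.
FLAG_THRESHOLD = 0.5
BLOCK_THRESHOLD = 0.8


def hash_vpa(vpa: str) -> str:
    """Return the SHA-256 hex digest of a VPA (normalisation is an assumption)."""
    normalised = vpa.strip().lower()
    return hashlib.sha256(normalised.encode("utf-8")).hexdigest()


def score_to_verdict(score: float) -> Verdict:
    """Map a risk score in [0, 1] to a verdict using the assumed thresholds."""
    if score >= BLOCK_THRESHOLD:
        return Verdict.BLOCK
    if score >= FLAG_THRESHOLD:
        return Verdict.FLAG
    return Verdict.ALLOW


# In-memory stand-in for the risk cache, keyed by hashed VPA only.
risk_cache: dict[str, float] = {}
key = hash_vpa("alice@examplebank")  # hypothetical VPA
risk_cache[key] = 0.91
print(score_to_verdict(risk_cache[key]).value)
```

Only the digest is stored or passed on, so anything reading the cache sees a 64-character hex key and never the raw address.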
ML
Mar 1

ML Baseline Established

A working baseline yields insights that an unimplemented optimal architecture cannot.

A Random Forest + XGBoost soft-voting ensemble was trained on transaction velocity, a round-amount flag, network out-degree, and a time-of-day encoding, using a stratified 50K-row PaySim sample with SMOTE rebalancing. This set the reference point for subsequent iterations.

RF + XGBoost · SMOTE · PaySim
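Three of the baseline features named above could be derived roughly as shown here. The exact definitions are not given in the source, so the 100-rupee step, the sine/cosine form of the time-of-day encoding, and the one-hour velocity window are all assumptions made for illustration.

```python
import math
from datetime import datetime


def round_amount_flag(amount: float, step: int = 100) -> int:
    """1 if the amount is a whole multiple of `step` rupees, else 0."""
    return int(amount > 0 and amount == int(amount) and int(amount) % step == 0)


def time_of_day_encoding(ts: datetime) -> tuple[float, float]:
    """Cyclical (sin, cos) encoding so 23:59 and 00:00 land close together."""
    seconds = ts.hour * 3600 + ts.minute * 60 + ts.second
    angle = 2 * math.pi * seconds / 86400
    return math.sin(angle), math.cos(angle)


def velocity(timestamps: list[datetime], now: datetime, window_s: int = 3600) -> int:
    """Count the sender's transactions in the `window_s` seconds up to `now`."""
    return sum(1 for t in timestamps if 0 <= (now - t).total_seconds() <= window_s)
```

A cyclical encoding is one common choice for hour-of-day because a raw 0 to 23 integer would place midnight and 11 pm at opposite ends of the scale.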
Security
Mar 2

Graph-Based Mule Detection

Network fan-out is a consistent topological signature across all known money-mule architectures.

A NetworkX graph agent runs asynchronously outside the payment critical path, detecting all four BIS Project Hertha mule typologies: fan-out, fan-in, directed cycles, and scatter patterns. Score aggregation takes the maximum across detected patterns rather than their sum, so several weak signals on a legitimate high-volume merchant do not compound into a false positive. Results are pushed to the Rust risk cache via HMAC-SHA256-signed webhooks.

NetworkX · Fan-out · Directed Cycles · Async · HMAC-SHA256
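The scoring and webhook-signing steps of the graph agent might look like the sketch below. It uses plain Python sets instead of NetworkX, covers only the fan-out and fan-in patterns (cycle and scatter detection are omitted), and the degree cut-off, secret, and payload fields are hypothetical.

```python
import hashlib
import hmac
import json
from collections import defaultdict

# Hypothetical values: the source gives neither the degree cut-off nor the key.
DEGREE_CUTOFF = 10
WEBHOOK_SECRET = b"demo-secret-change-me"


def degree_scores(edges: list[tuple[str, str]]) -> dict[str, dict[str, float]]:
    """Per-node fan-out and fan-in scores in [0, 1] from distinct counterparties."""
    out_nbrs, in_nbrs = defaultdict(set), defaultdict(set)
    for sender, receiver in edges:
        out_nbrs[sender].add(receiver)
        in_nbrs[receiver].add(sender)
    nodes = set(out_nbrs) | set(in_nbrs)
    return {
        n: {
            "fan_out": min(len(out_nbrs[n]) / DEGREE_CUTOFF, 1.0),
            "fan_in": min(len(in_nbrs[n]) / DEGREE_CUTOFF, 1.0),
        }
        for n in nodes
    }


def aggregate(pattern_scores: dict[str, float]) -> float:
    """Take the maximum pattern score rather than the sum."""
    return max(pattern_scores.values(), default=0.0)


def sign_payload(payload: dict) -> tuple[bytes, str]:
    """Serialise deterministically and return (body, hex HMAC-SHA256 signature)."""
    body = json.dumps(payload, sort_keys=True, separators=(",", ":")).encode("utf-8")
    signature = hmac.new(WEBHOOK_SECRET, body, hashlib.sha256).hexdigest()
    return body, signature


def verify(body: bytes, signature: str) -> bool:
    """Constant-time check a receiver would run before updating its cache."""
    expected = hmac.new(WEBHOOK_SECRET, body, hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, signature)
```

Signing the exact bytes that are sent, and comparing with `hmac.compare_digest`, lets the receiving side reject any score update whose body was altered in transit.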
Security
Mar 2–3

Multilingual Alert Delivery

A fraud alert has no utility if the recipient cannot read the language in which it is issued.

Alerts are synthesised in 8 Indian languages via Microsoft Neural TTS (edge-tts). Each alert embeds the transaction ID, blocked amount, and risk score in the recipient’s preferred language. BLOCK verdicts cite IT Act 2000 §66D and BNS §318(4) verbatim; the template engine can be swapped for IndicTrans2 in production.

8 languages · Neural TTS · IT Act 2000 §66D · edge-tts
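A template-based alert of the kind described here could be assembled as in this sketch. The template wording, the language codes, and the English fallback are assumptions, and only two languages are shown; the resulting text is what would then be handed to the TTS step.

```python
# Hypothetical templates; the project's actual wording is not given in the source.
TEMPLATES = {
    "en": "Transaction {txn_id} was blocked. Amount: Rs {amount:.2f}. Risk score: {score:.2f}.",
    "hi": "लेनदेन {txn_id} रोक दिया गया। राशि: ₹{amount:.2f}। जोखिम स्कोर: {score:.2f}।",
}

# Statutes named in the source for BLOCK verdicts.
LEGAL_CITATION = "IT Act 2000 §66D; BNS §318(4)"


def render_block_alert(lang: str, txn_id: str, amount: float, score: float) -> str:
    """Fill the recipient's language template and append the legal citation."""
    template = TEMPLATES.get(lang, TEMPLATES["en"])  # assumed fallback to English
    text = template.format(txn_id=txn_id, amount=amount, score=score)
    return f"{text} {LEGAL_CITATION}"


print(render_block_alert("hi", "TXN123", 4999.0, 0.93))
```

Keeping rendering behind one function is what makes a later swap to a translation model a local change.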
Together
Mar 3

Integration Proof-of-Concept

End-to-end verdicts validated — from Rust ingress to multilingual alert.

A live operations dashboard confirmed verdicts flowing through all five layers: transaction ingress, hashing, ML scoring, graph analysis, and multilingual alert dispatch. It includes a force-directed network visualisation, a Hindi alert panel, and a 50-event audit log. All data is synthetic; no real PII is processed.

5-Layer Pipeline · Live Dashboard · Audit Log · Synthetic
ML
Mar 5–7

Model Architecture Overhaul

At 450 MB combined, the ensemble consumed nearly the entire memory budget for a sub-0.005 accuracy gain.

XGBoost was removed from the serving stack: RF-300 achieves ROC-AUC 0.9869 in isolation, and the marginal ensemble gain was insufficient to justify the 450 MB combined weight. Feature engineering expanded from 8 to 16 variables, incorporating balance_drain_ratio, account_age_days, previous_failed_attempts, and transfer_cashout_flag. The output artefact became varaksha_rf_model.onnx.

RF-300 only · 16 features · 75K rows · ONNX · ROC-AUC 0.9869
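Two of the added features can be illustrated with the sketch below. The source names the features but does not define them, so both formulas are assumptions: the ratio is taken as amount over pre-transaction balance, and the flag marks the PaySim TRANSFER and CASH_OUT transaction types.

```python
def balance_drain_ratio(amount: float, balance_before: float) -> float:
    """Share of the sender's pre-transaction balance moved, clipped to [0, 1]."""
    if balance_before <= 0:
        return 0.0
    return min(max(amount / balance_before, 0.0), 1.0)


def transfer_cashout_flag(txn_type: str) -> int:
    """1 for TRANSFER or CASH_OUT transactions, else 0."""
    return int(txn_type.upper() in {"TRANSFER", "CASH_OUT"})


# A transfer that empties 95% of the account scores high on both features.
print(balance_drain_ratio(9500.0, 10000.0), transfer_cashout_flag("transfer"))
```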
Together
Mar 9–10

Production Deployment

Static export to a global edge network eliminates cold starts and infrastructure overhead from the demonstration path entirely.

Next.js 15 is configured for static export and deployed to Cloudflare Pages, so no Node.js server runs behind the demonstration. The frontend ships three routes: a live stats landing page, an animated architecture walkthrough, and a real-time transaction feed with Security Arena and Cache Visualizer panels.

Next.js 15 · Cloudflare Pages · Static Export · framer-motion
ML
Mar 11 AM

Dataset Coverage Audit

Model timestamps revealed the training pipeline had never ingested the complete dataset.

Three missing dataset files discovered: supervised_dataset.csv, remaining_behavior_ext.csv, and ton-iot.csv. All loaders written, validated against schema, and integrated into the merge pipeline. 54,142 rows recovered.

Dataset Audit · 54K Rows · 3 Loaders
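A loader that validates against a schema before merging, as described in this entry, might be shaped like the sketch below. The required column names are hypothetical, since the source does not list the real schema, and the standard csv module is used in place of whatever the project's loaders use.

```python
import csv
import io

# Hypothetical target schema; the real column names are not given in the source.
REQUIRED_COLUMNS = {"amount", "is_fraud"}


def load_rows(handle, source: str) -> list[dict]:
    """Read one CSV, check required columns exist, and tag each row with its origin."""
    reader = csv.DictReader(handle)
    missing = REQUIRED_COLUMNS - set(reader.fieldnames or [])
    if missing:
        raise ValueError(f"{source}: missing columns {sorted(missing)}")
    rows = []
    for raw in reader:
        rows.append({
            "amount": float(raw["amount"]),
            "is_fraud": int(raw["is_fraud"]),
            "source": source,
        })
    return rows


def merge(*batches: list[dict]) -> list[dict]:
    """Concatenate validated batches into one training table."""
    return [row for batch in batches for row in batch]


sample = io.StringIO("amount,is_fraud\n100.0,0\n250.5,1\n")
print(len(merge(load_rows(sample, "supervised_dataset.csv"))))
```

Tagging each row with its source file makes a later coverage audit a simple count per source, which is one way a gap like the one found here could be caught earlier.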
ML
Mar 11 PM

85.24%

Retraining on the complete leakage-corrected dataset: 85.24% accuracy, ROC-AUC 0.9546.

The expanded 111,499-row dataset rebalanced by SMOTE to 51,735/51,735 yielded: RF Accuracy 85.24%, ROC-AUC 0.9546, Precision 0.7709, Recall 0.9229, F1 0.8401. Stale artefacts — lightgbm, xgboost, voting ensemble — were removed from the repository.

111K rows · 85.24% accuracy · ROC-AUC 0.9546 · Artefact cleanup
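The reported F1 can be checked against the reported precision and recall, since F1 is their harmonic mean. The snippet below uses only the figures quoted in this entry; the function name is illustrative.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


precision, recall = 0.7709, 0.9229
print(round(f1_score(precision, recall), 4))  # 0.8401
```

Read together, the two inputs say that about 77 of every 100 alerts are true fraud while about 92 of every 100 fraud cases are caught, so the model leans toward recall.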
Together
Mar 11

Finalisation and Deployment

A deployable system is defined by finishing details—texture, colour, and interactive feedback.

Frontend polish: dot-grid body texture, surface-gradient card utility, amber token separated from saffron for distinct FLAG verdict rendering. Next.js static export deployed to Cloudflare Pages. Core pipeline hardened and ready for production integration.

Next.js 15 · Static Export · Polish · Production-Ready

On the Horizon

What We Build Next

Next steps: features we consciously set down to meet the deadline.

All 22 Scheduled Languages

Expand from 8 to all constitutionally scheduled Indian languages via IndicTrans2 — swap a single function call in agent03.

Accessibility

Mobile SDK Packaging

Package the ONNX inference layer as an Android / iOS SDK so PSPs can embed sub-1ms on-device scoring without a network call.

Distribution

On-Device Edge Inference

Ship varaksha_rf_model.onnx to handsets via ONNX Runtime Mobile. Scores computed locally — zero round-trip latency, works offline.

Performance

Streaming Graph Analytics

Replace batch NetworkX with Apache Flink or Kafka Streams so fan-out and cycle detection updates continuously as edges arrive.

Architecture

Live LLM Legal Summaries

Replace the mock LLM in agent03 with GPT-4o-mini or Groq to generate dynamic, context-aware legal citations per transaction.

AI

NPCI Consortium Risk Sharing

Federate anonymised risk scores across participating PSP banks via a shared NPCI registry — consortium intelligence without PII exposure.

Ecosystem

Automated Regulatory Reporting

Auto-generate FIU-IND Suspicious Transaction Reports for PMLA §3 triggers and maintain a DPDP Act 2023 audit trail per blocked VPA.

Compliance

Open-Source Release

Publish the five-layer pipeline as an open library: plug in your own dataset, retrain in one command, and deploy to Azure with azd.

Community

11 days · 2 people · shipped.