GST FraudShield

Author : CA Shruti Dang

A	INTRODUCTION

About the Tool

GST FraudShield is a privacy-conscious, AI-assisted GST audit and fraud-risk intelligence platform built for Chartered Accountants, tax officers, and compliance professionals operating under India's Goods and Services Tax framework. The platform accepts GSTR-1 return files, extracts GSTIN-wise transaction intelligence, enriches supplier profiles through a four-tier cascade (local cache, live GST portal lookup, AI inference, and HSN-based inference, tried in that order), and scores each supplier against a deterministic, explainable fraud rule engine.

Every risk flag comes with a rule code, severity label, and plain-English explanation — making results defensible in audit proceedings. The system runs as both a browser-based web dashboard and a full-featured Electron desktop application.

Key Objectives

Automate GSTIN-level fraud risk scoring from raw GSTR-1 JSON or Excel uploads.
Identify suspicious patterns: unknown GSTINs, HSN mismatches, invoice splitting, high transaction concentration, and round-value manipulation.
Enrich every counterparty GSTIN using a four-tier lookup: Cache → GST Portal Proxy → Anthropic AI → Internal HSN Inference.
Deliver fully explainable results — every risk flag maps to a named rule, score, and severity level — suitable for audit workpapers.
Protect data privacy: the modular backend runs locally; no transaction data is transmitted to third-party services beyond optional AI enrichment.
Provide one-click export in Excel, CSV, JSON, PDF, and Word for filing and reporting workflows.

Target Users

Practising Chartered Accountants conducting GST audits and ITC verification.
In-house tax and finance teams at mid-to-large Indian businesses.
GST Intelligence officers performing risk-based audit selection.
Tax technology teams at CA firms and fintech startups building compliance workflows.

B	PROBLEM STATEMENT

Pain Point 1: GST Fraud Detection is Reactive, Not Proactive

India loses an estimated Rs. 1 lakh crore annually to GST fraud, primarily through fake invoice networks, bogus ITC claims, and shell GSTINs. Current audit workflows rely on manual scrutiny of returns — a process that is slow, inconsistent, and scales poorly across the 1.4 crore+ active GSTINs on the portal. By the time fraud is detected, credits have already been availed and offenders may have cancelled registrations.

No existing lightweight tool integrates live GST portal enrichment with rule-based fraud scoring in a single pipeline.
Manual checks on invoice patterns (round-value clustering, splitting, concentration) require days of spreadsheet analysis per entity.
HSN code mismatches — a key fraud indicator — go undetected unless the auditor cross-references the portal profile individually.

Pain Point 2: GSTIN Enrichment is Fragmented and Time-Consuming

Verifying a counterparty GSTIN requires navigating the GST portal manually, copying legal names, checking registration status, and noting HSN/dealing-in codes. For an auditor reviewing 500+ suppliers, this is days of work. There is no free tool that automates this enrichment pipeline with fallback intelligence.

Portal barriers (CAPTCHAs, rate limits) make bulk lookup impractical without automation.
No tool combines portal lookup with AI-based inference as a fallback for unavailable GSTINs.
HSN-based business profiling — crucial for verifying supplier legitimacy — requires manual industry expertise today.

Pain Point 3: Fraud Rule Outputs Lack Explainability

Black-box ML fraud scores are not actionable in audit proceedings. Assessees demand rule-level justification for every risk flag raised. GST FraudShield addresses this with a fully deterministic, explainable rule engine where every flag carries a code, score, and plain-English rationale.

C	TECHNOLOGICAL SOLUTION

Application Architecture

GST FraudShield follows a modular pipeline architecture that cleanly separates the file parsing, GSTIN enrichment, fraud rule evaluation, and presentation layers. The system runs in two modes: a web dashboard served by an Express.js backend, and a full-featured Electron desktop application with live GST portal automation.

#	Module	Description
1	File Parser	Accepts GSTR-1 JSON/Excel; extracts invoices, GSTINs, HSN codes, tax values
2	GSTIN Enrichment	Four-tier lookup: Cache → Proxy → AI (Anthropic) → HSN Inference
3	Business Inference	Predicts industry from top HSN chapters; assigns confidence score
4	Fraud Rule Engine	Applies 5 deterministic rules; produces per-GSTIN score, level, flags
5	Dashboard / Export	Renders summary cards, risk table, flag list; exports Excel/CSV/JSON/PDF/Word

Technology Stack

Component	Technology
Backend API	Node.js + Express.js (port 3020)
Desktop Shell	Electron + electron-builder (Windows / macOS / Linux)
Portal Automation	Puppeteer — headless Chromium, pool of 5 browser pages
Real-time Comms	WebSocket (ws library, port 3018)
Local Cache	NeDB-Promises — embedded JSON database, 7-day TTL
AI Enrichment	Anthropic Claude API (optional, configured via API key)
Export	Excel, CSV, JSON, Print/PDF, Word (docx)

D	CORE MODULES & FEATURES

Module 1: GST File Parser

Accepts GSTR-1 style JSON files and normalises them into a structured transaction dataset. Supports B2B invoices, credit/debit notes (CDNR), and HSN summary blocks. For every invoice line the parser extracts:

Filing GSTIN and counterparty GSTIN (validated against the standard 15-character Indian GSTIN format)
Document type, date, invoice value, taxable value, CGST / SGST / IGST / cess
HSN code, place of supply, filing period, and source file reference
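The GSTIN format check mentioned above can be sketched as follows. This is a minimal illustration, not the tool's actual parser: the helper names `isValidGstinFormat` and `partitionGstins` are hypothetical, and the pattern checks only the 15-character structure (two-digit state code, ten-character PAN, entity code, the letter Z, and a final check character) without verifying the check character or the state code range.

```typescript
// Structural check only: 2-digit state code, 10-character PAN,
// entity code, the literal "Z", and one trailing check character.
// The check character itself is NOT verified here.
const GSTIN_PATTERN = /^[0-9]{2}[A-Z]{5}[0-9]{4}[A-Z][1-9A-Z]Z[0-9A-Z]$/;

// Hypothetical helper: trims and upper-cases before matching.
function isValidGstinFormat(raw: string): boolean {
  return GSTIN_PATTERN.test(raw.trim().toUpperCase());
}

// Hypothetical helper: splits raw values into well-formed and rejected GSTINs.
function partitionGstins(values: string[]): { valid: string[]; invalid: string[] } {
  const valid: string[] = [];
  const invalid: string[] = [];
  for (const value of values) {
    if (isValidGstinFormat(value)) {
      valid.push(value.trim().toUpperCase());
    } else {
      invalid.push(value);
    }
  }
  return { valid, invalid };
}
```

Rejected values could be reported alongside the source file reference, so the auditor can trace a malformed GSTIN back to the upload it came from.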

Module 2: Four-Tier GSTIN Enrichment Engine

For every unique counterparty GSTIN encountered, the system attempts enrichment through a prioritised four-tier cascade:

Tier	Source	Method	Trigger
1	Cache	NeDB local store (7-day TTL)	Always checked first
2	GST Portal Proxy	Puppeteer → services.gst.gov.in	Cache miss
3	AI Lookup	Anthropic Claude API	Proxy unavailable
4	HSN Inference	Invoice HSN chapter → industry map	All external sources fail

Enriched attributes include: legal name, trade name, GST status, registration date, address, HSN/dealing-in codes, source, confidence score, and predicted industry.
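The cascade in the table above can be sketched as an ordered fallback. The types, field names, and the `enrich` function below are assumptions made for illustration; the real portal and AI lookups would be asynchronous network calls, whereas this sketch keeps them synchronous so the control flow stays easy to follow.

```typescript
type EnrichmentSource = "cache" | "proxy" | "ai" | "hsn_inference";

// Hypothetical profile shape; the real tool stores more attributes.
interface GstinProfile {
  gstin: string;
  legalName: string | null;
  source: EnrichmentSource;
  confidence: number; // assumed 0..1 scale
}

interface CacheEntry {
  profile: GstinProfile;
  fetchedAt: number; // epoch milliseconds
}

// A tier returns a profile, or null when it cannot answer.
type Lookup = (gstin: string) => GstinProfile | null;

const SEVEN_DAYS_MS = 7 * 24 * 60 * 60 * 1000;

function enrich(
  gstin: string,
  cache: Map<string, CacheEntry>,
  tiers: Lookup[], // ordered: portal proxy, AI lookup, HSN inference
  now: number,
): GstinProfile | null {
  // Tier 1: a cache entry younger than seven days wins outright.
  const hit = cache.get(gstin);
  if (hit && now - hit.fetchedAt < SEVEN_DAYS_MS) {
    return { ...hit.profile, source: "cache" };
  }
  // Tiers 2 to 4: try each source in order until one answers.
  for (const lookup of tiers) {
    let found: GstinProfile | null = null;
    try {
      found = lookup(gstin);
    } catch {
      found = null; // a failing tier counts as a miss
    }
    if (found) {
      cache.set(gstin, { profile: found, fetchedAt: now });
      return found;
    }
  }
  return null;
}
```

Passing `now` in as a parameter, rather than reading the clock inside the function, is a design choice that makes the expiry behaviour easy to test.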

Module 3: HSN-Based Business Profile Inference

When external enrichment is unavailable, the inference engine analyses the HSN codes across all invoices for a GSTIN, weights them by taxable value, and maps the dominant HSN chapters to an industry category. The top 5 HSN codes by value are selected and a confidence score derived from their dominance ratio. Mapped industries: Electronics, Automobile, Textile, Chemicals, Metal, Pharma, Food, Plastics, Paper, Services.
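A minimal sketch of this inference step follows, under stated assumptions: the chapter-to-industry map is an illustrative subset rather than the tool's actual table, and the confidence score is interpreted here as the winning industry's share of the value held by the top five HSN codes. The name `inferIndustry` is hypothetical.

```typescript
interface HsnLine {
  hsn: string;
  taxableValue: number;
}

interface IndustryGuess {
  industry: string;
  confidence: number; // winning industry's share of top-5 value (assumed definition)
  topHsn: string[];
}

// Illustrative subset of an HSN chapter -> industry map (an assumption).
const CHAPTER_TO_INDUSTRY: Record<string, string | undefined> = {
  "21": "Food",
  "29": "Chemicals",
  "30": "Pharma",
  "39": "Plastics",
  "48": "Paper",
  "52": "Textile",
  "72": "Metal",
  "85": "Electronics",
  "87": "Automobile",
  "99": "Services",
};

function inferIndustry(lines: HsnLine[]): IndustryGuess | null {
  // Sum taxable value per HSN code.
  const byHsn = new Map<string, number>();
  for (const line of lines) {
    byHsn.set(line.hsn, (byHsn.get(line.hsn) ?? 0) + line.taxableValue);
  }
  // Keep the five codes with the highest total value.
  const topCodes = Array.from(byHsn.entries())
    .sort((a, b) => b[1] - a[1])
    .slice(0, 5);
  const topTotal = topCodes.reduce((sum, [, value]) => sum + value, 0);
  if (topTotal <= 0) return null;

  // Roll the top codes up to industries via their 2-digit chapter.
  const byIndustry = new Map<string, number>();
  for (const [hsn, value] of topCodes) {
    const industry = CHAPTER_TO_INDUSTRY[hsn.slice(0, 2)] ?? "Unknown";
    byIndustry.set(industry, (byIndustry.get(industry) ?? 0) + value);
  }
  // Pick the industry holding the largest share.
  let bestIndustry = "Unknown";
  let bestValue = 0;
  byIndustry.forEach((value, industry) => {
    if (value > bestValue) {
      bestIndustry = industry;
      bestValue = value;
    }
  });
  return {
    industry: bestIndustry,
    confidence: bestValue / topTotal,
    topHsn: topCodes.map(([hsn]) => hsn),
  };
}
```

For a supplier whose invoices are 90% chapter 85 by value and 10% chapter 30, this sketch would return Electronics with a confidence of 0.9.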

Module 4: Deterministic Fraud Rule Engine

The fraud rule engine is the centrepiece of GST FraudShield. Five rule categories are applied to every counterparty GSTIN. Scores are additive and capped at 100. Risk level: HIGH (score >= 60), MEDIUM (30 to 59), LOW (below 30).
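The additive scoring and banding described above can be sketched as follows. The `Flag` shape and the `scoreGstin` name are hypothetical; only the cap of 100 and the 60 / 30 thresholds come from the description.

```typescript
type RiskLevel = "HIGH" | "MEDIUM" | "LOW";

// Hypothetical flag shape: one entry per rule that fired for a GSTIN.
interface Flag {
  code: string; // rule code, e.g. "HSN_MISMATCH"
  score: number;
  severity: RiskLevel;
  reason: string; // plain-English explanation
}

function scoreGstin(flags: Flag[]): { score: number; level: RiskLevel } {
  // Scores are additive across fired rules, then capped at 100.
  const raw = flags.reduce((sum, flag) => sum + flag.score, 0);
  const score = Math.min(raw, 100);
  // Banding: HIGH at 60 and above, MEDIUM at 30 and above, otherwise LOW.
  const level: RiskLevel = score >= 60 ? "HIGH" : score >= 30 ? "MEDIUM" : "LOW";
  return { score, level };
}
```

With the five rule scores listed in the table that follows (20, 30, 15, 10, and 5), the largest possible total is 80, so the cap only becomes relevant if rules are added or scores are raised.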

Rule Code	Description	Score	Severity	Trigger Condition
UNKNOWN_GSTIN	GSTIN not found or unresolved	20	HIGH	Enrichment missing or low-confidence
HSN_MISMATCH	Invoice HSN differs from portal profile	30	HIGH	No 2-digit, 4-digit, or exact match
HIGH_CONCENTRATION	Single GSTIN dominates total value	15	MEDIUM/HIGH	>=40% share (HIGH if >=60%)
INVOICE_SPLITTING	Many small invoices instead of one large	10	MEDIUM	>=8 invoices, avg < Rs. 50,000, total >= Rs. 5 lakh
ROUND_VALUE_PATTERN	High proportion of round-value lines	5	LOW	>=60% round values, >=3 transactions
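The trigger conditions in the table can be sketched as small pure functions. All function names here are hypothetical, and two points are assumptions: a "round value" is taken to mean a positive multiple of Rs. 1,000, and the HSN comparison is reported as a match level so that only "none" would raise HSN_MISMATCH.

```typescript
// HIGH_CONCENTRATION: one GSTIN's share of the total audited value.
function concentrationSeverity(gstinTotal: number, grandTotal: number): "HIGH" | "MEDIUM" | null {
  if (grandTotal <= 0) return null;
  const share = gstinTotal / grandTotal;
  if (share >= 0.6) return "HIGH";
  if (share >= 0.4) return "MEDIUM";
  return null;
}

// INVOICE_SPLITTING: >= 8 invoices, average below Rs. 50,000, total >= Rs. 5 lakh.
function isInvoiceSplitting(values: number[]): boolean {
  if (values.length < 8) return false;
  const total = values.reduce((sum, v) => sum + v, 0);
  return total / values.length < 50000 && total >= 500000;
}

// ROUND_VALUE_PATTERN: >= 3 transactions, >= 60% of them round.
// "Round" is ASSUMED to mean a positive multiple of roundTo.
function isRoundValuePattern(values: number[], roundTo: number = 1000): boolean {
  if (values.length < 3) return false;
  const roundCount = values.filter((v) => v > 0 && v % roundTo === 0).length;
  return roundCount / values.length >= 0.6;
}

type HsnMatch = "exact" | "4-digit" | "2-digit" | "none";

// HSN_MISMATCH fires only when the best level across all profile codes is "none".
function hsnMatchLevel(invoiceHsn: string, profileCodes: string[]): HsnMatch {
  let best: HsnMatch = "none";
  for (const code of profileCodes) {
    if (code === invoiceHsn) return "exact";
    const bothHaveFour = code.length >= 4 && invoiceHsn.length >= 4;
    if (bothHaveFour && code.slice(0, 4) === invoiceHsn.slice(0, 4)) {
      best = "4-digit";
    } else if (best === "none" && code.slice(0, 2) === invoiceHsn.slice(0, 2)) {
      best = "2-digit";
    }
  }
  return best;
}
```

One arithmetic consequence of the listed thresholds: an average below Rs. 50,000 combined with a total of at least Rs. 5 lakh implies at least 11 invoices, so under these numbers the count threshold of 8 is never the binding condition.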

E	HOW IT WORKS — END-TO-END WORKFLOW

Step-by-Step Processing Pipeline

#	Step	Description
1	Upload	User selects GSTR-1 JSON or Excel files. Multiple files can be loaded and merged into a single audit session.
2	Parse	The file parser validates GSTIN formats, normalises transaction rows across B2B, CDNR, and HSN blocks.
3	Enrich	Every unique counterparty GSTIN is passed through the four-tier enrichment cascade. Results are cached for 7 days.
4	Infer	For GSTINs with no external data, the HSN inference engine predicts industry and business profile.
5	Score	The fraud rule engine applies all five rules. Scores are aggregated and capped at 100. Risk level is assigned.
6	Review	Dashboard renders summary cards, GSTIN-wise risk table, fraud flag list, and plain-English explanations.
7	Export	One-click export generates Excel, CSV, JSON, PDF, and Word documents for audit workpapers.
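As an illustration of the export step, the sketch below serialises a GSTIN-wise risk table to CSV. The `RiskRow` shape, the column names, and the `toCsv` function are assumptions for the example; the quoting follows the usual CSV convention of wrapping fields that contain commas, quotes, or line breaks and doubling any embedded quotes.

```typescript
// Hypothetical row shape for the GSTIN-wise risk table.
interface RiskRow {
  gstin: string;
  score: number;
  level: string;
  flags: string[]; // rule codes that fired
}

// Quote a field only when it contains a comma, quote, or line break.
function csvField(value: string): string {
  return /[",\r\n]/.test(value) ? `"${value.replace(/"/g, '""')}"` : value;
}

function toCsv(rows: RiskRow[]): string {
  const header = ["GSTIN", "Score", "Level", "Flags"];
  const body = rows.map((row) => [
    row.gstin,
    String(row.score),
    row.level,
    row.flags.join("; "),
  ]);
  return [header, ...body]
    .map((cols) => cols.map(csvField).join(","))
    .join("\r\n");
}
```

Joining rule codes with a semicolon keeps all flags for a GSTIN in a single cell, which tends to be easier to filter in a spreadsheet than one row per flag.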

Dual Deployment Modes

Web Dashboard Mode: Express.js server on port 3020. Users upload files, trigger an audit via POST /api/audit, and view results instantly. Ideal for cloud or firm-wide deployment.
Electron Desktop Mode: Packaged app for Windows/macOS/Linux. Adds a GST proxy server (Puppeteer, port 3017), WebSocket live updates (port 3018), cache management, and full export. Designed for privacy-first, offline environments.

F	BENEFITS, IMPACT & DIFFERENTIATORS

Challenge vs Solution Mapping

Challenge	How GST FraudShield Solves It
Manual GSTIN verification takes hours per entity	Four-tier auto-enrichment with caching completes hundreds of GSTINs in seconds
HSN mismatches go undetected in manual audits	Automated HSN comparison (2-digit, 4-digit, exact) flags every mismatch
Invoice splitting and round-value patterns invisible in raw data	Rule engine detects both patterns with configurable thresholds
Fraud scores lack explainability for audit proceedings	Every flag carries a rule code, severity, score, and plain-English reason
GST portal inaccessible for bulk programmatic lookup	Puppeteer proxy with page pooling handles portal barriers gracefully
No fallback when portal is down or GSTIN is new	AI + HSN inference provides enrichment even without portal data
Reporting requires separate tools	Built-in multi-format export: Excel, CSV, JSON, PDF, Word

Quantified Benefits

Dimension	Impact
Speed	Audit pipeline completes in seconds for hundreds of GSTINs vs. days of manual work
Accuracy	Multi-source enrichment with confidence scoring reduces false negatives in fraud detection
Scale	Handles large multi-file GSTR-1 datasets without performance degradation
Privacy	Local cache means financial data stays on-premises unless AI enrichment is explicitly enabled
Coverage	Five rule categories covering the most prevalent GST fraud patterns recognised by tax authorities
Cost	Open-source stack with zero per-use licensing cost; deployable on any standard laptop or server

Unique Differentiators

Only tool combining live GST portal proxy automation with AI-based GSTIN enrichment as a seamless fallback.
Deterministic, explainable fraud scoring — every risk flag maps to a named rule, making results contestable and audit-ready.
Four-tier enrichment cascade guarantees a best-effort profile even for new, inactive, or portal-inaccessible GSTINs.
Dual deployment flexibility: browser-based for firm-wide access, Electron desktop for privacy-sensitive offline environments.
HSN-based industry inference brings supplier profiling capability to auditors without access to the GST portal.
Open, extensible architecture: new fraud rules can be added without touching parser or enrichment layers.

G	DEPLOYMENT, LIMITATIONS & ROADMAP

Deployment Options

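Local / Offline (Electron Desktop): Packaged installers for Windows (.exe), macOS (.dmg), and Linux (.AppImage). Runs entirely on the local machine; live portal and AI lookups need network access when used, while cached and HSN-inferred profiles remain available offline. Suitable for firms with strict data policies.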
Web Server Mode: Node.js application on any server or cloud VM. Accessible from any browser on the network. Suitable for shared firm-wide use.
Developer / Source Mode: Clone the repository, run npm install, and start with npm run start:app (web) or npm run electron (desktop). Requires Node.js 18+.

Current Limitations

The modular backend accepts GSTR-1 JSON only; GSTR-2B and e-way bill formats are supported only by the Electron desktop UI, which currently lives in a single large file.
GST portal automation depends on portal availability; heavy CAPTCHA enforcement may reduce proxy hit rates during peak filing.
AI enrichment requires an Anthropic API key to be configured; not included by default.
No persistent multi-session storage in web mode; audit results must be exported before the session ends.

Future Roadmap

Feature	Description
GSTR-2B Integration	Full reconciliation of purchase register against GSTR-2B for ITC risk detection alongside fraud scoring
ML Anomaly Layer	Complement rule-based scoring with an unsupervised anomaly detection model trained on GST return patterns
Network Graph Analysis	Visualise GSTIN-to-GSTIN transaction networks to detect circular trading and accommodation entry chains
Multi-Client Dashboard	CA firm portal to load, switch, and compare audit results across multiple client GSTINs in one session
GSTR-3B Pre-fill	Auto-populate GSTR-3B Table 4 values from reconciled data with one-click export to the filing portal
WhatsApp/Email Alerts	Automated compliance reminders to suppliers with pending or late GST filings from vendor scorecard
Tally/ERP Integration	Direct data pull via ODBC from Tally Prime and SAP, eliminating the export-import step

H	CONCLUSION

GST FraudShield represents a practical, deployable answer to one of India's most pressing tax compliance challenges: the detection of fraudulent invoice networks and bogus ITC claims within the GST ecosystem. By combining structured file parsing, automated GSTIN enrichment, AI-assisted inference, and a fully explainable rule-based fraud engine, the platform delivers audit-grade intelligence in seconds — at a cost accessible to every CA practice.

The architecture is deliberately modular and open. Parser, enrichment, inference, and fraud modules are independently maintained and testable. New fraud rules can be added without touching the data pipeline, and new enrichment sources can be plugged in without disrupting the rule engine, so GST FraudShield can evolve with India's GST law and portal landscape.

The dual-mode deployment model — browser dashboard for shared access, Electron desktop for privacy-first offline use — ensures the tool fits diverse CA practice environments. Built-in multi-format export means audit findings are immediately usable in compliance filings, client reports, and regulatory submissions.

The vision is straightforward: every tax professional in India, whether an independent CA or a GST intelligence officer, should have access to the same quality of fraud detection intelligence that today requires specialised teams and expensive enterprise software. GST FraudShield is that equaliser.
