Intelligent GSTR-2B ReconciliationRecord inserted or updated successfully.
AI & Auditing

Intelligent GSTR-2B Reconciliation

Author: CA. Tapas Ruparelia

Watch on Youtube

THE PROBLEM

Chartered Accountants and their team face significant challenges in reconciling the Purchase Register (PR) with GSTR-2B data downloaded from the GST Portal.

· Data Structure Mismatch: PR data (from Tally/SAP) often has different headers, formatting (commas in numbers), and date formats compared to the GST Portal.

· Format Limitations: The GSTR-2B Excel provided by the portal contains nested headers and merged cells, making standard VLOOKUP/Pivot Tables impossible without manual cleaning.

· Volume & Complexity: Manual matching fails when there are minor typos in Invoice Numbers (e.g., "INV/001" vs "INV-001") or small rounding differences, leading to unclaimed Input Tax Credit (ITC).

· Privacy Concerns: Clients are hesitant to upload sensitive financial data to third-party cloud SaaS platforms.

THE SOLUTION

A secure, locally hosted Python application that automates the end-to-end reconciliation process. The tool utilizes an intelligent "3-Pass Matching Engine" to identify matches not just by exact values, but by recognizing patterns, cleaning data anomalies, and applying user-defined tolerances

HOW IT WORKS

· Smart Data Ingestion:

o GSTR-2B JSON Parsing: Directly consumes the complex JSON file from the GST portal, eliminating the need to clean the portal's Excel file.

o Intelligent PR Reader: Auto-detects header rows in client data (e.g., skipping top rows in Tally exports) and sanitizes number formats (removing commas/spaces).

· Advanced Reconciliation Logic (The 3-Pass Engine):

o Pass 1 (Exact Match): Matches Invoice No + GSTIN + Taxable Value (within configurable ₹ tolerance) + Date (within configurable day tolerance).

o Pass 2 (Pattern AI): Normalizes invoice numbers (removes special characters, financial year suffixes etc ) to find fuzzy matches.

o Pass 3 (GSTIN Logic): Identifies cases where the Invoice Number matches but the GSTIN has a typo or State Code mismatch.

· Comprehensive Reporting:

Generates a single Excel file

with 4 sheets: Reconciliation (Side-by-side comparison), PR Only (Missing in 2B), 2B Only (Missing in PR), and GSTIN-wise Summary

TECH STACK

· Language: Python 3.13

· Frontend: Streamlit (for a clean, browser-based UI running locally)

· Data Processing: Pandas (High-performance data manipulation)

· Database: SQLite (For storing client-specific column mapping configurations)

· Deployment: Localhost (Ensures 100% Data Privacy)

5. Innovation & Impact

· Efficiency: Reduces reconciliation time for 2,000+ line items from hours to seconds.

· Accuracy: "Pattern Matching" logic catches valid ITC that manual eyes often miss due to formatting differences.

· User Experience: "One-time mapping" memory remembers column settings for each client, reducing repetitive work.

· Data Security: Runs entirely on the user's machine; no data ever leaves the system