ArthaFlow - AI-Powered Procure-to-Pay Management

Author : CA. Siddharth Shah

1. Executive Summary

ArthaFlow is an AI-powered Procure-to-Pay (P2P) management platform that automates the entire procurement lifecycle — from raising a Purchase Inquiry to recording vendor payments — while enforcing India-specific statutory and compliance requirements end-to-end.

The platform combines a modern Next.js 14 application with Google Gemini 2.5 Flash (vision) for invoice OCR, a multi-provider fallback chain, a background job worker for asynchronous extraction, and four parallel intelligence passes that triage every invoice: confidence triage, math reconciliation, PO auto-matching, and a vendor-historical anomaly detector. A new line-level 3-way match module compares each invoice line directly against the linked Purchase Order and Goods Receipt Note, flagging rate-over-PO, quantity-over-PO, and quantity-over-GRN violations as critical anomalies.
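The line-level 3-way match described above can be sketched roughly as follows. This is a minimal illustration, not the platform's actual module: the anomaly codes and the name comparePoLines appear elsewhere in this document, but the function signature, the field names, matching lines by an exact itemCode, and the percentage-based tolerances are all assumptions made for the example.

```typescript
// Hypothetical shapes: field names are illustrative, not ArthaFlow's actual schema.
type LineAnomalyCode = "RATE_OVER_PO" | "QTY_OVER_PO" | "QTY_OVER_GRN" | "ITEM_NOT_IN_PO";

interface OrderedLine { itemCode: string; quantity: number; rate: number }
interface ReceivedLine { itemCode: string; receivedQuantity: number }
interface Tolerances { ratePct: number; qtyPct: number } // 0.02 means 2% headroom
interface LineAnomaly { itemCode: string; code: LineAnomalyCode }

function comparePoLines(
  invoiceLines: OrderedLine[],
  poLines: OrderedLine[],
  grnLines: ReceivedLine[],
  tol: Tolerances,
): LineAnomaly[] {
  const poByItem = new Map<string, OrderedLine>();
  poLines.forEach((line) => poByItem.set(line.itemCode, line));

  // Sum receipts per item, since one PO line may arrive across several GRN lines.
  const receivedByItem = new Map<string, number>();
  grnLines.forEach((line) => {
    receivedByItem.set(line.itemCode, (receivedByItem.get(line.itemCode) ?? 0) + line.receivedQuantity);
  });

  const anomalies: LineAnomaly[] = [];
  for (const line of invoiceLines) {
    const poLine = poByItem.get(line.itemCode);
    if (!poLine) {
      anomalies.push({ itemCode: line.itemCode, code: "ITEM_NOT_IN_PO" });
      continue;
    }
    if (line.rate > poLine.rate * (1 + tol.ratePct)) {
      anomalies.push({ itemCode: line.itemCode, code: "RATE_OVER_PO" });
    }
    if (line.quantity > poLine.quantity * (1 + tol.qtyPct)) {
      anomalies.push({ itemCode: line.itemCode, code: "QTY_OVER_PO" });
    }
    const received = receivedByItem.get(line.itemCode) ?? 0;
    if (line.quantity > received * (1 + tol.qtyPct)) {
      anomalies.push({ itemCode: line.itemCode, code: "QTY_OVER_GRN" });
    }
  }
  return anomalies;
}
```

With a PO line of 100 units at ₹100, a GRN of 90 units, and an invoice billing 110 units at ₹105, this sketch would raise all three over-limit codes for that item under a 2% rate and 5% quantity tolerance.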

Headline Capabilities

  1. Full P2P lifecycle: Purchase Inquiry → RFQ → Bidding → Purchase Order → Goods Receipt → Invoice → Payment, with role-based gating on every step.
  2. Vision-LLM invoice extraction with SHA-256 file-hash caching — re-uploads of the same PDF are free and instant.
  3. Four-pass invoice intelligence: triage, reconciliation, PO matching, anomaly detection — all run in parallel after extraction.
  4. Line-item 3-way match against PO and GRN with configurable rate and quantity tolerances.
  5. Statutory compliance baked in: MSME 45-day payment tracking, GSTIN validation, HSN/SAC fields, CGST/SGST/IGST/UTGST splits, GSTR-2B match flag.
  6. Vendor self-service portal with AI-extracted invoice upload — vendors submit invoices that flow directly into the buyer’s review queue.
  7. One-click approval from the review queue when the AI is highly confident.
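As an illustration of what GSTIN validation can involve, the sketch below checks the 15-character structure and then recomputes the final check character using the mod-36 scheme commonly described for GSTINs. The document does not say how ArthaFlow validates GSTINs, so the function names, the regular expression, and the choice to verify the check character are assumptions; the sample value in the usage note is a widely circulated example GSTIN, not a vendor from the platform.

```typescript
const GSTIN_CHARS = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ";

// Common layout: 2-digit state code, 10-character PAN, entity code, "Z", check character.
const GSTIN_PATTERN = /^[0-9]{2}[A-Z]{5}[0-9]{4}[A-Z][1-9A-Z]Z[0-9A-Z]$/;

// Recompute the 15th character from the first 14 (hypothetical helper name).
function gstinCheckChar(first14: string): string {
  let sum = 0;
  for (let i = 0; i < 14; i++) {
    const value = GSTIN_CHARS.indexOf(first14.charAt(i));
    const product = value * (i % 2 === 0 ? 1 : 2); // weights alternate 1, 2, 1, 2, ...
    sum += Math.floor(product / 36) + (product % 36);
  }
  return GSTIN_CHARS.charAt((36 - (sum % 36)) % 36);
}

function isValidGstin(input: string): boolean {
  const gstin = input.trim().toUpperCase();
  if (!GSTIN_PATTERN.test(gstin)) return false;
  return gstinCheckChar(gstin.slice(0, 14)) === gstin.charAt(14);
}
```

For example, isValidGstin("27AAPFU0939F1ZV") passes both the structural and the check-character test, while changing the last character to "W" fails. A structural check like this catches keying errors only; it does not confirm that the registration is active.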


2. Problem Statement & Use Case


The Pain Today

Mid-market Indian enterprises spend a disproportionate share of their finance bandwidth on procurement paperwork. A typical 3-way match requires a clerk to physically open the Purchase Order, the Goods Receipt Note, and the supplier invoice, then compare line items, rates, quantities, and totals manually. The process is slow, error-prone, and creates friction with MSME suppliers who depend on the statutory 45-day payment cycle.

Specific Frictions

  1. Invoice data entry — every line item, HSN code, GST split, and total is keyed in by hand, often introducing typos.
  2. Rate and quantity verification — when a vendor inflates rates or bills for more units than were received, no automated check catches it before payment.
  3. MSME compliance — the 45-day payment rule (MSMED Act, Section 15) carries Section 16 interest exposure when missed; tracking deadlines manually across hundreds of invoices is impractical.
  4. GSTR-2B reconciliation — input-tax-credit eligibility depends on the vendor having filed; mismatches surface only at month-end.
  5. Vendor communication — RFQ documents, POs, and payment status are tracked over email, leading to lost threads and version conflicts.
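To make the MSME deadline tracking in item 3 concrete, here is a minimal sketch of a due-date calculation. It assumes the 45-day clock starts on the date the goods or services are accepted and ignores shorter contractually agreed terms and the interest computation; the function name, the status labels, and the 7-day warning window are illustrative choices, not details taken from ArthaFlow.

```typescript
const MSME_MAX_CREDIT_DAYS = 45;
const MS_PER_DAY = 24 * 60 * 60 * 1000;

type MsmeStatus = "ON_TRACK" | "DUE_SOON" | "OVERDUE"; // hypothetical labels

interface MsmeDeadline {
  dueDate: Date;
  daysRemaining: number; // negative once the deadline has passed
  status: MsmeStatus;
}

// Assumption: acceptanceDate and today are both expressed as UTC midnights.
function msmeDeadline(acceptanceDate: Date, today: Date, warnWithinDays = 7): MsmeDeadline {
  const dueDate = new Date(acceptanceDate.getTime() + MSME_MAX_CREDIT_DAYS * MS_PER_DAY);
  const daysRemaining = Math.floor((dueDate.getTime() - today.getTime()) / MS_PER_DAY);
  const status: MsmeStatus =
    daysRemaining < 0 ? "OVERDUE" : daysRemaining <= warnWithinDays ? "DUE_SOON" : "ON_TRACK";
  return { dueDate, daysRemaining, status };
}
```

Under these assumptions, goods accepted on 1 January 2025 fall due on 15 February 2025, so a check run on 10 February reports 5 days remaining and a DUE_SOON status.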

Who ArthaFlow is For

Primary user: Mid-market manufacturing & infrastructure firms (₹50 Cr – ₹500 Cr revenue) with 50–500 active vendors and multiple project sites.
Direct beneficiaries: Procurement managers, site engineers, finance / AP team, vendors (via portal), internal & statutory auditors.
Indirect beneficiaries: MSME suppliers who get paid on time; statutory auditors who get a complete audit trail.



3. Solution Overview

ArthaFlow models the entire P2P process as a typed state machine. Each artefact — Purchase Inquiry, RFQ, Bid, Purchase Order, GRN, Invoice, Payment — has well-defined statuses and transitions enforced at the database, server-action, and UI layer. Every state change is written to an immutable AuditLog with the acting user, timestamp, before-and-after snapshot, and any override reason.
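A typed state machine of this kind might look like the sketch below for the Invoice artefact. The status names, the transition table, and the shape of the audit entry are hypothetical; the document only states that statuses and transitions exist and that each change is logged with the acting user, timestamp, before-and-after snapshot, and override reason.

```typescript
// Hypothetical invoice statuses; the real enum is not listed in this document.
type InvoiceStatus = "UPLOADED" | "EXTRACTED" | "IN_REVIEW" | "APPROVED" | "REJECTED" | "PAID";

// Each status maps to the statuses it may legally move to.
const INVOICE_TRANSITIONS: Record<InvoiceStatus, readonly InvoiceStatus[]> = {
  UPLOADED: ["EXTRACTED"],
  EXTRACTED: ["IN_REVIEW"],
  IN_REVIEW: ["APPROVED", "REJECTED"],
  APPROVED: ["PAID"],
  REJECTED: [],
  PAID: [],
};

interface AuditEntry {
  actorId: string;
  at: Date;
  before: InvoiceStatus;
  after: InvoiceStatus;
  overrideReason?: string;
}

// Refuses illegal transitions and returns the audit row to persist with the change.
function transitionInvoice(
  current: InvoiceStatus,
  next: InvoiceStatus,
  actorId: string,
  overrideReason?: string,
): AuditEntry {
  if (!INVOICE_TRANSITIONS[current].includes(next)) {
    throw new Error(`Illegal invoice transition: ${current} -> ${next}`);
  }
  return {
    actorId,
    at: new Date(),
    before: current,
    after: next,
    ...(overrideReason ? { overrideReason } : {}),
  };
}
```

Because the transition table is typed as a Record over every status, adding a new status without declaring its outgoing transitions becomes a compile-time error, which is one reason to keep the table in code rather than scattered across handlers.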

Invoice intake is the AI-heavy module. When a PDF lands (drag-drop, bulk upload, or vendor-portal upload), the system computes a SHA-256 hash, checks the InvoiceExtractionCache, and either returns the cached result instantly or hands the file to a background worker which calls Google Gemini 2.5 Flash with a strict Zod-validated schema. The worker then runs four parallel intelligence passes whose outputs are merged into a single anomaly stream surfaced to the reviewer.
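The hash-then-cache step could be sketched as below. This is a simplified illustration under stated assumptions: an in-memory Map stands in for the InvoiceExtractionCache table, the extractor is passed in as a plain async function rather than queued for the background worker, and the names cacheKey, extractWithCache, and the ExtractionResult fields are invented for the example.

```typescript
import { createHash } from "node:crypto";

// Hypothetical result shape; the real schema carries full line items and tax splits.
interface ExtractionResult { invoiceNumber: string; grandTotal: number }

// In-memory stand-in for the InvoiceExtractionCache table.
const extractionCache = new Map<string, ExtractionResult>();

// Key on provider + SHA-256 of the raw bytes, so identical files share one entry per provider.
function cacheKey(file: Uint8Array, provider: string): string {
  const hash = createHash("sha256").update(file).digest("hex");
  return `${provider}:${hash}`;
}

async function extractWithCache(
  file: Uint8Array,
  provider: string,
  extract: (file: Uint8Array) => Promise<ExtractionResult>,
): Promise<{ result: ExtractionResult; cacheHit: boolean }> {
  const key = cacheKey(file, provider);
  const cached = extractionCache.get(key);
  if (cached) return { result: cached, cacheHit: true };

  // Cache miss: pay for the model call once, then remember the answer.
  const result = await extract(file);
  extractionCache.set(key, result);
  return { result, cacheHit: false };
}
```

Hashing the bytes rather than the filename means a renamed copy of the same PDF still hits the cache, while a re-scanned or edited file does not.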


The Coherent Demo Thread

To make the platform tangible, the demo seed wires a single end-to-end purchase: a leak-repair Purchase Inquiry at the Mumbai Central pipeline becomes an RFQ, three vendors bid, Acme Steel Industries wins on price, a PO is issued, materials are received, an invoice is uploaded, and a payment is recorded. The same Acme thread is also visible on the vendor portal so judges can compare the buyer-side and vendor-side views in parallel.


4. Technology Stack


4.1 Frontend

Layer | Technology | Version / Notes
Framework | Next.js (App Router) | 14.2.x; Server Components first
UI library | React | 18.3; concurrent rendering
Language | TypeScript | 5.x; strict mode
Styling | Tailwind CSS | 3.4 + tailwind-merge + tailwindcss-animate
Component primitives | Radix UI | Dialog, Dropdown, Select, Tabs, Toast, Popover, Tooltip, Switch, Checkbox
Icons | Lucide React | 0.460; tree-shakeable SVG icon set
Forms | React Hook Form + Zod | Shared validation client + server
Data fetching (client) | TanStack Query (React Query) | v5; used for live polling on bulk extraction
Charts | Recharts | 2.x; dashboard analytics


4.2 Backend

Layer | Technology | Version / Notes
Server runtime | Next.js Server Actions + API Routes | 14.2; runs in same Node process as the UI
Authentication | NextAuth | 4.24; Credentials provider with Prisma adapter
Password hashing | bcryptjs | 2.4; salted hash, slow by design
Email | Nodemailer | 7.0; transactional vendor invites & notifications
Rate limiting | Custom Postgres token bucket | RateLimitBucket table; survives restarts, no Redis needed
Audit logging | AuditLog table | Immutable, append-only, before/after snapshots in JSONB
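The token-bucket logic behind the rate limiter can be illustrated with a small pure function. This sketch covers only the refill-and-consume arithmetic; the columns of the RateLimitBucket table and how the row is read and written in Postgres are not described in this document, so the state shape and the names BucketState, BucketConfig, and takeToken are assumptions.

```typescript
// Hypothetical state for one bucket, e.g. one row per user or IP address.
interface BucketState { tokens: number; lastRefillMs: number }
interface BucketConfig { capacity: number; refillPerSecond: number }

// Pure function: given the stored state and the current time, decide and return the new state.
function takeToken(
  state: BucketState,
  config: BucketConfig,
  nowMs: number,
): { allowed: boolean; next: BucketState } {
  const elapsedSeconds = Math.max(0, nowMs - state.lastRefillMs) / 1000;
  // Refill in proportion to elapsed time, never above capacity.
  const refilled = Math.min(config.capacity, state.tokens + elapsedSeconds * config.refillPerSecond);
  if (refilled < 1) {
    return { allowed: false, next: { tokens: refilled, lastRefillMs: nowMs } };
  }
  return { allowed: true, next: { tokens: refilled - 1, lastRefillMs: nowMs } };
}
```

In a database-backed version, the read of the current state and the write of the next state would presumably happen inside one transaction so that concurrent requests cannot both spend the last token.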


4.3 Database & ORM

Layer | Technology | Version / Notes
Database | PostgreSQL | 16; JSONB-heavy schema for extraction results
ORM | Prisma | 5.22; type-safe queries, transactions, migrations
Schema models | 30 models, 13 enums | User, Vendor, Site, InventoryItem, PurchaseInquiry, Rfq, Bid, PurchaseOrder, Grn, Invoice, Payment, Contract, Budget, RecurringPI, ExtractionJob, InvoiceExtractionCache, ExtractionMetric, ExtractionFeedback, ApprovalRule, AuditLog, etc.
Seeding | tsx prisma/seed.ts | Deterministic seed with 10 demo invoices + coherent Acme thread


4.4 AI / Invoice OCR

Layer | Technology | Version / Notes
Primary model | Google Gemini 2.5 Flash (Vision) | via REST; generativelanguage.googleapis.com/v1beta
Fallback chain | Azure Document AI → Anthropic Claude → Stub | Graceful degradation; the router picks the first healthy provider
Result caching | InvoiceExtractionCache table | SHA-256 file hash + provider key; re-uploads are free
Async pipeline | Background worker process | Polls ExtractionJob rows; FIFO claim with atomic updateMany; horizontally scalable
Validation | Zod schema | Strict line-item schema with HSN/SAC, CGST/SGST/IGST/UTGST splits
Confidence scoring | Per-field + aggregate | Triaged into HIGH / MEDIUM / LOW tiers for routing
Math reconciliation | Pure-TS reconciler | Re-computes subtotal + tax = grandTotal; flags mismatch
PO auto-matching | Jaccard token similarity | Scores top-N candidate POs from the last 90 days
Anomaly detection | Vendor-history comparator | PRICE_SPIKE / QUANTITY_SPIKE / FIRST_INVOICE / GSTIN_FIRST_SEEN
Line-level 3-way match | po-line-match module | RATE_OVER_PO / QTY_OVER_PO / QTY_OVER_GRN / ITEM_NOT_IN_PO
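For readers unfamiliar with Jaccard token similarity, the PO auto-matching row can be illustrated as follows. The sketch assumes the candidate list has already been narrowed to POs from the last 90 days and that matching runs on free-text descriptions; which fields ArthaFlow actually tokenizes is not stated, and the names tokenize, jaccard, and PoCandidate are invented here (suggestPOs is the name used in the pipeline description).

```typescript
// Lowercase, split on anything that is not a letter or digit, and de-duplicate.
function tokenize(text: string): Set<string> {
  return new Set(text.toLowerCase().split(/[^a-z0-9]+/).filter((t) => t.length > 0));
}

// Jaccard similarity: |A ∩ B| / |A ∪ B|, ranging from 0 (disjoint) to 1 (identical sets).
function jaccard(a: Set<string>, b: Set<string>): number {
  if (a.size === 0 && b.size === 0) return 0;
  let intersection = 0;
  a.forEach((token) => {
    if (b.has(token)) intersection++;
  });
  return intersection / (a.size + b.size - intersection);
}

interface PoCandidate { poNumber: string; description: string } // hypothetical shape

// Score every candidate against the invoice text and keep the best topN.
function suggestPOs(invoiceText: string, candidates: PoCandidate[], topN = 3) {
  const invoiceTokens = tokenize(invoiceText);
  return candidates
    .map((po) => ({ poNumber: po.poNumber, score: jaccard(invoiceTokens, tokenize(po.description)) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, topN);
}
```

As a worked case, "MS steel pipe 6 inch" and "steel pipe 4 inch" share three tokens out of six distinct ones, giving a score of 0.5.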


4.5 Document Generation

Layer | Technology | Version / Notes
PDF generation | jsPDF + jsPDF-AutoTable | 4.2 / 5.0; PO, GRN, Invoice, Demo-invoice PDFs
Word generation | docx | 9.6; RFQ documents, this submission doc
Charts in reports | Recharts SVG → PDF | Server-rendered for executive reports


4.6 Testing & Tooling

Layer | Technology | Version / Notes
Unit & integration tests | Vitest + Testing Library + jsdom | Reconciler, confidence, matcher, anomaly suites
Runtime scripts | tsx | 4.19; direct TS execution for seed, demo, backfill scripts
Linting | ESLint | next/core-web-vitals config
Production process manager | PM2 | Two processes: web + ocr-worker on the VPS
Version control | Git + GitHub | Conventional commits


5. System Architecture


5.1 High-Level Components

  1. Web tier — Next.js application serving Server Components, Server Actions, and API routes.
  2. Database tier — PostgreSQL holding application state, AuditLog, ExtractionJob queue, and InvoiceExtractionCache.
  3. OCR worker — separate Node process that claims pending ExtractionJob rows, calls Gemini, persists results, and updates job status.
  4. Object storage — local filesystem under UPLOAD_DIR (configurable to S3 / Azure Blob for production).
  5. External services — Gemini, Azure DocAI, Anthropic Claude (fallbacks); SMTP for notifications.


5.2 Key Architectural Patterns

  1. Server Components first — most pages are async Server Components that hit Prisma directly. Only interactive widgets are client components.
  2. Server Actions for mutations — no separate REST layer for internal forms; type-safe and progressively-enhanced.
  3. API Routes only where required — file upload, polling, and public endpoints (vendor onboarding, OCR job status).
  4. Immutable state transitions — invoice/PO/payment status changes are guarded by a state-machine helper that refuses illegal transitions.
  5. Audit-first writes — every mutating server action writes an AuditLog row in the same transaction.
  6. Idempotent worker — extraction jobs use FIFO claim via updateMany so two worker replicas never process the same job.
  7. Cache-then-compute — invoice extraction always checks the cache first; cache hits return in < 50 ms instead of the typical 4–8 s Gemini call.
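The claim pattern in item 6 can be sketched with an in-memory stand-in for the database. The idea is a conditional update that only succeeds while the row is still PENDING, with the caller checking the affected-row count (Prisma's updateMany returns such a count). Everything else here is an assumption for illustration: the job fields, the PROCESSING and FAILED status names, and the helper names.

```typescript
type JobStatus = "PENDING" | "PROCESSING" | "DONE" | "FAILED";

interface ExtractionJob { id: number; status: JobStatus; createdAt: number; claimedBy?: string }

// In-memory stand-in for: UPDATE ... SET status = 'PROCESSING' WHERE id = ? AND status = 'PENDING'.
// Returns the number of rows changed, as a conditional updateMany would.
function updateManyIfPending(jobs: ExtractionJob[], id: number, workerId: string): { count: number } {
  let count = 0;
  for (const job of jobs) {
    if (job.id === id && job.status === "PENDING") {
      job.status = "PROCESSING";
      job.claimedBy = workerId;
      count++;
    }
  }
  return { count };
}

// Try the oldest pending job first; a count of 0 means another worker claimed it in between.
function claimNextJob(jobs: ExtractionJob[], workerId: string): ExtractionJob | null {
  const pending = jobs
    .filter((job) => job.status === "PENDING")
    .sort((a, b) => a.createdAt - b.createdAt);
  for (const candidate of pending) {
    if (updateManyIfPending(jobs, candidate.id, workerId).count === 1) return candidate;
  }
  return null;
}
```

The safety comes from the WHERE clause rather than from the preceding read: two workers may both see the same oldest job, but only one conditional update can flip it from PENDING, and the loser simply moves on to the next candidate.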


5.3 Invoice Intake Pipeline

The intake pipeline is the most complex flow in the system. The numbered steps below describe what happens when a single PDF is uploaded:

  1. Upload — file arrives via /invoice/bulk-upload (admin), vendor portal, or single-invoice form. Multipart body is parsed; SHA-256 is computed in-memory.
  2. Cache probe — InvoiceExtractionCache is queried by (hash, provider). Cache hit short-circuits the AI call.
  3. Cache hit path — synthetic DONE job is created and the four parallel passes (triage, reconcile, suggestPOs, detectAnomalies + comparePoLines) run inline so the UI gets identical metadata.
  4. Cache miss path — file is written to disk under UPLOAD_DIR/<hash>, ExtractionJob row is created in PENDING state, and the worker picks it up.
  5. Worker run — claims job atomically, calls Gemini with the strict Zod schema, validates response, persists ExtractionResult, writes cache.
  6. Parallel intelligence — triage classifies into HIGH/MEDIUM/LOW; reconcile re-computes math; suggestPOs returns top-3 candidate POs; detectAnomalies compares against vendor history; comparePoLines runs the 3-way match against the top PO + its GRN.
  7. Surface in queue — review queue page renders status, tier, confidence, anomaly count, and a one-click ✓ Approve button when the AI is highly confident.
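Steps 6 and 7 can be sketched in miniature with two of the passes running concurrently. This is a simplified illustration, not the platform's code: the confidence thresholds (0.9 and 0.7), the one-rupee reconciliation tolerance, the MATH_MISMATCH code, and the rule that one-click approval requires a HIGH tier with no anomalies are all assumptions; the PO suggestion, vendor-history, and line-match passes are omitted for brevity.

```typescript
// Hypothetical slice of an extraction result.
interface Extraction { subtotal: number; taxTotal: number; grandTotal: number; confidence: number }

type Tier = "HIGH" | "MEDIUM" | "LOW";

// Pass 1: bucket the aggregate confidence into a routing tier (thresholds are assumed).
async function triage(e: Extraction): Promise<Tier> {
  if (e.confidence >= 0.9) return "HIGH";
  if (e.confidence >= 0.7) return "MEDIUM";
  return "LOW";
}

// Pass 2: re-compute subtotal + tax and compare with the stated grand total.
async function reconcile(e: Extraction, toleranceRupees = 1): Promise<{ ok: boolean; delta: number }> {
  // Round to paise so floating-point noise does not register as a mismatch.
  const delta = Math.round((e.subtotal + e.taxTotal - e.grandTotal) * 100) / 100;
  return { ok: Math.abs(delta) <= toleranceRupees, delta };
}

// The passes are independent, so they can run concurrently and be merged afterwards.
async function runPasses(e: Extraction) {
  const [tier, math] = await Promise.all([triage(e), reconcile(e)]);
  const anomalies: string[] = [];
  if (!math.ok) anomalies.push("MATH_MISMATCH");
  return { tier, math, anomalies, oneClickApprove: tier === "HIGH" && anomalies.length === 0 };
}
```

Because each pass only reads the extraction result, the same runPasses call can serve both the worker path and the cache-hit path, which is how the UI can receive identical metadata in either case.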