AI-Powered Workflow - Automation of IT Refund Status Check & Communication
Author : CA. kadiyala praveen
Author : CA. kadiyala praveen
The Problem Statement
Challenge: Manual, repetitive and time-consuming process for checking Income Tax Refund Status and downloading and mailing ITR forms to clients.
Description of Manual Process: Chartered Accountants and their staff must manually perform the following steps for every client:
This multi-step process is highly susceptible to human error, is monotonous, and significantly consumes billable hours, especially during peak filing seasons.
The Solution (Automated Workflow)
The proposed solution is a hybrid automation model that combines Robotic Process Automation (RPA) for task execution and AI for intelligent data interaction, creating a seamless and efficient end-to-end compliance workflow.
Key Components:
Automated Process Flow:
AI Implementation (The Gemini API & Chatbot)
The solution moves beyond basic RPA by implementing an AI-driven interface using the Gemini API for sophisticated natural language processing (NLP) and document generation.
| Component | Technology Used | Functionality & Role of AI |
| Data Interface | Python, Streamlit, Pandas | Streamlit provides the user-friendly interface. Pandas is used to ingest and process the updated dataset containing refund statuses. |
| Intelligent Query Handling | Gemini API (Google AI) | The Gemini API processes complex, conversational queries related to the uploaded dataset. |
| Examples of AI Capability: | Data Retrieval: "What is the refund status for client ABC?" → AI searches the dataset and provides the exact status. | |
| Domain Expertise: "How long does it typically take to credit a refund as per IT Act Provisions?" → AI retrieves and summarizes relevant Income Tax provisions. | ||
| Content Generation: "Draft a professional email to client XYZ explaining their refund status is 'Sent to Bank.'" → AI generates a structured, context-aware email draft. |
Impact, Efficiency Gains, and Scalability
| Key Metric | Before Automation (Manual) | After Automation (RPA + AI) |
| Time per Client | ∼5−7 minutes | <1 minute (run time) |
| Time Savings | N/A | ∼80−90% of process time saved. |
| Accuracy/Error Rate | Prone to human errors in credential entry, email addressing, or attachment oversight. | Near 100% accuracy. Errors (e.g., 'Invalid Password') are logged and isolated automatically. |
| Staff Focus | Monotonous data entry and repetitive communication. | Value-added tasks, client consultation, and complex case analysis. |
Key Benefits:
Input file before running the flow: (User need to fill PAN/Password/Mail/Whatsapp number/AY)
Updated Excel after running the flow:
AI Chatbot Interface using Python/Streamlit/Gemini API:
Interface for uploading the database
AI Response from Dataset – Query & Bot Response:
Power Automate Desktop Flows for reference:
Python + Streamlit + Gemini API code:
import streamlit as st
import pandas as pd
import os
from google import genai
from google.genai.errors import APIError
st.set_page_config(page_title="Advanced Data Query Bot 🤖", layout="wide")
GEMINI_API_KEY = XXXXXXXXXXXXXXXXXXXXXXX
try:
if GEMINI_API_KEY == "YOUR_ACTUAL_GEMINI_API_KEY_HERE" or not GEMINI_API_KEY:
st.error("API Key not found. Please replace the placeholder in the code with your actual key.")
st.stop()
client = genai.Client(api_key=GEMINI_API_KEY)
model = "gemini-2.5-flash"
tools_config = [{"google_search": {}}]
except Exception as e:
st.error(f"Error initializing Gemini client: {e}")
st.stop()
@st.cache_data
def load_data(uploaded_file):
"""Loads, standardizes, and caches the Excel data."""
try:
df = pd.read_excel(uploaded_file)
df.columns = ["AY", "Refund Status", "Client Name", "Refund $", "Date of Filing", "Refund Date"]
date_cols = ["Date of Filing", "Refund Date"]
for col in date_cols:
df[col] = pd.to_datetime(df[col], errors='coerce').dt.strftime('%Y-%m-%d')
df[date_cols] = df[date_cols].fillna('N/A')
return df
except Exception as e:
st.error(f"Error loading or standardizing data: {e}. Check your file structure.")
return pd.DataFrame()
def generate_gemini_response(df, user_query):
"""
Crafts a prompt with the data and sends it to the Gemini model
for context-aware querying, including tool use.
"""
data_text = df.astype(str).to_markdown(index=False)
system_instruction = (
"You are an expert financial data analysis bot and tax assistant. "
"Your primary task is to answer the user's question using the provided 'Table Data'. "
"The columns are AY, Refund Status, Client Name, Refund $, Date of Filing, and Refund Date. "
"**DATA QUERY INSTRUCTIONS:** "
"If the query relates to the data (e.g., specific Client Name or status): "
"1. Identify the relevant rows based on the client name or other identifiers. "
"2. If the user asks for the **time taken to credit the refund**, calculate the difference "
"between 'Refund Date' and 'Date of Filing' for the specific client. "
"3. Provide the requested information clearly and concisely. "
"**GENERIC QUERY INSTRUCTIONS (If the Google Search Tool is available):** "
"If the query is a general question (e.g., 'How long does a tax refund usually take?') "
"that cannot be answered with the table data, use the Google Search tool to find and cite the answer "
"based on relevant tax provisions or typical processing times. "
"**Always state your source or mention if the answer is from general knowledge or external search.**"
)
user_prompt = (
f"**User Query:** '{user_query}'\n\n"
f"**Table Data (use this first):**\n{data_text}"
)
try:
response = client.models.generate_content(
model=model,
contents=user_prompt,
config=genai.types.GenerateContentConfig(
system_instruction=system_instruction,
tools=tools_config
)
)
return response.text
except APIError as e:
return f"An API Error occurred: {e}"
except Exception as e:
return f"An unexpected error occurred: {e}"
# --- Streamlit App Layout ---
st.title("Advanced Tax Data Bot (Gemini + Search) 📊🔍")
st.markdown("Upload your Excel data and ask specific questions about client refund statuses, time taken, or general tax refund provisions.")
# File Uploader
uploaded_file = st.file_uploader(
"Upload Excel File (.xlsx or .xls)",
type=['xlsx', 'xls']
)
if uploaded_file is not None:
# Load and process the data
data_df = load_data(uploaded_file)
if not data_df.empty:
st.success(f"Data loaded successfully with {len(data_df)} records.")
st.subheader("Data Preview")
st.dataframe(data_df.head())
# User Input for Query
user_input = st.text_input(
"Enter your query (e.g., 'How long did it take for the refund of John Doe?' or 'How long does tax refund credit take as per Indian tax law?')",
key="user_query"
)
if user_input:
with st.spinner("Processing your query with Gemini (and Google Search, if needed)..."):
# Get the response from the Gemini model
response = generate_gemini_response(data_df, user_input)
# Display the result
st.subheader("Bot Response")
st.info(response)
else:
st.warning("Could not process the uploaded file. Please ensure it's a valid Excel file with the expected headers.")
else:
st.info("Please upload an Excel file to begin.")