Revolutionizing NPO compliance through automated downloading and extraction
Author : CA. Chunauti H. Dholakia
Problem Statement
When documents such as circulars, notifications, judgments, forms, rules, or sections of an Act are downloaded manually from a website, multiple files cannot be downloaded together. Each document has to be downloaded individually, navigating from one page or website to another. When hundreds of documents are needed, this takes a great deal of time, and manually extracting specific details from the downloaded documents takes further time and resources. For example, downloading the Forms, FAQs and Guidance Notes applicable to non-profit organizations from the Income Tax Portal and extracting details from them is a time-consuming task.
Solution
With AI-assisted web scraping, specific documents can be downloaded from a website, including a Government portal. For example, Forms 104 to 114 applicable to non-profit organizations, together with the FAQs and guidance notes pertaining to these forms, can be downloaded to the computer as a single zip file using a Google Colab notebook, without manually navigating from one site to another. The required details can then be extracted from these forms automatically by AI.
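The download-and-bundle idea can be sketched in Python using only the standard library. This is a minimal illustration, not the author's actual script: the helper names (`LinkCollector`, `find_form_links`, `download_all`, `zip_folder`) are hypothetical, and it assumes that the documents are linked as PDF files whose file names contain the form number, which may not match how the Income Tax Portal is actually organized.

```python
import re
import zipfile
from html.parser import HTMLParser
from pathlib import Path
from urllib.parse import urljoin
from urllib.request import urlretrieve


class LinkCollector(HTMLParser):
    """Collects the href of every <a> tag on a page."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.hrefs.append(href)


def find_form_links(html, base_url, first=104, last=114):
    """Return absolute URLs of PDF links whose file name holds a form number in range.

    Assumption: the form number appears in the file name (e.g. "form-104.pdf").
    """
    collector = LinkCollector()
    collector.feed(html)
    links = []
    for href in collector.hrefs:
        name = href.rsplit("/", 1)[-1]
        if not name.lower().endswith(".pdf"):
            continue
        numbers = [int(n) for n in re.findall(r"\d+", name)]
        if any(first <= n <= last for n in numbers):
            links.append(urljoin(base_url, href))
    return links


def download_all(urls, folder):
    """Download each URL into the folder, keeping the original file name."""
    folder = Path(folder)
    folder.mkdir(parents=True, exist_ok=True)
    for url in urls:
        urlretrieve(url, str(folder / url.rsplit("/", 1)[-1]))


def zip_folder(folder, zip_path):
    """Bundle every file under the folder into a single zip archive."""
    folder = Path(folder)
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as archive:
        for path in sorted(folder.rglob("*")):
            if path.is_file():
                archive.write(path, path.relative_to(folder).as_posix())
    return zip_path
```

In a Colab notebook, the resulting archive could then be sent to the user's computer with `files.download(...)` from the `google.colab` module. The zip file should be written outside the folder being archived so that it does not include itself.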
How it works
· Automation of downloading
· Searches the Income Tax Portal for the forms applicable to non-profit organizations under the Income Tax Rules, 2026, by running a Python script in a Google Colab notebook.
· Saves these forms to the specified folder.
· Automatically navigates to another page of the Income Tax Portal.
· Searches for the FAQs and guidance notes pertaining to the forms applicable to non-profit organizations.
· Saves the FAQs and guidance notes to a specified folder.
· Bundles all forms, FAQs and guidance notes into a single zip file for download.
· Saves the zip file directly to the computer.
· Extraction of details from the downloaded documents
· Searches the downloaded guidance notes for the required details by running a Python script in a Google Colab notebook.
· Compiles the details into a single Excel sheet.
· Downloads the Excel sheet to the specified folder on the computer.
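The extraction step can be sketched as a keyword search whose matches are written to one sheet. This is an illustration under stated assumptions, not the author's method: the text of each guidance note is assumed to have been extracted already (reading PDFs needs a separate library), the helper names `find_details` and `write_details_sheet` are hypothetical, and a CSV file, which Excel opens directly, stands in for a native Excel workbook so that the example needs nothing beyond the standard library.

```python
import csv
from pathlib import Path


def find_details(text, keywords):
    """Return (keyword, line) pairs for every line that mentions a keyword."""
    matches = []
    for line in text.splitlines():
        stripped = line.strip()
        for keyword in keywords:
            # Case-insensitive substring match; real documents may need stricter rules.
            if keyword.lower() in stripped.lower():
                matches.append((keyword, stripped))
    return matches


def write_details_sheet(texts_by_file, keywords, csv_path):
    """Write one row per match (source file, keyword, matching line); return the row count."""
    rows = 0
    with open(csv_path, "w", newline="", encoding="utf-8") as handle:
        writer = csv.writer(handle)
        writer.writerow(["file", "keyword", "text"])
        for name, text in texts_by_file.items():
            for keyword, line in find_details(text, keywords):
                writer.writerow([name, keyword, line])
                rows += 1
    return rows
```

Calling `write_details_sheet` with a dictionary that maps each file name to its text, plus a list of search terms, produces a single sheet in which every row can be traced back to its source document.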
AI tools used
· Gemini AI
· Google Colab Notebook
Impact
· Automates downloading required documents in bulk.
· Automates extraction of required details from the downloaded documents.
· Extracts details from the downloaded documents accurately.
· Zero installation required.
· Zero hallucination.