ITR Extractor

A Streamlit web app for extracting fields from Income Tax Return (ITR) documents.

Pipeline:

Image annotation -> Object Detection -> Extract ROI -> OCR -> Requisite text
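The "Extract ROI" step of the pipeline can be sketched roughly as follows. This is a minimal illustration, not the code in app.py: the detection tuple layout (xmin, ymin, xmax, ymax, confidence, label), the function names, and the label strings are all assumptions made for the example, and the image is a plain list of pixel rows so the sketch runs without extra libraries. It keeps the highest-confidence box per label and crops that region, which would then be passed to OCR.

```python
from typing import Dict, List, Sequence, Tuple

# Hypothetical detection record: (xmin, ymin, xmax, ymax, confidence, label).
Detection = Tuple[float, float, float, float, float, str]


def best_per_label(detections: Sequence[Detection]) -> Dict[str, Detection]:
    """Keep only the highest-confidence detection for each label."""
    best: Dict[str, Detection] = {}
    for det in detections:
        confidence, label = det[4], det[5]
        if label not in best or confidence > best[label][4]:
            best[label] = det
    return best


def crop_roi(image: List[List[int]], det: Detection) -> List[List[int]]:
    """Crop one box from a non-empty image stored as a list of pixel rows."""
    height, width = len(image), len(image[0])
    # Round to whole pixels and clamp the box to the image bounds.
    x1 = max(0, min(width, int(round(det[0]))))
    y1 = max(0, min(height, int(round(det[1]))))
    x2 = max(0, min(width, int(round(det[2]))))
    y2 = max(0, min(height, int(round(det[3]))))
    return [row[x1:x2] for row in image[y1:y2]]


def extract_rois(
    image: List[List[int]], detections: Sequence[Detection]
) -> Dict[str, List[List[int]]]:
    """Return one cropped region per label, ready to hand to an OCR step."""
    return {
        label: crop_roi(image, det)
        for label, det in best_per_label(detections).items()
    }
```

With a NumPy image array, the same crop would typically be written as a slice such as image[y1:y2, x1:x2].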

DEMO: see demo.gif in this repo.

Refer to the custom training tutorial for YOLOv5. It recommends using one of its ready-made environments for training to avoid dependency issues.
Refer to train_val_test.ipynb to see how I trained, validated, and tested my model on Google Colab, using Google Drive for storage.
After training, validating, and testing, download the best weights file, 'best.pt', from the directory specified by YOLOv5. Clone this repo to a local directory, place best.pt alongside the rest of the repo's contents, and rename it to 'best_weights.pt'. Then install the dependencies:

pip install -r requirements.txt

To run the web app, run the following commands in the terminal:

cd path/to/directory
streamlit run app.py

Image set	mAP@0.5 (%)	mAP@0.5:0.95 (%)
Validation (all classes)	96.9	77.2
Testing (all classes)	92.1	68.3
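For readers unfamiliar with the column headers: mAP@0.5 counts a detection as correct when its box overlaps the ground truth with an intersection-over-union (IoU) of at least 0.5, while mAP@0.5:0.95 averages the metric over IoU thresholds from 0.5 to 0.95 in steps of 0.05. The sketch below shows only the IoU part and the threshold list; it is a simplified illustration, not YOLOv5's evaluation code, and the function names are made up for the example.

```python
def iou(box_a, box_b):
    """Intersection-over-union of two (xmin, ymin, xmax, ymax) boxes."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    # Width and height of the overlap; zero when the boxes do not intersect.
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return inter / union if union > 0 else 0.0


# The ten IoU thresholds behind "0.5:0.95": 0.50, 0.55, ..., 0.95.
IOU_THRESHOLDS = [round(0.5 + 0.05 * i, 2) for i in range(10)]


def thresholds_passed(pred_box, true_box):
    """List the IoU thresholds at which this prediction counts as a match."""
    score = iou(pred_box, true_box)
    return [t for t in IOU_THRESHOLDS if score >= t]
```

A loosely fitting box can pass the 0.5 threshold but fail the stricter ones, which is why the mAP@0.5:0.95 figures are lower than the mAP@0.5 figures.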
