A Streamlit application that lets users upload multiple PDF files and ask questions about their content through a conversational interface. It uses Retrieval-Augmented Generation (RAG): a FAISS index retrieves the text chunks most relevant to a question, and Google's Gemini model uses them to generate context-aware answers.
- Upload multiple PDF documents for processing.
- Extract text from the PDF files using PyPDF2.
- Split the text into manageable chunks so that relevant context can be retrieved for each question.
- Create a FAISS index of embeddings for efficient similarity search.
- Use Google's Gemini LLM to generate responses to user queries.
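
The chunking and similarity-search steps listed above can be illustrated with a small standard-library sketch. This is not the code in chatpdf.py: the function names (`split_text`, `embed`, `top_k`) are hypothetical, the bag-of-words "embedding" is a toy stand-in for Google's embedding model, and the brute-force cosine ranking stands in for the FAISS index. It only shows the shape of the idea: overlapping chunks, one vector per chunk, and retrieval of the chunks closest to the question.

```python
import math
import re
from collections import Counter


def split_text(text, chunk_size=200, overlap=50):
    """Split text into character windows that overlap by `overlap` characters."""
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the last window already reaches the end of the text
    return chunks


def embed(text):
    """Toy embedding (assumption): word counts instead of a real model vector."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[token] * b[token] for token in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k(question, chunks, k=2):
    """Return the k chunks most similar to the question (brute-force search)."""
    query = embed(question)
    ranked = sorted(chunks, key=lambda c: cosine(query, embed(c)), reverse=True)
    return ranked[:k]


if __name__ == "__main__":
    docs = [
        "FAISS builds an index of embeddings",
        "Streamlit renders the sidebar",
        "PyPDF2 extracts text from PDF pages",
    ]
    print(top_k("How is text extracted from a PDF?", docs, k=1))
```

In a RAG pipeline, the retrieved chunks are then passed to the LLM together with the question. The overlap between neighbouring chunks reduces the chance that a sentence relevant to the question is cut in half at a chunk boundary.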
- Clone this repository to your local environment.
- Ensure that you have Python installed on your machine.
- Install the necessary Python packages using:
pip install -r requirements.txt
- You must have a .env file containing your GOOGLE_API_KEY for the Gemini LLM.
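
As an illustration of the .env requirement, the file is typically a single line such as `GOOGLE_API_KEY=your-key-here`. The sketch below shows one way such a file could be read using only the standard library. It is an assumption about the file format, not the repository's code: projects like this often use the python-dotenv package's `load_dotenv()` instead, and the helper names `read_env_file` and `require_api_key` are hypothetical.

```python
import os


def read_env_file(path=".env"):
    """Parse simple KEY=VALUE lines, skipping blank lines and # comments."""
    values = {}
    if not os.path.exists(path):
        return values
    with open(path, encoding="utf-8") as handle:
        for raw in handle:
            line = raw.strip()
            if not line or line.startswith("#") or "=" not in line:
                continue
            key, _, value = line.partition("=")
            # Drop surrounding whitespace and optional quotes around the value.
            values[key.strip()] = value.strip().strip("'\"")
    return values


def require_api_key(path=".env", name="GOOGLE_API_KEY"):
    """Return the key from the process environment or the file, else raise."""
    key = os.environ.get(name) or read_env_file(path).get(name)
    if not key:
        raise RuntimeError(f"{name} is missing; add it to {path}")
    return key
```

Failing early with a clear message when the key is absent is usually friendlier than letting the first API call fail later inside the app.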
- Start the Streamlit app:
streamlit run chatpdf.py
- Open the Streamlit application in your browser.
- Use the sidebar to upload PDF files by clicking on the "Upload your PDF Files" button.
- Once uploaded, click "Submit & Process" to index the content of the PDFs.
- After processing, ask a question in the text input field to get responses based on the PDF content.
This repo also includes other simple projects: one that uses the gemini-vision model, and a basic Q&A system with memory storage.
- Streamlit - For creating the web application.
- PyPDF2 - For PDF text extraction.
- LangChain - For text splitting and chaining logic.
- FAISS - For efficient similarity search.
- Google Generative AI - For embedding generation and the conversational model.