This project is a Streamlit application that empowers users to upload PDF documents and engage in interactive question-answering sessions about their content. It employs an Iterative Retrieval-Augmented Generation (RAG) technique, enhanced with a feedback mechanism, to deliver precise and contextually relevant answers. Built with LangChain, Chroma, and LangChain-Ollama, this application is ideal for researchers, students, or anyone needing to extract insights from PDF documents efficiently.
- Document Upload and Processing: Seamlessly upload multiple PDF files and process them into manageable chunks for querying.
- Interactive Q&A: Ask questions about the uploaded documents and receive detailed, context-aware responses.
- Iterative Feedback Mechanism: Leverages conversation history to improve the accuracy and relevance of subsequent responses.
- Transparent Retrieval: View the document chunks used to generate each answer, ensuring transparency and trust in the results.
- User-Friendly Interface: Built with Streamlit for an intuitive and responsive user experience.
The application utilizes an advanced RAG architecture that combines retrieval-based and generative AI capabilities. Below is a detailed breakdown of the workflow:
1. Document Ingestion:
   - Users upload PDF files through the Streamlit sidebar.
   - Each PDF is loaded with `PyPDFLoader` from LangChain and split into smaller chunks by `RecursiveCharacterTextSplitter`, using a chunk size of 1000 characters and an overlap of 200 characters (see the ingestion sketch after this list).
2. Vector Store Creation:
   - Document chunks are converted into embeddings using `OllamaEmbeddings` (model: `nomic-embed-text:latest`).
   - These embeddings are stored in a `Chroma` vector store, enabling efficient similarity-based retrieval (see the vector store sketch after this list).
3. Query Processing:
   - When a user submits a query, the system retrieves the top 3 most relevant document chunks from the vector store using a similarity search (steps 3 and 4 are sketched together after this list).
4. Response Generation:
   - The retrieved chunks, the user's query, and the conversation history (feedback) are combined into a single prompt using a `ChatPromptTemplate`.
   - The prompt is passed to `OllamaLLM` (model: `qwen2.5:latest`), which generates a detailed, contextually accurate response.
5. Feedback Loop:
   - The system maintains a conversation history (up to 5 previous messages) that provides context for subsequent queries, improving response quality over time (see the feedback sketch after this list).
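The following sketches make each step concrete. They are minimal illustrations, not the application's actual code: file paths, the sample question, and the prompt wording are hypothetical placeholders. First, document ingestion (note that `PyPDFLoader` relies on the `pypdf` package, which may need to be installed separately):

```python
# Step 1 sketch: load a PDF and split it into overlapping chunks.
from langchain_community.document_loaders import PyPDFLoader
from langchain_text_splitters import RecursiveCharacterTextSplitter

loader = PyPDFLoader("example.pdf")  # hypothetical path to an uploaded PDF
pages = loader.load()                # one Document per page

splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=200)
chunks = splitter.split_documents(pages)
print(f"Split {len(pages)} pages into {len(chunks)} chunks")
```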
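Vector store creation continues from the `chunks` produced above; a minimal sketch:

```python
# Step 2 sketch: embed the chunks with Ollama and index them in Chroma.
from langchain_chroma import Chroma
from langchain_ollama import OllamaEmbeddings

embeddings = OllamaEmbeddings(model="nomic-embed-text:latest")
vector_store = Chroma.from_documents(documents=chunks, embedding=embeddings)
```

By default this keeps the index in memory; passing a `persist_directory` argument to `Chroma.from_documents` would store it on disk instead.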
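Retrieval and generation (steps 3 and 4) can be sketched as a single chain. The prompt text below is an illustrative stand-in for whatever template the application actually uses:

```python
# Steps 3-4 sketch: retrieve the top 3 chunks, build a prompt, and query the LLM.
from langchain_core.prompts import ChatPromptTemplate
from langchain_ollama import OllamaLLM

query = "What is the main conclusion of the document?"  # hypothetical user question
docs = vector_store.similarity_search(query, k=3)       # top 3 relevant chunks
context = "\n\n".join(doc.page_content for doc in docs)

prompt = ChatPromptTemplate.from_template(
    "Use the conversation history and the context to answer the question.\n"
    "History:\n{history}\n\nContext:\n{context}\n\nQuestion: {question}"
)
chain = prompt | OllamaLLM(model="qwen2.5:latest")
answer = chain.invoke({"history": "", "context": context, "question": query})
print(answer)
```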
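Finally, for the feedback loop, one simple way to cap the history at 5 messages is a bounded deque. This sketches the idea (reusing `query` and `answer` from the previous sketch) and is not necessarily how the application stores its history:

```python
# Step 5 sketch: retain only the most recent messages as feedback for later prompts.
from collections import deque

history = deque(maxlen=5)  # oldest messages are dropped automatically
history.append(f"User: {query}")
history.append(f"Assistant: {answer}")

feedback = "\n".join(history)  # supplied as {history} in the next prompt
```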
The following text-based diagram illustrates the flow of the Iterative RAG with Feedback system:
    [Upload PDFs] --> [Process Documents] --> [Create Vector Store]
                                                       |
                                                       v
    [User Query] --> [Retrieve Relevant Documents] --> [Generate Prompt (with Feedback)] --> [Language Model] --> [Response]
                                                                   ^                                                  |
                                                                   |                                                  v
                                                                   +----- [Feedback] <----- [Update Conversation History]
To visualize this flow, you can create a diagram using tools like draw.io or Lucidchart. The diagram should depict the sequential flow from PDF upload to response generation, with a feedback loop connecting the response back to the prompt generation step.
To set up and run this application, follow these steps:
- Python 3.10 or later
- Ollama installed and running for embeddings and language model inference
- A compatible environment with internet access for package installation
1. Clone the repository:

       git clone https://github.com/armanjscript/Iterative-RAG-with-Feedback.git

2. Navigate to the project directory:

       cd Iterative-RAG-with-Feedback

3. Install the required Python libraries:

       pip install streamlit langchain langchain-chroma langchain-community langchain-core langchain-text-splitters langchain-ollama

4. Ensure Ollama is running with the required models (`nomic-embed-text:latest` for embeddings and `qwen2.5:latest` for the language model). You can pull them with:

       ollama pull nomic-embed-text:latest
       ollama pull qwen2.5:latest

5. Launch the Streamlit application:

       streamlit run iterative_rag_with_feedback.py
6. Open your browser to the provided Streamlit URL (typically `http://localhost:8501`).
7. In the Streamlit interface:
   - Use the sidebar to upload one or more PDF files.
   - Click the "Process Documents" button to load and process the PDFs.
   - Enter questions in the chat input field to receive answers based on the document content.
   - View retrieval details in the expandable section to see which document chunks were used for each response.
8. To clear processed documents and start over, click the "Clear Documents" button in the sidebar.
| Step | Action | Outcome |
|------|--------|---------|
| 1 | Upload PDFs | PDFs are saved and processed into chunks |
| 2 | Process Documents | Chunks are embedded and stored in Chroma |
| 3 | Ask a Question | Relevant chunks are retrieved, and a response is generated |
| 4 | View Response | Answer is displayed with retrieval details |
| 5 | Continue Asking | Feedback from previous interactions improves responses |
Contributions are warmly welcomed! To contribute:
- Fork the repository.
- Create a new branch for your feature or bug fix.
- Make your changes, ensuring code quality and adherence to the project’s structure.
- Submit a pull request with a clear description of your changes.
Please ensure your contributions align with the project’s goals and maintain compatibility with the existing technologies (Streamlit, LangChain, Chroma, and Ollama).
This project is licensed under the MIT License.
For questions, feedback, or collaboration opportunities, please contact [email protected] or visit Arman Daneshdoost.