A modern Retrieval-Augmented Generation (RAG) application built using Google Gemini, LangChain, ChromaDB, and Gradio.
The system answers user questions strictly based on locally provided documents. It is designed to overcome fundamental limitations of Large Language Models (LLMs), specifically hallucination and lack of updated information beyond their knowledge cutoff.
Large Language Models are probabilistic text generators. As a result, they often:
- Produce confident but factually incorrect responses
- Fabricate information when relevant knowledge is missing
- Do not clearly state uncertainty
This behavior makes LLMs unsuitable for enterprise, research, and decision‑critical applications.
Large Language Models:
- Are trained on static, historical datasets
- Cannot access private, internal, or domain‑specific documents
- Cannot adapt to new or frequently updated information
This severely limits their usability in real‑world systems that require dynamic and context‑aware intelligence.
- Uses relevant document content before answering, so the model does not guess
- Fetches fresh information from local files instead of relying on old training data
- Clearly responds with “I don’t know” when the answer is not found in documents
- Breaks documents into smaller chunks for faster and more accurate retrieval
- Separates ingestion, retrieval, and inference for easy updates and scalability
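The chunking behavior described above can be sketched in plain Python. This is an illustrative sketch only: the project delegates splitting to LangChain, and the `chunk_size` and `overlap` values here are assumptions, not the project's actual settings.

```python
def split_into_chunks(text: str, chunk_size: int = 200, overlap: int = 50) -> list[str]:
    """Split text into overlapping character chunks.

    The overlap keeps sentences that straddle a chunk boundary
    retrievable from at least one chunk.
    """
    if chunk_size <= overlap:
        raise ValueError("chunk_size must be larger than overlap")
    chunks = []
    step = chunk_size - overlap
    for start in range(0, len(text), step):
        chunk = text[start:start + chunk_size]
        if chunk:
            chunks.append(chunk)
    return chunks

# Toy document for demonstration.
document = "RAG grounds model answers in retrieved document context. " * 10
chunks = split_into_chunks(document, chunk_size=120, overlap=30)
```

In practice LangChain's `RecursiveCharacterTextSplitter` plays this role and splits on natural boundaries (paragraphs, sentences) rather than raw character offsets.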
- Documents are loaded from the `data/` folder
- Documents are split into small chunks
- Each chunk is converted into vector embeddings
- Embeddings are stored in ChromaDB
- User queries retrieve the most relevant chunks
- Gemini generates answers using retrieved context only
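The retrieval step in the pipeline above can be illustrated with a toy cosine-similarity search. In the real system, ChromaDB stores Gemini embeddings and performs this search; the 3-dimensional vectors, chunk texts, and `threshold` value below are made up purely for illustration. The threshold guard mirrors the "I don't know" behavior: if no chunk is similar enough, nothing is returned.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)

def retrieve(query_vec, chunk_vecs, chunks, k=2, threshold=0.3):
    """Return the top-k chunks by similarity, or [] when nothing
    clears the threshold (the 'I don't know' case)."""
    scored = sorted(
        ((cosine(query_vec, v), c) for v, c in zip(chunk_vecs, chunks)),
        key=lambda sc: sc[0],
        reverse=True,
    )
    return [c for s, c in scored[:k] if s >= threshold]

# Toy 3-dimensional "embeddings" for illustration only.
chunks = ["pricing policy", "refund rules", "holiday schedule"]
chunk_vecs = [[1.0, 0.1, 0.0], [0.9, 0.3, 0.1], [0.0, 0.1, 1.0]]

context = retrieve([1.0, 0.2, 0.0], chunk_vecs, chunks)
```

The retrieved `context` is what gets passed to Gemini as the only material it may answer from; an empty result is what triggers the explicit "I don't know" response.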
- Programming Language: Python
- Large Language Model: Google Gemini
- Framework: LangChain
- Vector Database: ChromaDB
- Embeddings: Google Generative AI Embeddings
- Frontend / UI: Gradio
- Environment Management: Python `venv`, `dotenv`
```bash
git clone https://github.com/supriya46788/Context-Aware-Retrieval-Augmented-Knowledge-Inference-Engine.git
cd Context-Aware-Retrieval-Augmented-Knowledge-Inference-Engine
python -m venv venv
venv\Scripts\activate
pip install -r requirements.txt
```

Create a `.env` file in the project root:

```
GOOGLE_API_KEY=your_google_gemini_api_key
```
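The project lists `dotenv` in its stack, so the key is presumably loaded with python-dotenv's `load_dotenv()`. The stdlib-only sketch below shows what that loading amounts to; the loader function and the demo key value are illustrative, not the project's code.

```python
import os
import tempfile

def load_env_file(path: str) -> None:
    """Minimal .env loader: reads KEY=VALUE lines, skipping blanks
    and '#' comments. Real projects use python-dotenv instead."""
    with open(path) as fh:
        for line in fh:
            line = line.strip()
            if not line or line.startswith("#") or "=" not in line:
                continue
            key, _, value = line.partition("=")
            os.environ.setdefault(key.strip(), value.strip())

# Demo: write a throwaway .env file and load it.
with tempfile.NamedTemporaryFile("w", suffix=".env", delete=False) as fh:
    fh.write("GOOGLE_API_KEY=demo-key\n")
    env_path = fh.name

os.environ.pop("GOOGLE_API_KEY", None)  # clean slate for the demo
load_env_file(env_path)
key = os.environ.get("GOOGLE_API_KEY")
```

Keeping the key in `.env` (and out of version control) is what lets the app read `GOOGLE_API_KEY` at startup without hard-coding a secret.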
```bash
python ingest.py
```

Any update to the `data/` directory requires re-running the ingestion step.
```bash
python app.py
```

Access the application at: http://localhost:7860