A GenAI application that summarizes YouTube videos and answers questions about their content using LLMs, embeddings, and semantic search. Built with LangChain, Gemini Pro, and ChromaDB, and served through a Flask REST API.
- 🔍 YouTube Transcript Extraction – Automatically pulls transcripts from any YouTube video.
- 🧠 RAG Pipeline (Retrieval-Augmented Generation) – Uses vector search + LLMs to provide accurate, context-aware answers.
- 🧾 Summarization + Q&A – Ask any question about the video or request a summary.
- 💾 Persistent Memory – Stores vectorized chunks in ChromaDB using unique hashes.
- 🌐 Flask REST API – Backend ready for integration into web apps or platforms.
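
The persistent-memory feature can be illustrated with a small sketch of how a stable key might be derived for each video, so that chunks embedded once can be found again on later requests. This README does not say what is hashed or which hash function is used; the sketch below assumes the YouTube video ID is hashed with SHA-256, and the helper names `video_id_from_url` and `collection_name` are hypothetical.

```python
import hashlib
from urllib.parse import parse_qs, urlparse


def video_id_from_url(url: str) -> str:
    """Extract the video ID from common YouTube URL shapes (assumed input formats)."""
    parsed = urlparse(url)
    # Short links look like https://youtu.be/<id>
    if parsed.hostname in ("youtu.be", "www.youtu.be"):
        return parsed.path.lstrip("/")
    # Standard links look like https://www.youtube.com/watch?v=<id>
    query = parse_qs(parsed.query)
    if "v" in query:
        return query["v"][0]
    raise ValueError(f"Could not find a video ID in {url!r}")


def collection_name(url: str) -> str:
    """Build a deterministic name for the video's vector collection.

    Assumption: hashing the video ID (not the full URL) so that different
    links to the same video map to the same stored chunks.
    """
    digest = hashlib.sha256(video_id_from_url(url).encode("utf-8")).hexdigest()
    # Prefix plus 16 hex characters keeps the name short and alphanumeric.
    return f"yt_{digest[:16]}"
```

Under these assumptions, `collection_name("https://youtu.be/<id>")` and `collection_name("https://www.youtube.com/watch?v=<id>")` return the same name, so the app could check for an existing collection before re-embedding a transcript.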
- Load Transcript from a YouTube video using `YoutubeLoader`.
- Split Transcript into chunks using `RecursiveCharacterTextSplitter`.
- Generate Embeddings using `GoogleGenerativeAIEmbeddings`.
- Store & Search chunks using the `Chroma` vector database.
- Query through a `RetrievalQA` or `RetrievalChain` powered by Gemini Pro and LangChain.
- Respond in a friendly tone customized for young learners using prompt templates.
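
The flow of the steps above (split, embed, search, prompt) can be sketched with the standard library alone. This is a toy stand-in, not the project's code: fixed-size character windows replace the recursive splitter, word-count vectors replace Gemini embeddings, an in-memory list replaces ChromaDB, and the prompt wording is invented for illustration. All function names here are hypothetical.

```python
import math
import re
from collections import Counter


def split_text(text: str, chunk_size: int = 200, overlap: int = 40) -> list[str]:
    """Cut text into overlapping character windows (simplified splitter)."""
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    chunks, start, step = [], 0, chunk_size - overlap
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the last window already reaches the end of the text
        start += step
    return chunks


def embed(text: str) -> Counter:
    """Toy embedding: lowercase word counts (stand-in for a real embedding model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[word] * b[word] for word in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def retrieve(question: str, chunks: list[str], k: int = 2) -> list[str]:
    """Return the k chunks most similar to the question."""
    q = embed(question)
    return sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)[:k]


def build_prompt(question: str, context: list[str]) -> str:
    """Assemble a prompt with a friendly tone for young learners (wording is an assumption)."""
    joined = "\n".join(context)
    return (
        "You are a friendly tutor for young learners. "
        "Answer using only the transcript excerpts below.\n\n"
        f"Excerpts:\n{joined}\n\nQuestion: {question}\nAnswer:"
    )
```

In the real pipeline the string returned by `build_prompt` would be sent to the LLM; here it only shows how retrieved chunks and the question end up in one prompt.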
To start the app, run `python app.py`.