This issue is to create 15 different Retrieval-Augmented Generation (RAG) implementations from scratch using base libraries like:

- litellm for LLM interaction
- lancedb for vector storage and retrieval
- Embedding models from Hugging Face (e.g., BAAI/bge-base-en-v1.5 and the sentence-transformers family)

The goal is to recreate each RAG variant found in the original all-rag-techniques repo using a minimal stack with readable, modifiable code, integrating AI Engineering Academy's educational focus and data sources. A minimal end-to-end sketch of this stack is shown below.
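The sketch embeds a handful of documents with BAAI/bge-base-en-v1.5, stores and searches them in a local lancedb table, and generates an answer through litellm. The document snippets, table name, and the gpt-4o-mini model id are illustrative placeholders, not requirements of this issue.

```python
# Minimal sketch (not final notebook code): embed documents with a Hugging Face
# model, store/retrieve them with lancedb, and answer with litellm.
import lancedb
from litellm import completion
from sentence_transformers import SentenceTransformer

embedder = SentenceTransformer("BAAI/bge-base-en-v1.5")

docs = [
    "RAG combines retrieval over a document store with LLM generation.",
    "lancedb is an embedded vector database that stores tables on local disk.",
    "litellm exposes a single completion() interface over many LLM providers.",
]

# Normalizing embeddings makes L2 distance rank the same as cosine similarity.
vectors = embedder.encode(docs, normalize_embeddings=True)

db = lancedb.connect("./rag_lancedb")  # local, file-based storage (placeholder path)
table = db.create_table(
    "docs",
    data=[{"vector": v.tolist(), "text": t} for v, t in zip(vectors, docs)],
    mode="overwrite",
)

def answer(question: str, k: int = 2) -> str:
    """Retrieve the top-k chunks and ask the LLM to answer from them."""
    q_vec = embedder.encode([question], normalize_embeddings=True)[0]
    hits = table.search(q_vec.tolist()).limit(k).to_list()
    context = "\n\n".join(hit["text"] for hit in hits)
    prompt = f"Answer using only the context below.\n\nContext:\n{context}\n\nQuestion: {question}"
    response = completion(
        model="gpt-4o-mini",  # placeholder; any litellm-supported model id works
        messages=[{"role": "user", "content": prompt}],
    )
    return response.choices[0].message.content

print(answer("What does lancedb store?"))
```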
Educational Rationale
Retrieval-Augmented Generation is a cornerstone of modern LLM applications. By rebuilding these techniques from scratch:
- Learners get a transparent view into retrieval pipelines.
- They learn to decouple from opaque frameworks like LangChain or LlamaIndex.
- It provides a foundation for deploying RAG in constrained environments.
- Each variant highlights different trade-offs (speed, interpretability, relevance, adaptability), which are crucial to master in AI product engineering.
By using open-source embedding models and a fast, local vector DB like lancedb, the content aligns with privacy-first and cost-effective AI engineering practices.
Content Type
Topic Area
Target Knowledge Level
Content Description
This initiative will rebuild 15 of the most educational and distinct RAG pipelines from the original “all-rag-techniques” repository into modular notebooks compatible with the [AI Engineering Academy format](https://github.com/adithya-s-k/AI-Engineering.academy/blob/main/docs/RAG/01_Basic_RAG/basic_rag_scratch.ipynb).
Each notebook should:
- Use litellm as the LLM client for generation.
- Use lancedb as the local vector database for embedding storage/retrieval.
- Use data from [/archives/data](https://github.com/adithya-s-k/AI-Engineering.academy/tree/main/archives/data), replacing PDF test documents with unified AI-related content (see the ingestion sketch after this list).
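Ingestion could look roughly like the sketch below, assuming the repository is checked out locally so that archives/data is on disk and contains Markdown or plain-text files; the local path, chunk size, and table name are placeholder assumptions.

```python
# Rough ingestion sketch: read files from the shared data folder, chunk them,
# embed the chunks, and write everything into a local lancedb table.
from pathlib import Path

import lancedb
from sentence_transformers import SentenceTransformer

DATA_DIR = Path("AI-Engineering.academy/archives/data")  # local clone (assumption)
CHUNK_CHARS = 800

def chunk(text: str, size: int = CHUNK_CHARS, overlap: int = 100):
    """Naive fixed-size character chunking with a small overlap."""
    step = size - overlap
    return [text[i : i + size] for i in range(0, max(len(text), 1), step)]

embedder = SentenceTransformer("BAAI/bge-base-en-v1.5")

records = []
for path in sorted(DATA_DIR.rglob("*")):
    if path.suffix.lower() not in {".md", ".txt"}:  # assumed file types
        continue
    for piece in chunk(path.read_text(encoding="utf-8", errors="ignore")):
        records.append({"text": piece, "source": str(path)})

vectors = embedder.encode([r["text"] for r in records], normalize_embeddings=True)
for record, vec in zip(records, vectors):
    record["vector"] = vec.tolist()

db = lancedb.connect("./rag_lancedb")
db.create_table("academy_docs", data=records, mode="overwrite")
```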
Notebooks to create (add more if possible):
Each notebook will include:

- Evaluation code (either LLM-assisted or metric-based)
- Visualizations for similarity scores and retrieval steps (matplotlib/seaborn), sketched below
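One possible shape for those pieces is sketched below: a matplotlib bar chart of retrieval similarity scores and a simple LLM-as-judge check via litellm. The prompt wording and the gpt-4o-mini model id are assumptions, not part of this issue.

```python
# Sketch of per-notebook evaluation and visualization helpers.
import matplotlib.pyplot as plt
from litellm import completion

def plot_similarity(scores, labels, title="Top-k retrieval similarity"):
    """Bar chart of similarity (or distance) scores for the retrieved chunks."""
    fig, ax = plt.subplots(figsize=(6, 3))
    ax.bar(range(len(scores)), scores)
    ax.set_xticks(range(len(labels)))
    ax.set_xticklabels(labels, rotation=45, ha="right")
    ax.set_ylabel("score")
    ax.set_title(title)
    fig.tight_layout()
    plt.show()

def judge_relevance(question: str, answer: str, context: str) -> str:
    """Ask an LLM to grade how well the answer is supported by the context (1-5)."""
    prompt = (
        "Rate from 1 to 5 how well the answer is supported by the context.\n"
        f"Context:\n{context}\n\nQuestion: {question}\nAnswer: {answer}\n"
        "Reply with a single number."
    )
    response = completion(
        model="gpt-4o-mini",  # placeholder model id
        messages=[{"role": "user", "content": prompt}],
    )
    return response.choices[0].message.content.strip()

# Example usage with hits returned by a lancedb search (results include a
# `_distance` field alongside the stored columns):
# plot_similarity([h["_distance"] for h in hits], [f"chunk {i}" for i in range(len(hits))])
```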
Resources
- BAAI/bge-base-en-v1.5