In this module, we cover techniques that can improve your RAG pipeline:
- Small-to-Big chunk retrieval
- Leveraging document metadata
- Hybrid search
- User query rewriting
- Document reranking
Links:
- Slides
- Five Techniques for Improving RAG Chatbots - Nikita Kozodoi [Video]
- Survey on RAG techniques [Article]
- Hybrid search strategy
- Hybrid search in Elasticsearch
Links:
- Reranking concept and metrics
- Reciprocal Rank Fusion (RRF)
- Handmade reranking implementation
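
To make the RRF idea concrete, here is a minimal handmade sketch, not the implementation used in the videos. Each document gets a score of `1 / (k + rank)` from every ranked list it appears in, and the scores are summed. The function name, the sample document IDs, and the tie-breaking rule are assumptions made for illustration; `k=60` is the commonly used default constant.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document IDs into a single ranking.

    A document's fused score is the sum of 1 / (k + rank) over every list
    that contains it, with ranks starting at 1.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest score first; ties are broken by document ID so the output is
    # deterministic (an illustrative choice, not part of RRF itself).
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Hypothetical results from a keyword search and a vector search.
keyword_hits = ["doc_a", "doc_b", "doc_c"]
vector_hits = ["doc_c", "doc_a", "doc_d"]

fused = reciprocal_rank_fusion([keyword_hits, vector_hits])
print(fused)
```

Documents that rank well in both lists (`doc_a`, `doc_c`) end up ahead of documents found by only one of the searches.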
Links:
- Reciprocal Rank Fusion (RRF) method [Elasticsearch Guide]
- RRF method [Article]
- Elasticsearch subscription plans
To use RRF-based reranking, pull and run a Docker container with Elasticsearch 8.9.0 or higher:
docker run -it \
--rm \
--name elasticsearch \
-m 4GB \
-p 9200:9200 \
-p 9300:9300 \
-e "discovery.type=single-node" \
-e "xpack.security.enabled=false" \
docker.elastic.co/elasticsearch/elasticsearch:8.9.0
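
Once the container is up, it can help to confirm that the running version meets the 8.9.0 minimum. The sketch below uses only the Python standard library and assumes the port mapping and disabled security from the command above. All function names are hypothetical helpers, not part of any Elasticsearch client.

```python
import json
from urllib.request import urlopen

MIN_VERSION = (8, 9, 0)


def parse_version(number):
    """Turn a version string such as '8.9.0' into a comparable tuple of ints."""
    # Drop any suffix such as '-SNAPSHOT' before splitting on dots.
    core = number.split("-")[0]
    return tuple(int(part) for part in core.split(".")[:3])


def meets_minimum_version(info, minimum=MIN_VERSION):
    """Check the 'version.number' field of the root endpoint response."""
    return parse_version(info["version"]["number"]) >= minimum


def fetch_cluster_info(url="http://localhost:9200"):
    """GET the root endpoint; no credentials are sent, which assumes
    xpack.security.enabled=false as in the docker command above."""
    with urlopen(url, timeout=5) as response:
        return json.load(response)
```

With the container running, `meets_minimum_version(fetch_cluster_info())` performs the check against the live node.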
- LangChain: Introduction
- ElasticsearchRetriever
- Hybrid search implementation
pip install -qU langchain langchain-elasticsearch langchain-huggingface
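
As a rough sketch of what a hybrid search request can look like, the function below builds a search body that combines a keyword `match` query with a `knn` vector query and asks Elasticsearch to fuse them with RRF. It assumes the `rank` / `rrf` request syntax from the 8.9 release; later versions may use a different form. The field names `text` and `embedding`, the `make_hybrid_query` helper, and the `fake_embed` stand-in are all hypothetical. A function of this shape (query string in, request body out) is the kind of callable that `ElasticsearchRetriever` accepts as its `body_func`.

```python
from typing import Callable, Dict, List


def make_hybrid_query(
    embed: Callable[[str], List[float]],
    text_field: str = "text",          # assumed name of the text field
    vector_field: str = "embedding",   # assumed name of the dense_vector field
    k: int = 5,
    num_candidates: int = 50,
) -> Callable[[str], Dict]:
    """Return a function that maps a user query to a hybrid search body."""

    def hybrid_query(search_query: str) -> Dict:
        return {
            # Keyword (BM25) part of the search.
            "query": {"match": {text_field: search_query}},
            # Vector part; the query must be embedded with the same model
            # that was used at indexing time.
            "knn": {
                "field": vector_field,
                "query_vector": embed(search_query),
                "k": k,
                "num_candidates": num_candidates,
            },
            # Fuse both result lists with RRF using the default settings.
            "rank": {"rrf": {}},
        }

    return hybrid_query


def fake_embed(text: str) -> List[float]:
    """Stand-in for a real embedding model, for illustration only."""
    return [float(len(text)), 0.0, 1.0]


body = make_hybrid_query(fake_embed)("how do I join the course?")
```

In a real pipeline, `fake_embed` would be replaced by the `embed_query` method of an embeddings object, for example one from `langchain-huggingface`.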
Links:
TBD
- First link goes here
- Did you take notes? Add them above this line (Send a PR with links to your notes)