Retrieval, RAG, and Vector Search
This page tracks late-interaction retrieval, HNSW, RAG pipelines, video RAG, and retrieval-oriented models.
Sources in this batch
- HNSW (Hierarchical Navigable Small World graphs) provides a core approximate-nearest-neighbor (ANN) indexing method.
- Omar Khattab’s late-interaction material points toward retrieval architectures beyond one dense vector per document.
- A RAG pipeline video, VideoRAG, and ColModernVBERT connect retrieval to long-context and multimodal workloads.
Research interest
The interesting trend is that retrieval is becoming modality- and task-specific. VideoRAG and late-interaction models suggest that a single vector per document is often too crude: pooling a document into one embedding discards which token, segment, or timestamp actually matched the query. Agents may need retrieval systems that preserve structure, time, and token-level interactions.
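To make the single-vector versus late-interaction contrast concrete, here is a minimal sketch in plain Python. It assumes token embeddings are already available as small lists of floats. The toy vectors and the function names (`single_vector_score`, `maxsim_score`) are illustrative and not taken from any of the sources. The late-interaction score is the MaxSim-style sum used by ColBERT-like models: for each query token, take its best-matching document token, then add those maxima up.

```python
from typing import List, Sequence

Vector = Sequence[float]


def dot(a: Vector, b: Vector) -> float:
    """Plain dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def mean_pool(tokens: List[Vector]) -> List[float]:
    """Collapse a list of token vectors into one vector by averaging."""
    n = len(tokens)
    return [sum(column) / n for column in zip(*tokens)]


def single_vector_score(query_tokens: List[Vector], doc_tokens: List[Vector]) -> float:
    """One vector per query and per document (mean pooling is an assumption here)."""
    return dot(mean_pool(query_tokens), mean_pool(doc_tokens))


def maxsim_score(query_tokens: List[Vector], doc_tokens: List[Vector]) -> float:
    """Late interaction: sum over query tokens of the best document-token match."""
    return sum(max(dot(q, d) for d in doc_tokens) for q in query_tokens)


# Toy 2-dimensional "token embeddings" (made up for illustration).
query = [[1.0, 0.0], [0.0, 1.0]]
doc_a = [[1.0, 0.0], [0.0, 1.0]]   # contains an exact match for each query token
doc_b = [[0.5, 0.5], [0.5, 0.5]]   # only blurry, averaged-looking tokens

# After pooling, both documents look identical to the query.
print(single_vector_score(query, doc_a), single_vector_score(query, doc_b))
# Token-level scoring separates them.
print(maxsim_score(query, doc_a), maxsim_score(query, doc_b))
```

In this toy case the pooled scores tie while MaxSim ranks `doc_a` above `doc_b`, which is the kind of token-level signal a single vector averages away. The cost is that every document keeps one vector per token, which is where the storage and compute pressure comes from.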
Open questions:
- When does long context beat retrieval, and when does retrieval beat long context?
- Can late-interaction retrieval be made cheap enough for personal/local agents?
- How should retrieval expose uncertainty and provenance to downstream agents?
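As one possible starting point for the last question, the sketch below shows a result record that carries provenance alongside each passage and a crude ambiguity flag. Everything here is hypothetical: the field names, the `package_results` helper, and the choice of "gap between the top two scores" as an uncertainty proxy are assumptions for illustration, not an established interface.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class RetrievedPassage:
    text: str
    source_id: str                          # hypothetical: document or video identifier
    score: float                            # retriever similarity score
    start_seconds: Optional[float] = None   # optional time span, e.g. for video
    end_seconds: Optional[float] = None


@dataclass(frozen=True)
class RetrievalResult:
    passages: List[RetrievedPassage]  # sorted best-first
    ambiguous: bool                   # True when the top two scores are close


def package_results(passages: List[RetrievedPassage],
                    min_margin: float = 0.1) -> RetrievalResult:
    """Rank passages and flag the result as ambiguous if the top-two gap is small.

    The score gap is only a rough stand-in for uncertainty (an assumption).
    """
    ranked = sorted(passages, key=lambda p: p.score, reverse=True)
    ambiguous = len(ranked) >= 2 and (ranked[0].score - ranked[1].score) < min_margin
    return RetrievalResult(passages=ranked, ambiguous=ambiguous)


close_call = package_results([
    RetrievedPassage("clip about indexing", "video-1", 0.80, 12.0, 30.0),
    RetrievedPassage("note about indexing", "doc-7", 0.82),
])
clear_win = package_results([
    RetrievedPassage("relevant note", "doc-3", 0.9),
    RetrievedPassage("unrelated note", "doc-9", 0.5),
])
print(close_call.ambiguous, close_call.passages[0].source_id)
print(clear_win.ambiguous, clear_win.passages[0].source_id)
```

A downstream agent could cite `source_id` and the time span in its answer, and treat `ambiguous` as a cue to retrieve more or ask a follow-up rather than commit to the top hit.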