Resources
A curated collection of papers, tools, and datasets that I reference frequently. These are the charts and instruments for your ML voyage.
Essential Papers
Attention Is All You Need
Vaswani et al., 2017
The foundational transformer paper. Essential reading for understanding modern NLP.
BERT: Pre-training of Deep Bidirectional Transformers
Devlin et al., 2018
Introduced masked language modeling and bidirectional context. Changed NLP forever.
Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks
Reimers & Gurevych, 2019
How to create meaningful sentence embeddings efficiently. Foundation for semantic search.
Dense Passage Retrieval for Open-Domain Question Answering
Karpukhin et al., 2020
The DPR paper. Shows how to train bi-encoders for retrieval using in-batch negatives.
ColBERT: Efficient and Effective Passage Search via Contextualized Late Interaction
Khattab & Zaharia, 2020
Late interaction for efficient yet accurate retrieval. A great middle ground between bi-encoders and cross-encoders.
Retrieval-Augmented Generation for Knowledge-Intensive NLP Tasks
Lewis et al., 2020
The original RAG paper. Combines retrieval with generation.
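Two of the ideas in the papers above are compact enough to sketch in a few lines: the in-batch negatives objective from DPR and the late-interaction (MaxSim) score from ColBERT. The following is a minimal, dependency-free illustration rather than either paper's actual implementation. It assumes embeddings are plain lists of floats, uses a raw dot product as the similarity, and the function names are hypothetical.

```python
import math


def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def in_batch_negatives_loss(queries, passages):
    """Mean cross-entropy over a batch of (query, positive passage) pairs.

    Assumption: passages[i] is the positive for queries[i]; every other
    passage in the batch is treated as a negative for that query.
    """
    losses = []
    for i, q in enumerate(queries):
        scores = [dot(q, p) for p in passages]
        # Numerically stable log-sum-exp over all passages in the batch.
        m = max(scores)
        log_z = m + math.log(sum(math.exp(s - m) for s in scores))
        # Negative log-likelihood of the positive passage.
        losses.append(log_z - scores[i])
    return sum(losses) / len(losses)


def late_interaction_score(query_tokens, doc_tokens):
    """MaxSim-style score: for each query token embedding, take its best
    match among the document token embeddings, then sum over query tokens.

    Assumption: token embeddings are already normalized, so the dot
    product behaves like cosine similarity.
    """
    return sum(max(dot(q, d) for d in doc_tokens) for q in query_tokens)
```

In this sketch, a batch whose queries line up with their own passages yields a lower loss than the same batch with the passages shuffled, which is the signal a bi-encoder is trained on. The late-interaction score keeps one embedding per token instead of one per passage, which is where its extra accuracy and extra storage cost both come from.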
Tools & Libraries
sentence-transformers
Embeddings
Python library for state-of-the-art sentence embeddings. Your go-to for semantic search.
FAISS
Vector Search
Facebook's library for efficient similarity search. Essential for production vector search.
Hugging Face Transformers
Models
The standard library for working with transformer models. Excellent documentation.
LangChain
RAG
Framework for building LLM applications. Good for RAG pipelines and chains.
ONNX Runtime
Deployment
High-performance inference engine. Essential for production deployment.
Weights & Biases
MLOps
Experiment tracking and model management. Makes ML experiments reproducible.
Datasets
MS MARCO
8.8M passages
Large-scale passage retrieval dataset. Standard benchmark for search.
Natural Questions
307K examples
Real Google search queries with Wikipedia answers. Great for QA.
STS Benchmark
8.6K pairs
Sentence similarity benchmark. Standard for evaluating embeddings.
BEIR
18 datasets
Heterogeneous benchmark for zero-shot retrieval. Tests generalization.
HotpotQA
113K examples
Multi-hop question answering. Tests reasoning over multiple documents.