"Redesigned a naive RAG pipeline into a high-performance semantic search architecture, integrating a cross-encoder re-ranker and semantic caching to reduce TTFT (time to first token) and bring hallucinations to near zero in financial querying."
+51% via Domain-Specific Embedding fine-tuning
Baseline was: 62
Double relevance via BM25 + Dense Hybrid Search
Baseline was: 0.45
Near-zero hallucination via strict prompt grounding
Baseline was: 85
78% reduction via Semantic Caching & Edge Functions
Baseline was: 2100
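
For readers unfamiliar with semantic caching, here is a minimal sketch of the idea, not the project's actual implementation: a previously generated answer is reused when a new query's embedding is close enough to a cached query's embedding, so the model call is skipped. The names SemanticCache and toy_embed, the word-count "embedding", and the 0.85 threshold are all illustrative assumptions; a real system would use a proper embedding model and a vector index.

```python
import math
from typing import Callable, List, Optional, Sequence, Tuple


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


class SemanticCache:
    """Hypothetical in-memory cache keyed by query embeddings (linear scan, no eviction)."""

    def __init__(self, embed: Callable[[str], List[float]], threshold: float = 0.85):
        self.embed = embed          # assumed: any function mapping text to a vector
        self.threshold = threshold  # assumed cutoff; tune on real traffic
        self.entries: List[Tuple[List[float], str]] = []

    def get(self, query: str) -> Optional[str]:
        """Return the cached answer for the most similar past query, or None on a miss."""
        if not self.entries:
            return None
        q = self.embed(query)
        best_vec, best_answer = max(self.entries, key=lambda e: cosine(q, e[0]))
        return best_answer if cosine(q, best_vec) >= self.threshold else None

    def put(self, query: str, answer: str) -> None:
        """Store an answer under the embedding of the query that produced it."""
        self.entries.append((self.embed(query), answer))


# Toy stand-in for an embedding model: word counts over a tiny fixed vocabulary.
VOCAB = ["revenue", "q3", "ceo", "what", "is", "the", "who"]


def toy_embed(text: str) -> List[float]:
    words = text.lower().split()
    return [float(words.count(term)) for term in VOCAB]


cache = SemanticCache(toy_embed, threshold=0.85)
cache.put("what is the q3 revenue", "cached answer about Q3 revenue")
hit = cache.get("what is q3 revenue")   # paraphrase of the cached query
miss = cache.get("who is the ceo")      # unrelated query
```

In this sketch a hit returns immediately without invoking the model, which is where a time-to-first-token saving would come from. The threshold trades hit rate against the risk of serving an answer to a query that only looks similar, which matters for financial questions where a different quarter or ticker changes the answer.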