Industrial RAG Gate
- Context
- In tests of RAG answers on maintenance and safety manuals, plausible-looking answers still cited a nearby but wrong authority
- Problem
- Industrial manual QA needs authority checks, not generic answer similarity
- Bottleneck
- Aggregate scores hid an item-level safety citation regression
- Fix
- Built a domain fixture, holdout split, gate states, hybrid RRF baseline, citation diagnostics, and SME review packet
- Result
- v5_t31 hybrid: recall@5 0.978; citation hit 0.945; safety specificity 5/5 exact; review queue of 5 items, 2 of them P0
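
The bottleneck above (an aggregate score hiding an item-level safety citation regression) can be illustrated with a minimal sketch. This is not the project's gate: the state names `PASS`, `REVIEW`, and `BLOCK`, the `ItemResult` fields, and the `min_hit_rate` threshold are all hypothetical, chosen only to show why a per-item comparison catches what a mean does not.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ItemResult:
    # Hypothetical per-item record; field names are assumptions.
    item_id: str
    is_safety: bool
    citation_hit: bool  # True when the answer cited the expected source


def gate(baseline, candidate, min_hit_rate=0.90):
    """Return (state, regressed_ids) for a candidate run against a baseline."""
    if not candidate:
        raise ValueError("candidate run has no items")
    base_by_id = {r.item_id: r for r in baseline}
    # An item regressed if the baseline cited correctly and the candidate did not.
    regressed = [
        r.item_id
        for r in candidate
        if r.item_id in base_by_id
        and base_by_id[r.item_id].citation_hit
        and not r.citation_hit
    ]
    safety_ids = {r.item_id for r in candidate if r.is_safety}
    hit_rate = sum(r.citation_hit for r in candidate) / len(candidate)
    # Any safety regression blocks, regardless of the aggregate score.
    if any(item_id in safety_ids for item_id in regressed):
        return "BLOCK", regressed
    if regressed or hit_rate < min_hit_rate:
        return "REVIEW", regressed
    return "PASS", regressed


# Both runs score 9/10 on citation hits, but the candidate lost safety item q0.
baseline = [ItemResult(f"q{i}", i < 2, i != 9) for i in range(10)]
candidate = [ItemResult(f"q{i}", i < 2, i != 0) for i in range(10)]
state, regressed = gate(baseline, candidate)
print(state, regressed)
```

In this toy data the aggregate citation hit rate is identical across the two runs, so a threshold on the mean alone would pass the candidate; the per-item diff is what surfaces the safety regression.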
91 fixture items: manual QA cases with expected source behavior
68 tests: regression checks run in CI
5 review queue items: support gaps prepared for SME review
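
For readers unfamiliar with the hybrid RRF baseline named under Fix, here is a minimal sketch of reciprocal rank fusion over two ranked lists. The function name `rrf_fuse`, the document IDs, and the constant `k=60` (the value commonly used in the RRF literature) are assumptions for illustration; the source does not state which retrievers or parameters the v5_t31 run used.

```python
from collections import defaultdict


def rrf_fuse(rankings, k=60, top_n=5):
    """Fuse ranked lists of document IDs with reciprocal rank fusion."""
    scores = defaultdict(float)
    for ranking in rankings:
        # Each list contributes 1 / (k + rank) for every document it ranks.
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Sort by descending fused score; break ties by ID for determinism.
    fused = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [doc_id for doc_id, _ in fused[:top_n]]


# Hypothetical section IDs from a lexical and a dense retriever.
lexical_ranking = ["lockout_4.2", "torque_7.1", "ppe_2.3"]
dense_ranking = ["lockout_4.2", "valve_9.4", "torque_7.1"]
fused = rrf_fuse([lexical_ranking, dense_ranking])
print(fused)
```

RRF uses only rank positions, so the lexical and dense scores never need to be put on a common scale, which is one reason it is a common choice for a hybrid baseline.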