Why this exists
Runtime choices are often argued with vague speed claims. Method first, then numbers: p50/p95 latency, memory footprint, quality deltas
Next checks
- reproducible hardware and model configuration
- tokens/sec and memory tables
- quantization comparison with quality notes
- variance report across repeated runs
Public line
No performance claim until the measurement script, environment, and raw result table are public