Mastering RAG Evaluation: Metrics, Testing & Best Practices
A deep dive into why evaluation is the foundation of any production RAG system. Lessons on building robust evaluation frameworks.
September 23, 2024
Technical Depth & References
Thoughts on building production AI systems. Lessons learned, techniques that work, and mistakes to avoid.
Understanding the metrics that matter: Faithfulness, Answer Relevance, and Context Precision. Using the RAGAS framework.
January 5, 2025
Concrete techniques for cutting LLM costs tenfold using prompt compression, model routing, and semantic caching.
November 15, 2024
Moving beyond "glue code" to governed architectures. Tackling error compounding and non-deterministic behavior.
August 22, 2024
How to build AI features that users actually trust. UX patterns for communicating confidence and handling graceful degradation.
July 10, 2024
Practical guardrails and verification layers to ensure autonomous agents stay grounded in your business data.
December 18, 2024
More depth added weekly. Subscribe to the newsletter for production-first AI engineering.