2026.03.16 M·07
LLM Evaluation: How to Measure the Quality of AI Features
Shipping an AI feature and calling it good because it 'seems to work' is not a quality strategy. This post covers types of LLM evals, key metrics, building evaluation datasets, eval tools like promptfoo and Braintrust, and integrating evals into CI.
LLM Evaluation, AI Quality, Evals
2026.03.11 M·05
Building a RAG Pipeline: Document Search with Vector DB + LLM
LLMs don't know what they weren't trained on. Here's how RAG fixes that: a walk through the complete pipeline, from document ingestion through chunking, embedding, vector storage, retrieval, and generation, with real Python and TypeScript examples.
RAG, Vector Database, LLM