AI Engineering, one notion at a time.

13 domains, 12 notions and counting. Each page is a single concept with a TL;DR, the problem it solves, how it works, and a 2026 relevance check.

00 · Foundations (6 notions)

The building blocks: tokenization, embeddings, attention, transformers.

01 · LLMs (1 notion)

Models themselves: loading, serving, quantization, inference-time optimization.

02 · Prompt Engineering (0 notions)

Techniques to design effective prompts: structured output, chain of thought, XML tags.

03 · RAG (2 notions)

From naive RAG to production: embeddings, chunking, vector stores, hybrid search, reranking.

04 · Context Engineering (1 notion)

Managing the context window: compression, memory, prompt caching, budgeting.

05 · AI Agents (0 notions)

Autonomous agents: ReAct, planning, tool use, function calling, multi-agent systems.

06 · MCP (0 notions)

Model Context Protocol: a standardized way to plug tools and resources into LLMs.

07 · LLM Optimization (1 notion)

Inference servers, KV cache, batching, PagedAttention. Serving LLMs fast and cheap.

08 · Evaluations (0 notions)

Measuring LLM/agent/RAG quality: golden sets, LLM-as-judge, RAGAS, regression tests.

09 · Observability (0 notions)

Tracing, logging, metrics for LLM apps: Langfuse, LangSmith, Arize, Helicone.

10 · Fine-tuning (1 notion)

LoRA, QLoRA, RLHF, DPO, synthetic data. Specializing a model for a use case.

11 · Infrastructure (0 notions)

Kubernetes for AI, GPU autoscaling, inference gateway, multi-cluster ops.

12 · Safety & Guardrails (0 notions)

Red teaming, jailbreak defense, content filtering, PII redaction, alignment.