02·4 notions

Prompt Engineering

Techniques to design effective prompts: structured output, chain of thought, XML tags.

Reasoning Prompting Techniques (CoT, Self-Consistency, ToT, ARQ)

Four prompt-level techniques push an LLM to reason explicitly instead of jumping straight to an answer. Chain of Thought (CoT) asks the model to write out intermediate steps before answering. Self-Consistency samples several CoT runs and takes the majority answer. Tree of Thoughts (ToT) explores a tree of partial reasoning paths and continues from the most promising branch. Attentive Reasoning Queries (ARQs) replace free-form chains with a JSON schema of explicit sub-questions.
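As a rough illustration of the Self-Consistency vote, the sketch below calls a sampler several times and keeps the most common final answer. The `sample_answer` callable is a hypothetical stand-in for a real LLM call at non-zero temperature, and the stub replies are made up for the example.

```python
from collections import Counter
from typing import Callable, Iterable, List


def self_consistency(sample_answer: Callable[[str], str], prompt: str, n: int = 5) -> str:
    """Sample n answers for the same prompt and return the majority answer."""
    answers: List[str] = [sample_answer(prompt).strip() for _ in range(n)]
    winner, _count = Counter(answers).most_common(1)[0]
    return winner


def make_stub(outputs: Iterable[str]) -> Callable[[str], str]:
    """Hypothetical sampler that replays canned final answers (no real model involved)."""
    it = iter(outputs)
    return lambda prompt: next(it)


stub = make_stub(["42", "41", "42", "42 ", "40"])
result = self_consistency(stub, "What is 6 * 7? Think step by step.", n=5)
print(result)
```

In practice each sample would be a full CoT completion, and only the extracted final answer would be passed to the vote.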

Verbalized Sampling

Aligned LLMs (post-RLHF) tend to collapse to a narrow set of safe, predictable answers. Verbalized Sampling (VS) is a training-free prompting technique that asks the model to "generate 5 responses with their corresponding probabilities". The request taps the broader distribution learned during pretraining and reportedly recovers 1.6x-2.1x more diversity than direct prompting, with no loss in quality.
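A minimal sketch of how a Verbalized Sampling request might be wired up. The JSON reply format, the helper names, and the `fake_reply` string are all assumptions for illustration; only the "generate 5 responses with their corresponding probabilities" wording comes from the description above.

```python
import json
import random
from typing import List, Tuple


def build_vs_prompt(task: str, k: int = 5) -> str:
    """Append the verbalized-sampling instruction to a task (output format is assumed)."""
    return (
        f"{task}\n\n"
        f"Generate {k} responses with their corresponding probabilities. "
        'Return a JSON list of objects with keys "text" and "probability".'
    )


def parse_candidates(reply: str) -> List[Tuple[str, float]]:
    """Parse the assumed JSON reply into (text, probability) pairs."""
    return [(str(item["text"]), float(item["probability"])) for item in json.loads(reply)]


def pick(candidates: List[Tuple[str, float]], rng: random.Random) -> str:
    """Draw one response, weighted by the probabilities the model verbalized."""
    texts = [text for text, _ in candidates]
    weights = [prob for _, prob in candidates]
    return rng.choices(texts, weights=weights, k=1)[0]


prompt = build_vs_prompt("Write a one-line joke about coffee.")
# Hypothetical model reply, hard-coded so the sketch runs offline.
fake_reply = json.dumps([
    {"text": "joke A", "probability": 0.4},
    {"text": "joke B", "probability": 0.3},
    {"text": "joke C", "probability": 0.3},
])
candidates = parse_candidates(fake_reply)
choice = pick(candidates, random.Random(0))
```

Sampling from the verbalized probabilities is one possible way to use the list; returning all candidates to the caller is another.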

Structured Outputs

Structured outputs force a model to produce machine-readable content (JSON, XML, SQL, regex-constrained strings) that downstream code can parse. There are four layers of control: prompting, post-processing, constrained sampling, and finetuning. In 2026, native API support (OpenAI structured outputs, Anthropic tool use, grammars in open-weight servers) has mostly solved the problem of getting reliably parseable output.
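The post-processing layer can be sketched with the standard library alone: pull the JSON object out of a chatty reply, then check that the expected fields are present with the expected types. The `REQUIRED` schema and the sample reply are hypothetical; a real pipeline would more likely rely on native schema enforcement or a validation library.

```python
import json

# Hypothetical schema for the example: field name -> expected Python type.
REQUIRED = {"name": str, "age": int}


def parse_structured(reply: str, required: dict = REQUIRED) -> dict:
    """Extract the outermost JSON object from a reply and validate its fields."""
    start, end = reply.find("{"), reply.rfind("}")
    if start == -1 or end <= start:
        raise ValueError("no JSON object found in reply")
    data = json.loads(reply[start:end + 1])  # raises a ValueError subclass on bad JSON
    for key, expected_type in required.items():
        if not isinstance(data.get(key), expected_type):
            raise ValueError(f"field {key!r} is missing or not {expected_type.__name__}")
    return data


reply = 'Sure, here you go: {"name": "Ada", "age": 36} Hope that helps.'
record = parse_structured(reply)

try:
    parse_structured('{"name": "Ada"}')
    failed = False
except ValueError:
    failed = True  # a caller could re-prompt the model at this point
```

Catching `ValueError` gives the caller a single place to trigger a retry or fall back to a stricter control layer.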

JSON Prompting

JSON prompting means writing prompts that explicitly request JSON output with a defined shape. It forces the model to "think in fields", reduces ambiguity, and produces outputs your code can parse. In 2026 it is the entry-level technique. The next step up is [structured outputs](/kb/02-prompt-engineering/structured-outputs) with schema enforcement.
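A small sketch of what a JSON prompt could look like in code: the desired shape is written into the prompt, and the reply is parsed and checked for the requested keys. The task, field names, and hard-coded `reply` are invented for illustration; nothing here enforces the schema on the model side.

```python
import json


def build_json_prompt(task: str, fields: dict) -> str:
    """Describe the desired JSON shape inside the prompt itself."""
    shape = json.dumps({name: f"<{desc}>" for name, desc in fields.items()}, indent=2)
    return f"{task}\n\nRespond with only a JSON object of this shape, no extra text:\n{shape}"


# Hypothetical fields for a sentiment-classification task.
fields = {
    "sentiment": "positive | negative | neutral",
    "confidence": "number between 0 and 1",
}
prompt = build_json_prompt("Classify the review: 'Great battery life.'", fields)

# Hypothetical model reply, hard-coded so the sketch runs offline.
reply = '{"sentiment": "positive", "confidence": 0.93}'
parsed = json.loads(reply)
missing = set(fields) - set(parsed)  # empty when every requested field came back
```

Because the shape is only requested, not enforced, the `missing` check is what tells the calling code whether the reply is usable.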