Prompt Engineering
02·Prompt Engineering·updated 2026-04-19

JSON Prompting

JSON prompting means writing prompts that explicitly request JSON output with a defined shape. It forces the model to "think in fields", reduces ambiguity, and produces outputs your code can parse. In 2026 it is the entry-level technique. The next step up is [structured outputs](/kb/02-prompt-engineering/structured-outputs) with schema enforcement.


The historical problem

Natural language is powerful yet vague. "Summarize this email" or "give me key takeaways" leave the model room to:

  • Add commentary you did not ask for
  • Skip details you needed
  • Change formatting between calls
  • Hallucinate structure

For any automated pipeline (extraction, reporting, analysis, agent tool use), you need the output to stay consistent every single call. Natural language prompts cannot guarantee that.

JSON prompting emerged around 2022-2023 as the simplest trick for getting consistent outputs. It works because LLMs were trained on massive amounts of structured data (API responses, JSON config files, scraped web pages), so JSON is close to their "native language".

How it works

1. Describe the shape in the prompt

Extract the following fields from the customer email.

Return ONLY a JSON object of the form:
{
  "customer_name": string,
  "issue_category": "billing" | "shipping" | "technical" | "other",
  "urgency": "low" | "medium" | "high",
  "requires_refund": boolean,
  "summary": string (max 200 chars)
}

Email:
<email content>

The prompt tells the model: these are the fields, this is the type of each, these are the allowed values.
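In a pipeline, this prompt is usually assembled from a template rather than written by hand. A minimal sketch (the helper name is illustrative, not from the original):

```python
# Illustrative helper: embeds the field schema in the prompt so the
# model "thinks in fields" instead of free-forming the answer.
SCHEMA_BLOCK = """\
{
  "customer_name": string,
  "issue_category": "billing" | "shipping" | "technical" | "other",
  "urgency": "low" | "medium" | "high",
  "requires_refund": boolean,
  "summary": string (max 200 chars)
}"""

def build_extraction_prompt(email: str) -> str:
    """Assemble the extraction prompt around a fixed schema block."""
    return (
        "Extract the following fields from the customer email.\n\n"
        "Return ONLY a JSON object of the form:\n"
        f"{SCHEMA_BLOCK}\n\n"
        f"Email:\n{email}"
    )
```

Keeping the schema in one constant means every call in the pipeline asks for exactly the same shape.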

2. Provide a schema or an example

Two flavors:

  • Example-based: show a filled-in JSON. "Here is an example output: { ... }"
  • Schema-based: write TypeScript-like or JSON Schema syntax. "This field is an enum of these values."

Example-based is more natural but ambiguous on edge cases. Schema-based is precise but more verbose.
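The two flavors can be sketched side by side (both prompt fragments are illustrative):

```python
import json

# Example-based: show one filled-in output and let the model imitate it.
example_based = (
    "Here is an example output:\n"
    + json.dumps(
        {"customer_name": "Ada", "urgency": "high", "requires_refund": True},
        indent=2,
    )
)

# Schema-based: describe types and allowed values explicitly.
schema_based = (
    "Return JSON matching this schema:\n"
    '{ "customer_name": string, "urgency": "low" | "medium" | "high", '
    '"requires_refund": boolean }'
)
```

A common compromise is to send both: the schema for precision, plus one example for edge cases like empty fields.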

3. Ask for JSON only

Always add a stop rule: "Return ONLY the JSON object. No commentary. No markdown code block." Otherwise the model might wrap it in triple backticks or prefix with "Here is the JSON:".

4. Parse with fallback

Even with a perfect prompt, some models leak text around the JSON. Always extract JSON with a regex first, then parse:

import json
import re

def extract_json(text: str) -> dict:
    """Pull the first {...} span out of a model response and parse it."""
    # Greedy match from the first "{" to the last "}"; DOTALL lets the
    # match span newlines, so markdown fences around the object are skipped.
    match = re.search(r"\{.*\}", text, re.DOTALL)
    if not match:
        raise ValueError("No JSON in response")
    return json.loads(match.group())

Why JSON works so well

1. Structure means certainty

Fields and values eliminate gray areas. "Write a summary" is vague. "Fill summary (max 200 chars)" is not.

2. You control the output

Your prompt no longer asks "what do you think?". It asks "fill this template." The model knows exactly what to produce.

3. Reusable templates

Once you have a JSON template that works, you can turn it into a shareable component. Teams plug results directly into APIs, databases, and downstream models without manual formatting.

4. LLMs are fluent in JSON

Pretraining includes millions of JSON API responses, config files, and structured data. You are speaking a language the model already knows.

Relevance today (2026)

JSON prompting is the minimum viable technique

In 2026, every API-backed LLM feature should use at least JSON prompting. Plain "summarize this" prompts are a code smell.

Structured outputs have replaced it in serious production systems

OpenAI's response_format={"type": "json_schema"} and Anthropic's tool-use schemas enforce the JSON at the sampling level. No more parse errors, no more missing fields. See structured outputs for the deep dive.

By 2026, if you are still using plain JSON prompting in production, you are leaving reliability on the table.

When JSON prompting still wins

  • Models without structured output APIs (older local LLMs, some open-source endpoints)
  • Quick prototyping
  • Non-critical enrichment where a retry on parse error is acceptable
  • When you want flexibility inside the fields (e.g., long natural text answers)

Alternatives are strong now

Daily Dose DS rightly mentions:

  • XML for Claude - Anthropic models were trained with a lot of XML and do particularly well with it. For long structured answers, XML is often better than JSON on Claude.
  • Markdown - cleaner for outputs meant to be read by humans (docs, reports).

The takeaway: structure matters more than syntax. Pick the format that matches your downstream use.

Critical questions

  • Why does JSON output sometimes include comments or trailing commas? (Pretraining includes "loose" JSON from tutorials. Strict JSON.parse rejects them. Pre-sanitize.)
  • Why do long JSON outputs drift? (Decoding compounds errors. Long strings inside a field can push the model off-schema. Limit field lengths or use streaming parsers.)
  • When is XML better than JSON for Claude? (Long structured content, especially nested text. Anthropic docs explicitly recommend XML for prompt engineering.)
  • Should the schema be in the system prompt or the user prompt? (System prompt when the schema is stable, user prompt when it varies per call. Anthropic caches the system prompt, so put heavy schemas there.)
  • Why does adding "Return ONLY the JSON" help? (Reduces the probability of preambles. Still not guaranteed. Use structured outputs for guarantees.)
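The "pre-sanitize" answer above can be sketched as a naive cleanup pass (a rough illustration; a regex pass like this can corrupt string values that happen to contain these patterns, so use a tolerant parser for anything serious):

```python
import json
import re

def parse_loose_json(text: str) -> dict:
    """Strip markdown fences and trailing commas, then parse strictly."""
    # Drop ```json ... ``` wrappers the model may add.
    text = re.sub(r"```(?:json)?", "", text)
    # Remove trailing commas before a closing brace or bracket,
    # which strict json.loads rejects.
    text = re.sub(r",\s*([}\]])", r"\1", text)
    return json.loads(text)
```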

Production pitfalls

  • Missing fields. The model drops optional keys it has no data for. Make schemas explicit: "use null if unknown" is better than "skip the field".
  • Extra fields. The model adds fields it thinks would be useful. Reject unknown fields in your parser, or use strict structured outputs.
  • Type drift. A field you expected as integer comes back as string. Always cast after parsing, or use a schema validator like pydantic.
  • Markdown wrapping. Models wrap JSON in ```json blocks. Strip before parsing.
  • Unicode in strings. Quotes, backslashes, emojis can break parsing. Use a robust JSON parser, not regex.
  • Long arrays truncated. The model hits max_tokens mid-array. Invalid JSON. Increase max_tokens or chunk the task.
  • Prompt bloat. Large schemas eat tokens. Keep schemas compact, cache the system prompt.
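The type-drift pitfall above comes down to casting after parsing. A stdlib sketch (the field names are illustrative; a validator like pydantic does this declaratively):

```python
# Expected type per field; illustrative, not a real schema.
FIELD_TYPES = {"rating": int, "requires_refund": bool, "summary": str}

def coerce_fields(raw: dict) -> dict:
    """Cast each known field to its expected type; drop unknown fields."""
    coerced = {}
    for field, typ in FIELD_TYPES.items():
        if field not in raw or raw[field] is None:
            coerced[field] = None  # explicit null beats a missing key
            continue
        value = raw[field]
        if typ is bool and isinstance(value, str):
            # The model sometimes returns "true"/"false" as strings.
            value = value.strip().lower() in ("true", "yes", "1")
        coerced[field] = typ(value)
    return coerced
```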

Alternatives / Comparisons

| Approach | Enforcement | Use case |
| --- | --- | --- |
| Free text prompt | None | Exploratory chat, brainstorming |
| JSON prompting | Prompt only, best-effort | Prototyping, simple extraction |
| JSON mode (OpenAI legacy) | "Output is valid JSON" | Mid-tier reliability |
| Structured outputs (JSON Schema) | Grammar-constrained decoding | Production (OpenAI, Anthropic, Gemini, vLLM) |
| Function/tool calling | Grammar-constrained, tied to a function | Agents, MCP |
| XML prompting (Claude) | Prompt only | Long structured content |
| Markdown prompting | Prompt only | Human-readable reports |
| Regex-constrained decoding (Outlines, LMQL) | Strict | Custom grammars, JSON alternatives |

Mental parallels (non-AI)

  • Forms at the DMV: giving the user a blank page produces chaos. Giving them a structured form with fields produces usable data.
  • SQL vs natural language querying: "show sales" is vague. SELECT sum(amount) FROM sales WHERE year = 2026 is not. Schema guides the answer.
  • Mad Libs: the fill-in-the-blank template forces a specific pattern of noun, verb, adjective. JSON prompting is structured Mad Libs for LLMs.
  • HTTP API contracts: REST endpoints publish their request and response schema. JSON prompting is the prompt-engineering version.

Mini-lab

labs/json-prompting/ (to create):

  1. Pick a real task: extract structured fields from 100 product reviews (sentiment, topics, rating, pros, cons).
  2. Implement 3 versions:
    • Free text prompt with a human parsing step
    • JSON prompting (schema inside the prompt, best-effort)
    • Structured outputs (OpenAI json_schema or Anthropic tool use)
  3. Run all three on 100 samples. Measure:
    • Parse success rate
    • Field coverage (how many fields populated)
    • Type correctness (types match schema)
    • Cost and latency
  4. Compare results. Expect JSON prompting to be roughly 10x-50x more reliable than free text, and structured outputs to add another 5x-10x improvement on top.

Stack: uv, openai or anthropic, pydantic.
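The parse-success metric from step 3 can be sketched as follows, assuming model outputs have already been collected as strings (the function name is illustrative):

```python
import json
import re

def parse_success_rate(outputs: list[str]) -> float:
    """Fraction of raw model outputs that yield a parseable JSON object."""
    ok = 0
    for text in outputs:
        match = re.search(r"\{.*\}", text, re.DOTALL)
        if not match:
            continue  # no JSON-looking span at all
        try:
            json.loads(match.group())
            ok += 1
        except json.JSONDecodeError:
            continue  # span found but invalid JSON
    return ok / len(outputs) if outputs else 0.0
```

Field coverage and type correctness follow the same pattern: loop over parsed outputs and count per-field hits against the schema.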

Further reading

Canonical

Related in this KB

Tools

prompting · json · structured-output · schemas · automation