6 canonical Claude patterns. Hand-drawn diagrams — meant to be re-drawn on a whiteboard, not pasted into a slide. See feature-inventory.md for canonical feature status.
Legend:
User / client
Claude API
Tools / MCP
Storage / data
Output / sink
1. RAG copilot
Sonnet 4.6 · Files API · prompt caching · citations · (optional MCP for live data)
Cost band
Low — Sonnet + caching dominates.
Latency
1–3s typical end-to-end.
Governance
Citations mandatory. Log retrieved chunks for audit.
Time to disruption
0–3 mo · ship now.
User question hits the app, which retrieves relevant chunks from a search index over content uploaded to Files API. The static system prompt + tool definitions stay cached on Claude's side; only the retrieved context + question are fresh per request. Sonnet 4.6 returns a grounded answer with citation spans pointing back to the source documents.
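A minimal sketch of the per-request shape, assuming the Python SDK; the model ID and chunk format are placeholders, and the retrieval step over your index is omitted:

```python
import anthropic

client = anthropic.Anthropic()

SYSTEM = [{
    "type": "text",
    "text": "Answer only from the provided documents and always cite them.",
    # Static prefix: cached on Claude's side, so only the retrieved
    # context + question are fresh input per request.
    "cache_control": {"type": "ephemeral"},
}]

def answer(question: str, chunks: list[dict]):
    """chunks: retrieved passages, e.g. [{"title": ..., "text": ...}]."""
    content = [
        {
            "type": "document",
            "source": {"type": "text", "media_type": "text/plain",
                       "data": c["text"]},
            "title": c["title"],
            "citations": {"enabled": True},  # request citation spans back
        }
        for c in chunks
    ]
    content.append({"type": "text", "text": question})
    return client.messages.create(
        model="claude-sonnet-4-6",  # placeholder for the Sonnet 4.6 tier
        max_tokens=1024,
        system=SYSTEM,
        messages=[{"role": "user", "content": content}],
    )
```

Text blocks in the response carry citation spans pointing at the document blocks; those are what you log for audit.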
Use when
Domain Q&A over a stable corpus. Compliance requires source attribution. Latency budget allows 1–3s.
Don't use when
Answer requires multi-step reasoning across systems (use agentic). Corpus changes per request (rebuild as live tool use).
Operationalize with: eval-starter-pack.md — grounding + format-compliance evals are how you catch citation drift and untethered answers ·
mcp-starter-pack.md — internal-docs server is the live-data MCP for this pattern ·
governance-overlay.md §9 — what to log per request when citations are mandatory.
2. Agentic workflow
Sonnet 4.6 (Opus 4.7 escalation) · Agent SDK · MCP · Skills · memory tool · plugins for distribution
Cost band
Mid–High. Loop fan-out matters; cap iterations.
Latency
10s–10m. Pattern is async-tolerant.
Governance
Sandbox tool runtime. Audit every tool call.
Time to disruption
3–12 mo · eval discipline first.
The Agent SDK runs the plan-act-observe loop. Sonnet 4.6 is the default planner; escalate to Opus 4.7 when the agent flags a hard step. Tools come through MCP servers (one connector per system, reused across agents). Skills hold the domain playbook so the agent has expert procedures, not just primitives. Memory tool persists state across runs for long-running agents. Distribute the whole bundle as a plugin so other teams install one thing.
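A minimal sketch of the loop the Agent SDK automates, written against the raw Messages API so the iteration cap is explicit; `execute` and the tool list are yours, and the model ID is a placeholder:

```python
import anthropic

client = anthropic.Anthropic()
MAX_TURNS = 10  # hard cap on loop fan-out (see cost band above)

def run_agent(task: str, tools: list[dict], execute) -> str:
    messages = [{"role": "user", "content": task}]
    for _ in range(MAX_TURNS):
        resp = client.messages.create(
            model="claude-sonnet-4-6",  # placeholder; escalate hard steps to Opus
            max_tokens=4096,
            tools=tools,
            messages=messages,
        )
        if resp.stop_reason != "tool_use":  # plan converged: return the answer
            return "".join(b.text for b in resp.content if b.type == "text")
        messages.append({"role": "assistant", "content": resp.content})
        results = [
            {
                "type": "tool_result",
                "tool_use_id": block.id,
                # execute() is your sandboxed runner; audit-log every call
                "content": execute(block.name, block.input),
            }
            for block in resp.content
            if block.type == "tool_use"
        ]
        messages.append({"role": "user", "content": results})
    raise RuntimeError("iteration cap hit; task did not converge")
```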
Use when
Multi-step task that decomposes naturally into tool calls. Domain has stable procedures worth encoding as Skills. Async-tolerant.
Don't use when
Task is single-shot. Latency budget is < 3s. Tools don't exist or have no API surface (use computer use instead).
Operationalize with: mcp-starter-pack.md — 7 read-only server templates populate the MCP layer in the diagram ·
eval-starter-pack.md — tool-call-accuracy + grounding + cost-per-task evals are non-negotiable for agentic loops ·
claude-code-starter-skills.md — Skills template structure (when-to-use / failure-mode / owner) is portable beyond Claude Code ·
governance-overlay.md §14 — sandbox tool runtime, treat tool returns as untrusted.
3. Batch enrichment
Haiku 4.5 · Batch API · prompt caching · Files API for inputs
Cost band
Lowest. Haiku + batch + cache compounds.
Latency
Up to 24h SLA per job.
Governance
Sample audit on output. Track per-job cost.
Time to disruption
0–3 mo · ship now.
Use for document classification, extraction, summarization, eval sets, and scheduled enrichment. Build N requests with the same cached schema/instruction prefix; submit as one Batch job; poll for completion. Haiku 4.5 handles the bulk of extraction work cheaper than any other tier. Output goes to a sink and audit log.
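A minimal sketch of the job shape, assuming the Python SDK; the model ID, schema prompt, and record format are placeholders:

```python
import anthropic

client = anthropic.Anthropic()

SHARED_PREFIX = [{
    "type": "text",
    "text": "Extract the following fields per the schema: ...",  # same every record
    "cache_control": {"type": "ephemeral"},  # cache-eligible across the job
}]

def submit(records: list[dict]):
    requests = [
        {
            "custom_id": rec["id"],  # ties each result back to its record
            "params": {
                "model": "claude-haiku-4-5",  # placeholder for the Haiku 4.5 tier
                "max_tokens": 512,
                "system": SHARED_PREFIX,
                "messages": [{"role": "user", "content": rec["text"]}],
            },
        }
        for rec in records
    ]
    return client.messages.batches.create(requests=requests)

# Poll client.messages.batches.retrieve(batch.id) until
# processing_status == "ended", then stream per-record output:
#   for result in client.messages.batches.results(batch.id): ...
```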
Use when
Latency-tolerant. High volume. Schema/instructions reused across records. Eval suites overnight.
Don't use when
Real-time response required. Per-record customization is high. Extraction needs deep reasoning (use Sonnet, not Haiku).
Operationalize with: cost-calculator.html — model the Haiku + batch + cache compounding before you commit to a job size ·
eval-starter-pack.md — format-compliance + regression evals run cheaply on the Batch API itself ·
governance-overlay.md §9 + §11 — per-job audit trail and retention policy for batch outputs.
4. Domain expert assistant
Sonnet 4.6 + thinking · Skills · MCP · Files · citations · plugins for org distribution
The pattern that earns the highest ROI for regulated verticals (legal, finance, clinical, compliance, claims). Skills carry the domain procedures + style + decision rules. MCP exposes the systems of record. Files holds the policy/regulatory corpus. Sonnet 4.6 with extended thinking returns cited recommendations. Distribute the whole bundle as one plugin so a new region or business unit installs it in minutes.
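A minimal sketch of the core call: extended thinking plus citation-enabled policy documents, the combination the pattern rests on. The model ID, thinking budget, and corpus loader are placeholder assumptions:

```python
import anthropic

client = anthropic.Anthropic()

def recommend(case_summary: str, policy_docs: list[dict]):
    content = [
        {
            "type": "document",
            "source": {"type": "text", "media_type": "text/plain",
                       "data": d["text"]},
            "title": d["title"],
            "citations": {"enabled": True},  # audit trail back to the corpus
        }
        for d in policy_docs
    ] + [{"type": "text", "text": case_summary}]
    return client.messages.create(
        model="claude-sonnet-4-6",  # placeholder for the Sonnet 4.6 tier
        max_tokens=8192,            # must exceed the thinking budget
        thinking={"type": "enabled", "budget_tokens": 4096},
        messages=[{"role": "user", "content": content}],
    )
```

Skills and MCP wiring sit around this call; the Skills carry the decision rules that the thinking budget is spent applying.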
Use when
Vertical with proprietary procedures and high-value decisions. Citation/audit is mandatory. Multiple teams will adopt.
Don't use when
Generic Q&A — the customization stack is overkill. Decisions are low-stakes — simpler RAG copilot suffices.
Operationalize with: claude-code-starter-skills.md — Skills template shape (when-to-use / failure-mode / owner) ports directly to domain Skills ·
mcp-starter-pack.md — read-only servers (issue tracker, internal docs, observability, API catalog) populate the systems-of-record layer ·
eval-starter-pack.md — grounding + refusal-calibration evals are mandatory for high-stakes verticals ·
governance-overlay.md §7 + §9 — EU AI Act high-risk deployer obligations + audit log requirements.
5. Code automation
Claude Code · Opus 4.7 + Sonnet 4.6 · plugins (commands + skills + hooks + MCP) · sub-agents · computer use 2.0 (optional)
Cost band
Per-engineer subscription or metered API. Cache + Sonnet default keeps it predictable.
Latency
Interactive in CLI. Headless in CI matches build time.
Governance
Hooks enforce policy at the tool boundary. Keep settings.json under source control.
Time to disruption
0–3 mo · ship now.
Engineering teams use Claude Code (CLI + IDE) day-to-day. Sonnet 4.6 is the default; Opus 4.7 handles hard refactors. Sub-agents (Task tool) parallelize investigation work. The team plugin bundles slash commands, Skills (repo conventions, language patterns), hooks (PreToolUse policy enforcement), and MCP servers (issue tracker, internal docs) so a new engineer's setup is a one-line install. Headless mode runs the same loop in CI. Detail in claude-code-adoption-guide.md.
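A minimal sketch of the policy-at-the-tool-boundary piece: a PreToolUse hook script, registered in settings.json, assuming Claude Code's documented stdin-JSON and exit-code contract. The deny-list is illustrative:

```python
#!/usr/bin/env python3
# PreToolUse hook: blocks shell commands that touch production.
import json
import sys

event = json.load(sys.stdin)  # hook payload includes tool_name, tool_input
cmd = event.get("tool_input", {}).get("command", "")

DENY = ("kubectl --context prod", "terraform apply", "DROP TABLE")
if event.get("tool_name") == "Bash" and any(s in cmd for s in DENY):
    print(f"Blocked by policy hook: {cmd!r}", file=sys.stderr)
    sys.exit(2)  # exit 2 = block the tool call; stderr is fed back to Claude
sys.exit(0)      # allow
```

Because the hook runs outside the model, the policy holds in interactive CLI use and headless CI alike.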
6. In-app copilot
Sonnet 4.6 (Haiku 4.5 for triage) · cached app context · MCP to host app · memory tool for personalization
Cost band
Mid — Haiku triage compounds with ~90%-discounted cached input on the system prompt + app schema. ~$0.001–$0.01 per interaction typical.
Latency
Sub-2s for triaged answers (Haiku). 2–6s when Sonnet handles tool use. Streaming UI hides the rest.
Governance
Memory tool retention + per-user/per-tenant isolation. App-context redaction before send. MCP server scoped to current user.
Time to disruption
3–12 mo · memory tool beta + app integration cost.
The copilot lives inside an existing app — CRM record sidebar, ticketing detail page, IDE panel — not as a standalone chat. Haiku 4.5 triages: simple lookups answer immediately; complex reasoning routes to Sonnet 4.6 with tool use. The system prompt + tool definitions + app schema are cached (90% off cached input on hit). An MCP server scoped to the host app reads the current record, searches related records, and drafts (but does not commit) actions. Memory tool persists per-user preferences across sessions. The response renders inline — never as a popup the user has to context-switch to.
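A minimal sketch of the triage split, assuming the Python SDK; the model IDs, routing prompt, and ESCALATE convention are placeholders, and the tool list comes from your host-app MCP server:

```python
import anthropic

client = anthropic.Anthropic()

def handle(question: str, app_context: str, tools: list[dict]):
    triage = client.messages.create(
        model="claude-haiku-4-5",  # placeholder for the Haiku 4.5 tier
        max_tokens=512,
        system="Answer directly if the question is a simple lookup over the "
               "context below; otherwise reply with exactly ESCALATE.\n\n"
               + app_context,  # redact before send, per governance note above
        messages=[{"role": "user", "content": question}],
    )
    text = triage.content[0].text
    if text.strip() != "ESCALATE":
        return text  # sub-2s path: Haiku answered the lookup itself
    return client.messages.create(  # complex path: Sonnet with tool use
        model="claude-sonnet-4-6",  # placeholder for the Sonnet 4.6 tier
        max_tokens=2048,
        tools=tools,  # scoped to the current user by the MCP server
        messages=[{"role": "user", "content": question}],
    )
```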
Use when
Existing app with workflow context (CRM, ticketing, IDE, internal tools). Users want help inside their flow, not a separate chat tab. Personalization across sessions matters.
Don't use when
No host app yet (build the app first). Pure document Q&A (use Pattern 1 RAG). Multi-step autonomous execution (use Pattern 2 agentic).
Operationalize with: claude-code-starter-skills.md — Skills package in-app procedures (drafting, summarizing, classifying) ·
mcp-starter-pack.md — host-app MCP server (read-only by default, gated mutate via Phase 4) ·
eval-starter-pack.md — refusal-calibration + grounding evals matter most for in-app responses ·
governance-overlay.md — §1 data flow when prompts include app context, §11 memory tool retention.
Before you build any of these
Picking the architecture is the easy part; picking the wrong first use case is what stalls 80% of pilots. Score 2–6 candidate use cases on 5 weakest-link axes (value, time-to-signal, data readiness, risk, sponsor clarity) before committing to any pattern above.
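One way to read "weakest-link": score each axis 1–5 (risk scored so that 5 means low risk) and rank candidates by their minimum axis, since one weak axis sinks the pilot. A sketch with made-up candidates:

```python
AXES = ("value", "time_to_signal", "data_readiness", "risk", "sponsor_clarity")

def weakest_link(scores: dict[str, int]) -> int:
    # The candidate is only as strong as its worst axis.
    return min(scores[a] for a in AXES)

candidates = {
    "claims-triage":   {"value": 5, "time_to_signal": 4, "data_readiness": 2,
                        "risk": 4, "sponsor_clarity": 5},
    "support-copilot": {"value": 4, "time_to_signal": 5, "data_readiness": 4,
                        "risk": 4, "sponsor_clarity": 3},
}
ranked = sorted(candidates, key=lambda c: weakest_link(candidates[c]),
                reverse=True)
# support-copilot ranks first: no axis below 3. claims-triage looks better
# on headline value, but its data readiness of 2 is the link that breaks.
```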