When Claude wins, what it costs, how to roll it out in 90 days, and how to govern it.
As of 2026-05. Verify pricing/models at anthropic.com.
The build/buy question has changed. You no longer build the intelligence — you build on top of it.
| Wave | What you used to build | What you build now |
|---|---|---|
| Mainframe → PC | The OS | Apps on the OS |
| On-prem → Cloud | The data center | Services on AWS/GCP/Azure |
| Custom ML → Foundation models | The model | Workflows on the model |
Organizations that try to recreate the model layer in 2026 are repeating the "build our own data center" mistake of 2010.
Beyond the model itself, the platform adds: Skills (custom domain expertise), MCP (connections to external systems), Agent SDK (autonomous workflows), Plugins (bundling and distribution), the Memory tool (cross-session state), and Claude Code (an engineering CLI).
Anchor heuristic: build where you have a moat, buy where you don't. Claude wins when the use case is a differentiator your competitors can't buy off the shelf — proprietary data, workflow, expertise. For pure-commodity use cases, packaged SaaS is usually the right call.
Three levers account for most of the achievable cost reduction vs. naive use (all-Opus, no caching, no batch):
Prompt caching: ~90% discount on cached input tokens. A game-changer for RAG, system prompts, and repeated context.
Batch processing: 50% discount on async work. Use it for enrichment, classification, and summarization at scale.
Model routing: triage with Haiku, route hard cases to Sonnet/Opus. There is a 5–15× cost gap between tiers.
→ Run the numbers in cost-calculator.html for your volume.
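The three levers compound. A minimal sketch of the blended math, with placeholder per-million-token prices (assumptions for illustration only, not current Anthropic pricing — verify at anthropic.com):

```python
# Illustrative cost model for the three levers above. All prices are
# placeholder assumptions (USD per million tokens), not official pricing.

PRICE_PER_MTOK = {           # (input, output) -- assumed values
    "haiku":  (1.00, 5.00),
    "sonnet": (3.00, 15.00),
    "opus":   (15.00, 75.00),
}

CACHE_READ_DISCOUNT = 0.90   # cached input tokens ~90% cheaper to read
BATCH_DISCOUNT = 0.50        # async batch work ~50% cheaper

def request_cost(model, input_tok, output_tok,
                 cached_frac=0.0, batched=False):
    """USD cost of one request under the assumed price table."""
    in_price, out_price = PRICE_PER_MTOK[model]
    cached = input_tok * cached_frac
    fresh = input_tok - cached
    cost = (fresh * in_price
            + cached * in_price * (1 - CACHE_READ_DISCOUNT)
            + output_tok * out_price) / 1_000_000
    return cost * (BATCH_DISCOUNT if batched else 1.0)

# Naive: everything on Opus, no caching, no batch.
naive = request_cost("opus", 10_000, 1_000)

# Tuned: triaged to Sonnet, 80% of input served from cache, run async.
tuned = request_cost("sonnet", 10_000, 1_000, cached_frac=0.8, batched=True)

print(f"naive ${naive:.4f}  tuned ${tuned:.4f}  "
      f"savings {1 - tuned / naive:.0%}")
```

Under these assumed prices the tuned path lands around 95% cheaper per request — which is why the naive all-Opus baseline is the wrong number to budget against.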
Platform features collapse traditional ML build timelines.
| Capability | DIY timeline | On Claude |
|---|---|---|
| Domain-specific assistant | 3–6 months (fine-tune + eval + serve) | 1–2 weeks (Skills + system prompt + eval) |
| RAG over internal docs | 2–4 months (embed + index + retrieval + serving) | 1–3 weeks (Files API or vector store + prompt caching) |
| Agentic workflow | 4–8 months (orchestration + tool wiring + recovery) | 2–6 weeks (Agent SDK + tool use) |
| Connect to internal systems | Custom adapters per system | MCP server (1 adapter, all clients use it) |
The platform compresses build time. The hard work shifts to evaluation, governance, and change management — which is where it always belonged.
→ Full mapping to NIST AI RMF + EU AI Act in governance-overlay.md.
→ Operational detail in adoption-playbook.md.
| Risk | Mitigation |
|---|---|
| Vendor concentration — single-vendor model dependency | Multi-model abstraction layer; quarterly evaluation of alternatives |
| Model deprecation — models get retired | Pin to model family in code; eval pipeline detects regression on switch |
| Cost surprises — unexpected bills | Per-team budgets + hard caps + per-request cost telemetry |
| Prompt drift — silent quality regressions | Prompt versioning + canary evals + alerts on metric drops |
| Shadow AI — staff using consumer Claude with company data | Sanctioned API access with logging; policy + training; SSO-gated |
None are show-stoppers. All are predictable. Plan for them in week 1, not month 6.
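The cost-surprise mitigation in particular is small enough to sketch: per-team budgets with a hard cap and per-request cost telemetry. Class and team names below are illustrative, not part of any Anthropic API:

```python
# Sketch of "per-team budgets + hard caps + per-request cost telemetry".
# Names and limits are illustrative assumptions.

class BudgetGuard:
    def __init__(self, team, monthly_cap_usd):
        self.team = team
        self.cap = monthly_cap_usd
        self.spent = 0.0
        self.log = []  # per-request cost telemetry: (request_id, usd)

    def charge(self, request_id, cost_usd):
        """Record a request's cost; refuse once the hard cap is hit."""
        if self.spent + cost_usd > self.cap:
            raise RuntimeError(
                f"{self.team}: hard cap ${self.cap:.2f} reached")
        self.spent += cost_usd
        self.log.append((request_id, cost_usd))

guard = BudgetGuard("search-team", monthly_cap_usd=1.00)
guard.charge("req-1", 0.40)
guard.charge("req-2", 0.55)
try:
    guard.charge("req-3", 0.10)   # would exceed the $1.00 cap
except RuntimeError as e:
    print(e)
```

The telemetry log is what turns a hard cap from a blunt outage into a diagnosable one: the team can see exactly which requests consumed the budget.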
Stale factual priors about Claude (context window, caching ROI, sandbox denyRead, refusal calibration) quietly distort all five mitigations. See claude-misconceptions.md before final architecture sign-off.
Companion artifacts: cost calculator · feature matrix · adoption playbook · governance overlay · build-vs-buy worksheet · reference architectures · Claude Code rollout