Over the last two years, I've spent most of my time building AI-native products from scratch—ChatStats (an AI assistant over my iMessage history), multi-agent game systems with self-improving strategy loops, agentic business document generators—solving real context and orchestration problems at depth.
Before that, I built ad platforms at Amazon and Apple handling 5M requests per second with sub-120ms latency. I know what production systems look like at scale, and I've personally gone through these AI adoption stages repeatedly across multiple codebases.
What surprised me is how predictable the journey looks once you've seen it enough times:
- The same early wins on tests, docs, glue code, and small greenfield services.
- The same frustration when agents touch large, weird, legacy codebases.
- The same “ohhh, this changes everything” moments when you finally fix the context bottleneck.
You can think of this as an AI adoption curve for real engineering work. It's not about buying a particular tool. It's about:
How long an AI agent can work autonomously in your system before it loses the plot—and how you systematically extend that duration by binding the agent tighter to your intent.
This article is my attempt to write that curve down as a roadmap:
- How to read the stages (unit of assessment, productivity ranges, codebase complexity)
- What each stage feels like in practice, from 0 → 6
- What needs to be true to graduate from one stage to the next
- Where deeper articles fit in if you want to go down the rabbit hole
This is a living document. I'm consolidating my experience here as I continue building and learning. As I move through these stages myself and work with more teams, I'll update this roadmap to reflect what actually works in practice.
How to read this: Take the quiz below to identify your current stage, then read that section plus the one before and after. Unless you're very curious, reading every stage will be overwhelming. Focus on closing gaps where you're partially in the previous stage and understanding how to reach the next one.
How to Read This Roadmap
Before we dive into stages, a few important definitions.
One Engineer, One Codebase
When I say "Stage 3" or "Stage 4", I'm talking about:
One engineer working on one codebase or service.
That matters, because your org probably looks like this:
- A couple of newer services where people are doing advanced stuff (Stage 3-ish).
- A big legacy monolith where everyone secretly still feels like Stage 1 or 2.
- One principal or staff engineer who is personally operating at Stage 3–4 with AI even if the rest of the team isn’t.
That’s normal. The point of this roadmap is not to stamp a single number on your entire company. It’s a lens, not a compliance checklist.
Productivity Ranges
Each stage has a range like 3×–30× next to it.
These ranges are wide because productivity depends on engineer skill, task complexity, and domain knowledge working together.
Writing unit tests with AI? Trivially easy. Design research or a sweeping refactor across five services? Much harder. Your strongest engineer who knows the codebase will extract far more value than a junior onboarding to unfamiliar code.
In early stages (1-2), the gap between best and worst case is enormous. As you systematize context (stages 3-4), the floor rises—average engineers on hard tasks start seeing what top engineers saw earlier.
So when you see:
Stage 2 — Agentic AI (3×–30×)
Read that as: "Depending on the engineer, the task, and the codebase, you'll see somewhere in this range."
Codebase Complexity
You've seen the vibecoding demos: AI spins up entire apps in minutes, ships features in one prompt, makes coding look like magic.
It works when the entire project fits in the model's context window. Perfect for greenfield: one service, modern stack, no history, no hidden invariants.
The disillusionment hits hard when you take those same tools into a large legacy service and they serve up hallucinations on your first request.
Real companies have systems that look like:
- 100k+ lines across multiple services
- 10–15 years of history with mixed paradigms and half-finished migrations
- Tribal knowledge about "that one place you must not touch"
For humans, onboarding onto those systems is orders of magnitude harder than starting greenfield. For AI, it's no different.
A tool that behaves like Stage 3 or 4 on a tiny service may feel like Stage 2 on your legacy monolith until you fix context.
If you take nothing else away from this article:
Codebase complexity is not a side note. It's one of the main difficulty knobs for AI adoption.
Context is how you pay down that difficulty. I dig into this in Context Is Your Constraint.
The Stages
Here's the whole ladder in one view:
- Stage 0 — No AI (1×–10×): Traditional engineering. No AI tools in the loop.
- Stage 1 — Ad-hoc AI (1×–15×): ChatGPT/Claude in the browser. Manual copy-paste of context.
- Stage 2 — Agentic AI (3×–30×): Agentic IDEs and agent terminals[†] that can search and read your repo—without any context engineering toolchain on top.
- Stage 3 — Ad-hoc Context Engineering (5×–50×): Power users hand-crafting context bundles for serious tasks.
- Stage 4 — Systematic Context (Intent Layer) (15×–60×): A shared, token-efficient context layer (e.g. an agents.md hierarchy) sits over your codebase.
- Stage 5 — Agentic Verification (30×–100×): Agents own implementation and verification loops; humans review final results.
- Stage 6 — Multi-Agent Orchestration (60×–300×): Many autonomous agents work in parallel; orchestration handles conflict and coordination.
Stage 7—Continuously Learning Agents—is frontier territory being defined by early adopters. You don't need Stage 7 to see massive value; for most orgs, there are years of runway moving from Stage 2 to 4 or 5.
Now let's go stage by stage.
Stage 0 — No AI
Most serious engineering orgs with modern stacks aren't here anymore, so we won't linger. But for completeness:
What it feels like
Everyone works in traditional editors. No AI tools. All improvements come from hiring, better processes, and better infrastructure.
| Dimension | What This Stage Looks Like |
|---|---|
| AI Tools & Infrastructure | No AI tools available. Engineers use traditional IDEs. |
| Context Engineering | Not applicable — no AI to provide context to. |
| Task Scope & Capabilities | Manual coding for all tasks. Engineers rely on their own knowledge and traditional docs. |
| Engineer Skill Requirements | Traditional software development skills. No AI prompting needed. |
| Quality & Verification | Manual code review, manual testing, traditional QA. |
| Iteration Speed | Baseline. Features take standard timelines based on engineer skill and availability. |
Stage 1 — Ad-hoc AI
This is "we use AI, but mostly as smart autocomplete."
What it feels like
Engineers paste stack traces into browser-based chat UIs, use autocomplete for small helpers, and occasionally have it explain unfamiliar code. It's undeniably helpful, but only on narrow, local tasks.
| Dimension | What This Stage Looks Like |
|---|---|
| AI Tools | Browser-based chat UIs (ChatGPT, Claude) and basic IDE autocomplete plugins. |
| Context Engineering | Manual copy-paste workflow. Engineers copy code snippets and error messages into chat interfaces. |
| Task Scope | Single-file edits, code explanations, isolated tests and docs. |
| Engineer Skills | Basic prompting skills. Wide skill variance. |
| Quality & Verification | Manual review of all AI suggestions. Engineers treat AI output as a starting point requiring significant editing. |
| Iteration Speed | Faster than pure manual for specific tasks, but context translation between browser and editor creates overhead. |
Your Constraint
Agency. You have amazing AI but you're not giving it tools to make changes directly. You're bottlenecked by human I/O going back and forth. The workflow: hit an error, switch to browser, paste stack trace, get response, copy back to editor, repeat. This translation loop limits AI to isolated tasks.
Graduate By
Adopt agentic tools[†] that can search, read, and make changes to your repo directly. This eliminates the translation loop. Agents get their hands dirty in your code instead of you playing messenger.
Final Thoughts
Stage 1 is where manual copy-paste hits its ceiling. Graduating to Stage 2 is straightforward: give engineers tools that make changes directly in code. The interesting problems start after that.
Stage 2 — Agentic AI
This is where most serious teams I talk to live today.
What it feels like
You've wired in agentic tools[1]. Agents can list files, search the repo, read code/configs/docs, and propose multi-file changes. It feels magical on small, clean services—and frustratingly hit-or-miss on large, messy ones.
| Dimension | What This Stage Looks Like |
|---|---|
| AI Tools | Agent harness with agentic context retrieval: Cursor, Augment, Claude Code, or similar. |
| Context Engineering | Agentic context retrieval. Tool automatically searches codebase using terminal tools, embedding tools, and AST relationships. |
| Task Scope | Greenfield: prototype entire apps in hours. Legacy: multi-file changes are hit-or-miss and need careful review. |
| Engineer Skills | Learning to work with agents: Effective prompts, reviewing multi-file changes, understanding agent limitations. Still figuring out what works. |
| Quality & Verification | Manual code review and testing required. Must carefully verify agent changes, especially on legacy systems where context gaps lead to subtle bugs. |
| Iteration Speed | Greenfield: minutes to hours for prototypes. Legacy: slower and inconsistent; review overhead eats into the gains. |
Your Constraint
Context.
The AI has the intelligence and the tools, but it does not see what your best engineers see before touching production[2].
On greenfield, agentic search can wander the repo and land somewhere reasonable. On legacy systems, it's guessing where to look and filling the window with noise.
It cannot see your architectural patterns, enforcement boundaries, or "never do X" rules. It does not know which configs and experiments actually matter, or how Service A's contract quietly constrains Service B. That missing context is why you get:
- Fast prototypes on small, clean repos
- Inconsistent results on large, messy ones
- Subtle bugs when invariants are not visible to the model
- More time spent reviewing than the AI saved
The model's capability is already there. The ceiling you are hitting is context.
Graduate By
Treat context as an explicit engineering skill rather than something the agent figures out on its own.
Pick a few senior engineers and have them hand craft rich context packs for serious tasks: code, configs, docs, and constraints that matter. Use context engineering tools[†] to help assemble bundles.
The real leverage here is teaching your engineers how to think in terms of context engineering. This skill also becomes a prerequisite for building effective agents later.
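To make "context pack" concrete, here's a minimal Python sketch of the assembly step. The section names and the 4-characters-per-token heuristic are illustrative assumptions, not a standard; real packs pull from actual files, diagrams, and docs:

```python
def build_context_pack(sections):
    """Format named context sections (code, configs, constraints) into a
    single prompt-ready bundle, with a rough token estimate attached."""
    parts = []
    for title, body in sections.items():
        parts.append(f"## {title}\n\n{body.strip()}\n")
    bundle = "\n".join(parts)
    # Rough heuristic: ~4 characters per token for English text and code.
    return bundle, len(bundle) // 4
```

In Stage 3 this assembly stays manual and bespoke; the point of Stage 4 is to version exactly this kind of material in the repo so nobody rebuilds it per task.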
Final Thoughts
Stage 2 is where you learn that agentic search alone is just the AI feeling around your system with a blindfold on. Stage 3 is where you start taking that blindfold off by giving it the curated view your best engineers use.
Stage 3 — Ad-hoc Context Engineering
This is the "wizard" phase.
What it feels like
A handful of senior engineers have learned to act as human context routers. They spend 60-90+ minutes before a task assembling the perfect context bundle: the right files, architectural patterns, cross-service contracts, and domain constraints. When the context is right, the agent delivers production-quality work in one shot.
| Dimension | What This Stage Looks Like |
|---|---|
| AI Tools | Agent harness + manual context engineering tools. |
| Context Engineering | Manual context engineering. Engineers craft packages: files, diagrams, dependencies, domain knowledge. |
| Task Scope | Deliver entire features across complex multi-service systems. Context engineering unlocks effectiveness on legacy. |
| Engineer Skills | Must learn context engineering, a specialized skillset with steep learning curve. Complex workflow requiring significant cognitive overhead. |
| Quality & Verification | Manual review with better confidence. Good context means fewer subtle bugs, but verification still required. Reviewers still carry the load. |
| Iteration Speed | Complex features feasible but context preparation adds upfront time. Faster than Stage 2 on legacy, but the preparation tax is real. |
Your Constraint
Skill gap and scalability.
The difference between a novice and expert context engineer is 10× or more. Your wizards are shipping entire features that used to take teams, but:
- Only 2-3 engineers can execute this effectively
- Each context pack takes 30-90 minutes to assemble
- Miss one file or dump too much, and quality degrades
- Every pack is a bespoke experiment that disappears into chat logs
- Average engineers remain stuck at Stage 2
You've proven context is the lever. Now you need to scale it[3].
Graduate By
Build a systematic context layer over your codebase. Capture the hard-won mental models from your best engineers once, version them in the repo, and make them reusable.
Tools: agents.md hierarchies, architectural decision records, explicit invariants.
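One way to treat the context layer as infrastructure is to enforce it in CI. Here's a hedged Python sketch (the per-service-directory layout and the `missing_context_files` helper are assumptions, not a convention your tools require) that flags service directories missing their agents.md:

```python
from pathlib import Path

def missing_context_files(root, service_dirs, filename="agents.md"):
    """Return the service directories under `root` that lack a context
    file, so gaps in the context layer surface as CI failures instead
    of silent agent confusion."""
    root = Path(root)
    return [d for d in service_dirs if not (root / d / filename).exists()]
```

A check like this keeps the context layer from rotting as the codebase evolves, which is the failure mode that quietly drops a team back to Stage 3.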
Final Thoughts
Stage 3 is where the most impressive AI case studies come from, but it's also where "AI works... but only when we put a principal engineer in front of it and treat every serious change as a one-off science project."
Stop treating context as personal craft. Start treating it as infrastructure. That's Stage 4.
Stage 4 — Systematic Context
This is the first big step change for larger organizations.
What it feels like
You've turned the lights on permanently[4]. Your best engineers' mental models are now encoded as a hierarchical context layer over the codebase. Instead of every task requiring manual context assembly, agents automatically start with architectural patterns, invariants, cross-service contracts, and known pitfalls. The context your wizards had to craft by hand in Stage 3 is now infrastructure that everyone gets for free.
| Dimension | What This Stage Looks Like |
|---|---|
| AI Tools | Agent harness + systematic context layer. Manual context engineering is still an incredibly valuable tool. |
| Context Engineering | Systematic and pre-built. Created once through interviews and hierarchical summarization. Maintained as code evolves. |
| Task Scope | Large features, projects, and refactors delivered end-to-end. Bugs triaged rapidly with full system context. |
| Engineer Skills | Just need to use the agent harness. Context engineering is democratized. |
| Quality & Verification | Manual review with higher confidence. Agents make sound decisions and avoid known pitfalls. Engineers still test and iterate. |
| Iteration Speed | Top engineers faster (less overhead). Bottom engineers productive on complex tasks (context guides them). Team velocity increases. |
Your Constraint
Verification and iteration.
Agents now make correct, system-aware changes without manual context wrangling, but you still spend significant time:
- Running tests manually
- Checking logs and metrics
- Iterating on failures
- Verifying edge cases
You've solved "what should the agent read?" The bottleneck is now "who runs the tests and fixes failures?"
Graduate By
Add an agentic verification layer. Let agents run tests, check logs, and iterate on their own failures autonomously. You review the final result, not every iteration.
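A minimal sketch of such a loop, assuming a hypothetical `propose_fix` callback standing in for your implementation agent (the default `pytest -q` command is just an example test runner):

```python
import subprocess

def verification_loop(propose_fix, test_cmd=("pytest", "-q"), max_iters=5):
    """Run the test suite; on failure, hand the combined output to an
    agent callback and try again. Returns True once the suite passes,
    False when the iteration budget runs out."""
    for _ in range(max_iters):
        result = subprocess.run(test_cmd, capture_output=True, text=True)
        if result.returncode == 0:
            return True
        propose_fix(result.stdout + result.stderr)  # agent edits code here
    return False
```

The essential design choice is the bounded iteration budget: the agent iterates on its own failures, but you still get a deterministic point at which the loop stops and a human reviews.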
Final Thoughts
Stage 4 raises the floor. Average engineers now get results that were previously only available to your top engineers. Context engineering becomes shared infrastructure instead of personal craft, and AI effectiveness gets democratized across your entire team.
Stage 5 — Agentic Verification
By Stage 5, you've solved the input side of context. Now you start solving the feedback loop.
What it feels like
Agents no longer just propose changes. They run tests, check logs, and iterate until acceptance criteria are met. The implementation agent makes changes using the context layer; the verification agent runs tests and checks for regressions. They iterate for 30-60 minutes. You review a patch with clean test results.
| Dimension | What This Stage Looks Like |
|---|---|
| AI Tools | Agent harness + systematic context layer + verification agents. Implementation and test agents work in tandem to iterate autonomously. |
| Context Engineering | Systematic context layer extended to testing patterns, verification strategies, and quality criteria. Both implementation and verification are context-aware. |
| Task Scope | Agents iterate 30-60 minutes autonomously to achieve goals. Implementation agent makes changes, test agent verifies and checks regressions, iterate until complete. Fully autonomous cycles. |
| Engineer Skills | Focus shifts to planning and specification. Engineers define goals, acceptance criteria, and edge cases. Agents execute and verify. Review final output, not iterations. |
| Quality & Verification | Autonomous implementation and verification loop. Test agent ensures correctness and no regressions. Engineers review completed, tested work. |
| Iteration Speed | 30-60 minute autonomous cycles. Define feature, walk away, return to tested implementation. Eliminates manual test-debug-fix loop. |
Your Constraint
Serial execution.
You work on one task at a time. The agents can iterate autonomously, but you can't parallelize multiple loops simultaneously. If you want to ship five features, you queue them up and wait for each to complete.
The bottleneck shifts from "can an agent make a correct change?" to "how many loops can we run at once, and how do we avoid agents stepping on each other?"
Graduate By
Enable parallel multi-agent orchestration with merge conflict resolution. Spin up multiple agent pairs working simultaneously on different parts without stepping on each other.
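A hedged sketch of the coordination idea: tasks that declare overlapping paths are serialized into waves, while disjoint tasks run in parallel. Here `run_agent_loop` is a stand-in for a Stage 5 implement-and-verify cycle, and the `paths` scoping scheme is an assumption, not a real orchestration API:

```python
from concurrent.futures import ThreadPoolExecutor

def orchestrate(tasks, run_agent_loop, max_parallel=4):
    """Group tasks into waves so that no two tasks in a wave touch the
    same paths, then run each wave's agent loops in parallel."""
    waves = []
    for task in tasks:
        for wave in waves:
            # A task joins a wave only if its paths are disjoint
            # from every task already in that wave.
            if all(not set(task["paths"]) & set(t["paths"]) for t in wave):
                wave.append(task)
                break
        else:
            waves.append([task])
    results = []
    with ThreadPoolExecutor(max_workers=max_parallel) as pool:
        for wave in waves:  # waves run serially; tasks within a wave run in parallel
            results.extend(pool.map(run_agent_loop, wave))
    return results
```

Real orchestration layers also handle merge conflicts after the fact; path-scoping up front just reduces how often that machinery is needed.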
Final Thoughts
Stage 5 is where you delegate entire implementation cycles. Define features in the morning, walk away while they run, return to working tested code. Or move to Stage 6 and start running multiple independent cycles in parallel while you wait. The 30-60 minute autonomous cycle eliminates the test-debug-fix loop.
Stage 6 — Multi-Agent Orchestration
This is the current frontier: engineers orchestrating many autonomous loops in parallel.
What it feels like
Engineers implement and test multiple non-trivial features daily. An orchestration layer spins up multiple agent pairs, coordinates which parts of the codebase they touch, and handles merge conflicts. You spend mornings on planning and specs, fire off multiple agent tasks, and spend afternoons reviewing and merging.
| Dimension | What This Stage Looks Like |
|---|---|
| AI Tools | Full orchestration system managing multiple parallel autonomous agents. |
| Context Engineering | Systematic context layer enables planning mode. Engineers use context to design end-to-end features and test plans with agents before implementation. |
| Task Scope | Multiple features delivered daily. Architecture decisions become two-way doors. Explore multiple directions experimentally. |
| Engineer Skills | 95% planning and design. Focus on architecture, feature specs, edge cases. Implementation runs parallel while planning next batch. |
| Quality & Verification | Multiple autonomous agent pairs in parallel, each with implementation and verification loops. Orchestration ensures clean merges between workstreams. |
| Iteration Speed | Several large features per engineer per day. What used to take a sprint now takes an afternoon. Planning quality is the bottleneck, not implementation capacity. |
Your Constraint
Planning and specification quality.
Implementation capacity is no longer a constraint. The accuracy and completeness of your feature specifications is now the bottleneck. Poorly defined edge cases or ambiguous requirements waste agent cycles.
Your role shifts from "can we build it?" to "did we specify the right thing?" Engineering becomes primarily architectural decision-making and planning.
Graduate By
This is the top tier for most organizations. The next frontier is continuously learning agents that improve their own context and skills over time, but you don't need that to see massive value. Focus on improving planning processes, specification quality, and architectural decision-making.
Final Thoughts
You're operating on a fundamentally different level. The constraint is how well you can think through problems, not how fast you can implement them.
Idea people will rule the world.
What I Think Comes Next
Everything in this roadmap is based on patterns that already exist in production today.
The interesting question for the next few years is: What does Stage 7 actually look like?
My current bet is continuously learning agents that improve their planning capabilities over time:
- Learn your patterns: After seeing how you specify features, what edge cases you care about, and how you make architectural tradeoffs, agents start suggesting complete specifications that match your style.
- Compound context: Instead of just reading static context layers, agents update their understanding based on what works and what doesn't in your specific codebase and team.
- Planning partners: The bottleneck shifts from "how fast can I write specs?" to "how well can I communicate what I want to an agent that already knows my constraints and preferences?"
This is frontier territory. Most organizations will see enormous value just moving from Stage 2 → 4 on their critical systems.
This roadmap gives you the language and sequence to keep making progress without getting lost in the hype.
If you want to climb this curve:
- Use this article as the map
- Use the quiz as an alignment tool
- Focus on the stages that remove your current bottleneck
And if you want help moving faster, from training your team in context engineering to installing a context layer over a legacy service, that's exactly what I spend my days working on.
Want help climbing the roadmap?
Whether you're trying to get more out of Cursor/Claude Code today or you're ready to install a context layer over a real legacy service, I can help you move up a stage without burning a year experimenting.