Frontier engineering teams don’t just adopt AI tools, they change how they build software
Frontier engineering teams don’t just code faster. They ship production-quality software with AI that understands their codebase, their standards, and their intent before a single line is written.
What makes a frontier engineering team?
Frontier teams don’t just adopt AI tools; they change how they build software. Here’s what separates them from teams that are still experimenting.
Across teams at Amazon, the highest performers share five practices with a common logic: reduce the barriers to context for the agent, and increase the surface area of work it can do independently.
They invest in agent context. The most advanced teams invest heavily in making projects and knowledge easier for agents to consume. This includes agent steering files that define conventions, coding standards, testing patterns, and codebase navigation. One infrastructure team placed all code and documentation into a monorepo and kept the inline commentary that agents generated, treating it as persistent memory. Teams that skip this step often wonder why their agents keep making the same mistakes.
They slow down to speed up. High-performing teams consistently reported that things initially slowed down as they learned the models. The teams that pushed through that learning curve experienced compounding acceleration. The teams that expected immediate gains without changing their workflows were disappointed. The first two weeks tend to feel slower. The weeks after tend to feel dramatically faster.
They feed agents instead of babysitting them. Frontier teams maintain a steady backlog of well-scoped tasks, running multiple agents in parallel and reviewing output asynchronously. Some run agents for hours at a time, often overnight, and review generated code in the morning. One principal engineer shipped a complete change with only “a couple of hours of contiguous time” because the agent worked while the engineer moved between code reviews, operational support, and meetings.
They make intent explicit before code gets written. Whether through structured specifications, detailed requirements documents, or well-scoped task decomposition, frontier teams ensure agents have clear context about what “done” looks like before they start generating code. Some teams using this approach report hand-writing only 1–2% of their code while pushing significantly more commits per person per week than before.
They shift testing left. Frontier teams build tooling so agents can run integration tests locally and self-correct before code ever reaches the pipeline. One team invested in automated guardrails, component tests, performance tests, and formatters that caught issues early. Code reviews shifted focus to interface definitions and architectural decisions rather than code style and naming conventions.
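Two of the practices above, feeding agents from a backlog and letting them self-correct against local checks, can be sketched together in a few lines. This is a minimal illustration, not how any particular team or tool implements it: `run_agent` and `local_checks` are hypothetical stand-ins for invoking a coding agent and running tests or formatters locally, and the task names are invented.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass


@dataclass
class Task:
    """A well-scoped backlog item with an explicit definition of done."""
    name: str
    done_when: str


@dataclass
class Result:
    task: Task
    output: str
    attempts: int
    passed: bool


def run_agent(task: Task, attempt: int) -> str:
    # Hypothetical stand-in for invoking a coding agent; returns a fake patch.
    return f"patch for {task.name} (attempt {attempt})"


def local_checks(output: str) -> bool:
    # Hypothetical stand-in for running tests and formatters locally.
    # The first attempt always fails here, purely to exercise the retry path.
    return "attempt 1" not in output


def work(task: Task, max_attempts: int = 3) -> Result:
    """Run the agent, retrying until local checks pass or attempts run out."""
    output = ""
    for attempt in range(1, max_attempts + 1):
        output = run_agent(task, attempt)
        if local_checks(output):
            return Result(task, output, attempt, True)
    return Result(task, output, max_attempts, False)


def drain_backlog(backlog: list[Task], parallel_agents: int = 4) -> list[Result]:
    """Process tasks with several agents at once; results keep backlog order."""
    with ThreadPoolExecutor(max_workers=parallel_agents) as pool:
        return list(pool.map(work, backlog))


backlog = [
    Task("add-retry-logic", "integration tests pass"),
    Task("rename-config-keys", "formatter and unit tests pass"),
]
for result in drain_backlog(backlog):
    print(result.task.name, result.attempts, result.passed)
```

The engineer's role in this sketch is to keep `backlog` full and to read the collected results later, rather than to watch each agent run.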
Where does your team stand?
Use these behaviors to spot where your team already operates like a frontier team.
| Dimension | Still Experimenting | Frontier-Ready |
|---|---|---|
| AI usage | Autocomplete, ad-hoc prompts | Autonomous agents with structured context |
| Requirements | Verbal or in tickets | Explicit intent captured before code is written |
| Code quality | Review catches issues late | Tests and docs generated alongside code |
| Legacy code | Avoided or manually refactored | AI analyzes and modernizes with guardrails |
| Governance | Per-developer tool choices | Org-wide steering files and policies |
Results
Inside the teams that already ship this way. Not projections. Measured results from teams that restructured their workflows around AI.
| Result | Measure | Team |
|---|---|---|
| Up to 20x | Individual productivity gain | Amazon Bedrock (6 engineers, 76 days) |
| ~6x | Throughput acceleration | Amazon Prime Video (90-week estimate compressed to 24 weeks) |
| 4.5x median | Productivity gain | Amazon Stores (multiple-team pilot, typical engineers) |
| More than 10x | Top-end improvement | Amazon Stores (highest-performing teams) |
Amazon Bedrock. Workflow redesign from day one.
Six engineers delivered in 76 days a project scoped for 30 developers over 12-18 months. The team redesigned how they work around AI from the start. They shifted from discrete tasks to goal-driven outcomes, ran multiple agents in parallel, and enabled autonomous work during off-hours. Commits went from 2 to 40 per developer per week.
Amazon Prime Video. Taming a decade-old legacy codebase.
The Financial Systems team ran a 10-day structured sprint on a 10-year-old payments system processing billions in transactions across 6 production services. Six engineers, one room, zero context switching. They produced 556 commits against a baseline of 96. The gain came from three factors multiplying together: acceleration of low-judgment work (1.5x) × uninterrupted focus on high-judgment work (1.5x) × instant access to agent-captured domain expertise (1.5x).
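The figures quoted for this team can be compared with a few lines of arithmetic. The sketch below uses only numbers that appear on this page (the 90-week and 24-week figures come from the results summary above); the variable names are illustrative. The text does not say which of these ratios the ~6x headline is based on, so the sketch only computes them side by side.

```python
# Commit counts from the 10-day sprint versus the stated baseline.
baseline_commits = 96
sprint_commits = 556
commit_ratio = sprint_commits / baseline_commits

# The three multiplying factors named in the text, 1.5x each.
factor_product = 1.5 * 1.5 * 1.5

# Schedule compression from the results summary: 90 weeks down to 24.
schedule_ratio = 90 / 24

print(f"commit ratio:      {commit_ratio:.2f}x")
print(f"product of factors: {factor_product:.3f}x")
print(f"schedule ratio:    {schedule_ratio:.2f}x")
```

The commit ratio works out to roughly 5.8x, while the product of the three factors (3.375x) and the schedule compression (3.75x) sit closer to each other than to the commit ratio.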
Amazon Stores. Median 4.5x, top teams exceeding 10x.
A structured pilot with typical development teams working against their regular backlogs. No special conditions. No handpicked engineers. The teams that restructured workflows around AI achieved a median 4.5x productivity gain. One team now ships features in an afternoon that previously took two weeks.
How Kiro works
Frontier teams find their own path to productivity gains. Some invest in agent infrastructure. Some restructure around structured requirements. Some do both. But the teams that sustain those gains over time tend to converge on three things. They give agents rich context. They make intent explicit. And they verify correctness before code ships. Kiro builds these into the development environment so teams don’t have to invent them from scratch.
Intent
Describe what the team wants in natural language. Kiro transforms that into structured requirements with acceptance criteria, architectural designs, and sequenced implementation tasks. Intent stays synchronized with code as it evolves. Requirements never drift from implementation because they’re linked bidirectionally.
Context
Kiro builds and maintains a persistent understanding of the project across sessions, surfaces, and contributors. Three layers of memory work together: explicit rules from steering files, episodic history from past sessions, and learned patterns from how the team works. Agents don’t start cold every time. They carry forward what they’ve learned.
Correctness
Property-based testing extracts testable properties directly from requirements and generates hundreds of randomized test cases probing edge cases no human would write by hand. Code is verified against intent before it ships, not after. The outcome is software that’s correct by design, not just fast by default.
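To make the idea of property-based testing concrete, here is a minimal sketch using only Python's standard library. It is not Kiro's implementation. The requirement ("a discount never raises the price and never makes it negative"), the function `apply_discount`, and the input ranges are all hypothetical, chosen for illustration.

```python
import random


def apply_discount(price_cents: int, percent: int) -> int:
    """Hypothetical function under test: price after a percentage discount."""
    return price_cents - (price_cents * percent) // 100


def check_discount_properties(trials: int = 500, seed: int = 0) -> int:
    """Check properties derived from the requirement on many random inputs."""
    rng = random.Random(seed)  # seeded so any failure is reproducible
    for _ in range(trials):
        price = rng.randint(0, 10_000_000)
        percent = rng.randint(0, 100)
        result = apply_discount(price, percent)

        # Property 1: the discounted price stays between zero and the original.
        assert 0 <= result <= price, (price, percent, result)
        # Property 2: a 0% discount changes nothing.
        if percent == 0:
            assert result == price
        # Property 3: a 100% discount makes the item free.
        if percent == 100:
            assert result == 0
    return trials


print(check_discount_properties(), "random cases checked")
```

Instead of listing specific input and output pairs, the test states what must hold for every input and lets randomized generation search for a counterexample. Dedicated libraries such as Hypothesis add smarter input generation and shrinking of failing cases on top of this basic loop.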
From intent to production
Capture intent
Natural language becomes structured requirements and acceptance criteria. Spec clarity detects vagueness before implementation begins. The team aligns on what “done” looks like before a single line gets written.
Design the architecture
Kiro analyzes the codebase and generates system design, tech stack decisions, and an implementation plan broken into discrete tasks sequenced by dependency.
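Sequencing tasks by dependency is a topological sort. The sketch below shows the general technique with Python's standard `graphlib` module; the task names and the shape of the plan are hypothetical and do not reflect Kiro's actual plan format.

```python
from graphlib import TopologicalSorter

# Hypothetical plan: each task maps to the set of tasks it depends on.
plan = {
    "define-schema": set(),
    "write-api": {"define-schema"},
    "write-tests": {"write-api"},
    "write-docs": {"write-api"},
}


def sequence(plan: dict[str, set[str]]) -> list[str]:
    """Return the tasks in an order where every dependency comes first."""
    return list(TopologicalSorter(plan).static_order())


order = sequence(plan)
print(order)
```

Tasks with no dependency between them, such as `write-tests` and `write-docs` here, may come out in either order, and they are also the ones that could be handed to separate agents in parallel.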
Agents build to context
Agents execute against structured context, not guesswork. Tests, documentation, and code are generated together. Steering files enforce team conventions regardless of who started the session or which surface they’re working from.
Verify correctness
Property-based tests validate implementation against intent. Edge cases are caught automatically. Code that doesn’t meet requirements doesn’t ship.
One agent. Every surface.
Kiro delivers the same agent, same memory, and same reasoning across IDE, CLI, and web. Start a task in the IDE, continue it from the CLI, review it on the web. Context follows the team, not the device.
Steering files enforce org-wide conventions across every surface. Memory compounds across sessions and contributors. New team members inherit the project’s accumulated intelligence from day one.
Use cases
Greenfield. Start from intent, ship production-grade. Write a natural-language prompt. Kiro turns it into structured requirements, architectural designs, and an implementation plan with discrete tasks. Your agent builds against that spec, in fewer prompts.
Brownfield. Make your existing codebase AI-ready. Kiro analyzes your codebase and generates specs for what already exists. Steering files enforce your team’s conventions. New features land cleanly because the agent understands the architecture it’s building on.
Getting started
Not a prescription. A pattern that’s worked across teams with different codebases, team sizes, and risk tolerances.
1. Onboard
Pick one team. Install Kiro. Run it on a contained feature or module. No org-wide rollout yet.
2. Experiment
Write the first structured requirements. Let agents generate code, tests, and docs. Compare output quality to the current workflow.
3. Scale
Add steering files for org-wide conventions. Roll out to more teams. Measure correctness, not just speed.
4. Sustain
Build a champion team model. Share steering files and context across the org. Make AI-native development the default.