What are frontier engineering teams?

Teams that have moved past using AI as autocomplete. They use structured context, autonomous agents, and org-level steering to ship production-quality code, not just faster code.

How is this different from just using an AI coding tool?

With a typical AI coding tool, developers prompt, wait, paste errors back, and constantly supervise the agent. Frontier teams restructure their workflows so agents can run autonomously for longer stretches, validating their own output through tests, self-correcting, and working in parallel. The difference isn't the tool. It's the workflow around it.

What results have teams actually seen?

Results vary by team and context. Amazon teams that restructured their workflows around AI have seen productivity gains ranging from 4.5x (the median across a multi-team pilot) to 20x on specific projects. These are measured results, not guarantees. The common factor across all of them was changing how the team works, not just changing the tools they use.

How does Kiro handle legacy codebases?

Kiro analyzes your existing codebase and generates specs for what's already there. Steering files enforce your team's conventions. When agents write new code, they understand the architecture they're building on, so new features land cleanly instead of creating more tech debt.

How should we start adopting this?

Pick one team, one contained feature. Install Kiro, capture intent as structured requirements, and compare the output to the current workflow. Don't try to roll out org-wide on day one. The four-phase framework gives a pattern that's worked for other teams.

Frontier engineering teams don’t just adopt AI tools. They change how they build software

Frontier engineering teams don’t just code faster. They ship production-quality software with AI that understands their codebase, their standards, and their intent before a single line is written.

Set up your team

Contact sales

What makes a frontier engineering team?

Frontier teams don’t just adopt AI tools. They change how they build software. Here’s what separates them from teams that are still experimenting.

Across teams at Amazon, the highest-performing teams share five practices with a common logic: reduce the barriers to context for the agent, and increase the surface area of work it can do independently.

They invest in agent context. The most advanced teams invest heavily in making projects and knowledge easier for agents to consume. They write agent steering files that define conventions, coding standards, testing patterns, and codebase navigation. One infrastructure team placed all code and documentation into a monorepo and kept the inline commentary that agents generated, treating it as persistent memory. Teams that skip this step often wonder why their agents keep making the same mistakes.

They slow down to speed up. High-performing teams consistently reported that things initially slowed down as they learned the models. The teams that pushed through that learning curve experienced compounding acceleration. The teams that expected immediate gains without changing their workflows were disappointed. The first two weeks tend to feel slower. The weeks after tend to feel dramatically faster.

They feed agents instead of babysitting them. Frontier teams maintain a steady backlog of well-scoped tasks, running multiple agents in parallel and reviewing output asynchronously. Some run agents for hours at a time, often overnight, and review generated code in the morning. One principal engineer shipped a complete change with only “a couple of hours of contiguous time” because the agent worked while the engineer moved between code reviews, operational support, and meetings.

They make intent explicit before code gets written. Whether through structured specifications, detailed requirements documents, or well-scoped task decomposition, frontier teams ensure agents have clear context about what “done” looks like before they start generating code. Some teams using this approach report hand-writing only 1–2% of their code while pushing significantly more commits per person per week than before.

They shift testing left. Frontier teams build tooling so agents can run integration tests locally and self-correct before code ever reaches the pipeline. One team invested in automated guardrails, component tests, performance tests, and formatters that caught issues early. Code reviews shifted focus to interface definitions and architectural decisions rather than code style and naming conventions.
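The "run tests locally and self-correct" loop described above can be sketched in a few lines. This is a minimal illustration, not any team's actual tooling: `run_until_green`, `CheckReport`, and the toy `generate` and `run_tests` stand-ins are all hypothetical names introduced here.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class CheckReport:
    passed: bool
    failures: List[str]


def run_until_green(
    generate: Callable[[List[str]], str],
    run_tests: Callable[[str], CheckReport],
    max_attempts: int = 3,
) -> Optional[str]:
    """Return the first candidate that passes, or None if attempts run out."""
    feedback: List[str] = []
    for _ in range(max_attempts):
        candidate = generate(feedback)   # agent proposes code, given prior failures
        report = run_tests(candidate)    # local test run, before any pipeline
        if report.passed:
            return candidate
        feedback = report.failures       # failures become input to the next attempt
    return None


# Toy stand-ins so the sketch runs: the "agent" fixes its bug once it sees a failure.
def toy_generate(feedback: List[str]) -> str:
    return "return a + b" if feedback else "return a - b"


def toy_run_tests(candidate: str) -> CheckReport:
    ok = candidate == "return a + b"
    failures = [] if ok else ["add(2, 2) returned 0, expected 4"]
    return CheckReport(passed=ok, failures=failures)


result = run_until_green(toy_generate, toy_run_tests)
print(result)
```

The attempt cap matters: without it, an agent that cannot satisfy the tests would loop indefinitely instead of handing the failure back to a human reviewer.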

Where does your team stand?

Use these behaviors to spot where your team already operates like a frontier team.

Dimension	Still Experimenting	Frontier-Ready
AI usage	Autocomplete, ad-hoc prompts	Autonomous agents with structured context
Requirements	Verbal or in tickets	Explicit intent captured before code is written
Code quality	Review catches issues late	Tests and docs generated alongside code
Legacy code	Avoided or manually refactored	AI analyzes and modernizes with guardrails
Governance	Per-developer tool choices	Org-wide steering files and policies

Results

Inside the teams that already ship this way. Not projections. Measured results from teams that restructured their workflows around AI.

Up to 20x

individual productivity gain. Amazon Bedrock (6 engineers, 76 days)

~6x

throughput acceleration. Amazon Prime Video (90-week estimate compressed to 24 weeks)

4.5x median

productivity gain. Amazon Stores (multi-team pilot, typical engineers)

More than 10x

top-end improvement. Amazon Stores (highest-performing teams)

Amazon Bedrock. Workflow redesign from day one.

Six engineers delivered a project scoped for 30 developers over 12-18 months in 76 days. The team redesigned how they work around AI from the start. They shifted from discrete tasks to goal-driven outcomes, ran multiple agents in parallel, and enabled autonomous work during off-hours. Commits per developer rose from 2 to 40 per week.

Amazon Prime Video. Taming a decade-old legacy codebase.

The Financial Systems team ran a 10-day structured sprint on a 10-year-old payments system processing billions in transactions across 6 production services. Six engineers, one room, zero context switching. They produced 556 commits against a baseline of 96. The gain came from three factors multiplying together: acceleration of low-judgment work (1.5x) × uninterrupted focus on high-judgment work (1.5x) × instant access to agent-captured domain expertise (1.5x).
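As a quick check on the arithmetic, the sketch below recomputes two ratios from the figures quoted above. It assumes the commit counts are directly comparable, and it makes no claim about how the three 1.5x factors relate to the commit ratio, since the text does not say.

```python
# Figures quoted in this section for the Prime Video sprint.
commits_sprint = 556
commits_baseline = 96
factors = [1.5, 1.5, 1.5]  # low-judgment work, focus, domain expertise

commit_ratio = commits_sprint / commits_baseline

compounded = 1.0
for f in factors:
    compounded *= f  # the factors multiply rather than add

print(f"commit ratio: {commit_ratio:.2f}x")
print(f"three factors compounded: {compounded:.3f}x")
```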

Amazon Stores. Median 4.5x, top teams exceeding 10x.

A structured pilot with typical development teams working against their regular backlogs. No special conditions. No handpicked engineers. The teams that restructured workflows around AI achieved a median 4.5x productivity gain. One team now ships features in an afternoon that previously took two weeks.

How Kiro works

Frontier teams find their own path to productivity gains. Some invest in agent infrastructure. Some restructure around structured requirements. Some do both. But the teams that sustain those gains over time tend to converge on three things. They give agents rich context. They make intent explicit. And they verify correctness before code ships. Kiro builds these into the development environment so teams don’t have to invent them from scratch.

Intent

Describe what the team wants in natural language. Kiro transforms that into structured requirements with acceptance criteria, architectural designs, and sequenced implementation tasks. Intent stays synchronized with code as it evolves. Requirements never drift from implementation because they’re linked bidirectionally.
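One way to picture requirements that stay linked to implementation tasks is a small data model like the one below. It is a sketch under stated assumptions, not Kiro's spec format: the `Requirement`, `AcceptanceCriterion`, and `Task` classes, their fields, and the sample password-reset requirement are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class AcceptanceCriterion:
    id: str
    text: str


@dataclass
class Task:
    id: str
    summary: str
    satisfies: List[str]  # ids of the criteria this task implements


@dataclass
class Requirement:
    id: str
    statement: str
    criteria: List[AcceptanceCriterion] = field(default_factory=list)

    def uncovered(self, tasks: List[Task]) -> List[str]:
        """Return ids of criteria that no task claims to satisfy."""
        covered = {cid for t in tasks for cid in t.satisfies}
        return [c.id for c in self.criteria if c.id not in covered]


req = Requirement(
    id="REQ-1",
    statement="Users can reset their password by email.",
    criteria=[
        AcceptanceCriterion("AC-1", "A reset link is emailed within one minute."),
        AcceptanceCriterion("AC-2", "The link expires after 30 minutes."),
    ],
)
tasks = [Task("T-1", "Send reset email", satisfies=["AC-1"])]
print(req.uncovered(tasks))
```

Because each task names the criteria it satisfies, a gap between intent and implementation shows up as a non-empty list rather than as a surprise in review.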

Context

Kiro builds and maintains a persistent understanding of the project across sessions, surfaces, and contributors. Three layers of memory work together: explicit rules from steering files, episodic history from past sessions, and learned patterns from how the team works. Agents don’t start cold every time. They carry forward what they’ve learned.

Correctness

Property-based testing extracts testable properties directly from requirements and generates hundreds of randomized test cases probing edge cases no human would write by hand. Code is verified against intent before it ships, not after. The outcome is software that’s correct by design, not just fast by default.
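To make the idea of property-based testing concrete, here is a minimal sketch using only Python's standard library. It is not how Kiro generates tests: the `apply_discount` function and its three properties are hypothetical, and dedicated libraries such as Hypothesis add features like shrinking failing inputs that this sketch omits.

```python
import random


def apply_discount(total_cents: int, percent: int) -> int:
    """Example function under test (hypothetical): integer percent discount."""
    return total_cents - (total_cents * percent) // 100


def check_discount_properties(trials: int = 500, seed: int = 0) -> int:
    """Run randomized cases; raise AssertionError on the first violated property."""
    rng = random.Random(seed)  # seeded so a failure can be reproduced
    for _ in range(trials):
        total = rng.randint(0, 10_000_000)
        percent = rng.randint(0, 100)
        result = apply_discount(total, percent)
        # Property 1: a discount never produces a negative total.
        assert result >= 0, (total, percent, result)
        # Property 2: a discount never increases the total.
        assert result <= total, (total, percent, result)
        # Property 3: a larger discount never costs the customer more.
        if percent < 100:
            assert apply_discount(total, percent + 1) <= result, (total, percent)
    return trials


print(check_discount_properties())
```

Each property is a statement that must hold for every input, which is what lets a few lines of test code cover hundreds of generated cases.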

From intent to production

Capture intent

Natural language becomes structured requirements and acceptance criteria. Spec clarity detects vagueness before implementation begins. The team aligns on what “done” looks like before a single line gets written.
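As a rough illustration of what detecting vagueness could look like, the sketch below flags requirement text that uses unmeasurable terms. This is a deliberately naive heuristic and not a description of Kiro's spec clarity feature: the `VAGUE_TERMS` list and the `find_vague_terms` function are assumptions made for this example.

```python
import re
from typing import List

# Hypothetical word list; a real checker would need something far richer.
VAGUE_TERMS = ["fast", "quickly", "easy", "user-friendly", "robust", "scalable"]


def find_vague_terms(requirement: str) -> List[str]:
    """Return the vague terms that appear as whole words in the requirement."""
    lowered = requirement.lower()
    return [
        term
        for term in VAGUE_TERMS
        if re.search(r"\b" + re.escape(term) + r"\b", lowered)
    ]


print(find_vague_terms("The search page should load fast and be user-friendly."))
print(find_vague_terms("Search results render within 300 ms for 95% of requests."))
```

The second requirement passes because it states a threshold that a test can check, which is the practical meaning of agreeing on what "done" looks like.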

Design the architecture

Kiro analyzes the codebase and generates system design, tech stack decisions, and an implementation plan broken into discrete tasks sequenced by dependency.
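Sequencing tasks by dependency is a topological sort. The sketch below uses Python's standard `graphlib` module (Python 3.9+) to group tasks into batches whose members could run in parallel. The task names and dependency graph are hypothetical, and this illustrates the general technique rather than Kiro's planner.

```python
from graphlib import TopologicalSorter

# Hypothetical tasks; each maps to the set of tasks it depends on.
dependencies = {
    "define-schema": set(),
    "write-migration": {"define-schema"},
    "build-api": {"define-schema"},
    "build-ui": {"build-api"},
    "integration-tests": {"write-migration", "build-ui"},
}

sorter = TopologicalSorter(dependencies)
sorter.prepare()  # raises graphlib.CycleError if the graph has a cycle

batches = []
while sorter.is_active():
    ready = sorted(sorter.get_ready())  # tasks whose dependencies are all done
    batches.append(ready)
    sorter.done(*ready)                 # mark the batch complete to unlock the next

for i, batch in enumerate(batches, start=1):
    print(i, batch)
```

Tasks in the same batch have no dependency on each other, so they are natural candidates for separate agents working at the same time.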

Agents build to context

Agents execute against structured context, not guesswork. Tests, documentation, and code are generated together. Steering files enforce team conventions regardless of who started the session or which surface they’re working from.

Verify correctness

Property-based tests validate implementation against intent. Edge cases are caught automatically. Code that doesn’t meet requirements doesn’t ship.

One agent. Every surface.

Kiro delivers the same agent, same memory, and same reasoning across IDE, CLI, and web. Start a task in the IDE, continue it from the CLI, review it on the web. Context follows the team, not the device.

Steering files enforce org-wide conventions across every surface. Memory compounds across sessions and contributors. New team members inherit the project’s accumulated intelligence from day one.

Use cases

Greenfield. Start from intent, ship production-grade. Write a natural-language prompt. Kiro turns it into structured requirements, architectural designs, and an implementation plan with discrete tasks. Your agent builds against that spec, in fewer prompts.

Brownfield. Make your existing codebase AI-ready. Kiro analyzes your codebase and generates specs for what already exists. Steering files enforce your team’s conventions. New features land cleanly because the agent understands the architecture it’s building on.

Getting started

Not a prescription. A pattern that’s worked across teams with different codebases, team sizes, and risk tolerances.

1. Onboard

Pick one team. Install Kiro. Run it on a contained feature or module. No org-wide rollout yet.

2. Experiment

Write the first structured requirements. Let agents generate code, tests, and docs. Compare output quality to the current workflow.

3. Scale

Add steering files for org-wide conventions. Roll out to more teams. Measure correctness, not just speed.

4. Sustain

Build a champion team model. Share steering files and context across the org. Make AI-native development the default.

Common questions

Build something real in minutes
