Empowering Kiro with IDE diagnostics

Why agents miss the errors your IDE already knows about

Early coding agents had this problem: AI generates code that looks correct, but IDE errors aren't immediately visible to the agent. This was because agents lacked visibility into these errors without executing additional tools. Consequently, the agent would move on confidently while the codebase accumulated technical debt. This represents a gap in how any coding agent that doesn't incorporate diagnostics information operates.

While most modern IDEs continuously run sophisticated language analysis tools that catch errors in real-time, agents need an efficient way to access this wealth of information. Otherwise, they have to resort to running (expensive) build/test commands to validate code, which is slower, consumes tokens, and misses the nuanced feedback your development environment already provides. The result? Code validation that's more resource-intensive than it needs to be.

When agents aren’t able to see IDE diagnostics, they cannot iterate towards quality. Instead, they generate code they presume is correct before advancing to the next task. You're left performing QA for the agent—manually fixing type errors (e.g. calling user.getName() when the property is actually user.name), adding missing imports (the agent uses Button but forgot import { Button } from '@/components/ui/button'), and resolving linting violations (unused variables, inconsistent indentation). This doesn't just slow developers down, it trains you to distrust agent-generated code entirely.

Closing the feedback loop in Kiro

Modern IDEs are powered by sophisticated language analysis infrastructure. Language servers perform real-time analysis of your codebase. For example, a TypeScript extension performs type checking, ESLint validates code style, Java extensions provide instant compilation feedback, CloudFormation and Terraform extensions validate infrastructure configurations such as resource properties, required arguments, and resource references before deployment.

This analysis happens continuously as you code to surface errors as diagnostics—the red squiggles and problem markers you see in your editor. These diagnostics represent a rich source of immediate, accurate feedback about code correctness.

Yet, early AI coding assistants couldn’t access this information. Instead, they validated code by executing build commands (npm run build, npm test, etc.) although it could take several seconds or even minutes per invocation.

In Kiro, an agentic IDE that brings structure to AI coding through spec-driven development, we've addressed this by giving the agent direct access to IDE diagnostics. Now, when Kiro writes code, it immediately sees the same errors you would see, and can fix them as they occur; before you ever review the changes. The impact on code quality has been significant: fewer manual corrections and better adherence to project quality standards. Kiro’s coding agents integrate tightly with these client-side diagnostics to enhance both code understanding and generation. Through their connection to the same language servers that power the IDE, agents can now read and interpret compile-time errors, type warnings, and linting results in real time. When Kiro generates or modifies code, it queries this diagnostics information to validate correctness—for example, detecting a missing symbol in Java or a syntax error in Python — and can automatically refine its output based on that feedback.

Workflow comparison

The conventional approach follows a slow, iterative cycle. The agent generates code and then executes a build (and/or test) command which can be time consuming. When the build fails with an error, the agent generates a fix and executes the build command again. This heavy process repeats until the build succeeds.

The diagnostic-driven approach is much faster. After generating code, the agent checks diagnostics in under 35ms. It is provided with specific errors with line numbers and descriptions. The agent then generates a targeted fix and verifies it via diagnostics in another 35ms before proceeding with validated code.

The time difference compounds over multi-step tasks. In spec-driven development, where Kiro implements dozens of tasks, the diagnostic tool is invoked roughly 4x more frequently than in vibe-coding mode because each discrete task boundary requires confirming that acceptance criteria are met. This provides continuous validation throughout the implementation process.

Generate Code
NEW COMPONENT
Check Diagnostics
- Real-time analysis for syntax, type, import, linter, and config errors
- Spots code hallucinations early
- Avoids spurious build/test commands
Generate Targeted Fix
Execute Build/Test
Deliver Final Code

Real-world impact

We have measured the impact of diagnostic integration across production usage and controlled benchmarks. The results demonstrate significant efficiency gains alongside measurable code quality improvements. On the efficiency side, we observed a 29% reduction in command executions because of cutting back on build/test commands. This metric was derived by comparing agent interaction statistics across several days of production usage before and after introducing the diagnostics feature. Code quality improved while simultaneously reducing end-to-end latency.

Language-agnostic architecture

The diagnostic system's power comes from its generality. Rather than building custom analysis for each language, Kiro leverages the Language Server Protocol (LSP) and extension APIs that already power a myriad of IDE functionalities. Production data confirms this works across diverse tech stacks. For instance, popular extensions for general-purpose languages such as TypeScript, Python, and Rust as well as domain-specific languages such as SQL, YAML, and GraphQL which provide type checking and linting information. Additionally, there are extensions for major build tools (such as Maven, Make, Cargo, etc.) and programmatic configuration files (Terraform, Kubernetes YAML, Dockerfile, etc.) to improve the setup and debugging experience.

Usage scenarios

Analysis of production usage reveals common patterns in how the diagnostic tool improves the quality of the existing or newly generated code:

Scenario 1 [Syntax Errors]: In a Python codebase for analytics, the agent implements a new feature for windowed aggregations.

I'll add a new advanced windowed aggregation feature with rolling statistics.
Let me implement this:
• Accepted edits to & analytics_engine.py
Now let me check the diagnostics to see if there are any issues:
* Checked diagnostics analytics_engine.py
Oh, there's a subtle syntax error in the validate_tag_patterns method!
Accepted edits to
* analytics_engine.py
Now let me verify the fix by running diagnostics again:
* Checked diagnostics analytics_engine.py
Perfect! The syntax error is now fixed.

The diagnostics tool finds a regular-expression error (missing ), unterminated subpattern at position 13) which is then fixed by the agent. The re-validation confirms that the error is no longer present. In absence of the diagnostics tool, the agent would rely on composing and executing tests via command-line to check for issues.

Scenario 2 [Avoiding Hallucinations]: In a TypeScript codebase, Kiro makes changes to a component and immediately validates it:

Let me check for any type or other semantic errors and make sure everything is correct:
Checked diagnostics ecs-cluster-stack.ts
I need to fix the errors. Let me update the file:
Accepted edits to ecs-cluster-stack.ts
Let me check the diagnostics again:
Checked diagnostics ecs-cluster-stack.ts
Perfect! There are no more errors.

The diagnostics tool immediately reports two type mismatches (Type 'number' is not assignable to type 'string' and Cannot assign to read-only 'executionTime') and a property hallucination (Property 'itemAge' does not exist on type 'StackProps'). Guided by these issues, the agent generates corrections and re-validates the changes. This pattern (generate → validate → refine) occurs frequently in statically-typed languages where type errors are common but easily caught by language servers.

Scenario 3 [Type Validations]: In a Swift project, Kiro adds a new functionality and checks the diagnostics for the edited files:

Perfect! Let me also check the NetworkConfiguration file:
Checked diagnostics
HTTPManager.swift
NetworkClient.swift
NetworkConfiguration-broken. swift
I need to fix the BackoffStrategy to make the custom function Sendable:
2 edits to file
NetworkConfiguration-broken.swift
Now let me run diagnostics again to check if all issues are resolved:
Checked diagnostics
NetworkConfiguration-broken.swift
Perfect! All the code I implemented compiles without errors.

The diagnostics reveal that there is a type error in one of the files. The agent then fixes the issue and re-validates the affected file to ensure the fix is correct.

Scenario 4 [Infrastructure as Code — aka IaC]: Diagnostic validation extends beyond application code. A user prompts Kiro to check their Terraform configurations:

Sure, let me check that for you.
Checked diagnostics main.tf
Found a couple of issues in your Terraform config.
Let me fix these for you.
2 edits to file main.tf
Fixed both issues. Let me verify everything looks good now.
Checked diagnostics main.tf
All clean! Your Terraform configuration is now valid with no diagnostic errors.

This demonstrates how the diagnostic system works with domain-specific languages and configuration formats, not just conventional programming languages.

Benefits for agentic development

By bringing code generation and code validation closer together, the diagnostics tool enables the following improvements:

Reduced Cognitive Load: Developers spend less time manually validating AI-generated code. When Kiro reports "No diagnostics found," developers can proceed with higher confidence.

Faster Iteration: The reduction in the frequency of bash commands translates to tangible time savings.

Better Code Quality: Addressing issues as soon as they appear means cleaner, higher quality code.

Summary

IDE diagnostics represent an important source of immediate feedback for code validation. By integrating diagnostic information directly into Kiro's agentic workflow, we've eliminated the gap between code generation and validation that plagued early coding agents.

The results speak for themselves: fewer command executions, fast validation cycles, and improved code quality across diverse tech stacks. Rather than forcing agents to rely on slow, expensive build/test commands, Kiro leverages the same sophisticated language analysis infrastructure that already powers modern IDEs; from TypeScript type checking to Terraform configuration validation.

This diagnostic-driven approach speeds up the agent’s feedback loop. Instead of generating code blindly and hoping it works, Kiro sees the same errors you would see and fixes them proactively. The outcome is code that requires less manual correction which builds greater developer confidence.

Ready to experience the difference? Get started with Kiro for free and see how diagnostic-powered agents can transform your development workflow. Join our growing community on Discord to share feedback, ask questions, and connect with other developers building with AI-assisted coding.

Acknowledgements

Credits (listed in alphabetical order) Al Harris, Nathan Jones, Nikhil Swaminathan, Siwei Cui, and Varun Kumar