Open weight models are here: more choice, more speed, less cost

By
Nima Kaviani

Product

From the start, we built Kiro to give you the best AI coding experience possible. That meant shipping with the current state-of-the-art coding models and building everything around quality output. Six months ago, we introduced Auto—our agent mode that blends frontier models with specialized models, layering in intent detection, caching, and other optimization techniques to give you a strong balance of performance, efficiency, and output quality. Today, we're adding open weight models to Kiro, available in both the IDE and CLI.

A great mix of speed, quality, and cost effectiveness

We've been watching the open weight space closely, and the progress has been remarkable. Models that would have trailed proprietary options by a wide margin a year ago are now delivering genuinely competitive results for many development tasks. They're fast, they're cost-effective, and they keep getting better. Many of you have been experimenting with these models on your own and asking us to support them directly. We heard you.

Open weight models give you more choice in how you work. Some tasks don't need the heaviest model available. Quick iterations, boilerplate generation, and straightforward refactors need speed and low cost more than raw reasoning power. Other tasks demand strong agentic capabilities or specialized language support. Having a range of models means you can match the tool to the job.

The models

Here's what we're making available starting today, and where we think each one shines.

DeepSeek v3.2 (0.25x credit multiplier) — Built on a sparse Mixture-of-Experts architecture, DeepSeek v3.2 activates only the parameters it needs per task. It excels at agentic workflows: multi-step tool calling, maintaining state across long sessions, and complex reasoning chains. It's great for generating initial code, though it can struggle with complex debugging and code-review quality. If you're building agents or running long multi-step workflows, this is a strong pick.

MiniMax 2.1 (0.15x credit multiplier) — This model stands out for multilingual programming. It delivers strong results across Rust, Java, Go, C++, Kotlin, TypeScript, JavaScript, and more. It also has notably good UI generation capabilities for web, Android, and iOS. Developers have noticed that it can struggle with complex refactoring tasks compared to frontier models. If your team works across multiple languages or does a lot of frontend work, MiniMax 2.1 is worth trying.

Qwen3 Coder Next (0.05x credit multiplier) — An 80B sparse MoE model that activates just 3B parameters per token, Qwen3 Coder Next is purpose-built for coding agents. It scores above 70% on SWE-Bench Verified, supports 256K context, and has strong capabilities in error detection, recovery, and tool calling. The community has noted some compatibility and integration quirks that may require careful configuration for consistent results. If you want an efficient model that handles long agentic coding sessions reliably, specifically in the CLI, give Qwen3 a shot.
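
To make the multipliers concrete, here's a minimal sketch of how they translate into credits consumed. The base credit cost per request is a hypothetical placeholder for illustration, not an actual Kiro billing figure:

```python
# Credit multipliers from the model list above; base_credits is an
# assumed placeholder, not a real Kiro billing number.
MULTIPLIERS = {
    "DeepSeek v3.2": 0.25,
    "MiniMax 2.1": 0.15,
    "Qwen3 Coder Next": 0.05,
}

def credits_used(base_credits: float, model: str) -> float:
    """Credits consumed by one request at the given model's multiplier."""
    return base_credits * MULTIPLIERS[model]
```

Under that assumption, a task that would consume 100 credits at a 1x rate consumes 25 with DeepSeek v3.2, 15 with MiniMax 2.1, and 5 with Qwen3 Coder Next.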

Try them out in the IDE and CLI

All models are available now with experimental support in the IDE model selector and through the Kiro CLI, for both free and paid users signing in with Google, GitHub, or AWS Builder ID. Inference is performed in the AWS US East (N. Virginia) region. Switch between them, pair them with Auto, or set a default for specific project types—whatever fits your workflow. As always, experiment and let us know how these are working for you. We're paying close attention to which models resonate and what gaps remain. If there's a model you'd like to see supported next, let us know.