Effort


The /effort command controls how much reasoning the model applies to your prompts. Lower effort levels produce faster, shorter responses. Higher levels spend more tokens on deeper analysis, multi-step reasoning, and thorough code generation.

Setting effort level

bash
# Open interactive picker
/effort

# Set directly
/effort high

Available levels:

| Level | Behavior |
|---|---|
| low | Fast, concise responses. Good for simple questions and quick lookups. |
| medium | Balanced reasoning. Suitable for most development tasks. |
| high | Thorough analysis. Better for complex refactoring and architecture decisions. |
| xhigh | Extended reasoning. Useful for multi-file changes and nuanced problems. |
| max | Maximum depth. Best for difficult debugging, security analysis, and intricate logic. |

Not all models support every level. The picker only shows levels available for your current model.

Supported models:

Each model defines the values it accepts for the reasoning-related fields in its configuration schema. The output_config.effort column lists the effort levels available to /effort and --effort for that model; thinking.type and thinking.display control reasoning behavior and visibility; max_tokens sets the output length limit.

| Model | thinking.type | thinking.display | output_config.effort | max_tokens |
|---|---|---|---|---|
| Claude Opus 4.8 | adaptive, disabled | summarized, omitted | low, medium, high, xhigh, max | 1024–128000 |
| Claude Opus 4.7 | adaptive, disabled | summarized, omitted | low, medium, high, xhigh, max | 1024–64000 |
| Claude Opus 4.6 | adaptive, disabled | summarized, omitted | low, medium, high, max | 1024–64000 |
| Claude Sonnet 4.6 | adaptive, disabled | summarized, omitted | low, medium, high, max | 1024–64000 |
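The table above is easy to misread when switching models, so a small lookup can make the differences explicit. The following is a minimal sketch, not part of Kiro: the dictionary copies the output_config.effort column, and `is_effort_supported` is a hypothetical helper name.

```python
# Effort levels per model, copied from the output_config.effort column above.
SUPPORTED_EFFORT = {
    "Claude Opus 4.8": ["low", "medium", "high", "xhigh", "max"],
    "Claude Opus 4.7": ["low", "medium", "high", "xhigh", "max"],
    "Claude Opus 4.6": ["low", "medium", "high", "max"],
    "Claude Sonnet 4.6": ["low", "medium", "high", "max"],
}


def is_effort_supported(model: str, effort: str) -> bool:
    """Return True if the table lists `effort` for `model` (hypothetical helper)."""
    return effort in SUPPORTED_EFFORT.get(model, [])


print(is_effort_supported("Claude Opus 4.7", "xhigh"))    # True
print(is_effort_supported("Claude Sonnet 4.6", "xhigh"))  # False
```

In this sketch an unknown model name returns False rather than raising an error.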

Setting effort at launch

Set the initial effort level when you start a session with the --effort flag:

bash
kiro-cli chat --effort high

The flag accepts the same levels as /effort (low, medium, high, xhigh, max) and applies from your first prompt, so quick lookups stay fast and complex work gets deeper reasoning from the start. You can still change the level mid-session with /effort.

Persisting your effort level

Your effort choice persists automatically. Set a level with /effort or --effort and Kiro remembers it for future sessions — there's no extra step to make it the default. Preferences are stored in ~/.kiro/settings/cli.json. See In-session settings for more on how preferences persist.

Persistent defaults

To set default parameters per model, so that each model starts with the effort level you prefer without running /effort, add chat.modelDefaults to your settings file:

json
{
  "chat.modelDefaults": {
    "claude-sonnet-4.6": {
      "output_config": { "effort": "high" }
    },
    "claude-opus-4.7": {
      "output_config": { "effort": "max" }
    }
  }
}
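If you generate or update the settings file from a script, the main risk is overwriting entries that already exist. The sketch below shows one way to merge a per-model effort default into an already-parsed settings object. It assumes the file is plain JSON with "chat.modelDefaults" as a literal top-level key, as in the example above; `set_effort_default` is a hypothetical helper, not a Kiro API.

```python
import json


def set_effort_default(settings: dict, model: str, effort: str) -> dict:
    """Return a copy of `settings` with a per-model effort default added."""
    # Round-trip through JSON to get a deep copy of JSON-compatible data.
    updated = json.loads(json.dumps(settings))
    defaults = updated.setdefault("chat.modelDefaults", {})
    entry = defaults.setdefault(model, {})
    entry.setdefault("output_config", {})["effort"] = effort
    return updated


existing = {
    "chat.modelDefaults": {
        "claude-sonnet-4.6": {"output_config": {"effort": "high"}}
    }
}
merged = set_effort_default(existing, "claude-opus-4.7", "max")
print(json.dumps(merged, indent=2))
```

The function returns a new object and leaves the input untouched, so the caller decides when to write the result back to disk.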

Configuring thinking behavior

Control whether the model uses extended thinking and how it displays reasoning:

json
{
  "chat.modelDefaults": {
    "claude-opus-4.8": {
      "thinking": { "type": "adaptive", "display": "summarized" }
    },
    "claude-sonnet-4.6": {
      "thinking": { "type": "disabled" }
    }
  }
}
  • thinking.type — adaptive enables extended thinking when the model determines it's needed; disabled turns it off entirely.
  • thinking.display — summarized shows a condensed version of reasoning; omitted hides it from output. Only applies when type is adaptive.
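The interaction between the two fields can be stated as a small function. This is a sketch of the rule described in the bullets above, not Kiro's implementation: `reasoning_display` is a hypothetical name, and returning None when display is not set is an assumption, since the page does not state a default.

```python
from typing import Optional


def reasoning_display(thinking: dict) -> Optional[str]:
    """Return the display mode that applies, or None if none applies."""
    if thinking.get("type") != "adaptive":
        # display only applies when type is adaptive
        return None
    # Assumption: no default is documented, so an unset display yields None.
    return thinking.get("display")


print(reasoning_display({"type": "adaptive", "display": "summarized"}))  # summarized
print(reasoning_display({"type": "disabled", "display": "summarized"}))  # None
```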

Configuring max output tokens

Set the maximum number of tokens the model can generate per response:

json
{
  "chat.modelDefaults": {
    "claude-opus-4.8": { "max_tokens": 128000 },
    "claude-sonnet-4.6": { "max_tokens": 32000 }
  }
}

max_tokens limits per model:

| Model | Minimum | Maximum |
|---|---|---|
| Claude Opus 4.8 | 1024 | 128000 |
| Claude Opus 4.7 | 1024 | 64000 |
| Claude Opus 4.6 | 1024 | 64000 |
| Claude Sonnet 4.6 | 1024 | 64000 |
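A range check before writing a max_tokens value can catch a limit copied from the wrong model. The sketch below uses the limits from the table above and assumes both bounds are inclusive; `check_max_tokens` is a hypothetical helper, not a Kiro command.

```python
# (minimum, maximum) per model, copied from the limits table above.
MAX_TOKENS_RANGE = {
    "Claude Opus 4.8": (1024, 128000),
    "Claude Opus 4.7": (1024, 64000),
    "Claude Opus 4.6": (1024, 64000),
    "Claude Sonnet 4.6": (1024, 64000),
}


def check_max_tokens(model: str, value: int) -> bool:
    """Return True if `value` is within the model's range (bounds assumed inclusive)."""
    low, high = MAX_TOKENS_RANGE[model]
    return low <= value <= high


print(check_max_tokens("Claude Sonnet 4.6", 32000))  # True
print(check_max_tokens("Claude Opus 4.7", 128000))   # False
```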

Combining all options

You can combine output_config, thinking, and max_tokens in a single model entry:

json
{
  "chat.modelDefaults": {
    "claude-opus-4.8": {
      "output_config": { "effort": "max" },
      "thinking": { "type": "adaptive", "display": "summarized" },
      "max_tokens": 128000
    }
  }
}

To open your settings file in your editor:

bash
kiro-cli settings open

Or place a .kiro/settings/cli.json in your project root to set workspace-level defaults that apply to everyone working in that repository.

Precedence

When determining the effort level for a session, Kiro applies this priority order:

  1. Session override — value set via /effort or --effort during the current session
  2. Workspace defaults — chat.modelDefaults in .kiro/settings/cli.json
  3. User defaults — chat.modelDefaults in ~/.kiro/settings/cli.json
  4. Built-in defaults — the model's standard effort level
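The priority order above amounts to "first value that is set wins". The following sketch illustrates that rule only. `resolve_effort` is a hypothetical function, and the built-in value passed in the example is a placeholder, since this page does not state each model's standard level.

```python
from typing import Optional


def resolve_effort(
    session: Optional[str],
    workspace: Optional[str],
    user: Optional[str],
    built_in: str,
) -> str:
    """Pick the effort level by priority: session, workspace, user, built-in."""
    for value in (session, workspace, user):
        if value is not None:
            return value
    return built_in


# Workspace default is "high", user default is "low", no session override.
# "medium" stands in for the model's built-in level (placeholder).
print(resolve_effort(None, "high", "low", "medium"))   # high
# A session override takes priority over both settings files.
print(resolve_effort("max", "high", "low", "medium"))  # max
```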

When to adjust effort

  • Bump up when the agent is giving shallow answers, missing edge cases, or producing incomplete implementations
  • Bump down when you need quick answers and don't want to wait for extended reasoning
  • Use max for security reviews, complex debugging sessions, or when you need the agent to consider many interacting constraints

Related

  • Models — available models and their capabilities
  • Slash commands reference — quick command reference
  • Settings — all configurable settings
Page updated: June 6, 2026