Effort

The /effort command controls how much reasoning the model applies to your prompts. Lower effort levels produce faster, shorter responses. Higher levels spend more tokens on deeper analysis, multi-step reasoning, and thorough code generation.

Setting effort level

bash

# Open interactive picker
/effort

# Set directly
/effort high

Available levels:

Level	Behavior
`low`	Fast, concise responses. Good for simple questions and quick lookups.
`medium`	Balanced reasoning. Suitable for most development tasks.
`high`	Thorough analysis. Better for complex refactoring and architecture decisions.
`xhigh`	Extended reasoning. Useful for multi-file changes and nuanced problems.
`max`	Maximum depth. Best for difficult debugging, security analysis, and intricate logic.

Not all models support every level. The picker only shows levels available for your current model.

Supported models:

Each model defines the values it accepts for the reasoning-related fields in its configuration schema. The output_config.effort column lists the effort levels available to /effort and --effort for that model; thinking.type and thinking.display control reasoning behavior and visibility; max_tokens sets the output length limit.

Model	`thinking.type`	`thinking.display`	`output_config.effort`	`max_tokens`
Claude Opus 4.8	`adaptive`, `disabled`	`summarized`, `omitted`	`low`, `medium`, `high`, `xhigh`, `max`	1024–128000
Claude Opus 4.7	`adaptive`, `disabled`	`summarized`, `omitted`	`low`, `medium`, `high`, `xhigh`, `max`	1024–64000
Claude Opus 4.6	`adaptive`, `disabled`	`summarized`, `omitted`	`low`, `medium`, `high`, `max`	1024–64000
Claude Sonnet 4.6	`adaptive`, `disabled`	`summarized`, `omitted`	`low`, `medium`, `high`, `max`	1024–64000
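As a rough illustration of how the output_config.effort column can be read, the sketch below copies the table into a lookup and checks whether a level is listed for a model. The helper name supports_effort is hypothetical and not part of kiro-cli, and the claude-opus-4.6 key is an assumption based on the naming pattern of the other settings keys.

```python
# Effort levels per model, copied from the output_config.effort column above.
# Assumption: "claude-opus-4.6" follows the same key pattern as the other models.
SUPPORTED_EFFORT = {
    "claude-opus-4.8": ["low", "medium", "high", "xhigh", "max"],
    "claude-opus-4.7": ["low", "medium", "high", "xhigh", "max"],
    "claude-opus-4.6": ["low", "medium", "high", "max"],
    "claude-sonnet-4.6": ["low", "medium", "high", "max"],
}


def supports_effort(model: str, level: str) -> bool:
    """Return True if the table lists `level` for `model` (hypothetical helper)."""
    return level in SUPPORTED_EFFORT.get(model, [])


if __name__ == "__main__":
    print(supports_effort("claude-opus-4.7", "xhigh"))    # True
    print(supports_effort("claude-sonnet-4.6", "xhigh"))  # False
```

Per the table, xhigh is the only level that differs between models: it is listed for Claude Opus 4.8 and 4.7 but not for Claude Opus 4.6 or Claude Sonnet 4.6.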

Setting effort at launch

Set the initial effort level when you start a session with the --effort flag:

bash

kiro-cli chat --effort high

The flag accepts the same levels as /effort (low, medium, high, xhigh, max) and applies from your first prompt, so quick lookups stay fast and complex work gets deeper reasoning from the start. You can still change the level mid-session with /effort.

Persisting your effort level

Your effort choice persists automatically. Set a level with /effort or --effort and Kiro remembers it for future sessions — there's no extra step to make it the default. Preferences are stored in ~/.kiro/settings/cli.json. See In-session settings for more on how preferences persist.

Persistent defaults

To set default parameters for each model, so that you don't have to run /effort at the start of every session, add chat.modelDefaults to your settings file:

json

{
  "chat.modelDefaults": {
    "claude-sonnet-4.6": {
      "output_config": {
        "effort": "high"
      }
    },
    "claude-opus-4.7": {
      "output_config": {
        "effort": "max"
      }
    }
  }
}
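To make the shape of this setting concrete, here is a minimal Python sketch that reads a default effort level out of settings content like the example above. The default_effort helper is hypothetical and only illustrates the structure; it does not show how kiro-cli itself loads settings.

```python
import json

# Example settings content, mirroring the chat.modelDefaults example above.
SETTINGS_TEXT = """
{
  "chat.modelDefaults": {
    "claude-sonnet-4.6": {"output_config": {"effort": "high"}},
    "claude-opus-4.7": {"output_config": {"effort": "max"}}
  }
}
"""


def default_effort(settings: dict, model: str):
    """Return the configured default effort for `model`, or None if unset.

    Hypothetical helper for illustration; not part of kiro-cli.
    """
    # "chat.modelDefaults" is one key that contains a dot, not a nested path.
    entry = settings.get("chat.modelDefaults", {}).get(model, {})
    return entry.get("output_config", {}).get("effort")


settings = json.loads(SETTINGS_TEXT)
print(default_effort(settings, "claude-opus-4.7"))  # max
print(default_effort(settings, "claude-opus-4.8"))  # None
```

A model with no entry simply has no configured default in this sketch.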

Configuring thinking behavior

Control whether the model uses extended thinking and how it displays reasoning:

json

{
  "chat.modelDefaults": {
    "claude-opus-4.8": {
      "thinking": {
        "type": "adaptive",
        "display": "summarized"
      }
    },
    "claude-sonnet-4.6": {
      "thinking": {
        "type": "disabled"
      }
    }
  }
}

thinking.type — adaptive enables extended thinking when the model determines it's needed; disabled turns it off entirely.
thinking.display — summarized shows a condensed version of reasoning; omitted hides it from output. Only applies when type is adaptive.
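The interaction between the two fields can be sketched as follows. This is an illustrative helper only: the name effective_thinking is hypothetical, and whether Kiro ignores or rejects a display value when type is disabled is not stated here, so the sketch simply drops it.

```python
def effective_thinking(thinking: dict) -> dict:
    """Return the thinking settings that actually have an effect.

    Hypothetical helper: `display` only applies when `type` is "adaptive",
    so it is dropped when thinking is disabled.
    """
    if thinking.get("type") == "disabled":
        return {"type": "disabled"}
    return dict(thinking)


if __name__ == "__main__":
    print(effective_thinking({"type": "disabled", "display": "summarized"}))
    print(effective_thinking({"type": "adaptive", "display": "omitted"}))
```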

Configuring max output tokens

Set the maximum number of tokens the model can generate per response:

json

{
  "chat.modelDefaults": {
    "claude-opus-4.8": {
      "max_tokens": 128000
    },
    "claude-sonnet-4.6": {
      "max_tokens": 32000
    }
  }
}

max_tokens limits per model:

Model	Minimum	Maximum
Claude Opus 4.8	1024	128000
Claude Opus 4.7	1024	64000
Claude Opus 4.6	1024	64000
Claude Sonnet 4.6	1024	64000
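A small sketch of checking a max_tokens value against the limits above before writing it to your settings file. The check_max_tokens helper is hypothetical, the claude-opus-4.6 key is assumed from the naming pattern of the other keys, and both bounds are treated as inclusive, which is an assumption for the minimum.

```python
# (minimum, maximum) per model, copied from the limits table above.
# Assumption: "claude-opus-4.6" follows the same key pattern as the other models.
MAX_TOKENS_LIMITS = {
    "claude-opus-4.8": (1024, 128000),
    "claude-opus-4.7": (1024, 64000),
    "claude-opus-4.6": (1024, 64000),
    "claude-sonnet-4.6": (1024, 64000),
}


def check_max_tokens(model: str, value: int) -> int:
    """Return `value` if it is within the model's range, else raise ValueError."""
    low, high = MAX_TOKENS_LIMITS[model]
    if not low <= value <= high:
        raise ValueError(
            f"max_tokens={value} is outside {low}-{high} for {model}"
        )
    return value


if __name__ == "__main__":
    print(check_max_tokens("claude-sonnet-4.6", 32000))  # 32000
    try:
        check_max_tokens("claude-opus-4.7", 128000)
    except ValueError as err:
        print(err)
```

Note that 128000 is only within range for Claude Opus 4.8; the other listed models top out at 64000.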

Combining all options

You can combine output_config, thinking, and max_tokens in a single model entry:

json

{
  "chat.modelDefaults": {
    "claude-opus-4.8": {
      "output_config": {
        "effort": "max"
      },
      "thinking": {
        "type": "adaptive",
        "display": "summarized"
      },
      "max_tokens": 128000
    }
  }
}

To open your settings file in your editor:

bash

kiro-cli settings open

Or place a .kiro/settings/cli.json in your project root to set workspace-level defaults that apply to everyone working in that repository.

Precedence

When determining the effort level for a session, Kiro applies this priority order, highest first:

1. Session override: value set via /effort or --effort during the current session
2. Workspace defaults: chat.modelDefaults in .kiro/settings/cli.json
3. User defaults: chat.modelDefaults in ~/.kiro/settings/cli.json
4. Built-in defaults: the model's standard effort level
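The priority order above can be sketched as a first-match lookup. This is a simplified model rather than Kiro's actual implementation: the resolve_effort function is hypothetical, each source is represented as a value or None, and the "medium" built-in default used in the example is a placeholder, since the real built-in level depends on the model.

```python
from typing import Optional


def resolve_effort(
    session: Optional[str],
    workspace: Optional[str],
    user: Optional[str],
    builtin: str,
) -> str:
    """Return the first effort level that is set, in priority order."""
    for candidate in (session, workspace, user):
        if candidate is not None:
            return candidate
    return builtin


if __name__ == "__main__":
    # The workspace default wins over the user default when no session override is set.
    print(resolve_effort(None, "high", "low", "medium"))   # high
    # A session override wins over everything else.
    print(resolve_effort("max", "high", "low", "medium"))  # max
    # With nothing configured, the built-in default applies.
    print(resolve_effort(None, None, None, "medium"))      # medium
```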

When to adjust effort

Bump up when the agent is giving shallow answers, missing edge cases, or producing incomplete implementations
Bump down when you need quick answers and don't want to wait for extended reasoning
Use max for security reviews, complex debugging sessions, or when you need the agent to consider many interacting constraints

Related

Models: available models and their capabilities
Slash commands reference: quick command reference
Settings: all configurable settings

Page updated: June 6, 2026


