
feat(provider): add adaptive thinking and 1M context support for Claude Opus 4.6 #12342

Open

okhsunrog wants to merge 3 commits into anomalyco:dev from okhsunrog:feat/opus-4-6-adaptive-thinking

Conversation


@okhsunrog okhsunrog commented Feb 5, 2026

Adaptive thinking

  • Add adaptive-thinking-2026-01-28 beta header for Anthropic provider
  • Detect Opus 4.6 models and use adaptive thinking with effort parameter
  • Support all effort levels: low, medium, high, max
  • Older models continue to use manual thinking with budgetTokens

Opus 4.6 uses adaptive thinking where the model decides how much to think based on task complexity, guided by the effort parameter. This is more efficient than fixed budgetTokens as simple tasks use minimal thinking.
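A minimal sketch of how the provider might branch between the two modes (the helper name and the exact option shape are assumptions, not the PR's actual code):

```ts
type Effort = "low" | "medium" | "high" | "max"

// Hypothetical helper: Opus 4.6 gets adaptive thinking guided by `effort`;
// older models keep manual thinking with a fixed token budget.
function thinkingOptions(modelID: string, effort: Effort, budgetTokens: number) {
  if (modelID.includes("claude-opus-4-6")) {
    return { thinking: { type: "adaptive" as const }, effort }
  }
  return { thinking: { type: "enabled" as const, budgetTokens } }
}
```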

1M context window

  • Add context-1m-2025-08-07 beta header for Anthropic provider (API key users)
  • Fix compaction logic: when model.limit.input is set, compare input tokens only against that limit instead of the combined total (input + cache + output)

Without the beta header, Opus 4.6 enforces a 200k input token limit (the context window is still 1M — output/thinking tokens aren't affected). The opencode (OAuth) provider doesn't support the 1M beta header, so it relies on model.limit.input in models.dev to trigger compaction at the right time: anomalyco/models.dev#819
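As a rough sketch of that gating (the helper and auth handling are hypothetical; only the beta header strings come from this PR):

```ts
// Hypothetical helper: which `anthropic-beta` values to send for a request.
// OAuth (subscription) accounts reject the long-context beta, so only
// API key users get the 1M input window.
function betaHeaders(modelID: string, auth: "api-key" | "oauth"): string[] {
  const betas: string[] = []
  if (modelID.includes("claude-opus-4-6")) {
    betas.push("adaptive-thinking-2026-01-28")
    if (auth === "api-key") betas.push("context-1m-2025-08-07")
  }
  return betas
}
// e.g. headers: { "anthropic-beta": betaHeaders(modelID, auth).join(",") }
```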

AI SDK upgrade

  • Upgrade ai v5 → v6, @ai-sdk/anthropic v2 → v3, and all @ai-sdk/* packages
  • Migrate LanguageModelV2 → LanguageModelV3, make toModelMessages async, adopt the renamed tool factories
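An illustrative before/after for one call site (identifiers taken from the list above; the surrounding code is an assumption, not verified against the SDK):

```ts
// ai v5: conversion was synchronous
// const messages = toModelMessages(parts)

// ai v6: conversion is async and must be awaited
const messages = await toModelMessages(parts)
```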

Ref: https://platform.claude.com/docs/en/build-with-claude/adaptive-thinking
Ref: https://platform.claude.com/docs/en/build-with-claude/context-windows#1-m-token-context-window

Tested locally; works with Opus 4.6.

Closes #12323
Closes #12338
Closes #12438


github-actions bot commented Feb 5, 2026

Thanks for your contribution!

This PR doesn't have a linked issue. All PRs must reference an existing issue.

Please:

  1. Open an issue describing the bug/feature (if one doesn't exist)
  2. Add Fixes #<number> or Closes #<number> to this PR description

See CONTRIBUTING.md for details.


github-actions bot commented Feb 5, 2026

The following comment was made by an LLM; it may be inaccurate:

No duplicate PRs found

@okhsunrog okhsunrog force-pushed the feat/opus-4-6-adaptive-thinking branch from 7ffe6a8 to a3ebbc5 on February 5, 2026 20:58
@okhsunrog okhsunrog marked this pull request as draft February 5, 2026 21:00

okhsunrog commented Feb 5, 2026

Adding adaptive thinking requires upgrading @ai-sdk/anthropic, which in turn pulls in upgrades to a bunch of other packages. Marking this as a draft for now.
I've started working on upgrading the deps; I hope it can land as part of this PR.

@okhsunrog okhsunrog force-pushed the feat/opus-4-6-adaptive-thinking branch from a3ebbc5 to 5316c7e on February 5, 2026 21:38
@okhsunrog okhsunrog marked this pull request as ready for review February 5, 2026 21:45
@okhsunrog
Author

Added a commit with the dependency upgrade. Tested locally; switching effort levels with Opus 4.6 now works as expected.


okhsunrog commented Feb 5, 2026

Can anyone test whether the 1M context works for you? It works just fine for me with Opus 4.6; I managed to get up to 340k tokens.

@ItsWendell
Contributor

@okhsunrog what's the easiest way to test this?

@okhsunrog
Author

@ItsWendell make sure you have the latest bun installed, then run these commands:

```sh
git clone -b feat/opus-4-6-adaptive-thinking https://github.com/okhsunrog/opencode.git
cd opencode
bun install
bun dev
```

@ItsWendell
Contributor

Tested and works:
[screenshots]

@ItsWendell
Contributor

A couple of issues that I ran into: at e.g. 400k tokens I again get an error:

[screenshots]

@okhsunrog
Author

[screenshots]

well, shit...

@okhsunrog
Author

@ItsWendell yes, I confirm the issue is there. But if we passed the context-1m-2025-08-07 header, we'd get an error from Anthropic with OAuth saying the long context beta is not available for this subscription. Are you using it with a Claude subscription via OAuth or via the Anthropic API?


okhsunrog commented Feb 6, 2026

Confirmed, the 200k limit is enforced by the Anthropic API. The status bar shows a combined total (input + output + cache tokens), but the API only counts input tokens against the 200k limit. That's why I didn't hit it earlier — actual input tokens stayed under 200k even though the status bar showed 385k.
The context-1m-2025-08-07 beta header is needed for Opus 4.6 to go past 200k, same as Sonnet 4/4.5. The compaction check in opencode uses this combined count:

```ts
// compaction.ts:35
const count = input.tokens.input + input.tokens.cache.read + input.tokens.output
```

So opencode thinks there's plenty of headroom (39% of 1M) while the API is already at the 200k input limit. I'll update my PR to add the context-1m-2025-08-07 beta header for non-OAuth.
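A sketch of the corrected check, following the shape of the snippet above (the `model.limit.input` access and the compaction threshold are illustrative, not the PR's exact code):

```ts
// When the model declares a separate input-token limit, gate on
// input-side tokens only; otherwise keep the combined total.
const combined = input.tokens.input + input.tokens.cache.read + input.tokens.output
const inputSide = input.tokens.input + input.tokens.cache.read
const used = model.limit.input ? inputSide : combined
const limit = model.limit.input ?? model.limit.context
const shouldCompact = used > limit * 0.9 // threshold illustrative
```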

P.S. Worth noting that the 200k limit on Opus 4.6 without the beta header is different from the 200k context window on models like Haiku 4.5. For Haiku 4.5, 200k is the total context window: input tokens, output tokens, everything has to fit within that 200k budget. For Opus 4.6, the context window is actually 1M; the 200k is only a gate on input tokens. Output and thinking tokens live in the larger 1M space and don't count against the 200k.

So even without the context-1m-2025-08-07 header, Opus 4.6 gives you significantly more effective room. For example, if you're using extended thinking with large thinking budgets, those tokens aren't eating into your 200k input limit the way they would eat into Haiku 4.5's 200k context window. And thinking tokens from previous turns are automatically stripped by the API, so they don't accumulate as input at all.
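A worked illustration of the difference (the numbers are hypothetical):

```ts
const inputTokens = 150_000, outputTokens = 60_000 // one hypothetical turn

// Haiku 4.5: 200k is the TOTAL context window
console.log(inputTokens + outputTokens > 200_000) // true → overflow

// Opus 4.6 without the beta header: 200k gates INPUT only;
// output/thinking tokens count against the 1M window instead
console.log(inputTokens > 200_000) // false → request still accepted
```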

- Add adaptive-thinking-2026-01-28 beta header for Anthropic provider
- Detect Opus 4.6 models and use adaptive thinking with effort parameter
- Support all effort levels: low, medium, high, max
- Older models continue to use manual thinking with budgetTokens

Opus 4.6 uses adaptive thinking where the model decides how much to think
based on task complexity, guided by the effort parameter. This is more
efficient than fixed budgetTokens as simple tasks use minimal thinking.

Ref: https://docs.anthropic.com/en/docs/build-with-claude/adaptive-thinking
- Upgrade ai package from 5.0.124 to 6.0.72
- Upgrade @ai-sdk/anthropic from 2.0.58 to 3.0.37 (adds adaptive thinking support)
- Upgrade all @ai-sdk/* packages to v3+/v4+ for compatibility
- Update LanguageModelV2 to LanguageModelV3 across codebase
- Make toModelMessages async per AI SDK 6 requirements
- Update toModelOutput signature to use destructured parameter
- Fix tool factory renames and remove deprecated name property
- Update tests for new LanguageModelUsage type structure

Enables Claude Opus 4.6 to use adaptive thinking with effort levels
(low/medium/high/max) instead of fixed budget tokens.
@okhsunrog
Author

@rekram1-node @thdxr could anyone review this, please?

@okhsunrog okhsunrog force-pushed the feat/opus-4-6-adaptive-thinking branch from cd15179 to b49a992 on February 6, 2026 08:39
@okhsunrog okhsunrog changed the title from "feat(provider): add adaptive thinking support for Claude Opus 4.6" to "feat(provider): add adaptive thinking and 1M context support for Claude Opus 4.6" Feb 6, 2026
@BouquetAntoine

Works well for me, except the displayed usage is wrong. (That's also why you got the 200k limit warning, i.e. the Claude Code limit, at "400k" context.)

Bug: Token Counter Displays ~2× Actual Usage with @ai-sdk/anthropic v3.x

Symptom

When using @ai-sdk/anthropic v3.x, the token counter displays approximately double the actual token usage (e.g., 49,000 shown vs 24,450 in provider logs).

Root Cause

The SDK v3 changed the structure of LanguageModelUsage.inputTokens:

| Version | inputTokens value |
| --- | --- |
| v2.x | input_tokens (excludes cache) |
| v3.x | total = noCache + cacheRead + cacheWrite |

In session/index.ts, the getUsage() function assumes that for Anthropic, inputTokens excludes cached tokens:

```ts
const excludesCachedTokens = !!(input.metadata?.["anthropic"] || input.metadata?.["bedrock"])

const adjustedInputTokens = excludesCachedTokens
  ? (input.usage.inputTokens ?? 0) // ❌ This is now the TOTAL in v3!
  : ...
```

Then in the UI (header.tsx, sidebar.tsx), tokens are displayed as:

```ts
tokens.input + tokens.output + tokens.reasoning + tokens.cache.read + tokens.cache.write
```

Result: Cache tokens are counted twice — once inside tokens.input (which is now the total) and again as tokens.cache.read + tokens.cache.write.
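A numeric illustration of the double count (the ~49,000 vs 24,450 figures come from the symptom above; the cache split is an assumption):

```ts
// Suppose the provider reports 24,450 total input tokens,
// of which ~24,000 were read from cache (split hypothetical).
const inputTokens = 24_450 // v3: already the TOTAL
const cacheRead = 24_000
const displayed = inputTokens + cacheRead // 48,450 ≈ the ~49,000 shown in the UI
```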

Fix

Use the new inputTokenDetails.noCacheTokens field introduced in SDK v3:

```ts
// SDK v3: inputTokens is now TOTAL, use noCacheTokens for pure input
const noCacheInputTokens = input.usage.inputTokenDetails?.noCacheTokens

const adjustedInputTokens = noCacheInputTokens !== undefined
  ? noCacheInputTokens
  : (input.usage.inputTokens ?? 0) - cacheReadInputTokens - cacheWriteInputTokens
```

Also update cache token extraction to use the new structure:

```ts
const cacheReadInputTokens =
  input.usage.cachedInputTokens ??
  input.usage.inputTokenDetails?.cacheReadTokens ??
  0

const cacheWriteInputTokens =
  input.usage.inputTokenDetails?.cacheWriteTokens ??
  0
```

SDK v3 Type Reference

```ts
type LanguageModelUsage = {
  inputTokens: number | undefined;          // Now the TOTAL
  inputTokenDetails: {
    noCacheTokens: number | undefined;      // Pure input without cache
    cacheReadTokens: number | undefined;    // Tokens read from cache
    cacheWriteTokens: number | undefined;   // Tokens written to cache
  };
  cachedInputTokens?: number | undefined;   // @deprecated — use inputTokenDetails.cacheReadTokens
  // ...
}
```
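A small sanity check of the relationship implied above (field names from the type reference; that the total equals the sum of its details is an assumption about the SDK's behavior):

```ts
function checkUsage(usage: LanguageModelUsage) {
  const d = usage.inputTokenDetails
  const recomputed =
    (d.noCacheTokens ?? 0) + (d.cacheReadTokens ?? 0) + (d.cacheWriteTokens ?? 0)
  console.assert(usage.inputTokens === undefined || usage.inputTokens === recomputed)
}
```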

@BouquetAntoine

Can confirm I managed to use the 1M context:

[screenshots]

@reynard93

How did anyone manage to use this branch? I keep getting: "The long context beta is not yet available for this subscription".

…ounting, use plugin fork for OAuth context cap

loop-uh commented Feb 6, 2026

+1

[screenshot]

```diff
@@ -16,15 +16,14 @@ import { gitlabAuthPlugin as GitlabAuthPlugin } from "@gitlab/opencode-gitlab-au
 export namespace Plugin {
   const log = Log.create({ service: "plugin" })
 
-  const BUILTIN = ["opencode-anthropic-auth@0.0.13"]
+  const BUILTIN = ["github:okhsunrog/opencode-anthropic-auth#feat/oauth-context-cap"]
```

opencode core shouldn't have builtin plugins that pull from forks.

Collaborator

correct

Author

It was only intended for testing until the changes to opencode-anthropic-auth are accepted.



Development

Successfully merging this pull request may close these issues.

  • Claude Opus 4.6 context window limits still 200k
  • 1M tokens for Opus 4.6
  • [FEATURE]: Add support of Claude Opus 4.6
