feat(provider): add adaptive thinking and 1M context support for Claude Opus 4.6 #12342
okhsunrog wants to merge 3 commits into anomalyco:dev from okhsunrog:feat/opus-4-6-adaptive-thinking
Conversation
Thanks for your contribution! This PR doesn't have a linked issue. All PRs must reference an existing issue. Please see CONTRIBUTING.md for details.
The following comment was made by an LLM; it may be inaccurate: No duplicate PRs found
Force-pushed from 7ffe6a8 to a3ebbc5
Adding adaptive thinking requires an upgrade of @ai-sdk/anthropic, which leads to upgrading a bunch of other packages. Making it a draft for now.
Force-pushed from a3ebbc5 to 5316c7e
Added a commit with the dependency upgrades. Tested locally; switching effort levels with Opus 4.6 now works as expected.
Can anyone test if 1M context works for you? It works just fine for me with Opus 4.6. I managed to get up to 340k tokens.
@okhsunrog what's the easiest way to test this?
@ItsWendell make sure you have the latest bun installed, then run these commands:

```sh
git clone -b feat/opus-4-6-adaptive-thinking https://github.com/okhsunrog/opencode.git
cd opencode
bun install
bun dev
```
@ItsWendell yes, I confirm the issue is there. But if we'd pass
Confirmed, the 200k limit is enforced by the Anthropic API. The status bar shows a combined total (input + output + cache tokens), but the API only counts input tokens against the 200k limit. That's why I didn't hit it earlier — actual input tokens stayed under 200k even though the status bar showed 385k.

```ts
// compaction.ts:35
const count = input.tokens.input + input.tokens.cache.read + input.tokens.output
```

So opencode thinks there's plenty of headroom (39% of 1M) while the API is already at the 200k input limit. I'll update my PR to add the

P.S. Worth noting that the 200k limit on Opus 4.6 without the beta header is different from the 200k context window on models like Haiku 4.5. For Haiku 4.5, 200k is the total context window — input tokens, output tokens, everything has to fit within that 200k budget. For Opus 4.6, the context window is actually 1M — the 200k is only a gate on input tokens. Output and thinking tokens live in the larger 1M space and don't count against the 200k. So even without the
- Add adaptive-thinking-2026-01-28 beta header for Anthropic provider
- Detect Opus 4.6 models and use adaptive thinking with effort parameter
- Support all effort levels: low, medium, high, max
- Older models continue to use manual thinking with budgetTokens

Opus 4.6 uses adaptive thinking where the model decides how much to think based on task complexity, guided by the effort parameter. This is more efficient than fixed budgetTokens as simple tasks use minimal thinking.

Ref: https://docs.anthropic.com/en/docs/build-with-claude/adaptive-thinking
- Upgrade ai package from 5.0.124 to 6.0.72
- Upgrade @ai-sdk/anthropic from 2.0.58 to 3.0.37 (adds adaptive thinking support)
- Upgrade all @ai-sdk/* packages to v3+/v4+ for compatibility
- Update LanguageModelV2 to LanguageModelV3 across codebase
- Make toModelMessages async per AI SDK 6 requirements
- Update toModelOutput signature to use destructured parameter
- Fix tool factory renames and remove deprecated name property
- Update tests for new LanguageModelUsage type structure

Enables Claude Opus 4.6 to use adaptive thinking with effort levels (low/medium/high/max) instead of fixed budget tokens.
@rekram1-node @thdxr could anyone review this, please?
Force-pushed from cd15179 to b49a992
|
Works well for me except the displayed usage is wrong (that's also why you got the 200k limit — the Claude Code limit — warning at "400k" context).

Bug: Token Counter Displays ~2× Actual Usage with AI SDK v3
| Version | inputTokens value |
|---|---|
| v2.x | input_tokens (excludes cache) |
| v3.x | total = noCache + cacheRead + cacheWrite |
In session/index.ts, the getUsage() function assumes that for Anthropic, inputTokens excludes cached tokens:
```ts
const excludesCachedTokens = !!(input.metadata?.["anthropic"] || input.metadata?.["bedrock"])
const adjustedInputTokens = excludesCachedTokens
  ? (input.usage.inputTokens ?? 0) // ❌ This is now the TOTAL in v3!
  : ...
```

Then in the UI (header.tsx, sidebar.tsx), tokens are displayed as:

```ts
tokens.input + tokens.output + tokens.reasoning + tokens.cache.read + tokens.cache.write
```
Result: Cache tokens are counted twice — once inside tokens.input (which is now the total) and again as tokens.cache.read + tokens.cache.write.
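A tiny numeric sketch of the double count (all values illustrative): under v3 semantics `inputTokens` already includes the cache tokens, so the display formula adds them a second time.

```typescript
// Illustrative v3-style usage: inputTokens is the TOTAL
// (20_000 no-cache + 75_000 cache read + 5_000 cache write).
const usage = {
  inputTokens: 100_000,
  outputTokens: 2_000,
  reasoningTokens: 1_000,
  cacheRead: 75_000,
  cacheWrite: 5_000,
}

// Display formula from header.tsx / sidebar.tsx (shape simplified here):
const displayed =
  usage.inputTokens + usage.outputTokens + usage.reasoningTokens +
  usage.cacheRead + usage.cacheWrite

// What is actually in the context window:
const actual = usage.inputTokens + usage.outputTokens + usage.reasoningTokens

console.log(displayed) // prints 183000 (cache counted twice)
console.log(actual)    // prints 103000
```

The heavier the cache hit rate, the closer the displayed number gets to double the real usage.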
Fix
Use the new inputTokenDetails.noCacheTokens field introduced in SDK v3:
```ts
// SDK v3: inputTokens is now TOTAL, use noCacheTokens for pure input
const noCacheInputTokens = input.usage.inputTokenDetails?.noCacheTokens
const adjustedInputTokens = noCacheInputTokens !== undefined
  ? noCacheInputTokens
  : (input.usage.inputTokens ?? 0) - cacheReadInputTokens - cacheWriteInputTokens
```

Also update cache token extraction to use the new structure:

```ts
const cacheReadInputTokens =
  input.usage.cachedInputTokens ??
  input.usage.inputTokenDetails?.cacheReadTokens ??
  0
const cacheWriteInputTokens =
  input.usage.inputTokenDetails?.cacheWriteTokens ??
  0
```

SDK v3 Type Reference
```ts
type LanguageModelUsage = {
  inputTokens: number | undefined; // Now the TOTAL
  inputTokenDetails: {
    noCacheTokens: number | undefined; // Pure input without cache
    cacheReadTokens: number | undefined; // Tokens read from cache
    cacheWriteTokens: number | undefined; // Tokens written to cache
  };
  cachedInputTokens?: number | undefined; // @deprecated — use inputTokenDetails.cacheReadTokens
  // ...
}
```
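Tying the snippets together, a hedged sketch of a small normalizer over this v3 shape (field names come from the type reference above; `normalizeUsage` is a hypothetical helper, and the subtraction fallback for usages that omit `noCacheTokens` is an assumption mirroring the fix):

```typescript
type LanguageModelUsage = {
  inputTokens: number | undefined // v3: the TOTAL
  inputTokenDetails?: {
    noCacheTokens?: number
    cacheReadTokens?: number
    cacheWriteTokens?: number
  }
  cachedInputTokens?: number // deprecated alias for cacheReadTokens
}

function normalizeUsage(u: LanguageModelUsage) {
  const cacheRead = u.cachedInputTokens ?? u.inputTokenDetails?.cacheReadTokens ?? 0
  const cacheWrite = u.inputTokenDetails?.cacheWriteTokens ?? 0
  // Prefer the explicit no-cache figure; otherwise back it out of the total.
  const input =
    u.inputTokenDetails?.noCacheTokens ??
    Math.max(0, (u.inputTokens ?? 0) - cacheRead - cacheWrite)
  return { input, cacheRead, cacheWrite }
}

// v3 usage where the details object is populated:
console.log(normalizeUsage({
  inputTokens: 100,
  inputTokenDetails: { noCacheTokens: 20, cacheReadTokens: 75, cacheWriteTokens: 5 },
}))
// → { input: 20, cacheRead: 75, cacheWrite: 5 }

// Details missing: fall back to subtracting cache from the total.
console.log(normalizeUsage({ inputTokens: 100, cachedInputTokens: 75 }))
// → { input: 25, cacheRead: 75, cacheWrite: 0 }
```

With `input` guaranteed cache-free, the UI can keep summing `input + output + reasoning + cache.read + cache.write` without double counting.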
How did anyone manage to use this branch? I keep getting: "The long context beta is not yet available for this subscription"
…ounting, use plugin fork for OAuth context cap
Force-pushed from b49a992 to 471746d
```diff
@@ -16,15 +16,14 @@ import { gitlabAuthPlugin as GitlabAuthPlugin } from "@gitlab/opencode-gitlab-au
 export namespace Plugin {
   const log = Log.create({ service: "plugin" })

-  const BUILTIN = ["opencode-anthropic-auth@0.0.13"]
+  const BUILTIN = ["github:okhsunrog/opencode-anthropic-auth#feat/oauth-context-cap"]
```
opencode core shouldn't have builtin plugins that pull from forks.
It was only intended for testing until the changes to opencode-anthropic-auth are accepted.








Adaptive thinking
- Add `adaptive-thinking-2026-01-28` beta header for the Anthropic provider

Opus 4.6 uses adaptive thinking where the model decides how much to think based on task complexity, guided by the effort parameter. This is more efficient than fixed budgetTokens as simple tasks use minimal thinking.
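To make the mechanism concrete, here is a hedged sketch of what a raw Messages API request might look like with both beta features enabled. The header values come from this PR; the model name and the `thinking`/`effort` body shape are assumptions for illustration, not a confirmed API surface:

```
POST https://api.anthropic.com/v1/messages
anthropic-beta: adaptive-thinking-2026-01-28,context-1m-2025-08-07

{
  "model": "claude-opus-4-6",
  "max_tokens": 4096,
  "thinking": { "type": "adaptive", "effort": "high" },
  "messages": [ ... ]
}
```

In opencode itself the provider layer injects the headers, so users only pick an effort level.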
1M context window
- Add `context-1m-2025-08-07` beta header for the Anthropic provider (API key users)
- When `model.limit.input` is set, compare input tokens only against that limit instead of the combined total (input + cache + output)

Without the beta header, Opus 4.6 enforces a 200k input token limit (the context window is still 1M — output/thinking tokens aren't affected). The `opencode` (OAuth) provider doesn't support the 1M beta header, so it relies on `model.limit.input` in models.dev to trigger compaction at the right time: anomalyco/models.dev#819

AI SDK upgrade
- `ai` v5 → v6, `@ai-sdk/anthropic` v2 → v3, and all `@ai-sdk/*` packages
- `LanguageModelV2` → `LanguageModelV3`, async `toModelMessages`, renamed tool factories

Ref: https://platform.claude.com/docs/en/build-with-claude/adaptive-thinking
Ref: https://platform.claude.com/docs/en/build-with-claude/context-windows#1-m-token-context-window
Tested locally, works with Opus 4.6
Closes #12323
Closes #12338
Closes #12438