fix: Add structured output fallback for non-Claude models in app spec generation #682
Conversation
… generation

This fixes the app spec generation failing for non-Claude models (Cursor, Gemini, OpenCode, Copilot) that don't support structured output capabilities.

Changes:
- Add `supportsStructuredOutput()` utility function in @automaker/types to centralize model capability detection
- Update generate-features-from-spec.ts:
  - Add explicit JSON instructions for non-Claude/Codex models
  - Define featuresOutputSchema for structured output
  - Pre-extract JSON from text responses using extractJsonWithArray
  - Handle both structured_output and text responses properly
- Update generate-spec.ts:
  - Replace isCursorModel with supportsStructuredOutput for consistency
- Update sync-spec.ts:
  - Add techStackOutputSchema for structured output
  - Add JSON extraction fallback for text responses
  - Handle both structured_output and text parsing
- Update validate-issue.ts:
  - Use supportsStructuredOutput for cleaner capability detection

The fix follows the same pattern used in generate-spec.ts, where non-Claude models receive explicit JSON formatting instructions in the prompt and responses are parsed using extractJson utilities.

Fixes #669

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Note: CodeRabbit has detected other AI code review bot(s) in this pull request and will avoid duplicating their findings in the review comments. This may lead to a less comprehensive review.

📝 Walkthrough

Replaces per-model checks with a generic supportsStructuredOutput(model) predicate and implements a structured-output-first flow with a robust JSON-extraction fallback across feature generation, spec generation, tech-stack sync, and issue validation; threads reasoningEffort and Codex-specific timeout behavior into model resolution and providers.
Sequence Diagram(s)

```mermaid
sequenceDiagram
    participant Client
    participant Server
    participant Settings
    participant Model as LLM Provider
    participant Extractor as JSON Extractor
    Client->>Server: request spec/features generation
    Server->>Settings: resolve model & reasoningEffort
    Server->>Model: streamingQuery(prompt, outputFormat? schema)
    alt Model returns structured_output
        Model-->>Server: structured_output (JSON)
        Server->>Server: validate with schema -> contentForParsing
    else Model returns text
        Model-->>Server: text response
        Server->>Extractor: extractJsonWithArray(text)
        alt extraction succeeds
            Extractor-->>Server: parsed JSON -> contentForParsing
        else extraction fails
            Extractor-->>Server: error
            Server->>Server: emit error event & throw
        end
    end
    Server->>Server: parse features from contentForParsing & proceed
```
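For readers who prefer code to the diagram, here is a minimal TypeScript sketch of the structured-output-first flow with the JSON-extraction fallback. The response shape and the `extractJsonWithArray` implementation below are assumptions based on the walkthrough, not the repository's exact API.

```typescript
// Minimal sketch of the structured-output-first parsing flow; the
// response shape and helper below are assumptions, not the repo's API.
type ModelResponse =
  | { type: 'structured_output'; data: unknown }
  | { type: 'text'; content: string };

// Hypothetical stand-in for the extractJsonWithArray utility: find the
// first JSON object or array embedded in free text and parse it.
function extractJsonWithArray(text: string): unknown | null {
  const match = text.match(/[\[{][\s\S]*[\]}]/);
  if (!match) return null;
  try {
    return JSON.parse(match[0]);
  } catch {
    return null;
  }
}

function resolveContentForParsing(response: ModelResponse): unknown {
  if (response.type === 'structured_output') {
    // Claude/Codex models return schema-validated JSON directly.
    return response.data;
  }
  // Other models return free text; fall back to JSON extraction.
  const extracted = extractJsonWithArray(response.content);
  if (extracted === null) {
    // Mirrors the diagram: extraction failure is surfaced as an error.
    throw new Error('No valid JSON found in response');
  }
  return extracted;
}
```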
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~45 minutes
🚥 Pre-merge checks: ❌ Failed checks (1 warning) | ✅ Passed checks (4 passed)
Summary of Changes

Hello @Shironex, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request addresses a critical issue where app specification generation failed for AI models that do not natively support structured JSON output. By introducing a centralized mechanism to detect model capabilities and implementing intelligent fallbacks, including explicit JSON instructions in prompts and post-processing text responses to extract JSON, the system can now reliably generate app specs and analyze tech stacks across a wider range of AI providers, ensuring a more resilient and compatible generation pipeline.
Code Review
This pull request introduces a robust fallback mechanism for app spec generation on AI models that do not support structured output. The changes are well-executed, introducing a supportsStructuredOutput() utility to centralize model capability detection and applying it consistently across generate-features-from-spec.ts, sync-spec.ts, generate-spec.ts, and validate-issue.ts.
The addition of explicit JSON instructions for non-Claude models and the use of a JSON extraction fallback (extractJson and extractJsonWithArray) significantly improve the reliability of spec generation with a wider range of models. The code is clean, well-documented, and the refactoring improves maintainability.
I have one suggestion in generate-features-from-spec.ts to avoid a redundant parsing attempt on failure, which would make the error handling slightly more efficient and explicit. Overall, this is an excellent contribution that addresses a key functionality gap.
- Throw error immediately when JSON extraction fails in generate-features-from-spec.ts to avoid a redundant parsing attempt (feedback from Gemini Code Assist review)
- Emit spec_regeneration_error event before throwing for consistency
- Fix TypeScript cast in sync-spec.ts by using a double cast through unknown

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
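In code, the revised failure path looks roughly like this, reusing the hypothetical `extractJsonWithArray` stand-in from the sketch above; the emitter API is illustrative, and only the `spec_regeneration_error` event name comes from the commit message:

```typescript
import { EventEmitter } from 'node:events';

// Sketch of the early-throw behavior described in this commit; the
// emitter wiring is an assumption for illustration.
function parseOrFail(text: string, emitter: EventEmitter): unknown {
  const extracted = extractJsonWithArray(text);
  if (extracted === null) {
    const message = 'No valid JSON found in response';
    // Emit the error event before throwing, for consistency.
    emitter.emit('spec_regeneration_error', { message });
    // Throw immediately rather than attempting a redundant parse.
    throw new Error(message);
  }
  return extracted;
}
```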
Actionable comments posted: 1
🤖 Fix all issues with AI agents
In `@libs/types/src/provider-utils.ts`:
- Around lines 348-374: supportsStructuredOutput currently returns true for any model where isClaudeModel(model) is true, which incorrectly enables structured output for Claude-branded Copilot/OpenCode IDs (e.g., "copilot-claude-3.5-sonnet"). Update supportsStructuredOutput(model) to also exclude Copilot/OpenCode models by checking the existing helpers (e.g., isCopilotModel(model) and isOpenCodeModel(model)) or by explicitly matching prefixes (e.g., model startsWith 'copilot-' or 'opencode-'), and only return true when isClaudeModel(model) is true AND those exclusion checks are false (i.e., return isClaudeModel(model) && !isCopilotModel(model) && !isOpenCodeModel(model)); keep the isCodexModel logic as-is.
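Under those instructions, the corrected predicate would look roughly like this; the four helpers are the existing functions the comment names, and the import path is illustrative:

```typescript
import {
  isClaudeModel,
  isCodexModel,
  isCopilotModel,
  isOpenCodeModel,
} from './provider-utils'; // illustrative path

// Sketch of the suggested fix: Codex keeps its existing support, and
// Claude-branded Copilot/OpenCode IDs (e.g. "copilot-claude-3.5-sonnet")
// are excluded from structured output.
export function supportsStructuredOutput(model: string): boolean {
  if (isCodexModel(model)) return true;
  return (
    isClaudeModel(model) && !isCopilotModel(model) && !isOpenCodeModel(model)
  );
}
```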
- Introduced a dedicated 5-minute timeout for Codex models during feature generation to accommodate slower response times when generating 50+ features.
- Updated the CodexProvider to utilize this extended timeout based on the reasoning effort level.
- Enhanced the feature generation logic in generate-features-from-spec.ts to detect Codex models and apply the appropriate timeout.
- Modified the model resolver to include reasoning effort in the resolved phase model structure.

This change improves the reliability of feature generation for Codex models, ensuring they have sufficient time to process requests effectively.
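The timeout selection this commit describes can be sketched as follows; the 5-minute figure and the 'xhigh' gating come from the commit and the review thread, while the default CLI timeout value here is a placeholder:

```typescript
// Timeout selection sketch; only the 5-minute extended window is taken
// from the commit. The default value below is a placeholder.
const CODEX_CLI_TIMEOUT_MS = 60_000; // placeholder default
const CODEX_FEATURE_GENERATION_BASE_TIMEOUT_MS = 5 * 60 * 1000; // 5 minutes

type ReasoningEffort = 'low' | 'medium' | 'high' | 'xhigh';

function resolveCodexBaseTimeout(reasoningEffort?: ReasoningEffort): number {
  // Feature generation (50+ features) needs the extended window, which
  // callers request via reasoningEffort: 'xhigh'.
  return reasoningEffort === 'xhigh'
    ? CODEX_FEATURE_GENERATION_BASE_TIMEOUT_MS
    : CODEX_CLI_TIMEOUT_MS;
}
```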
Actionable comments posted: 0
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
apps/server/src/providers/codex-provider.ts (1)

831-842: Gate the extended timeout on models that actually support reasoning effort.

Right now, any caller passing `reasoningEffort: 'xhigh'` gets the 5-minute base timeout even if the model doesn't support reasoning effort (you already gate CLI overrides with `supportsReasoningEffort`). This can unnecessarily delay failure detection for unsupported models. Consider normalizing the effective reasoning effort before applying the extended timeout.

🛠️ Suggested fix

```diff
-const baseTimeout =
-  options.reasoningEffort === 'xhigh'
-    ? CODEX_FEATURE_GENERATION_BASE_TIMEOUT_MS
-    : CODEX_CLI_TIMEOUT_MS;
-const timeout = calculateReasoningTimeout(options.reasoningEffort, baseTimeout);
+const effectiveReasoningEffort = supportsReasoningEffort(options.model)
+  ? options.reasoningEffort
+  : undefined;
+const baseTimeout =
+  effectiveReasoningEffort === 'xhigh'
+    ? CODEX_FEATURE_GENERATION_BASE_TIMEOUT_MS
+    : CODEX_CLI_TIMEOUT_MS;
+const timeout = calculateReasoningTimeout(effectiveReasoningEffort, baseTimeout);
```

Based on learnings, keep capability checks strictly per-model.
🧹 Nitpick comments (1)
apps/server/src/routes/app-spec/generate-features-from-spec.ts (1)

29-33: Remove unused constant `CODEX_FEATURE_GENERATION_TIMEOUT_MS`.

This constant is defined but never referenced anywhere in the codebase. The timeout is controlled via `reasoningEffort: 'xhigh'` passed to `streamingQuery`, with the actual 5-minute timeout managed by the Codex provider implementation. Remove this dead code.
Summary
This PR fixes the app spec generation failing for non-Claude models (Cursor, Gemini, OpenCode, Copilot) that don't support structured output capabilities.
- `supportsStructuredOutput()` centralized utility function

Fixes #669
Problem
When users select non-Claude AI models (like Cursor, Gemini, or Copilot) for app spec generation, the final step fails with a "No valid JSON found in response" error. This happens because:
- `generate-features-from-spec.ts` was missing structured output handling
- `sync-spec.ts` was missing JSON extraction fallback

Solution

Applied the same pattern already used in `generate-spec.ts`:

1. Added `supportsStructuredOutput()` utility (`libs/types/src/provider-utils.ts`)
   - Returns `true` for Claude and Codex models
   - Returns `false` for Cursor, Gemini, OpenCode, and Copilot models
2. Updated `generate-features-from-spec.ts`
   - Added `featuresOutputSchema` for Claude/Codex structured output
   - Pre-extracts JSON from text responses using `extractJsonWithArray`
   - Handles both `structured_output` and text responses properly
3. Updated `sync-spec.ts`
   - Added `techStackOutputSchema` for structured output
   - Added a JSON extraction fallback via the `extractJson` utility
4. Updated `generate-spec.ts` and `validate-issue.ts`
   - Replaced `isCursorModel()` with `supportsStructuredOutput()` for consistency

Test plan
Related Issues
🤖 Generated with Claude Code
Summary by CodeRabbit
New Features
Improvements
Chores