feat:Add language_detection_options to TranscriptOptionalParams #119
Walkthrough
Added a new optional object field language_detection_options to TranscriptOptionalParams in src/libs/AssemblyAI/openapi.yaml, introducing expected_languages (array of language codes) and fallback_language (string with default "auto") to configure behavior when language_detection is enabled. No endpoints or other behaviors were changed.
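To make the walkthrough concrete, here is a minimal sketch of a request body that uses the new field. The helper name `build_transcript_request`, the `audio_url` key, the example URL, and the language codes `"en"` and `"es"` are assumptions for illustration; only `language_detection`, `language_detection_options`, `expected_languages`, and `fallback_language` come from this PR.

```python
import json


def build_transcript_request(audio_url, expected_languages, fallback_language="auto"):
    """Build a JSON-serializable request body.

    Hypothetical helper for illustration only; it is not part of any SDK.
    """
    return {
        "audio_url": audio_url,  # assumed field name, not taken from this PR
        "language_detection": True,
        "language_detection_options": {
            "expected_languages": list(expected_languages),
            # A plain string, matching the `type: string` schema in the spec.
            "fallback_language": fallback_language,
        },
    }


body = build_transcript_request("https://example.com/audio.mp3", ["en", "es"])
print(json.dumps(body, indent=2))
```

Note that `fallback_language` is sent as the string `"auto"`, not as a one-element array.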
Sequence Diagram(s)
sequenceDiagram
autonumber
participant Client
participant API as AssemblyAI API
participant Engine as Transcription Engine
Note over Client,API: Create transcript request
Client->>API: POST /transcripts { language_detection, language_detection_options }
alt language_detection enabled
API->>Engine: Start job with options.expected_languages, options.fallback_language
Engine-->>API: Job accepted
API-->>Client: 201 Created + job_id
loop Processing
Engine->>Engine: Detect language (use expected_languages)
opt Detected not in expected_languages
Engine->>Engine: Apply fallback_language ("auto" allowed)
end
Engine->>API: Update transcript status/results
end
API-->>Client: GET /transcripts/{id} results
else language_detection disabled
API->>Engine: Start job without language detection options
end
Note over Client,API: Error paths follow existing API error handling
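The fallback step in the diagram can be sketched as a small function. This is an illustration of the behavior described in the field documentation, not the transcription engine's actual implementation; the function name `resolve_language` and the per-language confidence dictionary are assumptions.

```python
def resolve_language(confidences, expected_languages, fallback_language="auto"):
    """Pick the language to transcribe with.

    confidences: hypothetical mapping of language code -> detection confidence.
    """
    # The detected language is the one with the highest confidence overall.
    detected = max(confidences, key=confidences.get)
    if not expected_languages or detected in expected_languages:
        return detected
    if fallback_language == "auto":
        # "auto": choose the expected language with the highest confidence score.
        return max(expected_languages, key=lambda code: confidences.get(code, 0.0))
    # Otherwise use the explicitly configured fallback language.
    return fallback_language


scores = {"fr": 0.6, "en": 0.3, "es": 0.1}
print(resolve_language(scores, ["en", "es"]))
print(resolve_language(scores, ["en", "es"], fallback_language="es"))
```

In this sketch the first call falls back to the best-scoring expected language, while the second call uses the explicit fallback regardless of scores.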
Estimated code review effort: 🎯 2 (Simple) | ⏱️ ~10 minutes
Actionable comments posted: 3
🧹 Nitpick comments (1)
src/libs/AssemblyAI/openapi.yaml (1)
1256-1261: Optional: strengthen types, align with existing patterns, and polish labels.
- Align with TranscriptLanguageCode by allowing either the enum or any string (as done elsewhere). This keeps SDKs flexible while guiding users.
- Enforce uniqueItems and minItems for expected_languages.
- Tweak the container label/description to match your style used for redact_pii_audio_options, etc.
- Expose TS/Go hints for fallback_language for better SDK ergonomics.
Apply this diff:
- language_detection_options:
-   x-label: Specify options for Automatic Language Detection.
-   description: Specify options for Automatic Language Detection.
+ language_detection_options:
+   x-label: Language detection options
+   description: Options for Automatic Language Detection. Only used when `language_detection` is true.
    type: object
    additionalProperties: false
    properties:
      expected_languages:
-       x-label: Minimum speakers expected
-       description: List of languages expected in the audio file.
+       x-label: Expected languages
+       description: List of language codes expected in the audio file.
        type: array
-       objects:
-         x-label: language
-         type: string
+       minItems: 1
+       uniqueItems: true
+       items:
+         anyOf:
+           - $ref: "#/components/schemas/TranscriptLanguageCode"
+           - type: string
      fallback_language:
        x-label: Fallback language
        description: |
-         If the detected language of the audio file is not in the list of expected languages, the `fallback_language` is used. Specify `["auto"]` to let our model choose the fallback language from `expected_languages` with the highest confidence score.
-       type: string
+         If the detected language of the audio file is not in the list of `expected_languages`, use this fallback language.
+         Set to "auto" to let our model choose the fallback from `expected_languages` with the highest confidence score.
+       type: string
+       x-ts-type: LiteralUnion<TranscriptLanguageCode, string>
+       x-go-type: TranscriptLanguageCode
        default: "auto"
Also applies to: 1262-1275
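The proposed `minItems: 1` and `uniqueItems: true` constraints can be illustrated with a small client-side check. This is a sketch under the assumption that the suggested constraints are adopted; `validate_expected_languages` is a hypothetical helper, not generated SDK code.

```python
def validate_expected_languages(values):
    """Return a list of human-readable constraint violations (empty if valid)."""
    if not isinstance(values, list):
        return ["expected_languages must be an array"]
    errors = []
    if len(values) < 1:
        errors.append("expected_languages must contain at least one item (minItems: 1)")
    if any(not isinstance(value, str) for value in values):
        errors.append("every item must be a string")
    elif len(set(values)) != len(values):
        errors.append("items must be unique (uniqueItems: true)")
    return errors


print(validate_expected_languages(["en", "es"]))
print(validate_expected_languages(["en", "en"]))
```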
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
⛔ Files ignored due to path filters (4)
- src/libs/AssemblyAI/Generated/AssemblyAI.JsonSerializerContextTypes.g.cs is excluded by !**/generated/**
- src/libs/AssemblyAI/Generated/AssemblyAI.Models.TranscriptOptionalParams.g.cs is excluded by !**/generated/**
- src/libs/AssemblyAI/Generated/AssemblyAI.Models.TranscriptOptionalParamsLanguageDetectionOptions.Json.g.cs is excluded by !**/generated/**
- src/libs/AssemblyAI/Generated/AssemblyAI.Models.TranscriptOptionalParamsLanguageDetectionOptions.g.cs is excluded by !**/generated/**
📒 Files selected for processing (1)
- src/libs/AssemblyAI/openapi.yaml (1 hunks)
language_detection_options:
  x-label: Specify options for Automatic Language Detection.
  description: Specify options for Automatic Language Detection.
  type: object
  additionalProperties: false
  properties:
    expected_languages:
      x-label: Minimum speakers expected
      description: List of languages expected in the audio file.
      type: array
      objects:
        x-label: language
        type: string
    fallback_language:
      x-label: Fallback language
      description: |
        If the detected language of the audio file is not in the list of expected languages, the `fallback_language` is used. Specify `["auto"]` to let our model choose the fallback language from `expected_languages` with the highest confidence score.
      type: string
      default: "auto"
💡 Verification agent
🧩 Analysis chain
Quick validation checks for this spec block.
Run this script to ensure we’ve eliminated the invalid "objects" key, corrected the "auto" guidance, and that the YAML still parses.
🏁 Script executed:
#!/bin/bash
set -euo pipefail
# 1) Fail if any invalid 'objects:' key remains
if rg -n "^\s*objects\s*:" -C2 src/libs/AssemblyAI/openapi.yaml; then
  echo "ERROR: Found invalid 'objects:' key under an array schema."
  exit 1
else
  echo "OK: No 'objects:' keys found."
fi
# 2) Fail if docs still suggest [\"auto\"] (array) for a string field
if rg -n '\["auto"\]' src/libs/AssemblyAI/openapi.yaml; then
  echo 'ERROR: Found ["auto"] in fallback_language description.'
  exit 1
else
  echo 'OK: No ["auto"] mention found.'
fi
# 3) Basic YAML parse sanity check
python - <<'PY'
import yaml, sys
with open("src/libs/AssemblyAI/openapi.yaml", "r") as f:
    yaml.safe_load(f)
print("OK: YAML parsed successfully.")
PY
Length of output: 434
AssemblyAI OpenAPI Spec: Remove Invalid objects Key
The validation script still detects an objects: key under the expected_languages array schema on line 1266. This must be replaced with the correct OpenAPI keyword (items:) before the spec can merge cleanly.
Please update src/libs/AssemblyAI/openapi.yaml as follows:
• File: src/libs/AssemblyAI/openapi.yaml
• Line ~1266: Change
  - objects:
  + items:
• Ensure the indented block beneath (x-label: language, type: string) remains unchanged under items:.
After making that change, re-run the validation script to confirm:
#!/bin/bash
set -euo pipefail
# Check for any remaining invalid 'objects:' occurrences.
# if/then is used so that a match stops the script; with
# `cmd && (echo ...; exit 1) || echo OK` the OK branch would also run.
if rg -n "^\s*objects\s*:" -C2 src/libs/AssemblyAI/openapi.yaml; then
  echo "ERROR: 'objects:' key still present"
  exit 1
fi
echo "OK: No 'objects:' keys found."
# Confirm no array-style ["auto"] guidance remains
if rg -n '\["auto"\]' src/libs/AssemblyAI/openapi.yaml; then
  echo "ERROR: Found [\"auto\"] in description"
  exit 1
fi
echo "OK: No [\"auto\"] mention found."
# Verify YAML parses
python - <<'PY'
import yaml
with open("src/libs/AssemblyAI/openapi.yaml") as f:
    yaml.safe_load(f)
print("OK: YAML parsed successfully.")
PY
Once the objects: key is removed and replaced with items:, the script should pass without errors.
🤖 Prompt for AI Agents
In src/libs/AssemblyAI/openapi.yaml around lines 1256 to 1275 (approximately
line 1266), replace the invalid OpenAPI key "objects:" under the
expected_languages array with the correct keyword "items:" and keep the indented
block beneath it (x-label: language and type: string) exactly as-is under the
new items: key; then save and re-run the provided validation script to confirm
there are no remaining "objects:" occurrences and YAML parses cleanly.
expected_languages:
  x-label: Minimum speakers expected
  description: List of languages expected in the audio file.
  type: array
  objects:
    x-label: language
    type: string
Fix invalid OpenAPI array schema: replace "objects" with "items" and correct the field label.
- OpenAPI uses "items" (not "objects") to define array element schemas. Using "objects" will fail validation and break client generation.
- The x-label "Minimum speakers expected" is clearly a copy/paste error for an array of languages.
Apply this diff:
  expected_languages:
-   x-label: Minimum speakers expected
+   x-label: Expected languages
    description: List of languages expected in the audio file.
    type: array
-   objects:
-     x-label: language
-     type: string
+   items:
+     type: string
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
expected_languages:
  x-label: Minimum speakers expected
  description: List of languages expected in the audio file.
  type: array
  objects:
    x-label: language
    type: string

expected_languages:
  x-label: Expected languages
  description: List of languages expected in the audio file.
  type: array
  items:
    type: string
🤖 Prompt for AI Agents
In src/libs/AssemblyAI/openapi.yaml around lines 1262-1268 the array schema is
invalid and mislabelled: replace the incorrect "objects" key with the OpenAPI
"items" key to define the element schema, and correct the x-label values so the
array-level label reflects expected languages (e.g., "Expected languages") and
the item-level x-label reflects an individual language (e.g., "language"); keep
the item type as string.
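A structural check can catch this class of mistake beyond a text search for `objects:`. The sketch below walks an already-parsed schema (plain dicts and lists, for example the result of `yaml.safe_load`) and reports array schemas that lack `items`. The function name `find_arrays_missing_items` and the path format are assumptions for illustration, and the check assumes the OpenAPI 3.0 rule that `items` must be present when `type` is `array`.

```python
def find_arrays_missing_items(node, path="#"):
    """Return paths of schema nodes with type: array but no items key."""
    problems = []
    if isinstance(node, dict):
        if node.get("type") == "array" and "items" not in node:
            problems.append(path)
        for key, value in node.items():
            problems.extend(find_arrays_missing_items(value, f"{path}/{key}"))
    elif isinstance(node, list):
        for index, value in enumerate(node):
            problems.extend(find_arrays_missing_items(value, f"{path}/{index}"))
    return problems


# The schema as submitted in this PR, reduced to the relevant keys.
submitted = {
    "properties": {
        "expected_languages": {"type": "array", "objects": {"type": "string"}},
    }
}
print(find_arrays_missing_items(submitted))
```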
x-label: Fallback language
description: |
  If the detected language of the audio file is not in the list of expected languages, the `fallback_language` is used. Specify `["auto"]` to let our model choose the fallback language from `expected_languages` with the highest confidence score.
type: string
default: "auto"
Documentation/type mismatch: fallback_language description suggests an array, but the type is string.
The text says to specify ["auto"], but the schema defines a string. This will confuse users and SDKs.
Apply this diff:
  fallback_language:
    x-label: Fallback language
    description: |
-     If the detected language of the audio file is not in the list of expected languages, the `fallback_language` is used. Specify `["auto"]` to let our model choose the fallback language from `expected_languages` with the highest confidence score.
+     If the detected language of the audio file is not in the list of expected languages, the `fallback_language` is used. Set to "auto" to let our model choose the fallback language from `expected_languages` with the highest confidence score.
    type: string
    default: "auto"
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
x-label: Fallback language
description: |
  If the detected language of the audio file is not in the list of expected languages, the `fallback_language` is used. Specify `["auto"]` to let our model choose the fallback language from `expected_languages` with the highest confidence score.
type: string
default: "auto"

fallback_language:
  x-label: Fallback language
  description: |
    If the detected language of the audio file is not in the list of expected languages, the `fallback_language` is used. Set to "auto" to let our model choose the fallback language from `expected_languages` with the highest confidence score.
  type: string
  default: "auto"
🤖 Prompt for AI Agents
In src/libs/AssemblyAI/openapi.yaml around lines 1270 to 1274, the description
for fallback_language refers to specifying ["auto"] (an array) but the schema
sets type: string; change the schema to type: array with items: { type: string }
and set default: ["auto"] (or alternatively adjust the description to reference
a single string if intended); ensure the description matches the schema and
update any examples to use an array of strings when using ["auto"].
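Whichever direction is chosen (string or array), the declared `type` and the `default` should agree. A minimal sketch of that consistency check follows; `default_matches_type` and the small type table are assumptions for illustration and cover only a few JSON Schema types.

```python
# Maps a subset of JSON Schema type names to Python types (illustrative only).
JSON_TYPES = {"string": str, "array": list, "boolean": bool, "object": dict}


def default_matches_type(schema):
    """Return True when the schema's default is compatible with its declared type."""
    if "default" not in schema or "type" not in schema:
        return True  # nothing to compare
    expected = JSON_TYPES.get(schema["type"])
    if expected is None:
        return True  # type not covered by this sketch
    return isinstance(schema["default"], expected)


print(default_matches_type({"type": "string", "default": "auto"}))
print(default_matches_type({"type": "string", "default": ["auto"]}))
```

The second call models the mismatch flagged in this comment: an array-style `["auto"]` against a `type: string` schema.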
Pull request was closed
Summary by CodeRabbit