feat:Add language_detection_options to TranscriptOptionalParams #121
Walkthrough
Added a new language_detection_options object to TranscriptOptionalParams in src/libs/AssemblyAI/openapi.yaml, introducing expected_languages (array of strings) and fallback_language (string, default "auto") to configure automatic language detection alongside the existing language_detection flag.
Sequence Diagram(s)
sequenceDiagram
autonumber
actor Client
participant API as AssemblyAI API
participant LD as Language Detector
Client->>API: POST /transcripts { language_detection: true, language_detection_options }
API->>LD: Detect language(s) with expected_languages and fallback_language
alt Detected in expected_languages
LD-->>API: Return detected language
else Not in expected list
LD-->>API: Apply fallback_language (or "auto")
end
API-->>Client: Transcript created with resolved language settings
note over API,LD: New: language_detection_options guides detection and fallback
Estimated code review effort: 2 (Simple), ~10 minutes
Actionable comments posted: 2
🧹 Nitpick comments (3)
src/libs/AssemblyAI/openapi.yaml (3)
1263-1264: Correct mislabeled field. "x-label" says “Minimum speakers expected” but the field is languages.
- x-label: Minimum speakers expected
+ x-label: Expected languages
1256-1259: Tighten label/description and scope.Minor copy tweaks; clarify it applies when language_detection=true.
- x-label: Specify options for Automatic Language Detection.
- description: Specify options for Automatic Language Detection.
+ x-label: Automatic language detection options
+ description: Options for Automatic Language Detection (effective only when `language_detection` is true).
1256-1275: Optional: gate options on language_detection and add usage example.
- Consider an if/then at TranscriptOptionalParams level to warn when options are present but language_detection=false.
- Add an example to examples showing expected usage.
Example (outside this hunk):
# In components.schemas.TranscriptOptionalParams (sibling of properties)
allOf:
  - if:
      properties:
        language_detection:
          const: false
    then:
      # Reject the options object when detection is off. A bare
      # `not: properties: ...` would fail for every instance, so test presence.
      not:
        required: [language_detection_options]
Usage snippet:
language_detection: true
language_detection_options:
  expected_languages: ["en_us", "es"]
  fallback_language: "auto"
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
⛔ Files ignored due to path filters (4)
- src/libs/AssemblyAI/Generated/AssemblyAI.JsonSerializerContextTypes.g.cs is excluded by !**/generated/**
- src/libs/AssemblyAI/Generated/AssemblyAI.Models.TranscriptOptionalParams.g.cs is excluded by !**/generated/**
- src/libs/AssemblyAI/Generated/AssemblyAI.Models.TranscriptOptionalParamsLanguageDetectionOptions.Json.g.cs is excluded by !**/generated/**
- src/libs/AssemblyAI/Generated/AssemblyAI.Models.TranscriptOptionalParamsLanguageDetectionOptions.g.cs is excluded by !**/generated/**
📒 Files selected for processing (1)
- src/libs/AssemblyAI/openapi.yaml (1 hunks)
type: array
objects:
  x-label: language
  type: string
Fix array schema: use items (not objects) and validate codes.
"objects" is invalid in OAS 3.1 for arrays; use "items". Recommend allowing either TranscriptLanguageCode or free-form string, plus minItems/uniqueItems.
Apply:
- type: array
- objects:
-   x-label: language
-   type: string
+ type: array
+ minItems: 1
+ uniqueItems: true
+ items:
+   anyOf:
+     - $ref: "#/components/schemas/TranscriptLanguageCode"
+     - type: string
🤖 Prompt for AI Agents
In src/libs/AssemblyAI/openapi.yaml around lines 1265 to 1268, the array schema
incorrectly uses "objects" (invalid in OAS 3.1); replace "objects" with "items"
and make the items accept either the TranscriptLanguageCode reference or a
free-form string (e.g., an anyOf with $ref to TranscriptLanguageCode and type:
string), and add validation such as minItems: 1 and uniqueItems: true to ensure
at least one language and no duplicates.
x-label: Fallback language
description: |
  If the detected language of the audio file is not in the list of expected languages, the `fallback_language` is used. Specify `["auto"]` to let our model choose the fallback language from `expected_languages` with the highest confidence score.
type: string
default: "auto"
Align description and type for fallback_language.
Description suggests an array value ["auto"] while the type is string. Use "auto" (string) or a language code; enforce via anyOf. (With oneOf, "auto" would match both the const branch and the free-form string branch, so the default itself would be rejected.)
- description: |
-   If the detected language of the audio file is not in the list of expected languages, the `fallback_language` is used. Specify `["auto"]` to let our model choose the fallback language from `expected_languages` with the highest confidence score.
- type: string
- default: "auto"
+ description: |
+   If the detected language is not in `expected_languages`, use this fallback. Specify "auto" to let the model choose the fallback from `expected_languages` with the highest confidence score.
+ anyOf:
+   - const: "auto"
+   - anyOf:
+       - $ref: "#/components/schemas/TranscriptLanguageCode"
+       - type: string
+ default: "auto"
🤖 Prompt for AI Agents
In src/libs/AssemblyAI/openapi.yaml around lines 1270 to 1274, the
fallback_language field's description refers to an array value (e.g. ["auto"])
but the schema type is string; keep the field a string, use anyOf so it accepts
either the literal "auto" or a language code (anyOf: - {const: "auto"} - {$ref:
"#/components/schemas/TranscriptLanguageCode"} - {type: string}), and update the
description to say "auto" (a string) instead of ["auto"].
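For orientation, the two actionable comments and the label nitpicks could combine into something like the fragment below. This is only a sketch: the property names and the TranscriptLanguageCode reference come from the review, while `type: object`, the exact nesting under TranscriptOptionalParams, and the flattened three-branch anyOf are assumptions, not the merged schema.

```yaml
# Sketch only (assumed layout): sits under
# components.schemas.TranscriptOptionalParams.properties
language_detection_options:
  x-label: Automatic language detection options
  description: Options for Automatic Language Detection (effective only when `language_detection` is true).
  type: object  # assumption: the walkthrough calls this an "object"
  properties:
    expected_languages:
      x-label: Expected languages
      type: array
      minItems: 1        # at least one expected language
      uniqueItems: true  # no duplicate codes
      items:
        anyOf:
          - $ref: "#/components/schemas/TranscriptLanguageCode"
          - type: string
    fallback_language:
      x-label: Fallback language
      description: |
        If the detected language is not in `expected_languages`, use this fallback. Specify "auto" to let the model choose the fallback from `expected_languages` with the highest confidence score.
      # anyOf rather than oneOf: "auto" is also a plain string, so oneOf
      # would see two matching branches and reject the default.
      anyOf:
        - const: "auto"
        - $ref: "#/components/schemas/TranscriptLanguageCode"
        - type: string
      default: "auto"
```

Because the `type: string` branch already accepts any string, the `const` and `$ref` branches mostly serve documentation and code generators here; dropping the free-form branch would make validation strict, at the cost of rejecting codes missing from TranscriptLanguageCode.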
Pull request was closed