@HavenDV HavenDV commented Aug 27, 2025

Summary by CodeRabbit

  • New Features
    • Added configurable options for Automatic Language Detection in transcription requests.
    • You can specify expected languages to guide detection and improve accuracy.
    • Introduced a fallback language used when the detected language isn’t in the expected list (defaults to “auto”).
    • Existing language detection functionality remains unchanged, with added flexibility for multilingual audio scenarios.

coderabbitai bot commented Aug 27, 2025

Walkthrough

Added a new language_detection_options object to TranscriptOptionalParams in src/libs/AssemblyAI/openapi.yaml, introducing expected_languages (array of strings) and fallback_language (string, default "auto") to configure automatic language detection alongside the existing language_detection flag.

Changes

Cohort / File(s) | Summary
AssemblyAI OpenAPI schema: src/libs/AssemblyAI/openapi.yaml | Added the TranscriptOptionalParams.language_detection_options object with expected_languages (array of strings; “List of languages expected in the audio file.”) and fallback_language (string; default “auto”), placed alongside the existing language_detection flag.
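
To make the shape of the new object concrete, here is a minimal Python sketch that mirrors the two fields named in the summary. The class names and the `to_payload` helper are hypothetical and are not part of the generated SDK; the `False` default for `language_detection` is an assumption, and the real TranscriptOptionalParams schema has many more fields.

```python
from dataclasses import asdict, dataclass, field
from typing import Any, Dict, List, Optional


@dataclass
class LanguageDetectionOptions:
    """Hypothetical mirror of the new language_detection_options object."""

    expected_languages: List[str] = field(default_factory=list)
    fallback_language: str = "auto"  # schema default, per the summary above


@dataclass
class TranscriptOptionalParams:
    """Only the two fields discussed in this PR; everything else is omitted."""

    language_detection: bool = False  # assumed default, not stated in the PR
    language_detection_options: Optional[LanguageDetectionOptions] = None

    def to_payload(self) -> Dict[str, Any]:
        """Convert to a plain dict, dropping the options key when unset."""
        payload = asdict(self)
        if payload["language_detection_options"] is None:
            del payload["language_detection_options"]
        return payload


params = TranscriptOptionalParams(
    language_detection=True,
    language_detection_options=LanguageDetectionOptions(
        expected_languages=["en_us", "es"]
    ),
)
print(params.to_payload())
```

Leaving `fallback_language` unset yields "auto" in the payload, matching the default described in the schema.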

Sequence Diagram(s)

sequenceDiagram
  autonumber
  actor Client
  participant API as AssemblyAI API
  participant LD as Language Detector

  Client->>API: POST /transcripts { language_detection: true, language_detection_options }
  API->>LD: Detect language(s) with expected_languages and fallback_language
  alt Detected in expected_languages
    LD-->>API: Return detected language
  else Not in expected list
    LD-->>API: Apply fallback_language (or "auto")
  end
  API-->>Client: Transcript created with resolved language settings
  note over API,LD: New: language_detection_options guides detection and fallback
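
The alt/else branches of the diagram can be sketched in Python as below. This is an illustration under assumptions, not the service's implementation: `resolve_language` and the `confidences` mapping (language code to detector confidence) are hypothetical, and keeping the detected language when no expected list is given is a guess about behavior the PR does not describe.

```python
from typing import Dict, Sequence


def resolve_language(
    confidences: Dict[str, float],
    expected_languages: Sequence[str],
    fallback_language: str = "auto",
) -> str:
    """Pick a language following the two branches of the sequence diagram.

    `confidences` is a hypothetical stand-in for the detector's output.
    """
    if not confidences:
        raise ValueError("confidences must not be empty")
    detected = max(confidences, key=confidences.get)

    # Assumption: with no expected list, the detected language is kept.
    if not expected_languages or detected in expected_languages:
        return detected

    # Detected language is outside the expected list: use an explicit fallback.
    if fallback_language != "auto":
        return fallback_language

    # "auto": choose the expected language with the highest confidence.
    return max(expected_languages, key=lambda code: confidences.get(code, 0.0))


scores = {"fr": 0.6, "en_us": 0.3, "es": 0.1}
print(resolve_language(scores, ["en_us", "es"]))        # "auto" fallback
print(resolve_language(scores, ["en_us", "es"], "es"))  # explicit fallback
```

With these made-up scores, French is detected but not expected, so the first call picks the best-scoring expected language and the second returns the explicit fallback.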

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~10 minutes

Poem

I twitch my ears at tongues anew,
A chorus of codes the servers view;
Expected trails, a fallback lane,
If accents hop beyond the plain.
YAML burrows, options bloom—
Now transcripts find their proper room. 🐇✨

@HavenDV HavenDV enabled auto-merge (squash) August 27, 2025 03:34
@coderabbitai coderabbitai bot changed the title from "feat:@coderabbitai" to "feat:Add language_detection_options to TranscriptOptionalParams" Aug 27, 2025
@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 2

🧹 Nitpick comments (3)
src/libs/AssemblyAI/openapi.yaml (3)

1263-1264: Correct mislabeled field.

The "x-label" says “Minimum speakers expected”, but the field holds the expected languages.

-              x-label: Minimum speakers expected
+              x-label: Expected languages

1256-1259: Tighten label/description and scope.

Minor copy tweaks; clarify it applies when language_detection=true.

-          x-label: Specify options for Automatic Language Detection.
-          description: Specify options for Automatic Language Detection.
+          x-label: Automatic language detection options
+          description: Options for Automatic Language Detection (effective only when `language_detection` is true).

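1256-1275: Optional: gate options on language_detection and add usage example.

  • Consider an if/then at the TranscriptOptionalParams level that rejects requests where options are present but language_detection is explicitly false. Schema validation can only accept or reject; it cannot emit a warning.
  • Add an example to examples showing expected usage.

Example (outside this hunk):

# In components.schemas.TranscriptOptionalParams (sibling of properties)
allOf:
  - if:
      # Match only when language_detection is present and false.
      properties:
        language_detection:
          const: false
      required:
        - language_detection
    then:
      # Reject the request if language_detection_options is also present.
      not:
        required:
          - language_detection_options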

Usage snippet:

language_detection: true
language_detection_options:
  expected_languages: ["en_us", "es"]
  fallback_language: "auto"
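
The same gating idea can also be checked client-side before a request is sent. The following is a minimal sketch, assuming request parameters are held in a plain dict; `gating_problem` is a hypothetical helper, and unlike the schema rule it also flags the case where `language_detection` is missing entirely.

```python
from typing import Any, Dict, Optional


def gating_problem(params: Dict[str, Any]) -> Optional[str]:
    """Return a message if options are set while detection is off, else None."""
    has_options = "language_detection_options" in params
    detection_on = params.get("language_detection") is True
    if has_options and not detection_on:
        return (
            "language_detection_options has no effect unless "
            "language_detection is true"
        )
    return None


ok = {
    "language_detection": True,
    "language_detection_options": {
        "expected_languages": ["en_us", "es"],
        "fallback_language": "auto",
    },
}
bad = {**ok, "language_detection": False}
print(gating_problem(ok))   # None
print(gating_problem(bad))  # the message above
```

Returning a message instead of raising lets the caller decide whether to log a warning or refuse to send the request.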

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between c531b33 and 0370305.

⛔ Files ignored due to path filters (4)
  • src/libs/AssemblyAI/Generated/AssemblyAI.JsonSerializerContextTypes.g.cs is excluded by !**/generated/**
  • src/libs/AssemblyAI/Generated/AssemblyAI.Models.TranscriptOptionalParams.g.cs is excluded by !**/generated/**
  • src/libs/AssemblyAI/Generated/AssemblyAI.Models.TranscriptOptionalParamsLanguageDetectionOptions.Json.g.cs is excluded by !**/generated/**
  • src/libs/AssemblyAI/Generated/AssemblyAI.Models.TranscriptOptionalParamsLanguageDetectionOptions.g.cs is excluded by !**/generated/**
📒 Files selected for processing (1)
  • src/libs/AssemblyAI/openapi.yaml (1 hunks)

Comment on lines +1265 to +1268
              type: array
              objects:
                x-label: language
                type: string

⚠️ Potential issue

Fix array schema: use items (not objects) and validate codes.

"objects" is invalid in OAS 3.1 for arrays; use "items". Recommend allowing either TranscriptLanguageCode or free-form string, plus minItems/uniqueItems.

Apply:

-              type: array
-              objects:
-                x-label: language
-                type: string
+              type: array
+              minItems: 1
+              uniqueItems: true
+              items:
+                anyOf:
+                  - $ref: "#/components/schemas/TranscriptLanguageCode"
+                  - type: string
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
-              type: array
-              objects:
-                x-label: language
-                type: string
+              type: array
+              minItems: 1
+              uniqueItems: true
+              items:
+                anyOf:
+                  - $ref: "#/components/schemas/TranscriptLanguageCode"
+                  - type: string
🤖 Prompt for AI Agents
In src/libs/AssemblyAI/openapi.yaml around lines 1265 to 1268, the array schema
incorrectly uses "objects" (invalid in OAS 3.1); replace "objects" with "items"
and make the items accept either the TranscriptLanguageCode reference or a
free-form string (an anyOf with $ref to TranscriptLanguageCode and type:
string; a oneOf would reject known codes because they match both branches),
and add validation such as minItems: 1 and uniqueItems: true to ensure
at least one language and no duplicates.
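
For readers who want to see what the suggested constraints accept and reject, here is a small standard-library sketch of the same rules (array, minItems: 1, uniqueItems: true, string items). `validate_expected_languages` is a hypothetical helper written for this illustration; it does not check entries against TranscriptLanguageCode, since the suggestion also allows free-form strings.

```python
from typing import Any, List


def validate_expected_languages(value: Any) -> List[str]:
    """Return a list of problems; an empty list means the value passes."""
    if not isinstance(value, list):
        return ["expected_languages must be an array"]

    problems: List[str] = []
    if len(value) < 1:
        problems.append("expected_languages needs at least one entry (minItems: 1)")
    if any(not isinstance(item, str) for item in value):
        problems.append("every entry must be a string")
    elif len(set(value)) != len(value):
        # Only reached when all entries are strings, so set() is safe here.
        problems.append("entries must be unique (uniqueItems: true)")
    return problems


print(validate_expected_languages(["en_us", "es"]))  # passes: no problems
print(validate_expected_languages(["es", "es"]))     # duplicate entry
print(validate_expected_languages("es"))             # not an array
```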

Comment on lines +1270 to +1274
            x-label: Fallback language
            description: |
              If the detected language of the audio file is not in the list of expected languages, the `fallback_language` is used. Specify `["auto"]` to let our model choose the fallback language from `expected_languages` with the highest confidence score.
            type: string
            default: "auto"

⚠️ Potential issue

Align description and type for fallback_language.

Description suggests an array value ["auto"] while the type is string. Use "auto" (string) or a language code; express the alternatives via anyOf, because "auto" matches both the const branch and the free-form string branch and would therefore fail a oneOf.

-              description: |
-                If the detected language of the audio file is not in the list of expected languages, the `fallback_language` is used. Specify `["auto"]` to let our model choose the fallback language from `expected_languages` with the highest confidence score.
-              type: string
-              default: "auto"
+              description: |
+                If the detected language is not in `expected_languages`, use this fallback. Specify "auto" to let the model choose the fallback from `expected_languages` with the highest confidence score.
+              anyOf:
+                - const: "auto"
+                - $ref: "#/components/schemas/TranscriptLanguageCode"
+                - type: string
+              default: "auto"
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
               x-label: Fallback language
-              description: |
-                If the detected language of the audio file is not in the list of expected languages, the `fallback_language` is used. Specify `["auto"]` to let our model choose the fallback language from `expected_languages` with the highest confidence score.
-              type: string
-              default: "auto"
+              description: |
+                If the detected language is not in `expected_languages`, use this fallback. Specify "auto" to let the model choose the fallback from `expected_languages` with the highest confidence score.
+              anyOf:
+                - const: "auto"
+                - $ref: "#/components/schemas/TranscriptLanguageCode"
+                - type: string
+              default: "auto"
🤖 Prompt for AI Agents
In src/libs/AssemblyAI/openapi.yaml around lines 1270 to 1274, the
fallback_language field's description refers to an array value (e.g. ["auto"])
but the schema type is string; keep the value a string, change the schema to an
anyOf that accepts the const "auto", the TranscriptLanguageCode reference, or a
free-form string, and update the description to say "auto" (a string) rather
than ["auto"].
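
The type mismatch flagged here is easy to trip over in client code, so a short guard may help. This is a sketch under the assumption that `fallback_language` stays a plain string as the schema declares; `check_fallback_language` is a hypothetical helper and does not verify that the value is a known language code.

```python
from typing import Any


def check_fallback_language(value: Any = "auto") -> str:
    """Return the value if it is a usable string, otherwise raise TypeError."""
    if isinstance(value, str) and value:
        return value
    if isinstance(value, list):
        # The original description suggested ["auto"], which is not a string.
        raise TypeError('fallback_language is a string; pass "auto", not ["auto"]')
    raise TypeError("fallback_language must be a non-empty string")


print(check_fallback_language())      # default
print(check_fallback_language("es"))  # explicit language code
try:
    check_fallback_language(["auto"])
except TypeError as exc:
    print(f"rejected: {exc}")
```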

@HavenDV HavenDV closed this Aug 27, 2025
auto-merge was automatically disabled August 27, 2025 11:18

Pull request was closed
