feat:OpenAPI: add RealtimeTurnDetection union; update refs, rename timeout event #224

HavenDV · 2025-09-11T18:19:48Z

Summary by CodeRabbit

New Features
- Added Semantic VAD turn detection with selectable eagerness levels.
- Unified turn detection into a reusable option selectable by type.
Changes
- Streamlined server-side VAD configuration and applied idle timeout limits (5–30s).
- Renamed the input_audio_buffer timeout event to a stable name; clients using the old event or inline configs may need updates.
Documentation
- Expanded turn detection guidance and clarified timeout behavior.
- Updated examples to match the new event shape and schema.

coderabbitai · 2025-09-11T18:19:56Z

Walkthrough

Defines a reusable RealtimeTurnDetection union schema (server_vad and semantic_vad) with a discriminator, updates all turn_detection references to use it, adds constraints and descriptions, renames a timeout event schema, and revises related metadata, descriptions, and examples in src/libs/tryAGI.OpenAI/openapi.yaml.

Changes

Cohort / File(s)	Change Summary
Turn detection schema refactor `src/libs/tryAGI.OpenAI/openapi.yaml`	Added `components/schemas/RealtimeTurnDetection` as a discriminated union with variants: Server VAD and Semantic VAD; set union-level `nullable: true`; replaced inline `turn_detection` objects with `$ref` to the new schema.
Server VAD constraints `src/libs/tryAGI.OpenAI/openapi.yaml`	Tightened `idle_timeout_ms` (min 5000, max 30000) and expanded descriptions; retained related fields under Server VAD variant.
Semantic VAD introduction `src/libs/tryAGI.OpenAI/openapi.yaml`	Added Semantic VAD variant with `type: semantic_vad`, `eagerness` enum, and fields like `create_response` and `interrupt_response`.
Event renaming and metadata `src/libs/tryAGI.OpenAI/openapi.yaml`	Renamed `RealtimeBetaServerEventInputAudioBufferTimeoutTriggered` to `RealtimeServerEventInputAudioBufferTimeoutTriggered`; updated `x-oaiMeta` keys and references.
Docs and examples updates `src/libs/tryAGI.OpenAI/openapi.yaml`	Revised descriptions to align with union semantics; updated `input_audio_buffer.timeout_triggered` example (field order and details).
General description edits `src/libs/tryAGI.OpenAI/openapi.yaml`	Trimmed and clarified long-form descriptions and truncation guidance to match new public schema structure.
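
As a rough illustration of how a client might consume the discriminated union summarized in the table above, here is a minimal Python sketch. The class names and `parse_turn_detection` are hypothetical and not part of the generated SDK; the field names are the ones that appear in this PR's example payloads, and `eagerness` is kept as a plain string because its enum values are not listed here.

```python
from dataclasses import dataclass
from typing import Optional, Union


@dataclass
class ServerVad:
    # Field names mirror the server_vad example payload shown in this PR.
    threshold: Optional[float] = None
    prefix_padding_ms: Optional[int] = None
    silence_duration_ms: Optional[int] = None
    idle_timeout_ms: Optional[int] = None
    create_response: Optional[bool] = None
    interrupt_response: Optional[bool] = None


@dataclass
class SemanticVad:
    # Enum values for eagerness are not listed in this PR, so no validation here.
    eagerness: Optional[str] = None
    create_response: Optional[bool] = None
    interrupt_response: Optional[bool] = None


TurnDetection = Union[ServerVad, SemanticVad, None]

# Discriminator value -> variant class (hypothetical client-side mapping).
_VARIANTS = {"server_vad": ServerVad, "semantic_vad": SemanticVad}


def parse_turn_detection(payload: Optional[dict]) -> TurnDetection:
    """Pick the variant from the `type` discriminator; None means detection is off."""
    if payload is None:
        return None
    kind = payload.get("type")
    if kind not in _VARIANTS:
        raise ValueError(f"unknown turn_detection type: {kind!r}")
    fields = {k: v for k, v in payload.items() if k != "type"}
    return _VARIANTS[kind](**fields)
```

Dispatching on `type` before reading any other field keeps the two variants from being confused when they share fields such as `create_response`.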

Sequence Diagram(s)

sequenceDiagram
  autonumber
  participant C as Client
  participant R as Realtime API
  participant TD as Turn Detection (discriminated)

  Note over C,R: Session setup with reusable turn_detection schema
  C->>R: Create/Configure session { turn_detection: { type, ... } }
  R->>TD: Initialize turn detection via discriminator on type
  TD-->>R: Active variant (server_vad or semantic_vad)

  Note over C,R: Audio streaming and detection
  C->>R: Stream input_audio_buffer
  R->>TD: Analyze audio and timings
  alt Silence/Idle reached
    TD-->>R: Timeout condition met
    R-->>C: input_audio_buffer.timeout_triggered (renamed schema)
  else Speech continues
    TD-->>R: No timeout
    R-->>C: Continue streaming/responses as configured
  end
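
The client side of the flow in the diagram could look roughly like the following Python sketch. It only assumes that each server message is a JSON object with a `type` field and that the timeout event arrives as `input_audio_buffer.timeout_triggered`, as shown above; `make_dispatcher` and the handler wiring are hypothetical names for illustration.

```python
import json
from typing import Callable, Dict

Handler = Callable[[dict], None]


def make_dispatcher(handlers: Dict[str, Handler]) -> Callable[[str], bool]:
    """Build a function that routes a raw JSON message to a handler by its `type`."""

    def dispatch(raw_message: str) -> bool:
        event = json.loads(raw_message)
        handler = handlers.get(event.get("type"))
        if handler is None:
            return False  # no handler registered for this event type
        handler(event)
        return True

    return dispatch


# Usage: collect timeout events in a list.
seen = []
dispatch = make_dispatcher({"input_audio_buffer.timeout_triggered": seen.append})
handled = dispatch('{"type": "input_audio_buffer.timeout_triggered"}')
```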

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Pre-merge checks (2 passed, 1 inconclusive)

❌ Failed checks (1 inconclusive)

Check name	Status	Explanation	Resolution
Title Check	❓ Inconclusive	The current title "feat:@coderabbitai" is vague and does not describe the primary changes in this PR (the new RealtimeTurnDetection union, refactor of turn_detection references, and the timeout event rename), so it does not provide a concise summary a reviewer can scan to understand the main change. It therefore fails the guideline requiring a short, specific single-sentence title.	Please replace the title with a short, specific sentence that highlights the main change (for example: "feat(realtime): add RealtimeTurnDetection union and rename timeout event" or "refactor(openapi): consolidate turn_detection into RealtimeTurnDetection and rename timeout schema"); avoid mentions/emojis and keep it focused on the primary API change so the check can be re-evaluated.

✅ Passed checks (2 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.
Docstring Coverage	✅ Passed	No functions found in the changes. Docstring coverage check skipped.

Poem

I twitch my ears at schemas new,
Two paths to turns—semantic, VAD—woo-hoo! 🥕
A timeout renamed, neat and tidy,
Discriminator guides me spry and spritely.
I thump approval, hop along—
Realtime’s rhythm, my review song.

coderabbitai

Actionable comments posted: 3

🧹 Nitpick comments (4)

src/libs/tryAGI.OpenAI/openapi.yaml (4)

23050-23050: Tighten wording for offsets (clarity).

Minor phrasing tweaks improve precision and reduce ambiguity.

-          description: Millisecond offset of audio written to the input audio buffer at the time the timeout was triggered.
+          description: Millisecond offset, from the start of the input audio buffer, of the last audio written when the timeout was triggered.

-          description: Millisecond offset of audio written to the input audio buffer that was after the playback time of the last model response.
+          description: Millisecond offset, from the start of the input audio buffer, of the first audio written after playback of the last model response.

Also applies to: 23053-23053

23065-23065: Polish description; small grammar/flow fixes.

Optional edits for readability; semantics unchanged.

-There may be speech
-that didn't trigger VAD but is still detected by the model, so the model may respond with
-something relevant to the conversation or a prompt to continue speaking.
+There may be speech that didn't trigger VAD but is still detected by the model, so the model may respond with something relevant to the conversation or a prompt to continue speaking.

24911-24919: Add defaults to match docs.

Docs mention defaults for prefix_padding_ms (300ms) and silence_duration_ms (500ms); encode them as defaults for stronger contracts.

             prefix_padding_ms:
               type: integer
               description: "Used only for `server_vad` mode. Amount of audio to include before the VAD detected speech (in milliseconds). Defaults to 300ms.\n"
+              default: 300
             silence_duration_ms:
               type: integer
               description: "Used only for `server_vad` mode. Duration of silence to detect speech stop (in milliseconds). Defaults to 500ms. With shorter values the model will respond more quickly, but may jump in on short pauses from the user.\n"
+              default: 500

24957-24958: Consider discriminator mappings for robust codegen.

Inline anyOf variants + discriminator work, but some generators (incl. Stainless in strict modes) prefer explicit mappings to component refs. Optional: extract variants into named component schemas and add discriminator.mapping.

Example structure:

components:
  schemas:
    RealtimeTurnDetectionServerVAD: { ... }
    RealtimeTurnDetectionSemanticVAD: { ... }
    RealtimeTurnDetection:
      anyOf:
        - $ref: '#/components/schemas/RealtimeTurnDetectionServerVAD'
        - $ref: '#/components/schemas/RealtimeTurnDetectionSemanticVAD'
      discriminator:
        propertyName: type
        mapping:
          server_vad: '#/components/schemas/RealtimeTurnDetectionServerVAD'
          semantic_vad: '#/components/schemas/RealtimeTurnDetectionSemanticVAD'
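
If the explicit mapping suggested above were adopted, a small consistency check could catch mapping entries that point at missing or unlisted schemas. The Python sketch below works on an already-parsed spec (a plain dict); `check_discriminator_mapping` is a hypothetical helper, not part of any OpenAPI tool.

```python
def check_discriminator_mapping(spec: dict, schema_name: str) -> list:
    """Return a list of problems; an empty list means the mapping is consistent."""
    schemas = spec.get("components", {}).get("schemas", {})
    union = schemas[schema_name]
    # Refs that the union actually lists as variants.
    variant_refs = {v["$ref"] for v in union.get("anyOf", []) if "$ref" in v}
    problems = []
    mapping = union.get("discriminator", {}).get("mapping", {})
    for value, ref in mapping.items():
        target = ref.rsplit("/", 1)[-1]  # last path segment is the schema name
        if target not in schemas:
            problems.append(f"{value}: {ref} is not defined")
        elif ref not in variant_refs:
            problems.append(f"{value}: {ref} is not listed under anyOf")
    return problems
```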

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 73d0cdd and 4abce5a.

⛔ Files ignored due to path filters (43)

src/libs/tryAGI.OpenAI/Generated/tryAGI.OpenAI..JsonSerializerContext.g.cs is excluded by !**/generated/**
src/libs/tryAGI.OpenAI/Generated/tryAGI.OpenAI.JsonConverters.RealtimeSessionCreateRequestGAAudioInputTurnDetectionEagerness.g.cs is excluded by !**/generated/**
src/libs/tryAGI.OpenAI/Generated/tryAGI.OpenAI.JsonConverters.RealtimeSessionCreateRequestGAAudioInputTurnDetectionEagernessNullable.g.cs is excluded by !**/generated/**
src/libs/tryAGI.OpenAI/Generated/tryAGI.OpenAI.JsonConverters.RealtimeSessionCreateResponseGAAudioInputTurnDetectionEagerness.g.cs is excluded by !**/generated/**
src/libs/tryAGI.OpenAI/Generated/tryAGI.OpenAI.JsonConverters.RealtimeSessionCreateResponseGAAudioInputTurnDetectionEagernessNullable.g.cs is excluded by !**/generated/**
src/libs/tryAGI.OpenAI/Generated/tryAGI.OpenAI.JsonConverters.RealtimeTranscriptionSessionCreateRequestGAAudioInputTurnDetectionEagerness.g.cs is excluded by !**/generated/**
src/libs/tryAGI.OpenAI/Generated/tryAGI.OpenAI.JsonConverters.RealtimeTranscriptionSessionCreateRequestGAAudioInputTurnDetectionEagernessNullable.g.cs is excluded by !**/generated/**
src/libs/tryAGI.OpenAI/Generated/tryAGI.OpenAI.JsonConverters.RealtimeTranscriptionSessionCreateRequestGAAudioInputTurnDetectionType.g.cs is excluded by !**/generated/**
src/libs/tryAGI.OpenAI/Generated/tryAGI.OpenAI.JsonConverters.RealtimeTranscriptionSessionCreateRequestGAAudioInputTurnDetectionTypeNullable.g.cs is excluded by !**/generated/**
src/libs/tryAGI.OpenAI/Generated/tryAGI.OpenAI.JsonConverters.RealtimeTurnDetection.g.cs is excluded by !**/generated/**
src/libs/tryAGI.OpenAI/Generated/tryAGI.OpenAI.JsonConverters.RealtimeTurnDetectionVariant1Type.g.cs is excluded by !**/generated/**
src/libs/tryAGI.OpenAI/Generated/tryAGI.OpenAI.JsonConverters.RealtimeTurnDetectionVariant1TypeNullable.g.cs is excluded by !**/generated/**
src/libs/tryAGI.OpenAI/Generated/tryAGI.OpenAI.JsonConverters.RealtimeTurnDetectionVariant2Eagerness.g.cs is excluded by !**/generated/**
src/libs/tryAGI.OpenAI/Generated/tryAGI.OpenAI.JsonConverters.RealtimeTurnDetectionVariant2EagernessNullable.g.cs is excluded by !**/generated/**
src/libs/tryAGI.OpenAI/Generated/tryAGI.OpenAI.JsonConverters.RealtimeTurnDetectionVariant2Type.g.cs is excluded by !**/generated/**
src/libs/tryAGI.OpenAI/Generated/tryAGI.OpenAI.JsonConverters.RealtimeTurnDetectionVariant2TypeNullable.g.cs is excluded by !**/generated/**
src/libs/tryAGI.OpenAI/Generated/tryAGI.OpenAI.JsonSerializerContextTypes.g.cs is excluded by !**/generated/**
src/libs/tryAGI.OpenAI/Generated/tryAGI.OpenAI.Models.RealtimeBetaServerEventInputAudioBufferTimeoutTriggered.g.cs is excluded by !**/generated/**
src/libs/tryAGI.OpenAI/Generated/tryAGI.OpenAI.Models.RealtimeBetaServerEventInputAudioBufferTimeoutTriggeredType.g.cs is excluded by !**/generated/**
src/libs/tryAGI.OpenAI/Generated/tryAGI.OpenAI.Models.RealtimeServerEventInputAudioBufferTimeoutTriggered.g.cs is excluded by !**/generated/**
src/libs/tryAGI.OpenAI/Generated/tryAGI.OpenAI.Models.RealtimeSessionCreateRequestGAAudioInput.g.cs is excluded by !**/generated/**
src/libs/tryAGI.OpenAI/Generated/tryAGI.OpenAI.Models.RealtimeSessionCreateRequestGAAudioInputTurnDetection.g.cs is excluded by !**/generated/**
src/libs/tryAGI.OpenAI/Generated/tryAGI.OpenAI.Models.RealtimeSessionCreateRequestGAAudioInputTurnDetectionType.g.cs is excluded by !**/generated/**
src/libs/tryAGI.OpenAI/Generated/tryAGI.OpenAI.Models.RealtimeSessionCreateResponseGAAudioInput.g.cs is excluded by !**/generated/**
src/libs/tryAGI.OpenAI/Generated/tryAGI.OpenAI.Models.RealtimeSessionCreateResponseGAAudioInputTurnDetection.g.cs is excluded by !**/generated/**
src/libs/tryAGI.OpenAI/Generated/tryAGI.OpenAI.Models.RealtimeSessionCreateResponseGAAudioInputTurnDetectionEagerness.g.cs is excluded by !**/generated/**
src/libs/tryAGI.OpenAI/Generated/tryAGI.OpenAI.Models.RealtimeSessionCreateResponseGAAudioInputTurnDetectionType.g.cs is excluded by !**/generated/**
src/libs/tryAGI.OpenAI/Generated/tryAGI.OpenAI.Models.RealtimeTranscriptionSessionCreateRequestGAAudioInput.g.cs is excluded by !**/generated/**
src/libs/tryAGI.OpenAI/Generated/tryAGI.OpenAI.Models.RealtimeTranscriptionSessionCreateRequestGAAudioInputTurnDetectionEagerness.g.cs is excluded by !**/generated/**
src/libs/tryAGI.OpenAI/Generated/tryAGI.OpenAI.Models.RealtimeTranscriptionSessionCreateRequestGAAudioInputTurnDetectionType.g.cs is excluded by !**/generated/**
src/libs/tryAGI.OpenAI/Generated/tryAGI.OpenAI.Models.RealtimeTurnDetection.Json.g.cs is excluded by !**/generated/**
src/libs/tryAGI.OpenAI/Generated/tryAGI.OpenAI.Models.RealtimeTurnDetection.g.cs is excluded by !**/generated/**
src/libs/tryAGI.OpenAI/Generated/tryAGI.OpenAI.Models.RealtimeTurnDetectionDiscriminator.Json.g.cs is excluded by !**/generated/**
src/libs/tryAGI.OpenAI/Generated/tryAGI.OpenAI.Models.RealtimeTurnDetectionDiscriminator.g.cs is excluded by !**/generated/**
src/libs/tryAGI.OpenAI/Generated/tryAGI.OpenAI.Models.RealtimeTurnDetectionVariant1.Json.g.cs is excluded by !**/generated/**
src/libs/tryAGI.OpenAI/Generated/tryAGI.OpenAI.Models.RealtimeTurnDetectionVariant1.g.cs is excluded by !**/generated/**
src/libs/tryAGI.OpenAI/Generated/tryAGI.OpenAI.Models.RealtimeTurnDetectionVariant1Type.g.cs is excluded by !**/generated/**
src/libs/tryAGI.OpenAI/Generated/tryAGI.OpenAI.Models.RealtimeTurnDetectionVariant2.Json.g.cs is excluded by !**/generated/**
src/libs/tryAGI.OpenAI/Generated/tryAGI.OpenAI.Models.RealtimeTurnDetectionVariant2.g.cs is excluded by !**/generated/**
src/libs/tryAGI.OpenAI/Generated/tryAGI.OpenAI.Models.RealtimeTurnDetectionVariant2Eagerness.g.cs is excluded by !**/generated/**
src/libs/tryAGI.OpenAI/Generated/tryAGI.OpenAI.Models.RealtimeTurnDetectionVariant2Type.g.cs is excluded by !**/generated/**
src/libs/tryAGI.OpenAI/Generated/tryAGI.OpenAI.Models.ResponseProperties.g.cs is excluded by !**/generated/**
src/libs/tryAGI.OpenAI/Generated/tryAGI.OpenAI.Models.ResponsePropertiesTruncation.g.cs is excluded by !**/generated/**

📒 Files selected for processing (1)

src/libs/tryAGI.OpenAI/openapi.yaml (9 hunks)

🔇 Additional comments (6)

src/libs/tryAGI.OpenAI/openapi.yaml (6)
24535-24535: Consistent reuse of RealtimeTurnDetection.

Good alignment across schemas.

24726-24726: Consistent reuse (session schema).

LGTM.

24901-24906: Confirm idle_timeout_ms limits align with server validation.

Min/max (5000–30000 ms) look reasonable; please confirm backend enforces the same to avoid SDK/server drift.
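
Until backend behaviour is confirmed, a client could mirror the documented range locally. The following Python sketch assumes the 5000 to 30000 ms bounds from the schema and treats `None` as "no idle timeout", based on the `"idle_timeout_ms": null` example elsewhere in this PR; `validate_idle_timeout_ms` is a hypothetical helper.

```python
from typing import Optional

IDLE_TIMEOUT_MIN_MS = 5000   # lower bound stated in the schema
IDLE_TIMEOUT_MAX_MS = 30000  # upper bound stated in the schema


def validate_idle_timeout_ms(value: Optional[int]) -> Optional[int]:
    """Return the value unchanged if it is None or inside the documented range."""
    if value is None:
        return None
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(value, bool) or not isinstance(value, int):
        raise TypeError("idle_timeout_ms must be an integer or None")
    if not IDLE_TIMEOUT_MIN_MS <= value <= IDLE_TIMEOUT_MAX_MS:
        raise ValueError(
            f"idle_timeout_ms must be between {IDLE_TIMEOUT_MIN_MS} and {IDLE_TIMEOUT_MAX_MS}"
        )
    return value
```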

24956-24958: nullable in OAS 3.1 vs 3.0 — verify spec version and adjust if needed.

If this spec is OAS 3.1, prefer JSON Schema nullability (type: [object, "null"] / oneOf) over nullable: true. If OAS 3.0, current usage is fine.

Proposed 3.1-style tweak:
-      nullable: true
+      type: [object, "null"]
-              nullable: true
+              oneOf:
+                - type: integer
+                - type: 'null'
Also applies to: 24906-24906
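
For reference, the 3.0 to 3.1 nullability rewrite proposed above can be expressed as a small transformation on a parsed schema. This Python sketch is an assumption-laden illustration: `to_oas31_nullable` is a hypothetical helper, it handles a single schema object only (no recursion into properties), and it wraps untyped schemas such as unions in `oneOf` with a `null` branch.

```python
import copy


def to_oas31_nullable(schema: dict) -> dict:
    """Rewrite OAS 3.0 `nullable: true` into JSON Schema style nullability."""
    result = copy.deepcopy(schema)
    if not result.pop("nullable", False):
        return result  # not nullable: only the keyword (if any) is dropped
    declared = result.get("type")
    if isinstance(declared, str):
        result["type"] = [declared, "null"]
    elif isinstance(declared, list):
        if "null" not in declared:
            result["type"] = declared + ["null"]
    else:
        # No `type` keyword (for example an anyOf union): add a null branch.
        result = {"oneOf": [result, {"type": "null"}]}
    return result
```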

26578-26578: LGTM on truncation description.

Clearer behavior explanation without changing semantics.

33957-33959: Rename propagation check — no remaining Beta references.
Search for RealtimeBetaServerEventInputAudioBufferTimeoutTriggered returned no matches; only input_audio_buffer.timeout_triggered appears in openapi.yaml and generated code.

coderabbitai · 2025-09-11T18:27:52Z

src/libs/tryAGI.OpenAI/openapi.yaml

            python: "from openai import OpenAI\n\nclient = OpenAI(\n    api_key=\"My API Key\",\n)\nclient_secret = client.realtime.client_secrets.create()\nprint(client_secret.expires_at)"
            ruby: "require \"openai\"\n\nopenai = OpenAI::Client.new(api_key: \"My API Key\")\n\nclient_secret = openai.realtime.client_secrets.create\n\nputs(client_secret)"
-          response: "{\n  \"value\": \"ek_68af296e8e408191a1120ab6383263c2\",\n  \"expires_at\": 1756310470,\n  \"session\": {\n    \"type\": \"realtime\",\n    \"object\": \"realtime.session\",\n    \"id\": \"sess_C9CiUVUzUzYIssh3ELY1d\",\n    \"model\": \"gpt-realtime\",\n    \"output_modalities\": [\n      \"audio\"\n    ],\n    \"instructions\": \"You are a friendly assistant.\",\n    \"tools\": [],\n    \"tool_choice\": \"auto\",\n    \"max_output_tokens\": \"inf\",\n    \"tracing\": null,\n    \"truncation\": \"auto\",\n    \"prompt\": null,\n    \"expires_at\": 0,\n    \"audio\": {\n      \"input\": {\n        \"format\": {\n          \"type\": \"audio/pcm\",\n          \"rate\": 24000\n        },\n        \"transcription\": null,\n        \"noise_reduction\": null,\n        \"turn_detection\": {\n          \"type\": \"server_vad\",\n          \"threshold\": 0.5,\n          \"prefix_padding_ms\": 300,\n          \"silence_duration_ms\": 200,\n          \"idle_timeout_ms\": null,\n          \"create_response\": true,\n          \"interrupt_response\": true\n        }\n      },\n      \"output\": {\n        \"format\": {\n          \"type\": \"audio/pcm\",\n          \"rate\": 24000\n        },\n        \"voice\": \"alloy\",\n        \"speed\": 1.0\n      }\n    },\n    \"include\": null\n  }\n}\n"
+          response: "{\n  \"value\": \"ek_68af296e8e408191a1120ab6383263c2\",\n  \"expires_at\": 1756310470,\n  \"session\": {\n    \"type\": \"realtime\",\n    \"object\": \"realtime.session\",\n    \"id\": \"sess_C9CiUVUzUzYIssh3ELY1d\",\n    \"model\": \"gpt-realtime\",\n    \"output_modalities\": [\n      \"audio\"\n    ],\n    \"instructions\": \"You are a friendly assistant.\",\n    \"tools\": [],\n    \"tool_choice\": \"auto\",\n    \"max_output_tokens\": \"inf\",\n    \"tracing\": null,\n    \"truncation\": \"auto\",\n    \"prompt\": null,\n    \"expires_at\": 0,\n    \"audio\": {\n      \"input\": {\n        \"format\": {\n          \"type\": \"audio/pcm\",\n          \"rate\": 24000\n        },\n        \"transcription\": null,\n        \"noise_reduction\": null,\n        \"turn_detection\": {\n          \"type\": \"server_vad\",\n        }\n      },\n      \"output\": {\n        \"format\": {\n          \"type\": \"audio/pcm\",\n          \"rate\": 24000\n        },\n        \"voice\": \"alloy\",\n        \"speed\": 1.0\n      }\n    },\n    \"include\": null\n  }\n}\n"


⚠️ Potential issue

Fix invalid JSON in example: trailing comma in turn_detection block.

The response example has a trailing comma after "server_vad", making it invalid JSON.

Apply this diff inside the stringified JSON:

-        "turn_detection": {
-          "type": "server_vad",
-        }
+        "turn_detection": {
+          "type": "server_vad"
+        }

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

response: "{\n \"value\": \"ek_68af296e8e408191a1120ab6383263c2\",\n \"expires_at\": 1756310470,\n \"session\": {\n \"type\": \"realtime\",\n \"object\": \"realtime.session\",\n \"id\": \"sess_C9CiUVUzUzYIssh3ELY1d\",\n \"model\": \"gpt-realtime\",\n \"output_modalities\": [\n \"audio\"\n ],\n \"instructions\": \"You are a friendly assistant.\",\n \"tools\": [],\n \"tool_choice\": \"auto\",\n \"max_output_tokens\": \"inf\",\n \"tracing\": null,\n \"truncation\": \"auto\",\n \"prompt\": null,\n \"expires_at\": 0,\n \"audio\": {\n \"input\": {\n \"format\": {\n \"type\": \"audio/pcm\",\n \"rate\": 24000\n },\n \"transcription\": null,\n \"noise_reduction\": null,\n \"turn_detection\": {\n \"type\": \"server_vad\",\n }\n },\n \"output\": {\n \"format\": {\n \"type\": \"audio/pcm\",\n \"rate\": 24000\n },\n \"voice\": \"alloy\",\n \"speed\": 1.0\n }\n },\n \"include\": null\n }\n}\n"

"turn_detection": {

"type": "server_vad"

}

🤖 Prompt for AI Agents

In src/libs/tryAGI.OpenAI/openapi.yaml around line 5468, the example JSON response inside the string contains a trailing comma after the "server_vad" field in the turn_detection object, making it invalid JSON; remove that trailing comma (ensure items inside turn_detection are comma-separated only when followed by another key), validate the entire stringified JSON with a JSON linter/validator, and update the YAML example so the JSON parses correctly.
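
The validation step mentioned above can be reproduced with Python's standard `json` module, which rejects trailing commas. The snippet is a minimal sketch using a trimmed-down version of the example payload; `is_valid_json` is a hypothetical helper name.

```python
import json

# Trimmed-down versions of the example before and after the fix.
broken = '{"turn_detection": {"type": "server_vad",}}'
fixed = '{"turn_detection": {"type": "server_vad"}}'


def is_valid_json(text: str) -> bool:
    """Return True if the text parses as strict JSON."""
    try:
        json.loads(text)
    except json.JSONDecodeError:
        return False
    return True
```

Running a check like this over every stringified example in the spec would surface similar issues before they reach generated documentation.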

coderabbitai · 2025-09-11T18:27:53Z

src/libs/tryAGI.OpenAI/openapi.yaml

-                      description: "Type of turn detection.\n"
-                      default: server_vad
-                  description: "Configuration for turn detection, ether Server VAD or Semantic VAD. This can be set to `null` to turn off, in which case the client must manually trigger model response.\nServer VAD means that the model will detect the start and end of speech based on audio volume and respond at the end of user speech.\nSemantic VAD is more advanced and uses a turn detection model (in conjunction with VAD) to semantically estimate whether the user has finished speaking, then dynamically sets a timeout based on this probability. For example, if user audio trails off with \"uhhm\", the model will score a low probability of turn end and wait longer for the user to continue speaking. This can be useful for more natural conversations, but may have a higher latency.\n"
+                  $ref: '#/components/schemas/RealtimeTurnDetection'


💡 Verification agent

🧩 Analysis chain

Refactor to shared schema looks good.

turn_detection now references the reusable RealtimeTurnDetection. Nice consolidation.

Run to ensure no stray inline turn_detection definitions remain:

🏁 Script executed:

#!/bin/bash
rg -nP -C2 'turn_detection:\s*(\n\s+(?!\$ref).*){1,10}' src/libs/tryAGI.OpenAI/openapi.yaml

Length of output: 83

🏁 Script executed:

#!/bin/bash
set -euo pipefail
file="src/libs/tryAGI.OpenAI/openapi.yaml"
if [ ! -f "$file" ]; then
  echo "FILE_NOT_FOUND:$file"
  exit 0
fi

echo "1) grep for 'turn_detection:' with context:"
rg -n -C3 '^\s*turn_detection\s*:' "$file" || true
echo

echo "2) For each occurrence show up to 12 following lines and whether \$ref appears in that block:"
awk '
/turn_detection:/{
  start=NR
  print "MATCH_AT_LINE:" start
  block=$0 "\n"
  for(i=1;i<=12;i++){
    if(getline){ block = block $0 "\n" } else break
  }
  print block
  if(block ~ /\$ref/){ print "HAS_REF: true" } else { print "HAS_REF: false" }
  print "----"
}
' "$file" || true
echo

echo "3) occurrences of 'RealtimeTurnDetection' (should exist if ref points to it):"
rg -n "RealtimeTurnDetection" "$file" || true
echo

echo "4) scan repo for any other 'turn_detection:' occurrences (all files):"
rg -n --hidden -S '^\s*turn_detection\s*:' || true

Length of output: 15140

Replace remaining inline turn_detection definitions with $ref to RealtimeTurnDetection

Mixed usage found — convert the inline schema definitions to reference '#/components/schemas/RealtimeTurnDetection' or document why they must differ.

src/libs/tryAGI.OpenAI/openapi.yaml — inline schemas at lines: 24051, 24225, 24400, 24491, 24684, 24772, 24817.

src/libs/tryAGI.OpenAI/openapi.yaml — refs present at lines: 24270, 24535, 24726; component RealtimeTurnDetection at 24889.

src/libs/tryAGI.OpenAI/openapi.yaml — inline example/instance at 31288 (example can remain if intended).

src/libs/tryAGI.OpenAI/openapi.realtime.yaml — occurrences at lines: 138, 433.
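
A structural check is less fragile than line-based matching for finding the remaining inline definitions. The Python sketch below walks an already-parsed spec (a nested dict/list, however it was loaded) and reports every `turn_detection` key whose value is an object without `$ref`; `find_inline_turn_detection` is a hypothetical helper. Example payloads that contain a `turn_detection` object would be reported too, so results still need a manual look.

```python
def find_inline_turn_detection(node, path=""):
    """Return slash-separated paths of `turn_detection` objects that lack `$ref`."""
    hits = []
    if isinstance(node, dict):
        for key, value in node.items():
            child = f"{path}/{key}"
            if key == "turn_detection" and isinstance(value, dict) and "$ref" not in value:
                hits.append(child)
            # Keep descending: nested schemas may contain further occurrences.
            hits.extend(find_inline_turn_detection(value, child))
    elif isinstance(node, list):
        for index, item in enumerate(node):
            hits.extend(find_inline_turn_detection(item, f"{path}/{index}"))
    return hits
```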

coderabbitai · 2025-09-11T18:27:53Z

src/libs/tryAGI.OpenAI/openapi.yaml

+              type: string
+              description: "Type of turn detection, `semantic_vad` to turn on Semantic VAD.\n"
+          description: Server-side semantic turn detection which uses a model to determine when the user has finished speaking.
+      description: "Configuration for turn detection, ether Server VAD or Semantic VAD. This can be set to `null` to turn off, in which case the client must manually trigger model response.\n\nServer VAD means that the model will detect the start and end of speech based on audio volume and respond at the end of user speech.\n\nSemantic VAD is more advanced and uses a turn detection model (in conjunction with VAD) to semantically estimate whether the user has finished speaking, then dynamically sets a timeout based on this probability. For example, if user audio trails off with \"uhhm\", the model will score a low probability of turn end and wait longer for the user to continue speaking. This can be useful for more natural conversations, but may have a higher latency.\n"


⚠️ Potential issue

Fix typo: “ether” → “either”.

-      description: "Configuration for turn detection, ether Server VAD or Semantic VAD. This can be set to `null` to turn off, in which case the client must manually trigger model response.
+      description: "Configuration for turn detection, either Server VAD or Semantic VAD. This can be set to `null` to turn off, in which case the client must manually trigger model response.

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

description: "Configuration for turn detection, ether Server VAD or Semantic VAD. This can be set to `null` to turn off, in which case the client must manually trigger model response.\n\nServer VAD means that the model will detect the start and end of speech based on audio volume and respond at the end of user speech.\n\nSemantic VAD is more advanced and uses a turn detection model (in conjunction with VAD) to semantically estimate whether the user has finished speaking, then dynamically sets a timeout based on this probability. For example, if user audio trails off with \"uhhm\", the model will score a low probability of turn end and wait longer for the user to continue speaking. This can be useful for more natural conversations, but may have a higher latency.\n"

description: "Configuration for turn detection, either Server VAD or Semantic VAD. This can be set to `null` to turn off, in which case the client must manually trigger model response."

🤖 Prompt for AI Agents

In src/libs/tryAGI.OpenAI/openapi.yaml around line 24955, there's a typo using "ether" where the correct word is "either"; update the text to replace "ether" with "either" to fix the spelling error while preserving surrounding punctuation and formatting.

feat: Updated OpenAPI spec

4abce5a

github-actions bot approved these changes Sep 11, 2025

HavenDV merged commit 89c6575 into main Sep 11, 2025
2 of 4 checks passed

HavenDV deleted the bot/update-openapi_202509111819 branch September 11, 2025 18:20

coderabbitai bot changed the title ~~feat:@coderabbitai~~ feat:OpenAPI: add RealtimeTurnDetection union; update refs, rename timeout event Sep 11, 2025

coderabbitai bot reviewed Sep 11, 2025