Audio chunk too short #191

@martrdev

Hello,

I've recently used the openai-whisper cookbook example to add the Speech-to-Text function to my app.
I've noticed that the audio is transcribed in small chunks while recording is still in progress, rather than after the user stops the recording.

This produces a low-quality transcription because each small audio chunk is transcribed on its own. Sometimes only half of a sentence is transcribed and the rest is never completed (see attached image).

Is there a way to configure it so that the audio is captured in full and transcription starts only when the user manually stops the recording?
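
To make the request concrete, here is a rough sketch of the behaviour being asked for: chunks are only buffered while recording, and the transcriber is called once, on stop. This is not the cookbook's code. `RecordingBuffer`, `add_chunk`, and `transcribe_fn` are hypothetical names, and `transcribe_fn` stands in for whatever call actually sends audio to Whisper.

```python
from typing import Callable, List


class RecordingBuffer:
    """Collects raw audio chunks and transcribes them once, on stop().

    Hypothetical helper for illustration only; transcribe_fn is a
    stand-in for the real speech-to-text call.
    """

    def __init__(self, transcribe_fn: Callable[[bytes], str]) -> None:
        self._transcribe_fn = transcribe_fn
        self._chunks: List[bytes] = []
        self._recording = False

    def start(self) -> None:
        # Begin a fresh recording and discard any leftover audio.
        self._chunks.clear()
        self._recording = True

    def add_chunk(self, chunk: bytes) -> None:
        # Only buffer the chunk; no transcription happens here.
        if self._recording:
            self._chunks.append(chunk)

    def stop(self) -> str:
        # Join everything captured so far and transcribe it in one call.
        self._recording = False
        audio = b"".join(self._chunks)
        self._chunks.clear()
        return self._transcribe_fn(audio) if audio else ""
```

In an app, the "start" and "stop" buttons would call `start()` and `stop()`, and the audio callback would call `add_chunk()` for every incoming chunk, so the model always sees the whole utterance.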

I wouldn't mind if the recording stopped automatically after a few seconds of silence, but at the moment that feature doesn't seem to work very well (I tried raising SILENCE_TIMEOUT to 4000.0 without much success), so I'd rather stop it manually.
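
For comparison, a silence-based auto-stop is often built roughly like the sketch below. It is an illustration under stated assumptions, not the example's actual implementation: it assumes 16-bit signed mono PCM chunks in native byte order, that the timeout is expressed in milliseconds, and an arbitrary loudness threshold of 500. `SilenceDetector`, `feed`, and `rms` are hypothetical names.

```python
import array
import math


def rms(chunk: bytes) -> float:
    """Root-mean-square level of a chunk of 16-bit signed PCM (assumed format)."""
    samples = array.array("h")
    # Drop a trailing odd byte so the buffer is a whole number of samples.
    samples.frombytes(chunk[: len(chunk) - len(chunk) % 2])
    if not samples:
        return 0.0
    return math.sqrt(sum(s * s for s in samples) / len(samples))


class SilenceDetector:
    """Reports when continuous silence has lasted at least timeout_ms."""

    def __init__(self, timeout_ms: float = 4000.0, threshold: float = 500.0) -> None:
        self.timeout_ms = timeout_ms  # assumed to be milliseconds
        self.threshold = threshold    # assumed loudness cut-off, needs tuning
        self._silent_ms = 0.0

    def feed(self, chunk: bytes, chunk_ms: float) -> bool:
        # Any chunk at or above the threshold resets the silence counter.
        if rms(chunk) >= self.threshold:
            self._silent_ms = 0.0
        else:
            self._silent_ms += chunk_ms
        return self._silent_ms >= self.timeout_ms
```

With this design the timeout only matters if the threshold sits above the background noise level. If it is too low, noise keeps resetting the counter and a larger timeout changes nothing.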

Image

Many thanks in advance.
