Open
Labels
enhancement (New feature or request)
Description
Feature Description
Persistent daemon mode that keeps the model in memory and provides lower end-to-end latency.
Problem Statement
When the model file is already in the page cache, transcription is acceptably fast. Latency could be far lower, though, if the model were kept resident in GPU memory between invocations.
Proposed Solution
Existing scripts and commands should keep working unchanged. Ideally there would be an optional background daemon: when it is running, the main command forwards the audio to it, and when it is not, the command behaves as it does today.
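To make the idea concrete, here is a rough sketch of how the forwarding could look. Everything in it is an assumption for illustration, not this project's actual code: the Unix domain socket transport, the 4-byte length-prefixed framing, and the names `make_daemon`, `transcribe`, and `load_and_run` are all hypothetical. The `model` argument is a stand-in callable for a model that was loaded once at daemon startup; no GPU is involved in the sketch.

```python
import os
import socket
import socketserver
import struct
import tempfile
import threading


def _recv_exact(sock, n):
    """Read exactly n bytes, or raise if the peer closes early."""
    buf = bytearray()
    while len(buf) < n:
        chunk = sock.recv(n - len(buf))
        if not chunk:
            raise ConnectionError("peer closed the connection early")
        buf.extend(chunk)
    return bytes(buf)


def send_msg(sock, payload):
    # Hypothetical framing: 4-byte big-endian length, then the payload.
    sock.sendall(struct.pack("!I", len(payload)) + payload)


def recv_msg(sock):
    (length,) = struct.unpack("!I", _recv_exact(sock, 4))
    return _recv_exact(sock, length)


def make_daemon(socket_path, model):
    """Build a server that reuses one already-loaded `model` per request.

    `model` is a placeholder callable: bytes in, transcript string out.
    """
    class Handler(socketserver.BaseRequestHandler):
        def handle(self):
            audio = recv_msg(self.request)
            send_msg(self.request, model(audio).encode("utf-8"))

    return socketserver.UnixStreamServer(socket_path, Handler)


def transcribe(audio, socket_path, load_and_run):
    """Forward to the daemon if it is listening, else use the normal path.

    `load_and_run` stands in for what the command does today
    (load the model, transcribe, exit).
    """
    sock = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    try:
        sock.connect(socket_path)
    except (FileNotFoundError, ConnectionRefusedError):
        # No socket file, or a stale one with no listener: daemon is not running.
        sock.close()
        return load_and_run(audio)
    with sock:
        send_msg(sock, audio)
        return recv_msg(sock).decode("utf-8")


if __name__ == "__main__":
    path = os.path.join(tempfile.mkdtemp(), "daemon.sock")
    cold = lambda audio: "cold path"

    print(transcribe(b"\x00" * 16, path, cold))  # daemon not running yet

    server = make_daemon(path, lambda audio: f"warm path, {len(audio)} bytes")
    threading.Thread(target=server.serve_forever, daemon=True).start()
    print(transcribe(b"\x00" * 16, path, cold))  # forwarded to the daemon

    server.shutdown()
    server.server_close()
```

With this shape the existing command line stays the same: the only user-visible change would be that invocations get faster while the daemon is up. A local socket also keeps the daemon unreachable from the network by default, though the real transport and protocol are open questions.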
Alternative Solutions
Use Case
- People with enough VRAM who transcribe often
Implementation Ideas
Additional Context