Open
Labels
good first issue (Good for newcomers)
Description
User story
As a user, I want to be able to use .en models so that I get better transcription performance on English audio.
Acceptance criteria
- The system should be able to download .en models if they do not already exist.
- The system should be able to utilize already downloaded .en models.
Development information
The model_handler.rs file contains the code responsible for downloading models based on their name.
A model is downloaded as follows:
- Instantiate the model handler:
    let m = model_handler::ModelHandler::new("tiny", "models/").await;
- The model handler then assigns the model name based on a hashmap:
    const MODEL_MAP: phf::Map<&'static str, &'static str> = phf::phf_map! {
        "tiny" => "ggml-tiny",
        "base" => "ggml-base",
        "small" => "ggml-small",
        "medium" => "ggml-medium",
        "large" => "ggml-large",
    };
    impl ModelHandler {
        pub async fn new(model_name: &str, models_dir: &str) -> ModelHandler {
            let model_handler = ModelHandler {
                model_name: MODEL_MAP
                    .get(&model_name.to_lowercase())
                    .copied()
                    .unwrap()
                    .to_string(),
                models_dir: models_dir.to_string(),
            };
- The download function uses this name to download the model:
    async fn download_model(&self) -> Result<(), Box<dyn std::error::Error>> {
        if !self.is_model_existing() {
            self.setup_directory()?;
        }
        let base_url = "https://huggingface.co/ggerganov/whisper.cpp/resolve/main";
        let response = reqwest::get(format!("{}/{}.bin", base_url, &self.model_name)).await?;
        let mut file =
            std::fs::File::create(format!("{}/{}.bin", &self.models_dir, &self.model_name))?;
        let mut content = std::io::Cursor::new(response.bytes().await?);
        std::io::copy(&mut content, &mut file)?;
        Ok(())
    }
Potential solution
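A possible solution would be to add the .en variants to the MODEL_MAP constant in the model_handler.rs file. For example, if the user instantiates the ModelHandler with "tiny.en", a mapping should exist for: "tiny.en" => "ggml-tiny.en". The mapped value has to match the file name in the ggerganov/whisper.cpp repository, because the download function appends ".bin" to it to build the URL (here ggml-tiny.en.bin).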
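The mapping idea can be sketched without touching the rest of the handler. The following is a minimal, dependency-free illustration and not the project's actual code: a plain `match` stands in for the `phf` map, and the helper names `resolve_model_name`, `model_url`, `model_path` and `needs_download` are hypothetical. It assumes the upstream files use dotted names such as `ggml-tiny.en.bin` (rather than a hyphenated stem) and that no English-only `large` model is published; both assumptions should be checked against the Hugging Face repository before merging.

```rust
use std::path::{Path, PathBuf};

/// Base URL copied from the `download_model` function shown above.
const BASE_URL: &str = "https://huggingface.co/ggerganov/whisper.cpp/resolve/main";

/// Hypothetical helper: maps a user-facing model name to the file stem.
/// Returns `None` for unknown names instead of panicking on `unwrap()`.
/// The `.en` stems are an assumption about the upstream file names.
fn resolve_model_name(name: &str) -> Option<&'static str> {
    match name.to_lowercase().as_str() {
        "tiny" => Some("ggml-tiny"),
        "tiny.en" => Some("ggml-tiny.en"),
        "base" => Some("ggml-base"),
        "base.en" => Some("ggml-base.en"),
        "small" => Some("ggml-small"),
        "small.en" => Some("ggml-small.en"),
        "medium" => Some("ggml-medium"),
        "medium.en" => Some("ggml-medium.en"),
        // Assumption: there is no English-only variant of the large model.
        "large" => Some("ggml-large"),
        _ => None,
    }
}

/// Builds the download URL the same way `download_model` does.
fn model_url(stem: &str) -> String {
    format!("{}/{}.bin", BASE_URL, stem)
}

/// Builds the local path of a model file inside `models_dir`.
fn model_path(models_dir: &str, stem: &str) -> PathBuf {
    Path::new(models_dir).join(format!("{}.bin", stem))
}

/// True when the model file is not on disk yet, so a download is required.
/// An already downloaded .en model is reused rather than fetched again.
fn needs_download(models_dir: &str, stem: &str) -> bool {
    !model_path(models_dir, stem).is_file()
}
```

With these helpers, `resolve_model_name("tiny.en")` yields the stem used for both the URL and the local file, so the two acceptance criteria reduce to calling `needs_download` before fetching `model_url(stem)`. Returning `Option` is a design choice of the sketch: it lets the caller report an unsupported model name instead of panicking.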