
on22y/AV2AV_granted_resources

 
 


This repository implements the AV-Hubert unit pretraining task and the UTUT pretraining/finetuning pipelines, with associated data and configurations.


▶️ Download Large Files

This project requires several large binary files that cannot be stored in this GitHub repository due to the 100MB file limit.


Please download them from Google Drive using gdown:

1. Install gdown: pip install gdown
2. Download the files:

```
gdown --id 1VRKEn_Kjw1TYzaPNi0dhcaLWaOxbd0Eg -O data/train.en-es.en.bin
gdown --id 1kIkYd0zkC-nt8VCL3Mfw3k7UZRQdjux6 -O data/train.en-es.es.bin
```

After downloading, make sure both files end up in the following directory, moving them there if the download placed them elsewhere:

AV2AV_granted_resources/utut_finetune/data/dataset_mbart_ft_bin_data/
  ├── (existing files...)
  ├── train.en-es.en.bin
  └── train.en-es.es.bin

These files exceed GitHub's 100MB file size limit and must be downloaded manually before running the code.
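
As a quick sanity check before running the code, the layout above can be verified with a small script. This is only a sketch: the helper name `missing_files` is hypothetical (it is not part of this repository), and it assumes it is called with the repository root as `repo_root`. The directory and file names are taken from the layout shown above.

```python
from pathlib import Path

# Directory and file names copied from the layout above.
DATA_DIR = Path("utut_finetune/data/dataset_mbart_ft_bin_data")
REQUIRED_FILES = ("train.en-es.en.bin", "train.en-es.es.bin")


def missing_files(repo_root="."):
    """Return the required files that are absent or empty under repo_root.

    Hypothetical helper for illustration; repo_root is assumed to be the
    top-level AV2AV_granted_resources directory.
    """
    target = Path(repo_root) / DATA_DIR
    missing = []
    for name in REQUIRED_FILES:
        path = target / name
        # Treat a zero-byte file as missing, since it suggests a failed download.
        if not path.is_file() or path.stat().st_size == 0:
            missing.append(path)
    return missing
```

Calling `missing_files()` from the repository root should return an empty list once both downloads are in place; any paths it returns still need to be downloaded or moved.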
