# DeepFense

DeepFense is a modular framework for building and training deepfake audio detection models. It provides a plug-and-play architecture where you can easily combine different Frontends (feature extractors), Backends (classifiers), and Loss Functions to create state-of-the-art detection systems.
- **Modular Architecture**: Swap components with a single config change
- **Configuration-Driven**: All experiments defined in YAML
- **Advanced Augmentations**: RawBoost, RIR, Codec, Noise, and more
- **Built-in Metrics**: EER, minDCF, F1-score, Accuracy
- **Simple CLI**: Train and test models from the command line
- **Recipes**: Pre-configured training setups and example models (see recipes)
New to DeepFense? Check out our recipes for pre-configured training setups and example models to get started quickly!
## Table of Contents

- Installation
- Understanding DeepFense Architecture
- Available Components
- Training Models
- Evaluating and Testing Models
- Data Preparation
- Extending DeepFense
- Using the CLI (Alternative)
- Complete Pipeline Flow
- Documentation
## Installation

```bash
# From source (recommended for development)
git clone https://github.com/Yaselley/deepfense-framework
cd deepfense-framework
pip install -e .

# Or from PyPI
pip install deepfense
```

See Installation Guide for detailed instructions.
## Understanding DeepFense Architecture

DeepFense uses a modular pipeline architecture:

```
Raw Audio → Frontend → Features → Backend → Embeddings → Loss → Scores
```
Key Components:
- Frontend: Extracts features from audio (Wav2Vec2, WavLM, HuBERT, etc.)
- Backend: Processes features into embeddings (AASIST, ECAPA-TDNN, MLP, etc.)
- Loss Function: Computes loss and scores (CrossEntropy, OC-Softmax, etc.)
See Architecture Overview for detailed architecture explanation, or Pipeline Flow for complete pipeline walkthrough.
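To make this flow concrete, here is a minimal PyTorch sketch of the Frontend → Backend → Loss composition. The classes below are illustrative stand-ins, not DeepFense's actual components:

```python
# Conceptual sketch of the DeepFense pipeline. TinyFrontend and TinyBackend
# are hypothetical stand-ins; the real components live in deepfense/models.
import torch
import torch.nn as nn

class TinyFrontend(nn.Module):
    """Stand-in feature extractor: raw waveform -> frame-level features."""
    def forward(self, wav):                  # wav: (batch, samples)
        return wav.unfold(1, 400, 160)       # naive framing -> (batch, frames, 400)

class TinyBackend(nn.Module):
    """Stand-in classifier backbone: features -> utterance embedding."""
    def __init__(self, in_dim=400, emb_dim=64):
        super().__init__()
        self.proj = nn.Linear(in_dim, emb_dim)
    def forward(self, feats):
        return self.proj(feats).mean(dim=1)  # average-pool over frames

frontend, backend = TinyFrontend(), TinyBackend()
head = nn.Linear(64, 2)                      # bonafide vs. spoof logits
loss_fn = nn.CrossEntropyLoss()              # loss/scoring stage

wav = torch.randn(8, 16000)                  # batch of 1-second 16 kHz clips
labels = torch.randint(0, 2, (8,))
logits = head(backend(frontend(wav)))
loss = loss_fn(logits, labels)
```

In DeepFense itself, each of these stages is selected by name in the YAML config rather than constructed by hand.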
## Available Components

DeepFense provides a modular set of components that can be mixed and matched:
- Frontends: Wav2Vec2, WavLM, HuBERT, EAT, MERT - See Frontends Documentation
- Backends: AASIST, ECAPA-TDNN, RawNet2, MLP, Nes2Net, TCM - See Backends Documentation
- Losses: CrossEntropy, OC-Softmax, AM-Softmax, A-Softmax - See Losses Documentation
- Augmentations: RawBoost, RIR, Codec, Noise, SpeedPerturb - See Augmentations Documentation
See Component Reference for complete details.
Looking for example configurations? Check out our recipes for pre-configured training setups and trained models.
## Training Models

Train models using Python scripts:

```bash
python train.py --config deepfense/config/train.yaml
```

Training creates an experiment directory with checkpoints, logs, and metrics.
Alternative: You can also use the CLI (see Using the CLI section below).
See Quick Start Guide for detailed instructions and Configuration Reference for all YAML parameters.
## Evaluating and Testing Models

Test a trained model using Python scripts:

```bash
python test.py \
    --config deepfense/config/train.yaml \
    --checkpoint outputs/my_experiment/best_model.pth
```

DeepFense computes metrics automatically (EER, minDCF, F1, ACC) and saves results to the experiment directory.
Alternative: You can also use the CLI (see Using the CLI section below).
See Evaluation & Inference Guide for details.
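As background on the metrics, here is a standalone illustration of computing the equal error rate (EER) from detection scores with scikit-learn. This is a generic sketch for intuition, not DeepFense's internal implementation:

```python
# Generic EER computation: the operating point where the false-positive
# rate equals the false-negative rate. Not DeepFense's internal code.
import numpy as np
from sklearn.metrics import roc_curve

def compute_eer(labels, scores):
    """labels: 1 = bonafide, 0 = spoof; scores: higher = more bonafide-like."""
    fpr, tpr, _ = roc_curve(labels, scores)
    fnr = 1 - tpr
    idx = np.nanargmin(np.abs(fnr - fpr))  # point where FPR ~= FNR
    return (fpr[idx] + fnr[idx]) / 2

labels = np.array([1, 1, 0, 0, 1, 0])
scores = np.array([0.9, 0.8, 0.3, 0.4, 0.7, 0.2])
print(f"EER = {compute_eer(labels, scores):.3f}")
```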
## Data Preparation

DeepFense uses Parquet files for dataset metadata. Each Parquet file should contain the following columns:

- `ID`: Unique identifier
- `path`: Path to the audio file
- `label`: Label string (`"bonafide"` or `"spoof"`)
- `dataset_name`: (Optional) Dataset identifier
Example:
```python
import pandas as pd

data = pd.DataFrame({
    "ID": ["sample_001", "sample_002"],
    "path": ["/path/to/audio1.flac", "/path/to/audio2.flac"],
    "label": ["bonafide", "spoof"],
})
data.to_parquet("train.parquet")
```

DeepFense applies transforms in two stages:
- Base Transforms (always): Padding, cropping, resampling
- Augmentations (training only): RawBoost, RIR, Noise, etc.
Critical: All audio must be padded/cropped to the same length for batching. Configure this in your YAML:
```yaml
data:
  train:
    base_transform:
      - type: "pad"
        args:
          max_len: 160000     # 10 seconds @ 16kHz
          random_pad: True    # Random crop if longer
          pad_type: "repeat"  # Repeat if shorter
```

See Data Transforms Guide for complete transform parameters, padding/cropping options, augmentations, and how to check/modify configurations.
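To illustrate what `max_len`, `random_pad`, and `pad_type: "repeat"` mean in practice, here is a plain-NumPy sketch of the pad/crop behavior (illustrative only; the framework's actual transforms live in `deepfense/data`):

```python
# Illustrative pad/crop logic, mirroring the YAML options above.
import numpy as np

def pad_or_crop(wav, max_len=160000, random_pad=True, rng=None):
    rng = rng or np.random.default_rng()
    if len(wav) >= max_len:
        # Longer than max_len: crop. random_pad=True picks a random start offset.
        start = rng.integers(0, len(wav) - max_len + 1) if random_pad else 0
        return wav[start:start + max_len]
    # Shorter than max_len with pad_type "repeat": tile until long enough.
    reps = int(np.ceil(max_len / len(wav)))
    return np.tile(wav, reps)[:max_len]

clip = pad_or_crop(np.random.randn(48000))  # 3 s in -> 10 s out
assert clip.shape == (160000,)
```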
## Extending DeepFense

DeepFense makes it easy to add custom components using the registry pattern (sketched after the list below). Each component type has a detailed guide:
- Adding Backends | Adding Frontends | Adding Losses
- Adding Datasets | Adding Augmentations
- Adding Optimizers | Adding Metrics | Adding Schedulers
See Extending DeepFense (Quick Reference) for a quick overview of all component types.
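As an illustration of the registry pattern (the decorator name and registry location here are hypothetical; see the guides above for the real API), a custom backend plugin might look like this:

```python
# Hypothetical sketch of a registry-based plugin; not DeepFense's actual API.
import torch.nn as nn

BACKEND_REGISTRY = {}

def register_backend(name):
    """Decorator that records a backend class under a string name."""
    def wrap(cls):
        BACKEND_REGISTRY[name] = cls
        return cls
    return wrap

@register_backend("my_mlp")
class MyMLP(nn.Module):
    def __init__(self, in_dim=400, emb_dim=64):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(in_dim, 128), nn.ReLU(),
                                 nn.Linear(128, emb_dim))
    def forward(self, x):
        return self.net(x)

# A config-driven framework can then build the component named in YAML:
backend = BACKEND_REGISTRY["my_mlp"](in_dim=400, emb_dim=64)
```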
## Using the CLI (Alternative)

DeepFense provides a CLI as an alternative to Python scripts. The CLI supports:

```bash
# Train a model (alternative to python train.py)
deepfense train --config config/train.yaml

# Test a model (alternative to python test.py)
deepfense test --config config/train.yaml --checkpoint outputs/exp/best_model.pth

# List available components
deepfense list
```

Note: The CLI currently supports training and testing existing models with different parameters. Future versions will support adding components via the CLI.
See CLI Reference for complete documentation.
## Complete Pipeline Flow

The DeepFense pipeline: Data → Transforms → Frontend → Backend → Loss → Training → Evaluation
See Pipeline Flow Documentation for the complete detailed pipeline with all stages, data shapes, and configuration flow.
## Documentation

### Getting Started

| Guide | Description |
|---|---|
| Installation | Full installation instructions |
| Quick Start | Train your first model in 5 minutes |
| Full Tutorial | Complete config-driven training guide |
| Architecture | How DeepFense works internally |
### Components

| Component | Documentation |
|---|---|
| Frontends | Wav2Vec2, WavLM, HuBERT, MERT, EAT |
| Backends | AASIST, ECAPA_TDNN, RawNet2, MLP, Pool, Nes2Net, TCM |
| Losses | CrossEntropy, OC-Softmax, AM-Softmax, A-Softmax |
| Augmentations | RawBoost, RIR, Codec, Noise, SpeedPerturb |
| Optimizers & Schedulers | Adam, SGD, CosineAnnealing, StepLR |
### Usage Guides

| Guide | Description |
|---|---|
| Training with CLI | How to train models using the CLI |
| Training Workflow | Detailed training loop explanation |
| Evaluation & Inference | Testing and deployment |
| Configuration Reference | All YAML parameters explained |
| Library Usage | Use DeepFense programmatically in Python |
### Extending DeepFense

| Guide | Description |
|---|---|
| Adding a New Backend | Step-by-step guide to create custom backends |
| Adding a New Frontend | Step-by-step guide to create custom frontends |
| Adding a New Loss | Step-by-step guide to create custom loss functions |
| Adding a New Dataset | Step-by-step guide to create custom datasets |
| Adding Augmentations | Step-by-step guide to create custom data augmentations |
| Adding Optimizers | Step-by-step guide to add custom optimizers |
| Adding Metrics | Step-by-step guide to add custom evaluation metrics |
| Adding Schedulers | Step-by-step guide to add custom learning rate schedulers |
| Extending DeepFense (Quick Reference) | Quick reference for all component types |
### CLI

| Guide | Description |
|---|---|
| CLI Reference | Complete CLI documentation |
### Resources

| Resource | Description |
|---|---|
| Recipes | Pre-configured training setups and example models |
## Project Structure

```
deepfense/
├── config/     # YAML configurations
├── data/       # Data handling & transforms
├── models/     # Frontends, backends, losses
├── training/   # Training loop & evaluation
├── utils/      # Registry & helpers
└── cli/        # Command-line interface
```
See Architecture Overview for detailed structure and component organization.
## Recipes

DeepFense provides example recipes (pre-configured training setups) to help you get started quickly. Each recipe includes:
- Complete configuration files
- Pre-trained model checkpoints (where available)
- Training scripts and evaluation results
- Documentation on architecture choices and hyperparameters
See the recipes folder for available recipes. Each recipe includes detailed README files explaining the configuration and how to reproduce the results.
## Contributing

We welcome contributions! See Extending DeepFense for guidelines on adding new components.

## License

Apache 2.0. See LICENSE for details.
## Citation

If you use DeepFense in your research, please cite:

```bibtex
@software{deepfense2024,
  title={DeepFense: A Modular Framework for Deepfake Audio Detection},
  author={DeepFense Team},
  year={2024},
  url={https://github.com/Yaselley/deepfense-framework}
}
```