Deep learning models developed for the 2025 Antibody Developability Prediction Competition.
abdix provides models for predicting antibody developability properties directly from sequence data. The models integrate multiple feature modalities, including protein language model embeddings, handcrafted biochemical features, and metadata about antibody subclasses.
The five target properties are:
- Polyreactivity
- Titer
- HIC
- Self-association
- Thermostability

The input features are:

- Pre-trained protein language models (p-IgGen, ESM-C)
- Handcrafted sequence descriptors
- CDR-specific annotations
- Metadata encodings (IgG subtype, light-chain type)
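
As a rough illustration of the two simplest modalities in this list, the sketch below computes an amino-acid composition vector (one possible handcrafted descriptor) and one-hot metadata encodings. It is not abdix's own featurizer code: the function names, the choice of descriptor, and the subtype and light-chain vocabularies are all assumptions made for the example.

```python
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
IGG_SUBTYPES = ["IgG1", "IgG2", "IgG4"]   # hypothetical vocabulary
LIGHT_CHAIN_TYPES = ["kappa", "lambda"]   # hypothetical vocabulary


def composition(sequence: str) -> list[float]:
    """Fraction of each of the 20 standard amino acids in the sequence."""
    counts = Counter(sequence.upper())
    total = sum(counts[aa] for aa in AMINO_ACIDS) or 1  # avoid division by zero
    return [counts[aa] / total for aa in AMINO_ACIDS]


def one_hot(value: str, vocabulary: list[str]) -> list[float]:
    """One-hot vector; all zeros if the value is not in the vocabulary."""
    return [1.0 if value == item else 0.0 for item in vocabulary]


def featurize(heavy: str, light: str, subtype: str, light_type: str) -> list[float]:
    """Concatenate per-chain composition with the metadata encodings."""
    return (
        composition(heavy)
        + composition(light)
        + one_hot(subtype, IGG_SUBTYPES)
        + one_hot(light_type, LIGHT_CHAIN_TYPES)
    )


# Short toy fragments, not real antibody chains.
features = featurize("EVQLVESGG", "DIQMTQSPS", "IgG1", "kappa")
print(len(features))  # 20 + 20 + 3 + 2 = 45
```

Language-model embeddings and CDR annotations would be produced by their own featurizers and kept as separate modalities rather than concatenated like this.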

The architecture combines:

- Modality-specific projections into a shared representation space
- Transformer encoders for cross-modal reasoning
- Task-specific attention pooling with independent regression heads
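
The last component, task-specific attention pooling with independent regression heads, can be sketched without any framework. The version below is a minimal, dependency-free illustration and not the abdix implementation: `TaskHead`, `attention_pool`, the dimensions, and the random weights are all assumptions, and the `tokens` list merely stands in for encoder output.

```python
import math
import random


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def attention_pool(tokens, query):
    """Pool a sequence of token vectors into one vector using a query vector."""
    dim = len(query)
    weights = softmax([dot(tok, query) / math.sqrt(dim) for tok in tokens])
    pooled = [sum(w * tok[i] for w, tok in zip(weights, tokens)) for i in range(dim)]
    return pooled, weights


class TaskHead:
    """One property: its own pooling query and its own linear regression head."""

    def __init__(self, dim, rng):
        self.query = [rng.gauss(0, 1) for _ in range(dim)]
        self.weight = [rng.gauss(0, 1) for _ in range(dim)]
        self.bias = 0.0

    def __call__(self, tokens):
        pooled, _ = attention_pool(tokens, self.query)
        return dot(pooled, self.weight) + self.bias


rng = random.Random(0)
dim = 8  # hypothetical shared representation size
property_names = ["Polyreactivity", "Titer", "HIC", "Self-association", "Tm"]
heads = {name: TaskHead(dim, rng) for name in property_names}

# Stand-in for the encoder output: 6 token vectors for one antibody.
tokens = [[rng.gauss(0, 1) for _ in range(dim)] for _ in range(6)]
predictions = {name: head(tokens) for name, head in heads.items()}
print(len(predictions))  # one scalar per property: 5
```

Because each head has its own query, each property can attend to different tokens of the shared representation. In a trained model the queries and weights would be learned parameters rather than random draws.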
abdix includes three model variants designed to trade off accuracy and computational cost:

| Model | Architecture | Parameters |
|---|---|---|
| `regressor_pooled.py` | Minimal MLP baseline | ~100K |
| `regressor_lite.py` | Lightweight transformer | ~1–2M |
| `regressor.py` | Full multi-modal transformer | ~6–7M |
Training a single multi-task model on a small dataset (N ≈ 200) proved challenging. On the held-out test set, polyreactivity predictions ranked among the top 10, while predictions for titer, HIC, self-association, and thermostability correlated negatively with the measured values.
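
To make the per-property comparison concrete, the sketch below scores each property separately with a rank correlation. The text above does not say which correlation metric was used, so Spearman's coefficient is an assumption here. The helper names are hypothetical and the numbers are toy values, not competition data.

```python
def ranks(values):
    """Average ranks (1-based); tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = mean_rank
        i = j + 1
    return result


def pearson(x, y):
    """Pearson correlation of two equal-length, non-constant sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)


def spearman(x, y):
    """Spearman correlation: Pearson correlation computed on the ranks."""
    return pearson(ranks(x), ranks(y))


# Toy values only, chosen so one property agrees in rank and one is reversed.
measured = {"Polyreactivity": [0.1, 0.4, 0.2, 0.9], "Titer": [1.0, 2.0, 3.0, 4.0]}
predicted = {"Polyreactivity": [0.2, 0.5, 0.3, 0.8], "Titer": [3.5, 3.0, 2.0, 1.0]}

scores = {name: spearman(measured[name], predicted[name]) for name in measured}
for name, score in scores.items():
    print(f"{name}: {score:+.2f}")
```

A negative score means the predicted ordering of antibodies runs opposite to the measured ordering for that property, which is worse than an uninformative predictor scoring near zero.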

    from abdix.model.regressor import AntibodyRegressor
    from abdix.features.base import PIgGenFeaturizer, ESMCFeaturizer

    # Initialize the model
    model = AntibodyRegressor(
        featurizers={
            'piggen': PIgGenFeaturizer(),
            'esmc': ESMCFeaturizer(),
        },
        property_names=['Polyreactivity', 'Titer', 'HIC', 'Self-association', 'Tm'],
    )

    # Predict (`batch` is an input batch prepared by the caller; its construction is not shown here)
    predictions = model(batch)  # Shape: (batch_size, 5)