This repository contains the code implementation of CARDIUM: Congenital Anomaly Recognition with Diagnostic Images and Unified Medical Records, which was presented at the Third Workshop on Computer Vision for Automated Medical Diagnosis at ICCV 2025.
Prenatal diagnosis of Congenital Heart Diseases (CHDs) holds great potential for Artificial Intelligence (AI)-driven solutions. However, collecting high-quality diagnostic data remains difficult due to the rarity of these conditions, resulting in imbalanced and low-quality datasets that hinder model performance. Moreover, no public efforts have been made to integrate multiple sources of information, such as imaging and clinical data, further limiting the ability of AI models to support and enhance clinical decision-making. To overcome these challenges, we introduce the Congenital Anomaly Recognition with Diagnostic Images and Unified Medical records (CARDIUM) dataset, the first publicly available multimodal dataset consolidating fetal ultrasound and echocardiographic images with maternal clinical records for prenatal CHD detection. Furthermore, we propose a robust multimodal transformer architecture that incorporates a cross-attention mechanism to fuse feature representations from image and tabular data, improving CHD detection by 11% and 50% over the image and tabular single-modality approaches, respectively, and achieving an F1-score of 79.8 ± 4.8% on the CARDIUM dataset. We publicly release our dataset and code to encourage further research in this unexplored field.
Fig. 2. Overview of the CARDIUM model.
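For intuition, here is a minimal, self-contained sketch of the kind of cross-attention fusion described above. The dimensions and module layout are illustrative assumptions, not the actual CARDIUM architecture (see `multimodal_script/` for the real implementation):

```python
import torch
import torch.nn as nn

class CrossAttentionFusion(nn.Module):
    """Illustrative fusion of image and tabular feature tokens via cross-attention.

    Hypothetical dimensions; the actual model lives in multimodal_script/.
    """

    def __init__(self, dim: int = 256, num_heads: int = 4):
        super().__init__()
        # Tabular tokens act as queries attending over image tokens.
        self.cross_attn = nn.MultiheadAttention(dim, num_heads, batch_first=True)
        self.norm = nn.LayerNorm(dim)
        self.classifier = nn.Linear(dim, 1)  # single logit for binary CHD detection

    def forward(self, img_tokens: torch.Tensor, tab_tokens: torch.Tensor) -> torch.Tensor:
        # img_tokens: (B, N_img, dim); tab_tokens: (B, N_tab, dim)
        fused, _ = self.cross_attn(query=tab_tokens, key=img_tokens, value=img_tokens)
        fused = self.norm(fused + tab_tokens)      # residual connection + layer norm
        return self.classifier(fused.mean(dim=1))  # pool tokens into one logit per sample

# Smoke test with random features standing in for encoder outputs
model = CrossAttentionFusion()
logits = model(torch.randn(2, 49, 256), torch.randn(2, 8, 256))
print(logits.shape)  # torch.Size([2, 1])
```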
This repository provides both the multimodal dataset and the model implementation. Users can download the data and replicate the results from Sections 5.1 and 5.2 of our paper.
The CARDIUM dataset includes:
- 6,558 anonymized images, with 16.3% corresponding to patients positive for CHD and 83.7% to negative patients, organized into three folds for training and testing.
- Clinical tabular data in both raw and preprocessed formats:
  - `cardium_clinical_data_wnm_translated_final_cleaned.json` – raw data for tabular encoder training.
  - `cardium_clinical_data_woe_wnm_standarized_f_normalized.json` – preprocessed data used for the multimodal model.
All image and tabular data files are available through the following access request form, which is used to verify institutional or educational affiliation prior to granting download permissions. Once the request is approved, you will receive the download link to the dataset. The dataset is distributed exclusively for academic and research purposes.
Note: After downloading, the image data will be provided as a `.tar.gz` file. Please extract it inside the `data/` directory, maintaining the folder structure shown below. Ensure the tabular data JSONs are also downloaded into this directory.

Example (Linux/macOS):
```bash
tar -xvzf cardium_images.tar.gz -C data/
```
Upon download, the dataset will follow this structure:
```
data/
├── cardium_images/
│   ├── fold_1/
│   │   ├── train/
│   │   │   ├── CHD/
│   │   │   └── Non_CHD/
│   │   └── test/
│   │       ├── CHD/
│   │       └── Non_CHD/
│   ├── fold_2/
│   │   ├── train/
│   │   └── test/
│   └── fold_3/
│       ├── train/
│       └── test/
└── tabular_data/
    ├── cardium_clinical_data_wnm_translated_final_cleaned.json
    └── cardium_clinical_data_woe_wnm_standarized_f_normalized.json
```
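As a quick sanity check after extraction, the sketch below loads one fold's training images with `torchvision.datasets.ImageFolder` and parses the preprocessed tabular JSON. It assumes only the directory layout above; the internal schema of the JSON records is not assumed, so the snippet just inspects the top-level structure:

```python
import json
from torchvision import datasets, transforms

# Class labels are inferred from the CHD/ and Non_CHD/ subfolders.
train_set = datasets.ImageFolder(
    "data/cardium_images/fold_1/train",
    transform=transforms.ToTensor(),
)
print(len(train_set), "training images; classes:", train_set.classes)

# Load the preprocessed clinical records used by the multimodal model.
with open("data/tabular_data/cardium_clinical_data_woe_wnm_standarized_f_normalized.json") as f:
    records = json.load(f)
print(type(records))  # inspect the top-level structure before building a DataLoader
```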
Pretrained models are available for:
- The image-based classifier (trained on the CARDIUM image dataset).
- The tabular-based classifier (trained on the CARDIUM tabular dataset).
- The multimodal model combining tabular and image features.
These models can be downloaded from the same link as the dataset. Access is granted only to individuals or institutions for academic or research purposes, and requests are subject to verification.
After downloading, the image, tabular, and multimodal weights will be provided as `.tar.gz` files. Extract the weights with the following commands:
```bash
tar -xvzf tabular_encoder.tar.gz -C tabular_script/tabular_checkpoints/
tar -xvzf image_encoder.tar.gz -C img_script/image_checkpoints/
tar -xvzf cardium_model_weights.tar.gz -C multimodal_script/multimodal_checkpoints/
```
- Create a virtual environment:
```bash
conda env create -f CARDIUM.yml
```
- Install dependencies:
```bash
pip install -r requirements.txt
```
- Train the tabular encoder:
```bash
python tabular_script/main_tab.py --num_epochs 400 --batch_size 32 --lr 0.00000050169647031011 --weight_decay 1e-3 --loss_factor 0.7 --sampling False
```
After training the tabular encoder, the repository structure will include the following additions:
```
.
├── data/
│   └── tabular_data/
│       ├── cardium_clinical_data_wnm_translated_final_cleaned.json
│       │     (raw file for tabular encoder training)
│       ├── cardium_clinical_data_woe_wnm_standarized_f_normalized.json
│       │     (preprocessed file for multimodal model)
│       └── output_folds_final/
│             (contains 6 JSONs for training and testing across three folds)
└── tabular_script/
    └── tabular_checkpoints/
        └── run_name/
            ├── fold0_best_model.pth
            ├── fold1_best_model.pth
            └── fold2_best_model.pth
```
- Train the image encoder:
```bash
python img_script/main_img.py --lr 1e-6 --batch_size 8 --loss_factor 2
```
After training the image encoder, the checkpoints will be stored here:
```
.
└── img_script/
    └── image_checkpoints/
        └── run_name/
            ├── fold0_best_model.pth
            ├── fold1_best_model.pth
            └── fold2_best_model.pth
```
- Train the multimodal model:
```bash
python multimodal_script/main_multimodal.py
```
After training the multimodal model, the checkpoints will be stored here:
```
.
└── multimodal_script/
    └── multimodal_checkpoints/
        └── run_name/
            ├── fold0_best_model.pth
            ├── fold1_best_model.pth
            └── fold2_best_model.pth
```
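If you want to inspect a trained checkpoint directly rather than going through the inference scripts, a minimal sketch follows. It assumes the `.pth` files are standard PyTorch saves; whether each file holds a bare `state_dict` or a wrapped dictionary may differ, so inspect before loading into a model:

```python
import torch

# Hypothetical path; substitute your actual run_name.
ckpt_path = "multimodal_script/multimodal_checkpoints/run_name/fold0_best_model.pth"
checkpoint = torch.load(ckpt_path, map_location="cpu")

# See what was saved before calling model.load_state_dict(...).
if isinstance(checkpoint, dict):
    print(list(checkpoint.keys())[:10])
```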
To reproduce the CHD detection results (Section 5.1), run inference with each trained model:

- Tabular encoder:
```bash
python tabular_script/inference_tab.py --num_epochs 400 --batch_size 32 --lr 0.00000050169647031011 --weight_decay 1e-3 --loss_factor 0.7 --sampling False --tab_checkpoint {tabular_checkpoint_route}
```
- Image encoder:
```bash
python img_script/inference_img.py --lr 1e-6 --batch_size 8 --loss_factor 2 --img_checkpoint {image_checkpoint_route}
```
- Multimodal model:
```bash
python multimodal_script/inference_multimodal.py --multimodal_checkpoint {multimodal_checkpoint_route}
```

For trimestral CHD detection results (Section 5.2):
- Save the divided dataset:
```bash
python trimester_results/create_trimester_dataset.py
```
Data will be stored here:
```
.
└── data/
    └── trimester_images/
        ├── first_trimester/
        ├── second_trimester/
        └── third_trimester/
```
- Run inference with the multimodal model on a specific trimester's data:
```bash
python multimodal_script/inference_multimodal.py --image_folder_path dataset/trimester_images --trimester first
```
For the second and third trimesters, pass `second` or `third` to the `--trimester` parameter.
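To evaluate all three trimesters in one run, you can loop over the documented `--trimester` values; a minimal driver sketch (not part of the repository) follows:

```python
import subprocess

# Run multimodal inference once per trimester split, using the flags documented above.
for trimester in ["first", "second", "third"]:
    subprocess.run(
        [
            "python", "multimodal_script/inference_multimodal.py",
            "--image_folder_path", "dataset/trimester_images",
            "--trimester", trimester,
        ],
        check=True,  # raise if any inference run exits with an error
    )
```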
- The code is released under the Apache 2.0 License (see License).
- The CARDIUM dataset is released under the CC BY-NC 4.0 License (see Dataset_License).
```bibtex
@inproceedings{vega2025cardium,
  title={{CARDIUM}: Congenital Anomaly Recognition with Diagnostic Images and Unified Medical records},
  author={Daniela Vega and Hannah Ceballos and Javier Santiago Vera Rincon and Santiago Rodriguez and Alejandra Perez and Angela Castillo and Maria Escobar and Dario Londo{\~n}o and Luis Andres Sarmiento and Camila Irene Castro and Nadiezhda Rodriguez and Juan Carlos Brice{\~n}o and Pablo Arbelaez},
  booktitle={Third Workshop on Computer Vision for Automated Medical Diagnosis},
  year={2025},
  url={https://openreview.net/forum?id=MDHl5LCcka}
}
```
