
Add Python fastMRI converter #52

Open

naegelejd wants to merge 6 commits into main from fastmri-converter

Conversation

@naegelejd
Collaborator

We'll stage development of a fastMRI -> mrd converter in this branch and leave the PR as a draft for now.

@naegelejd naegelejd requested a review from xueh2 January 30, 2026 21:58
@naegelejd naegelejd marked this pull request as ready for review February 18, 2026 21:04
@naegelejd naegelejd requested a review from Copilot February 18, 2026 21:08

Copilot AI left a comment


Pull request overview

Adds an initial (draft) Python CLI tool to convert fastMRI HDF5 datasets into MRD streams, plus a small CI/workflow adjustment to pin the Miniforge version for conda builds.

Changes:

  • Introduce fastmri_to_mrd.py to convert fastMRI headers/k-space and reconstructed RSS images into MRD (BinaryMrdWriter).
  • Add basic dataset presence checks and synthesize MRD noise acquisitions from k-space edge lines.
  • Pin miniforge-version in the conda GitHub Actions workflow.
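
For orientation, driving the converter from Python might look roughly like the sketch below. The PR excerpt doesn't show the CLI flags, so this sticks to the convert() signature quoted in the review comments; the filenames are illustrative only.

from mrd.tools.fastmri_to_mrd import convert

# Convert one fastMRI HDF5 file into an MRD k-space stream and an MRD image stream.
# Either output may be None to skip that half of the conversion.
convert(
    input_filename="example_fastmri_file.h5",  # illustrative input path
    output_data_filename="kspace.mrd",          # MRD acquisitions (k-space)
    output_images_filename="images.mrd",        # MRD images (reconstructed RSS)
)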

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 7 comments.

File                                 | Description
python/mrd/tools/fastmri_to_mrd.py   | New fastMRI→MRD converter tool (header conversion + k-space/image export).
.github/workflows/mrd_conda.yml      | Pins Miniforge version for more reproducible conda CI/builds.


Comment on lines +98 to +105
if np.std(np.abs(last_line.data)) < np.std(np.abs(first_line.data)):
    acq = last_line
else:
    acq = first_line

acq.head.flags = mrd.AcquisitionFlags.IS_NOISE_MEASUREMENT
if np.std(np.abs(acq.data)) >= 1e-5:
    logger.warning(f"Expected to find noise in slice {slice}, but data has significant variation with std {np.std(np.abs(acq.data))}.")

Copilot AI Feb 18, 2026


np.std(np.abs(...)) is computed multiple times here (including inside the log message). Compute it once and reuse the value to avoid repeated full-array passes over k-space data (can be expensive for large datasets).

Suggested change
-if np.std(np.abs(last_line.data)) < np.std(np.abs(first_line.data)):
-    acq = last_line
-else:
-    acq = first_line
-acq.head.flags = mrd.AcquisitionFlags.IS_NOISE_MEASUREMENT
-if np.std(np.abs(acq.data)) >= 1e-5:
-    logger.warning(f"Expected to find noise in slice {slice}, but data has significant variation with std {np.std(np.abs(acq.data))}.")
+first_line_std = np.std(np.abs(first_line.data))
+last_line_std = np.std(np.abs(last_line.data))
+if last_line_std < first_line_std:
+    acq = last_line
+    acq_std = last_line_std
+else:
+    acq = first_line
+    acq_std = first_line_std
+acq.head.flags = mrd.AcquisitionFlags.IS_NOISE_MEASUREMENT
+if acq_std >= 1e-5:
+    logger.warning(
+        f"Expected to find noise in slice {slice}, but data has significant variation with std {acq_std}."
+    )

Comment on lines +3 to +10
import argparse
import logging

import h5py
import ismrmrd
import mrd
import numpy as np
from mrd.tools.ismrmrd_to_mrd import convert_header

Copilot AI Feb 18, 2026


h5py is imported at module import time, but the Python package/conda recipe in this repo doesn’t list h5py as a dependency. This makes the converter unusable in a standard install (import fails immediately). Consider moving the h5py import inside convert() (or guarding it with a try/except that raises a clear install hint), and/or add h5py to the package/conda runtime dependencies.
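
As a sketch of the deferred-import option (the error message and exact placement are assumptions, not code from this PR):

def convert(input_filename: str, output_data_filename: str | None, output_images_filename: str | None) -> None:
    """Convert fastMRI HDF5 file to MRD format."""
    # Import h5py lazily so that `import mrd.tools.fastmri_to_mrd` still succeeds
    # without it, and give CLI users an actionable hint when it is missing.
    try:
        import h5py
    except ImportError as exc:
        raise ImportError(
            "The fastMRI converter requires h5py; install it with 'pip install h5py' "
            "or add it to the conda environment."
        ) from exc

    with h5py.File(input_filename, "r") as f:
        ...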

Comment on lines +36 to +41
assert len(mrd_header.encoding) >= 1
encoding: mrd.EncodingType = mrd_header.encoding[0]
assert encoding.encoded_space is not None
encoded_space = encoding.encoded_space
assert encoding.encoding_limits is not None
encoding_limits = encoding.encoding_limits

Copilot AI Feb 18, 2026


These assert statements are being used for validating user input / file structure (header content). Asserts can be disabled with python -O, and they also produce unhelpful error messages for CLI users. Prefer explicit validation with a raised exception (e.g., ValueError/RuntimeError) including what was expected vs. what was found.
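
Concretely, the same checks could be raised explicitly; a minimal sketch, with illustrative error messages:

if not mrd_header.encoding:
    raise ValueError("MRD header contains no encoding section; expected at least one.")
encoding: mrd.EncodingType = mrd_header.encoding[0]
if encoding.encoded_space is None:
    raise ValueError("encoding[0].encoded_space is missing from the converted MRD header.")
encoded_space = encoding.encoded_space
if encoding.encoding_limits is None:
    raise ValueError("encoding[0].encoding_limits is missing from the converted MRD header.")
encoding_limits = encoding.encoding_limits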

Comment on lines +69 to +80
head = mrd.AcquisitionHeader()
head.scan_counter = scan_counter
scan_counter += 1
head.channel_order = list(range(num_channels))
head.center_sample = encoded_space.matrix_size.x // 2
head.position[:] = [0.0, 0.0, 0.0]
head.read_dir[:] = [1.0, 0.0, 0.0]
head.phase_dir[:] = [0.0, 1.0, 0.0]
head.slice_dir[:] = [0.0, 0.0, 1.0]

head.idx.slice = slice
head.idx.kspace_encode_step_1 = line

Copilot AI Feb 18, 2026


New acquisitions don’t set head.encoding_space_ref. Other generators/converters in this repo set it explicitly (e.g., python/mrd/tools/phantom.py:142-146, python/mrd/tools/ismrmrd_to_mrd.py:372-396) so downstream tooling can reliably map acquisitions to header.encoding[...]. Consider setting head.encoding_space_ref = 0 when synthesizing acquisitions here for consistency.
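
In the synthesized-acquisition block quoted above, that is a single extra assignment, e.g.:

# Tie the synthesized acquisition to header.encoding[0], as the other converters do.
head.encoding_space_ref = 0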

# MRD images are of shape (channels, slices, rows, cols)
# So we need to tranpose X/Y, and add empty channel and slice dimensions
data = np.transpose(dset[slice, :, :], (1, 0))
data = np.expand_dims(data, axis=(0, 1))

Copilot AI Feb 18, 2026


np.transpose(...) returns a non-C-contiguous view; the MRD binary writer falls back to element-by-element serialization for non-contiguous arrays, which can be extremely slow for image-sized arrays. Consider making the transposed array contiguous (e.g., np.ascontiguousarray(...)) before writing.

Suggested change
-data = np.expand_dims(data, axis=(0, 1))
+data = np.expand_dims(data, axis=(0, 1))
+data = np.ascontiguousarray(data)

head.image_series_index = 1

# MRD images are of shape (channels, slices, rows, cols)
# So we need to tranpose X/Y, and add empty channel and slice dimensions

Copilot AI Feb 18, 2026


Typo in comment: “tranpose” should be “transpose”.

Suggested change
-# So we need to tranpose X/Y, and add empty channel and slice dimensions
+# So we need to transpose X/Y, and add empty channel and slice dimensions

Comment on lines +159 to +176
def validate_input_file(f: h5py.File) -> None:
    """Ensure the input fastMRI file contains the expected datasets."""


def convert(input_filename: str, output_data_filename: str | None, output_images_filename: str | None) -> None:
    """Convert fastMRI HDF5 file to MRD format."""
    with h5py.File(input_filename, "r") as f:
        # First validate the input file
        required_datasets = [FASTMRI_DATASET_NAME_HEADER]
        if output_data_filename is not None:
            required_datasets += [FASTMRI_DATASET_NAME_KSPACE]
        if output_images_filename is not None:
            required_datasets += [FASTMRI_DATASET_NAME_RECON_RSS]

        for dset_name in required_datasets:
            if dset_name not in f:
                raise RuntimeError(f"Input file is missing required dataset '{dset_name}'")


Copilot AI Feb 18, 2026


validate_input_file() is currently a no-op and convert() re-implements validation inline. Either implement validate_input_file() and call it from convert(), or remove the stub to avoid dead code / confusion.

Suggested change
-def validate_input_file(f: h5py.File) -> None:
-    """Ensure the input fastMRI file contains the expected datasets."""
-def convert(input_filename: str, output_data_filename: str | None, output_images_filename: str | None) -> None:
-    """Convert fastMRI HDF5 file to MRD format."""
-    with h5py.File(input_filename, "r") as f:
-        # First validate the input file
-        required_datasets = [FASTMRI_DATASET_NAME_HEADER]
-        if output_data_filename is not None:
-            required_datasets += [FASTMRI_DATASET_NAME_KSPACE]
-        if output_images_filename is not None:
-            required_datasets += [FASTMRI_DATASET_NAME_RECON_RSS]
-        for dset_name in required_datasets:
-            if dset_name not in f:
-                raise RuntimeError(f"Input file is missing required dataset '{dset_name}'")
+def validate_input_file(f: h5py.File, require_kspace: bool, require_images: bool) -> None:
+    """Ensure the input fastMRI file contains the expected datasets."""
+    required_datasets = [FASTMRI_DATASET_NAME_HEADER]
+    if require_kspace:
+        required_datasets.append(FASTMRI_DATASET_NAME_KSPACE)
+    if require_images:
+        required_datasets.append(FASTMRI_DATASET_NAME_RECON_RSS)
+    for dset_name in required_datasets:
+        if dset_name not in f:
+            raise RuntimeError(f"Input file is missing required dataset '{dset_name}'")
+def convert(input_filename: str, output_data_filename: str | None, output_images_filename: str | None) -> None:
+    """Convert fastMRI HDF5 file to MRD format."""
+    with h5py.File(input_filename, "r") as f:
+        # First validate the input file
+        validate_input_file(
+            f,
+            require_kspace=output_data_filename is not None,
+            require_images=output_images_filename is not None,
+        )
