kimtae55/fcHMRF-LIS

fcHMRF-LIS: a stable, spatial FDR control method

🧠 Introduction

Why use FDR control?

In high-dimensional testing (e.g., brain imaging or genomics analyses), we often perform hundreds of thousands of statistical tests. One example is comparing voxel-level brain glucose metabolism between healthy and diseased groups using the standardized uptake value ratio (SUVR) from fluorine-18 fluorodeoxyglucose positron emission tomography (FDG-PET) data.

When you test many hypotheses, the chance of making false discoveries increases. FDR control limits the expected proportion of false positives among the discoveries, keeping results statistically reliable at a desired level without being as conservative as family-wise error rate (FWER) methods.
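To make the idea of FDR control concrete, here is a minimal sketch of the classical Benjamini-Hochberg procedure, a standard non-spatial baseline. It is not the fcHMRF-LIS method and is not part of this package; the function name `benjamini_hochberg` and the example p-values are illustrative assumptions.

```python
def benjamini_hochberg(p_values, alpha=0.05):
    """Return a list of booleans: True where the null hypothesis is rejected.

    Classical Benjamini-Hochberg step-up procedure (illustrative baseline,
    not the fcHMRF-LIS algorithm).
    """
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    # Find the largest 1-based rank k with p_(k) <= k * alpha / m.
    k_max = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank * alpha / m:
            k_max = rank
    # Reject the k_max hypotheses with the smallest p-values.
    rejected = [False] * m
    for idx in order[:k_max]:
        rejected[idx] = True
    return rejected


# Hypothetical p-values, for illustration only.
example_p = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205, 0.212, 0.216]
example_rejections = benjamini_hochberg(example_p, alpha=0.05)
```

This baseline treats every test identically and ignores where a voxel sits in the image, which is the gap that spatial FDR methods aim to close.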

Spatial FDR control can be viewed as analogous to an unsupervised binary segmentation task, with 0 representing the null and 1 representing rejection of the null.

🌟 What is special about fcHMRF-LIS?

❗ Existing spatial FDR control methods face three fundamental challenges in neuroimaging applications:

  • First, many fail to account for the complex spatial dependencies inherent in neuroimaging data, including distance-related dependence and spatial heterogeneity.

  • Second, while state-of-the-art methods focus on controlling the FDR and minimizing the false non-discovery rate (FNR), they often exhibit high variance in the false discovery proportion (FDP) and false non-discovery proportion (FNP), leading to instability across replications. The figure below illustrates this high variability across 50 replications of real-data-based simulation settings.

  • Third, many spatial FDR methods lack computational scalability, making them impractical for analyzing high-resolution datasets (e.g., neuroimaging data).
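The FDP and FNP mentioned above can be computed for a single simulation replication when the ground truth is known. The sketch below uses the standard definitions; the function name `fdp_fnp` and the toy inputs are assumptions for illustration and are not taken from this repository.

```python
def fdp_fnp(truth, rejected):
    """Compute (FDP, FNP) for one replication.

    truth[i] is 1 if test i is a true signal (non-null) and 0 if it is null.
    rejected[i] is True if the null hypothesis at test i was rejected.
    """
    # Nulls that were wrongly rejected.
    false_discoveries = sum(1 for t, r in zip(truth, rejected) if r and t == 0)
    # True signals that were not rejected.
    missed_signals = sum(1 for t, r in zip(truth, rejected) if not r and t == 1)
    n_rejected = sum(1 for r in rejected if r)
    n_accepted = len(rejected) - n_rejected
    # max(..., 1) avoids division by zero when nothing (or everything) is rejected.
    fdp = false_discoveries / max(n_rejected, 1)
    fnp = missed_signals / max(n_accepted, 1)
    return fdp, fnp


# Toy example: 3 true signals, 3 nulls, 3 rejections (one of them false).
toy_fdp, toy_fnp = fdp_fnp([1, 1, 0, 0, 0, 1], [True, True, True, False, False, False])
```

FDR and FNR are the expectations of these two quantities, so a method can meet the nominal FDR on average while individual replications still vary widely in FDP.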

🔦 fcHMRF-LIS is a stable, spatial false discovery rate (FDR) control method that mitigates the above three issues:

  • It is the first multiple hypothesis testing framework to utilize a fully connected HMRF to model voxel-wise spatial dependencies. Our method exhibits exceptional stability across multiple replications.

  • We provide an efficient and scalable expectation-maximization (EM) algorithm leveraging the CRF-RNN, where high-dimensional filtering is accelerated on a permutohedral lattice.

  • We apply our method to FDG-PET data from ADNI to identify brain areas affected along the progression of Alzheimer's disease (AD).

gui_image

Installation

This package was developed using Python 3.12 and PyTorch 2.6.0. The permutohedral-lattice-based filtering relies on the C++ implementation provided by crfasrnn_pytorch.

To install the package, please run the following lines:

git clone $fcHMRF-LIS-repo$
cd $PATH_TO_fcHMRF-LIS-repo$
python setup.py install 

Usage

python -m src.train --lr {learning rate} \
                    --e {number of EM steps} \
                    --threshold {desired FDR threshold} \
                    --labelpath {optional ground truth, if running a simulation} \
                    --datapath {z-statistics as in the paper} \
                    --betapath {delta_mu as in the paper} \
                    --ppath {p-values corresponding to the z-statistics; used only for initialization} \
                    --savepath {directory path for saving results}
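The --ppath input holds p-values that correspond to the z-statistics. As a rough sketch, two-sided p-values under a standard normal null can be derived from z-statistics as shown below. Whether the paper uses two-sided p-values, and which file format --datapath and --ppath expect, are not stated here, so treat both as assumptions; the function name `two_sided_p_values` is hypothetical.

```python
import math


def two_sided_p_values(z_statistics):
    """Two-sided p-values under a standard normal null.

    Uses the identity 2 * (1 - Phi(|z|)) = erfc(|z| / sqrt(2)),
    where Phi is the standard normal CDF.
    Assumption: a two-sided test; check the paper for the convention it uses.
    """
    return [math.erfc(abs(z) / math.sqrt(2.0)) for z in z_statistics]


# Hypothetical z-statistics, for illustration only.
example_z = [0.0, 1.959964, -1.959964, 3.0]
example_p_from_z = two_sided_p_values(example_z)
```

The resulting values would still need to be saved in whatever format the training script reads, which this sketch does not attempt to guess.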

Citing our work

@misc{kim2025falsediscoveryratecontrol,
      title={A False Discovery Rate Control Method Using a Fully Connected Hidden Markov Random Field for Neuroimaging Data}, 
      author={Taehyo Kim and Qiran Jia and Mony J. de Leon and Hai Shu},
      year={2025},
      eprint={2505.20688},
      archivePrefix={arXiv},
      primaryClass={stat.ML},
      url={https://arxiv.org/abs/2505.20688}, 
}

Updates
