Alessi0X/HypergraphEmbedding4MetabolicNetworks

HypergraphEmbedding4MetabolicNetworks

This repository contains the code and data for the paper "Comparing the ability of embedding methods on metabolic hypergraphs for capturing taxonomy-based features".

The paper is currently under review at Algorithms for Molecular Biology. A preprint is available on bioRxiv.

Usage

The EmbeddingsAndKernels folder contains separate scripts for computing the embeddings used in the paper.

The scripts are independent of one another and can be run in any order. Each one loads the metabolic pathways dataset (an example dataset is provided in the data folder; see Data below), computes the embeddings or kernels, and saves the results to a pickle file.
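The load, compute, save flow described above can be sketched as follows. This is a minimal illustration, not one of the scripts in EmbeddingsAndKernels: the function names, the toy file names and the `placeholder_features` computation (a histogram of hyperedge sizes) are assumptions made for the example, and only the 'DATASET', 'ID' and 'simplices_nodelabels' keys come from the dataset description in the Data section.

```python
import pickle
import tempfile
from collections import Counter
from pathlib import Path


def load_organisms(path):
    """Read the dataset pickle and return the list of organism dictionaries."""
    with open(path, "rb") as fh:
        return pickle.load(fh)["DATASET"]


def placeholder_features(hyperedges, max_size=4):
    """Hypothetical stand-in for an embedding: counts of hyperedges per size.

    Sizes larger than max_size are counted in the last bin.
    """
    counts = Counter(min(len(edge), max_size) for edge in hyperedges)
    return [counts[size] for size in range(1, max_size + 1)]


def run(dataset_path, output_path):
    """Load the dataset, compute one vector per organism, save to a pickle."""
    results = {
        organism["ID"]: placeholder_features(organism["simplices_nodelabels"])
        for organism in load_organisms(dataset_path)
    }
    with open(output_path, "wb") as fh:
        pickle.dump(results, fh)
    return results


# Toy input with the documented layout (labels and ID are made up).
toy = {
    "DATASET": [
        {"ID": "org_A", "simplices_nodelabels": [("a", "b"), ("a", "b", "c")]}
    ]
}

with tempfile.TemporaryDirectory() as tmp:
    dataset_path = Path(tmp) / "toy_dataset.pkl"
    output_path = Path(tmp) / "toy_results.pkl"
    dataset_path.write_bytes(pickle.dumps(toy))

    results = run(dataset_path, output_path)
    with open(output_path, "rb") as fh:
        reloaded = pickle.load(fh)
```

With the real data, `dataset_path` would point to data/MetabolicPathways_DEMO_DATASET_Python.pkl. The actual scripts may organise these steps differently.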

Detailed instructions for each embedding method are provided in the Wiki.

Requirements

To run the code, you need to install the following Python packages:

  • torch==2.7.1
  • torch_geometric==2.6.1
  • hypergraphx==1.7.7
  • karateclub==1.2.1
  • networkx==3.5
  • scipy==1.15.3
  • pyclustertend==1.9.0
  • multiprocess==0.70.18

The code has been tested with Python 3.12. Preliminary experiments have shown compatibility issues with later Python versions (especially with karateclub and pyclustertend).
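Before running the scripts, the environment can be compared against the pins listed above. The snippet below is a sketch using only the standard library; `PINNED` and `check_environment` are names introduced here, and it assumes that each package is registered under the distribution name shown in the list.

```python
import sys
from importlib.metadata import PackageNotFoundError, version

# Versions copied from the requirements list above.
PINNED = {
    "torch": "2.7.1",
    "torch_geometric": "2.6.1",
    "hypergraphx": "1.7.7",
    "karateclub": "1.2.1",
    "networkx": "3.5",
    "scipy": "1.15.3",
    "pyclustertend": "1.9.0",
    "multiprocess": "0.70.18",
}


def check_environment(pins=PINNED):
    """Return {package: (expected, installed or None)} for every mismatch."""
    mismatches = {}
    for name, expected in pins.items():
        try:
            # Drop a local build suffix such as "+cpu" before comparing.
            installed = version(name).split("+")[0]
        except PackageNotFoundError:
            installed = None
        if installed != expected:
            mismatches[name] = (expected, installed)
    return mismatches


if sys.version_info[:2] != (3, 12):
    print(
        "Warning: the code was tested with Python 3.12, "
        f"this is {sys.version_info.major}.{sys.version_info.minor}"
    )

for name, (expected, installed) in check_environment().items():
    print(f"{name}: expected {expected}, found {installed or 'not installed'}")
```

An empty output (apart from a possible Python version warning) means every pinned package is installed at the listed version.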

Data

An example of the metabolic pathways dataset used in the paper is available in the file data/MetabolicPathways_DEMO_DATASET_Python.pkl. This file contains the metabolic pathways data in a format suitable for analysis. This example dataset is a smaller version of the dataset used in the paper (5 organisms only), and it is intended for demonstration purposes only. The full list of organisms is available as a supplementary file in the paper.

The pickle file contains a dictionary with a single key, 'DATASET', whose value is a list of dictionaries. Each dictionary in the list represents one organism and contains the following keys:

  • 'ID': the ID of the organism
  • 'simplices_nodelabels': the hyperedge list of the organism, where each hyperedge is represented as an n-tuple of node labels (strings), with n being the number of nodes in the hyperedge.
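As an illustration of the layout described above, the sketch below builds a toy object with the same keys and derives a few basic quantities from it. The organism IDs and node labels are made up, and `summarize` and `incidence_lists` are helper names introduced for this example, not functions from the repository.

```python
from collections import Counter

# Toy object with the documented layout. To use the demo dataset instead:
#   import pickle
#   with open("data/MetabolicPathways_DEMO_DATASET_Python.pkl", "rb") as fh:
#       data = pickle.load(fh)
data = {
    "DATASET": [
        {"ID": "org_A", "simplices_nodelabels": [("n1", "n2"), ("n1", "n3", "n4")]},
        {"ID": "org_B", "simplices_nodelabels": [("n2", "n3")]},
    ]
}


def summarize(organism):
    """Count nodes, hyperedges and hyperedge sizes for one organism."""
    hyperedges = organism["simplices_nodelabels"]
    nodes = {label for edge in hyperedges for label in edge}
    return {
        "ID": organism["ID"],
        "n_nodes": len(nodes),
        "n_hyperedges": len(hyperedges),
        "size_counts": dict(Counter(len(edge) for edge in hyperedges)),
    }


def incidence_lists(organism):
    """Map each node label to the indices of the hyperedges that contain it."""
    incidence = {}
    for index, edge in enumerate(organism["simplices_nodelabels"]):
        for label in edge:
            incidence.setdefault(label, []).append(index)
    return incidence


summaries = [summarize(organism) for organism in data["DATASET"]]
incidence_a = incidence_lists(data["DATASET"][0])

for summary in summaries:
    print(summary)
```

The node-to-hyperedge mapping returned by `incidence_lists` is one convenient starting point for building an incidence matrix; the embedding scripts may use a different internal representation.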

Citation

If you use this code in your research, please cite the paper as follows:

@article {Cervellini2025.07.10.663860,
	author = {Cervellini, Mattia and Sinaimeri, Blerina and Matias, Catherine and Martino, Alessio},
	title = {Comparing the ability of embedding methods on metabolic hypergraphs for capturing taxonomy-based features},
	elocation-id = {2025.07.10.663860},
	year = {2025},
	doi = {10.1101/2025.07.10.663860},
	publisher = {Cold Spring Harbor Laboratory},
	URL = {https://www.biorxiv.org/content/10.1101/2025.07.10.663860v3},
	journal = {bioRxiv}
}
