Parallelization #10

@pabloati

Hi, I would like to run MAGinator on a fairly large dataset. I have around 420 samples, with 60 bins per sample on average, and the preprocessed reads are around 6 GB per sample.

As a trial, I have been running a subset of 5 samples on a cluster (40 ppn and 180 GB), and the run has already been going for more than 24 hours.

Is there a way to run MAGinator in parallel to speed up the process? I am running the following command:

maginator -v trial/maginator_clusters.tsv \
    -r trial/maginator_reads.csv \
    -c trial/maginator_contigs.fasta \
    -o trial/maginator \
    -g /home/people/pablop/workdir/databases/gtdb_release207_v2

Thank you,
Pablo
