Maybe add batching back in. #9

@simonleandergrimm

After initially closing the PR that would have added batching, I got back to Jeff with:

FWIW, just revisiting this, we do still sometimes have sequencing runs where we get lots of small files, where batching might be useful. E.g., NAO-ONT-20250328-Zephyr12b has 73 pod5 files, with median file size of 171.4 MB.
I also think that AWS Batch doesn't run every runnable job at the same time, so giving mgs-workflow hundreds of very small ONT sequencing files might be less efficient than giving it fewer, slightly larger files (due to the time spent starting and ending a lot of tiny jobs).

I could benchmark this quite easily (but it's not a priority right now).
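
For reference, a minimal sketch of the kind of size-based batching I have in mind. This is not the code from the closed PR: `batch_by_size`, the 1 GB target, and the uniform 171 MB file sizes are all hypothetical, chosen only to illustrate the idea of grouping many small files into fewer jobs.

```python
from typing import Iterable


def batch_by_size(
    files: Iterable[tuple[str, int]], target_bytes: int
) -> list[list[str]]:
    """Greedily group (name, size_in_bytes) pairs into batches.

    A batch is closed as soon as adding the next file would push it
    past target_bytes. A single file larger than target_bytes still
    gets a batch of its own.
    """
    batches: list[list[str]] = []
    current: list[str] = []
    current_size = 0
    for name, size in files:
        if current and current_size + size > target_bytes:
            batches.append(current)
            current, current_size = [], 0
        current.append(name)
        current_size += size
    if current:
        batches.append(current)
    return batches


if __name__ == "__main__":
    # Hypothetical input: 73 files, each assumed to be 171 MB.
    files = [(f"file_{i:03d}.pod5", 171_000_000) for i in range(73)]
    batches = batch_by_size(files, target_bytes=1_000_000_000)
    print(len(batches), "batches instead of", len(files), "jobs")
```

Under these assumptions each batch holds five files, so the 73 inputs would become 15 jobs. Whether that actually reduces wall-clock time on AWS Batch is exactly what the benchmark would need to show.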

Originally posted by @simonleandergrimm in #7 (comment)
