Open
Description
After initially closing the PR that would have added batching, I got back to Jeff with:
FWIW, revisiting this: we still sometimes have sequencing runs that produce lots of small files, where batching might be useful. E.g., NAO-ONT-20250328-Zephyr12b has 73 pod5 files, with a median file size of 171.4 MB.
I also think that AWS Batch doesn't run every runnable job at the same time, so giving mgs-workflow hundreds of very small ONT sequencing files might be less efficient than giving it fewer, slightly larger files (due to the time spent starting and ending a lot of tiny jobs).
I could benchmark this quite easily (but it's not a priority right now).
Originally posted by @simonleandergrimm in #7 (comment)
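To make the batching idea concrete, here is a minimal sketch of grouping many small files into fewer batches of roughly a target size. It is not how mgs-workflow does or would implement this: the function name `batch_by_size`, the `(name, size_in_bytes)` input format, and the target size are all assumptions chosen for illustration.

```python
from typing import List, Tuple


def batch_by_size(
    files: List[Tuple[str, int]], target_bytes: int
) -> List[List[str]]:
    """Greedily group files, in input order, into batches.

    Hypothetical helper: a batch is closed as soon as adding the next
    file would push it over target_bytes. A single file larger than
    target_bytes ends up in a batch of its own.
    """
    batches: List[List[str]] = []
    current: List[str] = []
    current_size = 0
    for name, size in files:
        # Close the current batch if this file would overshoot the target.
        if current and current_size + size > target_bytes:
            batches.append(current)
            current, current_size = [], 0
        current.append(name)
        current_size += size
    # Keep the final, possibly under-filled batch.
    if current:
        batches.append(current)
    return batches


# Illustrative sizes only (not taken from a real run).
example_files = [("a.pod5", 100), ("b.pod5", 100), ("c.pod5", 100),
                 ("d.pod5", 300), ("e.pod5", 50)]
example_batches = batch_by_size(example_files, target_bytes=250)
```

Each resulting batch would then be submitted as one job rather than one job per file, which is the trade-off a benchmark would need to measure.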