Conversation
Important! Template update for nf-core/tools v3.3.2
update metro map
refact: remove unused subworkflows
… into s3-references
update download action: remove non-stub run
nschan
left a comment
There was a problem hiding this comment.
Mostly minor comments, I approve now assuming you will adress them as you see fit :)
Co-authored-by: Niklas Schandry <schandry@gmail.com>
Co-authored-by: Niklas Schandry <schandry@gmail.com>
Co-authored-by: Niklas Schandry <schandry@gmail.com>
johandahlberg
left a comment
There was a problem hiding this comment.
I have looked things over and I see nothing blocking merging this. You can find some comments and questions from me here, and also as line comments on some of the files.
I have checked out the repo, run the nf-test tests locally and run the minimal test pipeline. Since I am something of a nextflow rookie I have focused on trying to review and make suggestions from the user perspective.
Tests
Running nf-test test --profile +docker .. I got quite a few test failures due to out of memory errors, e.g:
Test [381185aa] 'stub test ctatsplicing' java.lang.OutOfMemoryError: Required array size too large
Which surprised me a bit since it happed also for the stub tests. Is that expected? I'm running on a machine with 128G of RAM?
Running nextflow run . -profile test,docker --outdir test-results -stub worked fine.
Documentation
Going over the docs and following the links I noticed that it seems that data has not been generated under the results folder for the actual tools. I only found data for the references and the pipeline info. It also does not correspond to the description of the output in the docs (https://nf-co.re/rnafusion/dev/docs/output).
This is related for previous releases not to this one specifically, but I thought it might be indicative of something failing in the release process, so I wanted to mention it.
I also found some broken images under: https://nf-co.re/rnafusion/dev/docs/output#multiqc

Other
One thing I noticed was that a lot of the reference data being downloaded seemed to be some pretty big plain text files. I'm wondering if the download times couldn't be reduced somewhat by compressing them and providing a script that downloads and then decompresses them on the target system?
…to code_review
apeltzer
left a comment
There was a problem hiding this comment.
I'll also do some battle testing with our internal (synthetic) known Fusion data.
Co-authored-by: Alexander Peltzer <apeltzer@users.noreply.github.com>
Code review
|
@johandahlberg I cannot reproduce your memory issue and manage to run the tests on my laptop that definitely does not have 128GB RAM. We will leave the optimizations for the references download for next release, but feel free to open an issue about this to see it moving forward! |
|
All suggestions added in from the code_review branch |
v4.0.0 - [2025-09-10]
Added
ENSEMBL_DOWNLOAD#539HGNC_DOWNLOAD#540STRINGTIE_WORKFLOW#541PICARD_COLLECTRNASEQMETRICSand update module #551--skip_vcfboolean parameter to skip vcf file generation #554FUSIONREPORT_DOWNLOAD#560QC_WORKFLOW#568TRIM_WORKFLOW#572FUSIONREPORT_DETECT. ImproveFUSIONREPORT_DOWNLOADmodule #577ARRIBA_WORKFLOW#578STARFUSION_BUILD. #585STARFUSION_DETECT. #586CTATSPLICING_STARTOCANCERINTRONSand a new parameter--ctatsplicing. This options creates reports on cancer splicing aberrations and requires one or both of--arribaand--starfusionto be given. #587--references_onlywhen no data should be analysed, but only the references should be built #505FUSIONCATCHER_WORKFLOW#591STARFUSION_WORKFLOW. #597FUSIONINSPECTOR. #601CTATSPLICING_PREPGENOMELIBto update the starfusion genome library directory with a cancer splicing index. #610FUSIONREPORT_WORKFLOW. #607ARRIBA_VISUALISATION. #625bam: A BAM file aligned with STAR, it's the responsibility of the pipeline user to make sure this file has been correctly called.bai: The index of the BAM file, this is not required when abamfile has been given but can increase the pipeline speed a bit.cram: A CRAM file aligned with STAR, it's the responsibility of the pipeline user to make sure this file has been correctly called.crai: The index of the CRAM file, this is not required when acramfile has been given but can increase the pipeline speed a bit.junctions: A file containing the junctions determined by STAR (needed bystarfusionandctatsplicing)splice_junctionsA file containing the splice junctions determined by STAR (needed byctatsplicing)--fusioncatcher_download_link. #650seq_platformandseq_centerfields to the samplesheet. These values can be used to overwrite the value of--seq_platformand--seq_centeron a sample-by-sample basis #654--human_gencode_filterwhile building STARFusion references for human species #657--star_limit_bam_sort_ramto set the maximum memory for sorting the BAM files in STAR. #668--dfam_hmm,--dfam_h3f,--dfam_h3i,--dfam_h3m,--dfam_h3p,--pfam_fileand--annot_filter_urlparameters to allow use of custom files inSTARFUSION_BUILDmodule #709VCF_COLLECT. #745FUSIONINSPECTOR_WORKFLOW. #753Changed
RRNA_TRANSCRIPTS(replaced by nf-core module) #541FUSIONINSPECTORto v2.10.0. #601STARFUSION_DOWNLOAD#598--arribaand--starfusion#633--cramparameter has been converted to a boolean value instead of the comma-separated list of values. Use this parameter if you also want to create CRAM files from the BAM files created with STAR #633star/alignto 1.21 #634--run_fusioncatcherback tofusioncatcher#641--fastp_trim,--arriba,--ctatsplicing,--fusioncatcher,--starfusion,--stringtieand--alland replaced these parameters with the--toolsparameter. This parameter takes a comma-delimited list of tool names to run for the pipeline and is a required parameter. Following tools are supported by this parameter #645:arribactatsplicingfusioncatcherstarfusionstringtiefusionreportfastpsalmonfusioninspectorall=> This will automatically run all of the above toolswgetcontainers toconda-forge::wget=1.21.4#655FUSIONREPORTandFUSIONREPORT_DOWNLOADwithFUSIONREPORT_DETECTandFUSIONREPORT_DOWNLOADfrom nf-core #703STARFUSION_BUILDfor module from nf-core #709test_buildprofile to use a reduced version of Pfam and Dfam files #733ARRIBA_VISUALIZATION,CTATSPLICING_STARTOCANCERINTRONS,CTATSPLICING_PREPGENOMELIB,FUSIONINSPECTOR,STARFUSION_DETECTfor its nf-core module versions #740TRIM_WORKFLOWfor its nf-core subworkflow equivalentFASTQ_FASTQC_UMITOOLS_FASTP#752FASTQ_ALIGN_STARfor subworkflow from nf-core #756STRINGTIE_WORKFLOWfor its nf-core versionBAM_STRINGTIE_MERGEand the local module moduleGTF_TO_REFFLATfor nf-core'sUCSC_GTFTOGENEPRED[Migrate sbwf stringtie and module gtfftoreflat #758](Migrate sbwf stringtie and module gtfftoreflat #758Fixed
FUSIONREPORT_DOWNLOADwhen building references with--no_cosmic parameter#555FUSIONREPORT_DOWNLOADto use cosmic credentials inext.args#556RRNATRANSCRIPTSmodule #563GFFREADthat caused outputgffread_fastanot being produced #565FUSIONCATCHER_DOWNLOADthat caused an error when running with singularity profile #573gtf2bedwhich caused local moduleGET_RRNA_TRANSCRIPTSto fail #602FUSIONINSPECTORprocess will no longer fail when no fusions have been found. #651--human_gencode_filter#683transcript_type#749QC_WORKFLOWsubworkflow #756Removed
--build_referencesARRIBA_WORKFLOW#692STARFUSION_WORKFLOW#707CTATSPLICING_WORKFLOW#704FUSIONREPORT_WORKFLOW#721GET_RRNA_TRANSCRIPTS#736params.download_refsandparams.fusioncatcher_download_link#752Parameters
--no_cosmic--build_references--references_only--fastp_trim--tools fastp--arriba--tools arriba--run_fusioncatcher--tools fusioncatcher--starfusion--tools starfusion--stringtie--tools stringtie--all--tools all--fusioncatcher_download_link--trim_tail_fusioncatcher--save_trimmed_fail--save_merged--min_trimmed_reads--trim_tail_fusioncatcherPR checklist
nf-core lint).nextflow run . -profile debug,test,docker --outdir <OUTDIR>).docs/usage.mdis updated.docs/output.mdis updated.CHANGELOG.mdis updated.README.mdis updated (including new tool citations and authors/contributors).