Add varnet (https://github.com/skandlab/VarNet) #11787
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,68 @@ | ||
| process VARNET { | ||
| tag "$meta.id" | ||
| label 'process_high_memory' | ||
|
|
||
| // Using the official VarNet Docker image | ||
| container "docker.io/kiranchari/varnet:latest" | ||
|
|
||
| input: | ||
| tuple val(meta), path(input_normal), path(index_normal), path(input_tumor), path(index_tumor) | ||
| tuple val(meta2), path(intervals) | ||
| tuple val(meta3), path(fasta) | ||
| tuple val(meta4), path(fai) | ||
|
kiranchari marked this conversation as resolved.
Outdated
|
||
|
|
||
| output: | ||
| tuple val(meta), path("${prefix}/${prefix}.vcf.gz") , emit: vcf | ||
| tuple val("${task.process}"), val("varnet"), val("1.0.0"), emit: versions, topic: versions | ||
|
|
||
| when: | ||
| task.ext.when == null || task.ext.when | ||
|
|
||
| script: | ||
| def args = task.ext.args ?: '' | ||
| prefix = task.ext.prefix ?: "${meta.id}" | ||
| def regions = intervals ? "--region_bed \$WORKDIR/${intervals}" : "" | ||
| def normal = input_normal ? "--normal_bam \$WORKDIR/${input_normal}" : "" | ||
| """ | ||
|
|
||
| WORKDIR=\$(pwd) | ||
|
|
||
| cd /VarNet | ||
|
Contributor
Why are you changing directory here? This is why you need WORKDIR.
Author
At the moment, my scripts rely on being in the /VarNet working directory. The test input files were using relative paths based on $WORKDIR instead of absolute paths, which was causing the tests to fail. Prepending $WORKDIR was a quick fix to make those paths valid so the tests could run.
Contributor
To be honest, it sounds like you need to update your tool to make it functional.
Member
To elaborate a little: a key feature of Nextflow is that each task's execution is encapsulated within its work directory. That way, different tasks do not affect one another, and execution becomes much safer and more reproducible. Moving outside the work directory violates this and also makes the pipeline much less likely to work across diverse compute environments. For example, you typically cannot create root-level directories like this when running on an HPC system with conda. |
||
| TF_CPP_MIN_LOG_LEVEL=3 python /VarNet/filter.py \\ | ||
| --sample_name ${prefix} \\ | ||
| ${normal} \\ | ||
| --tumor_bam \$WORKDIR/${input_tumor} \\ | ||
|
Contributor
Does this really need to use WORKDIR for everything? Given that it is your tool, could you patch it to respect the current working directory?
Author
I modified this because the test input files were using relative paths based on $WORKDIR instead of absolute paths, which was causing the tests to fail. Prepending $WORKDIR was a quick fix to make those paths valid so the tests could run. |
||
| --reference \$WORKDIR/${fasta} \\ | ||
| --output_dir \$WORKDIR \\ | ||
| --processes ${task.cpus} \\ | ||
| ${regions} \\ | ||
| ${args} | ||
|
|
||
| TF_CPP_MIN_LOG_LEVEL=3 python /VarNet/predict.py \\ | ||
| --sample_name ${prefix} \\ | ||
| ${normal} \\ | ||
| --tumor_bam \$WORKDIR/${input_tumor} \\ | ||
| --reference \$WORKDIR/${fasta} \\ | ||
| --output_dir \$WORKDIR \\ | ||
| --processes ${task.cpus} \\ | ||
| ${args} | ||
|
kiranchari marked this conversation as resolved.
|
||
|
|
||
| cat <<-END_VERSIONS > \$WORKDIR/versions.yml | ||
| "${task.process}": | ||
| varnet: 1.5.0 | ||
| END_VERSIONS | ||
|
Member
Please switch to topics. |
||
| """ | ||
|
|
||
| stub: | ||
| prefix = task.ext.prefix ?: "${meta.id}" | ||
| """ | ||
| mkdir -p ${prefix} | ||
| echo "" | gzip > ${prefix}/${prefix}.vcf.gz | ||
|
|
||
| cat <<-END_VERSIONS > versions.yml | ||
| "${task.process}": | ||
| varnet: 1.0.0 | ||
| END_VERSIONS | ||
| """ | ||
| } | ||
|
|
||
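To illustrate the Member's point above about work-directory encapsulation, here is a minimal shell sketch. It is not VarNet's actual code: `tool.sh`, `model.txt`, and the temporary directories are hypothetical stand-ins invented for the example. It assumes a tool that locates its bundled resources relative to its own install location. Such a tool can be called by absolute path from the task work directory with plain relative input paths, so neither `cd /VarNet` nor a `$WORKDIR` prefix would be needed.

```shell
# Hypothetical stand-ins: tool_dir plays the role of a fixed install
# directory (like /VarNet); work_dir plays the role of a Nextflow task
# work directory.
tool_dir=$(mktemp -d)
work_dir=$(mktemp -d)

# The "tool" resolves its bundled resource from its own location,
# not from the caller's current directory.
cat > "$tool_dir/tool.sh" <<'EOF'
#!/usr/bin/env bash
set -euo pipefail
self_dir=$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)
model=$(cat "$self_dir/model.txt")
input=$(cat "$1")            # relative path, resolved against the caller's cwd
echo "${model}:${input}" > "$2"
EOF
chmod +x "$tool_dir/tool.sh"
echo "model-weights" > "$tool_dir/model.txt"

# Staged input, as Nextflow would place it in the task work directory.
echo "tumor-reads" > "$work_dir/tumor.txt"

run_task() {
    # The subshell cd only simulates Nextflow launching the task inside its
    # work directory; the task script itself never changes directory.
    ( cd "$work_dir" && "$tool_dir/tool.sh" tumor.txt out.txt )
}

run_task
result=$(cat "$work_dir/out.txt")
echo "$result"
```

Applied to this module, the equivalent change would live in the tool itself (resolving its own files from the script location), which is the direction the reviewers suggest. Whether VarNet's scripts can be adapted this way is not established in this thread.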
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,105 @@ | ||
| name: "varnet" | ||
| description: VarNet is a deep learning-based somatic variant caller that utilizes | ||
| a two-stage filtering and prediction architecture. It processes aligned reads from | ||
| tumor and normal data to identify somatic mutations using a deep neural network | ||
| approach. | ||
| keywords: | ||
| - variant calling | ||
| - machine learning | ||
| - neural network | ||
| - somatic | ||
| tools: | ||
| - "varnet": | ||
| description: "A deep learning-based somatic variant caller" | ||
| homepage: "https://github.com/skandlab/VarNet" | ||
| documentation: "https://github.com/skandlab/VarNet" | ||
| tool_dev_url: "https://github.com/skandlab/VarNet" | ||
| doi: "10.1038/s41467-022-31765-8" | ||
| licence: | ||
| - "PolyForm Noncommercial License 1.0.0" | ||
| identifier: "" | ||
| input: | ||
| - - meta: | ||
| type: map | ||
| description: | | ||
| Groovy Map containing sample information | ||
| e.g. [ id:'test', single_end:false ] | ||
| - input_normal: | ||
| type: file | ||
| description: BAM/CRAM file from normal sample | ||
| pattern: "*.{bam,cram}" | ||
| ontologies: [] | ||
| - index_normal: | ||
| type: file | ||
| description: Index of normal BAM/CRAM file | ||
| pattern: "*.{bai,crai}" | ||
| ontologies: [] | ||
| - input_tumor: | ||
| type: file | ||
| description: BAM/CRAM file from tumor sample | ||
| pattern: "*.{bam,cram}" | ||
| ontologies: [] | ||
| - index_tumor: | ||
| type: file | ||
| description: Index of tumor BAM/CRAM file | ||
| pattern: "*.{bai,crai}" | ||
| ontologies: [] | ||
| - - meta2: | ||
| type: map | ||
| description: | | ||
| Groovy Map containing reference information | ||
| - intervals: | ||
| type: file | ||
| description: BED file containing target intervals | ||
| pattern: "*.bed" | ||
| ontologies: [] | ||
| - - meta3: | ||
| type: map | ||
| description: | | ||
| Groovy Map containing reference information | ||
| - fasta: | ||
| type: file | ||
| description: The reference fasta file | ||
| pattern: "*.{fasta,fa}" | ||
| ontologies: [] | ||
| - - meta4: | ||
| type: map | ||
| description: | | ||
| Groovy Map containing reference information | ||
| - fai: | ||
| type: file | ||
| description: Index of reference fasta file | ||
| pattern: "*.fai" | ||
| ontologies: [] | ||
| output: | ||
| vcf: | ||
| - - meta: | ||
| type: map | ||
| description: | | ||
| Groovy Map containing sample information | ||
| e.g. `[ id:'sample1', single_end:false ]` | ||
| - ${prefix}/${prefix}.vcf.gz: {} | ||
| vcf_tbi: | ||
| - - meta: | ||
| type: map | ||
| description: | | ||
| Groovy Map containing sample information | ||
| e.g. `[ id:'sample1', single_end:false ]` | ||
| - ${prefix}/${prefix}.vcf.gz.tbi: {} | ||
|
Member
I feel like we're missing this in main.nf. No tbi files are being emitted. |
||
| versions: | ||
| - versions.yml: {} | ||
| topics: | ||
| versions: | ||
| - - ${task.process}: | ||
| type: string | ||
| description: The name of the process | ||
| - varnet: | ||
| type: string | ||
| description: The name of the tool | ||
| - 1.0.0: | ||
| type: string | ||
| description: The expression to obtain the version of the tool | ||
| authors: | ||
| - "@kiranchari" | ||
| maintainers: | ||
| - "@kiranchari" | ||
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,97 @@ | ||
| nextflow_process { | ||
|
|
||
| name "Test Process VARNET" | ||
| script "../main.nf" | ||
| process "VARNET" | ||
|
|
||
| tag "modules" | ||
| tag "modules_nfcore" | ||
| tag "varnet" | ||
|
|
||
| test("tumor_normal_pair") { | ||
| config './nextflow.config' | ||
|
SPPearce marked this conversation as resolved.
Outdated
|
||
|
|
||
| when { | ||
| process { | ||
| """ | ||
| input[0] = [ | ||
| [ id:'test' ], | ||
| file(params.modules_testdata_base_path + 'genomics/homo_sapiens/illumina/bam/test.paired_end.recalibrated.sorted.bam', checkIfExists: true), | ||
| file(params.modules_testdata_base_path + 'genomics/homo_sapiens/illumina/bam/test.paired_end.recalibrated.sorted.bam.bai', checkIfExists: true), | ||
| file(params.modules_testdata_base_path + 'genomics/homo_sapiens/illumina/bam/test2.paired_end.recalibrated.sorted.bam', checkIfExists: true), | ||
| file(params.modules_testdata_base_path + 'genomics/homo_sapiens/illumina/bam/test2.paired_end.recalibrated.sorted.bam.bai', checkIfExists: true) | ||
| ] | ||
| input[1] = [ | ||
| [ id:'test_region' ], | ||
| file(params.modules_testdata_base_path + 'genomics/homo_sapiens/genome/chr21/sequence/multi_intervals.bed', checkIfExists: true) | ||
| ] | ||
| input[2] = [ | ||
| [ id:'genome' ], | ||
| file(params.modules_testdata_base_path + 'genomics/homo_sapiens/genome/chr21/sequence/genome.fasta', checkIfExists: true) | ||
| ] | ||
| input[3] = [ | ||
| [ id:'genome' ], | ||
| file(params.modules_testdata_base_path + 'genomics/homo_sapiens/genome/chr21/sequence/genome.fasta.fai', checkIfExists: true) | ||
| ] | ||
| """ | ||
| } | ||
| } | ||
|
|
||
| then { | ||
| assertAll( | ||
| { assert process.success }, | ||
| { | ||
| assert snapshot( | ||
| process.out.vcf.collect { file(it[1]).getName() }, | ||
| process.out.vcf_tbi.collect { file(it[1]).getName() }, | ||
| process.out.versions | ||
|
kiranchari marked this conversation as resolved.
Outdated
|
||
| ).match() | ||
| } | ||
| ) | ||
| } | ||
|
|
||
| } | ||
|
|
||
| test("tumor_only") { | ||
| config './nextflow.config' | ||
|
|
||
| when { | ||
| process { | ||
| """ | ||
| input[0] = [ | ||
| [ id:'test_tumor_only' ], | ||
| [], // input_normal is empty | ||
| [], // index_normal is empty | ||
| file(params.modules_testdata_base_path + 'genomics/homo_sapiens/illumina/bam/test2.paired_end.recalibrated.sorted.bam', checkIfExists: true), | ||
| file(params.modules_testdata_base_path + 'genomics/homo_sapiens/illumina/bam/test2.paired_end.recalibrated.sorted.bam.bai', checkIfExists: true) | ||
| ] | ||
| input[1] = [ | ||
| [ id:'intervals' ], | ||
| file(params.modules_testdata_base_path + 'genomics/homo_sapiens/genome/chr21/sequence/multi_intervals.bed', checkIfExists: true) | ||
| ] | ||
| input[2] = [ | ||
| [ id:'genome' ], | ||
| file(params.modules_testdata_base_path + 'genomics/homo_sapiens/genome/chr21/sequence/genome.fasta', checkIfExists: true) | ||
| ] | ||
| input[3] = [ | ||
| [ id:'genome' ], | ||
| file(params.modules_testdata_base_path + 'genomics/homo_sapiens/genome/chr21/sequence/genome.fasta.fai', checkIfExists: true) | ||
| ] | ||
| """ | ||
| } | ||
| } | ||
|
|
||
| then { | ||
| assertAll( | ||
| { assert process.success }, | ||
| { | ||
| assert snapshot( | ||
| process.out.vcf.collect { file(it[1]).getName() }, | ||
| process.out.vcf_tbi.collect { file(it[1]).getName() }, | ||
| process.out.versions | ||
|
kiranchari marked this conversation as resolved.
Outdated
|
||
| ).match() | ||
| } | ||
| ) | ||
| } | ||
| } | ||
| } | ||
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,46 @@ | ||
| { | ||
| "tumor_only": { | ||
| "content": [ | ||
| [ | ||
| "test_tumor_only_out.vcf.gz" | ||
| ], | ||
| [ | ||
|
|
||
| ], | ||
| [ | ||
| [ | ||
| "VARNET", | ||
| "varnet", | ||
| "1.0.0" | ||
| ] | ||
| ] | ||
| ], | ||
| "timestamp": "2026-05-28T17:01:27.251285842", | ||
| "meta": { | ||
| "nf-test": "0.9.5", | ||
| "nextflow": "26.04.0" | ||
| } | ||
| }, | ||
| "tumor_normal_pair": { | ||
| "content": [ | ||
| [ | ||
| "test_out.vcf.gz" | ||
| ], | ||
| [ | ||
|
|
||
| ], | ||
| [ | ||
| [ | ||
| "VARNET", | ||
| "varnet", | ||
| "1.0.0" | ||
| ] | ||
| ] | ||
| ], | ||
| "timestamp": "2026-05-28T16:46:04.690450597", | ||
| "meta": { | ||
| "nf-test": "0.9.5", | ||
| "nextflow": "26.04.0" | ||
| } | ||
| } | ||
| } |
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,8 @@ | ||
| process { | ||
| withName: VARNET { | ||
| ext.prefix = { "${meta.id}_out" } | ||
| memory = '16GB' | ||
| cpus = 1 | ||
| // ext.args is no longer needed for the region since it's handled via the input channel | ||
|
kiranchari marked this conversation as resolved.
|
||
| } | ||
| } | ||
Is there a plan to put it on bioconda?

Not at the moment!

Could you make one?

Note that an nf-core recommendation describes this: https://nf-co.re/docs/specifications/pipelines/recommendations/bioconda
We do make some exceptions for this, but usually only for things like licensing issues. We have a #bioconda channel on Slack with a bunch of experts around to help :)

My institution requires VarNet to use a PolyForm Noncommercial License 1.0.0, which prevents adding it to bioconda. Please let me know if my understanding is incorrect.

Never heard of that one before.
From what I can tell, it allows for redistribution, which I think would mean it is compatible with bioconda. But I'm not a bioconda expert.