57 changes: 57 additions & 0 deletions modules/nf-core/varnet/main.nf
@@ -0,0 +1,57 @@
process VARNET {
    tag "$meta.id"
    label 'process_high_memory'

    container "docker.io/kiranchari/varnet:latest"

Contributor:

Is there a plan to put it on bioconda?

Author:

Not at the moment!

Contributor:

Could you make one?

Member:

Note that an nf-core recommendation covers this: https://nf-co.re/docs/specifications/pipelines/recommendations/bioconda

We do make some exceptions, but usually only for things like licensing issues. We have a #bioconda channel on Slack with a bunch of experts around to help :)

Author:

My institution requires VarNet to be licensed under the PolyForm Noncommercial License 1.0.0, which prevents adding it to bioconda. Please let me know if my understanding is incorrect.

Contributor:

I've never heard of that one before.
From what I can tell, it allows redistribution, which I think would make it compatible with bioconda, but I'm not a bioconda expert.


    input:
    tuple val(meta), path(input_normal), path(index_normal), path(input_tumor), path(index_tumor)
    tuple val(meta2), path(intervals)
    tuple val(meta3), path(fasta)
    tuple val(meta4), path(fai)

    output:
    tuple val(meta), path("${prefix}/${prefix}.vcf.gz"), emit: vcf
    tuple val("${task.process}"), val("varnet"), val("1.5.0"), emit: versions_varnet, topic: versions

    when:
    task.ext.when == null || task.ext.when

    script:
    def args = task.ext.args ?: ''
    prefix = task.ext.prefix ?: "${meta.id}"
    def regions = intervals ? "--region_bed \$WORKDIR/${intervals}" : ""
    def normal = input_normal ? "--normal_bam \$WORKDIR/${input_normal}" : ""
    """

    WORKDIR=\$(pwd)

    cd /VarNet

Contributor:

Why are you changing directory here? This is why you need WORKDIR in the first place.

Author:

At the moment, my scripts rely on being in the /VarNet working directory. The test input files were using relative paths based on $WORKDIR instead of absolute paths, which was causing the tests to fail. Prepending $WORKDIR was a quick fix to make those paths valid so the tests could run.

Contributor:

To be honest with you, it sounds like you need to update your tool to make it functional.

Member:

To elaborate a little: a key feature of Nextflow is that the execution of each task is encapsulated within its work directory. That way, different tasks do not affect one another, and execution becomes much safer and more reproducible.

Moving outside of the work directory violates this, and also means the pipeline is much less likely to work in diverse compute environments. For example, you typically cannot create root-level directories like this when running on an HPC system using conda.
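
One way a tool could drop the `cd /VarNet` step is to look up its own bundled files relative to the script's location instead of the current directory. The Python sketch below only illustrates that idea; `resource_path` and the `models/weights.h5` layout are hypothetical names, not VarNet's actual code.

```python
from pathlib import Path


def resource_path(script_file: str, *parts: str) -> Path:
    """Return a path inside the directory that contains `script_file`.

    Intended to be called as resource_path(__file__, ...), so bundled files
    are found no matter which directory the script is launched from.
    """
    return Path(script_file).resolve().parent.joinpath(*parts)


# Hypothetical usage inside a script such as filter.py:
#   weights = resource_path(__file__, "models", "weights.h5")
```

With a helper along these lines, a script could be called by its absolute path from the task work directory and still locate its own files, so inputs and outputs stay relative to the work directory.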

    TF_CPP_MIN_LOG_LEVEL=3 python /VarNet/filter.py \\
        --sample_name ${prefix} \\
        ${normal} \\
        --tumor_bam \$WORKDIR/${input_tumor} \\

Contributor:

Does this really need to use WORKDIR for everything? Given it is your tool, could you patch it to respect the current working directory?

Author:

I modified this because the test input files were using relative paths based on $WORKDIR instead of absolute paths, which was causing the tests to fail. Prepending $WORKDIR was a quick fix to make those paths valid so the tests could run.
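
A tool-side alternative to prepending `$WORKDIR` in the module would be to resolve user-supplied paths against the current working directory when arguments are parsed. The sketch below assumes an `argparse`-based command line; the flag names are taken from the module's script block, while `abs_path` and `build_parser` are hypothetical and not VarNet's actual parser.

```python
import argparse
import os


def abs_path(value: str) -> str:
    """argparse `type` callback: make a path absolute relative to the cwd."""
    return os.path.abspath(value)


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(description="Path-handling sketch (hypothetical)")
    parser.add_argument("--tumor_bam", type=abs_path, required=True)
    parser.add_argument("--normal_bam", type=abs_path)  # optional: tumor-only runs
    parser.add_argument("--reference", type=abs_path, required=True)
    # A string default is also passed through `type`, so "." becomes the cwd.
    parser.add_argument("--output_dir", type=abs_path, default=".")
    return parser


# Example: relative inputs are turned into absolute paths at parse time.
example_args = build_parser().parse_args(
    ["--tumor_bam", "tumor.bam", "--reference", "genome.fasta"]
)
```

Because every path is absolute once parsing is done, relative paths given on the command line would keep working even if the tool later changed directory internally.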

        --reference \$WORKDIR/${fasta} \\
        --output_dir \$WORKDIR \\
        --processes ${task.cpus} \\
        ${regions} \\
        ${args}

    TF_CPP_MIN_LOG_LEVEL=3 python /VarNet/predict.py \\
        --sample_name ${prefix} \\
        ${normal} \\
        --tumor_bam \$WORKDIR/${input_tumor} \\
        --reference \$WORKDIR/${fasta} \\
        --output_dir \$WORKDIR \\
        --processes ${task.cpus} \\
        ${args}
    """

    stub:
    prefix = task.ext.prefix ?: "${meta.id}"
    """
    mkdir -p ${prefix}
    echo "" | gzip > ${prefix}/${prefix}.vcf.gz
    """
}

112 changes: 112 additions & 0 deletions modules/nf-core/varnet/meta.yml
@@ -0,0 +1,112 @@
name: "varnet"
description: VarNet is a deep learning-based somatic variant caller that utilizes
  a two-stage filtering and prediction architecture. It processes aligned reads from
  tumor and normal data to identify somatic mutations using a deep neural network
  approach.
keywords:
  - variant calling
  - machine learning
  - neural network
  - somatic
tools:
  - "varnet":
      description: "A deep learning-based somatic variant caller"
      homepage: "https://github.com/skandlab/VarNet"
      documentation: "https://github.com/skandlab/VarNet"
      tool_dev_url: "https://github.com/skandlab/VarNet"
      doi: "10.1038/s41467-022-31765-8"
      licence:
        - "PolyForm Noncommercial License 1.0.0"
      identifier: ""
input:
  - - meta:
        type: map
        description: |
          Groovy Map containing sample information
          e.g. [ id:'test', single_end:false ]
    - input_normal:
        type: file
        description: BAM/CRAM file from normal sample
        pattern: "*.{bam,cram}"
        ontologies: []
    - index_normal:
        type: file
        description: Index of normal BAM/CRAM file
        pattern: "*.{bai,crai}"
        ontologies: []
    - input_tumor:
        type: file
        description: BAM/CRAM file from tumor sample
        pattern: "*.{bam,cram}"
        ontologies: []
    - index_tumor:
        type: file
        description: Index of tumor BAM/CRAM file
        pattern: "*.{bai,crai}"
        ontologies: []
  - - meta2:
        type: map
        description: |
          Groovy Map containing reference information
    - intervals:
        type: file
        description: BED file containing target intervals
        pattern: "*.bed"
        ontologies: []
  - - meta3:
        type: map
        description: |
          Groovy Map containing reference information
    - fasta:
        type: file
        description: The reference fasta file
        pattern: "*.{fasta,fa}"
        ontologies: []
  - - meta4:
        type: map
        description: |
          Groovy Map containing reference information
    - fai:
        type: file
        description: Index of reference fasta file
        pattern: "*.fai"
        ontologies: []
output:
  vcf:
    - - meta:
          type: map
          description: |
            Groovy Map containing sample information
            e.g. `[ id:'sample1', single_end:false ]`
      - ${prefix}.vcf.gz:
          type: file
          description: Compressed VCF file
          pattern: "*.vcf.gz"
          ontologies:
            - edam: http://edamontology.org/format_3989 # GZIP format

  versions_varnet:
    - - ${task.process}:
          type: string
          description: The name of the process
      - varnet:
          type: string
          description: The name of the tool
      - "1.5.0":
          type: eval
          description: The expression to obtain the version of the tool
topics:
  versions:
    - - ${task.process}:
          type: string
          description: The name of the process
      - varnet:
          type: string
          description: The name of the tool
      - "1.5.0":
          type: eval
          description: The expression to obtain the version of the tool
authors:
  - "@kiranchari"
maintainers:
  - "@kiranchari"
95 changes: 95 additions & 0 deletions modules/nf-core/varnet/tests/main.nf.test
@@ -0,0 +1,95 @@
nextflow_process {

    name "Test Process VARNET"
    script "../main.nf"
    process "VARNET"

    tag "modules"
    tag "modules_nfcore"
    tag "varnet"

    test("tumor_normal_pair") {
        config './nextflow.config'

        when {
            process {
                """
                input[0] = [
                    [ id:'test' ],
                    file(params.modules_testdata_base_path + 'genomics/homo_sapiens/illumina/bam/test.paired_end.recalibrated.sorted.bam', checkIfExists: true),
                    file(params.modules_testdata_base_path + 'genomics/homo_sapiens/illumina/bam/test.paired_end.recalibrated.sorted.bam.bai', checkIfExists: true),
                    file(params.modules_testdata_base_path + 'genomics/homo_sapiens/illumina/bam/test2.paired_end.recalibrated.sorted.bam', checkIfExists: true),
                    file(params.modules_testdata_base_path + 'genomics/homo_sapiens/illumina/bam/test2.paired_end.recalibrated.sorted.bam.bai', checkIfExists: true)
                ]
                input[1] = [
                    [ id:'test_region' ],
                    file(params.modules_testdata_base_path + 'genomics/homo_sapiens/genome/chr21/sequence/multi_intervals.bed', checkIfExists: true)
                ]
                input[2] = [
                    [ id:'genome' ],
                    file(params.modules_testdata_base_path + 'genomics/homo_sapiens/genome/chr21/sequence/genome.fasta', checkIfExists: true)
                ]
                input[3] = [
                    [ id:'genome' ],
                    file(params.modules_testdata_base_path + 'genomics/homo_sapiens/genome/chr21/sequence/genome.fasta.fai', checkIfExists: true)
                ]
                """
            }
        }

        then {
            assertAll(
                { assert process.success },
                {
                    assert snapshot(
                        process.out.vcf.collect { file(it[1]).getName() },
                        process.out.versions
                    ).match()
                }
            )
        }

    }

    test("tumor_only") {
        config './nextflow.config'

        when {
            process {
                """
                input[0] = [
                    [ id:'test_tumor_only' ],
                    [], // input_normal is empty
                    [], // index_normal is empty
                    file(params.modules_testdata_base_path + 'genomics/homo_sapiens/illumina/bam/test2.paired_end.recalibrated.sorted.bam', checkIfExists: true),
                    file(params.modules_testdata_base_path + 'genomics/homo_sapiens/illumina/bam/test2.paired_end.recalibrated.sorted.bam.bai', checkIfExists: true)
                ]
                input[1] = [
                    [ id:'intervals' ],
                    file(params.modules_testdata_base_path + 'genomics/homo_sapiens/genome/chr21/sequence/multi_intervals.bed', checkIfExists: true)
                ]
                input[2] = [
                    [ id:'genome' ],
                    file(params.modules_testdata_base_path + 'genomics/homo_sapiens/genome/chr21/sequence/genome.fasta', checkIfExists: true)
                ]
                input[3] = [
                    [ id:'genome' ],
                    file(params.modules_testdata_base_path + 'genomics/homo_sapiens/genome/chr21/sequence/genome.fasta.fai', checkIfExists: true)
                ]
                """
            }
        }

        then {
            assertAll(
                { assert process.success },
                {
                    assert snapshot(
                        process.out.vcf.collect { file(it[1]).getName() },
                        process.out.versions
                    ).match()
                }
            )
        }
    }
}
28 changes: 28 additions & 0 deletions modules/nf-core/varnet/tests/main.nf.test.snap
@@ -0,0 +1,28 @@
{
    "tumor_only": {
        "content": [
            [
                "test_tumor_only_out.vcf.gz"
            ],
            null
        ],
        "timestamp": "2026-06-03T16:15:06.084936823",
        "meta": {
            "nf-test": "0.9.5",
            "nextflow": "26.04.3"
        }
    },
    "tumor_normal_pair": {
        "content": [
            [
                "test_out.vcf.gz"
            ],
            null
        ],
        "timestamp": "2026-06-03T16:06:07.792304371",
        "meta": {
            "nf-test": "0.9.5",
            "nextflow": "26.04.3"
        }
    }
}
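
Both snapshot entries record `null` as their second item. That is what one would expect if `process.out.versions` does not correspond to an emitted channel (the module emits `versions_varnet`), though this is an inference from the diff rather than something stated in the thread. As a rough aid for spotting this, the Python sketch below lists `null` items in a snapshot's `content` arrays; `null_entries` is a hypothetical helper, not part of nf-test.

```python
import json


def null_entries(snapshot_text: str) -> dict:
    """Map each test name to the indices of `null` items in its snapshot content."""
    snapshot = json.loads(snapshot_text)
    return {
        name: [i for i, item in enumerate(entry.get("content", [])) if item is None]
        for name, entry in snapshot.items()
    }


# Trimmed-down example in the same shape as the snapshot file.
example = '{"tumor_only": {"content": [["test_tumor_only_out.vcf.gz"], null]}}'
flagged = null_entries(example)
```

Any test name mapped to a non-empty list has at least one snapshotted value that came back empty and is probably worth a second look.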
8 changes: 8 additions & 0 deletions modules/nf-core/varnet/tests/nextflow.config
@@ -0,0 +1,8 @@
process {
    withName: VARNET {
        ext.prefix = { "${meta.id}_out" }
        memory = '16GB'
        cpus = 1
        // ext.args is no longer needed for the region since it's handled via the input channel
    }
}