Releases: vgteam/vg

vg 1.75.0 - Spike

15 Jun 15:52
32d310b

Known Issues

In this release, the single-command haplotype sampling mode of vg giraffe will include kmc k-mer counting logs in the alignment output files, corrupting them. This issue is fixed in #4938.

Download for Linux

Don't forget to mark the static binary executable:

chmod +x vg

Docker Image: quay.io/vgteam/vg:v1.75.0

Buildable Source Tarball: vg-v1.75.0.tar.gz

Includes source for vg and all submodules. Use this instead of GitHub's "Source Code" downloads; those will not build as they do not include code for bundled dependencies that the vg build process needs.

This release includes:

  • Put back the code to finalize giraffe's paired end distribution after trying enough reads.
  • vg CI builds of ARM containers should no longer segfault when upgrading libc
  • Alignment scoring and mapping quality computation have been broken out of GSSWAligner and moved to AlignmentScorer and MappingQualityCalculator.
  • vg depth will now work on .gbz files.
  • vg stats returns correct aggregate stats even when some values are negative
  • vg filter --tsv-out has a softclip_total option for convenience (softclip_end + softclip_start)
  • Speed up minimizer index construction.
  • The vg giraffe --rec-penalty-chain parameter has been split into --rec-penalty (for chaining), --rec-consistency-bonus (a bonus for haplotype consistency used during chaining but not incorporated into the chain score), and --rec-penalty-aln (used to penalize alignment scores per recombination).
  • Recombination-aware minimizer indexing is now always on when there are few enough haplotypes and the GBZ being indexed is not a path cover. Passing --rec-mode to vg minimizer now just makes it fail if recombination-aware minimizer indexing isn't on (because of too many haplotypes or the presence of synthetic path cover paths).
  • Recombination-aware mapping is now the default in vg giraffe, if a recombination-aware minimizer index file is loaded and you are using the hifi or r10 presets. To turn it off, pass --no-rec-mode. There's no longer a distinction between .path minimizer and zipcodes files and normal ones.
  • The hifi and r10 presets for vg giraffe have been updated with tuned recombination penalty settings.
  • vg giraffe no longer produces alignments with nonempty path and negative or zero score. Potential alignments that would reach or go below a score of 0 (perhaps because of --rec-penalty-aln) will be removed, and if needed an unmapped alignment record will be emitted for the read.
  • Significant time and memory optimizations to vg giraffe chaining/long-read mode
  • --comments-as-tags is now under test with vg giraffe's chaining codepath
  • Surject tests now test SAM tags in GAM with an actual vg surject command line
  • vg surject now preserves unrecognized GAF tags as tags on output alignments (and GAF input in general retains tags)
  • vg giraffe chaining mode now properly retains input tags on unmapped reads
  • vg giraffe --track-provenance should no longer crash with complaints about the filters. (Fixes an unreleased regression.)
  • Add option vg filter --tsv-out "is_aligned" to return whether a read has an alignment
  • Add new vg giraffe filter for low-scoring MAPQ 0 R10 reads
  • vg stats -a reports aggregate bp/alignment stats as per aligned reads, ignoring unmapped reads
  • Remove --item-scale and --points-per-possible-match from vg giraffe as needless unused complexity.
  • vg giraffe chaining mode allows negative affine-gap alignment scores to be log-gap rescored before tossing out negatively scoring alignments (minor accuracy improvement)
  • vg now uses an old version of the multi-arch support container in its CI Docker builds to work around tonistiigi/binfmt#298
  • vg find -Q/--paths-named is now deprecated due to its partial-Protobuf output
  • vg find will now index its target paths but not other haplotype paths.
  • vg should no longer position-index haplotype paths unnecessarily in commands using the PathPositionOverlayHelper.
  • vg filter can accept GAMPs when it's told to expect them, and errors nicely with --input-mp-alns --tsv-out
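
The softclip_total field mentioned above is described as softclip_end + softclip_start. As a rough sketch of the same arithmetic on a TSV that already carries the two separate fields, the helper below appends their sum. The three-column layout (read name, softclip_start, softclip_end), the helper name, and the records are assumptions made up for illustration; the real column order depends on the field list passed to --tsv-out.

```shell
# Hypothetical helper: append softclip_start + softclip_end as a 4th column.
# Assumes tab-separated input with columns: name, softclip_start, softclip_end.
add_softclip_total() {
    awk -F'\t' -v OFS='\t' '{ print $1, $2, $3, $2 + $3 }'
}

# Made-up example records, not real vg output.
printf 'read1\t5\t0\nread2\t12\t30\n' | add_softclip_total
```

Requesting softclip_total directly avoids this extra post-processing step.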

Updated Submodules

The gbwtgraph, libbdsg, and libvgio submodules have been updated.

vg 1.74.1 - Petrie

11 May 19:21

Download for Linux

Don't forget to mark the static binary executable:

chmod +x vg

Docker Image: quay.io/vgteam/vg:v1.74.1

Buildable Source Tarball: vg-v1.74.1.tar.gz

Includes source for vg and all submodules. Use this instead of GitHub's "Source Code" downloads; those will not build as they do not include code for bundled dependencies that the vg build process needs.

This release includes:

  • Added a little test file of reads for new Quickstart page
  • vg gbwt option --subgraph-of for marking a GBZ graph as a subgraph of another.
  • Fixed a bug with minimizer indexing that impacted recombination-aware mapping with Giraffe

Updated Submodules

The gbwtgraph submodule has been updated.

vg 1.74.0 - Petrie

04 May 19:56

Download for Linux

Don't forget to mark the static binary executable:

chmod +x vg

Docker Image: quay.io/vgteam/vg:v1.74.0

Buildable Source Tarball: vg-v1.74.0.tar.gz

Includes source for vg and all submodules. Use this instead of GitHub's "Source Code" downloads; those will not build as they do not include code for bundled dependencies that the vg build process needs.

This release includes:

  • Added vg giraffe --haplotype-sampling to automatically count kmers and haplotype-index and haplotype-sample the graph. Make sure to have kmc installed. Providing either a --kff-name or --haplotype-name will now also trigger generation of the other. To do one-reference sampling, continue to use --set-reference. To do non-diploid sampling with a certain number of haplotypes, use --no-diploid-sampling and --num-haplotypes.
  • vg giraffe will no longer claim to be guessing a GBZ file you definitely told it to use
  • vg paths -u fixed to use the reference path to help root the integrated snarl finder.
  • vg gbwt option --gbz-v1 for writing GBZ version 1 for compatibility with older tools
  • Remove broken vg paths --extract-vg option which would extract a partial Protobuf graph file in a way so poorly explained as to be unusable.
  • Giraffe no longer ignores the parts of seeds that extend outside their graph nodes to the left when scoring them. Note that this can reduce R10 read variant calling accuracy versus the previous release of vg. This regression was fixed before release (see below).
  • Giraffe hifi mapping preset has been re-tuned for new seed score distribution.
  • Chain visualizations no longer need to be panned or zoomed to show changes to the traceback.
  • Chain visualizations no longer accumulate more and more transition lines when mousing in and out of a selected node.
  • Giraffe no longer tries to position-index all haplotypes when showing work. If you need all haplotypes position-indexed for debugging chaining against them, use --haplotype-positions.
  • vg autoindex --gfa will error if the filename seems gzipped
  • vg CI data is no longer hosted under a user public_html directory
  • vg autoindex has a -w sampling workflow to make indexes for haplotype sampling
  • Revised Giraffe chain and alignment scoring. Alignments generated from chains are now no longer scored with the affine gap model used for base-level dynamic programming, but instead are scored with a logged-gap-score, variable-mismatch-penalty model borrowed from minimap2.
  • Calling results from nanopore reads are now better than v1.73.0 again.
  • Simplify an internal return value for align_sequence_between().
  • vg giraffe will now stop with an error when the minimizers or zipcodes are older than the distance index they were supposedly generated from.
  • Add compile-time option to check ziptree iterator for missing seed-to-seed transitions
  • augref-related options in vg paths renamed to be gref-related
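
To give a feel for why a logged gap score treats long gaps differently from an affine one, the sketch below prints both penalties for a few gap lengths. Every constant and both formulas here are illustrative assumptions; they are not the parameters or exact scoring functions used by vg giraffe or minimap2.

```shell
# Compare how two gap penalties grow with gap length.
# All constants are hypothetical, chosen only to show the shape of each curve.
gap_penalties() {
    # $1 = gap length in bases (positive integer)
    awk -v len="$1" 'BEGIN {
        gap_open = 6; gap_extend = 1; log_scale = 2
        affine = gap_open + gap_extend * (len - 1)           # linear in length
        logged = gap_open + log_scale * log(len) / log(2)    # grows with log2(length)
        printf "%d\t%d\t%.1f\n", len, affine, logged
    }'
}

# Columns: gap length, affine penalty, logged penalty.
for n in 1 8 64; do gap_penalties "$n"; done
```

Under these made-up numbers the two penalties agree for a one-base gap and diverge quickly as the gap gets longer, which is the general behavior a logged gap score is meant to provide.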

Updated Submodules

The gbwtgraph and libvgio submodules have been updated.

vg 1.73.0 - Ducky

23 Mar 19:40
61212f9

Download for Linux

Don't forget to mark the static binary executable:

chmod +x vg

Docker Image: quay.io/vgteam/vg:v1.73.0

Buildable Source Tarball: vg-v1.73.0.tar.gz

Includes source for vg and all submodules. Use this instead of GitHub's "Source Code" downloads; those will not build as they do not include code for bundled dependencies that the vg build process needs.

This release includes:

  • Off-reference cover logic moved from vg deconstruct to vg paths. deconstruct and call now have prototype logic to fully take advantage of it.
  • Fix regression in vg clip, depth, simplify and potentially some uses of deconstruct and call, that results from a change that ignores haplotypes in .vg files (to be consistent with how .gbz files would have been treated).
  • Better distance indexing in complex DAG snarls. Distance indexes should be re-made.
  • vg wiki manpage links to subcommand sections now work
  • Add option --exclude-sample to vg paths
  • Stable GAF sorting is actually stable.
  • vg surject -S no longer loses read names with GAM input
  • Added the total number of recombinations in a chain; recombinant anchors are now marked in the chain dump file
  • Very minor vg giraffe chaining mode speedup
  • vg surject now takes --read-length short and --read-length long, and sets low-complexity pruning correctly.
  • vg giraffe's built-in surjection now uses low-complexity pruning by default for long reads.
  • vg giraffe now has --no-XXX and --XXX flag options in pairs.
  • R plotting scripts no longer insist on installing all their dependencies
  • Add #define compile-time option to print info about sampled haplotypes in vg haplotypes
  • vg call -C can now be used with -a
  • GBZ-to-GBZ chunking with vg chunk --gbz (can choose all components or components by contig name).
  • vg convert options --gbwtgraph-algorithm and --drop-haplotypes work correctly together in GBZ to GFA conversion.
  • vg describe works better with old obsolete files.
  • Add vg sim --use-average-length option

Updated Submodules

The gbwt, gbwtgraph, and libbdsg submodules have been updated.

vg 1.72.0 - Littlefoot

09 Feb 21:25
b676ef8

Download for Linux

Don't forget to mark the static binary executable:

chmod +x vg

Docker Image: quay.io/vgteam/vg:v1.72.0

Buildable Source Tarball: vg-v1.72.0.tar.gz

Includes source for vg and all submodules. Use this instead of GitHub's "Source Code" downloads; those will not build as they do not include code for bundled dependencies that the vg build process needs.

This release includes:

  • Giraffe now just uses a single chaining pass, instead of a fragmenting pass and then a chaining pass
  • Remove a useless check/error that can by definition never be raised
  • vg paths should again work on GBZ files containing only haplotypes
  • Giraffe/DeepVariant is now under CI test.
  • Per-unit-test-set binaries (like bin/unittest/snarl_distance_index) work again
  • Operations on GBZ graphs no longer hide haplotype paths from the PathHandleGraph iteration functions. vg attempts to request the appropriate path senses when haplotype paths should be ignored for a particular operation.
  • vg giraffe chaining mode bugfix; minor accuracy improvement
  • vg now requires C++17 on Linux
  • vg giraffe help now mentions its start[:end[:step]] range specification syntax
  • Help vg autoindex not error when indexing a graph with oversized snarls
  • Update vg surject helptext to be clear that GAM is the default output format
  • vg giraffe in non-chaining mode will no longer mis-index pair distances when rescue fails
  • vg giraffe --supplementary
  • test/build_graph executable should no longer be mistaken for malware
  • Fixed some non-wrapping vg index helptext
  • GBZ version 2 with better compression for sequences (existing files can still be used).
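
For readers unfamiliar with start[:end[:step]] notation, here is one possible reading of it as a list of values. The semantics below (integers only, inclusive end, step defaulting to 1, a bare start yielding a single value) and the helper name are assumptions for illustration; consult the vg giraffe help for the actual behavior.

```shell
# Hypothetical expansion of a start[:end[:step]] specification.
# Assumptions (not taken from vg): integers only, inclusive end,
# step defaults to 1, and a bare "start" yields a single value.
expand_range() {
    echo "$1" | awk -F: '{
        start = $1
        end   = (NF >= 2) ? $2 : $1
        step  = (NF >= 3) ? $3 : 1
        if (step <= 0) { print "step must be positive" > "/dev/stderr"; exit 1 }
        for (v = start; v <= end; v += step)
            printf "%s%d", (v == start ? "" : " "), v
        printf "\n"
    }'
}

expand_range 10          # prints: 10
expand_range 10:13       # prints: 10 11 12 13
expand_range 10:30:10    # prints: 10 20 30
```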

Updated Submodules

The gbwt, gbwtgraph, libbdsg, libhandlegraph, libvgio, and sdsl-lite submodules have been updated.

vg 1.71.0 - Cera

06 Jan 16:02
010e9ed

Download for Linux

Don't forget to mark the static binary executable:

chmod +x vg

Docker Image: quay.io/vgteam/vg:v1.71.0

Buildable Source Tarball: vg-v1.71.0.tar.gz

Includes source for vg and all submodules. Use this instead of GitHub's "Source Code" downloads; those will not build as they do not include code for bundled dependencies that the vg build process needs.

This release includes:

  • Running vg augment with no arguments will print helptext
  • Recombination aware chaining fixes
  • Explainer explanations for reads now get organized into explanation_<READ_NAME> directories.
  • Explainer explanations for reads now explain all chains, not just the best one.
  • Explainer explanations for reads now include coordinates on all haplotypes, not just references.
  • New scripts/plot_chains.sh script to plot all the chains for explained reads against all the contigs
  • GBZ graphs store stable graph names (pggname).
    • The information is copied to haplotype information files, minimizer indexes, and GFA/GAF headers.
    • Some subcommands (e.g. vg giraffe, vg haplotypes, vg pack) use the information to determine if the input files are compatible.
  • Standalone GBWTGraph (.gg) files are no longer supported.
  • New version of haplotype information (.hapl) files with tags. Old files can still be read.
  • Haplotype sampling should work better with noisy kmer counts.
  • Bugfix for vg giraffe chaining; improvements to accuracy and minor effect on runtime
  • Add option vg haplotypes --ban-sample
  • vg filter gives a clean error when passing files that don't look like GAMs
  • vg paths prints a warning if path criteria select 0 paths
  • vg gbwt and vg autoindex support GFA files with grammar-compressed walks.
  • Random double space in a vg autoindex logging line is now a single space
  • The random zip code tree test works with the --rng-seed option.
  • -P option added to vg snarls and vg index to specify a reference backbone for orienting the snarl tree. This can be required to run vg haplotypes on some graphs from minigraph-cactus with newer vg versions. Can be thought of as a much higher-level version of the current -w interface which lets you manually upweight nodes.
  • vg giraffe can compute supplementary alignments with the --supplementary option

Updated Submodules

  • gbwt
  • gbwtgraph
  • libvgio

vg 1.70.0 - Zebedassi

17 Nov 21:55
4cdd53a

Download for Linux

Don't forget to mark the static binary executable:

chmod +x vg

Docker Image: quay.io/vgteam/vg:v1.70.0

Buildable Source Tarball: vg-v1.70.0.tar.gz

Includes source for vg and all submodules. Use this instead of GitHub's "Source Code" downloads; those will not build as they do not include code for bundled dependencies that the vg build process needs.

This release includes:

  • Minor formatting improvements in README
  • Fix bug in distance indexing where there weren't enough bits per int to represent all values
  • Add more softclip statistics to vg stats
  • vg inject now has the option --allow-missing-contig/-a which treats reads mapped to missing contigs as unmapped instead of erroring (Resolves #4613)
  • Warn in vg snarls when -l, -o, or -a is used without --traversals
  • Make helptext for vg index --snarl-limit match reality
  • Minimizer index changes:
    • New version of the minimizer index. Existing indexes must be rebuilt.
    • Fixes for vg minimizer and autoindex after the --rec-mode changes (vg minimizer no longer fails to save oversized zipcode references).
    • The index now knows the type of the payload stored with each hit.
    • A --rec-mode index also knows the path name fields used to identify haplotypes.
  • Fix bug causing vg pack -d -e crash.
  • vg describe subcommand for identifying and describing files based on header information.
  • Create utility functions for basic parsing/validity checks, and use them in subcommands + src/index_registry.cpp
    • At least attempt to enforce the use of some new standardized parsing/validity functions
    • Create utility functions info(), warn(), and error() for pretty error/warning printing (also exposed by a Logger object) and use them in subcommands + src/index_registry.cpp
  • Fix two broken tests in test/t/03_vg_view.t
  • Changes to GAF output:
    • Header lines starting with @. All tools reading GAF files must be updated to handle headers.
    • Unaligned sequences are preserved as insertions aligned to an empty target path.
  • libhandlegraph versions have been re-synced
  • vg deconstruct -f option to write a FASTA file of off-reference sequence, as well as a TSV table describing its locations
  • vg giraffe will again use path payloads from the minimizer index
  • If vg index --snarl-limit has a threshold equal to a snarl's size, it no longer counts as an oversized snarl
  • Hint to the user what value they might need to increase --snarl-limit to
  • vg snarls -w option added to specify node weights (similar to index -w)
  • vg call and vg deconstruct now use reference-guided snarl decomposition by default.
  • vg clip, vg simplify and vg stats now use reference information when applicable/available during snarl computation.
  • Empty string SAM tags can now be parsed when embedded in GAM records.
  • vg surject -p now works on haplotype paths (with their #0, #1 etc. fragment numbers) in a GBZ.
  • vg manpage generator now includes vg combine
  • In vg giraffe chaining mode, don't bother calculating DP matrix size if a conservative/minimal size estimate would exceed the maximum threshold
  • Put long read giraffe preprint link in README for citation
  • Remove duct tape by reordering snarl ranks, which breaks previous distance indexes
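
Since GAF output can now begin with header lines starting with @, a downstream script that expects only alignment records may need to skip them. A minimal sketch of that filtering follows; the helper name, the header text, and the record are made-up placeholders rather than real vg output.

```shell
# Drop GAF header lines (those starting with '@') so that a consumer
# expecting only alignment records sees just the records.
strip_gaf_headers() {
    awk '!/^@/'
}

# Made-up input: one placeholder header line and one fake 12-column record.
printf '@placeholder header line\nread1\t100\t0\t100\t+\t>1>2\t150\t10\t110\t100\t100\t60\n' \
    | strip_gaf_headers
```

Using awk rather than grep -v keeps the exit status at zero even when the input contains nothing but headers.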

Updated Submodules

  • gbwtgraph
  • libbdsg
  • libhandlegraph
  • libvgio
  • sdsl-lite
  • xg

vg 1.69.0 - Bologna

16 Oct 14:33
077917c

Download for Linux

Don't forget to mark the static binary executable:

chmod +x vg

Docker Image: quay.io/vgteam/vg:v1.69.0

Buildable Source Tarball: vg-v1.69.0.tar.gz

Includes source for vg and all submodules. Use this instead of GitHub's "Source Code" downloads; those will not build as they do not include code for bundled dependencies that the vg build process needs.

Release Note! Compared to the previous v1.68.0 release, vg giraffe is faster on long reads, but may be less accurate for variant calling from HiFi reads, when using available trained DeepVariant models.

This release includes:

  • vg inject now produces useful error messages when reads go out of range on paths
  • vg autoindex now gives you hints about what files would help it, when it can't make the indexes it wants to make.
  • vg chains subcommand for extracting top-level chains from a distance index or a snarls file for GBZ-base.
  • vg inject will no longer spontaneously map SAM/BAM reads that have their mapping fields filled in but are flagged as unmapped.
  • vg inject will now throw away scores for unmapped reads
  • vg stats and vg inject can now understand reads that are asserted to be "mapped", but where the position/path is not provided, a thing the SAM spec does not appear to prohibit.
  • Zip code trees for vg giraffe's chaining mode now have non-heuristic distances in non-DAG snarls (intra-chain reversals are still not handled at all). As a practical matter, we get significant speedups on HiFi and R10 reads (especially for the slowest reads) and a tiny increase in read identity scores (though some increase and some decrease)
  • vg mapping tools can now produce supplementary alignments for SAM/BAM output
  • vg giraffe now implements a recombination aware chaining algorithm
  • GBWTGraph can again be built for more than 64 paths
  • vg find -G now includes regions of paths touched by the extracted graph
  • vg haplotypes --include-reference now also includes reference paths that do not visit any snarls.
  • Breaking changes to the haplotype information (.hapl) files used by vg haplotypes. Old files can no longer be used.
  • Improve automatic manpage generation
  • Fixed haplotypes supported by minimizers (for recombination-aware vg giraffe)
  • Add tiebreak on identity for alignments with identical score (vg giraffe)
  • Heuristically detect & fix when snarl ranks are sorted backwards in zip code tree
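
The identity tiebreak for equal-scoring alignments can be pictured as a two-key ordering: score first, identity second. The sketch below ranks a small table that way. The three-column layout (name, score, identity), the helper name, and the values are hypothetical; this is not how vg giraffe stores or compares alignments internally.

```shell
# Rank candidate alignments by score (descending), breaking ties by
# identity (descending). Input columns (hypothetical): name, score, identity.
rank_alignments() {
    # LC_ALL=C keeps '.' as the decimal point regardless of the user's locale.
    LC_ALL=C sort -t "$(printf '\t')" -k2,2nr -k3,3nr
}

# Made-up candidates: alnA and alnB tie on score, so identity decides.
printf 'alnA\t120\t0.95\nalnB\t120\t0.99\nalnC\t110\t1.00\n' | rank_alignments
```

Here alnB is listed ahead of alnA because of its higher identity, while alnC stays last despite perfect identity because score takes priority.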

Updated Submodules

  • gbwtgraph
  • sdsl-lite

vg 1.68.0 - Rimbocchi

25 Aug 20:34
e78e9d5

Download for Linux

Don't forget to mark the static binary executable:

chmod +x vg

Docker Image: quay.io/vgteam/vg:v1.68.0

Buildable Source Tarball: vg-v1.68.0.tar.gz

Includes source for vg and all submodules. Use this instead of GitHub's "Source Code" downloads; those will not build as they do not include code for bundled dependencies that the vg build process needs.

This release includes:

  • vg index now accepts a -w option to up-weight nodes to push the top-level chain through them when finding snarls
  • Added a warning that path selection options are not compatible with vg paths -g
  • vg haplotypes exits with an error if the snarl decomposition contains a cyclical top-level chain.
  • scripts/check_options.py now catches if something other than , is between shortform and longform options
  • Add option vg autoindex --no-guessing to allow force-regenerating indices
  • Lookup of regions within paths that are themselves subpaths (like Stella_v1p1#0#Chr4__Stella_v1p1[11578420-11580540]:0-100) should now work again.
  • Add errors when using incompatible options in vg depth
  • SAM-style tags are no longer lost on unmapped reads during surject
  • vg's vcflib build will now use the default python3 instead of the latest installed Python (which might not have its headers)
  • Add nodes as a vg filter --tsv-out field option; prints a comma-separated list of nodes traversed by the read's path
  • vg giraffe now has a --softclip-penalty flag to reduce alignment scores per-base for softclips
  • vg filter now has a -W/--overwrite-score flag to save the scores from --rescore.
  • vg filter now checks to make sure you aren't using --rescore or related options when they would do nothing.
  • Internal changes in vg giraffe to allow multiple presets to potentially share settings.
  • Bug fixes for chain transition distance measurement with the zip code tree in vg giraffe
  • vg now supports Protobuf 30+ and its string view return types.
  • vg mod now has an --invert-keep-paths option to save the complement of path names passed to --keep-paths
  • vg giraffe -b hifi preset now uses a --max-min-chain-score of 100
  • vg now has a libbdsg that can run is_regular_snarl() on a distance-less distance index.
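
Region strings on subpaths, like the Stella_v1p1 example above, contain a hyphen inside the bracketed subpath range as well as in the trailing region, so naive splitting on - or [ goes wrong. One way a script might take such a string apart is to cut at the last colon, as sketched below. The helper name and the split-at-last-colon rule are assumptions for illustration, not a description of vg's own parser.

```shell
# Split a region string of the form PATH:START-END, where PATH may itself
# be a subpath name containing [START-END], by cutting at the LAST colon.
split_region() {
    region=$1
    path=${region%:*}        # everything before the last ':'
    range=${region##*:}      # everything after the last ':'
    # Prints: path name, region start, region end (tab-separated).
    printf '%s\t%s\t%s\n' "$path" "${range%-*}" "${range#*-}"
}

split_region 'Stella_v1p1#0#Chr4__Stella_v1p1[11578420-11580540]:0-100'
```

The path column keeps the bracketed subpath range intact, and only the final 0-100 is treated as the requested region.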

Updated Submodules

  • gbwtgraph
  • libbdsg
  • libvgio

vg 1.67.0 - Vetria

14 Jul 19:44
379c37d

Download for Linux

Don't forget to mark the static binary executable:

chmod +x vg

Docker Image: quay.io/vgteam/vg:v1.67.0

Buildable Source Tarball: vg-v1.67.0.tar.gz

Includes source for vg and all submodules. Use this instead of GitHub's "Source Code" downloads; those will not build as they do not include code for bundled dependencies that the vg build process needs.

This release includes:

  • GAF path end positions are calculated correctly in some edge cases.
  • --keep-path can now be used multiple times in vg mod
  • vg giraffe --track-correctness should no longer crash when read truth positions are on paths that exist in the graph, but are too short to reach where the read is.
  • Bring vg cluster up-to-date: now accepts GBZ files, can do short-read or long-read giraffe, and allows --prefix for better compatibility with vg autoindex
  • Add some options to vg cluster to help with chaining issue diagnosis: print out cyclic snarl sizes, seeds with high hit amounts
  • Fix GFA haplotype sniffing for GFAs with P-lines
  • Use graph metadata and not path name to determine reference/haplotype status for paths in vg call and vg deconstruct.
  • Loading transcript files will now produce a human-readable error message when there are duplicate transcripts with the same ID on different paths.
  • The GBWT built while sorting GAF with vg gamsort is now forward-only by default.
  • vg sim now can output in FASTQ format via --fastq-out
  • Make vg mod -t take an argument and stop -E from requiring one
  • In vg chunk, fix the long names for -P, -c, -r, and -R, and make the latter two accept arguments.
  • Register command line options correctly & put them under test (scripts/check_options.py). This involved a lot of minor bugfixes and helptext modifications, collected in a Google Doc.
  • Manually wrap option helptext lines after 80 characters
  • vg sim now works with sample name even when no GBWT is provided.
  • CI now enforces the minimum required GCC version.
  • vg now requires a minimum GCC version of 7, the oldest major version available in the Ubuntu releases we test on for CI.
  • vg giraffe usage example now shows using a .zipcodes file and a .withzip.min file.
  • vg can now be built with the mimalloc allocator (v3 beta)
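
Given the new minimum GCC version of 7, it may be handy to check a compiler before attempting a source build. The helper below is a hypothetical sketch that compares the major component of a version string against a minimum; it is not part of the vg build system.

```shell
# Check whether a GCC version string meets a minimum major version.
# The default minimum of 7 comes from the release note above; the helper
# itself is hypothetical.
gcc_major_at_least() {
    version=$1
    minimum=${2:-7}
    major=${version%%.*}     # text before the first '.'
    [ "$major" -ge "$minimum" ]
}

# With a locally installed compiler, one could run:
#   gcc_major_at_least "$(gcc -dumpversion)" && echo "GCC is new enough"
gcc_major_at_least 7.5.0 && echo "7.5.0 ok"
gcc_major_at_least 5.4.0 || echo "5.4.0 too old"
```

The function returns a normal shell exit status, so it can gate a build step with && or an if statement.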

Updated Submodules

  • BBHash
  • libbdsg
  • libvgio
  • sdsl-lite
  • sparsepp
  • vcflib

New Submodules

  • mimalloc

Removed Submodules

  • fastahack (now used via vcflib)