Allow vW:i tag for wasp in SAM output#2617
Open
ASLeonard wants to merge 1 commit into
birdingman0626
added a commit
to birdingman0626/STAR-Cross
that referenced
this pull request
Jun 29, 2026
…2687), WASP-in-SAM (alexdobin#2617), CodeQL fix (#8) * chore(release): STAR-Cross 0.0.1 versioning, main branch, CodeQL third-party exclusion - Set project/binary version to "STAR-Cross 0.0.1_<hash>" (CMakeLists + VERSION fallback); update version_check test regex to 0.0.1. - Release workflow: name releases "STAR-Cross", auto-tag fallback 0.0.1_<sha>. - CI branch triggers master -> main (codeql.yml). - CodeQL: mark cpp-httplib include as SYSTEM (CodeQL skips system headers — the reliable fix for the cpp/non-https-url alert in the #include'd httplib.h, which paths-ignore cannot filter for C/C++); broaden config globs to **/_deps/**. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> * feat: allow WASP vW:i tag in SAM output (upstream PR alexdobin#2617) Emit the vW WASP filtering tag in the SAM/CRAM path and drop the BAM-only restriction on --waspOutputMode and the vW attribute. Emission is scoped to ATTR_vW only (the upstream patch incorrectly shared it with vG/vA). Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> * perf: multicore genomeGenerate suffix-array build (upstream PR alexdobin#2687) Port the parallel prefix-bucketed suffix-array chunk sort (with sub-binning, optional in-memory chunk retention, and a skip-first-word comparator fast-path) from upstream PR alexdobin#2687. Reconciled with the fork: - funCompareSuffixesFromWord uses the big-endian-safe loadUintLE loads. - sjdbSortBucket reformulated without __uint128 (MSVC has no native 128-bit); the bucket mapping stays monotonic in the key, so the total order — and thus the final index — is unchanged. - SA chunk packing keeps the in-memory path and the binary-mode disk fallback. Index output is byte-identical to the previous builder; a new CI job (validate-genome-index) builds the main baseline and this branch and diffs the SA/SAindex/Genome across 1 vs 16 threads and a low-RAM multi-chunk layout. build.yml CI branch triggers also move master -> main. 
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> * docs: STAR-Cross 0.0.1, CRAM/big-endian/ARM, genomeGenerate perf, WASP-in-SAM Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> * fix(win): use signed loop indices in OpenMP parallel-for (MSVC OpenMP 2.0) MSVC's OpenMP 2.0 requires signed integral index variables in `#pragma omp for`. The alexdobin#2687 ports used `uint` (64-bit) indices; switch the new parallel-for loops to int64 (genomeGenerate genome scans, genomeSAindex chunks, sjdbBuildIndex bucket count/scatter). No behavior change on GCC/Clang. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.8 <noreply@anthropic.com>
birdingman0626
added a commit
to birdingman0626/STAR-Cross
that referenced
this pull request
Jun 29, 2026
…osix_spawn, WASP vW in SAM - Windows: open genome/SA/Genome files in binary mode (streamFuns). - Referenceless CRAM output (--outSAMtype CRAM) via cramOutput.cpp (CRAM_OPT_NO_REF), transcoded from BAM at finalization; hot path untouched. - macOS: spawn readFilesCommand via posix_spawnp (issue alexdobin#2663); Windows path unchanged. - Allow WASP vW:i tag in SAM/CRAM output, not just BAM (upstream PR alexdobin#2617). Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Allowing WASP tags in SAM output seems straightforward: after this modification, the SAM output appears to be identical to what you get by converting the BAM output to SAM. Maybe it was just never implemented, but I can't see any reason why such a tag would actually be problematic in SAM.
This PR only touches a few code paths, so many more tags/modes should probably be changed as well, but it addresses a common use case.
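
To illustrate why the tag should be unproblematic in SAM: optional fields in a SAM record are just tab-separated `TAG:TYPE:VALUE` triples, so `vW:i:<n>` is an ordinary integer tag. The following is a minimal sketch of that idea, not the code in this PR; `appendWaspTag` and the convention that a negative value means "no WASP result" are assumptions made for the example.

```cpp
#include <string>

// Hypothetical helper, not STAR's actual code: append a WASP vW:i tag
// to an already formatted SAM record (no trailing newline).
// Assumption for this sketch: a negative waspType means "no WASP result",
// in which case the record is returned unchanged.
std::string appendWaspTag(const std::string& samLine, int waspType) {
    if (waspType < 0) {
        return samLine;
    }
    // SAM optional fields are tab-separated TAG:TYPE:VALUE triples;
    // type 'i' marks a signed integer value.
    return samLine + "\tvW:i:" + std::to_string(waspType);
}
```

A record written this way should match what a BAM-to-SAM conversion prints for the same integer tag, which is consistent with the observation above.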