Soft clipping with surject #4892

@schorlton-bugseq

Hi! Thanks for the great suite of tools. Running v1.70.0 installed via bioconda, I generated an index with:

vg autoindex --workflow lr-giraffe --prefix vg_index --ref-fasta reference.fna --vcf chrom1.vcf.gz --vcf chrom2.vcf.gz

and then mapped reads with:

vg giraffe -Z vg_index.giraffe.gbz -o BAM -b r10 -f reads.fastq > test.bam

I got alignments that look like this, for example (GAF output):

read1	1245	0	1245	+	>12690>12693>12696>12698>12701>12702>12708>12709>12715>12718>12720>12721>12725>12727>12732>12733>12737>12739>12742>12745>12751>12752>12755>12759>12763>12765>12767>12770>12774>12777>12779>12782>12784>12788>12790>12791>12794>12797>12811>12812>12813>12814>12815>12816>12799>12818>12808>12821>12825>12826>12830>12833>12841>12837>12846>12849>12850>12854>12860>12862>12868>12869>12878>12877>12882>12884>12887>12894>12896>12900>12903>12907>12910>12914>12919>12923>12925>12928>12935>12937>12958>12952>12947>12961>12957>12945>12964>12979>12988>12986>12992>12997>13000>13003>13006>13008>13009>13015>13024>13018>13021>13016>13029>13031>13037>13040>13042>13048>13051>13057>13060>13064>13067>13071>13073>13077>13079>13084>13086>13087>13090>13095>13098>13099>13104>13107>13108>13113>13116>13137>13141>13121>13143>13145>13146>13147>13152>13154>13157>13162>13164>13186>13172>13174>13194>13165>13196>13197>13168>13185>13200>13203>13208>13210>13214>13218>13223>13234>13237>13239>13241>13246>13250>13253>13257>13263>13268>13270>13273>13279>13289>13285>13286>13292>13295>13298>13301>13304>13309>13310>13316>13318>13321>13325>13328>13336>13339>13344>13347>13352>13353>13356>13359>13361>13363>13366>13370>13373>13375>13377>13379>13384>13387>13391>13393>13394>13397>13398>13401>13405>13408>13410>13413>13415>13417>13423>13424>13428>13430>13433>13437>13440>13441>13443>13445>13447>13449>13453>13456>13457>13458>13462>13463>13464>13466>13469>13473>13474>13476>13479>13482>13485>13488>13490>13493>13497>13501>13504>13505>13510>13511>13514>13515>13517>13520>13524>13525>13528>13530>13532>13534>13535>13538>13540>13541>13543>13544>13545>13547>13548>13552>13553>13556>13557>13560>13563>13567>13569>13573>13574>13575>13577>13579>13584>13586>13588>13591>13593>13597>13602>13605>13608>13609>13613>13614>13616>13618>13622>13625>13628>13631>13634>13636>13638>13639>13641>13643>13647>13648>13650>13653>13656>13661>13663>13665>13666>13667>13671>13674>13675>13678>13681>13686>13689>13690>13692>13696>13698>13700>13703
>13707>13710>13711>13715>13718>13719>13721>13724>13726>13730>13731>13733>13734>13736>13738>13741>13744>13745>13749>13750>13751>13755>13757>13760>13762>13763>13766>13767>13769>13771>13772>13777>13779>13781>13784>13786>13789>13791>13792>13794>13796>13799>13803>13807>13808>13810>13814>13815>13817>13820>13823>13824>13829>13832>13835>13840>13843>13845>13848>13853>13855>13857>13861>13863>13875>13881>13882>13887>13888>13891>13893>13897>13902>13905>13907>13910>13913>13917>13921>13924>13927>13931>13933>13938>13939>13944>13945>13947>13951>13953>13956>13961>13962>13964>13968>13969>13971>13975>13977>13980>13982>13984>13986>13988>13992>13994>13996>13997>14001>14002>14004>14008>14011>14013>14016>14018>14021>14024>14025>14027>14029>14030>14032>14036>14037>14040>14045>14046>14050>14055>14056>14058>14061>14062>14063>14066>14069>14071>14074>14077>14079>14081>14083>14084>14089>14090>14091>14095>14100>14102>14104>14107>14109>14112>14114>14118>14119>14121>14123>14126>14128>14130>14132>14135>14139>14142>14144>14146>14150>14153>14155>14156>14159>14164>14169>14170>14172>14177>14179>14182>14185>14187>14189>14192>14193>14199>14200>14202>14205>14206>14208>14211>14214>14215>14219>14221>14223>14228>14230>14232>14236>14238>14242>14247>14248>14249>14253>14257>14262>14264>14266>14268>14272>14274>14276>14280>14285>14287>14289>14293>14295>14296>14299>14301>14303>14306>14308>14312>14313>14314>14316>14318>14321>14322>14324>14326>14327>14328>14333>14335>14337>14338>14339>14343>14344>14348>14351>14353>14355>14358>14359>14365>14367>14370>14371>14373>14375>14379>14383>14384>14388>14390>14392>14395>14398>14402>14405>14408>14412>14416>14420>14423>14424>14428>14429>14431>14434>14436>14437>14439>14442>14446>14447>14451>14454>14459>14460>14463>14466>14470>14473>14474>14478>14482>14486>14489>14491>14493>14497>14503>14505>14508>14511>14513>14515>14517>14519>14521>14522>14525>14529>14530>14534>14537>14541>14545>14547>14548>14554>14556>14557>14561>14564>14566>14567>14569>14571>14574>14575>14577>14580>14582>14586>1
4589>14597>14598>14616>14601>14607>14619>14604>14621>14615>14624>14629>14632>14634>14640>14642>14645>14646>14649>14652>14656>14657>14662>14663>14666>14672>14673>14675>14681>14683>14686>14691>14695>14696>14699>14701>14705>14707>14708>14710>14711>14713>14714>14716>14719>14721>14722>14724>14725>14728>14731>14735>14736>14739>14742>14748>14758>14755>14751>14753>14765>14768>14770>14775>14779>14780>14782>14786>14788>14792>14795>14797>14799>14801>14804>14806>14808>14810>14811>14813>14814>14816>14817>14818>14820>14821>14825>14827>14829>14832>14834>14839>14841>14844>14846>14848>14849>14851>14852>14860>14854>14858>14855>14866>14869>14879>14882>14870>14875>14873>14891>14884>14893>14898>14899>14903>14907>14915>14917>14921>14932>14938>14941>14942>14946>14949>14952>14953>14957>14961>14964>14967>14974>14970>14976>14979>14982>14984>14985>14987>14991>14993>14995>14996>15000>15001>15005>15008>15011>15013>15016>15019>15023>15025>15029>15030>15036>15038>15039>15043>15045>15046>15047>15050>15054>15056>15057>15061>15063>15064>15068>15071>15077>15082>15085>15087>15090>15095>15097>15103>15106>15110>15115>15116>15121>15124>15128>15132>15134>15140>15141>15144>15150>15145>15151>15154>15157>15161>15164>15172>15166>15168>15177>15181>15187>15193>15196>15200>15204>15207>15212>15214>15219>15221>15222>15225>15227>15231>15236>15240>15242>15245>15247>15248>15250>15253>15254>15256>15258>15260>15261>15264>15266>15270>15271>15274>15275>15278>15280>15283>15287>15289>15294>15298>15300>15301>15304>15308>15312>15315>15317>15318>15320>15324>15326>15327>15331>15335>15337>15339>15341>15342>15346>15347	1049	0	1049	1007	1245	60	AS:i:816	
bq:Z:'&&%%%%&)''45+'''%$$###&*45<=46887++(')'('())&&%&&),++01/.('()8>@6667DDDC?=>=?<77666-*'('*/1?@A;:::886))))2--(&&'*(3/,,-:76:;7---.>=A::;;<=ED:;;:;=C?=;;>EE==;;<=@A::99=>@2222:9343**'&'(-3::>:;{@:;<:<=57))047<<989:A=DGDDB@??=>=;:;;<;777<>AB?>>====>?@>>??????CBJL8877<FF>=<<=>87<;::98,++))*,,-10>:8988>{8899:@@?>??@@?<21/-.*+/99867<?><{?910037-----3=>>?>><<<;<<=743/&(+1;=>;;;<@?>@<<:7744===8644/((((*@BOBCDC@7777AA=;9:8<???7767=?=9:888:=C?C{744454<;989:<;:;<8999=@@=<<=>?>?@<<=<>?>A0///6;=<>;=:@==<999;@?66910/..059889;;;;:DAA?@BD>99986323229788+**.+8;<==>>@@ACBBCBBA333':9327?>?=={@<;78//00:<54632357636:&%%&'(()0ABBD@>==>?=<<;8779=>=657>A;;;;=111<<:;=>???@==>?2.,,,0018<?>=117:<:6546599987668446<??=@@?<<;<<=>6666>?@AC?80//48:8///0@EF{EC@A@>>8764455:>=@=17><>??<;<7689<=@?<<=:;:<<@>=?CG>=<225)(''('''(-/1:=559;<<44543647>;9;==9778;@>>>?<1001=?=6655('''4;::<87332'%&(''14622'''+;JKHD><=@@8878?@>@93-,,4;6664(('&)*-,./@D>=::;:,++&&&&&&&&'''+3117;>=<<;<=::;;7/@:5553987:;===>?=:=<>0002459:<:;;;=?<>8777798;>4./(&$&&%$%&'*1,+,,2111,*/330001.,++,0//1449{>><:<<:<=?;<?<:211143.,)',(%'',48:<{{{{@B@@>==<;20/+,,,18:{{:7022-14??:++.8333/.+,-10(''&)*49:9766579;<<AA@>@@DB?5820/--)+,-3336==8<744565445542699:<<9:997:<9777=@===7522348;9<<<?=6656::8443210/&	cs:Z:+ATGTACTTCGTTCAATTGTTTGGGTGTTTTAACCATGTCCCAGTTAAGAGGAGGAAACAGTTTTCGCATTTATCGTGAAACGCTTTGCGTTTTTCGTGCGCCGCTTCACTTTGGTTGGGAGGAAACGGTTTTCGCATTTATCGTGAAACGCTTTCGCGTTTTCGTGCGCCGCTTCAATTGTG:21-G:40*AT:34-CTA:5*AT:2*CT:31*CA:43-AGA:18*TC:14-C:20*AG:128*AG:18*CT:11*CT:31*CT:28*AT*GT:40-A:101*AG:17+G:65-A:4-GAT*CA:9+A:47*AG:1-A:2+C:85*AG:10*GT:19*AT:10*AG*CG:2*AT:8*AT:8*CG:9*AT:30*AG:15+CC:4*AT*GT:22*AG:55+AACAATCAACGAGACGTCGCTTA	dv:f:0.031
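
For reference, the twelve mandatory tab-separated columns of a GAF record like the one above can be read with a few lines of Python. This is only a minimal sketch, not part of vg: `parse_gaf_core` and `GafCore` are hypothetical names, and the path column in the example string is shortened to the first three node IDs for illustration. The numeric fields are the ones from the record above.

```python
from typing import NamedTuple


class GafCore(NamedTuple):
    """The 12 mandatory GAF columns (hypothetical container, not a vg type)."""
    query_name: str
    query_length: int
    query_start: int
    query_end: int
    strand: str
    path: str
    path_length: int
    path_start: int
    path_end: int
    matches: int
    block_length: int
    mapq: int


def parse_gaf_core(line: str) -> GafCore:
    """Parse the mandatory columns of one GAF line; optional tags are ignored."""
    f = line.rstrip("\n").split("\t")
    if len(f) < 12:
        raise ValueError(f"expected at least 12 columns, got {len(f)}")
    return GafCore(
        f[0], int(f[1]), int(f[2]), int(f[3]), f[4], f[5],
        int(f[6]), int(f[7]), int(f[8]), int(f[9]), int(f[10]), int(f[11]),
    )


# Path column shortened for illustration; the other values come from the record above.
example = "read1\t1245\t0\t1245\t+\t>12690>12693>12696\t1049\t0\t1049\t1007\t1245\t60\tAS:i:816"
rec = parse_gaf_core(example)

# Fraction of the read covered by the graph alignment.
query_covered = (rec.query_end - rec.query_start) / rec.query_length
print(f"{rec.query_name}: {query_covered:.0%} of the read aligned, MAPQ {rec.mapq}")
```

Read this way, the graph alignment covers query positions 0 to 1245, i.e. the whole read.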

If I output to SAM/BAM, the same read looks like:

read1	0	NC_002018.1	404	60	566S10M669S	*	0	0	ATGTACTTCGTTCAATTGTTTGGGTGTTTTAACCATGTCCCAGTTAAGAGGAGGAAACAGTTTTCGCATTTATCGTGAAACGCTTTGCGTTTTTCGTGCGCCGCTTCACTTTGGTTGGGAGGAAACGGTTTTCGCATTTATCGTGAAACGCTTTCGCGTTTTCGTGCGCCGCTTCAATTGTGTCTGTTTCTCTCACAATTTCACAATATGCTTCTTCATGCAAATTGCCATCCTGATAACTACTGTAACATTGCATTTCAAGCAATATGAATTCAACTCCCAATATTAAGTGATGCTGTGTGAACCAACAATAATAGAAAGAAACATAACAGAGATAGTGTATTTGACCAACACCACCATAGAGGAAATATGCCCCAAACCAGCAGAATACAGAATTGGTCAAAACCGCAATGTGGCATTACAGGATTTGCACCTTTCTCTAAGGACAATTCGATTAGGCTTTCCGCTGGTGGGGACATATGGGTGACAAGAGAACCGTATGTGTCATGCGATCTTGACAAGTGTTATCAATTTGCCCTTGGACGAGGAACAACACTAAACAATGTGCATTCAAATAACACAGTACATGATAGAACCCCCCATCGAATCCTATTGATGAATGAGTTGGGTGTTCCTTTCCATCTGGGGACCAAGCAAGTGTGCATGGCATGGTCAGCTCAAGTTGTCACGATGGAAAAGCATGGCTGCATGTTTGTATAACGGGGGATGATAAAAATGCAACTGCTAGCTTCATTTACAATGGGAGGCTTGTAGATAGTGTTGTTTCATGGTCAAACGAACATTCTCAGAACCCAGGAGTCAGAATGCGTTTGTATCAATGGAACTTGTACAGTAGTAATGACGATGATGCTACAGGAAAAAGCTGATACTAAAATACTATTCATTGAGGAGGGGAAAATCGTTCGCCGCAGCAAATTGTCAGGAAGTGCTCAGCATGTCGAAGAGTGCTCTTGCTATCCTCGATATCCTGGTGTCAGATGTGTCTGCAGAGACAGCTGGAAAGGATCCAACCGGCCCATCATAGATATAAACATAAGGGATCATAGCATTGTTTCCAGGTATGTGTGTTCTGGACTTGTTGGAGACACACCCAGAAAAAGCGACAGCTCCAGCAACCCTGTTTGAACCCTAACAATGAAAAAGGTGATCATGGAGTGAAAGGCTGGGCCTTTGATGATGGAAATGACGTGTGGATGGGGAGAACAATCAACGAGACGTCGCTTA	
'&&%%%%&)''45+'''%$$###&*45<=46887++(')'('())&&%&&),++01/.('()8>@6667DDDC?=>=?<77666-*'('*/1?@A;:::886))))2--(&&'*(3/,,-:76:;7---.>=A::;;<=ED:;;:;=C?=;;>EE==;;<=@A::99=>@2222:9343**'&'(-3::>:;{@:;<:<=57))047<<989:A=DGDDB@??=>=;:;;<;777<>AB?>>====>?@>>??????CBJL8877<FF>=<<=>87<;::98,++))*,,-10>:8988>{8899:@@?>??@@?<21/-.*+/99867<?><{?910037-----3=>>?>><<<;<<=743/&(+1;=>;;;<@?>@<<:7744===8644/((((*@BOBCDC@7777AA=;9:8<???7767=?=9:888:=C?C{744454<;989:<;:;<8999=@@=<<=>?>?@<<=<>?>A0///6;=<>;=:@==<999;@?66910/..059889;;;;:DAA?@BD>99986323229788+**.+8;<==>>@@ACBBCBBA333':9327?>?=={@<;78//00:<54632357636:&%%&'(()0ABBD@>==>?=<<;8779=>=657>A;;;;=111<<:;=>???@==>?2.,,,0018<?>=117:<:6546599987668446<??=@@?<<;<<=>6666>?@AC?80//48:8///0@EF{EC@A@>>8764455:>=@=17><>??<;<7689<=@?<<=:;:<<@>=?CG>=<225)(''('''(-/1:=559;<<44543647>;9;==9778;@>>>?<1001=?=6655('''4;::<87332'%&(''14622'''+;JKHD><=@@8878?@>@93-,,4;6664(('&)*-,./@D>=::;:,++&&&&&&&&'''+3117;>=<<;<=::;;7/@:5553987:;===>?=:=<>0002459:<:;;;=?<>8777798;>4./(&$&&%$%&'*1,+,,2111,*/330001.,++,0//1449{>><:<<:<=?;<?<:211143.,)',(%'',48:<{{{{@B@@>==<;20/+,,,18:{{:7022-14??:++.8333/.+,-10(''&)*49:9766579;<<AA@>@@DB?5820/--)+,-3336==8<744565445542699:<<9:997:<9777=@===7522348;9<<<?=6656::8443210/&	AS:i:10

This could well be a misunderstanding on my part, but I expected the surjected alignment to span the full read. Instead, the CIGAR is 566S10M669S: almost the entire 1,245-base read is soft-clipped, only 10 bases are matched, and the alignment score drops from 816 in the GAF record to 10 in the SAM record. What is going on here? Thanks again!
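
To put a number on the soft clipping, the CIGAR string from the SAM record can be summed per operation. This is a minimal standard-library sketch, not something vg or samtools provides: `cigar_op_lengths` is a hypothetical helper, and the set of query-consuming operations (M, I, S, =, X) follows the SAM specification.

```python
import re
from collections import Counter

CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")
# Operations that consume bases of the read, per the SAM specification.
QUERY_CONSUMING = set("MIS=X")


def cigar_op_lengths(cigar: str) -> Counter:
    """Sum the total length of each CIGAR operation type."""
    if not re.fullmatch(r"(?:\d+[MIDNSHP=X])+", cigar):
        raise ValueError(f"malformed CIGAR: {cigar!r}")
    totals = Counter()
    for length, op in CIGAR_OP.findall(cigar):
        totals[op] += int(length)
    return totals


totals = cigar_op_lengths("566S10M669S")
read_length = sum(n for op, n in totals.items() if op in QUERY_CONSUMING)
soft_clipped = totals["S"]
print(read_length, soft_clipped, f"{soft_clipped / read_length:.1%}")
# prints: 1245 1235 99.2%
```

The CIGAR accounts for all 1,245 bases of the read, so the sequence is intact; only the surjected alignment is reduced to a 10-base match.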
