Hi,
I have been using Truvari with HG002_SVs_Tier1_v0.6.vcf.gz and HG002_SVs_Tier1_v0.6.bed. The manuscript says:
"These regions define our Tier 1 benchmark set, which spans 2.51 Gbp and includes 5,262 insertions and 4,095 deletions. These regions exclude 1,837 of the 12,745 SVs because they were within 50 bp of a 20–49-bp indel; they exclude an additional 856 SVs within 50 bp of a candidate SV for which no consensus genotype could be determined; and they exclude an additional 411 calls that were not fully supported by a diploid assembly as the only SV in the region."
Now, 5262 + 4095 = 9357 variants. I'm not sure whether I am missing something or whether this is intended?
Also, although this only relates to the HG002_SVs_Tier1_v0.6.vcf.gz truth set, I have looked at the number of different REPTYPEs contained under the INS and DEL SVTYPEs. Does this table make sense? Can an INS have SIMPLEDEL as a REPTYPE, or a DEL have DUP as a REPTYPE, for example? (These counts include all variants):
Just asking because I don't fully know, but to me it seems odd.
Thanks,
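For anyone who wants to reproduce a cross-tabulation like the one described above, here is a minimal Python sketch using only the standard library. It assumes that SVTYPE and REPTYPE are stored as key=value entries in the INFO column (as the bcftools expression in this thread suggests) and that FILTER is the seventh column, per the VCF layout. The function names `parse_info` and `count_svtype_reptype` are made up for illustration and are not part of Truvari or bcftools.

```python
import gzip
from collections import Counter


def parse_info(info_field):
    """Turn a VCF INFO string such as 'SVTYPE=DEL;REPTYPE=DUP' into a dict."""
    info = {}
    for entry in info_field.split(";"):
        key, sep, value = entry.partition("=")
        # Flag-style entries have no '=', so record them as True.
        info[key] = value if sep else True
    return info


def count_svtype_reptype(lines, pass_only=False):
    """Count (SVTYPE, REPTYPE) pairs over an iterable of VCF text lines."""
    counts = Counter()
    for line in lines:
        if line.startswith("#") or not line.strip():
            continue  # skip header and blank lines
        fields = line.rstrip("\n").split("\t")
        if pass_only and fields[6] != "PASS":
            continue  # column 7 is FILTER
        info = parse_info(fields[7])  # column 8 is INFO
        key = (info.get("SVTYPE", "."), info.get("REPTYPE", "."))
        counts[key] += 1
    return counts


# Example usage (assumes the file sits in the working directory):
# with gzip.open("HG002_SVs_Tier1_v0.6.vcf.gz", "rt") as vcf:
#     for (svtype, reptype), n in sorted(count_svtype_reptype(vcf).items()):
#         print(svtype, reptype, n, sep="\t")
```

Called with `pass_only=True`, the same function restricts the tally to records whose FILTER is PASS, which I am assuming corresponds to the PASS/Tier1 subset discussed in this thread. Restricting to the regions in the BED file is not handled here.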
Replies: 1 comment 4 replies
Hello,
I'll try my best to answer your questions, with the disclaimer that I'm not the official maintainer of the GIAB data.
It appears that the DEL/INS SV counts in the manuscript are incorrect. 9641 is the count of PASS/Tier1 SVs which breaks down to
From my understanding, REPTYPE isn't a technical definition in the way that SVTYPE's DEL/INS is; it is more of an annotation. It would appear you have found edge cases where the REPTYPE annotation could be incorrect. However, it may be useful for you to explore how many of these are Tier1 SVs. Generally, I choose not to stray outside of the PASS/Tier1 regions.
$ bcftools view -i "REPTYPE == 'DUP' & SVTYPE == 'DEL'" Tier1_Only.vcf.gz | grep -v "#" | cut -f1-6
19 10636680 HG2_Ill_svaba_21979 TATATACATATACATATACATATACATATACATATACACATATACATATAC T 10
So in this example it's a 50 bp deletion that appears to sit in some kind of simple repeat, which may be the cause of the incorrect annotation.
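Returning to the count discrepancy, a quick sanity check is to subtract the exclusions listed in the quoted manuscript passage from the 12,745 starting SVs. The sketch below only does arithmetic on figures quoted in this thread; the variable names are mine.

```python
# Figures quoted from the manuscript passage in the question.
total_svs = 12_745
excluded_near_indel = 1_837              # within 50 bp of a 20-49 bp indel
excluded_no_consensus = 856              # near a candidate SV without a consensus genotype
excluded_not_assembly_supported = 411    # not fully supported by a diploid assembly

remaining = (
    total_svs
    - excluded_near_indel
    - excluded_no_consensus
    - excluded_not_assembly_supported
)

stated_ins, stated_del = 5_262, 4_095
stated_sum = stated_ins + stated_del

print(remaining)               # 9641
print(stated_sum)              # 9357
print(remaining - stated_sum)  # 284
```

The remainder after the exclusions is 9,641, the same as the PASS/Tier1 count given above, while the stated insertion and deletion counts sum to 9,357. That is consistent with the manuscript's INS/DEL breakdown, rather than its exclusion counts, being the part that is off.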