Aligning to chm13v2 #33

Open
knightjimr opened this issue Apr 22, 2022 · 8 comments

Comments

@knightjimr

I can run dragmap to align reads to hg38. When I try the new chm13v2 human reference, the hash table build appears to succeed, but the alignment produces the output below and every read is written as unmapped. The index loading looks short-circuited compared with the hg38 alignment output. The log of the build stdout is attached in case you need it (stderr also had a message saying "Suppressing decoy"). Both dragen-os commands match the ones on your main page.

Should dragmap be able to run using this reference?

build.log

2022-04-22 09:45:55 [2af3bc142f40] Version: 1.2.1
2022-04-22 09:45:55 [2af3bc142f40] argc: 7 argv: dragen-os -r ref/ -1 ../Unaligned/NA12878-S1_AH5WLCDMXX_L002_R1_001.50m.fastq.gz -2 ../Unaligned/NA12878-S1_AH5WLCDMXX_L002_R2_001.50m.fastq.gz
decompHashTableCtxInit...
0.773 seconds
decompHashTableHeader...
Running dual fastq workflow on 20 threads. System supports 20 threads.
0 249 0 0 0 0 10000 1 40000 1 1000 0 0 0 6
0 250 0 0 0 0 10000 1 40000 1 1000 0 0 0 5
0 251 0 0 0 0 10000 1 40000 1 1000 0 0 0 4
0 252 0 0 0 0 10000 1 40000 1 1000 0 0 0 3
0 253 0 0 0 0 10000 1 40000 1 1000 0 0 0 2
0 254 0 0 0 0 10000 1 40000 1 1000 0 0 0 1
0 0 0 0 0 0 10000 1 40000 1 1000 0 0 0 0
Initial paired-end statistics detected for read group all, based on 0 high quality pairs for FR orientation
Quartiles (25 50 75) = 0 0 0
Mean = 0
Standard deviation = 10000
Rescue radius = 0
Effective rescue sigmas = 0
WARNING: Default rescue sigmas value of 2.5 was overridden by host software!
The user may wish to set a rescue sigmas value explicitly with --Aligner.rescue-sigmas
Boundaries for mean and standard deviation: low = 0, high = 0
Boundaries for proper pairs: low = 1, high = 40000
NOTE: DRAGEN's insert estimates include corrections for clipping (so they are not identical to TLEN)
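
One way to confirm the "all reads unmapped" symptom independently of the log is to count records with the SAM unmapped flag (0x4) set. This is a minimal Python sketch, assuming the aligner output is uncompressed SAM text. The `count_unmapped` helper and the example records are made up for illustration and are not DRAGMAP output.

```python
import io

UNMAPPED_FLAG = 0x4  # SAM FLAG bit meaning "segment unmapped"


def count_unmapped(sam_lines):
    """Return (total, unmapped) over SAM records, skipping '@' header lines."""
    total = 0
    unmapped = 0
    for line in sam_lines:
        if not line.strip() or line.startswith("@"):
            continue  # blank line or header line
        fields = line.rstrip("\n").split("\t")
        flag = int(fields[1])  # FLAG is the second SAM column
        total += 1
        if flag & UNMAPPED_FLAG:
            unmapped += 1
    return total, unmapped


# Tiny hand-written records for illustration only (hypothetical data).
example = io.StringIO(
    "@HD\tVN:1.6\n"
    "r1\t77\t*\t0\t0\t*\t*\t0\t0\tACGT\tFFFF\n"
    "r1\t141\t*\t0\t0\t*\t*\t0\t0\tACGT\tFFFF\n"
    "r2\t99\tchr1\t100\t60\t4M\t=\t200\t104\tACGT\tFFFF\n"
)
total, unmapped = count_unmapped(example)
print(f"{unmapped} of {total} records unmapped")
```

For a real run, the same function can be fed an open file handle on the SAM output instead of the in-memory example.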

@kokyriakidis

I am not able to map to chm13v2 either.

@gconcepcion

I also just ran into this issue. The pipeline works fine with hg38, but if I switch to chm13v2, no reads are mapped. I thought I was going nuts, so it's reassuring to see that I'm in good company.

@tienpm7723

@knightjimr Could you share your output for HG38? I have the same problem even when I run HG38. This is my output:
decompHashTableCtxInit...
1.220 seconds
decompHashTableHeader...
0.002 seconds
decompHashTableLiterals...
3.367 seconds
decompHashTableExtIndex...
0.070 seconds
decompHashTableAutoHits...
Running fastq workflow on 88 threads. System supports 88 threads.
0 249 0 0 0 0 10000 1 40000 1 1000 0 0 0 6
0 250 0 0 0 0 10000 1 40000 1 1000 0 0 0 5
0 251 0 0 0 0 10000 1 40000 1 1000 0 0 0 4
0 252 0 0 0 0 10000 1 40000 1 1000 0 0 0 3
0 253 0 0 0 0 10000 1 40000 1 1000 0 0 0 2
0 254 0 0 0 0 10000 1 40000 1 1000 0 0 0 1

@knightjimr
Author

For hg38, I get the following initial log messages (I was using a file that only contained 50 million reads, hence the numbers). After the lines you posted, the "Initial paired-end statistics" lines should appear quickly, followed by a wait for the summary stats.

decompHashTableCtxInit...
0.777 seconds
decompHashTableHeader...
0.002 seconds
decompHashTableLiterals...
1.782 seconds
decompHashTableExtIndex...
0.033 seconds
decompHashTableAutoHits...
55.169 seconds
decompHashTableSetFlags...
6.004 seconds
finished decompress
Running dual fastq workflow on 10 threads. System supports 20 threads.
0 249 0 0 0 0 10000 1 40000 1 1000 0 0 0 6
0 250 0 0 0 0 10000 1 40000 1 1000 0 0 0 5
0 251 0 0 0 0 10000 1 40000 1 1000 0 0 0 4
0 252 0 0 0 0 10000 1 40000 1 1000 0 0 0 3
0 253 0 0 0 0 10000 1 40000 1 1000 0 0 0 2
0 254 0 0 0 0 10000 1 40000 1 1000 0 0 0 1
0 0 217 254 296 257.157 59.9087 1 533 107 407 84497 84930 0 0
Initial paired-end statistics detected for read group all, based on 84497 high quality pairs for FR orientation
Quartiles (25 50 75) = 217 254 296
Mean = 257.157
Standard deviation = 59.9087
Rescue radius = 149.772
Effective rescue sigmas = 2.5
Boundaries for mean and standard deviation: low = 59, high = 454
Boundaries for proper pairs: low = 1, high = 533
NOTE: DRAGEN's insert estimates include corrections for clipping (so they are not identical to TLEN)
MAPPING/ALIGNING SUMMARY,,Total input reads,50000000,100.00
MAPPING/ALIGNING SUMMARY,,Number of duplicate marked reads,0,0.00
MAPPING/ALIGNING SUMMARY,,Number of duplicate marked and mate reads removed,0,0.00
MAPPING/ALIGNING SUMMARY,,Number of unique reads (excl. duplicate marked reads),50000000,100.00
MAPPING/ALIGNING SUMMARY,,Reads with mate sequenced,50000000,100.00
MAPPING/ALIGNING SUMMARY,,Reads without mate sequenced,0,0.00
MAPPING/ALIGNING SUMMARY,,QC-failed reads,0,0.00
MAPPING/ALIGNING SUMMARY,,Mapped reads,49952500,99.90
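
Comparing the two logs, the hg38 run prints six `decompHashTable*` stages followed by `finished decompress`, while the chm13v2 run stops after `decompHashTableHeader` and reports 0 high quality pairs. Below is a small Python sketch that flags this pattern in a saved log. The stage list is copied from the hg38 log above and may differ in other DRAGMAP versions, and `check_log` is a hypothetical helper, not part of dragen-os.

```python
import re

# Stage names as they appear in the hg38 log in this thread (assumed complete).
EXPECTED_STAGES = [
    "decompHashTableCtxInit",
    "decompHashTableHeader",
    "decompHashTableLiterals",
    "decompHashTableExtIndex",
    "decompHashTableAutoHits",
    "decompHashTableSetFlags",
]
DONE_MARKER = "finished decompress"


def check_log(log_text):
    """Return (missing_stages, finished, high_quality_pairs) for a log string."""
    missing = [s for s in EXPECTED_STAGES if f"{s}..." not in log_text]
    finished = DONE_MARKER in log_text
    match = re.search(r"based on (\d+) high quality pairs", log_text)
    pairs = int(match.group(1)) if match else None
    return missing, finished, pairs


# Excerpt of the chm13v2 log from the first comment.
failing_log = """decompHashTableCtxInit...
0.773 seconds
decompHashTableHeader...
Running dual fastq workflow on 20 threads. System supports 20 threads.
Initial paired-end statistics detected for read group all, based on 0 high quality pairs for FR orientation
"""
missing, finished, pairs = check_log(failing_log)
print("missing stages:", missing)
print("finished decompress:", finished)
print("high quality pairs:", pairs)
```

A non-empty `missing` list, or `finished` being false, suggests the hash table was never fully loaded, which would be consistent with every read coming out unmapped.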

@obigbando

We hit the same problem on release 1.3.1.
What we did:

  • Built the chm13v2.0 hash table with this command:
    dragen-os --build-hash-table true --ht-mem-limit=125GB --ht-reference /mnt/chm13v2.0.fa --output-directory /mnt/chm13v2.0/
  • The hash table build command returned code 0 and produced these index files:
azureuser@notebookmachine:/mnt/chm13v2.0$ ls -l
total 9882764
drwxrwxr-x 2 azureuser azureuser        210 Aug  9 03:29 ./
drwxrwxr-x 5 azureuser azureuser         58 Aug  9 03:24 ../
-rwxrwxr-x 1 azureuser azureuser       5822 Aug  9 03:25 hash_table.cfg*
-rwxrwxr-x 1 azureuser azureuser       1452 Aug  9 03:25 hash_table.cfg.bin*
-rwxrwxr-x 1 azureuser azureuser 4839326630 Aug  9 03:25 hash_table.cmp*
-rwxrwxr-x 1 azureuser azureuser      15015 Aug  9 03:25 hash_table_stats.txt*
-rwxrwxr-x 1 azureuser azureuser 1558818816 Aug  9 03:26 reference.bin*
-rwxrwxr-x 1 azureuser azureuser   48713216 Aug  9 03:26 ref_index.bin*
-rwxrwxr-x 1 azureuser azureuser  389704704 Aug  9 03:26 repeat_mask.bin*
-rwxrwxr-x 1 azureuser azureuser  127191248 Aug  9 03:26 str_table.bin*
  • Mapping reads then failed with the following error:
2022-08-09 06:32:19 	[7f5cadc47100]	Version: 1.3.0
2022-08-09 06:32:19 	[7f5cadc47100]	argc: 13 argv: dragen-os --ref-dir /mnt/chm13v2.0/ --fastq-file1 HG003.novaseq.pcr-free.35x.R1.fastq.gz --fastq-file2 HG003.novaseq.pcr-free.35x.R2.fastq.gz --output-directory . --output-file-prefix HG003 --num-threads 16
decompHashTableCtxInit...
  0.810 seconds
decompHashTableHeader...
Error: boost::exception: /tmp/DRAGMAP/src/lib/reference/ReferenceDir.cpp(115): Throw in function dragenos::reference::ReferenceDir7::ReferenceDir7(const boost::filesystem::path&, bool, bool)
Dynamic exception type: boost::exception_detail::clone_impl<boost::exception_detail::error_info_injector<std::logic_error> >
std::exception::what: Decompressed extend table buffer length doesn't match expected

The same error occurs for chm13v2.0_noY, but not for GRCh38, which works fine.
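
Before mapping, it can help to rule out a truncated build by checking that the reference directory holds every file seen in the listing above and that none of them is empty. This is a rough Python sketch under that assumption. The file list is taken from this one `ls` output rather than from DRAGMAP documentation, and `incomplete_files` is a hypothetical helper. As this report shows, a directory can pass this check and still fail to load, so it only excludes the simplest failure.

```python
import tempfile
from pathlib import Path

# File names copied from the directory listing in this comment (assumption:
# other DRAGMAP versions or build options may produce a different set).
EXPECTED_FILES = [
    "hash_table.cfg",
    "hash_table.cfg.bin",
    "hash_table.cmp",
    "hash_table_stats.txt",
    "reference.bin",
    "ref_index.bin",
    "repeat_mask.bin",
    "str_table.bin",
]


def incomplete_files(ref_dir):
    """Return expected file names that are missing or empty in ref_dir."""
    ref_dir = Path(ref_dir)
    problems = []
    for name in EXPECTED_FILES:
        path = ref_dir / name
        if not path.is_file() or path.stat().st_size == 0:
            problems.append(name)
    return problems


# Demo on a throwaway directory with the last file deliberately left out.
with tempfile.TemporaryDirectory() as tmp:
    for name in EXPECTED_FILES[:-1]:
        (Path(tmp) / name).write_bytes(b"\0")
    demo_result = incomplete_files(tmp)
print(demo_result)
```

On a real build, call `incomplete_files("/mnt/chm13v2.0/")` (the path used above) and expect an empty list.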

@mazin128

I am facing the same issue when aligning to chm13v2.

@Arlaz

Arlaz commented Sep 26, 2022

Same issue here, can't map to chm13v2

@TomofumiSaka

I found a fix for this bug.
Please refer to this pull request.

#55
