Dear dev team,
I realize that the recorded current in the bulk file is immutable during playback, so performing adaptive sampling on it with Readfish cannot achieve any actual enrichment. Still, I am wondering whether the segmented "unblocked" reads can be used to estimate enrichment with playback.
Since enrichment could be defined as the time you save by not sequencing things you don't want, I came up with a method to estimate enrichment by calculating the remaining bases of unblocked reads from adaptive sampling. I am not certain that this is a correct way to estimate enrichment, so I would like your thoughts on it.
I will explain how I intend to calculate the remaining bases.
When running readfish stats, the reads can be demultiplexed into *_proceed.fastq.gz, *_stop_receiving.fastq.gz and *_unblock.fastq.gz. In my case this gives one set of files for control and one for hum_test (with adaptive sampling).
Since I could not find any documentation on how exactly these files are made, I assume hum_test_unblock.fastq.gz contains all complete reads that were supposedly "unblocked" (complete because I used playback).
Additionally, while readfish is running, the output for all individual reads should be written to live_reads.fq:
# Fastq output for individual reads
debug_log = "live_reads.fq"
I noticed that live_reads.fq contains multiple fragments with the same read IDs as the complete reads in hum_test_unblock.fastq.gz. So, I assume that the complete "unblocked" reads were segmented into chunks in live_reads.fq, with the end of the first segment marking the point at which the rejection signal was sent. If my assumptions are correct, then I should be able to estimate enrichment by calculating the remaining bases of each "unblocked" read as:
length of complete "unblocked" read (from _hum_test_unblock.fastq.gz_) - length of the first segment of the corresponding read (from live_reads.fq) = remaining bases of the "unblocked" read
The average remaining bases of the "unblocked" reads can then be calculated by summing the above over all "unblocked" reads and dividing by the number of "unblocked" reads.
Since the average remaining bases could indicate the time saved during adaptive sampling, can it also be used as an estimate of enrichment?
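To make the calculation above concrete, here is a minimal Python sketch of it. It rests on my own unconfirmed assumptions: both files are plain 4-line FASTQ, the read ID is the first whitespace-separated token of the header, and the first record for a given read ID in live_reads.fq is the first segment of that read. The function names are hypothetical helpers, not part of readfish.

```python
import gzip
from statistics import mean


def read_fastq(path):
    """Yield (read_id, sequence) from a plain or gzipped 4-line FASTQ file."""
    opener = gzip.open if str(path).endswith(".gz") else open
    with opener(path, "rt") as handle:
        while True:
            header = handle.readline()
            if not header:
                break  # end of file
            if not header.strip():
                continue  # skip stray blank lines
            sequence = handle.readline().strip()
            handle.readline()  # '+' separator line
            handle.readline()  # quality line
            # Assumption: the read ID is the first token after '@'.
            yield header[1:].split()[0], sequence


def first_segment_lengths(live_reads_path):
    """Map read ID -> length of its first record in live_reads.fq.

    Assumption: the first record seen for a read ID is the first segment.
    """
    first = {}
    for read_id, sequence in read_fastq(live_reads_path):
        first.setdefault(read_id, len(sequence))
    return first


def remaining_bases(unblock_path, live_reads_path):
    """Map read ID -> complete read length minus first segment length."""
    first = first_segment_lengths(live_reads_path)
    remaining = {}
    for read_id, sequence in read_fastq(unblock_path):
        if read_id in first:  # reads without a live segment are skipped
            remaining[read_id] = max(len(sequence) - first[read_id], 0)
    return remaining


def mean_remaining_bases(unblock_path, live_reads_path):
    """Average remaining bases over all matched "unblocked" reads."""
    values = list(remaining_bases(unblock_path, live_reads_path).values())
    return mean(values) if values else 0


if __name__ == "__main__":
    print(mean_remaining_bases("hum_test_unblock.fastq.gz", "live_reads.fq"))
```

Reads in hum_test_unblock.fastq.gz that have no matching ID in live_reads.fq are left out of the average here; whether that is the right treatment is part of what I am unsure about.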
If you notice that any of my assumptions are wrong, please let me know and if possible advise me on changes I should make in my approach to calculate a valid measure for estimated enrichment with playback.
Thanks in advance