Chunk size, kmers or seqs? #13

mr-eyes · 2021-02-14T18:18:57Z

Rethinking the chunk size: should we define it as the number of sequences or the number of kmers?

Defining chunk size as the number of sequences works when the sequences are relatively short. For genomes, for example, a chunk size of 10k sequences would consume a lot of memory per chunk. Transcripts, on the other hand, would process smoothly because their average length is small.

Defining chunk size as the number of kmers would work well in both of the cases above, and we could set it to a fixed multiple of thousands or millions.
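To make the comparison concrete, here is a minimal Python sketch of both strategies. It is not the project's implementation: `chunk_by_seqs`, `chunk_by_kmers`, `kmer_count`, and the `(name, sequence)` record layout are all hypothetical names chosen for illustration, and it assumes a sequence of length L contributes L - k + 1 kmers.

```python
from typing import Iterable, Iterator, List, Tuple

Record = Tuple[str, str]  # hypothetical layout: (name, sequence)


def kmer_count(seq: str, k: int) -> int:
    """Number of kmers in a sequence: len(seq) - k + 1, or 0 if shorter than k."""
    return max(len(seq) - k + 1, 0)


def chunk_by_seqs(records: Iterable[Record], max_seqs: int) -> Iterator[List[Record]]:
    """Yield chunks holding at most max_seqs records, regardless of their length."""
    chunk: List[Record] = []
    for rec in records:
        chunk.append(rec)
        if len(chunk) >= max_seqs:
            yield chunk
            chunk = []
    if chunk:
        yield chunk


def chunk_by_kmers(records: Iterable[Record], k: int, max_kmers: int) -> Iterator[List[Record]]:
    """Yield chunks whose total kmer count stays within max_kmers where possible."""
    chunk: List[Record] = []
    total = 0
    for rec in records:
        n = kmer_count(rec[1], k)
        # Close the current chunk if adding this record would exceed the budget.
        if chunk and total + n > max_kmers:
            yield chunk
            chunk, total = [], 0
        chunk.append(rec)
        total += n
    if chunk:
        yield chunk


if __name__ == "__main__":
    demo = [("a", "ACGTACGTAC"), ("b", "ACGT"), ("c", "ACGTACG"), ("d", "AC")]
    for c in chunk_by_kmers(demo, k=3, max_kmers=10):
        print([name for name, _ in c])
```

Note that this sketch never splits a sequence, so a single sequence with more kmers than `max_kmers` (a whole chromosome, say) still ends up alone in an oversized chunk. Bounding memory for genomes would also require splitting long sequences, which is left out here.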

@drtamermansour what do you think?

mr-eyes · 2021-02-14T18:21:07Z

Error: duplicate of #12.

mr-eyes closed this as completed Feb 14, 2021

dib-lab locked and limited conversation to collaborators Feb 14, 2021
