Chunk size, kmers or seqs? #13

Closed
mr-eyes opened this issue Feb 14, 2021 · 1 comment

Comments

@mr-eyes
Member

mr-eyes commented Feb 14, 2021

Rethinking the chunk size: should we define it as the number of sequences or the number of kmers?

Defining the chunk size as a number of sequences should work when the sequences are relatively short. With genomes, for example, a chunk size of 10k sequences would consume a lot of memory per chunk. With transcripts, on the other hand, it would work smoothly because their average length is small.

Defining the chunk size as a number of kmers would work fine in both of the cases above, and we could express it as a fixed multiple of thousands or millions of kmers.
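
To make the two options concrete, here is a rough Python sketch of both chunking strategies. It is only an illustration, not code from this project: the names `chunk_by_seqs`, `chunk_by_kmers`, `kmer_count`, and `max_kmers` are hypothetical, records are assumed to be `(name, sequence)` tuples, and a sequence of length L is assumed to yield L - k + 1 kmers.

```python
from typing import Iterable, Iterator, List, Tuple

Record = Tuple[str, str]  # (name, sequence); hypothetical record layout


def kmer_count(seq_len: int, k: int) -> int:
    """Number of kmers in a sequence of length seq_len (0 if shorter than k)."""
    return max(0, seq_len - k + 1)


def chunk_by_seqs(records: Iterable[Record], chunk_size: int) -> Iterator[List[Record]]:
    """Group records into chunks holding a fixed number of sequences."""
    chunk: List[Record] = []
    for rec in records:
        chunk.append(rec)
        if len(chunk) == chunk_size:
            yield chunk
            chunk = []
    if chunk:  # emit the final, possibly smaller chunk
        yield chunk


def chunk_by_kmers(records: Iterable[Record], k: int, max_kmers: int) -> Iterator[List[Record]]:
    """Group records so each chunk holds at most max_kmers kmers.

    Assumption: sequences are never split, so a single sequence with more
    than max_kmers kmers ends up alone in an oversized chunk.
    """
    chunk: List[Record] = []
    total = 0
    for name, seq in records:
        n = kmer_count(len(seq), k)
        # Close the current chunk before it would exceed the kmer budget.
        if chunk and total + n > max_kmers:
            yield chunk
            chunk, total = [], 0
        chunk.append((name, seq))
        total += n
    if chunk:
        yield chunk
```

With a kmer budget, the memory per chunk is bounded roughly by `max_kmers` whether the input is a few long chromosomes or many short transcripts, whereas `chunk_by_seqs` scales with sequence length. Splitting a single very long sequence across chunks is left out of this sketch and would need extra handling.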

@drtamermansour what do you think?

@mr-eyes
Member Author

mr-eyes commented Feb 14, 2021

Opened in error. Duplicate of #12.

@mr-eyes mr-eyes closed this as completed Feb 14, 2021
@dib-lab dib-lab locked and limited conversation to collaborators Feb 14, 2021