[Feature Request] Many-vs-Many upper triangle of pairwise genome ANI calculation #127

Open
jolespin opened this issue Dec 8, 2023 · 3 comments

jolespin commented Dec 8, 2023

A new tool called skani has a very convenient option that avoids a lot of duplicate computation:
https://github.com/bluenote-1577/skani/wiki/skani-basic-usage-guide#skani-triangle---all-to-all-ani-computation

Here's a screenshot:
image

Would it be possible for FastANI to use this functionality as well to only calculate the upper triangle?

Member

cjain7 commented Dec 10, 2023

The current implementation of FastANI indexes all genomes in the reference list at the preprocessing stage. The index is not changed afterwards when each query genome is processed. As a result, doing n^2 computations is more convenient for us.

We can try periodically recomputing the index with fewer genomes. I am not sure how much time we will gain by this.
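
To get a rough feel for what periodic re-indexing could save, here is a minimal sketch that only counts comparisons. It is not FastANI code: `plan_blocks`, `comparisons`, and `block_size` are hypothetical names, and the scheme (each block of queries is compared against an index built from that block onward) is just one assumed way to "recompute the index with fewer genomes".

```python
def plan_blocks(genomes, block_size):
    """Yield (queries, references) for each round of a shrinking-index scheme.

    Assumption: the index is rebuilt once per block and contains only the
    genomes from the current block onward, so earlier genomes are dropped.
    """
    for start in range(0, len(genomes), block_size):
        queries = genomes[start:start + block_size]
        references = genomes[start:]
        yield queries, references


def comparisons(genomes, block_size):
    """Total query-vs-reference comparisons across all rounds."""
    return sum(len(q) * len(r) for q, r in plan_blocks(genomes, block_size))


genomes = [f"genome_{i}.fna" for i in range(1000)]
full = comparisons(genomes, block_size=len(genomes))   # one index, n^2 comparisons
blocked = comparisons(genomes, block_size=100)         # ten index rebuilds
print(full, blocked)
```

Smaller blocks approach the triangle but pay for more index rebuilds, so whether this wins in wall-clock time depends on how expensive indexing is relative to mapping.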

Author

jolespin commented Dec 10, 2023

Would it be possible to provide a single list, index all of the genomes once, and then query each non-redundant pair?
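
For concreteness, the "non-redundant pairs" here are the unordered pairs above the diagonal of the n x n matrix. A minimal sketch of enumerating them, using only the Python standard library (`unique_pairs` and the file names are made up for illustration):

```python
from itertools import combinations


def unique_pairs(genomes):
    """Return each unordered pair of genomes exactly once (no self-pairs)."""
    return list(combinations(genomes, 2))


genomes = [f"genome_{i}.fna" for i in range(1000)]
pairs = unique_pairs(genomes)

# n(n-1)/2 pairs instead of n^2 query/reference combinations
print(len(pairs), len(genomes) ** 2)
```

For 1k genomes that is 499,500 pairs versus 1,000,000 comparisons, so the triangle is roughly half the mapping work.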

I was thinking of implementing a wrapper to do the pairs myself and call FastANI around it but then realized the index would be created for each process.
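
An alternative to a per-pair wrapper would be to run a single many-to-many job and drop the redundant rows afterwards. This saves no compute or memory; it only trims the output to one row per pair. The sketch below assumes the output is tab-separated with the query path in the first column and the reference path in the second; `upper_triangle` and `results.tsv` are hypothetical names.

```python
import csv


def upper_triangle(rows):
    """Keep the first row seen for each unordered (query, reference) pair.

    Self-comparisons are dropped. Assumes row[0] is the query and row[1]
    is the reference.
    """
    seen = set()
    for row in rows:
        query, reference = row[0], row[1]
        if query == reference:
            continue
        key = frozenset((query, reference))
        if key in seen:
            continue
        seen.add(key)
        yield row


def filter_file(path):
    """Read a tab-separated results file and return the de-duplicated rows."""
    with open(path, newline="") as handle:
        return list(upper_triangle(csv.reader(handle, delimiter="\t")))


# Usage (hypothetical file name):
# rows = filter_file("results.tsv")
```

Note this keeps whichever direction appears first in the file, and the A-vs-B and B-vs-A rows are not guaranteed to report identical values.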

I'm currently having memory issues with FastANI and I think the n^2 might be the culprit.

@jolespin
Author

Also, if you provide a list of 1k genomes for --rl and the same list for --ql, does it calculate the index of each genome twice?
