
create a downsampled iterator so that downsampling becomes less dependent on clone #3355

Open
ctb opened this issue Oct 15, 2024 · 2 comments

Comments

@ctb
Contributor

ctb commented Oct 15, 2024

Per the comment on #3342 (#3342 (comment)):

@luizirber speaketh:

> I wonder if we can do (in a future PR, not this one) a new .downsampled_iter(scaled) for operations like count_common, and avoid the conversion.
>
> The downsampled iter would iterate over the values that fall within the new scaled threshold, but wouldn't need to create new minhash sketches (it can reuse the largest one and stop returning values once they go over max_hash, for example).
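A minimal sketch of what that could look like, assuming a sketch type that keeps its hashes sorted ascending. The names `MinHashSketch`, `mins`, and `max_hash_for_scaled` here are illustrative placeholders, not sourmash's actual API:

```rust
/// Stand-in for a MinHash sketch; the key assumption is that the
/// hashes ("mins") are stored sorted, which is what makes early
/// stopping possible.
struct MinHashSketch {
    scaled: u64,
    mins: Vec<u64>, // sorted ascending
}

/// Hypothetical helper: largest hash value retained at a given scaled.
fn max_hash_for_scaled(scaled: u64) -> u64 {
    if scaled <= 1 { u64::MAX } else { u64::MAX / scaled }
}

impl MinHashSketch {
    /// Iterate over hashes as if the sketch were downsampled to
    /// `scaled`, without cloning or building a new sketch: because
    /// `mins` is sorted, we can stop at the first hash above the
    /// downsampled max_hash.
    fn downsampled_iter(&self, scaled: u64) -> impl Iterator<Item = u64> + '_ {
        let max_hash = max_hash_for_scaled(scaled);
        self.mins.iter().copied().take_while(move |&h| h <= max_hash)
    }
}
```

An operation like count_common could then walk two such iterators in lockstep instead of downsampling via clone().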

@ctb
Contributor Author

ctb commented Nov 13, 2024

maybe explored here? #3394

@luizirber
Member

> maybe explored here? #3394

In a similar direction, but not quite. The downsampled iter is more similar to .iter_mins() or .iter_abunds() in #3394, but there I'm only using the iterators directly to calculate the intersection.
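For context, a sketch of that iterator-only intersection, assuming iterators that yield hashes in sorted order (illustrative only, not the exact #3394 code):

```rust
/// Count hashes common to two sorted streams by merging them in
/// lockstep; no intermediate sketch or set is allocated.
fn count_common_sorted<I, J>(a: I, b: J) -> usize
where
    I: IntoIterator<Item = u64>,
    J: IntoIterator<Item = u64>,
{
    let mut a = a.into_iter().peekable();
    let mut b = b.into_iter().peekable();
    let mut common = 0;
    while let (Some(&x), Some(&y)) = (a.peek(), b.peek()) {
        use std::cmp::Ordering::*;
        match x.cmp(&y) {
            Equal => { common += 1; a.next(); b.next(); }
            Less => { a.next(); }
            Greater => { b.next(); }
        }
    }
    common
}

// Combined with a downsampled_iter like the sketch above, this would
// give count_common at a coarser scaled without any clone, e.g.:
//   count_common_sorted(a.downsampled_iter(s), b.downsampled_iter(s))
```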
