Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Cache creation of compound masks #4612

Open
wants to merge 1 commit into
base: develop
Choose a base branch
from

Conversation

PicoCentauri
Copy link
Contributor

@PicoCentauri PicoCentauri commented Jun 5, 2024

Changes made in this Pull Request:

Add a cache for creating the compound masks in a group. This speeds up the calculations of accumulate other methods that use _split_by_compound_indices:

u = mda.Universe(TPR, XTC)

Running a simple timing test with caching gives on my machine 397 µs ± 1.55 µs per loop (mean ± std. dev. of 7 runs, 1,000 loops each)

%%timeit 
u.atoms.accumulate("masses", compound="residues")

and with "disabled" cache 1.16 ms ± 11 µs per loop (mean ± std. dev. of 7 runs, 1,000 loops each)

%%timeit 
u.atoms.accumulate("masses", compound="residues")
u.atoms._cache.pop("residues_masks")

PR Checklist

  • Tests?
  • Docs?
  • CHANGELOG updated?
  • Issue raised/referenced?

Developers certificate of origin


📚 Documentation preview 📚: https://mdanalysis--4612.org.readthedocs.build/en/4612/

@pep8speaks
Copy link

pep8speaks commented Jun 5, 2024

Hello @PicoCentauri! Thanks for updating this PR. We checked the lines you've touched for PEP 8 issues, and found:

There are currently no PEP 8 issues detected in this Pull Request. Cheers! 🍻

Comment last updated at 2024-06-05 17:36:21 UTC

Copy link

github-actions bot commented Jun 5, 2024

Linter Bot Results:

Hi @PicoCentauri! Thanks for making this PR. We linted your code and found the following:

Some issues were found with the formatting of your code.

Code Location Outcome
main package ⚠️ Possible failure
testsuite ⚠️ Possible failure

Please have a look at the darker-main-code and darker-test-code steps here for more details: https://github.com/MDAnalysis/mdanalysis/actions/runs/9388722067/job/25854613962


Please note: The black linter is purely informational, you can safely ignore these outcomes if there are no flake8 failures!

Comment on lines +983 to +986
self._cache[cache_key] = {
"compound_indices": compound_indices,
"data": (atom_masks, compound_masks, len(compound_sizes))
}
Copy link
Contributor Author

@PicoCentauri PicoCentauri Jun 5, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not sure if this is the canonical approach for caching. But the @cached decorator does not work here I think.

@PicoCentauri
Copy link
Contributor Author

This is related to #3169 but does different caching as proposed in #3169.

@orbeckst
Copy link
Member

@richardjgowers could you please have a look at this PR, looks like your wheelhouse.

If you don't have the bandwidth please un-assign yourself and let me know. Thanks!

Copy link
Member

@orbeckst orbeckst left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The TestUnwrapFlag.test_residues test fails. I assume that this is actually due to the changes here. What needs to be done to make the tests pass again?

@richardjgowers
Copy link
Member

this failure here:

Mismatched elements: 2 / 9 (22.2%)
  Max absolute difference: 55.42303448
  Max relative difference: 26.25439738
   x: array([[21.286, 41.664, 40.465],
         [44.528, 43.426, 23.248],
         [57.534, 27.871, 53.767]])
   y: array([[21.286, 41.664, 40.465],
         [44.528, 43.426, 78.671],
         [ 2.111, 27.871, 53.767]], dtype=float32)

the x coordinate is off by 55.423 where the boxlength on that dimension is 55.423, so that just looks like a rounding error. Those two x coordinates are virtually identical, and this test doesn't seem to be trying to establish which image gets picked, it's just bad luck that this regression test could fall into either. (arguably there could be some sort of periodic assert coordinates equal function for things like this...)

The moment of inertia error is a little more tricky to pin down in a similar way, but I'd be suspicious that it is also a rounding error

I got these tests to fail locally, and then they started passing locally, so it looks a bit twitchy...

Copy link
Member

@richardjgowers richardjgowers left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

barring the PBC precision fails, something I noticed is that for some reason group.center_of_mass() doesn't seem to create the cache entry, which is confusing as the code should be going through this new cache path. .accumulate() does seem to be creating the cache entry.

@PicoCentauri
Copy link
Contributor Author

Thanks for looking into this @richardjgowers. I can also try to debug why the .center_of_mass() doens't seem to create the cache entry.

@orbeckst orbeckst dismissed their stale review July 17, 2024 02:27

I am not reviewing the PR anymore.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants