Add saving of ragged lazy vectors #193
Codecov Report
Additional details and impacted files
@@            Coverage Diff             @@
##             main     #193      +/-   ##
==========================================
+ Coverage   86.02%   86.25%   +0.23%
==========================================
  Files          81       82       +1
  Lines       10387    10541     +154
  Branches     2253     2291      +38
==========================================
+ Hits         8935     9092     +157
  Misses        931      931
+ Partials      521      518       -3
☔ View full report in Codecov by Sentry.
1f1fa8f to dd7bd61
Fix lazy loading of ragged arrays
The approach looks good, just need to get the hspy writer to work!
@ericpre I went through and added your suggestions. I also added an error for trying to save a 1-D lazy ragged array using the .hspy file writer. I think something odd is going on with h5py casting the array to a higher dimension and then trying to save it. I can try to track down the bug, but it's pretty low on my to-do list. If someone needs the feature, or you really think it's worth tracking down, I can put the effort in; otherwise I might come back to this in a couple of months, after I graduate.
@ericpre what are your thoughts on including this in the 0.3.0 release? It doesn't have the 1-D hyperspy lazy saving figured out, but I feel like that shouldn't stop it from being included.
As this is the |
Yes, sure, this PR is pretty much finished.
Co-authored-by: Eric Prestat <[email protected]>
Description of the change
Currently, ragged lazy arrays aren't saved properly because the underlying dtype isn't accessed correctly. HDF5 and zarr only support one level of object arrays, so you can't have an object array of object arrays; serializing that would be quite difficult.
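To illustrate the one-level limitation described above, here is a minimal NumPy-only sketch, not the code from this PR. It assumes each navigation position holds a small 2-D array of varying length; `flatten_ragged` and `unflatten_ragged` are hypothetical helper names. Each element is flattened to 1-D so that only a single level of variable-length data needs storing, and the original shapes are kept alongside so the elements can be restored.

```python
import numpy as np

# Hypothetical ragged data: one (n_i, 2) array per navigation position.
ragged = np.empty(3, dtype=object)
ragged[0] = np.arange(4, dtype=float).reshape(2, 2)
ragged[1] = np.arange(6, dtype=float).reshape(3, 2)
ragged[2] = np.zeros((0, 2))


def flatten_ragged(arr):
    """Return (flat, shapes): 1-D element data plus each element's shape."""
    flat = np.empty(arr.shape, dtype=object)
    shapes = np.empty(arr.shape, dtype=object)
    for idx in np.ndindex(arr.shape):
        element = np.asarray(arr[idx])
        flat[idx] = element.ravel()  # single level of variable-length data
        shapes[idx] = np.array(element.shape, dtype=np.int64)
    return flat, shapes


def unflatten_ragged(flat, shapes):
    """Rebuild the original elements from the flat data and stored shapes."""
    out = np.empty(flat.shape, dtype=object)
    for idx in np.ndindex(flat.shape):
        out[idx] = flat[idx].reshape(tuple(shapes[idx]))
    return out


flat, shapes = flatten_ragged(ragged)
restored = unflatten_ragged(flat, shapes)
```

The shapes have to be written next to the flattened data, which is why the shape at every navigation position is needed when saving.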
In addition, a few small points here have performance implications. For example, we need the shape at each navigation position in order to unwrap things. We want the shapes and the data to be computed in the same task graph, so that the graphs are merged and simplified and nothing runs twice when saving :). This is why a tuple is passed to da.store.
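The idea of storing several results in one call can be sketched as follows. This is an illustration under stated assumptions rather than the PR's implementation: the array contents are made up, `counts` merely stands in for the per-position shape information, and plain NumPy arrays are used as targets in place of HDF5 or zarr datasets. `da.store` accepts a collection of sources and a matching collection of targets, so both results are written from a single merged graph, and the scheduler should be able to reuse the shared upstream tasks instead of recomputing them for each target.

```python
import dask.array as da
import numpy as np

# Stand-in for the lazy signal data (hypothetical values).
data = da.from_array(np.arange(12.0).reshape(3, 4), chunks=(1, 4))

# Stand-in for per-position metadata derived from the same lazy data.
counts = (data > 5).sum(axis=1)

# In-memory targets; real code would pass HDF5 or zarr datasets instead.
data_target = np.zeros((3, 4), dtype=float)
counts_target = np.zeros(3, dtype=np.int64)

# One call with tuples of sources and targets: both stores share one graph.
da.store((data, counts), (data_target, counts_target))
```

Calling `da.store` once per array would instead build and execute two separate graphs, each of which evaluates `data` on its own.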
Progress of the PR
upcoming_changes folder (see upcoming_changes/README.rst),
docs/readthedocs.org:rosettasciio build of this PR (link in github checks)
Minimal example of the bug fix or the new feature
The 1-D hspy case is still failing for some reason, but I wanted to create this PR to see if people had suggestions.