Is your feature request related to a problem? Please describe.
SpatialPandas helps spatially sort data, but we are seeing the need for higher-level arbitrary indexing. Two example use cases:
Geospatial. We have spatially sorted daily GPS data for the US spanning multiple days. Extracting a small region for a 60-90 day process can get bogged down by the need to read 60-90 separate metadata files and construct the task graph.
Astronomy. We have spatial data for multiple filters (HSC-Y, HSC-G, etc.). Again, we would have to read multiple metadata files.
Describe the solution you'd like
The above could be fixed by building higher-level indexes. I think we can benefit from integrating with kartothek. It enables O(1) index lookups and creates the necessary task graphs for reading just the partitions required. It could also be used to store the extra metadata spatialpandas currently stores in its own format (if I'm understanding spatialpandas correctly).
I'm at the Dask Dev Conference with some of the Kartothek devs, and based on conversations with fjetter, this integration should be possible.
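To make the idea concrete, here is a minimal pure-Python sketch of what a higher-level index could look like. It is not kartothek's or spatialpandas' API: the `PartitionIndex` class, the `day` and `tile` keys, and the file paths are all hypothetical, and `tile` is only a stand-in for whatever spatial partition id is used. The point is that one dictionary lookup per indexed value yields the partition files to read, without opening each day's metadata file.

```python
from collections import defaultdict


class PartitionIndex:
    """Hypothetical secondary index: column -> value -> set of partition paths."""

    def __init__(self):
        self._index = defaultdict(lambda: defaultdict(set))

    def add(self, path, **keys):
        # Register one partition file under each of its index values.
        for column, value in keys.items():
            self._index[column][value].add(path)

    def lookup(self, **conditions):
        # Each condition is a single value or a list of values (OR within a
        # column, AND across columns). Every value is one dict lookup.
        result = None
        for column, values in conditions.items():
            if not isinstance(values, (list, tuple, set)):
                values = [values]
            by_value = self._index.get(column, {})
            matched = set()
            for value in values:
                matched |= by_value.get(value, set())
            result = matched if result is None else result & matched
        return sorted(result or [])


# Build the index once, e.g. when the daily datasets are written.
index = PartitionIndex()
for day in ("2020-01-01", "2020-01-02"):
    for tile in (0, 1):
        index.add(f"gps/day={day}/tile={tile}.parquet", day=day, tile=tile)

# Select a small region (tile 1) across both days without reading per-day metadata.
paths = index.lookup(day=["2020-01-01", "2020-01-02"], tile=1)
```

The resulting `paths` list is what a task graph would then be built from, one read task per partition file. The astronomy case would work the same way with a hypothetical `filter` key in place of `day`.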