PR holoviz/datashader#702 introduced support for spatially indexing Dask dataframes and writing them out as parquet files with custom spatial metadata using the datashader.spatial.points.to_parquet function.
To accomplish this, the parquet file is initially written out using dask's dask.dataframe.io.to_parquet function. Then the parquet file is opened with fastparquet directly. The parquet metadata is retrieved using fastparquet, the spatial metadata is added, and then the updated metadata is written back to the file.
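For reference, the fastparquet side of that flow looks roughly like the sketch below. This is only an illustration of the mechanism, written against fastparquet's thrift-based metadata API as it existed around the time of #702; the "spatial" key and its contents are placeholders, not the exact names datashader writes.

```python
import copy
import json
import os

import fastparquet

path = "points.parquet"  # directory written by dask.dataframe.to_parquet
spatial_props = {"x_range": [0.0, 1.0], "y_range": [0.0, 1.0]}  # illustrative only

# Open the dataset and copy its thrift-level file metadata
pf = fastparquet.ParquetFile(path)
new_fmd = copy.copy(pf.fmd)

# Append a custom key/value entry to the footer metadata
kv = fastparquet.parquet_thrift.KeyValue()
kv.key = "spatial"  # placeholder key
kv.value = json.dumps(spatial_props)
new_fmd.key_value_metadata.append(kv)

# Rewrite only the metadata footer files; the row-group data is left untouched
fastparquet.writer.write_common_metadata(
    os.path.join(path, "_metadata"), new_fmd, no_row_groups=False)
fastparquet.writer.write_common_metadata(
    os.path.join(path, "_common_metadata"), new_fmd)
```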
In order to support the creation of spatially partitioned parquet files using pyarrow (rather than fastparquet), we would need to work out a similar approach to adding properties to the parquet metadata using the pyarrow parquet API.
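A minimal sketch of what the equivalent might look like with pyarrow, assuming we only need to attach schema-level key/value metadata to the dataset's _common_metadata footer (the "spatial" key and its contents are again placeholders):

```python
import json
import pyarrow.parquet as pq

path = "points.parquet"  # directory written by dask.dataframe.to_parquet
spatial_props = {"x_range": [0.0, 1.0], "y_range": [0.0, 1.0]}  # illustrative only

# Read the schema stored in the dataset's _common_metadata footer file
schema = pq.read_schema(path + "/_common_metadata")

# Parquet key/value metadata is a bytes -> bytes mapping
new_meta = dict(schema.metadata or {})
new_meta[b"spatial"] = json.dumps(spatial_props).encode("utf-8")

# Rewrite only the footer file with the augmented schema metadata
pq.write_metadata(schema.with_metadata(new_meta), path + "/_common_metadata")
```

One open question with this approach is the _metadata file, which also carries per-row-group information; pq.write_metadata only reproduces that if the row-group metadata is passed back in (e.g. via its metadata_collector argument), so dask's pyarrow writer may need to cooperate rather than the footer being patched after the fact.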