Open
Description
Is your feature request related to a problem? Please describe.
Currently, spatialdata stores Shape elements on disk as WKB-encoded Parquet files, because WKB is the default value of the geometry_encoding parameter of GeoPandas's GeoDataFrame.to_parquet (code).
WKB encoding has a high overhead for decoding/deserialization. For background:
- https://geoarrow.org/format.html#motivation
- https://observablehq.com/@kylebarron/geoparquet-on-the-web
- In Vitessce in JS, we have to iterate over each row of the parquet column and perform WKB decoding (code)
Describe the solution you'd like
Would it be possible to use to_parquet(geometry_encoding='geoarrow') as the default?
Describe alternatives you've considered
Alternatively, spatialdata could allow users to opt in to GeoArrow encoding (instead of WKB).
Additional context
We would need to document that the on-disk representation may use either WKB or GeoArrow encoding (or only one of the two), and how to detect which one was used.
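One possible way to detect the encoding, sketched under the assumption that the file carries standard GeoParquet metadata: the Parquet schema metadata has a `geo` key holding JSON, and each geometry column there has an `encoding` field. In GeoParquet 1.1 this is `"WKB"` or one of the native (GeoArrow) geometry type names. The function names below are hypothetical and not part of spatialdata or GeoPandas.

```python
import json

# Native (GeoArrow) encoding names defined by the GeoParquet 1.1 specification.
GEOARROW_ENCODINGS = {
    "point", "linestring", "polygon",
    "multipoint", "multilinestring", "multipolygon",
}


def classify_geo_metadata(geo_json, column=None):
    """Return "wkb" or "geoarrow" for one geometry column of a GeoParquet "geo" blob.

    geo_json is the JSON value (str or bytes) stored under the "geo" metadata key.
    If column is None, the file's primary geometry column is used.
    """
    geo = json.loads(geo_json)
    column = column or geo["primary_column"]
    encoding = geo["columns"][column]["encoding"]
    if encoding == "WKB":
        return "wkb"
    if encoding.lower() in GEOARROW_ENCODINGS:
        return "geoarrow"
    raise ValueError(f"Unrecognized geometry encoding: {encoding!r}")


def detect_geometry_encoding(path, column=None):
    """Read only the Parquet schema of the file at path and classify its encoding."""
    # Imported lazily so classify_geo_metadata has no third-party dependencies.
    import pyarrow.parquet as pq

    metadata = pq.read_schema(path).metadata or {}
    if b"geo" not in metadata:
        raise ValueError(f"No GeoParquet 'geo' metadata found in {path}")
    return classify_geo_metadata(metadata[b"geo"], column)
```

Because this reads only the schema metadata, the check does not need to load or decode any geometries.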