Early draft guidance document on general guidance on spatial type implementation #248

Draft. Wants to merge 10 commits into base: main

@paleolimbot (Collaborator) commented Nov 8, 2024

Very early draft with plenty of unsubstantiated opinions! Just posting here for feedback 🙂 .

For background, at the last sync we discussed that probably a document like this should exist given the discussions that have taken place in the Parquet and Iceberg PRs. I am not sure this is the actual home for this, if this is the thing that we envisioned creating, or if it should be anything more than a blog post, but I did say I would give it a go and this is what I came up with.

format-specs/spatial-type-guidance.md
for the combinations of x, y, z, and/or m values of which geometries are
comprised. Just as a value of "5 meters" has physical meaning whereas
the value "5" does not, geometries with a coordinate reference system have
physical meaning.

Contributor:

Speaking about units doesn't do justice to the concept of a CRS. Perhaps quickly mention that it encodes whether the coordinates are geographic (longitude / latitude) or projected, and the underlying reference ellipsoid.

### Why can't I just use an "identifier"

Multiple identifiers needed for xy + elevation

Contributor:

Yes and no. EPSG has codified a number of compound CRSs where a single EPSG ID designates both the horizontal and vertical CRS. The main reason is custom CRSs, that is, CRSs not integrated into an official catalog.

@paleolimbot (Collaborator, Author) left a comment:

Thank you for the feedback! I will circle back to the CRS section this week 🙂

@dmeaux left a comment:

I think this does a good job explaining to a user why certain decisions were made in the format definition, and the context of the state of the art at the time those decisions were made.

Also, I found a couple of typos that I noted with comments.

- [Boost Geometry](https://www.boost.org/doc/libs/1_86_0/libs/geometry/doc/html/index.html)

These libraries are used by Geospatial type implementations to handle the details of
computational geometry; however, thier scope does not include relating those geometries

Typo:

however, thier

should be:

however, their

two types differ with respect to how adjascent coordinates should be connected
(i.e., edges). Among other things, this affects how area, length, and distance
between elements are defined. A precise definition of the edge interpretation
of all Geometry types of which we are aware can found in the widely adopted

can found

should be:

can be found

accessing the contents is possible with any off-the-shelf JSON parser (whereas
to parse WKT2 a specialized parser is required).

Because some producers *cannot* choose, we also reccomend allowing those producers

reccomend -> recommend
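
To illustrate the quoted point about off-the-shelf JSON parsers, here is a minimal sketch in Python. It assumes the JSON representation under discussion is PROJJSON; the document below is a hand-abbreviated, illustrative CRS object (a real one would also carry datum and ellipsoid details), and `describe_crs` is a hypothetical helper, not part of any library.

```python
import json

# Hand-abbreviated, illustrative CRS document in the style of PROJJSON.
# A real document would also include datum/ellipsoid details.
CRS_JSON = """
{
  "type": "GeographicCRS",
  "name": "WGS 84",
  "coordinate_system": {
    "subtype": "ellipsoidal",
    "axis": [
      {"name": "Geodetic latitude", "abbreviation": "Lat",
       "direction": "north", "unit": "degree"},
      {"name": "Geodetic longitude", "abbreviation": "Lon",
       "direction": "east", "unit": "degree"}
    ]
  },
  "id": {"authority": "EPSG", "code": 4326}
}
"""


def describe_crs(text):
    """Hypothetical helper: pull a few fields out of a JSON-encoded CRS."""
    crs = json.loads(text)  # standard-library parser, no CRS-specific code
    axes = crs.get("coordinate_system", {}).get("axis", [])
    ident = crs.get("id", {})
    return {
        "type": crs.get("type"),
        "name": crs.get("name"),
        "authority_code": (ident.get("authority"), ident.get("code")),
        "axis_directions": [axis.get("direction") for axis in axes],
    }


summary = describe_crs(CRS_JSON)
print(summary)
```

Reading the same fields out of a WKT2 string would instead need a parser that understands WKT2's own grammar.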

https://macwright.com/lonlat/
https://erouault.blogspot.com/2024/09/those-concepts-in-geospatial-field-that.html
- Representing a coordinate reference system in JSON

Well, those issues are partially caused by the conflation (not only by open source projects, but also in some OGC groups) of concepts that should have been separate concerns. For example, the confusion around "pixel is point/area" arises partly because it has been interpreted as synonymous with "pixel center/corner", while they are actually two independent problems. Likewise, the confusion about axis order is caused by "CRS orientation" being interpreted as aligned with "display orientation", while the CRS is actually an intermediate step between the data and the visualization.

The blogs linked above explain that there is confusion, but do not really analyse the causes and remedies. Users are left with the impression that the situation is just chaos, while the problem is actually oversimplification. I tried to show which concepts have been conflated, and how to resolve those ambiguities, in an OGC Testbed report (draft). This is not a proposal to link to that Testbed report (especially since it is still a draft), but rather to avoid this controversial topic, since the above blogs do not really help in my opinion.

Collaborator, Author:

Agreed that this section needs filling out! I do think we need to explain why the GeoParquet spec chose its axis order language (which is very explicit). Those blogs themselves have good references I was planning to look at, but it's a great point that they're intended for a spatial audience (whereas this document is intended for a non-spatial audience).

GeoParquet is not very explicit regarding axis order. It suffers from the ambiguities that I tried to describe in the Testbed report, and doesn't comply with OGC directive 14. But we probably don't want to redo this debate here. My proposal would be to just say that there is confusion about axis order, that this problem is not yet resolved in the general case, but that we have a status quo in e.g. WKB that works for common cases, and that the current GeoParquet specification follows that status quo. But we acknowledge that this is an unsatisfying situation, and a future version may have some changes (e.g., maybe a permutation field), depending on how things evolve at OGC.
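
Purely to illustrate what a "permutation field" could mean, here is a small Python sketch. Nothing like this exists in the GeoParquet specification; the name `axis_permutation` and its semantics are assumptions made up for this example.

```python
def apply_axis_permutation(coord, axis_permutation):
    """Reorder one stored coordinate tuple into the axis order the CRS declares.

    axis_permutation[i] is the index, within the stored tuple, of the value
    that belongs to the CRS's i-th axis. Hypothetical field, for illustration.
    """
    if sorted(axis_permutation) != list(range(len(coord))):
        raise ValueError(
            "axis_permutation must be a permutation of the coordinate indices"
        )
    return tuple(coord[i] for i in axis_permutation)


# Stored as (x=longitude, y=latitude); the CRS declares (latitude, longitude).
stored = (-64.5, 45.1)
crs_order = apply_axis_permutation(stored, [1, 0])
print(crs_order)
```

An identity permutation such as `[0, 1]` would leave the stored order untouched, which corresponds to the status quo described above.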

@brendan-ward (Contributor) left a comment:

Thanks for working on this @paleolimbot ! A few word-smithing suggestions...

These components are used by major data warehouse vendors like
[BigQuery](https://cloud.google.com/bigquery),
[Snowflake](https://snowflake.com), [Databricks](https://databricks.com). Broadly,
GeoParquet was

missing rest of sentence...

the format. For example, [GeoJSON](https://geojson.org) encodes geometries as JavaScript
objects instead of a text or binary-based format.

For more details and ways to ensure that the type definition is precise and unambiguous,

This transition sentence seems unnecessary; the next section follows nicely enough without it...

@paleolimbot (Collaborator, Author):

Thank you all for the feedback! I will circle back to the parts that require some more substantial content generation in the next few days 🙂
