Implement geo-traits for writing to WKT & perf improvement #124
All existing tests pass; just a question of whether |
In general, can you address the `todo!` before marking a PR as ready for review? I don't want to accidentally miss one and then merge code that willfully explodes on the happy path. It seems like a good way to get users to not trust you.
Well I consider the Is there a better way to signify this? |
Draft PRs are a great way to get line by line feedback and signify "please don't merge this" as is. I'll also often leave comments on my own PR with specific questions and relevant context at the spot I'm seeking help. |
Another alternative / complementary approach is to deny |
Ok! I've found different expectations in different communities on github of whether draft means "don't look at this yet" or "I'm soliciting feedback on things, but it's not yet ready for merge" |
Here the remaining questions are:
I should be able to fix the rect and triangle issues separately |
Other questions:
|
src/to_wkt/geo_trait_impl.rs
let max_coord = rect.max();

// Note: Even if the rect has more than 2 dimensions, we omit the other dimensions when
// converting to a Polygon.
Why do we do this for Rect's?
Instead can it look more like the triangle impl which avoids this loss of dims and heap allocation?
`Triangle` and `Line` store all coordinates, just with the simplifying case that they only have a specific number of points. `Rect` doesn't do that, or else `Rect` would store 4 coordinates, not 2. We need to do some sort of conversion from the `min` and `max` coordinates to a continuous ring.
It's obvious how to do this for 2D input, but less clear how to do this for 3D or 4D. I'd suggest that we should error for 3D/4D, as the user can choose how to transform a 3D or 4D Rect.
Or perhaps we could even change the `RectTrait` so that it doesn't have the same dimensionality as the others. Because `Rect` means "axis aligned bounding box" and when you have 3 or 4 axes, it's less clear how that maps to simple-features.
oh geeze, yeah that's a mess.
Given that we're only handling 2d, can we at least avoid the vec allocation in that case?
As for erroring on 3-d/4-d, that seems reasonable, assuming you mean a runtime error.
Yeah, we can avoid the vec allocation, just makes the code slightly more complicated.
Also, what should we do with the error? Currently it returns `std::fmt::Error` and there's no error type in this crate.
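A minimal sketch of the allocation-free 2D case discussed above, assuming plain `(f64, f64)` tuples in place of the crate's coordinate traits. `rect_to_ring` is a hypothetical helper, not code from this PR. It expands the `min` and `max` corners into a closed five-point ring held in a fixed-size array instead of a `Vec`:

```rust
/// Expand an axis-aligned 2D rectangle into a closed exterior ring.
/// Hypothetical helper: tuples stand in for the geo-traits coordinate types.
fn rect_to_ring(min: (f64, f64), max: (f64, f64)) -> [(f64, f64); 5] {
    [
        (min.0, min.1),
        (max.0, min.1),
        (max.0, max.1),
        (min.0, max.1),
        (min.0, min.1), // repeat the first point to close the ring
    ]
}
```

The array lives on the stack, so iterating over it to write coordinates needs no heap allocation.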
Feel free to propose something.
I added an `Error` enum and now error on non-2d Rect input.
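Roughly, such an error type could look like the sketch below. Only the `FmtError` variant name appears in this PR's diff; `RectUnsupportedDimension`, `check_rect_dims`, and the `dims: usize` parameter are hypothetical names chosen for illustration.

```rust
use std::fmt;

/// Sketch of a crate-level error type.
#[derive(Debug)]
enum Error {
    /// Hypothetical variant: a Rect with anything other than 2 dimensions.
    RectUnsupportedDimension,
    /// Wraps a formatting error from the underlying writer.
    FmtError(fmt::Error),
}

impl From<fmt::Error> for Error {
    fn from(err: fmt::Error) -> Self {
        Error::FmtError(err)
    }
}

/// Lets a `Display` impl collapse any crate error back into `fmt::Error`.
impl From<Error> for fmt::Error {
    fn from(value: Error) -> Self {
        match value {
            Error::FmtError(err) => err,
            _ => fmt::Error,
        }
    }
}

/// Hypothetical guard: reject Rect input that is not 2D at runtime.
fn check_rect_dims(dims: usize) -> Result<(), Error> {
    if dims == 2 {
        Ok(())
    } else {
        Err(Error::RectUnsupportedDimension)
    }
}
```

The second `From` impl loses the specific reason for the failure, which is unavoidable because `fmt::Error` carries no payload.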
Let's get #123 merged and then we can come back to this to decide how we want to write |
I updated the Rect writing, changed the |
This looks great!
I left a bunch of stylistic nitpicks, but I don't feel that strongly about any of them. Apply whatever makes sense to you and merge away.
fn from(value: Error) -> Self {
    match value {
        Error::FmtError(err) => err,
        _ => std::fmt::Error,
nit: `std::fmt` here, but `core::fmt` at the top; settle on one.
Oh, oops. I don't actually know the difference between `std` and `core`. I changed to `std` everywhere. I hope that's ok.
core is a subset of std.
std re-exports all of core and then has some other fancier functionality. So referencing things from core or std is equivalent, it just makes me twitch a little to see both used interchangeably in such close proximity. 🤪
The distinction is typically only relevant if you are trying to target a no-std build for very lightweight platforms (like an embedded system) which might not have support for the std lib.
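As a small illustration of that equivalence (the function name `same_type` is made up for this sketch), a value typed through the `core` path can be returned through the `std` path with no conversion, because both paths name the same type:

```rust
// `std::fmt::Error` is a re-export of `core::fmt::Error`, so this compiles
// without any `From`/`Into` conversion between the two.
fn same_type(err: core::fmt::Error) -> std::fmt::Error {
    err
}
```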
src/to_wkt/geo_trait_impl.rs
pub fn write_point<T: WktNum + fmt::Display, G: PointTrait<T = T>, W: Write>(
    g: &G,
    f: &mut W,
- f: &mut W,
+ f: &mut impl Write
It's a little more readable to use `impl Trait` for these kinds of params.
style nit #2: I think conventionally the `impl Write` is the first param for free functions, e.g.:
- https://github.com/rust-lang/rust/blob/dff3e7ccd4a18958c938136c4ccdc853fcc86194/src/tools/rust-analyzer/crates/proc-macro-api/src/json.rs#L29
- https://github.com/rust-lang/rust/blob/dff3e7ccd4a18958c938136c4ccdc853fcc86194/compiler/rustc_pattern_analysis/src/rustc/print.rs#L47
So all together...
pub fn write_point<T: WktNum + fmt::Display, G: PointTrait<T = T>>(f: &mut impl Write, g: &G)
// e.g.
write_point(&mut some_writable, &some_point)
(similar for other methods in this file)
> It's a little more readable to use `impl Trait` for these kinds of params.
Happy to switch. Can you clarify what you mean by "these kinds"? Do you just mean "simple bounds without any parameters"? Like you switched the `impl Write` but not the `impl PointTrait`.
I think anything that can be written as `impl Trait` probably should be. Maybe there's an exception, but I can't think of one.
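To make the two spellings concrete, here is a small sketch using only the standard library. `write_pair_generic` and `write_pair_impl` are hypothetical names, and a pair of `Display` values stands in for the point type.

```rust
use std::fmt::{self, Display, Write};

// Writer as a named generic parameter.
fn write_pair_generic<T: Display, W: Write>(f: &mut W, x: T, y: T) -> fmt::Result {
    write!(f, "{} {}", x, y)
}

// Same function with `impl Trait` in argument position, writer first.
fn write_pair_impl<T: Display>(f: &mut impl Write, x: T, y: T) -> fmt::Result {
    write!(f, "{} {}", x, y)
}
```

Both are called the same way, e.g. `write_pair_impl(&mut some_string, 1, 2)`. One difference worth knowing: a type behind `impl Trait` in argument position cannot be named explicitly by the caller with turbofish syntax, which rarely matters for a writer parameter.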
f.write_str("(")?;
write_coord_sequence(f, coords.iter(), PhysicalCoordinateDimension::Two)?;
Ok(f.write_char(')')?)
}
Could just as well be direct, like:
write!(
f,
"POLYGON ({0} {1},{2} {1},{2} {3},{0} {3},{0} {1})",
min.x(),
min.y(),
max.x(),
max.y(),
)?;
It was not materialized at the time it was copied; sorry for that.
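A self-contained version of that idea might look like the sketch below. It assumes `(f64, f64)` tuples instead of the crate's coordinate accessors, and it assumes the ring gets its own pair of parentheses, as standard WKT polygon syntax requires. `write_rect_2d` is a hypothetical name.

```rust
use std::fmt::{self, Write};

/// Write a 2D rect as a WKT polygon in a single `write!` call.
/// Tuples stand in for the min/max coordinates of the rect.
fn write_rect_2d(f: &mut impl Write, min: (f64, f64), max: (f64, f64)) -> fmt::Result {
    write!(
        f,
        // Positional arguments: {0}=min x, {1}=min y, {2}=max x, {3}=max y.
        "POLYGON(({0} {1},{2} {1},{2} {3},{0} {3},{0} {1}))",
        min.0, min.1, max.0, max.1,
    )
}
```

Reusing the positional arguments keeps the five ring points in one format string, with no intermediate array or `Vec`.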
`CHANGES.md` if knowledge of this change could be valuable to users.

Changes:
This slightly changes the implementation for writing to WKT, partially based on @b4l's PR here geoarrow/geoarrow-rs#788.
While porting the code to use traits, I realized that there was an awful lot of string concatenation being performed. So for example, when writing a Polygon to WKT, every single coordinate would be allocated to a separate string, and then the coords of each ring would be concatenated and then all rings would be concatenated.
wkt/src/types/polygon.rs
Lines 42 to 52 in e1d838d
This instead writes coordinates directly to the output writer. So perhaps the performance differences in the discussion here #118 were not primarily due to `ryu`.
This PR also implements writing for geo-traits objects, and delegates the `std::fmt::Display` implementation to the underlying geo-traits impl.
Benchmark against `main`:
I also added another bench for writing the `geo::Geometry` object directly instead of first converting it to a `wkt::Wkt`. It's not 100% apples to apples because I can't use `std::io::sink()` for a `std::fmt::Write` trait input. But it's still faster 🤷‍♂️
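The difference between the two writing styles can be sketched with the standard library alone. The names below (`ring_concat`, `ring_direct`, `FmtSink`) are hypothetical and are not the crate's functions; `FmtSink` is one possible stand-in for `std::io::sink()` on the `std::fmt::Write` side.

```rust
use std::fmt::{self, Write};

/// Concatenation style: one `String` per coordinate, then one joined `String`.
fn ring_concat(coords: &[(f64, f64)]) -> String {
    let parts: Vec<String> = coords
        .iter()
        .map(|(x, y)| format!("{} {}", x, y))
        .collect();
    format!("({})", parts.join(","))
}

/// Direct style: every coordinate goes straight to the writer,
/// with no intermediate strings.
fn ring_direct(f: &mut impl Write, coords: &[(f64, f64)]) -> fmt::Result {
    f.write_char('(')?;
    for (i, (x, y)) in coords.iter().enumerate() {
        if i > 0 {
            f.write_char(',')?;
        }
        write!(f, "{} {}", x, y)?;
    }
    f.write_char(')')
}

/// A `fmt::Write` implementation that discards all output.
struct FmtSink;

impl Write for FmtSink {
    fn write_str(&mut self, _s: &str) -> fmt::Result {
        Ok(())
    }
}
```

Both functions produce the same text; the direct version simply skips the per-coordinate and per-ring allocations, and `ring_direct(&mut FmtSink, &coords)` exercises the formatting work without keeping the output.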