Post-order element traversal #80

urschrei opened this issue Dec 7, 2021 · 13 comments

Labels: enhancement

Comments

@urschrei (Member) commented Dec 7, 2021

Cascaded unions are a desirable feature for the geo library. However, the algorithm requires a post-order traversal of the tree, and iteration order is currently not specified.

urschrei added the enhancement label Dec 7, 2021
@rmanoka (Contributor) commented Dec 7, 2021

I may be out of my depth here, but the data in the R-Tree is only at the leaf nodes, right? So what does a post-order traversal mean for an R-Tree? Also, does locate_with_selection_function help? It lets you selectively open each intermediate bounding box based on a custom predicate.
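
For context, a SelectionFunction drives rstar's top-down search: it is asked whether to open an intermediate node's bounding box before descending into it. A minimal sketch, where everything apart from the rstar items is illustrative:

use rstar::{Envelope, RTreeObject, SelectionFunction, AABB};

// Only descend into subtrees whose envelope intersects a query box.
struct IntersectsQuery {
    query: AABB<[f64; 2]>,
}

impl<T> SelectionFunction<T> for IntersectsQuery
where
    T: RTreeObject<Envelope = AABB<[f64; 2]>>,
{
    fn should_unpack_parent(&self, envelope: &AABB<[f64; 2]>) -> bool {
        // Prune whole subtrees whose bounding box misses the query.
        envelope.intersects(&self.query)
    }

    fn should_unpack_leaf(&self, _leaf: &T) -> bool {
        // Accept every leaf inside an unpacked parent.
        true
    }
}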

@urschrei (Member, Author) commented Dec 7, 2021

I think the idea is that we begin with unions of the smallest subsets (the deepest leaf nodes?) before moving up, and it's this that requires the post-order traversal (a reverse post-order, moving right-to-left, would also work, I think?). Unless I'm misunderstanding, that's where the efficiency comes from: the closer you get to the root, the less work there is to do.

I'm not sure how the locate-with-selection-function would help here, but I've never used it and I may well be being dense about that.

@rmanoka (Contributor) commented Dec 7, 2021

Got it. The issue is that the R-Tree itself has no user-data available at the intermediate nodes, so all the iterations have no concept of post-order.

I suppose you want to be able to do a map-reduce op on the R-Tree, where each intermediate (aka ParentNode) is reduced to a new polygon (union of its children), and do that in a bottom-up fashion. The locate-with-selection is essentially the reverse: it is a "pre-order" traversal as it queries if you intend to traverse a given intermediate node, before delving into it.

Here is one suggestion (just thinking aloud): we add a function to the trait SelectionFunction that also signals when we are done with a particular intermediate node.

trait SelectionFunction<...> {
  fn finished_unpacking_parent(...) {}
}

This can(?) then be used to drive a post-order / map-reduce op. Unfortunately, the trait SelectionFunction seems to take &self and not &mut self, which seems a bit restrictive in the first place.

@rmanoka (Contributor) commented Dec 7, 2021

Also, note that the root node and the children of intermediate nodes can be accessed via public methods, so we can implement the union algorithm without using the iterators.
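
A minimal sketch of that direct access (counting leaves, just to show the node-level API; it assumes rstar's public ParentNode::children() and the RTreeNode enum):

use rstar::{ParentNode, RTreeNode, RTreeObject};

// Recurse from tree.root(): leaves carry the user data, parents carry children.
fn count_leaves<T: RTreeObject>(parent: &ParentNode<T>) -> usize {
    parent
        .children()
        .iter()
        .map(|child| match child {
            RTreeNode::Leaf(_) => 1,
            RTreeNode::Parent(p) => count_leaves(p),
        })
        .sum()
}

// e.g. count_leaves(tree.root()) for a tree: &RTree<T>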

@urschrei (Member, Author) commented Dec 7, 2021

> Got it. The issue is that the R-Tree itself has no user-data available at the intermediate nodes, so all the iterations have no concept of post-order.

Ahh, yep.

> I suppose you want to be able to do a map-reduce op on the R-Tree, where each intermediate (aka ParentNode) is reduced to a new polygon (union of its children), and do that in a bottom-up fashion.

This is my understanding too, yes.

> Here is one suggestion (just thinking aloud): we add a function to the trait SelectionFunction that also signals when we are done with a particular intermediate node.
>
> trait SelectionFunction<...> {
>   fn finished_unpacking_parent(...) {}
> }
>
> This can(?) then be used to drive a post-order / map-reduce op. Unfortunately, the trait SelectionFunction seems to take &self and not &mut self, which seems a bit restrictive in the first place.

Don't we still have the problem of not knowing where to start, though?

@rmanoka (Contributor) commented Dec 7, 2021

The SelectionFunction already has a method should_unpack_parent that tells us when it starts unpacking a parent node. That, along with some nifty use of a vec/stack, should let us do it. However, it all seems more complicated than just directly recursing on the RTree nodes.

I would support introducing a new trait to support this type of map-reduce-fold if we could brainstorm a little. There are a few different use-cases here:

  1. Pure reduce. This is the current use-case you've mentioned, iiuc; the final output has the same type as the R-Tree element type T.
  2. Map + reduce. First map T -> R on the data nodes, then reduce bottom-up to a final output R. Note that R may not itself be an RTreeObject, so it's hard to divide this up into two composable ops.
  3. Fold. In iterators (std, rayon) there is also a distinction between reduce and fold. Here the leaves are not mapped from T -> R individually; instead the penultimate intermediate nodes compute [T] -> R, and the remaining levels compute [R] -> R. It could be rephrased as map + reduce, but T -> R followed by [R] -> R might be less efficient than converting [T] -> R directly. This only helps at the bottom-most level, so it may not be worth it for trees.

Unfortunately, the naive trait definition I could come up with to handle the above seems to involve many intermediate Vec allocations, which would hinder its use in a good algorithm implementation. Direct recursion on the nodes seems simpler, as it is more flexible.

@adamreichold (Member) commented Dec 8, 2021

> Direct recursion on the nodes seems simpler, as it is more flexible.

So something like

use rstar::{ParentNode, RTree, RTreeNode, RTreeObject};

fn bottom_up_fold_reduce<T, S, I, F, R>(tree: &RTree<T>, mut init: I, mut fold: F, mut reduce: R) -> S
where
    T: RTreeObject,
    I: FnMut() -> S,
    F: FnMut(S, &T) -> S,
    R: FnMut(S, S) -> S,
{
    fn inner<T, S, I, F, R>(parent: &ParentNode<T>, init: &mut I, fold: &mut F, reduce: &mut R) -> S
    where
        T: RTreeObject,
        I: FnMut() -> S,
        F: FnMut(S, &T) -> S,
        R: FnMut(S, S) -> S,
    {
        parent
            .children()
            .iter()
            .fold(init(), |accum, child| match child {
                RTreeNode::Leaf(value) => fold(accum, value),
                RTreeNode::Parent(parent) => {
                    let value = inner(parent, init, fold, reduce);

                    reduce(accum, value)
                }
            })
    }

    inner(tree.root(), &mut init, &mut fold, &mut reduce)
}

?
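
For illustration, a toy invocation of this sketch (assuming the tree stores [f64; 2] points, which implement RTreeObject out of the box):

// Sum the x-coordinates of all leaves, bottom-up.
let tree = RTree::bulk_load(vec![[0.0, 1.0], [2.0, 3.0], [4.0, 5.0]]);
let sum_x: f64 = bottom_up_fold_reduce(
    &tree,
    || 0.0,                       // init: fresh accumulator per parent node
    |acc, point| acc + point[0],  // fold: incorporate a leaf
    |acc, child| acc + child,     // reduce: merge a finished child subtree
);
assert_eq!(sum_x, 6.0);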

@adamreichold (Member) commented:

I think what is also nice about this approach is that it has a straightforward parallel version using Rayon, which could come in handy for large data sets:

use rayon::prelude::*;
use rstar::{ParentNode, RTree, RTreeNode, RTreeObject};

fn parallel_bottom_up_fold_reduce<T, S, I, F, R>(tree: &RTree<T>, init: I, fold: F, reduce: R) -> S
where
    T: RTreeObject,
    RTreeNode<T>: Send + Sync,
    S: Send,
    I: Fn() -> S + Send + Sync,
    F: Fn(S, &T) -> S + Send + Sync,
    R: Fn(S, S) -> S + Send + Sync,
{
    fn inner<T, S, I, F, R>(parent: &ParentNode<T>, init: &I, fold: &F, reduce: &R) -> S
    where
        T: RTreeObject,
        RTreeNode<T>: Send + Sync,
        S: Send,
        I: Fn() -> S + Send + Sync,
        F: Fn(S, &T) -> S + Send + Sync,
        R: Fn(S, S) -> S + Send + Sync,
    {
        parent
            .children()
            .into_par_iter()
            .fold(init, |accum, child| match child {
                RTreeNode::Leaf(value) => fold(accum, value),
                RTreeNode::Parent(parent) => {
                    let value = inner(parent, init, fold, reduce);

                    reduce(accum, value)
                }
            })
            .reduce(init, reduce)
    }

    inner(tree.root(), &init, &fold, &reduce)
}

@rmanoka (Contributor) commented Dec 8, 2021

Thanks for fleshing out the details @adamreichold! Looks quite on track; I like the easy rayon support. Just a few minor comments / points to discuss:

  1. The init parameter should probably take the bounding-box AABB as a ref, since it is available anyway and might help in initializing some structures that help with the reduce/fold (see the sketch after this list).

  2. Is (S, S) -> S as efficient as (&mut S, S) if S is big? Rust may be optimizing it away as we're just storing the returned value back into the accumulator for the next iteration, but I wasn't too sure.

  3. I wonder if there are use-cases where it is more efficient to directly reduce &[S] -> S. Even with polygon clipping, the underlying data-structure is better built once for the sequence than during each iterative reduction. Unfortunately, I don't see a neat way to handle that use-case without many Vec allocations.
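
For point 1, here is a hedged sketch of what the signature change might look like. It assumes ParentNode exposes its envelope via a public envelope() accessor; apart from the rstar items, the names are illustrative:

use rstar::{ParentNode, RTree, RTreeNode, RTreeObject};

// Same shape as the sketch above, but `init` is handed the parent's bounding box.
fn bottom_up_fold_reduce_with_envelope<T, S, I, F, R>(
    tree: &RTree<T>,
    mut init: I,
    mut fold: F,
    mut reduce: R,
) -> S
where
    T: RTreeObject,
    I: FnMut(&T::Envelope) -> S,
    F: FnMut(S, &T) -> S,
    R: FnMut(S, S) -> S,
{
    fn inner<T, S, I, F, R>(parent: &ParentNode<T>, init: &mut I, fold: &mut F, reduce: &mut R) -> S
    where
        T: RTreeObject,
        I: FnMut(&T::Envelope) -> S,
        F: FnMut(S, &T) -> S,
        R: FnMut(S, S) -> S,
    {
        // Seed the accumulator with this parent's envelope (assumed accessor).
        let accum = init(&parent.envelope());
        parent
            .children()
            .iter()
            .fold(accum, |accum, child| match child {
                RTreeNode::Leaf(value) => fold(accum, value),
                RTreeNode::Parent(parent) => {
                    let value = inner(parent, init, fold, reduce);
                    reduce(accum, value)
                }
            })
    }

    inner(tree.root(), &mut init, &mut fold, &mut reduce)
}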

@adamreichold (Member) commented Dec 8, 2021

> The init parameter should probably take the bounding-box AABB as a ref, since it is available anyway and might help in initializing some structures that help with the reduce/fold.

👍

> Is (S, S) -> S as efficient as (&mut S, S) if S is big? Rust may be optimizing it away as we're just storing the returned value back into the accumulator for the next iteration, but I wasn't too sure.

Both std and rayon seem to pass the accumulator by value. I think, if it was really expensive to move and not optimized properly, one would box it?

> I wonder if there are use-cases where it is more efficient to directly reduce &[S] -> S. Even with polygon clipping, the underlying data-structure is better built once for the sequence than during each iterative reduction. Unfortunately, I don't see a neat way to handle that use-case without many Vec allocations.

I do not know the details of those auxiliary data structures, but maybe moving them into the closures would help? (At least in the serial case. Not sure whether thread-local storage is worth that in the parallel case.)

Maybe the above could also accommodate this differently: S could keep auxiliary allocations around, e.g. a Vec<T> to delay the union computation until the next call to reduce?
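
To make the earlier "moving them into the closures" idea concrete: since the serial sketch only requires FnMut, a closure can own a scratch allocation that is built once and reused on every call. A toy sketch, with the scratch buffer and its contents standing in for whatever auxiliary structures the clipping actually needs:

// The closure owns `scratch`; it is allocated once and reused per fold call.
let mut scratch: Vec<f64> = Vec::with_capacity(1024);
let fold = move |acc: f64, leaf: &[f64; 2]| {
    scratch.clear();                  // keep the capacity, drop previous contents
    scratch.extend_from_slice(leaf);  // stand-in for building the auxiliary structure
    acc + scratch.iter().sum::<f64>()
};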

In any case, I think the main takeaway could be that we do not really need to make these decisions in a general context: the traversal/reduction can be written against the existing rstar API (and I don't see any obvious performance gains from having access to its internals), so when geo implements cascaded unions, it can use code like the above, tailored to its specific use case.

Or would you rather say that this is important/complex enough to be exposed by rstar itself? Should it then also optionally provide the parallel variant?

@urschrei (Member, Author) commented Nov 2, 2024

The single-threaded implementation of this will land in geo at some point: georust/geo#1246.

urschrei closed this as completed Nov 2, 2024
@urschrei (Member, Author) commented Nov 2, 2024

I didn't investigate the AABB hint or the auxiliary allocation for S (not sure how that would work), but performance on intermediate-sized real-world workloads (39k polygons, around 73.8 MB of WKT) is excellent: 11 seconds wall-clock for a complete union op on the tree.

Here's what I ended up with:

fn bottom_up_fold_reduce<T, S, I, F, R>(
    tree: &RTree<T>,
    mut init: I,
    mut fold: F,
    mut reduce: R,
) -> S
where
    T: RTreeObject,
    I: FnMut() -> S,
    F: FnMut(S, &T) -> S,
    R: FnMut(S, S) -> S,
{
    fn inner<T, S, I, F, R>(parent: &ParentNode<T>, init: &mut I, fold: &mut F, reduce: &mut R) -> S
    where
        T: RTreeObject,
        I: FnMut() -> S,
        F: FnMut(S, &T) -> S,
        R: FnMut(S, S) -> S,
    {
        parent
            .children()
            .iter()
            .fold(init(), |accum, child| match child {
                RTreeNode::Leaf(value) => fold(accum, value),
                RTreeNode::Parent(parent) => {
                    let value = inner(parent, init, fold, reduce);
                    reduce(accum, value)
                }
            })
    }

    inner(tree.root(), &mut init, &mut fold, &mut reduce)
}

with init, fold, and reduce specified as:

let init = || MultiPolygon::<T>::new(vec![]);
let fold = |mut accum: MultiPolygon<T>, poly: &Polygon<T>| -> MultiPolygon<T> {
    accum = accum.union(poly);
    accum
};
let reduce = |accum1: MultiPolygon<T>, accum2: MultiPolygon<T>| -> MultiPolygon<T> {
    accum1.union(&accum2)
};
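
Putting the pieces together is then a single call (a sketch; it assumes tree is an RTree<Polygon<T>> bulk-loaded from the input polygons and that geo's BooleanOps is in scope for union):

let unioned: MultiPolygon<T> = bottom_up_fold_reduce(&tree, init, fold, reduce);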

I'm very happy to keep the implementation in consumer libraries, but it would be good to capture the mechanism for this kind of full traversal and intermediate processing in the crate's docs somewhere – it's complex, and deep knowledge of rstar's internal API is rare enough.
