Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

simplifications and out-of-memory fix in RRuleIterSet #120

Closed
wants to merge 3 commits into from
Closed

simplifications and out-of-memory fix in RRuleIterSet #120

wants to merge 3 commits into from

Conversation

oll3
Copy link
Contributor

@oll3 oll3 commented Jun 23, 2024

...and another one 🔢

Did some simplifications to the RRuleSetIter, mainly trying to get rid of the HashMap queue thing.

Following to this change it seemed possible to stop appending exrules dates to the exdates btree indefinitely (which eventually may cause out of memory on the system). And instead keeping track of the last value of each iterator. A change made possible since the date given to the is_date_excluded should now always be greater (or the same) as the last call to the function. Hence there is no need to track previously yielded exrule dates.

These changes seems to improve the overall iteration performance of the RRuleSetIter. I have created some benchmarks (not in PR) and the iteration time is cut down by 4-70 % depending on the specific rruleset tested.

@oll3
Copy link
Contributor Author

oll3 commented Jul 10, 2024

To clarify this PR a bit more...

I wanted to remove this hashmap since it seemed to have an impact on the performance of the rruleset iterator. Hashing values may be expensive compute wise and is done in a quite tight loop. And it seems to be done at least once every loop depending on how complex the rruleset is.

When I understood the inner workings of the set iterator better it seemed this could be done an simplified way by using peekable iterators for the inner RRuleIters. This change decreased the code of the set iterator a bit (see commit f9aed0b (1 file changed, 59 insertions(+), 149 deletions(-)) while still outputting the same result. At least according to the current tests suite and how I understood the previous code.

This also made the inner iterators to always output an increasing date. Which, in turn, this made it possible to remove the btree storage of old exrules dates and fix this "leak".

To give an idea of the performance improvement, here is the output from running my benchmarks locally (sorry for the bad resolution). First run on the main branch (commit 8402921) then again on this branch (commit 4e0de2d4):
image

The benchmark code is available here if you want to try or have any feedback.

I have also noticed that there are more room for improvements performance wise and already have a bunch of changes queued up. Personally I would really like to see these merged to this project. I have no problem adjusting or providing more tests if wanted.
I understood that you (@fmeringdal) might value mature code higher, and that's of course ok. Still I'm wondering if this work is of any interest to this project or am I better of forking (is that ok?)?

oll3 added 3 commits July 13, 2024 12:58
The RRuleSetIter will peek all the rrule iterators to find the earliest
date available. And peek the rdates list. And then consume the date which
is the earliest by either stepping the corresponding rrule iterator or
popping from the rdate list.

This makes it possible to skip the rrule "queue" and manual buffering.
When exrules are used the exdates btree will eventually cause out of
memory since it's always appended to but never drained.

Fix is to stop adding to the btree, make the the exrule iterators
peekable and only step an iterator when the compared date is greater
than the peeked date.

The fix assumes that the date given to 'is_date_excluded' is never
earlier than the date of the last call. It's still allowed to be called
with the same date as last call though.

Also the exdates has been made a sorted Vec of dates which are popped
whenever the compared date is greater than the last (earliest) date.
@oll3 oll3 closed this Aug 22, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

1 participant