
read: add Dwarf::populate_abbreviations_cache #679

Merged: philipc merged 2 commits into gimli-rs:master from abbrev_cache on Sep 14, 2023

Conversation

philipc (Collaborator) commented Sep 12, 2023

This gives additional options for caching duplicate entries or all entries.

@al13n321 Can you test if this has similar performance to the caching you implemented?
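For reference, a minimal sketch of how the cache population added here might be called. The method name matches the final PR title (populate_abbreviations_cache; at this point in the thread it was still called set_abbreviations_cache_strategy), and the exact signature is an assumption:

```rust
use gimli::{AbbreviationsCacheStrategy, Dwarf, Reader};

fn load_units<R: Reader>(dwarf: &mut Dwarf<R>) -> gimli::Result<()> {
    // Populate the abbreviations cache up front. `Duplicates` caches only
    // abbreviation tables shared by several units; `All` caches every table.
    dwarf.populate_abbreviations_cache(AbbreviationsCacheStrategy::Duplicates);

    // Iterate units as usual; `Dwarf::unit` reuses the cached abbreviations.
    let mut iter = dwarf.units();
    while let Some(header) = iter.next()? {
        let _unit = dwarf.unit(header)?;
    }
    Ok(())
}
```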

al13n321 (Contributor) commented Sep 12, 2023

Yep, about the same performance. The set_abbreviations_cache_strategy() call adds ~20 ms. That seems fine next to the 370 ms loop parsing all the units (in one thread). (And the overall time likely increased by less than 20 ms; I didn't measure carefully enough to tell.)

I'd document that the set_abbreviations_cache_strategy() call may iterate over all units, i.e. it may do a bunch of random reads from .debug_info (which is likely mmapped, so reads may be slow and may in principle have side effects).

Thanks for working on this, and for making this library!


Some theoretical thoughts, optional reading:

The LazyArc's behavior under contention is not ideal: if multiple threads try to get the same abbreviations at the same time, they'll all do the parsing, and all but one will discard the result. It would be better (?) if one thread did the parsing while the other threads wait. A busy-wait seems good enough and is easy to implement by making the LazyArc's atomic have three states: empty, busy, ready.

Busy-wait seems better than duplicate work because: (a) it will complete sooner, since it only waits for the remaining fraction of the work, (b) it can do something like thread::yield_now(), and (c) it doesn't read from the Reader. The downside is that it's not lock-free, i.e. we'd be in trouble if the thread dies or stalls in the middle of parsing abbreviations. Threads dying or stalling is not a real concern in practice (or is it?), except for panics, which can be handled by just resetting the LazyArc state to 'empty', so busy-wait seems better overall.
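To illustrate the three-state idea, here is a minimal, hypothetical sketch of a spin-waiting lazy cell in the spirit of LazyArc (the name SpinLazyArc and its API are made up, and it is not gimli's actual implementation; as noted above, a panic during parsing would leave it stuck in the busy state unless a guard resets it):

```rust
use std::cell::UnsafeCell;
use std::sync::atomic::{AtomicU8, Ordering};
use std::sync::Arc;
use std::thread;

const EMPTY: u8 = 0;
const BUSY: u8 = 1;
const READY: u8 = 2;

/// Three-state lazy cell: the thread that wins the EMPTY -> BUSY race
/// parses; the others busy-wait with `yield_now` instead of duplicating work.
pub struct SpinLazyArc<T> {
    state: AtomicU8,
    value: UnsafeCell<Option<Arc<T>>>,
}

// Safety: `value` is written only by the single thread that won the
// EMPTY -> BUSY transition, and read only after observing READY with
// acquire ordering, which synchronizes with the release store below.
unsafe impl<T: Send + Sync> Sync for SpinLazyArc<T> {}

impl<T> SpinLazyArc<T> {
    pub const fn new() -> Self {
        SpinLazyArc {
            state: AtomicU8::new(EMPTY),
            value: UnsafeCell::new(None),
        }
    }

    pub fn get<E>(&self, parse: impl Fn() -> Result<Arc<T>, E>) -> Result<Arc<T>, E> {
        loop {
            match self
                .state
                .compare_exchange(EMPTY, BUSY, Ordering::Acquire, Ordering::Acquire)
            {
                // We won the race: parse, publish the value, then mark READY.
                Ok(_) => match parse() {
                    Ok(arc) => {
                        unsafe { *self.value.get() = Some(arc.clone()) };
                        self.state.store(READY, Ordering::Release);
                        return Ok(arc);
                    }
                    Err(e) => {
                        // Reset to EMPTY so another thread can retry after an error.
                        self.state.store(EMPTY, Ordering::Release);
                        return Err(e);
                    }
                },
                // Already parsed: clone the cached Arc.
                Err(READY) => return Ok(unsafe { (*self.value.get()).clone().unwrap() }),
                // Another thread is parsing: wait for the remaining work.
                Err(_) => thread::yield_now(),
            }
        }
    }
}
```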

Since the set_abbreviations_cache_strategy() call is already doing some work inline, why not also have a mode where it goes ahead and loads all abbreviations right there, AbbreviationsCacheStrategy::Eager or whatever? It could be faster because (a) it would go in order of increasing offset (more-sequential file reads), and (b) it would be done in one thread, so there would be less LazyArc contention.
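A rough sketch of what such an eager mode could do, using gimli's public DebugAbbrev::abbreviations to parse each distinct table once, in increasing offset order, on a single thread (the helper name and the choice of a BTreeMap keyed by raw offset are illustrative assumptions):

```rust
use std::collections::{BTreeMap, BTreeSet};
use std::sync::Arc;

use gimli::{Abbreviations, DebugAbbrevOffset, Dwarf, Reader};

// Hypothetical eager population: collect the distinct .debug_abbrev offsets,
// then parse each table exactly once in increasing offset order, on one thread.
fn eager_abbrev_cache<R: Reader>(
    dwarf: &Dwarf<R>,
) -> gimli::Result<BTreeMap<R::Offset, Arc<Abbreviations>>> {
    // Pass 1: find every distinct abbreviations offset referenced by a unit.
    let mut offsets = BTreeSet::new();
    let mut iter = dwarf.units();
    while let Some(header) = iter.next()? {
        offsets.insert(header.debug_abbrev_offset().0);
    }

    // Pass 2: parse in increasing offset order (more-sequential file reads).
    let mut cache = BTreeMap::new();
    for offset in offsets {
        let abbrevs = dwarf.debug_abbrev.abbreviations(DebugAbbrevOffset(offset))?;
        cache.insert(offset, Arc::new(abbrevs));
    }
    Ok(cache)
}
```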

philipc (Collaborator, Author) commented Sep 12, 2023

Thanks for the feedback.

> The LazyArc's behavior under contention is not ideal: if multiple threads try to get the same abbreviations at the same time, they'll all do the parsing, and all but one will discard the result.

Note that this only happens for no_std. With std we use a Mutex.

> Since the set_abbreviations_cache_strategy() call is already doing some work inline, why not also have a mode where it goes ahead and loads all abbreviations right there, AbbreviationsCacheStrategy::Eager or whatever? It could be faster because (a) it would go in order of increasing offset (more-sequential file reads), and (b) it would be done in one thread, so there would be less LazyArc contention.

I'm starting to question the value of using LazyArc at all. Maybe we should always parse them all up front. It would greatly reduce the complexity of the cache.

The original reason for LazyArc was to improve performance for users without them needing to change anything, but if they need to call something anyway for cases such as yours, then that reason no longer applies.

al13n321 (Contributor) commented

> Note that this only happens for no_std. With std we use a Mutex.

Ah, I didn't notice that.

> I'm starting to question the value of using LazyArc at all. Maybe we should always parse them all up front. It would greatly reduce the complexity of the cache.

I agree. Maybe the behavior should be:

  • By default, there's no caching. Useful if the user just wants a few units and doesn't want to pay 500 ms.
  • The user may call dwarf.preload_abbreviations() (or whatever), which will populate the abbreviations map right there. There can still be Duplicates and All modes: Duplicates is useful when iterating over units only once (especially from multiple threads), All is useful when iterating multiple times.
  • Or the user may do their own management of abbreviations and use Unit::new_with_abbreviations(). (E.g. group the units by their abbreviations offsets and assign groups to threads, then keep a separate abbreviations cache in each thread; a rough sketch of this grouping follows the list. It's unlikely that anyone would need this particular scheme in practice, but it seems good to have this sort of freedom in a library.)
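A rough sketch of the grouping mentioned in the last point, treating Unit::new_with_abbreviations as the hypothetical API proposed above (only the grouping step is shown; each group could then be handed to a worker thread that parses its abbreviations table once):

```rust
use std::collections::BTreeMap;

use gimli::{Dwarf, Reader, UnitHeader};

// Group unit headers by the raw offset of their abbreviations table, so each
// group shares a single parsed `Abbreviations` (for example, one group per
// thread, each constructing its units via something like the proposed
// `Unit::new_with_abbreviations`).
fn group_by_abbrev_offset<R: Reader>(
    dwarf: &Dwarf<R>,
) -> gimli::Result<BTreeMap<R::Offset, Vec<UnitHeader<R>>>> {
    let mut groups: BTreeMap<R::Offset, Vec<UnitHeader<R>>> = BTreeMap::new();
    let mut iter = dwarf.units();
    while let Some(header) = iter.next()? {
        let offset = header.debug_abbrev_offset().0;
        groups.entry(offset).or_default().push(header);
    }
    Ok(groups)
}
```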

Fwiw, for me the abbreviations cache stood out as surprising behavior: in a mostly straightforward low-level library that just parses bytes when requested, there's this one place that does nontrivial things behind the scenes: caching (imposing memory usage), reference counting, thread synchronization, none of which is mentioned in the documentation for dwarf.unit(). So making it explicit and less magic seems really good.

philipc (Collaborator, Author) commented Sep 13, 2023

Sounds good. I've added a commit to do that.

al13n321 (Contributor) commented

The code looks good to me, and performance on my test is still good. Surprisingly, All seems to be ~5-10% faster than Duplicates (in total populate + iterate time). Maybe it's from the sequential access order, I don't know.

philipc changed the title from "read: add Dwarf::set_abbreviations_cache_strategy" to "read: add Dwarf::populate_abbreviations_cache" on Sep 14, 2023
philipc merged commit 23ebfc8 into gimli-rs:master on Sep 14, 2023
20 checks passed
philipc deleted the abbrev_cache branch on Sep 14, 2023
philipc (Collaborator, Author) commented Sep 14, 2023

@Swatinem FYI, this removes the automatic caching that was added in #628, so you'll probably want to add a call to populate_abbreviations_cache once this is released.
