Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ledger: increase locks granularity in lookupWithoutRewards #5637

Draft
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

algorandskiy
Copy link
Contributor

@algorandskiy algorandskiy commented Aug 4, 2023

Summary

This PR is a rework of #5527 (later revered in #5620 because of test failure).

  1. Take a state snapshot for a single iteration (the first part of the for {} loop in lookupWithoutRewards) - dbRound, deltas, offset, rewards, inDeltas check.
  2. Make the nested loop over deltas lock free.
  3. If not found in deltas check baseAccounts and ensure the ledger did not advance while we were checking deltas.

Why does it work:

say au.deltas has 10 elements that is backed by some array underneath. e make local deltas = au.deltas[0:len(au.deltas)], pointing to the same backing array. Then there are two options:

  1. when receiving a new block => add element
    • if backing array has space, it is not reallocated, just appended, our deltas keeps pointing on it
    • if not enough space, then realloc for au.deltas but our deltas keeps pointing to the old backing array
  2. when committing => remove elements
    • removal op is slicing, so au.deltas keeps pointing to the same backing array but to the right subset, our deltas is fine as well

The change would benefit from @icorderi work on making lru caches concurrency-safe and possible switch to a bound circular buffer of deltas.

Why is it safe

lookupWithoutRewards is used by block evaluator in txpool, ensureBlock, catchup to add a new block into the ledger, i.e. always has rnd=latest. Having the MaxAcctLookback > 1 (deltas size, actual value is 4) guarantees the requested rnd is greater than the flushed rounds, and for rnd=latest the invariant "value is in deltas, or is in baseAccounts if flushed recently, or in DB" still holds.

Test Plan

Existing tests should pass

@codecov
Copy link

codecov bot commented Aug 4, 2023

Codecov Report

Attention: Patch coverage is 81.81818% with 2 lines in your changes missing coverage. Please review.

Project coverage is 54.94%. Comparing base (5acab71) to head (80b24d5).
Report is 427 commits behind head on master.

Files with missing lines Patch % Lines
ledger/acctupdates.go 81.81% 1 Missing and 1 partial ⚠️
Additional details and impacted files
@@           Coverage Diff           @@
##           master    #5637   +/-   ##
=======================================
  Coverage   54.94%   54.94%           
=======================================
  Files         463      463           
  Lines       64476    64485    +9     
=======================================
+ Hits        35427    35433    +6     
+ Misses      26668    26667    -1     
- Partials     2381     2385    +4     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.


🚨 Try these New Features:

// there is a chance the db advanced between deltas check and the baseAccounts check
// ensure the round requested is still in the db
if macct.Round > rnd {
return ledgercore.AccountData{}, 0, "", 0, &RoundOffsetError{rnd, macct.Round}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do you think it is worth taking an optimistic (instead of preventing an error here with locks), and returning an error when the database advances?
There are operation that need to get canceled because an error is returned from here.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is only actual for rest api since txpool and block validator need the latest state of account to validate.

@zeldovich
Copy link
Contributor

Looks reasonable to me. The only suggestion is, consider moving the two au.baseAccounts reads (read() and readNotFound()) to before the lock release. They seem cheap enough (just a map lookup in each one), and it seems safe to do the au.baseAccounts.writePending(macct) later without holding the lock, if we decide to go down that control flow path. Avoids an extraneous lock release and re-acquire in the case that we fall through to au.accountsq.LookupAccount etc.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants