-
Notifications
You must be signed in to change notification settings - Fork 212
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[Merged by Bors] - sync2: implement database-backed sync based on FPTree #6406
Conversation
Given that after recent item sync is done (if it's needed at all), the range set reconciliation algorithm no longer depends on newly received item being added to the set, we can save memory by not adding the received items during reconciliation. During real sync, the received items will be sent to the respective handlers and after the corresponding data are fetched and validated, they will be added to the database, without the need to add them to cloned OrderedSets which are used to sync against particular peers.
Codecov ReportAttention: Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## develop #6406 +/- ##
=========================================
+ Coverage 79.8% 79.9% +0.1%
=========================================
Files 341 352 +11
Lines 44204 46096 +1892
=========================================
+ Hits 35285 36862 +1577
- Misses 6918 7151 +233
- Partials 2001 2083 +82 ☔ View full report in Codecov by Sentry. 🚨 Try these New Features:
|
This adds multi-peer synchronization support. When the local set differs too much from the remote sets, "torrent-style" "split sync" is attempted which splits the set into subranges and syncs each sub-range against a separate peer. Otherwise, the full sync is done, syncing the whole set against each of the synchronization peers. Full sync is also done after each split sync run. The local set can be considered synchronized after the specified number of full syncs has happened. The approach is loosely based on [SREP: Out-Of-Band Sync of Transaction Pools for Large-Scale Blockchains](https://people.bu.edu/staro/2023-ICBC-Novak.pdf) paper by Novak Boškov, Sevval Simsek, Ari Trachtenberg, and David Starobinski.
The `sqlstore` package provides simple sequence-based interface to the tables being synchronized. It is used by the FPTree data structure as the database layer, and doesn't do range fingerprinting by itself. `SyncedTable` and `SyncedTableSnapshot` provide methods that wrap the necessary SQL operations. `sql/expr` package was added to facilitate SQL generation.
f886141
to
bb43161
Compare
`fptree.FPTree` provides a sufficiently efficient data structure for performing range fingerprinting on data residing in database tables, speeding up the queries at expense of some memory use. `dbset.DBSet` builds on `fptree.FPTree` and provides a database-backed implementation of the `multipeer.OrderedSet` interface.
It didn't add much to efficiency
Re-generating SQL statements every time is quick but increases GC pressure.
It was causing issues with the race detector on Windows
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have some minor comments
return rowsA, rowsB, combined | ||
} | ||
|
||
func TestP2P(t *testing.T) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i guess that this test is an e2e test that checks that the whole syncing flow works with a db backed dataset. it is however huge and very complex. having at least a modest comment about what this test is supposed to do would be nice.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added a comment
// EnableTrace enables or disables tracing for the tree. | ||
func (ft *FPTree) EnableTrace(enable bool) { | ||
ft.traceEnabled = enable | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i think that reviewing this file while understanding everything which is happening here will probably take me a week. i will therefore "trust you on this"
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not insisting on reading this right now, but for future reference, the docs on how FPTree works are in the dev-docs
folder.
https://github.com/spacemeshos/go-spacemesh/blob/44d888f22e96d0f1b0a59480af19548b23ba38de/dev-docs/sync2-set-reconciliation.md#fptree-data-structure
sync2/fptree/refcountpool_test.go
Outdated
require.Equal(t, foo{x: 12}, pool.item(idx5)) | ||
|
||
// // don't replace an item with multiple refs | ||
// pool.ref(idx5) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is this still needed?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Removed, thanks
bors merge |
------- ## Motivation The new set reconciliation mechanism requires an efficient data structure for handling range fingerprint queries.
Build failed:
|
Unrelated
|
bors merge |
------- ## Motivation The new set reconciliation mechanism requires an efficient data structure for handling range fingerprint queries.
That's not the test that failed, this is the workflow that failed: https://github.com/spacemeshos/go-spacemesh/actions/runs/11748167959 And the reason it failed is because a node generated a malfeasance proof that couldn't be validated during publication: https://grafana.spacemesh.dev/goto/sKk1avMNg?orgId=1 It appears that the node correctly detected the post as invalid, created a malfeasance proof for it but then failed to validate that malfeasance proof when publishing it (so it never did). I have never seen this before, the post that we generate in that test is guaranteed to be invalid so a malfeasance proof generated for it should be valid... cc @poszu |
SyncV2 is not used in systests yet (and not integrated with ATXs yet in this PR). |
Pull request successfully merged into develop. Build succeeded: |
Motivation
The new set reconciliation mechanism requires an efficient data structure for handling range fingerprint queries.
Description
fptree.FPTree
provides a sufficiently efficient data structure forperforming range fingerprinting on data residing in database tables,
speeding up the queries at expense of some memory use.
dbset.DBSet
builds onfptree.FPTree
and provides a database-backedimplementation of the
multipeer.OrderedSet
interface.#6405 needs to be merged before this one.