Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

perf: skip output level sstables with no data overlap with input level #389

Open
petermattis opened this issue Nov 10, 2019 · 5 comments
Open
Labels
A-write-amp potential to reduce write amplification

Comments

@petermattis
Copy link
Collaborator

petermattis commented Nov 10, 2019

See facebook/rocksdb#6021.

The idea is that if we're doing a compaction from Ln->Ln+1, we can skip sstables in Ln+1 which do contain any keys in Ln. Right now this determination is done solely based on the sstable boundaries, but it is only a little more work to create a levelIter on the input sstables in Ln and seek to the boundary keys in Ln+1 to see if any actual key is contained within the sstables in Ln+1.

Here's an example of where this could be beneficial:

L0: a-----g------------t-----z
L1:   c----h j------q s-----y        

L0 has one sstable with the bounds [a,z] containing the keys a, g, t, and z. A compaction from L0 to L1 would normally involve every sstable in L1 because the bounds [a,z] covers every sstable in L1. But notice that the L1 sstable [j,q] doesn't overlap with any keys in L0.

After selecting the initial set of input sstables in L0 and L1, we could open a levelIter on the sstables in L0 and seek that iterator to the boundary keys of the tables in L1, trimming the L1 inputs of any sstables that do not contain an overlapping key in L0. The compaction logic would have to be changed to ensure that the tables output to L1 are cut so as not to overlap with the excluded sstables in L1.

Cc @sumeerbhola as this overlaps with other thinking we've been doing in this area. I'm not sure how often this situation occurs in practice. We'd want to do some experimentation to determine if it is common or rare.

Jira issue: PEBBLE-180

Epic CRDB-40361

@sumeerbhola
Copy link
Collaborator

Is the first step to add some stats where we do this seeking and record how many sstables and bytes could have been eliminated? Is it reasonable to add these stats for every compaction, or do you expect this to be slow enough that one should sample?

@petermattis
Copy link
Collaborator Author

Is the first step to add some stats where we do this seeking and record how many sstables and bytes could have been eliminated? Is it reasonable to add these stats for every compaction, or do you expect this to be slow enough that one should sample?

Yes, I think that is the first step. I think the stats could be added for every compaction. The seeks should be fast enough that they will be lost in the noise of the compaction itself.

@petermattis
Copy link
Collaborator Author

There will probably be dragons in the handling of range tombstones that overlap the skipped sstable.

@github-actions
Copy link

github-actions bot commented Mar 2, 2022

We have marked this issue as stale because it has been inactive for
18 months. If this issue is still relevant, removing the stale label
or adding a comment will keep it active. Otherwise, we'll close it
in 10 days to keep the issue queue tidy. Thank you for your
contribution to Pebble!

Copy link

We have marked this issue as stale because it has been inactive for
18 months. If this issue is still relevant, removing the stale label
or adding a comment will keep it active. Otherwise, we'll close it
in 10 days to keep the issue queue tidy. Thank you for your
contribution to Pebble!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-write-amp potential to reduce write amplification
Projects
Status: Compaction heuristics
Development

No branches or pull requests

3 participants