HDDS-11766. DataNode to cache blockDataTable and lastChunkInfoTable #7469

Draft: wants to merge 1 commit into base: master

@jojochuang jojochuang commented Nov 21, 2024

What changes were proposed in this pull request?

HDDS-11766. DataNode to cache blockDataTable and lastChunkInfoTable

Please describe your PR in detail:

  • Updating the checksum of an open file on the DataNode is becoming the bottleneck for HBase/hsync workloads.
  • The main cost is reading from RocksDB, deserializing the value into the corresponding protobuf, and then converting it into the helper Java class.
  • To address this overhead, add a Guava cache in front of the RocksDB operations so that the deserialization is avoided.
  • The serialization cost when RocksDB is updated still exists and needs to be addressed separately.
  • That said, the YCSB data-loading throughput is on par with a run where checksum is disabled, which means the bottleneck is probably on the client side again.
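
For readers unfamiliar with the pattern, the idea in the bullets above can be sketched roughly as follows. This is not the code in this patch: the patch uses a Guava cache, while this sketch substitutes a JDK-only `LinkedHashMap` LRU so that it runs without dependencies. `CachedBlockTable`, `BlockInfo`, the in-memory `store` map (standing in for RocksDB), and the two counters are all hypothetical names made up for illustration.

```java
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch: a cache of deserialized objects in front of a byte-oriented store.
class CachedBlockTable {
  // Hypothetical, simplified value type; not Ozone's actual block metadata class.
  static final class BlockInfo {
    final long length;
    BlockInfo(long length) { this.length = length; }
  }

  // Plays the role of the RocksDB table: keys map to serialized bytes.
  private final Map<String, byte[]> store = new HashMap<>();
  // Holds already-deserialized objects so that reads can skip deserialization.
  private final Map<String, BlockInfo> cache;
  // Counters used only to make the saved work visible.
  int deserializations = 0;
  int serializations = 0;

  CachedBlockTable(final int maxEntries) {
    // Access-ordered LinkedHashMap that evicts the least recently used entry.
    // Assumption: a stand-in for the size-bounded Guava cache the PR describes.
    this.cache = new LinkedHashMap<String, BlockInfo>(16, 0.75f, true) {
      @Override
      protected boolean removeEldestEntry(Map.Entry<String, BlockInfo> eldest) {
        return size() > maxEntries;
      }
    };
  }

  // Write-through: the store is always updated, so the serialization cost remains.
  synchronized void put(String key, BlockInfo value) {
    store.put(key, serialize(value));
    cache.put(key, value);
  }

  // Read-through: only a cache miss pays for the store read and deserialization.
  synchronized BlockInfo get(String key) {
    BlockInfo cached = cache.get(key);
    if (cached != null) {
      return cached;
    }
    byte[] raw = store.get(key);
    if (raw == null) {
      return null;
    }
    BlockInfo value = deserialize(raw);
    cache.put(key, value);
    return value;
  }

  private byte[] serialize(BlockInfo value) {
    serializations++;
    return Long.toString(value.length).getBytes(StandardCharsets.UTF_8);
  }

  private BlockInfo deserialize(byte[] raw) {
    deserializations++;
    return new BlockInfo(Long.parseLong(new String(raw, StandardCharsets.UTF_8)));
  }
}
```

In this sketch, a `get` that follows a `put` of the same key is served from the cache, so `deserializations` stays unchanged, while every `put` still increments `serializations`. That matches the remaining cost noted in the fourth bullet.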

What is the link to the Apache JIRA

https://issues.apache.org/jira/browse/HDDS-11766

How was this patch tested?

Ran on a small cluster with HBase YCSB workloads.

Change-Id: I21c1a6ff0c3875fb893024ed443ba3e4b0b8542b
(cherry picked from commit 216f9ba64c2df1355294348c24c7ab0ab1cbbc6d)