
HDDS-10338. Implement a Client Datanode API to stream a block #6613

Open · wants to merge 77 commits into master
Conversation

@chungen0126 (Contributor) commented Apr 30, 2024

What changes were proposed in this pull request?

To reduce round trips between the Client and the Datanode when reading a block, we need a new read API.

Client -> block(offset, length) -> Datanode
Client <- chunkN <- Datanode
Client <- chunkN+1 <- Datanode
..
Client <- chunkLast <- Datanode

This uses gRPC's ability to send bidirectional traffic so that the server can pipeline the chunks to the client without waiting for individual ReadChunk API calls. It also spares the client from creating multiple chunk stream clients and should somewhat simplify the read path on the client side.
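A minimal sketch of how the server side might carve the requested (offset, length) range into chunk-sized pieces before streaming them back; `ChunkPlanner`, `planChunks`, and the chunk-boundary rule here are illustrative assumptions, not Ozone's actual classes:

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: split a requested (offset, length) range into the
// sequence of chunk-aligned pieces a Datanode could stream back one by one.
public class ChunkPlanner {

  // Returns {chunkOffset, chunkLength} pairs covering [offset, offset + length).
  static List<long[]> planChunks(long offset, long length, long chunkSize) {
    List<long[]> chunks = new ArrayList<>();
    long end = offset + length;
    long pos = offset;
    while (pos < end) {
      // Each piece ends at the next chunk-size boundary, or at the range end.
      long chunkEnd = Math.min(((pos / chunkSize) + 1) * chunkSize, end);
      chunks.add(new long[] {pos, chunkEnd - pos});
      pos = chunkEnd;
    }
    return chunks;
  }

  public static void main(String[] args) {
    // A 10 MB read starting at 3 MB with 4 MB chunks yields four pieces.
    for (long[] c : planChunks(3L << 20, 10L << 20, 4L << 20)) {
      System.out.println(c[0] + "," + c[1]);
    }
  }
}
```

In the streaming scheme above, the server would emit one response per planned piece, with no client round trip in between.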

Please describe your PR in detail:

  • Add new logic on both the client and server side to read a block as a stream of chunks.
  • Add a new StreamBlockInput on the client side, called from KeyInputStream, to read a block from the container.
  • Add unit tests and integration tests for `StreamBlockInput`.
  • Add a new Datanode version for compatibility: when a new client reads blocks from an old server, it falls back to reading blocks via BlockInputStream.
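The compatibility fallback in the last bullet can be sketched as a simple version gate; `STREAM_BLOCK_VERSION` and the class name are hypothetical stand-ins, not the actual Ozone version constants:

```java
// Hypothetical sketch: pick the streaming read path only when the Datanode
// advertises a protocol version that supports the new ReadBlock API.
public class ReadPathSelector {

  // Assumed version at which streaming reads become available.
  static final int STREAM_BLOCK_VERSION = 2;

  static String selectReader(int datanodeVersion) {
    if (datanodeVersion >= STREAM_BLOCK_VERSION) {
      return "StreamBlockInput";   // new streaming read path
    }
    return "BlockInputStream";     // fallback for old servers
  }

  public static void main(String[] args) {
    System.out.println(selectReader(1)); // old server
    System.out.println(selectReader(2)); // new server
  }
}
```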

What is the link to the Apache JIRA?

https://issues.apache.org/jira/browse/HDDS-10338

How was this patch tested?

There are existing tests for reading data.

@devabhishekpal (Contributor) left a comment

Thanks for taking on this effort @chungen0126.
I just had a few questions and some nits.

@fenixjin commented:

Tests conducted on our cluster (3 DNs / HDD / 10-gigabit network) show this improvement can boost read speed by at least 30%.

In a single-threaded read, streaming cut read time from 7.3–7.4 s to 4.8–5.2 s.
In the freon ozone-client-one-key-reader test, streaming increased read bandwidth from 427 MB/s to 586 MB/s.

BlockID blockID = BlockID.getFromProtobuf(readBlock.getBlockID());
// This is a new API; the block should always be checked.
BlockUtils.verifyReplicaIdx(kvContainer, blockID);
A Contributor commented:

What would happen if the replicaIndex of the block changes because of the ContainerBalancer running in the background? We would need to take some kind of lock here to ensure the block data does not change after this point.

@swamirishi (Contributor) commented Oct 16, 2024

Or we would need to validate that the replicaIndex and bcsID are the same on every readChunk call on the file (basically, move this check inside the loop). Take a look at HDDS-10983 for context.
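A minimal sketch of what "move this check inside the loop" could look like on the server side; the validator interface and names are hypothetical, not the actual Ozone code:

```java
import java.util.List;
import java.util.function.BiPredicate;

// Hypothetical sketch of the suggested change: re-validate replicaIndex and
// bcsID before sending each chunk, instead of only once before the loop.
public class PerChunkValidation {

  // Returns how many chunks were streamed before validation failed (or all).
  static int streamChunks(List<String> chunks,
                          BiPredicate<Long, Long> stillValid,
                          long replicaIdx, long bcsId) {
    int sent = 0;
    for (String chunk : chunks) {
      if (!stillValid.test(replicaIdx, bcsId)) {
        break; // abort the stream early instead of sending stale data
      }
      sent++;  // stand-in for responseObserver.onNext(chunkResponse)
    }
    return sent;
  }
}
```

The point of the check inside the loop is the early `break`: the server stops streaming as soon as the replica no longer matches, instead of letting the client discover the mismatch later.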

@chungen0126 (Contributor, Author) commented:

> Or we would need to validate that the replicaIndex and bcsID are the same on every readChunk call on the file (basically, move this check inside the loop). Take a look at HDDS-10983 for context.

@swamirishi thanks for your review.
I'm still confused about the problem. We validate the replicaIndex and bcsID at the start of readBlock, and all the readChunks belong to the same block. Why do we need to validate again for every readChunk? If the replicaIndex of the block changes during readBlock, a mismatch can still happen after the validation and before the readChunk.

@swamirishi (Contributor) commented Oct 22, 2024

It could happen that the container replica index changes between two read chunks. You are right that the replica can still fail, but we can save unnecessary round trips between the client and server. It is about narrowing down the possibility, a minor optimization. We also need to ensure that the checksums of the chunks match on the client side. I am still looking through the client-side code; I just wanted to understand whether we do a checksum verification for each and every chunk read on the client side.
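Per-chunk checksum verification on the client side can be sketched like this; CRC32 is just a stand-in for whichever checksum type Ozone was configured with for the key, and the class name is hypothetical:

```java
import java.util.zip.CRC32;

// Illustrative sketch: the client verifies each streamed chunk against the
// checksum carried in the response before accepting its bytes. CRC32 stands
// in for the actual configured checksum type.
public class ChunkChecksum {

  static long checksumOf(byte[] chunk) {
    CRC32 crc = new CRC32();
    crc.update(chunk, 0, chunk.length);
    return crc.getValue();
  }

  // Returns true only when the chunk's computed checksum matches the
  // expected value sent by the server; a mismatch means the chunk must
  // be rejected (e.g. re-read from another replica).
  static boolean verify(byte[] chunk, long expected) {
    return checksumOf(chunk) == expected;
  }
}
```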

@chungen0126 (Contributor, Author) commented:

Done; added replica index validation.

@swamirishi (Contributor) left a comment

@chungen0126 Thanks for working on the patch. I am still reviewing the PR. Posting my first level review comments.


7 participants