Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: fp32 vector to fp16/bf16 vector conversion for RESTful API #37556

Merged

Conversation

jiangyinzuo
Copy link
Contributor

@jiangyinzuo jiangyinzuo commented Nov 8, 2024

RESTful API. The influenced API are as follows:

  • Handler. insert
  • HandlerV1. insert/upsert
  • HandlerV2. insert/upsert/search

We do not modify search API in Handler/HandlerV1 because they do not support fp16/bf16 vectors.

module github.com/milvus-io/milvus/pkg:

Add Float32ArrayToBFloat16Bytes(), Float32ArrayToFloat16Bytes() and
Float32ArrayToBytes(). These method will be used in GoSDK in the
future.

issue: #37448

@sre-ci-robot sre-ci-robot added area/test sig/testing size/L Denotes a PR that changes 100-499 lines. labels Nov 8, 2024
@mergify mergify bot added dco-passed DCO check passed. kind/feature Issues related to feature request from users labels Nov 8, 2024
Copy link

codecov bot commented Nov 8, 2024

Codecov Report

Attention: Patch coverage is 88.40580% with 16 lines in your changes missing coverage. Please review.

Project coverage is 81.05%. Comparing base (06d73cf) to head (febe188).
Report is 15 commits behind head on master.

Files with missing lines Patch % Lines
internal/distributed/proxy/httpserver/utils.go 81.42% 11 Missing and 2 partials ⚠️
...ernal/distributed/proxy/httpserver/wrap_request.go 93.75% 2 Missing and 1 partial ⚠️
Additional details and impacted files

Impacted file tree graph

@@           Coverage Diff           @@
##           master   #37556   +/-   ##
=======================================
  Coverage   81.04%   81.05%           
=======================================
  Files        1358     1358           
  Lines      190438   190526   +88     
=======================================
+ Hits       154334   154423   +89     
+ Misses      30628    30626    -2     
- Partials     5476     5477    +1     
Components Coverage Δ
Client 72.16% <ø> (ø)
Core 68.87% <ø> (ø)
Go 83.22% <88.40%> (+0.01%) ⬆️
Files with missing lines Coverage Δ
pkg/util/typeutil/convension.go 100.00% <100.00%> (ø)
...ernal/distributed/proxy/httpserver/wrap_request.go 95.22% <93.75%> (-0.42%) ⬇️
internal/distributed/proxy/httpserver/utils.go 84.26% <81.42%> (+0.04%) ⬆️

... and 36 files with indirect coverage changes

---- 🚨 Try these New Features:

Copy link
Contributor

mergify bot commented Nov 8, 2024

@jiangyinzuo E2e jenkins job failed, comment /run-cpu-e2e can trigger the job again.

@sre-ci-robot sre-ci-robot added size/XL Denotes a PR that changes 500-999 lines. and removed size/L Denotes a PR that changes 100-499 lines. labels Nov 9, 2024
@jiangyinzuo jiangyinzuo force-pushed the feat/fp32-to-fp16-or-bf16 branch 2 times, most recently from 41dc76b to 43c1181 Compare November 9, 2024 07:59
Copy link
Contributor

mergify bot commented Nov 9, 2024

@jiangyinzuo go-sdk check failed, comment rerun go-sdk can trigger the job again.

Copy link
Contributor

mergify bot commented Nov 9, 2024

@jiangyinzuo E2e jenkins job failed, comment /run-cpu-e2e can trigger the job again.

@jiangyinzuo jiangyinzuo force-pushed the feat/fp32-to-fp16-or-bf16 branch 3 times, most recently from 69831ba to a900400 Compare November 10, 2024 15:53
Copy link
Contributor

mergify bot commented Nov 10, 2024

@jiangyinzuo E2e jenkins job failed, comment /run-cpu-e2e can trigger the job again.

@sre-ci-robot sre-ci-robot added size/XXL Denotes a PR that changes 1000+ lines. and removed size/XL Denotes a PR that changes 500-999 lines. labels Nov 11, 2024
Copy link
Contributor

mergify bot commented Nov 11, 2024

@jiangyinzuo E2e jenkins job failed, comment /run-cpu-e2e can trigger the job again.

@jiangyinzuo jiangyinzuo force-pushed the feat/fp32-to-fp16-or-bf16 branch 3 times, most recently from e8c2f3c to 53b5480 Compare November 11, 2024 08:38
Copy link
Contributor

mergify bot commented Nov 11, 2024

@jiangyinzuo E2e jenkins job failed, comment /run-cpu-e2e can trigger the job again.

@jiangyinzuo
Copy link
Contributor Author

/run-cpu-e2e

@mergify mergify bot added the ci-passed label Nov 19, 2024
@smellthemoon
Copy link
Contributor

/lgtm

@cqy123456
Copy link
Contributor

/lgtm

// unmarshal []float32 first to make sure `[3, 3]` is []float instead of []byte
var float32Array []float32
err := json.Unmarshal([]byte(vectorStr), &float32Array)
if err == nil {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a chance of a false positive occurring here?

// try to unmarshal as [][]float32 first to make sure `[[3, 3]]` is [][]float32 instead of [][]byte
fp32Values := make([][]float32, 0)
err := json.Unmarshal([]byte(vectorStr), &fp32Values)
if err == nil {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a chance of a false positive occurring here?
the performance here is likely to deteriorate, isn't it?

@jiangyinzuo
Copy link
Contributor Author

jiangyinzuo commented Nov 21, 2024

@czs007 I added a micro-benchmark at internal/distributed/proxy/httpserver/utils_bench_serialize_vectors_test.go, here are the results.

Run benchmarks at internal/distributed/proxy/httpserver directory

go test -bench=BenchmarkSerialize -run=^$
goos: linux
goarch: amd64
pkg: github.com/milvus-io/milvus/internal/distributed/proxy/httpserver
cpu: Intel(R) Xeon(R) CPU E5-2640 v4 @ 2.40GHz
BenchmarkSerialize_FloatVectors_Baseline-40                            7         144641969 ns/op
BenchmarkSerialize_FloatVectors-40                                    10         108097462 ns/op
BenchmarkSerialize_FloatVectors_Float16-40                             7         148711147 ns/op
BenchmarkSerialize_FloatOrByteVectors_Fp32-40                          7         158311570 ns/op
BenchmarkSerialize_FloatOrByteVectors_Byte-40                         58          19690757 ns/op
BenchmarkSerialize_FloatOrByteVectors_Fp32_UnmarshalTwice-40            7         151955889 ns/op
BenchmarkSerialize_FloatOrByteVectors_Byte_UnmarshalTwice-40           37          33288434 ns/op
BenchmarkSerialize_ByteVectors-40                                     55          21864953 ns/op
PASS
ok      github.com/milvus-io/milvus/internal/distributed/proxy/httpserver       17.815s
  • serialize float32 vector. BenchmarkSerialize_FloatVectors_Baseline is slower than BenchmarkSerialize_FloatVectors because the former calls json. Unmarshal multiple times and the later calls json.Unmarshal only once.
  • serialize float16/bfloat16 vector.
    • BenchmarkSerialize_FloatOrByteVectors_* is close to the performance of BenchmarkSerialize_FloatVectors_Float16 or BenchmarkSerialize_ByteVectors. This indicates that the performance overhead for distinguishing JSON type is not significant.

    • BenchmarkSerialize_FloatOrByteVectors_* is faster than BenchmarkSerialize_FloatOrByteVectors_*_UnmarshalTwice

The key to performance improvement lies in reducing the number of json.Unmarshal calls.

@sre-ci-robot sre-ci-robot removed the lgtm label Nov 21, 2024
@mergify mergify bot removed the ci-passed label Nov 21, 2024
@jiangyinzuo jiangyinzuo force-pushed the feat/fp32-to-fp16-or-bf16 branch 4 times, most recently from 2c28e76 to febe188 Compare November 22, 2024 04:32
Make the Milvus server support the conversion from fp32 to fp16/bf16,
so as to facilitate the client users who have difficulties in handling
fp16/bf16 conversions.

1. The influenced RESTful API are as follows:
    - Handler. insert
    - HandlerV1. insert/upsert
    - HandlerV2. insert/upsert/search

We do not modify search API in Handler/HandlerV1 because they do not
support fp16/bf16 vectors.

2. Add micro-benchmarks for various vector serialization methods.
3. Module github.com/milvus-io/milvus/pkg.

Add `Float32ArrayToBFloat16Bytes()`, `Float32ArrayToFloat16Bytes()` and
`Float32ArrayToBytes()`. These method will be used in GoSDK in the
future.

issue: milvus-io#37448

Signed-off-by: Yinzuo Jiang <[email protected]>
Signed-off-by: Yinzuo Jiang <[email protected]>
Copy link
Contributor

mergify bot commented Nov 22, 2024

@jiangyinzuo E2e jenkins job failed, comment /run-cpu-e2e can trigger the job again.

Copy link
Contributor

mergify bot commented Nov 22, 2024

@jiangyinzuo cpp-unit-test check failed, comment rerun cpp-unit-test can trigger the job again.

Copy link
Contributor

mergify bot commented Nov 22, 2024

@jiangyinzuo go-sdk check failed, comment rerun go-sdk can trigger the job again.

@jiangyinzuo
Copy link
Contributor Author

/run-cpu-e2e

@jiangyinzuo
Copy link
Contributor Author

rerun cpp-unit-test

@jiangyinzuo
Copy link
Contributor Author

rerun go-sdk

@mergify mergify bot added the ci-passed label Nov 22, 2024
@czs007
Copy link
Collaborator

czs007 commented Nov 24, 2024

/approve
/lgtm

@sre-ci-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: czs007, jiangyinzuo, zhuwenxing

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@sre-ci-robot sre-ci-robot merged commit 5a06fac into milvus-io:master Nov 24, 2024
20 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved area/test ci-passed dco-passed DCO check passed. kind/feature Issues related to feature request from users lgtm sig/testing size/XXL Denotes a PR that changes 1000+ lines.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants