note: to be rebased after #180

This PR aims to replace the current readiness logic, which only checks that third-party storage such as AWS DynamoDB and S3 is able to respond. That is fine for a low-traffic service, but not good enough for the intense traffic of the `bitswap-peer` service.

The proposed readiness logic will consider:

- active connections
- pending request blocks
- event loop utilization (ELU)
Active connections and pending request blocks are hard limits derived from production metrics, while ELU is an index of how busy the Node.js process is internally. When any of them passes its limit, we consider that single instance "busy" and stop sending it more requests until it works through the pending load. Note that ELU is more significant than memory and CPU usage; we may add those to the readiness logic in the future, if needed.
Open question: should we consider DynamoDB and S3 access for readiness?
What we really want to avoid is this situation, where the service is busy responding (for ~10 minutes around 15:00) but still receives requests.

More info: https://www.nearform.com/blog/event-loop-utilization-with-hpa/
Also note, the next step for bitswap scalability will be to configure the k8s load balancer to scale services based on custom parameters; these will be the same ones (active connections, pending blocks, ELU) exposed on the `/load` endpoint.