Kubernetes: request.body() taking a long time when targeting server through a LoadBalancer service #2473
-
Probably not. When you're awaiting `request.body()`, you're waiting on the underlying ASGI receive call, and the client may not even have started sending the body at that point, so the time you're measuring includes the actual network transfer. Here's a little Litestar app to demonstrate this:

```python
from litestar import post, Request
from litestar.testing import create_test_client


@post("/")
async def handler(request: Request) -> None:
    print("start receiving")
    await request.body()
    print("done receiving")


def generate_body():
    print("start sending")
    yield b""


with create_test_client([handler]) as client:
    client.post("/", content=generate_body())
```

Running this will output:

```
start receiving
start sending
done receiving
```
(Pure ASGI version)

```python
import asyncio

import httpx


async def generate_body():
    print("start sending")
    yield b""


async def app(scope, receive, send):
    print("start receiving")
    await receive()  # pulls the first body message from the client
    print("done receiving")
    await send({"type": "http.response.start", "status": 200, "headers": [[b"content-type", b"text/plain"]]})
    await send({"type": "http.response.body", "body": b""})


async def main():
    async with httpx.AsyncClient(app=app, base_url="http://testserver") as client:
        await client.post("/", content=generate_body())


asyncio.run(main())
```

Notice how the sending doesn't start before the receiving does.

Can I ask why you've set up your route handler this way, instead of just letting Litestar handle the deserialisation and data fetching? This should be equivalent to your setup:

```python
@post(...)
async def ranking(data: RankingRequest) -> Response[bytes]:
    ...
```
-
I have a Litestar app deployed to AKS (Azure Kubernetes Service). This app wraps an ML model and returns some time metrics alongside the actual results. Over the past few days I've been running a simple test that makes n sequential requests (no concurrency), collects time metrics, and plots the results. And something really weird happens that I can't explain. My metrics include the time taken for deserializing the request body, measured as follows:
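Roughly something of this shape (a hypothetical sketch; the route name, fields, and response shape are illustrative, and the real snippet may differ):

```python
import time

from litestar import Request, post


@post("/rank")
async def ranking(request: Request) -> dict:
    start = time.perf_counter()
    body = await request.body()  # the measured span: includes reading the body from the client
    deserialization_s = time.perf_counter() - start
    # ... run the model on the parsed body ...
    return {"deserialization_s": deserialization_s, "body_bytes": len(body)}
```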
Now, using `httpx.Client().post` as the HTTP client:

(plot omitted: body deserialization time vs. history_length)

Notes:
- Using `requests.Session` as the HTTP client, deserialization gets faster, but is still 10x slower when using the LoadBalancer service.
- […] `history_length` you see in the x axis.
- From my colleague's PC (using httpx), we get similar results but on another "scale", i.e. the overhead is similar to what I get with `requests`.

I can't wrap my mind around this; isn't the request body already on the k8s node when I start reading it with `request.body()`?
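One way to test the lazy-receive explanation from the reply above (a hypothetical sketch, not part of the original post): time the network receive and the parsing separately. If the explanation is right, only `receive_s` should change between the LoadBalancer path and a direct connection:

```python
import json
import time

from litestar import Request, post


@post("/rank")
async def ranking(request: Request) -> dict:
    t0 = time.perf_counter()
    raw = await request.body()  # waits for the client to send the body over the network
    t1 = time.perf_counter()
    data = json.loads(raw)      # pure CPU-bound parsing, no I/O
    t2 = time.perf_counter()
    return {"receive_s": t1 - t0, "parse_s": t2 - t1, "n_items": len(data)}
```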