Networking: implement traffic shaping (throttle/rate limit) directly in gVisor #11109

Open
norbjd opened this issue Nov 3, 2024 · 4 comments
Labels
type: enhancement New feature or request

Comments

norbjd commented Nov 3, 2024

Description

Hello 👋

I'm wondering if it is technically possible to implement traffic shaping directly in the gVisor sandbox, rather than relying on external tools (like tc) to restrict both ingress and egress bandwidth of containers. I have been reading the networking architecture guide, and the sentry/netstack abstraction makes me think it could be possible, since all packets seem to go through the virtual interface. I've also already seen some qdisc references in the code. I am not a networking expert though, so maybe I'm just writing absolute nonsense; feel free to correct me if so.

Anyway, my use case is very simple: I want to run multiple containers using runsc and easily restrict/throttle/rate-limit both transmitted and received bytes to something like 10 MB/s for each container, without having to tweak my host. This use case also applies to higher-level tools like Kubernetes. There are some ways to do traffic shaping in a K8s cluster (the bandwidth CNI plugin, Cilium's bandwidth manager, ...), but I have not been able to make any of them work with gVisor.

Despite taking a very different approach to sandboxing, Kata Containers has implemented this feature: it is as simple as setting the rx_rate_limiter_max_rate and tx_rate_limiter_max_rate parameters in the configuration. So, I would expect ingress/egress rate limiting to be as simple as passing a flag to the runsc command.

Again, I don't know if what I am asking is possible, or whether there are easy alternatives that already provide bandwidth rate limiting. Any pointers would be appreciated.

Thanks 😃

Is this feature related to a specific bug?

No.

Do you have a specific solution in mind?

Not really, sadly 😕 I'm assuming this behavior could live in the qdisc "algorithm" where packets are dispatched, or in the netstack abstraction, but these are just wild guesses as I'm not an expert on how gVisor works under the hood (especially the networking part).

norbjd added the type: enhancement label on Nov 3, 2024
kevinGC (Collaborator) commented Nov 4, 2024

It could be implemented in gVisor, sure. A few things:

  • The quote at the top of "rate-limiter: network I/O throttling on VM level" (kata-containers/kata-containers#312) applies to gVisor as well: it's better to shape traffic outside of the sandbox. I would think it's possible to set limits on the host side of the container veth to enforce this. I'm going to guess that the existing K8s stuff doesn't work because our use of AF_PACKET skips it (just a guess).
  • If you really want it inside gVisor, the best way to implement this would probably be to add another qdisc implementation (like the existing FIFO) that imposes rate limits; a rough sketch follows this list. If the limits must be customizable, we can update the runsc flags to pass the rate limit to the implementation. Thankfully the work is pretty clear-cut: another implementation of an existing interface, plus some plumbing to get the command-line configuration down to it.
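
For what it's worth, here is a minimal, self-contained sketch of the token-bucket idea such a qdisc could be built around, using golang.org/x/time/rate. The packet type and writePacket method below are placeholders, not gVisor's actual API; a real implementation would satisfy the existing QueueingDiscipline interface and forward packets to the underlying link endpoint:

```go
// A minimal sketch of a token-bucket rate limiter for outgoing packets.
// The packet type and writePacket method are placeholders, not gVisor's
// actual API.
package main

import (
	"context"
	"fmt"
	"time"

	"golang.org/x/time/rate"
)

// packet stands in for gVisor's packet buffer type.
type packet struct {
	payload []byte
}

// rateLimitedQueue delays packets so that sustained throughput stays under a
// configured byte rate, using a token bucket where one token equals one byte.
type rateLimitedQueue struct {
	limiter *rate.Limiter
}

func newRateLimitedQueue(bytesPerSec, burstBytes int) *rateLimitedQueue {
	return &rateLimitedQueue{
		limiter: rate.NewLimiter(rate.Limit(bytesPerSec), burstBytes),
	}
}

// writePacket blocks until enough byte tokens are available, then "sends" the
// packet. In a real qdisc this is where the packet would be queued or passed
// down to the lower link endpoint.
func (q *rateLimitedQueue) writePacket(ctx context.Context, pkt *packet) error {
	return q.limiter.WaitN(ctx, len(pkt.payload))
}

func main() {
	// ~10 MB/s sustained rate with a 64 KiB burst, matching the use case above.
	q := newRateLimitedQueue(10*1024*1024, 64*1024)

	start := time.Now()
	for i := 0; i < 160; i++ { // 160 x 64 KiB = 10 MiB total
		pkt := &packet{payload: make([]byte, 64*1024)}
		if err := q.writePacket(context.Background(), pkt); err != nil {
			panic(err)
		}
	}
	fmt.Printf("sent 10 MiB in %v\n", time.Since(start).Round(time.Millisecond))
}
```

Presumably the remaining plumbing is just a runsc flag carrying the desired rate down to wherever the qdisc gets constructed, next to the existing FIFO.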

norbjd (Author) commented Nov 5, 2024

Thanks for your quick answer, Kevin 😃

Glad to see that it is at least technically possible, even if there are better ways 😁 By any chance, do you (or someone else) have any ideas/pointers on how to shape traffic directly on the host in a way that works with gVisor? 👀

To be honest, I couldn't make it work on a k8s cluster after a few hours of tweaking, because of an error I couldn't trace to its source: containerd? the host? the CNI (Cilium in my case)? gVisor? So I couldn't really blame gVisor, as there were many components involved 😛 All I can remember is an error in the containerd system logs when using the default bandwidth CNI plugin:

containerd plugin type="bandwidth" failed (add): create ingress qdisc: file exists

I will try to reproduce this with a minimal k8s example, but if you or other people have ideas, feel free to answer. I thought adding it to runsc itself was a good option because people might run gVisor outside of k8s, but I do agree that finding (and documenting) a way to shape traffic directly on the host would be even better.

kevinGC (Collaborator) commented Nov 5, 2024

To be clear: we'd love to have it added to gVisor. We just always think about things from a security perspective first, but we'd 100% work with a contributor to get a PR in that adds shaping or rate limiting.

It looks like, based on a little investigation (e.g.), doing rate limiting on the host would involve a highly custom setup. Kubernetes isn't going to make it easy to support that out of the box.

norbjd (Author) commented Nov 9, 2024

Thanks for the follow-up! I will try to implement something in the next few months based on your guidance, because it seems interesting to do 😄 I might need help at some point since I'm not a networking expert, nor familiar with the gVisor codebase (though on that side, the change seems mostly localized to pkg/tcpip/link/qdisc, so that should be fine). I will try to draft a PR and link it to this issue. If I struggle too much, I'll keep you updated here so you or other members of the community can give it a try :)

One last question before starting: the comment above the QueueingDiscipline interface states that the qdisc only applies to outgoing packets, so only egress (if I understand correctly). That covers one part of the initial problem, throttling transmitted bytes, but I am wondering whether it is possible to also rate-limit received bytes with the same qdisc, or would that require a different change? Similarly to the qdisc's WritePacket method, is there perhaps a ReadPacket method somewhere that could handle this?

I'm asking because, in the meantime, I have tested some other alternatives in Kubernetes environments, and I was able to successfully restrict egress for gVisor using Cilium's Bandwidth Manager with the kubernetes.io/egress-bandwidth annotation. Sadly, Cilium's Bandwidth Manager does not support the kubernetes.io/ingress-bandwidth annotation. So, in my specific case, I am more interested in restricting ingress on the gVisor side than egress, as Cilium already handles the latter. Still, both features remain interesting for standalone gVisor containers.
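
For reference, this is roughly what that looks like on a pod spec (the pod, container, image, and RuntimeClass names below are placeholders for my setup; only the egress annotation is actually enforced by Cilium's Bandwidth Manager):

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: throttled-pod                      # placeholder
  annotations:
    # Egress cap, honored by Cilium's Bandwidth Manager.
    kubernetes.io/egress-bandwidth: "10M"
    # Ingress cap, not supported by Cilium's Bandwidth Manager, hence my
    # interest in having gVisor enforce it itself.
    # kubernetes.io/ingress-bandwidth: "10M"
spec:
  runtimeClassName: gvisor                 # assumes a RuntimeClass backed by runsc
  containers:
    - name: app                            # placeholder
      image: nginx                         # placeholder
```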

Thanks again.
