Add privileged mode for SST #105
The NRI pod needs privileged mode to access the /dev/isst_interface device.
This PR addresses #101.
@marquiz Do you have any comments on this PR?
@@ -66,6 +70,10 @@ spec:
mountPath: /var/lib/nri-resource-policy
- name: hostsysfs
mountPath: /host/sys
{{ if .Values.privilegedMode }}
- name: hostdev
mountPath: /host/dev
I think mounting /dev should be a separate setting.
@@ -54,9 +54,13 @@ spec:
image: {{ .Values.image.name }}:{{ .Values.image.tag | default .Chart.AppVersion }}
imagePullPolicy: {{ .Values.image.pullPolicy }}
securityContext:
{{ if .Values.privilegedMode }}
privileged: true
I'd also suggest restricting the capabilities: setting to the minimum that is actually required.
I'm not sure what would be the nicest way to tie all the pieces (privileged mode, capabilities, /dev mount) together. Would there be three separate parameters, or a single devAccess or something 🤨 Thoughts?
Hi, @marquiz. I tried to use
In this securityContext, I didn't know which capabilities to add, so I added all of them to the NRI pod. But the NRI pod still reported an error:
Do you have any ideas on that?
If I don't add extra capabilities to the NRI pod, the grep command output is:
And if I add the extra capabilities to the NRI pod, the grep command output is:
So it seems that all the capabilities have already been added to the NRI pod. But the NRI pod still can't access the
Thanks @changzhi1990 for debugging this. Fair enough, it seems we need privileged mode, then. I'd still like to understand why that fails. Maybe it's the ambient capabilities somehow being required (although I don't quite understand why)...
Responding to myself: I think it's because runc uses cgroups to control access to device nodes. To make this work without privileged
Thanks for your response. I will try this.
I have compared the capabilities of the two scenarios.
In scenario 1,
In scenario 2,
The only difference between the two scenarios is that there is an extra "41" capability in scenario 2.
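One way to compare two capability masks like the ones from the two scenarios is to list the bit positions in which they differ. The following is only a minimal sketch: cap_bits and cap_diff are hypothetical helper names, and the arguments are assumed to be the hexadecimal masks that /proc/<pid>/status reports in fields such as CapEff.

```shell
#!/bin/sh
# cap_bits MASK: print the indices of the bits set in a hexadecimal mask
# (hypothetical helper; MASK may be given with or without a leading 0x).
cap_bits() {
    mask=$(( 0x${1#0x} ))
    bit=0
    out=""
    while [ "$mask" -ne 0 ]; do
        if [ $(( mask & 1 )) -eq 1 ]; then
            # Append the bit index, separated by a space after the first one.
            out="${out}${out:+ }${bit}"
        fi
        mask=$(( mask >> 1 ))
        bit=$(( bit + 1 ))
    done
    printf '%s\n' "$out"
}

# cap_diff A B: print the bit indices that are set in exactly one of A and B.
cap_diff() {
    a=$(( 0x${1#0x} ))
    b=$(( 0x${2#0x} ))
    cap_bits "$(printf '%x' $(( a ^ b )))"
}

# Example with small illustrative masks:
#   cap_bits 0x5       prints: 0 2
#   cap_diff 0x3 0x7   prints: 2
```

The bit indices can then be matched against the capability numbers in the kernel's linux/capability.h header to see which capability, if any, separates the two cases.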
Maybe... we need an SST device plugin, like the dsb device plugin, the GPU device plugin, etc.?
@changzhi1990, ah sorry, I was a bit "equivocal" on this. I was merely speaking to myself, because there is nothing other than privileged mode in the current setup that can make it work.
Yes, this would be a way to do it. Another possible solution would be running another NRI plugin before this one that would insert the isst_interface device into the resource-policy container, but let's not go there yet...
Alright. Is it worth creating an SST plugin? I mean, there may be a lot of work to do.
Talking about another NRI plugin: @klihub had an idea about this (it seems there already exists a sample plugin for this purpose, i.e. injecting a device), so that could indeed be the solution. Let's have @klihub provide the details here.
@changzhi1990 You should be able to get read-only access to

For instance, I used that plugin and the following pod spec to test whether I can do 'SST discovery':

apiVersion: v1
kind: Pod
metadata:
  name: sst-test
  annotations:
    devices.nri.io/container.c0: |+
      - path: /dev/isst_interface
        type: c
        major: 10
        minor: 123
        file_mode: 0o600
spec:
  containers:
  - name: c0
    image: quay.io/marquiz/goresctrl:test
    command:
    - sh
    - -c
    - sleep 3600
    resources:
      requests:
        cpu: 250m
        memory: 200M
      limits:
        cpu: 500m
        memory: 200M
    securityContext:
      privileged: false
      capabilities:
        drop:
        - all
    imagePullPolicy: IfNotPresent

Just run the device-injector plugin (for instance

klitkey1@emr-1:~/xfer$ kubectl apply -f sst-test.yaml
klitkey1@emr-1:~/xfer$ kubectl exec -ti sst-test -c c0 -- /bin/bash --login
root@sst-test:/go/builder# /go/bin/sst-ctl info
...
PPCurrentLevel: 0
PPLocked: true
PPMaxLevel: 4
PPSupported: true
PPVersion: 3
TFEnabled: false
TFSupported: true
...

This sample plugin was not really meant for production. It's merely a sample plugin which demonstrates some of NRI's capabilities. Anyway, if you'd like to use this plugin, or you roll your own, you can enable it permanently on your cluster nodes by symlinking it into

klitkey1@emr-1:~/xfer$ sudo mkdir -p /opt/nri/plugins
sudo ln -s $(pwd)/device-injector /opt/nri/plugins/10-device-injector
klitkey1@emr-1:~/xfer$ sudo systemctl restart containerd # or crio

After this you should be able to annotate your pods for device injection for testing...

In a production environment you might want to restrict in some way (for instance by namespace) which pods can be annotated with injected devices. Also, you might want to deploy the plugin itself as a DaemonSet instead of having to install it separately on each of your worker nodes.

If there is enough interest, we can consider polishing that plugin, adding any necessary mechanisms for restricting access to annotation-based device injection, etc., and creating images and other deployment artifacts for it within the
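The device annotation in the pod spec above carries the node type together with the major and minor numbers, so it can help to read those values from the node before writing the annotation. The following is a minimal sketch, assuming GNU coreutils stat is available on the node; dev_numbers is a hypothetical helper name, not part of NRI or the device-injector plugin.

```shell
#!/bin/sh
# dev_numbers PATH: print "<type> <major>:<minor>" for a device node,
# where <type> is c (character) or b (block). Hypothetical helper.
dev_numbers() {
    path=$1
    if [ -c "$path" ]; then
        type=c
    elif [ -b "$path" ]; then
        type=b
    else
        echo "not a device node: $path" >&2
        return 1
    fi
    # GNU stat reports the major (%t) and minor (%T) numbers in hexadecimal,
    # so convert them to decimal for use in the annotation.
    major=$(stat -c '%t' "$path")
    minor=$(stat -c '%T' "$path")
    printf '%s %d:%d\n' "$type" "0x$major" "0x$minor"
}

# Example, to be run on the worker node:
#   dev_numbers /dev/isst_interface
```

If the numbers printed on the node differ from the ones in the annotation, the annotation should presumably follow the node.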
Hi, thanks for your detailed reply. According to your message, it seems we have two options for this issue.
Closing this as ancient stale.