[BUG] cuda broken - Cannot load libcuda.so.1 #403

Open
1 task done
HidingCherry opened this issue Sep 11, 2024 · 8 comments

Comments

@HidingCherry

HidingCherry commented Sep 11, 2024

Is there an existing issue for this?

  • I have searched the existing issues

Current Behavior

Trying to use hardware transcoding with nvidia.
image
Plex loads libraries from its own directory, /usr/lib/plexmediaserver/lib/, instead of /usr/lib/ as one would expect.
The NVIDIA CDI file has even been modified to add libcuda.so.1 to /usr/lib.

nvidia-smi works as expected.

PS: the issue template is really missing a "custom comment" section...

Expected Behavior

No response

Steps To Reproduce

  • Set up the NVIDIA environment
  • Use nvidia-modprobe (if necessary)
  • Create the required CDI file with nvidia-ctk cdi generate --output=/etc/cdi/nvidia.yaml
  • Recreate the container

Environment

- OS: ArchLinux
- How docker service was installed: podman rootless

CPU architecture

x86-64

Docker creation

/usr/bin/podman run \
	--rm \
	-d \
	--replace \
	--name=plex \
	--stop-timeout 90 \
	--ulimit=nofile=65536:65536 \
	--hooks-dir=/usr/share/containers/oci/hooks.d/ \
	--device=/dev/dri:/dev/dri \
	--device nvidia.com/gpu=all \
	-e PUID=1000 \
	-e PGID=1000 \
	-e VERSION=docker \
	-e "NVIDIA_DRIVER_CAPABILITIES=nvidia.com/gpu=all" \
	-e "NVIDIA_VISIBLE_DEVICES=nvidia.com/gpu=all" \
	--net slirp4netns:port_handler=slirp4netns \
	-p xxx:32400:32400 \
	lscr.io/linuxserver/plex:latest

Container logs

[migrations] started
[migrations] no migrations found
───────────────────────────────────────

      ██╗     ███████╗██╗ ██████╗
      ██║     ██╔════╝██║██╔═══██╗
      ██║     ███████╗██║██║   ██║
      ██║     ╚════██║██║██║   ██║
      ███████╗███████║██║╚██████╔╝
      ╚══════╝╚══════╝╚═╝ ╚═════╝

   Brought to you by linuxserver.io
───────────────────────────────────────

To support LSIO projects visit:
https://www.linuxserver.io/donate/

───────────────────────────────────────
GID/UID
───────────────────────────────────────

User UID:    1000
User GID:    1000
───────────────────────────────────────
Linuxserver.io version: 1.40.5.8921-836b34c27-ls234
Build-date: 2024-09-09T09:24:02+00:00
───────────────────────────────────────
    
**** Server already claimed ****
Docker is used for versioning skip update check
[custom-init] No custom files found, skipping...
Starting Plex Media Server. . . (you can ignore the libusb_init error)
Connection to localhost (127.0.0.1) 32400 port [tcp/*] succeeded!
[ls.io-init] done.
Critical: libusb_init failed

Thanks for opening your first issue here! Be sure to follow the relevant issue templates, or risk having this issue marked as invalid.

@aptalca
Member

aptalca commented Sep 11, 2024

You're well into unsupported territory with that level of customization.

Docker with nvidia runtime (nvidia-container-toolkit) works just fine, which is the configuration we support. Podman rootless itself is considered a reasonable endeavor so while we don't officially support it, we'll try and help. But podman rootless with custom nvidia bits is not something we can help with.
https://docs.linuxserver.io/misc/support-policy/#reasonable-endeavours-support

@HidingCherry
Author

HidingCherry commented Sep 11, 2024

Yes, that's correct.
podman + rootless is already something I don't expect to get lots of support with.

But since standard Linux programs load libraries from /usr/lib rather than a custom directory, the issue seems to be image- or Plex-specific. The main question is why Plex does not use /usr/lib for libraries when nvidia-smi does (in the same container).
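
One way to check this from inside the container is to compare the two directories directly. The following is only a sketch: `missing_libs` is a hypothetical helper, not something shipped with the image or with Plex, and the paths in the usage line are the ones mentioned in this thread. It prints every `lib*.so.1` name that exists in the first directory but not in the second.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print library names that exist in $1 but not in $2.
missing_libs() {
    local src="$1" dst="$2" lib name
    for lib in "$src"/lib*.so.1; do
        # Skip the literal pattern when nothing matches (and dangling links).
        [ -e "$lib" ] || continue
        name="$(basename "$lib")"
        # -e follows symlinks; -L also catches dangling symlinks in the target.
        if [ ! -e "$dst/$name" ] && [ ! -L "$dst/$name" ]; then
            printf '%s\n' "$name"
        fi
    done
}

# Example, run inside the container:
# missing_libs /usr/lib /usr/lib/plexmediaserver/lib
```

If libcuda.so.1 shows up in that output, it is present in /usr/lib but absent from the directory Plex reads its libraries from.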

edit: You should be able to recreate the same issue with docker rootful.

@aptalca
Member

aptalca commented Sep 11, 2024

No point in arguing hypotheticals. I'd rather avoid the XY problem.

> edit: You should be able to recreate the same issue with docker rootful.

Please don't expect us to do the work based on assumptions or a hunch. If you reproduce the issue with rootful docker and official nvidia-container-toolkit, let us know and we'll look into it.

Last I tested plex with nvidia-container-toolkit, following the instructions we provide in the readme, everything was working just fine and that was only a couple of months ago.

@HidingCherry
Author

I had no issues a couple of months ago either; I have been using your image for at least a year.
I'll see if I can set up rootful docker in the near future.

@AngellusMortis

AngellusMortis commented Oct 6, 2024

I am seeing the same issue in Kubernetes. Same thing: it used to work a few months ago and now it is broken. NVENC is still working in other containers, just not Plex.

image

@LinuxServer-CI
Contributor

This issue has been automatically marked as stale because it has not had recent activity. This might be due to missing feedback from OP. It will be closed if no further activity occurs. Thank you for your contributions.

@HidingCherry
Author

HidingCherry commented Nov 9, 2024

Temporary workaround which works for me:

# Link every lib*.so.1 from /usr/lib into Plex's own library directory
for i in /usr/lib/lib*.so.1; do
    ln -s "$i" /usr/lib/plexmediaserver/lib/
done

This symlinks all libraries (which are actually all nvidia/cuda ones) to the library folder of plex.
After a short test, hw-transcoding works again.
This script has to be run inside the container.
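
Running the loop a second time prints a "File exists" error for every link that is already there. A slightly more defensive variant is sketched below. It is an illustration only: `link_libs` is a hypothetical helper name, and the default paths are the ones used in this thread. It skips any name that already exists in the target directory, so it can be rerun safely and never replaces a library that Plex ships itself.

```shell
#!/usr/bin/env bash
# Hypothetical helper: symlink lib*.so.1 from a source dir into a target dir,
# skipping any name that already exists in the target.
link_libs() {
    local src="${1:-/usr/lib}" dst="${2:-/usr/lib/plexmediaserver/lib}" lib name
    for lib in "$src"/lib*.so.1; do
        [ -e "$lib" ] || continue            # no match, or dangling source
        name="$(basename "$lib")"
        if [ -e "$dst/$name" ] || [ -L "$dst/$name" ]; then
            continue                         # leave existing entries untouched
        fi
        ln -s "$lib" "$dst/$name"
    done
}

# Example, run inside the container:
# link_libs
```

Because the links are created inside the container's filesystem, they have to be recreated whenever the container is recreated.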

Script for podman (run from the host):
podman exec plex bash -c 'for i in /usr/lib/lib*.so.1; do ln -s "$i" /usr/lib/plexmediaserver/lib/; done'

Currently I don't have the time to set up rootful docker to recreate the issue there.
