llama-server of vulkan backend crashes #3313
Information about your version
Information about your GPU
Intel Driver & Support Assistant reports:
Additional context
A workaround is to install an upstream llama-server.exe, e.g. llama-b4034-bin-win-vulkan-x64.
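A minimal sketch of that workaround in PowerShell, assuming the build is taken from the llama.cpp GitHub releases; the download URL and the destination directory where Tabby looks for llama-server.exe are assumptions, so adjust them to your install:

# Download and unpack an upstream Vulkan build of llama-server (URL assumed from the llama.cpp release naming).
Invoke-WebRequest https://github.com/ggerganov/llama.cpp/releases/download/b4034/llama-b4034-bin-win-vulkan-x64.zip -OutFile llama-vulkan.zip
Expand-Archive llama-vulkan.zip -DestinationPath llama-vulkan
# Replace the llama-server.exe that ships with Tabby (destination path is hypothetical).
Copy-Item llama-vulkan\llama-server.exe C:\tabby\llama-server.exe -Force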
Hi @gmatht and @Vlod-github, thank you for reporting the issue. Did you install the Vulkan runtime before using Tabby? Could you please try running the following command and post the result here: vulkaninfo.exe
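For reference, the output can be captured like this; the --summary flag and the redirection are standard vulkaninfo/PowerShell usage, nothing specific to Tabby:

# Short device/driver summary, a quick check that a Vulkan-capable GPU is visible.
vulkaninfo.exe --summary
# Full report, redirected to a file that can be attached to the issue.
vulkaninfo.exe > vulkaninfo.txt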
I have verified this on Linux and found an issue with the Vulkan build. We will investigate further and fix it later. https://gist.github.com/zwpaper/08e80712e1f3f82a41a1a0ee41735b2f
After running
.\tabby.exe serve --model StarCoder-1B --chat-model Qwen2-1.5B-Instruct
llama-server.exe crashes after the models are downloaded.

Tabby
tabby v0.18.0 and tabby v0.19.0-rc.1
tabby_x86_64-windows-msvc and tabby_x86_64-windows-msvc-vulkan
It turns out that this doesn't work even with the CPU build (tabby_x86_64-windows-msvc).
Environment
Ryzen 5 3500U with Vega 8
Windows 10
Further, I loaded these GGUF models in GPT4All and they work, so the issue is with Tabby's backend.
Here is the output that tabby periodically produces.