The models are loaded via onnxruntime-node, a Node.js binding for Microsoft's ONNX Runtime.
onnxruntime-node doesn't currently have GPU support. Adding it is a work item on their roadmap for 2024: some early code has landed, but it isn't fully deployed yet.
Also, onnxruntime-node doesn't support models larger than 2.0 GB, so the whisper-large models are not currently supported. I opened an issue about that on the ONNX Runtime repository several months ago.
Once they add GPU support upstream, I'll add GPU support here, both for Whisper recognition and possibly for synthesis with the VITS models, and maybe for other features like speech language recognition. The same goes for large model support.
If you are using Echogarden mostly for speech recognition, just know that it isn't actually its strongest area (its strongest is likely alignment). There are faster implementations of the OpenAI Whisper models, like whisper.cpp, that support NVIDIA GPUs and are otherwise significantly faster on CPU due to quantization and other optimizations they use.
When I run audio-to-text using the API, it always runs on my CPU and my GPU stays idle. I want to make it run on my GPU to improve speed.