nllb-api support GPU #49
Yes:

```python
cls.translator = CTranslator(model_path, device='cuda', compute_type='auto', device_index=[0, 1])
```

Afterwards, just run the container with the NVIDIA runtime:

```shell
docker build -f Dockerfile.build -t nllb-api .
docker run --rm --runtime=nvidia --gpus all \
    -e SERVER_PORT=5000 \
    -e APP_PORT=7860 \
    -e OMP_NUM_THREADS=6 \
    -e WORKER_COUNT=1 \
    -p 7860:7860 \
    -v ./cache:/home/user/.cache \
    nllb-api
```
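To confirm the container actually picked up a GPU, one quick check is to ask CTranslate2 how many CUDA devices it can see. A minimal sketch (it assumes the `ctranslate2` package and falls back to CPU when the package is missing or no device is visible):

```python
# Sketch: choose 'cuda' only when CTranslate2 can see at least one GPU.
# Falls back to 'cpu' if the package is absent or no device is visible.
def pick_device() -> str:
    try:
        import ctranslate2
        if ctranslate2.get_cuda_device_count() > 0:
            return "cuda"
    except ImportError:
        pass
    return "cpu"

print(pick_device())
```

If this prints `cpu` inside the container, the NVIDIA runtime is not exposing the GPU and the `docker run` flags above are worth double-checking.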
Awesome, thanks. I will try.
Is there any problem with this? I don't seem to see any GPU usage.
Does `WORKER_COUNT=2` support running on two GPUs separately?
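For what it's worth, a common way to pin multiple workers to separate GPUs is to assign each worker a device by index. This is only a hypothetical sketch of that idea, not how nllb-api actually schedules its workers:

```python
# Hypothetical sketch: round-robin workers across GPUs, so that with
# WORKER_COUNT=2 and two GPUs, worker 0 uses GPU 0 and worker 1 uses GPU 1.
def gpu_for_worker(worker_index: int, gpu_count: int) -> int:
    if gpu_count <= 0:
        raise ValueError("no GPUs available")
    return worker_index % gpu_count

print(gpu_for_worker(0, 2))  # → 0
print(gpu_for_worker(1, 2))  # → 1
```

The `device_index=[0, 1]` argument discussed in this thread takes a different route: a single translator instance that spreads its work across both devices, so separate per-worker pinning may not be necessary.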
Whoops, this was what I was afraid of. You will also have to replace the final image:

```dockerfile
FROM python:3.11.6-slim as python-builder
ENV POETRY_VIRTUALENVS_CREATE false
ENV POETRY_HOME /opt/poetry
ENV PATH $POETRY_HOME/bin:$PATH
WORKDIR /
COPY pyproject.toml .
RUN apt update
RUN apt install -y curl
RUN curl -sSL https://install.python-poetry.org | python -
RUN poetry install --no-dev

FROM caddy:builder-alpine as caddy-builder
RUN xcaddy build --with github.com/caddyserver/cache-handler

FROM nvidia/cuda:11.7.1-runtime-ubuntu22.04
ENV HOME /home/user
ENV PYTHONUNBUFFERED 1
ENV PYTHONDONTWRITEBYTECODE 1
RUN useradd -m -u 1000 user
USER user
WORKDIR $HOME/app
COPY --chown=user --from=caddy-builder /usr/bin/caddy /usr/bin/caddy
COPY --chown=user --from=python-builder /usr/local/ /usr/local/
COPY --chown=user . $HOME/app
CMD ["supervisord"]
```

And keep the translator on the GPU:

```python
cls.translator = CTranslator(model_path, device='cuda', compute_type='auto', device_index=[0, 1])
```

You can read more here.
This is the repaired Dockerfile, and it now runs normally.
There are a lot of redundancies, and your final image is probably larger than it ever needs to be. I think the one I suggested should work fine.
That one is missing things: it lacks the libraries and the Python modules. I patched it bit by bit following the error messages, and it became what it is now.
If you cloned the repository and built the Docker image from it, the correct libraries should be downloaded. Python will also be copied over from `/usr/local/`.
So you mean: run `pip3 install torch torchvision torchaudio tensorflow ctranslate2 tensorrt` in the python-builder image, and then copy the results into the `nvidia/cuda:11.7.1-runtime-ubuntu22.04` image?
No. My Python builder stage should already install the required dependencies. These dependencies are found in `/usr/local/lib/python3.11/*`, which is copied over in the final build step. Therefore, you don't have to do anything.
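As a side note, the site-packages directory being copied here can be located from inside any image using the standard library alone. A small sketch (the exact path depends on the interpreter and base image; in `python:3.11` it resolves under `/usr/local/lib/python3.11/`):

```python
# Sketch: print the directory where pip/Poetry install pure-Python packages.
# This is the tree the final build stage copies over from the builder image.
import sysconfig

site_packages = sysconfig.get_path("purelib")
print(site_packages)
```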
I have officially tested and introduced CUDA support in commit e7214f3. The image will be a lot smaller than with your approach. Cheers!
Awesome ~
Is it possible to support GPU? I found that CTranslate2 supports GPU. Would running on a GPU make responses faster?