
nllb-api support GPU #49

Closed
online2311 opened this issue Oct 10, 2023 · 13 comments

@online2311

Is it possible to support GPU? I found out that CTranslate2 supports GPU. Would GPU support make responses faster?

@winstxnhdw (Owner) commented Oct 10, 2023

Yes, CTranslate2 supports GPU. It is indeed orders of magnitude faster, and it's really easy to implement yourself. I don't have the time to add it right now and will only introduce it in a month or two. For now, you can do it yourself by changing the following code and installing the NVIDIA Container Toolkit.

cls.translator = CTranslator(model_path, device='cuda', compute_type='auto', device_index=[0, 1])
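
If your machine might not always have a GPU, a sketch like the following falls back to CPU automatically (load_translator is a hypothetical helper; get_cuda_device_count and Translator are part of the CTranslate2 Python API):

import ctranslate2

def load_translator(model_path: str) -> ctranslate2.Translator:
    # Prefer CUDA when CTranslate2 can see at least one device;
    # otherwise fall back to CPU so the container still starts.
    device = 'cuda' if ctranslate2.get_cuda_device_count() > 0 else 'cpu'
    return ctranslate2.Translator(model_path, device=device, compute_type='auto')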

Afterwards, just run the container with the NVIDIA runtime.

docker build -f Dockerfile.build -t nllb-api .
docker run --rm  --runtime=nvidia --gpus all \
  -e SERVER_PORT=5000 \
  -e APP_PORT=7860 \
  -e OMP_NUM_THREADS=6 \
  -e WORKER_COUNT=1 \
  -p 7860:7860 \
  -v ./cache:/home/user/.cache \
  nllb-api
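
To confirm CTranslate2 can actually see the GPU inside the container, here is a quick check (assuming the ctranslate2 package is importable in the container's Python):

import ctranslate2

# 0 means CTranslate2 cannot see any CUDA device and translation will run on the CPU
print(ctranslate2.get_cuda_device_count())

Running nvidia-smi inside the container should also list your GPU; the NVIDIA runtime mounts it in automatically.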

@online2311 (Author)

Awesome, thanks. I will try it.

@online2311 (Author)

Is there anything wrong with this? I don't see any GPU usage.


==========
== CUDA ==
==========

CUDA Version 11.8.0

Container image Copyright (c) 2016-2023, NVIDIA CORPORATION & AFFILIATES. All rights reserved.

This container image and its contents are governed by the NVIDIA Deep Learning Container License.
By pulling and using the container, you accept the terms and conditions of this license:
https://developer.nvidia.com/ngc/nvidia-deep-learning-container-license

A copy of this license is made available in this container at /NGC-DL-CONTAINER-LICENSE for your convenience.

2023-10-12 04:27:23,976 INFO supervisord started with pid 1
2023-10-12 04:27:24,979 INFO spawned: 'server' with pid 27
2023-10-12 04:27:24,980 INFO spawned: 'caddy' with pid 28
{"level":"info","ts":1697084845.0245798,"msg":"using provided configuration","config_file":"Caddyfile","config_adapter":"caddyfile"}
{"level":"info","ts":1697084845.0266504,"logger":"admin","msg":"admin endpoint started","address":"localhost:2019","enforce_origin":false,"origins":["//localhost:2019","//[::1]:2019","//127.0.0.1:2019"]}
{"level":"info","ts":1697084845.0269077,"logger":"tls.cache.maintenance","msg":"started background certificate maintenance","cache":"0xc00036d680"}
{"level":"info","ts":1697084845.0271118,"logger":"http.log","msg":"server running","name":"srv0","protocols":["h1","h2","h3"]}
{"level":"info","ts":1697084845.0271096,"logger":"tls","msg":"cleaning storage unit","description":"FileStorage:/home/user/.local/share/caddy"}
{"level":"error","ts":1697084845.027182,"msg":"unable to create folder for config autosave","dir":"/home/user/.config/caddy","error":"mkdir /home/user/.config: permission denied"}
{"level":"info","ts":1697084845.027194,"msg":"serving initial configuration"}
{"level":"info","ts":1697084845.0271883,"logger":"tls","msg":"finished cleaning storage units"}
 * /v1/cpu route found!
 * /v1/index route found!
 * /v1/translate route found!
 * /v2/index route found!
 * /v2/translate route found!
Fetching 9 files: 100%|██████████| 9/9 [00:00<00:00, 40372.98it/s]
2023-10-12 04:27:35,816 INFO success: server entered RUNNING state, process has stayed up for > than 10 seconds (startsecs)
2023-10-12 04:27:35,816 INFO success: caddy entered RUNNING state, process has stayed up for > than 10 seconds (startsecs)
[2023-10-12 04:27:39 +0000] [52] [INFO] Running on http://0.0.0.0:5000 (CTRL + C to quit)
2023-10-12 04:27:55.933192: E tensorflow/compiler/xla/stream_executor/cuda/cuda_dnn.cc:9342] Unable to register cuDNN factory: Attempting to register factory for plugin cuDNN when one has already been registered
2023-10-12 04:27:55.933287: E tensorflow/compiler/xla/stream_executor/cuda/cuda_fft.cc:609] Unable to register cuFFT factory: Attempting to register factory for plugin cuFFT when one has already been registered
2023-10-12 04:27:55.933397: E tensorflow/compiler/xla/stream_executor/cuda/cuda_blas.cc:1518] Unable to register cuBLAS factory: Attempting to register factory for plugin cuBLAS when one has already been registered
2023-10-12 04:27:57.223742: W tensorflow/compiler/tf2tensorrt/utils/py_utils.cc:38] TF-TRT Warning: Could not find TensorRT
[2023-10-12 04:27:58 +0000] [52] [INFO] 200 "POST /v2/translate 2" 127.0.0.1:33352 "curl/8.1.2"

@online2311 (Author)

Whether "WORKER_COUNT=2" supports running on two GPUs separately.

@winstxnhdw (Owner) commented Oct 12, 2023

Whoops, this is what I was afraid of. You will also have to replace the final image in Dockerfile.build with nvidia/cuda.

# Stage 1: install Python dependencies into the system site-packages
FROM python:3.11.6-slim as python-builder

ENV POETRY_VIRTUALENVS_CREATE false
ENV POETRY_HOME /opt/poetry
ENV PATH $POETRY_HOME/bin:$PATH

WORKDIR /

COPY pyproject.toml .

RUN apt update
RUN apt install -y curl
RUN curl -sSL https://install.python-poetry.org | python -
RUN poetry install --no-dev


# Stage 2: build Caddy with the cache-handler plugin
FROM caddy:builder-alpine as caddy-builder

RUN xcaddy build --with github.com/caddyserver/cache-handler


# Final stage: CUDA runtime image
FROM nvidia/cuda:11.7.1-runtime-ubuntu22.04

ENV HOME /home/user
ENV PYTHONUNBUFFERED 1
ENV PYTHONDONTWRITEBYTECODE 1

RUN useradd -m -u 1000 user

USER user

WORKDIR $HOME/app

COPY --chown=user --from=caddy-builder  /usr/bin/caddy /usr/bin/caddy
COPY --chown=user --from=python-builder /usr/local/    /usr/local/
COPY --chown=user . $HOME/app

CMD ["supervisord"]

WORKER_COUNT is for CPU threads only. If you want to use multiple GPUs, you'll have to add the device_index argument.

cls.translator = CTranslator(model_path, device='cuda', compute_type='auto', device_index=[0, 1])
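
With device_index=[0, 1], CTranslate2 loads one model replica per listed device and dispatches batches across them. Here is a minimal end-to-end sketch (the model path and tokens are placeholders; translate_batch and target_prefix are standard CTranslate2 API, and NLLB expects the target language code as a target prefix):

import ctranslate2

translator = ctranslate2.Translator(
    'models/nllb-200-distilled-600M',  # placeholder path to the converted model
    device='cuda',
    compute_type='auto',
    device_index=[0, 1],               # one model replica per GPU
)

results = translator.translate_batch(
    [['▁Hello', '▁world']],            # tokens from the model's SentencePiece tokenizer
    target_prefix=[['fra_Latn']],      # NLLB target language code
)
print(results[0].hypotheses[0])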

You can read more in the CTranslate2 documentation.

@online2311 (Author)

Here is the repaired Dockerfile; it now runs normally.

FROM caddy:builder-alpine as caddy-builder

RUN xcaddy build --with github.com/caddyserver/cache-handler


FROM nvidia/cuda:11.8.0-devel-ubuntu22.04
ENV POETRY_VIRTUALENVS_CREATE false
ENV POETRY_HOME /opt/poetry
ENV PATH $POETRY_HOME/bin:$PATH

WORKDIR /
RUN apt-get update && \
    apt-get install --no-install-recommends -y git curl vim python3-dev python3-pip && \
    rm -rf /var/lib/apt/lists/*

ENV HOME /home/user
ENV PYTHONUNBUFFERED 1
ENV PYTHONDONTWRITEBYTECODE 1

RUN pip3 install torch torchvision torchaudio tensorflow ctranslate2 tensorrt
RUN ln -s /usr/bin/python3 /usr/bin/python
RUN curl -sSL https://install.python-poetry.org | python -
COPY pyproject.toml .
RUN poetry install --no-dev
RUN useradd -m -u 1000 user
RUN chown -R user:user /home/user

USER user

WORKDIR $HOME/app

ENV LD_LIBRARY_PATH $LD_LIBRARY_PATH:/usr/local/lib/python3.10/dist-packages/tensorrt_libs
COPY --chown=user --from=caddy-builder  /usr/bin/caddy /usr/bin/caddy
COPY --chown=user . $HOME/app

ENV OMP_NUM_THREADS 4
ENV WORKER_COUNT 1
CMD ["supervisord"]

And the corresponding translator line:

cls.translator = CTranslator(model_path, device='cuda', compute_type='auto', device_index=[0, 1], inter_threads=Config.worker_count)

@winstxnhdw (Owner) commented Oct 13, 2023

There are a lot of redundancies, and your final image size is probably larger than it ever needs to be. I think the one I suggested should work fine.

@online2311 (Author)

That one was missing some things: the libraries and the Python modules. I patched it bit by bit following the error messages, and it became what it is now.

@winstxnhdw (Owner)

If you cloned the repository and built the Docker image from it, the correct libraries should be downloaded. Python will also be copied over from /usr/local/.

@online2311 (Author)

So what you mean is: put "pip3 install torch torchvision torchaudio tensorflow ctranslate2 tensorrt" into the python-builder image, and then copy the packages over to the nvidia/cuda:11.7.1-runtime-ubuntu22.04 image?

@winstxnhdw (Owner)

No. My Python builder stage should already install the required dependencies. These dependencies are found in /usr/local/lib/python3.11/*, which is copied over in the final build step. Therefore, you don't have to do anything.

@winstxnhdw (Owner)

I have officially tested and introduced CUDA support in commit e7214f3.

The image size will be a lot smaller than your approach. Cheers!

@online2311 (Author)

Awesome ~

@winstxnhdw winstxnhdw pinned this issue Dec 3, 2023