Adding files to deploy CodeGen application on AMD GPU #1130

Merged
45 commits
2449fc1
Add Docker Compose file, README file and set environments script for …
Nov 13, 2024
9f5b50c
Add Docker Compose file, README file and set environments script for …
Nov 13, 2024
295df7e
Add Docker Compose file, README file and set environments script for …
Nov 13, 2024
f056800
Fix set env script
Nov 13, 2024
feb1dda
Fix set env script
Nov 13, 2024
cdb7529
Fix README.md
Nov 13, 2024
4dfa522
Fix Docker compose file and set env script
Nov 13, 2024
e4108d9
Fix Docker compose file
Nov 13, 2024
32c9fe6
Fix Docker compose file
Nov 13, 2024
306fdc9
Fix set env script
Nov 13, 2024
066f693
Fix set env script
Nov 13, 2024
11de342
Fix Docker compose file
Nov 13, 2024
a4afc7f
Fix Docker compose file
Nov 13, 2024
7373c12
Fix Docker compose file
Nov 13, 2024
bf72c4c
Fix Docker compose file
Nov 13, 2024
1289dbc
Fix Docker compose file
Nov 13, 2024
ffcc178
Add test script for testing deploy application on AMD GPU
Nov 13, 2024
5e98c69
Fix test script for testing deploy application on AMD GPU
Nov 13, 2024
84f920f
Fix docker compose file
Nov 13, 2024
6c8fc86
Fix docker compose file
Nov 13, 2024
1e5726f
Fix test script test_compose_on_amd.sh
Nov 13, 2024
9c8393a
Fix test script test_compose_on_amd.sh
Nov 13, 2024
e5a7358
Fix test script test_compose_on_amd.sh
Nov 13, 2024
d697458
Fix test script test_compose_on_amd.sh
Nov 13, 2024
d75fceb
Fix test script test_compose_on_amd.sh
Nov 13, 2024
1e3d85f
Fix README.md and fix Docker compose file - removing GPU forwarding t…
Nov 14, 2024
94b5d37
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Nov 14, 2024
7be7333
Fix README.md and fix Docker compose file - removing GPU forwarding t…
Nov 14, 2024
03c7a4c
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Nov 14, 2024
92e3a09
Merge branch 'main' into feature/GenAIExamples_ROCm
chyundunovDatamonsters Nov 14, 2024
49f0295
Rename tests script for testing compose deploy CedeGen application
Nov 14, 2024
0e06bf4
Merge branch 'main' into feature/GenAIExamples_ROCm
chyundunovDatamonsters Nov 14, 2024
6541d88
Merge branch 'main' into feature/GenAIExamples_ROCm
chyundunovDatamonsters Nov 14, 2024
3cf9a3b
Fix CodeGen tests script
Nov 14, 2024
30aca8b
Merge branch 'main' into feature/GenAIExamples_ROCm
lvliang-intel Nov 15, 2024
6f63e30
Fix CodeGen - README.md, Docker Compose file, set envs script and tes…
Nov 18, 2024
2cbb860
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Nov 18, 2024
4e356b7
Fix CodeGen - enable UI test
Nov 18, 2024
6e2705b
Merge remote-tracking branch 'origin/feature/GenAIExamples_ROCm' into…
Nov 18, 2024
b89bdd4
Fix CodeGen - fix UI tests
Nov 18, 2024
af86089
Fix CodeGen - fix UI tests
Nov 18, 2024
edfbe7e
Fix CodeGen - fix UI tests
Nov 18, 2024
76760cb
Fix CodeGen - fix UI tests
Nov 18, 2024
ec279c2
Merge branch 'main' into feature/GenAIExamples_ROCm
chyundunovDatamonsters Nov 18, 2024
6a89c4e
Fix CodeGen - fix README.md
Nov 18, 2024
126 changes: 126 additions & 0 deletions CodeGen/docker_compose/amd/gpu/rocm/README.md
@@ -0,0 +1,126 @@
# Build and deploy CodeGen Application on AMD GPU (ROCm)

## Build images

### Build the LLM Docker Image

```bash
### Cloning repo
git clone https://github.com/opea-project/GenAIComps.git
cd GenAIComps

### Build Docker image
docker build -t opea/llm-tgi:latest --build-arg https_proxy=$https_proxy --build-arg http_proxy=$http_proxy -f comps/llms/text-generation/tgi/Dockerfile .
```

### Build the MegaService Docker Image

```bash
### Cloning repo
git clone https://github.com/opea-project/GenAIExamples
cd GenAIExamples/CodeGen

### Build Docker image
docker build -t opea/codegen:latest --build-arg https_proxy=$https_proxy --build-arg http_proxy=$http_proxy -f Dockerfile .
```

### Build the UI Docker Image

```bash
cd GenAIExamples/CodeGen/ui
### Build UI Docker image
docker build -t opea/codegen-ui:latest --build-arg https_proxy=$https_proxy --build-arg http_proxy=$http_proxy -f ./docker/Dockerfile .

### Build React UI Docker image (React UI allows you to use file uploads)
docker build --no-cache -t opea/codegen-react-ui:latest --build-arg https_proxy=$https_proxy --build-arg http_proxy=$http_proxy -f ./docker/Dockerfile.react .
```

Using the React UI is recommended because it supports working with files. Which UI the application uses is selected in the Docker Compose file.
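For example, switching the UI service to the React image built above is a one-line change in the compose file (a sketch; the service name matches compose.yaml in this PR — confirm that the React image serves on the same port before reusing the existing port mapping):

```yaml
services:
  codegen-ui-server:
    image: ${REGISTRY:-opea}/codegen-react-ui:${TAG:-latest}
```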

## Deploy CodeGen Application

### Features of Docker compose for AMD GPUs

1. GPU devices are forwarded to the TGI service container with the following directives:

```yaml
shm_size: 1g
devices:
- /dev/kfd:/dev/kfd
- /dev/dri/:/dev/dri/
cap_add:
- SYS_PTRACE
group_add:
- video
security_opt:
- seccomp:unconfined
```

In this case, all GPUs are passed through to the container. To pass through a specific GPU, use its device names cardN and renderDN.

For example:

```yaml
shm_size: 1g
devices:
- /dev/kfd:/dev/kfd
- /dev/dri/card0:/dev/dri/card0
- /dev/dri/renderD128:/dev/dri/renderD128
cap_add:
- SYS_PTRACE
group_add:
- video
security_opt:
- seccomp:unconfined
```

To find out which cardN and renderDN device IDs belong to the same GPU, use the GPU driver utilities (for example, `rocm-smi`) or inspect `/sys/class/drm`.
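On most systems, render nodes are numbered from 128 in the same order as the card nodes, so the conventional pairing can be sketched as follows (the helper is illustrative, and the pairing is a convention, not a guarantee — verify it on your host):

```shell
# Sketch: derive the conventional render-node name for a given card index.
# DRM render nodes start at minor number 128, so card0 usually pairs with
# renderD128, card1 with renderD129, and so on. Confirm with `ls /sys/class/drm`
# or `rocm-smi`, since the pairing is not guaranteed on every system.
card_to_render() {
  echo "renderD$((128 + $1))"
}

card_to_render 0   # prints renderD128
card_to_render 1   # prints renderD129
```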

### Go to the directory with the Docker compose file

```bash
cd GenAIExamples/CodeGen/docker_compose/amd/gpu/rocm
```

### Set environments

In the file `GenAIExamples/CodeGen/docker_compose/amd/gpu/rocm/set_env.sh`, set the required values. The meaning of each parameter is described in the comment above the corresponding variable.

```bash
chmod +x set_env.sh
. set_env.sh
```
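A quick sanity check after sourcing the script helps catch unset variables before the stack starts. A minimal sketch (the variable names mirror set_env.sh; `require_env` is an illustrative helper, not part of the repository):

```shell
# Hypothetical helper: fail fast if a required variable is empty or unset.
require_env() {
  # ${!1} is bash indirect expansion: the value of the variable named by $1.
  if [ -z "${!1}" ]; then
    echo "ERROR: $1 is not set" >&2
    return 1
  fi
}

# Check the variables the compose file relies on (names from set_env.sh):
missing=0
for v in HOST_IP CODEGEN_TGI_SERVICE_PORT CODEGEN_HUGGINGFACEHUB_API_TOKEN \
         CODEGEN_LLM_MODEL_ID CODEGEN_BACKEND_SERVICE_PORT; do
  require_env "$v" || missing=1
done
[ "$missing" -eq 0 ] || echo "Source set_env.sh before running docker compose." >&2
```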

### Run services

```bash
docker compose up -d
```
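Startup can take a while on the first run because the model is downloaded. A hedged sketch of a readiness poll (TGI exposes a `GET /health` endpoint; the `wait_ready` helper and retry counts are illustrative):

```shell
# Hypothetical helper: poll a URL until it answers, or give up after N attempts.
wait_ready() {
  local url=$1 tries=${2:-60}
  local i
  for i in $(seq 1 "$tries"); do
    if curl -sf -o /dev/null "$url"; then
      return 0
    fi
    sleep 2
  done
  return 1
}

# Example: wait for the TGI service before sending validation requests.
# wait_ready "http://${HOST_IP}:${CODEGEN_TGI_SERVICE_PORT}/health" && echo "TGI is up"
```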

## Validate the MicroServices and MegaService

### Validate the TGI Service

```bash
curl http://${HOST_IP}:${CODEGEN_TGI_SERVICE_PORT}/generate \
-X POST \
-d '{"inputs":"Implement a high-level API for a TODO list application. The API takes as input an operation request and updates the TODO list in place. If the request is invalid, raise an exception.","parameters":{"max_new_tokens":256, "do_sample": true}}' \
-H 'Content-Type: application/json'
```
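The prompt above is simple, but a prompt containing double quotes would break a hand-written `-d` string. One way to build the payload safely (the field names follow the TGI request above; the helper itself is a hypothetical sketch):

```shell
# Hypothetical helper: wrap an arbitrary prompt in the TGI /generate payload,
# escaping backslashes and double quotes so the result stays valid JSON.
tgi_payload() {
  local prompt
  prompt=$(printf '%s' "$1" | sed -e 's/\\/\\\\/g' -e 's/"/\\"/g')
  printf '{"inputs":"%s","parameters":{"max_new_tokens":256,"do_sample":true}}' "$prompt"
}

# Usage: curl ... -d "$(tgi_payload "$PROMPT")"
tgi_payload 'Write a "hello world" function'
```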

### Validate the LLM Service

```bash
curl http://${HOST_IP}:${CODEGEN_LLM_SERVICE_PORT}/v1/chat/completions \
-X POST \
-d '{"query":"Implement a high-level API for a TODO list application. The API takes as input an operation request and updates the TODO list in place. If the request is invalid, raise an exception.","max_tokens":256,"top_k":10,"top_p":0.95,"typical_p":0.95,"temperature":0.01,"repetition_penalty":1.03,"streaming":true}' \
-H 'Content-Type: application/json'
```

### Validate the MegaService

```bash
curl http://${HOST_IP}:${CODEGEN_BACKEND_SERVICE_PORT}/v1/codegen -H "Content-Type: application/json" -d '{
"messages": "Implement a high-level API for a TODO list application. The API takes as input an operation request and updates the TODO list in place. If the request is invalid, raise an exception."
}'
```
78 changes: 78 additions & 0 deletions CodeGen/docker_compose/amd/gpu/rocm/compose.yaml
@@ -0,0 +1,78 @@
# Copyright (C) 2024 Intel Corporation
# SPDX-License-Identifier: Apache-2.0

services:
codegen-tgi-service:
image: ghcr.io/huggingface/text-generation-inference:2.3.1-rocm
container_name: codegen-tgi-service
ports:
- "${CODEGEN_TGI_SERVICE_PORT:-8028}:80"
volumes:
- "/var/lib/GenAI/data:/data"
environment:
no_proxy: ${no_proxy}
http_proxy: ${http_proxy}
https_proxy: ${https_proxy}
HUGGING_FACE_HUB_TOKEN: ${CODEGEN_HUGGINGFACEHUB_API_TOKEN}
HUGGINGFACEHUB_API_TOKEN: ${CODEGEN_HUGGINGFACEHUB_API_TOKEN}
shm_size: 1g
devices:
- /dev/kfd:/dev/kfd
- /dev/dri/:/dev/dri/
cap_add:
- SYS_PTRACE
group_add:
- video
security_opt:
- seccomp:unconfined
ipc: host
command: --model-id ${CODEGEN_LLM_MODEL_ID} --max-input-length 1024 --max-total-tokens 2048
codegen-llm-server:
image: ${REGISTRY:-opea}/llm-tgi:${TAG:-latest}
container_name: codegen-llm-server
depends_on:
- codegen-tgi-service
ports:
- "${CODEGEN_LLM_SERVICE_PORT:-9000}:9000"
ipc: host
environment:
no_proxy: ${no_proxy}
http_proxy: ${http_proxy}
https_proxy: ${https_proxy}
TGI_LLM_ENDPOINT: "http://codegen-tgi-service"
HUGGINGFACEHUB_API_TOKEN: ${CODEGEN_HUGGINGFACEHUB_API_TOKEN}
restart: unless-stopped
codegen-backend-server:
image: ${REGISTRY:-opea}/codegen:${TAG:-latest}
container_name: codegen-backend-server
depends_on:
- codegen-llm-server
ports:
- "${CODEGEN_BACKEND_SERVICE_PORT:-7778}:7778"
environment:
no_proxy: ${no_proxy}
https_proxy: ${https_proxy}
http_proxy: ${http_proxy}
MEGA_SERVICE_HOST_IP: ${CODEGEN_MEGA_SERVICE_HOST_IP}
LLM_SERVICE_HOST_IP: "codegen-llm-server"
ipc: host
restart: always
codegen-ui-server:
image: ${REGISTRY:-opea}/codegen-ui:${TAG:-latest}
container_name: codegen-ui-server
depends_on:
- codegen-backend-server
ports:
- "${CODEGEN_UI_SERVICE_PORT:-5173}:5173"
environment:
no_proxy: ${no_proxy}
https_proxy: ${https_proxy}
http_proxy: ${http_proxy}
BASIC_URL: ${CODEGEN_BACKEND_SERVICE_URL}
BACKEND_SERVICE_ENDPOINT: ${CODEGEN_BACKEND_SERVICE_URL}
ipc: host
restart: always

networks:
default:
driver: bridge
37 changes: 37 additions & 0 deletions CodeGen/docker_compose/amd/gpu/rocm/set_env.sh
@@ -0,0 +1,37 @@
#!/usr/bin/env bash

# Copyright (C) 2024 Intel Corporation
# SPDX-License-Identifier: Apache-2.0

### The IP address or domain name of the server on which the application is running
export HOST_IP=direct-supercomputer1.powerml.co

### The port of the TGI service. On this port, the TGI service will accept connections
export CODEGEN_TGI_SERVICE_PORT=8028

### A token for accessing repositories with models
export CODEGEN_HUGGINGFACEHUB_API_TOKEN="your_huggingfacehub_api_token"

### Model ID
export CODEGEN_LLM_MODEL_ID="Qwen/Qwen2.5-Coder-7B-Instruct"

### The port of the LLM service. On this port, the LLM service will accept connections
export CODEGEN_LLM_SERVICE_PORT=9000

### The endpoint of the TGI service to which requests to this service will be sent (formed from previously set variables)
export CODEGEN_TGI_LLM_ENDPOINT="http://${HOST_IP}:${CODEGEN_TGI_SERVICE_PORT}"

### The IP address or domain name of the server for CodeGen MegaService
export CODEGEN_MEGA_SERVICE_HOST_IP=${HOST_IP}

### The port for CodeGen backend service
export CODEGEN_BACKEND_SERVICE_PORT=18150

### The URL of CodeGen backend service, used by the frontend service
export CODEGEN_BACKEND_SERVICE_URL="http://${HOST_IP}:${CODEGEN_BACKEND_SERVICE_PORT}/v1/codegen"

### The IP address or domain name of the host on which the LLM service is running
export CODEGEN_LLM_SERVICE_HOST_IP=${HOST_IP}

### The CodeGen service UI port
export CODEGEN_UI_SERVICE_PORT=18151