# BrainHack TIL-24 Semis + Finals README
You should see that the `til-24-base` repository now contains 3 additional directories:

- `main`, with template code for your main orchestration participant server to be deployed in the finals,
- `simulator`, with a testing version of the competition server simulator (you do NOT need to touch this folder at all), and
- `autonomy`, with template code for your robotics autonomy code.
Note also that there are several new files in the top-level directory:

- `Dockerfile` for the test competition server,
- `docker-compose.yml` for connecting to your own competition server service for testing,
- `docker-compose-finals.yml` for connecting to the actual competition server (and thus not using your own competition server service),
- `test_competition_server.py`, the code for your testing version of the competition server, and
- an updated version of the `.env.example` file, which now includes additional variables for handling various networking configurations.
Take note that the `.env` file will be loaded by `docker compose` and used to populate the environment variables the containers have access to. As such, be sure to update your `.env` with whichever condition you're testing. On Linux machines, the `host.docker.internal` hostname does not exist; instead, `172.17.0.1` is the IP address of the host from inside the container.
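For instance, here is a minimal sketch of how a service inside a container might pick these variables up; the variable names follow `.env.example`, while the fallback defaults are our own illustrative assumption, not part of the template:

```python
import os

# COMPETITION_IP and LOCAL_IP are populated by docker compose from your .env;
# the fallbacks below are illustrative assumptions only.
COMPETITION_IP = os.getenv("COMPETITION_IP", "host.docker.internal")
LOCAL_IP = os.getenv("LOCAL_IP", "host.docker.internal")

print(f"Competition server expected at {COMPETITION_IP}")
```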
The simulator is written in JavaScript and uses three.js to render the 3D environment of the turret. The environment is rendered as a cubemap surrounding the camera, which is positioned at the center of the cube and can rotate to look around.
From here on, you will need to have a local development environment set up for your own testing. This local development environment should have:

- Python 3 (and, optional but highly recommended, an environment management system like `conda`)
- Docker
- A modern web browser (e.g. Chrome, Firefox)
You will have to build two additional Docker images:

- `autonomy`, containing your robotics autonomy code responsible for controlling your Robomaster turret, and
- `main`, containing your main server orchestration code for the semi-finals + finals.
You will have to push these images to Artifact Registry (note: NOT Model Registry!), and these will be taken as your submissions along with your final model versions. Although as a semi-finalist/finalist you will continue to have access to your Online Development Environment on GCP, you will likely want to set up the `gcloud` CLI on your local development environment as well, so that you can push containers from the same platform you test them on; see the installation instructions below.
IMPORTANT NOTE: You should also run all your models simultaneously on your T4 instance on GCP to ensure that they will all fit into the VRAM you will have access to during the semi-finals + finals (16GB of VRAM). Note that for both the testing and the finals, the network will be a LAN setup disconnected from the internet. As such, you must ensure that your images are able to be run offline.
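One rough way to sanity-check the VRAM budget (a sketch on our part, assuming PyTorch is installed on the instance; `nvidia-smi` on the host reports the same numbers) is to query device memory while all your services are loaded:

```python
import torch

# Run while ALL your model services are loaded. mem_get_info reports
# device-wide (free, total) bytes, so it captures every process on the GPU.
free, total = torch.cuda.mem_get_info(0)
print(f"VRAM in use: {(total - free) / 1024**3:.1f} GiB "
      f"of {total / 1024**3:.1f} GiB")
# You want this comfortably within the 16GB you'll have at the finals.
```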
Participants should submit all 5 images, tagged in the format `[team-name]-[service]:finals`, to the GCP Artifact Registry by 3 June 2024, 12:00HRS. The 5 images are: ASR, NLP, autonomy, VLM, and main.
Example of image tagging:
- team-name-asr:finals
- team-name-nlp:finals
- team-name-vlm:finals
- team-name-autonomy:finals
- team-name-main:finals
```bash
# build container
docker build -t TEAM-NAME-autonomy .
# run container
docker run -p 5003:5003 TEAM-NAME-autonomy
# run container with GPU
docker run -p 5001:5001 --gpus all -d TEAM-NAME-asr
# view all running containers
docker ps
# view ALL containers, whether running or not
docker ps -a
# stop running container
docker kill CONTAINER-ID
# tag built image to be pushed to artifact registry
docker tag TEAM-NAME-asr asia-southeast1-docker.pkg.dev/dsta-angelhack/repository-TEAM-NAME/TEAM-NAME-asr:finals
# push built image to artifact registry
docker push asia-southeast1-docker.pkg.dev/dsta-angelhack/repository-TEAM-NAME/TEAM-NAME-asr:finals
```
When your model is pushed successfully, you should be able to see it under your team's repository on Artifact Registry.
To run all the containers, we will be using Docker Compose, a tool that lets you easily configure and run multi-container applications. Note that you will likely have to install it if it is not already installed (e.g. it may not be installed on the GCP instance); see the Docker Compose plugin installation instructions.
```bash
# start all the services
docker compose up
# force a build of all services and start them afterwards
docker compose up --build
# take down all the services
docker compose down
# start a particular docker compose file by name (it defaults to `docker-compose.yml` if not indicated)
docker compose -f docker-compose-finals.yml up
```
Create an `.env` file based on the provided `.env.example` file, and update it accordingly (see the example sketch after this list):

- `COMPETITION_IP`: `"172.17.0.1"` on Linux, `"host.docker.internal"` otherwise
- `LOCAL_IP`: `"172.17.0.1"` on Linux, `"host.docker.internal"` otherwise
- `USE_ROBOT`: `"false"`
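As a point of reference, a `.env` for a Linux machine testing against the local competition server might look like the sketch below; any other variables from `.env.example` should be kept as provided:

```
COMPETITION_IP="172.17.0.1"
LOCAL_IP="172.17.0.1"
USE_ROBOT="false"
```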
Then run `docker compose up`. This should start the competition server locally, as well as the rest of the services, configured to connect to it.
Note that the testing setup will be a LAN disconnected from the internet. As such, you must ensure that your images are able to be run offline (i.e. have your model weights packaged in your image through your Dockerfile).
- Your Docker images should be pre-loaded onto your desktop server hardware from Artifact Registry; verify that this is correct.
- Verify the `.env` settings used to connect to our competition server (these should be correct, since we'll be setting this up for each machine in advance).
- With either our `docker-compose-finals.yml` or your own custom `docker-compose.yml`, run `docker compose up`.
- We'll run the example test cases through your server, simulating the semi-finals, and you can see how it all performs.
- You can make any changes you need to during this time, but be sure to push them to Artifact Registry by the end of your session!
Note that the finals setup will be a LAN disconnected from the internet. As such, you must ensure that your images are able to be run offline (i.e. have your model weights packaged in your image through your Dockerfile).
- Your Docker images should be pre-loaded onto your desktop server hardware from Artifact Registry; verify that this is correct.
- Verify the `.env` settings used to connect to our competition server (these should be correct, since we'll be setting this up for each machine in advance).
- With your `docker-compose.yml` (which could be based on our `docker-compose-finals.yml`), run `docker compose up`.
- We run our test cases through your server in a live stage setting, and you see whether you advance/win.
To install the `gcloud` CLI, see the installation docs provided by GCP.

Then, run `gcloud init`, which should open a browser window for you to grant permissions for your user account. If prompted for your project id, input `dsta-angelhack`. If prompted for your region, input `asia-southeast1`. If prompted for your zone, input the zone of your team's instance (it should be of the form `asia-southeast1-x`, where `x` is any of `a`, `b`, or `c`).
Then run the following, making sure to replace `TEAM-NAME` with your team name.
```bash
gcloud auth configure-docker asia-southeast1-docker.pkg.dev -q
gcloud config set artifacts/location asia-southeast1
gcloud config set artifacts/repository repository-TEAM-NAME
```
You should then be able to push to your team's Artifact Registry repository, the same as during the qualifiers:
```bash
docker tag TEAM-NAME-asr asia-southeast1-docker.pkg.dev/dsta-angelhack/repository-TEAM-NAME/TEAM-NAME-asr:finals
docker push asia-southeast1-docker.pkg.dev/dsta-angelhack/repository-TEAM-NAME/TEAM-NAME-asr:finals
```
For some reason, the Python `robomaster` SDK provided by DJI only has pre-built wheels for Python 3.8 exactly. As such, your `autonomy` container likely needs to use exactly that Python version. The existing template code uses a base image with a functioning version of Python, but if you change the base image, you will have to take note of this possible issue.
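If you do swap the base image, a minimal sketch along these lines keeps the wheel resolvable; note that the `python:3.8-slim` tag, file layout, and entrypoint here are illustrative assumptions on our part, not the template's actual contents:

```dockerfile
# Pin Python 3.8 exactly so the robomaster wheel can be installed.
FROM python:3.8-slim

WORKDIR /app
COPY requirements.txt .
# robomaster only ships wheels for Python 3.8; this would fail on a newer base.
RUN pip install -r requirements.txt
COPY . .
# Entrypoint is hypothetical; use whatever your autonomy template defines.
CMD ["python3", "autonomy.py"]
```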
For some reason, the Python `robomaster` SDK provided by DJI also only has pre-built wheels for the `x86` architecture, entirely ignoring `arm64` wheels for Macs on Apple Silicon. There is a workaround available using Rosetta and `conda` (see this GitHub issue):

First, install `conda` from https://conda.io/projects/conda/en/latest/user-guide/install/macos.html and then run the following commands:
```bash
CONDA_SUBDIR=osx-64 conda create -n robo python=3.8.10
conda activate robo
python -c "import platform;print(platform.machine())" # x86_64
conda config --env --set subdir osx-64 # set future package installs to also use osx-64 over arm64
pip install robomaster
```
This will create a new Python 3.8 environment `robo` and configure it to use `x86` packages via Rosetta instead of native `arm64` ones.
Some HuggingFace models, such as the DETR model referenced in the training curriculum, encounter issues when trying to load a finetuned model offline, as they rely on backbone weights from the HuggingFace Hub that are not stored in the model when it is saved using the `save_pretrained()` function.
One possible solution to this problem is to run a script that loads the models once while building the Docker image. This caches whatever backbone weights are necessary, so subsequent loads of the model will just use the cached version instead of trying to access the HuggingFace Hub (which won't be possible without internet access).
Here's an example of what such a script, `save_cache.py`, could look like:
```python
from transformers import (
    AutoImageProcessor,
    AutoModelForObjectDetection,
)

# Loading the finetuned checkpoint once at build time pulls in and caches
# any backbone weights fetched from the HuggingFace Hub.
detr_checkpoint = "models/detr_model.pth"
detr_model = AutoModelForObjectDetection.from_pretrained(detr_checkpoint)
detr_processor = AutoImageProcessor.from_pretrained(detr_checkpoint)
```
Then, in your Dockerfile, add these lines after the `pip install -r requirements.txt` step and after copying over the necessary files (especially the model file):
```dockerfile
COPY save_cache.py save_cache.py
RUN python3 save_cache.py
```
This will copy and run `save_cache.py`, which saves the model backbone weights to the `.cache` directory in the image, allowing them to be loaded offline by HuggingFace Transformers later on.
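As an optional extra guard (our suggestion, not something the template requires), you can also set Transformers' offline flag in the Dockerfile, so that any weights missing from the cache fail fast at test time instead of during the finals:

```dockerfile
# Optional: forbid HuggingFace Transformers from attempting network access.
ENV TRANSFORMERS_OFFLINE=1
```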