Docker Scaffolding #3

Merged 12 commits on Jul 30, 2024

8 changes: 8 additions & 0 deletions .env.TEMPLATE
@@ -0,0 +1,8 @@
DEFAULT_ENV=dev

# if 0, doesn't open a browser to the frontend webapp on a normal stack launch
DO_OPEN_BROWSER=1

POSTGRES_USER=molevolvr
POSTGRES_PASSWORD=
POSTGRES_DB=molevolvr
1 change: 1 addition & 0 deletions .gitignore
@@ -0,0 +1 @@
/.env
44 changes: 44 additions & 0 deletions README.md
@@ -0,0 +1,44 @@
# MolEvolvR Stack

This repo contains the implementation of the MolEvolvR stack, i.e.:
- `app`: the frontend webapp, written in React
- `backend`: a backend written in [Plumber](https://www.rplumber.io/index.html)
- `cluster`: the containerized SLURM "cluster" on which jobs are run
- `services`: a collection of services on which the stack relies:
  - `postgres`: configuration for a PostgreSQL database, which stores job information

Most of the data processing is accomplished via the `MolEvolvR` package, which
is currently available at https://github.com/JRaviLab/molevol_scripts. The stack
simply provides a user-friendly interface for accepting and monitoring the
progress of jobs, and orchestrates running the jobs on SLURM. The jobs
themselves call methods of the package at each stage of processing.

## Running the Stack in Development

To run the stack, you'll need to [install Docker and Docker Compose](https://www.docker.com/).

First, copy `.env.TEMPLATE` to `.env` and fill in the necessary values. You
should supply a random password for the `POSTGRES_PASSWORD` variable. Of note
is the `DEFAULT_ENV` variable, which gives `run_stack.sh` a default environment
in which to operate; in development, this should be set to `dev`.
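
As a minimal sketch of this step, the following bash helper copies the template and fills in a random password. The function name `init_env` is hypothetical and not part of this repo; it assumes the template contains an empty `POSTGRES_PASSWORD=` line, as `.env.TEMPLATE` does.

```bash
#!/usr/bin/env bash
# Hypothetical helper (not part of this repo): create .env from .env.TEMPLATE
# and fill in a random POSTGRES_PASSWORD.
init_env() {
  local dir="${1:-.}"
  local template="$dir/.env.TEMPLATE"
  local target="$dir/.env"

  if [ ! -f "$template" ]; then
    echo "no .env.TEMPLATE found in $dir" >&2
    return 1
  fi

  # never clobber an existing .env, which may already hold a real password
  if [ -f "$target" ]; then
    echo "$target already exists; leaving it untouched" >&2
    return 0
  fi

  # 16 random bytes from the kernel, printed as 32 hex characters
  local password
  password="$(od -An -N16 -tx1 /dev/urandom | tr -d ' \n')"

  # copy the template, replacing only the empty POSTGRES_PASSWORD= line
  sed "s/^POSTGRES_PASSWORD=\$/POSTGRES_PASSWORD=${password}/" "$template" > "$target"
  chmod 600 "$target"
}
```

From the repo root, `init_env .` would create `.env` once; running it again leaves an existing file untouched.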

Then, you can run the following command to bring up the stack:

```bash
./run_stack.sh
```

This will start the stack in development mode, which automatically reloads the
backend or frontend when there are changes to their source.

You should then be able to access the frontend at `http://localhost:5713`.

## Production

To run the stack in production, you can run the following:

```bash
./run_stack.sh prod
```

This will start the stack in production mode.
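
A rough sketch of what the environment argument might select, assuming `dev` layers `docker-compose.override.yml` and `prod` layers `docker-compose.prod.yml` on top of `docker-compose.yml`. The function name `compose_args_for_env` is hypothetical; the actual logic lives in `run_stack.sh` and may differ.

```bash
#!/usr/bin/env bash
# Assumed behaviour, not the actual contents of run_stack.sh.
# Prints the `-f` arguments for the requested environment, falling back to
# DEFAULT_ENV (and then to "dev") when no argument is given.
compose_args_for_env() {
  local env="${1:-${DEFAULT_ENV:-dev}}"
  case "$env" in
    dev)  echo "-f docker-compose.yml -f docker-compose.override.yml" ;;
    prod) echo "-f docker-compose.yml -f docker-compose.prod.yml" ;;
    *)    echo "unknown environment: $env" >&2; return 1 ;;
  esac
}

# example invocation (requires Docker), left commented out; the command
# substitution is intentionally unquoted so each argument is split out:
# docker compose $(compose_args_for_env prod) up --build
```

Keeping the mapping in one function means an unrecognised environment name fails early instead of silently starting the development stack.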
1 change: 1 addition & 0 deletions app/.dockerignore
@@ -0,0 +1 @@
node_modules
50 changes: 50 additions & 0 deletions app/Dockerfile
@@ -0,0 +1,50 @@
# from https://bun.sh/guides/ecosystem/docker, with modifications
# to run a hot-reloading development server

# use the official Bun image
# see all versions at https://hub.docker.com/r/oven/bun/tags
FROM oven/bun:1 AS base
WORKDIR /app


# -----------------------------------------------------------
# install dependencies for dev and prod into temp directories
FROM base AS install

COPY package.json bun.lockb /temp/dev/
RUN cd /temp/dev/ && \
    bun install --frozen-lockfile

# FA: the production-only install is currently commented out since we always
# require the dev dependencies, specifically vite, to run *or* build the app.
# i'm leaving it here because perhaps someday we'll think of a reason why we
# want just the production dependencies.

# # install with --production (exclude devDependencies)
# COPY package.json bun.lockb /temp/prod/
# RUN cd /temp/prod && \
# bun install --frozen-lockfile --production

# -----------------------------------------------------------
# copy node_modules from dev stage, copy entire app
# source into the image
FROM base AS dev
COPY --from=install /temp/dev/node_modules node_modules
COPY . .
# run the app in hot-reloading development mode
# set up vite to accept connections on any interface, e.g. from outside the
# container, and to always run on port 5713
CMD [ "vite", "--host", "--port", "5713" ]


# -----------------------------------------------------------
# copy dependencies (from the dev install, since vite is needed to build) and
# source code into the final image
FROM base AS release
COPY --from=install /temp/dev/node_modules node_modules
COPY . .

# produce a bundle that'll then be served via a reverse http proxy, e.g. nginx
# (you'll want /app/dist to be mapped to a volume that's served by the reverse
# http proxy)
CMD [ "vite", "build" ]

Binary file modified app/bun.lockb
Binary file not shown.
4 changes: 3 additions & 1 deletion app/package.json
@@ -18,6 +18,7 @@
"@radix-ui/react-slider": "^1.1.2",
"@radix-ui/react-tabs": "^1.0.4",
"@radix-ui/react-tooltip": "^1.0.7",
"@tanstack/react-query": "^5.36.2",
"@tanstack/react-table": "^8.15.3",
"classnames": "^2.5.1",
"csv-stringify": "^6.4.6",
@@ -34,7 +35,8 @@
"react-time-ago": "^7.3.1",
"react-to-text": "^2.0.1",
"react-use": "^17.5.0",
-"use-debounce": "^10.0.0"
+"use-debounce": "^10.0.0",
+"use-query-params": "^2.2.1"
},
"devDependencies": {
"@ianvs/prettier-plugin-sort-imports": "^4.2.1",
42 changes: 42 additions & 0 deletions backend/docker/Dockerfile
@@ -0,0 +1,42 @@
# syntax=docker/dockerfile:1.7

# this Dockerfile should be used with the ./backend/ folder as the context
# and ./backend/docker/Dockerfile as the dockerfile

FROM rocker/tidyverse:4.3

# install ccache, a compiler cache
RUN apt-get update && apt-get install -y ccache

# install some common cmd line tools
RUN apt-get update && apt-get install -y curl

# acquire drip, a plumber auto-reloader, and install
ENV DRIP_URL="https://rdrip.netlify.app/builds/drip_0.1.0_linux_amd64.zip"
RUN mkdir -p /tmp/software/ && \
    wget -L -O /tmp/software/drip.zip ${DRIP_URL} && \
    unzip /tmp/software/drip.zip -d /tmp/software && \
    mv /tmp/software/drip /usr/local/bin && \
    chmod +x /usr/local/bin/drip

# acquire atlas, a schema manager
RUN curl -sSf https://atlasgo.sh | sh

# configure ccache env vars
ENV PATH="/usr/lib/ccache:${PATH}"
ENV CCACHE_DIR="/tmp/ccache"

# install dependencies into the image
COPY ./docker/install.R /tmp/install.r
RUN Rscript /tmp/install.r

# RUN --mount=type=cache,target=/usr/local/lib/R/site-library \
# Rscript /tmp/install.r

WORKDIR /app

# copy the app into the image
COPY . /app

# run the app
CMD ["/app/launch_api.sh"]
11 changes: 11 additions & 0 deletions backend/docker/install.R
@@ -0,0 +1,11 @@
# install packages depended on by the molevolvr API server
install.packages(
  c(
    "plumber",    # REST API framework
    "DBI",        # Database interface
    "RPostgres",  # PostgreSQL-specific impl. for DBI
    "dbplyr",     # dplyr for databases
    "box"         # allows R files to be referenced as modules
  ),
  Ncpus = 6
)
7 changes: 7 additions & 0 deletions backend/dummy.R
@@ -0,0 +1,7 @@
# Load the plumber package
library(plumber)

#* @get /
function() {
  "Hello, world!"
}
25 changes: 25 additions & 0 deletions backend/entrypoint.R
@@ -0,0 +1,25 @@
options(box.path = "/app")

box::use(
  plumber[plumb],
  server/tcp_utils[wait_for_port]
)

# receive the target port as the env var API_PORT, or 9050 if unspecified
target_port <- as.integer(Sys.getenv("API_PORT", unset=9050))

# workaround for https://github.com/siegerts/drip/issues/3, in which
# reloading fails due to the port being in use. we just wait, polling
# occasionally, for up to 60 seconds for the port to become free.
if (wait_for_port(target_port, poll_interval = 1, verbose = TRUE)) {
  pr <- plumb("./dummy.R")$run(
    host="0.0.0.0",
    port=target_port,
    debug=TRUE
  )
} else {
  # note: `else` must follow the closing brace on the same line, since a
  # top-level `else` on its own line is a syntax error in R
  stop(
    paste0("Failed to start the API server; port ", target_port, " still occupied after wait timeout exceeded")
  )
}
4 changes: 4 additions & 0 deletions backend/launch_api.sh
@@ -0,0 +1,4 @@
#!/bin/bash

# pass off to drip to control serving and reloading the API
drip
41 changes: 41 additions & 0 deletions backend/server/tcp_utils.R
@@ -0,0 +1,41 @@
#' Utility functions for working with TCP ports

#' Check if a port is in use
#' @param port The port to check
#' @param host The IP for which to check the port
#' @return TRUE if the port is in use, FALSE otherwise
is_port_in_use <- function(port, host = "127.0.0.1") {
  connection <- try(suppressWarnings(socketConnection(host = host, port = port, timeout = 1, open = "r+")), silent = TRUE)
  if (inherits(connection, "try-error")) {
    return(FALSE)  # Port is not in use
  } else {
    close(connection)
    return(TRUE)  # Port is in use
  }
}

#' Wait for a port to become free
#' @param port The port to wait for
#' @param timeout The maximum time to wait in seconds
#' @param poll_interval The interval between checks in seconds
#' @param host The IP for which to check the port
#' @param verbose Whether to print messages to the console
#' @return TRUE if the port is free, FALSE if the timeout is reached
wait_for_port <- function(port, timeout = 60, poll_interval = 5, host = "127.0.0.1", verbose = TRUE) {
  start_time <- Sys.time()
  end_time <- start_time + timeout

  while (Sys.time() < end_time) {
    if (!is_port_in_use(port, host)) {
      if (verbose) { cat("Port", port, "is now free\n") }
      return(TRUE)
    }
    if (verbose) { cat("Port", port, "is in use. Checking again in", poll_interval, "seconds...\n") }
    Sys.sleep(poll_interval)
  }

  if (verbose) {
    cat(paste0("Timeout of ", timeout, "s reached, but port ", port, " is still in use, aborting\n"))
  }
  return(FALSE)
}
4 changes: 4 additions & 0 deletions cluster/README.md
@@ -0,0 +1,4 @@
# MolEvolvR Cluster

This folder will eventually contain Dockerfiles for building images for a SLURM
controller and worker nodes.
36 changes: 36 additions & 0 deletions docker-compose.override.yml
@@ -0,0 +1,36 @@
services:
  backend:
    volumes:
      - ./backend/:/app/
      # - ./backend/api/:/app/api/
      # - ./backend/schema/:/app/schema/
      # - ./backend/entrypoint.R:/app/entrypoint.R
      # - ./backend/run_tests.sh:/app/run_tests.sh
    ports:
      - "9050:9050"
    environment:
      - "POSTGRES_DEV_HOST=dev-db"
      - "PLUMBER_DEBUG=1"
    depends_on:
      - "dev-db"

  app:
    build:
      context: ./app
      target: dev
    volumes:
      - ./app/src:/app/src
    environment:
      - 'VITE_API=http://localhost:9050'
    ports:
      - "5713:5713"

  db:
    ports:
      - "5460:5432"

  # used by atlas to create migrations
  dev-db:
    image: postgres:16
    env_file:
      - .env
28 changes: 28 additions & 0 deletions docker-compose.prod.yml
@@ -0,0 +1,28 @@
volumes:
  app_bundle:
  caddy_data:
  caddy_config:

services:
  app:
    image: molevolvr-frontend
    build:
      context: ./app
      target: release
    volumes:
      - app_bundle:/app/dist
    depends_on:
      - backend

  caddy:
    image: caddy:2
    ports:
      - "80:80"
      - "443:443"
    volumes:
      - app_bundle:/srv
      - ./services/caddy/Caddyfile:/etc/caddy/Caddyfile
      - caddy_data:/data
      - caddy_config:/config
    depends_on:
      - app
32 changes: 32 additions & 0 deletions docker-compose.yml
@@ -0,0 +1,32 @@
services:
  backend:
    image: molevolvr-backend
    platform: linux/amd64
    build:
      context: ./backend
      dockerfile: ./docker/Dockerfile
    env_file:
      - .env
    environment:
      - API_PORT=9050
      - "POSTGRES_HOST=db"
    depends_on:
      db:
        condition: service_healthy

  app:
    image: molevolvr-frontend
    build: ./app
    depends_on:
      - backend

  db:
    image: postgres:16
    env_file:
      - .env
    healthcheck:
      test: ["CMD-SHELL", "sh -c 'pg_isready -U ${POSTGRES_USER} -d ${POSTGRES_DB}'"]
      interval: 30s
      timeout: 60s
      retries: 5
      start_period: 80s