CaeNDR is the code used to run the Caenorhabditis elegans Natural Diversity Resource website.
Ask in MS Teams for the DevOps service-account JSON file. Create a folder named ~/.gcp in your home directory and copy the service-account JSON file into it.
$ mkdir ~/.gcp
open -a Finder ~/.gcp
The last line should open macOS Finder on the ~/.gcp/ folder. Drop the .json service account file there.
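As an optional sanity check, the sketch below reports whether the folder already holds a key file. The function name `has_gcp_key` is a hypothetical helper introduced here for illustration; it is not part of the project.

```shell
# Hypothetical helper (not part of CaeNDR): succeeds if the given folder
# (default: ~/.gcp) directly contains at least one regular .json file.
has_gcp_key() {
  dir="${1:-$HOME/.gcp}"
  # find prints one line per match; an empty result means no key file.
  [ -n "$(find "$dir" -maxdepth 1 -type f -name '*.json' 2>/dev/null)" ]
}
```

For example, `has_gcp_key || echo "No service account file in ~/.gcp yet"` prints a reminder when the folder is still empty.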
Download Visual Studio Code from https://code.visualstudio.com/ and install the "Python" extension (should have 96M downloads).
Download Docker Desktop from https://docs.docker.com/desktop/install/mac-install/
$ cd $HOME
$ mkdir homebrew && curl -L https://github.com/Homebrew/brew/tarball/master | tar xz --strip 1 -C homebrew
Find out which shell you are using:
echo $SHELL
If you see "bash", update ~/.bash_profile in the next step. If you see "zsh", update ~/.zprofile instead.
Add this line to the bottom of your ~/.bash_profile or ~/.zprofile:
export PATH=$HOME/homebrew/bin:$PATH
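The choice of profile file described above can be sketched as a small function. `profile_for_shell` is a hypothetical name used only for this example, and it covers just the two shells mentioned in this guide.

```shell
# Hypothetical helper: print the profile file to edit for a given shell path
# (defaults to $SHELL). Returns non-zero for shells this guide does not cover.
profile_for_shell() {
  case "${1:-$SHELL}" in
    */bash) printf '%s\n' "$HOME/.bash_profile" ;;
    */zsh)  printf '%s\n' "$HOME/.zprofile" ;;
    *)      return 1 ;;
  esac
}
```

Running `profile_for_shell` with no argument prints the file that matches your current login shell, which is the file the export line belongs in.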
- Open Finder on your Mac.
- Navigate to Applications/Utilities/Terminal.app.
- Right-click it and choose "Get Info".
- Enable the checkbox "Open with Rosetta".
- Close and reopen the Terminal app.
- Inside Terminal, type:
$ arch
Expected result:
i386
arch -x86_64 brew update
arch -x86_64 brew install pyenv OpenSSL readline gettext xz
Edit your ~/.bash_profile and add the export and eval lines shown below to the bottom of the file. If ~/.bash_profile doesn't exist, check whether you are using a different shell (e.g. zsh). In that case you might need to edit ~/.zshrc or ~/.zprofile instead.
# if using bash, do
nano ~/.bash_profile
# if using zsh then
nano ~/.zprofile
export PATH=$HOME/.pyenv/bin:$PATH
eval "$(pyenv init --path)"
eval "$(pyenv init -)"
# if using bash, do
source ~/.bash_profile
# if using zsh then
source ~/.zprofile
pyenv install 3.7.12
pyenv global 3.7.12
pip install virtualenv
Expected Outputs:
$ python -V
Python 3.7.12
$ virtualenv --version
virtualenv 20.13.0 from /Users/rbv218/.pyenv/versions/3.7.12/lib/python3.7/site-packages/virtualenv/__init__.py
Open one terminal window and run:
export GOOGLE_APPLICATION_CREDENTIALS=~/.gcp/NAME_OF_THE_SERVICE_ACCOUNT_FILE.json
export ENV=development
cd src/modules/site
make clean
make configure
make cloud-sql-proxy-start
Check that the cloud-sql-proxy docker container is running:
$ docker ps
Expected Result:
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
9413d4f448f0 gcr.io/cloudsql-docker/gce-proxy:1.28.1-alpine "/cloud_sql_proxy -i…" 3 weeks ago Up 23 hours 0.0.0.0:5432->5432/tcp caendr-cloud-sql-proxy-1
Please note that the CONTAINER_ID will be different on your machine.
Keep this docker container running while running the site below.
Open a second terminal window:
export GOOGLE_APPLICATION_CREDENTIALS=~/.gcp/NAME_OF_THE_SERVICE_ACCOUNT_FILE.json
export ENV=development
cd src/modules/site-v2
make configure
make dot-env
make venv
code ../../..
The last command opens Visual Studio Code at the root of the project. From DEBUG -> List of options, select "Run Site-v2 (requires a local Postgres instance or cloud-sql-proxy)" and click "Play".
Once you are done working on the site and no longer need the database, close the connection:
make cloud-sql-proxy-stop
If this does not stop the container, do this:
docker ps
Expected Result:
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
75ef941c1e64 gcr.io/cloudsql-docker/gce-proxy:1.28.1-alpine "/cloud_sql_proxy -i…" 29 minutes ago Up 29 minutes 0.0.0.0:5432->5432/tcp caendr-cloud-sql-proxy-1
Find the CONTAINER_ID (first column) and stop the container manually with:
$ docker kill 75ef941c1e64
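To avoid copying the CONTAINER_ID by hand, the lookup can be sketched as a filter over the `docker ps` table. `proxy_container_ids` is a hypothetical helper; it assumes the default table layout shown above, where the container ID is the first column and the name is the last.

```shell
# Hypothetical helper: read `docker ps` table output on stdin and print the
# CONTAINER ID of every row whose NAMES column contains "cloud-sql-proxy".
proxy_container_ids() {
  # NR > 1 skips the header row; $NF is the last column (NAMES).
  awk 'NR > 1 && $NF ~ /cloud-sql-proxy/ { print $1 }'
}
```

Check the output of `docker ps | proxy_container_ids` first, then pass the printed ID to `docker kill`.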
To make changes to the Legacy site (currently in production), open a second terminal window:
export GOOGLE_APPLICATION_CREDENTIALS=~/.gcp/NAME_OF_THE_SERVICE_ACCOUNT_FILE.json
export ENV=development
cd src/modules/site
make configure
make dot-env
make venv
code ../../..
Setup requires make, which can be installed with:
sudo apt-get update && sudo apt-get install build-essential
To list all available Makefile targets and their descriptions in the current directory:
make
or
make help
To automatically install system package requirements for development and deployment:
make configure
To configure your local environment to use the correct cloud resources, you must set the default project and credentials for the Google Cloud SDK and define the 'ENV' environment variable:
gcloud init
gcloud auth login
gcloud auth application-default login
gcloud auth configure-docker
export ENV={ENV_TO_DEPLOY}
Set ENV and GOOGLE_APPLICATION_CREDENTIALS environment variables:
export MODULE_DB_OPERATIONS_CONNECTION_TYPE=localhost
export MODULE_DB_TIMEOUT=3
export ENV={ENV_TO_DEPLOY}
export GOOGLE_APPLICATION_CREDENTIALS={PATH_TO_GCP_CREDENTIALS}
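Before running a module, it can help to confirm that these variables are actually set in the current terminal. The following is a minimal sketch; `require_env` is a hypothetical helper and not one of the project's make targets.

```shell
# Hypothetical helper: print a message for each named variable that is unset
# or empty, and return non-zero if any are missing.
require_env() {
  missing=0
  for name in "$@"; do
    # Indirect lookup of the variable whose name is stored in $name.
    eval "value=\${$name:-}"
    if [ -z "$value" ]; then
      echo "Missing environment variable: $name" >&2
      missing=1
    fi
  done
  return "$missing"
}
```

For example, `require_env ENV GOOGLE_APPLICATION_CREDENTIALS && make run` would only start the module when both variables are present.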
If the module requires a connection to the Cloud SQL instance, you will need to keep the Google Cloud SQL proxy running in the background:
./cloud_sql_proxy -instances=${GOOGLE_CLOUD_PROJECT_ID}:${GOOGLE_CLOUD_REGION}:${MODULE_DB_OPERATIONS_INSTANCE_NAME} -dir=/cloudsql &
or
make cloud-sql-proxy-start
Then switch to a different terminal prompt and change to the module's src directory:
make run
Pre-requisites: Ensure that you are logged in to the GCP project with the gcloud CLI, or are using a DevOps service account.
Open a terminal at the root of the project:
- Set ENV and GOOGLE_APPLICATION_CREDENTIALS environment variables:
export ENV={ENV_TO_DEPLOY}
export GOOGLE_APPLICATION_CREDENTIALS={PATH_TO_GCP_CREDENTIALS}
- Increment the versions for each module that is being updated as part of the deployment:
  - Update the version property for the module in /env/{env}/global.env
  - Update version in the file src/modules/{module_name}/module.env
- Move to each module folder and configure the modules for deployment:
cd src/modules/{module_name}
make configure
  - The module root folder should now contain a .env file
  - The module root folder SHOULD NOT contain a venv folder
- Publish the module to GCR (src/modules/{module_name}):
make publish
  - When the command completes, check the GCR and confirm that your image appears with the proper version tag
- Deploy new app version:
make cloud-resource-deploy
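The two folder checks that follow `make configure` (a .env file is present, a venv folder is not) can be sketched as a pre-publish guard. `module_ready_for_deploy` is a hypothetical helper written for this example only.

```shell
# Hypothetical helper: succeed only if the module folder contains a .env file
# and does not contain a venv folder.
module_ready_for_deploy() {
  module_dir="$1"
  if [ ! -f "$module_dir/.env" ]; then
    echo "$module_dir: .env not found (run make configure)" >&2
    return 1
  fi
  if [ -d "$module_dir/venv" ]; then
    echo "$module_dir: remove the venv folder before deploying" >&2
    return 1
  fi
}
```

From the project root, `module_ready_for_deploy src/modules/{module_name} && (cd src/modules/{module_name} && make publish)` would publish only when both conditions hold.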
Troubleshooting:
- Even if ENV and GOOGLE_APPLICATION_CREDENTIALS are set correctly, you will need to be logged in to gcloud and to configure Docker to enable publishing containers to GCR, since the service account does not have permission to publish.
- Sometimes after deployment of the full application the ext_assets folder will not copy to the GCP static bucket, but terraform state will reflect the correct bucket resources. You'll notice the CeNDR logo and worms video will not show up on the home page. Simply redeploy the full application and the assets should be correctly copied to the GCP static bucket, fixing the issue.
- Deployment will not work if a virtual environment exists in img_thumb_gen, giving an error like the following:
╷
│ Error: Error while updating cloudfunction configuration: Error waiting for Updating CloudFunctions Function: Error code 14, message: The service has encountered an error during container import. Please try again later
│
│   with module.img_thumb_gen.google_cloudfunctions_function.generate_thumbnails,
│   on modules/img_thumb_gen/cloud-function.tf line 1, in resource "google_cloudfunctions_function" "generate_thumbnails":
│    1: resource "google_cloudfunctions_function" "generate_thumbnails" {
│
╵
make: *** [cloud-resource-deploy] Error 1
Remove the venv directory and try redeploying.
- Due to a race condition, sometimes Terraform will attempt to access the new site image before it has been built and published to GCP. Manually publishing the image by running make publish in src/modules/site (or src/modules/site-v2), then deploying, should fix this issue.
Targeted deployment is under construction until isolated TF states can be established for each module.
To allow the website to write to the Google Sheet where orders are recorded, you must add the Google Sheets service account as an editor for the sheet {ANDERSEN_LAB_ORDER_SHEET}: {GOOGLE_SHEETS_SERVICE_ACCOUNT_NAME}@{GOOGLE_CLOUD_PROJECT_ID}.iam.gserviceaccount.com
You must also add the Google Analytics service account user to the Google Analytics account to view the 'about/statistics' page: {GOOGLE_ANALYTICS_SERVICE_ACCOUNT_NAME}@{GOOGLE_CLOUD_PROJECT_ID}.iam.gserviceaccount.com
Create a new user and log in to the site. Once the account has been created, you can manually promote it to admin by editing the user entity in Google Cloud Datastore.
Before these tools can be used for the first time, the available container versions must be loaded from Docker Hub. Visiting the 'Tool Versions' page in the 'Admin' portal will import this data automatically:
Admin -> Tool Versions
These steps describe how to add data to the strain sheet, load it into the site database, then load the strain data, wormbase gene information, and strain variant annotation data into the site's SQL database:
- Admin -> Strain Sheet: The Google Sheet linked here must be populated with the strain data that you want to load into the site's internal database.
- Admin -> ETL Operations: click 'New Operation' then 'Rebuild strain table from Google Sheet'. (No other fields are required.)
- Admin -> ETL Operations: click 'New Operation' then 'Rebuild wormbase gene table from external sources'. (Wormbase Version number required.)
- Admin -> ETL Operations: click 'New Operation' then 'Rebuild Strain Annotated Variant table from .csv.gz file'. (Strain Variant Annotation Version number required.) This operation expects the .csv.gz source file to already exist in the Cloud Bucket location described below.
The strain variant annotation data csv should be versioned with the date of the release having the format YYYYMMDD, compressed with gzip, and uploaded to:
${MODULE_DB_OPERATIONS_BUCKET_NAME}/strain_variant_annotation/c_elegans/WI.strain-annotation.bcsq.YYYYMMDD.csv.gz
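The naming rule can be sketched as a function that builds the upload path and rejects anything that is not an eight-digit date, as the YYYYMMDD format in the text implies. `annotation_object_path` is a hypothetical helper, and it checks only the digit count, not whether the date is a real calendar date.

```shell
# Hypothetical helper: print the upload path for a release date, or fail if
# the date is not exactly eight digits (YYYYMMDD).
annotation_object_path() {
  bucket="$1"
  release_date="$2"
  case "$release_date" in
    [0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9]) ;;
    *) echo "Release date must be YYYYMMDD: $release_date" >&2; return 1 ;;
  esac
  printf '%s\n' "${bucket}/strain_variant_annotation/c_elegans/WI.strain-annotation.bcsq.${release_date}.csv.gz"
}
```

Calling `annotation_object_path "$MODULE_DB_OPERATIONS_BUCKET_NAME" 20220216` prints the path to upload the gzip-compressed csv to.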
To add a Dataset Release to the site through the Admin panel, you will first have to upload the release files to:
${MODULE_SITE_BUCKET_PUBLIC_NAME}/dataset_release/c_elegans/${RELEASE_VERSION}
using the file and directory structure described in the AndersenLab dry guide.
Strain photos should be named using the format <strain>.jpg and uploaded to a bucket where the img_thumb_gen module will automatically create thumbnails with the format <strain>.thumb.jpg:
${MODULE_SITE_BUCKET_PHOTOS_NAME}/c_elegans/<strain>.jpg -> ${MODULE_SITE_BUCKET_PHOTOS_NAME}/c_elegans/<strain>.thumb.jpg
BAM and BAI files are stored in:
${MODULE_SITE_BUCKET_PRIVATE_NAME}/bam/c_elegans/<strain>.bam
${MODULE_SITE_BUCKET_PRIVATE_NAME}/bam/c_elegans/<strain>.bam.bai
NemaScan requires species data to be manually uploaded to cloud storage to make it accessible to the pipeline:
${MODULE_SITE_BUCKET_PRIVATE_NAME}/NemaScan/input_data
Q: Why does it look like the site or db_operations is unable to connect to Cloud SQL (Postgres)?
A: Check whether the server has exhausted the max_connections limit. Google Postgres has a hard limit on connections, and a number of them are reserved for super-admin tasks (backups, etc.) and are not available to run-time apps, services, or modules. Consider restarting (or stopping and starting) the database instance to close all active connections. In GCP, connection stats can be viewed in the POSTGRES tab by selecting "Active Connections" from the dropdown.
Q: I'm getting errors installing numpy on macOS running on an M1/M2 chip.
A: See below:
pip3 install cython
pip3 install --no-binary :all: --no-use-pep517 numpy
Q: Which version of Terraform do I need to use?
A: Use Terraform version 1.1.8. Optionally, use tfenv to manage the Terraform version.
Q: Missing pg_config when running on macOS?
A: Install via Homebrew:
brew install postgresql
Q: I'm seeing this error when running make venv from the src/modules/site-v2 folder:
"_libintl_textdomain", referenced from:
_PyIntl_textdomain in libpython3.7m.a(_localemodule.o)
_PyIntl_textdomain in libpython3.7m.a(_localemodule.o)
A: Install gettext:
$ arch -x86_64 brew install gettext
Q: I'm seeing this error when running make venv from the src/modules/site-v2 folder:
"ModuleNotFoundError: No module named 'readline'"
A: Install readline:
$ arch -x86_64 brew install readline
Q: I'm seeing this error when running make venv from the src/modules/site-v2 folder:
"ERROR: The Python ssl extension was not compiled. Missing the OpenSSL lib?"
A: Install OpenSSL:
$ arch -x86_64 brew install openssl
Q: I get an ImportError when running the API in VSCode:
ImportError: cannot import name 'Literal' from 'typing' (/Users/ ... /.pyenv/versions/3.7.12/lib/python3.7/typing.py)
A: The VSCode extension debugpy, version v2024.12.0, appears to break on Python 3.7. It's not clear whether this is a bug or an intentional dropping of support for older versions.
You can get around this by pinning your extension to v2024.10.0; see the VSCode docs for instructions.
(If this is a bug, further updates to debugpy might fix this issue. Keep an eye on it, and try the newest version if necessary.)