Suggestions for the Advanced User Support agreement with Sigma2 #21

Open
JamesSample opened this issue Jan 26, 2023 · 2 comments

JamesSample commented Jan 26, 2023

At the start of the SeaBee project, we discussed setting up an Advanced User Support (AUS) agreement with Sigma2. The purpose of this is to get additional support from Sigma2 for things beyond the scope of the standard helpdesk.

The Sigma2 helpdesk has been great so far, so the AUS is not a high priority for the moment. However, I'd like to use this issue to gather ideas for things we could include as part of an AUS in the future. After the NFR review in April, I'll set up a meeting with Sigma2 to discuss ideas in more detail.

@knl88 @deviirmr @awigeon @jarlehr - As you're developing and testing on Sigma2, please keep the AUS in mind and comment here with any suggestions (please be as specific as possible).

So far we have:

  • Better options for deploying NodeODM. We have NodeODM deployed in our namespace and it works well, but it would benefit from more resources. We don't run NodeODM that often, but when we do it can use a lot of resources (e.g. 40 to 60 CPUs and 1 TB of memory). Can Sigma2 help us to deploy this more efficiently, e.g. to scale from zero (NodeODM turned off) to having it turned on and accessible from the Hub with lots of memory and CPUs available? Maybe Sigma2 could set up a NodeODM "app" and Helm chart, or deploy it on the HPC side via Singularity, or something else? We basically want NodeODM to be turned off most of the time (so it's not using resources), but we need to be able to turn it on at short notice and have access to lots of CPUs and memory.

  • Better scaling of the JupyterHub. The current solution of running three Hubs to create three different user environments (different single-user images with different resource allocations) is fiddly to maintain. It also wastes resources running three Hubs. One Hub should be able to serve several different user environments via single-user profiles. Could Sigma2 expose some of these settings in their Helm chart, or help us to achieve the same thing some other way?

  • Fix permissions in the single-user image for JupyterHub. VSCode (code-server) and RStudio behave a little strangely on the JupyterHub, I think because I don't fully understand the user/group permissions used by Sigma2 within their Jupyter Helm charts. It would be nice to get some help from Sigma2 with our single-user Dockerfile for the Hub. It basically works, but some minor bugs could probably be fixed easily by someone familiar with the Sigma2 JupyterHub deployment.

  • MLOps and version control. Eventually, we will need to keep track of algorithm versions (which algorithms have been trained with which datasets, under which conditions, etc.). Various tools for this exist (e.g. Kubeflow), but some require permissions beyond our Kubernetes namespace and are therefore not possible on Sigma2. Advice and assistance from Sigma2 on how to "operationalise" our ML workflows would be helpful.
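
On the JupyterHub scaling point above: KubeSpawner has a `profile_list` setting that lets one Hub offer several single-user environments. Below is a minimal sketch of what that might look like for our three environments, assuming the Sigma2 Helm chart can pass extra config through to KubeSpawner (not confirmed). The profile names, image names and resource figures are hypothetical placeholders, not our real settings.

```python
# Sketch of a jupyterhub_config.py fragment.
# All display names, images and resource limits below are hypothetical.


def build_profile_list():
    """Return KubeSpawner profiles that could replace three separate Hubs."""
    return [
        {
            "display_name": "Standard",
            "description": "General-purpose environment",
            "default": True,
            "kubespawner_override": {
                "image": "example.org/seabee/standard:latest",
                "cpu_limit": 4,
                "mem_limit": "16G",
            },
        },
        {
            "display_name": "Machine learning",
            "description": "Larger allocation for model training",
            "kubespawner_override": {
                "image": "example.org/seabee/ml:latest",
                "cpu_limit": 16,
                "mem_limit": "64G",
            },
        },
        {
            "display_name": "Large memory",
            "description": "For occasional memory-heavy processing",
            "kubespawner_override": {
                "image": "example.org/seabee/standard:latest",
                "cpu_limit": 8,
                "mem_limit": "256G",
            },
        },
    ]


# Inside a real jupyterhub_config.py, JupyterHub provides the `c` object:
# c.KubeSpawner.profile_list = build_profile_list()
```

With something like this, users would pick an environment from a list when starting their server, and we would only need to maintain one Hub.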


knl88 commented Jan 26, 2023

I would also suggest:


KristofferKa commented Jan 27, 2023

Sigma2_Project-AUS-application_SeaBee.docx
Sigma2's template for AUS applications is attached above ☝🏻 (the initial headings are prefilled, but the details on specific needs are still missing).

The template says "2019", but it is the most recent one (and the only one I've been able to dig up).
@JamesSample @knl88 @deviirmr
