Jupyter Notebook is an application for creating and sharing computational documents. JupyterHub serves notebooks to multiple users, giving them interactive access to computational resources without having to install anything themselves.
The GA4GH TES (Task Execution Service) API is a standardized schema and API for describing and executing batch tasks on any underlying computational backend. The full TES specification defines its capabilities.
The goal of this issue is to develop, or lay the foundations for, a GA4GH TES plugin for JupyterHub that executes all notebook cells on a TES instance.
Objective: Build a plugin or extension within JupyterHub that allows seamless access to GA4GH TES, streamlining federated task submission. The plugin will focus on executing all notebook cells (that is, the whole .ipynb) through TES.
Scope: Focus on plugin development, installation instructions, and usage documentation so administrators can easily deploy it across ELIXIR nodes.
This is a larger meta issue that will likely require discussion. Here are some starting points:
Considerations:
Core Components
TES Client Library: You'll need a client library in Python (the language most Jupyter notebooks use) to interact with the TES instance. This library will handle:
Constructing TES task requests based on notebook cell content.
Submitting these tasks to the TES server.
Monitoring task execution status.
Retrieving results and outputs.
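The four client responsibilities above can be sketched with the standard library alone, so the shape of the request stays visible. This is only an illustration: in practice py-tes would likely replace most of it. The function names, the container image, the paths inside the container, the bearer-token header, and the `api_prefix` default are all assumptions; the endpoint prefix in particular differs between TES versions and deployments. The task wraps the whole notebook in a single `jupyter nbconvert --execute` call.

```python
import json
import time
import urllib.request

# Task states after which polling can stop (names taken from the TES schema).
TERMINAL_STATES = {"COMPLETE", "EXECUTOR_ERROR", "SYSTEM_ERROR", "CANCELED"}


def build_notebook_task(notebook_url, output_url, image="jupyter/scipy-notebook"):
    """Build a TES task body that executes a whole notebook in one container.

    The image and the /data paths are illustrative assumptions.
    """
    return {
        "name": "run-notebook",
        "inputs": [{"url": notebook_url, "path": "/data/input.ipynb", "type": "FILE"}],
        "outputs": [{"url": output_url, "path": "/data/output.ipynb", "type": "FILE"}],
        "executors": [{
            "image": image,
            "command": [
                "jupyter", "nbconvert", "--to", "notebook", "--execute",
                "/data/input.ipynb", "--output", "output.ipynb",
                "--output-dir", "/data",
            ],
        }],
    }


def submit_task(base_url, task, token=None, api_prefix="/ga4gh/tes/v1"):
    """POST the task to the TES server and return the task id it assigns."""
    headers = {"Content-Type": "application/json"}
    if token:
        headers["Authorization"] = f"Bearer {token}"  # assumed auth scheme
    request = urllib.request.Request(
        f"{base_url}{api_prefix}/tasks",
        data=json.dumps(task).encode("utf-8"),
        headers=headers,
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)["id"]


def wait_for_task(get_state, task_id, interval=5.0, sleep=time.sleep):
    """Poll get_state(task_id) until the task reaches a terminal state."""
    while True:
        state = get_state(task_id)
        if state in TERMINAL_STATES:
            return state
        sleep(interval)
```

`wait_for_task` takes the state lookup as a callable so that the polling loop can be exercised without a running TES server; a real `get_state` would issue a GET for the task and read its `state` field.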
Notebook Integration: Develop a mechanism within the Jupyter notebook environment to:
Identify code cells to be executed on TES. (Perhaps a magic command like %%tes or a dedicated cell tag)
Extract code and dependencies from these cells.
Package them into a format suitable for TES (e.g., Docker image).
Display task status and results within the notebook.
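As a rough sketch of the cell-identification step, the snippet below reads a notebook in its on-disk JSON form (nbformat 4) and picks out code cells that either carry a tag or start with a cell magic. The tag name `tes`, the `%%tes` magic, and the function name are hypothetical, following the suggestion above; nothing here registers an actual magic with IPython.

```python
def select_tes_cells(notebook, tag="tes", magic="%%tes"):
    """Return (index, source) pairs for code cells marked for TES execution.

    `notebook` is the parsed JSON of an .ipynb file. A cell qualifies if it
    has the given tag or if its first line is the given cell magic.
    """
    selected = []
    for index, cell in enumerate(notebook.get("cells", [])):
        if cell.get("cell_type") != "code":
            continue
        source = cell.get("source", "")
        if isinstance(source, list):  # nbformat stores source as str or list of lines
            source = "".join(source)
        tags = cell.get("metadata", {}).get("tags", [])
        lines = source.splitlines()
        has_magic = bool(lines) and lines[0].strip() == magic
        if tag in tags or has_magic:
            # Strip the magic line so only the user's code is shipped to TES.
            body = "\n".join(lines[1:]) if has_magic else source
            selected.append((index, body))
    return selected
```

Typical use would be `select_tes_cells(json.load(open("analysis.ipynb")))`, where the file name is a placeholder. Extracting dependencies and building an image are separate steps that this sketch does not attempt.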
Workflow Creation: Treat the entire notebook as a workflow with dependencies between cells.
Cell Ordering: Determine the execution order of cells based on their dependencies (e.g., using cell tags, code analysis).
Task Chaining: The client library will create a series of TES tasks, where the output of one task becomes the input of the next.
State Management: Track the execution state of each cell and the overall workflow.
Error Handling: Implement mechanisms to handle errors in individual cells or the workflow as a whole.
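One possible way to derive cell ordering through code analysis is sketched below: each cell is parsed with `ast`, the names it defines and uses are collected, and a cell is made to depend on the most recent earlier cell that defined a name it uses. The result is ordered with `graphlib.TopologicalSorter` (Python 3.9+). This is a deliberately naive heuristic with hypothetical function names: it ignores attribute mutation, side effects on files, and cells containing magics (which `ast.parse` rejects), so a real plugin would need something more robust or explicit tags.

```python
import ast
from graphlib import TopologicalSorter


def names_in(source):
    """Return (defined, used) sets of top-level-visible names in a cell."""
    defined, used = set(), set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Name):
            if isinstance(node.ctx, ast.Store):
                defined.add(node.id)
            else:
                used.add(node.id)
        elif isinstance(node, (ast.FunctionDef, ast.ClassDef)):
            defined.add(node.name)
        elif isinstance(node, (ast.Import, ast.ImportFrom)):
            for alias in node.names:
                # "import a.b" binds "a"; "import a as b" binds "b".
                defined.add((alias.asname or alias.name).split(".")[0])
    return defined, used


def cell_dependencies(sources):
    """Map each cell index to the set of earlier cell indices it depends on."""
    last_definer = {}  # name -> index of the latest cell that defined it
    dependencies = {}
    for index, source in enumerate(sources):
        defined, used = names_in(source)
        dependencies[index] = {last_definer[n] for n in used if n in last_definer}
        for name in defined:
            last_definer[name] = index
    return dependencies


def execution_order(sources):
    """Return cell indices in an order that respects the dependencies."""
    return list(TopologicalSorter(cell_dependencies(sources)).static_order())
```

The dependency map is also the natural place to hang task chaining and state tracking: each edge says which task's outputs the next task needs as inputs, and cells with no path between them could in principle run as parallel TES tasks.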
Implementation Considerations
Security: Securely handle authentication and authorization to the TES instance.
Scalability: Design for efficient execution of large notebooks with many cells and complex dependencies.
Usability: Provide a user-friendly interface within the notebook for TES interaction.
Flexibility: Support the option to choose from multiple TES instances and allow customization of task parameters.
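To make the security and flexibility points concrete, here is a small configuration sketch. It assumes bearer-token authentication with the token kept in an environment variable rather than in the notebook, and it models several selectable TES instances with per-instance default resources that a user can override. The class, field, and function names are hypothetical; the resource keys (`cpu_cores`, `ram_gb`, `disk_gb`) follow the TES schema.

```python
import os
from dataclasses import dataclass, field


@dataclass(frozen=True)
class TesInstance:
    """Connection settings for one TES endpoint (illustrative only)."""
    name: str
    url: str
    token_env: str  # name of the environment variable holding the token
    default_resources: dict = field(default_factory=dict)

    def auth_headers(self):
        # Read the token at call time so it is never stored in the notebook.
        token = os.environ.get(self.token_env)
        return {"Authorization": f"Bearer {token}"} if token else {}


def task_resources(instance, overrides=None):
    """Merge instance defaults with user overrides for the task's resources."""
    return {**instance.default_resources, **(overrides or {})}


def pick_instance(instances, name):
    """Look up a configured instance by name, failing loudly if unknown."""
    by_name = {instance.name: instance for instance in instances}
    if name not in by_name:
        raise KeyError(f"unknown TES instance: {name!r}")
    return by_name[name]
```

Whether tokens come from environment variables, JupyterHub's own authentication state, or another mechanism is one of the open questions this issue should settle.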
Tools and Technologies
TES Implementations: Funnel, TESK, TES on Azure
Python TES Client: py-tes
Docker: For containerization
Jupyter Extensions: To enhance the notebook interface
If you want to work on this issue:
Assign yourself to the issue (if someone else is already assigned, first ask whether they would like help, or pick another issue)
Once assigned, move your issue to the "In progress" column on the project board
Start working 🚀
More useful information and link: document online