Update glossary and graphviz for repo/workflows #191

Merged
merged 11 commits on Mar 28, 2024
12 changes: 12 additions & 0 deletions redirects.yml
@@ -159,14 +159,26 @@
from_url: /tutorials/quickstart.html
to_url: /tutorials/running-a-workflow.html

- type: page
from_url: /tutorials/running-a-workflow.html
to_url: /tutorials/running-a-phylogenetic-workflow.html

- type: page
from_url: /tutorials/zika.html
to_url: /tutorials/creating-a-workflow.html

- type: page
from_url: /tutorials/creating-a-workflow.html
to_url: /tutorials/creating-a-phylogenetic-workflow.html

- type: page
from_url: /tutorials/tb_tutorial.html
to_url: /tutorials/creating-a-bacterial-pathogen-workflow.html

- type: page
from_url: /tutorials/creating-a-bacterial-pathogen-workflow.html
to_url: /tutorials/creating-a-bacterial-phylogenetic-workflow.html

- type: page
from_url: /guides/share/nextstrain-groups.html
to_url: /guides/share/groups/index.html
2 changes: 1 addition & 1 deletion src/guides/share/groups/index.rst
@@ -11,7 +11,7 @@ Share via Nextstrain Groups
This how-to guide assumes familiarity with the :doc:`Nextstrain Groups
</learn/groups/index>` feature and the :doc:`Nextstrain dataset files
</reference/data-formats>` produced by :doc:`running a pathogen workflow
</tutorials/running-a-workflow>`. We recommend reading about those first
</tutorials/running-a-phylogenetic-workflow>`. We recommend reading about those first
if you're not familiar with them.

Log in with the Nextstrain CLI
6 changes: 3 additions & 3 deletions src/index.rst
@@ -52,10 +52,10 @@ team and other Nextstrain users provide assistance. For private inquiries,
:hidden:

Installing <install>
tutorials/running-a-workflow
tutorials/creating-a-workflow
tutorials/running-a-phylogenetic-workflow
tutorials/creating-a-phylogenetic-workflow
Exploring SARS-CoV-2 evolution <https://docs.nextstrain.org/projects/ncov/page/index.html>
tutorials/creating-a-bacterial-pathogen-workflow
tutorials/creating-a-bacterial-phylogenetic-workflow
tutorials/narratives-how-to-write
Analyzing genomes with Nextclade <https://docs.nextstrain.org/projects/nextclade/page/user/nextclade-web/index.html>

2 changes: 1 addition & 1 deletion src/install.rst
@@ -325,7 +325,7 @@ Try running Augur and Auspice
Next steps
==========

With Nextstrain installed, try :doc:`tutorials/running-a-workflow` next.
With Nextstrain installed, try :doc:`tutorials/running-a-phylogenetic-workflow` next.


Alternate installation methods
8 changes: 4 additions & 4 deletions src/learn/augur-to-auspice.rst
@@ -24,7 +24,7 @@ Auspice (visualization) components

It's helpful to start in Auspice and then work backwards to Augur.
In this section, we will walk through various components of Auspice and how
they relate to the :term:`dataset JSON <dataset>` (sometimes called an Auspice JSON).
they relate to the :term:`dataset JSON <phylogenetic dataset>` (sometimes called an Auspice JSON).

Phylogeny Tree Panel and Core Controls
--------------------------------------
@@ -226,7 +226,7 @@ various components:
.. image:: ../images/auspice-components-diversity-panel.png
:alt: Annotated screenshot of Auspice's diversity (entropy) panel

The diversity panel is enabled by data in the :term:`dataset JSON <dataset>`.
The diversity panel is enabled by data in the :term:`dataset JSON <phylogenetic dataset>`.
The top-level ``meta.genome_annotations`` provides the genome annotations
displayed and the individual tree nodes provide the mutations
via ``node.branch_attrs.mutations``, which are used to calculate the entropy
@@ -337,9 +337,9 @@ Exporting data via Augur
========================

We now consider how information flows through Augur, specifically
``augur export v2`` which produces the :term:`dataset (Auspice) JSON <dataset>`
``augur export v2`` which produces the :term:`dataset (Auspice) JSON <phylogenetic dataset>`
described above. This process combines data inputs with parameters configuring
aspects of the visualisation and produces :term:`dataset files <dataset>` for
aspects of the visualisation and produces :term:`dataset files <phylogenetic dataset>` for
Auspice to visualise.

.. graphviz::
112 changes: 84 additions & 28 deletions src/learn/parts.rst
@@ -79,7 +79,7 @@ example, you visit `nextstrain.org/mumps/na

Auspice displaying Mumps genomes from North America.

:term:`Datasets<dataset>` are produced by Augur and
:term:`Datasets<phylogenetic dataset>` are produced by Augur and
visualized by Auspice. These files are often referred to as :term:`JSONs`
colloquially because they use a generic data format called JSON.
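
For example (a minimal sketch, not taken from any of our workflows; the file
path is hypothetical), a dataset JSON can be opened like any other JSON file to
see its top-level structure:

.. code-block:: python

   import json

   # Inspect a dataset (Auspice) JSON, e.g. one produced by `augur export v2`.
   # "auspice/zika.json" is a placeholder path; use any dataset file you have.
   with open("auspice/zika.json") as f:
       dataset = json.load(f)

   # Auspice v2 dataset JSONs typically have these top-level keys.
   print(sorted(dataset))                # e.g. ['meta', 'tree', 'version']
   print(dataset["meta"].get("title"))   # display title shown by Auspice, if set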

@@ -118,7 +118,7 @@ colloquially because they use a generic data format called JSON.
Augur -> jsons -> Auspice;
}

:term:`Builds<build>` are recipes of code and data that produce these :term:`datasets<dataset>`.
A :term:`build` is a recipe of several commands and data that produce a single :term:`dataset`.

.. graphviz::
:align: center
@@ -165,9 +165,13 @@ colloquially because they use a generic data format called JSON.
metadata -> filter;
}

Builds run several commands and are often automated by workflow managers such as `Snakemake <https://snakemake.readthedocs.io>`__, `Nextflow <https://nextflow.io>`__ and `WDL <https://openwdl.org>`__. A :term:`workflow` bundles one or more related :term:`builds<build>` which each produce a :term:`dataset` for visualization with :term:`Auspice`.
A :term:`workflow` can bundle one or more related :term:`builds<build>` and is often automated by workflow managers
such as `Snakemake <https://snakemake.readthedocs.io>`__, `Nextflow <https://nextflow.io>`__
and `WDL <https://openwdl.org>`__.
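
For example, a single build might be written as the following Snakefile (a
sketch only, not one of our real workflows: the file names are hypothetical and
the input sequences are assumed to be aligned already):

.. code-block:: python

   # A minimal build: filter sequences, infer a tree, estimate branch lengths,
   # and export a dataset JSON for Auspice. All paths are placeholders.

   rule all:
       input:
           "auspice/example.json"

   rule filter:
       input:
           sequences = "data/aligned.fasta",
           metadata = "data/metadata.tsv"
       output:
           sequences = "results/filtered.fasta"
       shell:
           "augur filter --sequences {input.sequences} --metadata {input.metadata} "
           "--output-sequences {output.sequences}"

   rule tree:
       input:
           alignment = "results/filtered.fasta"
       output:
           tree = "results/tree_raw.nwk"
       shell:
           "augur tree --alignment {input.alignment} --output {output.tree}"

   rule refine:
       input:
           tree = "results/tree_raw.nwk",
           alignment = "results/filtered.fasta",
           metadata = "data/metadata.tsv"
       output:
           tree = "results/tree.nwk",
           node_data = "results/branch_lengths.json"
       shell:
           "augur refine --tree {input.tree} --alignment {input.alignment} "
           "--metadata {input.metadata} --output-tree {output.tree} "
           "--output-node-data {output.node_data}"

   rule export:
       input:
           tree = "results/tree.nwk",
           metadata = "data/metadata.tsv",
           node_data = "results/branch_lengths.json"
       output:
           dataset = "auspice/example.json"
       shell:
           "augur export v2 --tree {input.tree} --metadata {input.metadata} "
           "--node-data {input.node_data} --output {output.dataset}"

Running ``snakemake --cores 1`` against such a Snakefile would produce
``auspice/example.json`` for Auspice to visualize; a :term:`workflow` might
bundle several such builds, for example one per geographic region.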

As an example, our core workflows are organized as `Git repositories <https://git-scm.com>`__ hosted on `GitHub <https://github.com/nextstrain>`__. Each contains a :doc:`Snakemake workflow </guides/bioinformatics/augur_snakemake>` using Augur, configuration, and data.
Our :term:`pathogen repositories<pathogen repository>` are organized as `Git repositories <https://git-scm.com>`__
hosted on `GitHub <https://github.com/nextstrain>`__. Each repository can contain
one or more workflows.

.. graphviz::
:align: center
@@ -176,44 +180,96 @@ As an example, our core workflows are organized as `Git repositories <https://gi
graph [
fontname="Lato, 'Helvetica Neue', sans-serif",
fontsize=12,
]
];
node [
shape=box,
style="rounded, filled",
fontname="Lato, 'Helvetica Neue', sans-serif",
fontsize=12,
height=0.1,
colorscheme=paired10,
pad=0.1,
margin=0.1,
];
rankdir=LR
rankdir=LR;

subgraph cluster_ncov {
label = "SARS-CoV-2 repository";
subgraph cluster_ncov_phylo {
label = "Phylogenetic workflow";
build0 [width=1, label="Global build"];
build1 [width=1, label="Africa build"];
build2 [width=1, label="Europe build"];
output0 [width=1, label="dataset"];
output1 [width=1, label="dataset"];
output2 [width=1, label="dataset"];
ellipses1 [width=1, label="...", penwidth=0, fillcolor="white"];
ellipses2 [width=1, label="...", penwidth=0, fillcolor="white"];
}
}

subgraph cluster_0 {
label = "Zika workflow";
build0 [width=1, label="Zika build"]
dataset0 [width=1, label="dataset"]
subgraph cluster_zika {
label = "Zika repository";
nojustify = true;
subgraph cluster_zika_ingest {
label = "Ingest workflow";
build3 [width=1, label="ingest build"];
output3 [width=1, label="ingest dataset"];
}
subgraph cluster_zika_phylo {
label = "Phylogenetic workflow";
build4 [width=1, label="phylogenetic build"];
output4 [width=1, label="dataset"];
}
}

subgraph cluster_1 {
label = "SARS-CoV-2 workflow";
build1 [width=1, label="Global build"]
build2 [width=1, label="Africa build"]
build3 [width=1, label="Europe build"]
dataset1 [width=1, label="dataset"]
dataset2 [width=1, label="dataset"]
dataset3 [width=1, label="dataset"]
ellipses1 [width=1, label="...", penwidth=0, fillcolor="white"]
ellipses2 [width=1, label="...", penwidth=0, fillcolor="white"]
subgraph cluster_mpox {
label = "Mpox repository";
subgraph cluster_mpox_ingest {
label = "Ingest workflow";
build5 [width=1, label="ingest build"];
output5 [width=1, label="ingest dataset"];
}
subgraph cluster_mpox_phylo {
label = "Phylogenetic workflow";
build6 [width=1, label="mpxv build"];
build7 [width=1, label="hmpxv1 build"];
build8 [width=1, label="hmpxv1_big build"];
output6 [width=1, label="dataset"];
output7 [width=1, label="dataset"];
output8 [width=1, label="dataset"];

}
subgraph cluster_mpox_nextclade {
label = "Nextclade workflow";
build9 [width=1, label="all-clades build"];
build10 [width=1, label="clade-iib build"];
build11 [width=1, label="lineage-b.1 build"];
output9 [width=1, label="Nextclade dataset"];
output10 [width=1, label="Nextclade dataset"];
output11 [width=1, label="Nextclade dataset"];

}
}

build0 -> dataset0
build1 -> dataset1
build2 -> dataset2
build3 -> dataset3
build0 -> output0;
build1 -> output1;
build2 -> output2;
build3 -> output3;
build4 -> output4;
build5 -> output5;
build6 -> output6;
build7 -> output7;
build8 -> output8;
build9 -> output9;
build10 -> output10;
build11 -> output11;

{
edge[style=invis]
dataset0 -> build1 // arrange clusters on same row
ellipses1 -> ellipses2
edge[style=invis];
output0 -> build3; // arrange clusters on same row
output3 -> build5; // arrange clusters on same row
ellipses1 -> ellipses2;
}
}

@@ -242,5 +298,5 @@ quality checks, and phylogenetic placement. Nextclade can be used independently
of other Nextstrain tools as well as integrated into workflows.

With this overview, you'll be better prepared to :doc:`install Nextstrain
</install>` and :doc:`run a workflow </tutorials/running-a-workflow>` or :doc:`contribute
</install>` and :doc:`run a workflow </tutorials/running-a-phylogenetic-workflow>` or :doc:`contribute
to development </guides/contribute/index>`.
6 changes: 3 additions & 3 deletions src/reference/data-files.rst
@@ -26,14 +26,14 @@ Workflow files
Files which correspond to several :term:`builds <build>` visible on nextstrain.org, e.g. all of the builds under <nextstrain.org/ncov/open/…>.
These often include the full metadata table, sequences FASTA, titer matrix, etc.

We often call these "inputs" colloquially because they're often the top-level inputs to a :term:`workflow`, but some of the files are actually workflow-level outputs.
We often call these "inputs" colloquially because they're often the top-level inputs to a :term:`phylogenetic workflow`, but some of the files are actually workflow-level outputs.
(Albeit, outputs that can be used as time-saving inputs in later workflow runs.)

Build files
Files which correspond to a specific single :term:`build` visible on nextstrain.org, e.g. <`nextstrain.org/ncov/open/global/6m <https://nextstrain.org/ncov/open/global/6m>`__>.
These often include the subsampled metadata table, sequences FASTA, and Newick tree as well as the final :term:`dataset` JSONs.
These often include the subsampled metadata table, sequences FASTA, and Newick tree as well as the final :term:`phylogenetic dataset` JSONs.

We often call these "outputs" colloquially because they're produced by running a :term:`workflow`, but some of the files are actually the specific, subsampled inputs that went into the specific build.
We often call these "outputs" colloquially because they're produced by running a :term:`phylogenetic workflow`, but some of the files are actually the specific, subsampled inputs that went into the specific build.

Workflow and build files for public data are available from:
