docs: update documentation

stefpeschel · Nov 28, 2024 · 5cda66d · 5cda66d
1 parent 13f04b0
commit 5cda66d
Show file tree

Hide file tree

Showing 3 changed files with 231 additions and 31 deletions.
diff --git a/_pkgdown.yml b/_pkgdown.yml
@@ -31,6 +31,7 @@ articles:
 - title: More examples
   navbar: More examples
   contents:
+  - plot_modifications
   - asso_as_input
   - cross_domain_networks
   - soil_example
@@ -80,4 +81,3 @@ authors:
     href: https://www.helmholtz-munich.de/en/iap/pi/erika-von-mutius
   Martin Depner:
     href: https://www.helmholtz-munich.de/en/iap
-
diff --git a/vignettes/NetCoMi.Rmd b/vignettes/NetCoMi.Rmd
@@ -19,7 +19,7 @@ knitr::opts_chunk$set(
 
 The American Gut data provided by the [`SpiecEasi`](https://github.com/zdk123/SpiecEasi) package are used for almost all NetCoMi tutorials. 
 
-We begin by constructing a single network, which we analyze using quantitative and graphical methods. Later, we will compare the networks of two groups: Individuals with and without seasonal allergies.
+We begin by constructing a single network on genus level, which we analyze using quantitative and graphical methods. Later, we will compare the networks of two groups: Individuals with and without seasonal allergies.
 
 NetCoMi's main functions are `netConstruct()` for network construction, `netAnalyze()` for network analysis, and `netCompare()` for network comparison. As should become clear from the following examples, these three functions must be executed in the aforementioned order. A further function is `diffnet()` for constructing a differential association network. `diffnet()` must be applied to the object returned by `netConstruct()`.
 
@@ -29,20 +29,58 @@ loaded together with NetCoMi).
 
 ```{r load_data, message=FALSE, warning=FALSE,}
 library(NetCoMi)
-data("amgut1.filt")
-data("amgut2.filt.phy")
+library(phyloseq)
+
+# Load data sets
+data("amgut1.filt") # ASV count matrix
+data("amgut2.filt.phy") # phyloseq objext
+```
+
+### Data agglomeration
+
+*Skip this step if you wish to construct the network at ASV level*.
+
+We start by agglomerating the phyloseq object to genus level, named "Rank6" in this data set.
+
+The `renameTaxa` function is used to rename the taxa according to a defined pattern and to make the genus names unique.
+
+```{r}
+# Agglomerate to genus level
+amgut_genus <- tax_glom(amgut2.filt.phy, taxrank = "Rank6")
+
+# Rename taxonomic table and make Rank6 (genus) unique
+amgut_genus_renamed <- renameTaxa(amgut_genus, 
+                                  pat = "<name>", 
+                                  substPat = "<name>_<subst_name>(<subst_R>)",
+                                  numDupli = "Rank6")
 ```
 
+Let's compare the taxa names before and after renaming them.
+
+```{r}
+head(tax_table(amgut_genus))
+head(tax_table(amgut_genus_renamed))
+```
+
+```{r}
+taxtab <- tax_table(amgut_genus)
+taxtab[taxtab[, "Rank6"] == "g__Eubacterium", ]
+```
+
+For example, a "1" has been added to Eubacterium because this genus exists twice: once in the Ruminococcaceae family and once in the Lachnospiraceae family. For association estimation, it is important to distinguish between them, so they are numbered.
+
 ### Network construction 
 
-We use the [SPRING](https://github.com/GraceYoon/SPRING) package for estimating associations (conditional dependence) between OTUs. 
+We use the [SPRING](https://github.com/GraceYoon/SPRING) package for estimating associations (conditional dependence) between taxa. 
 
 The data are filtered within `netConstruct()` as follows:
 
 - Only samples with a total number of reads of at least 1000 are included 
 (argument `filtSamp`). 
 - Only the 50 taxa with highest frequency are included (argument `filtTax`).
 
+Note that the taxa filter is set for demonstration purposes only, but has no effect here because there are only 43 genera in the data set.
+
 `measure` defines the association or dissimilarity measure, which is `"spring"` 
 in our case. Additional arguments are passed to `SPRING()` via `measurePar`. 
 `nlambda` and `rep.num` are set to 10 for a decreased execution time, but should 
@@ -63,7 +101,8 @@ The `verbose` argument is set to 3 so that all messages generated by
 `netConstruct()` as well as messages of external functions are printed.
 
 ```{r single_spring}
-net_spring <- netConstruct(amgut1.filt,
+net_spring <- netConstruct(amgut_genus_renamed,
+                           taxRank = "Rank6",
                            filtTax = "highestFreq",
                            filtTaxPar = list(highestFreq = 50),
                            filtSamp = "totalReads",
@@ -80,10 +119,23 @@ net_spring <- netConstruct(amgut1.filt,
                            seed = 123456)
 ```
 
+Let's take a look at the edge list, which contains for each pair of nodes the estimated association, the dissimilarity and the adjacency (= edge weight for weighted networks).
+
+```{r}
+head(net_spring$edgelist1)
+```
+
+We can also use NetCoMi's `plotHeat` function to plot the estimated associations.
+
+```{r, fig.width=12, fig.height=10}
+plotHeat(net_spring$assoMat1, textUpp = "none", textLow = "none")
+```
+
+
 ### Network analysis
 
 NetCoMi's `netAnalyze()` function is used for analyzing the constructed 
-network(s).
+network.
 
 Here, `centrLCC` is set to `TRUE` meaning that centralities are calculated only 
 for nodes in the largest connected component (LCC). 
@@ -114,28 +166,6 @@ props_spring <- netAnalyze(net_spring,
 summary(props_spring, numbNodes = 5L)
 ```
 
-#### Plotting the GCM heatmap manually
-
-```{r single_spring_heat, fig.height=6, fig.width=6}
-plotHeat(mat = props_spring$graphletLCC$gcm1,
-         pmat = props_spring$graphletLCC$pAdjust1,
-         type = "mixed",
-         title = "GCM", 
-         colorLim = c(-1, 1),
-         mar = c(2, 0, 2, 0))
-
-# Add rectangles highlighting the four types of orbits
-graphics::rect(xleft   = c( 0.5,  1.5, 4.5,  7.5),
-               ybottom = c(11.5,  7.5, 4.5,  0.5),
-               xright  = c( 1.5,  4.5, 7.5, 11.5),
-               ytop    = c(10.5, 10.5, 7.5,  4.5),
-               lwd = 2, xpd = NA)
-
-text(6, -0.2, xpd = NA, 
-     "Significance codes:  ***: 0.001;  **: 0.01;  *: 0.05")
-```
-
-
 ### Visualizing the network
 
 We use the determined clusters as node colors and scale
@@ -148,11 +178,13 @@ the node sizes according to the node's eigenvector centrality.
 
 ```{r single_spring_3, fig.height=18, fig.width=20}
 p <- plot(props_spring, 
+          labelScale = FALSE,
           nodeColor = "cluster", 
           nodeSize = "eigenvector",
-          title1 = "Network on OTU level with SPRING associations", 
+          title1 = "Network on genus level with SPRING associations", 
           showTitle = TRUE,
-          cexTitle = 2.3)
+          cexTitle = 2.3,
+          cexLabels = 1.5)
 
 legend(0.7, 1.1, cex = 2.2, title = "estimated association:",
        legend = c("+","-"), lty = 1, lwd = 3, col = c("#009900","red"), 
@@ -173,6 +205,7 @@ p$q1$Arguments$cut
 ```
 
 
+
 ### Export to Gephi
 
 Some users may be interested in how to export the network to Gephi. Here's an example:

diff --git a/vignettes/plot_modifications.Rmd b/vignettes/plot_modifications.Rmd
@@ -0,0 +1,167 @@
+---
+title: "Modifying the network plot"
+output: rmarkdown::html_vignette
+vignette: >
+  %\VignetteIndexEntry{plot_modifications}
+  %\VignetteEngine{knitr::rmarkdown}
+  %\VignetteEncoding{UTF-8}
+---
+
+```{r, include = FALSE}
+# Suppress title check warning
+options(rmarkdown.html_vignette.check_title = FALSE)
+
+knitr::opts_chunk$set(
+  collapse = TRUE,
+  comment = "#>"
+)
+```
+
+```{r setup, message=FALSE, warning=FALSE}
+library(NetCoMi)
+library(phyloseq)
+```
+
+We construct a network at genus level to demonstrate a variety of modification options for the network plot. The SparCC package is used here because the resulting graph is denser than conditional independence graphs, which is preferable for this tutorial.
+
+## Network construction and analysis
+
+```{r single_genus_1, fig.height=6, fig.width=6}
+data("amgut2.filt.phy")
+
+# Agglomerate to genus level
+amgut_genus <- tax_glom(amgut2.filt.phy, taxrank = "Rank6")
+
+# Rename taxonomic table and make Rank6 (genus) unique
+amgut_genus_renamed <- renameTaxa(amgut_genus, 
+                                  pat = "<name>", 
+                                  substPat = "<name>_<subst_name>(<subst_R>)",
+                                  numDupli = "Rank6")
+
+# Network construction and analysis
+net <- netConstruct(amgut_genus_renamed,
+                    taxRank = "Rank6",
+                    measure = "sparcc",
+                    normMethod = "none", 
+                    zeroMethod = "none",
+                    sparsMethod = "thresh",
+                    thresh = 0.3,
+                    dissFunc = "signed",
+                    verbose = 2,
+                    seed = 123456)
+
+netprops <- netAnalyze(net, 
+                       clustMethod = "cluster_fast_greedy",
+                       gcmHeat = FALSE # Do not plot the GCM heatmap
+)
+```
+
+### Simple network plot
+
+```{r netplot_mod_simple, fig.height=14, fig.width=15, fig.dpi=200, out.width="100%"}
+plot(netprops)
+```
+
+In the following we will look at the different examples, where the arguments of `plot.microNetProps` are adjusted. 
+
+### Layout
+
+Circular layout: 
+
+```{r netplot_mod_layout_circle, fig.height=14, fig.width=15, fig.dpi=200, out.width="100%"}
+plot(netprops, layout = "circle")
+```
+Layouts from `igraph`package are also accepted (see `?igraph::layout_`).
+
+```{r netplot_mod_layout_with_fr, fig.height=14, fig.width=15, fig.dpi=200, out.width="100%"}
+plot(netprops, layout = "layout_with_fr")
+```
+
+We can also use a layout in matrix form. Here, we generate one in advance using the Fruchterman-Reingold layout algorithm from `igraph` package.
+
+```{r netplot_mod_layout_matrix, fig.height=14, fig.width=15, fig.dpi=200, out.width="100%"}
+# Compute layout
+graph <- igraph::graph_from_adjacency_matrix(net$adjaMat1, 
+                                             weighted = TRUE)
+set.seed(123456)
+lay_fr <- igraph::layout_with_fr(graph)
+
+# Row names of the layout matrix must match the node names
+rownames(lay_fr) <- rownames(net$adjaMat1)
+
+plot(netprops, 
+     layout = lay_fr)
+```
+We can see that the layout is basically the same as before, just rotated.
+
+### Repulsion
+
+Nodes are placed closer together for smaller repulsion values and further apart for higher values.
+
+```{r netplot_mod_repulsion1, fig.height=14, fig.width=15, fig.dpi=200, out.width="100%"}
+plot(netprops, repulsion = 1.2)
+```
+
+```{r netplot_mod_repulsion2, fig.height=14, fig.width=15, fig.dpi=200, out.width="100%"}
+plot(netprops, repulsion = 0.5)
+```
+
+For the rest of the tutorial we will use a repulsion of 0.9.
+
+### Shorten labels 
+
+```{r netplot_mod_shortlabs_simple, fig.height=14, fig.width=15, fig.dpi=200, out.width="100%"}
+plot(netprops, 
+     repulsion = 0.9,
+     shortenLabels = "simple", 
+     labelLength = 6)
+```
+
+In many cases, it makes sense to use the " intelligent " label shortening, which preserves the last part of the label. This allows you to distinguish for instance between Enterococcus and Enterobacter. With the following label pattern, they would be abbreviated to "Enter'coc" and "Enter'bac", while both would be abbreviated to "Entero" with the "simple" shortening used before. 
+
+```{r netplot_mod_shortlabs_intelligent, fig.height=14, fig.width=15, fig.dpi=200, out.width="100%"}
+plot(netprops, 
+     repulsion = 0.9,
+     shortenLabels = "intelligent", 
+     labelPattern = c(5, "'", 3, "'", 3))
+```
+
+### Label size and scaling
+
+By default, labels are scaled according to node size.
+
+```{r netplot_mod_labelscale, fig.height=14, fig.width=15, fig.dpi=200, out.width="100%"}
+plot(netprops, 
+     repulsion = 0.9,
+     labelScale = FALSE)
+```
+
+```{r netplot_mod_labelscale_cex, fig.height=14, fig.width=15, fig.dpi=200, out.width="100%"}
+plot(netprops, 
+     repulsion = 0.9,
+     labelScale = FALSE,
+     cexLabels = 1.5)
+```
+
+If the labels overlap, one can play around with the `repulsion` argument to randomly rearrange the node placement until a good solution is found.
+
+```{r netplot_mod_labelscale_rep, fig.height=14, fig.width=15, fig.dpi=200, out.width="100%"}
+plot(netprops, 
+     repulsion = 0.901,
+     labelScale = FALSE,
+     cexLabels = 1.5)
+```
+
+### Label font
+
+Label fonts can be manipulated with the `labelFont' argument. 2 stands for bold, 3 for italic, and so on. 
+It is recommended to define the hub label font separately.
+
+```{r netplot_mod_label_font, fig.height=14, fig.width=15, fig.dpi=200, out.width="100%"}
+plot(netprops, 
+     repulsion = 0.901,
+     labelScale = FALSE,
+     cexLabels = 1.5,
+     labelFont = 3,
+     hubLabelFont  = 2)
+```