- Authors: Chenhao Li, Niranjan Nagarajan
BEEM-Static is an R package for learning directed microbial interactions from cross-sectional microbiome profiling data based on the generalized Lotka-Volterra model (gLVM). Extending the core idea of the original BEEM algorithm for longitudinal data (Reference, Source code), BEEM-Static directly works with relative abundances to jointly estimate total biomass and gLVM parameters, thus eliminating the need for experimentally quantifying absolute abundances. BEEM-Static identifies microbiomes that are not at equilibrium states and automatically excludes such samples from the analysis. The package also provides the user with a collection of utility functions for visualizing and diagnosing the fitted model.
Note: This package is under active development. Please record the commit ID for reproducibility.
devtools::install_github('lch14forever/beem-static')
The demo dataset is a simulated community of 20 species and 500 samples. All of the samples are at the equilibrium states (generated by numerically integrating the gLVM until convergence) and each sample contains 70% of all the species randomly (each species has a 70% habitat preference).
library(beemStatic)
data("beemDemo")
attach(beemDemo)
## Use `?beemDemo` to see the help of the fields in this dataset
BEEM-Static is run by calling the func.EM
function.
res <- func.EM(dat.w.noise, ncpu=4, scaling=median(biomass.true), max.iter=200, epsilon = 1e-4)
We provide a function showInteraction
to plot the interaction network
inferred by BEEM-Static (based on the
ggraph package).
showInteraction(res, dat.w.noise)
## Warning: package 'ggplot2' was built under R version 3.5.2
BEEM-Static also estimates the biomass for each sample (retrieved by the
beem2biomass
function). Here we can compare the estimated biomass with
the true biomass on this simulated
dataset.
plot(beem2biomass(res), biomass.true, xlab='BEEM biomass estimation', ylab='True biomass')
We provide a function diagnoseFit
to plot the coefficient of
determination
(R2) for each species. A high R2 (close to 1)
value indicates that the variation in the data is well explained by the
model.
diagnoseFit(res, dat.w.noise, annotate = FALSE)
We now run two popular methods for inferring microbial interactions on our simulated data. Both methods try to infer a correlation matrix as a proxy for the interaction matrix.
- Using an naive Spearman’s correlation method
spearman <- cor(t(dat.w.noise), method='spearman')
- Using SPIEC-EASI
## devtools::install_github("zdk123/SpiecEasi")
library(SpiecEasi)
##
## Attaching package: 'SpiecEasi'
## The following object is masked from 'package:igraph':
##
## make_graph
se <- spiec.easi(t(dat.w.noise), method='mb')
## Applying data transformations...
## Selecting model with pulsar using stars...
## Fitting final estimate with mb...
## done
se.stab <- as.matrix(getOptMerge(se))
- Using BEEM-Static
est <- beem2param(res)
We implement a function auc.b
for ploting the receiver operating
characteristic (ROC) curve with computed area under the curve (AUC). We
compare the ROC curves of the above three methods. We use inference
function provided in the pacakge to estimate confidence values
(t-statistics) that can be used to rank the predictions.
par(mfrow=c(1,3))
auc.b(spearman, scaled.params$b.truth, is.association = TRUE, main='Spearman correlation', print.auc.cex=2)
auc.b(se.stab, scaled.params$b.truth, is.association = TRUE, main='SPIEC-EASI', print.auc.cex=2)
auc.b(inference(dat.w.noise, res), scaled.params$b.truth, main='BEEM-static', print.auc.cex=2)
A manuscript for BEEM-Static is in preparation and please contact us (Li Chenhao or Niranjan Nagarajan) if you are interested in using it. Alternatively, you can also cite our manuscript on BEEM:
- C Li, et al. (2018) An expectation-maximization-like algorithm enables accurate ecological modeling using longitudinal metagenome sequencing data. Microbiome