This repository pertains to the manuscript "Statistical Design of a Synthetic Bacterial Community that Clears a multi-drug Resistant gut pathogen."
This repository contains two demos:
This folder contains a demonstration of the construction of strain and metabolite landscapes using the PCA projections of the real dataset. These diagrams can also be found in our manuscript.
This folder contains a demonstration of PCA, metabolite RF modeling, and PCA landscape construction on synthetic datasets. These synthetic datasets are the same size as the real dataset and similar in nature. The results here are solely for pedagogical demonstration of our code. No part of these datasets or of the results generated from them has been used in our analysis or in the manuscript.
- In your terminal, type and execute the following: git clone https://github.com/aramanlab/Oliveira_et_al_2024_V_Demo.git
- Alternatively, go to https://github.com/aramanlab/Oliveira_et_al_2024_V_Demo.git and download the zip file.
Installation/Download Time: 1 minute.
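After downloading, it may help to confirm that your environment matches the package versions this README lists for the notebooks (scikit-learn 1.0.2, scipy 1.7.3, Matplotlib 3.5.1). The snippet below is a minimal sketch using only the Python standard library; the function name `check_versions` is hypothetical and not part of this repository, and a version mismatch does not necessarily mean the notebooks will fail.

```python
from importlib.metadata import PackageNotFoundError, version

# Versions listed in this README for the notebooks.
EXPECTED = {"scikit-learn": "1.0.2", "scipy": "1.7.3", "matplotlib": "3.5.1"}


def check_versions(expected=EXPECTED):
    """Return {package: (listed_version, installed_version_or_None)} for each mismatch."""
    mismatches = {}
    for name, wanted in expected.items():
        try:
            found = version(name)
        except PackageNotFoundError:
            found = None  # package is not installed at all
        if found != wanted:
            mismatches[name] = (wanted, found)
    return mismatches


if __name__ == "__main__":
    for name, (wanted, found) in check_versions().items():
        print(f"{name}: README lists {wanted}, found {found or 'not installed'}")
```

An empty result means every listed package is installed at exactly the listed version.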
This repository contains the following datasets:
- PCA_coordinates_Metabolite_space_original_81_actual.csv ------- [Metabolite_Space_Principal_component_landscape.ipynb => Output]
- PCA_coordinates_Strain_Presence_Absence_Space_Original_96_actual.csv ------- [Strain_Space_Principal_component_landscape.ipynb => Output]
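
The two files above hold PCA coordinates produced by the landscape notebooks. As a rough illustration of the idea, the sketch below projects a synthetic strain presence-absence matrix onto two principal components and interpolates an outcome over that plane. It is not the notebooks' code: the data are random, the strain count (40) and grid size are hypothetical, and linear interpolation with `scipy.interpolate.griddata` is an assumed stand-in for whatever landscape construction the notebooks use.

```python
import numpy as np
from scipy.interpolate import griddata
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)

# Synthetic stand-in data: 96 consortia x 40 strains (40 is hypothetical).
# 1 = strain present in the consortium, 0 = absent.
n_experiments, n_strains = 96, 40
presence = rng.integers(0, 2, size=(n_experiments, n_strains))

# Synthetic outcome per consortium (placeholder for a log10 KP CFU value).
kp_log_cfu = rng.normal(loc=6.0, scale=1.0, size=n_experiments)

# Project each consortium onto the first two principal components.
pca = PCA(n_components=2)
coords = pca.fit_transform(presence)  # shape (96, 2): columns are PC1, PC2

# Build a regular grid spanning the PC1/PC2 range.
grid_x, grid_y = np.meshgrid(
    np.linspace(coords[:, 0].min(), coords[:, 0].max(), 50),
    np.linspace(coords[:, 1].min(), coords[:, 1].max(), 50),
)

# Interpolate the outcome over the grid; cells outside the convex hull
# of the data points are returned as NaN.
landscape = griddata(coords, kp_log_cfu, (grid_x, grid_y), method="linear")

print("Explained variance ratio:", pca.explained_variance_ratio_)
print("Landscape grid shape:", landscape.shape)
```

The resulting `landscape` array can be passed to a Matplotlib call such as `plt.contourf(grid_x, grid_y, landscape)` to draw the surface.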
Strain presence-absence vs KP CFU for Original and out-of-sample experiments:
- consortia_taxa_presence_and_KpCFUs.csv ------- [Input => RF_Model_Metabolites.ipynb, Strain_Space_Principal_component_landscape.ipynb]
- consortia_taxa_presence_and_KPCFUs_OOS.csv
Metabolite Z score vs KP CFU for Original and out-of-sample experiments:
- Metabolite_120_Hr_dataset.csv ------- [Input => RF_Model_Metabolites.ipynb, Metabolite_Space_Principal_component_landscape.ipynb]
- Metabolite_120_Hrs_OOS_set_log10_CFU.csv ------- [Input => RF_Model_Metabolites.ipynb]
This repository contains the following notebooks:
Major package versions: scikit-learn [1.0.2], scipy [1.7.3], Matplotlib [3.5.1]
- Jupyter Notebook to construct the metabolite space landscape of the original experiments [Metabolite_Space_Principal_component_landscape.ipynb] [Runtime 5 minutes]
- Jupyter Notebook to construct the strain space landscape of the original experiments [Strain_Space_Principal_component_landscape.ipynb] [Runtime 5 minutes]
- Jupyter Notebook to construct a Random Forest model on metabolites to predict KP suppression [RF_Model_Metabolites.ipynb] [Runtime 5 minutes]
  - Trained on the original 96 experiments
  - Tested on 60 out-of-sample experiments
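
The train/test split described for the Random Forest notebook (fit on 96 original experiments, evaluate on 60 out-of-sample experiments) can be sketched as follows. This is an illustration on random synthetic data, not the notebook's code: the metabolite count (20), the regression target, and the hyperparameters are all assumptions, and the notebook may frame the problem differently (for example as classification).

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import r2_score

rng = np.random.default_rng(0)
n_metabolites = 20  # hypothetical number of metabolite features


def make_synthetic(n_samples):
    """Synthetic metabolite z-scores and a log10-CFU-like target."""
    X = rng.normal(size=(n_samples, n_metabolites))
    # Assumed toy relationship: two metabolites drive the outcome, plus noise.
    y = 6.0 - 1.5 * X[:, 0] + 0.8 * X[:, 1] + rng.normal(scale=0.3, size=n_samples)
    return X, y


X_train, y_train = make_synthetic(96)  # stands in for the original experiments
X_oos, y_oos = make_synthetic(60)      # stands in for the out-of-sample experiments

# Fit only on the training set; the out-of-sample set is never seen during fitting.
model = RandomForestRegressor(n_estimators=200, random_state=0)
model.fit(X_train, y_train)

pred = model.predict(X_oos)
oos_r2 = r2_score(y_oos, pred)
print("Out-of-sample R^2:", oos_r2)

# Rank features by impurity-based importance, most important first.
top_features = np.argsort(model.feature_importances_)[::-1][:5]
print("Top feature indices:", top_features)
```

Keeping the out-of-sample experiments entirely outside the fitting step is what makes the reported score an estimate of performance on new consortia.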
Pseudocode describing the overarching steps in the calculations can be found in the Pseudocodes folder and in the folders containing the notebooks. These files also list run times for the code they describe.