Repository structure:
sarbecovirus_phylogeography/
├── data
└── analysis
    ├── recombination_analysis
    ├── SARS-CoV-1-like_viruses
    │   ├── Clock_calibration
    │   ├── Bayesian_divergence_time_estimation_and_recCA_inference
    │   │   ├── Primary_analysis
    │   │   └── Sensitivity_analysis_extra_genomes
    │   ├── tip-dated_phylogeography
    │   │   ├── Primary_analysis
    │   │   └── Sensitivity_analysis_withTwoGhostLineages
    │   └── PoW-transformed_phylogeography
    │       └── Primary_analysis
    ├── SARS-CoV-2-like_viruses
    │   ├── Clock_calibration
    │   ├── Bayesian_divergence_time_estimation_and_recCA_inference
    │   │   ├── Primary_analysis
    │   │   └── Sensitivity_analysis_early2020
    │   ├── tip-dated_phylogeography
    │   │   ├── Primary_analysis
    │   │   ├── Sensitivity_analysis_withTwoGhostLineages
    │   │   ├── Sensitivity_analysis_early2020
    │   │   └── Sensitivity_analysis_early2020_withTwoGhostLineages
    │   └── PoW-transformed_phylogeography
    │       ├── Primary_analysis
    │       └── Sensitivity_analysis_early2020
    └── post_hoc_analyses
The input data that we are able to share publicly are in the data directory.
The analysis directory contains XMLs and the resulting MCC trees for each of the analyses. The Clock_calibration analyses also include the subsampled tree files that were used as the empirical tree distribution in the clock calibration inference for each non-recombinant region. The post_hoc_analyses subdirectory includes Jupyter notebooks to generate Figures 1 and 3, the scripts for applying the PoW transformation to the output trees generated by the XMLs in the PoW-transformed_phylogeography analyses, and the scripts for generating the lineage dispersal rates and phylogeography figures. Refer to the methods in the manuscript for more details.
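To get an overview of which XMLs belong to which analysis, the layout above can be walked programmatically. The following is a minimal sketch, not part of the repository: the function name list_xmls_by_analysis is hypothetical, and it assumes the XMLs carry a .xml extension and that the repository has been cloned locally.

```python
from collections import defaultdict
from pathlib import Path


def list_xmls_by_analysis(repo_root):
    """Group *.xml files under <repo_root>/analysis by their containing folder.

    Hypothetical helper for illustration; assumes a ".xml" file extension.
    """
    analysis_dir = Path(repo_root) / "analysis"
    grouped = defaultdict(list)
    # rglob searches all subdirectories recursively; sorting keeps output stable
    for xml_path in sorted(analysis_dir.rglob("*.xml")):
        # Key is the folder path relative to analysis/, with forward slashes
        folder = xml_path.parent.relative_to(analysis_dir).as_posix()
        grouped[folder].append(xml_path.name)
    return dict(grouped)
```

Calling list_xmls_by_analysis("sarbecovirus_phylogeography") from the directory that holds the clone should return a dictionary keyed by paths such as SARS-CoV-2-like_viruses/tip-dated_phylogeography/Primary_analysis, each mapped to the XML file names found there.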