This R script was part of a Macroevolution class project. It examines how random forest classifiers can be affected by traits that are correlated with the phylogenetic tree, and how phylogenetic information can be included to account for this. It's designed to help people understand the impact of tree-structured trait correlation on classification. The script measures how classifier accuracy changes when different types of traits, or the tree information itself, are removed.
Four models are compared:
- Full Model: uses all available features.
- No Random Trait Model: excludes an arbitrary trait.
- No Clade Info Model: excludes clade information.
- No Tree-Correlated Trait Model: excludes a tree-correlated trait.
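As a rough illustration of how the model variants differ, each one can be expressed as the full feature set minus one column. This is a minimal sketch in base R on simulated data. The column names (`clade`, `tree_trait`, `random_trait`, `label`) and the way the data are simulated are assumptions made for this example and are not taken from the script.

```r
# Hypothetical simulated data: none of these column names come from the script.
set.seed(1)
n <- 200
clade <- factor(sample(c("A", "B", "C", "D"), n, replace = TRUE))

dat <- data.frame(
  clade        = clade,
  # A trait whose mean depends on clade, a crude stand-in for a trait
  # that is correlated with the tree.
  tree_trait   = rnorm(n, mean = as.integer(clade)),
  # A trait drawn independently of clade.
  random_trait = rnorm(n)
)
# Class label driven by tree_trait plus noise, so it is indirectly tied to clade.
dat$label <- factor(ifelse(dat$tree_trait + rnorm(n) > 2.5, "yes", "no"))

all_features <- c("clade", "tree_trait", "random_trait")

# Each model variant is the full feature set minus one column.
feature_sets <- list(
  full          = all_features,
  no_random     = setdiff(all_features, "random_trait"),
  no_clade      = setdiff(all_features, "clade"),
  no_tree_trait = setdiff(all_features, "tree_trait")
)

# Turn each feature set into a model formula, e.g. label ~ clade + tree_trait.
formulas <- lapply(feature_sets, function(f) reformulate(f, response = "label"))
```

Building the formulas from a named list keeps the four variants side by side, so the same fitting and evaluation code can be applied to each with `lapply()`.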
Two error measures are reported:
- OOB Error: the out-of-bag error is calculated for each model.
- Overall Error: the generalization error rate is calculated on test data.
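The two error measures can be sketched as follows for a single model. This example assumes the `randomForest` package, which may or may not be what the script uses, and runs on simulated data with hypothetical column names. The 70/30 train/test split is also an assumption.

```r
library(randomForest)  # assumed to be installed; the script's package is not stated

# Hypothetical simulated data, for illustration only.
set.seed(42)
n <- 300
clade <- factor(sample(c("A", "B", "C"), n, replace = TRUE))
dat <- data.frame(
  clade        = clade,
  tree_trait   = rnorm(n, mean = as.integer(clade)),
  random_trait = rnorm(n)
)
dat$label <- factor(ifelse(dat$tree_trait + rnorm(n, sd = 0.5) > 2, "yes", "no"))

# Hold out 30% of the rows as test data (an assumed split).
test_idx <- sample(n, size = round(0.3 * n))
train <- dat[-test_idx, ]
test  <- dat[test_idx, ]

rf <- randomForest(label ~ ., data = train, ntree = 200)

# OOB error: each tree is evaluated on the training rows left out of its
# bootstrap sample; the last row of err.rate is the error using all trees.
oob_error <- rf$err.rate[rf$ntree, "OOB"]

# Test error: misclassification rate on the held-out rows.
test_pred  <- predict(rf, newdata = test)
test_error <- mean(test_pred != test$label)
```

The OOB error needs no separate test set, while the test error checks the fitted forest against rows it never saw during training, so comparing the two gives a sense of how well the OOB estimate tracks held-out performance.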
Box plots visualize the error rates across the different models.
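A plot of this kind could be drawn with base R's `boxplot()`. In the sketch below the error values are random placeholders generated only to make the example runnable; they are not results from the script, and the model labels and number of replicates are assumptions.

```r
# Placeholder error rates: random numbers for illustration only,
# not results produced by the script.
set.seed(7)
models <- c("full", "no_random", "no_clade", "no_tree_trait")
reps <- 20  # assumed number of replicate runs per model
errors <- data.frame(
  model = factor(rep(models, each = reps), levels = models),
  error = runif(length(models) * reps, min = 0.05, max = 0.35)
)

# One box per model; boxplot() also returns the summary statistics it drew.
bp <- boxplot(error ~ model, data = errors,
              xlab = "Model", ylab = "Error rate",
              main = "Error rate by model (placeholder data)")
```

Storing the errors in long format, with one row per model and replicate, lets the formula interface produce one box per model without reshaping.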
Feel free to clone, modify, and use this code for your research and applications.