1. Introduction
triUMPF (triple non-negative matrix factorization (NMF) with community detection for metabolic pathway inference) combines three stages of NMF, which capture relationships between enzymes and pathways within a network, with community detection, which extracts higher-order network structure, to improve the precision of metabolic pathway prediction.
Fig. 1: triUMPF workflow
Specifically, triUMPF (Fig. 1) performs training in three iterative steps (see Training):
- Building three association matrices: First, three matrices are extracted from MetaCyc (Fig. 1(a)): i) pathway-enzyme (P2E) associations, indicated by enzyme commission (EC) numbers; ii) enzyme-enzyme interactions (E2E); and iii) pathway-pathway interactions (P2P).
- Generating features: During this step (Fig. 1(b)), pathway2vec is applied to automatically generate features corresponding to pathways and enzymes (or EC).
- Training: triUMPF is trained in two phases: i) decomposition of the pathway-EC association matrix (Fig. 1(c)), and ii) subnetwork or community reconstruction while simultaneously learning optimal multi-label pathway parameters (Fig. 1(d-f)).
The last step described above is repeated a prespecified number of times. Once the training is complete (Fig. 1(g)), the trained model can then be applied to predict metabolic pathways from a newly sequenced genome (see Tutorial on pathway prediction).
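As a rough illustration of the decomposition phase, the sketch below builds a small binary pathway-by-EC matrix and factorizes it with plain NMF using multiplicative updates. It is not triUMPF's actual objective or code: the pathway names, the `build_p2e` and `nmf` helpers, and the rank are assumptions made for this example, and the community and multi-label terms used by triUMPF are omitted.

```python
import numpy as np

# Hypothetical toy data: pathway -> EC numbers (not extracted from MetaCyc).
pathway_to_ecs = {
    "pathway_A": ["1.1.1.1", "2.7.1.1"],
    "pathway_B": ["2.7.1.1", "4.1.2.13"],
    "pathway_C": ["1.1.1.1", "4.1.2.13", "5.3.1.9"],
}


def build_p2e(mapping):
    """Return (binary pathway-by-EC matrix, pathway names, EC numbers)."""
    pathways = sorted(mapping)
    ecs = sorted({ec for ec_list in mapping.values() for ec in ec_list})
    col = {ec: j for j, ec in enumerate(ecs)}
    matrix = np.zeros((len(pathways), len(ecs)))
    for i, pathway in enumerate(pathways):
        for ec in mapping[pathway]:
            matrix[i, col[ec]] = 1.0
    return matrix, pathways, ecs


def nmf(matrix, rank, n_iter=200, eps=1e-9, seed=0):
    """Plain NMF (matrix ~ W @ H) via Lee-Seung multiplicative updates."""
    rng = np.random.default_rng(seed)
    n_rows, n_cols = matrix.shape
    W = rng.random((n_rows, rank))   # pathway factors, kept non-negative
    H = rng.random((rank, n_cols))   # EC factors, kept non-negative
    errors = []
    for _ in range(n_iter):
        H *= (W.T @ matrix) / (W.T @ W @ H + eps)
        W *= (matrix @ H.T) / (W @ H @ H.T + eps)
        errors.append(np.linalg.norm(matrix - W @ H))  # Frobenius error
    return W, H, errors


p2e, pathways, ecs = build_p2e(pathway_to_ecs)
W, H, errors = nmf(p2e, rank=2)
print(p2e.shape, W.shape, H.shape)
print(f"reconstruction error: {errors[0]:.3f} -> {errors[-1]:.3f}")
```

Each row of `W` is a low-dimensional representation of a pathway and each column of `H` a representation of an EC number; the rank controls how many latent components are shared between them.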
triUMPF was evaluated on the pathway prediction task using 10 multi-organism pathway datasets (see Download files). In these experiments, triUMPF performed competitively against state-of-the-art pathway inference algorithms (see Evaluation). For more information about triUMPF, please see our paper.
If you find triUMPF useful in your research, please consider citing the following paper:
M. A. Basher, Abdur Rahman, McLaughlin, Ryan J., and Hallam, Steven J. "Metabolic pathway prediction using non-negative matrix factorization with improved precision", bioRxiv (2021).
For any inquiries or issues, please contact Abdurrahman Abul-Basher at: [email protected]