1. Introduction

Abdurrahman Abul-Basher edited this page Jun 19, 2021 · 3 revisions

About

triUMPF (triple non-negative matrix factorization (NMF) with community detection for metabolic pathway inference) combines three stages of NMF to capture relationships between enzymes and pathways within a network, followed by community detection to extract higher-order network structure, with the aim of improving the precision of metabolic pathway prediction.

Fig. 1: triUMPF workflow

Specifically, triUMPF (Fig. 1) performs training in three iterative steps (see Training):

  1. Building three association matrices: First, MetaCyc is used to extract three matrices (Fig. 1(a)): i) pathway and enzyme (P2E) associations, where enzymes are indicated by enzyme commission (EC) numbers, ii) enzyme-enzyme interactions (E2E), and iii) pathway-pathway interactions (P2P).
  2. Generating features: During this step (Fig. 1(b)), pathway2vec is applied to automatically generate features corresponding to pathways and enzymes (ECs).
  3. Training: triUMPF is trained in two phases: i) decomposition of the pathway-EC association matrix (Fig. 1(c)) and ii) subnetwork or community reconstruction while simultaneously learning optimal multi-label pathway parameters (Figs. 1(d-f)).

The last step described above is repeated a prespecified number of times. Once the training is complete (Fig. 1(g)), the trained model can then be applied to predict metabolic pathways from a newly sequenced genome (see Tutorial on pathway prediction).
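To make the first and third steps more concrete, the following is a minimal sketch and not triUMPF's actual implementation. It builds a toy binary P2E matrix from a hypothetical pathway-to-EC mapping, derives P2P and E2E matrices by simple co-occurrence (an assumption made only for illustration; triUMPF extracts these from MetaCyc), and factorizes P2E with a generic multiplicative-update NMF. The pathway names, the function names `build_association_matrices` and `nmf`, and the chosen rank are all hypothetical.

```python
import numpy as np

# Hypothetical toy data: pathway -> EC numbers (not taken from MetaCyc).
pathway_to_ecs = {
    "pathway_A": ["1.1.1.1", "2.7.1.1"],
    "pathway_B": ["2.7.1.1", "4.1.2.13"],
    "pathway_C": ["1.1.1.1", "4.1.2.13", "5.3.1.9"],
}


def build_association_matrices(pathway_to_ecs):
    """Build toy P2E, P2P and E2E matrices from a pathway -> EC mapping."""
    pathways = sorted(pathway_to_ecs)
    ecs = sorted({ec for members in pathway_to_ecs.values() for ec in members})
    ec_index = {ec: j for j, ec in enumerate(ecs)}

    # P2E: entry (i, j) is 1 when pathway i contains EC j.
    p2e = np.zeros((len(pathways), len(ecs)))
    for i, pathway in enumerate(pathways):
        for ec in pathway_to_ecs[pathway]:
            p2e[i, ec_index[ec]] = 1.0

    # Assumption for illustration only: two pathways interact if they share
    # an EC, and two ECs interact if they occur in the same pathway.
    p2p = (p2e @ p2e.T > 0).astype(float)
    e2e = (p2e.T @ p2e > 0).astype(float)
    np.fill_diagonal(p2p, 0.0)
    np.fill_diagonal(e2e, 0.0)
    return pathways, ecs, p2e, p2p, e2e


def nmf(X, rank, n_iter=500, eps=1e-9, seed=0):
    """Generic NMF via multiplicative updates, so that X is approximately W @ H."""
    rng = np.random.default_rng(seed)
    W = rng.random((X.shape[0], rank)) + eps
    H = rng.random((rank, X.shape[1])) + eps
    for _ in range(n_iter):
        # Updates keep W and H non-negative because every factor is non-negative.
        H *= (W.T @ X) / (W.T @ W @ H + eps)
        W *= (X @ H.T) / (W @ H @ H.T + eps)
    return W, H


pathways, ecs, p2e, p2p, e2e = build_association_matrices(pathway_to_ecs)
W, H = nmf(p2e, rank=2)  # W: pathway factors, H: EC factors
error = np.linalg.norm(p2e - W @ H)  # Frobenius reconstruction error
print(f"P2E shape: {p2e.shape}, reconstruction error: {error:.3f}")
```

Here the rows of `W` and the columns of `H` play the role of low-dimensional pathway and EC representations. According to the description above, triUMPF additionally uses pathway2vec features and community structure during training, neither of which this sketch attempts to reproduce.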

triUMPF was evaluated on the pathway prediction task using 10 multi-organism pathway datasets (see Download files). In these experiments, triUMPF performed competitively against state-of-the-art pathway inference algorithms (see Evaluation). For more information about triUMPF, please see our paper.

Citing

If you find triUMPF useful in your research, please consider citing the following paper:

M. A. Basher, Abdur Rahman, McLaughlin, Ryan J., and Hallam, Steven J. "Metabolic pathway prediction using non-negative matrix factorization with improved precision", bioRxiv (2021).

Contact information

For any inquiries or issues, please contact Abdurrahman Abul-Basher at: [email protected]
