Skip to content

Source code for metabolite identification with bi-modal VAEs

License

Notifications You must be signed in to change notification settings

sgalkina/svae_spectra

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

5 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

svae_spectra

Source code for the paper "Bi-modal Variational Autoencoders for Metabolite Identification Using Tandem Mass Spectrometry"

Source code for training and evaluating SVAE and JMVAE models is provided for spectra-SMILES and spectra-fingerprints case. data folder contains the list of spectra identifiers used for training and testing and the example data for demonstrating the required format for testing and training. experiments folder contains the code for spectra preprocessing, model training and testing. models folder contains the bimodal VAE model classes.

data/train_ids.txt is a list of MoNA and NIST identifiers used for training in the paper evaluations. data/test_ids.txt is a list of MoNA and NIST identifiers used for testing in the paper evaluations.

usage: run.py [-h] -t TRAIN -v TEST [-u TRAIN_MOLECULES] -m MODEL [-k KEYWORD]
              [-l LOAD] [-d DEVICE] [-e]

optional arguments:
  -h, --help            show this help message and exit
  -t TRAIN, --train TRAIN
                        Train set, a csv file with comma as delimiter,
                        required columns 'spectrum' and 'SMILES'
  -v TEST, --test TEST  Test set, a csv file with comma as delimiter, required
                        columns 'spectrum' and 'SMILES'
  -u TRAIN_MOLECULES, --train_molecules TRAIN_MOLECULES
                        Unsupervised molecule train set. If not provided, the
                        algorithm runs in supervised mode
  -m MODEL, --model MODEL
                        Model name (SVAE or JMVAE). JMVAE only runs in
                        supervised mode even if train_molecules provided
  -k KEYWORD, --keyword KEYWORD
                        Additional keyword for the trained model name. Default
                        'molecules'
  -l LOAD, --load LOAD  Trained model name to evaluate or continue the
                        training
  -d DEVICE, --device DEVICE
                        Device name, default cpu
  -e, --eval            If eval flag is set, model predicts spectra from
                        molecules for 1000 samples from the test set and
                        predicts molecules from spectra

Example of training the JMVAE model for fingerprints in supervised mode:

python experiments/spectra_fingerprints_paper/run.py --train data/train.csv --test data/test.csv --model JMVAE --device cuda:0

Example of training the SVAE model for SMILES in semi-supervised mode:

python experiments/spectra/run.py --train data/train.csv --test data/test.csv --train_molecules data/unsupervised.csv --model SVAE --device cuda:0

Example of evaluating the trained SVAE model for SMILES:

python experiments/spectra/run.py --train train.csv --test data/test.csv --load trained_model.pth.tar --model SVAE --device cuda:0 --eval

About

Source code for metabolite identification with bi-modal VAEs

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages