Cross-modal information retrieval using generative models

Implements a novel variational autoencoder (VAE) with a siamese loss on the latent space to classify visually evoked brain wave signals by the image that was shown.
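To make the idea concrete, here is a minimal sketch of how a siamese term can be added to a VAE objective for a pair of samples. It is not the code from this repository: the contrastive form of the siamese term, the squared-error reconstruction term, and the names `siamese_vae_loss`, `beta`, `gamma`, and `margin` are all assumptions chosen for illustration.

```python
import math


def kl_to_standard_normal(mu, log_var):
    """KL(N(mu, diag(exp(log_var))) || N(0, I)) for a single sample."""
    return -0.5 * sum(1.0 + lv - m * m - math.exp(lv) for m, lv in zip(mu, log_var))


def squared_error(x, x_hat):
    """Sum of squared differences between an input and its reconstruction."""
    return sum((a - b) ** 2 for a, b in zip(x, x_hat))


def contrastive(mu_a, mu_b, same_class, margin=1.0):
    """Assumed siamese term: pull same-class latent means together,
    push different-class latent means at least `margin` apart."""
    dist = math.sqrt(sum((a - b) ** 2 for a, b in zip(mu_a, mu_b)))
    if same_class:
        return dist * dist
    return max(0.0, margin - dist) ** 2


def siamese_vae_loss(pair, same_class, beta=1.0, gamma=1.0, margin=1.0):
    """Hypothetical combined loss for a pair of encoded samples.

    Each element of `pair` is a dict with keys x, x_hat, mu, log_var.
    beta weights the KL term and gamma weights the siamese term;
    both weights are illustrative, not taken from the project.
    """
    vae_terms = sum(
        squared_error(s["x"], s["x_hat"])
        + beta * kl_to_standard_normal(s["mu"], s["log_var"])
        for s in pair
    )
    siamese_term = contrastive(pair[0]["mu"], pair[1]["mu"], same_class, margin)
    return vae_terms + gamma * siamese_term
```

The siamese term acts only on the latent means, so it shapes the latent space by class while the reconstruction and KL terms are left as in a standard VAE.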

On 50-class classification, the model achieves 25.30% accuracy and 96% AUROC. A similarly deep standard VAE achieves 2% accuracy and 50% AUROC, which is equivalent to random guessing: it cannot separate the signal at all. A neural net classifier does no better than the standard VAE.

The model is also one of the first to perform reasonably on 100-class classification, reaching 19.57% accuracy and 97% AUROC. Similarly deep standard VAEs and neural nets again perform only as well as random guessing.
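For reference, the random-guessing baselines quoted above follow from simple arithmetic: with K balanced classes, chance accuracy is 1/K (2% for 50 classes, 1% for 100), and an uninformative scorer has an AUROC of 0.5. The sketch below illustrates both. How AUROC was aggregated across classes in this project is not stated, so the macro-averaged one-vs-rest scheme and the function names here are assumptions.

```python
def chance_accuracy(num_classes):
    """Expected accuracy of uniform random guessing over balanced classes."""
    return 1.0 / num_classes


def binary_auroc(scores, labels):
    """Probability that a random positive scores above a random negative
    (ties count as half). Quadratic in the number of samples; fine for a sketch."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


def macro_ovr_auroc(score_rows, true_classes, num_classes):
    """Assumed aggregation: average the one-vs-rest AUROC over every class
    that has both positive and negative samples."""
    aurocs = []
    for c in range(num_classes):
        labels = [1 if t == c else 0 for t in true_classes]
        if 0 < sum(labels) < len(labels):
            class_scores = [row[c] for row in score_rows]
            aurocs.append(binary_auroc(class_scores, labels))
    return sum(aurocs) / len(aurocs)
```

A scorer that assigns the same score to every sample gets `binary_auroc` of 0.5, which matches the 50% AUROC reported for the baselines.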

Unfortunately, the model is not able to reconstruct the image that was shown from the brain wave signals.

CS237_Final_Project (17).pdf
