
Masked Vision Transformers for Hyperspectral Image Classification

This project tailors vision transformers to the characteristics of hyperspectral aerial and satellite imagery using: (i) blockwise patch embeddings, (ii) spatial-spectral self-attention, (iii) spectral positional embeddings, and (iv) masked self-supervised pre-training.

Results were presented at the CVPR EarthVision Workshop 2023 (Paper Link).

Masked pre-training

Masked pre-training is started by running the pretrain.py script. Before starting the training, make sure to adjust the paths to your local copy of the dataset in configs/config.yaml. Hyperparameters can be adjusted in configs/pretrain_config.yaml.
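
Assuming pretrain.py takes no further command-line arguments (paths and hyperparameters are read from the two config files named above), a minimal launch from the repository root is:

  # Masked self-supervised pre-training; dataset paths come from
  # configs/config.yaml, hyperparameters from configs/pretrain_config.yaml.
  python pretrain.py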

Land-cover Classification

The finetune.py script can be used to fine-tune a pre-trained model or to train a model from scratch for classification of EnMAP or Houston2018 data. The desired dataset must be provided as an argument, e.g., finetune.py enmap. Prior to training, the dataset paths must be specified in configs/config.yaml. Hyperparameters can be adjusted in configs/finetune_config_{dataset}.yaml. There is also an alternative fine-tuning script for use with the wandb sweep functionality at src/finetune_sweep.py.
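
A typical invocation looks as follows; finetune.py enmap is taken from the text above, while the houston2018 spelling is an assumption mirroring the finetune_config_{dataset}.yaml naming:

  # Fine-tune a pre-trained model (or train from scratch) on EnMAP data.
  python finetune.py enmap

  # Houston2018 run; the argument spelling is assumed to match
  # configs/finetune_config_houston2018.yaml.
  python finetune.py houston2018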

Data

  • The Houston2018 dataset is publicly available from the Hyperspectral Image Analysis Lab at the University of Houston and IEEE GRSS IADF.
  • Code to re-create the unlabeled EnMAP and labeled EnMAP-DFC datasets is made available in the enmap_data directory. Please follow the instructions there.

Pre-trained checkpoints

We provide pre-trained model checkpoints for the spatial-spectral transformer on the Houston2018 and EnMAP datasets.

Code

This repository was developed using Python 3.8.13 with PyTorch 1.12. See the requirements.txt file for the full list of dependencies.
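
To recreate the environment, the usual pip workflow should suffice (requirements.txt is part of the repository; the command below is a standard sketch, not a repository-specific script):

  # Install the pinned dependencies into an active Python 3.8 environment.
  pip install -r requirements.txt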

It incorporates code from the following source for the 3D-CNN model of Li et al. (Remote Sensing, 2017):

The vision transformer and SimMIM implementations are adapted from:

Reference

If you would like to cite our work, please use the following reference:

  • Scheibenreif, L., Mommert, M., & Borth, D. (2023). Masked Vision Transformers for Hyperspectral Image Classification. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW) 2023.
@inproceedings{scheibenreif2023masked,
  title={Masked vision transformers for hyperspectral image classification},
  author={Scheibenreif, Linus and Mommert, Michael and Borth, Damian},
  booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW)},
  pages={2166--2176},
  year={2023}
}
