Code for "Translatotron-V(ison): An End-to-End Model for In-Image Machine Translation" (Findings of ACL 2024).
cd src
pip install -e ./
The dataset can be downloaded here.
Run data-build/create_lmdb.sh to process IIMT data.
Run script/train_mgpu_tiny.sh to train the image tokenizer.
- Run script/vit-vqgan/run_t2i_layout.sh to train the teacher model.
- Run script/iimt/run_translatotron_v.sh to train Translatotron-V.
Run script/test_translatotron_v.sh to test Translatotron-V.
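The steps above can be sketched as a single wrapper that runs the scripts in the order this README lists them. This is only a sketch: the wrapper, the `run_pipeline` function, and the `DRY_RUN` variable are hypothetical and not part of the repository. The script paths are copied from the steps above; their arguments and the directory they expect to be run from are not documented here, so none are assumed. By default the wrapper only prints the commands.

```shell
#!/usr/bin/env bash
# Hypothetical wrapper (not part of the repository): walks through the
# README steps in order. Script paths come from the README; no arguments
# are passed because none are documented.
set -euo pipefail

STEPS=(
  "data-build/create_lmdb.sh"           # process IIMT data
  "script/train_mgpu_tiny.sh"           # train the image tokenizer
  "script/vit-vqgan/run_t2i_layout.sh"  # train the teacher model
  "script/iimt/run_translatotron_v.sh"  # train Translatotron-V
  "script/test_translatotron_v.sh"      # test Translatotron-V
)

run_pipeline() {
  # $1 = 1 prints each command without running it; anything else runs it.
  local dry_run="${1:-1}"
  local step
  for step in "${STEPS[@]}"; do
    if [ "$dry_run" = "1" ]; then
      echo "bash $step"
    else
      bash "$step"
    fi
  done
}

# DRY_RUN defaults to 1 so nothing is executed unless explicitly requested.
run_pipeline "${DRY_RUN:-1}"
```

Setting `DRY_RUN=0` would execute each script in turn and stop at the first failure because of `set -e`. Check each script for paths and GPU settings before doing so.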
- parti-pytorch: the codebase we built upon. It is an implementation of Parti, Google's pure attention-based text-to-image neural network, in PyTorch.