Word-To-Image: Morphing Arabic Text to a Visual Representation
Preprocessing:
- Extract the captions from the image file names.
- Remove the digits in the captions using the Maha library.
- Write the "file_name" and "text" fields to a metadata.jsonl file, as recommended by Hugging Face (HF) for image-folder datasets (a sketch of these steps follows this list).
- Upload the dataset to the HF Dataset Hub (also sketched below).
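
A minimal sketch of the preprocessing steps, under a few assumptions: the images are taken to live in a `train/` folder as PNGs named after their captions, and Maha's `remove` cleaner is assumed to accept a `numbers` flag; adjust to the real file layout.

```python
# Hypothetical sketch: build metadata.jsonl for an HF "imagefolder" dataset.
import json
from pathlib import Path

from maha.cleaners.functions import remove  # Maha Arabic text-processing library

IMAGES_DIR = Path("train")  # hypothetical folder holding the images

records = []
for image_path in sorted(IMAGES_DIR.glob("*.png")):
    caption = image_path.stem                # caption is stored in the file name
    caption = remove(caption, numbers=True)  # strip digits from the caption
    records.append({"file_name": image_path.name, "text": caption.strip()})

# HF's ImageFolder loader picks up captions from a metadata.jsonl placed
# next to the images, one JSON object per line.
with open(IMAGES_DIR / "metadata.jsonl", "w", encoding="utf-8") as f:
    for record in records:
        # ensure_ascii=False keeps the Arabic text readable in the file
        f.write(json.dumps(record, ensure_ascii=False) + "\n")
```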
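The upload step can then go through the datasets library's imagefolder loader, which picks up metadata.jsonl automatically. The repo id below is a placeholder, and a prior `huggingface-cli login` is assumed.

```python
# Hypothetical sketch: load the captioned images and push them to the HF Hub.
from datasets import load_dataset

dataset = load_dataset("imagefolder", data_dir="train")
dataset.push_to_hub("<your-username>/calliar-captions")  # placeholder repo id
```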
Training:
- Create a virtual environment using Python's built-in venv module.
- Install diffusers following these instructions: https://github.com/huggingface/diffusers/tree/main/examples/text_to_image#installing-the-dependencies.
- Run the training with the command line below. Note that dataset_name is set to the example dataset from the diffusers docs; point it at the dataset uploaded in the preprocessing step.
```bash
export MODEL_NAME="CompVis/stable-diffusion-v1-4"
export dataset_name="lambdalabs/pokemon-blip-captions"

accelerate launch --mixed_precision="fp16" train_text_to_image.py \
  --pretrained_model_name_or_path=$MODEL_NAME \
  --dataset_name=$dataset_name \
  --use_ema \
  --resolution=512 --center_crop --random_flip \
  --train_batch_size=1 \
  --gradient_accumulation_steps=4 \
  --gradient_checkpointing \
  --max_train_steps=15000 \
  --learning_rate=1e-05 \
  --max_grad_norm=1 \
  --lr_scheduler="constant" --lr_warmup_steps=0 \
  --output_dir="calliar_1"
```
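
Once training finishes, the script saves a full pipeline to --output_dir, which can be loaded with diffusers. A minimal sketch, assuming the checkpoint in ./calliar_1 and a CUDA GPU; the prompt is a hypothetical example.

```python
# Hypothetical sketch: run inference with the fine-tuned model.
import torch
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained(
    "calliar_1", torch_dtype=torch.float16  # path used as --output_dir above
).to("cuda")

prompt = "بسم الله"  # hypothetical Arabic prompt
image = pipe(prompt).images[0]
image.save("sample.png")
```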
References:
- The model and the latest checkpoint: here.