arabic_calligraphy

This repository contains code for two projects:

Arabic Calligraphy Style Recognition (ACSR): detects the style of a piece of Arabic calligraphy. This work is deeply inspired by the work of ARBML. A demo is available as a HuggingFace space.
Arabic Calligraphy OCR: recognizes the text of a piece of Arabic calligraphy.

Datasets

Calliar (OCR)

Import the Calliar repository into your project by executing the bash script as follows:

bash get_calliar.sh

The script initialises and updates the git submodules, then extracts the dataset.

To get images at a given level (here, individual words), you can use the following command:

cd scripts
python3 prepare_calliar.py --level words

RuFa (Font recognition)

Fonts: Aref Ruqaa, Iran Nastaliq

https://paperswithcode.com/dataset/rufa
https://mhmoodlan.github.io/blog/arabic-font-classification

To get the data, execute the bash script as follows:

cd data
bash get_rufa_dataset.sh

KAFD (Font recognition)

Fonts: 40

https://catalog.ldc.upenn.edu/docs/LDC2016T21/2014-J-KAFD%20Arabic%20Font%20Database_1-s2.0-S0031320313005463-main.pdf

ADAB (handwritten Arabic)

To get the data, execute the bash script as follows:

cd data
bash get_adab_dataset.sh

Training

OCR

First, move to the TrOCR scripts directory:

$ cd scripts/trocr

Then run one of the following commands, depending on the level to train on (chars, words or sentences):

$ python3 train.py --level chars --test_size 0.1 --num_beams 4 --limit_eval 256 --per_device_train_batch_size 32 --per_device_eval_batch_size 16 --gradient_accumulation_steps 2

$ python3 train.py --level words --test_size 0.1 --num_beams 4 --limit_eval 256 --per_device_train_batch_size 32 --per_device_eval_batch_size 16 --gradient_accumulation_steps 2

$ python3 train.py --level sentences --test_size 0.1 --num_beams 4 --limit_eval 256 --per_device_train_batch_size 32 --per_device_eval_batch_size 16 --gradient_accumulation_steps 2
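The three commands differ only in `--level`, so they can be generated from a single set of defaults. The sketch below is illustrative and not part of the repository: `TrainConfig`, `to_command`, `effective_train_batch_size` and `all_levels` are hypothetical names, and the batch-size arithmetic assumes `train.py` passes `--per_device_train_batch_size` and `--gradient_accumulation_steps` through to a Hugging Face style trainer, where gradients are accumulated over several batches before each optimizer step.

```python
from dataclasses import asdict, dataclass
from typing import List


@dataclass(frozen=True)
class TrainConfig:
    """Hypothetical container mirroring the flags used in the commands above."""

    level: str
    test_size: float = 0.1
    num_beams: int = 4
    limit_eval: int = 256
    per_device_train_batch_size: int = 32
    per_device_eval_batch_size: int = 16
    gradient_accumulation_steps: int = 2

    def to_command(self) -> List[str]:
        """Build the argument list for train.py, one --flag value pair per field."""
        cmd = ["python3", "train.py"]
        for name, value in asdict(self).items():
            cmd += [f"--{name}", str(value)]
        return cmd

    def effective_train_batch_size(self, num_devices: int = 1) -> int:
        """Samples seen per optimizer step.

        Assumption: the flags behave like the Hugging Face TrainingArguments
        of the same name (per-device batch x accumulation steps x devices).
        """
        return (
            self.per_device_train_batch_size
            * self.gradient_accumulation_steps
            * num_devices
        )


def all_levels() -> List[TrainConfig]:
    """One configuration per level listed in this README."""
    return [TrainConfig(level=level) for level in ("chars", "words", "sentences")]


if __name__ == "__main__":
    for cfg in all_levels():
        # Print the command instead of running it; pass cfg.to_command() to
        # subprocess.run(..., cwd="scripts/trocr") to actually launch training.
        print(" ".join(cfg.to_command()))
        print("effective train batch size:", cfg.effective_train_batch_size())
```

Under that assumption, the values above give 32 x 2 = 64 samples per optimizer step on a single device. If memory is tight, halving `per_device_train_batch_size` while doubling `gradient_accumulation_steps` keeps that number unchanged.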
