Tasks for Vsy

Note: 29.2.2024

Download some datasets, or subsets of them, for example from Wikipedia or other corpora: https://huggingface.co/datasets/wikipedia
Build one or more new corpora for Bulgarian
Continue the development of BgGPT and related work
If you manage, convert the GPT2-Medium model from h5 to ggml for fast CPU inference (in progress: the token-count mismatch, 50255 vs. 50257, is fixed on the fly, but other TF-to-PyTorch incompatibilities remain, such as transposes...)
Continue working with Whisper and integrate it with AutoClap and Toshko 2
...
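
For the h5-to-ggml conversion task above, the two issues mentioned (a vocabulary of 50255 rows where 50257 are expected, and transposed weights) could be handled roughly as in the sketch below. This is only an illustration under assumptions: the helper names pad_vocab_rows and transpose are hypothetical, zero-filling the missing rows is one possible choice rather than the fix actually used, and plain Python lists stand in for real tensors. Whether a given tensor needs transposing depends on the layout the target format expects (a Keras Dense kernel is stored as (in, out), while a PyTorch nn.Linear weight is (out, in)).

```python
# Minimal sketch; helper names are hypothetical, not taken from any converter.

def pad_vocab_rows(embedding, target_rows):
    """Return a copy of the embedding table padded with zero rows.

    Assumption: missing token rows can be filled with zeros. The real fix
    may initialise them differently.
    """
    if len(embedding) > target_rows:
        raise ValueError("embedding already has more rows than target_rows")
    width = len(embedding[0]) if embedding else 0
    missing = target_rows - len(embedding)
    padded = [list(row) for row in embedding]
    padded.extend([0.0] * width for _ in range(missing))
    return padded


def transpose(matrix):
    """Swap rows and columns, e.g. an (in, out) kernel to (out, in)."""
    return [list(column) for column in zip(*matrix)]


if __name__ == "__main__":
    # Toy stand-in for the token embedding: 3 rows where 5 are expected
    # (the real case would be 50255 rows padded to 50257).
    wte = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
    print(len(pad_vocab_rows(wte, 5)))

    # Toy stand-in for a dense kernel stored as (in=2, out=3).
    kernel = [[1, 2, 3], [4, 5, 6]]
    print(transpose(kernel))
```

In a real conversion the same two steps would be applied to the arrays read from the h5 file before they are written out, and only to the tensors whose layout actually differs.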

Provide feedback