Releases: lottev1991/LotteV-Voicebank-Update-Repo
Hoshino Hanami ~AI❤dol~ for DiffSinger v1.0
The first full release of Hanami's DiffSinger voicebank. Hope you enjoy!
I changed the name of the voicebank, since it's now part of Project AI❤dol; you'll hear more details about that at a later date.
Download on MediaFire
(For organizational purposes)
Voice description
- General voice type: Natural "pop soprano", suitable for most popular genres of music.
- Vocal modes:
  - Root (Core/Normal);
  - Fragrance (Power);
  - Nectar (Soft);
- Other voicebank features:
  - Duration;
  - Velocity;
  - Gender;
  - Auto-pitch;
  - Custom vocoder (AI❤dolGAN).
Officially supported languages
These are languages for which actual data has been recorded.
- English (approx. 2 hours of data);
- Japanese (approx. 40 min. of data);
- German (approx. 17 min. of data);
- Korean (approx. 11 min. of data);
- Spanish (approx. 7 min. of data);
- Mandarin Chinese (approx. 6 min. of data);
- Latin (Ecclesiastical) (approx. 5 min. of data; no dictionary or phonemizer included yet);
- French (approx. 3 min. of data; phonemizer included).
Total data count: 3 hours, 53 minutes.
No external data has been used for training. Due to this, some of the data might have a bit of an accent. Apologies in advance!
Unofficially supported languages (dictionaries included)
- Cantonese;
- Vietnamese (phonemizer included).
Note that there are likely many more unofficially supported languages; they simply haven't been tested yet. End users are encouraged to experiment.
Relevant credits
Millefeuille phonemizer by imsupposedto @ UTAUFrance.
Hoshino Hanami -LoveSong- for DiffSinger v0.0.2b (with reflow+vocal modes)
This model was trained with the new reflow method. It has 3 vocal modes (see below). The idea of supporting tension has been scrapped. Support for more languages is currently being considered. As it's still in beta, there might be some unpredictable quirks.
Note that the voicebank currently lacks art; this will be added later. The art will be all-new and is currently a WIP. For now, the voicebank uses a beta icon as a placeholder and lacks a piano roll portrait.
Voice features
- Soprano voice type (female character);
- Natural voice tone.
Vocal modes
- Root (normal);
- Fragrance (power);
- Nectar (soft).
Supported languages
- English (primary) - approx. 2 hours of data;
- Japanese (secondary) - approx. 40 min. of data.
Supported parameters
- Random pitch shifting (gender curve);
- Duration;
- Auto-pitch.
Other features
- Custom vocoder;
- Trained with reflow.
LEONA -Ubasti- AI for DiffSinger v0.0.1b (without reflow)
Initial release.
This model was trained with the old method, so it does not make use of reflow. The sound quality should be quite good, and the model's performance is fairly stable. However, as it's still in beta, there might be some unpredictable quirks.
This model was also trained in multispeaker mode alongside Hanami's dataset; LEONA's base vocal modes contain under 1 minute of data each. Despite this, the model is of fairly good quality.
No other parameters are currently planned to be supported, as they're considered redundant.
Note that the voicebank currently lacks art; this will be added later. The art will be all-new and is currently a WIP. For now, the voicebank uses a beta icon as a placeholder and lacks a piano roll portrait.
Voice features
- Soprano voice type (female character);
- Cute character-style voice tone;
- 3 vocal modes:
  - Core (normal);
  - CatNip (power);
  - CatNap (soft).
Supported languages
- English (primary) - approx. 2 hours of data;
- Japanese (secondary) - approx. 40 min. of data.
Supported parameters
- Random pitch shifting (gender curve);
- Duration;
- Auto-pitch.
Other features
- Custom vocoder.
Hoshino Hanami UTAU Beta Voicebanks
Finally getting these out of the way. I'm not sure when they'll be finished, but I'll try my best. Voicebank dev is rather exhausting.
Fair warning: none of these otos are finished. The art is old too. The readme files might also be inaccurate right now.
Have fun regardless.
The rest will follow.
Hoshino Hanami -LoveSong- for DiffSinger v0.0.1b (without tension/reflow)
Initial release.
This model was trained with the old method, so it does not make use of reflow. It is single-speaker and currently lacks the tension parameter, although it is planned. As a result, there is currently no real control over the power of the voice. However, the sound quality should be quite good, and the model's performance is fairly stable. That said, as it's still in beta, there might be some unpredictable quirks. (A tension-supported model has been created, but due to quality issues it will not be released.)
Note that the voicebank currently lacks art; this will be added later. The art will be all-new and is currently a WIP. For now, the voicebank uses a beta icon as a placeholder and lacks a piano roll portrait.
Voice features
- Soprano voice type (female character);
- Natural voice tone.
Supported languages
- English (primary) - approx. 2 hours of data;
- Japanese (secondary) - approx. 40 min. of data.
Supported parameters
- Random pitch shifting (gender curve);
- Duration;
- Auto-pitch.
Other features
- Custom vocoder.