Yogera Dataset

This dataset was generated from data collected from users of the Yogera mobile app. It includes the metadata and links to the voice clips, and covers 5 languages: Luganda, Lusoga, Lumasaba, Acholi and Runyankole-Rukiga.

Release 4.0.1 combines all data collected through August 2024: Phase 1.0 (August to December 2023), Phase 1.1 (January to February 2024) and Phase 2.0 (April to August 2024). The metadata records this split in the phase column.
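As a rough illustration of using the phase column, the sketch below filters metadata rows by phase with Python's standard `csv` module. It assumes the metadata is available as CSV with a column named `phase`; the other column names (`clip_id`, `language`), the phase value strings, and the sample rows are hypothetical and are not taken from the actual dataset.

```python
import csv
import io

# Hypothetical metadata excerpt. Only the existence of a `phase` column is
# stated in this README; the other columns and all values are made up.
SAMPLE_METADATA = """clip_id,language,phase
a1,Luganda,1.0
a2,Acholi,1.1
a3,Lusoga,2.0
a4,Luganda,2.0
"""


def rows_for_phase(csv_file, phase):
    """Return the metadata rows whose `phase` column equals `phase`."""
    reader = csv.DictReader(csv_file)
    return [row for row in reader if row["phase"] == phase]


# Select only the (hypothetical) Phase 2.0 rows from the sample.
phase_2_rows = rows_for_phase(io.StringIO(SAMPLE_METADATA), "2.0")
print([row["clip_id"] for row in phase_2_rows])
```

With a real metadata file, the `io.StringIO` object would be replaced by an opened file handle, and the column name and phase values adjusted to match the actual header.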

Releases

| Version | Date Released | Voice Clips | Recorded Hours | Approved Hours | Unique Voices | Transcribed | Reviewed |
|---|---|---|---|---|---|---|---|
| 5.0.1 | Nov 20, 2024 | Link | 3,411.1 | 2,217.7 | 2,675 | 253.2 | 251.7 |
| 4.0.1 | Aug 13, 2024 | Link | 2,253.3 | 1,565.8 | 1,641 | 152.8 | 151.3 |
| 4.0.0 | Aug 07, 2024 | Link | 2,166.8 | 1,478.1 | 1,585 | 152.8 | 151.3 |
| 3.0.1 | Feb 07, 2024 | Link | 844.0 | 509.4 | 479 | 58.0 | 53.4 |
| 3.0.0 | Jan 18, 2024 | Link | 682.9 | 334.0 | 440 | 58.0 | 53.4 |
| 2.0.0 | Oct 31, 2023 | Link | 511.3 | 17.2 | 312 | 0 | 0 |
| 1.0.0 | Sept 20, 2023 | Link | 43 | 4 | 34 | 0 | 0 |