You can now train models on this dataset with SpeechBrain.
This is the binaural separation dataset for two or three speakers used in "Real-time binaural speech separation with preserved spatial cues". Briefly, we randomly sampled two or three speaker locations from the CIPIC HRTF database, convolved the corresponding HRIRs with two or three randomly sampled utterances from wsj0, and mixed the results. We also created two-speaker mixtures with DEMAND noise or with simulated BRIR reverberation.
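The convolve-and-mix step described above can be sketched as follows. This is a minimal illustration, not the code in the scripts: the function names `spatialize` and `mix_binaural` are hypothetical, the signals and HRIRs are random placeholders instead of wsj0 utterances and CIPIC filters, and truncating all sources to the shortest one is an assumption.

```python
import numpy as np
from scipy.signal import fftconvolve


def spatialize(mono, hrir_left, hrir_right):
    """Convolve a mono utterance with a left/right HRIR pair.

    Returns an array of shape (2, n_samples): row 0 is the left ear,
    row 1 is the right ear.
    """
    left = fftconvolve(mono, hrir_left, mode="full")
    right = fftconvolve(mono, hrir_right, mode="full")
    return np.stack([left, right])


def mix_binaural(utterances, hrirs):
    """Spatialize each utterance and sum the results into one mixture.

    utterances: list of 1-D arrays, one per speaker.
    hrirs: list of (hrir_left, hrir_right) pairs, one per speaker location.
    Returns (mixture, sources) with shapes (2, n) and (n_speakers, 2, n).
    """
    sources = [spatialize(u, hl, hr) for u, (hl, hr) in zip(utterances, hrirs)]
    # Assumption: truncate every source to the shortest one before mixing.
    n = min(s.shape[1] for s in sources)
    sources = np.stack([s[:, :n] for s in sources])
    return sources.sum(axis=0), sources


# Placeholder data standing in for two utterances and two HRIR pairs.
rng = np.random.default_rng(0)
utterances = [rng.standard_normal(8000), rng.standard_normal(9000)]
hrirs = [(rng.standard_normal(200), rng.standard_normal(200)) for _ in range(2)]

mixture, sources = mix_binaural(utterances, hrirs)
print(mixture.shape, sources.shape)
```

Keeping the spatialized sources alongside the mixture gives binaural training targets, so the interaural cues of each speaker are available to the model being trained.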
These scripts require the NumPy, SciPy, and pandas packages.
The dataset is built from the original wsj0 dataset and the CIPIC HRTF Database. The CIPIC HRTF Database contains measured HRIR filters at 25 interaural-polar azimuths from −80° to 80° and 50 interaural-polar elevations from −90° to 270°. We split the subjects into 27 for training, 9 for validation, and 9 for testing, so that the model is evaluated in a listener-independent way. We used the WAV version of CIPIC.
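A listener-independent split like the one above can be sketched as below. The function `split_subjects`, the subject identifiers, the shuffling, and the seed are all hypothetical; the scripts may assign subjects to the sets differently.

```python
import random


def split_subjects(subject_ids, n_train=27, n_valid=9, n_test=9, seed=0):
    """Partition subject ids into disjoint train/valid/test sets."""
    ids = sorted(subject_ids)
    if len(ids) != n_train + n_valid + n_test:
        raise ValueError("split sizes must add up to the number of subjects")
    # Seeded shuffle so the partition is reproducible.
    random.Random(seed).shuffle(ids)
    return {
        "train": ids[:n_train],
        "valid": ids[n_train:n_train + n_valid],
        "test": ids[n_train + n_valid:],
    }


# Hypothetical identifiers; real CIPIC subject names differ.
subjects = [f"subject_{i:03d}" for i in range(45)]
splits = split_subjects(subjects)
print({name: len(ids) for name, ids in splits.items()})
```

Because every subject appears in exactly one set, no HRIR used at test time comes from a listener seen during training.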
$ python create_wav_2speakers.py \
    --wsj0-root /path/to/wsj/wsj0/ \
    --output-dir /path/to/the/output/directory/
The arguments for the script are:
- wsj0-root: Path to the folder containing wsj0/
- output-dir: Where to write the new dataset.
$ python create_wav_2speakers_noise.py \
    --wsj0-root /path/to/wsj/wsj0/ \
    --output-dir /path/to/the/output/directory/
The arguments for the script are:
- wsj0-root: Path to the folder containing wsj0/
- output-dir: Where to write the new dataset. The script downloads the DEMAND dataset automatically.
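One common way to add noise to a binaural mixture is to scale the noise to a target signal-to-noise ratio, sketched below. This is an assumption for illustration only: the source does not state how the script scales DEMAND noise, and the function `add_noise_at_snr`, the SNR value, and the random placeholder signals are hypothetical.

```python
import numpy as np


def add_noise_at_snr(mixture, noise, snr_db):
    """Add two-channel noise to a two-channel mixture at a target SNR.

    mixture, noise: arrays of shape (2, n_samples); noise must be at least
    as long as the mixture. One gain is applied to both noise channels so
    the interaural level difference of the noise is unchanged.
    Returns (noisy_mixture, gain).
    """
    if noise.shape[1] < mixture.shape[1]:
        raise ValueError("noise must be at least as long as the mixture")
    noise = noise[:, :mixture.shape[1]]
    power_mix = np.mean(mixture ** 2)
    power_noise = np.mean(noise ** 2)
    # Solve power_mix / (gain**2 * power_noise) = 10**(snr_db / 10) for gain.
    gain = np.sqrt(power_mix / (power_noise * 10 ** (snr_db / 10)))
    return mixture + gain * noise, gain


# Placeholder signals standing in for a binaural mixture and a noise clip.
rng = np.random.default_rng(0)
mixture = rng.standard_normal((2, 16000))
noise = rng.standard_normal((2, 20000))
noisy, gain = add_noise_at_snr(mixture, noise, snr_db=5.0)
print(noisy.shape)
```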
$ python create_wav_2speakers_reverb.py \
    --wsj0-root /path/to/wsj/wsj0/ \
    --output-dir /path/to/the/output/directory/
The arguments for the script are:
- wsj0-root: Path to the folder containing wsj0/
- output-dir: Where to write the new dataset. The script downloads the simulated room impulse responses automatically.