Try example with file stream #103
Replies: 3 comments 2 replies
-
Hi @surasakBoonkla, the problem is that this particular aspect of the audio sources is currently not very well designed. As you may have noticed, it is not intuitive that file streams arrive already formatted while the microphone stream does not. This is going to change starting from the next release (v0.6), with which your example should work correctly; in fact, it is already implemented. Part of the road to v1.0 is to identify issues like this in order to design a robust and consistent API, so thank you for bringing this up!
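To make "formatted" concrete: regularizing a stream means rechunking arbitrarily sized blocks of samples (as a microphone produces) into fixed windows, e.g. 5 s long with a 500 ms shift as in the example. Below is a minimal, stdlib-only sketch of that idea; the buffer logic is illustrative and not diart's actual implementation.

```python
# Illustrative sketch: rechunk an irregular stream of sample blocks into
# fixed-size windows of `duration` seconds with a hop of `step` seconds.
# NOT diart's implementation; the numbers mirror the 5s/500ms example.
def regularize(blocks, sample_rate, duration=5.0, step=0.5):
    win = int(duration * sample_rate)
    hop = int(step * sample_rate)
    buf = []
    for block in blocks:        # blocks may have arbitrary sizes (like a mic)
        buf.extend(block)
        while len(buf) >= win:
            yield buf[:win]     # emit one fixed-size window
            buf = buf[hop:]     # slide forward by the hop size

# Toy stream: three irregular blocks at a tiny "sample rate" of 4 Hz,
# so each window is 20 samples and the hop is 2 samples.
blocks = [[0] * 7, [0] * 9, [0] * 30]
windows = list(regularize(blocks, sample_rate=4))
print([len(w) for w in windows])  # 14 windows of 20 samples each
```

A file source can skip this step because the whole signal is available up front and can be sliced into such windows directly, which is why the two source types behave differently before v0.6.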
-
Dear sir, please teach me a little more; I am not very good at Python. I would like to process both the wav (or segmentation(wav)) and the embedding together. My idea is to track speakers using the wav (or segmentation(wav)) and the embedding inside one function. Can I modify your example like this? I added a variable wav before emb, but it did not work.
import rx.operators as ops
import torch
segmentation = SpeakerSegmentation.from_pyannote("pyannote/segmentation")
def process(wav, emb):
stream = file.stream.pipe(
#mic.read()
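The pattern you want is to keep the waveform in the tuple that flows through the pipeline, so that a later starmap can unpack both values into your function. Here is a stdlib-only sketch of that pipeline shape using itertools.starmap over a list; rx's ops.map and ops.starmap apply the same functions to a stream. The segmentation, embedding, and process functions below are toy stand-ins, not the pyannote models.

```python
# Stdlib sketch of the pipeline shape: carry (wav, seg), then (wav, emb),
# so a final starmap can call process(wav, emb) with both values.
from itertools import starmap

def segmentation(wav):      # stand-in for the pyannote segmentation model
    return [x * 0.5 for x in wav]

def embedding(wav, seg):    # stand-in for the overlap-aware embedding model
    return [sum(wav) / len(wav), sum(seg) / len(seg)]

def process(wav, emb):      # user function that needs BOTH wav and emb
    return len(wav), len(emb)

chunks = [[1.0, 1.0, 1.0, 1.0], [0.0, 0.0, 0.0, 0.0]]
step1 = map(lambda wav: (wav, segmentation(wav)), chunks)            # (wav, seg)
step2 = starmap(lambda wav, seg: (wav, embedding(wav, seg)), step1)  # (wav, emb)
results = list(starmap(process, step2))
print(results)  # [(4, 2), (4, 2)]
```

In the rx pipeline this corresponds to replacing ops.starmap(embedding) with ops.starmap(lambda wav, seg: (wav, embedding(wav, seg))) followed by ops.starmap(process), so the waveform is never dropped from the tuple.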
-
Keep in mind that
-
Hello,
I tried the example but changed the source from the microphone to a file stream. There was no error, but nothing was printed. Please help me.
import rx.operators as ops
import diart.operators as dops
#from diart.sources import MicrophoneAudioSource
from diart.sources import FileAudioSource
from diart.blocks import SpeakerSegmentation, OverlapAwareSpeakerEmbedding
segmentation = SpeakerSegmentation.from_pyannote("pyannote/segmentation")
embedding = OverlapAwareSpeakerEmbedding.from_pyannote("pyannote/embedding")
sample_rate = segmentation.model.get_sample_rate()
#mic = MicrophoneAudioSource(sample_rate)
file = FileAudioSource('Toefl-16k-08.wav', sample_rate)
stream = file.stream.pipe(
# Reformat stream to 5s duration and 500ms shift
dops.regularize_audio_stream(sample_rate),
ops.map(lambda wav: (wav, segmentation(wav))),
ops.starmap(embedding)
).subscribe(on_next=lambda emb: print(emb.shape))
#mic.read()
file.read()