Try example with file stream #103
Replies: 3 comments 2 replies
-
Hi @surasakBoonkla, the problem is that this particular aspect of the audio sources is currently not very well designed. As you may have noticed, it is not intuitive that file streams arrive already formatted while the microphone stream does not. This is going to change starting from the next release (v0.6), with which your example should work correctly; in fact, it is already implemented. Part of the road to v1.0 is to identify issues like this in order to design a robust and consistent API, so thank you for bringing this up!
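To make "formatted" concrete: regularizing a stream means rechunking arbitrarily sized blocks of samples (as a microphone produces) into fixed windows, e.g. 5 s long with a 500 ms shift as in the example. Below is a minimal, stdlib-only sketch of that idea; the buffer logic is illustrative and not diart's actual implementation.

```python
# Illustrative sketch: rechunk an irregular stream of sample blocks into
# fixed-size windows of `duration` seconds with a hop of `step` seconds.
# NOT diart's implementation; the numbers mirror the 5s/500ms example.
def regularize(blocks, sample_rate, duration=5.0, step=0.5):
    win = int(duration * sample_rate)
    hop = int(step * sample_rate)
    buf = []
    for block in blocks:        # blocks may have arbitrary sizes (like a mic)
        buf.extend(block)
        while len(buf) >= win:
            yield buf[:win]     # emit one fixed-size window
            buf = buf[hop:]     # slide forward by the hop size

# Toy stream: three irregular blocks at a tiny "sample rate" of 4 Hz,
# so each window is 20 samples and the hop is 2 samples.
blocks = [[0] * 7, [0] * 9, [0] * 30]
windows = list(regularize(blocks, sample_rate=4))
print([len(w) for w in windows])  # 14 windows of 20 samples each
```

A file source can skip this step because the whole signal is available up front and can be sliced into such windows directly, which is why the two source types behave differently before v0.6.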
-
Dear sir, please teach me a little more; I am not very good at Python. I would like to process both the wav (or segmentation(wav)) and the embedding together. My idea is to track speakers using the wav (or segmentation(wav)) and the embedding inside one function. Can I modify your example like this? I added a variable wav before emb, but it did not work.
import rx.operators as ops
import torch
segmentation = SpeakerSegmentation.from_pyannote("pyannote/segmentation")
def process(wav, emb):
stream = file.stream.pipe(
#mic.read()
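The pattern you want is to keep the waveform in the tuple that flows through the pipeline, so that a later starmap can unpack both values into your function. Here is a stdlib-only sketch of that pipeline shape using itertools.starmap over a list; rx's ops.map and ops.starmap apply the same functions to a stream. The segmentation, embedding, and process functions below are toy stand-ins, not the pyannote models.

```python
# Stdlib sketch of the pipeline shape: carry (wav, seg), then (wav, emb),
# so a final starmap can call process(wav, emb) with both values.
from itertools import starmap

def segmentation(wav):      # stand-in for the pyannote segmentation model
    return [x * 0.5 for x in wav]

def embedding(wav, seg):    # stand-in for the overlap-aware embedding model
    return [sum(wav) / len(wav), sum(seg) / len(seg)]

def process(wav, emb):      # user function that needs BOTH wav and emb
    return len(wav), len(emb)

chunks = [[1.0, 1.0, 1.0, 1.0], [0.0, 0.0, 0.0, 0.0]]
step1 = map(lambda wav: (wav, segmentation(wav)), chunks)            # (wav, seg)
step2 = starmap(lambda wav, seg: (wav, embedding(wav, seg)), step1)  # (wav, emb)
results = list(starmap(process, step2))
print(results)  # [(4, 2), (4, 2)]
```

In the rx pipeline this corresponds to replacing ops.starmap(embedding) with ops.starmap(lambda wav, seg: (wav, embedding(wav, seg))) followed by ops.starmap(process), so the waveform is never dropped from the tuple.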
-
Keep in mind that
-
Hello,
I tried the example but changed the source from the microphone to a file stream. There was no error, but nothing was printed. Please help me.
import rx.operators as ops
import diart.operators as dops
#from diart.sources import MicrophoneAudioSource
from diart.sources import FileAudioSource
from diart.blocks import SpeakerSegmentation, OverlapAwareSpeakerEmbedding
segmentation = SpeakerSegmentation.from_pyannote("pyannote/segmentation")
embedding = OverlapAwareSpeakerEmbedding.from_pyannote("pyannote/embedding")
sample_rate = segmentation.model.get_sample_rate()
#mic = MicrophoneAudioSource(sample_rate)
file = FileAudioSource('Toefl-16k-08.wav', sample_rate)
stream = file.stream.pipe(
# Reformat stream to 5s duration and 500ms shift
dops.regularize_audio_stream(sample_rate),
ops.map(lambda wav: (wav, segmentation(wav))),
ops.starmap(embedding)
).subscribe(on_next=lambda emb: print(emb.shape))
#mic.read()
file.read()