Combining Transcription with Diarization (speaker identification) #99
Unanswered
MustafaCQN asked this question in Q&A
Replies: 2 comments 5 replies
Have you checked out WhisperX?
Check out this repo: https://github.com/Navodplayer1/speechlib. You will get accurate timing. You can also do speaker recognition if you provide voices_folder; the transcription will then contain the actual speaker names!
Hi everyone!
I am working on a project that takes an audio file and transcribes the meeting.
The problem I am facing is that I need to label the diarized speakers with their names. I am using the pyannote package to identify the speakers, but transcription and diarization use different models, which produce different timecodes. Because the timecodes differ, I cannot match the transcription to the speaker names.
Does anybody know how I can work around this problem, or is there a product, model, or method that I can use to both transcribe and match the text to speaker names using timecodes?
Left: speaker identification with timecodes (pyannote). Right: transcription with timecodes (faster-whisper).
Thank you!
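One common way to reconcile two sets of timecodes that do not line up is to give each transcript segment the speaker whose diarization turns overlap it the most in time. The sketch below is only an illustration of that idea, not the method of any library mentioned in this thread. It assumes the pyannote output has already been flattened into (start, end, speaker) turns and the faster-whisper output into (start, end, text) segments. The names `Turn`, `Segment`, `overlap`, and `assign_speakers`, and the sample timings, are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass
class Turn:
    """One diarization turn (hypothetical container for pyannote output)."""
    start: float
    end: float
    speaker: str


@dataclass
class Segment:
    """One transcript segment (hypothetical container for faster-whisper output)."""
    start: float
    end: float
    text: str


def overlap(a_start: float, a_end: float, b_start: float, b_end: float) -> float:
    """Return the shared duration of two intervals in seconds (0.0 if disjoint)."""
    return max(0.0, min(a_end, b_end) - max(a_start, b_start))


def assign_speakers(
    segments: List[Segment], turns: List[Turn]
) -> List[Tuple[Optional[str], Segment]]:
    """Label each segment with the speaker that overlaps it the longest."""
    labeled = []
    for seg in segments:
        # Sum the overlap per speaker, since one speaker may have several turns.
        totals: Dict[str, float] = {}
        for turn in turns:
            shared = overlap(seg.start, seg.end, turn.start, turn.end)
            if shared > 0:
                totals[turn.speaker] = totals.get(turn.speaker, 0.0) + shared
        # None means no diarization turn overlaps this segment at all.
        speaker = max(totals, key=totals.get) if totals else None
        labeled.append((speaker, seg))
    return labeled


# Made-up sample data, for illustration only.
turns = [Turn(0.0, 4.2, "SPEAKER_00"), Turn(4.2, 9.0, "SPEAKER_01")]
segments = [
    Segment(0.3, 3.9, "Hello, shall we start?"),
    Segment(4.0, 8.5, "Yes, let's begin."),
    Segment(9.5, 10.0, "Okay."),
]

for speaker, seg in assign_speakers(segments, turns):
    print(f"[{seg.start:.1f}-{seg.end:.1f}] {speaker}: {seg.text}")
```

Because the match is by overlap rather than exact equality, the boundaries from the two models do not need to agree. A segment that spans a speaker change is still credited to a single speaker, so running the same matching on word-level timestamps, if your transcription model provides them, should give finer attribution.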