Utilizing Deepgram as a Websocket (Question about Multichannel streaming) #993

JlossBustin · 2024-11-11T17:29:10Z

JlossBustin
Nov 11, 2024

Hello,

I am trying to understand how far Deepgram's WebSocket API can go in providing live transcriptions on the fly to my client-side application.

Currently, I am forwarding raw RTP data (inbound and outbound) to the WebSocket. To my knowledge, there is no way for Deepgram to tell which direction is which and assign a channel (I have channels set to 2 in the URL parameters). I have the option of sending the inbound and outbound RTP data with a header byte defining the direction of the audio (1 or 2). Could I somehow get Deepgram to pick up on this and return the respective channel based on the header byte?

2024-11-11T17:29:21Z

deepgram-community[bot]
bot Nov 11, 2024

Hey there! It looks like you haven't connected your GitHub account to your Deepgram account. You can do this at https://community.deepgram.com - being verified through this process will allow our team to help you in a much more streamlined fashion.

2024-11-11T17:29:22Z

deepgram-community[bot]
bot Nov 11, 2024

It looks like we're missing some important information to help debug your issue. Would you mind providing us with the following details in a reply?

The programming language you are working in (e.g. JavaScript, Python).
The Deepgram product you are using (e.g., Speech to Text, Agent API).
A request ID that triggered your error or issue.

JlossBustin · 2024-11-11T17:34:31Z

JlossBustin
Nov 11, 2024
Author

Below are some additional requested details about my use case:

Language: Java
Product: Speech-to-Text
request ID: None.

JlossBustin · 2024-11-11T20:32:14Z

JlossBustin
Nov 11, 2024
Author

I was able to come up with a solution for combining audio streams for both inbound and outbound callers, but now the issue is with returning the transcription results for both legs of a call.
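
For readers following along, here is a minimal sketch of one way two call legs could be combined into a single two-channel linear16 stream. It is not the poster's actual solution, which the thread does not show. It assumes both legs have already been decoded from their RTP payloads into 16-bit PCM at the same sample rate, and that the inbound leg should occupy the first channel; `ChannelInterleaver` is a hypothetical helper name.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.util.Arrays;

/** Hypothetical helper: interleaves two mono 16-bit PCM buffers into one 2-channel buffer. */
final class ChannelInterleaver {

    /**
     * @param inbound  mono samples for the inbound leg (assumed to become channel 0)
     * @param outbound mono samples for the outbound leg (assumed to become channel 1)
     * @return little-endian bytes laid out as in[0], out[0], in[1], out[1], ...
     */
    static byte[] interleave(short[] inbound, short[] outbound) {
        int frames = Math.max(inbound.length, outbound.length);
        ByteBuffer buf = ByteBuffer.allocate(frames * 2 * Short.BYTES)
                .order(ByteOrder.LITTLE_ENDIAN);
        for (int i = 0; i < frames; i++) {
            // Pad the shorter leg with silence so the two channels stay aligned.
            buf.putShort(i < inbound.length ? inbound[i] : 0);
            buf.putShort(i < outbound.length ? outbound[i] : 0);
        }
        return buf.array();
    }

    public static void main(String[] args) {
        byte[] frame = interleave(new short[] {1, 2}, new short[] {10});
        System.out.println(Arrays.toString(frame)); // prints [1, 0, 10, 0, 2, 0, 0, 0]
    }
}
```

Mixing (summing) the two legs instead of interleaving them would produce mono audio, in which case the per-leg separation is lost before the audio ever reaches the transcription service.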

Currently, this is the URL I am utilizing for the deepgram websocket:

wss://api.deepgram.com/v1/listen?encoding=linear16&sample_rate=8000&model=nova-2-phonecall&interim_results=true&endpointing=300&utterance_end_ms=1200&multichannel=true&channels=2&vad_events=true&smart_format=true

I've seen in other discussions that users had issues when only multichannel was set to true, and I find that to be my case as well. When channels is not present in the URL, no transcription results are returned to my client-side application. On the other hand, if channels is set to 2, I only receive the outbound transcription for the call.
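
A long query string like the one above is easier to audit when it is built from named parameters. The following is a minimal Java sketch that reproduces the same URL; it uses only the parameters quoted in this post, and `ListenUrl` is a hypothetical helper name, not part of any Deepgram SDK.

```java
import java.net.URI;
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.stream.Collectors;

/** Hypothetical helper: assembles the streaming URL from an ordered parameter map. */
final class ListenUrl {

    static URI build(Map<String, String> params) {
        String query = params.entrySet().stream()
                .map(e -> e.getKey() + "=" + URLEncoder.encode(e.getValue(), StandardCharsets.UTF_8))
                .collect(Collectors.joining("&"));
        return URI.create("wss://api.deepgram.com/v1/listen?" + query);
    }

    /** The parameters from the post, in the same order. */
    static Map<String, String> postParams() {
        Map<String, String> p = new LinkedHashMap<>();
        p.put("encoding", "linear16");
        p.put("sample_rate", "8000");
        p.put("model", "nova-2-phonecall");
        p.put("interim_results", "true");
        p.put("endpointing", "300");
        p.put("utterance_end_ms", "1200");
        p.put("multichannel", "true");
        p.put("channels", "2");
        p.put("vad_events", "true");
        p.put("smart_format", "true");
        return p;
    }

    public static void main(String[] args) {
        System.out.println(build(postParams()));
    }
}
```

Keeping `multichannel` and `channels` next to each other in the map makes it harder to drop one of them by accident.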

jkroll-deepgram · 2024-11-11T22:55:58Z

jkroll-deepgram
Nov 11, 2024
Collaborator

Hi @JlossBustin, streaming audio with multichannel=true returns the channel_index field, which is an array of two items. The first value is the channel index (0-indexed), and the second value is the total number of channels. So a transcription response with "channel_index": [0, 2] means that the transcript is for channel #0 (the first channel), out of 2 total channels. See our docs section (https://developers.deepgram.com/docs/multichannel#streaming-response) for further explanation.
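
As an illustration of the field described above, here is a minimal Java sketch that pulls `channel_index` out of a response message and maps it to a call leg. To stay dependency-free it uses a regular expression; a real client would more likely use a JSON library. The sample message is abbreviated and illustrative, and the mapping of channel 0 to the inbound leg is an assumption about how the audio was assembled, not something the API decides.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper: reads "channel_index": [index, total] from a response message. */
final class ChannelIndex {

    private static final Pattern CHANNEL_INDEX = Pattern.compile(
            "\"channel_index\"\\s*:\\s*\\[\\s*(\\d+)\\s*,\\s*(\\d+)\\s*\\]");

    /** Returns {channelIndex, totalChannels}, or null if the field is absent. */
    static int[] parse(String json) {
        Matcher m = CHANNEL_INDEX.matcher(json);
        if (!m.find()) {
            return null;
        }
        return new int[] {Integer.parseInt(m.group(1)), Integer.parseInt(m.group(2))};
    }

    /** Assumption: channel 0 carries the inbound leg and channel 1 the outbound leg. */
    static String legFor(int channelIndex) {
        return channelIndex == 0 ? "inbound" : "outbound";
    }

    public static void main(String[] args) {
        // Abbreviated, illustrative message; real responses carry more fields.
        String message = "{\"type\":\"Results\",\"channel_index\":[1,2],"
                + "\"channel\":{\"alternatives\":[{\"transcript\":\"hello\"}]}}";
        int[] idx = parse(message);
        System.out.println(legFor(idx[0]) + " (channel " + idx[0] + " of " + idx[1] + ")");
        // prints: outbound (channel 1 of 2)
    }
}
```

Routing each message by its channel index before displaying it is one way to check whether both legs are arriving, or whether one is being dropped later in client-side processing.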

JlossBustin · 2024-11-11T23:04:26Z

JlossBustin
Nov 11, 2024
Author

Hi Julia, I understand the functionality of multichannel. However, the issue is with returning both channels' respective transcripts.

jkroll-deepgram · 2024-11-11T23:21:51Z

jkroll-deepgram
Nov 11, 2024
Collaborator

Hi @JlossBustin, if you can provide a Deepgram request ID (response["metadata"]["request_id"]), I can see further details about your request, and help determine why you're not receiving a proper multichannel transcript.

You mentioned you found "a solution for combining audio streams for both inbound and outbound callers" - are you still sending two-channel audio to Deepgram, or are you combining the channels such that it becomes mono audio?

JlossBustin · 2024-11-12T15:41:39Z

JlossBustin
Nov 12, 2024
Author

Hi there, it appears I am sending two-channel audio to Deepgram. The issue with only receiving the live transcription for one leg of the call is due to an error in my processing outside of the Deepgram websocket.
