Make a demo of device speaking ($1500) #1006
Comments
sweet! |
I can get this working but would be using Deepgram only. |
Go ahead with a PR pls @DamienDeepgram, you don't need permission to do great things. |
@hoai265 you too - great firmware dev man |
Sorry guys had to update the bounty |
so sad, no one took this :dead: too sweet $1K |
Hi @kodjima33 , I’d love to take this task. Please assign it to me, and I’ll get started. Thanks! |
@Sanchay-T yes, pls keep us updated! every 24h would be great ~ just to keep your motivation up! |
Hi @beastoin 👋 I've been diving into the codebase to understand how we can implement voice responses for the device. Really interesting architecture you've built here! While I'm familiar with backend systems, I'm getting up to speed with some of the Flutter and BLE specifics.
Looking at the current audio pipeline, I can see we're handling real-time streaming through WebSockets pretty elegantly. The socket service in
Future<TranscriptSegmentSocketService?> socket({
  required BleAudioCodec codec,
  required int sampleRate,
  required String language,
  bool force = false,
}) async {
For implementing the voice response feature, I think we can build on this foundation. I see we're already integrated with OpenAI's APIs in
I have a few questions about the device interaction part though. Looking at
BluetoothCharacteristic? getCharacteristicByUuid(BluetoothService service, String uuid) {
  return service.characteristics.firstWhereOrNull(
    (characteristic) => characteristic.uuid.str128.toLowerCase() == uuid.toLowerCase(),
  );
}
Before I proceed with the implementation, I wanted to check:
I have some ideas about the implementation, but wanted to validate these core aspects first to make sure I'm heading in the right direction. Happy to elaborate on any part of this! Thanks for the help! Looking forward to your insights. |
1/ what do you propose? pros / cons. |
Deepgram also has TTS, so you could use the same SDK that I think the speech-to-text is using. Not sure if Omi has a preference there tho. See: https://pub.dev/packages/deepgram_speech_to_text#text-to-speech |
Removed @Sanchay-T from assigned - no progress. @beastoin let's try not to assign people if they haven't submitted PRs yet. We assign only to those who had PRs. Others will need to do a PR first. @DamienDeepgram try it out bro - looking forward! |
Hi @beastoin and @kodjima33 I wanted to clarify the situation regarding my previous assignment. First, I apologize for the delay in updates - I was away for Diwali celebrations in my hometown, which affected my response time. However, I want to assure you that I've been actively working on this in the background:
I understand the policy about assignments and PRs, and I'm committed to submitting a PR with my implementation soon. I should have communicated my temporary absence better, and I appreciate your patience. Would it be alright if I continue working on this feature and submit a PR for review? I'm happy to share my current progress in more detail if helpful. Thanks for understanding! |
checking speaker functions of current firmware... |
Hi @kodjima33 What if we add a separate option to route audio to the phone's output, like AirPods? I think this would be another option for users, as they could listen privately. |
@DamienDeepgram don't forget to ref your PR ;) about the preference to implement this task, be creative. but i think Nik's description is good/simple enough to roll out the first draft. smth like ~ 1/ the user presses the button on the device and says something. hope that helps. |
speaker should support playback over BT, i don't mind looking into this after apple watch PR as it will use a similar two-way transport. |
I have started on this, but need to get some sleep #1243 - i think even before i flashed new firmware any button click on my (red) devkit2 causes a fatal crash - not sure if the shipped devkit2's have a different setup with button? Could be a different pin/setup for the button causing this issue. Code in WIP includes all the BLE setup to stream and handle the stream on desktop side (Python). Once finalised can move to dart code. |
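A minimal sketch of the desktop-side (Python) step mentioned above: splitting an audio buffer into small packets that fit in a BLE write, and putting them back together on the other end. The packet layout (a 2-byte little-endian sequence number followed by the audio payload), the 160-byte payload size, and the function names `chunk_audio` and `reassemble` are assumptions for illustration only. They are not taken from the Omi firmware or from the WIP PR.

```python
import struct
from typing import Iterable, Iterator


def chunk_audio(pcm: bytes, payload_size: int = 160) -> Iterator[bytes]:
    """Yield packets of: 2-byte little-endian sequence number + audio payload.

    payload_size is a hypothetical value; a real link would derive it
    from the negotiated BLE MTU.
    """
    if payload_size <= 0:
        raise ValueError("payload_size must be positive")
    for seq, offset in enumerate(range(0, len(pcm), payload_size)):
        header = struct.pack("<H", seq & 0xFFFF)  # wraps after 65535 packets
        yield header + pcm[offset:offset + payload_size]


def reassemble(packets: Iterable[bytes]) -> bytes:
    """Sort packets by sequence number and strip the 2-byte headers.

    This sketch ignores sequence wraparound and lost packets.
    """
    ordered = sorted(packets, key=lambda p: struct.unpack("<H", p[:2])[0])
    return b"".join(p[2:] for p in ordered)
```

A sequence header like this makes it easier to spot dropped or reordered packets when debugging playback that does not stream correctly.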
@Sanchay-T bro no worries, just keep building this and try to make it work. No one blames you - it's just that we assign issues only after the first PR. @vincentkoc @DamienDeepgram guys I believe in you. Let's make this work! (ideally today) You are both working on this; if you both make it work, I'll make a post about both of you and we will solve the bounty issue |
fighting 💪 |
Sorry yes here is the PR with the issue with playback not streaming correctly |
Hey @kodjima33 @beastoin |
Please go ahead and keep us updated @ombhojane |
Sure @beastoin |
@beastoin |
So how do we integrate this feature if we do not have Omi dev kit devices? Or, can we implement this feature in our Android/iOS device, and if it works there, then it works with OMI devices? |
@himmat12 you already know the answer man. the ticket title is super clear. |
@ombhojane how's it going? if you want to get this ticket done - building the app / the firmware is a basic requirement. |
Hey @beastoin I'll figure this out today, yesterday was my exam. |
Hii @beastoin |
@beastoin @kodjima33 is this fixed with #1452 or is work still required? |
Is your feature request related to a problem? Please describe.
People want to talk to the device. Our v2 device is equipped with a speaker but doesn't speak yet.
Describe the solution you'd like
The solution should work exactly like our in-app chat, but with audio from the device. For example, I click the button (on the device) and ask "hey what's the capital of the United States" and the device should respond with audio "Washington DC". Both the prompt and the response should be visible in the chat inside the app.
It should be possible to disable this functionality via settings.
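The requested flow can be sketched roughly as follows, in Python for brevity. Every name here (`Settings.device_speech_enabled`, `ChatLog`, `handle_device_question`, and the `answer` and `speak` callbacks) is hypothetical and does not come from the Omi codebase. The sketch only illustrates the ordering: record the prompt, get a reply, record the reply, and play it on the device only when the setting allows it.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Tuple


@dataclass
class Settings:
    # Hypothetical setting name; lets the user turn device speech off.
    device_speech_enabled: bool = True


@dataclass
class ChatLog:
    # (sender, text) pairs, standing in for the in-app chat history.
    messages: List[Tuple[str, str]] = field(default_factory=list)

    def add(self, sender: str, text: str) -> None:
        self.messages.append((sender, text))


def handle_device_question(
    transcript: str,
    settings: Settings,
    chat: ChatLog,
    answer: Callable[[str], str],
    speak: Callable[[str], None],
) -> str:
    """Run one button-press question through the chat and optional speech."""
    chat.add("human", transcript)   # the prompt stays visible in the app chat
    reply = answer(transcript)      # e.g. the same backend the in-app chat uses
    chat.add("ai", reply)           # the response stays visible as well
    if settings.device_speech_enabled:
        speak(reply)                # e.g. TTS audio sent to the device speaker
    return reply
```

Passing `answer` and `speak` in as callbacks keeps the sketch independent of any particular chat backend or TTS provider.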
Additional context
There was a PR submitted a while ago to make the app speak; take a look at that.
This is a paid task. The reward is $1,500 in cash. Simply link your PR to this issue to receive the money.