feat!:move to simple listener (#1)
* feat!:move to simple listener

be even more lightweight

JarbasAl authored Oct 26, 2024
1 parent ebf718a commit 307a0a1
Showing 5 changed files with 130 additions and 91 deletions.
55 changes: 37 additions & 18 deletions README.md
@@ -1,23 +1,36 @@
# HiveMind Voice Relay

OpenVoiceOS Relay, connect to [HiveMind](https://github.com/JarbasHiveMind/HiveMind-core)
OpenVoiceOS Relay, connect to [HiveMind](https://github.com/JarbasHiveMind/HiveMind-listener)

Similar to [voice-satellite](https://github.com/JarbasHiveMind/HiveMind-voice-sat), but STT and TTS are sent to HiveMind instead of handled on device
A lightweight version of [voice-satellite](https://github.com/JarbasHiveMind/HiveMind-voice-sat) in which STT and TTS are sent to HiveMind instead of being handled on device

> NOTE: if using ovos-installer for the server this requires the `listener` profile
## Server requirements

## Install
> ⚠️ `hivemind-listener` is required on the server side; the default `hivemind-core` does not provide STT and TTS capabilities.
Install dependencies (if needed)
> Alternatively run `hivemind-core` together with `ovos-audio` and `ovos-dinkum-listener`
```bash
sudo apt-get install -y libpulse-dev libasound2-dev
```
The regular voice satellite is built on top of [ovos-dinkum-listener](https://github.com/OpenVoiceOS/ovos-dinkum-listener) and is full-featured, supporting all plugins.

This repo is built on top of [ovos-simple-listener](https://github.com/TigreGotico/ovos-simple-listener). It needs fewer resources, but it is also **missing** some features:

- STT plugin
- TTS plugin
- Audio Transformers plugins
- Continuous Listening
- Hybrid Listening
- Recording Mode
- Sleep Mode
- Multiple WakeWords

If you need an even lighter implementation, consider [hivemind-mic-satellite](https://github.com/JarbasHiveMind/hivemind-mic-satellite), which also offloads wake word detection to the server

## Install

Install with pip

```bash
$ pip install git+https://github.com/JarbasHiveMind/HiveMind-voice-relay
$ pip install HiveMind-voice-relay
```

## Usage
@@ -37,16 +50,22 @@ Options:

```


## Configuration

Voice relay uses the default OpenVoiceOS configuration `~/.config/mycroft/mycroft.conf`
Voice relay is built on top of [ovos-simple-listener](https://github.com/TigreGotico/ovos-simple-listener) and [ovos-audio](https://github.com/OpenVoiceOS/ovos-audio). It uses the default OpenVoiceOS configuration file, `~/.config/mycroft/mycroft.conf`

Supported plugin types:
- Microphone (required)
- VAD (required)
- WakeWord (required)
- Audio Transformers (optional, None by default)
- Dialog Transformers (optional, None by default)
- TTS Transformers (optional, None by default)
- PHAL (optional, None by default)

| Plugin Type | Description | Required | Link |
|-------------|-------------|----------|------|
| Microphone | Captures voice input | Yes | [Microphone](https://openvoiceos.github.io/ovos-technical-manual/mic_plugins/) |
| VAD | Voice Activity Detection | Yes | [VAD](https://openvoiceos.github.io/ovos-technical-manual/vad_plugins/) |
| WakeWord | Detects wake words for interaction | Yes* | [WakeWord](https://openvoiceos.github.io/ovos-technical-manual/ww_plugins/) |
| STT | speech-to-text (STT) | Yes | [STT](https://openvoiceos.github.io/ovos-technical-manual/stt_plugins/) |
| TTS | text-to-speech (TTS) | Yes | [TTS](https://openvoiceos.github.io/ovos-technical-manual/tts_plugins) |
| G2P | grapheme-to-phoneme (G2P), used to simulate mouth movements | No | [G2P](https://openvoiceos.github.io/ovos-technical-manual/g2p_plugins) |
| Media Playback Plugins | Enables media playback (e.g., "play Metallica") | No | [Media Playback Plugins](https://openvoiceos.github.io/ovos-technical-manual/media_plugins/) |
| OCP Plugins | Provides playback support for URLs (e.g., YouTube) | No | [OCP Plugins](https://openvoiceos.github.io/ovos-technical-manual/ocp_plugins/) |
| Dialog Transformers | Processes text before text-to-speech (TTS) | No | [Dialog Transformers](https://openvoiceos.github.io/ovos-technical-manual/transformer_plugins/) |
| TTS Transformers | Processes audio after text-to-speech (TTS) | No | [TTS Transformers](https://openvoiceos.github.io/ovos-technical-manual/transformer_plugins/) |
| PHAL | Provides platform-specific support (e.g., Mark 1) | No | [PHAL](https://openvoiceos.github.io/ovos-technical-manual/PHAL/) |
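
As a rough illustration of the configuration lookup, the sketch below mirrors how `service.py` in this commit reads the wake word name (`listener.wake_word`, falling back to `hey_mycroft`). It uses a plain dict in place of `ovos_config.Configuration`; the helper name `resolve_wake_word` and the sample value `hey_jarvis` are hypothetical and not part of the commit.

```python
import json

# Same fallback value that service.py passes to Configuration().get(...).get(...)
DEFAULT_WAKE_WORD = "hey_mycroft"


def resolve_wake_word(conf: dict) -> str:
    """Return the configured wake word name, or the default when it is unset."""
    return conf.get("listener", {}).get("wake_word", DEFAULT_WAKE_WORD)


# Hypothetical mycroft.conf fragment; only the "listener.wake_word" key
# is taken from the commit, the value is made up for illustration.
user_conf = json.loads('{"listener": {"wake_word": "hey_jarvis"}}')

print(resolve_wake_word(user_conf))  # value from the fragment above
print(resolve_wake_word({}))         # falls back to the default
```

Chaining `.get()` with an empty dict as the intermediate default keeps the lookup safe when the `listener` section is missing entirely.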
1 change: 0 additions & 1 deletion hivemind_voice_relay/__init__.py
@@ -1 +0,0 @@
from hivemind_voice_relay.service import VoiceRelay, AudioPlaybackRelay
49 changes: 9 additions & 40 deletions hivemind_voice_relay/__main__.py
@@ -1,43 +1,22 @@
from threading import Event

import click
from hivemind_bus_client import HiveMessageBusClient
from hivemind_bus_client.identity import NodeIdentity
from ovos_bus_client.client import MessageBusClient
from ovos_utils import wait_for_exit_signal
from ovos_utils.log import init_service_logger, LOG
from ovos_utils.fakebus import FakeBus
from hivemind_voice_relay.service import VoiceRelay, AudioPlaybackRelay


def launch_bus_daemon() -> MessageBusClient:
    from ovos_utils import create_daemon
    from tornado import web, ioloop
    from ovos_messagebus.event_handler import MessageBusEventHandler

    INTERNAL_PORT = 9987  # can be anything, wanted to differentiate from standard ovos-bus

    routes = [("/core", MessageBusEventHandler)]
    application = web.Application(routes)
    application.listen(INTERNAL_PORT, "127.0.0.1")
    create_daemon(ioloop.IOLoop.instance().start)

    bus = MessageBusClient(host="127.0.0.1", port=INTERNAL_PORT)
    bus.run_in_thread()
    return bus
from ovos_utils.log import init_service_logger, LOG

from hivemind_bus_client import HiveMessageBusClient
from hivemind_bus_client.identity import NodeIdentity
from hivemind_voice_relay.service import HiveMindVoiceRelay


# TODO - add a flag to use FakeBus instead of real websocket
@click.command(help="connect to HiveMind")
@click.command(help="connect to HiveMind Sound Server")
@click.option("--host", help="hivemind host", type=str, default="")
@click.option("--key", help="Access Key", type=str, default="")
@click.option("--password", help="Password for key derivation", type=str, default="")
@click.option("--port", help="HiveMind port number", type=int, default=5678)
@click.option("--selfsigned", help="accept self signed certificates", is_flag=True)
@click.option("--siteid", help="location identifier for message.context", type=str, default="")
@click.option("--fakebus", help="use FakeBus instead of real websocket", is_flag=True)
def connect(host, key, password, port, selfsigned, siteid, fakebus):
def connect(host, key, password, port, selfsigned, siteid):
    init_service_logger("HiveMind-voice-relay")

    identity = NodeIdentity()
@@ -58,29 +37,20 @@ def connect(host, key, password, port, selfsigned, siteid, fakebus):
        LOG.error(f"ws://{host} or wss://{host}")
        exit(1)

    # Check for fakebus flag
    if fakebus:
        internal_bus = FakeBus()
    else:
        internal_bus = launch_bus_daemon() or FakeBus()
    internal_bus = FakeBus()

    # connect to hivemind
    bus = HiveMessageBusClient(key=key,
                               password=password,
                               port=port,
                               host=host,
                               useragent="VoiceRelayV0.0.1",
                               useragent="VoiceRelayV1.0.0",
                               self_signed=selfsigned,
                               internal_bus=internal_bus)
    bus.connect(site_id=siteid)

    # create Audio Output interface (TTS/Music)
    audio = AudioPlaybackRelay(bus=bus)
    audio.daemon = True
    audio.start()

    # STT listener thread
    service = VoiceRelay(bus=bus)
    service = HiveMindVoiceRelay(bus=bus)
    service.daemon = True
    service.start()

@@ -95,7 +65,6 @@ def connect(host, key, password, port, selfsigned, siteid, fakebus):
    wait_for_exit_signal()

    service.stop()
    audio.shutdown()
    if phal:
        phal.shutdown()
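
The CLI above appears to log an error and exit when the host lacks a `ws://` or `wss://` prefix, whereas `get_bus()` in `service.py` prepends `ws://` instead. A minimal sketch of the second behaviour follows; the helper name `normalize_host` and the sample hostnames are hypothetical, and the empty-host check is an added assumption rather than something taken from the commit.

```python
def normalize_host(host: str) -> str:
    """Return a websocket URL, defaulting to the unencrypted ws:// scheme."""
    if not host:
        # Assumption: an empty host is treated as a configuration error.
        raise ValueError("host must not be empty")
    if host.startswith(("ws://", "wss://")):
        return host  # already carries an explicit protocol
    return "ws://" + host


print(normalize_host("192.168.1.10"))
print(normalize_host("wss://hivemind.example.org"))
```

Defaulting to `ws://` is convenient on a trusted LAN; for connections over untrusted networks an explicit `wss://` host is the safer choice.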

110 changes: 82 additions & 28 deletions hivemind_voice_relay/service.py
@@ -2,19 +2,49 @@
import threading
from typing import List, Tuple, Optional

import speech_recognition as sr
from ovos_audio.service import PlaybackService
from ovos_bus_client.message import Message, dig_for_message
from ovos_config.locale import setup_locale
from ovos_dinkum_listener.plugins import FakeStreamingSTT
from ovos_dinkum_listener.service import OVOSDinkumVoiceService
from ovos_config import Configuration
from ovos_plugin_manager.microphone import OVOSMicrophoneFactory
from ovos_plugin_manager.templates.stt import STT
from ovos_plugin_manager.templates.tts import TTS
from ovos_plugin_manager.templates.vad import VADEngine
from ovos_plugin_manager.utils.tts_cache import hash_sentence
from ovos_plugin_manager.vad import OVOSVADFactory
from ovos_plugin_manager.wakewords import OVOSWakeWordFactory
from ovos_simple_listener import ListenerCallbacks, SimpleListener
from ovos_utils.fakebus import FakeBus
from ovos_utils.log import LOG
from speech_recognition import AudioData

from hivemind_bus_client.client import HiveMessageBusClient
from hivemind_bus_client.identity import NodeIdentity


def get_bus() -> HiveMessageBusClient:
    # TODO - kwargs
    identity = NodeIdentity()
    siteid = identity.site_id or "unknown"
    host = identity.default_master
    port = 5678

    if not identity.access_key or not identity.password or not host:
        raise RuntimeError("NodeIdentity not set, please pass key/password/host or "
                           "call 'hivemind-client set-identity'")

    if not host.startswith("ws://") and not host.startswith("wss://"):
        host = "ws://" + host
    if not host.startswith("ws"):
        raise ValueError(f"Invalid host, please specify a protocol: 'ws://{host}' or 'wss://{host}'")

    bus = HiveMessageBusClient(key=identity.access_key,
                               password=identity.password,
                               port=port,
                               host=host,
                               useragent="VoiceRelayV1.0.0",
                               internal_bus=FakeBus())
    bus.connect(site_id=siteid)
    return bus


def on_ready():
@@ -37,6 +67,31 @@ def on_error(e='Unknown'):
    LOG.error(f'HiveMind Voice Relay failed to launch ({e}).')


class HMCallbacks(ListenerCallbacks):
    def __init__(self, bus: Optional[HiveMessageBusClient] = None):
        self.bus = bus or get_bus()

    def listen_callback(self):
        LOG.info("New loop state: IN_COMMAND")
        self.bus.internal_bus.emit(Message("mycroft.audio.play_sound",
                                           {"uri": "snd/start_listening.wav"}))
        self.bus.internal_bus.emit(Message("recognizer_loop:wakeword"))
        self.bus.internal_bus.emit(Message("recognizer_loop:record_begin"))

    def end_listen_callback(self):
        LOG.info("New loop state: WAITING_WAKEWORD")
        self.bus.internal_bus.emit(Message("recognizer_loop:record_end"))

    def error_callback(self, audio: sr.AudioData):
        LOG.error("STT Failure")
        self.bus.internal_bus.emit(Message("recognizer_loop:speech.recognition.unknown"))

    def text_callback(self, utterance: str, lang: str):
        LOG.info(f"STT: {utterance}")
        self.bus.emit(Message("recognizer_loop:utterance",
                              {"utterances": [utterance], "lang": lang}))


class HiveMindSTT(STT):
    def __init__(self, bus: HiveMessageBusClient, config=None):
        super().__init__(config)
@@ -70,17 +125,18 @@ def execute(self, audio: AudioData, language: Optional[str] = None) -> str:
        return ""


class AudioPlaybackRelay(PlaybackService):

class HMPlayback(PlaybackService):
    def __init__(self, bus: HiveMessageBusClient, ready_hook=on_ready, error_hook=on_error,
                 stopping_hook=on_stopping, alive_hook=on_alive,
                 started_hook=on_started, watchdog=lambda: None):
        super().__init__(ready_hook, error_hook, stopping_hook, alive_hook, started_hook, watchdog=watchdog,
                         bus=bus, validate_source=False,
                         disable_fallback=True)
        self.bus.on("speak:b64_audio.response", self.handle_tts_b64_response)
        self.start()

    def execute_tts(self, utterance, ident, listen=False, message: Message = None):
    def execute_tts(self, utterance, ident, listen=False,
                    message: Message = None):
        """Mute mic and start speaking the utterance using selected tts backend.
        Args:
@@ -109,31 +165,29 @@ def handle_tts_b64_response(self, message: Message):
        )

    def handle_b64_audio(self, message):
        pass  # handled in master, not client

    def _maybe_reload_tts(self):
        # skip loading TTS in this subclass
        # HACK: dont get in a infinite loop, this message is meant for master
        # because of how HiveMindTTS is implemented we need to do this
        pass


class VoiceRelay(OVOSDinkumVoiceService):
    """HiveMind Voice Relay, but bus is replaced with hivemind connection"""
class HiveMindVoiceRelay(SimpleListener):
    def __init__(self, bus: Optional[HiveMessageBusClient] = None):
        self.bus = bus or get_bus()
        self.audio = HMPlayback(bus=self.bus)
        ww = Configuration().get("listener", {}).get("wake_word", "hey_mycroft")
        super().__init__(
            mic=OVOSMicrophoneFactory.create(),
            vad=OVOSVADFactory.create(),
            wakeword=OVOSWakeWordFactory.create_hotword(ww),
            stt=HiveMindSTT(self.bus),
            callbacks=HMCallbacks(self.bus)
        )

    def __init__(self, bus: HiveMessageBusClient, on_ready=on_ready, on_error=on_error,
                 on_stopping=on_stopping, on_alive=on_alive,
                 on_started=on_started, watchdog=lambda: None, mic=None,
                 vad: Optional[VADEngine] = None):
        setup_locale()  # read mycroft.conf for default lang/timezone in all modules (eg, lingua_franca)
        stt = FakeStreamingSTT(HiveMindSTT(bus=bus))
        super().__init__(on_ready, on_error, on_stopping, on_alive, on_started, watchdog, mic,
                         stt=stt, vad=vad,
                         bus=bus, validate_source=False, disable_fallback=True)

    def _handle_b64_transcribe(self, message: Message):
        pass  # handled in master, not client
def main():
    t = HiveMindVoiceRelay()
    t.run()

    def _connect_to_bus(self):
        pass

    def reload_configuration(self):
        pass
if __name__ == "__main__":
    main()
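
The routing in `HMCallbacks` can be tried without a microphone or a HiveMind server. The sketch below is a stand-alone approximation: `RecordingBus`, `SketchCallbacks` and the local `Message` dataclass are hypothetical stand-ins for the real bus and message classes, and only the message types and payloads are copied from the diff. It shows that listener state events go to the internal bus, while the recognized utterance is the only message sent upstream to HiveMind.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Message:
    """Stand-in for ovos_bus_client.message.Message (type and data only)."""
    msg_type: str
    data: dict = field(default_factory=dict)


class RecordingBus:
    """Hypothetical bus double that records emitted messages instead of sending them."""

    def __init__(self) -> None:
        self.emitted: List[Message] = []

    def emit(self, message: Message) -> None:
        self.emitted.append(message)


class SketchCallbacks:
    """Mirrors the routing in HMCallbacks: local events vs. the upstream utterance."""

    def __init__(self, hivemind_bus: RecordingBus, internal_bus: RecordingBus) -> None:
        self.hivemind_bus = hivemind_bus
        self.internal_bus = internal_bus

    def listen_callback(self) -> None:
        # Wake word detected: play the listening sound and signal recording start locally.
        self.internal_bus.emit(Message("mycroft.audio.play_sound",
                                       {"uri": "snd/start_listening.wav"}))
        self.internal_bus.emit(Message("recognizer_loop:wakeword"))
        self.internal_bus.emit(Message("recognizer_loop:record_begin"))

    def end_listen_callback(self) -> None:
        self.internal_bus.emit(Message("recognizer_loop:record_end"))

    def text_callback(self, utterance: str, lang: str) -> None:
        # Only the transcribed utterance is forwarded to HiveMind.
        self.hivemind_bus.emit(Message("recognizer_loop:utterance",
                                       {"utterances": [utterance], "lang": lang}))


hivemind, internal = RecordingBus(), RecordingBus()
callbacks = SketchCallbacks(hivemind, internal)
callbacks.listen_callback()
callbacks.end_listen_callback()
callbacks.text_callback("what time is it", "en-us")

local_events = [m.msg_type for m in internal.emitted]
upstream_events = [m.msg_type for m in hivemind.emitted]
print(local_events)
print(upstream_events)
```

Keeping the two buses as separate constructor arguments makes the split explicit; in the commit itself the internal bus is reached through `bus.internal_bus` on the HiveMind client.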
6 changes: 2 additions & 4 deletions requirements.txt
@@ -1,11 +1,9 @@
hivemind_bus_client>=0.0.4a10
hivemind_bus_client>=0.0.4
ovos-audio
ovos-dinkum-listener>=0.0.3a14, < 2.0.0
ovos-simple-listener>=0.0.3,<1.0.0
ovos-microphone-plugin-alsa
ovos-vad-plugin-silero
ovos-stt-plugin-server
ovos-tts-plugin-server
ovos-ww-plugin-vosk
click
hivemind-ggwave
ovos-messagebus
