feat!:move to simple listener (#1)

* feat!:move to simple listener be even more lightweight
JarbasHiveMind · Oct 26, 2024 · 307a0a1
1 parent ebf718a
commit 307a0a1
Showing 5 changed files with 130 additions and 91 deletions.
diff --git a/README.md b/README.md
@@ -1,23 +1,36 @@
 # HiveMind Voice Relay
 
-OpenVoiceOS Relay, connect to [HiveMind](https://github.com/JarbasHiveMind/HiveMind-core)
+OpenVoiceOS Relay, connect to [HiveMind](https://github.com/JarbasHiveMind/HiveMind-listener)
 
-Similar to [voice-satellite](https://github.com/JarbasHiveMind/HiveMind-voice-sat), but STT and TTS are sent to HiveMind instead of handled on device
+A lightweight version of [voice-satellite](https://github.com/JarbasHiveMind/HiveMind-voice-sat) in which STT and TTS are sent to HiveMind instead of being handled on device
 
-> NOTE: if using ovos-installer for the server this requires the `listener` profile
+## Server requirements
 
-## Install
+> ⚠️ `hivemind-listener` is required server side; the default `hivemind-core` does not provide STT and TTS capabilities.
 
-Install dependencies (if needed)
+> Alternatively, run `hivemind-core` together with `ovos-audio` and `ovos-dinkum-listener`
 
-```bash
-sudo apt-get install -y libpulse-dev libasound2-dev
-```
+The regular voice satellite is built on top of [ovos-dinkum-listener](https://github.com/OpenVoiceOS/ovos-dinkum-listener) and is full-featured, supporting all plugins
+
+This repo is built on top of [ovos-simple-listener](https://github.com/TigreGotico/ovos-simple-listener). While it needs fewer resources, it is also **missing** some features:
+
+- STT plugin
+- TTS plugin
+- Audio Transformers plugins
+- Continuous Listening
+- Hybrid Listening
+- Recording Mode
+- Sleep Mode
+- Multiple WakeWords
+
+If you need an even lighter implementation, consider [hivemind-mic-satellite](https://github.com/JarbasHiveMind/hivemind-mic-satellite), which also offloads wake word detection to the server
+
+## Install
 
 Install with pip
 
 ```bash
-$ pip install git+https://github.com/JarbasHiveMind/HiveMind-voice-relay
+$ pip install HiveMind-voice-relay
 ```
 
 ## Usage
@@ -37,16 +50,22 @@ Options:
 
 ```
 
-
 ## Configuration
 
-Voice relay uses the default OpenVoiceOS configuration `~/.config/mycroft/mycroft.conf`
+Voice relay is built on top of [ovos-simple-listener](https://github.com/TigreGotico/ovos-simple-listener) and [ovos-audio](https://github.com/OpenVoiceOS/ovos-audio); it uses the default OpenVoiceOS configuration `~/.config/mycroft/mycroft.conf`
 
 Supported plugin types:
-- Microphone  (required)
-- VAD  (required)
-- WakeWord (required)
-- Audio Transformers  (optional, None by default)
-- Dialog Transformers  (optional, None by default)
-- TTS Transformers  (optional, None by default)
-- PHAL  (optional, None by default)
+
+| Plugin Type | Description | Required | Link |
+|-------------|-------------|----------|------|
+| Microphone | Captures voice input | Yes | [Microphone](https://openvoiceos.github.io/ovos-technical-manual/mic_plugins/) |
+| VAD | Voice Activity Detection | Yes | [VAD](https://openvoiceos.github.io/ovos-technical-manual/vad_plugins/) |
+| WakeWord | Detects wake words for interaction | Yes* | [WakeWord](https://openvoiceos.github.io/ovos-technical-manual/ww_plugins/) |
+| STT | Speech-to-text (STT) | Yes | [STT](https://openvoiceos.github.io/ovos-technical-manual/stt_plugins/) |
+| TTS | Text-to-speech (TTS) | Yes | [TTS](https://openvoiceos.github.io/ovos-technical-manual/tts_plugins) |
+| G2P | Grapheme-to-phoneme (G2P), used to simulate mouth movements | No | [G2P](https://openvoiceos.github.io/ovos-technical-manual/g2p_plugins) |
+| Media Playback Plugins | Enables media playback (e.g., "play Metallica") | No | [Media Playback Plugins](https://openvoiceos.github.io/ovos-technical-manual/media_plugins/) |
+| OCP Plugins | Provides playback support for URLs (e.g., YouTube) | No | [OCP Plugins](https://openvoiceos.github.io/ovos-technical-manual/ocp_plugins/) |
+| Dialog Transformers | Processes text before text-to-speech (TTS) | No | [Dialog Transformers](https://openvoiceos.github.io/ovos-technical-manual/transformer_plugins/) |
+| TTS Transformers | Processes audio after text-to-speech (TTS) | No | [TTS Transformers](https://openvoiceos.github.io/ovos-technical-manual/transformer_plugins/) |
+| PHAL | Provides platform-specific support (e.g., Mark 1) | No | [PHAL](https://openvoiceos.github.io/ovos-technical-manual/PHAL/) |
diff --git a/hivemind_voice_relay/__init__.py b/hivemind_voice_relay/__init__.py
@@ -1 +0,0 @@
-from hivemind_voice_relay.service import VoiceRelay, AudioPlaybackRelay

diff --git a/hivemind_voice_relay/__main__.py b/hivemind_voice_relay/__main__.py
@@ -1,43 +1,22 @@
-from threading import Event
-
 import click
-from hivemind_bus_client import HiveMessageBusClient
-from hivemind_bus_client.identity import NodeIdentity
-from ovos_bus_client.client import MessageBusClient
 from ovos_utils import wait_for_exit_signal
-from ovos_utils.log import init_service_logger, LOG
 from ovos_utils.fakebus import FakeBus
-from hivemind_voice_relay.service import VoiceRelay, AudioPlaybackRelay
-
-
-def launch_bus_daemon() -> MessageBusClient:
-    from ovos_utils import create_daemon
-    from tornado import web, ioloop
-    from ovos_messagebus.event_handler import MessageBusEventHandler
-
-    INTERNAL_PORT = 9987  # can be anything, wanted to differentiate from standard ovos-bus
-
-    routes = [("/core", MessageBusEventHandler)]
-    application = web.Application(routes)
-    application.listen(INTERNAL_PORT, "127.0.0.1")
-    create_daemon(ioloop.IOLoop.instance().start)
-
-    bus = MessageBusClient(host="127.0.0.1", port=INTERNAL_PORT)
-    bus.run_in_thread()
-    return bus
+from ovos_utils.log import init_service_logger, LOG
 
+from hivemind_bus_client import HiveMessageBusClient
+from hivemind_bus_client.identity import NodeIdentity
+from hivemind_voice_relay.service import HiveMindVoiceRelay
 
 
 # TODO - add a flag to use FakeBus instead of real websocket
-@click.command(help="connect to HiveMind")
+@click.command(help="connect to HiveMind Sound Server")
 @click.option("--host", help="hivemind host", type=str, default="")
 @click.option("--key", help="Access Key", type=str, default="")
 @click.option("--password", help="Password for key derivation", type=str, default="")
 @click.option("--port", help="HiveMind port number", type=int, default=5678)
 @click.option("--selfsigned", help="accept self signed certificates", is_flag=True)
 @click.option("--siteid", help="location identifier for message.context", type=str, default="")
-@click.option("--fakebus", help="use FakeBus instead of real websocket", is_flag=True)
-def connect(host, key, password, port, selfsigned, siteid, fakebus):
+def connect(host, key, password, port, selfsigned, siteid):
     init_service_logger("HiveMind-voice-relay")
 
     identity = NodeIdentity()
@@ -58,29 +37,20 @@ def connect(host, key, password, port, selfsigned, siteid, fakebus):
         LOG.error(f"ws://{host} or wss://{host}")
         exit(1)
 
-    # Check for fakebus flag
-    if fakebus:
-        internal_bus = FakeBus()
-    else:
-        internal_bus = launch_bus_daemon() or FakeBus()
+    internal_bus = FakeBus()
 
     # connect to hivemind
     bus = HiveMessageBusClient(key=key,
                                password=password,
                                port=port,
                                host=host,
-                               useragent="VoiceRelayV0.0.1",
+                               useragent="VoiceRelayV1.0.0",
                                self_signed=selfsigned,
                                internal_bus=internal_bus)
     bus.connect(site_id=siteid)
 
-    # create Audio Output interface (TTS/Music)
-    audio = AudioPlaybackRelay(bus=bus)
-    audio.daemon = True
-    audio.start()
-
     # STT listener thread
-    service = VoiceRelay(bus=bus)
+    service = HiveMindVoiceRelay(bus=bus)
     service.daemon = True
     service.start()
 
@@ -95,7 +65,6 @@ def connect(host, key, password, port, selfsigned, siteid, fakebus):
     wait_for_exit_signal()
 
     service.stop()
-    audio.shutdown()
     if phal:
         phal.shutdown()
 

diff --git a/hivemind_voice_relay/service.py b/hivemind_voice_relay/service.py
@@ -2,19 +2,49 @@
 import threading
 from typing import List, Tuple, Optional
 
+import speech_recognition as sr
 from ovos_audio.service import PlaybackService
 from ovos_bus_client.message import Message, dig_for_message
-from ovos_config.locale import setup_locale
-from ovos_dinkum_listener.plugins import FakeStreamingSTT
-from ovos_dinkum_listener.service import OVOSDinkumVoiceService
+from ovos_config import Configuration
+from ovos_plugin_manager.microphone import OVOSMicrophoneFactory
 from ovos_plugin_manager.templates.stt import STT
 from ovos_plugin_manager.templates.tts import TTS
-from ovos_plugin_manager.templates.vad import VADEngine
 from ovos_plugin_manager.utils.tts_cache import hash_sentence
+from ovos_plugin_manager.vad import OVOSVADFactory
+from ovos_plugin_manager.wakewords import OVOSWakeWordFactory
+from ovos_simple_listener import ListenerCallbacks, SimpleListener
+from ovos_utils.fakebus import FakeBus
 from ovos_utils.log import LOG
 from speech_recognition import AudioData
 
 from hivemind_bus_client.client import HiveMessageBusClient
+from hivemind_bus_client.identity import NodeIdentity
+
+
+def get_bus() -> HiveMessageBusClient:
+    # TODO - kwargs
+    identity = NodeIdentity()
+    siteid = identity.site_id or "unknown"
+    host = identity.default_master
+    port = 5678
+
+    if not identity.access_key or not identity.password or not host:
+        raise RuntimeError("NodeIdentity not set, please pass key/password/host or "
+                           "call 'hivemind-client set-identity'")
+
+    if not host.startswith("ws://") and not host.startswith("wss://"):
+        host = "ws://" + host
+    if not host.startswith("ws"):
+        raise ValueError(f"Invalid host, please specify a protocol: 'ws://{host}' or 'wss://{host}'")
+
+    bus = HiveMessageBusClient(key=identity.access_key,
+                               password=identity.password,
+                               port=port,
+                               host=host,
+                               useragent="VoiceRelayV1.0.0",
+                               internal_bus=FakeBus())
+    bus.connect(site_id=siteid)
+    return bus
 
 
 def on_ready():
@@ -37,6 +67,31 @@ def on_error(e='Unknown'):
     LOG.error(f'HiveMind Voice Relay failed to launch ({e}).')
 
 
+class HMCallbacks(ListenerCallbacks):
+    def __init__(self, bus: Optional[HiveMessageBusClient] = None):
+        self.bus = bus or get_bus()
+
+    def listen_callback(self):
+        LOG.info("New loop state: IN_COMMAND")
+        self.bus.internal_bus.emit(Message("mycroft.audio.play_sound",
+                                           {"uri": "snd/start_listening.wav"}))
+        self.bus.internal_bus.emit(Message("recognizer_loop:wakeword"))
+        self.bus.internal_bus.emit(Message("recognizer_loop:record_begin"))
+
+    def end_listen_callback(self):
+        LOG.info("New loop state: WAITING_WAKEWORD")
+        self.bus.internal_bus.emit(Message("recognizer_loop:record_end"))
+
+    def error_callback(self, audio: sr.AudioData):
+        LOG.error("STT Failure")
+        self.bus.internal_bus.emit(Message("recognizer_loop:speech.recognition.unknown"))
+
+    def text_callback(self, utterance: str, lang: str):
+        LOG.info(f"STT: {utterance}")
+        self.bus.emit(Message("recognizer_loop:utterance",
+                              {"utterances": [utterance], "lang": lang}))
+
+
 class HiveMindSTT(STT):
     def __init__(self, bus: HiveMessageBusClient, config=None):
         super().__init__(config)
@@ -70,17 +125,18 @@ def execute(self, audio: AudioData, language: Optional[str] = None) -> str:
             return ""
 
 
-class AudioPlaybackRelay(PlaybackService):
-
+class HMPlayback(PlaybackService):
     def __init__(self, bus: HiveMessageBusClient, ready_hook=on_ready, error_hook=on_error,
                  stopping_hook=on_stopping, alive_hook=on_alive,
                  started_hook=on_started, watchdog=lambda: None):
         super().__init__(ready_hook, error_hook, stopping_hook, alive_hook, started_hook, watchdog=watchdog,
                          bus=bus, validate_source=False,
                          disable_fallback=True)
         self.bus.on("speak:b64_audio.response", self.handle_tts_b64_response)
+        self.start()
 
-    def execute_tts(self, utterance, ident, listen=False, message: Message = None):
+    def execute_tts(self, utterance, ident, listen=False,
+                    message: Message = None):
         """Mute mic and start speaking the utterance using selected tts backend.
 
         Args:
@@ -109,31 +165,29 @@ def handle_tts_b64_response(self, message: Message):
         )
 
     def handle_b64_audio(self, message):
-        pass  # handled in master, not client
-
-    def _maybe_reload_tts(self):
-        # skip loading TTS in this subclass
+        # HACK: don't get into an infinite loop, this message is meant for master
+        # because of how HiveMindTTS is implemented we need to do this
         pass
 
 
-class VoiceRelay(OVOSDinkumVoiceService):
-    """HiveMind Voice Relay, but bus is replaced with hivemind connection"""
+class HiveMindVoiceRelay(SimpleListener):
+    def __init__(self, bus: Optional[HiveMessageBusClient] = None):
+        self.bus = bus or get_bus()
+        self.audio = HMPlayback(bus=self.bus)
+        ww = Configuration().get("listener", {}).get("wake_word", "hey_mycroft")
+        super().__init__(
+            mic=OVOSMicrophoneFactory.create(),
+            vad=OVOSVADFactory.create(),
+            wakeword=OVOSWakeWordFactory.create_hotword(ww),
+            stt=HiveMindSTT(self.bus),
+            callbacks=HMCallbacks(self.bus)
+        )
 
-    def __init__(self, bus: HiveMessageBusClient, on_ready=on_ready, on_error=on_error,
-                 on_stopping=on_stopping, on_alive=on_alive,
-                 on_started=on_started, watchdog=lambda: None, mic=None,
-                 vad: Optional[VADEngine] = None):
-        setup_locale()  # read mycroft.conf for default lang/timezone in all modules (eg, lingua_franca)
-        stt = FakeStreamingSTT(HiveMindSTT(bus=bus))
-        super().__init__(on_ready, on_error, on_stopping, on_alive, on_started, watchdog, mic,
-                         stt=stt, vad=vad,
-                         bus=bus, validate_source=False, disable_fallback=True)
 
-    def _handle_b64_transcribe(self, message: Message):
-        pass  # handled in master, not client
+def main():
+    t = HiveMindVoiceRelay()
+    t.run()
 
-    def _connect_to_bus(self):
-        pass
 
-    def reload_configuration(self):
-        pass
+if __name__ == "__main__":
+    main()
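
The host handling added to `get_bus` in `service.py` can be looked at in isolation. The following is a minimal sketch, assuming a hypothetical helper named `normalize_host` (it is not part of this commit) that repeats the two checks from the diff. It suggests that, because a bare hostname always gets `ws://` prepended first, the later `startswith("ws")` check can never fail.

```python
def normalize_host(host: str) -> str:
    """Hypothetical helper mirroring the host checks in get_bus()."""
    # Prepend a default scheme when none is given, as the diff does.
    if not host.startswith("ws://") and not host.startswith("wss://"):
        host = "ws://" + host
    # After the step above this condition is always False, so the
    # ValueError branch is effectively unreachable.
    if not host.startswith("ws"):
        raise ValueError(
            f"Invalid host, please specify a protocol: 'ws://{host}' or 'wss://{host}'"
        )
    return host


print(normalize_host("192.168.1.10"))        # ws://192.168.1.10
print(normalize_host("wss://hive.example"))  # wss://hive.example
```

If rejecting hosts without a scheme is the intended behavior, as the error message in `__main__.py` appears to suggest, the validation would need to run before the prefix is added.
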
diff --git a/requirements.txt b/requirements.txt
@@ -1,11 +1,9 @@
-hivemind_bus_client>=0.0.4a10
+hivemind_bus_client>=0.0.4
 ovos-audio
-ovos-dinkum-listener>=0.0.3a14, < 2.0.0
+ovos-simple-listener>=0.0.3,<1.0.0
 ovos-microphone-plugin-alsa
 ovos-vad-plugin-silero
 ovos-stt-plugin-server
 ovos-tts-plugin-server
-ovos-ww-plugin-vosk
 click
-hivemind-ggwave
 ovos-messagebus
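
To see which bus messages `HMCallbacks` in `service.py` produces during one wake word interaction, the sketch below replays the callbacks against stand-ins. `RecordingBus` and `SketchCallbacks` are hypothetical names introduced here so the example runs without the OVOS or HiveMind packages. The real code emits `Message` objects, while the stand-in takes a message type and a data dict; only the message types and payloads are taken from the diff.

```python
from typing import List, Optional, Tuple


class RecordingBus:
    """Hypothetical stand-in for a bus: records (msg_type, data) instead of sending."""

    def __init__(self) -> None:
        self.sent: List[Tuple[str, dict]] = []

    def emit(self, msg_type: str, data: Optional[dict] = None) -> None:
        self.sent.append((msg_type, data or {}))


class SketchCallbacks:
    """Mirrors HMCallbacks: local events go to the internal bus,
    the recognized utterance goes to the HiveMind connection."""

    def __init__(self, hive_bus: RecordingBus, internal_bus: RecordingBus) -> None:
        self.hive_bus = hive_bus
        self.internal_bus = internal_bus

    def listen_callback(self) -> None:
        # Wake word detected: play the listening sound and start recording.
        self.internal_bus.emit("mycroft.audio.play_sound",
                               {"uri": "snd/start_listening.wav"})
        self.internal_bus.emit("recognizer_loop:wakeword")
        self.internal_bus.emit("recognizer_loop:record_begin")

    def end_listen_callback(self) -> None:
        self.internal_bus.emit("recognizer_loop:record_end")

    def error_callback(self) -> None:
        self.internal_bus.emit("recognizer_loop:speech.recognition.unknown")

    def text_callback(self, utterance: str, lang: str) -> None:
        # Only the transcription leaves the device.
        self.hive_bus.emit("recognizer_loop:utterance",
                           {"utterances": [utterance], "lang": lang})


hive, internal = RecordingBus(), RecordingBus()
cb = SketchCallbacks(hive, internal)
cb.listen_callback()
cb.end_listen_callback()
cb.text_callback("what time is it", "en-us")

print([msg_type for msg_type, _ in internal.sent])
print(hive.sent)
```

The split shown here follows the diff: listening state changes stay on `internal_bus`, and only `recognizer_loop:utterance` is sent over the HiveMind connection.
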