Real-Time Open-Source AI Assistant with Voice

Budget: $750

I want to ship a lightning-fast personal voice assistant that relies only on open-source or freely accessible services and runs entirely in real time.

The core tech stack is already chosen:
• LiveKit will handle the bi-directional audio stream.
• Speech-to-text must use Groq Whisper Large for immediate transcriptions.
• Chat logic will sit on Groq's Llama 3.3-70B Versatile endpoint, with my fine-tuned KimiR2 weights ready to drop in.
• Text-to-speech is Kokoro (fully OpenAI TTS-compatible) and needs to surface several selectable voices on the fly.

Your task is to wire these pieces into a single, production-ready application that I can demo end-to-end: hot-word detection, low-latency streaming, live transcription, contextual Llama replies, and instant playback. Everything must feel seamless and snappy.

I need this ASAP, so I'm leaning on people who have shipped similar voice or streaming AI projects before. When you reply, link me directly to past work that proves you can integrate LiveKit or comparable low-latency pipelines and large-model inference.

Deliverables (all required for sign-off):
– A runnable repo with clear setup instructions (Docker or simple script).
– Real-time assistant demo showing <200 ms round-trip from speech end to voice reply.
– Configurable voice list sourced from Kokoro.
– Readme covering model/API keys, environment variables, and how to swap voices or models without code changes.

If you have a faster approach that still keeps everything open source, feel free to propose it. Speed is the only hard constraint.
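To illustrate what "swap voices or models without code changes" could look like in practice, here is a minimal sketch that reads the whole configuration from environment variables. Everything named in it is an assumption for illustration only: the variable names (`STT_MODEL`, `LLM_MODEL`, `KOKORO_BASE_URL`, `KOKORO_VOICES`, `KOKORO_DEFAULT_VOICE`), the default model identifiers, the local URL, and the example voice IDs are placeholders, and the final repo may choose different ones.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class AssistantConfig:
    stt_model: str           # speech-to-text model identifier
    llm_model: str           # chat model identifier
    tts_base_url: str        # base URL of the OpenAI-compatible Kokoro server
    voices: tuple[str, ...]  # voices offered to the user
    default_voice: str       # voice used when none is selected


def load_config(env=None) -> AssistantConfig:
    """Build the config from environment variables (all names are hypothetical)."""
    env = os.environ if env is None else env

    # Comma-separated voice list, e.g. "af_heart, am_adam" (placeholder IDs).
    raw_voices = env.get("KOKORO_VOICES", "af_heart")
    voices = tuple(v.strip() for v in raw_voices.split(",") if v.strip())
    if not voices:
        raise ValueError("KOKORO_VOICES must list at least one voice")

    default_voice = env.get("KOKORO_DEFAULT_VOICE", voices[0])
    if default_voice not in voices:
        raise ValueError(f"default voice {default_voice!r} is not in {voices}")

    return AssistantConfig(
        # Assumed defaults; override via the environment to swap models.
        stt_model=env.get("STT_MODEL", "whisper-large-v3"),
        llm_model=env.get("LLM_MODEL", "llama-3.3-70b-versatile"),
        tts_base_url=env.get("KOKORO_BASE_URL", "http://localhost:8880/v1"),
        voices=voices,
        default_voice=default_voice,
    )


if __name__ == "__main__":
    print(load_config())
```

With this shape, changing `LLM_MODEL` or `KOKORO_VOICES` in a `.env` file or the Docker environment would be enough to switch models or voices, and an invalid default voice fails at startup rather than mid-call.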

Python
