Add an OpenAI voice assistant #1002

Open
jeremiahrose wants to merge 20 commits into slopus:main from jeremiahrose:openai-voice-assistant

@jeremiahrose jeremiahrose commented Apr 7, 2026

Keen for feedback on this approach. This is something I've been developing for my own personal use but I'm hoping that at least some of it could be useful for the Happy community.

The basic idea is to add a settings dialog that lets the user choose between voice backends, plus a new option that uses OpenAI's STT/TTS services to talk directly to Claude. It also adds a "push to talk" button, which makes this simpler and less prone to VAD errors such as being triggered by background speech.
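As a rough illustration of how a backend setting and a push-to-talk button could fit together, here is a minimal sketch. Every name in it (`VoiceBackend`, `VoiceSettings`, `PushToTalk`, the backend labels) is hypothetical and not taken from this PR's code; audio chunks are modeled as plain strings to keep it self-contained.

```typescript
// Hypothetical names throughout; these are not the PR's actual types.
type VoiceBackend = "elevenlabs-agent" | "openai-stt-tts";

interface VoiceSettings {
  backend: VoiceBackend;
  pushToTalk: boolean;
}

// Assumed default for illustration only.
const defaultSettings: VoiceSettings = {
  backend: "openai-stt-tts",
  pushToTalk: true,
};

type PttState = "idle" | "recording" | "processing";

// Push-to-talk as an explicit state machine: audio is only kept between
// press and release, so no VAD decides when an utterance starts or ends.
class PushToTalk {
  private state: PttState = "idle";
  private chunks: string[] = [];

  get current(): PttState {
    return this.state;
  }

  press(): void {
    if (this.state !== "idle") return; // ignore presses while busy
    this.chunks = [];
    this.state = "recording";
  }

  // Called by a (hypothetical) recorder for each captured chunk.
  onAudio(chunk: string): void {
    if (this.state === "recording") this.chunks.push(chunk);
  }

  // Returns the captured audio and moves to "processing".
  release(): string[] {
    if (this.state !== "recording") return [];
    this.state = "processing";
    return this.chunks;
  }

  // Called once transcription and playback have finished.
  done(): void {
    if (this.state === "processing") this.state = "idle";
  }
}
```

In a React Native UI, `press()` and `release()` would typically be wired to a button's press-in and press-out handlers; background speech outside that window is simply never recorded.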

This takes a "talk directly to Claude" approach rather than talking to a voice agent that sits between the user and Claude. I found the voice agent approach unworkable with GPT-4o because the model is fine-tuned to respond to questions and would often get into endless conversations with Claude rather than strictly relaying the user's messages. STT/TTS works better in my experience, since Claude has all of the project context.
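The difference from the agent approach can be sketched as a plain relay: transcribe, forward the transcript unchanged, speak the reply. The function and type names below (`relayUtterance`, `Transcribe`, `SendToClaude`, `Speak`) are assumptions for illustration, and the STT, session, and TTS calls are injected rather than tied to any real API.

```typescript
// Hypothetical interfaces; the PR may structure this differently.
type Transcribe = (audio: Uint8Array) => Promise<string>;
type SendToClaude = (text: string) => Promise<string>;
type Speak = (text: string) => Promise<void>;

// Direct relay: the transcript is forwarded as-is (apart from trimming
// whitespace), with no intermediate agent that could rephrase it or
// answer on its own.
async function relayUtterance(
  audio: Uint8Array,
  transcribe: Transcribe,
  sendToClaude: SendToClaude,
  speak: Speak,
): Promise<string | null> {
  const text = (await transcribe(audio)).trim();
  if (text.length === 0) return null; // nothing recognized; send nothing
  const reply = await sendToClaude(text);
  await speak(reply);
  return reply;
}
```

Because the three steps are injected, the same relay could in principle sit in front of a cloud STT/TTS service or an on-device one.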

Keen to discuss which parts of this, if any, we could merge into Happy, and then we can rework them into something more solid. This was recently rebased onto the latest main, so there may still be some things to fix.

jeremiahrose and others added 15 commits April 7, 2026 09:22
…kend

Generated with [Claude Code](https://claude.ai/code)
via [Happy](https://happy.engineering)

Co-Authored-By: Claude <noreply@anthropic.com>
Co-Authored-By: Happy <yesreply@happy.engineering>
…eSession

…port

@jeremiahrose jeremiahrose changed the title from "Add a OpenAI voice assistant" to "Add an OpenAI voice assistant" Apr 7, 2026
Contributor

bra1nDump commented Apr 7, 2026

Cool PR! Love the push-to-talk approach over VAD. I'm a big fan of the middleware voice agent approach in general, but I actually use push-to-talk myself when voice coding :D

I was just configuring voice a bit more for ElevenLabs, so I broke some of the changes here.

Have you considered on-device STT instead of OpenAI's API? NVIDIA's Parakeet v3 (0.6B) beats Whisper Large v3 on accuracy and runs fully on-device now.

On iOS there's @fluidinference/react-native-fluidaudio, an official React Native wrapper that runs Parakeet v3 on the Apple Neural Engine via CoreML. Push-to-talk with on-device STT would feel instant (no network round-trip), has no per-minute API cost, works offline, and audio never leaves the device. I feel like this would be quite different from the current capability and would work out of the box without extra settings, which is preferred.

Could start with iOS only and figure out Android later.

@jeremiahrose
Author

> Have you considered on-device STT instead of OpenAI's API? NVIDIA's Parakeet v3 (0.6B) beats Whisper Large v3 on accuracy and runs fully on-device now.

I love this idea, although I'm on Android so I wouldn't be able to test @fluidinference/react-native-fluidaudio.

Perhaps we could start with the menu option for users to select their preferred voice mode, with options for ElevenLabs vs. a local Android STT/TTS (plus the PTT button), and add iOS/Parakeet after that?

I'll see if I can rework this PR accordingly...
