Add an OpenAI voice assistant by jeremiahrose · Pull Request #1002 · slopus/happy

jeremiahrose · 2026-04-07T00:15:30Z

Keen for feedback on this approach. This is something I've been developing for my own personal use but I'm hoping that at least some of it could be useful for the Happy community.

The basic idea is to add a settings dialogue to allow the user to select between voice backends, and add a new option to make use of OpenAI STT/TTS services to talk directly to Claude. It also adds a "push to talk" button to make this simpler and less prone to VAD errors / being triggered by background speech etc.
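As a rough illustration of the settings-plus-push-to-talk idea (a sketch, not the code in this PR): the setting names `voiceBackend` and `voicePushToTalk` echo the commit titles, but the `VoiceSession` shape, the `SketchSession` class, and `createVoiceSession` are hypothetical names chosen for the example.

```typescript
// Hypothetical sketch: every type and function name here is an assumption,
// not the PR's actual API.
type VoiceBackend = 'elevenlabs' | 'openai';

interface VoiceSettings {
    voiceBackend: VoiceBackend;
    voicePushToTalk: boolean;
}

interface VoiceSession {
    backend: VoiceBackend;
    start(): void;
    stop(): void;
    pressToTalk(): void;   // no-op unless push-to-talk is enabled
    releaseToTalk(): void; // no-op unless push-to-talk is enabled
    isCapturing(): boolean;
}

class SketchSession implements VoiceSession {
    backend: VoiceBackend;
    private pushToTalk: boolean;
    private active = false;
    private capturing = false;

    constructor(backend: VoiceBackend, pushToTalk: boolean) {
        this.backend = backend;
        this.pushToTalk = pushToTalk;
    }

    start(): void {
        this.active = true;
        // Without push-to-talk the mic stays open and VAD decides when speech starts.
        this.capturing = !this.pushToTalk;
    }

    stop(): void {
        this.active = false;
        this.capturing = false;
    }

    pressToTalk(): void {
        if (this.active && this.pushToTalk) this.capturing = true;
    }

    releaseToTalk(): void {
        if (this.active && this.pushToTalk) this.capturing = false;
    }

    isCapturing(): boolean {
        return this.capturing;
    }
}

// Picks the session from the user's settings; a real app would return a
// backend-specific implementation here instead of the shared sketch class.
function createVoiceSession(settings: VoiceSettings): VoiceSession {
    return new SketchSession(settings.voiceBackend, settings.voicePushToTalk);
}
```

With push-to-talk on, audio is only captured between `pressToTalk()` and `releaseToTalk()`, which is what keeps background speech from triggering a turn.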

This takes a "talk directly to Claude" approach rather than talking to a voice agent that sits between the user and Claude. I found the voice agent approach to be unworkable with GPT-4o because the model is fine-tuned to respond to questions and would often get into endless conversations with Claude rather than strictly relaying the user's messages. STT/TTS works better in my experience since Claude has all of the project context.
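The direct relay can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the PR's implementation: `RelayDeps`, `relayUtterance`, and the three injected functions are hypothetical names, and the actual STT, Claude, and TTS calls are left to whatever the app wires in.

```typescript
// Hypothetical sketch: the dependency names below are assumptions for
// illustration; no real STT/TTS client API is being described.
interface RelayDeps {
    transcribe(audio: Uint8Array): Promise<string>; // speech-to-text
    sendToClaude(text: string): Promise<string>;    // forwards the user's words, returns the reply
    speak(text: string): Promise<void>;             // text-to-speech
}

// Relays one push-to-talk utterance: transcribe it, pass the transcript to
// Claude unchanged, then read the reply aloud. No intermediate voice agent
// gets a chance to rephrase or answer on the user's behalf.
async function relayUtterance(audio: Uint8Array, deps: RelayDeps): Promise<string | null> {
    const transcript = (await deps.transcribe(audio)).trim();
    if (transcript.length === 0) {
        // Nothing intelligible was said, so do not bother Claude with an empty turn.
        return null;
    }
    const reply = await deps.sendToClaude(transcript);
    await deps.speak(reply);
    return reply;
}
```

Because the transcript is forwarded verbatim, the only model interpreting the request is Claude itself, which already holds the project context.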

Keen to discuss which parts of this, if any, we could merge into Happy, and then we can rework them into something more solid. This was recently rebased onto the latest main, so there may still be some things to fix.

bra1nDump · 2026-04-07T11:54:30Z

Cool PR! Love the push-to-talk approach over VAD. I'm a big fan of the middleware voice agent approach in general, but I actually use push-to-talk myself when voice coding :D

I was just configuring voice a bit more for ElevenLabs, so I broke some of the changes here.

Have you considered on-device STT instead of OpenAI's API? NVIDIA's Parakeet v3 (0.6B) beats Whisper Large v3 on accuracy and runs fully on-device now.

On iOS there's @fluidinference/react-native-fluidaudio, an official React Native wrapper that runs Parakeet v3 on the Apple Neural Engine via CoreML. Push-to-talk with on-device STT would feel instant (no network round-trip), has no per-minute API cost, works offline, and audio never leaves the device. I feel like this would be quite different from the current capability and would work out of the box without extra settings, which is preferred.

Could start with iOS only and figure out Android later.

jeremiahrose · 2026-04-08T22:00:29Z

> Have you considered on-device STT instead of OpenAI's API? NVIDIA's Parakeet v3 (0.6B) beats Whisper Large v3 on accuracy and runs fully on-device now.

I love this idea, although I'm on Android so I wouldn't be able to test @fluidinference/react-native-fluidaudio.

Perhaps we could start with the menu option for users to select their preferred voice mode, with options for ElevenLabs vs. a local Android STT/TTS (plus PTT button), and add iOS/Parakeet after that?

I'll see if I can rework this PR accordingly...

jeremiahrose and others added 15 commits April 7, 2026 09:22

Add VoiceSession interface extensions for push-to-talk and OpenAI backend

706c7af

Generated with [Claude Code](https://claude.ai/code) via [Happy](https://happy.engineering) Co-Authored-By: Claude <noreply@anthropic.com> Co-Authored-By: Happy <yesreply@happy.engineering>

Add voiceBackend and voicePushToTalk settings

7705118

Add OpenAI voice config, foreground service, and Android native plugin

71bde0f

Rename RealtimeVoiceSession to ElevenLabsVoiceSession, add OpenAIVoiceSession

830d93c

Update RealtimeProvider to switch between ElevenLabs and OpenAI backends

133341d

Update RealtimeSession for dual-backend support with PTT exports

d74ff21

Rewrite context formatters for OpenAI voice with glossary and TTS support

f52051d

Update voiceHooks with glossary extraction and session filtering

d6908f5

Add voice prompt and system prompt switching for voice mode

08b312a

Add voice provider selector and OpenAI settings UI

1daf18e

Add push-to-talk button to VoiceAssistantStatusBar

86eef17

Hide Skill tool in knownTools for TTS

02eef7c

Add voice provider, PTT, and API key translations

de68311

Update ElevenLabs packages for voice backend compatibility

4fd8fab

Fix @expo/ui API usage and remove stale masked-progress route

23605a9

jeremiahrose changed the title ~~Add a OpenAI voice assistant~~ Add an OpenAI voice assistant Apr 7, 2026

jeremiahrose and others added 5 commits April 7, 2026 16:04

Show visible error alerts for OpenAI voice quota and session errors

bab41ee

Deduplicate voice permission announcements and silence tool call TTS

a2abef9

Add 'approved' to voice permission approval patterns

ed9617a

Switch transcription model from gpt-4o-transcribe to whisper-1

2a62dd0

Tolerate trailing punctuation in voice permission approval patterns

65a1aea

jeremiahrose wants to merge 20 commits into slopus:main from jeremiahrose:openai-voice-assistant
