A Next.js-based web app for conversational AI agents, built with Agora's Real-Time Communication SDK.
- Guide.md on how to build this application from scratch.
- User Interaction Diagram for how the application interacts with the different services.
Before you begin, ensure you have the following installed:
You must have an Agora account and a project to use this application.
- Clone the repository:
  git clone https://github.com/AgoraIO-Community/conversational-ai-nextjs-client
  cd conversational-ai-nextjs-client
- Install dependencies:
  pnpm install
- Create a .env.local file in the root directory and add your environment variables:
  cp .env.local.example .env.local
The following environment variables are required:
- NEXT_PUBLIC_AGORA_APP_ID - Your Agora App ID
- NEXT_PUBLIC_AGORA_APP_CERTIFICATE - Your Agora App Certificate
- NEXT_PUBLIC_AGORA_CONVO_AI_BASE_URL - Agora Conversation AI Base URL
- NEXT_PUBLIC_AGORA_CUSTOMER_ID - Your Agora Customer ID
- NEXT_PUBLIC_AGORA_CUSTOMER_SECRET - Your Agora Customer Secret
- NEXT_PUBLIC_AGENT_UID - Agent UID (defaults to "Agent")
- NEXT_PUBLIC_LLM_URL - LLM API endpoint URL
- NEXT_PUBLIC_LLM_TOKEN - LLM API authentication token
- NEXT_PUBLIC_LLM_MODEL - LLM model to use (optional)
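A missing variable is easier to diagnose at startup than at the first API call. The following is a minimal sketch of such a check, not code from the repository: `findMissingEnvVars` is a hypothetical helper, and the list contains only the variables above that are not marked optional or given a default. It assumes it runs on the server, where the full environment is available as an object.

```typescript
// Hypothetical helper: names are copied from the list above,
// minus the ones documented as optional or having a default.
const REQUIRED_ENV_VARS: string[] = [
  "NEXT_PUBLIC_AGORA_APP_ID",
  "NEXT_PUBLIC_AGORA_APP_CERTIFICATE",
  "NEXT_PUBLIC_AGORA_CONVO_AI_BASE_URL",
  "NEXT_PUBLIC_AGORA_CUSTOMER_ID",
  "NEXT_PUBLIC_AGORA_CUSTOMER_SECRET",
  "NEXT_PUBLIC_LLM_URL",
  "NEXT_PUBLIC_LLM_TOKEN",
];

// Returns the names of required variables that are unset or blank.
function findMissingEnvVars(env: Record<string, string | undefined>): string[] {
  return REQUIRED_ENV_VARS.filter((name) => {
    const value = env[name];
    return value === undefined || value.trim() === "";
  });
}
```

On the server this could be called as `findMissingEnvVars(process.env)`, failing fast if the returned array is not empty.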
Choose one of the following TTS providers:
- NEXT_PUBLIC_TTS_VENDOR=microsoft
- NEXT_PUBLIC_MICROSOFT_TTS_KEY - Microsoft TTS API key
- NEXT_PUBLIC_MICROSOFT_TTS_REGION - Microsoft TTS region
- NEXT_PUBLIC_MICROSOFT_TTS_VOICE_NAME - Voice name (optional, defaults to 'en-US-AndrewMultilingualNeural')
- NEXT_PUBLIC_MICROSOFT_TTS_RATE - Speech rate (optional, defaults to 1.0)
- NEXT_PUBLIC_MICROSOFT_TTS_VOLUME - Volume (optional, defaults to 100.0)

- NEXT_PUBLIC_TTS_VENDOR=elevenlabs
- NEXT_PUBLIC_ELEVENLABS_API_KEY - ElevenLabs API key
- NEXT_PUBLIC_ELEVENLABS_VOICE_ID - ElevenLabs voice ID
- NEXT_PUBLIC_ELEVENLABS_MODEL_ID - Model ID (optional, defaults to 'eleven_flash_v2_5')
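To illustrate how the vendor switch and the documented defaults fit together, here is a hedged sketch of reading the TTS settings into a typed object. `TtsConfig`, `requireVar`, and `readTtsConfig` are hypothetical names introduced for this example; the application's own code may be organized differently. Only the variable names and default values come from the lists above.

```typescript
type EnvMap = Record<string, string | undefined>;

type TtsConfig =
  | { vendor: "microsoft"; key: string; region: string; voiceName: string; rate: number; volume: number }
  | { vendor: "elevenlabs"; apiKey: string; voiceId: string; modelId: string };

// Hypothetical helper: returns the value or throws if it is unset or blank.
function requireVar(env: EnvMap, name: string): string {
  const value = env[name];
  if (value === undefined || value.trim() === "") {
    throw new Error(`Missing environment variable: ${name}`);
  }
  return value;
}

// Builds the TTS configuration, applying the defaults listed above.
function readTtsConfig(env: EnvMap): TtsConfig {
  const vendor = env["NEXT_PUBLIC_TTS_VENDOR"];
  if (vendor === "microsoft") {
    return {
      vendor: "microsoft",
      key: requireVar(env, "NEXT_PUBLIC_MICROSOFT_TTS_KEY"),
      region: requireVar(env, "NEXT_PUBLIC_MICROSOFT_TTS_REGION"),
      voiceName: env["NEXT_PUBLIC_MICROSOFT_TTS_VOICE_NAME"] ?? "en-US-AndrewMultilingualNeural",
      rate: Number(env["NEXT_PUBLIC_MICROSOFT_TTS_RATE"] ?? "1.0"),
      volume: Number(env["NEXT_PUBLIC_MICROSOFT_TTS_VOLUME"] ?? "100.0"),
    };
  }
  if (vendor === "elevenlabs") {
    return {
      vendor: "elevenlabs",
      apiKey: requireVar(env, "NEXT_PUBLIC_ELEVENLABS_API_KEY"),
      voiceId: requireVar(env, "NEXT_PUBLIC_ELEVENLABS_VOICE_ID"),
      modelId: env["NEXT_PUBLIC_ELEVENLABS_MODEL_ID"] ?? "eleven_flash_v2_5",
    };
  }
  throw new Error(`Unsupported NEXT_PUBLIC_TTS_VENDOR: ${String(vendor)}`);
}
```

Modeling the two providers as a discriminated union means code that consumes the config has to check `vendor` before touching provider-specific fields.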
- NEXT_PUBLIC_INPUT_MODALITIES - Comma-separated list of input modalities (defaults to 'text')
- NEXT_PUBLIC_OUTPUT_MODALITIES - Comma-separated list of output modalities (defaults to 'text,audio')
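Because the modality variables are comma-separated strings, they need to be split before they can be sent as arrays. A minimal sketch, assuming whitespace around entries should be ignored and that an unset or empty value falls back to the documented default; `parseModalities` is a hypothetical helper name.

```typescript
// Splits a comma-separated value into a trimmed list,
// returning the fallback when the value is unset or empty.
function parseModalities(raw: string | undefined, fallback: string[]): string[] {
  if (raw === undefined) return fallback;
  const items = raw
    .split(",")
    .map((item) => item.trim())
    .filter((item) => item.length > 0);
  return items.length > 0 ? items : fallback;
}

// Defaults taken from the variable descriptions above.
const inputModalities = parseModalities(undefined, ["text"]);
const outputModalities = parseModalities("text, audio", ["text", "audio"]);
```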
- Run the development server:
  pnpm dev
- Open your browser and navigate to http://localhost:3000 to see the application in action.
This project is configured for quick deployments to Vercel.
This will:
- Clone the repository to your GitHub account
- Create a new project on Vercel
- Prompt you to fill in the required environment variables:
  - Required: Agora credentials (NEXT_PUBLIC_AGORA_APP_ID, NEXT_PUBLIC_AGORA_APP_CERTIFICATE, etc.)
  - Required: LLM API key (NEXT_PUBLIC_LLM_API_KEY) - OpenAI API key by default
  - Required: Either Microsoft TTS key (NEXT_PUBLIC_MICROSOFT_TTS_KEY) or ElevenLabs API key (NEXT_PUBLIC_ELEVENLABS_API_KEY)
  - Other variables have defaults if values are not provided
- Deploy the application automatically
Male voices:
- en-US-AndrewMultilingualNeural (default)
- en-US-ChristopherNeural (casual, friendly)
- en-US-GuyNeural (professional)
- en-US-JasonNeural (clear, energetic)
- en-US-TonyNeural (enthusiastic)
Female voices:
- en-US-JennyNeural (assistant-like)
- en-US-AriaNeural (professional)
- en-US-EmmaNeural (friendly)
- en-US-SaraNeural (warm)
Try Microsoft voices: https://speech.microsoft.com/portal/voicegallery
Try ElevenLabs voices: https://elevenlabs.io/app/voice-lab
Contributions are welcome! Please feel free to submit a Pull Request.
This project is licensed under the MIT License - see the LICENSE file for details.
The application provides the following API endpoints:
- Endpoint: /api/generate-agora-token
- Method: GET
- Query Parameters:
  - uid (optional) - User ID (defaults to 0)
  - channel (optional) - Channel name (auto-generated if not provided)
- Response: Returns token, uid, and channel information
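As an illustration of calling this endpoint from the browser, the sketch below builds the query string from the two optional parameters and fetches the result. `buildTokenPath` and `fetchAgoraToken` are hypothetical names, and the `TokenResponse` field names (`token`, `uid`, `channel`) are an assumption based on the response description above; check the route handler for the exact shape.

```typescript
interface TokenRequest {
  uid?: number;
  channel?: string;
}

// Assumed response shape; the field names are not confirmed by this document.
interface TokenResponse {
  token: string;
  uid: string | number;
  channel: string;
}

// Builds the request path, including only the parameters that were provided.
function buildTokenPath(params: TokenRequest): string {
  const query: string[] = [];
  if (params.uid !== undefined) query.push(`uid=${encodeURIComponent(String(params.uid))}`);
  if (params.channel !== undefined) query.push(`channel=${encodeURIComponent(params.channel)}`);
  const path = "/api/generate-agora-token";
  return query.length > 0 ? `${path}?${query.join("&")}` : path;
}

// Relative URL: intended for browser code served by the same Next.js app.
async function fetchAgoraToken(params: TokenRequest = {}): Promise<TokenResponse> {
  const response = await fetch(buildTokenPath(params));
  if (!response.ok) {
    throw new Error(`Token request failed with status ${response.status}`);
  }
  return (await response.json()) as TokenResponse;
}
```

Calling `fetchAgoraToken()` with no arguments relies on the server-side defaults for both `uid` and `channel`.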
- Endpoint: /api/invite-agent
- Method: POST
- Body:
{
  requester_id: string;
  channel_name: string;
  input_modalities?: string[];
  output_modalities?: string[];
}
- Endpoint: /api/stop-conversation
- Method: POST
- Body:
{
  agent_id: string;
}
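The two POST endpoints can be exercised with a small client like the sketch below. The body types mirror the shapes listed above; everything else is an assumption made for illustration: the helper names (`jsonPost`, `postJson`, `inviteAgent`, `stopConversation`) are hypothetical, the `Content-Type: application/json` header is assumed, and the response bodies are typed as `unknown` because their shape is not documented here.

```typescript
interface InviteAgentBody {
  requester_id: string;
  channel_name: string;
  input_modalities?: string[];
  output_modalities?: string[];
}

interface StopConversationBody {
  agent_id: string;
}

// Builds fetch options for a JSON POST request.
function jsonPost(body: InviteAgentBody | StopConversationBody) {
  return {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify(body),
  };
}

// Sends the request and returns the parsed JSON response.
async function postJson(path: string, body: InviteAgentBody | StopConversationBody): Promise<unknown> {
  const response = await fetch(path, jsonPost(body));
  if (!response.ok) {
    throw new Error(`${path} failed with status ${response.status}`);
  }
  return response.json();
}

// Relative URLs: intended for browser code served by the same Next.js app.
function inviteAgent(body: InviteAgentBody): Promise<unknown> {
  return postJson("/api/invite-agent", body);
}

function stopConversation(agentId: string): Promise<unknown> {
  return postJson("/api/stop-conversation", { agent_id: agentId });
}
```

A typical flow would call `inviteAgent` after joining a channel and later pass the agent's ID to `stopConversation`; where that ID comes from is not specified in this section.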