A powerful Next.js application featuring real-time voice interaction with an AI coding assistant powered by Agora Conversational AI. Talk to the AI and watch it generate HTML/CSS/JS code that renders live in your browser!
Built for LA Tech Week by ConvoAI × Agora
- 🎤 Voice Interaction: Natural voice conversations with AI using Agora RTC
 - 💻 Live Code Generation: AI-generated code appears in real-time
 - 🖼️ Sandboxed Preview: Code renders safely in an isolated iframe
 - 🔄 Source/Preview Toggle: Switch between rendered preview and raw HTML source
 - 📝 Live Transcript: See the full conversation history with timestamps
 - 🔇 Mic Control: Mute/unmute microphone with visual feedback
 - 📦 Code Download: Export generated code as a .zip file
 - 🎨 Modern UI: Beautiful gradient design with responsive layout
 - 🚀 Smart Loading: Context-aware "Generating code..." indicator
 - 🌐 Auto Images: Uses Picsum Photos for all image generation
 
- Start Session: Click the gradient "Start Session" button to connect
 - Talk Naturally: Your microphone activates automatically - just start talking
 - Watch Magic Happen: The AI responds with voice and generates code live
 - See Results: Code renders instantly in the preview pane
 - Explore: Toggle to source view, download as .zip, or keep chatting
 
The AI wraps code in Chinese square brackets 【】 to separate it from spoken text:
Here's a beautiful button 【<!DOCTYPE html><html>...</html>】 that you can interact with.
- Text outside 【】 is spoken by the AI's voice
- Code inside 【】 is rendered visually in the preview pane
- The TTS automatically skips the code blocks
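
As a rough illustration of the convention above, the following TypeScript sketch separates spoken text from bracketed code. The names `ParsedMessage` and `parseAgentMessage` are hypothetical and used only for this example; the parser actually used in the app may differ.

```typescript
// Hypothetical helper (not taken from the project source): splits one agent
// message into the spoken part and the code part using the 【】 convention.
interface ParsedMessage {
  spoken: string;      // text outside 【】, whitespace-normalized
  code: string | null; // contents of the last complete 【】 block, if any
}

function parseAgentMessage(message: string): ParsedMessage {
  const blocks: string[] = [];
  // Non-greedy match so several blocks in one message are captured separately.
  const spoken = message
    .replace(/【([\s\S]*?)】/g, (_match: string, inner: string) => {
      blocks.push(inner.trim());
      return " ";
    })
    .replace(/\s+/g, " ")
    .trim();
  const last = blocks[blocks.length - 1];
  return { spoken, code: last ?? null };
}
```

This sketch only recognizes complete blocks. A block whose closing 】 has not arrived yet (for example, mid-stream) is left in the spoken text, so a streaming parser would need extra handling.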
 
npm install
Create a .env.local file in the root directory:
# Agora App Credentials
NEXT_PUBLIC_AGORA_APP_ID=your_agora_app_id
AGORA_APP_CERTIFICATE=your_app_certificate
# Agora RESTful API Credentials (for Conversational AI agent)
AGORA_API_KEY=your_api_key
AGORA_API_SECRET=your_api_secret
# Bot Configuration
NEXT_PUBLIC_AGORA_BOT_UID=1001
# LLM Configuration (OpenAI GPT-4o)
LLM_URL=https://api.openai.com/v1/chat/completions
LLM_API_KEY=your_openai_api_key
# TTS Configuration (Microsoft Azure)
TTS_API_KEY=your_azure_tts_api_key
TTS_REGION=eastus
Where to get these values:
- Agora Credentials: Sign up at Agora Console
  - Create a project → Get App ID and App Certificate
  - Enable Conversational AI → Get API Key & Secret
- OpenAI API Key: Get from OpenAI Platform
  - Uses GPT-4o model for best code generation
- Azure TTS: Create resource at Azure Portal
  - Uses en-US-AndrewMultilingualNeural voice

📚 See ENV_SETUP.md for detailed setup instructions
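
A missing variable is a common cause of start-up failures, so a small check can help. The sketch below is an assumption about how such a check might look, not code from the project; `findMissingEnvVars` is a hypothetical helper, and the variable names are copied from the .env.local listing above.

```typescript
// Variable names taken from the .env.local listing in this README.
const REQUIRED_ENV_VARS = [
  "NEXT_PUBLIC_AGORA_APP_ID",
  "AGORA_APP_CERTIFICATE",
  "AGORA_API_KEY",
  "AGORA_API_SECRET",
  "NEXT_PUBLIC_AGORA_BOT_UID",
  "LLM_URL",
  "LLM_API_KEY",
  "TTS_API_KEY",
  "TTS_REGION",
] as const;

// Hypothetical helper: returns the names of variables that are unset or blank.
function findMissingEnvVars(env: Record<string, string | undefined>): string[] {
  return REQUIRED_ENV_VARS.filter((name) => {
    const value = env[name];
    return value === undefined || value.trim() === "";
  });
}

// Possible usage on the server side (for example, inside an API route):
//   const missing = findMissingEnvVars(process.env);
//   if (missing.length > 0) console.error("Missing env vars:", missing.join(", "));
```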
npm run dev
Open http://localhost:3000 in your browser.
- Frontend: Next.js 14 (App Router), React 18, TypeScript
 - Styling: Tailwind CSS with custom gradients
 - Icons: Lucide React (professional icon library)
 - Real-time Communication: Agora RTC SDK 4.x
 - Real-time Messaging: Agora RTM SDK 2.x
 - AI Integration: Agora Conversational AI (GPT-4o + Azure TTS)
 - File Export: JSZip for client-side .zip generation
 
la_tech_week/
├── app/
│   ├── api/
│   │   ├── token/route.ts          # Dynamic RTC token generation
│   │   ├── start-agent/route.ts    # Start Conversational AI agent
│   │   └── leave-agent/route.ts    # Clean up agent on disconnect
│   ├── page.tsx                    # Main UI component
│   ├── layout.tsx                  # Root layout with metadata
│   └── globals.css                 # Global styles
├── lib/
│   └── agora-client.ts             # Agora RTC/RTM wrapper class
├── .env.local                      # Environment variables (create this)
└── package.json                    # Dependencies
Main UI component (app/page.tsx) with:
- Voice interaction controls (mic, mute, disconnect)
 - Live code preview with iframe sandbox
 - Source code viewer with syntax highlighting
 - Transcript panel with auto-scroll
 - Smart loading indicators
 
Agora client wrapper (lib/agora-client.ts) featuring:
- RTC audio streaming
 - RTM messaging for transcription
 - Microphone control (mute/unmute)
 - Clean disconnect logic
 
- /api/token: Generates RTC tokens server-side for security
- /api/start-agent: Initializes Conversational AI agent with custom prompt
- /api/leave-agent: Properly shuts down the AI agent
1. User clicks "Start Session"
   ↓
2. Generate random channel name (e.g., "agora-ai-abc123xyz")
   ↓
3. Request RTC token from /api/token
   ↓
4. Start Conversational AI agent via /api/start-agent
   ↓
5. Initialize Agora RTC client + join channel
   ↓
6. Subscribe to RTM transcription messages
   ↓
7. Auto-activate microphone
   ↓
8. User talks → AI responds with voice + code
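
The start-up sequence above could be sketched in TypeScript roughly as follows. `SessionDeps`, `makeChannelName`, and `startSession` are hypothetical names for this illustration; in the project, the equivalent calls would go to /api/token, /api/start-agent, and the wrapper in lib/agora-client.ts, whose real signatures are not shown in this README.

```typescript
// Hypothetical dependency interface; each member stands in for one step
// of the flow. The real project functions may have different signatures.
interface SessionDeps {
  fetchToken(channel: string): Promise<string>;  // step 3
  startAgent(channel: string): Promise<string>;  // step 4, resolves to an agent ID
  joinChannel(channel: string, token: string): Promise<void>; // step 5
  subscribeTranscripts(): Promise<void>;         // step 6
  enableMicrophone(): Promise<void>;             // step 7
}

// Step 2: random channel name shaped like "agora-ai-abc123xyz".
function makeChannelName(length: number = 9): string {
  const alphabet = "abcdefghijklmnopqrstuvwxyz0123456789";
  let suffix = "";
  for (let i = 0; i < length; i++) {
    suffix += alphabet.charAt(Math.floor(Math.random() * alphabet.length));
  }
  return `agora-ai-${suffix}`;
}

async function startSession(
  deps: SessionDeps
): Promise<{ channel: string; agentId: string }> {
  const channel = makeChannelName();
  const token = await deps.fetchToken(channel);
  const agentId = await deps.startAgent(channel);
  await deps.joinChannel(channel, token);
  await deps.subscribeTranscripts();
  await deps.enableMicrophone();
  // Keep the agent ID so the session can be shut down later.
  return { channel, agentId };
}
```

Passing the steps in as dependencies keeps the ordering easy to test with fakes, without a live Agora connection.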
1. User clicks "End" button
   ↓
2. Call /api/leave-agent to stop AI agent
   ↓
3. Disconnect Agora RTC/RTM client
   ↓
4. Reset all state (transcript, code, UI)
   ↓
5. Ready for new session
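
The shutdown sequence can be sketched the same way. This is an illustrative assumption, not the project's code: `TeardownDeps` and `endSession` are hypothetical, and the choice to continue tearing down when the agent call fails is a design decision made for this example.

```typescript
// Hypothetical dependency interface mirroring the shutdown steps above.
interface TeardownDeps {
  leaveAgent(agentId: string): Promise<void>; // step 2: call /api/leave-agent
  disconnect(): Promise<void>;                // step 3: leave RTC/RTM
  resetState(): void;                         // step 4: clear transcript, code, UI
}

async function endSession(
  agentId: string | null,
  deps: TeardownDeps
): Promise<void> {
  try {
    if (agentId !== null) {
      await deps.leaveAgent(agentId);
    }
  } catch (err) {
    // Assumption: a failed agent shutdown should not block local cleanup.
    console.warn("leave-agent failed; continuing teardown", err);
  }
  try {
    await deps.disconnect();
  } finally {
    // Always reset local state so a new session can start cleanly.
    deps.resetState();
  }
}
```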
- ConvoAI Logo + Agora Logo branding
 - Responsive layout (mobile-friendly)
 - Gradient "Start Session" button
 - Connection status indicator
 
- Mic Button: Circular with 🎤/🔇 Lucide icons, green/red states, animated pulse
 - End Button: Pill-shaped with exit icon, smooth hover effects
 
- Toggle View: Switch between rendered preview and source code
 - Download: Export code as .zip file with single click
 - Smart Loading: "Generating code..." only shows when relevant
 - Dark Empty State: Professional look before code loads
 
- Auto-scroll: New messages scroll smoothly into view
 - Internal Scrolling: Won't affect the main page
 - Timestamp: Each message shows when it was sent
 - Speaker Labels: Clear "You" vs "AI" distinction
 
- Sandboxed Iframe: Code runs isolated with sandbox="allow-scripts"
- Server-side Tokens: App Certificate never exposed to client
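- Environment Variables: Credentials are kept in environment variables rather than in source code; only variables prefixed with NEXT_PUBLIC_ are exposed to the browser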
 - No DOM Access: Generated code can't access parent page
 - Content Security: XSS prevention through iframe isolation
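
To make the iframe isolation concrete, here is a minimal sketch that builds the sandboxed markup as a string. It assumes the generated code is delivered through the `srcdoc` attribute, which this README does not confirm; `escapeAttribute` and `buildPreviewIframe` are hypothetical helpers.

```typescript
// Escape the characters that would end or corrupt a double-quoted attribute.
function escapeAttribute(value: string): string {
  return value.replace(/&/g, "&amp;").replace(/"/g, "&quot;");
}

// Hypothetical helper: wraps generated HTML in a sandboxed iframe.
function buildPreviewIframe(generatedHtml: string): string {
  // "allow-scripts" without "allow-same-origin" gives the frame an opaque
  // origin, so its scripts cannot read the parent page's DOM, cookies,
  // or storage.
  return `<iframe sandbox="allow-scripts" srcdoc="${escapeAttribute(generatedHtml)}"></iframe>`;
}
```

In a React component the equivalent would be an `<iframe>` element with `sandbox="allow-scripts"` and the `srcDoc` prop, in which case React handles the attribute escaping.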
 
# Install dependencies
npm install
# Run dev server with hot reload
npm run dev
# Build for production
npm run build
# Test production build
npm start
- Browser Console: Check for RTC/RTM connection logs
 - Server Logs: Watch terminal for API route responses
 - Network Tab: Monitor token generation and agent API calls
 
Ask the AI to:
- "Create a todo list app"
 - "Build a calculator with gradient buttons"
 - "Make a responsive card layout with images"
 - "Design a landing page hero section"
 - "Build a Tetris game"
 
The AI will use https://picsum.photos/ for all images automatically!
✅ Check that .env.local exists with all required variables
✅ Allow microphone permissions in browser settings
✅ Check that no other app is using the microphone
✅ Verify NEXT_PUBLIC_AGORA_BOT_UID matches your agent configuration
✅ Check browser audio isn't muted
✅ Verify App ID and Certificate are correct
✅ Check that tokens aren't expired (1 hour validity)
✅ Ensure API Key/Secret are valid for Conversational AI
✅ AI must wrap code in Chinese brackets: 【<!DOCTYPE html>...】
✅ Check browser console for parsing errors
✅ Verify TTS skip_patterns is set to [2] in start-agent route
✅ Check that /api/leave-agent route exists
✅ Verify agentId is being stored and passed correctly
✅ See server logs for API call status
- ENV_SETUP.md: Detailed environment variable setup
- AGORA_API_SETUP.md: Agora API configuration guide
- API_FEATURES.md: API features and capabilities
- TRANSCRIPTION_SETUP.md: Transcription implementation details
We use Chinese square brackets instead of regular parentheses/brackets because:
- ✅ TTS skip pattern [2] specifically handles these
- ✅ Won't conflict with JavaScript array syntax []
- ✅ Won't conflict with function calls ()
- ✅ More reliable than markdown code fences
 - ✅ Clear visual separation in transcript
 
The "Generating code..." spinner only shows when:
- User says code-related keywords (create, build, make, generate, etc.)
 - Not shown during greeting or casual conversation
 - Auto-hides after 5 seconds if no code appears
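
The rules above can be sketched as two small pure functions. This is an illustration under assumptions: the keyword list contains only the four words named above (the README's list ends with "etc."), and `mentionsCode` and `shouldShowSpinner` are hypothetical names.

```typescript
// Only the keywords named in this README; the real list may be longer.
const CODE_KEYWORDS = ["create", "build", "make", "generate"];
const SPINNER_TIMEOUT_MS = 5000; // auto-hide after 5 seconds

// True when the user's utterance contains a code-related keyword as a whole word.
function mentionsCode(utterance: string): boolean {
  const words = utterance.toLowerCase().split(/[^a-z]+/);
  return CODE_KEYWORDS.some((keyword) => words.indexOf(keyword) !== -1);
}

// requestedAtMs: when a code-related request was heard, or null if none.
function shouldShowSpinner(
  requestedAtMs: number | null,
  codeArrived: boolean,
  nowMs: number
): boolean {
  if (requestedAtMs === null || codeArrived) return false;
  return nowMs - requestedAtMs < SPINNER_TIMEOUT_MS;
}
```

Matching on whole words keeps greetings and small talk from triggering the spinner, though a simple keyword match will still fire on sentences like "make sure you can hear me".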
 
Instead of downloading raw .html, we:
- Create a .zip file client-side with JSZip
- Name it with timestamp: generated-code-[timestamp].zip
- Include the full HTML file inside
 - Trigger browser download automatically
 
The mic button:
- Uses Agora SDK's setEnabled() method
- Shows proper mic icons from Lucide React
 - Green when active, red when muted
 - Animated pulse dot when transmitting
 - Doesn't disconnect, just stops audio
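
A minimal sketch of the mute toggle is shown below. `MicTrackLike` and `MicController` are hypothetical names for this example; the interface covers only the `setEnabled(enabled)` call mentioned above, and the wrapper in lib/agora-client.ts may be structured differently.

```typescript
// Minimal shape of the one track method used here. Agora Web SDK 4.x local
// tracks expose setEnabled(enabled: boolean): Promise<void>.
interface MicTrackLike {
  setEnabled(enabled: boolean): Promise<void>;
}

// Hypothetical controller: tracks mute state for the UI (green vs. red button).
class MicController {
  private muted = false;
  private readonly track: MicTrackLike;

  constructor(track: MicTrackLike) {
    this.track = track;
  }

  get isMuted(): boolean {
    return this.muted;
  }

  // Flips the mute state without leaving the channel; resolves to the new state.
  async toggle(): Promise<boolean> {
    const nextMuted = !this.muted;
    await this.track.setEnabled(!nextMuted);
    // Update UI state only after the SDK call succeeds.
    this.muted = nextMuted;
    return this.muted;
  }
}
```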
 
Make sure to set all environment variables in your deployment platform:
- Vercel: Project Settings → Environment Variables
 - Netlify: Site Settings → Build & Deploy → Environment
 - AWS/GCP: Use secrets manager
 
npm run build
npm start
MIT License - feel free to use this for your own projects!
Built with ❤️ for LA Tech Week
Powered by:
Questions? Check the documentation files or open an issue!
Demo: Try it live and ask the AI to build anything you can imagine! 🚀