CAPABILITY · Client under NDA

Voice AI Phone Receptionist for Businesses

AI receptionist that answers business phone calls 24/7, routes by intent, books appointments, and captures leads — replacing the always-on front-desk role at a fraction of human cost.

AI Phone Receptionist · Voice AI · Business Automation · Customer Service Tech · IVR Replacement · SMB Software · Front-Desk Automation · Conversational AI


See it work

From a phone call to a booked appointment.

A caller dials the business. Twilio streams the audio to a real-time speech-to-text service, which feeds an LLM agent grounded in the business's RAG knowledge base. The agent answers in real-time TTS, books appointments straight into Google Calendar, and writes leads to the CRM — all inside one continuous call.

Demo mockup (calls.example/live, viewing as Caller): an incoming call to Northwind Plumbing's main line (24/7 AI receptionist) arrives via Twilio from +1 (415) 555-0148 (San Francisco, CA, mobile). The AI receptionist picks up in 0.4s and opens the STT stream. Call loop: ring, listen, respond. Tabs: Live call, Knowledge, Bookings, Leads.

Demo only

This is an animated mockup of the voice-AI-receptionist capability — not a live product. Business names, phone numbers, and callers are illustrative.

Twilio telephony

The business phone number routes through Twilio. Inbound audio is streamed to the agent in real time, and outbound TTS audio streams back the same way — no IVR menus.
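A minimal sketch of the telephony leg, assuming Twilio's bidirectional Media Streams (TwiML `<Connect><Stream>`) and its JSON `media` messages carrying base64 audio. The WebSocket URL and the function names are hypothetical; the project's actual wiring is not shown in this write-up.

```python
import base64
import json
from typing import Optional
from xml.sax.saxutils import quoteattr


def connect_stream_twiml(ws_url: str) -> str:
    """Return TwiML telling Twilio to open a two-way audio stream to ws_url."""
    return (
        '<?xml version="1.0" encoding="UTF-8"?>'
        f"<Response><Connect><Stream url={quoteattr(ws_url)} /></Connect></Response>"
    )


def decode_media_frame(message: str) -> Optional[bytes]:
    """Extract raw caller audio from one WebSocket message, or None."""
    frame = json.loads(message)
    if frame.get("event") != "media":
        return None  # other events (connected, start, stop) carry no audio
    return base64.b64decode(frame["media"]["payload"])


def encode_media_frame(stream_sid: str, audio: bytes) -> str:
    """Wrap synthesized audio so it can be sent back on the same stream."""
    return json.dumps({
        "event": "media",
        "streamSid": stream_sid,
        "media": {"payload": base64.b64encode(audio).decode("ascii")},
    })
```

The same socket carries both directions, which is what lets the agent speak without an IVR menu in between.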

Real-time speech-to-text

A streaming STT service transcribes the caller turn by turn with partial hypotheses, so the agent can start thinking before the sentence is finished.
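One way to act on partial hypotheses is sketched below. The event shape (`text`, `is_final`) and the `TurnBuffer` class are assumptions for illustration, not a specific vendor's API: the agent kicks off retrieval once a partial is long enough, then responds when the final transcript arrives.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class TurnBuffer:
    """Tracks one caller turn as streaming transcript events arrive."""
    min_words_to_prefetch: int = 3  # hypothetical threshold
    partial: str = ""
    final_text: str = ""
    prefetched: bool = False

    def feed(self, text: str, is_final: bool) -> Optional[str]:
        """Return 'prefetch', 'respond', or None for this event."""
        if is_final:
            self.partial = ""
            self.final_text = text
            return "respond"
        self.partial = text
        enough = len(text.split()) >= self.min_words_to_prefetch
        if enough and not self.prefetched:
            self.prefetched = True  # only start retrieval once per turn
            return "prefetch"
        return None


events = [
    ("do you", False),
    ("do you have any", False),
    ("do you have any openings", False),
    ("do you have any openings tomorrow", True),
]
turn = TurnBuffer()
actions = [turn.feed(text, is_final) for text, is_final in events]
```

Here `actions` ends up as `[None, "prefetch", None, "respond"]`: retrieval starts on the second partial, well before the caller finishes the sentence.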

RAG knowledge base

Business hours, services, prices, and policies are indexed into a vector store. Each agent turn retrieves the relevant chunks so answers stay grounded — not improvised.
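The per-turn retrieval step can be sketched with a toy index. This example swaps in bag-of-words vectors and cosine similarity in place of a real embedding model and vector database, and the knowledge chunks are made-up sample facts, so treat it as an illustration of the idea only.

```python
import math
import re
from collections import Counter

# Made-up sample facts standing in for the indexed business knowledge.
KNOWLEDGE = [
    "Hours: open Monday to Saturday, 8am to 6pm.",
    "Services: drain cleaning, leak repair, water heater installation.",
    "Prices: drain cleaning is quoted after a free inspection.",
    "Policy: cancellations are free up to 24 hours before the visit.",
]


def embed(text: str) -> Counter:
    """Toy embedding: word counts (a real system would use a model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def retrieve(query: str, k: int = 2) -> list:
    """Return the k chunks most similar to the caller's question."""
    q = embed(query)
    ranked = sorted(KNOWLEDGE, key=lambda c: cosine(q, embed(c)), reverse=True)
    return ranked[:k]


def build_prompt(query: str) -> str:
    """Put retrieved facts in front of the question so the answer is grounded."""
    context = "\n".join(f"- {chunk}" for chunk in retrieve(query))
    return f"Answer using only these facts:\n{context}\n\nCaller: {query}"
```

For the question "What hours are you open on Saturday?", the hours chunk ranks first, and only the retrieved chunks reach the model.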

LLM function calling

The agent decides when to answer in voice and when to call a tool — book_appointment, upsert_lead, transfer_to_human — keeping the conversation and the action in one flow.
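A rough sketch of the dispatch side of function calling follows. The tool names come from the description above; the argument names, the model-output shape (`type`, `name`, `arguments`), and the stub bodies are assumptions made for the example.

```python
from typing import Any, Callable, Dict


def book_appointment(name: str, start_iso: str) -> dict:
    return {"status": "booked", "name": name, "start": start_iso}  # stub


def upsert_lead(phone: str, notes: str) -> dict:
    return {"status": "saved", "phone": phone, "notes": notes}  # stub


def transfer_to_human(reason: str) -> dict:
    return {"status": "transferring", "reason": reason}  # stub


TOOLS: Dict[str, Callable[..., dict]] = {
    "book_appointment": book_appointment,
    "upsert_lead": upsert_lead,
    "transfer_to_human": transfer_to_human,
}


def handle_agent_output(output: Dict[str, Any]) -> dict:
    """Route one model output: either speak it or run the named tool."""
    if output["type"] == "say":
        return {"speak": output["text"]}
    handler = TOOLS.get(output["name"])
    if handler is None:
        # Unknown tool: hand off rather than guess.
        return {"tool_result": transfer_to_human(reason="unknown tool")}
    return {"tool_result": handler(**output["arguments"])}
```

Keeping speech and tool calls behind one handler is what lets the conversation and the action stay in a single flow.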

Calendar + CRM integration

Successful bookings create a real Google Calendar event. New callers land in the CRM with the conversation context attached — the receptionist isn't just answering, it's converting.
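The booking and lead writes might look roughly like this. The event body follows the Google Calendar Events resource (`summary`, `description`, `start`/`end` with `dateTime` and `timeZone`); the lead record's field names are hypothetical, since the CRM schema is not described here. The sketch only builds payloads and makes no API calls.

```python
from datetime import datetime, timedelta


def calendar_event_body(
    caller_name: str, service: str, start: datetime, duration_min: int, tz: str
) -> dict:
    """Build the request body for creating a calendar event."""
    end = start + timedelta(minutes=duration_min)
    return {
        "summary": f"{service} - {caller_name}",
        "description": "Booked by the AI receptionist during a phone call.",
        "start": {"dateTime": start.isoformat(), "timeZone": tz},
        "end": {"dateTime": end.isoformat(), "timeZone": tz},
    }


def lead_record(phone: str, name: str, call_summary: str) -> dict:
    """Hypothetical CRM payload carrying the conversation context."""
    return {
        "phone": phone,
        "name": name,
        "source": "ai_receptionist",
        "notes": call_summary,
    }


event = calendar_event_body(
    "Ada", "Leak repair", datetime(2025, 1, 15, 9, 0), 60, "America/Los_Angeles"
)
```

Attaching the call summary to the lead is what gives a human follow-up the context of the original conversation.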

Real-time text-to-speech

The agent's reply streams to a TTS vendor that returns audio chunks the caller hears almost immediately. Round-trip latency stays low enough for the exchange to feel like normal conversation.
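One common way to keep the caller from waiting on the full reply is to hand text to TTS a sentence at a time as the LLM streams tokens. The sketch below assumes that approach; the splitting rule is deliberately naive (it would split on the dot in "8.5") and the function name is hypothetical.

```python
from typing import Iterable, Iterator

SENTENCE_END = (".", "!", "?")


def sentences_from_tokens(tokens: Iterable[str]) -> Iterator[str]:
    """Yield a sentence as soon as its closing punctuation arrives."""
    buffer = ""
    for token in tokens:
        buffer += token
        if buffer.rstrip().endswith(SENTENCE_END):
            yield buffer.strip()
            buffer = ""
    if buffer.strip():
        yield buffer.strip()  # flush any trailing fragment


llm_tokens = ["Sure", ",", " we", " open", " at", " 8", ".",
              " Want", " me", " to", " book", " it", "?"]
chunks = list(sentences_from_tokens(llm_tokens))
```

Here `chunks` is `["Sure, we open at 8.", "Want me to book it?"]`, so synthesis of the first sentence can begin while the second is still being generated.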

What we built

AI receptionist that answers business phone calls 24/7, routes by intent, books appointments, and captures leads — replacing the always-on front-desk role at a fraction of human cost.

How we built it

Twilio receives the call; speech-to-text streams the caller's words to an LLM agent grounded in the business's RAG knowledge base; the agent responds via real-time TTS. Workflow automation handles bookings into Google Calendar, lead writes into the CRM, and email follow-up.

A caller dials the business number. Twilio routes the audio to a real-time speech-to-text service that streams transcripts to an LLM agent. The agent has read-only access to a RAG knowledge base built from the business's hours, services, prices, and policies, so answers stay grounded in the actual business rather than improvised. When the caller asks for a booking, an agent function writes a Google Calendar event. New leads land in the business's CRM. End-to-end latency stays within what a caller tolerates in live conversation.

Architecture

How a request flows through it

Each request enters at the top of the diagram, passes through each stage in turn, and fans out at the bottom to the voice, calendar, CRM, and dashboard integrations, mirroring how the production system behaves.

voice-ai-receptionist/architecture.txt · tracing request flow

     Caller dials Twilio number
                │
                ▼
     ┌──────────────────────┐
     │ Deepgram speech-to-  │
     │ text                 │
     └──────────┬───────────┘
                ▼
     ┌──────────────────────────┐
     │ LangChain agent + Vector │
     │ database (RAG)           │
     └──────────┬───────────────┘
                ▼
     ┌──────────────────────┐
     │ n8n workflow         │
     │ orchestration        │
     └──────────┬───────────┘
                │
      ┌─────────┼──────────────┬──────────────┐
      ▼         ▼              ▼              ▼
 ElevenLabs  Google       GoHighLevel       React
   (voice)  Calendar         (CRM)        dashboard

▼ flow direction   ┌─┐ component

Stack

What it's built with

Capabilities

Voice AI Conversation Engine · RAG Knowledge Base (Vector Store) · Twilio Telephony Integration · Real-time Speech-to-Text · Real-time Text-to-Speech · LLM Function Calling · Calendar / CRM Integration · Workflow Automation

Engineering notes

The interesting parts

RAG-grounded answers

Business hours, services, prices, and policies indexed into a vector store and retrieved per turn — answers stay accurate to the actual business instead of being improvised by the LLM.

Real-time voice loop

Speech-to-text streams to the LLM; the LLM's reply streams to text-to-speech. Round-trip latency stays low enough to hold a natural conversation, so the call doesn't feel like talking to a slow bot.

Bookings + CRM in one flow

Successful booking intents write a real calendar event; new leads land in the business's CRM with the conversation context attached. The receptionist isn't just answering — it's converting.

Best-of-breed vendor stack

Twilio for telephony, dedicated speech-to-text and TTS vendors, workflow orchestration for the after-call steps. Each leg uses the best specialist rather than a single all-in-one platform.

Decisions

The calls that did most of the work

A handful of engineering choices shape how a system feels. Here are the ones we'd still defend — alongside what each one cost.

RAG over fine-tuning for the business knowledge

A business's hours, services, and prices change weekly; fine-tuning would mean re-training on every update. RAG lets the model stay general and the knowledge stay current.

Tradeoff: Slightly higher per-call cost and an extra retrieval hop compared to a fine-tuned baseline.
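To make the update path concrete, here is a small sketch of why RAG keeps knowledge current without retraining: changing a fact is a keyed upsert into the index. The `KnowledgeIndex` class, the chunk ids, and the sample hours are hypothetical, and keyword matching stands in for the re-embedding a real vector store would do on write.

```python
class KnowledgeIndex:
    """Toy stand-in for a vector store, keyed by chunk id."""

    def __init__(self) -> None:
        self._chunks = {}

    def upsert(self, chunk_id: str, text: str) -> None:
        # Replacing the chunk is the whole update; the model is untouched.
        self._chunks[chunk_id] = text

    def search(self, keyword: str) -> list:
        needle = keyword.lower()
        return [t for t in self._chunks.values() if needle in t.lower()]


index = KnowledgeIndex()
index.upsert("hours", "Hours: open Monday to Friday.")
index.upsert("hours", "Hours: open Monday to Saturday.")  # this week's change
current = index.search("hours")
```

After the second write, `current` holds only the Saturday version, so the next call already answers with the new hours.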

n8n for orchestration over custom code

The 'after-the-call' workflow — calendar event, CRM record, fallback to human — changes frequently; a visual orchestrator absorbs those changes without code edits.

Tradeoff: Adds a runtime dependency, and debugging crosses two systems when something fails.
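The hand-off from the agent to the orchestrator can be as small as one HTTP POST to an n8n Webhook trigger. The sketch below assumes that pattern; the URL and the payload fields (`call_id`, `intent`, `details`) are hypothetical, and the request is built but not sent.

```python
import json
import urllib.request

# Hypothetical Webhook-node URL; a real one comes from the n8n workflow.
N8N_WEBHOOK_URL = "https://n8n.example/webhook/after-call"


def build_after_call_request(
    call_id: str, intent: str, details: dict, url: str = N8N_WEBHOOK_URL
) -> urllib.request.Request:
    """Package the call outcome for the after-call workflow."""
    payload = {"call_id": call_id, "intent": intent, "details": details}
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_after_call_request("CA123", "booking", {"service": "leak repair"})
# Sending is one call, left commented out so the sketch has no side effects:
# urllib.request.urlopen(request, timeout=5)
```

Because the agent only emits this payload, the calendar, CRM, and fallback steps can be rearranged in the visual workflow without touching the agent code.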

Twilio + Deepgram + ElevenLabs (best-of-breed)

Each leg of the voice loop has a strong specialist; assembling the best-of-breed stack ships faster than a single all-in-one vendor.

Tradeoff: Three vendor contracts and three failure modes — vendor lock-in is replaced by vendor sprawl.

Want something like this?

Tell us what you're building.

Free 30-minute call. Real humans, real timelines, no follow-up emails forever.
