Voice agents vs text agents
Wassla ships two flavours of AI agent on the same workspace. Text agents handle every written channel — WhatsApp, Instagram DMs, Facebook Page, the embedded web widget, and Twilio SMS. Voice agents answer inbound phone calls on a Twilio voice number through a LiveKit bridge. Both share the same knowledge base, persona, and escalation rules, so you only train your agent once and reuse it everywhere.
What a text agent does
A text agent is a streaming, tool-using assistant that replies in writing. Every inbound message — a WhatsApp text, an Instagram DM, a Facebook comment-reply, a widget message on your website, or an SMS — lands on the same pipeline.
The text agent reads the message, searches your knowledge base, drafts a reply, and streams the answer back to the customer in the channel they arrived on. It can also create tickets, update existing tickets, and trigger handoff to a human teammate when it hits one of your escalation rules.
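The flow above can be pictured as a single handler that every channel feeds into. The sketch below is illustrative only: InboundMessage, Reply, handle_message, and the two callbacks are hypothetical names, not part of Wassla's actual API, and the real pipeline streams its answer rather than returning it in one piece.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class InboundMessage:
    channel: str  # e.g. "whatsapp", "instagram", "facebook", "widget", "sms"
    sender: str
    text: str


@dataclass
class Reply:
    channel: str
    recipient: str
    text: str
    handoff: bool = False


def handle_message(
    msg: InboundMessage,
    search_kb: Callable[[str], List[str]],
    should_escalate: Callable[[str], bool],
) -> Reply:
    """Hypothetical single pipeline shared by every text channel."""
    # Escalation rules are checked first so a human can take over.
    if should_escalate(msg.text):
        return Reply(msg.channel, msg.sender, "Connecting you with a teammate.", handoff=True)
    # Otherwise search the knowledge base and answer from the best passage.
    passages = search_kb(msg.text)
    answer = passages[0] if passages else "Let me check that with the team."
    # The reply always goes back on the channel the customer arrived on.
    return Reply(msg.channel, msg.sender, answer)
```

The point of the sketch is that the channel is just a field on the message: the same handler serves WhatsApp and SMS, and the reply inherits the inbound channel.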
Text agents are configured per channel under Channels. One agent can serve multiple channels at once. The agent shows the same name (for example "Mona #support") and uses the same tone whether the customer wrote in on WhatsApp or on the website widget.
Channels a text agent handles
- WhatsApp Business through the Meta Cloud API
- Instagram Direct Messages through the Meta Graph API
- Facebook Page messages and comment replies
- Embedded web widget dropped into any site with a single script tag
- Twilio SMS for inbound and outbound text messages
What gets billed for text
Text replies bill one ai_replies credit per assistant message. Knowledge searches, sentiment classification, and conversation summaries are bundled into that price, so you are not charged separately for each knowledge-base (RAG) lookup.
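As a rough illustration of that rule, the sketch below counts credits for one conversation. The event list and its type names are hypothetical, invented for this example; only the rule itself (one credit per assistant message, everything else bundled) comes from the description above.

```python
def ai_reply_credits(events):
    """Count ai_replies credits: one per assistant message, nothing else.

    `events` is a hypothetical list of dicts with a "type" key.
    """
    return sum(1 for event in events if event["type"] == "assistant_message")


conversation = [
    {"type": "customer_message"},
    {"type": "knowledge_search"},          # bundled, not billed
    {"type": "assistant_message"},         # 1 credit
    {"type": "sentiment_classification"},  # bundled, not billed
    {"type": "customer_message"},
    {"type": "knowledge_search"},          # bundled, not billed
    {"type": "assistant_message"},         # 1 credit
    {"type": "conversation_summary"},      # bundled, not billed
]

credits = ai_reply_credits(conversation)  # 2
```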
What a voice agent does
A voice agent answers a phone call. When a customer dials your Twilio number, Twilio forwards the call to Wassla's voice handler, which spins up a LiveKit room with a real-time AI agent. The agent greets the caller, listens, transcribes, generates a reply, speaks it back, and handles natural turn-taking — including barge-in if the customer starts talking over the agent.
The voice agent uses the same persona, the same knowledge base, and the same escalation rules as your text agent. The only differences are the surface (audio instead of text) and a handful of voice-only settings — greeting line, voice selection, max call length, and the human fallback target.
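One way to picture this split is a shared core plus a small voice-only layer. The class and field names below are assumptions made for illustration, not Wassla's actual data model.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class AgentCore:
    """Settings shared by text and voice (hypothetical shape)."""
    persona: str
    knowledge_base_id: str
    escalation_rules: tuple[str, ...]


@dataclass(frozen=True)
class VoiceSettings:
    """The voice-only settings named above."""
    greeting_line: str
    voice: str
    max_call_seconds: int
    human_fallback: str


@dataclass(frozen=True)
class TextAgent:
    core: AgentCore
    channels: tuple[str, ...]


@dataclass(frozen=True)
class VoiceAgent:
    core: AgentCore
    settings: VoiceSettings
```

Because both agent types hold a reference to the same AgentCore, a change to the persona or knowledge base would apply to text and voice together, which mirrors the train-once behaviour described above.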
WhatsApp Calling is also supported on workspaces that have it enabled in Meta Business Manager, and it rides the same LiveKit bridge as Twilio voice.
What gets billed for voice
Voice calls bill voice_minutes based on actual call duration, charged when the call terminates. Speech-to-text (transcription) and text-to-speech (synthesis) are bundled into the per-minute price. Call recording storage and egress are also bundled — you are not double-charged.
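For a sense of the arithmetic, the sketch below converts a call's start and end times into minutes. The function name is hypothetical, and the rounding policy (per second, per started minute, or something else) is not stated here, so the sketch returns the exact fractional duration rather than guessing.

```python
from datetime import datetime


def call_minutes(started_at: datetime, ended_at: datetime) -> float:
    """Return the actual call duration in minutes (no rounding applied)."""
    seconds = (ended_at - started_at).total_seconds()
    if seconds < 0:
        raise ValueError("call cannot end before it starts")
    return seconds / 60


# A call that lasts 1 minute 30 seconds is 1.5 minutes of actual duration.
minutes = call_minutes(
    datetime(2024, 1, 1, 12, 0, 0),
    datetime(2024, 1, 1, 12, 1, 30),
)
```

Transcription, synthesis, and recording storage add nothing to this figure, since they are bundled into the per-minute price.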
Setting up a text agent
A text agent is the default agent type. Every new workspace starts with one already provisioned.
- Open Agents in the sidebar and pick (or create) the agent you want to wire up.
- Open Personality and set a name, a role (for example Mona #support), a tone, and the system prompt that frames how the agent should behave.
- Open Knowledge and upload documents or paste website URLs. See Train your AI agent for the full walkthrough.
- Open Channels and connect the channels you want this agent to answer on — WhatsApp, Instagram, Facebook Page, web widget, or SMS. Each channel has its own connect flow; the WhatsApp setup is documented in Connect WhatsApp to Wassla.
- Open Handoff and configure when the agent should escalate to a human teammate.
- Send a real message to a connected channel and watch the conversation appear in the inbox.
That is the entire setup. The same agent now handles every text channel you connected.
Setting up a voice agent
A voice agent needs a phone number and a small amount of Twilio configuration on top of the text agent setup.
- Purchase a Twilio phone number with voice capability. Local Saudi, UAE, and US numbers are all supported.
- In the Twilio console, set the inbound voice webhook on that number to the URL Wassla shows you in Channels > Voice. Wassla speaks standard Twilio voice (TwiML), so no Twilio Functions or middleware are required.
- In Wassla, open Channels, click Add channel, pick Voice, and enter the Twilio number you just configured.
- Open the agent and set the voice-only knobs: greeting line, voice selection, max call length, and a human fallback teammate.
- Call the number from your phone. The agent should answer within a second and greet you in the configured voice.
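If you would rather script the webhook step than click through the Twilio console, the same setting can be changed through Twilio's REST API by updating the IncomingPhoneNumber resource's VoiceUrl. The sketch below only builds the request, using the Python standard library. The function name and all credentials and URLs are placeholders; the real webhook URL is the one Wassla shows in Channels > Voice.

```python
import base64
import urllib.parse
import urllib.request


def build_voice_webhook_request(account_sid, auth_token, number_sid, webhook_url):
    """Build (but do not send) a request that sets a number's inbound voice webhook."""
    endpoint = (
        "https://api.twilio.com/2010-04-01/Accounts/"
        f"{account_sid}/IncomingPhoneNumbers/{number_sid}.json"
    )
    # Twilio expects form-encoded parameters; VoiceUrl is the inbound voice webhook.
    body = urllib.parse.urlencode({"VoiceUrl": webhook_url}).encode()
    # HTTP Basic auth with the Account SID as username and the auth token as password.
    token = base64.b64encode(f"{account_sid}:{auth_token}".encode()).decode()
    return urllib.request.Request(
        endpoint,
        data=body,
        method="POST",
        headers={
            "Authorization": f"Basic {token}",
            "Content-Type": "application/x-www-form-urlencoded",
        },
    )


# Placeholder values; substitute your own SIDs, token, and the Wassla webhook URL.
request = build_voice_webhook_request(
    "ACxxxxxxxx", "your_auth_token", "PNxxxxxxxx", "https://example.invalid/voice"
)
# To apply the change: urllib.request.urlopen(request)
```

Either route ends in the same state: Twilio calls the Wassla URL whenever the number receives an inbound call.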
The voice agent reuses your text agent's persona, knowledge, and escalation rules. You do not retrain anything — the same Mona who replies on WhatsApp also picks up the phone.
The complete voice setup, including the WhatsApp Calling variant, is documented in Connect Twilio SMS and Voice.
When to use which
Most workspaces run both. Text is the workhorse for asynchronous, high-volume support on the channels customers already use. Voice covers customers who prefer to call, and it provides after-hours coverage: a voice agent can handle the call instead of routing it to a missed-call queue.
A few decision rules:
- High message volume, low call volume — start with text agents on WhatsApp and the web widget. Add a voice agent later if you start losing customers to missed calls.
- Existing phone-first business — set up the voice agent first so your number never rings out. Layer text agents on top once your knowledge base is solid.
- Both — connect both. Customers self-select the channel they prefer, and your team handles only the escalations.
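The decision rules above can be written as a small lookup. This is only a sketch: the function name and the two boolean inputs are hypothetical, and defaulting to text when neither volume is high is an assumption based on text being the default agent type.

```python
def recommended_start(high_message_volume: bool, high_call_volume: bool) -> str:
    """Suggest which agent type to set up first."""
    if high_message_volume and high_call_volume:
        return "both"   # customers self-select their preferred channel
    if high_call_volume:
        return "voice"  # phone-first business: make sure the number never rings out
    return "text"       # message-heavy, or no strong signal yet (assumed default)
```

For example, recommended_start(True, False) suggests starting with text agents, with a voice agent added later if missed calls become a problem.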