Platform · Recordings & analytics

Debug voice calls like you debug code.

Every Vocily AI call becomes a structured execution record — full transcript, recording, sentiment, tool results, custom analysis, cost and latency breakdowns. Plus a Sentry-style event timeline so you can see exactly what happened, when, and how long it took.

Problem · Solution

The problem today

Most call platforms hand you a folder of MP3 files and call it analytics. QA listens to a random 1% sample and hopes it is representative. When a call goes sideways, the customer asks 'why didn't the bot use the manual?' and your team spends an hour scrubbing audio. Provider errors get re-skinned as 'something went wrong, please try again.' Cost lives in a billing dashboard nobody opens. Vocily AI takes the opposite approach: every call is a debuggable execution record, every event is its own row, and every provider error is shown verbatim, so you debug like a developer instead of guessing.

How Vocily AI handles it

Chronological event timeline: Every STT, LLM, TTS, KB, tool, and error event is its own row with duration. Click to expand. A Sentry trace, but for voice calls.
Honest provider errors: When Sarvam times out or Cartesia returns a 5xx, you see the exact code, message, and operation context. No marketing-speak hiding what failed.
Cost breakdown per call: '₹4.20 = ₹1.10 Sarvam STT + ₹1.80 OpenAI + ₹0.90 Cartesia + ₹0.40 VoBiz' is the kind of real per-provider, per-call figure you get. Lets you optimise provider mix on actual spend.
Latency breakdown: STT, LLM, TTS first-byte, and end-to-end latency averaged per call. See where the milliseconds go.
KB hits inline in the transcript: Click any agent turn to see which document chunks grounded it, with retrieval score. The #1 customer debug question is now self-service.
Tool call inspector: Request and response JSON for every API tool the LLM invoked, with args, response, status, and duration. Catch silent integration failures fast.

What's in it

What you can do with an execution record.

Every call produces one record with these surfaces. The pre-call analysis page (where you'd shape the agent) and the post-call review page (where you'd debug or audit) both read from this one structured object.

Audio & transcript

The raw signal — captured, transcribed, and aligned.

Recording: Audio file per execution. Stops at human handoff so post-transfer conversations stay private.
Transcript: Full turn-by-turn with speaker labels (customer / agent).
Realtime alignment: Word-level alignment events persisted from the live conversation.
Language detected: Per-turn language tag — useful for multilingual calls.

Event timeline

A row per platform event, with duration and click-to-expand details.

Event types: STT chunks, LLM turns, TTS synth, KB lookups, tool calls, errors.
Duration: Each event shows its time spent.
Detail expand: Click any row for full payloads — STT confidence, LLM prompt, tool args, etc.
Use it like: A Sentry / OpenTelemetry trace, but for voice calls.
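
To make the timeline idea concrete, here is a minimal sketch of how event rows could be modelled and summarised. The `TimelineEvent` class, its field names, and the sample values are assumptions made for illustration; they are not the Vocily AI schema.

```python
from collections import defaultdict
from dataclasses import dataclass, field


# Hypothetical shape for one timeline row; field names are assumptions,
# not the actual Vocily AI schema.
@dataclass
class TimelineEvent:
    kind: str          # "stt", "llm", "tts", "kb", "tool", or "error"
    started_ms: int    # offset from call start, in milliseconds
    duration_ms: int   # time spent in this event
    detail: dict = field(default_factory=dict)  # payload shown on expand


def time_by_kind(events):
    """Sum duration per event kind to show where the call's time went."""
    totals = defaultdict(int)
    for event in events:
        totals[event.kind] += event.duration_ms
    return dict(totals)


def slowest(events, n=3):
    """Return the n longest events, longest first."""
    return sorted(events, key=lambda e: e.duration_ms, reverse=True)[:n]


# Made-up sample rows, for illustration only.
events = [
    TimelineEvent("stt", 0, 180, {"confidence": 0.93}),
    TimelineEvent("kb", 180, 40),
    TimelineEvent("llm", 220, 620),
    TimelineEvent("tool", 840, 300, {"name": "create_booking", "status": 200}),
    TimelineEvent("tts", 1140, 150),
    TimelineEvent("llm", 1290, 410),
]

print(time_by_kind(events))
print([(e.kind, e.duration_ms) for e in slowest(events, 2)])
```

Grouping by kind answers "which stage is slow overall", while the slowest-rows view points at the single turn worth expanding first.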

Cost & latency

Performance and economics side by side.

Cost per call: Per-component breakdown — STT, TTS, LLM, telephony — summed and shown per call.
Latency per stage: STT, LLM, TTS first-byte, end-to-end — averaged per call.
Provider attribution: See exactly which vendor billed each line item.
Optimise on real data: Switch provider routing for an agent based on actual spend and latency, not vendor pitches.
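
As a rough sketch of the per-call arithmetic, the snippet below sums the example line items quoted earlier on this page (₹1.10 + ₹1.80 + ₹0.90 + ₹0.40) and computes each provider's share. The dictionary keys and the component assigned to each provider other than Sarvam STT are assumptions for illustration.

```python
from decimal import Decimal

# Hypothetical line items using the example figures from this page.
# Key names and the component labels for OpenAI, Cartesia, and VoBiz
# are assumptions, not a documented format.
line_items = [
    {"component": "stt", "provider": "Sarvam", "amount_inr": Decimal("1.10")},
    {"component": "llm", "provider": "OpenAI", "amount_inr": Decimal("1.80")},
    {"component": "tts", "provider": "Cartesia", "amount_inr": Decimal("0.90")},
    {"component": "telephony", "provider": "VoBiz", "amount_inr": Decimal("0.40")},
]


def cost_breakdown(items):
    """Return (total, {provider: percentage share rounded to 0.1})."""
    total = sum((item["amount_inr"] for item in items), Decimal("0"))
    shares = {
        item["provider"]: (item["amount_inr"] / total * 100).quantize(Decimal("0.1"))
        for item in items
    }
    return total, shares


total, shares = cost_breakdown(line_items)
print(f"Total: ₹{total}")
for provider, pct in shares.items():
    print(f"  {provider}: {pct}%")
```

`Decimal` is used instead of `float` so that currency amounts add up exactly; a share view like this is what makes it obvious which provider to revisit first.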

Knowledge & tool traces

What the agent looked up and what it acted on.

KB hits in transcript: Click any agent turn to see the document chunks that grounded it, with retrieval score.
Tool call inspector: Args, response, status, duration for every API tool invocation.
Silent failures: Tools that errored show their error code and message — no swallowing.
Scheduling, transfer: Built-in tool invocations (booking creation, transfer dial) recorded the same way.

Post-call analysis

Auto-applied to every execution as soon as the call ends.

Sentiment: Customer and agent sentiment scored per turn plus an overall trend.
Summary: AI-generated one-paragraph summary visible in the call list and detail page.
Custom analysis: Define your own boolean / text / number outputs — 'did the caller book?', 'what was the lead score?'. Configurable per agent.
Outcome: Structured tag (reached / transferred / booked / follow-up / unreached / opt-out).
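
One way to picture custom analysis is as a list of typed questions plus a check that each answer has the declared type. The sketch below assumes a hypothetical definition format (`name`, `type`, `question`); the actual configuration format is not described on this page.

```python
# Hypothetical custom-analysis definitions. The three output types come from
# the text above; the dictionary layout is an assumption for illustration.
OUTPUT_TYPES = {"boolean": bool, "text": str, "number": (int, float)}

analysis_fields = [
    {"name": "did_book", "type": "boolean", "question": "Did the caller book?"},
    {"name": "lead_score", "type": "number", "question": "What was the lead score?"},
    {"name": "objection", "type": "text", "question": "What objection did they raise?"},
]


def validate_result(fields, result):
    """Return a list of problems; an empty list means the result fits the definitions."""
    problems = []
    for spec in fields:
        name, kind = spec["name"], spec["type"]
        if name not in result:
            problems.append(f"missing: {name}")
            continue
        value = result[name]
        # bool is a subclass of int in Python, so reject it explicitly for numbers.
        is_bool_as_number = kind == "number" and isinstance(value, bool)
        if is_bool_as_number or not isinstance(value, OUTPUT_TYPES[kind]):
            problems.append(f"wrong type for {name}: expected {kind}")
    return problems


good = {"did_book": True, "lead_score": 7, "objection": "Premium amount"}
bad = {"did_book": "yes", "lead_score": True}

print(validate_result(analysis_fields, good))
print(validate_result(analysis_fields, bad))
```

Checking types before the values reach a dashboard or warehouse keeps a stray free-text answer from landing in a numeric column.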

Honest errors

When something breaks, you'll know exactly what broke.

Raw vendor codes: Sarvam 429, Cartesia 504, OpenAI invalid_api_key — shown verbatim with the operation context.
No re-skinning: We don't replace provider errors with 'something went wrong, please try again.'
Operation context: Which agent, which call, which turn, which provider, which request body.
Debug like a dev: Read the error, find the cause, fix it. No guessing.

QA & feedback loop

Build your fine-tuning dataset from real conversations.

Tag calls: Mark calls as good or bad with optional notes.
Filter by tag: Pull all the calls tagged bad for a specific outcome category to revise a prompt.
Search transcripts: Keyword search across all transcripts in the workspace.
Filter on signals: By outcome, sentiment, agent, language, tool that fired, or error code surfaced.
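
The tagging, filtering, and search steps above can be sketched over a list of call summaries. The record fields (`tag`, `tools`, `error_codes`, and so on) and the sample calls are hypothetical; they only illustrate combining several signals in one query.

```python
def filter_calls(calls, **criteria):
    """Keep calls matching every criterion; list-valued fields match on membership."""
    def matches(call):
        for key, wanted in criteria.items():
            actual = call.get(key)
            if isinstance(actual, (list, set, tuple)):
                if wanted not in actual:
                    return False
            elif actual != wanted:
                return False
        return True

    return [call for call in calls if matches(call)]


def search_transcripts(calls, keyword):
    """Case-insensitive keyword search across transcripts."""
    needle = keyword.lower()
    return [call for call in calls if needle in call["transcript"].lower()]


# Made-up call summaries; field names are assumptions, not a documented schema.
calls = [
    {"id": "c1", "outcome": "booked", "sentiment": "positive", "agent": "renewals",
     "language": "hi", "tools": ["create_booking"], "error_codes": [], "tag": "good",
     "transcript": "Yes, please book the slot for Monday."},
    {"id": "c2", "outcome": "follow-up", "sentiment": "negative", "agent": "renewals",
     "language": "en", "tools": ["crm_lookup"], "error_codes": ["504"], "tag": "bad",
     "transcript": "The premium amount is too high for me."},
    {"id": "c3", "outcome": "follow-up", "sentiment": "neutral", "agent": "support",
     "language": "en", "tools": [], "error_codes": [], "tag": "bad",
     "transcript": "Call me back next week about the premium."},
]

review_set = filter_calls(calls, tag="bad", outcome="follow-up")
print([call["id"] for call in review_set])
print([call["id"] for call in search_transcripts(calls, "premium")])
```

A set like `review_set` is the kind of slice you would export when collecting problem calls for one outcome category.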

Replay & live ops

Re-run yesterday's calls. Watch today's calls live.

Replay: Re-run any past call's transcript against an updated agent version — see how your new prompt would have handled it. No real phone call needed.
Aggregate dashboard: Multi-call rollups — success rate, latency p50/p95, error trends. For when you're past 100 calls/day.
Live call monitor: Real-time view of active calls — useful for team-scale operations.
Webhooks: Push every execution to your warehouse or BI tool on call completion.
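
For the webhook path, a receiver typically flattens each execution into one warehouse row. The sketch below assumes a hypothetical JSON payload; keys such as `execution_id`, `cost.total`, and `latency_ms.end_to_end` are illustrative guesses, so check the real payload before relying on any of them.

```python
import json


def to_warehouse_row(raw_body: bytes) -> dict:
    """Flatten a (hypothetical) execution webhook payload into one flat row."""
    payload = json.loads(raw_body)
    cost = payload.get("cost", {})
    latency = payload.get("latency_ms", {})
    return {
        "execution_id": payload["execution_id"],  # required; raises KeyError if absent
        "outcome": payload.get("outcome"),
        "total_cost_inr": cost.get("total"),
        "e2e_latency_ms": latency.get("end_to_end"),
        "error_count": len(payload.get("errors", [])),
    }


# Made-up payload for illustration; the structure is an assumption.
sample_body = json.dumps({
    "execution_id": "exec_29x8b",
    "outcome": "booked",
    "cost": {"total": 4.20},
    "latency_ms": {"end_to_end": 1450},
    "errors": [],
}).encode("utf-8")

print(to_warehouse_row(sample_body))
```

Keeping the flattening step in a pure function makes it easy to unit-test separately from whatever HTTP framework receives the request.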

Post-call analysis

Your own questions, answered automatically.

Custom analysis lets you define the questions you want answered about every call — from outcome category to compliance checks — and Vocily AI runs them across the transcript.

Sentiment

Outcome category

Compliance checks

Custom rubrics

Execution exec_29x8b · Post-call analysis

Outcome: Renewal accepted
Sentiment · customer: Neutral
Sentiment · agent: Supportive
Objection: Premium amount
Compliance read: Disclosed
Next step: Email confirmation

Common questions

What teams ask before they switch.

Sentiment is a built-in score that ships on every execution — customer mood, agent tone, overall trend. Custom analysis is whatever you define: boolean outputs ('did they book?'), text outputs ('what objection did they raise?'), or numbers ('lead score 1–10'). The platform runs both on every call.

Keep reading

Keep exploring Vocily AI

Voice agents

Configure role, language, voice, and escalation per agent.

Languages & voice engine

17+ Indian languages, code-mix, per-language voices.

Inbound phone

Your agent answers calls on a real number.