How to Score Leads From a Voice Conversation

Tyler Weitzman
February 22, 2026 · 5 min read
Speechify

A voice conversation is a rich source of signal for lead scoring — far richer than a form submission or a website visit. The caller tells you their role, their company, their need, their timeline, and their tone. The challenge is turning all that into a numeric score the sales team can act on. Done well, AI-powered lead scoring from voice conversations lets you route the top 20% of leads to AEs in real time, tier the rest for nurture, and measure what actually predicts closes over time.

TL;DR

  • Extract structured signals from the conversation: role, company size, use case, timeline, sentiment.
  • Map signals to a numeric score (0–100) via a scoring rubric tied to your ICP.
  • Sentiment and urgency are often the most underused signals.
  • Calibrate monthly against actual close data.
  • Route based on score thresholds; iterate on thresholds.

Signals worth capturing

From a voice conversation, you can capture:

  • Role / title. Self-identified during qualification.
  • Company size. Explicit or inferred from role/company.
  • Industry. Stated or inferred.
  • Use case. The specific problem they're solving.
  • Timeline. How soon they want a solution.
  • Budget signal. Explicit or implicit.
  • Decision authority. Solo or part of a group.
  • Current solution. What are they using today?
  • Urgency. How much does this matter?
  • Sentiment. Neutral, enthusiastic, skeptical.
  • Engagement. Asking detailed questions vs shallow.

Not all signals deserve equal weight. Your scoring model should prioritize the ones that are predictive for your ICP.

A simple scoring rubric

Start simple. Example for a mid-market SaaS:

Signal           Value                   Points
Role             VP+                     25
Role             Director                15
Role             Manager                 5
Company size     500–5000 employees      20
Company size     100–500                 10
Use case match   Direct                  20
Use case match   Adjacent                10
Timeline         under 3 months          20
Timeline         3–6 months              10
Budget signal    Explicit                10
Budget signal    Implicit                5
Sentiment        Positive                5

Max: 100. Threshold for AE routing: 50.

Implementation

The voice agent captures these signals during the call:

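function capture_signals({
  role, title, company_name, company_size, industry,
  use_case_description, timeline, budget_mentioned,
  decision_authority, current_solution, urgency, sentiment
}) {
  // Hand the captured fields to post-call scoring as a single record.
  return { role, title, company_name, company_size, industry,
           use_case_description, timeline, budget_mentioned,
           decision_authority, current_solution, urgency, sentiment };
}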

After the call, a scoring function applies the rubric:

def calculate_score(signals):
    """Sum the rubric points for each captured signal, capped at 100."""
    # Each *_score helper maps one signal to its rubric points (0 if no match).
    score = 0
    score += role_score(signals.role)
    score += size_score(signals.company_size)
    score += use_case_score(signals.use_case_description)
    score += timeline_score(signals.timeline)
    score += budget_score(signals.budget_mentioned)
    score += sentiment_score(signals.sentiment)
    return min(100, score)
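
The helper functions called in the snippet above are left undefined. Here is a minimal, self-contained sketch of how the example rubric could be filled in. The string labels (`"vp_plus"`, `"direct"`, and so on), the dict-based signal record, and the name `score_lead` are assumptions for illustration, as is the choice to put a company of exactly 500 employees in the higher band, since the rubric's two size ranges overlap there.

```python
# Point values copied from the example rubric; the string labels are hypothetical.
ROLE_POINTS = {"vp_plus": 25, "director": 15, "manager": 5}
USE_CASE_POINTS = {"direct": 20, "adjacent": 10}
BUDGET_POINTS = {"explicit": 10, "implicit": 5}
SENTIMENT_POINTS = {"positive": 5}


def size_score(employees):
    """Company size: 500-5000 employees -> 20, 100-499 -> 10, otherwise 0."""
    if employees is None:
        return 0
    if 500 <= employees <= 5000:
        return 20
    if 100 <= employees < 500:
        return 10
    return 0


def timeline_score(months):
    """Timeline: under 3 months -> 20, 3-6 months -> 10, otherwise 0."""
    if months is None:
        return 0
    if months < 3:
        return 20
    if months <= 6:
        return 10
    return 0


def score_lead(signals):
    """Apply the rubric to a dict of extracted signals; missing keys score 0."""
    score = (
        ROLE_POINTS.get(signals.get("role"), 0)
        + size_score(signals.get("company_size"))
        + USE_CASE_POINTS.get(signals.get("use_case_match"), 0)
        + timeline_score(signals.get("timeline_months"))
        + BUDGET_POINTS.get(signals.get("budget_signal"), 0)
        + SENTIMENT_POINTS.get(signals.get("sentiment"), 0)
    )
    return min(100, score)


lead = {
    "role": "director",
    "company_size": 800,
    "use_case_match": "direct",
    "timeline_months": 2,
    "budget_signal": "implicit",
    "sentiment": "neutral",
}
print(score_lead(lead))  # 15 + 20 + 20 + 20 + 5 + 0 = 80
```

Keeping the point values in plain lookup tables makes monthly recalibration a data change rather than a code change.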

LLM-assisted extraction

For free-form signals (use case, urgency, sentiment), the LLM extracts and classifies:

Prompt: "Given this call transcript, classify:
- Use case match (direct/adjacent/off-fit): [...]
- Urgency (high/medium/low): [...]
- Sentiment (positive/neutral/negative): [...]"

Classifying after the call, with the full transcript available, is cleaner than trying to classify mid-conversation.
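
Model output should be validated before it feeds a score. The sketch below builds the classification prompt and keeps only replies that use one of the allowed labels. It assumes the model is asked to answer in JSON, and `call_llm` is a hypothetical caller-supplied function (prompt string in, reply string out), not any specific vendor API.

```python
import json

# Allowed labels, taken from the classification prompt.
ALLOWED = {
    "use_case_match": {"direct", "adjacent", "off-fit"},
    "urgency": {"high", "medium", "low"},
    "sentiment": {"positive", "neutral", "negative"},
}


def build_prompt(transcript):
    """Ask for the three classifications as a JSON object (an assumed output format)."""
    return (
        "Given this call transcript, classify and answer as JSON with keys "
        "use_case_match (direct/adjacent/off-fit), urgency (high/medium/low), "
        "sentiment (positive/neutral/negative).\n\nTranscript:\n" + transcript
    )


def parse_classification(raw_reply):
    """Keep only fields whose value is an allowed label; anything else becomes None."""
    try:
        data = json.loads(raw_reply)
    except json.JSONDecodeError:
        data = {}
    if not isinstance(data, dict):
        data = {}
    result = {}
    for field, labels in ALLOWED.items():
        value = str(data.get(field, "")).strip().lower()
        result[field] = value if value in labels else None
    return result


def classify_call(transcript, call_llm):
    """call_llm is a hypothetical function: prompt string in, reply string out."""
    return parse_classification(call_llm(build_prompt(transcript)))


# Usage with a stand-in for the model, so the sketch runs without any API.
def fake_llm(prompt):
    return '{"use_case_match": "Direct", "urgency": "high", "sentiment": "excited"}'


print(classify_call("Caller: we need this live next month.", fake_llm))
```

An out-of-vocabulary label such as "excited" comes back as `None`, so the scoring step can treat it as missing rather than guessing.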

Beyond the rubric

Static rubrics are a starting point. More sophisticated approaches include:

  • Regression model trained on historical call data + closed outcomes.
  • Feature engineering — call duration, number of questions asked, specific keyword density.
  • Time-decay — older signals weighted less if the lead is re-engaging.

Start static, evolve into ML when you have enough data.
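
To make the regression option concrete, here is a small logistic regression fitted by gradient descent in plain Python. It is a sketch only: the three binary features and the eight training rows are made-up toy data, not real call outcomes, and a production model would more likely use an established library and far more history.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train_logistic(rows, labels, lr=0.5, epochs=2000):
    """Fit weights and bias by batch gradient descent on log loss."""
    n_features = len(rows[0])
    weights = [0.0] * n_features
    bias = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * n_features
        grad_b = 0.0
        for x, y in zip(rows, labels):
            error = sigmoid(bias + sum(w * xi for w, xi in zip(weights, x))) - y
            for i, xi in enumerate(x):
                grad_w[i] += error * xi
            grad_b += error
        weights = [w - lr * g / len(rows) for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / len(rows)
    return weights, bias


def predict(weights, bias, x):
    """Estimated close probability for one feature row."""
    return sigmoid(bias + sum(w * xi for w, xi in zip(weights, x)))


# Made-up toy rows: [timeline under 3 months, direct use case, VP+ role] -> closed?
rows = [[1, 1, 1], [1, 1, 0], [1, 0, 1], [0, 1, 0],
        [0, 0, 1], [0, 0, 0], [1, 0, 0], [0, 1, 1]]
closed = [1, 1, 1, 0, 0, 0, 0, 1]

weights, bias = train_logistic(rows, closed)
print(predict(weights, bias, [1, 1, 1]) > predict(weights, bias, [0, 0, 0]))
```

The learned weights play the same role as rubric points, except that they come from outcomes rather than judgment.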

Calibration

Every month, cross-reference AI scores with outcomes:

  • What did scores predict?
  • What % of high-score leads became opportunities?
  • What % closed?
  • False positives: high-score, didn't close → what was the miss?
  • False negatives: low-score, closed anyway → what was missed?

Adjust rubric weights based on findings.
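
The monthly review can start from a few counts. This sketch assumes each lead is recorded as a `(score, closed)` pair and uses the AE routing threshold of 50 from the example rubric; the sample month is made-up data for illustration.

```python
def calibration_report(leads, threshold=50):
    """leads: list of (score, closed) pairs. threshold: the AE routing cutoff."""
    high = [closed for score, closed in leads if score >= threshold]
    low = [closed for score, closed in leads if score < threshold]
    return {
        "high_score_close_rate": sum(high) / len(high) if high else None,
        "low_score_close_rate": sum(low) / len(low) if low else None,
        # High score but did not close.
        "false_positives": sum(1 for c in high if not c),
        # Low score but closed anyway.
        "false_negatives": sum(1 for c in low if c),
    }


# Made-up month of outcomes, for illustration only.
month = [(85, True), (72, False), (64, True), (55, False),
         (40, False), (35, True), (15, False), (10, False)]
report = calibration_report(month)
print(report)
```

The false positives and false negatives are the leads worth reading transcripts for, since they show which signals the rubric over- or under-weights.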

Routing thresholds

  • Score 80–100: priority-route to top AEs; offer meeting immediately.
  • Score 50–79: route to appropriate AE; 24-hour follow-up.
  • Score 20–49: nurture; email sequence + SDR follow-up.
  • Score under 20: disqualify politely; exit.

Tune thresholds over time.
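
Because thresholds are expected to change, it helps to keep them as data. A minimal sketch of the tiers above, where the action names are hypothetical labels for whatever your CRM or dialer expects:

```python
# (minimum score, action) pairs, highest tier first; edit these as thresholds are tuned.
ROUTING_TIERS = [
    (80, "priority_ae"),       # top AEs, offer a meeting immediately
    (50, "ae_24h_follow_up"),  # appropriate AE, follow up within 24 hours
    (20, "nurture"),           # email sequence plus SDR follow-up
    (0, "disqualify"),         # polite exit
]


def route(score, tiers=ROUTING_TIERS):
    """Return the action of the first tier whose minimum the score meets."""
    for minimum, action in tiers:
        if score >= minimum:
            return action
    return tiers[-1][1]


for s in (92, 63, 35, 8):
    print(s, route(s))
```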

Sentiment as signal

Voice carries sentiment — LLMs can pick up enthusiasm, frustration, skepticism. Don't underuse this:

  • Enthusiastic about your product → strong buying signal.
  • Skeptical / testing → needs nurture, not AE time.
  • Frustrated with current solution → urgent switcher.
  • Flat / tire-kicker tone → low priority.

Multi-call scoring

For leads who call multiple times:

  • Aggregate signals across calls.
  • Weight recency (most recent signals heavier).
  • Detect patterns: re-engagement is itself a strong signal.
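
One way to weight recency is exponential decay. In this sketch each call is a `(days_ago, score)` pair, and the 30-day half-life is an assumption to tune, not a recommendation from any dataset.

```python
def aggregate_score(calls, half_life_days=30.0):
    """calls: list of (days_ago, score). Returns the recency-weighted mean score."""
    if not calls:
        return None
    # A call loses half its weight every half_life_days.
    weights = [0.5 ** (days_ago / half_life_days) for days_ago, _ in calls]
    total = sum(w * score for w, (_, score) in zip(weights, calls))
    return total / sum(weights)


# Three calls: 60 days ago (40), 30 days ago (60), today (80).
calls = [(60, 40), (30, 60), (0, 80)]
print(round(aggregate_score(calls), 1))  # (0.25*40 + 0.5*60 + 1*80) / 1.75 = 68.6
```

The plain average of these three calls would be 60; the decay pulls the aggregate toward the most recent, stronger call.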

Cross-channel scoring

Voice score is one input. Combine with:

  • Web behavior (pages viewed, time on site).
  • Email engagement.
  • Product signals (if self-serve exists).

For most modern funnels, a unified lead score is more useful than a voice-only score.
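
A simple way to combine channels is a weighted mean. The sketch below assumes every channel already produces a 0–100 score; the channel names and weights are hypothetical starting values, and the weights are renormalized over whichever channels actually have data.

```python
# Hypothetical channel weights; calibrate against close data like any other rubric.
CHANNEL_WEIGHTS = {"voice": 0.5, "web": 0.2, "email": 0.15, "product": 0.15}


def unified_score(channel_scores, weights=CHANNEL_WEIGHTS):
    """Weighted mean of 0-100 channel scores over the channels that are present."""
    present = {
        channel: score
        for channel, score in channel_scores.items()
        if channel in weights and score is not None
    }
    if not present:
        return None
    total_weight = sum(weights[channel] for channel in present)
    return sum(weights[c] * s for c, s in present.items()) / total_weight


print(unified_score({"voice": 80, "web": 50, "email": 60, "product": 40}))
print(unified_score({"voice": 80, "product": None}))
```

Renormalizing means a lead with only a voice score is not penalized for channels that have no data yet.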

See inbound lead qualification with voice agents.

Common pitfalls

Over-indexing on declared budget. Callers under-report budget. Infer from other signals.

Under-valuing timeline. Someone saying "this quarter" is hugely different from "next year."

Static rubric. Without recalibration, scores drift from reality.

Too many signals. Analysis paralysis. Pick 5–7 that matter.

Ignoring sentiment. Voice gives you sentiment for free; use it.

Privacy consideration

Scoring a lead based on voice content is normal CRM practice. But:

  • Store scores; don't keep raw transcripts unless you need them.
  • Document the scoring model for compliance (for example, GDPR's rules on automated decision-making, often described as a right to explanation).
  • Don't score on protected characteristics (gender, ethnicity, etc.).

Observability

  • Score distribution (histogram).
  • % of calls scored correctly, per the most recent calibration.
  • Conversion rate by score band.
  • AE acceptance rate by score band.
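
The first and third metrics can be computed from nothing more than scores and outcomes. This sketch assumes integer scores from 0 to 100 and `(score, converted)` pairs; the band labels mirror the routing tiers.

```python
from collections import Counter


def score_histogram(scores, bucket=10):
    """Count scores per bucket start (0, 10, ...); 100 is folded into the top bucket."""
    counts = Counter(min(score, 99) // bucket * bucket for score in scores)
    return dict(sorted(counts.items()))


def band(score):
    """Map a score to its routing band."""
    if score >= 80:
        return "80-100"
    if score >= 50:
        return "50-79"
    if score >= 20:
        return "20-49"
    return "under 20"


def conversion_by_band(leads):
    """leads: (score, converted) pairs -> conversion rate per routing band."""
    totals, wins = Counter(), Counter()
    for score, converted in leads:
        totals[band(score)] += 1
        wins[band(score)] += int(converted)
    return {b: wins[b] / totals[b] for b in totals}


print(score_histogram([5, 12, 18, 55, 100]))
print(conversion_by_band([(85, True), (90, False), (30, False)]))
```

If conversion does not rise from band to band, the rubric is not separating leads and the weights need another calibration pass.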

FAQ

What if the caller is evasive? Partial data → partial score. Don't over-weight missing dimensions.

Can AI predict close probability directly? With enough history, yes; that's where ML beats rubrics. Many teams use a hybrid rule-plus-model approach.

Should scores be visible to AEs? Yes — transparent scores build trust. "Why was this scored 75?" is a reasonable question.

How fine-grained should scores be? 0–100 with tiers is usually enough. More granular doesn't add actionability.

What about intent data from third parties? Useful input, separate from voice scoring. Combine at the CRM level.

Tyler Weitzman
Co-Founder & Head of AI, Speechify

Tyler Weitzman is co-founder and Head of AI at Speechify. He has spent the past decade building the speech-synthesis stack that powers millions of users. Tyler writes about the engineering of real-time conversational systems — text-to-speech, speech recognition, latency budgets, model serving, and the architectural choices that separate prototypes from production-grade voice agents.
