DTMF and IVR Navigation for Outbound Voice Agents
Outbound voice agents calling businesses often encounter IVR systems — "press 1 for sales, press 2 for support" phone trees that the AI needs to navigate to reach the right person. Handling this well requires DTMF (dual-tone multi-frequency) signaling, IVR recognition, and occasionally the ability to speak to an automated attendant as if it were a person. It's a niche but important skill for production voice agents, especially in B2B outbound.
TL;DR
- DTMF is the technology for sending keypad tones over a call.
- IVR navigation lets voice agents reach the right person past phone trees.
- Pre-configured paths are cleaner than real-time recognition.
- Direct-dial numbers bypass IVR — prefer when available.
- Handle IVR changes gracefully: menus and call routing get updated.
What DTMF is
DTMF tones are the keypad sounds your phone makes. Each of the 12 keys (0–9, *, and #) maps to a unique pair of tones: one low frequency and one high frequency played together. Virtually every telephony system handles DTMF; it's the universal way to communicate digits over a call.
For voice agents:
- Sending DTMF: dial extension, navigate IVR, enter account number.
- Receiving DTMF: caller pressing digits to authenticate or pick options.
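To make the two-tone idea concrete, here is a minimal Python sketch that maps each key to its standard frequency pair and synthesizes one key press. The frequencies are the standard DTMF grid; the names `DTMF_FREQS` and `dtmf_samples`, the 8 kHz sample rate, and the 100 ms default are illustrative assumptions, not any platform's API.

```python
import math

# Standard DTMF grid: each key sums one row (low) tone and one column (high) tone.
ROW_HZ = (697, 770, 852, 941)
COL_HZ = (1209, 1336, 1477)
KEYPAD = ("123", "456", "789", "*0#")

DTMF_FREQS = {
    key: (ROW_HZ[r], COL_HZ[c])
    for r, row in enumerate(KEYPAD)
    for c, key in enumerate(row)
}


def dtmf_samples(key: str, duration_ms: int = 100, sample_rate: int = 8000) -> list[float]:
    """Return float samples in [-1, 1] for a single key press.

    Hypothetical helper: sample rate and duration are assumed defaults.
    """
    low, high = DTMF_FREQS[key]
    n = sample_rate * duration_ms // 1000
    return [
        0.5 * math.sin(2 * math.pi * low * i / sample_rate)
        + 0.5 * math.sin(2 * math.pi * high * i / sample_rate)
        for i in range(n)
    ]
```

In practice most telephony platforms generate these tones for you; the sketch only shows what a "digit" looks like on the wire.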
Why IVR navigation matters for outbound
Many B2B calls land on an automated attendant:
- "Thank you for calling Acme. If you know your party's extension, dial it now. Otherwise, press 1 for sales, 2 for support..."
Voice AI needs to navigate this to reach the intended contact. Options:
- Direct-dial number. Skip IVR. Best if available.
- Extension. Dial during greeting.
- Menu navigation. Press appropriate option.
- Operator / live attendant. Ask for person by name.
Pre-configured navigation
For accounts you call often, pre-configure the IVR path:
Acme Corp main number: +1-555-123-4567
Sales extension: 101
Jamie Patel's extension: 4827
Navigation: dial main number → wait 5s → press 4827
AI follows the stored path. Fast, reliable.
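One way to represent a stored path like the Acme example is as an ordered list of wait and press steps. The sketch below is an assumption about how such a record might look; `StoredPath`, `Wait`, `Press`, and `run_path` are hypothetical names, and the `dial`, `send_dtmf`, and `sleep` callables stand in for whatever your telephony platform provides.

```python
from dataclasses import dataclass
from typing import Callable, Union


@dataclass(frozen=True)
class Wait:
    seconds: float


@dataclass(frozen=True)
class Press:
    digits: str


Step = Union[Wait, Press]


@dataclass(frozen=True)
class StoredPath:
    account: str
    main_number: str
    steps: tuple[Step, ...]


# The Acme path from the example above: wait 5s, then dial Jamie's extension.
ACME = StoredPath(
    account="Acme Corp",
    main_number="+1-555-123-4567",
    steps=(Wait(5), Press("4827")),
)


def run_path(
    path: StoredPath,
    dial: Callable[[str], None],
    send_dtmf: Callable[[str], None],
    sleep: Callable[[float], None],
) -> None:
    """Dial the main number, then replay the stored steps in order."""
    dial(path.main_number)
    for step in path.steps:
        if isinstance(step, Wait):
            sleep(step.seconds)
        else:
            send_dtmf(step.digits)
```

Passing the side-effecting functions in as arguments keeps the path logic testable without placing a real call.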
Real-time IVR recognition
For accounts you don't have pre-configured:
- AI listens to the IVR prompts.
- STT captures the menu options.
- LLM determines which option matches the goal.
- AI sends DTMF for that option.
More flexible, but slower and more error-prone than a stored path.
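The listen, transcribe, choose, and press steps above can be sketched roughly as follows. This is a simplified stand-in: a regular expression extracts "N for label" pairs from an STT transcript, and a plain substring match replaces the LLM decision. Both `parse_menu` and `choose_digit` are hypothetical helpers, and real prompts vary far more than this pattern covers.

```python
import re
from typing import Optional

# Matches "<digit> for <label>" up to a comma, period, end of text, or "or".
MENU_ITEM = re.compile(
    r"(\d)\s+for\s+([a-z][a-z ]*?)(?=\s*(?:[,.]|$|\bor\b))",
    re.IGNORECASE,
)


def parse_menu(transcript: str) -> dict[str, str]:
    """Map each spoken menu label (lowercased) to its digit."""
    return {
        m.group(2).strip().lower(): m.group(1)
        for m in MENU_ITEM.finditer(transcript)
    }


def choose_digit(menu: dict[str, str], goal: str) -> Optional[str]:
    """Keyword stand-in for the LLM step: pick the option whose label overlaps the goal."""
    goal = goal.lower()
    for label, digit in menu.items():
        if label in goal or goal in label:
            return digit
    return None  # No match: caller should try "operator" or abandon.


prompt = (
    "Thank you for calling Acme. If you know your party's extension, "
    "dial it now. Otherwise, press 1 for sales, 2 for support..."
)
menu = parse_menu(prompt)
digit = choose_digit(menu, "sales")
```

Returning `None` rather than guessing keeps the agent from pressing a random option when the menu does not match the goal.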
DTMF timing
Important:
- Wait for the prompt before sending. Digits sent before the IVR is listening are often ignored.
- Inter-digit timing. Space digits 100–200ms apart for clean recognition.
- Tone duration. 80–200ms per tone.
Most voice AI platforms handle timing automatically.
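If you do need to control timing yourself, the arithmetic is simple. The sketch below lays out start and end times for each digit using the ranges given above; `schedule_digits` and `ToneEvent` are hypothetical names, and the range checks encode this article's guidance rather than a formal standard.

```python
from typing import NamedTuple


class ToneEvent(NamedTuple):
    digit: str
    start_ms: int
    end_ms: int


def schedule_digits(
    digits: str,
    tone_ms: int = 100,
    gap_ms: int = 150,
    lead_in_ms: int = 0,
) -> list[ToneEvent]:
    """Compute when each tone starts and ends, relative to the send time."""
    if not 80 <= tone_ms <= 200:
        raise ValueError("tone_ms outside the 80-200 ms guidance")
    if not 100 <= gap_ms <= 200:
        raise ValueError("gap_ms outside the 100-200 ms guidance")
    events = []
    t = lead_in_ms
    for d in digits:
        events.append(ToneEvent(d, t, t + tone_ms))
        t += tone_ms + gap_ms  # Next tone starts after this tone plus the gap.
    return events


events = schedule_digits("4827")
```

With the defaults, a four-digit extension takes 850 ms from the first tone to the end of the last, which is worth budgeting for when the IVR has a short input window.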
Handling "operator" or "person"
Sometimes saying "operator" or "speak to a person" works on IVRs:
- AI says "Operator" clearly.
- IVR connects to live attendant or specific department.
- AI proceeds with the human attendant.
Not all IVRs support this; test per-account.
Speaking to a live attendant
Voice agent asks for the intended person:
Attendant: "Acme, how can I direct your call?"
Agent: "I'm calling for Jamie Patel."
Attendant: "One moment."
[Transfer to Jamie's line.]
[Voice agent proceeds with its actual call.]
AI speaks to the attendant just like a human would.
When the person isn't available
Common outcomes:
- Voicemail.
- "Jamie's not in today."
- "Can I take a message?"
AI handles:
- Drop voicemail (if appropriate).
- Give a callback message to the attendant.
- Log outcome and schedule retry.
IVR changes
IVR menus change. Pre-configured paths break:
- Monitor for navigation failures.
- Fall back to real-time recognition.
- Update stored paths when a change is detected.
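The monitor, fall back, and update loop can be expressed in a few lines. This is a sketch under assumptions: `navigate` is a hypothetical function, `try_stored` reports whether the stored path reached the target, and `try_realtime` returns a newly discovered path or `None`.

```python
from typing import Callable, Optional


def navigate(
    account: str,
    stored_paths: dict[str, str],
    try_stored: Callable[[str], bool],
    try_realtime: Callable[[str], Optional[str]],
) -> str:
    """Try the stored path first, fall back to real-time recognition, then save what worked."""
    path = stored_paths.get(account)
    if path is not None and try_stored(path):
        return "stored"
    # Stored path missing or broken: discover the menu live.
    discovered = try_realtime(account)
    if discovered is None:
        return "failed"  # Log as a navigation failure for review.
    stored_paths[account] = discovered  # Refresh the stored path for next time.
    return "realtime"
```

The returned label doubles as a monitoring signal: a rising share of "realtime" or "failed" outcomes for an account suggests its menu has changed.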
The caller ID angle
Outbound caller ID affects reception:
- Unknown numbers often go to voicemail regardless.
- Recognized business numbers may be routed past the IVR.
- Spam flags cause routing issues.
See caller ID and trust: why numbers get marked as spam.
Language considerations
IVRs in different languages:
- Some present language options ("Press 1 for English, 2 for Spanish").
- Some automatically respond in the detected language.
- AI needs to handle the language presented.
Compliance
IVR navigation isn't separately regulated, but the underlying call is:
- TCPA applies.
- Consent still required.
- Opt-out respected at the call level.
Sample end-to-end
Call placed to Acme (+1-555-123-4567).
IVR: "Thanks for calling Acme. Press 1 for sales, 2 for
support, 3 for billing, or say the name of the person
you want."
AI: "Jamie Patel."
IVR: "Connecting to Jamie Patel."
[Jamie's voicemail picks up or Jamie answers.]
Voice agent proceeds with its outreach.
When to skip IVR entirely
If direct-dial or mobile number is available, use it:
- Faster.
- No IVR navigation risk.
- Direct person-to-person.
CRM should track preferred number per contact.
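A contact record that tracks several numbers makes this choice mechanical. The sketch below is illustrative only: the `Contact` fields and the `pick_number` helper are hypothetical, and trying the direct-dial number before the mobile number is an assumed ordering you may want to reverse.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Contact:
    name: str
    direct_dial: Optional[str] = None
    mobile: Optional[str] = None
    main_number: Optional[str] = None


def pick_number(contact: Contact) -> tuple[str, bool]:
    """Return (number_to_dial, needs_ivr_navigation)."""
    # Assumed priority: direct dial, then mobile, then the main line.
    for number in (contact.direct_dial, contact.mobile):
        if number:
            return number, False
    if contact.main_number:
        return contact.main_number, True
    raise ValueError(f"No number on file for {contact.name}")
```

The boolean tells the dialer whether to load a stored IVR path before placing the call.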
Manual override
Sometimes IVR navigation fails or isn't worth the effort:
- Log as navigation failure.
- Skip outbound attempt.
- Route to human SDR to call manually.
Not every call needs to succeed through automation.
Measuring
- IVR navigation success rate. % of calls that reach target.
- Time-to-person. Median seconds from call connect to person on line.
- Voicemail rate. Higher than direct calls because of IVR friction.
- Callback rate. Do IVR-navigated calls convert at similar rates?
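The first three metrics can be computed directly from call logs. The sketch below assumes a hypothetical `CallRecord` shape; your platform's log fields will differ.

```python
from dataclasses import dataclass
from statistics import median
from typing import Optional


@dataclass
class CallRecord:
    reached_target: bool
    hit_voicemail: bool
    seconds_to_person: Optional[float]  # None when no person was reached.


def summarize(calls: list[CallRecord]) -> dict[str, Optional[float]]:
    """Compute navigation success rate, voicemail rate, and median time-to-person."""
    if not calls:
        raise ValueError("No calls to summarize")
    total = len(calls)
    times = [
        c.seconds_to_person
        for c in calls
        if c.reached_target and c.seconds_to_person is not None
    ]
    return {
        "navigation_success_rate": sum(c.reached_target for c in calls) / total,
        "voicemail_rate": sum(c.hit_voicemail for c in calls) / total,
        "median_time_to_person_s": median(times) if times else None,
    }
```

Computing the same summary separately for IVR-navigated and direct-dial calls gives the comparison the voicemail and callback questions above are asking for.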
Common pitfalls
DTMF timing too fast. Digits sent before menu prompt finishes. Ignored.
IVR menu changes. Stored path breaks. Fallback needed.
Silent voicemails. Voicemail detection fails; the AI plays its voicemail message to a live person. Awkward.
Stuck in IVR loop. AI can't find the right option. Abandon after N attempts.
Attendant frustration. AI repeats the same request to a human attendant. Sounds robotic.
Related reading
- Outbound AI Calling in 2026: A Practical Playbook
- Outbound for B2B: Pipeline, Renewals, and Win-Backs
- Outbound for B2C: Subscription, Healthcare, and Auto
- How to Run an Outbound AI Pilot That Doesn't Embarrass You
- Outbound Voice Agents for Renewal Conversations
FAQ
Can AI handle complex multi-level IVRs? Usually yes for 2–3 levels. Deep IVRs (5+ levels) are unreliable.
What about IVRs that require speech? AI can speak responses ("Say 'sales' to continue"). Handled naturally.
Can AI bypass IVR entirely? Only if direct-dial number exists. Main numbers require IVR handling.
What about automated attendants that detect robots? Some flag and route to voicemail intentionally. Hard to bypass.
How do we handle international IVRs? Similar patterns. Pre-configure paths per-country for major targets.

Tyler Weitzman is co-founder and Head of AI at Speechify. He has spent the past decade building the speech-synthesis stack that powers millions of users. Tyler writes about the engineering of real-time conversational systems — text-to-speech, speech recognition, latency budgets, model serving, and the architectural choices that separate prototypes from production-grade voice agents.
