Conversational AI & LLMs

How large language models power voice agents — prompting, function calling, memory, evaluations, and orchestration.

22 articles

🧠 LLMs

How to Handle Personally Identifiable Information in Voice Agents

Voice agents collect PII constantly — names, phone numbers, addresses, dates of birth, account numbers, sometimes even social security numbers and credit cards. Handling this responsibly isn't optional.

Tyler Weitzman · Jan 25, 2026 · 5 min
🧠 LLMs

Designing Voice Agents That Ask Better Questions

A voice agent that asks bad questions wastes the caller's time and produces bad data. Good questions feel natural and capture what you need in fewer turns.

Rohan Pavuluri · Jan 24, 2026 · 5 min
🧠 LLMs

Open-Source vs Closed-Source LLMs for Voice Agents

The open-source LLM ecosystem caught up to closed models faster than anyone expected. Llama 3.3, Mistral, Qwen — all good enough for most voice agent use cases.

Tyler Weitzman · Jan 24, 2026 · 4 min
🧠 LLMs

How LLMs Decide What to Say Next in a Voice Conversation

Step inside the LLM's "head" for a moment and look at how it picks what to say on each turn of a voice call. The answer is less mysterious than the term "AI" suggests and more interesting than "next-token prediction" implies.

Tyler Weitzman · Jan 23, 2026 · 5 min
🧠 LLMs

Red-Teaming Your Voice Agent

Red-teaming is the practice of deliberately trying to break your voice agent before adversaries (or just confused customers) do it for you. Most teams skip it. The ones that do it find embarrassing failures fast — and fix them before they cost real money.

Tyler Weitzman · Jan 23, 2026 · 5 min
🧠 LLMs

Building a Conversation Memory Layer for Voice Agents

The model has no memory beyond what you put in its context window. For a 5-minute support call this is fine. For longer calls, multi-call interactions, or agents that need to remember preferences across sessions, you need an explicit memory layer.

Tyler Weitzman · Jan 22, 2026 · 5 min
🧠 LLMs

Why Context Windows Matter Less Than You Think for Voice

LLM marketing has been all about context window expansion — 128K, 200K, 1M, 2M tokens. For voice agents, this race mostly doesn't matter. Voice conversations rarely exceed 5,000 tokens of meaningful context.

Tyler Weitzman · Jan 22, 2026 · 4 min
🧠 LLMs

How to A/B Test Voice Agent Prompts

Most teams don't A/B test voice agent prompts. They tweak the prompt, listen to a few calls, and ship if it "feels better." This works until it doesn't — until a tweak that helps one use case silently breaks another.

Tyler Weitzman · Jan 21, 2026 · 5 min
🧠 LLMs

Streaming LLM Outputs to Voice: The Engineering

Streaming the LLM's output to TTS as it generates is the difference between a snappy voice agent and a sluggish one. The basic idea is simple: don't wait for the model to finish thinking before you start speaking.

Tyler Weitzman · Jan 21, 2026 · 6 min
🧠 LLMs

The Role of Embeddings in Voice Agent Knowledge

Embeddings are the numerical representations of text that make retrieval-augmented generation work. Most voice agent builders never have to think about embeddings directly — their platform handles them.

Tyler Weitzman · Jan 20, 2026 · 5 min
🧠 LLMs

Multi-Agent Architectures for Customer Service

When a single agent gets too complex — too many intents, too many tools, conflicting style requirements — teams reach for multi-agent architectures. A "router" or "supervisor" routes turns to specialized sub-agents (a billing expert, a tech support expert, a returns expert).

Tyler Weitzman · Jan 20, 2026 · 5 min
🧠 LLMs

How to Stop a Voice Agent from Hallucinating

Hallucination is the failure mode that scares everyone off voice AI faster than anything else. The agent confidently tells a customer the wrong policy or the wrong price, or makes up a refund.

Tyler Weitzman · Jan 19, 2026 · 6 min
🧠 LLMs

Designing System Prompts for Multi-Turn Voice Conversations

The system prompt is the single most-iterated artifact in any production voice agent. It's where most of the agent's personality, rules, and reliability live. Most teams underinvest here, treating the prompt as a "set it and forget it" string.

Tyler Weitzman · Jan 19, 2026 · 6 min
🧠 LLMs

Tool Use vs Function Calling: What's the Difference?

You'll hear "tool use" and "function calling" used interchangeably in voice agent docs. They mean roughly the same thing. The reason both terms exist is mostly historical — different vendors named the same idea differently.

Tyler Weitzman · Jan 18, 2026 · 4 min
🧠 LLMs

Why Smaller LLMs Often Win for Voice Agents

There's a strong reflex in AI: bigger model = better outcome. For voice agents specifically, this reflex is often wrong. A fast 8B-parameter model with sub-200ms time-to-first-token can outperform a 70B frontier model on nearly every voice metric that matters.

Tyler Weitzman · Jan 17, 2026 · 5 min
🧠 LLMs

Guardrails for Voice Agents: A Pragmatic Take

Guardrails are the rules that prevent your voice agent from doing things it shouldn't — agreeing to refunds it can't authorize, giving medical advice, leaking PII, or making up policies.

Tyler Weitzman · Jan 17, 2026 · 6 min
🧠 LLMs

Retrieval-Augmented Generation for Voice Agents

RAG — retrieval-augmented generation — is the standard pattern for grounding an LLM in a specific knowledge base. For voice agents, RAG works the same as for chatbots, with one crucial difference: every millisecond of retrieval latency shows up in the conversation.

Tyler Weitzman · Jan 16, 2026 · 5 min
🧠 LLMs

LLM Evaluation for Conversational Agents

You can't tune what you can't measure. Evaluation is the unsexy work that separates voice agent teams shipping production-quality work from teams flying blind. Most teams underinvest here for the first few months, then have a wake-up moment when something breaks.

Tyler Weitzman · Jan 16, 2026 · 6 min
🧠 LLMs

How to Give a Voice Agent Long-Term Memory

By default, voice agents have no memory beyond the current call. The caller hangs up, the agent forgets everything. For many use cases this is fine. For loyalty-driven businesses where the same caller comes back repeatedly, it's a missed opportunity.

Tyler Weitzman · Jan 15, 2026 · 5 min
🧠 LLMs

Prompt Engineering for Voice (vs Text) Agents

If you've written prompts for chatbots, you have a head start on voice agents — but only halfway. The fundamentals of clear instructions and tool definitions carry over. The style guide, the latency considerations, and the failure-mode handling are very different.

Tyler Weitzman · Jan 15, 2026 · 6 min
🧠 LLMs

Function Calling for Voice Agents: A Practical Guide

Function calling is the feature that turns a voice agent from a chatbot with audio into an actual worker. Without it, the agent can talk about looking up your account; with it, the agent can actually do it.

Tyler Weitzman · Jan 14, 2026 · 6 min
🧠 LLMs

How Large Language Models Power Voice Agents

When people ask "what's inside a voice agent?" they usually want to hear about the LLM. That's fair — the LLM is the most visible new piece of the stack.

Tyler Weitzman · Jan 14, 2026 · 10 min