How to Handle Personally Identifiable Information in Voice Agents
Voice agents collect PII constantly — names, phone numbers, addresses, dates of birth, account numbers, sometimes even social security numbers and credit cards. Handling this responsibly isn't optional. It's a regulatory requirement, a security concern, and a customer trust commitment. The good news: the rules are clear once you internalize them.
TL;DR
- Capture only the PII you need for the task.
- Don't read sensitive PII back unless necessary.
- Never log raw PII; redact before storing.
- Use DTMF for credit cards, SSNs, and other high-sensitivity numbers.
- Comply with disclosure rules in your jurisdiction.
What counts as PII
In US contexts, PII includes:
- Name (combined with other identifiers)
- Address (full or partial)
- Phone number
- Email address
- Date of birth
- Social Security Number
- Driver's license number
- Account numbers
- Credit card numbers
- Health information (also covered by HIPAA)
- Biometric data
Different jurisdictions add to this list (the GDPR's definition is broader, for example).
The three places PII appears
- In the audio: the caller speaks PII and STT transcribes it.
- In the transcript: stored as text in your database or logs.
- In the LLM prompt: pulled from caller history or function calls and included in the model's context.
Each requires its own handling.
Capture only what's needed
This is the most underused principle. If your agent can resolve the call without the SSN, don't ask for it. Every field you skip shrinks your risk surface.
A useful exercise: list every piece of PII your agent currently asks for. For each, ask: "what would happen if we didn't capture this?"
You'll often find 30–50% of PII captures aren't strictly needed.
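One way to run that exercise is to keep the inventory as data and flag every field that lacks a documented purpose. This is a minimal sketch; the field names, purposes, and the `PiiField` structure are hypothetical examples, not a recommended set.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PiiField:
    name: str       # what the agent asks for
    purpose: str    # why the task needs it; empty if nobody can say
    required: bool  # True if the call cannot be resolved without it


# Hypothetical inventory for an appointment-booking agent.
INVENTORY = [
    PiiField("full_name", "match caller to appointment record", True),
    PiiField("phone_number", "send confirmation text", True),
    PiiField("date_of_birth", "", False),
    PiiField("ssn", "", False),
]


def removal_candidates(inventory):
    """Return names of fields with no stated purpose or that are not required."""
    return [f.name for f in inventory if not f.purpose or not f.required]


print(removal_candidates(INVENTORY))
```

Anything the function returns is a candidate to stop collecting, pending a review with whoever owns the workflow.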
Reading PII back
Common pattern: agent confirms back what the caller said. Sometimes you should; sometimes you shouldn't.
Confirm back:
- Names (so the caller knows you got it)
- Appointment times
- Last 4 digits of cards (for verification)
- Email address (because spelling matters)
Don't confirm back:
- Full credit card numbers (risk of being overheard)
- SSNs (same)
- Full account numbers when last 4 suffice
- Health diagnoses (could be overheard or recorded)
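The two lists above can be encoded as a lookup the agent consults before speaking a value back. The sketch below is illustrative: the field names and the `READBACK_POLICY` table are assumptions, and unknown fields default to silence.

```python
from typing import Optional

# Hypothetical policy table derived from the confirm / don't-confirm lists.
READBACK_POLICY = {
    "name": "full",
    "appointment_time": "full",
    "email": "full",
    "card_number": "last4",
    "account_number": "last4",
    "ssn": "never",
    "diagnosis": "never",
}


def readback(field: str, value: str) -> Optional[str]:
    """Return the text the agent may speak back, or None if it must stay silent."""
    mode = READBACK_POLICY.get(field, "never")  # unknown fields are never read back
    if mode == "full":
        return value
    if mode == "last4":
        digits = [c for c in value if c.isdigit()]
        if len(digits) < 4:
            return None
        return "ending in " + "".join(digits[-4:])
    return None


print(readback("card_number", "4111 1111 1111 1111"))
print(readback("ssn", "123-45-6789"))
```

Defaulting to "never" means a new field added to the agent is not read back until someone decides it should be.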
DTMF for high-sensitivity numbers
For credit cards, SSNs, PINs, and account numbers, let the caller enter them on the phone keypad (DTMF, the tones each key generates) instead of speaking them.
The pattern:
- Agent prompts: "Please enter your card number followed by pound."
- Telephony layer captures DTMF.
- Numbers go directly to your payment processor (or a tokenization vendor).
- The agent never "hears" or "sees" the raw digits.
- Confirmation includes only the last 4 digits.
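This is also a common PCI DSS pattern for credit cards: digits that never enter your audio, transcripts, or model context can't leak from them. Card numbers captured through STT end up in all three, which is painful both operationally and for compliance.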
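The shape of that handoff can be sketched in a few lines. Everything here is an assumption for illustration: `DtmfCardCollector` and `fake_tokenize` are hypothetical names, input validation is omitted, and in a real deployment the telephony provider or payment processor would usually capture the digits so they never reach your application at all.

```python
import secrets


class DtmfCardCollector:
    """Buffers keypad digits until '#', then hands them to a tokenizer."""

    def __init__(self, tokenize):
        self._tokenize = tokenize  # callable: digit string -> opaque token
        self._buffer = []

    def on_key(self, key: str):
        """Feed one DTMF key. Returns a result dict when '#' ends entry, else None."""
        if key == "#":
            digits = "".join(self._buffer)
            self._buffer.clear()  # raw digits are not kept after the handoff
            token = self._tokenize(digits)
            # Only the token and last 4 digits leave this object.
            return {"token": token, "last4": digits[-4:]}
        if key.isdigit():
            self._buffer.append(key)
        return None


def fake_tokenize(digits: str) -> str:
    # Stand-in for a call to a payment processor or tokenization vendor.
    return "tok_" + secrets.token_hex(8)


collector = DtmfCardCollector(fake_tokenize)
result = None
for key in "4111111111111111#":
    result = collector.on_key(key) or result
print(result["last4"])
```

The agent's conversational layer would only ever receive the returned dict, which is enough to say "card ending in 1111" without holding the full number.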
Logging and redaction
Default: don't log PII in plain text.
Implementation:
- Run a redactor on transcripts before storage. Patterns: phone numbers, SSNs, credit cards, email addresses.
- Replace with placeholders ("[REDACTED:phone]").
- Store the original only if you have a strong, documented reason.
For audio recordings: encrypt at rest, restrict access, and delete on a defined schedule.
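A pattern-based redactor along these lines might look like the following. It is a rough sketch under stated assumptions: the regexes target US-style formats written as digits, so they will miss PII that STT transcribes as words ("four one five..."), and the pattern set is illustrative rather than exhaustive.

```python
import re

# Order matters: longer digit runs (cards) are redacted before shorter ones (phones).
PATTERNS = [
    ("email", re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")),
    # 13 to 19 digits, optionally separated by spaces or hyphens.
    ("card", re.compile(r"(?<!\d)(?:\d[ -]?){12,18}\d(?!\d)")),
    ("ssn", re.compile(r"(?<!\d)\d{3}-\d{2}-\d{4}(?!\d)")),
    ("phone", re.compile(
        r"(?<!\d)(?:\+?1[ .-]?)?\(?\d{3}\)?[ .-]?\d{3}[ .-]?\d{4}(?!\d)"
    )),
]


def redact(text: str) -> str:
    """Replace each matched span with a typed placeholder before storage."""
    for label, pattern in PATTERNS:
        text = pattern.sub(f"[REDACTED:{label}]", text)
    return text


sample = (
    "Call me at 415-555-0132 or email ana@example.com, "
    "SSN 123-45-6789, card 4111 1111 1111 1111."
)
print(redact(sample))
```

Typed placeholders keep transcripts useful for debugging: you can still see that a phone number was given at that point in the call without storing the number itself.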
PII in the LLM prompt
A subtle issue: when you inject the caller's profile (name, recent issues, preferences) into the system prompt, that PII flows to your LLM provider.
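Most major LLM providers offer a BAA (Business Associate Agreement, for HIPAA) or a DPA (Data Processing Agreement, for GDPR) that covers this. Verify that yours is signed and covers your use case. If you handle PHI and can't get a BAA, use a self-hosted model.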
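Independent of the contract, you can limit what reaches the provider by allowlisting profile fields before they are rendered into the prompt. This is a complementary sketch, not something the agreements require; the field names and `PROMPT_ALLOWED_FIELDS` are hypothetical.

```python
# Hypothetical allowlist: the only profile fields the model needs for this task.
PROMPT_ALLOWED_FIELDS = ("first_name", "open_ticket_summary", "preferred_language")


def build_caller_context(profile: dict) -> str:
    """Render only allowlisted, non-empty fields as lines for the system prompt."""
    lines = []
    for field in PROMPT_ALLOWED_FIELDS:
        value = profile.get(field)
        if value:
            lines.append(f"{field}: {value}")
    return "\n".join(lines)


profile = {
    "first_name": "Ana",
    "last_name": "Ruiz",
    "ssn": "123-45-6789",
    "open_ticket_summary": "router keeps rebooting",
}
print(build_caller_context(profile))
```

An allowlist fails safe: a new column added to the profile store stays out of the prompt until someone explicitly adds it.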
Disclosure requirements
Several jurisdictions require disclosing that the call may be recorded:
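- California and other "two-party consent" (more accurately, all-party consent) states require every party on the call to consent to recording, which in practice means telling the caller up front.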
- EU GDPR requires disclosure of recording and data processing.
- HIPAA requires Notice of Privacy Practices for healthcare contexts.
Best practice: disclose at call start regardless of jurisdiction. "This call may be recorded for quality and training purposes." Cheap insurance.
Right to deletion
Several frameworks (CCPA, GDPR) give callers the right to request deletion of their personal data.
Your platform needs:
- A way to identify all data tied to a specific caller.
- A workflow for deletion requests.
- Audit logging of deletion (for compliance).
If you don't have this from day one, build it before you have your first deletion request.
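The three requirements above can be sketched with an in-memory stand-in for your real stores. `CallerDataStore`, the caller ID scheme, and the audit entry fields are all hypothetical; a production version would span every system that holds caller data and write the audit log somewhere append-only.

```python
from datetime import datetime, timezone


class CallerDataStore:
    """In-memory stand-in for the systems that hold caller data."""

    def __init__(self):
        self.transcripts = {}  # caller_id -> list of transcript strings
        self.recordings = {}   # caller_id -> list of recording references
        self.audit_log = []    # deletion records, appended and never edited

    def find_all(self, caller_id):
        """Count every record tied to this caller across stores."""
        return {
            "transcripts": len(self.transcripts.get(caller_id, [])),
            "recordings": len(self.recordings.get(caller_id, [])),
        }

    def delete_caller(self, caller_id, request_id):
        """Delete all records for the caller and log what was removed."""
        removed = self.find_all(caller_id)
        self.transcripts.pop(caller_id, None)
        self.recordings.pop(caller_id, None)
        entry = {
            "request_id": request_id,
            "caller_id": caller_id,  # internal ID only; no PII in the audit log
            "removed": removed,
            "deleted_at": datetime.now(timezone.utc).isoformat(),
        }
        self.audit_log.append(entry)
        return entry


store = CallerDataStore()
store.transcripts["caller-42"] = ["call 1 transcript", "call 2 transcript"]
store.recordings["caller-42"] = ["rec-001.wav"]
entry = store.delete_caller("caller-42", request_id="req-7")
print(entry["removed"])
```

Recording counts rather than content in the audit entry lets you prove the deletion happened without retaining the data you were asked to delete.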
What to do when the caller volunteers sensitive info
Sometimes the caller blurts out PII you didn't ask for ("yeah, my SSN is..."). Handle gracefully:
- Don't repeat it back ("I heard your SSN was 123-45-6789" is bad).
- Don't store it unnecessarily.
- Acknowledge briefly and redirect ("got it — let me look up your account").
- Redact in logs.
The agent's prompt should include a rule about this.
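As one possible wording, the rule could live in a constant that is appended to every system prompt. The rule text and the `with_pii_rule` helper are illustrative assumptions; tune the wording against your own model and test calls.

```python
# Hypothetical rule text; adjust to your agent's voice and domain.
UNSOLICITED_PII_RULE = (
    "If the caller volunteers sensitive information you did not ask for "
    "(for example an SSN or card number), do not repeat it back. "
    "Acknowledge briefly and continue with the task."
)


def with_pii_rule(system_prompt: str) -> str:
    """Append the rule once; calling this twice does not duplicate it."""
    if UNSOLICITED_PII_RULE in system_prompt:
        return system_prompt
    return system_prompt.rstrip() + "\n\n" + UNSOLICITED_PII_RULE


prompt = with_pii_rule("You are a support agent for a utility company.")
print(prompt)
```

A prompt rule only shapes what the agent says. Redacting the caller's turn before storage still has to happen in your logging path.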
Encryption requirements
Standard:
- TLS for all data in transit.
- AES-256 for data at rest.
- Encryption keys rotated regularly.
- Audio recordings encrypted with separate keys from transcripts.
Most cloud providers (AWS, GCP, Azure) handle this if configured correctly. Verify yours is.
Health-specific PHI
If you're in healthcare, PHI (Protected Health Information) has additional rules under HIPAA:
- BAAs with every vendor that touches PHI (LLM provider, telephony, hosting, etc.).
- Minimum necessary standard (capture only what's needed).
- Patient access rights (they can request copies of their data).
- Breach notification rules.
For more, see HIPAA compliance for AI voice agents in healthcare.
Common mistakes
Three patterns:
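- Storing full credit cards in transcripts. This is major PCI exposure. Use DTMF and tokenization.
- Sending full SSNs to logging services. Even when logs are encrypted at rest, anyone with access to the logging service can search them. Redact before sending.
- Failing to honor deletion requests. "We'll get to it" doesn't satisfy GDPR, which expects a response without undue delay, generally within one month.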
Related reading
- How Large Language Models Power Voice Agents
- Designing Voice Agents That Ask Better Questions
- Open-Source vs Closed-Source LLMs for Voice Agents
- How LLMs Decide What to Say Next in a Voice Conversation
- Red-Teaming Your Voice Agent
FAQ
Do I need a BAA with every vendor? For HIPAA contexts, yes. For non-healthcare, you still need DPAs (data processing addenda) for EU/GDPR compliance.
Can I store recordings indefinitely? Technically, yes. You probably shouldn't. Most teams retain recordings for 30 to 90 days unless there's a specific reason to keep them longer.
What's the right retention policy? Match your business need and any regulatory minimum. A common starting point: 90 days for audio, 1 year for transcripts.
Can the caller request a copy of their data? Under GDPR/CCPA, yes. Build the workflow before you need it.
What happens if I have a breach? Notification requirements vary by jurisdiction. Have a plan and a legal contact ready.

Tyler Weitzman is co-founder and Head of AI at Speechify. He has spent the past decade building the speech-synthesis stack that powers millions of users. Tyler writes about the engineering of real-time conversational systems — text-to-speech, speech recognition, latency budgets, model serving, and the architectural choices that separate prototypes from production-grade voice agents.
