Models

Mix and match voice, transcription, and language models for every agent. All models can be selected from either the API or the dashboard.

Voice (text-to-speech)

Natural-sounding voices optimized for conversational latency. Clone your own or pick from the library.

simba-natural-v2
Languages: 70+
Latency: < 300 ms
Best for: General-purpose conversational agents

simba-expressive-v2
Languages: 32
Latency: < 350 ms
Best for: High-emotion outbound, marketing, entertainment

simba-fast-v1
Languages: 29
Latency: < 180 ms
Best for: Low-latency streaming, edge deployments

simba-clone-v2
Languages: 70+
Latency: < 320 ms
Best for: Custom voice clones from 30 seconds of audio
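The latency figures above can drive voice selection programmatically. The helper below is a sketch, not part of the SIMBA SDK; it simply restates the published latency ceilings and picks the most capable voice that fits a budget.

```typescript
// Hypothetical helper (not an SDK function): choose a voice id by latency
// budget, using the latency ceilings listed above.
type VoiceEntry = { id: string; maxLatencyMs: number };

// Ordered fastest-first, per the published figures.
const voices: VoiceEntry[] = [
  { id: "simba-fast-v1", maxLatencyMs: 180 },
  { id: "simba-natural-v2", maxLatencyMs: 300 },
  { id: "simba-clone-v2", maxLatencyMs: 320 },
  { id: "simba-expressive-v2", maxLatencyMs: 350 },
];

// Return the slowest (most capable) voice that still fits the budget;
// fall back to the fastest voice when nothing fits.
function pickVoice(budgetMs: number): string {
  const fits = voices.filter((v) => v.maxLatencyMs <= budgetMs);
  return fits.length > 0 ? fits[fits.length - 1].id : voices[0].id;
}
```

Reorder the list to encode your own preference, for example if clone fidelity matters more than expressiveness.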

Speech-to-text

Streaming and batch transcription tuned for telephony and noisy environments.

simba-transcribe-v1
Languages: 92
Mode: streaming
Best for: Real-time conversation transcripts

simba-transcribe-batch-v1
Languages: 92
Mode: batch
Best for: Post-call analytics, bulk processing
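Choosing between the two transcription models usually comes down to whether you need the transcript while the call is still in progress. A minimal sketch (the helper is illustrative, not an SDK function):

```typescript
// Hypothetical helper: map a use case to one of the two transcription
// models above. Live transcripts need the streaming model; post-call
// analytics and bulk jobs can go through the batch model.
function pickTranscriptionModel(realtime: boolean): string {
  return realtime ? "simba-transcribe-v1" : "simba-transcribe-batch-v1";
}
```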

Language (LLM) — bring your own

SIMBA routes to the LLM of your choice. Use hosted providers or connect your own.

gpt-4o
Languages: any
Latency: ~ 400 ms
Best for: Highest reasoning, premium pricing

gpt-4o-mini
Languages: any
Latency: ~ 220 ms
Best for: Balanced latency and cost

claude-sonnet-4-6
Languages: any
Latency: ~ 300 ms
Best for: Complex instructions, long context

claude-haiku-4-5
Languages: any
Latency: ~ 180 ms
Best for: High-throughput, low-latency workflows

gemini-2.5-flash
Languages: any
Latency: ~ 200 ms
Best for: Cost-sensitive production traffic

self-hosted
Languages: any
Latency: varies
Best for: Compliance or data-residency requirements
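One way to keep this choice out of call-handling code is a small routing table. The sketch below assumes the model ids listed above; the profile names are illustrative, not an API concept.

```typescript
// Hypothetical routing table: map a workload profile to one of the
// model ids above. Extend or re-map profiles to suit your traffic.
type LlmProfile = "reasoning" | "balanced" | "long-context" | "throughput" | "cost";

const llmByProfile: Record<LlmProfile, string> = {
  reasoning: "gpt-4o",                 // highest reasoning, premium pricing
  balanced: "gpt-4o-mini",             // balanced latency and cost
  "long-context": "claude-sonnet-4-6", // complex instructions, long context
  throughput: "claude-haiku-4-5",      // high-throughput, low-latency
  cost: "gemini-2.5-flash",            // cost-sensitive production traffic
};

function pickLlm(profile: LlmProfile): string {
  return llmByProfile[profile];
}
```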

Pick a model from the API

// "simba" here is an initialized SIMBA API client.
const agent = await simba.agents.create({
  name: "Support agent",
  conversationConfig: {
    agent: { prompt: { prompt: "..." }, llm: "gpt-4o-mini" },
    tts: { voice_id: "simba-natural-v2" },
    asr: { model: "simba-transcribe-v1" },
  },
});

See the API reference for the full config schema.