Text To Speech
Text-to-speech (TTS) is billed by the duration of the generated audio, measured in seconds. Each request is billed for a minimum of one second, which covers system overhead. Beyond that minimum, you pay only for the audio you generate, so costs scale with usage and stay predictable.
Model         Cost per second of audio
Chatterbox    $0.0030
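As a rough illustration of how the per-second rate and the one-second minimum combine, the sketch below estimates the charge for a single request at the Chatterbox rate listed above. The function name `estimate_tts_cost` is hypothetical, and the sketch assumes that durations above the minimum are billed as-is; the page does not say whether fractional seconds are rounded.

```python
from decimal import Decimal

PRICE_PER_SECOND = Decimal("0.0030")  # Chatterbox rate in USD, from the pricing above
MIN_BILLED_SECONDS = Decimal("1")     # one-second minimum per request


def estimate_tts_cost(duration_seconds):
    """Estimate the USD charge for one TTS request (hypothetical helper)."""
    # Convert via str so float inputs such as 0.4 become exact decimals.
    duration = Decimal(str(duration_seconds))
    if duration < 0:
        raise ValueError("duration must be non-negative")
    # Assumption: no rounding is applied above the one-second minimum.
    billed_seconds = max(duration, MIN_BILLED_SECONDS)
    return billed_seconds * PRICE_PER_SECOND
```

Under these assumptions, a 0.4-second clip is billed as one second ($0.0030), and a 12.5-second clip comes to $0.0375.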
Stay tuned for more speech models.
Pricing subject to change.
Last updated