API Overview

One API across every TTS provider. OpenAI-compatible schemas, one-line provider switching.

Alpha

VoxRouter's request and response schemas mirror the OpenAI Audio API, with one addition: the model field carries a provider prefix, "{provider}/{model_id}", so a single VoxRouter key routes across every supported TTS provider. To swap providers, change that one string; client code and credentials stay the same.

Base URL: api.voxrouter.ai/v1
Auth: Bearer pk_…
Content-Type: application/json

OpenAPI specification

The machine-readable spec lives in the repo at voxrouter/router/openapi.yaml. Feed it into Swagger UI, Postman, or any OpenAPI code generator. We also use it as the source of truth for the first-party voxrouter SDK — the published TypeScript types are generated from this file on every spec change.

bash
# Fetch the spec directly from GitHub
curl -L https://raw.githubusercontent.com/voxrouter/voxrouter/main/voxrouter/router/openapi.yaml \
  -o voxrouter.openapi.yaml

Authentication

Every request carries a Bearer token in the Authorization header. There are two kinds of token, and both are sent the same way on the wire:

  • pk_* API keys — created from the console (or programmatically via POST /v1/keys). Used for the data plane: audio.speech, voices, providers, status, credits.
  • vr_session_* CLI session tokens — issued by the voxrouter login device-code flow. Used for management endpoints: auth.whoami, keys.*, billing.*, usage.
bash
curl https://api.voxrouter.ai/v1/audio/speech \
  -H "Authorization: Bearer $VOXROUTER_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"model":"elevenlabs/eleven_turbo_v2_5","voice":"EXAVITQu4vr4xnSDxMaL","input":"hi"}'

A 401 Unauthorized response means the key is missing or invalid. A 429 Too Many Requests response means you've hit a per-key limit; see Rate limits.
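Because the two token kinds share one header format, a client can tell them apart only by prefix. The sketch below shows one way to do that; `tokenKind` and `authHeader` are hypothetical helper names, not part of the SDK, and the return labels simply reuse the api_key / session wording that /v1/auth/whoami reports.

```typescript
type TokenKind = "api_key" | "session";

/** Classify a token by its documented prefix; returns null for anything else. */
function tokenKind(token: string): TokenKind | null {
  if (token.startsWith("pk_")) return "api_key";
  if (token.startsWith("vr_session_")) return "session";
  return null;
}

/** Build the Authorization header, which has the same form for both token kinds. */
function authHeader(token: string): { Authorization: string } {
  return { Authorization: `Bearer ${token}` };
}
```

A CLI could use a check like this to fail early when a pk_* key is passed to a management command that needs a session token.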

Requests

The router exposes endpoints across six areas: speech synthesis, catalog discovery, wallet inspection, authentication, API-key management, and billing. Each is documented in the API Reference sidebar; the prose below is a quick orientation.

POST /v1/audio/speech

Synthesize speech from a text input. The response body is raw audio — audio/mpeg for response_format: "mp3", audio/l16 (16-bit LE PCM @ 24 kHz) for "pcm".

typescript
// Request body (application/json)
type SpeechRequest = {
  /** Provider-prefixed model id, e.g. "elevenlabs/eleven_turbo_v2_5". */
  model: string;
  /** Text to synthesize. */
  input: string;
  /** Provider-local voice id. Use GET /v1/voices to discover. */
  voice: string;
  /** Output encoding. Defaults to "mp3". */
  response_format?: "mp3" | "pcm";
  /** Passthrough provider-specific options. */
  provider_options?: Record<string, unknown>;
};
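As a rough illustration of how the request body maps onto an HTTP call, the sketch below builds the fetch arguments and returns the audio bytes. `buildSpeechCall` and `synthesize` are hypothetical names, the SDK's own client may look different, and error handling is reduced to a thrown Error.

```typescript
// Mirrors the SpeechRequest shape documented above.
type SpeechRequest = {
  model: string;
  input: string;
  voice: string;
  response_format?: "mp3" | "pcm";
  provider_options?: Record<string, unknown>;
};

const BASE_URL = "https://api.voxrouter.ai/v1";

/** Build the URL and fetch options for a speech request (hypothetical helper). */
function buildSpeechCall(apiKey: string, body: SpeechRequest) {
  return {
    url: `${BASE_URL}/audio/speech`,
    init: {
      method: "POST",
      headers: {
        Authorization: `Bearer ${apiKey}`,
        "Content-Type": "application/json",
      },
      body: JSON.stringify(body),
    },
  };
}

/** Send the request and return the raw audio bytes; throws on any non-2xx status. */
async function synthesize(apiKey: string, body: SpeechRequest): Promise<ArrayBuffer> {
  const { url, init } = buildSpeechCall(apiKey, body);
  const res = await fetch(url, init);
  if (!res.ok) throw new Error(`speech request failed: ${res.status}`);
  return res.arrayBuffer();
}
```

Keeping the request construction separate from the network call makes the body easy to inspect or log before anything is sent.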

GET /v1/voices

Return the voice catalog across every configured provider. Filter with query params:

typescript
// Query params
type VoicesQuery = {
  /** Comma-separated provider list, e.g. "elevenlabs,cartesia". */
  provider?: string;
  /** ISO language prefix (case-insensitive), e.g. "en" or "en-US". */
  language?: string;
  /** Exact gender label (case-insensitive), e.g. "female". */
  gender?: string;
};
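The filters are ordinary query parameters, so a URL can be assembled with the standard URL class. This is a minimal sketch; `voicesUrl` is a hypothetical helper, and it assumes the server accepts a percent-encoded comma in the provider list, which is what URLSearchParams produces.

```typescript
type VoicesQuery = {
  provider?: string;
  language?: string;
  gender?: string;
};

/** Build a GET /v1/voices URL, skipping filters that are unset or empty. */
function voicesUrl(query: VoicesQuery = {}): string {
  const url = new URL("https://api.voxrouter.ai/v1/voices");
  for (const [key, value] of Object.entries(query)) {
    if (value !== undefined && value !== "") {
      url.searchParams.set(key, value);
    }
  }
  return url.toString();
}
```

For example, `voicesUrl({ provider: "elevenlabs,cartesia", language: "en" })` yields a URL whose provider parameter decodes back to the comma-separated list.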

GET /v1/providers

Return the catalog of routable providers and the models each exposes. Near-static — safe to cache for hours. For live availability use /v1/status.

GET /v1/status

Per-provider live health (available / degraded / unavailable), plus the reason when a provider is not available (missing_api_key, circuit_open, circuit_half_open). Cheap to poll.

GET /v1/credits

Wallet snapshot for the authenticated key's account: balanceMicros (available credit) and reservedMicros (in-flight reservations). Both in USD micro-dollars (1_000_000 = $1).
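Micro-dollar amounts are easiest to handle as integers and convert only for display. The helper below is a sketch of that conversion; `microsToUsd` is a hypothetical name, and it assumes the API returns whole-number micros that fit in a JavaScript number.

```typescript
const MICROS_PER_USD = 1_000_000;

/** Format an integer micro-dollar amount as a USD string with six decimals. */
function microsToUsd(micros: number): string {
  const sign = micros < 0 ? "-" : "";
  const abs = Math.abs(micros);
  const dollars = Math.floor(abs / MICROS_PER_USD);
  // Integer remainder, left-padded so 1_250 micros prints as .001250
  const fraction = String(abs % MICROS_PER_USD).padStart(6, "0");
  return `${sign}$${dollars}.${fraction}`;
}
```

Formatting from integer parts avoids the rounding surprises that come from dividing by 1_000_000 and calling toFixed on the floating-point result.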

GET /v1/credits/activity

Recent ledger entries for the wallet (newest first). Each row records a wallet mutation (top-up, reserve, commit, refund) with the signed microsDelta and resulting microsBalanceAfter.

POST /v1/auth/device-code + POST /v1/auth/poll

Drive the voxrouter login device-code flow from your own CLI. device-code issues a pair (device_code + human-readable user_code shown in the terminal); poll exchanges the device code for a vr_session_* token after the user approves in the browser. Mirrors RFC 8628.
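RFC 8628 defines how a client should react to each polling error, and the sketch below encodes those rules as a pure function. The error strings come from section 3.5 of the RFC; that /v1/auth/poll reports them with these exact codes is an assumption, and `nextPollStep` is a hypothetical helper.

```typescript
// Error codes defined by RFC 8628, section 3.5.
// Assumption: /v1/auth/poll signals pending and slow-down states with these same strings.
type PollError = "authorization_pending" | "slow_down" | "access_denied" | "expired_token";

type PollStep = { keepPolling: boolean; intervalSeconds: number };

/** Decide whether to poll again, and how long to wait, after an error response. */
function nextPollStep(intervalSeconds: number, error: PollError): PollStep {
  switch (error) {
    case "authorization_pending":
      // The user has not approved yet: keep polling at the same pace.
      return { keepPolling: true, intervalSeconds };
    case "slow_down":
      // The RFC requires adding 5 seconds to the interval from now on.
      return { keepPolling: true, intervalSeconds: intervalSeconds + 5 };
    case "access_denied":
    case "expired_token":
      // Terminal states: stop and restart the flow if needed.
      return { keepPolling: false, intervalSeconds };
  }
}
```

A CLI loop would call the poll endpoint, sleep for intervalSeconds, and stop as soon as it receives a token or keepPolling turns false.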

GET /v1/auth/whoami

Identify the calling token. Accepts both pk_* and vr_session_*. The auth field on the response tells the two apart: api_key for a pk_* key, session for a vr_session_* token.

POST /v1/auth/logout

Revoke a vr_session_* token. Idempotent — returns {revoked: false} for already-revoked or unknown tokens.

GET /v1/keys / POST /v1/keys / DELETE /v1/keys/{id}

Manage API keys programmatically. GET lists the caller's organization's active keys (TEAM-scoped). POST mints a new pk_* key — the response carries the secret value, returned only at creation. DELETE revokes by id (CREATOR-scoped: only the user who minted the key can revoke it). Requires a vr_session_* token; pk_* keys cannot mint other pk_* keys.

GET /v1/billing/methods + POST /v1/billing/topup

methods returns the caller's organization's saved Stripe payment methods (up to 5). topup charges a saved card off-session and credits the wallet. Pass one idempotencyKey per top-up attempt and reuse it on retries, so that a retry after a network blip cannot charge the card twice. The endpoint returns stripe_3ds_required (402) when the card needs 3DS and card_declined (402) on a Stripe-side decline.

GET /v1/usage

Aggregated usage breakdown for the caller's organization over the recent window — totals, per-provider rollup, error-code rollup, and a tail of recent rows. Same shape the dashboard's /app/usage page consumes.

Model strings

The model field always uses the "provider/model_id" shape. The part before the slash picks the provider; the part after is the provider-native model id (passed through unchanged).

text
elevenlabs/eleven_turbo_v2_5
cartesia/sonic-2
openai/gpt-4o-mini-tts
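The split rule can be sketched in a few lines. `parseModel` is a hypothetical helper; it splits on the first slash only, on the assumption that a provider-native model id might itself contain a slash.

```typescript
/** Split "provider/model_id" into its two parts; throws if either part is missing. */
function parseModel(model: string): { provider: string; modelId: string } {
  const slash = model.indexOf("/");
  if (slash <= 0 || slash === model.length - 1) {
    throw new Error(`invalid model string: ${model}`);
  }
  return {
    provider: model.slice(0, slash),
    // Everything after the first slash is passed through unchanged.
    modelId: model.slice(slash + 1),
  };
}
```

Validating the shape on the client catches a missing prefix before the request is sent, instead of waiting for a 400 invalid_model response.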

Responses

Successful POST /v1/audio/speech returns the raw audio stream. The provider that served the request is in the X-VoxRouter-Provider response header. Successful GET /v1/voices returns a JSON object with a voices array.

typescript
// Voice catalog response
type VoicesResponse = {
  voices: Array<{
    id: string;
    provider: string;
    name: string;
    language: string;
    labels: Record<string, string>;
    preview_url?: string;
    model_compatibility: string[];
  }>;
};

Errors

Non-2xx responses return a JSON error body with a machine-readable error code and an optional human-readable details field. The first-party SDK surfaces these as VoxRouterError with .status, .code, and .details.

json
{
  "error": "invalid_model",
  "details": "unknown_provider: bad"
}
Status  Code                  Meaning
400     invalid_body          JSON body failed schema validation
400     invalid_model         Malformed model string or unknown provider
401     unauthorized          Missing or invalid API key
402     insufficient_credit   Wallet does not have enough credit to cover the estimated cost. Top up and retry.
402     spend_limit_exceeded  The API key tripped its per-key daily or monthly spend cap.
429     rate_limited          Per-key rate limit exceeded. Retry-After header indicates seconds to wait.
429     concurrency_limited   Too many in-flight requests for this key. Slots free on completion.
500     internal_error        Unexpected server error.
502     upstream_error        Provider returned an unrecoverable error after automatic retries.
503     provider_unavailable  Provider's circuit-breaker is open. Retry-After indicates expected reset.
504     upstream_error        Provider did not respond within the per-attempt deadline.
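When calling the API without the SDK, the error body can be checked with a small type guard before it is trusted. The sketch below is one way to do that; `isErrorBody` and `isRetryable` are hypothetical helpers, and the set of retryable codes is a judgment call based on the table's wording, not an official classification.

```typescript
type ErrorBody = { error: string; details?: string };

/** Narrow an unknown JSON value to the documented error shape. */
function isErrorBody(value: unknown): value is ErrorBody {
  if (typeof value !== "object" || value === null) return false;
  const v = value as Record<string, unknown>;
  return (
    typeof v.error === "string" &&
    (v.details === undefined || typeof v.details === "string")
  );
}

// Assumption: these codes describe transient conditions, so the same request may
// succeed later. The other codes need a change to the request, key, or wallet.
const RETRYABLE_CODES = new Set([
  "rate_limited",
  "concurrency_limited",
  "upstream_error",
  "provider_unavailable",
]);

/** True when retrying the same request later is a reasonable next step. */
function isRetryable(code: string): boolean {
  return RETRYABLE_CODES.has(code);
}
```

A caller would parse the response text as JSON, run it through isErrorBody, and fall back to the bare status code if the body does not match.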

Rate limits

Requests are rate-limited per API key. When you exceed the limit, the router returns 429 with {"error":"rate_limited"}. Retry with backoff, waiting at least the number of seconds given in the Retry-After header. Concrete per-key limits are not yet published; reach out if you need a higher ceiling.
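One way to choose the wait is to honor Retry-After when it is present and fall back to capped exponential backoff when it is not. The function below is a sketch of that policy; `retryDelayMs` and its default base and cap values are illustrative assumptions, not documented limits.

```typescript
/**
 * Compute how long to wait before retry number `attempt` (0-based).
 * `retryAfter` is the raw Retry-After header value, or null if absent.
 */
function retryDelayMs(
  retryAfter: string | null,
  attempt: number,
  baseMs = 500,
  capMs = 30_000,
): number {
  if (retryAfter !== null && retryAfter.trim() !== "") {
    const seconds = Number(retryAfter);
    // Use the server's hint when it is a valid, non-negative number of seconds.
    if (Number.isFinite(seconds) && seconds >= 0) return seconds * 1000;
  }
  // No usable hint: double the delay on each attempt, up to the cap.
  return Math.min(capMs, baseMs * 2 ** attempt);
}
```

In practice a little random jitter is often added to the fallback delay so that many clients do not retry at the same instant.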

Streaming

POST /v1/audio/speech returns the audio body as a chunked HTTP response. In the SDK, use audio.speech.createRaw(…) to get the raw Response and read .body as a ReadableStream. With plain fetch, either read response.body as a stream or buffer the whole clip with response.blob(); see the Quickstart.
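Reading the stream chunk by chunk follows the standard ReadableStream reader pattern. The sketch below collects every chunk into one buffer; `collectAudio` is a hypothetical helper, and a real player would more likely hand each chunk to a decoder as it arrives.

```typescript
/** Read a byte stream to the end and return all chunks joined into one array. */
async function collectAudio(body: ReadableStream<Uint8Array>): Promise<Uint8Array> {
  const reader = body.getReader();
  const chunks: Uint8Array[] = [];
  let total = 0;
  for (;;) {
    const result = await reader.read();
    if (result.done) break;
    chunks.push(result.value);
    total += result.value.length;
  }
  // Copy the chunks into a single contiguous buffer.
  const out = new Uint8Array(total);
  let offset = 0;
  for (const chunk of chunks) {
    out.set(chunk, offset);
    offset += chunk.length;
  }
  return out;
}
```

With fetch, response.body can be null, so check it before passing it to a helper like this one.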