API

OpenAI-compatible. Every model.

Point your existing OpenAI SDK at https://api.yout.chat/v1 with your Yout.chat key and every major model is yours. One credit pool, one bill, no per-model contracts.

Base URL: https://api.yout.chat/v1 · Auth: Authorization: Bearer <your-key>

Drop-in replacement

If your code already talks to the OpenAI API, you only change two things: the API key and the base URL.

from openai import OpenAI

client = OpenAI(
    api_key="yout-...",                  # ← from Account → API keys
    base_url="https://api.yout.chat/v1",       # ← that's it
)

resp = client.chat.completions.create(
    model="anthropic-claude-sonnet-46",
    messages=[
        {"role": "user", "content": "Write a haiku about caching."},
    ],
)
print(resp.choices[0].message.content)

Works unchanged with OpenAI Python SDK, OpenAI Node SDK, LangChain, LlamaIndex, Vercel AI SDK, LiteLLM, and anything else that speaks the OpenAI shape.

Endpoints

All endpoints accept Authorization: Bearer <key>. Session cookies work for the web app.

POST /v1/chat/completions

OpenAI-compatible. Pass "stream": true to receive the response as server-sent events (SSE). Credits are debited atomically.

Request body
{
  "model": "anthropic-claude-sonnet-46",
  "messages": [
    {"role": "system", "content": "You are helpful."},
    {"role": "user",   "content": "What is 2+2?"}
  ],
  "stream": false,
  "temperature": 0.7,
  "max_tokens": 512
}
Response (non-stream)
{
  "id": "chatcmpl-...",
  "object": "chat.completion",
  "model": "anthropic-claude-sonnet-46",
  "choices": [{"index":0,"message":{"role":"assistant","content":"Four."},"finish_reason":"stop"}],
  "usage": {"prompt_tokens": 15, "completion_tokens": 2, "total_tokens": 17},
  "yout": {"credits_charged": 10, "balance": 14999990}
}
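For streamed responses, the sketch below shows one way to reassemble the text from raw SSE lines. It assumes the stream follows OpenAI's chunk shape (a `data: {...}` line per event with `choices[].delta.content`, ended by `data: [DONE]`); this page only promises OpenAI compatibility, so treat the wire format and the sample lines as assumptions. The helper name `iter_sse_text` is hypothetical.

```python
import json
from typing import Iterable, Iterator


def iter_sse_text(lines: Iterable[str]) -> Iterator[str]:
    """Yield content fragments from OpenAI-style SSE lines.

    Assumes each event is a `data: {...}` line carrying a chunk object
    and that `data: [DONE]` marks the end of the stream.
    """
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # skip blank separators and non-data fields
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":
            break
        chunk = json.loads(payload)
        for choice in chunk.get("choices", []):
            text = choice.get("delta", {}).get("content")
            if text:
                yield text


# Hypothetical sample of the assumed wire format, not captured output.
sample = [
    'data: {"choices":[{"index":0,"delta":{"role":"assistant"}}]}',
    "",
    'data: {"choices":[{"index":0,"delta":{"content":"Fo"}}]}',
    'data: {"choices":[{"index":0,"delta":{"content":"ur."}}]}',
    "data: [DONE]",
]
print("".join(iter_sse_text(sample)))
```

If you use the OpenAI Python SDK, passing stream=True to client.chat.completions.create and iterating over the result handles this parsing for you.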

GET /v1/models

OpenAI-compatible model list. Every active model with id, context_window, and per-token credit cost.
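As a rough illustration, the list can be filtered client-side once fetched. The sketch assumes the OpenAI list shape ({"object": "list", "data": [...]}) and that each entry carries id and context_window as described above; the model ids and values in the sample are made up, and `models_with_context` is a hypothetical helper.

```python
from typing import Any


def models_with_context(listing: dict[str, Any], min_tokens: int) -> list[str]:
    """Return ids of models whose context_window is at least min_tokens."""
    return sorted(
        m["id"]
        for m in listing.get("data", [])
        if m.get("context_window", 0) >= min_tokens
    )


# Hypothetical response body; ids and sizes are illustrative only.
listing = {
    "object": "list",
    "data": [
        {"id": "example-large-model", "context_window": 200000},
        {"id": "example-small-model", "context_window": 8192},
    ],
}
print(models_with_context(listing, 100000))
```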

POST /api/chat/stream/

Native streaming endpoint — simpler JSON shape than OpenAI's, used by the web app. Prefer /v1/chat/completions for third-party SDK compatibility.

GET /api/chat/models/

Richer model catalog with task tags, modality, pro-only flag. Use this in your own UI.

GET /api/chat/limit/

Your current quota — credits balance + plan status, or anon daily allowance.

GET /api/chat/keys/

List your active API keys (masked).

POST /api/chat/keys/

Create a named API key. Full token is returned once — save it.

Request body
{"name": "production"}

POST /api/chat/keys/<id>/revoke/

Revoke a key. Existing sessions using it get 401 on the next call.

POST /api/media/image/

Image generation job. Returns job_uuid; poll /api/media/jobs/<uuid>/.

POST /api/media/video/

Video generation job. A 5-second Veo clip costs roughly 1.2M credits. Credits are refunded automatically on failure.

POST /api/media/speech/

Text-to-speech. Returns job_uuid; output is a signed audio URL once complete.

GET /api/media/jobs/<uuid>/

Poll a generation job: status, output_url, error, credits_cost.
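A polling loop for media jobs might look like the sketch below. It relies only on the output_url and error fields named above, and assumes output_url stays empty until the job completes and error stays empty unless it fails; the status strings in the demo are placeholders, since this page does not list the real values. `wait_for_job` and `fetch_job` are hypothetical names, with `fetch_job` standing in for your own GET request to /api/media/jobs/<uuid>/.

```python
import time
from typing import Any, Callable


def wait_for_job(
    fetch_job: Callable[[], dict[str, Any]],
    interval: float = 2.0,
    timeout: float = 300.0,
    sleep: Callable[[float], None] = time.sleep,
) -> dict[str, Any]:
    """Poll until the job reports an output_url or an error.

    fetch_job should perform the GET and return the parsed JSON body.
    """
    deadline = time.monotonic() + timeout
    while True:
        job = fetch_job()
        if job.get("error"):
            raise RuntimeError(f"job failed: {job['error']}")
        if job.get("output_url"):
            return job
        if time.monotonic() >= deadline:
            raise TimeoutError("job did not finish before the timeout")
        sleep(interval)


# Simulated responses standing in for real HTTP calls.
responses = iter([
    {"status": "pending", "output_url": None, "error": None},
    {"status": "done", "output_url": "https://example.com/out.png",
     "error": None, "credits_cost": 100},
])
job = wait_for_job(lambda: next(responses), sleep=lambda seconds: None)
print(job["output_url"])
```

Injecting the fetch and sleep callables keeps the loop independent of any particular HTTP client and easy to exercise without network access.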

POST /api/chat/share/

Create a public read-only snapshot. Returns a /c/<slug> URL.

POST /api/chat/purge/

Wipe your conversations, messages, and usage ledger server-side.

Billing

API usage pulls from the same credit pool as the web app. No separate API billing.

Per-token pricing

Text models charge credits_per_1k_input × (input tokens ÷ 1,000) + credits_per_1k_output × (output tokens ÷ 1,000). Media models charge a flat credits_per_call.

Minimum charge

Every successful call debits at least 10 credits to cover overhead, regardless of token count.
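To budget a text call ahead of time, the two rules above can be combined as in this sketch. It assumes the per_1k rates apply per 1,000 tokens and that fractional credits round up; the rounding rule and the rates used in the demo are assumptions, not published pricing. `estimate_text_credits` is a hypothetical helper.

```python
import math

MIN_CREDITS = 10  # minimum debit per successful call, as stated above


def estimate_text_credits(
    input_tokens: int,
    output_tokens: int,
    credits_per_1k_input: float,
    credits_per_1k_output: float,
) -> int:
    """Estimate the credits debited for one text completion."""
    raw = (
        input_tokens * credits_per_1k_input
        + output_tokens * credits_per_1k_output
    ) / 1000
    # Rounding up is an assumption; the minimum charge comes from the docs.
    return max(MIN_CREDITS, math.ceil(raw))


# Hypothetical rates: 100 credits per 1k input, 500 per 1k output.
print(estimate_text_credits(15, 2, 100, 500))        # tiny call hits the minimum
print(estimate_text_credits(10_000, 2_000, 100, 500))
```

With these made-up rates, the 17-token call falls under the floor and is billed the 10-credit minimum, while the larger call is billed by tokens.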

Failure refund

Image, video, and speech jobs that fail upstream are refunded automatically. A text stream that fails midway is billed only for the tokens already produced.

Overdraft

Calls return 402 insufficient_credits when your balance can't cover the estimated cost. Plans are not throttled as long as plan_active is true and balance is positive.
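A 402 is not worth retrying until the balance changes, so it helps to surface it separately from transient errors. The sketch below uses only the standard library and checks the HTTP status code; the shape of the error body is not documented here, so it is not parsed. `call_with_credit_check` and `fake_send` are hypothetical names.

```python
import io
import urllib.error


def call_with_credit_check(send):
    """Run send(); turn an HTTP 402 into a clear, non-retryable error."""
    try:
        return send()
    except urllib.error.HTTPError as err:
        if err.code == 402:
            raise RuntimeError(
                "insufficient credits: top up before retrying"
            ) from err
        raise  # other HTTP errors are left for the caller to handle


def fake_send():
    # Simulates the API rejecting a call; no network request is made.
    raise urllib.error.HTTPError(
        "https://api.yout.chat/v1/chat/completions",
        402, "Payment Required", None, io.BytesIO(b""),
    )


try:
    call_with_credit_check(fake_send)
except RuntimeError as exc:
    print(exc)
```

With the OpenAI Python SDK, the equivalent check would be on the status_code attribute of openai.APIStatusError.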

Authentication & privacy

  • Bearer token — per user, found on your Account page. Rotate by deleting + recreating your account (proper rotate endpoint on the roadmap).
  • No user identifiers forwarded — we proxy to providers without your email, IP, user agent, or account ID. Generic HTTP-Referer: yout.chat only.
  • no-log signal — sent upstream so model providers are asked not to log/train on your prompts where their contracts support it.
  • No message content persisted by default — we write a UsageLedger row (tokens + credits + model) but not the prompt or response.

Roadmap

  • OpenAI-compatible /v1/chat/completions shim so existing SDKs work unchanged
  • Webhook callbacks for long-running media jobs (replace polling)
  • Token rotation + per-token scopes
  • Idempotency keys on /stream/ + retry-safe job creation
  • Self-serve rate-limit controls

Ready to ship?

Free tier gives you 50K credits to play with. Upgrade when you need more.