How to use the API
A REST API that generates images, audio, and video on local GPUs. One API key, OpenAI-compatible request shapes. Text-to-speech returns the first audio bytes in about 100 ms, images take a few seconds, and video runs as an async job. Every response carries a timing block so you can verify the speed yourself.
How to use (4 steps)
- 1. Sign in / register: Sign in to your account first. If you do not have one, registration is free.
- 2. Issue an API key: On the API Management page, enter a project name and click Issue. The plaintext key is shown only once, at creation, so copy it somewhere safe (only a hash is stored in the DB).
- 3. Your first request: Call one of the endpoints below with Authorization: Bearer <your key>.
- 4. Handle the response / operate: Image and audio return base64; video returns a job id to poll. Check the timing block for speed. Revoke unused keys, and revoke and reissue if a key leaks.
Latency (RTX PRO Blackwell)
| Endpoint | Mode | Speed |
|---|---|---|
| Audio /audio/generations | sync | first audio ~85–120 ms / under 1 s for short text |
| Image /images/generations | sync | ~4 s (1024², 20 steps) |
| Video /videos/generations | async job | ~60–90 s (poll the job id) |
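One way to check the timing block against your own measurement is to time the call on the client and subtract the server-reported total_ms. The sketch below assumes that total_ms is server-side processing time in milliseconds. The helper name and the keys of the returned dictionary are illustrative and not part of the API.

```python
import time
from typing import Callable


def compare_timing(send: Callable[[], dict]) -> dict:
    """Call send() and compare wall-clock time with the response's timing block.

    send is any zero-argument function that performs one API call and returns
    the parsed JSON body as a dict.
    """
    start = time.perf_counter()
    response = send()
    wall_ms = (time.perf_counter() - start) * 1000.0

    # total_ms is read from the documented timing block; it may be absent.
    server_ms = response.get("timing", {}).get("total_ms")
    # Assumption: the difference is mostly network transfer and queueing.
    overhead_ms = None if server_ms is None else wall_ms - server_ms
    return {"wall_ms": wall_ms, "server_ms": server_ms, "overhead_ms": overhead_ms}
```

A large overhead_ms relative to server_ms would suggest that the network path, rather than generation, dominates the latency you observe.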
Authentication
Send your API key as a Bearer token on every request. Issue keys from the API Management page (the plaintext is shown only once at creation).
Authorization: Bearer kotonia_xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
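As a minimal sketch of the same header in code, the helper below builds a POST request with the Bearer token and a JSON body using only the Python standard library. The base URL is taken from the curl examples in this document; the function name build_request is hypothetical.

```python
import json
import urllib.request

# Base URL as it appears in the curl examples of this document.
BASE_URL = "https://kotonia.ai/api/v1"


def build_request(path: str, payload: dict, api_key: str) -> urllib.request.Request:
    """Build (but do not send) a POST request with Bearer auth and a JSON body."""
    if not api_key:
        raise ValueError("API key is empty; issue one on the API Management page")
    return urllib.request.Request(
        url=f"{BASE_URL}{path}",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

The request can then be sent with urllib.request.urlopen(req). Reading the key from an environment variable such as KOTONIA_API_KEY, as the curl examples do, keeps it out of source files.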
Image API
curl -X POST https://kotonia.ai/api/v1/images/generations \
-H "Authorization: Bearer $KOTONIA_API_KEY" \
-H "Content-Type: application/json" \
-d '{
"prompt": "a serene japanese garden at dawn, soft light",
"size": "1024x1024",
"steps": 20
}'
# → { "data": [ { "b64_json": "<PNG base64>" } ], "timing": { "total_ms": 4200 } }
Optional: seed, guidance_scale, shift, ref_image (base64, edit mode). Limits: prompt ≤ 4000 chars, size 256–2048 per side (out-of-range returns 400).
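The limits and the response shape above can be handled on the client roughly as follows. This is a sketch under assumptions: it treats the 256–2048 range as inclusive and the size string as WIDTHxHEIGHT, and both function names are hypothetical. Checking limits locally only avoids a round trip that would end in a 400; the server remains the authority.

```python
import base64


def validate_image_params(prompt: str, size: str) -> None:
    """Raise ValueError if the request would break the documented limits."""
    if not prompt:
        raise ValueError("prompt is required")
    if len(prompt) > 4000:
        raise ValueError("prompt must be at most 4000 characters")
    # Assumption: size is "WIDTHxHEIGHT" and the 256-2048 range is inclusive.
    width, height = (int(part) for part in size.split("x"))
    for side in (width, height):
        if not 256 <= side <= 2048:
            raise ValueError(f"side {side} is outside 256-2048")


def decode_image(response: dict) -> bytes:
    """Return the PNG bytes from the first entry of the response's data list."""
    return base64.b64decode(response["data"][0]["b64_json"])
```

Writing the returned bytes to a file opened in binary mode yields the PNG.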
Audio API
curl -X POST https://kotonia.ai/api/v1/audio/generations \
-H "Authorization: Bearer $KOTONIA_API_KEY" \
-H "Content-Type: application/json" \
-d '{
"input": "Hello, what a lovely day.",
"engine": "qwen3",
"language": "en"
}'
# → { "audio": { "b64": "<WAV base64>", "format": "wav", "sample_rate": 24000 },
#     "timing": { "first_byte_ms": 92, "total_ms": 480 } }
engine: qwen3 (default, multilingual) / irodori / voicevox. Optional: voice, speed, instruct. Limits: input ≤ 4000 chars.
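To illustrate reading the audio response, the sketch below decodes the base64 WAV, measures its duration with the standard wave module, and picks out the timing fields. It assumes the returned WAV is plain PCM, which is what the wave module reads; the function name summarize_audio and the keys of its result are hypothetical.

```python
import base64
import io
import wave


def summarize_audio(response: dict) -> dict:
    """Decode the WAV payload and collect duration and timing information."""
    audio = response["audio"]
    wav_bytes = base64.b64decode(audio["b64"])

    # Assumption: the payload is PCM WAV, which the wave module can parse.
    with wave.open(io.BytesIO(wav_bytes), "rb") as wav:
        duration_s = wav.getnframes() / wav.getframerate()

    timing = response.get("timing", {})
    return {
        "wav_bytes": wav_bytes,
        "sample_rate": audio["sample_rate"],
        "duration_s": duration_s,
        "first_byte_ms": timing.get("first_byte_ms"),
        "total_ms": timing.get("total_ms"),
    }
```

Comparing duration_s with total_ms gives a rough sense of how much faster than real time the synthesis ran.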
Video API (async)
# 1) submit job
curl -X POST https://kotonia.ai/api/v1/videos/generations \
-H "Authorization: Bearer $KOTONIA_API_KEY" \
-H "Content-Type: application/json" \
-d '{ "prompt": "a cat walking through neon-lit streets", "width": 768, "height": 512 }'
# → { "id": "<job_id>", "status": "queued", "poll_url": "/api/v1/videos/generations/<job_id>" }
# 2) poll
curl https://kotonia.ai/api/v1/videos/generations/<job_id> \
-H "Authorization: Bearer $KOTONIA_API_KEY"
# → { "status": "completed", "data": [ { "url": "/api/ltx/video?path=..." } ] }
Optional: image (base64, I2V), audio (base64, A2V lip-sync), num_frames. Limits: prompt ≤ 4000 chars, width/height 256–1280 per side, num_frames ≤ 200 (out-of-range returns 400).
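The submit-then-poll flow might be wrapped in a loop like the one below. It is a sketch, not a reference client: only the queued and completed statuses appear in this document, so the failure statuses checked here are hypothetical, as are the function names, the 5-second interval, and the 300-second timeout.

```python
import json
import time
import urllib.request
from typing import Callable

BASE_URL = "https://kotonia.ai/api/v1"  # from the curl examples in this document


def fetch_job(job_id: str, api_key: str) -> dict:
    """GET the job status once and return the parsed JSON body."""
    req = urllib.request.Request(
        f"{BASE_URL}/videos/generations/{job_id}",
        headers={"Authorization": f"Bearer {api_key}"},
    )
    with urllib.request.urlopen(req, timeout=30) as resp:
        return json.load(resp)


def wait_for_video(
    fetch: Callable[[], dict],
    interval_s: float = 5.0,
    timeout_s: float = 300.0,
    sleep: Callable[[float], None] = time.sleep,
) -> dict:
    """Call fetch() until the job reports "completed" or the timeout passes."""
    deadline = time.monotonic() + timeout_s
    while True:
        job = fetch()
        status = job.get("status")
        if status == "completed":
            return job
        # Hypothetical failure statuses; the document does not list any.
        if status in ("failed", "error"):
            raise RuntimeError(f"video job ended with status {status!r}")
        if time.monotonic() >= deadline:
            raise TimeoutError("video job did not complete in time")
        sleep(interval_s)
```

A typical call would be wait_for_video(lambda: fetch_job(job_id, api_key)). Passing fetch and sleep as arguments keeps the loop testable without network access.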
Response codes
| Code | Meaning |
|---|---|
| 200 | Success. Image/audio return body (base64); video returns a job id. |
| 400 | Bad request. Missing or invalid params (e.g. missing prompt / input, bad base64). |
| 401 | Unauthorized. API key missing or invalid (check the Authorization header). |
| 429 | Too Many Requests. Free-tier daily limit exceeded. Resets at JST midnight. |
| 503 | Service unavailable. Generation server temporarily down or busy; retry later. |
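A client could map these codes to actions along the following lines. The action labels and function names are illustrative only. The reset calculation assumes JST is a fixed UTC+9 offset and that the quota resets at 00:00 JST, as the 429 entry states.

```python
from datetime import datetime, timedelta, timezone

JST = timezone(timedelta(hours=9))  # Japan Standard Time, fixed UTC+9


def next_action(status_code: int) -> str:
    """Translate a documented status code into a suggested client action."""
    actions = {
        200: "ok",
        400: "fix_request",          # missing or invalid parameters
        401: "check_api_key",        # missing or invalid Authorization header
        429: "wait_for_quota_reset", # free-tier daily limit exceeded
        503: "retry_later",          # generation server down or busy
    }
    return actions.get(status_code, "unexpected")


def seconds_until_jst_midnight(now: datetime) -> float:
    """Seconds from a timezone-aware 'now' until the next 00:00 in JST."""
    now_jst = now.astimezone(JST)
    next_midnight = (now_jst + timedelta(days=1)).replace(
        hour=0, minute=0, second=0, microsecond=0
    )
    return (next_midnight - now_jst).total_seconds()
```

For a 429, sleeping for seconds_until_jst_midnight(datetime.now(timezone.utc)) avoids repeated calls that cannot succeed before the reset; for a 503, a short retry with increasing delays is the more usual choice.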