We have just added twelve new LLM chat models to YepAPI. This is the biggest model refresh of the quarter: the newest flagships from OpenAI (GPT-5.5 and GPT-5.5 Pro), Anthropic (Claude Opus 4.8, Opus 4.8 Fast, and the creative-first Claude Fable 5), Google (Gemini 3.5 Flash), and xAI (Grok 4.3), plus high-value additions GLM 5.2, MiniMax M3, Qwen3.7 Max, Kimi K2.7 Code, and Mistral Medium 3.5. Every one is callable from the same `/v1/ai/chat` endpoint you already use (or the OpenAI-compatible `/v1/ai/chat/completions`), under the same key, with pay-per-token pricing and no waitlist.
What's new
- OpenAI GPT-5.5 & GPT-5.5 Pro — newest OpenAI flagship and its max-accuracy reasoning tier
- Claude Opus 4.8, Opus 4.8 Fast & Claude Fable 5 — Anthropic's newest flagship, a low-latency build, and a creative-writing model
- Google Gemini 3.5 Flash — fast multimodal (text, image, audio, video) with 1M context
- xAI Grok 4.3 — 1M context at an aggressive price
- GLM 5.2, MiniMax M3, Qwen3.7 Max, Kimi K2.7 Code, Mistral Medium 3.5 — strong value picks
- All on the unified `/v1/ai/chat` endpoint — same envelope, same auth, pay per token
- Short aliases updated to newest: `claude-opus` → 4.8, `grok` → 4.3, `glm` → 5.2, `minimax` → M3, `qwen` → Qwen3.7 Max
What's new
These are the current-generation flagships across the major labs, plus a set of high-value workhorses. GPT-5.5 and Claude Opus 4.8 push the frontier on reasoning and agentic coding; GPT-5.5 Pro offers a max-accuracy tier and Opus 4.8 Fast a low-latency tier of their respective flagships. Claude Fable 5 is tuned for creative and long-form writing. Gemini 3.5 Flash adds fast multimodal input (text, image, audio, video) with a 1M context window. Grok 4.3, GLM 5.2, MiniMax M3, Qwen3.7 Max, Kimi K2.7 Code, and Mistral Medium 3.5 round out the lineup with strong price-to-performance for everyday and coding workloads.
How to call them
Pass the model ID in the `model` field of a request to `/v1/ai/chat` (or `/v1/ai/chat/completions` for the OpenAI-compatible shape). Streaming is supported via `stream: true`. Auth, billing, retries, and rate limits work exactly as they do for every other model on YepAPI.
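If you are calling from Python rather than curl, the same request can be assembled with the standard library. The sketch below only builds the request and does not send it. The helper name `build_chat_request` and its parameters are hypothetical, and it assumes the `x-api-key` header from the curl example is accepted on both paths. The response format is not described in this post, so parsing is left out.

```python
import json
import urllib.request

# Host taken from the curl example in this post.
BASE_URL = "https://api.yepapi.com"


def build_chat_request(api_key, model, prompt, stream=False, openai_compatible=False):
    """Build (but do not send) a POST request for the chat endpoint.

    Hypothetical helper for illustration; not part of any YepAPI SDK.
    """
    # Both paths are named in the post; choose the OpenAI-compatible one on request.
    path = "/v1/ai/chat/completions" if openai_compatible else "/v1/ai/chat"
    body = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }
    if stream:
        # Streaming is requested with `stream: true` in the JSON body.
        body["stream"] = True
    return urllib.request.Request(
        BASE_URL + path,
        data=json.dumps(body).encode("utf-8"),
        # Assumption: the same x-api-key header works on both paths.
        headers={"x-api-key": api_key, "Content-Type": "application/json"},
        method="POST",
    )


# Build a streaming request with a placeholder key and inspect it.
request = build_chat_request(
    "test-key", "anthropic/claude-opus-4.8", "Design a resilient job queue.", stream=True
)
print(request.get_method(), request.full_url)
print(request.data.decode("utf-8"))
```

To actually send the request, pass it to `urllib.request.urlopen` with a real key. How the streamed chunks are framed is not covered here, so check the endpoint documentation before writing a parser.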
curl -X POST https://api.yepapi.com/v1/ai/chat \
  -H "x-api-key: $YEPAPI_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "anthropic/claude-opus-4.8",
    "messages": [{"role": "user", "content": "Design a resilient job queue."}]
  }'
Pricing
Billing is pay-per-token: charges come out of your prepaid balance and apply only to successful completions, with a $0.01 minimum per request. Prices range from MiniMax M3 at $0.42 / $1.68 per 1M tokens up to GPT-5.5 Pro at $42.00 / $252.00 per 1M tokens; see each model's page for exact rates.
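To get a feel for how these rates translate into per-request cost, here is a rough estimator. It rests on two assumptions that this post does not state explicitly: that the first figure in each pair is the input-token price and the second the output-token price, and that the $0.01 minimum acts as a floor on each request's charge. The dictionary keys are display names used for illustration, not API model IDs.

```python
from decimal import Decimal

TOKENS_PER_MILLION = Decimal(1_000_000)
MINIMUM_CHARGE = Decimal("0.01")  # per-request minimum stated in the post

# USD per 1M tokens, as quoted in the post.
# Assumption: the pair is (input price, output price).
PRICES = {
    "MiniMax M3": (Decimal("0.42"), Decimal("1.68")),
    "GPT-5.5 Pro": (Decimal("42.00"), Decimal("252.00")),
}


def estimate_cost(model_name, input_tokens, output_tokens):
    """Estimate the charge in USD for one successful request."""
    input_price, output_price = PRICES[model_name]
    usage_cost = (
        input_tokens * input_price + output_tokens * output_price
    ) / TOKENS_PER_MILLION
    # Assumption: the minimum is applied as a floor on the usage cost.
    return max(usage_cost, MINIMUM_CHARGE)


for name in PRICES:
    print(name, estimate_cost(name, input_tokens=10_000, output_tokens=2_000))
```

Under these assumptions, a small MiniMax M3 request (say 1,000 input and 500 output tokens) has a usage cost below one cent, so the $0.01 minimum is what gets charged. `Decimal` is used instead of floats to avoid rounding drift in money arithmetic.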
Try the new models today
Sign up, grab an API key, and spend your free starter credits on the latest flagships.