How do I use Gemma 4 31B through YepAPI?

Sign up for a free API key, then send requests to the /v1/ai/chat endpoint.

Google · Text Generation · /v1/ai/chat

Gemma 4 31B

Access Gemma 4 31B through one API key. Google's open-weight model at ultra-low cost.

Google's open-weight 31B model. Strong performance at ultra-low cost with 131K output tokens.

No credit card required. Takes 30 seconds.

2,400+ developers

1.2M+ API calls served

100+ endpoints

$0.01 per call

Yep, that's it.

Context Window: 262K tokens

Max Output: 131K tokens

Input Price: $0.20 / 1M tokens

Output Price: $0.56 / 1M tokens

Strengths

✓ Open-weight

Gemma 4 31B ships as an open-weight model from Google, so you can inspect, fine-tune, and self-host the same 31B-parameter weights served here through one API key.

✓ Ultra low cost

At $0.20 per 1M input tokens and $0.56 per 1M output tokens, it delivers strong instruction following at a fraction of the cost of frontier proprietary models.

✓ 131K output

Generates up to 131,072 output tokens in a single response, enough for long-form documents, full code files, or large structured JSON without truncation.

✓ 262K context

A 262,144-token context window lets you feed entire codebases, long transcripts, or multi-document prompts in one request.
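
The two limits above can be checked client-side before a request goes out. This is a minimal sketch, not part of YepAPI: `clampMaxTokens` is a hypothetical helper, and it assumes that prompt tokens and output tokens share the 262,144-token context window, which this page does not state.

```javascript
// Limits as listed on this page.
const CONTEXT_WINDOW = 262144;
const MAX_OUTPUT = 131072;

// Hypothetical helper: returns a maxTokens value that respects both limits.
// Assumption: prompt and output tokens must fit in the context window together.
function clampMaxTokens(requested, promptTokens) {
  if (!Number.isInteger(requested) || requested < 1) {
    throw new RangeError('requested must be a positive integer');
  }
  if (promptTokens >= CONTEXT_WINDOW) {
    throw new RangeError('prompt does not fit in the context window');
  }
  const roomLeft = CONTEXT_WINDOW - promptTokens;
  return Math.min(requested, MAX_OUTPUT, roomLeft);
}
```

Pass the result as `maxTokens` in the request body. The prompt token count has to come from your own estimate or tokenizer.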

Quick start

Copy this snippet and start making calls with Gemma 4 31B.

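// Run as an ES module (top-level await) in an environment that provides fetch.
const res = await fetch('https://api.yepapi.com/v1/ai/chat', {
  method: 'POST',
  headers: {
    'x-api-key': 'YOUR_API_KEY', // replace with your YepAPI key
    'Content-Type': 'application/json',
  },
  body: JSON.stringify({
    model: 'google/gemma-4-31b-it',
    messages: [
      { role: 'user', content: 'Explain API gateways in 2 sentences.' },
    ],
    maxTokens: 256, // maximum tokens in the response
  }),
});

// Surface HTTP errors instead of failing on the destructure below.
if (!res.ok) {
  throw new Error(`Request failed: ${res.status} ${await res.text()}`);
}

const { data } = await res.json();
console.log(data.message.content);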

Why use Gemma 4 31B through YepAPI?

✓ One API key for all models, no separate accounts

✓ OpenAI SDK compatible: just change the base URL

✓ No monthly minimums, pay per token

✓ Switch models with one line of code

✓ Full provider passthrough: citations, search results, and all extras included

✓ Streaming and non-streaming support on every model

✓ Works with Cursor, Claude, LangChain, and any LLM tool

✓ Unified billing across all providers
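
To illustrate the "switch models with one line" point, here is a minimal sketch that keeps the model ID in a single constant. The endpoint, header names, and body fields are copied from the quick start snippet. `buildChatRequest` is a hypothetical helper, not part of any YepAPI SDK, and it only prepares the arguments for `fetch` without sending anything.

```javascript
// Endpoint and header names are taken from the quick start snippet.
const CHAT_URL = 'https://api.yepapi.com/v1/ai/chat';

// Hypothetical helper: builds the arguments for fetch() without sending anything.
function buildChatRequest({ apiKey, model, messages, maxTokens = 256 }) {
  if (!apiKey) throw new Error('apiKey is required');
  if (!Array.isArray(messages) || messages.length === 0) {
    throw new Error('messages must be a non-empty array');
  }
  return {
    url: CHAT_URL,
    init: {
      method: 'POST',
      headers: { 'x-api-key': apiKey, 'Content-Type': 'application/json' },
      body: JSON.stringify({ model, messages, maxTokens }),
    },
  };
}

// Switching models changes only this one value.
const MODEL = 'google/gemma-4-31b-it';

const request = buildChatRequest({
  apiKey: 'YOUR_API_KEY',
  model: MODEL,
  messages: [{ role: 'user', content: 'Explain API gateways in 2 sentences.' }],
});
// To send it: const res = await fetch(request.url, request.init);
```

Keeping the request construction separate from the network call also makes it easy to log or unit-test the payload.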

Gemma 4 31B API: open-weight reasoning at ultra-low cost

Gemma 4 31B is Google's open-weight 31-billion-parameter instruction-tuned model, built for developers who want strong general reasoning and code generation without frontier pricing. With a 262K context window and 131K max output, it handles long documents and large generations in a single call.

Through YepAPI you call Gemma 4 31B with one API key on the standard chat endpoint, paying just $0.20 per 1M input and $0.56 per 1M output tokens — ideal for high-volume, cost-sensitive workloads.

What is Gemma 4 31B?

Gemma 4 31B is an open-weight, instruction-tuned large language model from Google, part of the Gemma family derived from the same research as Gemini. It has 31 billion parameters and is designed for text-to-text tasks: answering questions, writing and explaining code, summarizing, and following multi-step instructions. Because the weights are openly published, teams can audit, fine-tune, or self-host the model, while still accessing it as a hosted endpoint when they want zero infrastructure. With a 262K-token context window and 131K-token output ceiling, it suits long-context reasoning and long-form generation at a price far below proprietary frontier models.

Build with Gemma 4 31B via YepAPI

Send chat completion requests to Gemma 4 31B over a single YepAPI key — no separate Google Cloud project, billing setup, or quota negotiation. Point your existing client at the /v1/ai/chat endpoint, pass your system and user messages, and stream responses back. The same key works across every model in the catalog, so you can route cheap bulk work to Gemma 4 31B and escalate harder requests to larger models without re-authenticating. It is a drop-in fit for chatbots, code assistants, and batch document processing.
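
The routing idea above could be sketched as follows. Everything here is illustrative: the task categories and `pickModel` rule are hypothetical, `LARGER_MODEL_ID` is a placeholder for whichever model you escalate to, and the `system` role is assumed to be accepted in the same message shape as the quick start's `user` message.

```javascript
const CHEAP_MODEL = 'google/gemma-4-31b-it';
// Placeholder: substitute the ID of the larger model you escalate to.
const ESCALATION_MODEL = 'LARGER_MODEL_ID';

// Hypothetical routing rule: bulk task types stay on Gemma 4 31B,
// everything else goes to the escalation model.
const BULK_TASKS = new Set(['summarize', 'classify', 'extract']);

function pickModel(taskType) {
  return BULK_TASKS.has(taskType) ? CHEAP_MODEL : ESCALATION_MODEL;
}

// Builds a request body with a system message followed by the user message.
function buildBody(taskType, systemPrompt, userPrompt, maxTokens = 512) {
  return {
    model: pickModel(taskType),
    messages: [
      { role: 'system', content: systemPrompt },
      { role: 'user', content: userPrompt },
    ],
    maxTokens,
  };
}
```

The returned object is what you would pass to `JSON.stringify` as the body of a POST to /v1/ai/chat, using the same API key for either model.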

Gemma 4 31B API pricing — $0.20/1M input, $0.56/1M output

Gemma 4 31B costs $0.20 per 1M input tokens and $0.56 per 1M output tokens through YepAPI. That makes it one of the cheapest capable 30B-class models available, so summarizing thousands of documents or powering a high-traffic assistant stays affordable. You pay only for the tokens you use, with no minimums or seat fees, and the same billing covers every model on the platform.
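
As a worked example of the pricing above, the sketch below estimates the cost of a job from its token counts. The prices come from this page. The workload (10,000 documents at 2,000 input and 200 output tokens each) is a made-up illustration.

```javascript
// Prices listed on this page, in USD per 1M tokens.
const INPUT_PRICE_PER_M = 0.20;
const OUTPUT_PRICE_PER_M = 0.56;

// Returns the estimated cost in USD for the given token counts.
function estimateCostUSD(inputTokens, outputTokens) {
  return (inputTokens / 1e6) * INPUT_PRICE_PER_M
       + (outputTokens / 1e6) * OUTPUT_PRICE_PER_M;
}

// Hypothetical workload: 10,000 documents, 2,000 input and 200 output tokens each.
const documentCount = 10000;
const totalCost = estimateCostUSD(documentCount * 2000, documentCount * 200);
console.log(totalCost.toFixed(2)); // prints "5.12"
```

That is 20M input tokens ($4.00) plus 2M output tokens ($1.12), or $5.12 for the whole batch.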

Gemma 4 31B for high-volume code and content

Gemma 4 31B shines on large-scale, cost-sensitive jobs: bulk summarization, classification, data extraction, and code generation across long files. Its 131K output ceiling means it can return complete multi-file code or long reports without splitting requests, while the 262K context lets it reason over entire repositories or document sets. For teams processing huge volumes where per-token cost dominates, it is a strong default.
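
For bulk jobs, one way to use the large context is to pack several documents into each request. The sketch below groups documents greedily under a token budget. `packDocuments` is a hypothetical helper, and the 4-characters-per-token estimate is a rough heuristic rather than the model's real tokenizer, so leave generous headroom below the 262K limit.

```javascript
// Rough heuristic, not a real tokenizer: about 4 characters per token.
function estimateTokens(text) {
  return Math.ceil(text.length / 4);
}

// Greedily groups documents so each group's estimated tokens stay within budget.
// A document that alone exceeds the budget still gets its own group
// and would need to be split upstream.
function packDocuments(docs, budgetTokens) {
  const groups = [];
  let current = [];
  let used = 0;
  for (const doc of docs) {
    const cost = estimateTokens(doc);
    // Close the current group if adding this document would exceed the budget.
    if (current.length > 0 && used + cost > budgetTokens) {
      groups.push(current);
      current = [];
      used = 0;
    }
    current.push(doc);
    used += cost;
  }
  if (current.length > 0) groups.push(current);
  return groups;
}
```

Each group can then be joined into a single user message and sent as one request, which cuts the number of calls for large document sets.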

Try Gemma 4 31B free

Start with $5 in free credit — no credit card required. Sign up, grab your API key, and call Gemma 4 31B in minutes to test long-context reasoning and code generation before you spend a cent.

Start generating in 30 seconds

$5 free credit on signup. No credit card required. Pay per call.

What developers say

“Switched from SerpAPI and cut our SERP costs by 80%. Same data quality, way simpler billing.”
Marcus T.
SEO Platform Founder

“One API key for AI models, SERP data, and web scraping. Saved us from managing 4 separate providers.”
Priya S.
Full-Stack Developer

“The $5 free credit let us prototype our entire rank tracking feature before committing. No other API does that.”
Jake R.
Indie Hacker

Frequently asked questions

What is Gemma 4 31B?

Gemma 4 31B is Google's open-weight 31B model. It offers strong performance at ultra-low cost, with up to 131K output tokens.

How much does Gemma 4 31B cost?

Input tokens cost $0.20 per 1M tokens and output tokens cost $0.56 per 1M tokens through YepAPI. There are no monthly minimums: you pay only for what you use.

What is the context window of Gemma 4 31B?

Gemma 4 31B supports a 262K-token context window with up to 131K output tokens per request.

Explore more models

Gemini 2.5 Flash

Google

Access Gemini 2.5 Flash through one API key. Google's fastest model with 1M context.

1M ctx · $0.21/M in

Gemini 2.5 Pro

Google

Access Gemini 2.5 Pro through one API key. Google's most capable model with 1M context.

1M ctx · $3.50/M in

Gemini 3.5 Flash

Google

Access Gemini 3.5 Flash through one API key. Google's latest fast multimodal model.

1M ctx · $2.10/M in
