Gemini API Cheat Sheet: 2.5 Pro, Vision & Tools

Q: Which Gemini model should I select for software development?

Use gemini-2.5-pro for complex refactoring, multi-file code review, and deep architectural reasoning. Use gemini-2.5-flash or gemini-2.0-flash for high-throughput tasks like inline completions, basic chat, or fast code generation.

Q: How does Google Search grounding work with Gemini API?

When you pass the googleSearch tool in the request, Gemini runs web searches behind the scenes and returns the answer together with grounding metadata: the queries it issued and the sources it cited.

See how Gemini compares in practice: Claude vs Gemini 2.5 for Coding.

Key Takeaways

Gemini 2.5 Pro and Flash support a 1M-token context window with native multimodal understanding.
Use the googleSearch tool for live Google Search grounding directly within your generation requests.
Enable codeExecution to let the model run Python in a sandboxed runtime.
Get schema-constrained structured output by setting responseMimeType: 'application/json' together with responseSchema.

TL;DR

Models: gemini-2.5-pro (1M context, best quality), gemini-2.5-flash (1M context, fast), gemini-2.0-flash (speed-optimized)
Use the googleSearch tool for live search grounding in responses
Enable codeExecution for Python execution within the API
Structured JSON: responseMimeType: "application/json" with responseSchema
1M token context fits entire codebases, long documents, or hours of audio

What Are the Key Gemini API Models and Parameters?

Models

Model ID	Context	Best for
`gemini-2.5-pro`	1M tokens	Complex reasoning, long documents, coding
`gemini-2.5-flash`	1M tokens	Fast, cost-efficient, everyday tasks
`gemini-2.0-flash`	1M tokens	Speed-optimized, multimodal
`gemini-2.0-flash-lite`	1M tokens	Lightest, cheapest, high-volume
`gemini-1.5-pro`	2M tokens	Largest context window available
`text-embedding-004`	2048 tokens	Text embeddings
`imagen-3.0-generate-002`	—	Image generation

generateContent — key parameters

Parameter	Type	What it does
`model`	string	Which Gemini model to use
`contents`	array	`[{role, parts}]` — user/model turns
`systemInstruction`	object	System prompt `{parts: [{text}]}`
`generationConfig`	object	Temperature, tokens, format settings
`safetySettings`	array	Content filtering thresholds
`tools`	array	Function declarations or built-in tools
`toolConfig`	object	`{functionCallingConfig: {mode}}`
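
The parameters above combine into a single request body. As a rough sketch, a plain object with these fields might be assembled as follows. The helper name `buildRequest`, its argument names, and the default temperature are illustrative assumptions rather than part of any SDK; in the SDK examples later in this sheet, `model` is passed to `getGenerativeModel` instead of appearing in the body.

```javascript
// Hypothetical helper (not an SDK function): assembles a generateContent-style
// request body from the parameters listed in the table above.
function buildRequest({ prompt, system, temperature = 0.7, functionDeclarations = [] }) {
  const body = {
    contents: [{ role: "user", parts: [{ text: prompt }] }],
    generationConfig: { temperature },
  };
  if (system) {
    body.systemInstruction = { parts: [{ text: system }] };
  }
  if (functionDeclarations.length > 0) {
    body.tools = [{ functionDeclarations }];
    // AUTO leaves it to the model whether to call a declared function.
    body.toolConfig = { functionCallingConfig: { mode: "AUTO" } };
  }
  return body;
}

const requestBody = buildRequest({
  prompt: "Explain recursion simply.",
  system: "Answer in two sentences.",
});
console.log(JSON.stringify(requestBody, null, 2));
```

Keeping the body as a plain object makes it easy to log or diff a request before sending it.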

generationConfig options

Setting	Type	What it does
`temperature`	float 0–2	Randomness
`topP`	float 0–1	Nucleus sampling
`topK`	int	Token pool size
`maxOutputTokens`	int	Max response length
`stopSequences`	array	Strings that stop generation
`responseMimeType`	string	`"application/json"` for JSON mode
`responseSchema`	object	JSON schema for structured output
`candidateCount`	int	Number of responses to generate
`thinkingConfig`	object	`{thinkingBudget: N}` for reasoning
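
To catch out-of-range settings before a request goes out, the ranges in the table can be checked locally. This is a minimal sketch: `checkGenerationConfig` is a hypothetical helper, the "positive integer" rule for `topK` and `maxOutputTokens` is an assumption (the table only says `int`), and the `thinkingBudget` value is illustrative.

```javascript
// Hypothetical validator: checks a generationConfig object against the ranges
// listed in the table above and returns a list of problems (empty if none).
function checkGenerationConfig(config) {
  const problems = [];
  const { temperature, topP, topK, maxOutputTokens, stopSequences } = config;
  if (temperature !== undefined && (temperature < 0 || temperature > 2)) {
    problems.push("temperature must be between 0 and 2");
  }
  if (topP !== undefined && (topP < 0 || topP > 1)) {
    problems.push("topP must be between 0 and 1");
  }
  // Assumption: integer settings are expected to be positive.
  for (const [name, value] of [["topK", topK], ["maxOutputTokens", maxOutputTokens]]) {
    if (value !== undefined && (!Number.isInteger(value) || value < 1)) {
      problems.push(`${name} must be a positive integer`);
    }
  }
  if (stopSequences !== undefined && !Array.isArray(stopSequences)) {
    problems.push("stopSequences must be an array of strings");
  }
  return problems;
}

const config = {
  temperature: 0.4,
  topP: 0.95,
  maxOutputTokens: 1024,
  stopSequences: ["END"],
  thinkingConfig: { thinkingBudget: 512 }, // illustrative budget, not checked here
};
console.log(checkGenerationConfig(config)); // → []
console.log(checkGenerationConfig({ temperature: 3, topK: 0 }));
```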

Built-in tools

Tool	What it does
`googleSearch`	Grounds responses in live Google Search results
`codeExecution`	Runs Python code, returns output + charts
`urlContext`	Fetches and includes content from URLs

Gemini CLI — commands

Command	What it does
`gemini`	Start interactive REPL
`gemini -p "prompt"`	Non-interactive single prompt
`gemini --model gemini-2.5-pro`	Use a specific model
`gemini --yolo`	Auto-accept all tool actions (no confirmation)
`gemini --sandbox`	Run tool and shell actions inside a sandboxed environment
`gemini --debug`	Print verbose debug output
`/help`	Show slash commands in REPL
`/clear`	Clear conversation history
`/stats`	Show token usage for this session
`/tools`	List available tools

Safety settings — harm categories

Category	HarmCategory constant
Dangerous content	`HARM_CATEGORY_DANGEROUS_CONTENT`
Harassment	`HARM_CATEGORY_HARASSMENT`
Hate speech	`HARM_CATEGORY_HATE_SPEECH`
Sexually explicit	`HARM_CATEGORY_SEXUALLY_EXPLICIT`

Threshold values: BLOCK_NONE, BLOCK_ONLY_HIGH, BLOCK_MEDIUM_AND_ABOVE, BLOCK_LOW_AND_ABOVE
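
Each entry in `safetySettings` pairs one category with one threshold. As a small sketch, the array can be generated for all four categories above; `uniformSafetySettings` is a hypothetical helper name, and applying the same threshold everywhere is just one possible policy.

```javascript
// The four harm categories from the table above.
const HARM_CATEGORIES = [
  "HARM_CATEGORY_DANGEROUS_CONTENT",
  "HARM_CATEGORY_HARASSMENT",
  "HARM_CATEGORY_HATE_SPEECH",
  "HARM_CATEGORY_SEXUALLY_EXPLICIT",
];

// Hypothetical helper: applies one threshold to every category.
function uniformSafetySettings(threshold) {
  return HARM_CATEGORIES.map((category) => ({ category, threshold }));
}

const safetySettings = uniformSafetySettings("BLOCK_MEDIUM_AND_ABOVE");
console.log(safetySettings);
```

The resulting array can be passed as `safetySettings` alongside `model` when creating a model instance.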

Detailed sections

Basic text generation (Node.js)

javascript

import { GoogleGenerativeAI } from "@google/generative-ai";

const genAI = new GoogleGenerativeAI(process.env.GEMINI_API_KEY);
const model = genAI.getGenerativeModel({ model: "gemini-2.0-flash" });

const result = await model.generateContent("Explain recursion simply.");
console.log(result.response.text());

System instruction + chat

javascript

const model = genAI.getGenerativeModel({
  model: "gemini-2.5-flash",
  systemInstruction: "You are a senior DevOps engineer. Give concise, practical answers.",
});

const chat = model.startChat();

const r1 = await chat.sendMessage("What is a Kubernetes pod?");
console.log(r1.response.text());

const r2 = await chat.sendMessage("How is it different from a deployment?");
console.log(r2.response.text());
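
A chat can also be resumed from stored turns, since `startChat` accepts a `history` array in the same `[{role, parts}]` shape as `contents`. The sketch below converts a simple message log into that shape; `toHistory` and the `{ from, text }` input format are hypothetical, chosen only for illustration.

```javascript
// Hypothetical helper: converts simple { from, text } messages into the
// [{ role, parts }] shape, mapping any non-user speaker to the "model" role.
function toHistory(messages) {
  return messages.map(({ from, text }) => ({
    role: from === "user" ? "user" : "model",
    parts: [{ text }],
  }));
}

const history = toHistory([
  { from: "user", text: "What is a Kubernetes pod?" },
  { from: "assistant", text: "The smallest deployable unit in Kubernetes." },
]);
console.log(history.map((turn) => turn.role).join(" -> ")); // → user -> model
```

The result could then be passed as `model.startChat({ history })` to continue the conversation.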

Streaming response

javascript

const model = genAI.getGenerativeModel({ model: "gemini-2.0-flash" });

const result = await model.generateContentStream(
  "Write a step-by-step guide to setting up CI/CD."
);

for await (const chunk of result.stream) {
  process.stdout.write(chunk.text());
}
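
When the full text is needed after streaming (for logging or caching, say), the chunks can be accumulated while they are printed. This is a sketch under the assumption that each chunk exposes `text()` as in the loop above; `collectStream` is a hypothetical helper, and `fakeStream` is a stand-in so the example runs without an API key.

```javascript
// Collects streamed chunks into one string while reporting each piece.
// `stream` is any async iterable whose chunks expose text(), like result.stream.
async function collectStream(stream, onChunk = () => {}) {
  let full = "";
  for await (const chunk of stream) {
    const piece = chunk.text();
    onChunk(piece);
    full += piece;
  }
  return full;
}

// Stand-in for result.stream so the sketch runs offline.
async function* fakeStream() {
  yield { text: () => "Step 1: " };
  yield { text: () => "push to main." };
}

collectStream(fakeStream(), (piece) => process.stdout.write(piece)).then((full) => {
  console.log(`\n${full.length} characters received`);
});
```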

Vision — image input

javascript

import fs from "fs";

const model = genAI.getGenerativeModel({ model: "gemini-2.0-flash" });

const imageData = fs.readFileSync("diagram.png");
const base64 = imageData.toString("base64");

const result = await model.generateContent([
  { inlineData: { mimeType: "image/png", data: base64 } },
  "Describe what's in this architecture diagram.",
]);

console.log(result.response.text());
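
The `mimeType` has to match the actual image format, so it can help to derive it from the filename instead of hard-coding it. A minimal sketch follows: `toImagePart` is a hypothetical helper, and the extension map covers only a few common formats.

```javascript
// Extension-to-MIME map; an assumption that the extension reflects the format.
const IMAGE_MIME_TYPES = new Map([
  ["png", "image/png"],
  ["jpg", "image/jpeg"],
  ["jpeg", "image/jpeg"],
  ["webp", "image/webp"],
]);

// Hypothetical helper: wraps raw image bytes in an inlineData part.
function toImagePart(bytes, filename) {
  const ext = filename.split(".").pop().toLowerCase();
  const mimeType = IMAGE_MIME_TYPES.get(ext);
  if (!mimeType) throw new Error(`Unsupported image extension: ${ext}`);
  return { inlineData: { mimeType, data: Buffer.from(bytes).toString("base64") } };
}

// Three placeholder bytes stand in for fs.readFileSync("diagram.png").
const imagePart = toImagePart(Buffer.from("abc"), "diagram.png");
console.log(imagePart.inlineData.mimeType, imagePart.inlineData.data);
```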

JSON / structured output

javascript

const model = genAI.getGenerativeModel({
  model: "gemini-2.5-flash",
  generationConfig: {
    responseMimeType: "application/json",
    responseSchema: {
      type: "object",
      properties: {
        name: { type: "string" },
        language: { type: "string" },
        stars: { type: "integer" },
      },
      required: ["name", "language", "stars"],
    },
  },
});

const result = await model.generateContent(
  "Extract repo info from: react/react - JavaScript - 230k stars"
);

const data = JSON.parse(result.response.text());
console.log(data); // e.g. { name: 'react/react', language: 'JavaScript', stars: 230000 }
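
Even with a schema, it is cheap to validate the parsed result before downstream code relies on it. The following is a sketch, assuming the three-field schema above; `parseRepoInfo` is a hypothetical helper and the sample string stands in for `result.response.text()`.

```javascript
// Hypothetical guard: parses model output and checks required fields and types.
function parseRepoInfo(text) {
  let data;
  try {
    data = JSON.parse(text);
  } catch (err) {
    throw new Error(`Model did not return valid JSON: ${err.message}`);
  }
  if (data === null || typeof data !== "object" || Array.isArray(data)) {
    throw new Error("Expected a JSON object");
  }
  for (const key of ["name", "language", "stars"]) {
    if (!(key in data)) throw new Error(`Missing field: ${key}`);
  }
  if (!Number.isInteger(data.stars)) throw new Error("stars must be an integer");
  return data;
}

// Sample string standing in for result.response.text().
const sample = '{"name":"react/react","language":"JavaScript","stars":230000}';
console.log(parseRepoInfo(sample).stars); // → 230000
```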

Function calling (tool use)

javascript

const tools = [
  {
    functionDeclarations: [
      {
        name: "get_stock_price",
        description: "Get the current stock price for a ticker symbol",
        parameters: {
          type: "object",
          properties: {
            ticker: {
              type: "string",
              description: "Stock ticker symbol, e.g. GOOG",
            },
          },
          required: ["ticker"],
        },
      },
    ],
  },
];

const model = genAI.getGenerativeModel({
  model: "gemini-2.0-flash",
  tools,
});

const result = await model.generateContent("What's Google's stock price?");
const response = result.response;

// Check whether the model wants to call a function.
// functionCalls() scans every part, so it still works when the first part is text.
const calls = response.functionCalls() ?? [];
for (const call of calls) {
  console.log(call.name, call.args); // e.g. get_stock_price { ticker: 'GOOG' }
}
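
The model only requests the call; your code has to run the function and send the result back. As a sketch of that second half, the snippet below maps a function call to a local handler and wraps the result in a `functionResponse` part. `runFunctionCall`, the `handlers` table, and the hard-coded price are all hypothetical, for illustration only.

```javascript
// Local implementations keyed by declared function name.
// The price lookup is a hard-coded stub, not a real market-data call.
const handlers = {
  get_stock_price: ({ ticker }) => ({ ticker, price: 123.45, currency: "USD" }),
};

// Turns one function call ({ name, args }) into a functionResponse part.
function runFunctionCall(call, table = handlers) {
  if (!Object.prototype.hasOwnProperty.call(table, call.name)) {
    return { functionResponse: { name: call.name, response: { error: "unknown function" } } };
  }
  const response = table[call.name](call.args ?? {});
  return { functionResponse: { name: call.name, response } };
}

const responsePart = runFunctionCall({ name: "get_stock_price", args: { ticker: "GOOG" } });
console.log(JSON.stringify(responsePart));
```

In a chat session, a part like this can be sent back with `chat.sendMessage([responsePart])` so the model can compose its final answer from the function result.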

Grounding with Google Search

javascript

const model = genAI.getGenerativeModel({
  model: "gemini-2.0-flash",
  tools: [{ googleSearch: {} }], // enable live search grounding
});

const result = await model.generateContent(
  "What happened in AI news this week?"
);

console.log(result.response.text());

// Check grounding metadata
const groundingMeta = result.response.candidates[0].groundingMetadata;
console.log(groundingMeta?.webSearchQueries); // queries used
console.log(groundingMeta?.groundingChunks); // sources cited
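
To show citations to users, the grounding chunks can be reduced to a de-duplicated source list. This sketch assumes each chunk carries a `web` object with `uri` and `title`; `listSources` is a hypothetical helper and the sample metadata is made up so the example runs offline.

```javascript
// Hypothetical helper: extracts unique { title, uri } pairs from grounding metadata.
function listSources(groundingMeta) {
  const chunks = groundingMeta?.groundingChunks ?? [];
  const seen = new Set();
  const sources = [];
  for (const chunk of chunks) {
    const uri = chunk.web?.uri;
    if (!uri || seen.has(uri)) continue;
    seen.add(uri);
    sources.push({ title: chunk.web.title ?? uri, uri });
  }
  return sources;
}

// Made-up metadata standing in for candidates[0].groundingMetadata.
const sampleMeta = {
  webSearchQueries: ["AI news this week"],
  groundingChunks: [
    { web: { uri: "https://example.com/a", title: "Example A" } },
    { web: { uri: "https://example.com/a", title: "Example A" } },
    { web: { uri: "https://example.com/b" } },
  ],
};

for (const { title, uri } of listSources(sampleMeta)) {
  console.log(`- ${title}: ${uri}`);
}
```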

Code execution

javascript

const model = genAI.getGenerativeModel({
  model: "gemini-2.5-flash",
  tools: [{ codeExecution: {} }],
});

const result = await model.generateContent(
  "Calculate the first 20 Fibonacci numbers and plot them."
);

// Response includes code written, execution output, and optionally a chart
const parts = result.response.candidates[0].content.parts;
for (const part of parts) {
  if (part.executableCode) console.log("Code:", part.executableCode.code);
  if (part.codeExecutionResult) console.log("Output:", part.codeExecutionResult.output);
}

Embeddings

javascript

const embModel = genAI.getGenerativeModel({ model: "text-embedding-004" });

const result = await embModel.embedContent("How do I deploy to Kubernetes?");
const vector = result.embedding.values; // float array (768 dims)

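// Batch embeddings
const batchResult = await embModel.batchEmbedContents({
  requests: [
    { content: { role: "user", parts: [{ text: "First document" }] } },
    { content: { role: "user", parts: [{ text: "Second document" }] } },
  ],
});
const vectors = batchResult.embeddings.map((e) => e.values); // one vector per request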
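
Embedding vectors are usually compared with cosine similarity. Here is a minimal sketch using tiny hand-written vectors in place of real 768-dimensional embeddings; `cosineSimilarity` and `rankBySimilarity` are hypothetical helper names.

```javascript
// Cosine similarity between two equal-length numeric vectors.
function cosineSimilarity(a, b) {
  if (a.length !== b.length) throw new Error("Vectors must have the same length");
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  if (normA === 0 || normB === 0) return 0; // avoid dividing by zero
  return dot / Math.sqrt(normA * normB);
}

// Ranks documents by similarity to the query, most similar first.
function rankBySimilarity(queryVector, docs) {
  return docs
    .map(({ id, vector }) => ({ id, score: cosineSimilarity(queryVector, vector) }))
    .sort((x, y) => y.score - x.score);
}

// Toy vectors standing in for result.embedding.values.
const ranked = rankBySimilarity([1, 0, 1], [
  { id: "unrelated", vector: [0, 1, 0] },
  { id: "close", vector: [1, 0, 0.9] },
]);
console.log(ranked.map((r) => r.id)); // → [ 'close', 'unrelated' ]
```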

Long document — file upload (Files API)

javascript

import { GoogleAIFileManager } from "@google/generative-ai/server";

const fileManager = new GoogleAIFileManager(process.env.GEMINI_API_KEY);

// Upload a large PDF
const uploadResult = await fileManager.uploadFile("report.pdf", {
  mimeType: "application/pdf",
  displayName: "Q4 Report",
});

const file = uploadResult.file;
console.log(`Uploaded: ${file.uri}`);

// Use the uploaded file in a prompt
const model = genAI.getGenerativeModel({ model: "gemini-2.5-pro" });

const result = await model.generateContent([
  { fileData: { fileUri: file.uri, mimeType: "application/pdf" } },
  "Summarize the key financial highlights from this report.",
]);

console.log(result.response.text());
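
Larger uploads, video in particular, may still be processing when `uploadFile` returns, so it can be worth polling the file state before prompting. The sketch below assumes the state strings `PROCESSING`, `ACTIVE`, and `FAILED`; `waitUntilActive`, its option names, and the file name `files/example` are hypothetical. The lookup function is injected so the example runs offline; with the SDK it could be `(name) => fileManager.getFile(name)`.

```javascript
// Polls until an uploaded file leaves the PROCESSING state.
// `getFile` is injected: any async function that returns { state, ... } for a name.
async function waitUntilActive(getFile, name, { intervalMs = 2000, maxTries = 30 } = {}) {
  for (let attempt = 0; attempt < maxTries; attempt++) {
    const file = await getFile(name);
    if (file.state === "ACTIVE") return file;
    if (file.state === "FAILED") throw new Error(`Processing failed for ${name}`);
    await new Promise((resolve) => setTimeout(resolve, intervalMs));
  }
  throw new Error(`Timed out waiting for ${name}`);
}

// Fake lookup: reports PROCESSING twice, then ACTIVE.
let polls = 0;
const fakeGetFile = async (name) => ({ name, state: ++polls < 3 ? "PROCESSING" : "ACTIVE" });

waitUntilActive(fakeGetFile, "files/example", { intervalMs: 10 }).then((file) => {
  console.log(`${file.name} is ${file.state} after ${polls} polls`);
});
```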

Environment setup

bash

# Install SDK
npm install @google/generative-ai

# Set API key
export GEMINI_API_KEY=AIza...

# Get a key: aistudio.google.com

# Python SDK
pip install google-generativeai

python3 - <<'EOF'
import os
import google.generativeai as genai

# Read the key exported above instead of hard-coding it
genai.configure(api_key=os.environ["GEMINI_API_KEY"])
model = genai.GenerativeModel("gemini-2.0-flash")
r = model.generate_content("Hello!")
print(r.text)
EOF

Gemini CLI setup

bash

# Install
npm install -g @google/gemini-cli

# Authenticate: the first interactive launch prompts for a sign-in method (opens browser);
# switch methods later with /auth inside the REPL

# Or set API key directly
export GEMINI_API_KEY=AIza...

# Start interactive session
gemini

# One-shot with file context
gemini -p "Review this code for bugs" < src/main.ts

