OpenAI API Cheat Sheet: GPT-4o, Tools & Assistants

TL;DR

Models: gpt-4o (multimodal, 128K context), gpt-4o-mini (fast, cheap), o3 (reasoning), o4-mini (fast reasoning)
Strict JSON: Use response_format: {type: "json_schema", json_schema: {name, schema, strict: true}}
Function calling: Define tools and let the model decide when to call them
Assistants API: Create persistent agents with threads and tool access
Embeddings: text-embedding-3-large (3072 dims), text-embedding-3-small (1536 dims)

Quick reference tables

Models

Model ID	Context	Best for
`gpt-4o`	128K tokens	Multimodal, speed + quality balance
`gpt-4o-mini`	128K tokens	Fast, cheap, everyday tasks
`o3`	200K tokens	Complex reasoning, math, coding
`o4-mini`	200K tokens	Fast reasoning at lower cost
`gpt-4.5-preview`	128K tokens	Creative writing, nuanced conversation
`text-embedding-3-large`	8K tokens	High-quality embeddings
`text-embedding-3-small`	8K tokens	Cheaper embeddings
`dall-e-3`	—	Image generation
`whisper-1`	—	Audio transcription
`tts-1`	—	Text-to-speech

Chat Completions — key parameters

Parameter	Type	What it does
`model`	string	Model to use
`messages`	array	`[{role, content}]` — system/user/assistant
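`max_completion_tokens`	int	Max generated tokens (visible output plus reasoning tokens on reasoning models)
`temperature`	float 0–2	Randomness (lower = more focused; 0 is near-deterministic, not guaranteed)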
`top_p`	float 0–1	Nucleus sampling
`stream`	bool	Stream tokens as server-sent events
`tools`	array	Function definitions for tool use
`tool_choice`	string/obj	`auto`, `none`, `required`, or specific tool
`response_format`	object	`{type: "json_object"}` or `json_schema`
`seed`	int	Reproducible outputs (best effort)
`store`	bool	Store the completion for use in model distillation or evals
`n`	int	Number of completions to generate
`stop`	string/array	Stop sequences

Roles in messages

Role	Purpose
`system`	Instructions, persona, context
`user`	Human turn
`assistant`	Model turn (for multi-turn)
`tool`	Tool result (returned after function call)

Assistants API — key concepts

Concept	What it is
Assistant	A configured agent with model, tools, instructions
Thread	A conversation session (stores messages)
Message	A user or assistant message in a Thread
Run	One execution of an Assistant on a Thread
Run Step	An individual action within a Run (for example a tool call or message creation)
Vector Store	Storage for file search (RAG)

Embeddings

Parameter	Value
Endpoint	`POST /v1/embeddings`
Best model	`text-embedding-3-large` (3072 dims)
Cheap model	`text-embedding-3-small` (1536 dims)
Max input	8191 tokens
Use case	Semantic search, RAG, similarity

Image generation (DALL-E 3)

Parameter	Options
`model`	`dall-e-3`
`size`	`1024x1024`, `1792x1024`, `1024x1792`
`quality`	`standard`, `hd`
`style`	`vivid`, `natural`
`n`	1 (DALL-E 3 supports only 1)

Detailed sections

Basic chat completion (Node.js)

javascript

import OpenAI from "openai";

const client = new OpenAI({ apiKey: process.env.OPENAI_API_KEY });

const response = await client.chat.completions.create({
  model: "gpt-4o",
  messages: [
    { role: "system", content: "You are a helpful coding assistant." },
    { role: "user", content: "Explain the difference between == and === in JavaScript." },
  ],
  max_completion_tokens: 512,
});

console.log(response.choices[0].message.content);
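
The response carries more than the message text. As a minimal sketch, the helper below pulls out the text, the `finish_reason`, and the token usage, and flags replies that were cut off by `max_completion_tokens`. The name `summarizeCompletion` and the sample object are illustrative assumptions, not part of the SDK; the sample only mimics the response shape so the code runs without an API call.

```javascript
// Hypothetical helper (not part of the openai SDK): summarize a Chat Completions response.
function summarizeCompletion(response) {
  const choice = response.choices[0];
  return {
    text: choice.message.content ?? "",
    finishReason: choice.finish_reason,
    // "length" means generation stopped because the token limit was reached.
    truncated: choice.finish_reason === "length",
    totalTokens: response.usage?.total_tokens ?? null,
  };
}

// Hand-written sample in the shape the API returns (all values are made up).
const sampleResponse = {
  choices: [
    {
      message: { role: "assistant", content: "== coerces types; === does not." },
      finish_reason: "length",
    },
  ],
  usage: { prompt_tokens: 30, completion_tokens: 512, total_tokens: 542 },
};

console.log(summarizeCompletion(sampleResponse));
```

If `truncated` is true, raising `max_completion_tokens` or asking for a shorter answer is usually the next step.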

Streaming

javascript

const stream = await client.chat.completions.create({
  model: "gpt-4o",
  stream: true,
  messages: [{ role: "user", content: "Write a haiku about databases." }],
});

for await (const chunk of stream) {
  const text = chunk.choices[0]?.delta?.content || "";
  process.stdout.write(text);
}
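
When the full text is needed after streaming (for logging or storage), the chunks can be folded into one result. The sketch below is one way to do that; `appendChunk` and `collectStream` are illustrative names, and the sample chunks are made up to mimic the streamed delta shape. It assumes some chunks may carry an empty `choices` array (for example a usage-only chunk), so it skips those.

```javascript
// Hypothetical reducer: fold one streamed chunk into the accumulated state.
function appendChunk(state, chunk) {
  const choice = chunk.choices[0]; // undefined when the chunk has no choices
  if (!choice) return state;
  return {
    text: state.text + (choice.delta?.content ?? ""),
    finishReason: choice.finish_reason ?? state.finishReason,
  };
}

// Works with any async iterable of chunks, such as the SDK stream shown above.
async function collectStream(stream) {
  let state = { text: "", finishReason: null };
  for await (const chunk of stream) {
    state = appendChunk(state, chunk);
  }
  return state;
}

// Made-up chunks for a dry run without the API.
const sampleChunks = [
  { choices: [{ delta: { role: "assistant", content: "" }, finish_reason: null }] },
  { choices: [{ delta: { content: "Rows " }, finish_reason: null }] },
  { choices: [{ delta: { content: "sleep." }, finish_reason: null }] },
  { choices: [{ delta: {}, finish_reason: "stop" }] },
  { choices: [] },
];

const streamed = sampleChunks.reduce(appendChunk, { text: "", finishReason: null });
console.log(streamed.text, streamed.finishReason);
```

Keeping the reducer separate from the `for await` loop makes it easy to test without a network call.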

Structured output (JSON schema)

javascript

const response = await client.chat.completions.create({
  model: "gpt-4o",
  messages: [
    { role: "user", content: "Extract: John Doe, [email protected], Senior Engineer" },
  ],
  response_format: {
    type: "json_schema",
    json_schema: {
      name: "contact",
      schema: {
        type: "object",
        properties: {
          name: { type: "string" },
          email: { type: "string" },
          title: { type: "string" },
        },
        required: ["name", "email", "title"],
        additionalProperties: false,
      },
      strict: true,
    },
  },
});

const contact = JSON.parse(response.choices[0].message.content);
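
Calling `JSON.parse` directly assumes the model returned complete JSON. A more defensive sketch checks for a refusal and for truncated output first. `parseStructured` is an illustrative name, not an SDK function, and it assumes the message may carry a `refusal` string and that `finish_reason` is `"length"` when output was cut off. The sample choices are hand-written.

```javascript
// Hypothetical helper: turn a structured-output choice into a result object.
function parseStructured(choice) {
  const { message, finish_reason: finishReason } = choice;
  if (message.refusal) {
    return { ok: false, reason: "refusal", detail: message.refusal };
  }
  if (finishReason === "length") {
    // Output was cut off, so the JSON is probably incomplete.
    return { ok: false, reason: "truncated", detail: message.content };
  }
  try {
    return { ok: true, value: JSON.parse(message.content) };
  } catch (err) {
    return { ok: false, reason: "invalid_json", detail: err.message };
  }
}

// Hand-written samples in the shape of response.choices[0] (values are made up).
const goodChoice = {
  finish_reason: "stop",
  message: { content: '{"name":"John Doe","title":"Senior Engineer"}', refusal: null },
};
const cutOffChoice = {
  finish_reason: "length",
  message: { content: '{"name":"John', refusal: null },
};

console.log(parseStructured(goodChoice));
console.log(parseStructured(cutOffChoice));
```

Returning a result object instead of throwing keeps the calling code to a single `if (result.ok)` branch.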

Function calling / tool use

javascript

const tools = [
  {
    type: "function",
    function: {
      name: "search_database",
      description: "Search the product database",
      parameters: {
        type: "object",
        properties: {
          query: { type: "string" },
          limit: { type: "integer", default: 10 },
        },
        required: ["query"],
      },
    },
  },
];

const response = await client.chat.completions.create({
  model: "gpt-4o",
  tools,
  tool_choice: "auto",
  messages: [{ role: "user", content: "Find products under $50" }],
});

const toolCall = response.choices[0].message.tool_calls?.[0];
if (toolCall) {
  const args = JSON.parse(toolCall.function.arguments);
  // call your actual function: searchDatabase(args.query, args.limit)
}
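
After running the function, the result goes back to the model as a `tool` message that echoes the call's `id`. The sketch below builds one such message per tool call, since a single response can contain several. `buildToolMessages` and the stub `search_database` handler are illustrative assumptions, and the sample assistant message is hand-written.

```javascript
// Hypothetical local implementations, keyed by the tool name the model sees.
const toolHandlers = {
  search_database: ({ query, limit = 10 }) => ({ query, limit, results: [] }),
};

// Build one `tool` message per tool call in the assistant message.
function buildToolMessages(assistantMessage, handlers) {
  return (assistantMessage.tool_calls ?? []).map((call) => {
    let result;
    try {
      const handler = handlers[call.function.name];
      if (!handler) throw new Error(`Unknown tool: ${call.function.name}`);
      result = handler(JSON.parse(call.function.arguments));
    } catch (err) {
      result = { error: err.message }; // report failures back to the model
    }
    return { role: "tool", tool_call_id: call.id, content: JSON.stringify(result) };
  });
}

// Hand-written sample in the shape of response.choices[0].message.
const sampleAssistantMessage = {
  role: "assistant",
  content: null,
  tool_calls: [
    {
      id: "call_1",
      type: "function",
      function: { name: "search_database", arguments: '{"query":"price < 50"}' },
    },
  ],
};

const toolMessages = buildToolMessages(sampleAssistantMessage, toolHandlers);
console.log(toolMessages);
```

The follow-up request then sends the earlier messages, the assistant message that contains `tool_calls`, and these `tool` messages, in that order, so the model can write its final answer.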

Embeddings — semantic search

javascript

// Create embeddings
const embedding = await client.embeddings.create({
  model: "text-embedding-3-small",
  input: "How do I reset my password?",
});

const vector = embedding.data[0].embedding; // float array

// Compare two texts (cosine similarity)
function cosineSimilarity(a, b) {
  const dot = a.reduce((sum, val, i) => sum + val * b[i], 0);
  const magA = Math.sqrt(a.reduce((sum, val) => sum + val * val, 0));
  const magB = Math.sqrt(b.reduce((sum, val) => sum + val * val, 0));
  return dot / (magA * magB);
}
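
For semantic search, the similarity function is applied between the query vector and each stored vector, and the highest scores win. The following is a minimal in-memory sketch; `rankBySimilarity` is an illustrative name, and the three-dimensional vectors are toy stand-ins for real embeddings. A production setup would more likely use a vector database.

```javascript
// Same cosine similarity as above, repeated so this block runs on its own.
function cosineSimilarity(a, b) {
  const dot = a.reduce((sum, val, i) => sum + val * b[i], 0);
  const magA = Math.sqrt(a.reduce((sum, val) => sum + val * val, 0));
  const magB = Math.sqrt(b.reduce((sum, val) => sum + val * val, 0));
  return dot / (magA * magB);
}

// Hypothetical helper: rank stored items by similarity to a query vector.
function rankBySimilarity(queryVector, items, topK = 3) {
  return items
    .map((item) => ({ ...item, score: cosineSimilarity(queryVector, item.vector) }))
    .sort((x, y) => y.score - x.score) // highest score first
    .slice(0, topK);
}

// Toy vectors standing in for embeddings of three help articles.
const docs = [
  { id: "reset-password", vector: [0.9, 0.1, 0.0] },
  { id: "billing", vector: [0.0, 1.0, 0.0] },
  { id: "change-email", vector: [0.6, 0.4, 0.0] },
];

const topMatches = rankBySimilarity([1, 0, 0], docs, 2);
console.log(topMatches.map((d) => d.id)); // [ 'reset-password', 'change-email' ]
```

Embedding the documents once and storing the vectors means only the query needs a new embedding call at search time.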

Assistants API — simple agent

javascript

// 1. Create an Assistant (do this once)
const assistant = await client.beta.assistants.create({
  name: "Code Reviewer",
  instructions: "Review code for bugs, style issues, and security vulnerabilities.",
  model: "gpt-4o",
  tools: [{ type: "code_interpreter" }],
});

// 2. Create a Thread per user session
const thread = await client.beta.threads.create();

// 3. Add user message
await client.beta.threads.messages.create(thread.id, {
  role: "user",
  content: "Review this function: function add(a,b){return a+b}",
});

// 4. Run the Assistant
const run = await client.beta.threads.runs.createAndPoll(thread.id, {
  assistant_id: assistant.id,
});

// 5. Get the response (messages are listed newest first by default)
if (run.status === "completed") {
  const messages = await client.beta.threads.messages.list(thread.id);
  console.log(messages.data[0].content[0].text.value);
}
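
Reading `content[0].text.value` assumes the first content block is text. A message can hold several blocks, and some may be images, so a slightly more careful sketch joins only the text blocks of the newest assistant message. `latestAssistantText` is an illustrative name, the file ID is a placeholder, and the sample list is hand-written in the shape of `messages.data`.

```javascript
// Hypothetical helper: join the text blocks of the newest assistant message.
function latestAssistantText(messages) {
  // Assumes `messages` is ordered newest first, the default listing order.
  const latest = messages.find((m) => m.role === "assistant");
  if (!latest) return null;
  return latest.content
    .filter((block) => block.type === "text")
    .map((block) => block.text.value)
    .join("\n");
}

// Hand-written sample (values and file ID are made up).
const sampleMessages = [
  {
    role: "assistant",
    content: [
      { type: "text", text: { value: "Looks fine.", annotations: [] } },
      { type: "image_file", image_file: { file_id: "file_hypothetical" } },
      { type: "text", text: { value: "Consider validating inputs.", annotations: [] } },
    ],
  },
  {
    role: "user",
    content: [{ type: "text", text: { value: "Review this function", annotations: [] } }],
  },
];

console.log(latestAssistantText(sampleMessages));
```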

Image generation

javascript

const image = await client.images.generate({
  model: "dall-e-3",
  prompt: "A futuristic server room with glowing blue circuits",
  size: "1792x1024",
  quality: "hd",
  style: "vivid",
  n: 1,
});

console.log(image.data[0].url);

Audio transcription (Whisper)

javascript

import fs from "fs";

const transcription = await client.audio.transcriptions.create({
  model: "whisper-1",
  file: fs.createReadStream("recording.mp3"),
  language: "en", // optional ISO-639-1 code; supplying it can improve accuracy and latency
  response_format: "text",
});

console.log(transcription);

Text-to-speech

javascript

import fs from "fs";

const audio = await client.audio.speech.create({
  model: "tts-1",
  voice: "nova", // alloy, echo, fable, onyx, nova, shimmer
  input: "Hello, this is a test of the text to speech API.",
  speed: 1.0, // 0.25–4.0
});

const buffer = Buffer.from(await audio.arrayBuffer());
fs.writeFileSync("output.mp3", buffer);

Environment setup

bash

# Install SDK
npm install openai

# Set API key
export OPENAI_API_KEY=sk-...

# Python SDK
pip install openai

python3 -c "
import openai
client = openai.OpenAI()
r = client.chat.completions.create(
    model='gpt-4o-mini',
    messages=[{'role':'user','content':'Hello'}]
)
print(r.choices[0].message.content)
"

Related: OpenAI Codex Cheatsheet for Agents SDK | Claude API Cheatsheet for alternatives | Gemma 4 Local Setup for open models
