MeshWorld.

# Claude API Cheat Sheet: SDK, CLI, MCP & Prompting

By Vishnu Damwala

## Quick reference tables

### Models

| Model ID | Context | Best for |
| --- | --- | --- |
| `claude-opus-4-6` | 200K tokens | Complex reasoning, research, long documents |
| `claude-sonnet-4-6` | 200K tokens | Balanced speed + quality (default choice) |
| `claude-haiku-4-5-20251001` | 200K tokens | Fast, lightweight, high-volume tasks |

### Anthropic API — core requests

| Task | What to use |
| --- | --- |
| Chat completion | `POST /v1/messages` |
| Streaming response | `stream: true` in the request body |
| Count tokens | `POST /v1/messages/count_tokens` |
| List models | `GET /v1/models` |
| Create a batch | `POST /v1/messages/batches` |
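If you are not using the SDK, the same endpoints can be called directly over HTTP. Here is a minimal sketch of assembling a `POST /v1/messages` request; `buildMessagesRequest` is a hypothetical helper, not part of any library, and the `anthropic-version` value is the current stable date string:

```javascript
// Build a raw Messages API request without the SDK.
// buildMessagesRequest is a hypothetical helper for illustration only.
function buildMessagesRequest(apiKey, { model, maxTokens, messages }) {
  return {
    url: "https://api.anthropic.com/v1/messages",
    method: "POST",
    headers: {
      "x-api-key": apiKey,
      "anthropic-version": "2023-06-01",
      "content-type": "application/json",
    },
    body: JSON.stringify({ model, max_tokens: maxTokens, messages }),
  };
}

const req = buildMessagesRequest("sk-ant-demo", {
  model: "claude-sonnet-4-6",
  maxTokens: 1024,
  messages: [{ role: "user", content: "Hello!" }],
});
console.log(req.method, JSON.parse(req.body).model);
```

Pass `req.url` and the rest of the fields straight into `fetch` to make the actual call.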

### Messages API — key parameters

| Parameter | Type | What it does |
| --- | --- | --- |
| `model` | string | Which Claude model to use |
| `max_tokens` | int | Maximum tokens in the response |
| `messages` | array | Conversation history `[{role, content}]` |
| `system` | string | System prompt |
| `temperature` | float 0–1 | Randomness (0 = near-deterministic) |
| `top_p` | float 0–1 | Nucleus sampling |
| `top_k` | int | Token sampling pool size |
| `stop_sequences` | array | Strings that stop generation |
| `tools` | array | Tool/function definitions |
| `tool_choice` | object | Control tool use (`auto`, `any`, `tool`) |
| `stream` | bool | Stream tokens as they are generated |
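These knobs all go into the same request body. As a sketch, the `validateParams` helper below is hypothetical and exists only to make the expected ranges concrete:

```javascript
// Hypothetical validator illustrating the parameter ranges above.
function validateParams({ temperature, top_p, top_k, max_tokens }) {
  const errors = [];
  if (temperature !== undefined && (temperature < 0 || temperature > 1))
    errors.push("temperature must be in [0, 1]");
  if (top_p !== undefined && (top_p < 0 || top_p > 1))
    errors.push("top_p must be in [0, 1]");
  if (top_k !== undefined && (!Number.isInteger(top_k) || top_k < 1))
    errors.push("top_k must be a positive integer");
  if (!Number.isInteger(max_tokens) || max_tokens < 1)
    errors.push("max_tokens is required and must be a positive integer");
  return errors;
}

console.log(validateParams({ temperature: 0.2, max_tokens: 1024 })); // []
console.log(validateParams({ temperature: 1.5, max_tokens: 0 }));    // two errors
```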

### Claude Code CLI — essential commands

| Command | What it does |
| --- | --- |
| `claude` | Start the interactive REPL |
| `claude "fix this bug"` | One-shot prompt, no REPL |
| `claude -p "prompt"` | Non-interactive, print output |
| `claude --model claude-opus-4-6` | Use a specific model |
| `claude --no-stream` | Disable streaming |

Slash commands are typed inside the REPL, not passed as shell arguments:

| Slash command | What it does |
| --- | --- |
| `/help` | Show available slash commands |
| `/clear` | Clear conversation history |
| `/compact` | Compact context to save tokens |
| `/commit` | Auto-generate and create a git commit |
| `/review-pr 123` | Review a pull request |
| `/cost` | Show token usage and cost for the session |
| `/doctor` | Check Claude Code health |
| `/init` | Create a CLAUDE.md for this repo |

### Claude Code CLI — flags

| Flag | What it does |
| --- | --- |
| `--model` | Specify model ID |
| `--api-key` | Pass the API key directly |
| `--max-tokens` | Override max output tokens |
| `--add-dir /path` | Add a directory to the working context |
| `--print` / `-p` | Print output without the REPL |
| `--output-format json` | JSON output (for scripting) |
| `--output-format stream-json` | Streaming JSON output |
| `--verbose` | Show full tool call details |
| `--no-stream` | Wait for the full response |
| `--dangerously-skip-permissions` | Skip tool permission prompts |

### MCP — Model Context Protocol

| Command / concept | What it does |
| --- | --- |
| `claude mcp add name url` | Add an MCP server by URL |
| `claude mcp add name -- cmd args` | Add a local MCP server via stdio |
| `claude mcp list` | List configured MCP servers |
| `claude mcp remove name` | Remove an MCP server |
| `local` scope | Available in the current project only |
| `user` scope | Available across all projects |
| `project` scope | Shared via `.mcp.json` in the repo |
| `CLAUDE.md` | Project instructions Claude reads on start |
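For project-scoped servers, the shared configuration lives in a `.mcp.json` file at the repo root. A hedged sketch of the shape it takes (the server names, URL, and command below are placeholders; check the Claude Code docs for the exact schema your version expects):

```json
{
  "mcpServers": {
    "example-remote": {
      "url": "https://mcp.example.com/sse"
    },
    "example-local": {
      "command": "npx",
      "args": ["-y", "@example/db-mcp-server"]
    }
  }
}
```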

### Token limits (approx.)

| Model | Input limit | Output limit |
| --- | --- | --- |
| Opus 4.6 | 200K tokens | 32K tokens |
| Sonnet 4.6 | 200K tokens | 64K tokens |
| Haiku 4.5 | 200K tokens | 8K tokens |
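Requesting more output tokens than a model supports is a common source of 400 errors. A small sketch of clamping `max_tokens` client-side; `clampMaxTokens` is a hypothetical helper, and the limits table below treats "32K" as 32000 and so on:

```javascript
// Hypothetical helper: clamp a requested max_tokens to each model's output limit.
const OUTPUT_LIMITS = {
  "claude-opus-4-6": 32000,
  "claude-sonnet-4-6": 64000,
  "claude-haiku-4-5-20251001": 8000,
};

function clampMaxTokens(model, requested) {
  const limit = OUTPUT_LIMITS[model];
  if (limit === undefined) throw new Error(`Unknown model: ${model}`);
  return Math.min(requested, limit);
}

console.log(clampMaxTokens("claude-sonnet-4-6", 100000)); // 64000
console.log(clampMaxTokens("claude-haiku-4-5-20251001", 256)); // 256
```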

### Prompt caching

| Feature | What it does |
| --- | --- |
| `cache_control: {type: "ephemeral"}` | Cache a content block (5-minute TTL) |
| Cache hit | ~90% cheaper, ~85% faster than resending the full prompt |
| Minimum cacheable size | 1024 tokens (Opus/Sonnet), 2048 (Haiku) |
| Cacheable blocks | Tools, system prompt, messages |
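A back-of-envelope estimate of what a cache hit saves, using the ~90% figure above. The function and its price argument are hypothetical placeholders, not real Anthropic pricing:

```javascript
// Rough cost estimate: cached input tokens are billed at ~10% of the normal rate.
// pricePerInputToken is a placeholder; plug in real pricing from the docs.
function estimateInputCost(tokens, pricePerInputToken, { cached = false } = {}) {
  const discount = cached ? 0.1 : 1.0;
  return tokens * pricePerInputToken * discount;
}

const full = estimateInputCost(100000, 0.000003);                  // ~0.3
const hit = estimateInputCost(100000, 0.000003, { cached: true }); // ~0.03
console.log(`full: $${full.toFixed(2)}, cache hit: $${hit.toFixed(2)}`);
```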

## Detailed sections

### Basic API call (Node.js)

```javascript
import Anthropic from "@anthropic-ai/sdk";

const client = new Anthropic({ apiKey: process.env.ANTHROPIC_API_KEY });

const message = await client.messages.create({
  model: "claude-sonnet-4-6",
  max_tokens: 1024,
  messages: [{ role: "user", content: "Explain async/await in JavaScript." }],
});

console.log(message.content[0].text);
```

### Streaming response

```javascript
const stream = await client.messages.create({
  model: "claude-sonnet-4-6",
  max_tokens: 1024,
  stream: true,
  messages: [{ role: "user", content: "Write a short story." }],
});

for await (const event of stream) {
  if (event.type === "content_block_delta") {
    process.stdout.write(event.delta.text);
  }
}
```
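If you need the full response string rather than live output, the deltas can be accumulated. Here is a sketch using mock events shaped like the real stream's, so it runs without a network call:

```javascript
// Collect text from content_block_delta events into one string.
function collectText(events) {
  let out = "";
  for (const event of events) {
    if (event.type === "content_block_delta" && event.delta.type === "text_delta") {
      out += event.delta.text;
    }
  }
  return out;
}

// Mock events in the same shape the streaming API emits.
const mockEvents = [
  { type: "message_start" },
  { type: "content_block_delta", delta: { type: "text_delta", text: "Once upon " } },
  { type: "content_block_delta", delta: { type: "text_delta", text: "a time." } },
  { type: "message_stop" },
];
console.log(collectText(mockEvents)); // "Once upon a time."
```

With the real stream, replace the array with `for await (const event of stream)` and append to the same accumulator.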

### System prompt + multi-turn conversation

```javascript
const response = await client.messages.create({
  model: "claude-sonnet-4-6",
  max_tokens: 2048,
  system: "You are a senior backend engineer. Be concise and precise.",
  messages: [
    { role: "user", content: "What's wrong with N+1 queries?" },
    { role: "assistant", content: "N+1 queries happen when..." },
    { role: "user", content: "How do I fix it in PostgreSQL?" },
  ],
});
```

### Tool use (function calling)

```javascript
const response = await client.messages.create({
  model: "claude-sonnet-4-6",
  max_tokens: 1024,
  tools: [
    {
      name: "get_weather",
      description: "Get current weather for a city",
      input_schema: {
        type: "object",
        properties: {
          city: { type: "string", description: "City name" },
        },
        required: ["city"],
      },
    },
  ],
  messages: [{ role: "user", content: "What's the weather in Tokyo?" }],
});

// Check if Claude wants to use a tool
if (response.stop_reason === "tool_use") {
  const toolUse = response.content.find((b) => b.type === "tool_use");
  console.log(toolUse.name, toolUse.input); // get_weather { city: 'Tokyo' }
}
```
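After running the tool yourself, you return the result as a `tool_result` block in a follow-up `user` message that references the `tool_use` block's `id`. A sketch of building that follow-up turn; `buildToolResultTurn` is a hypothetical helper, not part of the SDK:

```javascript
// Build the follow-up messages array that hands a tool result back to Claude.
// buildToolResultTurn is a hypothetical helper for illustration.
function buildToolResultTurn(priorMessages, assistantContent, toolUseId, resultText) {
  return [
    ...priorMessages,
    { role: "assistant", content: assistantContent }, // includes the tool_use block
    {
      role: "user",
      content: [
        { type: "tool_result", tool_use_id: toolUseId, content: resultText },
      ],
    },
  ];
}

const next = buildToolResultTurn(
  [{ role: "user", content: "What's the weather in Tokyo?" }],
  [{ type: "tool_use", id: "toolu_123", name: "get_weather", input: { city: "Tokyo" } }],
  "toolu_123",
  "22°C, clear skies"
);
console.log(next.length); // 3
```

Pass `next` as `messages` in a second `client.messages.create` call (with the same `tools` array) to get the final natural-language answer.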

### Prompt caching — reduce costs on large system prompts

```javascript
const response = await client.messages.create({
  model: "claude-sonnet-4-6",
  max_tokens: 1024,
  system: [
    {
      type: "text",
      text: "You are an expert codebase assistant...\n\n[large context here]",
      cache_control: { type: "ephemeral" }, // cache this block
    },
  ],
  messages: [{ role: "user", content: "Explain the auth module." }],
});
// Subsequent calls with the same system block → cache hit → ~90% cheaper
```

### Claude Code — useful workflows

```bash
# One-shot: explain code without starting the REPL
claude -p "Explain what this does" < src/utils/parser.ts

# Pipe output from another command
git diff | claude -p "Summarize what changed in plain English"

# Use in scripts
SUMMARY=$(claude -p "Summarize this log" < app.log)
echo "$SUMMARY"

# Ask Claude to write tests
claude "Write unit tests for src/auth/login.ts using Vitest"

# Ask Claude to fix a failing test
claude "The test in auth.test.ts is failing — fix it"

# Review a PR (slash command, typed inside the REPL)
# /review-pr 42
```

### CLAUDE.md — project instructions

Create `CLAUDE.md` in your repo root. Claude Code reads it at the start of every session:

```markdown
# Project: My App

## Stack
- Node.js 20, TypeScript, Fastify, PostgreSQL
- Tests: Vitest, run with `pnpm test`
- Lint: `pnpm lint` (ESLint + Prettier)

## Conventions
- Use named exports, no default exports
- Prefer `async/await` over `.then()`
- All DB queries go in `src/db/queries/`

## Commands
- `pnpm dev` — start dev server
- `pnpm build` — production build
- `pnpm test` — run tests
```

### Batch API — process many prompts at once

```javascript
// Create a batch (asynchronous; processing can take up to ~1 hour)
const batch = await client.messages.batches.create({
  requests: [
    {
      custom_id: "req-1",
      params: {
        model: "claude-haiku-4-5-20251001",
        max_tokens: 256,
        messages: [{ role: "user", content: "Translate: Hello world" }],
      },
    },
    // ... up to 10,000 requests
  ],
});

// Check status (poll until it reports "ended")
const result = await client.messages.batches.retrieve(batch.id);
console.log(result.processing_status); // "ended" when done
```

The Batch API is ~50% cheaper than individual calls, making it a good fit for bulk classification, data extraction, and report generation.
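Since a single `retrieve` call only checks once, the polling loop is worth factoring out. In this sketch the retrieve call is injected as a function (e.g. one that wraps `client.messages.batches.retrieve`), which also lets the loop run here against a fake with no network; `pollBatch` is a hypothetical helper:

```javascript
// Poll until the batch reports "ended". `retrieve` is injected so this
// sketch is testable without the network.
async function pollBatch(retrieve, { intervalMs = 1000, maxAttempts = 10 } = {}) {
  for (let i = 0; i < maxAttempts; i++) {
    const batch = await retrieve();
    if (batch.processing_status === "ended") return batch;
    await new Promise((r) => setTimeout(r, intervalMs));
  }
  throw new Error("Batch did not finish in time");
}

// Fake retrieve: reports "ended" on the third call.
let calls = 0;
const fakeRetrieve = async () =>
  ({ processing_status: ++calls < 3 ? "in_progress" : "ended" });

const done = await pollBatch(fakeRetrieve, { intervalMs: 1 });
console.log(done.processing_status, calls); // ended 3
```

In production, pass `() => client.messages.batches.retrieve(batch.id)` and a longer interval.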

### Environment setup

```bash
# Install the Node.js SDK
npm install @anthropic-ai/sdk

# Set the API key
export ANTHROPIC_API_KEY=sk-ant-...

# Or use .env
echo "ANTHROPIC_API_KEY=sk-ant-..." >> .env

# Python SDK
pip install anthropic
```

```python
import anthropic

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from env

message = client.messages.create(
    model="claude-sonnet-4-6",
    max_tokens=1024,
    messages=[{"role": "user", "content": "Hello!"}],
)
print(message.content[0].text)
```

Wondering how Claude stacks up? Read Claude vs Gemini 2.5 for Coding: Honest Comparison.