## Quick reference tables

### Models
| Model ID | Context | Best for |
|---|---|---|
| `claude-opus-4-6` | 200K tokens | Complex reasoning, research, long documents |
| `claude-sonnet-4-6` | 200K tokens | Balanced speed + quality (default choice) |
| `claude-haiku-4-5-20251001` | 200K tokens | Fast, lightweight, high-volume tasks |
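If you're wiring model selection into code, a tiny lookup keeps the IDs from the table in one place. This is an illustrative helper (the `pickModel` name and the workload labels are made up here):

```js
// Hypothetical helper: map a workload label to a model ID from the table above.
const MODELS = {
  deep: "claude-opus-4-6",           // complex reasoning, research, long documents
  balanced: "claude-sonnet-4-6",     // default choice: speed + quality
  fast: "claude-haiku-4-5-20251001", // high-volume, lightweight tasks
};

function pickModel(workload = "balanced") {
  // Fall back to the balanced default for unknown labels.
  return MODELS[workload] ?? MODELS.balanced;
}

console.log(pickModel("deep")); // claude-opus-4-6
```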
### Anthropic API — core requests
| Task | What to use |
|---|---|
| Chat completion | POST /v1/messages |
| Streaming response | stream: true in request body |
| Count tokens | POST /v1/messages/count_tokens |
| List models | GET /v1/models |
| Create a batch | POST /v1/messages/batches |
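The SDK wraps these endpoints, but it helps to see the raw shape of a `POST /v1/messages` call. This sketch builds the request without sending it; the `x-api-key` and `anthropic-version` headers are required by the API:

```js
// Build (but don't send) a raw /v1/messages request, roughly what the SDK
// does under the hood. Actually sending it requires a real API key.
function buildMessagesRequest(apiKey, payload) {
  return {
    url: "https://api.anthropic.com/v1/messages",
    options: {
      method: "POST",
      headers: {
        "x-api-key": apiKey,
        "anthropic-version": "2023-06-01", // required version header
        "content-type": "application/json",
      },
      body: JSON.stringify(payload),
    },
  };
}

const req = buildMessagesRequest("sk-ant-...", {
  model: "claude-sonnet-4-6",
  max_tokens: 256,
  messages: [{ role: "user", content: "Hello" }],
});
console.log(req.url); // https://api.anthropic.com/v1/messages
```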
### Messages API — key parameters
| Parameter | Type | What it does |
|---|---|---|
| `model` | string | Which Claude model to use |
| `max_tokens` | int | Maximum tokens in the response |
| `messages` | array | Conversation history `[{role, content}]` |
| `system` | string | System prompt |
| `temperature` | float 0–1 | Randomness (0 = near-deterministic) |
| `top_p` | float 0–1 | Nucleus sampling |
| `top_k` | int | Token sampling pool size |
| `stop_sequences` | array | Strings that stop generation |
| `tools` | array | Tool/function definitions |
| `tool_choice` | object | Force tool use (`auto`, `any`, `tool`) |
| `stream` | bool | Stream tokens as they generate |
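Most of these parameters can be combined in one request body. A sketch with illustrative values (as a rule of thumb, tune `temperature` or `top_p`, rarely both at once):

```js
// A request body exercising several of the parameters above.
// Values are illustrative, not recommendations.
const body = {
  model: "claude-sonnet-4-6",
  max_tokens: 512,
  system: "Answer in one short paragraph.",
  temperature: 0.2,               // low randomness for factual answers
  stop_sequences: ["\n\nHuman:"], // halt generation if this string appears
  messages: [{ role: "user", content: "Define idempotency." }],
};

console.log(Object.keys(body).length); // 6
```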
### Claude Code CLI — essential commands
| Command | What it does |
|---|---|
| `claude` | Start interactive REPL |
| `claude "fix this bug"` | Start the REPL with an initial prompt |
| `claude -p "prompt"` | Non-interactive: print output and exit |
| `claude --model claude-opus-4-6` | Use a specific model |
| `claude --no-stream` | Disable streaming |
| `/help` | Show available slash commands |
| `/clear` | Clear conversation history |
| `/compact` | Compact context to save tokens |
| `/commit` | Auto-generate and create a git commit |
| `/review-pr 123` | Review a pull request |
| `/cost` | Show token usage and cost for the session |
| `/doctor` | Check Claude Code health |
| `/init` | Create a CLAUDE.md for the repo |

Slash commands are typed inside the interactive REPL, not on the shell command line.
### Claude Code CLI — flags
| Flag | What it does |
|---|---|
| `--model` | Specify model ID |
| `--api-key` | Pass API key directly |
| `--max-tokens` | Override max output tokens |
| `--add-dir /path` | Add a directory to the working context |
| `--print` / `-p` | Print output without the REPL |
| `--output-format json` | JSON output (for scripting) |
| `--output-format stream-json` | Streaming JSON output |
| `--verbose` | Show full tool call details |
| `--no-stream` | Wait for the full response |
| `--dangerously-skip-permissions` | Skip tool permission prompts |
### MCP — Model Context Protocol
| Command / Concept | What it does |
|---|---|
| `claude mcp add name url` | Add an MCP server by URL |
| `claude mcp add name -- cmd args` | Add a local MCP server via stdio |
| `claude mcp list` | List configured MCP servers |
| `claude mcp remove name` | Remove an MCP server |
| MCP scope `local` | Available in the current project only |
| MCP scope `user` | Available across all projects |
| MCP scope `project` | Shared via `.mcp.json` in the repo |
| `CLAUDE.md` | Project instructions Claude reads on start |
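For the `project` scope, servers live in a `.mcp.json` checked into the repo. A hypothetical example (server names, URLs, and package names are placeholders; check your Claude Code version's docs for the exact schema):

```json
{
  "mcpServers": {
    "docs": {
      "type": "http",
      "url": "https://example.com/mcp"
    },
    "filesystem": {
      "command": "npx",
      "args": ["-y", "@modelcontextprotocol/server-filesystem", "."]
    }
  }
}
```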
CLAUDE.md | Project instructions Claude reads on start |
### Token limits (approx.)
| Model | Input limit | Output limit |
|---|---|---|
| Opus 4.6 | 200K tokens | 32K tokens |
| Sonnet 4.6 | 200K tokens | 64K tokens |
| Haiku 4.5 | 200K tokens | 8K tokens |
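A defensive sketch that caps a requested `max_tokens` at the model's output limit, mirroring the table above (verify the limits against current docs before relying on them):

```js
// Output limits mirrored from the table above (approximate — verify).
const OUTPUT_LIMITS = {
  "claude-opus-4-6": 32000,
  "claude-sonnet-4-6": 64000,
  "claude-haiku-4-5-20251001": 8000,
};

function clampMaxTokens(model, requested) {
  const limit = OUTPUT_LIMITS[model] ?? 8000; // conservative fallback
  return Math.min(requested, limit);
}

console.log(clampMaxTokens("claude-haiku-4-5-20251001", 16000)); // 8000
```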
### Prompt caching
| Feature | What it does |
|---|---|
| `cache_control: {type: "ephemeral"}` | Cache a content block (5-min TTL) |
| Cache hit | ~90% cheaper, ~85% faster than reprocessing the full prompt |
| Minimum cacheable size | 1024 tokens (Opus/Sonnet), 2048 (Haiku) |
| Cacheable blocks | Tools, system prompt, messages |
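The "~90% cheaper" figure follows from the pricing multipliers. Per Anthropic's pricing docs (worth verifying for your models), cache writes bill at about 1.25x the base input rate and cache reads at about 0.1x. Back-of-envelope arithmetic:

```js
// Rough cache economics, assuming (verify against current pricing docs):
// cache write ≈ 1.25× base input price, cache read ≈ 0.1× base input price.
function promptCost(tokens, baseRatePerToken, { write = false, read = false } = {}) {
  const multiplier = write ? 1.25 : read ? 0.1 : 1.0;
  return tokens * baseRatePerToken * multiplier;
}

// 100K-token prompt at an illustrative $3 per million input tokens:
const base = promptCost(100_000, 3e-6);                 // uncached
const hit  = promptCost(100_000, 3e-6, { read: true }); // cache hit
console.log(`${Math.round((1 - hit / base) * 100)}% cheaper`); // 90% cheaper
```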
## Detailed sections
### Basic API call (Node.js)

```js
import Anthropic from "@anthropic-ai/sdk";

const client = new Anthropic({ apiKey: process.env.ANTHROPIC_API_KEY });

const message = await client.messages.create({
  model: "claude-sonnet-4-6",
  max_tokens: 1024,
  messages: [{ role: "user", content: "Explain async/await in JavaScript." }],
});

console.log(message.content[0].text);
```
### Streaming response

```js
const stream = await client.messages.create({
  model: "claude-sonnet-4-6",
  max_tokens: 1024,
  stream: true,
  messages: [{ role: "user", content: "Write a short story." }],
});

for await (const event of stream) {
  if (event.type === "content_block_delta") {
    process.stdout.write(event.delta.text);
  }
}
```
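The loop above prints deltas as they arrive; to also keep the full text, accumulate the `text_delta` payloads. A sketch against mock events so it runs offline (real events come from the stream above):

```js
// Accumulate streamed text deltas into the final message text.
function collectText(events) {
  let text = "";
  for (const event of events) {
    if (event.type === "content_block_delta" && event.delta?.type === "text_delta") {
      text += event.delta.text;
    }
  }
  return text;
}

// Mock event sequence standing in for a real stream:
const mock = [
  { type: "message_start" },
  { type: "content_block_delta", delta: { type: "text_delta", text: "Hel" } },
  { type: "content_block_delta", delta: { type: "text_delta", text: "lo" } },
  { type: "message_stop" },
];
console.log(collectText(mock)); // Hello
```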
### System prompt + multi-turn conversation

```js
const response = await client.messages.create({
  model: "claude-sonnet-4-6",
  max_tokens: 2048,
  system: "You are a senior backend engineer. Be concise and precise.",
  messages: [
    { role: "user", content: "What's wrong with N+1 queries?" },
    { role: "assistant", content: "N+1 queries happen when..." },
    { role: "user", content: "How do I fix it in PostgreSQL?" },
  ],
});
```
### Tool use (function calling)

```js
const response = await client.messages.create({
  model: "claude-sonnet-4-6",
  max_tokens: 1024,
  tools: [
    {
      name: "get_weather",
      description: "Get current weather for a city",
      input_schema: {
        type: "object",
        properties: {
          city: { type: "string", description: "City name" },
        },
        required: ["city"],
      },
    },
  ],
  messages: [{ role: "user", content: "What's the weather in Tokyo?" }],
});

// Check if Claude wants to use a tool
if (response.stop_reason === "tool_use") {
  const toolUse = response.content.find((b) => b.type === "tool_use");
  console.log(toolUse.name, toolUse.input); // get_weather { city: 'Tokyo' }
}
```
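After running the tool yourself, you continue the conversation by sending back the assistant's `tool_use` content, followed by a user message containing a `tool_result` block whose `tool_use_id` matches. A sketch with a made-up weather payload:

```js
// Build the follow-up messages array after executing a tool locally.
// The weather result here is fabricated for illustration.
function withToolResult(priorMessages, assistantContent, toolUse, resultJson) {
  return [
    ...priorMessages,
    { role: "assistant", content: assistantContent }, // includes the tool_use block
    {
      role: "user",
      content: [
        {
          type: "tool_result",
          tool_use_id: toolUse.id, // must match the tool_use block's id
          content: resultJson,
        },
      ],
    },
  ];
}

const messages = withToolResult(
  [{ role: "user", content: "What's the weather in Tokyo?" }],
  [{ type: "tool_use", id: "toolu_01", name: "get_weather", input: { city: "Tokyo" } }],
  { id: "toolu_01" },
  JSON.stringify({ temp_c: 18, conditions: "cloudy" }),
);
console.log(messages.length); // 3
```

Passing this `messages` array back to `client.messages.create` lets Claude compose a final answer from the tool output.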
### Prompt caching — reduce costs on large system prompts

```js
const response = await client.messages.create({
  model: "claude-sonnet-4-6",
  max_tokens: 1024,
  system: [
    {
      type: "text",
      text: "You are an expert codebase assistant...\n\n[large context here]",
      cache_control: { type: "ephemeral" }, // cache this block
    },
  ],
  messages: [{ role: "user", content: "Explain the auth module." }],
});
// Subsequent calls with the same system block → cache hit → ~90% cheaper
```
### Claude Code — useful workflows

```bash
# One-shot: explain code without starting the REPL
claude -p "Explain what this does" < src/utils/parser.ts

# Pipe output from another command
git diff | claude -p "Summarize what changed in plain English"

# Use in scripts
SUMMARY=$(claude -p "Summarize this log" < app.log)
echo "$SUMMARY"

# Ask Claude to write tests
claude "Write unit tests for src/auth/login.ts using Vitest"

# Ask Claude to fix a failing test
claude "The test in auth.test.ts is failing — fix it"
```

To review a pull request, type `/review-pr 42` inside an interactive session.
### CLAUDE.md — project instructions

Create `CLAUDE.md` in your repo root. Claude Code reads it at the start of every session:

```markdown
# Project: My App

## Stack
- Node.js 20, TypeScript, Fastify, PostgreSQL
- Tests: Vitest, run with `pnpm test`
- Lint: `pnpm lint` (ESLint + Prettier)

## Conventions
- Use named exports, no default exports
- Prefer `async/await` over `.then()`
- All DB queries go in `src/db/queries/`

## Commands
- `pnpm dev` — start dev server
- `pnpm build` — production build
- `pnpm test` — run tests
```
### Batch API — process many prompts at once

```js
// Create a batch (asynchronous; typically processed within ~1 hour)
const batch = await client.messages.batches.create({
  requests: [
    {
      custom_id: "req-1",
      params: {
        model: "claude-haiku-4-5-20251001",
        max_tokens: 256,
        messages: [{ role: "user", content: "Translate: Hello world" }],
      },
    },
    // ... up to 10,000 requests
  ],
});

// Poll for completion
const result = await client.messages.batches.retrieve(batch.id);
console.log(result.processing_status); // "ended" when done
```

The Batch API is ~50% cheaper than individual calls. It's a good fit for bulk classification, data extraction, and report generation.
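Once `processing_status` is `"ended"`, results come back one entry per `custom_id`, each carrying a `result.type` such as `"succeeded"` or `"errored"`. A small helper, shown over mocked entries, to split successes from failures:

```js
// Group batch result entries by outcome. The entry shape here mirrors the
// batch results stream ({ custom_id, result: { type, ... } }); data is mocked.
function groupBatchResults(entries) {
  const groups = { succeeded: [], failed: [] };
  for (const entry of entries) {
    (entry.result.type === "succeeded" ? groups.succeeded : groups.failed)
      .push(entry.custom_id);
  }
  return groups;
}

const mock = [
  { custom_id: "req-1", result: { type: "succeeded" } },
  { custom_id: "req-2", result: { type: "errored" } },
];
console.log(JSON.stringify(groupBatchResults(mock)));
// {"succeeded":["req-1"],"failed":["req-2"]}
```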
### Environment setup

```bash
# Install the Node SDK
npm install @anthropic-ai/sdk

# Set the API key
export ANTHROPIC_API_KEY=sk-ant-...

# Or use .env
echo "ANTHROPIC_API_KEY=sk-ant-..." >> .env

# Python SDK
pip install anthropic
```

The equivalent basic call in Python:

```python
import anthropic

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from env

message = client.messages.create(
    model="claude-sonnet-4-6",
    max_tokens=1024,
    messages=[{"role": "user", "content": "Hello!"}],
)
print(message.content[0].text)
```