Fix: Context Overflow & Token Burn from Large Tool Outputs in OpenClaw
OpenClaw burning through tokens or crashing from context overflow? Large tool outputs are bloating your session. Here's how compaction works and when to reset.
TL;DR: Large tool outputs (web scrapes, file reads, API responses) bloat your context window, burning tokens and degrading responses. Manage tool output size, use compaction, and reset sessions when needed.
The Error
Error: context_length_exceeded — This model's maximum context length is 200000 tokens.
However, your messages resulted in 234,521 tokens.
You might also see:
Error: Request too large — reduce the number of messages or tool outputs
Warning: Context approaching limit (187,432 / 200,000 tokens). Compaction triggered.
Or the subtler version — no error at all, but:
- Responses get slower and more expensive
- The AI starts "forgetting" earlier parts of the conversation
- Your API bill spikes for no obvious reason
- Responses become generic or ignore your instructions
Why This Happens
Every message in an OpenClaw session is part of the context — the full conversation history sent to the AI model with each request. This includes:
- Your system prompt (personality, instructions)
- All user messages
- All AI responses
- All tool outputs — and this is where things blow up
When the AI uses tools (browsing a webpage, reading a file, calling an API, searching your emails), the tool's output is added to the context. A single web scrape can be 10,000-50,000 tokens. Read a large file? That's thousands more. And it compounds:
Turn 1: "Summarize this webpage" → +15,000 tokens (page content)
Turn 2: "Now check this other page" → +22,000 tokens
Turn 3: "Compare them" → both pages are still in context = 37,000+ tokens
Turn 4: "Also check my email" → +8,000 tokens
...you're at 80,000 tokens and climbing
Each subsequent message sends ALL of that back to the API. You're paying for the full context every single turn.
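The compounding can be sketched numerically. The per-turn token counts below are hypothetical, but the arithmetic is the point: input cost is the running total re-billed every turn, not just the new tokens.

```python
# Rough illustration of why costs compound: the full context is re-sent
# with every request, so you pay for all of it again on each turn.
# Per-turn additions are hypothetical.
turn_additions = [15_000, 22_000, 8_000]  # tokens added by each tool call

context = 0
total_billed = 0
for added in turn_additions:
    context += added          # context only ever grows
    total_billed += context   # each request re-sends everything so far

print(context)       # → 45000 tokens now sitting in context
print(total_billed)  # → 97000 input tokens billed across three turns
```

Three tool calls adding 45K tokens of content cost more than double that in billed input, and the gap widens every turn.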
How to Fix It
Step 1: Understand what's eating your tokens
Check your current session size:
openclaw session info
Look for the token count. If it's above 100,000, you've got bloat.
You can also check the logs to see which tool calls are generating the most output:
openclaw logs --tail 200 | grep -i "tool\|tokens\|context"
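If you can export the session transcript as structured data, a rough chars-per-token heuristic is usually enough to spot the offending tool outputs. This is an illustrative sketch, not an OpenClaw API; the message shape is an assumption.

```python
# Rough token estimate: ~4 characters per token (real tokenizers differ).
def token_estimate(text):
    return len(text) // 4

# Rank messages by estimated token count. `messages` is assumed to be a
# list of {"role": ..., "content": ...} dicts from an exported transcript.
def biggest_messages(messages, top=3):
    sized = [(token_estimate(m["content"]), m.get("role", "?")) for m in messages]
    return sorted(sized, reverse=True)[:top]
```

A single tool message dwarfing everything else in the ranking is your bloat source.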
Step 2: Reset the session (immediate relief)
If your context is already overflowing, the fastest fix is a clean slate:
openclaw session reset
Or for a specific channel/conversation:
openclaw session reset --channel telegram
openclaw session reset --channel telegram --user 123456789
This clears the conversation history. The AI "forgets" the current conversation but will work normally again.
Step 3: Configure compaction
Compaction is OpenClaw's built-in mechanism for managing context size. When the context approaches the model's limit, compaction summarizes older messages to free up space.
{
  "ai": {
    "compaction": {
      "enabled": true,
      "threshold": 0.75,
      "strategy": "summarize",
      "preserveSystemPrompt": true,
      "preserveRecentMessages": 10
    }
  }
}
- threshold: 0.75 — Trigger compaction when context reaches 75% of the model's limit
- strategy: "summarize" — Summarize old messages instead of deleting them
- preserveRecentMessages: 10 — Always keep the last 10 messages intact
Important: Compaction itself costs tokens (it asks the model to summarize the conversation). But it's much cheaper than sending a 200K-token context with every message.
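The mechanism amounts to a single pass over the message list: everything older than the preserved tail gets collapsed into one summary message. A minimal sketch, where summarize stands in for the actual model call:

```python
# Simplified compaction: collapse everything except the most recent
# messages into one summary message. `summarize` stands in for a real
# model call that condenses the old messages.
def compact(messages, preserve_recent=10, summarize=lambda msgs: "(summary)"):
    if len(messages) <= preserve_recent:
        return messages  # nothing old enough to compact yet
    old, recent = messages[:-preserve_recent], messages[-preserve_recent:]
    summary = {"role": "system", "content": summarize(old)}
    return [summary] + recent
```

Fifteen messages in, eleven out: one summary plus the ten preserved recents. The summary is lossy, which is exactly why preserveRecentMessages exists.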
Step 4: Limit tool output size
Prevent the bloat in the first place by capping how much data tools can inject into context:
{
  "ai": {
    "maxToolOutputTokens": 4000,
    "truncateToolOutput": true
  }
}
This caps each tool's output to ~4,000 tokens. If a web scrape returns 50,000 tokens, only the first 4,000 are kept in context.
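Conceptually, the cap is a truncation at the tool boundary, before the output ever enters context. A sketch using the rough chars-per-token heuristic (a real implementation would count actual tokens):

```python
# Cap tool output before it enters the conversation context.
# chars/4 is a rough token approximation; real tokenizers differ.
def truncate_tool_output(text, max_tokens=4000, chars_per_token=4):
    max_chars = max_tokens * chars_per_token
    if len(text) <= max_chars:
        return text
    return text[:max_chars] + "\n[tool output truncated]"
```

The truncation marker matters: it tells the model the data was cut, so it can say so instead of answering confidently from a partial page.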
Step 5: Use smarter prompting
Instead of asking the AI to "read this webpage" (which dumps the entire page into context), ask it to extract specific information:
Bad (dumps everything into context):
Read https://example.com/long-report and tell me about it
Better (AI extracts only what's needed):
From https://example.com/long-report, extract only the key statistics and conclusions
The AI will still read the full page, but if configured correctly, it'll only keep the relevant parts in its response (which is what stays in context for future turns).
Step 6: Monitor token usage
Add token tracking to spot problems early:
# Check recent API costs
openclaw usage --last 24h
# Check per-session token counts
openclaw session list --verbose
If you see a session with 150K+ tokens, it's time for a reset or compaction review.
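A tiny watchdog over whatever per-session counts you can export makes that 150K threshold actionable. Session names and counts here are hypothetical:

```python
# Flag sessions whose token counts have crossed a reset threshold.
# `session_tokens` maps session name -> current token count, e.g.
# parsed from `openclaw session list --verbose` output.
def flag_bloated(session_tokens, limit=150_000):
    return sorted(name for name, tokens in session_tokens.items()
                  if tokens >= limit)
```

Run it on a schedule and reset (or review compaction settings for) whatever it flags.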
When Compaction Fails
Sometimes compaction doesn't help:
- Tool outputs that can't be summarized — binary data, code snippets, structured JSON. Compaction tries to summarize these and produces garbage.
- Compaction loops — the context is so large that the compaction call itself exceeds the limit. You're stuck and need a manual reset.
- Critical context gets lost — compaction summarizes away details the AI needed. The conversation quality drops after compaction.
If you're hitting these edge cases, session resets are your friend. Think of conversations as disposable — start a new session for each distinct task instead of running everything in one never-ending thread.
How to Prevent It
- Start new sessions for new tasks. Don't use one session for "summarize a webpage, then check my email, then write code." Each task should be its own session.
- Set maxToolOutputTokens from day one. 4,000-8,000 tokens is a good range; you rarely need more than that in context.
- Prefer concise tools. If you have a choice between a tool that returns raw HTML and one that returns extracted text, use the extracted version.
- Set a compaction threshold of 0.6-0.75. Lower thresholds mean more frequent compaction, but you'll never hit the hard limit.
- Monitor your API spending. A sudden spike in token usage almost always means context bloat.
The Easy Way
lobsterfarm is a managed hosting service for OpenClaw: deployment, updates, and support are handled for you.
Skip the setup and start using your AI assistant today. One click gets you a fully managed OpenClaw instance on your own server, running 24/7.