June 21, 2026
Agentic Coding Burns Tokens — Here's How to Keep It in Check
Why AI coding assistants consume so many tokens and how ZDR Chat's cost tracking helps you stay in control.
If you’ve used an AI coding assistant like Cursor, Copilot, or Claude Code, you’ve probably noticed one thing: they consume tokens fast.
This isn’t a bug — it’s a feature of how agentic coding works. But it can be a shock if you’re used to conversational AI where a single query costs fractions of a cent. Here’s what’s going on.
Why Agentic Coding Is Token-Intensive
Agentic coding tools don’t just answer questions. They:
- Read your entire codebase — every function, every file, every dependency gets sent as context
- Generate multi-file edits — creating or modifying dozens of files in a single session
- Self-correct — when they break something, they read the error and try again, compounding the token usage
A single agentic coding session can consume 100,000–500,000 tokens or more. At premium model pricing ($15/M input), that’s $1.50–$7.50 per session for input tokens alone; output tokens are billed on top.
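The arithmetic behind that range can be sketched in a few lines. The helper name `input_cost_usd` is hypothetical and not part of any tool mentioned here; it counts only input tokens at the $15/M rate quoted above.

```python
def input_cost_usd(tokens: int, price_per_million_usd: float) -> float:
    """Cost in USD of sending `tokens` input tokens at a per-million-token price."""
    return tokens / 1_000_000 * price_per_million_usd


# USD per million input tokens, the premium rate quoted in the text.
PREMIUM_INPUT_PRICE = 15.00

low = input_cost_usd(100_000, PREMIUM_INPUT_PRICE)
high = input_cost_usd(500_000, PREMIUM_INPUT_PRICE)
print(f"${low:.2f} to ${high:.2f} per session")
```

Swapping in a different price or token count shows how quickly the session cost moves with either one.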
ZDR Chat Is Not an Agentic Coding Tool
We should be clear about this: ZDR Chat is a conversational chat app. You send messages, get responses. You decide how much context to include.
This means your token usage is naturally bounded:
- Short Q&A: 50–200 tokens
- Document analysis: 5,000–50,000 tokens
- Code discussion: 500–10,000 tokens
That is nowhere near the 500K-token sessions common in agentic tools.
The Status Bar Is Your Friend
ZDR Chat’s status bar shows three things that help you stay aware of costs:
- Tokens in — how much you’ve sent this session
- Tokens out — how much the model has replied
- Cost — total USD for this session (based on actual model pricing)
Check it after every few messages. You’ll quickly develop a sense for what different activities cost.
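To make the three status bar numbers concrete, here is a minimal sketch of how a running tally like this could be kept. It is an illustration only, not ZDR Chat's actual implementation: the `SessionUsage` class, its fields, and the status line format are all assumptions.

```python
from dataclasses import dataclass


@dataclass
class SessionUsage:
    """Hypothetical per-session tally of tokens and cost."""
    input_price_per_m: float   # USD per million input tokens
    output_price_per_m: float  # USD per million output tokens
    tokens_in: int = 0
    tokens_out: int = 0

    def record(self, prompt_tokens: int, completion_tokens: int) -> None:
        """Add one request/response pair to the running totals."""
        self.tokens_in += prompt_tokens
        self.tokens_out += completion_tokens

    @property
    def cost_usd(self) -> float:
        """Total session cost from the accumulated token counts."""
        return (self.tokens_in * self.input_price_per_m
                + self.tokens_out * self.output_price_per_m) / 1_000_000

    def status_line(self) -> str:
        """Format the three figures the way a status bar might show them."""
        return (f"in: {self.tokens_in:,} | out: {self.tokens_out:,} "
                f"| cost: ${self.cost_usd:.4f}")


usage = SessionUsage(input_price_per_m=15.00, output_price_per_m=75.00)
usage.record(prompt_tokens=400, completion_tokens=500)
print(usage.status_line())
```

Input and output tokens are tracked separately because they are usually priced differently, so a single combined counter would not be enough to compute the cost.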
Tips for Keeping Costs Low
1. Start New Conversations for Big Topics
Context accumulates. A conversation with 20 previous messages sends all 20 messages with every new request. If you’re switching topics, start a fresh conversation.
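A rough model shows how fast this adds up. The sketch below assumes every message, whether yours or the model's, is the same size, and that each request resends the full history plus the new message. Both function names are hypothetical, and real conversations will vary.

```python
def input_tokens_for_request(turn: int, tokens_per_message: int) -> int:
    """Input tokens sent on the given turn (1-indexed).

    Each earlier turn left behind two messages (a question and a reply),
    and the current request adds one new question on top of that history.
    """
    prior_messages = 2 * (turn - 1)
    return (prior_messages + 1) * tokens_per_message


def total_input_tokens(turns: int, tokens_per_message: int) -> int:
    """Input tokens billed across a whole conversation of `turns` requests."""
    return sum(input_tokens_for_request(t, tokens_per_message)
               for t in range(1, turns + 1))


one_long_chat = total_input_tokens(turns=10, tokens_per_message=200)
ten_fresh_chats = 10 * total_input_tokens(turns=1, tokens_per_message=200)
print(one_long_chat, ten_fresh_chats)
```

Under these assumptions, ten turns in one conversation send 20,000 input tokens, while the same ten questions asked in fresh conversations send 2,000. Total input grows with the square of the number of turns, which is why splitting unrelated topics pays off.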
2. Use Cheaper Models for Simple Tasks
Don’t ask Claude Opus ($15/M input) to proofread a sentence. GPT-4o Mini ($0.15/M input) is 100× cheaper and perfectly capable for routine tasks.
3. Be Specific About What You Need
Instead of “Review this code for issues” (which will analyze everything), try “Find any SQL injection vulnerabilities in this function” (focused, less output).
4. Use Free Models When Appropriate
OpenRouter offers several free models (rate-limited, but $0 cost). Great for:
- Brainstorming ideas
- Formatting text
- Simple explanations
- Drafting emails
5. Check Your Balance Regularly
The status bar also shows your remaining credit as of your last balance check. OpenRouter also lets you set spending limits in your account settings.
Example: Code Review Sessions
Here’s what a typical code review costs with different models:
| Model | Input price (400 tokens pasted) | Output price (500-token response) | Cost |
|---|---|---|---|
| GPT-4o Mini | $0.15/M | $0.60/M | $0.00036 |
| Claude Sonnet | $1.29/M | $5.60/M | $0.0033 |
| Claude Opus | $15/M | $75/M | $0.0435 |
Even Claude Opus costs just 4 cents for a thorough code review. The key is keeping context small — don’t paste your entire codebase when one function will do.
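The effect of context size is easy to check with the Claude Opus rates from the table. In this sketch, `request_cost_usd` is a hypothetical helper, and the 200,000-token codebase is an assumed figure chosen only for illustration.

```python
def request_cost_usd(input_tokens: int, output_tokens: int,
                     input_price_per_m: float, output_price_per_m: float) -> float:
    """Cost in USD of one request, given token counts and per-million prices."""
    return (input_tokens * input_price_per_m
            + output_tokens * output_price_per_m) / 1_000_000


# Claude Opus rates as listed in the table (USD per million tokens).
OPUS_INPUT, OPUS_OUTPUT = 15.00, 75.00

# Pasting one function versus a whole codebase (200,000 tokens is assumed).
one_function = request_cost_usd(400, 500, OPUS_INPUT, OPUS_OUTPUT)
whole_codebase = request_cost_usd(200_000, 500, OPUS_INPUT, OPUS_OUTPUT)

print(f"one function:   ${one_function:.4f}")
print(f"whole codebase: ${whole_codebase:.4f}")
```

With the same 500-token response, the focused request costs about 4 cents and the whole-codebase request costs about $3.04, so nearly all of the difference comes from the input side.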
The Bottom Line
Agentic coding tools are expensive because they’re designed to work autonomously — reading everything, generating broadly, and iterating. That’s useful for some workflows, but it burns tokens fast.
ZDR Chat gives you full control over what you send. The cost tracking is right there in the status bar. Use cheaper models for routine tasks, keep context focused, and you’ll find that conversational AI costs pennies, not dollars.