Agentic Coding Burns Tokens — Here's How to Keep It in Check

If you’ve used an AI coding assistant like Cursor, Copilot, or Claude Code, you’ve probably noticed one thing: they consume tokens fast.

This isn’t a bug — it’s a feature of how agentic coding works. But it can be a shock if you’re used to conversational AI where a single query costs fractions of a cent. Here’s what’s going on.

Why Agentic Coding Is Token-Intensive

Agentic coding tools don’t just answer questions. They:

Read large parts of your codebase as context, often many files, functions, and dependencies at a time
Generate multi-file edits — creating or modifying dozens of files in a single session
Self-correct — when they break something, they read the error and try again, compounding the token usage

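A single agentic coding session can consume 100,000–500,000 tokens or more. At premium model pricing ($15/M input), that’s $1.50–$7.50 per session, and that counts every token at the input rate. Output tokens are billed higher, so real sessions can cost more.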
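The arithmetic behind that range can be checked with a minimal sketch. It treats every token as an input token billed at a flat rate, which is a simplification. The function name `session_cost_usd` is made up for illustration.

```python
def session_cost_usd(tokens: int, price_per_million_usd: float) -> float:
    """Cost of `tokens` tokens at a flat per-million-token price."""
    return tokens / 1_000_000 * price_per_million_usd


# Session sizes and the $15/M input rate quoted above.
low = session_cost_usd(100_000, 15.0)
high = session_cost_usd(500_000, 15.0)
print(f"${low:.2f} to ${high:.2f} per session")  # $1.50 to $7.50 per session
```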

ZDR Chat Is Not an Agentic Coding Tool

We should be clear about this: ZDR Chat is a conversational chat app. You send messages, get responses. You decide how much context to include.

This means your token usage is naturally bounded:

Short Q&A: 50–200 tokens
Document analysis: 5,000–50,000 tokens
Code discussion: 500–10,000 tokens

That is nothing like the 500K-token sessions common in agentic tools.

The Status Bar Is Your Friend

ZDR Chat’s status bar shows three things that help you stay aware of costs:

Tokens in — how much you’ve sent this session
Tokens out — how much the model has replied
Cost — total USD for this session (based on actual model pricing)

Check it after every few messages. You’ll quickly develop a sense for what different activities cost.
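To make those three numbers concrete, here is a rough sketch of the bookkeeping a status bar like this implies. It is not ZDR Chat's actual implementation: the class `SessionMeter` and its fields are hypothetical, and the prices are whatever the caller passes in.

```python
from dataclasses import dataclass


@dataclass
class SessionMeter:
    """Running totals for one chat session (hypothetical, for illustration)."""
    input_price_per_m: float   # USD per million input tokens
    output_price_per_m: float  # USD per million output tokens
    tokens_in: int = 0
    tokens_out: int = 0

    def record(self, sent: int, received: int) -> None:
        """Add one request/response pair to the running totals."""
        self.tokens_in += sent
        self.tokens_out += received

    @property
    def cost_usd(self) -> float:
        """Total session cost at the configured prices."""
        return (self.tokens_in * self.input_price_per_m
                + self.tokens_out * self.output_price_per_m) / 1_000_000


# Example prices only; substitute the rates of the model in use.
meter = SessionMeter(input_price_per_m=15.0, output_price_per_m=75.0)
meter.record(sent=400, received=500)
meter.record(sent=1_000, received=300)
print(meter.tokens_in, meter.tokens_out, f"${meter.cost_usd:.4f}")
```

Input and output are tracked separately because they are priced separately; a single combined token count would not be enough to compute the cost.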

Tips for Keeping Costs Low

1. Start New Conversations for Big Topics

Context accumulates. A conversation with 20 previous messages sends all 20 messages with every new request. If you’re switching topics, start a fresh conversation.
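A small sketch shows how quickly this adds up. It assumes a simplified model in which every request resends the full history, every user message and every reply is 100 tokens, and there is no system prompt or caching. The function name and the numbers are illustrative only.

```python
def total_input_tokens(turns: int, user_tokens: int, reply_tokens: int) -> int:
    """Input tokens billed over a conversation that resends its full history."""
    history = 0  # tokens already in the conversation
    total = 0    # input tokens billed so far
    for _ in range(turns):
        total += history + user_tokens          # request = history + new message
        history += user_tokens + reply_tokens   # the reply joins the history
    return total


one_long = total_input_tokens(turns=20, user_tokens=100, reply_tokens=100)
two_short = 2 * total_input_tokens(turns=10, user_tokens=100, reply_tokens=100)
print(one_long, two_short)  # 40000 20000
```

Under these assumptions, input tokens grow with the square of the number of turns, so splitting one 20-turn conversation into two 10-turn conversations halves the input tokens billed.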

2. Use Cheaper Models for Simple Tasks

Don’t ask Claude Opus ($15/M input) to proofread a sentence. GPT-4o Mini ($0.15/M input) is 100× cheaper and perfectly capable for routine tasks.

3. Be Specific About What You Need

Instead of “Review this code for issues” (which will analyze everything), try “Find any SQL injection vulnerabilities in this function” (focused, less output).

4. Use Free Models When Appropriate

OpenRouter offers several free models (rate-limited, but $0 cost). Great for:

Brainstorming ideas
Formatting text
Simple explanations
Drafting emails

5. Check Your Balance Regularly

The status bar shows your remaining credit as of your last balance check, so refresh it if the number looks stale. OpenRouter also lets you set spending limits in your account settings.

Example: Code Review Sessions

Here’s what a typical code review costs with different models:

Model	Input price (400 tokens pasted)	Output price (500-token response)	Cost
GPT-4o Mini	$0.15/M	$0.60/M	$0.00036
Claude Sonnet	$1.29/M	$5.60/M	$0.0033
Claude Opus	$15/M	$75/M	$0.0435

Even Claude Opus costs only about 4 cents for a thorough code review. The key is keeping context small: don’t paste your entire codebase when one function will do.
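To see how much context size matters, the sketch below prices the same 500-token review at the Claude Opus rates from the table, once with a 400-token function pasted in and once with a whole codebase. The 100,000-token codebase size is an assumption chosen for illustration, and `request_cost_usd` is a made-up helper name.

```python
def request_cost_usd(tokens_in: int, tokens_out: int,
                     in_price_per_m: float, out_price_per_m: float) -> float:
    """Cost of one request given token counts and per-million-token prices."""
    return (tokens_in * in_price_per_m + tokens_out * out_price_per_m) / 1_000_000


OPUS_IN, OPUS_OUT = 15.0, 75.0  # USD per million tokens, from the table above

one_function = request_cost_usd(400, 500, OPUS_IN, OPUS_OUT)
# Assumption for illustration: a pasted codebase of 100,000 tokens.
whole_codebase = request_cost_usd(100_000, 500, OPUS_IN, OPUS_OUT)

print(f"one function:   ${one_function:.4f}")    # $0.0435
print(f"whole codebase: ${whole_codebase:.4f}")  # $1.5375
```

With these numbers, the same review costs roughly 35 times more when the whole codebase is pasted, and nearly all of the difference is input tokens.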

The Bottom Line

Agentic coding tools are expensive because they’re designed to work autonomously — reading everything, generating broadly, and iterating. That’s useful for some workflows, but it burns tokens fast.

ZDR Chat gives you full control over what you send. The cost tracking is right there in the status bar. Use cheaper models for routine tasks, keep context focused, and you’ll find that conversational AI costs pennies, not dollars.