June 21, 2026

Agentic Coding Burns Tokens — Here's How to Keep It in Check

Why AI coding assistants consume so many tokens and how ZDR Chat's cost tracking helps you stay in control.

Tags: tips, pricing, coding

If you’ve used an AI coding assistant like Cursor, Copilot, or Claude Code, you’ve probably noticed one thing: they consume tokens fast.

This isn’t a bug — it’s a feature of how agentic coding works. But it can be a shock if you’re used to conversational AI where a single query costs fractions of a cent. Here’s what’s going on.

Why Agentic Coding Is Token-Intensive

Agentic coding tools don’t just answer questions. They:

  1. Read large parts of your codebase: files, functions, and dependencies are pulled in as context
  2. Generate multi-file edits: creating or modifying dozens of files in a single session
  3. Self-correct: when they break something, they read the error and try again, compounding the token usage

A single agentic coding session can consume 100,000–500,000 tokens or more. At premium model pricing ($15/M input), that’s $1.50–$7.50 per session for input tokens alone; output tokens are billed on top, usually at a higher rate.

ZDR Chat Is Not an Agentic Coding Tool

We should be clear about this: ZDR Chat is a conversational chat app. You send messages, get responses. You decide how much context to include.

This means your token usage is naturally bounded:

  • Short Q&A: 50–200 tokens
  • Document analysis: 5,000–50,000 tokens
  • Code discussion: 500–10,000 tokens

That is nothing like the 500K-token sessions common in agentic tools.

The Status Bar Is Your Friend

ZDR Chat’s status bar shows three things that help you stay aware of costs:

  • Tokens in — how much you’ve sent this session
  • Tokens out — how much the model has replied
  • Cost — total USD for this session (based on actual model pricing)

Check it after every few messages. You’ll quickly develop a sense for what different activities cost.
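To make those three numbers concrete, here is a minimal Python sketch of how a per-session counter like this could work. It is not ZDR Chat’s actual implementation: the `SessionUsage` class, its field names, and the status-line format are all hypothetical, and the prices are whatever your chosen model charges per million tokens.

```python
from dataclasses import dataclass


@dataclass
class SessionUsage:
    """Hypothetical per-session counter; not ZDR Chat's real code."""

    input_price: float   # USD per million input tokens (assumed known)
    output_price: float  # USD per million output tokens (assumed known)
    tokens_in: int = 0
    tokens_out: int = 0

    def record(self, sent: int, received: int) -> None:
        """Add one request/response pair to the running totals."""
        self.tokens_in += sent
        self.tokens_out += received

    @property
    def cost_usd(self) -> float:
        """Total session cost: each side billed at its own per-million rate."""
        billed = self.tokens_in * self.input_price + self.tokens_out * self.output_price
        return billed / 1_000_000

    def status_line(self) -> str:
        """Render the three values in a status-bar style string."""
        return f"in: {self.tokens_in:,} | out: {self.tokens_out:,} | cost: ${self.cost_usd:.4f}"


# Example: one exchange at $15/M input and $75/M output.
usage = SessionUsage(input_price=15.0, output_price=75.0)
usage.record(sent=400, received=500)
print(usage.status_line())  # in: 400 | out: 500 | cost: $0.0435
```

Input and output are tracked separately because they are billed at different rates.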

Tips for Keeping Costs Low

1. Start New Conversations for Big Topics

Context accumulates. A conversation with 20 previous messages sends all 20 messages with every new request. If you’re switching topics, start a fresh conversation.
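A rough sketch of why this adds up so quickly: if every request resends the whole history, total input tokens grow roughly with the square of the conversation length. The function name and the flat 200-tokens-per-message figure are assumptions for illustration, and the sketch ignores model replies, which also join the history in a real chat.

```python
def cumulative_input_tokens(message_sizes: list[int]) -> int:
    """Total input tokens sent when each request resends all earlier messages."""
    total = 0
    context = 0
    for size in message_sizes:
        context += size   # the new message joins the history
        total += context  # the whole history goes out with this request
    return total


# Assumed workload: 20 messages of 200 tokens each.
messages = [200] * 20
one_long_chat = cumulative_input_tokens(messages)
fresh_chats = sum(messages)  # each message sent once, with no history

print(one_long_chat)  # 42000
print(fresh_chats)    # 4000
```

Under these assumptions, one long conversation sends about ten times as many input tokens as twenty fresh ones.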

2. Use Cheaper Models for Simple Tasks

Don’t ask Claude Opus ($15/M input) to proofread a sentence. GPT-4o Mini ($0.15/M input) is 100× cheaper and perfectly capable for routine tasks.

3. Be Specific About What You Need

Instead of “Review this code for issues” (which will analyze everything), try “Find any SQL injection vulnerabilities in this function” (focused, less output).

4. Use Free Models When Appropriate

OpenRouter offers several free models (rate-limited, but $0 cost). Great for:

  • Brainstorming ideas
  • Formatting text
  • Simple explanations
  • Drafting emails

5. Check Your Balance Regularly

The status bar shows your remaining credit from your last balance check. OpenRouter also lets you set spending limits in your account settings.

Example: Code Review Sessions

Here’s what a typical code review costs with different models:

Model            Input price (400 tokens pasted)   Output price (500-token response)   Cost
GPT-4o Mini      $0.15/M                           $0.60/M                             $0.00036
Claude Sonnet    $1.29/M                           $5.60/M                             $0.0033
Claude Opus      $15/M                             $75/M                               $0.0435

Even Claude Opus costs just 4 cents for a thorough code review. The key is keeping context small — don’t paste your entire codebase when one function will do.
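If you want to check figures like these yourself, the arithmetic is input tokens times the input price plus output tokens times the output price, divided by one million. The sketch below assumes the per-million prices listed above, with $0.60/M taken as the GPT-4o Mini output price. The `PRICES` table and `review_cost` function are illustrative names, not part of any API.

```python
# (input price, output price) in USD per million tokens; assumed values.
PRICES = {
    "GPT-4o Mini": (0.15, 0.60),
    "Claude Sonnet": (1.29, 5.60),
    "Claude Opus": (15.00, 75.00),
}


def review_cost(model: str, tokens_in: int = 400, tokens_out: int = 500) -> float:
    """Cost in USD of one request: tokens_in pasted, tokens_out in the reply."""
    input_price, output_price = PRICES[model]
    return (tokens_in * input_price + tokens_out * output_price) / 1_000_000


for model in PRICES:
    print(f"{model}: ${review_cost(model):.5f}")
```

For a 400-token paste and a 500-token reply, this works out to roughly $0.00036, $0.0033, and $0.0435 respectively.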

The Bottom Line

Agentic coding tools are expensive because they’re designed to work autonomously — reading everything, generating broadly, and iterating. That’s useful for some workflows, but it burns tokens fast.

ZDR Chat gives you full control over what you send. The cost tracking is right there in the status bar. Use cheaper models for routine tasks, keep context focused, and you’ll find that conversational AI costs pennies, not dollars.