AI token counters lie—here’s how to bill clients accurately

The newsletter for newsletter operators

Daily field notes on deliverability, AI tools, hosting, and monetisation. No "top 10 plugins" filler — real tools, real numbers, real failures.

If you bill clients for AI-assisted work—copywriting, image generation, data extraction—you’ve probably noticed that the token count in your API dashboard doesn’t match the estimate your prompt tool gave you. Sometimes the difference is negligible. Other times it’s 20% or more, and you’re left trying to explain why the invoice doesn’t match the quote.

Token counting isn’t standardized across models, and the tools we use to estimate cost rarely account for the invisible overhead that APIs add. Here’s what’s actually happening, and how to track usage in a way that holds up when a client asks questions.

Why token counts don’t match between tools

Most AI token counters (the estimators built into prompt libraries, or third-party calculators) use open-source tokenizers that approximate how a model splits text. OpenAI’s tiktoken library is the most common. It’s accurate for the raw text of a GPT-3.5 or GPT-4 prompt, but on its own it doesn’t account for function definitions, system messages, or the formatting overhead that APIs inject when you use features like JSON mode or structured outputs.

Anthropic’s Claude models use a different tokenizer entirely. If you’re using a calculator built for OpenAI and then running prompts through Claude, your estimate can be off by 15–30%. The variance gets worse with non-English text, code blocks, or anything that includes special characters.

Then there’s the API wrapper problem. If you’re using a tool like LangChain, Make, or Zapier to call an AI model, the platform often adds its own metadata (timestamps, user IDs) and may retry failed calls, both of which inflate token usage without appearing in your prompt preview.

What your API dashboard is actually counting

Your API provider’s usage dashboard is the source of truth, but it’s counting more than you think. Every API call includes:

  • System messages that set model behavior (e.g., “You are a helpful assistant”).
  • Function definitions if you’re using tools or structured outputs.
  • Conversation history if you’re maintaining context across multiple turns.
  • Formatting tokens for JSON mode, which adds schema instructions behind the scenes.

A 500-token prompt can easily become 650 tokens by the time it hits the API. If you quoted a client based on the prompt alone, you’re eating the difference.
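To put a number on that gap, here is a minimal sketch. The helper name `overhead_pct` is made up for illustration and is not part of any library; it just compares an estimate against the billed count, using the 500 and 650 figures from the example above.

```python
def overhead_pct(estimated_tokens: int, billed_tokens: int) -> float:
    """Percentage by which the billed count exceeds the estimate."""
    if estimated_tokens <= 0:
        raise ValueError("estimated_tokens must be positive")
    # Multiply before dividing so round figures come out clean.
    return (billed_tokens - estimated_tokens) * 100 / estimated_tokens


# The example above: a 500-token prompt billed as 650 tokens.
print(overhead_pct(500, 650))  # 30.0
```

In other words, the 500-to-650 example is a 30% overrun on the prompt side alone, before any retries or follow-up calls.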

OpenAI’s API returns token counts in the response object (usage.prompt_tokens and usage.completion_tokens), so you can log the actual cost per call. Anthropic does the same in its usage field, as usage.input_tokens and usage.output_tokens. If you’re not capturing that data, you’re guessing.
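Because the two providers name these fields differently, it can help to normalize them before logging. The sketch below assumes the response has already been parsed into a plain dict (for example from the raw JSON body); the official SDKs return objects instead, so the access pattern would differ. `extract_usage` is a hypothetical helper, and the sample payloads are invented.

```python
def extract_usage(response: dict) -> dict:
    """Normalize a provider's usage block to input/output token counts."""
    usage = response.get("usage") or {}
    if "prompt_tokens" in usage:
        # OpenAI chat completions naming.
        input_tokens = usage["prompt_tokens"]
        output_tokens = usage["completion_tokens"]
    elif "input_tokens" in usage:
        # Anthropic Messages API naming.
        input_tokens = usage["input_tokens"]
        output_tokens = usage["output_tokens"]
    else:
        raise ValueError("response has no recognizable usage block")
    return {
        "input_tokens": input_tokens,
        "output_tokens": output_tokens,
        "total_tokens": input_tokens + output_tokens,
    }


# Made-up payloads for illustration, not real API output.
openai_style = {"usage": {"prompt_tokens": 650, "completion_tokens": 120}}
anthropic_style = {"usage": {"input_tokens": 650, "output_tokens": 120}}
print(extract_usage(openai_style) == extract_usage(anthropic_style))  # True
```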

How to track usage without undercharging

The simplest fix: log every API response and pull token counts directly from the provider. If you’re using Python, store the usage object in a CSV or database after each call. If you’re using a no-code tool like Make or Zapier, add a step that writes the token count to a Google Sheet or Airtable base.
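For the Python route, a logging step could look roughly like this. It is a sketch using only the standard library; the function name `log_usage`, the column names, and the file layout are all assumptions chosen for the example, not a required format.

```python
import csv
from datetime import datetime, timezone
from pathlib import Path

# Hypothetical column layout for the usage log.
FIELDS = ["timestamp", "project", "model", "input_tokens", "output_tokens"]


def log_usage(path, project, model, input_tokens, output_tokens):
    """Append one API call's token counts to a CSV, adding a header if new."""
    log_file = Path(path)
    is_new = not log_file.exists() or log_file.stat().st_size == 0
    with log_file.open("a", newline="", encoding="utf-8") as handle:
        writer = csv.DictWriter(handle, fieldnames=FIELDS)
        if is_new:
            writer.writeheader()
        writer.writerow({
            "timestamp": datetime.now(timezone.utc).isoformat(),
            "project": project,
            "model": model,
            "input_tokens": input_tokens,
            "output_tokens": output_tokens,
        })
```

You would call something like `log_usage("usage.csv", "acme-landing", "gpt-4", 650, 120)` right after each API response comes back, passing the counts from the provider's usage field rather than from an estimator.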

For client work, I run a weekly script that sums token usage by project tag and multiplies by the current API rate (at the time of writing, OpenAI charges $0.01 per 1K prompt tokens and $0.03 per 1K completion tokens for GPT-4 Turbo; rates change, so check the provider’s pricing page before invoicing). That number goes into the invoice as a line item, and I attach a CSV export if the client asks for proof.
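The weekly roll-up can be sketched in a few lines. This is not the actual script, just an illustration under stated assumptions: each row is a dict with `project`, `input_tokens`, and `output_tokens` keys (as `csv.DictReader` would produce), and the two rate constants are the figures quoted above, which should be treated as placeholders.

```python
from collections import defaultdict
from decimal import Decimal

# Rates quoted in the article, in dollars per 1K tokens. Placeholders only:
# replace them with the provider's current price list.
PROMPT_RATE_PER_1K = Decimal("0.01")
COMPLETION_RATE_PER_1K = Decimal("0.03")


def cost_by_project(rows):
    """Sum token usage per project tag and convert it to dollars."""
    totals = defaultdict(lambda: {"prompt": 0, "completion": 0})
    for row in rows:
        totals[row["project"]]["prompt"] += int(row["input_tokens"])
        totals[row["project"]]["completion"] += int(row["output_tokens"])
    costs = {}
    for project, t in totals.items():
        cost = (Decimal(t["prompt"]) / 1000 * PROMPT_RATE_PER_1K
                + Decimal(t["completion"]) / 1000 * COMPLETION_RATE_PER_1K)
        costs[project] = cost.quantize(Decimal("0.01"))  # round to cents
    return costs


# Invented sample rows; values are strings, as they would be when read from CSV.
sample = [
    {"project": "acme", "input_tokens": "6000", "output_tokens": "1500"},
    {"project": "acme", "input_tokens": "4000", "output_tokens": "500"},
    {"project": "globex", "input_tokens": "3000", "output_tokens": "1000"},
]
for project, cost in cost_by_project(sample).items():
    print(project, cost)
```

Using `Decimal` rather than floats keeps the invoice figures from picking up binary rounding noise.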

If you’re quoting a fixed price, pad your estimate by 25% to cover system message overhead, retries, and any follow-up calls the client requests mid-project. Estimators are useful for ballpark numbers, but they shouldn’t be the final quote.
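As a quick worked example of that padding, assuming the estimate is already expressed in dollars of API cost: `padded_quote` is a hypothetical helper, and the 25% default simply mirrors the rule of thumb above.

```python
from decimal import Decimal, ROUND_HALF_UP


def padded_quote(estimated_cost, pad=Decimal("0.25")):
    """Add a safety margin to an estimated API cost, rounded to cents."""
    padded = Decimal(str(estimated_cost)) * (1 + pad)
    return padded.quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)


# A $12.00 estimate with the 25% pad.
print(padded_quote("12.00"))  # 15.00
```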

When flat-rate pricing beats usage tracking

Some operators skip per-token billing entirely and charge a flat rate per deliverable—$200 for a landing page, $500 for a lead magnet, regardless of how many API calls it takes. This works if you’re confident in your workflow and don’t want to explain token math to every client.

The tradeoff: you absorb cost volatility. If a client requests three rounds of revisions, you’re paying for the extra tokens. If you’re using a model like GPT-4 or Claude Opus, a single long-context project can cost $15–$30 in API fees. Flat pricing makes sense when your process is repeatable and your margins are wide enough to cover outliers.

For subscription clients—content retainers, weekly reports—I set a token budget per month (e.g., 100K tokens) and log usage in a shared dashboard. If they go over, the next invoice includes an overage charge at $0.02 per 1K tokens. That rate is higher than my cost, but it discourages scope creep and keeps the math transparent.
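The overage math is simple enough to sketch. The defaults below mirror the example figures in the paragraph above (a 100K monthly budget and $0.02 per 1K tokens over it); `overage_charge` is an illustrative name, not an existing tool.

```python
from decimal import Decimal


def overage_charge(tokens_used, monthly_budget=100_000,
                   rate_per_1k=Decimal("0.02")):
    """Charge for tokens beyond the monthly budget, rounded to cents."""
    over = max(0, tokens_used - monthly_budget)  # no credit for unused tokens
    return (Decimal(over) / 1000 * rate_per_1k).quantize(Decimal("0.01"))


print(overage_charge(130_000))  # 0.60
print(overage_charge(90_000))   # 0.00
```

A client who uses 130K tokens against a 100K budget is billed for the extra 30K only; staying under budget produces no charge and no credit.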

Want more breakdowns like this? Subscribe to One Two Three Send for weekly deep-dives on the tools and workflows that power solo content businesses.

Heads up — some links in this article are affiliate links. If you sign up through them, we may earn a small commission at no extra cost to you. We only recommend tools we use ourselves.
