What Are AI Tokens? Complete Guide for Developers

Last updated: April 15, 2026 · 8 min read

If you work with AI APIs from OpenAI, Anthropic, or Google, you've seen the word "tokens" everywhere. Tokens directly affect how much you pay, how quickly responses arrive, and whether your prompt fits within the model's context window. This guide covers what a developer needs to know.

What is a token?

A token is a chunk of text that an AI model processes as a single unit. It's not exactly a word — it's closer to a syllable or a common character sequence. The tokenizer breaks your text into these chunks before the model sees it.

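For example, OpenAI's cl100k_base tokenizer typically encodes a short, common English word as a single token and splits longer or rarer words into several.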
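To see the chunks for yourself, you can decode each token ID back to text. The following is a minimal sketch, assuming the third-party tiktoken package is installed; `split_into_tokens` and `demo` are illustrative names, not part of tiktoken, and the sample strings are arbitrary.

```python
def split_into_tokens(enc, text):
    """Return the text chunk behind each token ID, in order.

    `enc` is any object with encode(text) -> list[int] and
    decode(list[int]) -> str, such as a tiktoken encoding.
    """
    return [enc.decode([token_id]) for token_id in enc.encode(text)]


def demo():
    # Assumption: tiktoken is installed (pip install tiktoken). The first
    # call to get_encoding may download the encoding file.
    import tiktoken

    enc = tiktoken.get_encoding("cl100k_base")
    for sample in ["Hello world", "tokenization", "unbelievably"]:
        pieces = split_into_tokens(enc, sample)
        print(f"{sample!r}: {len(pieces)} tokens -> {pieces}")
```

Calling `demo()` prints each sample with its chunks. For non-English text a single token can cover only part of a character, so a decoded chunk may show a replacement character.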

Why do tokens matter?

Three reasons:

  1. Cost: These APIs bill per token, with separate rates for input and output. At GPT-4o's listed rate of $5 per million input tokens, a 1,000-token prompt costs $0.005.
  2. Context window: Each model has a maximum token limit (e.g., GPT-4o: 128K, Claude: 200K, Gemini: up to 2M). Your prompt and the response together must fit within this limit.
  3. Speed: More output tokens mean a longer response time. Generation speed typically ranges from 30 to 200 tokens/second, depending on the model.
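The arithmetic behind these three points can be sketched in a few lines. This is an illustration only: the function names are hypothetical, "128K" is assumed to mean 128,000 tokens, and the 50 tokens/second rate is just an example value from the range quoted above.

```python
def input_cost_usd(tokens, price_per_million):
    """Cost of `tokens` input tokens at a per-million-token price."""
    return tokens * price_per_million / 1_000_000


def fits_context(prompt_tokens, max_output_tokens, context_window):
    """True if the prompt plus the reserved output budget fits the window."""
    return prompt_tokens + max_output_tokens <= context_window


def generation_seconds(output_tokens, tokens_per_second):
    """Rough time to generate the output, ignoring network and queue latency."""
    return output_tokens / tokens_per_second


# 1,000-token prompt at $5 per million input tokens
print(f"${input_cost_usd(1_000, 5.00):.4f}")

# 120,000-token prompt with 10,000 tokens reserved for the reply,
# against an assumed 128,000-token window
print(fits_context(120_000, 10_000, 128_000))

# 500 output tokens at an assumed 50 tokens/second
print(f"{generation_seconds(500, 50):.1f} s")
```

Reserving an output budget up front is a design choice: it catches prompts that would technically fit but leave no room for the response.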

Token costs across popular models

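| Model             | Input (per 1M tokens) | Output (per 1M tokens) | Context |
|-------------------|-----------------------|------------------------|---------|
| GPT-4o            | $5.00                 | $15.00                 | 128K    |
| GPT-4o mini       | $0.15                 | $0.60                  | 128K    |
| Claude 3.5 Sonnet | $3.00                 | $15.00                 | 200K    |
| Gemini 1.5 Flash  | $0.075                | $0.30                  | 1M      |
| Llama 3.1 70B     | $0.88                 | $0.88                  | 128K    |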
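To compare what one request would cost on each model, the prices in the table above can be dropped into a small lookup. This is a sketch under stated assumptions: the prices are copied from the table and may be out of date, and `PRICES`, `request_cost_usd`, and `rank_by_cost` are hypothetical names.

```python
# (input, output) price in USD per 1M tokens, copied from the table above
PRICES = {
    "GPT-4o": (5.00, 15.00),
    "GPT-4o mini": (0.15, 0.60),
    "Claude 3.5 Sonnet": (3.00, 15.00),
    "Gemini 1.5 Flash": (0.075, 0.30),
    "Llama 3.1 70B": (0.88, 0.88),
}


def request_cost_usd(model, input_tokens, output_tokens):
    """Total cost of one request: input and output are billed separately."""
    input_price, output_price = PRICES[model]
    return (input_tokens * input_price + output_tokens * output_price) / 1_000_000


def rank_by_cost(input_tokens, output_tokens):
    """Return (model, cost) pairs sorted from cheapest to most expensive."""
    costs = {m: request_cost_usd(m, input_tokens, output_tokens) for m in PRICES}
    return sorted(costs.items(), key=lambda item: item[1])


# Example: a 2,000-token prompt with a 500-token response
for model, cost in rank_by_cost(2_000, 500):
    print(f"{model:<18} ${cost:.6f}")
```

Because output tokens are often priced several times higher than input tokens, a long response can dominate the bill even when the prompt is short.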

How to count tokens

The most accurate way is to use the tokenizer library for your provider:

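import tiktoken  # pip install tiktoken

# cl100k_base covers GPT-4 and GPT-3.5; GPT-4o models use o200k_base
enc = tiktoken.get_encoding("cl100k_base")
tokens = enc.encode("Your prompt here")  # list of integer token IDs
print(len(tokens))  # exact token count for this encoding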

Or use WeighMyPrompt — it runs the same tiktoken tokenizer directly in your browser via WebAssembly. No data is sent anywhere.

Tips to reduce token usage

Different tokenizers for different models

Each provider uses its own tokenizer, with its own vocabulary and splitting rules.

This means the same text has a different token count on each provider. WeighMyPrompt uses exact counting for OpenAI and smart approximation for others.
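When no exact tokenizer is available, a character-based estimate is a common fallback. The sketch below assumes the widely quoted rule of thumb of roughly 4 characters per token for English text; it is not WeighMyPrompt's actual approximation method, and `approx_token_count` is a hypothetical name.

```python
import math


def approx_token_count(text, chars_per_token=4.0):
    """Estimate the token count from the character count.

    Assumption: about 4 characters per token, a rough average for English.
    Code, non-English text, and unusual formatting can deviate a lot.
    """
    if not text:
        return 0
    return math.ceil(len(text) / chars_per_token)


prompt = "Summarize the following article in three bullet points."
print(approx_token_count(prompt))
```

Rounding up keeps the estimate slightly conservative, which is the safer direction when checking a prompt against a context limit or a budget.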

Start counting

Try WeighMyPrompt — paste your prompt, see exact tokens, compare costs across 16 models, and optimize with one click. 100% free, 100% private.