This repository contains the code used in the paper Tokenization Multiplicity Leads to Arbitrary Price Variation in LLM-as-a-service by Ivi Chatzi, Nina Corvelo Benz, Stratis Tsirtsis and Manuel Gomez-Rodriguez.
Contents:
Providers of LLM-as-a-service have predominantly adopted a simple pricing model: users pay a fixed price per token. Consequently, one may think that the price two different users would pay for the same output string under the same input prompt is the same. In our work, we show that, surprisingly, this is not (always) true. We find empirical evidence that, particularly for non-English outputs, both proprietary and open-weights LLMs often generate the same (output) string with multiple different tokenizations, even under the same input prompt, and this in turn leads to arbitrary price variation. To address the problem of tokenization multiplicity, we introduce canonical generation, a type of constrained generation that restricts LLMs to only generate canonical tokenizations---the unique tokenization in which each string is tokenized during the training process of an LLM. Further, we introduce an efficient sampling algorithm for canonical generation based on the Gumbel-Max trick. Experiments on a variety of natural language tasks demonstrate that canonical generation is comparable to standard generation in terms of performance and runtime, and it solves the problem of tokenization multiplicity.
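As a toy illustration of tokenization multiplicity (not code from this repository; the vocabulary below is hypothetical), several distinct token sequences can decode to the same output string while incurring different per-token prices:

```python
# Hypothetical BPE-style vocabulary in which the same string
# admits several different tokenizations.
vocab = {0: "pre", 1: "fix", 2: "prefix", 3: "p", 4: "re"}

def decode(token_ids):
    """Concatenate the token strings to recover the output string."""
    return "".join(vocab[t] for t in token_ids)

# Three distinct tokenizations of the same output string.
canonical = [2]        # "prefix" as a single token
variant_a = [0, 1]     # "pre" + "fix"
variant_b = [3, 4, 1]  # "p" + "re" + "fix"

assert decode(canonical) == decode(variant_a) == decode(variant_b) == "prefix"

# Under fixed per-token pricing, identical outputs cost different amounts:
# the canonical tokenization costs 1 token, the variants cost 2 and 3 tokens.
print(len(canonical), len(variant_a), len(variant_b))  # prints "1 2 3"
```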
├── configs
├── data
├── figures
├── notebooks
├── outputs
│ ├── conflicts
│ └── ...
├── scripts
│ ├── coupled_generation.sh
│ ├── evals.sh
│ ├── multiplicity_open.sh
│ └── multiplicity_proprietary.sh
└── src
└── ccan
- `configs` contains yaml files that specify the experiment parameters.
- `data` contains the data used for our experiments.
- `figures` contains all the figures presented in the paper.
- `notebooks` contains Python notebooks to generate all the figures included in the paper.
- `outputs/conflicts` contains non-reproducible cases of tokenization multiplicity by proprietary models.
- `outputs/...` contains intermediate output files to be generated by the experiments' scripts.
- `scripts` contains a set of scripts used to run all the experiments presented in the paper.
- `src/ccan` contains all the code necessary to reproduce the results in the paper.
All the experiments were performed using Python 3.11. In order to create a virtual environment and install the project dependencies you can run the following commands:
python3 -m venv .env
source .env/bin/activate
pip install -e .

The experiments involve calling the OpenAI, Gemini and Claude APIs and require an API key for each:
export OPENAI_API_KEY="your-openai-api-key"
export ANTHROPIC_API_KEY="your-anthropic-api-key"
export GOOGLE_API_KEY="your-gemini-api-key"

Our experiments use LLMs from the Llama family, which require a license to use. You can request access at: https://huggingface.co/meta-llama/Llama-3.1-8B-Instruct. Once you have access, you can download any model in the Llama family. Then, before running the scripts, you need to authenticate with your Hugging Face account:
huggingface-cli login
export HF_HOME="/path/to/your/cache"

To obtain outputs for the tokenization multiplicity experiments, run the following scripts:
./scripts/multiplicity_open.sh
./scripts/multiplicity_proprietary.sh
ccan run -c configs/long-translate.yaml

To recreate the plots, run the notebooks notebooks/tokenization_multiplicity.ipynb and notebooks/repeat.ipynb.
To obtain outputs for the canonical generation experiments, run the following scripts:
./scripts/coupled_generation.sh
./scripts/coupled_generation.sh --interventional
./scripts/evals.sh

To recreate the results, run the notebook notebooks/evaluation.ipynb.
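The canonical generation algorithm is based on the Gumbel-Max trick, as described above. The following is a minimal, illustrative sketch of that idea (the function name and the mask construction are our own, not the implementation in `src/ccan`): adding independent Gumbel(0, 1) noise to the logits and taking the argmax over the set of allowed next tokens is equivalent to sampling from the softmax distribution renormalized over that set.

```python
import numpy as np

def gumbel_max_constrained_sample(logits, allowed_mask, rng):
    """Sample a token index from softmax(logits) restricted to allowed_mask.

    Perturbing each logit with independent Gumbel(0, 1) noise and taking the
    argmax over the allowed tokens draws from the softmax distribution
    renormalized over the allowed set (the Gumbel-Max trick).
    """
    perturbed = logits + rng.gumbel(size=logits.shape)
    perturbed = np.where(allowed_mask, perturbed, -np.inf)  # mask disallowed tokens
    return int(np.argmax(perturbed))

rng = np.random.default_rng(0)
logits = np.array([2.0, 1.0, 0.5, -1.0])
# Suppose only tokens 0 and 2 continue a canonical tokenization.
allowed = np.array([True, False, True, False])
token = gumbel_max_constrained_sample(logits, allowed, rng)
print(token)  # always 0 or 2, never a masked token
```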
If you have questions about the code, identify potential bugs, or would like us to include additional functionality, feel free to open an issue or contact Ivi Chatzi.
If you use parts of the code in this repository for your own research, please consider citing:
@article{chatzi2026tokenization,
  title={Tokenization Multiplicity Leads to Arbitrary Price Variation in LLM-as-a-service},
  author={Ivi Chatzi and Nina Corvelo Benz and Stratis Tsirtsis and Manuel Gomez-Rodriguez},
  year={2026},
  journal={arXiv preprint arXiv:2506.06446}
}
