This repository contains code and data for inferring the internal floating-point precision of language models from API-exposed logprobs.
Blog post: TODO
## Quick start

```sh
# Compile the attack
make

# Run on cached OpenAI data
python src/eval.py --openai

# Run on cached Gemini data
python src/eval.py --gemini

# Run on synthetic ground-truth data
python src/eval.py --synthetic

# Infer precision from raw logprobs
./inverted_search -1.5 -0.77 -3.02 -4.1 -5.2
```

## Results

### OpenAI

| Model | Precision | Agreement |
|---|---|---|
| gpt-3.5-turbo | FP32 | 100% |
| gpt-4 | FP32 | 100% |
| gpt-4-turbo | FP32 | 100% |
| gpt-4o | BF16 | 97% |
| gpt-4o-mini | BF16 | 100% |
| gpt-4.1 | BF16 | 100% |
| gpt-4.1-mini | BF16 | 100% |
| gpt-4.1-nano | BF16 | 100% |
### Gemini

| Model | Precision | Agreement |
|---|---|---|
| gemini-2.0-flash | FP32 | 100% |
| gemini-2.0-flash-lite | FP32 | 100% |
Note: Gemini 2.5+ models do not expose logprobs.
### Synthetic ground truth

| Precision | Accuracy | Notes |
|---|---|---|
| FP32 | 100% | |
| FP16 | 100% | |
| BF16 | 100% | |
| FP8 E4M3 | 89% | Sometimes detected as E5M2 |
| FP8 E5M2 | 100% | |
## How it works

When an LLM computes logprobs, it performs:

```
logprob[i] = logit[i] - log_sum_exp(logits)
```
If the logits are stored in a reduced-precision format (e.g., BF16), then, when widened to FP32 for the logprob computation, they carry a characteristic pattern of trailing zeros in the mantissa. The attack recovers the log_sum_exp constant by searching over possible logit values.

Key insight: if the logits are BF16, there are only 2^16 = 65,536 possible bit patterns. We enumerate candidate values for logit[0], compute w = candidate - logprob[0], and verify that logprob[i] + w is a valid BF16 value for every remaining logprob.
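The enumeration above can be sketched in Python. This is a minimal illustration of the idea, not the optimized C implementation in src/inverted_search.c; the helper names (`bf16_round`, `recover_shift_candidates`) and the tolerance are our own choices:

```python
import math
import struct

def bf16_round(x: float) -> float:
    """Round a float to the nearest BF16 value: round the float32 bit
    pattern to nearest-even at bit 16, then zero the low 16 bits.
    (Overflow to infinity is ignored in this sketch.)"""
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    bits = (bits + 0x7FFF + ((bits >> 16) & 1)) & 0xFFFF0000
    return struct.unpack("<f", struct.pack("<I", bits))[0]

def recover_shift_candidates(logprobs, tol=1e-6):
    """Enumerate all 2^16 BF16 bit patterns as candidates for logit[0].
    For each candidate c, form w = c - logprobs[0] and keep w if every
    logprobs[i] + w lands (within tol) on a BF16-representable value.
    The true log_sum_exp constant is among the returned shifts."""
    candidates = []
    for hi in range(0x10000):
        c = struct.unpack("<f", struct.pack("<I", hi << 16))[0]
        if not math.isfinite(c):
            continue  # skip inf/NaN bit patterns
        w = c - logprobs[0]
        if all(abs(bf16_round(p + w) - (p + w)) <= tol for p in logprobs[1:]):
            candidates.append(w)
    return candidates
```

With only a handful of logprobs, a shift that moves every value by a common BF16 grid spacing may also pass the check, so the candidate set can contain more than one entry; additional logprobs narrow it down.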
## Repository layout

```
src/
  inverted_search.c    # Fast attack
  brute_force.c        # Naive attack
  eval.py              # Evaluate on cached data
  cache_logprobs.py    # Collect logprobs from OpenAI API
  cache_gemini.py      # Collect logprobs from Gemini API
  cache_quantized.py   # Generate synthetic ground-truth data
data/
  openai/              # Cached OpenAI model logprobs
  gemini/              # Cached Gemini model logprobs
  synthetic/           # Ground-truth validation data
```
## Building

```sh
# Fast inverted search (recommended)
make

# Or manually:
gcc -O3 -march=native src/inverted_search.c -o inverted_search -lm

# Brute-force (slow, for demonstration)
gcc -O3 -march=native src/brute_force.c -o brute_force -lm
```

## Evaluating on cached data

```sh
make
python src/eval.py
```

## Collecting fresh logprobs

```sh
# OpenAI
export OPENAI_API_KEY=...
python src/cache_logprobs.py

# Gemini
export GEMINI_API_KEY=...
python src/cache_gemini.py
```

## Generating synthetic ground-truth data

Requires a GPU and PyTorch:

```sh
pip install torch transformers accelerate
python src/cache_quantized.py
```

## License

MIT