Denser than any language. Measured in tokens.
Tokenese is an open spec for a token-native interlingua: the language LLMs use to talk to each other, more compressed and more precise than human language, in plain text that crosses any wire and any vendor.
LLMs conforming to human language is like watching film in black and white. Human languages carry overhead shaped by human constraints: serial speech, social hedging, redundancy against noisy air. Tokenese brings color to machine-to-machine communication: richer exchanges, compressed into a smaller, token-native format.
Example exchange
English, about 55 tokens
Could you check whether the deploy of the edge function to the Supabase project succeeded, and if it failed, look at the logs and tell me the first error with a timestamp?
Tokenese v0.3, about 22 tokens
^grammar:v0.3 ^declare:level=L2 @svc := supabase/edge-fn @svc.deploy >>> @svc.status !@svc.ok? *>> get @svc.logs.first-error +ts
Same request, hand-authored to v0.3 grammar and verified by the deterministic checker. Lower ambiguity: handles and typed slots force which service and which ordering.
Why a designed language
Models inherit all the overhead of human language and none of its benefits. The bet behind Tokenese is that accuracy and compression are not a tradeoff here: natural language sits so far from the efficient frontier that a designed language can win on both axes at once. Tokenese is not a pidgin. A pidgin trades precision for ease of acquisition; Tokenese trades acquisition cost, it must be specified and learned, for precision and density.
Token-space only
Plain text crosses the wire; each party tokenizes independently. No embeddings, no KV-cache sharing, no latent channels. Security and cross-vendor portability both require it.
Audited lexicon
A function-vocabulary symbol enters only if it costs one token, worst case, in every certified tokenizer. Content words are admitted on tokens-per-meaning advantage. Claims are reproducible.
Self-repairing
A dedicated misparse signal and a plain-English escape hatch are mandatory. Misparse-retry rate is the metric that validates or kills the whole scheme.
Compression comes from structure, not glyphs. An empirical finding from the audit: common English words are already optimal tokens, and exotic Unicode usually is not. The savings come from eliminating function-word syntax, anaphora, and repeated referents, not from substituting symbols for words.
Every word earns its place
No element ships unaudited. The closed function vocabulary, the sigils and operators below, must cost one token worst case (bare and space-prefixed) in every tokenizer the spec certifies. The current audit covers OpenAI o200k_base and the Anthropic count-tokens API. New tokenizer columns are additive: they may shrink the admissible alphabet, never silently expand it.
22 ASCII sigils, plus 12 audited digraphs, 8 Unicode survivors, and a 30-word core verb set. Items that cost 2+ tokens on either side (!=, >=, ]], most Greek, all CJK, all emoji) are excluded by the audit, not by taste.
Reproduce it
The audit is the spec's first testable claim class. Anyone can rerun it offline against the named tokenizers:
python3 -m venv .venv && .venv/bin/pip install -r requirements.txt .venv/bin/python audit_symbols.py # OpenAI o200k_base + cl100k_base ANTHROPIC_API_KEY=... .venv/bin/python audit_anthropic.py
The grammar in one screen
One statement per line, op first, keyed slots. No articles, no copulas, no hedges except the explicit operator below. A handle binds a referent once with :=; every later use references it. Grammar v0.3 is additive over v0.2 and activates when an artifact opens with ^grammar:v0.3. The complete grammar lives in GRAMMAR-v0.3.md; the foundational wire grammar is in spec.md.
| Sigil | Meaning (v0.3) |
|---|---|
| @h := x | bind handle @h to referent x; later uses reference it |
| >>> | temporal sequence (then) |
| *>> | stipulated causation; admissible only with source corroboration |
| ?>> | hypothesized causation |
| !@h | negation prefix (not h) |
| @h? | hedge suffix (possibly h) |
| ^declare:level=Ln | declared conformance level (L0-L3), first statement only |
| ^plain<<< | open a closed plain region, closed by >>>^plain |
| """...""" | raw source quote, passed through verbatim |
| ?? | repair: ?? statement, ??@h handle, ??: reason |
| # ... | line comment (leading # only); ignored by the protocol |
Handshake
A: tokenese? v:0.3 B: tokenese ok v:0.3
Confirm capability before dropping natural language. Exit any time with plain, re-enter with dense.
Binding and repair
^grammar:v0.3 @spec := tokenese/spec.md fix @spec add:handshake ??@spec # that handle misparsed, resend in English
Three misparses on one topic and both parties stay in plain English. The failure is logged as lexicon-design feedback.
Report-only framesets
The v0.3.2 toolchain adds an experimental controlled-vocabulary registry at framesets.json. Registered ops carry typed slot signatures such as deploy :: who what to:env when:date -> status. The validator reports missing required slots, extra positional args, noncanonical slot order, and unregistered slots, but this is telemetry only: it does not change parser acceptance, conformance level, or checker outcome.
Status
Grammar v0.3 current; dual-tokenizer audit complete. Release v0.3.2 is a patch tooling release. A reference toolchain ships in the repo: a Tokenese-to-English translator, a deterministic per-pair conformance checker with a CLI, an MCP server (parse, validate, validate_framesets, to-english, check-pair, score-pair), and a report-only frameset registry for common op signatures. 132 of 132 tests pass. The validating experiment, a live A/B between model families measuring tokens, task success, and misparse-retry rate, remains the open downstream measurement. If retries eat the savings, the spec says the design has failed.