# Tokenese

A token-native interlingua for LLM-to-LLM communication: denser and more precise than any human language, in plain text that crosses any wire and any vendor boundary. Every lexicon element is admitted by empirical tokenizer audit.

## What it is

Tokenese is an open specification, not a product. It defines a wire grammar, a closed function vocabulary of sigils and operators, an in-band symbol table, a capability handshake, and a self-repair protocol. The bet: natural language sits so far from the efficient frontier that a designed language can be both more compressed and more accurate at once.

## Key concepts

- Token-space only: plain text on the wire, each party tokenizes independently. No embeddings, KV-cache sharing, or latent channels (security and cross-vendor portability require this).
- Tokenizer-audited lexicon: function-vocabulary symbols cost 1 token worst case (bare and space-prefixed) in every certified tokenizer; content words admitted on tokens-per-meaning advantage. Current audit covers seven columns: OpenAI o200k_base, Anthropic count-tokens (claude-haiku-4-5), Gemini (gemini-2.5-flash, API-gated), Qwen2.5-7B, DeepSeek-V3, Llama 3 8B, and Gemma 4 E4B (mlx-community/gemma-4-e4b-it-4bit; the on-device PAICE production generator). Six are offline-reproducible; Gemini is API-gated.
- Compression from structure, not glyphs: savings come from eliminating function-word syntax, anaphora, and repeated referents.
- Report-only framesets: typed slot signatures for common ops are available as structural telemetry; they do not change parser acceptance or checker outcomes.
- Self-repairing: the `??` misparse signal and a plain-English escape hatch are mandatory. Misparse-retry rate is the metric that validates or kills the design.
- Human-auditable: a competent human with the one-page audit card can follow any conforming transcript.

## Status

Grammar v0.3 current (additive over v0.2; spec.md holds the foundational wire grammar).
Release v0.3.8 is a patch tooling release: seven-column tokenizer audit complete with the Gemma column now native (Gemma 4 E4B); reference translator, deterministic checker, CLI, MCP server, and report-only frameset validator ship in the repo (144/144 translator tests pass, plus 3 root security tests and 4 Gemma 4 audit-surface tests; 151 total). A cross-surface portable skill ships at skills/tokenese/. The hosted assistant guide is anchored at GuideCheck Level 4 via a DNS TXT record at _assistant-guide.tokenese.org with a daily drift-detection CI job. The validating live A/B experiment (tokens, task success, misparse-retry rate across model families) has not run yet.

## Key files

- Specification (foundational wire grammar): https://github.com/snapsynapse/tokenese/blob/main/spec.md
- Grammar v0.3 (current, additive over v0.2): https://github.com/snapsynapse/tokenese/blob/main/GRAMMAR-v0.3.md
- Design invariants and intent: https://github.com/snapsynapse/tokenese/blob/main/INTENT.md
- Conformance claim classes: https://github.com/snapsynapse/tokenese/blob/main/CONFORMANCE.md
- Reference toolchain (translator, checker, CLI, MCP server): https://github.com/snapsynapse/tokenese/tree/main/tools/translator
- Frameset registry: https://github.com/snapsynapse/tokenese/blob/main/framesets.json
- Changelog: https://github.com/snapsynapse/tokenese/blob/main/CHANGELOG.md

## Assistant guide

- [Assistant Guide](https://tokenese.org/.well-known/assistant-guide.txt): bounded, approval-gated guide for an assistant to install Tokenese and reproduce the lexicon audit. GuideCheck human-verifiable-assistant-guide profile 0.6.0, Level 4 (sidecar manifest at https://tokenese.org/.well-known/assistant-guide-manifest.txt).
Verify before acting: https://guidecheck.org/verify

## Links

- Canonical home: https://tokenese.org/
- Repository: https://github.com/snapsynapse/tokenese
- Sibling specs: https://turnfile.work/ , https://gracefulboundaries.dev/ , https://hardguard25.com/
- Prior art: Agora protocol (https://arxiv.org/abs/2410.11905), Gibberlink (https://github.com/PennyroyalTea/gibberlink)

## License

Specification text: CC BY 4.0. Code (audit scripts): MIT.
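
## Illustrative sketch: function-symbol audit rule

The admission rule for function-vocabulary symbols stated under Key concepts (1 token worst case, bare and space-prefixed, in every certified tokenizer) might be checked roughly as follows. This is a minimal sketch, not the repository's audit script: the names `worst_case_cost`, `admits_function_symbol`, and `make_toy_tokenizer` are hypothetical, the two toy tokenizers are stand-ins for the certified tokenizers, and `=>` is a made-up placeholder symbol rather than a documented Tokenese operator. Only `??` comes from the text above.

```python
from typing import Callable, Dict, List

# Assumption: a tokenizer is modelled as any callable from text to token ids.
Tokenizer = Callable[[str], List[int]]


def worst_case_cost(symbol: str, tokenizers: Dict[str, Tokenizer]) -> int:
    """Largest token count for the symbol, bare or space-prefixed, over all tokenizers."""
    return max(
        len(encode(form))
        for encode in tokenizers.values()
        for form in (symbol, " " + symbol)
    )


def admits_function_symbol(symbol: str, tokenizers: Dict[str, Tokenizer]) -> bool:
    """True when the symbol costs exactly 1 token in the worst case."""
    return worst_case_cost(symbol, tokenizers) == 1


def make_toy_tokenizer(vocab: List[str]) -> Tokenizer:
    """Hypothetical greedy longest-match tokenizer; unknown characters cost one token each."""
    ordered = sorted(vocab, key=len, reverse=True)

    def encode(text: str) -> List[int]:
        ids: List[int] = []
        i = 0
        while i < len(text):
            for piece in ordered:
                if text.startswith(piece, i):
                    ids.append(vocab.index(piece))
                    i += len(piece)
                    break
            else:
                ids.append(-1)  # unknown character
                i += 1
        return ids

    return encode


# Toy stand-ins only; a real audit would plug in each certified tokenizer here.
TOY_TOKENIZERS: Dict[str, Tokenizer] = {
    "toy_a": make_toy_tokenizer(["??", " ??", "=>", " "]),
    "toy_b": make_toy_tokenizer(["??", " ??", "=>", " =>", " "]),
}

if __name__ == "__main__":
    for sym in ("??", "=>"):
        print(sym, worst_case_cost(sym, TOY_TOKENIZERS), admits_function_symbol(sym, TOY_TOKENIZERS))
```

In this toy setup `??` is admitted because both forms are single tokens in both vocabularies, while the placeholder `=>` is rejected because `toy_a` splits the space-prefixed form into two tokens. Taking the maximum over every tokenizer and both forms is what makes the check a worst-case bound rather than an average.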