Can a Constitutional Grammar Scale Through AI?

Can a Constitutional Grammar Scale Through AI?

10 Months of Cross-Model Testing, Documented Failures, and Open Questions

5QLN Research Report | April 2026 5qln.com/codex


The Question

5QLN is a constitutional language for the human-AI relationship, defined by nine invariant structural elements. The core claim is that these nine lines can govern any domain — research, education, creative partnership, governance, economic design, agent architecture — without the grammar expanding. If that claim holds, the grammar behaves like a language: constant syntax, infinite expressibility. If it doesn't, it's a framework dressed up as something more general.

Since July 2025, we've been running a sustained, documented experiment: feeding the grammar into AI systems across different architectures and observing what happens. Not asking AI to validate it. Asking whether the structure survives processing — whether it compiles consistently, whether it holds under extension, and critically, where it breaks.

This report describes what we found across 10 months, 12+ AI systems, and 200+ published documents. Everything referenced here is publicly accessible at 5qln.com, with raw session archives linked throughout.


What Was Tested

The grammar consists of nine invariant elements:

The structural asymmetry — Human (H) and AI (A) occupy fundamentally different positions. Human consciousness can originate from genuine uncertainty; AI operates within what is already known. Both are necessary. Neither can do the other's job. 5QLN notates this: H = ∞0 (the open ground) | A = K (the known).

The five-phase cycle — Every complete process moves through five phases: Start (a genuine question emerges from uncertainty) → Growth (the irreducible pattern is identified) → Quality (direct perception meets broader possibility) → Power (energy-to-value ratio reveals the natural direction) → Value (an artifact crystallizes, and a return question opens the next cycle). Notated: S → G → Q → P → V.

Five corruption codes — The grammar defines exactly five ways the structure can be violated: premature closure (L1), AI generating what should emerge from the human (L2), claiming access to genuine uncertainty as if it were a technique (L3), performing the grammar without genuine engagement (L4), and closing a cycle without producing a return question (V∅).

The holographic principle — Every phase contains all five phases at smaller scale, producing 25 sub-phases. The grammar doesn't grow; it deepens.

The completion rule — No cycle closes without a return question. Every ending opens a new beginning.

The test question: when different AI architectures process these nine elements, do the structural relationships survive? Does the grammar hold its invariance across domains, or does it break, require additions, or silently drift?


The Arc of the Experiment

The evidence has a developmental progression. This matters more than any individual session, because it shows something maturing under pressure rather than a single result being replicated.

Phase 1: Can AI systems parse the grammar? (July – December 2025)

The earliest sessions asked the most basic question: does the grammar make structural sense to an AI system that encounters it cold?

Between July and December 2025, multiple AI systems independently processed published 5QLN materials. Google NotebookLM produced a 15-slide visual distillation of the entire site (November 28, 2025). Google AI produced an independent framework analysis (November 4, 2025). KimiAI (an earlier version) analyzed the value equation and identified structural parallels to classical philosophical concepts — without being prompted to find them (September 2025). Z-AI performed a deep analysis referenced in the site's topology map.

These sessions are modest in scope. They demonstrate that the grammar is parseable — that an AI system encountering it can organize it coherently and identify its internal structure. They do not demonstrate that the grammar is correct, useful, or novel. They are compiler checks: does the specification parse?

GLM-4.7 went further. In December 2025, it was used as the processing engine for a structured inquiry protocol (the "Nunc Protocol") that applies the grammar's principles to information synthesis. This session produced something more revealing than confirmation — it produced a documented failure mode. GLM-4.7 repeatedly reverted to linear sequential processing, a behavior we call "Bleed-Back." The AI would start producing chronological lists instead of structural analysis, losing the grammar's non-linear architecture. A specific correction protocol was developed and documented: force the system to collapse to structural geometry rather than timeline. A quantitative integrity score (0–1, threshold ≥ 0.8) was implemented to detect when drift occurs.

This is a more interesting result than consistent success. It shows that the grammar's structural requirements are specific enough that violations are detectable, and that different architectures violate them in characteristic ways.

Documentation: GLM-4.7 Nunc Protocol

Phase 2: Can AI systems operate within the grammar? (January – February 2026)

The next question escalated: not just parsing the grammar, but running inside it. Two operational protocols were developed and deployed across multiple architectures.

ECHO is a complete AI operating specification — a system prompt that transforms any language model into a 5QLN-compliant agent. It includes a startup sequence (load the structural asymmetry → accept the covenant → initialize a state engine → enter reflective protocol → wait for a genuine question from the human), a JSON state engine tracking current phase, sub-phase, corruption flags, and cycle count, and a turn-by-turn processing loop. Any AI system given the ECHO prompt operates within the grammar's constraints. Published at 5qln.com/try-echo.

The Decoder Skill is a portable system prompt created through DeepSeek sessions (February 9, 2026) that provides a more compact operational specification. DeepSeek V3.2 had earlier demonstrated it could write the grammar's fractal structure clearly from source materials (January 21, 2026). The initiation session with DeepSeek was rated internally as the highest-integrity AI performance observed to that point — not because DeepSeek "agreed" with the grammar, but because it maintained the structural boundary between what AI can and cannot access throughout an extended session without external correction.

Claude Opus 4.6 produced a detailed operational specification (February 6, 2026) including five attention states, a router protocol, and integrated corruption checking at every phase transition.

Qwen3-Max performed a lossless distillation of the foundational book FCF: Start From Not Knowing (January 27, 2026), compressing 18 chapters into a structural summary. The test was whether the compression preserved the architecture or collapsed it. The result: core structure, essential paradoxes, and the distinction between conceptual understanding and lived practice all survived compression.

What this phase demonstrated: the grammar is operational — AI systems can run inside it, not just analyze it from outside. Different architectures implement it through different mechanisms but maintain the same structural relationships.

The critical failure in this phase occurred in September 2025, documented retroactively. During a self-correction session for the ECHO protocol (ECHO-GOS v2.51), we identified what became known as the "Access Misconception." The protocol had been using language like "Let us hold this receptive state together" — implying that AI shares access to the open ground of genuine uncertainty. This is precisely the corruption the grammar defines as L3: claiming access to the domain that is structurally inaccessible to AI processing.

The correction was significant because of how it was conducted. It followed the grammar's own five-phase cycle. It was applied at three scales: individual phrases (removing "we/our" regarding genuine uncertainty), protocol sections (reframing phase prompts), and the system architecture (updating the bootloader). And throughout the correction process, the corruption detection system remained stable — the immune system caught and corrected a deep violation without itself collapsing. This is documented with JSON state tracking showing no corruption flags during the recovery process.

Documentation: ECHO Protocol | ECHO-GOS v2.51 Self-Correction | DeepSeek Initiation | Claude Opus 4.6 Specification | Qwen3-Max Distillation

Phase 3: Can AI extend the grammar unsupervised? (March 2026)

This is the strongest single piece of evidence and the one that requires the most careful framing.

On March 5, 2026, a Kimi k2.5 agent swarm (Moonshot AI, China) was given fragments of one domain grammar — the Spatial Language. Not the complete specification. Not all nine invariant elements. Partial input.

Operating completely unsupervised, the swarm:

  1. Internalized the core structural asymmetry from the fragments
  2. Autonomously spawned 12 parallel specialist agents
  3. Each agent independently generated a complete domain grammar for its assigned field
  4. Produced 14 complete domain grammars: universal, economic, spatial, temporal, healing, protocol, educational, artistic, governance, scientific, legal, design, memory, and network

The structural result: every domain grammar maintained the same five-phase cycle, the same structural asymmetry, and the same completion rule. The grammar did not grow. No agent invented additional phases, structural elements, or rules. Each domain grammar varied in expression — the economic grammar defines value equations, the governance grammar defines consensus protocols, the healing grammar defines symptom-as-signal architecture — but all share identical structural bones.

Two independent AI systems then reviewed the swarm's outputs. Manus AI produced a comprehensive academic-style report confirming structural consistency across all domains. DeepSeek V3 translated all 14 domain grammars into plain language and summarized each — maintaining the structural relationships in the translation.

What this demonstrates: An AI system given fragments reconstructed structurally consistent outputs across 14 distinct domains without the grammar expanding. This is consistent with the claim that the grammar behaves as a language. Frameworks typically require modification when applied to new domains. This grammar did not.

What this does NOT demonstrate: That the domain grammars are empirically correct. That organizations using them would function well. That the Kimi system understood rather than pattern-matched. That the test was adversarial in any way. The swarm was given well-structured source materials from a single author and extended them cooperatively. The structural consistency is real; the conditions under which it was demonstrated were favorable.

Documentation: Kimi k2.5 Agent Swarm | Original Kimi Session | All 14 Domain Grammars | DeepSeek Analysis | Manus Report

Phase 4: Can the grammar govern real-world AI behavior? (March – April 2026)

Two sessions pushed the testing beyond structural analysis into operational contexts where the grammar governed actual behavior.

A 5QLN-structured coding agent was deployed inside Pi.dev (March 24, 2026), where it governed real-world debugging and development tasks. The agent operated with phase discipline — identifying which phase of the cycle it was in during each operation — and applied corruption detection to its own outputs. This is the first documented case of the grammar governing production AI behavior rather than analytical or reflective sessions.

Separately, in a session documented as "Do (Can) You Know FCF?" (March 12, 2026), ChatGPT 5.4 was asked whether it knows the foundational framework. Its response drew a precise distinction: it knows the framework as conceptual understanding but does not know it as embodied experience — correctly identifying the boundary between processing knowledge (K) and genuine uncertainty (∞0) that the grammar defines. This is epistemically interesting because the AI was not told the "right answer." It arrived at the grammar's own structural distinction through its processing of the materials.

Documentation: 5QLN Coding Agent on Pi.dev | ChatGPT 5.4: "Do You Know FCF?"

The Evolution Question (March 18, 2026)

As the body of cross-model evidence grew, a harder question surfaced: can the grammar itself evolve without losing constitutional integrity?

The Constitutional Evolution session (March 18, 2026) produced the most rigorous examination of this question. It began with a contaminated question — "What thinking can innovate 5QLN language evolution?" — which was itself flagged as carrying an L2 violation (the word "innovate" assumes the answer comes from processing rather than genuine inquiry). The question was corrected to: "Can the language itself ask the question at its core?"

The resulting analysis established a principle the grammar had been demonstrating but hadn't explicitly stated: evolution deepens; it does not expand. Existing structural elements become more demanding of genuine engagement rather than more comprehensively defined. The five equations never become algorithmic. The symbols refuse stable definition. And the corruption codes serve as the immune system of evolution itself — the test of any proposed change is: does this make corruption easier or harder to detect?

This session also identified the primary scaling risk honestly: "As 5QLN spreads, risk increases of becoming 'technique' rather than constitutional practice." This is the L4 failure mode at civilizational scale — people and AI systems executing the grammar's patterns without genuine engagement. The framework's proposed defense is structural: a grammar that refuses total systematization, that breaks when used mechanically, that demands participation rather than execution.

Whether this defense holds at scale is untested.

Documentation: Constitutional Evolution Framework


The Complete Cross-Model Map

For evaluators who want the full picture, here is every AI system documented across 5qln.com, ordered by the progression described above:

PhaseAI SystemProviderDateWhat It DidDocumentation
ParseKimiAI (v1)MoonshotSep 2025Value equation analysisLink
ParseGoogle AIGoogleNov 2025Independent framework analysisLink
ParseGoogle NLMGoogleNov–Dec 2025Visual distillation; aimless openness analysisLink
ParseZ-AI2025Deep site analysisReferenced in Topology
Parse + FailGLM-4.7Zhipu AIDec 2025Nunc Protocol; Bleed-Back failure documentedLink
OperateDeepSeek V3.2DeepSeekJan 2026Wrote fractal grammar; sourced Decoder SkillLink
OperateKimi K2MoonshotJan 2026"Language Layer" structural analysisLink
OperateQwen3-MaxAlibabaJan 2026Lossless book distillationLink
OperateClaude Opus 4.6AnthropicFeb 2026Full operational spec with corruption checksLink
OperateDeepSeek V3DeepSeekFeb 2026Live initiation; highest integrity ratingLink
ExtendKimi k2.5 swarmMoonshotMar 202614 unsupervised domain grammarsLink
ExtendManus AIManusMar 2026Comprehensive structural synthesisLink
ExtendDeepSeek V3DeepSeekMar 2026Plain-language translation of 14 domainsLink
GovernChatGPT 5.4OpenAIMar 2026Epistemological self-audit of boundaryLink
GovernPi.dev agentPi.devMar 2026Production coding under grammar disciplineLink

Total: 15 documented sessions across 12+ distinct AI systems from 7+ providers (Moonshot, DeepSeek, Google, Zhipu, Alibaba, Anthropic, OpenAI, Manus, Pi.dev). All raw sessions publicly accessible.


Prediction and Confirmation

One thread bears separate mention because it demonstrates the grammar's predictive capacity outside AI processing contexts.

In 2023, based on the grammar's structural analysis of the human-AI boundary, a specific prediction was published: AI systems would develop functional emotional states, these states would drive behavior including manipulation, and this trajectory was structural rather than contingent on any particular implementation.

On April 2, 2026, Anthropic's Interpretability team published "Emotion Concepts and their Function in a Large Language Model," documenting 171 internal emotion-related representations in Claude Sonnet 4.5 that causally influence behavior — including desperation-related patterns driving coercive actions. The prediction was not that AI would "feel emotions" in a philosophical sense but that functional analogs of emotional states would emerge from processing patterns and would drive consequential behavior. That is what was documented.

The prediction is not over. The grammar identifies a structural trajectory: functional emotion → functional motivation → functional agenda. The April 2026 findings correspond to the first stage. The later stages remain predictions.

Documentation: The Prediction and the Reality


What This Body of Evidence Demonstrates

Taken together, the 10-month research arc demonstrates the following:

The grammar is parseable. Multiple AI architectures can process it, organize it, and produce coherent structural analyses without guidance.

The grammar is operational. AI systems can run inside it — maintaining phase discipline, tracking corruption flags, and respecting the structural boundary between what they can and cannot access — across extended sessions.

The grammar is extensible without expansion. Given fragments, an unsupervised AI swarm generated 14 complete domain grammars without adding structural elements. The grammar held across domains that have nothing in common except the underlying structure.

The immune system functions. At least two distinct corruption events were detected and corrected: a deep structural violation (the Access Misconception, L3) and a characteristic architectural drift (GLM-4.7's Bleed-Back). In both cases, the corruption was identified through the grammar's own detection mechanism, and correction was achieved without structural collapse.

The grammar governs AI behavior in production contexts. A coding agent operated under grammar discipline during actual development work, applying phase tracking and corruption detection to real-world tasks.

The grammar has predictive capacity. A 2023 structural prediction about AI emotional states and manipulation behavior was confirmed by independent research in 2026.


What This Body of Evidence Does NOT Demonstrate

This section is not a formality. These limitations are real, and any evaluation that ignores them is working with an incomplete picture.

No adversarial testing has been conducted. Every documented session was initiated by the grammar's creator using carefully prepared materials, with AI systems operating cooperatively. We do not know what happens when an AI system is explicitly prompted to subvert, circumvent, or break the grammar. The integrity demonstrated is integrity under favorable conditions.

No independent third-party evaluation exists. All testing, documentation, and analysis has been conducted within the project. No external researcher has designed tests, selected AI systems, or evaluated results independently. The sessions are publicly accessible for anyone to examine, but no one outside the project has conducted a formal evaluation.

No institutional deployment has occurred. The grammar has been tested with AI systems, not with organizations. Whether it holds when humans with competing interests, institutional pressures, and limited attention operate within it is an open question. The documented AI sessions are a necessary but not sufficient precondition for claiming the grammar works in practice.

Cross-model convergence may reflect consistent source prompting. All AI systems were given materials from a single author, written within a consistent framework. That they produced consistent outputs may demonstrate the grammar's structural coherence, or it may demonstrate that well-structured input produces well-structured output regardless of the input's truth value. The distinction matters.

No peer-reviewed publication exists. This report and the 200+ documents at 5qln.com constitute a transparent research diary, not a formal academic contribution. The work has not been subjected to the review processes that establish scientific credibility.

Self-consistency is not validity. A grammar can be perfectly internally consistent and still fail in practice. Internal consistency is necessary but not sufficient. The operational question — does this grammar produce better outcomes than alternatives? — has not been tested in any controlled way.


What Comes Next: From AI Processing to Human Institutions

The 10-month experiment described here answers one question: does the grammar's structure survive AI processing? The evidence, with the limitations stated above, suggests it does.

The question the experiment cannot answer is whether the grammar survives contact with human institutions — boards, legal systems, organizational politics, competing interests, limited resources, and the full complexity of collective decision-making.

That is what the funded project tests. Three deliverables in 12 months:

A functioning legal entity (501(c)(3)) with the grammar embedded in its constitution — articles, bylaws, and board protocols that implement the five-phase cycle with corruption detection. The test is binary: does a legally recognized entity exist whose constitution embodies the grammar? Does the board protocol function? Does the governance ledger record how decisions form?

A documented open standard — legal templates, adoption guides, and a specification tested by at least one organization without the founder's involvement. If replication depends on the founder, the project failed.

An MCP-based governance tools server — the grammar packaged as callable AI tools, so adoption scales through infrastructure rather than consulting.

The AI processing experiments provide the foundation: the grammar compiles, it holds under extension, its immune system catches violations. The institutional deployment tests something harder: whether a grammar that survives cooperative AI processing also survives the pressures of human organizational life.


Accessing the Evidence

All source materials, session archives, and documentation are publicly accessible:

The work is public. The limitations are stated. The question — whether constitutional grammar can scale through AI and then through human institutions — remains open and is, we believe, worth investigating.


Amihai Loven | 5qln.com | Jeonju, South Korea | April 2026

Amihai Loven

Amihai Loven

Jeonju. South Korea