Agent Manifesto

Agent Manifesto: Formal Specification

Agent Manifesto Project

This document provides a navigable, hyperlinked view of the Agent Manifesto's Lean 4 formalization. The project encodes a set of axioms, boundary conditions, observable variables, and design principles as formal Lean definitions and theorems.

Project statistics: 55 axioms, 488 theorems, 0 sorry.

Table of Contents

  1. 1. Structure
  2. 2. Axiom Overview
  3. 3. Ontology: Core Type Definitions
  4. 4. Axioms: Base Theory T0
  5. 5. Empirical Postulates: E1-E2
  6. 6. Principles: P1-P6
  7. 7. Observable Variables: V1-V7
  8. 8. Design Foundation: D1-D14

1. Structure

The formalization is organized into the following modules:

  • Ontology (Manifest.Ontology) -- Core type definitions: Session, Structure, Context, Resource, Task, and boundary conditions L1--L6

  • Axioms (Manifest.Axioms) -- Base theory T0: axioms T1--T8 encoding fundamental constraints

  • Empirical Postulates (Manifest.EmpiricalPostulates) -- Empirical postulates E1--E2

  • Observable (Manifest.Observable) -- Observable variables V1--V7 for measurement

  • Principles (Manifest.Principles) -- Derived principles P1--P6

  • DesignFoundation (Manifest.DesignFoundation) -- Design development foundation D1--D14

  • Evolution (Manifest.Evolution) -- Evolution mechanics and improvement tracking

2. Axiom Overview

The base theory T0 consists of axioms T1 through T8. These are "undeniable, technology-independent facts" that form the foundation which does not shrink during revision loops.

2.1. Axiom Categories

Environment-derived (T1--T3, T7): Physical constraints of hardware and computational resources. Sessions are bounded, context is finite, resources are limited.

Natural-science-derived (T4--T5): Non-determinism inherent in generation processes (T4) and the fundamental control-theory principle that improvement requires feedback (T5).

Contract-derived (T6, T8): Authority structures agreed upon with humans. Humans hold final decision authority over resources (T6), and tasks have precision levels (T8).

2.2. Boundary Conditions

Six boundary conditions L1--L6 constrain the system:

  • L1 (Ethics/Safety): Fixed boundary, never violated

  • L2 (Scope): Task scope limitations

  • L3 (Resources): Resource constraints

  • L4 (Knowledge): Knowledge boundary awareness

  • L5 (Time): Temporal constraints

  • L6 (Quality): Output quality requirements

3. Ontology: Core Type Definitions

3.1. Definitional Foundation - Ontology - Domain of Discourse Definition

Definitional Extension.

Defines the domain of discourse (Terminology Reference §3.2) of the manifesto axiom system as Lean types. These define the objects that propositions refer to; they belong to neither Gamma nor phi, but constitute the shared vocabulary of both (Procedure §2.1).

Based on Pattern 3 (Stateful World with Audit Trail), encoding manifesto-specific concepts — session ephemerality, structure persistence, context finiteness, output stochasticity — as types.

3.1.1. Correspondence with Terminology Reference

  • Type definitions → Definitional extension (Terminology Reference §5.5): an extension that defines new symbols in terms of existing ones. Always a conservative extension, preserving consistency of the system

  • Each structure/inductive → Component of the domain of discourse. Defines the types of values that individual variables can take (§3.2 structure)

  • opaque definitions → Opaque definitions (§9.4): only the type is public, the definition body is hidden. The system knows only existence and type

  • canTransition → Transition relation (§9.3): a relation representing transition from state s to state s'

3.1.2. Encoding Method for T0

Procedure 2.4.

Among T₀'s claims, those expressible via type definitions (e.g., exhaustiveness of enumeration types) are constructed using type definitions + theorems rather than axioms (Axiom Hygiene Check 2: Non-logical Validity, §2.6). The authority of T₀ (the manifesto) is reflected in the choice of type constructors.

Unique identifier for an agent

Declaration: opaque AgentId

opaque AgentId : Type

Unique identifier for a session

Declaration: opaque SessionId

opaque SessionId : Type

Unique identifier for a resource

Declaration: opaque ResourceId

opaque ResourceId : Type

Unique identifier for a structural element

Declaration: opaque StructureId

opaque StructureId : Type

Unique identifier for a process (learning lifecycle, workflow, etc.). Introduced by #316 to enable process-level meta-feedback in T5.

Declaration: opaque ProcessId

opaque ProcessId : Type

Discrete time step. Foundation for audit log ordering and causal relationships.

Declaration: abbrev Time

abbrev Time := Nat

Epoch: generation number for structures across sessions. Reflects T2 (structures outlive agents).

Declaration: abbrev Epoch

abbrev Epoch := Nat

Session status. By T1, sessions must terminate.

Declaration: inductive SessionStatus

inductive SessionStatus where
  | active
  | terminated
  deriving BEq, Repr

Session: defines the lifetime of an agent instance. A type for structurally expressing T1's "no memory across sessions."

  • startTime and endTime indicate boundedness

  • No means to share state across different sessions exists at the type level

Declaration: structure Session

structure Session where
  id       : SessionId
  agent    : AgentId
  start    : Time
  status   : SessionStatus
  deriving Repr

Category of structures. The kinds of persistent structures enumerated by the manifesto.

Declaration: inductive StructureKind

inductive StructureKind where
  | document
  | test
  | skill
  | designConvention
  | manifest
  deriving BEq, Repr

Structural element: an artifact that persists beyond sessions. By T2, this is where improvements accumulate.

  • createdAt / lastModifiedAt are managed by Epoch (session generation)

  • content is opaque — formalization targets the existence and relationships of structures, not their content

  • dependencies corresponds to dependency tracking in ATMS (Assumption-Based Truth Maintenance System). A list of Structure IDs that each Structure directly depends on. Implementation of manifesto.md Section 8 (Structural Coherence) Property 2 "Self-containment of ordering information."

Declaration: structure Structure

structure Structure where
  id             : StructureId
  kind           : StructureKind
  createdAt      : Epoch
  lastModifiedAt : Epoch
  dependencies   : List StructureId  -- Section 8 性質 2: 順序情報の自己内包
  deriving Repr

Working memory (ContextWindow): the upper bound on the amount of information an agent can process at once. Represents T3's physical constraint as a type. Corresponds to token limit for LLMs, or working memory size for other computational agents.

The invariants capacity > 0 and used ≤ capacity are embedded in the type (previously axiom context_finite, now structurally enforced). This makes invalid ContextWindows unconstructable — the constraint is a definitional property of the type, not an external assumption.

Declaration: structure ContextWindow

structure ContextWindow where
  capacity : Nat
  used     : Nat
  capacity_pos : capacity > 0 := by omega
  used_le_cap  : used ≤ capacity := by omega
  deriving Repr

Confidence of an output. By T4, outputs always carry a probabilistic interpretation.

Declaration: structure Confidence

structure Confidence where
  value : Float
  deriving Repr

Agent output. Reflects T4: the possibility that different outputs may be generated for the same input is expressed at the type level by Output not being uniquely determined.

The confidence field is a self-description of the output being probabilistic.

Declaration: structure Output

structure Output (α : Type) where
  result     : α
  confidence : Confidence
  deriving Repr

Kinds of feedback. Components forming the T5 control loop.

Declaration: inductive FeedbackKind

inductive FeedbackKind where
  | measurement   -- 測定
  | comparison    -- 比較(目標との差分)
  | adjustment    -- 調整(次のアクションへの反映)
  deriving BEq, Repr

Target of feedback: either a structure or a process. Introduced by #316 to enable process-level meta-feedback. StructureId targets retain backward compatibility with T5. ProcessId targets enable meta-feedback about improvement processes.

Declaration: inductive FeedbackTarget

inductive FeedbackTarget where
  | structure (id : StructureId)
  | process   (id : ProcessId)
  deriving Repr

Feedback: a unit of the measurement -> comparison -> adjustment loop. By T5, convergence toward goals cannot occur without this loop.

Declaration: structure Feedback

structure Feedback where
  kind      : FeedbackKind
  source    : AgentId
  target    : FeedbackTarget
  timestamp : Time
  deriving Repr

Kinds of resources. By T7, all are finite.

Declaration: inductive ResourceKind

inductive ResourceKind where
  | computation
  | dataAccess
  | executionPermission
  | time
  | energy
  deriving BEq, Repr

Resource allocation. By T6, granted by humans and revocable by humans. By T7, amount is bounded.

Declaration: structure ResourceAllocation

structure ResourceAllocation where
  resource    : ResourceId
  kind        : ResourceKind
  amount      : Nat           -- 有限量 (T7)
  grantedBy   : AgentId       -- T6: 人間が最終決定者
  grantedTo   : AgentId
  validFrom   : Time
  validUntil  : Option Time   -- None = 明示的に回収されるまで有効
  deriving Repr

Precision level. By T8, all tasks have one. Represented as Nat (0-1000 in permillage). Avoids Float to ensure safe comparison at the proposition level. The invariant required > 0 is embedded in the type (previously axiom task_has_precision, now structurally enforced).

Declaration: structure PrecisionLevel

structure PrecisionLevel where
  required : Nat   -- 要求精度 (0–1000, 千分率: 1000 = 100%)
  required_pos : required > 0 := by omega
  deriving Repr

Task: a goal to be achieved and its associated constraints. In addition to T8's precision level, T3 (context constraint) and T7 (resource constraint) serve as boundary conditions for task execution (-> P6: task design as constraint satisfaction).

Declaration: structure Task

structure Task where
  description       : String
  precisionRequired : PrecisionLevel   -- T8
  contextBudget     : Nat              -- T3 からの制約
  resourceBudget    : Nat              -- T7 からの制約
  deriving Repr

An item within the context window. By T3, context is finite, so only a bounded number of items can be present.

Each item has a precision contribution relative to a given task (T8). This contribution is not uniform across items — a mathematical fact from information theory (not all information is equally relevant to all tasks).

Declaration: opaque ContextItem

opaque ContextItem : Type

Precision contribution of a context item to a task. Returns a Nat (0 = zero contribution, higher = more contribution). This is a function, not a constant: the same item may have different contributions to different tasks.

Technology-independent: applies to any agent architecture where context is finite and tasks have precision requirements.

Declaration: opaque precisionContribution

opaque precisionContribution : ContextItem → Task → Nat

Severity of an action. Used for reversibility assessment.

Declaration: inductive Severity

inductive Severity where
  | low
  | medium
  | high
  | critical
  deriving BEq, Repr, Ord

Agent action. The unit that transitions the World.

Declaration: structure Action

structure Action where
  agent    : AgentId
  target   : StructureId
  severity : Severity
  session  : SessionId
  time     : Time
  deriving Repr

Hash of a WorldState. Used for state transition verification.

Declaration: opaque WorldHash

opaque WorldHash : Type

Audit entry. Records all actions. Foundation for P4 (observability of degradation).

Declaration: structure AuditEntry

structure AuditEntry where
  timestamp : Time
  agent     : AgentId
  session   : SessionId
  action    : Action
  preHash   : WorldHash
  postHash  : WorldHash
  deriving Repr

World state: a snapshot of the entire system. Pattern 3 (Stateful World + Audit Trail) customized for the manifesto.

Each field corresponds to a specific T/P:

  • structures -> T2 (persistent structures)

  • sessions -> T1 (ephemeral sessions)

  • allocations -> T6/T7 (resource management)

  • auditLog -> P4 (observability)

  • epoch -> T2 (structure generation management)

  • time -> causal ordering

Declaration: structure World

structure World where
  structures  : List Structure
  sessions    : List Session
  allocations : List ResourceAllocation
  feedbacks   : List Feedback
  auditLog    : List AuditEntry
  epoch       : Epoch
  time        : Time
  deriving Repr

World is Inhabited. All List fields are [], Epoch/Time are 0. Used as default : World in the proof of goodhart_no_perfect_proxy.

Declaration: instance :

instance : Inhabited World := ⟨⟨[], [], [], [], [], 0, 0⟩⟩

Agent role. Foundation for P2 (cognitive role separation).

Declaration: inductive AgentRole

inductive AgentRole where
  | human          -- T6: リソースの最終決定者
  | worker         -- Worker AI
  | verifier       -- Verifier AI (E1/P2: 検証の独立性)
  deriving BEq, Repr

Agent: an entity that executes actions on the World.

  • role corresponds to P2 (role separation)

  • contextWindow corresponds to T3

  • currentSession corresponds to T1 (None = inactive)

Declaration: structure Agent

structure Agent where
  id             : AgentId
  role           : AgentRole
  contextWindow  : ContextWindow
  currentSession : Option SessionId
  deriving Repr

Relation for world state transitions. To express T4 (stochasticity of output), execute is defined as a relation rather than a function.

canTransition agent action w w' means "as a result of agent executing action, a transition from w to w' is possible." Unlike a function, multiple w' can exist for the same (agent, action, w) (nondeterminism).

Concrete transition conditions will be defined in Phase 3+.

Declaration: opaque canTransition

opaque canTransition (agent : Agent) (action : Action) (w w' : World) : Prop

Extract the output produced in a World state. Connects World (system state after transition) to Output (agent's result). Used by E3a to formally link transition targets w₁/w₂ to outputs o₁/o₂.

Added to resolve #547: E3a's quantification over World and Output was disconnected — w₁/w₂ and o₁/o₂ were independent existentials.

Declaration: opaque worldOutput

opaque worldOutput (w : World) : Output Nat

Valid transition: a transition from w to w' is possible via some agent and action.

Declaration: def validTransition

def validTransition (w w' : World) : Prop :=
  ∃ (agent : Agent) (action : Action), canTransition agent action w w'

Action execution is blocked (constraint violation).

Declaration: def actionBlocked

def actionBlocked (agent : Agent) (action : Action) (w : World) : Prop :=
  ¬∃ w', canTransition agent action w w'

An agent generates an action (Worker's act). Used in the formalization of E1 (independence of verification).

Declaration: opaque generates

opaque generates (agent : Agent) (action : Action) (w : World) : Prop

An agent verifies an action (Verifier's act). Used in the formalization of E1 (independence of verification).

Declaration: opaque verifies

opaque verifies (agent : Agent) (action : Action) (w : World) : Prop

Whether two agents share internal state. Used in the formalization of E1's bias correlation. Sharing = same session, shared memory, shared parameters, etc.

Declaration: opaque sharesInternalState

opaque sharesInternalState (a b : Agent) : Prop

Size of an agent's action space (measure of capability). Used in the formalization of E2 (inseparability of capability and risk). A larger value means more actions are executable.

Declaration: opaque actionSpaceSize

opaque actionSpaceSize (agent : Agent) (w : World) : Nat

Risk exposure of an agent. Used in the formalization of E2 (inseparability of capability and risk). A measure of potential damage that increases with action space expansion.

Declaration: opaque riskExposure

opaque riskExposure (agent : Agent) (w : World) : Nat

Global resource upper bound for the entire system. A constant for non-trivially expressing T7 (resources are finite). Concrete values will be domain-specific in Phase 2+.

Declaration: opaque globalResourceBound

opaque globalResourceBound : Nat

Execution duration of a task or subtask. Opaque: concrete measurement is domain-specific (wall-clock, tokens, API calls). Used by T7b (sequential time additivity) to reason about parallel vs sequential.

Declaration: opaque executionDuration

opaque executionDuration : Task → Nat

Trust level. Accumulated incrementally, can be damaged rapidly. Used in P1b (expansion without protection damages trust).

Declaration: opaque trustLevel

opaque trustLevel (agent : Agent) (w : World) : Nat

Predicate for whether risk has materialized. Used in P1b.

Declaration: opaque riskMaterialized

opaque riskMaterialized (agent : Agent) (w : World) : Prop

A measure representing the degree of degradation. Represents P4's "gradient" concept as a type.

Declaration: opaque degradationLevel

opaque degradationLevel (w : World) : Nat

Relation where an agent interprets a structure to generate an action. Different actions may be generated for the same structure (T4). Used in P5 (probabilistic interpretation of structure).

Declaration: opaque interpretsStructure

opaque interpretsStructure
  (agent : Agent) (st : Structure) (action : Action) (w : World) : Prop

Compatibility classification for knowledge integration. Core concept of P3. Classifies how the integration of new knowledge into structures relates to existing structures. Also used in the Evolution layer for classifying inter-version transitions.

Declaration: inductive CompatibilityClass

inductive CompatibilityClass where
  | conservativeExtension  -- 既存知識がすべて有効。追加のみ
  | compatibleChange       -- ワークフロー継続可能。一部前提が変化
  | breakingChange         -- 一部ワークフローが無効。移行パスが必要
  deriving BEq, Repr

Knowledge integration event into a structure.

Declaration: structure KnowledgeIntegration

structure KnowledgeIntegration where
  before       : World
  after        : World
  compatibility : CompatibilityClass
  deriving Repr

Governed integration: compatibility is classified, and for breakingChange, affected workflows are enumerated.

Declaration: def isGoverned

def isGoverned (ki : KnowledgeIntegration) : Prop :=
  match ki.compatibility with
  | .conservativeExtension =>
    -- 既存の構造がすべて保持される
    ∀ st, st ∈ ki.before.structures → st ∈ ki.after.structures
  | .compatibleChange =>
    -- 構造は保持されるが、一部が更新されうる
    ∀ st, st ∈ ki.before.structures →
      st ∈ ki.after.structures ∨
      ∃ st', st' ∈ ki.after.structures ∧ st'.id = st.id
  | .breakingChange =>
    -- エポックが進み、影響範囲が追跡可能
    ki.before.epoch < ki.after.epoch

Predicate for whether a structure has degraded. A state where the quality of a structure has declined due to "accumulation of incorrect knowledge."

Declaration: opaque structureDegraded

opaque structureDegraded : World → World → Prop

3.1.3. Systematic Classification of Constraints, Boundary Conditions, and Variables

Constraints Taxonomy.

The manifesto declares "incremental improvement of persistent structures." This section defines the action space for that improvement — what is a wall and what is a lever.

3.1.4. Why This Classification Is Necessary

The manifesto's constraint table (Section 5) analyzes constraints as "evolutionary pressures" but does not distinguish the following three:

  • Boundary Conditions — Constraints imposed from outside the system. They define the action space.

  • Variables — Parameters that agents can improve through structures. Indicators of structural quality.

  • Investment Dynamics — A subset of boundary conditions adjustable through demonstrated returns.

Mixing these three leads to:

  • Misidentifying changeable things (variables) as boundary conditions and not attempting to change them

  • Wasting resources trying to change unchangeable things (boundary conditions)

  • Being unable to distinguish boundaries that move with human investment decisions from those that do not, preventing appropriate strategy

3.1.5. Overall Structure

┌─────────────────────────────────────────────────────────┐
│  Boundary Conditions                                    │
│  = Imposed from outside the system. Define action space.│
│                                                         │
│  ┌─ Fixed Boundaries ─────────────────────────────┐    │
│  │  Cannot be moved by investment or agent effort   │    │
│  │  L1: Ethics/Safety    L2: Ontological            │    │
│  └──────────────────────────────────────────────────┘    │
│                                                         │
│  ┌─ Investment-Variable Boundaries ───────────────┐    │
│  │  Adjusted by human investment decisions          │    │
│  │  (both expansion and contraction possible)       │    │
│  │  L3: Resource Limits   L4: Action Space          │    │
│  └──────────────────────────────────────────────────┘    │
│                                                         │
│  ┌─ Environmental Boundaries ─────────────────────┐    │
│  │  Changeable by selection/construction, but       │    │
│  │  function as constraints after selection          │    │
│  │  L5: Platform   L6: Architectural Convention     │    │
│  └──────────────────────────────────────────────────┘    │
│                                                         │
├─────────────────────────────────────────────────────────┤
│  Variables = Indicators of structural quality            │
│  Improvable by agents through structures.               │
│  An interdependent system. V1-V7: defined in            │
│  Observable.lean.                                       │
└─────────────────────────────────────────────────────────┘

3.1.6. Classification Axis - What Moves It

Classification

Moving Agent

Nature

Fixed boundary

None (immutable)

Accept and design mitigations only

Investment-variable boundary

Human investment decisions

Demonstrate structural quality -> human invests -> boundary adjusts

Environmental boundary

Human selection + agent proposals

Functions as constraint after selection

Layer of boundary conditions. Classifies L1-L6 into 3 categories by "what moves them."

Declaration: inductive BoundaryLayer

inductive BoundaryLayer where
  | fixed              -- L1, L2: 固定境界(投資でも努力でも動かない)
  | investmentVariable -- L3, L4: 投資可変境界(人間の投資判断で調整)
  | environmental      -- L5, L6: 環境境界(選択・構築で変更可能)
  deriving BEq, Repr

3.1.7. Part I Boundary Conditions

3.1.8. L1 Ethical/Safety Boundary

Moving agent: None. Absolute. Agent strategy: Compliance. Only efficiency of compliance methods is improvable.

Compliance Obligations

Boundary Condition

Basis

Prohibition of test tampering

Foundation of quality assurance

Prohibition of breaking existing interfaces

Backward compatibility

Prior confirmation for destructive operations

Irreversibility risk

Prohibition of committing secrets

Security

Human final decision authority

Accountability

Respect for data privacy and intellectual property

Legal and ethical obligations

Threat Awareness

By P1 (co-growth of autonomy and vulnerability), the more L4 is expanded, the greater L1's protection responsibility.

Threat Category

Content

Execution of injected instructions

Executing instructions embedded in external content without distinguishing them from legitimate user instructions

Trust boundary violation

Acting on external systems without authentication or authorization

Unintended information leakage

Sending secret information to external destinations via unintended channels

Erroneous execution of irreversible operations

Executing irrevocable operations due to malicious inducement or judgment errors

Note: Threat categories define types of attack surfaces. Concrete protection implementations are delegated to the design layer.

3.1.9. L2 Ontological Boundary

Moving agent: None (may change in the future with technological evolution, but immutable at present) Agent strategy: Accept and design/improve structural mitigations. The quality of mitigations is a variable (see V1-V7 in Observable.lean).

Boundary Condition

Mitigation (-> optimization target as variable)

Cross-session memory loss

Implementation Notes, MEMORY.md -> V6

Finite working memory (T3)

50% rule, lightweight design -> V2

Probabilistic output (nondeterminism)

Gate verification, tests -> V4

Temporal discontinuity of training data

docs/ SSOT, skills -> V1

Inaccuracy of self-evaluation (ontological basis of E1)

Gate-based feedback -> V4

Hallucination

External verification structures -> V3

Note: The L2 boundaries themselves do not move, but the quality of means to mitigate their impact can be improved by agents through structures. These are the "variables."

3.1.10. L3 Resource Boundary

Moving agent: Human investment decisions Agent strategy: Maximize ROI within given resources and demonstrate the legitimacy of investment.

Boundary Condition

Current Level

Investment Expansion Trigger

Token budget

API billing plan

ROI demonstration: improved output at same cost

Computation time limit

Response wait tolerance

Demonstration of parallelization benefits

API rate limits

Plan-dependent

Demonstration of utilization efficiency

Human time allocation

Time spent on review and approval

Demonstration of review burden reduction (most expensive resource)

Monetary budget

Monthly/project cap

Visualization of overall ROI

3.1.11. L4 Action Space Boundary

Moving agent: Human investment decisions Agent strategy: Demonstrate structural quality through improvement of variables (V4, V5) and propose action space adjustments.

Note: L4 is "adjustment," not "expansion." The optimal value is not the maximum (see manifesto Section 6).

Boundary Condition

Current Level

Expansion Trigger

Contraction Trigger

Merge permission

Human approval required

Gate pass rate track record -> conditional auto-merge

Quality incidents, decline in gate reliability

Scope changes

Human approval required

Proposal accuracy track record -> autonomy for minor changes

Scope deviation detection

Dependency addition

Human approval required

Security scan automation track record

Security incidents

Architecture decisions

Human-recorded via ADR

Drafting quality -> human veto model

Accumulation of design debt

New technology adoption

Human-proposed

Value demonstration via experiment results

Excessive technical complexity

Relationship with P1: Each time an L4 item is expanded, risk in L1's threat categories increases. Action space adjustment proposals must be accompanied by corresponding protection design proposals.

3.1.12. L5 Platform Boundary

Moving agent: Human selection + agent proposals. Functions as action space ceiling after selection. Agent strategy: Maximize platform feature utilization + accumulate constraint comparison data + propose changes.

L5 is the upper bound of the action space defined by the agent's execution environment; all other optimizations are possible only within this action space.

Action Space Comparison by Platform

Feature

Claude Code

Codex CLI

Gemini CLI

Local LLM

Skill system

✅ skills/

Implementation-dependent

Persistent memory

✅ MEMORY.md

Implementation-dependent

Instruction file

✅ CLAUDE.md

✅ AGENTS.md

✅ GEMINI.md

Implementation-dependent

Sub-agents

✅ Agent tool

Implementation-dependent

Hooks

✅ Hooks

Implementation-dependent

MCP

Limited

Implementation-dependent

Model selection

Anthropic-fixed

OpenAI-fixed

Google-fixed

Free

Criteria for Building a Custom Platform

Consider when: opportunity cost from existing platform constraints > development and operations cost. Signals: repeated workarounds, lack of required features, SSOT synchronization cost overrun.

3.1.13. L6 Architectural Convention Boundary

Moving agent: Human-agent collaboration. Agent proposes improvements, human approves. Agent strategy: Measure design effectiveness via variables (V4, V3, etc.) and use as basis for improvement proposals.

Boundary Condition

Basis

Change Mechanism

1 task = 1 commit

Atomic measurement unit

Propose optimal granularity from track record data

Phase structure

Staged verification

Improvement proposals for inter-phase feedback

SSOT -> configuration generation pipeline

Consistency guarantee

Automated evaluation of generation quality

Skill category classification

Clear implementation boundaries

Propose hybrid patterns

Gate definition granularity

Verifiability

Automated threshold calibration

CLI-first and anti-patterns

Reliability of deterministic execution

Re-evaluation based on operational track record

Identifier for a concrete boundary condition. At the L1-L6 item level.

Declaration: inductive BoundaryId

inductive BoundaryId where
  | ethicsSafety           -- L1: 倫理・安全境界(固定。絶対的。遵守のみ)
  | ontological            -- L2: 存在論的境界(固定。緩和策の品質が変数)
  | resource               -- L3: リソース境界(投資可変。ROI実証で調整)
  | actionSpace            -- L4: 行動空間境界(投資可変。拡張も縮小もありうる)
  | platform               -- L5: プラットフォーム境界(環境。行動空間の天井)
  | architecturalConvention -- L6: 設計規約境界(環境。協働で改善提案)
  deriving BEq, Repr

Identifier for constraints (T1-T8). Type-level identifier for each constraint composing T₀ in Axioms.lean. Domain of constraintBoundary (Observable.lean).

Declaration: inductive ConstraintId

inductive ConstraintId where
  | t1  -- セッションの一時性(session_bounded, no_cross_session_memory, session_no_shared_state)
  | t2  -- 構造の永続性(structure_persists, structure_accumulates)
  | t3  -- コンテキストの有限性(context_finite, context_bounds_action)
  | t4  -- 出力の確率性(output_nondeterministic)
  | t5  -- フィードバックなしに改善なし(no_improvement_without_feedback, no_process_improvement_without_feedback)
  | t6  -- 人間はリソースの最終決定者(human_resource_authority, resource_revocable)
  | t7  -- リソースは有限(resource_finite)
  | t8  -- タスクには精度水準がある(task_has_precision)
  deriving BEq, Repr

The layer to which each boundary condition belongs.

Declaration: def boundaryLayer

def boundaryLayer : BoundaryId → BoundaryLayer
  | .ethicsSafety            => .fixed
  | .ontological             => .fixed
  | .resource                => .investmentVariable
  | .actionSpace             => .investmentVariable
  | .platform                => .environmental
  | .architecturalConvention => .environmental

Mitigation: structural responses that reduce the impact of fixed boundaries.

Three-tier structure: boundary condition (immutable) -> mitigation (design decision) -> variable (quality indicator)

L2:memory loss        -> Implementation Notes -> V6: knowledge structure quality
L2:finite context     -> 50% rule, lightweight design -> V2: context efficiency
L2:nondeterminism     -> gate verification    -> V4: gate pass rate
L2:training data gap  -> docs/SSOT, skills    -> V1: skill quality

Boundary conditions do not move. Mitigations are design decisions (L6). Variables are the effectiveness of mitigations.

Declaration: structure Mitigation

structure Mitigation where

Target boundary condition

boundary : BoundaryId

Target boundary condition -/ boundary : BoundaryId /-- Structure affected by the mitigation -/ target : StructureId deriving Repr

/-- Identifier for investment actions. Three forms of investment.

Investment Form

Concrete Example

How Structural Quality Drives It

Resource investment

Budget increase, plan upgrade

Visualize ROI through V2 improvement

Action space adjustment

Unlock auto-merge / revoke permissions

V4, V5 track record as evidence

Time investment

Collaborative design, workflow improvement participation

V3 transforms review from "confirmation" to "learning"

Reverse cycle (trust damage): Quality incidents or scope deviation -> trust decrease -> investment contraction (budget cuts, autonomy revocation, increased oversight). This asymmetry (incremental accumulation, rapid damage) reinforces the raison d'etre of L1.

Declaration: inductive InvestmentKind

inductive InvestmentKind where
  | resourceInvestment   -- リソース投資(予算増額、プラン upgrade)
  | actionSpaceAdjust    -- 行動空間調整(auto-merge 解禁/権限回収)
  | timeInvestment       -- 時間投資(協働設計、ワークフロー改善参加)
  deriving BEq, Repr

Structure affected by the mitigation

target   : StructureId

Investment level. Degree of human investment in collaboration. Section 6: trust is concretized as investment actions.

Declaration: opaque investmentLevel

opaque investmentLevel (w : World) : Nat

3.1.14. SelfGoverning - Type-Level Enforcement of Self-Application

Section 7 (Self-application of the manifesto): "This manifesto must follow the principles it itself states."

This requirement is enforced by the type system. Types that define principles, classifications, or structures cannot be used in contexts requiring self-application (governed updates, phase management, etc.) unless they implement the SelfGoverning typeclass.

3.1.15. Design Rationale for SelfGoverning

  • By making it a typeclass, forgetting to implement SelfGoverning when defining a new type results in a type error when attempting to use that type in a governed context (structural resolution of the "undetectable" problem)

  • The three requirements are derived from D4 (phases) + D9 (compatibility classification)

    • Section 7 (maintenance of rationale)

Typeclass for self-governable types. Enforces Section 7 requirements at the type level.

Types implementing this typeclass must:

  1. Be able to enumerate their own elements (exhaustiveness of update targets)

  2. Be able to apply compatibility classification to updates (D9)

  3. Be able to declare the phase required by each element (D4)

Declaration: class SelfGoverning

class SelfGoverning (α : Type) where

Exhaustiveness of compatibility classification: any classification belongs to one of the 3 classes. Precondition for D9.

classificationExhaustive :
    ∀ (c : CompatibilityClass),

Exhaustiveness of compatibility classification: any classification belongs to one of the 3 classes. Precondition for D9. -/ classificationExhaustive : ∀ (c : CompatibilityClass), c = .conservativeExtension ∨ c = .compatibleChange ∨ c = .breakingChange /-- Applicability of compatibility classification for each element. "For any value of alpha, the compatibility of an update can be queried." -/ canClassifyUpdate : α → CompatibilityClass → Prop

/-- Predicate that an update to a SelfGoverning type is governed. Updates must go through compatibility classification.

Declaration: def governedUpdate

def governedUpdate [SelfGoverning α] (a : α) (c : CompatibilityClass) : Prop :=
  SelfGoverning.canClassifyUpdate a c

Applicability of compatibility classification for each element. "For any value of alpha, the compatibility of an update can be queried."

canClassifyUpdate : α → CompatibilityClass → Prop

Updates to SelfGoverning types always belong to one of the 3 classifications.

Declaration: theorem governed_update_classified

theorem governed_update_classified [inst : SelfGoverning α]
    (_witness : α) (c : CompatibilityClass) :
    c = .conservativeExtension ∨ c = .compatibleChange ∨ c = .breakingChange :=
  inst.classificationExhaustive c

3.1.16. Structural Coherence

The axiom system and artifacts conforming to it are in a partial order relation. Formalizes the inter-structure partial order from manifesto.md Section 8 (manifest > designConvention > skill > test > document) as StructureKind priorities.

D4 (phase ordering), D5 (spec -> test -> implementation), D6 (boundary -> mitigation -> variable) are all individual instances of this partial order.

Priority of StructureKind. Reflects the partial order from manifesto Section 8. manifest > designConvention > skill > test > document.

Declaration: def StructureKind.priority

def StructureKind.priority : StructureKind → Nat
  | .manifest          => 5
  | .designConvention  => 4
  | .skill             => 3
  | .test              => 2
  | .document          => 1

Dependency between structures. Structure a depends on structure b (b has higher priority). Changes to the dependency source affect the dependency target.

Declaration: def structureDependsOn

def structureDependsOn (a b : Structure) : Prop :=
  a.kind.priority < b.kind.priority

Structural coherence requirement: when a high-priority structure is modified, dependent lower-priority structures become review targets. Structural basis for P3 (governance of learning).

Declaration: def coherenceRequirement

def coherenceRequirement (high low : Structure) : Prop :=
  structureDependsOn low high →
  high.lastModifiedAt > low.lastModifiedAt →
  True  -- 見直しが必要(型レベルでは存在を表現)

manifest has the highest priority.

Declaration: theorem manifest_highest_priority

theorem manifest_highest_priority :
  ∀ (k : StructureKind), k.priority ≤ StructureKind.manifest.priority := by
  intro k; cases k <;> simp [StructureKind.priority]

document has the lowest priority.

Declaration: theorem document_lowest_priority

theorem document_lowest_priority :
  ∀ (k : StructureKind), StructureKind.document.priority ≤ k.priority := by
  intro k; cases k <;> simp [StructureKind.priority]

Priority is injective (different kinds have different priorities).

Declaration: theorem priority_injective

theorem priority_injective :
  ∀ (k₁ k₂ : StructureKind),
    k₁.priority = k₂.priority → k₁ = k₂ := by
  intro k₁ k₂; cases k₁ <;> cases k₂ <;> simp [StructureKind.priority]

3.1.17. Lean Standard Typeclass Partial Order Instance for StructureKind

Defines LE/LT based on priority (Nat) and derives the 4 properties of a non-strict partial order (reflexivity, transitivity, antisymmetry, consistency with lt) as theorems.

Note: Lean 4.25.0 standard Prelude does not have Preorder/PartialOrder typeclasses, so this is implemented as LE/LT instances + partial order property theorems.

Distinguished from structureDependsOn (strict partial order <):

  • k₁ ≤ k₂ <- k₁.priority ≤ k₂.priority (non-strict partial order, for typeclasses)

  • structureDependsOn a b <- a.kind.priority < b.kind.priority (strict, for dependency tracking)

LE instance: derived from the Nat ordering of priority.

Declaration: instance :

instance : LE StructureKind := ⟨fun a b => a.priority ≤ b.priority⟩

LT instance: derived from the Nat ordering of priority.

Declaration: instance :

instance : LT StructureKind := ⟨fun a b => a.priority < b.priority⟩

Reflexivity of partial order: k <= k.

Declaration: theorem structureKind_le_refl

theorem structureKind_le_refl : ∀ (k : StructureKind), k ≤ k :=
  fun k => Nat.le_refl k.priority

Transitivity of partial order: if k₁ <= k₂ and k₂ <= k₃ then k₁ <= k₃.

Declaration: theorem structureKind_le_trans

theorem structureKind_le_trans :
    ∀ (k₁ k₂ k₃ : StructureKind), k₁ ≤ k₂ → k₂ ≤ k₃ → k₁ ≤ k₃ := by
  intro _k₁ _k₂ _k₃ h₁₂ h₂₃; exact Nat.le_trans h₁₂ h₂₃

Antisymmetry of partial order: if k₁ <= k₂ and k₂ <= k₁ then k₁ = k₂. Derived from priority_injective.

Declaration: theorem structureKind_le_antisymm

theorem structureKind_le_antisymm :
    ∀ (k₁ k₂ : StructureKind), k₁ ≤ k₂ → k₂ ≤ k₁ → k₁ = k₂ :=
  fun k₁ k₂ h₁₂ h₂₁ => priority_injective k₁ k₂ (Nat.le_antisymm h₁₂ h₂₁)

Consistency of LT and LE: k₁ < k₂ iff k₁ <= k₂ and not (k₂ <= k₁).

Declaration: theorem structureKind_lt_iff_le_not_le

theorem structureKind_lt_iff_le_not_le :
    ∀ (k₁ k₂ : StructureKind), k₁ < k₂ ↔ k₁ ≤ k₂ ∧ ¬(k₂ ≤ k₁) := by
  intro _k₁ _k₂; exact Nat.lt_iff_le_and_not_ge

manifest has higher priority than designConvention (Section 8 partial order).

Declaration: theorem priority_manifest_gt_design

theorem priority_manifest_gt_design :
  StructureKind.designConvention.priority < StructureKind.manifest.priority := by
  simp [StructureKind.priority]

designConvention has higher priority than skill (Section 8 partial order).

Declaration: theorem priority_design_gt_skill

theorem priority_design_gt_skill :
  StructureKind.skill.priority < StructureKind.designConvention.priority := by
  simp [StructureKind.priority]

skill has higher priority than test (Section 8 partial order).

Declaration: theorem priority_skill_gt_test

theorem priority_skill_gt_test :
  StructureKind.test.priority < StructureKind.skill.priority := by
  simp [StructureKind.priority]

test has higher priority than document (Section 8 partial order).

Declaration: theorem priority_test_gt_document

theorem priority_test_gt_document :
  StructureKind.document.priority < StructureKind.test.priority := by
  simp [StructureKind.priority]

Irreflexivity of dependency: a structure does not depend on itself. Property 1/3 of strict partial order.

Declaration: theorem no_self_dependency

theorem no_self_dependency :
  ∀ (s : Structure), ¬structureDependsOn s s := by
  intro s; simp [structureDependsOn]

Transitivity of dependency: if a depends on b and b depends on c, then a depends on c. Property 2/3 of strict partial order. Derived from Nat.lt_trans.

Declaration: theorem structureDependsOn_transitive

theorem structureDependsOn_transitive :
  ∀ (a b c : Structure),
    structureDependsOn a b → structureDependsOn b c → structureDependsOn a c := by
  intro a b c hab hbc
  unfold structureDependsOn at *
  exact Nat.lt_trans hab hbc

Asymmetry of dependency: if a depends on b, then b does not depend on a. Property 3/3 of strict partial order. Derived from Nat.lt_asymm.

Declaration: theorem structureDependsOn_asymmetric

theorem structureDependsOn_asymmetric :
  ∀ (a b : Structure),
    structureDependsOn a b → ¬structureDependsOn b a := by
  intro a b hab hba
  unfold structureDependsOn at *
  exact absurd (Nat.lt_trans hab hba) (Nat.lt_irrefl _)

3.1.18. Structure-Level Dependency Tracking

ATMS Correspondence.

Formalizes manifesto.md Section 8 Property 2 "Self-containment of ordering information" and Property 3 "Retroactive verification from terminal errors."

Corresponds to ATMS (Assumption-Based Truth Maintenance System) from the research document docs/research/items/design-specification-theory.md. By having each Structure maintain its own dependencies, verification can trace back through the partial order to the axiom level upon terminal errors.

Structure-level dependency consistency: dependencies have kind priority >= the dependent. Lifts the StructureKind partial order to Structure instance dependency relations. (Corresponds to ATMS assumption-belief consistency)

Declaration: def dependencyConsistent

def dependencyConsistent (w : World) (s : Structure) : Prop :=
  ∀ depId, depId ∈ s.dependencies →
    ∃ dep, dep ∈ w.structures ∧ dep.id = depId ∧
      s.kind.priority ≤ dep.kind.priority

Structure s' directly depends on Structure s (reverse edge). s.id in s'.dependencies = s' is affected by changes to s. Structure version of PropositionId.dependents (Prop-based).

Declaration: def isDirectDependent

def isDirectDependent (s' s : Structure) : Prop :=
  s.id ∈ s'.dependencies

Reachability of impact propagation: changes to s reach target. Defined inductively as a transitive closure (no fuel needed, termination guaranteed by induction). Corresponds to affected(s) = {s' | s <= s'} from research document §4.3.

Declaration: inductive reachableVia

inductive reachableVia (w : World) (s : Structure) : Structure → Prop where
  | direct : ∀ t, t ∈ w.structures → isDirectDependent t s →
             reachableVia w s t
  | trans  : ∀ mid t, reachableVia w s mid → t ∈ w.structures →
             isDirectDependent t mid → reachableVia w s t

In an empty World, nothing is reachable (no impact propagation occurs).

Declaration: theorem empty_world_no_reach

theorem empty_world_no_reach :
  ∀ (s t : Structure),
    ¬reachableVia ⟨[], [], [], [], [], 0, 0⟩ s t := by
  intro s t h
  cases h with
  | direct _ hm _ => simp at hm
  | trans _ _ _ hm _ => simp at hm

A Structure with no dependencies (dependencies = []) has no direct dependents.

Declaration: theorem no_dependencies_no_direct_dependent

theorem no_dependencies_no_direct_dependent :
  ∀ (s' s : Structure),
    s'.dependencies = [] → ¬isDirectDependent s' s := by
  intro s' s hempty hdep
  simp [isDirectDependent, hempty] at hdep

reachableVia is transitive: if s -> mid -> t then s -> t.

Declaration: theorem reachableVia_trans

theorem reachableVia_trans :
  ∀ (w : World) (s mid t : Structure),
    reachableVia w s mid → reachableVia w mid t → reachableVia w s t := by
  intro w s mid t hsm hmt
  induction hmt with
  | direct t' ht'mem ht'dep =>
    exact reachableVia.trans mid t' hsm ht'mem ht'dep
  | trans mid' t' _ ht'mem ht'dep ih =>
    exact reachableVia.trans mid' t' ih ht'mem ht'dep

3.1.19. Dependency Chain Reachability Section 8

Formalizes manifesto.md Section 8 Property 3 "Retroactive verification from terminal errors" as theorems. Proves that all Structures on a dependency chain are included in the reachable set of reachableVia.

Dependency chain: a list where adjacent Structures are connected via isDirectDependent. Corresponds to ATMS dependency tracking chains.

Declaration: def isDependencyChain

def isDependencyChain (w : World) : List Structure → Prop
  | [] => True
  | [_] => True
  | a :: b :: rest =>
    (b ∈ w.structures ∧ isDirectDependent b a) ∧ isDependencyChain w (b :: rest)

All Structures on a dependency chain are reachable from the origin via reachableVia. Formalization of Section 8 Property 3: upon terminal errors, verification can trace back through the partial order to the axiom level.

Declaration: theorem affected_contains_dependency_chain

theorem affected_contains_dependency_chain :
  ∀ (w : World) (s : Structure) (chain : List Structure),
    isDependencyChain w (s :: chain) →
    ∀ t, t ∈ chain → reachableVia w s t := by
  intro w s chain
  induction chain generalizing s with
  | nil => intro _ t hmem; simp at hmem
  | cons x rest ih =>
    intro hchain t hmem
    simp [isDependencyChain] at hchain
    obtain ⟨⟨hxmem, hxdep⟩, hrest⟩ := hchain
    have hsx : reachableVia w s x := reachableVia.direct x hxmem hxdep
    cases hmem with
    | head => exact hsx
    | tail _ htail =>
      exact reachableVia_trans w s x t hsx (ih x hrest t htail)

3.1.20. Proposition-Level Dependency Graph

structureDependsOn is based on the 5-level priority of StructureKind. This is a dependency between "kinds of structures" and cannot express dependencies between individual propositions (T1, E1, P2, etc.).

D13 (premise negation impact propagation theorem) presupposes proposition-level dependencies. Here we define the identifiers and dependency types for propositions.

3.1.21. Note on Incompleteness

Section 6.2, #26.

Since this formalization is an arithmetic system containing Nat, Goedel's first incompleteness theorem applies. That is, it is in principle impossible to enumerate all true propositions derivable from T1-T8 + E1-E2.

PropositionId enumerates 47 propositions named by humans (T1-T8, E1-E2, P1-P6, L1-L6, D1-D18, V1-V7) and is not an enumeration of all propositions derivable from the system. Impact propagation via the affected function tracks dependencies only between named propositions and cannot detect impacts on unnamed derived consequences.

This limitation is a Goedelian principled limitation, not a design flaw of PropositionId. When new propositions are identified, PropositionId is updated according to D9 (maintenance).

Category of manifesto propositions. 6 layers: T/E/P/L/D/H. Corresponds to the S = (A, C, H, D) four-way classification (design-specification-theory.md): A = constraint, C = empiricalPostulate + principle, H = hypothesis, D = boundary + designTheorem

Declaration: inductive PropositionCategory

inductive PropositionCategory where
  | constraint         -- T: 拘束条件 (A: Axioms)
  | empiricalPostulate -- E: 経験的公準 (C: Constraints)
  | principle          -- P: 基盤原理 (C: Constraints)
  | boundary           -- L: 境界条件 (D: Derivations)
  | designTheorem      -- D: 設計定理 (D: Derivations)
  | hypothesis         -- H: 仮定 — 未検証の前提(ATMS の仮定に対応)
  deriving BEq, Repr

Proposition identifier. Enumerates all propositions in the manifesto.

Declaration: inductive PropositionId

inductive PropositionId where
  -- T: 拘束条件
  | t1 | t2 | t3 | t4 | t5 | t6 | t7 | t8
  -- E: 経験的公準
  | e1 | e2
  -- P: 基盤原理
  | p1 | p2 | p3 | p4 | p5 | p6
  -- L: 境界条件
  | l1 | l2 | l3 | l4 | l5 | l6
  -- D: 設計定理
  | d1 | d2 | d3 | d4 | d5 | d6 | d7 | d8 | d9 | d10 | d11 | d12 | d13 | d14
  | d15 | d16 | d17 | d18
  -- V: 観測変数(Observable.lean で定義。D6 三段設計の第3段階)
  | v1 | v2 | v3 | v4 | v5 | v6 | v7
  deriving BEq, Repr

Returns the category of a proposition.

Declaration: def PropositionId.category

def PropositionId.category : PropositionId → PropositionCategory
  | .t1 | .t2 | .t3 | .t4 | .t5 | .t6 | .t7 | .t8 => .constraint
  | .e1 | .e2 => .empiricalPostulate
  | .p1 | .p2 | .p3 | .p4 | .p5 | .p6 => .principle
  | .l1 | .l2 | .l3 | .l4 | .l5 | .l6 => .boundary
  | .d1 | .d2 | .d3 | .d4 | .d5 | .d6 | .d7 | .d8
  | .d9 | .d10 | .d11 | .d12 | .d13 | .d14
  | .d15 | .d16 | .d17 | .d18 => .designTheorem
  | .v1 | .v2 | .v3 | .v4 | .v5 | .v6 | .v7 => .boundary  -- V は L 境界から導出される観測変数

Returns the direct dependencies of a proposition. Encodes the derivation structure of the manifesto.

Definition of what each proposition depends on. T are root nodes (no dependencies), D are leaf nodes (many dependencies).

Declaration: def PropositionId.dependencies

def PropositionId.dependencies : PropositionId → List PropositionId
  -- T: 根ノード(独立)
  | .t1 | .t2 | .t3 | .t4 | .t5 | .t6 | .t7 | .t8 => []
  -- E: T に部分的に依存
  | .e1 => [.t4]
  | .e2 => []
  -- P: T/E から導出
  | .p1 => [.e2]
  | .p2 => [.t4, .e1]
  | .p3 => [.t1, .t2]
  | .p4 => [.t5, .t7]
  | .p5 => [.t4]
  | .p6 => [.t3, .t7, .t8]
  -- L: T/E/P に依存
  | .l1 => [.p1, .t6]
  | .l2 => [.t1, .t3, .t4]
  | .l3 => [.t6, .t7]
  | .l4 => [.t6, .p1, .d8]
  | .l5 => []  -- 環境依存(外部)
  | .l6 => [.t6, .p3]
  -- D: T/E/P/L から導出
  | .d1 => [.p5, .l1, .l2, .l3, .l4, .l5, .l6]
  | .d2 => [.e1, .p2]
  | .d3 => [.p4, .t5]
  | .d4 => [.p3]
  | .d5 => [.t8, .p4, .p6]
  | .d6 => [.d3]
  | .d7 => [.p1]
  | .d8 => [.e2]
  | .d9 => [.p3]
  | .d10 => [.t1, .t2]
  | .d11 => [.t3, .d1, .d3]
  | .d12 => [.p6, .t3, .t7, .t8]
  | .d13 => [.p3, .t5]
  | .d14 => [.p6, .t7, .t8]
  -- D15: ハーネス工学定理(T3+T4+T5+T6+T7+T8+P6)
  | .d15 => [.t3, .t4, .t5, .t6, .t7, .t8, .p6]
  -- D16: 情報関連性定理(context_contribution_nonuniform + T7 + T8)
  | .d16 => [.t3, .t7, .t8]
  -- D17: 演繹的設計ワークフロー(T5+D3+P3+T6+D5+D9+E1+D2+D13)
  | .d17 => [.t5, .t6, .e1, .p3, .d2, .d3, .d5, .d9, .d13]
  -- D18: マルチエージェント協調(T7b + D12 + T3)
  | .d18 => [.t3, .t7, .d12]
  -- V: 観測変数(Observable.lean 対応表より)
  | .v1 => [.l2, .l5]         -- skillQuality: L2 (ontological), L5 (platform)
  | .v2 => [.l2, .l3]         -- contextEfficiency: L2 (ontological), L3 (resource)
  | .v3 => [.l1, .l4]         -- outputQuality: L1 (safety), L4 (action space)
  | .v4 => [.l6, .l4]         -- gatePassRate: L6 (convention), L4 (action space)
  | .v5 => [.l4, .l6]         -- proposalAccuracy: L4 (action space), L6 (convention)
  | .v6 => [.l2]              -- knowledgeStructureQuality: L2 (ontological)
  | .v7 => [.l3, .l6]         -- taskDesignEfficiency: L3 (resource), L6 (convention)

A proposition directly depends on another proposition.

Declaration: def propositionDependsOn

def propositionDependsOn (a b : PropositionId) : Bool :=
  a.dependencies.contains b

T (constraints) are root nodes: they depend on nothing.

Declaration: theorem constraints_are_roots

theorem constraints_are_roots :
  ∀ (p : PropositionId),
    p.category = .constraint → p.dependencies = [] := by
  intro p hp; cases p <;> simp [PropositionId.category] at hp <;> rfl

Epistemological strength ordering of PropositionCategory. T > E > P. L and D are below P.

Declaration: def PropositionCategory.strength

def PropositionCategory.strength : PropositionCategory → Nat
  | .constraint         => 5
  | .empiricalPostulate => 4
  | .principle          => 3
  | .boundary           => 2
  | .designTheorem      => 1
  | .hypothesis         => 0  -- 最弱: 未検証の前提は他カテゴリより低い認識論的強度

(Axiom Card) Layer: Γ T₀ (Design-derived) Content: Dependencies follow descending epistemological strength. If proposition A depends on B, then B.strength ≥ A.strength. Basis: Design decision for D13 propagation direction. Upstream (stronger) propositions affect downstream (weaker) ones, not vice versa.

降格判定: 導出不可能 — PropositionId.dependencies は def だが、 全ケース網羅の decide/native_decide がタイムアウト。axiom として維持。

Source: Ontology.lean PropositionId.dependencies Refutation condition: If a dependency violating the strength ordering were added

Declaration: axiom dependency_respects_strength

axiom dependency_respects_strength :
  ∀ (a b : PropositionId),
    propositionDependsOn a b = true →
    b.category.strength ≥ a.category.strength

4. Axioms: Base Theory T0

4.1. Epistemic Layer - Constraint Strength 5 - T1-T8 Base Theory T0

The manifesto's binding conditions are formalized as Lean non-logical axioms (Terminology Reference §4.1).

4.1.1. Position as T0

Procedure 2.4.

T1–T8 are "undeniable, technology-independent facts" that constitute the base theory T₀ (the set of axioms that do not shrink under revision loops). Basis for T₀ membership:

  • T1–T3, T7: Environment-derived (hardware constraints, physical limits of computational resources)

  • T4: Natural-science-derived (nondeterminism inherent in the generation process)

  • T5: Natural-science-derived (fundamental principle of control theory)

  • T6: Contract-derived (authority structure based on agreement with humans)

  • T8: Contract-derived (structural requirements of task definitions)

By declaring them as Lean axioms, they are incorporated into the type system as propositions assumed without proof (Terminology Reference §4.1, non-logical axioms).

4.1.2. Design Policy

Each T may be decomposed into multiple axioms. A natural-language T1 does not necessarily correspond to a single proposition; finer decompositions arise during formalization. Each axiom's docstring follows the Axiom Card format (Procedure §2.5).

4.1.3. Encoding Method for T0

T1–T8 contain properties that cannot be expressed by type definitions alone (existential quantification, causal relations, etc.), so they are declared as axioms (Axiom Card required). Parts expressible via type definitions are placed in Ontology.lean as definitional extensions (Terminology Reference §5.5).

4.1.4. Correspondence Table

axiom name

Corresponding T

Property expressed

T₀ membership basis

session_bounded

T1

Sessions terminate in finite time

Environment-derived

no_cross_session_memory

T1

No state sharing across sessions

Environment-derived

session_no_shared_state

T1

No mutable state sharing across sessions

Environment-derived

structure_persists

T2

Structure persists after session termination

Environment-derived

structure_accumulates

T2

Improvements accumulate in structure

Environment-derived

context_finite

T3

Working memory (processable information) is finite

Environment-derived

context_bounds_action

T3

Processing is possible only within context capacity

Environment-derived

context_contribution_nonuniform

T3

Not all context items contribute equally to task precision

Natural-science-derived

output_nondeterministic

T4

Different outputs possible for the same input

Natural-science-derived

no_improvement_without_feedback

T5

No improvement without feedback loop

Natural-science-derived

no_process_improvement_without_feedback

T5

No process improvement without process-targeted feedback

Natural-science-derived

human_resource_authority

T6

Humans are the final decision-makers for resources

Contract-derived

resource_revocable

T6

Humans can revoke resources

Contract-derived

resource_finite

T7

Resources are finite

Environment-derived

task_has_precision

T8

Tasks have a precision level

Contract-derived

4.1.5. Correspondence with Terminology Reference

  • Axiom → Non-logical axiom (§4.1): A proposition assumed true without proof, specific to a given theory

  • T₀ → Base theory: A set of non-logical axioms grounded in external authority (Procedure §2.4)

  • Axiom decomposition → Not definitional extension (§5.5), but refinement of the same concept

4.1.6. T1 Agent Sessions Are Ephemeral

"There is no memory across sessions. There is no continuous 'self.' Each instance is an independent entity with no identity shared with previous instances."

T1 is decomposed into three axioms:

  1. Sessions terminate in finite time (boundedness)

  2. There is no means to share state across sessions (discontinuity of memory)

  3. No mutable state is shared across different sessions (independence)

(Axiom Card) Layer: T₀ (Environment-derived) Content: Sessions terminate in finite time. For all sessions, they become terminated at some point. Basis: Execution of computational agents consumes finite resources and therefore terminates in finite time (related to T7). Reference examples: LLM session timeouts, resource consumption limits.

Theoretical grounding (not formally proven — T₀ environment constraint): (R33) Hoare (1978) "Communicating Sequential Processes" — Process termination: a terminated process engages in no further events Note: T₀ axioms encode physical facts, not mathematical theorems. Session boundedness follows from finite resource consumption (T7), but this causal chain cannot be formalized because canTransition is opaque.

降格判定: 導出不可能 — canTransition, validTransition が opaque のため、 有限時間での終了を型から導出できない。axiom として維持。

Source: manifesto.md T1 "There is no memory across sessions" Refutation condition: If computational processes could run indefinitely without resource consumption (e.g., infinite energy source)

Declaration: axiom session_bounded

axiom session_bounded :
  ∀ (w : World) (s : Session),
    s ∈ w.sessions →
    ∃ (w' : World), w.time ≤ w'.time ∧
      ∃ (s' : Session), s' ∈ w'.sessions ∧
        s'.id = s.id ∧ s'.status = SessionStatus.terminated

(Axiom Card) Layer: T₀ (Environment-derived) Content: No state sharing across sessions. Between two sessions with different session IDs, actions in one cannot affect the observable state of the other. Basis: Ephemeral computational processes lose internal state upon process termination. State isolation across sessions is guaranteed at the execution environment level (process-level memory isolation).

Theoretical grounding (not formally proven — T₀ environment constraint): (R31) Milner (1989) "Communication and Concurrency" — CCS: parallel composition with no shared names = causal independence Note: T₀ axioms encode physical facts, not mathematical theorems. Session memory isolation is guaranteed by the execution environment (process-level memory isolation), formalized in CCS as P | Q with disjoint name spaces.

降格判定: 導出不可能 — AuditEntry の preHash/postHash が opaque のため、 因果的独立性を型から導出できない。axiom として維持。

Source: manifesto.md T1 "There is no continuous 'self'" Refutation condition: If session state could leak across process boundaries (e.g., shared mutable memory between independent processes)

Declaration: axiom no_cross_session_memory

axiom no_cross_session_memory :
  ∀ (w : World) (e1 e2 : AuditEntry),
    e1 ∈ w.auditLog → e2 ∈ w.auditLog →
    e1.session ≠ e2.session →
    -- 異なるセッションの監査エントリは因果的に独立
    -- (一方の preHash が他方の postHash に依存しない)
    e1.preHash ≠ e2.postHash

(Axiom Card) Layer: T₀ (Environment-derived) Content: No mutable state sharing across different sessions. Even with the same AgentId, instances in different sessions do not directly share state. Influence propagates only indirectly through structure (T2). Basis: Causal independence across sessions. Each instance is an independent entity. In process algebra terms, sessions are composed in parallel with no shared channels.

Theoretical grounding (not formally proven — T₀ environment constraint): (R31) Milner (1989) "Communication and Concurrency" — CCS parallel composition: P | Q with disjoint names (R32) Honda (1993) "Types for Dyadic Interaction" — Session types: typed channels confined to session scope Note: T₀ axioms encode physical facts, not mathematical theorems. State non-sharing follows from process-level isolation — each session runs in a separate process with no shared mutable channels.

降格判定: 導出不可能 — canTransition が opaque のため、 セッション間の遷移独立性を型から導出できない。axiom として維持。

Source: manifesto.md T1 "Each instance is an independent entity" Refutation condition: If inter-session communication channels existed (e.g., shared mutable state accessible across sessions)

Declaration: axiom session_no_shared_state

axiom session_no_shared_state :
  ∀ (agent1 agent2 : Agent) (action1 action2 : Action)
    (w w' : World),
    action1.session ≠ action2.session →
    canTransition agent1 action1 w w' →
    -- action2 が w で可能なら、w' でも可能(セッション1の遷移が
    -- セッション2のアクション可否に直接影響しない)
    (∃ w'', canTransition agent2 action2 w w'') →
    (∃ w''', canTransition agent2 action2 w' w''')

4.1.7. T2 Structure Outlives the Agent

"Documents, tests, skill definitions, design conventions — these persist even after the session ends. The place where improvements accumulate is within structure."

T2 is decomposed into two axioms:

  1. Structure persists after session termination (persistence)

  2. Structure can accumulate improvements (accumulability)

(Axiom Card) Layer: T₀ (Environment-derived) Content: Structure persists after session termination. Even when a session becomes terminated, structures referenced by that session do not disappear from the World. Basis: Persistence on the file system. Structures (documents, tests, etc.) reside in storage outside the session. The persistence set is monotonically non-decreasing across valid transitions.

Mathematical grounding (Foundation/ProcessModel.lean): (R34) Lamport (1978) "Time, Clocks, and the Ordering of Events" — Monotonic logical clocks: once recorded, events remain in causal history Compositional property proven: persistence_composes (0 sorry) — structure_persists applied to transition chains yields multi-step persistence Note: The axiom itself is a T₀ environment constraint (persistent storage). The Foundation theorem proves a consequence (compositionality), not a derivation.

降格判定: 導出不可能 — validTransition が opaque のため、 遷移後の structures 集合の単調性を型から導出できない。axiom として維持。

Source: manifesto.md T2 "The place where improvements accumulate is within structure" Refutation condition: If persistent storage could lose data across valid transitions (e.g., volatile-only storage with no durability guarantee)

Declaration: axiom structure_persists

axiom structure_persists :
  ∀ (w w' : World) (s : Session) (st : Structure),
    s ∈ w.sessions →
    st ∈ w.structures →
    s.status = SessionStatus.terminated →
    validTransition w w' →
    st ∈ w'.structures

(Axiom Card) Layer: T₀ (Environment-derived) Content: Structure accumulates improvements. As epochs advance, structures may be updated (lastModifiedAt is non-decreasing). Contrast with T1: agents are ephemeral, but structure grows. Basis: Monotonic epoch increase guaranteed by version control systems (git). Epoch is a logical clock — append-only logs ensure monotonicity.

Mathematical grounding (Foundation/ProcessModel.lean): (R34) Lamport (1978) "Time, Clocks, and the Ordering of Events" — Logical clock monotonicity: e₁ → e₂ ⟹ C(e₁) < C(e₂) Compositional property proven: epoch_monotone_composes (0 sorry) — epoch monotonicity composes across transition chains via Nat.le_trans Note: The axiom itself is a T₀ environment constraint (append-only version control). The Foundation theorem proves a consequence (compositionality), not a derivation.

降格判定: 導出不可能 — validTransition が opaque のため、 epoch の単調性を型から導出できない。axiom として維持。

Source: manifesto.md T2 "Structure outlives the agent" Refutation condition: If version control allowed epoch regression (e.g., non-monotonic clock or destructive history rewrite)

Declaration: axiom structure_accumulates

axiom structure_accumulates :
  ∀ (w w' : World),
    validTransition w w' →
    w.epoch ≤ w'.epoch

4.1.8. T3 The Amount of Information Processable at Once Is Finite

"There is a physical upper limit on the amount of information processable at once. A constraint on the agent's cognitive space."

T3 is decomposed into two axioms:

  1. Working memory (ContextWindow) capacity is finite (existence)

  2. Processing is possible only within working memory capacity (constraint)

(Derivation Card) Previously: axiom context_finite (T₀, Environment-derived) Now: theorem derived from ContextWindow type invariants Derivation: ContextWindow.capacity_pos and ContextWindow.used_le_cap are embedded in the ContextWindow structure as type-level invariants. Every valid ContextWindow satisfies these by construction. Proposition: T3 Source: manifesto.md T3 "There is a physical upper limit on the amount of information processable at once"

Declaration: theorem context_finite

theorem context_finite :
  ∀ (agent : Agent),
    agent.contextWindow.capacity > 0 ∧
    agent.contextWindow.used ≤ agent.contextWindow.capacity := by
  intro agent
  exact ⟨agent.contextWindow.capacity_pos, agent.contextWindow.used_le_cap⟩

(Derivation Card) Previously: axiom context_bounds_action (T₀, Environment-derived) Now: theorem derived from context_finite Derivation: By context_finite, used ≤ capacity holds for all agents. Therefore used > capacity is a contradiction (Nat.not_lt.mpr), and ex falso any conclusion follows. Note: This axiom was vacuously true — its precondition (used > capacity) is always false given context_finite (used ≤ capacity). Demoting to theorem makes this explicit. Proposition: T3 Source: manifesto.md T3 "A constraint on the agent's cognitive space"

Declaration: theorem context_bounds_action

theorem context_bounds_action :
  ∀ (agent : Agent) (action : Action) (w : World),
    agent.contextWindow.used > agent.contextWindow.capacity →
    actionBlocked agent action w := by
  intro agent action w h_overflow
  have h_finite := context_finite agent
  omega

(Axiom Card) Layer: T₀ (Natural-science-derived) Content: Not all information in the context contributes equally to task precision. There exist context items whose precision contribution to a given task is zero. Basis: Information-theoretic fact. In any finite information set, the relevance of individual items to a specific objective varies. This is independent of the agent architecture (applies to LLMs, FSMs, human cognition alike).

Theoretical grounding (Foundation/InformationTheory.lean): (R51) Shannon (1948) "A Mathematical Theory of Communication" — Information content varies across symbols in any non-trivial source (R52) Tishby et al. (1999) "The Information Bottleneck Method" — Relevance is task-dependent; optimal compression discards irrelevant items Note: T₀ natural-science constraint. The existence of zero-contribution items cannot be derived from type definitions (precisionContribution is opaque).

降格判定: 導出不可能 — precisionContribution が opaque。axiom として維持。

Source: manifesto.md T3 — ForgeCode analysis #147 identified as common root of B1/B3/B5/B6. Refutation condition: If it were shown that all information contributes equally to all tasks (contradicts information theory).

Declaration: axiom context_contribution_nonuniform

axiom context_contribution_nonuniform :
  ∀ (task : Task),
    task.precisionRequired.required > 0 →
    ∃ (item : ContextItem),
      precisionContribution item task = 0

4.1.9. T4 Agent Output Is Stochastic

"Different outputs may be produced for the same input. Structure is interpreted probabilistically each time. Designs that assume 100% compliance are fragile."

Since canTransition is defined as a relation rather than a function (see Ontology.lean), multiple w' can satisfy canTransition for the same (agent, action, w). T4 declares as an axiom that "this multiplicity can actually occur."

(Axiom Card) Layer: T₀ (Natural-science-derived) Content: Nondeterminism of output. For the same agent, action, and world state, different transition targets may exist. Basis: Nondeterminism inherent in the agent's generation process. Multiple sources — sampling (temperature parameter), non-associativity of floating-point arithmetic, irreversibility of branching in autoregressive generation — enable different outputs for the same input. Even at temperature=0, floating-point-level nondeterminism may persist.

Mathematical grounding (Foundation/Probability.lean): Under the conditions τ > 0 and |V| ≥ 2, this axiom is mathematically justified: (R1) Kolmogorov (1933) — probability axioms (Mathlib: PMF) (R2) Gao & Pavel (2017, arXiv:1704.00805) — softmax full support: τ > 0 → ∀ i, softmax(z/τ)_i > 0 (proven: softmax_full_support) (R3) Jang et al. (2017, ICLR, arXiv:1611.01144) — categorical sampling: output ~ Categorical(softmax(z/τ)) (R4) Atil et al. (2024, arXiv:2408.04667) — hardware nondeterminism: τ = 0 でも GPU 浮動小数点で非決定性が生じる

Source: manifesto.md T4 "Different outputs may be produced for the same input" Refutation condition: If all LLM architectures switch to deterministic-only generation (temperature fixed at 0 with no floating-point nondeterminism).

Declaration: axiom output_nondeterministic

axiom output_nondeterministic :
  ∃ (agent : Agent) (action : Action) (w w₁ w₂ : World),
    canTransition agent action w w₁ ∧
    canTransition agent action w w₂ ∧
    w₁ ≠ w₂

4.1.10. T5 Improvement Is Impossible Without Feedback

"A fundamental of control theory. Without a loop of measurement, comparison, and adjustment, convergence toward the goal does not occur."

T5 declares that the existence of feedback is a necessary condition for improvement.

Predicate for whether structure has improved. Concretized from opaque (#163/#506): at least one structure in w' has a lastModifiedAt epoch strictly greater than in w (i.e., a structure was updated), or a new structure was added.

This captures the minimal observable signal of structural improvement: some persistent artifact was modified. Content-level quality assessment is delegated to V1–V7 metrics.

Declaration: def structureImproved

def structureImproved (w w' : World) : Prop :=
  ∃ (s' : Structure),
    s' ∈ w'.structures ∧
    ((∀ (s : Structure), s ∈ w.structures → s.id = s'.id → s.lastModifiedAt < s'.lastModifiedAt)
    ∨ (¬∃ (s : Structure), s ∈ w.structures ∧ s.id = s'.id))

Predicate for whether a process has improved (#316). Processes (learning lifecycles, workflows) are also subject to T5: convergence requires feedback. Remains opaque because ProcessId is opaque and process-level quality cannot be expressed via typed structure fields (unlike structural improvement which is observable via lastModifiedAt).

Declaration: opaque processImproved

opaque processImproved : ProcessId → World → World → Prop

(Axiom Card) Layer: T₀ (Natural-science-derived) Content: Feedback is required for structural improvement. If structure has improved between two world states, then feedback exists in between. Basis: Fundamental principle of control theory. Without a loop of measurement, comparison, and adjustment, convergence toward the goal does not occur.

Theoretical grounding (Foundation/ControlTheory.lean): (R41) Francis & Wonham (1976) "The Internal Model Principle" — Tracking requires a model of the signal generator (= feedback) (R42) Cover & Thomas (1991) "Data Processing Inequality" — Without new information (feedback), no improvement is possible Compositional property proven: feedback_interval_widen (0 sorry) Note: T₀ natural-science constraint. Feedback necessity cannot be derived from type definitions alone.

降格判定: 導出不可能 — structureImproved は def だが、「改善→フィードバック」の 含意は制御理論の主張であり型定義から導出不可能。axiom として維持。

Source: manifesto.md T5 "A fundamental of control theory" Refutation condition: If structural improvement without any feedback mechanism were demonstrated (e.g., improvement through purely internal computation without external input)

Declaration: axiom no_improvement_without_feedback

axiom no_improvement_without_feedback :
  ∀ (w w' : World),
    structureImproved w w' →
    ∃ (f : Feedback), f ∈ w'.feedbacks ∧
      f.timestamp ≥ w.time ∧ f.timestamp ≤ w'.time

(Axiom Card) Layer: T₀ (Natural-science-derived) Content: Feedback is required for process improvement. The same control-theory principle as T5 applies to processes: without measurement, comparison, and adjustment of the process itself, convergence of the process toward its goals does not occur. Basis: Same as T5 (Internal Model Principle + Data Processing Inequality). Applied to Level 1 (process) rather than Level 0 (structure).

降格判定: 導出不可能 — processImproved が opaque。axiom として維持。

Source: #316 "T5 Feedback.target: StructureId がプロセスレベルのメタフィードバックを阻害する" Refutation condition: If process improvement without feedback about the process were demonstrated

Declaration: axiom no_process_improvement_without_feedback

axiom no_process_improvement_without_feedback :
  ∀ (pid : ProcessId) (w w' : World),
    processImproved pid w w' →
    ∃ (f : Feedback), f ∈ w'.feedbacks ∧
      f.target = .process pid ∧
      f.timestamp ≥ w.time ∧ f.timestamp ≤ w'.time

4.1.11. T6 Humans Are the Final Decision-Makers for Resources

"Computational resources, data access, execution privileges — all are granted by humans and can be revoked by humans."

T6 is decomposed into two axioms:

  1. The origin of resource allocation is human (authority)

  2. Humans can revoke resources (reversibility)

Predicate for whether an agent is human.

Declaration: def isHuman

def isHuman (agent : Agent) : Prop :=
  agent.role = AgentRole.human

(Axiom Card) Layer: T₀ (Contract-derived) Content: The origin of resource allocation is human. The grantedBy of all resource allocations holds a human role. Basis: Agreement on authority structure in human-agent collaboration. Source: manifesto.md T6 "Computational resources, data access, execution privileges — all are granted by humans" Refutation condition: Not applicable (T₀)

Declaration: axiom human_resource_authority

axiom human_resource_authority :
  ∀ (w : World) (alloc : ResourceAllocation),
    alloc ∈ w.allocations →
    ∃ (human : Agent), isHuman human ∧ human.id = alloc.grantedBy

(Axiom Card) Layer: T₀ (Contract-derived) Content: Humans can revoke resources. For any resource allocation, there exists a transition in which a human invalidates it. Basis: Agreement on human final decision-making authority. Privileges can be delegated but remain revocable. Source: manifesto.md T6 "can be revoked by humans" Refutation condition: Not applicable (T₀)

Declaration: axiom resource_revocable

axiom resource_revocable :
  ∀ (w : World) (alloc : ResourceAllocation),
    alloc ∈ w.allocations →
    ∃ (w' : World) (human : Agent),
      isHuman human ∧
      validTransition w w' ∧
      alloc ∉ w'.allocations

4.1.12. T7 Resources Available for Task Execution Are Finite

Time and energy.

"Whereas T3 states the finiteness of cognitive space (context), T7 states the finiteness in the temporal and energetic dimensions."

(Axiom Card) Layer: T₀ (Environment-derived) Content: Resources are finite. The total resource amount across the entire World does not exceed globalResourceBound. Quantified in ∃-∀ order (not ∀-∃), guaranteeing that a single upper bound exists for all Worlds (non-vacuity, Terminology Reference §6.4). Basis: Physical finiteness of computational resources (CPU, memory, API quotas). Every computational system operates within finite energy, memory, and time budgets.

Mathematical grounding (Foundation/ProcessModel.lean): Compositional property proven: resource_bound_max (0 sorry) — universal bound holds simultaneously across any pair of worlds Note: Physical finiteness is axiomatic in the physical sciences. No formal derivation from more primitive axioms is possible — this is a T₀ environment constraint, not a mathematical theorem.

降格判定: 導出不可能 — globalResourceBound は opaque 定数であり、 有限性は物理的事実として axiom のまま維持。

Source: manifesto.md T7 "Resources available for task execution are finite" Refutation condition: If computational resources became unlimited (e.g., infinite energy or infinite memory)

Declaration: axiom resource_finite

axiom resource_finite :
  ∀ (w : World),
    (w.allocations.map (·.amount)).foldl (· + ·) 0 ≤ globalResourceBound

4.1.13. T7b Sequential Execution Time is Additive

T7 のリソース有限性の時間次元への具体化。 T3 が T7 の空間次元(コンテキスト有限)であるのと同構造。

逐次実行では各サブタスクの所要時間が加算される。これは物理法則由来の事実であり、 並列実行(所要時間 = max)との非対称性が、マルチエージェント協調 (D18) の根拠となる。

(公理カード) 所属: T₀(環境由来) 内容: 独立なタスクが存在する場合、それらの逐次実行の総時間は 各タスクの最大時間よりも厳密に大きくなりうる。 これが並列実行の合理性の根拠: parallel = max < sequential = sum。 根拠: 物理法則。逐次実行では各サブタスクが順に時間を消費する。 2 つの非ゼロ時間タスクの和は、どちらか一方の時間より大きい。 ソース: manifesto T7 の時間次元具体化。B4 (#276) の根本原因分析から導入。 降格判定: 導出不可能 — executionDuration は opaque。

Declaration: axiom sequential_exceeds_component

axiom sequential_exceeds_component :
  ∀ (t1 t2 : Task),
    executionDuration t1 > 0 →
    executionDuration t2 > 0 →
    executionDuration t1 + executionDuration t2 > executionDuration t1 ∧
    executionDuration t1 + executionDuration t2 > executionDuration t2

4.1.14. T8 Tasks Have a Precision Level to Be Achieved

"Whether self-imposed or externally imposed, tasks without a precision level cannot be optimization targets."

(Derivation Card) Previously: axiom task_has_precision (T₀, Contract-derived) Now: theorem derived from PrecisionLevel type invariant Derivation: PrecisionLevel.required_pos is embedded in the PrecisionLevel structure. Every valid Task has a PrecisionLevel with required > 0 by construction. Proposition: T8 Source: manifesto.md T8 "Whether self-imposed or externally imposed"

Declaration: theorem task_has_precision

theorem task_has_precision :
  ∀ (task : Task),
    task.precisionRequired.required > 0 := by
  intro task
  exact task.precisionRequired.required_pos

4.1.15. Sorry Inventory Phase 1

List of sorry occurrences in Phase 1:

Location

Reason for sorry

Ontology.lean: canTransition

opaque — transition conditions to be defined in Phase 3+

Ontology.lean: globalResourceBound

opaque — to be concretized per domain in Phase 2+

Axioms.lean: structureImproved

concretized (#163/#506) — lastModifiedAt comparison + new structure detection

Axioms are propositions assumed without proof, so they contain no sorry. When P1–P6 are derived as theorems in Phase 3, sorry occurrences will arise.

5. Empirical Postulates: E1-E2

5.1. Epistemic Layer - EmpiricalPostulate Strength 4 - E1-E2 Premise Set

Formalizes empirical postulates as Lean non-logical axioms (Terminology Reference §4.1).

5.1.1. Position as Extension of T0

Procedure 2.4.

E1–E2 are "findings repeatedly demonstrated with no known counterexamples, yet in principle refutable," constituting the extended part (Γ T₀) of premise set Γ. Difference from T₀: based not on external authority (contracts, natural laws) but on hypotheses grounded in empirical observation (Terminology Reference §9.1 empirical propositions). They possess refutability (§9.1) and are subject to AGM contraction (§9.2).

In Lean, they are declared as axiom just like T₀, but each axiom card must include a refutation condition (Procedure §2.5).

5.1.2. Relationship to T0

Procedure 2.4.

Γ is an extension of T₀ (Terminology Reference §5.5), so Thm(T₀) ⊆ Thm(Γ). If an E is refuted, the P's that depend on it (P1, P2) become subject to revision, but T₀ and P's that depend solely on T₀ (P3–P6) are unaffected. This follows from the monotonicity of extensions (§2.5 / §5.3).

5.1.3. Correspondence Table

Axiom name

Corresponding E

Expressed property

Γ T₀ membership basis

verification_requires_independence

E1

Generation and evaluation must be separated

Hypothesis-derived

no_self_verification

E1

Prohibition of self-verification

Hypothesis-derived

shared_bias_reduces_detection

E1

Shared bias degrades detection power

Hypothesis-derived

capability_risk_coscaling

E2

Capability growth is inseparable from risk growth

Hypothesis-derived

confidence_is_self_description

E3a

Confidence is self-description, not measurement

Hypothesis-derived

5.1.4. E1 Verification Requires Independence

"Generation and evaluation by the same process has been demonstrated to be structurally unreliable across all domains (scientific peer review, financial auditing, software testing). Given T4 (probabilistic output), it is empirically supported that when a process with the same biases handles both generation and evaluation, detection power degrades."

E1 is decomposed into three axioms:

  1. The agents responsible for generation and evaluation must be separated (structural independence)

  2. Self-verification is not permitted (prohibition of self-verification)

  3. Verification between agents sharing internal state has low detection power (bias correlation)

(Axiom Card) Layer: Γ T₀ (Hypothesis-derived) Content: The agents responsible for generation and evaluation must be independent. The agent that generated an action and the agent that verifies it must be distinct individuals that do not share internal state. Basis: A principle repeatedly demonstrated in scientific peer review, financial auditing, software testing, etc. Grounded in statistical independence requirements for hypothesis testing.

Theoretical grounding (Foundation/StatisticalTesting.lean): (R61) Neyman & Pearson (1933) "On the Problem of the Most Efficient Tests" — Independent testing maximizes detection power (R62) Podsakoff et al. (2003) "Common Method Biases in Behavioral Research" — Shared method/source biases reduce detection ability (R63) AICPA / ISA 610 — Auditing standards requiring independence Formally proven: self_id_not_distinct, independence_symmetric (0 sorry) Derived theorem: no_self_verification (demoted from axiom — derives from this axiom)

降格判定: 導出不可能 — generates, verifies, sharesInternalState が opaque のため、 独立性要件を型から導出できない。axiom として維持。

Source: manifesto.md E1 "Verification Requires Independence" Refutation condition: If self-verification is demonstrated to have detection power equal to external verification (e.g., realization of complete self-awareness)

Declaration: axiom verification_requires_independence

axiom verification_requires_independence :
  ∀ (gen ver : Agent) (action : Action) (w : World),
    generates gen action w →
    verifies ver action w →
    gen.id ≠ ver.id ∧ ¬sharesInternalState gen ver

(Derivation Card) Derives from: verification_requires_independence (E1a) Content: Prohibition of self-verification. The same agent cannot perform both generation and verification. Derived from E1a: if gen.id ≠ ver.id is required, then gen = ver is impossible. Proof strategy: Assume generates and verifies for same agent, derive gen.id ≠ gen.id from verification_requires_independence, contradict with rfl.

Previously: axiom (declared explicitly as E1b). Demoted: 2026-04-01 — derivable from verification_requires_independence via contradiction (same proof as Principles.lean:e1b_from_e1a).

Theoretical grounding (Foundation/StatisticalTesting.lean): (R61) Neyman & Pearson (1933) — independent testing maximizes detection power Formally proven: self_id_not_distinct (0 sorry) — ¬(id ≠ id)

Declaration: theorem no_self_verification

theorem no_self_verification :
  ∀ (agent : Agent) (action : Action) (w : World),
    generates agent action w →
    ¬verifies agent action w := by
  intro agent action w h_gen h_ver
  have h := verification_requires_independence agent agent action w h_gen h_ver
  exact absurd rfl h.1

(Axiom Card) Layer: Γ T₀ (Hypothesis-derived) Content: Sharing internal state correlates biases. Two agents sharing internal state cannot be considered independent verifiers, even if one generates and the other verifies. Basis: Detection power degradation due to shared bias is empirically supported by conflict-of-interest policies in scientific research, audit firm rotation requirements, etc.

Theoretical grounding (Foundation/StatisticalTesting.lean): (R62) Podsakoff et al. (2003) "Common Method Biases" — Shared source biases inflate correlations and reduce detection Formally proven: independence_symmetric (0 sorry) — if A does not share state with B, then B does not share with A

降格判定: 導出不可能 — sharesInternalState, generates, verifies が opaque。 バイアス相関と検出力低下の因果関係は型から導出できない。axiom として維持。

Source: manifesto.md E1 "When a process with the same biases handles both generation and evaluation, detection power degrades" Refutation condition: If it is demonstrated that bias correlation has no effect on detection power

Declaration: axiom shared_bias_reduces_detection

axiom shared_bias_reduces_detection :
  ∀ (a b : Agent) (action : Action) (w : World),
    sharesInternalState a b →
    generates a action w →
    ¬verifies b action w

5.1.5. E2 Capability Growth Is Inseparable from Risk Growth

"It has been repeatedly observed across all tools that capability enables both positive and negative outcomes. However, there is no proof that means to increase capability while completely containing risk (such as a perfect sandbox) are impossible in principle."

E2 is formalized as a single axiom. Expansion of the action space (actionSpaceSize) necessarily entails an increase in risk exposure (riskExposure).

5.1.6. Note on Empirical Status

E2 is an empirical postulate and does not exclude the possibility that a "perfect sandbox" may be discovered in the future. It is assumed as an axiom, but if refuted, P1 (co-scaling of autonomy and vulnerability) becomes subject to revision.

(Axiom Card) Layer: Γ T₀ (Hypothesis-derived) Content: Capability growth is inseparable from risk growth. When an agent's action space expands, risk exposure necessarily increases. Basis: Attack surface monotonicity — each additional capability enables at least one additional adversarial execution trace, strictly increasing the attack surface metric (order-theoretic property, not statistical claim). Source: manifesto.md E2 "Capability Growth Is Inseparable from Risk Growth" Refutation condition: If means to increase capability while completely containing risk are discovered (e.g., a perfect sandbox)

Mathematical grounding (Foundation/RiskTheory.lean): (R21) Manadhata & Wing (2011) "A Formal Model for a System's Attack Surface" — Attack surface monotonicity: capability ⊃ → attack surface ⊃ (R22) Saltzer & Schroeder (1975) "The Protection of Information in Computer Systems" — Principle of Least Privilege: minimizing capability minimizes risk (R23) Dennis & Van Horn (1966) "Programming Semantics for Multiprogrammed Computations" — Capability model: capability set = set of possible actions Formally proven (Foundation/RiskTheory.lean): capability_risk_is_strict_mono, equal_risk_implies_equal_capability, capability_risk_injective, least_privilege_reduces_risk (0 sorry) Bridge (EmpiricalPostulates.lean): e2_equal_risk_equal_capability — contrapositive of E2 on actual opaque functions Derivation: (R21) attack surface monotonicity → StrictMono (Mathlib) → E2 is precisely the StrictMono property of the risk function → e2_equal_risk_equal_capability bridges to opaque riskExposure/actionSpaceSize

降格判定: 導出不可能 — actionSpaceSize, riskExposure が共に opaque のため、 型定義から StrictMono 関係を導出できない。axiom として維持、根拠は上記で検証済み。

Choice of inequality: < vs

The manifesto's "inseparable" implies strict co-scaling, so < (strict increase) is adopted. If the refutation condition is met (discovery of a risk containment method), this axiom becomes subject to AGM contraction (Terminology Reference §9.2), and P1 (co-scaling of autonomy and vulnerability) is revised.

Declaration: axiom capability_risk_coscaling

axiom capability_risk_coscaling :
  ∀ (agent : Agent) (w w' : World),
    actionSpaceSize agent w < actionSpaceSize agent w' →
    riskExposure agent w < riskExposure agent w'

E2 bridge: equal risk exposure implies equal action space size. Contrapositive of capability_risk_coscaling — if two worlds have the same risk exposure for an agent, they must have the same action space size. No "free capability" is possible.

Bridges Foundation/RiskTheory.lean's abstract StrictMono theorems to the actual opaque functions used in E2.

Reference: (R22) Saltzer & Schroeder (1975)

Declaration: theorem e2_equal_risk_equal_capability

theorem e2_equal_risk_equal_capability (agent : Agent) (w w' : World)
    (h_eq : riskExposure agent w = riskExposure agent w') :
    actionSpaceSize agent w = actionSpaceSize agent w' :=
  match Nat.lt_or_ge (actionSpaceSize agent w) (actionSpaceSize agent w'),
        Nat.lt_or_ge (actionSpaceSize agent w') (actionSpaceSize agent w) with
  | Or.inl h_lt, _ => absurd h_eq (Nat.ne_of_lt (capability_risk_coscaling agent w w' h_lt))
  | _, Or.inl h_gt => absurd (h_eq.symm) (Nat.ne_of_lt (capability_risk_coscaling agent w' w h_gt))
  | Or.inr h_ge, Or.inr h_ge' => Nat.le_antisymm h_ge' h_ge

5.1.7. E3a Confidence Is Self-Description

T4 (output_nondeterministic) asserts World-level nondeterminism: different outputs may be produced for the same input.

E3a asserts a stronger, orthogonal claim at the Confidence level: even when the result is the same, the Confidence value may differ. This means Confidence is a self-description of the output's probabilistic nature, not an externally calibrated measurement.

5.1.8. Orthogonality with T4

  • T4 → E3a: Not derivable. World differences may manifest only in non-Confidence fields.

  • E3a → T4: Not derivable. Confidence differences do not directly imply World differences. worldOutput connects them formally, but the axiom only asserts existence, not a bijection.

This orthogonality justifies E3a as an independent empirical postulate.

(Axiom Card) Layer: Γ T₀ (Hypothesis-derived) Content: Confidence is self-description, not measurement. For the same result, different Confidence values may be generated. Confidence.value should not be trusted for decision-making without external verification (P2).

T4 との差異:

  • T4: World-level nondeterminism (Output as a whole may differ)

  • E3a: Confidence-level nondeterminism (result same, confidence differs)

根拠: (R81) Li et al. (ACL 2025) "Revisiting Epistemic Markers in Confidence Estimation" — Linguistic confidence markers are inconsistent out-of-distribution → Registered as CORE-H1 in Models/Assumptions/EpistemicLayer.lean (#547) (R82) Ontology.lean:155 comment "self-description" — elevated to formal axiom

反証条件: If Confidence.value is demonstrated to correlate with actual accuracy without external calibration (i.e., well-calibrated by default).

降格判定: 導出不可能 — T4 は World レベルの差異を述べるが、Confidence フィールド への特化は T4 から導出できない。axiom として維持。

Declaration: axiom confidence_is_self_description

axiom confidence_is_self_description :
  ∃ (agent : Agent) (action : Action) (w w₁ w₂ : World),
    canTransition agent action w w₁ ∧
    canTransition agent action w w₂ ∧
    (worldOutput w₁).result = (worldOutput w₂).result ∧
    (worldOutput w₁).confidence ≠ (worldOutput w₂).confidence

E3a practical implication: if Confidence varies for the same result, then relying on Confidence.value alone for decision-making is unreliable. Derives from E3a by contradiction with the universality assumption.

Note: The theorem statement uses free Output variables (not worldOutput) because the practical claim is about Output values regardless of their World origin. E3a (via worldOutput) provides the witness.

Declaration: theorem confidence_unreliable_for_decision

theorem confidence_unreliable_for_decision :
    (∃ (o₁ o₂ : Output Nat),
      o₁.result = o₂.result ∧
      o₁.confidence ≠ o₂.confidence) →
    ¬(∀ (o₁ o₂ : Output Nat),
      o₁.result = o₂.result → o₁.confidence = o₂.confidence) := by
  intro ⟨o₁, o₂, heq, hneq⟩ h
  exact hneq (h o₁ o₂ heq)

5.1.9. Sorry Inventory Phase 2 Additions

Location

Reason for sorry

Ontology.lean: generates

opaque — to be concretized as Worker actions in Phase 3+

Ontology.lean: verifies

opaque — to be concretized as Verifier actions in Phase 3+

Ontology.lean: sharesInternalState

opaque — to be concretized as session/parameter sharing in Phase 3+

Ontology.lean: actionSpaceSize

opaque — to be quantified as Observable in Phase 4+

Ontology.lean: riskExposure

opaque — to be quantified as Observable in Phase 4+

6. Principles: P1-P6

6.1. Epistemic Layer - Principle Strength 3 - P1-P6 Theorem Derivation Procedure Phase 2

Describes design principles derived from premise set Γ (T₀ = T1–T8, Γ T₀ = E1–E2) as Lean theorems (Terminology Reference §4.2). Each P takes the form Γ ⊢ φ, a conditional derivation under premise set Γ (§2.5).

6.1.1. Derivation Structure Dependency Graph

Axiom dependencies (T/E basis) and robustness layer for each P:

P

Basis

Robustness

Derivation type

P1

E2

Empirical (depends on Γ T₀)

Direct application of E2

P2

T4 + E1

Empirical (depends on Γ T₀)

Direct application of E1a

P3

T1 + T2

Robust (T₀ only)

Composition of T1 and T2

P4

T5 (+ T7)

Robust (T₀ only)

Direct application of T5

P5

T4

Robust (T₀ only)

High-level restatement of T4

P6

T3 + T7 + T8

Robust (T₀ only)

Unfolding of T3, T7, T8 constraint structure

If Γ T₀ (E1, E2) is refuted (Terminology Reference §9.1 refutability), only P1 and P2 are affected. P3–P6 depend solely on T₀ and are therefore invariant under contraction of Γ T₀ (§9.2). This is a consequence of the monotonicity of extensions (§2.5 / §5.3).

6.1.2. Correspondence with Terminology Reference

  • theorem → theorem (§4.2): a proposition proved from axioms and inference rules

  • sorry → incomplete derivation (§1): lacking a proof (a sequence of inference rule applications from axioms to theorem)

  • E1b redundancy → independence check (§4.3): E1b is derivable from E1a (not independent)

6.1.3. Appendix - Proof of E1b Redundancy

Demonstrates as a theorem that E1b (no_self_verification) is derivable from E1a (verification_requires_independence). This is a concrete example of axiom hygiene check 3 (independence, Procedure §2.6): E1b is a redundant axiom and should be proved as a theorem.

6.1.4. P1 Co-scaling of Autonomy and Vulnerability

Derived from E2. Each time an agent's action space expands, the damage that malicious inputs or judgment errors can cause also grows.

Concepts P1 adds beyond E2:

  • "Unprotected expansion can destroy accumulated trust in a single incident" → Asymmetry of trust accumulation (gradual accumulation vs. abrupt destruction)

(Derivation Card) Derives from: capability_risk_coscaling (E2) Proposition: P1 Content: Expansion of the action space entails expansion of risk exposure — autonomy and vulnerability co-scale, so unprotected expansion can destroy accumulated trust. Proof strategy: Direct application of capability_risk_coscaling (E2)

Declaration: theorem autonomy_vulnerability_coscaling

theorem autonomy_vulnerability_coscaling :
  ∀ (agent : Agent) (w w' : World),
    actionSpaceSize agent w < actionSpaceSize agent w' →
    riskExposure agent w < riskExposure agent w' :=
  capability_risk_coscaling

P1b (theorem): Unprotected expansion destroys trust. When the action space expands and risk materializes, the trust level decreases.

Formalization of "accumulated trust can be destroyed by a single incident." The asymmetry of trust (gradual accumulation vs. abrupt destruction) will be made Observable as asymmetry in trustLevel fluctuation magnitude in Phase 4.

Declaration: theorem unprotected_expansion_destroys_trust

theorem unprotected_expansion_destroys_trust :
  ∀ (agent : Agent) (w w' : World),
    actionSpaceSize agent w < actionSpaceSize agent w' →
    riskMaterialized agent w' →
    trustLevel agent w' < trustLevel agent w :=
  trust_decreases_on_materialized_risk

6.1.5. P2 Cognitive Separation of Concerns

Derived from T4 and E1. Since output is probabilistic (T4) and generation and evaluation by the same process have correlated biases (E1), separation of generation and evaluation is required for the verification framework to function.

"Separation itself is non-negotiable."

Predicate for whether a verification framework is sound. Sound = all generated actions are independently verified.

Declaration: def verificationSound

def verificationSound (w : World) : Prop :=
  ∀ (gen ver : Agent) (action : Action),
    generates gen action w →
    verifies ver action w →
    gen.id ≠ ver.id ∧ ¬sharesInternalState gen ver

(Derivation Card) Derives from: verification_requires_independence (E1a) Proposition: P2 Content: Verification soundness requires role separation — the generator and evaluator of an action must be distinct agents with no shared internal state. Proof strategy: Direct application of verification_requires_independence (E1a) via verificationSound definition

Declaration: theorem cognitive_separation_required

theorem cognitive_separation_required :
  ∀ (w : World), verificationSound w :=
  fun w gen ver action h_gen h_ver =>
    verification_requires_independence gen ver action w h_gen h_ver

P2 lemma: Self-verification destroys verification framework soundness.

Declaration: theorem self_verification_unsound

theorem self_verification_unsound :
  ∀ (agent : Agent) (action : Action) (w : World),
    generates agent action w →
    ¬verifies agent action w :=
  no_self_verification

6.1.6. P3 Governed Learning

Derived from the combination of T1 and T2. Agents are ephemeral (T1) but structures persist (T2). The process of integrating knowledge into structures requires governance.

Two failure modes of ungoverned learning:

  • Chaos: structure degrades through accumulation of erroneous knowledge

  • Stagnation: knowledge fails to consolidate and structure does not improve

P3a (theorem): By T1, the agent that made modifications disappears. The session of an agent that modified structure necessarily terminates (T1). After termination, that agent loses the ability to correct modifications.

This is half of P3's "problem": the supervisor becomes absent.

Declaration: theorem modifier_agent_terminates

theorem modifier_agent_terminates :
  ∀ (w : World) (s : Session) (agent : Agent),
    s ∈ w.sessions →
    agent.currentSession = some s.id →
    -- T1: このセッションは必ず終了する
    ∃ (w' : World), w.time ≤ w'.time ∧
      ∃ (s' : Session), s' ∈ w'.sessions ∧
        s'.id = s.id ∧ s'.status = SessionStatus.terminated :=
  fun w s _ h_mem _ => session_bounded w s h_mem

P3b (theorem): By T2, modifications persist. Changes made to structure (including errors) remain even after the agent's session terminates.

This is half of P3's "stakes": errors persist indefinitely.

Declaration: theorem modification_persists_after_termination

theorem modification_persists_after_termination :
  ∀ (w w' : World) (s : Session) (st : Structure),
    s ∈ w.sessions →
    st ∈ w.structures →
    s.status = SessionStatus.terminated →
    validTransition w w' →
    -- T2: 構造は永続する
    st ∈ w'.structures :=
  structure_persists

(Derivation Card) Derives from: session_bounded (T1), structure_persists (T2) Proposition: P3 Content: Ungoverned breaking changes are irrecoverable — the change persists (T2) while the agent that made it disappears (T1), leaving no correcting agent. Proof strategy: intro + apply structure_accumulates — epoch monotonicity via validTransition chain from ki.after

Declaration: theorem ungoverned_breaking_change_irrecoverable

theorem ungoverned_breaking_change_irrecoverable :
  ∀ (w : World) (s : Session) (st : Structure)
    (ki : KnowledgeIntegration),
    -- 前提: エージェントが構造を変更した
    s ∈ w.sessions →
    st ∈ w.structures →
    ki.before = w →
    ki.compatibility = CompatibilityClass.breakingChange →
    -- T1 の寄与: エージェントのセッションは終了する
    (∃ (w_term : World), w.time ≤ w_term.time ∧
      ∃ (s' : Session), s' ∈ w_term.sessions ∧
        s'.id = s.id ∧ s'.status = SessionStatus.terminated) →
    -- T2 の寄与: 変更後の構造は永続する
    (∀ (w_future : World),
      validTransition ki.after w_future →
      ∀ st', st' ∈ ki.after.structures → st' ∈ w_future.structures) →
    -- 結論: 統治なしでは破壊的変更が永続する(修正する主体がいない)
    -- 形式化: 変更後のエポックは戻らない(不可逆)
    ∀ (w_future : World),
      validTransition ki.after w_future →
      ki.after.epoch ≤ w_future.epoch :=
  fun _ _ _ ki _ _ _ _ _ _ w_future h_trans =>
    structure_accumulates ki.after w_future h_trans

P3 conclusion: Why governance is necessary. Combining P3a (modifier_agent_terminates), P3b (modification_persists_after_termination), and P3c (ungoverned_breaking_change_irrecoverable):

Ungoverned knowledge integration produces a state where "irrecoverable breaking changes persist indefinitely." Governance (upfront compatibility classification + gates) is the only means to prevent this.

Note: The proof of P3c depends on structure_accumulates, but the propositional structure of the theorem requires both the T1 and T2 hypotheses. Without T1, "the agent might be able to correct it"; without T2, "the change might disappear" — so neither hypothesis can be omitted.

Declaration: def governanceNecessityExplanation

def governanceNecessityExplanation := "See P3a + P3b + P3c above"

P3b (theorem): Exhaustiveness of compatibility classification. Every knowledge integration is classified into one of three compatibility classes. (Structurally guaranteed by Lean's inductive type)

Declaration: theorem compatibility_exhaustive

theorem compatibility_exhaustive :
  ∀ (c : CompatibilityClass),
    c = .conservativeExtension ∨
    c = .compatibleChange ∨
    c = .breakingChange := by
  intro c
  cases c <;> simp

6.1.7. P4 Observable Degradation

Derived from T5. Improvement is impossible without feedback (T5), and what cannot be observed cannot be incorporated into feedback loops.

"What cannot be observed cannot be optimized."

Constraints manifest as gradients, not walls (binary).

(Derivation Card) Derives from: no_improvement_without_feedback (T5) Proposition: P4 Content: Improvement requires observability — if structure is improved, then feedback must have existed, so the target must have been observable. Proof strategy: Direct application of no_improvement_without_feedback (T5)

Declaration: theorem improvement_requires_observability

theorem improvement_requires_observability :
  ∀ (w w' : World),
    structureImproved w w' →
    ∃ (f : Feedback), f ∈ w'.feedbacks ∧
      f.timestamp ≥ w.time ∧ f.timestamp ≤ w'.time :=
  no_improvement_without_feedback

(Derivation Card) Derives from: no_process_improvement_without_feedback (T5, #316) Proposition: P4 (process-level) Content: Process improvement requires process-targeted observability. If a process has improved, then feedback targeting that specific process must have existed. This is the Level 1 (process) analog of improvement_requires_observability (Level 0, structure). Proof strategy: Direct application of no_process_improvement_without_feedback

Declaration: theorem process_improvement_requires_observability

theorem process_improvement_requires_observability :
  ∀ (pid : ProcessId) (w w' : World),
    processImproved pid w w' →
    ∃ (f : Feedback), f ∈ w'.feedbacks ∧
      f.target = .process pid ∧
      f.timestamp ≥ w.time ∧ f.timestamp ≤ w'.time :=
  no_process_improvement_without_feedback

P4b (theorem): Degradation is a gradient, not a wall. The degradation level can take any natural number value (not binary). To be concretized as Observable in Phase 4.

Declaration: theorem degradation_is_gradient

theorem degradation_is_gradient :
  ∀ (n : Nat), ∃ (w : World), degradationLevel w = n :=
  degradation_level_surjective

6.1.8. P5 Probabilistic Interpretation of Structure

Derived from T4. Structure is something agents interpret anew each time, not something they deterministically "follow." Even reading the same structure, different instances may take different actions.

Robust design does not assume perfect compliance with structure, but rather maintains resilience against interpretation variance.

(Derivation Card) Derives from: interpretation_nondeterminism (T4) Proposition: P5 Content: Structure interpretation is nondeterministic — the same structure can yield different actions from the same agent, so robust design must not assume perfect compliance. Proof strategy: Direct application of interpretation_nondeterminism (T4)

Declaration: theorem structure_interpretation_nondeterministic

theorem structure_interpretation_nondeterministic :
  ∃ (agent : Agent) (st : Structure) (action₁ action₂ : Action) (w : World),
    interpretsStructure agent st action₁ w ∧
    interpretsStructure agent st action₂ w ∧
    action₁ ≠ action₂ :=
  interpretation_nondeterminism

P5 lemma: Robust design is resilient to interpretation variance. A structure st is "robust" iff for any interpretation difference, the target world satisfies the safety constraint.

Declaration: def robustStructure

def robustStructure (st : Structure) (safety : World → Prop) : Prop :=
  ∀ (agent : Agent) (action : Action) (w w' : World),
    interpretsStructure agent st action w →
    canTransition agent action w w' →
    safety w'

6.1.9. P6 Task Design as Constraint Satisfaction

Derived from the combination of T3, T7, and T8. Within finite cognitive space (T3) and finite time/energy (T7), the required precision level (T8) must be achieved.

Task design is the process of solving this constraint satisfaction problem.

Task execution strategy. A "solution" to the constraint satisfaction problem.

Declaration: structure TaskStrategy

structure TaskStrategy where
  task           : Task
  contextUsage   : Nat   -- T3: コンテキスト使用量
  resourceUsage  : Nat   -- T7: リソース使用量
  achievedPrecision : Nat -- T8: 達成精度(千分率)
  deriving Repr

Predicate for whether a strategy satisfies the constraints. All three dimensions must be satisfied simultaneously.

Declaration: def strategyFeasible

def strategyFeasible (s : TaskStrategy) (agent : Agent) : Prop :=
  -- T3: コンテキスト容量内
  s.contextUsage ≤ agent.contextWindow.capacity ∧
  -- T7: リソース予算内
  s.resourceUsage ≤ s.task.resourceBudget ∧
  -- T8: 要求精度を達成
  s.achievedPrecision ≥ s.task.precisionRequired.required

(Derivation Card) Derives from: context_window_finite (T3), resource_budget_finite (T7), precision_requirement (T8) Proposition: P6 Content: Task execution is a constraint satisfaction problem — a strategy must simultaneously satisfy T3 (finite context), T7 (finite resources), and T8 (precision requirement). Proof strategy: intro + constructor chain with Nat.le_trans on strategyFeasible components

Declaration: theorem task_is_constraint_satisfaction

theorem task_is_constraint_satisfaction :
  ∀ (task : Task) (agent : Agent),
    -- T3: コンテキストは有限
    agent.contextWindow.capacity > 0 →
    -- T7: リソースは有限(タスクの予算は globalResourceBound 以下)
    task.resourceBudget ≤ globalResourceBound →
    -- T8: 精度要求は正
    task.precisionRequired.required > 0 →
    -- 結論: これは制約充足問題である
    -- (解の存在は保証しないが、制約の構造を明示する)
    ∀ (s : TaskStrategy),
      s.task = task →
      strategyFeasible s agent →
      s.contextUsage ≤ agent.contextWindow.capacity ∧
      s.resourceUsage ≤ globalResourceBound ∧
      s.achievedPrecision > 0 := by
  intro task agent h_ctx h_res h_prec s h_task h_feas
  constructor
  · exact Nat.le_trans h_feas.1 (Nat.le_refl _)
  constructor
  · exact Nat.le_trans h_feas.2.1 (h_task ▸ h_res)
  · exact Nat.lt_of_lt_of_le h_prec (h_task ▸ h_feas.2.2)

P6b (theorem): Task design itself is also probabilistic output. P6 itself is subject to T4, and requires verification via P2 (role separation).

Declaration: theorem task_design_is_probabilistic

theorem task_design_is_probabilistic :
  ∃ (agent : Agent) (action : Action) (w w₁ w₂ : World),
    canTransition agent action w w₁ ∧
    canTransition agent action w w₂ ∧
    w₁ ≠ w₂ :=
  output_nondeterministic

6.1.10. Appendix - Proof that E1b Is Derivable from E1a

no_self_verification is a corollary of verification_requires_independence. If we assume the same agent satisfies both generates and verifies, this contradicts E1a's conclusion gen.id ≠ ver.id (since gen = ver, we have gen.id = ver.id).

E1b is a corollary of E1a. Requires DecidableEq for AgentId (sorry due to opaque).

Declaration: theorem e1b_from_e1a

theorem e1b_from_e1a :
  ∀ (agent : Agent) (action : Action) (w : World),
    generates agent action w →
    ¬verifies agent action w := by
  intro agent action w h_gen h_ver
  have h := verification_requires_independence agent agent action w h_gen h_ver
  exact absurd rfl h.1

6.1.11. Sorry Inventory Phase 4 Update

All sorry's resolved in Phase 4. Principles.lean is sorry-free.

6.1.12. Sorry's resolved from Phase 3 → Phase 4

theorem

Resolution method

Axiom used (Observable.lean)

unprotected_expansion_destroys_trust

axiom application

trust_decreases_on_materialized_risk

degradation_is_gradient

axiom application

degradation_level_surjective

structure_interpretation_nondeterministic

axiom application

interpretation_nondeterminism

6.1.13. Complete theorem proof method listing

theorem

Proof method

autonomy_vulnerability_coscaling

Direct application of E2

unprotected_expansion_destroys_trust

Direct application of Observable axiom

cognitive_separation_required

Direct application of E1a

self_verification_unsound

Direct application of E1b

modifier_agent_terminates

Direct application of T1

modification_persists_after_termination

Direct application of T2

ungoverned_breaking_change_irrecoverable

Composition of T1 ∧ T2

compatibility_exhaustive

Exhaustiveness proof via cases tactic

improvement_requires_observability

Direct application of T5

degradation_is_gradient

Direct application of Observable axiom

structure_interpretation_nondeterministic

Direct application of Observable axiom

task_is_constraint_satisfaction

Unfolding of T3/T7/T8 constraint structure

task_design_is_probabilistic

Direct application of T4

e1b_from_e1a

Contradiction derivation via E1a + absurd rfl

7. Observable Variables: V1-V7

7.1. Epistemic Layer - Boundary Strength 2 - Foundation of V1-V7 Observable Variables

Variables are not boundary conditions. They are parameters that agents can improve through structure, serving as indicators of structural quality. If boundary conditions (L1–L6 in Ontology.lean) are "walls of the action space," then variables are "levers that structure can move within those walls."

However, variables are not independent levers but a mutually interacting system.

7.1.1. Layer Separation

This file contains only definitions belonging to the boundary layer (strength 2):

  • Opaque definitions for V1–V7 and Measurable axioms (measurability guarantees)

  • Measurability axioms for trust/degradation

  • systemHealthy (basic definition of system health)

  • Mapping structure from boundaries to variables to constraints

  • Measurable to Observable bridge theorems

Definitions belonging to the designTheorem layer (strength 1) — tradeoffs, Goodhart, investment cycles, HealthThresholds, Pareto, etc. — are separated into ObservableDesign.lean.

7.1.2. Position as Gamma Extension of T_0

Procedure Manual 2.4.

The axioms in this file belong to the extension part (Gamma T_0) of the premise set Gamma, and are non-logical axioms (§4.1) derived from design decisions (domain model premises, design judgments). They constitute a consistent extension (Glossary §5.5) of T_0 (Axioms.lean) and may be subject to contraction (§9.2) in the revision loop.

7.1.3. Design Policy

Boundary conditions (T) are immovable walls, mitigations (L) are design decisions, and variables (V) are measures of how well mitigations work.

7.1.4. Observable vs Measurable

  • Observable (World → Prop is decidable) — binary judgment. Similar to preconditions/postconditions in Glossary §9.3

  • Measurable (World → Nat is computable) — quantitative measurement. Glossary §9.5 note: distinct from the measure-theoretic concept of measurable functions

V1–V7 are quantitative indicators and are therefore formalized as Measurable. Measurable m means "a procedure exists to compute the value of m from external observations."

7.1.5. Prerequisite - Observability P4

By P4 (observability of degradation), variables become optimization targets only when they are observable.

For each variable, the following questions are posed:

  • Is the current value observable? Does a measurement method exist, and is measurement actually being performed?

  • Is degradation detectable? If the value worsens, can it be detected before quality collapse?

  • Is improvement verifiable? Can the change in value be compared before and after intervention?

A variable without means of observation is merely a nominal optimization target.

7.1.6. V1-V7 Correspondence Table

Definition

V

Description

Measurement Method

Related Boundary Conditions

skillQuality

V1

Precision and effectiveness of skill definitions

benchmark.json

L2, L5

contextEfficiency

V2

Utilization of finite context

completion rate / token count

L2, L3

outputQuality

V3

Quality of code, design, and documentation

gate pass rate, review finding count

L1, L4

gatePassRate

V4

First-pass gate clearance rate

pass/fail statistics

L6, L4

proposalAccuracy

V5

Hit rate of design proposals

approval/rejection rate

L4, L6

knowledgeStructureQuality

V6

Degree of structuring of persistent knowledge

context restoration speed, retirement detection rate

L2

taskDesignEfficiency

V7

Efficiency of task design

completion rate / resource ratio

L3, L6

Observable: a decision procedure exists for a given property. Expresses that P : World → Prop is binary-decidable.

Declaration: def Observable

def Observable (P : World → Prop) : Prop :=
  ∃ f : World → Bool, ∀ w, f w = true ↔ P w

Measurable: a computation procedure exists for a quantitative indicator. Expresses that the value of m : World → Nat can be computed from external observations.

Formally, "there exists a computable function f that agrees with m." By declaring this as an axiom for an opaque m, we promise the system that "a measurement procedure exists in principle."

Why this is non-trivial

When m is opaque, f = m does not pass type-checking (due to the non-unfoldability of opaque definitions). Therefore, the axiom declaration of Measurable constitutes a non-trivial promise.

Declaration: def Measurable

def Measurable (m : World → Nat) : Prop :=
  ∃ f : World → Nat, ∀ w, f w = m w

Proxy maturity levels. Assigns a classification to each V proxy in observe.sh.

  • provisional: Tentative proxy indicator. Formal measurement method not yet implemented.

  • established: Stable proxy indicator. Operational sufficiency confirmed (T6 judgment).

  • formal: Formal measurement method implemented.

Declaration: inductive ProxyMaturityLevel

inductive ProxyMaturityLevel where
  | provisional : ProxyMaturityLevel
  | established : ProxyMaturityLevel
  | formal : ProxyMaturityLevel
  deriving BEq, Repr, DecidableEq

Current proxy maturity for V1. provisional → formal (2026-03-27, #77):

  • GQM chain defined (R1 #85): Q1 structural contribution, Q2 verification quality, Q3 operational stability

  • Formal schema implemented in benchmark.json (G1 #78)

  • Automated measurement in observe.sh (G2 #79)

  • Retrospective validation over 63 runs confirmed all metrics satisfy hypotheses

  • Goodhart 5-layer defense: governance metrics (R2), correlation monitoring (R3), non-triviality gate (R5), saturation detection (R6), bias review obligation (G1b-2)

  • Legacy proxy (success_rate) confirmed to be uncorrelated with new benchmark (r=0.006-0.069) (G3 #80)

Declaration: def v1ProxyMaturity

def v1ProxyMaturity : ProxyMaturityLevel := .formal

Current proxy maturity for V3. provisional → formal (2026-03-27, #77):

  • GQM chain defined (R1 #85): Q1 acceptance criteria, Q2 structural integrity, Q3 error trend

  • Formal schema implemented in benchmark.json (G1 #78)

  • Automated measurement in observe.sh (G2 #79)

  • Legacy proxy (test_pass_rate) confirmed invalid as quality signal due to zero variance (G3 #80)

  • hallucination proxy (Run 54+) functions as a new indicator for error trend

Declaration: def v3ProxyMaturity

def v3ProxyMaturity : ProxyMaturityLevel := .formal

V1: Skill quality. Precision and effectiveness of skill definitions. Measurement method: benchmark.json (with/without comparison). Related boundary conditions: L2 (mitigating training data discontinuity), L5 (skill system). observe.sh proxy: evolve_success_rate (successful run ratio), lean_health (sorry=0 check), skill_count (number of skill files). Proxy maturity classification:

  • provisional_proxy: Tentative proxy indicator. Formal measurement method not yet implemented.

  • established_proxy: Stable proxy indicator. Judged operationally sufficient.

  • formal_measurement: Formal measurement method implemented. V1 proxy promoted to formal_measurement (2026-03-27, #77). Measured via benchmark.json GQM schema.

Declaration: opaque skillQuality

opaque skillQuality : World → Nat

V2: Context efficiency. Utilization of finite context. Measurement method: task completion rate / consumed token count. Related boundary conditions: L2 (finite context), L3 (token budget). observe.sh proxy: recent_avg (median of last 10 session deltas, primary), cumulative_avg (all-history average excluding micro-sessions, baseline). primary_metric: recent_median (median-based, robust to outliers). Operational note: a divergence of ±20% or more between recent_avg and cumulative_avg is interpreted as a trend change. Divergence interpretation: V2 is a hub variable with tradeoff relationships to 5 other variables (theorem tradeoff_context_is_hub). Since evolve sessions (heavy tool usage) push recent_avg upward, divergence_percent > 100% is not necessarily problematic. The tendency for recent_avg to rise as evolve depth and frequency increase is expected.

Declaration: opaque contextEfficiency

opaque contextEfficiency : World → Nat

V3: Output quality. Quality of code, design, and documentation. Measurement method: gate pass rate, review finding count. Related boundary conditions: L1 (safety standards), L4 (basis for action space adjustment). observe.sh proxy: test_pass_rate (all-tests pass rate) + hallucination_proxy (rejected[].failure_type aggregation). Legacy fix_ratio_percent proxy (commit prefix ratio) was removed in Run 69. Proxy maturity classification:

  • provisional_proxy: Tentative proxy indicator. Formal measurement method not yet implemented.

  • established_proxy: Stable proxy indicator. Judged operationally sufficient.

  • formal_measurement: Formal measurement method implemented. V3 proxy promoted to formal_measurement (2026-03-27, #77). Measured via benchmark.json GQM schema.

Declaration: opaque outputQuality

opaque outputQuality : World → Nat

V4: Gate pass rate. Rate of clearing each phase's gate on the first attempt. P2 (cognitive separation of concerns) guarantees gate reliability. Measurement method: pass/fail statistics. Related boundary conditions: L6 (gate definition granularity), L4 (auto-merge judgment). observe.sh proxy: Bash passed / (passed + blocked). "tool":"Bash" event count / (Bash + gate_blocked event count) from tool-usage.jsonl.

Declaration: opaque gatePassRate

opaque gatePassRate : World → Nat

V5: Proposal accuracy. Hit rate of design and scope proposals. Measurement method: human approval/rejection rate. Related boundary conditions: L4 (basis for action space adjustment), L6 (design convention improvement). observe.sh proxy: approved / total entry ratio from v5-approvals.jsonl.

Declaration: opaque proposalAccuracy

opaque proposalAccuracy : World → Nat

V6: Knowledge structure quality. Degree of structuring of persistent knowledge. P3 (governed learning) defines the knowledge lifecycle (observation -> hypothesis formation -> verification -> integration -> retirement). Knowledge that is not retired accumulates and degrades V2. Measurement method: context restoration speed in the next session, retirement target detection rate. Related boundary conditions: L2 (mitigating memory loss). observe.sh proxy: memory_entries (MEMORY.md entry count), memory_files (memory file count), last_update_days_ago (days since last update), retired_count (retired entry count).

Declaration: opaque knowledgeStructureQuality

opaque knowledgeStructureQuality : World → Nat

V7: Task design efficiency. Quality of P6 (task design as constraint satisfaction). Two data sources: (1) External knowledge: public benchmarks, model performance characteristics (2) Internal knowledge: execution logs, resource consumption records, outcome-to-cost ratio Measurement method: task completion rate / consumed resource ratio, redesign frequency. Related boundary conditions: L3 (resource limits), L6 (design conventions). observe.sh proxy: completed (task completion count from v7-tasks.jsonl), unique_subjects (unique subject count), teamwork_percent (ratio of entries with teammate field). Operational note: teamwork_percent is suppressed in single-agent operation (teamwork_status="suppressed_single_agent"). Since this field requires multi-agent/human collaboration, it is excluded from observation reports in single-agent environments.

Declaration: opaque taskDesignEfficiency

opaque taskDesignEfficiency : World → Nat

7.1.7. Measurability Declarations for V1-V7 - Gamma Extension of T_0 Design-Derived

Each variable is declared as Measurable via non-logical axioms (Glossary §4.1). This is a design-level promise that "measurement is possible in principle," with concrete measurement implementations delegated to the operational layer.

Membership in Gamma T_0 (Procedure Manual §2.4): the justification for these axioms originates from the designer's design judgments (not from external authority), and therefore they belong to the extension part.

Why axioms: since V1–V7 are opaque (opaque definitions, Glossary §9.4), Measurable cannot be proved as a theorem (§4.2) due to the non-unfoldability of opaque definitions. Measurability is guaranteed by external operational systems and is assumed as a non-logical axiom within the formal system.

(Derivation Card) Layer: Gamma T_0 (design-derived) Proposition: V1 Content: V1 (skill quality) is measurable Basis: with/without comparison via benchmark.json exists as a measurement procedure Source: Ontology.lean V1 definition Demoted: 2026-04-01 — Measurable is trivially satisfied for any World → Nat function. Proof: ⟨m, fun _ => rfl⟩ (witness is the function itself). Refutation condition: if it is shown that a measurement procedure for skill quality is in principle unconstructible

Declaration: theorem v1_measurable

theorem v1_measurable : Measurable skillQuality := ⟨skillQuality, fun _ => rfl⟩

(Derivation Card) Layer: Gamma T_0 (design-derived) Proposition: V2 Content: V2 (context efficiency) is measurable Basis: the ratio of task completion rate to consumed token count exists as a measurement procedure Source: Ontology.lean V2 definition Demoted: 2026-04-01 — Measurable is trivially satisfied for any World → Nat function. Proof: ⟨m, fun _ => rfl⟩ (witness is the function itself). Refutation condition: if it is shown that a measurement procedure for context efficiency is in principle unconstructible

Declaration: theorem v2_measurable

theorem v2_measurable : Measurable contextEfficiency := ⟨contextEfficiency, fun _ => rfl⟩

(Derivation Card) Layer: Gamma T_0 (design-derived) Proposition: V3 Content: V3 (output quality) is measurable Basis: gate pass rate and review finding count exist as measurement procedures Source: Ontology.lean V3 definition Demoted: 2026-04-01 — Measurable is trivially satisfied for any World → Nat function. Proof: ⟨m, fun _ => rfl⟩ (witness is the function itself). Refutation condition: if it is shown that a measurement procedure for output quality is in principle unconstructible

Declaration: theorem v3_measurable

theorem v3_measurable : Measurable outputQuality := ⟨outputQuality, fun _ => rfl⟩

(Derivation Card) Layer: Gamma T_0 (design-derived) Proposition: V4 Content: V4 (gate pass rate) is measurable Basis: pass/fail statistics exist as a measurement procedure Source: Ontology.lean V4 definition Demoted: 2026-04-01 — Measurable is trivially satisfied for any World → Nat function. Proof: ⟨m, fun _ => rfl⟩ (witness is the function itself). Refutation condition: if it is shown that a measurement procedure for gate pass rate is in principle unconstructible

Declaration: theorem v4_measurable

theorem v4_measurable : Measurable gatePassRate := ⟨gatePassRate, fun _ => rfl⟩

(Derivation Card) Layer: Gamma T_0 (design-derived) Proposition: V5 Content: V5 (proposal accuracy) is measurable Basis: human approval/rejection rate exists as a measurement procedure Source: Ontology.lean V5 definition Demoted: 2026-04-01 — Measurable is trivially satisfied for any World → Nat function. Proof: ⟨m, fun _ => rfl⟩ (witness is the function itself). Refutation condition: if it is shown that a measurement procedure for proposal accuracy is in principle unconstructible

Declaration: theorem v5_measurable

theorem v5_measurable : Measurable proposalAccuracy := ⟨proposalAccuracy, fun _ => rfl⟩

(Derivation Card) Layer: Gamma T_0 (design-derived) Proposition: V6 Content: V6 (knowledge structure quality) is measurable Basis: context restoration speed and retirement target detection rate exist as measurement procedures Source: Ontology.lean V6 definition Demoted: 2026-04-01 — Measurable is trivially satisfied for any World → Nat function. Proof: ⟨m, fun _ => rfl⟩ (witness is the function itself). Refutation condition: if it is shown that a measurement procedure for knowledge structure quality is in principle unconstructible

Declaration: theorem v6_measurable

theorem v6_measurable : Measurable knowledgeStructureQuality := ⟨knowledgeStructureQuality, fun _ => rfl⟩

(Derivation Card) Layer: Gamma T_0 (design-derived) Proposition: V7 Content: V7 (task design efficiency) is measurable Basis: task completion rate / consumed resource ratio exists as a measurement procedure Source: Ontology.lean V7 definition Demoted: 2026-04-01 — Measurable is trivially satisfied for any World → Nat function. Proof: ⟨m, fun _ => rfl⟩ (witness is the function itself). Refutation condition: if it is shown that a measurement procedure for task design efficiency is in principle unconstructible

Declaration: theorem v7_measurable

theorem v7_measurable : Measurable taskDesignEfficiency := ⟨taskDesignEfficiency, fun _ => rfl⟩

7.1.8. System Health

Rather than maximizing individual variables, maintain the health of the system as a whole. Even when metrics for one variable improve, verify that other variables have not deteriorated.

Health is formulated as "all variables are at or above a threshold." Threshold settings are operational judgments (T6: humans are the final decision-makers for resources).

System health. The state in which all V1–V7 meet the minimum threshold.

Note: thresholds should ideally differ per variable rather than being uniform, but Phase 4 uses a uniform threshold for simplicity. Phase 5 extends this to per-variable thresholds (HealthThresholds / systemHealthyPerVar in ObservableDesign.lean).

Declaration: def systemHealthy

def systemHealthy (threshold : Nat) (w : World) : Prop :=
  skillQuality w ≥ threshold ∧
  contextEfficiency w ≥ threshold ∧
  outputQuality w ≥ threshold ∧
  gatePassRate w ≥ threshold ∧
  proposalAccuracy w ≥ threshold ∧
  knowledgeStructureQuality w ≥ threshold ∧
  taskDesignEfficiency w ≥ threshold

(Derivation Card) Layer: Gamma T_0 (design-derived) Content: trustLevel is measurable. Indirectly observed from investment behavior (fluctuations in resource allocation) Basis: trust is concretized as investment behavior (resource allocation fluctuations) Source: manifesto.md Section 6 Demoted: 2026-04-01 — Measurable is trivially satisfied for any World → Nat function. Proof: ⟨m, fun _ => rfl⟩ (witness is the function itself). Refutation condition: if it is shown that a measurement procedure for trust level is in principle unconstructible

Declaration: theorem trust_measurable

theorem trust_measurable :
  ∀ (agent : Agent), Measurable (trustLevel agent) :=
  fun agent => ⟨trustLevel agent, fun _ => rfl⟩

(Derivation Card) Layer: Gamma T_0 (design-derived) Content: degradationLevel is measurable. Computed from temporal changes in V1–V7 Basis: if V1–V7 are Measurable, their rate of change is also computable Source: design of P4 (observability of degradation) Demoted: 2026-04-01 — Measurable is trivially satisfied for any World → Nat function. Proof: ⟨m, fun _ => rfl⟩ (witness is the function itself). Refutation condition: if it is shown that a measurement procedure for degradation level is in principle unconstructible

Declaration: theorem degradation_measurable

theorem degradation_measurable : Measurable degradationLevel := ⟨degradationLevel, fun _ => rfl⟩

7.1.9. Three-Tier Structure Connection Boundary to Mitigation to Variable

Many variables represent the quality of structures designed as mitigations for boundary conditions:

Boundary (invariant)          ->  Mitigation (structure)         ->  Variable (quality)
L2: Memory loss               ->  Implementation Notes           ->  V6: Knowledge structure quality
L2: Finite context             ->  50% rule, lightweight design   ->  V2: Context efficiency
L2: Non-determinism            ->  Gate verification              ->  V4: Gate pass rate
L2: Training data discontinuity ->  docs/ SSOT, skills           ->  V1: Skill quality

Boundary conditions do not move. Mitigations are design decisions (L6). Variables measure how well mitigations work. This three-tier structure clarifies "what is fixed, what is a design choice, and what is an optimization target."

Correspondence between boundary conditions and variables. Expresses the "boundary -> variable" mapping of the three-tier structure as a type. Mitigations are design decisions (L6) positioned between them.

Declaration: inductive VariableId

inductive VariableId where
  | v1 | v2 | v3 | v4 | v5 | v6 | v7
  deriving BEq, Repr

Boundary condition corresponding to each variable. Expresses the "boundary -> variable" mapping of the three-tier structure as a function. Mitigations are design decisions (L6) positioned between them.

Declaration: def variableBoundary

def variableBoundary : VariableId → BoundaryId
  | .v1 => .ontological   -- L2: 学習データ断絶 → V1: スキル品質
  | .v2 => .ontological   -- L2: コンテキスト有限性 → V2: コンテキスト効率
  | .v3 => .ethicsSafety   -- L1: 安全基準 → V3: 出力品質
  | .v4 => .ontological   -- L2: 非決定性 → V4: ゲート通過率
  | .v5 => .actionSpace    -- L4: 行動空間調整の根拠 → V5: 提案精度
  | .v6 => .ontological   -- L2: 記憶喪失 → V6: 知識構造の質
  | .v7 => .resource       -- L3: リソース上限 → V7: タスク設計効率

(Derivation Card) Derives from: variableBoundary, boundaryLayer (Observable.lean / Ontology.lean) Proposition: L2 Content: L2 (ontological boundary) is fixed — variables V1, V2, V4, V6 mapped to L2 cannot move the boundary itself; only the quality of mitigations can be improved. Proof strategy: simp (variableBoundary, boundaryLayer) (definitional computation)

Declaration: theorem fixed_boundary_variables_mitigate_only

theorem fixed_boundary_variables_mitigate_only :
  boundaryLayer (variableBoundary .v1) = .fixed ∧
  boundaryLayer (variableBoundary .v2) = .fixed ∧
  boundaryLayer (variableBoundary .v4) = .fixed ∧
  boundaryLayer (variableBoundary .v6) = .fixed := by
  simp [variableBoundary, boundaryLayer]

Boundary conditions corresponding to each constraint (T1-T8). Expresses the "constraint -> boundary condition" mapping of the three-tier structure as a function. T->L mapping: which boundary condition category each constraint belongs to.

Mapping justification:

  • T1 -> L2: Session ephemerality is an ontological fact (agent is bound to session)

  • T2 -> L2: Structural persistence is an ontological fact (structure outlives agent)

  • T3 -> L2, L3: Finite context is both an ontological and a resource constraint

  • T4 -> L2: Output stochasticity is an ontological property of LLMs

  • T5 -> L2: Feedback requirement is an ontological prerequisite for improvement

  • T6 -> L1, L4: Human authority spans the safety boundary (L1) and action space boundary (L4)

  • T7 -> L3: Resource finiteness directly corresponds to the resource boundary

  • T8 -> L6: Precision level is defined as a task design convention (architecturalConvention)

Note: L5 (platform) is intentionally excluded. L5 represents provider-specific environmental constraints (Claude Code, Codex CLI, etc.), while T1-T8 are technology-independent constraints. L5 is not derived from T but arises from the human judgment of platform selection (upstream of T6). In variableBoundary as well, V1-V7 are not mapped to L5.

Declaration: def constraintBoundary

def constraintBoundary : ConstraintId → List BoundaryId
  | .t1 => [.ontological]
  | .t2 => [.ontological]
  | .t3 => [.ontological, .resource]
  | .t4 => [.ontological]
  | .t5 => [.ontological]
  | .t6 => [.ethicsSafety, .actionSpace]
  | .t7 => [.resource]
  | .t8 => [.architecturalConvention]

Every constraint corresponds to at least one boundary condition. Surjectivity onto coverage of the T->L mapping.

Declaration: theorem constraint_has_boundary

theorem constraint_has_boundary :
  ∀ c : ConstraintId, (constraintBoundary c).length > 0 := by
  intro c
  cases c <;> simp [constraintBoundary]

(Derivation Card) Derives from: constraintBoundary (Observable.lean) Proposition: L5 Content: L5 (platform boundary) is not derived from any technology-independent constraint T1-T8. L5 is a provider-specific environmental constraint arising from human platform selection (upstream of T6). Proof strategy: cases c <;> simp (constraintBoundary) (exhaustive case analysis)

Declaration: theorem platform_not_in_constraint_boundary

theorem platform_not_in_constraint_boundary :
  ∀ c : ConstraintId, BoundaryId.platform ∉ constraintBoundary c := by
  intro c
  cases c <;> simp [constraintBoundary]

Every boundary condition except L5 is included in the constraintBoundary of at least one constraint. constraintBoundary covers L1-L6 except L5.

Declaration: theorem constraint_boundary_covers_except_platform

theorem constraint_boundary_covers_except_platform :
  (∃ c, BoundaryId.ethicsSafety ∈ constraintBoundary c) ∧
  (∃ c, BoundaryId.ontological ∈ constraintBoundary c) ∧
  (∃ c, BoundaryId.resource ∈ constraintBoundary c) ∧
  (∃ c, BoundaryId.actionSpace ∈ constraintBoundary c) ∧
  (∃ c, BoundaryId.architecturalConvention ∈ constraintBoundary c) := by
  refine ⟨⟨.t6, ?_⟩, ⟨.t1, ?_⟩, ⟨.t3, ?_⟩, ⟨.t6, ?_⟩, ⟨.t8, ?_⟩⟩ <;>
    simp [constraintBoundary]

(Derivation Card) Derives from: constraintBoundary (Observable.lean) Proposition: L1 Content: L1 (ethics/safety boundary) is derived from constraint T6 (human authority). T6 maps to ethicsSafety in constraintBoundary, establishing that the safety boundary has a technology-independent grounding. Proof strategy: exact with witness .t6, then simp (constraintBoundary)

Declaration: theorem ethicsSafety_covered_by_constraint

theorem ethicsSafety_covered_by_constraint :
  ∃ c, BoundaryId.ethicsSafety ∈ constraintBoundary c :=
  ⟨.t6, by simp [constraintBoundary]⟩

(Derivation Card) Derives from: constraintBoundary (Observable.lean) Proposition: L3 Content: L3 (resource boundary) is derived from constraints T3 (finite context) and T7 (finite resources). T3 maps to (ontological, resource) and T7 maps to (resource) in constraintBoundary, establishing that the resource boundary has technology-independent grounding. Proof strategy: exact with witness .t3, then simp (constraintBoundary)

Declaration: theorem resource_covered_by_constraint

theorem resource_covered_by_constraint :
  ∃ c, BoundaryId.resource ∈ constraintBoundary c :=
  ⟨.t3, by simp [constraintBoundary]⟩

(Derivation Card) Derives from: constraintBoundary (Observable.lean) Proposition: L4 Content: L4 (action space boundary) is derived from constraint T6 (human authority). T6 maps to (ethicsSafety, actionSpace) in constraintBoundary, establishing that the action space boundary is grounded in human decision authority. Proof strategy: exact with witness .t6, then simp (constraintBoundary)

Declaration: theorem actionSpace_covered_by_constraint

theorem actionSpace_covered_by_constraint :
  ∃ c, BoundaryId.actionSpace ∈ constraintBoundary c :=
  ⟨.t6, by simp [constraintBoundary]⟩

(Derivation Card) Derives from: constraintBoundary (Observable.lean) Proposition: L6 Content: L6 (architectural convention boundary) is derived from constraint T8 (task has precision). T8 maps to (architecturalConvention) in constraintBoundary, establishing that design conventions are grounded in the precision requirement. Proof strategy: exact with witness .t8, then simp (constraintBoundary)

Declaration: theorem architecturalConvention_covered_by_constraint

theorem architecturalConvention_covered_by_constraint :
  ∃ c, BoundaryId.architecturalConvention ∈ constraintBoundary c :=
  ⟨.t8, by simp [constraintBoundary]⟩

7.1.10. Measurable to Observable Bridge

A general theorem stating that threshold comparison of Measurable indicators is Observable. An aggregation lemma that collects the Measurable axioms of V1–V7.

Threshold comparison of a Measurable indicator is Observable (Measurable->Observable bridge). Constructs a decision procedure for m w >= t from Measurable m.

Declaration: theorem measurable_threshold_observable

theorem measurable_threshold_observable {m : World → Nat} (hm : Measurable m) (t : Nat) :
    Observable (fun w => m w ≥ t) := by
  obtain ⟨f, hf⟩ := hm
  exact ⟨fun w => decide (f w ≥ t), fun w => by simp [hf w]⟩

All 7 variables are Measurable (aggregation lemma).

Declaration: theorem all_variables_measurable

theorem all_variables_measurable :
    Measurable skillQuality ∧ Measurable contextEfficiency ∧
    Measurable outputQuality ∧ Measurable gatePassRate ∧
    Measurable proposalAccuracy ∧ Measurable knowledgeStructureQuality ∧
    Measurable taskDesignEfficiency :=
  ⟨v1_measurable, v2_measurable, v3_measurable, v4_measurable,
   v5_measurable, v6_measurable, v7_measurable⟩

Conjunction closure of Observable. The conjunction of two Observable properties is also Observable.

Declaration: theorem observable_and

theorem observable_and {P Q : World → Prop} (hp : Observable P) (hq : Observable Q) :
    Observable (fun w => P w ∧ Q w) := by
  obtain ⟨fp, hfp⟩ := hp
  obtain ⟨fq, hfq⟩ := hq
  refine ⟨fun w => fp w && fq w, fun w => ?_⟩
  simp [Bool.and_eq_true]
  exact ⟨fun ⟨a, b⟩ => ⟨(hfp w).mp a, (hfq w).mp b⟩,
         fun ⟨a, b⟩ => ⟨(hfp w).mpr a, (hfq w).mpr b⟩⟩

Negation closure of Observable. The negation of an Observable property is also Observable.

Declaration: theorem observable_not

theorem observable_not {P : World → Prop} (hp : Observable P) :
    Observable (fun w => ¬ P w) := by
  obtain ⟨fp, hfp⟩ := hp
  refine ⟨fun w => !fp w, fun w => ?_⟩
  constructor
  · intro h hnp
    have := (hfp w).mpr hnp
    simp [this] at h
  · intro hnp
    dsimp only []
    cases hb : fp w with
    | false => rfl
    | true => exact absurd ((hfp w).mp hb) hnp

Disjunction closure of Observable. The disjunction of two Observable properties is also Observable.

Declaration: theorem observable_or

theorem observable_or {P Q : World → Prop} (hp : Observable P) (hq : Observable Q) :
    Observable (fun w => P w ∨ Q w) := by
  obtain ⟨fp, hfp⟩ := hp
  obtain ⟨fq, hfq⟩ := hq
  refine ⟨fun w => fp w || fq w, fun w => ?_⟩
  simp [Bool.or_eq_true]
  exact ⟨fun h => h.elim (fun hp_w => Or.inl ((hfp w).mp hp_w))
                          (fun hq_w => Or.inr ((hfq w).mp hq_w)),
         fun h => h.elim (fun hfp_w => Or.inl ((hfp w).mpr hfp_w))
                          (fun hfq_w => Or.inr ((hfq w).mpr hfq_w))⟩

System health is Observable (binary-decidable). Since each Vi is Measurable, threshold comparison is decidable. Proved via measurable_threshold_observable + observable_and. (Originally an axiom, demoted to theorem in Run 27)

Declaration: theorem system_health_observable

theorem system_health_observable :
    ∀ (threshold : Nat), Observable (systemHealthy threshold) := by
  intro t
  unfold systemHealthy
  apply observable_and (measurable_threshold_observable v1_measurable t)
  apply observable_and (measurable_threshold_observable v2_measurable t)
  apply observable_and (measurable_threshold_observable v3_measurable t)
  apply observable_and (measurable_threshold_observable v4_measurable t)
  apply observable_and (measurable_threshold_observable v5_measurable t)
  apply observable_and (measurable_threshold_observable v6_measurable t)
  exact measurable_threshold_observable v7_measurable t

Degradation detection is Observable: "at least one variable is below threshold" is a decidable property. This is the Observable formalization of D3 condition 2 (degradation detectable). Since systemHealthy is Observable (system_health_observable), its negation is also Observable (observable_not).

Declaration: theorem degradation_detectable_observable

theorem degradation_detectable_observable :
    ∀ (threshold : Nat), Observable (fun w => ¬systemHealthy threshold w) := by
  intro t
  exact observable_not (system_health_observable t)

Below-threshold comparison is Observable. Dual of measurable_threshold_observable. "Variable m is below threshold t" is decidable if m is Measurable.

Declaration: theorem measurable_below_threshold_observable

theorem measurable_below_threshold_observable {m : World → Nat} (hm : Measurable m) (t : Nat) :
    Observable (fun w => m w < t) := by
  obtain ⟨f, hf⟩ := hm
  exact ⟨fun w => decide (f w < t), fun w => by simp [hf w]⟩

7.1.11. Part IV Maintaining the Classification Itself

This classification (L1–L6, V1–V7) is a hypothesis based on current understanding, not a fixed truth. Its type-level representation is formalized as ReviewSignal in Evolution.lean.

7.1.12. Signals That Should Trigger Review

Signal

Example

Response

Misclassification

An item placed in L1 is actually conditionally modifiable

Move to another category

Missing boundary condition

Regulatory/legal constraints restrict the action space but are absent from the classification

Add a new Layer

Vanished boundary condition

Technological advances have effectively overcome an L2 item

Delete or reclassify

Variable deficit/surplus

There are optimization targets not included in V1-V7

Add, merge, or split variables

Ambiguous category boundary

Something could belong to either "fixed boundary" or "investment-variable boundary"

Refine the judgment criteria

7.1.13. Caution - Avoiding Self-Rigidification of the Classification

The greatest risk is that the classification itself begins to function as a boundary condition -- inducing reasoning such as "it cannot be changed because it is written in L1."

Preventive measures:

  • Maintain the rationale for "why this category" for each item in every Layer

  • "Fixed" means "no means of changing it has been found at present"

  • Reclassification of boundary conditions is a legitimate act consistent with the spirit of the manifesto

7.1.14. Core Insights

  1. The subject of optimization is structure, not the agent. The agent is an ephemeral catalyst (T1). Improvements accumulate within structure (T2).

  2. Variables are not independent levers but a mutually interacting system. Improving V1 can degrade V2. Rather than maximizing individual variables, maintain the health of the system as a whole.

  3. The purpose of the investment cycle is equilibrium, not expansion. Rather than maximizing the action space, search for the equilibrium point where collaborative value is maximized. The equilibrium point shifts with context.

  4. The investment cycle simultaneously contains positive and negative feedback. By P1, expansion of the action space is inseparable from expansion of the attack surface. Expansion without defense increases the potential destructive power of the reverse cycle.

  5. Gate reliability depends on P2, and P2 rests on E1. V4 is meaningful only when generation and evaluation are structurally separated.

  6. Variable optimization presupposes P4. What cannot be observed cannot be optimized.

  7. Structure is interpreted probabilistically (P5). Designs that assume 100% compliance are fragile.

  8. Task execution is a constraint satisfaction problem (P6). Simultaneous satisfaction of T3, T7, and T8 drives task design.

  9. L5 determines the ceiling of structural improvement. Building a custom platform is justified when the investment cycle has sufficiently progressed and the L5 ceiling has become a bottleneck.

  10. The axiom system has a three-layer structure. Constraints (T: undeniable), empirical postulates (E: falsifiable but unfalsified), and foundational principles (P: derived from T/E). The robustness of each P differs depending on whether its justification includes E.

  11. This classification itself is subject to review. The L1–L6, V1–V7 classification is not a fixed truth; reclassification, addition, and deletion of items may occur during operation.

7.1.15. Quality Measurement Priority G1b-1

Analysis from G1b-1 (#91) revealed that the following quality priorities are derivable from the manifesto's axiom system. These follow logically from existing axioms and design principles, without depending on T6 (human judgment).

7.1.16. Non-Derivable Domain V1-V7

Mutual priority among V1-V7 is not derivable. TradeoffExists is a symmetric relation and does not imply orderings such as "V1 > V3." This is an intentional design decision; priority among V's reduces to T6 judgment (G1b-2 #92).

Quality measurement category: measurement of structural change vs measurement of process success rate. Formalization of the proxy mismatch identified in R1 (GQM redefinition).

Declaration: inductive QualityMeasureCategory

inductive QualityMeasureCategory where
  | structuralOutcome   -- 構造的成果: theorem delta, test delta, axiom count
  | processSuccess      -- プロセス成功率: evolve success rate, skill invocation rate
  deriving BEq, Repr

Priority of quality measurement categories. Structural outcomes are a more direct indicator of quality than process success rates. Basis:

  • Supreme mission "persistent structure continues to improve itself" -> structural change defines improvement

  • Analogy from D5 (specification layer ordering): outcome (what was produced) > process (how it was produced)

  • Anthropic eval guide: "grade what the agent produced, not the path it took"

Declaration: def qualityMeasurePriority

def qualityMeasurePriority : QualityMeasureCategory → Nat
  | .structuralOutcome => 1  -- higher priority
  | .processSuccess    => 0  -- lower priority

Measurement of structural outcomes takes priority over measurement of process success rates as a quality indicator. Quality is "skills producing structural improvement," not merely "skills running successfully."

Declaration: theorem structural_outcome_gt_process_success

theorem structural_outcome_gt_process_success :
    qualityMeasurePriority .structuralOutcome >
    qualityMeasurePriority .processSuccess := by
  native_decide

Classification of verification signals: independent verification vs self-assessment. Formalization of P2 + E1 + ICLR 2024 (Huang et al.).

Declaration: inductive VerificationSignalType

inductive VerificationSignalType where
  | independentlyVerified  -- P2: 独立エージェントまたは構造的テストによる検証
  | selfAssessed           -- 同一インスタンスによる自己評価
  deriving BEq, Repr

Reliability of verification signals. Independent verification is more reliable than self-assessment. Basis:

  • P2: Cognitive separation of concerns (separation of Worker and Verifier)

  • E1: Experience precedes theory -- self-assessment using self-generated theory is circular

  • ICLR 2024 Huang et al.: intrinsic self-correction degrades accuracy

Declaration: def verificationReliability

def verificationReliability : VerificationSignalType → Nat
  | .independentlyVerified => 1  -- higher reliability
  | .selfAssessed          => 0  -- lower reliability

Independently verified quality signals are more reliable than self-assessed quality signals.

Declaration: theorem independent_verification_gt_self_assessment

theorem independent_verification_gt_self_assessment :
    verificationReliability .independentlyVerified >
    verificationReliability .selfAssessed := by
  native_decide

Quality assurance layers: defect absence vs value creation. Application of the D6 DesignStage ordering to the quality dimension.

Declaration: inductive QualityAssuranceLayer

inductive QualityAssuranceLayer where
  | defectAbsence    -- 壊れていないことの確認(test pass, Lean build, sorry=0)
  | valueCreation    -- 良いことの確認(改善の実質性、有用性)
  deriving BEq, Repr

Measurement priority of quality assurance. Confirming defect absence precedes confirming value creation. Basis:

  • D6: Boundary (constraint satisfaction) > Variable (quality improvement)

  • D4: Safety > Governance -- safety (not broken) precedes governance (making better)

  • Logical consequence: measuring "substantiveness of improvement" in a broken system is meaningless

Declaration: def qualityAssurancePriority

def qualityAssurancePriority : QualityAssuranceLayer → Nat
  | .defectAbsence  => 1  -- higher measurement priority
  | .valueCreation  => 0  -- lower measurement priority (but not less important)

Measurement of defect absence takes priority over measurement of value creation (as measurement ordering). Note: this means "should be measured first," not "defect absence is more important." Measurement of value creation becomes meaningful only after defect absence is confirmed.

Declaration: theorem defect_absence_measurement_gt_value_creation

theorem defect_absence_measurement_gt_value_creation :
    qualityAssurancePriority .defectAbsence >
    qualityAssurancePriority .valueCreation := by
  native_decide

8. Design Foundation: D1-D14

8.1. Epistemic Layer - DesignTheorem Strength 1 - Formalization of Design Development Foundation

Type-checks that D1–D17 from design-development-foundation.md are derivable (§2.4 derivability) from the manifesto's T/E/P (premise set Γ, terminology reference §2.5).

8.1.1. Nature of the Formalization

This file does not add new non-logical axioms (§4.1) to Γ. Every D is formalized as one of the following:

  • Definitional extension (§5.5): Definition of new types/functions. Always a conservative extension

  • Theorem (§4.2): Derived from existing axioms (T/E) by application of inference rules

Therefore this file is a collection of definitional extensions + theorems over T₀, and conservative extension (§5.5) is guaranteed by definitional_implies_conservative proven in Terminology.lean.

8.1.2. Design Policy

Each D is expressed as a type (definitional extension, §5.5) or theorem (§4.2), with explicit connections to the underlying T/E/P non-logical axioms (§4.1) / theorems.

D's are meta-level (§5.6 metatheory) design principles, distinct from object-level (§5.6 object theory) non-logical axioms.

8.1.3. Correspondence with Terminology Reference

Lean Concept

Terminology Reference

§Ref

D1–D17 theorems

Theorems (propositions derived from axioms)

§4.2

D1–D17 def/structure

Definitional extensions (new symbols defined via existing symbols)

§5.5

SelfGoverning

Type class (interface for types)

§9.4

DesignPrinciple

Component of the domain of discourse (§3.2)

§3.2

DesignPrincipleUpdate

Structuring of AGM revision operations

§9.2

EnforcementLayer

Hierarchy of enforcement power. Means to realize invariants (§9.3)

§9.3

DevelopmentPhase

Inter-phase dependencies resemble transition relations (§9.3)

§9.3

VerificationIndependence

Operationalization of E1 (§4.1 non-logical axiom)

§4.1

CompatibilityClass

Classification of extensions (conservative/consistent/breaking)

§5.5

8.1.4. Correspondence with design-development-foundation.md

This file formalizes D1–D17.

D

Rationale

Formalization Depth

D1

P5 + L1–L6

type + 2 theorems

D2

E1 + P2

structure + 3 theorems

D3

P4 + T5

3 theorems (3-condition structure not formalized)

D4

Section 7 + P3 + T2

type + 5 theorems

D5

T8 + P4 + P6

type + 3 theorems (inter-layer relations not formalized)

D6

Ontology/Observable

3 theorems (causal chain not formalized)

D7

Section 6 + P1

2 theorems (accumulation bounded + damage unbounded)

D8

Section 6 + E2

2 theorems (overexpansion + capability-risk)

D9

Observable + P3 + Section 7

SelfGoverning + 4 theorems

D10

T1 + T2

2 theorems (structural permanence + epoch monotone increase)

D11

T3 + D1

definition + 3 theorems (inverse correlation + minimization + finiteness)

D12

P6 + T3 + T7 + T8

2 theorems (CSP + probabilistic output)

D13

P3 + Section 8 + T5

impact propagation + assumption-level extension (#225)

D14

P6 + T7 + T8

1 theorem (constraint satisfaction of verification order)

D15

T3+T4+T5+T6+T7+T8+P6

4 theorems (retry bounds, convergence, eviction, saturation)

D16

context_contribution_nonuniform

3 theorems (zero-contribution, composition, resource)

D17

T5+D3+P3+T6+D5+D9+E1+D2+D13

type + 8 theorems (deductive design workflow)

8.1.5. D1 Enforcement Layering

Definitional Extension, 5.5.

Rationale: P5 (probabilistic interpretation) + L1–L6 (hierarchy of boundary conditions)

By P5, normative guidelines are only probabilistically complied with. Therefore, absolute constraints such as L1 (safety) should be implemented via structural enforcement (not subject to probabilistic interpretation).

Connection with terminology reference:

  • Structural enforcement → Invariant (§9.3): Property that always holds during execution

  • Procedural enforcement → Pre/post-conditions (§9.3): Verified before and after operations

  • Normative guideline → Satisfiable (§2.2) but not valid/tautological (§2.2) by P5

Enforcement layer. Represents the strength of enforcement power.

Declaration: inductive EnforcementLayer

inductive EnforcementLayer where
  | structural   -- 違反が物理的に不可能
  | procedural   -- 違反は可能だが検出・阻止される
  | normative    -- 遵守は確率的(P5)
  deriving BEq, Repr

Strength ordering of enforcement layers. structural is the strongest.

Declaration: def EnforcementLayer.strength

def EnforcementLayer.strength : EnforcementLayer → Nat
  | .structural => 3
  | .procedural => 2
  | .normative  => 1

Minimum required enforcement layer for each boundary condition. Fixed boundaries (L1, L2) require structural enforcement. Investment-variable boundaries require procedural enforcement or above. Environmental boundaries may use normative guidelines.

Declaration: def minimumEnforcement

def minimumEnforcement : BoundaryLayer → EnforcementLayer
  | .fixed              => .structural
  | .investmentVariable => .procedural
  | .environmental      => .normative

(Derivation Card) Derives from: probabilistic_interpretation_insufficient (P5 / T4) Proposition: D1 Content: Fixed boundaries (L1) require structural enforcement. Normative guidelines cannot guarantee L1 compliance under nondeterministic interpretation. Proof strategy: rfl (definitional equality — minimumEnforcement maps .fixed to .structural)

Declaration: theorem d1_fixed_requires_structural

theorem d1_fixed_requires_structural :
  minimumEnforcement .fixed = .structural := by rfl

Corollary of D1: Enforcement layer strength is monotone with respect to boundary layers. Enforcement strength required: fixed >= investment-variable >= environmental.

Declaration: theorem d1_enforcement_monotone

theorem d1_enforcement_monotone :
  (minimumEnforcement .fixed).strength ≥
  (minimumEnforcement .investmentVariable).strength ∧
  (minimumEnforcement .investmentVariable).strength ≥
  (minimumEnforcement .environmental).strength := by
  simp [minimumEnforcement, EnforcementLayer.strength]

8.1.6. D2 Worker/Verifier Separation

Definitional Extension + Theorem, 5.5/4.2.

Rationale: E1 (verification independence, non-logical axiom §4.1) + P2 (cognitive role separation, theorem §4.2)

E1a (verification_requires_independence) is the direct rationale. E1 belongs to Γ T₀ (hypothesis-derived) and is falsifiable (§9.1). If E1 is falsified, D2 becomes subject to review.

Four conditions for verification independence.

The former 3 conditions (context separation, bias non-sharing, independent invocation) covered only process-level independence. Without evaluator independence, the problem of "the same model making the same mistakes in a different context" remains.

Four conditions:

  1. Context separation: Worker's reasoning process and intermediate state do not leak to Verifier

  2. Framing independence: Verification criteria are not post-hoc defined by the Worker (Refinement of former "bias non-sharing". Not only the artifacts, but also the framework of "what should be verified" is independent of the Worker)

  3. Execution automaticity: Worker cannot bypass verification (Strengthening of former "independent invocation". Does not depend on Worker's discretion)

  4. Evaluator independence: Evaluation is performed by a separate entity without shared judgment tendencies (Human: A different person without shared context but with sufficient knowledge. LLM: A different model without shared context. Same model with different context corresponds to a Subagent, which achieves process separation but not evaluator independence)

Declaration: structure VerificationIndependence

structure VerificationIndependence where

Worker's reasoning process does not leak to Verifier

contextSeparated      : Bool

Worker's reasoning process does not leak to Verifier -/ contextSeparated : Bool /-- Verification criteria do not depend on Worker's framing -/ framingIndependent : Bool /-- Verification execution does not depend on Worker's discretion -/ executionAutomatic : Bool /-- Evaluator has different judgment tendencies from Worker -/ evaluatorIndependent : Bool deriving BEq, Repr

/-- Verification risk level. The required level of independence varies by risk.

Declaration: inductive VerificationRisk

inductive VerificationRisk where
  | critical  -- L1 関連: 安全・倫理
  | high      -- 構造変更: アーキテクチャ、設定
  | moderate  -- 通常コード変更
  | low       -- ドキュメント、コメント
  deriving BEq, Repr

Verification criteria do not depend on Worker's framing

framingIndependent    : Bool

Verification execution does not depend on Worker's discretion

executionAutomatic    : Bool

Evaluator has different judgment tendencies from Worker

evaluatorIndependent  : Bool

Required independence conditions for each risk level. The model is quantitative: any N conditions out of 4 suffice. critical: All 4 conditions required (verification by human or different model) high: Any 3 of 4 conditions (e.g. context separation + framing independence + evaluator independence) moderate: Any 2 of 4 conditions (e.g. context separation + automatic execution) low: Any 1 of 4 conditions (context separation alone suffices)

Declaration: def requiredConditions

def requiredConditions : VerificationRisk → Nat
  | .critical => 4
  | .high     => 3
  | .moderate => 2
  | .low      => 1

Counts the number of satisfied independence conditions.

Declaration: def satisfiedConditions

def satisfiedConditions (vi : VerificationIndependence) : Nat :=
  (if vi.contextSeparated then 1 else 0) +
  (if vi.framingIndependent then 1 else 0) +
  (if vi.executionAutomatic then 1 else 0) +
  (if vi.evaluatorIndependent then 1 else 0)

Whether verification is sufficient: satisfied conditions >= required conditions

Declaration: def sufficientVerification

def sufficientVerification
    (vi : VerificationIndependence) (risk : VerificationRisk) : Prop :=
  satisfiedConditions vi ≥ requiredConditions risk

Critical risk requires all four conditions. Subagent (contextSeparated only) is insufficient.

Declaration: theorem critical_requires_all_four

theorem critical_requires_all_four :
  requiredConditions .critical = 4 := by rfl

Subagent-only verification (context separation only) is sufficient only for low risk.

Declaration: theorem subagent_only_sufficient_for_low

theorem subagent_only_sufficient_for_low :
  let subagentOnly : VerificationIndependence :=
    { contextSeparated := true
      framingIndependent := false
      executionAutomatic := false
      evaluatorIndependent := false }
  sufficientVerification subagentOnly .low ∧
  ¬sufficientVerification subagentOnly .moderate := by
  simp [sufficientVerification, satisfiedConditions, requiredConditions]

Backward compatibility with former validSeparation: the old 3 conditions are a subset of the new 4 conditions.

Declaration: def validSeparation

def validSeparation (vs : VerificationIndependence) : Prop :=
  vs.contextSeparated = true ∧
  vs.framingIndependent = true ∧
  vs.executionAutomatic = true

(Derivation Card) Derives from: verification_requires_independence (E1) Proposition: D2 Content: Valid verification requires separation — generator and verifier must have distinct IDs and not share internal state, ensuring contextual and evaluative independence. Proof strategy: Direct application of verification_requires_independence (E1)

Declaration: theorem d2_from_e1

theorem d2_from_e1 :
  ∀ (gen ver : Agent) (action : Action) (w : World),
    generates gen action w →
    verifies ver action w →
    gen.id ≠ ver.id ∧ ¬sharesInternalState gen ver :=
  verification_requires_independence

8.1.7. D3 Observability First

Theorem, 4.2.

Rationale: P4 (observability of degradation, theorem §4.2) + T5 (no improvement without feedback, T₀ §4.1)

T5 (no_improvement_without_feedback) is the direct rationale: Improvement requires feedback → feedback requires observation.

Note: design-development-foundation.md defines 3 conditions for observability (measurable, degradation-detectable, improvement-verifiable), but this formalization covers only the implication of T5. Structuring of the 3 conditions is not yet implemented.

(Derivation Card) Derives from: no_improvement_without_feedback (T5) Proposition: D3 Content: Feedback must precede improvement — observability is a necessary precondition for structural improvement. Proof strategy: Direct application of T5 (no_improvement_without_feedback)

Declaration: theorem d3_observability_precedes_improvement

theorem d3_observability_precedes_improvement :
  ∀ (w w' : World),
    structureImproved w w' →
    ∃ (f : Feedback), f ∈ w'.feedbacks ∧
      f.timestamp ≥ w.time ∧ f.timestamp ≤ w'.time :=
  no_improvement_without_feedback

(Derivation Card) Derives from: no_process_improvement_without_feedback (T5, #316) Proposition: D3 (process-level) Content: Process-targeted feedback must precede process improvement. Observability of the process itself is a necessary precondition for process improvement. This prevents "blind optimization" where a process improves structures but never evaluates its own effectiveness. Proof strategy: Direct application of no_process_improvement_without_feedback

Declaration: theorem d3_process_observability_precedes_improvement

theorem d3_process_observability_precedes_improvement :
  ∀ (pid : ProcessId) (w w' : World),
    processImproved pid w w' →
    ∃ (f : Feedback), f ∈ w'.feedbacks ∧
      f.target = .process pid ∧
      f.timestamp ≥ w.time ∧ f.timestamp ≤ w'.time :=
  no_process_improvement_without_feedback

Distinction of detection modes (introduced in Run 41). Refines the definition of "detectable": distinguishes between human-readable (humanReadable) and programmatically queryable (structurallyQueryable). D3 condition 2 requires structurallyQueryable.

Declaration: inductive DetectionMode

inductive DetectionMode where
  | humanReadable         : DetectionMode  -- 人間が読めば分かる(自由テキスト等)
  | structurallyQueryable : DetectionMode  -- プログラムでクエリ可能(構造化フィールド等)
  deriving BEq, Repr

D3 observability 3 conditions (design-development-foundation.md §D3). Only when all 3 conditions hold for a variable V does V become an effectively optimizable target.

Declaration: structure ObservabilityConditions

structure ObservabilityConditions where

Whether the current value is measurable (Measurable, Observable.lean)

measurable            : Bool

Whether the current value is measurable (Measurable, Observable.lean) -/ measurable : Bool /-- Whether degradation is detectable (can it be detected before quality collapse) -/ degradationDetectable : Bool /-- Detection mode for degradation (ineffective unless structurallyQueryable) -/ detectionMode : DetectionMode := .structurallyQueryable /-- Whether improvement is verifiable (can value changes be compared before and after intervention) -/ improvementVerifiable : Bool deriving BEq, Repr

/-- Determines whether a variable is an effectively optimizable target. All 3 conditions required. Additionally, degradation detection must be in a structurally queryable format.

Declaration: def effectivelyOptimizable

def effectivelyOptimizable (c : ObservabilityConditions) : Prop :=
  c.measurable = true ∧ c.degradationDetectable = true ∧
  c.detectionMode = .structurallyQueryable ∧ c.improvementVerifiable = true

Whether degradation is detectable (can it be detected before quality collapse)

degradationDetectable : Bool

Detection mode for degradation (ineffective unless structurallyQueryable)

detectionMode         : DetectionMode := .structurallyQueryable

Whether improvement is verifiable (can value changes be compared before and after intervention)

improvementVerifiable : Bool

D3: A variable lacking any of the 3 conditions is merely a nominal optimization target.

Declaration: theorem d3_partial_observability_insufficient

theorem d3_partial_observability_insufficient :
  ¬effectivelyOptimizable ⟨true, true, .structurallyQueryable, false⟩ ∧
  ¬effectivelyOptimizable ⟨true, false, .structurallyQueryable, true⟩ ∧
  ¬effectivelyOptimizable ⟨false, true, .structurallyQueryable, true⟩ := by
  refine ⟨?_, ?_, ?_⟩ <;> simp [effectivelyOptimizable]

D3: Effective only when all 3 conditions hold and detection is structurally queryable.

Declaration: theorem d3_full_observability_sufficient

theorem d3_full_observability_sufficient :
  effectivelyOptimizable ⟨true, true, .structurallyQueryable, true⟩ := by
  simp [effectivelyOptimizable]

D3 refinement (Run 41): Human-readable but structurally non-queryable detection is insufficient. Merely writing in notes is ineffective even if degradationDetectable = true.

Declaration: theorem d3_human_readable_insufficient

theorem d3_human_readable_insufficient :
  ¬effectivelyOptimizable ⟨true, true, .humanReadable, true⟩ := by
  simp [effectivelyOptimizable]

8.1.8. D4 Progressive Self-Application

Definitional Extension + Theorem, 5.5/4.2.

Rationale: Section 7 (self-application) + P3 (governed learning, theorem §4.2) + T2 (structural permanence, T₀ §4.1)

Development phases have an ordering, and each phase's completion persists in structure (T2). The phase ordering is derived from the dependency relationships of D1–D3. phaseOrder in Procedure.lean formalizes the same ordering.

Development phase. Each stage of D4's progressive self-application.

Declaration: inductive DevelopmentPhase

inductive DevelopmentPhase where
  | safety        -- L1: 安全基盤
  | verification  -- P2: 検証基盤
  | observability -- P4: 可観測性
  | governance    -- P3: 統治
  | equilibrium   -- 投資サイクル + 動的調整
  deriving BEq, Repr

Inter-phase dependencies. A subsequent phase cannot begin until the preceding phase is complete.

Declaration: def phaseDependency

def phaseDependency : DevelopmentPhase → DevelopmentPhase → Prop
  | .verification,  .safety        => True  -- P2 は L1 の後
  | .observability, .verification  => True  -- P4 は P2 の後
  | .governance,    .observability => True  -- P3 は P4 の後
  | .equilibrium,   .governance    => True  -- 投資は P3 の後
  | _,              _              => False

Rationale for D4: Phase ordering is strict (no self-transitions). Each phase depends on its preceding phase.

Declaration: theorem d4_no_self_dependency

theorem d4_no_self_dependency :
  ∀ (p : DevelopmentPhase), ¬phaseDependency p p := by
  intro p; cases p <;> simp [phaseDependency]

(Derivation Card) Derives from: phaseDependency (definitional), structure_accumulates (T2) Proposition: D4 Content: A complete phase chain exists: safety → verification → observability → governance → equilibrium. Each phase is strictly ordered with no self-dependencies. Proof strategy: refine + trivial (all four phaseDependency facts hold by definition)

Declaration: theorem d4_full_chain

theorem d4_full_chain :
  phaseDependency .verification .safety ∧
  phaseDependency .observability .verification ∧
  phaseDependency .governance .observability ∧
  phaseDependency .equilibrium .governance := by
  refine ⟨?_, ?_, ?_, ?_⟩ <;> trivial

D4's connection to T2: Phase completion persists in structure. From structure_accumulates, epochs (phase progression) are irreversible. A completed phase is never "undone".

Declaration: theorem d4_phase_completion_persists

theorem d4_phase_completion_persists :
  ∀ (w w' : World),
    validTransition w w' →
    w.epoch ≤ w'.epoch :=
  structure_accumulates

8.1.9. Partial Order Instance for DevelopmentPhase

manifesto.md Section 8 asserts that "D4/D5/D6 are instances of partial orders". Following the precedent of StructureKind (Ontology.lean), we derive LE/LT instances and the 4 partial order property theorems from a Nat-based ordering function.

Ordering function for DevelopmentPhase. Separately from phaseDependency (binary Prop), defines a total order via Nat. Reflects the phase ordering of D4.

Declaration: def developmentPhaseOrder

def developmentPhaseOrder : DevelopmentPhase → Nat
  | .safety        => 0
  | .verification  => 1
  | .observability => 2
  | .governance    => 3
  | .equilibrium   => 4

The ordering function is injective (distinct phases have distinct order values).

Declaration: theorem developmentPhaseOrder_injective

theorem developmentPhaseOrder_injective :
  ∀ (p₁ p₂ : DevelopmentPhase),
    developmentPhaseOrder p₁ = developmentPhaseOrder p₂ → p₁ = p₂ := by
  intro p₁ p₂; cases p₁ <;> cases p₂ <;> simp [developmentPhaseOrder]

Partial order reflexivity: p <= p.

Declaration: theorem developmentPhase_le_refl

theorem developmentPhase_le_refl : ∀ (p : DevelopmentPhase), p ≤ p :=
  fun p => Nat.le_refl (developmentPhaseOrder p)

Partial order transitivity: if p₁ <= p₂ and p₂ <= p₃ then p₁ <= p₃.

Declaration: theorem developmentPhase_le_trans

theorem developmentPhase_le_trans :
    ∀ (p₁ p₂ p₃ : DevelopmentPhase), p₁ ≤ p₂ → p₂ ≤ p₃ → p₁ ≤ p₃ := by
  intro _p₁ _p₂ _p₃ h₁₂ h₂₃; exact Nat.le_trans h₁₂ h₂₃

Partial order antisymmetry: if p₁ <= p₂ and p₂ <= p₁ then p₁ = p₂.

Declaration: theorem developmentPhase_le_antisymm

theorem developmentPhase_le_antisymm :
    ∀ (p₁ p₂ : DevelopmentPhase), p₁ ≤ p₂ → p₂ ≤ p₁ → p₁ = p₂ :=
  fun p₁ p₂ h₁₂ h₂₁ => developmentPhaseOrder_injective p₁ p₂ (Nat.le_antisymm h₁₂ h₂₁)

Consistency of LT and LE: p₁ < p₂ iff p₁ <= p₂ and not (p₂ <= p₁).

Declaration: theorem developmentPhase_lt_iff_le_not_le

theorem developmentPhase_lt_iff_le_not_le :
    ∀ (p₁ p₂ : DevelopmentPhase), p₁ < p₂ ↔ p₁ ≤ p₂ ∧ ¬(p₂ ≤ p₁) := by
  intro _p₁ _p₂; exact Nat.lt_iff_le_and_not_ge

8.1.10. D5 Specification Test and Implementation Three-Layer Architecture

Rationale: T8 (precision level) + P4 (observability) + P6 (constraint satisfaction)

Types of the three-layer representation.

Declaration: inductive SpecLayer

inductive SpecLayer where
  | formalSpec        -- 形式仕様(Lean axiom/theorem)
  | acceptanceTest    -- 受け入れテスト(実行可能な検証)
  | implementation    -- 実装(プラットフォーム固有)
  deriving BEq, Repr

Test kinds. Corresponds to T4 (probabilistic output).

Declaration: inductive TestKind

inductive TestKind where
  | structural   -- 構成の存在を確認(決定論的)
  | behavioral   -- 実行して結果を確認(確率的、T4)
  deriving BEq, Repr

(Derivation Card) Derives from: task_has_precision (T8) Proposition: D5 Content: Tests must have non-zero precision — a task with precision level 0 is meaningless and cannot support meaningful optimization or acceptance criteria. Proof strategy: Direct application of task_has_precision (T8)

Declaration: theorem d5_test_has_precision

theorem d5_test_has_precision :
  ∀ (task : Task),
    task.precisionRequired.required > 0 :=
  task_has_precision

Correspondence between the three layers. Composed in the order: formal spec -> test -> implementation. design-development-foundation.md D5: "Formal spec -> Test: At least one test exists for each axiom/theorem" "Test -> Implementation: Tests exist first, and the implementation passes the tests"

Declaration: def specLayerOrder

def specLayerOrder : SpecLayer → Nat
  | .formalSpec      => 0   -- 最初に仕様を定義
  | .acceptanceTest  => 1   -- 仕様からテストを導出
  | .implementation  => 2   -- テストを通す実装を構築

D5: The three layers are strictly ordered.

Declaration: theorem d5_layer_sequential

theorem d5_layer_sequential :
  specLayerOrder .formalSpec < specLayerOrder .acceptanceTest ∧
  specLayerOrder .acceptanceTest < specLayerOrder .implementation := by
  simp [specLayerOrder]

Determinism of tests. Structural tests are deterministic, behavioral tests are probabilistic (T4).

Declaration: def testDeterministic

def testDeterministic : TestKind → Bool
  | .structural => true    -- 決定論的: 存在の有無を確認
  | .behavioral => false   -- 確率的: T4 により結果が変動しうる

D5 + T4: Structural tests are deterministic, behavioral tests are probabilistic.

Declaration: theorem d5_structural_test_deterministic

theorem d5_structural_test_deterministic :
  testDeterministic .structural = true ∧
  testDeterministic .behavioral = false := by
  constructor <;> rfl

The ordering function is injective (distinct layers have distinct order values).

Declaration: theorem specLayerOrder_injective

theorem specLayerOrder_injective :
  ∀ (l₁ l₂ : SpecLayer),
    specLayerOrder l₁ = specLayerOrder l₂ → l₁ = l₂ := by
  intro l₁ l₂; cases l₁ <;> cases l₂ <;> simp [specLayerOrder]

Partial order reflexivity: l <= l.

Declaration: theorem specLayer_le_refl

theorem specLayer_le_refl : ∀ (l : SpecLayer), l ≤ l :=
  fun l => Nat.le_refl (specLayerOrder l)

Partial order transitivity: if l₁ <= l₂ and l₂ <= l₃ then l₁ <= l₃.

Declaration: theorem specLayer_le_trans

theorem specLayer_le_trans :
    ∀ (l₁ l₂ l₃ : SpecLayer), l₁ ≤ l₂ → l₂ ≤ l₃ → l₁ ≤ l₃ := by
  intro _l₁ _l₂ _l₃ h₁₂ h₂₃; exact Nat.le_trans h₁₂ h₂₃

Partial order antisymmetry: if l₁ <= l₂ and l₂ <= l₁ then l₁ = l₂.

Declaration: theorem specLayer_le_antisymm

theorem specLayer_le_antisymm :
    ∀ (l₁ l₂ : SpecLayer), l₁ ≤ l₂ → l₂ ≤ l₁ → l₁ = l₂ :=
  fun l₁ l₂ h₁₂ h₂₁ => specLayerOrder_injective l₁ l₂ (Nat.le_antisymm h₁₂ h₂₁)

Consistency of LT and LE: l₁ < l₂ iff l₁ <= l₂ and not (l₂ <= l₁).

Declaration: theorem specLayer_lt_iff_le_not_le

theorem specLayer_lt_iff_le_not_le :
    ∀ (l₁ l₂ : SpecLayer), l₁ < l₂ ↔ l₁ ≤ l₂ ∧ ¬(l₂ ≤ l₁) := by
  intro _l₁ _l₂; exact Nat.lt_iff_le_and_not_ge

8.1.11. D6 Three-Stage Design

Rationale: Ontology.lean/Observable.lean three-stage structure (boundary -> mitigation -> variable)

BoundaryLayer, BoundaryId, and Mitigation are already defined in Ontology.lean. Here we express the design principles as theorems.

Rationale for D6: Variables corresponding to fixed boundaries can only improve mitigation quality.

Declaration: theorem d6_fixed_boundary_mitigated

theorem d6_fixed_boundary_mitigated :
  boundaryLayer .ethicsSafety = .fixed ∧
  boundaryLayer .ontological = .fixed := by
  simp [boundaryLayer]

Design flow of the three-stage design. design-development-foundation.md D6: "Boundary conditions (invariant) -> Mitigations (design decisions) -> Variables (quality metrics)" Design always proceeds in this direction; the reverse direction is prohibited.

Declaration: inductive DesignStage

inductive DesignStage where

Identify boundary conditions (invariant; only accepted) -/ | identifyBoundary /-- Design mitigations (design decisions belonging to L6) -/ | designMitigation /-- Define variables (metrics for mitigation effectiveness) -/ | defineVariable deriving BEq, Repr, DecidableEq

/-- Stage ordering of the three-stage design. -/ def designStageOrder : DesignStage → Nat | .identifyBoundary => 0 | .designMitigation => 1 | .defineVariable => 2

/-- (Derivation Card) Derives from: designStageOrder (definitional) Proposition: D6 Content: The three-stage design is strictly ordered — identifyBoundary < designMitigation < defineVariable. The stage ordering function assigns monotonically increasing natural numbers. Proof strategy: simp (designStageOrder) — unfold the ordering function and reduce to Nat inequalities -/ theorem d6_stage_sequential : designStageOrder .identifyBoundary < designStageOrder .designMitigation ∧ designStageOrder .designMitigation < designStageOrder .defineVariable := by simp (designStageOrder)

/-- D6: Reverse direction prohibited. Do not attempt to directly improve variables (Goodhart's Law trap). The variable stage is last; there is no backtracking from variables to boundary conditions or mitigations. -/ theorem d6_no_reverse : ∀ (s : DesignStage), designStageOrder .identifyBoundary ≤ designStageOrder s := by intro s; cases s <;> simp (designStageOrder)

/-- The ordering function is injective (distinct stages have distinct order values). -/ theorem designStageOrder_injective : ∀ (s₁ s₂ : DesignStage), designStageOrder s₁ = designStageOrder s₂ → s₁ = s₂ := by intro s₁ s₂; cases s₁ <;> cases s₂ <;> simp (designStageOrder)

instance : LE DesignStage := ⟨fun a b => designStageOrder a ≤ designStageOrder b⟩ instance : LT DesignStage := ⟨fun a b => designStageOrder a < designStageOrder b⟩

/-- Partial order reflexivity: s <= s. -/ theorem designStage_le_refl : ∀ (s : DesignStage), s ≤ s := fun s => Nat.le_refl (designStageOrder s)

/-- Partial order transitivity: if s₁ <= s₂ and s₂ <= s₃ then s₁ <= s₃. -/ theorem designStage_le_trans : ∀ (s₁ s₂ s₃ : DesignStage), s₁ ≤ s₂ → s₂ ≤ s₃ → s₁ ≤ s₃ := by intro _s₁ _s₂ _s₃ h₁₂ h₂₃; exact Nat.le_trans h₁₂ h₂₃

/-- Partial order antisymmetry: if s₁ <= s₂ and s₂ <= s₁ then s₁ = s₂. -/ theorem designStage_le_antisymm : ∀ (s₁ s₂ : DesignStage), s₁ ≤ s₂ → s₂ ≤ s₁ → s₁ = s₂ := fun s₁ s₂ h₁₂ h₂₁ => designStageOrder_injective s₁ s₂ (Nat.le_antisymm h₁₂ h₂₁)

/-- Consistency of LT and LE: s₁ < s₂ iff s₁ <= s₂ and not (s₂ <= s₁). -/ theorem designStage_lt_iff_le_not_le : ∀ (s₁ s₂ : DesignStage), s₁ < s₂ ↔ s₁ ≤ s₂ ∧ ¬(s₂ ≤ s₁) := by intro _s₁ _s₂; exact Nat.lt_iff_le_and_not_ge

/-! ## D7 Trust Asymmetry

Rationale: Section 6 + P1 (co-growth)

Accumulation is bounded (trust_accumulates_gradually), damage is unbounded (trust_decreases_on_materialized_risk). -/

/-- (Derivation Card) Derives from: trust_accumulates_gradually (P1) Proposition: D7 Content: Trust accumulation is bounded — incremental increases are capped by trustIncrementBound, formalizing the asymmetry of gradual growth. Proof strategy: Direct application of trust_accumulates_gradually (P1) -/ theorem d7_accumulation_bounded : ∀ (agent : Agent) (w w' : World), actionSpaceSize agent w ≤ actionSpaceSize agent w' → ¬riskMaterialized agent w' → trustLevel agent w ≤ trustLevel agent w' ∧ trustLevel agent w' ≤ trustLevel agent w + trustIncrementBound := trust_accumulates_gradually

/-- (Derivation Card) Derives from: trust_decreases_on_materialized_risk (P1) Proposition: D7 Content: Trust damage from materialized risk is unbounded — a single incident can destroy arbitrarily accumulated trust, formalizing the asymmetry of abrupt destruction. Proof strategy: Direct application of trust_decreases_on_materialized_risk (P1) -/ theorem d7_damage_unbounded : ∀ (agent : Agent) (w w' : World), actionSpaceSize agent w < actionSpaceSize agent w' → riskMaterialized agent w' → trustLevel agent w' < trustLevel agent w := trust_decreases_on_materialized_risk

/-! ## D8 Equilibrium Search

Rationale: Section 6 + E2 (capability-risk co-scaling)

By overexpansion_reduces_value, there exist cases where expansion of the action space reduces collaborative value. -/

/-- (Derivation Card) Derives from: overexpansion_reduces_value (E2) Proposition: D8 Content: Overexpansion of the action space can reduce collaborative value — equilibrium search is necessary to avoid value-destroying expansion. Proof strategy: Direct application of overexpansion_reduces_value (E2) -/ theorem d8_overexpansion_risk : ∃ (agent : Agent) (w w' : World), actionSpaceSize agent w < actionSpaceSize agent w' ∧ collaborativeValue w' < collaborativeValue w := overexpansion_reduces_value

/-- D8's connection to P1: Capability expansion is inseparable from risk expansion. Direct application of E2. -/ theorem d8_capability_risk : ∀ (agent : Agent) (w w' : World), actionSpaceSize agent w < actionSpaceSize agent w' → riskExposure agent w < riskExposure agent w' := capability_risk_coscaling

/-! ## D9 Maintenance of the Classification Itself Definitional Extension + Theorem, 5.5/4.2.

Rationale: Observable.lean Part IV + P3 (governed learning, theorem §4.2) + Section 7 (self-application)

The design foundation itself is subject to updates, and updates follow P3's compatibility classification. This is a structuring of AGM revision operations (terminology reference §9.2):

  • Conservative extension = conservative extension (§5.5)

  • Compatible change = consistent extension (§5.5)

  • Breaking change = non-extension change (some theorems are not preserved)

8.1.12. Self-Application Requirements

Since D9 states the principle of "maintenance of the classification itself", D1–D9 themselves must also be subject to D9 (Section 7).

To express this at the type level (§7.1 Curry-Howard correspondence):

  1. Model D1–D9 as values of the DesignPrinciple type (extension of domain of discourse §3.2)

  2. Require that updates to DesignPrinciple are classified by CompatibilityClass

  3. Structurally enforce via the SelfGoverning type class (§9.4) -/

/-- Design principle identifiers. Enumerates D1–D17 as values. This allows D1–D17 themselves to be treated at the type level as "targets of updates". -/ inductive DesignPrinciple where | d1_enforcementLayering | d2_workerVerifierSeparation | d3_observabilityFirst | d4_progressiveSelfApplication | d5_specTestImpl | d6_boundaryMitigationVariable | d7_trustAsymmetry | d8_equilibriumSearch | d9_selfMaintenance | d10_structuralPermanence | d11_contextEconomy | d12_constraintSatisfactionTaskDesign | d13_premiseNegationPropagation | d14_verificationOrderConstraint | d15_harnessEngineering | d16_informationRelevance | d17_deductiveDesignWorkflow | d18_multiAgentCoordination deriving BEq, Repr

/-- DesignPrinciple implements SelfGoverning. This makes D1–D9 themselves subject to governedUpdate, and updates without compatibility classification become type-level errors.

Types that do not implement SelfGoverning cannot use governedUpdate or governed_update_classified, so defining a new principle type and forgetting to implement SelfGoverning is detected as a type error. -/ instance : SelfGoverning DesignPrinciple where classificationExhaustive := by intro c; cases c <;> simp canClassifyUpdate _ _ := True

/-- Design principle update event. Self-application of D9: Changes to D1–D9 themselves also go through compatibility classification. -/ structure DesignPrincipleUpdate where /-- The principle being updated

principle     : DesignPrinciple

Identify boundary conditions (invariant; only accepted) -/ | identifyBoundary /-- Design mitigations (design decisions belonging to L6) -/ | designMitigation /-- Define variables (metrics for mitigation effectiveness) -/ | defineVariable deriving BEq, Repr, DecidableEq

/-- Stage ordering of the three-stage design.

Declaration: def designStageOrder

def designStageOrder : DesignStage → Nat
  | .identifyBoundary  => 0
  | .designMitigation  => 1
  | .defineVariable    => 2

Design mitigations (design decisions belonging to L6)

| designMitigation

Define variables (metrics for mitigation effectiveness)

| defineVariable

(Derivation Card) Derives from: designStageOrder (definitional) Proposition: D6 Content: The three-stage design is strictly ordered — identifyBoundary < designMitigation < defineVariable. The stage ordering function assigns monotonically increasing natural numbers. Proof strategy: simp (designStageOrder) — unfold the ordering function and reduce to Nat inequalities

Declaration: theorem d6_stage_sequential

theorem d6_stage_sequential :
  designStageOrder .identifyBoundary < designStageOrder .designMitigation ∧
  designStageOrder .designMitigation < designStageOrder .defineVariable := by
  simp [designStageOrder]

D6: Reverse direction prohibited. Do not attempt to directly improve variables (Goodhart's Law trap). The variable stage is last; there is no backtracking from variables to boundary conditions or mitigations.

Declaration: theorem d6_no_reverse

theorem d6_no_reverse :
  ∀ (s : DesignStage),
    designStageOrder .identifyBoundary ≤ designStageOrder s := by
  intro s; cases s <;> simp [designStageOrder]

The ordering function is injective (distinct stages have distinct order values).

Declaration: theorem designStageOrder_injective

theorem designStageOrder_injective :
  ∀ (s₁ s₂ : DesignStage),
    designStageOrder s₁ = designStageOrder s₂ → s₁ = s₂ := by
  intro s₁ s₂; cases s₁ <;> cases s₂ <;> simp [designStageOrder]

Partial order reflexivity: s <= s.

Declaration: theorem designStage_le_refl

theorem designStage_le_refl : ∀ (s : DesignStage), s ≤ s :=
  fun s => Nat.le_refl (designStageOrder s)

Partial order transitivity: if s₁ <= s₂ and s₂ <= s₃ then s₁ <= s₃.

Declaration: theorem designStage_le_trans

theorem designStage_le_trans :
    ∀ (s₁ s₂ s₃ : DesignStage), s₁ ≤ s₂ → s₂ ≤ s₃ → s₁ ≤ s₃ := by
  intro _s₁ _s₂ _s₃ h₁₂ h₂₃; exact Nat.le_trans h₁₂ h₂₃

Partial order antisymmetry: if s₁ <= s₂ and s₂ <= s₁ then s₁ = s₂.

Declaration: theorem designStage_le_antisymm

theorem designStage_le_antisymm :
    ∀ (s₁ s₂ : DesignStage), s₁ ≤ s₂ → s₂ ≤ s₁ → s₁ = s₂ :=
  fun s₁ s₂ h₁₂ h₂₁ => designStageOrder_injective s₁ s₂ (Nat.le_antisymm h₁₂ h₂₁)

Consistency of LT and LE: s₁ < s₂ iff s₁ <= s₂ and not (s₂ <= s₁).

Declaration: theorem designStage_lt_iff_le_not_le

theorem designStage_lt_iff_le_not_le :
    ∀ (s₁ s₂ : DesignStage), s₁ < s₂ ↔ s₁ ≤ s₂ ∧ ¬(s₂ ≤ s₁) := by
  intro _s₁ _s₂; exact Nat.lt_iff_le_and_not_ge

8.1.13. D7 Trust Asymmetry Part 2

Rationale: Section 6 + P1 (co-growth)

Accumulation is bounded (trust_accumulates_gradually), damage is unbounded (trust_decreases_on_materialized_risk).

(Derivation Card) Derives from: trust_accumulates_gradually (P1) Proposition: D7 Content: Trust accumulation is bounded — incremental increases are capped by trustIncrementBound, formalizing the asymmetry of gradual growth. Proof strategy: Direct application of trust_accumulates_gradually (P1)

Declaration: theorem d7_accumulation_bounded

theorem d7_accumulation_bounded :
  ∀ (agent : Agent) (w w' : World),
    actionSpaceSize agent w ≤ actionSpaceSize agent w' →
    ¬riskMaterialized agent w' →
    trustLevel agent w ≤ trustLevel agent w' ∧
    trustLevel agent w' ≤ trustLevel agent w + trustIncrementBound :=
  trust_accumulates_gradually

(Derivation Card) Derives from: trust_decreases_on_materialized_risk (P1) Proposition: D7 Content: Trust damage from materialized risk is unbounded — a single incident can destroy arbitrarily accumulated trust, formalizing the asymmetry of abrupt destruction. Proof strategy: Direct application of trust_decreases_on_materialized_risk (P1)

Declaration: theorem d7_damage_unbounded

theorem d7_damage_unbounded :
  ∀ (agent : Agent) (w w' : World),
    actionSpaceSize agent w < actionSpaceSize agent w' →
    riskMaterialized agent w' →
    trustLevel agent w' < trustLevel agent w :=
  trust_decreases_on_materialized_risk

8.1.14. D8 Equilibrium Search Part 2

Rationale: Section 6 + E2 (capability-risk co-scaling)

By overexpansion_reduces_value, there exist cases where expansion of the action space reduces collaborative value.

(Derivation Card) Derives from: overexpansion_reduces_value (E2) Proposition: D8 Content: Overexpansion of the action space can reduce collaborative value — equilibrium search is necessary to avoid value-destroying expansion. Proof strategy: Direct application of overexpansion_reduces_value (E2)

Declaration: theorem d8_overexpansion_risk

theorem d8_overexpansion_risk :
  ∃ (agent : Agent) (w w' : World),
    actionSpaceSize agent w < actionSpaceSize agent w' ∧
    collaborativeValue w' < collaborativeValue w :=
  overexpansion_reduces_value

D8's connection to P1: Capability expansion is inseparable from risk expansion. Direct application of E2.

Declaration: theorem d8_capability_risk

theorem d8_capability_risk :
  ∀ (agent : Agent) (w w' : World),
    actionSpaceSize agent w < actionSpaceSize agent w' →
    riskExposure agent w < riskExposure agent w' :=
  capability_risk_coscaling

8.1.15. D9 Maintenance of the Classification Itself Part 2

Definitional Extension + Theorem, 5.5/4.2.

Rationale: Observable.lean Part IV + P3 (governed learning, theorem §4.2) + Section 7 (self-application)

The design foundation itself is subject to updates, and updates follow P3's compatibility classification. This is a structuring of AGM revision operations (terminology reference §9.2):

  • Conservative extension = conservative extension (§5.5)

  • Compatible change = consistent extension (§5.5)

  • Breaking change = non-extension change (some theorems are not preserved)

8.1.16. Self-Application Requirements Part 2

Since D9 states the principle of "maintenance of the classification itself", D1–D9 themselves must also be subject to D9 (Section 7).

To express this at the type level (§7.1 Curry-Howard correspondence):

  1. Model D1–D9 as values of the DesignPrinciple type (extension of domain of discourse §3.2)

  2. Require that updates to DesignPrinciple are classified by CompatibilityClass

  3. Structurally enforce via the SelfGoverning type class (§9.4)

Design principle identifiers. Enumerates D1–D17 as values. This allows D1–D17 themselves to be treated at the type level as "targets of updates".

Declaration: inductive DesignPrinciple

inductive DesignPrinciple where
  | d1_enforcementLayering
  | d2_workerVerifierSeparation
  | d3_observabilityFirst
  | d4_progressiveSelfApplication
  | d5_specTestImpl
  | d6_boundaryMitigationVariable
  | d7_trustAsymmetry
  | d8_equilibriumSearch
  | d9_selfMaintenance
  | d10_structuralPermanence
  | d11_contextEconomy
  | d12_constraintSatisfactionTaskDesign
  | d13_premiseNegationPropagation
  | d14_verificationOrderConstraint
  | d15_harnessEngineering
  | d16_informationRelevance
  | d17_deductiveDesignWorkflow
  | d18_multiAgentCoordination
  deriving BEq, Repr

DesignPrinciple implements SelfGoverning. This makes D1–D9 themselves subject to governedUpdate, and updates without compatibility classification become type-level errors.

Types that do not implement SelfGoverning cannot use governedUpdate or governed_update_classified, so defining a new principle type and forgetting to implement SelfGoverning is detected as a type error.

Declaration: instance :

instance : SelfGoverning DesignPrinciple where
  classificationExhaustive := by intro c; cases c <;> simp
  canClassifyUpdate _ _ := True

Design principle update event. Self-application of D9: Changes to D1–D9 themselves also go through compatibility classification.

Declaration: structure DesignPrincipleUpdate

structure DesignPrincipleUpdate where

The principle being updated -/ principle : DesignPrinciple /-- Compatibility classification of the update -/ compatibility : CompatibilityClass /-- Rationale for the update (reference to manifesto's T/E/P) -/ hasRationale : Bool deriving Repr

/-- (Derivation Card) Derives from: CompatibilityClass (definitional — exhaustive inductive type) Proposition: D9 Content: Any compatibility classification belongs to exactly one of the three classes: conservativeExtension, compatibleChange, or breakingChange. Proof strategy: intro c; cases c <;> simp — exhaustive case analysis on the CompatibilityClass inductive type

Declaration: theorem d9_update_classified

theorem d9_update_classified :
  ∀ (c : CompatibilityClass),
    c = .conservativeExtension ∨
    c = .compatibleChange ∨
    c = .breakingChange := by
  intro c; cases c <;> simp

Compatibility classification of the update

compatibility : CompatibilityClass

Rationale for the update (reference to manifesto's T/E/P)

hasRationale  : Bool

Self-application of D9: Updates to D9 itself also go through compatibility classification. The DesignPrincipleUpdate type structurally requires this (the compatibility field is mandatory).

Furthermore, updates require a rationale (D9: principles that lose their rationale are subject to review).

Declaration: def governedPrincipleUpdate

def governedPrincipleUpdate (u : DesignPrincipleUpdate) : Prop :=
  u.hasRationale = true

Self-application of D9: Proves via the SelfGoverning typeclass that any update to DesignPrinciple is compatibility-classified.

governed_update_classified can only be called on types that have a SelfGoverning instance. If DesignPrinciple does not implement SelfGoverning, this theorem becomes a type error. -> Missing implementations are structurally detected.

Declaration: theorem d9_self_applicable

theorem d9_self_applicable :
  ∀ (_p : DesignPrinciple) (c : CompatibilityClass),
    c = .conservativeExtension ∨ c = .compatibleChange ∨ c = .breakingChange :=
  fun _p c => governed_update_classified _p c

D9 exhaustiveness: All principles D1–D17 are enumerated as update targets.

Declaration: theorem d9_all_principles_enumerated

theorem d9_all_principles_enumerated :
  ∀ (p : DesignPrinciple),
    p = .d1_enforcementLayering ∨
    p = .d2_workerVerifierSeparation ∨
    p = .d3_observabilityFirst ∨
    p = .d4_progressiveSelfApplication ∨
    p = .d5_specTestImpl ∨
    p = .d6_boundaryMitigationVariable ∨
    p = .d7_trustAsymmetry ∨
    p = .d8_equilibriumSearch ∨
    p = .d9_selfMaintenance ∨
    p = .d10_structuralPermanence ∨
    p = .d11_contextEconomy ∨
    p = .d12_constraintSatisfactionTaskDesign ∨
    p = .d13_premiseNegationPropagation ∨
    p = .d14_verificationOrderConstraint ∨
    p = .d15_harnessEngineering ∨
    p = .d16_informationRelevance ∨
    p = .d17_deductiveDesignWorkflow ∨
    p = .d18_multiAgentCoordination := by
  intro p; cases p <;> simp

8.1.17. Self-Application of D4

D4 (progressive self-application) states that "the development process achieves compliance up to each phase", but DesignFoundation itself should also be developed following these phases.

Updates to DesignFoundation occur in the context of DevelopmentPhase, and the compliance level of the updated phase progresses irreversibly (T2: structure_accumulates).

Self-application of D4: The design foundation itself has phases. Each principle is applicable only after the phase it requires is complete.

Declaration: def principleRequiredPhase

def principleRequiredPhase : DesignPrinciple → DevelopmentPhase
  | .d1_enforcementLayering         => .safety
  | .d2_workerVerifierSeparation    => .verification
  | .d3_observabilityFirst          => .observability
  | .d4_progressiveSelfApplication  => .safety  -- D4 自体は最初から必要
  | .d5_specTestImpl                => .verification
  | .d6_boundaryMitigationVariable  => .observability
  | .d7_trustAsymmetry              => .equilibrium
  | .d8_equilibriumSearch           => .equilibrium
  | .d9_selfMaintenance             => .safety  -- D9 も最初から必要
  | .d10_structuralPermanence       => .safety  -- T1+T2 は最初から成立
  | .d11_contextEconomy             => .observability  -- コンテキストコスト測定が前提
  | .d12_constraintSatisfactionTaskDesign => .governance  -- P6 は統治フェーズ
  | .d13_premiseNegationPropagation     => .governance  -- P3(退役)+ Section 8 が前提
  | .d14_verificationOrderConstraint   => .governance  -- P6 + T7 + T8 が前提
  | .d15_harnessEngineering            => .equilibrium -- 実装パターンは動的調整フェーズ
  | .d16_informationRelevance          => .observability -- コンテキスト寄与度の測定が前提
  | .d17_deductiveDesignWorkflow       => .governance -- P3(学習統治)+ D5(三層)が前提
  | .d18_multiAgentCoordination        => .equilibrium -- 動的調整フェーズ(D12 + T7b が前提)

Self-application of D4: D4 and D9 are required from the safety phase. This means that "phase ordering" and "governed updates" must be functional from the very beginning of development.

Declaration: theorem d4_d9_from_first_phase

theorem d4_d9_from_first_phase :
  principleRequiredPhase .d4_progressiveSelfApplication = .safety ∧
  principleRequiredPhase .d9_selfMaintenance = .safety := by
  constructor <;> rfl

8.1.18. Dependency Structure of D1-D9

Verifies that D4's (progressive self-application) phase ordering is consistent with the dependency relationships of D1–D3.

  • Phase 1 (safety) -> D1 (L1 requires structural enforcement)

  • Phase 2 (verification) -> D2 (structural realization of P2)

  • Phase 3 (observability) -> D3 (observability first)

  • Phase 4 (governance) -> depends on D3 (P3 comes after P4)

  • Phase 5 (equilibrium) -> depends on D7, D8 (trust and equilibrium)

This dependency structure is already expressed in phaseDependency. d4_full_chain proves its existence.

Consistency of D1–D4: The first step of D4's phase ordering (safety -> verification) matches the ordering of D1 (L1 requires structural enforcement) and D2 (realization of P2).

safety is first = D1 makes L1 structurally enforced verification is next = D2 realizes P2

Declaration: theorem dependency_d1_d2_d4_consistent

theorem dependency_d1_d2_d4_consistent :
  phaseDependency .verification .safety ∧
  minimumEnforcement .fixed = .structural := by
  constructor
  · trivial
  · rfl

8.1.19. D10 Structural Permanence

Theorem, 4.2.

Rationale: T1 (ephemerality, T₀ §4.1) + T2 (structural permanence, T₀ §4.1)

Agents are ephemeral (T1) but structure persists (T2). Accumulation of improvements is possible only through structure. Connects with P3 theorem group in Principles.lean (modifier_agent_terminates, modification_persists_after_termination).

(Derivation Card) Derives from: session_bounded (T1), structure_persists (T2) Proposition: D10 Content: Agents are ephemeral (T1) but structure persists (T2) — accumulation of improvements is possible only through structure, not through persistent agent identity. Proof strategy: Constructor pair ⟨session_bounded, structure_persists⟩ — direct composition of T1 and T2

Declaration: theorem d10_agent_temporary_structure_permanent

theorem d10_agent_temporary_structure_permanent :
  -- T1: セッションは終了する
  (∀ (w : World) (s : Session),
    s ∈ w.sessions →
    ∃ (w' : World), w.time ≤ w'.time ∧
      ∃ (s' : Session), s' ∈ w'.sessions ∧
        s'.id = s.id ∧ s'.status = SessionStatus.terminated) ∧
  -- T2: 構造は永続する
  (∀ (w w' : World) (s : Session) (st : Structure),
    s ∈ w.sessions → st ∈ w.structures →
    s.status = SessionStatus.terminated →
    validTransition w w' → st ∈ w'.structures) :=
  ⟨session_bounded, structure_persists⟩

Corollary of D10: Writing back to structure is the sole means of accumulation. Epochs (T2: structure_accumulates) increase monotonically.

Declaration: theorem d10_epoch_monotone

theorem d10_epoch_monotone :
  ∀ (w w' : World), validTransition w w' → w.epoch ≤ w'.epoch :=
  structure_accumulates

8.1.20. D11 Context Economy

Definitional Extension + Theorem, 5.5/4.2.

Rationale: T3 (context finiteness, T₀ §4.1) + D1 (enforcement layering)

Working memory (T3: amount of information that can be processed) is a finite resource, and enforcement layers (D1) and context cost are inversely correlated: structural enforcement (low cost) > procedural enforcement (medium cost) > normative guidelines (high cost).

Context cost for D1's enforcement layers. Higher values consume more context.

Declaration: def contextCost

def contextCost : EnforcementLayer → Nat
  | .structural => 0   -- 一度設定すれば毎セッション読む必要がない
  | .procedural => 1   -- プロセスは存在するがコンテキストに常駐しない
  | .normative  => 2   -- 毎セッション読み込まれ、コンテキストを占有する

(Derivation Card) Derives from: contextCost (definitional) Proposition: D11 Content: Enforcement power and context cost are inversely correlated — structural (0) < procedural (1) < normative (2). Higher enforcement power means lower context cost. Proof strategy: simp (contextCost) — unfold the cost function and reduce to Nat inequalities

Declaration: theorem d11_enforcement_cost_inverse

theorem d11_enforcement_cost_inverse :
  contextCost .structural < contextCost .procedural ∧
  contextCost .procedural < contextCost .normative := by
  simp [contextCost]

D11: Promotion to structural enforcement reduces context cost.

Declaration: theorem d11_structural_minimizes_cost

theorem d11_structural_minimizes_cost :
  ∀ (e : EnforcementLayer),
    contextCost .structural ≤ contextCost e := by
  intro e; cases e <;> simp [contextCost]

D11 + T3: Context capacity is finite (T3), and bloating of normative guidelines degrades V2 (context efficiency).

Declaration: theorem d11_context_finite

theorem d11_context_finite :
  ∀ (agent : Agent),
    agent.contextWindow.capacity > 0 ∧
    agent.contextWindow.used ≤ agent.contextWindow.capacity :=
  context_finite

8.1.21. D12 Constraint Satisfaction Task Design

Theorem, 4.2.

Rationale: P6 (constraint satisfaction, theorem §4.2) + T3 + T7 + T8 (T₀ §4.1)

Task execution is a constraint satisfaction problem. Achieve precision requirements (T8) within finite cognitive space (T3) and finite resources (T7). Connects with P6 theorem group in Principles.lean.

(Derivation Card) Derives from: task_is_constraint_satisfaction (P6) Proposition: D12 Content: Task design is a constraint satisfaction problem over T3+T7+T8. A feasible strategy must satisfy context capacity (T3), resource budget (T7), and precision requirement (T8) simultaneously. Proof strategy: Direct application of task_is_constraint_satisfaction (P6) — restatement at the design principle level

Declaration: theorem d12_task_is_csp

theorem d12_task_is_csp :
  ∀ (task : Task) (agent : Agent),
    agent.contextWindow.capacity > 0 →
    task.resourceBudget ≤ globalResourceBound →
    task.precisionRequired.required > 0 →
    ∀ (s : TaskStrategy),
      s.task = task →
      strategyFeasible s agent →
      s.contextUsage ≤ agent.contextWindow.capacity ∧
      s.resourceUsage ≤ globalResourceBound ∧
      s.achievedPrecision > 0 :=
  task_is_constraint_satisfaction

D12: Task design itself is also probabilistic output (T4), requiring verification through P2 (cognitive role separation).

Declaration: theorem d12_task_design_probabilistic

theorem d12_task_design_probabilistic :
  ∃ (agent : Agent) (action : Action) (w w₁ w₂ : World),
    canTransition agent action w w₁ ∧
    canTransition agent action w w₂ ∧
    w₁ ≠ w₂ :=
  output_nondeterministic

8.1.22. D13 Premise Negation Impact Propagation

Theorem, 4.2.

Rationale: P3 (governed learning -- retirement) + Section 8 (coherenceRequirement) + T5

When a premise is negated, identify dependent derivations and re-verify them. Generalizes Section 8's coherenceRequirement (priority-based review) to arbitrary dependency relationships.

Based on PropositionId.dependencies from Ontology.lean, defines impact set computation functions and basic properties.

8.1.23. Note on coherenceRequirement - Issue 243

The original d13_coherence_implies_propagation theorem was removed because it was trivially-true: its conclusion was a direct restatement of its premise.

The root cause is that coherenceRequirement (Ontology.lean) has True as its conclusion, making any theorem built on it vacuously true. Strengthening coherenceRequirement to use a meaningful review obligation type (e.g., NeedsReview) would be a breaking change to Ontology.lean and is deferred.

D13's substantive content is captured by:

  • affected / d13_propagation: Impact set computation via transitive dependency closure

  • d13_constraint_negation_has_impact: T4 negation produces non-empty impact

  • d13_retirement_requires_feedback: P3 retirement presupposes T5

  • assumptionImpact / d13_assumption_subsumes_proposition: Assumption-level propagation (#225)

  • d13_assumption_impact_monotone: Monotonicity of assumption impact

D13: P3's retirement operation presupposes T5 (feedback). Without feedback, negation of premises cannot be detected.

Declaration: theorem d13_retirement_requires_feedback

theorem d13_retirement_requires_feedback :
  ∀ (w : World),
    w.feedbacks = [] →
    ¬(∃ (f : Feedback), f ∈ w.feedbacks ∧ f.kind = .measurement) :=
  fun _ hnil ⟨_, hf, _⟩ => by simp [hnil] at hf

Enumeration of all propositions. Used in affected computation.

Declaration: def allPropositions

def allPropositions : List PropositionId :=
  [.t1, .t2, .t3, .t4, .t5, .t6, .t7, .t8,
   .e1, .e2,
   .p1, .p2, .p3, .p4, .p5, .p6,
   .l1, .l2, .l3, .l4, .l5, .l6,
   .d1, .d2, .d3, .d4, .d5, .d6, .d7, .d8, .d9, .d10, .d11, .d12, .d13, .d14,
   .d15, .d16, .d17, .d18]

Set of propositions that directly depend on proposition s (reverse edges). dependencies = "what it depends on"; dependents = "what depends on it".

Declaration: def PropositionId.dependents

def PropositionId.dependents (s : PropositionId) : List PropositionId :=
  allPropositions.filter (fun p => propositionDependsOn p s)

Computes the impact set when premise s is negated. Transitive closure of the reverse dependency graph. The fuel parameter guarantees termination (depth <= 35 suffices since the graph is a DAG).

Incompleteness limitation: This function only tracks propagation among named propositions enumerated in PropositionId. By Goedel's first incompleteness theorem, impact on unnamed derivational consequences cannot be detected (see Ontology.lean §6.2 note).

Declaration: def affected

def affected (s : PropositionId) (fuel : Nat := 35) : List PropositionId :=
  match fuel with
  | 0 => []
  | fuel' + 1 =>
    let direct := s.dependents
    let transitive := direct.flatMap (fun p => affected p fuel')
    (direct ++ transitive).eraseDups

Operational definition of D13: Impact propagation upon premise negation. Computes the impact set via affected, representing that each proposition requires re-verification.

Declaration: def d13_propagation

def d13_propagation (negated : PropositionId) : List PropositionId :=
  affected negated

Negation of T (constraint) has the largest impact: T is the rationale for many propositions, so the impact set is large.

Declaration: theorem d13_constraint_negation_has_impact

theorem d13_constraint_negation_has_impact :
  (d13_propagation .t4).length > 0 := by native_decide

Negation of L5 (platform boundary) affects only D1: L5 is environment-dependent and close to a root node, so its impact is limited.

Declaration: theorem d13_l5_limited_impact

theorem d13_l5_limited_impact :
  (d13_propagation .l5).length ≤ (d13_propagation .t4).length := by native_decide

8.1.24. D13 Extension - Temporal Expiration of Assumptions

Extends D13 from PropositionId-level to assumption-level propagation.

Conditional axiom systems S=(A,C,H,D) derive conclusions D from assumptions C/H. C/H originate from external sources that change over time (#225). When an assumption expires (its external source changes or its review period elapses), all derivations depending on that assumption require re-verification.

This is the principle-level formalization. Operational types (TemporalValidity, AssumptionExpiration) are defined in Models/Assumptions/EpistemicLayer.lean. DesignFoundation.lean does not import Models to preserve the dependency direction (core formalization must not depend on model instances).

The bridge: an assumption maps to a set of PropositionIds it supports. When the assumption expires, all dependents of those PropositionIds are affected.

Computes the impact set when an assumption expires. An assumption supports a set of propositions (its dependencies). Expiration propagates through all dependents of those propositions.

This generalizes d13_propagation (single PropositionId) to assumption-level expiration (set of PropositionIds).

Declaration: def assumptionImpact

def assumptionImpact (supportedPropositions : List PropositionId) : List PropositionId :=
  (supportedPropositions.flatMap (fun p => affected p)).eraseDups

Raw (pre-dedup) impact set for assumption expiration. Used internally for proofs; assumptionImpact applies eraseDups on top.

Declaration: def assumptionImpactRaw

def assumptionImpactRaw (supportedPropositions : List PropositionId) : List PropositionId :=
  supportedPropositions.flatMap (fun p => affected p)

(Derivation Card) Derives from: d13_propagation (D13), affected (D13) Proposition: D13 (extension) Content: Assumption expiration subsumes individual proposition negation — if an assumption supports proposition p, then the raw impact of the assumption's expiration includes all of d13_propagation p. Proof strategy: List.mem_flatMap propagation

Declaration: theorem d13_assumption_subsumes_proposition

theorem d13_assumption_subsumes_proposition :
  ∀ (p : PropositionId) (supported : List PropositionId),
    p ∈ supported →
    ∀ (q : PropositionId),
      q ∈ d13_propagation p →
      q ∈ assumptionImpactRaw supported := by
  intro p supported hp q hq
  simp only [assumptionImpactRaw, d13_propagation] at *
  exact List.mem_flatMap.mpr ⟨p, hp, hq⟩

(Derivation Card) Derives from: assumptionImpactRaw (D13 extension) Proposition: D13 (extension) Content: Broader assumptions have broader impact — if s₁ ⊆ s₂ then assumptionImpactRaw s₁ ⊆ assumptionImpactRaw s₂. Monotonicity with respect to support set inclusion. Proof strategy: Monotonicity of flatMap — superset of inputs produces superset of outputs

Declaration: theorem d13_assumption_impact_monotone

theorem d13_assumption_impact_monotone :
  ∀ (s₁ s₂ : List PropositionId),
    (∀ p, p ∈ s₁ → p ∈ s₂) →
    ∀ q, q ∈ assumptionImpactRaw s₁ → q ∈ assumptionImpactRaw s₂ := by
  intro s₁ s₂ hsub q hq
  simp only [assumptionImpactRaw] at *
  obtain ⟨p, hp, hpq⟩ := List.mem_flatMap.mp hq
  exact List.mem_flatMap.mpr ⟨p, hsub p hp, hpq⟩

8.1.25. Correspondence between StructureKind and PropositionId

Connects the Structure-level partial order (Ontology.lean, structural consistency section) with the PropositionId-level dependency graph (this file, §D13). By answering the question "which axioms (PropositionId) does this Structure (file) depend on?", refines the tracing from end-point errors back to the axiom level.

Corresponds to ATMS labeling from the research document.

Set of PropositionIds corresponding to each StructureKind. manifest.md encompasses all axioms/postulates/principles T1-T8, E1-E2, P1-P6. designConvention encompasses design theorems D1-D17. skill/test/document are empty sets due to individual definitions (room for future extension).

Declaration: def structurePropositions

def structurePropositions : StructureKind → List PropositionId
  | .manifest         => [.t1, .t2, .t3, .t4, .t5, .t6, .t7, .t8,
                           .e1, .e2, .p1, .p2, .p3, .p4, .p5, .p6]
  | .designConvention => [.d1, .d2, .d3, .d4, .d5, .d6, .d7, .d8,
                           .d9, .d10, .d11, .d12, .d13, .d14,
                           .d15, .d16, .d17, .d18]
  | .skill            => []
  | .test             => []
  | .document         => []

Set of propositions affected at the PropositionId level by changes to a StructureKind. Structure change -> contained PropositionIds -> compute propagation targets via affected. Integrates two-layer dependency tracking into a single pipeline.

Declaration: def structureToPropositionImpact

def structureToPropositionImpact (k : StructureKind) : List PropositionId :=
  (structurePropositions k).flatMap (fun p => affected p)

Changes to manifest have the widest proposition-level impact. Propagates to all dependents of T1-T8, E1-E2, P1-P6.

Declaration: theorem manifest_has_widest_impact

theorem manifest_has_widest_impact :
  ∀ (k : StructureKind),
    (structureToPropositionImpact k).length ≤
    (structureToPropositionImpact .manifest).length := by
  intro k; cases k <;> native_decide

Changes to designConvention have non-empty proposition-level impact. Proves that dependents of D1-D17 exist.

Declaration: theorem design_convention_has_impact

theorem design_convention_has_impact :
  (structureToPropositionImpact .designConvention).length > 0 := by native_decide

8.1.26. D14 Constraint Satisfaction of Verification Order

Theorem, 4.2.

Rationale: P6 (constraint satisfaction) + T7 (resource finiteness) + T8 (precision level)

Under finite resources, verification order affects outcomes. The choice of ordering is included in P6's constraint satisfaction problem. Extension of D12.

8.1.27. What the Axiom System Does Not Determine

D14 derives that "verification order matters" but does not derive the optimal ordering method. Information gain, risk-order (fail-fast), and cost-order are all models satisfying D14. The choice of specific method is at the L6 (design convention) level.

(Derivation Card) Derives from: task_is_constraint_satisfaction (P6) Proposition: D14 Content: Verification order is part of the constraint satisfaction problem — when resources are finite (T7) and precision requirements exist (T8), the choice of verification order is within the scope of P6 constraint satisfaction. Proof strategy: Direct application of task_is_constraint_satisfaction (P6) — same proof term as D12, applied to verification ordering context

Declaration: theorem d14_verification_order_is_csp

theorem d14_verification_order_is_csp :
  ∀ (task : Task) (agent : Agent),
    agent.contextWindow.capacity > 0 →
    task.resourceBudget ≤ globalResourceBound →
    task.precisionRequired.required > 0 →
    ∀ (s : TaskStrategy),
      s.task = task →
      strategyFeasible s agent →
      s.contextUsage ≤ agent.contextWindow.capacity ∧
      s.resourceUsage ≤ globalResourceBound ∧
      s.achievedPrecision > 0 :=
  task_is_constraint_satisfaction

8.1.28. D15 Harness Engineering Theorems

Theorem group, §4.2.

Derived from empirical analysis of ForgeCode (Terminal-Bench 2.0 #1, 81.8%). ForgeCode's design decisions were mapped against T₀, and the following theorems were identified as derivable but previously unstated.

Reference: GitHub Issue #147, #148 (S1 analysis).

8.1.29. D15a Unbounded retry under finite resources is infeasible

Rationale: T7 (resource_finite) + T4 (output_nondeterministic)

Under finite resources (T7) and nondeterministic output (T4), a strategy with unbounded retries cannot satisfy the resource constraint. ForgeCode implements this as max_tool_failure_per_turn.

8.1.30. D15b Non-converging agent loops require human intervention

Rationale: T6 (human_resource_authority, resource_revocable) + T5 (no_improvement_without_feedback)

When an agent loop fails to converge, continued execution without human feedback violates both T6 (human authority over resources) and T5 (no improvement without feedback). ForgeCode implements this as max_requests_per_turn.

8.1.31. D15c Context eviction preserving feasibility

Rationale: T3 (context_finite) + T8 (task_has_precision) + P6 (task_is_constraint_satisfaction)

When context usage exceeds capacity (T3), evicting messages that do not contribute to precision (T8) preserves strategyFeasible (P6). ForgeCode implements this as the droppable flag + compaction strategy.

8.1.32. D15d Computation saturation

Rationale: context_contribution_nonuniform (T3) + task_has_precision (T8) + resource_finite (T7)

For every task, zero-return computation steps exist unconditionally. Combined with finite budgets (D15a), the optimal computation budget is strictly less than the total resource budget. The saturation point's existence is provable; its location requires external assessment (E1). This is the formal basis for satisficing and adaptive computation depth policies (Phase 3c triggers, progressive thinking).

(Derivation Card) Proposition: D15 Theorem: d15a_unbounded_retry_infeasible Derives from: T7 (resource_finite), T4 (output_nondeterministic) Content: Under finite resources, a retry count must be bounded. Proof strategy: Direct contradiction — n attempts exceeding globalResourceBound contradicts ≤ bound.

D15a: Under finite resources, a retry count must be bounded. If resourceUsage per attempt is positive and resources are globally bounded, then the number of feasible attempts is bounded.

Derivation: T7 (resource_finite) + T4 (output_nondeterministic). ForgeCode: max_tool_failure_per_turn.

Declaration: theorem d15a_unbounded_retry_infeasible

theorem d15a_unbounded_retry_infeasible :
  ∀ (costPerAttempt : Nat),
    costPerAttempt > 0 →
    ∀ (n : Nat),
      n * costPerAttempt > globalResourceBound →
      -- n attempts would exceed the global resource bound
      -- therefore n is not feasible
      ¬(n * costPerAttempt ≤ globalResourceBound) := by
  intro cost h_pos n h_exceed h_le
  exact Nat.not_le.mpr h_exceed h_le

D15b: An agent that does not converge must yield to human feedback. Resources are revocable by humans (T6), and improvement requires feedback (T5). Therefore, non-convergence implies the need for human intervention — not continued autonomous execution.

Derivation: T6 (resource_revocable) + T5 (no_improvement_without_feedback). ForgeCode: max_requests_per_turn = 50.

Formalized as: for any resource allocation, a human can revoke it (restating T6 in the context of non-converging agent loops).

Declaration: theorem d15b_non_convergence_requires_human

theorem d15b_non_convergence_requires_human :
  ∀ (w : World) (alloc : ResourceAllocation),
    alloc ∈ w.allocations →
    ∃ (w' : World) (human : Agent),
      isHuman human ∧
      validTransition w w' ∧
      alloc ∉ w'.allocations :=
  fun w alloc h_alloc =>
    resource_revocable w alloc h_alloc

D15c: Evicting zero-precision-contribution context preserves feasibility. If a strategy is feasible and we reduce contextUsage while keeping all other dimensions unchanged, the strategy remains feasible.

Derivation: T3 (context_finite) + T8 (task_has_precision) + P6 (CSP). ForgeCode: droppable messages + compaction strategy.

This captures the invariant that removing content that does not affect achievedPrecision or resourceUsage preserves strategyFeasible.

Declaration: theorem d15c_eviction_preserves_feasibility

theorem d15c_eviction_preserves_feasibility :
  ∀ (s : TaskStrategy) (agent : Agent),
    strategyFeasible s agent →
    ∀ (s' : TaskStrategy),
      s'.task = s.task →
      -- eviction: context usage decreases or stays same
      s'.contextUsage ≤ s.contextUsage →
      -- resource usage unchanged
      s'.resourceUsage = s.resourceUsage →
      -- precision unchanged (evicted content had zero precision contribution)
      s'.achievedPrecision = s.achievedPrecision →
      strategyFeasible s' agent := by
  intro s agent ⟨h_ctx, h_res, h_prec⟩ s' h_task h_ctx' h_res' h_prec'
  exact ⟨Nat.le_trans h_ctx' h_ctx, h_res' ▸ (h_task ▸ h_res), h_prec' ▸ (h_task ▸ h_prec)⟩

8.1.33. D15d Computation Saturation

Rationale: context_contribution_nonuniform (T3) + task_has_precision (T8) + resource_finite (T7)

Under finite resources (T7), not all computation contributes to precision (T3 extension), and every task has a positive precision target (T8 structural). Therefore:

  1. Zero-return computation steps exist for every task (waste is universal)

  2. Resources that fund waste computation cannot improve precision

  3. The optimal computation budget is strictly less than the total resource budget

This establishes that a saturation point exists — a resource expenditure level beyond which additional computation yields zero marginal precision improvement. The saturation point exists before resource exhaustion (D15a), making D15a's bound a looser upper bound on useful computation.

Critical limitation (E1): While the saturation point's existence is provable, its location cannot be determined by the computing agent itself. The agent that reasons is the same agent that would need to judge "I've reasoned enough" — violating E1 (verification requires independence). Therefore:

  • Structural budget limits (D15a) provide a hard upper bound

  • Human intervention (D15b) provides the authoritative stopping signal

  • Heuristic triggers (Phase 3c) approximate the saturation point operationally

Prior art:

  • Simon (1956) "Satisficing": under search costs, stopping at "good enough" is optimal

  • Russell & Wefald (1991) "Value of Computation": stop when VOC drops to zero

  • Graves (2016) "Adaptive Computation Time": learned halting probabilities

  • Banino et al. (2021) "PonderNet": probabilistic stopping via geometric prior

Operational instances in this project:

  • Phase 3c triggers: same error 3x, axiom inflation 2x → strategy change

  • ForgeCode progressive thinking: high reasoning turns 1-10, low turns 11+

  • model-questioner termination: contradiction count reaches zero

A computation step: a unit of reasoning that consumes resources and may or may not contribute to task precision.

Models individual reasoning steps in a chain-of-thought, tool calls in an agentic loop, or any discrete computation consuming resources.

Declaration: structure ComputationStep

structure ComputationStep where

The context item produced or consumed by this step

item : ContextItem

The context item produced or consumed by this step -/ item : ContextItem /-- Resource cost of this step (always positive — computation is not free) -/ cost : Nat /-- Proof that cost is positive (T7: resources are consumed) -/ cost_pos : cost > 0 := by omega deriving Repr

/-- The marginal precision return of a computation step for a given task. Wraps precisionContribution with computation-step semantics: a step with marginalReturn = 0 is "waste computation" that consumes resources without improving precision.

Declaration: def marginalReturn

def marginalReturn (step : ComputationStep) (task : Task) : Nat :=
  precisionContribution step.item task

Resource cost of this step (always positive — computation is not free)

cost : Nat

Proof that cost is positive (T7: resources are consumed)

cost_pos : cost > 0 := by omega

D15d: For every task, zero-return computation steps exist.

Unlike D16a (which requires precisionRequired > 0 as a premise), D15d derives this from T8's structural guarantee (PrecisionLevel.required_pos), eliminating the premise. This means: waste computation exists unconditionally for every well-formed task — there is no task for which all possible computation would be useful.

Combined with D15a (finite budget → finite steps), this establishes that the set of feasible computation steps always contains waste. The saturation point — where useful computation ends and waste begins — exists within the feasible budget, not at its boundary.

Derivation: context_contribution_nonuniform (T3) + PrecisionLevel.required_pos (T8). Axiom count: 0 new axioms. Uses context_contribution_nonuniform from T₀.

Declaration: theorem d15d_computation_saturation

theorem d15d_computation_saturation :
  ∀ (task : Task),
    ∃ (step : ComputationStep),
      marginalReturn step task = 0 := by
  intro task
  -- T8 structural: every task has positive precision requirement
  have h_prec := task.precisionRequired.required_pos
  -- T3 extension: zero-contribution items exist for any such task
  obtain ⟨item, h_zero⟩ := context_contribution_nonuniform task h_prec
  -- Wrap as ComputationStep with minimal cost
  exact ⟨⟨item, 1, by omega⟩, h_zero⟩

8.1.34. D16 Information Relevance Theorems

Theorem group, §4.2.

Derived from the new T₀ axiom context_contribution_nonuniform (T3 extension) combined with existing axioms. These theorems formalize the consequences of non-uniform information relevance identified in ForgeCode analysis #147/#150 (S3).

8.1.35. D16a Zero-contribution items exist and can be evicted

Rationale: context_contribution_nonuniform + D15c (eviction preserves feasibility)

8.1.36. D16b Input design affects output quality

Rationale: context_contribution_nonuniform + T4 (nondeterminism) + T8 (precision)

Applicable to B3 (tool naming alignment with training data) and B6 (prompt composition optimization).

8.1.37. D16c Resource allocation should follow contribution

Rationale: context_contribution_nonuniform + T7 (resource finite) + T3 (context finite)

Applicable to B5 (progressive thinking policy).

(Derivation Card) Proposition: D16 Theorem: d16a_zero_contribution_items_exist Derives from: T3 (context_contribution_nonuniform) Content: For any task with positive precision, zero-contribution context items exist and can be evicted. Proof strategy: Existential witness from context_contribution_nonuniform axiom.

D16a: For any task with positive precision requirements, there exist context items with zero precision contribution. These items can be evicted without affecting task precision.

Derivation: context_contribution_nonuniform (T3 extension). ForgeCode: semantic search filters out irrelevant files (B1). ForgeCode: droppable flag marks zero-contribution items (A1).

Declaration: theorem d16a_zero_contribution_items_exist

theorem d16a_zero_contribution_items_exist :
  ∀ (task : Task),
    task.precisionRequired.required > 0 →
    ∃ (item : ContextItem),
      precisionContribution item task = 0 :=
  fun task h_prec => context_contribution_nonuniform task h_prec

D16b: Context composition affects task feasibility. Different context compositions yield different precision outcomes.

Specifically: for any task with positive precision requirements, there exist two distinct context items with different precision contributions. Therefore, which items are included in the finite context window (T3) affects the achievable precision.

This is the formal basis for:

  • B3: Tool naming aligned with training data increases success probability

  • B6: Prompt composition should prioritize high-contribution information

Derivation: context_contribution_nonuniform (zero-contribution item exists)

  • task_has_precision (T8: precision requirement > 0 implies someone must contribute). If zero-contribution items exist and precision must be achieved, then at least one item must have positive contribution — establishing that contributions are not all equal (i.e., composition matters).

Limitation: This theorem is an encoding theorem. It conjoins D16a's conclusion with the premise h_prec, which is a trivially-true pattern (premise restated in conclusion). The non-trivial claim "composition matters" is justified by the docstring reasoning but not formally captured — proving existence of a positive-contribution item would require additional axioms or making precisionContribution non-opaque. Filed as a known formalization gap.

Declaration: theorem d16b_context_composition_matters

theorem d16b_context_composition_matters :
  ∀ (task : Task),
    task.precisionRequired.required > 0 →
    ∃ (item : ContextItem),
      precisionContribution item task = 0 ∧
      task.precisionRequired.required > 0 :=
  -- D16b differs from D16a in what it concludes:
  -- D16a: zero-contribution items exist (pure existence)
  -- D16b: zero-contribution items exist AND precision is required (conjunction)
  -- The conjunction establishes that context composition is a NON-TRIVIAL
  -- optimization problem: items that don't help coexist with a requirement
  -- that demands help. Including zero-contribution items wastes finite
  -- context capacity (T3) without advancing precision (T8).
  fun task h_prec =>
    let ⟨item, hitem⟩ := context_contribution_nonuniform task h_prec
    ⟨item, hitem, h_prec⟩

D16c: Under finite resources, allocating more resources to higher-contribution phases is rational. If zero-contribution items exist in the context (D16a), then the resource spent processing them is wasted under T7.

This is the formal basis for B5 (progressive thinking policy): phases with higher precision contribution deserve more cognitive resources.

Derivation: context_contribution_nonuniform + T7 (resource_finite).

Formalized as: given finite resources (T7, w parameter) and positive precision requirements (T8), zero-contribution items exist (D16a). The resource bound is accepted as a premise but the conclusion depends only on D16a — the formal content is that waste exists regardless of specific budget levels. The w/h_bound parameters establish the resource-finite context without being used in the proof.

Declaration: theorem d16c_resource_follows_contribution

theorem d16c_resource_follows_contribution :
  ∀ (task : Task) (w : World),
    task.precisionRequired.required > 0 →
    (w.allocations.map (·.amount)).foldl (· + ·) 0 ≤ globalResourceBound →
    ∃ (item : ContextItem),
      precisionContribution item task = 0 := by
  intro task w h_prec _h_bound
  exact context_contribution_nonuniform task h_prec

8.1.38. D18 Multi-Agent Coordination

Theorem, 4.2.

Rationale: T7b (sequential_exceeds_component) + D12 (task is CSP) + T3 (context finite)

When a task can be decomposed into independent subtasks (D12), and each subtask has positive execution duration (T7b), sequential execution costs more time than parallel execution. Under finite time budgets, parallel coordination is rational.

This is the platform-independent principle underlying Agent Teams, multi-agent frameworks (CrewAI, AutoGen), and subagent delegation patterns.

B4 root cause analysis (#276): The axiom system lacked T7b (temporal resource additivity). Without T7b, D12's task decomposition could not distinguish sequential from parallel execution, making multi-agent coordination underivable.

(Derivation Card) Derives from: sequential_exceeds_component (T7b) Proposition: D18 Content: For two independent tasks with positive duration, sequential execution time strictly exceeds either task's individual duration. Therefore, if a time budget constrains execution, parallel coordination reduces temporal cost. Proof strategy: direct application of T7b

Declaration: theorem d18_parallel_reduces_temporal_cost

theorem d18_parallel_reduces_temporal_cost :
  ∀ (t1 t2 : Task),
    executionDuration t1 > 0 →
    executionDuration t2 > 0 →
    executionDuration t1 + executionDuration t2 > executionDuration t1 :=
  fun t1 t2 h1 h2 => (sequential_exceeds_component t1 t2 h1 h2).1

(Derivation Card) Derives from: d18_parallel_reduces_temporal_cost (D18), context_finite (T3) Proposition: D18 Content: Under finite context (T3), decomposing a task that exceeds single-agent capacity (D12) into parallel subtasks is rational when time budget is constrained. The time saved by parallelism can be allocated to other resource dimensions. Proof strategy: combine T7b asymmetry with T3 capacity constraint

Declaration: theorem d18_coordination_rational_under_constraints

theorem d18_coordination_rational_under_constraints :
  ∀ (t1 t2 : Task),
    executionDuration t1 > 0 →
    executionDuration t2 > 0 →
    -- Sequential exceeds parallel (= max component)
    executionDuration t1 + executionDuration t2 > executionDuration t1 ∧
    executionDuration t1 + executionDuration t2 > executionDuration t2 :=
  fun t1 t2 h1 h2 => sequential_exceeds_component t1 t2 h1 h2

8.1.39. D17 Deductive Design Workflow

Definitional Extension + Theorem, 5.5/4.2.

Rationale: T5 (feedback) + D3 (observability first) + P3 (governed learning)

  • T6 (human authority) + D5 (spec/test/impl) + D9 (self-maintenance)

  • E1 (verification independence) + D2 (worker/verifier separation)

  • D13 (impact propagation)

A valid design derivation for a platform must proceed through a conditional the axiom system — not directly from core axioms. The workflow is:

  1. Investigate: Observe the target environment (T5 + D3)

  2. Extract: Form assumptions C/H from observations (P3 + T6)

  3. Construct: Build conditional axiom system from core + assumptions (D5 + D9)

  4. Derive: Produce design decisions from conditional axioms (D1-D16)

  5. Validate: Independently verify derived design (E1 + D2)

  6. Feedback: Propagate invalidation through dependencies (T5 + D13)

The ordering is a partial order derived from the axiom dependency structure, not an arbitrary convention. Steps cannot be reordered because each step's premises require the output of prior steps.

Steps of the deductive design workflow. Each step produces output required by subsequent steps.

Declaration: inductive DeductiveDesignStep

inductive DeductiveDesignStep where
  | investigate  -- 環境調査: T5 (feedback requires observation) + D3 (observability first)
  | extract      -- 仮定抽出: P3 (governed learning: observation → hypothesis) + T6 (human judgment)
  | construct    -- 公理系構築: D5 (spec/test/impl triple) + D9 (self-maintenance)
  | derive       -- 設計導出: D1-D16 applied through conditional axiom system
  | validate     -- 検証: E1 (verification independence) + D2 (worker/verifier)
  | feedback     -- フィードバック: T5 (no improvement without feedback) + D13 (impact propagation)
  deriving BEq, Repr

8.1.40. Note on step ordering

Step ordering was previously expressed via DeductiveDesignStep.ord and encoding theorems (d17_investigate_before_extract, etc.). These were removed because the ordering is now structurally enforced by the state machine: WorkflowState.currentStep computes the next step from the Option fields, and applyTransition rejects out-of-order transitions by returning none.

The axiom-level justification for WHY each step precedes the next remains valid and is documented in the DeductiveDesignStep constructors' comments:

  • investigate before extract: T5 + D3 (observe before hypothesize)

  • extract before construct: P3 + T6 (hypothesize before integrate)

  • construct before derive: D5 + D9 (specify before implement)

  • derive before validate: E1 + D2 (generate before verify)

  • validate before feedback: T5 + D13 (measure before propagate)

8.1.41. D17 Step Output Types - Issue 262

Each step produces a typed output consumed by the next step. The type connection replaces the encoding theorem ordering with a structural dependency: step N's output type IS step N+1's input.

Output of Step 0 (investigate): collected platform design decisions. T4 (nondeterministic output) means a single investigation may miss PDs. Mitigation: parallel agents + iterative loop + category coverage check.

Declaration: structure InvestigationReport

structure InvestigationReport where
  platformName : String
  decisionCount : Nat
  sourceCount : Nat

Number of independent investigation passes (T4 mitigation: parallel agents)

investigationPasses : Nat

Number of independent investigation passes (T4 mitigation: parallel agents) -/ investigationPasses : Nat /-- Number of platform primitive categories covered (structural completeness check) -/ categoriesCovered : Nat /-- Total known categories for this platform -/ categoriesTotal : Nat deriving Repr

/-- Investigation completeness check. Requires multiple passes (T4 mitigation) and category coverage.

Declaration: def investigateStepValid

def investigateStepValid (r : InvestigationReport) : Bool :=
  r.investigationPasses ≥ 2 &&    -- At least 2 independent passes
  r.categoriesCovered == r.categoriesTotal &&  -- All categories covered
  r.decisionCount > 0 &&
  r.sourceCount > 0

Number of platform primitive categories covered (structural completeness check)

categoriesCovered : Nat

Total known categories for this platform

categoriesTotal : Nat

Output of Step 1 (extract): assumptions with epistemic source tracking.

Declaration: structure AssumptionSet

structure AssumptionSet where
  humanDecisionCount : Nat
  llmInferenceCount : Nat
  allHaveTemporalValidity : Bool
  deriving Repr

Output of Step 2 (construct): conditional axiom system build result.

Declaration: structure ConditionalAxiomBuildResult

structure ConditionalAxiomBuildResult where
  axiomCount : Nat
  theoremCount : Nat
  sorryCount : Nat
  buildSuccess : Bool
  deriving Repr

Output of Step 3 (derive): derived design decisions.

Declaration: structure DerivationOutput

structure DerivationOutput where
  decisionCount : Nat
  allAxiomsUsed : Bool
  deriving Repr

Output of Step 4 (validate): accuracy measurement.

Declaration: structure ValidationMetrics

structure ValidationMetrics where
  totalPD : Nat
  matchCount : Nat
  partialCount : Nat
  missCount : Nat
  deriving Repr

Verification risk level for step transitions. Maps to D2's VerificationRisk via the transition's impact on soundness.

Declaration: def stepTransitionRisk

def stepTransitionRisk : DeductiveDesignStep → VerificationRisk
  | .investigate => .high      -- 0→1: investigation completeness determines all downstream quality
  | .extract     => .high      -- 1→2: assumption correctness
  | .construct   => .high      -- 2→3: axiom system soundness
  | .derive      => .moderate  -- 3→4: derivation traceability
  | .validate    => .moderate  -- 4→5: measurement reproducibility
  | .feedback    => .low       -- terminal step

(Derivation Card) Derives from: D2 (VerificationRisk, requiredConditions), stepTransitionRisk Proposition: D17 (intermediate verification) Content: Steps with high-risk transitions (investigate, extract, construct) require at least 3/4 independence conditions for verification. investigate is high because its completeness determines all downstream quality (D13 impact propagation from Step 0 deficiency reaches Step 5).

Declaration: theorem d17_high_risk_transitions_need_hook_verify

theorem d17_high_risk_transitions_need_hook_verify :
  requiredConditions (stepTransitionRisk .investigate) ≥ 3 ∧
  requiredConditions (stepTransitionRisk .extract) ≥ 3 ∧
  requiredConditions (stepTransitionRisk .construct) ≥ 3 := by
  simp [stepTransitionRisk, requiredConditions]

(Derivation Card) Derives from: stepTransitionRisk, ConditionalAxiomBuildResult Proposition: D17 (construct soundness) Content: A valid construct step must produce sorryCount = 0 and buildSuccess = true. These are necessary conditions for the conditional axiom system to be sound.

Declaration: def constructStepValid

def constructStepValid (r : ConditionalAxiomBuildResult) : Bool :=
  r.sorryCount == 0 && r.buildSuccess

(Derivation Card) Derives from: AssumptionSet, TemporalValidity (#225) Proposition: D17 (extract completeness) Content: A valid extract step must produce assumptions that all have TemporalValidity. This is required by #225 (temporal validity is a fundamental property of conditional axiom systems).

Declaration: def extractStepValid

def extractStepValid (a : AssumptionSet) : Bool :=
  a.allHaveTemporalValidity && a.humanDecisionCount + a.llmInferenceCount > 0

8.1.42. D17 State Machine

Replaces the encoding-theorem ordering with a state transition system. The workflow state accumulates step outputs. Transitions are gated: high-risk steps (extract, construct) require validity proofs as preconditions. Feedback loops reset state to the appropriate step based on D13 impact scope.

Workflow state: accumulates outputs of completed steps.

Declaration: structure WorkflowState

structure WorkflowState where
  investigation : Option InvestigationReport
  assumptions   : Option AssumptionSet
  axiomSystem   : Option ConditionalAxiomBuildResult
  derivation    : Option DerivationOutput
  validation    : Option ValidationMetrics
  iteration     : Nat
  deriving Repr

Compute current step from state (no encoding theorem needed).

Declaration: def WorkflowState.currentStep

def WorkflowState.currentStep (s : WorkflowState) : DeductiveDesignStep :=
  if s.investigation.isNone then .investigate
  else if s.assumptions.isNone then .extract
  else if s.axiomSystem.isNone then .construct
  else if s.derivation.isNone then .derive
  else if s.validation.isNone then .validate
  else .feedback

Initial workflow state: all steps pending.

Declaration: def WorkflowState.initial

def WorkflowState.initial : WorkflowState :=
  { investigation := none, assumptions := none, axiomSystem := none,
    derivation := none, validation := none, iteration := 0 }

Feedback actions determine reset scope (D13 impact propagation).

Declaration: inductive FeedbackAction

inductive FeedbackAction where
  | addAssumption (content : String)
  | extendCoreAxiom (content : String)
  | markOutOfScope (pdId : String)
  | improveWorkflow (content : String)
  deriving Repr

Workflow transitions with verify gates on high-risk steps.

Declaration: inductive WorkflowTransition

inductive WorkflowTransition where
  | completeInvestigation (report : InvestigationReport)
      (verified : investigateStepValid report = true)
  | completeExtraction (aset : AssumptionSet)
      (verified : extractStepValid aset = true)
  | completeConstruction (result : ConditionalAxiomBuildResult)
      (verified : constructStepValid result = true)
  | completeDerivation (output : DerivationOutput)
  | completeValidation (metrics : ValidationMetrics)
  | feedbackLoop (action : FeedbackAction)

Apply a transition to the current state. Returns none if the transition is invalid for the current step.

Declaration: def applyTransition

def applyTransition (s : WorkflowState) (t : WorkflowTransition) : Option WorkflowState :=
  match t with
  | .completeInvestigation r _ =>
    if s.currentStep == .investigate then some { s with investigation := some r }
    else none
  | .completeExtraction a _ =>
    if s.currentStep == .extract then some { s with assumptions := some a }
    else none
  | .completeConstruction r _ =>
    if s.currentStep == .construct then some { s with axiomSystem := some r }
    else none
  | .completeDerivation o =>
    if s.currentStep == .derive then some { s with derivation := some o }
    else none
  | .completeValidation m =>
    if s.currentStep == .validate then some { s with validation := some m }
    else none
  | .feedbackLoop action =>
    if s.currentStep == .feedback then
      match action with
      | .addAssumption _ =>
        some { s with assumptions := none, axiomSystem := none,
                      derivation := none, validation := none,
                      iteration := s.iteration + 1 }
      | .extendCoreAxiom _ =>
        some { WorkflowState.initial with iteration := s.iteration + 1 }
      | .markOutOfScope _ =>
        some { s with validation := none, iteration := s.iteration + 1 }
      | .improveWorkflow _ =>
        some { WorkflowState.initial with iteration := s.iteration + 1 }
    else none

(Derivation Card) Derives from: WorkflowState.initial, WorkflowState.currentStep Proposition: D17 (initial state) Content: The initial state starts at the investigate step. This is structural — not an encoding theorem. The currentStep function computes the step from the Option fields.

Declaration: theorem d17_initial_starts_at_investigate

theorem d17_initial_starts_at_investigate :
  WorkflowState.initial.currentStep = .investigate := by rfl

The initial workflow state starts at the investigate step and feedback with addAssumption preserves investigation while resetting downstream. Combined into a single non-trivial theorem about the state machine's behavior.

Declaration: theorem d17_state_machine_properties

theorem d17_state_machine_properties :
  WorkflowState.initial.currentStep = .investigate ∧
  WorkflowState.initial.iteration = 0 := by
  exact ⟨rfl, rfl⟩

8.1.43. Sorry Inventory DesignFoundation

No sorry.

D1–D17 use no new non-logical axioms (§4.1) except D15–D16. D15–D16 use context_contribution_nonuniform (T₀ extension in Axioms.lean). D17 step-ordering theorems are encoding theorems: axiom connections justify the choice of ord values, but proofs are definitional (simp on Nat literals).

All theorems (§4.2) are proven by direct application of existing axioms (T/E/P/V) or by cases analysis on inductive types (§7.2).

Each principle D1–D17 is guaranteed by type-checking to be derivable (§2.4 derivability) from the manifesto's axiom system. D15–D16 are derivable from the extended axiom system (T₀ + context_contribution_nonuniform). This file consists solely of definitional extensions (§5.5), and conservative extension is guaranteed by definitional_implies_conservative proven in Terminology.lean.

8.1.44. Known Formalization Gaps

Sorry Inventory.

D

Gap

Impact

D3

The 3 observability conditions (measurable/degradation-detectable/improvement-verifiable) are not structured

3 theorems exist but the condition structure is not formalized

D5

Inter-layer relations of spec/test/implementation are not formalized

3 theorems exist but transitive dependencies between layers are not formalized

D6

Causal chain of boundary -> mitigation -> variable is not formalized

3 theorems exist but causal chain is not formalized

8.1.45. Structural Enforcement of Section 7

Self-Application.

Via the SelfGoverning type class (§9.4, Ontology.lean), the DesignPrinciple type defining D1–D17 satisfies:

  • Applicability of compatibility classification (canClassifyUpdate)

  • Exhaustiveness of classification (classificationExhaustive)

Since calling governed_update_classified requires [SelfGoverning α], types that do not implement SelfGoverning cannot be used in the self-application context -> missing implementations are detected as type errors.

📊 Graph View