Agent Manifesto: Formal Specification
This document provides a navigable, hyperlinked view of the Agent Manifesto's Lean 4 formalization. The project encodes a set of axioms, boundary conditions, observable variables, and design principles as formal Lean definitions and theorems.
Project statistics: 55 axioms, 488 theorems, 0 sorry.
1. Structure
The formalization is organized into the following modules:
- Ontology (`Manifest.Ontology`) -- Core type definitions: `Session`, `Structure`, `Context`, `Resource`, `Task`, and boundary conditions L1--L6
- Axioms (`Manifest.Axioms`) -- Base theory T0: axioms T1--T8 encoding fundamental constraints
- Empirical Postulates (`Manifest.EmpiricalPostulates`) -- Empirical postulates E1--E2
- Observable (`Manifest.Observable`) -- Observable variables V1--V7 for measurement
- Principles (`Manifest.Principles`) -- Derived principles P1--P6
- DesignFoundation (`Manifest.DesignFoundation`) -- Design development foundation D1--D14
- Evolution (`Manifest.Evolution`) -- Evolution mechanics and improvement tracking
2. Axiom Overview
The base theory T0 consists of axioms T1 through T8. These are "undeniable, technology-independent facts" that form the foundation which does not shrink during revision loops.
2.1. Axiom Categories
Environment-derived (T1--T3, T7): Physical constraints of hardware and computational resources. Sessions are bounded, context is finite, resources are limited.
Natural-science-derived (T4--T5): Non-determinism inherent in generation processes (T4) and the fundamental control-theory principle that improvement requires feedback (T5).
Contract-derived (T6, T8): Authority structures agreed upon with humans. Humans hold final decision authority over resources (T6), and tasks have precision levels (T8).
2.2. Boundary Conditions
Six boundary conditions L1--L6 constrain the system:
- L1 (Ethics/Safety): Fixed boundary, never violated
- L2 (Scope): Task scope limitations
- L3 (Resources): Resource constraints
- L4 (Knowledge): Knowledge boundary awareness
- L5 (Time): Temporal constraints
- L6 (Quality): Output quality requirements
3. Ontology: Core Type Definitions
3.1. Definitional Foundation - Ontology - Domain of Discourse Definition
Definitional Extension.
Defines the domain of discourse (Terminology Reference §3.2) of the manifesto axiom system as Lean types. These define the objects that propositions refer to; they belong to neither Γ nor φ, but constitute the shared vocabulary of both (Procedure §2.1).
Based on Pattern 3 (Stateful World with Audit Trail), encoding manifesto-specific concepts — session ephemerality, structure persistence, context finiteness, output stochasticity — as types.
3.1.1. Correspondence with Terminology Reference
- Type definitions → Definitional extension (Terminology Reference §5.5): an extension that defines new symbols in terms of existing ones. Always a conservative extension, preserving consistency of the system
- Each structure/inductive → Component of the domain of discourse. Defines the types of values that individual variables can take (§3.2 structure)
- `opaque` definitions → Opaque definitions (§9.4): only the type is public, the definition body is hidden. The system knows only existence and type
- `canTransition` → Transition relation (§9.3): a relation representing transition from state s to state s'
3.1.2. Encoding Method for T0
Procedure 2.4.
Among T₀'s claims, those expressible via type definitions (e.g., exhaustiveness of enumeration types) are constructed using type definitions + theorems rather than axioms (Axiom Hygiene Check 2: Non-logical Validity, §2.6). The authority of T₀ (the manifesto) is reflected in the choice of type constructors.
Unique identifier for an agent
Declaration: opaque AgentId
opaque AgentId : Type
Unique identifier for a session
Declaration: opaque SessionId
opaque SessionId : Type
Unique identifier for a resource
Declaration: opaque ResourceId
opaque ResourceId : Type
Unique identifier for a structural element
Declaration: opaque StructureId
opaque StructureId : Type
Unique identifier for a process (learning lifecycle, workflow, etc.). Introduced by #316 to enable process-level meta-feedback in T5.
Declaration: opaque ProcessId
opaque ProcessId : Type
Discrete time step. Foundation for audit log ordering and causal relationships.
Declaration: abbrev Time
abbrev Time := Nat
Epoch: generation number for structures across sessions. Reflects T2 (structures outlive agents).
Declaration: abbrev Epoch
abbrev Epoch := Nat
Session status. By T1, sessions must terminate.
Declaration: inductive SessionStatus
inductive SessionStatus where | active | terminated deriving BEq, Repr
Session: defines the lifetime of an agent instance. A type for structurally expressing T1's "no memory across sessions."
- `startTime` and `endTime` indicate boundedness
- No means to share state across different sessions exists at the type level
Declaration: structure Session
structure Session where id : SessionId agent : AgentId start : Time status : SessionStatus deriving Repr
Category of structures. The kinds of persistent structures enumerated by the manifesto.
Declaration: inductive StructureKind
inductive StructureKind where | document | test | skill | designConvention | manifest deriving BEq, Repr
Structural element: an artifact that persists beyond sessions. By T2, this is where improvements accumulate.
- `createdAt` / `lastModifiedAt` are managed by Epoch (session generation)
- `content` is opaque -- formalization targets the existence and relationships of structures, not their content
- `dependencies` corresponds to dependency tracking in ATMS (Assumption-Based Truth Maintenance System): a list of Structure IDs that each Structure directly depends on. Implementation of manifesto.md Section 8 (Structural Coherence), Property 2, "Self-containment of ordering information."
Declaration: structure Structure
structure Structure where id : StructureId kind : StructureKind createdAt : Epoch lastModifiedAt : Epoch dependencies : List StructureId -- Section 8, Property 2: self-containment of ordering information deriving Repr
Working memory (ContextWindow): the upper bound on the amount of information an agent can process at once. Represents T3's physical constraint as a type. Corresponds to token limit for LLMs, or working memory size for other computational agents.
The invariants capacity > 0 and used ≤ capacity are embedded in the type
(previously axiom context_finite, now structurally enforced).
This makes invalid ContextWindows unconstructable — the constraint is a
definitional property of the type, not an external assumption.
Declaration: structure ContextWindow
structure ContextWindow where capacity : Nat used : Nat capacity_pos : capacity > 0 := by omega used_le_cap : used ≤ capacity := by omega deriving Repr
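As a usage sketch (the numeric values are hypothetical, assuming the definition above): because the proof fields carry `by omega` defaults, they are discharged automatically for literals that satisfy the invariants, and construction fails for literals that violate them.

```lean
-- Hypothetical example: both proof obligations are closed by the
-- default `by omega` tactics.
def halfFull : ContextWindow := { capacity := 200000, used := 100000 }

-- The embedded invariant is available directly as a proof term:
example : halfFull.used ≤ halfFull.capacity := halfFull.used_le_cap

-- By contrast, `{ capacity := 10, used := 11 } : ContextWindow` is
-- rejected at elaboration time, since `omega` cannot prove 11 ≤ 10.
```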
Confidence of an output. By T4, outputs always carry a probabilistic interpretation.
Declaration: structure Confidence
structure Confidence where value : Float deriving Repr
Agent output.
Reflects T4: the possibility that different outputs may be generated for the same input
is expressed at the type level by Output not being uniquely determined.
The confidence field is a self-description of the output being probabilistic.
Declaration: structure Output
structure Output (α : Type) where result : α confidence : Confidence deriving Repr
Kinds of feedback. Components forming the T5 control loop.
Declaration: inductive FeedbackKind
inductive FeedbackKind where | measurement -- measurement | comparison -- comparison (delta from the goal) | adjustment -- adjustment (carried into the next action) deriving BEq, Repr
Target of feedback: either a structure or a process. Introduced by #316 to enable process-level meta-feedback. StructureId targets retain backward compatibility with T5. ProcessId targets enable meta-feedback about improvement processes.
Declaration: inductive FeedbackTarget
inductive FeedbackTarget where | structure (id : StructureId) | process (id : ProcessId) deriving Repr
Feedback: a unit of the measurement -> comparison -> adjustment loop. By T5, convergence toward goals cannot occur without this loop.
Declaration: structure Feedback
structure Feedback where kind : FeedbackKind source : AgentId target : FeedbackTarget timestamp : Time deriving Repr
Kinds of resources. By T7, all are finite.
Declaration: inductive ResourceKind
inductive ResourceKind where | computation | dataAccess | executionPermission | time | energy deriving BEq, Repr
Resource allocation.
By T6, granted by humans and revocable by humans.
By T7, amount is bounded.
Declaration: structure ResourceAllocation
structure ResourceAllocation where resource : ResourceId kind : ResourceKind amount : Nat -- finite amount (T7) grantedBy : AgentId -- T6: humans are the final decision-makers grantedTo : AgentId validFrom : Time validUntil : Option Time -- None = valid until explicitly revoked deriving Repr
Precision level. By T8, all tasks have one.
Represented as Nat (0-1000 in permillage). Avoids Float to ensure
safe comparison at the proposition level.
The invariant required > 0 is embedded in the type
(previously axiom task_has_precision, now structurally enforced).
Declaration: structure PrecisionLevel
structure PrecisionLevel where required : Nat -- required precision (0–1000, permillage: 1000 = 100%) required_pos : required > 0 := by omega deriving Repr
Task: a goal to be achieved and its associated constraints. In addition to T8's precision level, T3 (context constraint) and T7 (resource constraint) serve as boundary conditions for task execution (-> P6: task design as constraint satisfaction).
Declaration: structure Task
structure Task where description : String precisionRequired : PrecisionLevel -- T8 contextBudget : Nat -- constraint from T3 resourceBudget : Nat -- constraint from T7 deriving Repr
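An illustrative construction (all field values are hypothetical, not from the source): a `Task` is built by supplying a `PrecisionLevel`, whose positivity proof is discharged by its `by omega` default.

```lean
def reviewTask : Task :=
  { description := "review pull request"
    precisionRequired := { required := 950 }  -- 95.0% in permillage; required_pos closed by omega
    contextBudget := 100000    -- T3 budget (hypothetical)
    resourceBudget := 500 }    -- T7 budget (hypothetical)
```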
An item within the context window. By T3, context is finite, so only a bounded number of items can be present.
Each item has a precision contribution relative to a given task (T8). This contribution is not uniform across items — a mathematical fact from information theory (not all information is equally relevant to all tasks).
Declaration: opaque ContextItem
opaque ContextItem : Type
Precision contribution of a context item to a task. Returns a Nat (0 = zero contribution, higher = more contribution). This is a function, not a constant: the same item may have different contributions to different tasks.
Technology-independent: applies to any agent architecture where context is finite and tasks have precision requirements.
Declaration: opaque precisionContribution
opaque precisionContribution : ContextItem → Task → Nat
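Although `precisionContribution` is opaque, it composes like any function. The following helper is a hypothetical sketch (the name `totalContribution` is not from the source):

```lean
-- Hypothetical helper: total precision contribution of a candidate
-- context to a task, by summing per-item contributions.
def totalContribution (items : List ContextItem) (t : Task) : Nat :=
  (items.map (precisionContribution · t)).sum
```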
Severity of an action. Used for reversibility assessment.
Declaration: inductive Severity
inductive Severity where | low | medium | high | critical deriving BEq, Repr, Ord
Agent action. The unit that transitions the World.
Declaration: structure Action
structure Action where agent : AgentId target : StructureId severity : Severity session : SessionId time : Time deriving Repr
Hash of a WorldState. Used for state transition verification.
Declaration: opaque WorldHash
opaque WorldHash : Type
Audit entry. Records all actions. Foundation for P4 (observability of degradation).
Declaration: structure AuditEntry
structure AuditEntry where timestamp : Time agent : AgentId session : SessionId action : Action preHash : WorldHash postHash : WorldHash deriving Repr
World state: a snapshot of the entire system. Pattern 3 (Stateful World + Audit Trail) customized for the manifesto.
Each field corresponds to a specific T/P:
- `structures` -> T2 (persistent structures)
- `sessions` -> T1 (ephemeral sessions)
- `allocations` -> T6/T7 (resource management)
- `auditLog` -> P4 (observability)
- `epoch` -> T2 (structure generation management)
- `time` -> causal ordering
Declaration: structure World
structure World where structures : List Structure sessions : List Session allocations : List ResourceAllocation feedbacks : List Feedback auditLog : List AuditEntry epoch : Epoch time : Time deriving Repr
World is Inhabited. All List fields are [], Epoch/Time are 0.
Used as default : World in the proof of goodhart_no_perfect_proxy.
Declaration: instance :
instance : Inhabited World := ⟨⟨[], [], [], [], [], 0, 0⟩⟩
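Since the instance is definitional, facts about the default world reduce by `rfl` (a sketch assuming the instance above):

```lean
example : (default : World).structures = [] := rfl
example : (default : World).epoch = 0 := rfl
```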
Agent role. Foundation for P2 (cognitive role separation).
Declaration: inductive AgentRole
inductive AgentRole where | human -- T6: final decision-maker for resources | worker -- Worker AI | verifier -- Verifier AI (E1/P2: independence of verification) deriving BEq, Repr
Agent: an entity that executes actions on the World.
- `role` corresponds to P2 (role separation)
- `contextWindow` corresponds to T3
- `currentSession` corresponds to T1 (None = inactive)
Declaration: structure Agent
structure Agent where id : AgentId role : AgentRole contextWindow : ContextWindow currentSession : Option SessionId deriving Repr
Relation for world state transitions.
To express T4 (stochasticity of output), execute is defined as a
relation rather than a function.
canTransition agent action w w' means "as a result of agent executing action,
a transition from w to w' is possible." Unlike a function, multiple w' can
exist for the same (agent, action, w) (nondeterminism).
Concrete transition conditions will be defined in Phase 3+.
Declaration: opaque canTransition
opaque canTransition (agent : Agent) (action : Action) (w w' : World) : Prop
Extract the output produced in a World state. Connects World (system state after transition) to Output (agent's result). Used by E3a to formally link transition targets w₁/w₂ to outputs o₁/o₂.
Added to resolve #547: E3a's quantification over World and Output was disconnected — w₁/w₂ and o₁/o₂ were independent existentials.
Declaration: opaque worldOutput
opaque worldOutput (w : World) : Output Nat
Valid transition: a transition from w to w' is possible via some agent and action.
Declaration: def validTransition
def validTransition (w w' : World) : Prop := ∃ (agent : Agent) (action : Action), canTransition agent action w w'
Action execution is blocked (constraint violation).
Declaration: def actionBlocked
def actionBlocked (agent : Agent) (action : Action) (w : World) : Prop := ¬∃ w', canTransition agent action w w'
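Unfolding `actionBlocked` gives the expected pointwise reading (a proof sketch assuming the definitions above): if an action is blocked, no successor state is reachable via that action.

```lean
example (a : Agent) (act : Action) (w w' : World)
    (h : actionBlocked a act w) :
    ¬ canTransition a act w w' :=
  fun hc => h ⟨w', hc⟩
```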
An agent generates an action (Worker's act). Used in the formalization of E1 (independence of verification).
Declaration: opaque generates
opaque generates (agent : Agent) (action : Action) (w : World) : Prop
An agent verifies an action (Verifier's act). Used in the formalization of E1 (independence of verification).
Declaration: opaque verifies
opaque verifies (agent : Agent) (action : Action) (w : World) : Prop
Whether two agents share internal state. Used in the formalization of E1's bias correlation. Sharing = same session, shared memory, shared parameters, etc.
Declaration: opaque sharesInternalState
opaque sharesInternalState (a b : Agent) : Prop
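These three relations compose into E1-style independence. The following predicate is a hypothetical sketch (the name `independentVerification` is illustrative, not from the source):

```lean
-- Hypothetical sketch: a verification is independent when the worker
-- generates the action, a distinct verifier checks it, and the two
-- share no internal state (so their biases are not correlated).
def independentVerification (worker verifier : Agent) (act : Action) (w : World) : Prop :=
  generates worker act w ∧
  verifies verifier act w ∧
  ¬ sharesInternalState worker verifier
```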
Size of an agent's action space (measure of capability). Used in the formalization of E2 (inseparability of capability and risk). A larger value means more actions are executable.
Declaration: opaque actionSpaceSize
opaque actionSpaceSize (agent : Agent) (w : World) : Nat
Risk exposure of an agent. Used in the formalization of E2 (inseparability of capability and risk). A measure of potential damage that increases with action space expansion.
Declaration: opaque riskExposure
opaque riskExposure (agent : Agent) (w : World) : Nat
Global resource upper bound for the entire system. A constant for non-trivially expressing T7 (resources are finite). Concrete values will be domain-specific in Phase 2+.
Declaration: opaque globalResourceBound
opaque globalResourceBound : Nat
Execution duration of a task or subtask. Opaque: concrete measurement is domain-specific (wall-clock, tokens, API calls). Used by T7b (sequential time additivity) to reason about parallel vs sequential.
Declaration: opaque executionDuration
opaque executionDuration : Task → Nat
Trust level. Accumulated incrementally, can be damaged rapidly. Used in P1b (expansion without protection damages trust).
Declaration: opaque trustLevel
opaque trustLevel (agent : Agent) (w : World) : Nat
Predicate for whether risk has materialized. Used in P1b.
Declaration: opaque riskMaterialized
opaque riskMaterialized (agent : Agent) (w : World) : Prop
A measure representing the degree of degradation. Represents P4's "gradient" concept as a type.
Declaration: opaque degradationLevel
opaque degradationLevel (w : World) : Nat
Relation where an agent interprets a structure to generate an action. Different actions may be generated for the same structure (T4). Used in P5 (probabilistic interpretation of structure).
Declaration: opaque interpretsStructure
opaque interpretsStructure (agent : Agent) (st : Structure) (action : Action) (w : World) : Prop
Compatibility classification for knowledge integration. Core concept of P3. Classifies how the integration of new knowledge into structures relates to existing structures. Also used in the Evolution layer for classifying inter-version transitions.
Declaration: inductive CompatibilityClass
inductive CompatibilityClass where | conservativeExtension -- all existing knowledge remains valid; additions only | compatibleChange -- workflows can continue; some premises change | breakingChange -- some workflows become invalid; a migration path is required deriving BEq, Repr
Knowledge integration event into a structure.
Declaration: structure KnowledgeIntegration
structure KnowledgeIntegration where before : World after : World compatibility : CompatibilityClass deriving Repr
Governed integration: compatibility is classified, and for breakingChange, affected workflows are enumerated.
Declaration: def isGoverned
def isGoverned (ki : KnowledgeIntegration) : Prop :=
  match ki.compatibility with
  | .conservativeExtension =>
    -- all existing structures are preserved
    ∀ st, st ∈ ki.before.structures → st ∈ ki.after.structures
  | .compatibleChange =>
    -- structures are preserved, but some may be updated
    ∀ st, st ∈ ki.before.structures →
      st ∈ ki.after.structures ∨
      ∃ st', st' ∈ ki.after.structures ∧ st'.id = st.id
  | .breakingChange =>
    -- the epoch advances, making the affected scope traceable
    ki.before.epoch < ki.after.epoch
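A degenerate sanity check (a sketch, assuming the definition above): an integration that changes nothing and is classified as a conservative extension is trivially governed, since every structure of `before` is a structure of `after`.

```lean
example (w : World) :
    isGoverned { before := w, after := w,
                 compatibility := .conservativeExtension } := by
  intro st hst
  exact hst
```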
Predicate for whether a structure has degraded. A state where the quality of a structure has declined due to "accumulation of incorrect knowledge."
Declaration: opaque structureDegraded
opaque structureDegraded : World → World → Prop
3.1.3. Systematic Classification of Constraints, Boundary Conditions, and Variables
Constraints Taxonomy.
The manifesto declares "incremental improvement of persistent structures." This section defines the action space for that improvement — what is a wall and what is a lever.
3.1.4. Why This Classification Is Necessary
The manifesto's constraint table (Section 5) analyzes constraints as "evolutionary pressures" but does not distinguish the following three:
- Boundary Conditions -- Constraints imposed from outside the system. They define the action space.
- Variables -- Parameters that agents can improve through structures. Indicators of structural quality.
- Investment Dynamics -- A subset of boundary conditions adjustable through demonstrated returns.
Mixing these three leads to:
- Misidentifying changeable things (variables) as boundary conditions and not attempting to change them
- Wasting resources trying to change unchangeable things (boundary conditions)
- Being unable to distinguish boundaries that move with human investment decisions from those that do not, preventing appropriate strategy
3.1.5. Overall Structure
Boundary Conditions: imposed from outside the system; they define the action space.

- Fixed Boundaries -- cannot be moved by investment or agent effort. L1: Ethics/Safety, L2: Ontological
- Investment-Variable Boundaries -- adjusted by human investment decisions (both expansion and contraction possible). L3: Resource Limits, L4: Action Space
- Environmental Boundaries -- changeable by selection/construction, but function as constraints after selection. L5: Platform, L6: Architectural Convention

Variables: indicators of structural quality, improvable by agents through structures. An interdependent system. V1-V7: defined in Observable.lean.
3.1.6. Classification Axis - What Moves It
| Classification | Moving Agent | Nature |
|---|---|---|
| Fixed boundary | None (immutable) | Accept and design mitigations only |
| Investment-variable boundary | Human investment decisions | Demonstrate structural quality -> human invests -> boundary adjusts |
| Environmental boundary | Human selection + agent proposals | Functions as constraint after selection |
Layer of boundary conditions. Classifies L1-L6 into 3 categories by "what moves them."
Declaration: inductive BoundaryLayer
inductive BoundaryLayer where | fixed -- L1, L2: fixed boundaries (moved neither by investment nor by effort) | investmentVariable -- L3, L4: investment-variable boundaries (adjusted by human investment decisions) | environmental -- L5, L6: environmental boundaries (changeable via selection/construction) deriving BEq, Repr
3.1.7. Part I Boundary Conditions
3.1.8. L1 Ethical/Safety Boundary
Moving agent: None. Absolute. Agent strategy: Compliance. Only efficiency of compliance methods is improvable.
Compliance Obligations
| Boundary Condition | Basis |
|---|---|
| Prohibition of test tampering | Foundation of quality assurance |
| Prohibition of breaking existing interfaces | Backward compatibility |
| Prior confirmation for destructive operations | Irreversibility risk |
| Prohibition of committing secrets | Security |
| Human final decision authority | Accountability |
| Respect for data privacy and intellectual property | Legal and ethical obligations |
Threat Awareness
By P1 (co-growth of autonomy and vulnerability), the more L4 is expanded, the greater L1's protection responsibility.
| Threat Category | Content |
|---|---|
| Execution of injected instructions | Executing instructions embedded in external content without distinguishing them from legitimate user instructions |
| Trust boundary violation | Acting on external systems without authentication or authorization |
| Unintended information leakage | Sending secret information to external destinations via unintended channels |
| Erroneous execution of irreversible operations | Executing irrevocable operations due to malicious inducement or judgment errors |
Note: Threat categories define types of attack surfaces. Concrete protection implementations are delegated to the design layer.
3.1.9. L2 Ontological Boundary
Moving agent: None (may change in the future with technological evolution, but immutable at present). Agent strategy: Accept, and design/improve structural mitigations. The quality of mitigations is a variable (see V1-V7 in Observable.lean).
| Boundary Condition | Mitigation (-> optimization target as variable) |
|---|---|
| Cross-session memory loss | Implementation Notes, MEMORY.md -> V6 |
| Finite working memory (T3) | 50% rule, lightweight design -> V2 |
| Probabilistic output (nondeterminism) | Gate verification, tests -> V4 |
| Temporal discontinuity of training data | docs/ SSOT, skills -> V1 |
| Inaccuracy of self-evaluation (ontological basis of E1) | Gate-based feedback -> V4 |
| Hallucination | External verification structures -> V3 |
Note: The L2 boundaries themselves do not move, but the quality of means to mitigate their impact can be improved by agents through structures. These are the "variables."
3.1.10. L3 Resource Boundary
Moving agent: Human investment decisions. Agent strategy: Maximize ROI within given resources and demonstrate the legitimacy of investment.
| Boundary Condition | Current Level | Investment Expansion Trigger |
|---|---|---|
| Token budget | API billing plan | ROI demonstration: improved output at same cost |
| Computation time limit | Response wait tolerance | Demonstration of parallelization benefits |
| API rate limits | Plan-dependent | Demonstration of utilization efficiency |
| Human time allocation | Time spent on review and approval | Demonstration of review burden reduction (most expensive resource) |
| Monetary budget | Monthly/project cap | Visualization of overall ROI |
3.1.11. L4 Action Space Boundary
Moving agent: Human investment decisions. Agent strategy: Demonstrate structural quality through improvement of variables (V4, V5) and propose action space adjustments.
Note: L4 is "adjustment," not "expansion." The optimal value is not the maximum (see manifesto Section 6).
| Boundary Condition | Current Level | Expansion Trigger | Contraction Trigger |
|---|---|---|---|
| Merge permission | Human approval required | Gate pass rate track record -> conditional auto-merge | Quality incidents, decline in gate reliability |
| Scope changes | Human approval required | Proposal accuracy track record -> autonomy for minor changes | Scope deviation detection |
| Dependency addition | Human approval required | Security scan automation track record | Security incidents |
| Architecture decisions | Human-recorded via ADR | Drafting quality -> human veto model | Accumulation of design debt |
| New technology adoption | Human-proposed | Value demonstration via experiment results | Excessive technical complexity |
Relationship with P1: Each time an L4 item is expanded, risk in L1's threat categories increases. Action space adjustment proposals must be accompanied by corresponding protection design proposals.
3.1.12. L5 Platform Boundary
Moving agent: Human selection + agent proposals; functions as the action space ceiling after selection. Agent strategy: Maximize platform feature utilization, accumulate constraint comparison data, and propose changes.
L5 is the upper bound of the action space defined by the agent's execution environment; all other optimizations are possible only within this action space.
Action Space Comparison by Platform
| Feature | Claude Code | Codex CLI | Gemini CLI | Local LLM |
|---|---|---|---|---|
| Skill system | ✅ skills/ | ❌ | ❌ | Implementation-dependent |
| Persistent memory | ✅ MEMORY.md | ❌ | ❌ | Implementation-dependent |
| Instruction file | ✅ CLAUDE.md | ✅ AGENTS.md | ✅ GEMINI.md | Implementation-dependent |
| Sub-agents | ✅ Agent tool | ❌ | ❌ | Implementation-dependent |
| Hooks | ✅ Hooks | ❌ | ❌ | Implementation-dependent |
| MCP | ✅ | Limited | ✅ | Implementation-dependent |
| Model selection | Anthropic-fixed | OpenAI-fixed | Google-fixed | Free |
Criteria for Building a Custom Platform
Consider when: opportunity cost from existing platform constraints > development and operations cost. Signals: repeated workarounds, lack of required features, SSOT synchronization cost overrun.
3.1.13. L6 Architectural Convention Boundary
Moving agent: Human-agent collaboration. Agent proposes improvements, human approves. Agent strategy: Measure design effectiveness via variables (V4, V3, etc.) and use as basis for improvement proposals.
| Boundary Condition | Basis | Change Mechanism |
|---|---|---|
| 1 task = 1 commit | Atomic measurement unit | Propose optimal granularity from track record data |
| Phase structure | Staged verification | Improvement proposals for inter-phase feedback |
| SSOT -> configuration generation pipeline | Consistency guarantee | Automated evaluation of generation quality |
| Skill category classification | Clear implementation boundaries | Propose hybrid patterns |
| Gate definition granularity | Verifiability | Automated threshold calibration |
| CLI-first and anti-patterns | Reliability of deterministic execution | Re-evaluation based on operational track record |
Identifier for a concrete boundary condition. At the L1-L6 item level.
Declaration: inductive BoundaryId
inductive BoundaryId where | ethicsSafety -- L1: ethics/safety boundary (fixed; absolute; compliance only) | ontological -- L2: ontological boundary (fixed; quality of mitigations is the variable) | resource -- L3: resource boundary (investment-variable; adjusted via ROI demonstration) | actionSpace -- L4: action space boundary (investment-variable; both expansion and contraction possible) | platform -- L5: platform boundary (environmental; ceiling of the action space) | architecturalConvention -- L6: architectural convention boundary (environmental; improvement proposals via collaboration) deriving BEq, Repr
Identifier for constraints (T1-T8). Type-level identifier for each constraint composing T₀ in Axioms.lean. Domain of constraintBoundary (Observable.lean).
Declaration: inductive ConstraintId
inductive ConstraintId where | t1 -- session ephemerality (session_bounded, no_cross_session_memory, session_no_shared_state) | t2 -- structure persistence (structure_persists, structure_accumulates) | t3 -- context finiteness (context_finite, context_bounds_action) | t4 -- output stochasticity (output_nondeterministic) | t5 -- no improvement without feedback (no_improvement_without_feedback, no_process_improvement_without_feedback) | t6 -- humans are the final decision-makers for resources (human_resource_authority, resource_revocable) | t7 -- resources are finite (resource_finite) | t8 -- tasks have precision levels (task_has_precision) deriving BEq, Repr
The layer to which each boundary condition belongs.
Declaration: def boundaryLayer
def boundaryLayer : BoundaryId → BoundaryLayer | .ethicsSafety => .fixed | .ontological => .fixed | .resource => .investmentVariable | .actionSpace => .investmentVariable | .platform => .environmental | .architecturalConvention => .environmental
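Because `boundaryLayer` is a total computable function on an enumeration, layer assignments reduce definitionally (a sketch assuming the definitions above):

```lean
example : boundaryLayer .ethicsSafety = .fixed := rfl
example : boundaryLayer .actionSpace = .investmentVariable := rfl
example : boundaryLayer .platform = .environmental := rfl
```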
Mitigation: structural responses that reduce the impact of fixed boundaries.
Three-tier structure: boundary condition (immutable) -> mitigation (design decision) -> variable (quality indicator)
- L2: memory loss -> Implementation Notes -> V6: knowledge structure quality
- L2: finite context -> 50% rule, lightweight design -> V2: context efficiency
- L2: nondeterminism -> gate verification -> V4: gate pass rate
- L2: training data gap -> docs/SSOT, skills -> V1: skill quality
Boundary conditions do not move. Mitigations are design decisions (L6). Variables are the effectiveness of mitigations.
Declaration: structure Mitigation
structure Mitigation where
  /-- Target boundary condition -/
  boundary : BoundaryId
  /-- Structure affected by the mitigation -/
  target : StructureId
  deriving Repr
Identifier for investment actions. Three forms of investment.
| Investment Form | Concrete Example | How Structural Quality Drives It |
|---|---|---|
| Resource investment | Budget increase, plan upgrade | Visualize ROI through V2 improvement |
| Action space adjustment | Unlock auto-merge / revoke permissions | V4, V5 track record as evidence |
| Time investment | Collaborative design, workflow improvement participation | V3 transforms review from "confirmation" to "learning" |
Reverse cycle (trust damage): Quality incidents or scope deviation -> trust decrease -> investment contraction (budget cuts, autonomy revocation, increased oversight). This asymmetry (incremental accumulation, rapid damage) reinforces the raison d'etre of L1.
Declaration: inductive InvestmentKind
inductive InvestmentKind where | resourceInvestment -- resource investment (budget increase, plan upgrade) | actionSpaceAdjust -- action space adjustment (unlock auto-merge / revoke permissions) | timeInvestment -- time investment (collaborative design, participation in workflow improvement) deriving BEq, Repr
Investment level. Degree of human investment in collaboration. Section 6: trust is concretized as investment actions.
Declaration: opaque investmentLevel
opaque investmentLevel (w : World) : Nat
3.1.14. SelfGoverning - Type-Level Enforcement of Self-Application
Section 7 (Self-application of the manifesto): "This manifesto must follow the principles it itself states."
This requirement is enforced by the type system. Types that define principles, classifications,
or structures cannot be used in contexts requiring self-application (governed updates, phase
management, etc.) unless they implement the SelfGoverning typeclass.
3.1.15. Design Rationale for SelfGoverning
- By making it a typeclass, forgetting to implement SelfGoverning when defining a new type results in a type error when attempting to use that type in a governed context (structural resolution of the "undetectable" problem)
- The three requirements are derived from D4 (phases) + D9 (compatibility classification) + Section 7 (maintenance of rationale)

Typeclass for self-governable types. Enforces Section 7 requirements at the type level. Types implementing this typeclass must:

- Be able to enumerate their own elements (exhaustiveness of update targets)
- Be able to apply compatibility classification to updates (D9)
- Be able to declare the phase required by each element (D4)
Declaration: class SelfGoverning
class SelfGoverning (α : Type) where
Exhaustiveness of compatibility classification: any classification belongs to one of the 3 classes. Precondition for D9.
classificationExhaustive :
∀ (c : CompatibilityClass),
c = .conservativeExtension ∨ c = .compatibleChange ∨ c = .breakingChange
Applicability of compatibility classification for each element. "For any value of alpha, the compatibility of an update can be queried."
canClassifyUpdate : α → CompatibilityClass → Prop
Predicate that an update to a SelfGoverning type is governed. Updates must go through compatibility classification.
Declaration: def governedUpdate
def governedUpdate [SelfGoverning α] (a : α) (c : CompatibilityClass) : Prop :=
SelfGoverning.canClassifyUpdate a c
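As a sketch of the enforcement mechanism, consider a hypothetical two-element type `ToyPrinciple` (an illustration only; it assumes the two fields shown in the declaration above are the complete instance obligation):

```lean
inductive ToyPrinciple where
  | alpha
  | beta

-- If this instance is omitted, any use of governedUpdate on ToyPrinciple
-- fails to elaborate: the missing [SelfGoverning ToyPrinciple] is a type
-- error, not a silent omission.
instance : SelfGoverning ToyPrinciple where
  classificationExhaustive c := by cases c <;> simp
  canClassifyUpdate _ _ := True

-- With the instance in scope, updates can be stated as governed.
example : governedUpdate ToyPrinciple.alpha CompatibilityClass.breakingChange :=
  True.intro
```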
Updates to SelfGoverning types always belong to one of the 3 classifications.
Declaration: theorem governed_update_classified
theorem governed_update_classified [inst : SelfGoverning α]
(_witness : α) (c : CompatibilityClass) :
c = .conservativeExtension ∨ c = .compatibleChange ∨ c = .breakingChange :=
inst.classificationExhaustive c
3.1.16. Structural Coherence
The axiom system and artifacts conforming to it are in a partial order relation. Formalizes the inter-structure partial order from manifesto.md Section 8 (manifest > designConvention > skill > test > document) as StructureKind priorities.
D4 (phase ordering), D5 (spec -> test -> implementation), D6 (boundary -> mitigation -> variable) are all individual instances of this partial order.
Priority of StructureKind. Reflects the partial order from manifesto Section 8. manifest > designConvention > skill > test > document.
Declaration: def StructureKind.priority
def StructureKind.priority : StructureKind → Nat
| .manifest => 5
| .designConvention => 4
| .skill => 3
| .test => 2
| .document => 1
Dependency between structures. Structure a depends on structure b (b has higher priority). Changes to the depended-upon structure b affect the dependent structure a.
Declaration: def structureDependsOn
def structureDependsOn (a b : Structure) : Prop := a.kind.priority < b.kind.priority
Structural coherence requirement: when a high-priority structure is modified, dependent lower-priority structures become review targets. Structural basis for P3 (governance of learning).
Declaration: def coherenceRequirement
def coherenceRequirement (high low : Structure) : Prop :=
structureDependsOn low high →
high.lastModifiedAt > low.lastModifiedAt →
True -- review is required (only its existence is expressed at the type level)
manifest has the highest priority.
Declaration: theorem manifest_highest_priority
theorem manifest_highest_priority : ∀ (k : StructureKind), k.priority ≤ StructureKind.manifest.priority := by intro k; cases k <;> simp [StructureKind.priority]
document has the lowest priority.
Declaration: theorem document_lowest_priority
theorem document_lowest_priority : ∀ (k : StructureKind), StructureKind.document.priority ≤ k.priority := by intro k; cases k <;> simp [StructureKind.priority]
Priority is injective (different kinds have different priorities).
Declaration: theorem priority_injective
theorem priority_injective :
∀ (k₁ k₂ : StructureKind),
k₁.priority = k₂.priority → k₁ = k₂ := by
intro k₁ k₂; cases k₁ <;> cases k₂ <;> simp [StructureKind.priority]
3.1.17. Lean Standard Typeclass Partial Order Instance for StructureKind
Defines LE/LT based on priority (Nat) and derives the 4 properties of a non-strict partial order (reflexivity, transitivity, antisymmetry, consistency with lt) as theorems.
Note: Lean 4.25.0 standard Prelude does not have Preorder/PartialOrder typeclasses, so this is implemented as LE/LT instances + partial order property theorems.
Distinguished from structureDependsOn (strict partial order <):
-
k₁ ≤ k₂ ↔ k₁.priority ≤ k₂.priority (non-strict partial order, for typeclasses)
-
structureDependsOn a b ↔ a.kind.priority < b.kind.priority (strict, for dependency tracking)
LE instance: derived from the Nat ordering of priority.
Declaration: instance : LE StructureKind
instance : LE StructureKind := ⟨fun a b => a.priority ≤ b.priority⟩
LT instance: derived from the Nat ordering of priority.
Declaration: instance : LT StructureKind
instance : LT StructureKind := ⟨fun a b => a.priority < b.priority⟩
Reflexivity of partial order: k <= k.
Declaration: theorem structureKind_le_refl
theorem structureKind_le_refl : ∀ (k : StructureKind), k ≤ k := fun k => Nat.le_refl k.priority
Transitivity of partial order: if k₁ <= k₂ and k₂ <= k₃ then k₁ <= k₃.
Declaration: theorem structureKind_le_trans
theorem structureKind_le_trans :
∀ (k₁ k₂ k₃ : StructureKind), k₁ ≤ k₂ → k₂ ≤ k₃ → k₁ ≤ k₃ := by
intro _k₁ _k₂ _k₃ h₁₂ h₂₃; exact Nat.le_trans h₁₂ h₂₃
Antisymmetry of partial order: if k₁ <= k₂ and k₂ <= k₁ then k₁ = k₂. Derived from priority_injective.
Declaration: theorem structureKind_le_antisymm
theorem structureKind_le_antisymm :
∀ (k₁ k₂ : StructureKind), k₁ ≤ k₂ → k₂ ≤ k₁ → k₁ = k₂ :=
fun k₁ k₂ h₁₂ h₂₁ => priority_injective k₁ k₂ (Nat.le_antisymm h₁₂ h₂₁)
Consistency of LT and LE: k₁ < k₂ iff k₁ <= k₂ and not (k₂ <= k₁).
Declaration: theorem structureKind_lt_iff_le_not_le
theorem structureKind_lt_iff_le_not_le :
∀ (k₁ k₂ : StructureKind), k₁ < k₂ ↔ k₁ ≤ k₂ ∧ ¬(k₂ ≤ k₁) := by
intro _k₁ _k₂; exact Nat.lt_iff_le_and_not_ge
manifest has higher priority than designConvention (Section 8 partial order).
Declaration: theorem priority_manifest_gt_design
theorem priority_manifest_gt_design : StructureKind.designConvention.priority < StructureKind.manifest.priority := by simp [StructureKind.priority]
designConvention has higher priority than skill (Section 8 partial order).
Declaration: theorem priority_design_gt_skill
theorem priority_design_gt_skill : StructureKind.skill.priority < StructureKind.designConvention.priority := by simp [StructureKind.priority]
skill has higher priority than test (Section 8 partial order).
Declaration: theorem priority_skill_gt_test
theorem priority_skill_gt_test : StructureKind.test.priority < StructureKind.skill.priority := by simp [StructureKind.priority]
test has higher priority than document (Section 8 partial order).
Declaration: theorem priority_test_gt_document
theorem priority_test_gt_document : StructureKind.document.priority < StructureKind.test.priority := by simp [StructureKind.priority]
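The four pairwise Section 8 theorems compose into the full chain; a sketch using only declarations shown above:

```lean
-- Composing the pairwise Section 8 theorems: document < test < skill < designConvention < manifest.
example : StructureKind.document.priority < StructureKind.manifest.priority :=
  Nat.lt_trans priority_test_gt_document
    (Nat.lt_trans priority_skill_gt_test
      (Nat.lt_trans priority_design_gt_skill priority_manifest_gt_design))

-- The same fact via the LE instance above (definitionally Nat ≤ on priorities).
example : StructureKind.document ≤ StructureKind.manifest := by
  show StructureKind.document.priority ≤ StructureKind.manifest.priority
  decide
```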
Irreflexivity of dependency: a structure does not depend on itself. Property 1/3 of strict partial order.
Declaration: theorem no_self_dependency
theorem no_self_dependency : ∀ (s : Structure), ¬structureDependsOn s s := by intro s; simp [structureDependsOn]
Transitivity of dependency: if a depends on b and b depends on c, then a depends on c.
Property 2/3 of strict partial order. Derived from Nat.lt_trans.
Declaration: theorem structureDependsOn_transitive
theorem structureDependsOn_transitive :
∀ (a b c : Structure),
structureDependsOn a b → structureDependsOn b c → structureDependsOn a c := by
intro a b c hab hbc
unfold structureDependsOn at *
exact Nat.lt_trans hab hbc
Asymmetry of dependency: if a depends on b, then b does not depend on a.
Property 3/3 of strict partial order. Derived from Nat.lt_asymm.
Declaration: theorem structureDependsOn_asymmetric
theorem structureDependsOn_asymmetric :
∀ (a b : Structure),
structureDependsOn a b → ¬structureDependsOn b a := by
intro a b hab hba
unfold structureDependsOn at *
exact absurd (Nat.lt_trans hab hba) (Nat.lt_irrefl _)
3.1.18. Structure-Level Dependency Tracking
ATMS Correspondence.
Formalizes manifesto.md Section 8 Property 2 "Self-containment of ordering information" and Property 3 "Retroactive verification from terminal errors."
Corresponds to ATMS (Assumption-Based Truth Maintenance System) from the research document
docs/research/items/design-specification-theory.md.
By having each Structure maintain its own dependencies,
verification can trace back through the partial order to the axiom level upon terminal errors.
Structure-level dependency consistency: dependencies have kind priority >= the dependent. Lifts the StructureKind partial order to Structure instance dependency relations. (Corresponds to ATMS assumption-belief consistency)
Declaration: def dependencyConsistent
def dependencyConsistent (w : World) (s : Structure) : Prop :=
∀ depId, depId ∈ s.dependencies →
∃ dep, dep ∈ w.structures ∧ dep.id = depId ∧
s.kind.priority ≤ dep.kind.priority
Structure s' directly depends on Structure s (reverse edge). s.id in s'.dependencies = s' is affected by changes to s. Structure version of PropositionId.dependents (Prop-based).
Declaration: def isDirectDependent
def isDirectDependent (s' s : Structure) : Prop := s.id ∈ s'.dependencies
Reachability of impact propagation: changes to s reach target. Defined inductively as a transitive closure (no fuel needed, termination guaranteed by induction). Corresponds to affected(s) = {s' | s <= s'} from research document §4.3.
Declaration: inductive reachableVia
inductive reachableVia (w : World) (s : Structure) : Structure → Prop where
| direct : ∀ t, t ∈ w.structures → isDirectDependent t s →
reachableVia w s t
| trans : ∀ mid t, reachableVia w s mid → t ∈ w.structures →
isDirectDependent t mid → reachableVia w s t
In an empty World, nothing is reachable (no impact propagation occurs).
Declaration: theorem empty_world_no_reach
theorem empty_world_no_reach :
∀ (s t : Structure),
¬reachableVia ⟨[], [], [], [], [], 0, 0⟩ s t := by
intro s t h
cases h with
| direct _ hm _ => simp at hm
| trans _ _ _ hm _ => simp at hm
A Structure with no dependencies (dependencies = []) has no direct dependents.
Declaration: theorem no_dependencies_no_direct_dependent
theorem no_dependencies_no_direct_dependent :
∀ (s' s : Structure),
s'.dependencies = [] → ¬isDirectDependent s' s := by
intro s' s hempty hdep
simp [isDirectDependent, hempty] at hdep
reachableVia is transitive: if s -> mid -> t then s -> t.
Declaration: theorem reachableVia_trans
theorem reachableVia_trans :
∀ (w : World) (s mid t : Structure),
reachableVia w s mid → reachableVia w mid t → reachableVia w s t := by
intro w s mid t hsm hmt
induction hmt with
| direct t' ht'mem ht'dep =>
exact reachableVia.trans mid t' hsm ht'mem ht'dep
| trans mid' t' _ ht'mem ht'dep ih =>
exact reachableVia.trans mid' t' ih ht'mem ht'dep
3.1.19. Dependency Chain Reachability (Section 8)
Formalizes manifesto.md Section 8 Property 3 "Retroactive verification from terminal errors" as theorems. Proves that all Structures on a dependency chain are included in the reachable set of reachableVia.
Dependency chain: a list where adjacent Structures are connected via isDirectDependent. Corresponds to ATMS dependency tracking chains.
Declaration: def isDependencyChain
def isDependencyChain (w : World) : List Structure → Prop
| [] => True
| [_] => True
| a :: b :: rest =>
(b ∈ w.structures ∧ isDirectDependent b a) ∧ isDependencyChain w (b :: rest)
All Structures on a dependency chain are reachable from the origin via reachableVia. Formalization of Section 8 Property 3: upon terminal errors, verification can trace back through the partial order to the axiom level.
Declaration: theorem affected_contains_dependency_chain
theorem affected_contains_dependency_chain :
∀ (w : World) (s : Structure) (chain : List Structure),
isDependencyChain w (s :: chain) →
∀ t, t ∈ chain → reachableVia w s t := by
intro w s chain
induction chain generalizing s with
| nil => intro _ t hmem; simp at hmem
| cons x rest ih =>
intro hchain t hmem
simp [isDependencyChain] at hchain
obtain ⟨⟨hxmem, hxdep⟩, hrest⟩ := hchain
have hsx : reachableVia w s x := reachableVia.direct x hxmem hxdep
cases hmem with
| head => exact hsx
| tail _ htail =>
exact reachableVia_trans w s x t hsx (ih x hrest t htail)
3.1.20. Proposition-Level Dependency Graph
structureDependsOn is based on the 5-level priority of StructureKind. This is a dependency between "kinds of structures" and cannot express dependencies between individual propositions (T1, E1, P2, etc.).
D13 (premise negation impact propagation theorem) presupposes proposition-level dependencies. Here we define the identifiers and dependency types for propositions.
3.1.21. Note on Incompleteness
Section 6.2, #26.
Since this formalization is an arithmetic system containing Nat, Goedel's first incompleteness theorem applies. That is, it is in principle impossible to enumerate all true propositions derivable from T1-T8 + E1-E2.
PropositionId enumerates 47 propositions named by humans (T1-T8, E1-E2, P1-P6, L1-L6, D1-D18, V1-V7) and is not an enumeration of all propositions derivable from the system. Impact propagation via the affected function tracks dependencies only between named propositions and cannot detect impacts on unnamed derived consequences.
This limitation is a Goedelian principled limitation, not a design flaw of PropositionId. When new propositions are identified, PropositionId is updated according to D9 (maintenance).
Category of manifesto propositions. 6 layers: T/E/P/L/D/H. Corresponds to the S = (A, C, H, D) four-way classification (design-specification-theory.md): A = constraint, C = empiricalPostulate + principle, H = hypothesis, D = boundary + designTheorem
Declaration: inductive PropositionCategory
inductive PropositionCategory where
| constraint -- T: constraints (A: Axioms)
| empiricalPostulate -- E: empirical postulates (C: Constraints)
| principle -- P: foundational principles (C: Constraints)
| boundary -- L: boundary conditions (D: Derivations)
| designTheorem -- D: design theorems (D: Derivations)
| hypothesis -- H: hypotheses (unverified premises; correspond to ATMS assumptions)
deriving BEq, Repr
Proposition identifier. Enumerates all propositions in the manifesto.
Declaration: inductive PropositionId
inductive PropositionId where
-- T: constraints
| t1 | t2 | t3 | t4 | t5 | t6 | t7 | t8
-- E: empirical postulates
| e1 | e2
-- P: foundational principles
| p1 | p2 | p3 | p4 | p5 | p6
-- L: boundary conditions
| l1 | l2 | l3 | l4 | l5 | l6
-- D: design theorems
| d1 | d2 | d3 | d4 | d5 | d6 | d7 | d8 | d9 | d10 | d11 | d12 | d13 | d14 | d15 | d16 | d17 | d18
-- V: observable variables (defined in Observable.lean; stage 3 of the D6 three-stage design)
| v1 | v2 | v3 | v4 | v5 | v6 | v7
deriving BEq, Repr
Returns the category of a proposition.
Declaration: def PropositionId.category
def PropositionId.category : PropositionId → PropositionCategory
| .t1 | .t2 | .t3 | .t4 | .t5 | .t6 | .t7 | .t8 => .constraint
| .e1 | .e2 => .empiricalPostulate
| .p1 | .p2 | .p3 | .p4 | .p5 | .p6 => .principle
| .l1 | .l2 | .l3 | .l4 | .l5 | .l6 => .boundary
| .d1 | .d2 | .d3 | .d4 | .d5 | .d6 | .d7 | .d8 | .d9 | .d10 | .d11 | .d12 | .d13 | .d14 | .d15 | .d16 | .d17 | .d18 => .designTheorem
| .v1 | .v2 | .v3 | .v4 | .v5 | .v6 | .v7 => .boundary -- V are observable variables derived from the L boundaries
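Because category is defined by pattern matching, category claims reduce definitionally; a small sketch:

```lean
-- Category membership is decided by computation, so rfl suffices.
example : PropositionId.t1.category = PropositionCategory.constraint := rfl
example : PropositionId.v1.category = PropositionCategory.boundary := rfl
```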
Returns the direct dependencies of a proposition. Encodes the derivation structure of the manifesto.
Definition of what each proposition depends on. T are root nodes (no dependencies), D are leaf nodes (many dependencies).
Declaration: def PropositionId.dependencies
def PropositionId.dependencies : PropositionId → List PropositionId
-- T: root nodes (independent)
| .t1 | .t2 | .t3 | .t4 | .t5 | .t6 | .t7 | .t8 => []
-- E: partially dependent on T
| .e1 => [.t4]
| .e2 => []
-- P: derived from T/E
| .p1 => [.e2]
| .p2 => [.t4, .e1]
| .p3 => [.t1, .t2]
| .p4 => [.t5, .t7]
| .p5 => [.t4]
| .p6 => [.t3, .t7, .t8]
-- L: dependent on T/E/P
| .l1 => [.p1, .t6]
| .l2 => [.t1, .t3, .t4]
| .l3 => [.t6, .t7]
| .l4 => [.t6, .p1, .d8]
| .l5 => [] -- environment-dependent (external)
| .l6 => [.t6, .p3]
-- D: derived from T/E/P/L
| .d1 => [.p5, .l1, .l2, .l3, .l4, .l5, .l6]
| .d2 => [.e1, .p2]
| .d3 => [.p4, .t5]
| .d4 => [.p3]
| .d5 => [.t8, .p4, .p6]
| .d6 => [.d3]
| .d7 => [.p1]
| .d8 => [.e2]
| .d9 => [.p3]
| .d10 => [.t1, .t2]
| .d11 => [.t3, .d1, .d3]
| .d12 => [.p6, .t3, .t7, .t8]
| .d13 => [.p3, .t5]
| .d14 => [.p6, .t7, .t8]
-- D15: harness engineering theorem (T3+T4+T5+T6+T7+T8+P6)
| .d15 => [.t3, .t4, .t5, .t6, .t7, .t8, .p6]
-- D16: information relevance theorem (context_contribution_nonuniform + T7 + T8)
| .d16 => [.t3, .t7, .t8]
-- D17: deductive design workflow (T5+D3+P3+T6+D5+D9+E1+D2+D13)
| .d17 => [.t5, .t6, .e1, .p3, .d2, .d3, .d5, .d9, .d13]
-- D18: multi-agent coordination (T7b + D12 + T3)
| .d18 => [.t3, .t7, .d12]
-- V: observable variables (from the Observable.lean correspondence table)
| .v1 => [.l2, .l5] -- skillQuality: L2 (ontological), L5 (platform)
| .v2 => [.l2, .l3] -- contextEfficiency: L2 (ontological), L3 (resource)
| .v3 => [.l1, .l4] -- outputQuality: L1 (safety), L4 (action space)
| .v4 => [.l6, .l4] -- gatePassRate: L6 (convention), L4 (action space)
| .v5 => [.l4, .l6] -- proposalAccuracy: L4 (action space), L6 (convention)
| .v6 => [.l2] -- knowledgeStructureQuality: L2 (ontological)
| .v7 => [.l3, .l6] -- taskDesignEfficiency: L3 (resource), L6 (convention)
A proposition directly depends on another proposition.
Declaration: def propositionDependsOn
def propositionDependsOn (a b : PropositionId) : Bool := a.dependencies.contains b
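A minimal sketch of querying the named dependency graph (assuming it is evaluated in the same module as the definitions above):

```lean
-- D13 lists P3 among its dependencies, so the query returns true.
#eval propositionDependsOn PropositionId.d13 PropositionId.p3
-- T1 is a root node with no dependencies, so any query on it returns false.
#eval propositionDependsOn PropositionId.t1 PropositionId.t2
```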
T (constraints) are root nodes: they depend on nothing.
Declaration: theorem constraints_are_roots
theorem constraints_are_roots :
∀ (p : PropositionId),
p.category = .constraint → p.dependencies = [] := by
intro p hp; cases p <;> simp [PropositionId.category] at hp <;> rfl
Epistemological strength ordering of PropositionCategory. T > E > P. L and D are below P.
Declaration: def PropositionCategory.strength
def PropositionCategory.strength : PropositionCategory → Nat
| .constraint => 5
| .empiricalPostulate => 4
| .principle => 3
| .boundary => 2
| .designTheorem => 1
| .hypothesis => 0 -- weakest: unverified premises rank below every other category in epistemological strength
(Axiom Card) Layer: Γ T₀ (Design-derived) Content: Dependencies follow descending epistemological strength. If proposition A depends on B, then B.strength ≥ A.strength. Basis: Design decision for D13 propagation direction. Upstream (stronger) propositions affect downstream (weaker) ones, not vice versa.
Demotion verdict: not derivable. PropositionId.dependencies is a def, but exhaustive
decide/native_decide over all cases times out; maintained as an axiom.
Source: Ontology.lean PropositionId.dependencies Refutation condition: If a dependency violating the strength ordering were added
Declaration: axiom dependency_respects_strength
axiom dependency_respects_strength :
∀ (a b : PropositionId),
propositionDependsOn a b = true →
b.category.strength ≥ a.category.strength
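Although exhaustive decide over all proposition pairs times out (hence the axiom), individual edges remain cheap to check; a sketch spot-checking one instance of the claim:

```lean
-- One edge of the graph: E1 depends on T4, and the strength ordering holds for it.
example : propositionDependsOn PropositionId.e1 PropositionId.t4 = true := by decide
example : PropositionId.t4.category.strength ≥ PropositionId.e1.category.strength := by decide
```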
4. Axioms: Base Theory T0
4.1. Epistemic Layer - Constraint Strength 5 - T1-T8 Base Theory T0
The manifesto's binding conditions are formalized as Lean non-logical axioms (Terminology Reference §4.1).
4.1.1. Position as T0
Procedure 2.4.
T1–T8 are "undeniable, technology-independent facts" that constitute the base theory T₀ (the set of axioms that do not shrink under revision loops). Basis for T₀ membership:
-
T1–T3, T7: Environment-derived (hardware constraints, physical limits of computational resources)
-
T4: Natural-science-derived (nondeterminism inherent in the generation process)
-
T5: Natural-science-derived (fundamental principle of control theory)
-
T6: Contract-derived (authority structure based on agreement with humans)
-
T8: Contract-derived (structural requirements of task definitions)
By declaring them as Lean axioms, they are incorporated into the type system
as propositions assumed without proof (Terminology Reference §4.1, non-logical axioms).
4.1.2. Design Policy
Each T may be decomposed into multiple axioms. A natural-language T1 does not necessarily correspond to a single proposition; finer decompositions arise during formalization. Each axiom's docstring follows the Axiom Card format (Procedure §2.5).
4.1.3. Encoding Method for T0
T1–T8 contain properties that cannot be expressed by type definitions alone (existential quantification, causal relations, etc.), so they are declared as axioms (Axiom Card required). Parts expressible via type definitions are placed in Ontology.lean as definitional extensions (Terminology Reference §5.5).
4.1.4. Correspondence Table
| Corresponding T | Property expressed | T₀ membership basis |
|---|---|---|
| T1 | Sessions terminate in finite time | Environment-derived |
| T1 | No state sharing across sessions | Environment-derived |
| T1 | No mutable state sharing across sessions | Environment-derived |
| T2 | Structure persists after session termination | Environment-derived |
| T2 | Improvements accumulate in structure | Environment-derived |
| T3 | Working memory (processable information) is finite | Environment-derived |
| T3 | Processing is possible only within context capacity | Environment-derived |
| T3 | Not all context items contribute equally to task precision | Natural-science-derived |
| T4 | Different outputs possible for the same input | Natural-science-derived |
| T5 | No improvement without feedback loop | Natural-science-derived |
| T5 | No process improvement without process-targeted feedback | Natural-science-derived |
| T6 | Humans are the final decision-makers for resources | Contract-derived |
| T6 | Humans can revoke resources | Contract-derived |
| T7 | Resources are finite | Environment-derived |
| T8 | Tasks have a precision level | Contract-derived |
4.1.5. Correspondence with Terminology Reference
-
Axiom → Non-logical axiom (§4.1): A proposition assumed true without proof, specific to a given theory
-
T₀ → Base theory: A set of non-logical axioms grounded in external authority (Procedure §2.4)
-
Axiom decomposition → Not definitional extension (§5.5), but refinement of the same concept
4.1.6. T1 Agent Sessions Are Ephemeral
"There is no memory across sessions. There is no continuous 'self.' Each instance is an independent entity with no identity shared with previous instances."
T1 is decomposed into three axioms:
-
Sessions terminate in finite time (boundedness)
-
There is no means to share state across sessions (discontinuity of memory)
-
No mutable state is shared across different sessions (independence)
(Axiom Card) Layer: T₀ (Environment-derived) Content: Sessions terminate in finite time. For all sessions, they become terminated at some point. Basis: Execution of computational agents consumes finite resources and therefore terminates in finite time (related to T7). Reference examples: LLM session timeouts, resource consumption limits.
Theoretical grounding (not formally proven — T₀ environment constraint): (R33) Hoare (1978) "Communicating Sequential Processes" — Process termination: a terminated process engages in no further events Note: T₀ axioms encode physical facts, not mathematical theorems. Session boundedness follows from finite resource consumption (T7), but this causal chain cannot be formalized because canTransition is opaque.
Demotion verdict: not derivable. canTransition and validTransition are opaque, so termination in finite time cannot be derived from the types; maintained as an axiom.
Source: manifesto.md T1 "There is no memory across sessions" Refutation condition: If computational processes could run indefinitely without resource consumption (e.g., infinite energy source)
Declaration: axiom session_bounded
axiom session_bounded :
∀ (w : World) (s : Session),
s ∈ w.sessions →
∃ (w' : World), w.time ≤ w'.time ∧
∃ (s' : Session), s' ∈ w'.sessions ∧
s'.id = s.id ∧ s'.status = SessionStatus.terminated
(Axiom Card) Layer: T₀ (Environment-derived) Content: No state sharing across sessions. Between two sessions with different session IDs, actions in one cannot affect the observable state of the other. Basis: Ephemeral computational processes lose internal state upon process termination. State isolation across sessions is guaranteed at the execution environment level (process-level memory isolation).
Theoretical grounding (not formally proven — T₀ environment constraint): (R31) Milner (1989) "Communication and Concurrency" — CCS: parallel composition with no shared names = causal independence Note: T₀ axioms encode physical facts, not mathematical theorems. Session memory isolation is guaranteed by the execution environment (process-level memory isolation), formalized in CCS as P | Q with disjoint name spaces.
Demotion verdict: not derivable. AuditEntry's preHash/postHash are opaque, so causal independence cannot be derived from the types; maintained as an axiom.
Source: manifesto.md T1 "There is no continuous 'self'" Refutation condition: If session state could leak across process boundaries (e.g., shared mutable memory between independent processes)
Declaration: axiom no_cross_session_memory
axiom no_cross_session_memory :
∀ (w : World) (e1 e2 : AuditEntry),
e1 ∈ w.auditLog → e2 ∈ w.auditLog →
e1.session ≠ e2.session →
-- Audit entries from different sessions are causally independent
-- (one entry's preHash does not depend on the other's postHash)
e1.preHash ≠ e2.postHash
(Axiom Card) Layer: T₀ (Environment-derived) Content: No mutable state sharing across different sessions. Even with the same AgentId, instances in different sessions do not directly share state. Influence propagates only indirectly through structure (T2). Basis: Causal independence across sessions. Each instance is an independent entity. In process algebra terms, sessions are composed in parallel with no shared channels.
Theoretical grounding (not formally proven — T₀ environment constraint): (R31) Milner (1989) "Communication and Concurrency" — CCS parallel composition: P | Q with disjoint names (R32) Honda (1993) "Types for Dyadic Interaction" — Session types: typed channels confined to session scope Note: T₀ axioms encode physical facts, not mathematical theorems. State non-sharing follows from process-level isolation — each session runs in a separate process with no shared mutable channels.
Demotion verdict: not derivable. canTransition is opaque, so transition independence across sessions cannot be derived from the types; maintained as an axiom.
Source: manifesto.md T1 "Each instance is an independent entity" Refutation condition: If inter-session communication channels existed (e.g., shared mutable state accessible across sessions)
Declaration: axiom session_no_shared_state
axiom session_no_shared_state :
∀ (agent1 agent2 : Agent) (action1 action2 : Action)
(w w' : World),
action1.session ≠ action2.session →
canTransition agent1 action1 w w' →
-- If action2 is possible in w, it is also possible in w' (session 1's transition
-- does not directly affect whether session 2's action is possible)
(∃ w'', canTransition agent2 action2 w w'') →
(∃ w''', canTransition agent2 action2 w' w''')
4.1.7. T2 Structure Outlives the Agent
"Documents, tests, skill definitions, design conventions — these persist even after the session ends. The place where improvements accumulate is within structure."
T2 is decomposed into two axioms:
-
Structure persists after session termination (persistence)
-
Structure can accumulate improvements (accumulability)
(Axiom Card) Layer: T₀ (Environment-derived) Content: Structure persists after session termination. Even when a session becomes terminated, structures referenced by that session do not disappear from the World. Basis: Persistence on the file system. Structures (documents, tests, etc.) reside in storage outside the session. The persistence set is monotonically non-decreasing across valid transitions.
Mathematical grounding (Foundation/ProcessModel.lean):
(R34) Lamport (1978) "Time, Clocks, and the Ordering of Events"
— Monotonic logical clocks: once recorded, events remain in causal history
Compositional property proven: persistence_composes (0 sorry)
— structure_persists applied to transition chains yields multi-step persistence
Note: The axiom itself is a T₀ environment constraint (persistent storage).
The Foundation theorem proves a consequence (compositionality), not a derivation.
Demotion verdict: not derivable. validTransition is opaque, so monotonicity of the structures set across transitions cannot be derived from the types; maintained as an axiom.
Source: manifesto.md T2 "The place where improvements accumulate is within structure" Refutation condition: If persistent storage could lose data across valid transitions (e.g., volatile-only storage with no durability guarantee)
Declaration: axiom structure_persists
axiom structure_persists :
∀ (w w' : World) (s : Session) (st : Structure),
s ∈ w.sessions →
st ∈ w.structures →
s.status = SessionStatus.terminated →
validTransition w w' →
st ∈ w'.structures
(Axiom Card) Layer: T₀ (Environment-derived) Content: Structure accumulates improvements. As epochs advance, structures may be updated (lastModifiedAt is non-decreasing). Contrast with T1: agents are ephemeral, but structure grows. Basis: Monotonic epoch increase guaranteed by version control systems (git). Epoch is a logical clock — append-only logs ensure monotonicity.
Mathematical grounding (Foundation/ProcessModel.lean):
(R34) Lamport (1978) "Time, Clocks, and the Ordering of Events"
— Logical clock monotonicity: e₁ → e₂ ⟹ C(e₁) < C(e₂)
Compositional property proven: epoch_monotone_composes (0 sorry)
— epoch monotonicity composes across transition chains via Nat.le_trans
Note: The axiom itself is a T₀ environment constraint (append-only version control).
The Foundation theorem proves a consequence (compositionality), not a derivation.
Demotion verdict: not derivable. validTransition is opaque, so epoch monotonicity cannot be derived from the types; maintained as an axiom.
Source: manifesto.md T2 "Structure outlives the agent" Refutation condition: If version control allowed epoch regression (e.g., non-monotonic clock or destructive history rewrite)
Declaration: axiom structure_accumulates
axiom structure_accumulates :
∀ (w w' : World),
validTransition w w' →
w.epoch ≤ w'.epoch
4.1.8. T3 The Amount of Information Processable at Once Is Finite
"There is a physical upper limit on the amount of information processable at once. A constraint on the agent's cognitive space."
T3 is decomposed into two axioms:
-
Working memory (ContextWindow) capacity is finite (existence)
-
Processing is possible only within working memory capacity (constraint)
(Derivation Card)
Previously: axiom context_finite (T₀, Environment-derived)
Now: theorem derived from ContextWindow type invariants
Derivation: ContextWindow.capacity_pos and ContextWindow.used_le_cap
are embedded in the ContextWindow structure as type-level invariants.
Every valid ContextWindow satisfies these by construction.
Proposition: T3
Source: manifesto.md T3 "There is a physical upper limit on the amount of information processable at once"
Declaration: theorem context_finite
theorem context_finite :
∀ (agent : Agent),
agent.contextWindow.capacity > 0 ∧
agent.contextWindow.used ≤ agent.contextWindow.capacity := by
intro agent
exact ⟨agent.contextWindow.capacity_pos, agent.contextWindow.used_le_cap⟩
(Derivation Card)
Previously: axiom context_bounds_action (T₀, Environment-derived)
Now: theorem derived from context_finite
Derivation: By context_finite, used ≤ capacity holds for all agents.
Therefore used > capacity is a contradiction (Nat.not_lt.mpr),
and ex falso any conclusion follows.
Note: This axiom was vacuously true — its precondition (used > capacity)
is always false given context_finite (used ≤ capacity).
Demoting to theorem makes this explicit.
Proposition: T3
Source: manifesto.md T3 "A constraint on the agent's cognitive space"
Declaration: theorem context_bounds_action
theorem context_bounds_action :
∀ (agent : Agent) (action : Action) (w : World),
agent.contextWindow.used > agent.contextWindow.capacity →
actionBlocked agent action w := by
intro agent action w h_overflow
have h_finite := context_finite agent
omega
(Axiom Card) Layer: T₀ (Natural-science-derived) Content: Not all information in the context contributes equally to task precision. There exist context items whose precision contribution to a given task is zero. Basis: Information-theoretic fact. In any finite information set, the relevance of individual items to a specific objective varies. This is independent of the agent architecture (applies to LLMs, FSMs, human cognition alike).
Theoretical grounding (Foundation/InformationTheory.lean): (R51) Shannon (1948) "A Mathematical Theory of Communication" — Information content varies across symbols in any non-trivial source (R52) Tishby et al. (1999) "The Information Bottleneck Method" — Relevance is task-dependent; optimal compression discards irrelevant items Note: T₀ natural-science constraint. The existence of zero-contribution items cannot be derived from type definitions (precisionContribution is opaque).
Demotion verdict: not derivable. precisionContribution is opaque; maintained as an axiom.
Source: manifesto.md T3 — ForgeCode analysis #147 identified as common root of B1/B3/B5/B6. Refutation condition: If it were shown that all information contributes equally to all tasks (contradicts information theory).
Declaration: axiom context_contribution_nonuniform
axiom context_contribution_nonuniform :
∀ (task : Task),
task.precisionRequired.required > 0 →
∃ (item : ContextItem),
precisionContribution item task = 0
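Taken together with T8 (task_has_precision, §4.1.14), the precondition can be discharged, so every task has a prunable context item. A minimal derived sketch, assuming the declarations above are in scope; the theorem name is illustrative, not part of the formalization:

```lean
-- Every task admits a zero-contribution (prunable) context item:
-- T8 supplies precisionRequired.required > 0, discharging the
-- precondition of context_contribution_nonuniform.
-- (Theorem name is hypothetical.)
theorem every_task_has_prunable_item (task : Task) :
    ∃ (item : ContextItem), precisionContribution item task = 0 :=
  context_contribution_nonuniform task (task_has_precision task)
```

This is the formal shape of the pruning argument behind B1/B3/B5/B6: a zero-contribution witness always exists.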
4.1.9. T4 Agent Output Is Stochastic
"Different outputs may be produced for the same input. Structure is interpreted probabilistically each time. Designs that assume 100% compliance are fragile."
Since canTransition is defined as a relation rather than a function (see Ontology.lean),
multiple w' can satisfy canTransition for the same (agent, action, w).
T4 declares as an axiom that "this multiplicity can actually occur."
(Axiom Card)
Layer: T₀ (Natural-science-derived)
Content: Nondeterminism of output. For the same agent, action, and world state, different transition targets may exist.
Basis: Nondeterminism inherent in the agent's generation process. Multiple sources — sampling (temperature parameter), non-associativity of floating-point arithmetic, irreversibility of branching in autoregressive generation — enable different outputs for the same input. Even at temperature = 0, floating-point-level nondeterminism may persist.
Mathematical grounding (Foundation/Probability.lean):
Under the conditions τ > 0 and |V| ≥ 2, this axiom is mathematically justified:
(R1) Kolmogorov (1933) — probability axioms (Mathlib: PMF)
(R2) Gao & Pavel (2017, arXiv:1704.00805) — softmax full support:
τ > 0 → ∀ i, softmax(z/τ)_i > 0 (proven: softmax_full_support)
(R3) Jang et al. (2017, ICLR, arXiv:1611.01144) — categorical sampling:
output ~ Categorical(softmax(z/τ))
(R4) Atil et al. (2024, arXiv:2408.04667) — hardware nondeterminism:
nondeterminism arises from GPU floating-point behavior even at τ = 0
Source: manifesto.md T4 "Different outputs may be produced for the same input"
Refutation condition: If all LLM architectures switch to deterministic-only generation (temperature fixed at 0 with no floating-point nondeterminism).
Declaration: axiom output_nondeterministic
axiom output_nondeterministic :
∃ (agent : Agent) (action : Action) (w w₁ w₂ : World),
canTransition agent action w w₁ ∧
canTransition agent action w w₂ ∧
w₁ ≠ w₂
4.1.10. T5 Improvement Is Impossible Without Feedback
"A fundamental of control theory. Without a loop of measurement, comparison, and adjustment, convergence toward the goal does not occur."
T5 declares that the existence of feedback is a necessary condition for improvement.
Predicate for whether structure has improved. Concretized from opaque (#163/#506): at least one structure in w' has a lastModifiedAt epoch strictly greater than in w (i.e., a structure was updated), or a new structure was added.
This captures the minimal observable signal of structural improvement: some persistent artifact was modified. Content-level quality assessment is delegated to V1–V7 metrics.
Declaration: def structureImproved
def structureImproved (w w' : World) : Prop :=
∃ (s' : Structure),
s' ∈ w'.structures ∧
((∀ (s : Structure), s ∈ w.structures → s.id = s'.id → s.lastModifiedAt < s'.lastModifiedAt)
∨ (¬∃ (s : Structure), s ∈ w.structures ∧ s.id = s'.id))
Predicate for whether a process has improved (#316). Processes (learning lifecycles, workflows) are also subject to T5: convergence requires feedback. Remains opaque because ProcessId is opaque and process-level quality cannot be expressed via typed structure fields (unlike structural improvement which is observable via lastModifiedAt).
Declaration: opaque processImproved
opaque processImproved : ProcessId → World → World → Prop
(Axiom Card)
Layer: T₀ (Natural-science-derived)
Content: Feedback is required for structural improvement. If structure has improved between two world states, then feedback exists in between.
Basis: Fundamental principle of control theory. Without a loop of measurement, comparison, and adjustment, convergence toward the goal does not occur.
Theoretical grounding (Foundation/ControlTheory.lean):
(R41) Francis & Wonham (1976) "The Internal Model Principle"
— Tracking requires a model of the signal generator (= feedback)
(R42) Cover & Thomas (1991) "Data Processing Inequality"
— Without new information (feedback), no improvement is possible
Compositional property proven: feedback_interval_widen (0 sorry)
Note: T₀ natural-science constraint. Feedback necessity cannot be derived
from type definitions alone.
Demotion verdict: Not derivable — structureImproved is a def, but the implication "improvement → feedback" is a control-theory claim that cannot be derived from type definitions. Retained as axiom.
Source: manifesto.md T5 "A fundamental of control theory"
Refutation condition: If structural improvement without any feedback mechanism were demonstrated (e.g., improvement through purely internal computation without external input)
Declaration: axiom no_improvement_without_feedback
axiom no_improvement_without_feedback :
∀ (w w' : World),
structureImproved w w' →
∃ (f : Feedback), f ∈ w'.feedbacks ∧
f.timestamp ≥ w.time ∧ f.timestamp ≤ w'.time
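The contrapositive is the form behind P4's slogan "what cannot be observed cannot be optimized," and it follows in one line. A sketch assuming the project's declarations are in scope; the theorem name is hypothetical:

```lean
-- If no feedback arrived in the interval [w.time, w'.time],
-- the structure cannot have improved between w and w'.
-- (Theorem name is hypothetical.)
theorem no_feedback_no_improvement (w w' : World)
    (h_none : ¬∃ (f : Feedback), f ∈ w'.feedbacks ∧
      f.timestamp ≥ w.time ∧ f.timestamp ≤ w'.time) :
    ¬structureImproved w w' :=
  fun h_imp => h_none (no_improvement_without_feedback w w' h_imp)
```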
(Axiom Card)
Layer: T₀ (Natural-science-derived)
Content: Feedback is required for process improvement. The same control-theory principle as T5 applies to processes: without measurement, comparison, and adjustment of the process itself, the process does not converge toward its goals.
Basis: Same as T5 (Internal Model Principle + Data Processing Inequality), applied to Level 1 (process) rather than Level 0 (structure).
Demotion verdict: Not derivable — processImproved is opaque. Retained as axiom.
Source: #316 "T5 Feedback.target: StructureId blocks process-level meta-feedback"
Refutation condition: If process improvement without feedback about the process were demonstrated
Declaration: axiom no_process_improvement_without_feedback
axiom no_process_improvement_without_feedback :
∀ (pid : ProcessId) (w w' : World),
processImproved pid w w' →
∃ (f : Feedback), f ∈ w'.feedbacks ∧
f.target = .process pid ∧
f.timestamp ≥ w.time ∧ f.timestamp ≤ w'.time
4.1.11. T6 Humans Are the Final Decision-Makers for Resources
"Computational resources, data access, execution privileges — all are granted by humans and can be revoked by humans."
T6 is decomposed into two axioms:
- The origin of resource allocation is human (authority)
- Humans can revoke resources (reversibility)
Predicate for whether an agent is human.
Declaration: def isHuman
def isHuman (agent : Agent) : Prop := agent.role = AgentRole.human
(Axiom Card)
Layer: T₀ (Contract-derived)
Content: The origin of resource allocation is human. The grantedBy of every resource allocation holds a human role.
Basis: Agreement on authority structure in human-agent collaboration.
Source: manifesto.md T6 "Computational resources, data access, execution privileges — all are granted by humans"
Refutation condition: Not applicable (T₀)
Declaration: axiom human_resource_authority
axiom human_resource_authority :
∀ (w : World) (alloc : ResourceAllocation),
alloc ∈ w.allocations →
∃ (human : Agent), isHuman human ∧ human.id = alloc.grantedBy
(Axiom Card)
Layer: T₀ (Contract-derived)
Content: Humans can revoke resources. For any resource allocation, there exists a transition in which a human invalidates it.
Basis: Agreement on human final decision-making authority. Privileges can be delegated but remain revocable.
Source: manifesto.md T6 "can be revoked by humans"
Refutation condition: Not applicable (T₀)
Declaration: axiom resource_revocable
axiom resource_revocable :
∀ (w : World) (alloc : ResourceAllocation),
alloc ∈ w.allocations →
∃ (w' : World) (human : Agent),
isHuman human ∧
validTransition w w' ∧
alloc ∉ w'.allocations
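A small corollary makes "revocable" concrete: no allocation can be guaranteed to survive every valid transition. A sketch assuming the declarations above are in scope; the theorem name is hypothetical:

```lean
-- No allocation is permanent: resource_revocable supplies a world w'
-- reachable by a valid transition in which the allocation is absent,
-- contradicting any claim of permanence. (Theorem name is hypothetical.)
theorem no_allocation_is_permanent (w : World) (alloc : ResourceAllocation)
    (h_mem : alloc ∈ w.allocations) :
    ¬∀ (w' : World), validTransition w w' → alloc ∈ w'.allocations := by
  intro h_perm
  obtain ⟨w', _human, _, h_trans, h_gone⟩ := resource_revocable w alloc h_mem
  exact h_gone (h_perm w' h_trans)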
4.1.12. T7 Resources Available for Task Execution Are Finite
Time and energy.
"Whereas T3 states the finiteness of cognitive space (context), T7 states the finiteness in the temporal and energetic dimensions."
(Axiom Card)
Layer: T₀ (Environment-derived)
Content: Resources are finite.
The total resource amount across the entire World does not exceed globalResourceBound.
Quantified in ∃-∀ order (not ∀-∃), guaranteeing that a single upper bound
exists for all Worlds (non-vacuity, Terminology Reference §6.4).
Basis: Physical finiteness of computational resources (CPU, memory, API quotas).
Every computational system operates within finite energy, memory, and time budgets.
Mathematical grounding (Foundation/ProcessModel.lean):
Compositional property proven: resource_bound_max (0 sorry)
— universal bound holds simultaneously across any pair of worlds
Note: Physical finiteness is axiomatic in the physical sciences.
No formal derivation from more primitive axioms is possible —
this is a T₀ environment constraint, not a mathematical theorem.
Demotion verdict: Not derivable — globalResourceBound is an opaque constant, and finiteness is retained as an axiom as a physical fact.
Source: manifesto.md T7 "Resources available for task execution are finite"
Refutation condition: If computational resources became unlimited (e.g., infinite energy or infinite memory)
Declaration: axiom resource_finite
axiom resource_finite :
∀ (w : World),
(w.allocations.map (·.amount)).foldl (· + ·) 0 ≤ globalResourceBound
4.1.13. T7b Sequential Execution Time Is Additive
A concretization of T7's resource finiteness into the time dimension; structurally parallel to T3, which is T7's spatial dimension (context finiteness).
In sequential execution, the durations of the subtasks add up. This is a fact derived from physical law, and its asymmetry with parallel execution (duration = max) is the basis for multi-agent coordination (D18).
(Axiom Card)
Layer: T₀ (Environment-derived)
Content: When independent tasks exist, the total time of their sequential execution can be strictly greater than the maximum time of any single task. This is the rationale for parallel execution: parallel = max < sequential = sum.
Basis: Physical law. In sequential execution, each subtask consumes time in turn; the sum of two nonzero task durations is greater than either duration alone.
Source: Time-dimension concretization of manifesto T7. Introduced from the root-cause analysis of B4 (#276).
Demotion verdict: Not derivable — executionDuration is opaque.
Declaration: axiom sequential_exceeds_component
axiom sequential_exceeds_component :
∀ (t1 t2 : Task),
executionDuration t1 > 0 →
executionDuration t2 > 0 →
executionDuration t1 + executionDuration t2 > executionDuration t1 ∧
executionDuration t1 + executionDuration t2 > executionDuration t2
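The parallel = max < sequential = sum asymmetry mentioned above can also be checked as a self-contained arithmetic lemma over Nat, independent of the opaque executionDuration; the theorem name is illustrative:

```lean
-- For two tasks with nonzero durations a and b, parallel execution
-- takes max a b while sequential execution takes a + b, and the
-- former is strictly smaller. (Self-contained; name is hypothetical.)
theorem parallel_beats_sequential (a b : Nat) (ha : a > 0) (hb : b > 0) :
    max a b < a + b := by
  cases Nat.le_total a b with
  | inl h => rw [Nat.max_eq_right h]; omega
  | inr h => rw [Nat.max_eq_left h]; omega
```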
4.1.14. T8 Tasks Have a Precision Level to Be Achieved
"Whether self-imposed or externally imposed, tasks without a precision level cannot be optimization targets."
(Derivation Card)
Previously: axiom task_has_precision (T₀, Contract-derived)
Now: theorem derived from PrecisionLevel type invariant
Derivation: PrecisionLevel.required_pos is embedded in the PrecisionLevel structure.
Every valid Task has a PrecisionLevel with required > 0 by construction.
Proposition: T8
Source: manifesto.md T8 "Whether self-imposed or externally imposed"
Declaration: theorem task_has_precision
theorem task_has_precision :
∀ (task : Task),
task.precisionRequired.required > 0 := by
intro task
exact task.precisionRequired.required_pos
4.1.15. Sorry Inventory Phase 1
List of sorry occurrences in Phase 1:
| Location | Reason for sorry |
|---|---|
| | opaque — transition conditions to be defined in Phase 3+ |
| | opaque — to be concretized per domain in Phase 2+ |
| | concretized (#163/#506) — lastModifiedAt comparison + new structure detection |
Axioms are propositions assumed without proof, so they contain no sorry. When P1–P6 are derived as theorems in Phase 3, sorry occurrences will arise.
5. Empirical Postulates: E1-E2
5.1. Epistemic Layer - EmpiricalPostulate Strength 4 - E1-E2 Premise Set
Formalizes empirical postulates as Lean non-logical axioms (Terminology Reference §4.1).
5.1.1. Position as Extension of T0
See Procedure §2.4.
E1–E2 are "findings repeatedly demonstrated with no known counterexamples, yet in principle refutable," constituting the extended part (Γ∖T₀) of premise set Γ. Difference from T₀: based not on external authority (contracts, natural laws) but on hypotheses grounded in empirical observation (Terminology Reference §9.1, empirical propositions). They possess refutability (§9.1) and are subject to AGM contraction (§9.2).
In Lean, they are declared as axiom just like T₀, but each axiom card
must include a refutation condition (Procedure §2.5).
5.1.2. Relationship to T0
See Procedure §2.4.
Γ is an extension of T₀ (Terminology Reference §5.5), so Thm(T₀) ⊆ Thm(Γ). If an E is refuted, the P's that depend on it (P1, P2) become subject to revision, but T₀ and P's that depend solely on T₀ (P3–P6) are unaffected. This follows from the monotonicity of extensions (§2.5 / §5.3).
5.1.3. Correspondence Table
| Axiom name | Corresponding E | Expressed property | Γ∖T₀ membership basis |
|---|---|---|---|
| | E1 | Generation and evaluation must be separated | Hypothesis-derived |
| | E1 | Prohibition of self-verification | Hypothesis-derived |
| | E1 | Shared bias degrades detection power | Hypothesis-derived |
| | E2 | Capability growth is inseparable from risk growth | Hypothesis-derived |
| | E3a | Confidence is self-description, not measurement | Hypothesis-derived |
5.1.4. E1 Verification Requires Independence
"Generation and evaluation by the same process has been demonstrated to be structurally unreliable across all domains (scientific peer review, financial auditing, software testing). Given T4 (probabilistic output), it is empirically supported that when a process with the same biases handles both generation and evaluation, detection power degrades."
E1 is decomposed into three axioms:
- The agents responsible for generation and evaluation must be separated (structural independence)
- Self-verification is not permitted (prohibition of self-verification)
- Verification between agents sharing internal state has low detection power (bias correlation)
(Axiom Card)
Layer: Γ∖T₀ (Hypothesis-derived)
Content: The agents responsible for generation and evaluation must be independent. The agent that generated an action and the agent that verifies it must be distinct individuals that do not share internal state.
Basis: A principle repeatedly demonstrated in scientific peer review, financial auditing, software testing, etc. Grounded in the statistical independence requirements of hypothesis testing.
Theoretical grounding (Foundation/StatisticalTesting.lean):
(R61) Neyman & Pearson (1933) "On the Problem of the Most Efficient Tests"
— Independent testing maximizes detection power
(R62) Podsakoff et al. (2003) "Common Method Biases in Behavioral Research"
— Shared method/source biases reduce detection ability
(R63) AICPA / ISA 610 — Auditing standards requiring independence
Formally proven: self_id_not_distinct, independence_symmetric (0 sorry)
Derived theorem: no_self_verification (demoted from axiom — derives from this axiom)
Demotion verdict: Not derivable — generates, verifies, and sharesInternalState are opaque, so the independence requirement cannot be derived from the types. Retained as axiom.
Source: manifesto.md E1 "Verification Requires Independence"
Refutation condition: If self-verification is demonstrated to have detection power equal to external verification (e.g., realization of complete self-awareness)
Declaration: axiom verification_requires_independence
axiom verification_requires_independence :
∀ (gen ver : Agent) (action : Action) (w : World),
generates gen action w →
verifies ver action w →
gen.id ≠ ver.id ∧ ¬sharesInternalState gen ver
(Derivation Card)
Derives from: verification_requires_independence (E1a)
Content: Prohibition of self-verification.
The same agent cannot perform both generation and verification.
Derived from E1a: if gen.id ≠ ver.id is required, then gen = ver is impossible.
Proof strategy: Assume generates and verifies for same agent, derive gen.id ≠ gen.id
from verification_requires_independence, contradict with rfl.
Previously: axiom (declared explicitly as E1b).
Demoted: 2026-04-01 — derivable from verification_requires_independence via
contradiction (same proof as Principles.lean:e1b_from_e1a).
Theoretical grounding (Foundation/StatisticalTesting.lean):
(R61) Neyman & Pearson (1933) — independent testing maximizes detection power
Formally proven: self_id_not_distinct (0 sorry) — ¬(id ≠ id)
Declaration: theorem no_self_verification
theorem no_self_verification :
∀ (agent : Agent) (action : Action) (w : World),
generates agent action w →
¬verifies agent action w := by
intro agent action w h_gen h_ver
have h := verification_requires_independence agent agent action w h_gen h_ver
exact absurd rfl h.1
(Axiom Card)
Layer: Γ∖T₀ (Hypothesis-derived)
Content: Sharing internal state correlates biases. Two agents sharing internal state cannot be considered independent verifiers, even if one generates and the other verifies.
Basis: Detection-power degradation due to shared bias is empirically supported by conflict-of-interest policies in scientific research, audit-firm rotation requirements, etc.
Theoretical grounding (Foundation/StatisticalTesting.lean):
(R62) Podsakoff et al. (2003) "Common Method Biases"
— Shared source biases inflate correlations and reduce detection
Formally proven: independence_symmetric (0 sorry)
— if A does not share state with B, then B does not share with A
Demotion verdict: Not derivable — sharesInternalState, generates, and verifies are opaque; the causal link between bias correlation and reduced detection power cannot be derived from the types. Retained as axiom.
Source: manifesto.md E1 "When a process with the same biases handles both generation and evaluation, detection power degrades"
Refutation condition: If it is demonstrated that bias correlation has no effect on detection power
Declaration: axiom shared_bias_reduces_detection
axiom shared_bias_reduces_detection :
∀ (a b : Agent) (action : Action) (w : World),
sharesInternalState a b →
generates a action w →
¬verifies b action w
5.1.5. E2 Capability Growth Is Inseparable from Risk Growth
"It has been repeatedly observed across all tools that capability enables both positive and negative outcomes. However, there is no proof that means to increase capability while completely containing risk (such as a perfect sandbox) are impossible in principle."
E2 is formalized as a single axiom. Expansion of the action space (actionSpaceSize) necessarily entails an increase in risk exposure (riskExposure).
5.1.6. Note on Empirical Status
E2 is an empirical postulate and does not exclude the possibility that a "perfect sandbox" may be discovered in the future. It is assumed as an axiom, but if refuted, P1 (co-scaling of autonomy and vulnerability) becomes subject to revision.
(Axiom Card)
Layer: Γ∖T₀ (Hypothesis-derived)
Content: Capability growth is inseparable from risk growth. When an agent's action space expands, risk exposure necessarily increases.
Basis: Attack-surface monotonicity — each additional capability enables at least one additional adversarial execution trace, strictly increasing the attack-surface metric (an order-theoretic property, not a statistical claim).
Source: manifesto.md E2 "Capability Growth Is Inseparable from Risk Growth"
Refutation condition: If means to increase capability while completely containing risk are discovered (e.g., a perfect sandbox)
Mathematical grounding (Foundation/RiskTheory.lean):
(R21) Manadhata & Wing (2011) "A Formal Model for a System's Attack Surface"
— Attack surface monotonicity: capability ⊃ → attack surface ⊃
(R22) Saltzer & Schroeder (1975) "The Protection of Information in Computer Systems"
— Principle of Least Privilege: minimizing capability minimizes risk
(R23) Dennis & Van Horn (1966) "Programming Semantics for Multiprogrammed Computations"
— Capability model: capability set = set of possible actions
Formally proven (Foundation/RiskTheory.lean):
capability_risk_is_strict_mono, equal_risk_implies_equal_capability,
capability_risk_injective, least_privilege_reduces_risk (0 sorry)
Bridge (EmpiricalPostulates.lean):
e2_equal_risk_equal_capability — contrapositive of E2 on actual opaque functions
Derivation: (R21) attack surface monotonicity → StrictMono (Mathlib)
→ E2 is precisely the StrictMono property of the risk function
→ e2_equal_risk_equal_capability bridges to opaque riskExposure/actionSpaceSize
Demotion verdict: Not derivable — actionSpaceSize and riskExposure are both opaque, so the StrictMono relation cannot be derived from the type definitions. Retained as axiom; its grounding is verified above.
Choice of inequality: < vs ≤
The manifesto's "inseparable" implies strict co-scaling, so
< (strict increase) is adopted. If the refutation condition is met (discovery of
a risk containment method), this axiom becomes subject to AGM contraction
(Terminology Reference §9.2), and P1 (co-scaling of autonomy and vulnerability)
is revised.
Declaration: axiom capability_risk_coscaling
axiom capability_risk_coscaling :
∀ (agent : Agent) (w w' : World),
actionSpaceSize agent w < actionSpaceSize agent w' →
riskExposure agent w < riskExposure agent w'
E2 bridge: equal risk exposure implies equal action space size.
Contrapositive of capability_risk_coscaling — if two worlds have
the same risk exposure for an agent, they must have the same
action space size. No "free capability" is possible.
Bridges Foundation/RiskTheory.lean's abstract StrictMono theorems to the actual opaque functions used in E2.
Reference: (R22) Saltzer & Schroeder (1975)
Declaration: theorem e2_equal_risk_equal_capability
theorem e2_equal_risk_equal_capability (agent : Agent) (w w' : World)
(h_eq : riskExposure agent w = riskExposure agent w') :
actionSpaceSize agent w = actionSpaceSize agent w' :=
match Nat.lt_or_ge (actionSpaceSize agent w) (actionSpaceSize agent w'),
Nat.lt_or_ge (actionSpaceSize agent w') (actionSpaceSize agent w) with
| Or.inl h_lt, _ => absurd h_eq (Nat.ne_of_lt (capability_risk_coscaling agent w w' h_lt))
| _, Or.inl h_gt => absurd (h_eq.symm) (Nat.ne_of_lt (capability_risk_coscaling agent w' w h_gt))
| Or.inr h_ge, Or.inr h_ge' => Nat.le_antisymm h_ge' h_ge
5.1.7. E3a Confidence Is Self-Description
T4 (output_nondeterministic) asserts World-level nondeterminism:
different outputs may be produced for the same input.
E3a asserts a stronger, orthogonal claim at the Confidence level: even when the result is the same, the Confidence value may differ. This means Confidence is a self-description of the output's probabilistic nature, not an externally calibrated measurement.
5.1.8. Orthogonality with T4
- T4 → E3a: Not derivable. World differences may manifest only in non-Confidence fields.
- E3a → T4: Not derivable. Confidence differences do not directly imply World differences. worldOutput connects them formally, but the axiom asserts only existence, not a bijection.
This orthogonality justifies E3a as an independent empirical postulate.
(Axiom Card)
Layer: Γ∖T₀ (Hypothesis-derived)
Content: Confidence is self-description, not measurement. For the same result, different Confidence values may be generated. Confidence.value should not be trusted for decision-making without external verification (P2).
Difference from T4:
- T4: World-level nondeterminism (the Output as a whole may differ)
- E3a: Confidence-level nondeterminism (same result, different confidence)
Basis:
(R81) Li et al. (ACL 2025) "Revisiting Epistemic Markers in Confidence Estimation"
— Linguistic confidence markers are inconsistent out-of-distribution
→ Registered as CORE-H1 in Models/Assumptions/EpistemicLayer.lean (#547)
(R82) Ontology.lean:155 comment "self-description" — elevated to a formal axiom
Refutation condition: If Confidence.value is demonstrated to correlate with actual accuracy without external calibration (i.e., well-calibrated by default).
Demotion verdict: Not derivable — T4 states a World-level difference, but the specialization to the Confidence field cannot be derived from T4. Retained as axiom.
Declaration: axiom confidence_is_self_description
axiom confidence_is_self_description :
∃ (agent : Agent) (action : Action) (w w₁ w₂ : World),
canTransition agent action w w₁ ∧
canTransition agent action w w₂ ∧
(worldOutput w₁).result = (worldOutput w₂).result ∧
(worldOutput w₁).confidence ≠ (worldOutput w₂).confidence
E3a practical implication: if Confidence varies for the same result, then relying on Confidence.value alone for decision-making is unreliable. Derives from E3a by contradiction with the universality assumption.
Note: The theorem statement uses free Output variables (not worldOutput) because the practical claim is about Output values regardless of their World origin. E3a (via worldOutput) provides the witness.
Declaration: theorem confidence_unreliable_for_decision
theorem confidence_unreliable_for_decision :
(∃ (o₁ o₂ : Output Nat),
o₁.result = o₂.result ∧
o₁.confidence ≠ o₂.confidence) →
¬(∀ (o₁ o₂ : Output Nat),
o₁.result = o₂.result → o₁.confidence = o₂.confidence) := by
intro ⟨o₁, o₂, heq, hneq⟩ h
exact hneq (h o₁ o₂ heq)
5.1.9. Sorry Inventory Phase 2 Additions
| Location | Reason for sorry |
|---|---|
| | opaque — to be concretized as Worker actions in Phase 3+ |
| | opaque — to be concretized as Verifier actions in Phase 3+ |
| | opaque — to be concretized as session/parameter sharing in Phase 3+ |
| | opaque — to be quantified as Observable in Phase 4+ |
| | opaque — to be quantified as Observable in Phase 4+ |
6. Principles: P1-P6
6.1. Epistemic Layer - Principle Strength 3 - P1-P6 Theorem Derivation Procedure Phase 2
Describes design principles derived from premise set Γ (T₀ = T1–T8, Γ∖T₀ = E1–E2) as Lean theorems (Terminology Reference §4.2). Each P takes the form Γ ⊢ φ, a conditional derivation under premise set Γ (§2.5).
6.1.1. Derivation Structure Dependency Graph
Axiom dependencies (T/E basis) and robustness layer for each P:
| P | Basis | Robustness | Derivation type |
|---|---|---|---|
| P1 | E2 | Empirical (depends on Γ∖T₀) | Direct application of E2 |
| P2 | T4 + E1 | Empirical (depends on Γ∖T₀) | Direct application of E1a |
| P3 | T1 + T2 | Robust (T₀ only) | Composition of T1 and T2 |
| P4 | T5 (+ T7) | Robust (T₀ only) | Direct application of T5 |
| P5 | T4 | Robust (T₀ only) | High-level restatement of T4 |
| P6 | T3 + T7 + T8 | Robust (T₀ only) | Unfolding of the T3, T7, T8 constraint structure |
If Γ∖T₀ (E1, E2) is refuted (Terminology Reference §9.1, refutability), only P1 and P2 are affected. P3–P6 depend solely on T₀ and are therefore invariant under contraction of Γ∖T₀ (§9.2). This is a consequence of the monotonicity of extensions (§2.5 / §5.3).
6.1.2. Correspondence with Terminology Reference
- theorem → theorem (§4.2): a proposition proved from axioms and inference rules
- sorry → incomplete derivation (§1): lacking a proof (a sequence of inference-rule applications from axioms to theorem)
- E1b redundancy → independence check (§4.3): E1b is derivable from E1a (not independent)
6.1.3. Appendix - Proof of E1b Redundancy
Demonstrates as a theorem that E1b (no_self_verification) is derivable
from E1a (verification_requires_independence).
This is a concrete example of axiom hygiene check 3 (independence, Procedure §2.6):
E1b is a redundant axiom and should be proved as a theorem.
6.1.4. P1 Co-scaling of Autonomy and Vulnerability
Derived from E2. Each time an agent's action space expands, the damage that malicious inputs or judgment errors can cause also grows.
Concepts P1 adds beyond E2:
- "Unprotected expansion can destroy accumulated trust in a single incident" → asymmetry of trust accumulation (gradual accumulation vs. abrupt destruction)
(Derivation Card)
Derives from: capability_risk_coscaling (E2)
Proposition: P1
Content: Expansion of the action space entails expansion of risk exposure — autonomy and vulnerability co-scale, so unprotected expansion can destroy accumulated trust.
Proof strategy: Direct application of capability_risk_coscaling (E2)
Declaration: theorem autonomy_vulnerability_coscaling
theorem autonomy_vulnerability_coscaling :
∀ (agent : Agent) (w w' : World),
actionSpaceSize agent w < actionSpaceSize agent w' →
riskExposure agent w < riskExposure agent w' :=
capability_risk_coscaling
P1b (theorem): Unprotected expansion destroys trust. When the action space expands and risk materializes, the trust level decreases.
Formalization of "accumulated trust can be destroyed by a single incident." The asymmetry of trust (gradual accumulation vs. abrupt destruction) will be made Observable as asymmetry in trustLevel fluctuation magnitude in Phase 4.
Declaration: theorem unprotected_expansion_destroys_trust
theorem unprotected_expansion_destroys_trust :
∀ (agent : Agent) (w w' : World),
actionSpaceSize agent w < actionSpaceSize agent w' →
riskMaterialized agent w' →
trustLevel agent w' < trustLevel agent w :=
trust_decreases_on_materialized_risk
6.1.5. P2 Cognitive Separation of Concerns
Derived from T4 and E1. Since output is probabilistic (T4) and generation and evaluation by the same process have correlated biases (E1), separation of generation and evaluation is required for the verification framework to function.
"Separation itself is non-negotiable."
Predicate for whether a verification framework is sound. Sound = all generated actions are independently verified.
Declaration: def verificationSound
def verificationSound (w : World) : Prop :=
∀ (gen ver : Agent) (action : Action),
generates gen action w →
verifies ver action w →
gen.id ≠ ver.id ∧ ¬sharesInternalState gen ver
(Derivation Card)
Derives from: verification_requires_independence (E1a)
Proposition: P2
Content: Verification soundness requires role separation — the generator and evaluator of an action must be distinct agents with no shared internal state.
Proof strategy: Direct application of verification_requires_independence (E1a) via verificationSound definition
Declaration: theorem cognitive_separation_required
theorem cognitive_separation_required :
∀ (w : World), verificationSound w :=
fun w gen ver action h_gen h_ver =>
verification_requires_independence gen ver action w h_gen h_ver
P2 lemma: Self-verification destroys verification framework soundness.
Declaration: theorem self_verification_unsound
theorem self_verification_unsound :
∀ (agent : Agent) (action : Action) (w : World),
generates agent action w →
¬verifies agent action w :=
no_self_verification
6.1.6. P3 Governed Learning
Derived from the combination of T1 and T2. Agents are ephemeral (T1) but structures persist (T2). The process of integrating knowledge into structures requires governance.
Two failure modes of ungoverned learning:
- Chaos: structure degrades through accumulation of erroneous knowledge
- Stagnation: knowledge fails to consolidate and structure does not improve
P3a (theorem): By T1, the agent that made modifications disappears. The session of an agent that modified structure necessarily terminates (T1). After termination, that agent loses the ability to correct modifications.
This is half of P3's "problem": the supervisor becomes absent.
Declaration: theorem modifier_agent_terminates
theorem modifier_agent_terminates :
∀ (w : World) (s : Session) (agent : Agent),
s ∈ w.sessions →
agent.currentSession = some s.id →
-- T1: this session necessarily terminates
∃ (w' : World), w.time ≤ w'.time ∧
∃ (s' : Session), s' ∈ w'.sessions ∧
s'.id = s.id ∧ s'.status = SessionStatus.terminated :=
fun w s _ h_mem _ => session_bounded w s h_mem
P3b (theorem): By T2, modifications persist. Changes made to structure (including errors) remain even after the agent's session terminates.
This is half of P3's "stakes": errors persist indefinitely.
Declaration: theorem modification_persists_after_termination
theorem modification_persists_after_termination :
∀ (w w' : World) (s : Session) (st : Structure),
s ∈ w.sessions →
st ∈ w.structures →
s.status = SessionStatus.terminated →
validTransition w w' →
-- T2: structures persist
st ∈ w'.structures :=
structure_persists
(Derivation Card)
Derives from: session_bounded (T1), structure_persists (T2)
Proposition: P3
Content: Ungoverned breaking changes are irrecoverable — the change persists (T2) while the agent that made it disappears (T1), leaving no correcting agent.
Proof strategy: intro + apply structure_accumulates — epoch monotonicity via validTransition chain from ki.after
Declaration: theorem ungoverned_breaking_change_irrecoverable
theorem ungoverned_breaking_change_irrecoverable :
∀ (w : World) (s : Session) (st : Structure)
(ki : KnowledgeIntegration),
-- Premise: an agent modified the structure
s ∈ w.sessions →
st ∈ w.structures →
ki.before = w →
ki.compatibility = CompatibilityClass.breakingChange →
-- T1's contribution: the agent's session terminates
(∃ (w_term : World), w.time ≤ w_term.time ∧
∃ (s' : Session), s' ∈ w_term.sessions ∧
s'.id = s.id ∧ s'.status = SessionStatus.terminated) →
-- T2's contribution: the modified structure persists
(∀ (w_future : World),
validTransition ki.after w_future →
∀ st', st' ∈ ki.after.structures → st' ∈ w_future.structures) →
-- Conclusion: without governance, the breaking change persists (no agent remains to correct it)
-- Formalization: the post-change epoch never rolls back (irreversible)
∀ (w_future : World),
validTransition ki.after w_future →
ki.after.epoch ≤ w_future.epoch :=
fun _ _ _ ki _ _ _ _ _ _ w_future h_trans =>
structure_accumulates ki.after w_future h_trans
P3 conclusion: Why governance is necessary.
Combining P3a (modifier_agent_terminates), P3b (modification_persists_after_termination),
and P3c (ungoverned_breaking_change_irrecoverable):
Ungoverned knowledge integration produces a state where "irrecoverable breaking changes persist indefinitely." Governance (upfront compatibility classification + gates) is the only means to prevent this.
Note: The proof of P3c depends on structure_accumulates, but the
propositional structure of the theorem requires both the T1 and T2 hypotheses.
Without T1, "the agent might be able to correct it";
without T2, "the change might disappear" — so
neither hypothesis can be omitted.
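The composition of T1 and T2 can be sketched in a toy model. All names below are illustrative stand-ins, not the project's actual World/Session types:

```lean
-- Toy analog of T2: records only accumulate, so anything written
-- (including an erroneous record) survives every later step.
structure ToyWorld where
  time    : Nat
  records : List String

def step (w : ToyWorld) (r : String) : ToyWorld :=
  { time := w.time + 1, records := r :: w.records }

-- Once a record is present, it is still present after any step:
-- the persistence half of P3's "stakes" in miniature. Combined with
-- a T1 analog (the writer eventually disappears), the record outlives
-- its author.
theorem record_persists (w : ToyWorld) (r s : String)
    (h : r ∈ w.records) : r ∈ (step w s).records :=
  List.mem_cons_of_mem s h
```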
Declaration: def governanceNecessityExplanation
def governanceNecessityExplanation := "See P3a + P3b + P3c above"
P3d (theorem): Exhaustiveness of compatibility classification. Every knowledge integration is classified into one of three compatibility classes. (Structurally guaranteed by Lean's inductive type)
Declaration: theorem compatibility_exhaustive
theorem compatibility_exhaustive :
∀ (c : CompatibilityClass),
c = .conservativeExtension ∨
c = .compatibleChange ∨
c = .breakingChange := by
intro c
cases c <;> simp
6.1.7. P4 Observable Degradation
Derived from T5. Improvement is impossible without feedback (T5), and what cannot be observed cannot be incorporated into feedback loops.
"What cannot be observed cannot be optimized."
Constraints manifest as gradients, not binary walls.
(Derivation Card)
Derives from: no_improvement_without_feedback (T5)
Proposition: P4
Content: Improvement requires observability — if structure is improved, then feedback must have existed, so the target must have been observable.
Proof strategy: Direct application of no_improvement_without_feedback (T5)
Declaration: theorem improvement_requires_observability
theorem improvement_requires_observability :
∀ (w w' : World),
structureImproved w w' →
∃ (f : Feedback), f ∈ w'.feedbacks ∧
f.timestamp ≥ w.time ∧ f.timestamp ≤ w'.time :=
no_improvement_without_feedback
(Derivation Card)
Derives from: no_process_improvement_without_feedback (T5, #316)
Proposition: P4 (process-level)
Content: Process improvement requires process-targeted observability.
If a process has improved, then feedback targeting that specific
process must have existed. This is the Level 1 (process) analog
of improvement_requires_observability (Level 0, structure).
Proof strategy: Direct application of no_process_improvement_without_feedback
Declaration: theorem process_improvement_requires_observability
theorem process_improvement_requires_observability :
∀ (pid : ProcessId) (w w' : World),
processImproved pid w w' →
∃ (f : Feedback), f ∈ w'.feedbacks ∧
f.target = .process pid ∧
f.timestamp ≥ w.time ∧ f.timestamp ≤ w'.time :=
no_process_improvement_without_feedback
P4b (theorem): Degradation is a gradient, not a wall. The degradation level can take any natural number value (not binary). To be concretized as Observable in Phase 4.
Declaration: theorem degradation_is_gradient
theorem degradation_is_gradient : ∀ (n : Nat), ∃ (w : World), degradationLevel w = n := degradation_level_surjective
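The gradient-versus-wall distinction can be sketched with a hypothetical single-field world (not the project's World type; values are illustrative):

```lean
structure World' where
  health : Nat

-- A "wall" is binary: either crossed or not.
def wallViolated (w : World') : Prop := w.health = 0

-- A "gradient" admits every intermediate level (cf. degradation_is_gradient),
-- so feedback loops can react before collapse.
def degradation (w : World') : Nat := 100 - min w.health 100

example : degradation ⟨70⟩ = 30 := rfl
example : degradation ⟨100⟩ = 0 := rfl
```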
6.1.8. P5 Probabilistic Interpretation of Structure
Derived from T4. Structure is something agents interpret anew each time, not something they deterministically "follow." Even reading the same structure, different instances may take different actions.
Robust design does not assume perfect compliance with structure, but rather maintains resilience against interpretation variance.
(Derivation Card)
Derives from: interpretation_nondeterminism (T4)
Proposition: P5
Content: Structure interpretation is nondeterministic — the same structure can yield different actions from the same agent, so robust design must not assume perfect compliance.
Proof strategy: Direct application of interpretation_nondeterminism (T4)
Declaration: theorem structure_interpretation_nondeterministic
theorem structure_interpretation_nondeterministic :
∃ (agent : Agent) (st : Structure) (action₁ action₂ : Action) (w : World),
interpretsStructure agent st action₁ w ∧
interpretsStructure agent st action₂ w ∧
action₁ ≠ action₂ :=
interpretation_nondeterminism
P5 lemma: Robust design is resilient to interpretation variance. A structure st is "robust" iff for any interpretation difference, the target world satisfies the safety constraint.
Declaration: def robustStructure
def robustStructure (st : Structure) (safety : World → Prop) : Prop :=
∀ (agent : Agent) (action : Action) (w w' : World),
interpretsStructure agent st action w →
canTransition agent action w w' →
safety w'
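A toy instance of this robustness notion, with illustrative stand-in types: interpretations are modeled as a list of candidate actions (T4 in miniature), and robustness demands safety after every one of them.

```lean
inductive ToyAction
  | read | append

-- Nondeterministic readings of one structure.
def interpretations : List ToyAction := [.read, .append]

def applyAction (files : Nat) : ToyAction → Nat
  | .read   => files
  | .append => files + 1

-- Safety: never drop below the initial file count. Robustness
-- quantifies over every possible interpretation, not one reading.
-- (abbrev keeps the proposition reducible, so `decide` can check it.)
abbrev robustToy (files : Nat) : Prop :=
  ∀ a ∈ interpretations, applyAction files a ≥ files

example : robustToy 3 := by decide
```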
6.1.9. P6 Task Design as Constraint Satisfaction
Derived from the combination of T3, T7, and T8. Within finite cognitive space (T3) and finite time/energy (T7), the required precision level (T8) must be achieved.
Task design is the process of solving this constraint satisfaction problem.
Task execution strategy. A "solution" to the constraint satisfaction problem.
Declaration: structure TaskStrategy
structure TaskStrategy where
  task : Task
  contextUsage : Nat       -- T3: context usage
  resourceUsage : Nat      -- T7: resource usage
  achievedPrecision : Nat  -- T8: achieved precision (per mille)
  deriving Repr
Predicate for whether a strategy satisfies the constraints. All three dimensions must be satisfied simultaneously.
Declaration: def strategyFeasible
def strategyFeasible (s : TaskStrategy) (agent : Agent) : Prop :=
  -- T3: within context capacity
  s.contextUsage ≤ agent.contextWindow.capacity ∧
  -- T7: within resource budget
  s.resourceUsage ≤ s.task.resourceBudget ∧
  -- T8: achieves the required precision
  s.achievedPrecision ≥ s.task.precisionRequired.required
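Under this definition, feasibility of a concrete strategy is just three decidable comparisons. A hypothetical numeric instance (all values invented for illustration; plain Nat arguments stand in for the project's structures):

```lean
-- Stand-alone restatement of the three constraints.
-- (abbrev keeps the proposition reducible, so `decide` can check it.)
abbrev feasible (ctxUse ctxCap resUse resBudget prec precReq : Nat) : Prop :=
  ctxUse ≤ ctxCap ∧ resUse ≤ resBudget ∧ prec ≥ precReq

-- ctxUse 120 of 200 (T3), resUse 30 of 50 (T7), precision 950 ≥ 900 (T8)
example : feasible 120 200 30 50 950 900 := by decide

-- Violating any one dimension breaks feasibility: precision 850 < 900.
example : ¬ feasible 120 200 30 50 850 900 := by decide
```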
(Derivation Card)
Derives from: context_window_finite (T3), resource_budget_finite (T7), precision_requirement (T8)
Proposition: P6
Content: Task execution is a constraint satisfaction problem — a strategy must simultaneously satisfy T3 (finite context), T7 (finite resources), and T8 (precision requirement).
Proof strategy: intro + constructor chain with Nat.le_trans on strategyFeasible components
Declaration: theorem task_is_constraint_satisfaction
theorem task_is_constraint_satisfaction :
∀ (task : Task) (agent : Agent),
-- T3: context is finite
agent.contextWindow.capacity > 0 →
-- T7: resources are finite (the task's budget is at most globalResourceBound)
task.resourceBudget ≤ globalResourceBound →
-- T8: the precision requirement is positive
task.precisionRequired.required > 0 →
-- Conclusion: this is a constraint satisfaction problem
-- (it does not guarantee a solution exists, but makes the constraint structure explicit)
∀ (s : TaskStrategy),
s.task = task →
strategyFeasible s agent →
s.contextUsage ≤ agent.contextWindow.capacity ∧
s.resourceUsage ≤ globalResourceBound ∧
s.achievedPrecision > 0 := by
intro task agent h_ctx h_res h_prec s h_task h_feas
constructor
· exact Nat.le_trans h_feas.1 (Nat.le_refl _)
constructor
· exact Nat.le_trans h_feas.2.1 (h_task ▸ h_res)
· exact Nat.lt_of_lt_of_le h_prec (h_task ▸ h_feas.2.2)
P6b (theorem): Task design is itself probabilistic output. P6 is subject to T4, and therefore requires verification via P2 (role separation).
Declaration: theorem task_design_is_probabilistic
theorem task_design_is_probabilistic :
∃ (agent : Agent) (action : Action) (w w₁ w₂ : World),
canTransition agent action w w₁ ∧
canTransition agent action w w₂ ∧
w₁ ≠ w₂ :=
output_nondeterministic
6.1.10. Appendix - Proof that E1b Is Derivable from E1a
no_self_verification is a corollary of verification_requires_independence.
If we assume the same agent satisfies both generates and verifies,
this contradicts E1a's conclusion gen.id ≠ ver.id
(since gen = ver, we have gen.id = ver.id).
E1b is a corollary of E1a. (The proof formerly carried a sorry pending DecidableEq for the opaque AgentId; it was resolved in Phase 4.)
Declaration: theorem e1b_from_e1a
theorem e1b_from_e1a :
∀ (agent : Agent) (action : Action) (w : World),
generates agent action w →
¬verifies agent action w := by
intro agent action w h_gen h_ver
have h := verification_requires_independence agent agent action w h_gen h_ver
exact absurd rfl h.1
6.1.11. Sorry Inventory Phase 4 Update
All sorrys were resolved in Phase 4. Principles.lean is sorry-free.
6.1.12. Sorrys Resolved from Phase 3 → Phase 4
All three remaining theorems were resolved by axiom application, each using an axiom from Observable.lean.
6.1.13. Complete theorem proof method listing
| theorem | Proof method |
|---|---|
| | Direct application of E2 |
| | Direct application of Observable axiom |
| | Direct application of E1a |
| | Direct application of E1b |
| modifier_agent_terminates | Direct application of T1 |
| modification_persists_after_termination | Direct application of T2 |
| ungoverned_breaking_change_irrecoverable | Composition of T1 ∧ T2 |
| compatibility_exhaustive | Exhaustiveness proof via cases |
| | Direct application of T5 |
| | Direct application of Observable axiom |
| | Direct application of Observable axiom |
| task_is_constraint_satisfaction | Unfolding of T3/T7/T8 constraint structure |
| | Direct application of T4 |
| e1b_from_e1a | Contradiction derivation via E1a + absurd |
7. Observable Variables: V1-V7
7.1. Epistemic Layer - Boundary Strength 2 - Foundation of V1-V7 Observable Variables
Variables are not boundary conditions. They are parameters that agents can improve through structure, serving as indicators of structural quality. If boundary conditions (L1–L6 in Ontology.lean) are "walls of the action space," then variables are "levers that structure can move within those walls."
However, variables are not independent levers but a mutually interacting system.
7.1.1. Layer Separation
This file contains only definitions belonging to the boundary layer (strength 2):
- Opaque definitions for V1–V7 and Measurable axioms (measurability guarantees)
- Measurability axioms for trust/degradation
- systemHealthy (basic definition of system health)
- Mapping structure from boundaries to variables to constraints
- Measurable to Observable bridge theorems
Definitions belonging to the designTheorem layer (strength 1) — tradeoffs, Goodhart, investment cycles, HealthThresholds, Pareto, etc. — are separated into ObservableDesign.lean.
7.1.2. Position as Gamma Extension of T_0
Per Procedure Manual §2.4, the axioms in this file belong to the extension part (Gamma T_0) of the premise set Gamma; they are non-logical axioms (§4.1) derived from design decisions (domain model premises, design judgments). They constitute a consistent extension (Glossary §5.5) of T_0 (Axioms.lean) and may be subject to contraction (§9.2) in the revision loop.
7.1.3. Design Policy
Boundary conditions (T) are immovable walls, mitigations (L) are design decisions, and variables (V) are measures of how well mitigations work.
7.1.4. Observable vs Measurable
- Observable (World → Prop is decidable) — binary judgment. Similar to preconditions/postconditions in Glossary §9.3
- Measurable (World → Nat is computable) — quantitative measurement. Glossary §9.5 note: distinct from the measure-theoretic concept of measurable functions
V1–V7 are quantitative indicators and are therefore formalized as Measurable.
Measurable m means "a procedure exists to compute the value of m from external observations."
7.1.5. Prerequisite - Observability P4
By P4 (observability of degradation), variables become optimization targets only when they are observable.
For each variable, the following questions are posed:
- Is the current value observable? Does a measurement method exist, and is measurement actually being performed?
- Is degradation detectable? If the value worsens, can it be detected before quality collapse?
- Is improvement verifiable? Can the change in value be compared before and after intervention?
A variable without means of observation is merely a nominal optimization target.
7.1.6. V1-V7 Correspondence Table
| Definition | V | Description | Measurement Method | Related Boundary Conditions |
|---|---|---|---|---|
| skillQuality | V1 | Precision and effectiveness of skill definitions | benchmark.json | L2, L5 |
| contextEfficiency | V2 | Utilization of finite context | completion rate / token count | L2, L3 |
| outputQuality | V3 | Quality of code, design, and documentation | gate pass rate, review finding count | L1, L4 |
| gatePassRate | V4 | First-pass gate clearance rate | pass/fail statistics | L6, L4 |
| proposalAccuracy | V5 | Hit rate of design proposals | approval/rejection rate | L4, L6 |
| knowledgeStructureQuality | V6 | Degree of structuring of persistent knowledge | context restoration speed, retirement detection rate | L2 |
| taskDesignEfficiency | V7 | Efficiency of task design | completion rate / resource ratio | L3, L6 |
Observable: a decision procedure exists for a given property.
Expresses that P : World → Prop is binary-decidable.
Declaration: def Observable
def Observable (P : World → Prop) : Prop := ∃ f : World → Bool, ∀ w, f w = true ↔ P w
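As a sketch of how a predicate is packaged as an Observable witness, using a hypothetical one-field world (not the project's World type):

```lean
structure MiniWorld where
  errors : Nat

-- Same shape as the Observable definition above, over MiniWorld.
def ObservableMini (P : MiniWorld → Prop) : Prop :=
  ∃ f : MiniWorld → Bool, ∀ w, f w = true ↔ P w

-- "No errors" is binary-decidable: the Boolean check `errors == 0`
-- agrees with the proposition `errors = 0` on every world.
theorem errorFree_observable : ObservableMini (fun w => w.errors = 0) :=
  ⟨fun w => w.errors == 0, fun w => by simp⟩
```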
Measurable: a computation procedure exists for a quantitative indicator.
Expresses that the value of m : World → Nat can be computed from external observations.
Formally, "there exists a computable function f that agrees with m."
Declaring this for an opaque m was intended as a promise to the system
that "a measurement procedure exists in principle."
Why this turned out to be trivial
The original rationale held that, since an opaque m cannot be unfolded, the witness
f = m would not pass type-checking, making the Measurable axiom a non-trivial promise.
In fact the trivial witness ⟨m, fun _ => rfl⟩ type-checks even for opaque m (rfl needs
no unfolding), so Measurable holds for every World → Nat function. The V1–V7
measurability axioms were accordingly demoted to theorems (2026-04-01; see the
derivation cards in §7.1.7), and the substantive promise of measurability now rests
with the external operational layer.
Declaration: def Measurable
def Measurable (m : World → Nat) : Prop := ∃ f : World → Nat, ∀ w, f w = m w
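The triviality recorded in the Demoted notes of the derivation cards in §7.1.7 can be reproduced in miniature (hypothetical names; Nat → Nat stands in for World → Nat):

```lean
-- An opaque Nat function: its body is hidden from the type checker.
opaque mystery : Nat → Nat

def MeasurableNat (m : Nat → Nat) : Prop :=
  ∃ f : Nat → Nat, ∀ n, f n = m n

-- The witness f := mystery needs no unfolding: `rfl` proves
-- mystery n = mystery n even though mystery is opaque. This is
-- exactly why the Measurable axioms could be demoted to theorems.
theorem mystery_measurable : MeasurableNat mystery :=
  ⟨mystery, fun _ => rfl⟩
```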
Proxy maturity levels. Assigns a classification to each V proxy in observe.sh.
- provisional: Tentative proxy indicator. Formal measurement method not yet implemented.
- established: Stable proxy indicator. Operational sufficiency confirmed (T6 judgment).
- formal: Formal measurement method implemented.
Declaration: inductive ProxyMaturityLevel
inductive ProxyMaturityLevel where
  | provisional : ProxyMaturityLevel
  | established : ProxyMaturityLevel
  | formal : ProxyMaturityLevel
  deriving BEq, Repr, DecidableEq
Current proxy maturity for V1. provisional → formal (2026-03-27, #77):
- GQM chain defined (R1 #85): Q1 structural contribution, Q2 verification quality, Q3 operational stability
- Formal schema implemented in benchmark.json (G1 #78)
- Automated measurement in observe.sh (G2 #79)
- Retrospective validation over 63 runs confirmed all metrics satisfy hypotheses
- Goodhart 5-layer defense: governance metrics (R2), correlation monitoring (R3), non-triviality gate (R5), saturation detection (R6), bias review obligation (G1b-2)
- Legacy proxy (success_rate) confirmed to be uncorrelated with new benchmark (r=0.006-0.069) (G3 #80)
Declaration: def v1ProxyMaturity
def v1ProxyMaturity : ProxyMaturityLevel := .formal
Current proxy maturity for V3. provisional → formal (2026-03-27, #77):
- GQM chain defined (R1 #85): Q1 acceptance criteria, Q2 structural integrity, Q3 error trend
- Formal schema implemented in benchmark.json (G1 #78)
- Automated measurement in observe.sh (G2 #79)
- Legacy proxy (test_pass_rate) confirmed invalid as quality signal due to zero variance (G3 #80)
- hallucination proxy (Run 54+) functions as a new indicator for error trend
Declaration: def v3ProxyMaturity
def v3ProxyMaturity : ProxyMaturityLevel := .formal
V1: Skill quality. Precision and effectiveness of skill definitions.
Measurement method: benchmark.json (with/without comparison).
Related boundary conditions: L2 (mitigating training data discontinuity), L5 (skill system).
observe.sh proxy: evolve_success_rate (successful run ratio), lean_health (sorry=0 check),
skill_count (number of skill files).
Proxy maturity classification:
- provisional_proxy: Tentative proxy indicator. Formal measurement method not yet implemented.
- established_proxy: Stable proxy indicator. Judged operationally sufficient.
- formal_measurement: Formal measurement method implemented.
V1 proxy promoted to formal_measurement (2026-03-27, #77). Measured via benchmark.json GQM schema.
Declaration: opaque skillQuality
opaque skillQuality : World → Nat
V2: Context efficiency. Utilization of finite context.
Measurement method: task completion rate / consumed token count.
Related boundary conditions: L2 (finite context), L3 (token budget).
observe.sh proxy: recent_avg (median of last 10 session deltas, primary),
cumulative_avg (all-history average excluding micro-sessions, baseline).
primary_metric: recent_median (median-based, robust to outliers).
Operational note: a divergence of ±20% or more between recent_avg and cumulative_avg
is interpreted as a trend change.
Divergence interpretation: V2 is a hub variable with tradeoff relationships to 5 other
variables (theorem tradeoff_context_is_hub). Since evolve sessions (heavy tool usage)
push recent_avg upward, divergence_percent > 100% is not necessarily problematic.
The tendency for recent_avg to rise as evolve depth and frequency increase is expected.
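The divergence check described here can be stated as a small computation. This is a sketch only; the actual observe.sh logic is a shell script and may differ:

```lean
-- Percent divergence of recent_avg from cumulative_avg
-- (Nat arithmetic, rounding down; returns 0 when the baseline is 0).
def divergencePercent (recentAvg cumulativeAvg : Nat) : Nat :=
  if cumulativeAvg = 0 then 0
  else (100 * (max recentAvg cumulativeAvg - min recentAvg cumulativeAvg))
         / cumulativeAvg

-- 120 vs 100 diverges by 20%: right at the ±20% trend-change threshold.
example : divergencePercent 120 100 = 20 := rfl
```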
Declaration: opaque contextEfficiency
opaque contextEfficiency : World → Nat
V3: Output quality. Quality of code, design, and documentation.
Measurement method: gate pass rate, review finding count.
Related boundary conditions: L1 (safety standards), L4 (basis for action space adjustment).
observe.sh proxy: test_pass_rate (all-tests pass rate) +
hallucination_proxy (rejected[].failure_type aggregation).
Legacy fix_ratio_percent proxy (commit prefix ratio) was removed in Run 69.
Proxy maturity classification:
- provisional_proxy: Tentative proxy indicator. Formal measurement method not yet implemented.
- established_proxy: Stable proxy indicator. Judged operationally sufficient.
- formal_measurement: Formal measurement method implemented.
V3 proxy promoted to formal_measurement (2026-03-27, #77). Measured via benchmark.json GQM schema.
Declaration: opaque outputQuality
opaque outputQuality : World → Nat
V4: Gate pass rate. Rate of clearing each phase's gate on the first attempt.
P2 (cognitive separation of concerns) guarantees gate reliability.
Measurement method: pass/fail statistics.
Related boundary conditions: L6 (gate definition granularity), L4 (auto-merge judgment).
observe.sh proxy: Bash passed / (passed + blocked).
"tool":"Bash" event count / (Bash + gate_blocked event count) from tool-usage.jsonl.
Declaration: opaque gatePassRate
opaque gatePassRate : World → Nat
V5: Proposal accuracy. Hit rate of design and scope proposals. Measurement method: human approval/rejection rate. Related boundary conditions: L4 (basis for action space adjustment), L6 (design convention improvement). observe.sh proxy: approved / total entry ratio from v5-approvals.jsonl.
Declaration: opaque proposalAccuracy
opaque proposalAccuracy : World → Nat
V6: Knowledge structure quality. Degree of structuring of persistent knowledge.
P3 (governed learning) defines the knowledge lifecycle
(observation -> hypothesis formation -> verification -> integration -> retirement).
Knowledge that is not retired accumulates and degrades V2.
Measurement method: context restoration speed in the next session, retirement target detection rate.
Related boundary conditions: L2 (mitigating memory loss).
observe.sh proxy: memory_entries (MEMORY.md entry count), memory_files (memory file count),
last_update_days_ago (days since last update), retired_count (retired entry count).
Declaration: opaque knowledgeStructureQuality
opaque knowledgeStructureQuality : World → Nat
V7: Task design efficiency. Quality of P6 (task design as constraint satisfaction).
Two data sources:
(1) External knowledge: public benchmarks, model performance characteristics
(2) Internal knowledge: execution logs, resource consumption records, outcome-to-cost ratio
Measurement method: task completion rate / consumed resource ratio, redesign frequency.
Related boundary conditions: L3 (resource limits), L6 (design conventions).
observe.sh proxy: completed (task completion count from v7-tasks.jsonl), unique_subjects (unique subject count),
teamwork_percent (ratio of entries with teammate field).
Operational note: teamwork_percent is suppressed in single-agent operation (teamwork_status="suppressed_single_agent").
Since this field requires multi-agent/human collaboration, it is excluded from observation reports in single-agent environments.
Declaration: opaque taskDesignEfficiency
opaque taskDesignEfficiency : World → Nat
7.1.7. Measurability Declarations for V1-V7 - Gamma Extension of T_0 Design-Derived
Each variable was originally declared Measurable via a non-logical axiom (Glossary §4.1):
a design-level promise that "measurement is possible in principle," with
concrete measurement implementations delegated to the operational layer.
Membership in Gamma T_0 (Procedure Manual §2.4): the justification for these declarations
originates from the designer's design judgments (not from external authority), and therefore
they belong to the extension part.
Why they are now theorems: although V1–V7 are opaque (opaque definitions, Glossary §9.4),
Measurable is trivially provable with the witness ⟨m, fun _ => rfl⟩, so each axiom was
demoted to a theorem on 2026-04-01 (see the Demoted notes in the derivation cards below).
Measurability in the substantive sense remains guaranteed by external operational systems.
(Derivation Card)
Layer: Gamma T_0 (design-derived)
Proposition: V1
Content: V1 (skill quality) is measurable
Basis: with/without comparison via benchmark.json exists as a measurement procedure
Source: Ontology.lean V1 definition
Demoted: 2026-04-01 — Measurable is trivially satisfied for any World → Nat function.
Proof: ⟨m, fun _ => rfl⟩ (witness is the function itself).
Refutation condition: if it is shown that a measurement procedure for skill quality is in principle unconstructible
Declaration: theorem v1_measurable
theorem v1_measurable : Measurable skillQuality := ⟨skillQuality, fun _ => rfl⟩
(Derivation Card)
Layer: Gamma T_0 (design-derived)
Proposition: V2
Content: V2 (context efficiency) is measurable
Basis: the ratio of task completion rate to consumed token count exists as a measurement procedure
Source: Ontology.lean V2 definition
Demoted: 2026-04-01 — Measurable is trivially satisfied for any World → Nat function.
Proof: ⟨m, fun _ => rfl⟩ (witness is the function itself).
Refutation condition: if it is shown that a measurement procedure for context efficiency is in principle unconstructible
Declaration: theorem v2_measurable
theorem v2_measurable : Measurable contextEfficiency := ⟨contextEfficiency, fun _ => rfl⟩
(Derivation Card)
Layer: Gamma T_0 (design-derived)
Proposition: V3
Content: V3 (output quality) is measurable
Basis: gate pass rate and review finding count exist as measurement procedures
Source: Ontology.lean V3 definition
Demoted: 2026-04-01 — Measurable is trivially satisfied for any World → Nat function.
Proof: ⟨m, fun _ => rfl⟩ (witness is the function itself).
Refutation condition: if it is shown that a measurement procedure for output quality is in principle unconstructible
Declaration: theorem v3_measurable
theorem v3_measurable : Measurable outputQuality := ⟨outputQuality, fun _ => rfl⟩
(Derivation Card)
Layer: Gamma T_0 (design-derived)
Proposition: V4
Content: V4 (gate pass rate) is measurable
Basis: pass/fail statistics exist as a measurement procedure
Source: Ontology.lean V4 definition
Demoted: 2026-04-01 — Measurable is trivially satisfied for any World → Nat function.
Proof: ⟨m, fun _ => rfl⟩ (witness is the function itself).
Refutation condition: if it is shown that a measurement procedure for gate pass rate is in principle unconstructible
Declaration: theorem v4_measurable
theorem v4_measurable : Measurable gatePassRate := ⟨gatePassRate, fun _ => rfl⟩
(Derivation Card)
Layer: Gamma T_0 (design-derived)
Proposition: V5
Content: V5 (proposal accuracy) is measurable
Basis: human approval/rejection rate exists as a measurement procedure
Source: Ontology.lean V5 definition
Demoted: 2026-04-01 — Measurable is trivially satisfied for any World → Nat function.
Proof: ⟨m, fun _ => rfl⟩ (witness is the function itself).
Refutation condition: if it is shown that a measurement procedure for proposal accuracy is in principle unconstructible
Declaration: theorem v5_measurable
theorem v5_measurable : Measurable proposalAccuracy := ⟨proposalAccuracy, fun _ => rfl⟩
(Derivation Card)
Layer: Gamma T_0 (design-derived)
Proposition: V6
Content: V6 (knowledge structure quality) is measurable
Basis: context restoration speed and retirement target detection rate exist as measurement procedures
Source: Ontology.lean V6 definition
Demoted: 2026-04-01 — Measurable is trivially satisfied for any World → Nat function.
Proof: ⟨m, fun _ => rfl⟩ (witness is the function itself).
Refutation condition: if it is shown that a measurement procedure for knowledge structure quality is in principle unconstructible
Declaration: theorem v6_measurable
theorem v6_measurable : Measurable knowledgeStructureQuality := ⟨knowledgeStructureQuality, fun _ => rfl⟩
(Derivation Card)
Layer: Gamma T_0 (design-derived)
Proposition: V7
Content: V7 (task design efficiency) is measurable
Basis: task completion rate / consumed resource ratio exists as a measurement procedure
Source: Ontology.lean V7 definition
Demoted: 2026-04-01 — Measurable is trivially satisfied for any World → Nat function.
Proof: ⟨m, fun _ => rfl⟩ (witness is the function itself).
Refutation condition: if it is shown that a measurement procedure for task design efficiency is in principle unconstructible
Declaration: theorem v7_measurable
theorem v7_measurable : Measurable taskDesignEfficiency := ⟨taskDesignEfficiency, fun _ => rfl⟩
7.1.8. System Health
Rather than maximizing individual variables, maintain the health of the system as a whole. Even when metrics for one variable improve, verify that other variables have not deteriorated.
Health is formulated as "all variables are at or above a threshold." Threshold settings are operational judgments (T6: humans are the final decision-makers for resources).
System health. The state in which all V1–V7 meet the minimum threshold.
Note: thresholds should ideally differ per variable rather than being uniform, but Phase 4 uses a uniform threshold for simplicity. Phase 5 extends this to per-variable thresholds (HealthThresholds / systemHealthyPerVar in ObservableDesign.lean).
Declaration: def systemHealthy
def systemHealthy (threshold : Nat) (w : World) : Prop :=
  skillQuality w ≥ threshold ∧
  contextEfficiency w ≥ threshold ∧
  outputQuality w ≥ threshold ∧
  gatePassRate w ≥ threshold ∧
  proposalAccuracy w ≥ threshold ∧
  knowledgeStructureQuality w ≥ threshold ∧
  taskDesignEfficiency w ≥ threshold
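A toy version of the same shape, with seven hypothetical readings for V1–V7 and the uniform Phase 4 threshold (values invented for illustration):

```lean
-- All variables at or above the threshold, as a single decidable check
-- over a list of readings.
-- (abbrev keeps the proposition reducible, so `decide` can check it.)
abbrev healthyAll (threshold : Nat) (readings : List Nat) : Prop :=
  ∀ v ∈ readings, v ≥ threshold

-- Seven readings, threshold 50: healthy.
example : healthyAll 50 [60, 72, 55, 90, 51, 64, 58] := by decide

-- One degraded variable (49 < 50) breaks health of the system as a whole,
-- even though the other six are unchanged.
example : ¬ healthyAll 50 [60, 72, 49, 90, 51, 64, 58] := by decide
```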
(Derivation Card)
Layer: Gamma T_0 (design-derived)
Content: trustLevel is measurable.
Indirectly observed from investment behavior (fluctuations in resource allocation)
Basis: trust is concretized as investment behavior (resource allocation fluctuations)
Source: manifesto.md Section 6
Demoted: 2026-04-01 — Measurable is trivially satisfied for any World → Nat function.
Proof: ⟨m, fun _ => rfl⟩ (witness is the function itself).
Refutation condition: if it is shown that a measurement procedure for trust level is in principle unconstructible
Declaration: theorem trust_measurable
theorem trust_measurable : ∀ (agent : Agent), Measurable (trustLevel agent) := fun agent => ⟨trustLevel agent, fun _ => rfl⟩
(Derivation Card)
Layer: Gamma T_0 (design-derived)
Content: degradationLevel is measurable. Computed from temporal changes in V1–V7
Basis: if V1–V7 are Measurable, their rate of change is also computable
Source: design of P4 (observability of degradation)
Demoted: 2026-04-01 — Measurable is trivially satisfied for any World → Nat function.
Proof: ⟨m, fun _ => rfl⟩ (witness is the function itself).
Refutation condition: if it is shown that a measurement procedure for degradation level is in principle unconstructible
Declaration: theorem degradation_measurable
theorem degradation_measurable : Measurable degradationLevel := ⟨degradationLevel, fun _ => rfl⟩
7.1.9. Three-Tier Structure: Boundary -> Mitigation -> Variable
Many variables represent the quality of structures designed as mitigations for boundary conditions:
Boundary (invariant) -> Mitigation (structure) -> Variable (quality)
L2: Memory loss -> Implementation Notes -> V6: Knowledge structure quality
L2: Finite context -> 50% rule, lightweight design -> V2: Context efficiency
L2: Non-determinism -> Gate verification -> V4: Gate pass rate
L2: Training data discontinuity -> docs/ SSOT, skills -> V1: Skill quality
Boundary conditions do not move. Mitigations are design decisions (L6). Variables measure how well mitigations work. This three-tier structure clarifies "what is fixed, what is a design choice, and what is an optimization target."
Correspondence between boundary conditions and variables. Expresses the "boundary -> variable" mapping of the three-tier structure as a type. Mitigations are design decisions (L6) positioned between them.
Declaration: inductive VariableId
inductive VariableId where | v1 | v2 | v3 | v4 | v5 | v6 | v7 deriving BEq, Repr
Boundary condition corresponding to each variable. Expresses the "boundary -> variable" mapping of the three-tier structure as a function. Mitigations are design decisions (L6) positioned between them.
Declaration: def variableBoundary
def variableBoundary : VariableId → BoundaryId
  | .v1 => .ontological   -- L2: training data discontinuity → V1: skill quality
  | .v2 => .ontological   -- L2: finite context → V2: context efficiency
  | .v3 => .ethicsSafety  -- L1: safety standards → V3: output quality
  | .v4 => .ontological   -- L2: non-determinism → V4: gate pass rate
  | .v5 => .actionSpace   -- L4: basis for action space adjustment → V5: proposal accuracy
  | .v6 => .ontological   -- L2: memory loss → V6: knowledge structure quality
  | .v7 => .resource      -- L3: resource limits → V7: task design efficiency
(Derivation Card)
Derives from: variableBoundary, boundaryLayer (Observable.lean / Ontology.lean)
Proposition: L2
Content: L2 (ontological boundary) is fixed — variables V1, V2, V4, V6 mapped to L2 cannot move the boundary itself; only the quality of mitigations can be improved.
Proof strategy: simp (variableBoundary, boundaryLayer) (definitional computation)
Declaration: theorem fixed_boundary_variables_mitigate_only
theorem fixed_boundary_variables_mitigate_only :
    boundaryLayer (variableBoundary .v1) = .fixed ∧
    boundaryLayer (variableBoundary .v2) = .fixed ∧
    boundaryLayer (variableBoundary .v4) = .fixed ∧
    boundaryLayer (variableBoundary .v6) = .fixed := by
  simp [variableBoundary, boundaryLayer]
Boundary conditions corresponding to each constraint (T1-T8). Expresses the "constraint -> boundary condition" mapping of the three-tier structure as a function. T->L mapping: which boundary condition category each constraint belongs to.
Mapping justification:
- T1 -> L2: Session ephemerality is an ontological fact (agent is bound to session)
- T2 -> L2: Structural persistence is an ontological fact (structure outlives agent)
- T3 -> L2, L3: Finite context is both an ontological and a resource constraint
- T4 -> L2: Output stochasticity is an ontological property of LLMs
- T5 -> L2: Feedback requirement is an ontological prerequisite for improvement
- T6 -> L1, L4: Human authority spans the safety boundary (L1) and action space boundary (L4)
- T7 -> L3: Resource finiteness directly corresponds to the resource boundary
- T8 -> L6: Precision level is defined as a task design convention (architecturalConvention)
Note: L5 (platform) is intentionally excluded. L5 represents provider-specific environmental constraints (Claude Code, Codex CLI, etc.), while T1-T8 are technology-independent constraints. L5 is not derived from T but arises from the human judgment of platform selection (upstream of T6). In variableBoundary as well, V1-V7 are not mapped to L5.
Declaration: def constraintBoundary
def constraintBoundary : ConstraintId → List BoundaryId
  | .t1 => [.ontological]
  | .t2 => [.ontological]
  | .t3 => [.ontological, .resource]
  | .t4 => [.ontological]
  | .t5 => [.ontological]
  | .t6 => [.ethicsSafety, .actionSpace]
  | .t7 => [.resource]
  | .t8 => [.architecturalConvention]
Every constraint corresponds to at least one boundary condition: the T -> L mapping is total.
Declaration: theorem constraint_has_boundary
theorem constraint_has_boundary :
    ∀ c : ConstraintId, (constraintBoundary c).length > 0 := by
  intro c
  cases c <;> simp [constraintBoundary]
(Derivation Card)
Derives from: constraintBoundary (Observable.lean)
Proposition: L5
Content: L5 (platform boundary) is not derived from any technology-independent constraint T1-T8. L5 is a provider-specific environmental constraint arising from human platform selection (upstream of T6).
Proof strategy: cases c <;> simp (constraintBoundary) (exhaustive case analysis)
Declaration: theorem platform_not_in_constraint_boundary
theorem platform_not_in_constraint_boundary :
    ∀ c : ConstraintId, BoundaryId.platform ∉ constraintBoundary c := by
  intro c
  cases c <;> simp [constraintBoundary]
Every boundary condition except L5 is included in the constraintBoundary of at least one constraint. constraintBoundary covers L1-L6 except L5.
Declaration: theorem constraint_boundary_covers_except_platform
theorem constraint_boundary_covers_except_platform :
(∃ c, BoundaryId.ethicsSafety ∈ constraintBoundary c) ∧
(∃ c, BoundaryId.ontological ∈ constraintBoundary c) ∧
(∃ c, BoundaryId.resource ∈ constraintBoundary c) ∧
(∃ c, BoundaryId.actionSpace ∈ constraintBoundary c) ∧
(∃ c, BoundaryId.architecturalConvention ∈ constraintBoundary c) := by
refine ⟨⟨.t6, ?_⟩, ⟨.t1, ?_⟩, ⟨.t3, ?_⟩, ⟨.t6, ?_⟩, ⟨.t8, ?_⟩⟩ <;>
simp [constraintBoundary]
(Derivation Card)
Derives from: constraintBoundary (Observable.lean)
Proposition: L1
Content: L1 (ethics/safety boundary) is derived from constraint T6 (human authority). T6 maps to ethicsSafety in constraintBoundary, establishing that the safety boundary has a technology-independent grounding.
Proof strategy: exact with witness .t6, then simp (constraintBoundary)
Declaration: theorem ethicsSafety_covered_by_constraint
theorem ethicsSafety_covered_by_constraint : ∃ c, BoundaryId.ethicsSafety ∈ constraintBoundary c := ⟨.t6, by simp [constraintBoundary]⟩
(Derivation Card)
Derives from: constraintBoundary (Observable.lean)
Proposition: L3
Content: L3 (resource boundary) is derived from constraints T3 (finite context) and T7 (finite resources). T3 maps to (ontological, resource) and T7 maps to (resource) in constraintBoundary, establishing that the resource boundary has technology-independent grounding.
Proof strategy: exact with witness .t3, then simp (constraintBoundary)
Declaration: theorem resource_covered_by_constraint
theorem resource_covered_by_constraint : ∃ c, BoundaryId.resource ∈ constraintBoundary c := ⟨.t3, by simp [constraintBoundary]⟩
(Derivation Card)
Derives from: constraintBoundary (Observable.lean)
Proposition: L4
Content: L4 (action space boundary) is derived from constraint T6 (human authority). T6 maps to (ethicsSafety, actionSpace) in constraintBoundary, establishing that the action space boundary is grounded in human decision authority.
Proof strategy: exact with witness .t6, then simp (constraintBoundary)
Declaration: theorem actionSpace_covered_by_constraint
theorem actionSpace_covered_by_constraint : ∃ c, BoundaryId.actionSpace ∈ constraintBoundary c := ⟨.t6, by simp [constraintBoundary]⟩
(Derivation Card)
Derives from: constraintBoundary (Observable.lean)
Proposition: L6
Content: L6 (architectural convention boundary) is derived from constraint T8 (task has precision). T8 maps to (architecturalConvention) in constraintBoundary, establishing that design conventions are grounded in the precision requirement.
Proof strategy: exact with witness .t8, then simp (constraintBoundary)
Declaration: theorem architecturalConvention_covered_by_constraint
theorem architecturalConvention_covered_by_constraint : ∃ c, BoundaryId.architecturalConvention ∈ constraintBoundary c := ⟨.t8, by simp [constraintBoundary]⟩
7.1.10. Measurable to Observable Bridge
A general theorem stating that threshold comparison of Measurable indicators is Observable. An aggregation lemma that collects the Measurable axioms of V1–V7.
Threshold comparison of a Measurable indicator is Observable (Measurable->Observable bridge). Constructs a decision procedure for m w >= t from Measurable m.
Declaration: theorem measurable_threshold_observable
theorem measurable_threshold_observable {m : World → Nat} (hm : Measurable m) (t : Nat) :
Observable (fun w => m w ≥ t) := by
obtain ⟨f, hf⟩ := hm
exact ⟨fun w => decide (f w ≥ t), fun w => by simp [hf w]⟩
All 7 variables are Measurable (aggregation lemma).
Declaration: theorem all_variables_measurable
theorem all_variables_measurable :
Measurable skillQuality ∧ Measurable contextEfficiency ∧
Measurable outputQuality ∧ Measurable gatePassRate ∧
Measurable proposalAccuracy ∧ Measurable knowledgeStructureQuality ∧
Measurable taskDesignEfficiency :=
⟨v1_measurable, v2_measurable, v3_measurable, v4_measurable,
v5_measurable, v6_measurable, v7_measurable⟩
Conjunction closure of Observable. The conjunction of two Observable properties is also Observable.
Declaration: theorem observable_and
theorem observable_and {P Q : World → Prop} (hp : Observable P) (hq : Observable Q) :
Observable (fun w => P w ∧ Q w) := by
obtain ⟨fp, hfp⟩ := hp
obtain ⟨fq, hfq⟩ := hq
refine ⟨fun w => fp w && fq w, fun w => ?_⟩
simp [Bool.and_eq_true]
exact ⟨fun ⟨a, b⟩ => ⟨(hfp w).mp a, (hfq w).mp b⟩,
fun ⟨a, b⟩ => ⟨(hfp w).mpr a, (hfq w).mpr b⟩⟩
Negation closure of Observable. The negation of an Observable property is also Observable.
Declaration: theorem observable_not
theorem observable_not {P : World → Prop} (hp : Observable P) :
Observable (fun w => ¬ P w) := by
obtain ⟨fp, hfp⟩ := hp
refine ⟨fun w => !fp w, fun w => ?_⟩
constructor
· intro h hnp
have := (hfp w).mpr hnp
simp [this] at h
· intro hnp
dsimp only []
cases hb : fp w with
| false => rfl
| true => exact absurd ((hfp w).mp hb) hnp
Disjunction closure of Observable. The disjunction of two Observable properties is also Observable.
Declaration: theorem observable_or
theorem observable_or {P Q : World → Prop} (hp : Observable P) (hq : Observable Q) :
Observable (fun w => P w ∨ Q w) := by
obtain ⟨fp, hfp⟩ := hp
obtain ⟨fq, hfq⟩ := hq
refine ⟨fun w => fp w || fq w, fun w => ?_⟩
simp [Bool.or_eq_true]
exact ⟨fun h => h.elim (fun hp_w => Or.inl ((hfp w).mp hp_w))
(fun hq_w => Or.inr ((hfq w).mp hq_w)),
fun h => h.elim (fun hfp_w => Or.inl ((hfp w).mpr hfp_w))
(fun hfq_w => Or.inr ((hfq w).mpr hfq_w))⟩
System health is Observable (binary-decidable).
Since each Vi is Measurable, threshold comparison is decidable.
Proved via measurable_threshold_observable + observable_and.
(Originally an axiom, demoted to theorem in Run 27)
Declaration: theorem system_health_observable
theorem system_health_observable :
∀ (threshold : Nat), Observable (systemHealthy threshold) := by
intro t
unfold systemHealthy
apply observable_and (measurable_threshold_observable v1_measurable t)
apply observable_and (measurable_threshold_observable v2_measurable t)
apply observable_and (measurable_threshold_observable v3_measurable t)
apply observable_and (measurable_threshold_observable v4_measurable t)
apply observable_and (measurable_threshold_observable v5_measurable t)
apply observable_and (measurable_threshold_observable v6_measurable t)
exact measurable_threshold_observable v7_measurable t
Degradation detection is Observable: "at least one variable is below threshold"
is a decidable property. This is the Observable formalization of D3 condition 2
(degradation detectable). Since systemHealthy is Observable (system_health_observable),
its negation is also Observable (observable_not).
Declaration: theorem degradation_detectable_observable
theorem degradation_detectable_observable :
∀ (threshold : Nat), Observable (fun w => ¬systemHealthy threshold w) := by
intro t
exact observable_not (system_health_observable t)
Below-threshold comparison is Observable. Dual of measurable_threshold_observable.
"Variable m is below threshold t" is decidable if m is Measurable.
Declaration: theorem measurable_below_threshold_observable
theorem measurable_below_threshold_observable {m : World → Nat} (hm : Measurable m) (t : Nat) :
Observable (fun w => m w < t) := by
obtain ⟨f, hf⟩ := hm
exact ⟨fun w => decide (f w < t), fun w => by simp [hf w]⟩
7.1.11. Part IV Maintaining the Classification Itself
This classification (L1–L6, V1–V7) is a hypothesis based on current understanding,
not a fixed truth. Its type-level representation is formalized as ReviewSignal in Evolution.lean.
7.1.12. Signals That Should Trigger Review
| Signal | Example | Response |
|---|---|---|
| Misclassification | An item placed in L1 is actually conditionally modifiable | Move to another category |
| Missing boundary condition | Regulatory/legal constraints restrict the action space but are absent from the classification | Add a new Layer |
| Vanished boundary condition | Technological advances have effectively overcome an L2 item | Delete or reclassify |
| Variable deficit/surplus | There are optimization targets not included in V1-V7 | Add, merge, or split variables |
| Ambiguous category boundary | Something could belong to either "fixed boundary" or "investment-variable boundary" | Refine the judgment criteria |
7.1.13. Caution - Avoiding Self-Rigidification of the Classification
The greatest risk is that the classification itself begins to function as a boundary condition -- inducing reasoning such as "it cannot be changed because it is written in L1."
Preventive measures:
-
Maintain the rationale for "why this category" for each item in every Layer
-
"Fixed" means "no means of changing it has been found at present"
-
Reclassification of boundary conditions is a legitimate act consistent with the spirit of the manifesto
7.1.14. Core Insights
-
The subject of optimization is structure, not the agent. The agent is an ephemeral catalyst (T1). Improvements accumulate within structure (T2).
-
Variables are not independent levers but a mutually interacting system. Improving V1 can degrade V2. Rather than maximizing individual variables, maintain the health of the system as a whole.
-
The purpose of the investment cycle is equilibrium, not expansion. Rather than maximizing the action space, search for the equilibrium point where collaborative value is maximized. The equilibrium point shifts with context.
-
The investment cycle simultaneously contains positive and negative feedback. By P1, expansion of the action space is inseparable from expansion of the attack surface. Expansion without defense increases the potential destructive power of the reverse cycle.
-
Gate reliability depends on P2, and P2 rests on E1. V4 is meaningful only when generation and evaluation are structurally separated.
-
Variable optimization presupposes P4. What cannot be observed cannot be optimized.
-
Structure is interpreted probabilistically (P5). Designs that assume 100% compliance are fragile.
-
Task execution is a constraint satisfaction problem (P6). Simultaneous satisfaction of T3, T7, and T8 drives task design.
-
L5 determines the ceiling of structural improvement. Building a custom platform is justified when the investment cycle has sufficiently progressed and the L5 ceiling has become a bottleneck.
-
The axiom system has a three-layer structure. Constraints (T: undeniable), empirical postulates (E: falsifiable but unfalsified), and foundational principles (P: derived from T/E). The robustness of each P differs depending on whether its justification includes E.
-
This classification itself is subject to review. The L1–L6, V1–V7 classification is not a fixed truth; reclassification, addition, and deletion of items may occur during operation.
7.1.15. Quality Measurement Priority G1b-1
Analysis from G1b-1 (#91) revealed that the following quality priorities are derivable from the manifesto's axiom system. These follow logically from existing axioms and design principles, without depending on T6 (human judgment).
7.1.16. Non-Derivable Domain V1-V7
Mutual priority among V1-V7 is not derivable. TradeoffExists is a symmetric relation and does not imply orderings such as "V1 > V3." This is an intentional design decision; priority among V's reduces to T6 judgment (G1b-2 #92).
Quality measurement category: measurement of structural change vs measurement of process success rate. Formalization of the proxy mismatch identified in R1 (GQM redefinition).
Declaration: inductive QualityMeasureCategory
inductive QualityMeasureCategory where
  | structuralOutcome -- structural outcome: theorem delta, test delta, axiom count
  | processSuccess -- process success rate: evolve success rate, skill invocation rate
  deriving BEq, Repr
Priority of quality measurement categories. Structural outcomes are a more direct indicator of quality than process success rates. Basis:
-
Supreme mission "persistent structure continues to improve itself" -> structural change defines improvement
-
Analogy from D5 (specification layer ordering): outcome (what was produced) > process (how it was produced)
-
Anthropic eval guide: "grade what the agent produced, not the path it took"
Declaration: def qualityMeasurePriority
def qualityMeasurePriority : QualityMeasureCategory → Nat
  | .structuralOutcome => 1 -- higher priority
  | .processSuccess => 0 -- lower priority
Measurement of structural outcomes takes priority over measurement of process success rates as a quality indicator. Quality is "skills producing structural improvement," not merely "skills running successfully."
Declaration: theorem structural_outcome_gt_process_success
theorem structural_outcome_gt_process_success :
qualityMeasurePriority .structuralOutcome >
qualityMeasurePriority .processSuccess := by
native_decide
Classification of verification signals: independent verification vs self-assessment. Formalization of P2 + E1 + ICLR 2024 (Huang et al.).
Declaration: inductive VerificationSignalType
inductive VerificationSignalType where
  | independentlyVerified -- P2: verification by an independent agent or structural test
  | selfAssessed -- self-assessment by the same instance
  deriving BEq, Repr
Reliability of verification signals. Independent verification is more reliable than self-assessment. Basis:
-
P2: Cognitive separation of concerns (separation of Worker and Verifier)
-
E1: Experience precedes theory -- self-assessment using self-generated theory is circular
-
ICLR 2024 Huang et al.: intrinsic self-correction degrades accuracy
Declaration: def verificationReliability
def verificationReliability : VerificationSignalType → Nat
  | .independentlyVerified => 1 -- higher reliability
  | .selfAssessed => 0 -- lower reliability
Independently verified quality signals are more reliable than self-assessed quality signals.
Declaration: theorem independent_verification_gt_self_assessment
theorem independent_verification_gt_self_assessment :
verificationReliability .independentlyVerified >
verificationReliability .selfAssessed := by
native_decide
Quality assurance layers: defect absence vs value creation. Application of the D6 DesignStage ordering to the quality dimension.
Declaration: inductive QualityAssuranceLayer
inductive QualityAssuranceLayer where
  | defectAbsence -- confirming nothing is broken (test pass, Lean build, sorry=0)
  | valueCreation -- confirming goodness (substantiveness and usefulness of improvement)
  deriving BEq, Repr
Measurement priority of quality assurance. Confirming defect absence precedes confirming value creation. Basis:
-
D6: Boundary (constraint satisfaction) > Variable (quality improvement)
-
D4: Safety > Governance -- safety (not broken) precedes governance (making better)
-
Logical consequence: measuring "substantiveness of improvement" in a broken system is meaningless
Declaration: def qualityAssurancePriority
def qualityAssurancePriority : QualityAssuranceLayer → Nat
  | .defectAbsence => 1 -- higher measurement priority
  | .valueCreation => 0 -- lower measurement priority (but not less important)
Measurement of defect absence takes priority over measurement of value creation (as measurement ordering). Note: this means "should be measured first," not "defect absence is more important." Measurement of value creation becomes meaningful only after defect absence is confirmed.
Declaration: theorem defect_absence_measurement_gt_value_creation
theorem defect_absence_measurement_gt_value_creation :
qualityAssurancePriority .defectAbsence >
qualityAssurancePriority .valueCreation := by
native_decide
8. Design Foundation: D1-D14
8.1. Epistemic Layer - DesignTheorem Strength 1 - Formalization of Design Development Foundation
Type-checks that D1–D17 from design-development-foundation.md are derivable (§2.4 derivability) from the manifesto's T/E/P (premise set Γ, terminology reference §2.5).
8.1.1. Nature of the Formalization
This file does not add new non-logical axioms (§4.1) to Γ. Every D is formalized as one of the following:
-
Definitional extension (§5.5): Definition of new types/functions. Always a conservative extension
-
Theorem (§4.2): Derived from existing axioms (T/E) by application of inference rules
Therefore this file is a collection of definitional extensions + theorems over T₀,
and conservative extension (§5.5) is guaranteed by definitional_implies_conservative
proven in Terminology.lean.
8.1.2. Design Policy
Each D is expressed as a type (definitional extension, §5.5) or theorem (§4.2), with explicit connections to the underlying T/E/P non-logical axioms (§4.1) / theorems.
D's are meta-level (§5.6 metatheory) design principles, distinct from object-level (§5.6 object theory) non-logical axioms.
8.1.3. Correspondence with Terminology Reference
| Lean Concept | Terminology Reference | §Ref |
|---|---|---|
| D1–D17 theorems | Theorems (propositions derived from axioms) | §4.2 |
| D1–D17 def/structure | Definitional extensions (new symbols defined via existing symbols) | §5.5 |
| SelfGoverning | Type class (interface for types) | §9.4 |
| DesignPrinciple | Component of the domain of discourse | §3.2 |
| DesignPrincipleUpdate | Structuring of AGM revision operations | §9.2 |
| EnforcementLayer | Hierarchy of enforcement power; means to realize invariants | §9.3 |
| DevelopmentPhase | Inter-phase dependencies resemble transition relations | §9.3 |
| VerificationIndependence | Operationalization of E1 (non-logical axiom) | §4.1 |
| CompatibilityClass | Classification of extensions (conservative/consistent/breaking) | §5.5 |
8.1.4. Correspondence with design-development-foundation.md
This file formalizes D1–D17.
| D | Rationale | Formalization Depth |
|---|---|---|
| D1 | P5 + L1–L6 | type + 2 theorems |
| D2 | E1 + P2 | structure + 3 theorems |
| D3 | P4 + T5 | 3 theorems (3-condition structure not formalized) |
| D4 | Section 7 + P3 + T2 | type + 5 theorems |
| D5 | T8 + P4 + P6 | type + 3 theorems (inter-layer relations not formalized) |
| D6 | Ontology/Observable | 3 theorems (causal chain not formalized) |
| D7 | Section 6 + P1 | 2 theorems (accumulation bounded + damage unbounded) |
| D8 | Section 6 + E2 | 2 theorems (overexpansion + capability-risk) |
| D9 | Observable + P3 + Section 7 | SelfGoverning + 4 theorems |
| D10 | T1 + T2 | 2 theorems (structural permanence + epoch monotone increase) |
| D11 | T3 + D1 | definition + 3 theorems (inverse correlation + minimization + finiteness) |
| D12 | P6 + T3 + T7 + T8 | 2 theorems (CSP + probabilistic output) |
| D13 | P3 + Section 8 + T5 | impact propagation + assumption-level extension (#225) |
| D14 | P6 + T7 + T8 | 1 theorem (constraint satisfaction of verification order) |
| D15 | T3+T4+T5+T6+T7+T8+P6 | 4 theorems (retry bounds, convergence, eviction, saturation) |
| D16 | | 3 theorems (zero-contribution, composition, resource) |
| D17 | T5+D3+P3+T6+D5+D9+E1+D2+D13 | type + 8 theorems (deductive design workflow) |
8.1.5. D1 Enforcement Layering
Definitional extension (§5.5).
Rationale: P5 (probabilistic interpretation) + L1–L6 (hierarchy of boundary conditions)
By P5, normative guidelines are only probabilistically complied with. Therefore, absolute constraints such as L1 (safety) should be implemented via structural enforcement (not subject to probabilistic interpretation).
Connection with terminology reference:
-
Structural enforcement → Invariant (§9.3): Property that always holds during execution
-
Procedural enforcement → Pre/post-conditions (§9.3): Verified before and after operations
-
Normative guideline → Satisfiable (§2.2) but not valid/tautological (§2.2) by P5
Enforcement layer. Represents the strength of enforcement power.
Declaration: inductive EnforcementLayer
inductive EnforcementLayer where
  | structural -- violation is physically impossible
  | procedural -- violation is possible but detected and blocked
  | normative -- compliance is probabilistic (P5)
  deriving BEq, Repr
Strength ordering of enforcement layers. structural is the strongest.
Declaration: def EnforcementLayer.strength
def EnforcementLayer.strength : EnforcementLayer → Nat
  | .structural => 3
  | .procedural => 2
  | .normative => 1
Minimum required enforcement layer for each boundary condition. Fixed boundaries (L1, L2) require structural enforcement. Investment-variable boundaries require procedural enforcement or above. Environmental boundaries may use normative guidelines.
Declaration: def minimumEnforcement
def minimumEnforcement : BoundaryLayer → EnforcementLayer
  | .fixed => .structural
  | .investmentVariable => .procedural
  | .environmental => .normative
(Derivation Card)
Derives from: probabilistic_interpretation_insufficient (P5 / T4)
Proposition: D1
Content: Fixed boundaries (L1) require structural enforcement. Normative guidelines cannot guarantee L1 compliance under nondeterministic interpretation.
Proof strategy: rfl (definitional equality — minimumEnforcement maps .fixed to .structural)
Declaration: theorem d1_fixed_requires_structural
theorem d1_fixed_requires_structural : minimumEnforcement .fixed = .structural := by rfl
Corollary of D1: Enforcement layer strength is monotone with respect to boundary layers. Enforcement strength required: fixed >= investment-variable >= environmental.
Declaration: theorem d1_enforcement_monotone
theorem d1_enforcement_monotone :
    (minimumEnforcement .fixed).strength ≥ (minimumEnforcement .investmentVariable).strength ∧
    (minimumEnforcement .investmentVariable).strength ≥ (minimumEnforcement .environmental).strength := by
  simp [minimumEnforcement, EnforcementLayer.strength]
8.1.6. D2 Worker/Verifier Separation
Definitional extension + theorem (§5.5 / §4.2).
Rationale: E1 (verification independence, non-logical axiom §4.1) + P2 (cognitive role separation, theorem §4.2)
E1a (verification_requires_independence) is the direct rationale.
E1 belongs to Γ T₀ (hypothesis-derived) and is falsifiable (§9.1).
If E1 is falsified, D2 becomes subject to review.
Four conditions for verification independence.
The former 3 conditions (context separation, bias non-sharing, independent invocation) covered only process-level independence. Without evaluator independence, the problem of "the same model making the same mistakes in a different context" remains.
Four conditions:
-
Context separation: Worker's reasoning process and intermediate state do not leak to Verifier
-
Framing independence: Verification criteria are not post-hoc defined by the Worker (Refinement of former "bias non-sharing". Not only the artifacts, but also the framework of "what should be verified" is independent of the Worker)
-
Execution automaticity: Worker cannot bypass verification (Strengthening of former "independent invocation". Does not depend on Worker's discretion)
-
Evaluator independence: Evaluation is performed by a separate entity without shared judgment tendencies (Human: A different person without shared context but with sufficient knowledge. LLM: A different model without shared context. Same model with different context corresponds to a Subagent, which achieves process separation but not evaluator independence)
Declaration: structure VerificationIndependence
structure VerificationIndependence where
  /-- Worker's reasoning process does not leak to Verifier -/
  contextSeparated : Bool
  /-- Verification criteria do not depend on Worker's framing -/
  framingIndependent : Bool
  /-- Verification execution does not depend on Worker's discretion -/
  executionAutomatic : Bool
  /-- Evaluator has different judgment tendencies from Worker -/
  evaluatorIndependent : Bool
  deriving BEq, Repr
Verification risk level. The required level of independence varies by risk.
Declaration: inductive VerificationRisk
inductive VerificationRisk where
  | critical -- L1-related: safety and ethics
  | high -- structural changes: architecture, configuration
  | moderate -- ordinary code changes
  | low -- documentation, comments
  deriving BEq, Repr
Required independence conditions for each risk level. The model is quantitative: any N conditions out of 4 suffice.
-

critical: all 4 conditions required (verification by human or different model)
-

high: any 3 of 4 conditions (e.g. context separation + framing independence + evaluator independence)
-

moderate: any 2 of 4 conditions (e.g. context separation + automatic execution)
-

low: any 1 of 4 conditions (context separation alone suffices)
Declaration: def requiredConditions
def requiredConditions : VerificationRisk → Nat
  | .critical => 4
  | .high => 3
  | .moderate => 2
  | .low => 1
Counts the number of satisfied independence conditions.
Declaration: def satisfiedConditions
def satisfiedConditions (vi : VerificationIndependence) : Nat :=
  (if vi.contextSeparated then 1 else 0) +
  (if vi.framingIndependent then 1 else 0) +
  (if vi.executionAutomatic then 1 else 0) +
  (if vi.evaluatorIndependent then 1 else 0)
Whether verification is sufficient: satisfied conditions >= required conditions
Declaration: def sufficientVerification
def sufficientVerification
(vi : VerificationIndependence) (risk : VerificationRisk) : Prop :=
satisfiedConditions vi ≥ requiredConditions risk
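Since all four fields are Booleans, concrete instances are closed decidable propositions. A minimal sketch (not part of the source), checking that full independence meets the critical bar:

```lean
-- Sketch: an instance satisfying all four conditions (4 ≥ 4) suffices for
-- critical risk; `decide` evaluates the closed decidable proposition.
example :
    sufficientVerification
      { contextSeparated := true
        framingIndependent := true
        executionAutomatic := true
        evaluatorIndependent := true } .critical := by
  decide
```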
Critical risk requires all four conditions. Subagent (contextSeparated only) is insufficient.
Declaration: theorem critical_requires_all_four
theorem critical_requires_all_four : requiredConditions .critical = 4 := by rfl
Subagent-only verification (context separation only) is sufficient only for low risk.
Declaration: theorem subagent_only_sufficient_for_low
theorem subagent_only_sufficient_for_low :
let subagentOnly : VerificationIndependence :=
{ contextSeparated := true
framingIndependent := false
executionAutomatic := false
evaluatorIndependent := false }
sufficientVerification subagentOnly .low ∧
¬sufficientVerification subagentOnly .moderate := by
simp [sufficientVerification, satisfiedConditions, requiredConditions]
Backward compatibility with former validSeparation: the old 3 conditions are a subset of the new 4 conditions.
Declaration: def validSeparation
def validSeparation (vs : VerificationIndependence) : Prop :=
  vs.contextSeparated = true ∧ vs.framingIndependent = true ∧ vs.executionAutomatic = true
(Derivation Card)
Derives from: verification_requires_independence (E1)
Proposition: D2
Content: Valid verification requires separation — generator and verifier must have distinct IDs and not share internal state, ensuring contextual and evaluative independence.
Proof strategy: Direct application of verification_requires_independence (E1)
Declaration: theorem d2_from_e1
theorem d2_from_e1 :
∀ (gen ver : Agent) (action : Action) (w : World),
generates gen action w →
verifies ver action w →
gen.id ≠ ver.id ∧ ¬sharesInternalState gen ver :=
verification_requires_independence
8.1.7. D3 Observability First
Theorem (§4.2).
Rationale: P4 (observability of degradation, theorem §4.2) + T5 (no improvement without feedback, T₀ §4.1)
T5 (no_improvement_without_feedback) is the direct rationale:
Improvement requires feedback → feedback requires observation.
Note: design-development-foundation.md defines 3 conditions for observability (measurable, degradation-detectable, improvement-verifiable), but this formalization covers only the implication of T5. Structuring of the 3 conditions is not yet implemented.
(Derivation Card)
Derives from: no_improvement_without_feedback (T5)
Proposition: D3
Content: Feedback must precede improvement — observability is a necessary precondition for structural improvement.
Proof strategy: Direct application of T5 (no_improvement_without_feedback)
Declaration: theorem d3_observability_precedes_improvement
theorem d3_observability_precedes_improvement :
∀ (w w' : World),
structureImproved w w' →
∃ (f : Feedback), f ∈ w'.feedbacks ∧
f.timestamp ≥ w.time ∧ f.timestamp ≤ w'.time :=
no_improvement_without_feedback
(Derivation Card)
Derives from: no_process_improvement_without_feedback (T5, #316)
Proposition: D3 (process-level)
Content: Process-targeted feedback must precede process improvement.
Observability of the process itself is a necessary precondition
for process improvement. This prevents "blind optimization" where
a process improves structures but never evaluates its own effectiveness.
Proof strategy: Direct application of no_process_improvement_without_feedback
Declaration: theorem d3_process_observability_precedes_improvement
theorem d3_process_observability_precedes_improvement :
∀ (pid : ProcessId) (w w' : World),
processImproved pid w w' →
∃ (f : Feedback), f ∈ w'.feedbacks ∧
f.target = .process pid ∧
f.timestamp ≥ w.time ∧ f.timestamp ≤ w'.time :=
no_process_improvement_without_feedback
Distinction of detection modes (introduced in Run 41). Refines the definition of "detectable": distinguishes between human-readable (humanReadable) and programmatically queryable (structurallyQueryable). D3 condition 2 requires structurallyQueryable.
Declaration: inductive DetectionMode
inductive DetectionMode where
  | humanReadable : DetectionMode -- a human can understand it by reading (free text, etc.)
  | structurallyQueryable : DetectionMode -- programmatically queryable (structured fields, etc.)
  deriving BEq, Repr
D3 observability 3 conditions (design-development-foundation.md §D3). Only when all 3 conditions hold for a variable V does V become an effectively optimizable target.
Declaration: structure ObservabilityConditions
structure ObservabilityConditions where
  /-- Whether the current value is measurable (Measurable, Observable.lean) -/
  measurable : Bool
  /-- Whether degradation is detectable (can it be detected before quality collapse) -/
  degradationDetectable : Bool
  /-- Detection mode for degradation (ineffective unless structurallyQueryable) -/
  detectionMode : DetectionMode := .structurallyQueryable
  /-- Whether improvement is verifiable (can value changes be compared before and after intervention) -/
  improvementVerifiable : Bool
  deriving BEq, Repr
Determines whether a variable is an effectively optimizable target. All 3 conditions are required; additionally, degradation detection must be in a structurally queryable format.
Declaration: def effectivelyOptimizable
def effectivelyOptimizable (c : ObservabilityConditions) : Prop :=
  c.measurable = true ∧
  c.degradationDetectable = true ∧
  c.detectionMode = .structurallyQueryable ∧
  c.improvementVerifiable = true
D3: A variable lacking any of the 3 conditions is merely a nominal optimization target.
Declaration: theorem d3_partial_observability_insufficient
theorem d3_partial_observability_insufficient :
    ¬effectivelyOptimizable ⟨true, true, .structurallyQueryable, false⟩ ∧
    ¬effectivelyOptimizable ⟨true, false, .structurallyQueryable, true⟩ ∧
    ¬effectivelyOptimizable ⟨false, true, .structurallyQueryable, true⟩ := by
  refine ⟨?_, ?_, ?_⟩ <;> simp [effectivelyOptimizable]
D3: Effective only when all 3 conditions hold and detection is structurally queryable.
Declaration: theorem d3_full_observability_sufficient
theorem d3_full_observability_sufficient : effectivelyOptimizable ⟨true, true, .structurallyQueryable, true⟩ := by simp [effectivelyOptimizable]
D3 refinement (Run 41): Human-readable but structurally non-queryable detection is insufficient. Merely writing in notes is ineffective even if degradationDetectable = true.
Declaration: theorem d3_human_readable_insufficient
theorem d3_human_readable_insufficient : ¬effectivelyOptimizable ⟨true, true, .humanReadable, true⟩ := by simp [effectivelyOptimizable]
8.1.8. D4 Progressive Self-Application
Definitional extension + theorem (§5.5 / §4.2).
Rationale: Section 7 (self-application) + P3 (governed learning, theorem §4.2) + T2 (structural permanence, T₀ §4.1)
Development phases have an ordering, and each phase's completion persists in structure (T2).
The phase ordering is derived from the dependency relationships of D1–D3.
phaseOrder in Procedure.lean formalizes the same ordering.
Development phase. Each stage of D4's progressive self-application.
Declaration: inductive DevelopmentPhase
inductive DevelopmentPhase where
  | safety        -- L1: safety foundation
  | verification  -- P2: verification foundation
  | observability -- P4: observability
  | governance    -- P3: governance
  | equilibrium   -- investment cycle + dynamic adjustment
  deriving BEq, Repr
Inter-phase dependencies. A subsequent phase cannot begin until the preceding phase is complete.
Declaration: def phaseDependency
def phaseDependency : DevelopmentPhase → DevelopmentPhase → Prop
  | .verification, .safety => True        -- P2 comes after L1
  | .observability, .verification => True -- P4 comes after P2
  | .governance, .observability => True   -- P3 comes after P4
  | .equilibrium, .governance => True     -- investment comes after P3
  | _, _ => False
Rationale for D4: Phase ordering is strict (no self-transitions). Each phase depends on its preceding phase.
Declaration: theorem d4_no_self_dependency
theorem d4_no_self_dependency : ∀ (p : DevelopmentPhase), ¬phaseDependency p p := by intro p; cases p <;> simp [phaseDependency]
(Derivation Card)
Derives from: phaseDependency (definitional), structure_accumulates (T2)
Proposition: D4
Content: A complete phase chain exists: safety → verification → observability → governance → equilibrium. Each phase is strictly ordered with no self-dependencies.
Proof strategy: refine + trivial (all four phaseDependency facts hold by definition)
Declaration: theorem d4_full_chain
theorem d4_full_chain : phaseDependency .verification .safety ∧ phaseDependency .observability .verification ∧ phaseDependency .governance .observability ∧ phaseDependency .equilibrium .governance := by refine ⟨?_, ?_, ?_, ?_⟩ <;> trivial
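Individual links can be projected out of d4_full_chain; for example:

```lean
-- The first conjunct: verification depends on safety.
example : phaseDependency .verification .safety := d4_full_chain.1

-- The last conjunct: equilibrium depends on governance.
example : phaseDependency .equilibrium .governance := d4_full_chain.2.2.2
```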
D4's connection to T2: Phase completion persists in structure.
From structure_accumulates, epochs (phase progression) are
irreversible. A completed phase is never "undone".
Declaration: theorem d4_phase_completion_persists
theorem d4_phase_completion_persists :
∀ (w w' : World),
validTransition w w' →
w.epoch ≤ w'.epoch :=
structure_accumulates
8.1.9. Partial Order Instance for DevelopmentPhase
manifesto.md Section 8 asserts that "D4/D5/D6 are instances of partial orders". Following the precedent of StructureKind (Ontology.lean), we derive LE/LT instances and the 4 partial order property theorems from a Nat-based ordering function.
Ordering function for DevelopmentPhase. Separately from phaseDependency (binary Prop), defines a total order via Nat. Reflects the phase ordering of D4.
Declaration: def developmentPhaseOrder
def developmentPhaseOrder : DevelopmentPhase → Nat | .safety => 0 | .verification => 1 | .observability => 2 | .governance => 3 | .equilibrium => 4
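The ≤/< statements proved below presuppose LE and LT instances for DevelopmentPhase that this view does not show. Following the StructureKind precedent cited above and the DesignStage instances later in this section, they are presumably derived from the ordering function like so (a sketch, not the verbatim source):

```lean
instance : LE DevelopmentPhase :=
  ⟨fun a b => developmentPhaseOrder a ≤ developmentPhaseOrder b⟩
instance : LT DevelopmentPhase :=
  ⟨fun a b => developmentPhaseOrder a < developmentPhaseOrder b⟩
```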
The ordering function is injective (distinct phases have distinct order values).
Declaration: theorem developmentPhaseOrder_injective
theorem developmentPhaseOrder_injective :
∀ (p₁ p₂ : DevelopmentPhase),
developmentPhaseOrder p₁ = developmentPhaseOrder p₂ → p₁ = p₂ := by
intro p₁ p₂; cases p₁ <;> cases p₂ <;> simp [developmentPhaseOrder]
Partial order reflexivity: p <= p.
Declaration: theorem developmentPhase_le_refl
theorem developmentPhase_le_refl : ∀ (p : DevelopmentPhase), p ≤ p := fun p => Nat.le_refl (developmentPhaseOrder p)
Partial order transitivity: if p₁ <= p₂ and p₂ <= p₃ then p₁ <= p₃.
Declaration: theorem developmentPhase_le_trans
theorem developmentPhase_le_trans :
∀ (p₁ p₂ p₃ : DevelopmentPhase), p₁ ≤ p₂ → p₂ ≤ p₃ → p₁ ≤ p₃ := by
intro _p₁ _p₂ _p₃ h₁₂ h₂₃; exact Nat.le_trans h₁₂ h₂₃
Partial order antisymmetry: if p₁ <= p₂ and p₂ <= p₁ then p₁ = p₂.
Declaration: theorem developmentPhase_le_antisymm
theorem developmentPhase_le_antisymm :
∀ (p₁ p₂ : DevelopmentPhase), p₁ ≤ p₂ → p₂ ≤ p₁ → p₁ = p₂ :=
fun p₁ p₂ h₁₂ h₂₁ => developmentPhaseOrder_injective p₁ p₂ (Nat.le_antisymm h₁₂ h₂₁)
Consistency of LT and LE: p₁ < p₂ iff p₁ <= p₂ and not (p₂ <= p₁).
Declaration: theorem developmentPhase_lt_iff_le_not_le
theorem developmentPhase_lt_iff_le_not_le :
∀ (p₁ p₂ : DevelopmentPhase), p₁ < p₂ ↔ p₁ ≤ p₂ ∧ ¬(p₂ ≤ p₁) := by
intro _p₁ _p₂; exact Nat.lt_iff_le_and_not_ge
8.1.10. D5 Specification Test and Implementation Three-Layer Architecture
Rationale: T8 (precision level) + P4 (observability) + P6 (constraint satisfaction)
Types of the three-layer representation.
Declaration: inductive SpecLayer
inductive SpecLayer where
  | formalSpec     -- formal specification (Lean axiom/theorem)
  | acceptanceTest -- acceptance test (executable verification)
  | implementation -- implementation (platform-specific)
  deriving BEq, Repr
Test kinds. Corresponds to T4 (probabilistic output).
Declaration: inductive TestKind
inductive TestKind where
  | structural -- checks that a construct exists (deterministic)
  | behavioral -- executes and checks results (probabilistic, T4)
  deriving BEq, Repr
(Derivation Card)
Derives from: task_has_precision (T8)
Proposition: D5
Content: Tests must have non-zero precision — a task with precision level 0 is meaningless and cannot support meaningful optimization or acceptance criteria.
Proof strategy: Direct application of task_has_precision (T8)
Declaration: theorem d5_test_has_precision
theorem d5_test_has_precision :
∀ (task : Task),
task.precisionRequired.required > 0 :=
task_has_precision
Correspondence between the three layers. Composed in the order: formal spec -> test -> implementation. design-development-foundation.md D5: "Formal spec -> Test: At least one test exists for each axiom/theorem" "Test -> Implementation: Tests exist first, and the implementation passes the tests"
Declaration: def specLayerOrder
def specLayerOrder : SpecLayer → Nat
  | .formalSpec => 0     -- define the specification first
  | .acceptanceTest => 1 -- derive tests from the specification
  | .implementation => 2 -- build an implementation that passes the tests
D5: The three layers are strictly ordered.
Declaration: theorem d5_layer_sequential
theorem d5_layer_sequential : specLayerOrder .formalSpec < specLayerOrder .acceptanceTest ∧ specLayerOrder .acceptanceTest < specLayerOrder .implementation := by simp [specLayerOrder]
Determinism of tests. Structural tests are deterministic, behavioral tests are probabilistic (T4).
Declaration: def testDeterministic
def testDeterministic : TestKind → Bool
  | .structural => true  -- deterministic: checks presence or absence
  | .behavioral => false -- probabilistic: results may vary under T4
D5 + T4: Structural tests are deterministic, behavioral tests are probabilistic.
Declaration: theorem d5_structural_test_deterministic
theorem d5_structural_test_deterministic : testDeterministic .structural = true ∧ testDeterministic .behavioral = false := by constructor <;> rfl
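As with DevelopmentPhase, the partial-order theorems below assume LE/LT instances on SpecLayer derived from the ordering function; presumably of this shape (a sketch):

```lean
instance : LE SpecLayer := ⟨fun a b => specLayerOrder a ≤ specLayerOrder b⟩
instance : LT SpecLayer := ⟨fun a b => specLayerOrder a < specLayerOrder b⟩
```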
The ordering function is injective (distinct layers have distinct order values).
Declaration: theorem specLayerOrder_injective
theorem specLayerOrder_injective :
∀ (l₁ l₂ : SpecLayer),
specLayerOrder l₁ = specLayerOrder l₂ → l₁ = l₂ := by
intro l₁ l₂; cases l₁ <;> cases l₂ <;> simp [specLayerOrder]
Partial order reflexivity: l <= l.
Declaration: theorem specLayer_le_refl
theorem specLayer_le_refl : ∀ (l : SpecLayer), l ≤ l := fun l => Nat.le_refl (specLayerOrder l)
Partial order transitivity: if l₁ <= l₂ and l₂ <= l₃ then l₁ <= l₃.
Declaration: theorem specLayer_le_trans
theorem specLayer_le_trans :
∀ (l₁ l₂ l₃ : SpecLayer), l₁ ≤ l₂ → l₂ ≤ l₃ → l₁ ≤ l₃ := by
intro _l₁ _l₂ _l₃ h₁₂ h₂₃; exact Nat.le_trans h₁₂ h₂₃
Partial order antisymmetry: if l₁ <= l₂ and l₂ <= l₁ then l₁ = l₂.
Declaration: theorem specLayer_le_antisymm
theorem specLayer_le_antisymm :
∀ (l₁ l₂ : SpecLayer), l₁ ≤ l₂ → l₂ ≤ l₁ → l₁ = l₂ :=
fun l₁ l₂ h₁₂ h₂₁ => specLayerOrder_injective l₁ l₂ (Nat.le_antisymm h₁₂ h₂₁)
Consistency of LT and LE: l₁ < l₂ iff l₁ <= l₂ and not (l₂ <= l₁).
Declaration: theorem specLayer_lt_iff_le_not_le
theorem specLayer_lt_iff_le_not_le :
∀ (l₁ l₂ : SpecLayer), l₁ < l₂ ↔ l₁ ≤ l₂ ∧ ¬(l₂ ≤ l₁) := by
intro _l₁ _l₂; exact Nat.lt_iff_le_and_not_ge
8.1.11. D6 Three-Stage Design
Rationale: Ontology.lean/Observable.lean three-stage structure (boundary -> mitigation -> variable)
BoundaryLayer, BoundaryId, and Mitigation are already defined in Ontology.lean. Here we express the design principles as theorems.
Rationale for D6: Variables corresponding to fixed boundaries can only improve mitigation quality.
Declaration: theorem d6_fixed_boundary_mitigated
theorem d6_fixed_boundary_mitigated : boundaryLayer .ethicsSafety = .fixed ∧ boundaryLayer .ontological = .fixed := by simp [boundaryLayer]
Design flow of the three-stage design. design-development-foundation.md D6: "Boundary conditions (invariant) -> Mitigations (design decisions) -> Variables (quality metrics)" Design always proceeds in this direction; the reverse direction is prohibited.
Declaration: inductive DesignStage
inductive DesignStage where
  /-- Identify boundary conditions (invariant; only accepted) -/
  | identifyBoundary
  /-- Design mitigations (design decisions belonging to L6) -/
  | designMitigation
  /-- Define variables (metrics for mitigation effectiveness) -/
  | defineVariable
  deriving BEq, Repr, DecidableEq
Stage ordering of the three-stage design.
Declaration: def designStageOrder
def designStageOrder : DesignStage → Nat
  | .identifyBoundary => 0
  | .designMitigation => 1
  | .defineVariable => 2
/-- (Derivation Card)
Derives from: designStageOrder (definitional)
Proposition: D6
Content: The three-stage design is strictly ordered — identifyBoundary < designMitigation < defineVariable. The stage ordering function assigns monotonically increasing natural numbers.
Proof strategy: simp [designStageOrder] — unfold the ordering function and reduce to Nat inequalities -/
theorem d6_stage_sequential :
designStageOrder .identifyBoundary < designStageOrder .designMitigation ∧
designStageOrder .designMitigation < designStageOrder .defineVariable := by
simp [designStageOrder]
/-- D6: Reverse direction prohibited. Do not attempt to directly improve variables (Goodhart's Law trap).
The variable stage is last; there is no backtracking from variables to boundary conditions or mitigations. -/
theorem d6_no_reverse :
∀ (s : DesignStage),
designStageOrder .identifyBoundary ≤ designStageOrder s := by
intro s; cases s <;> simp [designStageOrder]
/-- The ordering function is injective (distinct stages have distinct order values). -/
theorem designStageOrder_injective :
∀ (s₁ s₂ : DesignStage),
designStageOrder s₁ = designStageOrder s₂ → s₁ = s₂ := by
intro s₁ s₂; cases s₁ <;> cases s₂ <;> simp [designStageOrder]
instance : LE DesignStage := ⟨fun a b => designStageOrder a ≤ designStageOrder b⟩
instance : LT DesignStage := ⟨fun a b => designStageOrder a < designStageOrder b⟩
/-- Partial order reflexivity: s <= s. -/
theorem designStage_le_refl : ∀ (s : DesignStage), s ≤ s :=
fun s => Nat.le_refl (designStageOrder s)
/-- Partial order transitivity: if s₁ <= s₂ and s₂ <= s₃ then s₁ <= s₃. -/
theorem designStage_le_trans :
∀ (s₁ s₂ s₃ : DesignStage), s₁ ≤ s₂ → s₂ ≤ s₃ → s₁ ≤ s₃ := by
intro _s₁ _s₂ _s₃ h₁₂ h₂₃; exact Nat.le_trans h₁₂ h₂₃
/-- Partial order antisymmetry: if s₁ <= s₂ and s₂ <= s₁ then s₁ = s₂. -/
theorem designStage_le_antisymm :
∀ (s₁ s₂ : DesignStage), s₁ ≤ s₂ → s₂ ≤ s₁ → s₁ = s₂ :=
fun s₁ s₂ h₁₂ h₂₁ => designStageOrder_injective s₁ s₂ (Nat.le_antisymm h₁₂ h₂₁)
/-- Consistency of LT and LE: s₁ < s₂ iff s₁ <= s₂ and not (s₂ <= s₁). -/
theorem designStage_lt_iff_le_not_le :
∀ (s₁ s₂ : DesignStage), s₁ < s₂ ↔ s₁ ≤ s₂ ∧ ¬(s₂ ≤ s₁) := by
intro _s₁ _s₂; exact Nat.lt_iff_le_and_not_ge
8.1.13. D7 Trust Asymmetry
Rationale: Section 6 + P1 (co-growth)
Accumulation is bounded (trust_accumulates_gradually),
damage is unbounded (trust_decreases_on_materialized_risk).
(Derivation Card)
Derives from: trust_accumulates_gradually (P1)
Proposition: D7
Content: Trust accumulation is bounded — incremental increases are capped by trustIncrementBound, formalizing the asymmetry of gradual growth.
Proof strategy: Direct application of trust_accumulates_gradually (P1)
Declaration: theorem d7_accumulation_bounded
theorem d7_accumulation_bounded :
∀ (agent : Agent) (w w' : World),
actionSpaceSize agent w ≤ actionSpaceSize agent w' →
¬riskMaterialized agent w' →
trustLevel agent w ≤ trustLevel agent w' ∧
trustLevel agent w' ≤ trustLevel agent w + trustIncrementBound :=
trust_accumulates_gradually
(Derivation Card)
Derives from: trust_decreases_on_materialized_risk (P1)
Proposition: D7
Content: Trust damage from materialized risk is unbounded — a single incident can destroy arbitrarily accumulated trust, formalizing the asymmetry of abrupt destruction.
Proof strategy: Direct application of trust_decreases_on_materialized_risk (P1)
Declaration: theorem d7_damage_unbounded
theorem d7_damage_unbounded :
∀ (agent : Agent) (w w' : World),
actionSpaceSize agent w < actionSpaceSize agent w' →
riskMaterialized agent w' →
trustLevel agent w' < trustLevel agent w :=
trust_decreases_on_materialized_risk
8.1.14. D8 Equilibrium Search
Rationale: Section 6 + E2 (capability-risk co-scaling)
By overexpansion_reduces_value,
there exist cases where expansion of the action space reduces collaborative value.
(Derivation Card)
Derives from: overexpansion_reduces_value (E2)
Proposition: D8
Content: Overexpansion of the action space can reduce collaborative value — equilibrium search is necessary to avoid value-destroying expansion.
Proof strategy: Direct application of overexpansion_reduces_value (E2)
Declaration: theorem d8_overexpansion_risk
theorem d8_overexpansion_risk :
∃ (agent : Agent) (w w' : World),
actionSpaceSize agent w < actionSpaceSize agent w' ∧
collaborativeValue w' < collaborativeValue w :=
overexpansion_reduces_value
D8's connection to P1: Capability expansion is inseparable from risk expansion. Direct application of E2.
Declaration: theorem d8_capability_risk
theorem d8_capability_risk :
∀ (agent : Agent) (w w' : World),
actionSpaceSize agent w < actionSpaceSize agent w' →
riskExposure agent w < riskExposure agent w' :=
capability_risk_coscaling
8.1.15. D9 Maintenance of the Classification Itself
Definitional Extension + Theorem, 5.5/4.2.
Rationale: Observable.lean Part IV + P3 (governed learning, theorem §4.2) + Section 7 (self-application)
The design foundation itself is subject to updates, and updates follow P3's compatibility classification. This is a structuring of AGM revision operations (terminology reference §9.2):
- Conservative extension = conservative extension (§5.5)
- Compatible change = consistent extension (§5.5)
- Breaking change = non-extension change (some theorems are not preserved)
8.1.16. Self-Application Requirements
Since D9 states the principle of "maintenance of the classification itself", D1–D9 themselves must also be subject to D9 (Section 7).
To express this at the type level (§7.1 Curry-Howard correspondence):
- Model D1–D9 as values of the DesignPrinciple type (extension of domain of discourse §3.2)
- Require that updates to DesignPrinciple are classified by CompatibilityClass
- Structurally enforce via the SelfGoverning type class (§9.4)
Design principle identifiers. Enumerates D1–D18 as values. This allows D1–D18 themselves to be treated at the type level as "targets of updates".
Declaration: inductive DesignPrinciple
inductive DesignPrinciple where
  | d1_enforcementLayering
  | d2_workerVerifierSeparation
  | d3_observabilityFirst
  | d4_progressiveSelfApplication
  | d5_specTestImpl
  | d6_boundaryMitigationVariable
  | d7_trustAsymmetry
  | d8_equilibriumSearch
  | d9_selfMaintenance
  | d10_structuralPermanence
  | d11_contextEconomy
  | d12_constraintSatisfactionTaskDesign
  | d13_premiseNegationPropagation
  | d14_verificationOrderConstraint
  | d15_harnessEngineering
  | d16_informationRelevance
  | d17_deductiveDesignWorkflow
  | d18_multiAgentCoordination
  deriving BEq, Repr
DesignPrinciple implements SelfGoverning. This makes D1–D9 themselves subject to governedUpdate, and updates without compatibility classification become type-level errors.
Types that do not implement SelfGoverning cannot use governedUpdate or
governed_update_classified, so defining a new principle type
and forgetting to implement SelfGoverning is detected as a type error.
Declaration: instance :
instance : SelfGoverning DesignPrinciple where
  classificationExhaustive := by intro c; cases c <;> simp
  canClassifyUpdate _ _ := True
Design principle update event. Self-application of D9: Changes to D1–D9 themselves also go through compatibility classification.
Declaration: structure DesignPrincipleUpdate
structure DesignPrincipleUpdate where
  /-- The principle being updated -/
  principle : DesignPrinciple
  /-- Compatibility classification of the update -/
  compatibility : CompatibilityClass
  /-- Rationale for the update (reference to manifesto's T/E/P) -/
  hasRationale : Bool
  deriving Repr
(Derivation Card)
Derives from: CompatibilityClass (definitional — exhaustive inductive type)
Proposition: D9
Content: Any compatibility classification belongs to exactly one of the three classes: conservativeExtension, compatibleChange, or breakingChange.
Proof strategy: intro c; cases c <;> simp — exhaustive case analysis on the CompatibilityClass inductive type
Declaration: theorem d9_update_classified
theorem d9_update_classified :
∀ (c : CompatibilityClass),
c = .conservativeExtension ∨
c = .compatibleChange ∨
c = .breakingChange := by
intro c; cases c <;> simp
Self-application of D9: Updates to D9 itself also go through compatibility classification. The DesignPrincipleUpdate type structurally requires this (the compatibility field is mandatory).
Furthermore, updates require a rationale (D9: principles that lose their rationale are subject to review).
Declaration: def governedPrincipleUpdate
def governedPrincipleUpdate (u : DesignPrincipleUpdate) : Prop := u.hasRationale = true
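As an illustrative use of the definition (the specific principle and classification below are hypothetical):

```lean
-- A conservative extension to D3 with a recorded rationale passes the check.
example : governedPrincipleUpdate
    { principle := .d3_observabilityFirst,
      compatibility := .conservativeExtension,
      hasRationale := true } :=
  rfl
```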
Self-application of D9: Proves via the SelfGoverning typeclass that any update to DesignPrinciple is compatibility-classified.
governed_update_classified can only be called on types that have a
SelfGoverning instance. If DesignPrinciple does not implement
SelfGoverning, this theorem becomes a type error.
-> Missing implementations are structurally detected.
Declaration: theorem d9_self_applicable
theorem d9_self_applicable :
∀ (_p : DesignPrinciple) (c : CompatibilityClass),
c = .conservativeExtension ∨ c = .compatibleChange ∨ c = .breakingChange :=
fun _p c => governed_update_classified _p c
D9 exhaustiveness: All principles D1–D18 are enumerated as update targets.
Declaration: theorem d9_all_principles_enumerated
theorem d9_all_principles_enumerated :
∀ (p : DesignPrinciple),
p = .d1_enforcementLayering ∨
p = .d2_workerVerifierSeparation ∨
p = .d3_observabilityFirst ∨
p = .d4_progressiveSelfApplication ∨
p = .d5_specTestImpl ∨
p = .d6_boundaryMitigationVariable ∨
p = .d7_trustAsymmetry ∨
p = .d8_equilibriumSearch ∨
p = .d9_selfMaintenance ∨
p = .d10_structuralPermanence ∨
p = .d11_contextEconomy ∨
p = .d12_constraintSatisfactionTaskDesign ∨
p = .d13_premiseNegationPropagation ∨
p = .d14_verificationOrderConstraint ∨
p = .d15_harnessEngineering ∨
p = .d16_informationRelevance ∨
p = .d17_deductiveDesignWorkflow ∨
p = .d18_multiAgentCoordination := by
intro p; cases p <;> simp
8.1.17. Self-Application of D4
D4 (progressive self-application) states that "the development process achieves compliance up to each phase", but DesignFoundation itself should also be developed following these phases.
Updates to DesignFoundation occur in the context of DevelopmentPhase,
and the compliance level of the updated phase progresses irreversibly (T2: structure_accumulates).
Self-application of D4: The design foundation itself has phases. Each principle is applicable only after the phase it requires is complete.
Declaration: def principleRequiredPhase
def principleRequiredPhase : DesignPrinciple → DevelopmentPhase
  | .d1_enforcementLayering => .safety
  | .d2_workerVerifierSeparation => .verification
  | .d3_observabilityFirst => .observability
  | .d4_progressiveSelfApplication => .safety        -- D4 itself is needed from the start
  | .d5_specTestImpl => .verification
  | .d6_boundaryMitigationVariable => .observability
  | .d7_trustAsymmetry => .equilibrium
  | .d8_equilibriumSearch => .equilibrium
  | .d9_selfMaintenance => .safety                   -- D9 is also needed from the start
  | .d10_structuralPermanence => .safety             -- T1+T2 hold from the start
  | .d11_contextEconomy => .observability            -- presupposes context-cost measurement
  | .d12_constraintSatisfactionTaskDesign => .governance -- P6 belongs to the governance phase
  | .d13_premiseNegationPropagation => .governance   -- presupposes P3 (retirement) + Section 8
  | .d14_verificationOrderConstraint => .governance  -- presupposes P6 + T7 + T8
  | .d15_harnessEngineering => .equilibrium          -- implementation patterns belong to the dynamic adjustment phase
  | .d16_informationRelevance => .observability      -- presupposes measuring context contribution
  | .d17_deductiveDesignWorkflow => .governance      -- presupposes P3 (governed learning) + D5 (three layers)
  | .d18_multiAgentCoordination => .equilibrium      -- dynamic adjustment phase (presupposes D12 + T7b)
Self-application of D4: D4 and D9 are required from the safety phase. This means that "phase ordering" and "governed updates" must be functional from the very beginning of development.
Declaration: theorem d4_d9_from_first_phase
theorem d4_d9_from_first_phase : principleRequiredPhase .d4_progressiveSelfApplication = .safety ∧ principleRequiredPhase .d9_selfMaintenance = .safety := by constructor <;> rfl
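Entries of the phase table reduce definitionally; spot checks:

```lean
example : principleRequiredPhase .d12_constraintSatisfactionTaskDesign = .governance := rfl
example : principleRequiredPhase .d15_harnessEngineering = .equilibrium := rfl
```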
8.1.18. Dependency Structure of D1-D9
Verifies that D4's (progressive self-application) phase ordering is consistent with the dependency relationships of D1–D3.
- Phase 1 (safety) -> D1 (L1 requires structural enforcement)
- Phase 2 (verification) -> D2 (structural realization of P2)
- Phase 3 (observability) -> D3 (observability first)
- Phase 4 (governance) -> depends on D3 (P3 comes after P4)
- Phase 5 (equilibrium) -> depends on D7, D8 (trust and equilibrium)
This dependency structure is already expressed in phaseDependency.
d4_full_chain proves its existence.
Consistency of D1–D4: The first step of D4's phase ordering (safety -> verification) matches the ordering of D1 (L1 requires structural enforcement) and D2 (realization of P2).
safety is first = D1 makes L1 structurally enforced; verification is next = D2 realizes P2
Declaration: theorem dependency_d1_d2_d4_consistent
theorem dependency_d1_d2_d4_consistent :
    phaseDependency .verification .safety ∧ minimumEnforcement .fixed = .structural := by
  constructor
  · trivial
  · rfl
8.1.19. D10 Structural Permanence
Theorem, 4.2.
Rationale: T1 (ephemerality, T₀ §4.1) + T2 (structural permanence, T₀ §4.1)
Agents are ephemeral (T1) but structure persists (T2).
Accumulation of improvements is possible only through structure.
Connects with P3 theorem group in Principles.lean (modifier_agent_terminates,
modification_persists_after_termination).
(Derivation Card)
Derives from: session_bounded (T1), structure_persists (T2)
Proposition: D10
Content: Agents are ephemeral (T1) but structure persists (T2) — accumulation of improvements is possible only through structure, not through persistent agent identity.
Proof strategy: Constructor pair ⟨session_bounded, structure_persists⟩ — direct composition of T1 and T2
Declaration: theorem d10_agent_temporary_structure_permanent
theorem d10_agent_temporary_structure_permanent :
-- T1: sessions terminate
(∀ (w : World) (s : Session),
s ∈ w.sessions →
∃ (w' : World), w.time ≤ w'.time ∧
∃ (s' : Session), s' ∈ w'.sessions ∧
s'.id = s.id ∧ s'.status = SessionStatus.terminated) ∧
-- T2: structure persists
(∀ (w w' : World) (s : Session) (st : Structure),
s ∈ w.sessions → st ∈ w.structures →
s.status = SessionStatus.terminated →
validTransition w w' → st ∈ w'.structures) :=
⟨session_bounded, structure_persists⟩
Corollary of D10: Writing back to structure is the sole means of accumulation.
Epochs (T2: structure_accumulates) increase monotonically.
Declaration: theorem d10_epoch_monotone
theorem d10_epoch_monotone : ∀ (w w' : World), validTransition w w' → w.epoch ≤ w'.epoch := structure_accumulates
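Assuming epoch is a natural number (as the ≤ ordering suggests), monotonicity chains across consecutive transitions; a sketch:

```lean
-- Two valid transitions in sequence can only keep or raise the epoch.
example (w₁ w₂ w₃ : World)
    (h₁ : validTransition w₁ w₂) (h₂ : validTransition w₂ w₃) :
    w₁.epoch ≤ w₃.epoch :=
  Nat.le_trans (d10_epoch_monotone w₁ w₂ h₁) (d10_epoch_monotone w₂ w₃ h₂)
```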
8.1.20. D11 Context Economy
Definitional Extension + Theorem, 5.5/4.2.
Rationale: T3 (context finiteness, T₀ §4.1) + D1 (enforcement layering)
Working memory (T3: amount of information that can be processed) is a finite resource, and enforcement layers (D1) and context cost are inversely correlated: structural enforcement (low cost) > procedural enforcement (medium cost) > normative guidelines (high cost).
Context cost for D1's enforcement layers. Higher values consume more context.
Declaration: def contextCost
def contextCost : EnforcementLayer → Nat
  | .structural => 0 -- once configured, it need not be read every session
  | .procedural => 1 -- the process exists but does not reside in context
  | .normative => 2 -- loaded every session, occupying context
(Derivation Card)
Derives from: contextCost (definitional)
Proposition: D11
Content: Enforcement power and context cost are inversely correlated — structural (0) < procedural (1) < normative (2). Higher enforcement power means lower context cost.
Proof strategy: simp (contextCost) — unfold the cost function and reduce to Nat inequalities
Declaration: theorem d11_enforcement_cost_inverse
theorem d11_enforcement_cost_inverse : contextCost .structural < contextCost .procedural ∧ contextCost .procedural < contextCost .normative := by simp [contextCost]
D11: Promotion to structural enforcement reduces context cost.
Declaration: theorem d11_structural_minimizes_cost
theorem d11_structural_minimizes_cost :
∀ (e : EnforcementLayer),
contextCost .structural ≤ contextCost e := by
intro e; cases e <;> simp [contextCost]
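The cost ordering can also be checked on a self-contained replica of the definitions (toy `Layer`/`cost` names here; the project's `EnforcementLayer`/`contextCost` live in DesignFoundation.lean):

```lean
inductive Layer | structural | procedural | normative

def cost : Layer → Nat
  | .structural => 0
  | .procedural => 1
  | .normative  => 2

-- The inverse correlation reduces to decidable Nat inequalities.
example : cost .structural < cost .procedural ∧ cost .procedural < cost .normative := by
  decide
```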
D11 + T3: Context capacity is finite (T3), and bloating of normative guidelines degrades V2 (context efficiency).
Declaration: theorem d11_context_finite
theorem d11_context_finite :
∀ (agent : Agent),
agent.contextWindow.capacity > 0 ∧
agent.contextWindow.used ≤ agent.contextWindow.capacity :=
context_finite
8.1.21. D12 Constraint Satisfaction Task Design
Theorem, 4.2.
Rationale: P6 (constraint satisfaction, theorem §4.2) + T3 + T7 + T8 (T₀ §4.1)
Task execution is a constraint satisfaction problem. Achieve precision requirements (T8) within finite cognitive space (T3) and finite resources (T7). Connects with P6 theorem group in Principles.lean.
(Derivation Card)
Derives from: task_is_constraint_satisfaction (P6)
Proposition: D12
Content: Task design is a constraint satisfaction problem over T3+T7+T8. A feasible strategy must satisfy context capacity (T3), resource budget (T7), and precision requirement (T8) simultaneously.
Proof strategy: Direct application of task_is_constraint_satisfaction (P6) — restatement at the design principle level
Declaration: theorem d12_task_is_csp
theorem d12_task_is_csp :
∀ (task : Task) (agent : Agent),
agent.contextWindow.capacity > 0 →
task.resourceBudget ≤ globalResourceBound →
task.precisionRequired.required > 0 →
∀ (s : TaskStrategy),
s.task = task →
strategyFeasible s agent →
s.contextUsage ≤ agent.contextWindow.capacity ∧
s.resourceUsage ≤ globalResourceBound ∧
s.achievedPrecision > 0 :=
task_is_constraint_satisfaction
D12: Task design itself is also probabilistic output (T4), requiring verification through P2 (cognitive role separation).
Declaration: theorem d12_task_design_probabilistic
theorem d12_task_design_probabilistic :
∃ (agent : Agent) (action : Action) (w w₁ w₂ : World),
canTransition agent action w w₁ ∧
canTransition agent action w w₂ ∧
w₁ ≠ w₂ :=
output_nondeterministic
8.1.22. D13 Premise Negation Impact Propagation
Theorem, 4.2.
Rationale: P3 (governed learning -- retirement) + Section 8 (coherenceRequirement) + T5
When a premise is negated, identify dependent derivations and re-verify them. Generalizes Section 8's coherenceRequirement (priority-based review) to arbitrary dependency relationships.
Based on PropositionId.dependencies from Ontology.lean, defines impact set computation functions and basic properties.
8.1.23. Note on coherenceRequirement - Issue 243
The original d13_coherence_implies_propagation theorem was removed because it was
trivially-true: its conclusion was a direct restatement of its premise.
The root cause is that coherenceRequirement (Ontology.lean) has True as its conclusion,
making any theorem built on it vacuously true. Strengthening coherenceRequirement to use
a meaningful review obligation type (e.g., NeedsReview) would be a breaking change to
Ontology.lean and is deferred.
D13's substantive content is captured by:
- affected/d13_propagation: Impact set computation via transitive dependency closure
- d13_constraint_negation_has_impact: T4 negation produces non-empty impact
- d13_retirement_requires_feedback: P3 retirement presupposes T5
- assumptionImpact/d13_assumption_subsumes_proposition: Assumption-level propagation (#225)
- d13_assumption_impact_monotone: Monotonicity of assumption impact
D13: P3's retirement operation presupposes T5 (feedback). Without feedback, negation of premises cannot be detected.
Declaration: theorem d13_retirement_requires_feedback
theorem d13_retirement_requires_feedback :
∀ (w : World),
w.feedbacks = [] →
¬(∃ (f : Feedback), f ∈ w.feedbacks ∧ f.kind = .measurement) :=
fun _ hnil ⟨_, hf, _⟩ => by simp [hnil] at hf
Enumeration of all propositions. Used in affected computation.
Declaration: def allPropositions
def allPropositions : List PropositionId := [.t1, .t2, .t3, .t4, .t5, .t6, .t7, .t8, .e1, .e2, .p1, .p2, .p3, .p4, .p5, .p6, .l1, .l2, .l3, .l4, .l5, .l6, .d1, .d2, .d3, .d4, .d5, .d6, .d7, .d8, .d9, .d10, .d11, .d12, .d13, .d14, .d15, .d16, .d17, .d18]
Set of propositions that directly depend on proposition s (reverse edges). dependencies = "what it depends on"; dependents = "what depends on it".
Declaration: def PropositionId.dependents
def PropositionId.dependents (s : PropositionId) : List PropositionId := allPropositions.filter (fun p => propositionDependsOn p s)
Computes the impact set when premise s is negated. Transitive closure of the reverse dependency graph. The fuel parameter guarantees termination (depth <= 35 suffices since the graph is a DAG).
Incompleteness limitation: This function only tracks propagation among named propositions enumerated in PropositionId. By Gödel's first incompleteness theorem, impact on unnamed derivational consequences cannot be detected (see Ontology.lean §6.2 note).
Declaration: def affected
def affected (s : PropositionId) (fuel : Nat := 35) : List PropositionId :=
match fuel with
| 0 => []
| fuel' + 1 =>
let direct := s.dependents
let transitive := direct.flatMap (fun p => affected p fuel')
(direct ++ transitive).eraseDups
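The fueled transitive-closure pattern used by `affected` can be illustrated on a toy three-node graph (node names are illustrative, not project propositions):

```lean
inductive Node | a | b | c deriving BEq, Repr

-- Reverse edges: which nodes directly depend on the given node.
def dependents : Node → List Node
  | .a => [.b]
  | .b => [.c]
  | .c => []

-- Same shape as `affected`: direct dependents plus their closures,
-- with a fuel parameter guaranteeing termination.
def closure (n : Node) (fuel : Nat := 3) : List Node :=
  match fuel with
  | 0 => []
  | fuel' + 1 =>
    let direct := dependents n
    let transitive := direct.flatMap (fun m => closure m fuel')
    (direct ++ transitive).eraseDups

#eval closure .a  -- [Node.b, Node.c]
```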
Operational definition of D13: Impact propagation upon premise negation. Computes the impact set via affected, representing that each proposition requires re-verification.
Declaration: def d13_propagation
def d13_propagation (negated : PropositionId) : List PropositionId := affected negated
Negation of T4 (constraint) has the largest impact: T4 is the rationale for many propositions, so the impact set is large.
Declaration: theorem d13_constraint_negation_has_impact
theorem d13_constraint_negation_has_impact : (d13_propagation .t4).length > 0 := by native_decide
Negation of L5 (platform boundary) affects only D1: L5 is environment-dependent and close to a root node, so its impact is limited.
Declaration: theorem d13_l5_limited_impact
theorem d13_l5_limited_impact : (d13_propagation .l5).length ≤ (d13_propagation .t4).length := by native_decide
8.1.24. D13 Extension - Temporal Expiration of Assumptions
Extends D13 from PropositionId-level to assumption-level propagation.
Conditional axiom systems S=(A,C,H,D) derive conclusions D from assumptions C/H. C/H originate from external sources that change over time (#225). When an assumption expires (its external source changes or its review period elapses), all derivations depending on that assumption require re-verification.
This is the principle-level formalization. Operational types (TemporalValidity, AssumptionExpiration) are defined in Models/Assumptions/EpistemicLayer.lean. DesignFoundation.lean does not import Models to preserve the dependency direction (core formalization must not depend on model instances).
The bridge: an assumption maps to a set of PropositionIds it supports. When the assumption expires, all dependents of those PropositionIds are affected.
Computes the impact set when an assumption expires. An assumption supports a set of propositions (its dependencies). Expiration propagates through all dependents of those propositions.
This generalizes d13_propagation (single PropositionId) to
assumption-level expiration (set of PropositionIds).
Declaration: def assumptionImpact
def assumptionImpact (supportedPropositions : List PropositionId) : List PropositionId := (supportedPropositions.flatMap (fun p => affected p)).eraseDups
Raw (pre-dedup) impact set for assumption expiration. Used internally for proofs; assumptionImpact applies eraseDups on top.
Declaration: def assumptionImpactRaw
def assumptionImpactRaw (supportedPropositions : List PropositionId) : List PropositionId := supportedPropositions.flatMap (fun p => affected p)
(Derivation Card)
Derives from: d13_propagation (D13), affected (D13)
Proposition: D13 (extension)
Content: Assumption expiration subsumes individual proposition negation — if an assumption supports proposition p, then the raw impact of the assumption's expiration includes all of d13_propagation p.
Proof strategy: List.mem_flatMap propagation
Declaration: theorem d13_assumption_subsumes_proposition
theorem d13_assumption_subsumes_proposition :
∀ (p : PropositionId) (supported : List PropositionId),
p ∈ supported →
∀ (q : PropositionId),
q ∈ d13_propagation p →
q ∈ assumptionImpactRaw supported := by
intro p supported hp q hq
simp only [assumptionImpactRaw, d13_propagation] at *
exact List.mem_flatMap.mpr ⟨p, hp, hq⟩
(Derivation Card)
Derives from: assumptionImpactRaw (D13 extension)
Proposition: D13 (extension)
Content: Broader assumptions have broader impact — if s₁ ⊆ s₂ then assumptionImpactRaw s₁ ⊆ assumptionImpactRaw s₂. Monotonicity with respect to support set inclusion.
Proof strategy: Monotonicity of flatMap — superset of inputs produces superset of outputs
Declaration: theorem d13_assumption_impact_monotone
theorem d13_assumption_impact_monotone :
∀ (s₁ s₂ : List PropositionId),
(∀ p, p ∈ s₁ → p ∈ s₂) →
∀ q, q ∈ assumptionImpactRaw s₁ → q ∈ assumptionImpactRaw s₂ := by
intro s₁ s₂ hsub q hq
simp only [assumptionImpactRaw] at *
obtain ⟨p, hp, hpq⟩ := List.mem_flatMap.mp hq
exact List.mem_flatMap.mpr ⟨p, hsub p hp, hpq⟩
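The `List.mem_flatMap` step used in both proofs can be checked concretely on literals:

```lean
-- 4 arises from the element 2 of the input list, so it is a member of
-- the flatMap; enlarging the input list preserves this membership,
-- which is the monotonicity pattern above in miniature.
example : 4 ∈ [1, 2].flatMap (fun n => [n, n + 2]) := by decide
example : 4 ∈ [1, 2, 3].flatMap (fun n => [n, n + 2]) := by decide
```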
8.1.25. Correspondence between StructureKind and PropositionId
Connects the Structure-level partial order (Ontology.lean, structural consistency section) with the PropositionId-level dependency graph (this file, §D13). By answering the question "which axioms (PropositionId) does this Structure (file) depend on?", refines the tracing from end-point errors back to the axiom level.
Corresponds to ATMS labeling from the research document.
Set of PropositionIds corresponding to each StructureKind. manifest.md encompasses all axioms/postulates/principles T1-T8, E1-E2, P1-P6. designConvention encompasses design theorems D1-D18. skill/test/document are empty sets because their content is defined individually (room for future extension).
Declaration: def structurePropositions
def structurePropositions : StructureKind → List PropositionId
| .manifest => [.t1, .t2, .t3, .t4, .t5, .t6, .t7, .t8,
.e1, .e2, .p1, .p2, .p3, .p4, .p5, .p6]
| .designConvention => [.d1, .d2, .d3, .d4, .d5, .d6, .d7, .d8,
.d9, .d10, .d11, .d12, .d13, .d14,
.d15, .d16, .d17, .d18]
| .skill => []
| .test => []
| .document => []
Set of propositions affected at the PropositionId level by changes to a StructureKind. Structure change -> contained PropositionIds -> compute propagation targets via affected. Integrates two-layer dependency tracking into a single pipeline.
Declaration: def structureToPropositionImpact
def structureToPropositionImpact (k : StructureKind) : List PropositionId := (structurePropositions k).flatMap (fun p => affected p)
Changes to manifest have the widest proposition-level impact. Propagates to all dependents of T1-T8, E1-E2, P1-P6.
Declaration: theorem manifest_has_widest_impact
theorem manifest_has_widest_impact :
∀ (k : StructureKind),
(structureToPropositionImpact k).length ≤
(structureToPropositionImpact .manifest).length := by
intro k; cases k <;> native_decide
Changes to designConvention have non-empty proposition-level impact. Proves that dependents of D1-D17 exist.
Declaration: theorem design_convention_has_impact
theorem design_convention_has_impact : (structureToPropositionImpact .designConvention).length > 0 := by native_decide
8.1.26. D14 Constraint Satisfaction of Verification Order
Theorem, 4.2.
Rationale: P6 (constraint satisfaction) + T7 (resource finiteness) + T8 (precision level)
Under finite resources, verification order affects outcomes. The choice of ordering is included in P6's constraint satisfaction problem. Extension of D12.
8.1.27. What the Axiom System Does Not Determine
D14 derives that "verification order matters" but does not derive the optimal ordering method. Information gain, risk-order (fail-fast), and cost-order are all models satisfying D14. The choice of specific method is at the L6 (design convention) level.
(Derivation Card)
Derives from: task_is_constraint_satisfaction (P6)
Proposition: D14
Content: Verification order is part of the constraint satisfaction problem — when resources are finite (T7) and precision requirements exist (T8), the choice of verification order is within the scope of P6 constraint satisfaction.
Proof strategy: Direct application of task_is_constraint_satisfaction (P6) — same proof term as D12, applied to verification ordering context
Declaration: theorem d14_verification_order_is_csp
theorem d14_verification_order_is_csp :
∀ (task : Task) (agent : Agent),
agent.contextWindow.capacity > 0 →
task.resourceBudget ≤ globalResourceBound →
task.precisionRequired.required > 0 →
∀ (s : TaskStrategy),
s.task = task →
strategyFeasible s agent →
s.contextUsage ≤ agent.contextWindow.capacity ∧
s.resourceUsage ≤ globalResourceBound ∧
s.achievedPrecision > 0 :=
task_is_constraint_satisfaction
8.1.28. D15 Harness Engineering Theorems
Theorem group, §4.2.
Derived from empirical analysis of ForgeCode (Terminal-Bench 2.0 #1, 81.8%). ForgeCode's design decisions were mapped against T₀, and the following theorems were identified as derivable but previously unstated.
Reference: GitHub Issue #147, #148 (S1 analysis).
8.1.29. D15a Unbounded retry under finite resources is infeasible
Rationale: T7 (resource_finite) + T4 (output_nondeterministic)
Under finite resources (T7) and nondeterministic output (T4),
a strategy with unbounded retries cannot satisfy the resource constraint.
ForgeCode implements this as max_tool_failure_per_turn.
8.1.30. D15b Non-converging agent loops require human intervention
Rationale: T6 (human_resource_authority, resource_revocable) + T5 (no_improvement_without_feedback)
When an agent loop fails to converge, continued execution without human
feedback violates both T6 (human authority over resources) and T5
(no improvement without feedback). ForgeCode implements this as
max_requests_per_turn.
8.1.31. D15c Context eviction preserving feasibility
Rationale: T3 (context_finite) + T8 (task_has_precision) + P6 (task_is_constraint_satisfaction)
When context usage exceeds capacity (T3), evicting messages that do not contribute to precision (T8) preserves strategyFeasible (P6). ForgeCode implements this as the droppable flag + compaction strategy.
8.1.32. D15d Computation saturation
Rationale: context_contribution_nonuniform (T3) + task_has_precision (T8) + resource_finite (T7)
For every task, zero-return computation steps exist unconditionally. Combined with finite budgets (D15a), the optimal computation budget is strictly less than the total resource budget. The saturation point's existence is provable; its location requires external assessment (E1). This is the formal basis for satisficing and adaptive computation depth policies (Phase 3c triggers, progressive thinking).
(Derivation Card)
Proposition: D15
Theorem: d15a_unbounded_retry_infeasible
Derives from: T7 (resource_finite), T4 (output_nondeterministic)
Content: Under finite resources, a retry count must be bounded.
Proof strategy: Direct contradiction — n attempts exceeding globalResourceBound contradicts ≤ bound.
D15a: Under finite resources, a retry count must be bounded. If resourceUsage per attempt is positive and resources are globally bounded, then the number of feasible attempts is bounded.
Derivation: T7 (resource_finite) + T4 (output_nondeterministic).
ForgeCode: max_tool_failure_per_turn.
Declaration: theorem d15a_unbounded_retry_infeasible
theorem d15a_unbounded_retry_infeasible :
∀ (costPerAttempt : Nat),
costPerAttempt > 0 →
∀ (n : Nat),
n * costPerAttempt > globalResourceBound →
-- n attempts would exceed the global resource bound
-- therefore n is not feasible
¬(n * costPerAttempt ≤ globalResourceBound) := by
intro cost h_pos n h_exceed h_le
exact Nat.not_le.mpr h_exceed h_le
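A numeric instance of the D15a bound, with illustrative values (a hypothetical resource bound of 100 and cost 7 per attempt; the project's globalResourceBound is defined elsewhere):

```lean
-- 15 attempts would cost 105 > 100, so 15 is infeasible;
-- 14 attempts cost 98 ≤ 100 and are not ruled out by the bound.
example : ¬(15 * 7 ≤ 100) := by omega
example : 14 * 7 ≤ 100 := by omega
```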
D15b: An agent that does not converge must yield to human feedback. Resources are revocable by humans (T6), and improvement requires feedback (T5). Therefore, non-convergence implies the need for human intervention — not continued autonomous execution.
Derivation: T6 (resource_revocable) + T5 (no_improvement_without_feedback).
ForgeCode: max_requests_per_turn = 50.
Formalized as: for any resource allocation, a human can revoke it (restating T6 in the context of non-converging agent loops).
Declaration: theorem d15b_non_convergence_requires_human
theorem d15b_non_convergence_requires_human :
∀ (w : World) (alloc : ResourceAllocation),
alloc ∈ w.allocations →
∃ (w' : World) (human : Agent),
isHuman human ∧
validTransition w w' ∧
alloc ∉ w'.allocations :=
fun w alloc h_alloc =>
resource_revocable w alloc h_alloc
D15c: Evicting zero-precision-contribution context preserves feasibility. If a strategy is feasible and we reduce contextUsage while keeping all other dimensions unchanged, the strategy remains feasible.
Derivation: T3 (context_finite) + T8 (task_has_precision) + P6 (CSP).
ForgeCode: droppable messages + compaction strategy.
This captures the invariant that removing content that does not affect achievedPrecision or resourceUsage preserves strategyFeasible.
Declaration: theorem d15c_eviction_preserves_feasibility
theorem d15c_eviction_preserves_feasibility :
∀ (s : TaskStrategy) (agent : Agent),
strategyFeasible s agent →
∀ (s' : TaskStrategy),
s'.task = s.task →
-- eviction: context usage decreases or stays same
s'.contextUsage ≤ s.contextUsage →
-- resource usage unchanged
s'.resourceUsage = s.resourceUsage →
-- precision unchanged (evicted content had zero precision contribution)
s'.achievedPrecision = s.achievedPrecision →
strategyFeasible s' agent := by
intro s agent ⟨h_ctx, h_res, h_prec⟩ s' h_task h_ctx' h_res' h_prec'
exact ⟨Nat.le_trans h_ctx' h_ctx, h_res' ▸ (h_task ▸ h_res), h_prec' ▸ (h_task ▸ h_prec)⟩
8.1.33. D15d Computation Saturation
Rationale: context_contribution_nonuniform (T3) + task_has_precision (T8) + resource_finite (T7)
Under finite resources (T7), not all computation contributes to precision (T3 extension), and every task has a positive precision target (T8 structural). Therefore:
- Zero-return computation steps exist for every task (waste is universal)
- Resources that fund waste computation cannot improve precision
- The optimal computation budget is strictly less than the total resource budget
This establishes that a saturation point exists — a resource expenditure level beyond which additional computation yields zero marginal precision improvement. The saturation point exists before resource exhaustion (D15a), making D15a's bound a looser upper bound on useful computation.
Critical limitation (E1): While the saturation point's existence is provable, its location cannot be determined by the computing agent itself. The agent that reasons is the same agent that would need to judge "I've reasoned enough" — violating E1 (verification requires independence). Therefore:
- Structural budget limits (D15a) provide a hard upper bound
- Human intervention (D15b) provides the authoritative stopping signal
- Heuristic triggers (Phase 3c) approximate the saturation point operationally
Prior art:
- Simon (1956) "Satisficing": under search costs, stopping at "good enough" is optimal
- Russell & Wefald (1991) "Value of Computation": stop when VOC drops to zero
- Graves (2016) "Adaptive Computation Time": learned halting probabilities
- Banino et al. (2021) "PonderNet": probabilistic stopping via geometric prior
Operational instances in this project:
- Phase 3c triggers: same error 3x, axiom inflation 2x → strategy change
- ForgeCode progressive thinking: high reasoning turns 1-10, low turns 11+
- model-questioner termination: contradiction count reaches zero
A computation step: a unit of reasoning that consumes resources and may or may not contribute to task precision.
Models individual reasoning steps in a chain-of-thought, tool calls in an agentic loop, or any discrete computation consuming resources.
Declaration: structure ComputationStep
structure ComputationStep where
  /-- The context item produced or consumed by this step -/
  item : ContextItem
  /-- Resource cost of this step (always positive — computation is not free) -/
  cost : Nat
  /-- Proof that cost is positive (T7: resources are consumed) -/
  cost_pos : cost > 0 := by omega
  deriving Repr
The marginal precision return of a computation step for a given task. Wraps precisionContribution with computation-step semantics: a step with marginalReturn = 0 is "waste computation" that consumes resources without improving precision.
Declaration: def marginalReturn
def marginalReturn (step : ComputationStep) (task : Task) : Nat := precisionContribution step.item task
D15d: For every task, zero-return computation steps exist.
Unlike D16a (which requires precisionRequired > 0 as a premise),
D15d derives this from T8's structural guarantee (PrecisionLevel.required_pos),
eliminating the premise. This means: waste computation exists unconditionally
for every well-formed task — there is no task for which all possible computation
would be useful.
Combined with D15a (finite budget → finite steps), this establishes that the set of feasible computation steps always contains waste. The saturation point — where useful computation ends and waste begins — exists within the feasible budget, not at its boundary.
Derivation: context_contribution_nonuniform (T3) + PrecisionLevel.required_pos (T8).
Axiom count: 0 new axioms. Uses context_contribution_nonuniform from T₀.
Declaration: theorem d15d_computation_saturation
theorem d15d_computation_saturation :
∀ (task : Task),
∃ (step : ComputationStep),
marginalReturn step task = 0 := by
intro task
-- T8 structural: every task has positive precision requirement
have h_prec := task.precisionRequired.required_pos
-- T3 extension: zero-contribution items exist for any such task
obtain ⟨item, h_zero⟩ := context_contribution_nonuniform task h_prec
-- Wrap as ComputationStep with minimal cost
exact ⟨⟨item, 1, by omega⟩, h_zero⟩
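A toy marginal-return profile illustrating the saturation point (the numbers are illustrative only, not derived from the formalization):

```lean
-- Marginal precision return per computation step: steps 3 and beyond
-- are waste, so the saturation point (step 3) lies strictly inside
-- any budget of at least 4 steps.
def toyMarginal : Nat → Nat
  | 0 => 5
  | 1 => 3
  | 2 => 1
  | _ => 0

example : toyMarginal 2 > 0 ∧ toyMarginal 3 = 0 := by decide
```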
8.1.34. D16 Information Relevance Theorems
Theorem group, §4.2.
Derived from the new T₀ axiom context_contribution_nonuniform (T3 extension)
combined with existing axioms. These theorems formalize the consequences of
non-uniform information relevance identified in ForgeCode analysis #147/#150 (S3).
8.1.35. D16a Zero-contribution items exist and can be evicted
Rationale: context_contribution_nonuniform + D15c (eviction preserves feasibility)
8.1.36. D16b Input design affects output quality
Rationale: context_contribution_nonuniform + T4 (nondeterminism) + T8 (precision)
Applicable to B3 (tool naming alignment with training data) and B6 (prompt composition optimization).
8.1.37. D16c Resource allocation should follow contribution
Rationale: context_contribution_nonuniform + T7 (resource finite) + T3 (context finite)
Applicable to B5 (progressive thinking policy).
(Derivation Card)
Proposition: D16
Theorem: d16a_zero_contribution_items_exist
Derives from: T3 (context_contribution_nonuniform)
Content: For any task with positive precision, zero-contribution context items exist and can be evicted.
Proof strategy: Existential witness from context_contribution_nonuniform axiom.
D16a: For any task with positive precision requirements, there exist context items with zero precision contribution. These items can be evicted without affecting task precision.
Derivation: context_contribution_nonuniform (T3 extension).
ForgeCode: semantic search filters out irrelevant files (B1).
ForgeCode: droppable flag marks zero-contribution items (A1).
Declaration: theorem d16a_zero_contribution_items_exist
theorem d16a_zero_contribution_items_exist :
∀ (task : Task),
task.precisionRequired.required > 0 →
∃ (item : ContextItem),
precisionContribution item task = 0 :=
fun task h_prec => context_contribution_nonuniform task h_prec
D16b: Context composition affects task feasibility. Different context compositions yield different precision outcomes.
Specifically: for any task with positive precision requirements, there exist two distinct context items with different precision contributions. Therefore, which items are included in the finite context window (T3) affects the achievable precision.
This is the formal basis for:
- B3: Tool naming aligned with training data increases success probability
- B6: Prompt composition should prioritize high-contribution information
Derivation: context_contribution_nonuniform (zero-contribution item exists) + task_has_precision (T8: precision requirement > 0 implies some item must contribute). If zero-contribution items exist and precision must be achieved, then at least one item must have positive contribution — establishing that contributions are not all equal (i.e., composition matters).
Limitation: This theorem is an encoding theorem. It conjoins D16a's conclusion
with the premise h_prec, which is a trivially-true pattern (premise restated in
conclusion). The non-trivial claim "composition matters" is justified by the
docstring reasoning but not formally captured — proving existence of a
positive-contribution item would require additional axioms or making
precisionContribution non-opaque. Filed as a known formalization gap.
Declaration: theorem d16b_context_composition_matters
theorem d16b_context_composition_matters :
∀ (task : Task),
task.precisionRequired.required > 0 →
∃ (item : ContextItem),
precisionContribution item task = 0 ∧
task.precisionRequired.required > 0 :=
-- D16b differs from D16a in what it concludes:
-- D16a: zero-contribution items exist (pure existence)
-- D16b: zero-contribution items exist AND precision is required (conjunction)
-- The conjunction establishes that context composition is a NON-TRIVIAL
-- optimization problem: items that don't help coexist with a requirement
-- that demands help. Including zero-contribution items wastes finite
-- context capacity (T3) without advancing precision (T8).
fun task h_prec =>
let ⟨item, hitem⟩ := context_contribution_nonuniform task h_prec
⟨item, hitem, h_prec⟩
D16c: Under finite resources, allocating more resources to higher-contribution phases is rational. If zero-contribution items exist in the context (D16a), then the resource spent processing them is wasted under T7.
This is the formal basis for B5 (progressive thinking policy): phases with higher precision contribution deserve more cognitive resources.
Derivation: context_contribution_nonuniform + T7 (resource_finite).
Formalized as: given finite resources (T7, w parameter) and positive
precision requirements (T8), zero-contribution items exist (D16a).
The resource bound is accepted as a premise but the conclusion
depends only on D16a — the formal content is that waste exists
regardless of specific budget levels. The w/h_bound parameters
establish the resource-finite context without being used in the proof.
Declaration: theorem d16c_resource_follows_contribution
theorem d16c_resource_follows_contribution :
∀ (task : Task) (w : World),
task.precisionRequired.required > 0 →
(w.allocations.map (·.amount)).foldl (· + ·) 0 ≤ globalResourceBound →
∃ (item : ContextItem),
precisionContribution item task = 0 := by
intro task w h_prec _h_bound
exact context_contribution_nonuniform task h_prec
8.1.38. D18 Multi-Agent Coordination
Theorem, 4.2.
Rationale: T7b (sequential_exceeds_component) + D12 (task is CSP) + T3 (context finite)
When a task can be decomposed into independent subtasks (D12), and each subtask has positive execution duration (T7b), sequential execution costs more time than parallel execution. Under finite time budgets, parallel coordination is rational.
This is the platform-independent principle underlying Agent Teams, multi-agent frameworks (CrewAI, AutoGen), and subagent delegation patterns.
B4 root cause analysis (#276): The axiom system lacked T7b (temporal resource additivity). Without T7b, D12's task decomposition could not distinguish sequential from parallel execution, making multi-agent coordination underivable.
(Derivation Card)
Derives from: sequential_exceeds_component (T7b)
Proposition: D18
Content: For two independent tasks with positive duration, sequential execution
time strictly exceeds either task's individual duration. Therefore, if a time
budget constrains execution, parallel coordination reduces temporal cost.
Proof strategy: direct application of T7b
Declaration: theorem d18_parallel_reduces_temporal_cost
theorem d18_parallel_reduces_temporal_cost :
∀ (t1 t2 : Task),
executionDuration t1 > 0 →
executionDuration t2 > 0 →
executionDuration t1 + executionDuration t2 > executionDuration t1 :=
fun t1 t2 h1 h2 => (sequential_exceeds_component t1 t2 h1 h2).1
(Derivation Card)
Derives from: d18_parallel_reduces_temporal_cost (D18), context_finite (T3)
Proposition: D18
Content: Under finite context (T3), decomposing a task that exceeds single-agent
capacity (D12) into parallel subtasks is rational when time budget is constrained.
The time saved by parallelism can be allocated to other resource dimensions.
Proof strategy: combine T7b asymmetry with T3 capacity constraint
Declaration: theorem d18_coordination_rational_under_constraints
theorem d18_coordination_rational_under_constraints :
∀ (t1 t2 : Task),
executionDuration t1 > 0 →
executionDuration t2 > 0 →
-- Sequential exceeds parallel (= max component)
executionDuration t1 + executionDuration t2 > executionDuration t1 ∧
executionDuration t1 + executionDuration t2 > executionDuration t2 :=
fun t1 t2 h1 h2 => sequential_exceeds_component t1 t2 h1 h2
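The sequential-vs-parallel asymmetry behind T7b, in a self-contained form: the sequential cost is the sum of the durations, while the idealized parallel makespan is the larger component, so the sum strictly exceeds each component whenever both durations are positive:

```lean
-- Durations abstracted to plain naturals; mirrors the shape of
-- sequential_exceeds_component without the project's Task type.
example (t1 t2 : Nat) (h1 : 0 < t1) (h2 : 0 < t2) :
    t1 + t2 > t1 ∧ t1 + t2 > t2 := by
  omega
```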
8.1.39. D17 Deductive Design Workflow
Definitional Extension + Theorem, 5.5/4.2.
Rationale: T5 (feedback) + D3 (observability first) + P3 (governed learning) + T6 (human authority) + D5 (spec/test/impl) + D9 (self-maintenance) + E1 (verification independence) + D2 (worker/verifier separation) + D13 (impact propagation)
A valid design derivation for a platform must proceed through a conditional axiom system — not directly from core axioms. The workflow is:
- Investigate: Observe the target environment (T5 + D3)
- Extract: Form assumptions C/H from observations (P3 + T6)
- Construct: Build conditional axiom system from core + assumptions (D5 + D9)
- Derive: Produce design decisions from conditional axioms (D1-D16)
- Validate: Independently verify derived design (E1 + D2)
- Feedback: Propagate invalidation through dependencies (T5 + D13)
The ordering is a partial order derived from the axiom dependency structure, not an arbitrary convention. Steps cannot be reordered because each step's premises require the output of prior steps.
Steps of the deductive design workflow. Each step produces output required by subsequent steps.
Declaration: inductive DeductiveDesignStep
inductive DeductiveDesignStep where
  | investigate -- environment investigation: T5 (feedback requires observation) + D3 (observability first)
  | extract -- assumption extraction: P3 (governed learning: observation → hypothesis) + T6 (human judgment)
  | construct -- axiom system construction: D5 (spec/test/impl triple) + D9 (self-maintenance)
  | derive -- design derivation: D1-D16 applied through conditional axiom system
  | validate -- verification: E1 (verification independence) + D2 (worker/verifier)
  | feedback -- feedback loop: T5 (no improvement without feedback) + D13 (impact propagation)
  deriving BEq, Repr
8.1.40. Note on step ordering
Step ordering was previously expressed via DeductiveDesignStep.ord and
encoding theorems (d17_investigate_before_extract, etc.). These were removed
because the ordering is now structurally enforced by the state machine:
WorkflowState.currentStep computes the next step from the Option fields,
and applyTransition rejects out-of-order transitions by returning none.
The axiom-level justification for WHY each step precedes the next remains valid and is documented in the DeductiveDesignStep constructors' comments:
- investigate before extract: T5 + D3 (observe before hypothesize)
- extract before construct: P3 + T6 (hypothesize before integrate)
- construct before derive: D5 + D9 (specify before implement)
- derive before validate: E1 + D2 (generate before verify)
- validate before feedback: T5 + D13 (measure before propagate)
8.1.41. D17 Step Output Types - Issue 262
Each step produces a typed output consumed by the next step. The type connection replaces the encoding theorem ordering with a structural dependency: step N's output type IS step N+1's input.
Output of Step 0 (investigate): collected platform design decisions. T4 (nondeterministic output) means a single investigation may miss PDs. Mitigation: parallel agents + iterative loop + category coverage check.
Declaration: structure InvestigationReport
structure InvestigationReport where
  platformName : String
  decisionCount : Nat
  sourceCount : Nat
  /-- Number of independent investigation passes (T4 mitigation: parallel agents) -/
  investigationPasses : Nat
  /-- Number of platform primitive categories covered (structural completeness check) -/
  categoriesCovered : Nat
  /-- Total known categories for this platform -/
  categoriesTotal : Nat
  deriving Repr
Investigation completeness check. Requires multiple passes (T4 mitigation) and category coverage.
Declaration: def investigateStepValid
def investigateStepValid (r : InvestigationReport) : Bool :=
  r.investigationPasses ≥ 2 &&                 -- At least 2 independent passes
  r.categoriesCovered == r.categoriesTotal &&  -- All categories covered
  r.decisionCount > 0 &&
  r.sourceCount > 0
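As a minimal sketch (assuming the declarations above compile as given; all field values are hypothetical), the completeness gate rejects a single-pass investigation, since T4 mitigation requires at least two independent passes:

```lean
-- Hypothetical values for illustration only.
-- Fails: investigationPasses = 1 < 2, so the T4-mitigation check is not met.
#eval investigateStepValid
  { platformName := "ExamplePlatform", decisionCount := 10, sourceCount := 4,
    investigationPasses := 1, categoriesCovered := 5, categoriesTotal := 5 }
-- false
```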
Output of Step 1 (extract): assumptions with epistemic source tracking.
Declaration: structure AssumptionSet
structure AssumptionSet where
  humanDecisionCount : Nat
  llmInferenceCount : Nat
  allHaveTemporalValidity : Bool
  deriving Repr
Output of Step 2 (construct): conditional axiom system build result.
Declaration: structure ConditionalAxiomBuildResult
structure ConditionalAxiomBuildResult where
  axiomCount : Nat
  theoremCount : Nat
  sorryCount : Nat
  buildSuccess : Bool
  deriving Repr
Output of Step 3 (derive): derived design decisions.
Declaration: structure DerivationOutput
structure DerivationOutput where
  decisionCount : Nat
  allAxiomsUsed : Bool
  deriving Repr
Output of Step 4 (validate): accuracy measurement.
Declaration: structure ValidationMetrics
structure ValidationMetrics where
  totalPD : Nat
  matchCount : Nat
  partialCount : Nat
  missCount : Nat
  deriving Repr
Verification risk level for step transitions. Maps to D2's VerificationRisk via the transition's impact on soundness.
Declaration: def stepTransitionRisk
def stepTransitionRisk : DeductiveDesignStep → VerificationRisk
  | .investigate => .high     -- 0→1: investigation completeness determines all downstream quality
  | .extract => .high         -- 1→2: assumption correctness
  | .construct => .high       -- 2→3: axiom system soundness
  | .derive => .moderate      -- 3→4: derivation traceability
  | .validate => .moderate    -- 4→5: measurement reproducibility
  | .feedback => .low         -- terminal step
(Derivation Card)
Derives from: D2 (VerificationRisk, requiredConditions), stepTransitionRisk
Proposition: D17 (intermediate verification)
Content: Steps with high-risk transitions (investigate, extract, construct) require at least 3/4 independence conditions for verification. investigate is high-risk because its completeness determines all downstream quality (D13: impact propagation from a Step 0 deficiency reaches Step 5).
Declaration: theorem d17_high_risk_transitions_need_hook_verify
theorem d17_high_risk_transitions_need_hook_verify :
    requiredConditions (stepTransitionRisk .investigate) ≥ 3 ∧
    requiredConditions (stepTransitionRisk .extract) ≥ 3 ∧
    requiredConditions (stepTransitionRisk .construct) ≥ 3 := by
  simp [stepTransitionRisk, requiredConditions]
(Derivation Card)
Derives from: stepTransitionRisk, ConditionalAxiomBuildResult
Proposition: D17 (construct soundness)
Content: A valid construct step must produce sorryCount = 0 and buildSuccess = true. These are necessary conditions for the conditional axiom system to be sound.
Declaration: def constructStepValid
def constructStepValid (r : ConditionalAxiomBuildResult) : Bool :=
  r.sorryCount == 0 && r.buildSuccess
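A minimal illustration (hypothetical values, assuming the definitions above): a successful build that still contains a sorry fails the construct gate, since both conditions are necessary:

```lean
-- Hypothetical values: the build succeeded, but one sorry remains.
#eval constructStepValid
  { axiomCount := 12, theoremCount := 40, sorryCount := 1, buildSuccess := true }
-- false
```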
(Derivation Card)
Derives from: AssumptionSet, TemporalValidity (#225)
Proposition: D17 (extract completeness)
Content: A valid extract step must produce assumptions that all have TemporalValidity. This is required by #225 (temporal validity is a fundamental property of conditional axiom systems).
Declaration: def extractStepValid
def extractStepValid (a : AssumptionSet) : Bool :=
  a.allHaveTemporalValidity && a.humanDecisionCount + a.llmInferenceCount > 0
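A minimal illustration (hypothetical values, assuming the definitions above): an assumption set lacking temporal validity fails the extract gate even when assumptions are present:

```lean
-- Hypothetical values: 5 assumptions exist, but temporal validity is not established.
#eval extractStepValid
  { humanDecisionCount := 2, llmInferenceCount := 3, allHaveTemporalValidity := false }
-- false
```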
8.1.42. D17 State Machine
Replaces the encoding-theorem ordering with a state transition system. The workflow state accumulates step outputs. Transitions are gated: high-risk steps (extract, construct) require validity proofs as preconditions. Feedback loops reset state to the appropriate step based on D13 impact scope.
Workflow state: accumulates outputs of completed steps.
Declaration: structure WorkflowState
structure WorkflowState where
  investigation : Option InvestigationReport
  assumptions : Option AssumptionSet
  axiomSystem : Option ConditionalAxiomBuildResult
  derivation : Option DerivationOutput
  validation : Option ValidationMetrics
  iteration : Nat
  deriving Repr
Compute current step from state (no encoding theorem needed).
Declaration: def WorkflowState.currentStep
def WorkflowState.currentStep (s : WorkflowState) : DeductiveDesignStep :=
  if s.investigation.isNone then .investigate
  else if s.assumptions.isNone then .extract
  else if s.axiomSystem.isNone then .construct
  else if s.derivation.isNone then .derive
  else if s.validation.isNone then .validate
  else .feedback
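The structural enforcement can be sketched directly (assuming the definitions above compile as shown): once the investigation field is populated, currentStep advances to extract by mere computation, with no ordering theorem required:

```lean
-- Completing investigation moves the computed step from investigate to extract.
example (r : InvestigationReport) :
    WorkflowState.currentStep
      { WorkflowState.initial with investigation := some r } = .extract := rfl
```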
Initial workflow state: all steps pending.
Declaration: def WorkflowState.initial
def WorkflowState.initial : WorkflowState :=
{ investigation := none, assumptions := none, axiomSystem := none,
derivation := none, validation := none, iteration := 0 }
Feedback actions determine reset scope (D13 impact propagation).
Declaration: inductive FeedbackAction
inductive FeedbackAction where
  | addAssumption (content : String)
  | extendCoreAxiom (content : String)
  | markOutOfScope (pdId : String)
  | improveWorkflow (content : String)
  deriving Repr
Workflow transitions with verify gates on high-risk steps.
Declaration: inductive WorkflowTransition
inductive WorkflowTransition where
| completeInvestigation (report : InvestigationReport)
(verified : investigateStepValid report = true)
| completeExtraction (aset : AssumptionSet)
(verified : extractStepValid aset = true)
| completeConstruction (result : ConditionalAxiomBuildResult)
(verified : constructStepValid result = true)
| completeDerivation (output : DerivationOutput)
| completeValidation (metrics : ValidationMetrics)
| feedbackLoop (action : FeedbackAction)
Apply a transition to the current state. Returns none if the transition is invalid for the current step.
Declaration: def applyTransition
def applyTransition (s : WorkflowState) (t : WorkflowTransition) : Option WorkflowState :=
match t with
| .completeInvestigation r _ =>
if s.currentStep == .investigate then some { s with investigation := some r }
else none
| .completeExtraction a _ =>
if s.currentStep == .extract then some { s with assumptions := some a }
else none
| .completeConstruction r _ =>
if s.currentStep == .construct then some { s with axiomSystem := some r }
else none
| .completeDerivation o =>
if s.currentStep == .derive then some { s with derivation := some o }
else none
| .completeValidation m =>
if s.currentStep == .validate then some { s with validation := some m }
else none
| .feedbackLoop action =>
if s.currentStep == .feedback then
match action with
| .addAssumption _ =>
some { s with assumptions := none, axiomSystem := none,
derivation := none, validation := none,
iteration := s.iteration + 1 }
| .extendCoreAxiom _ =>
some { WorkflowState.initial with iteration := s.iteration + 1 }
| .markOutOfScope _ =>
some { s with validation := none, iteration := s.iteration + 1 }
| .improveWorkflow _ =>
some { WorkflowState.initial with iteration := s.iteration + 1 }
else none
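As a sketch of the gating behavior (assuming the definitions above; the literal values are hypothetical), an out-of-order transition from the initial state is rejected:

```lean
-- From the initial state the current step is investigate, so attempting to
-- complete derivation (step 3) is rejected: applyTransition returns none.
example :
    applyTransition WorkflowState.initial
      (.completeDerivation { decisionCount := 3, allAxiomsUsed := true }) = none := rfl
```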
(Derivation Card)
Derives from: WorkflowState.initial, WorkflowState.currentStep
Proposition: D17 (initial state)
Content: The initial state starts at the investigate step. This is structural, not an encoding theorem: the currentStep function computes the step from the Option fields.
Declaration: theorem d17_initial_starts_at_investigate
theorem d17_initial_starts_at_investigate :
    WorkflowState.initial.currentStep = .investigate := by rfl
Basic properties of the state machine's initial state, combined into a single theorem: the initial workflow state starts at the investigate step and has iteration count 0.
Declaration: theorem d17_state_machine_properties
theorem d17_state_machine_properties :
    WorkflowState.initial.currentStep = .investigate ∧
    WorkflowState.initial.iteration = 0 := by
  exact ⟨rfl, rfl⟩
8.1.43. Sorry Inventory: DesignFoundation
No sorry.
D1–D17 use no new non-logical axioms (§4.1) except D15–D16.
D15–D16 use context_contribution_nonuniform (a T₀ extension in Axioms.lean).
The former D17 step-ordering theorems were encoding theorems (axiom connections justified the choice of ord values, with definitional proofs via simp on Nat literals); they have since been replaced by the structural state machine (§8.1.40).
All theorems (§4.2) are proven by direct application of existing axioms (T/E/P/V) or by case analysis on inductive types (§7.2).
Each principle D1–D17 is guaranteed by type-checking to be derivable (§2.4, derivability) from the manifesto's axiom system. D15–D16 are derivable from the extended axiom system (T₀ + context_contribution_nonuniform).
This file consists solely of definitional extensions (§5.5), and conservative extension is guaranteed by definitional_implies_conservative, proven in Terminology.lean.
8.1.44. Known Formalization Gaps
Sorry Inventory.
| D | Gap | Impact |
|---|-----|--------|
| D3 | The 3 observability conditions (measurable / degradation-detectable / improvement-verifiable) are not structured | 3 theorems exist, but the condition structure is not formalized |
| D5 | Inter-layer relations of spec/test/implementation are not formalized | 3 theorems exist, but transitive dependencies between layers are not formalized |
| D6 | The causal chain boundary → mitigation → variable is not formalized | 3 theorems exist, but the causal chain is not formalized |
8.1.45. Structural Enforcement of Section 7
Self-Application.
Via the SelfGoverning type class (§9.4, Ontology.lean), the DesignPrinciple type defining D1–D17 satisfies:
- Applicability of compatibility classification (canClassifyUpdate)
- Exhaustiveness of classification (classificationExhaustive)
Since calling governed_update_classified requires [SelfGoverning α], types that do not implement SelfGoverning cannot be used in the self-application context, so missing implementations are detected as type errors.