Axiomatic Reasoning for LLMs

This is a snapshot of the plan as of 2026/04/02

Implementation Plan in Python

1. Overview

This document defines a technical implementation plan for applying Deep Coding—a specification-driven, human–AI collaborative development methodology—to Python software projects. The plan translates the core principles of Deep Coding into concrete Python constructs, toolchains, and phase-based workflows.

The implementation rests on four technical pillars:

  1. A structural specification layer built on Python's type system (section 2.1).
  2. A skeleton–tissue architecture separating human-maintained and AI-generated code (section 2.2).
  3. A generative conformance loop gated by automated verification (section 2.3).
  4. Fixed premise management across development phases (section 2.4).

All execution steps, evaluation metrics, and verification gates are defined in Python-native terms, avoiding philosophical framing in favor of actionable technical procedures.


2. Core Technical Components

2.1 Structural Specification Layer

The specification is expressed using Python's static and runtime type systems: typing.Protocol classes define structural interfaces verified statically by mypy, and pydantic models define data shapes validated at runtime.

Example specification fragment:

from typing import Any, Protocol
from pydantic import BaseModel

class EntitySpec(Protocol):
    def update(self, delta_time: float) -> None: ...
    def render(self, surface: Any) -> None: ...

class GameState(BaseModel):
    score: int
    lives: int
    invincible_frames: int = 0
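
As a quick illustration of the runtime half of this layer, the GameState model above rejects non-conforming data at construction time. A minimal sketch, assuming pydantic v2 and restating the model so the snippet is self-contained:

```python
from pydantic import BaseModel, ValidationError

class GameState(BaseModel):
    score: int
    lives: int
    invincible_frames: int = 0

# Valid data passes and defaults are filled in.
state = GameState(score=0, lives=3)
assert state.invincible_frames == 0

# Non-conforming data fails fast at the validation gate.
try:
    GameState(score="high", lives=3)
except ValidationError as exc:
    print(f"rejected: {len(exc.errors())} validation error(s)")
```

This is the same check the Runtime Validation gate (section 2.3, step 4) exercises in bulk across all data models.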

2.2 Skeleton–Tissue Architecture

Skeleton (Manually Maintained, AI-Inaccessible): the execution flow and abstract interfaces (for example, a main loop and the abstract methods it calls), owned and edited only by humans.

Tissue (AI-Generated, Regenerable): the concrete implementations of the skeleton's abstract hooks, which can be discarded and regenerated from the specification at any time.

The skeleton guarantees architectural invariants: because the skeleton owns the execution flow (e.g., a run method that calls abstract methods), any AI-generated tissue code is constrained by the skeleton’s calling order and error boundaries.
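
A minimal sketch of this constraint, with illustrative class and method names (the plan does not mandate these):

```python
from abc import ABC, abstractmethod

class EngineSkeleton(ABC):
    """Skeleton code: owns execution flow; tissue fills in the abstract hooks."""

    def run(self, frames: int) -> None:
        for _frame in range(frames):
            try:
                # Invariant enforced by the skeleton: update always
                # precedes render within a frame.
                self.update(delta_time=1 / 60)
                self.render(surface=None)
            except Exception as exc:
                # Skeleton-owned error boundary: failures in tissue code
                # cannot escape the frame loop.
                self.on_error(exc)

    @abstractmethod
    def update(self, delta_time: float) -> None: ...

    @abstractmethod
    def render(self, surface) -> None: ...

    def on_error(self, exc: Exception) -> None:
        print(f"frame error contained: {exc!r}")
```

Tissue code subclasses EngineSkeleton and implements only update and render; it has no way to reorder the calls or bypass the error boundary.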

2.3 Generative Conformance Loop

For each module or feature, the generation process follows:

  1. Specification Update: Modify the structural specification (protocols, pydantic models, invariants).
  2. Generation Request: The AI generates implementation code that conforms to the updated specification.
  3. Static Verification: Run mypy --strict to verify type conformance.
  4. Runtime Validation: Execute pydantic model validations and unit tests.
  5. Commit: If all gates pass, commit the new specification and generated code.

Any failure at steps 3 or 4 triggers a regeneration attempt. The system does not proceed to the next phase until verification passes.
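
The loop above can be sketched as a driver function. The gate commands come from section 4; generate_tissue is a hypothetical callback standing in for the AI invocation described in section 5:

```python
import subprocess

MAX_ATTEMPTS = 3  # aligns with the "Regeneration Iterations <= 3" metric

def run_gate(cmd: list[str]) -> bool:
    """Run one verification gate; the pass condition is a zero exit code."""
    return subprocess.run(cmd).returncode == 0

def conformance_loop(generate_tissue, gate=run_gate, max_attempts=MAX_ATTEMPTS) -> bool:
    """Drive steps 2-5; returns True once all gates pass, False if blocked."""
    gates = [
        ["mypy", "--strict", "."],  # step 3: static verification
        ["pytest"],                 # step 4: runtime validation
    ]
    for _attempt in range(max_attempts):
        generate_tissue()  # step 2: generation request (hypothetical callback)
        if all(gate(cmd) for cmd in gates):
            return True    # step 5: caller commits spec + generated code
    return False           # regeneration budget exhausted; phase is blocked
```

Injecting the gate runner keeps the loop testable without invoking the real toolchain.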

2.4 Fixed Premise Management

After each development phase, a structured summary is committed as a fixed premise. The summary includes:

This summary is stored in a version-controlled file (e.g., FIXED_PREMISE.md or as a structured YAML file). Subsequent phases treat this summary as immutable. Changes to the specification must be explicitly planned and approved before the next phase.
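
A sketch of committing such a summary as a structured file. The field names are illustrative, and JSON is used here only to stay within the standard library (the plan equally allows Markdown or YAML):

```python
import json
from pathlib import Path

def commit_fixed_premise(phase: int, spec_files: list[str], decisions: list[str]) -> Path:
    """Write the phase summary that subsequent phases treat as immutable."""
    premise = {
        "phase": phase,
        "specification_files": spec_files,  # protocols/models frozen this phase
        "decisions": decisions,             # approved deltas, now fixed
    }
    path = Path(f"fixed_premise_phase_{phase}.json")
    path.write_text(json.dumps(premise, indent=2))
    return path  # caller commits this file to version control
```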


3. Phase-Based Implementation Workflow

Development proceeds in phases, each consisting of the following steps:

Phase Template

  1. Plan: Define the specification changes for this phase.
  2. Approval: Human reviews and approves the specification delta.
  3. Generation: AI generates implementation code from the updated specification.
  4. Static Gate: mypy --strict passes with zero errors.
  5. Test Gate: pytest passes all existing tests plus new tests for added functionality.
  6. Validation Gate: pydantic runtime validation passes for all data models.
  7. Summary: Generate and commit the fixed premise for this phase.
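
The ordering constraint in this template can be sketched as a small step tracker (names illustrative; the plan only requires that each step complete before the next):

```python
from dataclasses import dataclass, field

# The seven template steps, in their mandated order.
STEPS = ["plan", "approval", "generation", "static_gate",
         "test_gate", "validation_gate", "summary"]

@dataclass
class Phase:
    name: str
    completed: list[str] = field(default_factory=list)

    def complete(self, step: str) -> None:
        """Mark a step done; out-of-order completion is rejected."""
        expected = STEPS[len(self.completed)]
        if step != expected:
            raise ValueError(f"step {step!r} attempted before {expected!r}")
        self.completed.append(step)

    @property
    def done(self) -> bool:
        return self.completed == STEPS
```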

Example Phase Sequence for a Python Game Engine

Phase 1: Core Engine Skeleton

Phase 2: Player and Input Handling

Phase 3: Enemies and Collision

Phase 4: Data-Driven Stage System

Phase 5: UI and Editor Integration

Phase 6: Optimization and Polish


4. Technical Verification Gates

All generated code must pass the following automated checks before phase completion:

| Gate | Tool | Command | Pass Condition |
|---|---|---|---|
| Type Safety | mypy | mypy --strict --no-implicit-optional . | 0 errors |
| Linting | ruff | ruff check . | 0 errors, 0 warnings |
| Formatting | black | black --check . | No formatting changes required |
| Unit Tests | pytest | pytest --cov=. --cov-fail-under=80 | 100% pass rate, coverage >= 80% |
| Docstring Coverage | pydocstyle | pydocstyle . | 0 violations |
| Complexity | radon | radon cc -a -nb . | Average cyclomatic complexity < 10 |

These gates are implemented as pre-commit hooks and CI pipeline checks, ensuring that any generated code that fails verification is rejected before integration.


5. AI Integration Interface

The AI (LLM) is invoked through a structured prompt that provides:

The AI’s output is restricted to:

The AI does not have permission to modify:

This interface is implemented as a function call or API endpoint that wraps the LLM with these constraints.
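
A sketch of the write-side constraint such a wrapper would enforce, with an illustrative tissue/ vs skeleton/ layout (the plan does not fix directory names):

```python
from pathlib import Path

TISSUE_DIR = Path("tissue")      # illustrative: the only AI-writable area
SKELETON_DIR = Path("skeleton")  # AI-inaccessible, per the metric in section 6

def apply_generated_code(files: dict[str, str]) -> None:
    """Write AI output to disk, refusing anything outside the tissue area."""
    for rel_path, source in files.items():
        target = Path(rel_path)
        if SKELETON_DIR in target.parents or TISSUE_DIR not in target.parents:
            raise PermissionError(f"AI may not modify {rel_path}")
        target.parent.mkdir(parents=True, exist_ok=True)
        target.write_text(source)
```

The actual LLM call sits behind this function; whatever the model emits, only paths under tissue/ ever reach the working tree.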


6. Evaluation Metrics

The success of the implementation is measured by:

| Metric | Target | Measurement Method |
|---|---|---|
| Skeleton Modification Count | 0 per phase | Git diff on skeleton/ directory |
| Type Check Pass Rate | 100% | mypy exit code |
| Test Pass Rate | 100% | pytest exit code |
| Regeneration Iterations | <= 3 per phase | Count of generation attempts before gate pass |
| Specification Change Lead Time | < 15 minutes | Time from spec change to passing gates |
| Code Churn | < 5% per feature | Lines changed / lines added across phases |
| Human Intervention Ratio | < 10% of phases | Phases requiring human code edits / total phases |

These metrics are collected automatically via CI logs and version control history.
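
As an example of metric collection from version control, the skeleton modification count can be derived from git diff on the skeleton/ directory (phase tag names are illustrative):

```python
import subprocess

def count_changed_files(diff_output: str) -> int:
    """Count non-empty lines from a `git diff --name-only` listing."""
    return len([line for line in diff_output.splitlines() if line.strip()])

def skeleton_modification_count(phase_start: str, phase_end: str) -> int:
    """Files changed under skeleton/ between two phase tags (target: 0)."""
    result = subprocess.run(
        ["git", "diff", "--name-only", phase_start, phase_end, "--", "skeleton/"],
        capture_output=True, text=True, check=True,
    )
    return count_changed_files(result.stdout)
```

Splitting the pure counting step from the git invocation keeps the metric testable without a repository.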


7. Toolchain Requirements

The required toolchain follows directly from the verification gates in section 4: mypy, ruff, black, pytest (with pytest-cov for coverage enforcement), pydantic, pydocstyle, and radon.

Optional for advanced metrics:


8. Boundary Conditions and Limitations

This implementation plan assumes:

When these conditions are not met:


9. Conclusion

This implementation plan provides a concrete, toolchain-integrated pathway for adopting Deep Coding in Python projects. By encoding architectural constraints in Python’s native type system, enforcing generative conformance through automated verification gates, and maintaining fixed premises across recursive refinement phases, the plan translates theoretical principles into executable development workflows.

The result is a development process where:

This is not a philosophical proposal but a technically executable methodology grounded in existing Python tools and established software engineering practices.