This document describes a technical architecture for a natural language database designed specifically for large language models (LLMs) to use as long-term memory. Unlike vector databases or traditional RAG pipelines, this system stores memory as structured natural language summaries with explicit relational tags, version history, and confidence scores. The LLM can read, write, and update this memory directly, enabling recursive self‑improvement without context overflow.
The architecture is built on three operational principles extracted from prior work:
Each memory entry is a structured summary that maximises inferential information density (ID), defined as the entry's Kolmogorov complexity divided by the average number of inference steps needed to understand or modify it: ID = K(entry) / avg_inference_steps.
The goal is to pack maximum constraint into minimal tokens while keeping the content interpretable by an LLM.
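Kolmogorov complexity is uncomputable, so any working measure of ID has to use a proxy. The sketch below uses zlib-compressed byte length as that proxy; the function name and the idea of passing the step count as a plain number are illustrative assumptions, not part of the specification.

```python
import zlib

def inferential_density(summary: str, avg_inference_steps: float) -> float:
    """Approximate ID = K(summary) / avg inference steps.

    True Kolmogorov complexity is uncomputable; the zlib-compressed byte
    length of the summary serves as a crude upper-bound proxy (assumption).
    """
    k_proxy = len(zlib.compress(summary.encode("utf-8")))
    return k_proxy / avg_inference_steps

# A tight constraint understood in few steps scores higher than a vague
# statement that takes more reasoning to apply.
dense = inferential_density("All ACTIVE entries must cite a FOUNDATION reference.", 2.0)
loose = inferential_density("Stuff should maybe reference other stuff somehow.", 4.0)
```

Halving the inference-step count doubles the score, which matches the intuition that easier-to-apply constraints are denser memory.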
When a coherent specification is repeatedly applied, the LLM’s internal reasoning undergoes a phase shift from associative generation to structured inference. This resonance enables the model to treat the memory database as a reliable execution environment rather than a loose collection of facts.
The strict separation between the invariant architecture and LLM-generated content acts as a physical firewall: generated code can never violate architectural invariants, because it has no write access to them.
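A minimal sketch of this firewall, assuming the invariant fields live alongside mutable content in the same entry record (the field names here are illustrative, not prescribed by the specification):

```python
# Generated content may only touch mutable fields; writes to the invariant
# schema fields are rejected at the storage boundary. Field names are
# assumptions for illustration.
INVARIANT_FIELDS = {"id", "layer", "created_at", "schema_version"}

def guard_update(changes: dict) -> dict:
    """Reject any update that touches an invariant field."""
    violations = INVARIANT_FIELDS & changes.keys()
    if violations:
        raise ValueError(f"update touches invariant fields: {sorted(violations)}")
    return changes

guard_update({"summary": "revised text", "confidence": 0.8})  # accepted
# guard_update({"layer": "META"})  # would raise ValueError
```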
Each entry contains the following fields:
["Reasoning", "Formal Systems"]ACTIVE, DEPRECATED, CONFLICT, PENDING_VERIFICATIONPHYSICAL, SEMANTIC, METAREFERENCE, CONTRAST, HIERARCHY, APPLICATION, FOUNDATION) and a weight.PRECONDITION, EXTENSION, CONTRADICTION, EXAMPLE, DERIVATION) and strength.The database exposes four basic operations:
- `add(entry)` → returns a new entry ID
- `get(entry_id)` → returns the entry
- `update(entry_id, changes)` → applies modifications and returns the updated entry
- `delete(entry_id)` → logical deletion (sets status to `DEPRECATED`)

Search combines three retrieval strategies:
- Semantic search over stored entry embeddings
- Keyword search over summary text
- Tag search with ANY/ALL logic

A HYBRID strategy merges results from all three and ranks by confidence.
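The HYBRID merge step can be sketched as follows; the `(entry_id, confidence)` pair shape and the keep-the-maximum merge rule are assumptions chosen for illustration, not mandated by the specification:

```python
def hybrid_search(semantic, keyword, tag_based):
    """Merge three ranked result lists and re-rank by confidence.

    Each input is a list of (entry_id, confidence) pairs; the merge keeps
    the highest confidence seen for each entry across all strategies.
    """
    best = {}
    for results in (semantic, keyword, tag_based):
        for entry_id, conf in results:
            if conf > best.get(entry_id, 0.0):
                best[entry_id] = conf
    # Highest-confidence entries first.
    return sorted(best.items(), key=lambda kv: kv[1], reverse=True)

hybrid_search([("e1", 0.9)], [("e2", 0.7), ("e1", 0.4)], [("e3", 0.8)])
# → [("e1", 0.9), ("e3", 0.8), ("e2", 0.7)]
```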
When an entry changes, the system propagates the change along reference edges. Propagation depth is configurable (default max 5). Bidirectional propagation is limited to depth 2 to avoid excessive cascades. The algorithm uses a breadth‑first queue and assesses impact per edge to decide whether a derived change is needed.
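The breadth-first traversal with a depth limit can be sketched as below; the per-edge impact assessment is reduced here to a caller-supplied predicate, and the adjacency-dict graph shape is an assumption for illustration:

```python
from collections import deque

def propagate_update(graph, start, max_depth=5, needs_change=lambda src, dst: True):
    """Breadth-first propagation along reference edges with a depth limit.

    `graph` maps entry_id -> list of referencing entry_ids. The spec's
    per-edge impact assessment is modeled as the `needs_change` predicate.
    """
    visited = {start}
    queue = deque([(start, 0)])
    touched = []
    while queue:
        node, depth = queue.popleft()
        if depth == max_depth:
            continue  # depth limit reached: do not expand further
        for neighbor in graph.get(node, []):
            if neighbor not in visited and needs_change(node, neighbor):
                visited.add(neighbor)
                touched.append(neighbor)
                queue.append((neighbor, depth + 1))
    return touched

graph = {"a": ["b"], "b": ["c"], "c": ["d"]}
propagate_update(graph, "a", max_depth=2)  # → ["b", "c"]  ("d" lies beyond depth 2)
```

A bidirectional variant would simply call this with `max_depth=2` over a graph that also contains the reverse edges.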
Conflicts are detected along three axes.
Resolution options are returned to the caller (typically an LLM or a human operator) with suggested actions: MERGE, REFERENCE_AS_CONTRADICTION, DEFER_TO_HIGHER_CONFIDENCE, or FLAG_FOR_REVIEW.
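A minimal sketch of how such suggestions could be produced; the conflict-record shape, the confidence-gap threshold, and the ranking heuristics are all assumptions, since the specification only names the four options:

```python
from enum import Enum

class Resolution(Enum):
    MERGE = "MERGE"
    REFERENCE_AS_CONTRADICTION = "REFERENCE_AS_CONTRADICTION"
    DEFER_TO_HIGHER_CONFIDENCE = "DEFER_TO_HIGHER_CONFIDENCE"
    FLAG_FOR_REVIEW = "FLAG_FOR_REVIEW"

def suggest(conflict: dict) -> list:
    """Return ranked resolution options for the caller (LLM or human).

    The 0.3 confidence-gap threshold is an illustrative assumption.
    """
    gap = abs(conflict["confidence_a"] - conflict["confidence_b"])
    if gap >= 0.3:
        return [Resolution.DEFER_TO_HIGHER_CONFIDENCE, Resolution.FLAG_FOR_REVIEW]
    if conflict.get("same_topic"):
        return [Resolution.MERGE, Resolution.REFERENCE_AS_CONTRADICTION]
    return [Resolution.FLAG_FOR_REVIEW]

suggest({"confidence_a": 0.9, "confidence_b": 0.4})
# first suggestion: Resolution.DEFER_TO_HIGHER_CONFIDENCE
```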
A database state is a fixed point if applying any recursive update operation yields a semantically equivalent state. Convergence is verified through four checks.
The system exposes a verify_fixed_point() method that returns the convergence status and, if not converged, the remaining drift.
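The fixed-point test can be sketched as repeated application until the state stops changing. This is a simplification: semantic equivalence is reduced to plain equality, and the round limit and return shape are assumptions for illustration:

```python
def verify_fixed_point(state, recursive_update, max_rounds=10):
    """Apply the update until the state stops changing.

    Returns (converged, drift). Semantic equivalence is reduced to dict
    equality here; `drift` counts entries still changing at the limit.
    """
    for _ in range(max_rounds):
        new_state = recursive_update(state)
        if new_state == state:
            return True, 0
        state = new_state
    after = recursive_update(state)
    drift = sum(1 for k in state if state[k] != after.get(k))
    return False, drift

# Toy update: clamp every confidence into [0, 1]; reaches a fixed point fast.
clamp = lambda s: {k: max(0.0, min(1.0, v)) for k, v in s.items()}
verify_fixed_point({"e1": 1.4, "e2": 0.7}, clamp)  # → (True, 0)
```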
A minimal prototype implements the design with the following core methods:
- `add()` – validates conflicts, inserts row, tags, references, and embedding
- `search()` – runs hybrid retrieval and returns ranked entries
- `update()` – detects change type, evaluates impact, optionally propagates, and logs history
- `propagate_update()` – traverses the reference graph with depth limits
- `verify_fixed_point()` – runs the four convergence checks

| Limitation | Mitigation / Future |
|---|---|
| Automatic conflict resolution not always possible | Return options to LLM for interactive resolution |
| Scalability beyond 100k entries | Implement hierarchical clustering, distributed indices |
| Synchronous writes | Add background extraction/consolidation workers |
| External source synchronisation | Integrate periodic sitemap‑based diffing |
| Subjective confidence assignment | Derive confidence from source reliability, verification count, and decay over time |
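The last mitigation in the table can be sketched as a derived-confidence formula. The specific weights, the saturating verification boost, and the 90-day half-life are all assumptions; the specification only names the three inputs:

```python
def derived_confidence(source_reliability, verification_count, age_days,
                       half_life_days=90.0):
    """Confidence from source reliability, verification count, and time decay.

    All weights and the half-life are illustrative assumptions.
    """
    # Each independent verification closes half of the remaining doubt.
    verification_boost = 1.0 - 0.5 ** verification_count
    # Exponential decay: confidence halves every `half_life_days`.
    decay = 0.5 ** (age_days / half_life_days)
    return source_reliability * (0.5 + 0.5 * verification_boost) * decay

derived_confidence(0.9, 3, 0)   # fresh entry, verified three times, ≈ 0.84375
derived_confidence(0.9, 3, 90)  # same entry after one half-life, ≈ 0.421875
```

Deriving confidence this way replaces a subjective judgment with three measurable inputs, which also makes the score auditable.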
This specification outlines a natural language database that turns LLM long‑term memory into a self‑maintaining, recursively compressible, and verifiable system. By separating invariant architecture from generated content, enforcing fixed‑point convergence, and using the LLM’s own language as the storage format, the design enables agents to persistently improve their knowledge without suffering context explosion or semantic drift.