Axiomatic Reasoning for LLMs

An Orchestration Strategy

1. Core Abstraction

A three‑layer architecture separates what a system must achieve from how it is realized, then distributes execution across parallel modules whose coordination overhead grows logarithmically while total compute scales multiplicatively.

Layer 0 – Intent: High‑level objective expressed in natural language or constrained form.
Layer 1 – Structural Specification: Machine‑interpretable contract (schemas, invariants, dependency rules) that defines interfaces and allowed interactions.
Layer 2 – Execution: Parallel modules that consume the specification and produce conforming artifacts (code, data, decisions).

The specification acts as a fixed premise—once validated, it becomes immutable for the current phase, enabling local reasoning and preventing context drift.

2. Decomposition Pattern

Intent is decomposed via a hierarchical task network (HTN) pattern with four primitive forms:

Pattern Behavior Parallelizability
Sequential Ordered subtasks, output feeds next None
Parallel Independent subtasks, results aggregated Full
Conditional Branching based on intermediate state Partial
Iterative Repeated application until convergence Loop‑level only

The decomposition engine selects patterns based on task dependency graph analysis. For any task, the decomposition is expressed as:

specification = { id, version, input_schema, output_schema, invariants: [condition_1, …, condition_n], decomposition: { pattern, subtasks, preconditions, postconditions }, dependencies: [{ target, type, version_constraint }] }

3. Parallel Execution Model

The middle layer consists of N intermediate modules, each responsible for a disjoint subset of the decomposed specification. Modules execute independently with:

Lower‑level modules (implementation layer) follow the same pattern recursively, forming a tree whose depth is determined by granularity requirements.

4. Load Balancing and Resource Allocation

Specialization creates natural load imbalance. The system adjusts through:

Score‑adaptive routing (LASER)

Least‑loaded expert parallelism (LLEP)
When a device becomes overloaded, its tokens and the corresponding expert parameters are dynamically relocated to underutilized devices. This yields:

5. Convergence Detection

A parallel system has converged when:

  1. Intra‑module uncertainty → each module’s output distribution is delta‑concentrated
  2. Inter‑module agreement → all modules map their outputs to the same semantic cluster

These are quantified as Collaborative Entropy (CoE) : U = mean(SE(x_i)) + sum(w_i * KL(p_i || p_bar))

Convergence is declared when U approaches zero. The verification gate at each layer validates conformance to specification before promoting results to the next layer.

6. Complexity Bounds

Let:

Total wall‑clock time: T_total = sum_over_p( max_i( T_mid(i,p) + max_j(T_low(ij,p)) ) + O(log N_p) + V_p )

Total compute: C_total = sum_over_p( sum_i C_mid(i,p) + sum_i sum_j C_low(ij,p) )

Because max_i and max_j are independent of N_p (parallel modules do not wait for each other), increasing N_p exponentially does not increase T_total beyond the linear factor introduced by additional phases P. Compute grows multiplicatively with N_p; time grows linearly with P.

7. Scaling Limits and Mitigations

Constraint Critical Value Mitigation
Inter‑module communication N ~ 16–32 Hierarchical sub‑grouping
Agreement verification N ~ 128 Cluster‑then‑cluster hierarchical aggregation (O(N log N))
Specification complexity ~MB scale Modular references, shared schemas
Human validation latency P ~ 10–20 Automated conformance gates, confidence thresholds

8. Relationship to Computational Irreducibility

Certain computations cannot be predicted without execution. This framework treats such irreducibility not as a limitation but as the reason to parallelize: the only way to determine behavior is to execute, and execution time is bounded by the slowest module while total compute scales with the number of modules.

This aligns with the negentropy maximization principle: semantic interference density increases when parallel modules operate on orthogonal subspaces, converting potential collisions into structured knowledge.

9. Summary

The orchestration strategy defines:

The architecture does not eliminate the need for human intent articulation or validation, but reduces human involvement to those two functions while automating decomposition, distribution, execution, and verification. The result is a system where parallelism scales compute without scaling time, bounded only by communication overhead and the irreducible cost of validation.