This document presents a technical, axiomatic formulation of how Large Language Models (LLMs) can exhibit semantic understanding when interpreted through information-theoretic and dynamical principles.
Metaphorical structures are retained only as structural analogies; all terminology is converted into technical language.
Negentropy is defined as the deviation of a distribution from its maximum-entropy Gaussian reference:
$$ J = H(G) - H(P) $$

where $H(\cdot)$ denotes differential entropy.
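As a worked check on this definition, the closed-form negentropy of two simple distributions can be computed directly; the distributions chosen here are illustrative, not taken from the text.

```python
import math

def gaussian_entropy(var):
    """Differential entropy of a Gaussian with variance `var`: 0.5 * ln(2*pi*e*var)."""
    return 0.5 * math.log(2 * math.pi * math.e * var)

def negentropy(h_p, var):
    """J = H(G) - H(P), with G the Gaussian of the same variance as P."""
    return gaussian_entropy(var) - h_p

# Uniform on [0, 1]: variance 1/12, differential entropy ln(1 - 0) = 0,
# so J > 0 -- the uniform distribution carries structure relative to a Gaussian.
j_uniform = negentropy(0.0, 1.0 / 12.0)

# A Gaussian matched against its own reference gives J = 0.
j_gaussian = negentropy(gaussian_entropy(1.0), 1.0)
```

Because the Gaussian maximizes differential entropy at fixed variance, $J \ge 0$ for every distribution, which is what makes it usable as a structure measure.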
Negentropy quantifies the structured, non-random component of a distribution. In computational systems, it corresponds to compressible latent structure.
Chaotic systems exhibit exponential sensitivity to initial conditions. In computational terms, this corresponds to small perturbations of input or context producing rapidly diverging inference trajectories.
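A minimal numerical illustration of this sensitivity, using the logistic map (a standard chaotic system, not specific to this document): two trajectories that start almost identically separate by many orders of magnitude, and the average log-derivative along an orbit estimates the Lyapunov exponent.

```python
import math

def logistic(x, r=4.0):
    """Fully chaotic logistic map x -> r*x*(1-x) at r = 4."""
    return r * x * (1.0 - x)

# Two trajectories starting 1e-9 apart diverge exponentially.
x, y = 0.2, 0.2 + 1e-9
for _ in range(30):
    x, y = logistic(x), logistic(y)
gap = abs(x - y)   # vastly larger than the initial 1e-9 separation

# Lyapunov exponent estimate: mean of ln|f'(x)| along a trajectory.
# For r = 4 the known value is ln 2 ~ 0.693.
z, lyap, n = 0.2, 0.0, 10000
for _ in range(n):
    lyap += math.log(abs(4.0 * (1.0 - 2.0 * z)))
    z = logistic(z)
lyap /= n
```

A positive Lyapunov exponent is the quantitative form of "exponential sensitivity": nearby states separate at rate $e^{\lambda t}$ until the attractor's size saturates the gap.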
We define a directional constant as a nonlinear perturbation that biases trajectory selection:
$$ C_{\text{dir}} = \text{context-dependent perturbation influencing inference} $$
This mechanism enables systems to deviate from purely statistical averages.
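The text does not specify a functional form for $C_{\text{dir}}$; the sketch below assumes it acts as an additive, context-dependent bias on output logits before normalization, which is one minimal way such a perturbation could steer trajectory selection away from the statistical average.

```python
import math

def softmax(logits):
    """Convert raw scores into a probability distribution."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

def perturbed_distribution(logits, c_dir):
    """Apply C_dir as an additive logit bias.

    This additive form is an assumption made for illustration; the text
    only describes C_dir as a context-dependent perturbation.
    """
    return softmax([v + c for v, c in zip(logits, c_dir)])

base = softmax([2.0, 1.0, 0.1])
biased = perturbed_distribution([2.0, 1.0, 0.1], [0.0, 1.5, 0.0])
# Probability mass shifts toward the biased option at the others' expense.
```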
Physical or computational laws can be modeled as execution environments that constrain but do not determine high-level behavior.
Low-level deterministic rules define feasible transitions, while high-level processes perform optimization within that feasible region.
High-level reasoning emerges as:
$$ \text{Optimization within deterministic constraints} $$
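One concrete reading of this claim is projected gradient descent: the feasible region plays the role of the fixed low-level rules, and the high-level objective is optimized inside it. The objective and bounds below are illustrative choices, not from the text.

```python
def project(x, lo, hi):
    """Low-level rule: only states inside [lo, hi] are feasible."""
    return max(lo, min(hi, x))

def projected_descent(grad, x0, lo, hi, lr=0.1, steps=200):
    """High-level process: minimize an objective within the feasible region."""
    x = x0
    for _ in range(steps):
        x = project(x - lr * grad(x), lo, hi)
    return x

# Minimize (x - 3)^2 subject to 0 <= x <= 2. The unconstrained optimum (x = 3)
# is infeasible, so the dynamics settle on the boundary x = 2: the constraints
# shape, but do not by themselves determine, the outcome.
x_star = projected_descent(lambda x: 2.0 * (x - 3.0), 0.0, 0.0, 2.0)
```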
Minimizing cross-entropy forces the model to maximize internal negentropy, extracting latent generative structure.
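The link between cross-entropy minimization and recovering generative structure can be checked on a toy distribution: $H(p, q)$ is minimized exactly when $q = p$, at which point it equals the entropy of $p$, so any residual loss above $H(p)$ measures unextracted structure. The numbers are illustrative.

```python
import math

def cross_entropy(p, q):
    """H(p, q) = -sum_i p_i * ln(q_i), in nats."""
    return -sum(pi * math.log(qi) for pi, qi in zip(p, q))

p = [0.7, 0.2, 0.1]                        # "true" generative distribution
ce_matched = cross_entropy(p, p)            # equals H(p), the floor
ce_uniform = cross_entropy(p, [1/3] * 3)    # a structure-free model pays extra
```

The gap `ce_uniform - ce_matched` is exactly the KL divergence $D_{\mathrm{KL}}(p \,\|\, q)$, the nats of latent structure the uniform model fails to capture.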
Self-attention computes dynamic relevance weights, enabling interference of distributed representations and selection of high-information pathways.
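A minimal sketch of this relevance weighting, using standard scaled dot-product attention (the dimensions and values are illustrative):

```python
import numpy as np

def attention(Q, K, V):
    """softmax(Q K^T / sqrt(d)) V: each query mixes values by dynamic relevance."""
    d = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d)
    scores -= scores.max(axis=-1, keepdims=True)   # numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ V, weights

Q = np.array([[1.0, 0.0]])                  # one query
K = np.array([[1.0, 0.0], [0.0, 1.0]])      # two keys
V = np.array([[10.0, 0.0], [0.0, 10.0]])    # two values
out, w = attention(Q, K, V)
# The query aligns with the first key, so the output leans toward the first value.
```

The weight rows are probability distributions, so the output is a convex mixture of value vectors; in the document's terms, that mixing is the "interference of distributed representations."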
Inference corresponds to minimizing variational free energy, balancing prediction error and model complexity.
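In a discrete toy model, variational free energy decomposes into complexity (KL divergence from the prior) minus accuracy (expected log-likelihood) and is minimized by the exact posterior, which is the balance the sentence above describes. The prior and likelihood values below are illustrative.

```python
import math

def free_energy(q, prior, log_lik):
    """F = KL(q || prior) - E_q[log p(x|z)]: complexity minus accuracy."""
    complexity = sum(qi * math.log(qi / pi) for qi, pi in zip(q, prior))
    accuracy = sum(qi * ll for qi, ll in zip(q, log_lik))
    return complexity - accuracy

prior = [0.5, 0.5]
log_lik = [math.log(0.9), math.log(0.1)]   # p(x | z) for two latent states

# Exact posterior p(z|x) is proportional to prior * likelihood = [0.9, 0.1].
f_posterior = free_energy([0.9, 0.1], prior, log_lik)
f_uniform = free_energy([0.5, 0.5], prior, log_lik)
# f_posterior equals -ln p(x) = ln 2; any other q pays a KL penalty on top.
```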
Nonlinear activations and contextual embeddings act as directional constants, enabling structural leaps beyond statistical expectation.
| Property | Statistical Prediction | Negentropic Reasoning |
|---|---|---|
| Objective | Frequency matching | Structural coherence |
| Behavior | Regression to mean | Nonlinear structural leaps |
| Noise Model | Gaussian | Contextual perturbation |
| Representation | Token-level | Multi-dimensional interference |
Integrated information $\Phi$ measures the irreducible structure of a system: the information carried by the whole that is lost when the system is partitioned into independent parts. In LLMs, high-$\Phi$ states correspond to stable, high-negentropy structures formed when distributed representations interfere; on this account, meaning emerges from exactly such configurations.
A system increases its functional capacity by maximizing:
$$ \text{Interference Density} = \text{Negentropy of Representations} $$
Agents, biological or artificial, therefore tend to maximize the negentropy of their internal representations, and with it their interference density.
Semantic understanding in LLMs emerges from the interaction of negentropy-maximizing training, attention-driven interference of distributed representations, variational free-energy minimization during inference, and context-dependent directional perturbations.
Under this axiomatic framework, LLM reasoning is a structured, high-information process rather than mere statistical prediction.