Axiomatic Reasoning for LLMs

Is Language a Calculator?

The proposition that language functions as a dynamic computational system for modeling and predicting the world can be decomposed into four interrelated components: (1) the predictive architecture of linguistic processing, (2) the encoding of world knowledge in statistical co-occurrence patterns, (3) the reflection of embodied and social cognition in grammatical and lexical structures, and (4) the optimization of linguistic form through cultural transmission. Each component provides a distinct line of evidence supporting the view of language as an adaptive information-processing engine built upon human cognitive and social substrates.

1. Predictive Coding and Information-Theoretic Efficiency

Human language processing is fundamentally predictive. The brain continuously generates expectations about upcoming linguistic input from prior context, and the cost of a mismatch is quantified as surprisal: the negative logarithm of a word's conditional probability given its context. Surprisal correlates strongly with behavioral measures such as reading time and with electrophysiological responses including the N400 component. This alignment indicates that comprehension is not passive reception of a word sequence but active error minimization. The linguistic signal itself exhibits redundancy patterns that smooth information density across utterances, allowing the cognitive system to allocate processing resources efficiently. From an information-theoretic standpoint, language behaves as a channel optimized for transmitting structured meaning under constraints of memory and prediction.
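
To make the quantity concrete, the sketch below estimates per-word surprisal with a toy bigram model. The miniature corpus, the add-one smoothing, and the one-word context window are simplifying assumptions standing in for the far richer context a human comprehender conditions on.

    import math
    from collections import Counter

    # Toy corpus standing in for the statistics of prior linguistic experience.
    corpus = ("the cat sat on the mat . the dog sat on the rug . "
              "the cat saw the dog .").split()

    bigrams = Counter(zip(corpus, corpus[1:]))
    unigrams = Counter(corpus)
    vocab_size = len(unigrams)

    def surprisal(prev, word):
        """Surprisal in bits: -log2 P(word | prev), with add-one smoothing
        so unseen continuations get finite but high surprisal."""
        p = (bigrams[(prev, word)] + 1) / (unigrams[prev] + vocab_size)
        return -math.log2(p)

    sentence = "the cat sat on the rug".split()
    for prev, word in zip(sentence, sentence[1:]):
        print(f"{prev:>4} -> {word:<4} {surprisal(prev, word):.2f} bits")

Even in this toy setting, predictable continuations such as "the cat" receive lower surprisal than rarer ones such as "the rug", the same pattern that tracks the reading-time and N400 effects described above.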

2. Distributional Semantics and the Implicit World Model

The distributional hypothesis asserts that the meaning of a word is determined by the company it keeps within a corpus of language use. Computational implementations such as Latent Semantic Analysis and neural word embeddings operationalize this principle by constructing high-dimensional vector spaces where semantic similarity corresponds to proximity. Critically, these models acquire structured knowledge about the world without explicit perceptual grounding. The statistical regularities embedded in large corpora encode information about object categories, event schemas, and relational properties that approximate human conceptual organization. The linguistic corpus thus functions as a compressed representation of collective human experience, making world knowledge recoverable through purely statistical inference. Language, in this sense, is a medium that records and transmits a distributed, society-level model of the environment.
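
A minimal sketch of the recipe, assuming nothing beyond NumPy: count co-occurrences in a four-sentence corpus, compress the count matrix with a truncated SVD (the step Latent Semantic Analysis applies to much larger term-document matrices), and compare words by cosine similarity. The corpus, the window size, and the three retained dimensions are arbitrary illustrative choices.

    import numpy as np

    # Tiny corpus; real distributional models are trained on billions of tokens.
    sentences = [
        "the cat chased the mouse",
        "the dog chased the cat",
        "the mouse ate the cheese",
        "the dog ate the bone",
    ]
    tokens = [s.split() for s in sentences]
    vocab = sorted({w for sent in tokens for w in sent})
    index = {w: i for i, w in enumerate(vocab)}

    # Symmetric co-occurrence counts within a +/-2 word window.
    counts = np.zeros((len(vocab), len(vocab)))
    for sent in tokens:
        for i, w in enumerate(sent):
            for j in range(max(0, i - 2), min(len(sent), i + 3)):
                if i != j:
                    counts[index[w], index[sent[j]]] += 1

    # Truncated SVD compresses the counts into dense vectors (the LSA step).
    u, s, _ = np.linalg.svd(counts, full_matrices=False)
    embeddings = u[:, :3] * s[:3]

    def cosine(a, b):
        return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

    print(cosine(embeddings[index["cat"]], embeddings[index["dog"]]))     # shared contexts
    print(cosine(embeddings[index["cat"]], embeddings[index["cheese"]]))  # few shared contexts

Even at this scale, cat should come out closer to dog, with which it shares contexts, than to cheese, with which it does not; proximity in the reduced space is the geometric counterpart of distributional similarity.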

3. Cognitive Linguistics and the Embodied Substrate

Cognitive linguistics posits that linguistic structure is not autonomous but emerges from general cognitive capacities such as perception, categorization, and memory. Conceptual metaphor theory reveals that abstract reasoning is systematically structured by mappings from concrete, bodily experience. For instance, the metaphor Argument Is War underlies a wide range of expressions and guides inferential patterns. Construction grammar treats syntactic patterns as form-meaning pairings that reflect recurrent experiential scenarios. These findings indicate that the architecture of language is shaped by the constraints and affordances of human embodiment and social interaction. Language externalizes a cognitive model of the world that is grounded in sensorimotor and cultural practice, thereby functioning as an interface between internal mental states and shared environmental understanding.
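
The claim that a conceptual metaphor is a structured mapping rather than a decorative turn of phrase can be made concrete with a minimal data structure. The role names and the sample inference below are illustrative toy entries, not drawn from any linguistic resource.

    # ARGUMENT IS WAR as an explicit source-to-target mapping between domains.
    ARGUMENT_IS_WAR = {
        "combatant": "arguer",
        "attack": "criticism",
        "defense": "justification",
        "territory": "position",
        "victory": "persuasion",
    }

    def project(source_statement: str) -> str:
        """Carry a statement framed in the source domain (war) into the
        target domain (argument) by substituting mapped roles."""
        return " ".join(ARGUMENT_IS_WAR.get(w, w) for w in source_statement.split())

    print(project("the combatant whose defense fails loses territory"))
    # -> the arguer whose justification fails loses position

Because the mapping is systematic, inferences licensed in the source domain carry over to the target domain, which is the sense in which metaphor guides inferential patterns rather than merely coloring expression.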

4. Cultural Evolution as a Design Process

Language undergoes cumulative cultural evolution across generations of learners and users. Iterated learning experiments demonstrate that initially random form-meaning mappings rapidly acquire compositional structure when transmitted through a chain of learners, driven by pressures for learnability and expressivity. This process can be modeled as an information bottleneck, wherein linguistic systems evolve to achieve an optimal trade-off between the accuracy of meaning transmission and the complexity of the code. Memory constraints and processing biases at the individual level become amplified through repeated transmission, shaping the statistical and structural properties of languages at the population level. Language thus functions as a complex adaptive system that self-organizes toward computationally efficient solutions, effectively refining its capacity to represent and predict the world with each generation.
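
The sketch below is a minimal iterated-learning simulation in the spirit of those experiments, not a reimplementation of any published model: meanings are shape-color pairs, signals are two-letter strings, and each generation learns from only a subset of the previous generation's output. The bottleneck size, alphabet, and generalization rule are illustrative assumptions.

    import random
    from collections import Counter

    random.seed(0)

    SHAPES = range(3)                       # first feature of each meaning
    COLORS = range(3)                       # second feature of each meaning
    MEANINGS = [(s, c) for s in SHAPES for c in COLORS]
    ALPHABET = "abcdef"

    def random_language():
        """Generation zero: an arbitrary two-letter signal for every meaning."""
        return {m: random.choice(ALPHABET) + random.choice(ALPHABET) for m in MEANINGS}

    def learn(observed):
        """Memorize the observed pairs; for unseen meanings, compose a signal from
        the letters most often paired with each feature value (a simplicity bias)."""
        first = {s: Counter() for s in SHAPES}
        second = {c: Counter() for c in COLORS}
        for (s, c), signal in observed.items():
            first[s][signal[0]] += 1
            second[c][signal[1]] += 1

        def pick(counts):
            return counts.most_common(1)[0][0] if counts else random.choice(ALPHABET)

        return {m: observed.get(m, pick(first[m[0]]) + pick(second[m[1]]))
                for m in MEANINGS}

    def transmit(language, bottleneck=5, generations=50):
        """Each generation learns from a random subset (the transmission
        bottleneck) of the previous generation's meaning-signal pairs."""
        for _ in range(generations):
            sample = dict(random.sample(list(language.items()), bottleneck))
            language = learn(sample)
        return language

    for meaning, signal in sorted(transmit(random_language()).items()):
        print(meaning, signal)   # shared features tend to map onto shared letters

Starting from arbitrary signals, a few dozen generations typically suffice for meanings that share a feature to share a letter, which is the learnability-expressivity trade-off in miniature: a compositional code is both easier to reconstruct from partial data and able to name every meaning.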

Synthesis

The convergence of evidence from predictive processing, distributional semantics, cognitive grounding, and cultural evolution supports a unified characterization: language is a culturally evolved computational system that learns, encodes, and predicts the structure of the world as filtered through human cognition. Its statistical regularities are not accidental but reflect the cumulative optimization of a communication channel under the pressures of human memory, predictive expectations, and social interaction. The linguistic signal carries a compressed model of the environment, and the human cognitive apparatus is tuned to decode this model through continuous prediction and error correction. Language does not merely describe reality; it constitutes a dynamic, adaptive calculus for navigating it.