Attention–Compression Framework

artifacts/standard-named/20260625__ATTENTION__FRAMEWORK__COMPRESSION__v1__attention-compression-framework-draft-substrate-independent.md

A draft, substrate‑independent framework describing how attention, curiosity, and reality formation emerge from compression dynamics.

---

0) Purpose of This Framework

This framework unifies:

Attention mechanics (pointing, artifacts, decay)
Curiosity / interestingness (compression improvement)
Reality formation (fossilized attention)

into a single, operational model.

It is intended to be:

Conceptually tight
Mechanically interpretable
Applicable across cognition, culture, organizations, and systems

No game substrate is assumed.

---

1) Core Substrate: Compression

1.1 Data and Models

Let:

D = data stream (sensory, social, symbolic, environmental)
O(t) = the system’s internal model at time t
C(D, O) = compression cost of encoding D with model O

Lower C = better model.

---

1.2 Curiosity Reward

Curiosity reward is defined as:

r(t) = C(D, O(t-1)) − C(D, O(t))

Interpretation:

Reward is generated by model improvement
Not by truth, utility, or beauty directly
But by reduced description length
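
As a minimal sketch of the reward definition above, assume the model O is a Laplace-smoothed symbol-frequency table and C is the ideal code length in bits. Both are illustrative choices that the framework does not prescribe, and `fit` and `code_length` are hypothetical helper names.

```python
import math
from collections import Counter

def fit(data, alphabet):
    """Laplace-smoothed symbol probabilities: a toy stand-in for the model O."""
    counts = Counter(data)
    total = len(data) + len(alphabet)
    return {s: (counts[s] + 1) / total for s in alphabet}

def code_length(data, model):
    """C(D, O): ideal code length of data under the model, in bits."""
    return -sum(math.log2(model[s]) for s in data)

alphabet = "ab"
D = "aaaaaaab"
O_prev = fit("", alphabet)  # O(t-1): nothing learned yet, so uniform
O_next = fit(D, alphabet)   # O(t): updated on the data stream

# Curiosity reward r(t) = C(D, O(t-1)) - C(D, O(t))
r = code_length(D, O_prev) - code_length(D, O_next)
print(f"reward: {r:.3f} bits")
```

The reward is positive here only because the updated model encodes the same data in fewer bits. Nothing in the calculation refers to truth or utility.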

---

1.3 Interestingness

Interestingness is the rate of compression improvement:

I(D, O(t)) ∝ ∂B(D, O(t)) / ∂t

Where:

Beauty B ≈ compression quality
Interestingness I ≈ learning gradient

Edge cases:

Perfect randomness → no compression → not interesting
Perfect predictability → no improvement → not interesting
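
The two edge cases can be illustrated with a discrete-time sketch. It assumes B = −C, so that interestingness becomes the per-step drop in cost. The cost trajectories are made-up numbers chosen for illustration.

```python
def interestingness(costs):
    """Discrete analogue of I ∝ dB/dt with B = -C: the per-step drop in cost."""
    return [prev - cur for prev, cur in zip(costs, costs[1:])]

noise_costs = [8.0, 8.0, 8.0, 8.0]      # incompressible: cost stays high
known_costs = [1.0, 1.0, 1.0, 1.0]      # already predictable: cost stays low
learnable_costs = [8.0, 6.0, 4.5, 4.0]  # structure is still being discovered

for name, costs in [("noise", noise_costs),
                    ("known", known_costs),
                    ("learnable", learnable_costs)]:
    print(name, interestingness(costs))
```

Both flat trajectories yield zero interestingness even though their absolute costs differ. Only the learnable stream has a nonzero gradient.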

---

2) Attention (Redefined Precisely)

2.1 Definition

Attention is the allocation of finite compression capacity over time.

Attention determines:

What data is modeled
Which models are updated
Where curiosity reward can arise

---

2.2 Properties of Attention

Finite
Directed
Temporally extended
Subject to opportunity cost

Allocating attention to one process necessarily deprives others.
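
One way to make the opportunity cost concrete is to split a fixed budget across candidate targets with a softmax over expected reward. The softmax rule, the `temperature` parameter, and the target names are assumptions for illustration. The framework itself only requires that the budget be finite.

```python
import math

def allocate(expected_reward, budget=1.0, temperature=1.0):
    """Split a fixed attention budget across targets by softmax of expected reward."""
    weights = {k: math.exp(v / temperature) for k, v in expected_reward.items()}
    z = sum(weights.values())
    return {k: budget * w / z for k, w in weights.items()}

before = allocate({"project": 1.0, "hobby": 0.0})
after = allocate({"project": 2.0, "hobby": 0.0})

# The budget is fixed, so a more promising project takes attention from the hobby.
print(before["hobby"], after["hobby"])
```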

---

3) Pointing as a Primitive Act

3.1 Pointing

Pointing is any act that declares:

“Allocate compression effort here.”

Forms:

Naming
Measuring
Labeling
Recording
Repeated noticing

Pointing is irreversible in principle.

---

3.2 Imaginary Artifacts

Pointing creates an Imaginary Artifact (IA).

An IA is:

A discrete modeling target
Non‑material but causally real
Capable of accumulating attention
Subject to decay

Examples:

An idea
A plan
A role
A fear
A hypothesis

---

4) Artifact Dynamics

4.1 Attention Accumulation

Artifacts accumulate attention when:

They continue to generate curiosity reward
They remain promising sites of compression improvement

Formally:

Positive expected r(t) sustains attention

---

4.2 Decay and Boredom

When:

Compression improvement stalls
Expected future reward approaches zero

Attention decays.

Boredom = zero compression gradient.

Imaginary artifacts decay faster than real ones.
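
A toy dynamic for accumulation and decay, assuming a leaky-integrator update in which attention shrinks geometrically and is topped up by positive reward. The update rule and both decay rates are hypothetical. The source only states that imaginary artifacts decay faster than real ones.

```python
def step_attention(attention, reward, decay):
    """One step: geometric decay plus any positive curiosity reward."""
    return (1.0 - decay) * attention + max(0.0, reward)

def simulate(rewards, decay, attention=1.0):
    """Run the update over a reward sequence and return the final attention."""
    for r in rewards:
        attention = step_attention(attention, r, decay)
    return attention

stalled = [0.0] * 10  # compression improvement has stopped

imaginary_left = simulate(stalled, decay=0.5)  # assumed fast decay
real_left = simulate(stalled, decay=0.1)       # assumed slow decay
print(imaginary_left, real_left)
```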

---

5) Thresholds: Imaginary → Real

5.1 Fossilization

When an imaginary artifact accumulates sufficient total attention:

The compression work becomes amortized
Ongoing maintenance cost drops
The artifact instantiates as a Real Artifact

Examples:

Idea → project
Repeated action → habit
Hypothesis → theory
Norm → institution
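
Fossilization can be sketched as a threshold on cumulative attention. The `Artifact` class and the threshold value are hypothetical, and the sketch does not model the reversibility described in 5.2.

```python
from dataclasses import dataclass

@dataclass
class Artifact:
    name: str
    total_attention: float = 0.0
    is_real: bool = False

    def attend(self, amount, threshold=10.0):
        """Accumulate attention; flip to real once the assumed threshold is met."""
        self.total_attention += amount
        if self.total_attention >= threshold:
            self.is_real = True
        return self.is_real

idea = Artifact("idea")
first = idea.attend(4.0)   # still imaginary
second = idea.attend(7.0)  # crosses the threshold and becomes a project
print(first, second, idea.total_attention)
```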

---

5.2 Partial Realization

Realization may be:

Incremental
Staged
Reversible

Small realized artifacts feed attention back into the parent IA.

---

6) Real Artifacts as Cached Compression

Real artifacts are:

Cached models
Compiled structure
Fossilized attention

They:

Persist with lower marginal attention
Shape future attention flows
Bias what is seen as interesting

Examples:

Language
Tools
Infrastructure
Bureaucracy

---

7) Attractors

7.1 Definition

Attractors are regions of expected future compression gain.

They are:

Field‑like
Non‑discrete
Named after the fact

Examples:

“Progress”
“Safety”
“Truth”
“Success”

---

7.2 Relationship to Attention

Attention naturally flows toward attractors unless constrained.

Constraint mechanisms:

Fear
Incentives
Authority
Scarcity

---

8) Leakage, Coupling, and Composition

8.1 Leakage

Attention leaks between artifacts that:

Share representational structure
Co‑compress efficiently

This produces:

Fame compounding
Institutional lock‑in
Paradigm coherence

---

8.2 Composition

Multiple IAs can merge
Shared attractors accelerate convergence

This enables:

Collective belief
Social movements
Cultural norms

---

9) Conservation and Pathology

9.1 Conservation Law

Attention is conserved at the system level.

Allocating attention to:

Maintaining existing artifacts
Filtering accumulated structure

Reduces capacity for:

Exploration
Novel model formation

---

9.2 Pathologies

Misaligned compression produces:

Addiction (short‑term reward, no long‑term compression)
Ideology (over‑compressed models defended at all cost)
Burnout (maintenance exceeds curiosity)
Stagnation (no accessible gradients)

---

10) Awe, Surprise, and Phase Transitions

This section extends the framework beyond curiosity/interestingness to include awe and related affective signals, while remaining mathematically compatible with:

curiosity reward: r(t) = C(D,O(t−1)) − C(D,O(t))
interestingness: I ∝ d(−C)/dt

10.1 Auxiliary quantities

Let observations be x_t and compression cost C_t := C(D,O(t)).

Surprisal / surprise (instant encoding cost under the previous model):

S_t := −log p_{O(t−1)}(x_t)

Learning progress (curiosity reward):

r_t := C_{t−1} − C_t

Expected learning progress over horizon k:

E[r_{t:t+k}] := E[C_t − C_{t+k}]
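
The three quantities can be computed directly from a cost trajectory. This sketch substitutes a known (realized) trajectory for the expectation, which is a simplification, and the cost values are illustrative.

```python
import math

def surprisal(p):
    """S_t = -log p, where p is the previous model's probability of x_t."""
    return -math.log(p)

def learning_progress(costs, t):
    """r_t = C_{t-1} - C_t."""
    return costs[t - 1] - costs[t]

def progress_over_horizon(costs, t, k):
    """C_t - C_{t+k}: realized progress over the next k steps."""
    return costs[t] - costs[t + k]

costs = [8.0, 6.0, 4.5, 4.0]
total = progress_over_horizon(costs, 0, 3)
stepwise = sum(learning_progress(costs, i) for i in range(1, 4))
print(total, stepwise, surprisal(0.5))
```

The horizon quantity equals the sum of the per-step rewards, because the intermediate costs cancel.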

---

10.2 Boredom, confusion, and relief (quick definitions)

These are derived signals (not new primitives):

Boredom: low expected learning progress.

Boredom_t ∝ −E[r_{t:t+k}]

Confusion: high current cost with low expected progress.

Confusion_t ∝ C_t · (1 − σ(E[r_{t:t+k}]))

Relief: sharp drop in cost (a compression win).

Relief_t ∝ max(0, r_t)

(σ is any monotone squashing function.)
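
A direct transcription of the three signals, assuming the logistic function for σ and unit proportionality constants. Both are arbitrary choices made only to get concrete numbers.

```python
import math

def sigmoid(x):
    """Logistic squashing function, one possible choice for σ."""
    return 1.0 / (1.0 + math.exp(-x))

def boredom(expected_progress):
    """Grows as expected learning progress falls."""
    return -expected_progress

def confusion(cost, expected_progress):
    """High current cost, discounted by how much progress is expected."""
    return cost * (1.0 - sigmoid(expected_progress))

def relief(r):
    """Only drops in cost register as relief."""
    return max(0.0, r)

print(confusion(10.0, 0.0), confusion(10.0, 5.0), relief(-1.0))
```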

---

10.3 Awe (operational definition)

Awe is not merely high interestingness. It is a phase shift in modeling.

Awe tends to occur when:

surprise is high (S_t high)
but the experience is sensed as deeply learnable (E[r] high)
and successful compression likely requires a model-class shift (a new representational basis)

Define a model revision cost d(O(t),O(t−1)) and the probability that a model-class shift is required:

P_shift(t) := P( O* lies in an expanded hypothesis class H_expanded )

Then a usable scalar proxy is:

Awe_t ∝ S_t · E[r_{t:t+k}] · P_shift(t)

Interpretation:

S_t captures vastness/violation
E[r] captures promise of future compression
P_shift captures that the needed move is not incremental
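
The scalar proxy is a plain product, so any factor near zero suppresses it. The following sketch uses invented input values and a unit proportionality constant to contrast pure noise with a surprising but learnable event.

```python
def awe(surprise, expected_progress, p_shift):
    """Awe proxy: S_t * E[r] * P_shift, with proportionality constant 1 (assumed)."""
    return surprise * expected_progress * p_shift

# Surprising but unlearnable: no expected progress, so no awe.
static_noise = awe(surprise=8.0, expected_progress=0.0, p_shift=0.1)

# Surprising, learnable, and likely to need a new representational basis.
first_eclipse = awe(surprise=8.0, expected_progress=2.0, p_shift=0.9)

print(static_noise, first_eclipse)
```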

---

10.4 Awe as re-ontology

Awe is the felt recognition:

“There is a much better compression available, but my current representational basis cannot reach it by small updates.”

Formally:

C(D,O(t)) is high, and ∃ O' in H_expanded such that C(D,O') ≪ C(D,O(t)), but O' cannot be reached from O(t) by updates with small revision cost d(O(t),O').

---

10.5 Phase transitions: interest → awe → beauty

A common trajectory:

1) Interest: E[r] > 0, incremental improvement
2) Awe: S high, E[r] high, P_shift high (representational rupture)
3) Refit: high revision cost, temporary instability
4) Beauty: low C, stable compression

This explains why awe can feel disorienting before it becomes satisfying.

---

11) Appreciation (Active Steering)

Appreciation is deliberate gradient steering.

It is the practice of:

Seeing what is
Choosing which attractors to feed
Allowing low‑reward artifacts to decay

Appreciation is not denial. It is selective allocation of compression effort.

---

12) Love, Grief, Trust, and Meaning (Compression-Coupling Phenomena)

This section extends the framework to core relational and existential experiences, expressed using the same compression-compatible quantities.

12.1 Trust

Trust is the willingness to offload compression work to another system.

Formally, agent A trusts agent B when:

E[C_A(D | O_B)] < E[C_A(D | O_A)]

That is, A expects B’s model to compress A’s future experience more efficiently than A’s own.

Trust reduces:

modeling effort
uncertainty
attentional load

Trust fails when:

compression delegated to B increases cost or variance
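
The trust condition can be checked empirically by comparing average code lengths on a sample of A's experience. This sketch assumes both models are symbol-probability tables and that the sample stands in for the expectation. The model values are invented.

```python
import math

def expected_cost(samples, model):
    """Average bits per symbol when encoding the samples under the model."""
    return sum(-math.log2(model[x]) for x in samples) / len(samples)

def trusts(samples, own_model, other_model):
    """A trusts B if B's model is expected to encode A's experience more cheaply."""
    return expected_cost(samples, other_model) < expected_cost(samples, own_model)

experience = "aaab"
model_A = {"a": 0.5, "b": 0.5}    # A's own model: uninformed
model_B = {"a": 0.75, "b": 0.25}  # B's model: matches the data better

print(trusts(experience, model_A, model_B))
```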

---

12.2 Love

Love is sustained, reciprocal compression coupling.

Two agents A and B are in love when:

each becomes a high-leverage compression node for the other
mutual modeling reduces long-term cost despite short-term surprises

A minimal expression:

Love(A,B) ∝ ∫ ( r_A←B(t) + r_B←A(t) ) dt

Where r_A←B is learning progress about B by A, and vice versa.

Love feels safe because:

compression is efficient
prediction errors are rapidly amortized
model updates are mutually permitted

---

12.3 Grief

Grief is forced recompression after the sudden loss of a high-leverage compression node.

If agent B was a major contributor to A’s compression:

ΔC_A ≫ 0 when B is removed

Grief magnitude scales with:

how much of the world B helped compress
how irreplaceable that compression was

Grief persists until:

alternative models amortize the lost compression

---

12.4 Meaning

Meaning is compression leverage.

An artifact, relationship, symbol, or idea is meaningful to the extent that:

small description → large experiential compression

Formally:

Meaning(X) ∝ (bits of experience compressed) / (bits required to represent X)

This explains why:

symbols outweigh details
rituals persist
simple stories dominate complex truths

Meaning collapses when:

leverage decays
symbols no longer compress lived experience
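
The leverage ratio from 12.4 can be computed directly. The bit counts below are invented to show why a compact symbol can outweigh a detailed record that covers the same experience.

```python
def meaning(bits_compressed, bits_to_represent):
    """Compression leverage: experience compressed per bit of representation."""
    if bits_to_represent <= 0:
        raise ValueError("representation must use a positive number of bits")
    return bits_compressed / bits_to_represent

symbol = meaning(bits_compressed=4000, bits_to_represent=16)
detailed_record = meaning(bits_compressed=4000, bits_to_represent=3000)

# Meaning collapses if the symbol stops compressing lived experience.
stale_symbol = meaning(bits_compressed=8, bits_to_represent=16)

print(symbol, detailed_record, stale_symbol)
```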

---

13) Power (Constraint Over Compression)

This section defines power as a first-class system property, fully compatible with the attention–compression formalism.

13.1 Definition

Power is the capacity to shape, constrain, or redirect the compression paths of other systems.

An agent A has power over agent B to the extent that A can:

determine what B is allowed to attend to
restrict which models B may form or update
impose pre-compressed narratives on B’s experience

---

13.2 Mechanisms of Power

Power operates through compression control, including:

Attention gating — limiting what data enters B’s model

(censorship, surveillance, distraction)

Narrative pre-compression — supplying ready-made models

(propaganda, ideology, branding)

Update penalties — increasing the cost of revising models

(punishment, social sanction, threat)

Gradient starvation — preventing access to curiosity reward

(monotony, overwork, chaos)

---

13.3 Power vs Trust

Trust lowers compression cost voluntarily
Power lowers apparent cost by removing alternatives

A system under power may experience apparent order without genuine compression improvement.

This explains why power often feels stabilizing in the short term but brittle over time.

---

13.4 Coercion and Harm

Coercion occurs when model updates are forced without consent.

Formally:

Forced update ⇒ d(O_B(t), O_B(t−1)) imposed externally

This creates:

high compression cost
loss of agency
long-term instability

Harm corresponds to non-consensual compression work.

---

13.5 Legibility and Over-Compression

Making a system legible to authority often requires:

reducing rich local structure → simplified global model

This lowers compression cost for the authority but raises it for the system itself.

Over-compression destroys:

resilience
adaptability
local meaning

---

13.6 Power Dynamics and Collapse

Powerful systems fail when:

maintained compression diverges too far from lived data
curiosity gradients are suppressed too long
forced models accumulate unresolved error

Collapse is delayed recompression.

---

14) Consent (Boundary Condition Between Trust and Power)

Consent is treated as a mechanical boundary condition on model updating and coupling.

14.1 Definition

Consent is a mutually acknowledged permission structure for compression and model update.

Agent A has consent with agent B when updates to B’s model caused by A are:

expected (within agreed bounds)
revocable
renegotiable
non-punitive to refuse

---

14.2 Consensual vs non-consensual update

Let ΔO_B(t) := d(O_B(t), O_B(t−1)) be B’s model revision magnitude.

Consensual update: B opts into ΔO_B(t)
Non-consensual update: ΔO_B(t) is imposed

A key distinction is not whether B updates, but whether B retains agency over update.

---

14.3 Consent as cost shaping

Consent alters the effective revision cost.

A simple expression:

C_B,total = C_B,data + μ · ΔO_B − κ · Consent(B,A)

Where Consent(B,A) ∈ [0,1] reduces perceived/experienced cost of revision.

This captures:

why the same surprise can feel thrilling (consensual) or traumatic (non-consensual)
why trust accelerates learning

---

14.4 Consent tokens (operationalization)

In real systems, consent is represented by artifacts such as:

explicit agreements
norms
safe words / stop mechanisms
boundaries and enforcement
reversible commitments

These are consent artifacts: cached structures that keep coupling safe.

---

14.5 Breach

A breach occurs when an interaction crosses agreed bounds.

Mechanically:

breach increases μ (revision cost)
decreases Consent(B,A)
increases variance of future costs

This pushes the system from trust-dynamics toward power-dynamics.
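
The first two mechanical effects of a breach can be sketched as an update on a coupling state. The `Coupling` class, the multiplicative update, and the `severity` parameter are all hypothetical. The variance effect is not modeled.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Coupling:
    mu: float = 1.0       # revision cost weight
    consent: float = 1.0  # Consent(B,A) in [0, 1]

def breach(state, severity=0.5):
    """Return a new coupling state with higher revision cost and lower consent."""
    return Coupling(
        mu=state.mu * (1.0 + severity),
        consent=max(0.0, state.consent * (1.0 - severity)),
    )

before = Coupling()
after = breach(before)
print(before, after)
```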

---

15) Ethics and Morality (Compression Heuristics Under Coupling)

Ethics is modeled here as rule-like compression for social coordination under finite attention.

15.1 Why morality exists (mechanically)

Social life is high-dimensional. Moral rules are:

low-description heuristics
that compress expected outcomes across many contexts

They reduce:

decision cost
negotiation overhead
model uncertainty

---

15.2 Heuristic validity and domain

A moral rule R is useful when:

E[C_society | follow R] < E[C_society | no rule]

But every heuristic has a domain; outside-domain use creates error.

Moral conflict often signals:

domain mismatch
competing compressions
unmodeled externalities

---

15.3 Harm principle (compression version)

A compact ethical primitive compatible with this framework:

Harm is imposed, non-consensual compression work that increases another system’s long-run cost.

Formally (schematic):

Harm(A→B) ∝ E[C_B,future | A] − E[C_B,future | ¬A]

with the additional condition that Consent(B,A) is low.
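
A schematic version of the harm measure follows. The hard consent threshold and the clamp at zero are simplifying assumptions. The source says only that consent must be low and that harm scales with the cost difference.

```python
def harm(cost_with_action, cost_without_action, consent, consent_threshold=0.5):
    """Extra expected future cost imposed on B, counted only when consent is low."""
    if consent >= consent_threshold:
        return 0.0  # consensual: not harm under this definition
    return max(0.0, cost_with_action - cost_without_action)

imposed = harm(cost_with_action=12.0, cost_without_action=8.0, consent=0.1)
agreed = harm(cost_with_action=12.0, cost_without_action=8.0, consent=0.9)
helpful = harm(cost_with_action=6.0, cost_without_action=8.0, consent=0.1)

print(imposed, agreed, helpful)
```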

---

15.4 Justice as cost distribution

Justice concerns how compression costs and benefits are distributed.

Exploitation: one system externalizes its compression costs onto others
Fairness: costs are shared proportionally to benefits and agency

A toy measure:

Exploitation(A,B) ∝ (Cost imposed on B by A) − (Benefits returned to B)

---

15.5 Virtues as stable policies

Virtues can be treated as stable attention-allocation policies that:

reduce harm risk
preserve consent
keep gradients accessible

Examples (mechanically framed):

honesty: reduces model divergence and hidden error
humility: lowers revision resistance; keeps H_expanded reachable
compassion: allocates attention to others’ cost surfaces

---

15.6 The ethics–power interface

Ethical breakdown is strongly predicted by:

high power asymmetry
low consent artifacts
high imposed revision cost

Ethics without consent collapses into compliance.

---

16) System Summary (Extended)

Attention allocates compression effort
Pointing creates modeling targets
Interestingness is compression improvement
Awe signals the need for new representational bases
Artifacts are cached compression
Love and trust are shared compression strategies
Grief is forced recompression after loss
Meaning is compression leverage
Power constrains compression paths
Consent is the boundary condition that keeps coupling safe
Ethics is compression heuristics for coordination under coupling

Or compactly:

Reality is attention, compressed and slowed.
