Technical architecture · v1.0

The engine that runs without self-attention.

Crisol drops the transformer's attention tower and replaces it with high-dimensional holographic algebra combined with a geometric Mixture-of-Experts. Context is held in a fixed-size vector, not a quadratic matrix. The result: constant-cost memory, axiomatic reasoning, and a model that is an organism, not a static function.


Holographic dim.: 4096

Context cost: O(1)

Active experts: 2

Dim. per expert: 8192

The path of a token, end to end.

There is no encoder-attention-decoder. There is a holographic encoder, a stack of NebulaForge layers, and a final projection. Four stages, zero quadratic attention.

1. Token: Input text tokenized against a 64,000-entry vocabulary.

2. HoloMemZEncoder: Each discrete token is projected to a holographic vector of holo_dim = 4096 on the unit sphere, structured into subspaces.

3. N × NebulaForgeLayer: 12 layers (Mini). Each layer holds HoloBinder + QSE + GestorExpertos. This is where reasoning happens, without a single self-attention operation.

4. RMSNorm → Logits: Final normalization and projection back to the vocabulary. The next token emerges from a cognitive state, not from an attention matrix.
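The last stage can be illustrated with a minimal sketch. RMSNorm is the standard root-mean-square normalization; the toy sizes, the `w_vocab` matrix, and the function names below are assumptions for illustration, not Crisol's actual code.

```python
import math

def rms_norm(x, gain=None, eps=1e-6):
    """Scale x so its root-mean-square is about 1, then apply an optional gain."""
    rms = math.sqrt(sum(v * v for v in x) / len(x) + eps)
    if gain is None:
        gain = [1.0] * len(x)
    return [g * v / rms for g, v in zip(gain, x)]

def project_to_logits(x, w_vocab):
    """One logit per vocabulary row: dot product of the state with that row."""
    return [sum(w * v for w, v in zip(row, x)) for row in w_vocab]

# Toy sizes (assumption): the text uses a 4096-dim state and 64,000 vocabulary rows.
state = [3.0, -4.0, 0.0, 0.0]
w_vocab = [
    [1.0, 0.0, 0.0, 0.0],
    [0.0, 1.0, 0.0, 0.0],
    [0.5, 0.5, 0.0, 0.0],
]

normed = rms_norm(state)
logits = project_to_logits(normed, w_vocab)
# Greedy choice of the next token id (sampling strategy is an assumption).
next_token = max(range(len(logits)), key=logits.__getitem__)
```

Note that the projection reads only the final state vector; nothing here depends on sequence length.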

Piece 1 · HoloMemZEncoder

A structured holographic space, not a flat embedding.

The encoder turns each token into a vector on the unit sphere of dimension holo_dim = 4096. But that sphere is not undifferentiated: it is split into three functional regions, each with a distinct cognitive purpose.

NAR · 2048: Axiomatic logical-causal reasoning. The subspace where the QSE routes and where expert signatures live.

NOE · 2048: Physical-causal invariants of the world (256 invariants in v1.0). The anchor against which the causal engine validates what it claims.

Free · size not listed: Trained domain knowledge. The flexible expressive capacity each expert populates in its own way.

This separation between reasoning (NAR) and world knowledge (NOE) is what lets Crisol route by axioms and validate by causality — without confusing what it believes with what it knows.
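A minimal sketch of what "a vector on the unit sphere" means in practice. The embedding here is a hypothetical stand-in (a deterministic pseudo-random vector per token id, not a trained table), and the position of the NAR slice inside the vector is an assumption: the text gives the subspace sizes, not their layout.

```python
import math
import random

HOLO_DIM = 4096      # from the text
VOCAB_SIZE = 64_000  # from the text
NAR_DIM = 2048       # from the text

def to_unit_sphere(v):
    """L2-normalize v so that it lies on the unit sphere."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]

def encode_token(token_id, seed=0):
    """Hypothetical encoder: a fixed pseudo-random Gaussian vector per token id,
    standing in for a learned embedding row, projected onto the unit sphere."""
    if not 0 <= token_id < VOCAB_SIZE:
        raise ValueError(f"token id {token_id} outside vocabulary")
    rng = random.Random(seed * VOCAB_SIZE + token_id)
    return to_unit_sphere([rng.gauss(0.0, 1.0) for _ in range(HOLO_DIM)])

vec = encode_token(42)
# Assumed layout: NAR occupies the first NAR_DIM components.
nar_view = vec[:NAR_DIM]
```

A router that works only on `nar_view` never touches the remaining components, which is the kind of separation the paragraph above describes.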

Piece 2 · HoloBinder: the end of quadratic cost.

The HoloBinder is the piece that removes attention. Instead of a matrix that grows with the square of sequence length, it keeps a single h_ctx vector of 4096 dimensions, updated token by token via a Hadamard product with a decay of 0.95.
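One possible reading of that update rule, as a sketch only: the exact formula is not given in the text, so the form below (decay the old state, then add the token bound to a key by elementwise product) and the `role_key` vector are assumptions. What it does show faithfully is that the state size never grows with the number of tokens.

```python
import random

DECAY = 0.95  # from the text

def hadamard(a, b):
    """Elementwise (Hadamard) product of two equal-length vectors."""
    return [x * y for x, y in zip(a, b)]

def update_context(h_ctx, token_vec, role_key, decay=DECAY):
    """Assumed update: h_ctx <- decay * h_ctx + (role_key * token_vec), elementwise."""
    bound = hadamard(role_key, token_vec)
    return [decay * h + b for h, b in zip(h_ctx, bound)]

dim = 64  # toy size (assumption); the text uses 4096
rng = random.Random(0)
# Hypothetical binding key with entries of +1 or -1.
role_key = [rng.choice((-1.0, 1.0)) for _ in range(dim)]

h_ctx = [0.0] * dim
for _ in range(1000):
    token_vec = [rng.gauss(0.0, 1.0) for _ in range(dim)]
    h_ctx = update_context(h_ctx, token_vec, role_key)

# After 1000 tokens the state is still a single vector of length dim.
state_size = len(h_ctx)
```

Under this reading, a token seen k steps ago contributes with weight 0.95^k, so older context fades geometrically instead of being stored explicitly.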

O(1) vs O(n²)

Context cost by sequence length (512 to 32,768 tokens)

Transformer attention grows with the square of sequence length; iCrisol's HoloBinder keeps it constant. At 2,048 tokens, relative cost / memory is 16× for the Transformer (self-attention) and 1× for iCrisol (HoloBinder).

At 32,768 tokens a transformer needs ~7.5 GB for the KV-cache alone, and that cache is discarded when the session closes. iCrisol holds the context state in a fixed-size vector of 4096 dimensions: constant cost, persistent across sessions.
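The 16× figure can be reproduced with a few lines of arithmetic. This sketch assumes the 1× baseline is 512 tokens, the lower end of the sequence-length range, which is consistent with 16× at 2,048 tokens; the function names are illustrative.

```python
BASELINE_TOKENS = 512  # assumed baseline: (2048 / 512) ** 2 == 16

def attention_relative_cost(n_tokens, baseline=BASELINE_TOKENS):
    """Quadratic scaling: cost relative to the baseline sequence length."""
    return (n_tokens / baseline) ** 2

def fixed_state_relative_cost(n_tokens):
    """A fixed-size context vector costs the same at any sequence length."""
    return 1.0

for n in (512, 2048, 32768):
    print(n, attention_relative_cost(n), fixed_state_relative_cost(n))
```

At the top of the range the quadratic term reaches (32768 / 512)² = 4096 times the baseline, while the fixed-state cost stays at 1.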

Because h_ctx is organism state and not an ephemeral inference cache, it can be saved, restored, and inherited. The conversation does not die when you close the tab.

Inside a NebulaForgeLayer

Four pieces that replace attention.

Each of the 12 layers in the Crisol Mini combines holographic binding, axiomatic routing, slot governance, and specialized experts. This is the anatomy of a layer.

HoloBinder

Context memory in O(1)

Instead of a quadratic attention matrix, it keeps a single h_ctx vector of 4096 dimensions, updated token by token via Hadamard product with 0.95 decay. It is persistent organism state, not an ephemeral cache: it survives across sessions and is rebuilt from the 64 KB HolographicCore.

QSE

Quantum Specialization Engine

The Mixture-of-Experts router. It learns no arbitrary gating matrix: it routes geometrically by cosine similarity against expert signatures in the 2048-dimensional NAR subspace. It activates exactly n_active = 2 experts per step. Signatures are interoperable: an imported package fits into any Crisol in the ecosystem.
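Routing by cosine similarity with n_active = 2 can be sketched as follows. The expert names, the 4-dimensional toy signatures, and the `route` function are hypothetical; in the text the signatures live in the 2048-dimensional NAR subspace.

```python
import math

N_ACTIVE = 2  # from the text

def cosine(a, b):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def route(nar_vec, signatures, n_active=N_ACTIVE):
    """Return the names of the n_active experts whose signatures are closest
    to nar_vec by cosine similarity. No learned gating matrix is involved."""
    ranked = sorted(
        signatures,
        key=lambda name: cosine(nar_vec, signatures[name]),
        reverse=True,
    )
    return ranked[:n_active]

# Hypothetical expert signatures (toy dimension 4).
signatures = {
    "syntax": [1.0, 0.0, 0.0, 0.0],
    "arithmetic": [0.0, 1.0, 0.0, 0.0],
    "causal": [0.0, 0.0, 1.0, 0.0],
}
query = [0.9, 0.4, 0.1, 0.0]
active = route(query, signatures)
```

Because the decision depends only on the signature vectors, adding an imported expert in this sketch means adding one more entry to `signatures`, with no router weights to retrain.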

GestorExpertos

Typed slots + GobernadorSlots

Experts do not live in anonymous structures: they live in 5 universal typed slots (among them cognitive, memory, procedural, and imported). The GobernadorSlots is the authority that decides which slot activates, with what permissions, and whether an imported expert is compatible. Explicit, auditable, reproducible technical governance.

ExpertNetwork

Specialized SwiGLU FFN

Each active expert is a feed-forward network with SwiGLU activation and dim_expert = 8192. Three projection matrices (gate, value, output) plus normalization sum to ~100 M parameters per physical sub-expert. The output is projected back onto the unit sphere.
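The ~100 M figure follows from three matrices of size 4096 × 8192, and a SwiGLU forward pass is short enough to sketch. The assumptions here: the expert's input and output width equals holo_dim = 4096, there are no bias terms, normalization parameters are left out of the count, and the toy sizes and function names are illustrative.

```python
import math
import random

def swiglu_param_count(d_model, d_expert):
    """Gate, value, and output matrices, each d_model x d_expert, no biases."""
    return 3 * d_model * d_expert

def silu(x):
    """SiLU (swish) activation: x * sigmoid(x)."""
    return x / (1.0 + math.exp(-x))

def matvec(w, x):
    """Multiply a matrix stored as a list of rows by a vector."""
    return [sum(wi * xi for wi, xi in zip(row, x)) for row in w]

def expert_forward(x, w_gate, w_value, w_out):
    """SwiGLU feed-forward: w_out @ (silu(w_gate @ x) * (w_value @ x)),
    followed by projection back onto the unit sphere."""
    gate = [silu(g) for g in matvec(w_gate, x)]
    value = matvec(w_value, x)
    hidden = [g * v for g, v in zip(gate, value)]
    y = matvec(w_out, hidden)
    norm = math.sqrt(sum(v * v for v in y))
    return [v / norm for v in y]

print(swiglu_param_count(4096, 8192))  # 3 * 4096 * 8192

# Toy sizes (assumption) so the pure-Python forward pass stays fast.
d_model, d_expert = 8, 16
rng = random.Random(0)
w_gate = [[rng.gauss(0.0, 0.5) for _ in range(d_model)] for _ in range(d_expert)]
w_value = [[rng.gauss(0.0, 0.5) for _ in range(d_model)] for _ in range(d_expert)]
w_out = [[rng.gauss(0.0, 0.5) for _ in range(d_expert)] for _ in range(d_model)]
x = [rng.gauss(0.0, 1.0) for _ in range(d_model)]
y = expert_forward(x, w_gate, w_value, w_out)
```

With the stated dimensions the count is 100,663,296 parameters, which matches the "~100 M per physical sub-expert" in the text.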

Piece 3 · Training

DHTP: every layer with its own local brain.

Crisol is not trained with a single global optimizer like a transformer. It uses the Distributed Holographic Training Protocol (DHTP): N+1 independent AdamW optimizers, one global plus one per layer. In the Crisol Mini, with its 12 layers, that is 13 AdamW optimizers.

The effect is local learning: lower layers converge fast on syntax, higher ones slowly on abstract reasoning. And when a new slot activates, its optimizer is born without disturbing the rest.
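A minimal sketch of the N+1 layout, using a small pure-Python AdamW so the example is self-contained. The AdamW update itself is the standard one (decoupled weight decay, bias-corrected moments); everything else is an assumption for illustration: the toy parameter lists, which parameters the global optimizer owns, and the per-layer learning-rate schedule.

```python
import math

class AdamW:
    """Minimal AdamW over a flat list of floats, updated in place."""

    def __init__(self, params, lr=1e-3, betas=(0.9, 0.999), eps=1e-8, weight_decay=0.01):
        self.params = params
        self.lr, self.betas, self.eps, self.wd = lr, betas, eps, weight_decay
        self.m = [0.0] * len(params)
        self.v = [0.0] * len(params)
        self.t = 0

    def step(self, grads):
        self.t += 1
        b1, b2 = self.betas
        for i, g in enumerate(grads):
            # Decoupled weight decay, applied directly to the parameter.
            self.params[i] -= self.lr * self.wd * self.params[i]
            self.m[i] = b1 * self.m[i] + (1 - b1) * g
            self.v[i] = b2 * self.v[i] + (1 - b2) * g * g
            m_hat = self.m[i] / (1 - b1 ** self.t)
            v_hat = self.v[i] / (1 - b2 ** self.t)
            self.params[i] -= self.lr * m_hat / (math.sqrt(v_hat) + self.eps)

N_LAYERS = 12  # Crisol Mini, from the text

# Toy parameters (assumption): two floats for the global group and per layer.
global_params = [0.5, -0.5]
layer_params = [[1.0, 1.0] for _ in range(N_LAYERS)]

# One global optimizer plus one per layer. The decreasing learning rate with
# depth is an assumed way to let higher layers move more slowly.
optimizers = [AdamW(global_params, lr=1e-3)]
optimizers += [AdamW(p, lr=1e-3 * 0.9 ** depth) for depth, p in enumerate(layer_params)]

# Stepping the first layer's optimizer leaves every other layer untouched.
optimizers[1].step([1.0, 1.0])
```

Since each optimizer holds its own moment estimates, adding an optimizer for a newly activated slot in this sketch is just another `AdamW(...)` appended to the list, with no state shared with the existing ones.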

AdamW optimizers in the Crisol Mini: 13 (1+N, global plus one per layer)

HoloBinder decay: 0.95

Multiple losses combined in DHTP

Piece 4 · AKF-Z

The six-phase cognitive loop.

Where a transformer runs its forward and backward passes blindly, Crisol runs a cognitive cycle. The perception, prediction, evaluation, and metacognition phases have no counterpart in a transformer's training loop: they are the layer that turns a model into an organism.

1. Perception: NomotheticZ generates a curiosity vector that guides the sampling of the batch.

2. Prediction: HistorianZ predicts the expected loss before the forward pass: the model's surprise signal.

3. Processing: Forward pass: logits, final h_ctx, and DHTP predictions.

4. Evaluation: Loss computation and surprise: how far reality diverged from expectation.

5. Learning: Backward pass and step of the N+1 AdamW optimizers. Each layer learns at its own pace.

6. Metacognition: The ecosystem of Z agents logs the step; ArchitectZ decides whether to reorganize, InquisitorZ whether to run a causal simulation.
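The six phases can be laid out as the skeleton of one training step. This is a sketch of the control flow only: every callable is a hypothetical hook standing in for the agent named in the text, and defining surprise as the absolute gap between predicted and actual loss is an assumption.

```python
PHASES = (
    "perception", "prediction", "processing",
    "evaluation", "learning", "metacognition",
)

def akf_z_step(sample_batch, predict_loss, forward, compute_loss, optimizer_step, log):
    """Run one cycle and return (trace of phases, surprise)."""
    trace = []

    batch = sample_batch()                 # 1. curiosity-guided sampling
    trace.append("perception")

    expected = predict_loss(batch)         # 2. expected loss before the forward pass
    trace.append("prediction")

    outputs = forward(batch)               # 3. forward pass
    trace.append("processing")

    actual = compute_loss(outputs, batch)  # 4. loss and surprise
    surprise = abs(actual - expected)      # assumed definition of surprise
    trace.append("evaluation")

    optimizer_step(actual)                 # 5. backward pass and optimizer steps
    trace.append("learning")

    log({"expected": expected, "actual": actual, "surprise": surprise})
    trace.append("metacognition")          # 6. agents record and react to the step

    return trace, surprise

# Stub hooks with fixed values, only to exercise the control flow.
records = []
trace, surprise = akf_z_step(
    sample_batch=lambda: [1, 2, 3],
    predict_loss=lambda batch: 2.0,
    forward=lambda batch: batch,
    compute_loss=lambda outputs, batch: 2.5,
    optimizer_step=lambda loss: None,
    log=records.append,
)
```

The point of the ordering is that the prediction is made before the forward pass, so the surprise value is available to whatever runs in the metacognition phase.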

Follow the thread

An engine without attention needs a different memory — and solves problems the transformer cannot.

The HoloBinder is only the beginning. Persistent living memory and the catalog of structural problems complete the picture.
