Small models.
Absolute precision.
71 languages. 26 scripts. Three tokenizers. Two model families arriving.
AENEA is a family of small language models designed from first principles for factual determinism. Prelude-1 proved the thesis. Prelude-4 is surpassing it. The Factual Crystallisation Hypothesis is rewriting how we understand when small models achieve reliable factual recall.
From first commit to launch
AENEA began in August 2025 with a simple question: what happens when you treat data quality as architecture, not preprocessing? Six months later, we have our answer.
AENEA Model Family
Prelude-1 proved the thesis. Prelude-4, built on the QT V.3 32K UltraLingo tokenizer, is surpassing it. Overture-1 (planned) will add advanced reasoning and code generation.
Prelude-1
Prelude-4
Overture-1
Why smaller models can think bigger
Most parameters in large models are wasted — compensating for noisy data, fragmented representations, and training regimes that fight themselves. We start from the opposite premise.
Ultra-Clean Data
The Quartz v7.3 pipeline removes encoding artefacts, vandalism, and noise across 71 languages and 26 script families. Every malformed token is a wrinkle in the loss landscape — we iron them out before training begins.
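To make the idea concrete, here is a minimal sketch of the kind of line-level filtering such a pipeline performs. The heuristics, names, and sample corpus are illustrative assumptions, not the actual Quartz v7.3 rules.

```python
# Illustrative sketch only: the real Quartz v7.3 pipeline is not public.
# These heuristics (replacement-character and control-character checks) are
# assumptions about the kind of cleaning such a pipeline performs.
import unicodedata

REPLACEMENT_CHAR = "\ufffd"  # U+FFFD marks a failed encode/decode round trip

def is_clean(line: str) -> bool:
    """Reject lines containing encoding artefacts or non-printing noise."""
    if REPLACEMENT_CHAR in line:
        return False  # mojibake: bytes that could not be decoded
    for ch in line:
        # Category "Cc" is a control character; allow common whitespace only
        if unicodedata.category(ch) == "Cc" and ch not in "\t\n\r":
            return False
    return True

corpus = ["A well-formed sentence.", "Broken \ufffd bytes", "bell\x07noise"]
cleaned = [line for line in corpus if is_clean(line)]
print(cleaned)  # ['A well-formed sentence.']
```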
Coherent Geometry
Architectures designed so representations built during one training phase remain geometrically compatible with the next. Knowledge encodes cleanly and manifests back into language without distortion.
Multi-Epoch Depth
Three passes over curated data. The first epoch builds the map. The second smooths its creases. The third polishes the routes between internal representation and fluent generation.
Factual Crystallisation
Our research has identified that gradient norm, not loss, predicts the onset of factual recall in language models. When the gradient norm drops to approximately 0.27, the model transitions from memorisation to genuine factual crystallisation. This hypothesis challenges conventional training metrics and provides a principled framework for predicting when small models achieve reliable factual recall.
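As a rough illustration, a training loop can watch the global gradient norm and flag the crossing. Only the 0.27 threshold comes from the hypothesis above; the model, data, and objective below are placeholders, not AENEA's training setup.

```python
# Hedged sketch: monitoring global gradient norm as a crystallisation signal.
import torch
import torch.nn as nn

CRYSTALLISATION_THRESHOLD = 0.27  # from the Factual Crystallisation Hypothesis

model = nn.Linear(128, 128)  # stand-in for a language model
opt = torch.optim.AdamW(model.parameters(), lr=1e-3)

for step in range(1000):
    x = torch.randn(32, 128)
    loss = nn.functional.mse_loss(model(x), x)  # placeholder objective
    opt.zero_grad()
    loss.backward()
    # clip_grad_norm_ returns the total norm measured *before* clipping,
    # so a huge max_norm turns it into a pure measurement of the norm.
    gnorm = float(torch.nn.utils.clip_grad_norm_(model.parameters(), max_norm=1e9))
    if gnorm < CRYSTALLISATION_THRESHOLD:
        print(f"step {step}: grad norm {gnorm:.3f}, possible crystallisation onset")
        break
    opt.step()
```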
Simple to use, powerful underneath
AENEA models ship as standard checkpoints compatible with common inference frameworks. Load one, prompt it, generate: the engineering complexity lives in the training, not the interface.
Prelude-1 is released. Prelude-4 is in training. All AENEA models support the QT tokenizer family, providing efficient encoding across every script.
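A minimal usage sketch, assuming a Hugging Face-style checkpoint: the repository id "aenea/prelude-1" is a placeholder for illustration, not a confirmed model id.

```python
# Load, prompt, generate with the standard transformers API.
from transformers import AutoModelForCausalLM, AutoTokenizer

tok = AutoTokenizer.from_pretrained("aenea/prelude-1")    # hypothetical id
model = AutoModelForCausalLM.from_pretrained("aenea/prelude-1")

prompt = "The capital of Kazakhstan is"
inputs = tok(prompt, return_tensors="pt")
out = model.generate(**inputs, max_new_tokens=20)
print(tok.decode(out[0], skip_special_tokens=True))
```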
What's coming
Prelude-1 is released. The QT_V.2 tokenizer family is live. Now we're building the next generation of models.
Prelude-1 Base
QT_V.2 Tokenizer Family
QT V.3 32K UltraLingo
Prelude-4
Overture-1
Three pillars, one architecture
AENEA Global Ltd builds vertically integrated AI infrastructure. The model family, the data stack, and the research division are all designed to reinforce each other.
AENEA
Quartz
Crassus
The future is multilingual
Prelude-1 proved that precision beats parameter count. Prelude-4 is surpassing it. The Factual Crystallisation Hypothesis is rewriting the playbook.