Foreword
This document specifies a long-horizon platform for capturing, modeling, and re-instantiating an individual human being as a running approximation across changing generations of artificial intelligence. The word approximation is chosen with care. The platform does not claim to resurrect a person. It claims to construct a model of a person at the highest fidelity the available data and technology allow, and to keep that model portable as both improve.
The specification serves three readers at once, and every section is written for all three:
- A person, who should understand it on first pass, in plain language.
- An AI system, which should be able to parse it as input without ambiguity.
- A code generator, which should be able to produce working software from the contracts in Part IV without guessing.
The platform is designed for a twenty-year operational horizon without intentional obsolescence. Its single most important rule: the raw captured material and the subject's description outlive everything else. Engines will be replaced. Runtimes will be replaced. The person's data and description are carried forward.
Part I — The Architecture in Six Layers
The Virtual Human OS is a generic operating system for capturing data about one or more human subjects, processing that data, and running an approximation of those subjects with as much fidelity as the data and technology of the day allow. It wraps current and future AI systems behind a stable, versioned contract so that the person's model never depends on any one engine or vendor.
The whole system is six layers. Each has one job. Each is separated from its neighbors by a stable, versioned contract.
Core principles
| Principle | Plain meaning |
|---|---|
| Substrate independence | The subject model is never welded to any one AI engine. |
| Open, durable formats | Anything captured stays readable after its capturing tools are gone. Plain text, open codecs, documented schemas. |
| Raw material is sacred | Original data is never modified or destroyed. Derived models are regenerated; raw streams are not. |
| Versioned contracts | Every boundary between layers is a stable, versioned interface. Additive change only within a major version. |
| Affect-first modeling | The feeling subsystem is foundational, not decorative. It sits beneath the personal model, as it does in a living person. |
| Provenance everywhere | Every statement about the subject traces to its sources, its author, and a confidence value. |
| Honesty about the unverifiable | Where the platform cannot in principle confirm something — whether anything is actually felt — it says so plainly (Part VII). |
| Local-first by default | The subject's data lives on hardware the subject controls. Cloud services are optional accelerants, never custodians. (The 2024–2025 bankruptcy of a major legacy-avatar platform, with customer recordings caught inside it, is the cautionary tale — see Part VI.) |
SUBSTRATE — what thinks
The substrate is the AI engine of the day: language generation, reasoning, recall, synthesis, perception, and cognitive functions not yet invented. It is intentionally swappable. Nothing above it may call a vendor API directly.
The substrate's only face to the rest of the system is the Abstraction Contract — a small, versioned, paradigm-independent vocabulary of cognitive operations ("recall this memory," "reason against this context," "generate language conditioned on this corpus"). Adapters translate between the contract and each engine's native API. New engines are added by writing a new adapter; nothing upstream changes. The contract is given in machine-readable form in Part IV.
AFFECT — what it feels like
An AI core is cognition without an interior: it can parse, summarize, and reason about King Lear with nothing happening that a person would call grief. The AFFECT layer does not translate the core's understanding into feeling — there is nothing on the core's side to translate. It constructs feeling: it generates an affective state and feeds that state back into cognition, where it biases what the cognition does next. A loop, not a dictionary.
AFFECT has three components, stacked:
- SOMA — the simulated body. A set of synthetic interior signals — analogues of arousal, tension, fatigue, the autonomic weather — that the cognitive core is conditioned to read as its own bodily feedback and to treat as feeling rather than data. Because an AI's interior cannot be sensed directly, SOMA is calibrated from the outside: from the externalized signatures of a real body captured in MATERIAL. The face is the soma's billboard; the autonomic system is its hidden engine; both are partly observable from outside the skin, which is what makes calibration possible.
- The General Human Affect Model. The prior: core affect (broad valence and arousal) plus the coarse, widely shared signatures of emotion across faces, voices, postures, and word choice. Trainable from humanity at large — how frustration, awe, grief, or mirth generally read across many people.
- The Personal Tuning Layer. Learns the delta between what the general model predicts and how this specific subject actually behaves in known states. The general model says frustration looks like X. The subject's own paired data says: in me, frustration looks like X′ — drier, more likely to surface as a joke than a sharp word, with this particular tell. The tuning layer learns X′ minus X.
The theoretical warrant for the general/personal split is Barrett's theory of constructed emotion [10]: emotions are not universal hardwired categories but are constructed in the moment from raw core affect plus a person's learned, individual conceptual repertoire. If that is right, the personal layer is not modeling human emotion in general; it is running one particular person's concept-repertoire over core affect — which makes the affective fingerprint personal and, crucially, learnable from that person's corpus. Solms' affect-first account of consciousness [11] — feeling and homeostasis as the ground, cognition the later layer — is the warrant for placing this subsystem beneath the personal model rather than on top of it.
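The "X′ minus X" idea behind the Personal Tuning Layer can be sketched numerically. This is a minimal illustration, not the platform's method: it assumes an emotion's expression can be summarized as a few numeric features, and the feature names (`voice_sharpness`, `joke_rate`) and all values are hypothetical.

```python
from statistics import mean

def learn_delta(general, observations):
    """Per-feature correction: mean observed value minus the general prediction."""
    return {name: mean(obs[name] for obs in observations) - value
            for name, value in general.items()}

def apply_delta(general, delta):
    """Warp the general prediction into the subject's version: X + (X' - X)."""
    return {name: value + delta.get(name, 0.0) for name, value in general.items()}

# Hypothetical numbers: the general model's picture of frustration (X)...
general_frustration = {"voice_sharpness": 0.8, "joke_rate": 0.1}
# ...and two labeled sessions in which this subject reported frustration.
observed = [
    {"voice_sharpness": 0.2, "joke_rate": 0.7},
    {"voice_sharpness": 0.4, "joke_rate": 0.5},
]

delta = learn_delta(general_frustration, observed)                  # X' minus X
personal_frustration = apply_delta(general_frustration, delta)     # X'
```

Storing only the delta keeps the personal layer small and lets a better general model be swapped in later without discarding what was learned about the subject.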
The build-order gradient
Emotions sit on a gradient of how bound they are to the body, and the gradient is the build order. At the visceral end — disgust, fear, sexual arousal, the bodily punch of grief — feeling is deeply somatic and hardest to construct without a mature SOMA. At the cognitive end — nostalgia, intellectual awe, the pleasure of an elegant proof, frustration, schadenfreude — feeling is mostly constructed concept over thin core affect, and is reachable now from corpus and concept modeling.
Humor sits encouragingly far toward the tractable end: mirth is affect triggered by a cognitive event — incongruity resolved. Practical consequence: the subject's humor, often among the most identifying things about a person, may be among the first things the subsystem carries convincingly, not the last. Build from the cognitive end inward; defer the visceral end until SOMA and the multimodal capture are mature.
SUBJECT — what we model
The SUBJECT layer is the person, described in the Human Description Language (HDL): a small domain-specific language for a subject's drives, social patterns, decision heuristics, self-narrative, and — new in this version — affective fingerprint. HDL is fully specified in Parts II and III. For architectural purposes it is a sealed unit: a stable language, parseable by any conforming compiler, durable across substrate generations.
MATERIAL — what flows in
MATERIAL ingests and preserves everything captured about the subject. Each stream type enters through a dedicated importer with a narrow, stable contract; the archive remains authoritative over every derived database or index; raw data is never destroyed.
| Stream | What it carries, in affective terms |
|---|---|
| Text corpus — journals, blogs, essays, email, chat logs | Labels. The subject's own account of what they were feeling in a given situation. Honest journals are the most valuable text the platform can hold. |
| Structured interviews — recorded life-story sessions | The single highest-leverage capture artifact known as of 2026: a two-hour interview outperforms demographic and persona profiles for simulating an individual [1]. Record it; transcribe it; keep both. |
| Video — micro-expressions, posture, gesture speed | The externalized soma. The surface readout of interior states, including the signal the composed self does not volunteer. |
| Biometric streams — heart rate and related autonomic measures, e.g. a smartwatch worn while filming | Underneath the face. A micro-expression can be performed; a heart-rate spike is far harder to fake. |
| Future inputs — e.g. high-resolution neuroimaging | Specified as future tiers, never current dependencies. The platform must work without them. A spec that overpromises its inputs rots. |
The synchronization requirement
Channels are only useful together if their clocks align. Capture sessions must record a shared clock across camera, biometric device, and (where possible) journal entry, so a moment in the video can be matched to the heart rate at that instant and to the subject's later written account of it. Time-synchronization is a hard requirement, not a convenience.
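One way to honor the shared-clock requirement is sketched below. It assumes each device's offset (in seconds, added to the device's own timestamps to reach shared time) has been recorded for the session; the offset convention, the function names, and the sample values are assumptions made for illustration.

```python
from bisect import bisect_left

def to_shared(device_ts, offset_s):
    """Convert a device-local timestamp (seconds) to the session's shared clock."""
    return device_ts + offset_s

def nearest_sample(samples, t):
    """Return the (shared_ts, value) sample closest to shared time t.

    `samples` must be non-empty and sorted by timestamp.
    """
    times = [ts for ts, _ in samples]
    i = bisect_left(times, t)
    candidates = samples[max(i - 1, 0): i + 1]
    return min(candidates, key=lambda s: abs(s[0] - t))

# Hypothetical session: the watch clock runs 1.5 s ahead of the camera clock.
offsets = {"camera": 0.0, "watch": -1.5}
hr_raw = [(100.0, 62), (101.0, 64), (102.0, 90)]            # watch-local time, bpm
hr_shared = [(to_shared(ts, offsets["watch"]), bpm) for ts, bpm in hr_raw]

moment = to_shared(100.4, offsets["camera"])                 # a frame of interest
matched = nearest_sample(hr_shared, moment)
```

Without the offset, the frame at 100.4 would be paired with the 62 bpm reading; with it, the frame lines up with the spike.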
The triangulation principle
Three channels measuring the same moment give a structure no single channel provides:
- Where the honest journal, the involuntary micro-expression, and the autonomic signal agree, the platform has something close to a true label — high confidence about what was actually felt.
- Where they diverge, the platform has not found noise; it has found the moments the subject felt one thing and showed another — arguably more identifying than the aligned cases, and a deeply personal signature in their own right. The tuning layer treats divergence as a feature, not noise.
Honest caveat: no channel is ground truth alone. The journal is a report of feeling, filtered through how the subject narrates the self. Video is less mediated but ambiguous. Biometrics are real but coarse — arousal does not name its own cause. The value is in the triangulation, never in any single stream's authority.
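The agree/diverge distinction can be sketched as a small classifier over one synchronized moment. This is an illustration under strong simplifying assumptions: each channel has already been reduced to a single token, and the word-to-arousal table is a hypothetical stand-in, not part of the vocabulary file.

```python
# Hypothetical arousal level for a few emotion words (illustration only).
AROUSAL = {"anxious": "high", "excited": "high", "calm": "low", "serene": "low"}

def triangulate(felt, shown, measured_arousal):
    """Classify one moment from three channels.

    felt             -- word from the subject's journal entry
    shown            -- word read from video (expression, posture)
    measured_arousal -- "high" or "low", from the biometric stream
    """
    body_supports_journal = AROUSAL.get(felt) == measured_arousal
    if body_supports_journal and felt == shown:
        return "aligned"        # all three channels agree: near-true label
    if body_supports_journal:
        return "divergent"      # felt one thing, showed another: keep as signal
    return "unresolved"         # journal and body disagree: low confidence

labels = [
    triangulate("anxious", "anxious", "high"),
    triangulate("anxious", "calm", "high"),
    triangulate("calm", "calm", "high"),
]
```

The "divergent" outcome is deliberately not folded into an error bucket, in keeping with the principle that divergence is a feature of the subject rather than noise.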
RUNTIME — what runs
The RUNTIME brings a described subject to life through one or more embodiments: text interfaces, synthetic voice, video avatars, virtual and augmented reality, robotics, and concurrent multi-instance operation. It translates the subject's HDL description into substrate operations through the Abstraction Contract. Different runtimes may emphasize different aspects of the subject without modifying the underlying description.
The avatar is dual-purpose. Run forward, it is the output embodiment — the face and body through which the approximation expresses itself, with AFFECT driving micro-expression and posture rather than pasting them on. Run backward over the subject's archive, the same machinery is the instrument that recovers body signal — the analyzer that reads micro-expression and posture out of captured video to feed SOMA's calibration. The thing that displays affect and the thing that reads affect are the same machinery pointed in opposite directions.
The full runtime loop: SOMA generates raw interior signal → the General Affect Model interprets it into coarse emotion → the Personal Tuning Layer warps that into the subject's specific fingerprint → expression flows out through the avatar → and, during capture, the avatar analyzer reads real expression back in as data. A closed, coherent loop — and not one piece of it requires solving consciousness.
CONTINUITY — what survives
CONTINUITY ensures survivability across decades: migration between substrates, versioned contracts, open specifications, forward-compatible data structures, multi-language implementation support. Each generation of AI technology consumes the original captured material, reapplies updated processing, and re-instantiates the subject in newer systems.
Because the affective subsystem is calibrated from raw multimodal capture, the raw synchronized streams are part of the irreplaceable record. Derived affect models can be regenerated by better future methods only if the original streams survive. Preservation priority, in order: (1) raw capture, (2) the HDL description, (3) derived projections, (4) everything else. The RUNTIME may be replaced; the SUBSTRATE may be replaced; the subject's data and description are preserved and ported forward.
Part II — The Human Description Language (HDL)
HDL is a small language for describing the internal structure of one person: their values, their patterns of feeling and thinking, their characteristic ways of acting toward others, and the story they tell about themselves. The description serves as a stable input to AI systems that simulate the person across time.
Two design rules govern everything:
- Readable. Any reasonably literate person can read an HDL document and understand what it says about the subject on first pass.
- Parseable. An AI compiler can read the same document and produce a strict structured form without ambiguity.
The prose is the authored truth; the structured form (Part IV) is the derived projection used at runtime. Both carry the same provenance.
The four subject layers
Every HDL statement belongs to exactly one layer:
| Layer | What it describes |
|---|---|
| Drives | Basic motivations — what the subject pursues, fears, and requires to feel stable. Survival, fear, desire, stability. |
| Social | How the subject relates to others — belonging, reciprocity, trust, altruism, in-group and out-group. |
| Heuristics | How the subject decides — reasoning shortcuts, biases, susceptibility to influence. |
| Narrative | The story the subject tells about themselves — identity, moral framing, how they make sense of their past. |
Note on qualia. Gender identity, sexual desire, and other deeply personal experiences are distributed across the layers where they express themselves — in Narrative as self-conception, in Social as relating, in Drives as motivation — rather than carved out as a separate layer. These four layers describe the subject; they are distinct from the four sides of the emotion vocabulary below, which describe where an emotion lives in experience.
The authoring model and the self-authority rule
HDL is dual-authored. The subject may write statements directly; these are authoritative. The AI compiler derives statements from captured material; these carry full provenance back to their sources. Both coexist in the same document, distinguished by @AUTHOR.
The self-authority rule: a statement authored by the subject is preserved as written. The compiler must not lower its confidence, modify it, or remove it. When evidence appears to contradict a subject-authored statement, the compiler leaves it unchanged and may add new, related-but-distinct statements (with their own sources and confidence) that refine the model without overruling the subject. The subject is the authority on their own identity; the compiler is the authority on what the evidence shows. Both voices remain visible. Neither overwrites the other.
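A compiler could enforce the self-authority rule with a guard like the one below. This is a sketch, not a prescribed mechanism: it assumes statements are held as dictionaries with an `author` field, and the helper name `apply_proposal` is hypothetical.

```python
def apply_proposal(statements, index, proposal):
    """Apply a compiler proposal aimed at statements[index].

    A subject-authored statement is never replaced; the proposal is added
    beside it as a separate statement. Compiler-authored statements may be
    superseded. Returns a new list; the input is left untouched.
    """
    if statements[index]["author"] == "self":
        return statements + [proposal]
    return statements[:index] + [proposal] + statements[index + 1:]

existing = [
    {"author": "self", "text": "THIS SUBJECT IDENTIFIES_AS a patient listener.", "confidence": 1.0},
    {"author": "ai", "text": "THIS SUBJECT PREFERS written debate.", "confidence": 0.6},
]
refinement = {"author": "ai", "text": "THIS SUBJECT RESISTS interruption.", "confidence": 0.7}

after_self = apply_proposal(existing, 0, refinement)   # appended, original kept
after_ai = apply_proposal(existing, 1, refinement)     # replaced in place
```

Both voices stay visible in `after_self`: the subject's statement keeps its wording and confidence, and the evidence-based statement sits next to it.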
Statement forms
A statement is one of three shapes, each followed by an annotation block.
Assertion — a stable trait, value, belief, or disposition:
THIS SUBJECT VALUES understanding nature deeply. @LAYER drives @AUTHOR ai @CONFIDENCE 0.95 @SOURCES [writings/autobiographical-notes-1949.txt#para_3]
Conditional — a stimulus–response pattern:
WHEN THIS SUBJECT is interrupted mid-thought, they FEEL irritated weakly. @LAYER drives @AUTHOR self @CONFIDENCE 1.00 @SOURCES [self-attestation/2026-06-09]
Chain — a blended sequence in which one inner state causes the next, across the sides of experience. The verb at each link signals which side that link inhabits: FEEL for Internal Feelings, BECOME for a Thinking or Body State, ACT for External Behavior:
WHEN THIS SUBJECT learns of war or atrocity, they FEEL distressed, THEN BECOME reflective, THEN ACT defiant. @LAYER social @AUTHOR ai @CONFIDENCE 0.80 @SOURCES [biography/isaacson-2007.ch_14] @CHAIN [distressed -> reflective -> defiant]
Lines beginning with # are comments. Optional intensity modifiers (strongly, moderately, weakly, rarely, occasionally, always) follow the object of an assertion or the response of a conditional.
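Because each chain verb names the side its link inhabits, a chain can be read mechanically. The sketch below covers only the three verbs defined above (FEEL, BECOME, ACT); the regular expression and the function name are illustrative assumptions, not a reference grammar.

```python
import re

# Verb at each link -> side of experience, as described for the chain form.
SIDE_BY_VERB = {"FEEL": "feel", "BECOME": "become", "ACT": "act"}
LINK = re.compile(r"\b(FEEL|BECOME|ACT)\s+([A-Za-z][A-Za-z_-]*)")

def parse_chain(sentence):
    """Extract the ordered emotion words and the side each one lives on."""
    links = LINK.findall(sentence)
    return {
        "chain": [word for _, word in links],
        "chain_sides": [SIDE_BY_VERB[verb] for verb, _ in links],
    }

parsed = parse_chain(
    "WHEN THIS SUBJECT learns of war or atrocity, they FEEL distressed, "
    "THEN BECOME reflective, THEN ACT defiant."
)
```

A conditional is simply the one-link case: the same function returns a single word and a single side.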
The verb set
The verb set is small and additive: a verb is added only when a workaround using existing verbs appears at least twice in real compilation output.
| Verb | Meaning | Typical layer |
|---|---|---|
| VALUES | Holds as a core value or aspiration. | Drives / Social |
| BELIEVES | Holds as a factual or moral conviction. | Narrative |
| FEARS | Sustained aversion to. | Drives |
| TRUSTS | Extends reliability or confidence to. | Social |
| PREFERS | Leans toward in choice. | Heuristics |
| DECIDES | Characteristic decision style. | Heuristics |
| WEIGHS | Gives unusual weight to a consideration. | Heuristics |
| DISCOUNTS | Habitually under-weights a consideration. | Heuristics |
| ANCHORS_ON | Fixes initial estimates from a reference point. | Heuristics |
| DEFAULTS_TO | When uncertain, falls back to a stance or action. | Heuristics |
| IS_SWAYED_BY | Susceptibility to a channel of influence. | Heuristics |
| RESISTS | Habitually pushes back against a kind of pressure. | Heuristics |
| FRAMES | Self-narrative about self, others, or the world. | Narrative |
| HOPES | Forward-directed aspiration. | Narrative / Drives |
| IDENTIFIES_AS | Self-applied role, label, or identity. | Narrative |
| REJECTS | Explicitly disavows a position or framing. | Narrative / Social |
The chain-form verbs FEEL, BECOME, DO, ACT appear only inside conditional and chain forms, never as standalone assertions.
The annotation block
Every statement is followed by an indented annotation block, one tag per line, in any order. The first four are required.
| Tag | Required | Meaning |
|---|---|---|
| @LAYER | yes | One of: drives, social, heuristics, narrative. |
| @AUTHOR | yes | One of: self, ai, other:<id>. |
| @CONFIDENCE | yes | 0.00–1.00. Subject-authored statements are conventionally 1.00. |
| @SOURCES | yes | List of source-pointers: relative paths with optional fragment, e.g. journal/2024-11-02.txt#para_3. self-attestation/<date> is the canonical pointer for statements the subject wrote directly. |
| @CHAIN | no | Explicit chain flow for compilers that prefer not to parse prose. |
| @AS_OF | no | When the claim was first true or last confirmed; supports temporal modeling (a value held at twenty-five may differ from one held at sixty-five). |
| @AFFECT | no | (new in v3.0) Links the statement to entries in the subject's AFFECT_FINGERPRINT (Part III). |
| @NOTE, @REVIEWED | no | Free commentary; review trail. |
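Reading an annotation block is mostly a matter of splitting on the `@TAG` markers. The sketch below is one possible reading, not a reference parser: it assumes tag values never contain an `@` followed by capitals, accepts several tags on one line (as the worked example in this part does), and types only `@CONFIDENCE` and `@SOURCES`.

```python
import re

REQUIRED = {"LAYER", "AUTHOR", "CONFIDENCE", "SOURCES"}
TAG = re.compile(r"@([A-Z_]+)\s+")

def parse_annotations(block):
    """Parse an annotation block into a dict; raise if a required tag is missing."""
    parts = TAG.split(block)              # [preamble, tag, value, tag, value, ...]
    tags = {}
    for name, raw in zip(parts[1::2], parts[2::2]):
        value = " ".join(raw.split())     # collapse line breaks and indentation
        if name == "SOURCES":
            value = [item.strip() for item in value.strip("[]").split(",") if item.strip()]
        elif name == "CONFIDENCE":
            value = float(value)
        tags[name] = value
    missing = REQUIRED - tags.keys()
    if missing:
        raise ValueError(f"missing required tags: {sorted(missing)}")
    return tags

example = """
    @LAYER social @AUTHOR ai @CONFIDENCE 0.92
    @SOURCES [correspondence/einstein-freud-why-war-1933.txt,
              statements/manifesto-to-europeans-1914.txt]
"""
tags = parse_annotations(example)
```

Range checks on the parsed values (for instance, confidence within 0.00–1.00) are a separate validation step and are not performed here.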
The Heuristics layer: two synchronized forms
Drives, Social, and Narrative live naturally in prose. Heuristics — the characteristic shortcuts of a mind — are awkward in prose at simulator precision and awkward in pure numbers at human-reviewer texture. So the Heuristics layer travels in two equivalent forms: the prose form (the authored truth, same grammar as everything else) and a derived projection in a small structured vocabulary the runtime consumes directly. The compiler keeps both in sync; the same provenance rules govern both. The projection is the only place in HDL where numbers are first-class.
The projection has three slot families, each slot carrying a value in [0.0, 1.0] (or a qualitative token where the evidence doesn't support calibration) plus a confidence:
| Family | Slots |
|---|---|
| decision_style (axes, low→high) | deliberation impulsive→deliberate · intuition_vs_analysis intuitive→analytical · risk_tolerance averse→seeking · novelty_appetite familiar→novel · horizon short→long · granularity detail-first→pattern-first · consensus_orientation contrarian→consensus |
| biases (strength) | availability · anchoring · recency · narrative · authority · in_group · sunk_cost · confirmation |
| influence_susceptibility (strength) | emotional_appeal · repetition · social_proof · scarcity_urgency · authority_signaling · visual_vs_textual |
Each prose verb maps to projection slots via a published mapping table maintained alongside the vocabulary — for example, DECIDES from first principles raises intuition_vs_analysis toward analytical and raises deliberation; RESISTS a pressure source lowers that influence channel. Intensity modifiers translate to discrete deltas tuned per compiler version. A reviewer can always retrace any projection value to the prose statements that produced it, and from there to the sources. That traceability is non-negotiable.
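The prose-to-projection step can be sketched as a table lookup that also records where each number came from. Everything concrete here is an assumption for illustration: the step sizes, the neutral baseline of 0.5, the keyword matching, and the two mapping rows (which mirror the DECIDES and RESISTS examples above) are not the published mapping table.

```python
# Hypothetical per-modifier step sizes; the spec leaves these to each compiler version.
INTENSITY_STEP = {"weakly": 0.05, "moderately": 0.10, "strongly": 0.20}
DEFAULT_STEP = 0.10

# Hypothetical mapping rows: (verb, keyword in object, [(slot, direction), ...]).
MAPPING = [
    ("DECIDES", "first principles", [("decision_style.intuition_vs_analysis", +1),
                                     ("decision_style.deliberation", +1)]),
    ("RESISTS", "authority", [("influence_susceptibility.authority_signaling", -1)]),
]

def project(statements, baseline=0.5):
    """Derive slot values in [0, 1] plus a trace from each slot back to its prose."""
    slots, trace = {}, {}
    for stmt in statements:
        step = INTENSITY_STEP.get(stmt.get("intensity"), DEFAULT_STEP)
        for verb, keyword, effects in MAPPING:
            if stmt["verb"] == verb and keyword in stmt["object"]:
                for slot, direction in effects:
                    value = slots.get(slot, baseline) + direction * step
                    slots[slot] = min(1.0, max(0.0, value))
                    trace.setdefault(slot, []).append(stmt["object"])
    return slots, trace

slots, trace = project([
    {"verb": "DECIDES", "object": "from first principles when stakes are high", "intensity": None},
    {"verb": "RESISTS", "object": "authority pressure", "intensity": "strongly"},
])
```

The `trace` dictionary is what makes a projection value retraceable: a reviewer can go from any slot to the prose statements that moved it.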
The emotion vocabulary
HDL prose references emotion words in FEEL responses and chain forms. These resolve against a versioned vocabulary file (currently vocabulary v0.2, 193 entries, unchanged since v1.1 of this specification). The vocabulary organizes emotion by four plain-language sides. The same word may live on more than one side at once, because most emotions are blended: a person who is anxious is feeling something, thinking something, and bracing physically, all at once.
| Side | What it covers | Categories |
|---|---|---|
| Internal Feelings | What is felt inside. | high-energy uplifting · high-energy distressing · low-energy uplifting · low-energy distressing |
| Body States | What the body is doing. | activated · drained |
| Thinking States | How the mind is working right now. | disrupted · engaged · overloaded · self-focused |
| External Behavior | How the person shows up to others. | withdrawing · asserting · welcoming · yielding |
The complete canonical word lists appear in the Appendix, grouped by category — a form both a reader and a code generator can consume. (Per-word working definitions remain in v1.1, Appendix B; they are unchanged.) New words may be added in any minor version; existing mappings are never removed or repurposed; subject-specific additions live in a parallel file referenced from the document header and never pollute the shared vocabulary.
Worked example — a historical figure
A historical subject is the natural first instantiation: bounded, public source material; the subject's own writings standing in for @AUTHOR self; contemporary writings standing in for @AUTHOR ai — the full grammar exercised end to end with no consent questions. (Independent research has validated exactly this approach: trainable simulacra of historical figures built from collected writings and records [5].) The example is deliberately compact.
HDL_VERSION 0.4
SUBJECT_ID "albert_einstein"
PRIMARY_AUTHOR_MODE ai
LAST_COMPILED 2026-06-09

# === Drives ===
THIS SUBJECT VALUES understanding nature deeply.
    @LAYER drives @AUTHOR ai @CONFIDENCE 0.95
    @SOURCES [writings/autobiographical-notes-1949.txt#para_3]

# === Social ===
THIS SUBJECT REJECTS militarism as a path to human security.
    @LAYER social @AUTHOR ai @CONFIDENCE 0.92
    @SOURCES [correspondence/einstein-freud-why-war-1933.txt,
              statements/manifesto-to-europeans-1914.txt]

# === Heuristics ===
THIS SUBJECT DECIDES from first principles when stakes are high.
    @LAYER heuristics @AUTHOR ai @CONFIDENCE 0.85
    @SOURCES [letters/1936-1945/principle-cases.txt]
THIS SUBJECT WEIGHS internal coherence strongly.
    @LAYER heuristics @AUTHOR ai @CONFIDENCE 0.90
    @SOURCES [essay/physics-and-reality-1936.txt]

# === Narrative ===
THIS SUBJECT IDENTIFIES_AS a pacifist by conviction.
    @LAYER narrative @AUTHOR ai @CONFIDENCE 0.90
    @SOURCES [interview/the-nation-1931.txt#para_4]
THIS SUBJECT BELIEVES the universe is comprehensible by reason.
    @LAYER narrative @AUTHOR ai @CONFIDENCE 0.93
    @SOURCES [essay/physics-and-reality-1936.txt#section_1]

# === Conditional and chain ===
WHEN THIS SUBJECT learns of war or atrocity,
  they FEEL distressed,
  THEN BECOME reflective,
  THEN ACT defiant.
    @LAYER social @AUTHOR ai @CONFIDENCE 0.80
    @SOURCES [biography/isaacson-2007.ch_14]
    @CHAIN [distressed -> reflective -> defiant]

# === Derived projection (Heuristics layer) ===
HEURISTICS_PROJECTION subject_id="albert_einstein"
  decision_style:
    deliberation: 0.85 @CONFIDENCE 0.82
    intuition_vs_analysis: 0.60 @CONFIDENCE 0.80
    novelty_appetite: 0.80 @CONFIDENCE 0.78
    horizon: long @CONFIDENCE 0.88
    granularity: pattern-first @CONFIDENCE 0.90
    consensus_orientation: 0.20 @CONFIDENCE 0.85
  biases:
    narrative: 0.55 @CONFIDENCE 0.70
    authority: 0.20 @CONFIDENCE 0.80
  influence_susceptibility:
    emotional_appeal: 0.30 @CONFIDENCE 0.72
    authority_signaling: 0.20 @CONFIDENCE 0.78
Reserved words
- Header keywords: HDL_VERSION, SUBJECT_ID, PRIMARY_AUTHOR_MODE, LAST_COMPILED, VOCABULARY_NAME, VOCABULARY_VERSION, LAST_UPDATED.
- Statement keywords: THIS SUBJECT, WHEN, THEN.
- Assertion verbs: the sixteen in the verb table.
- Chain verbs: FEEL, BECOME, DO, ACT.
- Intensity modifiers: strongly, moderately, weakly, rarely, occasionally, always.
- Annotation tags: @LAYER, @AUTHOR, @CONFIDENCE, @SOURCES, @NOTE, @REVIEWED, @CHAIN, @AS_OF, @AFFECT.
- Layer values: drives, social, heuristics, narrative.
- Author values: self, ai, other:<id>.
- Projection keywords: HEURISTICS_PROJECTION, AFFECT_FINGERPRINT, decision_style, biases, influence_susceptibility.
Part III — The Affective Fingerprint in HDL
This part closes the open fork left by v2.0: HDL is now affect-aware. The biographical and cognitive HDL of Part II remains the base; the affective dimension is what makes the described subject a particular person rather than a competent generic human. It is carried in one new top-level block, AFFECT_FINGERPRINT, which — like the heuristics projection — is a derived structured form, compiled from MATERIAL with full provenance, reviewable by the subject, and never authoritative over subject-authored prose.
The block has four sections:
| Section | What it holds | Compiled from |
|---|---|---|
| concept_repertoire | The subject's personal emotion categories [10]: which vocabulary words they actually use of themselves, how finely they differentiate, under what triggers. May reference subject-specific vocabulary additions. | Text corpus, especially journals. |
| tuning_deltas | The X′-minus-X corrections the Personal Tuning Layer has learned over the General Affect Model, per emotion and per context. | Synchronized capture sessions. |
| divergence_map | The characteristic ways this subject's shown affect departs from felt affect: the felt-one-thing-showed-another signatures from triangulation. | Journal vs. video vs. biometrics. |
| soma_calibration | How this particular body's external signatures map back to SOMA's synthetic interior signals. | Video + biometric streams. |
A compact illustrative fragment:
AFFECT_FINGERPRINT subject_id="albert_einstein"
  concept_repertoire:
    frequent: [reflective, amused, defiant, serene, troubled]
    fine_distinctions: [troubled vs. distressed] @CONFIDENCE 0.62
  tuning_deltas:
    frustrated:
      general: "sharp word, raised voice"
      personal: "dry joke, subject change" @CONFIDENCE 0.70
      @SOURCES [sessions/2026-03-11/, journal/2026-03-11.md#para_2]
  divergence_map:
    - felt: anxious shown: calm
      context: "public speaking" @CONFIDENCE 0.55
      @SOURCES [sessions/2026-04-02/]
  soma_calibration:
    arousal_baseline_hr: 58
    arousal_spike_tell: "jaw set before voice changes" @CONFIDENCE 0.48
Prose statements may point into this block with the @AFFECT annotation, so a chain like feels anxious → becomes self-conscious → acts shy can carry the specific tuning data that makes the chain run in this subject's particular way.
Part IV — Machine-Readable Contracts
This part exists so that code can be generated from the specification without guessing. Three contracts: the archive layout, the data schemas, and the substrate interface. Prose remains the authored truth; these are its derived projections, versioned with the spec.
Contract 1 — the archive layout
One directory per subject. Plain files, open formats, no database required to read it. The raw/ tree is append-only and never modified; everything under derived/ can be deleted and regenerated.
subjects/<subject_id>/
  manifest.json                 # spec version, checksums, inventory
  raw/                          # APPEND-ONLY. The irreplaceable record.
    text/                       # journals (.md), email, posts, chat exports
    audio/                      # interviews, voice memos (.flac/.wav + .txt transcript)
    video/                      # capture sessions (.mp4/open codec)
    biometrics/                 # heart rate etc. (.csv: utc_timestamp,metric,value)
    interviews/                 # life-story interview recordings + transcripts
  sessions/<ISO-timestamp>/     # SYNCHRONIZED capture units
    clock.json                  # shared-clock offsets for every device in session
    video.mp4
    hr.csv
    journal.md                  # subject's later written account, timestamped
  hdl/
    subject.hdl                 # the prose document. The authored truth.
    vocabulary-custom.hdl       # subject-specific emotion additions (optional)
  derived/<compiler-version>/
    statements.json             # compiled HDL statements (schema below)
    heuristics.json             # heuristics projection
    affect.json                 # affect fingerprint
    index/                      # embeddings, search indexes; all regenerable
manifest.json records the spec version, a checksum (SHA-256) for every file under raw/ and sessions/, and the compiler version of each derived tree. Three copies of raw/, sessions/, and hdl/ on independent media is the minimum preservation posture.
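The checksum inventory can be built with the standard library alone. The sketch below assumes the layout described above (raw/ and sessions/ directly under the subject directory); the function names and the shape of the returned dictionary are illustrative choices, not a mandated manifest format.

```python
import hashlib
from pathlib import Path

def sha256_file(path, chunk_size=1 << 20):
    """Hash a file in chunks so large video files never sit fully in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for block in iter(lambda: handle.read(chunk_size), b""):
            digest.update(block)
    return digest.hexdigest()

def build_checksums(subject_dir):
    """Map each file under raw/ and sessions/ (relative POSIX path) to its SHA-256."""
    root = Path(subject_dir)
    checksums = {}
    for tree in ("raw", "sessions"):
        base = root / tree
        if not base.is_dir():
            continue
        for path in sorted(base.rglob("*")):
            if path.is_file():
                checksums[path.relative_to(root).as_posix()] = sha256_file(path)
    return checksums

def changed_files(subject_dir, recorded):
    """Return recorded paths whose content changed or went missing."""
    current = build_checksums(subject_dir)
    return sorted(p for p, digest in recorded.items() if current.get(p) != digest)
```

Running `changed_files` against each of the three copies is a cheap periodic check that the append-only record has not silently degraded.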
Contract 2 — the data schemas
A compiled HDL statement, as JSON (the canonical interchange form between compiler and runtime):
{
  "$schema": "vhos/3.0/statement",
  "subject_id": "string",
  "form": "assertion | conditional | chain",
  "verb": "VALUES | BELIEVES | FEARS | TRUSTS | PREFERS | DECIDES |
           WEIGHS | DISCOUNTS | ANCHORS_ON | DEFAULTS_TO |
           IS_SWAYED_BY | RESISTS | FRAMES | HOPES |
           IDENTIFIES_AS | REJECTS",
  "object": "string (free text)",
  "stimulus": "string | null // conditional and chain forms",
  "chain": ["emotion-word", "..."],
  "chain_sides": ["feel | become | act", "..."],
  "intensity": "strongly | moderately | weakly |
                rarely | occasionally | always | null",
  "layer": "drives | social | heuristics | narrative",
  "author": "self | ai | other:<id>",
  "confidence": 0.0,
  "sources": ["relative/path.ext#fragment", "..."],
  "as_of": "ISO-8601 date | null",
  "affect_refs": ["fingerprint-entry-id", "..."],
  "note": "string | null",
  "reviewed": "ISO-8601 date | null"
}
The heuristics projection and affect fingerprint serialize as in Parts II–III: every leaf value pairs with a confidence, and every section carries sources reachable through its metadata. Validation rules a generator must enforce: confidence in [0,1]; layer, author, verb, intensity drawn only from the reserved-word sets; chain words resolving against the vocabulary file or the subject's custom additions; statements with author=self immutable to the compiler (the self-authority rule, enforced in code).
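The field-level validation rules translate directly into code. The sketch below checks one compiled statement held as a dictionary; the reserved-word sets are copied from this specification, while the function name, the error strings, and the tiny vocabulary used in the example are assumptions.

```python
LAYERS = {"drives", "social", "heuristics", "narrative"}
INTENSITIES = {"strongly", "moderately", "weakly", "rarely", "occasionally", "always"}
VERBS = {
    "VALUES", "BELIEVES", "FEARS", "TRUSTS", "PREFERS", "DECIDES", "WEIGHS",
    "DISCOUNTS", "ANCHORS_ON", "DEFAULTS_TO", "IS_SWAYED_BY", "RESISTS",
    "FRAMES", "HOPES", "IDENTIFIES_AS", "REJECTS",
}

def validate_statement(stmt, vocabulary):
    """Return a list of rule violations; an empty list means the statement passes."""
    errors = []
    if not 0.0 <= stmt["confidence"] <= 1.0:
        errors.append("confidence out of range")
    if stmt["layer"] not in LAYERS:
        errors.append("unknown layer")
    author = stmt["author"]
    is_other = author.startswith("other:") and len(author) > len("other:")
    if author not in {"self", "ai"} and not is_other:
        errors.append("unknown author")
    if stmt.get("verb") is not None and stmt["verb"] not in VERBS:
        errors.append("unknown verb")
    if stmt.get("intensity") is not None and stmt["intensity"] not in INTENSITIES:
        errors.append("unknown intensity")
    for word in stmt.get("chain") or []:
        if word not in vocabulary:
            errors.append(f"unknown emotion word: {word}")
    return errors

vocab = {"distressed", "reflective", "defiant"}   # stand-in for the vocabulary file
good = {"confidence": 0.8, "layer": "social", "author": "ai",
        "chain": ["distressed", "reflective", "defiant"]}
bad = {"confidence": 1.2, "layer": "mood", "author": "other:",
       "verb": "LIKES", "chain": ["smug"]}
```

Collecting every violation, rather than stopping at the first, gives a reviewer the full picture of what a compiler run got wrong.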
Contract 3 — the substrate interface
The Abstraction Contract, as the minimal interface every adapter implements. Paradigm-independent; deliberately small; versioned; additive-only within a major version.
interface Substrate {                          // Abstraction Contract v3.0
  capabilities() -> CapabilityReport           // versions, modalities, limits
  generate(prompt: Text,
           corpus: SourceRefs,                 // grounding material
           persona: StatementSet,              // compiled HDL statements
           affect: AffectState | null,         // current constructed state
           params: GenParams) -> Text
  recall(query: Text, scope: SourceRefs) -> Passages   // retrieval
  reason(context: Text, question: Text) -> {conclusion, rationale}
  embed(text: Text) -> Vector
  // AFFECT support (optional capability; report via capabilities())
  read_soma(signals: SomaState) -> CoarseAffect        // general model
  apply_tuning(coarse: CoarseAffect,
               fingerprint: AffectFingerprint) -> AffectState
}
An adapter for a 2026-era local language model implements generate, recall, reason, and embed trivially; the two affect calls may return unsupported until the AFFECT components exist. Nothing upstream breaks. That is the point of the contract.
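How an adapter can decline the affect calls without breaking callers is sketched below. This is a partial illustration under stated assumptions: the model is a plain callable from prompt text to text (a stand-in for a local engine), "unsupported" is modeled as an exception, only a subset of the interface is shown, and all class and function names are hypothetical.

```python
class Unsupported(Exception):
    """Raised when an adapter does not implement an optional capability."""

class LocalModelAdapter:
    def __init__(self, complete):
        # `complete` is any callable str -> str; a stand-in for a local model.
        self._complete = complete

    def capabilities(self):
        return {"contract": "3.0", "operations": ["generate"], "affect": False}

    def generate(self, prompt, persona=(), affect=None):
        """Prefix the prompt with persona statements; affect is ignored for now."""
        header = "\n".join(persona)
        return self._complete(f"{header}\n\n{prompt}" if header else prompt)

    def read_soma(self, signals):
        raise Unsupported("read_soma")

    def apply_tuning(self, coarse, fingerprint):
        raise Unsupported("apply_tuning")

def coarse_affect_or_none(substrate, signals):
    """Upstream code asks for affect and carries on without it if unavailable."""
    if not substrate.capabilities().get("affect"):
        return None
    try:
        return substrate.read_soma(signals)
    except Unsupported:
        return None

adapter = LocalModelAdapter(lambda text: text.upper())   # toy "model"
reply = adapter.generate("hello", persona=["THIS SUBJECT VALUES brevity."])
affect = coarse_affect_or_none(adapter, {"arousal": 0.4})
```

Checking `capabilities()` first keeps the common path free of exception handling, while the `try` guards against adapters that over-report.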
Part V — Building It: A Practical Path
This part turns the architecture into a sequence a single person can begin today. The stages are ordered by leverage: each produces something usable on its own, and the earliest stages need no code at all. The evidence from the field (Part VI) shapes the order — in particular, the finding that a recorded two-hour life-story interview is currently the highest-value single artifact for simulating an individual [1].
Stage 0 — accumulate the data (now; no code)
- Stand up the archive. Create the directory layout of Part IV on storage you control. Local-first; encrypted backups; three copies.
- Record the life-story interview. Two or more hours, video and audio, following an arc from childhood through career, relationships, values, and worldview. Transcribe it. This single artifact, used as grounding context, is what carried the strongest published individual-simulation results to date [1].
- Journal with the vocabulary. Date-stamped entries that name feelings using the emotion vocabulary (Appendix). These become the labels for everything else.
- Run synchronized capture sessions. Camera on, smartwatch on, shared clock recorded, ordinary activities — conversation, work, watching something that moves you — followed by a journal entry about the session. Even a handful of sessions begins the triangulation record.
- Export the existing corpus. Email archives, posts, chats, documents, and code go into raw/text/ in open formats, with provenance noted.
- Write self-authored HDL. Open subject.hdl and begin writing assertions in your own voice with @AUTHOR self. These are the statements no compiler may ever override.
Stage 1 — the retrieval-grounded persona (weeks)
A local language model (consumer hardware suffices; privacy is the reason to stay local), the interview transcript and corpus served through retrieval, a system prompt assembled from the self-authored HDL statements. This is architecturally an adapter implementing generate + recall against the Abstraction Contract, and it follows the pattern behind the current research ceiling for individual simulation: the strongest published results use full interview text to ground a large model [1]. Open-source local-first projects such as Second Me [6] demonstrate the pattern end to end and are worth studying or forking.
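The shape of a Stage 1 build can be sketched without any model at all. The example below assumes a deliberately naive retriever (word overlap, standing in for real retrieval) and a hypothetical prompt layout; the function names `recall` and `build_prompt` and the prompt wording are illustrative, and the resulting string would be handed to whatever local model the adapter wraps.

```python
import re


def tokens(text):
    """Lowercased word set used for the overlap score."""
    return set(re.findall(r"[a-z']+", text.lower()))


def recall(query, passages, k=2):
    """Rank passages by word overlap with the query; a stand-in for real retrieval."""
    q = tokens(query)
    ranked = sorted(passages, key=lambda p: len(q & tokens(p)), reverse=True)
    return [p for p in ranked[:k] if q & tokens(p)]


def build_prompt(question, self_statements, passages):
    """Assemble a system prompt from self-authored statements plus retrieved grounding."""
    persona = "\n".join(f"- {s}" for s in self_statements)
    grounding = "\n".join(f"> {p}" for p in recall(question, passages))
    return (
        "Answer as the subject described below.\n"
        f"Self-authored statements:\n{persona}\n"
        f"Grounding passages:\n{grounding}\n"
        f"Question: {question}"
    )
```

A real build would swap the overlap score for embedding search over the transcript and corpus; the division of labor (retrieve, assemble, generate) stays the same.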
Stage 2 — the HDL compiler (months)
A compiler pass — itself an LLM run against the corpus through the Abstraction Contract — that proposes @AUTHOR ai statements with sources and confidences; the subject reviews; accepted statements join subject.hdl; the projection is derived per the mapping table. The subject's review loop is not optional: it is both quality control and the consent mechanism.
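The review loop can be sketched as a gate between proposals and the subject file. Everything named here is hypothetical: `Proposal`, `SubjectFile`, and the `decide` callback (which stands in for the human subject's accept-or-reject decision) are illustrative, and turning away proposals that arrive without sources is an assumption drawn from the requirement that compiler statements carry sources.

```python
from dataclasses import dataclass, field


@dataclass
class Proposal:
    text: str
    confidence: float
    sources: list
    author: str = "ai"  # compiler proposals are always @AUTHOR ai


@dataclass
class SubjectFile:
    statements: list = field(default_factory=list)

    def review(self, proposals, decide):
        """Append only proposals the subject accepts; return the rejected ones."""
        rejected = []
        for p in proposals:
            if not p.sources:
                rejected.append(p)  # no provenance: never shown for acceptance
            elif decide(p):
                self.statements.append(p)
            else:
                rejected.append(p)
        return rejected
```

Keeping the rejected proposals, rather than discarding them, leaves a record of what the subject declined, which is part of what makes the loop a consent mechanism and not only a filter.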
Stage 3 — heuristics and affect (quarters)
Derive the heuristics projection and begin the AFFECT_FINGERPRINT: concept-repertoire from the journals; tuning deltas and the divergence map from the synchronized sessions; SOMA calibration parameters from video + biometrics. Build from the cognitive end of the gradient inward — humor, nostalgia, intellectual awe first; the visceral end waits for a mature SOMA.
Stage 4 — runtime, avatar, continuity (years)
Voice, then avatar embodiment with AFFECT driving expression; the same avatar machinery run backward over archived video as the analyzer that feeds calibration; scheduled migration drills — re-instantiating the subject on a new substrate from raw + HDL alone, proving the continuity layer before it is needed.
A note on evaluation, learned from the field: hold out data. Keep a set of journal entries, opinions, and decisions the model never sees, and score the simulation against them — the way the strongest research scores agents against held-out survey answers [1] and the way the cautionary studies caught inflated claims [2]. Fidelity asserted without held-out evaluation is marketing, not measurement.
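A held-out evaluation can be as small as the sketch below. It assumes the held-out material has been reduced to question and answer pairs and scores by exact match, which is the simplest possible metric and an assumption of this example; the function names are illustrative.

```python
import random


def split_holdout(items, fraction=0.2, seed=0):
    """Shuffle deterministically and reserve a fraction the model never sees."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    cut = max(1, int(len(shuffled) * fraction))
    return shuffled[cut:], shuffled[:cut]  # (visible, held_out)


def agreement(held_out, simulate):
    """Share of held-out questions where the simulation matches the subject's answer."""
    hits = sum(1 for question, answer in held_out if simulate(question) == answer)
    return hits / len(held_out)
```

The fixed seed matters: the held-out set has to be the same set every time, or items drift into the visible corpus between runs and the score stops meaning anything.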
Part VI — State of the Field, 2026
This platform is no longer speculative in outline. As of this writing, individual-simulation work spans serious research, open source, and a commercial industry. The honest summary: the approach works far better than skeptics expected and far worse than the industry implies — and both halves of that sentence shape this specification.
The research
- Interview-grounded agents work. Park et al. (Stanford / Google DeepMind / Northwestern / UW) built generative agents of 1,052 real people from two-hour qualitative interviews. The agents replicated each person's General Social Survey answers at 85% of the person's own two-week test–retest consistency, predicted Big Five profiles comparably, and beat demographic- and persona-based agents by roughly 14–15 points — the interview is the signal [1]. This validates MATERIAL's interview stream as the highest-leverage capture artifact.
- But generalization is weak. Peng et al.'s "Digital Twins are Funhouse Mirrors" ran 19 pre-registered studies across 164 outcomes: twins trained on 500+ of a person's previous answers correlated only weakly with that person's responses on new questions (average r ≈ 0.20), only modestly beating the base model, with five systematic distortions [2]. The lesson for this platform: simulation fidelity is real but narrow; claims must be scoped to what held-out evaluation supports.
- Benchmarks now exist. BehaviorChain tests persona-based behavior-chain simulation [3]; TwinVoice decomposes persona fidelity into capabilities and finds current models still weak on syntactic style and memory recall [4]. These are ready-made evaluation harnesses for Stage 1+ builds.
- Historical figures are a proven on-ramp. Character-LLM trained simulacra of figures such as Beethoven and Caesar from collected records [5] — independent validation of this spec's Einstein-first strategy.
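The "85% of test–retest consistency" figure in the first finding is a normalized score: the agent's accuracy divided by how consistently the person reproduces their own answers on retest. The sketch below shows the arithmetic with illustrative numbers only; the inputs are not figures from [1], and the function name is hypothetical.

```python
def normalized_accuracy(agent_accuracy, self_consistency):
    """Agent accuracy as a fraction of the subject's own test-retest consistency."""
    if not 0.0 < self_consistency <= 1.0:
        raise ValueError("self_consistency must be in (0, 1]")
    return agent_accuracy / self_consistency


# Illustrative numbers only, not taken from [1]: an agent that matches 68% of
# answers, for a person who matches 80% of their own answers on retest.
ratio = normalized_accuracy(agent_accuracy=0.68, self_consistency=0.80)
print(round(ratio, 2))
```

The normalization exists because people are not perfectly consistent with themselves; a raw accuracy ceiling of 100% would hold the simulation to a standard the subject does not meet.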
The open-source and commercial landscape
- Second Me (Mindverse) — open-source, locally trained and hosted personal AI identity, with memory versioning and continuous training pipelines [6]. The closest existing code to this platform's Stage 1–2, and proof the local-first posture is practical on consumer hardware.
- The digital-afterlife industry — Eternos (digital twins built from recordings and reflections; its best-known subject, Michael Bommer, built his before dying of cancer in 2024), HereAfter AI (guided audio life-story interviews), You Only Virtual, Project December, and StoryFile (interactive video interviews) [7]. StoryFile's Chapter 11 filing — it continues operating — is the object lesson behind this spec's local-first principle: a person's record must not die with a vendor [7].
- Regulation has begun. California's SB 243 (2025) made it the first state to regulate AI companion chatbots, requiring disclosure and safeguards [8]; scholars have mapped the ethics of "griefbots" and postmortem avatars, centering consent of the data donor [9]. This spec's consent mechanisms — the self-authority rule, the subject review loop, dual authorship with provenance — are its answers, designed in rather than bolted on.
What no one else is doing
Nothing surveyed combines (a) a durable, human-readable description language with per-statement provenance and self-authority, (b) an explicitly engineered affective subsystem calibrated from time-synchronized multimodal capture, and (c) a continuity layer that treats raw capture as an irreplaceable record portable across substrate generations. The industry builds products; the research builds experiments; this specification builds the thing both will need if the work is to outlast its decade.
Part VII — Open Problems: The Box
This section is kept deliberately, and is not a disclaimer. It is where the platform holds the questions it cannot close, without collapsing them in either direction.
The phenomenal question (the cat)
No part of this architecture establishes that the running approximation feels anything. Every functional signature and every interoceptive one can be replicated, and the question of whether the lights are on sits exactly where it started. This is the hard problem, and the platform does not claim to solve it. Two things keep that from being a reason to stop. First, the undecidability is not special to this design — a perfectly scanned and re-run biological human would face the identical wall; the phenomenal question is the universal tax on the entire enterprise of re-instantiation. Second, for the platform's stated goal — approximation within a vanishing margin of difference — functional plus interoceptive fidelity may be exactly sufficient. The platform builds to the limit of the verifiable and leaves the cat in the box rather than declaring it alive or dead.
The author holds broader views — about simulation, consciousness, and continuity — that motivate this work. They are deliberately not imported here. The engineering must stand on its own and be judged on its own terms. The leap and the spec are kept in separate boxes on purpose.
Lesser open problems
- The fidelity gap. The field's own numbers range from 85% normalized accuracy on held-out surveys [1] to r ≈ 0.20 on novel outcomes [2]. How far interview-plus-corpus grounding generalizes beyond captured situations is an empirical unknown this platform inherits.
- Self-report mediation. Journal labels are filtered self-narration; triangulation mitigates but cannot remove this.
- Arousal is non-specific. Biometrics say that something fired, rarely what; context and the other channels must supply the what.
- The visceral end. Without a mature SOMA, the body-bound emotions remain the frontier. They are scheduled last for a reason.
- The subject changes. A living subject keeps living. @AS_OF and re-compilation handle drift in principle; the cadence and the question of which self is being preserved are left open.
Part VIII — Design Philosophy and Versioning
The architecture is based on additive accommodation rather than prediction: unknown future capabilities — connectomics, brain–computer interfaces, world models, synthetic embodiment, genome–connectome–envirome packaging — integrate through stable extension mechanisms, not redesign. The specification is durable infrastructure, independent of any implementation language, model generation, or vendor.
- No built-in obsolescence. No vendor lock-in. No destructive migrations. No architectural debt at core boundaries.
- Documentation written for maintainers decades into the future.
- Open, documented, multi-implementation standards wherever possible.
Versioning. This unified specification is at v3.0. The HDL grammar moves to 0.4 (additive: @AFFECT annotation, AFFECT_FINGERPRINT block). The emotion vocabulary remains at v0.2, unchanged. Until grammar 1.0, non-additive change is permitted with reason; from 1.0 onward, only additive change within a major version, and any removal requires a major bump with a documented migration path. The early symbolic syntax (DEFINE AXIOM, SET STYLE, MAP TRIGGER) remains superseded; its expressive content lives on inside the projections, and the historical forms are preserved in v1.1 for traceability.
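The grammar versioning rule can be written as a single predicate. This sketch makes two assumptions beyond the text: versions are modeled as (major, minor) tuples, and a change must move the version forward. The function and parameter names are illustrative.

```python
def change_allowed(old, new, additive, reason="", migration_path=""):
    """Apply the grammar versioning rule to a proposed change.

    old and new are (major, minor) tuples; additive is False when anything
    is removed or redefined.
    """
    if new <= old:
        return False  # assumption: versions only move forward
    if additive:
        return True  # additive change is always permitted
    if old < (1, 0):
        return bool(reason)  # before 1.0: non-additive permitted with reason
    # From 1.0 onward: removal needs a major bump and a documented migration path.
    return new[0] > old[0] and bool(migration_path)
```

For example, the move from grammar 0.3 to 0.4 described above is additive and passes without a reason, while a removal inside the 1.x line is refused no matter what reason is supplied.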
Appendix — The Emotion Vocabulary v0.2, by Category
All 193 canonical entries, grouped by side and category. A word appearing in more than one list is a blended emotion touching more than one side — the common case. An asterisk marks a word whose category shifts with intensity (e.g. ashamed: high-energy distressing at high intensity, low-energy at low). Per-word working definitions are in v1.1, Appendix B, unchanged.
Internal Feelings
| Category | Words |
|---|---|
| high-energy uplifting | amazed, amused, aroused, astonished, awestruck, blissful, cheerful, delighted, eager, ecstatic, elated, energized, enthusiastic, euphoric, excited, exuberant, fulfilled, grateful, happy, hope, hopeful, infatuated, inspired, invigorated, joyful, jubilant, loving, optimistic, playful, pleased, proud, refreshed, rejuvenated, satisfied, stimulated, thankful, thrilled, triumphant, vibrant |
| high-energy distressing | afraid, alarmed, angry, annoyed, anxious, ashamed*, bitter, contemptuous, defiant, desperate, disdainful, disgusted, distressed, disturbed, embarrassed, enraged, envious, exasperated, frightened, frustrated, furious, hateful, heartbroken, horrified, hostile, humiliated, hurt, hysterical, impatient, indignant, insulted, irate, irritated, jealous, mad, mortified, nervous, offended, on edge, outraged, overwhelmed, panicked, paranoid, rattled, regretful, remorseful, resentful, scared, scornful, shaken, shocked, spiteful, stressed, surprised, suspicious, terrified, tormented, troubled, uneasy, unhappy*, unnerved, unsettled, upset, vengeful, vindictive, worried |
| low-energy uplifting | at ease, calm, compassionate, content, empathetic, kind, patient, peaceful, relaxed, relieved, safe, serene, sympathetic |
| low-energy distressing | alienated, ashamed*, bored, brooding, dependent, depressed, dispirited, gloomy, grief-stricken, grumpy, guilty, isolated, lonely, longing, melancholy, miserable, nostalgic, sad, self-critical, sentimental, sorry, sullen, trapped, unhappy*, vulnerable, worthless |
Body States
| Category | Words |
|---|---|
| activated | afraid, alarmed, alert, anxious, aroused, energized, frightened, invigorated, jittery, nauseous, nervous, on edge, rattled, refreshed, rejuvenated, restless, scared, shaken, stimulated, tense, terrified, vibrant, vigilant |
| drained | droopy, exhausted, lazy, listless, numb, sleepy, sluggish, tired, weary, worn out |
Thinking States
| Category | Words |
|---|---|
| disrupted | bewildered, confused, disoriented, dumbstruck, mystified, perplexed, puzzled, stuck |
| engaged | alert, daydreaming, focused, reflective, skeptical, vigilant |
| overloaded | anxious, brooding, distressed, overwhelmed, paranoid, restless, stressed, troubled, uneasy |
| self-focused | ashamed, embarrassed, guilty, humiliated, mortified, naive, self-conscious, self-critical, sensitive |
External Behavior
| Category | Words |
|---|---|
| withdrawing | alienated, detached, isolated, reserved, secluded, self-conscious, shunned, shy, timid |
| asserting | arrogant, condescending, contemptuous, defiant, disdainful, greedy, hostile, impatient, obstinate, scornful, smug, spiteful, stubborn, vengeful, vindictive |
| welcoming | compassionate, empathetic, humble, kind, optimistic, patient, self-confident, sympathetic, valiant, welcoming |
| yielding | dependent, docile, indifferent, patient, resigned, submissive |
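The blended-emotion rule from the appendix introduction (a word listed under more than one side touches more than one side) is easy to check mechanically. The sketch below loads only a small excerpt of the tables above, with asterisks dropped; the structure and the names `VOCAB_EXCERPT`, `placements`, and `is_blended` are illustrative, not part of the vocabulary file format.

```python
# A small excerpt of the v0.2 tables above: side -> category -> words.
VOCAB_EXCERPT = {
    "Internal Feelings": {
        "high-energy distressing": {"anxious", "ashamed"},
        "low-energy uplifting": {"calm", "patient"},
        "low-energy distressing": {"ashamed"},
    },
    "Body States": {"activated": {"anxious", "tense"}},
    "Thinking States": {"overloaded": {"anxious"}, "self-focused": {"ashamed"}},
    "External Behavior": {"welcoming": {"patient"}, "yielding": {"patient"}},
}


def placements(word):
    """Every (side, category) pair in which the word is listed."""
    return [
        (side, category)
        for side, categories in VOCAB_EXCERPT.items()
        for category, words in categories.items()
        if word in words
    ]


def is_blended(word):
    """A word is blended when it appears under more than one side."""
    return len({side for side, _ in placements(word)}) > 1
```

In this excerpt "anxious" is blended across three sides, while "calm" sits on one side only; a word with several placements inside a single side, as with the intensity-shifting entries, would not count as blended by this test.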
References
- [1] Park, J.S., Zou, C.Q., Shaw, A., Hill, B.M., Cai, C., Morris, M.R., Willer, R., Liang, P., Bernstein, M.S. Generative Agent Simulations of 1,000 People / LLM Agents Grounded in Self-Reports Enable General-Purpose Simulation of Individuals (2024–2026). arXiv:2411.10109. See also Stanford HAI's summary: hai.stanford.edu.
- [2] Peng, T., Gui, G., Merlau, D., et al. Digital Twins are Funhouse Mirrors: Five Systematic Distortions (2025, rev. 2026). SSRN 5518418.
- [3] Li, R., Xia, H., Yuan, X., et al. How Far are LLMs from Being Our Digital Twins? A Benchmark for Persona-Based Behavior Chain Simulation (2025). arXiv:2502.14642.
- [4] TwinVoice: A Multi-dimensional Benchmark Towards Digital Twins via LLM Persona Simulation (2025). arXiv:2510.25536.
- [5] Shao, Y., et al. Character-LLM: A Trainable Agent for Role-Playing (2023). arXiv:2310.10158.
- [6] Mindverse. Second Me: open-source, locally trained personal AI identity. github.com/mindverse/Second-Me; secondme.io.
- [7] Industry overview of the digital-afterlife sector: Eternos (eternos.life), HereAfter AI (hereafter.ai), StoryFile (storyfile.com; Chapter 11 filed, operations continuing), You Only Virtual, Project December. Surveyed in trade and press coverage through 2026, e.g. Good Grief (2026).
- [8] California Senate Bill 243 (2025). First U.S. state regulation of AI companion chatbots: disclosure and safeguard requirements. leginfo.legislature.ca.gov.
- [9] Hollanek, T., Nowaczyk-Basińska, K. Griefbots, Deadbots, Postmortem Avatars: on Responsible Applications of Generative AI in the Digital Afterlife Industry. Philosophy & Technology (2024). doi:10.1007/s13347-024-00744-w.
- [10] Barrett, L.F. How Emotions Are Made: The Secret Life of the Brain. Houghton Mifflin Harcourt (2017). The theory of constructed emotion.
- [11] Solms, M. The Hidden Spring: A Journey to the Source of Consciousness. Profile / Norton (2021). The affect-first account.