runex Architecture
What this is
runex is a graph-shaped ontology engine: a typed graph store with declarative semantics for object lifecycles and reactive behavior. Object types, state machines, actions, and their triggers all live as data in the graph itself. New behavior is added by writing data — not Python — and takes effect immediately, without restart.
The N+M ingest pipeline (one adapter per source, one upsert routine for all sinks) is one consumer of this engine, not its purpose. Everything downstream of a write — link extraction, state transitions, derived structure — happens through reactive cascades on store events, not through call-site coupling.
Layers
L4 Interface CLI · Ontology facade ← contract lives here
L3 Ontology engine DSL eval · Loader · Engine · Kernels · ReactiveBus
L2 Store nodes / fields / links / identities · tx_log · on_commit
L1 SQLite
Dependency is strictly one-way L4 → L3 → L2 → L1. L2 doesn't know what a machine or action is; it just emits mutation events through on_commit. Removing src/runex/ontology/ and src/runex/dsl/ leaves L2 + the pipeline functional.
The contract is an L4 concern — Result envelope, closed Event taxonomy, manifest(), cursor-resumable event stream. L3 stays a pure bool/raise executor; L4 wraps it. JSON is truth; Rich output is a view.
Core abstractions
Node — the only object type
nodes ── node_tags ── supertags
│
├── fields (typed EAV: text/longtext/number/date/bool/blob)
├── links (typed directed edges with props)
└── node_identities (natural_key → node_id projection)
A node is the universal record. A Bookmark, a Person, a state-machine definition, and a kernel reference are all nodes, differentiated by which supertag they carry.
Supertag — type declaration
A supertag declares:
- Field schema (names, types, required, multi-value)
- Optional inheritance via extends
- Optional natural_key field that uniquely identifies real-world objects
Three supertags are reserved for the engine's own metadata: MachineDefinition, ActionDefinition, KernelRef.
Supertag definitions are (supertag …) forms in .scm bundles, loaded via o.load(). The core type system ships in config/ontology/core.scm. The engine applies Policy-A migration gating: additive changes (new field, new supertag) always apply at runtime; destructive changes (drop a field, change a field's type, change the natural key) are refused if the supertag already has tagged nodes — unless allow_destructive=True. Re-applying a byte-identical definition is a true no-op (no event emitted, no churn on repeated load).
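The gating rule above can be sketched in a few lines of Python. This is an illustration only, not the engine's code: the dict-based schema shape, the names classify_change and gate, and the MigrationRefused exception are all assumptions made for the example, and dict equality stands in for the byte-identical check.

```python
# Hypothetical sketch of Policy-A gating; every name here is an assumption.

class MigrationRefused(Exception):
    """Raised when a destructive change targets a supertag with tagged nodes."""


def classify_change(old: dict, new: dict) -> str:
    """Return 'noop', 'additive', or 'destructive' for a supertag redefinition.

    A schema is modeled as {"natural_key": str | None, "fields": {name: type}}.
    """
    if old == new:
        return "noop"  # stand-in for "byte-identical definition"
    if old.get("natural_key") != new.get("natural_key"):
        return "destructive"
    for name, field_type in old["fields"].items():
        # A dropped field or a changed field type is destructive.
        if new["fields"].get(name) != field_type:
            return "destructive"
    return "additive"


def gate(old: dict, new: dict, tagged_nodes: int, allow_destructive: bool = False) -> str:
    """Apply the gate: refuse destructive changes on populated supertags."""
    kind = classify_change(old, new)
    if kind == "destructive" and tagged_nodes > 0 and not allow_destructive:
        raise MigrationRefused("destructive change refused: supertag has tagged nodes")
    return kind


old = {"natural_key": "url", "fields": {"url": "text", "title": "text"}}
added = {"natural_key": "url", "fields": {"url": "text", "title": "text", "body": "longtext"}}
dropped = {"natural_key": "url", "fields": {"url": "text"}}
```

In this sketch, gate(old, added, 5) returns "additive", while gate(old, dropped, 5) raises unless allow_destructive=True is passed.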
Machine — lifecycle template (data)
scheme
(machine "Name"
(supertag "Name")
(initial "s1")
(states ("s1" "act-a")
("s2" "act-b")
("s3"))) ; terminal
Current state is stored on the node as the reserved field __state__<MachineName>. Only the engine's (transition …) primitive may write this field; a manual set-field on it bypasses that discipline.
Action — guarded transition with effect (data)
scheme
(action "act-a"
(machine "Name")
(from-states "s1")
(trigger TRIGGER-EXPR)
(guard GUARD-EXPR) ; pure boolean
(effect EFFECT-EXPR) ; mutations + (transition "s2")
(priority 100) ; optional, lower runs first
(conflict-policy "last-write-wins")) ; optional
from-states is a hard precondition checked before guard. Guards read but never write. Effects compose store primitives + kernel calls.
Reactive actions are ordered by priority ASC, name ASC — enforced at dispatch time, not at load time. If multiple matching actions write the same literal field, the default last-write-wins policy makes the last writer's value final and the bus records the conflict in reactive_dispatch_plan. An action marked error-on-conflict that participates in a same-field conflict blocks the whole conflict group and emits reactive_conflict_blocked. It also fails closed on opaque field targets: if the field name is not a literal (e.g., it comes from a kernel result or variable), the bus refuses rather than risk a silent accidental winner.
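The ordering and conflict rules can be illustrated with a small sketch. It is not the bus implementation: ActionSketch, dispatch_plan, and the shape of the returned plan are hypothetical, and opaque (non-literal) field targets are deliberately left out of the model.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ActionSketch:
    """Hypothetical stand-in for a loaded action definition."""
    name: str
    priority: int = 100
    writes: tuple = ()                      # literal field names this action sets
    conflict_policy: str = "last-write-wins"


def dispatch_plan(actions):
    """Order matched actions and work out which ones a conflict would block."""
    # priority ASC, name ASC, computed at dispatch time from the matched set.
    ordered = sorted(actions, key=lambda a: (a.priority, a.name))

    # Map each literal field to the actions that write it, in dispatch order.
    writers = {}
    for action in ordered:
        for field_name in action.writes:
            writers.setdefault(field_name, []).append(action.name)
    conflicts = {f: names for f, names in writers.items() if len(names) > 1}

    # One error-on-conflict participant blocks every writer of that field.
    blocked = set()
    for action in ordered:
        if action.conflict_policy == "error-on-conflict":
            for field_name in action.writes:
                blocked.update(conflicts.get(field_name, []))

    return {
        "order": [a.name for a in ordered],
        "conflicts": conflicts,
        "blocked": sorted(blocked),
        "runnable": [a.name for a in ordered if a.name not in blocked],
    }


matched = [
    ActionSketch("tag-links", 100, ("summary",)),
    ActionSketch("audit", 50, ("seen",)),
    ActionSketch("annotate", 100, ("summary",)),
]
lenient = dispatch_plan(matched)
strict = dispatch_plan(
    matched[:2] + [ActionSketch("annotate", 100, ("summary",), "error-on-conflict")]
)
```

Because the sort key is computed from the matched set, passing the same actions in a different order yields the same plan. Under last-write-wins the last name listed for a conflicting field is the final writer; in the strict variant both writers of summary are blocked and only audit remains runnable.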
Trigger — when an action fires
(manual) only via explicit Ontology.dispatch
(on EVENT-TYPE PRED …) fires when event + predicates match
(any-of TRIGGER …) disjunction
Event types synthesized from store mutations:
| Event | Source |
|---|---|
| node-created | store.create_node / identity upsert |
| node-updated | store.update_node |
| field-set | store.set_field on non-state fields |
| field-unset | store.unset_field |
| link-created | store.create_link |
| link-deleted | store.delete_link |
| state-transitioned | set_field on __state__<M>, with machine + to synthesized |
| tagged | store.tag |
Predicates: (field N) (supertag T) (rel R) (machine M) (to S).
Design principle: trigger on the event that signals the data the action reads is in place, not the event that signals the object exists. An action consuming a body field should trigger on (on field-set (field "body") …), not on (on tagged …) — the tag event fires before the body is written.
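A rough sketch of how the three trigger forms could be matched against an event follows. The tuple encoding of triggers and the dict shape of events are assumptions made for illustration; the real trigger_spec format is not shown in this document.

```python
def matches(trigger, event):
    """Return True if a trigger form matches an event (hypothetical encoding).

    Triggers are nested tuples:
      ("manual",)                        never matches a store event
      ("on", EVENT_TYPE, {pred: value})  event type and all predicates must match
      ("any-of", TRIGGER, ...)           disjunction of sub-triggers
    """
    kind = trigger[0]
    if kind == "manual":
        return False  # reachable only through explicit dispatch
    if kind == "on":
        _, event_type, predicates = trigger
        return event["type"] == event_type and all(
            event.get(key) == value for key, value in predicates.items()
        )
    if kind == "any-of":
        return any(matches(sub, event) for sub in trigger[1:])
    raise ValueError(f"unknown trigger form: {kind!r}")


on_body = ("on", "field-set", {"field": "body"})
on_tagged = ("on", "tagged", {"supertag": "Note"})
body_written = {"type": "field-set", "field": "body", "node": 1}
note_tagged = {"type": "tagged", "supertag": "Note", "node": 1}
```

With this encoding, an action that reads body would use on_body: it stays silent on note_tagged, which arrives before the body exists, and fires on body_written.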
Read primitive — guard-safe Python helper
Read primitives are Python functions injected into the DSL env by the engine. They are available in guards and effects, must not mutate the store, and must not perform external I/O. Use this boundary for deterministic computations that guards need, such as timestamp arithmetic or business-day counting.
Current engine-owned read primitives beyond store reads include date-diff and business-days-between.
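To show what a guard-safe helper looks like, here is a sketch of two pure functions in the spirit of date-diff and business-days-between. The exact semantics of the engine's versions are not documented here, so the half-open range, the Monday to Friday definition of a business day, and the absence of holiday handling are all assumptions.

```python
from datetime import date, timedelta


def date_diff(start: date, end: date) -> int:
    """Whole days from start to end; negative if end is earlier."""
    return (end - start).days


def business_days_between(start: date, end: date) -> int:
    """Count Monday-Friday days in the half-open range [start, end).

    Pure and deterministic: no store writes, no I/O, no clock reads.
    Holidays are ignored in this sketch.
    """
    if end < start:
        return -business_days_between(end, start)
    count = 0
    current = start
    while current < end:
        if current.weekday() < 5:  # 0-4 are Monday-Friday
            count += 1
        current += timedelta(days=1)
    return count
```

Both functions take every input as an argument, which is what keeps them safe to call from a guard: the same arguments always produce the same answer.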
Kernel — effect-only Python escape hatch
Kernels are developer-registered Python functions:
python
engine.register_kernel("yt-dlp-download",
    lambda url: {"blob_key": "…", "metadata": "…"})
(call-kernel "name" arg …) is the only way for an effect to reach outside the store: HTTP, LLM, filesystem, subprocess. Agents can compose kernels via DSL but cannot define new ones. This is the safety boundary. Guards cannot call kernels.
Built-in kernels (pure, auto-installed): extract-wiki-links, extract-wiki-link-pairs, extract-hashtags, regex-find, regex-find-all, sum-field-on.
(writeback "sink-key") is the outbound sibling of call-kernel. It belongs to the same effect-only escape-hatch category, uses the same registration model (engine.register_sink(key, sink): the Python/bootstrap author wires a configured SinkAdapter and the DSL author only names it), and sits behind the same guard-disallowed boundary. It projects $node to a typed CanonicalItem and hands it to the sink, so the node's data does not cross the DSL.
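One way to picture the guard/effect boundary is as two environments built from the same registry, where only the effect environment receives call-kernel. The sketch below is hypothetical: KernelRegistry and make_env are illustrative names and do not reproduce Engine._make_env.

```python
class KernelRegistry:
    """Hypothetical registry: Python code registers, DSL code only names."""

    def __init__(self):
        self._kernels = {}

    def register(self, name, fn):
        self._kernels[name] = fn

    def call(self, name, *args):
        if name not in self._kernels:
            raise LookupError(f"unknown kernel: {name}")
        return self._kernels[name](*args)


def make_env(registry, *, for_guard):
    """Build a DSL environment; guards never see call-kernel."""
    # Read primitives are pure, so both environments get them.
    env = {"date-diff": lambda start, end: (end - start).days}
    if not for_guard:
        env["call-kernel"] = registry.call
    return env


registry = KernelRegistry()
registry.register("shout", lambda text: text.upper())
guard_env = make_env(registry, for_guard=True)
effect_env = make_env(registry, for_guard=False)
```

The point of the design is that the restriction is structural: a guard cannot call a kernel because the name is simply absent from its environment, not because a check rejects it at call time.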
Reactive bus — events → dispatch
Every successful transaction commit fires on_commit(events). The bus classifies each event into a high-level signal and matches every loaded action's trigger_spec against it. Matching actions are sorted by priority ASC, name ASC, then dispatched synchronously.
Before dispatching a matched group, the bus writes a reactive_dispatch_plan audit event containing the source signal, ordered actions, literal field writes, detected same-field conflicts, and the ordering rule. If error-on-conflict blocks a conflict group, the bus also writes reactive_conflict_blocked. No reactive outcome is silent — guard rejections (reactive_guard_rejected), illegal transitions (reactive_illegal_transition), and dispatch errors (reactive_dispatch_error) all emit typed events under actor system:reactive. Imperative dispatch returns ok:false with a structured error; it never raises at the caller.
Cascades nest; depth is bounded by max_depth=10.
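The depth bound can be sketched as a recursive, synchronous dispatcher. This is a simplified model under stated assumptions: handlers are plain functions that return follow-up events, and the sketch records a marker when the bound is hit, since this document does not say how the engine reports that case.

```python
MAX_DEPTH = 10  # mirrors max_depth=10 from the text


def run_cascade(event, handlers, depth=0, log=None):
    """Dispatch an event, then recursively dispatch whatever the handlers emit.

    Returns a log of (depth, event) entries. Hitting the bound stops that
    branch and records a marker (an assumption of this sketch).
    """
    log = [] if log is None else log
    if depth >= MAX_DEPTH:
        log.append(("depth-exceeded", event))
        return log
    log.append((depth, event))
    for handler in handlers:
        for follow_up in handler(event):
            run_cascade(follow_up, handlers, depth + 1, log)
    return log


# A handler that terminates on its own, and one that would loop forever.
countdown = lambda n: [n - 1] if n > 0 else []
runaway = lambda n: [n + 1]

finite_log = run_cascade(2, [countdown])
bounded_log = run_cascade(0, [runaway])
```

The countdown cascade ends after three levels without touching the bound, while the runaway one is cut off at depth 10 instead of recursing indefinitely.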
Ontology — high-level facade (L4)
python
o = Ontology.open("data/data.db") # store + engine + bus + default kernels
o.load("config/ontology/foo.scm") # → Result{ok, data:{machines,actions}, events}
o.dispatch("act-a", node_id) # → Result{ok, data:{ran}, events, cursor}
o.manifest() # → Result{ok, data:{supertags,machines,...}}
o.analyze_conflicts() # → Result{ok, data:{field_conflicts,...}}
o.events(since=cursor, kinds=[…]) # → Result{ok, data:{count}, events, cursor}
trace = o.trace("act-a", node_id) # debug only: OntologyTrace, not Result
Every operation returns a Result envelope (ok, data, error, events, cursor). trace() is the single deliberate exception: a debug introspection tool that returns the richer OntologyTrace. See agent-contract.md for the full contract specification.
CLI mirror: runex ontology {list|load|describe|run|trace|manifest|events|check}. Every subcommand accepts --json and emits the Result envelope verbatim; exit code mirrors ok.
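The envelope and its relationship to the CLI exit code can be sketched as a small dataclass. Only the five field names come from this document; the field types, the JSON serialization, and the choice of 1 as the failing exit code are assumptions, and agent-contract.md remains the authority.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import Any, Optional


@dataclass
class ResultSketch:
    """Hypothetical shape of the Result envelope (field types are assumed)."""
    ok: bool
    data: dict = field(default_factory=dict)
    error: Optional[dict] = None
    events: list = field(default_factory=list)
    cursor: Optional[Any] = None

    def to_json(self) -> str:
        """What a --json flag would print: the envelope, verbatim."""
        return json.dumps(asdict(self), sort_keys=True)

    @property
    def exit_code(self) -> int:
        """Exit code mirrors ok: zero on success, non-zero otherwise."""
        return 0 if self.ok else 1


success = ResultSketch(ok=True, data={"ran": True}, cursor=42)
failure = ResultSketch(ok=False, error={"code": "guard_rejected"})
```

A caller that only checks the exit code and a caller that parses the JSON see the same outcome, which is the property the CLI mirror relies on.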
Datasource registry
Adapters self-describe via a module-level SPEC: DataSourceSpec in src/runex/adapters/<name>.py. The registry (adapters/registry.py) lazily imports them and exposes registry(), get(key), and all_specs(). manifest().data.datasources surfaces the registry as structured data — agents and products consume it without reading source.
DataSourceSpec.build(ctx) validates required params and returns an adapter instance. A missing required param raises a structured error naming exactly what is missing.
A spec also declares requires: tuple[str, ...], the capability keys the adapter needs. build() constructs those via the capability broker (adapters/capabilities.py) and injects them into the adapter; the adapter performs zero direct I/O and imports no filesystem/db/http library. A capability (e.g. FileCapability) is the only code that touches the OS on a channel's behalf. This makes Invariant 4's sanctioned external-I/O surface structural rather than conventional: it is the same broker-injection pattern Engine._make_env uses for the DSL env, applied one layer out. The split is enforced by tests/test_capability_seam.py. manifest() exposes requires so the contract stays fully introspectable. All six adapters are split; the shipped capabilities are file, sqlite, command, and http (mcp is deferred until a consumer exists). See docs/roadmap.md for the seam's executable criterion.
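The build step described above (validate params, then inject brokered capabilities) can be sketched as follows. None of these names are the real API: SpecSketch, BuildError, FakeFileCapability, and NotesAdapter are hypothetical, and build here takes a params dict and a broker dict rather than the ctx object the real spec receives.

```python
from dataclasses import dataclass
from typing import Any, Callable


class BuildError(Exception):
    """Structured error that names exactly what is missing."""

    def __init__(self, spec_key: str, kind: str, missing: list):
        self.spec_key = spec_key
        self.kind = kind
        self.missing = tuple(missing)
        super().__init__(f"{spec_key}: missing {kind}: {', '.join(missing)}")


@dataclass(frozen=True)
class SpecSketch:
    key: str
    required_params: tuple
    requires: tuple                 # capability keys the adapter needs
    factory: Callable[..., Any]

    def build(self, params: dict, broker: dict) -> Any:
        missing = [p for p in self.required_params if p not in params]
        if missing:
            raise BuildError(self.key, "params", missing)
        unavailable = [c for c in self.requires if c not in broker]
        if unavailable:
            raise BuildError(self.key, "capabilities", unavailable)
        # The broker constructs each capability; the adapter only receives it.
        capabilities = {c: broker[c]() for c in self.requires}
        return self.factory(params, capabilities)


class FakeFileCapability:
    """Stand-in capability: the only object here that 'touches' storage."""

    def __init__(self, files: dict):
        self._files = files

    def read_text(self, path: str) -> str:
        return self._files[path]


class NotesAdapter:
    """Adapter with no I/O imports; it uses only what was injected."""

    def __init__(self, params: dict, capabilities: dict):
        self._path = params["path"]
        self._file = capabilities["file"]

    def read(self) -> str:
        return self._file.read_text(self._path)


spec = SpecSketch("notes", ("path",), ("file",), NotesAdapter)
broker = {"file": lambda: FakeFileCapability({"a.md": "hello"})}
adapter = spec.build({"path": "a.md"}, broker)
```

Swapping the broker entry for a different capability implementation changes where the bytes come from without touching NotesAdapter, which is the property the seam test is described as protecting.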
The registry is bidirectional. SinkSpec is the outbound mirror of DataSourceSpec (same lazy registry, same param/capability injection, same zero-direct-I/O seam; manifest().data.sinks exposes it). The inbound boundary is external → CanonicalItem → upsert_node_from_item; the outbound boundary is its exact mirror — project_node_to_item reads a node into a typed CanonicalItem, a SinkAdapter writes it out. The same typed boundary carries data both directions; it never degrades to JSON through the DSL.
Invariants
These hold across all changes; violating any of them is a regression.
- L3 → L2 is one-way. Removing src/runex/{ontology,dsl}/ leaves L2 + adapters + pipeline functional (and __state__* fields become inert text).
- No business names in framework code. Bookmark, Thought, etc. appear only in config/ and tests/. Verifiable by grep.
- __state__* fields are write-protected by convention. Only (transition …) writes them. Direct writes bypass the discipline that makes state machines meaningful.
- Read primitives and kernels are separate boundaries. Guards may use read primitives only. Effects can mutate L2 through store primitives or invoke a registered kernel. No eval, no Python interop, no untrusted import.
- Reactive order is deterministic. priority ASC, name ASC is enforced at the point of dispatch. Same input event → same cascade. Reloading actions in a different order does not change outcome.
- error-on-conflict fails closed. If an action asserts a hard guarantee and the bus cannot prove it (same-field co-writer present, or field target is opaque), the whole conflict group is blocked. Ambiguity is refused loudly, never resolved by accident.
Extending the engine
| Goal | Action |
|---|---|
| Add an object type | (supertag …) in a .scm file, o.load() |
| Add a lifecycle / behavior | .scm in config/ontology/, runex ontology load |
| Add an external I/O kernel | Python def, engine.register_kernel("name", fn) |
| Add an ingest source | Adapter in src/runex/adapters/ with SPEC = DataSourceSpec(…); declare requires=(…) to get capabilities injected — do no direct I/O |
| Add an outbound writeback | Sink in src/runex/adapters/sinks/ with SPEC = SinkSpec(…); engine.register_sink(key, spec.build(ctx)); invoke via (writeback "key") in an effect |
| Inspect / debug a cascade | runex ontology trace ACTION NODE |
The DSL covers everything except external I/O. A .scm file is sufficient for new lifecycles; no Python edit, no restart.
Pipeline ingest
Adapters parse external sources (Obsidian vaults, chat JSONLs, …) into CanonicalItem:
python
@dataclass
class CanonicalItem:
    source_uri: str
    source_type: str
    supertag: str
    name: str
    fields: dict[str, tuple[type, value]]
    links: list[CanonicalLink]
    extra_tags: list[str]
    description: str | None
    attachments: list[Attachment]
upsert_node_from_item(store, item) upserts the item via L2 primitives: get-or-create by natural_key → tag → set fields → resolve RefByName targets → set links → store attachments. Each step emits events; the reactive bus handles everything downstream.
The adapter doesn't know about ontology. The pipeline doesn't know about ontology. The cascade is the connection.
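The first three upsert steps (get-or-create by natural key, tag, set fields) and the events they emit can be modeled with an in-memory stand-in. MemoryStore and its upsert method are hypothetical and far simpler than the L2 store; in particular, skipping writes for unchanged values is an assumption of this sketch.

```python
class MemoryStore:
    """Hypothetical in-memory stand-in for the L2 store."""

    def __init__(self):
        self.nodes = {}        # node_id -> {"tags": set, "fields": dict}
        self.identities = {}   # natural_key -> node_id
        self.events = []       # mutation events, in emission order

    def upsert(self, natural_key, supertag, fields):
        # Step 1: get-or-create by natural key.
        node_id = self.identities.get(natural_key)
        if node_id is None:
            node_id = len(self.nodes) + 1
            self.nodes[node_id] = {"tags": set(), "fields": {}}
            self.identities[natural_key] = node_id
            self.events.append(("node-created", node_id))
        node = self.nodes[node_id]
        # Step 2: tag.
        if supertag not in node["tags"]:
            node["tags"].add(supertag)
            self.events.append(("tagged", node_id, supertag))
        # Step 3: set fields, emitting only for values that changed.
        for name, value in fields.items():
            if node["fields"].get(name) != value:
                node["fields"][name] = value
                self.events.append(("field-set", node_id, name))
        return node_id


store = MemoryStore()
first_id = store.upsert("https://example.com/a", "Bookmark", {"title": "A", "body": "text"})
events_after_first = list(store.events)
second_id = store.upsert("https://example.com/a", "Bookmark", {"title": "A", "body": "text"})
```

The event order makes the trigger design principle concrete: tagged is emitted before field-set for body, so an action that reads body should wait for the latter. Re-ingesting the same item resolves to the same node and, in this sketch, emits nothing new.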
Where to look
| File | For |
|---|---|
| docs/dsl-reference.md | DSL grammar, special forms, primitives |
| docs/ontology-authoring.md | Modeling workflow, migration gate, conflict policy, debug recipes |
| docs/agent-contract.md | Result envelope, Event taxonomy, manifest(), event stream contract |
| config/ontology/*.scm | Working machine + action examples |
| src/runex/ontology/facade.py | Python API (Ontology class) |
| src/runex/ontology/engine.py | Dispatch + env construction |
| src/runex/ontology/reactive.py | Event classification + trigger match |
| src/runex/adapters/registry.py | Datasource capability registry |
| runex ontology --help | CLI |
Fit
This engine is at its best when:
- The ontology has 5–50 supertags, 5–30 actions, and cascade chains 2–5 deep
- Behavior is mostly graph mutation; computation lives in kernels
- Schema and behavior evolve frequently
- Agents or end-users need runtime extensibility
It is not the right substrate for:
- Sub-millisecond latency budgets (bus is synchronous, single-threaded)
- High-volume OLTP (single SQLite process)
- Strongly-typed RPC schemas (CanonicalItem is loose by design)
- Workflows where every step must be persisted to an external queue
When these constraints bite, the engine is the wrong layer; reach for a proper workflow engine or stream processor instead.
Future direction
L1 + L2 + L3 are a generic graph-ontology substrate that doesn't know about runex's specific domain. The plan is to extract them as a separate pip package once a second consumer appears or a multi-graph use case emerges. The invariants above are what keep that extraction cheap.