
runex Architecture

What this is

runex is a graph-shaped ontology engine: a typed graph store with declarative semantics for object lifecycles and reactive behavior. Object types, state machines, actions, and their triggers all live as data in the graph itself. New behavior is added by writing data — not Python — and takes effect immediately, without restart.

The N+M ingest pipeline (one adapter per source, one upsert routine for all sinks) is one consumer of this engine, not its purpose. Everything downstream of a write — link extraction, state transitions, derived structure — happens through reactive cascades on store events, not through call-site coupling.

Layers

L4   Interface         CLI · Ontology facade               ← contract lives here
L3   Ontology engine   DSL eval · Loader · Engine · Kernels · ReactiveBus
L2   Store             nodes / fields / links / identities · tx_log · on_commit
L1   SQLite

Dependency is strictly one-way L4 → L3 → L2 → L1. L2 doesn't know what a machine or action is — it just emits mutation events through on_commit. Removing src/runex/ontology/ and src/runex/dsl/ leaves L2 + the pipeline functional.

The contract is an L4 concern — Result envelope, closed Event taxonomy, manifest(), cursor-resumable event stream. L3 stays a pure bool/raise executor; L4 wraps it. JSON is truth; Rich output is a view.

Core abstractions

Node — the only object type

nodes ── node_tags ── supertags

   ├── fields            (typed EAV: text/longtext/number/date/bool/blob)
   ├── links             (typed directed edges with props)
   └── node_identities   (natural_key → node_id projection)

A node is the universal record. A Bookmark, a Person, a state-machine definition, and a kernel reference are all nodes — differentiated by which supertag they carry.
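As a rough illustration of "one record type, differentiated by supertag", the sketch below builds a tiny typed-EAV layout in an in-memory SQLite database. The table names follow the diagram above; every column name, and the helper functions `open_store`, `create_tagged_node`, and `names_tagged`, are assumptions made for this example and are not runex's actual schema or API.

```python
import sqlite3

# Hypothetical schema: table names follow the diagram, column names are assumed.
SCHEMA = """
CREATE TABLE nodes     (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE supertags (id INTEGER PRIMARY KEY, name TEXT NOT NULL UNIQUE);
CREATE TABLE node_tags (node_id INTEGER NOT NULL REFERENCES nodes(id),
                        supertag_id INTEGER NOT NULL REFERENCES supertags(id),
                        PRIMARY KEY (node_id, supertag_id));
CREATE TABLE fields    (node_id INTEGER NOT NULL REFERENCES nodes(id),
                        name TEXT NOT NULL, type TEXT NOT NULL, value);
"""


def open_store() -> sqlite3.Connection:
    """Create an in-memory database with the sketch schema."""
    db = sqlite3.connect(":memory:")
    db.executescript(SCHEMA)
    return db


def create_tagged_node(db, name, supertag, fields):
    """Insert a node, tag it, and store each field as a typed EAV row."""
    node_id = db.execute("INSERT INTO nodes (name) VALUES (?)", (name,)).lastrowid
    db.execute("INSERT OR IGNORE INTO supertags (name) VALUES (?)", (supertag,))
    tag_id = db.execute(
        "SELECT id FROM supertags WHERE name = ?", (supertag,)).fetchone()[0]
    db.execute("INSERT INTO node_tags VALUES (?, ?)", (node_id, tag_id))
    db.executemany(
        "INSERT INTO fields VALUES (?, ?, ?, ?)",
        [(node_id, fname, ftype, value) for fname, (ftype, value) in fields.items()])
    return node_id


def names_tagged(db, supertag):
    """Return the names of all nodes carrying the given supertag."""
    rows = db.execute(
        "SELECT n.name FROM nodes n "
        "JOIN node_tags t ON t.node_id = n.id "
        "JOIN supertags s ON s.id = t.supertag_id "
        "WHERE s.name = ? ORDER BY n.name", (supertag,))
    return [row[0] for row in rows]
```

In this layout a Bookmark and an ActionDefinition are stored in exactly the same tables; only the row in `node_tags` tells them apart.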

Supertag — type declaration

A supertag declares:

  • Field schema (names, types, required, multi-value)
  • Optional inheritance via extends
  • Optional natural_key field that uniquely identifies real-world objects

Three supertags are reserved for the engine's own metadata: MachineDefinition, ActionDefinition, KernelRef.

Supertag definitions are (supertag …) forms in .scm bundles, loaded via o.load(). The core type system ships in config/ontology/core.scm. The engine applies Policy-A migration gating: additive changes (new field, new supertag) always apply at runtime; destructive changes (drop a field, change a field's type, change the natural key) are refused if the supertag already has tagged nodes — unless allow_destructive=True. Re-applying a byte-identical definition is a true no-op (no event emitted, no churn on repeated load).
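The migration gate described above can be sketched as a small classifier plus a refusal check. This is an illustration only: the dict-shaped schemas, `classify_change`, `gate`, and `DestructiveChangeRefused` are hypothetical names, and plain dict equality stands in for the byte-identical comparison the engine performs.

```python
from __future__ import annotations


class DestructiveChangeRefused(Exception):
    """Raised when a destructive change targets a supertag that has tagged nodes."""


def classify_change(old: dict, new: dict) -> str:
    """Return 'noop', 'additive', or 'destructive' for a supertag redefinition.

    Schemas are hypothetical dicts: {"natural_key": str | None, "fields": {name: type}}.
    """
    if old == new:
        return "noop"
    if old.get("natural_key") != new.get("natural_key"):
        return "destructive"  # changing the natural key
    for name, ftype in old["fields"].items():
        if name not in new["fields"]:
            return "destructive"  # dropping a field
        if new["fields"][name] != ftype:
            return "destructive"  # changing a field's type
    return "additive"  # only new fields remain as a difference


def gate(old: dict, new: dict, tagged_nodes: int,
         allow_destructive: bool = False) -> str:
    """Apply the gate: refuse destructive changes when tagged nodes exist."""
    kind = classify_change(old, new)
    if kind == "destructive" and tagged_nodes > 0 and not allow_destructive:
        raise DestructiveChangeRefused(
            f"{tagged_nodes} tagged node(s) exist; pass allow_destructive=True to force")
    return kind
```

Note that a destructive change on a supertag with no tagged nodes passes the gate in this sketch, matching the "refused if the supertag already has tagged nodes" wording.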

Machine — lifecycle template (data)

```scheme
(machine "Name"
  (supertag "Name")
  (initial  "s1")
  (states ("s1" "act-a")
          ("s2" "act-b")
          ("s3")))                  ; terminal
```

Current state is stored on the node as the reserved field __state__<MachineName>. Only the engine's (transition …) primitive may write this field; manual set-field on it bypasses discipline.

Action — guarded transition with effect (data)

```scheme
(action "act-a"
  (machine     "Name")
  (from-states "s1")
  (trigger     TRIGGER-EXPR)
  (guard       GUARD-EXPR)            ; pure boolean
  (effect      EFFECT-EXPR)           ; mutations + (transition "s2")
  (priority    100)                   ; optional, lower runs first
  (conflict-policy "last-write-wins")) ; optional
```

from-states is a hard precondition checked before guard. Guards read but never write. Effects compose store primitives + kernel calls.
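The evaluation order stated above (from-states first, then guard, then effect) can be sketched in a few lines of Python. The `Action` dataclass, the dict-shaped node, and the string outcomes are assumptions for illustration; the real engine evaluates DSL expressions rather than Python callables.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Callable

Node = dict  # hypothetical node shape: field name -> value


@dataclass
class Action:
    name: str
    machine: str
    from_states: tuple[str, ...]
    guard: Callable[[Node], bool]   # reads only
    effect: Callable[[Node], str]   # may mutate the node; returns the target state


def dispatch(action: Action, node: Node) -> str:
    """Run one action against one node and report what happened."""
    state_field = f"__state__{action.machine}"
    # 1. from-states is a hard precondition, checked before the guard.
    if node.get(state_field) not in action.from_states:
        return "illegal_transition"
    # 2. The guard sees a copy, so it cannot write to the node.
    if not action.guard(dict(node)):
        return "guard_rejected"
    # 3. Only now does the effect run; the transition is the only state write.
    node[state_field] = action.effect(node)
    return "ran"
```

For example, an action with `from_states=("s1",)` returns `"illegal_transition"` for a node already in `s2` without ever evaluating its guard.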

Reactive actions are ordered by priority ASC, name ASC. The order is enforced at dispatch time, not at load time. If multiple matching actions write the same literal field, the default last-write-wins policy makes the last writer's value final, and the bus records the conflict in reactive_dispatch_plan. An action marked error-on-conflict that participates in a same-field conflict blocks the whole conflict group and emits reactive_conflict_blocked. Such an action also fails closed on opaque field targets: if the field name is not a literal (e.g., it comes from a kernel result or variable), the bus refuses rather than risk a silent accidental winner.
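A minimal sketch of the ordering and conflict rules follows. `PlannedAction` and `plan` are hypothetical names, `None` is used here to mark an opaque (non-literal) field target, the default priority of 100 is assumed, and blocking only the offending action for an opaque target is this sketch's simplification.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class PlannedAction:
    name: str
    priority: int = 100                      # assumed default
    writes: tuple[str | None, ...] = ()      # None marks an opaque field target
    conflict_policy: str = "last-write-wins"


def plan(actions: list[PlannedAction]) -> dict:
    """Order matched actions and work out same-field conflicts."""
    ordered = sorted(actions, key=lambda a: (a.priority, a.name))

    # Collect the writers of each literal field, in dispatch order.
    writers: dict[str, list[str]] = {}
    for action in ordered:
        for target in dict.fromkeys(action.writes):  # de-duplicate, keep order
            if target is not None:
                writers.setdefault(target, []).append(action.name)
    conflicts = {f: names for f, names in writers.items() if len(names) > 1}

    # error-on-conflict fails closed: conflict groups and opaque targets are blocked.
    blocked: set[str] = set()
    for action in ordered:
        if action.conflict_policy != "error-on-conflict":
            continue
        if None in action.writes:
            blocked.add(action.name)
        for target in action.writes:
            if target in conflicts:
                blocked.update(conflicts[target])

    return {
        "order": [a.name for a in ordered if a.name not in blocked],
        "conflicts": conflicts,
        "blocked": sorted(blocked),
        "final_writer": {f: names[-1] for f, names in conflicts.items()
                         if not blocked.intersection(names)},
    }
```

Because the sort key is `(priority, name)`, the result does not depend on the order in which actions were loaded.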

Trigger — when an action fires

(manual)                        only via explicit Ontology.dispatch
(on EVENT-TYPE PRED …)          fires when event + predicates match
(any-of TRIGGER …)              disjunction

Event types synthesized from store mutations:

Event               Source
node-created        store.create_node / identity upsert
node-updated        store.update_node
field-set           store.set_field on non-state fields
field-unset         store.unset_field
link-created        store.create_link
link-deleted        store.delete_link
state-transitioned  set_field on __state__<M>, with machine + to synthesized
tagged              store.tag

Predicates: (field N) (supertag T) (rel R) (machine M) (to S).

Design principle: trigger on the event that signals the data the action reads is in place, not the event that signals the object exists. An action consuming a body field should trigger on (on field-set (field "body") …), not on (on tagged …) — the tag event fires before the body is written.
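To make the trigger forms concrete, here is a small matcher over triggers written as nested Python tuples. The tuple encoding, the dict-shaped event, and the function name `matches` are assumptions for this sketch; the engine itself matches each action's trigger_spec against classified store events.

```python
from __future__ import annotations


def matches(trigger: tuple, event: dict) -> bool:
    """Return True if a reactive trigger fires for the given event.

    Hypothetical encoding:
      ("manual",)
      ("on", EVENT_TYPE, (PRED_KEY, PRED_VALUE), ...)
      ("any-of", TRIGGER, ...)
    """
    kind = trigger[0]
    if kind == "manual":
        return False  # manual actions run only via explicit dispatch
    if kind == "any-of":
        return any(matches(sub, event) for sub in trigger[1:])
    if kind == "on":
        _, event_type, *preds = trigger
        if event.get("type") != event_type:
            return False
        # Every predicate must agree with the event's attributes.
        return all(event.get(key) == value for key, value in preds)
    raise ValueError(f"unknown trigger form: {kind!r}")


# An action that reads "body" triggers on the field write, not on the tag event.
body_set = ("on", "field-set", ("field", "body"))
```

With this encoding, `matches(body_set, {"type": "tagged", "supertag": "Bookmark"})` is false, which is the point of the design principle: the tag event arrives before the body exists.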

Read primitive — guard-safe Python helper

Read primitives are Python functions injected into the DSL env by the engine. They are available in guards and effects, must not mutate the store, and must not perform external I/O. Use this boundary for deterministic computations that guards need, such as timestamp arithmetic or business-day counting.

Current engine-owned read primitives beyond store reads include date-diff and business-days-between.
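For a sense of what a guard-safe read primitive looks like, the sketch below counts business days with the standard library only. It is not the engine's business-days-between implementation: the half-open range, the Monday to Friday definition, and the absence of holiday handling are all assumptions.

```python
from datetime import date, timedelta


def business_days_between(start: date, end: date) -> int:
    """Count Monday-Friday days in the half-open range [start, end).

    Pure and deterministic: no store writes, no external I/O, no holidays.
    A reversed range returns the negated count.
    """
    if end < start:
        return -business_days_between(end, start)
    days = 0
    current = start
    while current < end:
        if current.weekday() < 5:  # 0-4 are Monday through Friday
            days += 1
        current += timedelta(days=1)
    return days
```

A function of this shape depends only on its arguments, which is what makes it safe to call from a guard.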

Kernel — effect-only Python escape hatch

Kernels are developer-registered Python functions:

```python
engine.register_kernel("yt-dlp-download",
                       lambda url: {"blob_key": "…", "metadata": "…"})
```

(call-kernel "name" arg …) is the only way for an effect to reach outside the store: HTTP, LLM, filesystem, subprocess. Agents can compose kernels via DSL but cannot define new ones. This is the safety boundary. Guards cannot call kernels.

Built-in kernels (pure, auto-installed): extract-wiki-links, extract-wiki-link-pairs, extract-hashtags, regex-find, regex-find-all, sum-field-on.
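A pure kernel is just a function from arguments to a value. The sketch below shows what link and hashtag extraction might look like; the regular expressions and the exact syntax they accept are assumptions, and the shipped extract-wiki-links and extract-hashtags kernels may behave differently.

```python
import re

# [[Target]], [[Target|label]] and [[Target#section]] all yield "Target".
WIKI_LINK = re.compile(r"\[\[([^\[\]|#]+)(?:#[^\[\]|]*)?(?:\|[^\[\]]*)?\]\]")
# A hashtag starts with a letter and must not be glued to a preceding word.
HASHTAG = re.compile(r"(?<!\w)#([A-Za-z][\w/-]*)")


def extract_wiki_links(text: str) -> list[str]:
    """Return link targets in document order, without labels or sections."""
    return [target.strip() for target in WIKI_LINK.findall(text)]


def extract_hashtags(text: str) -> list[str]:
    """Return hashtags in document order, without the leading '#'."""
    return HASHTAG.findall(text)
```

Neither function touches the store or the outside world, which is why kernels of this kind can be auto-installed.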

(writeback "sink-key") is the outbound sibling of call-kernel. It belongs to the same effect-only escape-hatch category and is likewise disallowed in guards. Registration follows the same model: the Python/bootstrap author wires a configured SinkAdapter with engine.register_sink(key, sink), and the DSL author only names it. The primitive projects $node to a typed CanonicalItem and hands it to the sink, so the node's data does not cross the DSL.

Reactive bus — events → dispatch

Every successful transaction commit fires on_commit(events). The bus classifies each event into a high-level signal and matches every loaded action's trigger_spec against it. Matching actions are sorted by priority ASC, name ASC, then dispatched synchronously.

Before dispatching a matched group, the bus writes a reactive_dispatch_plan audit event containing the source signal, ordered actions, literal field writes, detected same-field conflicts, and the ordering rule. If error-on-conflict blocks a conflict group, the bus also writes reactive_conflict_blocked. No reactive outcome is silent — guard rejections (reactive_guard_rejected), illegal transitions (reactive_illegal_transition), and dispatch errors (reactive_dispatch_error) all emit typed events under actor system:reactive. Imperative dispatch returns ok:false with a structured error; it never raises at the caller.

Cascades nest; depth is bounded by max_depth=10.
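The nesting and the depth bound can be pictured with a toy synchronous bus. `MiniBus` and `CascadeDepthExceeded` are hypothetical, and raising an exception at the bound is an assumption of this sketch; the text above states only that depth is bounded by max_depth=10.

```python
from __future__ import annotations

from typing import Callable


class CascadeDepthExceeded(RuntimeError):
    """Raised by this sketch when a cascade nests deeper than max_depth."""


class MiniBus:
    def __init__(self, max_depth: int = 10) -> None:
        self.max_depth = max_depth
        self.handlers: list[Callable[[dict, "MiniBus"], None]] = []
        self._depth = 0
        self.max_seen = 0  # deepest nesting level reached, for inspection

    def commit(self, event: dict) -> None:
        """Dispatch synchronously; handlers may commit again, which nests."""
        if self._depth >= self.max_depth:
            raise CascadeDepthExceeded(f"cascade deeper than {self.max_depth}")
        self._depth += 1
        self.max_seen = max(self.max_seen, self._depth)
        try:
            for handler in self.handlers:
                handler(event, self)
        finally:
            self._depth -= 1  # unwind even when a handler raises
```

A handler that re-commits unconditionally is stopped after ten nested levels instead of recursing until the interpreter's own limit.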

Ontology — high-level facade (L4)

```python
o = Ontology.open("data/data.db")    # store + engine + bus + default kernels
o.load("config/ontology/foo.scm")    # → Result{ok, data:{machines,actions}, events}
o.dispatch("act-a", node_id)         # → Result{ok, data:{ran}, events, cursor}
o.manifest()                         # → Result{ok, data:{supertags,machines,...}}
o.analyze_conflicts()                # → Result{ok, data:{field_conflicts,...}}
o.events(since=cursor, kinds=[…])    # → Result{ok, data:{count}, events, cursor}
trace = o.trace("act-a", node_id)    # debug only — OntologyTrace, not Result
```

Every operation returns a Result envelope (ok, data, error, events, cursor). trace() is the single deliberate exception — a debug introspection tool that returns the richer OntologyTrace. See agent-contract.md for the full contract specification.

CLI mirror: runex ontology {list|load|describe|run|trace|manifest|events|check}. Every subcommand accepts --json and emits the Result envelope verbatim; exit code mirrors ok.
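The envelope and its relationship to the bool/raise executor underneath can be sketched as follows. The five field names come from the text above; the field types, the `safe_dispatch` wrapper, and the mapping of ok to exit codes 0 and 1 are assumptions, and agent-contract.md remains the authoritative description.

```python
from __future__ import annotations

import json
from dataclasses import asdict, dataclass, field
from typing import Any, Callable


@dataclass
class Result:
    """Hypothetical envelope with the five fields named in the text."""
    ok: bool
    data: dict[str, Any] = field(default_factory=dict)
    error: dict[str, Any] | None = None
    events: list[dict[str, Any]] = field(default_factory=list)
    cursor: int | None = None  # assumed to be an integer position

    def to_json(self) -> str:
        """Serialize the envelope; the JSON form is the source of truth."""
        return json.dumps(asdict(self), sort_keys=True)

    def exit_code(self) -> int:
        """Assumed mapping: ok -> 0, otherwise 1."""
        return 0 if self.ok else 1


def safe_dispatch(executor: Callable[..., Any], *args: Any) -> Result:
    """Wrap a bool/raise executor so the caller never sees an exception."""
    try:
        return Result(ok=True, data={"ran": bool(executor(*args))})
    except Exception as exc:
        return Result(ok=False,
                      error={"type": type(exc).__name__, "message": str(exc)})
```

The wrapper is where "L3 raises, L4 returns ok:false" happens in this sketch: the executor stays simple and the envelope carries the structured error.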

Datasource registry

Adapters self-describe via a module-level SPEC: DataSourceSpec in src/runex/adapters/<name>.py. The registry (adapters/registry.py) lazily imports them and exposes registry(), get(key), and all_specs(). manifest().data.datasources surfaces the registry as structured data — agents and products consume it without reading source.

DataSourceSpec.build(ctx) validates required params and returns an adapter instance. A missing required param raises a structured error naming exactly what is missing.

A spec also declares requires: tuple[str, ...], the capability keys the adapter needs. build() constructs those via the capability broker (adapters/capabilities.py) and injects them into the adapter; the adapter itself performs zero direct I/O and imports no filesystem/db/http library. A capability (e.g. FileCapability) is the only code that touches the OS on a channel's behalf. This makes the sanctioned external-I/O surface of Invariant 4 structural rather than conventional. It is the same broker-injection pattern that Engine._make_env uses for the DSL env, applied one layer out. The split is enforced by tests/test_capability_seam.py, and manifest() exposes requires so the contract stays fully introspectable. All six adapters are split; the shipped capabilities are file, sqlite, command, and http (mcp is deferred until a consumer exists). See docs/roadmap.md for the seam's executable criterion.
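The two build-time behaviors, param validation and capability injection, can be sketched together. Everything named here (`SpecSketch`, `BuildContext`, `MissingParams`, `InMemoryFiles`, `NotesAdapter`) is hypothetical and chosen for illustration; the real DataSourceSpec and capability broker live in the modules named above and may be shaped differently.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Any, Callable


class MissingParams(ValueError):
    """Structured error naming exactly which required params are absent."""

    def __init__(self, key: str, missing: list[str]) -> None:
        super().__init__(f"datasource {key!r} is missing: {', '.join(missing)}")
        self.key = key
        self.missing = missing


@dataclass
class BuildContext:
    params: dict[str, Any]
    broker: dict[str, Callable[[], Any]]  # capability key -> constructor


@dataclass(frozen=True)
class SpecSketch:
    key: str
    factory: Callable[..., Any]
    required_params: tuple[str, ...] = ()
    requires: tuple[str, ...] = ()

    def build(self, ctx: BuildContext) -> Any:
        missing = [p for p in self.required_params if p not in ctx.params]
        if missing:
            raise MissingParams(self.key, missing)
        # Capabilities are constructed by the broker and injected.
        caps = {name: ctx.broker[name]() for name in self.requires}
        return self.factory(params=ctx.params, caps=caps)


class InMemoryFiles:
    """Stand-in for a file capability: the only object that reads 'storage'."""

    def __init__(self, files: dict[str, str]) -> None:
        self.files = files

    def read_text(self, path: str) -> str:
        return self.files[path]


class NotesAdapter:
    """Adapter with no I/O imports; it only uses what was injected."""

    def __init__(self, params: dict[str, Any], caps: dict[str, Any]) -> None:
        self.path = params["path"]
        self.files = caps["file"]

    def items(self) -> list[str]:
        return self.files.read_text(self.path).splitlines()
```

Swapping `InMemoryFiles` for a real file capability changes nothing in `NotesAdapter`, which is the property the seam test is described as enforcing.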

The registry is bidirectional. SinkSpec is the outbound mirror of DataSourceSpec (same lazy registry, same param/capability injection, same zero-direct-I/O seam; manifest().data.sinks exposes it). The inbound boundary is external → CanonicalItem → upsert_node_from_item; the outbound boundary is its exact mirror — project_node_to_item reads a node into a typed CanonicalItem, a SinkAdapter writes it out. The same typed boundary carries data both directions; it never degrades to JSON through the DSL.

Invariants

These hold across all changes; violating any of them is a regression.

  1. L3 → L2 is one-way. Removing src/runex/{ontology,dsl}/ leaves L2 + adapters + pipeline functional (and __state__* fields become inert text).
  2. No business names in framework code. Bookmark, Thought, etc. appear only in config/ and tests/. Verifiable by grep.
  3. __state__* fields are write-protected by convention. Only (transition …) writes them. Direct writes bypass the discipline that makes state machines meaningful.
  4. Read primitives and kernels are separate boundaries. Guards may use read primitives only. Effects can mutate L2 through store primitives or invoke a registered kernel. No eval, no Python interop, no untrusted import.
  5. Reactive order is deterministic. priority ASC, name ASC is enforced at the point of dispatch. Same input event → same cascade. Reloading actions in a different order does not change outcome.
  6. error-on-conflict fails closed. If an action asserts a hard guarantee and the bus cannot prove it (same-field co-writer present, or field target is opaque), the whole conflict group is blocked. Ambiguity is refused loudly, never resolved by accident.

Extending the engine

Goal                          Action
Add an object type            (supertag …) in a .scm file, o.load()
Add a lifecycle / behavior    .scm in config/ontology/, runex ontology load
Add an external I/O kernel    Python def, engine.register_kernel("name", fn)
Add an ingest source          Adapter in src/runex/adapters/ with SPEC = DataSourceSpec(…); declare requires=(…) to get capabilities injected; do no direct I/O
Add an outbound writeback     Sink in src/runex/adapters/sinks/ with SPEC = SinkSpec(…); engine.register_sink(key, spec.build(ctx)); invoke via (writeback "key") in an effect
Inspect / debug a cascade     runex ontology trace ACTION NODE

The DSL covers everything except external I/O. A .scm file is sufficient for new lifecycles; no Python edit, no restart.

Pipeline ingest

Adapters parse external sources (Obsidian vaults, chat JSONLs, …) into CanonicalItem:

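```python
from __future__ import annotations  # lets the sketch name types defined elsewhere

from dataclasses import dataclass
from typing import Any


@dataclass
class CanonicalItem:
    source_uri:  str
    source_type: str
    supertag:    str
    name:        str
    fields:      dict[str, tuple[str, Any]]  # name -> (field type, value); type name assumed to be a str
    links:       list[CanonicalLink]         # CanonicalLink is defined elsewhere (not shown)
    extra_tags:  list[str]
    description: str | None
    attachments: list[Attachment]            # Attachment is defined elsewhere (not shown)
```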

upsert_node_from_item(store, item) upserts the item via L2 primitives: get-or-create by natural_key → tag → set fields → resolve RefByName targets → set links → store attachments. Each step emits events; the reactive bus handles everything downstream.
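The upsert sequence can be traced with a toy in-memory store that records the events each step would emit. `MiniStore` and `upsert` are hypothetical; using source_uri as the natural key, creating stub nodes for unresolved link targets, and staying silent on unchanged values are assumptions of this sketch, and attachments are omitted.

```python
from __future__ import annotations

import itertools


class MiniStore:
    """Toy stand-in for L2: nodes, identities, links, and an event log."""

    def __init__(self) -> None:
        self.nodes: dict[int, dict] = {}
        self.identities: dict[str, int] = {}  # natural_key -> node_id
        self.links: set[tuple[int, str, int]] = set()
        self.events: list[str] = []
        self._ids = itertools.count(1)

    def get_or_create(self, natural_key: str, name: str) -> int:
        if natural_key in self.identities:
            return self.identities[natural_key]
        node_id = next(self._ids)
        self.nodes[node_id] = {"name": name, "tags": set(), "fields": {}}
        self.identities[natural_key] = node_id
        self.events.append("node-created")
        return node_id


def upsert(store: MiniStore, item: dict) -> int:
    """get-or-create -> tag -> set fields -> resolve targets -> set links."""
    node_id = store.get_or_create(item["source_uri"], item["name"])
    node = store.nodes[node_id]
    if item["supertag"] not in node["tags"]:
        node["tags"].add(item["supertag"])
        store.events.append("tagged")
    for fname, value in item["fields"].items():
        if node["fields"].get(fname) != value:
            node["fields"][fname] = value
            store.events.append("field-set")
    for rel, target_name in item["links"]:
        # Resolve a by-name reference, creating a stub node if needed.
        target_id = store.get_or_create(f"name:{target_name}", target_name)
        if (node_id, rel, target_id) not in store.links:
            store.links.add((node_id, rel, target_id))
            store.events.append("link-created")
    return node_id
```

Note the order of the event log: `tagged` is recorded before `field-set`, which is why an action that reads a field should trigger on the field write rather than on the tag.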

The adapter doesn't know about ontology. The pipeline doesn't know about ontology. The cascade is the connection.

Where to look

File                              For
docs/dsl-reference.md             DSL grammar, special forms, primitives
docs/ontology-authoring.md        Modeling workflow, migration gate, conflict policy, debug recipes
docs/agent-contract.md            Result envelope, Event taxonomy, manifest(), event stream contract
config/ontology/*.scm             Working machine + action examples
src/runex/ontology/facade.py      Python API (Ontology class)
src/runex/ontology/engine.py      Dispatch + env construction
src/runex/ontology/reactive.py    Event classification + trigger match
src/runex/adapters/registry.py    Datasource capability registry
runex ontology --help             CLI

Fit

This engine is at its best when:

  • The model has 5–50 supertags, 5–30 actions, and cascade chains 2–5 deep
  • Behavior is mostly graph mutation; computation lives in kernels
  • Schema and behavior evolve frequently
  • Agents or end-users need runtime extensibility

It is not the right substrate for:

  • Sub-millisecond latency budgets (bus is synchronous, single-threaded)
  • High-volume OLTP (single SQLite process)
  • Strongly-typed RPC schemas (CanonicalItem is loose by design)
  • Workflows where every step must be persisted to an external queue

When these constraints bite, the engine is the wrong layer; reach for a proper workflow engine or stream processor instead.

Future direction

L1 + L2 + L3 form a generic graph-ontology substrate that doesn't know about runex's specific domain. The plan is to extract them as a separate pip package once a second consumer appears or a multi-graph use case emerges. The invariants above are what keep that extraction cheap.