Self-Similar Trees
Abstract
WunderOS keeps arriving at an old shape. A node is not merely an item in a tree. It is a place from which the same tree can be viewed again. The operation that makes sense at a leaf also makes sense at an internal node, provided the scope is stated and enforced. This is the self-similar tree.
The immediate ancestry is Plan 9 and XML, not a new theory of architecture. Plan 9 taught that a process can assemble a private hierarchical namespace from local and remote services, and then use ordinary file operations inside that view. XML taught the same lesson in data form: a document is a tree of nodes, and any selected element can be treated as the root of a subtree for selection, validation, transformation, or copying. Composite objects and recursive component models are later software-pattern relatives, but the older lesson is enough.
The shape appears in HOD, the substrate-internal logic layer: think of HOD as stored procedures for the WunderOS substrate. A HOD-bin attaches to a WaxRow or a WaxDeweyDecimal-addressed subtree, receives events there, keeps state there, and is confined to descendants of that attachment point. The same host vocabulary is used whether the attachment point is a single row or a subtree. It appears again in Erregung, the spreading-activation retrieval surface inside WunderOS’s natural-language ingestion pipeline. A query lights episode, entity, and abstraction nodes, spreads across a bounded graph, and uses WaxCenter centroids as higher-order nodes that stand for neighborhoods of episodes. Retrieval moves up to an abstraction, crosses there, and descends again.
The common claim is not novelty. It is application. An agentic substrate should make recursive structure operationally self-similar only when the old namespace lesson can be made enforceable. It should not merely store trees. It should let each subtree behave as a local system with the same vocabulary, the same audit surface, and a smaller boundary. The benefit is not elegance. It is that capability, replay, cost, and explanation all reduce to the boundary around the subtree being used.
1. The shape
A hierarchy by itself is not enough: some hierarchies are not trees, and some trees are not self-similar. A filesystem has a tree; a taxonomy has a tree; a plan has a tree. The self-similar case is narrower. It requires that a node be usable as the root of the same kind of system that contains it. If an operation is defined on the whole tree, the same operation is defined on any subtree, with the subtree’s root treated as the local origin.
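The requirement can be made concrete with a small sketch. Everything here is illustrative: the Node class, its find and count methods, and the example names are assumptions made for the example, not WunderOS interfaces. The only point is that one operation is defined once and works unchanged when any node is taken as the local origin.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Node:
    """Hypothetical tree node that can also serve as the root of a local view."""
    name: str
    children: list[Node] = field(default_factory=list)

    def find(self, path: tuple[str, ...]) -> Node:
        """Resolve a path relative to this node, treated as the local origin."""
        node = self
        for part in path:
            matches = [child for child in node.children if child.name == part]
            if not matches:
                raise KeyError(part)
            node = matches[0]
        return node

    def count(self) -> int:
        """One operation, defined the same way at every scale."""
        return 1 + sum(child.count() for child in self.children)


tenant = Node("tenant", [
    Node("project", [Node("episode-a"), Node("episode-b")]),
    Node("archive"),
])

# The same calls work on the whole tree and on any subtree taken as a root.
project = tenant.find(("project",))
episode = project.find(("episode-a",))
print(tenant.count(), project.count(), episode.count())
```

The caller of project.count() never needs to know that project sits inside tenant. The scope changed; the operation did not.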
Plan 9 is the clean operating-system version of the idea. Its namespace is not one global tree but a per-process construction, assembled by mount and bind, in which resources are represented as file systems and accessed through the same ordinary operations. A remote process tree, a network interface, a window, or an old dump can be made visible at a local path. The point is not that everything is literally a disk file. The point is that a local rooted view makes heterogeneous things usable through a small vocabulary.
XML is the clean data version. The XML Information Set describes a document as information items with parent and child relations. The DOM exposes the document as a tree of nodes. XPath selects nodes and paths through that tree. In common XML practice an element is both a node inside a larger document and the root of a subtree that can be selected, validated, transformed, copied, or serialized. That is the same form without the operating-system machinery.
This is not a claim about visual form. It is a claim about functional system interfaces. A self-similar tree has one vocabulary across scales. A read at the tenant root, a read at a project subtree, and a read at an episodic row are not three different operations dressed in similar names. They are the same operation with different scope. The boundary changes; the operation does not.
Five conditions make the shape useful rather than decorative.
- First, the tree must have stable addresses, because a subtree that cannot be named cannot be bounded.
- Second, the operations must be closed under restriction to a subtree, because otherwise every level of the tree needs its own special case.
- Third, state and trace must be recorded at the same level at which the operation runs, because an operation whose effects are stored elsewhere has escaped the tree it claims to inhabit.
- Fourth, cost must be local enough to budget, because a subtree that can spend without limit is only a global process under another name.
- Fifth, the boundary must be enforced by the substrate, not by the caller’s courtesy or self-restraint, because self-similarity without confinement is just repetitive ambient authority.
The result is recursive without being circular. A subtree may contain logic that acts on its own descendants; those descendants may contain smaller logic with the same vocabulary; and so on. But every step has an address, a capability set, a trace record, and a cost envelope. Recursion is admitted because each instance is finite and bounded.
2. HOD is the operational instance
Functionally, HOD is the substrate mechanism for putting small pieces of logic at named locations in the substrate. A HOD-bin is closest to a stored procedure, a file-server endpoint, and an actor folded into one object: it sits at a row or subtree, observes events there, may keep state there, may act on descendants, and is confined by the address at which it is attached. The bin itself is a WASM module, so the substrate gets a small sandboxed execution unit with a fixed host vocabulary rather than arbitrary in-process extension code. HOD is the case where logic is placed inside a local namespace and made responsible only for that local world. HOD makes the tree executable. The attachment point is the bin’s world. The bin receives substrate events at that point, fires reactively on writes and structural changes, wakes autonomously on periodic Gardener pulses, and can be called from a Brass Loom plan step. It keeps persistent state and emits trace records. It is not an agent, and it is not a second planner. It is substrate-resident logic attached to a place.
The important part is the confinement rule. A bin attached at a Dewey address can address descendants of that address and not its parent or siblings. This subtree-down rule is what turns hierarchy into a usable execution surface. The bin does not need to know where it sits in the tenant’s full tree in order to run correctly. It sees its attachment point as root, and the host-syscall boundary enforces that view.
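A minimal sketch of the subtree-down rule follows. It assumes, purely for illustration, that a Dewey address can be written as dotted integers such as "1.4.2"; the actual WaxDeweyDecimal encoding is not specified here, and ConfinedView, parse, resolve, admits, and check are hypothetical names rather than HOD-SI calls.

```python
def parse(address: str) -> tuple[int, ...]:
    """Parse a dotted address such as "1.4.2" (an assumed format) into integers."""
    return tuple(int(part) for part in address.split("."))


class ConfinedView:
    """Hypothetical view that treats an attachment address as the local root."""

    def __init__(self, attachment: str) -> None:
        self.root = parse(attachment)

    def resolve(self, relative: str) -> str:
        """Turn a root-relative address into an absolute one; "" is the root itself."""
        target = self.root + (parse(relative) if relative else ())
        return ".".join(str(part) for part in target)

    def admits(self, absolute: str) -> bool:
        """True only for the attachment point and its descendants."""
        target = parse(absolute)
        return target[:len(self.root)] == self.root

    def check(self, absolute: str) -> None:
        """Refuse any address outside the attached subtree."""
        if not self.admits(absolute):
            raise PermissionError(f"{absolute} is outside the attached subtree")


view = ConfinedView("1.4")
print(view.resolve("2.1"))   # a descendant, addressed from the local root
print(view.admits("1.5"))    # a sibling of the attachment point
print(view.admits("1"))      # the parent of the attachment point
```

Comparing parsed components rather than string prefixes matters: "1.40" is not a descendant of "1.4", and a naive prefix test would admit it.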
This is self-similarity in the strict sense. The same HOD-SI vocabulary applies at every attachment point: resolve an address, read a spindle, etch a child, request a mount, checkpoint state, send a message, emit a trace. A bin attached to a narrow row and a bin attached to a broad subtree differ in scope, not in kind. The host vocabulary does not grow a new dialect when the attachment moves up or down the tree.
Mount and bindmount sharpen the point. A mount grafts one context into another under a controlled grant. A bindmount overlays logic under Plan 9-style composition. These operations are tree operations, but they are also operations on local worlds. The Mount Broker does not need a theory of all possible meanings of the graft. It needs an address, a capability descriptor, a quota, and an audit record. The local subtree becomes richer, while the boundary rule remains the same.
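The four things the broker needs can be sketched as a single request check. This is a toy under stated assumptions: MountRequest, MountBroker, the grant table of (source, capability) pairs, and the dotted address format are all invented for the example and do not describe the real Mount Broker.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class MountRequest:
    source: str      # address of the context to graft in (assumed dotted format)
    target: str      # address where the graft should appear
    capability: str  # hypothetical capability descriptor, e.g. "read"
    quota: int       # hypothetical budget units for the mounted context


@dataclass
class MountBroker:
    grants: set                                 # allowed (source, capability) pairs
    mounts: dict = field(default_factory=dict)  # target address -> request
    audit: list = field(default_factory=list)   # one record per decision

    def request(self, attachment: str, req: MountRequest) -> bool:
        """Admit a graft only inside the requester's subtree and under a grant."""
        inside = req.target == attachment or req.target.startswith(attachment + ".")
        granted = (req.source, req.capability) in self.grants
        ok = inside and granted and req.quota > 0
        if ok:
            self.mounts[req.target] = req
        # Denials are audited too, so the record shows what was refused.
        self.audit.append({
            "attachment": attachment, "source": req.source, "target": req.target,
            "capability": req.capability, "quota": req.quota, "granted": ok,
        })
        return ok


broker = MountBroker(grants={("9.1", "read")})
accepted = broker.request("1.4", MountRequest("9.1", "1.4.7", "read", quota=100))
refused = broker.request("1.4", MountRequest("9.1", "1.5.7", "read", quota=100))
print(accepted, refused, len(broker.audit))
```

The broker never inspects what the grafted context means. It decides from the address, the descriptor, and the quota, and it writes an audit record either way.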
HOD-bins have a lifecycle or maturity model, too. At L1/L2, a HOD-bin is mostly invoked by events: substrate write, probe, mount, Gardener pulse, mount-and-call, etc. It wakes, handles the event, updates state, emits effects, and returns.
At L3, the HOD-bin’s actor shape carries the recursion one step further. A long-lived bin has an init function, a handler, termination, persistent memory, a mailbox, and a supervision tree. The supervision tree is inside the subtree to which the actor is attached. There is no contradiction in a tree node owning a smaller tree of supervision, because the same rule applies: local state is local, effects are bounded, and trace records identify the attachment address.
This deepens the self-similar tree pattern: a node in the substrate tree can host a little supervised actor tree of its own, but still under the same local boundary. It is not a customer agent and not a new planner. It is substrate-resident logic with longer-lived actor-style semantics.
The failure cases show why the shape is architectural. If a bin misses its deadline, it is quarantined or detached at its attachment point. The substrate does not have to reinterpret the global system. It suppresses or removes the local actor, records the event, and the larger tree continues. A misbehaving subtree is not allowed to become a misbehaving tenant.
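The local character of the failure can be shown in a few lines. The sketch assumes a simulated clock: each handler returns the number of ticks it consumed, standing in for whatever deadline accounting the substrate really uses. Bin, Host, and dispatch are hypothetical names.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class Bin:
    address: str                   # attachment address (assumed dotted format)
    handler: Callable[[str], int]  # returns simulated ticks consumed by the event
    deadline: int                  # hypothetical per-event tick limit
    quarantined: bool = False


@dataclass
class Host:
    bins: list
    trace: list = field(default_factory=list)

    def dispatch(self, event: str) -> None:
        """Deliver an event to every live bin; quarantine any bin that overruns."""
        for b in self.bins:
            if b.quarantined:
                continue  # a quarantined bin no longer receives events
            ticks = b.handler(event)
            outcome = "ok"
            if ticks > b.deadline:
                b.quarantined = True
                outcome = "quarantined"
            # The trace record names the attachment address, not the tenant.
            self.trace.append((b.address, event, outcome))


fast = Bin("1.4", handler=lambda event: 1, deadline=5)
slow = Bin("1.7", handler=lambda event: 9, deadline=5)
host = Host([fast, slow])
host.dispatch("write")
host.dispatch("write")
print(host.trace)
```

After the first event the slow bin is suppressed at its own address, and the second event still reaches the healthy bin. Nothing outside the failing attachment point had to be reinterpreted.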
3. Erregung is the memory instance
Functionally, Erregung is a retrieval method. Given a query, it builds a small temporary graph around likely memories, then spreads activation through episodes, entities, temporal links, lineage links, and centroid abstractions to find memories that direct lexical or vector retrieval may not surface. It is a read-side traversal mechanism, not a new memory store. HOD is the operational case: logic is placed at a subtree and acts inside it. Erregung is the retrieval case: a query constructs a temporary subtree-like view of memory and searches inside it.
The abstraction nodes matter. WaxCenter centroids are not merely an index optimization. They are nodes that stand for neighborhoods of episodes. They let retrieval move from a particular memory to a coarser region, cross at the coarser level, and then descend to other particular memories. A centroid is a local root for a memory neighborhood. It is not the same kind of thing as an episode, but it is used in the same traversal: it receives activation, passes activation, and carries an edge back to the concrete records it summarizes.
That is the self-similar move. The local memory neighborhood is treated as a tree-like unit inside a larger memory space. Episodes sit below entities and abstractions; abstractions stand for regions of episodes; phase-two WaxScene nodes would raise the same form again. The retrieval operation does not need a separate procedure for every scale. It needs nodes, weighted edges, a bounded radius, lateral inhibition, and a rule for descending back to the episodes that can answer the user.
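The ingredients listed above fit in a short sketch. It is not the Erregung algorithm: the node naming scheme, the decay factor, and the use of a top-k cut as a stand-in for lateral inhibition are all assumptions chosen to keep the example small.

```python
def spread(edges, anchors, radius, decay=0.5, keep=3):
    """Spread activation from anchor nodes for at most `radius` hops."""
    activation = dict(anchors)
    frontier = dict(anchors)
    for _ in range(radius):
        incoming = {}
        for node, value in frontier.items():
            for neighbor, weight in edges.get(node, []):
                incoming[neighbor] = incoming.get(neighbor, 0.0) + value * weight * decay
        # Simplified lateral inhibition: only the strongest `keep` nodes pass on.
        ranked = sorted(incoming.items(), key=lambda item: (-item[1], item[0]))
        frontier = dict(ranked[:keep])
        for node, value in frontier.items():
            activation[node] = activation.get(node, 0.0) + value
    return activation


def episodes_only(activation):
    """Descend rule (assumed): only episode nodes can answer the user."""
    return {node: score for node, score in activation.items() if node.startswith("ep:")}


# A tiny query-time graph with a cycle: episode, entity, centroid, episodes.
edges = {
    "ep:1": [("ent:alice", 1.0), ("center:7", 0.8)],
    "ent:alice": [("ep:1", 1.0), ("ep:2", 1.0)],
    "center:7": [("ep:1", 0.8), ("ep:3", 0.8)],
}
result = spread(edges, {"ep:1": 1.0}, radius=2)
print(sorted(episodes_only(result)))
```

In this toy graph ep:3 is reachable only by moving up to center:7 and descending again, and it appears only when the radius allows two hops. The cycle back to ep:1 does no harm, because the radius bounds the traversal regardless of graph shape.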
Erregung is not literally a tree in the way the Dewey address space is a tree. It is a graph, and the distinction matters. Temporal edges, SPO edges, lineage edges, entity edges, and abstraction edges can form cycles. The self-similar part is not that the whole memory graph is a tree. It is that the retrieval surface uses tree-like abstraction boundaries to make a graph searchable. A centroid gives a neighborhood a handle. A handle lets a query traverse the neighborhood as a unit before it pays to inspect the members.
This is why the graph is constructed per query and bounded by WaxCenter. A permanent global activation graph would make the whole memory space the unit of computation. The v1 design refuses that. It forms a local tree-like view, runs activation inside it, and measures the gain. Scope is again the load-bearing term. The query gets a bounded world, not the world.
4. What self-similarity buys
The first benefit is capability. In HOD, capability is a descriptor over a path or an intensional set, validated at every host call. In Erregung, the analogous boundary is the candidate neighborhood, the WaxCenter-bounded graph whose nodes a query may consider. These are not the same mechanism, and it would be a mistake to collapse them. But they have the same form: the system names a local world, admits operations inside it, and refuses to treat the larger world as ambient context.
The second benefit is replay. A local operation can be replayed because its inputs can be stated. For HOD, the inputs are the event sequence, the bin hash, the state snapshot, the capability state, the attachment address, and the event payload. For Erregung, the inputs are the query, the anchor scores, the candidate graph construction rule, the edge weights, and the propagation parameters. Both are replayable only because their worlds are bounded. Without the bound, replay becomes a reconstruction of everything that might have affected the operation, and that is not an engineering contract.
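One way to see why a bounded input set makes replay a contract is to write the inputs down and hash them. The sketch below is an assumption-laden illustration: the record fields mirror the HOD inputs named in the text, but the field names, the placeholder values, the JSON canonicalization, and the toy handler are invented here.

```python
import hashlib
import json


def replay_key(inputs: dict) -> str:
    """Hash a canonical JSON form of the stated inputs."""
    canonical = json.dumps(inputs, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def handle(state: dict, payload: dict) -> dict:
    """Hypothetical deterministic handler: counts writes per key."""
    new_state = dict(state)
    new_state[payload["key"]] = new_state.get(payload["key"], 0) + 1
    return new_state


# Every value below is a placeholder, not real substrate data.
record = {
    "attachment": "1.4",
    "bin_hash": "placeholder-bin-hash",
    "capabilities": ["read:1.4", "etch:1.4"],
    "event_seq": 17,
    "state": {"a": 2},
    "payload": {"key": "a"},
}

key = replay_key(record)
first = handle(record["state"], record["payload"])
second = handle(record["state"], record["payload"])
print(key[:12], first == second)
```

The key is only meaningful because the record is complete. If the handler could read outside its attachment, the record would omit an input and two runs with the same key could legitimately differ.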
The third benefit is cost. A recursive structure is dangerous if every descent silently expands the cost surface. HOD prevents this by attaching quotas, deadlines, and circuit breakers to the local actor. Erregung prevents it by building only the query-time neighborhood and by limiting propagation radius. The same design rule appears in different clothes: recursion is admitted only when each instance carries its own budget.
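The rule that each instance carries its own budget can be sketched as a traversal in which every descent runs on an allowance carved out of its parent's. Budget, carve, visit, the unit costs, and the halving of the share at each level are all assumptions for illustration, not the HOD quota mechanism.

```python
from dataclasses import dataclass


class BudgetExceeded(Exception):
    """Raised when an instance tries to spend more than it was given."""


@dataclass
class Budget:
    remaining: int

    def spend(self, units: int = 1) -> None:
        if units > self.remaining:
            raise BudgetExceeded(f"need {units}, have {self.remaining}")
        self.remaining -= units

    def carve(self, units: int) -> "Budget":
        """Pay for a child allowance up front and hand it over as its own budget."""
        self.spend(units)
        return Budget(units)


def visit(tree: dict, budget: Budget, share: int) -> int:
    """Visit a nested dict; each child subtree runs on its own carved budget."""
    budget.spend(1)  # the cost of visiting this node
    visited = 1
    for child in tree.values():
        child_budget = budget.carve(share)
        visited += visit(child, child_budget, max(share // 2, 1))
    return visited


tree = {"a": {"a1": {}, "a2": {}}, "b": {}}
root_budget = Budget(10)
print(visit(tree, root_budget, share=3), root_budget.remaining)
```

Because a child's allowance is charged to the parent before the child runs, no descent can widen the cost surface silently. This sketch does not return unspent allowance to the parent; a real scheme might.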
The fourth benefit is explanation. An operator can ask why a bin fired by reading the attachment, the triggering event, the bin state, and the resulting trace. A user can ask why a memory surfaced by reading the activation path through episode, entity, and centroid nodes. In both cases the explanation is not a post hoc story over an opaque global process. It is the path through the local tree or tree-like view that the operation actually used.
These benefits compound. A subtree that owns its capability boundary also owns its replay boundary. A subtree that owns its replay boundary can be tested under deterministic inputs. A subtree that owns its cost boundary can be scheduled and killed without interpreting the entire tenant. A subtree whose effects are traced locally can be explained locally. Self-similarity is the condition that lets these properties recur at many scales without new machinery at each scale.
5. The limit
Self-similarity does not mean that every structure should be forced into a tree. Some relations are graphs, and pretending otherwise loses information. Erregung is the warning case. Its memory surface contains cycles, many edge types, and non-tree traversal. The correct move is not to deny the graph. It is to introduce bounded abstraction nodes that make parts of the graph usable as local worlds while leaving the graph semantics intact.
Nor does self-similarity remove policy. A subtree boundary is not a permission grant. HOD still needs the capability layer to decide whether a caller has a capability of the relevant shape. Erregung still needs lineage trust and edge weights because not every edge deserves equal authority. The tree gives policy a place to attach; it does not decide policy by itself.
Nor is recursion an excuse to put computation everywhere. HOD explicitly does not make substrate logic a third agent execution locus. A bin may act inside its attached subtree, but it is not a customer agent and it does not own the plan. Erregung explicitly does not materialize a permanent activation graph in v1. It derives the local graph per query because the mechanism has to earn the right to become standing state. The recursive form is powerful enough that it has to be introduced under measurement, not enthusiasm.
6. The design rule
The design rule is simple. When a new subsystem wants hierarchy, ask whether a subtree can be treated as a local system with the same operations as the whole. If the answer is no, keep the hierarchy as a storage convenience and do not pretend it is architectural. If the answer is yes, require the five things that make the recursion safe: stable address, operation closure under subtree restriction, local state and trace, local cost, and substrate-enforced boundary.
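The rule reads naturally as a checklist, and a checklist is easy to encode. The sketch below is only a review aid: SubtreeContract, missing, and classify are hypothetical names, and the boolean fields are a deliberate flattening of conditions that need real evidence in practice.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class SubtreeContract:
    """The five conditions, recorded as yes/no answers for one subsystem."""
    stable_address: bool
    closed_under_restriction: bool
    local_state_and_trace: bool
    local_cost: bool
    substrate_enforced_boundary: bool


def missing(contract: SubtreeContract) -> list:
    """Names of the conditions the subsystem does not yet satisfy."""
    return [f.name for f in fields(contract) if not getattr(contract, f.name)]


def classify(contract: SubtreeContract) -> str:
    """Self-similar only when every condition holds."""
    return "self-similar" if not missing(contract) else "storage convenience"


candidate = SubtreeContract(
    stable_address=True,
    closed_under_restriction=True,
    local_state_and_trace=True,
    local_cost=False,
    substrate_enforced_boundary=True,
)
print(classify(candidate), missing(candidate))
```

A hierarchy that fails any one condition is still allowed to exist. It is simply classified as storage, and nothing architectural is promised on its behalf.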
This rule separates useful recursion from decorative recursion. A named cluster, a mounted subtree, a plan fragment, a compensation branch, a standing query, and an activation neighborhood may all be recursive structures, but they become self-similar only when the same vocabulary works at the smaller scale and the same guarantees can be stated there. The right test is not whether the diagram looks fractal. The right test is whether a smaller instance can be executed, replayed, budgeted, and explained without appealing to a special case outside itself.
Our two examples are therefore not merely two new mechanisms. They are two measurements of a recurring substrate fact. Logic wants to attach to a subtree. Memory wants to retrieve through a subtree-like abstraction. In both cases the system improves when it treats the local world as real and bounded, rather than as a view over an undifferentiated whole.
7. Antecedents and related notes
The direct operating-system antecedent is Pike, Presotto, Thompson, Trickey, and Winterbottom, “The Use of Name Spaces in Plan 9”. Plan 9’s two relevant commitments are per-process namespaces and a file-oriented interface over heterogeneous resources. The mount and bind operations in that paper are the plain ancestors of the HOD namespace surface.
The direct data antecedents are the W3C XML family: the XML Information Set, the Document Object Model Level 1, and XPath. Their shared lesson is that a document is a tree of addressable nodes and that operations can be stated over a selected subtree without inventing a second data model.
The software-pattern relatives are the Composite pattern of Gamma, Helm, Johnson, and Vlissides, where leaves and composites share an interface, and the Fractal component model, where components can be nested recursively inside composite components. These are relatives rather than the main ancestry here, because HOD inherits more from namespace construction and XML subtree practice than from object-oriented presentation.
The self-similar tree extends the substrate recursion of PLRN-005. That note described the convergence of contracts, triggers, verification, and teaching onto the existing evaluator. The present note names a different regularity: not the convergence of languages, but the recurrence of bounded local worlds. The two are compatible. A local world is useful because the same evaluator, trace, and audit language can run inside it.
The standing lease of PLRN-010 is another instance. A lease is a query held open, and the lease registry is itself queryable, so a standing query can watch standing queries. The saga tree of PLRN-011 is another. A compensation branch is held, unwound, and retired as a local structure inside the larger plan. These were not framed as self-similar trees when they were written, but the same test applies to them.
The point is not to multiply names. It is to keep the system honest about where recursion is doing work. A recursive structure earns its place when it reduces a global problem to a bounded local one without changing the vocabulary of the system. That is what HOD does for substrate logic. That is what Erregung does for memory traversal. It is likely to be what the next version of several other subsystems does once their local worlds become explicit.
A note on method
Written in conversation with GPT-5 (OpenAI) as structured interlocutor and prose editor, using HOD and the spreading-activation retrieval layer as the two motivating design records. The ideas, claims, framing, and architectural commitments are mine.
Kendall Clark · k@pentad.ai
Great Falls, Virginia
June 2026