The UnShell protocol is a tree-addressed packet protocol for remote procedure calls and bidirectional hook-backed data exchange across a hierarchy of connected endpoints.
The protocol is intended to be small, extensible, and canonical: the core stays narrow enough for constrained implementations; new behavior is introduced through leaves, procedures, and payload schemas rather than through frequent protocol redesign; and each core protocol behavior has one clearly defined expression.
This document combines exact protocol definition with rationale. Rationale blocks explain why a rule exists, but do not define interoperability requirements.
The key words `MUST`, `MUST NOT`, `REQUIRED`, `SHALL`, `SHALL NOT`, `SHOULD`, `SHOULD NOT`, `RECOMMENDED`, `MAY`, and `OPTIONAL` in this document are to be interpreted as described in [RFC 2119](https://www.rfc-editor.org/rfc/rfc2119.txt) when, and only when, they appear in all capitals.
Unless a section is explicitly marked otherwise, sections labeled `Normative` define protocol requirements and sections labeled `Non-Normative` provide description, rationale, deployment guidance, or open design commentary.
The purpose of this specification is to define the set of protocol components required to assemble complete UnShell protocol packets and to provide a framework for extending the protocol through leaves and procedure contracts.
The UnShell protocol assumes that a connection already exists and that any required authentication, authorization, and routing admission decisions have already been handled by the surrounding system.
Every implementation is expected to maintain its own live connection set and its own ground truth about which peers are connected, admitted, and routable.
> **Rationale:** Authentication and handshakes were intentionally removed from the core scope. They are too deployment-specific to define canonically without bloating the protocol.
> **Rationale:** Packet serialization is in scope because independently authored endpoints need one canonical byte representation in order to interoperate. Transport selection remains out of scope because the same framed packet bytes can be carried over different transports.
If the caller wants output, it declares a hook inside the call. The recipient returns one or more `Data` packets toward the hook host. Once a hook exists, either side MAY continue exchanging `Data` packets associated with that hook. A side signals it is done by setting `end_hook = true` on its final `Data` packet; the hook closes when both sides have done so. If normal execution cannot proceed, the endpoint MAY instead send a `Fault` packet upstream for that hook, which closes it immediately.
`dst_leaf` names a specific leaf hosted by the destination endpoint. Leaf names SHOULD follow the same dotted convention as `procedure_id`: `org.product.vN.part.name`. The reserved empty string `""` MUST NOT be used as a leaf name.
`procedure_id` is the canonical identifier for a procedure contract. A procedure contract includes the source library or namespace, the specific procedure identity, and the expected input and output schema pair.
`procedure_id` SHOULD follow the dotted convention `org.product.vN.part.name`, except for the reserved empty string `""` used by the required introspection procedure defined in Section 12.1, where:
Each segment SHOULD be non-empty. Implementations SHOULD restrict segments to lowercase ASCII letters, digits, and underscores for portability. The version segment SHOULD appear in the third position.
> **Rationale:** `procedure_id` is intentionally stricter than a method name or content type. It identifies a full callable contract, not just a label. The dotted convention is a strong recommendation rather than a wire-format requirement because the protocol itself does not parse or validate `procedure_id` structure — it is treated as an opaque string for routing and matching purposes. Version segment format is deliberately left to the owning product to avoid constraining existing versioning schemes.
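
The naming recommendations above can be checked with an advisory lint. The following is a minimal sketch, not part of the protocol: `lint_procedure_id` is a hypothetical helper, and it only warns, because the protocol itself treats `procedure_id` as an opaque string. It does not inspect the format of the version segment, which is left to the owning product.

```rust
/// Advisory lint for the recommended `org.product.vN.part.name` shape.
/// Hypothetical helper: returns human-readable warnings, and an empty list
/// means the identifier follows the recommendations.
pub fn lint_procedure_id(id: &str) -> Vec<String> {
    let mut warnings = Vec::new();

    // The reserved empty string names the introspection procedure and is exempt.
    if id.is_empty() {
        return warnings;
    }

    let segments: Vec<&str> = id.split('.').collect();
    for (i, seg) in segments.iter().enumerate() {
        if seg.is_empty() {
            warnings.push(format!("segment {} is empty", i));
        } else if !seg
            .chars()
            .all(|c| c.is_ascii_lowercase() || c.is_ascii_digit() || c == '_')
        {
            warnings.push(format!("segment {} ({:?}) is not portable", i, seg));
        }
    }

    // The version segment is expected in the third position; only its
    // presence is checked here, not its format.
    if segments.len() < 3 {
        warnings.push("no segment in the third (version) position".to_string());
    }

    warnings
}
```

For example, `lint_procedure_id("acme.shell.v1.fs.read")` yields no warnings, while an identifier with an uppercase letter or an empty segment yields one warning per offending segment. The identifier `acme.shell.v1.fs.read` is invented for illustration.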
Each endpoint enforces authority only at the connections it directly maintains.
At a local routing boundary:
- a `Call` packet MUST be accepted only if it arrives from the direct parent connection permitted to issue downward calls into the destination subtree represented by that boundary
- a `Call` packet that violates that rule MUST be dropped silently
- a `Data` packet MUST be accepted only if it belongs to a valid hook flow, routes correctly by path, and its `src_path` matches the expected peer recorded in local hook state; otherwise it MUST be discarded
- a `Fault` packet MAY arrive only from the subordinate side of a hook-attributable call flow, and its `src_path` MUST match the expected subordinate peer recorded in local hook state or pending call context
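
The `Data` rule in the list above can be sketched as a single boundary check. This is an illustration under assumptions: `HookState`, `Verdict`, and `check_data` are hypothetical names, a path is modelled as a sequence of segment strings, the hook-state lookup is assumed to have happened already, and `routes_by_path` stands in for the path-routing check, which is computed elsewhere.

```rust
/// A path is modelled here as a sequence of segment strings (assumption).
pub type Path = Vec<String>;

/// Hypothetical local record for one active hook.
pub struct HookState {
    /// The peer this endpoint expects `Data` from on this hook.
    pub expected_peer: Path,
}

#[derive(Debug, PartialEq)]
pub enum Verdict {
    Accept,
    Discard,
}

/// Accepts a `Data` packet only if all three conditions hold:
/// the hook is known locally (`state` is `Some`), the packet routes
/// correctly by path, and `src_path` matches the recorded peer.
pub fn check_data(
    state: Option<&HookState>,
    src_path: &[String],
    routes_by_path: bool,
) -> Verdict {
    match state {
        Some(s) if routes_by_path && s.expected_peer.as_slice() == src_path => {
            Verdict::Accept
        }
        // Unknown hook, routing miss, or unexpected peer: discard.
        _ => Verdict::Discard,
    }
}
```

Folding every failure case into one `Discard` arm mirrors the rule's "otherwise it MUST be discarded" wording, so the caller cannot accidentally distinguish between reasons on the wire.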
> **Rationale:** `rkyv` does not define one single universal format independent of configuration. Its archived representation depends on format-control settings such as endianness, alignment, and pointer width. This specification therefore fixes those settings so "use rkyv" means one exact interoperable byte format rather than a family of related formats.
> **Rationale:** `Fault` is separated from `Data` so ordinary application output does not need to share semantics with protocol failure signaling. A receiver can distinguish successful hook traffic from protocol failure immediately from `packet_type`, without inspecting `procedure_id` or the payload contract.
- the immediate receiver MUST validate that `src_path` matches the registered path of the peer on the connection from which the packet arrived; a packet whose `src_path` does not match MUST be discarded
An endpoint's local subtree consists of the endpoint's own path and every descendant path whose segment sequence begins with the endpoint's path as a prefix.
A path `A` lies within the subtree of path `B` if and only if `B` is a prefix of `A`.
The root endpoint's path is the empty path, and its subtree contains all paths.
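
The subtree definition reduces to a segment-wise prefix test. A minimal sketch, assuming paths are represented as slices of segment strings; `lies_within` is a hypothetical name:

```rust
/// Returns true if path `a` lies within the subtree of path `b`,
/// that is, if `b` is a segment-wise prefix of `a`.
pub fn lies_within(a: &[&str], b: &[&str]) -> bool {
    a.starts_with(b)
}
```

Because the comparison is per segment rather than per character, a path with the single segment `ab` is not inside the subtree of a path with the single segment `a`. The empty path is a prefix of every path, which matches the statement that the root's subtree contains all paths.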
> **Rationale:** Longest-prefix routing is defined as a path-selection rule, not as a way to resolve duplicate ownership. The tree model assumes each endpoint path names exactly one place in the topology. If two child routes claim the same path, the local routing table is already invalid.
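
As a rough illustration of longest-prefix selection, the sketch below picks the registered child route whose path is the longest prefix of the destination. The function name and the table representation are assumptions, and the sketch presumes a valid table in which no two routes claim the same path; it does not cover what happens when no child route matches.

```rust
/// Returns the index of the route whose path is the longest segment-wise
/// prefix of `dst`, or `None` if no registered route is a prefix.
/// Assumes the table is valid: no two routes share the same path.
pub fn select_route(routes: &[Vec<&str>], dst: &[&str]) -> Option<usize> {
    routes
        .iter()
        .enumerate()
        // Keep only routes that are a prefix of the destination path.
        .filter(|(_, route)| dst.starts_with(route.as_slice()))
        // Among those, prefer the one with the most segments.
        .max_by_key(|(_, route)| route.len())
        .map(|(index, _)| index)
}
```

With routes `["a"]`, `["a", "b"]`, and `["c"]`, a destination of `["a", "b", "x"]` selects the second route, and `["a", "z"]` selects the first.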
When forwarding or receiving a `Call`, an endpoint MUST apply the local-authority rules defined in Section 7.1 at the boundary where the packet arrives.
An implementation MAY maintain an internal fastpath keyed by locally validated hook state for performance, provided it remains behaviorally equivalent to path-based routing. `hook_id` is scoped to the calling endpoint and is not globally routable, so path remains the canonical routing key.
If the destination endpoint exists but the `Call` cannot be executed — because `dst_leaf` names no local leaf, or because `procedure_id` is unknown or unsupported — the endpoint MUST send a `Fault` upstream using the declared `response_hook` if one is present. If no `response_hook` is present, the endpoint MUST discard the `Call` silently.
> **Rationale:** Fault reporting for an invalid call would be self-defeating if the callee first had to prove that the application procedure was valid before it could use the declared hook. The hook exists to carry either normal returned data or a protocol fault explaining why normal execution could not proceed.
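
The decision at the destination endpoint can be summarised in a few lines. This is a sketch under assumptions: `CallOutcome` and `admit_call` are hypothetical names, `response_hook` is reduced to an optional numeric hook identifier, and the two boolean inputs stand in for the local leaf and procedure lookups.

```rust
#[derive(Debug, PartialEq)]
pub enum CallOutcome {
    /// The call can proceed to normal execution.
    Execute,
    /// The call cannot run, and a hook exists to report that upstream.
    FaultUpstream { hook_id: u64 },
    /// The call cannot run and there is nowhere to report it.
    DiscardSilently,
}

/// Decides what to do with a `Call` that has reached its destination endpoint.
pub fn admit_call(
    leaf_exists: bool,
    procedure_supported: bool,
    response_hook: Option<u64>, // hook id width is an assumption
) -> CallOutcome {
    if leaf_exists && procedure_supported {
        return CallOutcome::Execute;
    }
    match response_hook {
        Some(hook_id) => CallOutcome::FaultUpstream { hook_id },
        None => CallOutcome::DiscardSilently,
    }
}
```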
Pending call context is local transient state created when an endpoint receives a `Call` that declares `response_hook` and before that call has either been accepted into active hook state, rejected with `Fault`, or discarded. It MUST be keyed by `(return_path, hook_id)` and MUST retain enough information to emit an upstream `Fault` for that call if needed.
- a pending call context MUST NOT be used to forward or process application data; it exists solely to validate and emit an upstream `Fault` for that received `Call`
> **Rationale:** Pending call context exists because some failures are discovered before normal application execution begins. The callee still needs enough validated state to attribute an upstream `Fault` to the declared hook without pretending that the hook was fully active for ordinary bidirectional traffic.
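
One possible shape for this state is a map keyed by `(return_path, hook_id)`. The sketch below is illustrative only: the type names, the numeric width of `hook_id`, and the choice to retain `procedure_id` as the information needed for a later `Fault` are all assumptions.

```rust
use std::collections::HashMap;

/// Key required by the text: `(return_path, hook_id)`.
pub type PendingKey = (Vec<String>, u64);

/// Hypothetical record of what is retained to emit an upstream `Fault`.
#[derive(Debug, PartialEq)]
pub struct PendingCall {
    pub procedure_id: String,
}

#[derive(Default)]
pub struct PendingCalls {
    by_key: HashMap<PendingKey, PendingCall>,
}

impl PendingCalls {
    /// Records a received `Call` that declared a `response_hook`.
    pub fn open(&mut self, return_path: Vec<String>, hook_id: u64, procedure_id: &str) {
        let record = PendingCall { procedure_id: procedure_id.to_string() };
        self.by_key.insert((return_path, hook_id), record);
    }

    /// Removes the context once the call is accepted, faulted, or discarded.
    pub fn resolve(&mut self, return_path: &[String], hook_id: u64) -> Option<PendingCall> {
        self.by_key.remove(&(return_path.to_vec(), hook_id))
    }

    pub fn len(&self) -> usize {
        self.by_key.len()
    }
}
```

Keeping this map separate from active hook state makes it harder to route application data through a context that exists only to attribute a `Fault`.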
> **Rationale:** Ordinary hook traffic is part of the same procedure contract that created the hook, so the returned `procedure_id` stays anchored to the originating `Call`. This keeps hook validation simple and avoids treating a response as a separate contract lookup. Introspection therefore uses `""` on both the `Call` and the `Data` it produces. Protocol faults are separate packets and therefore do not need to overload `Data` semantics.
A hook MAY carry multiple `Data` packets in either direction if the application requires chunking, phased output, or prolonged bidirectional interaction. There is no protocol-level requirement that the callee send the first `Data` packet.
> **Rationale:** The protocol allows symmetric hook traffic after activation but does not introduce a readiness or acknowledgment packet to synchronize the first `Data` frame. Requiring discard of packets that arrive before activation keeps the rule simple and safe: a sender that races ahead of activation will need to retransmit or rely on higher-layer sequencing. Higher-layer protocols that need stricter startup guarantees should define their own first-packet discipline inside the hook.
A sender SHOULD set `end_hook = true` on its final `Data` packet for that hook. A sender MUST NOT send further `Data` packets on a hook after sending a packet with `end_hook = true`. A hook closes when both sides have sent `end_hook = true`, or when either side sends or receives a `Fault`.
> **Rationale:** Making `end_hook = true` a hard final marker rather than a soft hint removes ambiguity about whether the hook is still open. Both sides can close cleanly once they have each signaled completion, without needing a separate close packet or higher-layer shutdown sequence.
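
The closing rule can be tracked with three flags per hook. This is a minimal sketch, assuming a hypothetical `HookEnds` type; it records only what the text above defines and does not decide how a receiver treats `Data` that arrives after the remote side has already ended.

```rust
/// Hypothetical per-hook record of which sides have finished.
#[derive(Debug, Default)]
pub struct HookEnds {
    local_ended: bool,
    remote_ended: bool,
    faulted: bool,
}

impl HookEnds {
    /// Called before sending a `Data` packet on this hook.
    /// Refuses to send once this side has sent `end_hook = true`.
    pub fn on_send_data(&mut self, end_hook: bool) -> Result<(), &'static str> {
        if self.is_closed() {
            return Err("hook is closed");
        }
        if self.local_ended {
            return Err("end_hook = true was already sent");
        }
        self.local_ended = end_hook;
        Ok(())
    }

    /// Called after accepting a `Data` packet from the peer.
    pub fn on_recv_data(&mut self, end_hook: bool) {
        if end_hook {
            self.remote_ended = true;
        }
    }

    /// Called when a `Fault` is sent or received for this hook.
    pub fn on_fault(&mut self) {
        self.faulted = true;
    }

    /// Closed when both sides have ended, or when any fault occurred.
    pub fn is_closed(&self) -> bool {
        self.faulted || (self.local_ended && self.remote_ended)
    }
}
```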
- a receiver of a `Fault` packet MUST validate `src_path` against the expected subordinate hook peer recorded in local hook state or pending call context
Sending a `Fault` packet closes the hook immediately for both sides. After sending or receiving a `Fault` packet, an implementation MUST remove that hook from active state.
If an endpoint receives a fault value it does not recognize, it MUST still treat the packet as a protocol fault and close the hook.
> **Rationale:** Protocol faults are part of interoperability, so they need a fixed canonical payload contract rather than a free-form error blob. A small enum with stable byte discriminants is cheap to encode, easy to evolve, and avoids coupling core protocol behavior to human-readable messages. Receivers can make deterministic decisions from the fault kind alone.
> **Rationale:** The fault set is intentionally small. Silent drop remains the canonical behavior for traffic that cannot be safely attributed to a valid call or hook, such as an unknown `hook_id`, malformed returned traffic, or a routing miss discovered by an intermediate router. `Fault` is reserved for failures that a receiver can attribute to a specific call flow and report upstream deterministically.
> **Rationale:** An unrecognized protocol fault still means the application contract has failed and the hook can no longer continue safely. Requiring unknown fault values to terminate the hook preserves forward compatibility: newer peers may introduce additional fault kinds without causing older peers to accidentally keep a broken hook alive.
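
A decoder that preserves unrecognized discriminants might look like the sketch below. The variant names and the byte values `1` and `2` are placeholders invented for illustration; they are not the fault set or discriminants defined by this specification.

```rust
/// Hypothetical decoded fault kind. The named variants and their byte
/// values are placeholders, not the specification's actual fault set.
#[derive(Debug, PartialEq, Clone, Copy)]
pub enum FaultKind {
    UnknownLeaf,
    UnknownProcedure,
    /// A discriminant this implementation does not know about.
    Unrecognized(u8),
}

/// Maps a raw discriminant byte to a fault kind without ever failing.
pub fn decode_fault(byte: u8) -> FaultKind {
    match byte {
        1 => FaultKind::UnknownLeaf,      // placeholder value
        2 => FaultKind::UnknownProcedure, // placeholder value
        other => FaultKind::Unrecognized(other),
    }
}

/// Every fault closes the hook, including unrecognized ones.
pub fn closes_hook(_kind: FaultKind) -> bool {
    true
}
```

Decoding into an explicit `Unrecognized` variant, rather than returning an error, keeps the "still close the hook" path on the same code path as known faults.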
| `sub_endpoints` | The path segment identifiers of directly registered child endpoints. Each entry is the single path segment that distinguishes the child from the local endpoint. The full path of a child can be inferred by appending its segment to the local endpoint's path. |
> **Rationale:** Returning full `procedure_id` values avoids forcing the caller to reconstruct contract names from leaf-local fragments. Endpoint introspection and leaf introspection deliberately share the same leaf record shape so the endpoint-wide form is just a list of the leaf-specific form. `sub_endpoints` returns only immediate child identifiers rather than a full subtree description because the tree topology is not assumed to be globally known; callers that need deeper discovery can issue further introspection calls toward each discovered child.
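
Reconstructing child paths from `sub_endpoints` is a simple append. A minimal sketch, assuming paths are sequences of segment strings; `child_paths` is a hypothetical helper and the segment names in the example are invented:

```rust
/// Builds the full path of each directly registered child by appending
/// its segment to the local endpoint's path.
pub fn child_paths<'a>(local: &[&'a str], sub_endpoints: &[&'a str]) -> Vec<Vec<&'a str>> {
    sub_endpoints
        .iter()
        .map(|segment| {
            let mut path = local.to_vec();
            path.push(*segment);
            path
        })
        .collect()
}
```

For a local path of `["site", "rack1"]` and `sub_endpoints` of `["node_a", "node_b"]`, this yields `["site", "rack1", "node_a"]` and `["site", "rack1", "node_b"]`. A caller that needs deeper discovery would then issue introspection calls toward each of those paths.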
The UnShell protocol keeps its core narrow: path addressing, downward `Call`, hook-backed `Data`, and upstream `Fault`. `procedure_id` is the main semantic anchor, so callers and callees are expected to share knowledge of each procedure contract without relying on a protocol-level registry.
This document uses Rust-like `rkyv` struct notation to describe fields because it matches the current implementation language. The notation itself is explanatory; the on-wire byte format is normatively fixed in Section 8.