**Minimal complexity.** The protocol's minimal form must fit inside shellcode or a small embedded implant. Features that can be implemented as a leaf or an application-layer convention must not be part of the protocol. The protocol exists only to move packets between tree endpoints and to enforce authority relationships at the connection level.
**Extensibility.** The protocol defines a substrate for arbitrary application-layer capabilities. Content types, leaf procedures, and packet payloads are opaque to the router. New capabilities are added by defining new leaves and content types, not by modifying the protocol itself.
When these two principles appear to conflict, prefer the minimal option and delegate complexity to the leaf or application layer.
| Term | Definition |
|---|---|
| **Endpoint** | Any node connected to the tree, identified by its registered path. |
| **Leaf** | A hosted service or data object on an endpoint (e.g. a shell session, a file system). Addressed by endpoint path plus leaf name. |
| **Path** | An ordered sequence of segments uniquely identifying an endpoint. Written as `/seg1/seg2` for readability; transmitted as `Vec<String>`. |
| **Actual Authority** | The endpoint that directly admitted another into the tree via the handshake. Has protocol-enforced control over that specific connection only. |
Endpoints are arranged in a tree. Each endpoint owns a path. A parent endpoint is the actual authority over the children it has directly admitted. Communication is directional: authorities send `Request` packets downward to their clients; clients send data upward exclusively through hooks.
The `{ leaf: name }` notation is a documentation convention. On the wire, the endpoint path is carried in `dst_path` and the leaf name is carried in the separate `dst_leaf` field of the packet header. Leaf names are not path segments and are invisible to the router.
Each connection has exactly one authority and one client. The authority is the endpoint that accepted the connection and ran the handshake. Actual authority grants:
Actual authority is **per-connection and one hop only**. The root has actual authority over `/abc123` because it directly admitted it. The root does not have actual authority over `/abc123/pivot` — that connection is managed by `/abc123` independently. Routers must reject `Request` packets whose sender is not the direct parent of the destination.
Endpoints closer to the root have implied precedence over deeper endpoints they did not directly admit. This is an operational expectation and is not enforced by the protocol. The operator at `/` trusts that `/abc123` will not admit hostile sub-endpoints. Only network architecture and pre-shared secrets can enforce this on the protocol's behalf.
Two endpoints may each be registered in the other's subtree, creating mutual actual authority. This is useful in multi-datacenter topologies where either site should be able to issue commands to the other's endpoints. A compromised node in a cycle has upward reach into the other side; cycles should be created deliberately and documented explicitly in deployment architecture.
Paths are transmitted as `Vec<String>`, with one element per segment. This document writes them as `/seg1/seg2` for readability. The router operates on the segment array directly; no string joining or splitting occurs.
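As an illustration of operating on the segment array directly, the following is a minimal sketch of longest-prefix resolution over `Vec<String>` keys. The `RouteTable` type, the `ConnId` alias, the `path` helper, and the choice of representing the root as an empty segment array are all assumptions made for this example, not definitions from the protocol.

```rust
use std::collections::HashMap;

/// Hypothetical connection identifier; the spec does not define its type.
type ConnId = u32;

/// Helper for building a segment array from string literals (illustrative only).
fn path(segs: &[&str]) -> Vec<String> {
    segs.iter().map(|s| s.to_string()).collect()
}

/// Routing table keyed by segment arrays (an assumed layout).
struct RouteTable {
    routes: HashMap<Vec<String>, ConnId>,
}

impl RouteTable {
    fn new() -> Self {
        RouteTable { routes: HashMap::new() }
    }

    fn register(&mut self, registered_path: Vec<String>, conn: ConnId) {
        self.routes.insert(registered_path, conn);
    }

    /// Longest-prefix match: try the full segment array first, then drop
    /// trailing segments one at a time. No strings are joined or split.
    fn resolve(&self, dst_path: &[String]) -> Option<ConnId> {
        for len in (0..=dst_path.len()).rev() {
            if let Some(conn) = self.routes.get(&dst_path[..len]) {
                return Some(*conn);
            }
        }
        None
    }
}
```

With `/abc123` and `/abc123/pivot` registered, a lookup for `/abc123/pivot/deep` resolves to the connection for `/abc123/pivot`, and a lookup for an unrelated path returns `None`.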
All other path segments are application-defined. Leaf names, hook IDs, and stream IDs are carried in dedicated header fields — not encoded into path segments.
Both length prefixes are big-endian `u32`. The `packet_type` field in the header fully determines the structure of the payload. The router never inspects the payload — it reads only the header to make all routing decisions.
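A length-prefixed section can be read and written as sketched below. This assumes each prefix is immediately followed by that many bytes, and that a frame is one prefixed header followed by one prefixed payload; the function names and the `MAX_SECTION_LEN` guard are hypothetical and not part of the protocol.

```rust
use std::io::{self, Read, Write};

/// Hypothetical upper bound; the spec does not define a maximum section size.
const MAX_SECTION_LEN: usize = 16 * 1024 * 1024;

/// Write one section: big-endian u32 length, then the bytes themselves.
fn write_section<W: Write>(w: &mut W, bytes: &[u8]) -> io::Result<()> {
    if bytes.len() > MAX_SECTION_LEN {
        return Err(io::Error::new(io::ErrorKind::InvalidInput, "section too large"));
    }
    let len = bytes.len() as u32; // safe: bounded by MAX_SECTION_LEN above
    w.write_all(&len.to_be_bytes())?;
    w.write_all(bytes)
}

/// Read one section, rejecting oversized prefixes before allocating.
fn read_section<R: Read>(r: &mut R) -> io::Result<Vec<u8>> {
    let mut len_buf = [0u8; 4];
    r.read_exact(&mut len_buf)?;
    let len = u32::from_be_bytes(len_buf) as usize;
    if len > MAX_SECTION_LEN {
        return Err(io::Error::new(io::ErrorKind::InvalidData, "length prefix exceeds limit"));
    }
    let mut buf = vec![0u8; len];
    r.read_exact(&mut buf)?;
    Ok(buf)
}
```

Bounding the length before allocating matters for small implants: an unchecked `u32` prefix lets a peer request a 4 GiB buffer with four bytes.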
Packet types are `u16` discriminants produced by rkyv serialisation of the `PacketType` enum. Parsers in any language should treat them as `u16` values matching the discriminants defined below.
The handshake is **authority-initiated**. The connecting node does not speak until challenged. If a connecting node sends any packet before receiving `AuthChallenge`, the authority closes the connection immediately without sending a response.
The authority issues a 32-byte random nonce. The client responds with `HMAC-SHA256(pre_shared_secret, nonce)`. The pre-shared secret is provisioned out-of-band. A failed HMAC closes the connection immediately, before any path data is exchanged.
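The authority-side check can be sketched as follows. The Rust standard library has no HMAC implementation, so the MAC is passed in as a function; a real deployment would supply HMAC-SHA256 from a cryptography library. `AuthOutcome`, `check_auth_response`, and `tags_equal` are names invented for this example, and the comparison loop is written to avoid an early exit on the first mismatching byte rather than being a guaranteed constant-time primitive.

```rust
/// Compare two 32-byte tags without returning early on the first mismatch.
fn tags_equal(a: &[u8; 32], b: &[u8; 32]) -> bool {
    let mut diff = 0u8;
    for i in 0..32 {
        diff |= a[i] ^ b[i];
    }
    diff == 0
}

/// What the authority does after receiving the client's response (assumed names).
#[derive(Debug, PartialEq)]
enum AuthOutcome {
    Proceed,
    CloseConnection,
}

/// `mac` stands in for HMAC-SHA256(secret, nonce); it is a parameter because
/// this sketch uses only the standard library.
fn check_auth_response<M>(
    mac: M,
    pre_shared_secret: &[u8],
    nonce: &[u8; 32],
    response: &[u8; 32],
) -> AuthOutcome
where
    M: Fn(&[u8], &[u8]) -> [u8; 32],
{
    let expected = mac(pre_shared_secret, &nonce[..]);
    if tags_equal(&expected, response) {
        AuthOutcome::Proceed
    } else {
        // A failed check closes the connection before any path data is exchanged.
        AuthOutcome::CloseConnection
    }
}
```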
**Registration is all-or-nothing.** If any path in `registered_paths` fails validation, the entire handshake is rejected with the reason for the first failed path. Partial registration is not supported.
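The all-or-nothing rule amounts to validating every path before committing any of them. A minimal sketch follows; the validation rule shown (rejecting empty segments) is a placeholder, since the actual rules are defined elsewhere in the specification, and `register_all` and `validate_path` are hypothetical names.

```rust
/// Helper for building a segment array from string literals (illustrative only).
fn path(segs: &[&str]) -> Vec<String> {
    segs.iter().map(|s| s.to_string()).collect()
}

/// Placeholder validator: rejects any path containing an empty segment.
fn validate_path(p: &[String]) -> Result<(), String> {
    if p.iter().any(|seg| seg.is_empty()) {
        return Err("empty segment".to_string());
    }
    Ok(())
}

/// Accept the whole set or nothing. On failure, report the first failing
/// path together with its reason; no path is registered in that case.
fn register_all(
    registered_paths: &[Vec<String>],
) -> Result<Vec<Vec<String>>, (Vec<String>, String)> {
    for p in registered_paths {
        validate_path(p).map_err(|reason| (p.clone(), reason))?;
    }
    Ok(registered_paths.to_vec())
}
```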
The authority sends a `Request` to an endpoint it has actual authority over. The endpoint replies with a `Response` carrying the same `request_id` in the header.
**Direction enforcement.** A lower-authority endpoint may never send a `Request` to a higher-authority endpoint. All upward data flow goes through hooks. The router rejects `Request` packets whose sender is not the direct parent of the destination, returning `AuthorityViolation` to the sender.
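One way a router could perform this check is to record an (authority, child) edge at admission time and consult it before forwarding. The sketch below is an assumption about mechanism, not a specified design: `AdmissionTable`, `Decision`, and the idea that the router knows the sender's path are all hypothetical. Storing edges as pairs also accommodates mutual authority, where an endpoint has more than one admitting authority.

```rust
use std::collections::HashSet;

type Path = Vec<String>;

/// Helper for building a segment array from string literals (illustrative only).
fn path(segs: &[&str]) -> Path {
    segs.iter().map(|s| s.to_string()).collect()
}

#[derive(Debug, PartialEq)]
enum Decision {
    Forward,
    AuthorityViolation,
}

/// Set of (authority, child) edges recorded when a handshake succeeds.
struct AdmissionTable {
    edges: HashSet<(Path, Path)>,
}

impl AdmissionTable {
    fn new() -> Self {
        AdmissionTable { edges: HashSet::new() }
    }

    fn admit(&mut self, authority: Path, child: Path) {
        self.edges.insert((authority, child));
    }

    /// A `Request` is forwarded only if the sender directly admitted the
    /// destination. Path ancestry alone is not sufficient.
    fn check_request(&self, src_path: &[String], dst_path: &[String]) -> Decision {
        if self.edges.contains(&(src_path.to_vec(), dst_path.to_vec())) {
            Decision::Forward
        } else {
            Decision::AuthorityViolation
        }
    }
}
```

In this sketch the root may send to `/abc123` but not to `/abc123/pivot`, and `/abc123/pivot` may not send to `/abc123`.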
**Response routing.** When the router forwards a `Request`, it records `request_id → src_connection` in an internal request table. When the corresponding `Response` arrives, the router forwards it to the recorded source and removes the entry. A `Response` with an unrecognised `request_id` is discarded with a warning.
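The request table described above can be sketched as a single map. The integer widths chosen for `RequestId` and `ConnId` are assumptions, as is the `RequestTable` type itself.

```rust
use std::collections::HashMap;

/// Assumed types; the spec does not fix these widths in this section.
type RequestId = u64;
type ConnId = u32;

/// Router-side table mapping request_id to the connection the request came from.
struct RequestTable {
    pending: HashMap<RequestId, ConnId>,
}

impl RequestTable {
    fn new() -> Self {
        RequestTable { pending: HashMap::new() }
    }

    /// Called when a `Request` is forwarded.
    fn on_request_forwarded(&mut self, request_id: RequestId, src_connection: ConnId) {
        self.pending.insert(request_id, src_connection);
    }

    /// Called when a `Response` arrives. Returns the connection to forward to
    /// and removes the entry; `None` means the request_id is unrecognised and
    /// the response should be discarded with a warning.
    fn on_response(&mut self, request_id: RequestId) -> Option<ConnId> {
        self.pending.remove(&request_id)
    }
}
```

Because the entry is removed on first use, a duplicate `Response` for the same `request_id` falls into the unrecognised case.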
**Timeouts.** There is no protocol-level timeout on a `Request` / `Response` pair. The calling endpoint is responsible for implementing application-layer timeouts.
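An application-layer timeout can be kept entirely on the calling side by tracking a deadline per outstanding `request_id`. The following is one possible sketch; `PendingCalls` and its methods are hypothetical, and `request_id` is assumed to be a `u64`. Passing the current time in as a parameter keeps the logic independent of the clock.

```rust
use std::collections::HashMap;
use std::time::{Duration, Instant};

/// Caller-side bookkeeping of outstanding requests and their deadlines.
struct PendingCalls {
    deadlines: HashMap<u64, Instant>,
}

impl PendingCalls {
    fn new() -> Self {
        PendingCalls { deadlines: HashMap::new() }
    }

    /// Record a request that was just sent.
    fn track(&mut self, request_id: u64, now: Instant, timeout: Duration) {
        self.deadlines.insert(request_id, now + timeout);
    }

    /// Mark a request as answered. Returns false if it was unknown or
    /// had already expired.
    fn complete(&mut self, request_id: u64) -> bool {
        self.deadlines.remove(&request_id).is_some()
    }

    /// Remove and return every request whose deadline has passed.
    fn expire(&mut self, now: Instant) -> Vec<u64> {
        let mut expired: Vec<u64> = self
            .deadlines
            .iter()
            .filter(|(_, deadline)| **deadline <= now)
            .map(|(id, _)| *id)
            .collect();
        expired.sort_unstable();
        for id in &expired {
            self.deadlines.remove(id);
        }
        expired
    }
}
```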
A hook is a response channel declared by the authority inside a `CallProcedure` request. There is no separate hook declaration packet — the hook is born inside the call that establishes it.
`hook_id` is assigned by the authority and must be unique within that authority's set of currently active hooks. The router routes `HookData` to `fire_path` using normal path-based routing (longest-prefix match on `dst_path`). The authority demultiplexes incoming `HookData` packets by `hook_id` from the packet header.
When `end_hook: true` is received, the hook is complete. The authority removes it from its internal hook table. The router may discard any associated routing state for that `hook_id` after forwarding the final packet.
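The authority-side hook lifecycle (declare, demultiplex by `hook_id`, remove on `end_hook: true`) might look like the sketch below. The flattened `HookData` struct, the `u32` width of `hook_id`, and the `HookTable` and `HookEvent` types are assumptions made for illustration; the sketch simply buffers payload chunks until the hook completes.

```rust
use std::collections::HashMap;

/// The fields the authority needs from a `HookData` packet (assumed shape).
struct HookData {
    hook_id: u32,
    end_hook: bool,
    data: Vec<u8>,
}

#[derive(Debug, PartialEq)]
enum HookEvent {
    /// No active hook has this id; the packet is ignored.
    Unknown,
    /// Data was buffered and the hook is still open.
    Pending,
    /// `end_hook: true` was received; all buffered chunks are returned.
    Complete(Vec<Vec<u8>>),
}

/// Authority-side table of currently active hooks.
struct HookTable {
    active: HashMap<u32, Vec<Vec<u8>>>,
}

impl HookTable {
    fn new() -> Self {
        HookTable { active: HashMap::new() }
    }

    /// Reserve a hook_id. Returns false if the id is already active,
    /// which would violate the uniqueness requirement.
    fn declare(&mut self, hook_id: u32) -> bool {
        if self.active.contains_key(&hook_id) {
            return false;
        }
        self.active.insert(hook_id, Vec::new());
        true
    }

    /// Demultiplex an incoming packet by hook_id.
    fn on_hook_data(&mut self, pkt: HookData) -> HookEvent {
        match self.active.get_mut(&pkt.hook_id) {
            None => HookEvent::Unknown,
            Some(chunks) => {
                chunks.push(pkt.data);
                if pkt.end_hook {
                    let done = self.active.remove(&pkt.hook_id).unwrap_or_default();
                    HookEvent::Complete(done)
                } else {
                    HookEvent::Pending
                }
            }
        }
    }
}
```

Once a hook completes, its `hook_id` is no longer active and may be declared again.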
### Hook Cancellation
There is no protocol-level mechanism to cancel a running hook. To abort a streaming hook early, the authority invokes a leaf-defined cancel procedure (e.g. `CallProcedure("halt", {})`). Leaf implementers must provide such a procedure if early cancellation is required.
### Stream Establishment via Hook
When a `CallProcedure` declares a `Stream`-type hook, a bidirectional `StreamData` channel is established as part of the same call. The authority pre-assigns a `stream_id` (a `u32`) and places it in the `PacketHeader` of the `CallProcedure` packet alongside the `hook_id`. The router, upon forwarding the `CallProcedure`, registers this `stream_id` in its stream table, mapping it to the pair `(authority_connection, leaf_connection)`.
The stream is considered live when the leaf sends its first `HookData` packet. Both sides may then exchange `StreamData` packets freely using the pre-assigned `stream_id`. The stream is closed when either side sends `StreamClose`, or when `end_hook: true` is received on the associated hook.
The authority is responsible for assigning unique `stream_id` values across all streams it currently manages. Because `stream_id` is a `u32` and the router's stream table is a flat `HashMap<u32, ...>`, values from two different authorities that happen to collide will corrupt each other's streams. Authorities should use random or cryptographically generated `stream_id` values to make collision probability negligible in practice.
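How negligible the collision probability is depends on how many streams are live at once. Assuming `stream_id` values are drawn uniformly at random from the full `u32` range, the standard birthday approximation gives an estimate; the function name below is hypothetical.

```rust
/// Approximate probability that at least two of `active_streams` uniformly
/// random u32 identifiers collide (birthday approximation).
fn collision_probability(active_streams: u64) -> f64 {
    let n = active_streams as f64;
    // Number of distinct pairs; zero when there are fewer than two streams.
    let pairs = n * (n - 1.0).max(0.0) / 2.0;
    let space = 4_294_967_296.0_f64; // 2^32 possible stream_id values
    // 1 - exp(-pairs / space), computed via exp_m1 for small arguments.
    -(-pairs / space).exp_m1()
}
```

Under this approximation, 1,000 concurrent streams across a router give a collision probability of roughly 0.012%, while 10,000 give roughly 1.2%. Random assignment is therefore reasonable for modest stream counts but becomes a real risk as the number of simultaneously live streams grows.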
---
## Streams
Streams provide a bidirectional, low-overhead data channel between an authority and a leaf. They are established exclusively via the hook mechanism described above. There is no standalone stream-open packet.
`StreamData` payload is raw bytes. `StreamClose` payload is empty.
The router maintains a `HashMap<u32, (ConnectionHandle, ConnectionHandle)>` for active streams. An entry is added when a `CallProcedure` establishing a `Stream`-type hook is forwarded, and removed on `StreamClose` or on endpoint disconnect.
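A minimal sketch of that stream table follows. `ConnectionHandle` is modelled here as an opaque integer wrapper, and `StreamTable` with its method names is hypothetical. Registration deliberately overwrites on a `stream_id` collision, matching the flat-map behaviour of the design, and returns the displaced entry so the collision is at least observable.

```rust
use std::collections::HashMap;

/// Opaque stand-in for the router's connection handle (assumed representation).
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
struct ConnectionHandle(u64);

struct StreamTable {
    streams: HashMap<u32, (ConnectionHandle, ConnectionHandle)>,
}

impl StreamTable {
    fn new() -> Self {
        StreamTable { streams: HashMap::new() }
    }

    /// Called when a `CallProcedure` carrying a stream_id is forwarded.
    /// A `Some` return value means a live stream with the same id was overwritten.
    fn register(
        &mut self,
        stream_id: u32,
        authority: ConnectionHandle,
        leaf: ConnectionHandle,
    ) -> Option<(ConnectionHandle, ConnectionHandle)> {
        self.streams.insert(stream_id, (authority, leaf))
    }

    /// Find where to forward a `StreamData` packet: the other end of the pair.
    fn peer_of(&self, stream_id: u32, from: ConnectionHandle) -> Option<ConnectionHandle> {
        let (authority, leaf) = *self.streams.get(&stream_id)?;
        if from == authority {
            Some(leaf)
        } else if from == leaf {
            Some(authority)
        } else {
            None
        }
    }

    /// Called on `StreamClose`. Returns false if the id was not registered.
    fn close(&mut self, stream_id: u32) -> bool {
        self.streams.remove(&stream_id).is_some()
    }

    /// Called on endpoint disconnect: drop every stream touching that connection.
    fn drop_connection(&mut self, conn: ConnectionHandle) {
        self.streams.retain(|_, (a, l)| *a != conn && *l != conn);
    }
}
```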
If a `StreamData` or `StreamClose` packet arrives with an unrecognised `stream_id`, the router returns an error to the sender and removes the stream entry if one exists. This may occur in the race window following a payload reconnect, before the stale stream entry has been cleared. The sender should treat this as a hard stream termination.
**Flow control.** The protocol has no acknowledgement or backpressure mechanism. Flow control is delegated entirely to the transport. `TcpTransport` inherits TCP's sliding window natively. Any custom transport must implement equivalent backpressure internally. Application-level rate limiting is the responsibility of the leaf implementation.
Leaves are the application layer of the protocol. A leaf represents a remote service or data object hosted on an endpoint: a shell session, a TCP tunnel, a file system, a running process. The protocol places no constraints on what a leaf does or what procedures it exposes.
On the wire, `dst_path` carries the endpoint path and `dst_leaf` carries the leaf name. The leaf name is not part of the path and is not visible to the router.
If the router cannot resolve the destination endpoint path, it returns `NoBranchError` to the sender. If the path resolves but the leaf name specified in `dst_leaf` is unknown to the endpoint, the endpoint silently discards the request. In either case, no hook fires. The calling endpoint is responsible for implementing an application-layer timeout.
Calling `start` with no arguments uses the leaf's stored defaults. Calling with arguments updates the stored defaults and spawns the process. The `stream` procedure establishes a `Stream`-type hook; the authority must pre-assign a `stream_id` in the `CallProcedure` header as described in the Hooks section.
O(1) lookup in `HashMap<u32, (ConnectionHandle, ConnectionHandle)>`. The entry is created when a `CallProcedure` carrying a `stream_id` is forwarded. The entry is removed on `StreamClose` or on endpoint disconnect. If `stream_id` is not found, the packet is discarded and an error is returned to the sender.
When a `Request` is forwarded, the router records `request_id → src_connection`. When the corresponding `Response` arrives, the router forwards it to the recorded source connection and removes the entry. A `Response` with an unrecognised `request_id` is discarded with a warning.
`content_type` is a free-form string namespaced by module. The protocol does not validate or interpret it — it is an application-layer concern. The router forwards it opaquely as part of the payload.
**Reconnect policy (payloads):** On `Disconnected` or `Io(_)`: close transport, wait 5 seconds, reconnect, run full auth and handshake. No maximum retry limit. All hook and stream state from the previous session is lost on disconnect. The authority must reissue any `CallProcedure` requests whose hooks it still needs after the payload reconnects.
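The reconnect policy can be sketched as a loop around a single session. In the sketch below, `run_session` is assumed to cover connect, auth, handshake, and the session itself, and `wait` is injected so the delay can be replaced in tests. The `SessionError::Other` variant and the convention that `Ok(())` means a deliberate shutdown are assumptions added so the example can terminate; only `Disconnected` and `Io(_)` come from the policy above.

```rust
use std::io;
use std::time::Duration;

#[derive(Debug)]
enum SessionError {
    Disconnected,
    Io(io::Error),
    /// Hypothetical non-recoverable error, not part of the stated policy.
    Other(String),
}

/// Fixed delay between a lost session and the next connection attempt.
const RECONNECT_DELAY: Duration = Duration::from_secs(5);

/// Run sessions until one ends cleanly or fails with a non-recoverable error.
/// There is no retry limit: `Disconnected` and `Io(_)` always lead to
/// another attempt after the delay.
fn run_with_reconnect<S, W>(mut run_session: S, mut wait: W) -> Result<(), SessionError>
where
    S: FnMut() -> Result<(), SessionError>,
    W: FnMut(Duration),
{
    loop {
        match run_session() {
            Ok(()) => return Ok(()),
            // Hook and stream state from the previous session is gone here;
            // each new session starts from a full auth and handshake.
            Err(SessionError::Disconnected) | Err(SessionError::Io(_)) => wait(RECONNECT_DELAY),
            Err(other) => return Err(other),
        }
    }
}
```

In production `wait` would typically be `std::thread::sleep`; in tests it can record the requested delays instead of sleeping.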
The following issues have no clean resolution within the current design. They are documented here so that implementers understand the tradeoffs and can make informed decisions.
---
### 1. `AuthorityViolation` enforcement is unspecified
`ResponseStatus::AuthorityViolation` is defined for the case where a non-parent endpoint attempts to send a `Request` downward past a node it did not directly admit. The spec states that routers must reject such packets, but no mechanism for performing this check is defined.
To enforce this, the router would need to verify the authority relationship between a packet's sender and its destination before forwarding. The information is available in the routing table populated during admission. However, producing `AuthorityViolation` requires the router to generate a `TreeResponse` — an application-layer packet type — which conflicts with the design principle that the router is opaque to payloads and does not generate protocol-level responses of its own.
The options are:
- **Enforce it, accept the exception.** The router generates `AuthorityViolation` as a special case, accepting that this is the one place where the router produces application-layer content. This provides clear operational feedback but breaks the design principle.
- **Drop silently.** Consistent with the behavior of unresolvable `CallProcedure` destinations (no response, timeout at the application layer), but provides no diagnostic signal to a misbehaving or misconfigured endpoint.
- **Remove `AuthorityViolation`.** Drop the status code entirely on the grounds that a correctly implemented endpoint will never send a `Request` to an endpoint it did not directly admit. The check becomes a deploy-time correctness property rather than a runtime one.
### 2. `Read` and `Write` request types have no defined behavior
`RequestType::Read` and `RequestType::Write` are defined in the enum but no section of the spec describes what they do, what `data` should contain, or how an endpoint should respond to them. Every actual leaf interaction in the spec uses `CallProcedure`.
The coherent role for `Read` and `Write` would be for plain data endpoints — nodes that store a value without hosting a full leaf. However, the spec has no concept of a non-leaf data endpoint anywhere else. Introducing one would require defining how such nodes are registered, what types they hold, and how reads and writes are serialised.
The alternative is to remove `Read` and `Write` from `RequestType`, leaving only `GetProcedures` and `CallProcedure`. This is a deliberate narrowing of the protocol's scope: all structured interaction goes through leaves and procedures, and there is no raw read/write layer below that. This is consistent with the complexity budget and the extensibility tenet, but should be an explicit decision rather than an implicit one made by leaving the types undefined.
---
### 3. `LeafInfo` contains overlapping state representations
`GetProcedures` returns a `Vec<LeafInfo>`, each containing a `LeafState` (running, pid, error) and a `Vec<ProcedureDescriptor>` whose `params` fields carry current parameter values. `CallProcedure("state", {})` returns the same information through a hook. There are three overlapping paths to the current state of a leaf, with no stated difference in authority, freshness, or purpose.
The problem is that these serve roles that are conceptually distinct but practically redundant. `GetProcedures` is a discovery call — the operator asks what the leaf can do and what its current configuration is. `LeafState` answers whether the process is alive. `ProcedureDescriptor.params` answers what the current settings are. `CallProcedure("state")` is a live query that returns the same data.
The question is whether discovery should return live state at all. One approach is to separate static schema from dynamic state: `GetProcedures` returns only the shape of the leaf (available procedures, parameter names, and their types) with no live values, and all live state queries go through `CallProcedure`. This removes the redundancy but requires a separate round-trip to learn current state after discovery. The other approach is to keep the current design and simply document that `GetProcedures` returns a snapshot that may be stale by the time it is acted on, and that `CallProcedure("state")` is the authoritative live query. Either is defensible; the spec needs to commit to one.