Mesh Reliability Risks Under Scale and Mobility #1775

@robekl

How important are these issues in the current implementation?

  1. 1-byte path-hash collision risk in direct routing (semantic addressing ambiguity)
    Direct routing currently identifies the “next hop” using only a 1-byte hash prefix of a node public key. Only 256 hop identifiers exist, so collisions become likely well before a mesh reaches 256 repeaters and are unavoidable beyond that. The forwarding logic for direct packets accepts a packet when that single byte matches the local node’s prefix, not when a stronger identity match is proven.
    Why this matters materially: if two repeaters share the same prefix byte, either repeater can consider itself the intended next hop and consume/forward traffic that was meant for the other. This can create route drift, blackholing, mis-forwarding, and increased retransmissions. Under light traffic and small meshes it may appear rare, but in larger or multi-region deployments it becomes increasingly likely and can directly reduce delivery reliability for direct messages.
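To put a rough number on the collision risk in item 1, here is a minimal sketch. It assumes prefix bytes are uniformly distributed and independent, which is an idealisation, and the names (`collision_probability`, `candidates_for_hop`, the repeater labels) are hypothetical and not taken from the codebase. The population that matters is the set of repeaters that can hear the same packet, not the whole mesh.

```python
def collision_probability(n_nodes: int, id_space: int = 256) -> float:
    """Chance that at least two of n_nodes share an identifier.

    Assumes identifiers are uniform and independent (birthday-problem model).
    """
    if n_nodes > id_space:
        return 1.0  # pigeonhole: more nodes than identifiers
    p_unique = 1.0
    for i in range(n_nodes):
        p_unique *= (id_space - i) / id_space
    return 1.0 - p_unique


def candidates_for_hop(prefixes: dict, hop_byte: int) -> list:
    """Return every node whose 1-byte prefix matches the hop identifier.

    Hypothetical stand-in for the single-byte acceptance check described
    above: any node in the returned list would treat itself as the next hop.
    """
    return [name for name, prefix in prefixes.items() if prefix == hop_byte]


if __name__ == "__main__":
    for n in (10, 20, 40, 80):
        print(n, round(collision_probability(n), 3))
    # Two repeaters sharing prefix 0x3F are indistinguishable to the sender.
    print(candidates_for_hop({"rptA": 0x3F, "rptB": 0x3F, "rptC": 0x91}, 0x3F))
```

Under this model the odds of at least one shared prefix pass 50% at about 20 repeaters in mutual range, which is why the issue can look rare in small meshes and still show up regularly in dense ones.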

  2. “First packet wins” route learning with unconditional path replacement (path quality instability)
    Route learning currently favors whichever valid path-related message arrives first, then replaces the stored outbound path without any quality gating: there is no comparison of hop count, recency confidence, delivery success, link quality trend, or stability score. In effect, a single newly received path can immediately overwrite a previously working route.
    Why this matters materially: in RF environments with multipath, congestion, asymmetric links, or mobile repeaters, the first-arriving path is often not the best long-term route. Unconditional replacement can degrade a stable direct route into a brittle one, causing repeated direct failures, timeout-driven retries, and fallback flooding. The user experience is intermittent “works/doesn’t work” behavior for the same peer, and the mesh sees unnecessary airtime usage due to avoidable recovery traffic.
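One possible shape for the quality gating that item 2 says is missing, as a sketch only. The `PathEntry` fields, the thresholds, and the replacement policy are all assumptions made for illustration; none of them come from the current implementation, and a real policy would probably also weigh link quality.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

# Hypothetical thresholds, chosen only for illustration.
MAX_AGE_S = 600.0     # treat a stored path older than this as stale
FAILURE_LIMIT = 2     # consecutive direct-send failures before giving up


@dataclass
class PathEntry:
    hops: Tuple[int, ...]          # 1-byte hop identifiers along the path
    learned_at: float              # time the path was learned, in seconds
    consecutive_failures: int = 0  # direct sends that timed out in a row


def should_replace(current: Optional[PathEntry],
                   candidate: PathEntry,
                   now: float) -> bool:
    """Decide whether a newly learned path should overwrite the stored one."""
    if current is None:
        return True  # nothing stored yet, take what we have
    if current.consecutive_failures >= FAILURE_LIMIT:
        return True  # stored path is demonstrably failing
    if now - current.learned_at > MAX_AGE_S:
        return True  # stored path is stale
    # Otherwise only accept a strictly shorter path.
    return len(candidate.hops) < len(current.hops)


if __name__ == "__main__":
    working = PathEntry(hops=(0x3F, 0x91), learned_at=0.0)
    longer = PathEntry(hops=(0x10, 0x22, 0x91), learned_at=30.0)
    print(should_replace(working, longer, now=30.0))  # stored path is kept
```

The point of the gate is that a working route survives a single late or longer arrival, while a failing or stale route can still be replaced quickly.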

  3. Duplicate suppression robustness is bounded by small cyclic tables (replay/duplicate re-admission window)
    Duplicate suppression relies on fixed-size cyclic memory of recently seen packet hashes and ACKs. Once these tables wrap, older entries are evicted and can be treated as unseen again. This model is efficient and simple, but it provides only a finite duplicate horizon that shrinks under high traffic.
    Why this matters materially: in bursty periods, dense repeater clusters, or adversarial/accidental replay conditions, valid duplicate protection can expire quickly. Previously seen packets can re-enter forwarding paths after eviction, causing duplicate deliveries, extra retransmit load, and elevated collision pressure. The impact is not just theoretical: when the mesh is busy, this becomes a feedback amplifier (more duplicates -> more airtime -> more contention -> lower effective delivery probability).
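The re-admission window in item 3 can be reproduced with a toy model of a fixed-size cyclic table. This is a sketch of the general technique, not the project's actual data structure: the class name, the capacity, and the traffic rate are assumed values.

```python
class CyclicSeenTable:
    """Fixed-size ring of recently seen packet hashes (toy model)."""

    def __init__(self, capacity: int) -> None:
        self._slots = [None] * capacity
        self._next = 0

    def has_seen(self, packet_hash: bytes) -> bool:
        """Return True if the hash is still in the table.

        Otherwise record it, overwriting the oldest slot, and return False.
        """
        if packet_hash in self._slots:
            return True
        self._slots[self._next] = packet_hash
        self._next = (self._next + 1) % len(self._slots)
        return False


def duplicate_horizon_seconds(capacity: int, unique_packets_per_s: float) -> float:
    """Approximate time before an entry is evicted at a steady packet rate."""
    return capacity / unique_packets_per_s


if __name__ == "__main__":
    table = CyclicSeenTable(capacity=4)
    table.has_seen(b"A")                    # first sighting, recorded
    for other in (b"B", b"C", b"D", b"E"):  # enough traffic to wrap the ring
        table.has_seen(other)
    print(table.has_seen(b"A"))             # A was evicted, so it looks new
    print(duplicate_horizon_seconds(128, 4.0))
```

The horizon scales as capacity divided by the rate of unique packets, so the same table that covers minutes on a quiet mesh may cover only seconds during a burst.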
