path: root/news/phase4-nat-traversal
author: murilo ijanc, 2026-03-24 21:41:06 -0300
committer: murilo ijanc, 2026-03-24 21:41:06 -0300
commit: f186b71ca51e83837db60de13322394bb5e6d348
tree: cd7940eaa16b83d2cde7b18123411bfb161f7ebb /news/phase4-nat-traversal
download: website-f186b71ca51e83837db60de13322394bb5e6d348.tar.gz
Initial commit
Import existing tesseras.net website content.
Diffstat (limited to 'news/phase4-nat-traversal')
-rw-r--r-- news/phase4-nat-traversal/index.html | 228
-rw-r--r-- news/phase4-nat-traversal/index.html.gz | bin 0 -> 5328 bytes
2 files changed, 228 insertions, 0 deletions
diff --git a/news/phase4-nat-traversal/index.html b/news/phase4-nat-traversal/index.html
new file mode 100644
index 0000000..1d7748b
--- /dev/null
+++ b/news/phase4-nat-traversal/index.html
@@ -0,0 +1,228 @@
+<!DOCTYPE html>
+<html lang="en">
+<head>
+ <meta charset="utf-8">
+ <meta name="viewport" content="width=device-width, initial-scale=1">
+ <title>Phase 4: Punching Through NATs — Tesseras</title>
+ <meta name="description" content="Tesseras nodes can now discover their NAT type via STUN, coordinate UDP hole punching through introducers, and fall back to transparent relay forwarding when direct connectivity fails.">
+ <!-- Open Graph -->
+ <meta property="og:type" content="article">
+ <meta property="og:title" content="Phase 4: Punching Through NATs">
+ <meta property="og:description" content="Tesseras nodes can now discover their NAT type via STUN, coordinate UDP hole punching through introducers, and fall back to transparent relay forwarding when direct connectivity fails.">
+ <meta property="og:image" content="https://tesseras.net/images/social.jpg">
+ <meta property="og:image:width" content="1200">
+ <meta property="og:image:height" content="630">
+ <meta property="og:site_name" content="Tesseras">
+ <!-- Twitter Card -->
+ <meta name="twitter:card" content="summary_large_image">
+ <meta name="twitter:title" content="Phase 4: Punching Through NATs">
+ <meta name="twitter:description" content="Tesseras nodes can now discover their NAT type via STUN, coordinate UDP hole punching through introducers, and fall back to transparent relay forwarding when direct connectivity fails.">
+ <meta name="twitter:image" content="https://tesseras.net/images/social.jpg">
+ <link rel="stylesheet" href="https://tesseras.net/style.css?h=21f0f32121928ee5c690">
+
+
+ <link rel="alternate" type="application/atom+xml" title="Tesseras" href="https://tesseras.net/atom.xml">
+
+
+ <link rel="icon" type="image/png" sizes="32x32" href="https://tesseras.net/images/favicon.png?h=be4e123a23393b1a027d">
+
+</head>
+<body>
+ <header>
+ <h1>
+ <a href="https:&#x2F;&#x2F;tesseras.net/">
+ <img src="https://tesseras.net/images/logo-64.png?h=c1b8d0c4c5f93b49d40b" alt="Tesseras" width="40" height="40" class="logo">
+ Tesseras
+ </a>
+ </h1>
+ <nav>
+
+ <a href="https://tesseras.net/about/">About</a>
+ <a href="https://tesseras.net/news/">News</a>
+ <a href="https://tesseras.net/releases/">Releases</a>
+ <a href="https://tesseras.net/faq/">FAQ</a>
+ <a href="https://tesseras.net/subscriptions/">Subscriptions</a>
+ <a href="https://tesseras.net/contact/">Contact</a>
+
+ </nav>
+ <nav class="lang-switch">
+
+ <strong>English</strong> | <a href="/pt-br&#x2F;news&#x2F;phase4-nat-traversal&#x2F;">Português</a>
+
+ </nav>
+ </header>
+
+ <main>
+
+<article>
+ <h2>Phase 4: Punching Through NATs</h2>
+ <p class="news-date">2026-02-15</p>
+ <p>Most people's devices sit behind a NAT — a network address translator that lets
+them reach the internet but prevents incoming connections. For a P2P network,
+this is an existential problem: if two nodes behind NATs can't talk to each
+other, the network fragments. Phase 4 continues with a full NAT traversal stack:
+STUN-based discovery, coordinated hole punching, and relay fallback.</p>
+<p>The approach follows the same pattern as most battle-tested P2P systems (WebRTC,
+BitTorrent, IPFS): try the cheapest option first, escalate only when necessary.
+Direct connectivity costs nothing. Hole punching costs a few coordinated
+packets. Relaying costs sustained bandwidth from a third party. Tesseras tries
+them in that order.</p>
+<h2 id="what-was-built">What was built</h2>
+<p><strong>NatType classification</strong> (<code>tesseras-core/src/network.rs</code>) — A new <code>NatType</code>
+enum (Public, Cone, Symmetric, Unknown) added to the core domain layer. This
+type is shared across the entire stack: the STUN client writes it, the DHT
+advertises it in Pong messages, and the punch coordinator reads it to decide
+whether hole punching is even worth attempting (Cone-to-Cone works ~80% of the
+time; Symmetric-to-Symmetric almost never works).</p>
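+<p>As a rough illustration of how a coordinator might read <code>NatType</code>, the sketch below picks the first
+connection method to try for a pair of peers. The <code>Strategy</code> enum, the <code>first_strategy</code> function, and
+the exact pairing rules are assumptions made for this example, not the project's actual code.</p>

```rust
/// NAT classification; the variant names come from the post.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum NatType {
    Public,
    Cone,
    Symmetric,
    Unknown,
}

/// Hypothetical: connection methods, cheapest first.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Strategy {
    Direct,
    HolePunch,
    Relay,
}

/// Pick the first strategy to try for a local/remote NAT pair.
/// The rules below are illustrative assumptions.
pub fn first_strategy(local: NatType, remote: NatType) -> Strategy {
    match (local, remote) {
        // Assumption: if either side is publicly reachable, the other
        // side can simply dial out to it.
        (NatType::Public, _) | (_, NatType::Public) => Strategy::Direct,
        // Cone-to-Cone is the pairing the post describes as usually working.
        (NatType::Cone, NatType::Cone) => Strategy::HolePunch,
        // Anything involving Symmetric or Unknown goes straight to relay
        // in this conservative sketch.
        _ => Strategy::Relay,
    }
}
```

+<p>A real coordinator would likely escalate from one strategy to the next on failure rather than pick only once.</p>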
+<p><strong>STUN client</strong> (<code>tesseras-net/src/stun.rs</code>) — A minimal STUN implementation
+(RFC 5389 Binding Request/Response) that discovers a node's external address.
+The codec encodes 20-byte binding requests with a random transaction ID and
+decodes XOR-MAPPED-ADDRESS responses. The <code>discover_nat()</code> function queries
+multiple STUN servers in parallel (Google, Cloudflare by default), compares the
+mapped addresses, and classifies the NAT type:</p>
+<ul>
+<li>Same mapped address from all servers, equal to the node's local address → <strong>Public</strong> (no NAT)</li>
+<li>Same mapped address from all servers, different from the local address → <strong>Cone</strong> (hole punching works)</li>
+<li>Different mapped addresses → <strong>Symmetric</strong> (hole punching unreliable)</li>
+<li>No responses → <strong>Unknown</strong></li>
+</ul>
+<p>Queries retry with exponential backoff and configurable timeouts. 12 tests cover
+codec roundtrips, all classification paths, and async loopback queries.</p>
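+<p>The classification rules listed above can be sketched as a pure function. This is a minimal sketch, not the
+real <code>discover_nat()</code>: the <code>classify</code> name and signature are hypothetical, and it assumes that a
+Public node is recognized by comparing the mapped address with the node's own local socket address, which the post
+does not spell out.</p>

```rust
use std::net::SocketAddr;

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum NatType {
    Public,
    Cone,
    Symmetric,
    Unknown,
}

/// Classify a NAT from the local socket address and the mapped addresses
/// reported by every STUN server that answered.
/// Hypothetical helper; the comparison with `local` is an assumption.
pub fn classify(local: SocketAddr, mapped: &[SocketAddr]) -> NatType {
    // No responses at all: nothing can be concluded.
    let first = match mapped.first() {
        Some(addr) => addr,
        None => return NatType::Unknown,
    };
    // Servers disagree about our external address: per-destination mapping.
    if mapped.iter().any(|addr| addr != first) {
        return NatType::Symmetric;
    }
    // All servers agree; decide whether any translation happened.
    if *first == local {
        NatType::Public
    } else {
        NatType::Cone
    }
}
```

+<p>Note that with a single response this sketch cannot tell Cone from Symmetric, which is one reason to query
+several servers.</p>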
+<p><strong>Signed punch coordination</strong> (<code>tesseras-net/src/punch.rs</code>) — Ed25519 signing
+and verification for <code>PunchIntro</code>, <code>RelayRequest</code>, and <code>RelayMigrate</code> messages.
+Every introduction is signed by the initiator and accepted only within a 30-second timestamp window,
+which limits reflection attacks (where an attacker replays an old introduction to
+redirect traffic). The payload format is <code>target || external_addr || timestamp</code>
+— changing any field invalidates the signature. 6 unit tests plus 3
+property-based tests with proptest (arbitrary node IDs, ports, and session
+tokens).</p>
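+<p>To make the payload layout and the timestamp window concrete, here is a small sketch that builds the bytes to be
+signed and checks freshness. The byte encoding (32-byte node ID, raw IP octets, big-endian port and seconds) and the
+names <code>punch_payload</code> and <code>timestamp_fresh</code> are assumptions for illustration. The Ed25519
+signing step itself is left out.</p>

```rust
use std::net::{IpAddr, SocketAddr};

/// Window from the post: timestamps further than this from "now" are rejected.
pub const MAX_SKEW_SECS: u64 = 30;

/// Build the bytes to sign: target || external_addr || timestamp.
/// Assumed encoding: 32-byte node ID, IP octets, big-endian port,
/// big-endian u64 seconds.
pub fn punch_payload(target: &[u8; 32], external: SocketAddr, timestamp: u64) -> Vec<u8> {
    let mut buf = Vec::with_capacity(32 + 16 + 2 + 8);
    buf.extend_from_slice(target);
    match external.ip() {
        IpAddr::V4(ip) => buf.extend_from_slice(&ip.octets()),
        IpAddr::V6(ip) => buf.extend_from_slice(&ip.octets()),
    }
    buf.extend_from_slice(&external.port().to_be_bytes());
    buf.extend_from_slice(&timestamp.to_be_bytes());
    buf
}

/// Accept a timestamp only if it lies within the window around `now`
/// (both in seconds since the same epoch).
pub fn timestamp_fresh(now: u64, timestamp: u64) -> bool {
    now.abs_diff(timestamp) <= MAX_SKEW_SECS
}
```

+<p>Because every field feeds into the signed bytes, altering the target, the address, or the timestamp yields a
+different payload, so a signature over the original no longer verifies.</p>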
+<p><strong>Relay session manager</strong> (<code>tesseras-net/src/relay.rs</code>) — Manages transparent
+UDP relay sessions between NATed peers. Each session has a random 16-byte token;
+peers prefix their packets with the token, the relay strips it and forwards.
+Features:</p>
+<ul>
+<li>Bidirectional forwarding (A→R→B and B→R→A)</li>
+<li>Rate limiting: 256 KB/s for reciprocal peers, 64 KB/s for non-reciprocal</li>
+<li>10-minute maximum duration for bootstrap (non-reciprocal) sessions</li>
+<li>Address migration: when a peer's IP changes (Wi-Fi to cellular), a signed
+<code>RelayMigrate</code> updates the session without tearing it down</li>
+<li>Idle cleanup with configurable timeout</li>
+<li>8 unit tests plus 2 property-based tests</li>
+</ul>
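+<p>The token-prefixed forwarding described above might look roughly like the following. This is a simplified sketch
+with hypothetical types (<code>Relay</code>, <code>Session</code>, <code>route</code>); it covers only token lookup,
+prefix stripping, and bidirectional routing, and leaves out rate limiting, migration, and idle cleanup.</p>

```rust
use std::collections::HashMap;
use std::net::SocketAddr;

/// Token size from the post.
pub const TOKEN_LEN: usize = 16;
pub type Token = [u8; TOKEN_LEN];

/// Hypothetical session record: the two peer addresses.
pub struct Session {
    pub a: SocketAddr,
    pub b: SocketAddr,
}

/// Hypothetical in-memory session table.
pub struct Relay {
    sessions: HashMap<Token, Session>,
}

impl Relay {
    pub fn new() -> Self {
        Relay { sessions: HashMap::new() }
    }

    pub fn open(&mut self, token: Token, a: SocketAddr, b: SocketAddr) {
        self.sessions.insert(token, Session { a, b });
    }

    pub fn close(&mut self, token: &Token) {
        self.sessions.remove(token);
    }

    /// Return the destination and the payload with the token stripped,
    /// or None if the packet is too short, the token is unknown, or the
    /// sender is not part of the session.
    pub fn route<'p>(&self, from: SocketAddr, packet: &'p [u8]) -> Option<(SocketAddr, &'p [u8])> {
        if packet.len() < TOKEN_LEN {
            return None;
        }
        let (prefix, payload) = packet.split_at(TOKEN_LEN);
        let mut token = [0u8; TOKEN_LEN];
        token.copy_from_slice(prefix);
        let session = self.sessions.get(&token)?;
        if from == session.a {
            Some((session.b, payload))
        } else if from == session.b {
            Some((session.a, payload))
        } else {
            None
        }
    }
}
```

+<p>Keeping <code>route</code> free of I/O makes the forwarding decision easy to unit-test; the caller would pass the
+result to a UDP socket.</p>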
+<p><strong>DHT message extensions</strong> (<code>tesseras-dht/src/message.rs</code>) — Seven new message
+variants added to the DHT protocol:</p>
+<table><thead><tr><th>Message</th><th>Purpose</th></tr></thead><tbody>
+<tr><td><code>PunchIntro</code></td><td>"I want to connect to node X, here's my signed external address"</td></tr>
+<tr><td><code>PunchRequest</code></td><td>Introducer forwards the request to the target</td></tr>
+<tr><td><code>PunchReady</code></td><td>Target confirms readiness, sends its external address</td></tr>
+<tr><td><code>RelayRequest</code></td><td>"Create a relay session to node X"</td></tr>
+<tr><td><code>RelayOffer</code></td><td>Relay responds with its address and session token</td></tr>
+<tr><td><code>RelayClose</code></td><td>Tear down a relay session</td></tr>
+<tr><td><code>RelayMigrate</code></td><td>Update session after network change</td></tr>
+</tbody></table>
+<p>The <code>Pong</code> message was extended with NAT metadata: <code>nat_type</code>,
+<code>relay_slots_available</code>, and <code>relay_bandwidth_used_kbps</code>. All new fields use
+<code>#[serde(default)]</code> for backward compatibility: old nodes ignore the fields they
+don't recognize, and new nodes fall back to defaults when an old peer omits them. 9 new serialization roundtrip
+tests.</p>
+<p><strong>NatHandler trait and dispatch</strong> (<code>tesseras-dht/src/engine.rs</code>) — A new
+<code>NatHandler</code> async trait (5 methods) injected into the DHT engine, following the
+same dependency injection pattern as the existing <code>ReplicationHandler</code>. The
+engine's message dispatch loop now routes all punch/relay messages to the
+handler. This keeps the DHT engine protocol-agnostic while allowing the NAT
+traversal logic to live in <code>tesseras-net</code>.</p>
+<p><strong>Mobile reconnection types</strong> (<code>tesseras-embedded/src/reconnect.rs</code>) — A
+three-phase reconnection state machine for mobile devices:</p>
+<ol>
+<li><strong>QuicMigration</strong> (0-2s) — try QUIC connection migration for all active peers</li>
+<li><strong>ReStun</strong> (2-5s) — re-discover external address via STUN</li>
+<li><strong>ReEstablish</strong> (5-10s) — reconnect peers that migration couldn't save</li>
+</ol>
+<p>Peers are reconnected in priority order: bootstrap nodes first, then nodes
+holding our fragments, then nodes whose fragments we hold, then general DHT
+neighbors. A new <code>NetworkChanged</code> event variant was added to the FFI event
+stream so the Flutter app can show reconnection progress.</p>
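+<p>The phase windows and the priority order can be sketched as follows. The phase names and time ranges come from
+the list above; driving the phases purely by elapsed time, the <code>PeerClass</code> enum, and the function names are
+assumptions for this example.</p>

```rust
use std::time::Duration;

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Phase {
    QuicMigration,
    ReStun,
    ReEstablish,
}

/// Map the time since the network change to a phase.
/// Returns None once the 10-second budget is spent (an assumption).
pub fn phase_at(elapsed: Duration) -> Option<Phase> {
    match elapsed.as_millis() {
        0..=1_999 => Some(Phase::QuicMigration),
        2_000..=4_999 => Some(Phase::ReStun),
        5_000..=9_999 => Some(Phase::ReEstablish),
        _ => None,
    }
}

/// Hypothetical peer categories, declared in reconnection priority order
/// so that the derived `Ord` sorts the most important peers first.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
pub enum PeerClass {
    Bootstrap,
    HoldsOurFragments,
    WeHoldTheirFragments,
    DhtNeighbor,
}

/// Sort (class, peer id) pairs into reconnection order.
/// The sort is stable, so peers within a class keep their relative order.
pub fn reconnect_order(peers: &mut [(PeerClass, u32)]) {
    peers.sort_by_key(|peer| peer.0);
}
```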
+<p><strong>Daemon NAT configuration</strong> (<code>tesd/src/config.rs</code>) — A new <code>[nat]</code> section in
+the TOML config with STUN server list, relay toggle, max relay sessions,
+bandwidth limits (reciprocal vs bootstrap), and idle timeout. All fields have
+sensible defaults; relay is disabled by default.</p>
+<p><strong>Prometheus metrics</strong> (<code>tesseras-net/src/metrics.rs</code>) — 16 metrics across four
+subsystems:</p>
+<ul>
+<li><strong>STUN</strong>: requests, failures, latency histogram</li>
+<li><strong>Punch</strong>: attempts/successes/failures (by NAT type pair), latency histogram</li>
+<li><strong>Relay</strong>: active sessions, total sessions, bytes forwarded, idle timeouts,
+rate limit hits</li>
+<li><strong>Reconnect</strong>: network changes, attempts/successes by phase, duration
+histogram</li>
+</ul>
+<p>6 tests verifying registration, increment, label cardinality, and
+double-registration detection.</p>
+<p><strong>Integration tests</strong> — Two end-to-end tests using <code>MemTransport</code> (in-memory
+simulated network):</p>
+<ul>
+<li><code>punch_integration.rs</code> — Full 3-node hole-punch flow: A sends signed
+<code>PunchIntro</code> to introducer I, I verifies and forwards <code>PunchRequest</code> to B, B
+verifies the original signature and sends <code>PunchReady</code> back, A and B exchange
+messages directly. Also tests that a bad signature is correctly rejected.</li>
+<li><code>relay_integration.rs</code> — Full 3-node relay flow: A requests relay from R, R
+creates session and sends <code>RelayOffer</code> to both peers, A and B exchange
+token-prefixed packets through R, A migrates to a new address mid-session, A
+closes the session, and the test verifies the session is torn down and further
+forwarding fails.</li>
+</ul>
+<p><strong>Property tests</strong> — 7 proptest-based tests covering: signature round-trips for
+all three signed message types (arbitrary node IDs, ports, tokens), NAT
+classification determinism (same inputs always produce same output), STUN
+binding request validity, session token uniqueness, and relay rejection of
+too-short packets.</p>
+<p><strong>Justfile targets</strong> — <code>just test-nat</code> runs all NAT traversal tests across
+<code>tesseras-net</code> and <code>tesseras-dht</code>. <code>just test-chaos</code> is a placeholder for future
+Docker Compose chaos tests with <code>tc netem</code>.</p>
+<h2 id="architecture-decisions">Architecture decisions</h2>
+<ul>
+<li><strong>STUN over TURN</strong>: we implement STUN (discovery) and custom relay rather than
+full TURN. TURN requires authenticated allocation and is designed for media
+relay; our relay is simpler — token-prefixed UDP forwarding with rate limits.
+This keeps the protocol minimal and avoids depending on external TURN servers.</li>
+<li><strong>Signatures on introductions</strong>: every <code>PunchIntro</code> is signed by the
+initiator. Without this, an attacker could send forged introductions to
+redirect a node's hole-punch attempts to an attacker-controlled address (a
+reflection attack). The 30-second timestamp window limits replay.</li>
+<li><strong>Reciprocal bandwidth tiers</strong>: relay nodes give 4x the bandwidth (256 vs 64
+KB/s) to peers with good reciprocity scores. This incentivizes nodes to store
+fragments for others — if you contribute, you get better relay service when
+you need it.</li>
+<li><strong>Backward-compatible Pong extension</strong>: new NAT fields in <code>Pong</code> use
+<code>#[serde(default)]</code> and <code>Option&lt;T&gt;</code>. Old nodes that don't understand these
+fields simply skip them during deserialization. No protocol version bump
+needed.</li>
+<li><strong>NatHandler as async trait</strong>: the NAT traversal logic is injected into the
+DHT engine via a trait, just like <code>ReplicationHandler</code>. This keeps the DHT
+engine focused on routing and peer management, and allows the NAT
+implementation to be swapped or disabled without touching core DHT code.</li>
+</ul>
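+<p>The reciprocal bandwidth tiers could be enforced with a per-session token bucket along these lines. This is only a
+sketch: the <code>Bucket</code> type is hypothetical, a burst of one second's worth of traffic is an arbitrary
+choice, and 1 KB is assumed to mean 1024 bytes.</p>

```rust
/// Rates from the post, assuming 1 KB = 1024 bytes.
pub const RECIPROCAL_BPS: u64 = 256 * 1024;
pub const BOOTSTRAP_BPS: u64 = 64 * 1024;

/// Hypothetical per-session token bucket, measured in bytes.
pub struct Bucket {
    rate: u64,   // bytes added per second, also the burst cap
    tokens: u64, // bytes currently available
}

impl Bucket {
    /// Start with a full bucket sized for the peer's tier.
    pub fn new(reciprocal: bool) -> Self {
        let rate = if reciprocal { RECIPROCAL_BPS } else { BOOTSTRAP_BPS };
        Bucket { rate, tokens: rate }
    }

    /// Add tokens for `elapsed_ms` of wall time, capped at one second's worth.
    pub fn refill(&mut self, elapsed_ms: u64) {
        let added = self.rate.saturating_mul(elapsed_ms) / 1000;
        self.tokens = self.tokens.saturating_add(added).min(self.rate);
    }

    /// Spend `len` bytes if available; a false result is a rate-limit hit.
    pub fn try_send(&mut self, len: u64) -> bool {
        if len <= self.tokens {
            self.tokens -= len;
            true
        } else {
            false
        }
    }
}
```

+<p>A relay would call <code>refill</code> with the time since the last packet before each <code>try_send</code>, and
+drop or delay the packet when it returns <code>false</code>.</p>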
+<h2 id="what-comes-next">What comes next</h2>
+<ul>
+<li><strong>Phase 4 continued</strong> — performance tuning (connection pooling, fragment
+caching, SQLite WAL), security audits, institutional node onboarding, OS
+packaging</li>
+<li><strong>Phase 5: Exploration and Culture</strong> — public tessera browser by
+era/location/theme/language, institutional curation, genealogy integration,
+physical media export (M-DISC, microfilm, acid-free paper with QR)</li>
+</ul>
+<p>With NAT traversal, Tesseras can connect nodes regardless of their network
+topology. Public nodes talk directly. Cone-NATed nodes punch through with an
+introducer's help. Symmetric-NATed or firewalled nodes relay through willing
+peers. The network adapts to the real world, where most devices are behind a NAT
+and network conditions change constantly.</p>
+
+</article>
+
+ </main>
+
+ <footer>
+ <p>&copy; 2026 Tesseras Project. <a href="/atom.xml">News Feed</a> · <a href="https://git.sr.ht/~ijanc/tesseras">Source</a></p>
+ </footer>
+</body>
+</html>
diff --git a/news/phase4-nat-traversal/index.html.gz b/news/phase4-nat-traversal/index.html.gz
new file mode 100644
index 0000000..57b93d2
--- /dev/null
+++ b/news/phase4-nat-traversal/index.html.gz
Binary files differ