<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="utf-8">
<meta name="viewport" content="width=device-width, initial-scale=1">
<title>Reed-Solomon: How Tesseras Survives Data Loss — Tesseras</title>
<meta name="description" content="A deep dive into Reed-Solomon erasure coding — what it is, why Tesseras uses it, and the challenges of keeping memories alive across centuries.">
<!-- Open Graph -->
<meta property="og:type" content="article">
<meta property="og:title" content="Reed-Solomon: How Tesseras Survives Data Loss">
<meta property="og:description" content="A deep dive into Reed-Solomon erasure coding — what it is, why Tesseras uses it, and the challenges of keeping memories alive across centuries.">
<meta property="og:image" content="https://tesseras.net/images/social.jpg">
<meta property="og:image:width" content="1200">
<meta property="og:image:height" content="630">
<meta property="og:site_name" content="Tesseras">
<!-- Twitter Card -->
<meta name="twitter:card" content="summary_large_image">
<meta name="twitter:title" content="Reed-Solomon: How Tesseras Survives Data Loss">
<meta name="twitter:description" content="A deep dive into Reed-Solomon erasure coding — what it is, why Tesseras uses it, and the challenges of keeping memories alive across centuries.">
<meta name="twitter:image" content="https://tesseras.net/images/social.jpg">
<link rel="stylesheet" href="https://tesseras.net/style.css?h=21f0f32121928ee5c690">
<link rel="alternate" type="application/atom+xml" title="Tesseras" href="https://tesseras.net/atom.xml">
<link rel="icon" type="image/png" sizes="32x32" href="https://tesseras.net/images/favicon.png?h=be4e123a23393b1a027d">
</head>
<body>
<header>
<h1>
<a href="https://tesseras.net/">
<img src="https://tesseras.net/images/logo-64.png?h=c1b8d0c4c5f93b49d40b" alt="Tesseras" width="40" height="40" class="logo">
Tesseras
</a>
</h1>
<nav>
<a href="https://tesseras.net/about/">About</a>
<a href="https://tesseras.net/news/">News</a>
<a href="https://tesseras.net/releases/">Releases</a>
<a href="https://tesseras.net/faq/">FAQ</a>
<a href="https://tesseras.net/subscriptions/">Subscriptions</a>
<a href="https://tesseras.net/contact/">Contact</a>
</nav>
<nav class="lang-switch">
<strong>English</strong> | <a href="/pt-br/news/reed-solomon/">Português</a>
</nav>
</header>
<main>
<article>
<h2>Reed-Solomon: How Tesseras Survives Data Loss</h2>
<p class="news-date">2026-02-14</p>
<p>Your hard drive will die. Your cloud provider will pivot. The RAID array in your
closet will outlive its controller but not its owner. If a memory is stored in
exactly one place, it has exactly one way to be lost forever.</p>
<p>Tesseras is a network that keeps human memories alive through mutual aid. The
core survival mechanism is <strong>Reed-Solomon erasure coding</strong> — a technique
borrowed from deep-space communication that lets us reconstruct data even when
pieces go missing.</p>
<h2 id="what-is-reed-solomon">What is Reed-Solomon?</h2>
<p>Reed-Solomon is a family of error-correcting codes invented by Irving Reed and
Gustave Solomon in 1960. The original use case was correcting errors in data
transmitted over noisy channels: think Voyager 2 sending photos from Uranus, or
a CD playing despite scratches.</p>
<p>The key insight: if you add carefully computed redundancy to your data <em>before</em>
something goes wrong, you can recover the original even after losing some
pieces.</p>
<p>Here's the intuition. Suppose you have a polynomial of degree 2 — a parabola.
You need 3 points to define it uniquely. But if you evaluate it at 5 points, you
can lose any 2 of those 5 and still reconstruct the polynomial from the
remaining 3. Reed-Solomon generalizes this idea to work over finite fields
(Galois fields), where the polynomial is determined by your data and each
fragment is its value at one evaluation point.</p>
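<p>The parabola intuition can be checked in a few lines. This is only a sketch
over ordinary fractions rather than a finite field, and the polynomial
<code>3x² + 2x + 7</code> and the helper names are made up for illustration.</p>

```python
from fractions import Fraction
from itertools import combinations

def interpolate(points, x):
    """Evaluate at x the unique lowest-degree polynomial through points (Lagrange form)."""
    total = Fraction(0)
    for i, (xi, yi) in enumerate(points):
        term = Fraction(yi)
        for j, (xj, _) in enumerate(points):
            if i != j:
                term *= Fraction(x - xj, xi - xj)
        total += term
    return total

def parabola(x):
    # Hypothetical example polynomial of degree 2: 3x^2 + 2x + 7
    return 3 * x * x + 2 * x + 7

# Evaluate at 5 points, although 3 would be enough to pin the parabola down.
samples = [(x, parabola(x)) for x in range(5)]

# Drop any 2 of the 5 samples: the remaining 3 still recover every value.
recovered_ok = all(
    all(interpolate(list(kept), x) == parabola(x) for x in range(5))
    for kept in combinations(samples, 3)
)
print(recovered_ok)  # True
```

<p>All ten ways of keeping 3 out of 5 samples give back the same parabola, which
is exactly the "any <em>k</em> of <em>n</em>" property that erasure coding relies on.</p>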
<p>In concrete terms:</p>
<ol>
<li><strong>Split</strong> your data into <em>k</em> data shards</li>
<li><strong>Compute</strong> <em>m</em> parity shards from the data shards</li>
<li><strong>Distribute</strong> all <em>k + m</em> shards across different locations</li>
<li><strong>Reconstruct</strong> the original data from any <em>k</em> of the <em>k + m</em> shards</li>
</ol>
<p>You can lose up to <em>m</em> shards — any <em>m</em>, data or parity, in any combination —
and still recover everything.</p>
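<p>The four steps above can be sketched as a toy codec. To stay readable it
makes several assumptions that differ from a real implementation: each shard is
a single number instead of a block of bytes, the arithmetic is done modulo the
prime 257 rather than in GF(2⁸), and the function names are invented for this
example.</p>

```python
P = 257  # small prime field for illustration; real codecs use GF(2^8)

def lagrange_eval(points, x):
    """Value at x of the polynomial through points, with arithmetic mod P."""
    total = 0
    for i, (xi, yi) in enumerate(points):
        num, den = 1, 1
        for j, (xj, _) in enumerate(points):
            if i != j:
                num = num * (x - xj) % P
                den = den * (xi - xj) % P
        total = (total + yi * num * pow(den, -1, P)) % P
    return total

def encode(data, m):
    """Return the k data symbols followed by m parity symbols."""
    k = len(data)
    points = list(enumerate(data))            # shard i holds the value at x = i
    parity = [lagrange_eval(points, x) for x in range(k, k + m)]
    return list(data) + parity

def reconstruct(shards, k):
    """Recover the k data symbols from any k surviving shards (None = lost)."""
    alive = [(x, y) for x, y in enumerate(shards) if y is not None][:k]
    if len(alive) != k:
        raise ValueError("not enough shards to reconstruct")
    return [lagrange_eval(alive, x) for x in range(k)]

data = [72, 105, 33, 10]           # k = 4 data symbols
shards = encode(data, m=2)         # 6 shards in total
shards[0] = None                   # lose one data shard
shards[5] = None                   # and one parity shard
print(reconstruct(shards, k=4) == data)  # True
```

<p>A parity value in this toy can come out as 256, which does not fit in a
byte. That is one practical reason production codecs work in GF(2⁸), where
every symbol is exactly one byte.</p>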
<h2 id="why-not-just-make-copies">Why not just make copies?</h2>
<p>The naive approach to redundancy is replication: make 3 copies, store them in 3
places. This gives you tolerance for 2 failures at the cost of 3x your storage.</p>
<p>Reed-Solomon is dramatically more efficient:</p>
<table><thead><tr><th>Strategy</th><th style="text-align: right">Storage overhead</th><th style="text-align: right">Failures tolerated</th></tr></thead><tbody>
<tr><td>3x replication</td><td style="text-align: right">200%</td><td style="text-align: right">2 out of 3</td></tr>
<tr><td>Reed-Solomon (16 data + 8 parity)</td><td style="text-align: right">50%</td><td style="text-align: right">8 out of 24</td></tr>
<tr><td>Reed-Solomon (48 data + 24 parity)</td><td style="text-align: right">50%</td><td style="text-align: right">24 out of 72</td></tr>
</tbody></table>
<p>With 16 data shards and 8 parity shards, you use 50% extra storage but can
survive losing any 8 of the 24 fragments, a third of the total. Tolerating 8
lost copies through replication alone would take 9 full copies: 800% overhead
instead of 50%.</p>
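<p>The figures in the table follow from simple arithmetic. The sketch below
recomputes them; the function names are ours, not part of Tesseras.</p>

```python
def replication(copies):
    """Overhead and tolerated losses for plain replication."""
    return {"overhead_pct": (copies - 1) * 100, "tolerated": copies - 1, "total": copies}

def reed_solomon(k, m):
    """Overhead and tolerated losses for k data shards plus m parity shards."""
    return {"overhead_pct": round(100 * m / k), "tolerated": m, "total": k + m}

for name, stats in [
    ("3x replication", replication(3)),
    ("Reed-Solomon 16+8", reed_solomon(16, 8)),
    ("Reed-Solomon 48+24", reed_solomon(48, 24)),
]:
    print(f"{name}: {stats['overhead_pct']}% overhead, "
          f"survives {stats['tolerated']} of {stats['total']} losses")
```

<p>Overhead depends only on the ratio of parity to data shards, which is why
both Reed-Solomon rows cost the same 50% while the larger one tolerates three
times as many lost fragments.</p>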
<p>For a network that aims to preserve memories across decades and centuries, this
efficiency isn't a nice-to-have — it's the difference between a viable system
and one that drowns in its own overhead.</p>
<h2 id="how-tesseras-uses-reed-solomon">How Tesseras uses Reed-Solomon</h2>
<p>Not all data deserves the same treatment. A 500-byte text memory and a 100 MB
video have very different redundancy needs. Tesseras uses a three-tier
fragmentation strategy:</p>
<p><strong>Small (&lt; 4 MB)</strong> — Whole-file replication to 7 peers. For small tesseras, the
overhead of erasure coding (encoding time, fragment management, reconstruction
logic) outweighs its benefits. Plain copies are faster and simpler.</p>
<p><strong>Medium (4–256 MB)</strong> — 16 data shards + 8 parity shards = 24 total fragments.
Each fragment is roughly 1/16th of the original size. Any 16 of the 24 fragments
reconstruct the original. Distributed across 7 peers.</p>
<p><strong>Large (≥ 256 MB)</strong> — 48 data shards + 24 parity shards = 72 total fragments.
Higher shard count means smaller individual fragments (easier to transfer and
store) and higher absolute fault tolerance. Also distributed across 7 peers.</p>
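<p>As a rough sketch, tier selection is a lookup on the tessera's size. The
thresholds and shard counts come from the description above; the function name,
the returned dictionary, and the choice of 1 MB = 2²⁰ bytes are assumptions made
for this example.</p>

```python
from bisect import bisect_right

MB = 1024 * 1024  # assumption: the article does not say whether MB is 10^6 or 2^20 bytes

# Exclusive upper bounds of the Small and Medium tiers.
TIER_BOUNDS = [4 * MB, 256 * MB]
# (name, data shards, parity shards); Small is replicated whole, so k=1 and m=0.
TIERS = [("small", 1, 0), ("medium", 16, 8), ("large", 48, 24)]

def plan(size_bytes):
    """Pick a tier and estimate the per-fragment size for a tessera."""
    name, k, m = TIERS[bisect_right(TIER_BOUNDS, size_bytes)]
    return {
        "tier": name,
        "data_shards": k,
        "parity_shards": m,
        "fragment_bytes": -(-size_bytes // k),   # integer ceiling of size / k
    }

print(plan(500)["tier"])                      # small
print(plan(120 * MB)["fragment_bytes"] / MB)  # 7.5
print(plan(256 * MB)["tier"])                 # large
```

<p>A file of exactly 4 MB lands in the Medium tier and one of exactly 256 MB in
the Large tier, matching the boundaries as written.</p>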
<p>The implementation uses the <code>reed-solomon-erasure</code> crate operating over GF(2⁸) —
the same Galois field used in QR codes and CDs. Each fragment carries a BLAKE3
checksum so corruption is detected immediately, not silently propagated.</p>
<pre><code>Tessera (120 MB photo album)
↓ encode
16 data shards (7.5 MB each) + 8 parity shards (7.5 MB each)
↓ distribute
24 fragments across 7 peers (subnet-diverse)
↓ any 16 fragments
Original tessera recovered
</code></pre>
<h2 id="the-challenges">The challenges</h2>
<p>Reed-Solomon solves the mathematical problem of redundancy. The engineering
challenges are everything around it.</p>
<h3 id="fragment-tracking">Fragment tracking</h3>
<p>Every fragment needs to be findable. Tesseras uses a Kademlia DHT for peer
discovery and fragment-to-peer mapping. When a node goes offline, its fragments
need to be re-created and distributed to new peers. This means tracking which
fragments exist, where they are, and whether they're still intact — across a
network with no central authority.</p>
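<p>Kademlia decides which peers are "responsible" for a key by comparing IDs
with XOR and treating the result as a distance. The sketch below shows only that
lookup rule. The 160-bit SHA-1 IDs follow the original Kademlia design and may
not match what Tesseras uses, and the key format is hypothetical.</p>

```python
import hashlib

def node_id(name):
    """Derive a 160-bit ID from a name (SHA-1 is a stand-in for illustration)."""
    return int.from_bytes(hashlib.sha1(name.encode()).digest(), "big")

def closest_peers(key, peer_ids, count):
    """Return the `count` peer IDs nearest to key under the XOR metric."""
    return sorted(peer_ids, key=lambda pid: pid ^ key)[:count]

peers = {node_id(f"peer-{i}"): f"peer-{i}" for i in range(20)}
fragment_key = node_id("tessera-42/fragment-3")   # hypothetical key format

for pid in closest_peers(fragment_key, peers, count=3):
    print(peers[pid], hex(pid ^ fragment_key)[:12])
```

<p>Because every node computes the same distances, any node can work out where a
fragment's record should live without asking a central index.</p>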
<h3 id="silent-corruption">Silent corruption</h3>
<p>A fragment that returns wrong data is worse than one that's missing — at least a
missing fragment is honestly absent. Tesseras addresses this with
attestation-based health checks: the repair loop periodically asks fragment
holders to prove possession by returning BLAKE3 checksums. If a checksum doesn't
match, the fragment is treated as lost.</p>
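<p>The health-check rule can be sketched as a comparison of expected and
reported checksums. Python's standard library has no BLAKE3, so this example
substitutes BLAKE2b purely as a stand-in; the function names and data are
illustrative.</p>

```python
import hashlib

def checksum(fragment):
    # Stand-in: BLAKE2b from the standard library, NOT the BLAKE3 that Tesseras uses.
    return hashlib.blake2b(fragment, digest_size=32).hexdigest()

def audit(expected, reported):
    """Split fragment IDs into healthy and lost, given holders' reported checksums.

    A missing answer and a wrong answer are treated the same way: lost.
    """
    healthy, lost = [], []
    for frag_id, want in expected.items():
        (healthy if reported.get(frag_id) == want else lost).append(frag_id)
    return healthy, lost

fragments = {0: b"alpha", 1: b"bravo", 2: b"charlie"}
expected = {i: checksum(data) for i, data in fragments.items()}

reported = {
    0: checksum(b"alpha"),     # intact
    1: checksum(b"brav0"),     # silently corrupted on disk
    # fragment 2: holder did not answer
}
print(audit(expected, reported))  # ([0], [1, 2])
```

<p>Treating a mismatch exactly like an absence keeps the repair logic simple:
both cases feed the same list of fragments to rebuild.</p>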
<h3 id="correlated-failures">Correlated failures</h3>
<p>If all 24 fragments of a tessera land on machines in the same datacenter, a
single power outage kills them all. Reed-Solomon's math assumes independent
failures. Tesseras enforces <strong>subnet diversity</strong> during distribution: no more
than 2 fragments per /24 IPv4 subnet (or /48 IPv6 prefix). This spreads
fragments across different physical infrastructure.</p>
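<p>A minimal sketch of the subnet rule, assuming a greedy pass that gives each
candidate address at most one fragment. The cap of 2 and the /24 and /48 prefix
lengths come from the text; everything else, including the documentation-range
addresses, is made up for the example.</p>

```python
import ipaddress
from collections import Counter

MAX_PER_SUBNET = 2  # cap stated in the article

def subnet_of(address):
    """Map an address to its /24 (IPv4) or /48 (IPv6) network."""
    ip = ipaddress.ip_address(address)
    prefix = 24 if ip.version == 4 else 48
    return ipaddress.ip_network(f"{ip}/{prefix}", strict=False)

def place(fragment_ids, candidates):
    """Greedily assign fragments to candidates without exceeding the subnet cap."""
    used = Counter()
    placement = {}
    pending = list(fragment_ids)
    for peer in candidates:
        if not pending:
            break
        net = subnet_of(peer)
        if used[net] >= MAX_PER_SUBNET:
            continue            # this subnet already holds its share
        used[net] += 1
        placement[pending.pop(0)] = peer
    return placement, pending   # pending = fragments still needing a home

candidates = [
    "203.0.113.5", "203.0.113.9", "203.0.113.77",   # same /24
    "198.51.100.4",
    "2001:db8:1::1", "2001:db8:1:ffff::2",          # same /48
    "2001:db8:2::1",
]
placement, pending = place(range(6), candidates)
print(placement[2])   # 198.51.100.4: the third 203.0.113.x peer was skipped
print(pending)        # []
```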
<h3 id="repair-speed-vs-network-load">Repair speed vs. network load</h3>
<p>When a peer goes offline, the clock starts ticking. Lost fragments need to be
re-created before more failures accumulate. But aggressive repair floods the
network. Tesseras balances this with a configurable repair loop (default: every
24 hours with 2-hour jitter) and concurrent transfer limits (default: 4
simultaneous transfers). The jitter prevents repair storms where every node
checks its fragments at the same moment.</p>
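<p>Here is one way the timing could look. The three defaults are taken from the
paragraph above, but the text does not say how the jitter is distributed, so the
uniform plus-or-minus range below is an assumption, and grouping repairs into
batches of four is a simplified stand-in for a real concurrency limit.</p>

```python
import random

REPAIR_INTERVAL_S = 24 * 3600   # default from the article
JITTER_S = 2 * 3600             # default from the article
MAX_TRANSFERS = 4               # default from the article

def next_check_delay(rng):
    """Seconds until the next repair pass.

    Assumption: jitter is uniform within +/- JITTER_S of the interval.
    """
    return REPAIR_INTERVAL_S + rng.uniform(-JITTER_S, JITTER_S)

def batches(missing_fragments, limit=MAX_TRANSFERS):
    """Group repairs so that at most `limit` transfers run at once."""
    missing = list(missing_fragments)
    return [missing[i:i + limit] for i in range(0, len(missing), limit)]

# Each node seeds its own generator here only to keep the sketch repeatable.
delays_h = [round(next_check_delay(random.Random(node)) / 3600, 1) for node in range(5)]
print(delays_h)                 # five nodes waking at different times
print(batches(range(10)))       # [[0, 1, 2, 3], [4, 5, 6, 7], [8, 9]]
```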
<h3 id="long-term-key-management">Long-term key management</h3>
<p>Reed-Solomon protects against data loss, not against losing access. If a tessera
is encrypted (private or sealed visibility), you need the decryption key to make
the recovered data useful. Tesseras separates these concerns: erasure coding
handles availability, while Shamir's Secret Sharing (a future phase) will handle
key distribution among heirs. The project's design philosophy — encrypt as
little as possible — keeps the key management problem small.</p>
<h3 id="galois-field-limitations">Galois field limitations</h3>
<p>The GF(2⁸) field limits the total number of shards to 255 (data + parity
combined). For Tesseras, this is not a practical constraint — even the Large
tier uses only 72 shards. But it does mean that extremely large files with
thousands of fragments would require either a different field or a layered
encoding scheme.</p>
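<p>The ceiling comes from the size of the field itself: a one-byte field has
256 elements, 255 of them nonzero, and each shard needs its own distinct
element. The sketch below multiplies in GF(2⁸) and counts them. It assumes the
reducing polynomial <code>0x11D</code>, a common choice for Reed-Solomon; the
polynomial used by the crate is not stated here.</p>

```python
def gf_mul(a, b):
    """Multiply in GF(2^8) using the reducing polynomial 0x11D (assumed)."""
    result = 0
    while b:
        if b % 2:          # lowest bit of b set: add (XOR) the current a
            result ^= a
        a *= 2             # multiply a by x
        if a >= 256:       # overflowed 8 bits: reduce modulo the polynomial
            a ^= 0x11D
        b //= 2
    return result

# Walk the powers of the generator 2 and collect the values visited.
points, x = [], 1
for _ in range(255):
    points.append(x)
    x = gf_mul(x, 2)

print(len(set(points)))  # 255 distinct nonzero elements
print(x)                 # 1: the cycle wraps around after 255 steps
```

<p>Once those elements are used up there is nothing left to assign to another
shard, so going bigger means moving to a larger field such as GF(2¹⁶) or
stacking codes.</p>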
<h3 id="evolving-codec-compatibility">Evolving codec compatibility</h3>
<p>A tessera encoded today must be decodable in 50 years. Reed-Solomon over GF(2⁸)
is one of the most widely implemented algorithms in computing: it runs in CD
players, QR code scanners, and deep-space probes. This ubiquity is itself a
survival strategy. The algorithm won't be forgotten because so much of the
world's infrastructure depends on it.</p>
<h2 id="the-bigger-picture">The bigger picture</h2>
<p>Reed-Solomon is a piece of a larger puzzle. It works in concert with:</p>
<ul>
<li><strong>Kademlia DHT</strong> for finding peers and routing fragments</li>
<li><strong>BLAKE3 checksums</strong> for integrity verification</li>
<li><strong>Bilateral reciprocity</strong> for fair storage exchange (no blockchain needed)</li>
<li><strong>Subnet diversity</strong> for failure independence</li>
<li><strong>Automatic repair</strong> for maintaining redundancy over time</li>
</ul>
<p>No single technique makes memories survive. Reed-Solomon ensures that data <em>can</em>
be recovered. The DHT ensures fragments <em>can be found</em>. Reciprocity ensures
peers <em>want to help</em>. Repair ensures none of this degrades over time.</p>
<p>A tessera is a bet that the sum of these mechanisms, running across many
independent machines operated by many independent people, is more durable than
any single institution. Reed-Solomon is the mathematical foundation of that bet.</p>
</article>
</main>
<footer>
<p>© 2026 Tesseras Project. <a href="/atom.xml">News Feed</a> · <a href="https://git.sr.ht/~ijanc/tesseras">Source</a></p>
</footer>
</body>
</html>