<!DOCTYPE html>
<html lang="en">
<head>
    <meta charset="utf-8">
    <meta name="viewport" content="width=device-width, initial-scale=1">
    <title>Phase 4: Institutional Node Onboarding — Tesseras</title>
    <meta name="description" content="Libraries, archives, and museums can now join the Tesseras network as verified institutional nodes with DNS-based identity, full-text search indexes, and configurable storage pledges.">
    <!-- Open Graph -->
    <meta property="og:type" content="article">
    <meta property="og:title" content="Phase 4: Institutional Node Onboarding">
    <meta property="og:description" content="Libraries, archives, and museums can now join the Tesseras network as verified institutional nodes with DNS-based identity, full-text search indexes, and configurable storage pledges.">
    <meta property="og:image" content="https://tesseras.net/images/social.jpg">
    <meta property="og:image:width" content="1200">
    <meta property="og:image:height" content="630">
    <meta property="og:site_name" content="Tesseras">
    <!-- Twitter Card -->
    <meta name="twitter:card" content="summary_large_image">
    <meta name="twitter:title" content="Phase 4: Institutional Node Onboarding">
    <meta name="twitter:description" content="Libraries, archives, and museums can now join the Tesseras network as verified institutional nodes with DNS-based identity, full-text search indexes, and configurable storage pledges.">
    <meta name="twitter:image" content="https://tesseras.net/images/social.jpg">
    <link rel="stylesheet" href="https://tesseras.net/style.css?h=21f0f32121928ee5c690">
    
        
            <link rel="alternate" type="application/atom+xml" title="Tesseras" href="https://tesseras.net/atom.xml">
        
    
    <link rel="icon" type="image/png" sizes="32x32" href="https://tesseras.net/images/favicon.png?h=be4e123a23393b1a027d">
    
</head>
<body>
    <header>
        <h1>
            <a href="https:&#x2F;&#x2F;tesseras.net/">
                <img src="https://tesseras.net/images/logo-64.png?h=c1b8d0c4c5f93b49d40b" alt="Tesseras" width="40" height="40" class="logo">
                Tesseras
            </a>
        </h1>
        <nav>
            
                <a href="https://tesseras.net/about/">About</a>
                <a href="https://tesseras.net/news/">News</a>
                <a href="https://tesseras.net/releases/">Releases</a>
                <a href="https://tesseras.net/faq/">FAQ</a>
                <a href="https://tesseras.net/subscriptions/">Subscriptions</a>
                <a href="https://tesseras.net/contact/">Contact</a>
            
        </nav>
        <nav class="lang-switch">
            
                <strong>English</strong> | <a href="/pt-br&#x2F;news&#x2F;phase4-institutional-onboarding&#x2F;">Português</a>
            
        </nav>
    </header>

    <main>
        
<article>
    <h2>Phase 4: Institutional Node Onboarding</h2>
    <p class="news-date">2026-02-15</p>
    <p>A P2P network of individuals is fragile. Hard drives die, phones get lost,
people lose interest. The long-term survival of humanity's memories depends on
institutions — libraries, archives, museums, universities — that measure their
lifetimes in centuries. Phase 4 continues with institutional node onboarding:
verified organizations can now pledge storage, run searchable indexes, and
participate in the network with a distinct identity.</p>
<p>The design follows a principle of trust but verify: institutions identify
themselves via DNS TXT records (the same mechanism used by SPF, DKIM, and DMARC
for email), pledge a storage budget, and receive reciprocity exemptions so they
can store fragments for others without expecting anything in return. In
exchange, the network treats their fragments as higher-quality replicas and
limits over-reliance on any single institution through diversity constraints.</p>
<h2 id="what-was-built">What was built</h2>
<p><strong>Capability bits</strong> (<code>tesseras-core/src/network.rs</code>) — Two new flags added to
the <code>Capabilities</code> bitfield: <code>INSTITUTIONAL</code> (bit 7) and <code>SEARCH_INDEX</code> (bit 8).
A new <code>institutional_default()</code> constructor returns the full Phase 2 capability
set plus these two bits and <code>RELAY</code>. Normal nodes advertise <code>phase2_default()</code>
which lacks institutional flags. Serialization roundtrip tests verify the new
bits survive MessagePack encoding.</p>
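<p>A minimal sketch of how such a bitfield could look. Only the positions of <code>INSTITUTIONAL</code> (bit 7) and <code>SEARCH_INDEX</code> (bit 8) come from the description above; the <code>u16</code> backing type, the position of <code>RELAY</code>, the contents of the Phase 2 set, and the <code>contains()</code> helper are assumptions made for illustration.</p>

```rust
/// Hypothetical capability bitfield. Bit 8 does not fit in a u8,
/// so this sketch assumes a 16-bit backing integer.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct Capabilities(u16);

impl Capabilities {
    /// Assumed position for RELAY; the article does not state it.
    pub const RELAY: u16 = 1 << 6;
    /// Bit 7, as described in the article.
    pub const INSTITUTIONAL: u16 = 1 << 7;
    /// Bit 8, as described in the article.
    pub const SEARCH_INDEX: u16 = 1 << 8;
    /// Placeholder for the Phase 2 capability set (assumption: bits 0 to 5).
    const PHASE2_BITS: u16 = 0b0011_1111;

    /// What a normal node advertises: no institutional flags.
    pub fn phase2_default() -> Self {
        Capabilities(Self::PHASE2_BITS)
    }

    /// Phase 2 set plus RELAY, INSTITUTIONAL, and SEARCH_INDEX.
    pub fn institutional_default() -> Self {
        Capabilities(
            Self::PHASE2_BITS | Self::RELAY | Self::INSTITUTIONAL | Self::SEARCH_INDEX,
        )
    }

    /// True when every bit in `flags` is set.
    pub fn contains(self, flags: u16) -> bool {
        (self.0 & flags) == flags
    }
}
```

<p>Under these assumptions, a peer can tell an institutional node from a normal one by testing the two new bits in the advertised capabilities.</p>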
<p><strong>Search types</strong> (<code>tesseras-core/src/search.rs</code>) — Three new domain types for
the search subsystem:</p>
<ul>
<li><code>SearchFilters</code> — query parameters: <code>memory_type</code>, <code>visibility</code>, <code>language</code>,
<code>date_range</code>, <code>geo</code> (bounding box), <code>page</code>, <code>page_size</code></li>
<li><code>SearchHit</code> — a single result: content hash plus a <code>MetadataExcerpt</code> (title,
description, memory type, creation date, visibility, language, tags)</li>
<li><code>GeoFilter</code> — bounding box with <code>min_lat</code>, <code>max_lat</code>, <code>min_lon</code>, <code>max_lon</code> for
spatial queries</li>
</ul>
<p>All types derive <code>Serialize</code>/<code>Deserialize</code> for wire transport and
<code>Clone</code>/<code>Debug</code> for diagnostics.</p>
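<p>To make the shape of these types concrete, here is a rough sketch of <code>GeoFilter</code> and <code>SearchFilters</code>. The field names come from the list above; the field types, the <code>contains()</code> helper, and the use of <code>Option</code> are assumptions, and the <code>Serialize</code>/<code>Deserialize</code> derives are left out so the snippet compiles with the standard library alone.</p>

```rust
/// Bounding box for spatial queries (field names from the article,
/// f64 degrees assumed).
#[derive(Clone, Debug, PartialEq)]
pub struct GeoFilter {
    pub min_lat: f64,
    pub max_lat: f64,
    pub min_lon: f64,
    pub max_lon: f64,
}

impl GeoFilter {
    /// Hypothetical helper: true when the point lies inside the box,
    /// edges inclusive. Boxes that cross the antimeridian are not handled.
    pub fn contains(&self, lat: f64, lon: f64) -> bool {
        lat >= self.min_lat
            && lat <= self.max_lat
            && lon >= self.min_lon
            && lon <= self.max_lon
    }
}

/// Query parameters. All field types here are illustrative guesses.
#[derive(Clone, Debug, Default)]
pub struct SearchFilters {
    pub memory_type: Option<String>,
    pub visibility: Option<String>,
    pub language: Option<String>,
    /// Assumed to be an inclusive (start, end) pair of Unix timestamps.
    pub date_range: Option<(i64, i64)>,
    pub geo: Option<GeoFilter>,
    pub page: u32,
    pub page_size: u32,
}
```

<p>Modelling every filter as an <code>Option</code> lets a client send only the constraints it cares about, which is also why optional-field handling matters for the wire encoding.</p>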
<p><strong>Institutional daemon config</strong> (<code>tesd/src/config.rs</code>) — A new <code>[institutional]</code>
TOML section with <code>domain</code> (the DNS domain to verify), <code>pledge_bytes</code> (storage
commitment in bytes), and <code>search_enabled</code> (toggle for the FTS5 index). The
<code>to_dht_config()</code> method now sets <code>Capabilities::institutional_default()</code> when
institutional config is present, so institutional nodes advertise the right
capability bits in Pong responses.</p>
<p><strong>DNS TXT verification</strong> (<code>tesd/src/institutional.rs</code>) — Async DNS resolution
using <code>hickory-resolver</code> to verify institutional identity. The daemon looks up
<code>_tesseras.&lt;domain&gt;</code> TXT records and parses key-value fields: <code>v</code> (version),
<code>node</code> (hex-encoded node ID), and <code>pledge</code> (storage pledge in bytes).
Verification checks:</p>
<ol>
<li>A TXT record exists at <code>_tesseras.&lt;domain&gt;</code></li>
<li>The <code>node</code> field matches the daemon's own node ID</li>
<li>The <code>pledge</code> field is present and valid</li>
</ol>
<p>On startup, the daemon attempts DNS verification. If it succeeds, the node runs
with institutional capabilities. If it fails, the node logs a warning and
downgrades to a normal full node — no crash, no manual intervention.</p>
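<p>The parsing and checking steps above can be sketched as follows. This is not the daemon's code: the DNS lookup itself is omitted (the function takes the TXT string as input), and the type names, error variants, and the case-insensitive node ID comparison are assumptions.</p>

```rust
/// Parsed `_tesseras.<domain>` TXT record.
#[derive(Debug, PartialEq)]
pub struct TxtRecord {
    pub version: String,
    pub node: String,
    pub pledge: u64,
}

/// Hypothetical error type for this sketch.
#[derive(Debug, PartialEq)]
pub enum VerifyError {
    MissingField(&'static str),
    InvalidPledge,
    NodeMismatch,
}

/// Parse space-separated `key=value` fields; unknown keys are ignored.
pub fn parse_txt(record: &str) -> Result<TxtRecord, VerifyError> {
    let mut version = None;
    let mut node = None;
    let mut pledge = None;
    for pair in record.split_whitespace() {
        if let Some((key, value)) = pair.split_once('=') {
            match key {
                "v" => version = Some(value.to_string()),
                "node" => node = Some(value.to_string()),
                "pledge" => pledge = Some(value.to_string()),
                _ => {}
            }
        }
    }
    let version = version.ok_or(VerifyError::MissingField("v"))?;
    let node = node.ok_or(VerifyError::MissingField("node"))?;
    let pledge = pledge
        .ok_or(VerifyError::MissingField("pledge"))?
        .parse::<u64>()
        .map_err(|_| VerifyError::InvalidPledge)?;
    Ok(TxtRecord { version, node, pledge })
}

/// Check that the record names this daemon's own node ID.
/// Hex case is ignored here, which is an assumption.
pub fn verify(record: &str, own_node_hex: &str) -> Result<TxtRecord, VerifyError> {
    let parsed = parse_txt(record)?;
    if !parsed.node.eq_ignore_ascii_case(own_node_hex) {
        return Err(VerifyError::NodeMismatch);
    }
    Ok(parsed)
}
```

<p>A caller following the startup behaviour described above would treat any <code>Err</code> as a reason to log a warning and continue as a normal full node.</p>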
<p><strong>CLI setup command</strong> (<code>tesseras-cli/src/institutional.rs</code>) — A new
<code>institutional setup</code> subcommand that guides operators through onboarding:</p>
<ol>
<li>Reads the node's identity from the data directory</li>
<li>Prompts for domain name and pledge size</li>
<li>Generates the exact DNS TXT record to add:
<code>v=tesseras1 node=&lt;hex&gt; pledge=&lt;bytes&gt;</code></li>
<li>Writes the institutional section to the daemon's config file</li>
<li>Prints next steps: add the TXT record, restart the daemon</li>
</ol>
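<p>As an illustration of step 3, the record name and value could be assembled like this. The function names are hypothetical, and the node ID length and lowercase hex output are assumptions; only the <code>_tesseras.&lt;domain&gt;</code> name and the <code>v=tesseras1 node=&lt;hex&gt; pledge=&lt;bytes&gt;</code> layout come from the text.</p>

```rust
/// Lowercase hex encoding of a node ID (length not specified in the article).
pub fn to_hex(bytes: &[u8]) -> String {
    bytes.iter().map(|b| format!("{:02x}", b)).collect()
}

/// DNS name the operator must create the TXT record under.
/// A trailing dot on the domain is tolerated and stripped.
pub fn txt_record_name(domain: &str) -> String {
    format!("_tesseras.{}", domain.trim_end_matches('.'))
}

/// Value of the TXT record in the documented key=value layout.
pub fn txt_record_value(node_id: &[u8], pledge_bytes: u64) -> String {
    format!("v=tesseras1 node={} pledge={}", to_hex(node_id), pledge_bytes)
}
```

<p>Printing both strings gives the operator something to paste directly into their DNS provider's interface.</p>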
<p><strong>SQLite search index</strong> (<code>tesseras-storage</code>) — A migration
(<code>003_institutional.sql</code>) that creates three structures:</p>
<ul>
<li><code>search_content</code> — an FTS5 virtual table for full-text search over tessera
metadata (title, description, creator, tags, language)</li>
<li><code>geo_index</code> — an R-tree virtual table for spatial bounding-box queries over
latitude/longitude</li>
<li><code>geo_map</code> — a mapping table linking R-tree row IDs to content hashes</li>
</ul>
<p>The <code>SqliteSearchIndex</code> adapter implements the <code>SearchIndex</code> port trait with
<code>index_tessera()</code> (insert/update) and <code>search()</code> (query with filters). FTS5
queries support natural language search; geo queries use R-tree <code>INTERSECT</code> for
bounding box lookups. Results are ranked by FTS5 relevance score.</p>
<p>The migration also adds an <code>is_institutional</code> column to the <code>reciprocity</code> table,
handled idempotently via <code>pragma_table_info</code> checks (SQLite's
<code>ALTER TABLE ADD COLUMN</code> lacks <code>IF NOT EXISTS</code>).</p>
<p><strong>Reciprocity bypass</strong> (<code>tesseras-replication/src/service.rs</code>) — Institutional
nodes are exempt from reciprocity checks. When <code>receive_fragment()</code> is called,
if the sender's node ID is marked as institutional in the reciprocity ledger,
the balance check is skipped entirely. This means institutions can store
fragments for the entire network without needing to "earn" credits first — their
DNS-verified identity and storage pledge serve as their credential.</p>
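<p>The bypass can be pictured with a small in-memory ledger. This is a sketch under assumptions: the real ledger is backed by SQLite, and the <code>Ledger</code> type, its <code>min_balance</code> threshold, and the method names below are invented for illustration.</p>

```rust
use std::collections::HashMap;

/// One row of a hypothetical reciprocity ledger.
pub struct LedgerEntry {
    pub balance: i64,
    pub is_institutional: bool,
}

/// In-memory stand-in for the reciprocity ledger.
pub struct Ledger {
    entries: HashMap<String, LedgerEntry>,
    /// Lowest balance a normal peer may have and still store fragments.
    min_balance: i64,
}

impl Ledger {
    pub fn new(min_balance: i64) -> Self {
        Ledger { entries: HashMap::new(), min_balance }
    }

    /// Insert or overwrite a peer's entry.
    pub fn record(&mut self, peer: &str, balance: i64, is_institutional: bool) {
        self.entries
            .insert(peer.to_string(), LedgerEntry { balance, is_institutional });
    }

    /// Institutional peers skip the balance check entirely;
    /// everyone else must be at or above the threshold.
    pub fn may_store(&self, peer: &str) -> bool {
        match self.entries.get(peer) {
            Some(entry) if entry.is_institutional => true,
            Some(entry) => entry.balance >= self.min_balance,
            // Unknown peers are treated as having a zero balance.
            None => 0 >= self.min_balance,
        }
    }
}
```

<p>The early match arm is the whole exemption: no credit value needs to be chosen or tuned for institutional peers.</p>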
<p><strong>Node-type diversity constraint</strong> (<code>tesseras-replication/src/distributor.rs</code>) —
A new <code>apply_institutional_diversity()</code> function limits how many replicas of a
single tessera can land on institutional nodes. The cap is
<code>ceil(replication_factor / 3.5)</code> — with the default <code>r=7</code>, at most 2 of 7
replicas go to institutions. This prevents the network from becoming dependent
on a small number of large institutions: if a university's servers go down, at
least 5 replicas remain on independent nodes.</p>
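<p>The cap arithmetic and a possible selection loop are sketched below. The formula <code>ceil(r / 3.5)</code> is from the text; the <code>Candidate</code> type, the function signature, and the first-come selection order are assumptions and may differ from the actual <code>apply_institutional_diversity()</code>.</p>

```rust
/// Maximum number of institutional replicas for a replication factor.
/// ceil(r / 3.5) equals ceil(2r / 7), which is computed in integers here.
pub fn institutional_cap(replication_factor: u32) -> u32 {
    (2 * replication_factor + 6) / 7
}

/// Hypothetical candidate node for holding a replica.
#[derive(Clone, Debug, PartialEq)]
pub struct Candidate {
    pub id: String,
    pub institutional: bool,
}

/// Walk the candidates in order and pick up to `replication_factor` of them,
/// skipping institutional nodes once the cap is reached.
pub fn apply_institutional_diversity(
    candidates: &[Candidate],
    replication_factor: u32,
) -> Vec<Candidate> {
    let cap = institutional_cap(replication_factor);
    let mut picked = Vec::new();
    let mut institutional = 0;
    for candidate in candidates {
        if picked.len() as u32 == replication_factor {
            break;
        }
        if candidate.institutional {
            if institutional == cap {
                continue;
            }
            institutional += 1;
        }
        picked.push(candidate.clone());
    }
    picked
}
```

<p>With <code>r=7</code> the cap evaluates to 2, so even a candidate list that starts with four institutional nodes ends up with two institutional and five independent replicas.</p>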
<p><strong>DHT message extensions</strong> (<code>tesseras-dht/src/message.rs</code>) — Two new message
variants:</p>
<table><thead><tr><th>Message</th><th>Purpose</th></tr></thead><tbody>
<tr><td><code>Search</code></td><td>Client sends query string, filters, and page number</td></tr>
<tr><td><code>SearchResult</code></td><td>Institutional node responds with hits and total count</td></tr>
</tbody></table>
<p>The <code>encode()</code> function was switched from positional to named MessagePack
serialization (<code>rmp_serde::to_vec_named</code>) to handle <code>SearchFilters</code>' optional
fields correctly — positional encoding breaks when <code>skip_serializing_if</code> omits
fields.</p>
<p><strong>Prometheus metrics</strong> (<code>tesd/src/metrics.rs</code>) — Eight institutional-specific
metrics:</p>
<ul>
<li><code>tesseras_institutional_pledge_bytes</code> — configured storage pledge</li>
<li><code>tesseras_institutional_stored_bytes</code> — actual bytes stored</li>
<li><code>tesseras_institutional_pledge_utilization_ratio</code> — stored/pledged ratio</li>
<li><code>tesseras_institutional_peers_served</code> — unique peers served fragments</li>
<li><code>tesseras_institutional_search_index_total</code> — tesseras in the search index</li>
<li><code>tesseras_institutional_search_queries_total</code> — search queries received</li>
<li><code>tesseras_institutional_dns_verification_status</code> — 1 if DNS verified, 0
otherwise</li>
<li><code>tesseras_institutional_dns_verification_last</code> — Unix timestamp of last
verification</li>
</ul>
<p><strong>Integration tests</strong> — Two tests in
<code>tesseras-replication/tests/integration.rs</code>:</p>
<ul>
<li><code>institutional_peer_bypasses_reciprocity</code> — verifies that an institutional
peer with a massive deficit (-999,999 balance) is still allowed to store
fragments, while a non-institutional peer with the same deficit is rejected</li>
<li><code>institutional_node_accepts_fragment_despite_deficit</code> — full async test using
<code>ReplicationService</code> with mocked DHT, fragment store, reciprocity ledger, and
blob store: sends a fragment from an institutional sender and verifies it's
accepted</li>
</ul>
<p>322 tests pass across the workspace. Clippy clean with <code>-D warnings</code>.</p>
<h2 id="architecture-decisions">Architecture decisions</h2>
<ul>
<li><strong>DNS TXT over PKI or blockchain</strong>: DNS is universally deployed, universally
understood, and already used for domain verification (SPF, DKIM, Let's
Encrypt). Institutions already manage DNS. No certificate authority, no token,
no on-chain transaction — just a TXT record. If an institution loses control
of their domain, the verification naturally fails on the next check.</li>
<li><strong>Graceful degradation on DNS failure</strong>: if DNS verification fails at startup,
the daemon downgrades to a normal full node instead of refusing to start. This
prevents operational incidents — a DNS misconfiguration shouldn't take a node
offline.</li>
<li><strong>Diversity cap at <code>ceil(r / 3.5)</code></strong>: with <code>r=7</code>, at most 2 replicas go to
institutions. This is conservative — it ensures the network never depends on
institutions for majority quorum, while still benefiting from their storage
capacity and uptime.</li>
<li><strong>Named MessagePack encoding</strong>: switching from positional to named encoding
adds ~15% overhead per message but eliminates a class of serialization bugs
when optional fields are present. The DHT is not bandwidth-constrained at the
message level, so the tradeoff is worth it.</li>
<li><strong>Reciprocity exemption over credit grants</strong>: rather than giving institutions
a large initial credit balance (which is arbitrary and needs tuning), we
exempt them entirely. Their DNS-verified identity and public storage pledge
replace the bilateral reciprocity mechanism.</li>
<li><strong>FTS5 + R-tree in SQLite</strong>: full-text search and spatial indexing ship
with SQLite as optional extensions that can be compiled into the library. No external search engine (Elasticsearch,
Meilisearch) needed. This keeps the deployment a single binary with a single
database file — critical for institutional operators who may not have a DevOps
team.</li>
</ul>
<h2 id="what-comes-next">What comes next</h2>
<ul>
<li><strong>Phase 4 continued</strong> — storage deduplication (content-addressable store with
BLAKE3 keying), security audits, OS packaging (Alpine, Arch, Debian, OpenBSD,
FreeBSD)</li>
<li><strong>Phase 5: Exploration and Culture</strong> — public tessera browser by
era/location/theme/language, institutional curation, genealogy integration
(FamilySearch, Ancestry), physical media export (M-DISC, microfilm, acid-free
paper with QR), AI-assisted context</li>
</ul>
<p>Institutional onboarding closes a critical gap in Tesseras' preservation model.
Individual nodes provide grassroots resilience — thousands of devices across the
globe, each storing a few fragments. Institutional nodes provide anchoring —
organizations with professional infrastructure, redundant storage, and
multi-decade operational horizons. Together, they form a network where memories
can outlast both individual devices and individual institutions.</p>

</article>

    </main>

    <footer>
        <p>&copy; 2026 Tesseras Project. <a href="/atom.xml">News Feed</a> · <a href="https://git.sr.ht/~ijanc/tesseras">Source</a></p>
    </footer>
</body>
</html>