Phase 4: Performance Tuning
2026-02-15
A P2P network that can traverse NATs but chokes on its own I/O is not much use. Phase 4 continues with performance tuning: centralizing database configuration, caching fragment blobs in memory, managing QUIC connection lifecycles, and eliminating unnecessary disk reads from the attestation hot path.
The guiding principle was the same as the rest of Tesseras: do the simplest thing that actually works. No custom allocators, no lock-free data structures, no premature complexity. A centralized `StorageConfig`, an LRU cache, a connection reaper, and a targeted fix to avoid re-reading blobs that were already checksummed.
What was built
**Centralized SQLite configuration** (`tesseras-storage/src/database.rs`) — A new `StorageConfig` struct and `open_database()` / `open_in_memory()` functions that apply all SQLite pragmas in one place: WAL journal mode, foreign keys, synchronous mode (NORMAL by default, FULL for unstable hardware like an RPi with an SD card), busy timeout, page cache size, and WAL autocheckpoint interval. Previously, each call site opened a connection and applied pragmas ad hoc; now the daemon, CLI, and tests all go through the same path. Seven tests cover foreign keys, busy timeout, journal mode, migrations, synchronous modes, and on-disk WAL file creation.
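A minimal sketch of the shape this takes, assuming rusqlite underneath. The pragma list matches the post; the field names and defaults are illustrative, not the actual tesseras-storage API:

```rust
use std::path::Path;
use std::time::Duration;

use rusqlite::{Connection, Result};

/// Sketch only: pragmas from the post, field names and defaults are guesses.
pub struct StorageConfig {
    pub synchronous_full: bool,  // FULL for unstable hardware, NORMAL otherwise
    pub busy_timeout: Duration,
    pub cache_size_kib: u32,     // SQLite page cache budget
    pub wal_autocheckpoint: u32, // WAL pages between automatic checkpoints
}

impl Default for StorageConfig {
    fn default() -> Self {
        Self {
            synchronous_full: false,
            busy_timeout: Duration::from_secs(5),
            cache_size_kib: 8192,
            wal_autocheckpoint: 1000,
        }
    }
}

pub fn open_database(path: &Path, cfg: &StorageConfig) -> Result<Connection> {
    let conn = Connection::open(path)?;
    // journal_mode and wal_autocheckpoint return a result row, so they go
    // through query_row rather than pragma_update.
    conn.query_row("PRAGMA journal_mode = WAL", [], |_| Ok(()))?;
    conn.query_row(
        &format!("PRAGMA wal_autocheckpoint = {}", cfg.wal_autocheckpoint),
        [],
        |_| Ok(()),
    )?;
    conn.pragma_update(None, "foreign_keys", "ON")?;
    let sync = if cfg.synchronous_full { "FULL" } else { "NORMAL" };
    conn.pragma_update(None, "synchronous", sync)?;
    // A negative cache_size is interpreted by SQLite as KiB instead of pages.
    conn.pragma_update(None, "cache_size", -(cfg.cache_size_kib as i64))?;
    conn.busy_timeout(cfg.busy_timeout)?;
    Ok(conn)
}
```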
**LRU fragment cache** (`tesseras-storage/src/cache.rs`) — A `CachedFragmentStore` that wraps any `FragmentStore` with a byte-aware LRU cache. Fragment blobs are cached on read and invalidated on write or delete. When the cache exceeds its configured byte limit, the least recently used entries are evicted. The cache is transparent: it implements `FragmentStore` itself, so the rest of the stack doesn't know it's there. Optional Prometheus metrics track hits, misses, and current byte usage. Three tests: a cache hit avoids the inner read, a store invalidates the cache, and eviction triggers when over max bytes.
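A condensed sketch of the decorator, under assumed trait signatures (the real cache also carries the optional metrics, omitted here):

```rust
use std::collections::HashMap;
use std::io;
use std::sync::Mutex;

/// Assumed shape of the trait; the real tesseras-storage definition may differ.
pub trait FragmentStore: Send + Sync {
    fn read_fragment(&self, id: &str) -> io::Result<Vec<u8>>;
    fn store_fragment(&self, id: &str, data: &[u8]) -> io::Result<()>;
    fn delete_fragment(&self, id: &str) -> io::Result<()>;
}

struct CacheState {
    map: HashMap<String, Vec<u8>>,
    order: Vec<String>, // front = least recently used, back = most recent
    bytes: usize,
}

pub struct CachedFragmentStore {
    inner: Box<dyn FragmentStore>,
    max_bytes: usize,
    state: Mutex<CacheState>,
}

impl CachedFragmentStore {
    pub fn new(inner: Box<dyn FragmentStore>, max_bytes: usize) -> Self {
        let state = CacheState { map: HashMap::new(), order: Vec::new(), bytes: 0 };
        Self { inner, max_bytes, state: Mutex::new(state) }
    }

    fn invalidate(state: &mut CacheState, id: &str) {
        if let Some(blob) = state.map.remove(id) {
            state.bytes -= blob.len();
            state.order.retain(|k| k != id);
        }
    }

    fn insert(state: &mut CacheState, max_bytes: usize, id: &str, blob: &[u8]) {
        Self::invalidate(state, id);
        state.map.insert(id.to_string(), blob.to_vec());
        state.order.push(id.to_string());
        state.bytes += blob.len();
        // Byte-aware eviction: drop LRU entries until we fit the budget.
        while state.bytes > max_bytes && !state.order.is_empty() {
            let victim = state.order.remove(0);
            if let Some(old) = state.map.remove(&victim) {
                state.bytes -= old.len();
            }
        }
    }
}

impl FragmentStore for CachedFragmentStore {
    fn read_fragment(&self, id: &str) -> io::Result<Vec<u8>> {
        let mut state = self.state.lock().unwrap();
        if let Some(blob) = state.map.get(id).cloned() {
            // Cache hit: refresh recency, skip the inner (disk) read.
            state.order.retain(|k| k != id);
            state.order.push(id.to_string());
            return Ok(blob);
        }
        drop(state); // release the lock across the inner read
        let blob = self.inner.read_fragment(id)?;
        let mut state = self.state.lock().unwrap();
        Self::insert(&mut state, self.max_bytes, id, &blob);
        Ok(blob)
    }

    fn store_fragment(&self, id: &str, data: &[u8]) -> io::Result<()> {
        self.inner.store_fragment(id, data)?;
        Self::invalidate(&mut self.state.lock().unwrap(), id);
        Ok(())
    }

    fn delete_fragment(&self, id: &str) -> io::Result<()> {
        self.inner.delete_fragment(id)?;
        Self::invalidate(&mut self.state.lock().unwrap(), id);
        Ok(())
    }
}
```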
**Prometheus storage metrics** (`tesseras-storage/src/metrics.rs`) — A `StorageMetrics` struct with three counters/gauges: `fragment_cache_hits`, `fragment_cache_misses`, and `fragment_cache_bytes`. Registered with the Prometheus registry and wired into the fragment cache via `with_metrics()`.
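Roughly, using the prometheus crate. The metric names are from the post; the `register()` constructor shape is an assumption:

```rust
use prometheus::{IntCounter, IntGauge, Registry};

/// Metric names are from the post; the constructor shape is illustrative.
pub struct StorageMetrics {
    pub fragment_cache_hits: IntCounter,
    pub fragment_cache_misses: IntCounter,
    pub fragment_cache_bytes: IntGauge,
}

impl StorageMetrics {
    pub fn register(registry: &Registry) -> prometheus::Result<Self> {
        let hits = IntCounter::new("fragment_cache_hits", "Fragment cache hits")?;
        let misses = IntCounter::new("fragment_cache_misses", "Fragment cache misses")?;
        let bytes = IntGauge::new("fragment_cache_bytes", "Bytes held by the fragment cache")?;
        // Counters are cheap clones around shared state, so the struct keeps
        // one handle and the registry keeps another.
        registry.register(Box::new(hits.clone()))?;
        registry.register(Box::new(misses.clone()))?;
        registry.register(Box::new(bytes.clone()))?;
        Ok(Self {
            fragment_cache_hits: hits,
            fragment_cache_misses: misses,
            fragment_cache_bytes: bytes,
        })
    }
}
```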
**Attestation hot path fix** (`tesseras-replication/src/service.rs`) — The attestation flow previously read every fragment blob from disk and recomputed its BLAKE3 checksum. Since `list_fragments()` already returns `FragmentId`s with a stored checksum, the fix is trivial: use `frag.checksum` instead of `blake3::hash(&data)`. This eliminates one disk read per fragment during attestation; for a tessera with 100 fragments, that's 100 fewer reads. A test with `expect_read_fragment().never()` verifies that no blob reads happen during attestation.
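The shape of the change, approximately (`FragmentId` is simplified down to the one field that matters here):

```rust
// Simplified stand-in: only the stored checksum field is relevant to the fix.
struct FragmentId {
    checksum: blake3::Hash,
}

fn attestation_digests(fragments: &[FragmentId]) -> Vec<blake3::Hash> {
    fragments
        .iter()
        // Before: let data = store.read_fragment(frag)?; // disk read
        //         blake3::hash(&data)                    // recompute
        .map(|frag| frag.checksum) // after: reuse the stored checksum, zero I/O
        .collect()
}
```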
**QUIC connection pool lifecycle** (`tesseras-net/src/quinn_transport.rs`) — A `PoolConfig` struct controlling max connections, idle timeout, and reaper interval. `PooledConnection` wraps each `quinn::Connection` with a `last_used` timestamp. When the pool reaches capacity, the oldest idle connection is evicted before opening a new one. A background reaper task (a Tokio spawn) periodically closes connections that have been idle beyond the timeout. Four new pool metrics: `tesseras_conn_pool_size`, `pool_hits_total`, `pool_misses_total`, and `pool_evictions_total`.
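A sketch of the reaper half, assuming the pool is shared as an `Arc<DashMap<...>>` (capacity eviction at connect time is not shown):

```rust
use std::net::SocketAddr;
use std::sync::Arc;
use std::time::{Duration, Instant};

use dashmap::DashMap;

/// Sketch: field names follow the post, everything else is illustrative.
pub struct PoolConfig {
    pub max_connections: usize,
    pub idle_timeout: Duration,
    pub reaper_interval: Duration,
}

pub struct PooledConnection {
    pub conn: quinn::Connection,
    pub last_used: Instant,
}

/// Background reaper: close and drop connections idle beyond the timeout.
pub fn spawn_reaper(
    pool: Arc<DashMap<SocketAddr, PooledConnection>>,
    cfg: PoolConfig,
) -> tokio::task::JoinHandle<()> {
    tokio::spawn(async move {
        let mut tick = tokio::time::interval(cfg.reaper_interval);
        loop {
            tick.tick().await;
            pool.retain(|_addr, pc| {
                let keep = pc.last_used.elapsed() < cfg.idle_timeout;
                if !keep {
                    // Application-defined error code 0: "closed as idle".
                    pc.conn.close(0u32.into(), b"idle timeout");
                }
                keep
            });
        }
    })
}
```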
**Daemon integration** (`tesd/src/config.rs`, `main.rs`) — A new `[performance]` section in the TOML config with fields for SQLite cache size, synchronous mode, busy timeout, fragment cache size, max connections, idle timeout, and reaper interval. The daemon's `main()` now calls `open_database()` with the configured `StorageConfig`, wraps `FsFragmentStore` with `CachedFragmentStore`, and binds QUIC with the configured `PoolConfig`. The direct rusqlite dependency was removed from the daemon crate.
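A plausible shape for the config plumbing. Only `sqlite_synchronous_full` is named in the post (see the architecture decisions below); the other field names and all values are assumptions:

```rust
use serde::Deserialize;

/// Sketch of the [performance] section; field names are assumptions.
#[derive(Debug, Deserialize)]
struct PerformanceConfig {
    sqlite_cache_size_kib: u32,
    sqlite_synchronous_full: bool,
    sqlite_busy_timeout_ms: u64,
    fragment_cache_max_bytes: usize,
    max_connections: usize,
    idle_timeout_secs: u64,
    reaper_interval_secs: u64,
}

#[derive(Debug, Deserialize)]
struct Config {
    performance: PerformanceConfig,
}

fn main() {
    // Example values only; nothing here is from the actual tesd defaults.
    let src = r#"
        [performance]
        sqlite_cache_size_kib = 8192
        sqlite_synchronous_full = false
        sqlite_busy_timeout_ms = 5000
        fragment_cache_max_bytes = 67108864  # 64 MiB
        max_connections = 256
        idle_timeout_secs = 300
        reaper_interval_secs = 60
    "#;
    let cfg: Config = toml::from_str(src).expect("valid [performance] section");
    println!("{:#?}", cfg.performance);
}
```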
**CLI migration** (`tesseras-cli/src/commands/init.rs`, `create.rs`) — Both `init` and `create` commands now use `tesseras_storage::open_database()` with the default `StorageConfig` instead of opening raw rusqlite connections. The rusqlite dependency was removed from the CLI crate.
Architecture decisions
- **Decorator pattern for caching:** `CachedFragmentStore` wraps a `Box<dyn FragmentStore>` and implements `FragmentStore` itself. This means caching is opt-in, composable, and invisible to consumers: the daemon enables it; tests can skip it. (See the composition sketch after this list.)
- **Byte-aware eviction:** the LRU cache tracks total bytes, not entry count. Fragment blobs vary wildly in size (a 4 KB text fragment vs a 2 MB photo shard), so counting entries would give a misleading picture of memory usage.
- **No connection pool crate:** instead of pulling in a generic pool library, the connection pool is a thin wrapper around `DashMap<SocketAddr, PooledConnection>` with a Tokio reaper. QUIC connections are multiplexed, so the "pool" is really about lifecycle management (idle cleanup, max connections) rather than borrowing/returning.
- **Stored checksums over re-reads:** the attestation fix is intentionally minimal: one line changed, one disk read removed per fragment. The checksums were already stored in SQLite by `store_fragment()`; they just weren't being used.
- **Centralized pragma configuration:** a single `StorageConfig` struct replaces scattered `PRAGMA` calls. The `sqlite_synchronous_full` flag exists specifically for Raspberry Pi deployments where the kernel can crash and lose un-checkpointed WAL transactions.
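For concreteness, roughly how the opt-in composition looks, reusing the `FragmentStore` / `CachedFragmentStore` sketch from the cache section above (`FsFragmentStore::new()` is an assumed constructor):

```rust
// Caching is opt-in: the daemon wraps the filesystem store; tests can use
// the inner store directly. Builds on the FragmentStore sketch above.
fn build_store(fragments_dir: &std::path::Path, cache_bytes: usize) -> Box<dyn FragmentStore> {
    let inner: Box<dyn FragmentStore> = Box::new(FsFragmentStore::new(fragments_dir));
    if cache_bytes > 0 {
        // Daemon path: wrap with the byte-aware LRU cache.
        Box::new(CachedFragmentStore::new(inner, cache_bytes))
    } else {
        // Test path: no cache, and callers can't tell the difference.
        inner
    }
}
```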
What comes next
- **Phase 4 continued** — Shamir's Secret Sharing for heirs, sealed tesseras (time-lock encryption), security audits, institutional node onboarding, storage deduplication, OS packaging
- **Phase 5: Exploration and Culture** — public tessera browser by era/location/theme/language, institutional curation, genealogy integration, physical media export (M-DISC, microfilm, acid-free paper with QR)
With performance tuning in place, Tesseras handles the common case efficiently: fragment reads hit the LRU cache, attestation skips disk I/O, idle QUIC connections are reaped automatically, and SQLite is configured consistently across the entire stack. The next steps focus on cryptographic features (Shamir, time-lock) and hardening for production deployment.