Phase 4: Performance Tuning
2026-02-15
A P2P network that can traverse NATs but chokes on its own I/O is not much use. Phase 4 continues with performance tuning: centralizing database configuration, caching fragment blobs in memory, managing QUIC connection lifecycles, and eliminating unnecessary disk reads from the attestation hot path.
The guiding principle was the same as the rest of Tesseras: do the simplest thing that actually works. No custom allocators, no lock-free data structures, no premature complexity. A centralized StorageConfig, an LRU cache, a connection reaper, and a targeted fix to avoid re-reading blobs that were already checksummed.
What was built
Centralized SQLite configuration (tesseras-storage/src/database.rs) — A new StorageConfig struct and open_database() / open_in_memory() functions that apply all SQLite pragmas in one place: WAL journal mode, foreign keys, synchronous mode (NORMAL by default, FULL for unstable hardware like RPi + SD card), busy timeout, page cache size, and WAL autocheckpoint interval. Previously, each call site opened a connection and applied pragmas ad hoc. Now the daemon, CLI, and tests all go through the same path. 7 tests covering foreign keys, busy timeout, journal mode, migrations, synchronous modes, and on-disk WAL file creation.
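As a rough illustration, here is what the centralized path can look like with rusqlite. The field names, defaults, and exact open_database() signature are assumptions for this post rather than the real tesseras-storage API; only the pragmas themselves come from the list above.

```rust
use std::path::Path;
use std::time::Duration;

use rusqlite::Connection;

/// Illustrative subset of the knobs described above; the real struct has more fields.
pub struct StorageConfig {
    pub synchronous_full: bool,        // FULL for unstable hardware, NORMAL otherwise
    pub busy_timeout: Duration,        // how long a writer waits on a locked database
    pub cache_size_kib: i64,           // SQLite page cache size, in KiB
    pub wal_autocheckpoint_pages: i64, // pages between automatic WAL checkpoints
}

impl Default for StorageConfig {
    fn default() -> Self {
        Self {
            synchronous_full: false,
            busy_timeout: Duration::from_secs(5),
            cache_size_kib: 8192,
            wal_autocheckpoint_pages: 1000,
        }
    }
}

/// Every connection, whether from the daemon, the CLI, or a test, goes through here.
pub fn open_database(path: &Path, cfg: &StorageConfig) -> rusqlite::Result<Connection> {
    let conn = Connection::open(path)?;
    conn.pragma_update(None, "journal_mode", "WAL")?;
    conn.pragma_update(None, "foreign_keys", true)?;
    conn.pragma_update(
        None,
        "synchronous",
        if cfg.synchronous_full { "FULL" } else { "NORMAL" },
    )?;
    conn.busy_timeout(cfg.busy_timeout)?;
    // A negative cache_size tells SQLite to interpret the value as KiB rather than pages.
    conn.pragma_update(None, "cache_size", -cfg.cache_size_kib)?;
    conn.pragma_update(None, "wal_autocheckpoint", cfg.wal_autocheckpoint_pages)?;
    Ok(conn)
}
```

The point is that every pragma lives in one function, so a test opening an in-memory database exercises the same settings the daemon runs with.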
LRU fragment cache (tesseras-storage/src/cache.rs) — A CachedFragmentStore that wraps any FragmentStore with a byte-aware LRU cache. Fragment blobs are cached on read and invalidated on write or delete. When the cache exceeds its configured byte limit, the least recently used entries are evicted. The cache is transparent: it implements FragmentStore itself, so the rest of the stack doesn't know it's there. Optional Prometheus metrics track hits, misses, and current byte usage. 3 tests: cache hit avoids inner read, store invalidates cache, eviction when over max bytes.
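The byte-aware eviction rule is easiest to see in isolation. Below is a minimal, standard-library-only sketch of that idea; the real CachedFragmentStore, its FragmentId type, and its internals are not spelled out in this post, so everything here is illustrative.

```rust
use std::collections::{HashMap, VecDeque};

// Hypothetical stand-in for the real id type in tesseras-storage.
type FragmentId = [u8; 32];

/// Byte-aware LRU: entries are evicted by total blob size, not entry count.
struct LruBlobCache {
    max_bytes: usize,
    current_bytes: usize,
    entries: HashMap<FragmentId, Vec<u8>>,
    // Most recently used ids sit at the back; duplicates are tolerated and skipped on eviction.
    order: VecDeque<FragmentId>,
}

impl LruBlobCache {
    fn new(max_bytes: usize) -> Self {
        Self { max_bytes, current_bytes: 0, entries: HashMap::new(), order: VecDeque::new() }
    }

    fn get(&mut self, id: &FragmentId) -> Option<Vec<u8>> {
        let blob = self.entries.get(id).cloned()?;
        // Refresh recency by pushing the id to the back again.
        self.order.push_back(*id);
        Some(blob)
    }

    fn insert(&mut self, id: FragmentId, blob: Vec<u8>) {
        self.remove(&id); // drop any stale copy before accounting for the new one
        self.current_bytes += blob.len();
        self.entries.insert(id, blob);
        self.order.push_back(id);
        // Evict least recently used entries until we are back under the byte limit.
        while self.current_bytes > self.max_bytes {
            let Some(victim) = self.order.pop_front() else { break };
            if self.order.contains(&victim) {
                continue; // a fresher use of this id exists further back
            }
            self.remove(&victim);
        }
    }

    fn remove(&mut self, id: &FragmentId) {
        if let Some(old) = self.entries.remove(id) {
            self.current_bytes -= old.len();
        }
    }
}
```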
Prometheus storage metrics (tesseras-storage/src/metrics.rs) — A StorageMetrics struct with three counters/gauges: fragment_cache_hits, fragment_cache_misses, and fragment_cache_bytes. Registered with the Prometheus registry and wired into the fragment cache via with_metrics().
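For reference, one plausible way to define and register those three series with the prometheus crate; the actual with_metrics() wiring is not shown here.

```rust
use prometheus::{IntCounter, IntGauge, Registry};

/// Illustrative shape of the three metrics named above.
pub struct StorageMetrics {
    pub fragment_cache_hits: IntCounter,
    pub fragment_cache_misses: IntCounter,
    pub fragment_cache_bytes: IntGauge,
}

impl StorageMetrics {
    pub fn register(registry: &Registry) -> prometheus::Result<Self> {
        let hits = IntCounter::new("fragment_cache_hits", "Fragment cache hits")?;
        let misses = IntCounter::new("fragment_cache_misses", "Fragment cache misses")?;
        let bytes = IntGauge::new("fragment_cache_bytes", "Bytes currently held in the fragment cache")?;
        // Each handle is cloned into the registry; the struct keeps the other clone for updates.
        registry.register(Box::new(hits.clone()))?;
        registry.register(Box::new(misses.clone()))?;
        registry.register(Box::new(bytes.clone()))?;
        Ok(Self {
            fragment_cache_hits: hits,
            fragment_cache_misses: misses,
            fragment_cache_bytes: bytes,
        })
    }
}
```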
Attestation hot path fix (tesseras-replication/src/service.rs) — The attestation flow previously read every fragment blob from disk and recomputed its BLAKE3 checksum. Since list_fragments() already returns FragmentId with a stored checksum, the fix is trivial: use frag.checksum instead of blake3::hash(&data). This eliminates one disk read per fragment during attestation — for a tessera with 100 fragments, that's 100 fewer reads. A test with expect_read_fragment().never() verifies no blob reads happen during attestation.
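Sketched with hypothetical types, the shape of the change looks roughly like this; the real FragmentId fields and the async store API are simplified away.

```rust
use blake3::Hash;

// Hypothetical shape; the real FragmentId comes from list_fragments() in tesseras-storage.
struct FragmentId {
    id: [u8; 32],
    checksum: Hash, // persisted by store_fragment() at write time
}

// Before (illustrative): one disk read plus one BLAKE3 pass per fragment.
fn attest_fragment_old(frag: &FragmentId, read_blob: impl Fn(&[u8; 32]) -> Vec<u8>) -> Hash {
    let data = read_blob(&frag.id);
    blake3::hash(&data)
}

// After: the stored checksum is already there, so no blob read at all.
fn attest_fragment_new(frag: &FragmentId) -> Hash {
    frag.checksum
}
```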
QUIC connection pool lifecycle (tesseras-net/src/quinn_transport.rs) — A PoolConfig struct controlling max connections, idle timeout, and reaper interval. PooledConnection wraps each quinn::Connection with a last_used timestamp. When the pool reaches capacity, the oldest idle connection is evicted before opening a new one. A background reaper task (Tokio spawn) periodically closes connections that have been idle beyond the timeout. 4 new pool metrics: tesseras_conn_pool_size, pool_hits_total, pool_misses_total, pool_evictions_total.
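A sketch of the reaper half, assuming the DashMap-based layout described in the architecture notes below; the field names and the capacity-eviction path are illustrative.

```rust
use std::net::SocketAddr;
use std::sync::Arc;
use std::time::{Duration, Instant};

use dashmap::DashMap;

/// Illustrative knobs; the real PoolConfig carries the same three ideas.
#[derive(Clone)]
pub struct PoolConfig {
    pub max_connections: usize,
    pub idle_timeout: Duration,
    pub reaper_interval: Duration,
}

/// Each live connection remembers when it was last handed out.
struct PooledConnection {
    conn: quinn::Connection,
    last_used: Instant,
}

/// Background task that closes connections idle past the configured timeout.
fn spawn_reaper(
    pool: Arc<DashMap<SocketAddr, PooledConnection>>,
    cfg: PoolConfig,
) -> tokio::task::JoinHandle<()> {
    tokio::spawn(async move {
        let mut ticker = tokio::time::interval(cfg.reaper_interval);
        loop {
            ticker.tick().await;
            let now = Instant::now();
            // retain() drops every entry whose closure returns false.
            pool.retain(|_addr, entry| {
                let keep = now.duration_since(entry.last_used) < cfg.idle_timeout;
                if !keep {
                    entry.conn.close(quinn::VarInt::from_u32(0), b"idle");
                }
                keep
            });
        }
    })
}
```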
Daemon integration (tesd/src/config.rs, main.rs) — A new [performance] section in the TOML config with fields for SQLite cache size, synchronous mode, busy timeout, fragment cache size, max connections, idle timeout, and reaper interval. The daemon's main() now calls open_database() with the configured StorageConfig, wraps FsFragmentStore with CachedFragmentStore, and binds QUIC with the configured PoolConfig. The direct rusqlite dependency was removed from the daemon crate.
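A rough sketch of that wiring, with hypothetical [performance] field names and constructor signatures; the real ones live in tesd/src/config.rs and tesseras-storage.

```rust
use std::error::Error;
use std::path::Path;
use std::time::Duration;

use tesseras_storage::{open_database, CachedFragmentStore, FragmentStore, FsFragmentStore, StorageConfig};

// Hypothetical deserialized [performance] section.
struct PerformanceConfig {
    sqlite_synchronous_full: bool,
    sqlite_busy_timeout_ms: u64,
    fragment_cache_bytes: usize,
}

fn build_storage(
    perf: &PerformanceConfig,
    data_dir: &Path,
) -> Result<Box<dyn FragmentStore>, Box<dyn Error>> {
    // All pragmas flow through the shared StorageConfig instead of ad-hoc PRAGMA calls.
    let storage_cfg = StorageConfig {
        synchronous_full: perf.sqlite_synchronous_full,
        busy_timeout: Duration::from_millis(perf.sqlite_busy_timeout_ms),
        ..StorageConfig::default()
    };
    // The real main() hands this handle to the daemon's services; ignored here.
    let _db = open_database(&data_dir.join("tesseras.db"), &storage_cfg)?;

    // The cache is purely a decorator: callers only ever see `dyn FragmentStore`.
    let inner = FsFragmentStore::new(data_dir.join("fragments"));
    Ok(Box::new(CachedFragmentStore::new(
        Box::new(inner),
        perf.fragment_cache_bytes,
    )))
}
```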
CLI migration (tesseras-cli/src/commands/init.rs, create.rs) — Both init and create commands now use tesseras_storage::open_database() with the default StorageConfig instead of opening raw rusqlite connections. The rusqlite dependency was removed from the CLI crate.
Architecture decisions
- Decorator pattern for caching: CachedFragmentStore wraps Box<dyn FragmentStore> and implements FragmentStore itself. This means caching is opt-in, composable, and invisible to consumers. The daemon enables it; tests can skip it (see the sketch after this list).
- Byte-aware eviction: the LRU cache tracks total bytes, not entry count. Fragment blobs vary wildly in size (a 4KB text fragment vs a 2MB photo shard), so counting entries would give a misleading picture of memory usage.
- No connection pool crate: instead of pulling in a generic pool library, the connection pool is a thin wrapper around DashMap<SocketAddr, PooledConnection> with a Tokio reaper. QUIC connections are multiplexed, so the "pool" is really about lifecycle management (idle cleanup, max connections) rather than borrowing/returning.
- Stored checksums over re-reads: the attestation fix is intentionally minimal — one line changed, one disk read removed per fragment. The checksums were already stored in SQLite by store_fragment(); they just weren't being used.
- Centralized pragma configuration: a single StorageConfig struct replaces scattered PRAGMA calls. The sqlite_synchronous_full flag exists specifically for Raspberry Pi deployments where the kernel can crash and lose un-checkpointed WAL transactions.
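For the decorator point above, a stripped-down sketch of the shape; the real trait is async and richer, and the cache lookup is elided here to keep the delegation visible.

```rust
/// Hypothetical, synchronous stand-in for the real trait in tesseras-storage.
trait FragmentStore {
    fn read_fragment(&self, id: &[u8; 32]) -> std::io::Result<Vec<u8>>;
    fn store_fragment(&self, id: &[u8; 32], blob: &[u8]) -> std::io::Result<()>;
    fn delete_fragment(&self, id: &[u8; 32]) -> std::io::Result<()>;
}

/// The decorator owns the store it wraps; consumers only ever see the trait.
struct CachedFragmentStore {
    inner: Box<dyn FragmentStore>,
    // cache state omitted; see the byte-aware LRU sketch earlier
}

impl FragmentStore for CachedFragmentStore {
    fn read_fragment(&self, id: &[u8; 32]) -> std::io::Result<Vec<u8>> {
        // In the real version a cache lookup happens first; only the delegation is shown here.
        self.inner.read_fragment(id)
    }

    fn store_fragment(&self, id: &[u8; 32], blob: &[u8]) -> std::io::Result<()> {
        // In the real version the cached copy is invalidated; the delegation is the same.
        self.inner.store_fragment(id, blob)
    }

    fn delete_fragment(&self, id: &[u8; 32]) -> std::io::Result<()> {
        self.inner.delete_fragment(id)
    }
}
```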
What comes next
- Phase 4 continued — Shamir's Secret Sharing for heirs, sealed tesseras (time-lock encryption), security audits, institutional node onboarding, storage deduplication, OS packaging
- Phase 5: Exploration and Culture — public tessera browser by era/location/theme/language, institutional curation, genealogy integration, physical media export (M-DISC, microfilm, acid-free paper with QR)
With performance tuning in place, Tesseras handles the common case efficiently: fragment reads hit the LRU cache, attestation skips disk I/O, idle QUIC connections are reaped automatically, and SQLite is configured consistently across the entire stack. The next steps focus on cryptographic features (Shamir, time-lock) and hardening for production deployment.