# Scale envelope
Numbers from `make scale-check` against the demo compose stack
(Postgres 16, single host, no tuning). The seed builds a 30-device
fabric (20 access + 10 core), 240 ports, and 20,000
`mac_observation` rows (10,000 host CAM entries + 10,000 trunk-learned
entries on the core side).
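For orientation, a minimal sketch of that fabric's shape. The counts come from the description above, the `sw-a-NN` / `sw-c-NN` naming from the trace discussion later on this page; the uniform 8-ports-per-device split (240 / 30) is an assumption, and the real seed lives in `scripts/scale_check.py`:

```python
# Illustrative only -- the real seed is scripts/scale_check.py.
# 20 access + 10 core devices; 8 ports per device is an ASSUMPTION
# that merely makes the arithmetic land on the documented 240 ports.
ACCESS_COUNT, CORE_COUNT, PORTS_PER_DEVICE = 20, 10, 8

access = [f"sw-a-{i:02d}" for i in range(1, ACCESS_COUNT + 1)]
core = [f"sw-c-{i:02d}" for i in range(1, CORE_COUNT + 1)]
ports = [(dev, p) for dev in access + core
         for p in range(1, PORTS_PER_DEVICE + 1)]
assert len(ports) == 240
```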
## Query latencies at 20k rows

| query | rows returned | latency_ms |
|---|---|---|
| `live_fdb()` | 20,001 | 63.0 |
| `disagreements()` | 0 | 21.9 |
| `mac_history` (1 MAC) | 2 | 2.9 |
| `traceroute` (3 hops, reached) | 3 | 9.8 |
| `bulk_lookup_ouis` (100 MACs) | 100 | 1.7 |
All queries land under 70 ms. The slowest is `live_fdb()`, which has to
group and aggregate every currently-open observation. The fastest are
the targeted lookups (history for one MAC, OUI bulk lookup), which use
direct B-tree probes.
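To make the contrast concrete, here is a hypothetical sketch of the two query shapes. The real SQL behind `live_fdb()` and `mac_history` lives in the app and will differ; the column names are taken from the partial-index definition quoted in the next section:

```python
# Hypothetical query shapes -- not the app's actual SQL.

# live_fdb(): must visit every open observation, then group/aggregate,
# so latency scales with the number of open rows (20,001 here).
LIVE_FDB_SHAPE = """
SELECT device_id, mac, vlan, array_agg(source) AS sources
FROM   mac_observation
WHERE  upper_inf(valid_during) AND upper_inf(recorded_during)
GROUP  BY device_id, mac, vlan
"""

# mac_history(mac): a direct B-tree probe on one key, so latency stays
# near-constant regardless of table size (2 rows, 2.9 ms above).
MAC_HISTORY_SHAPE = """
SELECT device_id, vlan, source, valid_during
FROM   mac_observation
WHERE  mac = %(mac)s
ORDER  BY lower(valid_during)
"""
```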
## What the query plans tell us

`EXPLAIN ANALYZE` on `live_fdb()`:

```
GroupAggregate
  -> Incremental Sort (Presorted Key: device_id)
     -> Merge Join (mac_observation × device)
        -> Nested Loop (mac_observation × port via Memoize cache)
           -> Index Scan using mac_obs_open_per_source on mac_observation
                (rows=20001, loops=1)
           -> Memoize (Hits: 19860, Misses: 141, Memory: 17 kB)
```

Two things worth knowing:
- The partial index `mac_obs_open_per_source` (created in alembic 0001)
  drives the scan. It's defined on `(device_id, mac, vlan, source) WHERE
  upper_inf(valid_during) AND upper_inf(recorded_during)`, which is exactly
  the predicate `live_fdb()` filters by, so the query never touches the
  closed-row part of the table. (A DDL sketch follows this list.)
- The port `Memoize` cache absorbs 99.3% of port lookups (19,860 of
  20,001 are cache hits). That means we only physically index-scan `port`
  141 times even though we look up 20,001 port_ids.
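For reference, here is the index written out as alembic DDL. This is a sketch of how migration 0001 might express it: the column list and `WHERE` clause are verbatim from the bullet above, while `unique=True` is an assumption read off the "open_per_source" name.

```python
# Sketch: the partial index as an alembic operation. Column list and
# WHERE clause are quoted from the text; unique=True is an ASSUMPTION
# based on the index name (one open row per (device, mac, vlan, source)).
import sqlalchemy as sa
from alembic import op

def upgrade() -> None:
    op.create_index(
        "mac_obs_open_per_source",
        "mac_observation",
        ["device_id", "mac", "vlan", "source"],
        unique=True,
        postgresql_where=sa.text(
            "upper_inf(valid_during) AND upper_inf(recorded_during)"
        ),
    )
```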
PostgreSQL's row estimate is 2,300 against an actual 20,001, a roughly
9x undercount. That doesn't matter for the chosen plan (which is good),
but if a future workload depends on the join order changing under
different cardinalities, `ANALYZE mac_observation` is the right thing
to run.
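If that day comes, the fix is a one-liner. A sketch assuming a SQLAlchemy engine (which alembic implies the project has); the DSN is a placeholder:

```python
# Refresh planner statistics so the mac_observation row estimate tracks
# reality. ANALYZE (unlike VACUUM) is allowed inside a transaction block,
# so engine.begin() is fine. The DSN below is hypothetical.
from sqlalchemy import create_engine, text

engine = create_engine("postgresql+psycopg://user:pass@localhost/demo")
with engine.begin() as conn:
    conn.execute(text("ANALYZE mac_observation"))
```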
## Where the next bottleneck lives

At 20k rows the bottleneck is not the SQL; it's the TUI render.
The OPS FDB tree builds an in-memory `Tree` widget with one leaf per
(mac, device, port). At 20k entries the tree is ~30 megabytes of
Python objects and the initial paint is visibly chunky.
Two reasonable mitigations when fabrics get this big:

- Collapse port nodes by default above some threshold. ✅ Implemented in
  `OpsScreen._render_fdb`: `FDB_PORT_AUTO_EXPAND_THRESHOLD = 50` and
  `FDB_DEVICE_AUTO_EXPAND_THRESHOLD = 500`. Above those, the node renders
  with a `(▶ to expand)` hint and stays collapsed on first paint;
  operators press Right-arrow / Space to drill in. The demo data is small
  enough that everything auto-expands, but a 50k-MAC fabric paints fast
  because most of the tree is collapsed. (A sketch of the pattern follows
  this list.)
- Surface the MAC list as a `DataTable` instead of a `Tree` for the
  flat-listing use case. Trees shine for hierarchy; tables are denser and
  easier to virtualize. Not yet implemented; it would benefit an operator
  who wants to scan all 50k MACs at once. Flag for a future session.
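A minimal sketch of the collapse-by-threshold pattern using Textual's `Tree` API. The constants are the real ones named above, but the node-building shape, the per-node counting, and the `fdb` dict layout are illustrative; the actual `OpsScreen._render_fdb` may count and label differently.

```python
# One plausible reading of the thresholds: a node auto-expands only while
# the MAC count beneath it stays at or under its threshold.
from textual.widgets import Tree

FDB_PORT_AUTO_EXPAND_THRESHOLD = 50
FDB_DEVICE_AUTO_EXPAND_THRESHOLD = 500

def render_fdb(tree: Tree, fdb: dict[str, dict[str, list[str]]]) -> None:
    """fdb maps device -> port -> [mac, ...] (illustrative shape)."""
    for device, ports in fdb.items():
        device_macs = sum(len(macs) for macs in ports.values())
        expand_dev = device_macs <= FDB_DEVICE_AUTO_EXPAND_THRESHOLD
        label = device if expand_dev else f"{device} (▶ to expand)"
        dev_node = tree.root.add(label, expand=expand_dev)
        for port, macs in ports.items():
            expand_port = len(macs) <= FDB_PORT_AUTO_EXPAND_THRESHOLD
            plabel = port if expand_port else f"{port} (▶ to expand)"
            port_node = dev_node.add(plabel, expand=expand_port)
            for mac in macs:
                port_node.add_leaf(mac)
```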
## Reproducing

```
make seed-realistic   # OR
make scale-check      # seeds 20k rows + prints this table + EXPLAIN
```

The seed routes both sw-a-01 and sw-a-11 to sw-c-01 (round-robin),
which is why the scale check's chosen trace pair walks
sw-a-01 → sw-c-01 → sw-a-11: a real 3-hop path within one core, no
core-to-core trunk learning required.
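The round-robin assignment is why those two access switches share a core. A sketch of the arithmetic (list construction is illustrative; the device names are the seed's own):

```python
# Why sw-a-01 and sw-a-11 share a core: with 10 cores assigned round-robin,
# access index 0 and access index 10 both map to cores[10 % 10] == cores[0].
cores = [f"sw-c-{i:02d}" for i in range(1, 11)]
access = [f"sw-a-{i:02d}" for i in range(1, 21)]

uplink = {sw: cores[i % len(cores)] for i, sw in enumerate(access)}
assert uplink["sw-a-01"] == uplink["sw-a-11"] == "sw-c-01"
# Hence the 3-hop trace sw-a-01 -> sw-c-01 -> sw-a-11 with no
# core-to-core trunk learning involved.
```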
## See also

- The seed source: `scripts/scale_check.py`
- Data model reference: the partial-index definitions live in alembic 0001
- The compactor invariant: why liveness rows must be patched after the
  surrounding session commits (not directly perf-related, but the same
  partial-index trick is at play for the compactor's age-out SQL)