Tier 1 — AFT-based L2 topology discovery

These three papers form the canonical academic lineage of CAM/AFT-table-based topology inference. All three target the same problem l2trace addresses (reconstructing L2 switching structure from switch-local state), all three predate widespread LLDP deployment, and all three treat the network as a single static snapshot to be reconstructed at one moment in time.

That static-snapshot framing is where l2trace diverges most sharply. The bitemporal valid-during × recorded-during model treats every observation as provisional and stitches the path through time — rejecting the “rerun discovery on conflict” worldview the Tier-1 literature shares.

Breitbart et al. 2004 — NetInventory

Paper. Yuri Breitbart, Minos Garofalakis, Ben Jai, Cliff Martin, Rajeev Rastogi, Avi Silberschatz. “Topology Discovery in Heterogeneous IP Networks: The NetInventory System.” IEEE/ACM ToN 12(3):401–414, June 2004. DOI: 10.1109/TNET.2004.828963. Closed-access. Extended journal version of the INFOCOM 2000 paper.

Problem framing [Breitbart 2004, §I]. Physical (layer-2) topology discovery for heterogeneous, multi-vendor IP networks containing multiple subnets and VLANs, using only standard SNMP MIB information. Three foundational difficulties: limited local information, layer-transparency of L2 elements to L3 routers, and vendor heterogeneity.

Data model and algorithm [Breitbart 2004, §II–§III]. Network modeled as undirected graph G with elements typed as router / switch / hub / host. For each interface e_i^s, F_i^s denotes the AFT — the set of MAC addresses seen as source on packets received at that interface. Two key lemmas:

Direct Connection Theorem [§III Lemma 3.1]: Interfaces e_i^s and e_j^t are directly connected iff F_i^s ∩ F_j^t = ∅ and M_i^s ∪ M_j^t = M^N (union covers all MACs in the subnet). Requires complete AFTs.
Minimal AFT Lemma [§III Lemma 3.4]: The interface with the smallest AFT cardinality has only single-port elements on the far side. Drives FINDLEAFCONNECTIONS, which iteratively peels off leaf elements by always selecting the minimal-AFT port.

For multi-subnet topologies [§IV], they prove that AFT information alone does not always uniquely identify topology — some configurations are inherently ambiguous. They provide a characterization of networks where uniqueness is guaranteed, and an “engineering solution” that returns a small set of candidate graphs otherwise.

Incomplete AFT handling [§V]. Three techniques: spoofed ICMP-echo packets to force AFT population, continuous AFT collection at regular intervals to capture more entries, and a user-specified completeness threshold.

Implementation [§VI–§VII]. Deployed as NetInventory, integrated into Lucent’s network-management product portfolio. Evaluated on Lucent’s Murray Hill research network. Multi-subnet correctness verified against manually-maintained operator maps — NetInventory in fact discovered links the operators had missed. On a Pentium II 400 MHz: 200 switches / 1000 nodes in ≈5s for topology computation, with most wall-clock time spent in AFT collection.

What l2trace borrows. The Direct Connection Theorem itself is now obsolete (LLDP and gNMI replace AFT-intersection inference), but Breitbart’s system architecture in §VI prefigures l2trace more than the algorithm does. NetInventory timestamps every observation, separates Resource Discovery (find topology) from Resource Monitoring (re-validate it), and “retains historical views of the network in that any user can query the system to determine which interfaces existed or which switches and routers were operational or what network events occurred in a given time frame.” This is unitemporal storage — one time axis, capturing “when we recorded it” — and it falls short of l2trace’s bitemporal model in exactly the ways described in Why bitemporal?. Crucially, NetInventory “assumes that the topology of the underlying network remains stable during the discovery process,” making the static-snapshot assumption explicit.

Lowekamp, O’Hallaron, Gross 2001 — Simple Connection Theorem

Paper. Bruce Lowekamp, David R. O’Hallaron, Thomas R. Gross. “Topology Discovery for Large Ethernet Networks.” ACM SIGCOMM 2001, pp. 237–248. DOI: 10.1145/383059.383078. Closed-access.

Problem framing [§1]. Same general problem as Breitbart, but with an explicit constraint: discover topology from a single endpoint without requiring complete AFTs. Lowekamp’s group at CMU found Breitbart’s completeness requirement infeasible on real, large networks (CMU CS had ~2000 hosts and 50 bridges; getting every bridge’s AFT to contain every other bridge’s address was prohibitively expensive in practice).

Algorithm [§4–§5]. Two theorems:

Direct Connection Theorem [§4 Theorem 4.1]: identical in spirit to Breitbart’s, and Lowekamp explicitly credits Breitbart for the proof. Requires complete AFTs.
Simple Connection Theorem [§5 Theorem 5.1]: the central contribution. Defines the through set T_i^x — the complement of the forwarding set, i.e., MACs forwarded through port x. Two ports a_x and b_y are simply connected iff there is exactly one pair of ports for which T_a^x ∩ T_b^y = ∅. The uniqueness clause replaces the completeness clause. Lowekamp proves a Minimum Knowledge Requirement [§5 Lemma 5.2]: forwarding entries for as few as three hosts shared between any pair of bridges suffice to uniquely determine the connection.

Virtual switches [§6.1]. When traversal finds no single next-hop bridge satisfying the predicate, a “virtual switch” is inserted to represent a hub, dumb switch, or SNMP-denied bridge. Unlike Breitbart’s special-case Theorem 4.2, this falls out of the base algorithm — Lowekamp calls this “elegance.”

Evaluation [§8]. Tested on CMU CS Department network (~2000 hosts, 50 bridges) plus BBN, ETH Zurich, William & Mary, and DoD testbeds. Hardware diversity: Cisco, Intel, 3Com, Asanté, Netgear, Xylan, Linksys. All cases produced correct topology, verified by departmental network managers. Wall-clock: ~20s for 2000 hosts on a 566 MHz Pentium III, dominated by SNMP polling time.

Dynamic networks [§7.5]. Lowekamp explicitly addresses host movement: “verifying the location of a host is as simple as checking the leaf bridge at which it was last found.” But the treatment of bridge moves is dismissive: “changes in bridge topology are expected extremely rarely.” Host movement during the discovery phase is treated as a corruption to retry: “our current approach is to simply rerun the discovery phase when such a conflict is found.” This is the classic static-snapshot worldview that l2trace’s bitemporal model rejects.

What l2trace borrows. The “through set” concept is conceptually related to (but distinct from) what LLDP gives us directly. More importantly, Lowekamp’s incomplete-information stance maps onto l2trace’s belief axis: his contribution is essentially “tolerate partial CAM tables by using uniqueness of intersection emptiness rather than completeness of forward sets.” l2trace inverts the framing — instead of tolerating incomplete CAMs synchronously, it treats every CAM observation as bitemporally provisional and stitches them across time. The Minimum Knowledge Requirement is a beautiful piece of combinatorial reasoning but is orthogonal to l2trace’s goals (we have LLDP, so we don’t need to derive adjacencies from CAM intersections).

Bejerano 2009 — Skeleton Trees

Paper. Yigal Bejerano. “Taking the Skeletons Out of the Closets: A Simple and Efficient Topology Discovery Scheme for Large Ethernet LANs.” IEEE/ACM ToN 17(5):1385–1398, October 2009. DOI: 10.1109/TNET.2009.2022264. Closed-access. Extended journal version of INFOCOM 2006.

Problem framing [§I]. Same problem Breitbart and Lowekamp worked on, but with explicit attention to two cases the predecessors handled poorly: uncooperative network elements (hubs and SNMP-denied switches) and multi-subnet organization where an element is invisible to its direct physical neighbor because they belong to different subnets. Bejerano (also at Bell Labs Murray Hill — the successor to Breitbart’s group) critiques Lowekamp’s approach as effective only for single-subnet cases, and critiques Breitbart’s INFOCOM 2003 multi-subnet extension as “hard to implement and high computational complexity” [§II].

Notable LLDP critique [§II]: “the IEEE 802.1ab committee finished its proposal for a new layer-2 discovery protocol, called link layer discovery protocol (LLDP), which provides adequate information to infer the physical topology. However, LLDP cannot be easily deployed on all the legacy equipment, and consequently, it cannot easily be used on today’s huge installed base of heterogeneous Ethernet LANs.” This is the explicit justification for why all this AFT-intersection algorithmic machinery existed: LLDP wasn’t deployed yet. In 2026, this constraint is largely gone, which is part of why l2trace can sidestep the AFT-intersection problem and focus on bitemporal provenance instead.

Algorithm [§IV–§VI]. The skeleton-tree is the central data structure: a tree where each vertex represents a set of network nodes (possibly empty, indicating a hub) and arcs represent physical links. Two phases:

AFT population: single management station sends ping messages to every detected element, populating AFTs along forwarding paths.
STC (Skeleton-Tree Creation) algorithm: build per-subnet skeleton trees, then iteratively merge them. Junction nodes (degree ≥3) and leaves are anchored to single vertices; transit nodes (degree 2) are represented by sequences of vertices that compress homeomorphic paths.

Hubs are detected by discrepancy: when adding a node v to the skeleton-tree, if v’s AFT entries imply a child structure inconsistent with the existing tree, an unlabeled (hub) vertex is inserted between v and its expected ancestor [§V]. Time complexity O(n²).

Evaluation [§VIII–§IX]. Simulation: 2000 random networks per parameter setting, varying switch count (10–50), subnet count, fraction of uncooperative switches (up to 75% of L2 elements). STC achieves >99.75% success when average subnet size ≥ 20 hosts, and

95% success even with only 25% of L2 elements cooperative. Comparison to Lowekamp (“LHG”) shows LHG falls below 80% when subnet count grows, while STC stays above 95% in the same regime [§VIII Fig. 9–10]. Proof-of-concept deployment against the Bell Labs Murray Hill CS Department network in 2004 and re-run in 2008: 17 switches, 35 hubs, 204 hosts, 14 subnets in 2004 — note the hub count exceeded the switch count, the regime where Lowekamp degrades and STC excels.

What l2trace borrows. The skeleton-tree’s vertex-as-set-of-nodes encoding is interesting structurally — it’s how Bejerano represents “unknown intermediate topology” without committing to a specific shape. This maps loosely onto l2trace’s handling of FLOOD terminations and MLAG groups (both cases where the L2 path doesn’t decompose into a single hop sequence). The bigger lesson is methodological: Bejerano’s simulation methodology (parameterized random networks varying uncooperative fraction and subnet size) is the closest thing in the literature to what an l2trace evaluation section should look like, scaled up to time-axis dimensions.

Cross-cutting observations

Partial-data strategies across the three papers

None of the three uses bitemporal semantics in the formal sense, but each handles the partial-data problem differently:

Paper	Strategy	Temporal model
Breitbart 2004	Force population via spoofed pings; relax completeness threshold; collect at intervals and timestamp	Unitemporal — system-time only, retains historical views, no formal “as of T on the network” axis
Lowekamp 2001	Tolerate partial AFTs algorithmically (3 shared hosts suffice); rerun on conflict	None — single static snapshot, host moves treated as “conflict to retry”
Bejerano 2009	Populate AFTs via management-station ping flood; merge per-subnet skeleton trees	None — single snapshot, formal correctness only for stable topology

The honest version of the prior-art gap: Breitbart’s NetInventory explicitly stored historical observations and could answer “what did we know on 2003-04-12 about the network?” — that’s not nothing. What NetInventory lacked was the second axis: “what was true on the network at 14:42 on that date?” l2trace’s valid_during × recorded_during makes that distinction explicit; NetInventory’s (time-of-recording) column conflates “we knew” with “it was.”

Contradiction: Breitbart vs Lowekamp on AFT completeness

The literature shows a real disagreement on whether complete AFTs are needed. Breitbart’s Direct Connection Theorem requires M_i^s ∪ M_j^t = M^N — every MAC in the subnet present in at least one of the two interfaces’ AFTs. Lowekamp 2001 §3.2 explicitly identifies this as the binding constraint on practical deployment, and designs the Simple Connection Theorem to relax it.

For l2trace this disagreement is moot — we have LLDP and don’t need AFT-intersection inference at all. But it’s worth noting as evidence that even within the static-snapshot worldview, the field acknowledged the cost of the “wait for a complete AFT” assumption before our bitemporal pivot.