Tier 3 — L2 standards lineage + post-STP fabrics
This tier traces L2 forwarding standards from the 1985 STP origin through the post-STP datacenter fabrics (SPB, TRILL, PortLand, VL2) and into experimental path-exploration variants (All-Path). The common thread for l2trace is where the forwarding state lives — classic CAM tables on STP bridges, IS-IS link-state databases on SPB/TRILL, pseudo-MAC field decoding on PortLand, directory services on VL2. The collector ABI has to abstract all of these.
Perlman 1985 — Spanning Tree origin paper
Section titled “Perlman 1985 — Spanning Tree origin paper”Paper. Radia Perlman. “An Algorithm for Distributed Computation of a Spanning Tree in an Extended LAN.” SIGCOMM ‘85, pp. 44–53. DOI: 10.1145/319056.319004. Open access from ACM DL.
41 years old as of 2026. The algorithm in this 10-page paper is the direct ancestor of every IEEE 802.1 STP/RSTP/MSTP variant still running today.
Algorithm
Section titled “Algorithm”Bridges in an arbitrary-topology extended LAN compute a distributed, acyclic spanning subset of the network by:
- Electing a root by lowest bridge ID.
- Each bridge computes least-cost path to root.
- On each LAN segment, one designated bridge forwards from root; all other ports on that LAN are blocked.
Key properties
Section titled “Key properties”- Converges in time proportional to LAN diameter, not bridge count.
- Memory per bridge: small constant, independent of network size.
- Bandwidth per LAN: small constant, independent of total links.
Perlman’s Algorhyme poem (printed on the first page of the paper) — “I think that I shall never see / A graph more lovely than a tree / A tree whose crucial property / Is loop-free connectivity” — is part of every networking student’s required reading.
What l2trace borrows
Section titled “What l2trace borrows”l2trace’s stp_state table records the output of Perlman’s
algorithm on a per-(port, vlan) basis at every moment in time. The
bitemporal angle adds something Perlman’s algorithm explicitly
omits: the algorithm preserves no history of prior trees. When
topology changes and a new tree converges, the old tree is gone —
but l2trace remembers both and can answer “was port P forwarding or
blocking at time T?”
Myers, Ng, Zhang 2004 — Rethinking the Service Model
Section titled “Myers, Ng, Zhang 2004 — Rethinking the Service Model”Paper. Andy Myers, T. S. Eugene Ng, Hui Zhang. “Rethinking the Service Model: Scaling Ethernet to a Million Nodes.” HotNets-III, 2004. PDF not retrievable via available archives (HotNets workshops often lack canonical DOIs).
Canonical contribution
Section titled “Canonical contribution”Argued that Ethernet’s STP-based forwarding fundamentally cannot scale to million-node networks because: STP disables links to prevent loops (wasted bandwidth), MAC learning floods unknown destinations across the entire broadcast domain, ARP storms scale linearly with node count, and failure recovery is slow. Proposed replacing the L2 flat-address learning model with something closer to L3 routing while preserving Ethernet’s plug-and-play API.
This is the founding motivation paper for PortLand, VL2, SPB, TRILL, and arguably for everything that came later in datacenter L2 fabric design.
What l2trace borrows
Section titled “What l2trace borrows”No direct features — this is founding motivation literature, not algorithmic prior art. Worth citing as the introduction to any l2trace systems-paper discussion of “why does L2 forwarding state matter in modern fabrics.”
Allan et al. 2010 — Shortest Path Bridging (SPB)
Section titled “Allan et al. 2010 — Shortest Path Bridging (SPB)”Paper. David Allan, Peter Ashwood-Smith, Nigel Bragg, János Farkas, Don Fedyk, Michel Ouellette, Mick Seaman, Paul Unbehagen. “Shortest Path Bridging: Efficient Control of Larger Ethernet Networks.” IEEE Communications Magazine, October 2010, pp. 128–135. DOI: 10.1109/mcom.2010.5594687. Closed-access.
Problem framing
Section titled “Problem framing”STP-based bridging has two structural limitations: it disables non-tree links rather than using them, and it implements a distance-vector algorithm without a proper topology database, which slows convergence after a topology change.
Algorithm — SPB (IEEE 802.1aq)
Section titled “Algorithm — SPB (IEEE 802.1aq)”- Replaces STP’s distance-vector protocol with ISIS-SPB — the IS-IS link-state routing protocol adapted to L2.
- Each bridge maintains a full topology database; forwarding paths are computed via shortest-path-first against that database.
- Up to 16 Equal Cost Trees (ECTs) per region.
- Two modes:
- SPBV — VLAN-based; MAC learning stays in the data plane.
- SPBM — MAC-in-MAC encapsulation; MAC learning OFF in the SPB region core; all B-MACs distributed via ISIS-SPB control plane. 24-bit I-SID gives 16 million service identifiers.
Critical implication for l2trace
Section titled “Critical implication for l2trace”In an SPBM core, customer MAC addresses do not appear in transit-switch CAM tables. The forwarding state lives in the IS-IS link-state database. l2trace’s gNMI/SNMP collectors would need to read OpenConfig’s IS-IS LSDB models in addition to CAM tables when the target fabric runs SPB. Not implemented today; tracked in the prior-art synthesis.
Path congruency [Allan 2010, §SPB Principles]: forward and reverse paths between any two bridge pairs are guaranteed identical; unicast and multicast paths are congruent. l2trace can rely on this when modeling SPBV adjacencies — the bidirectional adjacency audit is guaranteed to pass on a healthy SPB fabric by the spec.
Rojas et al. 2015 — All-Path Bridging
Section titled “Rojas et al. 2015 — All-Path Bridging”Paper. Elisa Rojas, Guillermo Ibáñez, José Manuel Giménez-Guzmán, Juan A. Carral, Alberto García-Martínez, Isaías Martínez-Yelmo, José M. Arco. “All-Path Bridging: Path Exploration Protocols for Data Center and Campus Networks.” Computer Networks 86:79–93, 2015. DOI: 10.1016/j.comnet.2015.01.002. Open access (Elsevier).
Problem framing
Section titled “Problem framing”SPB and TRILL both use link-state routing for L2 — but link-state routing has limitations: complex control-message exchange, additional loop-control mechanisms because the LSDB is temporarily inconsistent during convergence, and complexity in computing equal-cost paths for load balancing. Rojas et al. argue for a radically different approach: don’t compute paths, explore them.
Algorithm — ARP-Path
Section titled “Algorithm — ARP-Path”When an ARP Request broadcast traverses the network, the first copy of the reply that returns identifies the lowest-latency path. That path is locked in for the (src_mac, dst_mac) flow until it ages out. No precomputed topology database; no IS-IS; no STP. Path selection is reactive and load-aware.
What l2trace borrows
Section titled “What l2trace borrows”All-Path bridging is rare in production (SPB and TRILL dominate among post-STP fabrics), but the ARP-Path-style probe trace is conceptually identical to l2trace’s eventual active-validation mode. Where Anteater/ATPG inject custom test packets, an ARP-Path-aware collector could observe the actual ARP exchanges happening naturally and reconstruct path information from them.
Mysore et al. 2009 — PortLand
Section titled “Mysore et al. 2009 — PortLand”Paper. Radhika Niranjan Mysore, Andreas Pamboris, Nathan Farrington, Nelson Huang, Pardis Miri, Sivasankar Radhakrishnan, Vikram Subramanya, Amin Vahdat. “PortLand: A Scalable Fault-Tolerant Layer 2 Data Center Network Fabric.” SIGCOMM ‘09, pp. 39–50. DOI: 10.1145/1592568.1592575. Closed-access.
Algorithm — Pseudo-MAC (PMAC) addressing
Section titled “Algorithm — Pseudo-MAC (PMAC) addressing”Each VM is assigned a pseudo-MAC that encodes its topological
location: pod.position.port.vmid. ARP requests are intercepted by
edge switches and rewritten to return the PMAC. Forwarding is then
purely location-based — every transit switch can determine the
egress port from the PMAC bit fields, without needing a full CAM
lookup.
PortLand assumes a known baseline topology (specifically: a multi-rooted fat-tree). The topology assumption is what makes location-encoded MACs work.
What l2trace borrows
Section titled “What l2trace borrows”In a PortLand deployment, l2trace could decode endpoint location
directly from the PMAC, bypassing CAM-table queries entirely. The
bitemporal mac_observation table compresses to a structure indexed
by PMAC fields, and the L2-traceroute query reduces to PMAC field
decoding.
PortLand is also a case study in why l2trace’s collector layer must be flexible — different fabric architectures expose forwarding state in radically different ways. CAM tables work for STP-based fabrics; LSDBs for SPB/TRILL; PMACs decode directly for PortLand. The collector ABI has to abstract all three.
Greenberg et al. 2009 — VL2
Section titled “Greenberg et al. 2009 — VL2”Paper. Albert Greenberg, James R. Hamilton, Navendu Jain, Srikanth Kandula, Changhoon Kim, Parantap Lahiri, David A. Maltz, Parveen Patel, Sudipta Sengupta. “VL2: A Scalable and Flexible Data Center Network.” SIGCOMM ‘09, pp. 51–62. DOI: 10.1145/1592568.1592576. Microsoft Research. Closed-access.
Algorithm — three innovations
Section titled “Algorithm — three innovations”- Flat addressing — service instances can be placed anywhere in the network; IP addresses are not topologically encoded.
- Valiant Load Balancing (VLB) — randomly hash each flow to an intermediate switch, then route to destination. Spreads traffic uniformly across paths in a Clos topology.
- End-system address resolution — directory service maps “application IP” → “location IP” at the end-hosts, not the network.
Implication for l2trace
Section titled “Implication for l2trace”VL2 is less L2-pure than PortLand — it’s effectively an L3 fabric
that presents an L2 illusion via the directory service. l2trace’s
bitemporal model still applies: the directory service maintains
essentially a mac_observation equivalent for “application IP →
location IP” mappings, and that mapping changes over time exactly
the way CAM tables do. The collection layer would target VL2’s
directory service rather than switch CAM tables.
VL2 is a good example of where the bitemporal abstraction generalizes beyond strictly-L2.
RFC 6325 — TRILL Base Protocol Specification
Section titled “RFC 6325 — TRILL Base Protocol Specification”Standard. R. Perlman, D. Eastlake 3rd, D. Dutt, S. Gai, A. Ghanwani. “Routing Bridges (RBridges): Base Protocol Specification.” RFC 6325 (Standards Track), IETF, July 2011. 99 pages. Open access from IETF: https://www.rfc-editor.org/rfc/rfc6325.
Routing Bridges (RBridges) provide optimal pair-wise forwarding without configuration, safe forwarding even during periods of temporary loops, and support for multipathing of both unicast and multicast traffic. They achieve these goals using IS-IS routing and encapsulation of traffic with a header that includes a hop count.
Algorithm summary
Section titled “Algorithm summary”- Each RBridge runs IS-IS to learn the topology of the RBridge campus (separate from any 802.1 STP on attached customer bridges).
- Native frames entering the campus are encapsulated with a TRILL header containing source/destination RBridge nicknames and a hop count.
- Transit RBridges forward based on the TRILL header — not on the inner customer MAC. Transit RBridge forwarding tables are sized by RBridge count, not by end-node count.
- The hop count allows safe forwarding during transient IS-IS convergence (frames die after N hops rather than looping).
ESADI protocol
Section titled “ESADI protocol”End-Station Address Distribution Information — RBridges can announce their attached customer MACs into the IS-IS LSDB. This is the TRILL analog of MAC learning, but advertised through the routing protocol rather than learned passively on the wire.
Critical implication for l2trace
Section titled “Critical implication for l2trace”In a TRILL deployment:
- The customer MAC → ingress-RBridge-nickname mapping lives in the ESADI LSDB, not in customer-MAC CAM tables on transit RBridges. l2trace’s collector would need to read ESADI.
- The L2 traceroute path is the RBridge nickname sequence from the IS-IS shortest-path tree, not any CAM table.
- VLAN handling is split between Inner VLAN tags (customer-facing) and Outer VLAN tags (TRILL-network-facing) — non-trivial for the bitemporal model.
Not implemented today; tracked in the prior-art synthesis.
IEEE 802.1Q-2018
Section titled “IEEE 802.1Q-2018”Status. IEEE-SA charges several hundred USD for the PDF. Not retrieved. Summarizing from authoritative secondary sources.
Relevant clauses for l2trace
Section titled “Relevant clauses for l2trace”- Clause 13 — Spanning Tree Protocols (STP / RSTP / MSTP) consolidated specification. STP frame format (BPDU), Hello timer, Max Age, Forward Delay, port state machine (Disabled → Blocking → Listening → Learning → Forwarding for STP).
- Clause 13.19 — Updating learned station location information
when a tree reconfigures. This is the clause that defines how
CAM-table entries are invalidated when an STP topology change is
detected. The mechanism (Topology Change Notification BPDUs +
triggered FDB flush + reduced ageing timer) is the operational
basis for l2trace’s
recorded_duringretraction logic — when an STP topology change notification arrives, the system believes all current CAM entries on affected switches are suspect.
Not implemented today; the reconciler currently doesn’t model TCN events as bulk-retraction triggers.
See also
Section titled “See also”- Why bitemporal? — the time axis that’s missing from every paper on this page
- Prior-art synthesis — what we could implement (SPB LSDB read, TRILL ESADI, PortLand PMAC decode, TCN-triggered retraction) but haven’t yet
- Bibliography — full DOI lookups