I first heard Peyman Kazemian talk about the research that won the Test-of-Time Award for NSDI'24 at NSDI'12 [1]. The paper's authors' insight was that instead of using packet headers as structured data, they would analyze bits in headers without any associated meaning. Using this data and transform functions representing router and switch configurations, they used their tool, Hassel, to uncover forwarding loops and check reachability constraints in Stanford's network in less than ten minutes.
Kazemian described tools they built and shared that created the transformer functions from Cisco router configurations and performed the static analysis. Some of what their tools could do reminded me of work that Bill Cheswick's company, Lumeta, had started doing in the late 90s by capturing router updates and sending out probes using traceroute to do similar things. But Kazemian's tools sounded different, and advanced from static analysis to a dynamic tool presented at NSDI'13 [2].
[Rik Farrow] As I've never managed a large network, a lot of what you did with the analysis of the Stanford network was beyond my understanding, but I do think I comprehend the two key parts of your research: using bitmaps to represent headers and transform functions to represent routers and switches. These two things together support the analysis you did for that paper.
You mention that the transform functions could be created using a tool, Hassel, that took Cisco router configuration and some state and converted that into transform functions. How was that done?
[Peyman Kazemian] We get them from config and state. For example, in Cisco configuration, you have an ACL like this applied as ingress ACL on some interface:
permit ip host 1.1.1.1 host 2.2.2.2
This gets converted to a rule that matches on a bit sequence where
eth_type=0x800, ip_version=4, ip_src=1.1.1.1, ip_dst-2.2.2.2
and then the action is to forward traffic to next transfer function inside the device.
That next transfer function could be IP forwarding. In the IP forwarding table (RIB table) let's say there is a route learned over BGP like this
B 1.1.1.0/24 next_hop=10.1.1.1 eth2
match=ip_dst=1.1.1.0/24 --> action: rw(mac_dst, aa:aa:aa:00:00:00), rw(out_port, eth2), forward_to_next_table.
When these rules are put together, and we trace all possible packets through them, we find an equivalent class (aka flow) with
eth_type=0x800, ip_version=4, ip_src=1.1.1.1, ip_dst-2.2.2.2
that will be forwarded to eth2, and every other kind of packet that will not match on both of these rules, which would be forwarded by the rules they happen to match on.
So in a way, we don't create those bitmaps that represent flows, they are
[RF] You could also explain how you represent the packet header bits. I recall that there are really four states: 0, 1, ignore, and wildcard. And that's why you need two bits for every single header bit.
[PK] Yes, because we need to work with symbolic packets where some bits are both 0 and 1 at the same time, we need to encode this with two bits: first bit indicating whether that bit can take value 1, and second bit to indicate if that bit can take value 0. As a result:
[RF] The 2013 paper [3] seems to cover dynamic rather than static analysis.
[PK] 2013 was a real-time version of the same kind of static analysis where you can update the transfer functions and computed flows on the fly. The idea for the NetPlumber paper was to incrementally update the transfer function rules and all the flows that have been processed through those rules, so that we can avoid recomputing everything from scratch. The trade off is by throwing more memory at it, we can save time.
[RF] I believe that at some point you started working with Software Defined Networks (SDN), where you would need to collect and process configuration and flow data differently?
[PK] Yes, the model we use for modeling devices is essentially the Openflow model (match and action) and as a result it makes parsing jobs for Openflow a lot simpler, as the native device representation for forwarding rules are a lot closer to what HSA [1] needs.
These days, SDN means different things to different people, but as part of Forward we have been able to model non-openflow SDN platforms the same way (like SD WANs, and smart NICs).
[RF] It appears that you left Stanford to work on HSA at the company you founded. How has that worked out?
[PK] I graduated from Stanford in 2013, then started Forward Networks. It's been going well. We are 10.5 years in and growing as a company in all aspects.