Pat Bosshart, Glen Gibb, Hun-Seok Kim, George Varghese, Nick McKeown, Martin Izzard, Fernando Mujica, and Mark Horowitz. 2013. Forwarding metamorphosis: fast programmable match-action processing in hardware for SDN. In Proceedings of the ACM SIGCOMM 2013 conference on SIGCOMM (SIGCOMM ’13). Association for Computing Machinery, New York, NY, USA, 99–110. DOI:https://doi.org/10.1145/2486001.2486011
Routing & forwarding within the network remain a confusing mix of routing protocols (e.g. BGP, ICMP, MPLS) and forwarding behaviors (e.g. routers, bridges, firewalls). The control and forwarding planes are intertwined inside closed, vertically integrated boxes. SDN took a key step in abstracting network functions by separating the roles of the control & forwarding planes via an open interface between them, e.g. OpenFlow. OpenFlow is based on the Match-Action approach. The challenge is to implement Match-Action in hardware at 1 Tb/s speeds, exploiting pipelining & parallelism while living within the constraints of on-chip table memories. Hardware is needed because switching chips are O(100) times faster at switching than CPUs & O(10) times faster than NPUs.
The simplest approach is the Single Match Table (SMT) model, where a controller tells the switch to match any set of packet header fields against entries in a single match table. It can be easily implemented in Ternary Content Addressable Memory (TCAM). But this table needs to store every combination of headers, which is wasteful.
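To make the single-table idea concrete, here is a minimal software sketch of a ternary match, the kind of lookup a TCAM performs in hardware. The names `TernaryEntry` and `tcam_lookup`, the 8-bit key, and the action labels are assumptions for illustration; a real TCAM compares all entries in parallel rather than scanning them in a loop.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class TernaryEntry:
    value: int   # bit pattern to compare against
    mask: int    # 1 = bit must match, 0 = don't care
    action: str  # hypothetical action label


def tcam_lookup(table: List[TernaryEntry], key: int) -> Optional[str]:
    """Return the action of the first (highest-priority) matching entry."""
    for entry in table:
        if (key & entry.mask) == (entry.value & entry.mask):
            return entry.action
    return None


# Toy table over an 8-bit key; earlier entries have higher priority.
table = [
    TernaryEntry(0b1010_0000, 0b1111_0000, "port1"),  # matches 1010xxxx
    TernaryEntry(0b1000_0000, 0b1100_0000, "port2"),  # matches 10xxxxxx
    TernaryEntry(0b0000_0000, 0b0000_0000, "drop"),   # matches anything
]

print(tcam_lookup(table, 0b1010_1111))  # port1
print(tcam_lookup(table, 0b1001_0000))  # port2
print(tcam_lookup(table, 0b0100_0000))  # drop
```

In the SMT model the key would be the concatenation of every header field of interest, which is why the table has to enumerate header combinations.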
Multiple Match Tables (MMT) is a refinement of the SMT model. It allows multiple smaller match tables, each matched against a subset of packet fields. These match tables are arranged in a pipeline of stages.
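A pipeline of small tables can be sketched as follows. This is a simplified exact-match model, not the paper's hardware design: `MatchTable`, `run_pipeline`, `set_field`, and the field names (`in_port`, `vlan`, `dst_mac`, `out_port`) are all hypothetical. The point is that each stage matches on only a few fields and that a later stage can match on a field written by an earlier one.

```python
from typing import Callable, Dict, Iterable, List, Optional, Tuple

Packet = Dict[str, object]
Action = Callable[[Packet], None]


def set_field(name: str, value: object) -> Action:
    """Build an action that writes one packet field."""
    def action(pkt: Packet) -> None:
        pkt[name] = value
    return action


class MatchTable:
    def __init__(self, fields: Iterable[str],
                 entries: Dict[Tuple, Action],
                 default: Optional[Action] = None) -> None:
        self.fields = tuple(fields)    # subset of fields this table matches on
        self.entries = dict(entries)   # exact-match key -> action
        self.default = default         # action on a table miss, if any

    def apply(self, pkt: Packet) -> None:
        key = tuple(pkt.get(f) for f in self.fields)
        action = self.entries.get(key, self.default)
        if action is not None:
            action(pkt)


def run_pipeline(tables: List[MatchTable], pkt: Packet) -> Packet:
    """Pass the packet through each table in order."""
    for table in tables:
        table.apply(pkt)
    return pkt


# Stage 1 assigns a VLAN from the ingress port; stage 2 forwards on (VLAN, MAC).
vlan_table = MatchTable(["in_port"], {(1,): set_field("vlan", 10)})
fwd_table = MatchTable(["vlan", "dst_mac"],
                       {(10, "aa:bb"): set_field("out_port", 7)},
                       default=set_field("out_port", None))

pkt = run_pipeline([vlan_table, fwd_table], {"in_port": 1, "dst_mac": "aa:bb"})
print(pkt["vlan"], pkt["out_port"])  # 10 7
```

With a single table, the same behavior would need one entry per (ingress port, destination MAC) combination; splitting the lookup keeps each table small.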
The OpenFlow spec transitioned to the MMT model but does not mandate the width, depth, or number of tables. Existing switch chips fix the width, depth, or number of tables during fabrication. This severely limits flexibility:
This creates problems for network operators who want to tune the table sizes to optimize for their network, or implement new forwarding behavior.
Existing chips also offer only a limited repertoire of actions, e.g. forwarding, dropping, decrementing TTLs, pushing VLAN or MPLS headers, & GRE encapsulation. This action set is not extensible & not abstract enough to allow any field to be modified, any state machine to be updated, and the packet to be forwarded to an arbitrary set of output ports.
The paper explores Reconfigurable Match Tables (RMT), a refinement of the MMT model.
This work describes an RMT chip architecture as an existence proof of the RMT model, intended to help standardize the reconfiguration interface between the controller and the data plane. The work also provides use cases that show how the RMT model can be configured to implement forwarding using Ethernet & IP headers, and to support RCP. Finally, the work describes an implementation of a 64x10Gb/s RMT switch chip to show that a general form of the RMT model is feasible & inexpensive.
The work advocates an implementation architecture that consists of a large number of physical pipeline stages, onto which a smaller number of logical RMT stages are mapped depending on the resource needs of each logical stage.
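One way to picture the logical-to-physical mapping is the greedy sketch below. It is a deliberately simplified assumption, not the paper's allocation scheme: each logical stage is given whole, consecutive physical stages, sized only by how many memory blocks it needs. The function name, the `blocks_per_stage` unit, and the example numbers are hypothetical.

```python
import math
from typing import List


def map_logical_to_physical(logical_needs: List[int],
                            blocks_per_stage: int,
                            num_physical: int) -> List[List[int]]:
    """Assign consecutive physical stages to each logical stage.

    logical_needs: memory blocks required by each logical stage, in pipeline order.
    Returns, for each logical stage, the list of physical stage indices it occupies.
    """
    mapping = []
    next_stage = 0
    for need in logical_needs:
        # A logical stage always occupies at least one physical stage.
        count = max(1, math.ceil(need / blocks_per_stage))
        if next_stage + count > num_physical:
            raise ValueError("not enough physical stages for this configuration")
        mapping.append(list(range(next_stage, next_stage + count)))
        next_stage += count
    return mapping


# Three logical stages with different memory needs, on a 32-stage pipeline.
mapping = map_logical_to_physical([10, 100, 30], blocks_per_stage=40, num_physical=32)
print(mapping)  # [[0], [1, 2, 3], [4]]
```

A large logical table simply spans several physical stages, so table sizes can be tuned by configuration rather than fixed at fabrication.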
They found that the power requirement is not significant, and since most use-cases are dominated by memory use, the coupling of processing & memory allocation is not significant in practice.
For terabit-speed realization, the authors needed to restrict the number of physical match stages to 32. The chip design limits the packet header vector (PHV) to 4Kb (512B). They use 370Mb of SRAM and 40Mb of TCAM. Each stage may execute one instruction per field, and instructions are limited to simple arithmetic, logical & bit-manipulation operations. The queueing system provides 4 levels of hierarchy and 2K queues per port. Each stage contains over 200 action units, one for each field in the PHV, for a total of over 7000 action units in the chip.
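The figures above can be cross-checked with a little arithmetic. The PHV layout used here (64 words of 8 bits, 96 of 16 bits, 64 of 32 bits) is not stated in these notes; it is recalled from the paper and should be treated as an assumption. Under that assumption the numbers are mutually consistent:

```python
# Values stated in the notes.
PHV_BITS = 4 * 1024          # 4Kb packet header vector
PHYSICAL_STAGES = 32

# Assumed PHV layout (word width in bits -> number of words); not from the notes.
phv_words = {8: 64, 16: 96, 32: 64}

phv_bytes = PHV_BITS // 8
fields = sum(phv_words.values())
layout_bits = sum(width * count for width, count in phv_words.items())
action_units = fields * PHYSICAL_STAGES   # one action unit per field per stage

print(phv_bytes)      # 512
print(fields)         # 224
print(layout_bits)    # 4096
print(action_units)   # 7168
```

So 224 fields per stage matches "over 200 action units" per stage, and 224 x 32 = 7168 matches "over 7000" in the chip.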
To configure the RMT architecture, one needs two pieces of information: