Azure Accelerated Networking: SmartNICs in the Public Cloud

Paper: Daniel Firestone, Andrew Putnam, Sambhrama Mundkur, Derek Chiou, Alireza Dabagh, Mike Andrewartha, Hari Angepat, Vivek Bhanu, Adrian Caulfield, Eric Chung, Harish Kumar Chandrappa, Somesh Chaturmohta, Matt Humphrey, Jack Lavier, Norman Lam, Fengfen Liu, Kalin Ovtcharov, Jitu Padhye, Gautham Popuri, Shachar Raindel, Tejas Sapre, Mark Shaw, Gabriel Silva, Madhan Sivakumar, Nisheeth Srivastava, Anshuman Verma, Qasim Zuhair, Deepak Bansal, Doug Burger, Kushagra Vaid, David A. Maltz, and Albert Greenberg. 2018. Azure accelerated networking: SmartNICs in the public cloud. In Proceedings of the 15th USENIX Conference on Networked Systems Design and Implementation (NSDI’18). USENIX Association, USA, 51–64.

This paper presents Azure Accelerated Networking (AccelNet), a solution that offloads host networking to hardware using custom, FPGA-based Azure SmartNICs. In service since 2016, it provides VM-to-VM TCP latencies under 15 microseconds and 32 Gbps throughput.

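Azure built its cloud network on host-based SDN technologies: software running in the hypervisor implements the rich and changing set of virtual networking features behind its Infrastructure-as-a-Service (IaaS) offering, such as private virtual networks with customer-supplied address spaces, scalable L4 load balancers, security groups and ACLs, virtual routing tables, bandwidth metering, QoS, and more.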

The Virtual Filtering Platform (VFP) is a cloud-scale programmable vSwitch, providing scalable SDN policy for Azure. It is highly programmable and serviceable.

Single Root I/O Virtualization (SR-IOV) reduces CPU utilization by allowing a VM to access the NIC hardware directly. The host connects to a privileged physical function (PF), while each virtual machine connects to its own virtual function (VF). An SR-IOV NIC contains an embedded switch that forwards each packet to the right VF based on its MAC address.
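
The MAC-based forwarding done by the embedded switch can be pictured with a small Python sketch. This is a toy model, not NIC firmware or any real driver API: the class name `EmbeddedSwitch`, the `attach_vf` and `forward` methods, and the MAC addresses are hypothetical, and sending frames with an unknown destination MAC to the PF is an assumption made only for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict

PF = "PF"  # privileged physical function used by the host


@dataclass
class EmbeddedSwitch:
    """Toy model of the switch inside an SR-IOV NIC (illustrative only)."""

    mac_to_vf: Dict[str, str] = field(default_factory=dict)

    def attach_vf(self, mac: str, vf: str) -> None:
        # Register the MAC address of a VM against its virtual function.
        self.mac_to_vf[mac.lower()] = vf

    def forward(self, dst_mac: str) -> str:
        # Pick the VF by destination MAC.
        # Assumption: frames with an unknown MAC fall back to the PF.
        return self.mac_to_vf.get(dst_mac.lower(), PF)


switch = EmbeddedSwitch()
switch.attach_vf("00:0D:3A:00:00:01", "VF0")
switch.attach_vf("00:0D:3A:00:00:02", "VF1")

known = switch.forward("00:0d:3a:00:00:02")    # MAC registered for VF1
unknown = switch.forward("00:0d:3a:00:00:99")  # MAC not registered
print(known, unknown)
```

The point of the sketch is that the lookup uses only the MAC address, so the switch by itself cannot apply per-flow SDN policy.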

Generic Flow Tables (GFT) is a match-action language that defines transformation & control operations on packets for a specific network flow. This mechanism is used in VFP to enforce policy & filtering in an SR-IOV environment.
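
To make the match-action idea concrete, here is a minimal Python sketch of a per-flow table. It is not the GFT language or the VFP API: `FlowKey`, `FlowTable`, `rewrite_dst_ip`, and `drop` are hypothetical names, keying flows on a five-tuple is an assumption, and raising `LookupError` on a miss is a simplification chosen for this example.

```python
from typing import Callable, Dict, List, NamedTuple, Optional

Packet = Dict[str, object]
Action = Callable[[Packet], Optional[Packet]]


class FlowKey(NamedTuple):
    """Fields that identify one network flow (assumed five-tuple)."""
    src_ip: str
    dst_ip: str
    protocol: str
    src_port: int
    dst_port: int


def rewrite_dst_ip(new_ip: str) -> Action:
    """Transformation action: return a copy with a new destination IP."""
    def action(pkt: Packet) -> Optional[Packet]:
        return {**pkt, "dst_ip": new_ip}
    return action


def drop(pkt: Packet) -> Optional[Packet]:
    """Control action: discard the packet."""
    return None


class FlowTable:
    """Exact-match table mapping a flow to an ordered list of actions."""

    def __init__(self) -> None:
        self._flows: Dict[FlowKey, List[Action]] = {}

    def install(self, key: FlowKey, actions: List[Action]) -> None:
        self._flows[key] = list(actions)

    def process(self, pkt: Packet) -> Optional[Packet]:
        key = FlowKey(pkt["src_ip"], pkt["dst_ip"], pkt["protocol"],
                      pkt["src_port"], pkt["dst_port"])
        actions = self._flows.get(key)
        if actions is None:
            # Simplification: a miss is reported to the caller.
            raise LookupError("no matching flow")
        current: Optional[Packet] = pkt
        for action in actions:
            current = action(current)
            if current is None:  # packet was dropped
                return None
        return current


table = FlowTable()
key = FlowKey("10.0.0.4", "10.0.0.5", "tcp", 50000, 443)
table.install(key, [rewrite_dst_ip("192.0.2.10")])

packet = {"src_ip": "10.0.0.4", "dst_ip": "10.0.0.5", "protocol": "tcp",
          "src_port": 50000, "dst_port": 443, "payload": b"hello"}
result = table.process(packet)
print(result["dst_ip"])
```

Because the table is keyed on the whole flow rather than on a MAC address, every packet of an installed flow gets the same policy decision with a single lookup.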

The Need & Desired Goals

Burning CPU cores on these services takes processing power away from customer VMs and raises the overall cost of providing cloud services. The authors therefore need a cost-effective solution that combines hardware-like performance with software-like programmability.

Choosing the Right Hardware - FPGAs as SmartNICs

Design & Architecture

Performance

Strengths & Weaknesses

Strengths

Weaknesses

Future Work