Calculation method, calculation apparatus, and computer-readable storage medium

Document No.: 156070. Publication date: 2021-10-26.

Abstract: This technology, "Calculation method, calculation apparatus, and computer-readable storage medium," was created by Tilaka Raj Surendra Babu, Hu Xiao, and Su Zhongyou on 2017-12-28. The invention relates to a computing method, a computing device, and a computer-readable storage medium. In some examples, a computing device includes: a virtual network endpoint; a network interface card (NIC) comprising a first hardware component and a second hardware component, wherein the first hardware component and the second hardware component provide separate packet input/output access to a physical network interface of the NIC, and wherein the NIC is configured to receive packets inbound from the physical network interface; and a virtual router to receive the packet from the NIC and, in response to determining that a destination endpoint of the packet is the virtual network endpoint, output the packet back to the NIC using the first hardware component, wherein the NIC is further configured, in response to receiving the packet from the virtual router, to switch the packet to the virtual network endpoint and output the packet to the virtual network endpoint using the second hardware component.

1. A method, comprising:

outputting, by a virtual execution endpoint executed by a computing device, a request for a layer 2 address corresponding to a layer 3 address;

sending, by a virtual router executed by the computing device to provide a virtual network for communication with the virtual execution endpoint, a layer 2 address configured for a first hardware component of a network interface card of the computing device to the virtual execution endpoint in response to the request,

wherein the network interface card comprises a physical network interface, the first hardware component, and a second hardware component, and

wherein the first hardware component and the second hardware component provide separate packet input/output access to the physical network interface of the network interface card; and

sending, by the virtual execution endpoint to the network interface card, an outbound packet generated by the virtual execution endpoint, wherein the outbound packet has a destination layer 2 address, the destination layer 2 address being the layer 2 address for the first hardware component.

2. The method of claim 1, further comprising:

in response to receiving the outbound packet, switching, by the network interface card, the outbound packet to the virtual router of the computing device via the first hardware component based on the destination layer 2 address being the layer 2 address for the first hardware component.

3. The method of claim 1, further comprising:

receiving, by the network interface card, the outbound packet from the virtual execution endpoint via the second hardware component of the network interface card.

4. The method of claim 1, further comprising:

switching, by the network interface card, the request for a layer 2 address corresponding to the layer 3 address to the virtual router.

5. The method of claim 1,

wherein the network interface card comprises a single root input/output virtualization (SR-IOV) device,

wherein the first hardware component comprises a physical function of the SR-IOV device, and

wherein the second hardware component comprises a virtual function of the SR-IOV device.

6. The method of claim 1,

wherein the network interface card comprises a single root input/output virtualization (SR-IOV) device,

wherein the first hardware component comprises a first virtual function of the SR-IOV device, and

wherein the second hardware component comprises a second virtual function of the SR-IOV device.

7. The method of claim 1,

wherein the request for a layer 2 address corresponding to the layer 3 address comprises an address resolution protocol request for a layer 2 address of a default layer 3 gateway.

8. The method of claim 1,

wherein sending the layer 2 address configured for the first hardware component comprises: sending an ARP reply that includes the layer 2 address configured for the first hardware component.

9. The method of claim 1, further comprising:

receiving, by the virtual router, the outbound packet via the first hardware component; and

encapsulating, by the virtual router, the outbound packet with an outer header of the virtual network, and outputting the outbound packet back to the network interface card for output on the physical network interface to tunnel the outbound packet to another physical computing device hosting a destination virtual network endpoint of the outbound packet.

10. The method of claim 1, wherein the virtual execution endpoint comprises at least one of a virtual machine and a container.

11. A computing device, comprising:

one or more hardware-based processors coupled to a storage device;

a virtual network endpoint configured for execution by one or more of the processors;

a network interface card comprising a first hardware component and a second hardware component, wherein the first hardware component and the second hardware component provide separate packet input/output access to a physical network interface of the network interface card; and

a virtual router configured for execution by one or more of the processors to provide a virtual network for communication with a virtual execution endpoint,

wherein the virtual execution endpoint is configured to output a request for a layer 2 address corresponding to a layer 3 address,

wherein the virtual router is configured to send, in response to the request, a layer 2 address configured for a first hardware component of a network interface card of the computing device to the virtual execution endpoint, and

wherein the virtual execution endpoint is configured to send an outbound packet generated by the virtual execution endpoint to the network interface card, wherein the outbound packet has a destination layer 2 address, the destination layer 2 address being the layer 2 address for the first hardware component.

12. The computing device of claim 11,

wherein the network interface card is configured to switch the outbound packet to the virtual router of the computing device via the first hardware component based on the destination layer 2 address being the layer 2 address for the first hardware component in response to receiving the outbound packet.

13. The computing device of claim 12,

wherein the network interface card is configured to receive the outbound packet from the virtual execution endpoint via the second hardware component of the network interface card.

14. The computing device of claim 11,

wherein the network interface card is configured to switch the request for a layer 2 address corresponding to the layer 3 address to the virtual router.

15. The computing device of claim 11,

wherein the network interface card comprises a single root input/output virtualization (SR-IOV) device,

wherein the first hardware component comprises a physical function of the SR-IOV device, and

wherein the second hardware component comprises a virtual function of the SR-IOV device.

16. The computing device of claim 11,

wherein the network interface card comprises a single root input/output virtualization (SR-IOV) device,

wherein the first hardware component comprises a first virtual function of the SR-IOV device, and

wherein the second hardware component comprises a second virtual function of the SR-IOV device.

17. The computing device of claim 11,

wherein the request for a layer 2 address corresponding to the layer 3 address comprises an address resolution protocol request for a layer 2 address of a default layer 3 gateway.

18. The computing device of claim 11,

wherein to send the layer 2 address configured for the first hardware component, the virtual router is configured to send an address resolution protocol reply including the layer 2 address configured for the first hardware component.

19. The computing device of claim 11,

wherein the virtual router is configured to receive the outbound packet via the first hardware component, and

wherein the virtual router is configured to encapsulate the outbound packet with an outer header of the virtual network and output the outbound packet back to the network interface card for output on the physical network interface to tunnel the outbound packet to another physical computing device hosting a destination virtual network endpoint for the outbound packet.

20. A non-transitory computer-readable storage medium comprising instructions for causing one or more processors of a computing device to perform the steps of:

outputting, by a virtual execution endpoint configured for execution by the one or more processors, a request for a layer 2 address corresponding to a layer 3 address;

sending, by a virtual router configured for execution by the computing device to provide a virtual network for communication with the virtual execution endpoint, a layer 2 address configured for a first hardware component of a network interface card of the computing device to the virtual execution endpoint in response to the request,

wherein the network interface card comprises a physical network interface, the first hardware component, and a second hardware component, and

wherein the first hardware component and the second hardware component provide separate packet input/output access to the physical network interface of the network interface card; and

sending, by the virtual execution endpoint to the network interface card, an outbound packet generated by the virtual execution endpoint, wherein the outbound packet has a destination layer 2 address, the destination layer 2 address being the layer 2 address for the first hardware component.

Technical Field

The present invention relates to computer networks, and more particularly to implementing virtual networks on physical networks.

Background

In a typical cloud data center environment, there are a large number of interconnected servers that provide computing and/or storage capacity to run various applications. For example, a data center may comprise a facility that hosts applications and services for subscribers (i.e., customers of the data center). The data center may, for example, host infrastructure equipment, such as networking and storage systems, redundant power supplies, and environmental controls. In a typical data center, clusters of storage systems and application servers are interconnected via a high-speed switch fabric provided by one or more tiers of physical network switches and routers. More sophisticated data centers provide infrastructure spread throughout the world, with subscriber support equipment located in various physical hosting facilities.

Disclosure of Invention

In general, techniques are described for switching packets for a virtual network, between a tunnel endpoint for the virtual network and a virtual network endpoint hosted by a computing device, using a network interface card-based switch of the computing device. For example, a computing device may use virtualization techniques to host multiple virtual machines or containers that are respective endpoints for one or more virtual networks. The computing device may also execute a software-based virtual router that determines, based on a tunnel encapsulation header and a layer 3 packet header of a packet, the virtual network endpoint for the packet received via a tunnel that overlays the data center physical switch fabric and terminates at the computing device. The virtual router may encapsulate the received packet with a layer 2 header having a layer 2 destination address associated with the destination endpoint for the packet, and the virtual router may output the packet to a network interface card of the computing device. An internal layer 2 switch of the network interface card, which may be a single root input/output virtualization (SR-IOV) network interface card switch, switches the packet to the destination endpoint based on the layer 2 header.

For packets output by a virtual network endpoint for communication via a virtual network, the virtual network endpoint is configured to output packets having a layer 2 header destined for the virtual router to the internal layer 2 switch of the network interface card. For each such outbound packet, the internal layer 2 switch switches the packet to the virtual router, which determines a virtual network for the packet and outputs the packet, encapsulated by a tunnel encapsulation header indicating the virtual network, to the physical destination computing device.

The techniques may provide one or more advantages. For example, because the path of packets between a software-based virtual router and a virtual network endpoint (both hosted by a computing device) is via the network interface card switch, the techniques can leverage existing underlying network interface card hardware queues and switching capabilities to perform high-speed layer 2 forwarding between the virtual router and the endpoints. In addition, the network interface card may use direct memory access to copy packets between the virtual router memory address space and the virtual network endpoint, thereby reducing the involvement of the computing device central processing unit (CPU) in inter-process memory copying. The techniques may also enable the virtual router to take advantage of network interface card rate limiting and rate shaping, as well as hardware offload capabilities such as Generic Receive Offload (GRO), Transmission Control Protocol (TCP) Segmentation Offload (TSO), and Large Receive Offload (LRO). Furthermore, by using a software-based virtual router in combination with network interface card-based switching between the virtual router and virtual network endpoints, the techniques may overcome disadvantages inherent in some network interface card-based virtual routers, such as limited protocol support, the increased cost of network interface cards having tunnel endpoint and virtual routing capabilities, and a more challenging development environment.

In one example, a non-transitory computer-readable storage medium comprising instructions that cause a computing device to perform the steps of: receiving, by a network interface card of a computing device, packets inbound from a physical network interface via the physical network interface of the network interface card, wherein the network interface card includes a first hardware component and a second hardware component, wherein the first hardware component and the second hardware component provide separate packet input/output access to the physical network interface of the network interface card; receiving, by a virtual router of a computing device, a packet from a network interface card; outputting, by the virtual router, the packet back to the network interface card using the first hardware component in response to determining that the destination endpoint of the packet is a virtual network endpoint of the computing device; and switching, by the network interface card, the packet to the virtual network endpoint and outputting the packet to the virtual network endpoint using the second hardware component in response to receiving the packet from the virtual router.

In another example, a method, comprising: receiving, by a network interface card of a computing device, packets inbound from a physical network interface via the physical network interface of the network interface card, wherein the network interface card includes a first hardware component and a second hardware component, and wherein the first hardware component and the second hardware component provide separate packet input/output access to the physical network interface of the network interface card; receiving, by a virtual router of a computing device, a packet from a network interface card; outputting, by the virtual router, the packet back to the network interface card using the first hardware component in response to determining that the destination endpoint of the packet is a virtual network endpoint of the computing device; and switching, by the network interface card in response to receiving the packet from the virtual router, the packet to the virtual network endpoint and outputting the packet to the virtual network endpoint using the second hardware component.

In another example, a computing device includes: one or more hardware-based processors coupled to a storage device; a virtual network endpoint configured for execution by one or more processors; a network interface card comprising a first hardware component and a second hardware component, wherein the first hardware component and the second hardware component provide separate packet input/output access to a physical network interface of the network interface card, wherein the network interface card is configured to receive packets inbound from the physical network interface; and a virtual router configured for execution by the one or more processors to receive the packet from the network interface card and output the packet back to the network interface card using the first hardware component in response to determining that a destination endpoint of the packet is a virtual network endpoint, wherein the network interface card is further configured to switch the packet to the virtual network endpoint and output the packet to the virtual network endpoint using the second hardware component in response to receiving the packet from the virtual router.

The details of one or more embodiments of the disclosure are set forth in the accompanying drawings and the description below. Other features, objects, and advantages of the disclosure will be apparent from the description and drawings, and from the claims.

Drawings

FIG. 1 is a block diagram illustrating an example network system having a data center in which examples of the techniques described herein may be implemented.

Fig. 2A-2B are block diagrams respectively illustrating an example computing device utilizing a network interface card internal device switch to forward packets between a virtual network endpoint and a virtual router of a tunnel endpoint in accordance with the techniques described herein.

Fig. 3A is a block diagram detailing an exemplary tunnel packet that may be processed by a computing device in accordance with the techniques described in this disclosure.

Fig. 3B is a block diagram illustrating in detail an exemplary packet with a new layer 2 header generated by a virtual router for switching to a destination virtual network endpoint by a network interface card based switch.

FIG. 4 is a flow diagram illustrating an example mode of operation for a computing device in accordance with the techniques described in this disclosure.

FIG. 5 is a flow diagram illustrating an example mode of operation for a computing device in accordance with the techniques described in this disclosure.

Like reference numerals refer to like elements throughout the description and drawings.

Detailed Description

Fig. 1 is a block diagram illustrating an example network system 8 having a data center 10 in which examples of the techniques described herein may be implemented. In general, the data center 10 provides an operating environment for applications and services for a customer site 11 (shown as "customer 11") having one or more customer networks coupled to the data center through the service provider network 7. For example, the data center 10 may host infrastructure equipment, such as networking and storage systems, redundant power supplies, and environmental controls. The service provider network 7 is coupled to a public network 15, which may represent one or more networks managed by other providers, and may thus form part of a large-scale public network infrastructure, such as the internet. For example, public network 15 may represent a Local Area Network (LAN), a Wide Area Network (WAN), the internet, a virtual LAN (VLAN), an enterprise LAN, a layer 3 Virtual Private Network (VPN), an Internet Protocol (IP) intranet operated by the service provider that runs service provider network 7, an enterprise IP network, or some combination thereof.

Although customer site 11 and public network 15 are primarily shown and described as edge networks of service provider network 7, in some examples, one or more of customer site 11 and public network 15 may be a tenant network within data center 10 or another data center. For example, the data center 10 may host a plurality of tenants (customers), each respectively associated with one or more Virtual Private Networks (VPNs), each of which may implement one of the customer sites 11.

The service provider network 7 provides packet-based connectivity to additional customer sites 11, data centers 10, and public networks 15. The service provider network 7 may represent a network owned and operated by a service provider to interconnect a plurality of networks. The service provider network 7 may implement multi-protocol label switching (MPLS) forwarding, and in this case the service provider network may be referred to as an MPLS network or MPLS backbone. In some cases, the service provider network 7 represents a plurality of interconnected autonomous systems, such as the internet, that provide services from one or more service providers.

In some examples, data center 10 may represent one of many geographically distributed network data centers. As shown in the example of Fig. 1, data center 10 may be a facility that provides network services to customers. Customers of the service provider may be collective entities, such as enterprises and governments, or individuals. For example, a network data center may host network services for several enterprises and end users. Other exemplary services may include data storage, virtual private networking, traffic engineering, file serving, data mining, scientific computing or supercomputing, and the like. Although described as separate edge networks of service provider network 7, elements of data center 10, such as one or more Physical Network Functions (PNFs) or Virtual Network Functions (VNFs), may be included within the service provider network 7 core.

In this example, the data center 10 includes storage and/or compute servers interconnected via a switch fabric 14 provided by one or more layers of physical network switches and routers, with servers 12A-12X (herein "servers 12") depicted as coupled to top-of-rack switches 16A-16N. The server 12 may also be referred to herein as a "host" or a "host device". Although only the servers coupled to TOR switch 16A are shown in detail in fig. 1, data center 10 may also include many additional servers coupled to other TOR switches 16 of data center 10.

The switch fabric 14 in the illustrated example includes interconnected top-of-rack (or other "blade") switches 16A-16N (collectively, "TOR switches 16") coupled to chassis (or "spine" or "core") switches 18A-18M (collectively, "chassis switches 18"). Although not shown, the data center 10 may also include, for example, one or more non-edge switches, routers, hubs, gateways, security devices such as firewalls, intrusion detection and/or intrusion prevention devices, servers, computer terminals, laptops, printers, databases, wireless mobile devices such as mobile phones or personal digital assistants, wireless access points, bridges, cable modems, application accelerators, or other network devices. Data center 10 may also include one or more Physical Network Functions (PNFs), such as physical firewalls, load balancers, routers, route reflectors, Broadband Network Gateways (BNGs), evolved packet cores or other cellular network elements, and other PNFs.

In this example, TOR switches 16 and chassis switches 18 provide redundant (multi-homed) connectivity to IP fabric 20 and service provider network 7 for servers 12. Chassis switches 18 aggregate traffic flows and provide connectivity between TOR switches 16. TOR switches 16 may be network devices that provide layer 2 (MAC) and/or layer 3 (e.g., IP) routing and/or switching functionality. TOR switches 16 and chassis switches 18 may each include one or more processors and memory, and may execute one or more software processes. Chassis switches 18 are coupled to IP fabric 20, and IP fabric 20 may perform layer 3 routing to route network traffic between data center 10 and customer sites 11 through service provider network 7. The switch fabric of data center 10 shown here is merely an example; other switch fabrics may have more or fewer switching layers, for example.

The term "packet flow," "traffic flow," or simply "flow" refers to a set of packets from a particular source device or endpoint and sent to a particular destination device or endpoint. The flow of a single packet can be identified by a five-tuple, such as: < source network address, destination network address, source port, destination port, protocol >. The five-tuple generally identifies a packet flow corresponding to the received packet. An n-tuple refers to n items extracted from a five-tuple. For example, a duplet of a packet may refer to a < source network address, destination network address > or a combination of < source network address, source port > of the packet.

Servers 12 may each represent a compute server, a switch, or a storage server. For example, each server 12 may represent a computing device, such as an x86 processor-based server, configured to operate in accordance with the techniques described herein. The server 12 may provide a Network Function Virtualization Infrastructure (NFVI) for the NFV architecture.

The servers 12 host endpoints 23 (shown in fig. 1 as "EPs" 23) of one or more virtual networks running on a physical network, represented here by the IP fabric 20 and the switch fabric 14. Although primarily described with respect to a data center-based switching network, other physical networks, such as the service provider network 7, may support one or more virtual networks.

In accordance with various aspects of the techniques described in this disclosure, one or more servers 12 may each include a virtual router that executes one or more routing instances for corresponding virtual networks within data center 10. Each routing instance may be associated with a network forwarding table. Each routing instance may represent a virtual routing and forwarding instance (VRF) for an Internet Protocol-Virtual Private Network (IP-VPN). For example, packets received by the virtual router of server 12A from the underlying physical network fabric may include an outer header to allow the physical network fabric to tunnel the payload or "inner packet" to the physical network address of the network interface of server 12A that executes the virtual router. The outer header may include not only the physical network address of the network interface of the server, but also a virtual network identifier, such as a VxLAN tag or a multi-protocol label switching (MPLS) label, that identifies one of the virtual networks and the corresponding routing instance executed by the virtual router. The inner packet includes an inner header having a destination network address that conforms to the virtual network address space of the virtual network identified by the virtual network identifier.
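
As a hedged, non-authoritative sketch of how a virtual network identifier might be recovered from such an outer header, the following parses the 24-bit VNI from a VxLAN header (RFC 7348 layout) and maps it to a routing instance; the routing-instance table is a hypothetical stand-in, not the patent's data structure:

```python
# Sketch: extract the VNI a virtual router could use to select a routing instance.
def parse_vxlan_header(outer_payload: bytes) -> int:
    """Return the VNI from the 8-byte VxLAN header preceding the inner frame."""
    flags = outer_payload[0]
    if not flags & 0x08:                      # I-bit must be set for a valid VNI
        raise ValueError("VNI not present")
    return int.from_bytes(outer_payload[4:7], "big")

routing_instances = {100: "NFT-A", 200: "NFT-B"}     # VNI -> routing instance
hdr = bytes([0x08, 0, 0, 0]) + (100).to_bytes(3, "big") + b"\x00"
print(routing_instances[parse_vxlan_header(hdr)])    # NFT-A
```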

In accordance with one or more embodiments of the present disclosure, controller 24 provides a logically, and in some cases physically, centralized controller to facilitate operation of one or more virtual networks within data center 10. In some examples, controller 24 may operate in response to configuration inputs received from a network administrator. Additional information on the operation of controller 24 in conjunction with other devices of data center 10 or other software-defined networks is found in International Application No. PCT/US2013/044378, entitled "PHYSICAL PATH DETERMINATION FOR VIRTUAL NETWORK PACKET FLOWS," filed June 5, 2013, and in U.S. Patent Application No. 14/226,509, entitled "Tunneled Packet Aggregation for Virtual Networks," filed March 26, 2014, each of which is incorporated herein by reference as if fully set forth herein.

Each server 12 hosts one or more virtual network endpoints 23 of a virtual network. Each endpoint 23 may represent a virtual machine, a container, or another virtualized execution environment that is an endpoint of a virtual network, such as a layer 3 endpoint of the virtual network. In the example of Fig. 1, server 12A executes two virtual network endpoints 23A and server 12X executes one virtual network endpoint 23X. However, a server 12 may execute as many endpoints as is practical given the hardware resource limitations of the server 12. Each endpoint 23 may use one or more virtual hardware components 21 to perform packet I/O or otherwise process packets. For example, endpoint 23A may use one virtual hardware component (e.g., an SR-IOV virtual function) enabled by NIC 13A to perform packet I/O and receive/send packets on one or more communication links with TOR switch 16A.

Typically, virtual machines provide a virtualized/guest operating system for executing applications in an isolated virtual environment. Because the virtual machine is virtualized from the physical hardware of the host server, the executing application is isolated from the hardware of the host and other virtual machines.

An alternative to virtual machines is the virtualized container, such as those provided by the open-source DOCKER container application. Like a virtual machine, each container is virtualized and may remain isolated from the host and other containers. However, unlike a virtual machine, each container may omit an individual operating system and provide only an application suite and application-specific libraries. A container is executed by the host as an isolated user-space instance and may share an operating system and common libraries with other containers executing on the host. Thus, containers may require less processing power, storage, and network resources than virtual machines. As used herein, containers may also be referred to as virtualization engines, virtual private servers, silos, or jails. In some instances, the techniques described herein with respect to containers apply equally to virtual machines or other virtualized components.

The servers 12 each include at least one Network Interface Card (NIC) 13, each of which includes at least one interface to exchange packets with the TOR switches 16 over a communication link. For example, server 12A includes NIC 13A. Each NIC 13 provides one or more virtual hardware components 21 for virtualized input/output (I/O). A virtual hardware component for I/O may be a virtualization of the physical NIC 13 (the "physical function"). For example, in single root I/O virtualization (SR-IOV), described in the Peripheral Component Interconnect Special Interest Group (PCI-SIG) SR-IOV specification, the PCIe physical function of the network interface card (or "network adapter") is virtualized to present one or more virtual network interface cards as "virtual functions" for use by respective endpoints executing on the server 12. In this way, the virtual network endpoints may share the same PCIe physical hardware resources, and the virtual functions are examples of virtual hardware components 21. As another example, one or more servers 12 may implement Virtio, a para-virtualization framework available, e.g., for the Linux operating system, that provides emulated NIC functionality as a type of virtual hardware component. As another example, one or more servers 12 may implement Open vSwitch to perform distributed virtual multi-layer switching between one or more virtual NICs (vNICs) of hosted virtual machines, where such vNICs may also represent a type of virtual hardware component. In some examples, the virtual hardware components are virtual I/O (e.g., NIC) components. In some examples, the virtual hardware components are SR-IOV virtual functions.
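
As a side note, on a Linux host the SR-IOV virtual functions a NIC exposes are visible through standard sysfs attributes; the sketch below reads them ("eth0" is a placeholder interface name, and the attributes are absent on non-SR-IOV devices):

```python
# Sketch: read SR-IOV VF counts from the kernel's sysfs interface.
from pathlib import Path

def sriov_info(iface="eth0"):
    dev = Path(f"/sys/class/net/{iface}/device")
    def read(name):
        p = dev / name
        return int(p.read_text()) if p.exists() else None
    return {
        "total_vfs": read("sriov_totalvfs"),   # VFs the hardware supports
        "num_vfs": read("sriov_numvfs"),       # VFs currently configured
    }

print(sriov_info("eth0"))
```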

The NICs 13 each include an internal device switch 25 to switch data between the virtual hardware components 21 associated with the NIC. For example, for an SR-IOV-capable NIC, the internal device switch may be a Virtual Ethernet Bridge (VEB) that switches between SR-IOV virtual functions and, correspondingly, between the endpoints configured to use the SR-IOV virtual functions, where each endpoint may include a guest operating system. The internal device switch 25 may alternatively be referred to as a NIC switch or, for SR-IOV implementations, an SR-IOV NIC switch. Each virtual hardware component 21A associated with NIC 13A may be associated with a layer 2 destination address, which may be assigned by NIC 13A or by a software process responsible for configuring NIC 13A. The physical hardware component (or "physical function" for SR-IOV implementations) is also associated with a layer 2 destination address.

To switch data between the virtual hardware components associated with NIC 13A, the internal device switch 25 may perform layer 2 forwarding to switch or bridge layer 2 packets between the virtual hardware components 21A and the physical hardware component of NIC 13A. Each virtual hardware component 21 may be located on a virtual local area network (VLAN) for the virtual network of the endpoint 23 that uses the virtual hardware component 21 for I/O. Further exemplary details of SR-IOV implementations within a NIC are described in "PCI-SIG SR-IOV Primer: An Introduction to SR-IOV Technology," Rev. 2.5, Intel Corp., January 2011, the entire contents of which are incorporated herein by reference.
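
A toy model of such layer 2 forwarding by the internal device switch, purely for illustration (the real VEB is NIC hardware, and all MAC values and component names here are invented):

```python
# Toy MAC-address table bridging frames between the PF and VFs of one NIC.
mac_table = {
    "02:00:00:00:00:01": "physical-function",   # layer 2 address of the PF
    "02:00:00:00:00:11": "virtual-function-1",  # VF used by one endpoint
    "02:00:00:00:00:12": "virtual-function-2",
}

def switch_frame(dst_mac):
    """Return the hardware component whose input queue receives the frame."""
    if dst_mac not in mac_table:
        raise KeyError(f"no bridge entry for {dst_mac}")
    return mac_table[dst_mac]

assert switch_frame("02:00:00:00:00:11") == "virtual-function-1"
```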

The servers 12A-12X include respective tunnel endpoints 26A-26X. For packets received by server 12A, for example, tunnel endpoint 26A terminates the virtual network overlay tunnel. As described herein, each tunnel endpoint 26 includes, executes, or is otherwise associated with a virtual router that determines the virtual network for a received packet based on the tunnel encapsulation header of the packet and forwards the packet to the appropriate destination endpoint 23 for the packet. For each packet outbound from an endpoint 23, the virtual router of tunnel endpoint 26A attaches a tunnel encapsulation header indicating the virtual network for the packet to generate an encapsulated or "tunnel" packet, and tunnel endpoint 26A outputs the encapsulated packet via the overlay tunnel for the virtual network to a physical destination computing device, such as another one of servers 12. As used herein, a virtual router may perform the operations of a tunnel endpoint: encapsulating inner packets sourced by virtual network endpoints 23 to generate tunnel packets, and decapsulating tunnel packets to obtain inner packets for routing to virtual network endpoints 23.

In accordance with the techniques described herein, the servers 12 employ a hybrid model for internal forwarding, whereby a tunnel endpoint 26 forwards packets, received from the switch fabric 14 via the virtual network overlay, to the internal device switch 25 for switching to the destination endpoint 23. In the hybrid model described herein, encapsulation/decapsulation and virtual routing of packets by server 12A are performed, for example, by tunnel endpoint 26A executed by one or more processors of server 12A (processors other than NIC 13A), while switching of packets between tunnel endpoint 26A and virtual network endpoints 23 is performed by switch 25A of NIC 13A. The virtual routing model is thus a hybrid model in that neither NIC 13A nor tunnel endpoint 26A alone performs both (1) encapsulation/decapsulation and virtual routing and (2) switching for packets from or to the virtual network endpoints 23.

For example, with respect to server 12A, the internal device switch 25A switches packets for the virtual networks between tunnel endpoint 26A and virtual network endpoints 23A. Tunnel endpoint 26A may receive packet 27 from the physical hardware component. The virtual router of tunnel endpoint 26A may determine the destination virtual network endpoint 23 for packet 27 based on the tunnel encapsulation header and the layer 3 packet header of packet 27. The virtual router may encapsulate the received packet with a new layer 2 header having a layer 2 destination address associated with the destination endpoint 23 for packet 27, and the virtual router may output packet 27 back to NIC 13A. The internal device switch 25A switches packet 27 to the destination endpoint 23 based on the new layer 2 header. In some cases, the new layer 2 header includes a VLAN tag for the VLAN of the destination endpoint 23.

For packets output by virtual network endpoint 23A for delivery over a virtual network, virtual network endpoint 23A is configured to output, to the internal device switch 25A, packets containing a layer 2 header whose destination layer 2 address is the layer 2 address of the physical hardware component, or of one of the virtual hardware components 21A, that tunnel endpoint 26A uses for I/O. For each such outbound packet, the internal device switch 25A switches the packet to tunnel endpoint 26A, which uses a virtual routing instance to determine the virtual network for the packet and outputs, to the physical destination computing device, the packet encapsulated by a tunnel encapsulation header that indicates the virtual network of the source endpoint 23A and of the destination endpoint for the packet. A toy of this hybrid flow is sketched below.
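
The following is a minimal, self-contained toy of the inbound half of the hybrid model (illustrative only, not the patent's implementation; tables, names, and addresses are hypothetical): the software virtual router strips the tunnel headers, consults a forwarding table for the endpoint's layer 2 address, and hands the re-framed packet back to the NIC, whose hardware switch delivers it to the endpoint's queue:

```python
# Toy hybrid-model inbound path: software routing, hardware layer 2 switching.
nft = {"10.1.1.2": "02:00:00:00:00:11"}        # inner dst IP -> endpoint MAC
vf_queues = {"02:00:00:00:00:11": []}          # NIC switch: MAC -> input queue

def virtual_router_inbound(tunnel_packet):
    inner = tunnel_packet["inner"]             # outer/tunnel headers stripped
    frame = {"dst_mac": nft[inner["dst_ip"]],  # new layer 2 header
             "inner": inner}
    vf_queues[frame["dst_mac"]].append(frame)  # back to NIC; bridge switches it

virtual_router_inbound({"vni": 100, "inner": {"dst_ip": "10.1.1.2", "data": b"hi"}})
assert len(vf_queues["02:00:00:00:00:11"]) == 1
```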

Fig. 2A-2B are block diagrams respectively illustrating example computing devices that use a network interface card internal device switch to forward packets between virtual network endpoints and virtual router instances associated with tunnel endpoints, in accordance with the techniques described herein. The computing device 200 of Fig. 2A may represent a real or virtual server and may represent an example instance of any of the servers 12 of Fig. 1. In this example, the computing device 200 includes a bus 242 that couples hardware components of the hardware environment of the computing device 200. Bus 242 couples an SR-IOV-capable Network Interface Card (NIC) 230, a storage disk 246, and a microprocessor 210. A front-side bus may in some cases couple the microprocessor 210 and the memory 244. In some examples, bus 242 may couple the memory 244, the microprocessor 210, and the NIC 230. Bus 242 may represent a Peripheral Component Interconnect Express (PCIe) bus. In some examples, a direct memory access (DMA) controller may control DMA transfers between components coupled to bus 242. In some examples, components coupled to bus 242 control DMA transfers between components coupled to bus 242.

The microprocessor 210 may include one or more processors, each including an independent execution unit to execute instructions that conform to an instruction set architecture. The execution units may be implemented as separate integrated circuits (ICs) or may be combined within one or more multi-core processors (or "many-core" processors) that are each implemented using a single IC (i.e., a chip multiprocessor).

Disk 246 represents a computer-readable storage medium; computer-readable storage media include volatile and nonvolatile, removable and non-removable media implemented in any method or technology for storage of information such as computer-readable instructions, data structures, program modules or other data. Computer-readable storage media includes, but is not limited to, Random Access Memory (RAM), Read Only Memory (ROM), EPROM, flash memory, CD-ROM, Digital Versatile Disks (DVD) or other optical storage, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices, or any other medium which can be used to store the desired information and which can be accessed by the microprocessor 210.

Main memory 244 includes one or more computer-readable storage media, which may include random-access memory (RAM), such as various forms of dynamic RAM (DRAM) (e.g., DDR2/DDR3 SDRAM) or static RAM (SRAM), flash memory, or any other form of fixed or removable storage medium that can be used to carry or store desired program code and program data in the form of instructions or data structures and that can be accessed by a computer. Main memory 244 provides a physical address space composed of addressable memory locations.

A Network Interface Card (NIC) 230 includes one or more interfaces 232 configured to exchange packets using links of an underlying physical network. The interfaces 232 may include a port interface card having one or more network ports. The NIC 230 also includes on-card memory 227, e.g., to store packet data. Direct memory access transfers between the NIC 230 and other devices coupled to bus 242 may read from and write to memory 227.

Memory 244, NIC 230, storage disk 246, and microprocessor 210 provide an operating environment for a software stack that executes a hypervisor 214 and one or more virtual machines 224A-224B (collectively, "virtual machines 224") and one or more virtual machines 228 managed by hypervisor 214. The computing device 200 may execute more or fewer virtual machines.

While the virtual network endpoints in Fig. 2A-2B are illustrated and described with respect to virtual machines, other operating environments, such as containers (e.g., DOCKER containers), may implement virtual network endpoints. An operating system kernel (not shown in Fig. 2A-2B) may execute in kernel space and may include, for example, Linux, Berkeley Software Distribution (BSD), another Unix-variant kernel, or a Windows server operating system kernel, available from Microsoft Corp.

The computing device 200 executes the hypervisor 214 to manage the virtual machines 228. Exemplary hypervisors include the kernel-based virtual machine (KVM) for the Linux kernel, Xen, ESXi available from VMware, Windows Hyper-V available from Microsoft, and other open-source and proprietary hypervisors. Hypervisor 214 may represent a virtual machine manager (VMM).

The virtual machines 224, 228 may host one or more applications, such as virtual network function instances. In some instances, the virtual machines 224, 228 may host one or more VNF instances, where each VNF instance is configured to apply network functionality to packets.

The hypervisor 214 includes a physical driver 225 to use the physical function 221 provided by the network interface card 230. The network interface card 230 may also implement SR-IOV to enable sharing of the physical network function (I/O) among the virtual machines 224. The shared virtual devices, virtual functions 217A-217B, provide dedicated resources such that each virtual machine 224 (and corresponding guest operating system) can access dedicated resources of NIC 230, which therefore appears to each virtual machine 224 as a dedicated NIC. Virtual functions 217 may represent lightweight PCIe functions that share physical resources with the physical function 221 and with other virtual functions 217. According to the SR-IOV standard, NIC 230 may have thousands of virtual functions available, but for I/O-intensive applications the number of configured virtual functions is typically much smaller. Virtual functions 217 may represent illustrative examples of the virtual hardware components 21 of Fig. 1.

Virtual functions 217A-217B may be provisioned with queue resources 219A-219B and control over the queue resources allocated to them. For access to global resources, a virtual function 217 sends a request to the physical function 221, and the physical function 221 accesses the global resource in response to the request. Each virtual function 217 has a different associated layer 2 address (e.g., a MAC address). The physical function 221 has an associated layer 2 address that is different from any layer 2 address associated with the virtual functions 217. The layer 2 address of the physical function 221 may be regarded as the layer 2 address of NIC 230.
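
For reference, on a Linux host a VF's layer 2 address can be assigned from the physical function's side with iproute2; the sketch below simply invokes that command from Python (interface name, VF index, and MAC are placeholders, and root privileges are required):

```python
# Sketch: assign a MAC address to SR-IOV virtual function N via iproute2.
import subprocess

def set_vf_mac(pf_iface, vf_index, mac):
    subprocess.run(
        ["ip", "link", "set", pf_iface, "vf", str(vf_index), "mac", mac],
        check=True,   # raise if the command fails (e.g., no such VF)
    )

# Example invocation (requires root and an SR-IOV-capable NIC):
# set_vf_mac("eth0", 0, "02:00:00:00:00:11")
```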

Virtual machines 224A-224B include virtual function (VF) drivers 226A-226B, respectively, which are presented directly into the virtual machine 224 guest operating systems, thereby providing direct communication between NIC 230 and the virtual machine 224, via bus 242, using the virtual function 217 allocated to that virtual machine. This may reduce hypervisor 214 overhead relative to software-based, VIRTIO and/or vSwitch implementations, in which the hypervisor 214 memory address space of memory 244 stores packet data, and copying packet data from NIC 230 to the hypervisor 214 memory address space and from the hypervisor 214 memory address space to the virtual machine 224 memory address space consumes cycles of microprocessor 210.

NIC 230 further includes a hardware-based Ethernet bridge 234 to perform layer 2 forwarding between the virtual functions 217 and the physical function 221. Bridge 234 thus provides hardware acceleration, via bus 242, of packet forwarding among the virtual machines 224, and of packet forwarding between any virtual machine 224 and the hypervisor 214, which accesses the physical function 221 via the physical driver 225.

The computing device 200, including the virtual router 220, may be coupled to a physical network switch fabric that includes an overlay network extending the switch fabric from the physical switches to software or "virtual" routers coupled to physical servers of the switch fabric. A virtual router may be a process or thread, or a component thereof, executed by a physical server (e.g., a server 12 of Fig. 1) that dynamically creates and manages one or more virtual networks usable for communication between virtual network endpoints. In one example, the virtual routers implement each virtual network with an overlay network, which provides the ability to decouple an endpoint's virtual address from the physical address (e.g., IP address) of the server on which the endpoint executes. Each virtual network may use its own addressing and security scheme and may be considered orthogonal to the physical network and its addressing scheme. Packets may be communicated within and across virtual networks over the physical network using a variety of techniques.

In the exemplary computing device 200 of Fig. 2A, the virtual router 220 executes within the hypervisor 214, which uses the physical function 221 for I/O, but the virtual router 220 may instead execute within a hypervisor, a host operating system, a host application, or one of the virtual machines 224 having a VF driver 226 that uses a virtual function 217 for virtual I/O.

The exemplary computing device 250 of Fig. 2B is similar to computing device 200. However, in computing device 250 a host process 258, rather than the hypervisor 214 of computing device 200, executes the virtual router 260. Host process 258 may represent a software process, application, or service executable by a host operating system (also not shown in Fig. 2A-2B) of computing device 250. Host process 258 uses the physical driver 225 and the physical function 221 of NIC 230 for I/O. In some examples of computing device 250, virtual machine 224A may execute virtual router 260; in such instances, virtual router 260 uses the VF driver 226A and the virtual function 217A of NIC 230 for I/O.

In general, each virtual machine 224, 228 may be assigned a virtual address for use within a corresponding virtual network, where each virtual network may be associated with a different virtual subnet provided by the virtual router 220. The virtual machines 224, 228 may be assigned their own virtual layer three (L3) IP addresses (e.g., to send and receive communications) but may not know the IP address of the computing device 200 on which the virtual machine is executing. In this manner, a "virtual address" is an address for an application that is different from a logical address for an underlying physical computer system (e.g., computing device 200).

In one implementation, the computing device 200 includes a virtual network (VN) agent (not shown) that controls the virtual network overlays for the computing device 200 and coordinates the routing of data packets within the computing device 200. Typically, the VN agent communicates with a virtual network controller for the plurality of virtual networks, which generates commands to control the routing of packets. The VN agent may operate as a proxy for control plane messages between the virtual machines 224, 228 and the virtual network controller. For example, a virtual machine may request to send a message using its virtual address via the VN agent, and the VN agent may in turn send the message and request that a response to the message be received for the virtual address of the virtual machine that originated the first message. In some cases, the virtual machines 224, 228 may invoke a procedure or function call presented by an application programming interface of the VN agent, and the VN agent may also handle encapsulation of the message, including addressing.

In one example, network packets (e.g., layer three (L3) IP packets or layer two (L2) Ethernet packets) generated or consumed by application instances executed by the virtual machines 224, 228 within the virtual network domain may be encapsulated in other packets (e.g., other IP or Ethernet packets) transported by the physical network. Packets transported in a virtual network may be referred to herein as "inner packets," while the physical network packets may be referred to herein as "outer packets" or "tunnel packets." Encapsulation and/or decapsulation of virtual network packets within physical network packets may be performed by virtual router 220. This functionality is referred to herein as tunneling and may be used to create one or more overlay networks. Besides IP-in-IP, other exemplary tunneling protocols that may be used include IP over Generic Routing Encapsulation (GRE), VxLAN, Multiprotocol Label Switching (MPLS) over GRE, MPLS over User Datagram Protocol (UDP), and the like.
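
As a hedged sketch of the encapsulation step (VxLAN chosen here only as one of the protocols listed above), the following wraps an inner Ethernet frame in a VxLAN header per the RFC 7348 layout; the outer Ethernet/IP/UDP headers would be prepended separately, and the VNI value is illustrative:

```python
# Sketch: VxLAN encapsulation of an inner frame to form the tunnel payload.
import struct

def vxlan_encapsulate(inner_frame: bytes, vni: int) -> bytes:
    flags = 0x08                                   # I-bit: VNI field is valid
    header = struct.pack("!B3x", flags) + vni.to_bytes(3, "big") + b"\x00"
    return header + inner_frame                    # outer IP/UDP added later

tunnel_payload = vxlan_encapsulate(b"<inner-ethernet-frame>", vni=100)
assert tunnel_payload[4:7] == (100).to_bytes(3, "big")
```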

As described above, the virtual network controller may provide a logically centralized controller to facilitate the operation of one or more virtual networks. For example, the virtual network controller may maintain a routing information base, e.g., one or more routing tables that store routing information for the physical network and the one or more overlay networks. Virtual router 220 of hypervisor 214 implements Network Forwarding Tables (NFTs) 222A-222N for the N virtual networks for which virtual router 220 operates as a tunnel endpoint. In general, each NFT 222 stores forwarding information for the corresponding virtual network and identifies where data packets are to be forwarded and whether the packets are to be encapsulated in a tunneling protocol (such as with a tunnel header that may include one or more headers for different layers of the virtual network protocol stack). Each NFT 222 may be an NFT for a different routing instance (not shown) implemented by virtual router 220.
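
A minimal sketch of such per-virtual-network forwarding state, with hypothetical field names rather than the actual NFT 222 layout, could be:

```python
# Sketch: one forwarding table per virtual network; each entry records where
# to forward and whether/how to encapsulate in a tunneling protocol.
nfts = {
    "vn-blue": {
        "10.1.1.2": {"next_hop": "vf-217B", "encap": None},          # local endpoint
        "10.1.2.9": {"next_hop": "203.0.113.7",                      # remote server
                     "encap": {"proto": "vxlan", "vni": 100}},
    },
}

def lookup(virtual_network, dst_ip):
    return nfts[virtual_network][dst_ip]

print(lookup("vn-blue", "10.1.2.9"))   # forward via a VxLAN tunnel
```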

In accordance with the techniques described herein, virtual router 220 of Fig. 2A performs tunnel encapsulation/decapsulation for packets from/to any of the virtual machines 224, and virtual router 220 exchanges packets with the virtual machines 224 via the Ethernet bridge 234 and bus 242 of NIC 230.

The NIC 230 may receive tunnel packets having a layer 2 header with a destination layer 2 address that is the layer 2 address assigned to the physical function 221 used by the hypervisor 214. For each received tunnel packet, virtual router 220 receives the tunnel packet data via the physical driver 225 and stores the tunnel packet data to the hypervisor 214 memory address space. The virtual router 220 processes the tunnel packet to determine, from the tunnel encapsulation header, the virtual network of the source and destination endpoints for the inner packet. Virtual router 220 may strip the layer 2 header and the tunnel encapsulation header to forward only the inner packet internally. The tunnel encapsulation header includes a virtual network identifier, such as a VxLAN tag or MPLS label, that indicates a virtual network, e.g., the virtual network for which NFT 222A is the network forwarding table. NFT 222A may include forwarding information for the inner packet. For instance, NFT 222A may map the destination layer 3 address of the inner packet to virtual function 217B, e.g., to the layer 2 address associated with virtual function 217B and virtual machine 224B. The mapping of the destination layer 3 address of the inner packet to the layer 2 address associated with virtual function 217B may comprise an Address Resolution Protocol (ARP) entry.

Rather than sending the inner packet to the destination virtual machine 224B using a VIRTIO interface or another technique for copying the inner packet data from the hypervisor 214 memory address space to the memory address space of the virtual machine 224B guest operating system, the virtual router 220 encapsulates the inner packet with a new layer 2 header having a destination layer 2 address that is the layer 2 address associated with virtual function 217B. The new layer 2 header may also include a VLAN identifier that corresponds, in computing device 200, to the virtual network of the source and destination endpoints for the inner packet. The virtual router 220 then outputs the inner packet with the new layer 2 header to NIC 230 via the physical function 221. This may cause the physical driver 225 or another component of computing device 200 to initiate a direct memory access transfer to copy the inner packet with the new layer 2 header to NIC 230 memory using bus 242. As a result, the microprocessor 210 may avoid copying the packet data from one memory address space to another.
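
The "new layer 2 header" described above is an ordinary Ethernet header, optionally carrying an 802.1Q VLAN tag; a hedged sketch of constructing one (MAC values and VLAN ID are illustrative, not the virtual router's actual code) follows:

```python
# Sketch: build an Ethernet header with an 802.1Q VLAN tag for the inner packet.
import struct

def new_l2_header(dst_mac: bytes, src_mac: bytes, vlan_id: int,
                  ethertype: int = 0x0800) -> bytes:
    tci = vlan_id & 0x0FFF                         # priority/DEI bits left at 0
    return (dst_mac + src_mac
            + struct.pack("!HH", 0x8100, tci)      # 802.1Q tag
            + struct.pack("!H", ethertype))        # inner packet is IPv4

hdr = new_l2_header(bytes.fromhex("020000000011"),   # VF's layer 2 address
                    bytes.fromhex("020000000001"),   # virtual router's address
                    vlan_id=100)
assert len(hdr) == 18
```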

The Ethernet bridge 234 examines the new layer 2 header of the inner packet, determines that the destination layer 2 address is associated with virtual function 217B, and switches the inner packet with the new layer 2 header to the input queue, among queues 219B, for virtual function 217B. Placing this data in the queue may cause the VF driver 226B or another component of computing device 200 to initiate a DMA transfer to copy the inner packet with the new layer 2 header to the virtual machine 224B memory address space using bus 242. Again, the microprocessor 210 may avoid copying packet data from one memory address space to another. The virtual machine 224B may process the inner packet once the packet data has been received in its memory address space. In the description that follows, a switching operation by the Ethernet bridge 234 may include adding packet data to the corresponding input queue 219, 223 of the switched-to virtual function 217 or physical function 221, and an output operation by any of the VF drivers 226 or the physical driver 225 may similarly include adding packets to the corresponding output queue 219, 223.

A virtual machine 224 may also be a source virtual network endpoint for inner packets. For example, virtual machine 224B may generate a layer 3 inner packet destined for a destination virtual network endpoint executed by another computing device (i.e., a computing device other than computing device 200). The virtual machine 224B encapsulates the inner packet with a layer 2 header having a destination layer 2 address that is the layer 2 address of the physical function 221, which causes the Ethernet bridge 234 to switch the packet to the virtual router 220. The VF driver 226B or another component of computing device 200 may initiate a DMA transfer to copy the inner packet with the layer 2 header from the virtual machine 224B memory address space to NIC 230 using bus 242. Pursuant to the switching operation of Ethernet bridge 234, the physical driver 225 or another component of computing device 200 may initiate a DMA transfer to copy the inner packet with the layer 2 header from NIC 230 to the hypervisor 214 memory address space using bus 242. The layer 2 header may include a VLAN identifier that corresponds, in computing device 200, to the virtual network of the source and destination endpoints for the inner packet.

The virtual router 220 receives the inner packet and the layer 2 header and determines the virtual network for the inner packet. Virtual router 220 may determine the virtual network from the VLAN identifier of the layer 2 header. Virtual router 220 generates an outer header for the inner packet using the NFT 222 corresponding to the virtual network for the inner packet, the outer header including an outer IP header for the overlay tunnel and a tunnel encapsulation header identifying the virtual network. The virtual router 220 encapsulates the inner packet with the outer header. The virtual router 220 may encapsulate the tunnel packet with a new layer 2 header having a destination layer 2 address associated with a device external to the computing device 200 (e.g., the TOR switch 16 or one of the servers 12). The virtual router 220 outputs the tunnel packet with the new layer 2 header to the NIC 230 using the physical function 221. This may cause the physical driver 225 to initiate a DMA transfer from the hypervisor 214 memory address space to the NIC 230 to copy the tunnel packet and the new layer 2 header to NIC 230 memory using the bus 242. The NIC 230 outputs the packet on the outbound interface.
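A minimal sketch of the outer-header construction described above, assuming MPLS-over-GRE encapsulation (one of the encapsulations discussed with respect to Fig. 3A). Header fields such as the IP checksum are simplified, and all values are illustrative rather than prescribed by this disclosure.

```python
import socket
import struct

def mpls_over_gre_encapsulate(inner_packet: bytes, src_ip: str, dst_ip: str,
                              mpls_label: int) -> bytes:
    """Wrap an inner packet in MPLS + GRE + outer IPv4 headers.

    Simplified: no GRE flags, no IP options, and the IP checksum is left
    zero; a real implementation computes it (or offloads it to the NIC)."""
    # MPLS shim: 20-bit label, 3-bit traffic class, bottom-of-stack=1, TTL=64.
    mpls = struct.pack("!I", (mpls_label << 12) | (1 << 8) | 64)
    gre = struct.pack("!HH", 0, 0x8847)        # protocol 0x8847 = MPLS unicast
    payload = gre + mpls + inner_packet
    outer_ip = struct.pack(
        "!BBHHHBBH4s4s",
        0x45, 0, 20 + len(payload),            # version/IHL, TOS, total length
        0, 0,                                  # identification, flags/fragment
        64, 47, 0,                             # TTL, protocol 47 = GRE, checksum
        socket.inet_aton(src_ip), socket.inet_aton(dst_ip))
    return outer_ip + payload
```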

Packets output by any of the virtual machines 224 are received by the virtual router 220 for virtual routing. In some examples, the virtual router 220 operates as a default gateway or as an Address Resolution Protocol (ARP) proxy. The virtual machine 224B may, for example, broadcast an ARP request for the default gateway, which is received and switched to the virtual router 220 via the ethernet bridge 234. The virtual router 220 may reply with an ARP response that specifies the layer 2 address for the physical function 221 as the layer 2 address of the default gateway.
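The ARP-proxy behavior can be sketched as follows: given an ARP request for the default gateway, build a reply naming the layer 2 address of the physical function. The frame offsets assume an untagged Ethernet frame, and gateway_mac is an assumed placeholder for the layer 2 address of physical function 221.

```python
import struct

def arp_proxy_reply(request_frame: bytes, gateway_mac: bytes) -> bytes:
    """Build an ARP reply for a gateway ARP request (Ethernet/IPv4, RFC 826).

    gateway_mac stands in for the layer 2 address configured for physical
    function 221; offsets assume an untagged Ethernet frame whose ARP
    payload begins at byte 14."""
    sender_mac = request_frame[22:28]   # requester's hardware address
    sender_ip = request_frame[28:32]    # requester's protocol address
    target_ip = request_frame[38:42]    # IP address being resolved (the gateway)
    eth_header = sender_mac + gateway_mac + struct.pack("!H", 0x0806)
    # htype=Ethernet, ptype=IPv4, hlen=6, plen=4, oper=2 (reply).
    arp_header = struct.pack("!HHBBH", 1, 0x0800, 6, 4, 2)
    return eth_header + arp_header + gateway_mac + target_ip + sender_mac + sender_ip
```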

In some examples, a controller for computing device 200 (e.g., controller 24 of FIG. 1) configures a default route in each of the virtual machines 224, causing the virtual machines 224 to use the virtual router 220 as an initial next hop for outbound packets. In some examples, the NIC 230 is configured with one or more forwarding rules to cause all packets received from the virtual machines 224 to be switched to the hypervisor 214 via the physical function 221 through the ethernet bridge 234.
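Such a forwarding rule can be pictured as pinning every frame received from a virtual function to the physical function, as in the hypothetical sketch below; the function names and MAC value are assumptions for illustration.

```python
# Hypothetical model of the forwarding rules described above: frames
# received from any virtual function are switched to physical function 221,
# so every VM packet first reaches the virtual router.
PF_221_MAC = bytes.fromhex("025056000221")   # assumed PF MAC

def apply_forwarding_rules(ingress_function: str, frame: bytes) -> bytes:
    """Return the layer 2 address the bridge should deliver to: VM traffic
    is pinned to the physical function, while traffic from the physical
    function follows the frame's own destination MAC."""
    if ingress_function.startswith("vf"):    # e.g., "vf217A", "vf217B" (assumed)
        return PF_221_MAC
    return frame[0:6]
```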

In accordance with the techniques described in this disclosure, virtual router 260 of fig. 2B performs tunnel encapsulation/decapsulation for packets from/to any of virtual machines 224, and virtual router 260 exchanges packets with virtual machines 224 via ethernet bridge 234 and bus 242 of NIC 230.

The NIC 230 may receive a tunnel packet having a layer 2 header with a destination layer 2 address that is the layer 2 address assigned to the physical function 221 used by the host process 258. For each received tunnel packet, the virtual router 260 receives the tunnel packet data via the physical driver 225 and stores the tunnel packet data to the host process 258 memory address space. The virtual router 260 processes the tunnel packet to determine, from the tunnel encapsulation header, the virtual network of the source and destination endpoints of the inner packet. The virtual router 260 may strip off the layer 2 header and the tunnel encapsulation header to internally forward only the inner packet. The tunnel encapsulation header includes a virtual network identifier, such as a VxLAN tag or MPLS label, that indicates the virtual network, e.g., the virtual network for which NFT 222A is a network forwarding table. NFT 222A may include forwarding information for the inner packet. For example, NFT 222A may map the destination layer 3 address of the inner packet to the virtual function 217B, e.g., to the layer 2 address associated with the virtual function 217B and the virtual machine 224B. The mapping of the destination layer 3 address of the inner packet to the layer 2 address associated with the virtual function 217B may include an Address Resolution Protocol (ARP) entry.
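The NFT lookup described above can be pictured as a two-level mapping, shown in the hypothetical sketch below; the table name, virtual network identifier, and addresses are illustrative assumptions.

```python
from typing import Optional

# Hypothetical network forwarding tables keyed by virtual network identifier
# (e.g., an MPLS label); each maps an inner destination IP to the layer 2
# address of the virtual function serving that endpoint. Values are placeholders.
network_forwarding_tables = {
    214: {"10.1.1.2": bytes.fromhex("02505600217b")},  # e.g., NFT 222A -> VF 217B
}

def lookup_endpoint_mac(vn_id: int, inner_dst_ip: str) -> Optional[bytes]:
    """Return the VF layer 2 address for a local destination endpoint.

    Effectively an ARP-entry lookup; None means the destination is not a
    local virtual network endpoint and would be forwarded elsewhere."""
    return network_forwarding_tables.get(vn_id, {}).get(inner_dst_ip)
```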

Rather than sending the inner packet to the destination virtual machine 224B using the VIRTIO interface or another technique for copying the inner packet data from the host process 258 memory address space to the virtual machine 224B guest operating system's memory address space, the virtual router 260 encapsulates the inner packet with a new layer 2 header whose destination layer 2 address is the layer 2 address associated with the virtual function 217B. The new layer 2 header may also include a VLAN identifier that corresponds, in computing device 250, to the virtual network of the source and destination endpoints of the inner packet. The virtual router 260 then outputs the inner packet with the new destination layer 2 address to the NIC 230 via the physical function 221. This may cause the physical driver 225 or another component of the computing device 250 to initiate a DMA transfer to copy the inner packet with the new layer 2 header to NIC 230 memory using the bus 242. Thus, the microprocessor 210 may avoid copying packet data from one memory address space to another.

The ethernet bridge 234 examines the new layer 2 header of the inner packet, determines that the destination layer 2 address is associated with the virtual function 217B, and switches the inner packet with the new layer 2 header by adding it to the input queue of queue 219B for the virtual function 217B. Placing this data in the queue may cause the VF driver 226B or another component of the computing device 250 to initiate a DMA transfer to copy the inner packet with the new layer 2 header from the NIC 230 to the virtual machine 224B memory address space using the bus 242. Thus, the microprocessor 210 may avoid copying packet data from one memory address space to another. Once the packet data has been received in its memory address space, the virtual machine 224B may process the inner packet.

The virtual machines 224 may also act as virtual network source endpoints. For example, virtual machine 224B may generate a layer 3 inner packet destined for a destination virtual network endpoint executed by another computing device (i.e., not computing device 250). The virtual machine 224B encapsulates the inner packet with a layer 2 header having a layer 2 destination address that is the layer 2 address of the physical function 221, causing the ethernet bridge 234 to switch the packet to the virtual router 260. The VF driver 226B or another component of the computing device 250 may initiate a DMA transfer to copy the inner packet with the layer 2 header from the memory address space of the virtual machine 224B to the NIC 230 using the bus 242. In response to the switching operation by the ethernet bridge 234, the physical driver 225 or another component of the computing device 250 may initiate a DMA transfer to copy the inner packet with the layer 2 header from the NIC 230 to the memory address space of the host process 258 using the bus 242. The layer 2 header may include a VLAN identifier that corresponds, in computing device 250, to the virtual network of the source and destination endpoints of the inner packet.

The virtual router 260 receives the inner packet and the layer 2 header and determines the virtual network for the inner packet. Virtual router 260 may determine the virtual network from the VLAN identifier of the layer 2 header. Virtual router 260 generates an outer header for the inner packet using the NFT 222 corresponding to the virtual network for the inner packet, the outer header including an outer IP header for the overlay tunnel and a tunnel encapsulation header identifying the virtual network. The virtual router 260 encapsulates the inner packet with the outer header. The virtual router 260 may encapsulate the tunnel packet with a new layer 2 header having a destination layer 2 address associated with a device external to the computing device 250 (e.g., the TOR switch 16 or one of the servers 12). The virtual router 260 outputs the tunnel packet with the new layer 2 header to the NIC 230 using the physical function 221. This may cause the physical driver 225 to initiate a DMA transfer from the host process 258 memory address space to the NIC 230 to copy the tunnel packet and the new layer 2 header to NIC 230 memory using the bus 242. The NIC 230 outputs the packet on the outbound interface.

Packets output by any of the virtual machines 224 are received by the virtual router 260 for virtual routing. In some examples, the virtual router 260 operates as a default gateway or as an Address Resolution Protocol (ARP) proxy. The virtual machine 224B may, for example, broadcast an ARP request for the default gateway, which is received and switched to the virtual router 260 via the ethernet bridge 234. The virtual router 260 may reply with an ARP response that specifies the layer 2 address of the physical function 221 as the layer 2 address of the default gateway.

In some examples, a controller of computing device 250 (e.g., controller 24 of FIG. 1) configures a default route in each of the virtual machines 224, causing the virtual machines 224 to use the virtual router 260 as an initial next hop for outbound packets. In some examples, the NIC 230 is configured with one or more forwarding rules to cause all packets received from the virtual machines 224 to be switched to the host process 258 via the physical function 221 through the ethernet bridge 234.

In some cases, the virtual router 260 may be executed by one of the virtual machines 224. For example, in accordance with techniques described in this disclosure, virtual machine 224A may execute the virtual router 260 to operate as a tunnel endpoint application and perform virtual routing. In such cases, the above description of the queue 223, the physical function 221, and the physical driver 225 applies correspondingly to the queue 219A, the virtual function 217A, and the virtual function driver 226A, respectively.

Fig. 3A is a block diagram illustrating an example tunnel packet that may be processed by a computing device in accordance with the techniques described herein. For simplicity and ease of illustration, tunnel packet 150 does not illustrate every field of a typical tunnel packet but is provided to highlight the techniques described herein. Further, various implementations may order the tunnel packet fields differently. The "outer" or "tunnel" packet 150 includes an outer header 152 and an inner or "encapsulated" packet 156. The outer header 152 may include a protocol or type-of-service (TOS) field 162 as well as public (i.e., switchable by the physical network underlying the virtual network associated with inner packet 156) IP address information in the form of a source IP address field 164 and a destination IP address field 166. The protocol field 162 in this example indicates that tunnel packet 150 uses GRE tunnel encapsulation, but other forms of tunnel encapsulation may be used in other cases, including IPinIP, NVGRE, VxLAN, MPLS over GRE, and MPLS over UDP, for example.

The outer header 152 also includes a tunnel encapsulation header 154, which in this example includes a GRE protocol field 170 specifying the GRE payload protocol (here, MPLS) and an MPLS label field 172 specifying an MPLS label value (here, 214). The MPLS label field is an example of a virtual network identifier and may be associated with a routing instance and/or an NFT for a virtual network in a virtual router (e.g., virtual router 220 of computing device 200 of fig. 2A or virtual router 260 of computing device 250 of fig. 2B).
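As a non-normative illustration, the following sketch parses the tunnel encapsulation header 154 of Fig. 3A, recovering the virtual network identifier (the MPLS label, here 214) and the inner packet; it assumes a flag-less 4-byte GRE header.

```python
import struct

def parse_tunnel_encapsulation(after_outer_ip: bytes):
    """Parse the GRE + MPLS encapsulation of tunnel packet 150 (Fig. 3A).

    after_outer_ip is the packet beginning at tunnel encapsulation header
    154, i.e., immediately after the outer IP header."""
    gre_flags, gre_protocol = struct.unpack_from("!HH", after_outer_ip, 0)
    if gre_protocol != 0x8847:                 # GRE protocol field 170: MPLS
        raise ValueError("not MPLS-in-GRE")
    (mpls_shim,) = struct.unpack_from("!I", after_outer_ip, 4)
    label = mpls_shim >> 12                    # MPLS label field 172 (e.g., 214)
    return label, after_outer_ip[8:]           # virtual network id, inner packet 156
```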

Inner packet 156 includes an inner header 158 and a payload 184. The inner header 158 may include a protocol or type-of-service (TOS) field 174, private (i.e., specific to a particular virtual routing and forwarding instance) IP address information in the form of a source IP address field 176 and a destination IP address field 178, and transport layer information in the form of a source port field 180 and a destination port field 182. The payload 184 may include application layer (layer 7 (L7)) data and, in some cases, other L4-L7 information generated by or consumed by a virtual machine of the virtual network. The payload 184 may comprise, and may thus alternatively be referred to as, an "L4 packet," a "UDP packet," or a "TCP packet."

Fig. 3B is a block diagram illustrating an example packet with a new layer 2 header, generated by a virtual router for output to a network interface card, for switching by a network interface card-based switch to a destination virtual network endpoint. Packet 192 includes the inner packet 156 of Fig. 3A, which is communicated between two virtual network endpoints. The virtual router 220, 260 encapsulates the inner packet 156 with a new layer 2 header 186 having a source layer 2 (MAC) address 188 and a destination layer 2 (MAC) address 190. The destination layer 2 address has a value M1, which is the layer 2 address associated with the virtual function 217 used by the virtual machine 224 that is the destination virtual network endpoint for the inner packet 156. The virtual machine 224 may have the layer 3 address that is the value of destination IP address field 178. In some cases, the layer 2 header 186 may include a VLAN identifier for a VLAN associated with the virtual network that includes the destination virtual network endpoint for the inner packet 156.

FIG. 4 is a flow diagram illustrating an example mode of operation for a computing device, in accordance with the techniques described in this disclosure. Operations 400 may be performed by any of the computing device 200, the server 12, or another computing device. The network interface card 230 may be SR-IOV-capable and thus have one or more virtual functions 217, in addition to the physical function 221, for packet I/O. The network interface card 230 may be configured to receive tunnel packets from the physical interface 232 and, for example, apply one or more rules to direct received tunnel packets, using a virtual function 217 or the physical function 221, to the virtual router process 220, 260, which may execute, e.g., as a host process, by a virtual machine, or as part of the hypervisor 214 (402). The virtual router 220, 260 terminates the tunnel and determines, based on parameters included in the tunnel packet, a virtual network for the inner packet of the tunnel packet and a destination virtual network endpoint for the tunnel packet (404). For a received tunnel packet (receivable, e.g., via DMA by reading the tunnel packet from a memory device and/or by detecting the tunnel packet as a set of signals on a bus), the virtual router 220, 260 may strip off the outer header, including the tunnel encapsulation header, to obtain the inner packet. The virtual router 220, 260 may encapsulate the inner packet with a new layer 2 header having a destination layer 2 address that is the layer 2 address configured for the virtual function 217 used for packet I/O by the destination virtual network endpoint of the inner packet (406). The virtual router 220, 260 may output the inner packet with the new layer 2 header to the NIC 230 (408), which switches the inner packet with the new layer 2 header to that virtual function based on the destination layer 2 address (410). The virtual router 220, 260 may output the inner packet with the new layer 2 header to the NIC 230 via DMA by storing the packet to a memory device and/or by outputting the packet as a set of signals on the bus.
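Operations 400 may be summarized by composing the hypothetical sketches introduced earlier (parse_tunnel_encapsulation, lookup_endpoint_mac, l2_encapsulate, and bridge_switch); this is an illustrative model of the flow, not an implementation of the virtual router, and all parameter values are assumptions.

```python
import socket

def inner_dst_ip(inner_packet: bytes) -> str:
    """Destination IPv4 address of an inner packet (header bytes 16-20)."""
    return socket.inet_ntoa(inner_packet[16:20])

def operation_400(after_outer_ip: bytes, router_mac: bytes, vlan_id: int) -> bool:
    """Model of operations 404-410; step (402), directing the tunnel packet
    to the virtual router, is assumed to have already occurred."""
    vn_id, inner = parse_tunnel_encapsulation(after_outer_ip)           # (404)
    dst_vf_mac = lookup_endpoint_mac(vn_id, inner_dst_ip(inner))
    if dst_vf_mac is None:
        return False                     # not a local endpoint in this sketch
    frame = l2_encapsulate(inner, dst_mac=dst_vf_mac,
                           src_mac=router_mac, vlan_id=vlan_id)         # (406)
    return bridge_switch(frame)                                         # (408)/(410)
```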

FIG. 5 is a flow diagram illustrating an example mode of operation for a computing device, in accordance with the techniques described in this disclosure. Operations 500 may be performed by any of the computing device 200, the server 12, or another computing device. The network interface card 230 may be SR-IOV-capable and thus have one or more virtual functions 217, in addition to the physical function 221, for packet I/O. The virtual machine 224A, as a source virtual network endpoint, may output an inner packet with a layer 2 header to the NIC 230 using the virtual function 217A (502). The layer 2 header may have a destination layer 2 address that is the layer 2 address of the virtual or physical function configured for use by the virtual router process 220, 260 for packet I/O. As a result, the NIC 230 switches the inner packet and layer 2 header to that virtual or physical function, such that the inner packet and layer 2 header are received by the virtual router process 220, 260 (504). The virtual router process 220, 260 performs virtual routing of the inner packet based on a network forwarding table for the virtual network that includes the virtual machine 224A (506) and adds to the inner packet an outer header, including a tunnel encapsulation header indicating the virtual network, to generate a tunnel packet (508). The virtual router process 220, 260 outputs the tunnel packet to the NIC 230 for output via the physical interface 232 (510). The tunnel packet is switched by the physical network to the physical computing device hosting the destination virtual network endpoint for the tunnel packet.
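Operations 500 may likewise be modeled by composing the earlier sketches; the outer layer 2 header toward the physical network and the DMA mechanics are omitted for brevity, and all parameter values are assumptions.

```python
def operation_500(inner_packet: bytes, vm_mac: bytes, router_mac: bytes,
                  vlan_id: int, mpls_label: int,
                  local_ip: str, remote_ip: str) -> bytes:
    """Model of operations 502-510; returns the tunnel packet that the NIC
    would emit on physical interface 232 (510)."""
    # (502) VM 224A emits the inner packet with a layer 2 header that names
    # the virtual router's function; (504) the NIC switches it to the router.
    frame = l2_encapsulate(inner_packet, dst_mac=router_mac,
                           src_mac=vm_mac, vlan_id=vlan_id)
    received_inner = frame[18:]          # router strips the 18-byte tagged header
    # (506) virtual routing; (508) outer header with tunnel encapsulation.
    return mpls_over_gre_encapsulate(received_inner, local_ip, remote_ip,
                                     mpls_label)
```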

The techniques described herein may be implemented in hardware, software, firmware, or any combination thereof. Various features described as modules, units, or components may be implemented together in an integrated logic device or separately as discrete but interoperable logic devices or other hardware devices. In some cases, various features of electronic circuitry may be implemented as one or more integrated circuit devices, such as an integrated circuit chip or chipset.

If implemented in hardware, the present disclosure may be directed to an apparatus such as a processor or an integrated circuit device, such as an integrated circuit chip or chipset. Alternatively or additionally, if implemented in software or firmware, the techniques may be realized at least in part by a computer-readable data storage medium comprising instructions that, when executed, cause a processor to perform one or more of the methods described above. For example, a computer-readable data storage medium may store such instructions for execution by a processor.

The computer-readable medium may form part of a computer program product, which may include packaging materials. The computer-readable medium may comprise a computer data storage medium, such as random access memory (RAM), read-only memory (ROM), non-volatile random access memory (NVRAM), electrically erasable programmable read-only memory (EEPROM), flash memory, magnetic or optical data storage media, and the like. In some examples, an article of manufacture may comprise one or more computer-readable storage media.

In some examples, the computer-readable storage medium may include a non-transitory medium. The term "non-transitory" may indicate that the storage medium is not embodied in a carrier wave or propagated signal. In some examples, a non-transitory storage medium may store data that may change over time (e.g., in RAM or a cache).

The code or instructions may be software and/or firmware executed by processing circuitry that includes one or more processors, such as one or more Digital Signal Processors (DSPs), general purpose microprocessors, Application Specific Integrated Circuits (ASICs), Field Programmable Gate Arrays (FPGAs), or other equivalent integrated or discrete logic circuitry. Accordingly, the term "processor" as used herein may refer to any of the foregoing structure or any other structure suitable for implementation of the techniques described herein. In addition, in some aspects, the functions described in this disclosure may be provided within software modules or hardware modules.

Additionally or alternatively to the foregoing, the following examples are described. Features described in any of the examples below may be used with any of the other examples described herein.

Example 1. A computing device, comprising: one or more hardware-based processors coupled to a storage device; a virtual network endpoint configured for execution by the one or more processors; a network interface card comprising a first hardware component and a second hardware component, wherein the first hardware component and the second hardware component provide separate packet input/output access to a physical network interface of the network interface card, wherein the network interface card is configured to receive packets inbound from the physical network interface; and a virtual router configured for execution by the one or more processors to receive the packet from the network interface card and output the packet back to the network interface card using the first hardware component in response to determining that a destination endpoint of the packet is the virtual network endpoint, wherein the network interface card is further configured to, in response to receiving the packet from the virtual router, switch the packet to the virtual network endpoint and output the packet to the virtual network endpoint using the second hardware component.

Example 2. The computing device of example 1, wherein the network interface card comprises a single root input/output virtualization (SR-IOV) device; wherein the first hardware component comprises a physical function of the SR-IOV device; and wherein the second hardware component comprises a virtual function of the SR-IOV device.

Example 3. The computing device of example 1, wherein the network interface card comprises a single root input/output virtualization (SR-IOV) device; wherein the first hardware component comprises a first virtual function of the SR-IOV device; and wherein the second hardware component comprises a second virtual function of the SR-IOV device.

Example 4. The computing device of example 1, wherein the second hardware component is configured with a layer 2 address; and wherein the virtual router is configured to output the packet back to the network interface card with a layer 2 header having a destination layer 2 address that is the layer 2 address.

Example 5. The computing device of example 1, wherein the virtual router is configured to output the packet to the network interface card by causing a direct memory access transfer of the packet from a memory address space of the virtual router to a memory of the network interface card.

Example 6. The computing device of example 1, wherein the network interface card is configured to output the packet to the virtual network endpoint by causing a direct memory access transfer of the packet from a memory of the network interface card to a memory address space of the virtual network endpoint.

Example 7. The computing device of example 1, wherein the virtual network endpoint comprises at least one of a virtual machine and a container.

Example 8. The computing device of example 1, wherein the packet comprises an inner packet and a tunnel encapsulation header indicating one of a plurality of virtual networks, the indicated virtual network comprising the virtual network endpoint; wherein the virtual router is configured to determine, based at least on the tunnel encapsulation header, a network forwarding table indicating that the layer 2 address configured for the second hardware component is a layer 2 address of the virtual network endpoint; and wherein, to output the packet back to the network interface card, the virtual router is configured to output the inner packet back to the network interface card with a layer 2 header having a destination layer 2 address that is the layer 2 address of the virtual network endpoint.

Example 9. The computing device of example 1, wherein the packet comprises a first packet; wherein the virtual network endpoint is configured to output a second packet to the network interface card using the second hardware component; wherein the network interface card is configured to switch the second packet to the virtual router and output the second packet to the virtual router using the first hardware component; and wherein the virtual router is configured to encapsulate the second packet with an outer header and output the second packet back to the network interface card for output on the physical network interface, thereby tunneling the second packet to another physical computing device hosting the destination virtual network endpoint for the second packet.

Example 10. The computing device of example 1, wherein the virtual network endpoint is configured with a default route to cause the virtual network endpoint to output outbound packets having layer 2 headers that each have a layer 2 destination address that is the layer 2 address configured for the first hardware component, and wherein the network interface card is configured to switch the outbound packets to the virtual router based at least on the layer 2 headers and output the outbound packets to the virtual router using the first hardware component.

Example 11. The computing device of example 1, wherein the virtual router is configured to output, in response to receiving an address resolution protocol request for a layer 2 address of a default gateway, an address resolution protocol reply specifying that the layer 2 address configured for the first hardware component is the layer 2 address of the default gateway.

Example 12. A method, comprising: receiving, by a network interface card of a computing device, a packet inbound from a physical network interface of the network interface card, wherein the network interface card includes a first hardware component and a second hardware component, and wherein the first hardware component and the second hardware component provide separate packet input/output access to the physical network interface of the network interface card; receiving, by a virtual router of the computing device, the packet from the network interface card; outputting, by the virtual router in response to determining that a destination endpoint of the packet is a virtual network endpoint of the computing device, the packet back to the network interface card using the first hardware component; and switching, by the network interface card in response to receiving the packet from the virtual router, the packet to the virtual network endpoint and outputting the packet to the virtual network endpoint using the second hardware component.

Example 13. The method of example 12, wherein the network interface card comprises a single root input/output virtualization (SR-IOV) device, wherein the first hardware component comprises a physical function of the SR-IOV device, and wherein the second hardware component comprises a virtual function of the SR-IOV device.

Example 14. The method of example 12, wherein the network interface card comprises a single root input/output virtualization (SR-IOV) device, wherein the first hardware component comprises a first virtual function of the SR-IOV device, and wherein the second hardware component comprises a second virtual function of the SR-IOV device.

Example 15. The method of example 12, wherein the second hardware component is configured with a layer 2 address, the method further comprising: outputting, by the virtual router, the packet back to the network interface card with a layer 2 header having a destination layer 2 address that is the layer 2 address.

Example 16. The method of example 12, wherein outputting the packet to the network interface card comprises causing a direct memory access transfer of the packet from a memory address space of the virtual router to a memory of the network interface card.

Example 17. The method of example 12, wherein outputting the packet to the virtual network endpoint comprises causing a direct memory access transfer of the packet from a memory of the network interface card to a memory address space of the virtual network endpoint.

Example 18. The method of example 12, wherein the virtual network endpoint comprises at least one of a virtual machine and a container.

Example 19. The method of example 12, wherein the packet comprises an inner packet and a tunnel encapsulation header indicating one of a plurality of virtual networks, the indicated virtual network including the virtual network endpoint, the method further comprising: determining, by the virtual router based at least on the tunnel encapsulation header, a network forwarding table indicating that the layer 2 address configured for the second hardware component is a layer 2 address of the virtual network endpoint, wherein outputting the packet back to the network interface card comprises outputting, by the virtual router, the inner packet back to the network interface card with a layer 2 header having a destination layer 2 address that is the layer 2 address of the virtual network endpoint.

Example 20. The method of example 12, wherein the packet comprises a first packet, the method further comprising: outputting, by the virtual network endpoint, a second packet to the network interface card using the second hardware component; switching, by the network interface card, the second packet to the virtual router and outputting the second packet to the virtual router using the first hardware component; and encapsulating, by the virtual router, the second packet with an outer header and outputting the second packet back to the network interface card for output on the physical network interface, thereby tunneling the second packet to another physical computing device hosting the destination virtual network endpoint for the second packet.

Example 21. The method of example 12, further comprising: receiving, by the virtual network endpoint, a default route that causes the virtual network endpoint to output outbound packets having layer 2 headers that each have a layer 2 destination address that is the layer 2 address configured for the first hardware component; and switching, by the network interface card, the outbound packets to the virtual router based at least on the layer 2 headers and outputting the outbound packets to the virtual router using the first hardware component.

Example 22. The method of example 12, further comprising: outputting, by the virtual router in response to receiving an address resolution protocol request for a layer 2 address of a default gateway, an address resolution protocol reply specifying that the layer 2 address configured for the first hardware component is the layer 2 address of the default gateway.

Example 23. A non-transitory computer-readable storage medium comprising instructions that cause a computing device to perform the steps of: receiving, by a network interface card of the computing device, a packet inbound from a physical network interface of the network interface card, wherein the network interface card includes a first hardware component and a second hardware component, and wherein the first hardware component and the second hardware component provide separate packet input/output access to the physical network interface of the network interface card; receiving, by a virtual router of the computing device, the packet from the network interface card; outputting, by the virtual router in response to determining that a destination endpoint of the packet is a virtual network endpoint of the computing device, the packet back to the network interface card using the first hardware component; and switching, by the network interface card in response to receiving the packet from the virtual router, the packet to the virtual network endpoint and outputting the packet to the virtual network endpoint using the second hardware component.

Moreover, any of the specific features set forth in any of the examples above may be combined into beneficial examples of the described techniques. That is, any of the specific features are generally applicable to all examples of the invention. Various examples of the invention have been described.
