Routing table selection in policy-based routing systems
Note: this technique, "Routing table selection in policy-based routing systems", was designed and created by A. 法兰森 (Fransson) and T. 哈马姆 (Hammam) on 2017-05-24. Its main content is as follows: A method implemented by a network device for selecting a routing table in a policy-based routing (PBR) system is described. The method can comprise the following steps: receiving a packet from a first network domain; generating a firewall mark for the packet, wherein the firewall mark comprises a network domain indication and a packet classification indication; determining a match between the network domain indication of the packet and a selector of a matching rule in a rule set; and upon determining the match between the network domain indication of the packet and the selector of the matching rule, inputting the firewall mark into a function of the matching rule to identify a routing table for the packet.
1. A method (700) implemented by a network device (108) for selecting a routing table in a policy-based routing (PBR) system, the method comprising:
receiving (702) a packet (202) from a first network domain (102);
generating (704) a firewall label (300) for the packet, wherein the firewall label comprises a network domain indication (302) and a packet classification indication (304);
determining (706) a match between the network domain indication of the packet and a selector (402) of a matching rule (210-1) in a rule set (210-1 to 210-N); and
upon determining the match between the network domain indication of the packet and the selector of the matching rule, inputting (708) the firewall label into a function (404) of the matching rule to identify a routing table (212-1) for the packet.
2. The method of claim 1, further comprising:
routing (710) the packet to a next hop (110-1 to 110-X) according to an entry (504) in the routing table.
3. The method of claim 1 or 2, wherein determining the match comprises applying a mask to the firewall label to mask the packet classification indication of the firewall label and identify the network domain indication of the firewall label.
4. The method of any of claims 1-3, wherein the function of the matching rule provides a one-to-one mapping between firewall label values and routing tables in a first set of routing tables (212-1 to 212-256) associated with the first network domain.
5. The method of claim 4, wherein the routing table of the packet is in the first set of routing tables.
6. The method of claim 4 or 5, wherein the rule set comprises a rule (210-2) corresponding to a second network domain (104),
wherein the rule corresponding to the second network domain comprises: a selector matching an identifier of the second network domain, and a function providing a one-to-one mapping between firewall label values and routing tables in a second set of routing tables (212-257 to 212-320) associated with the second network domain.
7. The method of claim 6, wherein the network domain indication of the matching rule corresponding to the first network domain has a first length and the network domain indication of the rule corresponding to the second network domain has a second length, and
wherein the first length and the second length are different.
8. The method of any of claims 1-7, wherein the rule set includes a rule (210-(N-1)) having a selector and a discrete action to be taken directly in response to a match with the selector.
9. The method of any of claims 1-8, wherein the packet classification indication describes one or more pieces of information in the packet or associated with the packet.
10. The method of any of claims 1-9, wherein the packet classification indication describes one or more of: a hash of an address of the packet, and a class of data within the packet.
11. The method of any of claims 1-10, wherein the network domain indication is set to an identifier of the first network domain.
12. The method of any of claims 1-10, wherein the set of rules is part of a Routing Policy Database (RPDB) (208).
13. A network device, comprising:
a non-transitory machine readable storage medium (948) having stored therein the classifier (204) and the routing policy engine (206); and
a processor (942) coupled to the non-transitory machine-readable storage medium, the processor configured to execute the classifier and the routing policy engine,
wherein the classifier is configured to receive a packet (202) from a first network domain (102) and generate a firewall tag (300) for the packet, wherein the firewall tag comprises a network domain indication (302) and a packet classification indication (304), and
wherein the routing policy engine is configured to determine a match between the network domain indication of the packet and a selector of a matching rule (210-1) in a rule set (210-1 to 210-N), and upon determining the match between the network domain indication of the packet and the selector of the matching rule, to input the firewall tag into a function (404) of the matching rule to identify a routing table for the packet.
14. The network device of claim 13, wherein determining a match comprises applying a mask to the firewall tag to mask the packet classification indication of the firewall tag and identify the network domain indication of the firewall tag.
15. The network device of claim 13 or 14, wherein the function of the matching rule provides a one-to-one mapping between firewall tag values and routing tables in a first set of routing tables (212-1 to 212-256) associated with the first network domain.
16. The network device of claim 15, wherein the routing table of the packet is in the first set of routing tables.
17. The network device of claim 15, wherein the rule set comprises a rule (210-2) corresponding to a second network domain (104),
wherein the rule corresponding to the second network domain comprises: a selector matching an identifier of the second network domain, and a function providing a one-to-one mapping between firewall tag values and routing tables in a second set of routing tables (212-257 to 212-320) associated with the second network domain.
18. The network device of any of claims 13-17, wherein the packet classification indication describes one or more pieces of information in the packet or associated with the packet.
19. The network device of any of claims 13-18, wherein the network device is a computing device configured to execute a plurality of virtual machines (862A-R) that implement Network Function Virtualization (NFV).
20. The network device of any of claims 13-18, wherein the network device is a control plane device (904) configured to implement a control plane (876) of a Software Defined Network (SDN).
Technical Field
Embodiments described herein relate to the field of selecting routing tables; and more particularly, to selecting a routing table for a packet in a policy-based routing (PBR) system.
Background
Routing is the process of directing traffic within a network, or between or across multiple networks. In some cases, routing may be performed to partition network resources. There are many reasons to partition network resources in a routed network, and using multiple routing tables in a single routing device is one technique that provides a structural separation of routing control and forwarding behavior without adding routing devices in the form of additional hardware units. For example, multiple routing tables may be used (1) when there is a need to separate the traffic of different enterprises that traverses a common device, (2) in schemes that provide special path-diversity forwarding over independent networks, (3) as a means of separating the routing processes for unicast and multicast traffic within a routing device, or (4) in network-slicing scenarios that accommodate special forwarding characteristics.
In general, the term "routing" means that a router looks only at the destination Internet Protocol (IP) address in a packet to determine the next-hop address to which the packet is forwarded. The term "policy-based routing" (PBR) is used when other information must be taken into account in the routing decision, or when it must first be established which routing table to select based on certain policy criteria. In a PBR system, packet processing is governed by a prioritized list of PBR rules. This ordered list may be referred to as a Routing Policy Database (RPDB). A PBR rule may include a selector, which identifies a category or classification of packets, and an action predicate.
Packet classification is a process or mechanism that sorts traffic packets into classes based on information in the packets, information associated with the packets, or the results of processing that information. Classified packets may then be marked ("colored") so that a process and/or device can easily identify packets belonging to a class and provide differentiated processing based on the packet marking (color). Such classification techniques may be used in routers and firewalls to provide, for example, differentiated quality of service (QoS) and policy-based packet processing for each class (color) of packets. The mark (color) is sometimes referred to as a firewall mark (fwmark) and may be represented by an integer value.
For example, when a network device receives a packet, the packet may be classified to produce a firewall mark. The firewall mark may then be compared with the selectors of the PBR rules in the RPDB. Upon determining a match, the action predicate of the matching PBR rule may be taken. For example, a first firewall mark value may correspond to the selector of a first PBR rule. In response to the firewall mark of the packet matching the selector of the first PBR rule, the action predicate of the first PBR rule may be taken; in this case, the action predicate may be the selection of a first routing table for determining the next hop for the packet. Similarly, another packet received by the network device may be classified and associated with a second firewall mark value. The second firewall mark value will not match the selector of the first PBR rule but will match the selector of a second PBR rule. In response to the match with the selector of the second PBR rule, the action predicate of the second rule may be taken; in this case, the action predicate of the second PBR rule may be to select a second routing table for determining the next hop for the packet.
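The conventional flow described above can be sketched as a prioritized rule list that is scanned in order until a selector matches; the rule priorities, fwmark values, and table names below are invented for the illustration, not taken from any particular implementation.

```python
# Illustrative model of a conventional RPDB: a prioritized list of PBR rules,
# each with a selector (here, an exact fwmark value) and an action predicate.
RPDB = [
    {"priority": 100, "fwmark": 0x1, "action": ("lookup", "table_1")},
    {"priority": 200, "fwmark": 0x2, "action": ("lookup", "table_2")},
]

def select_action(fwmark):
    """Scan the rules in priority order; return the action of the first match."""
    for rule in sorted(RPDB, key=lambda r: r["priority"]):
        if rule["fwmark"] == fwmark:
            return rule["action"]
    return ("lookup", "main")  # fall back to a default table when nothing matches
```

Note that every packet pays the cost of this linear scan, which is why the lookup slows down as the rule list grows.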
In complex real-world scenarios, there may be hundreds or thousands of PBR rules in a single RPDB. In some cases, hundreds or thousands of PBR rules may be concentrated on each associated network domain, and the RPDB may cover several network domains. Further, some of these rules may be unrelated to routing table selection (e.g., performing a packet drop action) or may be based on a selector other than a firewall mark. Since each packet may need to be compared against a large number of PBR rule selectors, routing table selection often becomes severely slowed. In addition, as the number of PBR rules in each RPDB grows, the difficulty and burden of managing the RPDB also increase.
Disclosure of Invention
A method implemented by a network device for selecting a routing table in a policy-based routing (PBR) system is described. The method can comprise the following steps: receiving a packet from a first network domain; generating a firewall label for the packet, wherein the firewall label comprises a network domain indication and a packet classification indication; determining a match between the network domain indication of the packet and a selector of a matching rule in a rule set; and upon determining the match between the network domain indication of the packet and the selector of the matching rule, inputting the firewall label into a function of the matching rule to identify a routing table for the packet.
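A minimal sketch of the described selection method follows, under the assumption that the firewall mark packs the network domain indication into its high-order bits and the packet classification indication into its low-order bits; the bit widths, domain identifiers, and table numbering are illustrative choices, not fixed by the method.

```python
CLASS_BITS = 8                         # assumed width of the classification field
CLASS_MASK = (1 << CLASS_BITS) - 1     # low bits: packet classification indication
DOMAIN_MASK = ~CLASS_MASK & 0xFFFFFFFF # high bits: network domain indication

def make_fwmark(domain_indication, classification):
    """Compose a firewall mark from a domain indication and a classification."""
    return (domain_indication << CLASS_BITS) | classification

# One rule per network domain: a selector on the masked domain bits, and a
# function mapping each fwmark value one-to-one onto a routing table number.
rules = [
    {"selector": make_fwmark(1, 0),                            # first domain
     "function": lambda fwmark: (fwmark & CLASS_MASK) + 1},    # tables 1..256
    {"selector": make_fwmark(2, 0),                            # second domain
     "function": lambda fwmark: (fwmark & CLASS_MASK) + 257},  # tables 257..
]

def select_table(fwmark):
    """Match on the domain bits only, then let the rule's function pick a table."""
    for rule in rules:
        if fwmark & DOMAIN_MASK == rule["selector"]:
            return rule["function"](fwmark)
    return None
```

A single masked comparison per domain replaces one comparison per fwmark value, which is the source of the speedup described below.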
A network device is also described herein. The network device may include a non-transitory machine-readable storage medium having stored therein a classifier and a routing policy engine; and a processor coupled to the non-transitory machine-readable storage medium. The processor is configured to execute a classifier and a routing policy engine, wherein the classifier is configured to receive a packet from a first network domain and generate a firewall tag for the packet, wherein the firewall tag contains a network domain indication and a packet classification indication, and wherein the routing policy engine is configured to determine a match between the network domain indication of the packet and a selector of a matching rule in a rule set, and upon determining a match between the network domain indication of the packet and the selector of the matching rule, input the firewall tag into a function of the matching rule to identify a routing table for the packet.
In systems that have a large number of PBR rules and in which PBR-based routing table selection is based on packet classification, the systems and methods described herein greatly accelerate routing table selection. In particular, in a typical real-world deployment scenario with hundreds of PBR rules, routing table selection has been measured to be 100 times faster than with conventional techniques. This improved performance is achieved without degrading other PBR-based capabilities.
Further, the described systems and methods act on PBR rules that use firewall marks as selectors, and still work consistently and in concert with PBR rules in the Routing Policy Database (RPDB) that act on selectors other than firewall marks. For example, higher-priority PBR rules may be inserted into the RPDB to drop or isolate particular traffic based on a source Internet Protocol (IP) address selector.
Furthermore, the described systems and methods result in a more compact RPDB with fewer entries (e.g., PBR rules) than conventional systems. In particular, several PBR rules that use firewall marks as selectors to select routing tables covering a single network domain may be combined into a single PBR rule. This reduced number of PBR rules results in a more compact RPDB that is easier to manage.
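As a rough arithmetic illustration of this compaction, with invented numbers (four domains, 256 classifications each): the conventional approach needs one rule per fwmark value, while the described approach needs only one masked rule per domain.

```python
domains, classes_per_domain = 4, 256   # hypothetical deployment

conventional_rules = domains * classes_per_domain  # one rule per fwmark value
compact_rules = domains                            # one masked rule per domain

print(conventional_rules, compact_rules)  # prints: 1024 4
```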
The systems and methods described herein may be applied to a load balancing, firewall, or routing system that uses multiple routing tables. In addition, the PBR described herein may be independent of a particular product and support multiple network domains with overlapping IP addresses.
Drawings
The systems, devices, structures, methods and designs may be best understood by referring to the following description and accompanying drawings that are used to illustrate embodiments. In the drawings:
fig. 1 illustrates a network system including a set of network domains, according to one embodiment.
Fig. 2 illustrates an example of a network device operating in a network domain of a network system, according to one embodiment.
Fig. 3 illustrates a firewall label (fwmark) including a network domain indication and a packet classification indication, according to one embodiment.
FIG. 4 illustrates a policy-based routing (PBR) rule including selectors and action predicates, according to one embodiment.
Fig. 5 illustrates a routing table with a set of entries according to one embodiment.
Fig. 6 illustrates an example of a network device operating in a network domain of a network system, according to one embodiment.
Fig. 7 illustrates a method for selecting a routing table in a PBR system according to one embodiment.
Figure 8A illustrates connections between Network Devices (NDs) within an exemplary network and three exemplary implementations of NDs, according to some embodiments.
Fig. 8B illustrates an exemplary manner of implementing a dedicated network device in accordance with some embodiments.
Figure 8C illustrates various exemplary ways in which Virtual Network Elements (VNEs) may be coupled, according to some embodiments.
Figure 8D illustrates a network with a single Network Element (NE) on each ND, according to some embodiments, and within this straightforward approach contrasts the traditional distributed approach (commonly used by traditional routers) with a centralized approach (also referred to as network control) for maintaining reachability and forwarding information.
Fig. 8E illustrates a simple case where each ND implements a single NE, but the centralized control plane has abstracted (represented) multiple NEs in different NDs into a single NE in one virtual network, according to some embodiments.
Figure 8F illustrates a scenario in which multiple VNEs are implemented on different NDs and coupled to each other, and the centralized control plane has abstracted these multiple VNEs such that they appear as a single VNE within one virtual network, according to some embodiments.
Fig. 9 illustrates a general control plane device with Centralized Control Plane (CCP) software 950, in accordance with some embodiments.
Detailed Description
The following description describes methods and apparatus for selecting a routing table for a packet in policy-based routing (PBR) by including a network domain indication and a packet classification indication associated with the packet in a firewall flag (fwmark). In the following description, numerous specific details such as logic implementations, opcodes, means to specify operands, resource partitioning/sharing/duplication implementations, types and interrelationships of system components, and logic partitioning/integration choices are set forth in order to provide a thorough understanding of the systems, devices, architectures, methods, and designs of the present invention. However, it will be understood by those skilled in the art that the embodiments described herein may be practiced without such specific details. In other instances, control structures, gate level circuits and full software instruction sequences have not been shown in detail in order not to obscure the system, apparatus, structure, method and design described herein. Those of ordinary skill in the art, with the included descriptions, will be able to implement appropriate functionality without undue experimentation.
References in the specification to "one embodiment," "an example embodiment," etc., indicate that the embodiment described may include a particular feature, structure, or characteristic, but every embodiment may not necessarily include the particular feature, structure, or characteristic. Moreover, such phrases are not necessarily referring to the same embodiment. Further, when a particular feature, structure, or characteristic is described in connection with an embodiment, it is submitted that it is within the knowledge of one skilled in the art to effect such feature, structure, or characteristic in connection with other embodiments whether or not explicitly described.
Bracketed text and blocks with dashed line boundaries (e.g., large dashed lines, small dashed lines, dot-dash lines, and dots) may be used herein to illustrate optional operations that add additional features to an embodiment. However, such notation should not be considered to imply that these are the only options or optional operations and/or that blocks with solid line boundaries are not optional in certain embodiments.
In the following description and claims, the terms "coupled" and "connected," along with their derivatives, may be used. It should be understood that these terms are not intended as synonyms for each other. "coupled" is used to indicate that two or more elements, which may or may not be in direct physical or electrical contact with each other, cooperate or interact with each other. "connected" is used to indicate that communication is established between two or more elements coupled to each other.
Fig. 1 illustrates a network system 1100 according to one embodiment.
Each
Fig. 2 shows an example of a network device 108 operating in a network domain of the network system, according to one embodiment.
Packet 202 may be a formatted data unit that includes control information and user data. For example, the control information may be located in a Physical (PHY) or Media Access Control (MAC) header of packet 202 and may include a source address of the transmitting device, a destination address of the receiving device (e.g., the final/intended destination of packet 202), service priority or quality of service (QoS) information, a length indicator, error detection/correction information, and/or one or more similar control information. Packet 202 may be an internet protocol datagram. The user data may be located in the payload of packet 202 and may include text, video, images, audio, or other similar pieces of data intended for use by the receiving device. For example, the packet 202 may include a portion of the video in the payload for viewing by a user of the receiving/destination device.
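As an illustration of such control information, the sketch below parses a few header fields (including the destination address) from a minimal, hand-built IPv4 header; the addresses are documentation examples, not real traffic.

```python
import socket
import struct

def parse_ipv4_header(data: bytes):
    """Extract a few control-information fields from a 20-byte IPv4 header."""
    (version_ihl, tos, total_len, ident, flags_frag,
     ttl, proto, checksum, src, dst) = struct.unpack("!BBHHHBBH4s4s", data[:20])
    return {
        "version": version_ihl >> 4,
        "ttl": ttl,
        "src": socket.inet_ntoa(src),
        "dst": socket.inet_ntoa(dst),  # the packet's final/intended destination
    }

# A hand-built header: version 4, IHL 5, TTL 64, protocol 17 (UDP),
# 10.0.0.1 -> 192.0.2.7 (checksum left as zero for the illustration).
header = struct.pack("!BBHHHBBH4s4s", 0x45, 0, 20, 0, 0, 64, 17, 0,
                     socket.inet_aton("10.0.0.1"), socket.inet_aton("192.0.2.7"))
```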
As shown in fig. 2, the classifier 204 of the
Accordingly, the packet 202 may be classified according to the technology specified for the
The classification described above produces labeled or "colored" packets so that the systems and methods described herein can readily identify packets belonging to a particular class and provide differentiated processing based on packet label/color (e.g., selecting different routing tables based on label/color). This differentiated processing may provide different QoS and policy based packet processing for classes/categories (e.g., colors) of packets. In some embodiments, the term "color" or "marking" may be referred to as a firewall marking (fwmark) and may be represented by an integer value.
Although described and illustrated in fig. 2 as classifier 204 of
As described above, classification of packet 202 by classifier 204 may result in a firewall label for packet 202. In one embodiment, the firewall tag for packet 202 may consist of two pieces of data: (1) a network domain indication and (2) a packet classification indication. Fig. 3 illustrates an
In one embodiment, the packet classification indication 304 corresponds to the classification performed by the classifier 204 described above, and the network domain indication 302 uniquely identifies the network domain (e.g., network domain 102) from which the packet 202 was received. Both the packet classification indication 304 and the network domain indication 302 may be represented in a binary address space. In one embodiment, network domain indication 302 may be assigned by classifier 204, while in other embodiments, network domain indication 302 may be assigned by another component of
As will be described herein,
In one embodiment, the length of the network domain indication 302 and the length of the packet classification indication 304 may be consistent/equal across all network domains. For example, the network domain indication 302 corresponding to the
After classifying the packet 202 and generating the
In one embodiment, as shown in FIG. 4, each
The routing table 212 indicates/describes the set of routes or next hops that a packet may take on its way to a destination. FIG. 5 illustrates an exemplary routing table 212-1 according to one embodiment. As shown in FIG. 5, routing table 212-1 may include a set of information fields 502, which may include a network destination address (e.g., a destination subnet), a netmask, a gateway (the next hop or gateway is the address of the next network device to which the packet is to be sent on its way to its final destination), an interface, and a metric (e.g., a cost associated with the path over which the packet is to be sent). In some embodiments, the network destination field and the netmask field may together be used to identify the next hop (110-1 to 110-X) for the destination of the packet.
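The fields described above can be modeled as a small routing table with a longest-prefix-match next-hop lookup; the destinations, gateways, interface names, and metrics below are invented for the example, and Python's standard ipaddress module supplies the prefix test.

```python
import ipaddress

# An illustrative routing table: each entry carries the fields described above.
routing_table = [
    {"destination": "10.1.0.0/16", "gateway": "10.1.0.1",  "interface": "eth0", "metric": 10},
    {"destination": "10.1.2.0/24", "gateway": "10.1.2.1",  "interface": "eth1", "metric": 5},
    {"destination": "0.0.0.0/0",   "gateway": "192.0.2.1", "interface": "eth2", "metric": 100},
]

def next_hop(dst_ip):
    """Return the gateway and interface of the longest-prefix matching entry."""
    addr = ipaddress.ip_address(dst_ip)
    matches = [e for e in routing_table
               if addr in ipaddress.ip_network(e["destination"])]
    best = max(matches, key=lambda e: ipaddress.ip_network(e["destination"]).prefixlen)
    return best["gateway"], best["interface"]
```

The default route (0.0.0.0/0) matches everything, so the most specific prefix always wins the `max` over prefix lengths.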
Different values in the
As described above, in some embodiments, the selection of the routing table 212 by the action predicate 404 may be performed by a function of the action predicate 404. For example, as shown in FIG. 4, the action predicates 404 can include taking the
As described above, in some instances,
Since the network domain indication 302 is used to determine a match with the
In some embodiments, RPDB208 may include both
In some embodiments, only one action predicate 404 may be executed for each received packet. Specifically, upon determining a match to the
After the routing table 212 is selected by the
Turning now to fig. 7, a
In one embodiment, the operations of
In one embodiment, the
As shown in fig. 1,
The packet 202 may include various information in one or more headers (e.g., Physical (PHY) and/or Medium Access Control (MAC) headers) and/or payload portions of the packet 202. In one embodiment, the information may include a destination address of packet 202 stored in a header of packet 202. The destination address indicates the ultimate destination of packet 202. For example, although packet 202 may be received by
After receiving the packet 202, a firewall label (fwmark)300 may be generated for the packet 202 at operation 704. In one embodiment, as shown in fig. 3,
The network domain indication 302 uniquely identifies the network domain (e.g., network domain 102) from which the packet 202 was received. In one embodiment, network domain indication 302 may be assigned by classifier 204, while in other embodiments, network domain indication 302 may be assigned by another component of
In one embodiment, packet classification indication 304 may describe information within packet 202 (e.g., information within a payload or header describing packet 202), information associated with packet 202, and/or information resulting from processing any of the preceding information. For example, classifier 204 algorithmically classifies packets according to specified classification criteria and outputs packet classification indication 304. In some embodiments, the classification criteria may be specified per network domain such that packets received from
For example, the classification technique associated with the
In one embodiment, the length of the network domain indication 302 and the length of the packet classification indication 304 may be consistent/equal across all network domains. However, in other embodiments, the length of the network domain indication 302 and/or the length of the packet classification indication 304 may be variable across the network domain. For example, the network domain indication 302 of the packet 202 received from the
After generating
As described above, in some embodiments, the length of the network domain indication 302 may depend on the network domain. To account for this variability, the
Upon determining a match between the network domain indication 302 of the packet 202 and the
Since the network domain indication 302 is used to determine a match with the
In some embodiments, the RPDB208 may include both PBR rules 210 (e.g., the PBR rules 210 shown in fig. 4) that are related to selecting a routing table 212 based on a
In some embodiments, only one action predicate 404 may be executed for each received packet 202. Specifically, upon determining a match to the
In one embodiment,
After identifying/selecting the routing table 212 for the packet 202, the packet 202 may be forwarded according to an entry in the identified/selected routing table 212 at
In systems having a large number of
Further, the described systems and methods act on
Furthermore, the described systems and methods result in a more compact RPDB208 with fewer entries (e.g., PBR rules 210) than conventional systems. In particular,
The systems and methods described herein may be applied to a load balancing, firewall, or routing system that uses multiple routing tables 212. In addition, the PBR described herein may be independent of a particular product and support multiple network domains with overlapping IP addresses.
As described above, the systems and methods described herein may be performed by one or more electronic devices. An electronic device stores and transmits (internally and/or with other electronic devices over a network) code (which is made up of software instructions, sometimes referred to as computer program code and/or computer program) and/or data using a machine-readable medium (also referred to as a computer-readable medium) such as a machine-readable storage medium (e.g., a magnetic disk, an optical disk, a solid state drive, a Read Only Memory (ROM), a flash memory device, a phase change memory) and a machine-readable transmission medium (also referred to as a carrier wave) (e.g., an electrical, optical, radio, acoustical or other form of propagated signal — such as a carrier wave, an infrared signal), and/or a computer-readable medium. Accordingly, an electronic device (e.g., a computer) includes hardware and software, such as a set of one or more processors (e.g., where a processor is a microprocessor, controller, microcontroller, central processing unit, digital signal processor, application specific integrated circuit, field programmable gate array, other electronic circuitry, a combination of one or more of the foregoing) coupled to one or more machine readable storage media that store code for execution on the set of processors and/or that store data. For example, an electronic device may include non-volatile memory that contains code because the non-volatile memory may hold the code/data even when the electronic device is turned off (when powered off), while when the electronic device is turned on, the portion of code that is to be executed by the processor of the electronic device is typically copied from the slower non-volatile memory into the volatile memory of the electronic device (e.g., Dynamic Random Access Memory (DRAM), Static Random Access Memory (SRAM)). 
A typical electronic device also includes a set of one or more physical Network Interfaces (NIs) to establish network connections (to transmit and/or receive code and/or data using propagated signals) with other electronic devices. For example, the set of physical NIs (or the set of physical NIs in combination with the set of processors executing code) may perform any formatting, encoding, or conversion to allow the electronic device to send and receive data over a wired connection and/or a wireless connection. In some embodiments, the physical NI may include radio circuitry capable of receiving data from other electronic devices over a wireless connection and/or transmitting data to other devices via a wireless connection. The radio circuitry may include one or more transmitters, one or more receivers, and/or one or more transceivers adapted for radio frequency communications. The radio circuitry may convert the digital data into a radio signal having suitable parameters (e.g., frequency, timing, channel, bandwidth, etc.). The radio signal may then be transmitted via an antenna to one or more suitable recipients. In some embodiments, the set of physical NIs may include one or more Network Interface Controllers (NICs), also referred to as network interface cards, network adapters, or Local Area Network (LAN) adapters. A NIC may facilitate connecting an electronic device to other electronic devices, allowing them to communicate via wires by plugging a cable into a physical port connected to the NIC. One or more portions of an embodiment may be implemented using different combinations of software, firmware, and/or hardware.
A Network Device (ND) is an electronic device that communicatively interconnects other electronic devices (e.g., other network devices, end-user devices) on a network. Some network devices are "multi-service network devices" that provide support for multiple networking functions (e.g., routing, bridging, switching, layer two aggregation, session border control, quality of service, and/or user management) and/or provide support for multiple application services (e.g., data, voice, and video).
Figure 8A illustrates connections between Network Devices (NDs) within an exemplary network, and three exemplary implementations of NDs, according to some embodiments. Figure 8A shows NDs 800A-H, and their connectivity by way of lines between 800A-800B, 800B-800C, 800C-800D, 800D-800E, 800E-800F, 800F-800G, and 800A-800G, as well as between 800H and each of 800A, 800C, 800D, and 800G. These NDs are physical devices, and the connectivity between them may be wireless or wired (often referred to as a link). Additional lines extending from the
Two of the exemplary ND implementations in fig. 8A are: 1) a dedicated network device 802 that uses custom Application-Specific Integrated Circuits (ASICs) and a dedicated Operating System (OS); and 2) a general purpose network device 804 that uses common off-the-shelf (COTS) processors and a standard OS.
The dedicated network device 802 includes
The dedicated network device 802 is generally considered, physically and/or logically, to include: 1) a ND control plane 824 (sometimes referred to as the control plane) comprising the processor 812 that executes the control communication and configuration modules 832A-R; and 2) a ND forwarding plane 826 (sometimes referred to as the forwarding plane, data plane, or media plane) comprising the forwarding resources 814 that utilize the forwarding tables 834A-R and the physical NI 816. By way of example, where the ND is a router (or is implementing routing functionality), the ND control plane 824 (the processor 812 executing the control communication and configuration modules 832A-R) is typically responsible for participating in controlling how data (e.g., packets) is to be routed (e.g., the next hop for the data and the outgoing physical NI for that data) and for storing that routing information in the forwarding tables 834A-R, while the
Fig. 8B illustrates an exemplary way to implement the dedicated network device 802, in accordance with some embodiments. Figure 8B shows a dedicated network device that includes cards 838, which are typically hot-pluggable. While in some embodiments the cards 838 are of two types (one or more cards (sometimes referred to as line cards) that operate as the
Returning to fig. 8A, the general-purpose network device 804 includes
The instantiation and virtualization (if implemented) of one or more groups of one or
In a particular embodiment, the virtualization layer 854 includes a virtual switch that provides forwarding services similar to physical ethernet switches. Specifically, the virtual switch forwards traffic between
The third exemplary ND implementation in FIG. 8A is a hybrid network device 806.
Regardless of the above-described exemplary implementation of a ND, the shortened term "Network Element (NE)" is sometimes used to refer to a single VNE among the multiple VNEs implemented by a ND, when only that VNE is part of a given virtual network, or when only a single VNE is currently being implemented by the ND. Also, in all of the above example implementations, each VNE (e.g., VNE830A-R, VNE860A-R, and those in the hybrid network device 806) receives data on the physical NIs (e.g., 816, 846) and forwards that data out of the appropriate one of the physical NIs (e.g., 816, 846). For example, a VNE implementing IP router functionality forwards IP packets on the basis of some of the IP header information in the IP packet; where IP header information includes a source IP address, a destination IP address, a source port, a destination port (where "source port" and "destination port" refer herein to protocol ports, as opposed to physical ports of a ND), a transport protocol (e.g., User Datagram Protocol (UDP) or Transmission Control Protocol (TCP)), and Differentiated Services Code Point (DSCP) values.
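The header-information lookup just described can be sketched in a few lines; the following is a minimal, hypothetical Python fragment, where the dictionary field names (src_ip, dscp, etc.) are illustrative assumptions rather than any real router's API.

```python
# Hedged sketch: build a VNE lookup key from parsed IP header information.
# The packet representation and field names are assumptions for illustration.

def flow_key(pkt: dict) -> tuple:
    """Return (source IP, destination IP, source port, destination port,
    transport protocol, DSCP); the ports are protocol ports, not physical NIs."""
    return (pkt["src_ip"], pkt["dst_ip"], pkt["src_port"],
            pkt["dst_port"], pkt["transport"], pkt["dscp"])

pkt = {"src_ip": "10.0.0.1", "dst_ip": "10.0.0.2", "src_port": 5000,
       "dst_port": 443, "transport": "TCP", "dscp": 46}
assert flow_key(pkt) == ("10.0.0.1", "10.0.0.2", 5000, 443, "TCP", 46)
```

Two packets with identical values in these fields yield the same key and would therefore be treated as belonging to the same flow by the VNE's forwarding lookup.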
Figure 8C illustrates various exemplary ways in which VNEs may be coupled, according to some embodiments. FIG. 8C shows VNE870A.1-870A.P (and optionally VNE870A.Q-870A.R) implemented in ND800A and VNE870H.1 implemented in
For example, the NDs of FIG. 8A may form part of the Internet or of a private network; and other electronic devices (not shown; such as end user devices including workstations, laptops, netbooks, tablets, palmtops, mobile phones, smartphones, multimedia phones, Voice over Internet Protocol (VoIP) phones, terminals, portable media players, GPS units, wearable devices, gaming systems, set-top boxes, Internet-enabled household appliances) may be coupled to the network (directly or through other networks such as access networks) to communicate with each other (directly or through servers) and/or access content and/or services over the network (e.g., the Internet or a Virtual Private Network (VPN) overlaid on the Internet (e.g., tunneled through it)). Such content and/or services are typically provided by one or more servers (not shown) belonging to a service/content provider, or by one or more end user devices (not shown) participating in a peer-to-peer (P2P) service, and may include, for example, public web pages (e.g., free content, store fronts, search services), private web pages (e.g., username/password accessed web pages providing email services), and/or corporate networks over VPNs. For instance, end user devices may be coupled (e.g., through customer premises equipment coupled to an access network (wired or wirelessly)) to edge NDs, which are coupled (e.g., through one or more core NDs) to other edge NDs, which are coupled to electronic devices acting as servers. However, through compute and storage virtualization, one or more of the electronic devices operating as the NDs in FIG. 8A may also host one or more such servers (e.g., one or
A virtual network is a logical abstraction of a physical network (e.g., the physical network in fig. 8A) that provides network services (e.g., L2 services and/or L3 services). The virtual network may be implemented as an overlay network (sometimes referred to as a network virtualization overlay) that provides network services (e.g., layer two (L2, data link layer) and/or layer three (L3, network layer) services) over an underlying network (e.g., an L3 network, such as an Internet Protocol (IP) network that uses tunnels (e.g., Generic Routing Encapsulation (GRE), layer two tunneling protocol (L2TP), IPSec) to create the overlay network).
A Network Virtualization Edge (NVE) is located at the edge of the underlying network and participates in implementing network virtualization; the network-facing side of the NVE uses an underlying network to tunnel frames to and from other NVEs; the outward facing side of the NVE transmits and receives data to and from systems external to the network. A Virtual Network Instance (VNI) is a particular instance of a virtual network on an NVE (e.g., a NE/VNE on a ND, a portion of a NE/VNE on a ND, where the NE/VNE is divided into multiple VNEs through simulation); one or more VNIs may be instantiated on the NVE (e.g., as different VNEs on the ND). A Virtual Access Point (VAP) is a logical connection point on the NVE for connecting external systems to the virtual network; the VAP may be a physical port or a virtual port identified by a logical interface identifier (e.g., VLAN ID).
Examples of network services include: 1) Ethernet LAN emulation service (an Ethernet-based multipoint service similar to an Internet Engineering Task Force (IETF) Multiprotocol Label Switching (MPLS) or Ethernet VPN (EVPN) service), in which external systems are interconnected across the network by a LAN environment over the underlying network (e.g., an NVE provides separate L2 VNIs (virtual switching instances) for different such virtual networks, and L3 (e.g., IP/MPLS) tunneling across the underlying network); and 2) a virtualized IP forwarding service (similar, from a service definition perspective, to IETF IP VPN (e.g., Border Gateway Protocol (BGP)/MPLS IP VPN)), in which external systems are interconnected across the network by an L3 environment over the underlying network (e.g., an NVE provides separate L3 VNIs (forwarding and routing instances) for different such virtual networks, and L3 (e.g., IP/MPLS) tunneling across the underlying network). Network services may also include quality of service capabilities (e.g., traffic classification marking, traffic conditioning and scheduling), security capabilities (e.g., filters to protect customer premises from network-originated attacks and to avoid malformed route announcements), and management capabilities (e.g., fault detection and processing).
Figure 8D illustrates a network with a single network element on each of the NDs of figure 8A, and within this straightforward approach contrasts the traditional distributed approach (commonly used by traditional routers) with a centralized approach (also known as network control) for maintaining reachability and forwarding information, according to some embodiments. In particular, FIG. 8D shows Network Elements (NEs) 870A-H having the same connectivity as the NDs 800A-H of FIG. 8A.
FIG. 8D illustrates distributed
For example, in the case of using private network device 802, control communications and configuration modules 832A-R of
Fig. 8D illustrates a centralized approach 874 (also known as Software Defined Networking (SDN)) that decouples the system that makes decisions about where traffic is sent from the underlying systems that forward the traffic to the selected destination. The illustrated
For example, where a dedicated network device 802 is used in the
While the above example uses a dedicated network device 802, the same
FIG. 8D also shows that the centralized control plane 876 has a northbound interface 884 to the application layer 886 where the application 888 resides. Centralized control plane 876 can form a virtual network 892 for applications 888 (sometimes referred to as a logical forwarding plane, a network service, or an overlay network (
Although FIG. 8D illustrates a distributed
Although FIG. 8D illustrates a simple case in which each ND800A-H implements a single NE870A-H, it should be understood that the network control method described with reference to FIG. 8D is also applicable to networks in which one or more of the
On the other hand, fig. 8E and 8F illustrate exemplary abstractions of NEs and VNEs, respectively, that may be presented by network controller 878 as part of different virtual networks 892. Figure 8E illustrates a simple case where each ND800A-H implements a single NE870A-H (see figure 8D), but the centralized control plane 876 has abstracted (represented) multiple NEs (
Fig. 8F illustrates a case where multiple VNEs (VNE 870a.1 and VNE870h.1) are implemented on different NDs (
While some embodiments implement centralized control plane 876 as a single entity (e.g., a single software instance running on a single electronic device), alternative embodiments may extend functionality across multiple entities (e.g., multiple software instances running on different electronic devices) for redundancy and/or scalability purposes.
Similar to the network device implementation, the one or more electronic devices running the centralized control plane 876, and thus the network controller 878 including the centralized reachability and forwarding information module 879, may be implemented in a variety of ways (e.g., a dedicated device, a general purpose (e.g., COTS) device, or a hybrid device). These electronic devices will similarly include one or more processors, a set of one or more physical NIs, and a non-transitory machine-readable storage medium having centralized control plane software stored thereon. For example, fig. 9 shows a general control plane device 904 including hardware 940, the hardware 940 including a set of one or more processors 942 (which are typically COTS processors) and a physical NI946, and a non-transitory machine readable storage medium 948 having Centralized Control Plane (CCP) software 950 stored therein.
In embodiments using compute virtualization, the processor 942 typically executes software to instantiate a virtualization layer 954. For example, in one embodiment the virtualization layer 954 represents the kernel of an operating system (or a shim executing on a base operating system) that allows for the creation of multiple instances 962A-R, called software containers (representing separate user spaces, also called virtualization engines, virtual private servers, or jails), that may each be used to execute a set of one or more applications; in another embodiment the virtualization layer 954 represents a hypervisor (sometimes referred to as a Virtual Machine Monitor (VMM)) or a hypervisor executing on top of a host operating system, and an application runs on top of a guest operating system within an instance 962A-R, called a virtual machine (which in some cases may be considered a tightly isolated form of software container), that is run by the hypervisor; in another embodiment an application is implemented as a unikernel, which can be generated by compiling directly with the application only the limited set of libraries (e.g., from a library operating system (LibOS) including drivers/libraries of OS services) that provide the particular OS services needed by the application, and the unikernel can run directly on the hardware 940, directly on a hypervisor represented by the virtualization layer 954 (in which case the unikernel is sometimes described as running within a LibOS virtual machine), or in a software container represented by one of the instances 962A-R. Again, in embodiments where compute virtualization is used, during operation an instance of the CCP software 950 (illustrated as CCP instance 976A) is executed on the virtualization layer 954 (e.g., within the instance 962A).
In embodiments where compute virtualization is not used, the CCP instance 976A is executed, as a unikernel or on top of a host operating system, on the "bare metal" general purpose control plane device 904. The instantiation of the CCP instance 976A, as well as the virtualization layer 954 and instances 962A-R (if implemented), are collectively referred to as software instance 952.
In some embodiments, CCP instance 976A includes network controller instance 978. The network controller instance 978 includes a centralized reachability and forwarding information module instance 979, which is a middleware layer that provides the context of the network controller 878 to the operating system and communicates with various NEs, and a CCP application layer 980 (sometimes referred to as an application layer) above the middleware layer (providing the intelligence required for various network operations, such as protocols, network context awareness, and user interfaces). At a more abstract level, this CCP application layer 980 within the centralized control plane 876 works with a virtual network view (a logical view of the network), and the middleware layer provides the translation from the virtual network to the physical view.
For each flow, centralized control plane 876 transmits the relevant messages to
Standards such as OpenFlow define protocols for messages and models for processing packets. The models for processing packets include header parsing, packet classification, and making forwarding decisions. Header parsing describes how to interpret packets based on a well-known set of protocols. Some protocol fields are used to construct the matching structure (or key) to be used in packet classification (e.g., a first key field may be a source Media Access Control (MAC) address and a second key field may be a destination MAC address).
Packet classification involves executing a lookup in memory to classify the packet by determining which entry in the forwarding tables (also referred to as a forwarding table entry or flow entry) best matches the packet, based upon the matching structure, or key, of the forwarding table entries. It is possible that many flows represented in the forwarding table entries can correspond to or match a packet; in this case the system is typically configured to determine one forwarding table entry from the many, according to a defined scheme (e.g., selecting the first forwarding table entry that is matched). A forwarding table entry includes both a specific set of match criteria (a set of values or wildcards, or an indication of what portions of a packet should be compared to particular values/wildcards, as defined by the matching capabilities, either for specific fields in the packet header or for some other packet content) and a set of one or more actions for the data plane to take upon receiving a matching packet. For example, an action may be to output the packet on a particular port, push a header onto the packet, flood the packet, or simply drop the packet. Thus, a forwarding table entry for IPv4/IPv6 packets with a particular Transmission Control Protocol (TCP) destination port could contain an action specifying that these packets should be dropped.
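The first-match scheme and the match-criteria/actions pairing described above can be sketched as follows. This is a minimal illustration, not the OpenFlow wire format: the table layout, field names, wildcard convention, and action strings are all assumptions, and punting unmatched packets to a controller is just one common configuration for a miss.

```python
# Sketch of first-match packet classification over a list of
# (match-criteria, actions) forwarding table entries; "*" is a wildcard.

def matches(criteria: dict, pkt: dict) -> bool:
    """True if every non-wildcard field in the entry equals the packet's value."""
    return all(v == "*" or pkt.get(k) == v for k, v in criteria.items())

def classify(table: list, pkt: dict) -> list:
    """Return the actions of the first matching entry (the defined scheme here)."""
    for criteria, actions in table:
        if matches(criteria, pkt):
            return actions
    return ["punt_to_controller"]  # match-miss: one possible handling

table = [
    ({"ip_proto": "tcp", "tcp_dst": 23}, ["drop"]),        # drop telnet traffic
    ({"eth_dst": "aa:bb:cc:dd:ee:ff"}, ["output:port2"]),  # forward by MAC
    ({}, ["flood"]),                                       # match-all fallback
]

pkt = {"ip_proto": "tcp", "tcp_dst": 23, "eth_dst": "aa:bb:cc:dd:ee:ff"}
assert classify(table, pkt) == ["drop"]  # first match wins: telnet is dropped
```

Note that the sample packet also matches the second entry; under the first-match scheme only the earlier entry's actions apply.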
Based on the forwarding table entries identified during packet classification, a forwarding decision is made and an action is performed by performing on the packet the set of actions identified in the matching forwarding table entry.
However, when an unknown packet (e.g., "missed packet" or "match-miss" as used in OpenFlow parlance) arrives at the
A Network Interface (NI) may be physical or virtual; and in the context of IP, an interface address is an IP address assigned to an NI, be it a physical NI or a virtual NI. A virtual NI may be associated with a physical NI, with another virtual interface, or stand on its own (e.g., a loopback interface, a point-to-point protocol interface). An NI (physical or virtual) may be numbered (an NI with an IP address) or unnumbered (an NI without an IP address). A loopback interface (and its loopback address) is a specific type of virtual NI (and IP address) of a NE/VNE (physical or virtual) often used for management purposes, where such an IP address is referred to as the nodal loopback address. The IP addresses assigned to the NIs of a ND are referred to as IP addresses of that ND; at a more granular level, the IP addresses assigned to NIs assigned to a NE/VNE implemented on a ND can be referred to as IP addresses of that NE/VNE.
The next hop selection by the routing system for a given destination may resolve to one path (i.e., a routing protocol may generate one next hop on a shortest path); but if the routing system determines there are multiple viable next hops (i.e., the routing-protocol-generated forwarding solution offers more than one next hop on a shortest path, that is, multiple equal-cost next hops), some additional criteria are used; for instance, in a connectionless network, Equal Cost Multi Path (ECMP) (also known as Equal Cost Multi Pathing, multipath forwarding, and IP multipath) may be used (e.g., typical implementations use particular header fields as the criteria to ensure that the packets of a particular packet flow are always forwarded on the same next hop, to preserve packet flow ordering). For purposes of multipath forwarding, a packet flow is defined as a set of packets that share an ordering constraint. As an example, the set of packets in a particular TCP transfer sequence need to arrive in order, else the TCP logic will interpret the out-of-order delivery as congestion and slow the TCP transfer rate down.
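The "particular header fields as criteria" behavior can be sketched as a hash over the flow-identifying fields, so that every packet of a flow maps to the same next hop. The field names and hop list below are illustrative assumptions; real forwarding planes typically hash in hardware with vendor-specific field sets and seeds.

```python
import hashlib

# Sketch of ECMP next-hop selection: hash the flow-identifying header fields
# so all packets of a given flow take the same next hop (preserving ordering).

def ecmp_next_hop(pkt: dict, next_hops: list) -> str:
    """Deterministically map a flow to one of several equal-cost next hops."""
    flow = "{src_ip}|{dst_ip}|{src_port}|{dst_port}|{transport}".format(**pkt)
    digest = hashlib.sha256(flow.encode()).digest()
    return next_hops[int.from_bytes(digest[:4], "big") % len(next_hops)]

hops = ["nh-A", "nh-B", "nh-C"]
pkt = {"src_ip": "10.0.0.1", "dst_ip": "10.0.0.2",
       "src_port": 5000, "dst_port": 443, "transport": "TCP"}
assert ecmp_next_hop(pkt, hops) == ecmp_next_hop(pkt, hops)  # stable per flow
assert ecmp_next_hop(pkt, hops) in hops
```

Because the hash input excludes anything that varies packet-to-packet (e.g., sequence numbers), load is spread across flows while each flow's intra-flow ordering constraint is respected.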
A third layer (L3) Link Aggregation (LAG) link is a link that directly connects two NDs with multiple IP addressed link paths (each link path being assigned a different IP address) and performs load distribution decisions across these different link paths at the ND forwarding plane; in this case, load sharing decisions are made between link paths.
Some NDs include functionality for authentication, authorization, and accounting (AAA) protocols (e.g., RADIUS (Remote Authentication Dial-In User Service), Diameter, and/or TACACS+ (Terminal Access Controller Access-Control System Plus)). AAA can be provided through a client/server model, where the AAA client is implemented on a ND and the AAA server can be implemented either locally on the ND or on a remote electronic device coupled with the ND. Authentication is the process of identifying and verifying a user. For instance, a user might be identified by a combination of a username and a password, or through a unique key. Authorization determines what a user can do after being authenticated, such as gaining access to certain electronic device information resources (e.g., through the use of access control policies). Accounting is the recording of user activity. By way of a summary example, end user devices may be coupled (e.g., through an access network) through an edge ND (supporting AAA processing) coupled to core NDs coupled to electronic devices implementing servers of service/content providers. AAA processing is performed to identify for a user the user record stored in the AAA server for that user. A user record includes a set of attributes (e.g., username, password, authentication information, access control information, rate-limiting information, policy information) used during processing of that user's traffic.
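The AAA flow above can be sketched with a hypothetical in-memory user record. The username, attribute names, and resource names below follow the list in the text but are otherwise invented; a real AAA server would of course not store plaintext passwords or live in a Python dict.

```python
# Hedged sketch of the three AAA functions against a hypothetical user record.

USER_RECORDS = {
    "alice": {
        "password": "s3cret",                  # authentication material
        "access_control": {"video_service"},   # authorization policy
        "rate_limit_kbps": 10_000,             # rate-limiting information
        "activity_log": [],                    # accounting trail
    },
}

def authenticate(user: str, password: str) -> bool:
    """Identify and verify the user against the stored record."""
    rec = USER_RECORDS.get(user)
    return rec is not None and rec["password"] == password

def authorize(user: str, resource: str) -> bool:
    """Determine what an authenticated user may access."""
    return resource in USER_RECORDS[user]["access_control"]

def account(user: str, event: str) -> None:
    """Record user activity."""
    USER_RECORDS[user]["activity_log"].append(event)

assert authenticate("alice", "s3cret")
assert not authenticate("alice", "wrong")
assert authorize("alice", "video_service")
```

In the client/server model described above, the ND would hold only the AAA client; `USER_RECORDS` stands in for the server-side store reached over RADIUS, Diameter, or TACACS+.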
Certain NDs (e.g., certain edge NDs) internally represent end user devices (or sometimes customer premises equipment (CPE) such as a residential gateway (e.g., a router, a modem)) using subscriber circuits. A subscriber circuit uniquely identifies a subscriber session within the ND and typically exists for the lifetime of the session. Thus, a ND typically allocates a subscriber circuit when the subscriber connects to that ND, and correspondingly de-allocates that subscriber circuit when the subscriber disconnects. Each subscriber session represents a distinguishable flow of packets communicated between the ND and an end user device (or sometimes CPE such as a residential gateway or modem) using a protocol, such as Point-to-Point Protocol over another protocol (PPPoX) (e.g., where X is Ethernet or Asynchronous Transfer Mode (ATM)), Ethernet, 802.1Q Virtual LAN (VLAN), Internet Protocol, or ATM. A subscriber session can be initiated using a variety of mechanisms (e.g., manual provisioning, Dynamic Host Configuration Protocol (DHCP), DHCP/Clientless Internet Protocol Service (CLIPS), or Media Access Control (MAC) address tracking). For example, the Point-to-Point Protocol (PPP) is commonly used for Digital Subscriber Line (DSL) services and requires installation of a PPP client that enables the subscriber to enter a username and a password, which in turn may be used to select a subscriber record. When DHCP is used (e.g., for cable modem services), a username typically is not provided; but in such situations other information (e.g., information that includes the MAC address of the hardware in the end user device (or CPE)) is provided. The use of DHCP and CLIPS on the ND captures the MAC addresses and uses these addresses to distinguish subscribers and access their subscriber records.
Virtual Circuits (VCs), synonymous with virtual connections and virtual channels, are connection-oriented communication services that are transported using packet-mode communication. Virtual circuit communication is similar to circuit switching in that both are connection-oriented, which means that in both cases the data is transferred in the correct order and signalling overhead is required during the connection set-up phase. Virtual circuits may exist at different layers. For example, at layer 4, connection-oriented transport layer data link protocols, such as the Transmission Control Protocol (TCP), may rely on connectionless packet-switched network layer protocols, such as IP, where different packets may be routed on different paths and thus delivered out of order. If a reliable virtual circuit is established using TCP over the underlying unreliable and connectionless IP protocol, the virtual circuit is identified by a source and destination network socket address pair (i.e., a sender IP address and a receiver IP address and port number). However, virtual circuits are possible because TCP includes segment numbering and reordering at the receiver side to prevent out-of-order delivery. Virtual circuits are also possible at the third layer (network layer) and the second layer (data link layer); such virtual circuit protocols are based on connection-oriented packet switching, which means that data is always transported along the same network path, i.e. through the same NE/VNE. In such protocols, packets are not routed individually and complete addressing information is not provided in the header of each data packet; only a small Virtual Channel Identifier (VCI) is required in each packet; and the routing information is transferred to the NE/VNE during the connection establishment phase; the exchange simply involves looking up the virtual channel identifier in a table, rather than analyzing the complete address. 
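The setup-then-lookup behavior described above, where routing information is transferred during connection establishment and the data phase merely looks up a small virtual channel identifier, can be sketched as follows. The port numbers, VCI values, and table shape are illustrative assumptions.

```python
# Sketch of connection-oriented label switching at a single NE/VNE:
# the VCI table is installed during the connection setup phase, and
# per-packet forwarding is only a table lookup (with possible relabeling),
# not an analysis of a complete destination address.

class VirtualCircuitSwitch:
    def __init__(self):
        self._table = {}  # (in_port, in_vci) -> (out_port, out_vci)

    def setup(self, in_port: int, in_vci: int,
              out_port: int, out_vci: int) -> None:
        """Connection establishment: install the routing information."""
        self._table[(in_port, in_vci)] = (out_port, out_vci)

    def forward(self, in_port: int, in_vci: int) -> tuple:
        """Data phase: look up the small VCI instead of a full address."""
        return self._table[(in_port, in_vci)]

sw = VirtualCircuitSwitch()
sw.setup(in_port=1, in_vci=42, out_port=3, out_vci=17)
assert sw.forward(1, 42) == (3, 17)
```

Because every packet of the circuit hits the same table entry, data is always carried along the same network path through the same NE/VNE, which is exactly the connection-oriented property the text describes.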
Examples of network-layer and data-link-layer virtual circuit protocols, where data is always carried over the same path, include: X.25, where the VC is identified by a Virtual Channel Identifier (VCI); Frame Relay, where the VC is identified by a VCI; Asynchronous Transfer Mode (ATM), where the circuit is identified by a Virtual Path Identifier (VPI) and Virtual Channel Identifier (VCI) pair; General Packet Radio Service (GPRS); and Multiprotocol Label Switching (MPLS), which can be used for IP over virtual circuits (each circuit identified by a label).
Certain NDs (e.g., certain edge NDs) use a hierarchy of circuits. The leaf nodes of the hierarchy of circuits are subscriber circuits. The subscriber circuits have parent circuits in the hierarchy that typically represent aggregations of multiple subscriber circuits, and thus the network segments and elements used to provide access network connectivity of those end user devices to the ND. These parent circuits may represent physical or logical aggregations of subscriber circuits (e.g., a Virtual Local Area Network (VLAN), a Permanent Virtual Circuit (PVC) (e.g., for Asynchronous Transfer Mode (ATM)), a circuit group, a channel, a pseudo-wire, a physical NI of the ND, and a link aggregation group). A circuit group is a virtual construct that allows various sets of circuits to be grouped together for configuration purposes (e.g., aggregate rate control). A pseudo-wire is an emulation of a layer 2 point-to-point connection-oriented service. A link aggregation group is a virtual construct that merges multiple physical NIs for purposes of bandwidth aggregation and redundancy. Thus, the parent circuits physically or logically encapsulate the subscriber circuits.
For example, in the case of multiple virtual routers, each virtual router may share system resources, but be separated from other virtual routers in terms of its administrative domain, AAA (authentication, authorization, and accounting) name space, IP addresses, and routing databases.
Within a particular ND, a physical NI-independent "interface" may be configured as part of the VNE to provide higher layer protocol and service information (e.g., layer 3 addressing). The user record in the AAA server identifies, among other user configuration requirements, which context (e.g., which VNE/NE) the corresponding user should be bound to within the ND. As used herein, a binding forms an association between a physical entity (e.g., a physical NI, a channel) or a logical entity (e.g., a circuit such as a user circuit or a logical circuit (a set of one or more user circuits)) and an interface of a context for which a network protocol (e.g., a routing protocol, a bridging protocol) is configured through the interface. When some higher layer protocol interfaces are configured and associated with a physical entity, user data flows over the physical entity.
Some NDs provide support for implementing VPNs (virtual private networks), such as layer 2 VPNs and/or layer 3 VPNs. For example, NDs in which a provider's network and a customer's network are coupled are referred to as PEs (provider edges) and CEs (customer edges), respectively. In a layer 2 VPN, forwarding is typically performed on the CEs on either end of the VPN, and traffic is sent over the network (e.g., through one or more PEs coupled by other NDs). Layer 2 circuits are configured between CEs and PEs (e.g., ethernet ports, ATM Permanent Virtual Circuits (PVCs), frame relay PVCs). In a layer 3 VPN, routing is typically performed by PEs. As an example, an edge ND supporting multiple VNEs may be deployed as a PE; the VNE may be configured with VPN protocols and is therefore referred to as a VPN VNE.
Some NDs provide support for VPLS (Virtual Private LAN Service). For example, in a VPLS network, end user devices access content/services provided through the VPLS network by coupling to CEs, which are coupled through PEs coupled by other NDs. VPLS networks can be used for implementing triple-play network applications (e.g., data applications (e.g., high-speed Internet access), video applications (e.g., television services such as IPTV (Internet Protocol Television), VoD (Video-on-Demand) services), and voice applications (e.g., VoIP (Voice over Internet Protocol) services)), VPN services, etc. VPLS is a type of layer 2 VPN that can be used for multipoint connectivity. VPLS networks also allow end user devices that are coupled with CEs at separate geographical locations to communicate with each other across a Wide Area Network (WAN) as if they were directly attached to each other in a Local Area Network (LAN), referred to as an emulated LAN.
In a VPLS network, each CE is typically attached to a bridging module of a PE, possibly through an access network (wired and/or wireless), via an attachment circuit (e.g., a virtual link or connection between the CE and the PE). The bridging module of the PE is attached to the emulated LAN through the emulated LAN interface. Each bridge module acts as a "virtual switch instance" (VSI) by maintaining a forwarding table that maps MAC addresses to pseudowires and attachment circuits. The PEs forward the frames (received from the CEs) to destinations (e.g., other CEs, other PEs) based on MAC destination address fields included in the frames.
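The VSI behavior just described, learning source MACs into a forwarding table that maps MAC addresses to pseudowires and attachment circuits, then forwarding by destination MAC and flooding frames with unknown destinations, can be sketched as follows. The circuit names are hypothetical labels standing in for pseudowires and attachment circuits.

```python
# Sketch of a "virtual switch instance" (VSI) bridge module:
# learn the source MAC on receive, forward by destination MAC,
# and flood to all other circuits when the destination is unknown.

class VirtualSwitchInstance:
    def __init__(self, circuits):
        self.circuits = set(circuits)  # pseudowires + attachment circuits
        self.fdb = {}                  # forwarding table: MAC -> circuit

    def receive(self, frame: dict, in_circuit: str) -> set:
        """Return the set of circuits the frame is forwarded onto."""
        self.fdb[frame["src_mac"]] = in_circuit          # MAC learning
        out = self.fdb.get(frame["dst_mac"])
        if out is not None and out != in_circuit:
            return {out}                                 # known unicast
        return self.circuits - {in_circuit}              # flood on miss

vsi = VirtualSwitchInstance({"ac1", "pw1", "pw2"})
# Unknown destination: flooded to every circuit except the ingress one.
assert vsi.receive({"src_mac": "M1", "dst_mac": "M2"}, "ac1") == {"pw1", "pw2"}
# M1 was learned above, so the reply is forwarded only to its circuit.
assert vsi.receive({"src_mac": "M2", "dst_mac": "M1"}, "pw1") == {"ac1"}
```

This mirrors the emulated-LAN behavior: after a round trip, both MAC addresses are learned and traffic between them no longer floods the network.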
In one embodiment, one or more of the operations and functions described above with respect to fig. 1-7 may be implemented by components described with respect to the methods and elements of fig. 8A-8F and 9. For example, the classifier 204 and/or the
In some embodiments, virtualization may be utilized to provide NFV. For example, the
While the systems, devices, structures, methods, and designs herein have been described in terms of several embodiments, those skilled in the art will recognize that the systems, devices, structures, methods, and designs can be practiced with modification and alteration within the spirit and scope of the appended claims and not limited to the described embodiments. The description is thus to be regarded as illustrative instead of limiting.
Moreover, while the flow diagrams in the figures show a particular order of operations performed by particular embodiments, it should be understood that such order is exemplary (e.g., alternative embodiments may perform the operations in a different order, combine particular operations, overlap particular operations, etc.).