SDN data center network elephant flow scheduling method based on differential evolution algorithm

Document No.: 1965918 | Publication date: 2021-12-14

Reading note: This technology, "SDN data center network elephant flow scheduling method based on differential evolution algorithm" (基于差分进化算法的sdn数据中心网络大象流调度方法), was designed by 李宏慧, 代荣荣 and 付学良 on 2021-08-24. The invention discloses an SDN data center network elephant flow scheduling method based on a differential evolution algorithm, comprising the following steps: establishing a data center network elephant flow scheduling model; collecting data center network link state information; scheduling data flows newly arriving at the data center network using the ECMP algorithm; if the maximum link utilization of any link traversed by an elephant flow exceeds the threshold, executing steps S5 to S7, otherwise returning to step S2; selecting the k shortest available paths that share the elephant flow's source and destination nodes as candidate paths for rerouting the elephant flow; computing the globally optimal path from the k candidate paths using the traffic scheduling method based on the differential evolution algorithm; and converting the computed new path into flow table entries, issuing them to the network nodes, and rerouting the elephant flow. The method computes a globally optimal path for elephant flows on congested links and then reroutes them, thereby achieving global dynamic traffic scheduling, reducing the maximum link utilization of the network, and improving the bisection bandwidth of the network.

1. An SDN data center network elephant flow scheduling method based on a differential evolution algorithm, characterized by comprising the following steps:

S1: establishing a data center network elephant flow scheduling model;

S2: continuously collecting data center network link state information through an sFlow-RT collector;

S3: scheduling data flows newly arriving at the data center network using the ECMP algorithm;

S4: if the maximum link utilization of any link traversed by an elephant flow exceeds the threshold, executing steps S5 to S7; otherwise returning to step S2;

S5: selecting the k shortest available paths that share the elephant flow's source and destination nodes as candidate paths for rerouting the elephant flow;

S6: computing the globally optimal path from the k candidate paths using the traffic scheduling method based on the differential evolution algorithm, according to the current network state collected by the sFlow-RT collector;

S7: converting the computed new path into flow table entries, sending them to the network nodes, rerouting the elephant flow, and returning to step S2 to continue scheduling.

2. The SDN data center network elephant flow scheduling method based on the differential evolution algorithm as recited in claim 1, wherein the specific operation of step S1 comprises the following steps:

S101: the data center network topology is represented as a graph $G(S, L)$, where $S$ is the set of all network nodes, with node $s_i \in S$, $i = 1, 2, \ldots$; $L$ is the set of all links in the network, and each link $l \in L$ has capacity $C_l$ and link utilization $u_l$;

S102: let $L_i$ and $L'_i$ be subsets of the link set $L$ such that every link $l_i \in L_i$ and every link $l'_i \in L'_i$ has $s_i$ as an endpoint, and data flows enter or leave node $s_i$ through these links; the set of data flows causing congestion is denoted $E$, and each data flow $e \in E$ has bandwidth $b_e$, a source node and a destination node; a variable indicates whether data flow $e$ passes through link $l$;

S103: the link utilization $u_l$ is defined as the ratio of the sum of the bandwidths of all data flows $e$ traversing link $l$ to the capacity $C_l$ of link $l$; the optimization objective of the traffic scheduling problem is to minimize the maximum link utilization in the network; when the optimization objective is reached, the traffic scheduling satisfies the following constraint conditions:

3. The SDN data center network elephant flow scheduling method based on the differential evolution algorithm as recited in claim 2, wherein the specific operation of the traffic scheduling method based on the differential evolution algorithm in step S6 comprises the following steps:

S601: initializing the algorithm;

S602: performing the mutation operation;

S603: performing the crossover operation;

S604: performing the selection operation;

S605: determining whether the iteration termination condition is met; if so, exiting the loop; otherwise, repeating steps S602 to S604.

4. The SDN data center network elephant flow scheduling method based on the differential evolution algorithm as recited in claim 3, wherein the specific operation of step S601 comprises the following steps:

S6011: setting the loop control conditions; the loop control variable is set to $t = 1$ and the maximum number of iterations to MaxT;

S6012: initializing the solution space R; the k shortest paths are generated with Yen's algorithm and form the candidate solution space R of candidate paths;

S6013: initializing the population; M individuals are selected in sequence from the solution space R to form the population Pop, where each individual $X_i$ is a complete path consisting of n links, i.e. $X_i = \{l_1, l_2, \ldots, l_n\}$, $X_i \in R$, $1 \le i \le M$;

S6014: defining the fitness function; the fitness function of individual $X_i$ is defined as $F(X_i) = 1/(a \times \mathrm{HOPS}(X_i) + b \times \mathrm{MAXU}(X_i))$, where $\mathrm{HOPS}(X_i)$ is the length of path individual $X_i$, $\mathrm{MAXU}(X_i)$ is the maximum link utilization among the links of $X_i$, and a and b are influence factors.

5. The SDN data center network elephant flow scheduling method based on the differential evolution algorithm as recited in claim 4, wherein the specific operation of step S602 comprises the following steps:

in the t-th iteration, each link of individual $X_i(t)$ whose link utilization exceeds the threshold H is replaced by one or more adjacent non-congested links, forming the mutant individual $V_i(t)$;

the replacement links all come from the k candidate paths in the candidate solution space R, the maximum link utilization over all replacement links is below the threshold H, and the replacement chosen is the one with the shortest length in the solution space R.

6. The SDN data center network elephant flow scheduling method based on the differential evolution algorithm as recited in claim 5, wherein the specific operation of step S603 comprises the following steps:

S6031: in the t-th iteration, a random number r is drawn from the interval $[0, 1]$; if r is greater than the crossover factor cr, the mutant individual $V_i(t)$ is kept as the crossover individual $U_i(t)$ and the method proceeds directly to the selection operation of step S604; if r is less than or equal to the crossover factor cr, the following steps perform a crossover operation on the original individual $X_i(t)$ and the mutant individual $V_i(t)$;

S6032: the individuals $X_i(t)$ and $V_i(t)$ are traversed in order, and when the first common node appears, all links after that node are exchanged to obtain the crossover individual $U_i(t)$; $U_i(t)$ is checked for loops, and if a loop exists, the path with the loop removed is kept as the crossover individual $U_i(t)$;

S6033: if the crossover individual $U_i(t)$ is identical to the original individual $X_i(t)$, a path individual $W(t)$ with the same source and destination as $V_i(t)$ is selected at random from the population Pop, and the crossover operation of step S6032 is performed on $W(t)$ and the mutant individual $V_i(t)$ to obtain the crossover individual $U_i(t)$.

7. The SDN data center network elephant flow scheduling method based on the differential evolution algorithm as recited in claim 6, wherein the specific operation of step S604 comprises the following steps:

S6041: the fitness values of the crossover individual $U_i(t)$ and the original individual $X_i(t)$ are calculated with the fitness function of S6014;

S6042: if the fitness value of the crossover individual $U_i(t)$ is better than that of the original individual $X_i(t)$, the crossover individual $U_i(t)$ of the t-th iteration replaces the original individual $X_i(t)$ and is kept in the next-generation population; otherwise the individual $X_i(t)$ is kept, i.e.

$$X_i(t+1) = \begin{cases} U_i(t), & F(U_i(t)) > F(X_i(t)) \\ X_i(t), & \text{otherwise} \end{cases}$$

Technical Field

The invention relates to the technical field of computer network application, in particular to a method for scheduling elephant flow of an SDN data center network based on a differential evolution algorithm.

Background

In recent years, big data and cloud computing have developed rapidly, and data centers built from two or three layers of interconnected switches or routers have become infrastructure for Internet-based information systems. As data centers have grown in scale, the volume of communication within the data center network has grown exponentially, and so has the demand for bandwidth. The traditional data center network architecture is overwhelmed: link congestion arises easily and effective traffic transmission service cannot be provided, so the network traffic scheduling problem is receiving increasing attention.

In a traditional data center network architecture, the routing algorithm cannot collect dynamic information about the whole network and therefore cannot achieve globally optimized scheduling of network traffic. A Software Defined Network (SDN) separates the control plane from the data plane of the switch, so the controller can track the usage of the whole network in real time and schedule network traffic more accurately, which provides a new approach for developing new network applications and future Internet technologies.

To date, Equal-Cost Multi-Path routing (ECMP) has been widely used for data center network traffic scheduling. When multiple equal-cost paths to the same destination node are available, the ECMP algorithm hashes each data flow and distributes the flows across the equal-cost paths according to the hash values, in order to balance the network load. Research shows that data center network traffic consists of two types of data flows, elephant flows and mouse flows, where a data flow occupying 10% or more of the link bandwidth is called an elephant flow. In a data center network, elephant flows are few in number but long in duration, and they carry up to 90% of the data volume. ECMP is a static traffic scheduling algorithm that does not take the real-time usage of the network into account. When elephant flows are present in the network, ECMP may place multiple data flows on the same path, causing link congestion, load imbalance, and low network resource utilization.
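The static nature of ECMP can be illustrated with a minimal Python sketch. The function name `ecmp_select`, the CRC32 hash, the 5-tuple layout and the switch names are assumptions chosen for illustration; real switches use vendor-specific hash functions.

```python
import zlib

def ecmp_select(flow, paths):
    """Pick one equal-cost path for a flow by hashing its 5-tuple."""
    key = "|".join(str(field) for field in flow).encode()
    return paths[zlib.crc32(key) % len(paths)]

# Hypothetical equal-cost paths between two edge switches.
paths = [
    ["e1", "a1", "c1", "a3", "e3"],
    ["e1", "a1", "c2", "a3", "e3"],
    ["e1", "a2", "c3", "a4", "e3"],
    ["e1", "a2", "c4", "a4", "e3"],
]
# Hypothetical flow: (src IP, dst IP, src port, dst port, protocol).
flow = ("10.0.0.1", "10.2.0.1", 5001, 80, "tcp")
chosen = ecmp_select(flow, paths)
```

The choice depends only on the flow's header fields, never on flow size or current link load, which is why two elephant flows can hash to the same path.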

Disclosure of Invention

In view of the existing problems, the invention aims to provide an SDN data center network elephant flow scheduling method based on a differential evolution algorithm.

In order to achieve the purpose, the technical scheme adopted by the invention is as follows:

An SDN data center network elephant flow scheduling method based on a differential evolution algorithm, characterized by comprising the following steps:

S1: establishing a data center network elephant flow scheduling model;

S2: continuously collecting data center network link state information through an sFlow-RT collector;

S3: scheduling data flows newly arriving at the data center network using the ECMP algorithm;

S4: if the maximum link utilization of any link traversed by an elephant flow exceeds the threshold, executing steps S5 to S7; otherwise returning to step S2;

S5: selecting the k shortest available paths that share the elephant flow's source and destination nodes as candidate paths for rerouting the elephant flow;

S6: computing the globally optimal path from the k candidate paths using the traffic scheduling method based on the differential evolution algorithm, according to the current network state collected by the sFlow-RT collector;

S7: converting the computed new path into flow table entries, sending them to the network nodes, rerouting the elephant flow, and returning to step S2 to continue scheduling.

Further, the specific operation of step S1 comprises the following steps:

S101: the data center network topology is represented as a graph $G(S, L)$, where $S$ is the set of all network nodes, with node $s_i \in S$, $i = 1, 2, \ldots$; $L$ is the set of all links in the network, and each link $l \in L$ has capacity $C_l$ and link utilization $u_l$;

S102: let $L_i$ and $L'_i$ be subsets of the link set $L$ such that every link $l_i \in L_i$ and every link $l'_i \in L'_i$ has $s_i$ as an endpoint, and data flows enter or leave node $s_i$ through these links; the set of data flows causing congestion is denoted $E$, and each data flow $e \in E$ has bandwidth $b_e$, a source node and a destination node; a variable indicates whether data flow $e$ passes through link $l$;

S103: the link utilization $u_l$ is defined as the ratio of the sum of the bandwidths of all data flows $e$ traversing link $l$ to the capacity $C_l$ of link $l$; the optimization objective of the traffic scheduling problem is to minimize the maximum link utilization in the network; when the optimization objective is reached, the traffic scheduling satisfies the following constraint conditions:

Further, the specific operation of the traffic scheduling method based on the differential evolution algorithm in step S6 comprises the following steps:

S601: initializing the algorithm;

S602: performing the mutation operation;

S603: performing the crossover operation;

S604: performing the selection operation;

S605: determining whether the iteration termination condition is met; if so, exiting the loop; otherwise, repeating steps S602 to S604.

Further, the specific operation of step S601 comprises the following steps:

S6011: setting the loop control conditions; the loop control variable is set to $t = 1$ and the maximum number of iterations to MaxT;

S6012: initializing the solution space R; the k shortest paths are generated with Yen's algorithm and form the candidate solution space R of candidate paths;

S6013: initializing the population; M individuals are selected in sequence from the solution space R to form the population Pop, where each individual $X_i$ is a complete path consisting of n links, i.e. $X_i = \{l_1, l_2, \ldots, l_n\}$, $X_i \in R$, $1 \le i \le M$;

S6014: defining the fitness function; the fitness function of individual $X_i$ is defined as $F(X_i) = 1/(a \times \mathrm{HOPS}(X_i) + b \times \mathrm{MAXU}(X_i))$, where $\mathrm{HOPS}(X_i)$ is the length of path individual $X_i$, $\mathrm{MAXU}(X_i)$ is the maximum link utilization among the links of $X_i$, and a and b are influence factors.

Further, the specific operation of step S602 comprises the following steps:

in the t-th iteration, each link of individual $X_i(t)$ whose link utilization exceeds the threshold H is replaced by one or more adjacent non-congested links, forming the mutant individual $V_i(t)$;

the replacement links all come from the k candidate paths in the candidate solution space R, the maximum link utilization over all replacement links is below the threshold H, and the replacement chosen is the one with the shortest length in the solution space R.

Further, the specific operation of step S603 comprises the following steps:

S6031: in the t-th iteration, a random number r is drawn from the interval $[0, 1]$; if r is greater than the crossover factor cr, the mutant individual $V_i(t)$ is kept as the crossover individual $U_i(t)$ and the method proceeds directly to the selection operation of step S604; if r is less than or equal to the crossover factor cr, the following steps perform a crossover operation on the original individual $X_i(t)$ and the mutant individual $V_i(t)$;

S6032: the individuals $X_i(t)$ and $V_i(t)$ are traversed in order, and when the first common node appears, all links after that node are exchanged to obtain the crossover individual $U_i(t)$; $U_i(t)$ is checked for loops, and if a loop exists, the path with the loop removed is kept as the crossover individual $U_i(t)$;

S6033: if the crossover individual $U_i(t)$ is identical to the original individual $X_i(t)$, a path individual $W(t)$ with the same source and destination as $V_i(t)$ is selected at random from the population Pop, and the crossover operation of step S6032 is performed on $W(t)$ and the mutant individual $V_i(t)$ to obtain the crossover individual $U_i(t)$.

Further, the specific operation of step S604 comprises the following steps:

S6041: the fitness values of the crossover individual $U_i(t)$ and the original individual $X_i(t)$ are calculated with the fitness function of S6014;

S6042: if the fitness value of the crossover individual $U_i(t)$ is better than that of the original individual $X_i(t)$, the crossover individual $U_i(t)$ of the t-th iteration replaces the original individual $X_i(t)$ and is kept in the next-generation population; otherwise the individual $X_i(t)$ is kept, i.e.

$$X_i(t+1) = \begin{cases} U_i(t), & F(U_i(t)) > F(X_i(t)) \\ X_i(t), & \text{otherwise} \end{cases}$$

The invention has the beneficial effects that:

1. Building on ECMP and combining it with the differential evolution algorithm, the SDN data center network elephant flow scheduling method monitors link usage continuously, computes a globally optimal path for elephant flows on congested links and then reroutes them, which effectively reduces the maximum link utilization of the network and improves its bisection bandwidth.

2. Compared with ECMP and GFF, the DE-ECMP algorithm provided by the invention increases the network bisection bandwidth by 16.74% to 32.09% and 7.95% to 23.62%, respectively, in random mode; by 12.69% and 4.89%, respectively, in Stride(i) mode; and by 20% and 6.7%, respectively, in staggered mode. The maximum link utilization of the network is reduced and network load balancing is achieved.

Drawings

FIG. 1 is a flow chart of elephant flow scheduling in the present invention.

FIG. 2 is a flowchart of the DE-SDN algorithm of the present invention.

FIG. 3 is a schematic diagram of the mutation operation of the present invention.

FIG. 4 is a schematic diagram of the crossover operation of the present invention.

FIG. 5 is the 4-ary fat-tree network topology used in the simulation experiment of the present invention.

FIG. 6 is a comparison of average bisection bandwidth in random mode in the simulation experiment of the present invention.

FIG. 7 is a comparison of average bisection bandwidth in stride mode in the simulation experiment of the present invention.

FIG. 8 is a comparison of average bisection bandwidth in staggered mode in the simulation experiment of the present invention.

FIG. 9 is a comparison of maximum link utilization in staggered mode in the simulation experiment of the present invention.

Detailed Description

In order to make those skilled in the art better understand the technical solution of the present invention, the following further describes the technical solution of the present invention with reference to the drawings and the embodiments.

The SDN data center network elephant flow scheduling method based on the differential evolution algorithm comprises the following steps. S1: establishing a data center network elephant flow scheduling model.

Specifically, S101: the data center network topology is represented as a graph $G(S, L)$, where $S$ is the set of all network nodes (switches), with node $s_i \in S$, $i = 1, 2, \ldots$; $L$ is the set of all links in the network, and each link $l \in L$ has capacity $C_l$ and link utilization $u_l$.

S102: let $L_i$ and $L'_i$ be subsets of the link set $L$ such that every link $l_i \in L_i$ and every link $l'_i \in L'_i$ has $s_i$ as an endpoint, and data flows enter or leave node $s_i$ through these links; the set of data flows causing congestion is denoted $E$, and each data flow $e \in E$ has bandwidth $b_e$, a source node and a destination node; a variable indicates whether data flow $e$ passes through link $l$.

S103: the link utilization $u_l$ is defined as the ratio of the sum of the bandwidths of all data flows $e$ traversing link $l$ to the capacity $C_l$ of link $l$; the optimization objective of the traffic scheduling problem, i.e. minimizing the maximum link utilization in the network, can be represented by equation (1);

when the optimization objective is reached, the traffic scheduling satisfies the following constraint conditions:

Equation (2) states that the sum of the bandwidths of the data flows $e$ routed over link $l$ cannot exceed the capacity of the link. Equation (3) states that if node $s_i$ is the source node of data flow $e$, the data flow only flows out of the node. Equation (4) states that if node $s_i$ is the destination node of data flow $e$, the data flow only flows into the node and does not flow out. Equation (5) states that if $s_i$ is neither the source nor the destination node of data flow $e$, the traffic flowing into the node equals the traffic flowing out of it. Equation (6) defines the value range of the variable.
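The formula images for equations (1) to (6) are not present in this text. Based only on the verbal definitions above, one plausible reconstruction is sketched below. The symbols $x_e^l$ (a 0/1 variable equal to 1 when data flow $e$ traverses link $l$), $\mathrm{src}(e)$ and $\mathrm{dst}(e)$ are assumptions introduced here, as is the reading of $L_i$ as the links entering $s_i$ and $L'_i$ as the links leaving it; the original notation may differ.

```latex
\begin{align}
\min \; \max_{l \in L} u_l, \qquad u_l = \frac{\sum_{e \in E} b_e \, x_e^l}{C_l} \tag{1}\\
\sum_{e \in E} b_e \, x_e^l \le C_l, \quad \forall l \in L \tag{2}\\
\sum_{l \in L'_i} x_e^l = 1,\; \sum_{l \in L_i} x_e^l = 0, \quad \text{if } s_i = \mathrm{src}(e) \tag{3}\\
\sum_{l \in L_i} x_e^l = 1,\; \sum_{l \in L'_i} x_e^l = 0, \quad \text{if } s_i = \mathrm{dst}(e) \tag{4}\\
\sum_{l \in L_i} x_e^l = \sum_{l \in L'_i} x_e^l, \quad \text{otherwise} \tag{5}\\
x_e^l \in \{0, 1\}, \quad \forall e \in E,\; \forall l \in L \tag{6}
\end{align}
```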

Further, data center network elephant flows are scheduled by combining the differential evolution algorithm with SDN; the specific flow is shown in Fig. 1 and comprises the following steps:

S2: continuously collecting data center network link state information through an sFlow-RT collector;

S3: scheduling data flows newly arriving at the data center network using the ECMP algorithm;

S4: if the maximum link utilization of any link traversed by an elephant flow exceeds the threshold, executing steps S5 to S7; otherwise returning to step S2;

S5: selecting the k shortest available paths that share the elephant flow's source and destination nodes as candidate paths for rerouting the elephant flow;

S6: computing the globally optimal path from the k candidate paths using the traffic scheduling method based on the differential evolution algorithm, according to the current network state collected by the sFlow-RT collector;

S7: converting the computed new path into flow table entries, sending them to the network nodes, rerouting the elephant flow, and returning to step S2 to continue scheduling.
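The congestion test of step S4 can be sketched in Python as follows. This is an illustration only: the function name `flows_to_reroute`, the dictionary layout and the sample values are assumptions, and in the described system the utilization values would come from the sFlow-RT collector rather than a literal dictionary.

```python
def flows_to_reroute(elephant_paths, link_util, threshold):
    """Return ids of elephant flows whose path crosses a link above the threshold."""
    congested = []
    for flow_id, path in elephant_paths.items():
        # A path is a list of nodes; consecutive node pairs are its links.
        utils = [link_util.get(link, 0.0) for link in zip(path, path[1:])]
        if max(utils, default=0.0) > threshold:
            congested.append(flow_id)
    return congested

# Hypothetical link utilizations keyed by (from_node, to_node).
link_util = {("e1", "a1"): 0.2, ("a1", "e2"): 0.7,
             ("e1", "a2"): 0.1, ("a2", "e2"): 0.3}
elephant_paths = {"f1": ["e1", "a1", "e2"], "f2": ["e1", "a2", "e2"]}
to_reroute = flows_to_reroute(elephant_paths, link_util, threshold=0.5)
```

Only flow `f1` crosses a link above the threshold, so only `f1` would be handed to steps S5 to S7.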

Further, the basic idea of Differential Evolution (DE) is to apply a mutation operation to a randomly generated population: the difference vector of two individuals is added to a third individual according to a certain rule to generate a mutant individual; the mutant individual is then crossed with the target individual, fitness values are calculated, and the computation is iterated; finally the globally optimal solution is selected according to the natural rule of survival of the fittest. The DE algorithm converges quickly, has few control parameters, and is simple and easy to implement, and its optimization results are more stable than those of genetic algorithms and particle swarm optimization.
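For orientation, the classic continuous form of this idea (commonly called DE/rand/1/bin) can be sketched in Python as below. This is the textbook algorithm on real-valued vectors, not the path-based variant of the invention; the parameter values, the function names and the sphere test function are assumptions chosen for illustration.

```python
import random

def differential_evolution(cost, bounds, pop_size=20, f=0.5, cr=0.9,
                           generations=200, seed=0):
    """Classic DE/rand/1/bin: minimize `cost` over box-bounded real vectors."""
    rng = random.Random(seed)
    dim = len(bounds)
    pop = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(pop_size)]
    costs = [cost(x) for x in pop]
    for _ in range(generations):
        for i in range(pop_size):
            # Mutation: add the scaled difference of two individuals to a third.
            a, b, c = rng.sample([j for j in range(pop_size) if j != i], 3)
            mutant = [pop[a][d] + f * (pop[b][d] - pop[c][d]) for d in range(dim)]
            mutant = [min(max(v, lo), hi) for v, (lo, hi) in zip(mutant, bounds)]
            # Crossover: mix mutant and target gene by gene.
            forced = rng.randrange(dim)
            trial = [mutant[d] if (rng.random() < cr or d == forced) else pop[i][d]
                     for d in range(dim)]
            # Selection: keep whichever of trial and target is better.
            trial_cost = cost(trial)
            if trial_cost <= costs[i]:
                pop[i], costs[i] = trial, trial_cost
    best = min(range(pop_size), key=costs.__getitem__)
    return pop[best], costs[best]

def sphere(x):
    """Simple test function with its minimum 0 at the origin."""
    return sum(v * v for v in x)

best_x, best_cost = differential_evolution(sphere, [(-5.0, 5.0)] * 3)
```

The method described here keeps the same mutate, cross, select cycle but redefines each operator on network paths instead of real vectors.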

The traffic scheduling method based on the differential evolution algorithm proposed in step S6 computes the globally optimal path that satisfies constraints (2) to (6) according to the network topology and the current link utilization. Its input is the current utilization of the links in the data center network, and its output is the optimal path for the rerouted elephant flow. The flowchart of the algorithm (DE-SDN) is shown in Fig. 2, and its specific operation comprises the following steps:

S601: initializing the algorithm;

S6011: setting the loop control conditions; the loop control variable is set to $t = 1$ and the maximum number of iterations to MaxT;

S6012: initializing the solution space R; the k shortest paths are generated with Yen's algorithm and form the candidate solution space R of candidate paths;

S6013: initializing the population; M individuals are selected in sequence from the solution space R to form the population Pop, where each individual $X_i$ is a complete path consisting of n links $l$, as shown in equation (7).

$$X_i = \{l_1, l_2, \ldots, l_n\}, \quad X_i \in R,\ 1 \le i \le M \tag{7}$$

S6014: defining the fitness function; to achieve the optimization objective of minimizing the maximum link utilization, the fitness function of individual $X_i$ is defined as shown in equation (8).

$$F(X_i) = \frac{1}{a \times \mathrm{HOPS}(X_i) + b \times \mathrm{MAXU}(X_i)} \tag{8}$$

Here $\mathrm{HOPS}(X_i)$ is the length (hop count) of path individual $X_i$, $\mathrm{MAXU}(X_i)$ is the maximum link utilization among the links of $X_i$, and a and b are influence factors. This definition of the fitness function converts the minimization problem into a maximization problem, and the optimal solution is obtained by the following steps.
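The initialization and the fitness function of equation (8) can be sketched in Python as follows. For brevity the sketch enumerates loop-free paths by brute force and sorts them by hop count instead of implementing Yen's algorithm, which is only practical on small graphs; the graph, the utilization values and all function names are assumptions for illustration.

```python
def simple_paths(graph, src, dst, path=None):
    """Yield every loop-free path from src to dst as a list of nodes."""
    path = (path or []) + [src]
    if src == dst:
        yield path
        return
    for nxt in graph[src]:
        if nxt not in path:
            yield from simple_paths(graph, nxt, dst, path)

def k_shortest_paths(graph, src, dst, k):
    """Brute-force stand-in for Yen's algorithm: the k paths with fewest hops."""
    return sorted(simple_paths(graph, src, dst), key=len)[:k]

def fitness(path, link_util, a=1.0, b=10.0):
    """Equation (8): F = 1 / (a * HOPS + b * MAXU)."""
    links = list(zip(path, path[1:]))
    hops = len(links)
    maxu = max(link_util[frozenset(link)] for link in links)
    return 1.0 / (a * hops + b * maxu)

# Hypothetical 5-switch topology with undirected links.
graph = {"s1": ["s2", "s4"], "s2": ["s1", "s3"], "s3": ["s2", "s5"],
         "s4": ["s1", "s5"], "s5": ["s4", "s3"]}
link_util = {frozenset(("s1", "s2")): 0.8, frozenset(("s2", "s3")): 0.2,
             frozenset(("s1", "s4")): 0.1, frozenset(("s4", "s5")): 0.1,
             frozenset(("s5", "s3")): 0.1}

R = k_shortest_paths(graph, "s1", "s3", k=2)   # candidate solution space
M = 2
Pop = R[:M]                                    # population taken in order from R
scores = [fitness(x, link_util) for x in Pop]
```

With a = 1 and b = 10, the 2-hop path through the heavily used link scores 1/(2 + 8) = 0.1, while the 3-hop lightly loaded path scores 1/(3 + 1) = 0.25, so the longer path is the fitter individual.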

S602: performing the mutation operation;

Specifically, in the t-th iteration, each link of individual $X_i(t)$ whose link utilization exceeds the threshold H is replaced by one or more adjacent non-congested links, forming the mutant individual $V_i(t)$.

The replacement links all come from the k candidate paths in the candidate solution space R, the maximum link utilization over all replacement links is below the threshold H, and the replacement chosen is the one with the shortest length in the solution space R.

Fig. 3 gives an example of a mutation operation.

In Fig. 3, $s_i$ denotes a switch node and $l_j$ is a link connecting two switches.

Assume that on the path shown in Fig. 3(a) the utilization of link $l_3$ is greater than the threshold, so the link is replaced. Using Yen's algorithm with the two endpoints $s_2$ and $s_3$ of link $l_3$ as the source node and the destination node, the k shortest paths are generated and stored in the mutation candidate solution space R'. A candidate path $l^*$ satisfying the following conditions is selected from the solution space R', as shown in Fig. 3(b):

a. the maximum link utilization over all links of the candidate path $l^*$ is less than the threshold H;

b. the candidate path $l^*$ has the shortest length in the solution space R'.

Then the congested link $l_3$ is replaced by the candidate path $l^*$, generating the mutant individual $V_i(t)$ shown in Fig. 3(c).
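The mutation step can be sketched in Python as below. Instead of generating k shortest paths with Yen's algorithm and filtering them, this sketch uses a breadth-first search restricted to links below the threshold, which directly returns a fewest-hop detour satisfying conditions a and b. Excluding nodes already on the path (to keep the mutant loop-free), the topology and all names are assumptions for illustration.

```python
from collections import deque

def shortest_detour(graph, link_util, src, dst, threshold, banned):
    """BFS for the fewest-hop path src -> dst using only links below the threshold."""
    queue, seen = deque([[src]]), {src}
    while queue:
        path = queue.popleft()
        node = path[-1]
        if node == dst:
            return path
        for nxt in graph[node]:
            if nxt in seen or nxt in banned:
                continue
            if link_util[frozenset((node, nxt))] >= threshold:
                continue
            seen.add(nxt)
            queue.append(path + [nxt])
    return None

def mutate(path, graph, link_util, threshold):
    """Replace each congested link of `path` by a non-congested detour, if one exists."""
    mutant = list(path)
    i = 0
    while i < len(mutant) - 1:
        u, v = mutant[i], mutant[i + 1]
        if link_util[frozenset((u, v))] > threshold:
            banned = set(mutant) - {u, v}      # keep the mutant loop-free
            detour = shortest_detour(graph, link_util, u, v, threshold, banned)
            if detour is not None:
                mutant[i:i + 2] = detour
                i += len(detour) - 2           # detour links are already below H
        i += 1
    return mutant

# Hypothetical topology: link s2-s3 is congested, s2-s5-s3 is a free detour.
graph = {"s1": ["s2"], "s2": ["s1", "s3", "s5"], "s3": ["s2", "s4", "s5"],
         "s4": ["s3"], "s5": ["s2", "s3"]}
link_util = {frozenset(("s1", "s2")): 0.1, frozenset(("s2", "s3")): 0.9,
             frozenset(("s3", "s4")): 0.1, frozenset(("s2", "s5")): 0.2,
             frozenset(("s5", "s3")): 0.2}
V = mutate(["s1", "s2", "s3", "s4"], graph, link_util, threshold=0.5)
```

If no detour below the threshold exists, the sketch leaves the congested link in place; how the original method handles that case is not stated in the text.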

S603: performing the crossover operation;

Specifically, in the t-th iteration, a random number r is drawn from the interval $[0, 1]$; if r is greater than the crossover factor cr, the mutant individual $V_i(t)$ is kept as the crossover individual $U_i(t)$ and the method proceeds directly to the selection operation of step S604; if r is less than or equal to the crossover factor cr, the following steps perform a crossover operation on the original individual $X_i(t)$ and the mutant individual $V_i(t)$.

The individuals $X_i(t)$ and $V_i(t)$ are traversed in order, and when the first common node appears, all links after that node are exchanged to obtain the crossover individual $U_i(t)$; $U_i(t)$ is checked for loops, and if a loop exists, the path with the loop removed is kept as the crossover individual $U_i(t)$.

Because the mutant individual $V_i(t)$ is obtained by mutating $X_i(t)$, the crossover individual $U_i(t)$ may be identical to the original individual $X_i(t)$, which makes the crossover invalid. If an invalid crossover occurs, a path individual $W(t)$ with the same source and destination as $V_i(t)$ is selected at random from the population Pop, and the crossover operation of step S6032 is performed on $W(t)$ and the mutant individual $V_i(t)$ to obtain the crossover individual $U_i(t)$, which increases the probability that the result of the mutation operation is kept in the next-generation population.

A schematic diagram of the crossover operation is shown in Fig. 4. Suppose the crossover operation is performed on the individual $W(t)$ (Fig. 4(a)) and the mutant individual $V_i(t)$ (Fig. 3(c)). The two paths share a first common node; exchanging all links of $W(t)$ and $V_i(t)$ from that common node to the destination node yields the crossover individual $U_i(t)$ shown in Fig. 4(b).
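The crossover step can be sketched in Python as follows, with paths held as lists of nodes. Two points are assumptions of this sketch, because the text does not fix them: the "first common node" is taken to be the first shared intermediate node (the source is always shared), and of the two paths produced by the exchange, the one that keeps the head of $V_i(t)$ is returned. All function names are illustrative.

```python
import random

def remove_loops(path):
    """Cut out any cycle so that every node appears at most once."""
    out = []
    for node in path:
        if node in out:
            out = out[:out.index(node) + 1]
        else:
            out.append(node)
    return out

def exchange_tail(x, v):
    """Head of v up to the first shared intermediate node, then the tail of x."""
    for i in range(1, len(v) - 1):
        if v[i] in x[1:-1]:
            j = x.index(v[i])
            return remove_loops(v[:i] + x[j:])
    return list(v)   # no shared intermediate node: keep the mutant unchanged

def crossover(x, v, population, cr, rng=random):
    """Steps S6031 to S6033 for one original individual x and its mutant v."""
    if rng.random() > cr:
        return list(v)                       # S6031: skip crossover
    u = exchange_tail(x, v)                  # S6032
    if u == list(x):                         # S6033: invalid crossover
        same_ends = [w for w in population if w[0] == v[0] and w[-1] == v[-1]]
        if same_ends:
            u = exchange_tail(rng.choice(same_ends), v)
    return u

x = ["s1", "s2", "s3", "s4", "s5"]
v = ["s1", "s6", "s3", "s7", "s5"]
u = exchange_tail(x, v)
```

The invalid case arises naturally: when the mutation changed a link that lies after the first shared node, the head of $V_i(t)$ equals the head of $X_i(t)$ and the exchange simply reproduces $X_i(t)$.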

S604: performing the selection operation;

Specifically, the fitness values of the crossover individual $U_i(t)$ and the original individual $X_i(t)$ are calculated with the fitness function of S6014.

If the fitness value of the crossover individual $U_i(t)$ is better than that of the original individual $X_i(t)$, the crossover individual $U_i(t)$ of the t-th iteration replaces the original individual $X_i(t)$ and is kept in the next-generation population; otherwise the individual $X_i(t)$ is kept, i.e.

$$X_i(t+1) = \begin{cases} U_i(t), & F(U_i(t)) > F(X_i(t)) \\ X_i(t), & \text{otherwise} \end{cases}$$

S605: determining whether the iteration termination condition is met; if so, exiting the loop; otherwise, repeating steps S602 to S604.
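Selection and the outer loop of steps S602 to S605 can be sketched in Python as below. The mutation, crossover and fitness operators are passed in as functions so that the block stays self-contained; running exactly MaxT generations is an assumed termination condition, and the toy operators in the usage lines are placeholders, not the operators of the method.

```python
def select(x, u, fitness):
    """Step S604: keep the crossover individual only if it is strictly fitter."""
    return u if fitness(u) > fitness(x) else x

def evolve(population, mutate, crossover, fitness, max_t):
    """Run mutation, crossover and selection for max_t generations."""
    pop = [list(p) for p in population]
    for _ in range(max_t):
        pop = [select(x, crossover(x, mutate(x)), fitness) for x in pop]
    return max(pop, key=fitness)   # fittest path of the final population

# Toy placeholders: fitness prefers shorter paths, mutation drops one
# intermediate node, crossover simply returns the mutant.
toy_fitness = lambda p: 1.0 / len(p)
toy_mutate = lambda p: p[:1] + p[2:] if len(p) > 2 else p
toy_crossover = lambda x, v: v

best = evolve([[1, 2, 3, 4], [1, 5, 6, 7, 4]], toy_mutate, toy_crossover,
              toy_fitness, max_t=5)
```

Because selection is greedy per individual, the fitness of each population slot never decreases from one generation to the next.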

Simulation experiment:

To verify the effectiveness of the DE-SDN algorithm provided by the invention, the simulation experiment uses Floodlight as the SDN controller and Mininet as the simulation platform to build a 4-ary fat-tree data center network, and uses sFlow to monitor the network state. DE-SDN is compared with ECMP and GFF, with average bisection bandwidth and maximum link utilization as the performance metrics. Bisection bandwidth is the total bandwidth of the data traffic that passes, per unit time, over all links between the two subnets when the network is divided into two equal subnets. Maximum link utilization is the highest link utilization among all links of the network while the network is transmitting. Under the same load, a larger bisection bandwidth means a larger network throughput, and a lower maximum link utilization means that link utilization is distributed more evenly, so that no single link is overused; that is, the network load balancing performance is better.

The experimental environment is as follows:

(1) network topology

A 4-ary fat-tree network topology was built on the Mininet platform using Python, as shown in Fig. 5. All nodes are OpenFlow switches, 20 in total. The access-layer switches connect 16 hosts in total, and the bandwidth of each link is set to 100 Mb/s.
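The node counts above follow from the standard k-ary fat-tree construction with k = 4, as the following short Python check shows. The function name is illustrative; the formulas are those of the standard three-layer fat-tree.

```python
def fat_tree_size(k):
    """Switch and host counts of a standard three-layer k-ary fat-tree (k even)."""
    core = (k // 2) ** 2          # core switches
    aggregation = k * (k // 2)    # k pods, k/2 aggregation switches each
    edge = k * (k // 2)           # k pods, k/2 access (edge) switches each
    hosts = edge * (k // 2)       # k/2 hosts per access switch
    return {"core": core, "aggregation": aggregation, "edge": edge,
            "switches": core + aggregation + edge, "hosts": hosts}

size = fat_tree_size(4)
# size == {"core": 4, "aggregation": 8, "edge": 8, "switches": 20, "hosts": 16}
```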

(2) Communication mode

The elephant flow is defined as a data flow that is 10% larger than the link bandwidth, so data flows with a bandwidth of 10Mb/s and more are identified as elephant flows in this experiment. The traffic generation tool Iperf is used, secondary development is performed, and the internal command of Mininet is extended to generate data streams of three different communication modes. The three different communication modes are as follows:

a) random pattern Random: randomly selecting a source host and a target host in a network, and randomly generating a flow mode and the flow size at the same time;

b) spacing pattern stride (i): the host with the number x transmits data to the host with the number (x + i) mod n, wherein n is the number of the hosts in the network;

c) staggered pattern Staggered (p1, p 2): each host sends data to hosts belonging to the same access layer switch with a probability p1, to hosts belonging to the same pod with a probability p2, and to hosts within other pods with probabilities 1-p1-p 2.
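The destination choice for the Stride(i) and Staggered(p1, p2) patterns can be sketched in Python as follows. The sketch assumes that hosts are numbered consecutively by access switch and by pod, and that "same pod" means a host under a different access switch of the same pod; neither assumption is stated in the text, and the function names are illustrative.

```python
import random

def stride(x, i, n):
    """Stride(i): host x sends to host (x + i) mod n."""
    return (x + i) % n

def staggered(x, p1, p2, hosts_per_edge, hosts_per_pod, n, rng=random):
    """Staggered(p1, p2): pick a destination host for source host x."""
    edge_start = x - x % hosts_per_edge
    pod_start = x - x % hosts_per_pod
    same_edge = [h for h in range(edge_start, edge_start + hosts_per_edge) if h != x]
    same_pod = [h for h in range(pod_start, pod_start + hosts_per_pod)
                if h != x and h not in same_edge]
    other_pods = [h for h in range(n)
                  if not pod_start <= h < pod_start + hosts_per_pod]
    r = rng.random()
    if r < p1:
        return rng.choice(same_edge)
    if r < p1 + p2:
        return rng.choice(same_pod)
    return rng.choice(other_pods)

# 4-ary fat-tree: 16 hosts, 2 per access switch, 4 per pod.
dst_stride = stride(15, 1, 16)                      # wraps around to host 0
dst_local = staggered(5, 1.0, 0.0, 2, 4, 16)        # always under the same switch
dst_remote = staggered(5, 0.0, 0.0, 2, 4, 16)       # always in another pod
```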

In line with the traffic characteristics of data center networks, the flow sizes in the experiment follow an exponential distribution with parameter r = 0.23. The time interval between generated flows follows a Poisson distribution, each flow lasts 60 seconds, and the data from the 20th to the 40th second is taken as the valid experimental data.

(3) Algorithm parameter setting

According to the optimization target of traffic scheduling, the main parameter settings of the DE-SDN algorithm in the simulation experiment are shown in table 1.

Table 1. Main parameter values of the DE-SDN algorithm

Parameter | Set value | Suggested value
Maximum number of iterations MaxT | 50 | [50, 100]
Population size M | 50 | [50, 100]
Influence factor a | 1 | [1, 10]
Influence factor b | 10 | [1, 10]
Threshold H | 0.5 | 0.5
Crossover factor cr | 0.1 | [0, 1]

Since the optimization goal is to minimize the maximum link utilization, the influence factors of the fitness function are set to a = 1 and b = 10. A large crossover factor cr generally accelerates convergence but tends to produce a locally optimal solution, so cr is set to 0.1.
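To illustrate how cr controls the search, the sketch below shows the binomial crossover used in textbook differential evolution. It is an assumption that DE-SDN uses this operator, since this section does not specify it; the function name and the toy vectors are hypothetical. Each gene of the trial vector is taken from the mutant with probability cr, so cr = 0.1 changes only a few genes per generation and keeps the population diverse.

```python
import random


def binomial_crossover(target, mutant, cr, rng):
    """Textbook DE binomial crossover (assumed, not confirmed by the source).

    Each position takes the mutant gene with probability cr; one randomly
    chosen position always takes the mutant gene so the trial differs
    from the target.
    """
    if len(target) != len(mutant):
        raise ValueError("target and mutant must have the same length")
    j_rand = rng.randrange(len(target))
    return [
        m if (rng.random() < cr or j == j_rand) else t
        for j, (t, m) in enumerate(zip(target, mutant))
    ]


rng = random.Random(42)
target = [0] * 10   # toy individual
mutant = [1] * 10   # toy mutant vector
trial = binomial_crossover(target, mutant, cr=0.1, rng=rng)
print(sum(trial), "of", len(trial), "genes taken from the mutant")
```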

Average bisection bandwidth comparison results:

Data center networks carry complex flow types and huge traffic volumes. The simulation experiment therefore uses the three communication patterns to compare the performance of the three traffic scheduling algorithms ECMP, GFF and DE-SDN; for each algorithm under each communication pattern, 20 groups of experiments were run and averaged to obtain the final result. The comparisons of average bisection bandwidth are shown in Figs. 6-8, where the horizontal axis is the communication pattern and the vertical axis is the average bisection bandwidth in Mbps.

In the random pattern, two sets of experiments, random1 and random2, were performed; the comparison of average bisection bandwidth is shown in Fig. 6. As can be seen from Fig. 6, the average bisection bandwidth of the GFF algorithm is higher than that of the ECMP algorithm, and that of the DE-SDN algorithm is higher than both. Compared with the ECMP algorithm, the average bisection bandwidth of the DE-SDN algorithm is 16.74% to 32.09% higher; compared with the GFF algorithm, it is 7.95% to 23.62% higher.
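The percentages above read as relative gains over a baseline algorithm. The sketch below assumes that interpretation, namely (improved - baseline) / baseline; the source does not state the formula, and the sample bandwidth values are hypothetical rather than measured results.

```python
def relative_gain(baseline, improved):
    """Relative improvement of `improved` over `baseline`, in percent."""
    if baseline <= 0:
        raise ValueError("baseline must be positive")
    return (improved - baseline) / baseline * 100.0


# Hypothetical average bisection bandwidths in Mbps, for illustration only.
ecmp_bw = 50.0
de_sdn_bw = 60.0
print(f"{relative_gain(ecmp_bw, de_sdn_bw):.2f}%")
```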

To simulate unbalanced network load and approximate the traffic complexity of a real network, three groups of comparison experiments, Stride(4), Stride(6) and Stride(8), were performed in the Stride(i) pattern with i = 4, 6 and 8. The comparison of average bisection bandwidth in the Stride pattern is shown in Fig. 7. As can be seen from Fig. 7, the average bisection bandwidth of the DE-SDN algorithm is higher than that of the ECMP and GFF algorithms in all three stride settings; in Stride(6), it is 12.69% higher than that of the ECMP algorithm and 4.89% higher than that of the GFF algorithm.

In the Staggered pattern, two settings were used, Staggered(0, 0.2) and Staggered(0, 0.4). Taking Staggered(0, 0.2) as an example, each host sends data flows to hosts in the same pod with probability 0.2 and to hosts in the other pods with probability 0.8. The comparison of average bisection bandwidth in the Staggered pattern is shown in Fig. 8. As can be seen from Fig. 8, in both settings there is no obvious difference between the average bisection bandwidths of the ECMP and GFF algorithms, but that of the DE-SDN algorithm is higher than both. In Staggered(0, 0.2), the average bisection bandwidth of the DE-SDN algorithm is 20% higher than that of the ECMP algorithm; in Staggered(0, 0.4), it is 6.7% higher.

In conclusion, in all three communication patterns the average bisection bandwidth of the DE-SDN algorithm is higher than that of the ECMP and GFF algorithms. ECMP is a static hash-based scheduler: it distributes data flows evenly over the equal-cost paths according to hash values, does not consider the real-time state of the network, and cannot properly schedule elephant flows, which leads to more flow collisions and link congestion. Compared with ECMP, the GFF algorithm uses the current network state and checks the candidate paths in order, rerouting an elephant flow on a congested link to the first path that satisfies the conditions, which improves the average bisection bandwidth. However, in the Staggered pattern, as the number of data flows between pods increases, the number of conflicting flows also increases. GFF still selects the first path that satisfies the conditions, so it cannot determine whether that path is globally optimal; network congestion may occur again, and the load imbalance is not effectively relieved. The DE-SDN algorithm provided by the invention first schedules flows with ECMP. When real-time monitoring detects link congestion caused by elephant flows, it selects a globally optimal path based on the current link state and reschedules the elephant flows, so its average bisection bandwidth is better than that of the other two algorithms.
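The difference between first-fit selection and a global choice can be shown on a toy example. This is a simplified sketch under stated assumptions: the exhaustive minimum over candidate paths stands in for the differential evolution search, the score is the bottleneck utilization of the path rather than the patent's fitness function, and the link names, utilizations and demand are hypothetical.

```python
def first_fit(paths, util, demand, capacity=100.0):
    """GFF-style choice: the first path whose links can all carry the flow."""
    for path in paths:
        if all(util[link] * capacity + demand <= capacity for link in path):
            return path
    return None


def min_bottleneck(paths, util, demand, capacity=100.0):
    """Stand-in for the global search: the path whose most loaded link,
    after adding the flow, has the lowest utilization."""
    def bottleneck(path):
        return max(util[link] + demand / capacity for link in path)
    return min(paths, key=bottleneck)


# Hypothetical link utilizations (fraction of 100 Mb/s) and candidate paths.
util = {"a": 0.5, "b": 0.6, "c": 0.2, "d": 0.3}
paths = [("a", "b"), ("c", "d")]
demand = 20.0  # Mb/s elephant flow

print(first_fit(paths, util, demand))       # accepts the first feasible path
print(min_bottleneck(paths, util, demand))  # prefers the less loaded path
```

Here the first-fit rule accepts path ("a", "b") even though it would push link b to 80% utilization, while the global choice places the flow on ("c", "d"), whose bottleneck stays at 50%.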

Maximum link utilization comparison results:

The maximum link utilization is also an important metric for evaluating whether a traffic scheduling algorithm achieves good network load balancing, since it reflects how the network links are being used.

To simulate unbalanced load in a real network, the Staggered(0, 0.2) communication pattern was used to compare the maximum link utilization of ECMP, GFF and DE-SDN. For each algorithm, 20 experiments were performed, each with a data transmission duration of 60 seconds. The cumulative distribution function (CDF) of all maximum link utilization values was then computed with an interval width of 0.1, giving the line graph shown in Fig. 9. In the figure, the horizontal axis is the maximum link utilization, the vertical axis is the CDF value, and the three curves are the CDF curves of the ECMP, GFF and DE-SDN algorithms. The higher a CDF curve lies, the better the algorithm reduces the maximum link utilization. As can be seen from Fig. 9, the maximum link utilization of the ECMP algorithm is mostly concentrated between 70% and 85% and that of the GFF algorithm between 60% and 80%, while that of the DE-SDN algorithm is mostly concentrated between 60% and 75%, a reduction of about 10% compared with ECMP and GFF.
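A CDF curve of this kind can be computed from the collected maximum link utilization samples. The sketch below evaluates an empirical CDF at steps of 0.1; the sample values are hypothetical, and the interpretation of the 0.1 interval as the step between evaluation points is an assumption.

```python
def empirical_cdf(samples, step=0.1):
    """Return (x, F(x)) pairs for x = step, 2*step, ..., 1.0, where F(x)
    is the fraction of samples less than or equal to x."""
    if not samples:
        raise ValueError("samples must not be empty")
    n = len(samples)
    n_points = int(round(1.0 / step))
    points = []
    for i in range(1, n_points + 1):
        x = round(i * step, 10)  # avoid accumulated floating-point drift
        points.append((x, sum(s <= x for s in samples) / n))
    return points


# Hypothetical maximum link utilization samples, for illustration only.
samples = [0.65, 0.72, 0.81, 0.95]
for x, f in empirical_cdf(samples):
    print(f"{x:.1f} {f:.2f}")
```

A curve that rises earlier, that is, lies higher at a given utilization, means a larger share of experiments stayed below that utilization.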

The foregoing shows and describes the general principles, essential features, and advantages of the invention. Those skilled in the art will understand that the invention is not limited to the embodiments described above; the embodiments and the description only illustrate the principle of the invention. Various changes and modifications may be made without departing from the spirit and scope of the invention, and such changes and modifications fall within the scope of the invention as claimed. The scope of the invention is defined by the appended claims and their equivalents.
