Load balancing method and device, storage medium and computing equipment

Document No.: 190096  Publication date: 2021-11-02

Note: This application, Load balancing method and device, storage medium and computing equipment, was created by 刘迎冬, 张晓龙, 陈谔, 陈洁 and 刘秀颖 on 2021-07-08. Abstract: An embodiment of the present disclosure provides a load balancing method, including: receiving a capacity expansion instruction for the first type of service, wherein the capacity expansion instruction includes identification information of the second scheduling group; in response to the capacity expansion instruction, allocating several processors to the first type of service from the second scheduling group corresponding to the identification information, calling the interface to modify the binding relationship between those processors and services so as to bind them to the first type of service, and creating a capacity expansion scheduling group based on the processors bound to the first type of service; and performing load balancing processing on the first scheduling group and the capacity expansion scheduling group under a load balancing policy, so as to schedule at least part of the processing tasks of the first type of service borne by the first scheduling group to the capacity expansion scheduling group.

1. A load balancing method, applied to a computing device comprising a plurality of processors, wherein the processors all adopt a three-level cache architecture; the plurality of processors are partitioned into at least a first scheduling group and a second scheduling group; at least some processors in the first scheduling group are bound to a first type of service; at least some processors in the second scheduling group are bound to a second type of service; the computing device exposes an interface for modifying the binding relationship between processors and services; the processors in the first scheduling group share a third-level cache, and the processors in the second scheduling group share a third-level cache; no third-level cache is shared between the processors in the first scheduling group and the processors in the second scheduling group; the method comprising:

receiving a capacity expansion instruction for the first type of service; wherein the capacity expansion instruction includes identification information of the second scheduling group;

responding to the capacity expansion instruction, allocating several processors to the first type of service from the second scheduling group corresponding to the identification information, calling the interface to modify the binding relationship between the several processors and services so as to bind the several processors to the first type of service, and creating a capacity expansion scheduling group based on the processors bound to the first type of service;

and performing load balancing processing on the first scheduling group and the capacity expansion scheduling group under a load balancing policy, so as to schedule at least part of the processing tasks of the first type of service borne by the first scheduling group to the capacity expansion scheduling group.

2. The method of claim 1, wherein calling the interface, modifying the binding relationship between the processors and services, and binding the processors to the first type of service comprises:

calling the interface and performing a dynamic hot modification on the second scheduling group, binding the several processors to the first type of service.

3. The method of claim 2, wherein the second scheduling group includes a reserved set of processors that are not bound to the second type of service and are used for undertaking processing tasks of the first type of service;

allocating several processors to the first type of service from the second scheduling group comprises:

allocating processors to the first type of service from the set of processors reserved in the second scheduling group.

4. The method of claim 1, wherein the first scheduling group and the second scheduling group belong to the same scheduling domain; topology data describing the topology structure of the scheduling domain is maintained in the kernel of the operating system loaded on the computing device; the topology data comprises description information describing the binding relationship between processors and services in each scheduling group of the scheduling domain;

the interface exposed by the computing device comprises a user-mode interface for modifying the description information maintained in the kernel of the operating system;

calling the interface, modifying the binding relationship between the processors and services, binding the processors to the first type of service, and creating a capacity expansion scheduling group based on the processors bound to the first type of service comprises:

calling the user-mode interface, modifying the description information of the second scheduling group, creating in the description information a binding relationship between the several processors and the first type of service, and creating a capacity expansion scheduling group based on the processors bound to the first type of service.

5. The method of claim 2, wherein the plurality of processors carried by the computing device adopt a NUMA architecture; the first scheduling group and the second scheduling group belong to the same scheduling domain formed by all processors under the NUMA architecture; the first scheduling group comprises a first NUMA node bound to the first type of service under the NUMA architecture; the second scheduling group comprises a second NUMA node bound to the second type of service under the NUMA architecture; wherein the processors within the first NUMA node share a third-level cache, and the processors within the second NUMA node share a third-level cache.

6. The method of claim 1, wherein the load balancing policy comprises:

preferentially scheduling services among scheduling groups bound to the same type of service.

7. The method of claim 6, wherein performing load balancing processing on the first scheduling group and the capacity expansion scheduling group comprises:

when any target processor in the capacity expansion scheduling group meets a load balancing processing condition, determining which of the first scheduling group and the capacity expansion scheduling group carries the larger service load;

if the service load of the first scheduling group is larger, further determining the processor with the highest service load in the first scheduling group; and

scheduling at least part of the processing tasks of the first type of service borne by the processor with the highest service load to the target processor.

8. A storage medium having stored thereon computer instructions which, when executed by a processor, carry out the steps of the method according to any one of claims 1 to 7.

9. A service scheduling device, applied to a computing device comprising a plurality of processors, wherein the processors all adopt a three-level cache architecture; the plurality of processors are partitioned into at least a first scheduling group and a second scheduling group; at least some processors in the first scheduling group are bound to a first type of service; at least some processors in the second scheduling group are bound to a second type of service; the computing device exposes an interface for modifying the binding relationship between processors and services; the processors in the first scheduling group share a third-level cache, and the processors in the second scheduling group share a third-level cache; no third-level cache is shared between the processors in the first scheduling group and the processors in the second scheduling group; the device comprises:

a receiving module, configured to receive a capacity expansion instruction for the first type of service, wherein the capacity expansion instruction includes identification information of the second scheduling group;

a creating module, configured to respond to the capacity expansion instruction, allocate several processors to the first type of service from the second scheduling group corresponding to the identification information, call the interface to modify the binding relationship between the several processors and services so as to bind them to the first type of service, and create a capacity expansion scheduling group based on the processors bound to the first type of service; and

a scheduling module, configured to perform load balancing processing on the first scheduling group and the capacity expansion scheduling group under a load balancing policy, so as to schedule at least part of the processing tasks of the first type of service borne by the first scheduling group to the capacity expansion scheduling group.

10. A computing device, comprising:

a processor;

a memory for storing processor-executable instructions;

wherein the processor implements the method of any one of claims 1-7 by executing the executable instructions.

Technical Field

Embodiments of the present disclosure relate to the field of computer technologies, and in particular, to a load balancing method and apparatus, a storage medium, and a computing device.

Background

This section is intended to provide a background or context to the embodiments of the disclosure recited in the claims. The description herein is not admitted to be prior art by inclusion in this section.

Hybrid deployment means deploying the processing tasks of at least two different types of services on the same computing device or the same computing device cluster. For example, in practical applications, online services and offline services may be deployed on the same server or server cluster.

With the development of processor technology, a computing device that handles processing tasks may carry many CPU cores. To let processing tasks be scheduled evenly among its onboard CPUs, the CPUs are generally divided into several scheduling domains (sched domains) in the operating system kernel of the computing device.

A scheduling domain is an abstraction in the operating system kernel for a group of CPUs that share attributes and scheduling policies. Each scheduling domain may in turn contain one or more scheduling groups (sched groups), each of which is treated by the scheduling domain as an independent scheduling unit. The computing device may schedule its processing tasks among the scheduling groups based on a load balancing policy, so that the tasks borne by the scheduling groups reach load balance.
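For concreteness, on Linux kernels built with CONFIG_SCHED_DEBUG the per-CPU scheduling-domain hierarchy can be inspected from user space; the sketch below is illustrative only and assumes debugfs is mounted at /sys/kernel/debug (older kernels exposed the same data under /proc/sys/kernel/sched_domain):

```python
import glob
import os

# Per-CPU scheduling-domain hierarchy exposed by debugfs on kernels built
# with CONFIG_SCHED_DEBUG; reading debugfs typically requires root.
BASE = "/sys/kernel/debug/sched/domains"

def dump_domains(cpu: int = 0) -> None:
    for domain in sorted(glob.glob(f"{BASE}/cpu{cpu}/domain*")):
        name_file = os.path.join(domain, "name")
        if os.path.exists(name_file):
            with open(name_file) as f:
                # e.g. domain0 -> SMT, domain1 -> MC, domain2 -> NUMA
                print(os.path.basename(domain), f.read().strip())

dump_domains(0)
```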

In practical applications, when a computing device adopts hybrid deployment, the different types of services it bears generally need to be isolated from one another so that they do not interfere with each other.

At present, such service isolation is generally achieved by binding a specific service type to each scheduling group on the computing device.

For example, take the case where an online service and an offline service are mixedly deployed on the same server: assuming the server carries multiple CPU cores that are abstracted into a first scheduling group and a second scheduling group, the online service may be bound to the first scheduling group and the offline service to the second scheduling group, thereby isolating the two services from each other.

However, although binding a specific service type to each scheduling group achieves service isolation, in practice those bindings can constrain the load balancing scheduling of processing tasks among the scheduling groups, with the result that the processing tasks borne by the scheduling groups cannot reach load balance.

Disclosure of Invention

In a first aspect of the embodiments of the present disclosure, a load balancing method is provided, applied to a computing device comprising a plurality of processors, wherein the processors all adopt a three-level cache architecture; the plurality of processors are partitioned into at least a first scheduling group and a second scheduling group; at least some processors in the first scheduling group are bound to a first type of service; at least some processors in the second scheduling group are bound to a second type of service; the computing device exposes an interface for modifying the binding relationship between processors and services; the processors in the first scheduling group share a third-level cache, and the processors in the second scheduling group share a third-level cache; no third-level cache is shared between the processors in the first scheduling group and the processors in the second scheduling group; the method comprises:

receiving a capacity expansion instruction for the first type of service; wherein the capacity expansion instruction includes identification information of the second scheduling group;

responding to the capacity expansion instruction, allocating several processors to the first type of service from the second scheduling group corresponding to the identification information, calling the interface to modify the binding relationship between the several processors and services so as to bind them to the first type of service, and creating a capacity expansion scheduling group based on the processors bound to the first type of service;

and performing load balancing processing on the first scheduling group and the capacity expansion scheduling group under a load balancing policy, so as to schedule at least part of the processing tasks of the first type of service borne by the first scheduling group to the capacity expansion scheduling group.

In an embodiment of the present disclosure, calling the interface, modifying the binding relationship between the processors and services, and binding the processors to the first type of service includes:

calling the interface and performing a dynamic hot modification on the second scheduling group, binding the several processors to the first type of service.

In an embodiment of the present disclosure, the second scheduling group includes a reserved set of processors that are not bound to the second type of service and are configured to undertake processing tasks of the first type of service;

allocating several processors to the first type of service from the second scheduling group includes:

allocating processors to the first type of service from the set of processors reserved in the second scheduling group.

In an embodiment of the present disclosure, the capacity expansion instruction includes identification information of the several processors allocated from the second scheduling group to the first type of service;

allocating several processors to the first type of service from the second scheduling group includes:

allocating, to the first type of service, the processors in the second scheduling group that correspond to the identification information included in the capacity expansion instruction.

In one embodiment of the present disclosure, the first scheduling group and the second scheduling group belong to the same scheduling domain; topology data describing the topology structure of the scheduling domain is maintained in the kernel of the operating system loaded on the computing device; the topology data comprises description information describing the binding relationship between processors and services in each scheduling group of the scheduling domain;

the interface exposed by the computing device comprises a user-mode interface for modifying the description information maintained in the kernel of the operating system;

calling the interface, modifying the binding relationship between the processors and services, binding the processors to the first type of service, and creating a capacity expansion scheduling group based on the processors bound to the first type of service includes:

calling the user-mode interface, modifying the description information of the second scheduling group, creating in the description information a binding relationship between the several processors and the first type of service, and creating a capacity expansion scheduling group based on the processors bound to the first type of service.

In one embodiment of the present disclosure, the plurality of processors carried by the computing device adopt a NUMA architecture; the first scheduling group and the second scheduling group belong to the same scheduling domain formed by all processors under the NUMA architecture; the first scheduling group comprises a first NUMA node bound to the first type of service under the NUMA architecture; the second scheduling group comprises a second NUMA node bound to the second type of service under the NUMA architecture; wherein the processors within the first NUMA node share a third-level cache, and the processors within the second NUMA node share a third-level cache.

In an embodiment of the present disclosure, creating a binding relationship between the processors and the first type of service includes:

creating a binding relationship between the processors and the service processing process corresponding to the first type of service.

In one embodiment of the present disclosure, the binding relationship includes an affinity relationship between the processors and the service processing process.

In one embodiment of the present disclosure, the first type of service comprises an online service and the second type of service comprises an offline service; alternatively,

the first type of service comprises an offline service and the second type of service comprises an online service.

In one embodiment of the present disclosure, the load balancing policy includes:

preferentially scheduling services among scheduling groups bound to the same type of service.

In an embodiment of the present disclosure, performing load balancing processing on the first scheduling group and the capacity expansion scheduling group includes:

when any target processor in the capacity expansion scheduling group meets a load balancing processing condition, determining which of the first scheduling group and the capacity expansion scheduling group carries the larger service load;

if the service load of the first scheduling group is larger, further determining the processor with the highest service load in the first scheduling group; and

scheduling at least part of the processing tasks of the first type of service borne by the processor with the highest service load to the target processor.

In one embodiment of the present disclosure, the load balancing processing condition includes any one shown below:

the target processor meets the condition of periodically carrying out load balancing processing;

the number of processing tasks carried by the target processor is below a threshold.

In a second aspect of the embodiments of the present disclosure, a storage medium is provided for use in a computing device comprising a plurality of processors, wherein the processors all adopt a three-level cache architecture; the plurality of processors are partitioned into at least a first scheduling group and a second scheduling group; at least some processors in the first scheduling group are bound to a first type of service; at least some processors in the second scheduling group are bound to a second type of service; the computing device exposes an interface for modifying the binding relationship between processors and services; the processors in the first scheduling group share a third-level cache, and the processors in the second scheduling group share a third-level cache; no third-level cache is shared between the processors in the first scheduling group and the processors in the second scheduling group; the storage medium stores computer instructions which, when executed by a processor, implement the steps of the following method:

receiving a capacity expansion instruction triggered when the first type of service meets a capacity expansion condition; wherein the capacity expansion instruction includes identification information of the second scheduling group;

responding to the capacity expansion instruction, allocating several processors to the first type of service from the second scheduling group corresponding to the identification information, calling the interface to modify the binding relationship between the several processors and services so as to bind them to the first type of service, and creating a capacity expansion scheduling group based on the processors bound to the first type of service;

and performing load balancing processing on the first scheduling group and the capacity expansion scheduling group under a load balancing policy, so as to schedule at least part of the processing tasks of the first type of service borne by the first scheduling group to the capacity expansion scheduling group.

In a third aspect of the embodiments of the present disclosure, an apparatus is provided, applied to a computing device comprising a plurality of processors, wherein the processors all adopt a three-level cache architecture; the plurality of processors are partitioned into at least a first scheduling group and a second scheduling group; at least some processors in the first scheduling group are bound to a first type of service; at least some processors in the second scheduling group are bound to a second type of service; the computing device exposes an interface for modifying the binding relationship between processors and services; the processors in the first scheduling group share a third-level cache, and the processors in the second scheduling group share a third-level cache; no third-level cache is shared between the processors in the first scheduling group and the processors in the second scheduling group; the apparatus comprises:

a receiving module, configured to receive a capacity expansion instruction triggered when the first type of service meets a capacity expansion condition, wherein the capacity expansion instruction includes identification information of the second scheduling group;

a creating module, configured to respond to the capacity expansion instruction, allocate several processors to the first type of service from the second scheduling group corresponding to the identification information, call the interface to modify the binding relationship between the several processors and services so as to bind them to the first type of service, and create a capacity expansion scheduling group based on the processors bound to the first type of service; and

a scheduling module, configured to perform load balancing processing on the first scheduling group and the capacity expansion scheduling group under a load balancing policy, so as to schedule at least part of the processing tasks of the first type of service borne by the first scheduling group to the capacity expansion scheduling group.

In a fourth aspect of the embodiments of the present disclosure, a computing device is provided, comprising: a plurality of processors; and a memory for storing processor-executable instructions; wherein the processors all adopt a three-level cache architecture; the plurality of processors are partitioned into at least a first scheduling group and a second scheduling group; at least some processors in the first scheduling group are bound to a first type of service; at least some processors in the second scheduling group are bound to a second type of service; the computing device exposes an interface for modifying the binding relationship between processors and services; the processors in the first scheduling group share a third-level cache, and the processors in the second scheduling group share a third-level cache; no third-level cache is shared between the processors in the first scheduling group and the processors in the second scheduling group; the processors implement the following method by executing the executable instructions:

receiving a capacity expansion instruction for the first type of service; wherein the capacity expansion instruction includes identification information of the second scheduling group;

responding to the capacity expansion instruction, allocating several processors to the first type of service from the second scheduling group corresponding to the identification information, calling the interface to modify the binding relationship between the several processors and services so as to bind them to the first type of service, and creating a capacity expansion scheduling group based on the processors bound to the first type of service;

and performing load balancing processing on the first scheduling group and the capacity expansion scheduling group under a load balancing policy, so as to schedule at least part of the processing tasks of the first type of service borne by the first scheduling group to the capacity expansion scheduling group.

In the above embodiments of the present disclosure, at least the following advantages are provided:

On one hand, because the computing device exposes an interface for modifying the binding relationship between processors and services, when the first type of service bound to the first scheduling group meets the capacity expansion condition, several processors can be allocated to it from the second scheduling group bound to the second type of service; by calling the interface, the allocated processors are re-bound to the first type of service, and a capacity expansion scheduling group is created from them to share the processing tasks of the first type of service borne by the first scheduling group. This prevents the first scheduling group from degrading the first type of service because it cannot be expanded and its service load grows too large.

On the other hand, load balancing for the first type of service normally schedules its processing tasks among the scheduling groups bound to that service. By creating, alongside the existing first scheduling group, a capacity expansion scheduling group built from processors allocated out of the second scheduling group and re-bound to the first type of service, the computing device can apply its load balancing policy between the first scheduling group and the capacity expansion scheduling group, scheduling processing tasks of the first type of service from the former to the latter and thereby relieving the load pressure on the first scheduling group.

moreover, the processor in the capacity expansion scheduling group can be bound with the first type of service instead of the second type of service; therefore, the problem that the processing tasks carried by the first scheduling group and the capacity expansion scheduling group are unbalanced due to the fact that the service types bound by the first scheduling group and the capacity expansion scheduling group are inconsistent in the process of carrying out load balancing processing on the processing tasks corresponding to the first type of service between the first scheduling group and the capacity expansion scheduling group, and the processing tasks carried by the first scheduling group and the capacity expansion scheduling group cannot be scheduled to the capacity expansion scheduling group can be avoided, so that the processing resources of the processors in the capacity expansion scheduling group can be fully utilized, and the processing resources of the processors in the capacity expansion scheduling group cannot be fully utilized.

Drawings

The above and other objects, features and advantages of exemplary embodiments of the present disclosure will become readily apparent from the following detailed description read in conjunction with the accompanying drawings. Several embodiments of the present disclosure are illustrated by way of example, and not by way of limitation, in the figures of the accompanying drawings and in which:

FIG. 1 schematically illustrates a schematic diagram of a computing device employing a three-level processor cache architecture, according to an embodiment of the present disclosure;

FIG. 2 schematically illustrates a schematic diagram of a computing device employing a NUMA architecture, according to an embodiment of the present disclosure;

FIG. 3 schematically illustrates a flow chart of a method of load balancing according to an embodiment of the present disclosure;

FIG. 4 schematically illustrates a flowchart of load balancing processing performed on the first scheduling group and the capacity expansion scheduling group according to an embodiment of the present disclosure;

FIG. 5 schematically illustrates a block diagram of a load balancing apparatus according to an embodiment of the present disclosure;

FIG. 6 schematically illustrates a hardware architecture diagram of a computing device, in accordance with an embodiment of the present disclosure;

FIG. 7 schematically shows a schematic diagram of a software product to which the load balancing method is applied, according to an embodiment of the present disclosure.

In the drawings, the same or corresponding reference numerals indicate the same or corresponding parts.

Detailed Description

The principles and spirit of the present disclosure will be described with reference to a number of exemplary embodiments. It is understood that these embodiments are given solely for the purpose of enabling those skilled in the art to better understand and to practice the present disclosure, and are not intended to limit the scope of the present disclosure in any way. Rather, these embodiments are provided so that this disclosure will be thorough and complete, and will fully convey the scope of the disclosure to those skilled in the art.

As will be appreciated by one skilled in the art, embodiments of the present disclosure may be embodied as a system, apparatus, device, method, or computer program product. Accordingly, the present disclosure may be embodied in the form of: entirely hardware, entirely software (including firmware, resident software, micro-code, etc.), or a combination of hardware and software.

According to an embodiment of the disclosure, a load balancing method, medium, device and computing equipment are provided.

In this context, it is to be understood that the terms used are words of description rather than of limitation. Moreover, any number of elements in the drawings is by way of example and not limitation, and any names are used solely for differentiation and not limitation.

The principles and spirit of the present disclosure are explained in detail below with reference to several representative embodiments of the present disclosure.

Application Scenario Overview

Referring to fig. 1, fig. 1 is a schematic diagram of a computing device employing a three-level processor cache architecture according to the present disclosure.

As shown in fig. 1, the computing device may carry multiple physical CPUs (central processing units) that commonly access the same physical memory. The physical CPU refers to CPU hardware actually inserted into a motherboard of the computing device. The number of physical CPUs mounted on the computing device is the number of CPU hardware actually inserted into the motherboard of the computing device. Each physical CPU mounted on the computing device may be a multi-core CPU, and may include a plurality of CPU cores (e.g., CPU core shown in fig. 1). Further, in order to improve the access efficiency of each CPU core to the memory, a multi-level cache may be respectively provided for each CPU core.

A multi-level cache is temporary storage between the CPU and the memory. Generally, it includes three levels: the L1 cache (first-level cache), the L2 cache (second-level cache), and the L3 cache (third-level cache).

The L1 cache is generally placed inside the CPU core and is exclusive to that core; it stores the data the CPU is about to access and the instructions it is about to execute. In practical applications, the L1 cache can be further divided into an L1D-Cache for storing data and an L1I-Cache for storing instructions.

The L2 cache is also typically placed inside the CPU core and exclusive to that core, and is used to store data the CPU needs to access.

In practical applications, the L2 cache may instead be placed outside the CPU core and shared by several cores; this specification does not limit it. For example, the L2 cache shown in FIG. 1 sits inside the CPU core and is exclusive to it, but it could equally be placed outside the core and shared by multiple cores, like the L3 cache.

The L3 cache is generally placed outside the CPU core and shared by multiple CPU cores, and is used to store data the CPU needs to access.
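For reference, Linux exports this cache hierarchy through sysfs; a small sketch using real sysfs paths (index numbering can vary by platform, but index3 is conventionally the L3):

```python
import glob

# Each logical CPU exports its cache hierarchy under sysfs; on most x86
# systems index0/index1 are the per-core L1D/L1I, index2 the L2, and
# index3 the shared L3.
for idx in sorted(glob.glob("/sys/devices/system/cpu/cpu0/cache/index*")):
    with open(f"{idx}/level") as f:
        level = f.read().strip()
    with open(f"{idx}/type") as f:
        ctype = f.read().strip()
    with open(f"{idx}/shared_cpu_list") as f:
        shared = f.read().strip()
    print(f"L{level} {ctype:12s} shared by CPUs {shared}")
```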

In the three-level processor cache architecture shown in FIG. 1, since all physical CPUs access the same physical memory, each physical CPU faces a performance bottleneck once the number of CPU cores grows too large.

Referring to fig. 2, fig. 2 is a schematic diagram of a computing device adopting a Non Uniform Memory Access (NUMA) architecture according to the present disclosure.

NUMA architecture is a processor architecture derived from the three-level processor cache architecture shown in FIG. 1. Under the NUMA architecture, an exclusive memory may be set for each physical CPU, so that, compared to the architecture shown in fig. 1, performance bottlenecks of each physical CPU caused by an excessive number of CPU cores of the physical CPU may be alleviated.

As shown in fig. 2, under a NUMA architecture, a CPU onboard a computing device may be divided into a plurality of NUMA nodes.

For example, as shown in fig. 2, under the NUMA architecture, each physical CPU mounted on the computing device may be regarded as an independent NUMA node; for example, assuming that the computing device is equipped with N physical CPUs, the N physical CPUs may be divided into N NUMA nodes, and each NUMA node corresponds to an independent physical CPU.

As shown in FIG. 2, the CPU cores inside each NUMA node share an L3 cache, while CPU cores in different NUMA nodes do not share an L3 cache.

Under the NUMA architecture, after a CPU mounted on a computing device is divided into a plurality of NUMA nodes, an exclusive memory may be allocated to each NUMA node.

For example, in practical applications, the memory mounted on the computing device may be evenly distributed among the NUMA nodes according to their total number; for instance, if a computing device adopting the NUMA mechanism has N NUMA nodes in total and carries memory of capacity M, the exclusive memory allocated to each NUMA node has capacity M/N.

With continued reference to fig. 2, in a computing device employing a NUMA architecture, the NUMA nodes may be interconnected by a NUMA interconnect module. The specific technical details of the interconnection of each NUMA node through the NUMA interconnection module are not described in detail in this specification.

On one hand, a NUMA node can access its exclusive memory through an internal channel (such as an IO bus), which is referred to as local access; on the other hand, it can remotely access memory exclusive to other NUMA nodes through the NUMA interconnect module, which is referred to as remote access.

The exclusive memory allocated to a NUMA node is referred to as that node's local memory; memory allocated to other NUMA nodes is referred to as its remote memory.

In practical applications, when a NUMA node accesses local memory through its internal channel, it can generally reach the corresponding data quickly and directly via the memory address. When it accesses the memory of other NUMA nodes, however, the access must go remotely through the NUMA interconnect module, so the access speed is typically lower than for local memory.
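This local/remote asymmetry is visible from user space: on Linux, each node publishes a distance table in sysfs (a real interface; the values are relative costs, with 10 meaning local):

```python
# Relative access cost from node0 to every node; 10 = local, larger = remote.
with open("/sys/devices/system/node/node0/distance") as f:
    print("node0 distances:", f.read().strip())  # e.g. "10 21" on a 2-node box
```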

It should be noted that the number of CPUs mounted on a computing device generally refers to the number of logical CPUs mounted on the computing device.

In practical applications, each CPU core on the physical CPUs shown in FIG. 1 and FIG. 2 may further be simulated as a pair of logical CPUs using Hyper-Threading technology.

In this case, the number of CPUs mounted on the computing device depends on the number of physical CPUs mounted on the computing device and the number of CPU cores included in each physical CPU.

For example, assume that a computing device carries N physical CPUs, each comprising M CPU cores, and that each CPU core is further simulated as a pair of logical CPUs through hyper-threading. The number of logical CPUs mounted on the computing device is then N × M × 2. For instance, if the computing device carries 2 physical CPUs, each a multicore CPU with 4 CPU cores, then N = 2 and M = 4, and the computing device carries 2 × 4 × 2 = 16 logical CPUs.
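The arithmetic, cross-checked against what the operating system reports (a sketch; os.cpu_count() counts logical CPUs on Linux):

```python
import os

n_physical, m_cores = 2, 4          # example values from the text
logical = n_physical * m_cores * 2  # x2 when hyper-threading is enabled
print(logical)                      # 16
print(os.cpu_count())               # logical CPUs the OS actually reports
```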

The computing device shown in fig. 2 may be configured to undertake processing tasks corresponding to at least two different types of services. For example, in one example, online services and offline services may be deployed in a mix on the computing devices described above.

In addition, when at least two different types of services are mixedly deployed on the computing device, in order to let the processing tasks it bears be scheduled evenly among its onboard logical CPUs, those logical CPUs are generally divided into scheduling domains in the operating system kernel of the computing device.

When the operating system kernel of the computing device divides its logical CPUs into scheduling domains, it usually puts all logical CPUs of the computing device into one scheduling domain, and then places the logical CPUs that share an L3 cache into the same scheduling group. That is, the CPUs within each divided scheduling group share an L3 cache.

As shown in FIG. 2, and as described above, an L3 cache is shared among the logical CPUs within one physical CPU; dividing scheduling groups in this way therefore amounts to putting the logical CPUs of each physical CPU mounted on the computing device into their own scheduling group, so the number of scheduling groups equals the number of physical CPUs mounted on the computing device.
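A sketch of this grouping rule using real sysfs files: logical CPUs whose L3 (cache/index3) reports the same shared_cpu_list fall into the same scheduling group:

```python
import glob
import re
from collections import defaultdict

# Group logical CPUs by the L3 they share; each distinct shared_cpu_list
# corresponds to one scheduling group in the division described above.
groups = defaultdict(list)
pattern = "/sys/devices/system/cpu/cpu[0-9]*/cache/index3/shared_cpu_list"
for path in glob.glob(pattern):
    cpu = int(re.search(r"cpu(\d+)", path).group(1))
    with open(path) as f:
        groups[f.read().strip()].append(cpu)

for llc, cpus in groups.items():
    print(f"L3 shared by {llc}: scheduling group {sorted(cpus)}")
```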

For example, referring to fig. 2, for a computing device adopting the NUMA architecture, since each physical CPU will be an independent NUMA node; therefore, when the scheduling groups are divided in the manner described above, all the logical CPUs mounted on the computing device may be divided into a large scheduling domain, and then the logical CPUs included in the NUMA node corresponding to each physical CPU may be divided into an independent scheduling group.

In this case, assuming that the computing device includes N NUMA nodes, the logical CPUs included in each NUMA node together form an independent scheduling group, and in this case, the scheduling domain may include N scheduling groups corresponding to the NUMA nodes in total.

In practical applications, to prevent the different types of services deployed together from interfering with one another, the different types of services borne by the computing device may be isolated from each other.

A common solution for such isolation is currently the cpuset scheme. The cpuset scheme achieves service isolation by binding the service processing process of a service to one or more specific CPU cores, so that the process can only run on those cores. Because it is easy to implement and provides good isolation, the cpuset scheme is the widely applied general-purpose isolation scheme today.
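A minimal sketch of the cpuset scheme via the cgroup v1 interface (real files; assumes the cpuset controller is mounted at /sys/fs/cgroup/cpuset and root privileges; the group names and PIDs are placeholders):

```python
import os

def make_cpuset(name: str, cpus: str, mems: str, pid: int) -> None:
    path = f"/sys/fs/cgroup/cpuset/{name}"
    os.makedirs(path, exist_ok=True)
    with open(f"{path}/cpuset.cpus", "w") as f:
        f.write(cpus)      # e.g. "0-7": CPU cores this service may run on
    with open(f"{path}/cpuset.mems", "w") as f:
        f.write(mems)      # e.g. "0": NUMA node whose memory it may use
    with open(f"{path}/tasks", "w") as f:
        f.write(str(pid))  # confine the service process to this cpuset

# make_cpuset("online",  "0-7",  "0", online_pid)   # placeholder PIDs
# make_cpuset("offline", "8-15", "1", offline_pid)
```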

When the cpuset scheme is adopted for service isolation, the logical CPUs carried by the computing device have already been divided into scheduling domains and scheduling groups in the operating system kernel; different types of services borne by the computing device can therefore be deployed to different scheduling groups and bound to their respective scheduling groups, thereby achieving service isolation.

The following description takes the mixed deployment of an online service and an offline service on the computing device as an example.

Assume that the computing device includes two NUMA nodes in total, denoted NUMA nodeA and NUMA nodeB. NUMA nodeA corresponds to the divided first scheduling group; NUMA nodeB corresponds to the divided second scheduling group.

In this case, the online service may be deployed to a first scheduling group, and a CPU in the first scheduling group assumes a processing task corresponding to the online service; correspondingly, the offline service can be deployed to a second scheduling group, and the CPU in the second scheduling group undertakes the processing task corresponding to the offline service.

Meanwhile, to isolate the online service from the offline service, the cpuset scheme described above may be adopted to bind the online service to the first scheduling group and the offline service to the second scheduling group.

After the online service is deployed to the first scheduling group and bound to it, the computing device may subsequently perform load balancing, based on a load balancing policy, on the processing tasks of the online service among the CPUs in the first scheduling group, so that those tasks reach a load-balanced state across the CPUs of the first scheduling group.

Correspondingly, after the offline service is deployed to the second scheduling group and bound to it, the computing device may likewise perform load balancing on the processing tasks of the offline service among the CPUs in the second scheduling group, so that those tasks also reach a load-balanced state.
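The same binding can be expressed per process with the affinity syscall; a sketch using Python's real os.sched_setaffinity wrapper (CPU numbers assume the two 8-CPU nodes of this example):

```python
import os

FIRST_GROUP = set(range(0, 8))    # NUMA nodeA: bound to the online service
SECOND_GROUP = set(range(8, 16))  # NUMA nodeB: bound to the offline service

def bind_service(pid: int, cpus: set) -> None:
    # Restricts `pid` to the given logical CPUs, i.e. binds the service
    # process to one scheduling group (requires a machine with those CPUs).
    os.sched_setaffinity(pid, cpus)

bind_service(os.getpid(), FIRST_GROUP)    # demo on the current process
print(os.sched_getaffinity(os.getpid()))  # -> {0, 1, ..., 7}
```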

Further, in practical applications, when the processing task load borne by one of the first or second scheduling group becomes too heavy (for example, the number of processing task processes it bears reaches a threshold), that scheduling group may need to be expanded by allocating new CPUs from the other scheduling group to share the processing tasks.

For example, assuming that the load of the tasks assumed in the NUMA nodeA (i.e., the first scheduling group) bound to the online service is too heavy, several CPUs may be allocated to the NUMA nodeA from the NUMA nodeB (i.e., the second scheduling group) to assume the processing tasks corresponding to the online service.

However, when a certain scheduling group is expanded, the service types bound for the new CPUs allocated to the scheduling group from other scheduling groups are usually inconsistent with the service types bound to the expanded scheduling group; therefore, the inconsistency of the bound service types may cause certain limitations on load balancing scheduling of the processing tasks among the scheduling groups, and further may cause that the processing tasks borne by the scheduling groups cannot achieve load balancing.

For example, assume that the workload placed on NUMA nodeA, which is bound to the online service, is too heavy, and that after several CPUs are allocated to NUMA nodeA from NUMA nodeB, these allocated CPUs remain bound to the offline service. Assume further that the load balancing policy adopted by the computing device preferentially schedules services among scheduling groups bound to the same type of service.

In that case, even though several CPUs have been allocated to NUMA nodeA from NUMA nodeB, the allocated CPUs are not bound to the online service but remain bound to the offline service; so when the CPUs in NUMA nodeA perform load balancing, the computing device still cannot schedule the excess online-service processing tasks borne by the CPUs in NUMA nodeA onto the newly allocated CPUs.

As a result, the processing resources of the allocated CPUs are used inefficiently, and the processing tasks borne by the newly allocated CPUs and the CPUs in NUMA nodeA never reach load balance. For example, the CPUs in NUMA nodeA may remain overloaded while the CPUs allocated to NUMA nodeA from NUMA nodeB carry little or no online traffic.

Therefore, in a scenario where the computing device mixedly deploys multiple services and isolates them through the cpuset scheme, once CPU expansion across scheduling groups is involved, the processing resources of the expanded CPUs may not be fully utilized.

Summary of the Invention

As described above, in a scenario where multiple services are mixedly deployed on a computing device adopting the architecture shown in FIG. 1 or FIG. 2, and service isolation is implemented through the cpuset scheme, once capacity expansion of CPUs across scheduling groups is involved, the processing resources of the expanded CPUs usually cannot be fully utilized.

In view of this, the present specification provides a load balancing method that, when CPUs are expanded across scheduling groups in a scenario where multiple services are mixedly deployed on a computing device and isolated through the cpuset scheme, fully utilizes the processing resources of the expanded CPUs and load balances them against the existing CPUs.

The core technical concept of the specification is as follows:

the method comprises the steps of opening modification permission aiming at the binding relationship between CPUs and services in each scheduling group of the computing equipment, enabling a plurality of CPUs to be distributed to a first class of service from a second scheduling group bound with a second class of service when the first class of service bound with a first scheduling group of the computing equipment meets a capacity expansion condition, binding the CPUs and the first class of service in a mode of modifying the binding relationship between the CPUs and the services, and establishing the capacity expansion scheduling group based on the CPUs bound with the first class of service to share the load pressure of the first class of service born by the first scheduling group.

In this way, since load balancing for the first type of service normally schedules its processing tasks among the scheduling groups bound to that service, creating, alongside the existing first scheduling group, a capacity expansion scheduling group built from processors allocated out of the second scheduling group and re-bound to the first type of service allows the computing device to apply its load balancing policy between the first scheduling group and the capacity expansion scheduling group. Processing tasks of the first type of service borne by the first scheduling group can then be scheduled to the capacity expansion scheduling group, relieving the load pressure on the first scheduling group.

Moreover, the processors in the capacity expansion scheduling group are bound to the first type of service rather than the second. This avoids the situation in which, because the two groups are bound to different service types, processing tasks of the first type of service cannot be scheduled from the first scheduling group to the capacity expansion scheduling group and the load between the two groups stays unbalanced; the processing resources of the processors in the capacity expansion scheduling group can therefore be fully utilized.

Exemplary Method

The technical idea of the present specification will be described in detail by specific examples.

Referring to FIG. 3, FIG. 3 is a flowchart illustrating a load balancing method according to an exemplary embodiment. The method is applied to a computing device.

The computing device may include a plurality of processors, each of which may adopt the three-level cache architecture shown in FIG. 1. The plurality of processors are partitioned into at least a first scheduling group and a second scheduling group; at least some processors in the first scheduling group are bound to the first type of service; at least some processors in the second scheduling group are bound to the second type of service; the computing device exposes an interface for modifying the binding relationship between processors and services; an L3 cache is shared among the processors in the first scheduling group, and an L3 cache is shared among the processors in the second scheduling group; no L3 cache is shared between the processors of the first scheduling group and those of the second scheduling group. The method performs the following steps:

Step 301: receiving a capacity expansion instruction for the first type of service; wherein the capacity expansion instruction includes identification information of the second scheduling group.

the computing device may be any form of hardware device capable of mixedly deploying multiple types of services; for example, in one example, the computing device may be a computing device in a cloud computing platform for undertaking computing tasks; in another example, the computing device may also provide various types of real-time or non-real-time services to the user.

The specific service types of the first type of service and the second type of service are not particularly limited in this specification, and in practical application, may include any type of service that can be mixedly deployed on the same computing device;

in one example, the first type of service may be an online service and the second type of service an offline service, or the reverse. It should be explained that an online service may be a service with high real-time requirements, while an offline service may be a service with low real-time requirements;

for example, in the scenario that the computing device is a computing device that undertakes a computing task in a cloud computing platform, the online service may be a real-time cloud computing task; the offline service may be an offline calculation task executed offline in the background;

for another example, in the scenario that the computing device is a server of a music playing APP providing various real-time or offline services for a user, the online service may be an online playing service provided for the user; the offline service may be a service for downloading offline music provided for the user.

In another example, the first type of service and the second type of service may be other types of services besides online services or offline services;

for example, taking the computing device as a server of a certain music playing APP as an example, in this scenario, the first type of service may be a music copyright purchase service provided for a common user; the second service may be a music copyright sharing service provided for professional musicians. The computing device may carry a plurality of physical CPUs, each of which may be a multicore CPU including a plurality of CPU cores; when the computing device supports hyper-threading, each CPU core may be further modeled as a pair of logical CPUs. In this case, the number of CPUs mounted on the computing device generally refers to the number of logical CPUs mounted on the computing device, and the number of logical CPUs depends on the number of physical CPUs mounted on the computing device and the number of CPU cores included in each physical CPU.

For example, if the computing device carries 2 physical CPUs, each a multicore CPU with 4 CPU cores, and each CPU core is further simulated as a pair of logical CPUs using hyper-threading, then the number of logical CPUs mounted on the computing device is 2 × 4 × 2 = 16.

The CPU carried by the computing device can be divided into a plurality of scheduling domains, and each scheduling domain can be further divided into a plurality of scheduling groups.

In an application scenario where the first and second types of service are mixedly deployed on the computing device, the divided scheduling groups may include: a first scheduling group for bearing the processing tasks of the first type of service, and a second scheduling group for bearing the processing tasks of the second type of service.

For example, in implementation, all the logical CPUs mounted on the computing device may be divided into one scheduling domain. Further, since the logical CPUs under one physical CPU share an L3 cache, each physical CPU mounted on the computing device may be divided into a different scheduling group. In this case, the first scheduling group and the second scheduling group correspond to different physical CPUs.

In an illustrated embodiment, the computing device may specifically employ a NUMA architecture as illustrated in fig. 2. When the computing device adopts the NUMA architecture, it may include a plurality of NUMA nodes; for example, at least a first NUMA node and a second NUMA node. In this case, all logical CPUs under the NUMA architecture may be divided into one scheduling domain; then the CPUs of the first NUMA node may be divided into the first scheduling group, and the CPUs of the second NUMA node into the second scheduling group.
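As a hedged illustration of how the scheduling groups may be made to follow NUMA boundaries, the following C sketch enumerates the logical CPUs of each NUMA node by reading the standard Linux sysfs files; the two-node loop mirrors the first and second NUMA nodes described above:

```c
/* A hedged sketch: list which logical CPUs belong to each NUMA node by
 * reading Linux sysfs, one way to decide how the first and second scheduling
 * groups map onto NUMA nodes. The sysfs paths are standard on Linux. */
#include <stdio.h>

int main(void) {
    char path[64], cpulist[256];
    for (int node = 0; node < 2; node++) {   /* first and second NUMA node */
        snprintf(path, sizeof(path),
                 "/sys/devices/system/node/node%d/cpulist", node);
        FILE *f = fopen(path, "r");
        if (!f) break;                        /* node not present */
        if (fgets(cpulist, sizeof(cpulist), f))
            printf("NUMA node %d -> CPUs %s", node, cpulist);
        fclose(f);
    }
    return 0;
}
```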

For the divided first scheduling group and second scheduling group, the cpuset scheme may still be adopted to bind the first type of service to the first scheduling group and bind the second type of service to the second scheduling group, so as to implement service isolation between the first type of service and the second type of service when they are mixedly deployed on the computing device.
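A minimal sketch of the cpuset scheme follows, assuming a cgroup-v1 cpuset hierarchy mounted at /sys/fs/cgroup/cpuset with two pre-created groups; the group names, CPU ranges, and process IDs are illustrative assumptions, not values fixed by this specification:

```c
/* A minimal sketch, assuming a cgroup-v1 cpuset mount and two pre-created
 * cpusets named "online" and "offline" (names illustrative). Binding a
 * service to a scheduling group amounts to writing its CPU list, memory
 * node, and process IDs into the cpuset's control files. */
#include <stdio.h>

/* Write a string to a cpuset control file; returns 0 on success. */
static int cpuset_write(const char *file, const char *value) {
    FILE *f = fopen(file, "w");
    if (!f) return -1;
    int rc = (fputs(value, f) >= 0) ? 0 : -1;
    fclose(f);
    return rc;
}

int main(void) {
    /* First scheduling group: CPUs 0-7 of NUMA node 0 carry the online service. */
    cpuset_write("/sys/fs/cgroup/cpuset/online/cpuset.cpus", "0-7");
    cpuset_write("/sys/fs/cgroup/cpuset/online/cpuset.mems", "0");
    cpuset_write("/sys/fs/cgroup/cpuset/online/tasks", "12345");  /* example PID */

    /* Second scheduling group: CPUs 8-15 of NUMA node 1 carry the offline service. */
    cpuset_write("/sys/fs/cgroup/cpuset/offline/cpuset.cpus", "8-15");
    cpuset_write("/sys/fs/cgroup/cpuset/offline/cpuset.mems", "1");
    cpuset_write("/sys/fs/cgroup/cpuset/offline/tasks", "23456"); /* example PID */
    return 0;
}
```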

In this specification, when the first type of service or the second type of service satisfies the capacity expansion condition, the capacity expansion may be performed on the first type of service or the second type of service by triggering a capacity expansion instruction.

The capacity expansion condition can be flexibly customized by a user based on actual capacity expansion requirements; for example, it may include: the utilization rate of the CPUs handling the processing tasks of the first or second type of service reaching a threshold; the number of processing tasks corresponding to the first or second type of service reaching a threshold (for example, the number of service processes reaching a threshold); and so on, which are not exhaustively listed in this specification.
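As one concrete (and purely illustrative) realization of the first condition above, the following C sketch samples aggregate CPU utilization from /proc/stat and compares it with an assumed 80% threshold:

```c
/* A hedged sketch of one possible capacity expansion condition: sample
 * aggregate CPU utilization from /proc/stat twice and compare it with a
 * user-defined threshold. The 80% threshold is an assumption. */
#include <stdio.h>
#include <unistd.h>

/* Read the aggregate "cpu" line of /proc/stat once. */
static void sample(long long *busy, long long *total) {
    long long user = 0, nice = 0, sys = 0, idle = 0, iowait = 0, irq = 0, softirq = 0;
    FILE *f = fopen("/proc/stat", "r");
    if (f) {
        fscanf(f, "cpu %lld %lld %lld %lld %lld %lld %lld",
               &user, &nice, &sys, &idle, &iowait, &irq, &softirq);
        fclose(f);
    }
    *busy  = user + nice + sys + irq + softirq;
    *total = *busy + idle + iowait;
}

int main(void) {
    long long b0, t0, b1, t1;
    sample(&b0, &t0);
    sleep(1);
    sample(&b1, &t1);
    if (t1 == t0) return 0;                  /* no data available */
    double util = 100.0 * (double)(b1 - b0) / (double)(t1 - t0);
    if (util >= 80.0)                        /* assumed expansion threshold */
        printf("capacity expansion condition met: utilization %.1f%%\n", util);
    return 0;
}
```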

In one example, the capacity expansion instruction may be automatically triggered and created by the computing device when the first type of service or the second type of service meets the capacity expansion condition. In another example, the computing device may provide a user (e.g., a service manager) with a command line tool associated with the services; for example, the command line tool may be client software that provides an instruction entry service to the user. In this case, the capacity expansion instruction may be one manually input by the user through the command line tool when the first type of service or the second type of service satisfies the capacity expansion condition.

In this specification, it is assumed that the first type of service satisfies the capacity expansion condition; at this time, a number of CPUs for undertaking processing tasks of the first type of service need to be expanded for it from the second scheduling group bound to the second type of service.

In this case, the capacity expansion instruction triggered when the first type of service satisfies the capacity expansion condition may specifically include the identifier of the second scheduling group. The computing device may receive the capacity expansion instruction, process the capacity expansion instruction, and perform capacity expansion for the first type of service.

Step 302, in response to the capacity expansion instruction, allocating a plurality of processors to the first type of service from a second scheduling group corresponding to the identification information, calling the interface, modifying the binding relationship between the plurality of processors and the service, binding the plurality of processors and the first type of service, and creating a capacity expansion scheduling group based on the plurality of processors bound to the first type of service;

When receiving the capacity expansion instruction, the computing device may respond to it and allocate a number of CPUs to the first type of service from the second scheduling group.

In an illustrated embodiment, a number of CPUs not bound to the second type of service may be reserved in the second scheduling group as a CPU set for bearing processing tasks corresponding to the first type of service; the number of CPUs reserved in this CPU set can be flexibly set based on the total number of CPUs actually contained in the computing device.

In this case, when allocating CPUs to the first type of service that needs capacity expansion from the second scheduling group, the CPUs may be specifically allocated to the first type of service from the reserved CPU set.

Of course, in the first scheduling group, a number of CPUs not bound to the first type of service may likewise be reserved as a CPU set for bearing processing tasks corresponding to the second type of service. When the second type of service meets the capacity expansion condition, CPUs may also be allocated to it from this CPU set.

In an illustrated embodiment, the capacity expansion instruction may specifically include identification information of the CPUs to be allocated to the first type of service from the second scheduling group. For example, the capacity expansion instruction may be one manually input by a user through the command line tool; in this case, it may specifically include identification information of the CPUs, specified by the user, that need to be allocated to the first type of service from the second scheduling group.

In this case, when allocating CPUs to the first type of service that needs capacity expansion from the second scheduling group, the CPUs in the second scheduling group corresponding to the identification information included in the capacity expansion instruction may be specifically allocated to the first type of service.
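The following C sketch illustrates, under stated assumptions, how such a capacity expansion instruction might be processed: the identification information "12,13" and the reserved CPU set {12..15} are hypothetical examples, not values defined by this specification:

```c
/* An illustrative sketch (not the disclosure's implementation): parse the
 * CPU identification information carried in a capacity expansion
 * instruction, e.g. "12,13", and check each CPU against the reserved set
 * of the second scheduling group before allocating it. */
#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

static const int reserved[] = {12, 13, 14, 15};   /* reserved CPU set (example) */

static bool is_reserved(int cpu) {
    for (size_t i = 0; i < sizeof(reserved) / sizeof(reserved[0]); i++)
        if (reserved[i] == cpu) return true;
    return false;
}

int main(void) {
    char instruction[] = "12,13";                 /* identification info (example) */
    for (char *tok = strtok(instruction, ","); tok; tok = strtok(NULL, ",")) {
        int cpu = atoi(tok);
        if (is_reserved(cpu))
            printf("allocating CPU %d to the first type of service\n", cpu);
        else
            printf("CPU %d is not in the reserved set, skipping\n", cpu);
    }
    return 0;
}
```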

In this specification, the CPUs allocated to the first type of service from the second scheduling group may still be bound to the second type of service by default. To prevent such a binding relationship from limiting the subsequent load balancing process, the computing device may open to the user the right to modify the binding relationship between CPUs and services in each scheduling group under the cpuset scheme, by opening an interface for modifying the binding relationship between processors and services.

In this case, after a plurality of CPUs are allocated to the first type of service from the second scheduling group, the interface may be further invoked to modify the binding relationship between the allocated CPUs and the service, bind the allocated CPUs and the first type of service, and then create a capacity expansion scheduling group based on the allocated CPUs bound to the first type of service.

In an illustrated embodiment, when the interface is called to modify the binding relationship between the allocated CPUs and the services, a dynamic hot modification mode may specifically be adopted to adjust, on the fly, the binding relationship between the CPUs in the second scheduling group and the services, binding these CPUs to the first type of service.

By adopting the dynamic hot modification mode, the binding relationship between the CPUs allocated to the first type of service from the second scheduling group and the first type of service can take effect immediately, without restarting the computing device.

Of course, in practical applications, besides dynamic hot modification, a traditional cold modification method, in which the modification takes effect only after the device is restarted, may also be used; this is not particularly limited in this specification.

In an illustrated embodiment, the kernel of the operating system installed on the computing device generally describes scheduling domains and scheduling groups according to the physical topology of the physical CPUs mounted on the computing device, and generally maintains, in the kernel, topology data describing the topology of the scheduling domain to which the first scheduling group and the second scheduling group belong; for example, in practical applications, the topology data is usually a struct maintained in the operating system kernel, such as the struct named sched_domain_topology_level.

The topology data further comprises description information describing the binding relationship between the processors and the services in each scheduling group of the scheduling domain; for example, in practical applications, this description information is usually maintained in the operating system kernel as a member of the above struct, such as the mask member of type sched_domain_mask_f.

In the related art, because the operating system kernel usually describes scheduling domains and scheduling groups according to the physical topology of the physical CPUs carried by the device, the above struct and description information generally describe the actual physical topology of the physical CPUs mounted on the computing device.

In this specification, in order to flexibly expand CPU capacity between the two scheduling groups that are physically isolated from each other, the right to modify the description information in the struct may be opened, so that a user can flexibly adjust the binding relationship between the CPUs in the scheduling groups and the services based on actual capacity expansion requirements, without being limited by the physical topology of the physical CPUs mounted on the computing device.

It should be noted that, as described above, the modification of the description information may specifically be a dynamic hot modification, so as to ensure that it takes effect immediately without restarting the computing device.

In an illustrated embodiment, the interface opened by the computing device for modifying the binding relationship between processors and services may specifically be a user-mode interface for modifying the description information maintained in the operating system kernel; for example, it may be an API that opens the modification right of the above description information to user mode.

In this case, after the computing device allocates a plurality of CPUs to the first type of service from the second scheduling group, the computing device may further invoke the user-mode interface to modify the description information corresponding to the second scheduling group, and create a binding relationship with the first type of service for the allocated CPUs.
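A hedged sketch of invoking such a user-mode interface is given below. The file path and command format are hypothetical, since this specification does not fix them; only the pattern of writing a new CPU-to-service binding through a file-based API, taking effect without a reboot, follows the dynamic hot modification described above:

```c
/* A hedged sketch of calling the opened user-mode interface. The path
 * "/sys/kernel/sched_group_bind" and the command format are hypothetical
 * assumptions; the disclosure does not specify them. */
#include <stdio.h>

int main(void) {
    FILE *f = fopen("/sys/kernel/sched_group_bind", "w");  /* hypothetical */
    if (!f) {
        perror("open user-mode interface");
        return 1;
    }
    /* Rebind CPUs 12-13 of the second scheduling group to the online service. */
    fprintf(f, "cpus=12-13 service=online\n");
    fclose(f);
    return 0;
}
```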

For example, in one case, the CPUs allocated to the first type of service are CPUs in the reserved CPU set and are therefore not yet bound to the second type of service; a binding relationship between these CPUs and the first type of service may then be directly added to the description information corresponding to the second scheduling group.

In another case, the CPUs allocated to the first type of service are not CPUs in the reserved CPU set and have been bound to the second type of service in advance; the description information corresponding to the second scheduling group may then be modified, changing the binding relationship between these CPUs and the second type of service maintained therein into a binding relationship between these CPUs and the first type of service.

In one implementation, when creating the binding relationship between the allocated CPUs and the first type of service, the allocated CPUs may specifically be bound to the service processing processes corresponding to the first type of service, with the binding relationship between the allocated CPUs and those service processing processes created in the description information.

It should be noted that the binding relationship between a service processing process of the first type of service and a CPU generally refers to an affinity relationship between the process and the CPU. An affinity relationship between a service process and a certain CPU generally means that the process should run on that CPU for as long as possible without being migrated to other CPUs.
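On Linux, such an affinity relationship can be established with the standard sched_setaffinity(2) system call; the following minimal sketch pins an illustrative business process to CPUs 12 and 13 (the PID and CPU numbers are assumptions for the example):

```c
/* A minimal sketch of creating the affinity relationship described above:
 * constrain the business process identified by pid (illustrative value) to
 * run on CPUs 12 and 13, i.e. the CPUs allocated from the second group. */
#define _GNU_SOURCE
#include <sched.h>
#include <stdio.h>
#include <sys/types.h>

int main(void) {
    pid_t pid = 12345;           /* business process of the first type of service */
    cpu_set_t mask;
    CPU_ZERO(&mask);
    CPU_SET(12, &mask);
    CPU_SET(13, &mask);
    if (sched_setaffinity(pid, sizeof(mask), &mask) != 0) {
        perror("sched_setaffinity");
        return 1;
    }
    return 0;
}
```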

Further, after the description information of the second scheduling group is modified through the user-mode interface and binding relationships between the allocated CPUs and the first type of service are created, a capacity expansion scheduling group may be created based on the allocated CPUs; that is, reconstruction of the second scheduling group is triggered, and a capacity expansion scheduling group bound to the first type of service is partitioned out of the second scheduling group.

After reconstruction, the original second scheduling group is split into two scheduling groups: one is the capacity expansion scheduling group bound to the first type of service, and the other is a scheduling group composed of the remaining CPUs and still bound to the second type of service.

Step 303, performing load balancing processing on the first scheduling group and the capacity expansion scheduling group by using a load balancing policy, so as to schedule at least part of processing tasks in the first type of service borne by the first scheduling group to the capacity expansion scheduling group.

In this specification, after the computing device responds to the capacity expansion instruction, allocates a number of CPUs to the first type of service from the second scheduling group, and creates the capacity expansion scheduling group bound to the first type of service based on the allocated CPUs, it may subsequently perform load balancing processing on the first scheduling group and the capacity expansion scheduling group based on a load balancing policy, so as to schedule at least part of the processing tasks of the first type of service borne by the first scheduling group to the capacity expansion scheduling group and thereby relieve the load pressure on the first scheduling group.

In an illustrated embodiment, the load balancing policy adopted by the computing device may still be a policy of preferentially scheduling services between scheduling groups bound to the same type of service.

As described above, the binding relationship of the newly created capacity expansion scheduling group has been modified so that it is bound to the first type of service; at this time, the first scheduling group and the capacity expansion scheduling group are both bound to the first type of service. Therefore, under the policy of preferentially scheduling services between scheduling groups bound to the same type of service, at least part of the processing tasks of the first type of service borne by the first scheduling group can be scheduled to the capacity expansion scheduling group normally.

It should be noted that, in practical applications, when the computing device uses a load balancing policy to perform load balancing processing on the first scheduling group and the capacity expansion scheduling group, the load balancing process is usually performed in units of the CPUs in the two groups.

In implementation, the computing device may generally determine whether each CPU in the first scheduling group and the capacity expansion scheduling group satisfies a load balancing processing condition in sequence, and when any CPU satisfies the load balancing processing condition, may perform load balancing processing on the CPU by using a load balancing algorithm related to the load balancing policy.

It should be noted that, a specific type of the load balancing algorithm corresponding to the load balancing policy adopted by the computing device is not particularly limited in this specification.

In an illustrated embodiment, the load balancing algorithm may specifically be the CFS (Completely Fair Scheduler) algorithm.

Referring to fig. 4, which is a flowchart illustrating a load balancing process performed on the first scheduling group and the above capacity expansion scheduling group according to an exemplary embodiment, the process includes the following steps:

step 401, when any target processor in the capacity expansion scheduling group meets the load balancing processing condition, determining a scheduling group with a larger service load in the first scheduling group and the capacity expansion scheduling group;

Based on the CFS algorithm, when any target CPU in the capacity expansion scheduling group meets the load balancing condition, the scheduling group with the larger service load among the first scheduling group and the capacity expansion scheduling group can be determined; for example, the scheduling group carrying the larger number of processes among the two.

Step 402, if the service load of the first scheduling group is larger, further determining the processor with the highest service load in the first scheduling group, and scheduling at least part of the processing tasks of the first type of service borne by that processor to the target processor.

If the service load of the first scheduling group is larger, the CPU with the highest service load in the first scheduling group can be further determined, and at least part of the processing tasks of the first type of service borne by that CPU can be scheduled to the target CPU; for example, at least part of the service processing processes of the first type of service carried by the CPU with the highest service load are migrated to the target CPU.

Through this scheduling mode, the processing tasks borne by the first scheduling group can be gradually shared with the newly expanded capacity expansion scheduling group, so that the processing tasks borne by the CPUs in the first scheduling group and the capacity expansion scheduling group reach a load-balanced state, thereby relieving the load pressure on the first scheduling group.
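A much-simplified sketch of the selection logic in steps 401 and 402 follows; plain task counts stand in for the weighted load metrics the real CFS balancer uses, and all numbers are illustrative:

```c
/* A simplified sketch of steps 401-402: when a target CPU in the capacity
 * expansion scheduling group is eligible for balancing, find the busier of
 * the two groups, then the busiest CPU inside it, and migrate part of its
 * tasks. Loads are plain task counts here, not CFS weighted loads. */
#include <stdio.h>

#define GROUP_SIZE 4

static int group_load(const int load[GROUP_SIZE]) {
    int sum = 0;
    for (int i = 0; i < GROUP_SIZE; i++) sum += load[i];
    return sum;
}

static int busiest_cpu(const int load[GROUP_SIZE]) {
    int best = 0;
    for (int i = 1; i < GROUP_SIZE; i++)
        if (load[i] > load[best]) best = i;
    return best;
}

int main(void) {
    int first[GROUP_SIZE]     = {9, 7, 8, 6};   /* first scheduling group */
    int expansion[GROUP_SIZE] = {0, 1, 0, 0};   /* capacity expansion group */
    int target = 0;                             /* idle target CPU in expansion group */

    if (group_load(first) > group_load(expansion)) {
        int src   = busiest_cpu(first);
        int moved = first[src] / 2;             /* migrate about half the tasks */
        first[src]        -= moved;
        expansion[target] += moved;
        printf("migrated %d tasks from first-group CPU %d to expansion CPU %d\n",
               moved, src, target);
    }
    return 0;
}
```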

Of course, if after a period of time any CPU in the first scheduling group or the capacity expansion scheduling group other than the above target CPU meets the load balancing processing condition, the load balancing processing procedure for the two groups is similar to that described above and is not repeated here. It should be noted that the load balancing processing condition generally depends on the load balancing algorithm used by the computing device and is not particularly limited in this specification;

for example, if the load balancing algorithm is a CFS algorithm, the common load balancing processing conditions may include the following cases:

in one case, load balancing may be performed periodically on each CPU in the first scheduling group and the capacity expansion scheduling group; for example, a user may configure a fixed time for performing load balancing processing for each CPU, and the load balancing processing times of different CPUs may differ. In this case, for a certain CPU, if the condition for periodically performing load balancing processing is satisfied, that is, the current time reaches the load balancing processing time configured for that CPU, the CPU is deemed to have satisfied the load balancing condition.

In another case, the load balancing condition of each CPU in the first scheduling group and the capacity expansion scheduling group may be that the number of processing tasks carried by the CPU is lower than a threshold. In this case, for a certain CPU, if the number of processing tasks carried by the CPU falls below the threshold, the CPU is deemed to have satisfied the load balancing condition.
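Combining the two cases above, a hedged C sketch of a per-CPU trigger check might look as follows; the field names and the threshold are illustrative assumptions:

```c
/* A hedged sketch combining the two trigger conditions just described:
 * a per-CPU periodic balancing time and a task-count threshold. The struct
 * fields and the constant are assumptions for illustration. */
#include <stdbool.h>
#include <stdio.h>
#include <time.h>

struct cpu_state {
    time_t next_balance;   /* configured time of the next periodic balance */
    int    nr_tasks;       /* processing tasks currently carried by this CPU */
};

#define TASK_THRESHOLD 2   /* illustrative threshold */

static bool should_balance(const struct cpu_state *cpu, time_t now) {
    if (now >= cpu->next_balance) return true;        /* periodic condition */
    if (cpu->nr_tasks < TASK_THRESHOLD) return true;  /* underloaded condition */
    return false;
}

int main(void) {
    struct cpu_state cpu = { .next_balance = time(NULL) + 4, .nr_tasks = 1 };
    printf("balance now? %s\n", should_balance(&cpu, time(NULL)) ? "yes" : "no");
    return 0;
}
```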

In the above embodiments, when the computing device performs load balancing processing for the first type of service, it generally needs to schedule the processing tasks corresponding to the first type of service among multiple scheduling groups bound to the first type of service. Therefore, on the basis of the existing first scheduling group bound to the first type of service, a capacity expansion scheduling group bound to the first type of service is created from the processors allocated from the second scheduling group, so that the computing device can, based on its load balancing policy, perform load balancing processing on the processing tasks of the first type of service between the first scheduling group and the capacity expansion scheduling group, scheduling processing tasks borne by the first scheduling group to the capacity expansion scheduling group and thereby relieving the load pressure on the first scheduling group.

Moreover, the processors in the capacity expansion scheduling group are bound to the first type of service instead of the second type of service. This avoids the problem that, during load balancing of the processing tasks of the first type of service between the first scheduling group and the capacity expansion scheduling group, inconsistent service types bound to the two groups would prevent the processing tasks borne by the first scheduling group from being scheduled to the capacity expansion scheduling group, leaving the processing resources of its processors underutilized; the processing resources of the processors in the capacity expansion scheduling group can therefore be fully utilized.

In an exemplary embodiment of the present disclosure, a load balancing apparatus is also provided. Fig. 5 shows a schematic structural diagram of the load balancing apparatus 500, and as shown in fig. 5, the load balancing apparatus 500 may include: a receiving module 510, a creating module 520, and a scheduling module 530. Wherein:

the receiving module 510 is configured to receive a capacity expansion instruction for the first type of service; wherein the capacity expansion instruction includes identification information of the second scheduling group;

the creating module 520 is configured to, in response to the capacity expansion instruction, allocate a number of processors to the first type of service from the second scheduling group corresponding to the identification information, invoke the interface, modify the binding relationship between the processors and the services, bind the processors to the first type of service, and create a capacity expansion scheduling group based on the processors bound to the first type of service;

the scheduling module 530 is configured to perform load balancing processing on the first scheduling group and the capacity expansion scheduling group by using a load balancing policy, so as to schedule at least part of the processing tasks of the first type of service borne by the first scheduling group to the capacity expansion scheduling group.

The specific details of each module of the load balancing apparatus 500 have been described in detail in the foregoing description of the load balancing method flow, and therefore are not described herein again.

It should be noted that although several modules or units of the load balancing apparatus 500 are mentioned in the above detailed description, such a division is not mandatory. Indeed, according to embodiments of the present disclosure, the features and functions of two or more modules or units described above may be embodied in one module or unit; conversely, the features and functions of one module or unit described above may be further divided into multiple modules or units to be embodied.

In an exemplary embodiment of the present disclosure, an electronic device capable of implementing the above method is also provided.

An electronic device 600 according to such an embodiment of the present disclosure is described below with reference to fig. 6. The electronic device 600 shown in fig. 6 is only an example and should not bring any limitations to the function and scope of use of the embodiments of the present disclosure.

As shown in fig. 6, the electronic device 600 is embodied in the form of a general purpose computing device. The components of the electronic device 600 may include, but are not limited to: at least one processing unit 601, at least one storage unit 602, and a bus 603 that connects the various system components (including the storage unit 602 and the processing unit 601).

Wherein the storage unit stores program code, which can be executed by the processing unit 601, so that the processing unit 601 performs the steps of the various embodiments described above in this specification.

The storage unit 602 may include readable media in the form of volatile memory units, such as a random access memory unit (RAM) 6021 and/or a cache memory unit 6022, and may further include a read-only memory unit (ROM) 6023.

The storage unit 602 may also include a program/utility 6024 having a set (at least one) of program modules 6025, such program modules 6025 including, but not limited to: an operating system, one or more application programs, other program modules, and program data; each or some combination of these examples may include an implementation of a network environment.

The bus 603 may represent one or more of several types of bus structures, including a memory unit bus or memory unit controller, a peripheral bus, an accelerated graphics port, a processing unit, or a local bus using any of a variety of bus architectures.

The electronic device 600 may also communicate with one or more external devices 604 (e.g., keyboard, pointing device, bluetooth device, etc.), with one or more devices that enable a user to interact with the electronic device 600, and/or with any devices (e.g., router, modem, etc.) that enable the electronic device 600 to communicate with one or more other computing devices. Such communication may occur via input/output (I/O) interfaces 605. Also, the electronic device 600 may communicate with one or more networks (e.g., a Local Area Network (LAN), a Wide Area Network (WAN), and/or a public network, such as the internet) via the network adapter 606. As shown, a network adapter 606 communicates with the other modules of the electronic device 600 via the bus 603. It should be appreciated that although not shown in the figures, other hardware and/or software modules may be used in conjunction with the electronic device 600, including but not limited to: microcode, device drivers, redundant processing units, external disk drive arrays, RAID systems, tape drives, and data backup storage systems, among others.

Through the above description of the embodiments, those skilled in the art will readily understand that the exemplary embodiments described herein may be implemented by software, or by software in combination with necessary hardware. Therefore, the technical solution according to the embodiments of the present disclosure may be embodied in the form of a software product, which may be stored in a non-volatile storage medium (which may be a CD-ROM, a usb disk, a removable hard disk, etc.) or on a network, and includes several instructions to enable a computing device (which may be a personal computer, a server, a terminal device, or a network device, etc.) to execute the method according to the embodiments of the present disclosure.

In an exemplary embodiment of the present disclosure, there is also provided a computer-readable storage medium having stored thereon a program product capable of implementing the above-described method of the present specification. In some possible embodiments, aspects of the present disclosure may also be implemented in the form of a program product comprising program code for causing a terminal device to perform the steps according to various exemplary embodiments of the present disclosure described in the "exemplary methods" section above of this specification, when the program product is run on the terminal device.

Referring to fig. 7, a program product 70 for implementing the above method according to an embodiment of the present disclosure is described, which may employ a portable compact disc read only memory (CD-ROM) and include program code, and may be run on a terminal device, such as a personal computer. However, the program product of the present disclosure is not limited thereto, and in this document, a readable storage medium may be any tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device.

The program product may employ any combination of one or more readable media. The readable medium may be a readable signal medium or a readable storage medium. A readable storage medium may be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the foregoing. More specific examples (a non-exhaustive list) of the readable storage medium include: an electrical connection having one or more wires, a portable disk, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.

A computer readable signal medium may include a propagated data signal with readable program code embodied therein, for example, in baseband or as part of a carrier wave. Such a propagated data signal may take many forms, including, but not limited to, electro-magnetic, optical, or any suitable combination thereof. A readable signal medium may also be any readable medium that is not a readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device.

Program code embodied on a readable medium may be transmitted using any appropriate medium, including but not limited to wireless, wireline, optical fiber cable, RF, etc., or any suitable combination of the foregoing.

Program code for carrying out operations for the present disclosure may be written in any combination of one or more programming languages, including an object oriented programming language such as Java, C++ or the like and conventional procedural programming languages, such as the "C" programming language or similar programming languages. The program code may execute entirely on the user's computing device, partly on the user's device, as a stand-alone software package, partly on the user's computing device and partly on a remote computing device, or entirely on the remote computing device or server. In the case of a remote computing device, the remote computing device may be connected to the user computing device through any kind of network, including a Local Area Network (LAN) or a Wide Area Network (WAN), or may be connected to an external computing device (e.g., through the internet using an internet service provider).

Other embodiments of the disclosure will be apparent to those skilled in the art from consideration of the specification and practice of the disclosure disclosed herein. This application is intended to cover any variations, uses, or adaptations of the disclosure following, in general, the principles of the disclosure and including such departures from the present disclosure as come within known or customary practice within the art to which the disclosure pertains. It is intended that the specification and examples be considered as exemplary only, with a true scope and spirit of the disclosure being indicated by the following claims.

It should be noted that although several units/modules or sub-units/modules of the apparatus are mentioned in the above detailed description, such a division is merely exemplary and not mandatory. Indeed, according to embodiments of the present disclosure, the features and functions of two or more units/modules described above may be embodied in one unit/module; conversely, the features and functions of one unit/module described above may be further divided into multiple units/modules to be embodied.

Further, while the operations of the disclosed methods are depicted in the drawings in a particular order, this does not require or imply that these operations must be performed in this particular order, or that all of the illustrated operations must be performed, to achieve desirable results. Additionally or alternatively, certain steps may be omitted, multiple steps combined into one step execution, and/or one step broken down into multiple step executions.

While the spirit and principles of the present disclosure have been described with reference to several particular embodiments, it is to be understood that the present disclosure is not limited to the particular embodiments disclosed, and that the division into aspects is for convenience of presentation only and does not mean that features in these aspects cannot be combined to advantage. The disclosure is intended to cover various modifications and equivalent arrangements included within the spirit and scope of the appended claims.
