Bucket lifecycle configuration method, apparatus, device, and medium

Document No.: 1815425 Published: 2021-11-09

Reading note: This technology, "Bucket lifecycle configuration method, apparatus, device, and medium" (一种桶生命周期配置方法、装置、设备及介质), was designed by 王铂 (Wang Bo), 陶桐桐 (Tao Tongtong), and 胡永刚 (Hu Yonggang) on 2021-06-30. Abstract: The application discloses a bucket lifecycle configuration method, apparatus, device, and medium, comprising: determining a feasible region of configuration parameters of a bucket lifecycle; determining, based on the feasible region, an optimal configuration parameter set by using a preset multi-objective optimization algorithm with a first optimization objective and a second optimization objective, where the first optimization objective is the proportion of space occupied by the expired data remaining after expired data is processed based on the configuration parameters, relative to the total expired data, and the second optimization objective is the cluster performance occupancy rate incurred by processing the expired data based on the configuration parameters; and determining target configuration parameters from the optimal configuration parameter set and configuring the bucket lifecycle with them. In this way, both the effect of the bucket lifecycle and the processing of other services are taken into account, which improves system performance.

1. A bucket lifecycle configuration method, comprising:

determining a feasible region of configuration parameters of a bucket lifecycle;

determining, based on the feasible region, an optimal configuration parameter set by using a preset multi-objective optimization algorithm with a first optimization objective and a second optimization objective, wherein the first optimization objective is the proportion of space occupied by the expired data remaining after expired data is processed based on the configuration parameters, relative to the total expired data, and the second optimization objective is the cluster performance occupancy rate incurred by processing the expired data based on the configuration parameters; and

determining target configuration parameters from the optimal configuration parameter set, and configuring the bucket lifecycle by using the target configuration parameters.

2. The bucket lifecycle configuration method of claim 1, wherein the determining a feasible region of configuration parameters of a bucket lifecycle comprises:

determining feasible regions for the number of concurrent threads, the maximum number of objects, and the number of thread queues of the bucket lifecycle.

3. The bucket lifecycle configuration method of claim 2, wherein the optimization objective formulas of the first and second optimization objectives are:

where $Er$ represents the first optimization objective, $Pr$ represents the second optimization objective, $m_{worker}$ represents the number of concurrent threads, $m_{wq}$ represents the number of thread queues, $m_{obj}$ represents the maximum number of objects, $S$ represents the total amount of expired data, and $W$ represents the total cluster performance data.

4. The bucket lifecycle configuration method according to any one of claims 1 to 3, wherein the determining, based on the feasible region, an optimal configuration parameter set by using a preset multi-objective optimization algorithm with a first optimization objective and a second optimization objective comprises:

determining, based on the feasible region, the optimal configuration parameter set by using a multi-objective particle swarm optimization algorithm with the first optimization objective and the second optimization objective.

5. The bucket lifecycle configuration method of claim 4, wherein the determining, based on the feasible region, the optimal configuration parameter set by using a multi-objective particle swarm optimization algorithm with the first optimization objective and the second optimization objective comprises:

determining a population size and a maximum number of evaluations; and

determining, based on the feasible region, the population size, and the maximum number of evaluations, the optimal configuration parameter set by using the multi-objective particle swarm optimization algorithm with the first optimization objective and the second optimization objective.

6. A bucket lifecycle configuration apparatus, comprising:

a parameter feasible region determining module, configured to determine a feasible region of configuration parameters of a bucket lifecycle;

an optimal configuration parameter set determining module, configured to determine, based on the feasible region, an optimal configuration parameter set by using a preset multi-objective optimization algorithm with a first optimization objective and a second optimization objective, wherein the first optimization objective is the proportion of space occupied by the expired data remaining after expired data is processed based on the configuration parameters, relative to the total expired data, and the second optimization objective is the cluster performance occupancy rate incurred by processing the expired data based on the configuration parameters; and

a bucket lifecycle configuration module, configured to determine target configuration parameters from the optimal configuration parameter set and to configure the bucket lifecycle by using the target configuration parameters.

7. The bucket lifecycle configuration apparatus of claim 6, wherein the parameter feasible region determining module is specifically configured to:

determine feasible regions for the number of concurrent threads, the maximum number of objects, and the number of thread queues of the bucket lifecycle.

8. The bucket lifecycle configuration apparatus of claim 7, wherein the optimization objective formulas of the first and second optimization objectives are:

where $Er$ represents the first optimization objective, $Pr$ represents the second optimization objective, $m_{worker}$ represents the number of concurrent threads, $m_{wq}$ represents the number of thread queues, $m_{obj}$ represents the maximum number of objects, $S$ represents the total amount of expired data, and $W$ represents the total cluster performance data.

9. An electronic device, comprising:

a memory for storing a computer program;

a processor for executing the computer program to implement the bucket lifecycle configuration method of any of claims 1 to 5.

10. A computer-readable storage medium for storing a computer program which, when executed by a processor, implements the bucket lifecycle configuration method of any of claims 1 to 5.

Technical Field

The present application relates to the field of object storage technologies, and in particular, to a bucket life cycle configuration method, apparatus, device, and medium.

Background

In practice, some data does not need to be kept in a storage system for long, so a function that deletes data on expiration is often required. In RGW (the RADOS Gateway, where RADOS stands for Reliable Autonomic Distributed Object Store), this function is the bucket lifecycle (LC). A lifecycle rule can be set on a bucket to specify which data in the bucket expires at what time, and whether the data is deleted or transferred to another storage space after it expires. The configuration parameters of the bucket lifecycle affect how well the lifecycle performs.

In RGW, the lifecycle deletes or transfers expired objects only after traversing all object metadata in the bucket. When the amount of data exceeds a certain range, some objects may therefore not be deleted or transferred within the specified time; for example, an expired object may not be deleted or transferred until 3 days after the configured time. Space is then not released in time, which affects the performance of the storage system. Blindly increasing the transfer or deletion efficiency of the LC, however, makes the LC occupy more system resources and interferes with the processing of other services. At present, bucket lifecycle parameters are generally set from experience. Because both the LC effect and the processing of other services must be taken into account, a good choice is hard to make, and such settings adapt poorly to the diverse requirements of today's large-scale storage.

Disclosure of Invention

In view of this, an object of the present application is to provide a bucket lifecycle configuration method, apparatus, device, and medium that take into account both the bucket lifecycle effect and the processing of other services, thereby improving system performance. The specific scheme is as follows:

In a first aspect, the present application discloses a bucket lifecycle configuration method, comprising:

determining a feasible region of configuration parameters of a bucket lifecycle;

determining, based on the feasible region, an optimal configuration parameter set by using a preset multi-objective optimization algorithm with a first optimization objective and a second optimization objective, wherein the first optimization objective is the proportion of space occupied by the expired data remaining after expired data is processed based on the configuration parameters, relative to the total expired data, and the second optimization objective is the cluster performance occupancy rate incurred by processing the expired data based on the configuration parameters; and

determining target configuration parameters from the optimal configuration parameter set, and configuring the bucket lifecycle by using the target configuration parameters.

Optionally, the determining a feasible region of configuration parameters of a bucket lifecycle includes:

determining feasible regions for the number of concurrent threads, the maximum number of objects, and the number of thread queues of the bucket lifecycle.

Optionally, the optimization objective formulas of the first optimization objective and the second optimization objective are as follows:

where $Er$ represents the first optimization objective, $Pr$ represents the second optimization objective, $m_{worker}$ represents the number of concurrent threads, $m_{wq}$ represents the number of thread queues, $m_{obj}$ represents the maximum number of objects, $S$ represents the total amount of expired data, and $W$ represents the total cluster performance data.

Optionally, the determining, based on the feasible region, an optimal configuration parameter set by using a preset multi-objective optimization algorithm with a first optimization objective and a second optimization objective includes:

determining, based on the feasible region, the optimal configuration parameter set by using a multi-objective particle swarm optimization algorithm with the first optimization objective and the second optimization objective.

Optionally, the determining, based on the feasible region, the optimal configuration parameter set by using a multi-objective particle swarm optimization algorithm with the first optimization objective and the second optimization objective includes:

determining a population size and a maximum number of evaluations; and

determining, based on the feasible region, the population size, and the maximum number of evaluations, the optimal configuration parameter set by using the multi-objective particle swarm optimization algorithm with the first optimization objective and the second optimization objective.

In a second aspect, the present application discloses a bucket lifecycle configuration apparatus, comprising:

a parameter feasible region determining module, configured to determine a feasible region of configuration parameters of a bucket lifecycle;

an optimal configuration parameter set determining module, configured to determine, based on the feasible region, an optimal configuration parameter set by using a preset multi-objective optimization algorithm with a first optimization objective and a second optimization objective, wherein the first optimization objective is the proportion of space occupied by the expired data remaining after expired data is processed based on the configuration parameters, relative to the total expired data, and the second optimization objective is the cluster performance occupancy rate incurred by processing the expired data based on the configuration parameters; and

a bucket lifecycle configuration module, configured to determine target configuration parameters from the optimal configuration parameter set and to configure the bucket lifecycle by using the target configuration parameters.

Optionally, the parameter feasible region determining module is specifically configured to:

determine feasible regions for the number of concurrent threads, the maximum number of objects, and the number of thread queues of the bucket lifecycle.

Optionally, the optimization objective formulas of the first optimization objective and the second optimization objective are as follows:

where $Er$ represents the first optimization objective, $Pr$ represents the second optimization objective, $m_{worker}$ represents the number of concurrent threads, $m_{wq}$ represents the number of thread queues, $m_{obj}$ represents the maximum number of objects, $S$ represents the total amount of expired data, and $W$ represents the total cluster performance data.

In a third aspect, the present application discloses an electronic device, comprising:

a memory for storing a computer program;

a processor for executing the computer program to implement the bucket lifecycle configuration method as described above.

In a fourth aspect, the present application discloses a computer readable storage medium for storing a computer program which, when executed by a processor, implements the aforementioned bucket lifecycle configuration method.

In the present application, a feasible region of the configuration parameters of the bucket lifecycle is determined first. Based on the feasible region, an optimal configuration parameter set is then determined by using a preset multi-objective optimization algorithm with a first optimization objective and a second optimization objective, where the first optimization objective is the proportion of space occupied by the expired data remaining after expired data is processed based on the configuration parameters, relative to the total expired data, and the second optimization objective is the cluster performance occupancy rate incurred by processing the expired data based on the configuration parameters. Finally, target configuration parameters are determined from the optimal configuration parameter set and used to configure the bucket lifecycle. That is, the configuration parameters of the bucket lifecycle are the parameters being optimized, and the two quantities above are the optimization objectives. The first objective reflects the processing effect of the bucket lifecycle, and the second reflects the impact of the bucket lifecycle on other services. Optimizing both therefore takes the bucket lifecycle effect and the processing of other services into account, which improves system performance.

Drawings

To describe the embodiments of the present application or the technical solutions in the prior art more clearly, the drawings needed for the description are briefly introduced below. The drawings described below are only embodiments of the present application; those skilled in the art can obtain other drawings from them without creative effort.

FIG. 1 is a flowchart of a bucket lifecycle configuration method disclosed in the present application;

FIG. 2 is a schematic diagram of a specific multi-objective particle swarm optimization algorithm provided by the present application;

FIG. 3 is a schematic structural diagram of a bucket lifecycle configuration apparatus disclosed in the present application;

FIG. 4 is a structural diagram of an electronic device disclosed in the present application.

Detailed Description

The technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application, and it is obvious that the described embodiments are only a part of the embodiments of the present application, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present application.

In RGW, the lifecycle deletes or transfers expired objects only after traversing all object metadata in the bucket. When the amount of data exceeds a certain range, some objects may therefore not be deleted or transferred within the specified time; for example, an expired object may not be deleted or transferred until 3 days after the configured time. Space is then not released in time, which affects the performance of the storage system. Blindly increasing the transfer or deletion efficiency of the LC, however, makes the LC occupy more system resources and interferes with the processing of other services. At present, bucket lifecycle parameters are generally set from experience. Because both the LC effect and the processing of other services must be taken into account, a good choice is hard to make, and such settings adapt poorly to the diverse requirements of today's large-scale storage. The present application therefore provides a bucket lifecycle configuration scheme that takes both the bucket lifecycle effect and the processing of other services into account, thereby improving system performance.

Referring to FIG. 1, an embodiment of the present application discloses a bucket lifecycle configuration method, including:

Step S11: determining a feasible region of the configuration parameters of the bucket lifecycle.

In a specific embodiment, feasible regions may be determined for the number of concurrent threads, the maximum number of objects, and the number of thread queues of the bucket lifecycle.

That is, the embodiments of the present application take the number of concurrent threads, the maximum number of objects, and the number of thread queues, all of which affect the bucket lifecycle effect, as the parameters to be optimized.
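A feasible region for the three parameters can be pictured as a box of integer bounds. The following is a minimal sketch under that assumption; the names (`Bounds`, `FEASIBLE_REGION`, `workers`, `max_objs`, `work_queues`) and the numeric limits are hypothetical and would have to be chosen for the actual cluster.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class Bounds:
    """Inclusive integer range for one lifecycle parameter."""
    low: int
    high: int

    def clamp(self, value: float) -> int:
        """Round to the nearest integer and clip into [low, high]."""
        return max(self.low, min(self.high, int(round(value))))


# Hypothetical limits, for illustration only.
FEASIBLE_REGION = {
    "workers": Bounds(1, 16),       # number of concurrent threads
    "max_objs": Bounds(8, 1024),    # maximum number of objects
    "work_queues": Bounds(1, 32),   # number of thread queues
}


def is_feasible(config: dict) -> bool:
    """True if every parameter lies inside its bounds."""
    return all(b.low <= config[name] <= b.high
               for name, b in FEASIBLE_REGION.items())


def random_config(rng: random.Random) -> dict:
    """Draw one configuration uniformly from the feasible region."""
    return {name: rng.randint(b.low, b.high)
            for name, b in FEASIBLE_REGION.items()}
```

Keeping the bounds in one place lets the optimizer both sample initial candidates and clip updated candidates back into range.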

Step S12: determining, based on the feasible region, an optimal configuration parameter set by using a preset multi-objective optimization algorithm with a first optimization objective and a second optimization objective, where the first optimization objective is the proportion of space occupied by the expired data remaining after expired data is processed based on the configuration parameters, relative to the total expired data, and the second optimization objective is the cluster performance occupancy rate incurred by processing the expired data based on the configuration parameters.

The optimization objective formulas of the first optimization objective and the second optimization objective are as follows:

where $Er$ represents the first optimization objective, $Pr$ represents the second optimization objective, $m_{worker}$ represents the number of concurrent threads, $m_{wq}$ represents the number of thread queues, $m_{obj}$ represents the maximum number of objects, $S$ represents the total amount of expired data, and $W$ represents the total cluster performance data.

Further, W may specifically be the amount of data that can be concurrently processed by the cluster per second.
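The objective formulas themselves are not reproduced in this text, so the sketch below encodes only the two definitions given in words: Er as the share of expired data that remains after processing, relative to the total expired data S, and Pr as the share of the cluster's throughput W taken up by lifecycle processing. How the three configuration parameters determine the measured inputs is deliberately left open. The function and argument names are hypothetical.

```python
def remaining_ratio(remaining_expired: float, total_expired: float) -> float:
    """Er: fraction of expired data still present after a lifecycle pass.

    total_expired plays the role of S in the text.
    """
    if total_expired <= 0:
        return 0.0  # nothing was expired, so nothing remains
    return min(1.0, max(0.0, remaining_expired / total_expired))


def performance_occupancy(lc_throughput: float, cluster_throughput: float) -> float:
    """Pr: fraction of cluster throughput consumed by lifecycle processing.

    cluster_throughput plays the role of W, e.g. the amount of data the
    cluster can process concurrently per second.
    """
    if cluster_throughput <= 0:
        raise ValueError("cluster throughput W must be positive")
    return min(1.0, max(0.0, lc_throughput / cluster_throughput))
```

Both values fall in [0, 1], and lower is better for each, which is what makes them usable as a pair of minimization objectives.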

In a specific embodiment, the optimal configuration parameter set is determined, based on the feasible region, by using a MOPSO (Multi-Objective Particle Swarm Optimization) algorithm with the first optimization objective and the second optimization objective.

Further, the embodiment of the present application determines a population size and a maximum number of evaluations, and then determines, based on the feasible region, the population size, and the maximum number of evaluations, the optimal configuration parameter set by using the multi-objective particle swarm optimization algorithm with the first optimization objective and the second optimization objective.

Specifically, the multi-objective particle swarm optimization algorithm determines the optimal configuration parameter set as follows. The inputs are the feasible regions of the LC's number of concurrent threads, maximum number of objects, and number of thread queues, together with the population size P and the maximum number of evaluations maxFES.

Step 1: initialize the population by randomly generating P particles, where each particle represents a feasible solution (that is, a configuration parameter group comprising a number of concurrent threads, a maximum number of objects, and a number of thread queues); configure the bucket lifecycle function with each particle's solution, and calculate its fitness values by using the optimization objective formulas;

Step 2: screen the particles, determine the historical best pbest of each particle, and find the global best gbest;

During the first iteration, the historical best pbest of each particle is initialized to the particle itself, and the global best gbest is then found.

Step 3: update the velocity and position of each particle according to the velocity and position formulas, and evaluate each particle by using the optimization objective formulas to obtain a new particle swarm;

Step 4: update the best solutions, that is, update gbest and the historical best pbest;

Step 5: if the maximum number of evaluations has been reached, output the Pareto solution set of the population; otherwise, return to Step 2;

Step 6: obtain the Pareto solution set, that is, the optimal parameter pool.
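The six steps can be sketched as a simplified MOPSO loop. Everything numeric here is an assumption made for illustration: the bounds, the coefficients `w`, `c1`, and `c2`, and above all `evaluate`, which is a toy surrogate standing in for configuring the lifecycle and measuring Er and Pr on a real cluster. The leader (gbest) is drawn uniformly from an archive of non-dominated solutions, which is one common simplification and not necessarily the selection rule used in the application.

```python
import random

# Hypothetical bounds for (workers, max_objs, work_queues).
BOUNDS = [(1, 16), (8, 1024), (1, 32)]


def evaluate(x):
    """Toy surrogate returning (Er, Pr); NOT the application's formulas."""
    workers, max_objs, queues = x
    capacity = workers * queues * max_objs
    er = 1.0 / (1.0 + capacity / 5000.0)  # less left over as capacity grows
    pr = min(1.0, workers * queues / 512.0 + max_objs / 4096.0)  # more load
    return (er, pr)


def dominates(a, b):
    """True if a is no worse than b in every objective and better in one."""
    return all(p <= q for p, q in zip(a, b)) and any(p < q for p, q in zip(a, b))


def update_archive(archive, x, f):
    """Insert (x, f) and keep only mutually non-dominated entries."""
    if any(dominates(g, f) or g == f for _, g in archive):
        return archive
    return [(y, g) for y, g in archive if not dominates(f, g)] + [(tuple(x), f)]


def clamp(v, lo, hi):
    return max(lo, min(hi, int(round(v))))


def mopso(pop_size=20, max_fes=2000, w=0.5, c1=1.5, c2=1.5, seed=0):
    rng = random.Random(seed)
    # Step 1: random feasible particles with zero velocity, then evaluate.
    xs = [[rng.randint(lo, hi) for lo, hi in BOUNDS] for _ in range(pop_size)]
    vs = [[0.0] * len(BOUNDS) for _ in range(pop_size)]
    fs = [evaluate(x) for x in xs]
    fes = pop_size
    # Step 2: pbest starts as the particle itself; the archive holds leaders.
    pbest = [(list(x), f) for x, f in zip(xs, fs)]
    archive = []
    for x, f in zip(xs, fs):
        archive = update_archive(archive, x, f)
    while fes < max_fes:  # Step 5: loop until the evaluation budget is spent
        for i in range(pop_size):
            if fes >= max_fes:
                break
            gbest = rng.choice(archive)[0]
            for d, (lo, hi) in enumerate(BOUNDS):
                # Step 3: standard PSO velocity and position update.
                vs[i][d] = (w * vs[i][d]
                            + c1 * rng.random() * (pbest[i][0][d] - xs[i][d])
                            + c2 * rng.random() * (gbest[d] - xs[i][d]))
                xs[i][d] = clamp(xs[i][d] + vs[i][d], lo, hi)
            fs[i] = evaluate(xs[i])
            fes += 1
            # Step 4: update pbest (coin flip when neither dominates) and archive.
            if dominates(fs[i], pbest[i][1]):
                pbest[i] = (list(xs[i]), fs[i])
            elif not dominates(pbest[i][1], fs[i]) and rng.random() < 0.5:
                pbest[i] = (list(xs[i]), fs[i])
            archive = update_archive(archive, xs[i], fs[i])
    return archive  # Step 6: the Pareto set, i.e. the optimal parameter pool
```

Calling `mopso()` returns a list of `(configuration, (Er, Pr))` pairs in which no entry dominates another. In a real deployment each call to `evaluate` would be an expensive measurement, which is why the budget is expressed as a maximum number of evaluations rather than a number of iterations.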

It should be noted that the multi-objective particle swarm optimization algorithm can find, in the decision space, a set of non-dominated Pareto solutions corresponding to an approximate PF (Pareto front). The decision maker selects a solution from this set according to the actual situation. FIG. 2 is a schematic diagram of a specific multi-objective particle swarm optimization algorithm provided by the present application. In the left and right graphs of FIG. 2, circles represent the Pareto optimal solutions and the optimal objective results, respectively, and squares represent the non-optimal solutions and the non-optimal objective results; $f_1$ and $f_2$ denote Er and Pr in the present application, respectively.
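For reference, the notion of a non-dominated solution can be written out as follows. This is the standard definition of Pareto dominance, under the assumption (implied but not stated explicitly in the text) that both objectives, $f_1 = Er$ and $f_2 = Pr$, are to be minimized. The symbol $\Omega$ is introduced here to denote the feasible region.

```latex
x \prec y \iff
  \bigl(\forall i \in \{1,2\}:\ f_i(x) \le f_i(y)\bigr)
  \ \wedge\
  \bigl(\exists j \in \{1,2\}:\ f_j(x) < f_j(y)\bigr)

P^{*} = \{\, x \in \Omega \;:\; \nexists\, y \in \Omega \text{ such that } y \prec x \,\}
```

The Pareto front is the image of $P^{*}$ under $(f_1, f_2)$. No member of $P^{*}$ can improve one objective without worsening the other, which is why a final choice among them is left to the decision maker.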

That is, the embodiments of the present application take the bucket lifecycle effect and the cluster performance as objectives, and the number of concurrent threads, the maximum number of objects, and the number of thread queues as the parameters to be optimized. The bucket lifecycle optimization problem is thereby converted into a multi-objective optimization problem, which is solved with the multi-objective particle swarm optimization algorithm to obtain the optimal configuration parameter set.

Step S13: determining target configuration parameters from the optimal configuration parameter set, and configuring the bucket lifecycle by using the target configuration parameters.

In a specific implementation, a user can configure the cluster from the optimal solution set according to the user's own requirements. That is, one combination of the number of concurrent threads, the maximum number of objects, and the number of thread queues is selected from the optimal solution set and used to configure the bucket lifecycle, so that the cluster's performance is put to better use.
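One way a user requirement could drive this selection is a cap on acceptable performance occupancy: among the Pareto solutions whose Pr fits the cap, take the one with the lowest Er. This is only one possible selection policy, sketched under that assumption; the function name `pick_target`, the tuple layout, and the example numbers are hypothetical.

```python
def pick_target(pool, max_occupancy):
    """Return the (config, (Er, Pr)) entry with the lowest Er whose Pr fits the cap.

    pool is a list of ((workers, max_objs, work_queues), (Er, Pr)) pairs.
    Returns None when no solution satisfies the cap.
    """
    eligible = [(cfg, (er, pr)) for cfg, (er, pr) in pool if pr <= max_occupancy]
    if not eligible:
        return None
    return min(eligible, key=lambda item: item[1][0])


# Made-up Pareto pool for illustration.
example_pool = [
    ((2, 64, 2), (0.60, 0.05)),
    ((4, 128, 4), (0.30, 0.15)),
    ((8, 256, 8), (0.10, 0.40)),
]

chosen = pick_target(example_pool, max_occupancy=0.20)
```

With a cap of 0.20, the third entry is excluded for exceeding the cap and the second is chosen because it leaves less expired data behind than the first. A cluster with more headroom could raise the cap and trade occupancy for faster cleanup.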

As can be seen, the embodiment of the present application first determines a feasible region of the configuration parameters of the bucket lifecycle. Based on the feasible region, it then determines an optimal configuration parameter set by using a preset multi-objective optimization algorithm with a first optimization objective and a second optimization objective, where the first optimization objective is the proportion of space occupied by the expired data remaining after expired data is processed based on the configuration parameters, relative to the total expired data, and the second optimization objective is the cluster performance occupancy rate incurred by processing the expired data based on the configuration parameters. Finally, it determines target configuration parameters from the optimal configuration parameter set and uses them to configure the bucket lifecycle. That is, the configuration parameters of the bucket lifecycle are the parameters being optimized, and the two quantities above are the optimization objectives. The first objective reflects the processing effect of the bucket lifecycle, and the second reflects the impact of the bucket lifecycle on other services. Optimizing both therefore takes the bucket lifecycle effect and the processing of other services into account, which improves system performance.

Referring to FIG. 3, an embodiment of the present application discloses a bucket lifecycle configuration apparatus, including:

a parameter feasible region determining module 11, configured to determine a feasible region of the configuration parameter of the bucket life cycle;

an optimal configuration parameter set determining module 12, configured to determine an optimal configuration parameter set according to the first optimization objective and the second optimization objective by using a preset multi-objective optimization algorithm based on the feasible region; the first optimization target is the space proportion of the residual expired data relative to the total expired data after the expired data is processed based on the configuration parameters; the second optimization target is the cluster performance occupancy rate corresponding to the processing of the expired data based on the configuration parameters;

and a bucket life cycle configuration module 13, configured to determine a target configuration parameter from the optimal configuration parameter set, and configure a bucket life cycle by using the target configuration parameter.

As can be seen, the apparatus of this embodiment likewise determines a feasible region of the configuration parameters of the bucket lifecycle, determines an optimal configuration parameter set over that region by using a preset multi-objective optimization algorithm with the first and second optimization objectives, and configures the bucket lifecycle with target configuration parameters chosen from that set. The first optimization objective, the proportion of space occupied by the remaining expired data relative to the total expired data, reflects the processing effect of the bucket lifecycle. The second, the cluster performance occupancy rate incurred by processing the expired data, reflects the impact of the bucket lifecycle on other services. The apparatus therefore takes both the bucket lifecycle effect and the processing of other services into account, which improves system performance.

The parameter feasible region determining module 11 is specifically configured to:

determine feasible regions for the number of concurrent threads, the maximum number of objects, and the number of thread queues of the bucket lifecycle.

Correspondingly, the optimization objective formulas of the first optimization objective and the second optimization objective are as follows:

where $Er$ represents the first optimization objective, $Pr$ represents the second optimization objective, $m_{worker}$ represents the number of concurrent threads, $m_{wq}$ represents the number of thread queues, $m_{obj}$ represents the maximum number of objects, $S$ represents the total amount of expired data, and $W$ represents the total cluster performance data.

In a specific embodiment, the optimal configuration parameter set determining module 12 is specifically configured to determine the optimal configuration parameter set by using a multi-objective particle swarm optimization algorithm according to the first optimization objective and the second optimization objective based on the feasible region.

Further, the optimal configuration parameter set determining module 12 is specifically configured to determine a population size and a maximum number of evaluations, and to determine, based on the feasible region, the population size, and the maximum number of evaluations, the optimal configuration parameter set by using the multi-objective particle swarm optimization algorithm with the first optimization objective and the second optimization objective.

Referring to FIG. 4, an embodiment of the present application discloses an electronic device 20, which includes a processor 21 and a memory 22. The memory 22 is used to store a computer program, and the processor 21 is configured to execute the computer program to implement the bucket lifecycle configuration method disclosed in the foregoing embodiments.

For the specific process of the bucket life cycle configuration method, reference may be made to the corresponding contents disclosed in the foregoing embodiments, and details are not repeated here.

The memory 22 is used as a carrier for resource storage, and may be a read-only memory, a random access memory, a magnetic disk or an optical disk, and the storage mode may be a transient storage mode or a permanent storage mode.

In addition, the electronic device 20 further includes a power supply 23, a communication interface 24, an input-output interface 25, and a communication bus 26; the power supply 23 is configured to provide an operating voltage for each hardware device on the electronic device 20; the communication interface 24 can create a data transmission channel between the electronic device 20 and an external device, and a communication protocol followed by the communication interface is any communication protocol applicable to the technical solution of the present application, and is not specifically limited herein; the input/output interface 25 is configured to obtain external input data or output data to the outside, and a specific interface type thereof may be selected according to a specific application requirement, which is not specifically limited herein.

Further, the present application also discloses a computer readable storage medium for storing a computer program, wherein the computer program, when executed by a processor, implements the bucket lifecycle configuration method disclosed in the foregoing embodiments.

For the specific process of the bucket life cycle configuration method, reference may be made to the corresponding contents disclosed in the foregoing embodiments, and details are not repeated here.

The embodiments in this specification are described progressively: each embodiment focuses on its differences from the others, and the same or similar parts can be cross-referenced between embodiments. Because the apparatus disclosed in the embodiments corresponds to the method disclosed in the embodiments, its description is brief, and the relevant details can be found in the description of the method.

The steps of a method or algorithm described in connection with the embodiments disclosed herein may be embodied directly in hardware, in a software module executed by a processor, or in a combination of the two. A software module may reside in Random Access Memory (RAM), memory, Read Only Memory (ROM), electrically programmable ROM, electrically erasable programmable ROM, registers, hard disk, a removable disk, a CD-ROM, or any other form of storage medium known in the art.

The bucket lifecycle configuration method, apparatus, device, and medium provided by the present application have been described in detail above. Specific examples are used herein to explain the principles and embodiments of the present application, and the description of the above embodiments is intended only to help in understanding the method and core ideas of the present application. Meanwhile, those skilled in the art may, following the ideas of the present application, make changes to the specific embodiments and the scope of application. In summary, the content of this specification should not be construed as limiting the present application.
