Data aggregation method, device, equipment and computer readable storage medium

Document No.: 1952030 Publication date: 2021-12-10

Abstract: This technology, "Data aggregation method, device, equipment and computer readable storage medium", was designed by 任磊武模仁何文龙 on 2021-06-30. The invention discloses a data aggregation method, apparatus, device and computer-readable storage medium. With the multi-version function enabled, for received target data whose storage capacity is smaller than a preset threshold, the method aggregates the body data of the target data into the corresponding aggregated data, packs the version number of the target data together with the position of its body data in the aggregated data into the bitmap information of the aggregated data, and adds the identity information of the aggregated data to the metadata of the target data. Because the version number of each target data is unique, a user can quickly and accurately find the specific position of the body data according to the version number even if target data with the same name exists. The application can therefore distinguish same-name target data with different version numbers, so same-name data is no longer stored by overwriting, which reduces the risk of data loss and improves the user experience.

1. A method for data aggregation, comprising:

when target data with the storage capacity smaller than a preset threshold value is received, aggregating body data of the target data to aggregated data corresponding to the target data;

judging whether a multi-version function of the distributed storage system is started or not;

if so, integrating the position of the body data of the target data in the aggregated data and the version number of the target data into a struct and adding the struct to bitmap information of the aggregated data, so that the specific position of the body data in the aggregated data can be found in the bitmap information through the version number;

adding identity information of the aggregated data in metadata of the target data so as to determine the aggregated data where the body data of the target data is located through the identity information;

wherein the storage capacity of the aggregated data is greater than the preset threshold.

2. The data aggregation method according to claim 1, wherein after determining whether the multi-version function of the distributed storage system is enabled, the data aggregation method further comprises:

if not, adding the position of the body data of the target data in the aggregated data to the bitmap information and executing the step of adding the identity information of the aggregated data in the metadata of the target data.

3. The data aggregation method according to claim 2, wherein the determining whether the multi-version function of the distributed storage system is enabled specifically comprises:

judging whether the metadata of the target data has a version number or not;

if yes, the multi-version function of the distributed storage system is judged to be started.

4. The data aggregation method of claim 3, further comprising:

and responding to a parameter modification instruction, and modifying the preset threshold according to the parameter modification instruction.

5. The data aggregation method of claim 1, wherein the identity information of the aggregated data is at least one of a name, a creation date, and a storage location.

6. The data aggregation method of claim 1, wherein the distributed storage system is a distributed object storage system.

7. The data aggregation method according to any one of claims 1 to 6, further comprising:

in response to a deletion instruction, determining metadata of the target data according to the identity information of the target data specified in the deletion instruction;

determining the aggregated data where the body data of the target data specified in the deletion instruction is located according to the determined identity information of the aggregated data in the metadata;

according to the version number in the determined metadata, searching the bitmap information of the determined aggregated data for the position of the body data of the target data specified in the deletion instruction;

locating and deleting the body data of the target data specified in the deletion instruction in the determined aggregated data according to the found position;

deleting the determined metadata of the target data;

and deleting the struct corresponding to the found position from the bitmap information of the determined aggregated data.

8. A data aggregation apparatus, comprising:

the aggregation module is used for aggregating body data of the target data to aggregated data corresponding to the target data when the target data with the storage capacity smaller than a preset threshold value is received;

the judging module is used for judging whether the multi-version function of the distributed storage system is started or not, and if so, the updating module is triggered;

the updating module is used for integrating the position of the body data of the target data in the aggregated data and the version number of the target data into a struct and adding the struct to the bitmap information of the aggregated data, so that the specific position of the body data in the aggregated data can be found in the bitmap information through the version number;

the adding module is used for adding the identity information of the aggregated data in the metadata of the target data so as to determine the aggregated data where the body data of the target data is located through the identity information;

wherein the storage capacity of the aggregated data is greater than the preset threshold.

9. A data aggregation device, comprising:

a memory for storing a computer program;

a processor for implementing the steps of the data aggregation method according to any one of claims 1 to 7 when executing the computer program.

10. A computer-readable storage medium, having stored thereon a computer program which, when being executed by a processor, carries out the steps of the data aggregation method according to any one of claims 1 to 7.

Technical Field

The invention relates to the field of distributed storage systems, in particular to a data aggregation method, a data aggregation device, data aggregation equipment and a computer readable storage medium.

Background

In a distributed storage system, in order to improve the utilization of storage space, an aggregation function may be used to aggregate multiple target data with a small storage capacity into one aggregated data with a large storage capacity, and the aggregated data is then stored. However, when the multi-version function of the distributed storage system is turned on, a user may create multiple data with the same name. Aggregation that locates data by name cannot tell such data apart, so same-name data is stored by overwriting, which creates a risk of data loss.

Therefore, how to provide a solution to the above technical problem is a problem that needs to be solved by those skilled in the art.

Disclosure of Invention

The object of the invention is to provide a data aggregation method that no longer stores same-name data by overwriting during aggregation, thereby reducing the risk of data loss and improving the user experience; another object of the invention is to provide a data aggregation apparatus, device and computer-readable storage medium that achieve the same effect.

In order to solve the above technical problem, the present invention provides a data aggregation method, including:

when target data with the storage capacity smaller than a preset threshold value is received, aggregating body data of the target data to aggregated data corresponding to the target data;

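judging whether a multi-version function of the distributed storage system is started or not;

if so, integrating the position of the body data of the target data in the aggregated data and the version number of the target data into a struct and adding the struct to bitmap information of the aggregated data, so that the specific position of the body data in the aggregated data can be found in the bitmap information through the version number;

adding identity information of the aggregated data in metadata of the target data so as to determine the aggregated data where the body data of the target data is located through the identity information;

wherein the storage capacity of the aggregated data is greater than the preset threshold.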

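Preferably, after determining whether the multi-version function of the distributed storage system is turned on, the data aggregation method further includes:

if not, adding the position of the body data of the target data in the aggregated data to the bitmap information and executing the step of adding the identity information of the aggregated data in the metadata of the target data.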

Preferably, the judging whether the multi-version function of the distributed storage system is started specifically includes:

judging whether the metadata of the target data has a version number or not;

if yes, the multi-version function of the distributed storage system is judged to be started.

Preferably, the data aggregation method further includes:

and responding to a parameter modification instruction, and modifying the preset threshold according to the parameter modification instruction.

Preferably, the identity information of the aggregated data is specifically at least one of a name, a creation date, and a storage location.

Preferably, the distributed storage system is a distributed object storage system.

Preferably, the data aggregation method further includes:

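in response to a deletion instruction, determining metadata of the target data according to the identity information of the target data specified in the deletion instruction;

determining the aggregated data where the body data of the target data specified in the deletion instruction is located according to the identity information of the aggregated data in the determined metadata;

according to the version number in the determined metadata, searching the bitmap information of the determined aggregated data for the position of the body data of the target data specified in the deletion instruction;

locating and deleting the body data of the target data specified in the deletion instruction in the determined aggregated data according to the found position;

deleting the determined metadata of the target data;

and deleting the struct corresponding to the found position from the bitmap information of the determined aggregated data.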

In order to solve the above technical problem, the present invention further provides a data aggregation apparatus, including:

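the aggregation module is used for aggregating body data of the target data to aggregated data corresponding to the target data when the target data with the storage capacity smaller than a preset threshold value is received;

the judging module is used for judging whether the multi-version function of the distributed storage system is started or not, and if so, the updating module is triggered;

the updating module is used for integrating the position of the body data of the target data in the aggregated data and the version number of the target data into a struct and adding the struct to the bitmap information of the aggregated data, so that the specific position of the body data in the aggregated data can be found in the bitmap information through the version number;

the adding module is used for adding the identity information of the aggregated data in the metadata of the target data so as to determine the aggregated data where the body data of the target data is located through the identity information;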

wherein the storage capacity of the aggregated data is greater than the preset threshold.

In order to solve the above technical problem, the present invention further provides a data aggregation device, including:

a memory for storing a computer program;

a processor for implementing the steps of the data aggregation method as described above when executing the computer program.

To solve the above technical problem, the present invention further provides a computer-readable storage medium, having a computer program stored thereon, where the computer program, when executed by a processor, implements the steps of the data aggregation method as described above.

The invention provides a data aggregation method. With the multi-version function turned on, for target data whose storage capacity is smaller than a preset threshold, the method aggregates the body data of the target data into the corresponding aggregated data, packs the version number of the target data together with the position of its body data in the aggregated data into the bitmap information of the aggregated data, and adds the identity information of the aggregated data to the metadata of the target data. Because the version number of each target data is unique, a user can quickly and accurately find the specific position of the body data according to the version number even if target data with the same name exists. The method can therefore distinguish same-name target data with different version numbers, so same-name data is no longer stored by overwriting, which reduces the risk of data loss and improves the user experience.

The invention also provides a data aggregation device, equipment and a computer readable storage medium, which have the same beneficial effects as the data aggregation method.

Drawings

In order to illustrate the technical solutions in the embodiments of the present invention more clearly, the drawings needed for describing the prior art and the embodiments are briefly introduced below. The drawings described below show only some embodiments of the present invention, and those skilled in the art can obtain other drawings from them without creative effort.

FIG. 1 is a schematic flow chart of a data aggregation method according to the present invention;

FIG. 2 is a schematic structural diagram of a data aggregation apparatus according to the present invention;

FIG. 3 is a schematic structural diagram of a data aggregation device according to the present invention.

Detailed Description

The core of the invention is to provide a data aggregation method that does not store same-name data by overwriting during aggregation, which reduces the risk of data loss and improves the user experience; another core of the invention is to provide a data aggregation apparatus, device and computer-readable storage medium that likewise no longer overwrite same-name data during aggregation, thereby reducing the risk of data loss and improving the user experience.

In order to make the objects, technical solutions and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are some, but not all, embodiments of the present invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.

Referring to FIG. 1, FIG. 1 is a schematic flow chart of a data aggregation method provided by the present invention. The data aggregation method includes:

S101: when target data with the storage capacity smaller than a preset threshold value is received, aggregating body data of the target data to aggregated data corresponding to the target data;

Specifically, in view of the technical problem described in the background, the present invention aims to aggregate each small data (data whose storage capacity is smaller than the preset threshold) in a distributed storage system with the multi-version function into aggregated data, without overwriting same-name data in the process, so as to prevent data loss. In this step, therefore, when target data with a storage capacity smaller than the preset threshold is received, the body data of the target data is aggregated into the aggregated data corresponding to the target data regardless of whether data with the same name already exists. All small data is aggregated in this way, and the measures that distinguish same-name data are embodied in the subsequent steps.
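The aggregation in S101 can be illustrated with a minimal sketch. It is not the patented implementation: the `Aggregate` class, the `is_small` helper, the in-memory `bytearray` and the two size constants (taken from the example values of 512 KB and 4 MB mentioned in this description) are all assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import Tuple

PRESET_THRESHOLD = 512 * 1024          # assumed small-data threshold (512 KB)
AGGREGATE_CAPACITY = 4 * 1024 * 1024   # assumed aggregate size (4 MB)


@dataclass
class Aggregate:
    """Hypothetical in-memory stand-in for one piece of aggregated data."""
    name: str
    body: bytearray = field(default_factory=bytearray)

    def append(self, payload: bytes) -> Tuple[int, int]:
        """Append body data and return its (offset, length) in the aggregate."""
        if len(self.body) + len(payload) > AGGREGATE_CAPACITY:
            raise ValueError("aggregate is full")
        offset = len(self.body)
        self.body.extend(payload)
        return offset, len(payload)


def is_small(payload: bytes, threshold: int = PRESET_THRESHOLD) -> bool:
    """Only data strictly smaller than the threshold is aggregated."""
    return len(payload) < threshold


agg = Aggregate("agg-0001")
first = agg.append(b"v1 of report.txt")
second = agg.append(b"v2 of report.txt")  # same name: appended, not overwritten
```

Because every body is appended at a fresh offset, a second upload under the same name never touches the bytes of the first; telling the two apart is left to the later steps.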

S102: judging whether a multi-version function of the distributed storage system is started or not;

Specifically, because multiple data with the same name may exist when the multi-version function is turned on, and the two cases are handled differently, this step judges whether the multi-version function of the distributed storage system is turned on, so that the subsequent action can be triggered according to the result.

S103: if so, integrating the position of the body data of the target data in the aggregated data and the version number of the target data into a struct and adding the struct to the bitmap information of the aggregated data, so that the specific position of the body data in the aggregated data can be found in the bitmap information through the version number;

Specifically, when the multi-version function is turned on, the present application distinguishes each target data by its unique version number (i.e., ID). In principle, once the position of target data is found in the bitmap information, the body data of that target data can be located in the aggregated data according to that position. However, if multiple small data with the same name are recorded in the bitmap information, the "position" data can no longer be found by name. In this step, therefore, the "position" data of the target data and its unique version number are integrated into a struct and added to the bitmap information of the aggregated data. The "position" data bound to a version number can then be found in the bitmap information using only the version number of the target data, and the "position" data can finally be used to find the body data in the aggregated data.
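A minimal sketch of the struct described in S103 is given below. The field names `version`, `offset` and `length`, the `AggregateBitmap` class, and the choice of a dictionary keyed by version number are assumptions for illustration; the description only states that position and version number are bound together and added to the bitmap information.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class BitmapEntry:
    """Struct binding a version number to a position (hypothetical layout)."""
    version: str   # unique version number (ID) of the target data
    offset: int    # where the body data starts inside the aggregated data
    length: int    # size of the body data in bytes


class AggregateBitmap:
    """Bitmap information of one aggregated data, indexed by version number."""

    def __init__(self) -> None:
        self._entries: Dict[str, BitmapEntry] = {}

    def add(self, version: str, offset: int, length: int) -> BitmapEntry:
        if version in self._entries:
            raise ValueError(f"version {version!r} is already recorded")
        entry = BitmapEntry(version, offset, length)
        self._entries[version] = entry
        return entry

    def locate(self, version: str) -> BitmapEntry:
        return self._entries[version]


bitmap = AggregateBitmap()
bitmap.add("ver-001", offset=0, length=16)    # first upload of "report.txt"
bitmap.add("ver-002", offset=16, length=16)   # second upload, same name
```

Since the object name is not part of the lookup key, two uploads named "report.txt" occupy two separate entries and each remains reachable through its own version number.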

S104: adding identity information of the aggregated data in metadata of the target data so as to determine the aggregated data where the body data of the target data is located through the identity information;

wherein the storage capacity of the aggregated data is greater than the preset threshold.

Specifically, when a user searches for target data, the metadata of the target data is first found through the identity information of the target data. The identity information of the aggregated data recorded in that metadata then leads to the bitmap information of the corresponding aggregated data, and the "position" data is determined from the bitmap information through the version number in the metadata of the target data, so that the body data of the target data is finally found in the aggregated data.
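The lookup chain described above (metadata, then aggregated data, then bitmap information, then body data) can be sketched as follows. The three dictionaries, the `(name, version)` metadata key and the `read_body` function are hypothetical stand-ins for the storage system's real indexes, which the description does not specify.

```python
from typing import Dict, Tuple

Key = Tuple[str, str]  # (object name, version number): assumed metadata key

# Metadata of each target data records its version number and the identity
# information (here simply a name) of the aggregated data holding its body.
metadata: Dict[Key, dict] = {
    ("report.txt", "v1"): {"version": "v1", "aggregate_id": "agg-0001"},
    ("report.txt", "v2"): {"version": "v2", "aggregate_id": "agg-0001"},
}
aggregates: Dict[str, bytes] = {"agg-0001": b"first draftsecond draft"}
# Bitmap information per aggregate: version number -> (offset, length).
bitmaps: Dict[str, Dict[str, Tuple[int, int]]] = {
    "agg-0001": {"v1": (0, 11), "v2": (11, 12)},
}


def read_body(name: str, version: str) -> bytes:
    meta = metadata[(name, version)]                   # find the metadata
    agg_id = meta["aggregate_id"]                      # identity of the aggregate
    offset, length = bitmaps[agg_id][meta["version"]]  # position via version
    return aggregates[agg_id][offset:offset + length]  # body data


v1_body = read_body("report.txt", "v1")
v2_body = read_body("report.txt", "v2")
```

Both versions share one name and one aggregate, yet each read returns its own bytes because the final step is keyed by version number.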

On the basis of the above-described embodiment:

As a preferred embodiment, after determining whether the multi-version function of the distributed storage system is turned on, the data aggregation method further includes:

if not, adding the position of the body data of the target data in the aggregated data to the bitmap information and executing the step of adding the identity information of the aggregated data in the metadata of the target data.

Specifically, when the multi-version function is not turned on, newly uploaded target data with the same name as existing data does not exist in the distributed storage system, so the possibility of overwriting earlier same-name target data need not be considered. In this case, the position of the body data of the target data in the aggregated data can be added directly to the bitmap information, and the step of adding the identity information of the aggregated data in the metadata of the target data is then executed. In a subsequent search, the unique "position" data of the target data can be found in the bitmap information through the name in the metadata of the target data, without distinguishing by version number.
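The two branches can be contrasted in a short sketch. The `record_position` function, the plain dictionary used as bitmap information, and the choice of the object name as the key in the non-versioned branch are assumptions for illustration only.

```python
from typing import Optional


def record_position(bitmap: dict, name: str, offset: int, length: int,
                    multi_version: bool, version: Optional[str] = None) -> None:
    """Record where a body sits in the aggregate (hypothetical helper)."""
    if multi_version:
        if version is None:
            raise ValueError("a version number is required when versioning is on")
        # Versioning on: bind position and version number together in a struct.
        bitmap[version] = {"version": version, "offset": offset, "length": length}
    else:
        # Versioning off: names are unique, so the position alone is stored
        # and later found through the name.
        bitmap[name] = (offset, length)


plain_bitmap: dict = {}
record_position(plain_bitmap, "a.txt", 0, 5, multi_version=False)

versioned_bitmap: dict = {}
record_position(versioned_bitmap, "a.txt", 0, 5, multi_version=True, version="v1")
record_position(versioned_bitmap, "a.txt", 5, 7, multi_version=True, version="v2")
```

With versioning on, the two uploads of "a.txt" produce two entries; with versioning off, a single name-keyed entry is sufficient.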

As a preferred embodiment, the specific step of determining whether the multi-version function of the distributed storage system is activated is:

judging whether the metadata of the target data has a version number or not;

if yes, the multi-version function of the distributed storage system is judged to be started.

Specifically, the judgment method is simple, rapid and accurate.

Of course, besides the above manner, the judgment may also be made by checking whether the configuration parameters of the bucket corresponding to the target data set the multi-version function to on; the embodiment of the present invention is not limited herein.
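Both ways of making the judgment can be sketched as small predicates. The dictionary shapes and the keys `"version"` and `"versioning_enabled"` are hypothetical; the description does not name the actual metadata field or bucket configuration parameter.

```python
def enabled_by_metadata(object_metadata: dict) -> bool:
    """Preferred manner: versioning is on if the metadata carries a version number."""
    return object_metadata.get("version") is not None


def enabled_by_bucket_config(bucket_config: dict) -> bool:
    """Alternative manner: read a (hypothetical) flag from the bucket configuration."""
    return bool(bucket_config.get("versioning_enabled", False))


with_version = enabled_by_metadata({"name": "a.txt", "version": "v7"})
without_version = enabled_by_metadata({"name": "a.txt"})
from_bucket = enabled_by_bucket_config({"versioning_enabled": True})
```

The metadata check needs no extra lookup because the metadata of the target data is already at hand during aggregation, which is consistent with the description calling it simple and rapid.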

As a preferred embodiment, the data aggregation method further includes:

and responding to the parameter modification instruction, and modifying the preset threshold according to the parameter modification instruction.

Specifically, in order to facilitate a user to modify the preset threshold independently, a modification interface is opened in the embodiment of the present invention, and the user can modify the preset threshold through a parameter modification instruction, so that the work efficiency and the user experience are improved.

The preset threshold may take various values, for example 512 KB; the embodiment of the present invention is not limited herein.

Specifically, the size of the aggregated data may also be set autonomously, for example, the size may be set to 4MB, and the embodiment of the present invention is not limited herein.

The parameter modification instruction may be sent by a user through a human-computer interaction device, and the human-computer interaction device may be of various types, for example, may be a mobile terminal, and the embodiment of the present invention is not limited herein.
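A minimal sketch of handling a parameter modification instruction follows. The `AggregationConfig` class, the dictionary form of the instruction and its `"preset_threshold"` key are assumptions; the check that the threshold stays below the aggregate size reflects the stated condition that the storage capacity of the aggregated data is greater than the preset threshold.

```python
class AggregationConfig:
    """Hypothetical holder for the two tunable sizes mentioned in the text."""

    def __init__(self, preset_threshold: int = 512 * 1024,
                 aggregate_size: int = 4 * 1024 * 1024) -> None:
        self.preset_threshold = preset_threshold
        self.aggregate_size = aggregate_size

    def apply(self, instruction: dict) -> None:
        """Apply a parameter modification instruction (assumed dict format)."""
        new_threshold = instruction["preset_threshold"]
        if not isinstance(new_threshold, int) or new_threshold <= 0:
            raise ValueError("threshold must be a positive integer")
        if new_threshold >= self.aggregate_size:
            # The aggregated data must remain larger than the threshold.
            raise ValueError("threshold must stay below the aggregate size")
        self.preset_threshold = new_threshold


config = AggregationConfig()
config.apply({"preset_threshold": 256 * 1024})
```

Rejecting an invalid value before assignment leaves the previous threshold in force, so a bad instruction cannot leave the system in an inconsistent state.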

As a preferred embodiment, the identity information of the aggregated data is specifically at least one of a name, a creation date, and a storage location.

Specifically, at least one of the name, the creation date, and the storage location is enough to find the aggregated data accurately, while occupying only a small amount of data.

Of course, besides at least one of the name, the creation date, and the storage location, the identity information of the aggregated data may be of other types, and the embodiment of the present invention is not limited herein.

In a preferred embodiment, the distributed storage system is a distributed object storage system.

Specifically, distributed object storage systems are widely used.

Of course, the distributed storage system may be other types besides the distributed object storage system, for example, a distributed file storage system, and the like, and the embodiment of the present invention is not limited herein.

As a preferred embodiment, the data aggregation method further includes:

in response to a deletion instruction, determining metadata of the target data according to the identity information of the target data specified in the deletion instruction;

determining the aggregated data where the body data of the target data specified in the deletion instruction is located according to the identity information of the aggregated data in the determined metadata;

according to the version number in the determined metadata, searching the bitmap information of the determined aggregated data for the position of the body data of the target data specified in the deletion instruction;

locating and deleting the body data of the target data specified in the deletion instruction in the determined aggregated data according to the found position;

deleting the determined metadata of the target data;

and deleting the struct corresponding to the found position from the bitmap information of the determined aggregated data.

Specifically, deleting target data also requires finding it first; that process has been described above and is not repeated here. Once the body data of the target data is found, it can be deleted directly, and the remaining metadata and the corresponding struct in the bitmap information can be deleted as well. The deletion is thus complete, which reduces space occupation and improves the user experience.
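The deletion flow can be sketched end to end as follows. The dictionaries, the `(name, version)` metadata key and the `delete_object` function are hypothetical. Zero-filling the deleted range is also an assumption made here; the description does not say how the body data is physically removed from the aggregated data.

```python
from typing import Dict, Tuple

Key = Tuple[str, str]  # (object name, version number): assumed metadata key

metadata: Dict[Key, dict] = {
    ("report.txt", "v1"): {"version": "v1", "aggregate_id": "agg-0001"},
    ("report.txt", "v2"): {"version": "v2", "aggregate_id": "agg-0001"},
}
aggregates: Dict[str, bytearray] = {
    "agg-0001": bytearray(b"first draftsecond draft"),
}
bitmaps: Dict[str, Dict[str, Tuple[int, int]]] = {
    "agg-0001": {"v1": (0, 11), "v2": (11, 12)},
}


def delete_object(name: str, version: str) -> None:
    meta = metadata[(name, version)]                    # 1. metadata via identity
    agg_id = meta["aggregate_id"]                       # 2. aggregate via its identity
    offset, length = bitmaps[agg_id][meta["version"]]   # 3. position via version
    # 4. Delete the body data. Zero-filling is an assumption made here so that
    #    the offsets of the other entries in the aggregate stay valid.
    aggregates[agg_id][offset:offset + length] = bytes(length)
    del metadata[(name, version)]                       # 5. delete the metadata
    del bitmaps[agg_id][meta["version"]]                # 6. delete the struct


delete_object("report.txt", "v1")
```

After the call, version v2 of the same name is still intact at its original offset, while no metadata or bitmap entry for v1 remains.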

Referring to FIG. 2, FIG. 2 is a schematic structural diagram of a data aggregation apparatus provided by the present invention. The data aggregation apparatus includes:

the aggregation module 21 is configured to aggregate, when target data with a storage capacity smaller than a preset threshold is received, body data of the target data to aggregated data corresponding to the target data;

the judging module 22 is used for judging whether the multi-version function of the distributed storage system is started, and if so, the updating module 23 is triggered;

the updating module 23 is configured to integrate the position of the body data of the target data in the aggregated data and the version number of the target data into a struct and add the struct to the bitmap information of the aggregated data, so that the specific position of the body data in the aggregated data can be found in the bitmap information through the version number;

the adding module 24 is configured to add identity information of the aggregated data in the metadata of the target data, so as to determine, through the identity information, the aggregated data where the body data of the target data is located;

wherein the storage capacity of the aggregated data is greater than the preset threshold.

For the introduction of the data aggregation apparatus provided in the embodiment of the present invention, reference is made to the foregoing embodiment of the data aggregation method, and details of the embodiment of the present invention are not repeated herein.

Referring to FIG. 3, FIG. 3 is a schematic structural diagram of a data aggregation device provided by the present invention. The data aggregation device includes:

a memory 31 for storing a computer program;

a processor 32 for implementing the steps of the data aggregation method as in the previous embodiments when executing the computer program.

For the introduction of the data aggregation device provided in the embodiment of the present invention, reference is made to the foregoing embodiment of the data aggregation method, and details of the embodiment of the present invention are not repeated herein.

The present invention also provides a computer readable storage medium having stored thereon a computer program which, when executed by a processor, implements the steps of the data aggregation method as in the previous embodiments.

For the introduction of the computer-readable storage medium provided by the embodiment of the present invention, reference is made to the foregoing embodiment of the data aggregation method, and details of the embodiment of the present invention are not repeated herein.

The embodiments in the present description are described in a progressive manner: each embodiment focuses on its differences from the other embodiments, and the same or similar parts of the embodiments may be referred to one another. Because the apparatus disclosed in the embodiments corresponds to the method disclosed in the embodiments, its description is brief, and the relevant points can be found in the description of the method. It should also be noted that, in the present specification, the terms "comprises," "comprising," or any other variation thereof are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising a ..." does not exclude the presence of other identical elements in the process, method, article, or apparatus that comprises the element.

The previous description of the disclosed embodiments is provided to enable any person skilled in the art to make or use the present invention. Various modifications to these embodiments will be readily apparent to those skilled in the art, and the generic principles defined herein may be applied to other embodiments without departing from the spirit or scope of the invention. Thus, the present invention is not intended to be limited to the embodiments shown herein but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.
