Data reconstruction method, device, storage medium and apparatus

Document No.: 1860769 Publication date: 2021-11-19

Abstract: This technology, "Data reconstruction method, device, storage medium and apparatus", was created by 王文锋, 徐曜, 倪浩轩, 郭志刚 and 陈玲鸿 on 2021-08-03. The invention discloses a data reconstruction method, device, storage medium and apparatus. The method comprises: when a data fragment reconstruction request is received, acquiring the reconstruction load amount of each storage node; selecting a target storage node from the storage nodes according to the reconstruction load amount, and reading the target data fragment in the target storage node; and determining an abnormal data fragment according to the target data fragment, and performing data reconstruction on the abnormal data fragment. Because the target storage node is selected according to the reconstruction load amount of each storage node and data reconstruction is performed based on the target data fragment of the target storage node, overloading of individual storage nodes is avoided, the data fragment reconstruction time is shortened, and the stability of data fragment reconstruction is improved.

1. A data reconstruction method, characterized in that the data reconstruction method comprises the steps of:

when a data fragment reconstruction request is received, acquiring the reconstruction load amount of each storage node;

selecting a target storage node from the storage nodes according to the reconstruction load amount, and reading a target data fragment in the target storage node;

and determining an abnormal data fragment according to the target data fragment, and performing data reconstruction on the abnormal data fragment.

2. The data reconstruction method according to claim 1, wherein the step of selecting a target storage node from the storage nodes according to the reconstruction load amount and reading a target data fragment in the target storage node comprises:

sorting the storage nodes according to the reconstruction load amount to obtain a sorting result;

and screening the storage nodes according to the sorting result to obtain the target storage node.

3. The data reconstruction method according to claim 2, wherein the step of screening the storage nodes according to the sorting result to obtain the target storage node comprises:

acquiring the number of blocks into which the stored data is divided, and determining the target fragment number required for data fragment reconstruction according to the number of blocks;

and screening the storage nodes according to the sorting result and the target fragment number to obtain the target storage node.

4. The data reconstruction method according to claim 3, wherein the step of screening the storage nodes according to the sorting result and the target fragment number to obtain the target storage node comprises:

screening the storage nodes according to the sorting result and the target fragment number to obtain a candidate storage node;

judging whether the candidate storage node is in a locked state;

and when the candidate storage node is not in the locked state, taking the candidate storage node as the target storage node.

5. The data reconstruction method according to claim 1, wherein after the step of selecting a target storage node from the storage nodes according to the reconstruction load amount and reading a target data fragment in the target storage node, the method further comprises:

acquiring reconstruction read information of the target storage node;

and updating the reconstruction load amount of the target storage node according to the reconstruction read information to obtain the current reconstruction load amount of the target storage node.

6. The data reconstruction method according to claim 5, wherein after the step of updating the reconstruction load amount of the target storage node according to the reconstruction read information to obtain the current reconstruction load amount of the target storage node, the method further comprises:

acquiring the storage time of the current reconstruction load amount, and judging whether the storage time is greater than or equal to a preset time threshold;

and deleting the current reconstruction load amount when the storage time is greater than or equal to the preset time threshold.

7. The data reconstruction method according to any one of claims 1 to 6, wherein after the step of determining an abnormal data fragment according to the target data fragment and performing data reconstruction on the abnormal data fragment, the method further comprises:

acquiring the current reconstruction load amount of each target storage node, and judging whether the current reconstruction load amount is greater than a preset load amount threshold;

and when the current reconstruction load amount is greater than the preset load amount threshold, setting the target storage node to a locked state.

8. A data reconstruction device, characterized in that the data reconstruction device comprises: a memory, a processor, and a data reconstruction program stored on the memory and executable on the processor, the data reconstruction program when executed by the processor implementing the data reconstruction method of any one of claims 1 to 7.

9. A storage medium having stored thereon a data reconstruction program which, when executed by a processor, implements the data reconstruction method according to any one of claims 1 to 7.

10. A data reconstruction apparatus, characterized in that the data reconstruction apparatus comprises: an information acquisition module, a node selection module and a reconstruction processing module;

the information acquisition module is used for acquiring the reconstruction load amount of each storage node when a data fragment reconstruction request is received;

the node selection module is used for selecting a target storage node from the storage nodes according to the reconstruction load amount and reading a target data fragment in the target storage node;

and the reconstruction processing module is used for determining an abnormal data fragment according to the target data fragment and performing data reconstruction on the abnormal data fragment.

Technical Field

The present invention relates to the field of internet technologies, and in particular, to a data reconstruction method, device, storage medium, and apparatus.

Background

At present, when a distributed storage system performs data fragment reconstruction, the storage nodes with the smaller fragment serial numbers are usually selected to reconstruct the abnormal data fragment. However, this approach does not consider the reconstruction load amount of each storage node, so individual storage nodes become heavily loaded, which in turn makes data fragment reconstruction slow and unstable.

The above is only for the purpose of assisting understanding of the technical aspects of the present invention, and does not represent an admission that the above is prior art.

Disclosure of Invention

The main object of the invention is to provide a data reconstruction method, device, storage medium and apparatus, aiming to solve the technical problem in the prior art that the reconstruction load amount of each storage node is not considered when abnormal data fragments are reconstructed, so that individual storage nodes become heavily loaded and data fragment reconstruction is slow and unstable.

In order to achieve the above object, the present invention provides a data reconstruction method, including the steps of:

when a data fragment reconstruction request is received, acquiring the reconstruction load amount of each storage node;

selecting a target storage node from the storage nodes according to the reconstruction load amount, and reading a target data fragment in the target storage node;

and determining an abnormal data fragment according to the target data fragment, and performing data reconstruction on the abnormal data fragment.

Optionally, the step of selecting a target storage node from the storage nodes according to the reconstruction load amount and reading a target data fragment in the target storage node includes:

sorting the storage nodes according to the reconstruction load amount to obtain a sorting result;

and screening the storage nodes according to the sorting result to obtain the target storage node.

Optionally, the step of screening the storage nodes according to the sorting result to obtain the target storage node includes:

acquiring the number of blocks into which the stored data is divided, and determining the target fragment number required for data fragment reconstruction according to the number of blocks;

and screening the storage nodes according to the sorting result and the target fragment number to obtain the target storage node.

Optionally, the step of screening the storage nodes according to the sorting result and the target fragment number to obtain the target storage node includes:

screening the storage nodes according to the sorting result and the target fragment number to obtain a candidate storage node;

judging whether the candidate storage node is in a locked state;

and when the candidate storage node is not in the locked state, taking the candidate storage node as the target storage node.

Optionally, after the step of selecting a target storage node from the storage nodes according to the reconstruction load amount and reading a target data fragment in the target storage node, the method further includes:

acquiring reconstruction read information of the target storage node;

and updating the reconstruction load amount of the target storage node according to the reconstruction read information to obtain the current reconstruction load amount of the target storage node.

Optionally, after the step of updating the reconstruction load amount of the target storage node according to the reconstruction read information to obtain the current reconstruction load amount of the target storage node, the method further includes:

acquiring the storage time of the current reconstruction load amount, and judging whether the storage time is greater than or equal to a preset time threshold;

and deleting the current reconstruction load amount when the storage time is greater than or equal to the preset time threshold.

Optionally, after the step of determining an abnormal data fragment according to the target data fragment and performing data reconstruction on the abnormal data fragment, the method further includes:

acquiring the current reconstruction load amount of each target storage node, and judging whether the current reconstruction load amount is greater than a preset load amount threshold;

and when the current reconstruction load amount is greater than the preset load amount threshold, setting the target storage node to a locked state.

Furthermore, to achieve the above object, the present invention also proposes a data reconstruction device comprising a memory, a processor, and a data reconstruction program stored on the memory and executable on the processor, the data reconstruction program being configured to implement the data reconstruction method as described above.

Furthermore, to achieve the above object, the present invention also proposes a storage medium having stored thereon a data reconstruction program which, when executed by a processor, implements the data reconstruction method as described above.

In addition, to achieve the above object, the present invention further provides a data reconstruction apparatus, comprising: an information acquisition module, a node selection module and a reconstruction processing module;

the information acquisition module is used for acquiring the reconstruction load amount of each storage node when a data fragment reconstruction request is received;

the node selection module is used for selecting a target storage node from the storage nodes according to the reconstruction load amount and reading a target data fragment in the target storage node;

and the reconstruction processing module is used for determining an abnormal data fragment according to the target data fragment and performing data reconstruction on the abnormal data fragment.

In the invention, when a data fragment reconstruction request is received, the reconstruction load amount of each storage node is acquired, a target storage node is selected from the storage nodes according to the reconstruction load amount, the target data fragment in the target storage node is read, an abnormal data fragment is determined according to the target data fragment, and data reconstruction is performed on the abnormal data fragment. Because the target storage node is selected according to the reconstruction load amount of each storage node and data reconstruction is performed based on the target data fragment of the target storage node, overloading of individual storage nodes is avoided, the data fragment reconstruction time is shortened, and the stability of data fragment reconstruction is improved.

Drawings

Fig. 1 is a schematic structural diagram of a data reconstruction device of a hardware operating environment according to an embodiment of the present invention;

FIG. 2 is a flowchart illustrating a data reconstruction method according to a first embodiment of the present invention;

FIG. 3 is a diagram illustrating data storage using an erasure code according to an embodiment of a data reconstruction method of the present invention;

FIG. 4 is a schematic diagram of conventional data slice reconstruction according to an embodiment of the data reconstruction method of the present invention;

FIG. 5 is a schematic diagram of data slice reconstruction according to an embodiment of the data reconstruction method of the present invention;

FIG. 6 is a flowchart illustrating a data reconstruction method according to a second embodiment of the present invention;

FIG. 7 is a flowchart illustrating a data reconstruction method according to a third embodiment of the present invention;

FIG. 8 is a block diagram of a data reconstruction apparatus according to a first embodiment of the present invention.

The implementation, functional features and advantages of the objects of the present invention will be further explained with reference to the accompanying drawings.

Detailed Description

It should be understood that the specific embodiments described herein are merely illustrative of the invention and are not intended to limit the invention.

Referring to fig. 1, fig. 1 is a schematic structural diagram of a data reconstruction device of a hardware operating environment according to an embodiment of the present invention.

As shown in FIG. 1, the data reconstruction device may include: a processor 1001, such as a Central Processing Unit (CPU), a communication bus 1002, a user interface 1003, a network interface 1004, and a memory 1005. The communication bus 1002 is used to enable connection and communication between these components. The user interface 1003 may include a display screen (Display); optionally, the user interface 1003 may further include a standard wired interface and a wireless interface, and in the present invention the wired interface of the user interface 1003 may be a USB interface. The network interface 1004 may optionally include a standard wired interface and a wireless interface (e.g., a Wireless-Fidelity (Wi-Fi) interface). The memory 1005 may be a Random Access Memory (RAM) or a Non-Volatile Memory (NVM), such as a disk memory. The memory 1005 may alternatively be a storage device separate from the processor 1001.

Those skilled in the art will appreciate that the configuration shown in fig. 1 does not constitute a limitation of the data reconstruction device and may include more or fewer components than those shown, or some components may be combined, or a different arrangement of components.

As shown in FIG. 1, memory 1005, identified as one type of computer storage medium, may include an operating system, a network communication module, a user interface module, and a data reconstruction program.

In the data reconstruction device shown in FIG. 1, the network interface 1004 is mainly used for connecting to a background server and performing data communication with the background server; the user interface 1003 is mainly used for connecting user equipment; the data reconstruction device calls the data reconstruction program stored in the memory 1005 through the processor 1001 and executes the data reconstruction method provided by the embodiment of the present invention.

Based on the above hardware structure, an embodiment of the data reconstruction method of the present invention is provided.

Referring to fig. 2, fig. 2 is a flowchart illustrating a data reconstruction method according to a first embodiment of the present invention.

Step S10: when a data fragment reconstruction request is received, acquiring the reconstruction load amount of each storage node.

It should be understood that the execution subject of the method of this embodiment may be a data reconstruction device with data processing, network communication and program running functions, such as a computer or a server, or another electronic device capable of implementing the same or similar functions, which is not limited in this embodiment. In this embodiment and the other embodiments, a server is taken as an example.

It should be noted that the data fragment reconstruction request may be initiated by a user through a user interface of the data reconstruction device. The storage node may be an Object-based Storage Device (OSD). The reconstruction load amount may be the number of reconstruction read requests received and the number of reconstruction write operations performed by the storage node.

It can be understood that, in this embodiment, data is stored using an erasure code. A specific storage manner may be: the erasure code divides the stored data into N parts of original data and calculates M parts of check data from the N parts of original data; the N + M parts of data are called data fragments and are stored on different OSDs. When data is read, the original data can be restored from any N of the N + M data fragments; likewise, when an abnormal data fragment is reconstructed, it can be restored from N data fragments.
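The split, check and restore cycle described above can be illustrated with a minimal sketch. It deliberately uses a single XOR check fragment (N = 3, M = 1), which is a simplified stand-in and not the erasure code algorithm of the embodiment; a scheme with M = 2 would need a code such as Reed-Solomon. The function names `encode` and `rebuild` are hypothetical.

```python
from functools import reduce
from typing import List, Optional


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """Byte-wise XOR of two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


def encode(data: bytes, n: int) -> List[bytes]:
    """Split data into n equal fragments and append one XOR check fragment."""
    if len(data) % n != 0:
        raise ValueError("data length must be a multiple of n in this sketch")
    size = len(data) // n
    fragments = [data[i * size:(i + 1) * size] for i in range(n)]
    return fragments + [reduce(xor_bytes, fragments)]


def rebuild(fragments: List[Optional[bytes]]) -> List[bytes]:
    """Restore at most one missing fragment (marked None) from the others."""
    missing = [i for i, f in enumerate(fragments) if f is None]
    if len(missing) > 1:
        raise ValueError("a single check fragment can restore only one fragment")
    restored = list(fragments)
    if missing:
        survivors = [f for f in fragments if f is not None]
        # XOR of all surviving fragments equals the missing one.
        restored[missing[0]] = reduce(xor_bytes, survivors)
    return restored


stored = encode(b"ABCDEFGHI", n=3)   # 3 original fragments + 1 check fragment
damaged = list(stored)
damaged[1] = None                    # the fragment holding "DEF" is abnormal
restored = rebuild(damaged)
original = b"".join(restored[:3])    # drop the check fragment
```

With one check fragment, any 3 of the 4 fragments are enough to recover the data, which mirrors the "any N of N + M" property on a small scale.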

For ease of understanding, the description is made with reference to FIG. 3, but the scheme is not limited thereto. FIG. 3 is a schematic diagram of data storage using an erasure code. Taking N = 3 and M = 2 as an example, it shows how an object named NYAN is stored in Ceph with an erasure code, assuming that the content of the object is ABCDEFGHI. After the client uploads NYAN to Ceph, the corresponding erasure code algorithm is called in the main OSD to encode the data: the original ABCDEFGHI is divided into three fragments, corresponding to stripe fragment 1 (content ABC), stripe fragment 2 (content DEF) and stripe fragment 3 (content GHI) in the figure, and then two check fragments are calculated, check stripe fragment 4 (content YXY) and check stripe fragment 5 (content QGC). The 5 fragments are randomly distributed over 5 different OSDs according to the rule specified by the crushmap, which completes the storage operation for this object.

After the client initiates a request to read NYAN, the main OSD of the PG where the object is located initiates read requests to the other associated OSDs. For example, the main OSD is OSD1 in the figure, and the requests are sent to OSD2, OSD3, OSD4 and OSD5. Because OSD2 is slow to read and OSD5 fails to respond, only the stripe fragments of OSD1 (content ABC), OSD3 (content GHI) and OSD4 (content YXY) are obtained. OSD1, as the main OSD, then performs erasure code decoding on the data fragments of OSD1, OSD3 and OSD4, calculates the fragment content on OSD2 (i.e., DEF), reassembles the NYAN content (ABCDEFGHI), and finally returns the result to the client.

It can be understood that, in the prior art, the storage nodes with the smaller fragment serial numbers are usually selected to reconstruct the abnormal data fragment when data fragment reconstruction is performed. As a result, some storage nodes have to serve more data fragment reads, which affects their data reading speed and stability and makes data fragment reconstruction slow and unstable.

For ease of understanding, the description is made with reference to FIG. 4, but the scheme is not limited thereto. FIG. 4 is a schematic diagram of conventional data fragment reconstruction, in which the main OSD of the PG is OSD1. When receiving a data fragment reconstruction request, OSD1 sends reconstruction read requests to OSD8, OSD19 and OSD5, which hold the fragments with the smaller serial numbers. OSD8, OSD19 and OSD5 return target data fragments to OSD1 according to the reconstruction read requests. OSD1 determines the abnormal data fragment according to the target data fragments, performs data reconstruction on it, and after the reconstruction is completed sends the reconstructed data fragment to OSD15, where the abnormal data fragment is located.

As can be seen from FIG. 4, the reconstruction load amount of OSD19 is already 8, and the reconstruction read request sent by the main OSD is added on top of it. OSD19 therefore has to read more data fragments, which affects its data reading speed and stability and in turn makes data fragment reconstruction slow and unstable.

Step S20: selecting a target storage node from the storage nodes according to the reconstruction load amount, and reading the target data fragment in the target storage node.

It should be understood that, the selection of the target storage node from the storage nodes according to the reconstruction load amount may be to sort the storage nodes from small to large according to the reconstruction load amount, and use the storage node which is sorted at the top as the target storage node.

Further, after the step S20, in order to update the reconstruction load amount of each storage node in real time, the method further includes:

acquiring reconstruction reading information of the target storage node;

and updating the reconstruction load capacity of the target storage node according to the reconstruction reading information to obtain the current reconstruction load capacity of the target storage node.

It should be understood that reading the target data fragment from the target storage node adds to the reconstruction load amount of that node. Therefore, the reconstruction load amount of the target storage node needs to be updated to keep it accurate.

It should be noted that the reconstruction read information may be the number of nodes reading the target data fragment from the target storage node.

It can be understood that updating the reconstruction load amount of the target storage node according to the reconstruction read information to obtain its current reconstruction load amount may be: determining, from the reconstruction read information, the number of additional nodes reading target data fragments, and updating the reconstruction load amount of the target storage node by that number.
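The update step can be sketched as a simple counter increment. This is only an illustration under stated assumptions: the reconstruction read information is represented as a mapping from node id to the number of additional readers, and the function name and load values are hypothetical.

```python
from collections import Counter


def update_reconstruction_load(loads: Counter, read_info: dict) -> Counter:
    """Add the number of new reconstruction reads to each target node's load.

    read_info maps a target node id to the number of nodes that read a
    target data fragment from it (an assumed representation).
    """
    for node_id, readers in read_info.items():
        loads[node_id] += readers
    return loads


loads = Counter({"OSD5": 1, "OSD8": 0, "OSD12": 2})   # illustrative values
current = update_reconstruction_load(loads, {"OSD5": 1, "OSD8": 1, "OSD12": 1})
```

A `Counter` is used so that a node without a previous record starts from a load amount of 0.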

Further, in order to save a storage space, after the updating the reconstruction load amount of the target storage node according to the reconstruction read information and obtaining the current reconstruction load amount of the target storage node, the method further includes:

acquiring the storage time of the current reconstruction load amount, and judging whether the storage time is greater than or equal to a preset time threshold value;

and deleting the current reconstruction load quantity when the storage time is greater than or equal to a preset time threshold.

It should be noted that the preset time threshold may be preset by a user, or may be automatically generated by the data reconstruction device according to an actual situation, which is not limited in this embodiment.

It should be understood that when the storage time of the current reconstruction load amount is greater than or equal to the preset time threshold, the reconstruction is considered to be completed, and the current reconstruction load amount can be deleted to save storage space.
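A minimal sketch of this expiry rule follows. The class name `LoadTable`, its methods and the injectable clock are assumptions made for illustration; the embodiment does not specify how the load amounts are stored.

```python
import time


class LoadTable:
    """Per-node reconstruction load amounts with a storage-time limit."""

    def __init__(self, max_age_seconds: float, clock=time.monotonic):
        self.max_age_seconds = max_age_seconds
        self._clock = clock
        self._entries = {}  # node id -> (load amount, time it was stored)

    def set(self, node_id: str, load: int) -> None:
        self._entries[node_id] = (load, self._clock())

    def get(self, node_id: str) -> int:
        # A node without a record is treated as having no reconstruction load.
        return self._entries.get(node_id, (0, 0.0))[0]

    def purge_expired(self) -> list:
        """Delete records whose storage time is >= the preset time threshold."""
        now = self._clock()
        expired = [node for node, (_, stored_at) in self._entries.items()
                   if now - stored_at >= self.max_age_seconds]
        for node_id in expired:
            del self._entries[node_id]
        return expired


fake_now = [0.0]                                   # controllable clock for the demo
table = LoadTable(max_age_seconds=60, clock=lambda: fake_now[0])
table.set("OSD19", 9)
fake_now[0] = 30.0
table.set("OSD5", 1)
fake_now[0] = 60.0
removed = table.purge_expired()                    # OSD19 has been stored for 60 s
```

The comparison uses `>=`, matching the "greater than or equal to" condition in the text.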

Step S30: determining an abnormal data fragment according to the target data fragment, and performing data reconstruction on the abnormal data fragment.

It can be understood that determining the abnormal data fragment according to the target data fragment may be performing data analysis on the target data fragment, and determining the abnormal data fragment according to the analysis result.

It should be understood that the data reconstruction of the abnormal data slice may be the data reconstruction of the abnormal data slice according to the target data slice through a preset decoding model. The preset decoding model is used for restoring the abnormal data fragments.

For ease of understanding, the description is made with reference to FIG. 5, but the scheme is not limited thereto. FIG. 5 is a schematic diagram of data fragment reconstruction, in which the main OSD of the PG is OSD1. When receiving a data fragment reconstruction request, OSD1 sends reconstruction read requests to OSD8, OSD5 and OSD12, which have small reconstruction load amounts. OSD8, OSD5 and OSD12 return target data fragments to OSD1 according to the reconstruction read requests. OSD1 determines the abnormal data fragment according to the target data fragments, performs data reconstruction on it, and after the reconstruction is completed sends the reconstructed data fragment to OSD15, where the abnormal data fragment is located.

As can be seen from FIG. 5, in this embodiment the OSDs that receive reconstruction read requests are selected according to the reconstruction load amount, so reconstruction read requests are sent preferentially to the OSDs with smaller reconstruction load amounts. This prevents any single OSD from accumulating a large reconstruction load amount, keeps the reconstruction load balanced across all OSDs in the storage system, shortens the overall data reconstruction time, and improves the stability of the system.
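The difference between the conventional choice of FIG. 4 and the load-aware choice of FIG. 5 can be sketched as two sort keys over the same candidates. Only the load amount of 8 for OSD19 comes from the description; the other load amounts, the fragment serial numbers and all names in the code are hypothetical values chosen for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    osd: str
    shard_index: int   # serial number of the fragment this OSD holds (assumed)
    load: int          # current reconstruction load amount


def pick_by_shard_index(candidates, count):
    """Conventional choice: the OSDs holding the lowest-numbered fragments."""
    ranked = sorted(candidates, key=lambda c: c.shard_index)
    return [c.osd for c in ranked[:count]]


def pick_by_load(candidates, count):
    """Load-aware choice: the OSDs with the smallest reconstruction load amount."""
    ranked = sorted(candidates, key=lambda c: c.load)
    return [c.osd for c in ranked[:count]]


candidates = [
    Candidate("OSD8", shard_index=0, load=0),
    Candidate("OSD19", shard_index=1, load=8),
    Candidate("OSD5", shard_index=2, load=1),
    Candidate("OSD12", shard_index=3, load=2),
]
conventional = pick_by_shard_index(candidates, 3)   # includes the busy OSD19
load_aware = pick_by_load(candidates, 3)            # skips OSD19 in favor of OSD12
```

Under these assumed values the conventional choice adds a ninth request to OSD19, while the load-aware choice leaves it alone.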

In the first embodiment, when a data fragment reconstruction request is received, the reconstruction load amount of each storage node is acquired, a target storage node is selected from the storage nodes according to the reconstruction load amount, the target data fragment in the target storage node is read, an abnormal data fragment is determined according to the target data fragment, and data reconstruction is performed on the abnormal data fragment. Because the target storage node is selected according to the reconstruction load amount of each storage node and data reconstruction is performed based on the target data fragment of the target storage node, overloading of individual storage nodes is avoided, the data fragment reconstruction time is shortened, and the stability of data fragment reconstruction is improved.

Referring to fig. 6, fig. 6 is a flowchart illustrating a data reconstruction method according to a second embodiment of the present invention, and the data reconstruction method according to the second embodiment of the present invention is proposed based on the first embodiment shown in fig. 2.

In the second embodiment, the step S20 includes:

Step S201: sorting the storage nodes according to the reconstruction load amount to obtain a sorting result.

It can be understood that sorting the storage nodes according to the reconstruction load amount to obtain a sorting result may be sorting the storage nodes in ascending order of reconstruction load amount.

Step S202: screening the storage nodes according to the sorting result to obtain the target storage node.

It should be understood that screening the storage nodes according to the sorting result to obtain the target storage node may be taking a preset number of top-ranked storage nodes as target storage nodes. The preset number may be set in advance.

Further, in order to ensure that the selected target storage node can meet the requirement of data reconstruction, the step S202 includes:

acquiring the number of blocks of stored data, and determining the number of target fragments required by data fragment reconstruction according to the number of the blocks;

and screening the storage nodes according to the sorting result and the target fragment number to obtain target storage nodes.

It can be understood that, when an abnormal data fragment is reconstructed, it needs to be restored from as many data fragments as the target fragment number. The target fragment number may be the number of blocks into which the data was divided when it was stored.

It should be understood that determining the target fragment number required for data fragment reconstruction according to the number of blocks may be taking the number of blocks as the target fragment number.

It can be understood that screening the storage nodes according to the sorting result and the target fragment number to obtain the target storage node may be taking the top-ranked storage nodes, as many as the target fragment number, as target storage nodes.

Further, in order to avoid the situation that the load capacity of the target storage node is too large, the screening the storage nodes according to the sorting result and the target fragment number to obtain the target storage node includes:

screening the storage nodes according to the sorting result and the target fragment number to obtain candidate storage nodes;

judging whether the candidate storage node is in a locked state;

and when the candidate storage node is not in the locking state, taking the candidate storage node as a target storage node.

It should be understood that, in practical applications, when the overall load of the storage nodes is high, even the selected target storage node may be busy. Continuing data reconstruction on it at that point may congest the system, so a busy storage node can be set to a locked state to prevent it from being selected.

It should be noted that, when the candidate storage node is in the locked state, the candidate storage node cannot be selected.
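The screening with a lock check might look like the following sketch. It assumes that a locked candidate is skipped and the next node in the sorting result is considered instead, which the description does not state explicitly; the function name, the tie-break by node id and the load values are likewise hypothetical.

```python
def select_targets(loads: dict, locked: set, needed: int) -> list:
    """Return `needed` unlocked nodes with the smallest reconstruction load amount.

    Assumption: a locked candidate is skipped and the next node in the
    sorting result takes its place.
    """
    # Ascending load amount; the node id breaks ties deterministically.
    ranked = sorted(loads, key=lambda node: (loads[node], node))
    targets = [node for node in ranked if node not in locked][:needed]
    if len(targets) < needed:
        raise RuntimeError("not enough unlocked storage nodes for reconstruction")
    return targets


loads = {"OSD8": 0, "OSD5": 1, "OSD12": 2, "OSD19": 3}   # illustrative values
targets = select_targets(loads, locked={"OSD5"}, needed=3)
```

Here `needed` plays the role of the target fragment number, that is, the number of blocks N used when the data was stored.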

The second embodiment sorts the storage nodes according to the reconstruction load amount to obtain a sorting result and screens the storage nodes according to the sorting result to obtain the target storage node, so that the target storage node can be selected accurately.

Referring to fig. 7, fig. 7 is a flowchart illustrating a data reconstruction method according to a third embodiment of the present invention, and the data reconstruction method according to the third embodiment of the present invention is proposed based on the first embodiment shown in fig. 2.

In the third embodiment, after the step S30, the method further includes:

Step S40: obtaining the current reconstruction load amount of each target storage node, and judging whether the current reconstruction load amount is greater than a preset load amount threshold.

It should be noted that the preset load amount threshold is a value set in advance.

It should be understood that, in practical applications, when the overall load of the storage nodes is large, even the selected target storage node may be busy. Continuing data reconstruction at that point may congest the system; therefore, a busy target storage node may be set to a locked state to prevent it from being selected.

It can be understood that, in order to determine whether the target storage node is busy, the current reconstruction load amount of the target storage node may be compared with a preset load amount threshold, and when the current reconstruction load amount is less than or equal to the preset load amount threshold, the target storage node is not busy, and data reconstruction processing may be performed.

Step S50: when the current reconstruction load amount is greater than the preset load amount threshold, setting the target storage node to a locked state.

It should be understood that when the current reconstruction load amount is greater than the preset load amount threshold, the target storage node is busy, and it needs to be set to a locked state so that it is not selected.
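Steps S40 and S50 can be sketched in a few lines of Python. The function name `lock_busy_nodes`, the node names, and the threshold value are hypothetical. The text does not describe how a node leaves the locked state, so this sketch only adds locks and never removes them.

```python
from typing import Dict, Set


def lock_busy_nodes(current_load: Dict[str, int], threshold: int, locked: Set[str]) -> Set[str]:
    """Return the locked set extended with every node whose load exceeds the threshold."""
    # Strictly greater than: a load equal to the threshold is still "not busy".
    busy = {node for node, load in current_load.items() if load > threshold}
    return locked | busy


# Hypothetical current reconstruction loads and threshold.
current_load = {"osd5": 2, "osd8": 8, "osd19": 9}
locked = lock_busy_nodes(current_load, threshold=8, locked=set())
print(sorted(locked))
```

In this example `osd8` stays selectable because its load equals the threshold, matching the "less than or equal to" condition for a node that is not busy.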

In the third embodiment, the current reconstruction load amount of each target storage node is obtained and compared with the preset load amount threshold, and the target storage node is set to a locked state when the current reconstruction load amount is greater than the threshold. A busy target storage node can thus be locked, which prevents it from being selected and congesting the system.

Furthermore, an embodiment of the present invention further provides a storage medium, where the storage medium stores a data reconstruction program, and the data reconstruction program, when executed by a processor, implements the data reconstruction method as described above.

In addition, referring to fig. 8, an embodiment of the present invention further provides a data reconstruction apparatus, where the data reconstruction apparatus includes: an information acquisition module, a node selection module and a reconstruction processing module;

the information acquisition module is used for acquiring the reconstruction load capacity of each storage node when a data fragment reconstruction request is received.

It should be noted that the data fragment reconstruction request may be initiated by a user through a user interface of the data reconstruction device. The storage node may be an Object-based Storage Device (OSD). The reconstruction load amount may be the number of reconstruction read requests received and the number of reconstruction write operations performed by the storage node.

It can be understood that, in this embodiment, data is stored using an erasure code. A specific storage manner may be as follows: the erasure code divides the stored data into N parts of original data and calculates M parts of check data from the N parts of original data; the resulting N + M parts are called data fragments and are stored on different OSDs. When data is read, the original data can be restored from any N of the N + M data fragments; likewise, when an abnormal data fragment is reconstructed, it can be restored from N data fragments.

For ease of understanding, the description will be made with reference to fig. 3, but this scheme is not limited thereto. Fig. 3 is a schematic diagram of data storage by an erasure code. Taking N = 3 and M = 2 as an example, it shows how an object named NYAN is stored in Ceph by an erasure code, assuming that the content of the object is ABCDEFGHI. After the client uploads NYAN to Ceph, the primary OSD calls the corresponding erasure code algorithm to encode the data: the original ABCDEFGHI is divided into three fragments, corresponding to stripe fragment 1 (content ABC), stripe fragment 2 (content DEF) and stripe fragment 3 (content GHI) in the figure, and two check fragments are then calculated, namely check stripe fragment 4 (content YXY) and check stripe fragment 5 (content QGC). The 5 fragments are randomly distributed over 5 different OSDs according to the rule specified by the crushmap, which completes the storage operation for this object.

After the client initiates a request to read NYAN, the primary OSD of the PG where the object is located initiates read requests to the other associated OSDs. For example, the primary OSD is OSD1 in the figure. When the request is sent to OSD2, OSD3, OSD4 and OSD5, OSD2 is slow to read and OSD5 fails to respond, so only the stripe fragments of OSD1 (content ABC), OSD3 (content GHI) and OSD4 (content YXY) are finally obtained. OSD1, as the primary OSD, then performs erasure code decoding on the data fragments of OSD1, OSD3 and OSD4, calculates the fragment content held on OSD2 (i.e., DEF), reassembles the complete NYAN content (ABCDEFGHI), and finally returns the result to the client.
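The idea of restoring a missing fragment from the surviving ones can be shown with a deliberately simplified Python sketch. It is an assumption-laden stand-in, not the code used in fig. 3: it uses a single XOR parity fragment (N = 3, M = 1) instead of the two check fragments of the figure, so it tolerates only one missing fragment, and the function names `encode` and `rebuild` are hypothetical.

```python
from functools import reduce
from typing import Dict, List


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """Byte-wise XOR of two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


def encode(data: bytes, n: int) -> List[bytes]:
    """Split data into n equal fragments and append one XOR parity fragment."""
    if len(data) % n != 0:
        raise ValueError("data length must be a multiple of n in this sketch")
    size = len(data) // n
    fragments = [data[i * size:(i + 1) * size] for i in range(n)]
    parity = reduce(xor_bytes, fragments)
    return fragments + [parity]


def rebuild(available: Dict[int, bytes], missing: int) -> bytes:
    """Recover the single missing fragment by XOR-ing all surviving fragments."""
    if missing in available:
        raise ValueError("fragment is not missing")
    return reduce(xor_bytes, available.values())


fragments = encode(b"ABCDEFGHI", n=3)          # ABC, DEF, GHI, parity
survivors = {0: fragments[0], 2: fragments[2], 3: fragments[3]}
recovered = rebuild(survivors, missing=1)      # fragment 1 is unavailable
restored = fragments[0] + recovered + fragments[2]
print(recovered, restored)
```

A production erasure code such as the one in the figure would use a more general scheme so that any N of the N + M fragments suffice; the XOR case only demonstrates the principle.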

It can be understood that in the prior art, data fragment reconstruction often selects the storage nodes with the smaller fragment serial numbers to reconstruct the abnormal data fragment. As a result, some storage nodes have to serve more data fragment reads, which degrades their read speed and stability and makes data fragment reconstruction slow and unstable.

For ease of understanding, the description will be made with reference to fig. 4, but this scheme is not limited thereto. Fig. 4 is a schematic diagram of conventional data fragment reconstruction. In the figure, the primary OSD of the PG is OSD1. On receiving a data fragment reconstruction request, OSD1 sends reconstruction read requests to OSD8, OSD19 and OSD5, which have the smaller fragment serial numbers; OSD8, OSD19 and OSD5 return the target data fragments to OSD1 in response; OSD1 determines the abnormal data fragment according to the target data fragments, performs data reconstruction on it, and after reconstruction is complete sends the reconstructed data fragment to OSD15, where the abnormal data fragment is located.

Referring to fig. 4, the reconstruction load amount of OSD19 is already 8, and the reconstruction read request sent by the primary OSD is added on top of it. OSD19 therefore has to read more data fragments, which degrades its read speed and stability and in turn makes data fragment reconstruction slow and unstable.

The node selection module is used for selecting a target storage node from the storage nodes according to the reconstruction load amount and reading the target data fragment in the target storage node.

It should be understood that, the selection of the target storage node from the storage nodes according to the reconstruction load amount may be to sort the storage nodes from small to large according to the reconstruction load amount, and use the storage node which is sorted at the top as the target storage node.

Further, in order to update the reconstruction load amount of each storage node in real time, the data reconstruction apparatus further includes: a load update module;

the load updating module is used for acquiring the reconstruction reading information of the target storage node, updating the reconstruction load capacity of the target storage node according to the reconstruction reading information, and acquiring the current reconstruction load capacity of the target storage node.

It should be appreciated that reading the target data fragments from the target storage node adds to that node's reconstruction load amount. Therefore, the reconstruction load amount of the target storage node needs to be updated to keep it accurate.

It should be noted that the reconstruction read information may be the number of nodes reading the target data fragment from the target storage node.

It can be understood that updating the reconstruction load amount of the target storage node according to the reconstruction read information to obtain its current reconstruction load amount may consist of determining, from the reconstruction read information, the number of additional nodes reading target data fragments, and updating the reconstruction load amount of the target storage node by that number.
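The update performed by the load updating module can be sketched as a simple addition. The function name `apply_read_info` and the sample values are hypothetical; the sketch assumes the reconstruction read information is available as a per-node count of additional readers.

```python
from typing import Dict


def apply_read_info(rebuild_load: Dict[str, int], new_readers: Dict[str, int]) -> Dict[str, int]:
    """Return the current load: previous load plus the number of additional readers."""
    current = dict(rebuild_load)  # leave the caller's mapping untouched
    for node, readers in new_readers.items():
        current[node] = current.get(node, 0) + readers
    return current


# Hypothetical values: one primary OSD has just issued a read to each target node.
previous = {"osd5": 1, "osd8": 2, "osd12": 0, "osd19": 8}
read_info = {"osd5": 1, "osd8": 1, "osd12": 1}
current = apply_read_info(previous, read_info)
print(current)
```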

Further, in order to save the storage space, the data reconstruction apparatus further includes: an information deleting module;

the information deleting module is used for acquiring the storage time of the current reconstruction load amount, judging whether the storage time is larger than or equal to a preset time threshold value or not, and deleting the current reconstruction load amount when the storage time is larger than or equal to the preset time threshold value.

It should be noted that the preset time threshold may be preset by a user, or may be automatically generated by the data reconstruction device according to an actual situation, which is not limited in this embodiment.

It should be understood that when the storage time of the current reconstruction load amount is greater than or equal to the preset time threshold, the reconstruction is taken to have completed, and the current reconstruction load amount may then be deleted to save storage space.
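The deletion rule can be sketched as an expiry filter. The function name `purge_expired`, the record layout (load value plus the time it was stored), and the numbers are assumptions made for illustration only.

```python
import time
from typing import Dict, Optional, Tuple

# node name -> (current reconstruction load, time at which the record was stored)
Records = Dict[str, Tuple[int, float]]


def purge_expired(records: Records, max_age: float, now: Optional[float] = None) -> Records:
    """Drop every record whose storage time is greater than or equal to max_age."""
    now = time.monotonic() if now is None else now
    return {
        node: (load, stored_at)
        for node, (load, stored_at) in records.items()
        if now - stored_at < max_age  # keep only records younger than the threshold
    }


# Hypothetical records and a 60-second threshold, evaluated at time 110.
records = {"osd5": (2, 100.0), "osd8": (3, 40.0), "osd12": (1, 50.0)}
kept = purge_expired(records, max_age=60.0, now=110.0)
print(kept)
```

Here the `osd12` record is exactly 60 seconds old and is deleted, matching the "greater than or equal to" condition in the text.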

The reconstruction processing module is used for determining abnormal data fragments according to the target data fragments and performing data reconstruction on the abnormal data fragments.

It can be understood that determining the abnormal data fragment according to the target data fragment may be performing data analysis on the target data fragment, and determining the abnormal data fragment according to the analysis result.

It should be understood that the data reconstruction of the abnormal data slice may be the data reconstruction of the abnormal data slice according to the target data slice through a preset decoding model. The preset decoding model is used for restoring the abnormal data fragments.

For ease of understanding, the description will be made with reference to fig. 5, but this scheme is not limited thereto. Fig. 5 is a schematic diagram of data fragment reconstruction in the present scheme. In the figure, the primary OSD of the PG is OSD1. On receiving a data fragment reconstruction request, OSD1 sends reconstruction read requests to OSD8, OSD5 and OSD12, which have small reconstruction load amounts; OSD8, OSD5 and OSD12 return the target data fragments to OSD1 in response; OSD1 determines the abnormal data fragment according to the target data fragments, performs data reconstruction on it, and after reconstruction is complete sends the reconstructed data fragment to OSD15, where the abnormal data fragment is located.

Referring to fig. 5, in this embodiment the OSDs that receive the reconstruction read request are selected according to the reconstruction load amount, so that OSDs with a smaller reconstruction load amount are preferred. This prevents any single OSD from accumulating an excessive reconstruction load, keeps the reconstruction load balanced across all OSDs in the storage system, shortens the overall data reconstruction time, and improves the stability of the system.
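The balancing effect can be illustrated with a toy Python comparison of the two selection policies. Everything here is hypothetical: the policy names, the six node ids, and the simplifications that every node holds a usable fragment and that loads only grow during the run.

```python
from typing import Callable, Dict, List

Policy = Callable[[Dict[int, int], int], List[int]]


def lowest_id(loads: Dict[int, int], k: int) -> List[int]:
    """Conventional policy: always read from the k smallest node ids."""
    return sorted(loads)[:k]


def least_loaded(loads: Dict[int, int], k: int) -> List[int]:
    """Load-aware policy: read from the k nodes with the smallest load."""
    return sorted(loads, key=lambda node: (loads[node], node))[:k]


def simulate(policy: Policy, node_ids: List[int], requests: int, k: int) -> Dict[int, int]:
    """Issue `requests` reconstruction requests and return the resulting loads."""
    loads = {node: 0 for node in node_ids}
    for _ in range(requests):
        for node in policy(loads, k):
            loads[node] += 1  # one more outstanding reconstruction read
    return loads


nodes = [1, 2, 3, 4, 5, 6]
conventional = simulate(lowest_id, nodes, requests=4, k=3)
balanced = simulate(least_loaded, nodes, requests=4, k=3)
print(max(conventional.values()), max(balanced.values()))
```

In this toy run the peak load is 4 under the conventional policy and 2 under the load-aware policy, because the latter spreads the reads over all six nodes.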

In this embodiment, when a data fragment reconstruction request is received, the reconstruction load amount of each storage node is obtained, a target storage node is selected from the storage nodes according to the reconstruction load amount, the target data fragment in the target storage node is read, an abnormal data fragment is determined according to the target data fragment, and data reconstruction is performed on the abnormal data fragment. Because the target storage node is selected according to the reconstruction load amount of each storage node and data reconstruction is based on the target data fragment of that node, no individual storage node becomes overloaded, the time needed for data fragment reconstruction is shortened, and the stability of data fragment reconstruction is improved.

In an embodiment, the node selection module is further configured to sort the storage nodes according to the reconstruction load amount to obtain a sorting result, and screen the storage nodes according to the sorting result to obtain a target storage node;

in an embodiment, the node selection module is further configured to obtain a block number of the stored data, determine a target fragment number required for data fragment reconstruction according to the block number, and screen the storage nodes according to the sorting result and the target fragment number to obtain target storage nodes;

in an embodiment, the node selection module is further configured to screen the storage nodes according to the sorting result and the target fragment number to obtain candidate storage nodes, determine whether the candidate storage nodes are in a locked state, and when the candidate storage nodes are not in the locked state, use the candidate storage nodes as target storage nodes;

in one embodiment, the data reconstruction apparatus further includes: a load update module;

the load updating module is used for acquiring reconstruction reading information of the target storage node, updating the reconstruction load capacity of the target storage node according to the reconstruction reading information and acquiring the current reconstruction load capacity of the target storage node;

in one embodiment, the data reconstruction apparatus further includes: an information deleting module;

the information deleting module is used for acquiring the storage time of the current reconstruction load amount, judging whether the storage time is greater than or equal to a preset time threshold value or not, and deleting the current reconstruction load amount when the storage time is greater than or equal to the preset time threshold value;

in one embodiment, the data reconstruction apparatus further includes: a node locking module;

the node locking module is used for acquiring the current reconstruction load capacity of each target storage node, judging whether the current reconstruction load capacity is larger than a preset load capacity threshold value or not, and setting the target storage nodes to be in a locking state when the current reconstruction load capacity is larger than the preset load capacity threshold value.

Other embodiments or specific implementation manners of the data reconstruction apparatus according to the present invention may refer to the above method embodiments, and are not described herein again.

It should be noted that, in this document, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or system that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or system. Without further limitation, an element defined by the phrase "comprising an … …" does not exclude the presence of other like elements in a process, method, article, or system that comprises the element.

The above-mentioned serial numbers of the embodiments of the present invention are merely for description and do not represent the merits of the embodiments. In the unit claims enumerating several means, several of these means may be embodied by one and the same item of hardware. The use of the words first, second, third, etc. do not denote any order, but rather the words first, second, third, etc. are to be interpreted as names.

Through the above description of the embodiments, those skilled in the art will clearly understand that the method of the above embodiments can be implemented by software plus a necessary general hardware platform, and certainly can also be implemented by hardware, but in many cases, the former is a better implementation manner. Based on such understanding, the technical solutions of the present invention or portions thereof that contribute to the prior art may be embodied in the form of a software product, where the computer software product is stored in a storage medium (e.g., a Read Only Memory (ROM)/Random Access Memory (RAM), a magnetic disk, an optical disk), and includes several instructions for enabling a terminal device (e.g., a mobile phone, a computer, a server, an air conditioner, or a network device) to execute the method according to the embodiments of the present invention.

The above description is only a preferred embodiment of the present invention, and not intended to limit the scope of the present invention, and all modifications of equivalent structures and equivalent processes, which are made by using the contents of the present specification and the accompanying drawings, or directly or indirectly applied to other related technical fields, are included in the scope of the present invention.
