Distributed storage system and management method and device thereof

Document No.: 1296034 Publication date: 2020-08-07

Reading note: This technology, "Distributed storage system and management method and device thereof", was designed by 刘金鑫 and 董乘宇 on 2019-01-31. Main content: The present disclosure relates to a distributed storage system and a management method and device thereof. The method is applied to a client and comprises: obtaining delay parameters of each data node accessed in a first time period, wherein the delay parameters represent the delay degree of the client accessing the data nodes; and sending the obtained delay parameters of the data nodes to a master node, so that the master node determines, according to the delay parameters, the probability that each data node is allocated to a client for access. Because the probability that a data node is allocated to a client for access is determined by the delay degree of the data node, the distributed storage system and the management method and device thereof according to embodiments of the present disclosure can improve the speed at which clients access data nodes while the storage capacity of the distributed storage system remains unchanged.

1. A distributed storage system management method is applied to a client, and comprises the following steps:

obtaining delay parameters of each data node accessed in a first time period, wherein the delay parameters are used for representing the delay degree of a client accessing the data nodes;

and sending the acquired delay parameters of the data nodes to a main node, so that the main node determines the probability of the data nodes being allocated to the client side for access according to the delay parameters.

2. The method of claim 1, wherein obtaining the latency parameter for each data node accessed during the first time period comprises:

acquiring the delay time of each data node accessed in the first time period;

selecting a reference value from the obtained delay time;

and determining the ratio of the delay time of each data node accessed in the first time period to the reference value as the delay parameter of each data node accessed in the first time period.

3. The method of claim 2, wherein obtaining the delay time of each data node accessed in the first time period comprises:

for each data node accessed within the first time period:

acquiring the accessed data volume and delay time when the data node is accessed each time in the first time period;

determining unit delay time corresponding to data with specified data volume in the data node accessed each time according to the data volume and the delay time accessed each time;

and determining the average value of the unit delay time of accessing the data node in each time in the first time period as the delay time of the data node.

4. A distributed storage system management method is applied to a main node, and comprises the following steps:

obtaining delay parameters from a client, wherein the delay parameters obtained from the client comprise delay parameters of the data nodes accessed by the client in a first time period, and the delay parameters of the data nodes represent the delay degree of the client accessing the data nodes;

and determining the probability that each data node in the data node cluster is allocated to the client for access according to the acquired delay parameters.

5. The method of claim 4, wherein determining the probability that each data node in the data node cluster is allocated to the client for access according to the obtained delay parameter comprises:

acquiring the probability that each data node in the data node cluster determined before the current moment is allocated to a client for access;

and adjusting the acquired probability according to the acquired delay parameters.

6. The method of claim 5, wherein adjusting the acquired probability according to the acquired delay parameters comprises:

for any data node in the data node cluster, when the acquired delay parameter of the data node increases, reducing the acquired probability of the data node, and when the acquired delay parameter of the data node decreases, increasing the acquired probability of the data node.

7. A method for managing a distributed storage system, the method comprising:

the client obtains delay parameters of each data node accessed in a first time period, wherein the delay parameters represent the delay degree of the client accessing the data nodes;

the client sends the acquired delay parameters of each data node to the master node;

the main node obtains a delay parameter from the client;

and the main node determines the probability that each data node in the data node cluster is allocated to the client for access according to the acquired delay parameters.

8. The method of claim 7, wherein the obtaining, by the client, the delay parameter of each data node accessed in the first time period comprises:

the client acquires the delay time of each data node accessed in the first time period;

the client selects a reference value from the acquired delay time;

and the client determines the ratio of the delay time of each data node accessed in the first time period to the reference value as the delay parameter of each data node accessed in the first time period.

9. The method of claim 7, wherein the step of obtaining the delay time of each data node accessed in the first time period by the client comprises:

for each data node accessed in the first time period:

the client acquires the accessed data volume and delay time when the data node is accessed each time in the first time period;

the client determines unit delay time corresponding to data with specified data volume in the data node according to the data volume and the delay time of each access;

and the client determines the average value of the unit delay time of accessing the data node in each time in the first time period as the delay time of the data node.

10. The method of claim 7, wherein the determining, by the master node, the probability that each data node in the data node cluster is allocated to the client for access according to the obtained delay parameter comprises:

the master node acquires the probability, determined before the current moment, that each data node in the data node cluster is allocated to a client for access;

and the main node adjusts the acquired probability according to the acquired delay parameters.

11. The method of claim 7 or 10, wherein the master node adjusting the acquired probability according to the acquired delay parameters comprises:

for any data node in the data node cluster, when the acquired delay parameter of the data node increases, the master node reduces the acquired probability of the data node, and when the acquired delay parameter of the data node decreases, the master node increases the acquired probability of the data node.

12. A distributed storage system management apparatus, the apparatus comprising:

an acquisition module, configured to obtain delay parameters of each data node accessed in a first time period, wherein the delay parameters represent the delay degree of a client accessing the data nodes;

and the sending module is used for sending the acquired delay parameters of the data nodes to the main node so that the main node determines the probability that the data nodes are allocated to the client side for access according to the delay parameters.

13. The apparatus of claim 12, wherein the obtaining module is further configured to:

acquiring the delay time of each data node accessed in the first time period;

selecting a reference value from the obtained delay time;

and determining the ratio of the delay time of each data node accessed in the first time period to the reference value as the delay parameter of each data node accessed in the first time period.

14. The apparatus of claim 12, wherein the obtaining module is further configured to:

for each data node accessed within the first time period:

acquiring the accessed data volume and delay time when the data node is accessed each time in the first time period;

determining unit delay time corresponding to data with specified data volume in the data node accessed each time according to the data volume and the delay time accessed each time;

and determining the average value of the unit delay time of accessing the data node in each time in the first time period as the delay time of the data node.

15. A distributed storage system management apparatus, the apparatus comprising:

an acquisition module, configured to obtain delay parameters from a client, wherein the delay parameters obtained from the client comprise delay parameters of the data nodes accessed by the client in a first time period, and the delay parameters of the data nodes represent the delay degree of the client accessing the data nodes;

and the determining module is used for determining the probability that each data node in the data node cluster is allocated to the client for access according to the acquired delay parameters.

16. The apparatus of claim 15, wherein the determining module is further configured to:

acquiring the probability that each data node in the data node cluster determined before the current moment is allocated to a client for access;

and adjusting the acquired probability according to the acquired delay parameters.

17. The apparatus according to claim 15 or 16, wherein the determining module is specifically configured to:

for any data node in the data node cluster, when the acquired delay parameter of the data node increases, reducing the acquired probability of the data node, and when the acquired delay parameter of the data node decreases, increasing the acquired probability of the data node.

18. A distributed storage system, comprising a client, a data node and a master node;

the client is used for obtaining delay parameters of each data node accessed in a first time period, the delay parameters are used for representing the delay degree of the data nodes accessed by the client, and the obtained delay parameters of each data node are sent to the main node;

the main node is used for obtaining the delay parameters from the client and determining the probability that each data node in the data node cluster is allocated to the client for access according to the obtained delay parameters.

19. A distributed storage system, comprising:

a memory for storing a program;

a processor, coupled to the memory, for executing the program to perform the method of any of claims 1 to 3, or to perform the method of any of claims 4 to 6.

Technical Field

The present disclosure relates to the field of computer technologies, and in particular, to a distributed storage system and a management method and apparatus thereof.

Background

A distributed storage system is a large-scale data storage system built on a large number of servers and a network. It can effectively prevent data loss or service unavailability caused by a single disk failure, a single machine failure, or a small-scale network failure.

Slow nodes, i.e., data nodes that perform significantly worse than other data nodes in the distributed storage system, may occur in the distributed storage system.

Disclosure of Invention

In view of this, the present disclosure provides a distributed storage system, and a management method and an apparatus thereof, which improve a speed of a client accessing a data node under a condition that a storage capacity of the distributed storage system is not changed.

According to a first aspect of the present disclosure, there is provided a distributed storage system management method, which is applied to a client, and includes: obtaining delay parameters of each data node accessed in a first time period, wherein the delay parameters are used for representing the delay degree of a client accessing the data nodes; and sending the acquired delay parameters of the data nodes to a main node, so that the main node determines the probability of the data nodes being allocated to the client side for access according to the delay parameters.

According to a second aspect of the present disclosure, there is provided a distributed storage system management method, the method being applied to a master node, the method including: obtaining delay parameters from a client, wherein the delay parameters obtained from the client comprise delay parameters of the data nodes accessed by the client in a first time period, and the delay parameters of the data nodes represent the delay degree of the client accessing the data nodes; and determining the probability that each data node in the data node cluster is allocated to the client for access according to the acquired delay parameters.

According to a third aspect of the present disclosure, there is provided a distributed storage system management method, the method comprising: the client obtains delay parameters of each data node accessed in a first time period, wherein the delay parameters represent the delay degree of the client accessing the data nodes; the client sends the acquired delay parameters of each data node to the master node; the master node obtains the delay parameters from the client; and the master node determines the probability that each data node in the data node cluster is allocated to the client for access according to the acquired delay parameters.

According to a fourth aspect of the present disclosure, there is provided a distributed storage system management apparatus, the apparatus comprising: an acquisition module, configured to obtain delay parameters of each data node accessed in a first time period, wherein the delay parameters represent the delay degree of a client accessing the data nodes; and a sending module, configured to send the acquired delay parameters of the data nodes to the master node, so that the master node determines, according to the delay parameters, the probability that each data node is allocated to the client for access.

According to a fifth aspect of the present disclosure, there is provided a distributed storage system management apparatus, the apparatus comprising: an acquisition module, configured to obtain delay parameters from a client, wherein the delay parameters obtained from the client comprise delay parameters of the data nodes accessed by the client in a first time period, and the delay parameters of the data nodes represent the delay degree of the client accessing the data nodes; and a determining module, configured to determine the probability that each data node in the data node cluster is allocated to the client for access according to the acquired delay parameters.

According to a sixth aspect of the present disclosure, there is provided a distributed storage system, the system comprising a client, a data node, and a master node; the client is used for obtaining delay parameters of each data node accessed in a first time period, the delay parameters are used for representing the delay degree of the data nodes accessed by the client, and the obtained delay parameters of each data node are sent to the main node; the main node is used for obtaining the delay parameters from the client and determining the probability that each data node in the data node cluster is allocated to the client for access according to the obtained delay parameters.

According to a seventh aspect of the present disclosure, there is provided a distributed storage system, the system comprising a memory for storing a program; a processor, coupled to the memory, for executing the program to perform the method according to the first aspect as described above or to perform the method according to the second aspect as described above.

In the embodiment of the disclosure, the probability that the data node is allocated to the client for access is determined by the delay degree of the data node, and when the data node becomes a slow node, the probability that the data node is accessed by the client is adjusted instead of deleting the data node, so that the storage capacity of the distributed storage system is ensured to be unchanged, and the speed of accessing the data node by the client is increased.

Other features and aspects of the present disclosure will become apparent from the following detailed description of exemplary embodiments, which proceeds with reference to the accompanying drawings.

Drawings

The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate exemplary embodiments, features, and aspects of the disclosure and, together with the description, serve to explain the principles of the disclosure.

FIG. 1 illustrates a flow chart of a distributed storage system management method according to an embodiment of the present disclosure.

FIG. 2 illustrates an architectural schematic of a distributed storage system according to an embodiment of the present disclosure.

FIG. 3 shows a flow diagram of a distributed storage system management method according to an embodiment of the present disclosure.

FIG. 4 shows a flow diagram of a distributed storage system management method according to an embodiment of the present disclosure.

FIG. 5 illustrates a block diagram of a distributed storage system management apparatus according to an embodiment of the present disclosure.

FIG. 6 illustrates a block diagram of a distributed storage system management apparatus according to an embodiment of the present disclosure.

Detailed Description

Various exemplary embodiments, features and aspects of the present disclosure will be described in detail below with reference to the accompanying drawings. In the drawings, like reference numbers can indicate functionally identical or similar elements. While the various aspects of the embodiments are presented in drawings, the drawings are not necessarily drawn to scale unless specifically indicated.

The word "exemplary" is used exclusively herein to mean "serving as an example, embodiment, or illustration." Any embodiment described herein as "exemplary" is not necessarily to be construed as preferred or advantageous over other embodiments.

Furthermore, in the following detailed description, numerous specific details are set forth in order to provide a better understanding of the present disclosure. It will be understood by those skilled in the art that the present disclosure may be practiced without some of these specific details. In some instances, methods, means, elements and circuits that are well known to those skilled in the art have not been described in detail so as not to obscure the present disclosure.

Fig. 1 illustrates a flow chart of a distributed storage system management method according to an embodiment of the present disclosure. The method can be applied to a client. As shown in fig. 1, the method may include:

Step S11, obtaining a delay parameter of each data node accessed in the first time period, where the delay parameter is used to indicate a delay degree of the client accessing the data node.

Step S12, sending the obtained delay parameter of each data node to a master node, so that the master node determines, according to the delay parameter, a probability that each data node is allocated to a client for access.

In the embodiment of the disclosure, the probability that the data node is allocated to the client for access is determined by the delay degree of the data node, and when the data node becomes a slow node, the probability that the data node is accessed by the client is adjusted instead of deleting the data node, so that the storage capacity of the distributed storage system is ensured to be unchanged, and the speed of accessing the data node by the client is increased.

FIG. 2 illustrates an architectural schematic of a distributed storage system according to an embodiment of the present disclosure. Wherein, client 1 and client 2 may represent different clients; data node 1, data node 2, data node 3, and data node 4 may represent different data nodes. As shown in fig. 2, the distributed storage system may include a master node, data nodes, and clients. The master node is a control node in the distributed storage system and is mainly responsible for the distribution of data nodes, the acquisition of system data, the monitoring of node health states and the like. The data node is a node in charge of data storage in the distributed storage system, and can be a physical server, and a process for receiving a data access request can be executed on the data node. The client is a module provided for an upper layer user on the distributed storage system, and the client can access the data node by sending an access request to the data node. The data node can respond to the read-write request sent by the client to realize data access.

The distributed storage system management method shown in fig. 1 may be applied to the client shown in fig. 2, such as the client 1 or the client 2.

In step S11, the first time period may be any time period and may be set as required. For example, the first time period may be one hour, one day, or one week; the disclosure is not limited thereto.

A client may access one or more data nodes during a first time period. The client may obtain the delay parameters of each data node accessed within the first time period. Wherein, the delay parameter can be used to represent the delay degree of the client accessing the data node.

In one possible implementation manner, the delay time of one access of the data node may be the time taken for completing the data access corresponding to the access request. The delay time of the data node may be an average of delay times of respective accesses of the data node in the first time period.

In one example, the delay parameter of the data node may be a delay time of the data node. The larger the delay time of the data node is, the higher the delay degree of the data node is, and therefore the delay time of the data node can be used as the delay parameter of the data node.

In one example, the delay parameter of the data node may be a parameter representing a delay time of the data node, such as a relative delay time of the data node. Step S11 may include: acquiring the delay time of each data node accessed in the first time period; selecting a reference value from the obtained delay time; and determining the ratio of the delay time of each data node accessed in the first time period to the reference value as the delay parameter of each data node accessed in the first time period.

The client may obtain the delay time of each data node accessed in the first time period, then select any one of the delay times as a reference value (for example, select the maximum value or the minimum value thereof as the reference value), and determine the relative delay time of each data node, that is, the ratio of the delay time of each data node to the reference value, with the reference value as a reference. The client may determine the relative delay time of the data node as the delay parameter of the data node.

The larger the relative delay time of the data node is, the higher the delay degree of the data node relative to other data nodes is, and therefore, the relative delay time of the data node can be used as the delay parameter of the data node.
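The relative delay time described above can be illustrated with a minimal Python sketch. The function name, the node identifiers, and the use of milliseconds are assumptions made for illustration only; the disclosure allows any of the obtained delay times (for example, the minimum or the maximum) to serve as the reference value.

```python
def relative_delays(delay_times, pick_reference=min):
    """Map each data node's delay time to its ratio against a reference value.

    delay_times: dict of node id -> delay time of that node in the first
    time period (milliseconds in this sketch).
    pick_reference: how the reference value is chosen from the obtained
    delay times (min or max here; the choice is an assumption).
    """
    if not delay_times:
        return {}
    reference = pick_reference(delay_times.values())
    if reference <= 0:
        raise ValueError("reference delay time must be positive")
    return {node: t / reference for node, t in delay_times.items()}


# Hypothetical delay times observed by one client.
observed = {"dn1": 2.0, "dn2": 4.0, "dn3": 20.0}
params = relative_delays(observed)
print(params)  # {'dn1': 1.0, 'dn2': 2.0, 'dn3': 10.0}
```

With the minimum as the reference, every delay parameter is at least 1 and a larger value indicates a slower node; with the maximum as the reference, the values fall in (0, 1] and keep the same ordering.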

In the embodiment of the present disclosure, converting the delay times to relative values allows the performance of different data nodes to be compared from the perspective of a single client. This avoids comparing the performance that different clients observe for a given data node, and effectively prevents performance jitter caused by network jitter.

In one possible implementation, the delay time of one access of the data node may be a unit delay time for completing data access of a specified data amount in the access request. In one example, taking any one data node accessed by the client within the first time period as an example, the obtaining the delay time of the data node may include: acquiring the accessed data volume and delay time when the data node is accessed each time in the first time period; determining unit delay time corresponding to data with specified data volume in the data node accessed each time according to the data volume and the delay time accessed each time; and determining the average value of the unit delay time of accessing the data node in each time in the first time period as the delay time of the data node.

The unit delay time corresponding to data of the specified data volume in the data node is determined from the data volume and the delay time of each access as follows: unit delay time = delay time of the access / data volume of the access.

The specified data amount may be set as required, for example, the specified data amount may be 4KB, and the present disclosure is not limited thereto.

In the embodiment of the disclosure, the delay time is normalized, the unit delay time is used as the delay time of one-time access of the data node, and the delay time of each request is unified into the delay time of data access of a specified data volume, so that the delay time of one-time access of the data node is not affected by the data volume factor of the access, and the accuracy is improved.
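As a rough illustration of this normalization, the following Python sketch computes a unit delay time per access and averages it per data node. It assumes one reading of the formula above, namely that the accessed data volume is expressed in multiples of the specified data volume (4KB here); the function names and sample values are hypothetical.

```python
SPECIFIED_BYTES = 4 * 1024  # the 4KB example of a specified data volume


def unit_delay(delay_ms, accessed_bytes, specified_bytes=SPECIFIED_BYTES):
    """Delay time attributed to one specified-size unit of data in one access."""
    if accessed_bytes <= 0:
        raise ValueError("accessed_bytes must be positive")
    # Assumption: data volume is counted in units of the specified data volume.
    return delay_ms / (accessed_bytes / specified_bytes)


def node_delay_time(accesses, specified_bytes=SPECIFIED_BYTES):
    """Average the unit delay times of all accesses to one data node.

    accesses: list of (delay_ms, accessed_bytes) pairs recorded for the node
    within the first time period.
    """
    if not accesses:
        raise ValueError("no accesses recorded for this node")
    units = [unit_delay(d, b, specified_bytes) for d, b in accesses]
    return sum(units) / len(units)


# Three hypothetical accesses: 8KB in 8 ms, 4KB in 2 ms, 64KB in 48 ms.
accesses = [(8.0, 8192), (2.0, 4096), (48.0, 65536)]
print(node_delay_time(accesses))  # 3.0 (ms per 4KB)
```

Without this step, the 64KB access would look far slower than the 4KB access simply because it moves more data.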

In step S12, the data node cluster may be a cluster formed by all data nodes in the distributed storage system. The data nodes accessed by different clients in the first time period may be different or the same.

The client may transmit the delay parameter acquired in step S11 to the master node. The master node may receive the delay parameters sent by the client, and the delay parameters sent by the client may include delay parameters of data nodes accessed by the client in the first time period. The master node may determine, according to the delay parameter of each data node accessed by the client within the first time period, a probability that each data node in the data node cluster is allocated to the client for access.
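The disclosure does not specify how the delay parameters are encoded or transported. Purely as an assumption, the sketch below packages one client's delay parameters for a time period into a JSON message; the `DelayReport` type and all of its field names are hypothetical.

```python
import json
from dataclasses import dataclass, asdict


@dataclass
class DelayReport:
    """Hypothetical report sent from a client to the master node."""
    client_id: str
    period_start: float   # start of the first time period (epoch seconds)
    period_end: float     # end of the first time period (epoch seconds)
    delay_params: dict    # node id -> delay parameter


def encode_report(report):
    """Serialize a report on the client side."""
    return json.dumps(asdict(report), sort_keys=True)


def decode_report(payload):
    """Rebuild the report on the master node side."""
    return DelayReport(**json.loads(payload))


report = DelayReport("client-1", 0.0, 3600.0, {"dn1": 1.0, "dn2": 2.5})
received = decode_report(encode_report(report))
```

Any serialization and transport could be substituted; the only requirement taken from the text is that the master node receives a per-node delay parameter for the nodes the client accessed.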

FIG. 3 shows a flow diagram of a distributed storage system management method according to an embodiment of the present disclosure. The method may be applied to a master node as shown in fig. 2. As shown in fig. 3, the method may include:

Step S21, obtaining a delay parameter from the client, where the delay parameter obtained from the client includes a delay parameter of each data node accessed by the client in the first time period, and the delay parameter of the data node is used to indicate a delay degree of the client accessing the data node.

Step S22, determining the probability that each data node in the data node cluster is allocated to the client for access according to the acquired delay parameters.

In the embodiment of the disclosure, the probability that the data node is allocated to the client for access is determined by the delay degree of the data node, and when the data node becomes a slow node, the probability that the data node is accessed by the client is adjusted instead of deleting the data node, so that the storage capacity of the distributed storage system is ensured to be unchanged, and the speed of accessing the data node by the client is increased.

In step S21, the master node may obtain the delay parameters from different clients, and the delay parameters obtained from the clients may include delay parameters of data nodes accessed by the clients in the first time period, where the delay parameters of the data nodes are used to indicate the delay degree of the clients accessing the data nodes.

For step S21, refer to the description of step S11; the details are not repeated here.

In step S22, the master node may determine a probability that each data node in the data node cluster is allocated to the client for access according to the delay parameter of each data node accessed by the client within the first time period.

The master node may compute a weighted score for each data node in the data node cluster according to the delay parameters of the data nodes accessed by the client in the first time period. The score may be on a 100-point scale: a data node with a high delay degree (a slow node) receives a low score, and a data node with a low delay degree (a normal node) receives a high score. The score may represent the probability that the data node is allocated to the client for access. When the master node subsequently allocates data nodes, a slow node is allocated to clients with a lower probability, and a normal node is allocated to clients with a higher probability.
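The text does not give a scoring formula. One possible sketch, under the assumption that a node's score is inversely proportional to its delay parameter and that allocation is a weighted random choice over the scores, is shown below; the function names are hypothetical.

```python
import random


def score_nodes(delay_params):
    """Score each node in (0, 100]; a higher delay parameter gives a lower score.

    The inverse-proportional rule is an assumption of this sketch.
    """
    if any(p <= 0 for p in delay_params.values()):
        raise ValueError("delay parameters must be positive")
    best = min(delay_params.values())
    return {node: 100.0 * best / p for node, p in delay_params.items()}


def allocate(scores, rng, k=1):
    """Pick k data nodes, each with probability proportional to its score."""
    nodes = list(scores)
    weights = [scores[n] for n in nodes]
    return rng.choices(nodes, weights=weights, k=k)


scores = score_nodes({"dn1": 1.0, "dn2": 2.0, "dn3": 10.0})
print(scores)  # {'dn1': 100.0, 'dn2': 50.0, 'dn3': 10.0}
picks = allocate(scores, random.Random(0), k=10000)
```

The slow node `dn3` is still selected occasionally rather than being removed, which matches the stated goal of keeping the storage capacity unchanged.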

In one possible implementation, step S22 may include: acquiring the probability that each data node in the data node cluster determined before the current moment is allocated to a client for access; and adjusting the acquired probability according to the acquired delay parameters.

The master node may determine the probability that each subsequent data node is allocated to the client for access based on the previously determined probability that each data node is allocated to the client for access and the currently obtained delay parameter of each data node.

In one possible implementation, adjusting the obtained probability according to the obtained delay parameter may include: for any data node in the data node cluster, the master node may compare the delay parameter of the currently acquired data node with a delay threshold, and adjust the probability that the previously determined data node is allocated to the client for access according to the comparison result, so as to obtain the probability that the subsequent data node is allocated to the client for access. Specifically, if the delay parameter of the currently acquired data node is greater than the delay threshold, the master node reduces the probability that the previously determined data node is allocated to the client for access; if the delay parameter of the currently acquired data node is less than or equal to the delay threshold, the master node may increase the probability that the previously determined data node is allocated to the client for access.

Thus, for the previously determined slow node, if the currently obtained delay parameter of the data node indicates that the delay degree of the data node is higher (corresponding to being greater than the delay threshold), the master node may reduce the previously determined probability that the data node is allocated to the client for access, and obtain the probability that the subsequent data node is allocated to the client for access. In this way, the slow node can allocate fewer access requests, and the long tail delay of the user is not greatly influenced to a certain extent.

For a previously determined slow node, if the currently acquired delay parameter indicates that the delay degree of the data node is low (that is, less than or equal to the delay threshold), the master node may increase the previously determined probability that the data node is allocated to the client for access, and use the result as the probability for subsequent allocation. Therefore, after a slow node returns to normal, more access requests can be allocated to it, which improves the utilization of the data node and prevents the total storage capacity of the distributed storage system from continuously decreasing.
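The threshold comparison described above can be sketched as follows. This is only a minimal illustration: the function name `adjust_by_threshold`, the fixed step size, and the lower and upper bounds are assumptions that the disclosure does not specify; the disclosure states only the direction of the adjustment.

```python
def adjust_by_threshold(prev_prob, delay_param, delay_threshold,
                        step=0.1, min_prob=0.05, max_prob=1.0):
    """Return the new allocation probability for one data node.

    step, min_prob and max_prob are hypothetical tuning values; the
    disclosure only says the probability is reduced or increased.
    """
    if delay_param > delay_threshold:
        # Delay above the threshold: allocate this node less often.
        new_prob = prev_prob - step
    else:
        # Delay at or below the threshold: allocate this node more often.
        new_prob = prev_prob + step
    # Clamp the result so a slow node is never removed outright and a
    # probability never exceeds max_prob.
    return min(max_prob, max(min_prob, new_prob))


if __name__ == "__main__":
    # A slow node (relative delay 1.0) against an assumed threshold of 0.5.
    print(adjust_by_threshold(0.8, delay_param=1.0, delay_threshold=0.5))
    # A recovered node (relative delay 0.2) against the same threshold.
    print(adjust_by_threshold(0.2, delay_param=0.2, delay_threshold=0.5))
```

Keeping a non-zero lower bound reflects the stated goal of lowering the load on a slow node rather than deleting it.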

In one possible implementation, adjusting the acquired probability according to the acquired delay parameters may include: for any data node in the data node cluster, when the acquired delay parameter of the data node increases, the master node reduces the acquired probability of the data node, and when the acquired delay parameter of the data node decreases, the master node increases the acquired probability of the data node.

When the acquired delay parameter of a data node increases, the delay degree of the data node has increased and the data node is under greater read/write pressure. The master node can then reduce the probability that the data node is allocated to the client for access, so that the data node is allocated with a lower probability and its pressure is reduced.

When the acquired delay parameter of a data node decreases, the delay degree of the data node has decreased and the data node is under less read/write pressure. The master node can then increase the probability that the data node is allocated to the client for access, so that the data node is allocated with a higher probability and the pressure on other data nodes (for example, data nodes with a higher delay degree) is reduced.
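The trend-based variant can be sketched in the same way. The names `adjust_by_trend` and `adjust_cluster`, the step size, and the bounds are illustrative assumptions; the disclosure does not say how large each adjustment is or how unreported nodes are handled (here they simply keep their previous probability).

```python
def adjust_by_trend(prev_prob, prev_delay, cur_delay,
                    step=0.1, min_prob=0.05, max_prob=1.0):
    """Lower the probability when the delay parameter rose, raise it when it fell."""
    if cur_delay > prev_delay:
        new_prob = prev_prob - step
    elif cur_delay < prev_delay:
        new_prob = prev_prob + step
    else:
        new_prob = prev_prob  # unchanged delay: keep the probability
    return min(max_prob, max(min_prob, new_prob))


def adjust_cluster(prev_probs, prev_delays, cur_delays, step=0.1):
    """Apply adjust_by_trend to every node that has both an old and a new delay."""
    new_probs = {}
    for node, prob in prev_probs.items():
        if node in prev_delays and node in cur_delays:
            prob = adjust_by_trend(prob, prev_delays[node], cur_delays[node], step)
        new_probs[node] = prob
    return new_probs


if __name__ == "__main__":
    print(adjust_cluster(
        prev_probs={"dn1": 0.5, "dn2": 0.5, "dn3": 0.5},
        prev_delays={"dn1": 0.2, "dn2": 0.4},
        cur_delays={"dn1": 0.6, "dn2": 0.3},
    ))
```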

FIG. 4 shows a flow diagram of a distributed storage system management method according to an embodiment of the present disclosure. The method may be applied to the distributed storage system shown in fig. 2. As shown in fig. 4, the method may include:

In step S31, the client acquires a delay parameter of each data node accessed in the first time period, where the delay parameter indicates the delay degree of the client accessing the data node.

In step S32, the client sends the acquired delay parameters of each data node to the master node.

In step S33, the master node acquires the delay parameter from the client.

In step S34, the master node determines, according to the acquired delay parameters, the probability that each data node in the data node cluster is allocated to the client for access.

In the embodiment of the disclosure, the probability that the data node is allocated to the client for access is determined by the delay degree of the data node, and when the data node becomes a slow node, the probability that the data node is accessed by the client is adjusted instead of deleting the data node, so that the storage capacity of the distributed storage system is ensured to be unchanged, and the speed of accessing the data node by the client is increased.
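One way to read "the probability that a data node is allocated to the client" is as a relative weight used when the master node picks a data node for a request. The sketch below illustrates that reading only; the function `pick_node` and the use of weighted random sampling are assumptions, since the disclosure does not describe the selection mechanism.

```python
import random


def pick_node(probabilities, rng=random):
    """Pick one data node, treating each probability as a relative weight."""
    nodes = list(probabilities)
    weights = [probabilities[node] for node in nodes]
    # random.choices performs weighted sampling with replacement.
    return rng.choices(nodes, weights=weights, k=1)[0]


if __name__ == "__main__":
    probs = {"dn1": 0.8, "dn2": 0.8, "dn3": 0.2, "dn4": 0.8}
    rng = random.Random(0)  # seeded so the run is repeatable
    counts = {node: 0 for node in probs}
    for _ in range(10000):
        counts[pick_node(probs, rng)] += 1
    # dn3 (the slow node) still receives requests, only fewer of them.
    print(counts)
```

Because the slow node keeps a non-zero weight, it stays in the cluster and its capacity remains available.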

For steps S31 and S32, refer to steps S11 and S12; the details are not repeated here.

In one possible implementation, step S31 may include: the client acquires the delay time of each data node accessed in the first time period; the client selects a reference value from the acquired delay times; and the client determines the ratio of the delay time of each data node accessed in the first time period to the reference value as the delay parameter of that data node.
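The ratio computation can be illustrated as below. The choice of the largest delay time as the reference value is an assumption made for this sketch; the text says only that the reference value is selected from the acquired delay times. The function name `relative_delay_params` and the sample delay values are likewise hypothetical.

```python
def relative_delay_params(delay_times, reference=None):
    """Map each node's delay time to its ratio against a reference value.

    delay_times: dict of node name -> delay time (any consistent unit).
    reference:   optional reference value; defaults to the largest delay
                 time, which is an assumption of this sketch.
    """
    if not delay_times:
        return {}
    if reference is None:
        reference = max(delay_times.values())
    if reference <= 0:
        raise ValueError("reference value must be positive")
    return {node: t / reference for node, t in delay_times.items()}


if __name__ == "__main__":
    # Hypothetical delay times in milliseconds for three data nodes.
    print(relative_delay_params({"dn1": 2.0, "dn2": 2.0, "dn3": 10.0}))
```

With the maximum as reference, the slowest node always has a delay parameter of 100% and the others are expressed relative to it.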

In one example, the client acquiring the delay time of each data node accessed in the first time period may include, for each data node accessed in the first time period: the client acquires the data volume accessed and the delay time for each access to the data node in the first time period; the client determines, from the data volume and the delay time of each access, the unit delay time corresponding to a specified data volume for that access; and the client determines the average of the unit delay times of all accesses to the data node in the first time period as the delay time of the data node.
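A minimal sketch of this per-node calculation follows. It assumes that the unit delay time scales linearly with the data volume and uses 1 MiB as the specified data volume; both are assumptions of the sketch, as is the function name `node_delay_time`.

```python
MIB = 1024 * 1024


def node_delay_time(accesses, unit_bytes=MIB):
    """Average unit delay time of one data node over the first time period.

    accesses:   list of (data_volume_bytes, delay_seconds), one per access.
    unit_bytes: the specified data volume (1 MiB here, a hypothetical choice).
    Returns None when there is no usable access record.
    """
    unit_delays = [
        delay / volume * unit_bytes  # delay normalised to unit_bytes of data
        for volume, delay in accesses
        if volume > 0
    ]
    if not unit_delays:
        return None
    return sum(unit_delays) / len(unit_delays)


if __name__ == "__main__":
    # Two hypothetical accesses: 2 MiB in 40 ms and 1 MiB in 30 ms.
    print(node_delay_time([(2 * MIB, 0.04), (1 * MIB, 0.03)]))
```

Normalising by data volume keeps a node from looking slow merely because it served larger requests.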

For steps S33 and S34, refer to steps S21 and S22; the details are not repeated here.

In one possible implementation, step S34 may include: the master node acquires the probability, determined before the current moment, that each data node in the data node cluster is allocated to the client for access; and the master node adjusts the acquired probability according to the acquired delay parameters.

In one possible implementation, the master node adjusting the acquired probability according to the acquired delay parameters may include: for any data node in the data node cluster, when the acquired delay parameter of the data node increases, the master node reduces the acquired probability of the data node, and when the acquired delay parameter of the data node decreases, the master node increases the acquired probability of the data node.

Application example

As shown in fig. 2, client 1 accesses data node 1, data node 2, and data node 3 during the first time period. Client 2 accesses data node 1, data node 3, and data node 4 during the first time period. Data node 1, data node 2, data node 3, and data node 4 form a data node cluster. The following describes a distributed storage system management method according to an embodiment of the present disclosure, taking the relative delay time as an example of the delay parameter.

The client 1 acquires: the relative delay time (delay parameter) of the data node 1 is 20%; the relative delay time of the data node 2 is 20%, and the relative delay time of the data node 3 is 100%. The client 1 transmits the relative delay times of the data node 1, the data node 2 and the data node 3 to the master node.

The client 2 acquires: the relative delay time (delay parameter) of the data node 1 is 20%; the relative delay time of the data node 3 is 100%, and the relative delay time of the data node 4 is 20%. The client 2 sends the relative delay times of the data node 1, the data node 3 and the data node 4 to the master node.

The master node acquires the relative delay times from client 1 and client 2, respectively, and determines, according to the acquired relative delay times, the probability that each data node in the data node cluster is allocated to the client for access. In this example, the master node determines that the probability that data node 1 is allocated to the client for access is 80%, the probability for data node 2 is 80%, the probability for data node 3 is 20%, and the probability for data node 4 is 80%.

The relative delay times reported by client 1 for the data nodes it accessed in the first time period show that the delay degree of data node 3 is high, and the relative delay times reported by client 2 show the same. Therefore, the probability determined by the master node that data node 3 is allocated to the client for access is low. In this way, the pressure on data node 3 (a slow node) is relieved. By reducing the pressure on data node 3 instead of deleting it, the speed at which clients access data nodes can be improved while the storage capacity of the distributed storage system remains unchanged.
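The numbers in this application example can be reproduced with the sketch below. The disclosure gives only the input relative delay times and the resulting probabilities, not the formula that links them; averaging the reports per node and mapping with `max(min_prob, 1 - relative_delay)` with a floor of 20% are assumptions chosen here because they happen to match the example, and the function names are hypothetical.

```python
from collections import defaultdict


def aggregate_reports(reports):
    """Average the relative delay reported for each node across all clients."""
    collected = defaultdict(list)
    for report in reports:
        for node, relative_delay in report.items():
            collected[node].append(relative_delay)
    return {node: sum(vals) / len(vals) for node, vals in collected.items()}


def to_probability(relative_delay, min_prob=0.2):
    """Hypothetical mapping: higher delay gives lower probability, with a floor."""
    return max(min_prob, 1.0 - relative_delay)


if __name__ == "__main__":
    client1 = {"dn1": 0.2, "dn2": 0.2, "dn3": 1.0}  # reported by client 1
    client2 = {"dn1": 0.2, "dn3": 1.0, "dn4": 0.2}  # reported by client 2
    delays = aggregate_reports([client1, client2])
    probs = {node: to_probability(d) for node, d in sorted(delays.items())}
    print(probs)
```

The floor keeps data node 3 in service at a reduced share of requests, consistent with lowering its pressure rather than removing it.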

Fig. 5 illustrates a block diagram of a distributed storage system management apparatus according to an embodiment of the present disclosure. The device can be applied to a client. As shown in fig. 5, the apparatus 40 may include:

an obtaining module 41, configured to obtain a delay parameter of each data node accessed in a first time period, where the delay parameter is used to indicate a delay degree of a client accessing the data node;

a sending module 42, configured to send the obtained delay parameter of each data node to a master node, so that the master node determines, according to the delay parameter, a probability that each data node is allocated to a client for access.

In the embodiment of the disclosure, the probability that the data node is allocated to the client for access is determined by the delay degree of the data node, and when the data node becomes a slow node, the probability that the data node is accessed by the client is adjusted instead of deleting the data node, so that the storage capacity of the distributed storage system is ensured to be unchanged, and the speed of accessing the data node by the client is increased.

In a possible implementation manner, the obtaining module is further configured to:

acquiring the delay time of each data node accessed in the first time period;

selecting a reference value from the obtained delay time;

and determining the ratio of the delay time of each data node accessed in the first time period to the reference value as the delay parameter of each data node accessed in the first time period.

In a possible implementation manner, the obtaining module is further configured to:

for each data node accessed within the first time period:

acquiring the accessed data volume and delay time when the data node is accessed each time in the first time period;

determining unit delay time corresponding to data with specified data volume in the data node accessed each time according to the data volume and the delay time accessed each time;

and determining the average value of the unit delay time of accessing the data node in each time in the first time period as the delay time of the data node.

Fig. 6 illustrates a block diagram of a distributed storage system management apparatus according to an embodiment of the present disclosure. The apparatus may be applied to a master node. As shown in fig. 6, the apparatus 50 may include:

an obtaining module 51, configured to obtain a delay parameter from a client, where the delay parameter obtained from the client includes a delay parameter of each data node accessed by the client in a first time period, and the delay parameter of the data node is used to indicate a delay degree of the client accessing the data node;

and the determining module 52 is configured to determine, according to the obtained delay parameter, a probability that each data node in the data node cluster is allocated to the client for access.

In the embodiment of the disclosure, the probability that the data node is allocated to the client for access is determined by the delay degree of the data node, and when the data node becomes a slow node, the probability that the data node is accessed by the client is adjusted instead of deleting the data node, so that the storage capacity of the distributed storage system is ensured to be unchanged, and the speed of accessing the data node by the client is increased.

In one possible implementation, the determining module is further configured to:

acquiring the probability that each data node in the data node cluster determined before the current moment is allocated to a client for access;

and adjusting the acquired probability according to the acquired delay parameters.

In a possible implementation manner, the determining module is specifically configured to:

for any data node in the data node cluster, when the acquired delay parameter of the data node increases, reduce the acquired probability of the data node, and when the acquired delay parameter of the data node decreases, increase the acquired probability of the data node.

An embodiment of the present disclosure further provides a distributed storage system that includes a client, data nodes, and a master node. The client is configured to acquire a delay parameter of each data node accessed in a first time period, where the delay parameter indicates the delay degree of the client accessing the data node, and to send the acquired delay parameters to the master node. The master node is configured to acquire the delay parameters from the client and to determine, according to the acquired delay parameters, the probability that each data node in the data node cluster is allocated to the client for access.

As will be appreciated by one skilled in the art, embodiments of the present disclosure may be provided as a method, system, or computer program product. Accordingly, the present disclosure may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present disclosure may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and so forth) having computer-usable program code embodied therein.

The present disclosure is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of the disclosure. It will be understood that each flow and/or block of the flow diagrams and/or block diagrams, and combinations of flows and/or blocks in the flow diagrams and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.

These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.

These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.

In a typical configuration, a computing device includes one or more processors (CPUs), input/output interfaces, network interfaces, and memory.

The memory may include volatile memory in a computer-readable medium, such as random access memory (RAM), and/or non-volatile memory, such as read-only memory (ROM) or flash memory (flash RAM). Memory is an example of a computer-readable medium.

Computer-readable media, including both permanent and non-permanent, removable and non-removable media, may implement information storage by any method or technology. The information may be computer-readable instructions, data structures, modules of a program, or other data. Examples of computer storage media include, but are not limited to, phase change memory (PRAM), static random access memory (SRAM), dynamic random access memory (DRAM), other types of random access memory (RAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), flash memory or other memory technology, compact disc read-only memory (CD-ROM), digital versatile discs (DVD) or other optical storage, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices, or any other non-transmission medium that can be used to store information that can be accessed by a computing device. As defined herein, computer-readable media do not include transitory computer-readable media such as modulated data signals and carrier waves.

It should also be noted that the terms "comprises," "comprising," or any other variation thereof are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements includes not only those elements but may also include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising a ..." does not exclude the presence of other like elements in the process, method, article, or apparatus that comprises the element.

As will be appreciated by one skilled in the art, embodiments of the present application may be provided as a method, system, or computer program product. Accordingly, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present application may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.

The above description is only an example of the present application and is not intended to limit the present application. Various modifications and changes may occur to those skilled in the art. Any modification, equivalent replacement, improvement, etc. made within the spirit and principle of the present application should be included in the scope of the claims of the present application.
