Construction method and application of distributed learning index model

Document No.: 1952726    Publication date: 2021-12-10

Reading note: This technique, "Construction method and application of a distributed learning index model" (一种分布式学习索引模型的构建方法及应用), was designed and created by 华宇 (Hua Yu) and 李鹏飞 (Li Pengfei) on 2021-09-18. Its main content is as follows: The invention discloses a construction method and application of a distributed learning index model, belonging to the technical field of computer distributed storage. The method comprises: for each storage node, sorting the data it stores by key, then training a machine learning model with the stored keys as input and the corresponding sorted positions as output, so as to obtain a learning index model for each storage node and synchronize it to all compute nodes. A compute node modifies the data in a storage node directly through RDMA operations, without involving the storage node's CPU; meanwhile, the compute node asynchronously retrains the old model and synchronizes the new model to the storage node. By moving the data and model modification operations onto the compute nodes of the distributed system, the CPU overhead of the storage nodes is greatly reduced.

1. A method for constructing a distributed learning index model, characterized by comprising the following steps: for each storage node, sorting the data it stores by key; then, taking the keys of the stored data as input and the corresponding sorted positions as output, training a machine learning model to obtain a learning index model for each storage node, and synchronizing the learning index model to all compute nodes;

the learning index model comprises a plurality of independent index sub-models; the data stored by the storage node are divided into a plurality of data intervals, each index sub-model indexes the data in one data interval, and the data intervals covered by the index sub-models do not overlap; each index sub-model is trained on the data in its corresponding data interval, such that the maximum error of each index sub-model is smaller than a preset threshold;

wherein the maximum error of the k-th index sub-model is max_error_k = |Y_(k,i) − f(X_(k,i))| + δ, for i = 1, 2, …, N_k; N_k is the number of data items in the data interval corresponding to the k-th index sub-model; Y_(k,i) is the sorted position of the i-th data item in the data interval corresponding to the k-th index sub-model; X_(k,i) is the key of the i-th data item in that interval; f(X_(k,i)) is the sorted position of the i-th data item in that interval as predicted by the k-th index sub-model; and δ is a deviation value;

the data stored in the storage node are sorted and stored in a plurality of arrays of size δ; the physical addresses of all arrays are stored in an address translation table, and the address translation table is synchronized to the corresponding compute nodes.

2. The method for constructing a distributed learning index model according to claim 1, wherein the data stored in the storage node are sorted and stored in a linked list formed by the plurality of arrays of size δ; in this case, each array corresponds to a node of the linked list.

3. The method for constructing a distributed learning index model according to claim 1 or 2, wherein the index sub-model is a linear regression model stored as a pair <key*, model>, where key* is the minimum (or maximum) key in the data interval covered by the index sub-model, and model is the model parameters of the index sub-model.

4. A method for inserting data into a learning index model constructed by the method for constructing a distributed learning index model according to any one of claims 1 to 3, comprising the following steps:

s11, the calculation node calculates the sequencing position of the data to be inserted by adopting the learning index model thereon;

s12, converting the obtained sorting position into a physical position of a corresponding array based on an address conversion table on the computing node, wherein the computing node reads the corresponding array from a storage node through unilateral RDMA operation according to the physical address; judging whether the read array contains a key of the data to be inserted, if so, ending the operation; otherwise, go to step S13;

s13, pre-allocating an array on the storage node to store the insertion data, if the pre-allocated array has a free slot, the computing node directly inserts the data to be inserted into the free slot of the pre-allocated array on the storage node; otherwise, go to step S14;

s14, creating a new array for the data to be inserted on the storage node, updating an address translation table, synchronizing the address translation table to the corresponding computing node, and inserting the data to be inserted into the free slot of the new array; and in this process, the compute nodes asynchronously perform the following operations: reading all data in a data interval where the data to be inserted are located on the corresponding storage node through unilateral RDMA operation, retraining an index sub-model corresponding to the data interval where the data to be inserted are located based on the read data, and synchronizing the obtained index sub-model to the corresponding storage node.

5. The insertion method according to claim 4, wherein, when the data stored in the storage node are sorted and stored in the linked list formed by the arrays of size δ, if the pre-allocated array has a free slot, the linked-list node corresponding to the pre-allocated array is locked using a CAW operation of the RDMA technique, and is unlocked after the data to be inserted has been written into a free position of that linked-list node;

if the pre-allocated array has no free slot, the linked-list node corresponding to the pre-allocated array and the address translation table are locked using CAW operations of the RDMA technique, and a new array is created on the storage node for the data to be inserted, serving as a new node of the linked list; the physical address of the new linked-list node is inserted into the address translation table, the data to be inserted is then inserted into a free slot of the new array, and finally the linked-list node corresponding to the pre-allocated array and the address translation table are unlocked.

6. The insertion method according to claim 4 or 5, wherein, when the compute node asynchronously reads all data in the data interval containing the data to be inserted on the corresponding storage node, the version of the corresponding index sub-model on the storage node is recorded; after the index sub-model corresponding to that data interval has been retrained on the read data, the version recorded at training time is compared with the current version of the index sub-model on the storage node; if the two versions are the same, the retrained index sub-model is synchronized to the corresponding storage node, and the version of the corresponding index sub-model on the storage node is updated, thereby ensuring that the model is updated correctly.

7. A method for querying a learning index model constructed by the method for constructing a distributed learning index model according to any one of claims 1 to 3, comprising the following steps:

s21, the calculation node calculates the sequencing position of the data to be inquired by adopting the learning index model thereon;

s22, converting the obtained sequencing position into a physical position of a corresponding array based on an address conversion table on the computing node, reading the corresponding array from a storage node through unilateral RDMA operation according to the physical address by the computing node, and obtaining a pointer of a pointing value corresponding to the data to be queried in the read array;

S23, the compute node reads the value to be queried from the corresponding storage node through a one-sided RDMA operation, according to the physical address pointed to by the value pointer.

8. A method for deleting data from a learning index model constructed by the method for constructing a distributed learning index model according to any one of claims 1 to 3, comprising the following steps:

s31, the calculation node calculates the sequencing position of the data to be deleted by adopting the learning index model thereon;

s32, converting the obtained sequencing position into a physical position of a corresponding array based on an address conversion table on the computing node, reading the corresponding array from the storage node by unilateral RDMA operation according to the physical address by the computing node, judging whether the read array contains a key of the data to be deleted, if not, judging that the data to be deleted does not exist in the corresponding storage node, and ending the operation; otherwise, go to step S33;

s33, the computing node obtains a pointer of a pointing value corresponding to the data to be deleted in the read array; and according to the physical address pointed by the pointer pointing to the value, deleting the data to be deleted on the corresponding storage node by the computing node through unilateral RDMA operation.

9. The deletion method according to claim 8, wherein, when the data stored in the storage node are sorted and stored in the linked list formed by the plurality of arrays of size δ, in step S33 the linked-list node containing the data to be deleted is determined based on the sorted position of the data to be deleted; that linked-list node is locked using a CAW operation of the RDMA technique, the data to be deleted is then removed from it, and finally it is unlocked;

when the deletion operation leaves a linked-list node empty, the array corresponding to the empty linked-list node is deleted, and the address translation table stored on the corresponding storage node is updated. During this process, the compute node asynchronously performs the following operations: reading, through one-sided RDMA operations, all data in the data interval containing the data to be deleted on the corresponding storage node; retraining the index sub-model corresponding to that data interval based on the read data; and synchronizing the resulting model to the corresponding storage node.

10. A machine-readable storage medium having stored thereon machine-executable instructions which, when invoked and executed by a processor, cause the processor to implement the method of any of claims 1-9.

Technical Field

The invention belongs to the technical field of computer distributed storage, and particularly relates to a construction method and application of a distributed learning index model.

Background

To meet the access and storage requirements of massive data, data centers store data in distributed systems, connecting the memories of different machines through a network to provide high-capacity, high-performance data storage and access services. Machines in a distributed system are divided into storage nodes and compute nodes, which gives the system greater flexibility and scalability; compute nodes can directly read and write the memory of storage nodes through RDMA, further improving the performance of network-based distributed storage systems. Storage systems use different index structures to meet different requirements, among which tree index structures are important for serving range queries. However, existing tree index structures are ill-suited to RDMA-based distributed storage systems, because traversing a multi-level tree requires a compute node to issue multiple RDMA operations before it can read the final data, which introduces expensive network overhead.

Existing learning index techniques use a machine learning model to learn the distribution of the data, so that data positions can be computed from the trained model: all models can be stored in a small amount of memory, the position of a datum can be computed using the strong computing power of a compute node, and the required data can then be read with a single RDMA operation. Compared with traditional tree index strategies, the space, time, and network overheads of a learning index are much smaller, making it better suited to RDMA-based distributed storage systems. However, existing learning index methods are difficult to use in a distributed storage system that separates storage nodes from compute nodes, mainly because of the following challenges:

high CPU overhead: for workloads that write frequently, the frequent changes in data cause the model to require heavy training as well. Because the retraining time is long and the old model cannot be used, in order to meet the requirement of concurrent request tasks, the existing method maintains a tree structure at the storage node to process dynamic data change, and puts all tasks (including adding, deleting, modifying and checking) to the storage node to do before retraining to obtain a new model. Under the condition, the storage nodes need to execute a large number of data modification tasks and frequently retrain the model, so that the calculation burden of the storage nodes is greatly increased, and the performance of the whole system is further reduced by the storage nodes with weak calculation capability, so that the data access performance which is good enough cannot be obtained through learning indexes.

Poor remote data access performance: one existing way to reduce the CPU overhead of storage nodes is to place the data modification and model recomputation tasks on the compute nodes, but this approach requires locking the data and the model to prevent other compute nodes from modifying them or reading inconsistent data and models. Locking blocks the execution of other tasks, greatly reducing the concurrency of the system, and cannot meet the requirements of efficient data storage and access.

In summary, existing learning index schemes that support concurrent retraining all require new data to be placed temporarily at an extra address, introducing multiple network round trips per access, and are therefore unsuitable for RDMA-based distributed network storage systems.

Disclosure of Invention

To address the above defects or improvement needs of the prior art, the present invention provides a construction method and application of a distributed learning index model, aiming to solve the technical problem of high storage-node CPU overhead in the prior art.

In order to achieve the above object, in a first aspect, the present invention provides a method for constructing a distributed learning index model, including:

for each storage node, sorting the data it stores by key; then, taking the keys of the stored data as input and the corresponding sorted positions as output, training a machine learning model to obtain a learning index model for each storage node, and synchronizing the learning index model to all compute nodes;

the learning index model comprises a plurality of independent index sub-models; the data stored by the storage node are divided into a plurality of data intervals, each index sub-model indexes the data in one data interval, and the data intervals covered by the index sub-models do not overlap; each index sub-model is trained on the data in its corresponding data interval, such that the maximum error of each index sub-model is smaller than a preset threshold;

wherein the maximum error of the k-th index sub-model is max_error_k = |Y_(k,i) − f(X_(k,i))| + δ, for i = 1, 2, …, N_k; N_k is the number of data items in the data interval corresponding to the k-th index sub-model; Y_(k,i) is the sorted position of the i-th data item in the data interval corresponding to the k-th index sub-model; X_(k,i) is the key of the i-th data item in that interval; f(X_(k,i)) is the sorted position of the i-th data item in that interval as predicted by the k-th index sub-model; and δ is a deviation value;

the data stored in the storage node are sorted and stored in a plurality of arrays of size δ; the physical addresses of all arrays are stored in an address translation table, and the address translation table is synchronized to the corresponding compute nodes.
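The maximum-error definition above can be sketched in a few lines. This is a hypothetical illustration: the linear form of the sub-model f, the slope/intercept values, and the toy keys are assumptions for the example, not taken from the patent.

```python
# Deviation value delta (also the array size); assumed small for the example.
DELTA = 4

def predict(slope, intercept, key):
    """Linear index sub-model f(x) = slope * x + intercept, rounded to a position."""
    return round(slope * key + intercept)

def max_error(keys, positions, slope, intercept):
    """max_error_k = max_i |Y_(k,i) - f(X_(k,i))| + delta over the sub-model's interval."""
    worst = max(abs(p - predict(slope, intercept, k))
                for k, p in zip(keys, positions))
    return worst + DELTA

# Toy interval: keys that are nearly linear in their sorted positions.
keys = [10, 20, 30, 41, 50]
positions = [0, 1, 2, 3, 4]
err = max_error(keys, positions, slope=0.1, intercept=-1.0)
```

Here the line predicts every position almost exactly, so the reported error is dominated by the deviation δ that reserves room for data to move within its array.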

Further preferably, the data stored by the storage node are sorted and stored in a linked list formed by the plurality of arrays of size δ; in this case, each array corresponds to a node of the linked list.

Further preferably, the index sub-model is a linear regression model stored as a pair <key*, model>, where key* is the minimum (or maximum) key in the data interval covered by the index sub-model, and model is the model parameters of the index sub-model.
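Storing sub-models as sorted <key*, model> pairs makes selecting the responsible sub-model a binary search over the key* boundaries. A minimal sketch, with illustrative boundary keys and parameters (none of these values come from the patent):

```python
import bisect

# Sub-models as (key*, (slope, intercept)) pairs, sorted by key*, where key*
# is the smallest key of the interval each sub-model covers (assumed here).
submodels = [
    (0,   (0.10, 0.0)),
    (100, (0.05, 10.0)),
    (500, (0.02, 30.0)),
]
boundaries = [k for k, _ in submodels]

def pick_submodel(key):
    """Binary-search key* to find the sub-model whose interval contains key."""
    idx = bisect.bisect_right(boundaries, key) - 1
    return submodels[idx][1]

slope, intercept = pick_submodel(240)      # falls in the [100, 500) interval
pos = round(slope * 240 + intercept)       # predicted sorted position
```

Because the intervals do not overlap, the search always lands on exactly one sub-model, which can then be retrained and replaced independently of its neighbors.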

In a second aspect, the present invention provides a method for inserting data into a learning index model constructed by the above construction method, comprising the following steps:

s11, the calculation node calculates the sequencing position of the data to be inserted by adopting the learning index model thereon;

s12, converting the obtained sequencing position into a physical position of a corresponding array based on an address conversion table on the computing node, and reading the corresponding array from the storage node by the computing node through unilateral RDMA operation according to the physical address; judging whether the read array contains the key value key of the data to be inserted, if so, ending the operation; otherwise, go to step S13;

s13, pre-allocating an array on the storage node to store the insertion data, if the pre-allocated array has a free slot, the computing node directly inserts the data to be inserted into the free slot of the pre-allocated array on the storage node; otherwise, go to step S14;

s14, creating a new array for the data to be inserted on the storage node, updating the address translation table, synchronizing the address translation table to the corresponding computing node, and inserting the data to be inserted into the idle slot of the new array; and in the process, the computing nodes asynchronously perform the following operations: reading all data in a data interval where the data to be inserted are located on the corresponding storage node through unilateral RDMA operation, retraining an index sub-model corresponding to the data interval where the data to be inserted are located based on the read data, and synchronizing the obtained index sub-model to the corresponding storage node.

Further preferably, when the data stored in the storage node are sorted and stored in the linked list formed by the plurality of arrays of size δ, in step S13, if the pre-allocated array has a free slot, the linked-list node corresponding to the pre-allocated array is locked using a CAW operation of the RDMA technique, and is unlocked after the data has been written into a free position of that linked-list node;

if the pre-allocated array has no free slot, the linked-list node corresponding to the pre-allocated array and the address translation table are locked using CAW operations of the RDMA technique, and a new array is created on the storage node for the data to be inserted, serving as a new node of the linked list; the physical address of the new linked-list node is inserted into the address translation table, the data to be inserted is then inserted into a free slot of the new array, and finally the linked-list node corresponding to the pre-allocated array and the address translation table are unlocked.

Further preferably, when the compute node asynchronously reads all data in the data interval containing the data to be inserted on the corresponding storage node, the version of the corresponding index sub-model on the storage node is recorded; after the index sub-model corresponding to that data interval has been retrained on the read data, the version recorded at training time is compared with the current version of the index sub-model on the storage node; if the two versions are the same, the retrained index sub-model is synchronized to the corresponding storage node, and the version of the corresponding index sub-model on the storage node is updated, thereby ensuring that the model is updated correctly.
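The version check described above can be sketched as a compare-then-install step. This single-threaded sketch only illustrates the control flow; the model-store layout, names, and the `train` callable are assumptions, and a real system would perform the comparison with an atomic RDMA operation.

```python
# Sub-models on the storage node, each with a monotonically increasing version.
models = {'sub_0': {'params': (0.1, 0.0), 'version': 7}}

def retrain_and_sync(model_id, train):
    seen_version = models[model_id]['version']   # recorded before reading the data
    new_params = train()                          # asynchronous retraining (stand-in)
    if models[model_id]['version'] == seen_version:
        # No newer model was installed meanwhile: publish ours and bump the version.
        models[model_id] = {'params': new_params, 'version': seen_version + 1}
        return True
    return False                                  # a newer model won; discard ours

ok = retrain_and_sync('sub_0', lambda: (0.12, -0.5))
```

If another node has already installed a newer sub-model, the stale retrain result is simply dropped, so an updated model can never be rolled back to an older one.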

In a third aspect, the present invention provides a method for querying a learning index model constructed by the above construction method, comprising the following steps:

s21, the calculation node calculates the sequencing position of the data to be inquired by adopting the learning index model thereon;

s22, converting the obtained sequencing position into a physical position of a corresponding array based on an address conversion table on the computing node, reading the corresponding array from the storage node by the computing node through unilateral RDMA operation according to the physical address, and obtaining a pointer of a pointing value corresponding to the data to be queried in the read array;

S23, the compute node reads the value to be queried from the corresponding storage node through a one-sided RDMA operation, according to the physical address pointed to by the value pointer.
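The query flow S21–S23 amounts to two remote reads: one for the array holding the <key, pointer> entry, one to follow the pointer. In this sketch, dictionaries stand in for remote memory and the RDMA reads; `DELTA`, the position mapping, and the address strings are illustrative assumptions.

```python
DELTA = 4
arrays = {0: [(10, 'addr_a'), (20, 'addr_b'), (30, 'addr_c'), None]}
address_table = [0]
value_store = {'addr_a': 'v10', 'addr_b': 'v20', 'addr_c': 'v30'}  # "remote" values

def query(key, predicted_pos):
    # First RDMA read: fetch the array the model points at (S21/S22).
    arr = arrays[address_table[min(predicted_pos // DELTA, len(address_table) - 1)]]
    for slot in arr:
        if slot is not None and slot[0] == key:
            # Second RDMA read: follow the value pointer (S23).
            return value_store[slot[1]]
    return None

v = query(20, 1)
```

Because the sub-model's maximum error (including δ) is bounded, the key is guaranteed to sit within the small fetched region, so two one-sided operations suffice.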

In a fourth aspect, the present invention provides a method for deleting data from a learning index model constructed by the above construction method, comprising the following steps:

s31, the calculation node calculates the sequencing position of the data to be deleted by adopting the learning index model thereon;

s32, converting the obtained sequencing position into a physical position of a corresponding array based on an address conversion table on the computing node, reading the corresponding array from the storage node by the computing node through unilateral RDMA operation according to the physical address, judging whether the read array contains a key of data to be deleted, if not, judging that the data to be deleted does not exist in the corresponding storage node, and ending the operation; otherwise, go to step S33;

s33, the calculation node obtains a pointer of a pointing value corresponding to the data to be deleted in the read array; and according to the physical address pointed by the pointer pointing to the value, deleting the data to be deleted on the corresponding storage node by the computing node through unilateral RDMA operation.

Further preferably, when the data stored in the storage node are sorted and stored in the linked list formed by the plurality of arrays of size δ, in step S33 the linked-list node containing the data to be deleted is determined based on the sorted position of the data to be deleted; that linked-list node is locked using a CAW operation of the RDMA technique, the data to be deleted is then removed from it, and finally it is unlocked.

Further preferably, when the deletion operation leaves a linked-list node empty, the array corresponding to the empty linked-list node is deleted, and the address translation table stored on the corresponding storage node is updated. During this process, the compute node asynchronously performs the following operations: reading, through one-sided RDMA operations, all data in the data interval containing the data to be deleted on the corresponding storage node; retraining the index sub-model corresponding to that data interval based on the read data; and synchronizing the resulting model to the corresponding storage node.

In a fifth aspect, the invention also provides a machine-readable storage medium having stored thereon machine-executable instructions that, when invoked and executed by a processor, cause the processor to carry out any of the methods provided by the invention.

Generally, through the above technical solutions conceived by the present invention, the following beneficial effects can be obtained:

1. In the method for constructing a distributed learning index model provided by the invention, each storage node adaptively builds an efficient learning index model according to the distribution of its existing data, and the trained model is then synchronized to the compute nodes. The learning index model comprises a plurality of independent index sub-models; each index sub-model indexes the data in one data interval, and the intervals covered by the sub-models do not overlap. The resulting distributed learning index moves most computation to the compute nodes: by adding the deviation δ in the training stage and restricting each datum to move only within an array of size δ, a compute node can modify the data of each array (e.g., insert, update, delete) through RDMA operations without reducing the precision of the model. In addition, a newly created array is appended at the end of the address translation table without affecting the order of existing arrays, and all arrays remain reachable through the address translation table without retraining the model, so no data are lost. Because the old model remains usable while data are being modified, the model can be retrained asynchronously on the compute nodes and the new model synchronized to each node; retraining runs in the background without blocking the index operations of other nodes. Compared with existing schemes, the invention offloads the operations of accessing and modifying data and the index to the compute nodes, greatly reducing the CPU overhead of the storage nodes.

2. In the method for constructing a distributed learning index model provided by the invention, to guarantee that no data are lost and that the error of any training datum never exceeds the maximum error, the training algorithm lets each index sub-model cover as long an interval as possible while its maximum error stays below the predefined threshold. Meanwhile, the deviation value δ is added when computing the maximum error, so every datum can move freely within δ positions without reducing the precision of the model and without updating the address translation table; the table is updated only when an array is created or deleted, which greatly reduces the CPU overhead of the storage nodes.
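The greedy training described above can be sketched as follows: each sub-model extends its interval key by key while the maximum error (including the deviation δ) stays below the predefined threshold. The least-squares fit, the δ and threshold values, and the toy keys are assumptions for illustration; the patent does not prescribe a specific fitting procedure.

```python
DELTA = 4        # deviation value added to every error bound (assumed)
THRESHOLD = 16   # predefined maximum-error threshold (assumed)

def fit(keys, positions):
    """Least-squares line through (key, sorted position) points."""
    n = len(keys)
    if n == 1:
        return 0.0, float(positions[0])
    mx, my = sum(keys) / n, sum(positions) / n
    var = sum((k - mx) ** 2 for k in keys)
    slope = sum((k - mx) * (p - my) for k, p in zip(keys, positions)) / var
    return slope, my - slope * mx

def max_err(keys, positions, slope, intercept):
    return max(abs(p - round(slope * k + intercept))
               for k, p in zip(keys, positions)) + DELTA

def segment(keys):
    """Greedily split sorted keys into sub-models with max_err < THRESHOLD."""
    models, start = [], 0
    while start < len(keys):
        end = start + 1
        best = fit(keys[start:end], list(range(start, end)))
        while end < len(keys):
            cand = fit(keys[start:end + 1], list(range(start, end + 1)))
            if max_err(keys[start:end + 1], list(range(start, end + 1)), *cand) >= THRESHOLD:
                break            # extending further would violate the threshold
            best, end = cand, end + 1
        models.append((keys[start], best))   # store as a <key*, model> pair
        start = end
    return models

models = segment(sorted([3, 7, 12, 18, 25, 33, 90, 95, 99]))
```

Each emitted pair covers a maximal interval, so the number of sub-models adapts to how linear the key distribution is.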

3. In the data operation methods for a learning index model constructed by the above construction method, a compute node can compute the logical position of a datum with the constructed learning index model and read the storage node's data through one-sided RDMA operations; it can also insert new data directly into a pre-allocated slot on the storage node without the participation of the storage node's CPU. When the pre-allocated slots of a storage node are exhausted, retraining proceeds concurrently; before the new model is ready, the positions of all data can still be computed with the old model, so no data are lost. Retraining is executed concurrently by other threads and does not affect other index operations, greatly reducing the computational pressure on the storage nodes. The invention completes insert, delete, update, and lookup operations from the compute nodes through RDMA, greatly reducing the computational cost of the storage nodes while keeping space overhead small and request latency low.

4. The data operation methods for a learning index model constructed by the above construction method adopt fine-grained locking: only the data within the model's predicted range are locked, rather than all data, reducing conflicts between the operations of different nodes. In addition, the invention retrains the model asynchronously; retraining runs in the background without blocking other index operations, enhancing the concurrency of the index structure and providing better remote data access performance.

5. In the data operation methods for a learning index model constructed by the above construction method, after a new learning index model is obtained by retraining, the new model is synchronized to all nodes through a version-based method, which prevents an updated learning index model from being overwritten by a previous version, guarantees the consistency of the learning index model, and ensures that the model is updated correctly.

Drawings

Fig. 1 is a flowchart of a method for constructing a distributed learning index model according to embodiment 1 of the present invention;

fig. 2 is a flowchart of the insertion method, provided in embodiment 2 of the present invention, for a learning index model constructed by the construction method of embodiment 1;

fig. 3 is a flowchart of a method for inserting data through a learning index in an RDMA-based distributed storage system according to embodiment 2 of the present invention;

fig. 4 is a flowchart of a method for performing model retraining by a compute node in an RDMA-based distributed storage system according to embodiment 2 of the present invention;

fig. 5 is a flowchart of a method for querying data through learning index in an RDMA-based distributed storage system according to embodiment 3 of the present invention.

Detailed Description

In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the invention and are not intended to limit the invention. In addition, the technical features involved in the embodiments of the present invention described below may be combined with each other as long as they do not conflict with each other.

Embodiment 1

A method for constructing a distributed learning index model, as shown in fig. 1, includes:

After the data stored in each storage node is sorted by key, the keys of the stored data are used as input and the corresponding sorted positions as output to train a machine learning model, yielding a learning index model for each storage node; the learning index models are then synchronized to all computing nodes. In this embodiment, each data item consists of a key and a pointer to its value.

The learning index model comprises several independent index sub-models. The data stored by a storage node is divided into several data intervals by key; each index sub-model indexes the data in one interval, and the intervals covered by different sub-models do not overlap. Each index sub-model is trained on the data in its interval so that its maximum error is smaller than a preset threshold. The preset threshold is generally chosen to align with the cache line (cacheline) length, so that the storage space of the data, i.e. the number of data items multiplied by the space occupied by a single item, is an integer multiple of the cacheline length. In this embodiment the preset threshold may be set to 16, 32, 64, 128, etc. The larger the threshold, the fewer the index sub-models and the larger the search range; the threshold is usually chosen to balance the number of models against the search range according to the distribution of the data. It should be noted that the index sub-models are independent of one another and can be modified individually; together they cover all data without overlap, each being responsible for indexing the data in its own interval.

The maximum error of the k-th index sub-model is max_error_k = max_i |Y_{k,i} - f(X_{k,i})| + δ, for i = 1, 2, …, N_k, where N_k is the number of data items in the data interval corresponding to the k-th index sub-model; Y_{k,i} is the sorted position of the i-th data item in that interval; X_{k,i} is the key of the i-th data item in that interval; and f(X_{k,i}) is the sorted position predicted for that item by the k-th index sub-model. δ is an offset value, generally preset according to the cacheline length and settable to 8, 16, 32, etc., so that the difference between the maximum error and the offset δ is cacheline-aligned. To ensure that no data is lost and that the error on any training item never exceeds the maximum error, the training algorithm lets each index sub-model cover as long an interval as possible while keeping the maximum error below the predefined threshold; adding the offset δ when computing the maximum error means that any data item can be moved by up to δ positions without reducing the model's precision.
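As a concrete illustration, the maximum-error computation above can be sketched in a few lines of Python. The function name and the toy linear model are illustrative assumptions, not part of the invention:

```python
def submodel_max_error(keys, positions, predict, delta):
    """max_error_k for one index sub-model: the largest absolute gap
    between the true sorted position and the predicted position, plus the
    offset delta. predict(key) plays the role of f in the formula above.
    """
    return max(abs(y - predict(x)) for x, y in zip(keys, positions)) + delta

# A perfect linear fit over keys 10, 20, 30 at positions 0, 1, 2 leaves
# only the offset term:
err = submodel_max_error([10, 20, 30], [0, 1, 2], lambda k: (k - 10) / 10, delta=8)
```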

Further, the index sub-model may be a linear regression model, a shallow neural network model, a convolutional neural network, etc. Preferably, in this embodiment the index sub-model is a linear regression model trained with the OptimalPLR algorithm and stored as a <key*, model> pair, where key* is the smallest (or largest) key in the data interval covered by the sub-model, and model holds the sub-model's parameters, namely the weight and the bias. In this embodiment the index sub-models are independent of each other, and the sorted positions of the data covered by each sub-model are counted from logical 0, so modifying any one sub-model does not affect the use of the others.
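The segmentation idea, growing each sub-model's interval while the error bound still holds, can be sketched as follows. Note this is a simplified greedy least-squares stand-in for the OptimalPLR algorithm that the embodiment actually uses; all names are illustrative:

```python
import numpy as np

def train_submodels(keys, threshold, delta):
    """Greedy segmentation sketch: extend each sub-model's interval while a
    least-squares linear fit keeps (max error + delta) below the threshold.
    keys must be sorted. Returns one (key*, weight, bias) tuple per
    sub-model; positions restart at logical 0 inside every interval, as in
    the scheme above.
    """
    models, start, n = [], 0, len(keys)
    while start < n:
        end = start + 1
        best = (0.0, 0.0)               # a single item sits at position 0
        while end < n:
            xs = np.asarray(keys[start:end + 1], dtype=float)
            ys = np.arange(end + 1 - start, dtype=float)
            w, b = np.polyfit(xs, ys, 1)
            if np.max(np.abs(ys - (w * xs + b))) + delta >= threshold:
                break                    # extending would violate the bound
            best, end = (w, b), end + 1
        models.append((keys[start], float(best[0]), float(best[1])))
        start = end
    return models

# 100 evenly spaced keys are perfectly linear, so one sub-model suffices:
models = train_submodels(list(range(100)), threshold=16, delta=8)
```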

The data stored by a storage node is sorted and placed in multiple arrays of size δ, and the physical addresses of all arrays are kept in an address translation table; the address translation tables are synchronized to the corresponding computing nodes to keep them consistent with the learning index models on those nodes. Sorting the stored data by key makes efficient range queries possible. It should be noted that an array can be created flexibly at any position and the physical positions may be discontiguous, but once created an array's physical position cannot be modified, whereas the address translation table can be modified and is logically ordered and contiguous. The address translation table is bound to the learning index model: the logical address (i.e. the sorted position) computed by the learning index model is converted into a physical address through the table. In addition, recording only the start position of each array reduces the size of the address translation table; other positions within an array are computed from the start position and the sorted position.
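A minimal sketch of the address translation table follows, with physical addresses simulated as plain integers; the class and field names are assumptions for illustration:

```python
DELTA = 8  # array size; matches the model's offset value delta

class AddressTranslationTable:
    """Logical (sorted) position -> physical slot. Only each delta-sized
    array's start address is recorded, so moving data inside an array never
    requires a table update; the table changes only when an array is
    created or deleted.
    """
    def __init__(self, array_base_addrs):
        self.bases = array_base_addrs  # logically ordered, physically arbitrary

    def translate(self, logical_pos, slot_size=1):
        array_idx, slot = divmod(logical_pos, DELTA)
        return self.bases[array_idx] + slot * slot_size

# Three arrays at scattered physical addresses:
table = AddressTranslationTable([0x7000, 0x3000, 0x9000])
addr = table.translate(10)   # logical position 10 -> array 1, slot 2
```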

It should be noted that, because the offset δ is added by the training algorithm and the arrays have size δ, data can move within an array without affecting the model's precision or requiring an update to the address translation table; the table is updated only when an array is created or deleted, which greatly reduces CPU overhead.

The arrays of size δ may or may not be linked together into a linked list. To avoid excessive data movement caused by inserting into sorted data, in an optional implementation the arrays of size δ are linked into a list: after sorting, the data stored by a storage node is placed in a linked list formed by the δ-sized arrays, which holds the real key-value data sorted by key for efficient range queries. Each array then corresponds to one linked-list node, i.e. each linked-list node corresponds to an array. Meanwhile, the physical positions of all linked-list nodes covered by each index sub-model are kept in that sub-model's own address translation table for logical-to-physical address conversion; there are therefore multiple address translation tables, one per index sub-model. Specifically, N pairs of data (generally 8-256) are placed in a linked-list node (i.e. an array), the nodes are organized into a linked list, the index sub-model is trained on the data and their logical positions (i.e. sorted positions), and the physical addresses of all nodes in the list are stored in the address translation table so that data can later be accessed through the learning index model.

It should be noted that, as long as the model parameters remain consistent with the logical addresses in the address translation table, all data can be indexed without loss. The reason is that the model is trained on the logical addresses of the data, and a new linked-list node can always be reached from an existing node through its pointer; even an old model can therefore index all data, provided the current model is consistent with its corresponding logical addresses.

Further, the invention comprises an index operation thread, a retraining thread, and a model synchronization thread, each executing a different task. The index operation thread performs data insertion, deletion, update, and query; the retraining thread executes asynchronous retraining tasks to retrain the corresponding index sub-models in the learning index model; and the model synchronization thread synchronizes retrained index sub-models to the different server nodes. During synchronization, a sub-model is updated according to its version number to ensure model consistency. Consistency here covers both the index sub-model parameters and the logical addresses in the address translation tables; sharing the address translation table while letting different index sub-models store their logical addresses independently reduces the table's space overhead and preserves its consistency.

Embodiment 2

A learning index model insertion method constructed based on the distributed learning index model construction method provided in embodiment 1, as shown in fig. 2, includes the following steps:

S11. The computing node computes the sorted position of the data to be inserted using the learning index model on it.

Specifically, the index sub-model responsible for the key of the data to be inserted is determined within the learning index model, and the key is input into that sub-model to obtain the sorted position of the data to be inserted.

S12. The obtained sorted position is converted into the physical position of the corresponding array using the address translation table on the computing node (the table of the index sub-model covering the interval containing the data to be inserted), and the computing node reads the corresponding array from the storage node through a one-sided RDMA operation at that physical address. If the read array already contains the key of the data to be inserted, the operation ends; otherwise, go to step S13.

S13. An array is pre-allocated on the storage node to hold inserted data. If the pre-allocated array has a free slot, the computing node inserts the data to be inserted directly into that slot on the storage node; otherwise, go to step S14.

Specifically, for new data the computing node inserts it directly into a free slot of the storage node's pre-allocated array; if the pre-allocated slots run out, a retraining operation allocates more positions to accommodate new data, and the retraining runs concurrently so as not to affect other operations.

S14. A new array is created on the storage node for the data to be inserted, the address translation table (the table of the index sub-model covering the interval containing the data to be inserted) is updated and synchronized to the corresponding computing node, and the data is inserted into a free slot of the new array. Meanwhile, the computing node asynchronously performs the following: it reads, through one-sided RDMA operations, all data in the interval containing the inserted data from the corresponding storage node, retrains the index sub-model covering that interval on the read data, and synchronizes the resulting sub-model to the corresponding storage node.

Specifically, if the pre-allocated array has a free slot, the array is locked, the data is written into the free position, and the array is unlocked. If the pre-allocated array has no free slot, the pre-allocated array and the address translation table are locked, a new array is created on the storage node for the data to be inserted, the new array's physical address is inserted into the address translation table, the data is inserted into a free slot of the new array, and finally the pre-allocated array and the address translation table are unlocked.

Further, when the sorted data of a storage node is stored in a linked list formed by the δ-sized arrays: if the pre-allocated array has a free slot, the corresponding linked-list node is locked using the CAW (compare-and-write) operation of RDMA, and it is unlocked after the data has been written into its free position.

If the pre-allocated array has no free slot, the corresponding linked-list node and the address translation table are locked using RDMA CAW operations, and a new array is created on the storage node for the data to be inserted as a new linked-list node; the new node's physical address is inserted into the address translation table, the data is inserted into a free slot of the new array, and finally the linked-list node and the address translation table are unlocked.

Further, the creation of a new array indicates that the model needs retraining; a retraining signal is set and the retraining thread is notified to retrain asynchronously, so the insertion can finish without waiting for the retraining to execute.

Before retraining produces a new model, the old model can still compute the positions of all data, so no data is lost during concurrent retraining. To reduce the computational load on the storage nodes, retraining is performed on the computing nodes, which asynchronously prefetch the required data. Finally, the newly trained model is synchronized to the corresponding storage node in a version-based manner to ensure it is updated correctly, and the models on other computing nodes are synchronized when they next execute an index operation: each time another computing node reads data from the storage node, it checks whether its model is up to date and, if not, updates it from the storage node, which reduces synchronization overhead compared with synchronizing all computing nodes on every model update. Specifically, when a computing node asynchronously reads all data in the interval containing the inserted data, it records the version of the corresponding index sub-model on the storage node. After retraining the sub-model on the read data, it compares the recorded version with the version currently on the storage node. If the two are the same, the retrained sub-model is synchronized to the storage node and the sub-model's version there is updated, preventing an updated model from being rolled back to a previous version and ensuring model consistency. If the versions differ, the model has already been updated by another computing node, and the operation ends without updating.
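The version-based synchronization described above can be sketched locally with a lock and a version counter. The RDMA transport is omitted and all names are illustrative:

```python
import threading

class VersionedModel:
    """Version-checked install of a retrained sub-model: a retrain that
    started against an older version can never overwrite a newer model.
    Single-process sketch of the version-based synchronization.
    """
    def __init__(self):
        self.version = 0
        self.params = (0.0, 0.0)        # weight, bias
        self._lock = threading.Lock()

    def read(self):
        with self._lock:
            return self.version, self.params

    def try_install(self, seen_version, new_params):
        with self._lock:
            if self.version != seen_version:
                return False            # another node retrained first
            self.params = new_params
            self.version += 1
            return True

m = VersionedModel()
v, _ = m.read()                            # record the version before retraining
ok_first = m.try_install(v, (1.0, 0.5))    # version unchanged: installs
ok_second = m.try_install(v, (2.0, 0.0))   # stale version: rejected
```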

Further, taking as an example storage nodes whose sorted data is stored in a linked list formed by the δ-sized arrays, Fig. 3 shows the flow of inserting data through the learning index in an RDMA-based distributed storage system. The whole insertion is performed by the computing node and proceeds as follows:

step 1, the CPU of the computing node calls a corresponding index sub-model to perform position computation according to the data to be inserted;

step 2, the index sub-model calculates the logic position of the data to be inserted;

step 3, converting the logical address into a physical address by using an address conversion table;

step 4, the computing node reads the data of the storage node through the RDMA network card;

Step 5, the RDMA network card returns the data at the physical address requested by the computing node, and the computing node searches the returned data; if the data to be inserted already exists, no duplicate insertion is needed and the insertion process ends; if it does not exist and the linked-list node has a free position, the data is inserted directly;

Step 6, if the data to be inserted does not exist and the linked-list node has no free position, a new linked-list node is created and its position is inserted into the address translation table;

Step 7, the new node is linked into the list and the data to be inserted is placed in the free position, completing the insertion process.
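Steps 1-7 can be condensed into a local simulation in which plain memory access stands in for the RDMA reads and writes; the data structure and names below are illustrative assumptions:

```python
DELTA = 4  # node capacity; illustrative, not the real delta

class Node:
    """One delta-sized array organized as a linked-list node."""
    def __init__(self):
        self.slots = []    # up to DELTA sorted (key, value) pairs
        self.next = None

def insert(key, value, nodes, predict):
    """Local simulation of steps 1-7: predict the logical position, map it
    to a node via 'nodes' (playing the role of the address translation
    table), then insert into a free slot or into a freshly created node.
    """
    idx = min(predict(key) // DELTA, len(nodes) - 1)   # steps 1-3
    node = nodes[idx]                                  # step 4: read the node
    if any(k == key for k, _ in node.slots):
        return "exists"                                # step 5: no duplicates
    if len(node.slots) < DELTA:
        node.slots.append((key, value))                # step 5: free slot
        node.slots.sort()
        return "inserted"
    new = Node()                                       # step 6: new node
    new.slots.append((key, value))
    new.next, node.next = node.next, new               # step 7: link it in
    nodes.insert(idx + 1, new)                         # translation table update
    return "inserted+retrain"                          # retraining signal set

nodes = [Node()]
predict = lambda k: k      # hypothetical sub-model: identity on these keys
results = [insert(k, "v", nodes, predict) for k in (1, 1, 2, 3, 4, 5)]
```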

Further, Fig. 4 shows how a computing node performs model retraining in an RDMA-based distributed storage system according to an embodiment of the invention. Retraining is executed asynchronously without affecting other operations, and is triggered when the retraining thread detects a retraining signal. The flow is as follows:

step 1, the computing node prefetches all data covered by the index sub-model to be retrained from the storage node through the RDMA network card;

step 2, the RDMA network card reads data on the storage node according to the physical position of the prefetched data and returns the data to the computing node;

step 3, the computing node completes retraining locally according to the pre-acquired data;

step 4, the computing node sends the index submodel to the storage node through the RDMA network card;

Step 5, the new index sub-model is written to the storage node through an RDMA write, the consistency between the index sub-model parameters and the address translation table is ensured, and the model synchronization is complete.
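A minimal local sketch of this retraining flow follows, with callables standing in for the RDMA prefetch (steps 1-2) and the model write-back (steps 4-5); names are illustrative:

```python
import numpy as np

def retrain(fetch_interval, install):
    """Retraining sketch: fetch_interval() stands in for the RDMA prefetch
    of all (key, position) pairs covered by the sub-model (steps 1-2); the
    linear fit runs locally on the computing node (step 3); install()
    stands in for the RDMA write of the new sub-model (steps 4-5).
    """
    keys, positions = fetch_interval()
    w, b = np.polyfit(np.asarray(keys, float), np.asarray(positions, float), 1)
    install(w, b)
    return float(w), float(b)

# Four keys at positions 0..3 give slope 0.1 and intercept -1:
w, b = retrain(lambda: ([10, 20, 30, 40], [0, 1, 2, 3]), lambda w, b: None)
```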

It should be noted that retraining is triggered by data modification on the computing node: for example, when the computing node creates a new linked-list node during an insert operation, or deletes an empty node during a delete operation, the logical positions must change, which triggers retraining. If the free positions of the existing linked-list nodes are insufficient, a new node is created and its position is inserted into the address translation table; only nodes present in the address translation table can be used, and the logical positions of linked-list nodes can only be modified by retraining.

Before retraining produces the new model, the old model can still index all data and nothing is lost. The reason is that all data is stored in linked-list nodes and the model is trained on the data and its logical positions in the list; a new node can be reached from an existing node through its pointer, and the logical positions of existing nodes are unchanged until retraining, so the old model can still index all data.

The related technical scheme is the same as embodiment 1, and is not described herein.

Embodiment 3

A method for querying a learning index model constructed based on the method for constructing a distributed learning index model provided in embodiment 1 includes the following steps:

S21. The computing node computes the sorted position of the data to be queried using the learning index model on it.

Specifically, the index sub-model responsible for the key of the data to be queried is determined within the learning index model, and the key is input into that sub-model to obtain the sorted position of the data to be queried.

S22. The obtained sorted position is converted into the physical position of the corresponding array using the address translation table on the computing node (the table of the index sub-model covering the interval containing the data to be queried); the computing node reads the corresponding array from the storage node through a one-sided RDMA operation at that physical address and obtains, from the read array, the pointer to the value of the data to be queried.

S23. According to the physical address held in that value pointer, the computing node reads the queried value from the corresponding storage node through a one-sided RDMA operation.
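The query path S21-S23 can be sketched locally as follows, with dict lookups standing in for one-sided RDMA reads; all names are illustrative:

```python
DELTA = 8  # array size; illustrative

def query(key, model, array_bases, arrays, values):
    """S21-S23 as a local sketch: model(key) predicts the sorted position
    (S21), array_bases translates it to a physical array id (S22), the
    array is scanned for the key and the stored pointer is dereferenced to
    fetch the value (S23).
    """
    logical = model(key)
    array_id = array_bases[logical // DELTA]   # logical -> physical
    for k, value_ptr in arrays[array_id]:      # read the delta-sized array
        if k == key:
            return values[value_ptr]           # follow the value pointer
    return None

# Two arrays at "physical" ids 7 and 3; key 42 lives in the second one:
arrays = {7: [(10, "p0"), (11, "p1")], 3: [(42, "p2")]}
values = {"p0": "a", "p1": "b", "p2": "c"}
result = query(42, lambda k: 8, [7, 3], arrays, values)
```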

Further, taking as an example storage nodes whose sorted data is stored in a linked list formed by the δ-sized arrays, Fig. 5 shows the flow of querying data through the learning index in an RDMA-based distributed storage system. The storage node keeps all data in linked-list nodes, stores the node addresses in an address translation table, trains the learning index model on the data and its logical addresses, and synchronizes the model and the table to the computing node for data queries. Querying data on a computing node proceeds as follows:

step 1, a CPU of the computing node calls a corresponding index sub-model to perform position computation according to the data to be queried;

step 2, the index sub-model computes the logical position of the data to be queried;

step 3, converting the logical address into a physical address by using an address conversion table;

step 4, the computing node reads the data of the storage node through the RDMA network card;

Step 5, the RDMA network card returns the data at the physical address requested by the computing node, and the computing node searches the returned data to complete the query process.

The related technical scheme is the same as embodiment 1, and is not described herein.

Embodiment 4

A method for deleting a learning index model constructed based on the method for constructing a distributed learning index model provided in embodiment 1 includes the steps of:

S31. The computing node computes the sorted position of the data to be deleted using the learning index model on it.

Specifically, the index sub-model responsible for the key of the data to be deleted is determined within the learning index model, and the key is input into that sub-model to obtain the sorted position of the data to be deleted.

S32. The obtained sorted position is converted into the physical position of the corresponding array using the address translation table on the computing node (the table of the index sub-model covering the interval containing the data to be deleted); the computing node reads the corresponding array from the storage node through a one-sided RDMA operation at that physical address and checks whether the array contains the key of the data to be deleted. If not, the data does not exist on the corresponding storage node and the operation ends; otherwise, go to step S33.

S33. The computing node obtains, from the read array, the pointer to the value of the data to be deleted; according to the physical address held in that pointer, it deletes the data on the corresponding storage node through a one-sided RDMA operation.

The deletion proceeds as follows: the array containing the data to be deleted is determined from its sorted position, that array is locked, the data is deleted from it, and the array is then unlocked.

Further, when the sorted data of the storage node is stored in a linked list formed by the δ-sized arrays, in step S33 the linked-list node containing the data to be deleted is determined from the data's sorted position; the node is locked using the CAW operation of RDMA, the data is deleted from it, and the node is then unlocked.
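A local sketch of this fine-grained delete follows, with a threading.Lock standing in for the RDMA CAW-based node lock; names are illustrative:

```python
import threading

def delete(key, node, lock):
    """Fine-grained delete sketch: lock only the linked-list node that the
    model predicts, remove the pair if present, then unlock. Other nodes
    stay unlocked, so concurrent operations on them are unaffected.
    """
    with lock:
        for i, (k, _) in enumerate(node):
            if k == key:
                node.pop(i)     # an emptied node would trigger retraining
                return True
        return False            # key absent: nothing to delete

node = [(1, "a"), (2, "b")]     # one delta-sized array's contents
lock = threading.Lock()
ok = delete(1, node, lock)
```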

It should be noted that when a deletion leaves a linked-list node empty, the corresponding array is deleted and the address translation table stored on the corresponding storage node is updated. Meanwhile, the computing node asynchronously performs the following: it reads, through one-sided RDMA operations, all data in the interval containing the deleted data from the corresponding storage node, retrains the index sub-model covering that interval on the read data, and synchronizes the resulting model to the corresponding storage node.

The related technical scheme is the same as that of embodiment 1 and embodiment 2, and is not described herein.

Embodiment 5

An updating method of a learning index model constructed based on the construction method of a distributed learning index model provided in embodiment 1 includes the following steps:

S41. The computing node computes the sorted position of the data to be updated using the learning index model on it.

Specifically, the index sub-model responsible for the key of the data to be updated is determined within the learning index model, and the key is input into that sub-model to obtain the sorted position of the data to be updated.

S42. The obtained sorted position is converted into the physical position of the corresponding array using the address translation table on the computing node (the table of the index sub-model covering the interval containing the data to be updated), and the computing node reads the corresponding array from the storage node through a one-sided RDMA operation at that physical address. If the read array contains the key of the data to be updated, go to step S43; otherwise, the operation ends.

S43. The pointer to the value of the data to be updated is obtained from the read array; according to the physical address held in that pointer, the computing node overwrites the value stored on the corresponding storage node with the value of the data to be updated through a one-sided RDMA operation.
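The update path can be sketched locally in the same style, with a dict assignment standing in for the one-sided RDMA write of S43; names are illustrative:

```python
def update(key, new_value, array, values):
    """S41-S43 sketch: the model and address translation have already
    located 'array' (omitted here); the key is searched in it and, on a
    hit, the value behind the stored pointer is overwritten in place.
    """
    for k, value_ptr in array:
        if k == key:
            values[value_ptr] = new_value   # S43: in-place value update
            return True
    return False                            # key not present: no-op

array = [(7, "p0"), (8, "p1")]
values = {"p0": "old", "p1": "x"}
changed = update(7, "new", array, values)
```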

The related technical scheme is the same as embodiment 1, and is not described herein.

In summary, in the data operation methods of the learning index model constructed by the construction method of the distributed learning index model provided by the invention, the computing node modifies the data on the storage node directly through RDMA operations, without involving the storage node's CPU; meanwhile, the computing node asynchronously retrains the old model and synchronizes the new model to the storage node. By moving both data modification and model modification to the computing nodes of the distributed system, the CPU overhead of the storage nodes is greatly reduced.

Embodiment 6

A machine-readable storage medium storing machine-executable instructions that, when invoked and executed by a processor, cause the processor to implement any of the methods provided in embodiment 1, embodiment 2, embodiment 3 and/or embodiment 4 of the present invention.

The related technical solutions are the same as those in embodiment 1, embodiment 2, embodiment 3 and embodiment 4, and are not described herein again.

It will be understood by those skilled in the art that the foregoing is only a preferred embodiment of the present invention, and is not intended to limit the invention, and that any modification, equivalent replacement, or improvement made within the spirit and principle of the present invention should be included in the scope of the present invention.
