Partial repetition code construction method based on shadow

Document No.: 308559. Published: 2021-11-26.

Abstract: This technology, "Partial repetition code construction method based on shadow" (一种基于shadow的部分重复码构造方法), was designed by Wang Jing (王静), Sun Wei (孙伟), He Yajin (何亚锦), Shen Keqin (沈克勤) and Zhang Xinnan (张鑫楠) on 2021-08-13. The invention discloses a method for constructing a partial repetition (FR) code based on shadows, comprising the following steps. Step 1: divide an original file M into k original data blocks, and perform (n, k) MDS coding on the k original data blocks to obtain n coded data blocks. Step 2: construct a set X and a set ψ according to the number n of coded data blocks, where X contains n distinct elements and ψ contains t subsets φ, each φ being a (d+1)-element subset of X with no repeated elements. Step 3: obtain the shadow set of ψ, where the shadow set comprises t groups of sub-shadow sets, each group comprises (d+1) sets φ', each set φ' contains d elements, and each set φ' consists of the elements that remain after deleting any one element from a subset φ. Step 4: construct the FR code according to the shadow set. The FR code constructed by the invention has low repair locality that does not grow as the system parameters increase, and a suitable node storage capacity and data repetition degree can be selected according to the system requirements.

1. A method for constructing a partial repetition code based on shadow, characterized by comprising the following steps:

Step 1: divide an original file M into k original data blocks, and perform (n, k) MDS coding on the k original data blocks to obtain n coded data blocks, where k ≥ 2 and n ≥ k;

Step 2: construct a set X and a set ψ according to the number n of coded data blocks, where X contains n distinct elements and ψ contains t subsets φ, each φ being a (d+1)-element subset of X with no repeated elements, where d is a positive integer and (d+1) < n;

Step 3: obtain the shadow set of ψ, where the shadow set comprises t groups of sub-shadow sets, each group comprises (d+1) sets φ', each set φ' contains d elements, and each set φ' consists of the elements that remain after deleting any one element from a subset φ;

Step 4: construct the FR code according to the shadow set, in one of three cases:

Case one: if a homogeneous FR code is constructed, each node of the homogeneous FR code corresponds to one set φ' in the shadow set; the number of nodes is t × (d+1), the node storage capacity is d, the repetition degree is d, and the data blocks stored by each node are the elements of the corresponding set φ';

Case two: if a repetition-degree heterogeneous FR code is constructed, delete any one set φ' from each group of sub-shadow sets in the shadow set to obtain a pruned shadow set, which comprises t groups of sub-shadow sets, each group comprising d sets φ';

each node of the repetition-degree heterogeneous FR code corresponds to one set φ' in the pruned shadow set; the number of nodes is t × d, the node storage capacity is d, the repetition degree is d or (d−1), and the data blocks stored by each node are the elements of the corresponding set φ';

Case three: if a storage-capacity heterogeneous FR code is constructed, then on the basis of case two, construct the shadow sub-incidence matrix A of the pruned shadow set and interchange the rows and columns of A to obtain a matrix A';

each node of the storage-capacity heterogeneous FR code corresponds to one row of the matrix A'; the number of nodes is the number of rows of A', the node storage capacity is d or (d−1), the repetition degree is d, and the data blocks stored by each node are given by the column indices of the entries equal to 1 in the corresponding row.

Technical Field

The invention belongs to the field of computers, and particularly relates to a method for constructing a partial repetition code based on shadow.

Background

As technology advances, more and more data needs to be stored. Traditional storage systems cannot meet the demand for mass data storage, which has led to distributed storage systems capable of holding large amounts of data. Data loss is common in a distributed storage system, so some mechanism is needed to ensure data reliability; replication and erasure coding are the techniques generally used. Replication, however, carries a large storage overhead, while erasure-code repair is complex: the whole file must be downloaded during repair, which requires a large repair bandwidth. El Rouayheb and Ramchandran therefore proposed exact-repair fractional repetition (FR) codes in 2010, referred to in this document as partial repetition codes. FR codes can tolerate multiple node failures with exact, uncoded repair; their repair bandwidth overhead and computational complexity are small, which greatly improves the repair performance for failed nodes. There are many methods for constructing FR codes, such as using Steiner systems or pairwise balanced designs.

Most existing FR code constructions cannot adjust the system parameters. For example, a partial repetition code constructed from a regular graph has a fixed repetition degree and can repair only single-node failures. Prajapati proposed a partial repetition code with a ring structure, which cannot adjust its parameters according to system requirements. FR codes based on group divisible designs allow a suitable storage capacity or repetition degree to be selected according to system requirements, but their repair locality increases as the parameters increase.

Disclosure of Invention

The invention aims to provide a method for constructing a partial repetition code based on shadow, which solves the problems in the prior art that the storage capacity and repetition degree cannot be changed according to system requirements and that the repair locality is large.

In order to realize the task, the invention adopts the following technical scheme:

A method for constructing a partial repetition code based on shadow comprises the following steps:

Step 1: divide an original file M into k original data blocks, and perform (n, k) MDS coding on the k original data blocks to obtain n coded data blocks, where k ≥ 2 and n ≥ k;

Step 2: construct a set X and a set ψ according to the number n of coded data blocks, where X contains n distinct elements and ψ contains t subsets φ, each φ being a (d+1)-element subset of X with no repeated elements, where d is a positive integer and (d+1) < n;

Step 3: obtain the shadow set of ψ, where the shadow set comprises t groups of sub-shadow sets, each group comprises (d+1) sets φ', each set φ' contains d elements, and each set φ' consists of the elements that remain after deleting any one element from a subset φ;

Step 4: construct the FR code according to the shadow set, in one of three cases:

Case one: if a homogeneous FR code is constructed, each node of the homogeneous FR code corresponds to one set φ' in the shadow set; the number of nodes is t × (d+1), the node storage capacity is d, the repetition degree is d, and the data blocks stored by each node are the elements of the corresponding set φ';

Case two: if a repetition-degree heterogeneous FR code is constructed, delete any one set φ' from each group of sub-shadow sets in the shadow set to obtain a pruned shadow set, which comprises t groups of sub-shadow sets, each group comprising d sets φ';

each node of the repetition-degree heterogeneous FR code corresponds to one set φ' in the pruned shadow set; the number of nodes is t × d, the node storage capacity is d, the repetition degree is d or (d−1), and the data blocks stored by each node are the elements of the corresponding set φ';

Case three: if a storage-capacity heterogeneous FR code is constructed, then on the basis of case two, construct the shadow sub-incidence matrix A of the pruned shadow set and interchange the rows and columns of A to obtain a matrix A';

each node of the storage-capacity heterogeneous FR code corresponds to one row of the matrix A'; the number of nodes is the number of rows of A', the node storage capacity is d or (d−1), the repetition degree is d, and the data blocks stored by each node are given by the column indices of the entries equal to 1 in the corresponding row.

Compared with the prior art, the invention has the following technical characteristics:

1. Constructing partial repetition codes based on shadows is a new algorithm. The construction is simple, intuitive and efficient, and the repair locality of the resulting FR code is low and does not increase as the system parameters increase.

2. With the partial repetition code constructed based on shadows, a suitable node storage capacity and data repetition degree can be selected according to the system requirements.
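The locality claim in item 1 can be checked by brute force on small cases. The following is a minimal sketch, not the patent's own procedure. It assumes that repair locality means the smallest number of surviving nodes whose stored blocks jointly cover the blocks of a failed node, and that the subsets in ψ are pairwise disjoint. The helper names `homogeneous_nodes`, `repair_locality` and `disjoint_psi` are hypothetical.

```python
from itertools import combinations

def homogeneous_nodes(psi):
    """One node per d-element set obtained by deleting one element from a subset of psi."""
    return [frozenset(phi) - {x} for phi in psi for x in sorted(phi)]

def repair_locality(nodes, failed_index):
    """Smallest number of surviving nodes that jointly hold every block of the failed node."""
    lost = nodes[failed_index]
    survivors = [node for i, node in enumerate(nodes) if i != failed_index]
    for size in range(1, len(survivors) + 1):
        for helpers in combinations(survivors, size):
            if lost <= frozenset().union(*helpers):
                return size
    return None  # the failed node cannot be rebuilt from the survivors

def disjoint_psi(t, d):
    """Hypothetical psi: t disjoint (d+1)-element blocks of {1, ..., t*(d+1)}."""
    return [set(range(s * (d + 1) + 1, (s + 1) * (d + 1) + 1)) for s in range(t)]

# Worst-case locality over all single-node failures, for growing d.
for d in (2, 3, 4):
    nodes = homogeneous_nodes(disjoint_psi(t=3, d=d))
    worst = max(repair_locality(nodes, i) for i in range(len(nodes)))
    print(f"d={d}: nodes={len(nodes)}, worst-case locality={worst}")
```

Under these assumptions the worst-case locality stays at 2 for each tested d, because any other node of the same sub-shadow group is missing only one of the failed node's blocks.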

Drawings

FIG. 1 shows the structure of the homogeneous FR code based on the shadow construction;

FIG. 2 shows the repetition-degree heterogeneous FR code based on the shadow construction;

FIG. 3 shows the storage-capacity heterogeneous FR code based on the shadow construction.

Detailed Description

The present invention is described in further detail below with reference to the accompanying drawings and examples. The following embodiments are intended only to make the objects, technical solutions and advantages of the invention clear to those skilled in the art; the invention is not limited to these embodiments.

Shadow structure: let X be a set of n elements, let $\binom{X}{k}$ denote the set of all k-element subsets of X, and let $\delta \subseteq \binom{X}{k}$, where 0 ≤ k ≤ n. The set

$$\partial\delta = \{\, E \in \tbinom{X}{k-1} : E \subset F \text{ for some } F \in \delta \,\} \qquad (1)$$

is called the shadow of δ, where $\binom{X}{k-1}$ denotes the set of all (k−1)-element subsets of X, E denotes a member of the shadow and F denotes a member of δ. In other words, the shadow consists of the sets obtained by deleting one element from a member of δ.
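The shadow definition translates directly into a few lines of code. This is a minimal sketch; the function name `shadow` and the toy family `delta` are illustrative choices, not taken from the patent.

```python
from itertools import combinations

def shadow(family):
    """Return all (k-1)-element sets contained in some k-element member of `family`."""
    result = set()
    for member in family:
        # Deleting one element from a k-set is the same as choosing k-1 of its elements.
        for smaller in combinations(sorted(member), len(member) - 1):
            result.add(frozenset(smaller))
    return result

delta = [{1, 2, 3}, {2, 3, 4}]
for e in sorted(shadow(delta), key=sorted):
    print(sorted(e))
```

Because the result is a set, a (k−1)-subset shared by two members of δ (here {2, 3}) appears only once.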

The embodiment discloses a method for constructing a partial repetition code based on a shadow structure, which specifically comprises the following steps:

Step 1: divide an original file M into k original data blocks, and perform (n, k) MDS coding on the k original data blocks to obtain n coded data blocks $c_1, \ldots, c_{k-1}, c_k, c_{k+1}, \ldots, c_n$; the n coded data blocks comprise k original data blocks and n−k check data blocks, where k ≥ 2 and n ≥ k;

Step 2: construct a set X and a set ψ according to the number n of coded data blocks, where X contains n distinct elements and ψ contains t subsets φ, each φ being a (d+1)-element subset of X with no repeated elements, where d is a positive integer and (d+1) < n;

Step 3: delete each element of a subset φ once to obtain the shadow set of ψ, where the shadow set comprises t groups of sub-shadow sets, each group comprises (d+1) sets φ', and each set φ' contains d elements;

Step 4: construct the FR code according to the shadow set, in one of three cases:

Case one: if a homogeneous FR code is constructed, each node of the homogeneous FR code corresponds to one set φ' in the shadow set; the number of nodes is t × (d+1), the node storage capacity is d, the repetition degree is d, and the data blocks stored by each node are the elements of the corresponding set φ';

Case two: if a repetition-degree heterogeneous FR code is constructed, delete any one set φ' from each group of sub-shadow sets in the shadow set to obtain a pruned shadow set, which comprises t groups of sub-shadow sets, each group comprising d sets φ';

each node of the repetition-degree heterogeneous FR code corresponds to one set φ' in the pruned shadow set; the number of nodes is t × d, the node storage capacity is d, the repetition degree is d or (d−1), and the data blocks stored by each node are the elements of the corresponding set φ';

Case three: if a storage-capacity heterogeneous FR code is constructed, then on the basis of case two, construct the shadow sub-incidence matrix A of the pruned shadow set and interchange the rows and columns of A to obtain a matrix A';

each node of the storage-capacity heterogeneous FR code corresponds to one row of the matrix A'; the number of nodes is the number of rows of A', the node storage capacity is d or (d−1), the repetition degree is d, and the data blocks stored by each node are given by the column indices of the entries equal to 1 in the corresponding row.

Specifically, in case one, the i-th set φ' of the shadow set corresponds to the i-th node of the FR code, and the elements of the i-th set correspond to the data blocks stored by the i-th node. According to the set ψ and its shadow set, the FR code is divided into t sub-shadow groups; the s-th sub-shadow group (0 < s ≤ t) is the shadow set φ_s' generated from the s-th subset of ψ.
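The node numbering and grouping described above can be sketched as follows. The toy ψ (t = 2, d = 2) and the order in which elements are deleted inside a group are assumptions made only for illustration; the patent does not fix an ordering, and `sub_shadow_groups` is a hypothetical helper name.

```python
def sub_shadow_groups(psi):
    """For each subset phi of psi, list the d-element sets left after deleting one element."""
    groups = []
    for phi in psi:
        elements = sorted(phi)
        # Assumed ordering: delete the smallest element first, then the next, and so on.
        groups.append([[x for x in elements if x != removed] for removed in elements])
    return groups

psi = [[1, 2, 3], [4, 5, 6]]           # toy example, not from the patent
groups = sub_shadow_groups(psi)

# Number the nodes 1, 2, ... and remember which sub-shadow group s each one belongs to.
nodes = {}
i = 0
for s, group in enumerate(groups, start=1):
    for blocks in group:
        i += 1
        nodes[i] = (s, blocks)

for index, (s, blocks) in nodes.items():
    print(f"node {index}: group {s}, blocks {blocks}")
```

With t = 2 and d = 2 this yields t × (d+1) = 6 nodes, each storing d = 2 blocks.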

Specifically, in case three, each row of the matrix A' represents a storage node, and the i-th row of A' represents the i-th storage node $N_i$ in the distributed storage system, i = 1, 2, …, n. The FR code is constructed by the following formula:

$$N_i = \{\, j : a_{ij} = 1 \,\} \qquad (2)$$

Here j = 1, 2, …, n, i denotes the i-th storage node, and $a_{ij}$ denotes the entry in row i and column j of the matrix. $N_i$ denotes a storage node of the FR code, and the data blocks it contains are the column indices of all entries equal to 1 in the i-th row of A'. Extracting these column indices gives the data blocks stored by one node, so a heterogeneous FR code can be constructed in which each node has storage capacity d or d−1 and the repetition degree is ρ = d.
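Formula (2) is a row-by-row read-out of a 0/1 matrix. The sketch below applies it to a small hypothetical matrix (not the patent's A'), using 1-based row and column indices to match the text; `nodes_from_matrix` is an illustrative name.

```python
def nodes_from_matrix(a_prime):
    """N_i = {j : a_ij = 1}, with 1-based row index i and column index j."""
    return {
        i: [j for j, entry in enumerate(row, start=1) if entry == 1]
        for i, row in enumerate(a_prime, start=1)
    }

# Hypothetical 3 x 3 matrix used only to show the read-out.
a_prime = [
    [1, 1, 0],
    [1, 0, 1],
    [0, 1, 1],
]
print(nodes_from_matrix(a_prime))
```

The row weight gives the storage capacity of the node, and the column weight gives the repetition degree of the corresponding block.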

Examples

This embodiment provides a method for constructing a scalable partial repetition (FR) code and, on the basis of the above embodiment, further discloses the following technical features:

This example constructs a (12, 9) MDS code. Let $m = (m_1, m_2, m_3, m_4, m_5, m_6, m_7, m_8, m_9)$ represent an original file stored in the distributed storage system, and let $c = (m_1, m_2, m_3, m_4, m_5, m_6, m_7, m_8, m_9, p_{10}, p_{11}, p_{12})$ represent the systematic MDS codeword, in which $m_1, \ldots, m_9$ are the original data blocks and $p_{10}, p_{11}, p_{12}$ are the check data blocks.
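The patent does not say which MDS code is used. As one possible illustration, the sketch below builds a systematic (12, 9) code by polynomial evaluation over the prime field GF(257) (a Reed-Solomon-style construction). The field size, the one-integer-per-block symbols and the names `mds_encode` and `mds_decode` are all assumptions of this sketch. It needs Python 3.8 or later for `pow(x, -1, p)`.

```python
P = 257  # assumed prime field size; each block is modelled as one integer in 0..256

def lagrange_eval(points, x):
    """Evaluate at x the unique polynomial of degree < len(points) through `points`, mod P."""
    total = 0
    for i, (xi, yi) in enumerate(points):
        num, den = 1, 1
        for j, (xj, _) in enumerate(points):
            if i != j:
                num = num * (x - xj) % P
                den = den * (xi - xj) % P
        total = (total + yi * num * pow(den, -1, P)) % P
    return total

def mds_encode(message, n):
    """Systematic encoding: positions 1..k carry the message, k+1..n carry check symbols."""
    k = len(message)
    points = list(enumerate(message, start=1))
    return list(message) + [lagrange_eval(points, x) for x in range(k + 1, n + 1)]

def mds_decode(survivors, k):
    """Recover the k message symbols from any k surviving (position -> symbol) pairs."""
    points = list(survivors.items())[:k]
    return [lagrange_eval(points, x) for x in range(1, k + 1)]

message = [10, 20, 30, 40, 50, 60, 70, 80, 90]           # m_1 .. m_9
codeword = mds_encode(message, 12)                        # m_1 .. m_9, p_10, p_11, p_12
survivors = {pos: codeword[pos - 1] for pos in range(4, 13)}  # blocks 1-3 lost
print(mds_decode(survivors, 9) == message)
```

Any 9 of the 12 symbols determine the degree-8 polynomial, which is what makes the code MDS.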

In this embodiment, the node storage capacity of the partial repetition code in the distributed storage system is first set to d = 3. A set X = {1,2,3,4,5,6,7,8,9,10,11,12} containing 12 elements is then selected, and a set ψ of 4-element subsets satisfying the conditions of step 2 is constructed as follows:

ψ={{1,4,8,12},{2,5,9,11},{3,6,7,10}} (3)
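The step 2 conditions on ψ can be checked mechanically. The sketch below also reports whether the subsets partition X, which happens to hold for this example; the patent does not state it as a requirement. The name `check_psi` is hypothetical.

```python
def check_psi(x_set, psi, d):
    """Check the Step 2 conditions and report whether psi also partitions X."""
    conditions = all(
        len(phi) == d + 1 and len(set(phi)) == d + 1 and set(phi) <= x_set
        for phi in psi
    ) and d + 1 < len(x_set)
    members = [x for phi in psi for x in phi]
    # Partition: every element of X appears in exactly one subset (an extra observation).
    partition = conditions and sorted(members) == sorted(x_set)
    return conditions, partition

X = set(range(1, 13))
psi = [(1, 4, 8, 12), (2, 5, 9, 11), (3, 6, 7, 10)]
print(check_psi(X, psi, d=3))  # (True, True)
```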

In this embodiment, ψ contains 3 subsets, so the code is divided into 3 sub-shadow groups. Deleting each element of each subset once gives the shadow set, which consists of the following groups:

{1,4,8}, {1,4,12}, {1,8,12}, {4,8,12}
{2,5,9}, {2,5,11}, {2,9,11}, {5,9,11}
{3,6,7}, {3,6,10}, {3,7,10}, {6,7,10}

According to case one in step 4, in which the sets of the shadow set correspond to the nodes of the distributed storage system, the constructed homogeneous FR code is as shown in FIG. 1. Each node has storage capacity d = 3 and repetition degree ρ = 3, and the code is divided into three sub-shadow groups. A different repetition-degree requirement can be met by deleting sets from the sub-shadow groups φ_s' according to the storage capacity of the system.
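The parameters quoted for case one can be recomputed from ψ. This sketch builds one node per 3-element subset of each member of ψ and counts how often each block is stored; variable names are illustrative.

```python
from collections import Counter
from itertools import combinations

psi = [(1, 4, 8, 12), (2, 5, 9, 11), (3, 6, 7, 10)]
d = 3

# One node per d-element subset of each phi (equivalently, per deleted element).
nodes = [set(c) for phi in psi for c in combinations(phi, d)]

capacity = {len(node) for node in nodes}                         # distinct node sizes
repetition = Counter(block for node in nodes for block in node)  # copies of each block

print(len(nodes), capacity, set(repetition.values()))  # 12 {3} {3}
```

The output matches t × (d+1) = 12 nodes, storage capacity d = 3 and repetition degree ρ = 3.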

In this embodiment, one set φ' is deleted from each sub-shadow group to obtain the pruned shadow set (shown as an image in the original publication).

From the pruned shadow set, an FR code with heterogeneous repetition degree is obtained according to case two in step 4, as shown in FIG. 2. The constructed FR code has repetition degree ρ = 2 or 3 and storage capacity d = 3.
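Case two can be reproduced numerically once a deletion rule is chosen. Which set the patent deletes from each group is not recoverable from this text, so the sketch below assumes the first set of every group is dropped; any other choice gives the same parameter values with different blocks keeping three copies.

```python
from collections import Counter
from itertools import combinations

psi = [(1, 4, 8, 12), (2, 5, 9, 11), (3, 6, 7, 10)]
d = 3

groups = [list(combinations(phi, d)) for phi in psi]
# Assumption: delete the first set of each sub-shadow group.
pruned_groups = [group[1:] for group in groups]

nodes = [set(blocks) for group in pruned_groups for blocks in group]
repetition = Counter(block for node in nodes for block in node)

print(len(nodes), sorted(set(repetition.values())))  # 9 [2, 3]
# Blocks that still have d copies: the one element absent from each deleted set.
print({block: count for block, count in sorted(repetition.items()) if count == d})
```

The d elements of each deleted set lose one copy (ρ = d−1 = 2), while the single element not in the deleted set keeps ρ = d = 3.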

In this embodiment, the shadow sub-incidence matrix A corresponding to the pruned shadow set is constructed (shown as an image in the original publication).

Interchanging the rows and columns of this shadow sub-incidence matrix gives the matrix A' (also shown as an image in the original publication).

From the incidence matrix A', an FR code with heterogeneous storage capacity is obtained according to case three in step 4, as shown in FIG. 3. The result is a heterogeneous partial repetition code whose node storage capacity is 2 or 3 and whose repetition degree is ρ = 3.
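Case three can be sketched the same way. Two assumptions are made here: the pruned shadow set is obtained by dropping the first set of each group, and A has one row per set φ' and one column per element of X, so that interchanging rows and columns gives a 12 × 9 matrix A'. The patent's own matrices are not available in this text.

```python
from itertools import combinations

psi = [(1, 4, 8, 12), (2, 5, 9, 11), (3, 6, 7, 10)]
d, n = 3, 12

# Assumed pruning rule: drop the first d-subset of every phi.
pruned = [set(c) for phi in psi for c in list(combinations(phi, d))[1:]]

# A: rows are the 9 remaining sets, columns are the elements 1..12 of X.
A = [[1 if x in s else 0 for x in range(1, n + 1)] for s in pruned]
# A': rows and columns interchanged, so rows are elements and columns are sets.
A_t = [list(col) for col in zip(*A)]

# Formula (2): node i stores the 1-based column indices j with a_ij = 1.
nodes = [[j for j, a in enumerate(row, start=1) if a == 1] for row in A_t]

capacities = sorted({len(node) for node in nodes})
repetition = sorted({sum(col) for col in zip(*A_t)})
print(len(nodes), capacities, repetition)  # 12 [2, 3] [3]
```

Under these assumptions the code has 12 nodes storing blocks indexed 1 to 9, with node storage capacity 2 or 3 and repetition degree 3, in line with the values stated for FIG. 3.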

This embodiment shows that in the homogeneous FR code every storage node has the same storage capacity and every block has the same repetition degree; that heterogeneous FR codes with different repetition degrees can be constructed simply by deleting sets; that heterogeneous FR codes with different node storage capacities can be constructed by interchanging the rows and columns of the incidence matrix; and that a suitable shadow set can be selected for the construction according to the system's requirements on node storage capacity and data repetition degree. FR codes constructed in this way are therefore better suited to practical distributed storage systems than general FR codes, and their storage cost is lower.
