Hierarchical optimization efficient ciphertext fuzzy retrieval method and related equipment

Document No.: 1816119 | Publication date: 2021-11-09

Reading note: This technology, "Hierarchical optimization efficient ciphertext fuzzy retrieval method and related equipment" (分级优化的高效密文模糊检索方法及相关设备), was designed by 牛晓光, 林青海, 徐远卓, 刘书洹 and 梅雨轩 on 2021-07-07. Its main content is as follows. The invention provides a hierarchically optimized efficient ciphertext fuzzy retrieval method and related equipment. The method comprises: a first master terminal uploads the ciphertext data, secure ciphertext index, first index tag value and flag bit corresponding to each piece of original data to a data server as one group of samples; a second master terminal uploads the secure retrieval trapdoor and language feature code corresponding to each query keyword to the data server; retrieval is performed in the data server to obtain N ciphertext data sets, which are sent to the second master terminal; the second master terminal decrypts the ciphertext data in each ciphertext data set and computes a second index tag value; and the second index tag value of each ciphertext data set is compared with the first index tag value to verify the retrieval result. With the invention, hierarchically optimized efficient ciphertext fuzzy retrieval can be completed without constructing a keyword set in advance, and verifying the retrieval result ensures its reliability.

1. A hierarchically optimized efficient ciphertext fuzzy retrieval method, characterized by comprising the following steps:

a first master terminal uploads the ciphertext data, secure ciphertext index, first index tag value and flag bit corresponding to each piece of original data to a data server as one group of samples;

a second master terminal computes a secure retrieval trapdoor and a language feature code from a query keyword, and uploads the secure retrieval trapdoor and the language feature code to the data server;

the data server selects, from the multiple groups of samples, the target samples whose flag bit matches the language feature code; computes the inner product of the secure ciphertext index of each target sample with the secure retrieval trapdoor to obtain multiple inner product results; sorts the inner product results that exceed a threshold in descending order; selects the N target samples corresponding to the first N inner product results in the sorted queue; obtains N ciphertext data sets based on the N target samples; and sends the N ciphertext data sets to the second master terminal, wherein each ciphertext data set comprises the hash value of the secure ciphertext index of the corresponding target sample, the ciphertext data and the first index tag value, and N is a positive integer;

the second master terminal decrypts the ciphertext data in each ciphertext data set to obtain the decrypted data corresponding to each ciphertext data set;

the second master terminal computes a second index tag value for each ciphertext data set from the decrypted data, the ciphertext data and the hash value corresponding to that ciphertext data set;

and the second master terminal compares the second index tag value of each ciphertext data set with the first index tag value, and takes the ciphertext data in any ciphertext data set whose second index tag value is consistent with the first index tag value as a correct retrieval result.

2. The hierarchically optimized efficient ciphertext fuzzy retrieval method of claim 1, wherein the step of computing the secure retrieval trapdoor and the language feature code from the query keyword comprises:

converting the query keyword into a fingerprint feature vector V_W of dimension k; processing the fingerprint feature vector V_W with a strong pseudo-random permutation function to obtain a secure fingerprint feature vector; processing the secure fingerprint feature vector with a secure nearest-neighbor algorithm; introducing two randomly generated invertible matrices M_1 and M_2 of dimension k × k; performing an encryption operation on the secure fingerprint feature vector processed by the secure nearest-neighbor algorithm and the invertible matrices M_1 and M_2; and taking the result as the secure retrieval trapdoor corresponding to the query keyword;

and obtaining the language feature code according to the language type of the query keyword.

3. The hierarchically optimized efficient ciphertext fuzzy retrieval method of claim 1, wherein, before the step of uploading the ciphertext data, secure ciphertext index, first index tag value and flag bit corresponding to each piece of original data to the data server as one group of samples, the method further comprises:

the first master terminal computes a data encryption key from a security parameter, and encrypts the original data with the data encryption key to obtain the ciphertext data;

the first master terminal extracts feature information from the original data and computes the secure ciphertext index from the feature information;

the first master terminal computes the first index tag value from the original data, the ciphertext data and the secure ciphertext index;

the first master terminal obtains the flag bit according to the language type of the original data.

4. The hierarchically optimized efficient ciphertext fuzzy retrieval method of claim 3, wherein the step of extracting feature information from the original data and computing the secure ciphertext index from the feature information comprises:

if the original data is English character data M_en, mapping the English character data to a k_1-dimensional vector v_1, wherein the position of v_1 corresponding to each character is set to 1;

if the original data is Chinese character data M_ch, applying the Wubi (five-stroke) encoding rule to convert the glyph code into four-character string data, and mapping the four-character string data to a k_1-dimensional vector v_1, wherein the position of v_1 corresponding to each character is set to 1;

selecting m independent p-stable locality-sensitive hash (LSH) functions to construct a Bloom filter vector V'_M of dimension k_2, and mapping the vector v_1 into the Bloom filter vector V'_M;

generating a random number sequence vector R of dimension k_3 with a pseudo-random sequence generator, and computing, from the Bloom filter vector V'_M and the random number sequence vector R, the fingerprint feature vector V_M corresponding to the specific sensitive data string;

processing the fingerprint feature vector V_M with a strong pseudo-random permutation function to obtain a secure fingerprint feature vector, and processing the secure fingerprint feature vector with a secure nearest-neighbor algorithm;

introducing two randomly generated invertible matrices M_1 and M_2 of dimension k × k, performing an encryption operation on the secure fingerprint feature vector processed by the secure nearest-neighbor algorithm and the invertible matrices M_1 and M_2, and taking the result as the secure ciphertext index corresponding to the original data.

5. The hierarchically optimized efficient ciphertext fuzzy retrieval method of claim 3, wherein the step of computing the first index tag value from the original data, the ciphertext data and the secure ciphertext index comprises:

calling a one-way hash function and a MAC function on the original data, the ciphertext data and the secure ciphertext index to obtain the first index tag value.

6. The hierarchically optimized efficient ciphertext fuzzy retrieval method of claim 1, wherein the step of computing the inner product of the secure ciphertext index of each target sample with the secure retrieval trapdoor comprises:

defining the secure ciphertext index and the secure retrieval trapdoor, and performing a vector operation on the secure ciphertext index and the secure retrieval trapdoor to obtain the inner product operation result, wherein the operation process is as follows:

7. A hierarchically optimized efficient ciphertext fuzzy retrieval apparatus, characterized by comprising:

an uploading module, configured for the first master terminal to upload the ciphertext data, secure ciphertext index, first index tag value and flag bit corresponding to each piece of original data to a data server as one group of samples;

a first calculation module, configured for the second master terminal to compute a secure retrieval trapdoor and a language feature code from a query keyword, and to upload the secure retrieval trapdoor and the language feature code to the data server;

a selecting module, configured for the data server to select, from the multiple groups of samples, the target samples whose flag bit matches the language feature code; to compute the inner product of the secure ciphertext index of each target sample with the secure retrieval trapdoor to obtain multiple inner product results; to sort the inner product results that exceed a threshold in descending order; to select the N target samples corresponding to the first N inner product results in the sorted queue; to obtain N ciphertext data sets based on the N target samples; and to send the N ciphertext data sets to the second master terminal, wherein each ciphertext data set comprises the hash value of the secure ciphertext index of the corresponding target sample, the ciphertext data and the first index tag value, and N is a positive integer;

a decryption module, configured for the second master terminal to decrypt the ciphertext data in each ciphertext data set to obtain the decrypted data corresponding to each ciphertext data set;

a second calculation module, configured for the second master terminal to compute a second index tag value for each ciphertext data set from the decrypted data, the ciphertext data and the hash value corresponding to that ciphertext data set;

a comparison module, configured for the second master terminal to compare the second index tag value of each ciphertext data set with the first index tag value, and to take the ciphertext data in any ciphertext data set whose second index tag value is consistent with the first index tag value as a correct retrieval result.

8. The hierarchically optimized efficient ciphertext fuzzy retrieval apparatus of claim 7, wherein the first calculation module is specifically configured to:

convert the query keyword into a fingerprint feature vector V_W of dimension k; process the fingerprint feature vector V_W with a strong pseudo-random permutation function to obtain a secure fingerprint feature vector; process the secure fingerprint feature vector with a secure nearest-neighbor algorithm; introduce two randomly generated invertible matrices M_1 and M_2 of dimension k × k; perform an encryption operation on the secure fingerprint feature vector processed by the secure nearest-neighbor algorithm and the invertible matrices M_1 and M_2; and take the result as the secure retrieval trapdoor corresponding to the query keyword;

and obtain the language feature code according to the language type of the query keyword.

9. A hierarchically optimized high-efficiency ciphertext fuzzy retrieval apparatus, comprising a processor, a memory, and a hierarchically optimized high-efficiency ciphertext fuzzy retrieval program stored on the memory and executable by the processor, wherein the hierarchically optimized high-efficiency ciphertext fuzzy retrieval program, when executed by the processor, implements the steps of the hierarchically optimized high-efficiency ciphertext fuzzy retrieval method of any of claims 1 to 6.

10. A readable storage medium having stored thereon a hierarchically optimized high-efficiency ciphertext fuzzy retrieval program, wherein the hierarchically optimized high-efficiency ciphertext fuzzy retrieval program, when executed by a processor, performs the steps of the hierarchically optimized high-efficiency ciphertext fuzzy retrieval method of any one of claims 1 to 6.

Technical Field

The invention relates to the technical field of cryptography, in particular to a hierarchical optimization efficient ciphertext fuzzy retrieval method and related equipment.

Background

Ciphertext fuzzy retrieval allows partial matches within a certain range, which improves the usability and functionality of ciphertext data and meets the practical needs of legitimately authorized users. Existing techniques that support ciphertext fuzzy retrieval can be roughly divided into three kinds: similarity-based matching, wildcard-based matching and dictionary-based matching.

Ciphertext fuzzy retrieval based on similarity matching mainly defines keyword similarity by edit distance. Its drawback is that, as the keyword length and the edit distance grow, the number of elements in the fuzzy keyword set grows exponentially, which greatly increases computational cost and wastes storage resources. Methods based on wildcard matching and on dictionary matching reduce the number of elements in the fuzzy keyword set, but the keyword set must still be constructed in advance, so computational cost remains high and storage space is still wasted.
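The difference in set size can be illustrated with a minimal sketch. It assumes a lowercase English alphabet and an edit distance of 1, and uses the wildcard encoding commonly described in the literature, in which `*` stands for any single edit at a position. The function names are hypothetical and are not part of the invention.

```python
import string


def edit1_variants(word, alphabet=string.ascii_lowercase):
    """Enumerate every distinct string within edit distance 1 of `word`."""
    out = {word}
    for i in range(len(word) + 1):
        for ch in alphabet:
            out.add(word[:i] + ch + word[i:])        # insertion at position i
    for i in range(len(word)):
        out.add(word[:i] + word[i + 1:])             # deletion of position i
        for ch in alphabet:
            out.add(word[:i] + ch + word[i + 1:])    # substitution at position i
    return out


def wildcard_variants(word):
    """Wildcard-compressed fuzzy set: '*' marks one edited position."""
    out = {word}
    for i in range(len(word) + 1):
        out.add(word[:i] + "*" + word[i:])           # insertion slot
    for i in range(len(word)):
        out.add(word[:i] + "*" + word[i + 1:])       # substitution or deletion slot
    return out


if __name__ == "__main__":
    w = "castle"
    print(len(edit1_variants(w)), len(wildcard_variants(w)))
```

Even in the wildcard form, one such set has to be built and stored for every keyword before any query arrives, which is the precomputation the invention aims to avoid.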

Disclosure of Invention

The main purpose of the invention is to provide a hierarchically optimized efficient ciphertext fuzzy retrieval method and related equipment that solve the following problems: a keyword set must be constructed in advance, which consumes considerable computing resources and wastes storage space, and the retrieval result cannot be verified.

In a first aspect, the invention provides a hierarchically optimized efficient ciphertext fuzzy retrieval method, which comprises the following steps:

a first master terminal uploads the ciphertext data, secure ciphertext index, first index tag value and flag bit corresponding to each piece of original data to a data server as one group of samples;

a second master terminal computes a secure retrieval trapdoor and a language feature code from a query keyword, and uploads the secure retrieval trapdoor and the language feature code to the data server;

the data server selects, from the multiple groups of samples, the target samples whose flag bit matches the language feature code; computes the inner product of the secure ciphertext index of each target sample with the secure retrieval trapdoor to obtain multiple inner product results; sorts the inner product results that exceed a threshold in descending order; selects the N target samples corresponding to the first N inner product results in the sorted queue; obtains N ciphertext data sets based on the N target samples; and sends the N ciphertext data sets to the second master terminal, wherein each ciphertext data set comprises the hash value of the secure ciphertext index of the corresponding target sample, the ciphertext data and the first index tag value, and N is a positive integer;

the second master terminal decrypts the ciphertext data in each ciphertext data set to obtain the decrypted data corresponding to each ciphertext data set;

the second master terminal computes a second index tag value for each ciphertext data set from the decrypted data, the ciphertext data and the hash value corresponding to that ciphertext data set;

and the second master terminal compares the second index tag value of each ciphertext data set with the first index tag value, and takes the ciphertext data in any ciphertext data set whose second index tag value is consistent with the first index tag value as a correct retrieval result.
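The server-side selection step can be sketched as follows. This is a minimal illustration that assumes the secure ciphertext index and the trapdoor are plain numeric vectors of equal length; the `Sample` record and the `select_top_n` function are hypothetical names introduced only for this example.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass
class Sample:
    ciphertext: bytes          # ciphertext data
    index: Sequence[float]     # secure ciphertext index
    tag: bytes                 # first index tag value
    flag: int                  # flag bit derived from the language type


def inner(a: Sequence[float], b: Sequence[float]) -> float:
    """Plain inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def select_top_n(samples: List[Sample], trapdoor: Sequence[float],
                 feature_code: int, threshold: float, n: int) -> List[Sample]:
    # First level: keep only samples whose flag bit matches the language feature code.
    scored = [(inner(s.index, trapdoor), s) for s in samples if s.flag == feature_code]
    # Second level: keep scores above the threshold and rank them in descending order.
    kept = [pair for pair in scored if pair[0] > threshold]
    kept.sort(key=lambda pair: pair[0], reverse=True)
    return [s for _, s in kept[:n]]
```

Filtering on the flag bit first means inner products are only computed for samples in the same language as the query, which appears to be the purpose of the hierarchical arrangement.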

Optionally, the step of computing the secure retrieval trapdoor and the language feature code from the query keyword comprises:

converting the query keyword into a fingerprint feature vector V_W of dimension k; processing the fingerprint feature vector V_W with a strong pseudo-random permutation function to obtain a secure fingerprint feature vector; processing the secure fingerprint feature vector with a secure nearest-neighbor algorithm; introducing two randomly generated invertible matrices M_1 and M_2 of dimension k × k; performing an encryption operation on the secure fingerprint feature vector processed by the secure nearest-neighbor algorithm and the invertible matrices M_1 and M_2; and taking the result as the secure retrieval trapdoor corresponding to the query keyword;

and obtaining the language feature code according to the language type of the query keyword.
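The text here does not reproduce the exact encryption formulas. The sketch below therefore assumes the standard secure nearest-neighbor construction: a binary split vector, an index encrypted with the transposed matrices, and a trapdoor encrypted with the inverse matrices. All function names are hypothetical, the pseudo-random permutation step is omitted, and `random.Random` is used only to make the demonstration reproducible; it is not a secure source of randomness.

```python
import random
from fractions import Fraction


def mat_vec(m, v):
    return [sum(a * b for a, b in zip(row, v)) for row in m]


def transpose(m):
    return [list(col) for col in zip(*m)]


def invert(m):
    """Gauss-Jordan inverse over exact fractions; raises ValueError if singular."""
    n = len(m)
    aug = [[Fraction(x) for x in row] + [Fraction(int(i == j)) for j in range(n)]
           for i, row in enumerate(m)]
    for c in range(n):
        pivot = next((r for r in range(c, n) if aug[r][c] != 0), None)
        if pivot is None:
            raise ValueError("singular matrix")
        aug[c], aug[pivot] = aug[pivot], aug[c]
        pv = aug[c][c]
        aug[c] = [x / pv for x in aug[c]]
        for r in range(n):
            if r != c and aug[r][c] != 0:
                f = aug[r][c]
                aug[r] = [x - f * y for x, y in zip(aug[r], aug[c])]
    return [row[n:] for row in aug]


def random_invertible(k, rng):
    """Draw small integer matrices until one is invertible; return it with its inverse."""
    while True:
        m = [[rng.randint(-5, 5) for _ in range(k)] for _ in range(k)]
        try:
            return m, invert(m)
        except ValueError:
            continue


def split(v, s, split_when, rng):
    """Split v into two shares: random shares where s[j] == split_when, copies elsewhere."""
    a, b = [], []
    for x, bit in zip(v, s):
        if bit == split_when:
            r = Fraction(rng.randint(-9, 9))
            a.append(r)
            b.append(x - r)
        else:
            a.append(x)
            b.append(x)
    return a, b


def keygen(k, seed=0):
    rng = random.Random(seed)
    s = [rng.randint(0, 1) for _ in range(k)]      # binary split vector
    m1, m1_inv = random_invertible(k, rng)
    m2, m2_inv = random_invertible(k, rng)
    return {"s": s, "m1": m1, "m1_inv": m1_inv, "m2": m2, "m2_inv": m2_inv, "rng": rng}


def build_index(p, key):
    pa, pb = split(p, key["s"], 1, key["rng"])
    return mat_vec(transpose(key["m1"]), pa), mat_vec(transpose(key["m2"]), pb)


def build_trapdoor(q, key):
    qa, qb = split(q, key["s"], 0, key["rng"])
    return mat_vec(key["m1_inv"], qa), mat_vec(key["m2_inv"], qb)


def score(index, trapdoor):
    """Inner product of the encrypted pair; equals the plaintext inner product p . q."""
    return sum(sum(x * y for x, y in zip(i, t)) for i, t in zip(index, trapdoor))
```

Under these assumptions the server can rank samples by `score` without learning the underlying vectors, because the matrices cancel in the product while the random shares recombine to the original components.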

Optionally, before the step of uploading the ciphertext data, secure ciphertext index, first index tag value and flag bit corresponding to each piece of original data to the data server as one group of samples, the method further comprises:

the first master terminal computes a data encryption key from a security parameter, and encrypts the original data with the data encryption key to obtain the ciphertext data;

the first master terminal extracts feature information from the original data and computes the secure ciphertext index from the feature information;

the first master terminal computes the first index tag value from the original data, the ciphertext data and the secure ciphertext index;

the first master terminal obtains the flag bit according to the language type of the original data.

Optionally, the step of extracting feature information from the original data and computing the secure ciphertext index from the feature information comprises:

if the original data is English character data M_en, mapping the English character data to a k_1-dimensional vector v_1, wherein the position of v_1 corresponding to each character is set to 1;

if the original data is Chinese character data M_ch, applying the Wubi (five-stroke) encoding rule to convert the glyph code into four-character string data, and mapping the four-character string data to a k_1-dimensional vector v_1, wherein the position of v_1 corresponding to each character is set to 1;

selecting m independent p-stable locality-sensitive hash (LSH) functions to construct a Bloom filter vector V'_M of dimension k_2, and mapping the vector v_1 into the Bloom filter vector V'_M;

generating a random number sequence vector R of dimension k_3 with a pseudo-random sequence generator, and computing, from the Bloom filter vector V'_M and the random number sequence vector R, the fingerprint feature vector V_M corresponding to the specific sensitive data string;

processing the fingerprint feature vector V_M with a strong pseudo-random permutation function to obtain a secure fingerprint feature vector, and processing the secure fingerprint feature vector with a secure nearest-neighbor algorithm;

introducing two randomly generated invertible matrices M_1 and M_2 of dimension k × k, performing an encryption operation on the secure fingerprint feature vector processed by the secure nearest-neighbor algorithm and the invertible matrices M_1 and M_2, and taking the result as the secure ciphertext index corresponding to the original data.
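The first two stages of this pipeline (character vector, then LSH-based Bloom filter) can be sketched as below for English data. Several choices are assumptions made only for illustration: k_1 = 26 with one slot per lowercase letter, Gaussian (2-stable) projections of the form floor((a·v + b) / w), a bucket width w = 4.0, and the reduction of each hash value modulo k_2. The Wubi lookup for Chinese data and the later stages involving R, the permutation and the matrices are omitted.

```python
import math
import random


def char_vector(word, k1=26):
    """Set the slot of every lowercase letter that occurs in `word` to 1."""
    v = [0] * k1
    for ch in word.lower():
        if "a" <= ch <= "z":
            v[ord(ch) - ord("a")] = 1
    return v


def make_lsh(k1, m, w=4.0, seed=0):
    """Build m independent p-stable (Gaussian) LSH functions h(v) = floor((a.v + b) / w)."""
    rng = random.Random(seed)
    funcs = []
    for _ in range(m):
        a = [rng.gauss(0.0, 1.0) for _ in range(k1)]
        b = rng.uniform(0.0, w)
        funcs.append((a, b))

    def hashes(v):
        return [math.floor((sum(x * y for x, y in zip(a, v)) + b) / w)
                for a, b in funcs]

    return hashes


def bloom_vector(v, hashes, k2=64):
    """Map v into a k2-dimensional Bloom filter vector using the LSH values as positions."""
    bf = [0] * k2
    for h in hashes(v):
        bf[h % k2] = 1     # Python's % keeps the position non-negative
    return bf


if __name__ == "__main__":
    lsh = make_lsh(k1=26, m=4)
    print(bloom_vector(char_vector("network"), lsh))
    print(bloom_vector(char_vector("netwrok"), lsh))
```

Because the hash functions are locality-sensitive, keywords whose character vectors are close tend to set the same Bloom filter positions, which is what allows fuzzy matching without enumerating a keyword set in advance.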

Optionally, the step of computing the first index tag value from the original data, the ciphertext data and the secure ciphertext index comprises:

calling a one-way hash function and a MAC function on the original data, the ciphertext data and the secure ciphertext index to obtain the first index tag value.
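The tag computation and the later comparison can be sketched as follows. The text does not specify how the hash function and the MAC function are composed, so this example assumes SHA-256 for the hash of the secure ciphertext index and HMAC-SHA-256 over a length-prefixed concatenation of the data, the ciphertext and that hash. The function names and the shared `mac_key` are hypothetical.

```python
import hashlib
import hmac


def index_hash(index_bytes: bytes) -> bytes:
    """One-way hash of the serialized secure ciphertext index."""
    return hashlib.sha256(index_bytes).digest()


def make_tag(mac_key: bytes, data: bytes, ciphertext: bytes, idx_hash: bytes) -> bytes:
    """MAC over (data, ciphertext, index hash); length prefixes keep the fields unambiguous."""
    msg = b"".join(len(part).to_bytes(4, "big") + part
                   for part in (data, ciphertext, idx_hash))
    return hmac.new(mac_key, msg, hashlib.sha256).digest()


def verify(mac_key: bytes, decrypted: bytes, ciphertext: bytes,
           idx_hash: bytes, first_tag: bytes) -> bool:
    """Recompute the tag from the decrypted data and compare in constant time."""
    second_tag = make_tag(mac_key, decrypted, ciphertext, idx_hash)
    return hmac.compare_digest(second_tag, first_tag)
```

With this arrangement the second master terminal only needs the hash of the index rather than the index itself, which matches the contents of the ciphertext data set returned by the data server.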

Optionally, the step of computing the inner product of the secure ciphertext index of each target sample with the secure retrieval trapdoor comprises:

defining the secure ciphertext index and the secure retrieval trapdoor, and performing a vector operation on the secure ciphertext index and the secure retrieval trapdoor to obtain the inner product operation result, wherein the operation process is as follows:

In a second aspect, the present invention further provides a hierarchically optimized efficient ciphertext fuzzy retrieval apparatus, comprising:

an uploading module, configured for the first master terminal to upload the ciphertext data, secure ciphertext index, first index tag value and flag bit corresponding to each piece of original data to a data server as one group of samples;

a first calculation module, configured for the second master terminal to compute a secure retrieval trapdoor and a language feature code from a query keyword, and to upload the secure retrieval trapdoor and the language feature code to the data server;

a selecting module, configured for the data server to select, from the multiple groups of samples, the target samples whose flag bit matches the language feature code; to compute the inner product of the secure ciphertext index of each target sample with the secure retrieval trapdoor to obtain multiple inner product results; to sort the inner product results that exceed a threshold in descending order; to select the N target samples corresponding to the first N inner product results in the sorted queue; to obtain N ciphertext data sets based on the N target samples; and to send the N ciphertext data sets to the second master terminal, wherein each ciphertext data set comprises the hash value of the secure ciphertext index of the corresponding target sample, the ciphertext data and the first index tag value, and N is a positive integer;

a decryption module, configured for the second master terminal to decrypt the ciphertext data in each ciphertext data set to obtain the decrypted data corresponding to each ciphertext data set;

a second calculation module, configured for the second master terminal to compute a second index tag value for each ciphertext data set from the decrypted data, the ciphertext data and the hash value corresponding to that ciphertext data set;

a comparison module, configured for the second master terminal to compare the second index tag value of each ciphertext data set with the first index tag value, and to take the ciphertext data in any ciphertext data set whose second index tag value is consistent with the first index tag value as a correct retrieval result.

Optionally, the first calculation module is specifically configured to:

convert the query keyword into a fingerprint feature vector V_W of dimension k; process the fingerprint feature vector V_W with a strong pseudo-random permutation function to obtain a secure fingerprint feature vector; process the secure fingerprint feature vector with a secure nearest-neighbor algorithm; introduce two randomly generated invertible matrices M_1 and M_2 of dimension k × k; perform an encryption operation on the secure fingerprint feature vector processed by the secure nearest-neighbor algorithm and the invertible matrices M_1 and M_2; and take the result as the secure retrieval trapdoor corresponding to the query keyword;

and obtain the language feature code according to the language type of the query keyword.

In a third aspect, the present invention further provides a hierarchically optimized high-efficiency fuzzy ciphertext retrieval apparatus, where the hierarchically optimized high-efficiency fuzzy ciphertext retrieval apparatus includes a processor, a memory, and a hierarchically optimized high-efficiency fuzzy ciphertext retrieval program stored in the memory and executable by the processor, where the hierarchically optimized high-efficiency fuzzy ciphertext retrieval program, when executed by the processor, implements the steps of the hierarchically optimized high-efficiency fuzzy ciphertext retrieval method as described above.

In a fourth aspect, the present invention further provides a readable storage medium, on which a hierarchical optimized high-efficiency ciphertext fuzzy retrieval program is stored, wherein the hierarchical optimized high-efficiency ciphertext fuzzy retrieval program, when executed by a processor, implements the steps of the hierarchical optimized high-efficiency ciphertext fuzzy retrieval method as described above.

In the invention, a first master terminal uploads the ciphertext data, secure ciphertext index, first index tag value and flag bit corresponding to each piece of original data to a data server as one group of samples; a second master terminal computes a secure retrieval trapdoor and a language feature code from a query keyword and uploads them to the data server; the data server selects, from the multiple groups of samples, the target samples whose flag bit matches the language feature code, computes the inner product of the secure ciphertext index of each target sample with the secure retrieval trapdoor to obtain multiple inner product results, sorts the inner product results that exceed a threshold in descending order, selects the N target samples corresponding to the first N inner product results in the sorted queue, obtains N ciphertext data sets based on the N target samples, and sends the N ciphertext data sets to the second master terminal, wherein each ciphertext data set comprises the hash value of the secure ciphertext index of the corresponding target sample, the ciphertext data and the first index tag value, and N is a positive integer; the second master terminal decrypts the ciphertext data in each ciphertext data set to obtain the corresponding decrypted data; the second master terminal computes a second index tag value for each ciphertext data set from the decrypted data, the ciphertext data and the hash value corresponding to that ciphertext data set; and the second master terminal compares the second index tag value of each ciphertext data set with the first index tag value, and takes the ciphertext data in any ciphertext data set whose second index tag value is consistent with the first index tag value as a correct retrieval result. With the invention, hierarchically optimized efficient ciphertext fuzzy retrieval can be completed without constructing a keyword set in advance, which avoids the high computational cost and wasted storage space that constructing such a set entails, and verifying the retrieval results ensures their correctness and reliability.

Drawings

Fig. 1 is a schematic diagram of the hardware structure of a hierarchically optimized efficient ciphertext fuzzy retrieval apparatus according to an embodiment of the present invention;

Fig. 2 is a schematic flowchart of a first embodiment of the hierarchically optimized efficient ciphertext fuzzy retrieval method of the present invention;

Fig. 3 is a functional module diagram of a first embodiment of the hierarchically optimized efficient ciphertext fuzzy retrieval apparatus of the present invention.

The implementation, functional features and advantages of the objects of the present invention will be further explained with reference to the accompanying drawings.

Detailed Description

It should be understood that the specific embodiments described herein are merely illustrative of the invention and are not intended to limit the invention.

In a first aspect, embodiments of the present invention provide a hierarchical optimized efficient fuzzy ciphertext retrieval apparatus, which may be an apparatus with a data processing function, such as a Personal Computer (PC), a notebook computer, or a server.

Referring to Fig. 1, Fig. 1 is a schematic diagram of the hardware structure of a hierarchically optimized efficient ciphertext fuzzy retrieval apparatus according to an embodiment of the present invention. In this embodiment, the apparatus may include a processor 1001 (e.g., a Central Processing Unit, CPU), a communication bus 1002, a user interface 1003, a network interface 1004, and a memory 1005. The communication bus 1002 provides connection and communication among these components; the user interface 1003 may include a display and an input unit such as a keyboard; the network interface 1004 may optionally include a standard wired interface and a wireless interface (e.g., a Wi-Fi interface); the memory 1005 may be a random access memory (RAM) or a non-volatile memory such as a magnetic disk memory, and may optionally be a storage device independent of the processor 1001. Those skilled in the art will appreciate that the hardware configuration shown in Fig. 1 does not limit the present invention, which may include more or fewer components than shown, combine certain components, or arrange the components differently.

With continued reference to Fig. 1, the memory 1005, which is one type of computer storage medium, may include an operating system, a network communication module, a user interface module, and a hierarchically optimized efficient ciphertext fuzzy retrieval program. The processor 1001 may call the hierarchically optimized efficient ciphertext fuzzy retrieval program stored in the memory 1005 and execute the hierarchically optimized efficient ciphertext fuzzy retrieval method provided by the embodiment of the present invention.

In a second aspect, the embodiment of the invention provides a hierarchical optimization efficient ciphertext fuzzy retrieval method.

In an embodiment, referring to fig. 2, fig. 2 is a flowchart illustrating a hierarchical optimized efficient fuzzy search method according to a first embodiment of the present invention. As shown in fig. 2, the hierarchical optimized efficient ciphertext fuzzy retrieval method includes the following steps:

s10: the first master terminal uploads ciphertext data, a safety ciphertext index, a first index tag value and a flag bit corresponding to each original data as a group of samples to a data server;

in this embodiment, the first bus uploads the ciphertext data C, the secure ciphertext index S, the first index Tag value Tag, and the flag sign corresponding to each piece of original data to the data server.

S20: the second master calculates according to the query key words to obtain a safe retrieval trapdoor and a language feature code, and uploads the safe retrieval trapdoor and the language feature code to a data server;

In this embodiment, the second master terminal calculates a secure retrieval trapdoor Q from the query keyword W by using a fuzzy retrieval algorithm, where each keyword corresponds to one secure retrieval trapdoor Q. The language feature code is assigned according to the language type of the query keyword.

Further, in an embodiment, the step of calculating the secure retrieval trapdoor and the language feature code from the query keyword includes:

converting the query keyword into a fingerprint feature vector $V_W$ of dimension $k$; processing $V_W$ with a strong pseudo-random permutation function to obtain a secure fingerprint feature vector; processing the secure fingerprint feature vector with a secure nearest neighbor algorithm; introducing two randomly generated invertible matrices $M_1$ and $M_2$ of dimension $k \times k$; performing an encryption operation on the processed secure fingerprint feature vector with $M_1$ and $M_2$; and taking the operation result as the secure retrieval trapdoor corresponding to the query keyword;

and obtaining the language feature code according to the language type of the query keyword.

In this embodiment, the second master terminal converts the query keyword into a fingerprint feature vector $V_W$ of dimension $k$. Using the strong pseudo-random permutation function together with the index key SK, which the first master terminal computes from the security parameter K by calling the key generation algorithm, it calculates the secure fingerprint feature vector. A binary bit vector of dimension $k$ serves as the split vector: following the secure nearest neighbor algorithm, the secure fingerprint feature vector is split and encrypted into two corresponding sub-vectors. Two randomly generated invertible matrices $M_1$ and $M_2$ of dimension $k \times k$ are then introduced, the vector-matrix encryption operation is performed, and the result is taken as the secure retrieval trapdoor Q corresponding to the query keyword W.

For a Chinese-character query keyword $W_{ch}$, the language feature code sig is assigned a 4-bit unit in which the first 3 bits are assigned randomly and the last bit is set to 0. For an English-character query keyword $W_{en}$, the language feature code sig is assigned a 4-bit unit in which the first 3 bits are assigned randomly and the last bit is set to 1. The secure retrieval trapdoor Q and the corresponding language feature code sig are then uploaded to the data server.
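As an illustration only, the 4-bit language feature code described above could be produced as follows. This is a minimal sketch: the function names and the pure-ASCII test used to decide whether a keyword is English are assumptions made for the example, not part of the described method.

```python
import secrets

def language_feature_code(is_english: bool) -> int:
    """Build a 4-bit code: 3 random leading bits, then 1 language bit."""
    random_prefix = secrets.randbits(3)        # first 3 bits, assigned randomly
    language_bit = 1 if is_english else 0      # last bit: 1 = English, 0 = Chinese
    return random_prefix * 2 + language_bit    # shift the prefix left by one bit

def is_english_keyword(keyword: str) -> bool:
    """Hypothetical language test: treat pure-ASCII keywords as English."""
    return keyword.isascii()

sig_en = language_feature_code(is_english_keyword("cipher"))
sig_ch = language_feature_code(is_english_keyword("密文"))
print(format(sig_en, "04b"), format(sig_ch, "04b"))
```

Only the last bit carries information; the random leading bits vary from call to call, so two codes for the same language generally differ.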

S30: the data server selects, from the multiple groups of samples, the target samples whose flag bits match the language feature code, calculates the inner product of the secure ciphertext index of each target sample and the secure retrieval trapdoor to obtain multiple inner product results, sorts the inner product results that exceed a threshold in descending order, selects the N target samples corresponding to the first N results in the sorted queue, obtains N ciphertext data sets from the N target samples, and sends the N ciphertext data sets to the second master terminal, where each ciphertext data set includes the hash value of the secure ciphertext index, the ciphertext data and the first index tag value of the corresponding target sample, and N is a positive integer;

In this embodiment, after receiving the multiple groups of samples uploaded by the first master terminal and the secure retrieval trapdoor and language feature code uploaded by the second master terminal, the data server uses the function match(sig_4, sign_4) to test whether the 4th bit of the language feature code sig equals the 4th bit of the flag bit sign in the tag field of the ciphertext data. A return value of 1 means that the secure retrieval trapdoor of the query keyword is consistent with the language feature of the ciphertext data; a return value of 0 means that it is not. Based on these comparison results, the server selects from the multiple groups of samples the target samples whose flag bits match the language feature code, and performs a vector operation on the secure ciphertext index of each target sample and the secure retrieval trapdoor. The inner product results that exceed the threshold are sorted in descending order and, in line with the system's Top-K retrieval requirement, the N target samples corresponding to the first N results in the sorted queue are returned. The server calculates the hash value of the secure ciphertext index of each of these target samples. For each target sample, the hash value of its secure ciphertext index, its ciphertext data and its first index tag value form one ciphertext data set; the resulting N ciphertext data sets are sent to the second master terminal, where N is a positive integer.
The purpose of the threshold is to improve the precision of the retrieved data and to reduce the amount of redundant data returned (at the cost of a lower recall rate). A smaller returned ciphertext set lowers the computational cost of the subsequent decryption and comparison operations, which improves both retrieval efficiency and accuracy.
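The server-side flow of step S30 can be sketched roughly as below. The `Sample` record, the flattening of the secure ciphertext index into a single numeric vector, the use of SHA-256 and the serialization fed to the hash are all assumptions chosen for illustration; the description does not fix any of them.

```python
import hashlib
from typing import List, NamedTuple, Sequence, Tuple

class Sample(NamedTuple):
    ciphertext: bytes
    index: Sequence[float]   # secure ciphertext index S, flattened (assumption)
    tag: bytes               # first index tag value
    sign: int                # 4-bit flag bit field

def match(sig: int, sign: int) -> int:
    """Return 1 when the last (4th) bits of sig and sign agree, else 0."""
    return 1 if (sig & 1) == (sign & 1) else 0

def inner_product(a: Sequence[float], b: Sequence[float]) -> float:
    return sum(x * y for x, y in zip(a, b))

def retrieve(samples: List[Sample], trapdoor: Sequence[float], sig: int,
             threshold: float, n: int) -> List[Tuple[bytes, bytes, bytes]]:
    scored = []
    for sample in samples:
        if match(sig, sample.sign) != 1:
            continue                              # language mismatch: skip
        score = inner_product(sample.index, trapdoor)
        if score > threshold:                     # keep only strong matches
            scored.append((score, sample))
    scored.sort(key=lambda pair: pair[0], reverse=True)   # descending order
    results = []
    for _, sample in scored[:n]:                  # first N of the sorted queue
        index_hash = hashlib.sha256(repr(list(sample.index)).encode()).digest()
        results.append((index_hash, sample.ciphertext, sample.tag))
    return results

samples = [
    Sample(b"c1", [1, 0, 1], b"t1", 0b0101),
    Sample(b"c2", [1, 1, 1], b"t2", 0b0011),
    Sample(b"c3", [0, 0, 1], b"t3", 0b1001),
    Sample(b"c4", [1, 1, 1], b"t4", 0b0100),   # last bit 0: filtered out
]
results = retrieve(samples, trapdoor=[1, 1, 0], sig=0b1111, threshold=0.5, n=2)
```

In this toy run the language filter removes `c4` before any inner product is calculated, and the threshold removes `c3`, so only two ciphertext data sets are returned.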

Further, in an embodiment, the step of calculating the inner product of the secure ciphertext index of each target sample and the secure retrieval trapdoor includes:

defining the secure ciphertext index S and the secure retrieval trapdoor Q, and performing a vector operation on the secure ciphertext index and the secure retrieval trapdoor to obtain the inner product operation result, where T denotes the matrix transpose.

In this embodiment, a vector operation is performed on the secure ciphertext index of each target sample and the secure retrieval trapdoor, and the inner product results that exceed the threshold are sorted in descending order. The larger the inner product result, the better the secure retrieval trapdoor Q matches the secure ciphertext index S.
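The exact expressions for S and Q are not recoverable from this text. As a hedged illustration, the sketch below follows the commonly used secure nearest neighbor construction, in which the index is $(M_1^T p', M_2^T p'')$ and the trapdoor is $(M_1^{-1} q', M_2^{-1} q'')$; this choice, along with every function name and parameter, is an assumption rather than the definition given in this specification. It shows why the server can rank by inner product without seeing the plaintext vectors.

```python
import random
from fractions import Fraction

def invert(matrix):
    """Gauss-Jordan inverse over exact fractions; raises ValueError if singular."""
    n = len(matrix)
    aug = [[Fraction(x) for x in row] + [Fraction(int(i == j)) for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = next((r for r in range(col, n) if aug[r][col] != 0), None)
        if pivot is None:
            raise ValueError("matrix is singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        lead = aug[col][col]
        aug[col] = [x / lead for x in aug[col]]
        for r in range(n):
            if r != col and aug[r][col] != 0:
                factor = aug[r][col]
                aug[r] = [x - factor * y for x, y in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]

def random_invertible(k, rng):
    """Draw random integer matrices until one is invertible."""
    while True:
        m = [[rng.randint(-5, 5) for _ in range(k)] for _ in range(k)]
        try:
            return m, invert(m)
        except ValueError:
            continue

def mat_vec(m, v):
    return [sum(a * b for a, b in zip(row, v)) for row in m]

def transpose(m):
    return [list(col) for col in zip(*m)]

def split(v, s, rng, split_when):
    """Split v into two shares: random shares where s == split_when, copies elsewhere."""
    first, second = [], []
    for value, bit in zip(v, s):
        if bit == split_when:
            share = rng.randint(-9, 9)
            first.append(share)
            second.append(value - share)
        else:
            first.append(value)
            second.append(value)
    return first, second

def build_index(p, s, m1, m2, rng):
    p1, p2 = split(p, s, rng, split_when=1)
    return mat_vec(transpose(m1), p1), mat_vec(transpose(m2), p2)

def build_trapdoor(q, s, m1_inv, m2_inv, rng):
    q1, q2 = split(q, s, rng, split_when=0)
    return mat_vec(m1_inv, q1), mat_vec(m2_inv, q2)

def score(index, trapdoor):
    return sum(sum(a * b for a, b in zip(i, t)) for i, t in zip(index, trapdoor))

rng = random.Random(7)
k = 4
s = [rng.randint(0, 1) for _ in range(k)]      # binary split vector
m1, m1_inv = random_invertible(k, rng)
m2, m2_inv = random_invertible(k, rng)
p, q = [1, 0, 1, 1], [1, 1, 0, 1]              # plaintext index and query vectors
index = build_index(p, s, m1, m2, rng)
trapdoor = build_trapdoor(q, s, m1_inv, m2_inv, rng)
assert score(index, trapdoor) == sum(a * b for a, b in zip(p, q))
```

Under this construction the matrices cancel, since $(M_1^T p')^T (M_1^{-1} q') = p'^T q'$, and the two split shares recombine so that the encrypted score equals the plaintext inner product $p \cdot q$. Exact fractions are used here only to make that equality checkable without floating-point error.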

S40: the second master terminal decrypts the ciphertext data in each ciphertext data set respectively to obtain decrypted data corresponding to each ciphertext data set;

In this embodiment, the second master terminal uses the data encryption key EK and the symmetric encryption algorithm to decrypt the ciphertext data $C_i$ in each ciphertext data set, obtaining the corresponding decrypted data $M_i$, defined as $M_i = D_{EK}(C_i)$.

S50: the second master terminal calculates a second index tag value for each ciphertext data set from the decrypted data, the ciphertext data and the hash value of that ciphertext data set;

In this embodiment, the second master terminal takes the decrypted data $M_i$, the ciphertext data $C_i$ and the hash value $H_i$ of each ciphertext data set and, using the one-way hash function, the MAC function and the index key SK of step S20, calculates the second index tag value $Tag'_i = MAC_{SK}(Hash(M_i \| C_i) \| H_i)$.

S60: the second master terminal compares the second index tag value of each ciphertext data set with the first index tag value, and takes the ciphertext data in each ciphertext data set whose second index tag value is consistent with the first index tag value as a correct retrieval result.

In this embodiment, the second master terminal compares the second index tag value of each ciphertext data set with the first index tag value. Ciphertext data in a set whose second index tag value is consistent with the first index tag value is taken as a correct retrieval result; ciphertext data in a set whose second index tag value is inconsistent with the first index tag value is taken as an incorrect retrieval result.
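The verification in steps S50 and S60 can be sketched as follows. SHA-256 as the one-way hash and HMAC-SHA-256 as the MAC function are assumptions made for the example (the description names neither), and the function names are hypothetical.

```python
import hashlib
import hmac

def compute_tag(sk: bytes, plaintext: bytes, ciphertext: bytes,
                index_hash: bytes) -> bytes:
    """Tag = MAC_SK(Hash(M || C) || H), with H the hash of the secure index."""
    inner = hashlib.sha256(plaintext + ciphertext).digest()
    return hmac.new(sk, inner + index_hash, hashlib.sha256).digest()

def verify_result(sk: bytes, decrypted: bytes, ciphertext: bytes,
                  index_hash: bytes, first_tag: bytes) -> bool:
    """Recompute the second tag and compare it with the first in constant time."""
    second_tag = compute_tag(sk, decrypted, ciphertext, index_hash)
    return hmac.compare_digest(second_tag, first_tag)

sk = bytes(range(32))                              # stand-in index key SK
message, cipher = b"original data", b"ciphertext bytes"
h = hashlib.sha256(b"serialized secure index").digest()
first_tag = compute_tag(sk, message, cipher, h)    # calculated by the data owner

ok = verify_result(sk, message, cipher, h, first_tag)
tampered = verify_result(sk, message, b"other ciphertext", h, first_tag)
```

Because the tag binds the plaintext, the ciphertext and the index hash together under SK, a server that returns a mismatched ciphertext or index cannot produce a tag that passes the check.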

In this embodiment, the first master terminal uploads the ciphertext data, secure ciphertext index, first index tag value and flag bit corresponding to each piece of original data to the data server as one group of samples; the second master terminal calculates a secure retrieval trapdoor and a language feature code from the query keyword and uploads them to the data server; the data server selects, from the multiple groups of samples, the target samples whose flag bits match the language feature code, calculates the inner product of the secure ciphertext index of each target sample and the secure retrieval trapdoor, sorts the inner product results that exceed a threshold in descending order, selects the N target samples corresponding to the first N results in the sorted queue, obtains N ciphertext data sets from them, and sends the N ciphertext data sets to the second master terminal, where each ciphertext data set includes the hash value of the secure ciphertext index, the ciphertext data and the first index tag value of the corresponding target sample, and N is a positive integer; the second master terminal decrypts the ciphertext data in each ciphertext data set to obtain the corresponding decrypted data; the second master terminal calculates a second index tag value for each ciphertext data set from the decrypted data, the ciphertext data and the hash value of that set; and the second master terminal compares the second index tag value of each ciphertext data set with the first index tag value and takes the ciphertext data in each set whose second index tag value is consistent with the first index tag value as a correct retrieval result. In this way, hierarchical optimized efficient ciphertext fuzzy retrieval can be completed without constructing a keyword set in advance, which avoids the heavy computational overhead and wasted storage space that pre-building a keyword set entails, and verifying the retrieval results ensures their correctness and reliability.

Further, in an embodiment, step S10 is preceded by:

Step S010: the first master terminal calculates a data encryption key from the security parameter, and encrypts the original data with the data encryption key to obtain the ciphertext data;

In this embodiment, a security parameter K is input to the first master terminal, which calls a key generation algorithm to calculate a data encryption key EK from the security parameter K and encrypts the original data with the data encryption key EK to obtain the ciphertext data.
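The key generation algorithm is not specified further. A minimal sketch, assuming the security parameter K is a key length in bits and that the data encryption key EK and the index key SK used in later steps are drawn independently from a cryptographically secure random source, might look like this (the function name and the interpretation of K are assumptions):

```python
import secrets
from typing import Tuple

def key_gen(security_parameter_bits: int) -> Tuple[bytes, bytes]:
    """Return (EK, SK): two independent random keys of K bits each."""
    if security_parameter_bits <= 0 or security_parameter_bits % 8 != 0:
        raise ValueError("security parameter must be a positive multiple of 8")
    length = security_parameter_bits // 8
    ek = secrets.token_bytes(length)   # data encryption key EK
    sk = secrets.token_bytes(length)   # index key SK
    return ek, sk

ek, sk = key_gen(256)
```

Keeping EK and SK independent means that exposing one key does not reveal the other.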

Step S020: the first master terminal extracts feature information from the original data and calculates the secure ciphertext index from the feature information;

In this embodiment, because Chinese character data and English character data in the original data M are of different types, the original data M first undergoes character conversion. The first master terminal then extracts fingerprint feature information from the original data M and, based on the extracted information, calculates the secure ciphertext index S by using a locality-sensitive hash function and a secure nearest neighbor algorithm.

Step S030: the first master terminal calculates the first index tag value from the original data, the ciphertext data and the secure ciphertext index;

In this embodiment, the first master terminal calls the key generation algorithm to calculate an index key SK from the security parameter K, then calls a one-way hash function and a MAC function and, using the index key SK, calculates the first index tag value Tag from the original data M, the ciphertext data C and the secure ciphertext index S.

Step S040: the first master terminal obtains the flag bit according to the language type of the original data.

In this embodiment, to distinguish English character data from Chinese character data, the last 4-bit unit of the tag field of the ciphertext data is reserved for the flag bit sign. The first 3 bits are assigned randomly; the last bit is set to 1 for English character data and to 0 for Chinese character data.

Further, in an embodiment, the step S020 further includes:

if the original data is English character data $M_{en}$, mapping the English character data to a $k_1$-dimensional vector $v_1$, wherein the position of $v_1$ corresponding to each character is set to 1;

if the original data is Chinese character data $M_{ch}$, applying the Wubi (five-stroke) encoding rule to convert the glyph code into four-character string data, and mapping the four-character string data to a $k_1$-dimensional vector $v_1$, wherein the position of $v_1$ corresponding to each character is set to 1;

selecting m independent p-stable locality-sensitive hash (LSH) functions to construct a Bloom filter vector $V'_M$ of dimension $k_2$, and mapping the vector $v_1$ into the Bloom filter vector $V'_M$;

generating a random number sequence vector R of dimension $k_3$ with a pseudorandom sequence generator, and calculating, from the Bloom filter vector $V'_M$ and the random number sequence vector R, the fingerprint feature vector $V_M$ corresponding to the specific sensitive data string;

processing the fingerprint feature vector $V_M$ with a strong pseudo-random permutation function to obtain a secure fingerprint feature vector, and processing the secure fingerprint feature vector with a secure nearest neighbor algorithm;

introducing two randomly generated invertible matrices $M_1$ and $M_2$ of dimension $k \times k$, performing an encryption operation on the secure fingerprint feature vector processed by the secure nearest neighbor algorithm with $M_1$ and $M_2$, and taking the operation result as the secure ciphertext index corresponding to the original data.

In this embodiment, the fingerprint feature vector is $V_M = V'_M \,\|\, R$, the concatenation of the Bloom filter vector and the random number sequence vector, so its dimension is $k_M = k_2 + k_3$. The random permutation key of the strong pseudo-random permutation function F is defined as $pk \leftarrow \{0,1\}^k$, and F is the mapping $F: \{0,1\}^k \times \{0,1\}^k \rightarrow \{0,1\}^k$. F performs a secure random permutation of the elements of the fingerprint feature vector $V_{string}$ under a permutation key HK, which yields the secure fingerprint feature vector. A binary bit vector of the same dimension $k$ is introduced as the vector key, and the secure nearest neighbor algorithm uses it to split and encrypt the secure fingerprint feature vector into two vectors. Two randomly generated invertible matrices $M_1$ and $M_2$ of dimension $k \times k$ are then introduced, the vector-matrix encryption operation is performed, and the result is taken as the secure ciphertext index S corresponding to the original data M, in whose definition T denotes the matrix transpose.
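To make the construction of the fingerprint feature vector more concrete, here is a rough sketch for English input only. The 36-symbol alphabet (so $k_1 = 36$), the Gaussian form of the p-stable hash $h(v) = \lfloor (a \cdot v + b) / w \rfloor$, the bucket width `w`, the folding of hash values into the Bloom filter by taking them modulo $k_2$, and the use of Python's non-cryptographic `random` module are all assumptions made for illustration. The Wubi conversion for Chinese input and the later permutation and matrix encryption steps are not covered.

```python
import math
import random

ALPHABET = "abcdefghijklmnopqrstuvwxyz0123456789"   # assumed alphabet, k1 = 36

def char_vector(text):
    """Map text to a k1-dimensional 0/1 vector: one position per character seen."""
    v1 = [0] * len(ALPHABET)
    for ch in text.lower():
        pos = ALPHABET.find(ch)
        if pos >= 0:
            v1[pos] = 1
    return v1

def make_lsh(k1, w, rng):
    """One p-stable (Gaussian, p = 2) LSH function h(v) = floor((a.v + b) / w)."""
    a = [rng.gauss(0.0, 1.0) for _ in range(k1)]
    b = rng.uniform(0.0, w)
    return lambda v: math.floor((sum(x * y for x, y in zip(a, v)) + b) / w)

def bloom_vector(v1, lsh_functions, k2):
    """Set one Bloom filter position per LSH function (hash value folded mod k2)."""
    bloom = [0] * k2
    for h in lsh_functions:
        bloom[h(v1) % k2] = 1
    return bloom

def fingerprint(text, lsh_functions, k2, k3, rng):
    """V_M = V'_M || R: Bloom filter vector followed by k3 pseudorandom bits."""
    bloom = bloom_vector(char_vector(text), lsh_functions, k2)
    r = [rng.randint(0, 1) for _ in range(k3)]
    return bloom + r

rng = random.Random(0)
lsh_functions = [make_lsh(len(ALPHABET), 4.0, rng) for _ in range(3)]   # m = 3
v_m = fingerprint("cipher", lsh_functions, k2=16, k3=8, rng=rng)
```

Because locality-sensitive hashes tend to send nearby inputs to the same bucket, keywords that differ by a character or two are likely to set similar Bloom filter positions, which is what makes inner-product ranking tolerant of small spelling differences.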

Further, in an embodiment, the step S030 further includes:

calling a one-way hash function and a MAC function to calculate the first index tag value from the original data, the ciphertext data and the secure ciphertext index.

In this embodiment, the first master terminal calls a one-way hash function and a MAC function and, using the index key SK, calculates the first index tag value Tag from the original data M, the ciphertext data C and the secure ciphertext index S. It is defined as $Tag = MAC_{SK}(Hash(M \| C) \| Hash(S))$.

In a third aspect, the embodiment of the invention further provides a hierarchical optimized efficient ciphertext fuzzy retrieval device.

In an embodiment, referring to fig. 3, fig. 3 is a schematic diagram of the functional modules of a first embodiment of the hierarchical optimized efficient ciphertext fuzzy retrieval apparatus according to the present invention. As shown in fig. 3, the apparatus includes:

the uploading module 10: configured for the first master terminal to upload the ciphertext data, secure ciphertext index, first index tag value and flag bit corresponding to each piece of original data to a data server as one group of samples;

the first calculation module 20: configured for the second master terminal to calculate a secure retrieval trapdoor and a language feature code from the query keyword, and to upload the secure retrieval trapdoor and the language feature code to the data server;

the selecting module 30: configured for the data server to select, from the multiple groups of samples, the target samples whose flag bits match the language feature code, calculate the inner product of the secure ciphertext index of each target sample and the secure retrieval trapdoor to obtain multiple inner product results, sort the inner product results that exceed a threshold in descending order, select the N target samples corresponding to the first N results in the sorted queue, obtain N ciphertext data sets from the N target samples, and send the N ciphertext data sets to the second master terminal, where each ciphertext data set includes the hash value of the secure ciphertext index, the ciphertext data and the first index tag value of the corresponding target sample, and N is a positive integer;

the decryption module 40: configured for the second master terminal to decrypt the ciphertext data in each ciphertext data set to obtain the decrypted data corresponding to each ciphertext data set;

the second calculation module 50: configured for the second master terminal to calculate a second index tag value for each ciphertext data set from the decrypted data, the ciphertext data and the hash value of that ciphertext data set;

the comparison module 60: configured for the second master terminal to compare the second index tag value of each ciphertext data set with the first index tag value, and to take the ciphertext data in each ciphertext data set whose second index tag value is consistent with the first index tag value as a correct retrieval result.

Further, in an embodiment, the first calculating module 20 is further configured to:

converting the query keyword into a fingerprint feature vector $V_W$ of dimension $k$; processing $V_W$ with a strong pseudo-random permutation function to obtain a secure fingerprint feature vector; processing the secure fingerprint feature vector with a secure nearest neighbor algorithm; introducing two randomly generated invertible matrices $M_1$ and $M_2$ of dimension $k \times k$; performing an encryption operation on the processed secure fingerprint feature vector with $M_1$ and $M_2$; and taking the operation result as the secure retrieval trapdoor corresponding to the query keyword;

and obtaining the language feature code according to the language type of the query keyword.

Further, in an embodiment, the hierarchical optimized efficient fuzzy ciphertext retrieval apparatus further includes a data obtaining module, which is specifically configured to:

the first master terminal calculates a data encryption key from the security parameter, and encrypts the original data with the data encryption key to obtain the ciphertext data;

the first master terminal extracts feature information from the original data and calculates the secure ciphertext index from the feature information;

the first master terminal calculates the first index tag value from the original data, the ciphertext data and the secure ciphertext index;

the first master terminal obtains the flag bit according to the language type of the original data.

Further, in an embodiment, the second calculating module 50 is further configured to:

if the original data is English character data $M_{en}$, mapping the English character data to a $k_1$-dimensional vector $v_1$, wherein the position of $v_1$ corresponding to each character is set to 1;

if the original data is Chinese character data $M_{ch}$, applying the Wubi (five-stroke) encoding rule to convert the glyph code into four-character string data, and mapping the four-character string data to a $k_1$-dimensional vector $v_1$, wherein the position of $v_1$ corresponding to each character is set to 1;

selecting m independent p-stable locality-sensitive hash (LSH) functions to construct a Bloom filter vector $V'_M$ of dimension $k_2$, and mapping the vector $v_1$ into the Bloom filter vector $V'_M$;

generating a random number sequence vector R of dimension $k_3$ with a pseudorandom sequence generator, and calculating, from the Bloom filter vector $V'_M$ and the random number sequence vector R, the fingerprint feature vector $V_M$ corresponding to the specific sensitive data string;

processing the fingerprint feature vector $V_M$ with a strong pseudo-random permutation function to obtain a secure fingerprint feature vector, and processing the secure fingerprint feature vector with a secure nearest neighbor algorithm;

introducing two randomly generated invertible matrices $M_1$ and $M_2$ of dimension $k \times k$, performing an encryption operation on the secure fingerprint feature vector processed by the secure nearest neighbor algorithm with $M_1$ and $M_2$, and taking the operation result as the secure ciphertext index corresponding to the original data.

Further, in an embodiment, the second calculating module 50 is further configured to:

calling a one-way hash function and a MAC function to calculate the first index tag value from the original data, the ciphertext data and the secure ciphertext index.

Further, in an embodiment, the second calculating module 50 is further configured to:

defining the secure ciphertext index S and the secure retrieval trapdoor Q, and performing a vector operation on the secure ciphertext index and the secure retrieval trapdoor to obtain the inner product operation result.

The functions of the modules in the hierarchical optimized efficient ciphertext fuzzy retrieval apparatus correspond to the steps of the hierarchical optimized efficient ciphertext fuzzy retrieval method embodiments, so their functions and implementation are not described again here.

In a fourth aspect, the embodiment of the present invention further provides a readable storage medium.

The readable storage medium of the invention stores a hierarchical optimized efficient ciphertext fuzzy retrieval program which, when executed by a processor, implements the steps of the hierarchical optimized efficient ciphertext fuzzy retrieval method described above.

For the method implemented when the hierarchical optimized efficient ciphertext fuzzy retrieval program is executed, reference may be made to the embodiments of the hierarchical optimized efficient ciphertext fuzzy retrieval method, which are not repeated here.

It should be noted that, in this document, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or system that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or system. Without further limitation, an element defined by the phrase "comprising an … …" does not exclude the presence of other like elements in a process, method, article, or system that comprises the element.

The above-mentioned serial numbers of the embodiments of the present invention are merely for description and do not represent the merits of the embodiments.

Through the above description of the embodiments, those skilled in the art will clearly understand that the method of the above embodiments can be implemented by software plus a necessary general hardware platform, and certainly can also be implemented by hardware, but in many cases, the former is a better implementation manner. Based on such understanding, the technical solution of the present invention may be embodied in the form of a software product, which is stored in a storage medium (e.g., ROM/RAM, magnetic disk, optical disk) as described above and includes instructions for causing a terminal device to execute the method according to the embodiments of the present invention.

The above description is only a preferred embodiment of the present invention, and not intended to limit the scope of the present invention, and all modifications of equivalent structures and equivalent processes, which are made by using the contents of the present specification and the accompanying drawings, or directly or indirectly applied to other related technical fields, are included in the scope of the present invention.
