Data desensitization method and device

Document No.: 190910 Publication date: 2021-11-02

Reading notes: this technology, "Data desensitization method and device" (数据脱敏方法以及装置), was designed by 徐力权, 邹欣如, 闫家宝 and 曲诗文 on 2021-08-16. Main content: The application discloses a data desensitization method and device. The method applies to scenarios in which database middleware is accessed in an already-running business system in order to desensitize business data. A ciphertext field and an auxiliary column field are added alongside the original plaintext field of the business data, and the desensitization process is divided into three stages, each handling different logic. In the first stage, the plaintext, ciphertext and auxiliary column values of the business system's incremental data are all written to the database, and data computation operates on the plaintext field. In the second stage, the plaintext, ciphertext and auxiliary column values of the incremental data are all written to the database, and data computation operates on the auxiliary column field. In the third stage, only the ciphertext and auxiliary column values of the incremental data are written to the database, and data computation operates on the auxiliary column field. The business thus completes data desensitization without changing business code; the whole process is transparent to the business and can be carried out while the business runs normally.

1. A data desensitization method, wherein the method is applicable to a scenario in which database middleware is accessed in an already-running business system to desensitize business data, the method comprising:

in response to the business system running in a first stage, receiving first incremental data of the business system running in the first stage;

determining first plaintext data, first ciphertext data and first auxiliary column data corresponding to the first incremental data, and writing the first plaintext data, the first ciphertext data and the first auxiliary column data into a database;

in response to the business system running in a second stage, receiving second incremental data of the business system running in the second stage;

determining second plaintext data, second ciphertext data and second auxiliary column data corresponding to the second incremental data, and writing the second plaintext data, the second ciphertext data and the second auxiliary column data into the database;

in response to the business system running in a third stage, receiving third incremental data of the business system running in the third stage, determining third ciphertext data and third auxiliary column data of the third incremental data, and writing the third ciphertext data and the third auxiliary column data into the database;

wherein the data calculation process in the first stage operates on plaintext fields, and the data calculation processes in the second stage and the third stage both operate on auxiliary column fields.

2. The data desensitization method according to claim 1, further comprising:

desensitizing stock data in the database while the business system runs in the first stage and the second stage.

3. The data desensitization method according to claim 2, wherein said desensitizing the stock data in the database comprises:

calculating ciphertext data and auxiliary column data corresponding to the stock data according to plaintext data of the stock data in the database;

judging whether the calculated ciphertext data is consistent with the ciphertext data stored in the database for the stock data, and judging whether the calculated auxiliary column data is consistent with the auxiliary column data stored in the database for the stock data;

in response to the calculated ciphertext data being consistent with the stored ciphertext data and the calculated auxiliary column data being consistent with the stored auxiliary column data, deleting the plaintext data of the stock data in the database.

4. The data desensitization method according to claim 3, wherein before calculating the ciphertext data and the auxiliary column data corresponding to the stock data according to the plaintext data of the stock data in the database, the method further comprises:

judging whether the auxiliary column data of the stock data corresponds to the plaintext data of the stock data;

in response to the auxiliary column data of the stock data not corresponding to the plaintext data of the stock data, acquiring the plaintext data of the stock data;

calculating the latest ciphertext data of the stock data according to the plaintext data of the stock data;

updating the original ciphertext data of the stock data to the latest ciphertext data.

5. The data desensitization method according to claim 3 or 4, wherein before calculating the ciphertext data and the auxiliary column data corresponding to the stock data according to the plaintext data of the stock data in the database, the method further comprises:

performing data cleaning on the stock data in the database.

6. A data desensitization apparatus, wherein the apparatus is applicable to a scenario in which database middleware is accessed in an already-running business system to desensitize business data, the apparatus comprising:

a transceiver module, configured to receive, in response to the business system running in a first stage, first incremental data of the business system running in the first stage;

a data processing module, configured to determine first plaintext data, first ciphertext data and first auxiliary column data corresponding to the first incremental data, and write the first plaintext data, the first ciphertext data and the first auxiliary column data into a database;

the transceiver module is further configured to receive, in response to the business system running in a second stage, second incremental data of the business system running in the second stage;

the data processing module is further configured to determine second plaintext data, second ciphertext data and second auxiliary column data corresponding to the second incremental data, and write the second plaintext data, the second ciphertext data and the second auxiliary column data into the database;

the transceiver module is further configured to receive, in response to the business system running in a third stage, third incremental data of the business system running in the third stage;

the data processing module is further configured to determine third ciphertext data and third auxiliary column data of the third incremental data, and write the third ciphertext data and the third auxiliary column data into the database;

wherein the data calculation process in the first stage operates on plaintext fields, and the data calculation processes in the second stage and the third stage both operate on auxiliary column fields.

7. The data desensitization apparatus according to claim 6, wherein the data processing module is further configured to:

desensitize stock data in the database while the business system runs in the first stage and the second stage.

8. The data desensitization apparatus according to claim 7, wherein the data processing module is specifically configured to:

calculate ciphertext data and auxiliary column data corresponding to the stock data according to plaintext data of the stock data in the database;

judge whether the calculated ciphertext data is consistent with the ciphertext data stored in the database for the stock data, and judge whether the calculated auxiliary column data is consistent with the auxiliary column data stored in the database for the stock data;

in response to the calculated ciphertext data being consistent with the stored ciphertext data and the calculated auxiliary column data being consistent with the stored auxiliary column data, delete the plaintext data of the stock data in the database.

9. The data desensitization apparatus according to claim 8, wherein the data processing module is further configured to:

before calculating the ciphertext data and the auxiliary column data corresponding to the stock data according to the plaintext data of the stock data in the database, judge whether the auxiliary column data of the stock data corresponds to the plaintext data of the stock data;

in response to the auxiliary column data of the stock data not corresponding to the plaintext data of the stock data, acquire the plaintext data of the stock data;

calculate the latest ciphertext data of the stock data according to the plaintext data of the stock data;

update the original ciphertext data of the stock data to the latest ciphertext data.

10. The data desensitization apparatus according to claim 8 or 9, wherein the data processing module is further configured to:

perform data cleaning on the stock data in the database before calculating the ciphertext data and the auxiliary column data corresponding to the stock data according to the plaintext data of the stock data in the database.

11. A computer device, comprising:

at least one processor; and

a memory communicatively coupled to the at least one processor; wherein

the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the data desensitization method according to any one of claims 1 to 5.

12. A computer-readable storage medium storing computer instructions, wherein the computer instructions are configured to cause a computer to perform the data desensitization method according to any one of claims 1 to 5.

Technical Field

The present application relates to the field of data processing, and in particular, to a data desensitization method and apparatus, a computer device, and a storage medium.

Background

At present, data desensitization is generally implemented by relying on an encryption algorithm, so after a business system integrates the encryption algorithm, its business code must be modified according to the algorithm's SDK (Software Development Kit), which is invasive to the business. Data desensitization is divided into static data desensitization and dynamic data desensitization. Common technical schemes can desensitize only static data or only dynamic data, and the related art offers no means of desensitizing both the historical data and the newly added data of a business system that is already running.

Disclosure of Invention

The application provides a data desensitization method and apparatus, a computer device, and a storage medium, which enable a business to complete data desensitization without changing business code. The whole process is transparent to the business, and desensitization can be carried out while the business runs normally.

According to a first aspect of the present application, there is provided a data desensitization method, which is applicable to a scenario in which a database middleware is accessed in an already-running business system to implement service data desensitization, and the method includes:

in response to the business system running in a first stage, receiving first incremental data of the business system running in the first stage;

determining first plaintext data, first ciphertext data and first auxiliary column data corresponding to the first incremental data, and writing the first plaintext data, the first ciphertext data and the first auxiliary column data into a database;

in response to the business system running in a second stage, receiving second incremental data of the business system running in the second stage;

determining second plaintext data, second ciphertext data and second auxiliary column data corresponding to the second incremental data, and writing the second plaintext data, the second ciphertext data and the second auxiliary column data into the database;

in response to the business system running in a third stage, receiving third incremental data of the business system running in the third stage, determining third ciphertext data and third auxiliary column data of the third incremental data, and writing the third ciphertext data and the third auxiliary column data into the database;

wherein the data calculation process in the first stage operates according to plaintext fields; the data calculation process in the second stage and the third stage both operate according to auxiliary column fields.

According to a second aspect of the present application, there is provided a data desensitization apparatus, which is applicable to a scenario in which database middleware is accessed in an already-running business system to desensitize business data, the apparatus including:

a transceiver module, configured to receive, in response to the business system running in a first stage, first incremental data of the business system running in the first stage;

a data processing module, configured to determine first plaintext data, first ciphertext data and first auxiliary column data corresponding to the first incremental data, and write the first plaintext data, the first ciphertext data and the first auxiliary column data into a database;

the transceiver module is further configured to receive, in response to the business system running in a second stage, second incremental data of the business system running in the second stage;

the data processing module is further configured to determine second plaintext data, second ciphertext data and second auxiliary column data corresponding to the second incremental data, and write the second plaintext data, the second ciphertext data and the second auxiliary column data into the database;

the transceiver module is further configured to receive, in response to the business system running in a third stage, third incremental data of the business system running in the third stage;

the data processing module is further configured to determine third ciphertext data and third auxiliary column data of the third incremental data, and write the third ciphertext data and the third auxiliary column data into the database;

wherein the data calculation process in the first stage operates on plaintext fields, and the data calculation processes in the second stage and the third stage both operate on auxiliary column fields.

According to a third aspect of the present application, there is provided a computer device comprising:

at least one processor; and

a memory communicatively coupled to the at least one processor; wherein

the memory stores instructions executable by the at least one processor to cause the at least one processor to perform the method of desensitizing data of the first aspect.

According to a fourth aspect of the present application, there is provided a non-transitory computer-readable storage medium having stored thereon computer instructions for causing a computer to perform the data desensitization method of the first aspect.

According to the technical scheme of the application, a ciphertext field and an auxiliary column field are added alongside the original plaintext field of the business data, and the data desensitization process is divided into three stages, each handling different logic. In the first stage, the plaintext, ciphertext and auxiliary column values of the business system's incremental data are all written to the database, and data computation operates on the plaintext field. In the second stage, the plaintext, ciphertext and auxiliary column values of the incremental data are all written to the database, and data computation operates on the auxiliary column field. In the third stage, only the ciphertext and auxiliary column values of the incremental data are written to the database, and data computation operates on the auxiliary column field. The business thus completes data desensitization without changing business code; the whole process is transparent to the business and can be carried out while the business runs normally.

It should be understood that the statements in this section do not necessarily identify key or critical features of the embodiments of the present application, nor do they limit the scope of the present application. Other features of the present application will become apparent from the following description.

Drawings

The drawings are included to provide a better understanding of the present solution and are not intended to limit the present application. Wherein:

FIG. 1 is a flowchart of a data desensitization method according to an embodiment of the present application;

FIG. 2 is a diagram of the internal operating architecture of database middleware provided in an embodiment of the present application;

FIG. 3 is an exemplary diagram of a business data desensitization process provided in an embodiment of the present application;

FIG. 4 is a flowchart of a stock data desensitization process provided by an embodiment of the present application;

FIG. 5 is a block diagram of a data desensitization apparatus according to an embodiment of the present application;

FIG. 6 is a schematic block diagram of an example computer device that can be used to implement embodiments of the present application.

Detailed Description

The following description of the exemplary embodiments of the present application, taken in conjunction with the accompanying drawings, includes various details of the embodiments of the application for the understanding of the same, which are to be considered exemplary only. Accordingly, those of ordinary skill in the art will recognize that various changes and modifications of the embodiments described herein can be made without departing from the scope and spirit of the present application. Also, descriptions of well-known functions and constructions are omitted in the following description for clarity and conciseness.

First, terms referred to in the present application will be described.

1) Data desensitization

Data desensitization refers to transforming certain sensitive information according to desensitization rules so that sensitive private data is reliably protected. Where customer security data or commercially sensitive data is involved, the real data is modified, without violating system rules, before being provided for testing; personal information such as ID numbers, mobile phone numbers, card numbers and customer numbers all requires desensitization. Data desensitization is divided into static data desensitization and dynamic data desensitization. The main difference between the two is whether desensitization is performed at the time the sensitive data is used. Static data desensitization is generally used in non-production environments: sensitive data is taken from the production environment, desensitized, and then used in the non-production environment. It typically addresses the problem that sensitive data may not be stored in non-production environments, even though test and development databases need the data volume and data correlations of the production database to troubleshoot problems or perform data analysis. Dynamic data desensitization is generally used in production environments: desensitization is performed when the sensitive data is accessed. It typically addresses the need to desensitize the same sensitive data to different levels depending on the circumstances under which it is read.

2) Data encryption technology

Encryption is a technique for restricting access to data transmitted over a network. The encoded data produced by encrypting the original data (also called plaintext) with an encryption device (hardware or software) and a key is called ciphertext. Restoring the ciphertext to the original plaintext is called decryption; it is the inverse of encryption, and the decrypting party must use the same type of encryption device and key.

Encryption can be roughly divided into four types:

1. Encryption in which decryption is not considered at all.

2. Private key encryption, such as symmetric encryption, which uses the same key for encryption and decryption. This scheme is generally difficult to apply in practice because the key is hard to share securely. Examples: RC4 (Rivest Cipher 4, a stream cipher), RC2 (Rivest Cipher 2, a symmetric block cipher), DES (Data Encryption Standard, a block cipher), and the AES (Advanced Encryption Standard) family of algorithms.

3. Public key encryption, such as asymmetric key encryption, which uses a public/private key pair: one key for encryption and the other for decryption. The public key can be widely shared and disclosed. This approach is more convenient when encrypted data must be transmitted outside the server. Example: RSA.

4. Digital certificates. A digital certificate is a form of asymmetric key encryption in which an organization uses a certificate, through a digital signature, to associate a public/private key pair with its owner.

In the related art, data desensitization is generally implemented by relying on an encryption algorithm, so once a business system integrates the encryption algorithm, its business code must be modified according to the algorithm's SDK (Software Development Kit), which is invasive to the business. Data desensitization is divided into static data desensitization and dynamic data desensitization, and common technical schemes can desensitize only static data or only dynamic data. When an already-running business system needs both its historical data and its newly added data desensitized, and the desensitization must be completed without the business noticing, there is currently no good solution. In addition, once data has been desensitized with an encryption algorithm in the conventional way, it can no longer be rolled back, or a rollback affects the business; common schemes thus either lack rollback capability or are invasive to the business.

In view of the above, the present application provides a data desensitization method, apparatus, computer device and storage medium. Specifically, a data desensitization method, an apparatus, a computer device, and a storage medium of the embodiments of the present application are described below with reference to the drawings.

It should be noted that the data desensitization method according to the embodiment of the present application is applicable to a scenario in which a database middleware is accessed in an already-running service system to implement service data desensitization. As shown in fig. 1, the data desensitization method may include the following steps.

Step 101, responding to the business system operating in the first stage, receiving first incremental data of the business system operating in the first stage.

Step 102, determining first plaintext data, first ciphertext data and first auxiliary column data corresponding to the first incremental data, and writing the first plaintext data, the first ciphertext data and the first auxiliary column data into a database.

Optionally, when first incremental data of the service system operating in the first stage is received, it may be determined whether the type of the first incremental data is sensitive data, and if not, the first plaintext data of the first incremental data is directly written into the database. If the type of the first incremental data is sensitive data, then desensitization of the first incremental data is required. First plaintext data of the first incremental data may be determined first, and corresponding first ciphertext data and first auxiliary column data may be determined according to the first plaintext data.

In one implementation, a first encryption algorithm may be used to encrypt the plaintext data of the first incremental data (i.e., the first plaintext data) to obtain the first ciphertext data, and a second encryption algorithm may be used to encrypt the same plaintext data to obtain the first auxiliary column data. The first encryption algorithm is different from the second encryption algorithm. That is, different encryption algorithms are applied to the incremental data to obtain its ciphertext data and its auxiliary column data, respectively.

In the embodiment of the present application, the first encryption algorithm may be any one of a symmetric encryption algorithm, an asymmetric key encryption algorithm, or a digital certificate. The second encryption algorithm may be a hash algorithm, i.e. the first auxiliary column data may be a hash value.

It should be noted that, in the embodiments of the present application, the ciphertext data of the incremental data may change when the private key of the encryption algorithm changes, whereas the auxiliary column data remains unchanged and cannot be used to recover the corresponding plaintext data.
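The relationship between the plaintext, the ciphertext and the auxiliary column can be illustrated with a minimal Python sketch. It assumes SHA-256 as the hash used for the auxiliary column, and it uses a toy XOR keystream as a stand-in for the first encryption algorithm; that stand-in is for illustration only and is not secure. All function names (`aux_value`, `encrypt`, `decrypt`, `desensitize`) and key values are hypothetical and do not come from the application.

```python
import hashlib
import hmac


def aux_value(plaintext: str) -> str:
    """Auxiliary column value: a one-way SHA-256 digest of the plaintext."""
    return hashlib.sha256(plaintext.encode("utf-8")).hexdigest()


def _keystream(key: bytes, length: int) -> bytes:
    # Toy keystream built from HMAC-SHA256 over a counter.
    # Stand-in for a real cipher; illustration only, NOT secure.
    out = b""
    counter = 0
    while len(out) < length:
        out += hmac.new(key, counter.to_bytes(4, "big"), hashlib.sha256).digest()
        counter += 1
    return out[:length]


def encrypt(plaintext: str, key: bytes) -> str:
    """Reversible stand-in for the first encryption algorithm (hex output)."""
    data = plaintext.encode("utf-8")
    stream = _keystream(key, len(data))
    return bytes(a ^ b for a, b in zip(data, stream)).hex()


def decrypt(ciphertext_hex: str, key: bytes) -> str:
    """Inverse of encrypt(): XOR with the same keystream."""
    data = bytes.fromhex(ciphertext_hex)
    stream = _keystream(key, len(data))
    return bytes(a ^ b for a, b in zip(data, stream)).decode("utf-8")


def desensitize(plaintext: str, key: bytes) -> dict:
    """Derive the three column values for one sensitive field."""
    return {
        "plain": plaintext,
        "cipher": encrypt(plaintext, key),
        "assist": aux_value(plaintext),
    }


# The same plaintext under two different keys (hypothetical key rotation).
row_v1 = desensitize("13800000000", b"key-v1")
row_v2 = desensitize("13800000000", b"key-v2")
```

In this sketch the ciphertext depends on the key while the auxiliary value depends only on the plaintext, which is why equality comparisons on the auxiliary column keep working after a key change.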

Step 103, in response to the business system running in the second stage, receiving second incremental data of the business system running in the second stage.

Step 104, determining second plaintext data, second ciphertext data and second auxiliary column data corresponding to the second incremental data, and writing the second plaintext data, the second ciphertext data and the second auxiliary column data into the database.

Optionally, when second incremental data of the service system operating in the second stage is received, it may be determined whether the type of the second incremental data is sensitive data, and if not, the second plaintext data of the second incremental data is directly written into the database. If the type of the second incremental data is sensitive data, then desensitization of the second incremental data is required. Second plaintext data of the second incremental data may be determined first, and corresponding second ciphertext data and second auxiliary column data may be determined according to the second plaintext data.

In one implementation, the first encryption algorithm may be used to encrypt the plaintext data of the second incremental data (i.e., the second plaintext data) to obtain the second ciphertext data, and the second encryption algorithm may be used to encrypt the same plaintext data to obtain the second auxiliary column data. The first encryption algorithm is different from the second encryption algorithm. That is, different encryption algorithms are applied to the incremental data to obtain its ciphertext data and its auxiliary column data, respectively.

In the embodiment of the present application, the first encryption algorithm may be any one of a symmetric encryption algorithm, an asymmetric key encryption algorithm, or a digital certificate. The second encryption algorithm may be a hash algorithm, i.e. the second auxiliary column data may be a hash value.

It should be noted that, in the embodiments of the present application, the ciphertext data of the incremental data may change when the private key of the encryption algorithm changes, whereas the auxiliary column data remains unchanged and cannot be used to recover the corresponding plaintext data.

Step 105, in response to the business system running in the third stage, receiving third incremental data of the business system running in the third stage, determining third ciphertext data and third auxiliary column data of the third incremental data, and writing the third ciphertext data and the third auxiliary column data into the database.

Optionally, when third incremental data of the service system operating in the third stage is received, it may be determined whether the type of the third incremental data is sensitive data, and if not, the third plaintext data of the third incremental data is directly written into the database. If the type of the third incremental data is sensitive data, then desensitization of the third incremental data is required. The third plaintext data of the third incremental data may be determined first, and the corresponding third ciphertext data and third auxiliary column data may be determined according to the third plaintext data.

In one implementation, the first encryption algorithm may be used to encrypt the plaintext data of the third incremental data (i.e., the third plaintext data) to obtain the third ciphertext data, and the second encryption algorithm may be used to encrypt the same plaintext data to obtain the third auxiliary column data. The first encryption algorithm is different from the second encryption algorithm. That is, different encryption algorithms are applied to the incremental data to obtain its ciphertext data and its auxiliary column data, respectively.

In the embodiment of the present application, the first encryption algorithm may be any one of a symmetric encryption algorithm, an asymmetric key encryption algorithm, or a digital certificate. The second encryption algorithm may be a hash algorithm, i.e. the third auxiliary column data may be a hash value.

It should be noted that, in the embodiments of the present application, the ciphertext data of the incremental data may change when the private key of the encryption algorithm changes, whereas the auxiliary column data remains unchanged and cannot be used to recover the corresponding plaintext data.

In the embodiment of the application, the data calculation process in the first stage is operated according to a plaintext field; the data calculation process in the second and third stages operates according to the auxiliary column fields.

It should be noted that in the embodiments of the present application, data desensitization calls an interface of an encryption/decryption system to encrypt and decrypt data, and the whole data desensitization method is implemented at the database middleware layer. FIG. 2 shows the internal operating architecture of the database middleware. As shown in FIG. 2, before data from the application layer lands in the database, it goes through the steps "parsing => routing => rewriting => importing => merging => returning".

It is noted that the present application does not encrypt the data directly, for the following reasons. If data were encrypted directly, then after the business system accesses the middleware the incremental data would be written as ciphertext while the data already in the database would remain plaintext. The business system cannot work with a database that holds both plaintext and ciphertext, and if encryption or decryption fails or produces errors the data cannot be recovered, so the risk is high. In addition, if the encrypted table is partitioned, data routing is required; once plaintext becomes ciphertext, the routing key value changes, the data route changes with it, and the business can no longer find previously stored data.

To address these problems, the method adds two fields, a ciphertext field and an auxiliary column field, alongside the original plaintext field, and divides the data desensitization process into three stages, each handling different logic. The three stages are described in detail below. Let the desensitized column be represented by plaintext A, ciphertext B, and auxiliary column C; the operations of each stage are listed in the following table:

TABLE 1

Stage          Plaintext A                     Ciphertext B    Auxiliary column C
First stage    Stored (used for computation)   Stored          Stored
Second stage   Stored                          Stored          Stored (used for computation)
Third stage    Not stored                      Stored          Stored (used for computation)

First stage: when the business system has just accessed the database middleware, the business still operates on plaintext and computation uses the plaintext, but the ciphertext and the auxiliary column are written at the same time. Even if the access runs into problems, the plaintext is still present, so the business system is unaffected and can be rolled back at any time.

Second stage: reaching this stage indicates that access to the database middleware and the encryption and decryption processes of the encryption/decryption system work without problems. Because the plaintext will eventually be deleted, the computation logic can now be switched away from the plaintext: data is written exactly as in the first stage, but computation operates on the auxiliary column data. If a problem occurs at this point, the system can be switched back to the first stage, since plaintext and ciphertext are both still in place, so there is no risk.

Third stage: reaching this stage indicates that access to the database middleware, the encryption and decryption processes, and operation on the auxiliary column all work without problems. The plaintext is therefore no longer needed and is not written into the database in the third stage.

According to the above design, the handling of CRUD operations at each stage can be enumerated as follows. (Note: xx represents plaintext data, yy represents ciphertext data, zz represents auxiliary column data, and ww represents the value of another field, i.e., non-sensitive data.)
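The stage rules of Table 1 can be captured in a small lookup structure. The following Python sketch is illustrative only: the names `Stage`, `StagePolicy`, `POLICIES`, and `can_roll_back` are hypothetical, and the rollback rule (rollback to plaintext operation is possible while A is still written) is an assumption drawn from the stage descriptions rather than a rule stated in this form.

```python
from dataclasses import dataclass
from enum import Enum


class Stage(Enum):
    FIRST = 1
    SECOND = 2
    THIRD = 3


@dataclass(frozen=True)
class StagePolicy:
    write_columns: tuple  # columns written for incremental data
    compute_column: str   # column that query conditions operate on


# Column names follow Table 1: A = plaintext, B = ciphertext, C = auxiliary column.
POLICIES = {
    Stage.FIRST: StagePolicy(("A", "B", "C"), "A"),
    Stage.SECOND: StagePolicy(("A", "B", "C"), "C"),
    Stage.THIRD: StagePolicy(("B", "C"), "C"),
}


def can_roll_back(stage: Stage) -> bool:
    """Assumed rule: rollback is possible while plaintext A is still written."""
    return "A" in POLICIES[stage].write_columns


for stage in Stage:
    print(stage.name, POLICIES[stage], can_roll_back(stage))
```

Keeping the policy in one table means a stage switch is a configuration change in the middleware rather than a change to business code.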

1) insert into, replace into

Before desensitization: insert into table (A) values (xx);

First stage: insert into table (A, B, C) values (xx, yy, zz);

Second stage: insert into table (A, B, C) values (xx, yy, zz);

Third stage: insert into table (B, C) values (yy, zz);
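The insert rewriting above can be sketched as a small Python function. This is a minimal illustration under stated assumptions: `derive` and `rewrite_insert` are hypothetical names, the "enc:" plus reversed-text ciphertext is a placeholder for a call to the encryption/decryption system, and SHA-256 is assumed for the auxiliary column.

```python
import hashlib


def derive(plaintext: str) -> tuple:
    """Return (ciphertext, auxiliary) for a plaintext value.

    The 'enc:' + reversed-text ciphertext is a placeholder only; a real system
    would call the encryption/decryption system here.
    """
    ciphertext = "enc:" + plaintext[::-1]
    auxiliary = hashlib.sha256(plaintext.encode("utf-8")).hexdigest()
    return ciphertext, auxiliary


def rewrite_insert(table: str, plaintext: str, stage: int):
    """Rewrite `insert into table (A) values (xx)` for the given stage.

    Returns a parameterized SQL string and its parameter tuple.
    """
    yy, zz = derive(plaintext)
    if stage in (1, 2):
        return f"insert into {table} (A, B, C) values (?, ?, ?)", (plaintext, yy, zz)
    if stage == 3:
        return f"insert into {table} (B, C) values (?, ?)", (yy, zz)
    raise ValueError(f"unknown stage: {stage}")


sql, params = rewrite_insert("t_user", "xx", 3)
print(sql)
print(params)
```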

2) Select where A = xx

Before desensitization: select A, other from table where A = xx;

First stage: select A, other from table where A = xx;

Second stage: select A, B, C, other from table where C = zz;

Third stage: select B, C, other from table where C = zz;

In the data merging phase, the values of A, B, and C are converted by the CDS into xx, which is returned to the business layer.

3) Select A where other = ww

Before desensitization: select A, other from table where other = ww;

First stage: select A, other from table where other = ww;

Second stage: select A, B, C, other from table where other = ww;

Third stage: select B, C, other from table where other = ww;

In the data merging phase, it is determined whether C exists; if so, A is calculated from B and C and returned, and if not, A is returned directly.
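The two select cases, together with the merging rule, can be illustrated as follows. This Python sketch rests on assumptions: `rewrite_select_by_a` and `merge_row` are hypothetical names, SHA-256 is assumed for the auxiliary column, and the merge step uses C only as a presence check and recovers A by decrypting B with a caller-supplied function.

```python
import hashlib


def aux(plaintext: str) -> str:
    """Auxiliary column value (SHA-256 assumed)."""
    return hashlib.sha256(plaintext.encode("utf-8")).hexdigest()


def rewrite_select_by_a(table: str, plaintext: str, stage: int):
    """Rewrite `select A, other from table where A = xx` for the given stage."""
    if stage == 1:
        return f"select A, other from {table} where A = ?", (plaintext,)
    if stage == 2:
        cols = "A, B, C, other"
    elif stage == 3:
        cols = "B, C, other"
    else:
        raise ValueError(f"unknown stage: {stage}")
    return f"select {cols} from {table} where C = ?", (aux(plaintext),)


def merge_row(row: dict, decrypt) -> dict:
    """Merge step: if C is present, recover A from B; otherwise return A as stored."""
    if row.get("C"):
        a_value = decrypt(row["B"])
    else:
        a_value = row["A"]
    return {"A": a_value, "other": row["other"]}


def toy_decrypt(ciphertext: str) -> str:
    """Inverse of a toy 'enc:' + reversed-text cipher (illustration only)."""
    return ciphertext[len("enc:"):][::-1]


print(rewrite_select_by_a("t_user", "abc", 2))
print(merge_row({"B": "enc:cba", "C": aux("abc"), "other": 1}, toy_decrypt))
```

Whichever columns are actually read, the business layer always receives a row shaped as if only the plaintext column A existed.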

4) Update table set A = xx where A = xx

First stage: update table set A = xx where A = xx;

Second stage: update table set A = xx, B = yy, C = zz where C = zz;

Third stage: update table set B = yy, C = zz where C = zz;

5) Update table set A = xx where other = ww

First stage: update table set A = xx where other = ww;

Second stage: update table set A = xx, B = yy, C = zz where other = ww;

Third stage: update table set B = yy, C = zz where other = ww;

6) Delete from table where other = ww: executed directly without rewriting.

7) Delete from table where A = xx

First stage: delete from table where A = xx;

Second stage: delete from table where C = zz;

Third stage: delete from table where C = zz.
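The two delete cases (6 and 7) reduce to a rule on the where condition: a condition on a non-sensitive column passes through unchanged, while a condition on A is replaced by a condition on C from the second stage onward. A minimal Python sketch, with the hypothetical function name `rewrite_delete` and SHA-256 assumed for the auxiliary column:

```python
import hashlib


def rewrite_delete(table: str, stage: int, plain=None, other=None):
    """Rewrite a delete statement for the given stage.

    `plain` is a condition value on the sensitive column A; `other` is a
    condition value on a non-sensitive column and is passed through unchanged.
    """
    if (plain is None) == (other is None):
        raise ValueError("give exactly one of plain / other")
    if stage not in (1, 2, 3):
        raise ValueError(f"unknown stage: {stage}")
    if other is not None:
        return f"delete from {table} where other = ?", (other,)
    if stage == 1:
        return f"delete from {table} where A = ?", (plain,)
    zz = hashlib.sha256(plain.encode("utf-8")).hexdigest()
    return f"delete from {table} where C = ?", (zz,)


print(rewrite_delete("t_user", 1, plain="xx"))
print(rewrite_delete("t_user", 3, plain="xx"))
print(rewrite_delete("t_user", 3, other="ww"))
```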

Note that DDL (Data Definition Language) is not within the desensitization consideration.

It should be noted that, in the embodiment of the present application, when the business system operates in the second and third stages, computation is performed using the auxiliary column data. The reason is as follows. For security, the key used for data encryption may change, so the ciphertext of the same plaintext may differ after a certain time. The ciphertext therefore cannot be used directly for computation, because the routing logic would change whenever the ciphertext changes, and an invariant value is needed to assist computation. This is why computation uses the auxiliary column rather than the ciphertext. Moreover, because the auxiliary column is calculated with a hash algorithm, the plaintext cannot be recovered from it even if the auxiliary column value is obtained.

The foregoing is the theoretical basis of desensitization as implemented inside the database middleware; the following describes how these conversions are performed at the kernel level of the database middleware. As shown in fig. 3, under this design no encryption is needed when data is routed, and plaintext is used for routing directly. When interacting with the underlying database, the SQL must be rewritten accordingly. Finally, when the business obtains the final ResultSet, the corresponding data must be converted so that the end user still receives plaintext.

The above describes how the database middleware desensitizes incremental data. At this point the business system has accessed the database middleware and plaintext and ciphertext operate compatibly: after access, incremental data is stored as both plaintext and ciphertext, which remains compatible with the previous business system, and the incremental data of the first and second stages becomes stock data (i.e., historical data) once it is written into the database. The stock data in the database is then cleaned to desensitize the historical data.

In some embodiments of the present application, desensitization processing may be performed on inventory data in the database while the business system is operating in the first phase and the second phase. In one implementation, as shown in FIG. 4, an implementation of desensitization processing on inventory data in a database may include the following steps.

Step 401, calculating cipher text data and auxiliary column data corresponding to the stock data according to the plaintext data of the stock data in the database.

Optionally, the ciphertext data of the stock data is calculated by using a first encryption algorithm, and the auxiliary column data of the stock data is calculated by using a second encryption algorithm. Wherein the first encryption algorithm is different from the second encryption algorithm. As an example, the first encryption algorithm may be any one of a symmetric encryption algorithm, an asymmetric key encryption algorithm, or a digital certificate. The second encryption algorithm may be a hash algorithm, i.e. the auxiliary column data may be a hash value.

Step 402, determining whether the calculated ciphertext data is consistent with the ciphertext data stored in the database and corresponding to the stock data, and determining whether the calculated auxiliary column data is consistent with the auxiliary column data stored in the database and corresponding to the stock data.

Step 403, deleting the plaintext data of the stock data in the database in response to the calculated ciphertext data being consistent with the ciphertext data stored in the database for the stock data and the calculated auxiliary column data being consistent with the auxiliary column data stored in the database for the stock data.

Optionally, in some embodiments of the present application, before calculating the ciphertext data and auxiliary column data corresponding to the stock data according to the plaintext data of the stock data in the database, it may be determined whether the auxiliary column data of the stock data corresponds to the plaintext data of the stock data; in response to the auxiliary column data not corresponding to the plaintext data, the plaintext data of the stock data is obtained, the latest ciphertext data of the stock data is calculated from that plaintext data, and the original ciphertext data of the stock data is updated to the latest ciphertext data.
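Steps 401 to 403 can be sketched on in-memory rows as follows. The Python code is illustrative only and makes several assumptions: `encrypt` is a deterministic placeholder cipher (so that a recomputed ciphertext can be compared with the stored one under the same key), SHA-256 stands in for the hash algorithm, and deleting the plaintext is modeled as setting A to an empty string. All function names are hypothetical.

```python
import hashlib


def encrypt(plaintext: str) -> str:
    """Deterministic placeholder cipher; a real system would call the encryption/decryption system."""
    return "enc:" + plaintext[::-1]


def auxiliary(plaintext: str) -> str:
    """Auxiliary column value (SHA-256 assumed)."""
    return hashlib.sha256(plaintext.encode("utf-8")).hexdigest()


def desensitize_stock(rows: list) -> int:
    """Apply steps 401-403 to rows shaped like {'A': ..., 'B': ..., 'C': ...}.

    Returns the number of rows whose plaintext was cleared.
    """
    cleared = 0
    for row in rows:
        if not row["A"]:
            continue  # plaintext already removed
        b, c = encrypt(row["A"]), auxiliary(row["A"])  # step 401: recompute
        if b == row["B"] and c == row["C"]:            # step 402: compare with stored values
            row["A"] = ""                              # step 403: delete plaintext
            cleared += 1
    return cleared


rows = [
    {"A": "abc", "B": encrypt("abc"), "C": auxiliary("abc")},
    {"A": "xyz", "B": "stale", "C": auxiliary("xyz")},  # inconsistent ciphertext: kept
]
print(desensitize_stock(rows))
print([row["A"] for row in rows])
```

Rows that fail the comparison keep their plaintext, so they can still be corrected before a later pass.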

Optionally, in some embodiments of the present application, before calculating the ciphertext data and the auxiliary column data corresponding to the inventory data according to the plaintext data of the inventory data in the database, the data cleaning process is performed on the inventory data in the database.

It should be noted that the desensitization of the stock data in the embodiment of the present application may include a data correction step, a data cleaning step, a data checking step, and a plaintext deletion step.

Data correction step

When the database middleware is being upgraded, the business system may go through a period in which new and old database middleware coexist: the old application updates the plaintext while the new application updates the ciphertext, which makes the plaintext and ciphertext inconsistent. Therefore, after the applications are completely upgraded, data correction can first be performed on the encrypted data. Data correction means re-encrypting the existing ciphertext from the plaintext, taking the plaintext as authoritative. For example, the auxiliary column may be checked first by comparing it with the plaintext; if the correspondence is incorrect, i.e., the auxiliary column was not calculated from the plaintext, the record is updated according to the plaintext. In theory the amount of such data is relatively small, so the correction process is generally fast.
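The correction rule can be illustrated with a short Python sketch. It assumes SHA-256 for the auxiliary column and a placeholder cipher, uses hypothetical function names, and refreshes both B and C from the plaintext when a mismatch is found; refreshing C as well as B is an assumption made here so that the row ends up fully consistent.

```python
import hashlib


def encrypt(plaintext: str) -> str:
    """Placeholder cipher; a real system would call the encryption/decryption system."""
    return "enc:" + plaintext[::-1]


def auxiliary(plaintext: str) -> str:
    """Auxiliary column value (SHA-256 assumed)."""
    return hashlib.sha256(plaintext.encode("utf-8")).hexdigest()


def correct_rows(rows: list) -> int:
    """Re-derive B and C from A wherever C does not match the current plaintext.

    The plaintext is treated as authoritative. Returns the number of rows corrected.
    """
    corrected = 0
    for row in rows:
        if row["C"] != auxiliary(row["A"]):
            row["B"] = encrypt(row["A"])
            row["C"] = auxiliary(row["A"])
            corrected += 1
    return corrected


rows = [
    {"A": "new", "B": encrypt("old"), "C": auxiliary("old")},  # plaintext changed by old middleware
    {"A": "ok", "B": encrypt("ok"), "C": auxiliary("ok")},
]
print(correct_rows(rows))
```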

Data cleaning step

Data cleaning reuses the logic of incremental desensitization: the existing data is read out and then updated. Because the ciphertext is updated whenever the plaintext is updated in the first stage, this both desensitizes the data and verifies that updates are performed correctly after the business accesses the database middleware.

Of course, the update in data cleaning does not simply select the data and then update the records one by one; it is performed under a database lock, i.e., "select A from table for update; update table set A = A where A = A;". The reason is that data may otherwise become inconsistent in a special case: once the business has accessed the database middleware and desensitization has run for some time, the business system may be updating a record at the very moment the cleaning process updates it. To avoid this, data must be locked while it is updated in the data cleaning phase. Because locking affects performance, data cleaning should run during off-peak business periods whenever possible, and the amount of data cleaned in each run should not be too large.
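A batch-wise, locked cleaning pass can be sketched with Python's built-in `sqlite3` module. This is a simplified illustration, not the scheme's own implementation: SQLite has no `select ... for update`, so `BEGIN IMMEDIATE` (which takes the write lock before reading) is used as a stand-in; the sketch writes B and C directly instead of relying on the middleware's SQL rewriting; and the table `t`, its `id` column, and the helper names are hypothetical.

```python
import hashlib
import sqlite3


def encrypt(plaintext: str) -> str:
    """Placeholder cipher; a real system would call the encryption/decryption system."""
    return "enc:" + plaintext[::-1]


def auxiliary(plaintext: str) -> str:
    """Auxiliary column value (SHA-256 assumed)."""
    return hashlib.sha256(plaintext.encode("utf-8")).hexdigest()


def clean_batch(conn: sqlite3.Connection, batch_size: int = 100) -> int:
    """Fill B and C for up to batch_size rows that still lack them, under a write lock."""
    conn.execute("BEGIN IMMEDIATE")  # take the write lock before reading
    try:
        rows = conn.execute(
            "select id, A from t where C is null and A is not null limit ?",
            (batch_size,),
        ).fetchall()
        for row_id, plain in rows:
            conn.execute(
                "update t set B = ?, C = ? where id = ?",
                (encrypt(plain), auxiliary(plain), row_id),
            )
        conn.execute("COMMIT")
    except Exception:
        conn.execute("ROLLBACK")
        raise
    return len(rows)


# isolation_level=None gives manual transaction control.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute("create table t (id integer primary key, A text, B text, C text)")
conn.executemany("insert into t (A) values (?)", [("abc",), ("xyz",), ("pqr",)])
while clean_batch(conn, batch_size=2):
    pass  # small batches keep each lock short
print(conn.execute("select count(*) from t where C is null").fetchone()[0])
```

Small batches bound how long the lock is held, which matches the recommendation to limit the amount of data cleaned in each run.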

Data checking step

Data checking adds a further safeguard to ensure that the plaintext and ciphertext are consistent. The check verifies, from the plaintext, whether the corresponding ciphertext and auxiliary column are correct. For example, the plaintext A in the table may be read out in segments, the encrypted data B and C may be obtained from the encryption/decryption system, and it may be determined whether they are consistent with the ciphertext and auxiliary column stored in the database for plaintext A.

Plaintext deletion step

Plaintext deletion is performed as "update table set A = '';". Because a large number of updates are required and operating on the database directly would affect its performance, the plaintext can be set to empty gradually in segmented updates until data desensitization is fully complete. For example, when it is determined that the calculated ciphertext data matches the ciphertext data stored in the database for the stock data, and the calculated auxiliary column data matches the auxiliary column data stored in the database for the stock data, the plaintext data of the stock data stored in the database may be set to empty, thereby deleting the plaintext data of the stock data in the database.

According to the data desensitization method of the embodiment of the present application, ciphertext and auxiliary column fields are added alongside the original plaintext field of the business data, and the data desensitization process is divided into three stages, each handling different logic. In the first stage, the plaintext, ciphertext, and auxiliary column values of the incremental data of the business system are all stored, and data computation operates on the plaintext field; in the second stage, the plaintext, ciphertext, and auxiliary column values of the incremental data are all stored, and data computation operates on the auxiliary column field; in the third stage, only the ciphertext and auxiliary column values of the incremental data are stored, and data computation operates on the auxiliary column field. The business system can thus complete data desensitization without changing business code. In addition, a complete scheme for desensitizing both stock data and incremental data is provided, and the whole process is transparent to the business, so data desensitization can be carried out while the business system runs normally. Furthermore, a complete exception handling scheme is provided: if a problem occurs during desensitization, rollback can be performed at any time and the data remains complete, again without the business noticing.
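Segmented deletion can be sketched with Python's built-in `sqlite3` module. The code is a minimal illustration under assumptions: the table `t` and its integer primary key `id` are hypothetical, segments are taken in primary-key order, and the condition "B and C are populated" is only a simplified guard standing in for the full data checking step.

```python
import sqlite3

# Rows eligible for clearing: plaintext still present, B and C already populated.
ELIGIBLE = "A <> '' and B is not null and C is not null"


def clear_plaintext_in_segments(conn: sqlite3.Connection, segment_size: int = 1000) -> int:
    """Set A to '' one primary-key segment at a time; returns the number of rows cleared."""
    total, last_id = 0, 0
    while True:
        ids = [row[0] for row in conn.execute(
            f"select id from t where id > ? and {ELIGIBLE} order by id limit ?",
            (last_id, segment_size),
        )]
        if not ids:
            return total
        with conn:  # commit each segment separately to keep transactions short
            conn.execute(
                f"update t set A = '' where id between ? and ? and {ELIGIBLE}",
                (ids[0], ids[-1]),
            )
        total += len(ids)
        last_id = ids[-1]


conn = sqlite3.connect(":memory:")
conn.execute("create table t (id integer primary key, A text, B text, C text)")
conn.executemany(
    "insert into t (A, B, C) values (?, ?, ?)",
    [(f"p{i}", f"b{i}", f"c{i}") for i in range(5)],
)
conn.commit()
cleared = clear_plaintext_in_segments(conn, segment_size=2)
print(cleared)
```

Committing per segment limits the load placed on the database by any single update, at the cost of the deletion taking several passes.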

In order to realize the embodiment, the application also provides a data desensitization device.

Fig. 5 is a block diagram of a data desensitization apparatus according to an embodiment of the present disclosure. It should be noted that the data desensitization apparatus according to the embodiment of the present application is suitable for accessing a database middleware in an already-operating service system to implement a scenario of desensitizing service data. As shown in fig. 5, the data desensitization apparatus may include: a transceiver module 501 and a data processing module 502.

The transceiver module 501 is configured to receive, in response to the service system operating in the first stage, first incremental data of the service system operating in the first stage;

the data processing module 502 is configured to determine first plaintext data, first ciphertext data, and first auxiliary column data corresponding to the first incremental data, and write the first plaintext data, the first ciphertext data, and the first auxiliary column data into the database;

the transceiver module 501 is further configured to receive, in response to the service system operating in the second stage, second incremental data of the service system operating in the second stage;

the data processing module 502 is further configured to determine second plaintext data, second ciphertext data, and second auxiliary column data corresponding to the second incremental data, and write the second plaintext data, the second ciphertext data, and the second auxiliary column data into the database;

the transceiver module 501 is further configured to receive, in response to the service system operating in the third stage, third incremental data of the service system operating in the third stage;

the data processing module 502 is further configured to determine third ciphertext data and third auxiliary column data of the third incremental data, and write the third ciphertext data and the third auxiliary column data into the database;

In the embodiment of the application, the data calculation process in the first stage operates on the plaintext field; the data calculation process in the second and third stages operates on the auxiliary column field.

Optionally, in some embodiments of the present application, the data processing module 502 may further be configured to: when the business system operates in the first stage and the second stage, desensitization processing is carried out on the stock data in the database.

In this embodiment of the application, the data processing module 502 may desensitize the stock data in the database as follows: calculating ciphertext data and auxiliary column data corresponding to the stock data according to the plaintext data of the stock data in the database; determining whether the calculated ciphertext data is consistent with the ciphertext data stored in the database for the stock data, and whether the calculated auxiliary column data is consistent with the auxiliary column data stored in the database for the stock data; and deleting the plaintext data of the stock data in the database in response to the calculated ciphertext data being consistent with the stored ciphertext data and the calculated auxiliary column data being consistent with the stored auxiliary column data.

Optionally, in some embodiments of the present application, the data processing module 502 determines whether the auxiliary column data of the stock data corresponds to the plaintext data of the stock data before calculating the ciphertext data and the auxiliary column data corresponding to the stock data according to the plaintext data of the stock data in the database; responding to the fact that the auxiliary column data of the stock data do not correspond to the plaintext data of the stock data, and obtaining the plaintext data of the stock data; calculating the latest ciphertext data of the stock data according to the plaintext data of the stock data; and updating the original ciphertext data of the stock data into the latest ciphertext data.

Optionally, in other embodiments of the present application, the data processing module 502 performs data cleaning processing on the inventory data in the database before calculating the ciphertext data and the auxiliary column data corresponding to the inventory data according to the plaintext data of the inventory data in the database.

With regard to the apparatus in the above-described embodiment, the specific manner in which each module performs the operation has been described in detail in the embodiment related to the method, and will not be elaborated here.

According to the data desensitization device of the embodiment of the present application, ciphertext and auxiliary column fields are added alongside the original plaintext field of the business data, and the data desensitization process is divided into three stages, each handling different logic. In the first stage, the plaintext, ciphertext, and auxiliary column values of the incremental data of the business system are all stored, and data computation operates on the plaintext field; in the second stage, the plaintext, ciphertext, and auxiliary column values of the incremental data are all stored, and data computation operates on the auxiliary column field; in the third stage, only the ciphertext and auxiliary column values of the incremental data are stored, and data computation operates on the auxiliary column field. The business system can thus complete data desensitization without changing business code. In addition, a complete scheme for desensitizing both stock data and incremental data is provided, and the whole process is transparent to the business, so data desensitization can be carried out while the business system runs normally. Furthermore, a complete exception handling scheme is provided: if a problem occurs during desensitization, rollback can be performed at any time and the data remains complete, again without the business noticing.

Based on the embodiments of the present application, the present application further provides a computer device comprising: at least one processor; and a memory communicatively coupled to the at least one processor; wherein the memory stores instructions executable by the at least one processor to cause the at least one processor to perform the data desensitization method described in any of the preceding embodiments.

Based on the embodiments of the present application, the present application further provides a computer-readable storage medium storing computer instructions, wherein the computer instructions are used for causing a computer to execute the data desensitization method according to any one of the foregoing embodiments of the present application.

FIG. 6 shows a schematic block diagram of an example computer device that can be used to implement embodiments of the present application. Computer devices are intended to represent various forms of digital computers, such as laptops, desktops, workstations, personal digital assistants, servers, blade servers, mainframes, and other appropriate computers. The components shown herein, their connections and relationships, and their functions, are meant to be examples only, and are not meant to limit implementations of the present application that are described and/or claimed herein.

As shown in fig. 6, the device 600 includes a computing unit 601, which can perform various appropriate actions and processes according to a computer program stored in a Read Only Memory (ROM) 602 or a computer program loaded from a storage unit 608 into a Random Access Memory (RAM) 603. In the RAM 603, various programs and data required for the operation of the device 600 can also be stored. The computing unit 601, the ROM 602, and the RAM 603 are connected to each other via a bus 604. An input/output (I/O) interface 605 is also connected to the bus 604.

A number of components in the device 600 are connected to the I/O interface 605, including: an input unit 606 such as a keyboard, a mouse, or the like; an output unit 607 such as various types of displays, speakers, and the like; a storage unit 608, such as a magnetic disk, optical disk, or the like; and a communication unit 609 such as a network card, modem, wireless communication transceiver, etc. The communication unit 609 allows the device 600 to exchange information/data with other devices via a computer network such as the internet and/or various telecommunication networks.

The computing unit 601 may be a variety of general and/or special purpose processing components having processing and computing capabilities. Some examples of the computing unit 601 include, but are not limited to, a Central Processing Unit (CPU), a Graphics Processing Unit (GPU), various dedicated Artificial Intelligence (AI) computing chips, various computing units running machine learning model algorithms, a Digital Signal Processor (DSP), and any suitable processor, controller, microcontroller, and so forth. The computing unit 601 performs the various methods and processes described above, such as the data desensitization method. For example, in some embodiments, the data desensitization method may be implemented as a computer software program tangibly embodied in a machine-readable medium, such as the storage unit 608. In some embodiments, part or all of the computer program may be loaded and/or installed onto the device 600 via the ROM 602 and/or the communication unit 609. When the computer program is loaded into the RAM 603 and executed by the computing unit 601, one or more steps of the data desensitization method described above may be performed. Alternatively, in other embodiments, the computing unit 601 may be configured to perform the data desensitization method by any other suitable means (e.g., by means of firmware).

Various implementations of the systems and techniques described above may be implemented in digital electronic circuitry, integrated circuitry, Field Programmable Gate Arrays (FPGAs), Application Specific Integrated Circuits (ASICs), Application Specific Standard Products (ASSPs), systems on a chip (SOCs), complex programmable logic devices (CPLDs), computer hardware, firmware, software, and/or combinations thereof. These various implementations may include: implementation in one or more computer programs that are executable and/or interpretable on a programmable system including at least one programmable processor, which may be special or general purpose, receiving data and instructions from, and transmitting data and instructions to, a storage system, at least one input device, and at least one output device.

Program code for implementing the methods of the present application may be written in any combination of one or more programming languages. These program codes may be provided to a processor or controller of a general purpose computer, special purpose computer, or other programmable data processing apparatus, such that the program codes, when executed by the processor or controller, cause the functions/operations specified in the flowchart and/or block diagram to be performed. The program code may execute entirely on the machine, partly on the machine, as a stand-alone software package partly on the machine and partly on a remote machine or entirely on the remote machine or server.

In the context of this application, a machine-readable medium may be a tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device. The machine-readable medium may be a machine-readable signal medium or a machine-readable storage medium. A machine-readable medium may include, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any suitable combination of the foregoing. More specific examples of a machine-readable storage medium would include an electrical connection based on one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.

To provide for interaction with a user, the systems and techniques described here can be implemented on a computer having: a display device (e.g., a CRT (cathode ray tube) or LCD (liquid crystal display) monitor) for displaying information to a user; and a keyboard and a pointing device (e.g., a mouse or a trackball) by which a user can provide input to the computer. Other kinds of devices may also be used to provide for interaction with a user; for example, feedback provided to the user can be any form of sensory feedback (e.g., visual feedback, auditory feedback, or tactile feedback); and input from the user may be received in any form, including acoustic, speech, or tactile input.

The systems and techniques described here can be implemented in a computing system that includes a back-end component (e.g., as a data server), or that includes a middleware component (e.g., an application server), or that includes a front-end component (e.g., a user computer having a graphical user interface or a web browser through which a user can interact with an implementation of the systems and techniques described here), or any combination of such back-end, middleware, or front-end components. The components of the system can be interconnected by any form or medium of digital data communication (e.g., a communication network). Examples of communication networks include: local Area Networks (LANs), Wide Area Networks (WANs), the internet, and blockchain networks.

The computer system may include clients and servers. A client and server are generally remote from each other and typically interact through a communication network. The relationship of client and server arises by virtue of computer programs running on the respective computers and having a client-server relationship to each other. The server may be a cloud server, also called a cloud computing server or cloud host, which is a host product in the cloud computing service system that overcomes the high management difficulty and weak service scalability of traditional physical hosts and VPS ("Virtual Private Server") services. The server may also be a server of a distributed system, or a server incorporating a blockchain.

It should be understood that various forms of the flows shown above may be used, with steps reordered, added, or deleted. For example, the steps described in the present disclosure may be executed in parallel, sequentially, or in different orders, and are not limited herein as long as the desired results of the technical solutions disclosed in the present disclosure can be achieved.

The above-described embodiments should not be construed as limiting the scope of the present application. It should be understood by those skilled in the art that various modifications, combinations, sub-combinations and substitutions may be made in accordance with design requirements and other factors. Any modification, equivalent replacement, and improvement made within the spirit and principle of the present application shall be included in the protection scope of the present application.
