TrustZone-based trusted quantized model inference method

Document No. 137298 | Published: 2021-10-22

Reading note: This technology, "TrustZone-based trusted quantized model inference method" (基于TrustZone的可信量化模型推理方法), was designed and created by 肖堃, 金宙贤, 陈丽蓉, 罗蕾, and 李允� on 2021-06-30. Its main content is as follows: The invention discloses a TrustZone-based trusted quantized model inference method. First, the Internet of things terminal device that executes artificial intelligence model inference is divided by TrustZone technology into a secure world and a normal world. In the secure world, the artificial intelligence model is parsed and its data quantized, and the computing nodes are divided into simple computing nodes and complex computing nodes; the simple computing nodes are deployed to the secure world, and the complex computing nodes to the secure world's shared memory. During inference, the simple computing nodes run in the secure world, while the complex computing nodes are sent through the shared memory to run in the normal world, with their results returned to the secure world; the Internet of things terminal device verifies the results of the complex computing nodes and then integrates them with the results of the simple computing nodes to obtain the artificial intelligence model inference result. By deploying the simple computing nodes and the complex computing nodes separately, the invention improves the efficiency of artificial intelligence model inference.

1. A TrustZone-based trusted quantized model inference method, characterized by comprising the following steps:

S1: for an Internet of things terminal device that executes artificial intelligence model inference, use TrustZone to divide the CPU into two execution environments: the secure world and the normal world;

S2: set the quantization rule of the artificial intelligence model in advance, generate the quantized model, and store it in the secure world of the Internet of things terminal device;

S3: the Internet of things terminal device parses the artificial intelligence model: convert the model from its original format into an ONNX model, then perform Protobuf deserialization and extract the model's data, dividing the extracted data into tensor data and computing-node data, where the tensor data are the input and output data of each computing node during inference, and the computing-node data are the data and related parameters of each computing node in the model; then quantize the tensor data according to the quantization rule set in step S2, and cache the quantized tensor data and the computing-node data in the secure world;

S4: the Internet of things terminal device divides the received computing nodes into simple computing nodes and complex computing nodes according to a preset complexity criterion, then allocates a shared memory in the secure world and places the complex-computing-node data and related tensor data into the shared memory;

S5: when the Internet of things terminal device needs to run artificial intelligence model inference, run the simple computing nodes in the secure world and send the complex computing nodes to the normal world through the shared memory to run, using multithreading while the complex computing nodes run;

S6: after a complex computing node finishes computing in the normal world, it returns its result to the secure world; the Internet of things terminal device verifies the result in the secure world according to a preset trusted check, and after the check passes, integrates it with the results of the simple computing nodes to obtain the inference result of the artificial intelligence model.

Technical Field

The invention belongs to the technical field of the Internet of things, and particularly relates to a TrustZone-based trusted quantized model inference method.

Background

With the development of Internet of things technology, people place higher requirements on the security and intelligence of hardware devices. Thanks to the development of hardware platforms that provide computing services and of parallel computing technology, the intelligence of Internet of things devices has greatly improved. The device-side inference process of an artificial intelligence model can be divided into two parts. One is the non-runtime service part, i.e., the part that manages the model; it includes functions such as parsing the network model, quantization, and network-model optimization. The other is the runtime part, which includes the computing library and the hardware computing resources that provide computing services. Technologies such as quantized computation, the Neon instruction set, computing-algorithm optimization, and parallel computation are often integrated into the computing library. FIG. 1 is a schematic flow chart of Internet of things artificial intelligence model inference.

Owing to the development of attack techniques and the complex and varied environments of Internet of things devices, the intelligentization of these devices faces serious security problems. How the artificial intelligence engine on an Internet of things device can provide a secure computing environment has therefore gradually become a research direction for deploying artificial intelligence applications in the Internet of things, and the device terminal is the key point of security protection. TrustZone is a security technology proposed by ARM for the running environment of the CPU of an Internet of things device terminal. TrustZone divides a single-core CPU into two different execution environments. One is the normal world, where the CPU has abundant resources, so a Linux system and ordinary user applications can be deployed there. The other is the secure world, whose tasks are usually security related, such as security-algorithm checking and password verification, so a secure OS and secure applications are deployed there. However, the essence of this architecture is the time-sharing of a single CPU between different operating environments, so a secure application in the secure environment cannot execute multithreaded operations and cannot run at high speed on multiple cores; this is the resource limitation brought by the TrustZone technology. It also causes considerable inconvenience to artificial intelligence model inference and requires further research.

Disclosure of Invention

The invention aims to overcome the defects of the prior art and provides a TrustZone-based trusted quantized model inference method.

In order to achieve the above purpose, the TrustZone-based trusted quantized model inference method of the present invention comprises the following steps:

S1: for an Internet of things terminal device that executes artificial intelligence model inference, use TrustZone to divide the CPU into two execution environments: the secure world and the normal world;

S2: set the quantization rule of the artificial intelligence model in advance, generate the quantized model, and store it in the secure world of the Internet of things terminal device;

S3: the Internet of things terminal device parses the artificial intelligence model: convert the model from its original format into an ONNX model, then perform Protobuf deserialization and extract the model's data, dividing the extracted data into tensor data and computing-node data, where the tensor data are the input and output data of each computing node during inference, and the computing-node data are the data and related parameters of each computing node in the model; then quantize the tensor data according to the quantization rule set in step S2, and cache the quantized tensor data and the computing-node data in the secure world;

S4: the Internet of things terminal device divides the received computing nodes into simple computing nodes and complex computing nodes according to a preset complexity criterion, then allocates a shared memory in the secure world and places the complex-computing-node data and related tensor data into the shared memory;

S5: when the Internet of things terminal device needs to run artificial intelligence model inference, run the simple computing nodes in the secure world and send the complex computing nodes to the normal world through the shared memory to run, using multithreading while the complex computing nodes run;

S6: after a complex computing node finishes computing in the normal world, it returns its result to the secure world; the Internet of things terminal device verifies the result in the secure world according to a preset trusted check, and after the check passes, integrates it with the results of the simple computing nodes to obtain the inference result of the artificial intelligence model.

The TrustZone-based trusted quantized model inference method of the present invention first uses TrustZone technology to divide an Internet of things terminal device that executes artificial intelligence model inference into two execution environments: the secure world and the normal world. The artificial intelligence model is parsed and its data quantized in the secure world, and the computing nodes are divided into simple computing nodes and complex computing nodes; the simple computing nodes are deployed to the secure world, while the complex computing nodes are made callable from the normal world through the secure world's shared memory. During inference, the simple computing nodes run in the secure world, the complex computing nodes are sent through the shared memory to run in the normal world, and their results are returned to the secure world, where the Internet of things terminal device verifies them and integrates them with the results of the simple computing nodes to obtain the artificial intelligence model inference result.

By deploying the simple computing nodes and the complex computing nodes separately, the invention improves the efficiency of artificial intelligence model inference.

Drawings

FIG. 1 is a schematic flow chart of Internet of things artificial intelligence model inference;

FIG. 2 is a flowchart of an embodiment of the TrustZone-based trusted quantized model inference method;

FIG. 3 is a structural diagram of the Internet of things terminal device according to the present invention;

FIG. 4 is a flow chart of artificial intelligence model parsing in the present invention.

Detailed Description

The following describes specific embodiments of the present invention with reference to the accompanying drawings, so that those skilled in the art can better understand the invention. It should be noted that in the following description, detailed descriptions of known functions and designs are omitted where they might obscure the subject matter of the invention.

Embodiments

FIG. 2 is a flowchart of a specific embodiment of the TrustZone-based trusted quantized model inference method of the present invention. As shown in FIG. 2, the method specifically comprises the following steps:

s201: configuration of terminal equipment of the Internet of things:

for the terminal equipment of the Internet of things executing artificial intelligence model inference, the TrustZone is adopted to divide the CPU into two execution environments: a secure world and a general world.

FIG. 3 is a structural diagram of the Internet of things terminal device in the present invention. As shown in FIG. 3, the terminal device is divided into two execution environments: the secure world and the normal world. The secure world carries the tasks of model parsing, data quantization, inference execution, and trusted verification; the normal world carries the task of inference execution; and the two worlds exchange information through a communication interface.

S202: determining an artificial intelligence model quantization rule:

Because the input data of an artificial intelligence model are usually floating-point values, in order to improve the model's running speed on the CPU of the Internet of things terminal device, the quantization rule of the model must be set in advance, and the generated quantized model stored in the secure world of the terminal device.

The quantization rule can be set according to the actual characteristics of the model's input data and can generally be chosen from the ARM Compute Library. Quantization modes include a symmetric mode and an asymmetric mode. Converting floating-point values into low-bit values effectively reduces the computational intensity, parameter size, and memory consumption of the model. In this embodiment, asymmetric quantization is adopted; the quantization parameters are the quantization scale and the zero point, and suitable quantization parameters are found by traversing the training-set data. On the principle of keeping the data distribution as consistent as possible before and after quantization, the quantization model uses information entropy as the evaluation criterion.
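As a minimal sketch of the asymmetric scheme described above (scale plus zero point, derived here from a single tensor's range rather than by traversing a training set), the quantize/dequantize pair can be written as:

```python
import numpy as np

def asymmetric_quantize(x, num_bits=8):
    """Asymmetric quantization to unsigned integers: scale and zero point
    are derived from the data range, which is widened to include 0."""
    qmin, qmax = 0, 2 ** num_bits - 1
    x_min, x_max = min(float(x.min()), 0.0), max(float(x.max()), 0.0)
    scale = (x_max - x_min) / (qmax - qmin)
    zero_point = int(round(qmin - x_min / scale))
    q = np.clip(np.round(x / scale) + zero_point, qmin, qmax).astype(np.uint8)
    return q, scale, zero_point

def dequantize(q, scale, zero_point):
    """Map quantized values back to floating point."""
    return (q.astype(np.float32) - zero_point) * scale
```

The round-trip error of any in-range value is bounded by half a quantization step, which is why low-bit inference remains usable when the scale is chosen from representative data.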

S203: analyzing and quantizing the artificial intelligence model:

The Internet of things terminal device parses the artificial intelligence model. FIG. 4 is a flow chart of artificial intelligence model parsing in the present invention. As shown in FIG. 4, the parsing method is: convert the artificial intelligence model from its original format into an ONNX model, then perform Protobuf deserialization and extract the model's data, dividing the extracted data into tensor data and computing-node data, where the tensor data are the input and output data of each computing node during inference, and the computing-node data are the data and related parameters of each computing node in the model. Then quantize the tensor data according to the quantization rule set in step S202, and cache the quantized tensor data and the computing-node data in the secure world.

ONNX (Open Neural Network Exchange) is a model exchange format shared across frameworks; it acts like a translator, converting models in different formats into a uniform format, from which the required tensor data and computing-node data are then extracted. Table 1 is an example of the tensor data format.

TABLE 1

Table 2 is an example of the computing-node data format.

TABLE 2
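The tensor/node split described above can be illustrated with a toy, purely hypothetical record layout standing in for the Protobuf-deserialized ONNX graph (real parsing would walk `GraphProto` via the `onnx` package):

```python
# Illustrative stand-in for a deserialized ONNX graph; field names are assumptions.
model_graph = {
    "nodes": [
        {"name": "conv1", "op_type": "Conv", "inputs": ["x", "w1"],
         "outputs": ["y1"], "attrs": {"kernel_shape": [3, 3], "strides": [1, 1]}},
        {"name": "relu1", "op_type": "Relu", "inputs": ["y1"],
         "outputs": ["y2"], "attrs": {}},
    ],
    "tensors": {  # name -> shape of each node's input/output tensor
        "x": [1, 3, 32, 32], "w1": [16, 3, 3, 3],
        "y1": [1, 16, 30, 30], "y2": [1, 16, 30, 30],
    },
}

def split_model_data(graph):
    """Separate the graph into tensor data (per-node I/O shapes, to be
    quantized) and computing-node data (operators plus parameters)."""
    tensor_data = dict(graph["tensors"])
    node_data = [
        {"name": n["name"], "op_type": n["op_type"], "params": n["attrs"],
         "inputs": n["inputs"], "outputs": n["outputs"]}
        for n in graph["nodes"]
    ]
    return tensor_data, node_data
```

After this split, the tensor table is what step S203 quantizes, and the node table is what step S204 classifies.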

S204: deploying the computing nodes:

The Internet of things terminal device divides the received computing nodes into simple computing nodes and complex computing nodes according to a preset complexity criterion, then allocates a shared memory in the secure world and places the complex-computing-node data and related tensor data into the shared memory.

The complexity criterion can be set by the method in the reference "Molchanov P, Tyree S, Karras T, et al." In practice, the complex computations are mostly convolution computations. Table 3 is an example of the data format of a convolution computing node.

TABLE 3
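The patent does not spell out the criterion itself; one plausible sketch, in the spirit of operation-count-based criteria, classifies a node as complex when its estimated multiply-accumulate count exceeds a threshold. The per-operator formulas and the threshold below are assumptions for illustration:

```python
def node_flops(node, out_shape):
    """Rough multiply-accumulate estimate for one node (illustrative formulas)."""
    n, c, h, w = out_shape  # output tensor shape: batch, channels, height, width
    if node["op_type"] == "Conv":
        kh, kw = node["params"]["kernel_shape"]
        in_ch = node["params"]["in_channels"]
        # Each output element costs in_ch * kh * kw multiply-accumulates.
        return n * c * h * w * in_ch * kh * kw
    # Element-wise ops (Relu, Add, ...): one operation per output element.
    return n * c * h * w

def classify(node, out_shape, threshold=1_000_000):
    """Assumed criterion: nodes above the threshold go to the normal world."""
    return "complex" if node_flops(node, out_shape) > threshold else "simple"
```

Under such a criterion, convolution nodes dominate the complex class, which matches the observation above that complex computation is mostly convolution.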

S205: artificial intelligence model inference:

When the Internet of things terminal device needs to run artificial intelligence model inference, the simple computing nodes run in the secure world, the complex computing nodes are sent to the normal world through the shared memory to run, and multithreading is used while the complex computing nodes run.

The specific method by which the TA (Trusted Application) in the secure world and the CA (Client Application) in the normal world share memory is as follows: the TA in the secure world and the CA in the normal world establish a connection; the CA sends the shared-memory command TEE_IOC_SHM_ALLOC and then blocks itself through a hardware instruction. The Linux system switches from the normal world to the secure world through an SMC (Secure Monitor Call) command. Since the CA does not know the size of the shared memory to be set, in the secure world the TA passes the size of the memory to be shared to the Linux system through a reverse RPC call, and the Linux system allocates the shared memory for data interaction to the TA and the CA, thereby realizing memory sharing.
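Purely as an analogy (this is ordinary OS shared memory, not the TrustZone/OP-TEE mechanism), the allocate-then-attach handshake can be illustrated with Python's `multiprocessing.shared_memory`: one side allocates a named buffer of the agreed size, and the peer attaches by name to exchange data:

```python
from multiprocessing import shared_memory

# "TA" side of the analogy: allocate a shared buffer of the size to exchange.
shm = shared_memory.SharedMemory(create=True, size=64)
shm.buf[:4] = b"PING"  # write a message into the shared region

# "CA" side of the analogy: attach to the same region by name and read it.
peer = shared_memory.SharedMemory(name=shm.name)
message = bytes(peer.buf[:4])

peer.close()
shm.close()
shm.unlink()  # release the region once both sides are done
```

In the patent's setting the roles are reversed in one respect: it is the TA, not the allocator of the request, that knows the required size, which is why the reverse RPC call is needed.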

When a computing node runs, the computation is mainly performed through the ARM Compute Library interface. Based on the OpenCL and OpenGL computing libraries, the computing operators are initialized, memory space is allocated for the various tensors, and the operation is then carried out according to the specific computing content of the node. Taking a convolution computing node as an example, the CLTensor interface is called to initialize the tensor layers for the input, the output, the convolution-kernel weights, and the convolution-kernel bias, and a CLScheduler is created and initialized. The CLScheduler sets the number of threads used during the operation, realizing multithreaded parallel computation and improving the operation speed. Next, the TensorShape classes for the convolution-kernel weights and bias and a PadStrideInfo class are created, the computing node is initialized and configured with the allocators of the input, output, convolution-kernel weights, and convolution-kernel bias, and the input memory block is mapped to the shared memory space.
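The convolution-as-matrix-multiplication lowering that underlies this path can be sketched in plain NumPy; this illustrates the im2col technique (stride 1, no padding), not the ARM Compute Library implementation itself:

```python
import numpy as np

def im2col_conv2d(x, w):
    """2-D convolution lowered to a single matrix multiplication (GEMM)."""
    c, h, wd = x.shape           # input: channels x height x width
    oc, _, kh, kw = w.shape      # weights: out_ch x in_ch x kh x kw
    oh, ow = h - kh + 1, wd - kw + 1
    # im2col: unfold every receptive field into one column.
    cols = np.empty((c * kh * kw, oh * ow))
    for i in range(oh):
        for j in range(ow):
            cols[:, i * ow + j] = x[:, i:i + kh, j:j + kw].ravel()
    # GEMM: (oc, c*kh*kw) @ (c*kh*kw, oh*ow) -> (oc, oh*ow)
    out = w.reshape(oc, -1) @ cols
    return out.reshape(oc, oh, ow)
```

Because the whole convolution becomes one GEMM, the matrix-multiplication verification described in step S206 can cover the convolution result as a whole.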

S206: integrating the inference results:

After a complex computing node finishes computing in the normal world, it returns its result to the secure world; the Internet of things terminal device verifies the result in the secure world according to a preset trusted check, and after the check passes, integrates it with the results of the simple computing nodes to obtain the inference result of the artificial intelligence model.

Taking the convolution computing node as an example, convolution requires a large number of matrix multiplications: one convolution operation can be realized through a tiling algorithm and a matrix multiplication, and the matrix multiplication is executed in the CLGEMM class. CLGEMM performs the multiplication of the convolution matrix with the feature matrix, and the product is verified according to Freivalds' algorithm. When a node task needs to perform a convolution operation, the operation must pass a trusted check. The trusted verification method adopted in this embodiment is as follows: a part of the matrix multiplication is selected from the convolution operation and computed in the secure world; if the result is consistent with that of the matrix multiplication in the normal world, the trusted check passes, otherwise it fails. Since the computation involves quantized values, whose accuracy changes due to saturation after computation, a matrix-multiplication check based on the idea of Freivalds' algorithm is used to perform the trusted check.
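Freivalds' algorithm verifies a claimed product C = A·B in O(n²) time per round by testing A(Br) = Cr for a random 0/1 vector r; each round misses an incorrect product with probability at most 1/2, so k rounds miss with probability at most 2⁻ᵏ. A minimal integer-matrix sketch (not the patent's CLGEMM integration) is:

```python
import numpy as np

def freivalds_check(a, b, c, rounds=30, rng=None):
    """Probabilistically verify that c == a @ b using random 0/1 vectors."""
    rng = rng or np.random.default_rng()
    n = b.shape[1]
    for _ in range(rounds):
        r = rng.integers(0, 2, size=(n, 1))         # random 0/1 column vector
        if not np.array_equal(a @ (b @ r), c @ r):  # two O(n^2) products
            return False                            # definitely not the product
    return True  # correct with probability >= 1 - 2**-rounds
```

The appeal in the secure world is that verification never forms the full O(n³) product, so checking a normal-world GEMM is far cheaper than recomputing it.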

Although illustrative embodiments of the present invention have been described above to help those skilled in the art understand the invention, it should be understood that the invention is not limited to the scope of these embodiments. To those skilled in the art, various changes are permissible as long as they remain within the spirit and scope of the invention as defined by the appended claims, and all inventions utilizing the inventive concept are protected.
