Data processing method and device based on convolutional neural network architecture

Document No.: 1146095    Publication date: 2020-09-11

Note: this technology, "Data processing method and device based on convolutional neural network architecture" (一种基于卷积神经网络架构的数据处理方法及装置), was designed and created by 王哲, 仇晓颖 and 韩彬 on 2019-05-05. Its main content is as follows:

A data processing method and device based on a convolutional neural network architecture. The method comprises: if the input of the current operation layer is a single group of fixed-point data, processing that group according to the operation rule of the current operation layer to generate the output data of the current operation layer; if the input of the current operation layer is n groups of fixed-point data (n ≥ 2), adjusting the n groups so that the quantization parameters of every group are the same, and processing the adjusted n groups according to the operation rule of the current operation layer to generate the output data of the current operation layer. The quantization parameters comprise the total bit width, the integer-part bit width and the fractional-part bit width of the fixed-point data. When the operations of all the operation layers are finished, the prediction result of the data to be detected is output. In this way, the computational accuracy of the fixed-point convolutional neural network model can be improved.

1. A data processing method based on a convolutional neural network architecture is characterized by comprising the following steps:

receiving input data to be detected with a trained convolutional neural network model; the neural network model comprises a plurality of operation layers connected in sequence, each operation layer performing its operation on the output data of the previous operation layer and providing its output as the input data of the next operation layer; the parameters of the convolutional neural network model are fixed-point data, and the data to be detected is fixed-point data;

if the input of the current operation layer is a single group of fixed-point data, processing that group of fixed-point data according to the operation rule of the current operation layer to generate the output data of the current operation layer;

if the input of the current operation layer is n groups of fixed-point data, adjusting the n groups of fixed-point data so that the quantization parameters of the n groups are the same, and processing the adjusted n groups according to the operation rule of the current operation layer to generate the output data of the current operation layer; wherein n ≥ 2, and the quantization parameters comprise the total bit width, the integer-part bit width and the fractional-part bit width of the fixed-point data;

and when the operations of all the operation layers are finished, outputting the prediction result of the data to be detected.

2. The method of claim 1, wherein if the current operation layer is a residual structure, the input of the current operation layer comprises n groups of fixed-point data, and the adjusting the n groups of fixed-point data so that their quantization parameters are the same specifically comprises:

acquiring, from the fractional-part bit width of each of the n groups of fixed-point data, the smallest fractional-part bit width as a reference bit width;

and shifting, rounding and saturating the n−1 groups of fixed-point data other than the group that already has the reference bit width, taking the reference bit width as the target, so that the fractional-part bit widths of all n groups become the same.
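For illustration, a minimal sketch of this alignment step in Python, assuming the fixed-point codes are stored as NumPy integer arrays with an 8-bit total width; the helper name `align_fixed_point` and its `ref` override (reused for the cascade case of claim 4) are illustrative, not from the patent:

```python
import numpy as np

def align_fixed_point(groups, frac_bits, total_bits=8, ref=None):
    """Shift, round and saturate n groups of fixed-point codes so that
    they all end up with one shared fractional-part bit width."""
    if ref is None:
        # residual (Eltwise) case: the smallest fractional-part bit width
        # among the input groups serves as the reference bit width
        ref = min(frac_bits)
    qmin = -(1 << (total_bits - 1))
    qmax = (1 << (total_bits - 1)) - 1
    aligned = []
    for g, fb in zip(groups, frac_bits):
        g = np.asarray(g, dtype=np.int32)
        shift = fb - ref
        if shift > 0:
            # drop fractional bits: arithmetic right shift, round-to-nearest
            g = (g + (1 << (shift - 1))) >> shift
        elif shift < 0:
            # gain fractional bits: left shift (clipped below if it overflows)
            g = g << (-shift)
        aligned.append(np.clip(g, qmin, qmax).astype(np.int32))
    return aligned, ref
```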

3. The method of claim 2, wherein the residual structure is implemented using an Eltwise layer.

4. The method of claim 1, wherein if the current operation layer is a cascade structure, the input of the current operation layer comprises n groups of fixed-point data, and the adjusting the n groups of fixed-point data so that their quantization parameters are the same specifically comprises:

taking the fractional-part bit width of the output quantization parameter of the current operation layer as a reference bit width;

and shifting, rounding and saturating the n groups of fixed-point data, taking the reference bit width as the target, so that the fractional-part bit widths of the n groups become the same.
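Under the same assumptions, the cascade case differs only in where the reference comes from: it is fixed to the fractional-part bit width of the layer's output quantization parameter (`output_frac_bits` is a hypothetical name) instead of being derived from the inputs, so the hypothetical helper sketched after claim 2 can be reused:

```python
# cascade (concat) case: align every input group to the fractional-part
# bit width of the current layer's output quantization parameter
aligned, _ = align_fixed_point(groups, frac_bits, ref=output_frac_bits)
```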

5. The method according to claim 4, wherein the cascade structure is implemented by using a concat layer.

6. The method of any one of claims 1 to 5, wherein if the current operation layer comprises a convolutional layer and a batch normalization layer, the batch normalization layer comprising a BatchNorm layer and a Scale layer, the method further comprises, before receiving the input data to be detected:

folding the convolutional layer, the BatchNorm layer and the Scale layer to form a new convolutional layer;

and performing fixed-point processing on the parameters of the new convolutional layer.

7. The method of claim 6, wherein folding the convolutional layer, the BatchNorm layer, and the Scale layer to form a new convolutional layer comprises:

folding the convolutional layer, the BatchNorm layer and the Scale layer according to the following formulas to obtain the parameters of the new convolutional layer after the folding process:

[The folding formulas were embedded as images in the source and did not survive extraction.]

wherein v is the variance parameter of the BatchNorm layer, μ is the mean parameter of the BatchNorm layer, β is the shift parameter of the Scale layer, s is the scaling parameter of the Scale layer, eps is a default value, w_fold is the weight parameter of the new convolutional layer, w is the weight parameter of the convolutional layer before the folding process, and b_fold is the bias parameter of the new convolutional layer.
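The definitions above match the standard folding of a BatchNorm layer followed by a Scale layer into the preceding convolution, so the lost formulas most likely take the usual form below; this is a reconstruction consistent with those definitions, not the patent's verbatim text (b is the pre-folding convolution bias, taken as 0 if the layer has none):

```latex
w_{fold} = \frac{s \cdot w}{\sqrt{v + eps}}, \qquad
b_{fold} = \frac{s \cdot (b - \mu)}{\sqrt{v + eps}} + \beta
```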

8. The method of any one of claims 1 to 5, wherein if the current operation layer comprises a fully-connected layer and a batch normalization layer, the batch normalization layer comprising a BatchNorm layer and a Scale layer, the method further comprises, before receiving the input data to be detected:

folding the fully-connected layer, the BatchNorm layer and the Scale layer to form a new fully-connected layer;

and performing fixed-point processing on the parameters of the new fully-connected layer.

9. The method according to claim 8, wherein the folding the fully-connected layer, the BatchNorm layer, and the Scale layer to form a new fully-connected layer comprises:

folding the fully-connected layer, the BatchNorm layer and the Scale layer according to the following formulas to obtain the parameters of the new fully-connected layer:

[Formula images FDA0002592578540000031 and FDA0002592578540000032 did not survive extraction; the folding formulas take the same form as those of claim 7, with the fully-connected weight and bias in place of the convolutional ones.]

wherein v is the variance parameter of the BatchNorm layer, μ is the mean parameter of the BatchNorm layer, β is the shift parameter of the Scale layer, s is the scaling parameter of the Scale layer, eps is a default value, w_fold is the weight parameter of the new fully-connected layer, w is the weight parameter of the fully-connected layer before the folding process, b_fold is the bias parameter of the new fully-connected layer, and b is the bias parameter of the fully-connected layer before the folding process.

10. The method of any one of claims 1 to 9, wherein before receiving the input data to be detected, the method further comprises:

obtaining the maximum absolute value in a first data group, and calculating an initial integer-part bit width IL_0 according to that maximum value;

computing IL_{i+1} = IL_i − 1, wherein i ≥ 0; and calculating the maximum floating-point value r_max and the minimum floating-point value r_min representable under IL_{i+1};

and acquiring the quantization parameter of the first data group according to the initial integer-part bit width IL_0, the maximum floating-point value r_max and the minimum floating-point value r_min.
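As a rough illustration of these steps, a sketch under assumptions the claim does not spell out: a signed two's-complement format with one sign bit, total_bits − 1 − integer_bits fractional bits, and IL_0 chosen as the smallest integer-part bit width whose range covers the maximum absolute value (the patent's exact rule for deriving IL_0 from the maximum is not given here):

```python
import math

def initial_integer_bits(data):
    """IL_0: smallest integer-part bit width covering max|x| (assumed rule)."""
    max_abs = max(abs(float(x)) for x in data)
    if max_abs == 0:
        return 0
    return max(0, math.floor(math.log2(max_abs)) + 1)

def representable_range(total_bits, integer_bits):
    """r_min and r_max representable with 1 sign bit, `integer_bits` integer
    bits and total_bits - 1 - integer_bits fractional bits."""
    frac_bits = total_bits - 1 - integer_bits
    step = 2.0 ** (-frac_bits)
    r_max = ((1 << (total_bits - 1)) - 1) * step
    r_min = -(1 << (total_bits - 1)) * step
    return r_min, r_max
```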

11. The method according to claim 10, wherein the acquiring the quantization parameter of the first data group according to the initial integer-part bit width IL_0, the maximum floating-point value r_max and the minimum floating-point value r_min specifically comprises:

performing fixed-point processing and inverse fixed-point processing on each datum in the first data group according to the initial integer-part bit width IL_0 to generate a floating-point value r';

taking r_min as the lower threshold and r_max as the upper threshold to generate a threshold range [r_min, r_max];

acquiring the data in the first data group that exceed [r_min, r_max] to generate a second data group; calculating the saturation loss of each datum in the second data group according to r_max, r_min and the floating-point value r' generated from that datum by the fixed-point and inverse fixed-point processing under IL_0; and accumulating the saturation losses of all data in the second data group to obtain a first accumulated value ST;

acquiring the data in the first data group that fall within [r_min, r_max] to generate a third data group, calculating the gain of each datum in the third data group, and accumulating the gains of all data in the third data group to obtain a second accumulated value G;

and acquiring the quantization parameter of the first data group according to the first accumulated value ST and the second accumulated value G.

12. The method of claim 11, wherein the acquiring the quantization parameter of the first data group according to the first accumulated value ST and the second accumulated value G specifically comprises:

when G ≤ K × ST, taking IL_i as the quantization parameter of the first data group, wherein K is a preset value.

13. The method of claim 12, wherein the acquiring the quantization parameter of the first data group according to the first accumulated value ST and the second accumulated value G further comprises:

when G > K × ST, recalculating IL_{i+1} = IL_i − 1 to update the maximum floating-point value r_max and the minimum floating-point value r_min representable under IL_{i+1}.

14. The method according to any one of claims 11 to 13, wherein the calculating the saturation loss of each datum in the second data group according to r_max, r_min and the floating-point value r' generated under IL_0 by the fixed-point and inverse fixed-point processing specifically comprises:

when r' is positive, taking the absolute value of the difference between r' and r_max as the saturation loss of the datum;

when r' is negative, taking the absolute value of the difference between r' and r_min as the saturation loss of the datum.

15. The method according to any one of claims 11 to 14, wherein the calculating the gain of each datum in the third data group specifically comprises:

separately acquiring the quantization loss L1 of the datum under IL_0 and the quantization loss L2 of the datum under IL_i, and calculating the absolute value of the difference between L1 and L2 as the gain of the datum; the quantization loss is the difference between the value of any datum after fixed-point and inverse fixed-point processing and its original value.
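Putting claims 10 to 15 together, a sketch of one plausible reading of the search loop; the extraction leaves the IL_i versus IL_{i+1} indexing of L2 ambiguous, so here the candidate width IL_{i+1} is evaluated against keeping IL_i, quantization is plain round-and-saturate, and `initial_integer_bits` and `representable_range` are the hypothetical helpers sketched after claim 10:

```python
import numpy as np

def quantize_dequantize(x, total_bits, integer_bits):
    """Fixed-point then inverse fixed-point processing: scale, round,
    saturate, rescale back to floating point."""
    frac_bits = total_bits - 1 - integer_bits
    scale = 2.0 ** frac_bits
    qmin = -(1 << (total_bits - 1))
    qmax = (1 << (total_bits - 1)) - 1
    return np.clip(np.round(x * scale), qmin, qmax) / scale

def search_integer_bits(data, total_bits=8, K=1.0):
    data = np.asarray(data, dtype=np.float64)
    il_i = initial_integer_bits(data)                  # IL_0
    r0 = quantize_dequantize(data, total_bits, il_i)   # r' under IL_0
    l1 = np.abs(r0 - data)                             # per-datum loss L1
    while il_i > 1 - total_bits:                       # safety floor
        il_next = il_i - 1                             # IL_{i+1}
        r_min, r_max = representable_range(total_bits, il_next)
        out = (r0 > r_max) | (r0 < r_min)              # second data group
        # first accumulated value ST: saturation losses of out-of-range data
        st = np.sum(np.where(r0[out] > 0,
                             np.abs(r0[out] - r_max),
                             np.abs(r0[out] - r_min)))
        # second accumulated value G: gains |L1 - L2| over the third data group
        ri = quantize_dequantize(data[~out], total_bits, il_next)
        g = np.sum(np.abs(l1[~out] - np.abs(ri - data[~out])))
        if g <= K * st:                                # claim 12: stop at IL_i
            break
        il_i = il_next                                 # claim 13: keep shrinking
    return il_i
```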

16. The method according to claim 10, wherein the acquiring the quantization parameter of the first data group according to the initial integer-part bit width IL_0, the maximum floating-point value r_max and the minimum floating-point value r_min specifically comprises:

taking r_min as the lower threshold and r_max as the upper threshold to generate a threshold range [r_min, r_max];

acquiring the number C1 of data in the first data group that exceed [r_min, r_max];

acquiring the number C2 of non-zero data in the first data group;

when C2 ≤ K × C1, taking IL_i as the quantization parameter of the first data group, wherein K is a preset value.

17. The method according to claim 16, wherein the acquiring the quantization parameter of the first data group according to the initial integer-part bit width IL_0, the maximum floating-point value r_max and the minimum floating-point value r_min further comprises:

when C2 > K × C1, recalculating IL_{i+1} = IL_i − 1 to update the maximum floating-point value r_max and the minimum floating-point value r_min representable under IL_{i+1}.
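The claims-16/17 variant replaces the loss accounting with two simple counts; a sketch under the same assumptions and hypothetical helpers:

```python
import numpy as np

def search_integer_bits_by_count(data, total_bits=8, K=1.0):
    """Shrink the integer part until the non-zero count C2 no longer
    exceeds K times the count C1 of values that would saturate."""
    data = np.asarray(data, dtype=np.float64)
    il_i = initial_integer_bits(data)                  # IL_0
    c2 = int(np.count_nonzero(data))                   # non-zero data count
    while il_i > 1 - total_bits:                       # safety floor
        il_next = il_i - 1                             # IL_{i+1}
        r_min, r_max = representable_range(total_bits, il_next)
        c1 = int(np.sum((data > r_max) | (data < r_min)))
        if c2 <= K * c1:                               # claim 16: stop at IL_i
            break
        il_i = il_next                                 # claim 17: keep shrinking
    return il_i
```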

18. The method of any one of claims 1 to 17, wherein the parameters of the convolutional neural network model comprise weight parameters and bias parameters; the weight parameters, the bias parameters and the input and output data of the convolutional neural network model are each treated as a separate data group, with the total bit width of the fixed-point data preset for the fixed-point process; and the quantization parameters of the weight parameters, bias parameters, input data and output data of each operation layer are calculated separately according to their respective distribution ranges.

19. The method of any one of claims 1 to 18, wherein the trained convolutional neural network model is obtained by:

performing network parameter training on the convolutional neural network to obtain initial floating-point parameters of the convolutional neural network, the network parameters comprising weight data and bias data;

obtaining, according to the distributions of the initial floating-point parameters, the input data and the output data, the quantization parameters respectively corresponding to them;

continuing to input training data into the convolutional neural network model based on the initial floating point parameters and the quantization losses under the quantization parameters, and updating the floating point parameters of the convolutional neural network model and the quantization losses under the quantization parameters according to a loss function of the convolutional neural network model;

and when the loss function converges to a preset condition, performing fixed-point processing on the currently updated floating-point parameters based on the quantization parameters, thereby generating the trained convolutional neural network model.
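Claim 19 describes quantization-aware training: the model keeps floating-point parameters but is trained against the loss it will incur after fixed-point processing. One common way to realize the "quantization loss under the quantization parameters" during training is a fake-quantization round trip with a straight-through gradient, sketched below in PyTorch; both the framework choice and the straight-through estimator are assumptions, not details stated in the claims:

```python
import torch

def fake_quantize(x, total_bits=8, integer_bits=3):
    """Fixed-point round trip (scale, round, saturate, rescale) whose
    backward pass is the identity, so the floating-point parameters keep
    receiving gradients while the forward pass sees the quantized values."""
    frac_bits = total_bits - 1 - integer_bits
    scale = 2.0 ** frac_bits
    qmin = -(2 ** (total_bits - 1))
    qmax = 2 ** (total_bits - 1) - 1
    q = torch.clamp(torch.round(x * scale), qmin, qmax) / scale
    return x + (q - x).detach()   # forward: q; backward: d/dx = 1
```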

20. A data processing apparatus, comprising at least a memory and a processor; the memory is connected to the processor through a communication bus and is used for storing computer instructions executable by the processor; and the processor is configured to read the computer instructions from the memory to implement a data processing method based on a convolutional neural network architecture, the method comprising:

receiving input data to be detected with a trained convolutional neural network model; the neural network model comprises a plurality of operation layers connected in sequence, each operation layer performing its operation on the output data of the previous operation layer and providing its output as the input data of the next operation layer; the parameters of the convolutional neural network model are fixed-point data, and the data to be detected is fixed-point data;

if the input of the current operation layer is a single group of fixed-point data, processing that group of fixed-point data according to the operation rule of the current operation layer to generate the output data of the current operation layer;

if the input of the current operation layer is n groups of fixed-point data, adjusting the n groups of fixed-point data so that the quantization parameters of each group in the n groups are the same, and processing the adjusted n groups according to the operation rule of the current operation layer to generate the output data of the current operation layer; wherein n ≥ 2, and the quantization parameters comprise the total bit width, the integer-part bit width and the fractional-part bit width of the fixed-point data;

and when the operations of all the operation layers are finished, outputting the prediction result of the data to be detected.

21. The apparatus of claim 20, wherein if the current operation layer is a residual structure, the input of the current operation layer comprises n groups of fixed-point data, and the processor is further configured to read computer instructions from the memory to implement:

acquiring, from the fractional-part bit width of each of the n groups of fixed-point data, the smallest fractional-part bit width as a reference bit width;

and shifting, rounding and saturating the n−1 groups of fixed-point data other than the group that already has the reference bit width, taking the reference bit width as the target, so that the fractional-part bit widths of all n groups become the same.

22. The apparatus of claim 21, wherein the residual structure is implemented using an Eltwise layer.

23. The apparatus of claim 20, wherein if the current operation layer is a cascade structure, the input of the current operation layer comprises n groups of fixed-point data, and the processor is further configured to read computer instructions from the memory to implement:

taking the fractional-part bit width of the output quantization parameter of the current operation layer as a reference bit width;

and shifting, rounding and saturating the n groups of fixed-point data, taking the reference bit width as the target, so that the fractional-part bit widths of the n groups become the same.

24. The apparatus of claim 23, wherein the cascade structure is implemented using a concat layer.

25. The apparatus of any one of claims 20 to 24, wherein if the current operation layer comprises a convolutional layer and a batch normalization layer, the batch normalization layer comprising a BatchNorm layer and a Scale layer, the processor is further configured to read computer instructions from the memory to implement:

folding the convolutional layer, the BatchNorm layer and the Scale layer to form a new convolutional layer;

and performing fixed-point processing on the parameters of the new convolutional layer.

26. The apparatus of claim 25, wherein the processor is further configured to read computer instructions from the memory to implement:

folding the convolutional layer, the BatchNorm layer and the Scale layer according to the following formulas to obtain the parameters of the new convolutional layer after the folding process:

[Formula images omitted in the source; the folding formulas are the same as those of claim 7.]

wherein v is the variance parameter of the BatchNorm layer, μ is the mean parameter of the BatchNorm layer, β is the shift parameter of the Scale layer, s is the scaling parameter of the Scale layer, eps is a default value, w_fold is the weight parameter of the new convolutional layer, w is the weight parameter of the convolutional layer before the folding process, and b_fold is the bias parameter of the new convolutional layer.

27. The apparatus of any one of claims 20 to 24, wherein if the current operation layer comprises a fully-connected layer and a batch normalization layer, the batch normalization layer comprising a BatchNorm layer and a Scale layer, the processor is further configured to read computer instructions from the memory to implement:

folding the fully-connected layer, the BatchNorm layer and the Scale layer to form a new fully-connected layer;

and performing fixed-point processing on the parameters of the new fully-connected layer.

28. The apparatus of claim 27, wherein the processor is further configured to read computer instructions from the memory to implement:

folding the fully-connected layer, the BatchNorm layer and the Scale layer according to the following formulas to obtain the parameters of the new fully-connected layer:

[Formula image FDA0002592578540000094 omitted in the source; the folding formulas are the same as those of claim 9.]

wherein v is the variance parameter of the BatchNorm layer, μ is the mean parameter of the BatchNorm layer, β is the shift parameter of the Scale layer, s is the scaling parameter of the Scale layer, eps is a default value, w_fold is the weight parameter of the new fully-connected layer, w is the weight parameter of the fully-connected layer before the folding process, b_fold is the bias parameter of the new fully-connected layer, and b is the bias parameter of the fully-connected layer before the folding process.

29. The apparatus of any one of claims 20 to 28, wherein before receiving the input data to be detected, the processor is further configured to read computer instructions from the memory to implement:

obtaining the maximum absolute value in a first data group, and calculating an initial integer-part bit width IL_0 according to that maximum value;

computing IL_{i+1} = IL_i − 1, wherein i ≥ 0; and calculating the maximum floating-point value r_max and the minimum floating-point value r_min representable under IL_{i+1};

and acquiring the quantization parameter of the first data group according to the initial integer-part bit width IL_0, the maximum floating-point value r_max and the minimum floating-point value r_min.

30. The apparatus of claim 29, wherein the processor is further configured to read computer instructions from the memory to implement:

performing fixed-point processing and inverse fixed-point processing on each datum in the first data group according to the initial integer-part bit width IL_0 to generate a floating-point value r';

taking r_min as the lower threshold and r_max as the upper threshold to generate a threshold range [r_min, r_max];

acquiring the data in the first data group that exceed [r_min, r_max] to generate a second data group; calculating the saturation loss of each datum in the second data group according to r_max, r_min and the floating-point value r' generated from that datum by the fixed-point and inverse fixed-point processing under IL_0; and accumulating the saturation losses of all data in the second data group to obtain a first accumulated value ST;

acquiring the data in the first data group that fall within [r_min, r_max] to generate a third data group, calculating the gain of each datum in the third data group, and accumulating the gains of all data in the third data group to obtain a second accumulated value G;

and acquiring the quantization parameter of the first data group according to the first accumulated value ST and the second accumulated value G.

31. The apparatus of claim 30, wherein the processor is further configured to read computer instructions from the memory to implement:

when G ≤ K × ST, taking IL_i as the quantization parameter of the first data group, wherein K is a preset value.

32. The apparatus of claim 31, wherein the processor is further configured to read computer instructions from the memory to implement:

when G > K × ST, recalculating IL_{i+1} = IL_i − 1 to update the maximum floating-point value r_max and the minimum floating-point value r_min representable under IL_{i+1}.

33. The apparatus according to any of claims 30 to 32, wherein the processor is further configured to read computer instructions from the memory to implement:

when r' is positive, taking the absolute value of the difference between r' and r_max as the saturation loss of the datum;

when r' is negative, taking the absolute value of the difference between r' and r_min as the saturation loss of the datum.

34. The apparatus according to any of claims 30 to 33, wherein the processor is further configured to read computer instructions from the memory to implement:

separately acquiring the quantization loss L1 of the datum under IL_0 and the quantization loss L2 of the datum under IL_i, and calculating the absolute value of the difference between L1 and L2 as the gain of the datum; the quantization loss is the difference between the value of any datum after fixed-point and inverse fixed-point processing and its original value.

35. The apparatus of claim 29, wherein the processor is further configured to read computer instructions from the memory to implement:

taking r_min as the lower threshold and r_max as the upper threshold to generate a threshold range [r_min, r_max];

acquiring the number C1 of data in the first data group that exceed [r_min, r_max];

acquiring the number C2 of non-zero data in the first data group;

when C2 ≤ K × C1, taking IL_i as the quantization parameter of the first data group, wherein K is a preset value.

36. The apparatus of claim 35, wherein the processor is further configured to read computer instructions from the memory to implement:

when C2 > K × C1, recalculating IL_{i+1} = IL_i − 1 to update the maximum floating-point value r_max and the minimum floating-point value r_min representable under IL_{i+1}.

37. The apparatus of any one of claims 20 to 36, wherein the parameters of the convolutional neural network model comprise weight parameters and bias parameters; the weight parameters, the bias parameters and the input and output data of the convolutional neural network model are each treated as a separate data group, with the total bit width of the fixed-point data preset for the fixed-point process; and the quantization parameters of the weight parameters, bias parameters, input data and output data of each operation layer are calculated separately according to their respective distribution ranges.

38. The apparatus according to any of claims 20 to 37, wherein the processor is further configured to read computer instructions from the memory to implement:

performing network parameter training on the convolutional neural network to obtain initial floating-point parameters of the convolutional neural network, the network parameters comprising weight data and bias data;

obtaining, according to the distributions of the initial floating-point parameters, the input data and the output data, the quantization parameters respectively corresponding to them;

continuing to input training data into the convolutional neural network model based on the initial floating point parameters and the quantization losses under the quantization parameters, and updating the floating point parameters of the convolutional neural network model and the quantization losses under the quantization parameters according to a loss function of the convolutional neural network model;

and when the loss function converges to a preset condition, performing fixed-point processing on the currently updated floating-point parameters based on the quantization parameters, thereby generating the trained convolutional neural network model.

39. A computer-readable storage medium, on which a computer program is stored, which, when being executed by a processor, carries out the steps of the method of any one of claims 1 to 19.
