Method for detecting image classification output result

Document No.: 1738155   Published: 2019-12-20

Reading note: this technology, 一种图像分类输出结果的检测方法 (Method for detecting image classification output result), was devised by 丁晓伟 and 张政 on 2019-08-27. Main content: The invention relates to a method for detecting image classification output results. A multi-layer first classification network, comprising convolutional layers and fully connected layers, is constructed by analyzing the network layer by layer. The convolutional layers and fully connected layers apply linear and nonlinear processing to the original image to obtain feature maps, and the feature maps are converted to form training data for an introspection network: where the first classification network's prediction is wrong, the training label is 0; where it is correct, the training label is 1. The training samples are input into the introspection network. The output of the first classification network is then processed and converted as in step S2 and input into the trained introspection network, whose output determines whether the output of the first classification network is correct. By having the introspection network examine the output feature maps of the first classification network, the invention can judge whether the output of the first classification network is correct.

1. A method for detecting image classification output results, characterized by comprising the following steps:

S1: constructing a multi-layer first classification network by analyzing the network layer by layer; the first classification network comprises convolutional layers and fully connected layers;

S2: the convolutional layers and fully connected layers apply linear and nonlinear processing to the original image to obtain feature maps, and the feature maps are converted to form training data for an introspection network; where the first classification network's prediction is wrong, the training label is 0; where the first classification network's prediction is correct, the training label is 1; the introspection network is a second classification network with hierarchical input;

S3: inputting the training samples into the introspection network;

S4: the output of the first classification network is processed and converted as in step S2 and then input into the trained introspection network, where the input into the introspection network is formed as follows: the feature map output by the first classification network is joined with the feature map output by the introspection network and used as the input of the next layer of the introspection network; whether the output of the first classification network is correct is judged from the output of the introspection network.

2. The method of claim 1, wherein the first classification network comprises a plurality of convolutional layers and a plurality of fully connected layers, the convolutional layers being designed as follows: the con1 layer takes the original image as input and performs convolution to obtain a feature map, and the output of the feature map after linear and nonlinear processing is fed directly into the next convolutional layer, con2;

the con2 convolutional layer performs convolution on its input to obtain a feature map, and the output of the feature map after linear and nonlinear processing is fed directly into the next convolutional layer, con3, and so on up to the last fully connected layer.

3. The method of claim 1, wherein when the number of feature maps output by the convolutional layers and fully connected layers of the first classification network is greater than a preset value,

the first classification network selects only the feature maps output by a of its layers and inputs them into the introspection network for data fitting, where a is an arbitrary value smaller than the preset value.

4. The method of claim 1, wherein when the number of feature maps output by the convolutional layers and fully connected layers of the first classification network is greater than a preset value,

the first classification network inputs only the feature map output by the fully connected layer into the introspection network for fitting.

5. The method of claim 1, wherein when the number of feature maps output by the convolutional layers and fully connected layers of the first classification network is not greater than a preset value,

the first classification network inputs all feature maps output by each layer into the introspection network for fitting.

6. The method of claim 1, wherein the output of the introspection network is as follows:

comparing the output of the introspection network for the first classification network result with a set threshold;

when the output of the introspection network is smaller than the threshold value, determining that the output result of the first classification network is unreliable, and outputting 0;

and when the output of the introspection network is not less than the threshold value, determining that the output result of the first classification network is reliable, and outputting 1.

7. The method of claim 1, wherein the introspection network inputs are:

linear and non-linear processed single layer feature maps or multi-layer feature maps.

8. The method of claim 1, wherein the introspection network inputs are:

the feature maps output by multiple layers, each joined with the feature map output by the previous layer.

9. The method of claim 1, wherein the introspection network inputs are:

the multiple feature maps are used as inputs to parallel branches of the network, which are merged at a later stage.

10. The method of claim 1,

the introspection network implements a binary comparator.

Technical Field

The invention relates to the technical field of computers, in particular to a method for detecting an image classification output result.

Background

By learning translation-invariant features and sharing parameters, Convolutional Neural Networks (CNNs) have greatly improved performance on image classification tasks while reducing the number of learned parameters, which makes the networks generalize better and overfit less easily. However, the performance of these networks still depends heavily on the data sets they were trained on. Data that departs from the training distribution can lead to incorrect predictions, primarily because the network saw no similar images during training. When these networks use softmax to squash the outputs into the range [0, 1] and force the scores to become probabilities that sum to 1, the highest-scoring category appears to have a high probability because of the normalized exponential, even when the scores for all categories are low. It is therefore difficult to distinguish good predictions from bad ones simply by observing these probabilities, and this holds for any network that uses a softmax classifier.
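The softmax effect described above can be illustrated with a small worked example. This is only a sketch in plain Python; the score values are made up for illustration and do not come from the source. Because softmax depends only on the differences between scores, a set of uniformly low scores yields exactly the same probabilities as a set of uniformly high scores with the same gaps, so the top probability says nothing about how strong the evidence was in absolute terms.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of raw class scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical raw scores for three classes: all low (weak evidence).
weak = softmax([0.1, 0.2, 2.0])

# Same gaps between classes, but every score is 10 higher (strong evidence).
strong = softmax([10.1, 10.2, 12.0])

# Both cases give the top class a probability of roughly 0.76.
print(weak)
print(strong)
```

The two printed distributions agree up to floating-point rounding, which is why a separate judgment of reliability, such as the introspection network proposed here, is of interest.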

New CNN architectures that deliver state-of-the-art image classification and improve classification accuracy keep emerging: simple network architectures such as AlexNet have evolved into large networks such as Inception-ResNet. In addition, steps such as data augmentation during training and testing attempt to account for variations in the data that are difficult for the network to learn. Despite these improvements, image classification results are not always reliable, and knowing when a classifier can be relied upon and when its predictions need further review is a challenge that still needs to be addressed.

Disclosure of Invention

In view of the above, an object of the present invention is to overcome the deficiencies of the prior art, and to provide a method for detecting an image classification output result, so as to solve the problem that it is impossible to determine whether an image classification output result is accurate in the prior art.

In order to achieve the purpose, the invention adopts the following technical scheme:

S1: constructing a multi-layer first classification network by analyzing the network layer by layer; the first classification network comprises convolutional layers and fully connected layers;

S2: the convolutional layers and fully connected layers apply linear and nonlinear processing to the original image to obtain feature maps, and the feature maps are converted to form training data for an introspection network; where the first classification network's prediction is wrong, the training label is 0; where the first classification network's prediction is correct, the training label is 1; the introspection network is a second classification network with hierarchical input;

S3: inputting the training samples into the introspection network;

S4: the output of the first classification network is processed and converted as in step S2 and then input into the trained introspection network, where the input into the introspection network is formed as follows: the feature map output by the first classification network is joined with the feature map output by the introspection network and used as the input of the next layer of the introspection network; whether the output of the first classification network is correct is judged from the output of the introspection network.

Further, the first classification network includes a plurality of convolutional layers and a plurality of fully connected layers, the convolutional layers being designed as follows: the con1 layer takes the original image as input and performs convolution to obtain a feature map, and the output of the feature map after linear and nonlinear processing is fed directly into the next convolutional layer, con2;

the con2 convolutional layer performs convolution on its input to obtain a feature map, and the output of the feature map after linear and nonlinear processing is fed directly into the next convolutional layer, con3, and so on up to the last fully connected layer.

Furthermore, when the number of feature maps output by the convolutional layers and fully connected layers of the first classification network is greater than a preset value,

the first classification network selects only the feature maps output by a of its layers and inputs them into the introspection network for data fitting, where a is an arbitrary value smaller than the preset value.

Furthermore, when the number of feature maps output by the convolutional layers and fully connected layers of the first classification network is greater than a preset value,

the first classification network inputs only the feature map output by the fully connected layer into the introspection network for fitting.

Further, when the number of feature maps output by the convolutional layers and fully connected layers of the first classification network is not greater than the preset value,

the first classification network inputs all feature maps output by each layer into the introspection network for fitting.

Further, the output of the introspection network adopts the following mode:

comparing the output of the introspection network for the first classification network result with a set threshold;

when the output of the introspection network is smaller than the threshold value, determining that the output result of the first classification network is unreliable, and outputting 0;

and when the output of the introspection network is not less than the threshold value, determining that the output result of the first classification network is reliable, and outputting 1.

Further, the input of the introspection network is:

linear and non-linear processed single layer feature maps or multi-layer feature maps.

Further, the input of the introspection network is:

the feature maps output by multiple layers, each joined with the feature map output by the previous layer.

Further, the input of the introspection network is:

the multiple feature maps are used as inputs to parallel branches of the network, which are merged at a later stage.

Further, the introspection network implements a binary comparator.

By adopting the technical scheme, the invention can achieve the following beneficial effects:

An introspection network is established, and it examines the output feature maps of the first classification network in order to judge whether the output of the first classification network is correct.

Drawings

In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art will be briefly described below, it is obvious that the drawings in the following description are only some embodiments of the present invention, and for those skilled in the art, other drawings can be obtained according to the drawings without creative efforts.

FIG. 1 is a step diagram of a method for detecting an output result of image classification according to the present invention;

FIG. 2 is a schematic flow chart of a method for detecting an image classification output result according to the present invention.

Detailed Description

In order to make the objects, technical solutions and advantages of the present invention more apparent, the technical solutions of the present invention will be described in detail below. It is to be understood that the described embodiments are merely exemplary of the invention, and not restrictive of the full scope of the invention. All other embodiments, which can be derived by a person skilled in the art from the examples given herein without any inventive step, are within the scope of the present invention.

A specific method for detecting an output result of image classification provided in the embodiment of the present application is described below with reference to the accompanying drawings.

As shown in FIG. 1, the method for detecting an image classification output result provided in the embodiment of the present application includes:

S1: constructing a multi-layer first classification network by analyzing the network layer by layer; the first classification network comprises convolutional layers and fully connected layers;

S2: the convolutional layers and fully connected layers apply linear and nonlinear processing to the original image to obtain feature maps, and the feature maps are converted to form training data for an introspection network; where the first classification network's prediction is wrong, the training label is 0; where the first classification network's prediction is correct, the training label is 1; the introspection network is a second classification network with hierarchical input;

S3: inputting the training samples into the introspection network;

S4: the output of the first classification network is processed and converted as in step S2 and then input into the trained introspection network, where the input into the introspection network is formed as follows: the feature map output by the first classification network is joined with the feature map output by the introspection network and used as the input of the next layer of the introspection network; whether the output of the first classification network is correct is judged from the output of the introspection network.
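The labeling rule in step S2 can be sketched as follows. This is a minimal illustration in plain Python, not the patent's implementation: the function name `build_introspection_samples`, the flattened feature vectors, and the class names are all hypothetical. Each sample pairs the features taken from the first classification network with a label of 1 when that network's prediction matched the ground truth and 0 when it did not.

```python
def build_introspection_samples(feature_maps, predictions, ground_truth):
    """Pair each image's converted feature maps with a 0/1 correctness label.

    Label 1: the first classification network predicted correctly.
    Label 0: the first classification network predicted wrongly.
    """
    if not (len(feature_maps) == len(predictions) == len(ground_truth)):
        raise ValueError("all inputs must describe the same number of images")
    samples = []
    for feats, predicted, actual in zip(feature_maps, predictions, ground_truth):
        label = 1 if predicted == actual else 0
        samples.append((feats, label))
    return samples

# Hypothetical data: three images, each with a tiny flattened feature vector.
features = [[0.2, 0.9], [0.7, 0.1], [0.4, 0.4]]
predicted = ["cat", "dog", "cat"]
actual = ["cat", "cat", "cat"]

samples = build_introspection_samples(features, predicted, actual)
labels = [label for _, label in samples]
print(labels)  # [1, 0, 1]
```

The resulting `(features, label)` pairs are what step S3 would feed into the introspection network for training.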

The specific working process is as follows: first, the original image is input into the first classification network; a layer of the first classification network passes its feature map, after linear and nonlinear processing, into the introspection network, and the introspection network outputs a fitted feature map;

second, the feature map output by the first classification network is joined with the feature map output by the introspection network and used as the input of the next layer of the introspection network; this continues up to the last layer;

the introspection network judges the prediction of the first classification network on the basis of the introspection network training data: it outputs 0 when it considers the prediction of the first classification network wrong, and 1 when it considers the prediction correct.

It should be noted that not every layer of the first classification network needs to feed its feature map into the introspection network for fitting; the choice may be set according to a preset value, provided that the size of the feature map output by the first classification network matches the size of the feature map expected at the corresponding input of the introspection network. For example, the feature maps output by any chosen layers of the first classification network can be selected and input into the introspection network; likewise, not every layer of the introspection network needs to receive an input.
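The hierarchical input described in the working process can be sketched as a forward pass. The sketch below rests on several assumptions that the source does not state: the "joining" of feature maps is read as concatenation of flattened vectors, each introspection layer is a small dense layer with tanh, the final layer has one sigmoid unit, and the weights are random rather than trained. The names `dense`, `make_layer`, and `introspect` are hypothetical.

```python
import math
import random

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def dense(inputs, weights, biases, activation):
    """One fully connected layer: activation(W * inputs + b)."""
    return [activation(sum(w * x for w, x in zip(row, inputs)) + b)
            for row, b in zip(weights, biases)]

def make_layer(rng, n_in, n_out):
    """Random, untrained weights; a real introspection network would be trained."""
    weights = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out

def introspect(classifier_features, layers):
    """Feed classifier feature vectors into the introspection layers in order.

    Each layer receives the classifier's features for that stage concatenated
    with the previous introspection layer's output (empty for the first stage).
    """
    if not layers or len(classifier_features) != len(layers):
        raise ValueError("need one introspection layer per tapped classifier layer")
    hidden = []
    for i, (feats, (weights, biases)) in enumerate(zip(classifier_features, layers)):
        joined = list(feats) + hidden
        if len(joined) != len(weights[0]):
            # Sizes must match, as the text requires.
            raise ValueError(f"stage {i}: expected {len(weights[0])} inputs, got {len(joined)}")
        is_last = i == len(layers) - 1
        hidden = dense(joined, weights, biases, sigmoid if is_last else math.tanh)
    return hidden[0]

rng = random.Random(0)
# Hypothetical flattened features tapped from conv1, conv2 and fc1.
feats = [[0.1] * 8, [0.3] * 6, [0.5] * 4]
layers = [make_layer(rng, 8, 3), make_layer(rng, 6 + 3, 3), make_layer(rng, 4 + 3, 1)]
score = introspect(feats, layers)
print(score)  # a value strictly between 0 and 1
```

The size check mirrors the condition that the classifier's feature map must match what the introspection network expects at that stage; skipping a classifier layer simply means leaving it out of `classifier_features` and sizing the introspection layers accordingly.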

In some embodiments, referring to FIG. 2, the first classification network includes a plurality of convolutional layers and a plurality of fully connected layers, the convolutional layers being designed as follows: the con1 layer takes the original image as input and performs convolution to obtain a feature map, and the output after linear and nonlinear processing is fed directly into the next convolutional layer, con2;

the con2 convolutional layer performs convolution on its input to obtain a feature map, and the output of the feature map after linear and nonlinear processing is fed directly into the next convolutional layer, con3, and so on up to the last layer.

The number of the convolution layers and the number of the full-connection layers can be set according to actual conditions.

Within a certain range, classification accuracy increases with the number of feature maps: more feature maps mean more features extracted from the input image and greater expressive power for the model. The number of feature maps should therefore be increased as far as computing capacity permits, which improves the quality of image feature extraction and, in turn, the analysis capability of the first classification network. However, when the introspection network judges the result of the first classification network, too many feature maps increase its workload and lead to a large model. The feature maps output by the layers of the first classification network are therefore limited: a preset value is set, and the way the introspection network is used is selected according to the number of feature maps output by the layers of the first classification network.

Specifically, when the number of feature maps output by the convolutional layers and fully connected layers of the first classification network is greater than the preset value,

the first classification network inputs only the feature maps output by the last a layers into the introspection network for judgment.

Alternatively, when the number of feature maps output by the convolutional layers and fully connected layers of the first classification network is greater than the preset value,

the first classification network inputs only the feature map output by the fully connected layer into the introspection network for judgment.

When the number of feature maps output by the convolutional layers and fully connected layers of the first classification network is not greater than the preset value,

the first classification network inputs all feature maps output by each layer into the introspection network for judgment.
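The three cases above can be written as a small selection rule. This is a sketch under stated assumptions: the source does not say exactly how the quantity compared with the preset value is measured, so the code assumes it is the total number of feature maps across all layers. The function name `select_layers`, the `(name, kind, count)` tuples, and the example counts are hypothetical.

```python
def select_layers(layers, preset, a=None, fc_only=False):
    """Choose which classifier layers feed the introspection network.

    layers: ordered list of (name, kind, feature_map_count), kind is "conv" or "fc".
    preset: the preset value; assumed here to bound the total feature-map count.
    a:      number of trailing layers to keep when the preset is exceeded.
    fc_only: when the preset is exceeded, keep only fully connected layers instead.
    """
    total = sum(count for _, _, count in layers)
    if total <= preset:
        # Not greater than the preset value: every layer is used.
        return [name for name, _, _ in layers]
    if fc_only:
        return [name for name, kind, _ in layers if kind == "fc"]
    if a is None or not 0 < a <= len(layers):
        raise ValueError("a must be between 1 and the number of layers")
    return [name for name, _, _ in layers[-a:]]

# Hypothetical network: three convolutional layers and one fully connected layer.
net = [("conv1", "conv", 64), ("conv2", "conv", 128),
       ("conv3", "conv", 256), ("fc1", "fc", 1)]

print(select_layers(net, preset=1000))               # ['conv1', 'conv2', 'conv3', 'fc1']
print(select_layers(net, preset=100, a=2))           # ['conv3', 'fc1']
print(select_layers(net, preset=100, fc_only=True))  # ['fc1']
```

Keeping the choice in one function makes the trade-off explicit: more tapped layers give the introspection network more evidence, at the cost of a larger model.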

As shown in FIG. 2, the case in which the number of feature maps output by the convolutional layers and fully connected layers of the first classification network is not greater than the preset value is taken as an example for a specific explanation:

In this embodiment of the application, the first classification network comprises 3 convolutional layers and 1 fully connected layer. The convolutional layer conv1 takes the original image as input and performs convolution to obtain a feature map; the output of the feature map after linear and nonlinear processing is fed directly into the next convolutional layer, conv(i-1), and is also data-fitted with the conv1 fitting value in the introspection network to obtain the conv(i-1) fitting value.

In the first classification network, the output of conv1 serves as the input of conv(i-1), which performs convolution on it to obtain a feature map; the output of that feature map after linear and nonlinear processing is fed directly into the next convolutional layer, conv(i). In addition, the output of conv(i-1) is data-fitted with the conv(i-1) fitting value in the introspection network to obtain the conv(i) fitting value.

The same operation continues in the first classification network up to the fully connected layer fc1, which outputs the image classification result of the first classification network. In addition, the feature map output by fc1 is fitted with the feature map output by fc1 in the introspection network to obtain the output feature map of fc2.

Optionally, the output of the introspection network is in the following way:

the output of the introspection network for the first classification network's result is compared with a set threshold;

when the output of the introspection network is smaller than the threshold value, determining that the output result of the first classification network is unreliable, and outputting 0;

and when the output of the introspection network is not less than the threshold value, determining that the output result of the first classification network is reliable, and outputting 1.

It should be noted that human intervention is allowed when the output result of the first classification network is not reliable.

In some embodiments, the threshold is set to 0.5: the output value is 1 when the output of the introspection network is not less than 0.5, and 0 when the output of the introspection network is less than 0.5.
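The thresholding rule, together with the note about human intervention, can be sketched in a few lines. The function names `judge` and `route` and the "accept"/"review" tags are hypothetical; the comparison itself follows the text, with a score not less than the threshold counted as reliable.

```python
def judge(score, threshold=0.5):
    """Return 1 (reliable) when the introspection score is not less than
    the threshold, otherwise 0 (unreliable)."""
    return 1 if score >= threshold else 0

def route(prediction, score, threshold=0.5):
    """Accept the classifier's prediction, or flag it for human review."""
    if judge(score, threshold) == 1:
        return ("accept", prediction)
    return ("review", prediction)

print(judge(0.5))         # 1
print(judge(0.49))        # 0
print(route("cat", 0.9))  # ('accept', 'cat')
print(route("cat", 0.2))  # ('review', 'cat')
```

Raising the threshold sends more predictions to review and lowers the chance of accepting a wrong one; the value 0.5 is simply the setting given in the embodiment.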

Optionally, the input of the introspection network is:

single-layer or multi-layer feature maps that have undergone linear and nonlinear processing.

In some embodiments, the introspection network can input the feature map of the single-layer output or the feature map of the multi-layer output.

Optionally, the input of the introspection network is:

the feature maps output by multiple layers, each joined with the feature map output by the previous layer.

Optionally, the input of the introspection network is:

multiple feature maps used as inputs to parallel branches of the network, which are merged at a later stage.
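The parallel-branch variant can be sketched as follows. The source does not describe what each branch computes or how the branches are merged, so everything here is an assumption made for illustration: each branch reduces its feature map to a mean and a maximum, the branch outputs are merged by concatenation, and a single sigmoid unit produces the score. The names `branch`, `merge_branches`, and `merged_score` are hypothetical.

```python
import math

def branch(feature_map):
    """Hypothetical branch: summarise one flattened feature map as [mean, max]."""
    return [sum(feature_map) / len(feature_map), max(feature_map)]

def merge_branches(feature_maps):
    """Run every feature map through its own branch, then concatenate the results."""
    merged = []
    for feature_map in feature_maps:
        merged.extend(branch(feature_map))
    return merged

def merged_score(feature_maps, weights, bias):
    """Late-stage merge followed by one sigmoid unit giving a score in (0, 1)."""
    merged = merge_branches(feature_maps)
    if len(merged) != len(weights):
        raise ValueError(f"expected {len(weights)} merged values, got {len(merged)}")
    z = sum(w * v for w, v in zip(weights, merged)) + bias
    return 1.0 / (1.0 + math.exp(-z))

# Two hypothetical feature maps taken from different classifier layers.
maps = [[0.0, 1.0], [2.0, 4.0]]
print(merge_branches(maps))                   # [0.5, 1.0, 3.0, 4.0]
print(merged_score(maps, [0.0] * 4, 0.0))     # 0.5 (all-zero weights)
```

Compared with the layer-by-layer chaining described earlier, this arrangement lets feature maps of different sizes be processed independently before they are combined.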

Optionally, the introspection network implements a binary comparator.

In summary, the present invention provides a detection method capable of detecting whether an image classification output result is correct. The introspection network performs data fitting on the output of the first classification network; when the output of the introspection network is smaller than a threshold, the output of the first classification network is determined to be incorrect and 0 is output; when the output of the introspection network is not less than the threshold, the output of the first classification network is determined to be correct and 1 is output. The reliability of the image classification result can thus be verified through the introspection network, and when the result is unreliable, human intervention can be carried out in time.

It is to be understood that the method embodiments provided above correspond to the method embodiments described above, and corresponding specific contents may be referred to each other, which are not described herein again.

It is understood that the same or similar parts in the above embodiments may be mutually referred to, and the same or similar parts in other embodiments may be referred to for the content which is not described in detail in some embodiments.

It should be noted that, in the description of the present application, the terms "first", "second", etc. are used for descriptive purposes only and are not to be construed as indicating or implying relative importance. Further, in the description of the present application, the meaning of "a plurality" means at least two unless otherwise specified.

Any process or method descriptions in flow charts or otherwise described herein may be understood as representing modules, segments, or portions of code which include one or more executable instructions for implementing specific logical functions or steps of the process, and the scope of the preferred embodiments of the present application includes other implementations in which functions may be executed out of order from that shown or discussed, including substantially concurrently or in reverse order, depending on the functionality involved, as would be understood by those reasonably skilled in the art of the present application.

In the description herein, reference to the description of the term "one embodiment," "some embodiments," "an example," "a specific example," or "some examples," etc., means that a particular feature, structure, material, or characteristic described in connection with the embodiment or example is included in at least one embodiment or example of the application. In this specification, the schematic representations of the terms used above do not necessarily refer to the same embodiment or example. Furthermore, the particular features, structures, materials, or characteristics described may be combined in any suitable manner in any one or more embodiments or examples.

The above description is only for the specific embodiments of the present invention, but the scope of the present invention is not limited thereto, and any person skilled in the art can easily conceive of the changes or substitutions within the technical scope of the present invention, and all the changes or substitutions should be covered within the scope of the present invention. Therefore, the protection scope of the present invention shall be subject to the protection scope of the appended claims.
