Method, device, terminal and storage medium for measuring generalization capability of deep neural network

Document No.: 1363418; Publication date: 2020-08-11

Summary: This technology, "Method, device, terminal and storage medium for measuring generalization capability of deep neural network", was designed by 毛宏亮, 黄敏峰 and 程宝平 on 2020-03-10. The embodiment of the invention relates to the field of artificial intelligence and discloses a method, a device, a terminal and a computer-readable storage medium for measuring the generalization ability of a deep neural network. The method includes: acquiring a deep neural network to be evaluated; acquiring a feature vector of a classification category output by the deep neural network; normalizing the feature vector, calculating the distribution area of the normalized feature vector on a high-dimensional unit spherical surface, and taking the distribution area as the measurement value of the generalization ability of the deep neural network; and outputting the measurement value of the generalization ability. The method can measure the generalization ability of a deep neural network and is simple to implement.

1. A method for measuring generalization ability of a deep neural network, comprising:

acquiring a deep neural network to be evaluated;

acquiring a feature vector of a classification category output by the deep neural network;

carrying out normalization processing on the feature vector, calculating the distribution area of the feature vector on a high-dimensional unit spherical surface after the normalization processing, and taking the distribution area as the measurement value of the generalization capability of the deep neural network;

and outputting the measurement value of the generalization ability.

2. The method according to claim 1, wherein the step of normalizing the feature vector and calculating the distribution area of the normalized feature vector on the high-dimensional unit sphere comprises:

normalizing the feature vectors to respectively generate points on a high-dimensional unit spherical surface;

obtaining a maximum convex hull of the spherical surface, wherein the maximum convex hull is a maximum convex polygon formed by all the points of the spherical surface;

and calculating the area of the maximum convex hull as a measure of the generalization ability of the deep neural network.

3. The method according to claim 2, wherein the step of calculating the area of the maximum convex hull specifically comprises:

converting the angle corresponding to each dimension of the maximum convex hull into a spherical coordinate value;

and calculating the area of the maximum convex hull through a spherical area calculation formula in the high-dimensional calculus.

4. The method according to claim 3, wherein the step of calculating the area of the maximum convex hull specifically comprises:

[formula image not reproduced in the source]

wherein:

A(d) is the area of the maximum convex hull;

d is the spatial dimension;

θ_i is the angle of the maximum convex hull in each dimension, i = 1, …, n, where i is the serial number of a point on the high-dimensional unit spherical surface;

n is the total number of points on the high-dimensional unit sphere.

5. The method of claim 4, wherein the measure of generalization ability ranges from 0 to the total surface area of the high-dimensional unit sphere, where 0 indicates that the generalization capability is zero, i.e., all inputs are classified as one class, and the upper bound represents the maximum classification capacity, i.e., the area of the whole sphere.

6. An apparatus for measuring generalization ability of a deep neural network, comprising:

the first acquisition module is used for acquiring a deep neural network to be evaluated;

the second acquisition module is used for acquiring the feature vectors of the classification categories output by the deep neural network;

the calculation module is used for carrying out normalization processing on the feature vector, calculating the distribution area of the feature vector on a high-dimensional unit spherical surface after the normalization processing, and taking the distribution area as the measurement value of the generalization capability of the deep neural network;

and the output unit is used for outputting the measurement value of the generalization ability.

7. The apparatus of claim 6, wherein the computing module comprises:

the normalization processing unit is used for performing normalization processing on the feature vectors and respectively generating points on a high-dimensional unit spherical surface;

the acquisition unit is used for acquiring the maximum convex hull of the spherical surface, wherein the maximum convex hull is a maximum convex polygon formed by all the points of the spherical surface;

and the calculating unit is used for calculating the area of the maximum convex hull as the measurement value of the generalization ability of the deep neural network.

8. The apparatus according to claim 7, wherein the computing unit is specifically configured to: convert the angle corresponding to each dimension of the maximum convex hull into a spherical coordinate value; and calculate the area of the maximum convex hull through a spherical area calculation formula in high-dimensional calculus.

9. A terminal, comprising:

at least one processor; and

a memory communicatively coupled to the at least one processor; wherein

the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the method for measuring the generalization ability of a deep neural network of any one of claims 1 to 5.

10. A computer-readable storage medium, storing a computer program, wherein the computer program, when executed by a processor, implements the method for measuring the generalization ability of a deep neural network according to any one of claims 1 to 5.

Technical Field

The embodiment of the invention relates to the field of artificial intelligence, in particular to a method, a device, a terminal and a computer readable storage medium for measuring the generalization ability of a deep neural network.

Background

With the development of deep learning technology, artificial intelligence built around deep neural networks has become increasingly well known. However, the design principles of neural networks in deep learning remain uncertain: deep learning is widely regarded as a "black box", and simple guiding principles are lacking.

At present, the generalization ability of a deep neural network is assessed largely with methods carried over from traditional machine learning. In practice, it is generally evaluated using the bias-variance principle together with a training set, a validation set and a test set. A neural network that performs well on these three data sets (in general only the validation set is required) is regarded as having strong generalization ability, so accuracy on the validation and test sets (usually only the validation set) is used as the quantitative measure of generalization ability. However, this traditional approach cannot give a reasonable generalization value for high-dimensional, over-parameterized deep neural networks.

Disclosure of Invention

The embodiment of the invention aims to provide a method, a device, a terminal and a computer readable storage medium for measuring the generalization ability of a deep neural network, which can measure the generalization ability of the deep neural network.

In order to solve the above technical problem, an embodiment of the present invention provides a method for measuring a generalization ability of a deep neural network, including:

acquiring a deep neural network to be evaluated;

acquiring a feature vector of a classification category output by the deep neural network;

carrying out normalization processing on the feature vector, calculating the distribution area of the feature vector on a high-dimensional unit spherical surface after the normalization processing, and taking the distribution area as the measurement value of the generalization capability of the deep neural network;

and outputting the measurement value of the generalization ability.

The embodiment of the invention also provides a device for measuring the generalization ability of the deep neural network, which comprises:

the first acquisition module is used for acquiring a deep neural network to be evaluated;

the second acquisition module is used for acquiring the feature vectors of the classification categories output by the deep neural network;

the calculation module is used for carrying out normalization processing on the feature vector, calculating the distribution area of the feature vector on a high-dimensional unit spherical surface after the normalization processing, and taking the distribution area as the measurement value of the generalization capability of the deep neural network;

and the output unit is used for outputting the measurement value of the generalization ability.

An embodiment of the present invention further provides a terminal, including:

at least one processor; and

a memory communicatively coupled to the at least one processor; wherein

the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the method for measuring the generalization ability of a deep neural network of any one of claims 1 to 5.

The embodiment of the invention also provides a computer readable storage medium, which stores a computer program, and the computer program is used for realizing the method for measuring the generalization ability of the deep neural network when being executed by a processor.

Compared with the prior art, the embodiment of the invention obtains a deep neural network to be evaluated; acquires a feature vector of a classification category output by the deep neural network; normalizes the feature vector, calculates the distribution area of the normalized feature vector on a high-dimensional unit spherical surface, and takes the distribution area as the measurement value of the generalization ability of the deep neural network; and outputs the measurement value of the generalization ability. The invention thus provides a way to measure the generalization ability of a deep neural network, and it is relatively simple to implement.

In addition, the step of normalizing the feature vector and calculating the distribution area of the feature vector on the high-dimensional unit spherical surface after the normalization processing includes: normalizing the feature vectors to respectively generate points on a high-dimensional unit spherical surface; obtaining a maximum convex hull of the spherical surface, wherein the maximum convex hull is a maximum convex polygon formed by all the points of the spherical surface; and calculating the area of the maximum convex hull as a measure of the generalization ability of the deep neural network. In the embodiment of the invention, the calculation process of the generalization measurement is given according to the normalized projection of the classification characteristic vector on the high-dimensional spherical surface, and the realization is simpler.

The foregoing description is only an overview of the technical solutions of the present invention, and the embodiments of the present invention are described below in order to make the technical means of the present invention more clearly understood and to make the above and other objects, features, and advantages of the present invention more clearly understandable.

Drawings

One or more embodiments are illustrated by way of example in the accompanying drawings, in which like reference numerals refer to similar elements and which are not to scale unless otherwise specified.

Fig. 1 is a schematic flow chart of a method for measuring the generalization ability of a deep neural network according to a first embodiment of the present invention;

Fig. 2 is a schematic flow chart of a method for measuring the generalization ability of a deep neural network according to an application scenario of the present invention;

Fig. 3 is a connection diagram of a device for measuring the generalization ability of a deep neural network according to another embodiment of the present invention;

Fig. 4 is a schematic diagram of a terminal according to another embodiment of the present invention.

Detailed Description

In order to make the objects, technical solutions and advantages of the embodiments of the present invention more apparent, the embodiments are described in detail below with reference to the accompanying drawings. Those of ordinary skill in the art will appreciate that numerous technical details are set forth in the various embodiments to give the reader a better understanding of the present application. However, the technical solution claimed in the present application can be implemented without these technical details, and with various changes and modifications based on the following embodiments. The following embodiments are divided for convenience of description and should not limit the specific implementation of the present invention; the embodiments may be combined with and refer to one another where there is no contradiction.

The first embodiment of the invention relates to a method for measuring the generalization ability of a deep neural network. The flow is shown in fig. 1, and specifically comprises the following steps:

Step 11, obtaining a deep neural network to be evaluated;

Step 12, obtaining the feature vector of the classification category output by the deep neural network. The deep neural network can be regarded as a classifier whose final output is a feature vector of the classification category;

Step 13, normalizing the feature vector, calculating the distribution area of the normalized feature vector on a high-dimensional unit spherical surface, and taking the distribution area as the measurement value of the generalization ability of the deep neural network;

Step 14, outputting the measurement value of the generalization ability. The measurement value can serve as one of the evaluation indexes of the deep neural network, for example as a reference when deciding whether to perform subsequent processing on the deep neural network.

In this embodiment, a deep neural network to be evaluated is obtained; a feature vector of a classification category output by the deep neural network is acquired; the feature vector is normalized, the distribution area of the normalized feature vector on a high-dimensional unit spherical surface is calculated, and that distribution area is taken as the measurement value of the generalization ability of the deep neural network; and the measurement value is output. The invention thus provides a way to measure the generalization ability of a deep neural network, and it is relatively simple to implement.

In one embodiment, step 13 comprises:

Step 131, normalizing the feature vectors so that each generates a point on a high-dimensional unit spherical surface;

Step 132, obtaining a maximum convex hull on the spherical surface, where the maximum convex hull is the maximum convex polygon formed by all of the points on the spherical surface;

Step 133, calculating the area of the maximum convex hull as the measure of the generalization ability of the deep neural network. The step of calculating the area of the maximum convex hull specifically comprises: converting the angle corresponding to each dimension of the maximum convex hull into a spherical coordinate value; and calculating the area of the maximum convex hull through a spherical area calculation formula in high-dimensional calculus.
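Step 131 can be sketched in a few lines. The following is a minimal illustration, assuming the feature vectors are available as rows of a NumPy array; the function name `normalize_to_unit_sphere` and the `eps` guard are hypothetical and do not come from the source.

```python
import numpy as np

def normalize_to_unit_sphere(features, eps=1e-12):
    """L2-normalize each row so that it becomes a point on the unit sphere.

    features: array-like of shape (n, d), one feature vector per row.
    eps: assumed guard against division by zero; an all-zero row stays zero.
    """
    features = np.asarray(features, dtype=float)
    norms = np.linalg.norm(features, axis=1, keepdims=True)
    return features / np.maximum(norms, eps)

# Usage: two 2-D feature vectors mapped onto the unit circle.
points = normalize_to_unit_sphere([[3.0, 4.0], [0.0, 2.0]])
```

After normalization only the direction of each feature vector remains, which is why the spread of the points over the sphere can be treated as an area.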

The step of calculating the area of the maximum convex hull specifically includes:

[formula image not reproduced in the source]

wherein A(d) is the area of the maximum convex hull;

d is the spatial dimension;

θ_i is the angle of the maximum convex hull in each dimension, i = 1, …, n, where i is the serial number of a point on the high-dimensional unit spherical surface;

n is the total number of points on the high-dimensional unit sphere.

The generalization ability measurement value ranges from 0 to the total surface area of the high-dimensional unit sphere, where 0 indicates that the generalization capability is zero, i.e., all inputs are classified as one class, and the upper bound represents the maximum classification capacity, i.e., the area of the whole sphere.
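For orientation only, and not as a reconstruction of the formula in the original filing, the standard calculus expression for the area of a region $\Omega$ on the unit sphere $S^{d-1} \subset \mathbb{R}^d$ in hyperspherical coordinates is given below. It assumes that $d$ denotes the dimension of the embedding space, which may differ from the convention used in the filing.

```latex
A(\Omega) = \int_{\Omega}
  \sin^{d-2}\theta_1 \, \sin^{d-3}\theta_2 \cdots \sin\theta_{d-2}
  \; d\theta_1 \, d\theta_2 \cdots d\theta_{d-1},
\qquad \theta_1, \dots, \theta_{d-2} \in [0, \pi], \quad \theta_{d-1} \in [0, 2\pi)

0 \;\le\; A(\Omega) \;\le\; A(S^{d-1}) = \frac{2\,\pi^{d/2}}{\Gamma(d/2)}
```

Under this convention the upper bound is $2\pi$ for $d = 2$ (the unit circle) and $4\pi$ for $d = 3$ (the ordinary sphere).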

The following describes an application scenario of the present invention. The application scenario relates to a practical computational tool that measures the generalization ability of a neural network within a given range. It uses the direct output of an existing neural network and does not change the network structure or workflow, so it is simple and practical.

The design idea of the invention is as follows. As shown in fig. 2, a current deep neural network can be regarded as a classifier whose final output is a feature vector of a classification category. The final feature vector is normalized onto a high-dimensional unit sphere using a technique commonly applied to improve generalization, in particular L2 normalization; through the normalization operation the norm of the final feature vector becomes 1, so the vector becomes a point on the sphere. The generalization ability of the deep neural network is then characterized by calculating the distribution area of the normalized classification feature vectors on the sphere.

The calculation of the high-dimensional unit spherical area (the measure defined in the present scheme is named ms-metric) includes the following steps:

The maximum convex hull of the points on the sphere, i.e., the maximum convex polygon of the finite point set on the sphere, is calculated. It can be obtained with reference to existing convex hull algorithms from computational geometry, with spherical coordinates used in place of ordinary Euclidean coordinates.
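The lowest-dimensional case makes the idea concrete. The sketch below assumes points on the unit circle (d = 2), where the spherical convex hull is simply the shortest arc containing all points, or the whole circle when no half-circle contains them. The function name `circle_hull_length` is hypothetical, and this special case is an illustration rather than the algorithm of the source.

```python
import numpy as np

def circle_hull_length(points):
    """Arc length of the spherical convex hull of points on the unit circle.

    points: array-like of shape (n, 2), assumed already L2-normalized.
    """
    points = np.asarray(points, dtype=float)
    angles = np.sort(np.arctan2(points[:, 1], points[:, 0]))
    # Gaps between consecutive angles, including the wrap-around gap.
    gaps = np.diff(np.append(angles, angles[0] + 2.0 * np.pi))
    largest_gap = gaps.max()
    if largest_gap < np.pi:
        # No half-circle contains all points: the hull is the whole circle.
        return 2.0 * np.pi
    # Otherwise the hull is the arc complementary to the largest empty gap.
    return 2.0 * np.pi - largest_gap
```

A single point gives length 0, matching the case where all inputs fall into one class; two points at a right angle give π/2; three evenly spaced points give the full circumference 2π.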

After the maximum convex hull on the spherical surface is found, the angles corresponding to the convex hull are converted into spherical coordinates, and the convex hull area is calculated with the spherical surface area formula from high-dimensional calculus. The general formula is as follows:

[formula image not reproduced in the source]

The formula can be evaluated by numerical integration or by the Monte Carlo method, where d is the spatial dimension and θ_i (i = 1, …, n) is the angle of the convex hull in each dimension;
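The Monte Carlo option can be sketched as follows. This is an illustration under stated assumptions, not the procedure of the source: the spherical convex hull is taken to be the intersection of the unit sphere with the convex cone spanned by the points, membership is tested by checking every d-point subset (Carathéodory's theorem, practical only for small n), and d is the dimension of the embedding space. The names `sphere_area` and `monte_carlo_hull_area` are hypothetical.

```python
import itertools
import math
import numpy as np

def sphere_area(d):
    """Total surface area of the unit sphere embedded in R^d."""
    return 2.0 * math.pi ** (d / 2.0) / math.gamma(d / 2.0)

def monte_carlo_hull_area(points, n_samples=20000, seed=0, tol=1e-12):
    """Estimate the area of the spherical convex hull of unit vectors.

    points: array-like of shape (n, d), rows assumed L2-normalized.
    Returns (fraction of uniform samples inside the hull) * (sphere area).
    """
    points = np.asarray(points, dtype=float)
    n, d = points.shape
    rng = np.random.default_rng(seed)
    # Normalized Gaussian vectors are uniformly distributed on the sphere.
    samples = rng.standard_normal((n_samples, d))
    samples /= np.linalg.norm(samples, axis=1, keepdims=True)
    inside = np.zeros(n_samples, dtype=bool)
    for subset in itertools.combinations(range(n), d):
        basis = points[list(subset)].T  # d x d, columns are the chosen points
        if abs(np.linalg.det(basis)) < 1e-12:
            continue  # degenerate subset spans no volume
        coeffs = np.linalg.solve(basis, samples.T)  # shape (d, n_samples)
        # A sample lies in this simplicial cone if all coefficients are >= 0.
        inside |= np.all(coeffs >= -tol, axis=0)
    return inside.mean() * sphere_area(d)

# Usage: the three coordinate axes in R^3 span one octant of the sphere.
octant = monte_carlo_hull_area(np.eye(3))
```

For the octant example the exact area is one eighth of 4π, i.e., π/2, so the estimate should land close to that value. If the points do not span the full space, no subset is non-degenerate and the estimate is 0.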

The maximum convex hull area obtained above is the generalization ability value of the neural network as defined by this technical scheme. For a d-dimensional spherical surface in general, the value ranges from 0 to the total surface area of that sphere, where 0 means that the generalization ability is zero, i.e., all inputs are classified as one class, and the upper end is the maximum classification capability, i.e., the area of the whole sphere.

Taken together, the above steps complete the calculation of the generalization value of the deep neural network.

The steps of the above methods are divided only for clarity of description. In implementation, steps may be combined into one step, or a step may be split into multiple steps; as long as the same logical relationship is preserved, such variants fall within the protection scope of this patent. Adding insignificant modifications to the algorithms or processes, or introducing insignificant design changes, without changing the core design of the algorithms or processes also falls within the protection scope of this patent.

The embodiment of the invention has the following beneficial effects:

There is currently a gap in techniques for computing the generalization ability of deep neural networks; existing work focuses mostly on how to improve generalization. The embodiment of the invention provides a method for measuring the generalization ability of a deep neural network that quantifies it through a computable process. Specifically: based on the normalized projection of the classification feature vectors onto the high-dimensional spherical surface, a calculation flow for measuring generalization ability is given using geometric and numerical integration methods; by using the category feature vectors of an existing neural network, the generalization ability of the deep neural network can be measured from the area occupied on the high-dimensional sphere.

Another embodiment of the invention relates to a device for measuring the generalization ability of a deep neural network. The structure is shown in fig. 3, and specifically comprises:

the first acquisition module is used for acquiring a deep neural network to be evaluated;

the second acquisition module is used for acquiring the feature vectors of the classification categories output by the deep neural network;

the calculation module is used for carrying out normalization processing on the feature vector, calculating the distribution area of the feature vector on a high-dimensional unit spherical surface after the normalization processing, and taking the distribution area as the measurement value of the generalization capability of the deep neural network;

and the output unit is used for outputting the measurement value of the generalization ability.

The calculation module comprises:

the normalization processing unit is used for performing normalization processing on the feature vectors and respectively generating points on a high-dimensional unit spherical surface;

the acquisition unit is used for acquiring the maximum convex hull of the spherical surface, wherein the maximum convex hull is a maximum convex polygon formed by all the points of the spherical surface;

and the calculating unit is used for calculating the area of the maximum convex hull as the measurement value of the generalization ability of the deep neural network. The calculating unit is specifically configured to: convert the angle corresponding to each dimension of the maximum convex hull into a spherical coordinate value; and calculate the area of the maximum convex hull through a spherical area calculation formula in high-dimensional calculus.

It should be noted that each module referred to in this embodiment is a logical module, and in practical applications, one logical unit may be one physical unit, may be a part of one physical unit, and may be implemented by a combination of multiple physical units. In addition, in order to highlight the innovative part of the present invention, elements that are not so closely related to solving the technical problems proposed by the present invention are not introduced in the present embodiment, but this does not indicate that other elements are not present in the present embodiment.

The embodiment of the present invention further relates to a terminal, as shown in fig. 4, including:

at least one processor; and

a memory communicatively coupled to the at least one processor; wherein

the memory stores instructions executable by the at least one processor to enable the at least one processor to perform a method for measuring a generalization ability of the deep neural network.

Where the memory and processor are connected by a bus, the bus may comprise any number of interconnected buses and bridges, the buses connecting together one or more of the various circuits of the processor and the memory. The bus may also connect various other circuits such as peripherals, voltage regulators, power management circuits, and the like, which are well known in the art, and therefore, will not be described any further herein. A bus interface provides an interface between the bus and the transceiver. The transceiver may be one element or a plurality of elements, such as a plurality of receivers and transmitters, providing a means for communicating with various other apparatus over a transmission medium. The data processed by the processor is transmitted over a wireless medium via an antenna, which further receives the data and transmits the data to the processor.

The processor is responsible for managing the bus and general processing and may also provide various functions including timing, peripheral interfaces, voltage regulation, power management, and other control functions. And the memory may be used to store data used by the processor in performing operations.

Another embodiment of the present invention relates to a computer-readable storage medium storing a computer program. The computer program realizes the above-described method embodiments when executed by a processor.

That is, as can be understood by those skilled in the art, all or part of the steps in the method for implementing the embodiments described above may be implemented by a program instructing related hardware, where the program is stored in a storage medium and includes several instructions to enable a device (which may be a single chip, a chip, or the like) or a processor (processor) to execute all or part of the steps of the method described in the embodiments of the present application. And the aforementioned storage medium includes: a U-disk, a removable hard disk, a Read-only Memory (ROM), a Random Access Memory (RAM), a magnetic disk or an optical disk, and other various media capable of storing program codes.

It will be understood by those of ordinary skill in the art that the foregoing embodiments are specific examples for carrying out the invention, and that various changes in form and details may be made therein without departing from the spirit and scope of the invention in practice.
