Method of operating a search framework system

Document No.: 1429268. Publication date: 2020-03-17.

Abstract: This technology, "Method of operating a search framework system", was devised by 伍捷, 苏俊杰, and 刘峻诚 on 2019-09-06. The invention provides a method of operating a search framework system that comprises arithmetic operation hardware. The method comprises: inputting input data and a plurality of reset parameters into an automatic architecture search framework of the arithmetic operation hardware; performing, by the automatic architecture search framework, a plurality of arithmetic operations to search for an optimized convolutional neural network (CNN) model; and outputting the optimized CNN model. The invention can build an optimized CNN model under the constraints of resettable hardware settings, within a customized model size and an acceptable computational complexity.

1. A method of operating a search framework system, the search framework system comprising arithmetic operation hardware, the method comprising:

inputting input data and a plurality of reset parameters into an automatic architecture search framework of the arithmetic operation hardware;

performing, by the automatic architecture search framework, a plurality of arithmetic operations to search for an optimized convolutional neural network (CNN) model; and

outputting the optimized CNN model.

2. The method of claim 1, wherein the optimized CNN model comprises classification, object detection, and/or segmentation.

3. The method of claim 1, wherein the input data is multimedia data comprising a plurality of images and/or sounds.

4. The method of claim 1, wherein the plurality of reset parameters are related to a memory size and a computing power of the arithmetic operation hardware.

5. The method of claim 1, wherein performing, by the automatic architecture search framework, the plurality of arithmetic operations to search for the optimized CNN model comprises:

inputting CNN data into an architecture generator to generate updated CNN data;

inputting the updated CNN data into an augmented reward neural network to generate enhanced CNN data; and

outputting the optimized CNN model when a validation accuracy reaches a predetermined value.

6. The method of claim 5, wherein performing, by the automatic architecture search framework, the plurality of arithmetic operations to search for the optimized CNN model further comprises:

inputting the enhanced CNN data into the architecture generator.

7. The method of claim 5, wherein the CNN data comprises a plurality of convolutional layers, a plurality of activation layers, and a plurality of pooling layers.

8. The method of claim 7, wherein the plurality of convolutional layers comprise a number of filters, a convolution kernel size, and a plurality of bias parameters.

9. The method of claim 7, wherein the plurality of activation layers comprise a leaky rectified linear unit (leaky ReLU), a ReLU, a parametric ReLU, a sigmoid function, and a softmax function.

10. The method of claim 7, wherein the plurality of pooling layers comprise a stride and a kernel size.

11. The method of claim 5, wherein the augmented reward neural network comprises a plurality of reward functions.

12. The method of claim 5, wherein inputting the CNN data to the architecture generator to generate the updated CNN data comprises:

inputting the CNN data and initial hidden data into a hidden layer to perform a hidden layer operation to generate hidden layer data;

inputting the hidden layer data into a fully connected layer to perform a fully connected operation to generate fully connected data;

inputting the fully connected data into an embedding vector to perform an embedding operation to generate embedded data;

inputting the embedded data into a decoder to generate decoded data; and

outputting the updated CNN data when the number of layers of the CNN data exceeds a predetermined number.

13. The method of claim 12, wherein inputting the CNN data to the architecture generator to generate the updated CNN data further comprises:

inputting the decoded data and the hidden layer data into a next hidden layer to perform a next hidden layer operation.

14. The method of claim 12, wherein the hidden layer is a recurrent neural network.

15. The method of claim 12, wherein the hidden layer operation comprises a plurality of weight, bias, and activation arithmetic operations.

16. The method of claim 12, wherein the fully connected operation comprises a plurality of weight, bias, and activation arithmetic operations.

17. The method of claim 12, wherein the embedding operation comprises concatenating convolutional layers and activation layers of the fully connected data.

Technical Field

The present invention relates to machine learning technology, and more particularly to a search framework system that can be configured to search for an optimal neural network model for different hardware constraints.

Background

Convolutional neural networks (CNNs) are regarded as the most notable class of neural networks and have been highly successful in machine learning applications such as image recognition, image classification, speech recognition, natural language processing, and video classification. To achieve better performance, CNN architectures have become increasingly complex, and they require large data sets, high computational power, and large amounts of storage memory. This makes them difficult to implement on embedded systems with less storage memory, lower computational power, and limited resources, such as mobile phones and video screens.

More specifically, hardware settings may differ among different devices. Different devices have different capabilities to support the associated CNN architecture. In order to achieve the best performance of the application under the constraint of the resettable hardware, it is very important to find the best CNN architecture meeting the hardware constraint.

Disclosure of Invention

An embodiment of the invention provides a method of operating a search framework system, wherein the search framework system comprises arithmetic operation hardware. The method comprises: inputting input data and a plurality of reset parameters into an automatic architecture search framework of the arithmetic operation hardware; performing, by the automatic architecture search framework, a plurality of arithmetic operations to search for an optimized convolutional neural network (CNN) model; and outputting the optimized CNN model. The invention can build an optimized CNN model under the constraints of resettable hardware settings, within a customized model size and an acceptable computational complexity.

Drawings

FIG. 1 is a block diagram of a search framework system according to an embodiment of the present invention.

FIG. 2 shows a block diagram of the automated architecture search framework of FIG. 1.

FIG. 3 is a block diagram of the architecture generator of FIG. 2.

Reference numerals:

100 search framework system

102 input data

104 reset parameter

106 automated architecture search framework

108 arithmetic operation hardware

110 optimized CNN model

200 architecture generator

201 initial input data

202 updated CNN data

210 augmented reward neural network

212 enhanced CNN data

302 initial hidden data

303 first hidden layer

304 first hidden layer data

305 first fully connected layer

306 first fully connected data

307 first embedding vector

308 first embedded data

310 decoder

311 first decoded data

313 second hidden layer

314 second hidden layer data

315 second fully connected layer

316 second fully connected data

317 second embedding vector

318 second embedded data

Detailed Description

The invention provides an automatic architecture search framework (AUTO-ARS) that outputs an optimized convolutional neural network (CNN) model under resettable hardware constraints.

FIG. 1 is a block diagram of a search framework system 100 according to an embodiment of the present invention. The search framework system 100 includes arithmetic operation hardware 108, on which an automated architecture search framework 106 executes. Input data 102 and reset parameters 104 are input into the automated architecture search framework 106. The automated architecture search framework 106 performs arithmetic operations to search for the optimized CNN model 110. The optimized CNN model 110 is optimized CNN data that meets the hardware constraints.

The reset parameters 104 include hardware setting parameters such as the memory size and computing power of the arithmetic operation hardware 108. The input data 102 may be multimedia data such as images and/or sound. The optimized CNN model 110 may be applied to tasks such as classification, object detection, and segmentation.
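As an illustration of how reset parameters could constrain a candidate model, the following minimal sketch checks a list of convolutional layers against a memory limit and a compute limit. The names ResetParameters, conv_cost, and fits, the use of multiply-accumulate (MAC) counts as the measure of computing power, and the 4-byte weight size are all assumptions made for this example; the source does not specify how the limits are expressed.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class ResetParameters:
    """Hypothetical container for the hardware limits (not from the source)."""
    memory_bytes: int   # memory assumed available for model weights
    max_macs: int       # assumed multiply-accumulate budget per inference


def conv_cost(in_ch: int, out_ch: int, kernel: int, out_h: int, out_w: int,
              bytes_per_weight: int = 4) -> tuple[int, int]:
    """Return (weight bytes, MACs) for one square-kernel conv layer with bias."""
    weights = out_ch * (in_ch * kernel * kernel + 1)   # +1 for the bias term
    macs = out_ch * in_ch * kernel * kernel * out_h * out_w
    return weights * bytes_per_weight, macs


def fits(layers: list[tuple[int, int]], limits: ResetParameters) -> bool:
    """True when the summed cost of all layers stays within both limits."""
    total_bytes = sum(b for b, _ in layers)
    total_macs = sum(m for _, m in layers)
    return total_bytes <= limits.memory_bytes and total_macs <= limits.max_macs


limits = ResetParameters(memory_bytes=64 * 1024, max_macs=5_000_000)
small = [conv_cost(3, 16, 3, 32, 32), conv_cost(16, 32, 3, 16, 16)]
large = small + [conv_cost(32, 256, 3, 16, 16)]
print(fits(small, limits), fits(large, limits))  # True False
```

A search framework of this kind could use such a check to reject candidates before training them, although the source does not say at which point the constraints are enforced.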

FIG. 2 shows a block diagram of the automated architecture search framework 106. The automated architecture search framework 106 is implemented by an architecture generator 200 and an augmented reward neural network 210. In the automated architecture search framework 106, initial input data 201 is input to the architecture generator 200 to generate updated CNN data 202. The initial input data 201 may be multimedia data, including images and/or sound. The updated CNN data 202 is then input to the augmented reward neural network 210 to generate enhanced CNN data 212. The enhanced CNN data 212 may in turn be input to the architecture generator 200 to regenerate the updated CNN data 202. In other words, the architecture generator 200 and the augmented reward neural network 210 form a recursive loop that repeatedly updates the updated CNN data 202 and the enhanced CNN data 212. When the validation accuracy reaches a predetermined value, the recursive update procedure terminates and the optimized CNN model is output.
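The recursive loop between the architecture generator 200 and the augmented reward neural network 210 can be sketched as follows. This is a schematic under stated assumptions: the two components are passed in as plain callables, CNN data is represented as a list of layer tokens, and the max_rounds cap is a safety limit added for the example rather than something the source describes.

```python
from typing import Callable, List, Optional, Tuple

CnnData = List[str]  # hypothetical: a CNN described as a list of layer tokens


def search(generate: Callable[[Optional[CnnData]], CnnData],
           reward: Callable[[CnnData], Tuple[CnnData, float]],
           target_accuracy: float,
           max_rounds: int = 100) -> Tuple[CnnData, float, int]:
    """Alternate generator and reward network until the accuracy target is met."""
    feedback: Optional[CnnData] = None        # no enhanced CNN data before round 1
    best: CnnData = []
    best_acc = float("-inf")
    for round_no in range(1, max_rounds + 1):
        updated = generate(feedback)          # role of architecture generator 200
        enhanced, acc = reward(updated)       # role of reward neural network 210
        if acc > best_acc:
            best, best_acc = enhanced, acc
        if acc >= target_accuracy:            # termination condition from the text
            return enhanced, acc, round_no
        feedback = enhanced                   # feed enhanced data back to the generator
    return best, best_acc, max_rounds         # assumed fallback if the target is never met


def toy_generate(prev: Optional[CnnData]) -> CnnData:
    """Toy generator: append one layer token per round."""
    return (prev or []) + ["conv3x3"]


def toy_reward(cnn: CnnData) -> Tuple[CnnData, float]:
    """Toy stand-in for training: accuracy grows with depth."""
    return cnn, min(1.0, len(cnn) / 5)


model, acc, rounds = search(toy_generate, toy_reward, target_accuracy=0.75)
print(len(model), acc, rounds)  # 4 0.8 4
```

In a real system the reward callable would train the candidate network and measure validation accuracy, which is by far the most expensive step of each round.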

FIG. 3 shows a block diagram of the architecture generator 200. The architecture generator 200 is implemented by a recurrent neural network. In the architecture generator 200, the initial input data 201 and the initial hidden data 302 are input to the first hidden layer 303 to perform a hidden layer operation that generates first hidden layer data 304. The hidden layer operation includes weight, bias, and activation arithmetic operations. The first hidden layer data 304 is then input to the first fully connected layer 305 to perform a fully connected operation that generates first fully connected data 306. The fully connected operation likewise includes weight, bias, and activation arithmetic operations. The first fully connected data 306 is then input to a first embedding vector 307 to perform an embedding operation. The first embedding vector 307 concatenates the convolutional layer and the activation layer of the first fully connected data 306 to generate first embedded data 308.
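The weight, bias, and activation arithmetic of the hidden layer operation and the fully connected operation can be illustrated with a small pure-Python sketch. The choice of tanh for the hidden layer and softmax for the fully connected layer, and all weight values, are assumptions made for this example; the source states only that each operation consists of weight, bias, and activation arithmetic.

```python
import math
from typing import List

Vector = List[float]
Matrix = List[List[float]]


def affine(w: Matrix, x: Vector, b: Vector) -> Vector:
    """Weight and bias step: w @ x + b."""
    return [sum(wi * xi for wi, xi in zip(row, x)) + bi for row, bi in zip(w, b)]


def hidden_step(x: Vector, h_prev: Vector, w_x: Matrix, w_h: Matrix, b: Vector) -> Vector:
    """Hidden layer operation (assumed form): tanh(w_x @ x + w_h @ h_prev + b)."""
    zx = affine(w_x, x, b)
    zh = affine(w_h, h_prev, [0.0] * len(b))
    return [math.tanh(a + c) for a, c in zip(zx, zh)]


def softmax(z: Vector) -> Vector:
    """Numerically stable softmax."""
    m = max(z)
    e = [math.exp(v - m) for v in z]
    s = sum(e)
    return [v / s for v in e]


def fully_connected(h: Vector, w: Matrix, b: Vector) -> Vector:
    """Fully connected operation (assumed form): softmax(w @ h + b)."""
    return softmax(affine(w, h, b))


# Toy values standing in for initial input data 201 and initial hidden data 302.
x0, h0 = [1.0, 0.0], [0.0, 0.0]
h1 = hidden_step(x0, h0,
                 w_x=[[0.5, -0.5], [0.25, 0.75]],
                 w_h=[[1.0, 0.0], [0.0, 1.0]],
                 b=[0.0, 0.0])
probs = fully_connected(h1, w=[[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]], b=[0.0, 0.0, 0.0])
print(probs.index(max(probs)))  # 2
```

Here the fully connected output is read as a probability distribution over three candidate tokens, which is one common way to let a recurrent controller choose layer hyperparameters.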

The second stage of the recurrent neural network is then performed. The first embedded data 308 is input to a decoder 310 to generate first decoded data 311. The first decoded data 311 and the first hidden layer data 304 are then input into a second hidden layer 313 to perform a hidden layer operation that generates second hidden layer data 314. Next, the second hidden layer data 314 is input to a second fully connected layer 315 to perform a fully connected operation that generates second fully connected data 316. The second fully connected data 316 is then input to a second embedding vector 317 to perform an embedding operation that generates second embedded data 318.

The third stage of the recurrent neural network then proceeds in the same way. The process continues through successive stages of the recurrent neural network until the number of layers of the CNN data exceeds a predetermined value, at which point the updated CNN data is output to the augmented reward neural network 210. In some embodiments, if the validation accuracy reaches its predetermined value before the number of layers of the CNN data exceeds its predetermined value, the updated CNN data is output as the optimized CNN model. In other embodiments, even if the validation accuracy reaches its predetermined value early, the CNN data continues to be updated until every stage of the recurrent neural network has produced updated CNN data, and the augmented reward neural network 210 then outputs the latest CNN data as the optimized CNN model.
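The stage-by-stage unrolling and its stopping rule can be sketched as below. Everything numeric here is a toy assumption chosen so the example is easy to trace by hand: the three-token vocabulary, the embedding table, the unit hidden weights, and the decoder modeled as a fixed linear map. Only the data flow (hidden layer, fully connected layer, embedding, decoder, next hidden layer) and the stop condition follow the description.

```python
import math
from typing import List, Tuple

TOKENS = ["conv3x3", "relu", "maxpool2"]          # hypothetical token vocabulary
EMBED = [[-2.0, 4.0], [-4.0, -4.0], [4.0, 0.0]]   # hypothetical embedding row per token


def stage(x: List[float], h_prev: List[float]) -> Tuple[int, List[float], List[float]]:
    """Run one stage and return (token id, decoded data, hidden layer data)."""
    # Hidden layer operation with toy unit weights and zero bias.
    h = [math.tanh(a + b) for a, b in zip(x, h_prev)]
    # Fully connected operation: one score per token, with fixed toy weights.
    scores = [h[0], h[1], -(h[0] + h[1])]
    token_id = max(range(len(scores)), key=lambda i: scores[i])
    embedded = EMBED[token_id]                     # embedding operation
    decoded = [0.5 * v for v in embedded]          # stand-in decoder: fixed linear map
    return token_id, decoded, h


def generate(initial_input: List[float], max_layers: int) -> List[str]:
    """Unroll stages until the layer count exceeds max_layers."""
    x = initial_input
    h = [0.0] * len(initial_input)                 # initial hidden data, assumed zero
    layers: List[str] = []
    while len(layers) <= max_layers:
        token_id, x, h = stage(x, h)               # decoded data feeds the next stage
        layers.append(TOKENS[token_id])
    return layers


layers = generate([1.0, 0.0], max_layers=3)
print(layers)  # ['conv3x3', 'relu', 'maxpool2', 'conv3x3']
```

Because the loop stops only once the count exceeds the limit, a limit of 3 yields four layers, which matches the "exceeds a predetermined value" wording.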

The CNN data includes convolutional layers, activation layers, and pooling layers. A convolutional layer is described by the number of filters, the convolution kernel size, and the bias parameters. An activation layer is one of a leaky rectified linear unit (leaky ReLU), a ReLU, a parametric ReLU, a sigmoid function, or a softmax function. A pooling layer is described by the stride and the kernel size.
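One possible in-memory representation of this CNN data is sketched below. The class and field names are hypothetical, and modeling the bias parameters as a single use_bias flag is a simplification made for the example.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List, Union


class Activation(Enum):
    LEAKY_RELU = "leaky_relu"
    RELU = "relu"
    PRELU = "prelu"          # parametric ReLU
    SIGMOID = "sigmoid"
    SOFTMAX = "softmax"


@dataclass
class ConvLayer:
    num_filters: int
    kernel_size: int
    use_bias: bool = True    # simplification of the "bias parameters"


@dataclass
class ActivationLayer:
    kind: Activation


@dataclass
class PoolingLayer:
    stride: int
    kernel_size: int


Layer = Union[ConvLayer, ActivationLayer, PoolingLayer]


def describe(cnn: List[Layer]) -> str:
    """Render CNN data as a compact, human-readable chain of layers."""
    parts = []
    for layer in cnn:
        if isinstance(layer, ConvLayer):
            parts.append(f"conv({layer.num_filters}, k={layer.kernel_size})")
        elif isinstance(layer, ActivationLayer):
            parts.append(layer.kind.value)
        else:
            parts.append(f"pool(k={layer.kernel_size}, s={layer.stride})")
    return " -> ".join(parts)


cnn = [ConvLayer(16, 3), ActivationLayer(Activation.RELU), PoolingLayer(stride=2, kernel_size=2)]
print(describe(cnn))  # conv(16, k=3) -> relu -> pool(k=2, s=2)
```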

The search framework system 100 can be configured for different hardware constraints. The search framework system 100 combines the CNN, the architecture generator 200, and the augmented reward neural network 210 to search for the optimized CNN model 110. The architecture generator 200 predicts the components of the neural network, such as convolutional layers with their number of filters, convolution kernel size, and bias parameters, and activation layers with their various activation functions. The architecture generator 200 generates these hyperparameters as a sequence of tokens. More specifically, a convolutional layer has its own tokens, such as the number of filters, the convolution kernel size, and the bias parameters. An activation layer has its own tokens, such as leaky ReLU, ReLU, parametric ReLU, sigmoid function, and softmax function. A pooling layer has its own tokens, such as the stride and the kernel size. All the tokens of the different kinds of layers are drawn from a pool defined by the resettable hardware settings.
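The idea of a token pool that depends on the hardware settings can be illustrated as follows. The token names, the candidate values, and the two limits max_filters and max_kernel are hypothetical; the source does not list the actual token values or say which hardware properties restrict them.

```python
from typing import Dict, List

# Hypothetical full vocabulary of numeric hyperparameter tokens per layer kind.
FULL_POOL: Dict[str, List[int]] = {
    "conv.num_filters": [16, 32, 64, 128, 256],
    "conv.kernel_size": [1, 3, 5, 7],
    "pool.kernel_size": [2, 3],
    "pool.stride": [1, 2],
}

# Activation tokens named in the description; assumed not to depend on the hardware.
ACTIVATION_TOKENS = ["leaky_relu", "relu", "prelu", "sigmoid", "softmax"]


def restrict_pool(pool: Dict[str, List[int]], max_filters: int,
                  max_kernel: int) -> Dict[str, List[int]]:
    """Keep only the token values that the target hardware is assumed to support."""
    limits = {
        "conv.num_filters": max_filters,
        "conv.kernel_size": max_kernel,
        "pool.kernel_size": max_kernel,
    }
    return {
        name: [v for v in values if name not in limits or v <= limits[name]]
        for name, values in pool.items()
    }


pool = restrict_pool(FULL_POOL, max_filters=64, max_kernel=3)
print(pool["conv.num_filters"], pool["conv.kernel_size"])  # [16, 32, 64] [1, 3]
```

Restricting the pool before generation means the generator can only emit architectures built from supported values, which is one plausible reading of tokens being "in a pool of resettable hardware settings".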

When the number of layers of the CNN data exceeds a predetermined value, the process of updating the CNN data stops. Once the architecture generator 200 finishes updating the CNN data, a feedforward neural network that meets the constraints of the resettable hardware settings is constructed and passed to the augmented reward neural network 210 for training. The augmented reward neural network 210 takes the CNN data and trains it until convergence. The validation accuracy of the proposed neural network is defined as the optimization result. Using the policy gradient method with validation accuracy as the design metric, the architecture generator 200 updates its parameters to generate better CNN data over time. An optimized CNN model can be constructed by updating the hidden layers. By applying the proposed technique, an optimized CNN model can be built under the constraints of resettable hardware settings, within a customized model size and an acceptable computational complexity.
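A minimal sketch of a policy gradient update is given below, using the REINFORCE rule with a moving-average baseline. The source names only "the policy gradient method", so REINFORCE, the baseline, the independent per-position softmax policy, and the toy reward function (a stand-in for validation accuracy) are all assumptions made for this example.

```python
import math
import random
from typing import Callable, List, Sequence


def softmax(logits: Sequence[float]) -> List[float]:
    """Numerically stable softmax."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def reinforce_update(logits: List[List[float]], seq: Sequence[int],
                     advantage: float, lr: float) -> None:
    """In-place REINFORCE step: logits += lr * advantage * grad log pi(seq)."""
    for row, chosen in zip(logits, seq):
        probs = softmax(row)
        for k in range(len(row)):
            indicator = 1.0 if k == chosen else 0.0
            row[k] += lr * advantage * (indicator - probs[k])


def train(reward_fn: Callable[[Sequence[int]], float], num_tokens: int, seq_len: int,
          steps: int = 200, lr: float = 0.5, seed: int = 0) -> List[List[float]]:
    """Sample token sequences, score them, and nudge the policy toward high reward."""
    rng = random.Random(seed)
    logits = [[0.0] * num_tokens for _ in range(seq_len)]
    baseline = 0.0
    for _ in range(steps):
        seq = [rng.choices(range(num_tokens), weights=softmax(row))[0] for row in logits]
        reward = reward_fn(seq)                        # stand-in for validation accuracy
        reinforce_update(logits, seq, reward - baseline, lr)
        baseline = 0.9 * baseline + 0.1 * reward       # moving-average baseline
    return logits


def toy_reward(seq: Sequence[int]) -> float:
    """Toy reward: fraction of positions that match a fixed target sequence."""
    target = [0, 2, 1]
    return sum(a == b for a, b in zip(seq, target)) / len(target)


trained = train(toy_reward, num_tokens=3, seq_len=3)
print([row.index(max(row)) for row in trained])
```

With a positive advantage the update raises the probability of every sampled token, and with a negative advantage it lowers it, so sequences that score above the running average become more likely over time.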

The above description is only a preferred embodiment of the present invention, and all equivalent changes and modifications made within the scope of the claims of the present invention should be covered by the present invention.
