Data processing device and artificial intelligence processor

Document No.: 303616    Publication date: 2021-11-26

This technology, "Data processing device and artificial intelligence processor," was designed and created by 裴京, 马骋, 王冠睿 and 施路平 on 2021-08-27. Abstract: The present disclosure relates to a data processing apparatus and an artificial intelligence processor. The data processing apparatus is applied to a processing core of an artificial intelligence processor, the artificial intelligence processor including a plurality of processing cores, each processing core including a storage module and a data processing apparatus connected to the storage module. The data processing apparatus includes: an address generation module configured to generate an input address and an output address according to a control instruction; and a data conversion module, connected to the address generation module, configured to read first data from the storage module according to the input address, perform a data conversion operation on the first data to obtain second data, and write the second data into the output address of the storage module. The data processing apparatus of the embodiments of the present disclosure can perform data integration on the data stored in the storage module.

1. A data processing apparatus, applied to a processing core of an artificial intelligence processor, wherein the artificial intelligence processor includes a plurality of processing cores, each processing core includes a storage module and a data processing apparatus, the data processing apparatus is connected to the storage module, and the data processing apparatus includes:

an address generation module configured to generate an input address and an output address according to a control instruction; and

a data conversion module, connected to the address generation module, configured to read first data from the storage module according to the input address, perform a data conversion operation on the first data to obtain second data, and write the second data into the output address of the storage module,

wherein the data conversion operation comprises at least one of a data merging operation, a data splitting operation, a data migration operation, and a data type conversion operation.

2. The apparatus of claim 1, wherein the data conversion operation comprises the data merging operation,

the address generation module is configured to generate a plurality of input addresses and the output address according to the control instruction; and

the data conversion module is configured to read a plurality of first data from the storage module according to the plurality of input addresses, and to write second data generated by merging the plurality of first data into the output address of the storage module.

3. The apparatus of claim 1, wherein the data conversion operation comprises the data splitting operation,

the address generation module is configured to generate the input address and a plurality of output addresses according to the control instruction; and

the data conversion module is configured to read the first data from the storage module according to the input address, and to write a plurality of second data obtained by splitting the first data into the plurality of output addresses of the storage module, respectively.

4. The apparatus of claim 1, wherein the data conversion operation comprises the data migration operation,

the address generation module is configured to generate the input address and the output address according to the control instruction; and

the data conversion module is configured to read the first data from the storage module according to the input address, and to write second data determined according to the first data into the output address of the storage module.

5. The apparatus of claim 1, wherein the data conversion operation comprises the data type conversion operation,

the data conversion module is configured to perform the data type conversion operation on the first data to obtain the second data when a data type of the first data is different from a data type of the second data,

wherein the data type includes any one of a 32-bit integer int32, an 8-bit integer int8, and a three-valued data type.

6. The apparatus of claim 1, further comprising a line counting module,

wherein the line counting module is configured to count according to count pulses sent by a control module of the processing core, and to send the control instruction to the address generation module when the count is greater than or equal to a line number threshold.

7. The apparatus of claim 1, wherein the address generation module comprises a plurality of sets of counters,

and the plurality of sets of counters are configured to generate the input address and the output address, and to control, through counter logic, the data conversion module to perform the data conversion operation.

8. The apparatus of claim 7, wherein the address generation module comprises a first set of counters, a second set of counters, and a third set of counters,

the first set of counters is configured to generate a first address, the first address being an address of the input data;

the second set of counters is configured to generate a second address; and

the third set of counters is configured to generate a third address, the third address being an address of the output data,

wherein, when a data merging operation is performed, the second address is the input address; when a data splitting operation is performed, the second address is the output address; and when a data migration operation is performed, generation of the second address is skipped.

9. The apparatus of claim 1, wherein the output address comprises an address of a data sending area of a routing module located in the processing core,

and the data conversion module is configured to write the second data into the data sending area of the routing module, so that the routing module sends the second data.

10. An artificial intelligence processor comprising a plurality of processing cores, each processing core comprising a storage module and a data processing apparatus according to any one of claims 1 to 9.

Technical Field

The present disclosure relates to the field of computer technologies, and in particular, to a data processing apparatus and an artificial intelligence processor.

Background

In recent years, the field of neuromorphic computing has developed rapidly; neural networks can be constructed with hardware circuits to simulate the functions of the brain. For example, neuromorphic chips may be used to build large-scale, parallel, low-power computing platforms that can support complex pattern learning.

In the related art, a neuromorphic chip includes a plurality of processing cores, and how to implement data integration within the processing cores has become a problem to be solved.

Disclosure of Invention

In view of this, the present disclosure provides a data processing apparatus and an artificial intelligence processor.

According to an aspect of the present disclosure, there is provided a data processing apparatus applied to a processing core of an artificial intelligence processor, the artificial intelligence processor including a plurality of processing cores, each processing core including a storage module and a data processing apparatus, the data processing apparatus being connected to the storage module, the data processing apparatus including:

an address generation module configured to generate an input address and an output address according to a control instruction; and

a data conversion module, connected to the address generation module, configured to read first data from the storage module according to the input address, perform a data conversion operation on the first data to obtain second data, and write the second data into the output address of the storage module, wherein the data conversion operation comprises at least one of a data merging operation, a data splitting operation, a data migration operation, and a data type conversion operation.

In one possible implementation, the data conversion operation includes a data merging operation,

the address generation module is configured to generate a plurality of input addresses and the output address according to the control instruction; and

the data conversion module is configured to read a plurality of first data from the storage module according to the plurality of input addresses, and to write second data generated by merging the plurality of first data into the output address of the storage module.

In one possible implementation, the data conversion operation includes a data splitting operation,

the address generation module is configured to generate the input address and a plurality of output addresses according to the control instruction; and

the data conversion module is configured to read the first data from the storage module according to the input address, and to write a plurality of second data obtained by splitting the first data into the plurality of output addresses of the storage module, respectively.

In one possible implementation, the data conversion operation includes a data migration operation,

the address generation module is configured to generate the input address and the output address according to the control instruction; and

the data conversion module is configured to read the first data from the storage module according to the input address, and to write second data determined according to the first data into the output address of the storage module.

In one possible implementation, the data conversion operation includes a data type conversion operation,

the data conversion module is configured to perform the data type conversion operation on the first data to obtain the second data when the data type of the first data is different from that of the second data,

wherein the data type includes any one of a 32-bit integer int32, an 8-bit integer int8, and a three-valued data type.

In one possible implementation, the apparatus further includes a line counting module,

and the line counting module is configured to count according to count pulses sent by a control module of the processing core, and to send the control instruction to the address generation module when the count is greater than or equal to a line number threshold.

In one possible implementation, the control instruction includes a primitive instruction sent by a control module of the processing core.

In a possible implementation manner, the address generation module includes a plurality of sets of counters, and the sets of counters are respectively configured to generate the input address and the output address, and to control, through counter logic, the data conversion module to perform the data conversion operation.

In one possible implementation, the address generation module includes a first set of counters, a second set of counters, and a third set of counters,

the first set of counters is configured to generate a first address, the first address being an address of the input data;

the second set of counters is configured to generate a second address; and

the third set of counters is configured to generate a third address, the third address being an address of the output data,

wherein, when a data merging operation is performed, the second address is the input address; when a data splitting operation is performed, the second address is the output address; and when a data migration operation is performed, generation of the second address is skipped.

In one possible implementation, the output address includes an address of a data sending area of a routing module located in the processing core,

and the data conversion module is configured to write the second data into the data sending area of the routing module, so that the routing module sends the second data.

In one possible implementation, the data type includes at least one of a 32-bit integer int32, an 8-bit integer int8, and a three-valued data type.

According to another aspect of the present disclosure, there is provided an artificial intelligence processor including a plurality of the above-described processing cores.

According to the embodiments of the present disclosure, the address generation module generates an input address and an output address according to a control instruction; the data conversion module reads first data from the storage module according to the input address, performs a data conversion operation on the first data to obtain second data, and writes the second data into the output address of the storage module, so that data integration of the data stored in the storage module can be realized.

Other features and aspects of the present disclosure will become apparent from the following detailed description of exemplary embodiments, which proceeds with reference to the accompanying drawings.

Drawings

The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate exemplary embodiments, features, and aspects of the disclosure and, together with the description, serve to explain the principles of the disclosure.

FIG. 1 illustrates a block diagram of an artificial intelligence processor in accordance with an embodiment of the disclosure.

Fig. 2 shows a block diagram of a data processing apparatus of an embodiment of the present disclosure.

Fig. 3 shows a schematic diagram of storing feature map data in a storage module according to an embodiment of the present disclosure.

Fig. 4 shows a schematic diagram of a data processing apparatus of an embodiment of the present disclosure.

FIG. 5 illustrates a schematic diagram of a data conversion operation of an embodiment of the present disclosure.

FIG. 6 illustrates a schematic diagram of a data merge operation of an embodiment of the present disclosure.

Fig. 7 illustrates a schematic diagram of merging a plurality of first data into second data according to an embodiment of the disclosure.

FIG. 8 illustrates a schematic diagram of a data splitting operation of an embodiment of the present disclosure.

Detailed Description

Various exemplary embodiments, features and aspects of the present disclosure will be described in detail below with reference to the accompanying drawings. In the drawings, like reference numbers can indicate functionally identical or similar elements. While the various aspects of the embodiments are presented in drawings, the drawings are not necessarily drawn to scale unless specifically indicated.

The word "exemplary" is used herein to mean "serving as an example, embodiment, or illustration." Any embodiment described herein as "exemplary" is not necessarily to be construed as preferred or advantageous over other embodiments.

Furthermore, in the following detailed description, numerous specific details are set forth in order to provide a better understanding of the present disclosure. It will be understood by those skilled in the art that the present disclosure may be practiced without some of these specific details. In some instances, methods, means, elements and circuits that are well known to those skilled in the art have not been described in detail so as not to obscure the present disclosure.

The neuromorphic chip is a chip with a novel architecture that can simulate the working mechanism of the human brain. The neuromorphic chip may be organized around processing cores. For example, a neuromorphic chip may include 4096 processing cores (e.g., a 64 × 64 array), each of which may internally simulate 256 neurons in the biological sense, so that the chip may simulate a total of about 1 million neurons.

In one possible implementation, the processing core may organize the neurons in a two-dimensional mesh structure. A neuron may include axons, dendrites, a cell body, and the like, each of which may be realized by hardware circuits. The axons may correspond to the rows of the two-dimensional mesh, the dendrites to its columns, and the synapses to the cross-points of rows and columns. The axons may be used for sending data, the dendrites for receiving data, and the synapses for storing weights, where a weight scales the signal passed from the former neuron to the latter neuron across the synapse and corresponds to a weight parameter of the neural network. For example, when a neuron is not used, its weight may be set to 0, and when the neuron is needed for a calculation, a weight value may be assigned to it. The cell body may be used for arranging the data in the storage module, where the data arrangement includes data migration (Move), data merging (Merge), and data splitting (Split) operations; the cell body may also plan a corresponding routing path so that the data finally reaches a target processing core.

FIG. 1 illustrates a block diagram of an artificial intelligence processor in accordance with an embodiment of the disclosure. As shown in fig. 1, the artificial intelligence processor 100 includes a plurality of processing cores 101, and each processing core 101 may include a storage module, an operation module, a data processing apparatus, and the like.

The storage module may be used for storing various data, for example, pixel data of an image and weight data of a convolution kernel. The storage module may be an independent module or may be located inside another module, for example, inside a routing module. The operation module may include a multiplier-accumulator (MAC) array for performing operations based on the pixel data and the weight data. The data processing apparatus may be used for arranging the data stored in the storage module, for example, through data migration, data merging, and data splitting.

The present disclosure provides a data processing apparatus that may be used to arrange the data stored in a storage module.

Fig. 2 shows a block diagram of a data processing apparatus of an embodiment of the present disclosure. As shown in fig. 2, the data processing apparatus 60 may be applied to a processing core of an artificial intelligence processor including a plurality of processing cores, each of which includes a storage module 40, a control module 50, and a data processing apparatus 60 connected to the storage module and the control module. The data processing apparatus includes:

an address generation module 10 configured to generate an input address and an output address according to a control instruction; and

a data conversion module 20, connected to the address generation module, configured to read first data from the storage module according to the input address, perform a data conversion operation on the first data to obtain second data, and write the second data into the output address of the storage module, where the data conversion operation includes at least one of a data merging operation, a data splitting operation, a data migration operation, and a data type conversion operation.

For example, the data conversion operation may be a merging operation performed on a plurality of first data, a splitting operation performed on the first data, a migration operation performed on the first data, and the like. During any of these operations, a data type conversion may also be performed, for example, converting first data (or intermediate conversion data) of the 32-bit integer type int32 into the 8-bit integer type int8. There may be one or more first data and one or more second data; the present disclosure does not limit the type of data conversion operation or the numbers of first data and second data.

Through the data processing device of the embodiment of the disclosure, data arrangement of the storage data of the storage module can be realized.

The storage module of the processing core may be configured to store data and instructions related to neural network computation. In one example, the storage module may be a memory with a certain storage capacity that stores different kinds of data used in neural network computation, such as vectors, matrices, and tensors.

In one possible implementation manner, the storage module stores Feature Map data in units of vectors, where each vector contains the data along the Channel direction, which may be referred to as the depth of the vector. The channel data may be aligned to a 16-byte width, and the number of storage units occupied by one vector's channel data may be referred to as the vector length; the feature map data can thus be described by the number of vectors and the length of each vector.

Fig. 3 shows a schematic diagram of storing feature map data in a storage module according to an embodiment of the present disclosure. As shown in fig. 3, the feature map data is width × height × channel; taking width and height both equal to 3 as an example, the feature map contains 9 vectors. The data in the channel direction of each vector are aligned to a 16-byte width, resulting in a vector length of 3 storage units. As shown in fig. 3, the feature map data may be stored in memory in the following order: first along the Channel direction, then along the row direction of the vectors in the feature map data, and finally along the column direction.
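As an illustrative sketch (not part of the patent; the function names and the choice of int32 elements are assumptions), the storage layout described above can be modeled as follows:

```python
def vector_length(channels: int, bytes_per_element: int, unit_bytes: int = 16) -> int:
    """Number of 16-byte storage units occupied by one vector's channel data."""
    return -(-(channels * bytes_per_element) // unit_bytes)  # ceiling division

def vector_offsets(width: int, height: int, channels: int, bytes_per_element: int):
    """Start offset (in storage units) of each vector, stored row by row,
    matching the order: channel direction first, then rows, then columns."""
    vlen = vector_length(channels, bytes_per_element)
    return {(x, y): (y * width + x) * vlen
            for y in range(height) for x in range(width)}

# A 3x3 feature map with 12 int32 channels: 48 bytes of channel data
# round up to 3 storage units per vector, as in the example above.
offsets = vector_offsets(3, 3, 12, 4)
```

With these assumed parameters, the nine vectors start at offsets 0, 3, 6, ..., 24, consistent with a vector length of 3 storage units.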

It should be understood that the feature map data may be stored in the storage module in various orders; the present disclosure does not limit the type of data processed by the data processing apparatus or the form in which data is stored in the storage module. For ease of understanding, the present disclosure is described using the example of performing data conversion operations on feature map data stored in the storage module in the form of vectors.

In one possible implementation, the data conversion operation includes at least one of data merging, data splitting, data migration, and data type conversion operations.

In one possible implementation, the data conversion operation includes a data merging operation,

the address generation module is configured to generate a plurality of input addresses and the output address according to the control instruction; and

the data conversion module is configured to read a plurality of first data from the storage module according to the plurality of input addresses, and to write second data generated by merging the plurality of first data into the output address of the storage module.

In one possible implementation, the data conversion operation includes a data splitting operation,

the address generation module is configured to generate the input address and a plurality of output addresses according to the control instruction; and

the data conversion module is configured to read the first data from the storage module according to the input address, and to write a plurality of second data obtained by splitting the first data into the plurality of output addresses of the storage module, respectively.

In one possible implementation, the data conversion operation includes a data migration operation,

the address generation module is configured to generate the input address and the output address according to the control instruction; and

the data conversion module is configured to read the first data from the storage module according to the input address, and to write second data determined according to the first data into the output address of the storage module.

In one possible implementation, the data conversion operation includes a data type conversion operation,

the data conversion module is configured to perform the data type conversion operation on the first data to obtain the second data when the data type of the first data is different from that of the second data,

wherein the data type includes any one of a 32-bit integer int32, an 8-bit integer int8, and a three-valued data type.

For example, the data merging operation may be understood as merging a plurality of first data to generate the second data; for example, two vectors may be merged in the depth direction. The data splitting operation may be understood as splitting the first data into a plurality of second data; for example, a vector may be split in the depth direction. The data migration operation may be understood as converting first data at an input address of the storage module into second data and writing the second data into an output address of the storage module, where converting the first data into the second data may mean taking the first data as the second data directly, or performing a data type conversion on the first data to obtain the converted second data. In this way, a circular migration of vectors within the storage space can be realized.
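The three arrangement operations can be sketched in a few lines. This is a behavioral model only, with Python lists standing in for depth-direction vectors, not the hardware implementation:

```python
def merge(first_vectors):
    """Data merging: concatenate several first data in the depth direction."""
    merged = []
    for v in first_vectors:
        merged.extend(v)
    return merged

def split(vector, sizes):
    """Data splitting: cut one first data into several second data by depth."""
    parts, pos = [], 0
    for s in sizes:
        parts.append(vector[pos:pos + s])
        pos += s
    return parts

def migrate(memory, src, dst, length):
    """Data migration: copy first data at the input address to the output address."""
    memory[dst:dst + length] = memory[src:src + length]
```

Note that `split` is the inverse of `merge` when the split sizes match the merged vector lengths, mirroring how the two operations are paired in the description above.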

The data type conversion operation may refer to converting the data type; for example, if the target data type of the second data to be written into the storage module is int8 and the data type of the first data is int32, the data may be converted from int32 to int8. A data conversion operation may include at least one of data merging, data splitting, data migration, and data type conversion; for example, data merging and data splitting may be performed simultaneously, and data merging and data type conversion may also be performed simultaneously, which is not limited in this disclosure.

In this way, various operations such as data merging, data splitting, data migration, and data type conversion can be performed on the data in the storage module, so that the artificial intelligence processor can realize various data transformations in the neural network, for example, partial summation, image framing, data migration within the storage space, and zeroing of the storage space.

The control instruction may be used to determine the category of the data conversion operation and the corresponding address generation manner; for example, different data conversion operations may correspond to different numbers of input addresses and output addresses. The control instruction may also be used to determine the start and end positions of the input address and the output address, or the storage areas of the input address and the output address, and the like.

For example, when the data conversion operation is a data merging operation, the input address may include a first address and a second address, and the output address may include a third address. When the data conversion operation is a data splitting operation, the input address may include the first address, and the output address may include the second address and the third address. When the data conversion operation is a data migration operation, generation of the second address may be skipped; the input address may include the first address, and the output address may include the third address.

It should be understood that when the data conversion operation is a data merging operation, the input address may include more than two addresses and is not limited to the first address and the second address; likewise, when the data conversion operation is a data splitting operation, the output address may also include more than two addresses.

For ease of understanding, the following description assumes that when the data conversion operation is a data merging operation, the input address includes a first address and a second address, and the output address includes a third address; that when the data conversion operation is a data splitting operation, the input address includes the first address, and the output address includes the second address and the third address; and that when the data conversion operation is a data migration operation, the input address includes the first address and the output address includes the third address.

The control instruction may be from a control module connected to the data processing apparatus, or may be a control instruction generated by the data processing apparatus when a preset condition is met, which is not limited in this disclosure.

In some optional embodiments, the control instruction may be a Primitive instruction sent by the control module, for example, a Primitive instruction from a Primitive Instruction (PI) register in the control module.

In some alternative embodiments, the control instruction may be a control instruction generated by a line count module in the data processing apparatus.

Fig. 4 shows a schematic diagram of a data processing apparatus of an embodiment of the present disclosure. In some alternative embodiments, as shown in fig. 4, the data processing apparatus further comprises a line counting module,

and the line counting module is configured to count according to count pulses sent by the control module, and to send the control instruction to the address generation module when the count is greater than or equal to a line number threshold.

As shown in fig. 4, the line counting module may be a line counter.

FIG. 5 illustrates a schematic diagram of a data conversion operation of an embodiment of the present disclosure. As shown in fig. 5, the primitive instruction register of the control module sends count pulses to the line counting module; the line counting module counts the received pulses and sends a start control instruction to the address generation module when the count is greater than or equal to the line number threshold, and the address generation module generates the input address and the output address in response to the control instruction.

Through the line counting module, line-pipelined data conversion can be achieved. The line pipelining can be understood as follows: each time the storage module receives one line of feature map data, the primitive instruction register sends one count pulse. The line counting module increments its counter on each received count pulse; when the line count is greater than or equal to a preset line number threshold, the line counting module sends a start control signal to the address generation module, and the address generation module performs the data conversion operation according to the control signal.
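A minimal behavioral sketch of this trigger logic (the class name and callback are illustrative, not from the patent):

```python
class LineCounter:
    """Counts one pulse per received feature-map line and starts the
    address generation module once the line number threshold is reached."""

    def __init__(self, line_threshold, start_address_generation):
        self.count = 0
        self.line_threshold = line_threshold
        # Callback standing in for the control signal sent to the
        # address generation module.
        self.start_address_generation = start_address_generation

    def on_count_pulse(self):
        self.count += 1
        if self.count >= self.line_threshold:
            self.start_address_generation()
```

For example, with a threshold of 3, the first two pulses are absorbed and the third one triggers address generation, matching the "greater than or equal to" condition above.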

In a possible implementation manner, the address generation module includes a plurality of sets of counters, which are respectively used for generating the input address and the output address, and for controlling, through counter logic, the data conversion module to complete the data conversion operation. The counters may cycle according to the number of vectors and the vector length to generate the input address and the output address.

For example, the address generation module of the disclosed embodiments may include three sets of counters: a first set of counters for generating a first address Addr_in, a second set of counters for generating a second address Addr_ciso, and a third set of counters for generating a third address Addr_out.

In each type of data conversion operation, the first address Addr_in may be used as an input address and the third address Addr_out as an output address. The second address Addr_ciso may be used as either an input address or an output address, depending on the type of data conversion operation. For example, when a data merging operation is performed, the second address Addr_ciso may be used as an input address; when a data splitting operation is performed, it may be used as an output address; and when a data migration operation is performed, generation of the second address Addr_ciso may be skipped.
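As a sketch of how the three counter groups might cooperate during a data merging operation (the address names Addr_in, Addr_ciso, and Addr_out follow the description above; the particular looping scheme is an assumption):

```python
def merge_address_stream(base_in, base_ciso, base_out, num_vectors, len_a, len_b):
    """For each output vector: read len_a storage units via Addr_in and
    len_b units via Addr_ciso, then write len_a + len_b units via Addr_out,
    merging two source vectors in the depth direction."""
    stream = []
    for v in range(num_vectors):
        reads = [("Addr_in", base_in + v * len_a + i) for i in range(len_a)]
        reads += [("Addr_ciso", base_ciso + v * len_b + i) for i in range(len_b)]
        writes = [("Addr_out", base_out + v * (len_a + len_b) + i)
                  for i in range(len_a + len_b)]
        stream.append((reads, writes))
    return stream
```

Each entry of the returned stream pairs the read addresses with the write addresses for one merged vector, which is the cycling "according to the number of vectors and the vector length" mentioned above.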

In this way, the three sets of counters of the address generation module make it convenient to organize various kinds of data in the storage module. It should be understood that the embodiments of the present disclosure limit neither the form of the address generation module nor the number of its counters, as long as at least one of the data conversion operations of data merging, data splitting, data migration, and data type conversion can be implemented.
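The role each address plays per operation, as described above, can be summarized in a small sketch; the function name and role strings are assumptions made for illustration.

```python
# Sketch of how the three generated addresses take on input/output roles
# depending on the data conversion operation (merge, split, migrate).
def address_roles(op):
    roles = {"Addr_in": "input", "Addr_out": "output"}
    if op == "merge":
        roles["Addr_ciso"] = "input"    # second address read as input
    elif op == "split":
        roles["Addr_ciso"] = "output"   # second address written as output
    elif op == "migrate":
        pass                            # Addr_ciso generation is skipped
    else:
        raise ValueError(f"unknown operation: {op}")
    return roles
```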

In a possible implementation manner, the data conversion module is further configured to perform a data type conversion operation during the data conversion operation when the data type of the first data is different from the data type of the second data to be output.

Through the data conversion module, data merging, data splitting and data migration operations with data type conversion can be processed more efficiently.

In some optional embodiments, the data conversion module is configured to perform data type conversion on the first data according to the data type of the second data, and then perform the data conversion operation on the type-converted third data to obtain the second data.

For example, the data conversion module may convert the first data into third data of the same type as the second data to be output according to the data type conversion rule, and perform operations such as data merging, data splitting, or data migration on the third data to obtain the second data.

In some optional embodiments, the data conversion module is instead configured to first perform the data conversion operation on the first data to obtain third data, and then convert the data type of the third data into that of the second data; the order of the two steps is not limited in this disclosure.

In a possible implementation manner, the data conversion module may be used for conversion operations among three data types: int32, int8, and three-valued (ternary) data. For example, int32 data may be converted to three-valued data, int8 data may be converted to three-valued data, and int32 data may be converted to int8 data.

The output spike sequence of a neuron in a spiking neural network can be encoded with three-valued data. For example, 0, 1, and -1 may describe the non-active, active, and inhibited states of a spiking neural network neuron, respectively; the meaning assigned to the three values may vary. It should be understood that the data conversion module may also convert data into all-0 data, etc.; the present disclosure does not limit the conversion types of the data conversion module.
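As a minimal illustration of this encoding, the following sketch maps a membrane potential to the three values. The threshold parameters are assumptions made for the example and are not specified in the disclosure.

```python
# Encode a neuron state as three-valued data: 0 = non-active, 1 = active,
# -1 = inhibited. Thresholds are illustrative assumptions.
def encode_neuron_state(membrane_potential, fire_threshold, inhibit_threshold):
    if membrane_potential >= fire_threshold:
        return 1    # active: the neuron fires a spike
    if membrane_potential <= inhibit_threshold:
        return -1   # inhibited
    return 0        # non-active
```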

In one possible implementation, as shown in fig. 4, the data conversion module may include a plurality of paths and a data strobe (multiplexer). The paths may respectively be: int32 to ternary, int8 to ternary, int32 to int8, pass-through, and a 16-byte "0" path.

Therefore, the data conversion module can support three data type conversions when the data types of the input data and the output data differ. When the data types of the input data and the output data are the same, the pass-through path may be selected; when a clear operation on the output address is performed, the 16-byte "0" path may be selected so that the second data written to the output address is all-0 data. The present disclosure does not limit the number and types of paths included in the data conversion module.
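The path selection can be sketched as a simple dispatch over the five paths listed above. The individual conversion rules (sign-based ternarization, saturating narrowing) are placeholders chosen for the example; the disclosure does not specify them.

```python
# Sketch of the data strobe (multiplexer) selecting among the five paths.
def int32_to_ternary(x):
    return 0 if x == 0 else (1 if x > 0 else -1)   # sign rule, an assumption

def int32_to_int8(x):
    return max(-128, min(127, x))                  # saturating narrow, an assumption

PATHS = {
    "int32->ternary": int32_to_ternary,
    "int8->ternary": int32_to_ternary,   # same illustrative sign rule
    "int32->int8": int32_to_int8,
    "through": lambda x: x,              # data types already match
    "zero16B": lambda x: 0,              # clears the output to all-0 data
}

def strobe(path, value):
    return PATHS[path](value)
```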

For ease of understanding, the following description takes as an example an address generation module with three sets of counters that generate the first, second, and third addresses according to the control instruction, and explains how the data processing apparatus of the disclosed embodiments performs the data merging, data splitting, and data migration operations, both when the data types of the input data and the output data are the same and when they differ.

In one possible implementation, the data conversion operation is a data merge operation, the input address includes a first address and a second address, the output address includes a third address,

the data conversion module 20 is configured to read a plurality of first data from the storage module according to the first address and the second address, respectively, and write second data generated by merging the plurality of first data into a third address of the storage module.

As described above, when the data types of the first data and the second data are the same, the data strobe of the data conversion module may select the pass-through path and perform the merge operation on the plurality of first data.

For example, the first data corresponding to the first address and the first data corresponding to the second address may be understood as input data, and the data obtained by merging the plurality of first data may be understood as output data. In the following, the first data corresponding to the first address is denoted in, the first data corresponding to the second address is denoted ciso, and the merged second data is denoted out.

The first data in and the first data ciso may be input vectors having the same number of vectors and the same vector data type, while their vector lengths may differ.

In one possible implementation manner, the first data corresponding to the first address and the first data corresponding to the second address may be spliced into an output vector; in the spliced second data, the vector elements of the first data corresponding to the first address precede those of the first data corresponding to the second address.

For example, in the data merging process, the vector of in may be placed into the output vector first, followed by the ciso vector. Suppose in, ciso, and out are three vector sets whose smallest unit is a vector; the mathematical representation is as follows:

in = {in00, in01, in02}; ciso = {ciso00, ciso01, ciso02};

the second data obtained by performing the merge operation on the plurality of first data may be:

out = {out0, out1, out2}

wherein: out0 = {in00, ciso00}, out1 = {in01, ciso01}, out2 = {in02, ciso02};

that is, the second data out = {{in00, ciso00}, {in01, ciso01}, {in02, ciso02}}.

It should be noted that the above mathematical expressions are exemplary, and the present disclosure does not limit the merging manner.
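The merge rule above can be sketched minimally by treating each vector as a Python list; the function name is an assumption for illustration.

```python
# Merge paired vectors so that the elements of `in` precede the elements of
# `ciso` in each output vector, as in out0 = {in00, ciso00}.
def merge(in_vecs, ciso_vecs):
    return [a + b for a, b in zip(in_vecs, ciso_vecs)]
```

Merging pairs the k-th vector of each input set, matching the exemplary expressions above.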

In some optional embodiments, the data conversion module 20 is configured to perform a 0-complementing (zero-padding) operation on the remaining storage space when the vector number of the second data is greater than the sum of the vector numbers of the plurality of first data and/or the vector length of the second data is greater than the sum of the vector lengths of the plurality of first data.

FIG. 6 illustrates a schematic diagram of a data merge operation of an embodiment of the present disclosure. The vector number of the first data in is num_in and its vector length is Km_num_in; the vector number of the first data ciso is num_ciso and its vector length is Km_num_ciso; the vector number of the second data out is num_out and its vector length is Km_num_out.

As shown in fig. 6, when the data merge operation is performed, the total vector length of the second data out is 15 while the sum of the total vector lengths of the first data in and the first data ciso is 12, so the 0-complement operation is performed on the remaining storage space. The second data out is generated and written to the third address with Y0 merged from X0 of the first data in and X0 of the first data ciso, Y1 merged from X1 of in and X1 of ciso, and Y2 merged from X2 of in together with complementary zeros.
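The 0-complement behaviour of fig. 6 can be sketched as follows; the function and parameter names are assumptions.

```python
# Merge one pair of vectors into an output slot of fixed length, filling the
# remaining storage space with zeros (the 0-complement operation).
def merge_with_padding(x_in, x_ciso, out_len):
    out = list(x_in) + list(x_ciso)
    if len(out) < out_len:
        out += [0] * (out_len - len(out))   # 0-complement the remaining space
    return out
```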

In some alternative embodiments, the first data is not the same data type as the second data.

For example, different precisions may be obtained by setting different truncation bits; data type conversion may be realized by setting a parameter in_cut_start that marks the starting bit. It should be understood that the vector length may change during data type conversion: for example, when data of type int32 with vector length 4 is converted into a vector of type int8, the vector length changes from 4 to 1. The consecutive bits cut forward or backward from a target bit of the int32 first data may be taken as the int8 first data, or randomly selected bits may be taken; this is not limited in this disclosure.
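One plausible reading of the in_cut_start parameter is sketched below: 8 consecutive bits are cut from an int32 word starting at a given bit position and reinterpreted as a signed int8. The exact truncation rule is not fixed by the disclosure, so this is an assumption.

```python
# Cut 8 consecutive bits from a 32-bit word starting at bit `cut_start`
# and reinterpret them as a signed int8 value (two's complement).
def cut_int8(word32, cut_start):
    bits = (word32 >> cut_start) & 0xFF
    return bits - 256 if bits & 0x80 else bits
```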

Fig. 7 illustrates a schematic diagram of merging a plurality of first data into second data according to an embodiment of the disclosure.

As shown in fig. 7, X0 of the first data in is int32-type data, which can be converted into an int8 vector, changing the vector length from 4 to 1. Similarly, X0 of the first data ciso is converted into an int8 vector whose length changes from 4 to 1, and the two length-1 vectors are merged to obtain Y0 of the second data out. As mentioned above, a 0-complement operation may be performed during the data conversion process, which is not repeated here.

In one possible implementation, the data conversion operation is a data splitting operation, the input address includes a first address, the output address includes a second address and a third address,

the data conversion module 20 is configured to read first data from the storage module according to the first address, and write a plurality of second data obtained by splitting the first data into a second address and a third address of the storage module, respectively.

As described above, the first data in represents the input data, the second data out represents the first output data, and the first data ciso represents the second output data. When the data is split, it may be split in the order of the first data ciso and then the second data out. If the vector number of the first data in is smaller than the sum of the vector numbers of the second data out and the first data ciso, and/or the vector length of the first data in is smaller than the sum of their vector lengths, the data may be preferentially split into the second data out.

For example, suppose in, ciso, and out are three vector sets whose smallest unit is a vector; the mathematical representation is as follows:

in = {in00, in01, in02};

the vector split result can be expressed as:

out = {out0, out1, out2}; ciso = {ciso00, ciso01, ciso02};

wherein: in00 = {out0, ciso00}, in01 = {out1, ciso01}, in02 = {out2, ciso02}

It should be noted that the above mathematical expressions are exemplary illustrations, and the present disclosure does not limit the manner and form of data splitting. As mentioned above, when it is determined that the remaining storage space exists, the 0 complementing operation may be performed, which is not described herein again.
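The split rule above can be sketched as the inverse of the merge, dividing each input vector into a leading out part and a trailing ciso part; the function name and length parameters are assumptions.

```python
# Split one input vector into a leading part (written to the third address
# as out) and a trailing part (written to the second address as ciso).
def split(in_vec, out_len, ciso_len):
    assert len(in_vec) >= out_len + ciso_len
    return in_vec[:out_len], in_vec[out_len:out_len + ciso_len]
```

With out_len = 2 and ciso_len = 1, this mirrors the fig. 8 example, where X0 of the first data in is split into out0 of length 2 and ciso0 of length 1.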

FIG. 8 illustrates a schematic diagram of a data splitting operation of an embodiment of the present disclosure.

As shown in fig. 8, during data splitting, the data splitting operation is performed on the first data in according to the vector number and vector length of the second data out and of the first data ciso. In fig. 8, the vector length of the second data out is 2 and the vector length of the first data ciso is 1, so X0 of the first data in can be split into out0 (vector length 2) and ciso0 (vector length 1), with out0 written to the third address and ciso0 written to the second address.

When the data types of the second data and the first data are different, the data type conversion operation for performing the data splitting operation may refer to the data type conversion operation for performing the data merging operation, which is not described herein again.

In one possible implementation, the data translation operation is a data migration operation, the input address includes a first address, the output address includes a third address,

the data conversion module 20 is configured to read first data from the storage module according to the first address, and write second data determined according to the first data into a third address of the storage module.

The first data may be determined as the second data, or data obtained by performing data type conversion on the first data may be determined as the second data.

For example, when the data conversion operation is a data migration operation (e.g., the number or length of the second address in the control instruction is 0), the address generation module skips generation of the second address to generate the first address and the third address.

In this way, a direct data migration operation can be realized, and a data migration operation with data type conversion can also be realized: the data conversion module converts the data type of the first data and writes the result to the third address as the second data, thereby implementing the migration.
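The migration path can be sketched with a plain dict standing in for the storage module; the optional convert argument models the type-converting variant. All names are illustrative, not from the disclosure.

```python
# Copy `length` words from the first address to the third address,
# optionally applying a data type conversion on the way.
def migrate(memory, addr_in, addr_out, length, convert=None):
    for i in range(length):
        value = memory[addr_in + i]
        memory[addr_out + i] = convert(value) if convert else value
```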

In one possible implementation, the data processing apparatus, the storage module, and the control module may be located within a processing core of an artificial intelligence processor. The artificial intelligence processor may be a neuromorphic chip, and the processing core may be a functional core of the neuromorphic chip. The processing core may also include other modules, such as a computation module, a routing module, and the like.

In one possible implementation, the output address includes an address of a data sending area of a routing module located in the processing core,

the data conversion module is configured to write second data into the data sending area of the routing module, so that the routing module sends the second data.

In this way, the routing module can be directly started to send data.

In one possible implementation, the data processing apparatus is applicable to artificial neural network computation, spiking neural network computation, or computation of a hybrid neural network comprising both an artificial neural network and a spiking neural network.

The present disclosure also provides a processing core, the processing core includes the control module, the storage module and the data processing apparatus, the data processing apparatus includes:

the address generation module is used for generating an input address and an output address according to the control instruction;

and the data conversion module is connected to the address generation module and used for reading first data from the storage module according to the input address, performing data conversion operation on the first data to obtain second data and writing the second data into an output address of the storage module.

In a possible implementation manner, the processing core further includes a routing module with a data sending area, and the routing module is configured to send the second data of the data sending area, where the second data may be the second data written by the data processing apparatus.

In a possible implementation, the control module may include a PI (Primitive Instruction) register for controlling the storage module and/or the computation module. A primitive is a program segment consisting of several instructions that implements a particular function. Specifically, the PI register may be a 32-byte register. The PI register can control the number of groups operated in the multiply-accumulator array and the number of multiply-accumulators used in each group. For example, for a multiply-accumulator array divided into 4 groups of 32 multiply-accumulators each, a certain storage space can be allocated in the PI register so that only some of the multiply-accumulators work when neural network computation is executed; this reduces the energy consumption of the neuromorphic chip, shortens the computation time of the neural network, and improves its computational efficiency.

In one possible implementation, the PI register may be used to control the storage module. For example, the PI register may store an instruction prohibiting read/write operations on part of the storage area of the storage module, thereby controlling read/write access to that part of the data; this avoids data loss caused by erroneous operations on protected data during neural network computation. The PI register may also be configured with specific storage space for distinguishing different computation modes, such as convolution, vector accumulation, vector dot product, and tensor scaling. For example, when the data processing apparatus performs neural network computation, certain bits in the PI register may indicate a mode in which the multiply-accumulator array performs convolution operations, or a mode in which it performs dot-product operations. The PI register can also pre-store parameters used by the computation mode, such as stride and convolution kernel size, so that the address generation module can fetch the pre-stored parameters directly from the PI register for cyclic computation without repeated reads and writes, and can quickly generate addresses for different types of data.
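The kind of PI-register fields described above (compute mode, multiply-accumulator group counts, stride, kernel size) could be decoded as in the sketch below. The bit layout is invented for illustration; the disclosure does not specify one.

```python
# Decode hypothetical PI-register fields from a control word. The layout
# (field positions and widths) is an assumption made for this sketch.
def decode_pi(word):
    return {
        "mode":          word & 0xF,          # e.g. 0 = convolution, 1 = dot product
        "mac_groups":   (word >> 4) & 0x7,    # number of multiply-accumulator groups
        "macs_per_grp": (word >> 7) & 0x3F,   # multiply-accumulators used per group
        "stride":       (word >> 13) & 0xF,   # pre-stored stride parameter
        "kernel_size":  (word >> 17) & 0xF,   # pre-stored convolution kernel size
    }
```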

In one possible implementation, the control module may be configured to control the storage module and/or the computation module. For example, the control module may control when the storage module is read and written, and may also control which multiply-accumulators of the computation module work under different computation and storage modes. In one example, the control module may include a 128-bit-wide register. The computation modes may include neural network computation modes such as convolution, vector accumulation, average pooling, vector dot product, and full connection.

In a possible implementation manner, the computation module comprises a multiply-accumulator array, and the control module is further used to control the storage module and/or the computation module. Controlling the computation module includes: controlling, according to a preset processing category, the number of groups of multiply-accumulators participating in data computation and the number of multiply-accumulators in each group, so that the computation module computes the data.

The present disclosure also provides an artificial intelligence processor, the artificial intelligence processor includes a plurality of processing cores, each processing core includes a storage module, a control module and a data processing apparatus, the data processing apparatus is connected respectively the storage module with the control module, the data processing apparatus includes:

the address generation module is used for generating an input address and an output address according to the control instruction;

and the data conversion module is connected to the address generation module and used for reading first data from the storage module according to the input address, performing data conversion operation on the first data to obtain second data and writing the second data into an output address of the storage module.

In a possible implementation manner, the processing core may further include a storage module and a computation module; that is, the neuromorphic chip may adopt a processing architecture integrating storage and computation. The storage module may be configured to store data and instructions related to neural network computation, and the computation module may be configured to perform neural network computation (e.g., multiply-accumulate operations). In one example, the storage module may be a memory with a certain storage capacity that stores the different kinds of data used in neural network computation, such as vectors, matrices, and tensors, and the computation module may be a multiply-accumulator array (MACs) composed of 128 multiply-accumulators. The multiply-accumulator array may be divided into 4 groups of 32 multiply-accumulators each, used to perform multiply-add operations on the output data of the data processing apparatus, and may return the operation results to the data processing apparatus for further processing.

In a possible implementation manner, the processing core may perform artificial neural network computation, spiking neural network computation, or hybrid neural network computation mixing an artificial neural network and a spiking neural network. Different neural network computing tasks need to process data of different precisions. For example, for spiking neural networks, temporal information is an important consideration: a neuron of the spiking neural network is activated only when its membrane potential reaches a certain value, and when activated it generates a signal that is transmitted to other neurons to raise or lower their membrane potentials. It should be understood that before three-valued data is sent to the computation module for computation, it needs to be converted into data of a precision the computation module can accept.

In a possible implementation manner, the neuromorphic chip of the disclosure can adopt a chip architecture integrating storage and computation, in which the processor, the storage module, and the communication component are integrated together and information is processed locally; this better meets the demand for large-scale parallel computation in neural networks. The neuromorphic chip of the disclosure can support two computational paradigms: it supports not only artificial neural network computation but also spiking neural network computation and computation of a hybrid neural network mixing the two. In this way, the neuromorphic chip can compute neural networks in a distributed, large-scale, parallel manner with extremely low power consumption. It will be understood by those skilled in the art that the neuromorphic chip and the processing core are exemplary; the disclosure does not limit the processor architecture to which the data processing apparatus is applied, nor whether the processor is organized into processing cores.

In one possible implementation, different processing cores may process different neural network computing tasks, and the same processing core may also perform different neural network computing tasks in different work cycles. For example, during a certain work cycle, processing core A may process a neural network computing task on images while processing core B simultaneously processes a neural network computing task on text. Data of different neural network computing tasks can be allocated to one processing core or distributed across different processing cores. Data exchange between and within processing cores is therefore frequent for different kinds of data of different precisions, and a suitable solution for efficiently scheduling such data needs to be found.

A traditional chip adopts the von Neumann architecture, which separates the storage module from the processor: the processor repeatedly exchanges information with the storage module over a bus, read/write efficiency for data of different types and precisions is low, and bottlenecks in computing power and energy consumption arise. Meanwhile, existing neuromorphic chips handle data of different types and precisions in a decentralized way, that is, they adopt different processing methods for different computation tasks and data types. Existing neuromorphic chips can support only one computational paradigm, either artificial neural network computation or spiking neural network computation, such as neuromorphic chips that employ the Leaky-Integrate-and-Fire (LIF) neuron model. Because these chips adopt different processing methods for different computation tasks and data types, the number of accesses to the storage module increases, the data scheduling process becomes complicated, access efficiency is low, and information redundancy may result. Therefore, traditional neuromorphic chips cannot efficiently schedule data of different types and/or precisions, have low storage-module access efficiency, greatly limit the computational efficiency of the chip and the inference efficiency of the neural network, and scale poorly.

Having described embodiments of the present disclosure, the foregoing description is intended to be exemplary, not exhaustive, and not limited to the disclosed embodiments. Many modifications and variations will be apparent to those of ordinary skill in the art without departing from the scope and spirit of the described embodiments. The terminology used herein is chosen in order to best explain the principles of the embodiments, the practical application, or improvements made to the technology in the marketplace, or to enable others of ordinary skill in the art to understand the embodiments disclosed herein.
