Data processing apparatus and data processing method

Document No.: 191864    Publication date: 2021-11-02

Reading note: This technology, "Data processing apparatus and data processing method", was designed and created by Wu Huaqiang, Zhou Ying, Gao Bin, Tang Jianshi, and Qian He on 2021-08-11. Abstract: A data processing apparatus and a data processing method. The data processing apparatus is used for data processing of a neural network, the neural network comprising at least one first processing layer and at least one second processing layer. The data processing apparatus comprises: a first memristor array comprising a plurality of first memristor cells arranged in an array and configured to store a weight value matrix corresponding to the at least one first processing layer; and a second memristor array comprising a plurality of second memristor cells arranged in an array and configured to store a weight value matrix corresponding to the at least one second processing layer; wherein the data retention of the plurality of first memristor cells is better than that of the plurality of second memristor cells, and/or the endurance of the plurality of second memristor cells is better than that of the plurality of first memristor cells. The data processing apparatus adopts two kinds of memristor cells with different endurance and data retention, facilitating flexible implementation of neural network weight layers that place different requirements on the memristor cells.

1. A data processing apparatus for data processing of a neural network, the neural network comprising at least one first processing layer and at least one second processing layer, the at least one first processing layer being closer to an input of the neural network than the at least one second processing layer,

the data processing apparatus includes:

a first memristor array comprising a plurality of first memristor cells arranged in an array and configured to store a weight value matrix corresponding to the at least one first processing layer;

a second memristor array comprising a plurality of second memristor cells arranged in an array and configured to store a weight value matrix corresponding to the at least one second processing layer;

wherein the data retention of the plurality of first memristor cells is better than the data retention of the plurality of second memristor cells, and/or,

the endurance of the plurality of second memristor cells is better than the endurance of the plurality of first memristor cells.

2. The data processing apparatus of claim 1, wherein the plurality of first memristor cells and/or the plurality of second memristor cells are Resistive Random Access Memories (RRAMs).

3. The data processing apparatus of claim 1, wherein the plurality of first memristor cells and the plurality of second memristor cells are fabricated on a same semiconductor substrate.

4. The data processing apparatus of claim 2, wherein the resistive random access memory device in the first memristor cell has a TiN/Ta2O5/TaOx/TiN material structure; and/or the resistive random access memory device in the second memristor cell has a TiN/HfO2/Ti/TiN material structure.

5. The data processing apparatus of claim 1, further comprising a control drive circuit, wherein the control drive circuit is configured to set weight data stored by the plurality of first memristor cells and/or weight data of the plurality of second memristor cells, respectively.

6. The data processing apparatus of claim 1, wherein the at least one second processing layer comprises a last processing layer of the neural network.

7. A data processing method applied to the data processing apparatus according to any one of claims 1 to 6, the data processing method comprising:

mapping the weight value matrix corresponding to the at least one first processing layer into the first memristor array; and

mapping the weight value matrix corresponding to the at least one second processing layer into the second memristor array.

8. The data processing method of claim 7, further comprising:

modifying at least one weight value in the at least one second processing layer, and mapping the modified weight value in the at least one second processing layer into the second memristor array.

9. The data processing method of claim 8, wherein modifying at least one weight value in the at least one second processing layer, mapping the modified weight value in the at least one second processing layer into the second memristor array, comprises:

acquiring an image to be trained and a standard output value corresponding to the image to be trained;

processing the image to be trained by utilizing the neural network to obtain a training result;

calculating a loss value of the neural network based on the training result and the standard output value; and

modifying at least one weight value in the at least one second processing layer based on the loss value, and mapping the modified weight value in the at least one second processing layer into the second memristor array.

10. The data processing method of claim 9, wherein a resistance value of a memristor in the first memristor cell remains unchanged while the at least one weight value in the at least one second processing layer is modified.

11. The data processing method of any of claims 7-10, further comprising:

pre-training a target neural network to obtain the neural network after or during training, thereby obtaining the weight value matrix for the at least one first processing layer and the weight value matrix for the at least one second processing layer.

Technical Field

The embodiment of the disclosure relates to a data processing device and a data processing method.

Background

Neural networks have brought numerous changes to fields such as intelligent transportation and intelligent healthcare. At present, the main hardware platforms for neural networks are still CPUs and GPUs based on the traditional von Neumann architecture. Memristor-based compute-in-memory technology is expected to break through the von Neumann bottleneck of classical computing systems and bring explosive growth in hardware computing power and energy efficiency, further promoting the development and deployment of artificial intelligence; it is one of the most promising next-generation hardware chip technologies.

Disclosure of Invention

At least one embodiment of the present disclosure provides a data processing apparatus for data processing of a neural network, the neural network including at least one first processing layer and at least one second processing layer, the at least one first processing layer being closer to an input of the neural network than the at least one second processing layer, the data processing apparatus including: a first memristor array comprising a plurality of first memristor cells arranged in an array and configured to store a weight value matrix corresponding to at least one first processing layer; a second memristor array comprising a plurality of second memristor cells arranged in an array and configured to store a weight value matrix corresponding to the at least one second processing layer; wherein data retention of the first plurality of memristor cells is better than data retention of the second plurality of memristor cells, and/or endurance of the second plurality of memristor cells is better than endurance of the first plurality of memristor cells.

For example, in the data processing apparatus provided by at least one embodiment of the present disclosure, the plurality of first memristor units and/or the plurality of second memristor units are Resistive Random Access Memories (RRAMs).

For example, in a data processing apparatus provided by at least one embodiment of the present disclosure, a plurality of first memristor cells and a plurality of second memristor cells are fabricated on the same semiconductor substrate.

For example, in the data processing apparatus provided by at least one embodiment of the present disclosure, the resistive random access memory device in the first memristor cell has a TiN/Ta2O5/TaOx/TiN material structure; and/or the resistive random access memory device in the second memristor cell has a TiN/HfO2/Ti/TiN material structure.

For example, the data processing apparatus provided by at least one embodiment of the present disclosure further includes a control driving circuit, where the control driving circuit is configured to set weight data stored in the plurality of first memristor cells and/or weight data of the plurality of second memristor cells, respectively.

For example, in a data processing apparatus provided in at least one embodiment of the present disclosure, the at least one second processing layer includes a last processing layer of the neural network.

At least one embodiment of the present disclosure provides a data processing method applied to the data processing apparatus provided by at least one embodiment of the present disclosure, where the data processing method includes: mapping the weight value matrix corresponding to the at least one first processing layer into the first memristor array; and mapping the weight value matrix corresponding to the at least one second processing layer into the second memristor array.

For example, the data processing method provided by at least one embodiment of the present disclosure further includes: modifying at least one weight value in the at least one second processing layer, and mapping the modified weight value in the at least one second processing layer into the second memristor array.

For example, in a data processing method provided by at least one embodiment of the present disclosure, modifying at least one weight value in at least one second processing layer, and mapping the modified weight value in the at least one second processing layer into a second memristor array includes: acquiring an image to be trained and a standard output value corresponding to the image to be trained; processing the image to be trained by utilizing a neural network to obtain a training result; calculating a loss value of the neural network based on the training result and the standard output value; and modifying at least one weight value in the at least one second processing layer based on the loss value, and mapping the modified weight value in the at least one second processing layer into the second memristor array.

For example, in the data processing method provided by at least one embodiment of the present disclosure, the resistance value of the memristor of the first memristor unit remains unchanged during the process of correcting the weight value in the at least one second processing layer.

For example, the data processing method provided by at least one embodiment of the present disclosure further includes: pre-training a target neural network to obtain the neural network after or during training, thereby obtaining the weight value matrix for the at least one first processing layer and the weight value matrix for the at least one second processing layer.

Drawings

To more clearly illustrate the technical solutions of the embodiments of the present disclosure, the drawings of the embodiments will be briefly introduced below, and it is apparent that the drawings in the following description relate only to some embodiments of the present disclosure and are not limiting to the present disclosure.

FIG. 1 shows a schematic diagram of a memristor array structure;

FIG. 2 shows a schematic diagram of a memristor cell of a 1T1R structure;

FIG. 3 shows a schematic diagram of an exemplary neural network architecture;

fig. 4 shows a schematic block diagram of a data processing apparatus provided in at least one embodiment of the present disclosure;

FIG. 5 shows a schematic of a neural network mapping to a memristor array;

FIG. 6 shows a schematic diagram of 1T1R cell structures based on the TiN/Ta2O5/TaOx/TiN and TiN/HfO2/Ti/TiN stacked structures;

FIG. 7A shows a test chart of the data retention of an RRAM device based on the TiN/Ta2O5/TaOx/TiN stacked structure;

FIG. 7B shows a test chart of the endurance of an RRAM device based on the TiN/Ta2O5/TaOx/TiN stacked structure;

fig. 8 is a schematic flow chart of a data processing method of a neural network according to at least one embodiment of the present disclosure;

fig. 9 shows a processing flow chart of a data processing method provided by at least one embodiment of the present disclosure.

Detailed Description

In order to make the objects, technical solutions and advantages of the embodiments of the present disclosure more apparent, the technical solutions of the embodiments of the present disclosure will be described clearly and completely with reference to the drawings of the embodiments of the present disclosure. It is to be understood that the described embodiments are only a few embodiments of the present disclosure, and not all embodiments. All other embodiments, which can be derived by a person skilled in the art from the described embodiments of the disclosure without any inventive step, are within the scope of protection of the disclosure.

Unless otherwise defined, technical or scientific terms used herein shall have the ordinary meaning as understood by one of ordinary skill in the art to which this disclosure belongs. The use of "first," "second," and similar terms in this disclosure is not intended to indicate any order, quantity, or importance, but rather is used to distinguish one element from another. Also, the use of the terms "a," "an," or "the" and similar referents do not denote a limitation of quantity, but rather denote the presence of at least one. The word "comprising" or "comprises", and the like, means that the element or item listed before the word covers the element or item listed after the word and its equivalents, but does not exclude other elements or items. The terms "connected" or "coupled" and the like are not restricted to physical or mechanical connections, but may include electrical connections, whether direct or indirect. "upper", "lower", "left", "right", and the like are used merely to indicate relative positional relationships, and when the absolute position of the object being described is changed, the relative positional relationships may also be changed accordingly.

The present disclosure is illustrated by the following specific examples. Detailed descriptions of known functions and known components may be omitted in order to keep the following description of the embodiments of the present disclosure clear and concise. When any component of an embodiment of the present disclosure appears in more than one drawing, that component is identified by the same reference numeral in each drawing.

The existing hardware platforms for neural network data processing are still CPUs (Central Processing Units) and GPUs (Graphics Processing Units) based on the traditional von Neumann architecture. Because the computing unit is separated from the storage unit in this architecture, data such as image information must be fetched from the storage unit and sent to the computing unit during computation, and this data-access process causes unnecessary energy consumption and wasted processing time.

Memristors (e.g., resistive random access memories, phase change memories, conductive bridge memories, etc.) are non-volatile devices whose conduction state can be adjusted by applying an external stimulus. The memristor is a two-terminal device, has the characteristics of adjustable resistance and non-volatilization, and is widely applied to memory and computation integrated calculation. According to kirchhoff's current law and ohm's law, an array formed by memristors can complete multiplication and accumulation calculation in parallel, and storage and calculation both occur in each device of the array. Based on the computing architecture, the storage and computation integrated computing without a large amount of data movement can be realized.

A resistive random access memory (RRAM) is one type of memristor; its resistance can be adjusted by applying a voltage, and it has the advantages of non-volatility, high density, compatibility with conventional CMOS processes, and the like. Because of their similarity to biological synapses, RRAM devices have attracted wide attention from researchers in recent years and are widely used in hardware implementations of neural networks.

However, RRAM devices are not ideal devices, e.g., some defective devices cannot be programmed to a target resistance state, and the conductance of RRAM devices may drift continuously. Reliability indicators of RRAM devices include endurance of the device and data retention of the device, e.g., endurance of the device refers to the number of times the resistance of a memristor can be switched between a high resistance state and a low resistance state; data retention of a device refers to the ability of the memristor to remain relatively stable in resistance for a period of time after the memristor is programmed to a certain resistance state, e.g., a high resistance state.

Research on RRAM devices has shown a trade-off between data retention and endurance: for example, RRAM devices with better data retention tend to suffer from poor endurance, while RRAM devices with better endurance generally suffer from poor data retention.

Multiply-accumulate is the core computational task required to run a neural network. By using the conductances of the memristors in an array to represent weight values, energy-efficient neural network operation can be implemented through such compute-in-memory calculation. However, after the neural network is programmed into the memristor array, non-ideal device factors such as limited cell yield, programming error, and resistance drift may cause a large deviation between the weights actually mapped onto the memristors and the ideal weights; that is, the resistance values onto which the weight values are mapped deviate significantly from those weight values.
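These non-ideal factors can be pictured with a toy Monte-Carlo sketch; the yield and noise figures below are invented for illustration and are not measured values from the disclosure.

```python
import numpy as np

rng = np.random.default_rng(2)

# Target conductances for 1000 cells, inside an assumed programmable window.
target_g = rng.uniform(1e-6, 1e-4, size=1000)

# Assumed non-idealities: 3% relative write noise, 2% of cells stuck near g_min.
programmed = target_g * (1 + 0.03 * rng.normal(size=1000))
stuck = rng.random(1000) < 0.02
programmed[stuck] = 1e-6

# Relative deviation between mapped and ideal weights.
rel_dev = np.abs(programmed - target_g) / target_g
```

In this toy model the stuck cells dominate the tail of `rel_dev`, illustrating how a purely offline mapping can leave some weights with large errors.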

At present, one scheme for mitigating the non-ideal factors of memristors is online training, i.e., training the weights directly on the memristor array. However, the training overhead of this scheme is large, posing severe challenges to the energy consumption and latency of circuits handling edge computing tasks.

At least one embodiment of the present disclosure provides a data processing apparatus for data processing of a neural network, the neural network including at least one first processing layer and at least one second processing layer, the at least one first processing layer being closer to an input of the neural network than the at least one second processing layer, the data processing apparatus including: a first memristor array comprising a plurality of first memristor cells arranged in an array and configured to store a weight value matrix corresponding to at least one first processing layer; a second memristor array comprising a plurality of second memristor cells arranged in an array and configured to store a weight value matrix corresponding to the at least one second processing layer; wherein data retention of the first plurality of memristor cells is better than data retention of the second plurality of memristor cells, and/or endurance of the second plurality of memristor cells is better than endurance of the first plurality of memristor cells.

The data processing apparatus uses two kinds of memristor cells with different data retention and endurance to store the weight matrices of different processing layers of the neural network, facilitating flexible implementation of neural network weight layers that place different requirements on the memristor devices.

FIG. 1 shows a schematic diagram of a memristor array. As shown in fig. 1, the memristor array is made up of a plurality of memristor cells forming an array of M rows and N columns, with M and N being positive integers. Each memristor cell includes a switching element and one or more memristors. In fig. 1, WL<1>, WL<2>, …, WL<M> denote the word lines of the first, second, …, and Mth rows, and the control electrode (e.g., the gate of the transistor) of the switching element in each row of memristor cell circuits is connected to the corresponding word line of that row; BL<1>, BL<2>, …, BL<N> denote the bit lines of the first, second, …, and Nth columns, and the memristor in each column of memristor cell circuits is connected to the corresponding bit line of that column; SL<1>, SL<2>, …, SL<M> denote the source lines of the first, second, …, and Mth rows, and the source of the transistor in each row of memristor cell circuits is connected to the corresponding source line of that row.

The memristor array of M rows and N columns shown in fig. 1 may represent a neural network weight matrix of size M rows and N columns. For example, the first layer of neuron layer has N neuron nodes connected to correspond to N columns of bit lines of the memristor array shown in fig. 1; the second layer of neuron layer has M neuron nodes, and is correspondingly connected to M row source lines of the memristor array shown in fig. 1. By inputting voltage excitation to the first layer of neuron layer in parallel, an output current obtained by multiplying a voltage excitation vector by a conductance matrix (conductance is the inverse of resistance) of the memristor array can be obtained at the second layer of neuron layer.

Specifically, according to Kirchhoff's current law and Ohm's law, the output currents of the memristor array can be derived from the following formula:

i_j = Σ_{k=1}^{n} g_{k,j} · v_k, for j = 1, …, m,

where v_k represents the voltage stimulus input to neuron node k of the first neuron layer, i_j represents the output current at neuron node j of the second neuron layer, and g_{k,j} represents the conductance matrix of the memristor array.

According to Kirchhoff's current law, the memristor array can therefore complete multiply-accumulate calculations in parallel.
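As an illustrative sketch (not part of the patent), the parallel multiply-accumulate of the formula above can be emulated in a few lines, with the conductance matrix g[k, j] playing the role of the weight matrix; the array sizes and conductance range are arbitrary assumptions.

```python
import numpy as np

# Illustrative sketch: the crossbar computes i_j = sum_k g[k, j] * v[k]
# (Ohm's law per cell, Kirchhoff's current law along each source line).
rng = np.random.default_rng(0)
n_inputs, m_outputs = 4, 3                               # assumed array size
G = rng.uniform(1e-6, 1e-4, size=(n_inputs, m_outputs))  # conductances (S)
v = np.array([0.1, 0.2, 0.0, 0.3])                       # read voltages (V)

i = v @ G                                                # all output currents at once

# The same result, accumulated cell by cell:
i_ref = np.array([sum(G[k, j] * v[k] for k in range(n_inputs))
                  for j in range(m_outputs)])
assert np.allclose(i, i_ref)
```

In hardware all output currents appear simultaneously; the `v @ G` line is the software analogue of that single-step parallelism.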

It is noted that, for example, in some examples, each weight of the neural network weight matrix may also be implemented using two memristors. That is, the output of one column of output current may be achieved by two columns of memristors in the memristor array. In this case, a neural network weight matrix representing a size of m rows and n columns requires an array of m rows and 2n columns of memristors.
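The two-memristors-per-weight scheme mentioned above can be sketched as a differential conductance pair; the linear scaling and the conductance ceiling below are assumptions for illustration, not values from the patent.

```python
G_MAX = 1e-4  # assumed maximum cell conductance in siemens

def encode_weight(w):
    """Encode a signed weight in [-1, 1] as a (g_plus, g_minus) pair, w ∝ g_plus - g_minus."""
    if w >= 0:
        return w * G_MAX, 0.0
    return 0.0, -w * G_MAX

def decode_weight(g_plus, g_minus):
    """Recover the signed weight from the differential pair."""
    return (g_plus - g_minus) / G_MAX

# Round trip: every signed weight survives encoding into two non-negative conductances.
for w in (-1.0, -0.5, 0.0, 0.75, 1.0):
    g_plus, g_minus = encode_weight(w)
    assert g_plus >= 0.0 and g_minus >= 0.0
    assert abs(decode_weight(g_plus, g_minus) - w) < 1e-12
```

Because each signed weight occupies two columns, an m-row, n-column weight matrix needs an m-row, 2n-column array, matching the note above.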

It should be noted that the current output by the memristor array is an analog current. In some examples, the analog current may be converted into a digital signal by an analog-to-digital converter (ADC) and transferred to the second neuron layer; the second neuron layer may then convert the digital signal back into an analog voltage by a digital-to-analog converter (DAC) and connect with another neuron layer through another memristor array. In other examples, the analog current may be converted into an analog voltage by a sample-and-hold circuit for transmission to the second neuron layer.

The memristor cells in the memristor array of fig. 1 may be, for example, a 1T1R structure or a 2T2R structure, where the memristor cells of the 1T1R structure include one switching transistor and one memristor, and the memristor cells of the 2T2R structure include two switching transistors and two memristors, and the embodiments of the present disclosure do not limit the types of memristors employed.

It should be noted that the transistors used in the embodiments of the present disclosure may be thin film transistors or field effect transistors (e.g., MOS field effect transistors) or other switching devices with the same characteristics. The source and drain of the transistor used herein may be symmetrical in structure, so that there may be no difference in structure between the source and drain. Embodiments of the present disclosure do not limit the type of transistors employed.

FIG. 2 shows a schematic diagram of a memristor cell of a 1T1R structure. As shown in fig. 2, the memristor cell of the 1T1R structure includes one transistor M1 and one memristor R1.

The following embodiments will be described by taking an example in which the transistor M1 is an N-type transistor.

The word line terminal WL is used to apply a corresponding voltage to the gate of the transistor M1, thereby controlling the transistor M1 to be turned on or off. When the memristor R1 is operated, for example, a set operation or a reset operation, the transistor M1 needs to be turned on first, that is, a turn-on voltage needs to be applied to the gate of the transistor M1 through the word line terminal WL. After the transistor M1 is turned on, for example, a voltage may be applied to the memristor R1 by applying voltages to the memristor R1 at the source line terminal SL and the bit line terminal BL to change the resistance state of the memristor R1. For example, a set voltage may be applied through the bit line terminal BL to cause the memristor R1 to be in a low resistance state; for another example, a reset voltage may be applied across the source terminal SL to place the memristor R1 in a high resistance state. For example, the resistance value in the high resistance state is 100 times or more, for example 1000 times or more, the resistance value in the low resistance state.

It should be noted that, in the embodiment of the present disclosure, by applying voltages to the word line terminal WL and the bit line terminal BL at the same time, the resistance value of the memristor R1 may be made smaller and smaller, that is, the memristor R1 changes from the high resistance state to the low resistance state, and an operation of changing the memristor R1 from the high resistance state to the low resistance state is referred to as a set operation; by applying voltages to the word line terminal WL and the source line terminal SL simultaneously, the resistance value of the memristor R1 can be made larger and larger, that is, the memristor R1 changes from the low resistance state to the high resistance state, and the operation of changing the memristor R1 from the low resistance state to the high resistance state is called a reset operation. For example, the memristor R1 has a threshold voltage that does not change the resistance value (or conductance value) of the memristor R1 when the input voltage magnitude is less than the threshold voltage of the memristor R1. In this case, a calculation may be made with the resistance value (or conductance value) of the memristor R1 by inputting a voltage less than the threshold voltage; the resistance value (or conductance value) of the memristor R1 may be changed by inputting a voltage greater than a threshold voltage.
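The set/reset behavior described above can be captured in a toy behavioral model. The threshold voltage, resistance values, and pulse amplitudes below are assumptions for illustration only, and the word-line transistor is taken to be already turned on.

```python
# Toy behavioral model (assumed numbers, no real device physics): a sub-threshold
# voltage reads the cell without disturbing it; an above-threshold bit-line pulse
# sets it (high -> low resistance) and an above-threshold source-line pulse
# resets it (low -> high resistance).
class Memristor1T1R:
    V_TH = 1.0            # assumed threshold voltage (V)
    R_HIGH = 1_000_000    # high-resistance state (ohm)
    R_LOW = 1_000         # low-resistance state; ratio >= 1000x, as in the text

    def __init__(self):
        self.r = self.R_HIGH

    def set_pulse(self, v_bl):
        # Bit-line pulse above threshold: set operation, resistance drops.
        if v_bl > self.V_TH:
            self.r = self.R_LOW

    def reset_pulse(self, v_sl):
        # Source-line pulse above threshold: reset operation, resistance rises.
        if v_sl > self.V_TH:
            self.r = self.R_HIGH

    def read(self, v_read=0.2):
        # Sub-threshold read: the state is untouched, current follows Ohm's law.
        assert v_read < self.V_TH
        return v_read / self.r

cell = Memristor1T1R()
cell.set_pulse(2.0)
low_current = cell.read()
cell.reset_pulse(2.0)
high_current = cell.read()
assert low_current > high_current   # low resistance -> larger read current
```

The model only encodes the rule stated in the text: pulses below the threshold leave the resistance (conductance) unchanged, so computation uses sub-threshold voltages while programming uses above-threshold pulses.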

For example, the neural network may be a neural network of any structure, such as a full convolution network, a deep convolution network, a U-type network (U-Net), and the like, and the structure of the neural network is not limited by the present disclosure.

For example, a neural network typically includes an input, an output, and a plurality of processing layers. The input end receives data to be processed, such as an image to be processed; the output end outputs a processing result, such as a processed image. The processing layers may include convolutional layers, fully connected layers, and the like, whose computation mainly consists of multiply-accumulate operations, and their contents depend on the structure of the neural network. After input data enters the neural network, the corresponding output is obtained through the plurality of processing layers; for example, the input data may undergo operations such as convolution, upsampling, downsampling, normalization, full connection, and flattening, which the present disclosure does not limit.

Fig. 3 shows a schematic diagram of an exemplary neural network architecture. As shown in fig. 3, the neural network includes 3 layers of neuron layers, an input layer 101, a hidden layer 102, and an output layer 103. The input layer 101 has 4 inputs, the hidden layer 102 has 3 outputs, and the output layer 103 has 2 outputs.

For example, the 4 inputs to the input layer 101 may be 4 images, or four feature images of 1 image. The 3 outputs of the hidden layer 102 may be feature images of the image input via the input layer 101.

For example, as shown in fig. 3, the neural network includes two processing layers, a first convolutional layer 201 and a second convolutional layer 202. As shown in FIG. 3, each convolutional layer has weights w_ij^k and biases b_i^k; the weights w_ij^k represent convolution kernels, and the biases b_i^k are scalars superimposed on the outputs of the convolutional layer, where k is a label denoting the input layer 101 or the hidden layer 102, and i and j are labels of the units of the input layer 101 and the hidden layer 102, respectively. For example, the first convolutional layer 201 includes a first set of convolution kernels (the w_ij^1 in FIG. 3) and a first set of biases (the b_i^1 in FIG. 3), and the second convolutional layer 202 includes a second set of convolution kernels (the w_ij^2 in FIG. 3) and a second set of biases (the b_i^2 in FIG. 3). Typically, each convolutional layer comprises tens or hundreds of convolution kernels; a deep convolutional neural network may comprise at least five convolutional layers.

Calculation in a neural network, such as convolution and fully connected calculation, mainly consists of multiply-accumulate operations, so functional layers such as convolutional layers and fully connected layers can be implemented with memristor arrays. For example, the weights of the convolutional and fully connected layers can be represented by the conductances of a memristor array, while the inputs of those layers can be represented by corresponding voltage stimuli, so that the convolution and fully connected calculations can each be realized according to Kirchhoff's law. Depending on the parameter mapping, one memristor array may implement the multiply-accumulate calculation of one processing layer or of multiple processing layers, which the present disclosure does not limit.
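As a sketch of the parameter-mapping idea (the linear scaling and conductance window are assumptions for illustration, not the patent's method), a layer's weight matrix can be rescaled into the programmable conductance range of an array:

```python
import numpy as np

def map_weights_to_conductance(w, g_min=1e-6, g_max=1e-4):
    """Linearly rescale a weight matrix into an assumed conductance window [g_min, g_max]."""
    w_abs_max = np.abs(w).max()
    return g_min + (w / w_abs_max + 1.0) / 2.0 * (g_max - g_min)

# A toy fully connected layer's weights become target conductances.
w = np.array([[-0.8, 0.2],
              [0.5, -0.1]])
g = map_weights_to_conductance(w)

assert g.min() >= 1e-6 and g.max() <= 1e-4   # everything inside the window
assert g[1, 0] > g[0, 1]                      # larger weight -> larger conductance
```

This single-cell scheme needs the scale and offset undone when interpreting output currents; the differential two-cell encoding mentioned earlier avoids the offset altogether.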

Fig. 4 shows a schematic block diagram of a data processing apparatus provided in at least one embodiment of the present disclosure. The data processing device is used for data processing of a neural network, for example, the neural network comprises at least one first processing layer and at least one second processing layer, the at least one first processing layer being closer to an input of the neural network than the at least one second processing layer.

For example, as shown in fig. 3, the neural network includes a first convolutional layer 201 and a second convolutional layer 202, where the second convolutional layer 202 is closer to the output end of the neural network than the first convolutional layer 201, and the first convolutional layer is a first processing layer and the second convolutional layer is a second processing layer.

For example, in other embodiments, the neural network may include 5 processing layers, which are, in order from the input end to the output end, a first processing layer 1, a first processing layer 2, a first processing layer 3, a second processing layer 1 and a second processing layer 2. For example, the first processing layers 1 to 3 may all be convolutional layers, while the second processing layers 1 and 2 may be convolutional layers or fully-connected layers.

For example, the second processing layer includes the last processing layer of the neural network, i.e., the multiply-accumulate processing layer closest to the output of the neural network. The first processing layers include the other processing layers of the neural network, i.e., all processing layers except the second processing layer closest to the output of the neural network.
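The partition rule described above (the multiply-accumulate layer closest to the output becomes the second processing layer; the earlier layers are first processing layers) can be sketched as follows, with invented layer names:

```python
# Hypothetical sketch of the partition rule: the last multiply-accumulate
# layer becomes the fine-tunable "second" processing layer, and all earlier
# multiply-accumulate layers are "first" processing layers.

def split_layers(layer_names):
    first = layer_names[:-1]   # held in the high-retention array
    second = layer_names[-1:]  # held in the high-endurance array
    return first, second

# invented layer names for a 5-layer network
first, second = split_layers(["conv1", "conv2", "conv3", "fc1", "fc2"])
```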

For a large-scale neural network, a hybrid training strategy can be adopted to cope with the non-ideal characteristics of memristors. For example, the neural network is first trained in software, the ideal weights of the software-trained network are mapped to the memristor arrays, and the weights of the second processing layer of the neural network are then fine-tuned online. Because only the weights of the second processing layer are adjusted while the weight values of the first processing layers are left unchanged, the non-ideal characteristics of the memristors can be overcome with a small training overhead, effectively improving the performance of the neural network.

However, the training process of a neural network places certain demands on the lifetime of the memristor devices. For example, even a simple two-layer artificial neural network (ANN) requires more than 5000 training iterations, and if higher training precision and learning-rate decay are taken into account, the weights must be updated even more times.

Therefore, in this training mode, the data in some memristor cells must be retained for a long time, while other memristor cells must be fine-tuned according to the actual performance of the neural network. If the memristor array contained only memristors of a single structure, both high data retention and high endurance would be demanded of the same device, which is difficult to achieve.

For example, as shown in fig. 4, a data processing apparatus 400 provided by at least one embodiment of the present disclosure includes a first memristor array 401 and a second memristor array 402. The first memristor array 401 includes a plurality of first memristor cells arranged in an array and is configured to store a weight value matrix corresponding to at least one first processing layer; the second memristor array 402 includes a plurality of second memristor cells arranged in an array and is configured to store a weight value matrix corresponding to at least one second processing layer.

For example, the structures of the first memristor array 401 and the second memristor array 402 may refer to the relevant contents of fig. 1, and are not described in detail here.

For example, the weight value matrices of the multiple first processing layers may be obtained by arranging the weights of the respective first processing layers in a certain order, and the weight value matrices of the multiple second processing layers may likewise be obtained by arranging the weights of the respective second processing layers in a certain order; for the concept of processing-layer weights, reference may be made to the description of fig. 3, which is not repeated here.

For example, the data retention of the plurality of first memristor cells is better than the data retention of the plurality of second memristor cells, and/or the endurance of the plurality of second memristor cells is better than the endurance of the plurality of first memristor cells. That is, both properties may hold together, or either may hold alone: the data retention of the plurality of first memristor cells is better than that of the plurality of second memristor cells, or the endurance of the plurality of second memristor cells is better than that of the plurality of first memristor cells.

That is, different processing layers are implemented with memristors having different data retention and endurance. For example, the plurality of first processing layers of the neural network closer to the input end are implemented with memristors having better data retention but relatively poorer endurance, so that the weights of these first processing layers are kept as unchanged as possible during training, reducing the number of subsequent training operations. The second processing layer closer to the output end, i.e., the processing layer requiring fine-tuning training, is implemented with memristors having good endurance but relatively poorer data retention, and its weights are continuously adjusted during training to adapt to the interference of non-ideal factors. Integrating two kinds of memristors with different characteristics in the same hardware structure facilitates flexible implementation of neural network processing layers that place different demands on the memristors.

FIG. 5 shows a schematic of a neural network mapping to a memristor array.

As shown in fig. 5, the neural network includes two processing layers, namely the first processing layer and the second processing layer shown in fig. 5. The weight value matrix of the first processing layer is W1 and the weight value matrix of the second processing layer is W2. The weight value matrix W1 of the first processing layer is mapped into the first memristor array, and the weight value matrix W2 of the second processing layer is mapped into the second memristor array. The first memristor array includes a plurality of first memristor cells arranged in an array and is configured to store the weight value matrix W1; the second memristor array includes a plurality of second memristor cells arranged in an array and is configured to store the weight value matrix W2.

For example, the first memristor array is composed of memristors with better data retention, and the second memristor array is composed of memristors with better endurance. For example, the structure of the first memristor cell and the second memristor cell may also be 2T2R; that is, the present disclosure does not limit the structure of the memristor cells, as long as the first memristor cells have the better data retention and the second memristor cells have the better endurance.

The first memristor array further includes a word line driver (WL driver 1 in fig. 5) and a bit line driver (BL driver 1 in fig. 5), and similarly, the second memristor array further includes a word line driver (WL driver 2 in fig. 5) and a bit line driver (BL driver 2 in fig. 5), and for the description of the word line driver and the bit line driver, reference may be made to the relevant contents of fig. 1, and details are not repeated here.

For example, the weight value matrix may be mapped into the memristor array in any feasible manner; the present disclosure does not specifically limit the mapping position of the weight values, the method for writing the weight values into the memristors, and the like.
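Since the disclosure leaves the mapping scheme open, the following sketch shows one common possibility (a hypothetical choice here, not specified by the disclosure): differential-pair mapping, in which each signed weight is encoded as the difference of two conductances clipped to an assumed programmable window:

```python
import numpy as np

# Hypothetical differential-pair mapping: each signed weight is encoded as
# w ~= (G_pos - G_neg) / G_SCALE, with each conductance clipped to an
# assumed programmable window [G_MIN, G_MAX].

G_MIN, G_MAX = 1e-6, 1e-4   # assumed programmable conductance window (S)
G_SCALE = 1e-4              # assumed weight-to-conductance scale factor

def map_weight(w: float):
    """Program a signed weight into a (G_pos, G_neg) conductance pair."""
    g = float(np.clip(abs(w) * G_SCALE, G_MIN, G_MAX))
    return (g, G_MIN) if w >= 0 else (G_MIN, g)

def read_weight(g_pos: float, g_neg: float) -> float:
    """Recover the weight from a differential conductance pair."""
    return (g_pos - g_neg) / G_SCALE
```

The recovered weight differs from the original by at most G_MIN/G_SCALE, i.e., the floor of the programmable window; other schemes (offset columns, bias rows) trade this error differently.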

For example, the plurality of first memristor cells and/or the plurality of second memristor cells are Resistive Random Access Memories (RRAMs).

For example, the RRAM device in the first memristor cell has a TiN/Ta2O5/TaOx/TiN material stack structure, with, for example, metal electrodes further provided on both sides of the stack; the RRAM device in the second memristor cell has a TiN/HfO2/Ti/TiN material stack, likewise with, for example, metal electrodes further provided on both sides of the stack.

As described above, there is a trade-off between the endurance and data retention of RRAM devices: devices with good data retention tend to have poor endurance, while devices with good endurance tend to have poor data retention. An RRAM device based on the TiN/Ta2O5/TaOx/TiN stack shows good data retention (ten years at 117 °C) but an endurance of only about 10^4 cycles, while an RRAM device based on the TiN/HfO2/Ti/TiN stack exhibits better endurance (10^8 cycles) but can retain its data for ten years only at 78 °C.

FIG. 6 shows a schematic diagram of 1T1R cell structures based on the TiN/Ta2O5/TaOx/TiN stack and the TiN/HfO2/Ti/TiN stack.

For example, the plurality of first memristor cells and the plurality of second memristor cells are fabricated on the same semiconductor substrate, e.g., the same silicon wafer, for example by sequentially depositing TiN, Ta2O5, TaOx, TiN, HfO2, Ti and TiN layers on the semiconductor substrate carrying the transistors. Compared with a process that fabricates an RRAM device with a single material stack, this method requires more lithography steps, but it reduces the total fabrication cost and improves the integration density. Moreover, integrating the two device types in the same chip facilitates flexible implementation of neural network weight layers that place different demands on the devices.

FIG. 7A shows a test chart of the data retention of an RRAM device based on the TiN/Ta2O5/TaOx/TiN stack.

As shown in FIG. 7A, the RRAM device based on the TiN/Ta2O5/TaOx/TiN stack exhibits good data retention: at 250 °C it maintains its resistance in both the low resistance state (LRS) and the high resistance state (HRS) over a long period (10^6 s).

FIG. 7B shows a test chart of the endurance of an RRAM device based on the TiN/Ta2O5/TaOx/TiN stack.

As shown in FIG. 7B, the RRAM device based on the TiN/Ta2O5/TaOx/TiN stack exhibits poor endurance: after the stored data has been switched between the high resistance state (HRS) and the low resistance state (LRS) about 10^4 times, its resistance changes greatly, i.e., its endurance is only about 10^4 cycles.
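A back-of-the-envelope check connects these endurance figures to the training requirement mentioned earlier (more than 5000 iterations even for a small two-layer ANN). The helper below is a hypothetical sketch using the cycle counts quoted above; the number of writes per iteration is invented for illustration:

```python
# Back-of-the-envelope sketch: how many training iterations fit within each
# device's endurance budget. Cycle counts are the figures quoted in the text;
# writes_per_iteration is an invented illustration parameter.

ENDURANCE_CYCLES = {
    "Ta2O5_cell": 10**4,  # high retention, low endurance
    "HfO2_cell": 10**8,   # high endurance, lower retention
}

def max_training_iterations(cell: str, writes_per_iteration: int = 1) -> int:
    """Iterations possible before the endurance budget is exhausted."""
    return ENDURANCE_CYCLES[cell] // writes_per_iteration
```

Under these assumptions the Ta2O5-based cell is exhausted within roughly one training run, while the HfO2-based cell leaves four orders of magnitude of margin, which is why only the latter is used for the fine-tuned second processing layer.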

For example, as shown in fig. 4, the data processing apparatus further includes a control driving circuit 403. The control driving circuit 403 is configured to set the weight value data stored in the plurality of first memristor cells and/or the weight value data stored in the plurality of second memristor cells, and may be implemented by, for example, a data processing circuit, a microcontroller, or the like.

For example, the control driving circuit 403 is configured to set the memristor array, map a weight value matrix corresponding to at least one first processing layer to a corresponding memristor cell in the first memristor array, and map a weight value matrix corresponding to at least one second processing layer to a corresponding memristor cell in the second memristor array.

It should be noted that, for clarity and conciseness of representation, not all the constituent elements of the data processing apparatus are given in the embodiments of the present disclosure. Other constituent elements not shown may be provided and arranged according to specific needs by those skilled in the art to realize the necessary functions of the data processing apparatus, and the embodiment of the present disclosure is not limited thereto.

Fig. 8 is a schematic flowchart of a data processing method for a neural network according to at least one embodiment of the present disclosure. The data processing method is applied to the data processing apparatus according to at least one embodiment of the present disclosure, so for the relevant details of the data processing apparatus, reference may be made to the foregoing description, which is not repeated here.

As shown in fig. 8, the data processing method includes steps S801 to S802.

Step S801: a weight value matrix corresponding in at least one first processing layer is mapped into a first memristor array.

Step S802: a weight value matrix corresponding to at least one second processing layer is mapped into a second memristor array.

For example, an implementation of mapping a weight value matrix of a neural network to a memristor array is shown in fig. 5, where the conductance value of each memristor cell of the memristor array is used to represent the weight value of one node connection.

The data processing method provided by the embodiment adopts two memristor units with different data retentivity and durability to store the weight matrixes of different processing layers of the neural network, so that the flexible implementation of the neural network weight layers with different requirements on memristor devices is facilitated.

In some embodiments of the present disclosure, the data processing method further includes: modifying at least one weight value in the at least one second processing layer, and mapping the modified weight values in the at least one second processing layer into the second memristor array.

For example, modifying at least one weight value in the at least one second processing layer and mapping the modified weight values into the second memristor array may include: acquiring an image to be trained and a standard output value corresponding to the image to be trained; processing the image to be trained with the neural network to obtain a training result; calculating a loss value of the neural network based on the training result and the standard output value; and modifying at least one weight value in the at least one second processing layer based on the loss value, and mapping the modified weight values in the at least one second processing layer into the second memristor array.
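The steps above can be sketched in miniature. The toy network below (the shapes, data and learning rate are invented for illustration) freezes the first-layer weights W1, as the high-retention first array would hold them, and corrects only the second-layer weights W2 from the gradient of the loss:

```python
import numpy as np

# Toy sketch of loss-driven correction of the second processing layer only.
# W1 stands for the frozen weights in the high-retention first array; W2 for
# the fine-tunable weights in the high-endurance second array.

rng = np.random.default_rng(0)
W1 = rng.normal(size=(4, 3))   # first-layer weights: never rewritten
W2 = np.zeros((3, 2))          # second-layer weights: updated during training

def finetune_step(x, target, lr=0.01):
    """One fine-tuning step: compute the loss and update only W2."""
    global W2
    h = x @ W1                 # first layer (frozen)
    err = h @ W2 - target      # gradient of 0.5 * sum(err**2) w.r.t. output
    W2 -= lr * h.T @ err       # only the second array is reprogrammed
    return 0.5 * float(np.sum(err**2))

x = rng.normal(size=(1, 4))    # stand-in for the image to be trained
target = np.array([[1.0, -1.0]])  # stand-in for the standard output value
losses = [finetune_step(x, target) for _ in range(1000)]
```

The loss decreases even though W1 is never touched, mirroring how the hybrid strategy compensates for device non-idealities while leaving the first array's resistances undisturbed.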

For example, the data processing method provided in at least one embodiment of the present disclosure further includes: pre-training the target neural network to obtain a trained or partially trained neural network, thereby obtaining a weight value matrix of the at least one first processing layer and a weight value matrix of the at least one second processing layer.

For example, the resistances of the first memristor cells remain unchanged during the correction of the weight values in the at least one second processing layer.

For example, a neural network is first trained offline (for example, on a computer or similar platform) to obtain a trained or partially trained neural network; the weight value matrix corresponding to the first processing layer of the neural network is mapped into the first memristor array, and the weight value matrix corresponding to the second processing layer is mapped into the second memristor array. During training, the weight data in the first memristor array is kept as unchanged as possible, and only the weight value data in the second memristor array is adjusted. In this way, the good data retention of the first memristor array is exploited so that the mapped resistance values stay stable for a long time and the number of training operations is reduced; meanwhile, the strong endurance of the second memristor array is exploited so that its memristor cells can be reprogrammed many times, and updating the weight value data in the second memristor array adapts the network to the interference of non-ideal device factors. This makes the apparatus better suited to neural network training and improves the performance of the neural network.

Fig. 9 shows a processing flow chart of a data processing method provided by at least one embodiment of the present disclosure.

First, a target neural network is pre-trained to obtain a trained or in-training neural network, thereby obtaining a weight value matrix for at least one first processing layer and a weight value matrix for at least one second processing layer. For example, the neural network includes a plurality of first processing layers and a second processing layer, and for example, a target neural network to be trained may be trained on a computer in advance, so as to obtain a plurality of weight value matrices corresponding to the first processing layers and a plurality of weight value matrices corresponding to the second processing layers.

Next, a weight value matrix of the at least one first processing layer is mapped into a first memristor array, and a weight value matrix of the at least one second processing layer is mapped into a second memristor array.

Next, it is determined whether the neural network meets a predetermined accuracy condition. If not, an image to be trained and the standard output value corresponding to the image to be trained are acquired, the image to be trained is input into the first memristor array and the second memristor array to obtain a training result, and a loss value of the neural network is calculated from the training result and the standard output value.

Then, the weight value matrix corresponding to the second processing layer is corrected based on the loss value, and the corrected weight values are updated into the corresponding memristor cells in the second memristor array, while the conductance values of the memristor cells in the first memristor array are kept unchanged during training.

The training process then continues, correcting the weight value matrix corresponding to the second processing layer, until the neural network meets the predetermined accuracy condition.
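The loop just described can be outlined as follows; `eval_accuracy` and `finetune_step` are hypothetical callbacks standing in for reading out the arrays and reprogramming the second array, and the accuracy numbers are invented:

```python
# Hypothetical outline of the flow in fig. 9: weights are mapped once, then
# only the second array is fine-tuned until a predetermined accuracy
# condition holds.

def hybrid_train(eval_accuracy, finetune_step, target_acc=0.95, max_epochs=50):
    for epoch in range(max_epochs):
        if eval_accuracy() >= target_acc:
            return epoch       # accuracy condition met; stop fine-tuning
        finetune_step()        # writes only to the second memristor array
    return max_epochs

# toy stand-ins: each fine-tuning step raises accuracy by 0.1 from 0.5
state = {"acc": 0.5}
epochs = hybrid_train(lambda: state["acc"],
                      lambda: state.update(acc=state["acc"] + 0.1))
```

Note that the termination check precedes each fine-tuning step, so no write to the second array is spent once the accuracy condition already holds, conserving the endurance budget.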

The process flow diagram shown in fig. 9 is described below with respect to a particular application.

For example, consider the handwritten digit recognition task on MNIST (Mixed National Institute of Standards and Technology database). First, a target neural network is pre-trained on a computer to obtain an ideal neural network, for example a convolutional neural network including 5 processing layers: 4 first processing layers and one second processing layer, where the 4 first processing layers include three convolutional layers and one fully-connected layer, and the second processing layer includes the fully-connected layer closest to the output end of the neural network.

Then, the weight value matrices of the 4 first processing layers are mapped into the first memristor array, and the weight value matrix of the 1 second processing layer is mapped into the second memristor array; the specific mapping manner is not repeated here.

Because RRAM devices suffer from non-ideal factors such as write errors and device yield problems, it is difficult for the weight value matrix mapped into the memristor array to reach the same distribution as the ideal weights, so fine-tuning training of the memristor array is required to compensate for the degradation of recognition accuracy.

For example, as shown in fig. 9, it is determined whether the neural network satisfies a predetermined accuracy condition, and if not, fine tuning training is performed.

In the fine-tuning training, an image to be trained and the standard output value corresponding to the image to be trained are first acquired, the image to be trained is input into the first memristor array and the second memristor array to obtain a training result, and a loss value of the neural network is calculated from the training result and the standard output value.

Then, the weight value matrix corresponding to the second processing layer is corrected based on the loss value, and the corrected weight values are updated into the corresponding memristor cells in the second memristor array, while the conductance values of the memristor cells in the first memristor array are kept unchanged during training. Through this fine-tuning training, the recognition accuracy of the network can be restored to a better value.

For example, the RRAM devices may drift after a period of time, or similar new knowledge may need to be learned; in that case it is only necessary to repeat the fine-tuning training process, keeping the weight value matrices of the plurality of first processing layers unchanged and adjusting the weight value matrix of the second processing layer. Since the weight value matrices of the four first processing layers are mapped onto the first memristor array, whose RRAM devices have good data retention, the resistance values of the RRAM devices in the first memristor array remain unchanged for a long time, and this stability reduces the number of training operations. The weight value matrix of the second processing layer is mapped onto the second memristor array, whose RRAM devices have good endurance and can be reprogrammed many times, making them well suited to neural network training.

The following points need to be explained:

(1) The drawings of the embodiments of the present disclosure relate only to the structures involved in the embodiments of the present disclosure; for other structures, reference may be made to common designs.

(2) Without conflict, embodiments of the present disclosure and features of the embodiments may be combined with each other to arrive at new embodiments.

The above description is intended to be exemplary of the present disclosure, and not to limit the scope of the present disclosure, which is defined by the claims appended hereto.
