Method and apparatus for programming analog neural memory in deep learning artificial neural network
Reading note: This technology, "Method and apparatus for programming analog neural memory in deep learning artificial neural network," was created by H. V. Tran, V. Tiwari, N. Do, and M. Reiten on 2019-01-18. Its main content: The present invention discloses many embodiments of programming systems and methods for use with a vector-matrix multiplication (VMM) array in an artificial neural network, by which a selected cell can be programmed very precisely to hold one of N different values.
1. A method of programming a selected memory cell in a vector-matrix multiplication array comprising rows and columns of flash memory cells, the method comprising:
programming a plurality of flash memory cells in the array;
partially erasing the plurality of flash memory cells in the array;
performing a first programming operation on a selected cell within the plurality of flash memory cells, wherein a first charge level is stored on a floating gate of the selected cell;
performing a second programming operation on the selected cell, wherein an additional level of charge is stored on the floating gate of the selected cell; and
repeating the second programming operation until a desired level of charge is stored on the selected cell or until the second programming operation is performed a maximum number of times.
2. The method of claim 1, wherein a verifying step is performed after each second programming operation, and the repeating step occurs if the verifying step produces a first result and does not occur if the verifying step produces a second result.
3. The method of claim 1, further comprising a hard programming step for unused memory cells.
4. The method of claim 1, wherein the first programming operation and the second programming operation utilize pulse modulation.
5. The method of claim 1, wherein the first programming operation and the second programming operation utilize programming current modulation.
6. The method of claim 1, wherein the first and second programming operations utilize pulse modulation for a selected memory cell and constant programming pulses for another selected memory cell.
7. The method of claim 1, wherein the first programming operation and the second programming operation utilize high voltage level modulation.
8. The method of claim 2, wherein the verifying step comprises comparing the current consumed by the selected cell with the current consumed by a reference matrix.
9. The method of claim 1, wherein the memory cell is a split gate memory cell.
10. The method of claim 1, wherein the memory cell is a stacked gate memory cell.
11. The method of claim 1, wherein the reference matrix comprises a shared control gate and a shared erase gate.
12. The method of claim 1, wherein the reference matrix comprises a shared control gate and a separate erase gate for each reference cell.
13. The method of claim 1, wherein the reference current is combined from a plurality of constant currents, delta currents, and differential currents from a reference matrix.
14. A circuit for comparing a current consumed by a selected memory cell of a vector matrix multiplier with a current consumed by a reference matrix, the circuit comprising:
a first circuit comprising a first PMOS transistor coupled to a first NMOS transistor, the first NMOS transistor coupled to the selected memory cell; and
a second circuit comprising a second PMOS transistor coupled to a second NMOS transistor, the second NMOS transistor coupled to the reference matrix;
wherein a node between the second PMOS transistor and the second NMOS transistor outputs a current indicative of a value stored in the selected memory cell.
15. The circuit of claim 14, further comprising an array leakage compensation circuit.
16. The circuit of claim 14, further comprising a reference array leakage compensation circuit.
17. The circuit of claim 14, wherein the selected memory cell is a split gate memory cell.
18. The circuit of claim 14, wherein the selected memory cell is a stacked gate memory cell.
19. A circuit for comparing a current consumed by a selected cell of a vector matrix multiplier with a current consumed by a reference matrix, the circuit comprising:
a transistor to compare the current received from a first node to a reference current, wherein the first node is selectively coupled to a reference matrix through a first switch or to a selected memory cell through a second switch;
wherein a voltage on the first node is indicative of a comparison result.
20. The circuit of claim 19, wherein the reference current is held on the transistor.
21. The circuit of claim 19, further comprising:
a comparator for comparing the current received from a first node with a reference current;
wherein an output of the comparator indicates whether a value stored in the selected cell exceeds a value stored in the reference matrix.
22. The circuit of claim 19, further comprising an array leakage compensation circuit.
23. The circuit of claim 19, further comprising a reference array leakage compensation circuit.
24. The circuit of claim 19, wherein the selected cell is a split gate memory cell.
25. The circuit of claim 19, wherein the selected cell is a stacked gate memory cell.
26. A circuit for converting a memory cell current signal of a vector matrix multiplier to a set of digital bits, the circuit comprising:
a memory cell current source;
a switch coupled to the current source;
a capacitor coupled to the current source in parallel with the switch;
a comparator having one input coupled to the current source, the switch, and the capacitor and another input coupled to a voltage reference; and
a counter coupled to receive an output from the comparator and output a digital count signal.
27. The circuit of claim 26, wherein the memory current is held on a transistor.
28. The circuit of claim 26, further comprising an array leakage compensation circuit.
29. The circuit of claim 26, wherein the memory cell is a split gate memory cell.
30. The circuit of claim 26, wherein the memory cell is a stacked gate memory cell.
31. A circuit for converting a memory cell current signal to a slope for a vector matrix multiplier, the circuit comprising:
a memory cell current source;
a switch coupled to the current source;
a capacitor coupled to the current source in parallel with the switch;
a comparator having one input coupled to the current source, the switch, and the capacitor and another input coupled to a voltage reference, wherein an output comparison of the comparator is indicative of the slope of the current output by the current source.
32. The circuit of claim 31, wherein the memory current is held on a transistor.
33. The circuit of claim 31, further comprising an array leakage compensation circuit.
34. The circuit of claim 31, wherein the memory cell is a split gate memory cell.
35. The circuit of claim 31, wherein the memory cell is a stacked gate memory cell.
36. The circuit of claim 31, wherein the voltage reference is time-multiplexed to indicate a value of the memory cell.
37. A circuit for converting a memory cell current signal to a slope for a vector matrix multiplier, the circuit comprising:
a memory cell current source;
a transistor coupled to the current source;
a switch coupled to the transistor;
a capacitor coupled to the transistor in parallel with the switch;
a comparator having one input coupled to the transistor, the switch, and the capacitor and another input coupled to a voltage reference, wherein an output comparison of the comparator indicates a slope of the current output by the current source.
38. The circuit of claim 37, wherein the memory current is held on a transistor.
39. The circuit of claim 37, further comprising an array leakage compensation circuit.
40. The circuit of claim 37, wherein the memory cell is a split gate memory cell.
41. The circuit of claim 37, wherein the memory cell is a stacked gate memory cell.
Technical Field
The present invention discloses many embodiments of programming apparatus and methods for use with a vector-matrix multiplication (VMM) array in an artificial neural network.
Background
Artificial neural networks mimic biological neural networks (the central nervous system of animals, in particular the brain) and are used to estimate or approximate functions that can depend on a large number of inputs and are generally unknown. Artificial neural networks typically include layers of interconnected "neurons" that exchange messages with each other.
FIG. 1 illustrates an artificial neural network, where the circles represent the inputs or layers of neurons. The connections (called synapses) are indicated by arrows and have numerical weights that can be tuned based on experience. This makes the neural network adaptive to inputs and capable of learning. Typically, a neural network includes a layer of multiple inputs, one or more intermediate layers of neurons, and an output layer of neurons that provides the output of the network. The neurons at each level, individually or collectively, make decisions based on the data received from the synapses.
One of the major challenges in the development of artificial neural networks for high-performance information processing is the lack of adequate hardware technology. Indeed, practical neural networks rely on a very large number of synapses, enabling high connectivity between neurons, i.e., very high computational parallelism. In principle, such complexity can be achieved with digital supercomputers or clusters of specialized graphics processing units. However, in addition to their high cost, these approaches are energy inefficient compared to biological networks, which consume far less energy primarily because they perform low-precision analog computation. CMOS analog circuits have been used for artificial neural networks, but most CMOS-implemented synapses have been too bulky given the large number of neurons and synapses required.
Applicants previously disclosed an artificial (analog) neural network that utilizes one or more non-volatile memory arrays as the synapses in U.S. Patent Application 15/594,439, which is incorporated herein by reference. The non-volatile memory arrays operate as an analog neuromorphic memory. The neural network device includes a first plurality of synapses configured to receive a first plurality of inputs and to generate therefrom a first plurality of outputs, and a first plurality of neurons configured to receive the first plurality of outputs. The first plurality of synapses includes a plurality of memory cells, wherein each of the memory cells includes: spaced-apart source and drain regions formed in a semiconductor substrate, with a channel region extending between the source and drain regions; a floating gate disposed over and insulated from a first portion of the channel region; and a non-floating gate disposed over and insulated from a second portion of the channel region. Each of the plurality of memory cells is configured to store a weight value corresponding to a number of electrons on the floating gate. The plurality of memory cells is configured to multiply the first plurality of inputs by the stored weight values to generate the first plurality of outputs.
Each non-volatile memory cell used in an analog neuromorphic memory system must be erased and programmed to maintain a very specific and precise amount of charge in the floating gate. For example, each floating gate must hold one of N different values, where N is the number of different weights that can be indicated by each cell. Examples of N include 16, 32, and 64.
One challenge in VMM systems is being able to program selected cells with the precision and granularity required for different values of N. For example, extreme precision is required in the programming operation if the selected cell is to hold one of 64 different values.
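For a rough sense of the precision involved, the following Python sketch (an illustration only; the 0-20 nA operating window is an assumption, chosen to match the fine-programming target range of 100 pA-20 nA described later in this disclosure) computes the per-level current spacing for several values of N.

```python
# Per-level current spacing when one cell's operating window must encode N levels.
# The 0-20 nA window is an assumed figure for illustration only.
def level_spacing_na(i_min_na: float, i_max_na: float, n_levels: int) -> float:
    return (i_max_na - i_min_na) / (n_levels - 1)

for n in (16, 32, 64):
    print(f"N = {n:2d}: ~{level_spacing_na(0.0, 20.0, n):.3f} nA between levels")
```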
What is needed is an improved programming system and method suitable for use with a VMM in an analog neuromorphic memory system.
Disclosure of Invention
The present invention discloses many embodiments of programming systems and methods for use with a vector-matrix multiplication (VMM) array in an artificial neural network. Thus, the selected cell can be programmed very accurately to hold one of the N different values.
Drawings
FIG. 1 is a schematic diagram illustrating an artificial neural network.
FIG. 2 is a cross-sectional side view of a conventional 2-gate non-volatile memory cell.
FIG. 3 is a cross-sectional side view of a conventional 4-gate non-volatile memory cell.
FIG. 4 is a cross-sectional side view of a conventional 3-gate non-volatile memory cell.
FIG. 5 is a cross-sectional side view of another conventional 2-gate non-volatile memory cell.
FIG. 6 is a schematic diagram illustrating the different stages of an exemplary artificial neural network utilizing a non-volatile memory array.
FIG. 7 is a block diagram showing a vector multiplier matrix.
FIG. 8 is a block diagram showing the various stages of a vector multiplier matrix.
FIG. 9 shows another embodiment of a vector multiplier matrix.
FIG. 10 shows another embodiment of a vector multiplier matrix.
FIG. 11 shows operating voltages for performing operations on the vector multiplier matrix of FIG. 10.
FIG. 12 shows another embodiment of a vector multiplier matrix.
FIG. 13 shows operating voltages for performing operations on the vector multiplier matrix of FIG. 12.
FIG. 14 shows another embodiment of a vector multiplier matrix.
FIG. 15 shows operating voltages for performing operations on the vector multiplier matrix of FIG. 14.
FIG. 16 shows another embodiment of a vector multiplier matrix.
FIG. 17 shows operating voltages for performing operations on the vector multiplier matrix of FIG. 16.
FIGS. 18A and 18B illustrate a programming method for a vector multiplier matrix.
FIG. 19 shows waveforms for the programming of FIGS. 18A and 18B.
FIG. 20 shows waveforms for the programming of FIGS. 18A and 18B.
FIG. 21 shows waveforms for the programming of FIGS. 18A and 18B.
FIG. 22 shows a vector multiplier matrix system.
FIG. 23 shows a column driver.
FIG. 24 shows a plurality of reference matrices.
FIG. 25 shows a single reference matrix.
FIG. 26 shows a reference matrix.
FIG. 27 shows another reference matrix.
FIG. 28 shows a comparison circuit.
FIG. 29 shows another comparison circuit.
FIG. 30 shows another comparison circuit.
FIG. 31 shows a current-to-digital-bits circuit.
FIG. 32 shows waveforms for the circuit of FIG. 31.
FIG. 33 shows a current-to-slope circuit.
FIG. 34 shows waveforms for the circuit of FIG. 33.
FIG. 35 shows another current-to-slope circuit.
FIG. 36 shows waveforms for the circuit of FIG. 35.
Detailed Description
The artificial neural network of the present invention utilizes a combination of CMOS technology and a non-volatile memory array.
Non-volatile memory cell
Digital non-volatile memories are well known. For example, U.S. Patent 5,029,130 ("the '130 patent") discloses an array of split gate non-volatile memory cells, and is incorporated herein by reference for all purposes. Such a memory cell is shown in FIG. 2. Each memory cell 210 includes source and drain regions formed in a semiconductor substrate, with a channel region between them; a floating gate disposed over and insulated from a first portion of the channel region; and a word line terminal (select gate) disposed over and insulated from a second portion of the channel region.

Memory cell 210 is erased (with electrons removed from the floating gate) by placing a high positive voltage on the word line terminal, which causes electrons on the floating gate to tunnel through the intervening insulation to the word line terminal via Fowler-Nordheim tunneling. Memory cell 210 is programmed (with electrons placed on the floating gate) by placing a positive voltage on the word line terminal and a positive voltage on the source region, which causes hot electrons from the channel to be injected onto the floating gate.

Table 1 shows typical voltage ranges that may be applied to the terminals of memory cell 210 for performing read, erase, and program operations:

Table 1: Operation of flash memory cell 210 of FIG. 2
Other split gate memory cell configurations are known. For example, FIG. 3 depicts a four-gate memory cell 310 that includes a floating gate, a select gate (word line), a control gate, and an erase gate, in addition to its source and drain regions.
Table 2 shows typical voltage ranges that may be applied to the terminals of memory cell 310 for performing read, erase and program operations:
table 2: operation of flash memory cell 310 of FIG. 3
WL/SG
BL
CG
EG
SL
Reading
1.0-2V
0.6-2V
0-2.6V
0-2.6
0V
Erasing
-0.5V/ 0V
0V/-8V
8-
0V
Programming
1V
1μA
8-11V
4.5-9V
4.5-5V
FIG. 4 depicts a split-gate tri-gate memory cell 410. Memory cell 410 is the same as memory cell 310 of FIG. 3, except that memory cell 410 does not have a separate control gate. The erase operation (erasing through the erase gate) and the read operation are similar to those of FIG. 3, except that there is no control gate bias. The programming operation is also done without the control gate bias, so the programming voltage on the source line is higher to compensate for the lack of a control gate bias.
Table 3 shows typical voltage ranges that may be applied to the terminals of memory cell 410 for performing read, erase and program operations:
table 3: operation of flash memory cell 410 of FIG. 4
WL/SG
BL
EG
SL
Reading
0.7-2.2V
0.6-2V
0-2.6
0V
Erasing
-0.5V/
0V
11.5
0V
Programming
1V
2-3μA
4.5V
7-9V
FIG. 5 depicts a stacked gate memory cell 510. Memory cell 510 is similar to memory cell 210, except that the floating gate extends over the entire channel region and the control gate is stacked over and insulated from the floating gate (hence the term "stacked gate"). Erase occurs through the substrate (P-sub), as reflected in Table 4.
Table 4 shows typical voltage ranges that may be applied to the terminals of memory cell 510 for performing read, erase and program operations:
table 4: operation of flash memory cell 510 of FIG. 5
CG
BL
SL
P-sub
Reading
2-5V
0.6– 0V
0V
Erasing
-8 to-10V/0V
FLT
FLT
8-10V/15-20V
Programming
8-12V
3- 0V
0V
In order to utilize a memory array comprising one of the above types of non-volatile memory cells in an artificial neural network, two modifications are made. First, the circuitry is configured so that each memory cell can be programmed, erased, and read individually without adversely affecting the memory state of other memory cells in the array, as explained further below. Second, continuous (analog) programming of the memory cells is provided.
In particular, the memory state (i.e., the charge on the floating gate) of each memory cell in the array can be continuously changed from a fully erased state to a fully programmed state independently and with minimal disturbance to other memory cells. In another embodiment, the memory state (i.e., the charge on the floating gate) of each memory cell in the array can be continuously changed from a fully programmed state to a fully erased state, or vice versa, independently and with minimal disturbance to other memory cells. This means that the cell storage device is analog, or at least can store one of many discrete values (such as 16 or 64 different values), which allows very precise and individual tuning of all cells in the memory array, and which makes the memory array ideal for storing and fine-tuning synaptic weights for neural networks.
Neural network employing non-volatile memory cell array
Figure 6 conceptually illustrates a non-limiting example of a neural network that utilizes a non-volatile memory array. This example uses a non-volatile memory array neural network for facial recognition applications, but any other suitable application may also be implemented using a non-volatile memory array based neural network.
For this example, S0 is the input, which is a 32x32 pixel RGB image with 5-bit precision (i.e., three 32x32 pixel arrays, one for each color R, G, and B, each pixel being 5-bit precision). The synapses CB1 going from S0 to C1 have both different sets of weights and shared weights, and scan the input image with a 3x3 pixel overlapping filter (kernel), shifting the filter by 1 pixel (or more than 1 pixel, as dictated by the model). Specifically, the values of 9 pixels in a 3x3 portion of the image (referred to as a filter or kernel) are provided to synapses CB1, where these 9 input values are multiplied by the appropriate weights; after summing the outputs of that multiplication, a single output value is determined by the first neuron of CB1 and provided for generating a pixel of one of the feature maps of layer C1. The 3x3 filter is then shifted one pixel to the right (i.e., adding the column of three pixels on the right and dropping the column of three pixels on the left), whereby the 9 pixel values in this newly positioned filter are provided to synapses CB1, where they are multiplied by the same weights and a second single output value is determined by the associated neuron. This process continues until the 3x3 filter has scanned the entire 32x32 pixel image, for all three colors and for all bits (precision values). The process is then repeated using different sets of weights to generate different feature maps of C1, until all the feature maps of layer C1 have been computed.
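As an illustration of the scan just described, the following Python sketch (a behavioral software model only, assuming a stride of 1 and a single color channel) slides a 3x3 kernel across a 32x32 input, multiplying the 9 pixel values by the weights and summing them into one output value per filter position.

```python
import numpy as np

def scan_with_kernel(channel: np.ndarray, kernel: np.ndarray) -> np.ndarray:
    """Shift a kxk kernel by 1 pixel at a time; multiply-and-sum at each stop."""
    h, w = channel.shape
    k = kernel.shape[0]
    out = np.empty((h - k + 1, w - k + 1))
    for r in range(out.shape[0]):
        for c in range(out.shape[1]):
            # 9 input values times 9 weights, summed into one output value
            out[r, c] = np.sum(channel[r:r + k, c:c + k] * kernel)
    return out

channel = np.random.randint(0, 32, size=(32, 32)).astype(float)  # 5-bit pixels
kernel = np.random.randn(3, 3)                                    # one weight set
print(scan_with_kernel(channel, kernel).shape)  # (30, 30): one C1 feature map
```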
At C1, in this example, there are 16 feature maps, each with 30x30 pixels. Each pixel is a new feature pixel extracted from the product of the inputs and the kernel, so each feature map is a two-dimensional array, and thus in this example synapses CB1 constitute 16 layers of two-dimensional arrays (keeping in mind that the neuron layers and arrays referenced herein are logical relationships, not necessarily physical relationships, i.e., the arrays are not necessarily oriented as physical two-dimensional arrays). Each of the 16 feature maps is generated by one of sixteen different sets of synapse weights applied to the filter scans. The C1 feature maps could all be directed to different aspects of the same image feature, such as boundary identification. For example, a first map (generated using a first set of weights, shared for all scans used to generate this first map) could identify rounded edges, a second map (generated using a second set of weights different from the first set) could identify rectangular edges, or the aspect ratio of certain features, and so on.
Before moving from C1 to S1, an activation function P1 (pooling) is applied that pools values from consecutive, non-overlapping 2x2 regions in each feature map. The purpose of the pooling stage is to average out the nearby locations (or a max function can also be used), for example to reduce the dependence on edge location and to reduce the data size before going to the next stage. At S1, there are 16 15x15 feature maps (i.e., sixteen different arrays of 15x15 pixels each). The synapses and associated neurons in CB2 going from S1 to C2 scan the maps in S1 with 4x4 filters, with a filter shift of 1 pixel. At C2, there are 22 12x12 feature maps. Before moving from C2 to S2, an activation function P2 (pooling) is applied that pools values from consecutive, non-overlapping 2x2 regions in each feature map. At S2, there are 22 6x6 feature maps. An activation function is applied at the synapses CB3 going from S2 to C3, where every neuron in C3 connects to every map in S2. At C3, there are 64 neurons. The synapses CB4 going from C3 to the output S3 fully connect S3 to C3. The output at S3 includes 10 neurons, where the highest output neuron determines the class. This output could, for example, be indicative of an identification or classification of the contents of the original image.
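The layer dimensions quoted above can be verified with a few lines of arithmetic (a sketch; the feature-map counts 16 and 22 are taken directly from the example).

```python
def conv_out(size: int, k: int) -> int:   # kernel shifted by 1 pixel
    return size - k + 1

def pool_out(size: int) -> int:           # non-overlapping 2x2 pooling
    return size // 2

c1 = conv_out(32, 3)   # 30 -> C1: 16 feature maps of 30x30
s1 = pool_out(c1)      # 15 -> S1: 16 feature maps of 15x15
c2 = conv_out(s1, 4)   # 12 -> C2: 22 feature maps of 12x12
s2 = pool_out(c2)      # 6  -> S2: 22 feature maps of 6x6
print(c1, s1, c2, s2)  # C3 (64 neurons) and S3 (10 outputs) are fully connected
```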
Each level of synapses is implemented using an array, or a portion of an array, of non-volatile memory cells. FIG. 7 is a block diagram of a vector-matrix multiplication (VMM) array that includes non-volatile memory cells and is used as the synapses between an input layer and the next layer. In particular, VMM 32 includes an array of non-volatile memory cells that stores the weights, together with decoders that decode the inputs for the memory array.
The outputs of the memory array are provided to a differential adder (such as a summing op-amp) 38, which sums the outputs of the memory cell array to create a single value for that convolution. The differential adder is used to realize the summation of positive and negative weights with positive inputs. The summed output value is then provided to an activation function circuit, which rectifies the output; the rectified output value then becomes an element of a feature map of the next layer.
FIG. 8 is a block diagram of the various stages of the VMM. As shown in FIG. 8, the input is converted from digital to analog by digital-to-analog converter 31 and provided to the input VMM 32a. The output generated by input VMM 32a is provided as the input to the next VMM (hidden level 1) 32b, which in turn generates an output that is provided as the input to the next VMM (hidden level 2) 32c, and so on. The layers of VMMs 32 serve as the different layers of synapses and neurons of a convolutional neural network (CNN). Each VMM may be a stand-alone non-volatile memory array, or multiple VMMs may utilize different portions of the same non-volatile memory array, or multiple VMMs may utilize overlapping portions of the same non-volatile memory array. The example shown in FIG. 8 contains five layers (32a, 32b, 32c, 32d, 32e): one input layer (32a), two hidden layers (32b, 32c), and two fully connected layers (32d, 32e). One of ordinary skill in the art will appreciate that this is merely exemplary and that a system instead may comprise more than two hidden layers and more than two fully connected layers.
Vector-matrix multiplication (VMM) array
FIG. 9 illustrates a neuron VMM 900, which is particularly suited for memory cells of the type shown in FIG. 3 and is utilized as the synapses and parts of the neurons between an input layer and the next layer. VMM 900 includes memory array 901 of non-volatile memory cells and reference array 902 (at the top of the array). Alternatively, another reference array can be placed at the bottom. In VMM 900, control gate lines (such as control gate line 903) run in a vertical direction (hence reference array 902 is in the row direction, orthogonal to the input control gate lines), and erase gate lines (such as erase gate line 904) run in a horizontal direction. Here, the inputs are provided on the control gate lines and the output emerges on the source lines. In one embodiment only even rows are used, and in another embodiment only odd rows are used. The current placed on a source line performs a summing function of all the currents from the memory cells connected to that source line.
As described herein with respect to neural networks, the flash memory cells are preferably configured to operate in a sub-threshold region.
The memory cells described herein are biased in weak inversion:

Ids = Io * e^((Vg - Vth)/(k*Vt)) = w * Io * e^(Vg/(k*Vt))

where w = e^(-Vth/(k*Vt)).

For an I-to-V logarithmic converter that uses a memory cell to convert an input current into an input voltage:

Vg = k*Vt*log[Ids/(wp*Io)]

For a memory array used as a vector-matrix multiplier VMM, the output current is:

Iout = wa * Io * e^(Vg/(k*Vt)), i.e.,

Iout = (wa/wp) * Iin = W * Iin

where W = e^((Vthp - Vtha)/(k*Vt)).
The word line or control gate may be used as an input to the memory cell for an input voltage.
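A short numerical check of these relations (a sketch under assumed device constants; k, Vt, Io, and the threshold voltages below are illustrative values, not parameters from this disclosure):

```python
import math

K_VT = 1.5 * 0.026   # assumed: slope factor k ~ 1.5 times thermal voltage Vt ~ 26 mV
IO = 1e-12           # assumed: characteristic subthreshold current Io

def w_of(vth: float) -> float:
    """Stored weight w = e^(-Vth/(k*Vt)), set by tuning the cell's Vth."""
    return math.exp(-vth / K_VT)

vth_p, vth_a = 1.00, 0.95   # assumed thresholds of the reference and array cells
iin = 5e-9                  # input current into the I-to-V log converter

# The reference cell converts the input current to a gate voltage:
# Vg = k*Vt*log[Iin/(wp*Io)]
vg = K_VT * math.log(iin / (w_of(vth_p) * IO))

# An array cell driven with that Vg outputs Iout = wa*Io*e^(Vg/(k*Vt))
iout = w_of(vth_a) * IO * math.exp(vg / K_VT)

W = math.exp((vth_p - vth_a) / K_VT)   # W = wa/wp
print(f"W = {W:.3f}, Iout = {iout:.3e} A, W*Iin = {W * iin:.3e} A")
```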
Alternatively, the flash memory cell may be configured to operate in the linear region:
Ids = beta * (Vgs - Vth) * Vds, where beta = u*Cox*W/L

W ∝ (Vgs - Vth)
for an I-to-V linear converter, a memory cell operating in the linear region may be used to linearly convert an input/output current to an input/output voltage.
Other embodiments of ESF vector matrix multipliers are described in U.S. patent application 15/826,345, which is incorporated herein by reference. The source line or bit line may be used as the neuron output (current summation output).
FIG. 10 shows a neuron VMM 1000, which is particularly suited for memory cells of the type shown in FIG. 2 and is utilized as the synapses between an input layer and the next layer. VMM 1000 includes memory array 1003 of non-volatile memory cells, reference array 1001, and reference array 1002. The reference arrays 1001 and 1002, in the column direction of the array, serve to convert the current inputs flowing into terminals BLR0-3 into the voltage inputs WL0-3. In effect, the reference memory cells are diode-connected through multiplexers, with the current inputs flowing into them. The reference cells are tuned (e.g., programmed) to target reference levels, and the target reference levels are provided by a reference mini-array matrix. Memory array 1003 serves two purposes. First, it stores the weights that will be used by VMM 1000. Second, memory array 1003 effectively multiplies the inputs (the current inputs provided at terminals BLR0-3, which reference arrays 1001 and 1002 convert into input voltages supplied to word lines WL0-3) by the weights stored in the memory array to produce the output, which will be the input to the next layer or the input to the final layer. By performing the multiplication function, the memory array negates the need for separate multiplication logic circuits and is also power efficient. Here, the voltage inputs are provided on the word lines and the output emerges on the bit lines during a read (inference) operation. The current placed on a bit line performs a summing function of all the currents from the memory cells connected to that bit line.
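Behaviorally, the read (inference) operation just described reduces to a matrix-vector product, as in the following sketch (a software model only; the array dimensions are assumptions):

```python
import numpy as np

# weights[i, j]: effective weight of the cell at word line i, bit line j.
# inputs[i]:     voltage-encoded input on word line WLi, produced by the
#                reference arrays from the current inputs on BLR0-3.
def vmm_read(weights: np.ndarray, inputs: np.ndarray) -> np.ndarray:
    """Each bit line sums the currents of all cells connected to it."""
    return inputs @ weights   # one summed output current per bit line

weights = np.random.rand(4, 3)   # assumed: 4 word lines (WL0-3), 3 bit lines
inputs = np.random.rand(4)
print(vmm_read(weights, inputs))
```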
FIG. 11 shows operating voltages for VMM 1000. The columns in the table indicate the voltages placed on the word line for the selected cell, the word lines for the unselected cells, the bit line for the selected cell, the bit lines for the unselected cells, the source line for the selected cell, and the source line for the unselected cells. The rows indicate read, erase, and program operations.
FIG. 12 shows another embodiment of a neuron VMM, which is utilized as the synapses between an input layer and the next layer.

FIG. 13 shows the operating voltages for the VMM of FIG. 12.
FIG. 14 shows a neuron VMM 1400, which is particularly suited for memory cells of the type shown in FIG. 3 and is utilized as the synapses and parts of the neurons between an input layer and the next layer. VMM 1400 includes memory array 1403 of non-volatile memory cells, reference array 1401, and reference array 1402. Reference arrays 1401 and 1402 serve to convert the current inputs flowing into terminals BLR0-3 into the voltage inputs CG0-3. In effect, the reference memory cells are diode-connected through cascoding multiplexer 1414, with the current inputs flowing into them. Multiplexer 1414 includes multiplexer 1405 and cascoding transistor 1404 to ensure a constant voltage on the bit line of the reference cell being read. The reference cells are tuned to target reference levels. Memory array 1403 serves two purposes. First, it stores the weights that will be used by VMM 1400. Second, memory array 1403 effectively multiplies the inputs (the current inputs provided at terminals BLR0-3, which reference arrays 1401 and 1402 convert into input voltages supplied to control gates CG0-3) by the weights stored in the memory array to produce the output, which will be the input to the next layer or the input to the final layer. By performing the multiplication function, the memory array negates the need for separate multiplication logic circuits and is also power efficient. Here, the inputs are provided on the control gate lines and the output emerges on the bit lines during a read operation. The current placed on a bit line performs a summing function of all the currents from the memory cells connected to that bit line.
VMM 1400 implements unidirectional tuning for the memory cells in memory array 1403. That is, each cell is erased and then partially programmed until the desired charge on the floating gate is reached. If too much charge is placed on the floating gate (such that the wrong value is stored in the cell), the cell must be erased and the sequence of partial programming operations must start over. As shown, two rows that share the same erase gate need to be erased together (known as a page erase); thereafter, each cell is partially programmed until the desired charge on the floating gate is reached.
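The unidirectional tuning sequence can be summarized in Python (a sketch; the cell object and its methods are hypothetical stand-ins for the page-erase, partial-program, and verify operations described above):

```python
def tune_cell(cell, target: float, step: float, tol: float) -> None:
    """Unidirectional tuning: only an erase can undo a programming overshoot."""
    cell.page_erase()                  # two rows sharing an erase gate erase together
    while True:
        cell.partial_program(step)     # add a small amount of floating-gate charge
        level = cell.verify()          # read back the stored level
        if abs(level - target) <= tol:
            return                     # desired floating-gate charge reached
        if level > target + tol:       # overshoot: the wrong value is stored
            cell.page_erase()          # erase and restart the partial programming
```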
fig. 15 shows the operating voltages for VMM 1400. The columns in the table indicate the voltages placed on the word line for the selected cell, the word lines for the unselected cells, the bit lines for the selected cell, the bit lines for the unselected cells, the control gate for the selected cell, the control gate for the unselected cells in the same sector as the selected cell, the control gate for the unselected cells in a different sector than the selected cell, the erase gate for the unselected cells, the source line for the selected cell, the source line for the unselected cells. The rows indicate read, erase, and program operations.
FIG. 16 shows another embodiment of a neuron VMM, which is utilized as the synapses between an input layer and the next layer.

FIG. 17 shows the operating voltages for the VMM of FIG. 16.
FIGS. 18A and 18B illustrate a programming method 1800. First, the method begins (step 1801), which typically occurs in response to a program command being received. Next, a bulk program operation programs all cells to the "0" state (step 1802). Then, a soft erase operation erases all cells to an intermediate weakly erased level of approximately 3-5 μA (step 1803). This is in contrast to a deep erase, which would bring all cells to a fully erased state for digital applications (e.g., a cell current of around 20-30 μA). Hard programming is then performed on all unused cells (step 1804), bringing them to a very deep programmed state to ensure that those cells are truly off, meaning that they contribute only negligible current. Soft programming is then performed on the selected cells using a coarse algorithm, bringing them to an intermediate weakly programmed level of approximately 0.7-1.5 μA (steps 1805, 1806, 1807). A coarse-step programming loop is performed, followed by a verify operation in which the charge on a selected cell is compared against various thresholds in a coarse iterative manner (steps 1806 and 1807). The coarse-step programming cycle includes a coarse voltage increment, and/or a coarse programming time, and/or a coarse programming current, resulting in a coarse cell current change from one programming step to the next. Next, precision programming is performed (steps 1808-1813), in which the selected cells are programmed by a fine-step programming algorithm to a target level in the range of 100 pA-20 nA, depending on the desired level. A fine-step programming loop is performed, followed by a verify operation (steps 1809 and 1810). The fine-step programming cycle may include combinations of coarse and fine resolutions of voltage increments, and/or programming times, and/or programming currents, resulting in a fine cell current change from one programming step to the next. If the selected cell reaches the desired target, the programming operation is complete (step 1811). If not, the precision programming operation is repeated until the desired level is reached. However, if the number of attempts exceeds a threshold number (step 1812), the programming operation stops and the selected cell is deemed a bad cell (step 1813).
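The coarse-then-fine portion of method 1800 can be sketched as follows (an illustrative model only; the cell object, the read-back convention (cell current in amps, which programming decreases), and the attempt limit are assumptions):

```python
MAX_FINE_ATTEMPTS = 100   # assumed threshold for step 1812

def program_to_target(cell, target_a: float, coarse_margin_a: float,
                      fine_tol_a: float) -> bool:
    # Coarse loop (steps 1805-1807): large voltage/time/current increments per
    # pulse, each followed by a verify against a coarse threshold.
    while cell.read() > target_a + coarse_margin_a:
        cell.program_pulse(coarse=True)
    # Fine loop (steps 1808-1813): small increments toward a target in the
    # 100 pA-20 nA range, each followed by a verify.
    for _ in range(MAX_FINE_ATTEMPTS):
        if abs(cell.read() - target_a) <= fine_tol_a:
            return True                # step 1811: target reached
        cell.program_pulse(coarse=False)
    return False                       # step 1813: deem the cell a bad cell
```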
FIG. 19 shows an exemplary waveform 1900 for the programming operations of FIGS. 18A and 18B.
FIG. 20 illustrates an exemplary waveform 2000 for performing a programming operation using high voltage level modulation. Signal 2003 is an individual pulsed program-cycle enable signal for one particular bit line. Signal 2004 is an individual pulsed program-cycle enable signal for another particular bit line. In this waveform, the programming pulse widths of signals 2003 and 2004 are the same. Signal 2005 is the programming high voltage, such as on the source line or control gate used for programming; it increments from one programming pulse to the next.

FIG. 21 shows another exemplary waveform 2100 for performing a programming operation using high voltage level modulation. Signal 2103 is an individual pulsed program-cycle enable signal for one particular bit line. Signal 2104 is an individual pulsed program-cycle enable signal for another particular bit line. Signal 2105 is the programming high voltage increment, such as on the source line or control gate used for programming; from one programming pulse to the next, it may remain the same or increment. The programming pulse width of signal 2103 is the same across the multiple pulses of incrementally higher voltage. For the same high voltage increments, the programming pulse width of signal 2104 differs across the multiple pulses, e.g., it is narrower for the first pulse.
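The high voltage level modulation of waveforms 2000 and 2100 amounts to a sequence of (amplitude, width) pairs per bit line, as in this sketch (the starting voltage, increment, and widths are illustrative assumptions):

```python
def hv_pulse_train(v_start: float, v_step: float, widths_us: list[float]):
    """Yield (amplitude, width) for successive programming pulses; the high
    voltage (e.g., on the source line or control gate) increments per pulse."""
    v = v_start
    for width in widths_us:
        yield v, width
        v += v_step

# Waveform 2000: equal pulse widths with an incrementing amplitude.
print(list(hv_pulse_train(8.0, 0.2, [1.0, 1.0, 1.0, 1.0])))
# Waveform 2100: same increments, but one bit line gets a narrower first pulse.
print(list(hv_pulse_train(8.0, 0.2, [0.5, 1.0, 1.0, 1.0])))
```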
Fig. 22 shows a VMM system 2200 including a VMM matrix 2201, a column decoder 2202, and a column driver 2203.
FIG. 23 illustrates an exemplary column driver 2300 that can be used as column driver 2203 in FIG. 22. Column driver 2300 includes latch 2301, inverter 2302, NOR gate 2303, PMOS transistor 2304, and NMOS transistors 2305 and 2306, configured as shown. Latch 2301 receives a data signal DIN and an enable signal EN. The node BLIO between PMOS transistor 2304 and NMOS transistor 2305 carries the bit line input or output signal, which is selectively connected to a bit line by a column decoder (such as column decoder 2202 in FIG. 22). Sense amplifier SA 2310 is coupled to the bit line through node BLIO to read the cell current of a selected memory cell, and is used to verify the desired current level of a selected memory cell, such as after an erase or program operation. PMOS transistor 2304 is used to inhibit the bit line during programming, under control of the inhibit control circuit formed by latch 2301, inverter 2302, and NOR gate 2303. NMOS transistor 2306 provides a bias programming current for the bit line, and NMOS transistor 2305 enables that bias programming current onto the bit line, thereby enabling programming of the selected memory cell. Thus, the program control signals latched from DIN determine, per bit line, whether programming is enabled or inhibited.
FIG. 24 shows an embodiment that uses a plurality of reference matrices.

FIG. 25 shows an embodiment that uses a single reference matrix.

FIG. 26 illustrates an exemplary reference matrix 2600 that may be used as the reference matrix in the comparison circuits described below.

FIG. 27 shows another exemplary reference matrix 2700 that likewise may be used as the reference matrix in the comparison circuits described below.

FIG. 28 shows an Icell PMOS comparison circuit, in which the current consumed by a selected memory cell is compared against the current consumed by a reference matrix.
FIG. 29 shows an Icell PMOS comparison circuit 2900 that includes PMOS transistor 2901, switches 2902, 2903, and 2904 (referred to below as S0, S1, and S2), NMOS cascoding transistors 2905 and 2907, a selected memory cell 2908 from VMM memory array 2920, a reference matrix 2906 (such as reference matrix 2600 or 2700), and comparator 2909, arranged as shown. The output COMP_OUT is a voltage value indicating the relationship between the value stored in selected memory cell 2908 and the reference current. The Icell comparison circuit operates as a single-PMOS current mirror with time multiplexing, which eliminates the mismatch that would exist between two mirror PMOS transistors. During a first time period, S0 and S1 are closed and S2 is open, and the current from reference matrix 2906 is stored (held) on PMOS transistor 2901. During the next time period, S0 and S1 are open and S2 is closed, and the stored reference current is compared against the current from memory cell 2908, with the result of the comparison indicated on output node 2910. Optionally, comparator 2909 can compare the voltage on node 2910 against a reference voltage VREF to indicate the comparison result. Alternatively, the current from selected memory cell 2908 can be sampled and held on PMOS transistor 2901 and then compared against the reference current.
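The time-multiplexed sequencing can be written out as a short sketch (the switch and mirror objects below are hypothetical stand-ins for S0-S2 and PMOS transistor 2901):

```python
def compare_cell_to_reference(mirror, s0, s1, s2, ref_matrix, cell) -> bool:
    """Single-mirror compare: hold the reference current, then compare."""
    # Phase 1: S0, S1 closed; S2 open. The reference current is stored on the
    # single PMOS mirror, so no second (mismatched) mirror PMOS is needed.
    s0.close(); s1.close(); s2.open()
    mirror.hold(ref_matrix.current())
    # Phase 2: S0, S1 open; S2 closed. The held reference current is compared
    # against the selected cell's current at the output node.
    s0.open(); s1.open(); s2.close()
    return cell.current() < mirror.held_current()   # e.g., cell below reference
```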
In another embodiment, the array leakage is compensated: the leakage current of the memory array is measured and subtracted out, so that it does not corrupt the comparison of the selected cell current against the reference current.
FIG. 31 shows an Icell-to-digital-data circuit 3100, which includes current source 3101, switch 3102, capacitor 3103, comparator 3104, and counter 3105. At the beginning of a conversion period, signal 3110 is pulled to ground. Signal 3110 then begins to rise according to cell current 3101 (drawn from the VMM memory array, with array leakage compensation as described above). The ramp rate is proportional to cell current 3101 and inversely proportional to the capacitance of capacitor 3103. Output 3112 of comparator 3104 then enables counter 3105 to begin counting digitally. Once the voltage on node 3110 reaches voltage level VREF 3111, comparator 3104 switches polarity and stops counter 3105. The digital output Q<N:0> 3113 indicates the value of cell current 3101.
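The conversion can be modeled in a few lines (a behavioral sketch; the capacitance, reference voltage, and clock frequency are assumed values, not figures from this disclosure):

```python
def icell_to_count(i_cell_a: float, c_f: float = 1e-12,
                   vref_v: float = 1.0, f_clk_hz: float = 100e6) -> int:
    """Counter value: the ramp on node 3110 rises at Icell/C until it crosses
    VREF, so the count is proportional to C*VREF/Icell at clock f_clk."""
    ramp_rate = i_cell_a / c_f          # dV/dt on node 3110 (V/s)
    t_cross_s = vref_v / ramp_rate      # time for the ramp to reach VREF 3111
    return int(t_cross_s * f_clk_hz)    # digital output Q<N:0>

for i in (1e-9, 5e-9, 20e-9):
    print(f"Icell = {i:.0e} A -> count = {icell_to_count(i)}")
```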
FIG. 32 shows waveforms for the circuit of FIG. 31.

FIG. 33 shows an Icell-to-slope circuit.

FIG. 34 shows a waveform 3400 for the circuit of FIG. 33.

FIG. 35 shows another Icell-to-slope circuit.

FIG. 36 shows waveforms for the circuit of FIG. 35.
It should be noted that, as used herein, the terms "over" and "on" both inclusively include "directly on" (no intermediate materials, elements, or space disposed therebetween) and "indirectly on" (intermediate materials, elements, or space disposed therebetween). Likewise, the term "adjacent" includes "directly adjacent" (no intermediate materials, elements, or space disposed therebetween) and "indirectly adjacent" (intermediate materials, elements, or space disposed therebetween); "mounted to" includes "directly mounted to" (no intermediate materials, elements, or space disposed therebetween) and "indirectly mounted to" (intermediate materials, elements, or space disposed therebetween); and "electrically coupled to" includes "directly electrically coupled to" (no intermediate materials or elements therebetween that electrically connect the elements together) and "indirectly electrically coupled to" (intermediate materials or elements therebetween that electrically connect the elements together). For example, forming an element "over a substrate" can include forming the element directly on the substrate with no intermediate materials/elements therebetween, as well as forming the element indirectly on the substrate with one or more intermediate materials/elements therebetween.