Method for processing programming error of multi-plane NVM and storage device

Document No.: 828772  Publication date: 2021-03-30

Reading note: This technology, "Method for processing programming error of multi-plane NVM and storage device", was designed by 于松海, 李德领, and 袁戎 on 2019-09-27. Abstract: Methods and storage devices for handling programming errors of a multi-plane NVM are provided. The provided method for a storage device comprises: acquiring a storage block to be reclaimed, and erasing the storage block if the storage block is marked as a pseudo-bad block; and, if the storage block is successfully erased, marking the storage block as a good block.

1. A method for a storage device, comprising:

acquiring a storage block to be reclaimed, and erasing the storage block if the storage block is marked as a pseudo-bad block; and

if the storage block is successfully erased, marking the storage block as a good block.

2. The method of claim 1, wherein

if the storage block is marked as a pseudo-bad block, valid data stored in the storage block is moved to another storage block before the storage block is erased.

3. The method of claim 1 or 2, further comprising:

obtaining a message indicating that the NVM chip has finished processing a program command, the program command operating on a first storage block; and

if the message indicates that the program command failed to execute, marking the first storage block as a bad block, and marking one or more storage blocks belonging to the same plane as the first storage block as pseudo-bad blocks.

4. The method of claim 3, further comprising:

setting a bad block flag for the LUN in which the first storage block is located.

5. The method of claim 3 or 4, further comprising:

in response to the message indicating that the program command was executed successfully, if the first storage block is marked as a bad block or a pseudo-bad block, writing the data to be written by the program command indicated by the message into another storage block.

6. The method of one of claims 3-5, further comprising:

in response to the message indicating that the program command was executed successfully, releasing the resources allocated to the program command if the first storage block is marked as neither a bad block nor a pseudo-bad block.

7. The method of one of claims 3-6, further comprising:

in response to the message indicating that the program command was executed successfully, identifying whether a bad block flag is set for the LUN in which the first storage block is located; and

if the bad block flag is set for the LUN in which the first storage block is located, identifying whether the first storage block is marked as a bad block or a pseudo-bad block.

8. The method of one of claims 3 to 7, further comprising:

in response to the message indicating that the program command was executed successfully, identifying whether a bad block flag is set for the LUN in which the first storage block is located; and if the bad block flag is not set for the LUN in which the first storage block is located, releasing the resources allocated to the program command.

9. The method of claim 6 or 7, further comprising:

if the first storage block is not marked as a bad block or a pseudo-bad block, clearing the bad block flag of the LUN in which the first storage block is located.

10. A storage device comprising a control component and an NVM chip, the control component performing the method according to one of claims 1-9.

Technical Field

The present application relates to storage technology, and more particularly to a method for handling the case in which an NVM (Non-Volatile Memory) having multiple planes in a storage device indicates an error in processing a program command, and to a storage device using the method.

Background

FIG. 1 illustrates a block diagram of a storage device. The storage device 102 is coupled to a host and provides storage capability to the host. The host and the storage device 102 may be coupled in various ways, including but not limited to SATA (Serial Advanced Technology Attachment), SCSI (Small Computer System Interface), SAS (Serial Attached SCSI), IDE (Integrated Drive Electronics), USB (Universal Serial Bus), PCIe (Peripheral Component Interconnect Express), NVMe (NVM Express), Ethernet, Fibre Channel, a wireless communication network, and the like. The host may be an information processing device capable of communicating with the storage device in the manner described above, such as a personal computer, tablet, server, portable computer, network switch, router, cellular telephone, or personal digital assistant. The storage device 102 includes an interface 103, a control component 104, one or more NVM chips 105, and a DRAM (Dynamic Random Access Memory) 110.

NAND flash memory, phase change memory, FeRAM (Ferroelectric RAM), MRAM (Magnetoresistive RAM), RRAM (Resistive Random Access Memory), XPoint memory, and the like are common types of NVM.

The interface 103 may be adapted to exchange data with the host by means such as SATA, IDE, USB, PCIe, NVMe, SAS, Ethernet, or Fibre Channel.

The control component 104 controls data transfer among the interface 103, the NVM chips 105, and the DRAM 110, and is also responsible for storage management, mapping of host logical addresses to flash physical addresses, wear leveling, bad block management, and the like. The control component 104 can be implemented in software, hardware, firmware, or a combination thereof; for example, it can take the form of an FPGA (Field-Programmable Gate Array), an ASIC (Application-Specific Integrated Circuit), or a combination thereof. The control component 104 may also include a processor or controller that executes software to operate the hardware of the control component 104 and process IO (Input/Output) commands. The control component 104 may also be coupled to the DRAM 110 and may access data in the DRAM 110. FTL tables and/or cached IO command data may be stored in the DRAM.

The control component 104 includes a flash interface controller (also called a media interface controller or flash channel controller), which is coupled to the NVM chips 105, issues commands to the NVM chips 105 in a manner that conforms to their interface protocol in order to operate them, and receives the command execution results output by the NVM chips 105. Known NVM chip interface protocols include "Toggle" and "ONFI".

A target is one or more logical units (LUNs) that share a CE (Chip Enable) signal within a NAND flash package. The NAND flash package may include one or more dies. Typically, a logical unit corresponds to a single die. A logical unit may include a plurality of planes. Multiple planes within a logical unit may be accessed in parallel, while multiple logical units within a NAND flash chip may execute commands and report status independently of one another.

Data is typically written to and read from the NVM in units of pages and erased in units of blocks. A block (also referred to as a physical block) contains a plurality of pages. Pages on the storage medium (referred to as physical pages) have a fixed size, e.g., 17664 bytes. Physical pages may also have other sizes.

In the storage device, a Flash Translation Layer (FTL) maintains the mapping information from logical addresses to physical addresses. The logical addresses constitute the storage space of the solid-state storage device as perceived by upper-level software, such as an operating system. A physical address is an address used to access a physical storage location of the solid-state storage device. In the related art, address mapping may also be implemented by way of an intermediate address: for example, a logical address is mapped to an intermediate address, which in turn is mapped to a physical address.
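The logical-to-physical mapping described above can be illustrated with a minimal sketch. The class and field names (`SimpleFTL`, `PhysicalAddress`, and so on) are hypothetical and chosen only for this illustration; a real FTL table is typically a packed array held in DRAM rather than a Python dictionary.

```python
from typing import Dict, NamedTuple, Optional


class PhysicalAddress(NamedTuple):
    """Hypothetical physical address layout: LUN, plane, block, page."""
    lun: int
    plane: int
    block: int
    page: int


class SimpleFTL:
    """Toy FTL table: one entry per logical page address (illustrative only)."""

    def __init__(self) -> None:
        self._table: Dict[int, PhysicalAddress] = {}

    def update(self, lba: int, pa: PhysicalAddress) -> None:
        # Record that logical address `lba` is now stored at `pa`.
        self._table[lba] = pa

    def lookup(self, lba: int) -> Optional[PhysicalAddress]:
        # Return the current physical location, or None if never written.
        return self._table.get(lba)


ftl = SimpleFTL()
ftl.update(42, PhysicalAddress(lun=2, plane=0, block=1, page=7))
# A rewrite of the same logical address lands on a new physical page.
ftl.update(42, PhysicalAddress(lun=3, plane=1, block=5, page=0))
```

Because the host only ever sees logical addresses, the control component is free to redirect a write to a different physical block, which is what makes bad block handling transparent to the host.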

A table structure storing mapping information from logical addresses to physical addresses is called an FTL table. FTL tables are important metadata in solid state storage devices. Usually, the data entry of the FTL table records the address mapping relationship in the unit of data page in the solid-state storage device.

In some storage devices, the FTL is provided by the host to which the storage device is coupled: the FTL table is stored in the memory of the host, and the FTL is implemented by software executed by the processor of the host. In still other cases, a storage management device disposed between the host and the storage device provides the FTL.

A large block includes physical blocks from each of a plurality of logical units (LUNs). The logical units that provide physical blocks for a large block are called a logical unit group. Each logical unit of the group may provide one physical block for the large block. For example, in the schematic diagram of large blocks shown in FIG. 2, large blocks are constructed over every 16 logical units (LUNs). Each large block includes 16 physical blocks, one from each of the 16 logical units. In the example of FIG. 2, large block 0 includes physical block 0 from each of the 16 logical units, and large block 2 includes physical block 2 from each logical unit. Large blocks may also be constructed in many other ways. In FIG. 2, a physical block is denoted Ba-b, where a indicates that the physical block is provided by logical unit a (LUN a) and b indicates that the block number of the physical block within that logical unit is b. A large block stores user data and check data. The check data of a large block is calculated from the user data stored in that large block. By way of example, the check data is stored in the last physical block of the large block. Other physical blocks of the large block may also be chosen to store the check data. As yet another example, FIG. 3A of Chinese patent application No. 201710752321.0 and the description related to FIG. 3A provide yet another way of constructing large blocks.
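The construction in FIG. 2 can be sketched as follows. The text does not say how the check data is computed, so the byte-wise XOR used here is an assumption made for illustration; the function names are likewise hypothetical.

```python
from functools import reduce
from typing import List

NUM_LUNS = 16  # size of the logical unit group in the FIG. 2 example


def large_block_members(block_no: int, num_luns: int = NUM_LUNS) -> List[str]:
    """Labels (Ba-b) of the physical blocks that form large block `block_no`."""
    return [f"B{lun}-{block_no}" for lun in range(num_luns)]


def xor_parity(user_pages: List[bytes]) -> bytes:
    """Byte-wise XOR of equally sized pages (assumed check-data scheme)."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*user_pages))


members = large_block_members(2)  # physical block 2 of each of the 16 LUNs
# Two-byte stand-ins for the user data pages of the first 15 LUNs.
pages = [bytes([lun, lun + 1]) for lun in range(NUM_LUNS - 1)]
parity = xor_parity(pages)  # would be stored in the last physical block
# XOR-ing the surviving pages with the parity rebuilds a lost page.
rebuilt = xor_parity(pages[1:] + [parity])
```

With XOR check data, any single lost page of a stripe can be rebuilt from the remaining pages, which is why a programming failure on one physical block need not lose data.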

FIG. 3 shows a schematic diagram of logical units and planes. Each logical unit (LUN) includes a plurality of planes. Referring to FIG. 3, large blocks are constructed over a logical unit group of 16 logical units (LUN0, LUN1, ..., LUN15). Taking LUN2 as an example, LUN2 includes 4 planes (plane 0, plane 1, plane 2, and plane 3, of which plane 0 and plane 3 are shown). The planes of a LUN can perform read and write operations simultaneously, which improves the parallelism of NVM operations.

In FIG. 3, blocks B0 of LUN0-LUN15 constitute large block 0, where the physical blocks B0 in the planes of LUN0-LUN14 store user data and the physical blocks B0 of the 4 planes of LUN15 store the check data calculated from the user data of large block 0. Blocks B1 of LUN0-LUN15 constitute large block 1. Of the planes of LUN2, FIG. 3 shows plane 0 and plane 3. Sometimes the NVM contains bad blocks, which causes large blocks to include different numbers of physical blocks.

Disclosure of Invention

According to a first aspect of the present application, there is provided a first method for a storage device according to the first aspect of the present application, comprising: acquiring a storage block to be reclaimed, and erasing the storage block if the storage block is marked as a pseudo-bad block; and if the storage block is successfully erased, marking the storage block as a good block.

According to the first method for a storage device of the first aspect of the present application, there is provided a second method for a storage device according to the first aspect of the present application, further comprising: if erasing the storage block fails, marking the storage block as a bad block.

There is provided a third method for a storage device according to the first aspect of the present application, wherein if the acquired storage block to be reclaimed is marked as a bad block, the storage block is not reclaimed.

According to one of the first to third methods for a storage device of the first aspect of the present application, there is provided a fourth method for a storage device according to the first aspect of the present application, wherein if the storage block is marked as a pseudo-bad block, valid data stored on the storage block is moved to another storage block before the storage block is erased.

According to one of the first to fourth methods for a storage device of the first aspect of the present application, there is provided a fifth method for a storage device according to the first aspect of the present application, further comprising: obtaining a message indicating that the NVM chip has finished processing a program command, the program command operating on a first storage block; and if the message indicates that the program command failed to execute, marking the first storage block as a bad block, and marking one or more storage blocks belonging to the same plane as the first storage block as pseudo-bad blocks.

According to the fifth method for a storage device of the first aspect of the present application, there is provided a sixth method for a storage device according to the first aspect of the present application, wherein, in response to the message indicating that the program command failed to execute: storage blocks of the open large blocks in the storage device that belong to the same plane as the first storage block are marked as pseudo-bad blocks; all other storage blocks in the storage device that belong to the same plane as the first storage block are marked as pseudo-bad blocks; or all other storage blocks in the storage device that belong to the same plane as the first storage block and have not been written with data are marked as pseudo-bad blocks.

According to the fifth or sixth method for a storage device of the first aspect of the present application, there is provided a seventh method for a storage device according to the first aspect of the present application, wherein: in response to the message indicating that the program command failed to execute, storage blocks in the storage device that are in the same LUN as the first storage block and have the same block address are marked as pseudo-bad blocks.

According to the fifth method for a storage device of the first aspect of the present application, there is provided an eighth method for a storage device according to the first aspect of the present application, wherein: one or more storage blocks that belong to the same plane as the first storage block and are not marked as bad blocks are marked as pseudo-bad blocks.

According to the sixth method for a storage device of the first aspect of the present application, there is provided a ninth method for a storage device according to the first aspect of the present application, wherein, in response to the message indicating that the program command failed to execute: storage blocks of the open large blocks in the storage device that belong to the same plane as the first storage block and are not marked as bad blocks are marked as pseudo-bad blocks; all other storage blocks in the storage device that belong to the same plane as the first storage block and are not marked as bad blocks are marked as pseudo-bad blocks; or all other storage blocks in the storage device that belong to the same plane as the first storage block, have not been written with data, and are not marked as bad blocks are marked as pseudo-bad blocks.

According to one of the fifth to ninth methods for a storage device of the first aspect of the present application, there is provided a tenth method for a storage device according to the first aspect of the present application, further comprising: setting a bad block flag for the LUN in which the first storage block is located.

According to one of the fifth to tenth methods for a storage device of the first aspect of the present application, there is provided an eleventh method for a storage device according to the first aspect of the present application, further comprising: in response to the message indicating that the program command was executed successfully, if the first storage block is marked as a bad block or a pseudo-bad block, writing the data to be written by the program command indicated by the message into another storage block.

According to one of the fifth to eleventh methods for a storage device of the first aspect of the present application, there is provided the twelfth method for a storage device of the first aspect of the present application, further comprising: in response to the message indicating successful execution of the program command, freeing resources allocated for the program command if the first memory block is neither marked as a bad block nor a pseudo-bad block.

According to one of the fifth to twelfth methods for a storage device of the first aspect of the present application, there is provided the thirteenth method for a storage device according to the first aspect of the present application, further comprising: in response to the message indicating that the program command execution is successful, identifying whether a LUN in which the first storage block is located is set with a bad block flag; and if the LUN where the first storage block is located is set with a bad block mark, identifying whether the first storage block is marked as a bad block or a pseudo-bad block.

According to one of the fifth to thirteenth methods for a storage device of the first aspect of the present application, there is provided the fourteenth method for a storage device of the first aspect of the present application, further comprising: in response to the message indicating that the program command execution is successful, identifying whether a LUN in which the first storage block is located is set with a bad block flag; and if the LUN where the first storage block is located is not set with the bad block mark, releasing the resources allocated to the programming command.

According to the twelfth or thirteenth method for a storage device of the first aspect of the present application, there is provided a fifteenth method for a storage device according to the first aspect of the present application, further comprising: if the first storage block is not marked as a bad block or a pseudo-bad block, clearing the bad block flag of the LUN in which the first storage block is located.
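The tenth to fifteenth methods together describe a per-LUN bad block flag that acts as a quick filter on the success path. A minimal sketch of that decision logic follows; the class name, the `(lun, block)` identifier, and the string return values are assumptions made for illustration and do not come from the application.

```python
from typing import Dict, Iterable, Set, Tuple

BlockId = Tuple[int, int]  # hypothetical identifier: (LUN number, block number)


class CompletionHandler:
    """Illustrative bookkeeping for block marks and per-LUN bad block flags."""

    def __init__(self) -> None:
        self.lun_flag: Set[int] = set()        # LUNs whose bad block flag is set
        self.marked: Dict[BlockId, str] = {}   # "bad" or "pseudo-bad"

    def on_program_failure(self, block: BlockId,
                           same_plane_blocks: Iterable[BlockId]) -> None:
        self.marked[block] = "bad"
        for other in same_plane_blocks:
            # setdefault keeps an existing "bad" mark unchanged.
            self.marked.setdefault(other, "pseudo-bad")
        self.lun_flag.add(block[0])            # set the LUN's bad block flag

    def on_program_success(self, block: BlockId) -> str:
        lun = block[0]
        if lun not in self.lun_flag:
            return "release"   # flag not set: release resources, no block lookup
        if block in self.marked:
            return "rewrite"   # data went to a bad or pseudo-bad block
        self.lun_flag.discard(lun)             # clear the flag, as in the fifteenth method
        return "release"


handler = CompletionHandler()
first = handler.on_program_success((2, 5))            # no flag set yet
handler.on_program_failure((2, 10), [(2, 11)])
second = handler.on_program_success((2, 11))          # landed on a pseudo-bad block
third = handler.on_program_success((2, 99))           # unmarked block in flagged LUN
```

The point of the flag is that most completions belong to LUNs with no recent failure, so the per-block mark lookup can be skipped for them.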

According to a second aspect of the present application, there is provided a first memory device according to the second aspect of the present application, comprising a control means and an NVM chip, the control means performing one of the methods for a memory device according to the first aspect of the present application.

According to a third aspect of the present application, there is provided a system for a storage device, comprising: an erasing module configured to acquire a storage block to be reclaimed and to erase the storage block if the storage block is marked as a pseudo-bad block; and a marking module configured to mark the storage block as a good block if the storage block is successfully erased.

Drawings

In order to illustrate the embodiments of the present invention or the technical solutions in the prior art more clearly, the drawings used in the description of the embodiments or the prior art are briefly described below. The drawings in the following description show only some of the embodiments described in the present invention, and those skilled in the art can obtain other drawings from them.

FIG. 1 illustrates a block diagram of a storage device;

FIG. 2 shows a schematic diagram of large blocks;

FIG. 3 shows a schematic diagram of logical units and planes;

FIG. 4 illustrates a schematic diagram of large blocks and planar bad blocks of a memory device according to an embodiment of the present application;

FIG. 5A illustrates a block diagram of a memory device according to an embodiment of the present application;

FIG. 5B illustrates a flow diagram for processing commands according to an embodiment of the present application;

FIG. 6 illustrates a flow diagram of a data reclamation process in accordance with an embodiment of the present application;

FIG. 7 is a block diagram of a control component according to yet another embodiment of the present application;

FIG. 8 illustrates a flow diagram for processing commands according to yet another embodiment of the present application; and

FIG. 9 illustrates a flow diagram for processing commands according to yet another embodiment of the present application.

Detailed Description

The technical solutions in the embodiments of the present application are clearly and completely described below with reference to the drawings in the embodiments of the present application, and it is obvious that the described embodiments are some, but not all, embodiments of the present application. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present application.

FIG. 4 shows a schematic diagram of a large block and a planar bad block of a storage device according to an embodiment of the present application.

FIG. 4 shows large blocks (e.g., large block 0, large block 2, large block 410, and large block 420) constructed in a group of 16 LUNs. Physical block 412 of large block 410, provided by plane 0 of LUN2, is a bad block. When the NVM chip leaves the factory, the bad blocks in the NVM chip are marked. When the NVM chip is used by the storage device, the newly appeared bad block is identified and marked. The physical block marked as a bad block is no longer used by the control component of the storage device, and the data written on the physical block marked as a bad block is moved to other physical blocks by the control component. For example, in the garbage collection process, the data written in the bad block is moved to another physical block.

In the prior art, bad blocks are identified from the marks that the NVM chip vendor sets in the NVM chip, and from the responses of the NVM chip to read, program, and erase commands. For example, if an erase or program command applied to a block fails, the block is identified as a bad block.

In research, development, and experiments, the inventors of the present application found that, within the same die, the occurrence of bad blocks is correlated with the plane. In use, if physical block B1 in plane 0 becomes a bad block, the probability that other physical blocks in that plane are already bad or are about to become bad is significantly greater than for other physical blocks. Further, if such blocks have not yet been identified as bad blocks, the control component writes data to them; because they are already bad or about to become bad, programming operations on them are more likely to report failure and trigger error handling. Error handling interrupts the normal processing flow of the control component and greatly affects the performance of the storage device.

Still further, in some embodiments, the control component writes data to two or more large blocks (these large blocks are referred to as open large blocks). If a programming operation on physical block B1 of one open large block fails, programming operations on the physical blocks of the other open large block or blocks that belong to the same plane as physical block B1 are also more likely to fail. This may cause several consecutive error handling operations, which further degrades the performance of the storage device, for example by reducing read/write bandwidth, increasing read/write latency, and causing performance jitter.

With continued reference to FIG. 4, large block 410 and large block 420 are both open large blocks that receive the data being written. Physical block 412 of large block 410 is a bad block and, according to embodiments of the present application, physical block 422 of large block 420 has a greater chance of being a bad block because it belongs to the same plane of the same die as physical block 412. In particular, even if physical blocks 412 and 422 were both normal physical blocks, once physical block 412 is identified as a bad block due to a programming error, there is a high probability that physical block 422 will also become a bad block.

To this end, according to an embodiment of the present application, physical block 412 is identified as a bad block in response to a programming error that occurs while writing data to physical block 412, and physical block 422 is marked as a "pseudo-bad block" even though no programming error has occurred on physical block 422. No more data is written to blocks marked as "pseudo-bad blocks", which reduces the chance that a programming operation fails and that error handling is triggered, and thereby reduces performance jitter of the storage device.

Alternatively, in response to physical block 412 being identified as a bad block, only the corresponding physical block 422 of large block 420, which is currently an open large block, is marked as a "pseudo-bad block". In yet another embodiment, in response to physical block 412 being identified as a bad block, all physical blocks that belong to the same plane (plane 0) as physical block 412 and have not yet been written with data are marked as "pseudo-bad blocks". In still another embodiment, in response to physical block 412 being identified as a bad block, all physical blocks belonging to the same plane (plane 0) as physical block 412 are marked as "pseudo-bad blocks"; physical blocks that are marked as "pseudo-bad blocks" and already hold data are given priority for data migration, and the data written on them is migrated to other physical blocks.
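The three marking alternatives described above differ only in which blocks of the affected plane are selected. The sketch below makes that selection explicit; the `Block`, `Mark`, and `Policy` types and the block numbers are hypothetical and serve only to illustrate the idea.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List


class Mark(Enum):
    GOOD = "good"
    PSEUDO_BAD = "pseudo-bad"
    BAD = "bad"


class Policy(Enum):
    OPEN_LARGE_BLOCKS_ONLY = 1  # only blocks of currently open large blocks
    UNWRITTEN_ONLY = 2          # only blocks that hold no data yet
    WHOLE_PLANE = 3             # every other block of the plane


@dataclass
class Block:
    lun: int
    plane: int
    number: int
    mark: Mark = Mark.GOOD
    has_data: bool = False
    in_open_large_block: bool = False


def mark_after_program_failure(blocks: List[Block], failed: Block,
                               policy: Policy) -> List[Block]:
    """Mark `failed` as bad and return the blocks newly marked pseudo-bad."""
    failed.mark = Mark.BAD
    newly_marked = []
    for b in blocks:
        if b is failed or (b.lun, b.plane) != (failed.lun, failed.plane):
            continue  # only other blocks of the same plane are affected
        if b.mark is Mark.BAD:
            continue  # an existing bad mark is never downgraded
        if policy is Policy.OPEN_LARGE_BLOCKS_ONLY and not b.in_open_large_block:
            continue
        if policy is Policy.UNWRITTEN_ONLY and b.has_data:
            continue
        b.mark = Mark.PSEUDO_BAD
        newly_marked.append(b)
    return newly_marked


b412 = Block(lun=2, plane=0, number=10, has_data=True, in_open_large_block=True)
b422 = Block(lun=2, plane=0, number=11, has_data=True, in_open_large_block=True)
b_free = Block(lun=2, plane=0, number=12)
b_other_plane = Block(lun=2, plane=3, number=10, in_open_large_block=True)
newly = mark_after_program_failure([b412, b422, b_free, b_other_plane],
                                   b412, Policy.WHOLE_PLANE)
```

Under `WHOLE_PLANE`, the blocks in `newly` that have `has_data` set are the ones that would be given priority for data migration.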

FIG. 5A illustrates a block diagram of a memory device according to an embodiment of the present application.

The control component shown in FIG. 5A includes a host interface 510, a command processing unit 530, a command completion processing unit 540, and a media interface 520 for accessing the NVM chips 105. A plurality of NVM chips 105 are coupled to the control component.

The host interface 510 is used to exchange commands and data with a host. A command is, for example, an IO command that accesses the storage device. In one example, the host and the storage device communicate via the NVMe/PCIe protocol; the host interface 510 processes PCIe protocol packets, extracts NVMe protocol commands, and returns the processing results of the NVMe protocol commands to the host.

The command processing unit 530 is coupled to the host interface 510, receives commands sent by the host to the storage device, and provides the commands to the media interface 520. In response to a read command from the host, the command processing unit 530 instructs the media interface 520 to read data from the NVM chip; in response to a write command from the host, it instructs the media interface 520 to send a program command to the NVM chip to write data. The command processing unit 530 is implemented by a CPU or by dedicated hardware.

The media interface 520 provides the result of the NVM chip's processing of a command to the command completion processing unit 540. If the command is processed successfully, the command completion processing unit 540 releases the resources occupied by the command (e.g., a cache that temporarily stores the data to be written by the command) or returns the command processing result to the host. If command processing fails, the command completion processing unit 540 may initiate error handling (e.g., rewriting the cached data of the command to the NVM chip) to try to eliminate the effect of the failure, and/or return the command processing result to the host.

The NVM chip may encounter errors when processing commands. For example, if a command to write page X of physical block B fails when processed by the NVM chip, the media interface 520 indicates to the command completion processing unit 540 that the command to write page X of physical block B failed. The command completion processing unit 540 accordingly identifies the block in which page X is located (block B) as a bad block. Bad blocks are no longer written with data, and data already written to bad blocks is moved to other blocks. The command completion processing unit 540 indicates to the command processing unit 530 that physical block B is a bad block, so that the command processing unit 530 no longer writes data to block B. At a subsequent time, the command processing unit 530 generates a plurality of commands to move the data written to block B to other blocks. The command completion processing unit 540 also marks other physical blocks in the same plane as physical block B as pseudo-bad blocks and indicates to the command processing unit 530 that these other physical blocks are pseudo-bad blocks, so that the command processing unit 530 no longer writes data to them. If one or more of these other physical blocks have already been written with data, the command processing unit 530 also generates a plurality of commands at a later time to move the data written in them to other blocks.

FIG. 5B shows a flow diagram for processing commands according to an embodiment of the present application.

The command processing unit 530 provides commands to the media interface 520 to access the NVM chips 105. The media interface 520 provides the result of the NVM chip's processing of a command to the command completion processing unit 540. The command completion processing unit 540 identifies the processing result of the command. If it recognizes that a program command for page X of block B has failed (550), it instructs the command processing unit 530 to generate a new write command that rewrites the data intended for page X into another block (560). It marks block B as a bad block and also marks other physical blocks belonging to the same plane as physical block B as pseudo-bad blocks (570). The command processing unit 530 no longer writes data to blocks marked as bad blocks or pseudo-bad blocks. Alternatively, the control component may use two or more open large blocks, and in response to physical block B being identified as a bad block, the command completion processing unit 540 may mark as pseudo-bad blocks only those physical blocks of the currently open large blocks that belong to the same plane as physical block B. Still alternatively, the command completion processing unit 540 marks all physical blocks belonging to the same plane as physical block B as pseudo-bad blocks. Some of the physical blocks marked as pseudo-bad blocks have been written with data and others have not. The command processing unit 530 also generates a plurality of commands to move the data written to block B and the data written to the pseudo-bad blocks to other blocks (580). Still alternatively, the command completion processing unit 540 marks as pseudo-bad blocks those physical blocks that belong to the same plane as physical block B and have not yet been written with data.
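The failure path of FIG. 5B can be summarized as an ordered list of actions. The following sketch assumes the variant in which all other blocks of the plane are marked; the function name, the string block labels, and the action tuples are hypothetical and only mirror steps 550 to 580 as described in the text.

```python
from typing import Dict, List, Set, Tuple

Action = Tuple[str, str]


def handle_program_result(ok: bool, block: str, page: int,
                          same_plane: List[str],
                          marks: Dict[str, str],
                          written: Set[str]) -> List[Action]:
    """Return the follow-up actions for one completed program command."""
    if ok:  # step 550 finds no failure: just release the command's resources
        return [("release", f"{block}/{page}")]

    actions: List[Action] = [("rewrite", f"{block}/{page}")]  # step 560
    marks[block] = "bad"                                      # step 570
    to_migrate = [block]
    for other in same_plane:
        if marks.get(other) != "bad":  # blocks already bad keep their mark
            marks[other] = "pseudo-bad"
            to_migrate.append(other)
    # Step 580: move data off the bad block and the written pseudo-bad blocks.
    actions += [("migrate", b) for b in to_migrate if b in written]
    return actions


marks = {"B7": "bad"}        # B7 was already marked bad earlier
written = {"B1", "B3"}       # blocks of the plane that currently hold data
actions = handle_program_result(False, "B1", 4, ["B3", "B5", "B7"], marks, written)
```

The rewrite of the failed page comes first because that data exists only in the cache at this point, whereas migration of already written data can be deferred.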

Alternatively, when the command completion processing unit 540 marks a pseudo-bad block, it also checks whether the physical block to be marked is already a bad block. If the physical block to be marked has already been marked as a bad block (whether marked at the factory or marked during use), its bad block mark is left unchanged and it is not marked as a pseudo-bad block. If the physical block to be marked is not marked as a bad block, it is marked as a pseudo-bad block.

Note that the bad block mark is a different mark from the pseudo-bad block mark. By way of example, the command processing unit 530 no longer writes data to either bad blocks or pseudo-bad blocks. However, a pseudo-bad block is subsequently erased, for example in a garbage collection operation: the command processing unit 530 applies an erase command to the pseudo-bad block through the media interface 520, and if the erase succeeds, the command completion processing unit 540 clears the pseudo-bad block mark, so that the command processing unit 530 can open the physical block again and write data to it. For a bad block, the command processing unit 530 no longer applies erase commands to it.
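The marking rule described above can be sketched as follows. This is a minimal illustration rather than the implementation of the embodiment: the `BlockState` enum, the `states` and `plane_of` dictionaries, and the function names are hypothetical bookkeeping introduced only for this example, and the variant shown is the one that marks every same-plane block.

```python
from enum import Enum


class BlockState(Enum):
    GOOD = "good"
    PSEUDO_BAD = "pseudo-bad"
    BAD = "bad"


def on_program_failure(states, plane_of, failed_block):
    """Mark the failed block bad and its same-plane neighbours pseudo-bad.

    states:   dict mapping block id -> BlockState (hypothetical bookkeeping)
    plane_of: dict mapping block id -> plane number (hypothetical)
    """
    states[failed_block] = BlockState.BAD
    for block, plane in plane_of.items():
        if block == failed_block or plane != plane_of[failed_block]:
            continue
        # An existing bad-block mark (factory or in-use) is left unchanged.
        if states[block] is not BlockState.BAD:
            states[block] = BlockState.PSEUDO_BAD


def writable(states, block):
    """Only good blocks may receive new data."""
    return states[block] is BlockState.GOOD


plane_of = {"B": 0, "B1": 0, "B3": 0, "B2": 1}
states = {block: BlockState.GOOD for block in plane_of}
states["B3"] = BlockState.BAD  # e.g. a block already marked bad at the factory
on_program_failure(states, plane_of, "B")
```

After the call, block B is bad, B1 (same plane) is pseudo-bad, B3 keeps its bad mark, and B2 (a different plane) remains good and writable.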

FIG. 6 illustrates a flow diagram of a data reclamation process in accordance with an embodiment of the present application.

For example, a command processing unit of the control component (e.g., the command processing unit 530 of FIG. 5A) initiates data reclamation at an appropriate time, moving valid data in a physical block being reclaimed to another physical block and erasing the reclaimed physical block. The command processing unit selects a physical block to be reclaimed (denoted as physical block B2) (610). It identifies whether physical block B2 is marked as a bad block (620). For bad blocks, no data reclamation is necessary (670). If physical block B2 is not marked as a bad block, it further identifies whether the block is marked as a pseudo-bad block (630). If physical block B2 is not marked as a pseudo-bad block, the valid data stored on it is reclaimed (640) and physical block B2 is erased (650). If step 630 identifies that physical block B2 has been marked as a pseudo-bad block, the physical block is erased directly (650) without reclaiming its data.

A command completion processing unit of the control component (e.g., the command completion processing unit 540 of FIG. 5A) identifies whether the erase command on physical block B2 succeeded (660). If the erase command on physical block B2 succeeded, meaning that it is not a bad block, the pseudo-bad block mark on physical block B2 is cleared (physical block B2 is recorded as an available good block) (680). If the erase command on physical block B2 failed, the block is determined to be a bad block and physical block B2 is marked as a bad block (optionally its pseudo-bad block mark is also cleared) (690). A physical block whose pseudo-bad block mark has been cleared (an available good block) may be allocated to carry written data in the future. Thus, according to embodiments of the present application, some physical blocks are predictively marked as pseudo-bad blocks before any actual damage has been observed. This reduces the probability that program operations writing data to them fail, and in particular the probability that program operations fail while data continues to be written to physical blocks of the same plane across multiple large blocks. It thereby reduces the occurrence of error handling operations and the performance jitter of the storage device. Meanwhile, whether a pseudo-bad block is actually faulty is identified during the data reclamation (or garbage collection, or storage block reclamation) operation, and pseudo-bad blocks without faults are marked as good blocks again so that they can carry written data in the future, avoiding waste of storage block resources.
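The reclamation flow of FIG. 6 can be sketched as below. This is an illustrative sketch under stated assumptions: block states are plain strings, and `move_valid_data` and `erase` are hypothetical callbacks standing in for the commands the control component would issue to the NVM chip.

```python
def reclaim(block, states, move_valid_data, erase):
    """Reclaim one block following the flow of FIG. 6; return its new state.

    states:          dict block id -> "good" | "pseudo-bad" | "bad"
    move_valid_data: hypothetical callback that relocates a block's valid data
    erase:           hypothetical callback returning True if the erase succeeded
    """
    state = states[block]
    if state == "bad":            # 620 -> 670: bad blocks are not reclaimed
        return state
    if state == "good":           # 630 -> 640: move valid data before erasing
        move_valid_data(block)
    # 650: a pseudo-bad block reaches the erase step directly
    if erase(block):              # 660
        states[block] = "good"    # 680: pseudo-bad mark cleared
    else:
        states[block] = "bad"     # 690: erase failure confirms a bad block
    return states[block]


moved = []
states = {"B2": "pseudo-bad", "B4": "pseudo-bad", "B5": "bad", "B6": "good"}
erase_ok = lambda block: block != "B4"  # pretend that only B4 fails to erase
results = {b: reclaim(b, states, moved.append, erase_ok) for b in list(states)}
```

In this run B2 returns to the good pool, B4 is confirmed bad by its failed erase, B5 is skipped, and B6 has its valid data moved before being erased.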

FIG. 7 is a block diagram of a control component according to yet another embodiment of the present application. The control component shown in FIG. 7 includes a host interface 710, a command processing unit 730, a command completion processing unit 740, and a media interface 720 for accessing the NVM chip 105. A plurality of NVM chips 105 are coupled to the control component. An NVM chip includes one or more logical units (LUNs), e.g., LUN0, LUN1, LUN2, and LUN3.

The control component provides a plurality of command queues (e.g., command queue 0, command queue 1, command queue 2, and command queue 3), each of which corresponds to one of the logical units of the NVM chip. For example, the command processing unit 730 places a command that accesses LUN1 into command queue 1, which corresponds to LUN1.

The media interface 720 provides the results of the NVM chip processing the commands to the command completion processing unit 740.

The command queue is a first-in-first-out queue. The command processing unit 730 fills the command to the end of the command queue. The command is fetched from the head of the command queue and the NVM chip is accessed by the media interface 720 according to the command.

The command queue includes a plurality of entries, each entry accommodating one command, so that the command queue can hold multiple commands that access its corresponding logical unit. The commands in the command queue are then processed in order. By way of example, referring to FIG. 7, command queue 1 corresponding to LUN1 is populated with commands to write (also referred to as program) page X, write page X+1, read page Y, and write page X+2, in that order.
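The per-LUN first-in-first-out queues can be modelled in a few lines. This is only a sketch: the `queues` dictionary, the `submit` and `fetch` helpers, and the tuple encoding of commands are hypothetical, and the command sequence is illustrative.

```python
from collections import deque

# One FIFO queue per logical unit (four LUNs, as in the example of FIG. 7).
queues = {lun: deque() for lun in range(4)}


def submit(lun, command):
    """Command processing unit side: fill the command at the tail."""
    queues[lun].append(command)


def fetch(lun):
    """Media interface side: take the next command from the head, if any."""
    return queues[lun].popleft() if queues[lun] else None


for command in [("write", "X"), ("write", "X+1"), ("read", "Y")]:
    submit(1, command)

first = fetch(1)
```

Because commands leave the queue strictly in submission order, commands queued behind a failing write are already committed by the time the failure is reported.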

The NVM chip may encounter errors when processing commands. For example, the command to write page X in command queue 1 fails when processed by the NVM chip, and the media interface 720 in response indicates to the command completion processing unit 740 that the command to write page X failed. The command completion processing unit 740 accordingly identifies the physical block in which page X is located (referred to as physical block B) as a bad block. Bad blocks are no longer written with data, and data that has already been written to bad blocks will be moved to other blocks. The command completion processing unit 740 indicates to the command processing unit 730 that block B is a bad block, so that the command processing unit 730 no longer writes data to block B. The command processing unit 730 also generates a plurality of commands at a subsequent time to move the data written to block B to other blocks.

The command completion processing unit 740 also marks other physical blocks of the same plane as physical block B as pseudo-bad blocks and indicates to the command processing unit 730 that these other physical blocks are pseudo-bad blocks, so that the command processing unit 730 no longer writes data to them. If one or more of these other physical blocks have been written with data, the command processing unit 730 also generates a plurality of commands at a later time to move the data written in these other physical blocks to other blocks.

However, referring to command queue 1, after the command to write page X was queued (at which point block B had not yet been identified as a bad block), the command to write page X+1 and the command to write page X+2 were also placed in the queue. By the time the command completion processing unit 740 learns that the command to write page X failed, the command to write page X+1 and the command to write page X+2 have already been added to command queue 1 or committed to LUN1, and they cannot be undone. As a result, write commands (e.g., the command to write page X+1 and the command to write page X+2) are in fact submitted to bad block B.

If the NVM chip fails to process the command for writing page X +1 or the command for writing page X +2, the command completion processing unit 740, in response to the indication that the command for writing page X +1 or the command for writing page X +2 is faulty, has the opportunity to notify the command processing unit 730 to regenerate the write command, so as to rewrite the data corresponding to the command for writing page X +1 or the command for writing page X +2 into another block, thereby ensuring that the data is reliably recorded in the NVM chip.

However, if the NVM chip successfully processes the command to write page X+1 or the command to write page X+2, the command completion processing unit 740 receives an indication that the command was processed successfully. In this case, the resources of the command to write page X+1 or the command to write page X+2 are released, and the data written to page X+1 or page X+2 remains unreliably stored in block B until the command processing unit, in response to block B having been identified as a bad block, moves the data of block B to another block; only then is the data reliably stored in the NVM chip. Since block B has been identified as a bad block, data read from block B in the meantime may already be corrupted, which poses a risk to the data reliability of the storage device.

FIG. 8 illustrates a flow diagram for processing commands according to yet another embodiment of the present application.

The command completion processing unit 740 (see also FIG. 7) obtains a message indicating that the NVM chip has finished processing a command (810), and identifies the command processing result indicated by the message. If programming page X of block B failed (815), the command processing unit 730 is instructed to generate a new write command to rewrite the data to be written to page X into another block (840), and block B is marked as a bad block (845). The command processing unit 730 no longer writes data to a block marked as a bad block.

In response to physical block B being marked as a bad block, one or more other physical blocks belonging to the same plane as physical block B (denoted as physical block B1) are also obtained (850). It is further identified whether physical block B1 has already been marked as a bad block (855). If physical block B1 is not marked as a bad block (855), physical block B1 is marked as a pseudo-bad block (860). If physical block B1 has already been marked as a bad block, its bad block mark need not be modified. The command processing unit 730 no longer writes data to blocks marked as bad blocks or pseudo-bad blocks.

Alternatively, the control component may use two or more open large blocks, and in response to physical block B being identified as a bad block, the command completion processing unit 740 may mark as pseudo-bad blocks only those physical blocks of the currently open large blocks that belong to the same plane as physical block B. Still alternatively, the command completion processing unit 740 marks all physical blocks belonging to the same plane as physical block B as pseudo-bad blocks. Some of the physical blocks marked as pseudo-bad blocks have been written with data and others have not. Still alternatively, the command completion processing unit 740 marks as pseudo-bad blocks only those physical blocks that belong to the same plane as physical block B and have not yet been written with data.

The command processing unit 730 also generates commands to move the data written to block B, and optionally the data written to the pseudo-bad blocks, to other blocks (865) (a process referred to as data reclamation). The data reclamation operation is not necessarily performed immediately, but at an appropriate time (e.g., when the storage device is idle, or when the number of available blocks of the storage device falls below a threshold).

If the command completion processing unit 740 recognizes that the command processing to write page X of block B is successful (815), it further checks whether block B is marked as a bad block or a pseudo-bad block (820). If block B is marked as a bad block or a pseudo-bad block, although the command processing of page X of block B is currently successful, it is handled according to the failure of the command processing of page X of block B, for example, the command processing unit 730 is instructed to generate a new write command to rewrite the data to be written to page X into another block (830).

Based on the previous record, the command completion processing unit 740 knows that block B has been marked as a bad block or a pseudo-bad block, and therefore knows that commands writing pages of block B are at risk or unreliable. It handles them as if the write commands had failed, for example by immediately rewriting the data to be written to these pages into other blocks rather than waiting for a data reclamation operation to migrate the data. As a result, data is not left stored in pages of bad blocks or pseudo-bad blocks (or remains there only briefly), ensuring the reliability of data storage in the storage device.

If block B is not marked as a bad block or a pseudo-bad block (820), the command completion processing unit 740 releases the resources occupied by the command (e.g., a cache temporarily storing the data to be written by the command), or returns the command processing result to the host.
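The completion handling of FIG. 8 can be summarized in a short sketch. It is written under assumptions: the `marks` and `plane_of` dictionaries and the returned action strings are hypothetical, and the variant shown marks every same-plane block as pseudo-bad.

```python
def on_program_complete(block, success, marks, plane_of):
    """Return the action for one completed program command (FIG. 8 flow).

    marks:    dict block id -> "bad" | "pseudo-bad"; an absent key means good
    plane_of: dict block id -> plane number (hypothetical bookkeeping)
    """
    if not success:                                    # 815 -> 840, 845
        marks[block] = "bad"
        for other, plane in plane_of.items():          # 850
            if other != block and plane == plane_of[block]:
                # 855, 860: an existing mark (bad or pseudo-bad) is kept
                marks.setdefault(other, "pseudo-bad")
        return "rewrite"
    if marks.get(block) in ("bad", "pseudo-bad"):      # 820 -> 830
        return "rewrite"  # success on a suspect block is handled as a failure
    return "release"      # free the command's resources or report to the host


plane_of = {"B": 0, "B1": 0, "B2": 1}
marks = {}
actions = [
    on_program_complete("B", False, marks, plane_of),   # write page X fails
    on_program_complete("B", True, marks, plane_of),    # write page X+1 succeeds
    on_program_complete("B1", True, marks, plane_of),   # same plane as B
    on_program_complete("B2", True, marks, plane_of),   # different plane
]
```

Only the write to B2 is released normally; the in-flight writes to B and to its same-plane neighbour B1 are rewritten elsewhere even though the chip reported success.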

FIG. 9 illustrates a flow diagram for processing commands according to yet another embodiment of the present application.

There may be tens of thousands or even more blocks in the storage device. There may be tens to hundreds of blocks carrying write commands. Each block carries hundreds of write commands (number of pages in a block).

If, for every write command result received from an NVM chip, the control component had to look up whether the accessed block has previously had a command fail, this would introduce extra overhead and increase the workload of the storage device.

In this regard, according to the embodiment of FIG. 9, the command completion processing unit 740 (see also FIG. 7) obtains a message indicating that the NVM chip has finished processing a command (910), and identifies the command processing result indicated by the message (915). If programming page X of physical block B failed (915), the command processing unit 730 is instructed to generate a new write command to rewrite the data to be written to page X into another block (950). Block B is marked as a bad block, and a bad block flag is also set for the LUN in which block B is located (955).

In response to physical block B being marked as a bad block, one or more other physical blocks belonging to the same plane as physical block B (denoted as physical block B1) are also obtained (960). It is further identified whether physical block B1 has already been marked as a bad block (965). If physical block B1 is not marked as a bad block (965), physical block B1 is marked as a pseudo-bad block (970). If physical block B1 has already been marked as a bad block, its bad block mark need not be modified. The command processing unit 730 no longer writes data to blocks marked as bad blocks or pseudo-bad blocks.

The command processing unit 730 also generates a plurality of commands to move the data written to block B, and optionally the data written to the pseudo-bad blocks, to other blocks (975) (a process referred to as data reclamation).

If the command completion processing unit 740 identifies that the command to write page X of block B succeeded (915), it further checks whether a bad block flag is set for the LUN in which block B is located (denoted as LUN L) (920). If no bad block flag is set for LUN L (920), processing of the command to write page X of block B is complete, and the command completion processing unit 740 releases the resources occupied by the command (e.g., a cache temporarily storing the data to be written by the command), or returns the command processing result to the host (940). If a bad block flag is set for LUN L (920), it is further checked whether block B is marked as a bad block or a pseudo-bad block (925). If block B is marked as a bad block or a pseudo-bad block (925), then although the command to write page X of block B has succeeded, it is handled as if the command had failed, for example by instructing the command processing unit 730 to generate a new write command to rewrite the data to be written to page X into another block (930). If block B is not marked as a bad block or a pseudo-bad block (925), the bad block flag of LUN L in which block B is located is cleared (935), processing of the command to write page X of block B is complete, and the command completion processing unit 740 releases the resources occupied by the command (e.g., a cache temporarily storing the data to be written by the command), or returns the command processing result to the host (940).

For example, referring to FIG. 7, command queue 1 is populated with the command to write page X, the command to write page X+1, and the command to write page X+2. According to the embodiment illustrated in FIG. 9, the command completion processing unit 740, in response to finding that the command to write page X failed, not only has the data corresponding to the command rewritten, but also records that block B, in which page X is located, is a bad block and that blocks belonging to the same plane as block B are pseudo-bad blocks, and sets a bad block flag for LUN L in which block B is located. Next, as an example, the NVM chip successfully processes the command to write page X+1 and the command to write page X+2, and the command completion processing unit 740 receives an indication to that effect. Although the command completion processing unit 740 knows that the command to write page X+1 and the command to write page X+2 were processed successfully, it further checks whether a bad block flag is set for LUN L accessed by these commands. Based on the previous record, a bad block flag is set for LUN L, so the command completion processing unit 740 goes on to check whether block B is marked as a bad block or a pseudo-bad block. Based on the previous record, the command completion processing unit 740 knows that block B has been marked as a bad block, and therefore knows that the command to write page X+1 and the command to write page X+2 are at risk or unreliable, and handles them as if the command to write page X+1 and the command to write page X+2 had failed.

Next, the command completion processing unit 740 finds that the command to write page P of block B2 succeeded (block B2 is different from block B, is located in a different plane from block B, and is also located in LUN L), and further finds that a bad block flag is set for LUN L accessed by the command. It therefore goes on to check whether block B2 is marked as a bad block. Based on the previous record, the command completion processing unit 740 knows that block B2 is not marked as a bad block or a pseudo-bad block, and accordingly clears the bad block flag set on LUN L. Next, the command completion processing unit 740 finds that the command to write page P+1 of block B2 succeeded, and further finds that no bad block flag is set for LUN L accessed by the command, so processing of the command to write page P+1 of block B2 is complete and the check of whether block B2 is a bad block is omitted.
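The LUN-level fast path of FIG. 9 can be sketched as follows, replaying the example above. The names are hypothetical: `marks` holds per-block marks, `lun_flags` is the set of LUNs whose bad block flag is set, and the returned strings stand for the actions taken.

```python
def on_program_success(block, lun, marks, lun_flags):
    """Handle a successful program completion following the FIG. 9 flow.

    marks:     dict block id -> "bad" | "pseudo-bad"; an absent key means good
    lun_flags: set of LUN ids whose bad block flag is currently set
    """
    if lun not in lun_flags:                        # 920: skip per-block lookup
        return "release"                            # 940
    if marks.get(block) in ("bad", "pseudo-bad"):   # 925
        return "rewrite"                            # 930: handled as a failure
    lun_flags.discard(lun)                          # 935: clear the LUN flag
    return "release"                                # 940


# State recorded after the write to page X of block B failed (950-970).
marks = {"B": "bad", "B1": "pseudo-bad"}
lun_flags = {"L"}

actions = [
    on_program_success("B", "L", marks, lun_flags),   # write page X+1 of B
    on_program_success("B", "L", marks, lun_flags),   # write page X+2 of B
    on_program_success("B2", "L", marks, lun_flags),  # write page P of B2
    on_program_success("B2", "L", marks, lun_flags),  # write page P+1 of B2
]
```

The first two completions are rewritten, the third clears the flag on LUN L, and the fourth takes the fast path without consulting the per-block marks. The sketch mirrors the flow as described, including the fact that the flag is cleared by the first successful write to an unmarked block.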

The data reclamation operations in these embodiments are also implemented according to the flow illustrated in FIG. 6.

The above description is only for the specific embodiments of the present application, but the scope of the present application is not limited thereto, and any person skilled in the art can easily conceive of the changes or substitutions within the technical scope of the present application, and shall be covered by the scope of the present application. Therefore, the protection scope of the present application shall be subject to the protection scope of the claims.
