NVMe solid state disk exception handling method, device, and integrated chip

Document No.: 784757 | Publication date: 2021-04-09

Reading note: This technology, an NVMe solid state disk exception handling method, device, and integrated chip, was designed by Liu Hailiang (刘海亮), Huang Rui (黄锐), Wang Zaijin (汪再金), and Liu Yang (刘洋) on 2020-12-23. The application discloses an NVMe solid state disk exception handling method, device, and integrated chip. The method comprises the following steps: when the NAND FLASH controller detects an error during direct transmission, it sets the data_trans_err signal of its transmission interface with the NVMe controller to 1 and reports the error information to the CPU; when the NVMe controller detects that the data_trans_err signal is 1, it reads the erroneous data out of the NAND FLASH controller, sends the handshake signal to the NAND FLASH controller, and then discards the data inside the NVMe controller; and when the CPU receives the error information, it calls the RAID module to recover the data. By marking transmission errors and then handling and recovering from them, the application prevents the read flow from stalling or stopping on such an error and reduces the impact of the error on data transmission.

1. An NVMe solid state disk exception handling method, characterized by comprising the following steps:

when the NAND FLASH controller detects an error during direct transmission, the NAND FLASH controller sets the data_trans_err signal of its transmission interface with the NVMe controller to 1 and reports the error information to the CPU;

when the NVMe controller detects that the data_trans_err signal is 1, the NVMe controller reads the erroneous data out of the NAND FLASH controller, sends a handshake signal to the NAND FLASH controller, and then discards the data inside the NVMe controller;

and when the CPU receives the error information, the CPU calls a RAID module to recover the data.
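The three steps of claim 1 can be sketched as a small software model. This is only an illustrative simulation under stated assumptions, not the hardware described in the application: the class names, the `poll` and `read_out` methods, and the `raid_recover` callback are all hypothetical, and only the `data_trans_err` signal name comes from the claim.

```python
from dataclasses import dataclass, field


@dataclass
class Cpu:
    """Firmware side: collects error reports and triggers recovery."""
    reports: list = field(default_factory=list)
    recovered: list = field(default_factory=list)

    def report_error(self, chunk_id):
        self.reports.append(chunk_id)

    def handle_reports(self, raid_recover):
        # raid_recover is a hypothetical callback standing in for the RAID module.
        while self.reports:
            chunk_id = self.reports.pop(0)
            self.recovered.append((chunk_id, raid_recover(chunk_id)))


@dataclass
class NandFlashController:
    cpu: Cpu
    data_trans_err: int = 0   # interface signal named in the claim
    pending: tuple = None     # (chunk_id, payload) waiting on the interface
    handshakes: int = 0

    def direct_transfer(self, chunk_id, payload, ok):
        # Present a chunk on the interface; flag and report it if the transfer failed.
        self.pending = (chunk_id, payload)
        self.data_trans_err = 0 if ok else 1
        if not ok:
            self.cpu.report_error(chunk_id)

    def read_out(self):
        chunk, self.pending = self.pending, None
        return chunk

    def handshake(self):
        self.handshakes += 1
        self.data_trans_err = 0


@dataclass
class NvmeController:
    forwarded: list = field(default_factory=list)
    discarded: int = 0

    def poll(self, nand):
        err = nand.data_trans_err
        chunk_id, payload = nand.read_out()  # always drain, even on error
        nand.handshake()                     # so the interface never stalls
        if err:
            self.discarded += 1              # drop the bad data internally
        else:
            self.forwarded.append((chunk_id, payload))


cpu = Cpu()
nand = NandFlashController(cpu)
nvme = NvmeController()

for chunk_id, ok in [(0, True), (1, False), (2, True)]:
    nand.direct_transfer(chunk_id, b"data%d" % chunk_id, ok)
    nvme.poll(nand)

cpu.handle_reports(lambda chunk_id: b"recovered%d" % chunk_id)
```

In this model the NVMe controller still reads out and acknowledges the erroneous chunk, which is what keeps the read flow moving while the CPU handles recovery separately.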

2. The NVMe solid state disk exception handling method of claim 1, further comprising:

when the RAID module fails to recover the data, the following operations are carried out:

configuring a DMA command through the CPU, and setting the data_cmd_err field in the DMA command to 1;

when the NVMe controller receives the DMA command, moving the data corresponding to the DMA command through the NVMe controller so as to write it into host memory, setting the data_cmd_err field in a first execution result corresponding to the DMA command to 1, and reporting the first execution result to the CPU;

and when the NVMe controller completes the data transfer of all DMA commands corresponding to the NVMe command, setting the data_cmd_err field in a second execution result corresponding to the NVMe command to 1 through the NVMe controller, and reporting the second execution result to the CPU.
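The propagation of the data_cmd_err flag from a DMA command to the first and second execution results might look like the following sketch. The record types, the per-command DMA counter, and the rule that the second result carries the OR of the per-DMA flags are assumptions made for illustration; the claim only states that the field is set to 1 in each result.

```python
from dataclasses import dataclass


@dataclass
class DmaCommand:
    nvme_cmd_id: int
    length: int
    data_cmd_err: int = 0     # set to 1 by the CPU when RAID recovery failed


@dataclass
class DmaResult:              # "first execution result", one per DMA command
    nvme_cmd_id: int
    data_cmd_err: int


@dataclass
class NvmeCmdResult:          # "second execution result", one per NVMe command
    nvme_cmd_id: int
    data_cmd_err: int


class NvmeController:
    def __init__(self, dma_count_per_cmd):
        # dma_count_per_cmd: hypothetical map of NVMe command id -> number of DMA commands
        self.remaining = dict(dma_count_per_cmd)
        self.err_seen = {cid: 0 for cid in dma_count_per_cmd}
        self.host_memory = []

    def execute(self, dma):
        """Move one DMA command's data; return (first_result, second_result or None)."""
        # The data is still moved to host memory even when it is flagged as bad.
        self.host_memory.append((dma.nvme_cmd_id, dma.length))
        self.err_seen[dma.nvme_cmd_id] |= dma.data_cmd_err
        self.remaining[dma.nvme_cmd_id] -= 1
        first = DmaResult(dma.nvme_cmd_id, dma.data_cmd_err)
        second = None
        if self.remaining[dma.nvme_cmd_id] == 0:
            # Last DMA command of this NVMe command: summarize the error state.
            second = NvmeCmdResult(dma.nvme_cmd_id, self.err_seen[dma.nvme_cmd_id])
        return first, second


# NVMe command 7 is split into three DMA commands; the second one carries bad data.
nvme = NvmeController({7: 3})
results = [nvme.execute(DmaCommand(7, 4096, err)) for err in (0, 1, 0)]
firsts = [first.data_cmd_err for first, _ in results]
final = results[-1][1]
```

Moving the data regardless of the flag lets the command complete through the normal path, with the error carried in the results rather than raised as a stall.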

3. The NVMe solid state disk exception handling method of claim 2, further comprising:

when the CPU receives the second execution result, the CPU issues completion-queue-specific data to the CQ corresponding to the NVMe command, with the error status set to 1;

and writing the specific data into the host memory through the NVMe controller, and sending an MSI-X interrupt message, an MSI interrupt message or a pin-based interrupt message to the host according to the PCIe configuration.
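As a rough illustration of this step, the sketch below packs a 16-byte completion queue entry and picks an interrupt type. The field layout follows the NVMe base specification (DW2 holds the SQ identifier and SQ head pointer; DW3 holds the status field, phase tag, and command identifier). The function names, the MSI-X-first preference order, and the placeholder status value of 1 are assumptions; real firmware would choose a status code defined by the specification.

```python
import struct
from enum import Enum


class Interrupt(Enum):
    MSI_X = "MSI-X"
    MSI = "MSI"
    PIN_BASED = "pin-based"


def select_interrupt(msix_enabled, msi_enabled):
    """Pick the interrupt mechanism from the PCIe configuration (MSI-X preferred here)."""
    if msix_enabled:
        return Interrupt.MSI_X
    if msi_enabled:
        return Interrupt.MSI
    return Interrupt.PIN_BASED


def build_cqe(cid, sqid, sq_head, phase, status):
    """Pack a 16-byte completion queue entry as four little-endian dwords.

    DW0 and DW1 are left as zero; DW2 = SQ identifier (31:16) | SQ head (15:0);
    DW3 = status field (31:17) | phase tag (16) | command identifier (15:0).
    """
    dw2 = ((sqid & 0xFFFF) << 16) | (sq_head & 0xFFFF)
    dw3 = ((status & 0x7FFF) << 17) | ((phase & 1) << 16) | (cid & 0xFFFF)
    return struct.pack("<IIII", 0, 0, dw2, dw3)


# status=1 is a placeholder for "error", mirroring the claim's wording only.
cqe = build_cqe(cid=0x12, sqid=1, sq_head=5, phase=1, status=1)
dw0, dw1, dw2, dw3 = struct.unpack("<IIII", cqe)
irq = select_interrupt(msix_enabled=False, msi_enabled=True)
```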

4. The NVMe solid state disk exception handling method of claim 1, further comprising:

when the RAID module recovers the data successfully, the following operations are carried out:

writing the successfully recovered data into an on-chip cache or a DDR memory of the SSD controller through the CPU, and correspondingly configuring a DMA command;

when the NVMe controller receives the DMA command, the data corresponding to the DMA command is moved through the NVMe controller so as to be written into a host memory, and a first execution result corresponding to the DMA command is reported to the CPU;

and when the NVMe controller completes data transfer of all the DMA commands corresponding to the NVMe commands, reporting a second execution result corresponding to the NVMe commands to the CPU through the NVMe controller.
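The success path can be sketched as follows. The application does not say how the RAID module rebuilds data, so this example assumes simple XOR parity across a stripe; `xor_recover`, `stage_and_build_dma`, and the dictionary used as a DMA command are hypothetical names, and a `bytearray` stands in for the on-chip cache or DDR.

```python
from functools import reduce


def xor_recover(surviving_chunks):
    """Rebuild the missing chunk of a stripe as the XOR of all surviving chunks."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*surviving_chunks))


def stage_and_build_dma(buffer, offset, data, nvme_cmd_id):
    """Copy recovered data into the staging buffer and describe it with a DMA command."""
    buffer[offset:offset + len(data)] = data
    return {
        "nvme_cmd_id": nvme_cmd_id,
        "src_offset": offset,
        "length": len(data),
        "data_cmd_err": 0,   # recovery succeeded, so no error flag
    }


d0, d1, d2 = b"\x01\x02\x03\x04", b"\x10\x20\x30\x40", b"\xaa\xbb\xcc\xdd"
parity = xor_recover([d0, d1, d2])        # parity is the XOR of the data chunks
rebuilt = xor_recover([d0, d2, parity])   # d1 was lost in the direct transfer

ddr = bytearray(16)                       # stands in for on-chip cache or DDR
dma = stage_and_build_dma(ddr, 4, rebuilt, nvme_cmd_id=7)
```

Staging the rebuilt chunk in a buffer means the recovered data takes the ordinary CPU-configured DMA route instead of the direct hardware path that failed.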

5. The NVMe solid state disk exception handling method of claim 4, further comprising:

returning CQ data indicating successful execution to the host through the NVMe controller.

6. The NVMe solid state disk exception handling method of claim 4, further comprising:

when the CPU receives the second execution result, the CPU issues completion-queue-specific data to the CQ corresponding to the NVMe command;

7. An NVMe solid state disk exception handling device, characterized by comprising a NAND FLASH controller, an NVMe controller and a CPU, wherein:

the NAND FLASH controller, on detecting an error during direct transmission, sets the data_trans_err signal of its transmission interface with the NVMe controller to 1 and reports the error information to the CPU;

the NVMe controller, on detecting that the data_trans_err signal is 1, reads the erroneous data out of the NAND FLASH controller, sends a handshake signal to the NAND FLASH controller, and then discards the data inside the NVMe controller;

and the CPU, on receiving the error information, calls a RAID module to recover the data.

8. The NVMe solid state disk exception handling device of claim 7, wherein:

when the RAID module fails to recover the data, the CPU configures a DMA command and sets the data_cmd_err field in the DMA command to 1;

when the NVMe controller receives the DMA command, it moves the data corresponding to the DMA command so as to write the data into host memory, sets the data_cmd_err field in a first execution result corresponding to the DMA command to 1, and reports the first execution result to the CPU;

and when the NVMe controller completes the data transfer of all DMA commands corresponding to the NVMe command, it sets the data_cmd_err field in a second execution result corresponding to the NVMe command to 1 and reports the second execution result to the CPU.

9. The NVMe solid state disk exception handling device of claim 8, wherein:

when the CPU receives the second execution result, it issues completion-queue-specific data to the CQ corresponding to the NVMe command, with the error status set to 1;

and the NVMe controller writes the specific data into the host memory and sends an MSI-X interrupt message, an MSI interrupt message or a pin-based interrupt message to the host according to the PCIe configuration.

10. An integrated chip, comprising:

the NVMe solid state disk exception handling device of any one of claims 7-9.

Technical Field

The invention relates to the field of solid state disk operation, in particular to an NVMe solid state disk exception handling method, device and integrated chip.

Background

Currently, NVMe (Non-Volatile Memory Express) solid state disks (SSDs) are attracting increasing attention in storage because of their low latency, low power consumption, and high bandwidth, and they have become a new trend in storage device development. The components on the data path of an NVMe SSD controller mainly include PCIe (Peripheral Component Interconnect Express), the NVMe controller, an on-chip cache or DDR (Double Data Rate synchronous dynamic random access memory), the NAND Flash controller, an LDPC (Low-Density Parity-Check) codec, RAID (Redundant Array of Independent Disks), and the CPU (Central Processing Unit). To improve read bandwidth, a recent read method transmits the fragmented data that the NAND Flash controller reads from the NAND flash chips directly to the NVMe controller through a hardware circuit. This avoids the waiting time in the cache or DDR of the SSD controller, and the direct hardware transmission also saves the time the CPU would otherwise spend configuring NVMe DMA (Direct Memory Access), thereby improving the read bandwidth of the solid state disk.

However, in this process, once a data read error occurs, the read operation of the solid state disk stalls and cannot continue. How to handle the read error and recover the data is a problem to be solved by those skilled in the art.

Disclosure of Invention

In view of this, the present invention provides an NVMe solid state disk exception handling method, an NVMe solid state disk exception handling device, and an integrated chip. The specific scheme is as follows:

an NVMe solid state disk exception handling method comprises the following steps:

and when the CPU receives the information, calling a RAID module through the CPU to recover the data.