Reduced error correction code for dual channel DDR DRAM

Document No.: 1800825. Published: 2021-11-05.

Inventors: Narsing Krishna Vijayrao and Christian Markus Petersen. Created: 2021-05-06.

Abstract: The application relates to reduced error correction codes for dual channel DDR dynamic random access memories. A first set of 64 bytes of data and a second set of 64 bytes of data are received. A first set of eight Error Correction Code (ECC) bytes is calculated for the first set of 64 bytes of data, and a second set of eight ECC bytes is calculated for the second set of 64 bytes of data. The first set of 64 bytes of data, the second set of 64 bytes of data, the first set of eight ECC bytes, and the second set of eight ECC bytes are sent to one or more fifth-generation double data rate (DDR5) Synchronous Dynamic Random Access Memory (SDRAM) modules over a DDR5 dual channel in a single burst, wherein the DDR5 dual channel includes a first data channel and a second data channel, and wherein the first data channel and the second data channel are driven by the same clock signal.

1. A method, comprising:

receiving a first set of 64 bytes of data and a second set of 64 bytes of data;

calculating a first set of eight Error Correction Code (ECC) bytes for the first set of 64 bytes of data and a second set of eight ECC bytes for the second set of 64 bytes of data; and

transmitting the first set of 64 bytes of data, the second set of 64 bytes of data, the first set of eight ECC bytes, and the second set of eight ECC bytes to one or more Synchronous Dynamic Random Access Memory (SDRAM) modules over a dual channel in a single burst, wherein the dual channel includes a first data channel and a second data channel, and wherein the first data channel and the second data channel are driven by the same clock signal.

2. The method of claim 1, wherein the one or more Synchronous Dynamic Random Access Memory (SDRAM) modules comprise one or more double data rate 5 (DDR5) SDRAMs, and wherein the dual channel comprises a DDR5 dual channel.

3. The method of claim 2, wherein the DDR5 dual channel comprises a 72-bit wide DDR5 dual channel.

4. The method of claim 3, wherein the 72-bit wide DDR5 dual channel includes the first data channel being 32 bits wide, the second data channel being 32 bits wide, and an ECC channel being 8 bits wide.

5. The method of claim 2, wherein the first and second data channels of the DDR5 dual channel are driven in lockstep by a same clock signal.

6. The method of claim 2, wherein the sending of the first set of 64 bytes of data, the second set of 64 bytes of data, the first set of eight ECC bytes, and the second set of eight ECC bytes over the DDR5 dual channel to the one or more fifth-generation double data rate (DDR5) Synchronous Dynamic Random Access Memory (SDRAM) modules has a burst length of 16 words.

7. The method of claim 2, wherein the first set of eight ECC bytes is used to error correct the entire first set of 64 bytes of data, and wherein the second set of eight ECC bytes is used to error correct the entire second set of 64 bytes of data.

8. The method of claim 7, wherein the transmission of the first set of eight ECC bytes and the second set of eight ECC bytes is in a predetermined order, and wherein the predetermined order is used to separate, among ECC bytes read from the one or more fifth-generation double data rate (DDR5) Synchronous Dynamic Random Access Memory (SDRAM) modules, the ECC bytes used to correct errors in the first set of 64 bytes of data from the ECC bytes used to correct errors in the second set of 64 bytes of data.

9. The method of claim 7, wherein the transmission of the first set of eight ECC bytes and the second set of eight ECC bytes is in a predetermined order, and wherein the transmission alternates between the two sets of eight ECC bytes.

10. The method of claim 7, wherein the transmission of the first set of eight ECC bytes and the second set of eight ECC bytes is in a predetermined order, and wherein the transmission of the first set of eight ECC bytes is performed before the transmission of the second set of eight ECC bytes.

11. A system, comprising:

a first interface configured to receive a first set of 64 bytes of data and a second set of 64 bytes of data;

a processor configured to calculate a first set of eight Error Correction Code (ECC) bytes for the first set of 64 bytes of data and a second set of eight ECC bytes for the second set of 64 bytes of data; and

a second interface configured to send the first set of 64 bytes of data, the second set of 64 bytes of data, the first set of eight ECC bytes, and the second set of eight ECC bytes to one or more Synchronous Dynamic Random Access Memory (SDRAM) modules over a dual channel in a single burst, wherein the dual channel includes a first data channel and a second data channel, and wherein the first data channel and the second data channel are driven by the same clock signal.

12. The system of claim 11, wherein the one or more Synchronous Dynamic Random Access Memory (SDRAM) modules comprise one or more double data rate 5 (DDR5) SDRAMs, and wherein the dual channel comprises a DDR5 dual channel.

13. The system of claim 12, wherein the DDR5 dual channel comprises a 72-bit wide DDR5 dual channel.

14. The system of claim 13, wherein the 72-bit wide DDR5 dual channel includes the first data channel being 32 bits wide, the second data channel being 32 bits wide, and an ECC channel being 8 bits wide.

15. The system of claim 12, wherein the first and second data channels of the DDR5 dual channel are driven in lockstep by a same clock signal.

16. The system of claim 12, wherein the sending of the first set of 64 bytes of data, the second set of 64 bytes of data, the first set of eight ECC bytes, and the second set of eight ECC bytes over the DDR5 dual channel to the one or more fifth-generation double data rate (DDR5) Synchronous Dynamic Random Access Memory (SDRAM) modules has a burst length of 16 words.

17. The system of claim 12, wherein the first set of eight ECC bytes is used to error correct the entire first set of 64 bytes of data, and wherein the second set of eight ECC bytes is used to error correct the entire second set of 64 bytes of data.

18. The system of claim 17, wherein the transmission of the first set of eight ECC bytes and the second set of eight ECC bytes is in a predetermined order, and wherein the predetermined order is used to separate, among ECC bytes read from the one or more fifth-generation double data rate (DDR5) Synchronous Dynamic Random Access Memory (SDRAM) modules, the ECC bytes used to correct errors in the first set of 64 bytes of data from the ECC bytes used to correct errors in the second set of 64 bytes of data.

19. The system of claim 17, wherein the transmission of the first set of eight ECC bytes and the second set of eight ECC bytes is in a predetermined order, and wherein the transmission alternates between the two sets of eight ECC bytes.

20. The system of claim 17, wherein the transmission of the first set of eight ECC bytes and the second set of eight ECC bytes is in a predetermined order, and wherein the transmission of the first set of eight ECC bytes is performed before the transmission of the second set of eight ECC bytes.

Background

Error-correcting codes (ECCs) are used to control data errors on unreliable or noisy communication channels. The sender may encode the message with redundant information in the form of ECC. The redundancy allows the receiver to detect a limited number of errors that may occur anywhere in the message and, typically, to correct them without retransmission. The term ECC encompasses any type of ECC, including block codes, convolutional codes, and the like. ECC may be used to protect data stored in a memory device.

Brief Description of Drawings

Various embodiments of the invention are disclosed in the following detailed description and the accompanying drawings.

Fig. 1 shows a block diagram in which a memory controller 104 is used to access a set of DRAM modules 108.

Fig. 2 illustrates an embodiment of a DDR4 channel.

Fig. 3 illustrates an embodiment of a conventional DDR5 channel.

FIG. 4 illustrates an embodiment in which a conventional DDR5 channel is used to access DDR5 SDRAM memory.

Fig. 5 illustrates an embodiment of a conventional DDR5 memory controller 404.

FIG. 6 illustrates an embodiment in which a DDR5 SDRAM memory is accessed using a modified DDR5 channel.

Fig. 7 illustrates an embodiment of an improved DDR5 memory controller 605.

Fig. 8 illustrates an embodiment of a process 800 for sending data to one or more double data rate 5 (DDR5) Synchronous Dynamic Random Access Memory (SDRAM) modules through a memory controller.

Fig. 9 shows an improved DDR5 dual channel embodiment.

Fig. 10 shows another embodiment of the improved DDR5 dual channel.

Detailed Description

The invention can be implemented in numerous ways, including as a process, an apparatus, a system, a composition of matter, a computer program product embodied on a computer readable storage medium, and/or a processor, e.g., a processor configured to execute instructions stored on and/or provided by a memory coupled to the processor. In this specification, these implementations, or any other form that the invention may take, may be referred to as techniques. In general, the order of the steps of disclosed processes may be altered within the scope of the invention. Unless otherwise specified, a component such as a processor or a memory described as being configured to perform a task may be implemented as a general component that is temporarily configured to perform the task at a given time or as a specific component that is manufactured to perform the task. As used herein, the term "processor" refers to one or more devices, circuits, and/or processing cores configured to process data (e.g., computer program instructions).

The following provides a detailed description of one or more embodiments of the invention and the accompanying drawings that illustrate the principles of the invention. The invention is described in connection with these embodiments, but the invention is not limited to any embodiment. The scope of the invention is limited only by the claims and the invention encompasses numerous alternatives, modifications and equivalents. In the following description, numerous specific details are set forth in order to provide a thorough understanding of the present invention. These details are provided for the purpose of example and the invention may be practiced according to the claims without some or all of these specific details. For the purpose of clarity, technical material that is known in the technical fields related to the invention has not been described in detail so that the invention is not unnecessarily obscured.

Double data rate synchronous dynamic random access memory (DDR SDRAM) is a class of memory integrated circuit used in computers. With the ever-increasing density of dynamic random access memory (DRAM) and increasing interface speeds, the memory industry is transitioning from the fourth-generation (DDR4) to the fifth-generation (DDR5) industry standard.

FIG. 1 shows a block diagram in which a memory controller 104 is used to access a set of DRAM modules 108. The memory controller 104 sends and receives data flowing between the processor and the processor's DRAM memory 108. The processor may be a Central Processing Unit (CPU) or an accelerator. The DRAM memory 108 may be any DDR SDRAM memory, such as DDR4 or DDR5 SDRAM memory.

In some embodiments, the memory controller 104 is integrated into another chip 102. For example, the memory controller 104 may be placed on the same die as, or integrated as an integral part of, a processor (e.g., a CPU or accelerator). In another example, the memory controller 104 may be placed on the same die as, or integrated as part of, an Application Specific Integrated Circuit (ASIC), and the ASIC may be connected to the processor through an interconnect (e.g., Compute Express Link (CXL) or PCI Express (PCIe)).

The memory controller 104 is a digital circuit that manages the flow of data to and from the DRAM memory 108. The memory controller 104 contains the logic necessary for many different functions, including the logic to read and write to the DRAM and also the logic to refresh the DRAM. The memory controller 104 may also include logic for error detection and correction. The memory controller 104 includes a memory physical interface (PHY) block 106. The PHY block 106 includes logic for connecting to an external DRAM memory 108. In some embodiments, a DDR PHY interface (DFI) is used as an interface protocol that defines the signals, timing, and programmable parameters needed to transfer control information and data to and from the DRAM device and between the memory controller 104 and the PHY block 106.

DDR4 and DDR5 have different dual in-line memory module (DIMM) channel architectures. DIMMs include a series of DRAM integrated circuits. These modules are mounted on printed circuit boards and are designed for use with personal computers, workstations, and servers. Fig. 2 illustrates an embodiment of a DDR4 channel. Fig. 3 illustrates an embodiment of a conventional DDR5 channel.

Referring to FIG. 2 and FIG. 3, one of the changes in the transition from DDR4 to DDR5 is that the prefetch depth in the DRAM (and thus the burst length) has increased from 8 to 16. Referring to FIG. 2, the prefetch depth of DDR4 is 8n, and the basic burst size is eight words. The prefetch depth is the number of data words prefetched each time a column command is executed on the DDR memory. The burst length is the amount of data transferred between a processor and its memory per transfer. Because the core of a DRAM is much slower than its interface, the difference is made up by accessing the information in parallel and then serializing it out of the interface. For example, DDR4 prefetches eight words: each read or write operates on eight words of data, which are burst out of or into the SDRAM over four clock cycles on both clock edges, for a total of eight consecutive transfers. Referring to FIG. 3, the prefetch depth of DDR5 is 16n and the basic burst size is 16 words. A burst length of 16 allows a single burst to access 64 bytes of data, which is a typical CPU cache line size. DDR5 accomplishes this by using only one of the two independent channels.
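The burst arithmetic above can be checked with a few lines of Python. This is only an illustrative sketch: the function names are hypothetical, and the 64-bit (DDR4) and 32-bit (one DDR5 channel) data widths are the ones this description gives for the two bus types.

```python
def burst_cycles(burst_length: int) -> int:
    """Clock cycles per burst: DDR moves one word on each of the two clock edges."""
    return burst_length // 2


def burst_data_bytes(data_width_bits: int, burst_length: int) -> int:
    """Data bytes delivered by one burst of `burst_length` words."""
    return data_width_bits * burst_length // 8


# DDR4: 64 data bits per word, burst of 8 words.
print(burst_cycles(8), burst_data_bytes(64, 8))    # 4 64
# DDR5: 32 data bits per word on one channel, burst of 16 words.
print(burst_cycles(16), burst_data_bytes(32, 16))  # 8 64
```

Both configurations deliver one 64-byte cache line per burst; DDR5 reaches it with a narrower channel and a burst twice as long.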

Typically, the cache line size of different devices (e.g., Central Processing Units (CPUs) or other processors) is 64 bytes. To allow 64-byte accesses to match a 64-byte cache line size, the bus width in DDR4 (see FIG. 2) is 72 bits, with 64 data bits and eight Error Correction Code (ECC) bits. In other words, a DDR4 DIMM has a 72-bit bus that includes 64 data bits and eight ECC bits.

To continue to allow 64-byte accesses to match the 64-byte cache line size, the bus width in DDR5 (see FIG. 3) is changed to 40 bits, with 32 data bits and eight ECC bits. In other words, there are two 40-bit DDR5 channels per DIMM, each with 32 data bits and eight ECC bits. The total data width of the two DDR5 channels is 2 × 32 = 64 data bits, the same as DDR4. However, having two smaller independent channels may improve memory access efficiency. In addition, each 40-bit DDR5 channel has its own independent clock, address, and control signals.

A side effect of this change is that the amount of ECC data increases from eight ECC bytes to 16 ECC bytes for every 64 bytes of data. In other words, the number of ECC bytes is doubled. The ratio of data to ECC (i.e., the ratio of the amount of actual data to the amount of ECC data) is reduced from 8:1 to 4:1. The increased ECC overhead adds cost and power consumption to the server system.
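The doubling follows directly from the bus widths and burst lengths given above. The sketch below works it out; `ecc_profile` is a hypothetical helper name, not part of any DDR specification.

```python
from fractions import Fraction


def ecc_profile(data_bits: int, ecc_bits: int, burst_length: int):
    """Return (data bytes, ECC bytes, data-to-ECC ratio) for one burst."""
    data_bytes = data_bits * burst_length // 8
    ecc_bytes = ecc_bits * burst_length // 8
    return data_bytes, ecc_bytes, Fraction(data_bytes, ecc_bytes)


for name, args in {
    "DDR4 (72-bit bus, burst of 8)": (64, 8, 8),
    "conventional DDR5 (40-bit channel, burst of 16)": (32, 8, 16),
}.items():
    data_bytes, ecc_bytes, ratio = ecc_profile(*args)
    print(f"{name}: {data_bytes} data bytes, {ecc_bytes} ECC bytes, "
          f"ratio {ratio.numerator}:{ratio.denominator}")
```

The ECC width stays at eight bits while the burst doubles in length, so the same 64 bytes of data now travel with twice as many ECC bytes.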

FIG. 4 illustrates an embodiment in which a conventional DDR5 channel is used to access DDR5 SDRAM memory. Fig. 5 illustrates an embodiment of a conventional DDR5 memory controller 404.

As shown in FIG. 4, two separate memory controllers (404 and 406) integrated into chip 402 are used to independently access two sets of DRAMs (408 and 410). The DDR5 channel between memory controller 404 and its DRAM bank and the DDR5 channel between memory controller 406 and its DRAM bank are independent of each other. Each of these channels has its own independent clock, and the address and control signals of the channels are aligned with the clocks of the channels.

As shown in fig. 5, each memory controller (e.g., memory controller 404 or memory controller 406) includes modules such as a data engine 502, an ECC engine 504, and a PHY block 506. For every 64 bytes of data transferred from the processor to the data engine 502, the ECC engine 504 calculates 16 ECC bytes corresponding to the 64 bytes of data. 64 bytes of data and 16 ECC bytes (80 bytes total) are sent to the DRAM in bursts of 16 words by a single independent clock.

FIG. 6 illustrates an embodiment in which a DDR5 SDRAM memory is accessed using a modified DDR5 channel. Fig. 7 illustrates an embodiment of an improved DDR5 memory controller 605.

As shown in FIG. 6, a single modified memory controller 605 integrated into chip 602 is used to access both sets of DRAMs (608 and 610). The two DDR5 channels between memory controller 605 and their respective DRAM sets are no longer independent of each other. The two DDR5 channels have timing dependencies because each channel is driven in lockstep by the same clock signal, and the address and control signals of the channels are aligned to the same clock domain. Thus, the two sets of DRAMs (608 and 610) are no longer independent of each other. Reads and writes of the DRAMs are performed on the same clock, and both sets of DRAMs need to be accessed in lockstep. Thus, while the two sets of DRAMs may be physically separate, they logically belong to the same DRAM set.

As shown in FIG. 7, the improved memory controller 605 includes modules such as a data engine 702, an ECC engine 704, and a PHY block 706. Error Correction Codes (ECCs) are used to control data errors on unreliable or noisy communication channels. The sender may encode the message with redundant information in the form of ECC. The redundancy allows the receiver to detect a limited number of errors that may occur anywhere in the message and, typically, to correct them without retransmission. The term ECC encompasses any type of ECC, including block codes, convolutional codes, and the like. ECC engine 704 provides error correction for the data by generating ECC bytes for the data and sending the ECC bytes to PHY block 706. For every 128 bytes of data transferred from the processor to the data engine 702, the ECC engine 704 computes 16 ECC bytes corresponding to the 128 bytes of data. The 128 bytes of data and 16 ECC bytes (144 bytes total) are sent to the DRAM in bursts of 16 words, driven by a single clock signal.

Fig. 8 illustrates an embodiment of a process 800 for sending data to one or more double data rate 5 (DDR5) Synchronous Dynamic Random Access Memory (SDRAM) modules through a memory controller. At step 802, a first set of 64 bytes of data and a second set of 64 bytes of data are received. At step 804, a first set of eight Error Correction Code (ECC) bytes is calculated for the first set of 64 bytes of data, and a second set of eight ECC bytes is calculated for the second set of 64 bytes of data. At step 806, the first set of 64 bytes of data, the second set of 64 bytes of data, the first set of eight ECC bytes, and the second set of eight ECC bytes are sent to one or more DDR5 SDRAM modules over a DDR5 dual channel in a single burst. The DDR5 dual channel includes a first data channel and a second data channel, and the first data channel and the second data channel are driven by a single clock signal.
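The three steps of process 800 can be sketched as follows. The description does not specify which error-correcting code the ECC engine uses, so `placeholder_ecc` is a loudly labeled stand-in (one XOR byte per 8-byte word) that only reproduces the sizes involved; `Burst` and `process_800` are likewise hypothetical names chosen for illustration.

```python
from functools import reduce
from operator import xor
from typing import Callable, NamedTuple

SET_BYTES = 64   # one data set, the size of a typical cache line
WORD_BYTES = 8   # bytes covered by one check byte in this sketch (assumption)


class Burst(NamedTuple):
    """Everything that travels over the dual channel in one burst."""
    data_first: bytes
    data_second: bytes
    ecc_first: bytes
    ecc_second: bytes


def placeholder_ecc(data: bytes) -> bytes:
    """Stand-in for step 804: one XOR byte per 8-byte word.

    This is NOT a real error-correcting code; it only yields eight
    check bytes per 64-byte set so the burst has the right shape.
    """
    if len(data) != SET_BYTES:
        raise ValueError("each data set must be exactly 64 bytes")
    return bytes(
        reduce(xor, data[i:i + WORD_BYTES])
        for i in range(0, SET_BYTES, WORD_BYTES)
    )


def process_800(first: bytes, second: bytes,
                ecc: Callable[[bytes], bytes] = placeholder_ecc) -> Burst:
    # Step 802: the two 64-byte sets arrive as arguments.
    # Step 804: eight ECC bytes are computed for each set independently.
    ecc_first, ecc_second = ecc(first), ecc(second)
    # Step 806: data and ECC are handed over together as a single burst.
    return Burst(first, second, ecc_first, ecc_second)


burst = process_800(bytes(range(64)), bytes(64))
print(sum(len(part) for part in burst))  # 144
```

Passing the ECC function as a parameter keeps the flow independent of the code actually chosen; any function that maps 64 bytes to eight check bytes fits.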

Fig. 9 shows an improved DDR5 dual channel embodiment. The prefetch depth is 16n and the basic burst length is 16 words. However, instead of having two independent 40-bit DDR5 channels (one conventional 40-bit DDR5 channel is shown in FIG. 3), the improved DDR5 dual channel is a 72-bit dual channel 902.

To continue to allow 64-byte accesses to match a 64-byte cache line size, the memory controller 605 is configured to use a bus width of 72 bits, made up of two 32-bit data channels and one 8-bit ECC channel (see FIG. 9). Each 32-bit data channel (904 and 906) transfers 64 bytes of data in one burst, protected by its own eight ECC bytes. Together, the two 32-bit data channels carry a total of 128 bytes of data in one burst, protected by a total of 2 × 8 = 16 ECC bytes. The two 32-bit data channels (channel 904 and channel 906) are driven in lockstep by the same clock signal, transferring a 128-byte block of data at a time. To allow independent commands, the address/control/command signals for channel 904 and channel 906 may be separated.
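One way to picture the 72-bit dual channel is as 16 beats of nine bytes each. The sketch below builds such a burst. The byte-to-beat mapping and the placement of the ECC byte between the two data channels (mirroring how the figures draw it) are assumptions made for illustration, and `build_beats` is a hypothetical name.

```python
from typing import List

BURST_LENGTH = 16   # beats (words) per burst
LANE_BYTES = 4      # each 32-bit data channel carries 4 bytes per beat


def build_beats(data_a: bytes, data_b: bytes, ecc_lane: bytes) -> List[bytes]:
    """Lay out 64 + 64 data bytes and 16 ECC bytes as 16 beats of 72 bits."""
    if len(data_a) != 64 or len(data_b) != 64 or len(ecc_lane) != 16:
        raise ValueError("expected 64 + 64 data bytes and 16 ECC bytes")
    beats = []
    for n in range(BURST_LENGTH):
        lo, hi = n * LANE_BYTES, (n + 1) * LANE_BYTES
        # 4 bytes from the first channel, 1 ECC byte, 4 bytes from the second.
        beats.append(data_a[lo:hi] + ecc_lane[n:n + 1] + data_b[lo:hi])
    return beats


beats = build_beats(bytes(range(64)), bytes(range(64, 128)), bytes(16))
print(len(beats), len(beats[0]) * 8, sum(len(b) for b in beats))  # 16 72 144
```

Each beat is 72 bits wide, and the 16 beats together carry the 144 bytes of the burst.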

Since eight ECC bytes corresponding to each 32-bit data channel are used to correct the entire 64-byte data in one burst, the eight ECC bytes can be used to correct the data without causing any delay as long as the eight ECC bytes are sent within the burst. Also, since the two data channels (904 and 906) are driven in a lockstep manner by the same clock signal, error correction for both data channels can be performed without any delay as long as 16 ECC bytes corresponding to 128 bytes of data are transmitted within a burst. Thus, the 16 ECC bytes corresponding to 128 bytes of data may be sent in any order. For example, memory controller 605 may receive 16 ECC bytes corresponding to a combination of channel 904 and channel 906, and organize the ECC bytes and send them in a predetermined order.

In some embodiments, the 16 ECC bytes are interleaved. For example, as shown in FIG. 9, eight ECC bytes (indicated by hatched blocks) corresponding to the left 32-bit data channel 904 and eight ECC bytes (indicated by solid blocks) corresponding to the right 32-bit data channel 906 are interleaved together. The sending of the ECC bytes alternates between the two data channels. Specifically, one ECC byte (hatched) corresponding to the left 32-bit data channel 904 is sent first, then one ECC byte (solid) corresponding to the right 32-bit data channel 906, then a second ECC byte (hatched) corresponding to the left channel 904, then a second ECC byte (solid) corresponding to the right channel 906, and so on.

Fig. 10 shows another embodiment of the improved DDR5 dual channel. Eight ECC bytes (indicated by hatched blocks) corresponding to the left 32-bit data channel 1004 are sent first, followed by eight ECC bytes (indicated by solid blocks) corresponding to the right 32-bit data channel 1006.

In some embodiments, the four ECC bytes corresponding to the 32-bit data channel on the left are sent first, followed by the four ECC bytes corresponding to the 32-bit data channel on the right. Next, the remaining four ECC bytes corresponding to the 32-bit data channel on the left are sent, and then the remaining four ECC bytes corresponding to the 32-bit data channel on the right are sent.
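The three orderings described above differ only in how many ECC bytes are taken from one channel before switching to the other. A minimal sketch, using a hypothetical `order_ecc` helper and a `group` parameter that is not a term from this description:

```python
def order_ecc(ecc_left: bytes, ecc_right: bytes, group: int) -> bytes:
    """Alternate groups of `group` ECC bytes from the left and right channels."""
    if len(ecc_left) != 8 or len(ecc_right) != 8 or 8 % group != 0:
        raise ValueError("expected two sets of eight ECC bytes and a group size dividing 8")
    out = bytearray()
    for i in range(0, 8, group):
        out += ecc_left[i:i + group]
        out += ecc_right[i:i + group]
    return bytes(out)


left, right = b"abcdefgh", b"ABCDEFGH"   # lowercase = left channel, uppercase = right
print(order_ecc(left, right, 1))  # b'aAbBcCdDeEfFgGhH'  (alternating, as in FIG. 9)
print(order_ecc(left, right, 8))  # b'abcdefghABCDEFGH'  (left set first, as in FIG. 10)
print(order_ecc(left, right, 4))  # b'abcdABCDefghEFGH'  (four and four)
```

Whatever group size is chosen, the 16 ECC bytes still fit within the same burst, which is why the order can be fixed by convention.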

As shown in FIG. 9 and FIG. 10, the ECC bytes are drawn in the middle of their respective 72-bit dual channels (902 and 1002). However, it should be appreciated that memory controller 605 may map the data bytes and ECC bytes in a different manner across the address space. The ECC bytes may be stored across multiple DRAMs. The memory controller 605 tracks how the data bytes and ECC bytes are mapped onto the 72-bit dual channel. Thus, when the memory controller 605 reads from the DRAM, it may separate the data bytes and the ECC bytes according to a predetermined mapping. In addition, the memory controller 605 tracks the order in which ECC bytes corresponding to the two 32-bit data channels are sent or received. Thus, when the memory controller 605 reads from the DRAM, it may separate ECC bytes corresponding to the first data channel from ECC bytes corresponding to the second data channel according to a predetermined order. The memory controller 605 may then detect or correct errors in the data bytes belonging to each 32-bit data channel using the ECC bytes corresponding to that data channel.
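The read path can be illustrated with a concrete code. The description leaves the choice of ECC open, so the sketch below assumes, purely for illustration, a textbook Hamming single-error-correcting, double-error-detecting (SEC-DED) code applied per 8-byte word, which happens to yield eight check bytes per 64 bytes. It also assumes the alternating ECC order of FIG. 9. All function names are hypothetical.

```python
# Codeword positions 1..71: powers of two hold Hamming check bits, the rest hold data bits.
DATA_POS = [p for p in range(1, 72) if p & (p - 1)]
POS_TO_BIT = {pos: bit for bit, pos in enumerate(DATA_POS)}


def ones(value: int) -> int:
    return bin(value).count("1")


def check_byte(word: int) -> int:
    """Seven Hamming check bits plus one overall parity bit for a 64-bit word."""
    hamming = 0
    for bit, pos in enumerate(DATA_POS):
        if (word >> bit) & 1:
            hamming ^= pos
    return hamming | (((ones(word) + ones(hamming)) & 1) << 7)


def ecc_for(data: bytes) -> bytes:
    """Eight check bytes for a 64-byte set, one per 8-byte word."""
    return bytes(check_byte(int.from_bytes(data[i:i + 8], "little"))
                 for i in range(0, 64, 8))


def correct_word(word: int, check: int):
    """Return (word, status); status is 'ok', 'corrected', or 'uncorrectable'."""
    syndrome = (check_byte(word) ^ check) & 0x7F
    odd = (ones(word) + ones(check)) & 1       # parity of the received codeword
    if syndrome == 0 and not odd:
        return word, "ok"
    if not odd:
        return word, "uncorrectable"           # an even number of bits flipped
    if syndrome in POS_TO_BIT:
        return word ^ (1 << POS_TO_BIT[syndrome]), "corrected"
    if syndrome & (syndrome - 1) == 0:
        return word, "corrected"               # the flipped bit was a check bit
    return word, "uncorrectable"


def correct_set(data: bytes, ecc: bytes) -> bytes:
    out = bytearray()
    for i in range(8):
        word = int.from_bytes(data[8 * i:8 * i + 8], "little")
        fixed, status = correct_word(word, ecc[i])
        if status == "uncorrectable":
            raise ValueError(f"uncorrectable error in word {i}")
        out += fixed.to_bytes(8, "little")
    return bytes(out)


def read_burst(data_a: bytes, data_b: bytes, ecc_lane: bytes):
    # Separate the 16 ECC bytes by the predetermined (here: alternating) order.
    ecc_a, ecc_b = ecc_lane[0::2], ecc_lane[1::2]
    # Each data channel is checked only against its own eight ECC bytes.
    return correct_set(data_a, ecc_a), correct_set(data_b, ecc_b)


a, b = bytes(range(64)), bytes(range(64, 128))
lane = bytes(x for pair in zip(ecc_for(a), ecc_for(b)) for x in pair)
damaged = bytearray(a)
damaged[10] ^= 0x20                                   # flip one bit in the first set
print(read_burst(bytes(damaged), b, lane) == (a, b))  # True
```

If the controller used a different ordering, only the two slices in `read_burst` would change; the per-channel correction is unaffected.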

The benefit of the improved DDR5 dual channel is that the amount of ECC data is reduced from 16 ECC bytes (see FIG. 3) to eight ECC bytes (see FIG. 9 and FIG. 10) for every 64 bytes of data. In other words, compared to the conventional DDR5 technique, the number of ECC bytes per 64 bytes of data is halved. The data-to-ECC ratio (i.e., the ratio of the amount of actual data to the amount of ECC data) increases from 4:1 to 8:1, the same as the DDR4 ratio. The reduced ECC overhead reduces the cost and power consumption of the server system, which is beneficial when no additional ECC protection is needed. Put another way, the amount of ECC data is 16 bytes for every 128 bytes of data: the improved DDR5 dual channel transfers twice the amount of data per burst while keeping the same 16 ECC bytes.

Although the foregoing embodiments have been described in some detail for purposes of clarity of understanding, the invention is not limited to the details provided. There are many alternative ways of implementing the invention. The disclosed embodiments are illustrative and not restrictive.
