Data processing method, remote direct memory access network card and equipment

Document No.: 1686967 Publication date: 2020-01-03

Reading note: this technology, "Data processing method, remote direct memory access network card and equipment", was designed by 孙贝磊, 周超 and 李涛 on 2018-06-26. Main content: The application provides a data processing method, an RNIC and a device. The method comprises the following steps: a first RNIC receives an RDMA write request sent by a second RNIC, where the RDMA write request comprises first data and a data persistence flag, and the data persistence flag is used for indicating that the first data is data to be persisted; according to the RDMA write request, the first RNIC sends a DMA write request to a first processor to instruct the first processor to write the first data into a first device, where the first RNIC and the first processor belong to the first device, the second RNIC belongs to a second device, and the two devices communicate based on RDMA; according to the data persistence flag, the first RNIC determines that the first data is data to be persisted, and the first RNIC instructs the first processor to save the first data to a non-volatile memory of the first device. This reduces the load on the RDMA network during remote memory persistence.

1. A data processing method, comprising:

a first remote direct memory access network card (RNIC) receives a remote direct memory access (RDMA) write request sent by a second RNIC, wherein the RDMA write request comprises first data and a data persistence flag, the RDMA write request is used for requesting to write the first data into a first device, the data persistence flag is used for indicating that the first data is data to be persisted, the first RNIC is the RNIC of the first device, the second RNIC is the RNIC of a second device, and the first device and the second device communicate based on RDMA;

the first RNIC sends a direct memory access (DMA) write request to a first processor, wherein the DMA write request comprises the first data, the DMA write request is used for instructing the first processor to write the first data into the first device, the first processor is a processor of the first device, and the first RNIC and the first processor communicate based on DMA;

the first RNIC instructs the first processor to store the first data in a non-volatile memory of the first device according to the data persistence flag.

2. The method of claim 1, wherein the data persistence flag is a write persistence instruction.

3. The method according to claim 1, wherein the data persistence flag is a destination storage address corresponding to the first data, a storage space corresponding to the destination storage address is used for storing the first data, the destination storage address is a persistent storage address in the first device, and the storage space corresponding to the persistent storage address is used for storing data to be persisted.

4. The method of any of claims 1-3, wherein instructing the first processor to store the first data to the non-volatile memory of the first device according to the data persistence flag comprises:

the first RNIC adds a DMA least significant bit read request in a receiving queue corresponding to the second RNIC according to the data persistence mark;

the first RNIC sends the DMA least significant bit read request to the first processor, the DMA least significant bit read request to instruct the first processor to store all data cached in a peripheral bus link of the first device into a non-volatile memory of the first device.

5. The method of any of claims 1-3, wherein instructing the first processor to store the first data to the non-volatile memory of the first device according to the data persistence flag comprises:

after generating, according to the data persistence flag and the first data, a Work Queue Entry (WQE) corresponding to an RDMA receive request, the first RNIC clears the WQE, wherein the RDMA receive request is used for receiving an RDMA send request initiated by the second device;

the first RNIC generates a Completion Queue Entry (CQE) corresponding to the RDMA receive request, the CQE corresponding to the RDMA receive request being used to instruct the first processor to store first data cached in a volatile storage medium of the first processor into a non-volatile memory of the first device.

6. A data processing method, comprising:

a second remote direct memory access network card (RNIC) receives a remote direct memory access (RDMA) write persistence request sent by a second processor, wherein the RDMA write persistence request comprises a data persistence flag, the RDMA write persistence request is used for requesting to store first data into a non-volatile memory of a first device, the data persistence flag is used for indicating that the first data is data to be persisted, the second RNIC is the RNIC of a second device, the second processor is a processor of the second device, the first device and the second device communicate based on RDMA, and the second RNIC and the second processor communicate based on DMA;

the second RNIC generates an RDMA write request from the RDMA write persistence request, the RDMA write request including the first data and the data persistence flag;

the second RNIC sends the RDMA write request to a first RNIC, wherein the RDMA write request is used for requesting to write the first data into the first device, and the first RNIC is the RNIC of the first device.

7. The method of claim 6, wherein the data persistence flag is a write persistence instruction.

8. The method according to claim 6, wherein the data persistence flag is a destination storage address corresponding to the first data, a storage space corresponding to the destination storage address is used for storing the first data, the destination storage address is a persistent storage address in the first device, and the storage space corresponding to the persistent storage address is used for storing data to be persisted.

9. The method of any of claims 6-8, wherein after the second RNIC sends the RDMA write request to the first RNIC, further comprising:

when receiving a reception acknowledgement message corresponding to the first data sent by the first RNIC, the second RNIC buffers the reception acknowledgement message corresponding to the first data, where the reception acknowledgement message corresponding to the first data includes a first Packet Sequence Number (PSN), the first PSN is a sequence number of the first data, and the reception acknowledgement message corresponding to the first data is used to indicate that the first RNIC receives the first data;

the second RNIC receives a first confirmation message sent by the first RNIC, wherein the first confirmation message comprises a second PSN;

when the first PSN is the same as the second PSN, the second RNIC generates a completion queue entry (CQE) corresponding to the RDMA write persistence request, the CQE corresponding to the RDMA write persistence request being used to notify the second processor that the first data has been stored to a non-volatile memory of the first device.

10. The method of any of claims 6-8, wherein after the second RNIC sends the RDMA write request to the first RNIC, further comprising:

when receiving a reception acknowledgement message corresponding to the first data sent by the first RNIC, the second RNIC caches the reception acknowledgement message corresponding to the first data, where the reception acknowledgement message corresponding to the first data includes a first PSN, the first PSN is a sequence number of the first data, and the reception acknowledgement message corresponding to the first data is used to indicate that the first RNIC receives the first data;

generating, by the second RNIC, a CQE corresponding to the RDMA write persistence request, the CQE corresponding to the RDMA write persistence request to notify the second processor that the first data has been written to the first device;

the second RNIC receives the RDMA receiving request sent by the second processor;

the second RNIC receives a first confirmation message sent by the first RNIC, wherein the first confirmation message comprises a second PSN;

when the second PSN is the same as the first PSN, the second RNIC generates a CQE corresponding to the RDMA receive request to notify the second processor that the first data has been stored to a non-volatile memory of the first device.

11. A remote direct memory access network card RNIC is characterized by comprising:

a receiving module, configured to receive a remote direct memory access (RDMA) write request sent by a second RNIC, wherein the RDMA write request comprises first data and a data persistence flag, the RDMA write request is used for requesting to write the first data into a first device, the data persistence flag is used for indicating that the first data is data to be persisted, the RNIC is the RNIC of the first device, the second RNIC is the RNIC of a second device, and the first device and the second device communicate based on RDMA;

a scheduling module, configured to send a Direct Memory Access (DMA) write request to a first processor, where the DMA write request includes the first data, and the DMA write request is used to instruct the first processor to write the first data into the first device, where the first processor is a processor of the first device, and the RNIC and the first processor communicate based on a DMA scheme;

a persistent memory module, configured to instruct the first processor to store the first data into a non-volatile memory of the first device according to the data persistence flag.

12. The RNIC of claim 11, wherein the data persistence flag is a write persistence instruction.

13. The RNIC of claim 11, wherein the data persistence flag is a destination storage address corresponding to the first data, a storage space corresponding to the destination storage address is used for storing the first data, the destination storage address is a persistent storage address in the first device, and the storage space corresponding to the persistent storage address is used for storing data to be persisted.

14. The RNIC of any one of claims 11-13, wherein the persistent memory module is specifically configured to:

adding a DMA least significant bit read request in a receiving queue corresponding to the second RNIC according to the data persistence mark;

sending the DMA least significant bit read request to the first processor, the DMA least significant bit read request being used to instruct the first processor to write all data cached in the peripheral bus link of the first device into a non-volatile memory of the first device.

15. The RNIC of any one of claims 11-13, wherein the persistent memory module is specifically configured to:

after generating, according to the data persistence flag and the first data, a Work Queue Entry (WQE) corresponding to an RDMA receive request, clearing the WQE, wherein the RDMA receive request is used for receiving an RDMA send request initiated by the second device;

generating a Completion Queue Entry (CQE) corresponding to the RDMA receive request, the CQE corresponding to the RDMA receive request being used to instruct the first processor to store first data cached in a volatile storage medium of the first processor into a non-volatile memory of the first device.

16. A remote direct memory access network card RNIC is characterized by comprising:

a scheduling module, configured to receive a remote direct memory access (RDMA) write persistence request sent by a second processor, wherein the RDMA write persistence request comprises a data persistence flag, the RDMA write persistence request is used for requesting to store first data into a non-volatile memory of a first device, the data persistence flag is used for indicating that the first data is data to be persisted, the RNIC is the RNIC of a second device, the second processor is a processor of the second device, the first device and the second device communicate based on RDMA, and the RNIC and the second processor communicate based on DMA;

the scheduling module is further to generate an RDMA write request from the RDMA write persistence request, the RDMA write request including the first data and the data persistence flag;

a sending module, configured to send the RDMA write request to a first RNIC, where the RDMA write request is used to request to write the first data in the first device, and the first RNIC is an RNIC of the first device.

17. The RNIC of claim 16, wherein the data persistence flag is a write persistence instruction.

18. The RNIC of claim 16, wherein the data persistence flag is a destination storage address corresponding to the first data, wherein a storage space corresponding to the destination storage address is used for storing the first data, wherein the destination storage address is a persistent storage address in the first device, and wherein a storage space corresponding to the persistent storage address is used for storing data to be persisted.

19. The RNIC of any one of claims 16-18, wherein the RNIC further comprises a receiving module, and wherein the scheduling module is further configured to:

caching a reception confirmation message corresponding to the first data when the reception module receives the reception confirmation message corresponding to the first data sent by the first RNIC, where the reception confirmation message corresponding to the first data includes a first Packet Sequence Number (PSN), and the first PSN is a sequence number of the first data, and the reception confirmation message corresponding to the first data is used to indicate that the first RNIC receives the first data;

the receiving module is further configured to receive a first acknowledgement message sent by the first RNIC, where the first acknowledgement message includes a second PSN;

the scheduling module is further configured to: when the first PSN is the same as the second PSN, generate a completion queue entry (CQE) corresponding to the RDMA write persistence request, the CQE being used to notify the second processor that the first data has been stored to a non-volatile memory of the first device.

20. The RNIC of any of claims 16-18, the RNIC further comprising a receiving module, the scheduling module further to:

when the receiving module receives a reception acknowledgement message corresponding to the first data sent by the first RNIC, caching the reception acknowledgement message corresponding to the first data, where the reception acknowledgement message corresponding to the first data includes a first PSN, the first PSN is a sequence number of the first data, and the reception acknowledgement message corresponding to the first data is used to indicate that the first RNIC receives the first data;

generating a CQE corresponding to the RDMA write persistence request, the CQE corresponding to the RDMA write persistence request being used to inform the second processor that the first data has been written to the first device;

receiving an RDMA receiving request sent by the second processor;

the receiving module is further configured to: receiving a first acknowledgement message sent by the first RNIC, wherein the first acknowledgement message comprises a second PSN;

the scheduling module is further configured to: generating a CQE corresponding to the RDMA receive request to notify the second processor that the first data has been stored to a non-volatile memory of the first device if the second PSN is the same as the first PSN.

21. A first device, characterized in that it comprises a processor, a non-volatile memory and a remote direct memory access network card (RNIC), the RNIC being used for executing the data processing method according to any one of claims 1 to 5.

22. A second device, characterized in that it comprises a processor, a non-volatile memory and a remote direct memory access network card (RNIC), the RNIC being used for executing the data processing method according to any one of claims 6 to 10.

Technical Field

The application relates to the technical field of computers, in particular to a data processing method, a remote direct memory access network card and equipment.

Background

Storage class memory (SCM), such as 3D XPoint, is a new type of non-volatile memory (NVM). It is a composite storage technology that combines characteristics of a conventional storage medium (e.g., a mechanical hard disk or a solid state disk) and of memory (e.g., dynamic random access memory). Like dynamic random access memory, an SCM can be installed in a slot on the mainboard; unlike dynamic random access memory, an SCM retains its data when powered off. An SCM provides faster read and write speeds than flash memory and costs less than dynamic random access memory. In some computing device system architectures, an SCM is used as memory. In some schemes, multiple computing devices that use SCM as memory are interconnected to form an SCM resource pool, which expands SCM capacity and enables redundant backup of data.

In an SCM resource pool formed by a plurality of computing devices that use SCM as memory, any two computing devices may communicate and transmit data based on remote direct memory access (RDMA) technology; remote direct memory access may also be referred to as remote direct access. Unlike traditional network transfer techniques, RDMA can transfer data directly from the memory of one computing device to the memory of another computing device without the intervention of the operating systems or kernels of the two computing devices. An SCM resource pool that communicates and transmits data based on RDMA must solve the problem of remote memory persistence. Memory persistence refers to writing data back from a volatile storage medium of a computing device to a non-volatile storage medium of that computing device. Remote memory persistence means that data in the SCM of one computing device is written to another computing device and then stored in the SCM of that other computing device. In a storage system that uses SCM as memory, when data in the SCM of computing device A (the computing device that initiates the remote operation request) is to be stored in the SCM of computing device B (the computing device that receives the remote operation request), the data must pass through the processor of computing device B. Because that processor contains a cache, the data may be held temporarily in the cache and may be lost if power fails. In current solutions, after writing data to computing device B with an RDMA write request, computing device A typically initiates an RDMA read request or an RDMA send request to cause the data to be written back to the SCM of computing device B; this read or send request may be regarded as a remote memory persistence request. This solution has the following problem: because computing device A initiates a remote memory persistence request after each RDMA write request, the network load of the RDMA network increases.

Disclosure of Invention

The application provides a data processing method, a remote direct memory access network card and equipment, and solves the problem of large network load caused by an additionally sent remote memory persistence request.

In a first aspect, a data processing method is provided, including:

a first remote direct memory access network card (RNIC) receives a Remote Direct Memory Access (RDMA) write request sent by a second RNIC, where the RDMA write request includes first data and a data persistence flag, the RDMA write request is used to request that the first data be written into a first device, and the data persistence flag is used to indicate that the first data are data to be persisted, where the first RNIC is an RNIC of the first device, that is, an RNIC of a device receiving the RDMA request, the second RNIC is an RNIC of a second device, that is, an RNIC of a device sending the RDMA request, and the first device and the second device communicate based on an RDMA manner; according to the RDMA write request, the first RNIC determines that first data needs to be written into the first device, the first RNIC sends a Direct Memory Access (DMA) write request to the first processor, the DMA write request comprises the first data to instruct the first processor to write the first data into the first device, wherein the first processor is a processor of the first device, and the first RNIC and the first processor communicate in a DMA-based manner; according to the data persistence flag, the first RNIC determines that the first data is data to be persisted, and the first RNIC instructs the first processor to save the first data to a non-volatile memory of the first device.

The non-volatile memory may be an SCM. The SCM may be, for example, a phase-change random access memory (PRAM) such as 3D XPoint, a resistive random access memory (ReRAM), a magnetic random access memory (MRAM), or the like. The device receiving the RDMA request may be referred to as a remote device, and the device sending the RDMA request may be referred to as a local device.

In the above scheme, after determining from the data persistence flag in the RDMA write request that the first data is data to be persisted, the RNIC of the remote device directly instructs the processor of the remote device to store the first data in the non-volatile memory of the remote device, completing memory persistence of the first data without the RNIC of the local device initiating a separate remote memory persistence request. Because the local device does not need to send a remote memory persistence request after sending the RDMA write request, the load on the RDMA network is reduced. In addition, because the data persistence flag is carried in the RDMA write request, the remote write operation and the remote memory persistence operation are executed consecutively: the data is stored in the non-volatile memory right after it is written into the remote device, which avoids data inconsistency.
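The load reduction described above can be illustrated with a minimal sketch that only counts the requests the local device places on the RDMA network. The class `Network` and the message labels are hypothetical names introduced for illustration; the baseline follows the background scheme of one RDMA read after each RDMA write.

```python
from dataclasses import dataclass, field


@dataclass
class Network:
    """Records the requests the local device puts on the RDMA network."""
    requests: list = field(default_factory=list)

    def send(self, kind: str) -> None:
        self.requests.append(kind)


def baseline_persist(net: Network, writes: int) -> None:
    # Background scheme: every RDMA write is followed by a separate
    # RDMA read that acts as the remote memory persistence request.
    for _ in range(writes):
        net.send("RDMA_WRITE")
        net.send("RDMA_READ")


def flagged_persist(net: Network, writes: int) -> None:
    # Scheme above: the persistence flag travels inside the write request,
    # so no follow-up persistence request is sent.
    for _ in range(writes):
        net.send("RDMA_WRITE+PERSIST_FLAG")


baseline, flagged = Network(), Network()
baseline_persist(baseline, 4)
flagged_persist(flagged, 4)
print(len(baseline.requests), len(flagged.requests))  # prints "8 4"
```

Under these assumptions the number of requests sent by the local device is halved; acknowledgements travelling in the opposite direction are not modeled here.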

In one possible implementation, the data persistence flag may be a write persistence instruction. When the data persistence flag is a write persistence instruction, the RDMA write request may also be referred to as an RDMA write persistence request, for example RDMA write durable. The write persistence instruction is added on top of the original RDMA operation instruction; when the RNIC of the remote device parses the RDMA write request and obtains the write persistence instruction, it can determine that the first data in the RDMA write request is data to be persisted.
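As a minimal sketch of this variant, the remote RNIC could decide from the operation code alone. The enum members below are hypothetical and do not correspond to the opcodes of any real RDMA transport.

```python
from enum import Enum, auto


class Opcode(Enum):
    # Hypothetical operation codes; real RDMA transports define their own.
    RDMA_WRITE = auto()
    RDMA_WRITE_DURABLE = auto()  # write plus the write persistence instruction


def needs_persistence(opcode: Opcode) -> bool:
    """Return True when the request carries the write persistence instruction."""
    return opcode is Opcode.RDMA_WRITE_DURABLE
```

A plain `RDMA_WRITE` is handled as before, so requests without the instruction are unaffected.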

In another possible implementation, the data persistence flag is a destination storage address corresponding to the first data. The storage space corresponding to the destination storage address is used for storing the first data, the destination storage address is a persistent storage address in the first device, and the storage space corresponding to the persistent storage address is used for storing data to be persisted. Here, the destination storage address corresponding to the first data is the address, specified by the second device, of the storage space in the first device that is to store the first data. The remote device allocates in advance a storage space for storing data to be persisted. When the RNIC of the remote device determines that the destination storage address corresponding to the first data falls within this pre-allocated storage space, it can determine that the first data in the RDMA write request is data to be persisted.
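The address-based variant amounts to a range check against the pre-allocated persistent regions. The sketch below assumes byte addresses and a hypothetical `PersistentRegion` type; requiring the whole write to fit inside one region is an illustrative choice, not something the text specifies.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PersistentRegion:
    base: int    # first byte address of the pre-allocated region
    length: int  # size of the region in bytes

    def contains(self, addr: int, size: int) -> bool:
        # The whole write [addr, addr + size) must fall inside the region.
        return self.base <= addr and addr + size <= self.base + self.length


def is_to_be_persisted(regions, dest_addr: int, size: int) -> bool:
    """True if the destination lies in a region reserved for persisted data."""
    return any(r.contains(dest_addr, size) for r in regions)


regions = [PersistentRegion(base=0x1000, length=0x1000)]
```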

In another possible implementation, the first RNIC may instruct the first processor to save the first data to the non-volatile memory of the first device as follows: the first RNIC adds a DMA least significant bit (LSB) read request to the receive queue (RQ) corresponding to the second RNIC; the first RNIC then sends the DMA LSB read request to the first processor to instruct the first processor to write all data cached in the peripheral bus link of the first device to the non-volatile memory of the first device. This approach applies when data does not pass through the last level cache (LLC) of the first processor while the first RNIC writes the data. In that case the write may not have fully completed, and the data may still be held in a cache of an input/output (I/O) controller of the processor. According to the peripheral component interconnect express (PCIe) protocol, write operations issued earlier must complete before a subsequent read operation is performed. Therefore, when the first RNIC adds a DMA LSB read request to the receive queue and sends it to the processor, the processor executes the corresponding read operation, which forces the incompletely written data cached on the peripheral bus to be written into the non-volatile memory. Any part of the first data still cached on the peripheral bus is thereby stored in the non-volatile memory.
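The flushing effect of the read can be sketched with a toy model of the I/O controller. `IoController` and its fields are hypothetical; the model captures only the ordering rule described above (a read is served after all earlier writes have been drained), not real PCIe behavior in detail.

```python
class IoController:
    """Toy model of a processor-side I/O controller with a posted-write buffer."""

    def __init__(self):
        self.pending = []  # writes accepted but not yet in memory
        self.nvm = {}      # stands in for the non-volatile memory

    def dma_write(self, addr: int, data: bytes) -> None:
        # Posted write: accepted immediately, may linger in the buffer.
        self.pending.append((addr, data))

    def dma_read(self, addr: int, length: int = 1) -> bytes:
        # Ordering rule modeled here: a read may not pass earlier writes,
        # so every pending write is drained before the read is served.
        for w_addr, w_data in self.pending:
            self.nvm[w_addr] = w_data
        self.pending.clear()
        return self.nvm.get(addr, b"\x00")[:length]


ctrl = IoController()
ctrl.dma_write(0x10, b"first data")  # may still sit in the buffer
ctrl.dma_read(0x10)                  # the small read drains the buffer first
```

After the read returns, the buffer is empty and the written data is in `nvm`, which is the property the DMA LSB read request relies on.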

In another possible implementation, the first RNIC may instruct the first processor to save the first data to the non-volatile memory of the first device as follows: after generating, according to the first data, a work queue entry (WQE) corresponding to an RDMA receive request, the first RNIC clears the WQE, where the RDMA receive request is used for receiving an RDMA send request initiated by the second device; after clearing the WQE, the first RNIC generates a completion queue entry (CQE) corresponding to the RDMA receive request to instruct the first processor to store the first data cached in the volatile storage medium of the first processor in the non-volatile memory of the first device. The volatile storage medium of the first processor may be the LLC. This approach applies when data passes through the LLC of the processor of the remote device while the RNIC of the remote device writes the data. Because the first data passes through the LLC of the first processor, the first RNIC generates and clears the WQE corresponding to the RDMA receive request and generates the corresponding CQE according to the first data; when the first processor acquires the CQE, an interrupt may be generated that causes the first processor to write the data at the corresponding address back to the non-volatile memory. Because the WQE and the CQE are generated according to the first data and therefore correspond to it, writing the data at the corresponding address back to the non-volatile memory stores the first data in the non-volatile memory.
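The WQE/CQE path can be sketched as follows. The `Processor` class, the dictionary layout of the entries, and the `on_cqe` handler are hypothetical; the sketch assumes that handling the CQE triggers a write-back of the cached data at the address the entry refers to, as the paragraph above describes.

```python
class Processor:
    """Toy first processor with a volatile last-level cache (LLC)."""

    def __init__(self):
        self.llc = {}  # volatile: contents are lost on power failure
        self.nvm = {}  # non-volatile memory of the first device

    def dma_write(self, addr: int, data: bytes) -> None:
        self.llc[addr] = data  # data lands in the cache first

    def on_cqe(self, cqe: dict) -> None:
        # Handling the completion entry writes the cached data at the
        # referenced address back to the non-volatile memory.
        addr = cqe["addr"]
        if addr in self.llc:
            self.nvm[addr] = self.llc.pop(addr)


def persist_via_cqe(cpu: Processor, addr: int, data: bytes) -> None:
    cpu.dma_write(addr, data)
    wqe = {"op": "RECV", "addr": addr}  # WQE generated from the first data
    wqe.clear()                         # the RNIC clears the WQE
    cqe = {"op": "RECV", "addr": addr}  # then generates the matching CQE
    cpu.on_cqe(cqe)
```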

In another possible implementation manner, after the first RNIC instructs the first processor to store the first data in the nonvolatile memory of the first device, a persistence confirmation message corresponding to the first data may be further sent to the second RNIC, where the persistence confirmation message includes a second Packet Sequence Number (PSN), and the second PSN is a sequence number of the first data, and the persistence confirmation message is used to indicate that memory persistence of the first data is completed. By sending a persistence confirmation message to the second RNIC, the second RNIC may determine from the second PSN that the first data is stored in the non-volatile memory of the first device.

In another possible implementation, after receiving the RDMA write request sent by the second RNIC, the first RNIC sends a reception acknowledgement message corresponding to the first data to the second RNIC, where the reception acknowledgement message includes a first PSN, the first PSN is a sequence number of the first data, and the reception acknowledgement message is used to indicate that the first RNIC has received the first data. On receiving the reception acknowledgement message, the second RNIC may determine from the first PSN that transmission of the first data is complete.
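The two acknowledgements returned by the first RNIC (reception, then persistence) can be sketched as below. The `Ack` type and the labels `"RECEIVED"` and `"PERSISTED"` are hypothetical; both messages carry the PSN of the first data, as described in the two paragraphs above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Ack:
    kind: str  # "RECEIVED" or "PERSISTED" (hypothetical labels)
    psn: int   # packet sequence number of the first data


def acks_for_write(psn: int, persist_ok: bool) -> list:
    """Acknowledgements the first RNIC returns for one flagged write."""
    acks = [Ack("RECEIVED", psn)]           # sent once the request has arrived
    if persist_ok:
        acks.append(Ack("PERSISTED", psn))  # sent after the store to NVM
    return acks
```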

In a second aspect, another data processing method is provided, including: a second RNIC receives an RDMA write persistence request from a second processor, where the RDMA write persistence request includes a data persistence flag, the RDMA write persistence request is used for requesting to store first data into a non-volatile memory of a first device, the data persistence flag is used for indicating that the first data is data to be persisted, the second RNIC and the second processor are respectively the RNIC and the processor of a second device, the second RNIC and the second processor communicate based on DMA, the second device is the device that initiates the RDMA request, the first device is the device that receives the RDMA request, and the second device and the first device communicate based on RDMA; the second RNIC generates an RDMA write request from the RDMA write persistence request, the RDMA write request including the first data and the data persistence flag; the second RNIC sends the RDMA write request to a first RNIC, where the RDMA write request is used for requesting to write the first data to the first device, and the first RNIC is the RNIC of the first device.

The device receiving the RDMA request may be referred to as a remote device, and the device sending the RDMA request may be referred to as a local device, that is, the first device is a remote device, and the second device is a local device.

In the above scheme, when the RNIC of the local device receives an RDMA write persistence request initiated by the processor of the local device, it generates an RDMA write request according to the RDMA write persistence request. The RDMA write request itself causes the RNIC of the remote device to write the first data into the remote device, and the data persistence flag in the RDMA write request tells the RNIC of the remote device that the first data is data to be persisted, so that after writing the first data into the remote device, the RNIC of the remote device can have the first data stored in the non-volatile memory of the remote device, completing memory persistence of the first data. The scheme in effect merges the write request and the memory persistence request into one request, so the local device does not need to send an additional memory persistence request and the load on the RDMA network is reduced. In addition, because the data persistence flag is carried in the RDMA write request, the remote write operation and the remote memory persistence operation are executed consecutively: the data is stored in the non-volatile memory right after it is written into the remote device, which avoids data inconsistency.
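The merging of the two requests on the local side can be sketched as a simple translation step. Both record types are hypothetical, and `local_buffer` stands in for the first data that the second RNIC would obtain from the memory of the second device.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class WritePersistRequest:  # posted by the second processor
    local_buffer: bytes     # stands in for the first data
    remote_addr: int        # destination in the first device
    persist: bool = True    # the data persistence flag


@dataclass(frozen=True)
class RdmaWriteRequest:     # sent on the wire to the first RNIC
    payload: bytes
    remote_addr: int
    persist: bool


def build_write_request(req: WritePersistRequest) -> RdmaWriteRequest:
    # One wire request carries both the data and the flag.
    return RdmaWriteRequest(req.local_buffer, req.remote_addr, req.persist)
```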

In one possible implementation, the data persistence flag is a write persistence instruction. When the data persistence flag is a write persistence instruction, the RDMA write request may also be referred to as an RDMA write persistence request, for example RDMA write durable. Because the write persistence instruction is added on top of the original RDMA operation instruction, the RNIC of the remote device can determine that the first data in the RDMA write request is data to be persisted after parsing the RDMA write request and obtaining the write persistence instruction.

In another possible implementation manner, the data persistence flag is a destination storage address corresponding to the first data, the destination storage address corresponding to the first data is a persistent storage address in the first device, and the storage space corresponding to the persistent storage address is used for storing data to be persisted. The destination storage address corresponding to the first data is the storage address, specified by the second device, at which the first data is to be stored in the first device. Because the destination storage address of the first data is set to an address within the storage space that the remote device uses for data to be persisted, the RNIC of the remote device can determine from the destination storage address that the first data is data to be persisted.
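The two forms of the data persistence flag described above can be illustrated with a minimal sketch. The opcode names, the `RdmaWriteRequest` type, and the representation of the persistent storage space as half-open address ranges are assumptions made for illustration only; the application does not define them.

```python
from dataclasses import dataclass
from typing import Sequence, Tuple

# Hypothetical opcode names: the text only says that a write persistence
# instruction is added to the original RDMA operation instructions.
OP_WRITE = "RDMA_WRITE"
OP_WRITE_PERSIST = "RDMA_WRITE_PERSIST"


@dataclass(frozen=True)
class RdmaWriteRequest:
    opcode: str
    dest_addr: int
    payload: bytes


def is_to_be_persisted(
    req: RdmaWriteRequest,
    persistent_ranges: Sequence[Tuple[int, int]] = (),
) -> bool:
    """Return True if the request carries a data persistence flag."""
    # Form 1: the flag is the write persistence instruction itself.
    if req.opcode == OP_WRITE_PERSIST:
        return True
    # Form 2: the flag is the destination storage address, which must lie
    # entirely inside a persistent storage range [lo, hi) (assumed layout).
    end = req.dest_addr + len(req.payload)
    return any(lo <= req.dest_addr and end <= hi for lo, hi in persistent_ranges)
```

In this sketch a request is treated as persistent if either form of the flag is present; a real RNIC would presumably implement only the form chosen for the deployment.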

In another possible implementation manner, after sending the RDMA write request to the first RNIC, the second RNIC generates a CQE corresponding to the RDMA write persistence request upon receiving a persistence confirmation message corresponding to the first data sent by the first RNIC, where the persistence confirmation message corresponding to the first data indicates that memory persistence of the first data is completed, and the CQE corresponding to the RDMA write persistence request notifies the second processor that memory persistence of the first data is completed. This approach is applicable where data does not pass through the LLC of the remote device's processor while the RNIC of the remote device writes the data. Because the second RNIC generates the CQE corresponding to the RDMA write persistence request only after receiving the persistence confirmation message sent by the first RNIC, the second processor can determine from the CQE that memory persistence of the first data is completed. The second processor does not need to initiate a separate remote memory persistence request, and the load on the RDMA network is reduced.

In another possible implementation manner, before the second RNIC generates the CQE corresponding to the RDMA write persistence request, the second RNIC, upon receiving a reception acknowledgement message corresponding to the first data sent by the first RNIC, caches that message. The reception acknowledgement message includes a first PSN, which is the sequence number of the first data, and indicates that the first RNIC has received the first data. The second RNIC then receives a first confirmation message sent by the first RNIC, where the first confirmation message includes a second PSN; if the second PSN is the same as the first PSN, the second RNIC determines that the persistence confirmation message corresponding to the first data has been received. In the RDMA protocol, an acknowledgement (ACK) message only conveys acknowledgement; which request it acknowledges must be inferred from the order of the previously sent requests or of the received ACK messages. In this design, because the write operation occurs before the memory persistence operation, the first acknowledgement message received that carries the sequence number of the first data is necessarily the reception acknowledgement message of the first data. By caching the reception acknowledgement message, the RNIC of the second device can send the data following the first data to the RNIC of the first device without waiting for the persistence confirmation message of the first data. By comparing PSNs, memory persistence of one piece of data and the write operation of the next piece of data can proceed in parallel, which improves the efficiency of memory persistence and reduces latency.
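The PSN comparison described above can be sketched as follows. This is an illustrative model only: the `AckTracker` class and its method names are hypothetical, and the sketch assumes that each PSN is acknowledged exactly twice, first for reception and then for persistence.

```python
class AckTracker:
    """Classify ACKs on the second RNIC by comparing PSNs (illustrative)."""

    def __init__(self) -> None:
        self._received = set()  # PSNs whose reception acknowledgement is cached
        self.persisted = []     # PSNs confirmed persistent, in arrival order

    def on_ack(self, psn: int) -> str:
        # A PSN seen for the second time matches a cached reception
        # acknowledgement, so this ACK is the persistence confirmation.
        if psn in self._received:
            self._received.discard(psn)
            self.persisted.append(psn)
            return "persistence"
        # The first ACK carrying this PSN is necessarily the reception
        # acknowledgement, because the write precedes the persistence.
        self._received.add(psn)
        return "reception"


tracker = AckTracker()
tracker.on_ack(7)  # reception of data 7; the next write may be sent now
tracker.on_ack(8)  # reception of data 8, overlapping persistence of data 7
tracker.on_ack(7)  # persistence confirmation for data 7
```

Because the reception acknowledgement is cached rather than awaited, the write of the data with PSN 8 overlaps the persistence of the data with PSN 7, which is the parallelism the paragraph refers to.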

In another possible implementation manner, after the second RNIC sends the RDMA write request to the first RNIC, upon receiving a reception acknowledgement message corresponding to the first data sent by the first RNIC, the second RNIC generates a CQE corresponding to the RDMA write persistence request, where the reception acknowledgement message corresponding to the first data indicates that the first RNIC has received the first data, and the CQE corresponding to the RDMA write persistence request notifies the second processor that the first data has been written to the first device; the second RNIC receives an RDMA receiving request sent by the second processor; and upon receiving a persistence confirmation message corresponding to the first data, the second RNIC generates a CQE corresponding to the RDMA receiving request, where the persistence confirmation message corresponding to the first data indicates that persistence of the first data is completed, and the CQE corresponding to the RDMA receiving request notifies the second processor that persistence of the first data is completed. This approach is applicable where data passes through the LLC of the remote device's processor while the RNIC of the remote device writes the data. Because the first data passes through the LLC of the processor of the first device, the CQE corresponding to the RDMA write persistence request is generated as soon as the reception acknowledgement message corresponding to the first data is received, so that the second processor can initiate the RDMA receiving request according to that CQE. The step in which the second processor initiates an RDMA sending request is omitted, and the load on the RDMA network is reduced.
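The order of completions on the second RNIC in this implementation can be sketched as a small state machine. The class and method names are hypothetical, and the handling of a persistence confirmation that arrives before the RDMA receiving request has been posted (it is held until the request is posted) is an assumption; the text does not address that case.

```python
class SecondRnicLlcFlow:
    """Completion order on the second RNIC when data passes through the LLC."""

    def __init__(self) -> None:
        self.cq = []                 # completion queue polled by the second processor
        self._recv_posted = False    # RDMA receiving request posted by the processor
        self._persist_acked = False  # persistence confirmation already received

    def on_reception_ack(self) -> None:
        # The reception acknowledgement completes the RDMA write persistence
        # request: the first data has been written to the first device.
        self.cq.append("write_persist")

    def post_receive(self) -> None:
        self._recv_posted = True
        self._maybe_complete_receive()

    def on_persistence_ack(self) -> None:
        self._persist_acked = True
        self._maybe_complete_receive()

    def _maybe_complete_receive(self) -> None:
        # The receiving request completes only when it has been posted and
        # the persistence confirmation has arrived (assumed to be order-free).
        if self._recv_posted and self._persist_acked:
            self.cq.append("receive")
            self._recv_posted = self._persist_acked = False
```

The second processor learns that the data is persistent only from the second completion; no RDMA sending request is issued by the local device at any point in this flow.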

In another possible implementation manner, before the second RNIC generates the CQE corresponding to the RDMA receiving request, the second RNIC may further cache the reception acknowledgement message corresponding to the first data, where the reception acknowledgement message corresponding to the first data includes a first PSN, and the first PSN is the sequence number of the first data; the second RNIC receives a first confirmation message sent by the first RNIC, where the first confirmation message includes a second PSN; if the second PSN is the same as the first PSN, the second RNIC determines that the persistence confirmation message corresponding to the first data has been received. By caching the reception acknowledgement message of the first data, the second device can send the data following the first data to the first device without waiting for the persistence confirmation of the first data. By comparing PSNs, memory persistence of one piece of data and the write operation of the next piece of data can proceed in parallel, which improves the efficiency of memory persistence and reduces latency.

In a third aspect, there is provided another data processing method, including: the second processor sends an RDMA write persistence request to a second RNIC, the RDMA write persistence request comprises a data persistence flag, the RDMA write persistence request is used for requesting to store first data into a nonvolatile memory of the first device, the data persistence flag is used for indicating that the first data are data to be persisted, the second RNIC and the second processor are respectively an RNIC and a processor of the second device, the second RNIC and the second processor communicate based on a DMA mode, the second device is a device for initiating the RDMA request, the first device is a device for receiving the RDMA request, and the first device and the second device communicate based on the RDMA mode; in the case of obtaining a CQE corresponding to an RDMA write persistence request, the second processor determines that the first data has been stored to the non-volatile memory of the first device.

The device receiving the RDMA request may be referred to as a remote device, and the device sending the RDMA request may be referred to as a local device, that is, the first device is a remote device, and the second device is a local device.

This scheme is applicable where data does not pass through the LLC of the remote device's processor while the RNIC of the remote device writes the data. The processor of the local device can have the first data written into the remote device and persisted there by sending an RDMA write request only once, without sending a remote memory persistence request after the RDMA write request, which reduces the load on the RDMA network. In addition, because the data persistence flag is carried in the RDMA request, the remote write operation and the remote memory persistence operation are executed consecutively, so the data is stored in the nonvolatile memory once it has been written into the remote device, which avoids data inconsistency.

In another possible implementation, the data persistence flag is a write persistence instruction. By adding the RDMA write persistence instruction on the basis of the original RDMA operation instruction, the RNIC of the remote device can determine that the first data in the RDMA request is the data to be persisted after the RDMA request is analyzed to obtain the RDMA write persistence instruction.

In another possible implementation manner, the data persistence flag is a destination storage address corresponding to the first data, the destination storage address corresponding to the first data is a persistent storage address in the first device, and a storage space corresponding to the persistent storage address is used for storing data to be persisted. The destination storage address corresponding to the first data refers to a storage address for storing the first data in the first device specified by the second device. By setting the destination storage address corresponding to the first data as the storage address corresponding to the storage space for storing the data to be persisted in the remote device, the RNIC of the remote device can determine that the first data is the data to be persisted according to the destination storage address.

In a fourth aspect, there is provided a further data processing method, comprising: the second processor sends an RDMA write persistence request to a second RNIC, the RDMA write persistence request comprises a data persistence flag, the RDMA write persistence request is used for requesting to store first data into a nonvolatile memory of the first device, the data persistence flag is used for indicating that the first data are data to be persisted, the second RNIC and the second processor are respectively an RNIC and a processor of the second device, the second RNIC and the second processor communicate based on a DMA mode, the second device is a device for initiating the RDMA request, the first device is a device for receiving the RDMA request, and the first device and the second device communicate based on the RDMA mode; under the condition of acquiring a CQE corresponding to the RDMA write persistence request, the second processor sends an RDMA receiving request to the second RNIC; and under the condition of acquiring the CQE corresponding to the RDMA receiving request sent by the second RNIC, the second processor determines that the first data is stored in the nonvolatile memory of the first device.

This scheme is applicable where data passes through the LLC of the remote device's processor while the RNIC of the remote device writes the data. The local device no longer needs to send an RDMA sending request, which reduces the load on the RDMA network.

In one possible implementation, the data persistence flag is a write persistence instruction. By adding the RDMA write persistence instruction on the basis of the original RDMA operation instruction, the RNIC of the remote device can determine that the first data in the RDMA request is the data to be persisted after the RDMA request is analyzed to obtain the RDMA write persistence instruction.

In another possible implementation manner, the data persistence flag is a destination storage address corresponding to the first data, the destination storage address corresponding to the first data is a persistent storage address in the first device, and a storage space corresponding to the persistent storage address is used for storing data to be persisted. The destination storage address corresponding to the first data refers to a storage address for storing the first data in the first device specified by the second device. By setting the destination storage address corresponding to the first data as the storage address corresponding to the storage space for storing the data to be persisted in the remote device, the RNIC of the remote device can determine that the first data is the data to be persisted according to the destination storage address.

In a fifth aspect, an RNIC is provided that includes modules for performing the data processing method of the first aspect or any one of its possible implementations.

In a sixth aspect, there is provided another RNIC comprising means for performing the data processing method of the second aspect or any one of its possible implementations.

In a seventh aspect, a processor is provided, configured to execute part or all of the processes related to the third or fourth aspect.

In an eighth aspect, there is provided a first device comprising a processor, a non-volatile memory, and an RNIC configured to perform the operational steps of the method flow of the first aspect.

In a ninth aspect, there is provided a second apparatus comprising a processor, a non-volatile memory, and an RNIC, the RNIC being configured to perform the operational steps of the method flow of the second aspect, and the processor being configured to perform the operational steps of the method flow of the third or fourth aspect.

In a tenth aspect, a computer-readable storage medium is provided, having stored therein instructions, which, when run on a computer, cause the computer to perform the method of the above aspects.

In an eleventh aspect, there is provided a computer program product comprising instructions which, when run on a computer, cause the computer to perform the method of the above aspects.

The present application can further combine to provide more implementations on the basis of the implementations provided by the above aspects.

Drawings

FIG. 1 is a schematic diagram of a computing device according to an embodiment of the present application;

FIG. 2 is a schematic flow chart of remote memory persistence according to an embodiment of the present application;

FIG. 3 is a schematic flow chart of another remote memory persistence process according to an embodiment of the present application;

FIG. 4 is a schematic diagram of a network of computing devices communicating via RDMA techniques according to an embodiment of the present application;

FIG. 5 is a schematic diagram of a communication system including a local device and a remote device according to an embodiment of the present application;

FIG. 6 is a schematic structural diagram of an implementation of an RNIC according to an embodiment of the present application;

FIG. 7 is a schematic structural diagram of another RNIC according to an embodiment of the present application;

FIG. 8 is a schematic diagram of a channel between two computing devices according to an embodiment of the present application;

FIG. 9 is a schematic flow chart of a data processing method according to an embodiment of the present application;

FIG. 10 is a schematic flow chart of another data processing method according to an embodiment of the present application;

FIG. 11 is a schematic flow chart of another data processing method according to an embodiment of the present application;

FIG. 12 is a schematic structural diagram of a computing device according to an embodiment of the present application;

FIG. 13 is a schematic structural diagram of another computing device according to an embodiment of the present application.

Detailed Description

The technical solutions in the embodiments of the present application will be described below with reference to the drawings in the embodiments of the present application.

First, a conventional data processing method will be described with reference to fig. 1 to 3.

FIG. 1 is a schematic diagram of a computing device. As shown in fig. 1, computing device 10 includes RNIC101, processor 102, and SCM103, which are connected by bus 104, where bus 104 includes, but is not limited to, a peripheral bus (e.g., a Peripheral Component Interconnect (PCI) bus, a PCIe bus, etc.) and a system bus; the processor may be a Central Processing Unit (CPU). The processor 102 includes an integrated input/output controller (IIO), a Last Level Cache (LLC), an Integrated Memory Controller (iMC), and one or more processor cores (cores). In the processor, the IIO processes messages exchanged with peripheral devices such as the RNIC, and the messages may be PCI messages, PCIe messages, and the like. The IIO, LLC, iMC, and processor cores may be connected via an on-chip bus; e.g., if the processor is an advanced reduced instruction set machine (ARM) architecture chip, the IIO, LLC, iMC, and processor cores are connected via an advanced high performance bus (AHB).

In a computing device that uses an SCM as memory, data in the RNIC must pass through the processor of the computing device to be written to the SCM, and it can be written to the SCM along either of the two data flows shown in fig. 1. The first data flow comprises: first, the RNIC writes the data to the IIO of the processor; second, the IIO writes the data to the iMC of the processor, and the iMC caches the data in its asynchronous dynamic random access memory refresh (ADR) area; third, the iMC writes the data to the SCM. The second data flow comprises: first, the RNIC writes the data to the IIO of the processor; second, the IIO writes the data to the LLC of the processor; third, the data in the LLC is flushed to the iMC, and the iMC caches the data in its ADR area; fourth, the iMC writes the data to the SCM. Because data in the ADR area of the iMC is not lost on power failure, data that has been written to the ADR can generally be regarded as having completed memory persistence.

As can be seen from fig. 1, there are two possible data flow directions of data in the processor, and which one applies is determined by the architectural characteristics of the processor. In some processor architectures, the data flow direction in the processor depends on the data direct I/O (DDIO) function in the IIO: if the DDIO function in the IIO is turned on, the data flow direction in the processor is the second data flow direction; if the DDIO function in the IIO is not turned on, the data flow direction in the processor is the first data flow direction. In other processor architectures (such as the Skylake architecture of Intel), the data flow direction in the processor depends on both the DDIO function in the IIO and the packet sent by the RNIC to the IIO: if the no-snoop (NS) flag bit in the packet sent by the RNIC to the IIO is 1 or the DDIO function in the IIO is not turned on, the data flow direction in the processor is the first data flow direction; if the NS flag bit in the packet sent by the RNIC to the IIO is not 1 and the DDIO function of the processor is turned on, the data flow direction in the processor is the second data flow direction.
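The selection rule in the preceding paragraph can be written out as a short sketch. The function name, the `honors_ns` parameter (which distinguishes the two kinds of architecture), and the stage lists are illustrative assumptions and are not taken from any processor documentation.

```python
# Stages traversed by each data flow of fig. 1 (labels are illustrative).
FLOWS = {
    1: ("RNIC", "IIO", "iMC/ADR", "SCM"),
    2: ("RNIC", "IIO", "LLC", "iMC/ADR", "SCM"),
}


def select_data_flow(ddio_enabled: bool, ns_flag: int = 0, honors_ns: bool = False) -> int:
    """Return 1 for the first data flow and 2 for the second data flow.

    honors_ns is a hypothetical switch: True models architectures in which
    the no-snoop (NS) bit of the packet also influences the data flow.
    """
    if not ddio_enabled:
        return 1  # DDIO off: data bypasses the LLC
    if honors_ns and ns_flag == 1:
        return 1  # NS bit set: data bypasses the LLC even with DDIO on
    return 2      # DDIO on (and NS not set where it matters): data enters the LLC
```

The result determines which of the two conventional persistence procedures described next applies, since only the second data flow leaves data in the LLC.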

A storage system formed of SCM-enabled computing devices may include a plurality of SCM-enabled computing devices, in which remote access between the two computing devices is enabled by the RNICs of the computing devices. Two computing devices can be divided into two roles, a local device and a remote device, where "local" and "remote" are two opposite concepts, a local device referring to a computing device that initiates an RDMA request, i.e., a computing device that requests access to another computing device. A remote device refers to a computing device that receives an RDMA request, i.e., a computing device that is accessed by another computing device. The access of the local device to the remote device may be to write data into the remote device for the local device, specifically, the local device transmits the data in the local device to the RNIC of the remote device through the RNIC of the local device, and the remote device receives the data through the RNIC of the remote device, so as to transmit the data in the local device to the remote device. The access of the local device to the remote device may also be that the local device reads data from the remote device, specifically, the local device may read data in an SCM of the remote device through an RNIC of the local device, the remote device transmits data to be read by the local device to an RNIC of the local device through the RNIC of the remote device, and the RNIC of the local device receives the data to complete reading of the data in the remote device.

Remote memory persistence is implemented in two computing devices, and for the two different data flows, in some designs, there are different remote memory persistence processes.

For the first data flow, the data does not pass through the LLC of the processor, and the flow of remote memory persistence is shown in fig. 2, and includes the following steps:

S201, a processor of the local device initiates a first RDMA write request to an RNIC of the local device.

S202, the RNIC of the local device processes the first RDMA write request and generates a second RDMA write request according to the first RDMA write request.

S203, the RNIC of the local device sends the second RDMA write request to the RNIC of the remote device, and the RNIC of the remote device receives the second RDMA write request.

S204, the RNIC of the remote device writes the data in the second RDMA write request into the SCM in a DMA mode.

Here, the data in the second RDMA write request follows the first data flow in the remote device. Transmission over the bus introduces a certain delay, and because the bus may be occupied, part of the data may be held in the buffer of an intermediate medium on the peripheral bus link, such as the IIO.

S205, the RNIC of the remote device sends ACK to the RNIC of the local device, and the RNIC of the local device receives ACK.

S206, the RNIC of the local device generates a CQE corresponding to the first RDMA write request.

S207, the processor of the local device acquires the CQE corresponding to the first RDMA write request, and determines that the data write is completed.

S208, the processor of the local device initiates a first RDMA read request to the RNIC of the local device.

S209, the RNIC of the local device processes the first RDMA read request and generates a second RDMA read request according to the first RDMA read request.

S210, the RNIC of the local device sends the second RDMA read request to the RNIC of the remote device, and the RNIC of the remote device receives the second RDMA read request.

S211, the RNIC of the remote device reads data from the SCM in a DMA mode.

Since the peripheral bus protocol (e.g., the PCI protocol or the PCIe protocol) specifies that all write operations preceding a read operation must be completed before the read operation is performed, reading data from the SCM by DMA causes data buffered in an intermediate medium on the peripheral bus link, such as the IIO, to be written into the SCM. Therefore, data of the first RDMA write request that may still be buffered in the intermediate medium is written to the SCM as a result of the DMA read operation.

S212, the RNIC of the remote device sends ACK to the RNIC of the local device, and the RNIC of the local device receives the ACK.

S213, the RNIC of the local device generates a CQE corresponding to the first RDMA read request.

S214, the processor of the local device acquires the CQE corresponding to the first RDMA read request, and determines that the memory persistence of the data is completed.
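The ordering rule that steps S208 to S214 rely on can be illustrated with a toy model. The `PeripheralLink` class is hypothetical and greatly simplified: it keeps a single buffer for writes held in an intermediate medium and applies all of them before any read is served.

```python
class PeripheralLink:
    """Toy model of a peripheral bus link in front of the SCM (illustrative)."""

    def __init__(self) -> None:
        self.pending = []  # writes still buffered in an intermediate medium
        self.scm = {}      # contents of the SCM, keyed by address

    def dma_write(self, addr: int, data: bytes) -> None:
        # The write is accepted but may not have reached the SCM yet.
        self.pending.append((addr, data))

    def dma_read(self, addr: int):
        # Ordering rule: every earlier write completes before the read is
        # performed, so the read pushes buffered data into the SCM.
        for a, d in self.pending:
            self.scm[a] = d
        self.pending.clear()
        return self.scm.get(addr)


link = PeripheralLink()
link.dma_write(0x10, b"payload")  # S204: data may still be buffered
link.dma_read(0x0)                # S211: any read drains the buffered writes
```

The address read in S211 does not need to be the address written in S204; in this model any read on the same link drains the buffered writes.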

For the second data flow, where data is passed through the LLC of the processor, the flow of remote memory persistence is shown in fig. 3, and includes the following steps:

S301, a processor of the local device initiates a first RDMA write request to an RNIC of the local device.

S302, the RNIC of the local device processes the first RDMA write request and generates a second RDMA write request according to the first RDMA write request.

S303, the RNIC of the local device sends the second RDMA write request to the RNIC of the remote device, and the RNIC of the remote device receives the second RDMA write request.

S304, the RNIC of the remote device writes the data in the second RDMA write request into the SCM in a DMA mode.

The data flow of the remote device for the data in the second RDMA write request is the second data flow, and since the data is going to pass through the LLC, the data is buffered in the LLC during the writing of the data to the SCM.

S305, the RNIC of the remote device sends ACK to the RNIC of the local device, and the RNIC of the local device receives the ACK.

S306, the RNIC of the local device generates a CQE corresponding to the first RDMA write request.

S307, the processor of the local device acquires the CQE corresponding to the first RDMA write request, and determines that the data write is completed.

S308, the processor of the local device initiates a first RDMA sending request to the RNIC of the local device, wherein the first RDMA sending request carries a persistence mark.

S309, the RNIC of the local device processes the first RDMA sending request, and generates a second RDMA sending request according to the first RDMA sending request, wherein the second RDMA sending request carries a persistence mark and a destination virtual memory address of data.

S310, the RNIC of the local device sends a second RDMA send request to the RNIC of the remote device.

S311, the processor of the remote device initiates a first RDMA receive request to the RNIC of the remote device.

S312, the RNIC of the remote device generates a WQE corresponding to the first RDMA receiving request.

S313, the RNIC of the remote device generates a CQE corresponding to the first RDMA receiving request according to the second RDMA sending request.

S314, the RNIC of the remote device sends ACK to the RNIC of the local device, and the RNIC of the local device receives the ACK.

S315, the RNIC of the local device generates a CQE corresponding to the first RDMA send request.

S316, the processor of the local device acquires a CQE corresponding to the first RDMA sending request.

S317, the processor of the local device initiates a second RDMA receive request to the RNIC of the local device.

S318, the RNIC of the local device generates a WQE corresponding to the second RDMA receiving request.

S319, the processor of the remote device obtains the CQE corresponding to the first RDMA receiving request, and determines that the data corresponding to the first RDMA receiving request needs to be persisted.

S320, the processor of the remote device flushes the data at the corresponding address back into the ADR of the iMC.

S321, the processor of the remote device initiates a third RDMA sending request to the RNIC of the remote device, wherein the third RDMA sending request carries an indication of completion of data flushing.

S322, the RNIC of the remote device processes the third RDMA sending request, and generates a fourth RDMA sending request according to the third RDMA sending request, wherein the fourth RDMA sending request carries an indication of completion of data flushing.

S323, the RNIC of the remote device sends a fourth RDMA sending request to the RNIC of the local device, and the RNIC of the local device receives the fourth RDMA sending request.

S324, the RNIC of the local device generates a CQE corresponding to the second RDMA receiving request according to the indication of completion of data flushing in the fourth RDMA sending request.

S325, the processor of the local device acquires the CQE corresponding to the second RDMA receiving request, and determines that the memory persistence of the data is completed.

It can be seen that in the remote memory persistence processes shown in fig. 2 and 3, after the local device initiates an RDMA write request, it must additionally initiate an RDMA read request or an RDMA send request to ensure that data which may not yet have been written into the SCM is written into the SCM; such a request may be referred to as an RDMA persistence request. Because the local device needs to initiate an RDMA persistence request after each RDMA write request, and each RDMA request occupies a certain amount of bandwidth, the additional RDMA persistence requests increase the load on the RDMA network.
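The extra load can be made concrete by tallying the messages that cross the RDMA network in each flow. The lists below are transcribed from the steps of fig. 2 and fig. 3; the `FUSED` list is an assumption that reflects the scheme summarized earlier (one write request carrying the data persistence flag, followed by a reception acknowledgement and a persistence confirmation) and is not a step list from the application.

```python
# Messages that cross the RDMA network, per persisted write.
FIG2 = ["S203 write request", "S205 ACK", "S210 read request", "S212 ACK"]
FIG3 = [
    "S303 write request",
    "S305 ACK",
    "S310 send request",
    "S314 ACK",
    "S323 send request",
]
# Assumed tally for the fused scheme (illustrative, see the lead-in).
FUSED = [
    "write request with persistence flag",
    "reception acknowledgement",
    "persistence confirmation",
]


def count_requests(messages) -> int:
    """Count RDMA requests, excluding acknowledgement messages."""
    return sum("request" in m for m in messages)


for name, flow in (("fig. 2", FIG2), ("fig. 3", FIG3), ("fused", FUSED)):
    print(name, "requests:", count_requests(flow), "messages:", len(flow))
```

Under these assumptions the flows of fig. 2 and fig. 3 need two and three RDMA requests per persisted write respectively, whereas a fused request needs one.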

Embodiments of the present application provide a data processing method, an RNIC, and a device, so as to solve the problem of a large network load in the remote memory persistence processes shown in fig. 2 and 3.

Embodiments of the present application are applicable to computer networks in which computing devices are interconnected and communicate via RDMA techniques. Such a network may be as shown in fig. 4, where the computing devices may be connected to and communicate with one another via a wired network (e.g., ethernet). In the computer network, two roles exist, namely a local device and a remote device, and the embodiments of the application can in particular be applied to a communication system formed by the local device and the remote device. The communication system formed by the local device and the remote device may be as shown in fig. 5, and the structures of the local device and the remote device may refer to the computing device shown in fig. 1, each computing device including an RNIC, a processor, and an SCM. The definitions of the local device and the remote device are given in the foregoing description, and the local device and the remote device each transmit data to the peer computing device through their respective RNICs.

According to the embodiment of the application, the network load required by the remote memory persistence process is reduced by improving the structure of the RNIC and the remote memory persistence process.

First, an implementation manner of the RNIC provided in the embodiment of the present application is introduced, and compared with a general RNIC, the RNIC of the embodiment of the present application further includes a Persistent Memory (PM) module, where a function implemented by the PM module may be implemented by a hardware circuit composed of a gate array or a software program running in the RNIC. In the case where the control means in the RNIC is a microprogram control means with micro-storage as a core, the function realized by the PM module may be realized by a software program running in the RNIC; in the case where the control method of the control unit of the RNIC is a control method mainly based on a logic wiring structure, the function realized by the PM module may be realized by a hardware circuit composed of a gate array. The PM module is to perform operations related to memory persistence. The PM module can judge whether the data in the RDMA write request needs to be subjected to memory persistence according to the data and parameters carried in the received RDMA write request, and directly instructs the processor to write the data in the RDMA write request into a nonvolatile memory of the computing device under the condition that the data in the RDMA write request needs to be subjected to memory persistence according to the parameters, so that the memory persistence of the data is completed.

In the embodiment of the present application, the nonvolatile memory may be an SCM. The SCM may specifically be a PRAM, which may be, for example, 3D Xpoint, ReRAM, MRAM, and the like.

Fig. 6 is a schematic structural diagram of an implementation of the RNIC60 provided in the embodiment of the present application, where the RNIC60 may be used as the RNIC of a remote device or the RNIC of a local device. As shown in fig. 6, the RNIC60 may include a receiving module 601, a scheduling module 602, a sending module 603, and a persistent memory module 604.

The receiving module 601 is used for receiving a message sent by an external computing device. In the case where the RNIC60 serves as the RNIC of the local device, the receiving module 601 may be configured to receive a request or data sent by the RNIC of the remote device. In this embodiment, in the case that the RNIC60 serves as the RNIC of the local device, the receiving module 601 may be configured to perform the receiving operation performed by the second RNIC in the interaction flow between the second RNIC and the first RNIC in the method embodiments shown in fig. 9 to 11. In the case where the RNIC60 serves as the RNIC of the remote device, the receiving module 601 may be configured to receive a request or data sent by the RNIC of the local device. In this embodiment of the application, in a case that the RNIC serves as an RNIC of a remote device, the receiving module 601 may be configured to perform a receiving operation performed by a first RNIC in an interaction flow of a first RNIC and a second RNIC in the method embodiments shown in fig. 9 to fig. 11.

The sending module 603 is used to send messages to an external computing device. In the case where the RNIC60 serves as the RNIC of the local device, the sending module 603 may be configured to send a request or data to the RNIC of the remote device. In this embodiment, when the RNIC60 is used as the RNIC of the local device, the sending module 603 may be configured to execute sending operations executed by the second RNIC in the interaction flow between the second RNIC and the first RNIC in the method embodiments shown in fig. 9 to fig. 11. In the case where the RNIC60 is the RNIC of the remote device, the sending module 603 may be configured to send the request or data to the RNIC of the local device. In this embodiment, in the case that the RNIC60 is used as an RNIC of a remote device, the sending module 603 may be configured to execute sending operations executed by a first RNIC in an interaction flow between a second RNIC and a first RNIC in the method embodiments shown in fig. 9 to 11.

The scheduling module 602 is configured to communicate with a processor of a computing device and perform corresponding data processing. Where the RNIC60 is the RNIC of the local device, the scheduling module 602 is configured to communicate with and perform data processing on the processor of the local device. In the embodiment of the present application, in the case that the RNIC60 serves as the RNIC of the local device, the scheduling module 602 may be configured to perform the operations performed by the second RNIC in the method embodiments shown in fig. 9-11 and interacting with the second processor or the operations related to scheduling. Where the RNIC60 is the RNIC of a remote device, the scheduling module 602 is configured to communicate with and perform data processing on a processor of the remote device. In the embodiment of the present application, in the case that the RNIC60 is an RNIC of a remote device, the scheduling module 602 may be configured to perform operations performed by the first RNIC in the method embodiments shown in fig. 9-11 and interacting with the first processor or operations related to scheduling.

The PM module 604 is used to perform operations related to memory persistence. In this embodiment, for the first data flow, the PM module 604 is configured to determine, when the receiving module 601 receives an RDMA write request, whether the first data in the RDMA write request is data to be persisted. If so, the PM module 604 adds an RDMA LSB read request to the SQ corresponding to the RDMA write request, so that the scheduling module 602 can send the RDMA LSB read request to the processor; the processor then writes the data cached on the peripheral bus link into the nonvolatile memory, and the first data is thereby written into the nonvolatile memory. For the second data flow, the PM module 604 is likewise configured to determine, when the receiving module 601 receives an RDMA write request, whether the first data in the RDMA write request is data to be persisted. If so, the PM module 604 generates a WQE corresponding to an RDMA receive request according to the first data in the RDMA write request, then clears the WQE and generates a CQE corresponding to the RDMA receive request, so that the processor corresponding to the RNIC60 (i.e., the processor of the remote device) can write the first data cached in the volatile storage medium back to the nonvolatile memory according to the CQE.

In a possible implementation, the functions of the receiving module 601, the scheduling module 602, the sending module 603, and the persistent memory module 604 may be implemented by an arithmetic logic unit 701, a register 702, a control unit 703, and an input/output interface 704 working together. As shown in fig. 7, the arithmetic logic unit 701, the register 702, the control unit 703, and the input/output interface 704 may be connected through one or more internal buses 705. The arithmetic logic unit 701 may be used to execute arithmetic commands such as add, subtract, multiply, and divide commands; it may also be used to execute logical commands such as OR, AND, and NOT commands; and it may further be used to obtain a control signal from the control unit 703, fetch the data corresponding to the control signal from the register 702, and perform the corresponding operation. The register 702 is a memory with a small storage space. It can be used to store various instructions, to store register operands and intermediate or final operation results that are held temporarily during instruction execution, and to store data used by the arithmetic logic unit 701 to complete tasks requested by the control unit 703.
The control unit 703 is configured to decode the instructions stored in the register 702 and to issue the control signals for the operations needed to complete each instruction. The control unit 703 may be controlled in one of two ways: microprogram control with a micro memory as the core, in which case the microprogram may be stored in the register 702; or hardwired control based mainly on a logical hard-wired structure, in which case the control unit 703 may be composed of, for example, various AND/OR gate arrays. The input/output interface 704 is used for sending or receiving data. There may be a plurality of input/output interfaces 704, used respectively for receiving data from or sending data to the processor, and for receiving data from or sending data to an external computing device.

Optionally, the RNIC may further include a crystal oscillator, a media access controller, a physical interface transceiver, and the like, and the embodiments of the present application are not limited.

The data processing method according to the embodiment of the present application may be implemented based on the RNIC described in the foregoing embodiments corresponding to fig. 6 and fig. 7.

Before describing the data processing method of the embodiments of the present application, for ease of understanding, some concepts related to the embodiments of the present application will be described first.

1. Concept of RDMA operations

1) RDMA single-sided operation

The RDMA single-side operation means that when an application on a local device accesses a memory of a remote device, only a processor of the local device participates, and does not need the processor of the remote device to participate, that is, only one processor of one side is working. The read-write operation of the data in the memory of the remote device can be completed as long as the local device confirms the source address and the destination address of the data. In the embodiment of the application, the RDMA single-edge operation mainly involved comprises RDMA Write operation (RDMA-Write).

2) RDMA bilateral operations

RDMA bilateral operations refer to when an application on a local device accesses a memory of a remote device, requiring both a processor of the local device and a processor of the remote device to participate, i.e., both processors of the two devices are working. The RDMA bilateral operation mainly involved in the embodiment of the application comprises RDMA sending operation (RDMA-Send) and RDMA receiving operation (RDMA-Receive). If the local device is to transfer data to the memory of the remote device via an RDMA send operation, the remote device must first initiate an RDMA receive operation for receiving the local device initiated RDMA send operation.
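
The ordering constraint of the bilateral operation (a receive must be posted before the matching send arrives) can be illustrated with a minimal Python sketch. The `Receiver` class, its method names, and the decision to simply reject an unmatched send are assumptions made for illustration; they are not part of the embodiment or of any RDMA library.

```python
from collections import deque

class Receiver:
    """Hypothetical model of the remote side of an RDMA send/receive pair."""

    def __init__(self):
        self.rq = deque()      # posted receive buffers (addresses)
        self.memory = {}       # address -> data

    def post_receive(self, buffer_addr: int) -> None:
        # The remote processor initiates an RDMA receive operation.
        self.rq.append(buffer_addr)

    def on_send(self, data: bytes) -> bool:
        # An incoming send is accepted only if a receive was posted first.
        if not self.rq:
            return False
        self.memory[self.rq.popleft()] = data
        return True

remote = Receiver()
rejected = remote.on_send(b"payload")   # no receive posted yet
remote.post_receive(0x10)
accepted = remote.on_send(b"payload")   # now a receive buffer is available
```

In this sketch the first send fails because no receive is pending, while the second one lands in the buffer posted by the remote side.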

2. Concept of queue

The queue in the embodiment of the present application is similar to the message queue concept in socket communication, and may be understood as a container for storing various information or data, which is provided for asynchronous processing.

1) Work queue (work queue, WQ)

In RDMA technology, when two computing devices need to communicate, a channel connection is established between the RNICs of the two computing devices, and the two end points of each channel are queue pairs (QPs), one at each end. Illustratively, the channel between the RNICs of two computing devices may be as shown in fig. 8. Each QP consists of a send queue (SQ) and a receive queue (RQ), in which various types of messages are managed. The QP is mapped directly into the virtual address space of an application (client) of the computing device, so that the application can access the RNIC directly through it. Both the SQ and the RQ may be referred to as a WQ: for a computing device that is to send data, the WQ is the SQ; for a computing device that is to receive data, the WQ is the RQ.

An application in the computing device may create a work request (WR) and post it to a WQ in the QP. The WR describes a remote operation request of the application (e.g., a remote read operation request, a remote write operation request, etc.), so that the RNIC of the computing device can determine which operations to schedule and perform. In the WQ, WRs are converted to the WQE format and wait for the RNIC to schedule them. For example, an application of computing device A wishes to transfer content stored at address A to address B (address A is an address in computing device A and address B is an address in computing device B). The application informs the RNIC of computing device A of address A, address B, and a write instruction through a WR, and the RNIC adds a WQE to the SQ that includes address A, address B, and the write instruction.

An application of a computing device may send an RDMA request as a WR to the RNIC of the computing device; after receiving the RDMA request, the RNIC adds a WQE corresponding to the RDMA request to a WQ. For an RDMA request of a send type (e.g., an RDMA read request, an RDMA write request, or an RDMA send request), the corresponding WQE is added to the SQ, as shown in fig. 8; for an RDMA request of a receive type (e.g., an RDMA receive request), the corresponding WQE is added to the RQ.
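
The routing of work requests into the SQ or the RQ described above can be sketched as follows. This is a simplified Python model; the `WorkRequest` and `QueuePair` classes, the opcode strings, and the dictionary layout of a WQE are hypothetical and chosen only for illustration.

```python
from collections import deque
from dataclasses import dataclass, field

# Hypothetical opcode groups: send-type WQEs go to the SQ, receive-type to the RQ.
SEND_TYPE = {"read", "write", "send"}
RECEIVE_TYPE = {"receive"}

@dataclass
class WorkRequest:
    opcode: str          # "read", "write", "send" or "receive"
    src_addr: int = 0
    dst_addr: int = 0

@dataclass
class QueuePair:
    sq: deque = field(default_factory=deque)
    rq: deque = field(default_factory=deque)

    def post(self, wr: WorkRequest) -> None:
        """Convert a WR into a WQE and append it to the matching work queue."""
        wqe = {"opcode": wr.opcode, "src": wr.src_addr, "dst": wr.dst_addr}
        if wr.opcode in SEND_TYPE:
            self.sq.append(wqe)
        elif wr.opcode in RECEIVE_TYPE:
            self.rq.append(wqe)
        else:
            raise ValueError(f"unknown opcode: {wr.opcode}")

qp = QueuePair()
qp.post(WorkRequest("write", src_addr=0xA, dst_addr=0xB))  # lands in the SQ
qp.post(WorkRequest("receive"))                            # lands in the RQ
```

The WQEs then wait in their queues until the RNIC schedules them.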

2) Completion Queue (CQ)

In addition to the QP, RDMA technology uses a completion queue (CQ), which stores completion events corresponding to operations in order to notify the upper-layer application. For example, if the application of computing device A wants to transfer the content stored at address A to address B (address A is an address in computing device A and address B is an address in computing device B), then after the RNIC of computing device A sends the content at address A to the RNIC of computing device B and determines that the RNIC of computing device B has received the content, the RNIC of computing device A generates a CQE. The application of computing device A obtains the CQE and thereby determines that the content stored at address A has been transferred to address B.
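
A minimal Python sketch of the completion path follows. The `complete` function, the `acked` parameter, and the CQE layout are assumptions for illustration only; a real RNIC generates CQEs in hardware.

```python
from collections import deque

def complete(sq: deque, cq: deque, acked: bool) -> None:
    """Pop the oldest WQE and, if the peer acknowledged it, append a CQE."""
    wqe = sq.popleft()
    if acked:
        cq.append({"wqe": wqe, "status": "success"})

sq = deque([{"opcode": "write", "src": "address A", "dst": "address B"}])
cq = deque()

# The local RNIC has sent the content and learned that the peer RNIC received it.
complete(sq, cq, acked=True)

# The application polls the CQ to learn that the transfer finished.
cqe = cq.popleft()
```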

Next, a data processing method of the embodiment of the present application is described. Referring to fig. 9, fig. 9 is a schematic flowchart of a data processing method provided in an embodiment of the present application, where the first RNIC and the first processor are an RNIC and a processor of a first device, respectively, and the first device is a remote device; the second RNIC and the second processor are an RNIC and a processor, respectively, of a second device, which is a local device. As shown, the method includes:

S801, the second processor sends an RDMA write persistence request to the second RNIC, and the second RNIC receives the RDMA write persistence request, where the RDMA write persistence request includes a data persistence flag.

An RDMA write persistence request may be understood as a WR created by the second processor to describe a remote operation request of the second processor. In an embodiment of the application, the RDMA write persistence request is used to request to store the first data in a non-volatile memory of the first device to complete memory persistence of the first data.

The RDMA write persistence request may include an RDMA operation instruction, a source virtual memory address of the first data, and a destination virtual memory address of the first data. The source virtual memory address of the first data is a virtual memory address in the second device, and the storage space corresponding to the source virtual memory address is where the first data is stored in the second device. The destination virtual memory address of the first data is a virtual memory address in the first device, and the storage space corresponding to the destination virtual memory address is where the first data is stored after it is written into the first device. The destination virtual memory address is a virtual memory address registered by the first processor through the virtual memory address registration process. For example, if the first data is stored in the storage space corresponding to virtual memory address A of the second device, and the second processor wants to store the first data in the storage space corresponding to virtual memory address B of the first device, then the source virtual memory address of the first data is virtual memory address A in the second device, and the destination virtual memory address of the first data is virtual memory address B in the first device.

Here, a virtual memory address is a logical storage address that has a mapping relationship with a physical address in the memory of the computing device. It is used to isolate programs from one another and to guarantee that programs run normally in the computing device. In a computing device, an application program is compiled into a plurality of subprograms. The addresses of each subprogram usually start from "0", and the other addresses in the subprogram are calculated relative to this starting address. The address range formed by these addresses is called an address space, and the addresses in the address space are logical storage addresses. These addresses correspond to storage locations in the memory of the computing device; the address range formed by the addresses of those storage locations is called the memory space, and the addresses in the memory space are physical addresses. When a plurality of subprograms run simultaneously, each subprogram expects to be loaded from address "0", but the computer has only one physical address 0, so some subprograms cannot be loaded from "0" and their logical storage addresses no longer match the physical addresses in the memory space. For example, if subprogram A expects to be loaded from logical storage address 0 but is actually loaded from physical address 10, the logical storage addresses in its address space need to be converted into the corresponding physical addresses in the memory space through address mapping.
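
The address mapping in the subprogram A example can be sketched in Python as a simple base-offset translation. The function name `to_physical`, the `size` parameter, and the value 100 for the size of the address space are assumptions introduced for illustration; the text only fixes the logical start 0 and the physical load address 10.

```python
def to_physical(logical_addr: int, load_base: int, size: int) -> int:
    """Map a logical storage address to a physical address by base offset."""
    if not 0 <= logical_addr < size:
        raise ValueError("logical address outside the subprogram's address space")
    return load_base + logical_addr

# Subprogram A is compiled from logical address 0 but loaded at physical address 10.
phys_start = to_physical(0, load_base=10, size=100)
phys_other = to_physical(7, load_base=10, size=100)
```

Here `phys_start` is 10 and `phys_other` is 17, while a logical address beyond the assumed size is rejected, which reflects the isolation between programs mentioned above.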

In the RDMA technique, a first processor allocates a segment of memory space in a memory in advance, where the memory space is used to store data related to RDMA operation, and then sends a target page table to an RNIC through a virtual address registration process, where the target page table is used to store a corresponding relationship between a virtual memory address and a physical memory address corresponding to the memory space, so that the RNIC can determine, according to the target page table, a virtual memory address and a physical memory address corresponding to the memory space for storing data related to RDMA operation, and the RNIC can also determine, according to the target page table, a physical memory address corresponding to a certain virtual memory address. For example, a processor uses a memory space of 1001-3000 physical addresses for storing data related to RDMA operations, with corresponding virtual memory addresses of 1-2000, and illustratively sends a page table as shown in Table 1 to the RNIC.

TABLE 1

| Virtual memory address | Physical address |
| --- | --- |
| 1 | 1001 |
| 2 | 1002 |
| ... | ... |
| 2000 | 3000 |
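
The lookup that the RNIC performs against the target page table can be sketched in Python. The sketch assumes the linear mapping of the example (virtual memory addresses 1-2000 to physical addresses 1001-3000); the names `page_table` and `translate` are hypothetical.

```python
# Hypothetical page table registered with the RNIC:
# virtual memory addresses 1..2000 map to physical addresses 1001..3000.
page_table = {v: v + 1000 for v in range(1, 2001)}

def translate(virtual_addr: int) -> int:
    """Return the physical address registered for a virtual memory address."""
    try:
        return page_table[virtual_addr]
    except KeyError:
        raise ValueError(f"virtual address {virtual_addr} is not registered") from None

first = translate(1)
last = translate(2000)
```

An address outside the registered range is rejected, since the RNIC has no mapping for it.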

In the virtual memory address registration process, in addition to sending the target page table to the RNIC, the processor specifies an access flag for the registered address space. The access flag indicates an attribute of the storage space corresponding to the address space. For example, the attribute may be remote readable (remote read), meaning that data can be read from the storage space; remote writable (remote write), meaning that data can be written to the storage space; or remote readable and writable, meaning that data can be both read from and written to the storage space; and so on. In the registration process, the processor sends the starting virtual memory address of the address space, the length of the address space, and the access flag of the address space to the RNIC, so as to inform the RNIC of the attribute of the storage space corresponding to each address space and of the virtual memory addresses it contains. For example, the processor uses the memory space with physical addresses 1001-3000 to store data related to RDMA operations, with corresponding virtual memory addresses 1-2000. The processor designates the storage space corresponding to virtual memory addresses 1-500 as readable, the storage space corresponding to 501-1000 as writable, and the storage space corresponding to 1001-1500 as readable and writable, and may then send the information shown in table 2 to the RNIC.

TABLE 2

| Starting virtual memory address | Length of address space | Access flag |
| --- | --- | --- |
| 1 | 500 | Readable |
| 501 | 500 | Writable |
| 1001 | 500 | Readable and writable |
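
The registration information of the example (start address, length, access flag) can be modelled in Python as a list of regions that the RNIC searches to find the attribute of a given virtual memory address. The `Region` class and the `access_flag` function are hypothetical names used only for this sketch.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass(frozen=True)
class Region:
    start: int     # starting virtual memory address
    length: int    # length of the address space
    access: str    # access flag

# The three address spaces of the example above.
REGIONS = [
    Region(1, 500, "readable"),
    Region(501, 500, "writable"),
    Region(1001, 500, "readable and writable"),
]

def access_flag(virtual_addr: int) -> Optional[str]:
    """Return the access flag of the region containing the address, if any."""
    for r in REGIONS:
        if r.start <= virtual_addr < r.start + r.length:
            return r.access
    return None
```

For instance, `access_flag(501)` yields `"writable"`, and an address that falls in no registered region yields `None`.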

After the virtual memory address registration process is performed, the first device may send, to the second device, information related to the address space registered in the virtual memory address registration process through an RDMA send operation, so that the second device knows the virtual memory addresses corresponding to the storage space in the first device that is used to store data related to RDMA operations. The information related to the address space includes the starting virtual memory address of the registered address space, the length of the address space, and the access flag.

In the embodiment of the application, the data persistence flag is used for indicating that the first data is data to be persisted. Depending on the network characteristics of an RDMA network, there may be two possible scenarios for data persistence marking:

1) In an RDMA network, the RNIC of a computing device performs the operation corresponding to the RDMA operation instruction it obtains by parsing. On this basis, in a first possible implementation of the embodiment of the present application, a write persistence instruction may be added to the set of RDMA operation instructions. The write persistence instruction indicates that the first data is data that needs to be written and memory-persisted; that is, the data persistence flag is the write persistence instruction.

In the case where the data persistence flag is a write persistence instruction, the RDMA write persistence request may include the write persistence instruction, a source virtual memory address of the first data, and a destination virtual memory address of the first data.

2) In a second possible implementation of the embodiment of the present application, the data persistence flag may also be a destination storage address corresponding to the first data, where a storage space corresponding to the destination storage address is used to store the first data, the destination storage address is a persistent storage address in the first device, and the storage space corresponding to the persistent storage address is used to store data to be persisted.

As described above, the processor of the first device may specify, through the virtual memory address registration process, an address space corresponding to a storage space for storing data related to RDMA operations and an access flag of that address space, and after performing the registration process, the first device may transmit the registered information to the second device through an RDMA send operation. On this basis, a remote write persistence flag can be added as a possible access flag; the remote write persistence flag indicates that the storage space corresponding to the address space is used for storing data to be persisted. The destination storage address may then be a destination virtual memory address, namely a logical storage address in an address space that the first processor registered with the remote write persistence flag as its access flag. Because the storage space corresponding to such an address space is used for storing data to be persisted, the storage space corresponding to the destination virtual memory address is used for storing data to be persisted.

In the case where the data persistence flag is the destination virtual memory address of the first data, the RDMA write persistence request may include a write instruction, a source virtual memory address of the first data, and the destination virtual memory address of the first data, where the destination virtual memory address is a logical storage address in an address space that the first processor registered, through the virtual memory address registration process, with the remote write persistence flag as its access flag.
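
The two forms of the data persistence flag described above can be sketched as a single Python check. The opcode strings, the region list `REGISTERED`, and the flag names are all hypothetical; the embodiment does not define concrete encodings.

```python
WRITE, WRITE_PERSIST = "write", "write_persist"   # hypothetical opcode names

# Hypothetical registration result: (start, length, access flag).
REGISTERED = [
    (1, 500, "remote_write"),
    (501, 500, "remote_write_persist"),
]

def is_to_be_persisted(opcode: str, dst_addr: int) -> bool:
    """Decide whether a request carries a data persistence flag."""
    # Implementation 1: the operation instruction itself is the flag.
    if opcode == WRITE_PERSIST:
        return True
    # Implementation 2: an ordinary write whose destination virtual memory
    # address lies in a region registered with the write persistence flag.
    if opcode == WRITE:
        return any(
            start <= dst_addr < start + length and flag == "remote_write_persist"
            for start, length, flag in REGISTERED
        )
    return False
```

With this sketch, a `write_persist` request is always treated as data to be persisted, while a plain `write` is persisted only when it targets addresses 501-1000.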

S802, the second RNIC generates an RDMA write request according to the RDMA write persistence request, the RDMA write request including the first data and the data persistence flag.

In a specific implementation, the second RNIC may create, in the SQ, a WQE corresponding to the RDMA write persistence request, where the WQE corresponding to the RDMA write persistence request may include a source virtual memory address of the first data, an RDMA operation instruction, and a destination virtual memory address of the first data; and then when the WQE corresponding to the RDMA write persistence request is scheduled, acquiring first data from a source virtual memory address according to the source virtual memory address of the first data, and packaging the first data, a destination virtual memory address of the first data and the RDMA operation instruction into an RDMA transmission message to form the RDMA write request, wherein the RDMA transmission message refers to a message transmitted between the first RNIC and the second RNIC. The operation of the second RNIC generating the RDMA write request according to the RDMA write persistence request may be specifically performed by a scheduling module of the second RNIC.

If the data persistence flag is a write persistence instruction, the RDMA write request may include the write persistence instruction, the destination virtual memory address of the first data, and the first data, and the RDMA write request may also be referred to as a write persistence request. If the data persistence flag is the destination virtual memory address of the first data, the RDMA write request may include a write instruction, the destination virtual memory address of the first data, and the first data, where the destination virtual memory address is a persistent virtual memory address registered by the first processor through the virtual memory address registration process.

Optionally, the RDMA write request may further include a sequence number of the first data and a QP sequence number of the first device. The sequence number of the first data uniquely identifies the first data during transmission between the first device and the second device, which makes it easier to detect lost or duplicated data packets; the QP sequence number of the first device (the remote device) identifies the unique channel between the local device and the remote device.
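
How the second RNIC might assemble the RDMA write request from a scheduled WQE can be sketched in Python. The `RdmaWriteRequest` fields, the `build_write_request` helper, and the use of a dictionary to stand in for local memory are assumptions for illustration; the actual message format depends on the transport protocol in use.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class RdmaWriteRequest:
    opcode: str      # RDMA operation instruction
    dst_addr: int    # destination virtual memory address of the first data
    payload: bytes   # the first data
    seq: int         # optional: sequence number of the first data
    dest_qp: int     # optional: QP sequence number of the first device

def build_write_request(wqe: dict, local_memory: dict, seq: int, dest_qp: int) -> RdmaWriteRequest:
    """Fetch the data at the source address and package it with the WQE fields."""
    payload = local_memory[wqe["src"]]
    return RdmaWriteRequest(wqe["opcode"], wqe["dst"], payload, seq, dest_qp)

local_memory = {0xA: b"first data"}                     # source address -> data
wqe = {"opcode": "write_persist", "src": 0xA, "dst": 0xB}
req = build_write_request(wqe, local_memory, seq=1, dest_qp=7)
```

Note that the source virtual memory address is used only locally to fetch the data; it is not carried in the request.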

S803, the second RNIC sends the RDMA write request to the first RNIC, and the first RNIC receives the RDMA write request.

In an embodiment of the application, the second RNIC may send the RDMA write request to the first RNIC based on the InfiniBand (IB) protocol; the second RNIC may also send the RDMA write request to the first RNIC based on the RDMA over Converged Ethernet (RoCE) protocol; the second RNIC may also send the RDMA write request to the first RNIC based on the Internet Wide Area RDMA Protocol (iWARP), which runs over the Transmission Control Protocol (TCP).

In a specific implementation, the second RNIC sends the RDMA write request to the first RNIC through a sending module of the second RNIC, and the first RNIC receives the RDMA write request through a receiving module of the first RNIC.

After receiving the RDMA write request, the receiving module of the first RNIC places the RDMA write request in an RQ corresponding to the first RNIC, and waits for the scheduling module of the first RNIC to schedule it. Under the condition that the dispatching module of the first RNIC dispatches the RDMA write request, the dispatching module of the first RNIC analyzes the RDMA write request to obtain first data, a destination virtual memory address of the first data and an RDMA operation instruction.

When the data persistence flag is the write persistence instruction, the scheduling module of the first RNIC may send the write persistence instruction to the PM module of the first RNIC, and the PM module of the first RNIC determines that the first data is data to be persisted according to the write persistence instruction, and performs step S805; the scheduling module of the first RNIC determines that the first data needs to be written into the destination virtual memory address according to the destination virtual memory address of the first data and the write persistence instruction, and the scheduling module of the first RNIC executes step S804.

Under the condition that the data persistence flag is a destination virtual memory address of the first data, the scheduling module of the first RNIC may send the destination virtual memory address of the first data to the PM module, and since the destination virtual memory address of the first data is a logical storage address in an address space where an access flag registered by the first processor through a virtual memory address registration process is a remote write persistence flag, the PM module of the first RNIC determines that the first data is data to be persisted, and performs step S805; the scheduling module of the first RNIC determines that the first data needs to be written into the destination virtual memory address according to the destination virtual memory address of the first data and the write instruction, and the scheduling module of the first RNIC executes step S804.

S804, the first RNIC sends a DMA write request to the first processor, and the first processor receives the DMA write request, wherein the DMA write request comprises first data.

The operation of the first RNIC sending a DMA write request to the first processor may be performed by the scheduling module of the first RNIC.

Here, the first RNIC sends the DMA write request to the first processor to write the first data in DMA to the non-volatile memory of the first device.

S805, the first RNIC instructs the first processor to save the first data to the non-volatile memory of the first device.
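
Steps S804 and S805 on the remote side can be illustrated with a simplified Python model. The `Processor` class, its two dictionaries standing in for volatile buffers and nonvolatile memory, and the `handle_rdma_write` function are hypothetical; the sketch uses the write persistence instruction as the data persistence flag and ignores the differences between the two data flows.

```python
class Processor:
    """Hypothetical model of the first processor and its memories."""

    def __init__(self):
        self.volatile = {}      # data still buffered on the way to memory
        self.nonvolatile = {}   # data that has reached nonvolatile memory

    def dma_write(self, addr: int, data: bytes) -> None:
        self.volatile[addr] = data

    def persist(self, addr: int) -> None:
        self.nonvolatile[addr] = self.volatile.pop(addr)

def handle_rdma_write(cpu: Processor, opcode: str, dst_addr: int, data: bytes) -> None:
    cpu.dma_write(dst_addr, data)       # S804: DMA write request
    if opcode == "write_persist":       # data persistence flag present
        cpu.persist(dst_addr)           # S805: instruct the processor to persist

cpu = Processor()
handle_rdma_write(cpu, "write_persist", 0xB, b"first data")
handle_rdma_write(cpu, "write", 0xC, b"other data")
```

After these two calls, only the data written with the persistence flag is in the modelled nonvolatile memory; the other data remains buffered.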

As can be seen from the steps shown in fig. 9, after receiving an RDMA write request, the RNIC of the remote device determines, according to the data persistence flag in the RDMA write request, that the data in the RDMA write request is data to be memory-persisted, and directly instructs the first processor to store the first data in the nonvolatile memory of the first device. This is equivalent to combining the RDMA write request and the RDMA persistence request into one request, so the operation of initiating a separate RDMA persistence request is omitted. Because each request occupies a certain amount of network bandwidth, omitting this operation saves the bandwidth of one request and reduces the load on the RDMA network. In addition, because the RDMA write request and the RDMA persistence request are merged into one request, the data write operation and the memory persistence operation become two operations that must be executed consecutively, which guarantees that the data is stored in the nonvolatile memory of the remote device after being written into the remote device and avoids data inconsistency.

As can be seen from the foregoing, data received by an RNIC must pass through the processor corresponding to the RNIC before it is written into the non-volatile memory of the computing device, and along the way the data may be temporarily buffered at various levels of the processor, for example because the bus is occupied. Because processors differ in architecture, data can take two flow directions in the processor; that is, the first data can take two different flow directions in the first processor. In this embodiment, corresponding to the different data flow directions described with reference to fig. 1, the first RNIC instructs the first processor in different ways to store the first data in the nonvolatile memory of the first device. The flows of the data processing method corresponding to the two data flow directions are described below with reference to fig. 10 and fig. 11.

Fig. 10 is a schematic flowchart of another data processing method according to an embodiment of the present application, where the flow is applicable to a case where the data flow is the first data flow, where the first RNIC and the first processor are an RNIC and a processor of a first device, respectively, and the first device is a remote device; the second RNIC and the second processor are an RNIC and a processor, respectively, of a second device, which is a local device. As shown, the method includes the following steps:

S901, the second processor sends an RDMA write persistence request to the second RNIC, the second RNIC receives the RDMA write persistence request, and the RDMA write persistence request includes a data persistence flag.

S902, the second RNIC generates an RDMA write request from the RDMA write persistence request, the RDMA write request including the first data and the data persistence flag.
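Steps S901 to S902 can be illustrated with a minimal sketch of how a write request might carry both the first data and the data persistence flag. The header layout, the opcode value, and the flag bit below are assumptions made for illustration only; they come neither from this application nor from the RDMA specification.

```python
import struct

# Hypothetical wire layout (an assumption, not the real RDMA header):
# 1-byte opcode, 1-byte flags, 4-byte PSN, followed by the payload.
HEADER = struct.Struct("!BBI")
OPCODE_RDMA_WRITE = 0x0A   # illustrative value only
FLAG_PERSIST = 0x01        # illustrative bit standing in for the data persistence flag


def build_write_request(psn: int, data: bytes, persist: bool) -> bytes:
    """Second RNIC side (S902): wrap the first data and the flag in one request."""
    flags = FLAG_PERSIST if persist else 0
    return HEADER.pack(OPCODE_RDMA_WRITE, flags, psn) + data


def parse_write_request(packet: bytes):
    """First RNIC side: recover the PSN, the payload, and whether to persist."""
    opcode, flags, psn = HEADER.unpack_from(packet)
    if opcode != OPCODE_RDMA_WRITE:
        raise ValueError("not an RDMA write request")
    return psn, packet[HEADER.size:], bool(flags & FLAG_PERSIST)
```

Because the flag travels inside the write request itself, the receiver can decide to persist the data without waiting for a second request.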

S903, the second RNIC sends the RDMA write request to the first RNIC, and the first RNIC receives the RDMA write request.

S904, the first RNIC sends a DMA write request to the first processor, and the first processor receives the DMA write request, wherein the DMA write request comprises first data.

Here, for the implementation of steps S901 to S904, reference may be made to the description of steps S801 to S804; details are not repeated here. The PM module of the first RNIC determines, according to the data persistence flag, that the first data is data to be persisted, and then executes step S905.

S905, the first RNIC adds the DMA LSB read request to the RQ corresponding to the second RNIC.

Here, the PM module of the first RNIC may determine an RQ corresponding to the second RNIC from the QP of the first device in the RDMA write request and then add the DMA LSB read request in the RQ. The DMA LSB read request is a request corresponding to a DMA read operation, and the virtual memory address corresponding to the DMA LSB read request may be a logical storage address in any segment of address space registered by the first processor through the virtual memory address registration process. The virtual memory address corresponding to the DMA LSB read request refers to a virtual memory address corresponding to a memory space to be read by the DMA LSB read operation corresponding to the DMA LSB read request.

After acquiring the DMA LSB read request, the scheduling module of the first RNIC performs step S906.

S906, the first RNIC sends the DMA LSB read request to the first processor, and the first processor receives the DMA LSB read request.

Here, if the architecture of the first processor is a processor architecture in which the flow of data in the processor is related to the DDIO function in the IIO and the message that the RNIC sends to the IIO, the first RNIC sets the NS flag to 1 in the DMA LSB read request.

By initiating a DMA LSB read request to the first processor, all write operations by the first processor prior to the DMA LSB read request can be completed, thereby enabling all data that has not been written to the nonvolatile memory to be written to the nonvolatile memory, ensuring that the first data written by the DMA write request prior to initiating the DMA LSB read request can be written to the nonvolatile memory, and completing memory persistence of the first data.
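The flushing effect of the DMA LSB read can be sketched with a toy model. The sketch assumes that a read is served only after every earlier write has been drained to non-volatile memory; the class and function names are hypothetical and do not model real hardware.

```python
class ProcessorModel:
    """Toy model: DMA writes may sit in a buffer until something forces them out."""

    def __init__(self):
        self.pending = []   # (address, data) writes not yet in non-volatile memory
        self.nvm = {}       # contents of the non-volatile memory

    def dma_write(self, address, data):
        # S904: the write is accepted but may still be buffered.
        self.pending.append((address, data))

    def dma_read(self, address):
        # S906: assumed ordering rule, a read is served only after all
        # earlier writes have been drained to non-volatile memory.
        for addr, data in self.pending:
            self.nvm[addr] = data
        self.pending.clear()
        return self.nvm.get(address)


def handle_write_request(cpu, address, data, persist, flush_address=0):
    """First RNIC side: issue the DMA write, then a flushing read if flagged."""
    cpu.dma_write(address, data)
    if persist:
        cpu.dma_read(flush_address)   # the value read back is not needed
    return address in cpu.nvm         # True once the data is in non-volatile memory
```

In this model the read may target any registered address, which mirrors the statement above that the address of the DMA LSB read request can lie in any registered address space.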

Further, after completing the memory persistence of the first data, the first RNIC may inform that the memory persistence of the first data is complete by sending an acknowledgement message to the second RNIC. The method shown in fig. 10 may further include:

S907, the first RNIC sends a persistence confirmation message corresponding to the first data to the second RNIC, and the second RNIC receives the persistence confirmation message corresponding to the first data.

The persistent acknowledgement message corresponding to the first data includes a second PSN, which is a sequence number of the first data.

In a specific implementation, the first RNIC sends the persistent acknowledgement message corresponding to the first data to the second RNIC through a sending module of the first RNIC, and the second RNIC receives the persistent acknowledgement message corresponding to the first data through a receiving module of the second RNIC.

The scheduling module of the first RNIC acquires the sequence number of the first data from the RDMA write request and uses it as the second PSN, encapsulates the second PSN into an RDMA transmission message to form a confirmation message, and sends the confirmation message to the second RNIC through the sending module of the first RNIC. After receiving the confirmation message, the receiving module of the second RNIC passes it to the scheduling module of the second RNIC, which parses the confirmation message to obtain the second PSN and determines, according to the second PSN, that the confirmation message is the persistence confirmation message corresponding to the first data.

S908, the second RNIC generates a CQE corresponding to the RDMA write persistence request.

Here, the scheduling module of the second RNIC determines that the first data has been stored in the non-volatile memory of the first device according to the persistence confirmation message corresponding to the first data, and the scheduling module of the second RNIC determines a WQE corresponding to the first data according to the second PSN, acquires the content in the WQE to generate a CQE corresponding to an RDMA write persistence request, where the CQE corresponding to the RDMA write persistence request may include a source virtual memory address and/or a second PSN and/or a destination virtual memory address of the first data.

S909, the second processor obtains the CQE corresponding to the RDMA write persistence request, and determines that the first data is already stored in the nonvolatile memory of the first device.

Here, the second processor may determine that the first data has been stored in the non-volatile memory of the first device according to the contents of the CQE. For example, the content of the CQE is a source virtual address of the first data, and the second processor determines that the data stored in the source virtual memory address has been stored in the nonvolatile memory of the first device, that is, the first data has been stored in the nonvolatile memory of the first device. And after the first data is stored in the nonvolatile memory of the first equipment, finishing memory persistence of the first data.
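Steps S907 to S909 on the local side can be sketched as a lookup from PSN to WQE followed by CQE generation. This is a minimal sketch; the field names and the `RequesterRnic` class are hypothetical and only illustrate the bookkeeping described above.

```python
from dataclasses import dataclass


@dataclass
class WQE:
    psn: int
    src_addr: int   # source virtual memory address of the first data
    dst_addr: int   # destination virtual memory address of the first data


@dataclass
class CQE:
    psn: int
    src_addr: int
    dst_addr: int


class RequesterRnic:
    """Toy model of the second RNIC's scheduling module."""

    def __init__(self):
        self.outstanding = {}   # PSN -> WQE awaiting a persistence confirmation
        self.cq = []            # completion queue polled by the second processor

    def post_write_persist(self, psn, src_addr, dst_addr):
        self.outstanding[psn] = WQE(psn, src_addr, dst_addr)

    def on_persistence_ack(self, psn):
        # S908: find the WQE by the PSN in the confirmation and turn it into a CQE.
        wqe = self.outstanding.pop(psn, None)
        if wqe is None:
            return False        # no matching request; ignore the confirmation
        self.cq.append(CQE(wqe.psn, wqe.src_addr, wqe.dst_addr))
        return True
```

The second processor then only needs to poll `cq` to learn which source address has been persisted remotely.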

As can be seen from the steps shown in fig. 10, for the case where data does not pass through the LLC of the processor, the local device only needs to initiate one RDMA write request for the data in that request to be written into the nonvolatile memory of the remote device. Compared with the flow shown in fig. 2, one one-sided operation is omitted, which reduces the load on the processor of the local device, saves the bandwidth that initiating an RDMA persistence request would occupy, and reduces the network load.

Fig. 11 is a schematic flowchart of another data processing method according to an embodiment of the present application, where the flow is applicable to a case where the data flow is the second data flow, where the first RNIC and the first processor are an RNIC and a processor of a first device, respectively, and the first device is a remote device; the second RNIC and the second processor are an RNIC and a processor, respectively, of a second device, which is a local device. As shown, the method includes the following steps:

S1001, the second processor sends an RDMA write persistence request to the second RNIC, the second RNIC receives the RDMA write persistence request, and the RDMA write persistence request includes a data persistence flag.

S1002, the second RNIC generates an RDMA write request according to the RDMA write persistence request, the RDMA write request including the first data and the data persistence flag.

S1003, the second RNIC sends the RDMA write request to the first RNIC, and the first RNIC receives the RDMA write request.

S1004, the first RNIC sends a DMA write request to the first processor, and the first processor receives the DMA write request, wherein the DMA write request comprises first data.

Here, the implementation of steps S1001 to S1004 can refer to the description of steps S801 to S804, and will not be described herein again. After determining that the first data is the data to be persisted according to the data persistence flag, the PM module of the first RNIC executes steps S1006 to S1007.

S1005, the first processor sends a first RDMA receiving request to the first RNIC, and the first RNIC receives the first RDMA receiving request.

Since the data in the first processor passes through the LLC, the first processor must write the data in the LLC back to the non-volatile memory; that is, the first processor must participate, and writing data in the LLC back to the non-volatile memory of the remote device involves an RDMA bilateral operation. For an RDMA bilateral operation, the remote device must first initiate an RDMA receive operation, so the first processor sends the first RDMA receive request to the first RNIC in order to receive an RDMA send request sent by the second device.

S1006, the first RNIC generates a WQE corresponding to the first RDMA receive request according to the first data.

In one possible implementation manner, the WQE corresponding to the first RDMA receive request may include a destination virtual memory address of the first data; in another possible implementation, the WQE corresponding to the first RDMA receive request may include a sequence number of the first data; in yet another possible implementation, the WQE corresponding to the first RDMA receive request may include a destination virtual memory address of the first data and a sequence number of the first data.

S1007, the first RNIC clears the WQE corresponding to the first RDMA receive request and generates a CQE corresponding to the first RDMA receive request.

In a possible implementation manner, the PM module of the first RNIC may obtain the content of the WQE from the WQE corresponding to the first RDMA receive request, and then generate a CQE corresponding to the first RDMA receive request according to the obtained content of the WQE, where the CQE corresponding to the first RDMA receive request may include the destination virtual memory address of the first data and/or the sequence number of the first data.

In steps S1006 to S1007, the operation performed by the PM module of the first RNIC replaces the RDMA send request that the second device would otherwise initiate, and achieves the same effect as the second device initiating an RDMA send request to instruct that the first data be stored into the non-volatile memory of the first device.

S1008, the first processor acquires a CQE corresponding to the first RDMA receiving request, and stores the first data in a nonvolatile memory of the first device.

In a case that a CQE corresponding to the first RDMA receive request includes a destination virtual memory address of the first data, the first processor may find the first data in the LLC according to the destination virtual memory address of the first data, thereby storing the first data in the non-volatile memory of the first device; in a case where the CQE corresponding to the first RDMA receive request includes a sequence number of the first data, the first processor may find the first data in the LLC according to the sequence number of the first data, thereby storing the first data in the non-volatile memory of the first device.
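Steps S1005 to S1008 can be sketched with a toy model of the remote device, in which the RNIC completes a posted receive request locally and the processor then writes the matching data back from the LLC. All names are hypothetical, and the dictionaries merely stand in for the LLC and the non-volatile memory.

```python
class RemoteDevice:
    """Toy model of the first device for the flow in which data passes through the LLC."""

    def __init__(self):
        self.llc = {}         # destination address -> data still held in the LLC
        self.nvm = {}         # destination address -> data in non-volatile memory
        self.recv_wqes = []   # receive requests posted by the first processor
        self.cq = []          # completion queue polled by the first processor

    def post_receive(self):
        # S1005: the first processor posts an RDMA receive request.
        self.recv_wqes.append({})

    def rnic_complete_receive(self, dst_addr, psn):
        # S1006-S1007: the RNIC fills a posted WQE with the description of the
        # first data, clears it from the queue, and generates the CQE itself.
        if not self.recv_wqes:
            raise RuntimeError("no receive request posted")
        wqe = self.recv_wqes.pop(0)
        wqe.update(dst_addr=dst_addr, psn=psn)
        self.cq.append(dict(wqe))

    def processor_poll(self):
        # S1008: the processor locates the data by address and writes it back.
        persisted = []
        while self.cq:
            cqe = self.cq.pop(0)
            addr = cqe["dst_addr"]
            if addr in self.llc:
                self.nvm[addr] = self.llc[addr]
                persisted.append(cqe["psn"])
        return persisted
```

No send request crosses the network in this model; the completion is produced entirely on the remote side, which is the point of steps S1006 to S1007.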

In an alternative embodiment, after receiving the first data, the first RNIC may send an acknowledgement message to the second RNIC to inform the second RNIC that the first RNIC has received the first data. In that case, after step S1003, the method further includes:

S1009, the first RNIC sends a reception acknowledgement message corresponding to the first data to the second RNIC, and the second RNIC receives the reception acknowledgement message corresponding to the first data.

Here, the reception acknowledgement message corresponding to the first data includes the first PSN, which is a sequence number of the first data, and the reception acknowledgement message corresponding to the first data indicates that the first RNIC receives the first data.

In a specific implementation, the first RNIC sends a reception confirmation message corresponding to the first data to the second RNIC through a sending module of the first RNIC, and the second RNIC receives the reception confirmation message corresponding to the first data through a receiving module of the second RNIC.

The scheduling module of the first RNIC acquires a sequence number of first data from the RDMA write request, the sequence number is used as a first PSN, the first PSN is packaged into the RDMA transmission message to form a confirmation message, and the confirmation message is sent to the second RNIC through the sending module of the first RNIC. And after receiving the confirmation message, the receiving module of the second RNIC sends the confirmation message to the scheduling module of the second RNIC, the scheduling module of the second RNIC analyzes the confirmation message to obtain a first PSN, and the scheduling module of the second RNIC determines that the confirmation message is a reception confirmation message corresponding to the first data according to the first PSN.

S1010, the second RNIC generates a CQE corresponding to the RDMA write persistence request.

S1011, the second processor acquires the CQE corresponding to the RDMA write persistence request, and determines that the first data transmission is completed.

The detailed implementation of steps S1010 to S1011 can refer to the description of steps S908 to S909, and will not be described herein again.

In the foregoing step, if the first RNIC initiates an RDMA send request operation instead of the second device, the second processor may initiate an RDMA receive request to receive the RDMA send request initiated by the first processor after determining that the first data transfer is completed, and after step S1011, the method may further include:

S1012, the second processor initiates a second RDMA receive request to the second RNIC, and the second RNIC receives the second RDMA receive request.

S1013, the second RNIC generates a WQE corresponding to the second RDMA receive request.

In an alternative embodiment, after the first processor stores the first data in the nonvolatile memory of the first device, the first RNIC may notify that the memory persistence of the first data is completed by sending an acknowledgement message to the second RNIC, and then after step S1008, the method may further include:

S1014, the first processor sends the RDMA send request to the first RNIC, and the first RNIC receives the RDMA send request.

Here, the RDMA send request is used to indicate that the first data has been stored in the non-volatile memory of the first device, and the RDMA send request may include a sequence number of the first data.

S1015, the first RNIC generates a WQE corresponding to the RDMA send request.

S1016, the first RNIC sends a persistence confirmation message corresponding to the first data to the second RNIC, and the second RNIC receives the persistence confirmation message corresponding to the first data.

Here, the persistent acknowledgement message corresponding to the first data includes the second PSN, which is a sequence number of the first data.

In a specific implementation, the first RNIC sends the persistent acknowledgement message corresponding to the first data to the second RNIC through a sending module of the first RNIC, and the second RNIC receives the persistent acknowledgement message corresponding to the first data through a receiving module of the second RNIC.

The scheduling module of the first RNIC acquires the sequence number of the first data from the RDMA send request as the second PSN, encapsulates the second PSN into an RDMA transmission message to form a confirmation message, and sends the confirmation message to the second RNIC through the sending module of the first RNIC. After receiving the confirmation message, the receiving module of the second RNIC passes it to the scheduling module of the second RNIC, which parses the confirmation message to obtain the second PSN and determines that the confirmation message is the persistence confirmation message corresponding to the first data.

Upon receiving the persistence confirmation message corresponding to the first data transmitted by the first RNIC, the second RNIC performs step S1018.

S1017, the first RNIC clears the WQE corresponding to the RDMA send request and generates a CQE corresponding to the RDMA send request.

S1018, the second RNIC generates a CQE corresponding to the second RDMA receive request.

S1019, the second processor acquires a CQE corresponding to the second RDMA receive request, and determines that the first data is stored in the nonvolatile memory of the first device.

And after the first data is stored in the nonvolatile memory of the first equipment, finishing memory persistence of the first data.

As can be seen from the steps shown in fig. 11, in the embodiment of the present application, for the case where data passes through the LLC of the processor of the remote device, the processor of the local device only needs to send one more bilateral operation after sending the unilateral operation to store the data in the nonvolatile memory of the remote device, and compared with the flow shown in fig. 3, one bilateral operation is omitted, which reduces the load of the processor of the local device, and at the same time, saves the bandwidth occupied by initiating the RDMA persistent request, and reduces the network load.

The processes shown in fig. 10 and fig. 11 describe saving one piece of data to the nonvolatile memory of the first device; the same scheme may also be used to save a plurality of data to the nonvolatile memory of the first device. The RDMA transfer protocol specifies that the RNIC of the local device sends the next data to the RNIC of the remote device only after determining that the transmission of the previous data is completed; that is, the RNIC of the local device sends the next data to the remote device after receiving the ACK message for the previous data sent by the RNIC of the remote device. In a possible implementation manner, for two consecutive data, the memory persistence process of the previous data and the write process of the next data can be performed in parallel, so as to improve the transmission efficiency.

One specific implementation of performing the memory persistence process of the previous data in parallel with the write process of the next data is as follows. After receiving the RDMA write request, the first RNIC sends a second confirmation message to the second RNIC, where the second confirmation message carries the PSN of the first data. When the second RNIC receives the second confirmation message, this is the first time it has received a confirmation message carrying the PSN of the first data, so it determines, according to that PSN, that the message is the reception confirmation message corresponding to the first data and that the first RNIC has received the first data; the second RNIC then sends the RDMA write request corresponding to the next data to the first RNIC. In addition, the second RNIC buffers the second confirmation message. When it later receives a first confirmation message that also carries the PSN of the first data, the second RNIC compares that PSN with the PSN in the buffered second confirmation message, determines that a confirmation message carrying the PSN of the first data has been received for the second time and is therefore the persistence confirmation message corresponding to the first data, and then performs step S908 or S1018 described above.

In the RDMA transfer protocol, a confirmation message only conveys that something is acknowledged; which request it acknowledges must be determined from the requests sent before it was received or from the order in which confirmation messages arrive. Because the write operation occurs before the memory persistence operation, the first confirmation message received that carries the sequence number of the first data is necessarily the reception confirmation message of the first data. By buffering the reception confirmation message, the RNIC of the second device can send the next data to the RNIC of the first device without waiting for the persistence confirmation message of the first data, and it can later recognize the persistence confirmation message by comparing its PSN with the PSN in the buffered reception confirmation message. The memory persistence of one data and the write operation of the next data can thus be performed in parallel, which improves the efficiency of saving a plurality of data to the nonvolatile memory of the remote device and reduces latency.
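The order-based interpretation of confirmation messages can be sketched as a small tracker on the second RNIC. This is a minimal sketch under the assumption stated above that, for a given PSN, the reception confirmation always arrives before the persistence confirmation; the class name and return values are hypothetical.

```python
class AckTracker:
    """Classifies confirmation messages by how many times their PSN has been seen."""

    def __init__(self):
        self.seen_once = set()   # PSNs whose reception confirmation is buffered

    def on_ack(self, psn):
        if psn not in self.seen_once:
            # First confirmation with this PSN: the data has been received,
            # so the next write request may be sent right away.
            self.seen_once.add(psn)
            return "received"
        # Second confirmation with this PSN: the data has been persisted,
        # so the CQE can now be generated (step S908 or S1018).
        self.seen_once.discard(psn)
        return "persisted"
```

For example, confirmations arriving in the order PSN 1, PSN 2, PSN 1, PSN 2 are classified as received, received, persisted, persisted, which shows the write of the second data overlapping with the persistence of the first.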

The above method may be implemented on an RNIC and a processor of a computing device, and in order to better implement the above method of the embodiment of the present application, the embodiment of the present application further provides a corresponding computing device.

Referring to fig. 12, fig. 12 is a schematic diagram illustrating a structure of a computing device according to an embodiment of the present disclosure, where the computing device 130 includes an RNIC131, a processor 132, and a non-volatile memory 133. The structure of RNIC131 may be as shown in fig. 6, and the structure of processor 132 may be as shown in processor 102 in fig. 1. The non-volatile memory 133 may be an SCM.

RNIC131 is configured to perform the steps performed by the first RNIC in the method embodiments shown in FIGS. 9-11, and processor 132 is configured to perform the steps performed by the first processor in the method embodiments shown in FIGS. 9-11.

Referring to fig. 13, fig. 13 is a schematic diagram illustrating a structure of a computing device 140 according to an embodiment of the present disclosure, where the computing device 140 includes an RNIC141, a processor 142, and a non-volatile memory 143. The structure of RNIC141 may be as shown in fig. 6, and the structure of processor 142 may be as shown in processor 102 in fig. 1. The non-volatile memory 143 may be an SCM, an NVRAM, or an NVDIMM.

RNIC141 is configured to perform the steps performed by the second RNIC in the method embodiments described above with reference to FIGS. 9-11, and processor 142 is configured to perform the steps performed by the second processor in the method embodiments described above with reference to FIGS. 9-11.

The embodiment of the present application further provides a processor, which may be configured as the processor in fig. 1, and is configured to execute the steps performed by the second processor in the method embodiments shown in fig. 9 to fig. 11.

The above embodiments may be implemented in whole or in part by software, hardware, firmware, or any combination thereof. When implemented in software, the above-described embodiments may be implemented in whole or in part in the form of a computer program product. The computer program product includes one or more computer instructions. When the computer instructions are loaded or executed on a computer, the processes or functions described in accordance with the embodiments of the application are produced in whole or in part. The computer may be a general purpose computer, a special purpose computer, a network of computers, or other programmable device. The computer instructions may be stored in a computer readable storage medium or transmitted from one computer readable storage medium to another, for example, from one website, computer, server, or data center to another website, computer, server, or data center via wired (e.g., coaxial cable, optical fiber, Digital Subscriber Line (DSL)) or wireless (e.g., infrared, radio, microwave) means. The computer-readable storage medium can be any available medium that can be accessed by a computer, or a data storage device, such as a server or data center, that contains one or more collections of available media. The usable medium may be a magnetic medium (e.g., floppy disk, hard disk, magnetic tape), an optical medium (e.g., DVD), or a semiconductor medium. The semiconductor medium may be a Solid State Drive (SSD).

It should be noted that the first, second, third, fourth and various numbers related to the embodiments of the present application are merely for convenience of description and are not intended to limit the scope of the embodiments of the present application.

The above description covers only specific embodiments of the present application, but the scope of protection of the present application is not limited thereto. Any change or substitution that a person skilled in the art can readily conceive of within the technical scope disclosed in the present application shall fall within the scope of protection of the present application. Therefore, the scope of protection of the present application shall be subject to the scope of protection of the claims.
