System, method and apparatus for accessing shared memory


Note: This technology, "System, method and apparatus for accessing shared memory," was created by G. Ramagiri, T. P. Ringe, M. Patel, J. Jalal, A. K. Tummala and M. D. Werkheiser on 2020-03-17. Its main content is summarized as follows: A system, apparatus, and method are disclosed for protecting coherent memory content in a coherent data processing network by filtering data access requests and snoop responses based on read/write (R/W) access permissions. Requests are enhanced with access permissions in a memory protection unit, and the access permissions are used to control memory accesses made by the master node of the network.

1. A method for memory protection in a data processing network, the method comprising:

receiving, at a master node of the data processing network, a request message from a first requesting node of the data processing network, wherein the request message comprises an action request for data associated with a first memory address in a shared memory of the data processing network and one or more access rights of the first requesting node for the first memory address;

determining, by the master node, whether the requested action is allowed for the first requesting node according to the one or more access rights;

accessing the data associated with the first memory address from a system cache, a local cache of a second requesting node of the data processing network, or the shared memory according to a coherency protocol when the requested action is allowed for the first requesting node; and

sending a response message to the first requesting node without accessing the data associated with the first memory address when the requested action is not allowed for the first requesting node.

2. The method of claim 1, wherein the first requesting node is coupled to the master node via a first memory protection unit of the data processing network, the method further comprising:

receiving, by the first memory protection unit, the action request from the first requesting node;

determining, by the first memory protection unit, the one or more access rights based on the first memory address; and

enhancing, by the first memory protection unit, the action request with the one or more access rights for the first requesting node.

3. The method of claim 1, wherein the action request comprises a read request, and wherein the response message to the first requesting node comprises dummy data when the one or more access rights do not include a read right.

4. The method of claim 1, wherein the action request comprises a write request for modified data, the method further comprising the master node discarding the modified data when the one or more access rights do not include write rights.

5. The method of claim 4, further comprising the master node invalidating the modified data at the first requesting node when the one or more access rights do not include a write right.

6. The method of claim 1, wherein the action request comprises a read request, and wherein accessing the data associated with the first memory address comprises:

retrieving, by the master node from the second requesting node, modified data associated with the first memory address;

when the second requesting node has write rights to the modified data:

when the first requesting node does not have write rights to the modified data:

writing, by the master node, the modified data to the shared memory at the first memory address to change the modified data to clean data; and

sending, by the master node, the clean data to the first requesting node; and

when the first requesting node has write rights to the modified data:

sending, by the master node, the modified data to the first requesting node; and

when the second requesting node does not have write rights to the modified data:

retrieving, by the master node, clean data from the shared memory at the first memory address;

sending, by the master node, the clean data to the first requesting node; and

invalidating the data associated with the first memory address at the second requesting node.

7. The method of claim 1, wherein the action request comprises a request to invalidate data at the first memory address, for which the first requesting node does not have write permission, and wherein accessing the data associated with the first memory address comprises:

retrieving, by the master node, data associated with the first memory address from the second requesting node;

when the retrieved data is in the modified coherency state:

writing, by the master node, the retrieved data to the shared memory at the first memory address to change the coherency state of the data associated with the first memory address from "modified" to "clean"; and

invalidating the data associated with the first memory address at the second requesting node.

8. The method of claim 1, wherein the second requesting node is coupled to the master node via a second memory protection unit, the method further comprising:

sending, by the master node, a snoop message to the second requesting node;

sending, by the second requesting node, a snoop response including the snooped data;

enhancing, by the second memory protection unit, the snoop response with one or more access permissions for the second requesting node;

receiving, by the master node, the enhanced snoop response;

discarding, by the master node, the snooped data when the one or more access permissions for the second requesting node indicate that the second requesting node does not have read permission for the snooped data; and

discarding, by the master node, the snooped data when the snooped data is modified and the one or more access permissions for the second requesting node indicate that the second requesting node does not have write permission to the modified data.

9. The method of claim 8, further comprising:

invalidating the snooped data at the second requesting node when the master node discards the snooped data and the coherency protocol allows the second requesting node to retain a copy of the snooped data.

10. The method of claim 8, further comprising, when the master node discards the snooped data:

retrieving, by the master node, clean data from the shared memory at the first memory address; and

sending, by the master node, the clean data to the first requesting node.

11. The method of claim 8, further comprising the second memory protection unit:

intercepting the snoop message to the second requesting node;

coloring a transaction identifier in the snoop message with the one or more access permissions of the second requesting node to provide a colored snoop message;

forwarding the colored snoop message to the second requesting node;

intercepting the snoop response from the second requesting node;

de-coloring the transaction identifier in the snoop response; and

forwarding the de-colored snoop response, enhanced with the one or more access rights, to the master node.

12. The method of claim 1, wherein the action request comprises a request to invalidate data associated with the first memory address, and wherein accessing the data associated with the first memory address comprises:

retrieving, by the master node, data in the system cache or the local cache of the second requesting node, wherein the data is in a "modified" coherency state;

when the first requesting node does not have write permission for the data:

writing the data back to the shared memory according to the coherency protocol to change the coherency state of the data from "modified" to "clean";

sending the data to the first requesting node; and

changing the coherency state of the data in the system cache or the local cache of the second requesting node to "invalid".

13. An apparatus, the apparatus comprising:

a plurality of cross-point switches, wherein a first cross-point switch of the plurality of cross-point switches includes a first memory protection unit and provides an interface to a first requesting node;

a master node; and

an interconnect coupled between the plurality of cross point switches, the master node, and a shared memory, wherein the master node provides a coherency point for accessing the shared memory;

wherein the first memory protection unit intercepts a message from the first requesting node destined for the master node, enhances the intercepted message with one or more access rights of the first requesting node, and forwards the enhanced message to the master node; and

wherein the master node responds to the enhanced message in accordance with the one or more access rights.

14. The apparatus of claim 13, wherein the message is associated with a first memory address in the shared memory, and wherein the first memory protection unit is configured to look up the one or more access permissions in an address table of the first memory protection unit according to the first memory address.

15. The apparatus of claim 13, wherein a second cross-point switch of the plurality of cross-point switches comprises a second memory protection unit and provides an interface to a second requesting node, and wherein:

the master node is configured to send a snoop message to the second requesting node in response to the message from the first requesting node;

the second requesting node is configured to send a snoop response containing the snooped data to the master node in response to the snoop message; and

the second memory protection unit is configured to intercept the snoop response, enhance the snoop response with one or more access permissions of the second requesting node, and forward the enhanced snoop response to the master node; and

wherein the master node is further configured to:

discarding the snooped data when the one or more access permissions for the second requesting node indicate that the second requesting node does not have read permission for the snooped data; and

discarding the snooped data when the snooped data is modified and the one or more access permissions for the second requesting node indicate that the second requesting node does not have write permission to the modified data.

16. The apparatus of claim 15, wherein the master node is further configured to retrieve clean data from the shared memory when the snooped data is discarded, and forward the clean data to the first requesting node.

17. The apparatus of claim 13, wherein the master node is further configured to send dummy data to the first requesting node when the message from the first requesting node comprises a read request for data associated with a first memory address in the shared memory and the one or more access rights indicate that the first requesting node does not have read rights for the first memory address.

18. The apparatus of claim 13, wherein the master node is further configured to invalidate the first data at the first requesting node when the message from the first requesting node comprises a write request for first data and the one or more access permissions indicate that the first requesting node does not have write permission for the first memory address and the write request is of a type that allows the first requesting node to retain a copy of the first data.

19. The apparatus of claim 18, wherein the master node is further configured to discard the write request when the one or more access permissions indicate that the first requesting node does not have write permission for the first memory address.

20. A non-transitory computer readable medium containing instructions of a hardware description language describing the apparatus of claim 13.

21. A non-transitory computer readable medium containing a netlist description of the apparatus of claim 13.

22. An integrated circuit comprising the apparatus of claim 13.

Background

The present disclosure relates generally to computer memory and, more particularly, to schemes for accessing shared memory.

In many instruction execution systems, shared memory contents may be compromised by unauthorized requests to read data from, or write data to, the shared memory. Such unauthorized accesses may undesirably corrupt the memory contents.

Drawings

The accompanying drawings provide visual representations which will be used to more fully describe various representative embodiments and can be used by those skilled in the art to better understand the disclosed representative embodiments and their inherent advantages. In the drawings, like reference numerals designate corresponding elements.

FIG. 1 illustrates an exemplary 1x3 mesh CMN (coherent mesh network) system;

FIG. 2 is a block diagram of a data processing system having a Memory Protection Unit (MPU) according to various embodiments of the present disclosure;

FIG. 3 illustrates an example of a Memory Protection Unit (MPU) address area and permissions according to various embodiments of the present disclosure;

FIG. 4 is a signal flow diagram of read and write transactions with access rights according to various embodiments of the present disclosure;

FIG. 5 is a signal flow diagram of read and write transactions without access rights according to various embodiments of the present disclosure;

FIG. 6 is a signal flow diagram of a snoop transaction according to various embodiments of the present disclosure;

FIG. 7 is a signal flow diagram of a further snoop transaction according to various embodiments of the present disclosure;

FIGS. 8-10 are signal flow diagrams of additional access transactions according to various embodiments of the present disclosure;

FIGS. 11A and 11B are flow diagrams illustrating request filtering dependent on access rights according to various embodiments of the present disclosure;

FIG. 12 illustrates an apparatus for filtering requests according to various embodiments of the present disclosure; and

FIGS. 13A-13D illustrate coloring of transaction identifiers by a memory protection unit, according to various embodiments of the present disclosure.

Detailed Description

While this disclosure is capable of embodiments in many different forms, there are shown in the drawings and will herein be described in detail specific embodiments, with the understanding that the present disclosure is to be considered as an example of the principles described and not intended to limit the disclosure to the specific embodiments shown and described. In the description below, like reference numerals are used to describe the same, similar or corresponding parts in the several views of the drawings.

In this document, relational terms such as first and second, top and bottom, and the like may be used solely to distinguish one entity or action from another entity or action without necessarily requiring or implying any actual such relationship or order between such entities or actions. The terms "comprises," "comprising," "includes" or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, the recitation of an element by "comprising" does not exclude the presence of additional like elements in a process, method, article, or apparatus that comprises the element.

Reference throughout this document to "one embodiment," "certain embodiments," "an embodiment," or similar terms means that a particular feature, structure, or characteristic described in connection with the embodiment is included in at least one embodiment of the present disclosure. Thus, the appearances of such phrases or in various places throughout this specification are not necessarily all referring to the same embodiment. Furthermore, the particular features, structures, or characteristics may be combined in any suitable manner in one or more embodiments without limitation.

As used herein, the term "or" is to be interpreted as inclusive, meaning any one or any combination. Thus, "A, B or C" means "any of the following: A; B; C; A and B; A and C; B and C; A, B and C". An exception to this definition occurs only when a combination of elements, functions, steps or acts is in some way inherently mutually exclusive.

For simplicity and clarity of illustration, reference numerals may be repeated among the figures to indicate corresponding or analogous elements. Numerous details are set forth to provide an understanding of the embodiments described herein. Embodiments may be practiced without these specific details. In other instances, well-known methods, procedures, and components have not been described in detail so as not to obscure the embodiments. The description should not be considered as limiting the scope of the embodiments described herein.

The embodiments described herein illustrate how read/write permissions per master per memory region are used to protect shared memory contents from unauthorized visitors in a data processing network.

In accordance with the present disclosure, an improved mechanism for accessing shared memory while preventing unauthorized access is provided.

FIG. 1 is a schematic diagram of an exemplary data processing network 100. In this simple example, the network is configured as a 1x3 mesh CMN (coherent mesh network). A mesh cross point (MXP) provides a cross point in the data processing network and is responsible for routing the protocol packets of a message to the correct node based on a destination node identifier. An example of a CMN is the Arm CoreLink CMN-600 coherent mesh network, designed for a wide range of applications in intelligent connected systems, including network infrastructure, storage, server, HPC, automotive, and industrial solutions. The highly scalable mesh is optimized for Armv8-A processors and can be customized across a wide range of performance points. The data processing network may include a coherent interconnect, such as, for example, the CMN series of products based on the AMBA 5 CHI protocol (Arm, CoreLink and AMBA are registered trademarks of Arm Limited). The interconnect specification identifies devices in the interconnect, as described below.
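For illustration only (this sketch is not part of the patent, and the table and port representations are assumptions), the destination-based routing performed by a cross point can be modeled as a lookup from destination node identifier to output port:

```cpp
// Illustrative sketch: a cross point forwards a protocol packet toward the
// node named by the packet's destination node identifier.
#include <cstdint>
#include <optional>
#include <unordered_map>

struct Packet {
    std::uint16_t dest_node_id;   // identifier of the target RN/HN/SN
    // ... opcode, address, transaction identifier, payload, etc.
};

class CrossPoint {
public:
    void add_route(std::uint16_t node_id, int port) { routes_[node_id] = port; }

    // Return the output port for the packet's destination, if one is configured.
    std::optional<int> route(const Packet& p) const {
        auto it = routes_.find(p.dest_node_id);
        if (it == routes_.end()) return std::nullopt;
        return it->second;
    }

private:
    std::unordered_map<std::uint16_t, int> routes_;   // node id -> output port
};
```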

The network may include one or more requesting nodes that operate as requesting masters and initiate data transactions. An exemplary requesting node is:

RN-F: a fully coherent requesting node, such as a Central Processing Unit (CPU), a coherent Graphics Processing Unit (GPU), or other accelerator operating as a requesting master.

RN-I: an I/O coherent requesting node, for example, to tunnel input/output (I/O) traffic into a Coherent Hub Interface (CHI) or other network interconnect.

RN-D: a Distributed Virtual Memory (DVM) requesting node supporting DVM traffic.

The network may also include one or more master nodes that receive access requests from the requesting nodes. Each master node serves as a point of coherency and serialization for a given set of memory addresses and may include a snoop filter for monitoring data transactions and maintaining a record of which data lines are stored at, or owned by, one or more nodes. When a memory access is received at the master node, a snoop request may be sent to any node that has a copy of the accessed data in its local cache. Exemplary master nodes include a fully coherent master node (HN-F), which services normal memory requests, and an I/O coherent master node (HN-I), which is responsible for servicing I/O requests. Such nodes may include a cache memory and a snoop filter to resolve coherency efficiently, sending snoops only when needed. The cache memory is typically fast Random Access Memory (RAM) that a processor can access faster than its conventional RAM.

Furthermore, the data processing network comprises one or more slave nodes, which service a request from the master node when the request cannot be serviced locally at the master node; otherwise, the request is serviced by the master node that received it. Examples of slave nodes are memory controllers or requesting nodes.

As shown in FIG. 1, an RN-F (fully coherent requesting node) 102 is operably coupled to an MXP (mesh cross point) 104. The MXP 104 is operably coupled to the MXP 108 and the MXP 114. The MXP 108 is operably coupled to an RN-I (I/O coherent requesting node) 106 and an HN-F (fully coherent master node) 110. The MXP 114 is operably coupled to an SN-F (fully coherent slave node) 112 and an HN-D (master node) 118. The requesting nodes 102, 106 access data by sending requests to the master nodes (HN-F/HN-I) 118, 110. Slave node 112 may be, for example, a Dynamic Memory Controller (DMC).

For a read access, the master node 118 looks up the incoming address in its cache memory and snoop filter. If the address is available in the cache memory, the request is serviced by providing the data. If the data is not available in the cache but the address hits in the snoop filter, the master node sends a snoop request to RN-F 102, which holds the cache line, and services the request; if the data is held nowhere in the network, it is read from memory via slave node 112. Depending on the type of snoop request, the snooped RN-F 102 may send the data back to the master node (so that the master node can service the request) or directly to the requesting node 106, in a process known as DCT (direct cache transfer).

For a write access from RN-F 102, the master node 118 checks whether the request is a partial write or a full cache-line write. Depending on the size of the request, the master node 118 may merge the write data with memory data or snooped data. The merged data is either written to memory (the slave node) or filled into the master node's cache, based on the request attributes and whether a cache is present in the master node.
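For illustration only (the 64-byte line size and per-byte write enables are assumptions of this sketch, not taken from the patent), such a merge of partial write data with background data from memory or a snooped cache might be modeled as follows:

```cpp
// Illustrative sketch: merge a partial write into a cache line using
// per-byte enables before the merged line is written to memory or filled
// into the master node's cache.
#include <array>
#include <cstddef>
#include <cstdint>

constexpr std::size_t kLineBytes = 64;
using CacheLine   = std::array<std::uint8_t, kLineBytes>;
using ByteEnables = std::array<bool, kLineBytes>;

// Bytes enabled by the writer take the requester's data; all other bytes
// keep the background value obtained from memory or from a snooped cache.
CacheLine merge_partial_write(const CacheLine& background,
                              const CacheLine& write_data,
                              const ByteEnables& enables) {
    CacheLine merged = background;
    for (std::size_t i = 0; i < kLineBytes; ++i) {
        if (enables[i]) {
            merged[i] = write_data[i];
        }
    }
    return merged;
}
```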

If the request causes any error in the master node, such as a cache access error or a snoop error, the master node completes the request by responding with an error status and may optionally raise an interrupt so that the requesting master knows the access status.

Coherent network protocols, such as the CHI protocol, may specify various action requests such as the following (an illustrative sketch appears after the list):

1. a read request-response with data will include the data.

2. A read request-response with no data will not include data.

3. Write request-write clean/modified data from requester to cache or memory.

4. Cache Maintenance Operation (CMO) -flushing cache lines to memory such as DRAM, I/O primary or downstream caches, and other memory outside of the home node.

5. Atomic request-a child cache line atomic operation is performed in memory.

6. Store request-operation where the cache line is to be stored in the RN-F cache for future access.
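For illustration only, these request categories might be modeled by a simple enumeration; the names below are chosen for this sketch and are not the protocol's literal opcodes.

```cpp
// Illustrative enumeration mirroring the request categories listed above.
enum class RequestType {
    ReadWithData,       // 1. read whose response carries the data
    ReadNoData,         // 2. read whose response carries no data
    Write,              // 3. write clean/modified data to cache or memory
    CacheMaintenance,   // 4. CMO: flush cache lines to downstream memory
    Atomic,             // 5. sub-cache-line atomic operation in memory
    Stash               // 6. place a cache line in an RN-F cache for later use
};

// Simplified helper: of the categories above, only the first is modeled here
// as always returning data in its completion.
constexpr bool completion_carries_data(RequestType t) {
    return t == RequestType::ReadWithData;
}
```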

In a coherent network, various actions are performed to ensure that shared data is maintained in a coherent manner. For example, these actions may ensure that a node does not hold an outdated copy of the data. However, read/write access by unauthorized masters may corrupt memory contents or expose data in many different ways, and simple permission-based filtering of requests cannot, by itself, handle a coherent system. Moreover, in a protected memory system, certain nodes may not be allowed to perform some otherwise desirable actions, such as writing modified data to shared memory. This has an impact on coherency maintenance.

The present disclosure relates to protecting memory in a coherent data processing network.

In one embodiment, a request message is received at a master node of a data processing network from a first requesting node of the data processing network. The request message includes an action request for data associated with a first address in a shared memory of the data processing network and one or more access permissions of the first requesting node for the first memory address. For example, the requested action may be to read, write, or change the coherency state of data associated with the first memory address. The master node determines whether the requested action is allowed for the first requesting node based on the one or more access rights. When the requested action is allowed, the data associated with the first memory address is accessed from a system cache, a local cache of a second requesting node of the data processing network, or the shared memory, according to a coherency protocol. However, when the requested action is not allowed for the first requesting node, a response message is sent to the first requesting node without accessing the data associated with the first memory address.

Access rights may be provided by using Memory Protection Units (MPUs) located at the cross points of the data processing network. At a cross point, the MPU couples the requesting node to the network and contains registers that are configurable to define the access rights of requests.

FIG. 2 shows a system 200 with an example of a data processing network that includes Memory Protection Units (MPUs). FIG. 2 shows requesting nodes RN-F 202 and RN-I 216, mesh cross points (MXPs) 206, 212, 220, master nodes HN-I 210 and HN-F 208, slave node SN-F 218, and MPUs 204, 214. The system 200 of FIG. 2 may be, for example, a coherent mesh network.

Each of the requesting nodes (RN-F 202 and RN-I 216) in system 200 is coupled to the network interconnect via a memory protection unit (MPU 204 and MPU 214, respectively). The MPUs 204, 214 contain configurable registers that are programmed with address regions and corresponding read/write permissions, as shown in FIG. 3.

As described above, the Memory Protection Units (MPUs) 204, 214 may be computer hardware units. The MPU may be implemented as part of a Central Processing Unit (CPU), part of an interconnect fabric, or as a separate hardware module or block. In some embodiments, the MPU is a reduced version of a Memory Management Unit (MMU) that provides only memory protection support, and may be implemented in a low power processor that requires only memory protection, but not other memory management features such as virtual memory management.

First requesting node 202 is coupled to master node 208 via first memory protection unit 204 of data processing system 200. The first memory protection unit 204 receives the action request from the first requesting node 202, determines one or more access rights assigned to the first requesting node based on the first memory address, and enhances the action request with the one or more access rights of the first requesting node before sending the action request to the master node 208.

In one embodiment according to the present disclosure, the access rights are stored in request bits that are not used in the existing architectural interface, thereby enabling memory protection to be added to an existing messaging protocol. In another embodiment, an existing field in the request (such as the transaction identifier field) is extended to store the access rights. In a further embodiment, additional fields are added to the request to store the access rights.
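As a purely illustrative sketch (the field names and widths below are assumptions of this sketch, not fields of any existing interface), a request message enhanced with a dedicated permission field might be represented as follows:

```cpp
// Illustrative request message as it might leave the MPU: the R/W
// permissions travel alongside the address, opcode and transaction
// identifier. The text above notes the permissions could instead occupy
// unused request bits or an extended transaction identifier.
#include <cstdint>

struct AccessPerms {
    bool read  = false;   // R / -R
    bool write = false;   // W / -W
};

struct RequestMessage {
    std::uint64_t address = 0;   // first memory address of the access
    std::uint16_t txn_id  = 0;   // transaction identifier
    std::uint8_t  opcode  = 0;   // requested action (read, write, CMO, ...)
    AccessPerms   perms;         // added by the MPU before reaching the master node
};
```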

FIG. 3 shows an example 300 of Memory Protection Unit (MPU) address regions and permissions. MPU regions 0, 320(a), through N, 320(n) (where "N" is any suitable number) each have a read permission field 322, a write permission field 324, a start address 326, and an end address 328. The number of MPU regions is a design choice, and any suitable number may be used. MPU region 0, 320(a), has an associated read field 322(a), write field 324(a), start address 326(a), and end address 328(a). Similarly, MPU region N, 320(n), has an associated read field 322(n), write field 324(n), start address 326(n), and end address 328(n).

When a requesting node (e.g., RN-F 202 shown in FIG. 2) sends a request into the network through a cross point, the address from the request is looked up in the MPU (e.g., MPU 204 shown in FIG. 2) against the regions delimited by the start addresses 326 and end addresses 328. When a match is found, the corresponding read and write permission attributes of the matching region 320 are sent to the master node along with the request.

The MPU may also contain default read/write permissions that are used if no matching region is found. The master node (HN) then uses the R/W permissions to allow or deny access to the memory contents.
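The following is a minimal sketch of this region lookup, assuming inclusive start/end bounds and a configurable default permission; it is illustrative only and not the patent's implementation.

```cpp
// Illustrative MPU region table (cf. FIG. 3): each region carries a start
// address, an end address, and R/W attributes; a default is returned when
// no region matches the requested address.
#include <cstdint>
#include <vector>

struct Perms {
    bool read;
    bool write;
};

struct MpuRegion {
    std::uint64_t start;   // start address 326
    std::uint64_t end;     // end address 328
    Perms perms;           // read 322 / write 324 attributes
};

struct Mpu {
    std::vector<MpuRegion> regions;        // regions 0..N
    Perms default_perms{false, false};     // used when no region matches

    // Return the R/W attributes to attach to a request for `addr`.
    Perms lookup(std::uint64_t addr) const {
        for (const auto& r : regions) {
            if (addr >= r.start && addr <= r.end) {
                return r.perms;
            }
        }
        return default_perms;
    }
};
```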

These permissions are defined as follows:

R: the requester has read permission

W: the requester has write permission

-R: the requester does not have read permission

-W: the requester does not have write permission

Read requests from the requesting node to the master node are intercepted by the MPU. The memory address to be read is looked up in a table in the MPU to determine the access rights of the requesting node for that memory address. The MPU then enhances the read request with the access permissions (AP) and forwards the enhanced request to the master node. The process is as follows:

requesting node → RN_Req → MPU region lookup → RN_Req + AP → home node.

For a snoop request sent by the master node to an RN-F, the snoop address from the snoop request is looked up in the MPU. The snoop response is enhanced with the access rights, and the enhanced snoop response is sent back to the master node. The master node may then use the access rights in the snoop response to make its decision. The process is as follows:

master node → HNF_Snp_Request → MPU lookup → Snoop Response + AP → master node.

The master node filters the snoop response based on the R/W access rights.

FIG. 4 shows an example 400 of read/write request types with access rights. The first requesting node, CPU 402, communicates with a master node 404 and a slave node 406. In the example shown, the slave node 406 is a Dynamic Memory Controller (DMC).

Read request 408 is transmitted from CPU 402 to the master node 404, which performs permission filtering and a cache or memory access 411. In this example, the CPU has read and write (R/W) access rights to the address, and the read request 408 is enhanced with these rights. Read request 410 is allowed and transmitted to the DMC 406. In response to message 410, data 412 is transmitted from the DMC 406 to the CPU 402.

Acknowledgement 414 is transmitted from CPU 402 to the master node 404.

A write request 416 is sent from the CPU 402 to the master node 404. In response, the master node 404 performs permission filtering and a cache allocation or memory write 417. Prior to a memory write, a "buffer ready" message 418 is sent from the master node 404 to the CPU 402 to indicate that the master node is ready to receive data and has storage available to buffer the data. Data 420 is transmitted from the CPU 402 to the master node 404, and a write request 422 is transmitted from the master node 404 to the DMC 406. A "buffer ready" message 426 is sent from the DMC 406 to the master node 404. Finally, if the line is a victim (that is, it is being evicted), a memory write 427 is performed by the master node 404 and data 428 is transmitted from the master node 404 to the DMC 406.

Thus, in some embodiments, the master node responds to the action request according to the enhanced access rights. In other words, the master node "filters" the action requests according to the enhanced access rights. For example, when the action request includes a read request and the one or more access rights do not include read rights, the master node sends the dummy data back to the first requesting node instead of servicing the request.

When the action request includes a write request for the modified data and the one or more access rights do not include a write right, the master node discards the modified data and optionally invalidates the modified data at the first requesting node.
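The request filtering described above can be summarized by the following illustrative sketch (not the patent's implementation; the types and line size are assumptions). Requests that pass the check proceed through the normal coherency protocol, which is not modeled here.

```cpp
// Illustrative home-node request filter: a read without read permission
// completes with dummy (zero) data and an error; a write without write
// permission is accepted but its data is discarded, and the requester's
// copy may be invalidated.
#include <cstddef>
#include <cstdint>
#include <vector>

struct Perms {
    bool read;
    bool write;
};

enum class Action { Read, Write };

struct Response {
    bool error = false;                  // permission (MPU) violation
    bool invalidate_requester = false;   // optional follow-up invalidation
    std::vector<std::uint8_t> data;      // dummy data for a denied read
};

Response filter_request(Action action, Perms perms, std::size_t line_bytes) {
    Response rsp;
    if (action == Action::Read && !perms.read) {
        rsp.error = true;
        rsp.data.assign(line_bytes, 0);   // no cache or memory access is made
    } else if (action == Action::Write && !perms.write) {
        rsp.error = true;                 // the modified data will be discarded
        rsp.invalidate_requester = true;  // optional, per the text above
    }
    // Otherwise the request is allowed and is serviced via the coherency
    // protocol (cache lookup, snoops, or memory access) -- not modeled here.
    return rsp;
}
```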

Fig. 5 illustrates an example 500 of access rights based request filtering according to various embodiments of the present disclosure. A series of acts is illustrated by arrows representing the flow of information between the CPU, the primary node, and the DMC. FIG. 5 shows a CPU timeline 502, a master node timeline 504, and a DMC timeline 506, where time is incremented from top to bottom. The information flow may be generated by hardware of the data processing network, by software executing on a processor, or by a combination thereof.

The requester protection mechanism shown in FIG. 5 protects data from being sent to the CPU 502 when it lacks read permission. The first transaction is a read request 510 from the CPU. A read request 512, enhanced with the CPU's access rights, is transmitted from the CPU to the master node. The permission filter of the master node determines from the access rights that reads are not allowed, so no cache or memory access is performed at 520. Instead, dummy data 514 is transmitted from the master node to the CPU, and an acknowledgement 518 is transmitted from the CPU back to the master node.

The second transaction is a write request 530. A write request message 532, enhanced with the CPU's access rights, is sent from the CPU to the master node. In response, the master node 504 determines from the access rights that the CPU does not have write permission and discards the request at 540. Before the request is discarded, a "buffer ready" message 534 is transmitted from the master node to the CPU, and data 536 is transmitted from the CPU to the master node. However, since the requester does not have write permission, the data 536 is discarded and is not written to memory. Alternatively, the master node may indicate an error to the CPU in the "buffer ready" message 534.

The apparatus and system operations of FIG. 5 illustrate a requestor protection mechanism providing request filtering based on access rights. These rights include read rights, write rights, and snoop rights.

Read permission: If the RN does not have read permission for a read request, the home node (HN) will not look up its internal cache or snoop the caches of any RN-F that may hold the cache line. The HN will respond to the request with zero data and an error status indicating that the read request encountered an MPU violation.

Write permission: If the RN does not have write permission, the HN will process the request, but any dirty data from the RN is not written to memory. Where required, the HN may indicate a permission error in the completion response.

Snoop permission: If an RN-F must be snooped to maintain coherency, the permissions returned with the snoop response are checked. If the snooped RN-F returns data, the data is filtered: a snoop response with dirty data is accepted only if the RN-F has write permission, as shown in FIG. 6.

When the action request includes a read request for data and a copy of the requested data is stored in a local cache of the second requesting node, the home node may retrieve the data from the second requesting node by sending a snoop message to the second requesting node and receiving a data response. The data response is enhanced by the access rights of the second requesting node. When the retrieved data is in the modified state, the master node proceeds according to the access rights, as shown in fig. 6 and 7 described below.

When the second requesting node has write permission for the modified data and the first requesting node does not have write permission for the modified data, the master node writes the modified data to the shared memory at the first memory address to turn the modified data into clean data and sends the clean data to the first requesting node.

When the second requesting node has write permission for the modified data and the first requesting node also has write permission for the modified data, the master node sends the modified data to the first requesting node.

When the second requesting node does not have write permission for the modified data, the master node retrieves clean data from the shared memory at the first memory address, sends the clean data to the first requesting node, and invalidates the data associated with the first memory address at the second requesting node.
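The three cases above (see also FIGS. 6 and 7) can be condensed into the following illustrative decision sketch; it models only how the master node disposes of modified ("dirty") snoop data and omits the message flow.

```cpp
// Illustrative handling of dirty snoop data based on the snooped node's and
// the requester's write permissions.
enum class DirtyDataAction {
    WriteBackThenSendClean,          // snoopee may write, requester may not
    ForwardModified,                 // both snoopee and requester may write
    DiscardFetchCleanAndInvalidate   // snoopee may not write: drop dirty data
};

DirtyDataAction handle_dirty_snoop_data(bool snoopee_has_write,
                                        bool requester_has_write) {
    if (!snoopee_has_write) {
        // Dirty data from an unauthorized writer is never used; clean data is
        // read from memory and the snoopee's copy is invalidated.
        return DirtyDataAction::DiscardFetchCleanAndInvalidate;
    }
    return requester_has_write ? DirtyDataAction::ForwardModified
                               : DirtyDataAction::WriteBackThenSendClean;
}
```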

FIG. 6 is a signal flow diagram 600 of filtering based on the access rights of a snooped Central Processing Unit (CPU). A snoop response with dirty data is accepted only if the snooped RN-F has write permission; similarly, clean data is accepted only if the RN-F has read permission. If an MPU violation exists, the data is discarded, and the master node obtains the data from the DMC and services the request. As shown in FIG. 6, the read request transaction 610 is initiated by the first requesting node, CPU0. FIG. 6 also shows a CPU0 timeline 602(a), a master node timeline 604, a second requesting node (CPU1) timeline 602(b), and a DMC timeline 606. A read request 612, having read and write permissions (R/W), is transmitted from the first requesting node (CPU0) to the master node. The permission filter of the master node allows cache and/or memory access 613 because CPU0 has read permission. A snoop request 614 is transmitted from the master node to the second requesting node (CPU1), which has been determined to have a copy of the requested data. CPU1 responds by returning modified, or "dirty", snoop data 616 to the master node. The snooped data is enhanced with the (R/-W) access rights of CPU1. Since CPU1 does not have write permission, the snoop data is discarded by the master node at 618, and a read request 620 is sent from the master node to the DMC to retrieve clean data. Clean data 622 is then transmitted from the DMC to CPU0, and an acknowledgement 624 is sent from CPU0 to the master node.

FIG. 7 shows another embodiment 700, in which the first requesting node (CPU0) has read permission but not write permission (R/-W). This is an example of controlling "clean" versus "dirty" (modified) data based on access rights. FIG. 7 shows a CPU0 timeline 702(a), a master node timeline 704, a second requesting node (CPU1) timeline 702(b), and a DMC timeline 706. In this case, the master node writes the modified data to the DMC and provides clean data to the first requesting node. If the modified data were provided without the DMC write, any subsequent eviction from the CPU would be discarded (due to write permission filtering), and the modified data would be lost.

FIG. 7 also shows a read request transaction 710 initiated by the first requesting node, CPU0. A read request 712, with read permission but no write permission (R/-W), is transmitted to the master node. Since CPU0 has read permission, the master node's permission filter allows cache and/or memory access 713. A snoop request 714 is transmitted from the master node to the second requesting node (CPU1). CPU1 transmits the modified (dirty) data 716, enhanced with its (R/W) permissions, to the master node 704. Since the write is allowed, the master node sends a write request 734 (with R/W permission) to the DMC to initiate writing the modified data to memory 735. A "buffer ready" signal 736 is transmitted from the DMC to the master node, and data 738 is then transmitted from the master node to the DMC to complete the write back to memory and change the coherency state of the data from modified (dirty) to clean. Clean data 730 is transmitted from the master node to CPU0, and an acknowledgement 732 is transmitted from CPU0 to the master node.

When the action request includes a request to invalidate data at a first address at which the first requesting node does not have write permission, and when a copy of the data is stored at the second requesting node, the master node retrieves the data associated with the first memory address from the second requesting node. When the retrieved data is in the modified coherency state, the master node writes the retrieved data to the shared memory at the first memory address to change the coherency state of the data associated with the first memory address from "modified" to "clean" and invalidate the data associated with the first memory address at the second requesting node.

FIG. 8 is a signal flow diagram 800 of a method for permission filtering of an invalidating request. FIG. 8 shows a CPU0 timeline 802(a), a master node timeline 804, a second requesting node (CPU1) timeline 802(b), and a DMC timeline 806. The invalidating transaction 810 is initiated by CPU0. A request to invalidate data stored at other nodes may be expressed as a request to make the data unique at the requester. A "MakeUnique" request 812, enhanced with (R/-W) access rights, is transmitted to the master node. The permission filter at the master node converts the request to a "CleanUnique" request at 814. A "SnpCleanInvalid" snoop request 816 is transmitted from the master node to CPU1. CPU1 transmits the modified data, enhanced with its (R/W) rights, to the master node as snoop data 818. A completion 820 for the invalidation is sent from the master node to CPU0, and an acknowledgement 822 is sent from CPU0 to the master node.

To prevent loss of the modified data, the master node sends a write request 824, with (R/W) permission, to the DMC. A "buffer ready" signal 826 is transmitted from the DMC to the master node, and data 830 is then transmitted from the master node to the DMC. The data is thus written to memory at 832.

Regarding permissions for invalidating requests: for a request of an invalidating type issued with read-only permission (e.g., ReadOnceMakeInvalid, MakeUnique, etc.), where the RN (requesting node) may receive data, or receive a completion without data, while the memory contents are invalidated from all downstream or peer caches, the master node converts such a request into a non-destructive request, as shown in FIG. 8. For example, a MakeUnique request 812 is converted to a CleanUnique request 814, and a ReadOnceMakeInvalid request (not labeled) is converted to a ReadOnceCleanInvalid request (not labeled). Such a conversion ensures that existing dirty or modified data in the system is written to memory, and that completion of the request does not corrupt any memory contents.

For CMOs, if the requesting node has read-only permission, the master node makes a similar conversion to a non-destructive request. For example, MakeInvalid is converted to CleanInvalid. If the requesting node has neither read nor write permission, the transaction is completed without updating the memory.
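For illustration, these conversions might be expressed as a simple mapping; the sketch below uses the opcode names as given in the text and is not an implementation of any particular protocol.

```cpp
// Illustrative conversion applied when the requester has read-only
// permission: invalidating request types are replaced by variants that first
// write dirty data back to memory instead of silently destroying it.
#include <string>

std::string convert_for_read_only_requester(const std::string& opcode) {
    if (opcode == "MakeUnique")          return "CleanUnique";
    if (opcode == "ReadOnceMakeInvalid") return "ReadOnceCleanInvalid";
    if (opcode == "MakeInvalid")         return "CleanInvalid";   // CMO case
    return opcode;   // other request types pass through unchanged
}
```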

Another case is permission filtering of dataless requests. Some request types, such as MakeUnique and CleanUnique, are completed without data. If the CPU does not support a permission-error notification (bus error), it may erroneously transition its cache line to a "clean" state. Clean data is data that is suitable for storage in coherent memory; it differs from dirty data in that it is regarded as valid and acceptable. A subsequent snoop to such a cache line could therefore expose bad data to other CPUs and memory locations. To avoid this, the home node follows the dataless completion with an invalidating snoop request (SnpMakeInvalid) to invalidate the cache line in the RN cache. This ensures that the CPU does not hold the cache line in a unique state.

FIG. 9 is a signal flow diagram 900 of a method for permission filtering of write requests, according to various embodiments. A timeline 902 of a first requesting node (CPU0), a timeline 904 of a master node (HN-F), and a timeline 906 of a second requesting node (CPU1) are shown. Access request 908 is a "WriteClean" request to write modified (dirty) data to memory and thereby change the state of the data from dirty to clean. However, the access rights enhanced into the request indicate that CPU0 does not have write permission for the data address. Upon receiving the request, the master node HN-F determines that CPU0 does not have write permission; the master node neither snoops CPU1 nor writes the data back to memory. The master node sends a completion message 910 back to CPU0 indicating that it is ready to receive the data, and CPU0 sends the data to the master node in message 912. Thus, during time period 914, CPU0 holds "bad" data that has been modified (dirty) but cannot be written back to memory. Once the data has been sent to the master node in message 912, CPU0 changes the state of the data to "clean"; however, the data is still "bad". The master node then sends an invalidation message 916 to CPU0 to invalidate the data stored at CPU0, and CPU0 acknowledges this in message 918. Therefore, in period 920 CPU0 shows the state of the data as "clean", while in period 922 the data is shown as invalid.

FIG. 10 is a signal flow diagram of a method for permission filtering of read requests according to various embodiments of the present disclosure. The flow diagram includes a timeline 1002 for a first requesting node (CPU0), a timeline 1004 for a master node (HN-F), a timeline 1006 for a second requesting node (CPU1), and a timeline 1008 for a memory controller (DMC). The access request 1010 from the first requesting node (CPU0) is a "ReadShared" request to obtain a copy of data stored at the second requesting node (CPU1). Since CPU0 has read permission for the memory address, the master node sends a snoop message 1012 for the data to CPU1. In time period 1014, the data at CPU1 is "bad" data: it has been modified but cannot be cleaned by writing it back to memory, because CPU1 does not have write permission for the data. Nevertheless, CPU1 returns the modified (dirty) data to the master node in snoop response 1016. The access rights in snoop response 1016 indicate to the master node that CPU1 does not have write permission and that the data is "bad". The master node sends a read request 1018 for clean data to the memory controller (DMC). The clean data is sent to the first requesting node (CPU0) in message 1020, and the data is acknowledged to the master node in message 1022. CPU0 now has good data, but CPU1 still has bad data. The master node therefore sends invalidation message 1024 to CPU1, and CPU1 acknowledges the message in response 1026. In this way, the "bad" data at CPU1 is neither passed to CPU0 nor written back to memory, and the memory is protected.

Fig. 11A and 11B are a flow diagram 1100 of a method for filtering access rights according to an embodiment of the present disclosure. The method may be implemented in hardware in a data processing network.

Referring to FIG. 11A, a new request from a first requesting node (RN-F or RN-I) is received at block 1102, and an address lookup in the MPU within the MXP is performed at block 1104. At block 1106, the request is accompanied by the R/W (read/write) permissions.

At block 1108, the master node (HN-F) receives the request and checks the permissions. At decision block 1110, it is determined whether the permissions are acceptable. If the permissions are not acceptable, as shown by the negative branch 1112 of decision block 1110, an error response is sent and the protocol flow completes at block 1114.

When the permissions are deemed acceptable, as shown by the positive branch 1116 of decision block 1110, a cache/snoop filter lookup is performed at block 1118. It is then determined whether listening is required at decision block 1120. If not, as shown by the negative branch 1122 from decision block 1120, flow proceeds to point "A" and from there to decision block 1124 in FIG. 11B.

Referring now to FIG. 11B, if the request does not need to go to the slave node SN-F, as indicated by the negative branch 1126 of decision block 1124, the protocol flow completes without error at block 1130.

If it is determined that the request must go to SN-F, as indicated by the positive branch of decision block 1124, the request is sent to the DMC at block 1158 and a response is received from the DMC at block 1160. As before, the protocol flow completes without error at block 1130.

Referring again to FIG. 11A, when snooping is required, as indicated by the positive branch 1132 of decision block 1120, snoops are sent at block 1134 from the master node to the nodes indicated in the snoop filter as having copies of the data (referred to as "snoopees"). At block 1136, the MXP intercepts the snoop request and performs a snoop address lookup in the MPU to obtain the snoopee's permissions. At block 1138, the MXP colors the snoop transaction identifier with the MPU permissions and forwards the snoop request to the snoopee. The snoopee then processes the snoop request and sends a snoop response with the colored transaction identifier at block 1140.

At block 1142, the MXP intercepts the snoop response and populates the MPU permissions field from the colored transaction identifier, and flow continues to point "B".

Referring again to FIG. 11B, the master node (HN-F) receives the snoop response and checks the snoopee's permissions at block 1144. At decision block 1146, it is determined whether the snooped data can be used. If so, as shown by the positive branch 1148 of decision block 1146, the protocol flow completes without error at block 1130.

If the snooped data cannot be used, as shown by negative branch 1150 of decision block 1146, a determination is made at decision block 1152 whether the request is destined for SN-F. If so, as indicated by the positive branch of decision block 1152, the request is sent to the DMC at block 1158, the DMC response is received at block 1160, and the protocol flow is completed without error at block 1130.

If it is determined that the request is not destined for SN-F, as shown by negative branch 1154 of decision block 1152, an error response is sent and the protocol flow is completed at block 1156.

A snoop request received at the snooped (second) requesting node contains the memory address of the snooped data, and this memory address may be used in the MPU to determine the access rights. The snoop response, however, typically does not include a memory address, though it does include a transaction identifier. In one embodiment, the access rights are associated with the transaction identifier in the MPU when the snoop request is received, so that the snoop response can be enhanced with the access rights. This may be done, for example, by storing a table in the MPU: when the snoop response is received from the snoopee, the same transaction identifier in the snoop response is used to identify the access rights. In another embodiment, the access rights are added to the transaction identifier in the message sent to the snoopee; thus, the access rights are stored in the request to the snoopee and returned in the response from the snoopee. For example, the number of transaction identifiers may be reduced by a factor of four, and the access rights stored in the two most significant bits of the transaction identifier. The transaction identifier is then said to be "colored" by the access rights. In this embodiment, the memory protection unit intercepts a snoop message destined for the second requesting node, colors the transaction identifier in the snoop message with the one or more access permissions of the second requesting node to provide a colored snoop message, and forwards the colored snoop message to the second requesting node. The memory protection unit then intercepts the snoop response from the second requesting node, de-colors the transaction identifier in the snoop response, and forwards the de-colored snoop response, enhanced with the one or more access permissions, to the master node.
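As an illustrative sketch of the coloring scheme (the identifier width below is an assumption; the text only requires that the identifier space be reduced by a factor of four), the color and de-color steps might look like this:

```cpp
// Illustrative coloring of a transaction identifier: the usable identifier
// range is reduced by a factor of four so that the two most significant bits
// can carry the R and W permissions on the way to the snooped node; the MPU
// strips (de-colors) them from the snoop response and reports them to the
// master node in a separate field.
#include <cstdint>

constexpr unsigned kTxnIdBits = 10;   // assumed width of the identifier field
constexpr std::uint16_t kIdMask = (1u << (kTxnIdBits - 2)) - 1;

// Pack R/W into the two most significant bits of the restricted identifier.
std::uint16_t color(std::uint16_t txn_id, bool read, bool write) {
    std::uint16_t id = txn_id & kIdMask;   // identifier restricted to 1/4 range
    if (read)  id |= 1u << (kTxnIdBits - 2);
    if (write) id |= 1u << (kTxnIdBits - 1);
    return id;
}

struct Decolored {
    std::uint16_t txn_id;
    bool read;
    bool write;
};

// Recover the original identifier and the permissions from a colored one.
Decolored decolor(std::uint16_t colored_id) {
    return {
        static_cast<std::uint16_t>(colored_id & kIdMask),
        ((colored_id >> (kTxnIdBits - 2)) & 1u) != 0,
        ((colored_id >> (kTxnIdBits - 1)) & 1u) != 0,
    };
}
```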

In either embodiment, the master node sends a snoop message to the second requesting node (the snoopee) via the MPU, and the second requesting node sends a snoop response including the snooped data. The second memory protection unit enhances the snoop response with the one or more access rights of the second requesting node, for example by looking up the access rights using the transaction identifier or by reading the access rights from the colored transaction identifier. The master node receives the enhanced snoop response and discards the snooped data when the one or more access rights of the second requesting node indicate that the second requesting node does not have read permission for the snooped data. The master node also discards the snooped data when the snooped data is modified and the one or more access rights of the second requesting node indicate that the second requesting node does not have write permission for the modified data. Furthermore, when the master node discards the snooped data, the snooped data at the second requesting node may be invalidated if the coherency protocol would otherwise allow the second requesting node to retain a copy of the data.

When the master node discards the snooped data, the master node retrieves the clean data from the shared memory at the first memory address and sends it to the first requesting node.

FIG. 12 illustrates an apparatus 1200 for filtering requests according to an embodiment of the disclosure. The apparatus is configured to grant or deny access to the shared memory based on the access rights accompanying a request. Apparatus 1200 includes CPU0 1202(a), CPU1 1202(n), modules 1204(a) and 1204(n), caches 1206(a) and 1206(n), coherent interconnect 1246, and DMC 1236. CPU0 1202(a) may be considered a first processor, a first CPU master, or a CPU/IO device. CPU1 1202(n) may be considered a second processor or a second CPU master.

Coherent interconnect 1246 includes cross points (MXPs) 1212 and 1220 and a master node HN-F 1216. MXPs 1212 and 1220 each contain a Memory Protection Unit (MPU) (1213 and 1221, respectively).

CPU0 1202(a) and CPU1 1202(n) communicate bidirectionally with coherent interconnect 1246 via links 1210, 1242 and 1224, 1226, respectively. Coherent interconnect 1246 communicates bidirectionally with the DMC (memory controller) 1236 via links 1234, 1238. The bidirectional communication may take place over a data communication bus, a wire, a set of wires, a wireless channel, or any other suitable transmission medium that allows data to be transmitted and/or received between the component parts of the apparatus 1200.

In operation, the cache 1206(a) of CPU0 1202(a) sends a data access request to MXP 1212 of the interconnect 1246, as shown by line 1242. The MPU at cross point 1212 enhances the request with the access rights.

The MPU at cross point 1212 sends the request to the master node (HN-F) 1216 via line 1214. On link 1218, HN-F 1216 sends a snoop request to CPU1 1202(n) via MXP 1220. The MPU at 1220 colors the transaction identifier in the snoop request with the access rights and forwards the snoop request to the cache 1206(n) of CPU1 1202(n) via line 1224.

After receipt at cache 1206(n), a data response is sent from cache 1206(n) to the MPU at cross point 1220 via line 1226. The transaction identifier in the data response is de-colored by the MPU, and the response is forwarded to HN-F 1216 via line 1228. The HN-F 1216 transmits the data to the DMC 1236 via line 1234.

The HN-F 1216 sends the data to MXP 1212 via line 1240, and MXP 1212 forwards the data to the cache 1206(a) of CPU0 1202(a) via line 1210.

Accordingly, in various embodiments, an apparatus is provided that includes a plurality of cross-point switches, a master node, and an interconnect. A cross-point switch includes a first memory protection unit and provides an interface to a first requesting node, and the interconnect is coupled between the plurality of cross-point switches, the master node, and the shared memory. The master node provides a coherency point for accessing the shared memory. The memory protection unit intercepts a message from the first requesting node destined for the master node, enhances the intercepted message with one or more access rights of the first requesting node, and forwards the enhanced message to the master node. The master node responds to the enhanced message in accordance with the one or more access rights.

The message received at the memory protection unit is associated with a first memory address in the shared memory, and the memory protection unit is configured to look up one or more access permissions in an address table of the first memory protection unit according to the first memory address.

The master node is configured to transmit a snoop message in response to the access request from the first requesting node, and the memory protection unit at the second requesting node receives the snoop message from the master node. The second requesting node sends a snoop response containing the snooped data back to the master node in response to the snoop message. The memory protection unit at the second requesting node intercepts the snoop response, enhances the snoop response with one or more access permissions of the second requesting node, and forwards the enhanced snoop response to the master node. The master node is further configured to discard the snooped data when the one or more access permissions of the second requesting node indicate that the second requesting node does not have read permission for the snooped data, and to discard the snooped data when the snooped data is modified and the one or more access permissions of the second requesting node indicate that the second requesting node does not have write permission for the modified data.

Further, the master node retrieves clean data from the shared memory and forwards the clean data to the first requesting node when the snooped data is discarded.

The master node sends the dummy data to the first requesting node when the message from the first requesting node includes a read request for data associated with a first memory address in the shared memory and the one or more access rights indicate that the first requesting node does not have read rights for the first memory address. The master node may invalidate the first data at the first requesting node when the message from the first requesting node includes a write request for the first data and the one or more access permissions indicate that the first requesting node does not have write permission for the first memory address and the write request is of a type that allows the first requesting node to retain a copy of the first data. The master node may discard the write request when the one or more access rights indicate that the first requesting node does not have write rights to the first memory address.

Access rights in the snoop response may be obtained by coloring the transaction identifier received in the snoop request.

FIGS. 13A to 13D show MPU coloring of transaction identifiers. A common transaction identifier is included in all messages that are part of the same transaction, such as the messages passed along request links 1242, 1214, 1218, and 1236 and response links 1234, 1238, 1240, and 1210 shown in FIG. 12, and enables a response to be associated with its request. According to one embodiment, the range of transaction identifiers may be restricted so that identifier bits become available to carry the access rights.

FIG. 13A shows a register 1320 for holding a transaction identifier having bits 1322(a) through 1322(n) (where "n" is any suitable number). In this example, register 1320 has two bits, 1322(a) and 1322(b), that are not used by the restricted transaction identifier. For example, these bits may be padded with zeros. This register state corresponds to the snoop request, depicted in FIG. 12 as snoop request 1218, that is transferred from HN-F 1216 to MXP 1230.

FIG. 13B shows the register 1320 with bits 1322(a) through 1322(n). Here, the two bits 1322(a) and 1322(b) are filled with R and W, respectively. This register state corresponds to the snoop request, depicted in FIG. 12 as snoop request 1234, that is transferred from MXP 1230 to the cache 1206(n) of CPU1 1202(n). The MPU in the MXP uses the address in the snoop request to retrieve the access rights and fills the two unused bits 1322(a) and 1322(b) of the transaction identifier field with the access rights. The transaction identifier is then said to be "colored" by the access rights.

FIG. 13C shows the register 1320 with bits 1322(a) through 1322(n), where the two bits 1322(a) and 1322(b) are filled with R and W, respectively. This register state corresponds to the transaction identifier transferred from CPU1 1202(n) to MXP 1230 via link 1236 in FIG. 12. No change to the operation of CPU1 is required, because the CPU returns the same transaction identifier (colored with the access rights) that it received.

FIG. 13D shows the register 1320 with bits 1322(a) through 1322(n) (where "n" is any suitable number). Here, the two bits 1322(a) and 1322(b) are padded with zeros, while bits 1324(a) and 1324(b) are filled with R and W, respectively. This register state corresponds to the snoop response to the master node and shows that the R and W bits are no longer carried in the transaction identifier 1320. The access rights, shown as bits 1324(a) and 1324(b), are removed from the transaction identifier and replaced with zeros in the snoop response transmitted from MXP 1230 to HN-F 1216 over link 1238 in FIG. 12, and the snoop response to the master node is enhanced with these access rights.

As used herein, the term "processor" may encompass or utilize programmable hardware such as computers, microcontrollers, embedded microcontrollers, microprocessors, Application Specific Integrated Circuits (ASICs), Field Programmable Gate Arrays (FPGAs), and Complex Programmable Logic Devices (CPLDs). These hardware examples may also be used in combination to obtain a desired functional controller module. Computers, microcontrollers, and microprocessors may be programmed using languages such as assembly, C, C++, C#, and the like. FPGAs, ASICs, and CPLDs are typically programmed using a Hardware Description Language (HDL), such as VHSIC Hardware Description Language (VHDL) or Verilog, which configures connections between internal hardware modules of lesser functionality on a programmable device.

The present disclosure has been described with reference to flowchart illustrations and/or block diagrams of methods, apparatus, systems, and computer program products according to embodiments described herein. It will be understood that each block of the flowchart illustrations and/or block diagrams, and combinations of blocks in the flowchart illustrations and/or block diagrams, can be implemented in hardware, or by computer program instructions which are executed, or by combinations of both.

As will be appreciated by one skilled in the art, embodiments may be described as a system, method or computer program product. Accordingly, the present disclosure may take the form of an entirely hardware embodiment, an entirely software embodiment (including firmware, resident software, micro-code, etc.) or an embodiment combining software and hardware aspects, which may all generally be referred to herein as a "circuit," "module," or "system."

Furthermore, the disclosure may take the form of a non-transitory computer-readable medium storing instructions of a Hardware Description Language (HDL), such as VHSIC Hardware Description Language (VHDL) or Verilog, that describes the apparatus or stores a netlist description of the apparatus according to the claims. Such descriptions may be used, for example, to configure Field Programmable Gate Arrays (FPGAs) or similar configurable hardware, or as inputs to design tools for custom integrated circuits.

Various representative embodiments that have been described in detail herein are presented by way of example and not by way of limitation. It will be understood by those skilled in the art that various changes in form and details of the described embodiments may be made to obtain equivalent embodiments which remain within the scope of the appended claims.
