Electronic device and electronic system
Reader's note: This patent, "Electronic device and electronic system", was created on 2020-03-19 by Vikas Sinha, Sean Le, Tarun Nakra, Yingying Tian, Apurva Patel, and Omar Torres. Abstract: An electronic device and an electronic system are provided. According to one general aspect, the electronic device may include a processor configured to issue a first request for a piece of data from a cache memory and a second request for the piece of data from a system memory. The electronic device may include a cache memory configured to temporarily store a subset of data. The electronic device may include a memory interconnect. The memory interconnect may be configured to receive the second request for the piece of data from the system memory, determine whether the piece of data is stored in the cache memory, and cancel the second request for the piece of data from the system memory if the piece of data is determined to be stored in the cache memory.
1. An electronic device, comprising:
a processor configured to issue a first request for a piece of data from a cache memory and a second request for the piece of data from a system memory;
a cache memory configured to store a subset of data; and
a memory interconnect configured to: receive the second request for the piece of data from the system memory, determine whether the piece of data is stored in the cache memory, and cancel the second request for the piece of data from the system memory if the piece of data is determined to be stored in the cache memory.
2. The electronic device of claim 1, wherein the processor is configured to include a speculative flag in the second request for the piece of data from the system memory.
3. The electronic device of claim 1, wherein the memory interconnect is configured to:
cancel, if the piece of data is determined to be stored in the cache memory, the second request for the piece of data from the system memory by issuing a cancellation response message to the processor.
4. The electronic device of claim 1, wherein the memory interconnect is configured to determine whether the piece of data is stored in the cache memory by checking a snoop filter directory.
5. The electronic device of claim 4, wherein the snoop filter directory includes false positive results but not false negative results.
6. The electronic device of claim 1, wherein the processor is configured to:
issue a third request for the piece of data from the system memory in response to receiving both a failure of the first request and a cancellation of the second request.
7. The electronic device of claim 1, wherein the memory interconnect is configured to: block, when the second request is issued, access requests to the system memory for a piece of data already stored in the cache memory.
8. The electronic device of claim 1, wherein the memory interconnect is configured to:
cancel the second request for the piece of data from the system memory if the second request arrives at the memory interconnect before a write request associated with the piece of data, wherein the second request is earlier than the write request.
9. An electronic system, comprising:
a plurality of processors, wherein a requesting processor among the plurality of processors is configured to issue a first request for a piece of data from a cache memory system and a second request for the piece of data from a system memory;
a cache memory system including, for each processor of the plurality of processors, a portion of the cache memory system associated with the respective processor; and
a memory interconnect configured to:
facilitate cache coherency among the plurality of processors,
receive the second request for the piece of data from the system memory,
determine whether the piece of data is stored in a portion of the cache memory system that is accessible by the requesting processor, and
cancel the second request for the piece of data from the system memory if the piece of data is determined to be stored in the portion of the cache memory system that is accessible by the requesting processor.
10. The electronic system of claim 9, wherein the requesting processor is configured to include a speculative flag in the second request for the piece of data from the system memory.
11. The electronic system of claim 9, wherein the memory interconnect is configured to:
cancel, if the piece of data is determined to be stored in a portion of the cache memory system that is accessible by the requesting processor, the second request for the piece of data from the system memory by issuing a cancellation response message to the requesting processor.
12. The electronic system of claim 9, wherein the memory interconnect is configured to determine whether the piece of data is stored in the portion of the cache memory system that is accessible by the requesting processor by checking a snoop filter directory.
13. The electronic system of claim 12, wherein the snoop filter directory includes false positive results but not false negative results.
14. The electronic system of claim 9, wherein the requesting processor is configured to:
issue a third request for the piece of data from the system memory in response to receiving both a failure of the first request and a cancellation of the second request.
15. The electronic system of claim 9, wherein the memory interconnect is configured to: block, when the second request is issued, access requests to the system memory for a piece of data already stored in the cache memory system.
16. The electronic system of claim 9, wherein the memory interconnect is configured to:
cancel the second request for the piece of data from the system memory if the second request arrives at the memory interconnect before a write request associated with the piece of data, wherein the second request is earlier than the write request.
17. An electronic device, comprising:
memory access interface circuitry configured to receive and transmit memory access requests and responses;
a cache coherency data structure configured to indicate contents of a cache memory; and
speculative request management circuitry configured to:
receive a speculative request to a system memory for a piece of data,
determine whether the piece of data is stored in at least a portion of the cache memory, and
cancel the speculative request if the piece of data is determined to be stored in the cache memory.
18. The electronic device of claim 17, wherein the speculative request management circuitry is configured to:
cancel the speculative request, at least in part, by issuing a cancellation message to the requesting device.
19. The electronic device of claim 17, wherein the speculative request management circuitry is configured to:
determine whether the piece of data is stored in at least a portion of the cache memory by accessing the cache coherency data structure.
Technical Field
This description relates to memory operations, and more particularly, to speculative Dynamic Random Access Memory (DRAM) reads that leverage an interconnect directory in parallel with cache-level searches.
Background
When a piece of data is shared by multiple caches and a processor modifies the value of the shared data, the change must be propagated to all other caches that hold copies of the data. This change propagation prevents the system from violating cache coherency. Notification of data changes may be accomplished by bus snooping.
Bus snooping or bus sniffing is a scheme in which a coherency controller (snooper) in a cache monitors or snoops bus transactions, with the goal of maintaining cache coherency in a distributed shared memory system. A cache containing a coherency controller (snooper) is referred to as a snoopy cache.
All snoopers monitor every transaction on the bus. If a transaction that modifies a shared cache block occurs on the bus, all snoopers check whether their caches hold a copy of the shared block. If a cache has a copy of the shared block, its snooper performs an action to ensure cache coherency. This action may be a flush or an invalidation of the cache block. It may also involve a change of cache block state, depending on the cache coherency protocol.
When a bus transaction occurs for a particular cache block, all snoopers must snoop the bus transaction. The snoopers then look up their corresponding cache tags to check whether they hold that cache block. In most cases the caches do not hold the block, because a well-optimized parallel program does not share much data among multiple threads. The snooper's cache tag lookup is therefore typically wasted work for caches that do not hold the block. Worse, these tag lookups interfere with cache accesses by the processor and cause additional power consumption.
One way to reduce unnecessary snooping is to use a snoop filter. The snoop filter determines whether a snooper needs to check its cache tags. It is a directory-based structure that monitors all coherency traffic in order to keep track of the coherency state of cache blocks; in other words, the snoop filter knows which caches hold a copy of a given cache block. It can therefore prevent unnecessary snoops of caches that do not hold a copy of the block. Depending on the location of the snoop filter, there are two types. A source filter is located on the cache side and performs filtering before coherency traffic reaches the shared bus. A destination filter is located on the bus side and blocks unneeded coherency traffic flowing from the shared bus. Snoop filters are also classified as inclusive or exclusive. An inclusive snoop filter keeps track of the presence of cache blocks in the caches, whereas an exclusive snoop filter keeps track of their absence. In other words, a hit in an inclusive snoop filter means that the corresponding cache block is held by some cache, while a hit in an exclusive snoop filter means that no cache holds the requested cache block.
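The inclusive filtering idea above can be sketched as a small software model. The class and method names here are invented for illustration; a real snoop filter is a bounded hardware structure, not an unbounded dictionary. Note that, to model silent cache evictions, entries are never removed, so a lookup may return a false positive (a cache that no longer holds the block) but never a false negative:

```python
class InclusiveSnoopFilter:
    """Toy model of an inclusive, destination-side snoop filter."""

    def __init__(self):
        self.directory = {}  # block address -> set of cache ids

    def record_fill(self, cache_id, addr):
        # Called when a cache brings a block in; the directory now
        # knows this cache *may* hold a copy.
        self.directory.setdefault(addr, set()).add(cache_id)

    def caches_to_snoop(self, requester_id, addr):
        # Only caches listed in the directory need a tag lookup;
        # all others are filtered out, saving tag-array accesses.
        holders = self.directory.get(addr, set())
        return holders - {requester_id}


filt = InclusiveSnoopFilter()
filt.record_fill("cache0", 0x1000)
filt.record_fill("cache1", 0x1000)
filt.record_fill("cache1", 0x2000)

# A write by cache0 to 0x1000 only needs to snoop cache1.
print(filt.caches_to_snoop("cache0", 0x1000))  # {'cache1'}
# No cache ever filled 0x3000, so no snoop traffic at all.
print(filt.caches_to_snoop("cache0", 0x3000))  # set()
```

Because stale entries only cause extra (harmless) snoops, this conservative design preserves coherency while still eliminating most unnecessary tag lookups.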
Disclosure of Invention
It is an object of the present disclosure to provide a device and a system having reduced latency and efficient resource utilization.
According to one general aspect, an apparatus may comprise: a processor configured to issue a first request for a piece of data from a cache memory and a second request for the piece of data from a system memory. The apparatus may comprise: a cache memory configured to store a subset of the data. The apparatus may include a memory interconnect. The memory interconnect may be configured to receive the second request for the piece of data from the system memory. The memory interconnect may be configured to determine whether the piece of data is stored in the cache memory. The memory interconnect may be configured to cancel the second request for the piece of data from the system memory if the piece of data is determined to be stored in the cache memory.
According to another general aspect, a system may include: a plurality of processors, wherein a requesting processor is configured to issue a first request for a piece of data from a cache memory system and a second request for the piece of data from a system memory. The system may include: a cache memory system comprising, for each processor, a portion of the cache memory system associated with the respective processor. The system may include a memory interconnect. The memory interconnect may be configured to facilitate cache coherency among the plurality of processors. The memory interconnect may be configured to receive the second request for the piece of data from the system memory. The memory interconnect may be configured to determine whether the piece of data is stored in a portion of the cache memory system that is accessible by the requesting processor. The memory interconnect may be configured to cancel the second request for the piece of data from the system memory if the piece of data is determined to be stored in that portion of the cache memory system.
According to another general aspect, an apparatus may include: memory access interface circuitry configured to receive and transmit memory access requests and responses. The apparatus may comprise: a cache coherency data structure configured to indicate contents of a cache memory. The apparatus may comprise: speculative request management circuitry configured to receive a speculative request to a system memory for a piece of data, determine whether the piece of data is stored in at least a portion of the cache memory, and cancel the speculative request if the piece of data is determined to be stored in the cache memory.
The details of one or more implementations are set forth in the accompanying drawings and the description below. Other features will be apparent from the description and drawings, and from the claims.
A system and/or method for memory operations, and more particularly for speculative Dynamic Random Access Memory (DRAM) reads that leverage an interconnect directory in parallel with cache-level searches, is set forth substantially as shown in and/or described in connection with at least one of the figures, and as set forth more fully in the claims.
According to the present disclosure, the speculative request to the system memory is cancelled when the parallel request to the cache memory will succeed. Thus, a device and a system are provided that have reduced latency and utilize resources efficiently.
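The parallel-request flow described in the aspects above can be illustrated with a minimal software model. All names here are invented for this sketch; an actual memory interconnect implements this in hardware, and the directory check corresponds to the snoop filter lookup described in the claims:

```python
class MemoryInterconnect:
    """Toy model of the speculative-read cancellation flow."""

    def __init__(self, snoop_directory):
        # snoop_directory: set of addresses believed to be cached.
        # Per the claims, it may contain false positives (stale
        # entries) but is assumed to have no false negatives.
        self.snoop_directory = snoop_directory

    def handle_speculative_read(self, addr):
        # The processor issued this system-memory read in parallel
        # with its cache lookup, marked with a speculative flag.
        if addr in self.snoop_directory:
            # The cache will satisfy the first request, so drop the
            # DRAM access and send a cancellation response instead.
            return ("cancel", addr)
        # Not cached: forward to system memory as a normal read.
        return ("dram_read", addr)


interconnect = MemoryInterconnect(snoop_directory={0x40, 0x80})
print(interconnect.handle_speculative_read(0x40))   # ('cancel', 64)
print(interconnect.handle_speculative_read(0x100))  # ('dram_read', 256)
```

If the directory gave a false positive and the cache lookup then misses, the processor falls back to a third, non-speculative request to system memory, as in claims 6 and 14; correctness is preserved, and only the latency advantage is lost for that access.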
Drawings
Fig. 1A is a block diagram of an example embodiment of a system in accordance with the disclosed subject matter.
Fig. 1B is a block diagram of an example embodiment of a system in accordance with the disclosed subject matter.
Fig. 2 is a block diagram of an example embodiment of a system in accordance with the disclosed subject matter.
Fig. 3 is a flow diagram of an example embodiment of a technique in accordance with the disclosed subject matter.
FIG. 4 is a schematic block diagram of an information handling system that may include devices formed in accordance with the principles of the disclosed subject matter.
Like reference symbols in the various drawings indicate like elements.
Detailed Description
Various example embodiments will be described more fully hereinafter with reference to the accompanying drawings, in which some example embodiments are shown. The subject matter of the present disclosure may, however, be embodied in many different forms and should not be construed as limited to the example embodiments set forth herein. Rather, these example embodiments are provided so that this disclosure will be thorough and complete, and will fully convey the scope of the subject matter of the disclosure to those skilled in the art. In the drawings, the size and relative sizes of layers and regions may be exaggerated for clarity.
It will be understood that when an element or layer is referred to as being "on," "connected to" or "coupled to" another element or layer, it can be directly on, connected or coupled to the other element or layer or intervening elements or layers may be present. In contrast, when an element is referred to as being "directly on," "directly connected to" or "directly coupled to" another element or layer, there are no intervening elements or layers present. Like numbers refer to like elements throughout. As used herein, the term "and/or" includes any and all combinations of one or more of the associated listed items.
It will be understood that, although the terms first, second, third, etc. may be used herein to describe various elements, components, regions, layers and/or sections, these elements, components, regions, layers and/or sections should not be limited by these terms. These terms are only used to distinguish one element, component, region, layer or section from another element, component, region, layer or section. Thus, a first element, a first component, a first region, a first layer and/or a first portion discussed below could be termed a second element, a second component, a second region, a second layer and/or a second portion without departing from the teachings of the presently disclosed subject matter.
For ease of description, spatially relative terms (such as "beneath," "below," "lower," "above," "upper," and the like) may be used herein to describe the relationship of one element or feature to another element or feature as illustrated in the figures. It will be understood that the spatially relative terms are intended to encompass different orientations of the device in use or operation in addition to the orientation depicted in the figures. For example, if the device in the figures is turned over, elements described as "below" or "beneath" other elements or features would then be oriented "above" the other elements or features. Thus, the exemplary term "below" can encompass both an orientation of above and below. The device may be otherwise oriented (rotated 90 degrees or at other orientations) and the spatially relative descriptors used herein interpreted accordingly.
Likewise, for ease of description, electrical terms (such as "high," "low," "pull-up," "pull-down," "1," "0," etc.) may be used herein to describe voltage levels or currents relative to other voltage levels or another element or feature as illustrated in the figures. It will be understood that electrically relative terms are intended to encompass different reference voltages of the device in use or operation in addition to the voltages and currents depicted in the figures. For example, if a device or signal in the figures is flipped or other reference voltages, currents, or charges are used, then an element described as "high" or "pull-up" will then be "low" or "pull-down" compared to the new reference voltage or current. Thus, the exemplary term "high" may include a relatively low voltage or current or both a relatively high voltage or current. The device may additionally be based on different electrical frames of reference, the electrical relative descriptors used herein interpreted accordingly.
The terminology used herein is for the purpose of describing particular example embodiments only and is not intended to be limiting of the subject matter of the present disclosure. As used herein, the singular forms also are intended to include the plural forms as well, unless the context clearly indicates otherwise. It will be further understood that the terms "comprises" and/or "comprising," when used in this specification, specify the presence of stated features, integers, steps, operations, elements, and/or components, but do not preclude the presence or addition of one or more other features, integers, steps, operations, elements, components, and/or groups thereof.
Example embodiments are described herein with reference to cross-sectional illustrations that are schematic illustrations of idealized example embodiments (and intermediate structures). As such, variations from the shapes of the illustrations as a result, for example, of manufacturing techniques and/or tolerances, are to be expected. Thus, example embodiments should not be construed as limited to the particular shapes of regions illustrated herein but are to include deviations in shapes that result, for example, from manufacturing. For example, an implanted region illustrated as a rectangle will typically have rounded corners or curved features and/or a gradient of implant concentration at its edges rather than a binary change from implanted to non-implanted region. Likewise, a buried region formed by implantation may result in some implantation in the region between the buried region and the surface where the implantation occurs. Thus, the regions illustrated in the figures are schematic in nature and their shapes are not intended to illustrate the actual shape of a region of a device and are not intended to limit the scope of the presently disclosed subject matter.
Unless otherwise defined, all terms (including technical and scientific terms) used herein have the same meaning as commonly understood by one of ordinary skill in the art to which the presently disclosed subject matter belongs. It will be further understood that terms, such as those defined in commonly used dictionaries, should be interpreted as having a meaning that is consistent with their meaning in the context of the relevant art and will not be interpreted in an idealized or overly formal sense unless expressly so defined herein.
Hereinafter, example embodiments will be explained in detail with reference to the accompanying drawings.
Fig. 1A is a block diagram of an example embodiment of a system 100 according to the disclosed subject matter. In various embodiments, the system 100 (also referred to as an electronic system, electronic device, etc.) may include a computing device (such as, for example, a laptop computer, a desktop computer, a workstation, a system on a chip (SOC), a personal digital assistant, a smartphone, a tablet computer, and other suitable computers or virtual machines or virtual computing devices thereof).
In the illustrated embodiment, the system 100 may include a
In the illustrated embodiment, the system 100 may include a memory cache circuit or system (also referred to as a cache, cache memory) 104. Cache 104 may be configured to temporarily store data (e.g., data 133). In the illustrated embodiment, the cache 104 may include a level 1 (L1)
In the illustrated embodiment, system 100 may include a
In the illustrated embodiment, the system 100 may include a
In the illustrated embodiment, system 100 may include a
In various embodiments, the
In the illustrated embodiment,
In the illustrated embodiment, the
In various embodiments, the
In the illustrated embodiment,
In the illustrated embodiment, the
In the illustrated embodiment,
In such embodiments, instead of completing the
In various embodiments, the
In various embodiments, this determination may be made by examining or using snoop
In various embodiments, the
Conversely, if the
In such embodiments, by canceling the second request, the
In various embodiments,
In various example embodiments, the
Further, in various embodiments, the
In the illustrated embodiment, if the
In the illustrated embodiment, the
Fig. 1B is a block diagram of an example embodiment of a
In such embodiments, as described above,
In the illustrated embodiment, the
In the illustrated embodiment,
In the illustrated embodiment, when processing the speculative
In various embodiments, once the cache associated with the requesting processor (e.g., processor 102) is checked and found to be missing, the other caches (e.g.,
As described above, in various embodiments, the
Fig. 2 is a block diagram of an example embodiment of a
In various embodiments, the
In various embodiments,
In various embodiments, cache
In various embodiments,
In various embodiments, if speculative
Fig. 3 is a flow diagram of an example embodiment of a
Nevertheless, it is understood that the above are merely some illustrative examples and that the disclosed subject matter is not so limited. It is to be understood that the disclosed subject matter is not limited to the order or number of acts shown by
Fig. 4 is a schematic block diagram of an information handling system 400 that may include semiconductor devices formed in accordance with the principles of the disclosed subject matter.
Referring to FIG. 4, an information handling system 400 may include one or more devices constructed in accordance with the principles of the disclosed subject matter. In another embodiment, information handling system 400 may employ or perform one or more techniques in accordance with the principles of the disclosed subject matter.
In various embodiments, information handling system 400 may include computing devices (such as, for example, laptop computers, desktop computers, workstations, servers, blade servers, personal digital assistants, smart phones, tablet computers, and other suitable computers or virtual machines or virtual computing devices thereof). In various embodiments, information handling system 400 may be used by a user (not shown).
Information handling system 400 according to the disclosed subject matter may also include a Central Processing Unit (CPU), logic, or processor 410. In some embodiments, processor 410 may include one or more functional unit blocks (FUBs) or combinational logic blocks (CLBs) 415. In such embodiments, a combinational logic block may include various Boolean logic operations (e.g., NAND, NOR, XOR), stabilizing logic devices (flip-flops, latches), other logic devices, or combinations thereof. These combinational logic operations may be configured in a simple or complex manner to process input signals to achieve a desired result. It is understood that while some illustrative examples of synchronous combinational logic operations are described, the disclosed subject matter is not so limited and may include asynchronous operations or a mixture thereof. In one embodiment, the combinational logic operations may comprise a plurality of Complementary Metal Oxide Semiconductor (CMOS) transistors. In various embodiments, these CMOS transistors may be arranged into gates that perform the logical operations; it is understood, however, that other technologies may be used and are within the scope of the disclosed subject matter.
Information handling system 400 according to the disclosed subject matter may also include volatile memory 420 (e.g., Random Access Memory (RAM)). Information handling system 400 according to the disclosed subject matter may also include non-volatile memory 430 (e.g., a hard disk drive, optical memory, NAND, or flash memory). In some embodiments, volatile memory 420, non-volatile memory 430, or combinations or portions thereof, may be referred to as a "storage medium". In various embodiments, the volatile memory 420 and/or non-volatile memory 430 may be configured to store data in a semi-permanent or substantially permanent form.
In various embodiments, the information handling system 400 may include one or more network interfaces 440 configured to allow the information handling system 400 to become part of and communicate via a communication network. Examples of Wi-Fi protocols can include, but are not limited to: institute of Electrical and Electronics Engineers (IEEE)802.11g, IEEE 802.11 n. Examples of cellular protocols may include, but are not limited to: IEEE 802.16m (also known as wireless-MAN (metropolitan area network) advanced), Long Term Evolution (LTE) advanced, enhanced data rates for GSM evolution (EDGE), evolved high speed packet access (HSPA +). Examples of wired protocols may include, but are not limited to: IEEE 802.3 (also known as ethernet), fibre channel, power line communication (e.g., HomePlug, IEEE 1901). It is to be understood that the above are merely some illustrative examples, and the disclosed subject matter is not limited thereto.
Information handling system 400 according to the disclosed subject matter may also include a user interface unit 450 (e.g., a display adapter, a haptic interface, a human interface device). In various embodiments, this user interface unit 450 may be configured to receive input from a user and/or provide output to a user. Other kinds of devices may also be used to provide for interaction with a user; for example, feedback provided to the user may be any form of sensory feedback (e.g., visual feedback, auditory feedback, or tactile feedback), and input from the user may be received in any form, including acoustic, speech, or tactile input.
In various embodiments, the information handling system 400 may include one or more other devices or hardware components 460 (e.g., a display or monitor, a keyboard, a mouse, a camera, a fingerprint reader, a video processor). It is to be understood that the above are merely some illustrative examples, and the disclosed subject matter is not limited thereto.
The information handling system 400 according to the disclosed subject matter may also include one or more system buses 405. In such embodiments, the system bus 405 may be configured to communicatively connect the processor 410, the volatile memory 420, the non-volatile memory 430, the network interface 440, the user interface unit 450, and the one or more hardware components 460. Data processed by the processor 410 or data input from outside the non-volatile memory 430 may be stored in the non-volatile memory 430 or the volatile memory 420.
In various embodiments, information handling system 400 may include or execute one or more software components 470. In some embodiments, the software components 470 may include an Operating System (OS) and/or applications. In some embodiments, the OS may be configured to provide one or more services to applications and manage or act as an intermediary between the applications and various hardware components (e.g., processor 410, network interface 440) of the information handling system 400. In such embodiments, the information handling system 400 may include one or more native applications that may be installed locally (e.g., within the non-volatile memory 430) and configured to be executed directly by the processor 410 and to interact directly with the OS. In such embodiments, the native application may comprise pre-compiled machine executable code. In some embodiments, the native application may include a script interpreter (e.g., csh, AppleScript, AutoHotkey) or a virtual execution machine (VM) (e.g., Java virtual machine, microsoft common language runtime) configured to translate source or object code into executable code that is then executed by the processor 410.
The semiconductor devices described above may be packaged using various packaging processes. For example, a semiconductor device constructed in accordance with the principles of the disclosed subject matter may be packaged using any of the following: package On Package (POP) technology, Ball Grid Array (BGA) technology, Chip Scale Package (CSP) technology, leaded plastic chip carrier (PLCC) technology, plastic dual in-line package (PDIP) technology, waffle die package technology, die in wafer form technology, Chip On Board (COB) technology, ceramic dual in-line package (CERDIP) technology, Plastic Metric Quad Flat Pack (PMQFP) technology, Plastic Quad Flat Pack (PQFP) technology, small outline package (SOIC) technology, Shrink Small Outline Package (SSOP) technology, Thin Small Outline Package (TSOP) technology, Thin Quad Flat Pack (TQFP) technology, system-in-package (SIP) technology, multi-chip package (MCP) technology, wafer-level fabrication package (WFP) technology, wafer-level processing package on package (WSP) technology, and other technologies as will be known to those skilled in the art.
Method steps may be performed by one or more programmable processors executing a computer program to perform functions by operating on input data and generating output. Method steps can also be performed by, and an apparatus can be implemented as, special purpose logic circuitry, e.g., an FPGA (field programmable gate array) or an ASIC (application-specific integrated circuit).
In various embodiments, a computer-readable medium may include instructions that, when executed, cause an apparatus to perform at least a portion of the method steps. In some embodiments, the computer readable medium may be included in magnetic media, optical media, other media, or a combination thereof (e.g., CD-ROM, hard drive, read-only memory, flash drive). In such embodiments, the computer-readable medium may be an article of manufacture that is tangibly and non-transitory to implement.
While the principles of the disclosed subject matter have been described with reference to example embodiments, it will be apparent to those skilled in the art that various changes and modifications can be made therein without departing from the spirit or scope of the disclosed concepts. Accordingly, it should be understood that the above embodiments are not limiting, but merely illustrative. Thus, the scope of the disclosed concept is to be determined by the broadest permissible interpretation of the following claims and their equivalents, and shall not be restricted or limited by the foregoing description. It is, therefore, to be understood that the appended claims are intended to cover all such modifications and changes as fall within the scope of the embodiments.