Cache memory and management method thereof

Document No.: 1296032    Publication date: 2020-08-07

Note: This technology, "Cache memory and management method thereof", was designed and created by 林瑞源 and 卢彦儒 on 2019-01-30. Its main content is as follows: The application discloses a cache memory and a management method of the cache memory. The cache memory comprises a storage circuit, a buffer circuit and a control circuit. The buffer circuit stores data in a first-in first-out manner. The control circuit is coupled to the storage circuit and the buffer circuit, and is used for finding a storage space in the storage circuit and writing the data into the storage space.

1. A cache memory, comprising:

a storage circuit;

a buffer circuit for storing a data in a first-in first-out manner; and

a control circuit coupled to the storage circuit and the buffer circuit for finding a storage space in the storage circuit and writing the data into the storage space.

2. The cache memory as claimed in claim 1, wherein when a target data is written into said cache memory, said control circuit writes said target data into said buffer circuit without checking said storage circuit.

3. The cache memory as claimed in claim 1, wherein when said control circuit checks whether said cache memory contains a target data, said control circuit checks whether said storage circuit and said buffer circuit store said target data.

4. The cache memory as claimed in claim 1, wherein said buffer circuit is implemented as a register.

5. A method for managing a cache memory, the cache memory including a storage circuit and a buffer circuit, the buffer circuit storing data in a first-in first-out manner, the method comprising:

when a target data is written into the cache memory, writing the target data into the buffer circuit without checking the storage circuit; and

finding a storage space in the storage circuit, and writing the target data into the storage space.

6. The method of claim 5, further comprising:

when checking whether the cache memory contains the target data, checking whether the storage circuit and the buffer circuit store the target data.

7. The method of claim 5, wherein the buffer circuit has a capacity less than the capacity of the storage circuit.

8. A cache memory, comprising:

a first level cache memory including a first control circuit;

a second level cache memory including a second control circuit; and

a register coupled to the first control circuit and the second control circuit;

the first control circuit and the second control circuit respectively control the first-level cache memory and the second-level cache memory to operate in an inclusive mode or an exclusive mode by referring to a register value of the register.

9. The cache memory as claimed in claim 8, wherein said second level cache memory is shared by a first core and a second core of a processor, said first core executing a first program and said second core executing a second program, said register value corresponding to an inclusive mode when said first program and said second program are programs sharing instructions or data.

10. The cache memory as claimed in claim 8, wherein said second level cache memory is shared by a first core and a second core of a processor, said first core executing a first program and said second core executing a second program, said register value corresponding to an exclusive mode when said first program and said second program are not programs sharing instructions or data.

Technical Field

The present invention relates to cache memories, and more particularly to multi-level cache memories.

Background

FIG. 1 is a block diagram of an electronic device 100 including a processor 110, a first level (L1) cache 120, a second level (L2) cache 130, and a system memory 140. The L1 cache 120 and the L2 cache 130 are typically Static Random-Access Memories (SRAMs), while the system memory 140 is typically a Dynamic Random-Access Memory (DRAM). The L2 cache 130 includes a control circuit 132 and a storage circuit 136. The control circuit 132 writes data to the storage circuit 136 or reads data from the storage circuit 136. The data structures of the storage circuit 136 and the algorithms used by the control circuit 132 to access the storage circuit 136 are well known to those skilled in the art and are not described herein.

FIG. 2 is a partial flowchart of the operation of the electronic device 100 in the inclusive mode. During a data access, when the data misses in the L1 cache 120, the L1 cache 120 requests the data from the L2 cache 130 (step S210). In step S220, the control circuit 132 checks whether the data requested by the L1 cache 120 is stored in the storage circuit 136. Assuming that the data requested by the L1 cache 120 is not stored in the storage circuit 136 (i.e., an L2 cache miss), the control circuit 132 requests the data from the system memory 140 (step S230). Then, the L2 cache 130 retrieves the data from the system memory 140 (step S240), and the L2 cache 130 returns the data to the L1 cache 120 (step S250). After receiving the data returned from the L2 cache 130, the L1 cache 120 stores the data. Finally, the L2 cache 130 also stores the data in the storage circuit 136 (step S260). Because writing the data into the storage circuit 136 requires the control circuit 132 to check the tags of the storage circuit 136, the access to the storage circuit 136 is relatively slow, so the control circuit 132 may not be able to immediately process the next command from the L1 cache 120, causing the processor 110 to stall.
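
For illustration only, the following sketch models the conventional inclusive-mode miss path described above (steps S210 to S260) in software; the class name, the dictionary-based storage, and the two assumed busy cycles are hypothetical and not taken from the patent.

```python
# Behavioral sketch of the conventional inclusive-mode miss path (S210-S260).
# Names and cycle counts are illustrative assumptions, not the patent's design.
class ConventionalInclusiveL2:
    def __init__(self):
        self.storage = {}        # storage circuit 136: tag -> line data
        self.busy_cycles = 0     # cycles during which a new command must wait

    def read(self, tag, system_memory):
        if tag in self.storage:              # S220: check the storage circuit
            return self.storage[tag]         # L2 hit
        line = system_memory[tag]            # S230/S240: fetch from system memory
        # S260: writing the line into the storage circuit requires a tag check,
        # so the control circuit stays busy and a following command may stall.
        self.busy_cycles += 2
        self.storage[tag] = line
        return line                          # S250: return the line to the L1 cache

system_memory = {0x100: "line@0x100"}
l2 = ConventionalInclusiveL2()
print(l2.read(0x100, system_memory), "busy cycles:", l2.busy_cycles)
```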

FIG. 3 is a partial flowchart of the operation of the electronic device 100 in the exclusive mode. During a data access, when the data misses in the L1 cache 120, the L1 cache 120 requests the data from the L2 cache 130 (step S310). In step S320, the control circuit 132 checks whether the data requested by the L1 cache 120 is stored in the storage circuit 136. Assuming that the data requested by the L1 cache 120 is stored in the storage circuit 136 (i.e., an L2 cache hit), the control circuit 132 returns the data to the L1 cache 120 (step S330). Next, the L1 cache 120 evicts a line of data (line data) into the L2 cache 130 (step S340). In step S340, the control circuit 132 checks the tags of the storage circuit 136 and writes the data into the appropriate location of the storage circuit 136. Since the access to the storage circuit 136 is relatively slow, the control circuit 132 may not be able to immediately process the next command, so the processor 110 may stall.
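
A matching sketch of the conventional exclusive-mode hit path (steps S310 to S340), again with hypothetical names and an assumed busy-cycle count: returning the hit line to the L1 cache is quick, but accepting the line evicted by the L1 cache requires a tag check and a write into the storage circuit.

```python
# Behavioral sketch of the conventional exclusive-mode hit path (S310-S340).
# Names and cycle counts are illustrative assumptions, not the patent's design.
class ConventionalExclusiveL2:
    def __init__(self):
        self.storage = {}       # storage circuit 136: tag -> line data
        self.busy_cycles = 0

    def read_and_accept_evicted(self, tag, evicted_tag, evicted_line):
        line = self.storage.pop(tag, None)   # S320/S330: hit, move line to L1
        # S340: the line evicted by the L1 cache is written into the storage
        # circuit after a tag check, keeping the control circuit busy.
        self.busy_cycles += 2
        self.storage[evicted_tag] = evicted_line
        return line

l2 = ConventionalExclusiveL2()
l2.storage[0x200] = "line@0x200"
print(l2.read_and_accept_evicted(0x200, 0x300, "line@0x300"), "busy:", l2.busy_cycles)
```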

Disclosure of Invention

In view of the foregoing, it is an object of the present disclosure to provide a cache memory and a method for managing the cache memory, so as to improve the performance of an electronic device.

The application discloses a cache memory, which comprises a storage circuit, a buffer circuit and a control circuit. The buffer circuit stores a data in a first-in first-out manner. The control circuit is coupled to the storage circuit and the buffer circuit, and is used for finding a storage space in the storage circuit and writing the data into the storage space.

The application also discloses a management method of the cache memory, the cache memory comprises a storage circuit and a buffer circuit, the buffer circuit stores data in a first-in first-out mode, the method comprises: when a target data is written into the cache memory, writing the target data into the buffer circuit without checking the storage circuit; and finding a storage space in the storage circuit and writing the target data into the storage space.

The application also discloses a cache memory, which comprises a first-level cache memory, a second-level cache memory and a register. The first level cache includes a first control circuit. The second level cache includes a second control circuit. The register is coupled to the first control circuit and the second control circuit. The first control circuit and the second control circuit respectively control the first-level cache memory and the second-level cache memory to operate in an inclusive mode or an exclusive mode by referring to a register value of the register.

By providing the buffer circuit in the cache memory, the access speed of the cache memory can be improved. Compared with the prior art, an electronic device adopting the cache memory can reduce the probability of processor stalls. Furthermore, the cache memory of the present disclosure is easily switchable between an inclusive mode and an exclusive mode.

The features, implementations, and technical effects of the present disclosure will be described in detail below with reference to the accompanying drawings.

Drawings

FIG. 1 is a diagram of a conventional electronic device;

FIG. 2 is a partial flowchart illustrating an operation of a conventional electronic device in an inclusive mode;

FIG. 3 is a partial flowchart of a conventional electronic device operating in an exclusive mode;

FIG. 4 is an architecture diagram of an embodiment of an electronic device according to the present disclosure;

FIG. 5 is a flowchart of an embodiment of a method for managing a cache memory according to the present disclosure;

FIG. 6 is a flowchart of an embodiment of step S540 of FIG. 5; and

FIG. 7 is an architecture diagram of another embodiment of an electronic device of the present disclosure.

Description of the symbols

100, 400, 70: electronic device

110, 410, 72: processor

120, 420, 724, 734: L1 cache memory

130, 430, 74: L2 cache memory

140, 440: system memory

132, 432, 7241, 7341, 742: control circuit

136, 436, 7242, 7342, 746: storage circuit

434, 744: buffer circuit

720, 730: core

722, 732: processing unit

76: register

S210 to S260, S310 to S340, S510 to S580, S610 to S640: steps

Detailed Description

The technical terms in the following description have their ordinary meanings in the technical field; where a term is explained or defined in this specification, the explanation or definition given in this specification controls.

The disclosure includes a cache memory and a method of managing the cache memory. Since some of the elements included in the cache memory of the present disclosure may, individually, be known elements, details of such known elements are omitted from the following description without affecting the full disclosure and feasibility of the apparatus embodiments. In addition, some or all of the processes of the cache memory management method of the present disclosure may be implemented in software and/or firmware and may be executed by the cache memory of the present disclosure or its equivalent; without affecting the full disclosure and feasibility of the method embodiments, the following description of the method embodiments focuses on the steps rather than the hardware.

FIG. 4 is an architecture diagram of an embodiment of an electronic device 400 including a processor 410, an L1 cache 420, an L2 cache 430, and a system memory 440. The L2 cache 430 includes a control circuit 432, a buffer circuit 434, and a storage circuit 436. The buffer circuit 434 stores data in a first-in first-out (FIFO) manner, whereas the storage circuit 436 does not store data in a first-in first-out manner.
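
As a rough software analogue (not the patent's hardware implementation), the buffer circuit 434 can be pictured as a small fixed-depth FIFO queue placed in front of the tag-indexed storage circuit; the depth of four entries below is an arbitrary assumption.

```python
from collections import deque

# Rough software analogue of the buffer circuit 434: a small FIFO that holds
# (tag, line) entries before they are moved into the storage circuit 436.
# The depth of 4 is an arbitrary assumption for illustration.
class FifoBuffer:
    def __init__(self, depth=4):
        self.depth = depth
        self.entries = deque()

    def full(self):
        return len(self.entries) >= self.depth

    def push(self, tag, line):
        if self.full():
            raise RuntimeError("buffer full; drain into the storage circuit first")
        self.entries.append((tag, line))   # no tag check of the storage circuit

    def pop_oldest(self):                  # first-in first-out order
        return self.entries.popleft() if self.entries else None

buf = FifoBuffer()
buf.push(0x40, "line@0x40")
print(buf.pop_oldest())
```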

FIG. 5 is a flowchart of an embodiment of a cache management method according to the present disclosure; the flow of FIG. 5 is applicable to both the inclusive mode and the exclusive mode. When the control circuit 432 obtains a target data from the L1 cache 420 or the system memory 440 and has to store it, the control circuit 432 writes the target data into the buffer circuit 434 without checking the tags of the storage circuit 436 (step S510). Next, the control circuit 432 determines whether the L2 cache 430 is in an idle state (step S520). If not, the control circuit 432 further determines whether another target data needs to be written into the L2 cache 430 (step S530). If yes, the control circuit 432 writes the other target data into the buffer circuit 434 (step S510); if not, the control circuit 432 performs a search and/or read operation (including accessing the buffer circuit 434 and/or the storage circuit 436) (step S540) and then returns to step S520.
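
A minimal sketch, under assumed names, of the dispatch just described: a write always enters the FIFO buffer without a tag check (step S510), while a read is serviced by consulting both the buffer circuit and the storage circuit (step S540, detailed with FIG. 6 below).

```python
from collections import deque

# Minimal sketch of steps S510-S540 (names assumed, not from the patent):
# writes go straight into the FIFO buffer without checking the storage
# circuit's tags; reads look in the buffer first, then the storage circuit.
class L2Sketch:
    def __init__(self):
        self.buffer = deque()   # buffer circuit 434: FIFO of (tag, line)
        self.storage = {}       # storage circuit 436: tag -> line

    def write(self, tag, line):            # step S510: no tag check needed
        self.buffer.append((tag, line))

    def read(self, tag):                   # step S540 (detailed in FIG. 6)
        for t, line in self.buffer:
            if t == tag:
                return line
        return self.storage.get(tag)

l2 = L2Sketch()
l2.write(0x10, "line@0x10")
print(l2.read(0x10))   # found in the buffer before being drained
```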

When the L2 cache 430 is idle (yes in step S520), the control circuit 432 determines whether the buffer circuit 434 is empty (step S550). If the buffer circuit 434 does not store any data (i.e., yes in step S550), the control circuit 432 returns to step S520. If the buffer circuit 434 is not empty (i.e., no in step S550), the control circuit 432 finds a storage space in the storage circuit 436 (step S560), and then reads the target data from the buffer circuit 434 and writes the target data into the storage circuit 436 (step S570). In other words, steps S560 and S570 aim to move the target data from the buffer circuit 434 to the storage circuit 436; after the move, the target data is present only in the storage circuit 436 and no longer in the buffer circuit 434. That is, the buffer circuit 434 and the storage circuit 436 do not store the same line of data at the same time. After step S570 is completed, the control circuit 432 has finished writing the target data into the L2 cache 430 (step S580), and the flow returns to step S520.
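
The move performed in steps S550 to S580 can be sketched as follows; the find_space callback stands in for step S560 and, like the other names, is an assumption for illustration.

```python
from collections import deque

# Sketch of steps S550-S580 (assumed names): while the L2 cache is idle, the
# oldest buffer entry is moved into the storage circuit, so a line never sits
# in both the buffer circuit and the storage circuit at the same time.
def drain_one(buffer, storage, find_space):
    if not buffer:                        # S550: buffer empty, nothing to move
        return False
    tag, line = buffer.popleft()          # first-in first-out order
    victim = find_space(storage, tag)     # S560: empty slot or line to evict
    if victim is not None:
        storage.pop(victim, None)         # evict the chosen line
    storage[tag] = line                   # S570: write into the storage circuit
    return True                           # S580: write completed

buffer = deque([(0x10, "line@0x10")])
storage = {}
drain_one(buffer, storage, lambda s, t: None)   # no eviction needed here
print(storage, list(buffer))
```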

In step S560, the storage space may be an unoccupied space or a space occupied by data to be evicted. The control circuit 432 may find the data to be evicted according to an algorithm (e.g., Least Recently Used (LRU)) and the tags in the storage circuit 436.
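
A minimal sketch of one way step S560 could pick a storage space using the LRU policy mentioned above; the four-way set and the OrderedDict bookkeeping are illustrative assumptions rather than the patent's circuit.

```python
from collections import OrderedDict

# Sketch of step S560 with an LRU policy: within one cache set, prefer an
# unoccupied way; otherwise evict the least recently used line. The 4-way set
# and the OrderedDict bookkeeping are illustrative assumptions.
def find_space(cache_set, ways=4):
    if len(cache_set) < ways:
        return None                      # an unoccupied way exists
    return next(iter(cache_set))         # oldest entry = least recently used

cache_set = OrderedDict([(0x1, "a"), (0x2, "b"), (0x3, "c"), (0x4, "d")])
cache_set.move_to_end(0x1)               # tag 0x1 was just used again
print(find_space(cache_set))             # -> 0x2, the LRU line to evict
```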

As can be seen from the flowchart of FIG. 5, the buffer circuit 434 may store multiple target data at the same time, and the control circuit 432 reads the target data from the buffer circuit 434 and writes them into the storage circuit 436 sequentially in a first-in first-out manner. In some embodiments, the data in the buffer circuit 434 have the same format as the data in the storage circuit 436 (e.g., both are in the form of line data) to simplify step S570.

By comparison, since step S510 does not require the control circuit 432 to check the tags of the storage circuit 436 to find a suitable storage space (either an unoccupied space or a space occupied by data to be evicted), step S510 can be completed in only one cycle of the system clock, whereas writing the target data directly into the storage circuit 436 would require the control circuit 432 to check the tags first, taking at least two cycles of the system clock (depending on the size of the storage circuit 436). In other words, the buffer circuit 434 increases the speed of the L2 cache 430.

The idle state of step S520 includes (1) the case where the control circuit 432 has no pending read/write operations, and (2) the period, when the L2 cache 430 misses, from when the control circuit 432 requests data from the system memory 440 until the system memory 440 returns the data. Since the number of system clock cycles required for one access of the system memory 440 is usually much greater than the number of system clock cycles required for the control circuit 432 to write data into the storage circuit 436, the control circuit 432 has sufficient time to perform steps S560 and S570 in case (2).

In summary, since the operation of the L2 cache 430 requires only one cycle of the system clock from the perspective of the processor 410, the processor 410 does not stall whether the L2 cache 430 misses in the inclusive mode or hits in the exclusive mode, thereby significantly increasing the performance of the electronic device 400.

FIG. 6 is a flowchart illustrating an embodiment of step S540 of FIG. 5. When the L1 cache 420 misses and requests data from the L2 cache 430, the control circuit 432 checks whether the buffer circuit 434 and the storage circuit 436 store the target data (step S610). If so (i.e., the buffer circuit 434 or the storage circuit 436 stores the target data; yes in step S620), the control circuit 432 reads the target data and returns it to the L1 cache 420 (step S630). If not (i.e., neither the buffer circuit 434 nor the storage circuit 436 stores the target data; no in step S620), the control circuit 432 requests the data from the system memory 440 (step S640).
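
The lookup order of steps S610 to S640 can be sketched as follows, with hypothetical names; only when neither the buffer circuit nor the storage circuit holds the target data is the system memory accessed.

```python
# Sketch of steps S610-S640 (assumed names): on an L1 miss, the L2 control
# circuit checks both the buffer circuit and the storage circuit; only if
# neither holds the target line is the system memory requested.
def l2_lookup(tag, buffer, storage, system_memory):
    line = dict(buffer).get(tag)             # S610: check the FIFO buffer
    if line is None:
        line = storage.get(tag)              # S610: check the storage circuit
    if line is not None:                     # S620 yes -> S630: return to L1
        return line, "L2 hit"
    return system_memory[tag], "L2 miss"     # S640: request the system memory

buffer = [(0x20, "line@0x20")]
storage = {0x30: "line@0x30"}
system_memory = {0x40: "line@0x40"}
print(l2_lookup(0x20, buffer, storage, system_memory))
print(l2_lookup(0x40, buffer, storage, system_memory))
```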

FIG. 7 is an architecture diagram of an electronic device 70 including a processor 72, an L2 cache 74, and a register 76. The processor 72 includes a core 720 and a core 730. The core 720 includes a processing unit 722 and an L1 cache 724; the L1 cache 724 includes a control circuit 7241 and a storage circuit 7242. The core 730 includes a processing unit 732 and an L1 cache 734; the L1 cache 734 includes a control circuit 7341 and a storage circuit 7342. Briefly, the processor 72 has a multi-core architecture: the core 720 and the core 730 have their own L1 caches (724 and 734) and share the L2 cache 74. The L2 cache 74 includes a control circuit 742, a buffer circuit 744, and a storage circuit 746, whose functions are similar to those of the control circuit 432, the buffer circuit 434, and the storage circuit 436, respectively, and are therefore not described again. The control circuit 7241, the control circuit 7341, and the control circuit 742 are coupled to the register 76 and read the register value of the register 76.

The control circuit 7241 of the L1 cache 724, the control circuit 7341 of the L1 cache 734, and the control circuit 742 of the L2 cache 74 control the L1 cache 724, the L1 cache 734, and the L2 cache 74, respectively, to operate in the inclusive mode or the exclusive mode by referring to the register value of the register 76. In other words, the L1 caches and the L2 cache are programmably switchable between the inclusive mode and the exclusive mode. Thus, the electronic device 70 does not need to fix the operation mode of the L1 cache 724, the L1 cache 734, and the L2 cache 74 at the design stage; instead, the user can set the register value of the register 76 according to the actual application (i.e., dynamic adjustment) after the circuit is completed. In some embodiments, the register 76 may be a control register of the processor 72.
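
As an illustration of this programmable switching, the following sketch assumes the register encoding used in the examples below (1 for the inclusive mode, 0 for the exclusive mode); the class and function names are hypothetical.

```python
# Sketch of register-controlled mode selection. The encoding (1 = inclusive,
# 0 = exclusive) follows the examples in the text and is an assumption.
INCLUSIVE, EXCLUSIVE = 1, 0

class ModeRegister:
    def __init__(self, value=INCLUSIVE):
        self.value = value

def current_mode(register):
    # Each control circuit (7241, 7341, 742) reads the same register value
    # and operates its cache level accordingly.
    return "inclusive" if register.value == INCLUSIVE else "exclusive"

reg = ModeRegister()
print(current_mode(reg))     # cores running programs that share instructions/data
reg.value = EXCLUSIVE
print(current_mode(reg))     # cores running independent programs
```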

The following is an exemplary application of the electronic device 70.

Example one: when the core 720 and the core 730 perform parallel processing (i.e., execute the same program), the register 76 may be set to a first value (e.g., 1) so that the L1 cache 724, the L1 cache 734, and the L2 cache 74 operate in the inclusive mode.

Example two: when the core 720 and the core 730 execute a first program and a second program, respectively, and the first program and the second program share instructions and/or data, the register 76 may be set to the first value (e.g., 1) so that the L1 cache 724, the L1 cache 734, and the L2 cache 74 operate in the inclusive mode.

Example three: when the core 720 and the core 730 execute the first program and the second program, respectively, and the first program and the second program do not share instructions and/or data (i.e., the first program and the second program are independent programs), the register 76 may be set to a second value (e.g., 0) so that the L1 cache 724, the L1 cache 734, and the L2 cache 74 operate in the exclusive mode.

In example three, the exclusive mode allows the L1 cache 724, the L1 cache 734, and the L2 cache 74 to store more instructions and/or data, so the performance of the electronic device 70 can be improved.

In some embodiments, the control circuit 432, the control circuit 7241, the control circuit 7341, and the control circuit 742 may be implemented by a finite state machine (including a plurality of logic circuits).

Because those skilled in the art can appreciate the implementation details and variations of the method embodiments of the present disclosure from the disclosure of the apparatus embodiments, repeated descriptions are omitted here to avoid redundancy without affecting the disclosure requirements and the implementability of the method embodiments. It should be noted that the shapes, sizes, proportions, and sequence of steps of the elements in the drawings are merely illustrative and are not intended to limit the present disclosure.

Although embodiments of the present disclosure have been described above, they are not intended to limit the present disclosure. Those skilled in the art can make variations on the technical features of the present disclosure according to its explicit or implicit contents, and all such variations may fall within the scope of patent protection sought by the present disclosure. In other words, the scope of patent protection of the present disclosure is defined by the claims of this specification.
