Techniques for demoting cache lines to a shared cache
Reading note: This technology, Techniques for demoting cache lines to a shared cache, was designed by E. Tamir, B. Richardson, N. Power, A. Cunningham, D. Hunt, K. Devey, and C. Wei on 2019-05-30. Its main content is as follows: Techniques for demoting cache lines to a shared cache include a computing device having at least one processor with a plurality of cores, a cache memory having a core local cache and a shared cache, and a cache line demotion device. A processor core of a processor of the computing device is configured to retrieve at least a portion of the data of a received network packet and move the data into one or more core local cache lines of the core local cache. The processor core is further configured to perform a processing operation on the data and, after the processing operation has been completed, transmit a cache line demotion command to the cache line demotion device. The cache line demotion device is configured to perform a cache line demotion operation to demote the data from the core local cache lines to shared cache lines of the shared cache. Other embodiments are described herein.
1. A computing device for demoting a cache line to a shared cache, the computing device comprising:
one or more processors, wherein each of the one or more processors comprises a plurality of processor cores;
a cache memory, wherein the cache memory comprises a core local cache and a shared cache, wherein the core local cache comprises a plurality of core local cache lines, and wherein the shared cache comprises a plurality of shared cache lines;
a cache line demotion device; and
a Host Fabric Interface (HFI) to receive a network packet,
wherein a processor core of a processor of the one or more processors is to:
retrieve at least a portion of the data of the received network packet, wherein to retrieve the data comprises to move the data into one or more of the plurality of core local cache lines;
perform one or more processing operations on the data; and
after the one or more processing operations on the data have been completed, transmit a cache line demotion command to the cache line demotion device, and
wherein the cache line demotion device is to perform a cache line demotion operation to demote the data from the one or more core local cache lines to one or more shared cache lines of the shared cache in response to having received the cache line demotion command.
2. The computing device of claim 1, wherein the processor core is further to determine whether a size of the received network packet is greater than a packet size threshold after the one or more processing operations on the data have been completed, wherein to transmit the cache line demotion command to the cache line demotion device comprises to transmit the cache line demotion command after determining that the size of the received network packet is greater than the packet size threshold.
3. The computing device of claim 2, wherein the processor core is further to transmit a cache line demotion instruction to a cache manager of the cache memory after having determined that the size of the received network packet is less than or equal to the packet size threshold, and wherein the cache manager is to demote data from the one or more core local cache lines to the one or more shared cache lines of the shared cache based on the cache line demotion instruction, wherein the cache line demotion instruction bypasses the cache line demotion device.
4. The computing device of claim 3, wherein to transmit a cache line demotion instruction comprises to transmit one or more cache line identifiers corresponding to the one or more shared cache lines.
5. The computing device of claim 1, wherein to perform a cache line demotion operation comprises to perform a read request or a direct memory access.
6. The computing device of claim 1, wherein the cache line demotion command comprises an indication of a core local cache line associated with the received network packet to be demoted to the shared cache.
7. The computing device of claim 1, wherein the cache line demotion device comprises one of a copy engine, a Direct Memory Access (DMA) device operable to copy data, or an offload device operable to perform read operations.
8. The computing device of claim 1, wherein to transmit a cache line demotion command comprises to transmit one or more cache line identifiers corresponding to the one or more shared cache lines.
9. A computing device for demoting a cache line to a shared cache, the computing device comprising:
means for retrieving, by a processor of a computing device, at least a portion of data of a network packet received by a Host Fabric Interface (HFI) of the computing device, wherein retrieving the data comprises moving the data into one or more core local cache lines of a plurality of core local cache lines of a core local cache of the computing device, and wherein the processor comprises a plurality of processor cores;
means for performing, by a processor core of the plurality of processor cores, one or more processing operations on data;
means for transmitting, by the processor and after the one or more processing operations on the data have been completed, a cache line demotion command to a cache line demotion device of the computing device; and
means for performing, by the cache line demotion device and in response to having received the cache line demotion command, a cache line demotion operation to demote the data from the one or more core local cache lines to one or more shared cache lines of a shared cache of the computing device.
10. The computing device of claim 9, further comprising means for determining whether a size of the received network packet is greater than a packet size threshold after the one or more processing operations on the data have been completed, wherein means for transmitting a cache line demotion command to a cache line demotion device comprises means for transmitting the cache line demotion command after determining that the size of the received network packet is greater than a packet size threshold.
11. The computing device of claim 10, further comprising means for transmitting a cache line demotion instruction to a cache manager of a cache memory including the core local cache and the shared cache after it has been determined that the size of the received network packet is less than or equal to the packet size threshold, and wherein the cache manager is to demote data from the one or more core local cache lines to the one or more shared cache lines of the shared cache based on the cache line demotion instruction.
12. The computing device of claim 11, wherein means for transmitting a cache line demotion instruction comprises means for transmitting one or more cache line identifiers corresponding to the one or more shared cache lines.
13. The computing device of claim 9, wherein means for performing a cache line demotion operation comprises means for performing a read request or a direct memory access.
14. The computing device of claim 9, wherein means for transmitting a cache line demotion command comprises means for transmitting one or more cache line identifiers corresponding to the one or more shared cache lines.
15. A method for demoting a cache line to a shared cache, the method comprising:
retrieving, by a processor of a computing device, at least a portion of data of a network packet received by a Host Fabric Interface (HFI) of the computing device, wherein retrieving the data comprises moving the data into one or more core local cache lines of a plurality of core local cache lines of a core local cache of the computing device, and wherein the processor comprises a plurality of processor cores;
performing, by a processor core of the plurality of processor cores, one or more processing operations on data;
transmitting, by the processor core and after the one or more processing operations on the data have been completed, a cache line demotion command to a cache line demotion device of the computing device; and
performing, by the cache line demotion device and in response to having received the cache line demotion command, a cache line demotion operation to demote the data from the one or more core local cache lines to one or more shared cache lines of a shared cache of the computing device.
16. The method of claim 15, further comprising determining whether a size of the received network packet is greater than a packet size threshold after the one or more processing operations on the data have been completed, wherein transmitting the cache line demotion command to the cache line demotion device comprises transmitting the cache line demotion command after determining that the size of the received network packet is greater than the packet size threshold.
17. The method of claim 16, further comprising:
transmitting, by the processor core and after having determined that the size of the received network packet is less than or equal to the packet size threshold, a cache line demotion instruction to a cache manager of a cache memory including the core local cache and the shared cache; and
demoting, by the cache manager, data from the one or more core local cache lines to the one or more shared cache lines of the shared cache based on a cache line demotion instruction.
18. The method of claim 17, wherein transmitting a cache line demotion instruction comprises transmitting one or more cache line identifiers corresponding to the one or more shared cache lines.
19. The method of claim 15, wherein performing the cache line demotion operation comprises performing one of a read request or a direct memory access.
20. The method of claim 15, wherein transmitting a cache line demotion command comprises transmitting one or more cache line identifiers corresponding to the one or more shared cache lines.
21. A computing device for demoting a cache line to a shared cache, the computing device comprising:
circuitry for retrieving, by a processor of a computing device, at least a portion of data of a network packet received by a Host Fabric Interface (HFI) of the computing device, wherein retrieving the data comprises moving the data into one or more core local cache lines of a plurality of core local cache lines of a core local cache of the computing device, and wherein the processor comprises a plurality of processor cores;
circuitry to perform one or more processing operations on data by a processor core of the plurality of processor cores;
circuitry for transmitting, by the processor core and after the one or more processing operations on the data have been completed, a cache line demotion command to a cache line demotion device of the computing device; and
means for performing, by the cache line demotion device and in response to having received the cache line demotion command, a cache line demotion operation to demote the data from the one or more core local cache lines to one or more shared cache lines of a shared cache of the computing device.
22. The computing device of claim 21, further comprising circuitry to determine whether a size of the received network packet is greater than a packet size threshold after the one or more processing operations on the data have been completed, wherein to transmit a cache line demotion command to a cache line demotion device comprises to transmit the cache line demotion command after determining that the size of the received network packet is greater than the packet size threshold.
23. The computing device of claim 22, further comprising:
circuitry for transmitting, by the processor core and after having determined that the size of the received network packet is less than or equal to the packet size threshold, a cache line demotion instruction to a cache manager of a cache memory comprising the core local cache and the shared cache; and
circuitry for demoting, by the cache manager, data from the one or more core local cache lines to the one or more shared cache lines of the shared cache based on a cache line demotion instruction.
24. The computing device of claim 23, wherein to transmit a cache line demotion instruction comprises to transmit one or more cache line identifiers corresponding to the one or more shared cache lines.
25. The computing device of claim 21, wherein means for performing a cache line demotion operation comprises means for performing one of a read request or a direct memory access.
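The size-based choice recited in claims 2 and 3 (and mirrored in claims 10 and 11, 16 and 17, and 22 and 23) can be pictured with a minimal Python sketch. The threshold value, the function name, and the two path labels are hypothetical and chosen only for illustration; the claims do not specify a concrete threshold.

```python
from enum import Enum

# Hypothetical threshold in bytes; the claims do not give a concrete value.
PACKET_SIZE_THRESHOLD = 256


class DemotionPath(Enum):
    DEMOTION_DEVICE = "command to cache line demotion device"
    CACHE_MANAGER = "instruction to cache manager"


def select_demotion_path(packet_size: int,
                         threshold: int = PACKET_SIZE_THRESHOLD) -> DemotionPath:
    """Pick how a processed packet's cache lines are demoted.

    Packets larger than the threshold are offloaded to the demotion device;
    packets at or below the threshold use a demotion instruction sent to the
    cache manager, bypassing the device.
    """
    if packet_size > threshold:
        return DemotionPath.DEMOTION_DEVICE
    return DemotionPath.CACHE_MANAGER
```

Under these assumptions, `select_demotion_path(1500)` selects the demotion device, while a packet whose size equals the threshold falls on the cache manager path, matching the "less than or equal to" wording of claim 3.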
Background
Modern computing devices have become ubiquitous tools for personal, business, and social uses. As such, many modern computing devices are capable of connecting to various data networks, including the Internet, to transmit and receive data communications at different rates over the various data networks. To facilitate communication between computing devices, data networks typically include one or more network computing devices (e.g., computing servers, storage servers, etc.) to route communications entering/leaving the network (e.g., north-south network traffic) and communications between network computing devices in the network (e.g., east-west network traffic), for example via switches, routers, etc. In current packet-switched network architectures, data is transmitted in the form of network packets between networked computing devices. At a high level, data is packetized into network packets at one computing device, and the resulting packets are transmitted over a network to another computing device via a transmitting device (e.g., a Network Interface Controller (NIC) of the computing device).
Upon receipt of a network packet, the computing device typically performs one or more processing operations (e.g., security, Network Address Translation (NAT), load balancing, Deep Packet Inspection (DPI), Transmission Control Protocol (TCP) optimization, caching, Internet Protocol (IP) management, etc.) to determine what the computing device will do with the network packet (e.g., drop the network packet, process/store at least a portion of the network packet, forward the network packet, etc.). To do so, such packet processing is often performed in a packet processing pipeline (e.g., a service function chain) in which at least a portion of the data of a network packet is passed from one processor core to another as it is processed. However, during such packet processing, stalls may occur due to cross-core snoops, and cache pollution with stale data may be a problem.
Drawings
In the accompanying drawings, the concepts described herein are illustrated by way of example and not by way of limitation. For simplicity and clarity of illustration, elements illustrated in the figures have not necessarily been drawn to scale. Where considered appropriate, reference labels have been repeated among the figures to indicate corresponding or analogous elements.
FIG. 1 is a simplified block diagram of at least one embodiment of a system for demoting a cache line to a shared cache, the system comprising a source computing device and a network computing device communicatively coupled via a network;
FIG. 2 is a simplified block diagram of at least one embodiment of an environment of a network computing device of the system of FIG. 1;
FIG. 3 is a simplified flow diagram of at least one embodiment of a method for demoting a cache line to a shared cache, which may be performed by the network computing devices of FIGS. 1 and 2; and
FIGS. 4 and 5 are simplified block diagrams of at least one embodiment of another environment of the network computing device of FIGS. 1 and 2 for demoting a cache line to a shared cache.
Detailed Description
While the concepts of the present disclosure are susceptible to various modifications and alternative forms, specific embodiments thereof have been shown by way of example in the drawings and will herein be described in detail. It should be understood, however, that there is no intent to limit the concepts of the present disclosure to the particular forms disclosed, but on the contrary, the intention is to cover all modifications, equivalents, and alternatives consistent with the present disclosure and the appended claims.
References in the specification to "one embodiment," "an illustrative embodiment," etc., indicate that the embodiment described may include a particular feature, structure, or characteristic, but every embodiment may or may not necessarily include that particular feature, structure, or characteristic. Moreover, such phrases are not necessarily referring to the same embodiment. Further, when a particular feature, structure, or characteristic is described in connection with an embodiment, it is submitted that it is within the knowledge of one skilled in the art to effect such a feature, structure, or characteristic in connection with other embodiments, whether or not explicitly described. Further, it should be appreciated that an item included in a list in the form of "at least one of A, B, and C" may mean (A); (B); (C); (A and B); (A and C); (B and C); or (A, B, and C). Similarly, an item listed in the form of "at least one of A, B, or C" may mean (A); (B); (C); (A and B); (A and C); (B and C); or (A, B, and C).
In some cases, the disclosed embodiments may be implemented in hardware, firmware, software, or any combination thereof. The disclosed embodiments may also be implemented as instructions carried by or stored on one or more transitory or non-transitory machine-readable (e.g., computer-readable) storage media, which may be read and executed by one or more processors. A machine-readable storage medium may be embodied as any storage device, mechanism, or other physical structure for storing or transmitting information in a form readable by a machine (e.g., a volatile or non-volatile memory, a media disk, or other media device).
In the drawings, some structural or method features may be shown in a particular arrangement and/or ordering. However, it should be appreciated that such a particular arrangement and/or ordering may not be required. Rather, in some embodiments, such features may be arranged in a manner and/or order different from that shown in the illustrative figures. Moreover, the inclusion of a structural or method feature in a particular figure is not intended to imply that the feature is required in all embodiments; in some embodiments, it may not be included or may be combined with other features.
Referring now to FIG. 1, in an illustrative embodiment, a system 100 for demoting a cache line to a shared cache includes a
In use, the
Oftentimes, more than one processing operation (e.g., security, Network Address Translation (NAT), load balancing, Deep Packet Inspection (DPI), Transmission Control Protocol (TCP) optimization, caching, Internet Protocol (IP) management, etc.) is performed by a network computing device, with each operation typically being performed by a different processor core in a packet processing pipeline, such as a service function chain. Thus, when one processor core completes its processing, the data it accessed needs to be released (e.g., demoted to the shared cache 116) so that the next processor core can perform its designated processing operation.
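The hand-off between pipeline stages described above can be illustrated with a toy Python model. This is only a sketch: the `CacheModel` class, its method names, and the "cross-core snoop" label are hypothetical constructs for illustration and do not model real cache coherence hardware.

```python
from dataclasses import dataclass, field


@dataclass
class CacheModel:
    """Toy model: one core-local cache per core plus one shared cache."""
    core_local: dict = field(default_factory=dict)  # core id -> {line id: data}
    shared: dict = field(default_factory=dict)      # line id -> data

    def load(self, core: int, line_id: int, data: bytes) -> None:
        """Place data into a core-local cache line of the given core."""
        self.core_local.setdefault(core, {})[line_id] = data

    def demote(self, core: int, line_ids: list) -> None:
        """Move the listed lines from the core's local cache to the shared cache."""
        local = self.core_local.get(core, {})
        for line_id in line_ids:
            if line_id in local:
                self.shared[line_id] = local.pop(line_id)

    def read(self, core: int, line_id: int) -> str:
        """Report where a read by `core` would be served from in this model."""
        if line_id in self.core_local.get(core, {}):
            return "core-local"
        if line_id in self.shared:
            return "shared"
        for other, lines in self.core_local.items():
            if other != core and line_id in lines:
                return "cross-core snoop"
        return "memory"


cache = CacheModel()
cache.load(core=0, line_id=1, data=b"hdr")
cache.load(core=0, line_id=2, data=b"payload")
before = cache.read(core=1, line_id=1)   # line still held by core 0
cache.demote(core=0, line_ids=[1, 2])
after = cache.read(core=1, line_id=1)    # line now in the shared cache
```

In this model, the second core's read is served by a cross-core snoop before the demotion and from the shared cache afterwards, which is the effect the demotion is meant to achieve for the next stage of the pipeline.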
To do so, the
The
As shown in FIG. 1, the illustrative
Processor(s) 108 may be embodied as any type of device or collection of devices capable of performing various computing functions as described herein. In some embodiments, processor(s) 108 may be embodied as one or more multi-core processors, Digital Signal Processors (DSPs), microcontrollers, or other processor(s) or processing/control circuit(s). In some embodiments, the processor(s) 108 may be embodied as, include, or otherwise be coupled to an integrated circuit, an embedded system, a field-programmable gate array (FPGA) (e.g., reconfigurable circuitry), a system-on-a-chip (SoC), an Application Specific Integrated Circuit (ASIC), reconfigurable hardware or hardware circuitry, or other special purpose hardware that facilitates the performance of the functions described herein.
The illustrative processor(s) 108 include a plurality of processor cores 110 (e.g., two processor cores, four processor cores, eight processor cores, sixteen processor cores, etc.) and a
The
The shared
The
Each of the processor(s) 108 and the
The one or more
The
It should be appreciated that in some embodiments, the
In some embodiments, execution of one or more of the functions of the
The
The one or more
The cache
The
The
Referring now to FIG. 2, in use, the
As illustratively shown, the network traffic ingress/
Further, in some embodiments, one or more of the illustrative components may form a portion of another component, and/or one or more of the illustrative components may be independent of each other. Additionally, in some embodiments, one or more of the components of
In the
The network traffic ingress/
Further, the network traffic ingress/
The destaging
The packet
The
The
If the
Referring now to FIG. 3, a
In
If the
Referring now to FIGS. 4 and 5, in use, the
As illustratively shown, the core
Referring now to FIG. 5, similar to the illustrative environment of FIG. 4, the
Referring back to FIG. 4, as illustratively shown, the indication provided by the
As illustratively shown in both FIGS. 4 and 5, the data in core local cache line (1) 404a, core local cache line (2) 404b, and core local cache line (3) 404c is associated with the processed network packet, as indicated by the highlighted outline surrounding each of those core local cache lines 404. As also illustratively shown, the cache line demotion operation results in the data being demoted such that data in core local cache line (1) 404a is demoted to shared cache line (1) 406a, data in core local cache line (2) 404b is demoted to shared cache line (2) 406b, and data in core local cache line (3) 404c is demoted to shared cache line (3) 406c; however, it should be appreciated that due to cache line demotion operations, a demoted cache line may be moved to any available shared
Examples of the invention
Illustrative examples of the techniques disclosed herein are provided below. Embodiments of the techniques may include any one or more of the examples described below and any combination of the examples described below.
Example 1 includes a computing device to demote a cache line to a shared cache, the computing device comprising one or more processors, wherein each of the one or more processors comprises a plurality of processor cores; a cache memory, wherein the cache memory comprises a core local cache and a shared cache, wherein the core local cache comprises a plurality of core local cache lines, and wherein the shared cache comprises a plurality of shared cache lines; a cache line demotion device; and a Host Fabric Interface (HFI) to receive a network packet, wherein a processor core of a processor of the one or more processors is to retrieve at least a portion of data of the received network packet, wherein to retrieve the data comprises to move the data into one or more of the plurality of core local cache lines; perform one or more processing operations on the data; and after the one or more processing operations on the data have been completed, transmit a cache line demotion command to the cache line demotion device, and wherein the cache line demotion device is to perform a cache line demotion operation to demote the data from the one or more core local cache lines to one or more shared cache lines of the shared cache in response to having received the cache line demotion command.
Example 2 includes the subject matter of example 1, and wherein the processor core is further to determine whether a size of the received network packet is greater than a packet size threshold after the one or more processing operations on the data have been completed, wherein transmitting the cache line demotion command to the cache line demotion device comprises transmitting the cache line demotion command after determining that the size of the received network packet is greater than the packet size threshold.
Example 3 includes the subject matter of any of examples 1 and 2, and wherein the processor core is further to transmit a cache line demotion instruction to a cache manager of the cache memory after having determined that the size of the received network packet is less than or equal to the packet size threshold, and wherein the cache manager is to demote data from the one or more core local cache lines to the one or more shared cache lines of the shared cache based on the cache line demotion instruction, wherein the cache line demotion instruction bypasses the cache line demotion device.
Example 4 includes the subject matter of any of examples 1-3, and wherein transmitting the cache line demotion instruction comprises transmitting one or more cache line identifiers corresponding to the one or more shared cache lines.
Example 5 includes the subject matter of any of examples 1-4, and wherein performing a cache line demotion operation comprises performing a read request or a direct memory access.
Example 6 includes the subject matter of any of examples 1-5, and wherein the cache line demotion command includes an indication of a core local cache line associated with the received network packet to be demoted to the shared cache.
Example 7 includes the subject matter of any of examples 1-6, and wherein the cache line demotion device comprises one of a copy engine, a Direct Memory Access (DMA) device operable to copy data, or an offload device operable to perform read operations.
Example 8 includes the subject matter of any of examples 1-7, and wherein transmitting the cache line demotion command comprises transmitting one or more cache line identifiers corresponding to the one or more shared cache lines.
Example 9 includes one or more machine-readable storage media comprising a plurality of instructions stored thereon that, in response to being executed, cause a computing device to: retrieve, by a processor of the computing device, at least a portion of data of a network packet received by a Host Fabric Interface (HFI) of the computing device, wherein to retrieve the data comprises to move the data into one or more of a plurality of core local cache lines of a core local cache of the computing device, and wherein the processor comprises a plurality of processor cores; perform, by a processor core of the plurality of processor cores, one or more processing operations on the data; transmit, by the processor and after the one or more processing operations on the data have been completed, a cache line demotion command to a cache line demotion device of the computing device; and perform, by the cache line demotion device and in response to having received the cache line demotion command, a cache line demotion operation to demote the data from the one or more core local cache lines to one or more shared cache lines of a shared cache of the computing device.
Example 10 includes the subject matter of example 9, and wherein the processor core is further to determine whether a size of the received network packet is greater than a packet size threshold after the one or more processing operations on the data have been completed, wherein transmitting the cache line demotion command to the cache line demotion device comprises transmitting the cache line demotion command after determining that the size of the received network packet is greater than the packet size threshold.
Example 11 includes the subject matter of any of examples 9 and 10, and wherein the processor core is further to transmit a cache line demotion instruction to a cache manager of a cache memory including the core local cache and the shared cache after having determined that a size of the received network packet is less than or equal to the packet size threshold, and wherein the cache manager is to demote data from the one or more core local cache lines to the one or more shared cache lines of the shared cache based on the cache line demotion instruction.
Example 12 includes the subject matter of any of examples 9-11, and wherein transmitting the cache line demotion instruction includes transmitting one or more cache line identifiers corresponding to the one or more shared cache lines.
Example 13 includes the subject matter of any of examples 9-12, and wherein performing a cache line demotion operation comprises performing a read request or a direct memory access.
Example 14 includes the subject matter of any of examples 9-13, and wherein transmitting the cache line demotion command includes transmitting one or more cache line identifiers corresponding to the one or more shared cache lines.
Example 15 includes a method for demoting cache lines to a shared cache, the method comprising retrieving, by a processor of a computing device, at least a portion of data of a network packet received by a Host Fabric Interface (HFI) of the computing device, wherein retrieving the data comprises moving the data into one or more core local cache lines of a plurality of core local cache lines of a core local cache of the computing device, and wherein the processor comprises a plurality of processor cores; performing, by a processor core of the plurality of processor cores, one or more processing operations on the data; transmitting, by the processor core and after the one or more processing operations on the data have been completed, a cache line demotion command to a cache line demotion device of the computing device; and performing, by the cache line demotion device and in response to having received the cache line demotion command, a cache line demotion operation to demote the data from the one or more core local cache lines to one or more shared cache lines of a shared cache of the computing device.
Example 16 includes the subject matter of example 15, and further comprising determining whether a size of the received network packet is greater than a packet size threshold after the one or more processing operations on the data have been completed, wherein transmitting the cache line demotion command to the cache line demotion device comprises transmitting the cache line demotion command after determining that the size of the received network packet is greater than the packet size threshold.
Example 17 includes the subject matter of any of examples 15 and 16, and further comprising transmitting, by the processor core and after having determined that the size of the received network packet is less than or equal to the packet size threshold, a cache line demotion instruction to a cache manager of a cache memory comprising the core local cache and the shared cache; and demoting, by the cache manager, data from the one or more core local cache lines to the one or more shared cache lines of the shared cache based on the cache line demotion instruction.
Example 18 includes the subject matter of any of examples 15-17, and wherein transmitting the cache line demotion instruction includes transmitting one or more cache line identifiers corresponding to the one or more shared cache lines.
Example 19 includes the subject matter of any of examples 15-18, and wherein performing a cache line demotion operation includes performing one of a read request or a direct memory access.
Example 20 includes the subject matter of any of examples 15-19, and wherein transmitting the cache line demotion command includes transmitting one or more cache line identifiers corresponding to the one or more shared cache lines.
Example 21 includes a computing device to demote a cache line to a shared cache, the computing device comprising circuitry to retrieve, by a processor of the computing device, at least a portion of data of a network packet received by a Host Fabric Interface (HFI) of the computing device, wherein retrieving the data comprises moving the data into one or more of a plurality of core local cache lines of a core local cache of the computing device, and wherein the processor comprises a plurality of processor cores; circuitry to perform one or more processing operations on the data by a processor core of the plurality of processor cores; circuitry for transmitting, by the processor core and after the one or more processing operations on the data have been completed, a cache line demotion command to a cache line demotion device of the computing device; and means for performing, by the cache line demotion device and in response to having received the cache line demotion command, a cache line demotion operation to demote the data from the one or more core local cache lines to one or more shared cache lines of a shared cache of the computing device.
Example 22 includes the subject matter of example 21, and further comprising circuitry to determine whether a size of the received network packet is greater than a packet size threshold after the one or more processing operations on the data have been completed, wherein transmitting the cache line demotion command to the cache line demotion device comprises transmitting the cache line demotion command after determining that the size of the received network packet is greater than the packet size threshold.
Example 23 includes the subject matter of any of examples 21 and 22, and further comprising circuitry to transmit, by the processor core and after having determined that the size of the received network packet is less than or equal to the packet size threshold, a cache line demotion instruction to a cache manager of a cache memory comprising the core local cache and the shared cache; and circuitry to demote, by the cache manager, data from the one or more core local cache lines to the one or more shared cache lines of the shared cache based on the cache line demotion instruction.
Example 24 includes the subject matter of any of examples 21-23, and wherein transmitting the cache line demotion instruction includes transmitting one or more cache line identifiers corresponding to the one or more shared cache lines.
Example 25 includes the subject matter of any of examples 21-24, and wherein the means for performing a cache line demotion operation comprises means for performing one of a read request or a direct memory access.
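Several of the examples above recite transmitting one or more cache line identifiers with the demotion command or instruction. As a rough sketch of how such identifiers might be derived for a packet buffer, the following Python code enumerates the cache lines a buffer touches. The 64-byte line size, the use of line-aligned addresses as identifiers, and the dictionary layout of the command are all assumptions made for illustration; the source does not define any of them.

```python
CACHE_LINE_SIZE = 64  # bytes; an assumed line size, not stated in the source


def cache_line_ids(buffer_addr: int, length: int,
                   line_size: int = CACHE_LINE_SIZE) -> list:
    """Return the line-aligned addresses of every cache line the buffer touches."""
    if length <= 0:
        return []
    first = buffer_addr - (buffer_addr % line_size)
    last = (buffer_addr + length - 1) // line_size * line_size
    return list(range(first, last + line_size, line_size))


def build_demotion_command(buffer_addr: int, length: int) -> dict:
    """Assemble a hypothetical demotion command carrying line identifiers."""
    return {"op": "demote", "lines": cache_line_ids(buffer_addr, length)}
```

Note that a buffer that is not line-aligned can span one more line than its length alone suggests, so the identifiers are computed from both the start address and the length.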