Cache thrashing elimination method, apparatus, device and storage medium

Document No. 341755; published 2021-12-03

Note: This technique, "Cache thrashing elimination method, apparatus, device and storage medium" (缓存颠簸消除方法、装置、设备及存储介质), was created by 刘洋, 李宗鹏 and 黄浩 on 2021-08-25. Abstract: The invention discloses a cache thrashing elimination method, apparatus, device and storage medium. The method acquires the peripheral cache size and cache line size of the processor cache in the operating environment and the number of ways of the set-associative structure employed, and acquires the aligned data size, after memory alignment, of the data type in an ordered data set with contiguous address space, together with the total amount of data in the ordered data set; it calculates a data adjustment offset and a required number of adjustments from the peripheral cache size, the cache line size, the number of set-associative ways, the aligned data size and the total data amount; and it performs an adjusted binary search on the ordered data set according to the data adjustment offset and the required number of adjustments. Cache thrashing can thereby be eliminated, performance degradation avoided, the efficiency of the binary search algorithm preserved, and the extra time consumed by additional probes when searching part of the data greatly reduced.

1. A cache thrashing elimination method, comprising the following steps:

acquiring the peripheral cache size and cache line size of the processor cache in the operating environment and the number of ways of the set-associative structure employed, and acquiring the aligned data size, after memory alignment, of the data type in an ordered data set with contiguous address space, together with the total amount of data in the ordered data set;

calculating a data adjustment offset and a required number of adjustments according to the peripheral cache size, the cache line size, the number of set-associative ways, the aligned data size and the total data amount;

and performing an adjusted binary search on the ordered data set according to the data adjustment offset and the required number of adjustments.

2. The cache thrashing elimination method according to claim 1, wherein the acquiring of the peripheral cache size and cache line size of the processor cache in the operating environment and the N-way set-associative structure employed, and of the aligned data size, after memory alignment, of the data type in an ordered data set with contiguous address space and the total amount of data in the ordered data set, comprises:

acquiring, from the processor, the peripheral cache size and cache line size of the processor cache in the operating environment and the N-way set-associative structure employed;

and acquiring, from a preset programming-language manual, the aligned data size, after memory alignment, of the data type of the ordered data in the ordered data set, and the total amount of data in the ordered data set.

3. The cache thrashing elimination method according to claim 1, wherein the calculating of the data adjustment offset and the required number of adjustments according to the peripheral cache size, the cache line size, the number of set-associative ways, the aligned data size and the total data amount comprises:

obtaining the per-way cache capacity from the peripheral cache size and the N-way set-associative structure by the following formula:

S_cw = S_c / N

where S_cw is the per-way cache capacity, S_c is the peripheral cache size, and N is the number of set-associative ways;

obtaining the amount of data one cache way can hold from the per-way cache capacity and the aligned data size by the following formula:

Q_dcw = S_cw / S_d

where Q_dcw is the amount of data one cache way can hold, S_cw is the per-way cache capacity, and S_d is the aligned data size;

obtaining the number of cache lines in one cache way from the per-way cache capacity and the cache line size by the following formula:

Q_clw = S_cw / S_cl

where Q_clw is the number of cache lines in one cache way, S_cw is the per-way cache capacity, and S_cl is the cache line size;

obtaining the average amount of data a cache line can hold from the per-way data amount and the number of cache lines by the following formula:

Q_dcl = Q_dcw / Q_clw

where Q_dcl is the average amount of data a cache line can hold, Q_dcw is the amount of data one cache way can hold, and Q_clw is the number of cache lines in one cache way;

and determining the data adjustment offset and the required number of adjustments from the average amount of data a cache line can hold.

4. The cache thrashing elimination method of claim 1, wherein the performing of the adjusted binary search on the ordered data set according to the data adjustment offset and the required number of adjustments comprises:

adjusting the demarcation point index of the currently searched ordered data set to a preset index according to the data adjustment offset and the required number of adjustments;

comparing the target data corresponding to the preset index in the ordered data set with the data to be searched for, to generate a comparison result;

and performing data processing on the ordered data set according to the comparison result.

5. The cache thrashing elimination method according to claim 4, wherein before the demarcation point index of the currently searched ordered data set is adjusted to the preset index according to the data adjustment offset and the required number of adjustments, the cache thrashing elimination method further comprises:

initializing the count of adjustments already performed in the current binary search;

when the currently searched ordered data set is detected to be empty, determining that the search has failed and ending the search process;

recording the left-boundary index and right-boundary index of the currently searched ordered data set, calculating the index of the middle position of the set from the left-boundary and right-boundary indices, and taking that middle-position index as the demarcation point index.

6. The cache thrashing elimination method of claim 4, wherein the adjusting of the demarcation point index of the currently searched ordered data set to the preset index according to the data adjustment offset and the required number of adjustments comprises:

acquiring the count of adjustments already performed in the current binary search, and determining whether that count is smaller than the required number of adjustments;

when the count of adjustments already performed is smaller than the required number of adjustments, setting the preset index to the index offset from the demarcation point index by the data adjustment offset toward the left boundary of the currently searched ordered data set; if the preset index would fall beyond the left boundary of the currently searched ordered data set, setting the preset index to the left-boundary index; then adjusting the demarcation point index to the preset index and incrementing the count of adjustments performed;

and when the count of adjustments already performed equals the required number of adjustments, setting the preset index to the demarcation point index.

7. The cache thrashing elimination method of claim 4, wherein the performing of data processing on the ordered data set according to the comparison result comprises:

when the target data is identical to the data to be searched for, ending the search;

when the target data differs from the data to be searched for, splitting the ordered data set into a left data set and a right data set with the preset index as the split point;

and discarding the data set that does not contain the data to be searched for, and taking the remaining data set as the input ordered data set for the next search.

8. A cache thrashing elimination apparatus, comprising:

an acquisition module, configured to acquire the peripheral cache size and cache line size of the processor cache in the operating environment and the number of ways of the set-associative structure employed, and to acquire the aligned data size, after memory alignment, of the data type in an ordered data set with contiguous address space and the total amount of data in the ordered data set;

a calculation module, configured to calculate a data adjustment offset and a required number of adjustments according to the peripheral cache size, the cache line size, the number of set-associative ways, the aligned data size and the total data amount;

and an adjustment module, configured to perform an adjusted binary search on the ordered data set according to the data adjustment offset and the required number of adjustments.

9. A cache thrashing elimination device, comprising: a memory, a processor, and a cache thrashing elimination program stored on the memory and executable on the processor, the cache thrashing elimination program being configured to implement the steps of the cache thrashing elimination method according to any one of claims 1 to 7.

10. A storage medium having stored thereon a cache thrashing elimination program which, when executed by a processor, implements the steps of the cache thrashing elimination method according to any one of claims 1 to 7.

Technical Field

The invention relates to the technical field of computer science, and in particular to a cache thrashing elimination method, apparatus, device and storage medium.

Background

The purpose of the processor cache (CPU cache) is to reduce the average time the processor spends accessing data: by shortening the time the processor waits on I/O, it increases the amount of data the processor can read, and therefore process, per unit time. Modern processors integrate a large number of cache components, divided into levels (e.g., L1, L2, L3) according to their access speed. If the processor cannot find the data for a given address directly in the cache, a cache miss occurs; misses increase data read time, reduce the processor's data throughput per unit time, and degrade processor performance. Data in the cache is organized in cache lines, and to associate a data address with the data's location in the cache, caches adopt various structures, such as fully associative, set-associative (multi-way), and direct-mapped.

Binary search is generally used to look up data that is contiguous and ordered in the address space, but the binary search algorithm is a special case: its access pattern, combined with a set-associative or direct-mapped cache structure, makes cache thrashing especially likely. Suppose the data set being searched is contiguous in the address space and, taking a direct-mapped cache as the example, the total data size reaches four times the cache capacity; cache thrashing then begins to occur. Because the data is placed contiguously in memory, logically adjacent data maps to adjacent positions in a direct-mapped cache. While the data amount does not exceed the cache capacity, all of it can be mapped into the cache without conflict. As the data amount grows, mapping conflicts appear: different data must be mapped into the same cache line, and if conflicting data is read repeatedly, continual evictions occur and performance degrades. When the data amount reaches four times the cache capacity, because each step of the binary algorithm reads the middle element of the remaining data and compares it with the target value, the element probed in the first binary step and the element probed in the second step map into the same cache line, so consecutive first and second probes cause a miss each time. When a large number of binary searches are performed, severe cache thrashing results.
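To make the conflict concrete, the following sketch (not from the patent; all cache parameters are hypothetical) simulates which cache line each of the first binary-search probes of a contiguous array maps to in a direct-mapped cache. With the array at four times the cache capacity, the first and second probes land on the same cache line and evict each other:

```python
# Hypothetical cache geometry for illustration only.
CACHE_SIZE = 32 * 1024      # 32 KiB direct-mapped cache (assumed)
LINE_SIZE = 64              # 64-byte cache lines (assumed)
ELEM_SIZE = 8               # 8-byte elements after alignment (assumed)
NUM_LINES = CACHE_SIZE // LINE_SIZE

def probe_lines(n):
    """Return the cache-line index touched by the first four midpoints
    when binary-searching an n-element array starting at address 0
    (always descending into the left half, for illustration)."""
    lines = []
    lo, hi = 0, n - 1
    for _ in range(4):
        mid = (lo + hi) // 2
        addr = mid * ELEM_SIZE
        lines.append((addr // LINE_SIZE) % NUM_LINES)
        hi = mid - 1
    return lines

# Array four times the cache capacity: the first two probes map to the
# same cache line, so each evicts the other's data.
n = 4 * CACHE_SIZE // ELEM_SIZE
print(probe_lines(n))
```

Running this shows the first two probed midpoints sharing one cache line, which is exactly the repeated-eviction pattern the paragraph above describes.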

Disclosure of Invention

The main object of the present invention is to provide a cache thrashing elimination method, apparatus, device and storage medium, aiming to solve the technical problem in the prior art that, when binary search is performed on ordered data with contiguous address space, access slows as lookups fall to lower cache levels and the performance degradation caused by cache thrashing is severe.

In a first aspect, the present invention provides a cache thrashing elimination method, including the following steps:

acquiring the peripheral cache size and cache line size of the processor cache in the operating environment and the number of ways of the set-associative structure employed, and acquiring the aligned data size, after memory alignment, of the data type in an ordered data set with contiguous address space, together with the total amount of data in the ordered data set;

calculating a data adjustment offset and a required number of adjustments according to the peripheral cache size, the cache line size, the number of set-associative ways, the aligned data size and the total data amount;

and performing an adjusted binary search on the ordered data set according to the data adjustment offset and the required number of adjustments.

Optionally, the acquiring of the peripheral cache size and cache line size of the processor cache in the operating environment and the N-way set-associative structure employed, and of the aligned data size, after memory alignment, of the data type in an ordered data set with contiguous address space and the total amount of data in the ordered data set, includes:

acquiring, from the processor, the peripheral cache size and cache line size of the processor cache in the operating environment and the N-way set-associative structure employed;

and acquiring, from a preset programming-language manual, the aligned data size, after memory alignment, of the data type of the ordered data in the ordered data set, and the total amount of data in the ordered data set.

Optionally, the calculating of the data adjustment offset and the required number of adjustments according to the peripheral cache size, the cache line size, the number of set-associative ways, the aligned data size and the total data amount includes:

obtaining the per-way cache capacity from the peripheral cache size and the N-way set-associative structure by the following formula:

S_cw = S_c / N

where S_cw is the per-way cache capacity, S_c is the peripheral cache size, and N is the number of set-associative ways;

obtaining the amount of data one cache way can hold from the per-way cache capacity and the aligned data size by the following formula:

Q_dcw = S_cw / S_d

where Q_dcw is the amount of data one cache way can hold, S_cw is the per-way cache capacity, and S_d is the aligned data size;

obtaining the number of cache lines in one cache way from the per-way cache capacity and the cache line size by the following formula:

Q_clw = S_cw / S_cl

where Q_clw is the number of cache lines in one cache way, S_cw is the per-way cache capacity, and S_cl is the cache line size;

obtaining the average amount of data a cache line can hold from the per-way data amount and the number of cache lines by the following formula:

Q_dcl = Q_dcw / Q_clw

where Q_dcl is the average amount of data a cache line can hold, Q_dcw is the amount of data one cache way can hold, and Q_clw is the number of cache lines in one cache way;

and determining the data adjustment offset and the required number of adjustments from the average amount of data a cache line can hold.
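The chain of quantities defined above can be summarized in a short sketch; the concrete cache parameters in the example call are hypothetical, not values taken from the patent:

```python
def cache_quantities(S_c, S_cl, N, S_d):
    """Compute the per-way quantities defined in the text.

    S_c  : peripheral cache size in bytes
    S_cl : cache line size in bytes
    N    : number of ways in the set-associative structure
    S_d  : aligned data size in bytes
    """
    S_cw = S_c // N         # per-way cache capacity
    Q_dcw = S_cw // S_d     # data items one way can hold
    Q_clw = S_cw // S_cl    # cache lines per way
    Q_dcl = Q_dcw // Q_clw  # average data items per cache line
    return S_cw, Q_dcw, Q_clw, Q_dcl

# Example: an assumed 8 MiB cache, 64 B lines, 16 ways, 8 B aligned elements.
print(cache_quantities(8 * 1024 * 1024, 64, 16, 8))
```

Note that with power-of-two sizes all four divisions are exact, so integer division loses nothing here.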

Optionally, the performing of the adjusted binary search on the ordered data set according to the data adjustment offset and the required number of adjustments includes:

adjusting the demarcation point index of the currently searched ordered data set to a preset index according to the data adjustment offset and the required number of adjustments;

comparing the target data corresponding to the preset index in the ordered data set with the data to be searched for, to generate a comparison result;

and performing data processing on the ordered data set according to the comparison result.

Optionally, before the demarcation point index of the currently searched ordered data set is adjusted to the preset index according to the data adjustment offset and the required number of adjustments, the cache thrashing elimination method further includes:

initializing the count of adjustments already performed in the current binary search;

when the currently searched ordered data set is detected to be empty, determining that the search has failed and ending the search process;

recording the left-boundary index and right-boundary index of the currently searched ordered data set, calculating the index of the middle position of the set from the left-boundary and right-boundary indices, and taking that middle-position index as the demarcation point index.

Optionally, the adjusting of the demarcation point index of the currently searched ordered data set to the preset index according to the data adjustment offset and the required number of adjustments includes:

acquiring the count of adjustments already performed in the current binary search, and determining whether that count is smaller than the required number of adjustments;

when the count of adjustments already performed is smaller than the required number of adjustments, setting the preset index to the index offset from the demarcation point index by the data adjustment offset toward the left boundary of the currently searched ordered data set; if the preset index would fall beyond the left boundary of the currently searched ordered data set, setting the preset index to the left-boundary index; then adjusting the demarcation point index to the preset index and incrementing the count of adjustments performed;

and when the count of adjustments already performed equals the required number of adjustments, setting the preset index to the demarcation point index.

Optionally, the performing of data processing on the ordered data set according to the comparison result includes:

when the target data is identical to the data to be searched for, ending the search;

when the target data differs from the data to be searched for, splitting the ordered data set into a left data set and a right data set with the preset index as the split point;

and discarding the data set that does not contain the data to be searched for, and taking the remaining data set as the input ordered data set for the next search.
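The search procedure described in the steps above might be sketched as follows. The patent gives no code, so the offset-stepping below is one plausible reading of those steps (shift the demarcation point index left by the offset for the first few iterations, clamping at the left boundary); the offset and adjustment count are hypothetical inputs that the earlier calculation would supply:

```python
def adjusted_binary_search(data, target, offset, n_adjust):
    """Binary search whose first n_adjust probes shift the demarcation
    point index left by `offset` (clamped at the left boundary), so that
    successive probes do not map into the same cache line."""
    lo, hi = 0, len(data) - 1
    adjusted = 0                         # adjustments performed so far
    while lo <= hi:                      # empty data set: search fails
        mid = (lo + hi) // 2             # demarcation point index
        if adjusted < n_adjust:
            mid = max(lo, mid - offset)  # preset index, clamped at left boundary
            adjusted += 1
        if data[mid] == target:          # target found: end the search
            return mid
        elif data[mid] < target:
            lo = mid + 1                 # discard the left data set
        else:
            hi = mid - 1                 # discard the right data set
    return -1

# Usage: the adjusted search still finds every element of a sorted list.
data = list(range(100))
assert all(adjusted_binary_search(data, x, offset=3, n_adjust=2) == x
           for x in data)
```

Shifting the probe never breaks correctness, only the split balance: the comparison at the shifted index still discards the side that cannot contain the target, at the cost of a slightly less even split during the first n_adjust iterations.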

In a second aspect, to achieve the above object, the present invention further provides a cache thrashing elimination apparatus, including:

an acquisition module, configured to acquire the peripheral cache size and cache line size of the processor cache in the operating environment and the number of ways of the set-associative structure employed, and to acquire the aligned data size, after memory alignment, of the data type in an ordered data set with contiguous address space and the total amount of data in the ordered data set;

a calculation module, configured to calculate a data adjustment offset and a required number of adjustments according to the peripheral cache size, the cache line size, the number of set-associative ways, the aligned data size and the total data amount;

and an adjustment module, configured to perform an adjusted binary search on the ordered data set according to the data adjustment offset and the required number of adjustments.

In a third aspect, to achieve the above object, the present invention further provides a cache thrashing elimination apparatus, including: a memory, a processor, and a cache thrashing elimination program stored on the memory and executable on the processor, the cache thrashing elimination program configured to implement the steps of the cache thrashing elimination method as described above.

In a fourth aspect, to achieve the above object, the present invention further provides a storage medium, where a cache thrashing elimination program is stored, and when being executed by a processor, the cache thrashing elimination program implements the steps of the cache thrashing elimination method described above.

The cache thrashing elimination method provided by the invention acquires the peripheral cache size, cache line size and number of set-associative ways of the processor cache in the operating environment, and acquires the aligned data size, after memory alignment, of the data type in an ordered data set with contiguous address space, together with the total amount of data in the ordered data set; calculates a data adjustment offset and a required number of adjustments according to the peripheral cache size, the cache line size, the number of set-associative ways, the aligned data size and the total data amount; and performs an adjusted binary search on the ordered data set according to the data adjustment offset and the required number of adjustments. Cache thrashing can thereby be eliminated, performance degradation avoided, the efficiency of the binary algorithm preserved, and the extra time consumed by additional probes when searching part of the data greatly reduced.

Drawings

FIG. 1 is a schematic diagram of an apparatus architecture of a hardware operating environment according to an embodiment of the present invention;

FIG. 2 is a flowchart illustrating a first embodiment of a cache thrashing elimination method according to the present invention;

FIG. 3 is a flowchart illustrating a second embodiment of a cache thrashing elimination method according to the present invention;

FIG. 4 is a flow chart illustrating a third embodiment of a cache thrashing elimination method according to the present invention;

FIG. 5 is a flowchart illustrating a fourth embodiment of a cache thrashing elimination method according to the present invention;

FIG. 6 is a flowchart illustrating a fifth embodiment of a cache thrashing elimination method according to the present invention;

FIG. 7 is a flowchart illustrating a sixth embodiment of a cache thrashing elimination method according to the present invention;

FIG. 8 is a functional block diagram of a first embodiment of a cache thrashing elimination apparatus according to the present invention.

The implementation, functional features and advantages of the objects of the present invention will be further explained with reference to the accompanying drawings.

Detailed Description

It should be understood that the specific embodiments described herein are merely illustrative of the invention and are not intended to limit the invention.

The solution of the embodiment of the invention is, in essence: acquire the peripheral cache size, cache line size and number of set-associative ways of the processor cache in the operating environment, and acquire the aligned data size, after memory alignment, of the data type in an ordered data set with contiguous address space, together with the total amount of data in the ordered data set; calculate a data adjustment offset and a required number of adjustments from these quantities; and perform an adjusted binary search on the ordered data set according to the data adjustment offset and the required number of adjustments. This eliminates cache thrashing, avoids performance degradation, preserves the efficiency of the binary algorithm, greatly reduces the extra time consumed by additional probes when searching part of the data, and thereby solves the prior-art problem that binary search over ordered data with contiguous address space suffers severe cache-thrashing-induced performance degradation as lookups fall to slower cache levels.

Referring to fig. 1, fig. 1 is a schematic device structure diagram of a hardware operating environment according to an embodiment of the present invention.

As shown in fig. 1, the apparatus may include: a processor 1001 (e.g., a CPU), a communication bus 1002, a user interface 1003, a network interface 1004, and a memory 1005. The communication bus 1002 enables communication among these components. The user interface 1003 may include a display (Display) and an input unit such as a keyboard (Keyboard), and may optionally also include a standard wired interface and a wireless interface. The network interface 1004 may optionally include a standard wired interface and a wireless interface (e.g., a Wi-Fi interface). The memory 1005 may be high-speed RAM or non-volatile memory (Non-Volatile Memory), such as disk storage. The memory 1005 may alternatively be a storage device separate from the processor 1001.

Those skilled in the art will appreciate that the configuration of the apparatus shown in fig. 1 is not intended to be limiting of the apparatus and may include more or fewer components than those shown, or some components may be combined, or a different arrangement of components.

As shown in fig. 1, the memory 1005, as a storage medium, may include an operating system, a network communication module, a user interface module, and a cache thrashing elimination program.

The apparatus of the present invention calls the cache thrashing elimination program stored in the memory 1005 through the processor 1001, and performs the following operations:

acquiring the peripheral cache size and cache line size of the processor cache in the operating environment and the number of ways of the set-associative structure employed, and acquiring the aligned data size, after memory alignment, of the data type in an ordered data set with contiguous address space, together with the total amount of data in the ordered data set;

calculating a data adjustment offset and a required number of adjustments according to the peripheral cache size, the cache line size, the number of set-associative ways, the aligned data size and the total data amount;

and performing an adjusted binary search on the ordered data set according to the data adjustment offset and the required number of adjustments.

The apparatus of the present invention calls the cache thrashing elimination program stored in the memory 1005 by the processor 1001, and also performs the following operations:

acquiring, from the processor, the peripheral cache size and cache line size of the processor cache in the operating environment and the N-way set-associative structure employed;

and acquiring, from a preset programming-language manual, the aligned data size, after memory alignment, of the data type of the ordered data in the ordered data set, and the total amount of data in the ordered data set.

The apparatus of the present invention calls the cache thrashing elimination program stored in the memory 1005 by the processor 1001, and also performs the following operations:

obtaining the per-way cache capacity from the peripheral cache size and the N-way set-associative structure by the following formula:

S_cw = S_c / N

where S_cw is the per-way cache capacity, S_c is the peripheral cache size, and N is the number of set-associative ways;

obtaining the amount of data one cache way can hold from the per-way cache capacity and the aligned data size by the following formula:

Q_dcw = S_cw / S_d

where Q_dcw is the amount of data one cache way can hold, S_cw is the per-way cache capacity, and S_d is the aligned data size;

obtaining the number of cache lines in one cache way from the per-way cache capacity and the cache line size by the following formula:

Q_clw = S_cw / S_cl

where Q_clw is the number of cache lines in one cache way, S_cw is the per-way cache capacity, and S_cl is the cache line size;

obtaining the average amount of data a cache line can hold from the per-way data amount and the number of cache lines by the following formula:

Q_dcl = Q_dcw / Q_clw

where Q_dcl is the average amount of data a cache line can hold, Q_dcw is the amount of data one cache way can hold, and Q_clw is the number of cache lines in one cache way;

and determining the data adjustment offset and the required number of adjustments from the average amount of data a cache line can hold.

The apparatus of the present invention calls the cache thrashing elimination program stored in the memory 1005 by the processor 1001, and also performs the following operations:

adjusting the demarcation point index of the currently searched ordered data set to a preset index according to the data adjustment offset and the required number of adjustments;

comparing the target data corresponding to the preset index in the ordered data set with the data to be searched for, to generate a comparison result;

and performing data processing on the ordered data set according to the comparison result.

The apparatus of the present invention calls the cache thrashing elimination program stored in the memory 1005 by the processor 1001, and also performs the following operations:

initializing the count of adjustments already performed in the current binary search;

when the currently searched ordered data set is detected to be empty, determining that the search has failed and ending the search process;

recording the left-boundary index and right-boundary index of the currently searched ordered data set, calculating the index of the middle position of the set from the left-boundary and right-boundary indices, and taking that middle-position index as the demarcation point index.

The apparatus of the present invention calls the cache thrashing elimination program stored in the memory 1005 by the processor 1001, and also performs the following operations:

acquiring the count of adjustments already performed in the current binary search, and determining whether that count is smaller than the required number of adjustments;

when the count of adjustments already performed is smaller than the required number of adjustments, setting the preset index to the index offset from the demarcation point index by the data adjustment offset toward the left boundary of the currently searched ordered data set; if the preset index would fall beyond the left boundary of the currently searched ordered data set, setting the preset index to the left-boundary index; then adjusting the demarcation point index to the preset index and incrementing the count of adjustments performed;

and when the count of adjustments already performed equals the required number of adjustments, setting the preset index to the demarcation point index.

The apparatus of the present invention calls the cache thrashing elimination program stored in the memory 1005 by the processor 1001, and also performs the following operations:

when the target data is identical to the data to be searched for, ending the search;

when the target data differs from the data to be searched for, splitting the ordered data set into a left data set and a right data set with the preset index as the split point;

and discarding the data set that does not contain the data to be searched for, and taking the remaining data set as the input ordered data set for the next search.

According to the scheme, the peripheral cache size and cache line size of the processor cache in the operating environment and the number of set-associative ways adopted are obtained, together with the aligned data size after memory alignment of the data type in an ordered data set with contiguous address space and the total data amount of the ordered data set; the data adjustment offset and the number of adjustments are calculated according to the peripheral cache size, the cache line size, the number of set-associative ways, the aligned data size and the total data amount; and binary search adjustment is performed on the ordered data set according to the data adjustment offset and the required number of adjustments, so that cache thrashing can be eliminated, performance degradation is avoided, the efficiency of the binary search algorithm is ensured, and the time consumed by the extra probes when searching part of the data is greatly reduced.

Based on the hardware structure, the embodiment of the cache thrashing elimination method is provided.

Referring to fig. 2, fig. 2 is a flowchart illustrating a first embodiment of a cache thrashing elimination method according to the present invention.

In a first embodiment, the cache thrashing elimination method comprises the following steps:

Step S10, obtaining the peripheral cache size and cache line size of the processor cache in the operating environment and the number of ways of the adopted set-associative structure, and obtaining the aligned data size after memory alignment of the data type in an ordered data set with contiguous address space, as well as the total data amount of the ordered data set.

It should be noted that, by first obtaining the peripheral cache size and cache line size of the processor cache in the operating environment and the number of set-associative ways adopted, the method starts from the cache structure of the processor's peripheral cache and prepares for the subsequent improvement of the binary search algorithm; correspondingly, after the memory alignment operation is performed on the data type of the ordered data set with contiguous address space, the corresponding aligned data size and the total data amount of the ordered data set can be obtained.
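On Linux, one way to obtain these cache parameters at run time is through sysfs. The sketch below is an illustrative example, not part of the patented method; the sysfs attribute names (`level`, `size`, `coherency_line_size`, `ways_of_associativity`) are standard on Linux, while other platforms would need CPUID or an OS-specific API.

```python
from pathlib import Path

_UNITS = {"K": 1024, "M": 1024 ** 2, "G": 1024 ** 3}


def parse_cache_size(text):
    """Parse a sysfs cache-size string such as '6144K' into bytes."""
    text = text.strip()
    return int(text[:-1]) * _UNITS[text[-1]]


def read_cache_params(cpu=0, level=3):
    """Return (cache_size_bytes, line_size_bytes, ways) for one cache level."""
    base = Path(f"/sys/devices/system/cpu/cpu{cpu}/cache")
    for idx in sorted(base.glob("index*")):
        if int((idx / "level").read_text()) == level:
            size = parse_cache_size((idx / "size").read_text())
            line = int((idx / "coherency_line_size").read_text())
            ways = int((idx / "ways_of_associativity").read_text())
            return size, line, ways
    raise RuntimeError(f"no level-{level} cache found on cpu{cpu}")
```

For the worked example later in this document, `read_cache_params(level=3)` would return something like `(6291456, 64, 12)` on a machine with a 6 MB, 12-way L3 cache.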

Step S20, calculating a data adjustment offset and the number of adjustments according to the peripheral cache size, the cache line size, the number of set-associative ways, the aligned data size, and the total data amount.

It is understood that the data adjustment offset and the adjustment times can be calculated by the obtained processor cache related information and the related information of the ordered data set.

Further, the step S20 includes the following steps:

obtaining the one-way cache capacity according to the peripheral cache size and the N-way set-associative structure through the following formula:

S_cw = S_c / N

wherein S_cw is the one-way cache capacity, S_c is the peripheral cache size, and N is the number of set-associative ways;

obtaining the amount of data that one way of the cache can hold according to the one-way cache capacity and the aligned data size through the following formula:

Q_dcw = S_cw / S_d

wherein Q_dcw is the amount of data one way of the cache can hold, S_cw is the one-way cache capacity, and S_d is the aligned data size;

obtaining the number of cache lines in one way of the cache according to the one-way cache capacity and the cache line size through the following formula:

Q_clw = S_cw / S_cl

wherein Q_clw is the number of cache lines in one way of the cache, S_cw is the one-way cache capacity, and S_cl is the cache line size;

obtaining the average amount of data one cache line can hold according to the amount of data one way can hold and the number of cache lines through the following formula:

Q_dcl = Q_dcw / Q_clw

wherein Q_dcl is the average amount of data one cache line can hold, Q_dcw is the amount of data one way of the cache can hold, and Q_clw is the number of cache lines in one way of the cache;

and calculating the data adjustment offset and the number of adjustments according to the average amount of data one cache line can hold.
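The four derived quantities above can be sketched as one helper. This is an illustrative reconstruction (the function and variable names are ours), using ceiling division for the quantities the text says are rounded up:

```python
def cache_derived_quantities(s_c, s_cl, n_ways, s_d):
    """Derive the per-way cache quantities used by the method.

    s_c    -- peripheral (outer-level) cache size in bytes
    s_cl   -- cache line size in bytes
    n_ways -- number of ways in the set-associative structure
    s_d    -- aligned size of one data element in bytes
    """
    s_cw = s_c // n_ways        # S_cw: one-way cache capacity (an integer by design)
    q_dcw = -(-s_cw // s_d)     # Q_dcw: data elements one way can hold (rounded up)
    q_clw = s_cw // s_cl        # Q_clw: cache lines per way (an integer by design)
    q_dcl = -(-q_dcw // q_clw)  # Q_dcl: average data elements per cache line (rounded up)
    return s_cw, q_dcw, q_clw, q_dcl
```

With the worked example below (6 MB L3, 64 B lines, 12 ways, 8 B elements), this returns 512 KB, 64K, 8K, and 8 respectively.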

It will be appreciated that, in general, when the one-way cache capacity S_cw = S_c / N is calculated, this value is an integer in today's computer architecture designs;

when the amount of data that one way of the cache can hold, Q_dcw = S_cw / S_d, is calculated, the result is rounded up;

when the number of cache lines in one way of the cache, Q_clw = S_cw / S_cl, is calculated, this value is an integer in today's computer architecture designs;

when the average amount of data one cache line can hold, Q_dcl = Q_dcw / Q_clw, is calculated, the result is rounded up;

calculating the data adjustment offset O_d and the required number of adjustments C_a according to the above information.

In a specific implementation, take as an example a level-3 processor cache in the current environment with size S_c = 6 MB, a cache line size S_cl = 64 B, and an N = 12-way set-associative architecture; in the binary-searched ordered data set, the size of the data type after the memory alignment operation is S_d = 8 B, and the total data amount of the ordered, increasing data set is T_d = 8M. The calculation is as follows:

calculating capacity of one-way cache

Calculating the data volume that one-way cache can hold

Calculating the number of cache lines in a cache

Calculating the amount of data that can be held by one cache line on average

It should be appreciated that the adjustment offset O_d and the number of adjustments C_a are derived from the average amount of data one cache line can hold; that is, the data amount at which severe cache thrashing begins is first calculated: Q_dct = 4 × Q_dcw = 4 × 64K = 256K;

calculating the multiple of Q_dct represented by the current data amount: T_d / Q_dct = 8M / 256K = 32;

C_a is the integer solution of the equation 2^(C_a − 1) = T_d / Q_dct = 32, giving C_a = 6.
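The worked numbers above can be reproduced by the following sketch. Note that the exact expressions for Q_dct, C_a, and O_d appear only in figures not reproduced in this text, so this reconstruction merely matches the worked example (Q_dct = 256K, multiple 32, C_a = 6, and the offset O_d = 256 used later in the text); the expression chosen for O_d in particular is an assumption.

```python
import math


def adjustment_parameters(q_dcw, q_dcl, t_d):
    """Reconstructed sketch of the offset/adjustment-count calculation.

    q_dcw -- data elements one cache way can hold
    q_dcl -- average data elements per cache line
    t_d   -- total number of elements in the ordered data set
    """
    q_dct = 4 * q_dcw                    # data amount at which severe thrashing begins
    multiple = t_d // q_dct              # how many times larger the data set is
    c_a = int(math.log2(multiple)) + 1   # adjustments needed: 2^(C_a - 1) = multiple
    o_d = q_dcl * multiple               # ASSUMED form of the offset, chosen to give 256
    return q_dct, multiple, c_a, o_d
```

With the example values (Q_dcw = 64K, Q_dcl = 8, T_d = 8M), this yields Q_dct = 256K, a multiple of 32, C_a = 6, and O_d = 256.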

And step S30, performing binary search adjustment on the ordered data set according to the data adjustment offset and the required adjustment times.

It should be understood that the binary search over the ordered data set can be finely adjusted through the data adjustment offset and the required adjustment times, so that cache thrashing when binary searching ordered data that is contiguous in the address space is eliminated.

According to the scheme, the peripheral cache size and cache line size of the processor cache in the operating environment and the number of set-associative ways adopted are obtained, together with the aligned data size after memory alignment of the data type in an ordered data set with contiguous address space and the total data amount of the ordered data set; the data adjustment offset and the number of adjustments are calculated according to the peripheral cache size, the cache line size, the number of set-associative ways, the aligned data size and the total data amount; and binary search adjustment is performed on the ordered data set according to the data adjustment offset and the required number of adjustments, so that cache thrashing can be eliminated, performance degradation is avoided, the efficiency of the binary search algorithm is ensured, and the time consumed by the extra probes when searching part of the data is greatly reduced.

Further, fig. 3 is a schematic flow chart of a second embodiment of the cache thrashing elimination method according to the present invention, and as shown in fig. 3, the second embodiment of the cache thrashing elimination method according to the present invention is proposed based on the first embodiment, in this embodiment, the step S10 specifically includes the following steps:

step S11, obtaining the peripheral cache size and cache line size of the processor cache in the operating environment and the adopted N-way group connection structure from the processor.

It should be noted that the peripheral cache size, the cache line size and the adopted N-way set-associative structure of the processor cache in the operating environment may be obtained from the relevant information of the processor.

Step S12, obtaining, from a preset programming language manual, the size of aligned data of the data type of the ordered data in the ordered data set after performing memory alignment, and the total amount of data in the ordered data set.

It is understood that the aligned data size may be obtained from the manual of a preset programming language, or the aligned data size and the total data amount may be obtained from the binary-searched ordered data set itself, which is not limited in this embodiment.
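As an alternative to consulting a language manual, the aligned size of a data type can be computed programmatically. A small sketch using Python's ctypes follows; the `Record` type is a hypothetical element of the ordered data set, chosen so that its aligned size matches the S_d = 8 B of the worked example:

```python
import ctypes


class Record(ctypes.Structure):
    """Hypothetical element type of the ordered data set: one 8-byte key."""
    _fields_ = [("key", ctypes.c_int64)]


s_d = ctypes.sizeof(Record)         # aligned data size S_d, 8 bytes here
align = ctypes.alignment(Record)    # alignment requirement, also 8 bytes here
```

For C or C++ data sets, `sizeof` and `alignof` on the element type give the same information.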

According to the scheme, the peripheral cache size and the cache line size of the processor cache in the operating environment and the adopted N-way set-associative structure are obtained from the processor; the aligned data size of the data type of the ordered data in the ordered data set after memory alignment and the total data amount of the ordered data set are obtained from the preset programming language manual, so that the real-time validity and accuracy of the data can be ensured, the caching performance of the processor is further ensured, and the efficiency of the binary search algorithm is improved.

Further, fig. 4 is a schematic flow chart of a third embodiment of the cache thrashing elimination method of the present invention, and as shown in fig. 4, the third embodiment of the cache thrashing elimination method of the present invention is proposed based on the first embodiment, in this embodiment, the step S30 specifically includes the following steps:

And step S31, adjusting the demarcation point index number of the currently searched ordered data set to a preset index number according to the data adjustment offset and the required adjustment times.

It should be noted that the demarcation point index number of the currently searched ordered data set can be adjusted to a preset index number by the data adjustment offset and the number of times of adjustment.

And step S32, comparing the target data corresponding to the preset index number in the ordered data set with the data to be searched to generate a comparison result.

It can be understood that the data to be searched is data to be searched in binary, and the target data corresponding to the preset index number in the ordered data set is compared with the data to be searched, so that a corresponding comparison result can be generated.

And step S33, performing data processing on the ordered data set according to the comparison result.

It should be understood that different data processing modes can be adopted for the ordered data set according to the comparison result, and further the adjustment of the binary search algorithm for the ordered data set is completed.

According to the scheme, the demarcation point index number of the currently searched ordered data set is adjusted to a preset index number according to the data adjustment offset and the required adjustment times; the target data corresponding to the preset index number in the ordered data set is compared with the data to be searched to generate a comparison result; and data processing is performed on the ordered data set according to the comparison result. Cache thrashing can thus be eliminated, performance degradation is avoided, the efficiency of the binary search algorithm is ensured, and the time consumed by the extra probes when searching part of the data is greatly reduced.

Further, fig. 5 is a schematic flow chart of a fourth embodiment of the cache thrashing elimination method according to the present invention, and as shown in fig. 5, the fourth embodiment of the cache thrashing elimination method according to the present invention is proposed based on the third embodiment, in this embodiment, before the step S31, the cache thrashing elimination method further includes the following steps:

step S301, initializing the adjusted times in the current binary search process.

It should be noted that the adjusted number of times in the current binary search process is initialized, that is, the number of times that the adjustment has been performed in the current binary search process is initialized to 0.

And step S302, when the current searched ordered data set is detected to be empty, judging that the searching is failed, and ending the searching process.

It can be understood that, when the ordered data set currently searched is detected to be empty, it indicates that the search fails, and the search process may be ended.

Step S303, recording a left boundary index number and a right boundary index number of the currently searched ordered data set, calculating an index number of a middle position of the currently searched ordered data set according to the left boundary index number and the right boundary index number, and taking the index number of the middle position as a boundary point index number.

It should be understood that after recording the left and right bound index numbers of the currently sought ordered data set, the index number of the middle position of the currently sought ordered data set may be calculated;

in general, the index number of the middle position can be determined by the following formula:

I_m = floor((I_left + I_right) / 2)

wherein I_m is the index number of the middle position of the currently searched ordered data set, I_left is the index number of the left boundary of the ordered data set, and I_right is the index number of the right boundary of the ordered data set; the index number of the middle position is rounded down and taken as the demarcation point.

According to the scheme, the adjusted times in the current binary search process are initialized; when the currently searched ordered data set is detected to be empty, the search is judged to have failed and the search process ends; the left boundary index number and the right boundary index number of the currently searched ordered data set are recorded, the index number of the middle position of the currently searched ordered data set is calculated from them, and the index number of the middle position is taken as the demarcation point index number. The demarcation point index number can thus be accurately determined, preparation is made for data processing of the ordered data set, the speed and efficiency of data processing are improved, and the time for eliminating cache thrashing is saved.

Further, fig. 6 is a schematic flow chart of a fifth embodiment of the cache thrashing elimination method according to the present invention, and as shown in fig. 6, the fifth embodiment of the cache thrashing elimination method according to the present invention is proposed based on the third embodiment, in this embodiment, the step S31 specifically includes the following steps:

step S311, obtaining the adjusted times in the current binary search process, and judging whether the adjusted times are less than the times needing to be adjusted.

It should be noted that, the adjusted number of times in the current binary search process is obtained, and whether the adjusted number of times reaches the required number of times is further determined, that is, whether the adjusted number of times is smaller than the required number of times is determined by comparison.

Step S312, when the adjusted number of times is smaller than the number of times that needs to be adjusted, setting the preset index number as the index number obtained by offsetting the demarcation point index number by the data adjustment offset toward the left boundary of the currently searched ordered data set; when the position of the preset index number exceeds the left boundary of the currently searched ordered data set, setting the preset index number as the index number of the left boundary, adjusting the demarcation point index number to the preset index number, and recording the number of times the index number has been adjusted.

In a specific implementation, the required number of adjustments may be C_a = 6. If the adjusted count C_ac < C_a = 6, the preset index number I_am is set to the index obtained by shifting the demarcation point index number I_m toward the left boundary of the currently searched ordered data set by the data adjustment offset O_d = 256 positions, i.e. I_am = I_m − O_d = I_m − 256; if the shifted position exceeds the left boundary I_left of the currently searched ordered data set, i.e. when I_am < I_left, then I_am is set to the index number of the left boundary, i.e. I_am = I_left. At the end of this step the adjusted count is set to C_ac = C_ac + 1 to increment the record of the number of adjustments made; that is, the number of times the index number has been adjusted is recorded.
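Steps S311 to S313 can be sketched as follows (a hedged illustration; the identifier names are ours):

```python
def adjusted_midpoint(i_left, i_right, o_d, c_ac, c_a):
    """Pick the demarcation index for one bisection step.

    For the first c_a bisections of a search (steps S311/S312), shift the
    midpoint left by o_d positions, clamping at the left boundary; after
    that (step S313), use the plain midpoint.  Returns the preset index
    I_am and the updated adjusted count C_ac.
    """
    i_m = (i_left + i_right) // 2        # demarcation point index, rounded down
    if c_ac < c_a:
        i_am = max(i_m - o_d, i_left)    # offset toward the left boundary, clamped
        return i_am, c_ac + 1
    return i_m, c_ac                     # no further adjustment
```

For example, with I_left = 0, I_right = 1023, O_d = 256 and no adjustments made yet, the plain midpoint 511 is shifted to 255.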

Step S313, when the adjusted number of times is equal to the number of times required to be adjusted, setting the preset index number as the demarcation point index number.

It is understood that, when the adjusted number of times is equal to the required number of adjustments, the preset index number may be directly set as the demarcation point index number, i.e. I_am = I_m, and no adjustment is made.

According to the scheme, the adjusted times in the current binary search process are acquired, and whether the adjusted times are smaller than the times needing to be adjusted is judged; when the adjusted times are smaller than the times needing to be adjusted, the preset index number is set as the index number obtained by offsetting the demarcation point index number by the data adjustment offset toward the left boundary of the currently searched ordered data set, and when the position of the preset index number exceeds the left boundary of the currently searched ordered data set, the preset index number is set as the index number of the left boundary, the demarcation point index number is adjusted to the preset index number, and the adjusted times of the index number are recorded; when the adjusted times are equal to the times needing to be adjusted, the preset index number is set as the demarcation point index number. The demarcation point index number can thus be accurately determined, preparation is made for data processing of the ordered data set, the speed and efficiency of data processing are improved, and the time for eliminating cache thrashing is saved.

Further, fig. 7 is a schematic flowchart of a sixth embodiment of the cache thrashing elimination method according to the present invention, and as shown in fig. 7, the sixth embodiment of the cache thrashing elimination method according to the present invention is proposed based on the third embodiment, in this embodiment, the step S33 specifically includes the following steps:

and step S331, when the target data is the same as the data to be searched, finishing the searching.

When the target data D_f is the same as the data to be searched D_s, the preset index number is returned directly and the search is finished.

And S332, when the target data is different from the data to be searched, dividing the ordered data set into a left data set and a right data set by taking the preset index number as a demarcation point.

It will be understood that if D_f ≠ D_s, then with I_am as a demarcation point the data set is divided into a left part and a right part, whose index number intervals are [I_left, I_am − 1] and [I_am + 1, I_right] respectively, where I_left is the left boundary and I_right is the right boundary; if the left boundary of an interval is larger than its right boundary, the ordered data set corresponding to that index number interval may be considered empty.

And S333, discarding one data set which does not contain the data to be searched, and taking the remaining data set as the input of the currently searched ordered data set.

It should be understood that, because the data set is ordered and increasing, the part that does not contain the data to be searched D_s is discarded: if D_f < D_s, the data set corresponding to the index number interval [I_am + 1, I_right] is retained; if D_f > D_s, the data set corresponding to the index number interval [I_left, I_am − 1] is retained. The retained data set is then used as the input of the currently searched ordered data set in step S31, and the process jumps back; when the target data is the same as the data to be searched, the search is finished.
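Putting steps S301 through S333 together, one possible end-to-end sketch of the adjusted binary search over a sorted Python list is shown below. The names and language are ours; the patent does not prescribe an implementation, and any valid in-range pivot keeps the search correct, so the left-shifted pivot changes the access pattern without changing the result.

```python
def thrash_free_bsearch(data, target, o_d, c_a):
    """Adjusted binary search: returns the index of target, or -1 on failure.

    o_d -- data adjustment offset, in elements
    c_a -- number of early bisections whose midpoint is shifted left
    """
    c_ac = 0                              # adjusted count, initialized (S301)
    i_left, i_right = 0, len(data) - 1
    while i_left <= i_right:              # empty set => search failed (S302)
        i_m = (i_left + i_right) // 2     # demarcation point index (S303)
        if c_ac < c_a:                    # shift midpoint for early rounds (S312)
            i_am = max(i_m - o_d, i_left)
            c_ac += 1
        else:
            i_am = i_m                    # plain midpoint thereafter (S313)
        if data[i_am] == target:          # found (S331)
            return i_am
        if data[i_am] < target:           # keep right interval (S333)
            i_left = i_am + 1
        else:                             # keep left interval
            i_right = i_am - 1
    return -1
```

Because the pivot is always within the current interval, every comparison discards at least one element, so the loop terminates and finds the target whenever it is present; the shifted pivots merely cost a few extra probes in exchange for breaking the cache-set alignment of the early midpoints.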

According to the scheme, when the target data is the same as the data to be searched, the search is finished; when the target data is different from the data to be searched, the ordered data set is divided into a left data set and a right data set with the preset index number as the dividing point; the data set that does not contain the data to be searched is discarded and the remaining data set is taken as the input of the currently searched ordered data set. The input of the effective data set is thus obtained accurately, the data set not containing the data to be searched is screened out in advance, the amount of data processed is reduced, the speed and efficiency of data processing are improved, and the time for eliminating cache thrashing is saved.

Correspondingly, the invention further provides a cache thrashing eliminating device.

Referring to fig. 8, fig. 8 is a functional block diagram of a first embodiment of a cache thrashing elimination device according to the present invention.

In a first embodiment of the present invention, the cache thrashing elimination apparatus comprises:

the obtaining module 10 is configured to obtain a peripheral cache size and a cache line size of a processor cache in an operating environment, and a number of adopted group-connected structure ways, and obtain an aligned data size and a total data amount of an ordered data set after performing memory alignment on a data type in the ordered data set with continuous address space.

And a calculating module 20, configured to calculate a data adjustment offset and the number of adjustments according to the peripheral cache size, the cache line size, the number of set-associative ways, the aligned data size, and the total data amount.

And the adjusting module 30 is configured to perform binary search adjustment on the ordered data set according to the data adjustment offset and the required adjustment times.

The obtaining module 10 is further configured to obtain, from the processor, the peripheral cache size and cache line size of the processor cache in the operating environment and the N-way set-associative structure adopted; and to acquire, from a preset programming language manual, the aligned data size of the data type of the ordered data in the ordered data set after memory alignment and the total data amount of the ordered data set.

The computing module 20 is further configured to obtain the one-way cache capacity according to the peripheral cache size and the N-way set-associative structure through the following formula:

S_cw = S_c / N

wherein S_cw is the one-way cache capacity, S_c is the peripheral cache size, and N is the number of set-associative ways;

to obtain the amount of data that one way of the cache can hold according to the one-way cache capacity and the aligned data size through the following formula:

Q_dcw = S_cw / S_d

wherein Q_dcw is the amount of data one way of the cache can hold, S_cw is the one-way cache capacity, and S_d is the aligned data size;

to obtain the number of cache lines in one way of the cache according to the one-way cache capacity and the cache line size through the following formula:

Q_clw = S_cw / S_cl

wherein Q_clw is the number of cache lines in one way of the cache, S_cw is the one-way cache capacity, and S_cl is the cache line size;

to obtain the average amount of data one cache line can hold according to the amount of data one way can hold and the number of cache lines through the following formula:

Q_dcl = Q_dcw / Q_clw

wherein Q_dcl is the average amount of data one cache line can hold, Q_dcw is the amount of data one way of the cache can hold, and Q_clw is the number of cache lines in one way of the cache;

and to calculate the data adjustment offset and the number of adjustments according to the average amount of data one cache line can hold.

The adjusting module 30 is further configured to adjust the demarcation point index number of the currently searched ordered data set to a preset index number according to the data adjustment offset and the required adjustment times; compare the target data corresponding to the preset index number in the ordered data set with the data to be searched to generate a comparison result; and perform data processing on the ordered data set according to the comparison result.

The adjusting module 30 is further configured to initialize the adjusted times in the current binary search process; when detecting that the ordered data set searched currently is empty, judging that the searching fails, and ending the searching process; recording a left boundary index number and a right boundary index number of the currently searched ordered data set, calculating an index number of a middle position of the currently searched ordered data set according to the left boundary index number and the right boundary index number, and taking the index number of the middle position as a demarcation point index number.

The adjusting module 30 is further configured to acquire the adjusted times in the current binary search process and judge whether the adjusted times are smaller than the times needing to be adjusted; when the adjusted times are smaller than the times needing to be adjusted, set the preset index number as the index number obtained by offsetting the demarcation point index number by the data adjustment offset toward the left boundary of the currently searched ordered data set, and when the position of the preset index number exceeds the left boundary of the currently searched ordered data set, set the preset index number as the index number of the left boundary, adjust the demarcation point index number to the preset index number, and record the adjusted times of the index number; and when the adjusted times are equal to the times needing to be adjusted, set the preset index number as the demarcation point index number.

The adjusting module 30 is further configured to complete the search when the target data is the same as the data to be searched; when the target data is different from the data to be searched, the ordered data set is divided into a left data set and a right data set by taking the preset index number as a dividing point; and discarding one data set which does not contain the data to be searched, and taking the remaining one data set as the input of the ordered data set currently searched.

The steps implemented by each functional module of the cache thrashing elimination apparatus may refer to each embodiment of the cache thrashing elimination method of the present invention, and are not described herein again.

In addition, an embodiment of the present invention further provides a storage medium, where a cache thrashing elimination program is stored on the storage medium, and when executed by a processor, the cache thrashing elimination program implements the following operations:

acquiring the peripheral cache size and cache line size of the processor cache in the operating environment and the number of set-associative ways adopted, and acquiring the aligned data size after memory alignment of the data type in an ordered data set with contiguous address space, as well as the total data amount of the ordered data set;

calculating a data adjustment offset and the number of adjustments according to the peripheral cache size, the cache line size, the number of set-associative ways, the aligned data size and the total data amount;

and performing binary search adjustment on the ordered data set according to the data adjustment offset and the required adjustment times.

Further, the cache thrashing elimination program when executed by the processor further implements the following operations:

acquiring the peripheral cache size and the cache line size of the processor cache in the operating environment and the adopted N-way set-associative structure from the processor;

and acquiring the size of aligned data of the data type of the ordered data in the ordered data set after memory alignment and the total data amount of the ordered data set from a preset programming language manual.

Further, the cache thrashing elimination program when executed by the processor further implements the following operations:

obtaining the one-way cache capacity according to the peripheral cache size and the N-way set-associative structure through the following formula:

S_cw = S_c / N

wherein S_cw is the one-way cache capacity, S_c is the peripheral cache size, and N is the number of set-associative ways;

obtaining the amount of data that one way of the cache can hold according to the one-way cache capacity and the aligned data size through the following formula:

Q_dcw = S_cw / S_d

wherein Q_dcw is the amount of data one way of the cache can hold, S_cw is the one-way cache capacity, and S_d is the aligned data size;

obtaining the number of cache lines in one way of the cache according to the one-way cache capacity and the cache line size through the following formula:

Q_clw = S_cw / S_cl

wherein Q_clw is the number of cache lines in one way of the cache, S_cw is the one-way cache capacity, and S_cl is the cache line size;

obtaining the average amount of data one cache line can hold according to the amount of data one way can hold and the number of cache lines through the following formula:

Q_dcl = Q_dcw / Q_clw

wherein Q_dcl is the average amount of data one cache line can hold, Q_dcw is the amount of data one way of the cache can hold, and Q_clw is the number of cache lines in one way of the cache;

and calculating the data adjustment offset and the number of adjustments according to the average amount of data one cache line can hold.

Further, the cache thrashing elimination program when executed by the processor further implements the following operations:

adjusting the demarcation point index number of the currently searched ordered data set to a preset index number according to the data adjustment offset and the required adjustment times;

comparing the target data corresponding to the preset index number in the ordered data set with the data to be searched to generate a comparison result;

and carrying out data processing on the ordered data set according to the comparison result.

Further, the cache thrashing elimination program when executed by the processor further implements the following operations:

initializing the adjusted times in the current binary search process;

when detecting that the ordered data set searched currently is empty, judging that the searching fails, and ending the searching process;

recording a left boundary index number and a right boundary index number of the currently searched ordered data set, calculating an index number of a middle position of the currently searched ordered data set according to the left boundary index number and the right boundary index number, and taking the index number of the middle position as a demarcation point index number.

Further, the cache thrashing elimination program when executed by the processor further implements the following operations:

acquiring the adjusted times in the current binary search process, and judging whether the adjusted times are smaller than the times needing to be adjusted;

when the adjusted times are smaller than the times needing to be adjusted, setting the preset index number as the index number obtained by offsetting the demarcation point index number by the data adjustment offset toward the left boundary of the currently searched ordered data set, and when the position of the preset index number exceeds the left boundary of the currently searched ordered data set, setting the preset index number as the index number of the left boundary, adjusting the demarcation point index number to the preset index number, and recording the adjusted times of the index number;

and when the adjusted times are equal to the times needing to be adjusted, setting the preset index number as the demarcation point index number.
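The clamped left shift described in the two operations above can be sketched as follows, assuming `adjusted`, `required`, and `offset` are maintained per search (names are illustrative, not from the patent):

```python
def preset_index(mid, left, offset, adjusted, required):
    """Return the probe (preset) index number: while fewer than
    `required` adjustments have been made, shift the demarcation
    point `mid` toward the left boundary by `offset`, clamping at
    the left boundary `left`; afterwards, use `mid` unchanged."""
    if adjusted < required:
        return max(mid - offset, left)  # clamp at the left boundary
    return mid
```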

Further, when executed by the processor, the cache thrashing elimination program implements the following operations:

when the target data is the same as the data to be searched, ending the search;

when the target data is different from the data to be searched, the ordered data set is divided into a left data set and a right data set by taking the preset index number as a dividing point;

and discarding the data set which does not contain the data to be searched, and taking the remaining data set as the currently searched ordered data set for the next iteration.
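Taken together, the operations above amount to a binary search whose probe index is nudged toward the left boundary for the first few iterations, so that successive probes no longer map to the same cache set. A minimal Python sketch, assuming `offset` and `required` have already been derived from the cache parameters (function and variable names are illustrative, not from the patent):

```python
def offset_binary_search(data, target, offset, required):
    """Binary search over a sorted sequence, shifting the probe
    index left by `offset` (clamped at the left boundary) for the
    first `required` iterations; correctness is preserved because
    the probe always stays within [left, right]."""
    left, right = 0, len(data) - 1
    adjusted = 0  # adjustments performed so far in this search
    while left <= right:  # empty data set => search fails
        mid = left + (right - left) // 2  # demarcation point index
        if adjusted < required:
            probe = max(mid - offset, left)  # clamp at left boundary
            adjusted += 1
        else:
            probe = mid
        if data[probe] == target:
            return probe               # search succeeds
        elif data[probe] < target:
            left = probe + 1           # discard the left part
        else:
            right = probe - 1          # discard the right part
    return -1  # search fails
```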

According to the scheme, the peripheral cache size and cache line size of the processor cache in the operating environment and the number of the adopted group connection structure paths are obtained, together with the size of the aligned data after memory alignment of the data type in the ordered data set with continuous address space and the total data amount of the ordered data set; the data adjustment offset and the required adjustment times are calculated according to the peripheral cache size, the cache line size, the number of the group connection structure paths, the aligned data size and the total data amount; and binary search adjustment is performed on the ordered data set according to the data adjustment offset and the required adjustment times. Cache thrashing can thereby be eliminated, performance degradation is avoided, the efficiency of the binary search algorithm is preserved, and the time consumed by the increased number of probes when searching part of the data is greatly reduced.
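This excerpt does not reproduce the concrete formula relating the cache geometry to the offset and the required adjustment times. The underlying intuition is that, for suitably sized arrays, the strides between successive binary-search midpoints are multiples of the per-way span (peripheral cache size divided by the number of ways), so consecutive probes evict one another from the same cache set. The following sketch shows one plausible derivation of the two quantities; every formula here is an assumption for illustration only, not the patent's claimed computation:

```python
def derive_adjustment(cache_size, line_size, ways, elem_size, total):
    """ILLUSTRATIVE ONLY -- not the patent's formula.
    way_span: bytes that map to the same cache set across ways.
    offset:   shift by one cache line's worth of elements, so the
              adjusted probe falls in a different cache line.
    required: number of initial halvings whose byte stride is still
              a multiple of way_span, i.e. those that can conflict."""
    way_span = cache_size // ways
    offset = max(line_size // elem_size, 1)
    stride = (total * elem_size) // 2  # byte stride of first halving
    required = 0
    while stride >= way_span and stride % way_span == 0:
        required += 1
        stride //= 2
    return offset, required
```

For a power-of-two data set the loop yields several conflicting levels, while for irregular sizes it typically yields zero, matching the observation that thrashing is most pronounced on power-of-two layouts.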

It should be noted that, in this document, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or system that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or system. Without further limitation, an element defined by the phrase "comprising a …" does not exclude the presence of other like elements in a process, method, article, or system that comprises the element.

The above-mentioned serial numbers of the embodiments of the present invention are merely for description and do not represent the merits of the embodiments.

The above description is only a preferred embodiment of the present invention, and not intended to limit the scope of the present invention, and all modifications of equivalent structures and equivalent processes, which are made by using the contents of the present specification and the accompanying drawings, or directly or indirectly applied to other related technical fields, are included in the scope of the present invention.
