Data caching method and device based on accessed times

Document No.: 1003494 | Publication date: 2020-10-23 | Views: 30

Reading note: this technology, "一种基于被访问次数的数据缓存方法和装置" (Data caching method and device based on accessed times), was designed and created by 邓智成 (Deng Zhicheng) on 2020-06-29. Its main content is as follows. The invention discloses a data caching method and device based on the number of times data is accessed, comprising the following steps: S1, obtain the number of times every cached data item was accessed in each time period; S2, for all cached data in each time period, assign the data to different levels according to its access count; S3, extend or shorten the caching duration of the data according to its access counts in the current and previous time periods and its levels in the current and previous time periods. By adopting the above technical solution, compared with the prior art, the invention extends or shortens a data item's caching duration according to its access count, which not only guarantees server availability but also improves the cache hit rate under different data conditions.

1. A data caching method based on the number of times data is accessed, the method comprising the steps of:

S1, obtaining, for each time period, the number of times every cached data item was accessed;

S2, for all cached data in each time period, assigning the data to different levels according to the number of times the data was accessed;

and S3, extending or shortening the caching duration of the data according to the number of times the data was accessed in the current time period, the number of times it was accessed in the previous time period, and its level in the current and previous time periods.

2. The data caching method based on the number of times data is accessed according to claim 1, wherein step S2 specifically comprises: if the access count is greater than or equal to 100 and less than or equal to 1000, the data is assigned to level 1; if the access count is greater than 1000 and less than or equal to 10000, the data is assigned to level 2; if the access count is greater than 10000, the data is assigned to level 3.

3. The data caching method based on the number of times data is accessed according to claim 1 or 2, wherein step S3 specifically comprises: if the access count of the data in the current time period is greater than or equal to its access count in the previous time period, and the level of the data in the current time period is greater than or equal to 1, the caching duration of the data is extended; if the access count of the data in the current time period is less than its access count in the previous time period, and the level of the data in the current time period is lower than its level in the previous time period, the caching duration of the data is shortened.

4. The data caching method based on the number of times data is accessed according to claim 3, wherein shortening the caching duration of the data in step S3 means halving the caching duration of the data.

5. The data caching method based on the number of times data is accessed according to claim 1 or 2, wherein: when the current level of the data is level 1, each cache extension adds 10 minutes; when the current level of the data is level 2, each cache extension adds 30 minutes; when the current level of the data is level 3, each cache extension adds 60 minutes.

6. The data caching method based on the number of times data is accessed according to claim 1 or 2, wherein: when the current level of the data is level 1, the lower limit of the caching duration is 2 minutes; when the current level is level 2, the lower limit is 2 minutes; when the current level is level 3, the lower limit is 4 minutes.

7. The data caching method based on the number of times data is accessed according to claim 1 or 2, wherein: when the current level of the data is level 1, the upper limit of the caching duration is 2 hours; when the current level is level 2, the upper limit is 4 hours; when the current level is level 3, the upper limit is 8 hours.

8. A data caching apparatus based on the number of times data is accessed, comprising:

an acquisition unit, used to obtain the number of times every cached data item was accessed in each time period;

a dividing unit, used to assign all cached data in each time period to different levels according to the number of times the data was accessed;

an execution unit, used to extend or shorten the caching duration of the data according to the number of times the data was accessed in the current time period, the number of times it was accessed in the previous time period, and its level in the current and previous time periods;

the acquisition unit is connected to the dividing unit and sends the access counts of the data to the dividing unit; the dividing unit receives the access counts of all cached data in each time period and assigns the data to different levels according to those counts; and the execution unit extends or shortens the caching duration of the data according to the access counts of the data in the current and previous time periods and its levels in the current and previous time periods.

9. The data caching apparatus based on the number of times data is accessed according to claim 8, wherein the dividing unit divides all cached data in each time period according to the following rules: if the access count is greater than or equal to 100 and less than or equal to 1000, the data is assigned to level 1; if the access count is greater than 1000 and less than or equal to 10000, the data is assigned to level 2; if the access count is greater than 10000, the data is assigned to level 3.

10. The data caching apparatus based on the number of times data is accessed according to claim 8 or 9, wherein the execution unit extends or shortens the caching duration of the data according to the following rules: if the access count of the data in the current time period is greater than or equal to its access count in the previous time period, and the level of the data in the current time period is greater than or equal to 1, the caching duration of the data is extended; if the access count of the data in the current time period is less than its access count in the previous time period, and the level of the data in the current time period is lower than its level in the previous time period, the caching duration of the data is shortened.

Technical Field

The invention relates to the field of data processing, and in particular to a data caching method and device based on the number of times data is accessed.

Background

In conventional caching technology, a fixed caching duration is generally set; when that duration expires, cache entries are evicted according to a corresponding eviction algorithm, so that as much data as possible is cached within a limited cache size.

Cache: any structure that bridges the gap between two pieces of hardware or two services whose data transfer speeds differ greatly. Back-to-origin (回源): fetching data from the origin server when the cache does not hold it. Hit rate: when an end user accesses an acceleration node, the request is a hit if the node has cached the requested data, and a miss if the node has to go back to the origin server to fetch it; hit rate = hits / (hits + misses).
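The hit-rate formula above is simple enough to pin down in code (a trivial helper included only to make the definition concrete; the function name is our own):

```python
def hit_rate(hits: int, misses: int) -> float:
    """hit rate = hits / (hits + misses), where a miss is a request
    that had to go back to the origin server."""
    total = hits + misses
    return hits / total if total else 0.0
```

For example, 900 hits and 100 back-to-origin fetches give a hit rate of 0.9.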

Conventional caching strategies generally fall into three categories: LRU, LFU, and ARC. LRU (Least Recently Used) preferentially evicts the entry that has gone unused the longest; LFU (Least Frequently Used) preferentially evicts the entry with the lowest use frequency; ARC (Adaptive Replacement Cache) consists of two LRU lists: the first, L1, holds entries recently used exactly once (new objects), and the second, L2, holds entries recently used at least twice (frequently used objects).
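As a point of reference for the comparison that follows, the LRU policy above can be sketched in a few lines (a minimal illustration, not the eviction code of any particular cache product):

```python
from collections import OrderedDict

class LRUCache:
    """Minimal LRU cache: when full, evicts the least recently used entry."""

    def __init__(self, capacity: int):
        self.capacity = capacity
        self.entries = OrderedDict()  # insertion order tracks recency

    def get(self, key, default=None):
        if key not in self.entries:
            return default
        self.entries.move_to_end(key)  # mark as most recently used
        return self.entries[key]

    def put(self, key, value):
        if key in self.entries:
            self.entries.move_to_end(key)
        self.entries[key] = value
        if len(self.entries) > self.capacity:
            self.entries.popitem(last=False)  # evict least recently used
```

Note that this evicts purely by recency; the strategy proposed below instead keeps per-period access counts so that short-lived hot data can be treated differently.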

The disadvantages of these existing caching strategies are: first, the fixed caching duration is generally short (for example, 1 to 10 minutes), so when the server that returns the original data has a problem, the service can become unavailable; second, all entries use the same fixed caching duration without considering long-lived versus short-lived hot data, so the hit rate does not reach a high level.

Disclosure of Invention

To solve the problems described in the background art, namely that existing caching strategies cannot guarantee server availability and yield a low cache hit rate, the invention provides a data caching method based on the number of times data is accessed. The specific technical scheme is as follows.

A data caching method based on the number of times data is accessed, the method comprising the steps of:

S1, obtaining, for each time period, the number of times every cached data item was accessed;

S2, for all cached data in each time period, assigning the data to different levels according to the number of times the data was accessed;

and S3, extending or shortening the caching duration of the data according to the number of times the data was accessed in the current time period, the number of times it was accessed in the previous time period, and its level in the current and previous time periods.

current rate = (current total read count (cur_cnt) − previous total read count (cnt)) / (current time − last update time). By computing the current rate and comparing it with the previously stored rate (rate), we can determine whether the access frequency is rising or falling and which level the data sits in; the caching duration is extended or shortened according to this logic, and the current rate overwrites the rate value in the cache's shared metadata (so the next comparison has a baseline). This rate strategy ensures that hot data appearing within a short time can still be served normally for a longer stretch (for example, 2 to 8 hours) even when fetching from the origin server fails. For example, if a star announces a marriage at 12:00, video traffic may surge suddenly, and under the rate policy the video's cache entry keeps being renewed.

Preferably, step S2 specifically comprises: if the access count is greater than or equal to 100 and less than or equal to 1000, the data is assigned to level 1; if the access count is greater than 1000 and less than or equal to 10000, the data is assigned to level 2; if the access count is greater than 10000, the data is assigned to level 3. A traditional caching strategy evicts by a fixed algorithm (such as LRU or LFU), which is not necessarily suitable for every project; this strategy seeks the optimal caching behaviour by designing a three-level scheme and tuning it.
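The three-level division can be written directly from the thresholds above. Counts below 100 fall outside all three levels; treating them as "level 0" is our assumption, as the text does not name that case:

```python
def level_of(access_count: int) -> int:
    """Map one period's access count to a level, per the thresholds of step S2.
    Level 0 (below all thresholds) is an assumption for untiered data."""
    if access_count > 10000:
        return 3
    if access_count > 1000:
        return 2
    if access_count >= 100:
        return 1
    return 0
```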

Preferably, step S3 specifically comprises: if the access count of the data in the current time period is greater than or equal to its access count in the previous time period, and the level of the data in the current time period is greater than or equal to 1, the caching duration of the data is extended; if the access count of the data in the current time period is less than its access count in the previous time period, and the level of the data in the current time period is lower than its level in the previous time period, the caching duration of the data is shortened.

Preferably, shortening the caching duration of the data in step S3 means halving it. A traditional caching strategy uses a fixed expiry mechanism that ignores the short-term access patterns of hot data; here the caching duration is extended on a level upgrade and shortened on a level downgrade, so short-term hot data can be cached effectively.

Preferably, when the current level of the data is level 1, each cache extension adds 10 minutes; when the current level is level 2, each extension adds 30 minutes; when the current level is level 3, each extension adds 60 minutes. This automatic extension mechanism lets short-term hot data stay cached effectively for a long time.

Preferably, when the current level of the data is level 1, the lower limit of the caching duration is 2 minutes; when the current level is level 2, the lower limit is 2 minutes; when the current level is level 3, the lower limit is 4 minutes. This automatic shortening mechanism removes unneeded cached data and frees cache space in good time.

Preferably, when the current level of the data is level 1, the upper limit of the caching duration is 2 hours; when the current level is level 2, the upper limit is 4 hours; when the current level is level 3, the upper limit is 8 hours. The upper limit prevents data from being cached indefinitely.
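Putting the per-level extension steps, lower limits, and upper limits above together with the extend/halve rule of step S3 gives the following sketch. The parameter values come from the text; tracking the TTL in minutes and the function signature are our assumptions:

```python
# Per-level parameters from the text: (extension step, lower limit, upper limit), in minutes.
LEVEL_PARAMS = {
    1: (10, 2, 2 * 60),
    2: (30, 2, 4 * 60),
    3: (60, 4, 8 * 60),
}

def adjust_ttl(ttl_minutes, cur_count, prev_count, cur_level, prev_level):
    """Extend or shorten a cache TTL per step S3, clamped to the
    per-level upper and lower limits."""
    if cur_count >= prev_count and cur_level >= 1:
        step, _, upper = LEVEL_PARAMS[cur_level]
        return min(ttl_minutes + step, upper)          # extend, capped
    if cur_count < prev_count and cur_level < prev_level:
        # if the data dropped below level 1, fall back to level 1's floor (assumption)
        lower = LEVEL_PARAMS[cur_level][1] if cur_level in LEVEL_PARAMS else 2
        return max(ttl_minutes / 2, lower)             # halve, floored
    return ttl_minutes                                  # otherwise unchanged
```

For instance, a level-1 item whose count keeps rising gains 10 minutes per period until it reaches the 2-hour cap.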

Preferably, the time period ranges from 1 to 10 minutes.

Based on the same inventive concept, the invention also provides a data caching device based on the number of times data is accessed, comprising:

an acquisition unit, used to obtain the number of times every cached data item was accessed in each time period;

a dividing unit, used to assign all cached data in each time period to different levels according to the number of times the data was accessed;

an execution unit, used to extend or shorten the caching duration of the data according to the number of times the data was accessed in the current time period, the number of times it was accessed in the previous time period, and its level in the current and previous time periods;

the acquisition unit is connected to the dividing unit and sends the access counts of the data to the dividing unit; the dividing unit receives the access counts of all cached data in each time period and assigns the data to different levels according to those counts; and the execution unit extends or shortens the caching duration of the data according to the access counts of the data in the current and previous time periods and its levels in the current and previous time periods.

current rate = (current total read count (cur_cnt) − previous total read count (cnt)) / (current time − last update time). By computing the current rate and comparing it with the previously stored rate (rate), we can determine whether the access frequency is rising or falling and which level the data sits in; the caching duration is extended or shortened according to this logic, and the current rate overwrites the rate value in the cache's shared metadata (so the next comparison has a baseline). This rate strategy ensures that hot data appearing within a short time can still be served normally for a longer stretch (for example, 2 to 8 hours) even when fetching from the origin server fails. For example, if Zhao Liying announces a marriage at 12:00, video traffic will surge suddenly, and under the rate policy the video's cache entry keeps being renewed.

Preferably, the dividing unit divides all cached data in each time period according to the following rules: if the access count is greater than or equal to 100 and less than or equal to 1000, the data is assigned to level 1; if the access count is greater than 1000 and less than or equal to 10000, the data is assigned to level 2; if the access count is greater than 10000, the data is assigned to level 3. A traditional caching strategy evicts by a fixed algorithm (such as LRU or LFU), which is not necessarily suitable for every project; this strategy seeks the optimal caching behaviour by designing a three-level scheme and tuning it.

Preferably, the execution unit extends or shortens the caching duration of the data according to the following rules: if the access count of the data in the current time period is greater than or equal to its access count in the previous time period, and the level of the data in the current time period is greater than or equal to 1, the caching duration of the data is extended; if the access count of the data in the current time period is less than its access count in the previous time period, and the level of the data in the current time period is lower than its level in the previous time period, the caching duration of the data is shortened. A traditional caching strategy uses a fixed expiry mechanism that ignores the short-term access patterns of hot data; here the caching duration is extended on a level upgrade and shortened on a level downgrade, so short-term hot data can be cached effectively.

By adopting the above technical scheme, compared with the prior art, the invention extends or shortens the caching duration of data according to its access count, which both guarantees server availability and improves the cache hit rate under different data conditions.

Drawings

Fig. 1 is a flowchart of a data caching method based on the number of times data is accessed according to an embodiment of the present invention;

Fig. 2 is a schematic diagram of the mechanism for extending or shortening the caching duration of data at each level;

Fig. 3 is a structural diagram of a data caching apparatus based on the number of times data is accessed according to an embodiment of the present invention.

Detailed Description

The present invention is described in further detail below with reference to the attached drawing figures.

As shown in Fig. 1, a data caching method based on the number of times data is accessed comprises the following steps:

S1, obtaining the number of times every cached data item was accessed within each one-minute time period;

S2, for all cached data in each time period, assigning the data to different levels according to its access count: if the access count is greater than or equal to 100 and less than or equal to 1000, the data is assigned to level 1; if the access count is greater than 1000 and less than or equal to 10000, the data is assigned to level 2; if the access count is greater than 10000, the data is assigned to level 3;

s3, if the number of times of accessing the data in the current time period is larger than or equal to the number of times of accessing the data in the previous time period, and the hierarchy of the data in the current time period is larger than or equal to 1, prolonging the cache duration of the data; and if the accessed times of the data in the current time period are less than the accessed times in the previous time period, and the hierarchy of the data in the current time period is less than the hierarchy of the data in the previous time period, halving the cache duration of the data.

As shown in Fig. 2, when the current level of the data is level 1, each cache extension adds 10 minutes; when the current level is level 2, each extension adds 30 minutes; when the current level is level 3, each extension adds 60 minutes.

When the current level of the data is level 1, the lower limit of the caching duration is 2 minutes; when the current level is level 2, the lower limit is 2 minutes; when the current level is level 3, the lower limit is 4 minutes.

When the current level of the data is level 1, the upper limit of the caching duration is 2 hours; when the current level is level 2, the upper limit is 4 hours; when the current level is level 3, the upper limit is 8 hours.
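The embodiment above (one-minute periods, three levels, per-level extension steps and limits) can be tied together in a single periodic sweep. This is an illustrative sketch, not the patented implementation; the entry layout (a dict with per-period counts, last level, and TTL fields) is assumed:

```python
# Per-level parameters from the embodiment: (step, lower, upper), in minutes.
LEVEL_PARAMS = {1: (10, 2, 120), 2: (30, 2, 240), 3: (60, 4, 480)}

def classify(count: int) -> int:
    """Thresholds from step S2; level 0 (below all thresholds) is an assumption."""
    if count > 10000:
        return 3
    if count > 1000:
        return 2
    if count >= 100:
        return 1
    return 0

def sweep(entries: dict) -> None:
    """Run once per time period (one minute in the embodiment) over all cached
    entries, applying step S3 and rolling the counting window forward."""
    for e in entries.values():
        level = classify(e["cur_count"])
        if e["cur_count"] >= e["prev_count"] and level >= 1:
            step, _, upper = LEVEL_PARAMS[level]
            e["ttl"] = min(e["ttl"] + step, upper)          # extend, capped
        elif e["cur_count"] < e["prev_count"] and level < e["prev_level"]:
            lower = LEVEL_PARAMS[level][1] if level in LEVEL_PARAMS else 2
            e["ttl"] = max(e["ttl"] / 2, lower)             # halve, floored
        # roll the window for the next period
        e["prev_count"], e["cur_count"], e["prev_level"] = e["cur_count"], 0, level
```

So a video whose count jumps from 800 to 1500 in one minute moves from level 1 to level 2 and gains 30 minutes of TTL; if its traffic then collapses, its TTL is halved each period down to the level's floor.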

As shown in Fig. 3, a data caching apparatus based on the number of times data is accessed comprises:

an acquisition unit, used to obtain the number of times every cached data item was accessed in each time period;

a dividing unit, used to assign all cached data in each time period to different levels according to its access count: when the access count is greater than or equal to 100 and less than or equal to 1000, the data is assigned to level 1; when the access count is greater than 1000 and less than or equal to 10000, the data is assigned to level 2; when the access count is greater than 10000, the data is assigned to level 3;

an execution unit, used to extend or shorten the caching duration of the data according to its access counts and levels in the current and previous time periods: when the access count of the data in the current time period is greater than or equal to its access count in the previous time period, and the level of the data in the current time period is greater than or equal to 1, the caching duration of the data is extended; when the access count of the data in the current time period is less than its access count in the previous time period, and the level of the data in the current time period is lower than its level in the previous time period, the caching duration of the data is shortened;

the acquisition unit is connected to the dividing unit and sends the access counts of the data to the dividing unit; the dividing unit receives the access counts of all cached data in each time period and assigns the data to different levels according to those counts; and the execution unit extends or shortens the caching duration of the data according to the access counts of the data in the current and previous time periods and its levels in the current and previous time periods.

Finally, it should be noted that: the above examples are only intended to illustrate the technical solution of the present invention, but not to limit it; although the present invention has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some technical features may be equivalently replaced; and such modifications or substitutions do not depart from the spirit and scope of the corresponding technical solutions of the embodiments of the present invention.
