Data pre-reading optimization method and device

Document No.: 948179    Publication date: 2020-10-30

Reading note: This invention, "Data pre-reading optimization method and device" (一种数据预读取的优化方法和装置), was designed and created by 李龙翔, 刘羽, 崔坤磊, 张敏, 杨振宇, 于占乐 and 王倩 on 2020-07-17. Its main content is as follows: The invention discloses a data pre-reading optimization method and device. The method includes: collecting running characteristic information of each function in a CFD program while the program runs, analyzing that information to determine the functions to be optimized, and writing them to an analysis log; for each function to be optimized in the analysis log, constructing an action value function that takes the function's pre-read scheduling distance and pre-read scheduling position as the state and changes to its optimization result as the actions; for each action value function, performing iterative training with a reinforcement learning algorithm, using the single-step speed of the CFD solver as the reward, until the action value function converges; and determining the optimal pre-read scheduling distance and pre-read scheduling position from the corresponding state, action and converged action value function so as to perform data pre-reading in the cache. The invention can obtain the optimal pre-read optimization values of the PSD and the PSP, thereby improving the running speed of the CFD solver.

1. A data pre-reading optimization method, characterized by comprising the following steps:

collecting running characteristic information of each function in a CFD program while the CFD program runs, analyzing the running characteristic information to determine the functions to be optimized, and writing them to an analysis log;

for each function to be optimized in the analysis log, constructing an action value function that takes the function's pre-read scheduling distance and pre-read scheduling position as the state and takes changes to the function's optimization result as the actions;

for each action value function, performing iterative training with a reinforcement learning algorithm, using the single-step speed of a CFD solver as the reward, until the action value function converges;

and determining the optimal pre-read scheduling distance and pre-read scheduling position from the corresponding state, action and converged action value function, so as to perform data pre-reading in the cache.

2. The method of claim 1, wherein the running characteristic information of each function includes the function running time and the function cache hit rate;

analyzing the running characteristic information to determine the functions to be optimized comprises: determining a function whose running time is higher than a preset value and/or whose cache hit rate is lower than a preset value as a function to be optimized;

and writing to the analysis log comprises: writing each function to be optimized and its associated variables into the analysis log.

3. The method of claim 1, further comprising:

when optimizing a plurality of functions to be optimized, ordering them by function running time, optimizing the functions with higher running time until their running time is no longer higher than the preset value, and then optimizing the functions with lower running time.

4. The method of claim 1, wherein, for each function to be optimized in the analysis log, constructing an action value function with its pre-read scheduling distance and pre-read scheduling position as the state and changes to its optimization result as the actions comprises:

forming an optimization result vector, used as the state, from the set of value pairs of all possible pre-read scheduling distances and pre-read scheduling positions of the function to be optimized;

and forming the changes to the optimization result, used as the actions, from the set of change actions on all possible pre-read scheduling distances and pre-read scheduling positions of the function to be optimized.

5. The method of claim 4, wherein, for each action value function, performing iterative training with a reinforcement learning algorithm, using the single-step speed of the CFD solver as the reward, until the action value function converges comprises:

initializing the action value function and setting its current state according to the optimization result vector;

selecting an action from the set of change actions and executing it on the current state to obtain a reward and a next state, updating the action value function according to the reward and the next state, and overwriting the current state with the next state;

and repeating the previous step until the action value function converges.

6. The method of claim 5, wherein updating the action value function according to the reward and the next state comprises:

Q(s, a) = (1 − β)·Q(s, a) + β[R + γ·max_a′ Q(s′, a′)]

where s is the current state, a is the action, Q(s, a) is the action value function, β is the learning rate, R is the reward, γ is the discount factor, s′ is the next state, and max_a′ Q(s′, a′) is the maximum action value over the actions available in the next state.

7. The method of claim 5, wherein setting the current state of the action value function according to the optimization result vector comprises: setting the pre-read scheduling distance to zero, and setting the pre-read scheduling position to the position where the cache hit rate of the function to be optimized is lowest.

8. A data pre-reading optimization apparatus, comprising:

a processor; and

a memory storing program code executable by the processor, the program code, when executed, performing the following steps in sequence:

collecting running characteristic information of each function in a CFD program while the CFD program runs, analyzing the running characteristic information to determine the functions to be optimized, and writing them to an analysis log;

for each function to be optimized in the analysis log, constructing an action value function that takes the function's pre-read scheduling distance and pre-read scheduling position as the state and takes changes to the function's optimization result as the actions;

for each action value function, performing iterative training with a reinforcement learning algorithm, using the single-step speed of a CFD solver as the reward, until the action value function converges;

and determining the optimal pre-read scheduling distance and pre-read scheduling position from the corresponding state, action and converged action value function, so as to perform data pre-reading in the cache.

9. The apparatus of claim 8, wherein, for each function to be optimized in the analysis log, constructing an action value function with its pre-read scheduling distance and pre-read scheduling position as the state and changes to its optimization result as the actions comprises:

forming an optimization result vector, used as the state, from the set of value pairs of all possible pre-read scheduling distances and pre-read scheduling positions of the function to be optimized;

and forming the changes to the optimization result, used as the actions, from the set of change actions on all possible pre-read scheduling distances and pre-read scheduling positions of the function to be optimized.

10. The apparatus of claim 9, wherein, for each action value function, performing iterative training with a reinforcement learning algorithm, using the single-step speed of the CFD solver as the reward, until the action value function converges comprises:

initializing the action value function and setting its current state according to the optimization result vector;

selecting an action from the set of change actions and executing it on the current state to obtain a reward and a next state, updating the action value function according to the reward and the next state, and overwriting the current state with the next state, wherein the action value function is updated as Q(s, a) = (1 − β)·Q(s, a) + β[R + γ·max_a′ Q(s′, a′)], where s is the current state, a is the action, Q(s, a) is the action value function, β is the learning rate, R is the reward, γ is the discount factor, s′ is the next state, and max_a′ Q(s′, a′) is the maximum action value over the actions available in the next state;

and repeating the previous step until the action value function converges.

Technical Field

The present invention relates to the field of machine learning, and more particularly, to a method and an apparatus for optimizing data pre-reading.

Background

High-performance software such as CFD (computational fluid dynamics) software generally places heavy demands on memory bandwidth during actual runs, because its computations contain a large amount of sparse matrix arithmetic. Run-time performance is therefore limited not only by processor performance but also by data access speed. Most CFD software makes extensive use of sparse matrix data structures, which inevitably produce a large amount of non-contiguous data access and place extremely high demands on memory access. Over the last decade, the application and development of technologies such as hierarchical cache subsystems, non-uniform memory access, simultaneous multithreading and out-of-order execution have improved the performance and computing capacity of modern processors several-fold, yet the data access speed between the CPU and memory has grown slowly. As a result, memory read bandwidth has gradually become a major bottleneck for part of today's high-performance application software.

For optimizing data reads from memory there are currently two main approaches: (1) make full use of the cache and keep most of the accessed data resident there; (2) use pre-reading together with the full memory bandwidth to effectively hide the latency of memory accesses. Because of process and die-size limits, the cache in mainstream processors has limited capacity, and for most applications not all working data can fit in it; moreover, as the problem sizes handled by application software grow year by year along with processor performance, the data volume grows roughly exponentially while processor cache capacity grows slowly, so approach (1) is difficult to implement. Approach (2) is mainly realized by software pre-reading. Data pre-reading is an enhancement of the micro-architecture that makes effective use of memory read/write bandwidth to place data into the processor cache before it is used; combined with software optimization, it can significantly improve the performance of most applications, so approach (2) is the main method at present.

In the past era of clock-frequency scaling, application software obtained higher performance on a new platform without any code changes, but when software is optimized with pre-reading there is no uniform procedure; the result depends on the pre-reading optimization scheme and its implementation. When optimizing software with pre-reading, the following parameters must be determined: (1) the pre-read scheduling distance (PSD). The PSD is measured in instructions; when the PSD is large enough, memory access latency is hidden by the execution time of other instructions, but when the PSD is too large it causes cache pollution, and the data to be read is evicted because other instructions' data occupy the cache. Moreover, the optimal PSD of a given pre-read instruction changes across processor platforms and data sets. (2) The pre-read scheduling position (PSP), i.e. where the pre-read instruction is placed. A pre-read instruction does not by itself contribute to the computation, and using pre-read instructions too frequently degrades application performance. Furthermore, pre-read instructions must be interleaved with compute instructions, and their placement also has a large effect on the optimization result. It follows that the result of pre-reading optimization depends strongly on the platform and on the data processed by the application software; the optimal PSD and PSP cannot be obtained from a theoretical formula and must be found by testing, so the workload of optimizing software with the pre-reading method is very large.
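To make the PSD and PSP concrete, the following minimal C sketch (not taken from the patent; the function, names and distance value are illustrative assumptions) shows a CSR sparse matrix-vector product with a software prefetch inserted by hand using the GCC/Clang __builtin_prefetch intrinsic. The role of the PSD is played by the PREFETCH_DIST constant (counted here in loop iterations rather than instructions), and the PSP is the place inside the loop body where the prefetch statement is interleaved with the compute statements.

    #include <stddef.h>

    #define PREFETCH_DIST 16  /* PSD analogue: how far ahead to fetch (assumed value, found by testing) */

    /* y = A * x for a CSR matrix: the irregular accesses x[col_idx[k]] are the
     * ones that miss in cache and benefit from an explicit prefetch. */
    void spmv_csr(size_t n, const size_t *row_ptr, const size_t *col_idx,
                  const double *val, const double *x, double *y)
    {
        for (size_t i = 0; i < n; ++i) {
            double sum = 0.0;
            for (size_t k = row_ptr[i]; k < row_ptr[i + 1]; ++k) {
                /* PSP analogue: the prefetch is placed here, interleaved with the
                 * computation; it requests the x[] entry needed PREFETCH_DIST
                 * iterations later (read access, low temporal locality). */
                if (k + PREFETCH_DIST < row_ptr[i + 1])
                    __builtin_prefetch(&x[col_idx[k + PREFETCH_DIST]], 0, 1);
                sum += val[k] * x[col_idx[k]];
            }
            y[i] = sum;
        }
    }

As noted above, the best value of PREFETCH_DIST and the best placement of the prefetch depend on the processor and the data, which is exactly the search space the rest of this document automates.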

For the problems in the prior art that PSD and PSP optimization requires a large workload and is difficult to implement, no effective solution is currently available.

Disclosure of Invention

In view of this, an object of the embodiments of the present invention is to provide a data pre-reading optimization method and apparatus that can obtain the optimal pre-read optimization values of the PSD and PSP, so as to improve the running speed of the CFD solver.

In view of the above object, a first aspect of the embodiments of the present invention provides a data pre-reading optimization method, including the following steps:

collecting running characteristic information of each function in a CFD program while the CFD program runs, analyzing the running characteristic information to determine the functions to be optimized, and writing them to an analysis log;

for each function to be optimized in the analysis log, constructing an action value function that takes the pre-read scheduling distance and pre-read scheduling position as the state and takes changes to the optimization result as the actions;

for each action value function, performing iterative training with a reinforcement learning algorithm, using the single-step speed of the CFD solver as the reward, until the action value function converges;

determining the optimal pre-read scheduling distance and pre-read scheduling position from the corresponding state, action and converged action value function, so as to perform data pre-reading in the cache.

In some embodiments, the running characteristic information of each function includes the function running time and the function cache hit rate; analyzing the running characteristic information to determine the functions to be optimized comprises: determining a function whose running time is higher than a preset value and/or whose cache hit rate is lower than a preset value as a function to be optimized; and writing to the analysis log comprises: writing each function to be optimized and its associated variables into the analysis log.

In some embodiments, the method further comprises: when optimizing a plurality of functions to be optimized, ordering them by function running time, optimizing the functions with higher running time until their running time is no longer higher than the preset value, and then optimizing the functions with lower running time.

In some embodiments, for each function to be optimized in the analysis log, constructing an action value function with its pre-read scheduling distance and pre-read scheduling position as the state and changes to its optimization result as the actions comprises:

forming an optimization result vector, used as the state, from the set of value pairs of all possible pre-read scheduling distances and pre-read scheduling positions of the function to be optimized;

and forming the changes to the optimization result, used as the actions, from the set of change actions on all possible pre-read scheduling distances and pre-read scheduling positions of the function to be optimized.

In some embodiments, for each action value function, performing iterative training with a reinforcement learning algorithm, using the single-step speed of the CFD solver as the reward, until the action value function converges comprises:

initializing the action value function and setting its current state according to the optimization result vector;

selecting an action from the set of change actions and executing it on the current state to obtain a reward and a next state, updating the action value function according to the reward and the next state, and overwriting the current state with the next state;

and repeating the previous step until the action value function converges.

In some embodiments, updating the action value function according to the reward and the next state comprises:

Q(s, a) = (1 − β)·Q(s, a) + β[R + γ·max_a′ Q(s′, a′)]

where s is the current state, a is the action, Q(s, a) is the action value function, β is the learning rate, R is the reward, γ is the discount factor, s′ is the next state, and max_a′ Q(s′, a′) is the maximum action value over the actions available in the next state.

In some embodiments, setting the current state of the action value function according to the optimization result vector comprises: setting the pre-read scheduling distance to zero, and setting the pre-read scheduling position to the position where the cache hit rate of the function to be optimized is lowest.

A second aspect of the embodiments of the present invention provides a data pre-reading optimization apparatus, including:

a processor; and

a memory storing program code executable by the processor, the program code, when executed, performing the following steps in sequence:

collecting running characteristic information of each function in a CFD program while the CFD program runs, analyzing the running characteristic information to determine the functions to be optimized, and writing them to an analysis log;

for each function to be optimized in the analysis log, constructing an action value function that takes the pre-read scheduling distance and pre-read scheduling position as the state and takes changes to the optimization result as the actions;

for each action value function, performing iterative training with a reinforcement learning algorithm, using the single-step speed of the CFD solver as the reward, until the action value function converges;

determining the optimal pre-read scheduling distance and pre-read scheduling position from the corresponding state, action and converged action value function, so as to perform data pre-reading in the cache.

In some embodiments, for each function to be optimized in the analysis log, constructing an action value function with its pre-read scheduling distance and pre-read scheduling position as the state and changes to its optimization result as the actions comprises:

forming an optimization result vector, used as the state, from the set of value pairs of all possible pre-read scheduling distances and pre-read scheduling positions of the function to be optimized;

and forming the changes to the optimization result, used as the actions, from the set of change actions on all possible pre-read scheduling distances and pre-read scheduling positions of the function to be optimized.

In some embodiments, for each action value function, performing iterative training with a reinforcement learning algorithm, using the single-step speed of the CFD solver as the reward, until the action value function converges comprises:

initializing the action value function and setting its current state according to the optimization result vector;

selecting an action from the set of change actions and executing it on the current state to obtain a reward and a next state, updating the action value function according to the reward and the next state, and overwriting the current state with the next state, wherein the action value function is updated as Q(s, a) = (1 − β)·Q(s, a) + β[R + γ·max_a′ Q(s′, a′)], where s is the current state, a is the action, Q(s, a) is the action value function, β is the learning rate, R is the reward, γ is the discount factor, s′ is the next state, and max_a′ Q(s′, a′) is the maximum action value over the actions available in the next state;

and repeating the previous step until the action value function converges.

The invention has the following beneficial technical effects: the data pre-reading optimization method and device provided by the embodiments of the invention collect the running characteristic information of each function in a CFD program while the program runs, and analyze that information to determine the functions to be optimized and write them to an analysis log; for each function to be optimized in the analysis log, an action value function is constructed with the pre-read scheduling distance and pre-read scheduling position as the state and changes to the optimization result as the actions; each action value function is iteratively trained with a reinforcement learning algorithm, using the single-step speed of the CFD solver as the reward, until it converges; and the optimal pre-read scheduling distance and pre-read scheduling position are determined from the corresponding state, action and converged action value function so as to perform data pre-reading in the cache. This technical scheme can obtain the optimal pre-read optimization values of the PSD and the PSP, thereby improving the running speed of the CFD solver.

Drawings

In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art are briefly described below. It is obvious that the drawings in the following description are only some embodiments of the present invention, and those skilled in the art can obtain other drawings from them without creative effort.

FIG. 1 is a schematic flow chart of an optimization method for data pre-reading according to the present invention;

FIG. 2 is a schematic diagram of an iteration cycle of an action cost function of a Q-Learning algorithm in the optimization method for data pre-reading provided by the present invention;

FIG. 3 is a schematic diagram of an iteration flow of an action cost function of a Q-Learning algorithm in the optimization method for data pre-reading provided by the present invention.

Detailed Description

In order to make the objects, technical solutions and advantages of the present invention more apparent, the following embodiments of the present invention are described in further detail with reference to the accompanying drawings.

It should be noted that all expressions using "first" and "second" in the embodiments of the present invention are used to distinguish two non-identical entities or non-identical parameters that share the same name; "first" and "second" are merely for convenience of description and should not be construed as limiting the embodiments of the present invention, and this will not be repeated in the following embodiments.

In view of the above, a first aspect of the embodiments of the present invention provides an embodiment of an optimization method capable of obtaining the optimal pre-read optimization values of the PSD and PSP. Fig. 1 is a schematic flow chart of the data pre-reading optimization method provided by the present invention.

As shown in Fig. 1, the data pre-reading optimization method includes the following steps:

step S101: collecting running characteristic information of each function in a CFD program while the CFD program runs, analyzing the running characteristic information to determine the functions to be optimized, and writing them to an analysis log;

step S103: for each function to be optimized in the analysis log, constructing an action value function that takes the pre-read scheduling distance and pre-read scheduling position as the state and takes changes to the optimization result as the actions;

step S105: for each action value function, performing iterative training with a reinforcement learning algorithm, using the single-step speed of the CFD solver as the reward, until the action value function converges;

step S107: determining the optimal pre-read scheduling distance and pre-read scheduling position from the corresponding state, action and converged action value function, so as to perform data pre-reading in the cache.

The embodiments of the invention apply reinforcement learning to the memory-read optimization problem of high-performance software such as CFD, folding the optimization testing process and the tuning of parameters such as the PSD and PSP into the training process of a reinforcement learning model. A data-driven CFD software optimization system is designed so that a given piece of CFD software can be optimized with the reinforcement method, obtaining optimal running performance for specific processing data and a specific hardware platform, and thereby effectively improving the running speed of high-performance software such as CFD.

It will be understood by those skilled in the art that all or part of the processes of the methods in the embodiments described above can be implemented by a computer program instructing relevant hardware; the program can be stored in a computer-readable storage medium and, when executed, can include the processes of the method embodiments described above. The storage medium may be a magnetic disk, an optical disk, a read-only memory (ROM), a random access memory (RAM), or the like. Embodiments of the computer program may achieve the same or similar effects as any of the corresponding method embodiments described above.

In some embodiments, the running characteristic information of each function includes the function running time and the function cache hit rate; analyzing the running characteristic information to determine the functions to be optimized comprises: determining a function whose running time is higher than a preset value and/or whose cache hit rate is lower than a preset value as a function to be optimized; and writing to the analysis log comprises: writing each function to be optimized and its associated variables into the analysis log.

In some embodiments, the method further comprises: when optimizing a plurality of functions to be optimized, ordering them by function running time, optimizing the functions with higher running time until their running time is no longer higher than the preset value, and then optimizing the functions with lower running time.
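As an illustration only (the func_profile record and its fields are assumptions, not part of the patent), this candidate ordering could look like the following C sketch, which sorts the profiled functions in descending order of running time so that the most expensive ones are optimized first.

    #include <stdlib.h>

    typedef struct {
        const char *name;      /* function name */
        double      run_time;  /* accumulated running time in seconds */
        double      hit_rate;  /* cache hit rate in [0, 1] */
    } func_profile;

    /* qsort comparator: larger run_time sorts first (descending order). */
    static int by_run_time_desc(const void *a, const void *b)
    {
        double ta = ((const func_profile *)a)->run_time;
        double tb = ((const func_profile *)b)->run_time;
        return (ta < tb) - (ta > tb);
    }

    void sort_candidates(func_profile *p, size_t n)
    {
        qsort(p, n, sizeof p[0], by_run_time_desc);
    }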

In some embodiments, for each function to be optimized in the analysis log, constructing an action value function with its pre-read scheduling distance and pre-read scheduling position as the state and changes to its optimization result as the actions comprises:

forming an optimization result vector, used as the state, from the set of value pairs of all possible pre-read scheduling distances and pre-read scheduling positions of the function to be optimized;

and forming the changes to the optimization result, used as the actions, from the set of change actions on all possible pre-read scheduling distances and pre-read scheduling positions of the function to be optimized.

In some embodiments, for each action value function, performing iterative training with a reinforcement learning algorithm, using the single-step speed of the CFD solver as the reward, until the action value function converges comprises:

initializing the action value function and setting its current state according to the optimization result vector;

selecting an action from the set of change actions and executing it on the current state to obtain a reward and a next state, updating the action value function according to the reward and the next state, and overwriting the current state with the next state;

and repeating the previous step until the action value function converges.

In some embodiments, updating the action value function according to the reward and the next state comprises:

Q(s, a) = (1 − β)·Q(s, a) + β[R + γ·max_a′ Q(s′, a′)]

where s is the current state, a is the action, Q(s, a) is the action value function, β is the learning rate, R is the reward, γ is the discount factor, s′ is the next state, and max_a′ Q(s′, a′) is the maximum action value over the actions available in the next state.
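A minimal C rendering of this update rule is given below; it is only a sketch, with the maximum over next-state actions computed by the caller and passed in as max_next_q.

    /* One tabular Q-Learning update:
     *   Q(s,a) <- (1 - beta) * Q(s,a) + beta * (R + gamma * max_a' Q(s',a'))  */
    double q_update(double q_sa,       /* current Q(s, a) */
                    double reward,     /* R, e.g. 1 / (single-step run time) */
                    double max_next_q, /* max over actions of Q(s', a') */
                    double beta,       /* learning rate */
                    double gamma)      /* discount factor */
    {
        return (1.0 - beta) * q_sa + beta * (reward + gamma * max_next_q);
    }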

In some embodiments, setting the current state of the action value function according to the optimization result vector comprises: setting the pre-read scheduling distance to zero, and setting the pre-read scheduling position to the position where the cache hit rate of the function to be optimized is lowest.

The following further illustrates embodiments of the invention in terms of specific examples.

The hot spot function analysis module is responsible for identifying and monitoring the hot spot functions during the run of the application software together with micro-architecture characteristics such as the cache hit rate while those functions run, analyzing the running time and running characteristics of the functions in the program, and collecting indicators such as the cache misses at different levels generated by the statements within each function. Based on the performance analysis results of the monitored functions, it determines the functions to be optimized and the corresponding variables to be pre-read, for example taking the more time-consuming functions and the computation variables on which cache misses occur as optimization objects, and outputs and records an analysis log file.
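The following sketch shows one possible shape of the selection step just described; the profile record, thresholds and log line format are assumptions for illustration and are not prescribed by the patent. A function is logged as an optimization candidate when its running time exceeds a preset value and/or its cache hit rate falls below a preset value.

    #include <stdio.h>

    typedef struct {
        const char *name;      /* function name */
        double      run_time;  /* accumulated running time in seconds */
        double      hit_rate;  /* cache hit rate in [0, 1] */
    } func_profile;

    /* Append each candidate function to the analysis log together with the
     * profile values that triggered its selection. */
    void write_analysis_log(FILE *log, const func_profile *p, size_t n,
                            double time_threshold, double hit_threshold)
    {
        for (size_t i = 0; i < n; ++i) {
            if (p[i].run_time > time_threshold || p[i].hit_rate < hit_threshold)
                fprintf(log, "%s run_time=%.3fs hit_rate=%.2f\n",
                        p[i].name, p[i].run_time, p[i].hit_rate);
        }
    }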

The pre-reading optimization training module takes the performance analysis log of the software as input and uses a reinforcement learning method to train the optimal values of the pre-read statement position PSP and the corresponding PSD in each function to be optimized. The PSPs and PSDs of the variables that need pre-reading form a vector that is used as the state, and changing these values is used as the action, so as to construct the action value function; the single-step running speed of the CFD solver on the cluster is used as the reward, and the action value function is iteratively updated with the reinforcement learning algorithm according to the reward until it converges, so that pre-reading optimization of the function is realized from the state, the action and the converged function value. The reinforcement learning algorithm used is Q-Learning; the state set consists of all possible PSP and PSD values of the optimization variables, and the saved optimization result vector p_s is used as the state, where the elements of the vector are

p_s = [PSP_1, PSD_1, PSP_2, PSD_2, …, PSP_N, PSD_N]

Here the subscript i denotes the index of the variable in the function being optimized, and PSP_i is the line number at which the pre-read statement is inserted, counted from the start of the function body. When a single function is optimized, the PSP value does not take already-inserted pre-read statements into account; when several pre-read statements are inserted at the same position, they are inserted in order of variable index. In addition, the functions are optimized in descending order of their share of the total running time, and functions whose share falls below a user-specified threshold are not optimized. The reinforcement learning training process used when optimizing each function is shown in Fig. 3.
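A possible in-memory layout for the optimization result vector p_s and for the initial state S_0 (whose rules are given later in this section) is sketched below; the number of variables, the struct names and the worst_miss_line input are assumptions used only for illustration.

    #define N_VARS 4   /* number of variables to be pre-read in the function (assumed) */

    typedef struct {
        int psp;  /* PSP_i: line number of the pre-read statement, from the start of the function body */
        int psd;  /* PSD_i: pre-read scheduling distance in instructions */
    } prefetch_param;

    /* p_s = [PSP_1, PSD_1, PSP_2, PSD_2, ..., PSP_N, PSD_N] */
    typedef struct {
        prefetch_param v[N_VARS];
    } state;

    /* Initial state S_0: every PSD starts at 0 and every PSP starts at the
     * statement with the highest cache miss count, here supplied by the
     * profiling step through worst_miss_line[] (an assumed input). */
    void init_state(state *s, const int worst_miss_line[N_VARS])
    {
        for (int i = 0; i < N_VARS; ++i) {
            s->v[i].psd = 0;
            s->v[i].psp = worst_miss_line[i];
        }
    }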

Changes to the PSDs and PSPs of all variables form the action set A. When the Q-Learning algorithm is used to iteratively update the action value function according to the reward, the steps are as follows:

1) initializing the action value function, and setting its current state; selecting an action from the action set according to the current state and the policy;

2) executing the action to obtain a reward and a next state;

3) updating the action value function according to the reward and the next state;

4) taking the next state as the current state, and repeating the iteration until the terminal state of the state set is reached.

When the action value function is updated using the reward and the next state, the update formula is:

Q(s, a) = (1 − β)·Q(s, a) + β[R + γ·max_a′ Q(s′, a′)]

s = s′

where s is the current state, a is the action, Q(s, a) is the action value function, i.e. the convergence function value obtained when action a is executed in the current state s, β is the learning rate, R is the reward, γ is the discount factor, s′ is the next state, and max_a′ Q(s′, a′) is the maximum action value over the actions available in the next state.

The initial state S_0 is generated as follows:

1) the PSD values of all variable pre-read statements are set to 0;

2) the PSP values of all variable pre-read statements are set to the position of the statement with the highest cache miss count among all the variable statements.

The running speed of the optimized CFD software on the cluster is used as the reward R, computed from the average single-step running time ΔT taken by the software to process the specific data:

R = 1/ΔT

The action value function is iteratively updated with the reinforcement learning algorithm according to the reward to obtain the converged function values. After an action is performed, the single-step speed of the CFD software can be collected as the reward R obtained from the environment. Actions are executed according to the current state and the policy while the reward and the next state are observed, and the state and the function value are then updated from the reward and the maximum action value of the next state. To ensure that the optimization system can explore all possible actions, an ε-greedy strategy can be used for the search.
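Putting the pieces together, the following is a hedged sketch (not the patent's implementation) of one training episode: ε-greedy selection over a discrete action set, a reward R = 1/ΔT measured from a timed solver step, and the tabular update quoted above. run_solver_step() and apply_action() are hypothetical hooks standing in for the actual CFD run and for the modification of the PSP/PSD values.

    #include <stdlib.h>

    #define N_STATES  256   /* size of the discretized state space (assumed) */
    #define N_ACTIONS 8     /* number of PSP/PSD change actions (assumed) */

    static double Q[N_STATES][N_ACTIONS];               /* tabular action value function */

    extern double run_solver_step(int state);            /* assumed hook: returns single-step time ΔT */
    extern int    apply_action(int state, int action);   /* assumed hook: changes PSP/PSD, returns next state */

    static int argmax_row(const double *row, int n)
    {
        int best = 0;
        for (int a = 1; a < n; ++a)
            if (row[a] > row[best]) best = a;
        return best;
    }

    void train_episode(int s, int steps, double beta, double gamma, double epsilon)
    {
        for (int t = 0; t < steps; ++t) {
            /* epsilon-greedy: with probability epsilon explore a random action,
             * otherwise exploit the currently best-valued action. */
            int a = ((double)rand() / RAND_MAX < epsilon)
                        ? rand() % N_ACTIONS
                        : argmax_row(Q[s], N_ACTIONS);

            int    s_next = apply_action(s, a);       /* modify PSP/PSD values */
            double dt     = run_solver_step(s_next);  /* measure single-step run time */
            double R      = 1.0 / dt;                 /* reward as defined above */

            double max_next = Q[s_next][argmax_row(Q[s_next], N_ACTIONS)];
            Q[s][a] = (1.0 - beta) * Q[s][a] + beta * (R + gamma * max_next);

            s = s_next;                               /* overwrite current state with next state */
        }
    }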

The iteration cycle of the action value function in the Q-Learning algorithm is shown in Fig. 2. The environment consists of the CFD solver and its actual running speed; after each training step the single-step running efficiency of the solver is returned as the reward, and the PSP and PSD values of the pre-read statements are modified according to the state, the action and the converged function values. Each state-action pair corresponds to one function value. In this way, the optimal pre-reading optimization result can be obtained for the running characteristics of the CFD solver and the hardware characteristics of the platform in use, thereby improving the running speed of the CFD solver.

It can be seen from the foregoing embodiments that the data pre-reading optimization method provided by the embodiments of the present invention collects the running characteristic information of each function in a CFD program while the program runs, and analyzes that information to determine the functions to be optimized and write them to an analysis log; for each function to be optimized in the analysis log, an action value function is constructed with the pre-read scheduling distance and pre-read scheduling position as the state and changes to the optimization result as the actions; each action value function is iteratively trained with a reinforcement learning algorithm, using the single-step speed of the CFD solver as the reward, until it converges; and the optimal pre-read scheduling distance and pre-read scheduling position are determined from the corresponding state, action and converged action value function so as to perform data pre-reading in the cache. This can obtain the optimal pre-read optimization values of the PSD and the PSP, thereby improving the running speed of the CFD solver.

It should be particularly noted that the steps in the embodiments of the data pre-reading optimization method described above can be interchanged, replaced, added or deleted with respect to one another, so that data pre-reading optimization methods obtained by such reasonable permutations and combinations also belong to the scope of the present invention, and the scope of the present invention should not be limited to the described embodiments.

In view of the above objects, a second aspect of the embodiments of the present invention provides an embodiment of an optimization apparatus capable of obtaining the optimal pre-read optimization values of the PSD and PSP. The data pre-reading optimization apparatus comprises:

a processor; and

a memory storing program code executable by the processor, the program code, when executed, performing the following steps in sequence:

collecting running characteristic information of each function in a CFD program while the CFD program runs, analyzing the running characteristic information to determine the functions to be optimized, and writing them to an analysis log;

for each function to be optimized in the analysis log, constructing an action value function that takes the pre-read scheduling distance and pre-read scheduling position as the state and takes changes to the optimization result as the actions;

for each action value function, performing iterative training with a reinforcement learning algorithm, using the single-step speed of the CFD solver as the reward, until the action value function converges;

determining the optimal pre-read scheduling distance and pre-read scheduling position from the corresponding state, action and converged action value function, so as to perform data pre-reading in the cache.

In some embodiments, for each function to be optimized in the analysis log, constructing an action value function with its pre-read scheduling distance and pre-read scheduling position as the state and changes to its optimization result as the actions comprises:

forming an optimization result vector, used as the state, from the set of value pairs of all possible pre-read scheduling distances and pre-read scheduling positions of the function to be optimized;

and forming the changes to the optimization result, used as the actions, from the set of change actions on all possible pre-read scheduling distances and pre-read scheduling positions of the function to be optimized.

In some embodiments, for each action value function, performing iterative training with a reinforcement learning algorithm, using the single-step speed of the CFD solver as the reward, until the action value function converges comprises:

initializing the action value function and setting its current state according to the optimization result vector;

selecting an action from the set of change actions and executing it on the current state to obtain a reward and a next state, updating the action value function according to the reward and the next state, and overwriting the current state with the next state, wherein the action value function is updated as Q(s, a) = (1 − β)·Q(s, a) + β[R + γ·max_a′ Q(s′, a′)], where s is the current state, a is the action, Q(s, a) is the action value function, β is the learning rate, R is the reward, γ is the discount factor, s′ is the next state, and max_a′ Q(s′, a′) is the maximum action value over the actions available in the next state;

and repeating the previous step until the action value function converges.

As can be seen from the foregoing embodiments, the data pre-reading optimization apparatus provided by the embodiments of the present invention collects the running characteristic information of each function in a CFD program while the program runs, and analyzes that information to determine the functions to be optimized and write them to an analysis log; for each function to be optimized in the analysis log, an action value function is constructed with the pre-read scheduling distance and pre-read scheduling position as the state and changes to the optimization result as the actions; each action value function is iteratively trained with a reinforcement learning algorithm, using the single-step speed of the CFD solver as the reward, until it converges; and the optimal pre-read scheduling distance and pre-read scheduling position are determined from the corresponding state, action and converged action value function so as to perform data pre-reading in the cache. This can obtain the optimal pre-read optimization values of the PSD and the PSP, thereby improving the running speed of the CFD solver.

It should be particularly noted that the embodiment of the data pre-reading optimization apparatus described above uses the embodiments of the data pre-reading optimization method to describe the working process of each module, and those skilled in the art can readily apply these modules to other embodiments of the method. Of course, since the steps in the embodiments of the data pre-reading optimization method can be interchanged, replaced, added or deleted with respect to one another, optimization apparatuses obtained by such reasonable permutations and combinations also belong to the scope of the present invention, and the scope of the present invention should not be limited to the described embodiments.

The foregoing is an exemplary embodiment of the present disclosure, but it should be noted that various changes and modifications could be made herein without departing from the scope of the present disclosure as defined by the appended claims. The functions, steps and/or actions of the method claims in accordance with the disclosed embodiments described herein need not be performed in any particular order. Furthermore, although elements of the disclosed embodiments of the invention may be described or claimed in the singular, the plural is contemplated unless limitation to the singular is explicitly stated.

Those of ordinary skill in the art will understand that the discussion of any embodiment above is meant to be exemplary only, and is not intended to imply that the scope of the disclosure of the embodiments of the invention, including the claims, is limited to these examples. Within the idea of the embodiments of the invention, technical features in the above embodiments or in different embodiments may also be combined, and there are many other variations of the different aspects of the embodiments of the invention as described above, which are not provided in detail for the sake of brevity. Therefore, any omissions, modifications, substitutions, improvements, and the like that may be made without departing from the spirit and principles of the embodiments of the present invention are intended to be included within the scope of the embodiments of the present invention.
