High-energy-efficiency task scheduling algorithm based on Markov decision process

Document No.: 1504224. Publication date: 2020-02-07.

Abstract: This technology, "High-energy-efficiency task scheduling algorithm based on Markov decision process", was designed by 龙浩 (Long Hao) and 霍娜 (Huo Na) on 2019-08-30. It is an energy-efficient task scheduling algorithm based on a Markov decision process (MDP). A multi-task mobile crowd sensing system is constructed on a two-layer server-client architecture. The reward value $\Lambda(S_t, \mathrm{Task}_t)$ of the smartphone's current state is obtained from a formula. From the current state $S_t$, the next state $S_{t+1}$ is computed from a formula, and the reward value of that next state is then obtained. The reward values of the smartphone in each state are obtained in turn and assembled into the reward matrix $R_t$. The prediction probability matrix $P_t$, from the smartphone state $S_t$ at any time $t$ to the state $S_{t+1}$ at the next time, is computed from a formula. The minimum sensor energy consumption of task scheduling is computed from a formula. An MDP formulation is constructed and iterated to obtain the sensor energy consumption at the next time, and the optimal task scheduling time sequence is obtained. The algorithm balances maximizing sensing accuracy against minimizing energy cost, and helps the smartphone reduce its power consumption during sensing.

1. An energy-efficient task scheduling algorithm based on a Markov decision process is characterized by comprising the following steps:

S1: constructing a system model;

a multi-task mobile crowd sensing system is constructed on a two-layer server-client architecture, wherein the client is a smartphone, and a sensing task $\mathrm{Task}_t$ of the sensing system is as shown in equation (1);

$\mathrm{Task}_t = (j, J_j, S_t, Q_t)$ (1);

in the formula, $j$ represents the number of the task;

$J_j$ represents the set of sensor numbers in multi-sensor task $j$;

$S_t$ represents the current state of the smartphone at time $t$, $S_t = (E_t, L_t, R_t)$, where $E_t \in \{0, 1, 2, \ldots, N\}$ represents the remaining power of the smartphone at time $t$; $L_t \in \{0, 1\}$ represents the sensing state of the smartphone at time $t$, where 0 represents running a low-power application and 1 represents the system running a critical, high-power application; and $R_t \in \{0, 1\}$ represents the charging state of the smartphone at time $t$, where 0 represents that no charging power supply is connected and 1 represents that a charging power supply is connected;

$Q_t$ represents the quality of the sensing data at time $t$;

S2: obtaining the reward value $\Lambda(S_t, \mathrm{Task}_t)$ of the current state of the smartphone according to formula (2); computing the next state $S_{t+1}$ from the current state $S_t$ according to formula (3), and then obtaining the reward value of the next state of the smartphone; obtaining the reward values of the smartphone in each state in turn, and assembling them into the reward matrix $R_t$;

[Formula (2): image FDA0002186370360000011 in the original; not reproduced in this text]

in the formula, $\lambda_t$ represents the platform reward for the sensing task;

$p_h$ is the probability that the smartphone executes a heavy load in the next time interval, $0.5 < p_h < 1$;

$p_l$ is the probability that the smartphone executes a normal load in the next time interval, $0.5 < p_l < 1$;

$q_r$ is the probability that the smartphone is recharged in the next time interval, $0.5 < q_r < 1$;

$S_{t+1} = (E_{t+1}, L_{t+1}, R_{t+1})$ (3);

in the formula, $E_{t+1} = E_t + d_t - e_t$, where $d_t$ is the amount of recharging, $d_t = R_t q_r + (1 - R_t)(1 - q_r)$, and $e_t$ is the total amount of power consumed, $e_t = e_1 + e_2 + e_3$, where $e_1$ is the minimum power required to keep the smartphone active, $e_2$ is the power consumed by the sensing process, and $e_3$ is the power consumed by running the smartphone's system;

$L_{t+1} = L_t p_h + (1 - L_t)(1 - p_l)$;

S3: computing, according to formula (4), the prediction probability matrix $P_t$ for the transition from the smartphone state $S_t$ at any time $t$ to the smartphone state $S_{t+1}$ at the next time;

[Formula (4): image FDA0002186370360000021 in the original; not reproduced in this text]

S4: computing, by formula (5), the minimum total sensor energy consumption $V_t(S)$ at time $t$ under which all tasks reach the sensing quality required by the platform;

[Formula (5): image FDA0002186370360000022 in the original; not reproduced in this text]

in the formula, $V^{\pi}(y) = V_t(S)$;

$w_j$ is the energy consumed to read data from sensor $j$;

$n$ represents the total number of time points;

$m$ represents the total number of sensors;

$T' = \{t'_{ij}\}$ represents the scheduled sequence of a set of tasks at the time points;

$\Gamma_C(t_{ij}, t'_{ij})$ is the accuracy of the sensing action, computed according to the formula

[Formula for $\Gamma_C$: image FDA0002186370360000023 in the original; not reproduced in this text]

S5: based on the reward matrix $R_t$ and the prediction probability matrix $P_t$, constructing an MDP formulation through a value iteration function; for any stationary policy $\pi = (\pi_0, \pi_1, \ldots)$, the state-value function satisfies the Bellman equation at every state $x \in X$; iterating the value iteration function in formula (6) yields the sensor energy consumption $V_{t+1}(S)$ at time $t+1$;

$V^{\pi}(x) = R(x, \pi(x)) + \gamma \sum_{y} P(y \mid x, \pi(x)) V^{\pi}(y)$ (6);

in the formula, $V^{\pi}(x) = V_{t+1}(S)$;

$\gamma$ is the discount coefficient;

S6: obtaining the optimal task scheduling sequence according to formula (7);

$\max_{S} |V_{t+1}(S) - V_t(S)| < \epsilon$ (7);

wherein $\epsilon$ is the convergence threshold;

and $S$ is the set of states of the smartphone at all times.

Technical Field

The invention relates to an energy-efficient task scheduling algorithm based on a Markov decision process.

Background

Smartphones have become an indispensable part of daily life. They are equipped with various embedded sensors, including microphones, cameras, GPS, accelerometers, gyroscopes, and WiFi/3G/4G interfaces. A smartphone with embedded sensors can provide applications and sensing services in many fields, such as air monitoring, social networking, medical care, transportation, and safety. In a crowd sensing system, a task publisher distributes sensing tasks to participants through a sensing platform by way of bidding. After receiving a sensing task, a participant collects sensing data from one or more sensors on the smartphone and sends the data to the sensing platform. Such a system is a multi-task system supporting multiple sensing applications: on the one hand, sensing tasks are assigned to many smartphones to collect data; on the other hand, each smartphone undertakes many different sensing tasks generated by multiple applications.

Collecting data with smartphone sensors consumes considerable energy; if sensor activity is not managed in an energy-saving way, the smartphone's limited battery is exhausted in a short time. To collect data, the smartphone must actively perform scans to obtain the state of some sensors (e.g., the WiFi interface), and must create a thread to take readings from others (e.g., the accelerometer). In addition, some sensors (such as GPS) draw power continuously and, if not properly controlled, quickly drain the battery, which discourages more users from participating.

Disclosure of Invention

To address these problems in the prior art, the invention provides an energy-efficient task scheduling algorithm based on a Markov decision process. The algorithm balances maximizing sensing accuracy against minimizing energy cost, so that tasks are scheduled for the highest sensing accuracy at the lowest energy consumption, which helps reduce the smartphone's power consumption during sensing.

To achieve the above object, the invention provides an energy-efficient task scheduling algorithm based on a Markov decision process, which comprises the following steps:

S1: constructing a system model;

a multi-task mobile crowd sensing system is constructed on a two-layer server-client architecture, wherein the client is a smartphone, and a sensing task $\mathrm{Task}_t$ of the sensing system is as shown in equation (1);

$\mathrm{Task}_t = (j, J_j, S_t, Q_t)$ (1);

in the formula, $j$ represents the number of the task;

$J_j$ represents the set of sensor numbers in multi-sensor task $j$;

$S_t$ represents the current state of the smartphone at time $t$, $S_t = (E_t, L_t, R_t)$, where $E_t \in \{0, 1, 2, \ldots, N\}$ represents the remaining power of the smartphone at time $t$; $L_t \in \{0, 1\}$ represents the sensing state of the smartphone at time $t$, where 0 represents running a low-power application and 1 represents the system running a critical, high-power application; and $R_t \in \{0, 1\}$ represents the charging state of the smartphone at time $t$, where 0 represents that no charging power supply is connected and 1 represents that a charging power supply is connected;

$Q_t$ represents the quality of the sensing data at time $t$;

S2: obtaining the reward value $\Lambda(S_t, \mathrm{Task}_t)$ of the current state of the smartphone according to formula (2); computing the next state $S_{t+1}$ from the current state $S_t$ according to formula (3), and then obtaining the reward value of the next state of the smartphone; obtaining the reward values of the smartphone in each state in turn, and assembling them into the reward matrix $R_t$;

[Formula (2): image BDA0002186370370000021 in the original; not reproduced in this text]

in the formula, $\lambda_t$ represents the platform reward for the sensing task;

$p_h$ is the probability that the smartphone executes a heavy load in the next time interval, $0.5 < p_h < 1$;

$p_l$ is the probability that the smartphone executes a normal load in the next time interval, $0.5 < p_l < 1$;

$q_r$ is the probability that the smartphone is recharged in the next time interval, $0.5 < q_r < 1$;

$S_{t+1} = (E_{t+1}, L_{t+1}, R_{t+1})$ (3);

in the formula, $E_{t+1} = E_t + d_t - e_t$, where $d_t$ is the amount of recharging, $d_t = R_t q_r + (1 - R_t)(1 - q_r)$, and $e_t$ is the total amount of power consumed, $e_t = e_1 + e_2 + e_3$, where $e_1$ is the minimum power required to keep the smartphone active, $e_2$ is the power consumed by the sensing process, and $e_3$ is the power consumed by running the smartphone's system;

$L_{t+1} = L_t p_h + (1 - L_t)(1 - p_l)$;

S3: computing, according to formula (4), the prediction probability matrix $P_t$ for the transition from the smartphone state $S_t$ at any time $t$ to the smartphone state $S_{t+1}$ at the next time;

[Formula (4): image BDA0002186370370000031 in the original; not reproduced in this text]

S4: computing, by formula (5), the minimum total sensor energy consumption $V_t(S)$ at time $t$ under which all tasks reach the sensing quality required by the platform;

[Formula (5): image BDA0002186370370000032 in the original; not reproduced in this text]

in the formula, $V^{\pi}(y) = V_t(S)$;

$w_j$ is the energy consumed to read data from sensor $j$;

$n$ represents the total number of time points;

$m$ represents the total number of sensors;

$T' = \{t'_{ij}\}$ represents the scheduled sequence of a set of tasks at the time points;

$\Gamma_C(t_{ij}, t'_{ij})$ is the accuracy of the sensing action, computed according to the formula

[Formula for $\Gamma_C$: image BDA0002186370370000033 in the original; not reproduced in this text]

wherein $\sigma$ is the reading of the sensor;

S5: based on the reward matrix $R_t$ and the prediction probability matrix $P_t$, constructing an MDP formulation through a value iteration function; for any stationary policy $\pi = (\pi_0, \pi_1, \ldots)$, the state-value function satisfies the Bellman equation at every state $x \in X$; iterating the value iteration function in formula (6) yields the sensor energy consumption $V_{t+1}(S)$ at time $t+1$;

$V^{\pi}(x) = R(x, \pi(x)) + \gamma \sum_{y} P(y \mid x, \pi(x)) V^{\pi}(y)$ (6);

in the formula, $V^{\pi}(x) = V_{t+1}(S)$;

$\gamma$ is the discount coefficient;

S6: obtaining the optimal task scheduling sequence according to formula (7);

$\max_{S} |V_{t+1}(S) - V_t(S)| < \epsilon$ (7);

wherein $\epsilon$ is the convergence threshold;

and $S$ is the set of states of the smartphone at all times.

The method defines the state of the current sensing task, including the current state of the smartphone and the sensing accuracy requirement, and also defines a reward value related to sensing energy consumption, which gives the Markov decision process a basis for computing the next state of the sensing task. By constructing a sensing accuracy model and iterating over the prediction probability matrix and the reward matrix, the method obtains an optimal task scheduling sequence that satisfies the sensing accuracy required by the system, thereby achieving the best balance between energy consumption and sensing accuracy. Compared with existing energy-saving algorithms, the algorithm has lower energy-saving cost and computational complexity, and saves more than 75% of energy on average. The invention applies the MDP to three factors of the smartphone, namely current load, remaining energy, and charging probability, to provide an effective sensing task scheduling strategy. It jointly considers the current state of the smartphone and the sensing accuracy to compute the optimal task scheduling time sequence, so that task scheduling optimally balances the energy consumption and the accuracy of sensing tasks and the sensing process saves energy effectively. The invention solves the task scheduling problem in general mobile crowd sensing systems, facilitates reasonable planning and scheduling of tasks, and can schedule different sensing tasks onto different sensors of the smartphone.

Drawings

FIG. 1 shows the smartphone energy consumption of the algorithm of the invention, the Baseline algorithm, and the Opt-MESS algorithm under the same number of tasks;

FIG. 2 shows the smartphone energy consumption of the algorithm of the invention, the Baseline algorithm, and the Opt-MESS algorithm for different task durations;

FIG. 3 shows the smartphone energy consumption of the algorithm of the invention, the Baseline algorithm, and the Opt-MESS algorithm under different task sensing accuracies;

FIG. 4 shows the smartphone energy consumption of the algorithm of the invention, the Baseline algorithm, and the Opt-MESS algorithm for different task sensing periods.

Detailed Description

An energy-efficient task scheduling algorithm based on a Markov decision process specifically comprises the following steps:

S1: constructing a system model;

A multi-task mobile crowd sensing system is constructed on a server-client architecture. The client is connected to the server over a network; it periodically detects and aggregates the required data and sends the aggregated data to the server. The server is a crowd sensing server and the client is a smartphone. Because the smartphone operating system uses time-division multiplexing, its running load is treated as the same within one interval; the computation therefore uses a discrete-time model with an interval of unit length as the basis of calculation.

A sensing task $\mathrm{Task}_t$ of the sensing system is as shown in equation (1);

$\mathrm{Task}_t = (j, J_j, S_t, Q_t)$ (1);

in the formula, $j$ represents the number of the task;

$J_j$ represents the set of sensor numbers in multi-sensor task $j$;

$S_t$ represents the current state of the smartphone at time $t$, $S_t = (E_t, L_t, R_t)$, where $E_t \in \{0, 1, 2, \ldots, N\}$ represents the remaining power of the smartphone at time $t$; $L_t \in \{0, 1\}$ represents the sensing state of the smartphone at time $t$, where 0 represents running a low-power application and 1 represents the system running a critical, high-power application; and $R_t \in \{0, 1\}$ represents the charging state of the smartphone at time $t$, where 0 represents that no charging power supply is connected and 1 represents that a charging power supply is connected;

$Q_t$ represents the quality of the sensing data at time $t$;
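As an illustration only, the task tuple of equation (1) and the state triple $S_t$ can be written down as plain data types. The class and field names below (`PhoneState`, `SensingTask`, `enumerate_states`) are hypothetical and do not come from the patent; the sketch merely shows that the state space has $4(N+1)$ elements.

```python
from dataclasses import dataclass
from typing import FrozenSet, List


@dataclass(frozen=True)
class PhoneState:
    """S_t = (E_t, L_t, R_t) from equation (1)."""
    energy: int    # E_t in {0, 1, ..., N}: remaining power level
    load: int      # L_t in {0, 1}: 0 = low-power app, 1 = high-power app
    charging: int  # R_t in {0, 1}: 0 = no charger, 1 = charger connected


@dataclass(frozen=True)
class SensingTask:
    """Task_t = (j, J_j, S_t, Q_t) from equation (1)."""
    task_id: int             # j
    sensors: FrozenSet[int]  # J_j: sensor numbers used by task j
    state: PhoneState        # S_t
    quality: float           # Q_t: required quality of the sensing data


def enumerate_states(n_levels: int) -> List[PhoneState]:
    """All (E, L, R) combinations for E in 0..n_levels: 4 * (n_levels + 1) states."""
    return [PhoneState(e, l, r)
            for e in range(n_levels + 1)
            for l in (0, 1)
            for r in (0, 1)]


# Hypothetical example values, chosen only to exercise the types.
states = enumerate_states(3)
task = SensingTask(task_id=1, sensors=frozenset({0, 2}),
                   state=states[0], quality=0.8)
```

Frozen dataclasses are used so that states are hashable and can serve as dictionary keys for reward or transition tables.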

S2: obtaining the reward value $\Lambda(S_t, \mathrm{Task}_t)$ of the current state of the smartphone according to formula (2); the smartphone has two states of interest, the current state $S_t$ and the next state $S_{t+1}$, which are related through the smartphone's current operating load; computing the next state $S_{t+1}$ from the current state $S_t$ according to formula (3), and then obtaining the reward value of the next state of the smartphone; obtaining the reward values of the smartphone in each state in turn, and assembling them into the reward matrix $R_t$;

[Formula (2): image BDA0002186370370000051 in the original; not reproduced in this text]

Formula (2) has four cases, each representing the difference between the reward the participant earns from the platform and the participant's power consumption cost.

In the formula, $\lambda_t$ represents the platform reward for the sensing task;

$p_h$ is the probability that the smartphone executes a heavy load in the next time interval, $0.5 < p_h < 1$; correspondingly, the probability of not executing a heavy load is $1 - p_h$;

$p_l$ is the probability that the smartphone executes a normal load in the next time interval, $0.5 < p_l < 1$; correspondingly, the probability of not executing a normal load is $1 - p_l$;

$q_r$ is the probability that the smartphone is recharged in the next time interval, $0.5 < q_r < 1$; correspondingly, the probability of not being recharged is $1 - q_r$;

$S_{t+1} = (E_{t+1}, L_{t+1}, R_{t+1})$ (3);

in the formula, $E_{t+1} = E_t + d_t - e_t$, where $d_t$ is the amount of recharging, $d_t = R_t q_r + (1 - R_t)(1 - q_r)$, and $e_t$ is the total amount of power consumed, $e_t = e_1 + e_2 + e_3$, where $e_1$ is the minimum power required to keep the smartphone active, $e_2$ is the power consumed by the sensing process, and $e_3$ is the power consumed by running the smartphone's system;

$L_{t+1} = L_t p_h + (1 - L_t)(1 - p_l)$;

$R_{t+1}$ represents the next state of $R_t$;
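The update in formula (3) can be checked numerically with a short sketch. The function name `next_state_terms` and the sample numbers are assumptions made for illustration; the text gives no explicit formula for $R_{t+1}$, so it is deliberately left out rather than guessed.

```python
def next_state_terms(E_t, L_t, R_t, p_h, p_l, q_r, e1, e2, e3):
    """Evaluate the E and L components of formula (3).

    Returns (E_next, L_next). R_{t+1} is not computed because the
    source text does not state a formula for it.
    """
    d_t = R_t * q_r + (1 - R_t) * (1 - q_r)      # amount of recharging
    e_t = e1 + e2 + e3                           # total power consumed
    E_next = E_t + d_t - e_t                     # E_{t+1} = E_t + d_t - e_t
    L_next = L_t * p_h + (1 - L_t) * (1 - p_l)   # L_{t+1}
    return E_next, L_next


# Hypothetical inputs: heavy load, no charger, probabilities inside (0.5, 1).
E_next, L_next = next_state_terms(E_t=10, L_t=1, R_t=0,
                                  p_h=0.8, p_l=0.7, q_r=0.6,
                                  e1=1, e2=2, e3=3)
print(E_next, L_next)
```

With these inputs, $d_t = 1 - q_r = 0.4$ and $e_t = 6$, so $E_{t+1} \approx 4.4$, and $L_{t+1} = p_h = 0.8$ because the phone is currently under heavy load. Note that the results are real-valued, so an implementation would still need to decide how to map them back onto the discrete sets $\{0, \ldots, N\}$ and $\{0, 1\}$; the text does not say how.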

S3: computing, according to formula (4), the prediction probability matrix $P_t$ for the transition from the smartphone state $S_t$ at any time $t$ to the smartphone state $S_{t+1}$ at the next time;

[Formula (4): image BDA0002186370370000061 in the original; not reproduced in this text]

S4: computing, by formula (5), the minimum total sensor energy consumption $V_t(S)$ at time $t$ under which all tasks reach the sensing quality required by the platform;

[Formula (5): image not reproduced in this text]

in the formula, $V^{\pi}(y) = V_t(S)$;

$w_j$ is the energy consumed to read data from sensor $j$;

$n$ represents the total number of time points;

$m$ represents the total number of sensors;

$T' = \{t'_{ij}\}$ represents the scheduled sequence of a set of tasks at the time points;

$\Gamma_C(t_{ij}, t'_{ij})$ is the accuracy of the sensing action, computed according to the formula

[Formula for $\Gamma_C$: image BDA0002186370370000063 in the original; not reproduced in this text]

wherein $\sigma$ is the reading of the sensor;

S5: based on the reward matrix $R_t$ and the prediction probability matrix $P_t$, constructing an MDP formulation through a value iteration function; for any stationary policy $\pi = (\pi_0, \pi_1, \ldots)$, the state-value function satisfies the Bellman equation at every state $x \in X$

[Formula: image BDA0002186370370000064 in the original; not reproduced in this text]

Iterating the value iteration function in formula (6) yields the sensor energy consumption $V_{t+1}(S)$ at time $t+1$;

$V^{\pi}(x) = R(x, \pi(x)) + \gamma \sum_{y} P(y \mid x, \pi(x)) V^{\pi}(y)$ (6);

in the formula, $V^{\pi}(x) = V_{t+1}(S)$;

$\gamma$ is the discount coefficient;

S6: obtaining the optimal task scheduling sequence according to formula (7);

$\max_{S} |V_{t+1}(S) - V_t(S)| < \epsilon$ (7);

wherein $\epsilon$ is the convergence threshold;

and $S$ is the set of states of the smartphone at all times.
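The iteration in formula (6) together with the stopping rule in formula (7) can be sketched as follows. This is a minimal illustration under stated assumptions, not the patent's implementation: the reward table `R` and transition table `P` are tiny hypothetical examples, because the actual matrices $R_t$ and $P_t$ depend on formulas (2) and (4), which are not reproduced in the text. The function name `evaluate_policy` is likewise hypothetical.

```python
def evaluate_policy(R, P, policy, gamma=0.9, eps=1e-6, max_iters=10_000):
    """Iterate formula (6) for a fixed policy until formula (7) holds.

    R[x][a]    : reward for taking action a in state x
    P[x][a][y] : probability of moving from x to y under action a
    policy[x]  : action chosen in state x
    """
    n = len(R)
    V = [0.0] * n
    for _ in range(max_iters):
        # One Bellman backup per state: V(x) = R(x, pi(x)) + gamma * sum_y P * V(y)
        V_next = [
            R[x][policy[x]]
            + gamma * sum(P[x][policy[x]][y] * V[y] for y in range(n))
            for x in range(n)
        ]
        # Formula (7): stop when the largest change over all states is below eps.
        if max(abs(a - b) for a, b in zip(V_next, V)) < eps:
            return V_next
        V = V_next
    return V


# Hypothetical two-state example with a single action (index 0):
# state 0 earns reward 1 and moves to state 1; state 1 is absorbing with reward 0.
R = [[1.0], [0.0]]
P = [[[0.0, 1.0]], [[0.0, 1.0]]]
V = evaluate_policy(R, P, policy=[0, 0])
print(V)
```

In this toy case the values settle at $V(0) = 1$ and $V(1) = 0$ after two sweeps. Plain nested lists are used to keep the sketch dependency-free; for a state space of size $4(N+1)$ a matrix library would be the more usual choice.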

To save power, sensing need not be performed at exactly the time required by the system, because the readings of some sensors (e.g., light and temperature) change slowly over time. Such readings may be collected at a time slightly different from the requested one. If the smartphone is required to collect a sensor reading at time $t$ but the data is collected at time $t'$, the accuracy of this sensing action is $\Gamma_C(t, t') \in [0, 1]$. The closer $t'$ is to $t$, the larger the value of $\Gamma_C(t, t')$ and the more accurate the sensing data.

[Formula for $\Gamma_C(t, t')$: image BDA0002186370370000071 in the original; not reproduced in this text]

Its value lies in the range 0 to 1. This is a common model, so the sensing accuracy can be estimated with it. For a sensing task $\mathrm{Task}_t$ whose sensing data is acquired at time $t'$, the task meets the requirements of the sensing platform if $\Gamma_C(t, t') \geq Q_t$.
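The feasibility test $\Gamma_C(t, t') \geq Q_t$ can be illustrated with a stand-in accuracy function. The exact formula is an image in the source and is not recoverable, so the Gaussian-shaped kernel below is purely an assumption, chosen only because it has the two properties the text states: values in $[0, 1]$ and a maximum at $t' = t$. The `width` parameter is hypothetical and is not the $\sigma$ of the original formula.

```python
import math


def accuracy(t, t_prime, width=5.0):
    """ASSUMED stand-in for Gamma_C(t, t'): 1 at t' == t, decaying toward 0."""
    return math.exp(-((t - t_prime) ** 2) / (2.0 * width ** 2))


def meets_requirement(t, t_prime, q_t, width=5.0):
    """A reading taken at t' satisfies the task if Gamma_C(t, t') >= Q_t."""
    return accuracy(t, t_prime, width) >= q_t


# Times are in arbitrary units; the quality threshold 0.8 is a sample value.
print(meets_requirement(t=10, t_prime=11, q_t=0.8))
print(meets_requirement(t=10, t_prime=30, q_t=0.8))
```

Under this assumed kernel a reading taken one time unit late still passes a 0.8 threshold, while one taken twenty units late does not, which is the behaviour that lets a scheduler shift readings slightly to share them between tasks.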

To implement the strategy and check its effectiveness, an Android-based crowd sensing application was designed and its performance was compared with the widely used Baseline method and the Opt-MESS method. The Baseline method schedules each sensor to collect readings at exactly the required time; the Opt-MESS method has the sensor collect data at time points computed by linear programming and a greedy calculation. In the simulation, sensor tasks are generated randomly, data is collected at different time points, and the power consumption of the sensors is computed as the energy consumed by the sensing tasks, which serves as the main index for performance evaluation.

The evaluation considers six commonly used embedded sensors: GPS, light sensor, accelerometer, gyroscope, WiFi, and 4G. Energy consumption is computed from real power data for these sensors, taken from the power profile of the Google Nexus 4 smartphone, multiplied by the estimated duration. The sensing data collection time points are then set: the sensing scheduling period is 12 hours (for example, from 8 am to 8 pm) and is divided evenly into 2-minute slots, that is, 360 slots, which gives a uniform, equally spaced time sequence along which sensing data can be collected. Finally, sensor tasks are generated randomly, each with a sensor set chosen at random from the six sensors. The duration of each task varies from 1 hour to 6 hours, with a start time chosen randomly from [1, 6] such that its end time does not exceed 12 hours. The accuracy requirement varies from 0.5 to 1, and the number of tasks varies from 5 to 30 in steps of 5.

The proposed algorithm is evaluated comprehensively by varying the number of tasks, the task duration, and the task sensing accuracy requirement. All experimental results are averaged over at least 100 rounds.
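The simulation setup described above can be sketched as a small task generator. This is an illustration under assumptions, not the authors' test harness: integer-hour durations and start times, the dictionary layout, and the function names are all hypothetical, and no sensor power values are included because the text does not list them.

```python
import random

SENSORS = ["GPS", "light", "accelerometer", "gyroscope", "WiFi", "4G"]
PERIOD_HOURS = 12   # sensing scheduling period
SLOT_MINUTES = 2    # spacing of the collection time points


def time_slots():
    """Start minute of each 2-minute slot in the 12-hour period."""
    return list(range(0, PERIOD_HOURS * 60, SLOT_MINUTES))


def random_task(task_id, rng):
    """One randomly generated sensing task (integer hours are an assumption)."""
    duration = rng.randint(1, 6)   # 1 to 6 hours
    start = rng.randint(1, 6)      # start hour in [1, 6]; start + duration <= 12
    return {
        "id": task_id,
        "sensors": rng.sample(SENSORS, rng.randint(1, len(SENSORS))),
        "start_h": start,
        "end_h": start + duration,
        "accuracy": rng.uniform(0.5, 1.0),   # accuracy requirement Q_t
    }


rng = random.Random(0)   # fixed seed so a run is repeatable
slots = time_slots()
# Task counts 5, 10, ..., 30 as in the described experiments.
workloads = {n: [random_task(i, rng) for i in range(n)] for n in range(5, 31, 5)}
print(len(slots), sorted(workloads))
```

Because the largest start hour and the longest duration are both 6, every generated task ends within the 12-hour period without any extra rejection step.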

FIG. 1 shows that, whichever algorithm is used, energy consumption increases monotonically with the number of tasks. The more sensing tasks there are, the more time instants require sensing, so energy consumption rises regardless of the optimization applied. Compared with the Baseline method, the algorithm reduces smartphone energy consumption by 75.8% on average; compared with the recently proposed Opt-MESS method, it reduces energy consumption by 18.4% on average. Furthermore, as the number of sensing tasks grows, energy saving becomes increasingly important. The figure shows that the algorithm can schedule tasks and collect data in a mobile crowd sensing system strategically according to task requirements, achieving substantial energy savings while sacrificing little sensing accuracy.

FIG. 2 shows that smartphone energy consumption increases monotonically with task duration: the longer a sensing task lasts, the more data is collected and the more energy is consumed. Compared with the Baseline method, the algorithm saves energy markedly, and it also outperforms the Opt-MESS method.

FIG. 3 shows the trade-off between energy consumption and task sensing accuracy: whichever algorithm is used, energy consumption increases monotonically with the sensing accuracy requirement, but only slowly. Raising the accuracy requirement inevitably increases energy consumption, because more readings must be collected at more time instants to meet it. Setting the accuracy requirement to a moderately high value (e.g., 0.8 or 0.9) keeps power consumption near optimal, since doing so does not increase energy consumption excessively. When the accuracy requirement of all tasks is 1, the energy consumption of the three algorithms is very close.

FIG. 4 shows that the algorithm saves 75.2% and 17.3% of energy on average compared with the Baseline method and the Opt-MESS method, respectively. The energy-saving effect is more pronounced when the sensing period is shorter, because within a shorter sensing period, sensing tasks are more likely to request data from a common sensor at the same or nearby times. The algorithm of the invention therefore achieves better energy saving.
