Air pollutant concentration monitoring method based on time domain weighting

Document No.: 1707600 Publication date: 2019-12-13

Summary: This technology, an air pollutant concentration monitoring method based on time domain weighting (一种基于时域加权的空气污染物浓度监测方法), was designed by Gu Ke (顾锞), Qiao Junfei (乔俊飞), and Xia Junyong (夏俊勇) on 2019-09-09. The invention discloses an air pollutant concentration monitoring method based on time domain weighting. A time weighting matrix is introduced into the support vector regression model to differentiate the weights of training samples from different moments. First, one week of data on 12 features in total (6 meteorological indicators and 6 air pollutant concentrations) is collected as feature vectors, which form the sample data. Next, the time weighting matrix is introduced to build a time-domain weighted support vector regression (TSVR) model. Finally, the TSVR model is trained with the samples and the monitoring results are evaluated. Experimental results indicate that the proposed model has clear advantages over current monitoring methods in both air pollutant concentration monitoring and implementation efficiency.

1. An air pollutant concentration monitoring method based on time domain weighting, characterized by comprising the following steps:

Step 1: collecting one week of data on 12 features in total, namely 6 meteorological indicators and 6 air pollutant concentrations, as feature vectors; forming sample data from the feature vectors; and establishing a time-domain weighted support vector regression (TSVR) model;

Step 2: training the TSVR model with the samples and evaluating the monitoring results;

In the first step:

(1) collecting data of 12 features as a feature vector: the values of 6 meteorological indicators at the current moment (temperature, relative humidity, wind speed, wind direction, pressure, and visibility), and the concentration values of 6 air pollutants measured 1 hour earlier (PM₂.₅, PM₁₀, CO, NO₂, SO₂, and O₃); constructing sample data from the feature vectors;

(2) establishing a time-domain weighted support vector regression model:

The single-task support vector regression expression $H(x_i)$ is:

$H(x_i) = w^{T}\Psi(x_i) + b \quad (1)$

where $x_i$ is the input vector of the single-task support vector regression machine, $\Psi(x_i)$ denotes a nonlinear mapping that maps the input feature vector into a high-dimensional Hilbert space, and $w$ and $b$ denote the weight and bias, respectively;

Define $D_t = \{(x_1,y_1),(x_2,y_2),\ldots,(x_i,y_i),\ldots,(x_p,y_p)\}$ as the sample data set, where $p$ is the total number of samples, $i$ is any integer between 1 and $p$, $x_i$ is the input feature vector of the $i$-th sample, whose dimension equals the number of input features, and $y_i$ is the real-valued output corresponding to the $i$-th sample, whose dimension is 1; an objective function and a constraint function are established for solving the weight $w$ and bias $b$ of the single-task support vector regression:

[Equation (2) not reproduced in the source text]

where $\Phi = (\Psi(x_1), \Psi(x_2), \ldots, \Psi(x_p))$ collects the nonlinear mappings of the samples, $\xi = (\xi_1, \xi_2, \ldots, \xi_p)^{T}$ is the vector of error slack variables, whose solution is given below, $\gamma$ is a real-valued regularization parameter, and $I_p$ is the all-ones vector with $p$ elements; solving the optimization problem of formula (2) yields the values of the weight $w$, the bias $b$, and the error slack variable $\xi$;

let $T_0$ be the current time and $T_{-1}, T_{-2}, \ldots, T_{-\lambda}, \ldots, T_{-\infty}$ the times of the samples used for training; a weight governed by a real-valued weight parameter $\tau$ is attached to each training sample; when $\tau > 0$, the training samples closest to the current time $T_0$ have the largest influence on the trained model; adding a time weighting matrix $\Lambda$ to formula (2), formula (2) is rewritten as:

[Equation (3) not reproduced in the source text]

the square matrix $\Lambda$ is expressed as:

[Equation (4) not reproduced in the source text]

where $\lambda \in \{1, 2, \ldots, \infty\}$ indexes the sample at time $T_{-\lambda}$, and $\tau > 0$ gives more weight to samples closer to $T_0$; a weighting threshold $Y_\Lambda$ is introduced to screen the samples, and formula (4) is simplified to:

[Equation (5) not reproduced in the source text]

where $Y_\Lambda$ is the weighting threshold [its definition is not reproduced in the source text]; the Lagrange multiplier method is applied to solve the optimization problem of formula (3) by constructing the Lagrange function $L(w, \xi, a, b)$:

$L(w,\xi,a,b) = F(w,\xi) - a^{T}(\Phi^{T}w + bI_p + \xi - y) \quad (6)$

where $a = (a_1, a_2, \ldots, a_r)^{T}$ is the vector of Lagrange multipliers and $r$ is the number of elements in $a$; the Lagrange multipliers are unknowns and are solved together with the other unknown parameters $w$, $\xi$, and $b$ by solving a system of equations; the partial derivatives of $L(w, \xi, a, b)$ with respect to $w$, $b$, $\xi$, and $a$ are each set to zero:

[Equations not reproduced in the source text]

the linear equations (5) to (9) contain 4 unknowns $a$, $w$, $\xi$, and $b$; solving them gives the solution $a^{*}$ of $a$ and the solution $b^{*}$ of $b$; substituting $a^{*}$ and $b^{*}$ into (1) gives the final expression of the time-domain weighted support vector regression:

[Equation not reproduced in the source text]

where $K(x, x_i) = \Psi(x)^{T}\Psi(x_i)$ is a radial basis function kernel, which maps the sample data $x$ into a high-dimensional space;

In the second step:

The collected sample data are used as training samples to train the model and obtain the optimal time-domain weighted support vector regression model, sample data whose timestamps are far from the current time being considered to have little influence on the model; a feature vector of 12 features in total, consisting of the 6 meteorological indicators at the current time and the 6 air pollutant concentrations monitored 1 hour earlier, is then input to the optimal trained model to obtain the air pollutant concentration monitoring value; the test result is compared with the true value, and model performance is judged by the model evaluation indices MSE, NMGE, and IOA.

Technical Field

The invention relates to an air pollutant concentration monitoring method based on time domain weighting, which monitors air pollutant concentrations in real time using as feature vectors the values of 12 features: 6 meteorological indicators at the current moment and 6 air pollutant concentrations from 1 hour earlier. The method belongs to the intersection of atmospheric environment science and machine learning.

Background

In recent years, PM₂.₅, PM₁₀, CO, NO₂, SO₂, and O₃ have been the most typical air pollutants monitored by countries around the world. Concentrations of these pollutants far exceed the specified standards and pose serious, long-term hazards to human health in many countries. PM₂.₅, also called fine particulate matter, refers to particles in ambient air with an aerodynamic equivalent diameter of 2.5 microns or less. It can remain suspended in the air for a long time. Compared with coarser atmospheric particulate matter, PM₂.₅ has a small particle size, a large surface area, and high activity; it readily carries toxic and harmful substances, stays in the atmosphere for a long time, and travels long distances, so it has a greater impact on human health and atmospheric environmental quality.

To avoid the great harm that air pollution does to the human body, a system that can monitor air pollutants in real time is needed to support decisions by governments and the public. Because monitoring lags, air pollutant concentrations become known only several hours later. To achieve real-time measurement of air pollutant concentration, a soft-sensing monitoring system is proposed that measures the current air pollutant concentration indirectly.

The invention aims to introduce a time weighting matrix that distinguishes the different influence of training samples from different moments on the weights of a real-time air pollutant concentration monitoring model. Training samples are formed by taking one week of values of the 12 features as feature vectors, and these samples are used to train the model. A feature vector of 12 features in total, consisting of the 6 meteorological indicators at the current moment and the 6 air pollutant concentrations monitored 1 hour earlier, is then input to the trained monitoring model as a test sample to obtain the real-time air pollutant concentration monitoring value.

Disclosure of Invention

The invention provides an air pollutant concentration monitoring method based on time domain weighting, which addresses the problem of real-time monitoring of air pollutant concentration.

The invention adopts the following technical scheme and implementation steps:

An air pollutant concentration monitoring method based on time domain weighting, which monitors the air pollutant concentration in real time: a feature vector is formed from the values of 12 features (6 meteorological indicators at the current moment and 6 air pollutant concentrations from 1 hour earlier), training samples built from such feature vectors are used to train a time-domain weighted support vector regression model, and the result is evaluated;

The method is characterized by comprising the following steps:

(1) collecting data and establishing a time domain weighting support vector regression model;

Collecting one week of data on 12 features as feature vectors: the values of 6 meteorological indicators at the current moment (temperature, relative humidity, wind speed, wind direction, pressure, and visibility), and the corresponding concentration values of 6 air pollutants measured 1 hour earlier (PM₂.₅, PM₁₀, CO, NO₂, SO₂, and O₃); constructing sample data from the feature vectors;

Establishing a time-domain weighted support vector regression model:

The single-task support vector regression expression $H(x_i)$ is:

$H(x_i) = w^{T}\Psi(x_i) + b \quad (1)$

where $x_i$ is the input vector of the single-task support vector regression machine, $\Psi(x_i)$ denotes a nonlinear mapping of the input feature vector into a high-dimensional Hilbert space, and $w$ and $b$ denote the weight and bias, respectively. An objective function and a constraint function in $w$ and $b$ are established, and the optimization problem is solved with the Lagrange multiplier method, yielding the final expression of the time-domain weighted support vector regression machine;

(2) Training: the collected sample data are used as training samples to train the model and obtain the optimal time-domain weighted support vector regression model. Sample data whose timestamps are far from the current time are considered to have little influence on the model. A feature vector of 12 features in total, consisting of the 6 meteorological indicators at the current time and the 6 air pollutant concentrations monitored 1 hour earlier, is then input to the optimal trained model to obtain the air pollutant concentration monitoring value. The test result is compared with the true value, and model performance is judged by the model evaluation indices MSE, NMGE, and IOA.

The main features of the invention are:

To monitor the concentration of air pollutants in real time, the time-domain weighted air pollutant concentration monitoring model takes full account of the influence of time-domain information on air pollutant concentration monitoring, achieves real-time monitoring of the air pollutant concentration, and has better generalization ability.

Detailed Description

The invention provides an air pollutant concentration monitoring model based on time domain weighting. Taking as input the values of 12 features (6 meteorological indicators at the current moment and 6 air pollutant concentrations from 1 hour earlier), the model is trained on sample data to obtain an optimal time-domain weighted support vector regression model, thereby achieving real-time monitoring of air pollutant concentration. This addresses the difficulty of monitoring and controlling air pollutant concentration and can provide a reference for government decision-making, public travel, and the like.

The invention adopts the following technical scheme and implementation steps:

An air pollutant concentration monitoring method based on time domain weighting takes as feature vectors the values of 12 features: 6 meteorological indicators at the current moment and 6 air pollutant concentrations from 1 hour earlier;

(1) Collecting data and establishing a time domain weighting support vector regression model;

Collecting one week of data on 12 features as feature vectors: the values of 6 meteorological indicators at the current moment (temperature, relative humidity, wind speed, wind direction, pressure, and visibility), and the concentration values of 6 air pollutants measured 1 hour earlier (PM₂.₅, PM₁₀, CO, NO₂, SO₂, and O₃); constructing sample data from the collected feature vectors;
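As an illustration of how such samples could be assembled, the sketch below pairs the 6 meteorological values at hour t with the 6 pollutant concentrations at hour t-1 and uses one pollutant concentration at hour t as the target. The array layout, the function name `build_samples`, the choice of target column, and the random placeholder data are assumptions made for illustration; the patent does not specify a data format.

```python
import numpy as np

def build_samples(meteo, pollutants, target_col=0):
    """Build 12-feature samples from hourly records (hypothetical helper).

    meteo:      (n_hours, 6) array: temperature, relative humidity,
                wind speed, wind direction, pressure, visibility.
    pollutants: (n_hours, 6) array: PM2.5, PM10, CO, NO2, SO2, O3.
    target_col: assumed choice of which pollutant is being monitored.
    """
    meteo = np.asarray(meteo, dtype=float)
    pollutants = np.asarray(pollutants, dtype=float)
    if meteo.shape != pollutants.shape or meteo.shape[1] != 6:
        raise ValueError("expected two (n_hours, 6) arrays")
    # Each row: meteorology at hour t followed by pollutants at hour t-1.
    X = np.hstack([meteo[1:], pollutants[:-1]])
    # Target: the monitored pollutant at hour t.
    y = pollutants[1:, target_col]
    return X, y

# One week of hourly placeholder data (random numbers, not measurements).
hours = 7 * 24
rng = np.random.default_rng(0)
meteo = rng.random((hours, 6))
pollutants = rng.random((hours, 6))
X, y = build_samples(meteo, pollutants)
print(X.shape, y.shape)  # (167, 12) (167,)
```

The first hour is dropped because it has no preceding pollutant reading, so one week of hourly data yields 167 samples.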

Establishing a time-domain weighted support vector regression model:

The traditional single-task support vector regression is extended into a time-domain weighted support vector regression, whose expression $H(x_i)$ is:

$H(x_i) = w^{T}\Psi(x_i) + b \quad (1)$

where $x_i$ is the input vector of the single-task support vector regression machine, $\Psi(x_i)$ denotes a nonlinear mapping of the input feature vector into a high-dimensional Hilbert space, and $w$ and $b$ denote the weight and bias, respectively;

Define $D_t = \{(x_1,y_1),(x_2,y_2),\ldots,(x_i,y_i),\ldots,(x_p,y_p)\}$ as the sample data set, where $p$ is the total number of samples, $i$ is any integer between 1 and $p$, $x_i$ is the input feature vector of the $i$-th sample, whose dimension equals the number of input features, and $y_i$ is the real-valued output corresponding to the $i$-th sample, whose dimension is 1; an objective function and a constraint function are established for solving the weight $w$ and bias $b$ of the single-task support vector regression:

[Equation (2) not reproduced in the source text]

where $\Phi = (\Psi(x_1), \Psi(x_2), \ldots, \Psi(x_p))$ collects the nonlinear mappings of the samples, $\xi = (\xi_1, \xi_2, \ldots, \xi_p)^{T}$ is the vector of error slack variables, whose solution is given below, $\gamma$ is a positive real regularization parameter whose value is determined by repeated trial and error, and $I_p$ is the all-ones vector with $p$ elements. Solving the optimization problem of formula (2) yields the values of the weight $w$, the bias $b$, and the error slack variable $\xi$;
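For orientation, one form of (2) that is consistent with the Lagrange function (6) used later is the standard least-squares support vector regression problem. The factor 1/2 and the exact way $\gamma$ enters the objective are assumptions, since the equation itself is not reproduced in this text; the equality constraint follows from the bracketed term in (6).

```latex
\min_{w,\,b,\,\xi}\; F(w,\xi) \;=\; \frac{1}{2}\, w^{T} w \;+\; \frac{\gamma}{2}\, \xi^{T} \xi
\qquad \text{s.t.} \qquad y \;=\; \Phi^{T} w + b\, I_p + \xi
```

Here $y = (y_1, y_2, \ldots, y_p)^{T}$ stacks the sample outputs, so the constraint states that each slack variable $\xi_i$ is the residual $y_i - H(x_i)$.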

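However, the above objective function does not consider the difference in importance of data samples in the time domain. Let $T_0$ be the current time and $T_{-1}, T_{-2}, \ldots, T_{-\lambda}, \ldots, T_{-\infty}$ the times of the samples used for training; each training sample is assigned a weight governed by a weight parameter $\tau$, which can take any real value; when $\tau > 0$, the training samples closest to the current time $T_0$ have the largest influence on the trained model; adding a time weighting matrix $\Lambda$ to formula (2), formula (2) is rewritten as:

[Equation (3) not reproduced in the source text]

The square matrix $\Lambda$ is expressed as:

[Equation (4) not reproduced in the source text]

where $\lambda \in \{1, 2, \ldots, \infty\}$ indexes the sample at time $T_{-\lambda}$; a weighting threshold $Y_\Lambda$ is introduced to screen the samples, and formula (4) is simplified to:

[Equation (5) not reproduced in the source text]

where $Y_\Lambda$ is the weighting threshold [its definition is not reproduced in the source text]; the Lagrange multiplier method is applied to solve the optimization problem of formula (3) by constructing the Lagrange function $L(w, \xi, a, b)$: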

$L(w,\xi,a,b) = F(w,\xi) - a^{T}(\Phi^{T}w + bI_p + \xi - y) \quad (6)$

where $a = (a_1, a_2, \ldots, a_r)^{T}$ is the vector of Lagrange multipliers and $r$ is the number of elements in $a$; the Lagrange multipliers are unknowns and are solved together with the other unknown parameters $w$, $\xi$, and $b$ by solving a system of equations. The partial derivatives of $L(w, \xi, a, b)$ with respect to $w$, $b$, $\xi$, and $a$ are each set to zero:

[Equations not reproduced in the source text]

The linear equations (5) to (9) contain 4 unknowns $a$, $w$, $\xi$, and $b$; solving them gives the solution $a^{*}$ of $a$ and the solution $b^{*}$ of $b$. Substituting $a^{*}$ and $b^{*}$ into (1) gives the final expression of the time-domain weighted support vector regression:

[Equation not reproduced in the source text]

where $K(x_i, x_j) = \Psi(x_i)^{T}\Psi(x_j)$ is a radial basis function kernel, which maps the sample data $x$ into a high-dimensional space.
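The model-building steps above can be sketched numerically. The version below is a minimal weighted least-squares SVR, assuming the objective (1/2)wᵀw + (γ/2)ξᵀΛξ with a diagonal Λ. Under that assumption, setting the partial derivatives of (6) to zero reduces to a single linear system in b and a, and the prediction is H(x) = Σ aᵢ K(x, xᵢ) + b. The exponential weight exp(-τλ), all function names, and the parameter values are hypothetical; the patent's own weight formula and the threshold Y_Λ are not reproduced in this text and are not implemented here.

```python
import numpy as np

def rbf_kernel(A, B, sigma=1.0):
    """Radial basis function kernel matrix between the rows of A and B."""
    sq = (A**2).sum(1)[:, None] + (B**2).sum(1)[None, :] - 2.0 * A @ B.T
    return np.exp(-np.maximum(sq, 0.0) / (2.0 * sigma**2))

def time_weights(p, tau=0.01):
    """Assumed diagonal of Lambda: the sample at T_{-lam} gets exp(-tau*lam).

    Samples are ordered oldest first, so the last row is closest to T_0
    and receives the largest weight when tau > 0.
    """
    lam = np.arange(p, 0, -1)  # p, p-1, ..., 1
    return np.exp(-tau * lam)

def fit_tsvr(X, y, gamma=10.0, tau=0.01, sigma=1.0):
    """Solve [[0, 1^T], [1, K + Lambda^{-1}/gamma]] [b; a] = [0; y]."""
    p = len(y)
    K = rbf_kernel(X, X, sigma)
    v = time_weights(p, tau)
    A = np.zeros((p + 1, p + 1))
    A[0, 1:] = 1.0
    A[1:, 0] = 1.0
    A[1:, 1:] = K + np.diag(1.0 / (gamma * v))
    rhs = np.concatenate([[0.0], y])
    sol = np.linalg.solve(A, rhs)
    return sol[0], sol[1:]  # b*, a*

def predict_tsvr(X_train, a, b, X_new, sigma=1.0):
    """Evaluate H(x) = sum_i a_i K(x, x_i) + b for each row of X_new."""
    return rbf_kernel(X_new, X_train, sigma) @ a + b

# Placeholder data with 12 features per sample (not real measurements).
rng = np.random.default_rng(0)
X = rng.random((60, 12))
y = np.sin(X.sum(axis=1))
b, a = fit_tsvr(X, y)
pred = predict_tsvr(X, a, b, X)
print("training MSE:", float(np.mean((pred - y) ** 2)))
```

A larger weight on a sample penalizes its residual more strongly, so recent samples pull the fit harder than old ones, which matches the stated intent of the time weighting matrix.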

(2) Training: the collected sample data are used as training samples to train the model and obtain the optimal time-domain weighted support vector regression model. Sample data whose timestamps are far from the current time are considered to have little influence on the model. A feature vector of 12 features in total, consisting of the 6 meteorological indicators at the current time and the 6 air pollutant concentrations monitored 1 hour earlier, is then input to the optimal trained model to obtain the air pollutant concentration monitoring value. The test result is compared with the true value, and model performance is judged by the model evaluation indices MSE, NMGE, and IOA.
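The three evaluation indices can be computed as sketched below. The patent names the indices, but this text does not give their formulas, so the sketch assumes commonly used definitions: MSE as the mean of squared errors, NMGE (normalized mean gross error) as the sum of absolute errors divided by the sum of observations, and IOA as Willmott's index of agreement. The function names and the sample numbers are illustrative only.

```python
import numpy as np

def mse(obs, pred):
    """Mean square error; lower is better."""
    obs, pred = np.asarray(obs, float), np.asarray(pred, float)
    return float(np.mean((pred - obs) ** 2))

def nmge(obs, pred):
    """Normalized mean gross error (assumed definition); lower is better."""
    obs, pred = np.asarray(obs, float), np.asarray(pred, float)
    return float(np.sum(np.abs(pred - obs)) / np.sum(obs))

def ioa(obs, pred):
    """Willmott's index of agreement (assumed definition); 1 is a perfect match."""
    obs, pred = np.asarray(obs, float), np.asarray(pred, float)
    mean_obs = obs.mean()
    num = np.sum((pred - obs) ** 2)
    den = np.sum((np.abs(pred - mean_obs) + np.abs(obs - mean_obs)) ** 2)
    return float(1.0 - num / den)

# Made-up observed and predicted concentrations, for illustration only.
obs = np.array([10.0, 20.0, 30.0, 40.0])
pred = np.array([12.0, 18.0, 33.0, 37.0])
print(mse(obs, pred), nmge(obs, pred), ioa(obs, pred))
```

MSE and NMGE decrease as predictions improve, while IOA increases toward 1, so a model comparison reads the first two as "lower is better" and the third as "higher is better".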

The model, referred to as the TSVR model for short, was tested and compared with five other currently popular models; the comparison results are shown in Tables 1, 2, and 3.

Table 1. Comparison of the mean square error (MSE) of atmospheric pollutant concentration predictions for this model and five advanced models

Table 2. Comparison of the normalized mean gross error (NMGE) of atmospheric pollutant concentration predictions for this model and five advanced models

Table 3. Comparison of the index of agreement (IOA) of atmospheric pollutant concentration predictions for this model and five advanced models
