Hidden-variable-based Bayesian network soft measurement method for complex industrial processes

Document No.: 1140674    Published: 2020-09-11

Abstract: This technology, a hidden-variable-based Bayesian network soft measurement method for complex industrial processes, was designed by Xu Yuxue (徐玉雪), Wang Yun (王云) and He Yuchen (何雨辰) on 2020-04-02. The invention discloses a Bayesian network soft measurement method for complex industrial processes based on hidden variables. The method combines the strengths of Bayesian networks and locally weighted learning: hidden variables are used to compute the global similarity between the online sample to be predicted and the corresponding training samples, the original data are weighted by this similarity, and the weighted data are input into the Bayesian network for prediction, realizing adaptive soft measurement of complex non-stationary industrial processes. To address the time-varying nature of complex industrial processes, hidden-variable-based similarity calculation and weighting are introduced on top of supervised training of the Bayesian network, which mitigates model overfitting, improves prediction accuracy, and supports soft measurement modeling of quality variables closely related to production safety, production quality and production efficiency in complex chemical production processes.

1. A hidden-variable-based Bayesian network soft measurement method for complex industrial processes, characterized by comprising the following steps:

Step 1: collect, through sensors, variables closely related to production safety and quality in a complex continuous industrial process, and divide them into a training sample set [X, Y] and a query sample set [X_query, Y_query]. In the training sample set, the process variables X = [x_1, x_2, ..., x_n] ∈ R^(m×n) and the quality variables Y = [y_1, y_2, ..., y_n] ∈ R^(k×n) are used for soft measurement modeling, where m is the number of process variables, k is the number of quality variables, and n is the number of samples in the data set. In the query sample set, the process variables X_query are the online samples to be predicted by soft measurement, and Y_query is used to verify the soft measurement predictions;

Step 2: for each query sample x_q ∈ X_query, select a certain number of samples D_train from the training sample set using sliding windows; construct a supervised Bayesian network according to the variable relationships of the industrial process and the size of D_train; train the Bayesian network and obtain the hidden variables T corresponding to D_train; input x_q into the Bayesian network to obtain the hidden variable t_new;

Step 3: using the hidden variables T and t_new, compute the confidence and local similarity of each sliding window, and from them obtain the global similarity between x_q and X_train, which is used to locally weight the original industrial process variables;

Step 4: use the global similarity as weights on X_train and Y_train to obtain the locally weighted training data X_t and Y_t; input X_t and Y_t into the Bayesian network at the corresponding nodes to obtain the predicted value y_q corresponding to x_q; complete the prediction of the industrial process quality variable for all query samples x_q;

Step 5: after all query samples have been predicted, evaluate the prediction performance of the soft measurement model on the complex industrial process variables. The accuracy of the predictions is measured by the root mean square error (RMSE), and the ability of the predictions to track the data is measured by R², calculated as follows:

$$\mathrm{RMSE} = \sqrt{\frac{1}{n}\sum_{j=1}^{n}\left(y_{real,j} - y_{pred,j}\right)^{2}}, \qquad R^{2} = 1 - \frac{\sum_{j=1}^{n}\left(y_{real,j} - y_{pred,j}\right)^{2}}{\sum_{j=1}^{n}\left(y_{real,j} - \bar{y}_{real}\right)^{2}}$$

where y_real is the actual value of the quality variable corresponding to each query sample x_q, \bar{y}_real is the mean of y_real, y_pred is the predicted value, and n is the number of query samples.

2. The hidden-variable-based Bayesian network soft measurement method for complex industrial processes as recited in claim 1, wherein step 2 specifically comprises:

Step 2-1: traverse the training sample set divided in step 1 using sliding windows, where the number of samples in each window is w, the number of windows is s, and the total number of traversed samples is W = w × s; that is, the training set corresponding to each query sample x_q ∈ X_query is D_train = [X_train, Y_train] ∈ R^((m+k)×W);

Step 2-2: select the order α = <X, t, Y> and construct a quality-related Bayesian network with 3 nodes which, unlike a conventional Bayesian network, is supervised by the quality variable Y_train of the training set. Input X_train and Y_train at the corresponding network nodes to train the Bayesian network with supervision, and obtain the hidden variables T ∈ R^(W×i) corresponding to the training data set D_train, where i is the number of hidden variables;

Step 2-3: input the new process variable x_q into the Bayesian network to obtain the hidden variable t_new ∈ R^(1×i) corresponding to x_q.

3. The hidden-variable-based Bayesian network soft measurement method for complex industrial processes as recited in claim 1, wherein the specific process of step 3 is as follows:

Step 3-1: apply support vector data description (SVDD) to determine the confidence between x_q and each window; the confidence win_v of the v-th window is calculated as follows:

Figure FDA0002436176970000021

where a_v and R_v are, respectively, the center coordinates and the radius of the hypersphere constructed from the hidden variables T_v computed from the v-th window, and x_v is the coordinate of t_new in that hypersphere.

Step 3-2: suppose the r-th training sample belongs to the v-th moving window; the local similarity S_v between t_new and the hidden variable t_r of the current training sample is calculated as follows:

Figure FDA0002436176970000022

Figure FDA0002436176970000023

where

Figure FDA0002436176970000024

Step 3-3: the global similarity between the r-th training sample in the v-th window and x_q is:

Sim_r = win_v · S_v

The global similarity between x_q and the training samples can be expressed as:

SIM = [Sim_1, Sim_2, ..., Sim_r, ..., Sim_(w×s)]

4. The hidden-variable-based Bayesian network soft measurement method for complex industrial processes as recited in claim 1, wherein the specific process of step 4 is as follows:

Step 4-1: calculate X_t and Y_t from the global similarity obtained in step 3:

Figure FDA0002436176970000025

Step 4-2: input X_t and Y_t at the corresponding network nodes, train the Bayesian network, and obtain the predicted value y_q corresponding to x_q;

Step 4-3: repeat steps 2 to 4 until the predicted values Y_q corresponding to all query samples in X_query have been obtained, then execute step 5.

Technical Field

The invention belongs to the field of continuous chemical process control and soft measurement, and particularly relates to a Bayesian network complex industrial process soft measurement method based on hidden variables.

Background

In continuous chemical production, some production states are difficult to measure and monitor online. To monitor the relevant variables in such complex processes, soft measurement methods are often used: easily measured physical quantities are sampled first, and the variables that reflect the state of the production process and the product quality are then estimated indirectly. Various data-driven models have been applied to soft measurement, but learning methods that are purely data-driven have difficulty finding the hidden variables that reveal the close relationship between the quality variables and the production system.

The Bayesian network is currently one of the most effective theoretical models for uncertain knowledge representation and reasoning. As a graphical model that combines graph theory and probability theory, it handles complex, fuzzy and uncertain scenarios well, and building soft measurement models with it has become a research hotspot.

In real complex industrial processes, aging of plant equipment, catalyst deactivation, changes in the process environment and similar factors cause model degradation: the originally established model no longer suits the current operating state, and its accuracy drops. To track the process state correctly and handle this time-varying behavior, Bayesian networks based on sliding windows and just-in-time learning have been proposed, but they have limitations and often ignore the influence of hidden variables on the soft measurement model. For time-varying processes in complex industrial production, the present invention introduces hidden-variable-based similarity calculation and weighting on top of supervised Bayesian network training, which mitigates overfitting, reduces model error, and improves the prediction accuracy of quality variables in complex industrial processes.

Disclosure of Invention

In order to solve the technical problems in the background technology, the invention provides a Bayesian network complex industrial process soft measurement method based on hidden variables.

The technical scheme adopted by the invention is as follows:

Step 1: collect, through sensors, data closely related to production safety and quality in a complex continuous industrial process, and divide the data into a training sample set [X, Y] and a query sample set [X_query, Y_query]. In the training sample set, the process variables X = [x_1, x_2, ..., x_n] ∈ R^(m×n) and the quality variables Y = [y_1, y_2, ..., y_n] ∈ R^(k×n) are used for soft measurement modeling, where m is the number of process variables, k is the number of quality variables, and n is the number of samples in the data set. In the query sample set, the process variables X_query are used for soft measurement prediction, and Y_query is used to verify the soft measurement predictions;

The data on the complex industrial process may specifically include the values of important variables collected by sensors installed in the chemical reaction vessels of the process. Process variables are variables that are relatively easy to monitor; quality variables are important variables related to production quality and safety. The sensors include temperature sensors mounted at the top of the column, on the column trays and at the bottom of the column, a pressure sensor on the overhead gas, a flow-rate sensor at the reflux drum, and a flow-rate sensor at the inlet of the next production stage.

Step 2: for each query sample x_q ∈ X_query, select a certain number of samples D_train from the training sample set using sliding windows; construct a supervised Bayesian network according to the variable relationships of the industrial process and the size of D_train; train the Bayesian network and obtain the hidden variables T corresponding to D_train; input x_q into the Bayesian network to obtain the hidden variable t_new;

Step 3: using the hidden variables T and t_new, compute the confidence and local similarity of each sliding window, and from them obtain the global similarity between x_q and X_train, which is used to locally weight the original industrial process variables;

Step 4: use the global similarity as weights on X_train and Y_train to obtain the locally weighted training data X_t and Y_t; input X_t and Y_t into the Bayesian network at the corresponding nodes to obtain the predicted value y_q corresponding to x_q; complete the prediction of the industrial process quality variable for all query samples x_q;

Step 5: after all query samples have been predicted, evaluate the prediction performance of the soft measurement model on the complex industrial process variables. The accuracy of the predictions is measured by the root mean square error (RMSE), and the ability of the predictions to track the data is measured by R², calculated as follows:

$$\mathrm{RMSE} = \sqrt{\frac{1}{n}\sum_{j=1}^{n}\left(y_{real,j} - y_{pred,j}\right)^{2}}, \qquad R^{2} = 1 - \frac{\sum_{j=1}^{n}\left(y_{real,j} - y_{pred,j}\right)^{2}}{\sum_{j=1}^{n}\left(y_{real,j} - \bar{y}_{real}\right)^{2}}$$

where y_real is the actual value of the quality variable corresponding to each query sample x_q, \bar{y}_real is the mean of y_real, y_pred is the predicted value, and n is the number of query samples.
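The two metrics of step 5 can be sketched in plain Python using the standard definitions of RMSE and R². This is a minimal illustration, not the patent's implementation; the function names `rmse` and `r_squared` and the sample values are assumptions made for the example.

```python
import math


def rmse(y_real, y_pred):
    """Root mean square error between actual and predicted values."""
    n = len(y_real)
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(y_real, y_pred)) / n)


def r_squared(y_real, y_pred):
    """Coefficient of determination: 1 - residual sum of squares / total sum of squares."""
    mean_real = sum(y_real) / len(y_real)
    ss_res = sum((a - b) ** 2 for a, b in zip(y_real, y_pred))
    ss_tot = sum((a - mean_real) ** 2 for a in y_real)
    return 1.0 - ss_res / ss_tot


if __name__ == "__main__":
    # Hypothetical values, for illustration only.
    y_real = [0.20, 0.40, 0.60, 0.80]
    y_pred = [0.25, 0.35, 0.60, 0.75]
    print(rmse(y_real, y_pred), r_squared(y_real, y_pred))
```

A lower RMSE means the predictions are closer to the measured values, and an R² closer to 1 means the predictions follow the variation of the measured values more closely.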

Step 2 specifically comprises the following:

Step 2-1: traverse the training sample set divided in step 1 using sliding windows, where the number of samples in each window is w, the number of windows is s, and the total number of traversed samples is W = w × s; that is, the training set corresponding to each query sample x_q ∈ X_query is D_train = [X_train, Y_train] ∈ R^((m+k)×W);

Step 2-2: select the order α = <X, t, Y> and construct a quality-related Bayesian network with 3 nodes which, unlike a conventional Bayesian network, is supervised by the quality variable Y_train of the training set. Input X_train and Y_train at the corresponding network nodes to train the Bayesian network with supervision, and obtain the hidden variables T ∈ R^(W×i) corresponding to the training data set D_train, where i is the number of hidden variables;

The Bayesian network of the invention is shown schematically in FIG. 1, where nodes represent variables and directed arrows between nodes represent causal dependencies between variables. The network has 3 nodes and 3 directed edges, with order α = <X, t, Y>: node X represents the process variables, node t represents the hidden variables, and node Y represents the quality variables. Node X is a parent node with two outgoing edges pointing to the child nodes t and Y, and node t has one outgoing edge pointing to Y. Network node X takes the input X_train and node Y takes the input Y_train; after the Bayesian network has been trained, the hidden variables are output from node t.

Step 2-3: input the new process variable x_q at node X of the Bayesian network and calculate the hidden variable t_new ∈ R^(1×i) corresponding to x_q.
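The network structure described for FIG. 1 can be written down directly. The following minimal Python sketch is an illustration rather than the patent's implementation: it encodes the three directed edges, checks that the order α = <X, t, Y> places every parent before its children, and builds the factorization of the joint distribution that such a directed graph implies. The names `EDGES`, `ORDER`, `parents`, `is_topological_order` and `factorization` are assumptions made for the example.

```python
# Directed edges of the three-node network, written as (parent, child).
EDGES = [("X", "t"), ("X", "Y"), ("t", "Y")]
# Node order alpha = <X, t, Y> from step 2-2.
ORDER = ["X", "t", "Y"]


def parents(node, edges=EDGES):
    """Return the parents of `node`, in the order the edges are listed."""
    return [p for p, c in edges if c == node]


def is_topological_order(order, edges=EDGES):
    """True if every parent appears before each of its children in `order`."""
    position = {node: i for i, node in enumerate(order)}
    return all(position[p] < position[c] for p, c in edges)


def factorization(order=ORDER, edges=EDGES):
    """Build the joint factorization implied by the directed graph."""
    factors = []
    for node in order:
        pa = parents(node, edges)
        factors.append(f"p({node} | {', '.join(pa)})" if pa else f"p({node})")
    return " ".join(factors)


if __name__ == "__main__":
    print(is_topological_order(ORDER))
    print("p(X, t, Y) =", factorization())
```

The factorization makes the supervision explicit: the quality variable Y depends on both the process variables X and the hidden variables t, so training with Y_train shapes the hidden variables that are later used for the similarity calculation.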

The flow of step 3 is shown in FIG. 3, and the specific process is as follows:

Step 3-1: apply support vector data description (SVDD) to determine the confidence between x_q and each window; the confidence win_v of the v-th window is defined as:

[equation image not reproduced]

where a_v and R_v are, respectively, the center coordinates and the radius of the hypersphere constructed from the hidden variables T_v computed from the v-th window, and x_v is the coordinate of t_new in that hypersphere.

Step 3-2: suppose the r-th training sample belongs to the v-th moving window; the local similarity S_v between t_new and the hidden variable t_r of the current training sample is calculated as:

[equation image not reproduced]

where the tuning parameter takes a value between 0 and 1, and σ_d is the standard deviation corresponding to d_τ.

Step 3-3: the global similarity between the r-th training sample in the v-th window and x_q is:

Sim_r = win_v · S_v

The global similarity between x_q and the training samples can be expressed as:

SIM = [Sim_1, Sim_2, ..., Sim_r, ..., Sim_(w×s)]
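The combination in step 3-3 can be illustrated with a short Python sketch. It takes the window confidences win_v and the local similarities as given inputs and only shows how they are multiplied and concatenated into SIM; it does not compute the SVDD confidence or the local similarity themselves. The function name `global_similarity`, the nested-list layout (one local similarity per training sample, grouped by window) and the numeric values are assumptions made for the example.

```python
def global_similarity(win, local_sim):
    """Combine window confidences and local similarities into SIM.

    win:       list of s window confidences, win[v] for window v
    local_sim: list of s lists; local_sim[v][j] is the local similarity
               of the j-th training sample in window v (assumed layout)
    Returns a flat list of length w * s with Sim_r = win_v * S_v.
    """
    if len(win) != len(local_sim):
        raise ValueError("need one confidence per window")
    sim = []
    for win_v, s_v in zip(win, local_sim):
        sim.extend(win_v * s for s in s_v)
    return sim


if __name__ == "__main__":
    # Hypothetical values: s = 2 windows with w = 2 samples each.
    win = [1.0, 0.5]
    local_sim = [[0.8, 0.6], [0.4, 0.2]]
    print(global_similarity(win, local_sim))
```

Samples in a window with low confidence are down-weighted as a group, while the local similarity distinguishes between samples inside the same window.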

The specific process of step 4 is as follows:

Step 4-1: calculate X_t and Y_t from the global similarity obtained in step 3;

Step 4-2: input X_t and Y_t at the corresponding network nodes, train the Bayesian network, and obtain the predicted value y_q corresponding to x_q;

Step 4-3: repeat steps 2 to 4 until the predicted values Y_q corresponding to all query samples in X_query have been obtained, then execute step 5.

Compared with the prior art, the invention has the following beneficial effects:

1. Process knowledge of industrial production is used to fully uncover the causal dependencies among variables and to construct the Bayesian network, which makes the relationships among variables more intuitive.

2. The Bayesian network is trained with supervision, making full use of process knowledge and important labeled production data.

3. Latent relationships among multiple variables are mined by computing hidden variables, and the hidden variables are used to weight the original data. This improves the utilization of the original data, facilitates model updating, and realizes adaptive soft measurement of the industrial process.

To address the time-varying nature of real production processes, the Bayesian network is used to compute the hidden variables in a supervised manner, extending the conventional Bayesian network into an adaptive soft measurement model that can predict with fewer training samples. Compared with other conventional adaptive soft measurement models, the method mitigates overfitting, improves prediction accuracy, and supports soft measurement modeling of quality variables closely related to production safety, production quality and production efficiency in complex chemical production processes.

Drawings

FIG. 1 is a Bayesian network architecture to which the present invention relates;

FIG. 2 is a schematic view of a debutanizer configuration in accordance with the present invention;

FIG. 3 is a schematic flow chart of the method of the present invention;

FIG. 4 is a graph showing the results of predicting butane content based on the method of the present invention;

FIG. 5 is a graph showing the results of a comparative method for predicting butane content using a hidden variable-free weighted Bayesian network;

FIG. 6 is a process scheme according to the present invention.

Detailed Description

The invention is further illustrated by the following figures and examples.

The embodiment of the invention and the implementation process thereof are as follows:

Taking the production process of a debutanizer column as an example, the soft measurement modeling method is described in detail based on data recorded during operation; the technical route is shown in FIG. 6.

The debutanizer column is part of the naphtha cracking refining process. Its purpose is to separate butane from naphtha, and it serves as a benchmark platform for verifying different algorithms in the soft measurement field. FIG. 2 is the process flow diagram of debutanizer production. In the debutanizer column, measuring the butane content at the bottom of the column is a vital part of production monitoring. However, butane content is not as easy to measure as temperature. The invention therefore selects the easily measured process variables related to temperature, pressure, flow and concentration shown in Table 1 to estimate the quality variable, namely the butane content.

TABLE 1

Process variable    Description
U1                  Temperature at the top of the column
U2                  Pressure at the top of the column
U3                  Reflux flow
U4                  Flow to the next stage
U5                  Temperature of tray 6
U6                  Temperature at the bottom of the column (left)
U7                  Temperature at the bottom of the column (right)

As shown in FIG. 3, to predict the butane content at the bottom of the debutanizer column using the soft measurement modeling method for complex industrial processes based on supervised Bayesian network hidden-variable similarity and local weighting, the following steps are carried out:

Step 1: collect, through sensors, variables closely related to production safety and quality in the complex continuous industrial process, and divide them into a training sample set [X, Y] and a query sample set [X_query, Y_query]. In the sample set, the number of process variables is 7, the number of quality variables is 1, and the number of query samples is 500.

Step 2: for each query sample x_q ∈ X_query, select 50 training samples using 10 sliding windows with a window size of 5 and input them into the Bayesian network; perform parameter learning with the expectation-maximization (EM) algorithm, then obtain the hidden variables T corresponding to the training samples using a junction tree inference engine; enter x_q into the network as evidence to obtain the hidden variable t_new;

Step 3: using the hidden variables T and t_new, compute the confidence and local similarity of each sliding window, and from them obtain the global similarity between x_q and X_train, which is used to locally weight the original industrial process variables;

Step 4: use the global similarity as weights on X_train and Y_train to obtain the locally weighted training data X_t and Y_t; input X_t and Y_t into the Bayesian network at the corresponding nodes to obtain the predicted value y_q corresponding to x_q; complete the prediction of the industrial process quality variable for all query samples x_q;

Step 5: after all query samples have been predicted, evaluate the prediction performance of the soft measurement model on the complex industrial process variables. The accuracy of the predictions is measured by the root mean square error (RMSE), and the ability of the predictions to track the data is measured by R², calculated as follows:

$$\mathrm{RMSE} = \sqrt{\frac{1}{n}\sum_{j=1}^{n}\left(y_{real,j} - y_{pred,j}\right)^{2}}, \qquad R^{2} = 1 - \frac{\sum_{j=1}^{n}\left(y_{real,j} - y_{pred,j}\right)^{2}}{\sum_{j=1}^{n}\left(y_{real,j} - \bar{y}_{real}\right)^{2}}$$

where y_real is the actual value of the quality variable corresponding to each query sample x_q, \bar{y}_real is the mean of y_real, y_pred is the predicted value, and n is the number of query samples.
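The window selection of step 2 in this embodiment (10 windows of 5 samples, 50 training samples in total) can be sketched as follows. The text does not state which 50 samples are chosen or whether windows overlap, so this sketch assumes the most recent w × s training samples split into consecutive, non-overlapping windows; the function name `select_windows` is likewise an assumption.

```python
def select_windows(train, w=5, s=10):
    """Split the last w * s training samples into s consecutive windows.

    Assumption: the samples closest in time to the query (the end of
    `train`) are used, and windows do not overlap.
    """
    total = w * s
    if len(train) < total:
        raise ValueError(f"need at least {total} training samples")
    recent = train[-total:]
    return [recent[v * w:(v + 1) * w] for v in range(s)]


if __name__ == "__main__":
    # Hypothetical training set: sample indices 0..199 stand in for rows
    # of [X_train, Y_train].
    train = list(range(200))
    windows = select_windows(train)
    print(len(windows), len(windows[0]))
    print(windows[0], windows[-1])
```

Each inner list corresponds to one window from which the hidden variables T_v would be computed, and the concatenation of all windows gives the W = w × s samples of D_train.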

The butane content predicted by the method of the invention is shown in FIG. 4. A conventional method, locally weighted Bayesian soft measurement modeling without hidden variables, is used for comparison, and its prediction results are shown in FIG. 5. The prediction performance of the two methods is compared in Table 2, which shows that the method of the invention achieves higher prediction accuracy than the Bayesian network without hidden variables.

TABLE 2

Metric    Method of the invention    Conventional method
RMSE      0.0350                     0.0536
R²        0.9414                     0.8622
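To put the figures of Table 2 in relative terms, the short Python calculation below derives the relative RMSE reduction and the absolute R² gain from the four reported values. It uses only the numbers in the table; the function name `relative_reduction` is an assumption made for the example.

```python
def relative_reduction(baseline, proposed):
    """Fraction by which `proposed` is lower than `baseline`."""
    return (baseline - proposed) / baseline


# Values reported in Table 2.
RMSE_PROPOSED, RMSE_CONVENTIONAL = 0.0350, 0.0536
R2_PROPOSED, R2_CONVENTIONAL = 0.9414, 0.8622

rmse_reduction = relative_reduction(RMSE_CONVENTIONAL, RMSE_PROPOSED)
r2_gain = R2_PROPOSED - R2_CONVENTIONAL

if __name__ == "__main__":
    print(f"RMSE reduction: {rmse_reduction:.1%}")
    print(f"R2 gain: {r2_gain:.4f}")
```

On these reported values, the RMSE of the method of the invention is about 34.7% lower than that of the conventional method, and R² is higher by 0.0792.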

In summary, the method provided by the invention is a hidden-variable-based Bayesian network soft measurement method for complex industrial processes. It can complete soft measurement modeling of a complex industrial process, predict quality variables such as butane content, and improve the prediction accuracy of soft measurement to a certain extent.
