Method for diagnosing wind turbine faults with an auto-associative neural network

Document No.: 696713 | Publication date: 2021-05-04

Reading note: this technology, "Method for diagnosing wind turbine faults with an auto-associative neural network" (一种自联想神经网络诊断风机故障方法), was designed by 武鑫, 王立鹏, 吕佃顺, 李练兵, 李思佳, 李政宇 and 陈伟光 on 2020-12-31. Main content: A method for diagnosing wind turbine faults with an auto-associative neural network (AANN). The acquired data are normalized, outliers are removed, and the data are divided into training data and test data. The input and output nodes and the hidden-layer structure of the AANN are determined from the input and output sample data. A selection operator for pitch-system fault diagnosis is designed that selects better individuals while preserving population diversity, which keeps the network weights and thresholds from being trapped in local optima and improves the fault diagnosis accuracy. A fitness function suited to pitch-system fault diagnosis is designed that introduces the area under the receiver operating characteristic curve (AUC), to keep the missed-diagnosis rate low and to reduce the effect of imbalanced data on the diagnosis. A normal-operation model of the pitch system is built with the improved adaptive genetic algorithm and the auto-associative neural network (IAGA-AANN), and the Jensen-Shannon divergence is used to compare how far the residual distribution at fault time deviates from that at normal time, thereby diagnosing pitch-system faults.

1. A method for diagnosing wind turbine faults with an auto-associative neural network, characterized by comprising the following steps:

Step 1: preprocessing the data acquired by the supervisory control and data acquisition (SCADA) system of a wind turbine, including normalization and outlier removal, and dividing the data into training data and test data; determining the number of input and output nodes of the auto-associative neural network model from the data acquired by the SCADA system;

Step 2: setting a population size N and a maximum number of iterations M, randomly generating N individuals as the initial population, and encoding the weights and thresholds of the auto-associative neural network with real-number coding, wherein each individual contains all weights and thresholds of the network together with their position information;

Step 3: decoding each individual to obtain the weights and thresholds of the auto-associative neural network; training the network and calculating the fitness value of each individual;

Step 4: sorting the individuals in descending order of fitness, selecting individuals from the population with the improved selection operator, and applying crossover and mutation, with adaptive crossover and mutation probabilities, to all selected individuals except those kept by the elite retention strategy;

Step 5: judging whether the fitness value meets the requirement or the maximum number of generations has been reached; if so, outputting the optimal individual, otherwise returning to step 3;

Step 6: decoding the optimal individual, assigning its weights and thresholds to the parameter space of the auto-associative neural network, fine-tuning the parameters by training, inputting the test set into the fault diagnosis model to obtain the residual data and their distribution, analyzing the residuals, and verifying the accuracy and efficiency of the algorithm.

2. The method for diagnosing wind turbine faults according to claim 1, wherein the selection operator in step 2 is applied as follows:

1) determining the initial population and calculating the fitness value of each individual with the fitness function;

2) sorting the individuals in the population in descending order of fitness;

3) determining the number n of excellent individuals to retain from the current and total numbers of iterations, selecting the remaining individuals by roulette wheel selection, and then applying crossover and mutation to the selected individuals;

the number n of excellent individuals to retain is calculated as follows:

where P is the maximum proportion of excellent individuals in the whole population, N is the total number of individuals in the population, $x_{max}$ is the maximum number of generations, n is the number of excellent individuals, x is the current generation, and e is the base of the natural logarithm;

4) performing the elite retention operation: comparing the best individual of the previous generation with that of the current generation, taking the one with the higher fitness as the best individual of the current generation, and using it to replace the worst individual of the current generation.

3. The method for diagnosing wind turbine faults according to claim 1, wherein the fitness function suited to fault diagnosis of the wind turbine pitch system in step 3 is designed as follows:

taking the absolute value of the difference between the output value $\hat{y}$ of the auto-associative neural network and the desired output value $y$ as the objective function $J_m$, and taking the reciprocal of the objective function as the fitness function; the objective function $J_m$ is calculated as:

$$J_m = \sum_{i=1}^{m} \left| \hat{y}_i - y_i \right|$$

where m is the number of output nodes of the auto-associative neural network;

adding the area under the receiver operating characteristic (ROC) curve (AUC) of the auto-associative neural network to the fitness function of the existing genetic algorithm;

the fitness function $f(x_i)$ is calculated as:

where ζ is a very small value that prevents the denominator from being 0, k is an adjustment coefficient, $f(x_i)$ is the improved fitness function, $J_m$ is the objective function, and AUC is the area under the receiver operating characteristic (ROC) curve;

decoding each individual into the weights and thresholds of the auto-associative neural network (AANN) and inputting the training data to obtain the AUC value, the area under the ROC curve.

4. The method for diagnosing wind turbine faults according to claim 3, wherein the objective function $J_m$ is the absolute value of the difference between the output value $\hat{y}$ of the auto-associative neural network and the desired output value $y$, and is calculated as:

$$J_m = \sum_{i=1}^{m} \left| \hat{y}_i - y_i \right|$$

where m is the number of output nodes of the auto-associative neural network.

5. The method for diagnosing wind turbine faults according to claim 1, wherein in step 4 the Jensen-Shannon (JS) divergence is a variant of the Kullback-Leibler (KL) divergence, and the KL divergence is calculated as:

$$KL(p \,\|\, q) = \sum_{i=1}^{n} p(x_i) \log \frac{p(x_i)}{q(x_i)}$$

the residual range obtained from the wind turbine fault diagnosis is divided into n equal intervals, and the number of residuals falling into each interval is counted to obtain the residual probability distributions $p(x_i) = x_i / s_p$ and $q(x_i) = x_i / s_q$; where $p(x_i)$ is the proportion of residuals falling into interval i during normal operation of the pitch system, $s_p$ is the total number of residuals over all intervals during normal operation, $q(x_i)$ is the proportion of residuals falling into interval i at fault time, $s_q$ is the total number of residuals over all intervals at fault time, $x_i$ is the number of residuals falling into the i-th interval, n is the number of equal intervals into which the residual range of the model output is divided, and i indexes the intervals;

compared with the Kullback-Leibler (KL) divergence, the Jensen-Shannon (JS) divergence judges similarity more accurately, and the JS divergence $JS(p \,\|\, q)$ is calculated as:

$$JS(p \,\|\, q) = \frac{1}{2} KL\!\left(p \,\Big\|\, \frac{p+q}{2}\right) + \frac{1}{2} KL\!\left(q \,\Big\|\, \frac{p+q}{2}\right)$$

where p is the true distribution of the data and q is the fitted distribution of the data.

Technical Field

The invention relates to a wind turbine fault diagnosis method.

Background

The pitch system is an important part of the wind turbine control system. It directly affects how efficiently the turbine uses wind energy and, through its aerodynamic braking function, is also responsible for the turbine's safety under extreme conditions. Strong wind, hail, and continually changing wind speed and direction degrade the performance and service life of the pitch system and raise its failure rate, making it one of the most fault-prone parts of the turbine. The pitch system is also installed tens of meters above the ground, so maintenance is difficult and costly, which makes research on fault diagnosis of the pitch system highly significant. Current pitch-system fault diagnosis methods mainly select wind speed, wind direction, pitch angle, and motor speed as parameters and build a fault prediction model of the pitch system, for example the paper "Research on a wind power pitch fault prediction method based on the SCADA system" [J] (Xiaocheng, Liuwujun, Zhangieu. For systems such as the pitch system, for which an accurate mathematical model is hard to establish and which have many operating variables, artificial neural networks work well for fault diagnosis. However, the initial parameters of the network are hard to determine, and random initial parameters easily lead the network to a local optimum, which affects the convergence speed and the accuracy of the result.

How to improve the convergence speed on large samples and strengthen global optimization therefore remains a technical problem to be solved.

Disclosure of Invention

To address the shortcomings of the prior art, the invention provides a fault diagnosis method for the wind turbine pitch system based on an auto-associative neural network. The method improves the existing adaptive genetic algorithm in two respects, the selection operator and the fitness function, to raise its search efficiency and global optimization capability and to keep local optima of the neural network from degrading the speed and accuracy of pitch-system fault diagnosis.

The method uses an improved adaptive genetic algorithm (IAGA) to optimize the initial weights and thresholds of an auto-associative neural network (AANN), obtains the residual distribution of the pitch system in its normal state from the AANN, calculates the offset between the residual distributions at normal and fault times with the Jensen-Shannon (JS) divergence, and judges whether the pitch system has failed.

The fault diagnosis method for the wind turbine pitch system based on the improved adaptive genetic algorithm and auto-associative neural network model comprises the following steps:

Step 1: The data collected by the supervisory control and data acquisition (SCADA) system of the wind turbine are preprocessed, including normalization and outlier removal, and divided into training data and test data. The number of input and output nodes of the auto-associative neural network model is determined from the data acquired by the SCADA system.

Step 2: A population size N and a maximum number of iterations M are set, and N individuals are randomly generated as the initial population. The weights and thresholds of the auto-associative neural network are encoded with real-number coding, and each individual contains all weights and thresholds of the network together with their position information.

Step 3: Each individual is decoded to obtain the weights and thresholds of the auto-associative neural network; the network is trained and the fitness value of each individual is calculated.

Step 4: The individuals are sorted in descending order of fitness and then selected from the population with the improved selection operator. Crossover and mutation, with adaptive crossover and mutation probabilities, are applied to all selected individuals except those kept by the elite retention strategy.

Step 5: Whether the fitness value meets the requirement or the maximum number of generations has been reached is judged; if so, the optimal individual is output, otherwise the method returns to step 3.

Step 6: The optimal individual is decoded and its weights and thresholds are assigned to the parameter space of the auto-associative neural network, whose parameters are then fine-tuned by training. The test set is input into the fault diagnosis model to obtain the residual data and their distribution, the residuals are analyzed, and the accuracy and efficiency of the algorithm are verified.
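The real-number coding of steps 2 and 3 can be sketched as a flat vector that is sliced back into per-layer weights and thresholds. This is a minimal illustration, not the patented implementation: the hidden-layer sizes (10, 4, 10), the population size of 20, the initialization range, and the function names `encode`, `decode`, and `chromosome_length` are all assumptions; only the 15 input and output nodes come from the text.

```python
import numpy as np

def chromosome_length(sizes):
    """Number of genes: one per weight and one per threshold (bias) in every layer."""
    return sum(n_out * n_in + n_out for n_in, n_out in zip(sizes[:-1], sizes[1:]))

def decode(chromosome, sizes):
    """Slice a flat real-coded individual back into (W, b) pairs, layer by layer.

    The position of each gene in the vector is what ties it to a specific
    weight or threshold of the network.
    """
    params, pos = [], 0
    for n_in, n_out in zip(sizes[:-1], sizes[1:]):
        W = chromosome[pos:pos + n_out * n_in].reshape(n_out, n_in)
        pos += n_out * n_in
        b = chromosome[pos:pos + n_out]
        pos += n_out
        params.append((W, b))
    return params

def encode(params):
    """Inverse of decode: flatten every (W, b) pair into one real-valued vector."""
    return np.concatenate([np.concatenate([W.ravel(), b]) for W, b in params])

# Hypothetical AANN layout: input, mapping, bottleneck, de-mapping, output.
sizes = [15, 10, 4, 10, 15]
rng = np.random.default_rng(0)
# Initial population of N = 20 random individuals (N and the range are assumed).
population = rng.uniform(-1.0, 1.0, size=(20, chromosome_length(sizes)))
params = decode(population[0], sizes)
```

Because `encode(decode(v, sizes))` returns `v` unchanged, crossover and mutation can operate on the flat vectors while the network only ever sees decoded matrices.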

In detail:

In step 1, the data related to the operation of the pitch system are selected from the data recorded by the supervisory control and data acquisition (SCADA) system of the wind turbine. They comprise 15 operating parameters: the current wind speed, the active power of the wind turbine, the blade rotational speed, the generator rotational speed, the pitch angle and pitch drive current of each of the 3 blades, the bearing temperature of the pitch system, the servo motor temperature, and the insulated gate bipolar transistor (IGBT) temperature of each of the 3 blades. The auto-associative neural network therefore has 15 input nodes and 15 output nodes.

Invalid records, such as those with zero power, missing values, or repeated entries, are removed from the data, and the remaining data are normalized to obtain the sample set of the wind turbine pitch system. The normalization formula is:

$$x'_i = \frac{x_i - x_{min}}{x_{max} - x_{min}}$$

where $x_i$ is the input data, $x_{min}$ and $x_{max}$ are the minimum and maximum values of the data, and $x'_i$ is the normalized data.
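As a small illustration of the normalization formula, the sketch below scales each variable (column) to the range [0, 1]. Applying the minimum and maximum per column and mapping a constant column to 0 are assumptions made here; the text does not say how those cases are handled.

```python
import numpy as np

def min_max_normalize(data):
    """Scale each column of `data` to [0, 1] with its own minimum and maximum."""
    data = np.asarray(data, dtype=float)
    col_min = data.min(axis=0)
    col_max = data.max(axis=0)
    span = col_max - col_min
    # Assumption: a constant column (span == 0) is mapped to 0 rather than dividing by zero.
    safe_span = np.where(span == 0, 1.0, span)
    return (data - col_min) / safe_span

# Three made-up samples of two variables, for illustration only.
samples = np.array([[3.0, 100.0],
                    [6.0, 400.0],
                    [9.0, 700.0]])
normalized = min_max_normalize(samples)
```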

In step 1, the auto-associative neural network model is as follows:

The auto-associative neural network extracts the main features of the pitch-system operating data and the correlations among them, and embeds the normal operating pattern of the data in the network. Comparing the residual distribution of data from degraded or faulty periods with that of normal operating data then reflects the health of the pitch system more accurately and intuitively.

In the auto-associative neural network, the path from the input layer through the mapping layer to the bottleneck layer is the encoding process; the path from the bottleneck layer through the mapping layer to the output layer is the decoding process. First, the high-dimensional data are nonlinearly mapped from the input layer to the bottleneck layer, which compresses the data and extracts the effective feature dimensions.

Let $g(x)$ denote the encoding function and $x_i$ the input, where i is the i-th dimension of x. Encoding the data gives $y_i$:

$$y_i = g_\theta(x_i) = s(W x_i + b)$$

where $s(x)$ is a nonlinear activation function; $\theta = (W, b)$ are the parameters of $g(x)$; W is the weight; and b is the offset.

Then the hidden-layer representation $y_i$ is mapped by a nonlinear function from the bottleneck layer to the output layer, which has the same dimension as the input layer. This decodes the information, that is, reconstructs the input data:

$$\hat{x}_i = h_{\theta'}(y_i) = s(W' y_i + b')$$

where $\theta' = (W', b')$ are the parameters of $h(y)$; $W'$ is the weight; and $b'$ is the offset.

Through this nonlinear compression and reconstruction of the data, the auto-associative neural network model learns the nonlinear relations among the variables. When the input is data from normal operation of the pitch system, the auto-associative neural network (AANN) embeds the features of the normal operating data and the associations among them in the model. Training the AANN seeks the network parameters that minimize the error between input and output, so that the output reproduces the input as closely as possible. When data from a degraded or faulty period are input, the output deviates strongly from the input.

When training the auto-associative neural network (AANN), the training sample data $y_i$ serve as both the input and the expected output of the model. The sum of squared errors E between the expected output and the output $\hat{y}_i$ generated by the model is then calculated as:

$$E = \sum_{j=1}^{m} \sum_{i=1}^{n} \left( y_{ji} - \hat{y}_{ji} \right)^2$$

where n is the number of variables contained in one sample and m is the total number of samples.
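The encoding, decoding, and error computation described above can be sketched as a plain forward pass. Several choices here are assumptions rather than details from the text: `tanh` stands in for the unspecified activation $s$, the same activation is used on every layer including the output, the hidden sizes (10, 4, 10) are hypothetical, and the weights are random instead of trained.

```python
import numpy as np

def forward(x, params):
    """Apply a = s(W a + b) layer by layer; s = tanh is an assumed activation."""
    a = x
    for W, b in params:
        a = np.tanh(a @ W.T + b)
    return a

def sum_squared_error(x, params):
    """Use x as both input and expected output; return E and the residual matrix."""
    residual = x - forward(x, params)
    return float(np.sum(residual ** 2)), residual

rng = np.random.default_rng(0)
sizes = [15, 10, 4, 10, 15]  # 15 inputs/outputs from the text; hidden sizes assumed
params = [(rng.normal(0.0, 0.1, (n_out, n_in)), np.zeros(n_out))
          for n_in, n_out in zip(sizes[:-1], sizes[1:])]
x = rng.uniform(0.0, 1.0, size=(8, 15))  # 8 made-up normalized samples
E, residual = sum_squared_error(x, params)
```

The `residual` matrix has one row per sample and one column per variable; residuals of this kind are what the later divergence analysis turns into a distribution.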

In step 4, the selection operator picks individuals with high fitness, according to the fitness function, to reproduce and generate the next generation. Commonly used selection operators include roulette wheel selection, rank-based selection, and the elite retention strategy. Roulette wheel selection has random errors and may eliminate individuals with high fitness; rank-based selection and elite retention keep excellent individuals but may damage population diversity, so the algorithm is prone to premature convergence. The improved selection operator is designed to avoid both problems, so that faults of the wind turbine pitch system can be discovered in time and maintenance staff can be scheduled sensibly, which improves the reliability of the wind turbine and the economic return of the wind farm.

Roulette wheel selection proceeds as follows:

1) calculate the fitness value of each individual;

2) calculate the selection probability and the cumulative probability of each individual;

3) generate a random number r in [0, 1];

4) compare the cumulative probabilities with r in order and select the first individual whose cumulative probability is greater than r.
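The four roulette steps map directly onto a cumulative sum and a sorted search. The sketch below assumes non-negative fitness values with a positive total; the function name and the example fitness values are illustrative.

```python
import numpy as np

def roulette_select(fitness, rng):
    """Return the index of one individual chosen by roulette wheel selection."""
    fitness = np.asarray(fitness, dtype=float)
    prob = fitness / fitness.sum()          # step 2: selection probability
    cumulative = np.cumsum(prob)            # step 2: cumulative probability
    r = rng.random()                        # step 3: random number in [0, 1)
    # Step 4: first index whose cumulative probability is greater than r.
    idx = int(np.searchsorted(cumulative, r, side="right"))
    # Guard against floating-point rounding leaving cumulative[-1] slightly below 1.
    return min(idx, len(fitness) - 1)

rng = np.random.default_rng(0)
picks = [roulette_select([1.0, 2.0, 7.0], rng) for _ in range(2000)]
```

Individuals are drawn in proportion to fitness, so a fit individual can still be missed in any single draw; this is the random error mentioned above.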

The selection operator of the invention proceeds as follows:

1) determine the initial population and calculate the fitness value of each individual with the fitness function;

2) sort the individuals in the population in descending order of fitness;

3) determine the number n of excellent individuals to retain from the current and total numbers of iterations, select the remaining individuals with steps 3) and 4) of roulette wheel selection, and then apply crossover and mutation to the selected individuals.

The number n of excellent individuals to retain is calculated as follows:

where P is the maximum proportion of excellent individuals in the whole population, N is the total number of individuals in the population, $x_{max}$ is the maximum number of generations, n is the number of excellent individuals, x is the current generation, and e is the base of the natural logarithm.

4) perform the elite retention operation: compare the best individual of the previous generation with that of the current generation, take the one with the higher fitness as the best individual of the current generation, and use it to replace a low-fitness individual of the current generation.
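The combination of retained excellent individuals and roulette selection can be sketched as follows. The formula for n is not reproduced in this text, so `n_elite` is simply passed in as a parameter here; the function name and the example data are hypothetical.

```python
import numpy as np

def select_with_elites(population, fitness, n_elite, rng):
    """Keep the n_elite fittest individuals and draw the rest by roulette.

    Returns (elites, drawn): elites are retained unchanged, while the drawn
    individuals are the candidates for crossover and mutation.
    """
    fitness = np.asarray(fitness, dtype=float)
    order = np.argsort(fitness)[::-1]            # indices sorted by descending fitness
    elite_idx = order[:n_elite]
    n_rest = len(population) - n_elite
    prob = fitness / fitness.sum()               # roulette probabilities
    drawn_idx = rng.choice(len(population), size=n_rest, p=prob)
    return population[elite_idx], population[drawn_idx]

rng = np.random.default_rng(1)
population = np.arange(12).reshape(6, 2)         # 6 made-up individuals with 2 genes each
fitness = [0.1, 0.9, 0.3, 0.5, 0.2, 0.7]
elites, drawn = select_with_elites(population, fitness, n_elite=2, rng=rng)
```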

The crossover and mutation probabilities are calculated as follows:

where $P_{c1}$ and $P_{c2}$ are the upper and lower limits of the crossover rate, and $P_{m1}$ and $P_{m2}$ are the upper and lower limits of the mutation rate; $f_{max}$ is the maximum fitness value among the individuals of each generation; $f_{avg}$ is the average fitness value of the current population; $f'$ is the larger fitness value of the two individuals to be crossed; and $f$ is the fitness value of the individual to be mutated.
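The formulas themselves are not reproduced in this text. One commonly used linear form that is consistent with the symbols defined above is sketched below, purely as an assumption: individuals below the average fitness receive the upper limit, and the rate falls linearly to the lower limit as the fitness approaches $f_{max}$. The numeric limits in the example are made up.

```python
def adaptive_rate(f, f_avg, f_max, upper, lower):
    """Assumed linear adaptive rate for crossover (pass f') or mutation (pass f)."""
    # Below-average individuals, or a population with no fitness spread, get the upper limit.
    if f < f_avg or f_max == f_avg:
        return upper
    # Above-average individuals get a rate that shrinks towards the lower limit.
    return upper - (upper - lower) * (f - f_avg) / (f_max - f_avg)

# Crossover probability for a pair whose larger fitness f' is 0.8.
pc = adaptive_rate(f=0.8, f_avg=0.5, f_max=1.0, upper=0.9, lower=0.6)
# Mutation probability for the best individual of the generation.
pm = adaptive_rate(f=1.0, f_avg=0.5, f_max=1.0, upper=0.1, lower=0.001)
```

Under this form, good individuals are disturbed less while poor ones are recombined and mutated more aggressively.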

The fitness function used by the improved selection operator in steps 3 and 4 is calculated as follows:

The absolute value of the difference between the output value $\hat{y}$ of the auto-associative neural network and the desired output value $y$ is taken as the objective function $J_m$, and the reciprocal of the objective function is taken as the fitness function. The objective function $J_m$ is calculated as:

$$J_m = \sum_{i=1}^{m} \left| \hat{y}_i - y_i \right|$$

where m is the number of output nodes of the auto-associative neural network.

The genetic algorithm evolves in the direction of increasing fitness, whereas a smaller error of the auto-associative neural network model means a better model, which is why the reciprocal of the objective function is used as the fitness. The fault diagnosis model of the wind turbine pitch system also emphasizes diagnostic accuracy, so the area under the receiver operating characteristic curve (AUC) is added to the fitness function.

The fitness function $f(x_i)$ is calculated as follows:

where ζ is a very small value that prevents the denominator from being 0; k is an adjustment coefficient; $f(x_i)$ is the improved fitness function; $J_m$ is the objective function; and AUC is the area under the receiver operating characteristic (ROC) curve.

After an individual is decoded into the weights and thresholds of the auto-associative neural network and the training data are input, the AUC value can be computed, which reduces the misdiagnosis rate of the IAGA-AANN fault diagnosis model for the wind turbine pitch system.
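Since the fitness formula is not reproduced in this text, the sketch below only illustrates how its named ingredients could fit together. The additive combination `1 / (J_m + zeta) + k * AUC` is an assumption made for illustration and may differ from the patented formula; the objective follows the description of $J_m$ as absolute output errors summed over the m output nodes.

```python
import numpy as np

def objective(y_hat, y):
    """J_m: absolute differences between network output and desired output, summed."""
    return float(np.sum(np.abs(np.asarray(y_hat, dtype=float) - np.asarray(y, dtype=float))))

def fitness(j_m, auc, k=1.0, zeta=1e-8):
    """ASSUMED form: reciprocal of the objective plus a weighted AUC term.

    zeta keeps the denominator away from 0 and k scales the AUC contribution;
    the exact way the source combines these terms is not known here.
    """
    return 1.0 / (j_m + zeta) + k * auc

j_m = objective([0.2, 0.5], [0.1, 0.7])   # made-up two-node example
score = fitness(j_m, auc=0.9, k=2.0)
```

Whatever the exact form, the intent described in the text is that fitness rises both when the reconstruction error falls and when the AUC rises.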

The area under the receiver operating characteristic (ROC) curve (AUC) is calculated as follows:

The calculation of the AUC relies on indicators defined through the confusion matrix, as shown in Table 1:

TABLE 1 Confusion matrix

                      Diagnosed: fault        Diagnosed: no fault
Actual: fault         TP (true positive)      FN (false negative)
Actual: no fault      FP (false positive)     TN (true negative)

In the table, TP (true positive) is an event that the model diagnoses as a fault and that actually is a fault; FN (false negative) is an event that the model diagnoses as fault-free but that actually is a fault; FP (false positive) is an event that the model diagnoses as a fault but that actually is fault-free; TN (true negative) is an event that the model diagnoses as fault-free and that actually is fault-free.

From these definitions, the following metrics are obtained:

Accuracy is the proportion of events that the model diagnoses correctly; when positive and negative samples are balanced, a higher accuracy means a better diagnosis. It is calculated as:

$$Accuracy = \frac{TP + TN}{TP + FN + FP + TN}$$

The true positive rate (TPR) is the proportion of all actual fault events that the diagnostic model identifies correctly:

$$TPR = \frac{TP}{TP + FN}$$

The false positive rate (FPR) is the proportion of all actually fault-free events that the diagnostic model wrongly judges to be faults:

$$FPR = \frac{FP}{FP + TN}$$

With the FPR on the horizontal axis and the TPR on the vertical axis, each decision threshold yields one (FPR, TPR) point. Connecting the points for all thresholds gives the ROC curve, and the AUC is the area enclosed by the curve and the horizontal axis.
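The threshold sweep described above can be sketched with NumPy alone. The score and label values are made up, and treating a sample as "fault" when its score is at or above the threshold is an assumption about how the model output is turned into a diagnosis.

```python
import numpy as np

def roc_points(scores, labels):
    """Return (FPR, TPR) arrays, one point per threshold, from strict to lenient.

    Both classes must be present in `labels`, otherwise a rate is undefined.
    """
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels, dtype=bool)
    # Thresholds in descending order; +inf yields the (0, 0) starting point.
    thresholds = np.concatenate(([np.inf], np.unique(scores)[::-1]))
    fpr, tpr = [], []
    for t in thresholds:
        predicted = scores >= t                  # diagnosed as fault
        tp = np.sum(predicted & labels)
        fp = np.sum(predicted & ~labels)
        fn = np.sum(~predicted & labels)
        tn = np.sum(~predicted & ~labels)
        tpr.append(tp / (tp + fn))
        fpr.append(fp / (fp + tn))
    return np.array(fpr), np.array(tpr)

def area_under_curve(fpr, tpr):
    """Trapezoidal area between the ROC curve and the horizontal axis."""
    return float(np.sum(np.diff(fpr) * (tpr[1:] + tpr[:-1]) / 2.0))

scores = [0.1, 0.4, 0.35, 0.8]   # made-up fault scores
labels = [0, 0, 1, 1]            # 1 = actual fault
fpr, tpr = roc_points(scores, labels)
auc = area_under_curve(fpr, tpr)
```

An AUC of 1 means faults always score higher than healthy samples, while 0.5 corresponds to random ranking, which is why the metric is insensitive to the imbalance between fault and normal samples.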

In step 6, using the IAGA-AANN method, normal operating data of the wind turbine are taken as input to obtain the model's output residuals, and the frequency of each residual is counted to obtain the residual probability distribution. With this distribution as the reference, when fault data of the wind turbine are input, the output residuals of the model and their probability distribution differ from the normal case. Whether the wind turbine has failed is judged by comparing the degree of difference, that is, the offset, between the two residual probability distributions. The model threshold is set from this offset so that it covers all fault conditions as far as possible while excluding healthy states. An offset above the threshold indicates a failure of a key component of the wind turbine.

The invention analyzes the residual data and their distribution with the Jensen-Shannon (JS) divergence, a variant of the Kullback-Leibler (KL) divergence. The KL divergence is calculated as:

$$KL(p \,\|\, q) = \sum_{i=1}^{n} p(x_i) \log \frac{p(x_i)}{q(x_i)}$$

The residual range output by the auto-associative neural network model in step 1 is divided into n equal intervals, and the number of residuals falling into each interval is counted to obtain the residual probability distributions $p(x_i) = x_i / s_p$ and $q(x_i) = x_i / s_q$. Here $p(x_i)$ is the proportion of residuals falling into interval i during normal operation of the pitch system, $s_p$ is the total number of residuals over all intervals during normal operation, $q(x_i)$ is the proportion of residuals falling into interval i at fault time, $s_q$ is the total number of residuals over all intervals at fault time, $x_i$ is the number of residuals falling into the i-th interval, n is the number of equal intervals into which the residual range of the model output is divided, and i indexes the intervals.

Compared with the Kullback-Leibler (KL) divergence, the Jensen-Shannon (JS) divergence judges similarity more accurately. The JS divergence $JS(p \,\|\, q)$ is calculated as:

$$JS(p \,\|\, q) = \frac{1}{2} KL\!\left(p \,\Big\|\, \frac{p+q}{2}\right) + \frac{1}{2} KL\!\left(q \,\Big\|\, \frac{p+q}{2}\right)$$

where p is the true distribution of the data and q is the fitted distribution of the data.
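The binning and the two divergences can be sketched as follows. The natural logarithm, the bin edges, and the convention that terms with $p(x_i) = 0$ contribute nothing are assumptions; the residual values are synthetic.

```python
import numpy as np

def residual_distribution(residuals, edges):
    """Count residuals per interval and divide by the total count (x_i / s)."""
    counts, _ = np.histogram(residuals, bins=edges)
    return counts / counts.sum()

def kl_divergence(p, q):
    """KL(p || q) with natural log; intervals where p is 0 contribute nothing."""
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    mask = p > 0
    return float(np.sum(p[mask] * np.log(p[mask] / q[mask])))

def js_divergence(p, q):
    """JS(p || q): average KL of p and q against their mixture (p + q) / 2."""
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    m = 0.5 * (p + q)
    return 0.5 * kl_divergence(p, m) + 0.5 * kl_divergence(q, m)

rng = np.random.default_rng(0)
edges = np.linspace(-0.3, 0.3, 21)   # n = 20 equal intervals (assumed range)
p = residual_distribution(rng.normal(0.0, 0.02, 500), edges)    # synthetic normal-time residuals
q = residual_distribution(rng.normal(0.15, 0.02, 500), edges)   # synthetic fault-time residuals
offset = js_divergence(p, q)
```

Unlike the KL divergence, the JS divergence is symmetric and stays finite even when one distribution has empty intervals, because the mixture is nonzero wherever either distribution is; with the natural logarithm it is bounded by ln 2.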

In step 6, the normal-operation model of the wind turbine pitch system is an auto-associative neural network model trained on normal operating data, which can recognize each working state of the pitch system during normal operation. From the test set, SCADA data recorded around some pitch-system fault times, together with normal operating data before and after them, are selected as input to the IAGA-AANN model. The Jensen-Shannon (JS) divergence is used to calculate the offset between the residual distribution output by the model and the normal-operation distribution under the corresponding working condition; balancing sensitivity against model accuracy, the sliding window is set to 10. The threshold is set at the smallest offset observed between the residual distribution of the fault data and the normal-operation distribution.
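A sliding-window version of the offset calculation might look like the sketch below. The text gives the window value 10 but not its unit, so treating it as 10 consecutive residual samples is an assumption, as are the bin edges, the clipping of out-of-range residuals, and the synthetic data. The threshold follows the stated rule of taking the smallest offset observed on fault data.

```python
import numpy as np

def distribution(residuals, edges):
    """Histogram-based probability distribution; out-of-range values are clipped."""
    clipped = np.clip(residuals, edges[0], edges[-1])
    counts, _ = np.histogram(clipped, bins=edges)
    return counts / counts.sum()

def js_divergence(p, q):
    """Jensen-Shannon divergence with natural log."""
    m = 0.5 * (p + q)
    def kl(a, b):
        mask = a > 0
        return np.sum(a[mask] * np.log(a[mask] / b[mask]))
    return float(0.5 * kl(p, m) + 0.5 * kl(q, m))

def windowed_offsets(residuals, reference, edges, window=10):
    """JS offset of every window of consecutive residuals against the reference."""
    return np.array([
        js_divergence(distribution(residuals[s:s + window], edges), reference)
        for s in range(len(residuals) - window + 1)
    ])

rng = np.random.default_rng(0)
edges = np.linspace(-0.3, 0.3, 21)
normal_res = rng.normal(0.0, 0.02, 500)   # synthetic normal-operation residuals
fault_res = rng.normal(0.15, 0.02, 50)    # synthetic fault-time residuals
reference = distribution(normal_res, edges)
normal_offsets = windowed_offsets(normal_res, reference, edges)
fault_offsets = windowed_offsets(fault_res, reference, edges)
threshold = fault_offsets.min()           # smallest offset seen on fault data
```

A shorter window reacts faster to a developing fault but gives a noisier distribution estimate, which is the trade-off between sensitivity and accuracy mentioned in the text.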

Drawings

FIG. 1 is the overall structure of the wind turbine fault diagnosis model based on the improved adaptive genetic algorithm and auto-associative neural network;

FIG. 2 is the structure of the auto-associative neural network (AANN);

FIG. 3 is the structure of the improved adaptive genetic algorithm (IAGA) of the invention;

FIG. 4 shows the best individual fitness curves of three optimization algorithms: the improved adaptive genetic algorithm (IAGA) of the invention, the adaptive genetic algorithm (AGA), and the genetic algorithm (GA);

FIG. 5 shows the training error curves of the IAGA-AANN model of the invention, the AGA-AANN model, and the GA-AANN model;

FIG. 6 is a scatter plot of the residual distribution of part of the training data;

FIG. 7 is the offset curve of the IAGA-AANN fault diagnosis model of the invention.

Detailed Description

The invention is further described below with reference to the drawings and the detailed description.

The overall structure of the fault diagnosis model for the wind turbine pitch system, based on the improved adaptive genetic algorithm and auto-associative neural network, is shown in FIG. 1. The method comprises the following steps:

Step 1: The data acquired by the supervisory control and data acquisition (SCADA) system of the wind turbine are preprocessed, including normalization and outlier removal, and divided into training data and test data. The number of input and output nodes of the auto-associative neural network model is determined from the data acquired by the SCADA system.

Step 2: A population size N and a maximum number of iterations M are set, and N individuals are randomly generated as the initial population. The weights and thresholds of the auto-associative neural network are encoded with real-number coding, and each individual contains all weights and thresholds of the network together with their position information.

Step 3: Each individual is decoded to obtain the weights and thresholds of the auto-associative neural network; the network is trained and the fitness value of each individual is calculated.

Step 4: The individuals are sorted in descending order of fitness and then selected from the population with the improved selection operator. Crossover and mutation, with adaptive crossover and mutation probabilities, are applied to all selected individuals except those kept by the elite retention strategy.

Step 5: Whether the fitness value meets the requirement or the maximum number of generations has been reached is judged; if so, the optimal individual is output, otherwise the method returns to step 3.

Step 6: The optimal individual is decoded and its weights and thresholds are assigned to the parameter space of the auto-associative neural network, whose parameters are then fine-tuned by training. The test set is input into the fault diagnosis model to obtain the residual data and their distribution, the residuals are analyzed, and the accuracy and efficiency of the algorithm are verified.

The data in the step 1 is from a data acquisition and monitoring System (SCADA) of the wind turbine generator, and data information related to the operation of the pitch system is selected from relevant data recorded by the data acquisition and monitoring system, and the data information comprises 15 operation parameters which are respectively as follows: the wind power generation system comprises the following components of current wind speed, active power of a fan, blade rotating speed, generator rotating speed, pitch angle and pitch variation driving current of 3 blades of the fan, bearing temperature of a pitch variation system, servo motor temperature and Insulated Gate Bipolar Transistor (IGBT) temperature of the 3 blades. Thus, the number of input and output nodes of the self-associative neural network is 15.

Invalid records, such as those with zero power, missing values, or duplicated entries, are removed, and the remaining data are normalized to obtain the sample set of the fan pitch system. The normalization formula is:

$$x_i' = \frac{x_i - x_{min}}{x_{max} - x_{min}}$$

where $x_i$ is the input data, $x_{min}$ and $x_{max}$ are the minimum and maximum values of the input data, respectively, and $x_i'$ is the normalized data.
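The normalization step can be illustrated with a minimal Python sketch. It assumes the usual min-max form implied by $x_{min}$ and $x_{max}$; the function name `min_max_normalize`, the sample readings, and the handling of a constant column are illustrative assumptions, not part of the source.

```python
def min_max_normalize(column):
    """Scale a list of numbers to [0, 1] using (x - min) / (max - min)."""
    lo, hi = min(column), max(column)
    if hi == lo:
        # Assumption: a constant column has no spread, so map it to 0.0.
        return [0.0 for _ in column]
    return [(x - lo) / (hi - lo) for x in column]


# Hypothetical wind speed readings in m/s.
wind_speed = [3.0, 5.0, 11.0, 7.0]
normalized = min_max_normalize(wind_speed)
```

In practice each of the 15 operating parameters would be normalized separately, since their units and ranges differ.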

As shown in Fig. 2, the principle of the self-associative neural network used in step 1 of the invention is as follows:

The self-associative neural network extracts the main features of the pitch system operating data and the correlations among them, and encodes the normal operating pattern into the network. Comparing the residual distribution of data at the time of degradation or failure with that of normal operating data then reflects the health state of the pitch system more accurately and intuitively.

In the self-associative neural network, the path from the input layer through the mapping layer to the bottleneck layer is the encoding process; the path from the bottleneck layer through the demapping layer to the output layer is the decoding process. First, the high-dimensional data are mapped nonlinearly from the input layer to the bottleneck layer, which compresses the data and extracts the effective feature dimensions.

Let $g(x)$ denote the encoding function and $x_i$ the $i$-th dimension of the input $x$. Encoding the data gives $y_i$:

$$y_i = g_\theta(x_i) = s(W x_i + b)$$

where $s(x)$ is a nonlinear activation function, $\theta = (W, b)$ are the parameters of $g(x)$, $W$ is the weight, and $b$ is the offset.

Then the hidden-layer output $y_i$ is mapped by a nonlinear function from the bottleneck layer to the output layer, which has the same dimension as the input layer. This decodes the information, that is, reconstructs the input data:

$$\hat{x}_i = h_{\theta'}(y_i) = s(W' y_i + b')$$

where $\theta' = (W', b')$ are the parameters of $h(y)$, $W'$ is the weight, and $b'$ is the offset.

Through this nonlinear compression and reconstruction, the self-associative neural network (AANN) model learns the nonlinear relations among the variables. When trained on data from normal operation of the pitch system, the AANN encodes the features of the normal operating data and the associations among them into the model. Training seeks the network parameters that minimize the error between input and output, so that the output reproduces the input as closely as possible. When data from a time of degradation or failure are input, the output therefore deviates considerably from the input.
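The encode-decode pass described above can be sketched in plain Python. This is only an illustration under stated assumptions: the layer sizes 15-9-5-9-15 follow the embodiment, but the tanh activation, the linear output layer, the random weights, and all function names (`dense`, `random_layer`, `aann_forward`) are hypothetical choices made for the example.

```python
import math
import random


def dense(inputs, weights, biases, activation):
    """One layer: activation(W x + b), where weights is a list of rows."""
    return [activation(sum(w * x for w, x in zip(row, inputs)) + b)
            for row, b in zip(weights, biases)]


def random_layer(n_out, n_in, rng):
    """Untrained weights and offsets, drawn uniformly from [-0.5, 0.5]."""
    weights = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_out)]
    biases = [rng.uniform(-0.5, 0.5) for _ in range(n_out)]
    return weights, biases


def aann_forward(x, layers):
    """Return (bottleneck code, reconstruction) for one input sample."""
    hidden, code = x, None
    for idx, (weights, biases) in enumerate(layers):
        last = idx == len(layers) - 1
        # Assumption: tanh in hidden layers, identity in the output layer.
        activation = (lambda v: v) if last else math.tanh
        hidden = dense(hidden, weights, biases, activation)
        if idx == 1:  # second layer is the bottleneck
            code = hidden
    return code, hidden


rng = random.Random(0)
sizes = [15, 9, 5, 9, 15]  # input, mapping, bottleneck, demapping, output
layers = [random_layer(sizes[i + 1], sizes[i], rng) for i in range(4)]
x = [rng.random() for _ in range(15)]  # one normalized sample (made up)
code, x_hat = aann_forward(x, layers)
residual = [a - b for a, b in zip(x, x_hat)]
```

With trained weights, `residual` would be small for normal operating data and larger for data from degradation or failure.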

When training the self-associative neural network (AANN), the training sample data $y$ serve as both the model input and the desired output. From the outputs $\hat{y}$ generated by the model, the sum of squared errors $E$ is calculated as:

$$E = \sum_{j=1}^{m} \sum_{i=1}^{n} \left( y_{ij} - \hat{y}_{ij} \right)^2$$

where $n$ is the number of variables in one sample and $m$ is the total number of samples.
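As a small worked illustration of the training error, the following sketch sums the squared differences over all samples and all variables. It assumes a plain sum with no scaling constant; the function name and the sample values are made up for the example.

```python
def sum_squared_error(targets, outputs):
    """Sum of (y - y_hat)^2 over every sample and every variable."""
    return sum((y - y_hat) ** 2
               for sample, recon in zip(targets, outputs)
               for y, y_hat in zip(sample, recon))


# Two samples (m = 2) with two variables each (n = 2); values are made up.
targets = [[0.2, 0.4], [0.6, 0.8]]
outputs = [[0.2, 0.5], [0.5, 0.8]]
error = sum_squared_error(targets, outputs)  # 0.1^2 + 0.1^2, about 0.02
```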

The genetic algorithm (GA) is a global probabilistic search and optimization algorithm modeled on genetics and the theory of biological evolution in nature. The potential solutions of a problem are treated as a population, a fitness function serves as the evaluation criterion, and three genetic operators (selection, crossover, and mutation) generate the next generation of individuals. The population evolves iteratively until the optimal individual is found. Compared with other algorithms, the standard genetic algorithm offers strong adaptability and global optimization, but it has poor local search capability, is time-consuming to train, searches inefficiently in the later stage of evolution, and is prone to premature convergence, all of which reduce the accuracy of fan fault diagnosis.

To address these problems, the invention improves the selection operator and the fitness function and proposes an improved adaptive genetic algorithm (IAGA) that raises the search efficiency and local search capability of the original algorithm and avoids premature convergence. As shown in Fig. 3, the algorithm is as follows:

1) Selection operator: The selection operation picks individuals with high fitness, according to the fitness function, to reproduce and generate the next generation. Commonly used selection operators include the roulette method, the ranking method, and the elite reservation strategy. The roulette method is subject to random error and may eliminate individuals with high fitness. The ranking method and the elite reservation strategy keep excellent individuals but may damage population diversity, so the algorithm easily falls into premature convergence. Improving the selection operator therefore also helps faults of the fan pitch system to be discovered in time, so that maintenance staff can be scheduled sensibly, the reliability of the wind turbine generator is improved, and the economic return of the wind farm is increased. The roulette method has the following steps:

(1) calculate the fitness value of each individual;

(2) calculate the selection probability and the cumulative probability of each individual;

(3) generate a random number r in [0,1];

(4) compare the cumulative probabilities with r in order and select the first individual whose cumulative probability is larger than r.
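The four roulette steps can be sketched as follows. This is a minimal illustration that assumes all fitness values are positive; the function name `roulette_select` and the sample fitness values are not from the source.

```python
import random
from itertools import accumulate


def roulette_select(fitness, rng):
    """Return the index of one individual chosen by roulette selection."""
    total = sum(fitness)
    probs = [f / total for f in fitness]     # step (2): selection probability
    cumulative = list(accumulate(probs))     # step (2): cumulative probability
    r = rng.random()                         # step (3): random number in [0, 1)
    for index, c in enumerate(cumulative):   # step (4): first value larger than r
        if c > r:
            return index
    return len(fitness) - 1                  # guard against rounding in the last sum


rng = random.Random(42)
fitness = [1.0, 3.0, 6.0]                    # step (1): made-up fitness values
picks = [roulette_select(fitness, rng) for _ in range(10000)]
share_of_best = picks.count(2) / len(picks)  # expected to be close to 0.6
```

Because selection is random, an individual with high fitness can still be missed in a given draw, which is the random error mentioned above.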

The steps of the improved selection operator of the invention are as follows:

(1) determine the initial population and calculate the fitness value of each individual according to the fitness function;

(2) sort the individuals in the population in descending order of fitness;

(3) according to the current iteration number x and the maximum number of generations xmax, determine the number n of excellent individuals to retain, select the remaining individuals by roulette, and apply crossover and mutation to the selected individuals. The formula for the number of excellent individuals n is as follows:

where P is the maximum ratio of retained excellent individuals to the total, N is the total number of individuals in the population, and xmax is the maximum number of generations.

(4) perform the elite reservation operation: compare the optimal individual of the previous generation with that of the current generation, take the one with the higher fitness as the optimal individual of the current generation, and use it to replace the worst individual of the current generation. To preserve its genes, this individual enters the next generation directly without crossover or mutation.
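The improved selection can be sketched as below. Because the formula for n is not reproduced in this text, the sketch takes the number of retained individuals as a caller-supplied parameter `n_elite`. Filling the remaining slots by roulette over the whole population is also an assumption, as are the function name and the toy population.

```python
import random


def improved_select(population, fitness, n_elite, rng):
    """Keep the n_elite fittest individuals and fill the rest by roulette."""
    order = sorted(range(len(population)), key=lambda i: fitness[i], reverse=True)
    elites = [population[i] for i in order[:n_elite]]
    # Assumption: remaining slots are drawn with fitness-proportional
    # probability from the whole population.
    selected = rng.choices(population, weights=fitness,
                           k=len(population) - n_elite)
    return elites, selected


rng = random.Random(1)
population = ["a", "b", "c", "d", "e"]   # stand-ins for encoded individuals
fitness = [1.0, 5.0, 3.0, 4.0, 2.0]      # made-up fitness values
elites, selected = improved_select(population, fitness, 2, rng)
```

Only the `selected` individuals would then go through crossover and mutation, while the best individual is carried over unchanged.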

2) Fitness function design: The goal of the genetic algorithm is to find, over all generations, the weights and thresholds that minimize the output error of the self-associative neural network. The sum of the absolute differences between the network output $\hat{y}$ and the desired output $y$ is therefore used as the objective function:

$$F = \sum_{i=1}^{m} \left| \hat{y}_i - y_i \right|$$

where $m$ is the number of output nodes of the self-associative neural network.

The genetic algorithm evolves in the direction of increasing fitness, whereas a smaller error means a better self-associative neural network model, so the reciprocal of the objective function is taken as the fitness function. Because the fan fault diagnosis model places great weight on prediction accuracy, the area under the receiver operating characteristic curve (AUC) of the self-associative neural network is added to this fitness function. This reduces the interference of imbalanced data with the diagnosis of the fan pitch system, improves the accuracy of the diagnosis model, identifies abnormal states of the pitch system accurately, and lowers the false alarm rate. The improved fitness function is as follows:

where ζ is a very small value that prevents the denominator from being 0, and k is an adjustment coefficient. Each individual is decoded, its weights and thresholds are assigned to the self-associative neural network, the training data are input, and the resulting AUC value is substituted into the formula to obtain the fitness value of the individual.
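The exact fitness formula is not reproduced in this text. As a hedged sketch, one plausible reading of the description is the reciprocal of the absolute-error objective, guarded by ζ, plus k times the AUC. The additive combination, the default values of `k` and `zeta`, and the function name are all assumptions for illustration.

```python
def fitness_value(outputs, targets, auc, k=1.0, zeta=1e-8):
    """Assumed form: 1 / (sum of absolute errors + zeta) + k * AUC."""
    objective = sum(abs(o - t) for o, t in zip(outputs, targets))
    # zeta keeps the denominator away from zero for a perfect reconstruction.
    return 1.0 / (objective + zeta) + k * auc


# Made-up values: total absolute error 0.5, AUC 0.9, k = 2.
example = fitness_value([0.5, 0.5], [0.5, 0.0], auc=0.9, k=2.0, zeta=0.0)
perfect = fitness_value([0.3, 0.7], [0.3, 0.7], auc=1.0)  # finite thanks to zeta
```

Under this reading, a lower reconstruction error and a higher AUC both raise the fitness, which matches the stated intent.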

3) Adaptive crossover and mutation probabilities: The crossover and mutation operators update the population. A conventional genetic algorithm uses constant crossover and mutation probabilities, so its search efficiency is low and it may become trapped in a local optimum and fail to find the optimal individual. The invention uses an adaptive crossover rate and an adaptive mutation rate, given by:

where $P_{c1}$ and $P_{c2}$ are the upper and lower limits of the crossover rate, and $P_{m1}$ and $P_{m2}$ are the upper and lower limits of the mutation rate; $f_{max}$ is the maximum fitness value among the individuals of each generation; $f_{avg}$ is the average fitness value of the current population; $f'$ is the larger fitness value of the two individuals to be crossed; and $f$ is the fitness value of the individual to be mutated.
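The adaptive formulas themselves are not reproduced in this text. The sketch below assumes a commonly used linear adaptive form, in which individuals at or above the average fitness get a rate that falls linearly from the upper limit to the lower limit as their fitness approaches the maximum, and individuals below the average get the upper limit. The function name and the fitness values are illustrative.

```python
def adaptive_rate(f, f_avg, f_max, upper, lower):
    """Assumed linear adaptive rate between `upper` and `lower`."""
    if f < f_avg or f_max == f_avg:
        # Below-average individuals (or a flat population) get the upper limit.
        return upper
    return upper - (upper - lower) * (f - f_avg) / (f_max - f_avg)


# Limits taken from the embodiment; fitness values are made up.
pc_best = adaptive_rate(10.0, 6.0, 10.0, upper=0.89, lower=0.58)  # about 0.58
pc_mid = adaptive_rate(8.0, 6.0, 10.0, upper=0.89, lower=0.58)    # about 0.735
pc_weak = adaptive_rate(4.0, 6.0, 10.0, upper=0.89, lower=0.58)   # 0.89
pm_best = adaptive_rate(10.0, 6.0, 10.0, upper=0.15, lower=0.005)
```

The same helper serves both rates: pass $f'$ with the crossover limits, or $f$ with the mutation limits.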

4) Because the invention uses real-number coding, the crossover operator is the arithmetic crossover operator:

$$x_{ki}^{t+1} = \alpha x_{ki}^{t} + (1-\alpha) x_{li}^{t}$$

$$x_{li}^{t+1} = \alpha x_{li}^{t} + (1-\alpha) x_{ki}^{t}$$

where $x_{ki}^{t+1}$ and $x_{li}^{t+1}$ are the $i$-th genes of the $k$-th and $l$-th chromosomes (individuals) of generation $t+1$, respectively, and $\alpha$ is a random number in [0,1].

The mutation operator uses uniform mutation: the $j$-th gene of the $m$-th chromosome is selected for mutation. With the value range $[U_{min}, U_{max}]$, the expression is:

$$x_{mj} = U_{min} + \lambda (U_{max} - U_{min})$$

where $\lambda$ is a random number uniformly distributed in [0,1].
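The two operators can be sketched together for real-coded chromosomes. The crossover below assumes the standard arithmetic form (a convex combination of the two parents with one random α per pair); the function names and gene values are illustrative.

```python
import random


def arithmetic_crossover(parent_k, parent_l, rng):
    """Blend two real-coded parents gene by gene with one random alpha."""
    alpha = rng.random()
    child_k = [alpha * a + (1 - alpha) * b for a, b in zip(parent_k, parent_l)]
    child_l = [alpha * b + (1 - alpha) * a for a, b in zip(parent_k, parent_l)]
    return child_k, child_l


def uniform_mutation(chromosome, j, u_min, u_max, rng):
    """Replace gene j with U_min + lambda * (U_max - U_min)."""
    mutated = list(chromosome)
    mutated[j] = u_min + rng.random() * (u_max - u_min)
    return mutated


rng = random.Random(7)
parent_k, parent_l = [0.2, -0.4, 0.9], [0.6, 0.1, -0.3]  # made-up weights
child_k, child_l = arithmetic_crossover(parent_k, parent_l, rng)
mutant = uniform_mutation(parent_k, 1, -1.0, 1.0, rng)
```

Each child gene lies between the corresponding parent genes, so arithmetic crossover never leaves the range already spanned by the parents; uniform mutation is what reintroduces values from the full range.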

In step 6, based on the improved adaptive genetic algorithm-self-associative neural network, normal operating data of the fan pitch system are used as input to obtain the model output residuals, and the frequency of each residual is counted to obtain the residual probability distribution. This distribution serves as the reference. When fault data of the pitch system are input, the output residuals and their probability distribution differ from the normal case, and whether the pitch system has failed is judged by calculating the degree of difference between the two residual probability distributions, that is, the offset.

The similarity of two distributions is usually measured by the relative entropy, that is, the Kullback-Leibler (KL) divergence. Let p(x) denote the true distribution of the data and q(x) the fitted distribution. For discrete random variables, the KL divergence is calculated as:

$$D_{KL}(p \| q) = \sum_{i} p(x_i) \log \frac{p(x_i)}{q(x_i)}$$

The residual range output by the fan fault diagnosis model of the improved adaptive genetic algorithm-self-associative neural network is divided into n equal intervals, and the number of residuals falling into each interval is counted to obtain the residual probability distributions $p(x_i) = x_i / s_p$ and $q(x_i) = x_i / s_q$. Here $p(x_i)$ is the proportion of residuals falling into interval $i$ during normal operation of the pitch system, and $s_p$ is the total number of residuals over all intervals during normal operation; $q(x_i)$ is the proportion of residuals falling into interval $i$ at the time of the fault, and $s_q$ is the total number of residuals over all intervals at the time of the fault; $x_i$ is the number of residuals falling into the $i$-th interval.

Because the Kullback-Leibler (KL) divergence takes values in [0, ∞), has no upper bound, and is asymmetric, it is not suitable for setting a threshold. To compare the offset of the residual distributions more accurately, the Jensen-Shannon (JS) divergence, a variant of the KL divergence, is used as the index that measures the difference between the two residual distributions. Its value range is [0,1] (with base-2 logarithms): the smaller the JS divergence, the more similar the distributions; it is 0 for identical distributions and 1 for distributions that do not overlap at all. Compared with the KL divergence, the JS divergence gives a more definite judgment of similarity. It is calculated as:

$$JS(p \| q) = \frac{1}{2} D_{KL}\left(p \,\Big\|\, \frac{p+q}{2}\right) + \frac{1}{2} D_{KL}\left(q \,\Big\|\, \frac{p+q}{2}\right)$$

where $D_{KL}$ denotes the KL divergence.
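The residual binning and the two divergences can be sketched in a few lines. The sketch uses the standard definitions with base-2 logarithms, so the JS divergence falls in [0, 1]; the function names, the bin range, and the sample residuals are illustrative assumptions.

```python
import math


def histogram_probs(residuals, lo, hi, n_bins):
    """Split [lo, hi] into n_bins equal intervals and return bin proportions."""
    counts = [0] * n_bins
    width = (hi - lo) / n_bins
    for r in residuals:
        idx = int((r - lo) / width)
        idx = max(0, min(idx, n_bins - 1))  # clamp values on or past the edges
        counts[idx] += 1
    return [c / len(residuals) for c in counts]


def kl_divergence(p, q):
    """KL divergence in bits; terms with p_i = 0 contribute nothing."""
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def js_divergence(p, q):
    """JS divergence: average KL divergence to the mixture (p + q) / 2."""
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    # m_i > 0 whenever p_i > 0 or q_i > 0, so no division by zero occurs here.
    return 0.5 * kl_divergence(p, m) + 0.5 * kl_divergence(q, m)


normal = histogram_probs([0.05, 0.15, 0.15, 0.95], 0.0, 1.0, 10)
disjoint = js_divergence([1.0, 0.0], [0.0, 1.0])  # no overlap at all
same = js_divergence(normal, normal)              # identical distributions
```

Calling `kl_divergence` directly fails when some $q(x_i)$ is 0 while $p(x_i)$ is positive, which is one practical reason to prefer the JS divergence for binned residuals.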

The receiver operating characteristic (ROC) curve and the area under it (AUC) used in the fitness function of step 3 are calculated as follows:

Because fault data make up only a small proportion of the data compared with normal operating data, the accuracy would not be low even if the model predicted every state as normal. Accuracy therefore does not reflect the diagnostic performance of the model well when used as the evaluation criterion during training. The AUC, calculated from the ROC curve, changes little even when the positive and negative samples are imbalanced and reflects the diagnostic performance better, so it is chosen as the model evaluation index. The calculation of the AUC relies on several indexes defined from the confusion matrix, as shown in Table 2:

Table 2. Confusion matrix

                      Diagnosed as fault    Diagnosed as no fault
Actual fault          TP                    FN
Actual no fault       FP                    TN

In the table, TP (true positive) is an event that the model diagnoses as a fault and that is actually a fault; FN (false negative) is an event that the model diagnoses as fault-free but that is actually a fault; FP (false positive) is an event that the model diagnoses as a fault but that is actually fault-free; TN (true negative) is an event that the model diagnoses as fault-free and that is actually fault-free.

From these definitions, the following indexes are obtained.

The accuracy is the proportion of events that the model diagnoses correctly. When positive and negative samples are balanced, a higher accuracy means a better diagnostic performance. The accuracy is calculated as:

$$Accuracy = \frac{TP + TN}{TP + FN + FP + TN}$$

The true positive rate (TPR) is the proportion of all actual fault events that the diagnosis model identifies correctly:

$$TPR = \frac{TP}{TP + FN}$$

The false positive rate (FPR) is the proportion of all actual fault-free events that the diagnosis model wrongly judges to be faults:

$$FPR = \frac{FP}{FP + TN}$$

With the FPR as the horizontal coordinate and the TPR as the vertical coordinate, the FPR and TPR values are calculated for different thresholds, giving one point per threshold. Connecting the points forms the ROC curve, and the area enclosed by the ROC curve and the horizontal axis is the AUC.
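The ROC construction described above can be sketched by sweeping the threshold over the observed scores and integrating with the trapezoid rule. The sketch assumes that a higher score means "more likely a fault", that label 1 marks an actual fault, and that both classes are present; the function names and sample scores are illustrative.

```python
def confusion_counts(scores, labels, threshold):
    """Return (TP, FN, FP, TN) when scores >= threshold are diagnosed as faults."""
    tp = fn = fp = tn = 0
    for score, label in zip(scores, labels):
        diagnosed_fault = score >= threshold
        if label == 1:
            tp, fn = tp + diagnosed_fault, fn + (not diagnosed_fault)
        else:
            fp, tn = fp + diagnosed_fault, tn + (not diagnosed_fault)
    return tp, fn, fp, tn


def roc_auc(scores, labels):
    """Area under the ROC curve, one (FPR, TPR) point per distinct threshold."""
    points = [(0.0, 0.0)]
    for threshold in sorted(set(scores), reverse=True):
        tp, fn, fp, tn = confusion_counts(scores, labels, threshold)
        points.append((fp / (fp + tn), tp / (tp + fn)))
    # Trapezoid rule over consecutive points; the last point is (1, 1).
    return sum((x1 - x0) * (y0 + y1) / 2
               for (x0, y0), (x1, y1) in zip(points, points[1:]))


labels = [0, 0, 1, 1]                 # 1 = actual fault
scores = [0.1, 0.4, 0.35, 0.8]        # made-up offsets
tp, fn, fp, tn = confusion_counts(scores, labels, 0.5)
accuracy = (tp + tn) / (tp + fn + fp + tn)
auc = roc_auc(scores, labels)
```

In the setting of this document, the score would be the JS divergence offset of a window of residuals, and the threshold sweep is what makes the AUC independent of any single threshold choice.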

In step 6, the normal operation model of the fan pitch system is the improved adaptive genetic algorithm-self-associative neural network model trained on normal operating data; it can recognize each working state of the pitch system in normal operation. From the test set, data recorded by the SCADA system before and after some pitch system faults, together with data from normal operation, are selected as input to the model. The Jensen-Shannon (JS) divergence is used to calculate the offset between the model's output residual distribution and the normal distribution under the corresponding working condition. Balancing the sensitivity and accuracy of the model, the sliding window value is set to 10. The threshold is set at the minimum offset between the residual distribution of the fault data and the normal distribution.

In this embodiment, data collected by the SCADA (supervisory control and data acquisition) system of a 1.5 MW wind turbine in a wind farm in North China are used for experimental analysis. After multiple experiments, the population size is set to 60 and the maximum number of generations to 200. The maximum and minimum values of the crossover probability are $P_{c1} = 0.89$ and $P_{c2} = 0.58$, and the maximum and minimum values of the mutation probability are $P_{m1} = 0.15$ and $P_{m2} = 0.005$. The self-associative neural network has 15 input nodes, 9 nodes in each of the mapping and demapping layers, and 5 nodes in the bottleneck layer; the learning rate is 0.02.

Fig. 4 shows the optimal fitness curves of three optimization algorithms: the improved adaptive genetic algorithm (IAGA) of the invention, the adaptive genetic algorithm (AGA), and the genetic algorithm (GA). Fig. 4 shows that the IAGA not only converges faster than the AGA and the GA but also reaches a larger final fitness value, giving a better parameter optimization result.

Fig. 5 shows the training error curves of the improved adaptive genetic algorithm-self-associative neural network (IAGA-AANN), the adaptive genetic algorithm-self-associative neural network (AGA-AANN), and the genetic algorithm-self-associative neural network (GA-AANN) models. The GA-AANN reaches the set precision after 1592 iterations, the AGA-AANN after 1075 iterations, and the IAGA-AANN after 694 iterations, which shows that the IAGA-AANN converges faster.

Fig. 6 is a scatter plot of the residual distribution offsets for part of the training data. The preprocessed training set, which contains data from normal operation and from fault times of the pitch system, is fed into the improved adaptive genetic algorithm-self-associative neural network model. Comparing the residual distribution offsets of normal operation and fault times, calculated with the Jensen-Shannon (JS) divergence, gives a model threshold of 0.39. This threshold is used as the index for judging whether the pitch system has failed, and it is chosen so that the area under the receiver operating characteristic curve (AUC) of the model is maximized, that is, so that the accuracy of the diagnosis model is highest. The AUC was found to reach its optimal value of 0.967 when the window length was 50 min.

The pitch system of fan No. 3 in the selected wind farm failed on May 11, 2019, so the data from the 10 days before the failure are used to test the effectiveness of the proposed fault diagnosis method. The trained improved adaptive genetic algorithm-self-associative neural network (IAGA-AANN) fault diagnosis model is used to obtain the residual distribution of the fan, and the resulting offset curve is shown in Fig. 7. As Fig. 7 shows, the threshold was exceeded once on May 3, but the proportion of values above the threshold within the sliding time window did not exceed 60%, so this was judged an invalid alarm. From May 8 the offset began to oscillate strongly with a rising trend, and after May 10 the offset exceeded the threshold and kept increasing. It was therefore judged that the fan pitch system was very likely to fail and that the fan should be shut down for maintenance.
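The alarm rule in this example can be sketched as a sliding-window filter. The threshold 0.39, the window of 10 values, and the 60% proportion are taken from the text, but combining them into a single rule in this way is an assumption, as are the function name and the offset series.

```python
from collections import deque


def valid_alarms(offsets, threshold=0.39, window=10, min_fraction=0.6):
    """Flag a time step only if more than min_fraction of the last
    `window` offsets exceed the threshold."""
    recent = deque(maxlen=window)
    flags = []
    for value in offsets:
        recent.append(value > threshold)
        window_full = len(recent) == window
        flags.append(window_full and sum(recent) / window > min_fraction)
    return flags


# A single spike, like the isolated exceedance described above (made-up data).
spike = [0.1] * 10 + [0.5] + [0.1] * 5
# A sustained rise, like the days just before the failure (made-up data).
rising = [0.1] * 3 + [0.5] * 7
spike_flags = valid_alarms(spike)
rising_flags = valid_alarms(rising)
```

Here the spike never triggers a valid alarm, while the sustained rise does once 7 of the last 10 offsets exceed the threshold.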

To compare the performance of the algorithms more intuitively, the same pitch system test set is used as input to each diagnosis model, and the area under the receiver operating characteristic curve (AUC) and the accuracy are compared for the least squares support vector machine (LSSVM), the self-associative neural network (AANN), the genetic algorithm-self-associative neural network (GA-AANN), and the improved adaptive genetic algorithm-self-associative neural network (IAGA-AANN). The results are shown in Table 2.

Table 2. Comparison of the improved adaptive genetic algorithm-self-associative neural network (IAGA-AANN) with other models

The positive and negative samples in the selected test set are relatively balanced, so accuracy can be used as a measure of diagnostic performance on the test set. Table 2 shows that both the AUC and the accuracy of the IAGA-AANN diagnosis model are significantly higher than those of the other models, so the IAGA-AANN fault diagnosis model is more accurate and performs better.

The core concept of the embodiment of the invention is as follows: the model is trained on historical data from normal operation of the fan, and the Jensen-Shannon (JS) divergence between the model's output residual distributions at normal and fault times is calculated to obtain the model threshold, which is used to judge whether the pitch system has failed. The area under the receiver operating characteristic curve (AUC) and a time window are introduced into the evaluation, which further increases the accuracy of the model. Experimental analysis with data from normal operation and fault times of the fan pitch system verifies the effectiveness of the model.
