Transformer fault diagnosis method based on hard voting ensemble learning

Document No.: 1903394 Publication date: 2021-11-30

Reading note: This technology, "Transformer fault diagnosis method based on hard voting ensemble learning" (一种基于硬投票集成学习的变压器故障诊断的方法), was designed by Ma Hongbin (马洪斌), Yang Fei (杨飞) and Shi Na (史娜) on 2021-08-13. Main content: The invention discloses a transformer fault diagnosis method based on hard voting ensemble learning, comprising the following steps: step 1, establishing a hard voting ensemble learning classification model for transformer fault diagnosis; step 2, identifying the fault type of an unknown transformer fault. The method builds the transformer fault diagnosis model with machine learning; fault identification is accurate, simple and efficient; it provides effective support for identifying the operating state of a transformer at an early stage, greatly reduces the workload, and ensures the safety and reliability of transformer fault type identification.

1. A transformer fault diagnosis method based on hard voting ensemble learning, characterized by comprising the following steps:

step 1, establishing a hard voting ensemble learning classification model for transformer fault diagnosis;

(1) acquiring a data set, wherein normal and fault data of the transformer obtained from the equipment operation and maintenance management system are used as initial data; each group of sample data mainly comprises five gas components in the transformer oil and the corresponding transformer fault type;

(2) dividing the data set, wherein the initial data are divided into a training set and a test set, the training set is used for training to obtain a transformer fault diagnosis model, and the test set is used to verify the obtained model and evaluate its identification accuracy;

(3) analyzing fault characteristics, wherein the operating state of the transformer is divided into normal, medium-low temperature overheating, high temperature overheating, low energy discharge and high energy discharge according to the normal and fault states of the transformer during operation;

(4) inputting characteristic gases, wherein the five gas components methane, ethylene, ethane, acetylene and hydrogen in the transformer oil, together with the five corresponding states of normal, medium-low temperature overheating, high temperature overheating, low energy discharge and high energy discharge, are input as the training set;

(5) selecting base classifiers, wherein a support vector machine, logistic regression, K-nearest neighbor classification, Bayesian classification, a decision tree and a random forest are each trained on the training data to obtain six transformer fault diagnosis models;

(6) verifying the identification results, wherein each obtained transformer fault diagnosis model is verified with the test set to obtain its identification accuracy and its identification results for each fault type;

(7) performing ensemble learning diagnosis, wherein the six obtained transformer fault diagnosers are taken as base learners and, according to their identification results, a hard voting combination strategy is adopted: for each fault type, the predictions of the base learners are counted and the fault label predicted most often is output as the final result label; in the combination, two classifiers with higher identification accuracy, including the random forest, are selected as basic classifiers, a decision tree classifier is selected as an adjustment classifier, and the three classifiers are combined to obtain a new transformer fault diagnoser;

(8) verifying the obtained ensemble learning transformer fault diagnoser with the test set to obtain the transformer fault identification result;

step 2, identifying the fault type of an unknown transformer fault;

the five gas components methane, ethylene, ethane, acetylene and hydrogen collected from the transformer to be diagnosed are input into the ensemble learning transformer fault diagnosis model to obtain the fault identification result of the transformer.

2. The transformer fault diagnosis method based on hard voting ensemble learning according to claim 1, characterized in that: in step (1) of step 1, the operating data of the transformer are acquired from the equipment operation and maintenance management system.

3. The transformer fault diagnosis method based on hard voting ensemble learning according to claim 1, characterized in that: in step (2) of step 1, the ratio of training data to test data is 0.7:0.3.

4. The transformer fault diagnosis method based on hard voting ensemble learning according to claim 1, characterized in that: in step (3) of step 1, the operating state of the transformer is divided into normal, medium-low temperature overheating, high temperature overheating, low energy discharge and high energy discharge.

5. The transformer fault diagnosis method based on hard voting ensemble learning according to claim 1, characterized in that: in step (5) of step 1, the base classifiers are a support vector machine, logistic regression, nearest neighbor classification, Bayesian classification, a decision tree and a random forest.

6. The transformer fault diagnosis method based on hard voting ensemble learning according to claim 1, characterized in that: in step (7) of step 1, the ensemble learning mainly adopts voting ensemble learning; the combination strategy is a hard voting output strategy in which, for each fault type, the predictions of the base learners are counted and the fault label predicted most often is output as the final result label; in the combination, two classifiers with higher identification accuracy, including the random forest, are selected as basic classifiers, a decision tree classifier is selected as an adjustment classifier, and the three classifiers are combined to obtain the new transformer fault diagnoser.

7. The transformer fault diagnosis method based on hard voting ensemble learning according to claim 1, characterized in that: in step (7) of step 1, the ensemble learning mainly combines a decision tree, a support vector machine and a random forest.

Technical Field

The invention relates to the technical field of transformer fault diagnosis, in particular to a transformer fault diagnosis method based on hard voting ensemble learning.

Background

The transformer is an important component of a power system and plays a key role in the safe and reliable supply of power; once a fault occurs, the safe and stable operation of the power grid is seriously affected. As a power connection and conversion device, the transformer has a complex internal structure and complex operating conditions, so a fault is difficult to assess accurately when it occurs. Transformer fault diagnosis is therefore of great significance for ensuring the safe operation of the power grid. At present, transformer fault diagnosis mainly relies on analysis of the gases dissolved in the oil (the IEC three-ratio method), which can reliably detect gradually developing latent faults and prevent the major accidents they cause. In practical application, however, its diagnosis accuracy is often only about 80%, which is low.

Disclosure of Invention

The invention aims to provide a transformer fault diagnosis method based on hard voting ensemble learning, so as to solve the above technical problem.

To achieve this purpose, the invention adopts the following technical scheme:

A transformer fault diagnosis method based on hard voting ensemble learning comprises the following steps:

step 1, establishing a hard voting ensemble learning classification model for transformer fault diagnosis;

(1) acquiring a data set, wherein normal and fault data of the transformer obtained from the equipment operation and maintenance management system are used as initial data; each group of sample data mainly comprises five gas components in the transformer oil and the corresponding transformer fault type;

(2) dividing the data set, wherein the initial data are divided into a training set and a test set, the training set is used for training to obtain a transformer fault diagnosis model, and the test set is used to verify the obtained model and evaluate its identification accuracy;

(3) analyzing fault characteristics, wherein the operating state of the transformer is divided into normal, medium-low temperature overheating, high temperature overheating, low energy discharge and high energy discharge according to the normal and fault states of the transformer during operation;

(4) inputting characteristic gases, wherein the five gas components methane, ethylene, ethane, acetylene and hydrogen in the transformer oil, together with the five corresponding states of normal, medium-low temperature overheating, high temperature overheating, low energy discharge and high energy discharge, are input as the training set;

(5) selecting base classifiers, wherein a support vector machine, logistic regression, K-nearest neighbor classification, Bayesian classification, a decision tree and a random forest are each trained on the training data to obtain six transformer fault diagnosis models;

(6) verifying the identification results, wherein each obtained transformer fault diagnosis model is verified with the test set to obtain its identification accuracy and its identification results for each fault type;

(7) performing ensemble learning diagnosis, wherein the six obtained transformer fault diagnosers are taken as base learners and, according to their identification results, a hard voting combination strategy is adopted: for each fault type, the predictions of the base learners are counted and the fault label predicted most often is output as the final result label; in the combination, two classifiers with higher identification accuracy, including the random forest, are selected as basic classifiers, a decision tree classifier is selected as an adjustment classifier, and the three classifiers are combined to obtain a new transformer fault diagnoser;

(8) verifying the obtained ensemble learning transformer fault diagnoser with the test set to obtain the transformer fault identification result;

step 2, identifying the fault type of an unknown transformer fault;

the five gas components methane, ethylene, ethane, acetylene and hydrogen collected from the transformer to be diagnosed are input into the ensemble learning transformer fault diagnosis model to obtain the fault identification result of the transformer.

Preferably: in step (1) of step 1, the operating data of the transformer are acquired from the equipment operation and maintenance management system.

Preferably: in step (2) of step 1, the ratio of training data to test data is 0.7:0.3.

Preferably: in step (3) of step 1, the operating state of the transformer is divided into normal, medium-low temperature overheating, high temperature overheating, low energy discharge and high energy discharge.

Preferably: in step (5) of step 1, the base classifiers are a support vector machine, logistic regression, nearest neighbor classification, Bayesian classification, a decision tree and a random forest.

Preferably: in step (7) of step 1, the ensemble learning mainly adopts voting ensemble learning; the combination strategy is a hard voting output strategy in which, for each fault type, the predictions of the base learners are counted and the fault label predicted most often is output as the final result label; in the combination, two classifiers with higher identification accuracy, including the random forest, are selected as basic classifiers, a decision tree classifier is selected as an adjustment classifier, and the three classifiers are combined to obtain the new transformer fault diagnoser.

Preferably: in step (7) of step 1, the ensemble learning mainly combines a decision tree, a support vector machine and a random forest.

Compared with the prior art, the invention has the following advantages: the transformer fault diagnosis model is built with machine learning; fault identification is accurate, simple and efficient; the method provides effective support for identifying the operating state of a transformer at an early stage, greatly reduces the workload, and ensures the safety and reliability of transformer fault type identification.

Drawings

FIG. 1 is a flow chart of the ensemble learning fault diagnosis of the present invention.

Detailed Description

The invention is explained in further detail below with reference to the figures and the specific embodiments.

Ensemble learning

Ensemble learning builds a predictor by combining individual learners. Training on a sample data set yields several classifiers, which should be both diverse and accurate. An ensemble is called "homogeneous" when all individual learners use the same type of learning algorithm and "heterogeneous" otherwise. To combine the individual learners, their predictions for a new sample are weighed according to a chosen strategy to produce the final prediction. Common combination strategies include averaging and majority voting.
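The majority (hard) voting strategy mentioned above can be sketched in a few lines of Python. This is only an illustration: the function name `hard_vote` and the example labels are hypothetical, and the tie-breaking rule (first label seen wins) is an assumption, since the text does not say how ties are resolved.

```python
from collections import Counter


def hard_vote(predictions):
    """Return the label predicted by the most base learners.

    `predictions` holds one label per base learner for a single sample.
    Ties go to the label that appears first in `predictions`; this rule
    is an assumption, not something stated in the source text.
    """
    counts = Counter(predictions)
    # most_common() lists equal counts in order of first appearance.
    return counts.most_common(1)[0][0]


# Three hypothetical base learners vote on one sample.
print(hard_vote([3, 3, 1]))  # -> 3
```

With an odd number of base learners and only two competing labels a tie cannot occur, but with five possible states three learners can still disagree completely, which is why the tie rule has to be chosen explicitly.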

Support vector machine

A support vector machine (SVM) is a pattern recognition method developed from statistical learning theory. Its training finds a global optimum rather than a local minimum, and it generalizes well. The SVM has a solid theoretical foundation and is widely used in fields such as face recognition and automatic text classification.

Logistic regression

Logistic regression, also called logistic regression analysis, is widely used for classification because it trains quickly and classifies well. The logistic regression model combines a linear regression model with a logistic function. Building it comes down to finding a prediction function and a loss function. The prediction function gives the probability of the output variable given the input variables. To measure the difference between the predictions and the actual values, a loss function is constructed as the criterion; it represents the average difference between predicted and actual values over the training samples.
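As a rough illustration of the prediction function and loss function described above, the following Python sketch shows the binary case: a linear combination passed through the logistic (sigmoid) function, and the average cross-entropy loss. The function names, the binary simplification and the clipping constant are assumptions made for the example; the five-state problem in this document would need a multi-class extension.

```python
import math


def sigmoid(z):
    """Logistic function mapping any real number to (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def predict_proba(weights, bias, x):
    """Prediction function: probability that sample x belongs to class 1."""
    z = bias + sum(w * xi for w, xi in zip(weights, x))
    return sigmoid(z)


def log_loss(y_true, y_prob):
    """Loss function: average cross-entropy between labels and probabilities."""
    eps = 1e-12  # clip probabilities so log() never receives 0
    total = 0.0
    for y, p in zip(y_true, y_prob):
        p = min(max(p, eps), 1.0 - eps)
        total += -(y * math.log(p) + (1 - y) * math.log(1.0 - p))
    return total / len(y_true)


print(sigmoid(0.0))  # -> 0.5
```

Training then amounts to choosing the weights and bias that minimize `log_loss` on the training samples.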

K-nearest neighbor

Given a sample whose class is to be determined, the K-nearest neighbor method finds the K training samples nearest to it and then the class that occurs most often among those K samples; the label of that class becomes the label of the sample. The nearest neighbor algorithm generally uses the Euclidean distance and assigns no weights to the training samples when computing distances, so differences between samples are hard to reflect. A common remedy is to derive a distance formula from a Gaussian kernel function and learn the kernel parameters on the training set to obtain a weighting matrix for the training set. When the test set is classified, the weighting matrix brings out the differences between samples and thus improves classification efficiency.
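The basic, unweighted procedure described above (Euclidean distance, then the most frequent class among the K nearest training samples) might look like the following Python sketch. The function name `knn_predict` and the toy data are hypothetical, and the Gaussian-kernel weighting mentioned in the text is deliberately left out.

```python
import math
from collections import Counter


def knn_predict(train_x, train_y, query, k=3):
    """Label `query` with the most common class among its k nearest neighbors."""
    # Pair each training sample's Euclidean distance to the query with its label.
    distances = sorted(
        (math.dist(x, query), label) for x, label in zip(train_x, train_y)
    )
    nearest_labels = [label for _, label in distances[:k]]
    return Counter(nearest_labels).most_common(1)[0][0]


# Hypothetical two-feature toy data with two classes.
train_x = [(0.0, 0.0), (0.1, 0.2), (5.0, 5.0), (5.2, 4.9)]
train_y = [0, 0, 1, 1]
print(knn_predict(train_x, train_y, (0.2, 0.1), k=3))  # -> 0
```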

Naive Bayes

Naive Bayes theory treats the outcome of an event as uncertain and quantifies it by the probability that the event occurs. If the probability of past occurrences of an event is known, the probability of a future occurrence can be calculated mathematically. The naive Bayes model assumes that all condition attributes are independent of one another given the class and share a single parent node, the class attribute. This is the conditional independence assumption. It keeps the structure of the model simple and reduces the complexity of the Bayesian network. Real data, however, are often complex and incomplete, so the conditional independence assumption is hard to satisfy when the Bayesian model is built.
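Written as a formula, the conditional independence assumption lets the class posterior factor into per-attribute terms. The symbols below are introduced only for illustration and do not appear in the source: y is the class (the transformer state) and x_1, ..., x_n are the condition attributes (for example, the gas components).

```latex
P(y \mid x_1, \dots, x_n) \;\propto\; P(y) \prod_{i=1}^{n} P(x_i \mid y),
\qquad
\hat{y} = \arg\max_{y} \; P(y) \prod_{i=1}^{n} P(x_i \mid y)
```

The predicted class is the one with the largest product of the class prior and the per-attribute likelihoods.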

Decision tree

Decision tree learning infers classification rules, expressed in the form of a decision tree, from a set of unordered, irregular cases. While the tree is being built, too few training samples or noise in the data set can cause many branches to reflect anomalies in the training samples, so the tree overfits them. Pruning the decision tree addresses this overfitting. Pruning is divided into pre-pruning and post-pruning. Pre-pruning makes some judgments in advance, which can stop the construction of the tree early, but it is efficient, suits large data volumes, and is commonly applied in data prediction. Decision trees have a simple structure, are easy to understand, have a simple algorithmic description, classify quickly, and are widely used in data processing.

Random forest

A random forest is built by drawing several sample sets from the original samples by resampling, fitting a decision tree to each sample set, and combining the resulting trees. Unknown samples are predicted by the forest, and the class with the most votes is selected. A random forest can therefore classify considerably more accurately than a single decision tree.
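The resampling step can be sketched as follows. The text only says "resampling"; drawing with replacement (bootstrap sampling), as is usual for random forests, is assumed here, and the helper name `bootstrap_sample` and the placeholder data are hypothetical. Fitting a tree to each sample set and voting over the trees are not shown.

```python
import random


def bootstrap_sample(samples, rng):
    """Draw len(samples) items from `samples` with replacement."""
    return rng.choices(samples, k=len(samples))


rng = random.Random(0)  # fixed seed only so the sketch is repeatable
data = list(range(10))  # placeholder for the original training samples

# One resampled data set per tree in a hypothetical three-tree forest.
subsets = [bootstrap_sample(data, rng) for _ in range(3)]
for subset in subsets:
    print(sorted(subset))
```

Because sampling is done with replacement, each sample set typically repeats some original samples and omits others, which is what makes the individual trees differ.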

Mechanism of gas generation in transformer oil and solid insulation

At present almost all large power transformers use oil for insulation and heat dissipation. Under operating voltage, the transformer oil and the solid organic insulating materials immersed in it gradually age and decompose through the action of electricity, heat, oxidation, local arcing and other factors, producing small amounts of low-molecular hydrocarbons such as methane, ethylene, ethane and acetylene, as well as gases such as carbon monoxide, carbon dioxide and hydrogen, most of which dissolve in the oil. The composition and content of the gases dissolved in the oil reflect, to a certain extent, the degree of insulation aging or the faults of the power transformer, and can serve as characteristic quantities indicating abnormal conditions in power equipment. Regular analysis of the composition, content and gas production rate of the gases dissolved in the oil of an operating power transformer allows latent faults to be discovered as early as possible.

Analysis of typical transformer faults

Internal transformer faults are mainly of the overheating type and the abnormal discharge type. Overheating faults comprise low-temperature overheating (below 300 °C), medium-temperature overheating (300-700 °C) and high-temperature overheating (above 700 °C); abnormal discharge faults comprise low energy discharge and high energy discharge.

According to tests and the analysis of transformer fault handling cases, the gases generated when an oil-immersed transformer develops a fault are as follows:

Fault type                                | Main gas components                                  | Secondary gas components
Oil overheating                           | Methane, ethylene                                    | Hydrogen, ethane
Oil and paper overheating                 | Methane, ethylene, carbon monoxide, carbon dioxide   | Hydrogen, ethane
Partial discharge in oil-paper insulation | Hydrogen, methane, carbon monoxide                   | Acetylene, ethane, carbon dioxide
Spark discharge in oil                    | Hydrogen, acetylene                                  | -
Arc in oil                                | Hydrogen, acetylene                                  | Methane, ethylene, ethane
Arc in oil and paper                      | Hydrogen, acetylene, carbon monoxide, carbon dioxide | Methane, ethylene, ethane

Based on this analysis, transformer faults are divided into four types: medium-low temperature overheating, high temperature overheating, low energy discharge and high energy discharge. Pattern recognition is performed with these four fault types plus the normal condition, five state types in total, as the possible identification results.

The invention mainly comprises the following steps:

First, establishing an ensemble learning classification model for transformer fault diagnosis;

acquiring a data set: 133 groups of data on the gas components in the transformer oil of transformers in normal and fault states are obtained from the equipment operation and maintenance management system; each group of sample data mainly comprises information on the five gas components methane, ethylene, ethane, acetylene and hydrogen and on the state type of the transformer, which is one of normal, medium-low temperature overheating, high temperature overheating, low energy discharge and high energy discharge;

different fault types produce different gas components; the fault types of the transformers are known from actual field work, and the gas components corresponding to the different faults are then collected;

dividing the sample data into a training set and a test set: to avoid errors caused by how samples of the same type fall between the two sets, the sample data are divided proportionally at a ratio of 7:3 into 93 groups of training data and 40 groups of test data;

transformer fault pattern recognition uses methane, ethylene, ethane, acetylene and hydrogen as the feature vector; because classification mainly operates on numerical data, the gas components, which are already numerical, need no conversion, whereas the transformer states normal, medium-low temperature overheating, high temperature overheating, low energy discharge and high energy discharge are converted from character type to numerical type and labeled 0, 1, 2, 3 and 4, respectively;

the training data are learned with a support vector machine, logistic regression, nearest neighbor classification, Bayesian classification, a decision tree and a random forest, respectively, to obtain fault diagnosis models, and each fault diagnosis model is verified with the test data to obtain its identification result;

the identification accuracy of each of the 6 classifiers is obtained from the verification results on the 40 groups of test data; the support vector machine (fault identification accuracy 90%) and the random forest (fault identification accuracy 90%), which have the higher identification rates, are selected as base classifiers; in addition, the identification results of each classifier for each fault type are analyzed, and the decision tree is selected to be combined with the support vector machine and the random forest to obtain a new ensemble classifier;

Second, identifying the fault type of an unknown transformer fault;

the five gas components methane, ethylene, ethane, acetylene and hydrogen are collected from the transformer to be diagnosed and input into the ensemble learning fault diagnosis model to obtain the corresponding transformer fault type.

The invention uses ensemble learning to build a transformer fault identification and classification model and performs fault diagnosis with five gas components generated in the transformer oil. The collected sample data are divided into a training set and a test set; the transformer fault diagnosis model is obtained from the training set data, and the result is verified with the test set data. Classifiers with a high fault identification rate are used as base classifiers and combined by ensemble learning, which greatly improves the fault identification accuracy.
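The data preparation described above (numeric state labels 0 to 4 and a 7:3 split of 133 samples into 93 training and 40 test groups) can be sketched in Python as follows. The names `STATE_TO_LABEL` and `split_train_test`, the fixed seed and the placeholder samples are assumptions for illustration; the sketch shuffles and cuts the whole data set once and does not reproduce any per-class balancing the authors may have applied.

```python
import random

# Numeric labels for the five transformer states, as described in the text.
STATE_TO_LABEL = {
    "normal": 0,
    "medium-low temperature overheating": 1,
    "high temperature overheating": 2,
    "low energy discharge": 3,
    "high energy discharge": 4,
}


def split_train_test(samples, train_ratio=0.7, seed=0):
    """Shuffle the samples and split them into training and test lists."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)
    n_train = round(len(shuffled) * train_ratio)
    return shuffled[:n_train], shuffled[n_train:]


# Placeholder sample indices standing in for the 133 measured groups.
samples = list(range(133))
train, test = split_train_test(samples)
print(len(train), len(test))  # -> 93 40
```

Each real sample would be a pair of a five-value gas feature vector and one of the labels above; only the counts are taken from the document.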

Examples

First, establishing an ensemble learning classification model for transformer fault diagnosis;

acquiring a data set: data on the gas components in the transformer oil of 133 groups of transformers in normal and fault states are obtained from the equipment operation and maintenance management system as sample data; each group of sample data comprises information on the five gas components methane, ethylene, ethane, acetylene and hydrogen and the corresponding transformer fault type;

the 133 groups of sample data are divided at a ratio of 7:3 into a training set of 93 groups and a test set of 40 groups;

the training data are learned with a support vector machine, logistic regression, nearest neighbor classification, Bayesian classification, a decision tree and a random forest, respectively, to obtain fault diagnosis models, and each fault diagnosis model is verified with the test data, giving the following identification results:

Classifier                      | Correctly identified | Identification accuracy
Decision tree                   | 28                   | 70%
Support vector machine          | 36                   | 90%
Nearest neighbor classification | 31                   | 77.5%
Random forest                   | 36                   | 90%
Logistic regression             | 32                   | 80%
Bayesian classification         | 30                   | 75%
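The accuracies in the table follow directly from the correct counts out of the 40 test groups. The short Python sketch below recomputes them and ranks the classifiers; the helper names are hypothetical, and ranking purely by accuracy is a simplification, since the document also uses per-fault results when choosing the decision tree as the third classifier.

```python
TEST_SIZE = 40  # number of test groups stated in the document

# Correct identifications per classifier, copied from the table.
CORRECT = {
    "Decision tree": 28,
    "Support vector machine": 36,
    "Nearest neighbor classification": 31,
    "Random forest": 36,
    "Logistic regression": 32,
    "Bayesian classification": 30,
}


def accuracy_table(correct, test_size):
    """Map each classifier to its fraction of correctly identified samples."""
    return {name: n / test_size for name, n in correct.items()}


def rank_by_accuracy(accuracies):
    """Classifier names from most to least accurate (ties keep table order)."""
    return sorted(accuracies, key=accuracies.get, reverse=True)


accuracies = accuracy_table(CORRECT, TEST_SIZE)
for name in rank_by_accuracy(accuracies):
    print(f"{name}: {accuracies[name]:.1%}")
```

The first two names in the ranking are the support vector machine and the random forest, the two classifiers the document selects as base classifiers.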

The decision tree, the support vector machine and the random forest are used as base classifiers, and a new ensemble classifier is formed by voting ensemble learning; with majority voting, the fault identification result is as follows:

Classifier        | Correctly identified | Identification accuracy
Ensemble learning | 37                   | 92.5%

Compared with a single classifier, voting ensemble learning combines homogeneous and heterogeneous classifiers and raises the fault identification accuracy to 92.5%.

Second, identifying the fault type of an unknown transformer fault;

Transformer faults are diagnosed from the five characteristic gases collected in the field by the online oil chromatography monitoring device, with the following results:

No. | Hydrogen | Methane | Ethylene | Ethane | Acetylene | Predicted fault                    | Actual fault
1   | 14.3     | 1.78    | 0.38     | 0.43   | 0         | Medium-low temperature overheating | Medium-low temperature overheating
2   | 124.25   | 17.13   | 1.99     | 6.51   | 0.32      | Low energy discharge               | Low energy discharge

For the first transformer, the predicted result is medium-low temperature overheating; on-site infrared temperature measurement found an overheating point at the connection below the main transformer body. For the second transformer, the predicted result is low energy discharge; an on-site inspection of the transformer found discharge traces between the bare lead and the bushing conductive tube.

The invention builds a fault diagnosis model from historical fault data, so that gas component data can be input directly to diagnose and identify the fault type of the transformer.

Comparative example

A support vector machine and a random forest were each trained on the same training data to build transformer fault identification models; verified with the test data, each achieved a transformer fault identification accuracy of 90%. The model built by training on the same data with ensemble learning achieved an identification accuracy of 92.5% on the test set, so ensemble learning gives the better classification accuracy.

The foregoing is a preferred embodiment of the present invention, and it will be apparent to those skilled in the art that variations, modifications, substitutions and alterations can be made in the embodiment without departing from the principles and spirit of the invention.
