Intelligent epileptic spike detection method based on fusion of adaptive template matching and machine learning algorithm

Document No.: 1451355. Publication date: 2020-02-21.

Reading note: this technology, an intelligent epileptic spike detection method based on fusion of adaptive template matching and machine learning algorithm (基于自适应模板匹配与机器学习算法融合的癫痫棘波智能检测方法), was designed by 王紫萌, 吴端坡, and 冯维 on 2019-10-28. Abstract: the invention provides an intelligent epileptic spike detection method based on the fusion of adaptive template matching and a machine learning algorithm, comprising the following steps: (1) electroencephalogram (EEG) acquisition: selecting experimental subjects, acquiring EEG data of epileptic patients with EEG acquisition equipment, and establishing an experimental database; (2) data preprocessing: band-pass filtering the acquired raw EEG data to obtain a standard EEG signal; (3) adaptive template matching spike detection; (4) machine-learning-based spike detection: first segmenting the EEG signal into segments with a length of 1 s, then extracting time domain and frequency domain features from each segment to construct spike feature vectors, and training a random forest classification model with the feature vectors to obtain a machine-learning-based spike detection result; (5) fusion of detection results: the detection results of step S3 and step S4 are fused, and a detection is regarded as an epileptic spike only if it is detected as a spike by both S3 and S4.

1. An intelligent epileptic spike detection method based on the fusion of adaptive template matching and a machine learning algorithm, characterized by comprising the following steps:

step S1: collecting electroencephalogram signals; selecting an experimental object, acquiring electroencephalogram data of an epileptic by using electroencephalogram acquisition equipment, and establishing an experimental database;

step S2: preprocessing data; performing Butterworth band-pass filtering on the acquired original EEG data to obtain a standard EEG signal;

step S3: adaptive template matching spike detection; firstly, defining a universal template according to the waveform characteristics of an epileptic spike, and carrying out universal template matching to obtain a candidate spike signal; then clustering the candidate spikes by using a K-means algorithm to obtain a plurality of classes; counting the number of candidate spikes in each class, and if the number of spikes is less than 5% of the total number of spikes, rejecting the class; respectively using the screened class centers as new templates to perform self-adaptive template matching, and adding all matching results to obtain spike detection results;

step S4: machine learning spike detection; firstly, segmenting the electroencephalogram signal into electroencephalogram segments with a length of 1 s, then extracting time domain and frequency domain features from each electroencephalogram segment, and constructing spike feature vectors; training a random forest classification model by using the feature vectors to obtain a spike detection result based on machine learning;

step S5: fusing detection results; fusing the spike detection result of step S3 and the spike detection result of step S4, and if a spike signal is detected in the same segment in both step S3 and step S4, marking the final result as the segment containing a spike signal and regarding it as an epileptic spike;

wherein the step S3 further includes:

step S31, counting the characteristics of rising edge slope, falling edge slope, amplitude height, duration and the like of the spike waveform in the electroencephalogram data, and defining a universal template;

step S32, setting the window width to 300, and carrying out general template matching operation on the electroencephalogram signals according to the time sequence to obtain candidate spike signals;

step S33, performing K-means clustering on the candidate spikes, and dividing the candidate spikes into different classes according to different waveforms;

step S34, counting the number of candidate spike waves in each spike wave cluster, if the number is less than 5% of the total number of candidate spike waves, rejecting the class, and finally taking the centroid of the rest classes as a new template;

step S35, performing new template matching by respectively using the centroid of each class as a template, and superposing the results to obtain a spike detection result;

the step S4 further includes:

step S41, dividing each channel of the electroencephalogram signal into segments with a length of 1 s, extracting the time domain features and the frequency domain features of each segment, and constructing the feature vector of each electroencephalogram segment;

step S42, dividing the feature vectors into a training set and a testing set, and training a random forest classification model by using data in the training set;

step S43, inputting the data in the test set into the random forest model, and obtaining an output result, which is a spike detection result, so as to detect whether there is a spike in this segment.

2. The intelligent epileptic spike detection method based on the fusion of adaptive template matching and a machine learning algorithm as claimed in claim 1, characterized in that, in the data preprocessing, a 5th-order IIR Butterworth band-pass filter with a frequency range of 0.5-32 Hz is used to remove noise and artifacts from the EEG signals.

3. The intelligent epileptic spike detection method based on the fusion of adaptive template matching and a machine learning algorithm as claimed in claim 1 or 2, characterized in that the spike detection results are compared with the detection results of adjacent channels; if a "spike-to-spike" phenomenon appears in the adjacent channels, the detection is regarded as a spike, and otherwise the result is discarded.

Technical Field

The invention relates to the field of computers, in particular to an intelligent epileptic spike detection method based on self-adaptive template matching and machine learning algorithm fusion.

Background

Epilepsy is a chronic disease in which sudden abnormal discharges of cerebral neurons lead to transient brain dysfunction. More than 65 million people worldwide suffer from epilepsy, about nine million of them in China. Seizures are paroxysmal, repetitive, and unpredictable, and may occur at any age.

Electroencephalograms (EEG) are potential signals generated by the discharge of neurons in the brain, reflect the rhythmic activity of bioelectricity in the brain, and include a large amount of physiological and disease information. In clinical medicine, EEG signal processing can not only provide a diagnosis basis for some brain diseases, but also provide an effective treatment means for some brain diseases, and plays an important role in the detection of epilepsy.

Spikes are a typical epileptic waveform. They are usually recorded in the electroencephalogram, are sharp relative to the background activity, and have high amplitude and a transient character. At present, clinical examination of epileptic electroencephalograms mainly relies on identifying spikes manually by eye, which is inefficient and highly subjective, so the accuracy of the results cannot be guaranteed. Automatic spike detection has therefore received increasing attention in recent years.

There are many methods for identifying spikes. A common one is wavelet analysis, which decomposes the epileptic electroencephalogram signal in the time-frequency domain and uses the wavelet coefficients as input to a machine learning classifier or neural network to detect spikes. However, because the characteristics of the mother wavelet differ from those of the spike, the background electroencephalogram is poorly suppressed in the decomposed and reconstructed signal, and the spike extraction result is not ideal. Detecting signal singularities from the wavelet transform modulus maxima is another common spike detection method, but it reliably detects only positive-phase spikes and has a higher false positive rate for negative-phase spikes. Morphological filtering is a further approach to spike extraction: it uses predefined structuring elements to match signals by their geometric features and extract signals with similar morphology. This method is simple to implement, has a clear physical meaning, and is practical and effective. It can decompose a signal containing complex components into parts with different physical meanings, separate the signal from the background, and preserve its main global or local morphological characteristics. However, a single opening-closing (OC) or closing-opening (CO) operation introduces a statistical bias, so the detected spike deviates from the actual spike in both waveform and position.

Disclosure of Invention

Aiming at the problems in the prior art, the invention provides an intelligent epileptic spike detection method based on the fusion of adaptive template matching and a machine learning method, so as to improve the identification rate of epileptic spikes.

In order to achieve the purpose, the invention is realized by the following scheme: an intelligent epileptic spike detection method based on self-adaptive template matching and machine learning algorithm fusion comprises the following steps:

step S1: acquiring electroencephalogram signals, selecting an experimental object, establishing an epileptic electroencephalogram database, and marking spike waves in each channel of the electroencephalogram signals;

step S2: preprocessing the electroencephalogram signals, and removing high-frequency components and artifacts by using a 5th-order Butterworth band-pass filter;

step S3: detecting epileptic spikes by adopting the adaptive template matching method;

step S4: detecting epileptic spikes by adopting the machine learning method;

step S5: fusing the detection results of step S3 and step S4 to obtain the final spike detection result.

According to an embodiment of the present invention, the sampling frequency in step S1 is 500Hz, and a large amount of electroencephalogram data is required to be taken as an experimental sample, and the experimental subject includes people of different sexes and different ages.

According to an embodiment of the invention, in the adaptive template matching spike detection process, morphological characteristics such as rising edge slope, falling edge slope, amplitude and duration of manually marked spikes are counted to establish a universal spike template.

According to an embodiment of the present invention, in the process of detecting epilepsy spike by using an adaptive template matching method, the method includes:

step S32, setting the window width to 300, and carrying out general template matching operation on the electroencephalogram signals according to the time sequence to obtain candidate spike signals;

step S33, performing K-means clustering on the candidate spikes, and dividing the candidate spikes into different classes according to different waveforms;

and step S35, performing new template matching by respectively using the centroid of each class as a template, and superposing the results to obtain a spike detection result.

According to an embodiment of the present invention, the K-means clustering includes:

step S331: randomly selecting k samples in the sample set as the initial cluster centroids;

step S332: calculating the distance between each sample and each centroid, and assigning each candidate spike to the class of its nearest centroid to obtain a clustering result;

step S333: averaging the samples in each class to obtain the centroid for the next clustering iteration;

step S334: repeating steps S332 and S333 until the centroid positions no longer change, at which point the clustering is finished.

According to an embodiment of the present invention, in the process of detecting epilepsy spike by using a machine learning method, the method includes:

step S41: dividing the electroencephalogram signal into single-channel segments with a length of 1 s, marking a segment as 1 if it contains a spike and as 2 if it does not, and extracting the time domain features and frequency domain features of each segment to construct robust epileptic spike feature vectors;

step S42: randomly dividing the feature vectors into a training set and a testing set, and training a plurality of decision trees in a random forest classifier by using the electroencephalogram signal samples in the training set to form a random forest model;

step S43: inputting the data in the testing set into the trained random forest model to obtain a spike detection result based on the machine learning method.

According to an embodiment of the invention, in the training of the random forest model, the method comprises the following steps:

step S421: sampling with replacement from the training sample set to draw a new training set with the same number of samples as the original training set;

step S422: randomly sampling without replacement from the feature set to form a candidate feature set;

step S423: based on the candidate feature set obtained in step S422, calculating the optimal split of each node and splitting the node, without pruning, until the impurity of every leaf node meets the specified requirement, thereby forming a decision tree;

step S424: repeating steps S421 to S423 until all decision trees are generated, and integrating them to obtain the random forest model.

By adopting the technical scheme of the invention, the epileptic spike detection is carried out by fusing self-adaptive template matching and machine learning, so that the identification rate of the epileptic spike is greatly improved.

Drawings

Fig. 1 is a general flowchart of an intelligent epileptic spike detection method based on adaptive template matching and machine learning algorithm fusion according to the present invention.

FIG. 2 is a flow chart of adaptive template matching spike detection according to the present invention.

FIG. 3 is a K-means clustering flow chart of the present invention.

FIG. 4 is a flowchart illustrating a machine learning spike detection process according to the present invention.

FIG. 5 is a flow chart of the training of a random forest model for machine learning spike detection in accordance with the present invention.

Detailed Description

Electroencephalogram signals contain a great deal of physiological information about human diseases and play an especially important role in the detection of epilepsy. They contain many epileptic characteristic waves, of which the spike is a typical waveform, so spike detection on epileptic electroencephalogram signals is needed to support further research. Existing spike detection methods have difficulty locating spikes completely and accurately, which greatly hampers research on epilepsy. In view of this, the present embodiment provides an intelligent epileptic spike detection method based on the fusion of adaptive template matching and a machine learning algorithm.

In order to make the objects, implementations and innovations of the present invention more prominent, the present invention will be further described in detail with reference to the accompanying drawings and examples.

Fig. 1 is a general flowchart of an intelligent epileptic spike detection method based on adaptive template matching and machine learning algorithm fusion, including:

step S1: acquiring an electroencephalogram signal: selecting an experimental object, acquiring electroencephalogram data of an epileptic by using electroencephalogram acquisition equipment, and establishing an experimental database;

step S2: data preprocessing: performing Butterworth band-pass filtering on the acquired original EEG data to obtain a standard EEG signal;

step S3: self-adaptive template matching spike detection: firstly, defining a universal template according to the waveform characteristics of an epileptic spike, and carrying out universal template matching to obtain a candidate spike signal; then clustering the candidate spikes by using a K-means algorithm to obtain a plurality of classes; counting the number of candidate spikes in each class, and if the number of spikes is less than 5% of the total number of spikes, rejecting the class; and respectively using the screened class centers as new templates to perform self-adaptive template matching, and adding all matching results to obtain spike detection results.

Step S4: machine learning spike detection: firstly, segmenting the electroencephalogram signal into electroencephalogram segments with a length of 1 s, then extracting time domain and frequency domain features from each electroencephalogram segment, and constructing spike feature vectors; and training a random forest classification model by using the feature vectors to obtain a spike detection result based on machine learning.

Step S5: fusing detection results: the spike detection results of step S3 and step S4 are fused, and a detection is regarded as an epileptic spike only if it is detected as a spike by both S3 and S4.

The intelligent epileptic spike detection method based on the fusion of adaptive template matching and machine learning algorithm proposed in the present embodiment is described in detail below with reference to fig. 1 to 5.

The intelligent epileptic spike detection method based on the fusion of adaptive template matching and a machine learning algorithm provided by this embodiment starts with step S1, in which a multi-lead electroencephalograph records long-term monitoring electroencephalograms of patients at a sampling frequency of 500 Hz. The electrodes are placed according to the international 10-20 system, and 19 channels of electroencephalogram data are collected in total. Electroencephalograms are collected from a large number of subjects of different sexes and ages to obtain a plurality of electroencephalogram signal samples. A professional physician then annotates these samples, marking the spike waveforms in each channel of the electroencephalogram signal.

Then, step S2 is executed to preprocess the electroencephalogram signal. A 5th-order Butterworth band-pass filter removes frequency components above 32 Hz and below 0.5 Hz, reducing the interference of noise and artifacts.

In step S3, adaptive template matching is carried out on the preprocessed electroencephalogram signal to obtain a spike detection result. The adaptive template matching spike detection method is described in detail below with reference to fig. 2.

Firstly, the spike waveforms marked in the electroencephalogram signal are analyzed statistically: the average rising edge slope, falling edge slope, peak value, and duration of all marked spike waveforms are computed and used as the standard for establishing a universal template (step S31). Then, the window width is set to 300, and universal template matching is performed on the electroencephalogram signal in time order to obtain candidate spike signals (step S32). K-means clustering is performed on the candidate spikes, dividing them into different classes according to their waveforms (step S33). The number of candidate spikes in each cluster is counted; a class containing fewer than 5% of the total number of candidate spikes is rejected, and the centroids of the remaining classes are taken as new templates (step S34). Finally, template matching is performed again with the centroid of each class as a template, and the results are superposed to obtain the spike detection result (step S35).
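The sliding-window matching of steps S32 and S35 might look like the sketch below. The source does not say how similarity between a window and the template is measured, so the Pearson correlation and the 0.8 threshold used here are assumptions, as is the function name `match_template`. The demonstration uses a short 30-sample template rather than the window width of 300 given in the text.

```python
import numpy as np


def match_template(signal, template, threshold=0.8):
    """Return window start indices whose correlation with the template
    reaches the threshold (similarity measure is an assumption)."""
    signal = np.asarray(signal, dtype=float)
    template = np.asarray(template, dtype=float)
    width = len(template)
    t_centered = template - template.mean()
    t_norm = np.linalg.norm(t_centered)
    hits = []
    # Slide the window over the signal in time order.
    for start in range(len(signal) - width + 1):
        window = signal[start:start + width]
        w_centered = window - window.mean()
        denom = np.linalg.norm(w_centered) * t_norm
        if denom == 0:
            continue  # flat window: correlation undefined, skip it
        if np.dot(w_centered, t_centered) / denom >= threshold:
            hits.append(start)
    return hits


# Toy template: a triangular "spike" surrounded by baseline.
template = np.concatenate(
    [np.zeros(10), np.linspace(0, 1, 5), np.linspace(1, 0, 5), np.zeros(10)]
)
signal = np.zeros(400)
signal[100:130] = template  # embed one spike at sample 100
hits = match_template(signal, template)
```

For step S35 the same function would be called once per retained cluster centroid, and the resulting hit lists merged.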

As shown in fig. 3, the process of K-means clustering is as follows:

k samples are randomly selected among the n candidate spikes as the initial cluster centroids (step S331). The distance between each candidate spike and each centroid is calculated, and each candidate spike is assigned to the class of its nearest centroid to obtain a clustering result (step S332). The candidate spikes in each cluster are averaged to obtain the centroid for the next iteration (step S333). Steps S332 and S333 are repeated until the centroid positions no longer change or the number of iterations reaches the specified limit, at which point the clustering is finished and the clustering result is obtained (step S334). In this embodiment, the number k of the initial centroids is n, that is, each candidate spike is clustered as a centroid, and finally n clustering results are obtained.
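Steps S331 to S334 and the 5% rejection rule of step S34 can be sketched in plain NumPy. The Euclidean distance, the iteration cap, the fixed random seed, and the names `kmeans` and `prune_small_clusters` are assumptions for illustration; in practice each row of `samples` would be one candidate spike waveform.

```python
import numpy as np


def kmeans(samples, k, max_iter=100, seed=0):
    """Plain K-means following steps S331-S334 (Euclidean distance assumed)."""
    rng = np.random.default_rng(seed)
    samples = np.asarray(samples, dtype=float)
    # S331: pick k distinct samples as the initial centroids.
    centroids = samples[rng.choice(len(samples), size=k, replace=False)]
    for _ in range(max_iter):
        # S332: assign every sample to its nearest centroid.
        dists = np.linalg.norm(samples[:, None, :] - centroids[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # S333: the mean of each class becomes the next centroid
        # (an empty class keeps its previous centroid).
        updated = np.array([
            samples[labels == j].mean(axis=0) if np.any(labels == j) else centroids[j]
            for j in range(k)
        ])
        # S334: stop once the centroids no longer move.
        if np.allclose(updated, centroids):
            break
        centroids = updated
    return labels, centroids


def prune_small_clusters(labels, centroids, min_fraction=0.05):
    """S34: keep only centroids of classes holding at least 5% of candidates."""
    counts = np.bincount(labels, minlength=len(centroids))
    return centroids[counts >= min_fraction * len(labels)]


# Two well-separated toy groups standing in for two spike morphologies.
group_a = np.array([[i * 0.01, 0.0] for i in range(20)])
group_b = np.array([[10 + i * 0.01, 10.0] for i in range(20)])
labels, centroids = kmeans(np.vstack([group_a, group_b]), k=2)
templates = prune_small_clusters(labels, centroids)
```

The rows of `templates` are the class centroids that would serve as new templates in step S35.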

In step S4, spikes are extracted from the preprocessed electroencephalogram signal with a machine learning method. The machine learning spike detection method is described in detail below with reference to fig. 4.

Firstly, the electroencephalogram signal of each channel is segmented into segments with a length of 1 s, and several characteristic parameters, comprising time domain and frequency domain characteristic parameters, are extracted from each segment to construct the feature vector of that segment (step S41). The feature vectors are then divided into a training set and a test set, and the random forest classification model is trained with the data in the training set (step S42). The data in the test set are input into the random forest model; the output obtained after the decision trees vote is the spike detection result, which indicates whether a spike is present in each segment (step S43).
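A rough sketch of steps S42 and S43 with scikit-learn is given below. The synthetic four-dimensional feature vectors, the 70/30 split, and the choice of 100 trees are all assumptions for illustration, since the source gives neither the split ratio nor the forest size. The class labels 1 (spike) and 2 (no spike) follow the labelling described for step S41.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)

# Synthetic stand-in for the feature vectors of step S41 (not real EEG data).
n_per_class = 200
spike_features = rng.normal(loc=2.0, scale=1.0, size=(n_per_class, 4))
background_features = rng.normal(loc=-2.0, scale=1.0, size=(n_per_class, 4))
X = np.vstack([spike_features, background_features])
y = np.array([1] * n_per_class + [2] * n_per_class)  # 1 = spike, 2 = no spike

# S42: random split into training and test sets, then train the forest.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0, stratify=y
)
model = RandomForestClassifier(n_estimators=100, random_state=0)
model.fit(X_train, y_train)

# S43: the majority vote of the trees labels each test segment.
y_pred = model.predict(X_test)
accuracy = float((y_pred == y_test).mean())
```

Segments predicted as class 1 are the machine learning detections that are passed on to the fusion step.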

The electroencephalogram segment obtained by segmentation in step S41 is denoted x(n), n = 1, 2, …, N, where N is the length of the segment. The sampling frequency of the electroencephalogram signal is 500 Hz, so N = 500. Before feature extraction, the rhythm waves are extracted by wavelet packet transform: because the spike frequency range is above 14 Hz, the signal is decomposed and reconstructed with the db6 wavelet basis function to obtain the β wave and the γ wave, denoted x₁(n) and x₂(n) respectively.

The time domain characteristic parameters extracted in step S41 are the minimum, maximum, mean, standard deviation, kurtosis, skewness, and Hjorth parameters of the original electroencephalogram signal x(n) and the two rhythm wave signals x₁(n) and x₂(n). The minimum value Min and the maximum value Max are the extreme values of the signal amplitude. The mean value Mean reflects the amplitude trend of the electroencephalogram signal, and the formula is as follows:

the standard deviation SD reflects the difference between the amplitude and the average value of each sample point, and the formula is as follows:

where x(n) is the electroencephalogram signal, N is the number of sampling points of x(n), and Mean is the average of the amplitudes of all sampling points in x(n).

Kurtosis Kur represents the peak level of the data frequency distribution curve, and is given by the formula:

Skewness Skew characterizes the degree of asymmetry of the electroencephalogram signal amplitude, and the formula is as follows:

The Hjorth parameters include Hjorth mobility and Hjorth complexity.

Hjorth mobility can be represented by the following equation:

Hjorth complexity can be represented by the following equation:

wherein

d_n = x(n) − x(n−1).
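The formulas for the time domain parameters are not reproduced in this text. For orientation only, the commonly used textbook definitions are written out below; the normalization by N (rather than N − 1), the non-excess form of the kurtosis, and the difference-based Hjorth definitions are assumptions and may differ from the original formulas.

```latex
\begin{align}
\mathrm{Mean} &= \frac{1}{N}\sum_{n=1}^{N} x(n), \\
\mathrm{SD} &= \sqrt{\frac{1}{N}\sum_{n=1}^{N}\bigl(x(n)-\mathrm{Mean}\bigr)^{2}}, \\
\mathrm{Kur} &= \frac{\frac{1}{N}\sum_{n=1}^{N}\bigl(x(n)-\mathrm{Mean}\bigr)^{4}}{\mathrm{SD}^{4}}, \\
\mathrm{Skew} &= \frac{\frac{1}{N}\sum_{n=1}^{N}\bigl(x(n)-\mathrm{Mean}\bigr)^{3}}{\mathrm{SD}^{3}}, \\
\mathrm{Mobility}(x) &= \sqrt{\frac{\operatorname{Var}(d)}{\operatorname{Var}(x)}},
\qquad d_n = x(n) - x(n-1), \\
\mathrm{Complexity}(x) &= \frac{\mathrm{Mobility}(d)}{\mathrm{Mobility}(x)}.
\end{align}
```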

The frequency domain characteristic parameters extracted in step S41 comprise the energy E_i of each of the two rhythm waves and the ratio R_i of each rhythm wave's energy to the total signal energy.

The rhythm waves are extracted by wavelet packet transform: a five-level wavelet decomposition of the electroencephalogram signal with the db6 wavelet function yields the β wave and the γ wave, denoted x₁(n) and x₂(n) respectively.

The energy E_i of each of the two rhythm waves is obtained from the following equation:

The total energy E_all of the signal is given by the following formula:

The energy ratio of each rhythm wave can then be calculated as R_i = E_i / E_all, i = 1, 2.
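The feature extraction of step S41 might be sketched as follows. Because the formulas are not reproduced in this text, the code uses common textbook definitions as assumptions: statistics normalized by N, non-excess kurtosis, Hjorth parameters computed from first and second differences, and energy taken as the sum of squared samples. The rhythm wave signal is assumed to have been extracted already (the wavelet packet step is not shown), and the function names are hypothetical.

```python
import numpy as np


def time_features(x):
    """Time domain features of one 1 s segment (textbook definitions assumed)."""
    x = np.asarray(x, dtype=float)
    mean = x.mean()
    sd = x.std()  # population standard deviation, divides by N
    centered = x - mean
    d1 = np.diff(x)   # first difference d_n = x(n) - x(n-1)
    d2 = np.diff(d1)  # second difference, needed for complexity
    mobility = np.sqrt(d1.var() / x.var())
    complexity = np.sqrt(d2.var() / d1.var()) / mobility
    return {
        "min": x.min(),
        "max": x.max(),
        "mean": mean,
        "sd": sd,
        "kurtosis": np.mean(centered ** 4) / sd ** 4,
        "skewness": np.mean(centered ** 3) / sd ** 3,
        "hjorth_mobility": mobility,
        "hjorth_complexity": complexity,
    }


def band_energy(x_i):
    """Energy E_i of a rhythm wave, assumed to be the sum of squared samples."""
    x_i = np.asarray(x_i, dtype=float)
    return float(np.sum(x_i ** 2))


def energy_ratio(x_i, x):
    """R_i = E_i / E_all, with E_all assumed to be the energy of x(n)."""
    return band_energy(x_i) / band_energy(x)


# Toy segment: N = 500 samples of a 10 Hz sine at 500 Hz sampling.
n = np.arange(500)
segment = np.sin(2 * np.pi * 10 * n / 500)
features = time_features(segment)
ratio = energy_ratio(0.5 * segment, segment)
```

Applying `time_features` to x(n), x₁(n), and x₂(n) and appending the two energies and two ratios would give one feature vector per segment.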

Step S42 is the training process of the random forest model. The random forest classifier comprises a plurality of decision trees, and its output class is determined by majority vote over the results of all the trees. Using the bootstrap resampling technique, M samples are repeatedly drawn at random, with replacement, from the M samples in the original training sample set to generate a new training sample set, and a random forest is then generated from M individual decision tree classifiers. In essence, the random forest classifier is an improvement on the decision tree algorithm: a plurality of decision trees are combined, and each tree is built from an independent, randomly drawn sample. Each tree in the forest has the same distribution, and the classification error depends on the classification ability of each tree and the correlation between the trees.

The data set is divided into a training set and a testing set, and the random forest model training process is described below with reference to fig. 5:

Step S421: first, sampling with replacement is carried out M times from the set of feature vectors to form a candidate set with the same number of samples as the original feature vector set.

Step S422: next, a certain number of features are randomly selected from the candidate features, and the optimal feature among them is chosen.

Step S423: based on the candidate feature set obtained in step S422, the optimal split of each node is calculated and the node is split, without pruning, until the impurity of each leaf node meets the specified requirement, forming a decision tree.

Step S424: and repeating the step S421 to the step S423 until all the decision trees stop growing, and generating a random forest.
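The two sampling operations in steps S421 and S422 can be illustrated in a few lines of NumPy. This sketch covers only the resampling, not the tree growing of step S423; the helper names and the toy data are hypothetical.

```python
import numpy as np


def bootstrap_sample(X, y, rng):
    """S421: draw len(X) samples with replacement from the training set."""
    n = len(X)
    idx = rng.integers(0, n, size=n)  # indices may repeat
    return X[idx], y[idx], idx


def candidate_feature_subset(n_features, n_candidates, rng):
    """S422: choose candidate feature indices without replacement."""
    return rng.choice(n_features, size=n_candidates, replace=False)


rng = np.random.default_rng(0)
X = np.arange(40, dtype=float).reshape(10, 4)  # 10 toy samples, 4 features
y = np.array([1, 2] * 5)                       # 1 = spike, 2 = no spike
X_boot, y_boot, idx = bootstrap_sample(X, y, rng)
feature_idx = candidate_feature_subset(n_features=4, n_candidates=2, rng=rng)
```

Each decision tree would be grown on its own `X_boot` and `y_boot`, considering only the candidate features when choosing a split.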

In step S43, the electroencephalogram data in the test set are input into the random forest model. After the decision trees vote, the spike detection result is obtained, which identifies the electroencephalogram segment containing the spike and, in turn, the channel and time point at which the spike occurs.

In step S5, the data in the test set are first subjected to adaptive template matching to obtain a spike detection result. The same test set is also input into the random forest model for classification to obtain a second spike detection result. The two results are then fused: a detection is considered an epileptic spike only if both methods detect a spike at the same time.
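The fusion rule amounts to a logical AND over 1 s segments and can be sketched as below. The representation of the template matching result as a list of sample indices, the mapping of each index to a segment by integer division, and the function name `fuse_detections` are assumptions for illustration; the labels 1 (spike) and 2 (no spike) follow step S41.

```python
def fuse_detections(template_hit_samples, rf_labels, fs=500, segment_seconds=1):
    """Return indices of segments flagged as spikes by both detectors."""
    seg_len = fs * segment_seconds
    # Map each template matching hit (a sample index) to its 1 s segment.
    template_segments = {int(s) // seg_len for s in template_hit_samples}
    # Segments the random forest labelled as class 1 (spike).
    rf_segments = {i for i, label in enumerate(rf_labels) if label == 1}
    # Keep only the segments on which both methods agree.
    return sorted(template_segments & rf_segments)


# Template matching hits in segments 0, 1 and 3; random forest flags 0 and 3.
fused = fuse_detections([120, 760, 1600], [1, 2, 2, 1])
```

Requiring agreement trades some sensitivity for fewer false positives, since a detection made by only one method is dropped.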

The above description of the embodiments is only intended to facilitate the understanding of the method of the invention and its core idea. It should be noted that, for those skilled in the art, it is possible to make various improvements and modifications to the present invention without departing from the principle of the present invention, and those improvements and modifications also fall within the scope of the claims of the present invention.

The previous description of the disclosed embodiments is provided to enable any person skilled in the art to make or use the present invention. Various modifications to these embodiments will be readily apparent to those skilled in the art, and the generic principles defined herein may be applied to other embodiments without departing from the spirit or scope of the invention. Thus, the present invention is not intended to be limited to the embodiments shown herein but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.
