Epileptic period classification method based on pulse group intelligent algorithm and combined with STFT-PSD and PCA

Document No.: 1619933; Publication date: 2020-01-14

Summary: This technology, an epileptic period classification method based on a pulse group intelligent algorithm and combined with STFT-PSD and PCA, was designed by 段立娟, 连召洋, 陈军成 and 乔元华 on 2019-09-30. A feature extraction and classification method for epileptic periods is disclosed. First, the raw epileptic electroencephalogram (EEG) data are randomly shuffled as preprocessing and divided into training and test sets for 4-fold cross-validation. Second, features are extracted from the preprocessed data with a combined method: nonlinear time-frequency features are extracted by WPT or STFT-PSD, and PCA is then applied to these time-frequency features to extract the principal component features of the EEG data and to remove noise and redundant features; the result is the final feature set. Finally, a pulse (spiking) neural network classifies the extracted features. The pulse neural network algorithm accounts for cooperation and information exchange between individuals and is therefore robust; in addition, the neurons it simulates are closer to real neurons in the brain, take more temporal information into account, and have greater computing power.

1. An epileptic period classification method based on a pulse group intelligent algorithm and combined with STFT-PSD and PCA, mainly comprising three parts: preprocessing, feature extraction, and feature classification. Protection is sought only for the overall framework and steps comprising preprocessing, feature extraction and feature classification in the field of epileptic period classification, and not for other application fields or for the methods adopted by each submodule. The method is characterized by comprising the following steps:

(1) electroencephalogram signal preprocessing

First, the acquired epileptic EEG data are randomly shuffled and normalized; then the first 75% of the samples are used as the training set and the last 25% as the test set, and 4-fold cross-validation is adopted;

(2) feature extraction and fusion

First, nonlinear time-frequency features are extracted from the preprocessed EEG data using STFT-PSD or WPT, and principal component features are then extracted from the obtained time-frequency features through PCA;

(3) feature classification: the linear features extracted by PCA are input into the SNM-CS classification model to complete the classification of the epileptic EEG signals.

2. The method for classifying epilepsy phase based on pulse group intelligence algorithm and combined with STFT-PSD and PCA as claimed in claim 1, wherein:

the time-frequency feature E is extracted through WPT, specifically as follows:

1) calculating the reconstruction coefficients c of the wavelet packet tree:

$c_{i,j_1} = F_{wprcoef}(X, i, n, n_t, i_{win}, a_1, b_1)$

where X is the preprocessed input EEG signal, i is the index of the EEG sample, n is the number of sampling points in each EEG sample, $n_t$ is the number of sampling points in a time window, $i_{win}$ is the index of the time window, $a_1$ is the decomposition level, $b_1$ is the name of the wavelet basis, and $j_1$ is a variable related to $a_1$ and $i_{win}$;

2) then calculating the sub-band mean variance coefficient E:

where i is the index of the EEG sample, $j_2$ is a variable related to $a_1$ and $i_{win}$, $i_k$ is the index of the sub-band, $a_2$ is the number of sub-bands, $F_{var}$ is the variance function, and E is the extracted time-frequency feature;

the time-frequency feature P is extracted through STFT-PSD, specifically as follows:

1) computing the DFT of the signal with the FFT:

$[Y, f] = F_{DFT}(X_{in}, n, n_{op}, n_{win}, n_{fft}, fs)$

where the matrix $X_{in}$ is a centered estimate of each EEG sample vector x with a suitable offset, n is the number of sampling points in each EEG sample, $n_{op}$ is the length of the window overlap, $n_{win}$ is the length of the sliding window, $n_{fft}$ is the number of points of the discrete Fourier transform (DFT), fs is the sampling frequency, f is the frequency vector formed from the sampling frequencies of each EEG sample, and Y is the matrix of the signal after the DFT;

2) then calculating the power spectral density (PSD) matrix $P_{xx}$:

$P_{xx} = F_{PSD}(Y, n_{fft}, f, n, n_{op}, n_{win}, fs)$

where $F_{PSD}$ is the function that computes the power spectral density matrix;

3) finally, flattening the m × n matrix $P_{xx}$ of each EEG sample into an mn-dimensional feature vector P containing time-frequency information.

3. The method for classifying epilepsy phase based on pulse group intelligence algorithm and combined with STFT-PSD and PCA as claimed in claim 1, wherein: the principal component feature of the PCA extraction is as follows:

1) obtaining the weight W and the eigenvalue λ by optimizing the objective function L(w):

$L(w) = w^{T} C w - \lambda (w^{T} w - 1)$

2) obtaining the low-dimensional coordinates Y from the weight W and the feature X:

$Y = XW$

where X is the time-frequency feature E extracted by WPT or the nonlinear time-frequency feature P extracted by STFT-PSD, and Y is the corresponding output linear feature; when X is E, X is 128-dimensional and Y is 8-dimensional; when X is P, X is 3999-dimensional and Y is 33-dimensional.

4. The method for classifying epilepsy phase based on pulse group intelligence algorithm and combined with STFT-PSD and PCA as claimed in claim 1, wherein: the process of the SNM-CS classification model is as follows

1) initializing the values of the individuals in the population, i.e., the weights W (which are also the candidate solutions, or positions, of the individuals), together with the lower limit $L_b$ and the upper limit $L_u$ of the weights;

while the maximum number of iterations has not been reached (t < iter) or the classification accuracy on the training data set is below the threshold, repeating steps 2) to 5);

2) Randomly selecting a cuckoo, and generating a new candidate solution, namely a new position, through a Levy flight mechanism, wherein the specific process is as follows:

the formula for generating the new position by the Levy flight mechanism is as follows

where α > 0 is the step scaling factor, 1 < λ ≤ 3, and the Levy random path is defined as follows:

Figure FDA0002222861460000022

where

Figure FDA0002222861460000023

where N(·) is the normal distribution, $\sigma^2$ is the variance of the normal distribution, and Γ(·) is the gamma function;

3) combining with the SNM to obtain the pulse firing rate and the fitness value, the pulse firing rate being calculated as follows:

the pulse neuron model can be used to solve recognition problems, producing similar pulse firing rates for samples of the same class,

the input current is defined as follows:

$I = \gamma \cdot x \cdot w$

where x is the input data of the classification model, i.e., the linear features extracted by PCA, w is the weight of the neuron in the SNM, and γ is an acceleration factor that assists neuron firing,

substituting the obtained input current I into the following formulas to calculate v and μ:

where C is the membrane capacitance, C = 100; v is the membrane voltage of the neuron; μ is the recovery variable; $v_r$ is the resting (reference) membrane voltage, $v_r = -60$; $v_t$ is the instantaneous threshold voltage; the coefficient k = 0.7; a is the scaling factor of the recovery variable; b is the sensitivity of the recovery variable; $v_c$ is the reset value of the membrane voltage; and ′ denotes differentiation;

calculating the pulse firing rate of all samples, specifically:

when $v > v_{peak}$, where $v_{peak}$ is a preset value, a pulse is fired, and the pulse firing rate of each sample is obtained by counting the number of pulses fired;

calculating the individual fitness (i.e., classification accuracy) of each sample, where the individual fitness $f_i$ of the i-th sample is expressed as follows:

$f_i = F_{fit}(v_{fire}, T)$

where $v_{fire}$ is the obtained pulse firing rate, T is the label value of the EEG data, and $F_{fit}$ is the function that computes the fitness, which is also the classification accuracy of the sample;

4) updating the current solution of the individual and the local optimal solution of the population:

if the fitness $f_i$ of the candidate solution in the current iteration is greater than the fitness $f_j$ of the candidate solution in the previous iteration, the candidate solution j of the individual is updated to i; if the fitness of the individual's candidate solution is also greater than that of the current local optimal solution, the local optimal solution of the population is updated to the candidate solution of the current individual;

5) discarding poor individuals and establishing new individuals:

with a certain probability $P_a$, poor individuals in the population, i.e., poor candidate solutions, are discarded and rebuilt; candidate solutions whose inter-individual similarity is greater than a threshold A are discarded, as are candidate solutions whose fitness is smaller than a threshold B;

6) obtaining the historical optimal solution of the population, and the classification accuracy of the epileptic EEG signals at that solution, from the local optimal solution of each iteration of the population.

Technical Field

The invention belongs to the technical field of electroencephalogram signal processing in medical diseases, and particularly relates to an epileptic period classification method based on a pulse group intelligent algorithm and combined with STFT-PSD and PCA.

Background

Epilepsy is a serious disorder of brain function that not only causes the patient physical suffering but also, to some extent, leads to mental and psychosocial disorders. Because epilepsy seriously damages the physical and mental health of the patient, detecting which period an epileptic EEG signal belongs to can help doctors diagnose the patient's condition.

Because the acquired EEG signal is a non-stationary signal that is highly random and whose waveform shows no visually obvious regularity, an effective feature extraction method is required to improve the classification accuracy. Because the frequency changes in epilepsy are pronounced, the power spectral density of the short-time Fourier transform (STFT-PSD) and the wavelet packet transform (WPT) can extract time-frequency information from the EEG signal through nonlinear transformation. However, the EEG signal may contain noise, and neither WPT nor STFT-PSD can effectively eliminate the noise and unimportant redundant features in it. Principal component analysis (PCA) can eliminate noise and unimportant features while preserving the important features in a low-dimensional space. Therefore, STFT-PSD or WPT is combined with PCA to extract the EEG features.

The extracted features are mainly used to classify epileptic EEG signals. Murugevel proposed a hierarchical combination of a support vector machine (MSVM) algorithm and an extreme learning machine (ELM) algorithm for classifying epileptic periods. However, the performance of conventional machine learning classification algorithms still needs to be improved, so selecting a suitable classification model is critical. The pulse group intelligent optimization algorithm is a classification algorithm that combines swarm intelligence optimization with a pulse (spiking) neuron model. A swarm intelligence optimization algorithm is a bionic random search algorithm without central control, so the overall solution of the problem is not affected when one or more individuals perform poorly. The cuckoo algorithm can solve optimization problems effectively by simulating the brood parasitism and Levy flight mechanisms of the cuckoo. In addition, epilepsy is associated with the firing of neuronal pulses in the brain. The pulse neuron model (SNM) takes temporal information into account, the neurons it simulates are closer to the neurons in the real human brain, and it has strong theoretical support in biomedicine. Therefore, the pulse group intelligent algorithm is used to classify the epileptic periods.

Disclosure of Invention

In view of the above background, the invention provides an epileptic period classification method based on a pulse group intelligent algorithm and combined with STFT-PSD and PCA, which improves the classification accuracy. For feature extraction, STFT-PSD or WPT is used to extract nonlinear time-frequency features, and PCA is used to extract principal component features and to eliminate noise and unimportant redundant features. For feature classification, the pulse group intelligent classification algorithm takes individual cooperation and information exchange fully into account and is therefore robust; it also takes more temporal information into account and has greater computing power.

In order to achieve the purpose, the invention adopts the following technical scheme:

An epileptic period classification method based on a pulse group intelligent algorithm and combined with STFT-PSD and PCA comprises the following steps:

Step (1) preprocessing of EEG signals

The acquired epileptic EEG data are randomly shuffled and normalized, and the training set and test set of each fold are divided by 4-fold cross-validation.

Step (2) feature extraction and fusion

Nonlinear time-frequency features are first extracted using STFT-PSD or WPT; PCA is then used to extract principal component features and to eliminate noise and unimportant redundant features.

a. Time-frequency feature extraction

Time-frequency features are extracted from the preprocessed EEG data through WPT or STFT-PSD.

Extraction of nonlinear time-frequency features through WPT:

1) The reconstruction coefficients c of the wavelet packet tree are calculated.

2) Then, the sub-band mean variance coefficient is calculated.

Extraction of nonlinear time-frequency features through STFT-PSD:

1) The DFT of the signal is calculated using the FFT.

2) Then, the auto-spectrum (PSD) is calculated.

3) The feature matrix is flattened into a feature vector.

b. Linear feature extraction

Principal component features are extracted from the obtained time-frequency features through PCA, and redundant noise is eliminated.

1) The weights are obtained by optimizing an objective function.

2) The low-dimensional coordinates are computed from the weights and the time-frequency features, giving the final features.

Step (3) feature classification

As regards the classification method, the pulse group intelligent algorithm takes individual cooperation and information exchange fully into account and has greater computing power. The pulse group intelligent optimization classification algorithm combines a cuckoo search (CS) algorithm with a Levy flight mechanism and a pulse (spiking) neuron model (SNM). The neurons simulated in the SNM are more realistic and take more temporal information into account. A neuron (i.e., an individual in the CS) is not activated every time, but only when its energy reaches a certain value. When an individual is activated, it fires a pulse and passes information to the other individuals in the CS, which decide whether to approach or move away from that individual according to the fitness value.

The process of the SNM-CS classification model is as follows:

1) N nests of the initial population are generated and the parameters are initialized.

2) One cuckoo is randomly selected and a new candidate solution is generated by the Levy flight mechanism.

3) The pulse firing rate is obtained in combination with the SNM.

4) The candidate solutions are updated.

5) Some poor individuals are discarded and new individuals are established.

6) The historical optimal solution of the population is updated and found, and is converted into the classification accuracy of the epileptic EEG signals.

Drawings

FIG. 1 is a flow diagram of the method according to the present invention.

Detailed Description

The invention is further described with reference to the accompanying drawings and the detailed description.

The method comprises the following steps:

(1) Preprocessing of the EEG signals

First, the acquired epileptic EEG data are randomly shuffled and normalized. Then the first 75% of the samples are used as the training set and the last 25% as the test set, and 4-fold cross-validation is used.
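The preprocessing step can be illustrated with a minimal sketch. The text does not state which normalization is used, so per-sample min-max scaling is an assumption here, and the function names `min_max_normalize` and `four_fold_splits` are hypothetical. Each fold holds out a different 25% of the shuffled samples and trains on the remaining 75%.

```python
import random


def min_max_normalize(sample):
    """Scale one EEG sample to [0, 1] (assumed normalization; constant samples map to 0)."""
    lo, hi = min(sample), max(sample)
    span = hi - lo
    return [(x - lo) / span if span else 0.0 for x in sample]


def four_fold_splits(samples, labels, seed=0):
    """Shuffle once, normalize, then yield (train, test) pairs for 4 folds."""
    order = list(range(len(samples)))
    random.Random(seed).shuffle(order)
    data = [(min_max_normalize(samples[i]), labels[i]) for i in order]
    fold_size = len(data) // 4
    for k in range(4):
        start = k * fold_size
        # The last fold absorbs any remainder when len(data) is not divisible by 4.
        stop = (k + 1) * fold_size if k < 3 else len(data)
        test = data[start:stop]
        train = data[:start] + data[stop:]
        yield train, test
```

With 8 samples, for instance, each fold would contain 6 training and 2 test samples.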

(2) Combined feature extraction

a. Nonlinear time-frequency feature extraction

The time-frequency feature E is extracted through WPT as follows:

1) The reconstruction coefficients c of the wavelet packet tree are calculated:

$c_{i,j_1} = F_{wprcoef}(X, i, n, n_t, i_{win}, a_1, b_1)$

where X is the preprocessed input EEG signal; i is the index of the EEG sample; n is the number of sampling points in each EEG sample, n = 4096; $n_t$ is the number of sampling points in a time window, $n_t = 32$; $i_{win}$ is the index of the time window; $a_1$ is the decomposition level, $a_1 = 3$; $b_1$ is the name of the wavelet basis, $b_1$ = dmey; and $j_1$ is a variable related to $a_1$ and $i_{win}$.

2) Then, the sub-band mean variance coefficient E is calculated:

Figure BDA0002222861470000031

where $a_2$ is the number of sub-bands,

Figure BDA0002222861470000032

$j_2$ is a variable related to $a_1$ and $i_{win}$, $1 \le j_2 \le 128$; $i_k$ is the index of the sub-band; $F_{var}$ is the variance function; and the time-frequency feature vector E is 128-dimensional.
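As a rough illustration of how a 4096-point sample with 32-point windows can yield a 128-dimensional feature (4096 / 32 = 128), the sketch below computes one value per window: the mean, over the sub-bands of a level-3 wavelet packet tree, of the coefficient variance. Several things here are assumptions and not the method of the text: a Haar filter replaces the dmey basis, decomposition coefficients are used instead of the reconstruction coefficients returned by the wprcoef-style function, and the aggregation is only one plausible reading of "sub-band mean variance", since the original formula is not reproduced. All function names are hypothetical.

```python
import math
import statistics


def haar_split(signal):
    """One Haar analysis step: return the (approximation, detail) halves."""
    s = 1.0 / math.sqrt(2.0)
    approx = [(signal[i] + signal[i + 1]) * s for i in range(0, len(signal) - 1, 2)]
    detail = [(signal[i] - signal[i + 1]) * s for i in range(0, len(signal) - 1, 2)]
    return approx, detail


def wavelet_packet_leaves(signal, level):
    """Full wavelet packet tree: split every node, giving 2**level sub-bands."""
    nodes = [signal]
    for _ in range(level):
        nodes = [half for node in nodes for half in haar_split(node)]
    return nodes


def window_features(sample, n_t=32, level=3):
    """One feature per non-overlapping window: mean sub-band variance (assumed form)."""
    features = []
    for start in range(0, len(sample) - n_t + 1, n_t):
        leaves = wavelet_packet_leaves(sample[start:start + n_t], level)
        variances = [statistics.pvariance(leaf) for leaf in leaves]
        features.append(statistics.fmean(variances))
    return features
```

A real implementation would more likely rely on a wavelet library that provides the discrete Meyer basis; the sketch only shows the window and sub-band bookkeeping.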

The nonlinear time-frequency feature P is extracted through STFT-PSD as follows:

1) The DFT of the signal is calculated using the FFT:

$[Y, f] = F_{DFT}(X_{in}, n, n_{op}, n_{win}, n_{fft}, fs)$

where the matrix $X_{in}$ is a centered estimate of each EEG sample vector x with a suitable offset; n is the number of sampling points in each EEG sample; $n_{op}$ is the length of the window overlap, $n_{op} = 128$; $n_{win}$ is the length of the sliding window, $n_{win} = 256$; $n_{fft}$ is the number of points of the discrete Fourier transform (DFT), $n_{fft} = 256$; fs is the sampling frequency, fs = 128; f is the frequency vector formed from the sampling frequencies of each EEG sample; and Y is the matrix of the signal after the DFT.

2) Then, the power spectral density (PSD) matrix $P_{xx}$ is calculated:

$P_{xx} = F_{PSD}(Y, n_{fft}, f, n, n_{op}, n_{win}, fs)$

where $P_{xx}$ is the power spectral density matrix and $F_{PSD}$ is the function that computes it.

3) Finally, the 129 × 31 matrix $P_{xx}$ of each EEG sample is flattened into a 3999-dimensional feature vector P.
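The 129 × 31 shape follows from the stated parameters: a one-sided spectrum of a 256-point DFT has 256 / 2 + 1 = 129 frequency bins, and a 4096-point sample cut into 256-point windows with a 128-point overlap gives (4096 − 128) / (256 − 128) = 31 segments, so 129 × 31 = 3999. The sketch below checks this arithmetic and computes a small PSD vector with a naive DFT. The Hann window, the periodogram scaling, the flattening order, and the function names are assumptions; the text does not specify them.

```python
import cmath
import math


def stft_psd_shape(n, n_win, n_op, n_fft):
    """Return (frequency bins, time segments) of a one-sided spectrogram."""
    hop = n_win - n_op
    return n_fft // 2 + 1, (n - n_op) // hop


def segment_psd(segment, fs):
    """One-sided PSD of one Hann-windowed segment via a naive DFT."""
    n = len(segment)
    window = [0.5 - 0.5 * math.cos(2.0 * math.pi * i / n) for i in range(n)]
    scale = fs * sum(w * w for w in window)
    psd = []
    for k in range(n // 2 + 1):
        y = sum(x * w * cmath.exp(-2j * math.pi * k * i / n)
                for i, (x, w) in enumerate(zip(segment, window)))
        power = abs(y) ** 2 / scale
        # Double the interior bins to fold in the negative frequencies.
        if 0 < k < n / 2:
            power *= 2.0
        psd.append(power)
    return psd


def stft_psd_vector(sample, fs, n_win, n_op):
    """Concatenate the PSD of every overlapping segment into one feature vector."""
    hop = n_win - n_op
    n_seg = (len(sample) - n_op) // hop
    columns = [segment_psd(sample[s * hop:s * hop + n_win], fs) for s in range(n_seg)]
    return [p for column in columns for p in column]
```

The naive DFT is quadratic in the window length and is used only to keep the sketch dependency-free; an FFT routine would normally be used instead.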

b. Extraction of the linear feature Y

1) The weight W and the eigenvalue λ are obtained by optimizing the objective function L(w):

$L(w) = w^{T} C w - \lambda (w^{T} w - 1)$

The output dimension of the PCA is selected according to the accumulative contribution rate (ACR), whose value is determined by the computed eigenvalues λ. After WPT, when the dimension is 8, i.e., when the eigenvalues λ of the first 8 dimensions and the corresponding weights W are selected, the ACR is 98.02%, which is higher than 98%. After STFT-PSD, when the dimension is 33, the ACR is 98.145%, which is higher than 98%.

2) The low-dimensional coordinates Y are obtained from the weight W and the feature X:

$Y = XW$

where X is the time-frequency feature E extracted by WPT or the nonlinear time-frequency feature P extracted by STFT-PSD, and Y is the corresponding output linear feature. When X is E, X is 128-dimensional and Y is 8-dimensional; when X is P, X is 3999-dimensional and Y is 33-dimensional.
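The ACR rule can be sketched as follows, assuming the usual PCA setting in which C is the covariance matrix of the centered features and the ACR of the first k components is the sum of the k largest eigenvalues divided by the sum of all eigenvalues. The function names and the example eigenvalues are hypothetical; computing the eigen-decomposition itself is left to a linear algebra library.

```python
def select_dim_by_acr(eigenvalues, threshold=0.98):
    """Return (k, acr): the smallest k whose top-k eigenvalues reach the threshold."""
    ordered = sorted(eigenvalues, reverse=True)
    total = sum(ordered)
    running = 0.0
    for k, lam in enumerate(ordered, start=1):
        running += lam
        if running / total >= threshold:
            return k, running / total
    return len(ordered), 1.0


def project(X, W):
    """Y = XW for nested lists: X is (samples x d), W is (d x k)."""
    return [[sum(x * w for x, w in zip(row, col)) for col in zip(*W)] for row in X]
```

For illustrative eigenvalues 5, 3, 1.5, 0.4 and 0.1, the cumulative rates are 50%, 80%, 95% and 99%, so four components would be kept at a 98% threshold. In the document the same rule gives 8 components after WPT and 33 after STFT-PSD.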

(3) feature classification

The linear features extracted by PCA are input into the SNM-CS classification model to complete the classification of the epileptic EEG signals.

The process of the SNM-CS classification model is as follows:

1) N individuals of the initial population are generated, and the parameters are initialized.

The values of the individuals in the population (i.e., the weights W, which are also the candidate solutions, or positions, of the individuals) are initialized, where N = 40, the lower limit of the weights is $L_b = -20$, and the upper limit of the weights is $L_u = 20$.

While the maximum number of iterations has not been reached (t < iter) or the classification accuracy on the training data set is below the threshold, steps 2) to 5) are repeated.

2) A cuckoo is randomly selected, and a new candidate solution, i.e., a new position, is generated through the Levy flight mechanism, as follows:

The formula for generating a new position by the Levy flight mechanism is as follows:

Figure BDA0002222861470000051

where α > 0 is the step scaling factor, α = 1, λ = 1.2, and the Levy random path is defined as follows:

Figure BDA0002222861470000052

Figure BDA0002222861470000053

where N(·) is the normal distribution, $\sigma^2$ is the variance of the normal distribution, and Γ(·) is the gamma function.
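Because the equation images are not reproduced here, the following is only a sketch of one common way to draw Levy-distributed steps from normal variates and the gamma function, namely Mantegna's algorithm. Whether the document uses exactly this form, and how its exponent λ maps to the index `beta` below, are assumptions; the default `beta=1.5` is an arbitrary illustrative value, and the function names are hypothetical. The bounds of -20 and 20 are the weight limits stated earlier.

```python
import math
import random


def mantegna_sigma(beta):
    """Standard deviation of the numerator variate in Mantegna's algorithm."""
    num = math.gamma(1.0 + beta) * math.sin(math.pi * beta / 2.0)
    den = math.gamma((1.0 + beta) / 2.0) * beta * 2.0 ** ((beta - 1.0) / 2.0)
    return (num / den) ** (1.0 / beta)


def levy_step(beta, rng):
    """Draw one heavy-tailed step: u / |v|**(1/beta), u ~ N(0, sigma^2), v ~ N(0, 1)."""
    u = rng.gauss(0.0, mantegna_sigma(beta))
    v = rng.gauss(0.0, 1.0)
    return u / abs(v) ** (1.0 / beta)


def levy_move(position, alpha=1.0, beta=1.5, lower=-20.0, upper=20.0, rng=None):
    """New candidate: old position plus alpha times a Levy step, clipped to the bounds."""
    rng = rng or random.Random()
    return [min(upper, max(lower, x + alpha * levy_step(beta, rng))) for x in position]
```

Most steps drawn this way are small, with occasional long jumps, which is what lets the search escape local optima.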

3) The pulse firing rate and the fitness value are obtained in combination with the SNM. The pulse firing rate is calculated as follows:

The pulse neuron model can be used to solve recognition problems, producing similar pulse firing rates for samples of the same class.

The input current is defined as follows:

$I = \gamma \cdot x \cdot w$

where x is the input data of the classification model, i.e., the linear features extracted by PCA; w is the weight of the neuron in the SNM; and γ is an acceleration factor that assists neuron firing, γ = 100.

The obtained input current I is substituted into the following formulas to calculate v and μ:

Figure BDA0002222861470000054

Figure BDA0002222861470000055

where C is the membrane capacitance, C = 100; v is the membrane voltage of the neuron; μ is the recovery variable; $v_r$ is the resting (reference) membrane voltage, $v_r = -60$; $v_t$ is the instantaneous threshold voltage, $v_t = -40$; the coefficient k = 0.7; a is the scaling factor of the recovery variable, a = 0.03; b is the sensitivity of the recovery variable, b = −2; $v_c$ is the reset value of the membrane voltage, $v_c = -50$; ′ denotes differentiation; and $U_d$ is the increment added to the recovery variable after a pulse is fired, $U_d = 100$. When $v > v_{peak}$, with $v_{peak} = 35$, v is set to $v_c$, μ is set to $\mu + U_d$, and a pulse is fired. The pulse firing rate $v_{fire}$ and the individual fitness (i.e., classification accuracy) $f_i$ are given by the following functions:

$v_{fire} = F_{fire}(\mu, v, v_{peak})$

$f_i = F_{fit}(v_{fire}, T)$

where $v_{fire}$ is the obtained pulse firing rate; $F_{fire}$ is the function that computes the pulse firing rate of an individual from the number of pulses fired per unit time; T is the label value of the EEG data; and $F_{fit}$ is the function that computes the fitness: according to its pulse firing rate $v_{fire}$, each individual is assigned to a class, the result is compared with the label value T, and the classification accuracy, i.e., the fitness of the individual, is calculated. The obtained individual fitness $f_i$ is also the classification accuracy of the sample.
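The listed parameter values (C = 100, k = 0.7, v_r = −60, v_t = −40, a = 0.03, b = −2, reset to −50, increment 100, peak 35) coincide with those of the regular-spiking neuron in Izhikevich's simple model, so the sketch below assumes that model's equations, C·v′ = k(v − v_r)(v − v_t) − μ + I and μ′ = a(b(v − v_r) − μ), because the equation images are not reproduced. The forward-Euler step, the 1000 ms window, the mapping from firing rate to class through fixed boundaries, and all function names are hypothetical.

```python
import bisect


def input_current(x, w, gamma=100.0):
    """I = gamma * (x . w) for one PCA feature vector x and one weight vector w."""
    return gamma * sum(xi * wi for xi, wi in zip(x, w))


def count_pulses(current, t_ms=1000.0, dt=0.25, C=100.0, k=0.7, v_r=-60.0,
                 v_t=-40.0, a=0.03, b=-2.0, v_c=-50.0, u_d=100.0, v_peak=35.0):
    """Integrate the assumed neuron equations with forward Euler and count pulses."""
    v, u = v_r, 0.0
    pulses = 0
    for _ in range(int(t_ms / dt)):
        dv = (k * (v - v_r) * (v - v_t) - u + current) / C
        du = a * (b * (v - v_r) - u)
        v += dt * dv
        u += dt * du
        if v > v_peak:
            # A pulse is fired: reset the voltage and bump the recovery variable.
            v, u = v_c, u + u_d
            pulses += 1
    return pulses


def fitness(rates, labels, boundaries):
    """Accuracy after mapping each firing rate to a class index (assumed mapping)."""
    predicted = [bisect.bisect_right(boundaries, r) for r in rates]
    hits = sum(p == t for p, t in zip(predicted, labels))
    return hits / len(labels)
```

With zero input current the neuron stays at rest and fires no pulse, while larger currents produce more pulses in the same window; a firing-rate-based classifier depends on this monotonic response.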

4) The current solution of the individual and the local optimal solution of the population are updated.

If the fitness $f_i$ of the candidate solution in the current iteration is greater than the fitness $f_j$ of the candidate solution in the previous iteration, the candidate solution j of the individual is updated to i; if the fitness of the individual's candidate solution is also greater than that of the current local optimal solution, the local optimal solution of the population is updated to the candidate solution of the current individual.

5) Some poor individuals are discarded and new individuals are established.

With a certain probability $P_a$, $P_a = 0.35$, the population discards poor individuals, i.e., poor candidate solutions, and rebuilds them. Candidate solutions whose inter-individual similarity is greater than a threshold are discarded, as are candidate solutions whose fitness is smaller than a threshold.

6) The historical optimal solution of the population, and the classification accuracy of the epileptic EEG signals at that solution, are obtained from the local optimal solution of each iteration of the population.
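The six steps can be put together in a compact sketch of a cuckoo search loop. It is not the SNM-CS implementation: the fitness function is passed in as a black box (in the method it would be the classification accuracy obtained through the SNM), the Levy step uses Mantegna's algorithm with an assumed index `beta`, only nests in the worse half are rebuilt with probability `p_a`, and the similarity and fitness thresholds are omitted. N = 40, the bounds of ±20, α = 1 and P_a = 0.35 are taken from the text; every name is hypothetical.

```python
import math
import random


def levy_step(rng, beta=1.5):
    """Mantegna-style heavy-tailed step (assumed form of the Levy flight)."""
    sigma = (math.gamma(1 + beta) * math.sin(math.pi * beta / 2)
             / (math.gamma((1 + beta) / 2) * beta * 2 ** ((beta - 1) / 2))) ** (1 / beta)
    return rng.gauss(0.0, sigma) / abs(rng.gauss(0.0, 1.0)) ** (1 / beta)


def cuckoo_search(fitness, dim, n=40, iters=200, p_a=0.35, alpha=1.0,
                  lower=-20.0, upper=20.0, seed=0):
    """Maximize `fitness`; return (best weights, best fitness, best-so-far history)."""
    rng = random.Random(seed)

    def new_nest():
        return [rng.uniform(lower, upper) for _ in range(dim)]

    # Step 1: initial population.
    nests = [new_nest() for _ in range(n)]
    scores = [fitness(w) for w in nests]
    top = max(range(n), key=lambda j: scores[j])
    best_w, best_f = nests[top][:], scores[top]
    history = []
    for _ in range(iters):
        # Steps 2-4: move one random individual and keep the move if it is fitter.
        i = rng.randrange(n)
        cand = [min(upper, max(lower, x + alpha * levy_step(rng))) for x in nests[i]]
        f = fitness(cand)
        if f > scores[i]:
            nests[i], scores[i] = cand, f
        # Step 5: rebuild below-median nests with probability p_a.
        median = sorted(scores)[n // 2]
        for j in range(n):
            if scores[j] < median and rng.random() < p_a:
                nests[j] = new_nest()
                scores[j] = fitness(nests[j])
        # Step 6: track the historical best of the population.
        top = max(range(n), key=lambda j: scores[j])
        if scores[top] > best_f:
            best_w, best_f = nests[top][:], scores[top]
        history.append(best_f)
    return best_w, best_f, history
```

Calling `cuckoo_search(lambda w: -sum(x * x for x in w), dim=3)` exercises the loop on a toy objective whose maximum lies at the origin.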

To verify the stability of the algorithm, the algorithm was run 20 times, each time calculating the average accuracy of 4-fold cross-validation, and then the highest accuracy and average accuracy of these 20 times were recorded.

Comparative analysis

TABLE 1 comparison of the accuracy of the method of the present invention with the mainstream electroencephalogram classification algorithm

Figure BDA0002222861470000061

The performance of the feature extraction and classification algorithms was verified in comparative experiments on the public data set of the University of Bonn, Germany. The experimental combinations are shown in Table 1. Max_acc denotes the highest, and Avg_acc the average, of the 4-fold mean classification accuracies obtained over 20 runs of the algorithm. The second column gives the feature extraction method and the third column gives the classification model. On the raw data, the ELM algorithm reaches an average classification accuracy of 79.44% and a highest classification accuracy of 88%. With ELM as the classification model and DFA, AE and HE as feature extraction, the average classification accuracies are 82%, 88% and 88%, respectively. With WPT as feature extraction and HELM and KHELM as classification models, the average classification accuracies are 90.32% and 93.68%, and the highest classification accuracies are 94% and 98%. With STFT-PSD as feature extraction and HELM and KHELM as classification models, the average classification accuracies are 93.56% and 94.36%, and the highest classification accuracy is 98%. With Nonlinear Features as feature extraction and GMM as the classification model, the average classification accuracy is 95%. With EMD as feature extraction and C4.5 as the classification model, the average classification accuracy is 95.3%. With FLP as feature extraction and Kernel SVM as the classification model, the average classification accuracy is 95.33%. With WPE as feature extraction and SVM as the classification model, the average classification accuracy is 96.5%. With DTCTWT as feature extraction and GRNN as the classification model, the average classification accuracy is 98.00%. With FE-ESN as feature extraction and ELM as the classification model, the highest classification accuracy is 98.3%. As shown in Table 1, the epileptic period classification method of the invention extracts nonlinear time-frequency features using WPT or STFT-PSD, then extracts principal component features with PCA and eliminates noise, and finally improves the classification accuracy by combining these features with the SNM-CS classification method. The highest classification accuracy of WPT + PCA or STFT-PSD + PCA combined with SNM-CS reaches 100%, and the average classification accuracies reach 98.53% and 98.95%, respectively. Compared with the other methods, the combined feature extraction and classification method achieves both the highest average and the highest maximum classification accuracy.
