Electroencephalogram signal identification method

Document No.: 753116. Publication date: 2021-04-06.

Abstract: Electroencephalogram signal identification method, designed by 李升� 陈家锐 陈宝琴 杨培浩 陈炳材 张军 on 2020-12-21. The invention discloses an electroencephalogram signal identification method. First, electroencephalogram signals are acquired and the acquired data are divided into a training set and a test set. The signals are acquired with a P300 brain-computer interface, and the electroencephalogram signals measured while a person looks at the corresponding characters serve as the data source. The signals are then preprocessed, the preferred signal channels are selected, and a generative adversarial network is built to analyze and identify the electroencephalogram signals. The method addresses the low recognition rate of electroencephalogram P300 signals and the low robustness of signal recognition.

1. An electroencephalogram signal identification method, characterized by comprising the following steps:

S1, acquiring electroencephalogram signals, the acquired data being divided into a training set and a test set;

S2, preprocessing the electroencephalogram signals;

S3, selecting the preferred signal channels;

S4, building a generative adversarial network, and analyzing and identifying the electroencephalogram signals with the generative adversarial network.

2. The electroencephalogram signal identification method according to claim 1, characterized in that: in step S1, the electroencephalogram signals are acquired with a P300 brain-computer interface, and the electroencephalogram signals measured while a person looks at the corresponding character serve as the data source; each subject is tested for several rounds per character, each round comprises 12 flashes, and 2 of the 12 flashes stimulate the subject to produce a P300 electroencephalogram signal.

3. The electroencephalogram signal identification method according to claim 2, characterized in that: in step S2, the signal preprocessing comprises the following steps:

S2.1, obtaining 3000 consecutive sample values of the data collected by the P300 brain-computer interface and removing the first 250 meaningless samples;

S2.2, processing the data with a low-pass filter of order 8;

S2.3, normalizing the data of each channel to form the waveforms of the 20 channels;

S2.4, superposing and averaging the waveforms of the 20 channels to obtain a single waveform.

4. The electroencephalogram signal identification method according to claim 3, characterized in that: every row and every column flashes once in each round of measurement, the rows being grouped into one class (class A) and the columns into another class (class B); the time range of the P300 electroencephalogram signal of the subjects is determined by analyzing the 12 known characters, the waveforms of the six rows and six columns of one round of flashes are plotted, and the row and the column most likely to contain the P300 electroencephalogram signal within 500 ms are found.

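5. The electroencephalogram signal identification method according to claim 1, characterized in that: in step S3, the optimal channels are selected by using the Pearson correlation coefficient to study the degree of linear correlation between the channels of each subject, defined by the following formula:

$$ r_{xy} = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum_{i=1}^{n} (x_i - \bar{x})^2}\,\sqrt{\sum_{i=1}^{n} (y_i - \bar{y})^2}} $$

wherein x and y are two different channels, $x_i$ and $y_i$ are their i-th samples, $\bar{x}$ and $\bar{y}$ are their means, and n is the number of samples.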

6. The electroencephalogram signal identification method according to claim 5, characterized in that: the training data set is substituted into the formula to obtain the correlation coefficients among all channels; the correlations are displayed with a heat map, from which the correlated channels are identified.

7. The electroencephalogram signal identification method according to claim 1, characterized in that: in step S4, the generative adversarial network comprises a supervised classification model, an unsupervised discriminant model and a generative model; the generative model is used to generate pseudo data, and the supervised classification model and the unsupervised discriminant model share the weights of the hidden layers; the hidden layers consist of 3 fully connected layers activated by the ReLU function.

8. The electroencephalogram signal identification method according to claim 7, characterized in that: the output of the generative model has the same format as the training data, the hidden layers of the generative model consist of two fully connected layers, and each of the two fully connected layers has 600 neurons.

9. The electroencephalogram signal identification method according to claim 8, characterized in that: the model is initialized, the parameter initialization method being standardized Kaiming initialization, and the model is trained iteratively, the iterative training comprising the following steps:

A. training the supervised classification model separately using the labeled samples;

B. training the unsupervised discriminant model using the unlabeled samples and the pseudo data samples;

C. fixing the weights of the supervised classification model and the unsupervised discriminant model, and training the generative adversarial network model with pseudo data samples so as to improve the reliability of the generative model;

D. after sufficient training, applying the obtained supervised classification model to the classification of new, unlabeled data;

E. the supervised classification model assigns labels to the sample data, that is, it distinguishes positive samples from negative samples; the unsupervised discriminant model distinguishes whether a sample comes from the unlabeled real data of the data set or is a pseudo data sample produced by the generative model; the input and hidden layers of the supervised classification model and the unsupervised discriminant model are the same, but their outputs differ.

Technical Field

The invention relates to the technical field of electroencephalogram signals, in particular to an electroencephalogram signal identification method.

Background

The brain is the center of higher nervous activity in the human body. It has hundreds of millions of neurons, which transmit and process information through their interconnections. According to how they are generated, electroencephalogram signals can be divided into evoked and spontaneous electroencephalogram signals. The P300 event-related potential is an evoked electroencephalogram signal. As an endogenous component, it is not influenced by the physical characteristics of the stimulus; it is related to perceptual or cognitive activity and closely tied to processes such as attention, memory and intelligence. Electroencephalogram signals recorded during sleep are spontaneous electroencephalogram signals. They reflect changes in the state of the body and are an important aid for evaluating sleep quality and for diagnosing and treating sleep-related diseases.

The existing electroencephalogram detection method has the problems of low recognition rate, low accuracy and low recognition speed.

Disclosure of Invention

The object of the invention is to provide, in view of the existing problems, an electroencephalogram signal identification method that addresses the low recognition rate of electroencephalogram P300 signals and the low robustness of signal recognition.

The invention adopts the following specific technical scheme:

An electroencephalogram signal identification method, characterized by comprising the following steps:

S1, acquiring electroencephalogram signals, the acquired data being divided into a training set and a test set. The signals are acquired with a P300 brain-computer interface, and the electroencephalogram signals measured while a person looks at the corresponding character serve as the data source. Each subject is tested for several rounds per character; each round comprises 12 flashes, 2 of which stimulate the subject to produce a P300 electroencephalogram signal. Every row and every column flashes once in each round, the rows being grouped into one class (class A) and the columns into another class (class B). The time range of the P300 electroencephalogram signal of the subjects is determined by analyzing the 12 known characters, the waveforms of the six rows and six columns of one round of flashes are plotted, and the row and the column most likely to contain the P300 electroencephalogram signal within 500 ms are found.

S2, preprocessing the electroencephalogram signals. In step S2, the signal preprocessing comprises the following steps:

S2.1, obtaining 3000 consecutive sample values of the data collected by the P300 brain-computer interface and removing the first 250 meaningless samples;

S2.2, processing the data with a low-pass filter of order 8;

S2.3, normalizing the data of each channel to form the waveforms of the 20 channels;

S2.4, superposing and averaging the waveforms of the 20 channels to obtain a single waveform.

S3, selecting the preferred signal channels. The optimal channels are selected by using the Pearson correlation coefficient to study the degree of linear correlation between the channels of each subject, defined by the following formula:

$$ r_{xy} = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum_{i=1}^{n} (x_i - \bar{x})^2}\,\sqrt{\sum_{i=1}^{n} (y_i - \bar{y})^2}} $$

wherein x and y are two different channels, $x_i$ and $y_i$ are their i-th samples, $\bar{x}$ and $\bar{y}$ are their means, and n is the number of samples. The training data set is substituted into the formula to obtain the correlation coefficients among all channels; the correlations are displayed with a heat map, from which the correlated channels are identified.

S4, building a generative adversarial network, and analyzing and identifying the electroencephalogram signals with the generative adversarial network. In step S4, the generative adversarial network comprises a supervised classification model, an unsupervised discriminant model and a generative model. The generative model is used to generate pseudo data, and the supervised classification model and the unsupervised discriminant model share the weights of the hidden layers; the hidden layers consist of 3 fully connected layers activated by the ReLU function. The output of the generative model has the same format as the training data, the hidden layers of the generative model consist of two fully connected layers, and each of the two fully connected layers has 600 neurons.

The model is initialized, the parameter initialization method being standardized Kaiming initialization, and the model is trained iteratively. The iterative training comprises the following steps:

A. training the supervised classification model separately using the labeled samples;

B. training the unsupervised discriminant model using the unlabeled samples and the pseudo data samples;

C. fixing the weights of the supervised classification model and the unsupervised discriminant model, and training the generative adversarial network model with pseudo data samples so as to improve the reliability of the generative model;

D. after sufficient training, applying the obtained supervised classification model to the classification of new, unlabeled data;

E. the supervised classification model assigns labels to the sample data, that is, it distinguishes positive samples from negative samples; the unsupervised discriminant model distinguishes whether a sample comes from the unlabeled real data of the data set or is a pseudo data sample produced by the generative model; the input and hidden layers of the supervised classification model and the unsupervised discriminant model are the same, but their outputs differ.

In summary, due to the adoption of the technical scheme, the invention has the beneficial effects that:

1. For character recognition, the sample data are first screened, and then 8th-order low-pass filtering, data normalization and superposition averaging are applied. Filtering removes part of the noise. After normalization and superposition averaging, a waveform is plotted and analyzed for each character test of each subject, the row number and column number most likely to contain a P300 signal are extracted, and the most likely character is determined by combining the results of all subjects. Verified on the given characters char13 to char17, the method reaches an accuracy of about 60%.

2. For channel extraction, correlation coefficients are used to determine the correlation between each channel and the other channels of every subject, and the result is visualized with a heat map. A channel that is negatively correlated (correlation coefficient below 0) with other channels more than 10 times is rejected. The data of subject S4 contain errors and are not analyzed. The calculation yields representative channels for subjects S1, S2, S3 and S5, and their intersection is taken as the optimal channel combination, 11 channels in total: ['T7', 'T8', 'CP3', 'CP4', 'CP5', 'CP6', 'P4', 'P8', 'Oz', 'O1', 'O2'].

3. Based on the optimal channel combination, the raw electroencephalogram data are preprocessed and sliced to obtain a training sample set, part of which is used as labeled samples and the rest as unlabeled samples to train an SGAN model. With S5_test as validation data, the model reaches an average accuracy of 15%, and the remaining characters are then predicted. The semi-supervised generative adversarial network needs only a small number of labeled samples for training, which reduces training time and improves the robustness of the model.

4. The method addresses the low recognition rate of electroencephalogram P300 signals and the low robustness of signal recognition.

Drawings

FIG. 1 is a schematic diagram of a display used in the data acquisition of the present invention.

Fig. 2 is a waveform diagram of the present invention after the 20 channels are superposed and averaged.

Fig. 3 is a diagram of the superposed waveforms of the 6 rows for the 8th character of subject 1 according to the present invention.

Fig. 4 is a diagram of the superposed waveforms of the 6 columns for the 8th character of subject 1 according to the present invention.

Fig. 5 is a SGAN flow chart of the present invention.

FIG. 6 is a P300 signal diagram of the present invention.

Fig. 7 is a P300-free signal diagram of the present invention.

The labels in the figure are: the device comprises a base 1, a spring 2, a vibrating motor 3, a frame 4, a material discarding barrel 5, a stepping motor 6, a motor base 7, a coupler 8, a discharge pipe 9, a transmission shaft 10, a bearing 11, a dividing cylinder 12, a dividing ratio adjusting blade 13, a feeding bottom plate 14, a feeding middle plate 15, a speed reducing net 16, a speed reducing plate 17, a feeding top plate 18, a material stopping plate 19 and a hopper 20.

Detailed Description

The present invention will be described in detail below with reference to the accompanying drawings.

In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the invention and are not intended to limit the invention. In the description of the present invention, it should be noted that, unless otherwise explicitly stated or limited, the terms "mounted" and "connected" are to be interpreted broadly, and may be, for example, fixedly connected, detachably connected, or integrally connected; either directly or indirectly through intervening media, either internally or in any combination thereof. It is to be understood that the embodiments described are only a few embodiments of the present invention, and not all embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.

Example 1:

In one embodiment of the electroencephalogram signal identification method, electroencephalogram signals are acquired first. The experimental data are recorded with a P300 brain-computer interface from 5 healthy adult subjects (S1 to S5). Each subject watches a character matrix of 36 characters arranged in 6 rows and 6 columns, as shown in Fig. 1. The acquisition device records the electroencephalogram data of the subject at a sampling rate of 250 Hz. The data are divided into a training set and a test set.

The data acquisition process is as follows. After the character matrix enters flashing mode, one row or one column of the matrix flashes at a time in random order, with a flash duration of 80 milliseconds and an interval of 80 milliseconds. A round of the experiment ends when every row and every column has flashed once. While the subject attends to the target character, a P300 potential appears in the EEG signal when the row or column containing the target character flashes; no P300 potential appears when the other rows and columns flash. This procedure is repeated for 5 rounds per character. For each character-matrix flashing experiment, the electroencephalogram data table contains 20 recording channels, namely Fz, F3, F4, Cz, C3, C4, T7, T8, CP3, CP4, CP5, CP6, Pz, P3, P4, P7, P8, Oz, O1 and O2, and each row of the table holds the data of one sample point. Before the electroencephalogram analysis, the raw data must be preprocessed to reduce noise interference while retaining as many sample features as possible.
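The timing figures above can be checked with a few lines of arithmetic. The sketch below only uses numbers quoted in the text; the assumption that every flash is followed by one 80 ms gap (so that flash onsets are 160 ms apart) and the helper name `flash_onsets` are illustrative and not taken from the source.

```python
# Timing of the flashing paradigm, using the figures quoted in the text.
SAMPLING_RATE_HZ = 250
FLASH_MS = 80            # duration of one row/column flash
GAP_MS = 80              # interval between flashes (assumed to follow each flash)
FLASHES_PER_ROUND = 12   # 6 rows + 6 columns
ROUNDS_PER_CHAR = 5

sample_interval_ms = 1000 / SAMPLING_RATE_HZ                # time between two samples
samples_per_flash = FLASH_MS * SAMPLING_RATE_HZ // 1000     # samples recorded during one flash
round_ms = FLASHES_PER_ROUND * (FLASH_MS + GAP_MS)          # duration of one round
flashes_per_char = FLASHES_PER_ROUND * ROUNDS_PER_CHAR      # flashes recorded per character
target_flashes_per_char = 2 * ROUNDS_PER_CHAR               # flashes expected to evoke a P300


def flash_onsets(first_onset_sample):
    """Sample indices at which the 12 flashes of one round start (hypothetical helper)."""
    step = (FLASH_MS + GAP_MS) * SAMPLING_RATE_HZ // 1000
    return [first_onset_sample + k * step for k in range(FLASHES_PER_ROUND)]
```

Under these assumptions one round lasts 1.92 s, and only 10 of the 60 flashes recorded for a character contain a P300 response, which is why the later steps average over rounds.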

The data are then preprocessed. The test characters are the same for every subject, but the latency of the evoked P300 electroencephalogram signal may differ between individuals. The device samples at 250 Hz, that is, with a sampling interval of 4 ms, and the data table attached for the first subject contains more than 3000 consecutive sample values. The event data table shows that the first character target only flashes after the first 250 sample points, so the first 250 samples are discarded. The data are processed with a low-pass filter of order 8, and the data of each channel are then normalized; normalization is needed because the initial values of the channels in the attached data differ, and it makes superposition averaging possible. After the data of one character experiment of a subject have been low-pass filtered and normalized, the waveforms of the 20 channels are plotted. As can be seen from Fig. 2, the data of some channels are redundant.
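A minimal sketch of this preprocessing chain is given below. The text fixes only the number of discarded samples (250), the filter order (8) and the sampling rate (250 Hz). The filter type (a windowed-sinc FIR filter), the 20 Hz cutoff, the per-channel z-score normalization and the function names are assumptions made for illustration; a Butterworth design from a signal-processing library would be an equally plausible reading of "low-pass filtering with the order of 8".

```python
import numpy as np


def lowpass_fir_taps(order=8, cutoff_hz=20.0, fs=250.0):
    """Windowed-sinc low-pass FIR filter with order + 1 taps (assumed design)."""
    n = np.arange(order + 1) - order / 2
    taps = np.sinc(2 * cutoff_hz / fs * n) * np.hamming(order + 1)
    return taps / taps.sum()  # normalize to unit gain at 0 Hz


def preprocess(raw, skip=250, order=8, cutoff_hz=20.0, fs=250.0):
    """raw: array of shape (n_samples, n_channels); returns the cleaned array."""
    data = np.asarray(raw, dtype=float)[skip:]           # drop the first 250 samples
    taps = lowpass_fir_taps(order, cutoff_hz, fs)
    filtered = np.column_stack(
        [np.convolve(data[:, c], taps, mode="same") for c in range(data.shape[1])]
    )
    mean = filtered.mean(axis=0)
    std = filtered.std(axis=0)
    std[std == 0] = 1.0                                   # avoid division by zero on flat channels
    return (filtered - mean) / std                        # per-channel z-score (assumed)
```

After this step every channel has zero mean and unit variance, so channels with different baselines can be superposed and averaged on a common scale.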

The waveforms of the 20 channels are then superposed and averaged to obtain a single waveform, as shown in Fig. 3 and Fig. 4. The figures show, for the 8th character to be verified of subject 1, the waveforms obtained by superposing and averaging each row and column flash over the 5 rounds of the experiment. From these waveforms, the row and the column that produce a P300 signal at about 300 to 500 ms must be identified; in the figures, the candidates in this interval are rows 2 and 3 and column 12.

The identification method is as follows. For each character, the subject is tested for 5 rounds; each round comprises 12 flashes, 2 of which stimulate the subject to produce a P300 electroencephalogram signal. To determine a character, it is therefore sufficient to determine the row and the column in which it lies. Because of interference from electromyographic, electro-ocular and other electroencephalographic activity, the superposed and averaged waveform cannot be guaranteed to be accurate even after filtering. The row and the column are therefore determined by finding, in the waveforms, the trace most likely to contain a P300 signal within 0 to 600 ms after stimulation.

In one round of the experiment, every row and every column flashes once; the rows are treated as one class and the columns as another. The time range of the P300 electroencephalogram signal of the 5 subjects is roughly determined by analyzing the 12 known characters, and the waveforms of the six rows and six columns of one round of flashes are plotted. The row and the column of the character are found by selecting the row and the column most likely to contain a P300 electroencephalogram signal within 300 to 500 ms. The results of the predictive identification are shown in Tables 3-1 to 3-10:

Table 3-1: Statistical validation of the character "char13"

Table 3-2: Statistical validation of the character "char14"

Table 3-3: Statistical validation of the character "char15"

Table 3-4: Statistical validation of the character "char16"

Table 3-5: Statistical validation of the character "char17"

Table 3-6: Statistical validation of the character "char18"

Table 3-7: Statistical validation of the character "char19"

Table 3-8: Statistical validation of the character "char20"

Table 3-9: Statistical validation of the character "char21"

Table 3-10: Statistical validation of the character "char22"
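The row-and-column selection described before the tables can be sketched as follows. The source picks the most likely row and column by inspecting the averaged waveforms; scoring each flash by its mean amplitude in the 300 to 500 ms window is an assumption made here to automate that step. The character layout `MATRIX`, the convention that flash codes 1 to 6 are rows and 7 to 12 are columns, and all function names are likewise hypothetical.

```python
import numpy as np

FS = 250  # sampling rate in Hz, as stated in the text

# Hypothetical 6 x 6 layout; the real layout is the one shown in Fig. 1.
MATRIX = ["ABCDEF", "GHIJKL", "MNOPQR", "STUVWX", "YZ1234", "567890"]


def average_epochs(epochs):
    """epochs: (n_rounds, 12, n_samples) channel-averaged traces -> (12, n_samples)."""
    return np.asarray(epochs, dtype=float).mean(axis=0)


def pick_row_and_column(avg, window_ms=(300, 500), fs=FS):
    """Return zero-based (row, column) with the largest mean amplitude in the window."""
    start = window_ms[0] * fs // 1000
    stop = window_ms[1] * fs // 1000
    score = avg[:, start:stop].mean(axis=1)
    row = int(np.argmax(score[:6]))   # flashes 1-6 are assumed to be rows
    col = int(np.argmax(score[6:]))   # flashes 7-12 are assumed to be columns
    return row, col


def decode_character(epochs):
    row, col = pick_row_and_column(average_epochs(epochs))
    return MATRIX[row][col]
```

With 5 rounds per character, `epochs` has shape `(5, 12, n_samples)`, where `n_samples` covers at least 500 ms after each flash onset (125 samples at 250 Hz).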

Among the 20 electroencephalogram acquisition channels, irrelevant or redundant channel data not only increase the complexity of the system but also reduce the accuracy and performance of classification and identification. For each subject, the channels that are more favorable for classification are therefore extracted separately. The Pearson correlation coefficient is chosen to study the degree of linear correlation between the channels of each subject, as defined by equation (4-1):

$$ r_{xy} = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum_{i=1}^{n} (x_i - \bar{x})^2}\,\sqrt{\sum_{i=1}^{n} (y_i - \bar{y})^2}} \tag{4-1} $$

where x and y are two different channels, $x_i$ and $y_i$ are their i-th samples, $\bar{x}$ and $\bar{y}$ are their means, and n is the number of samples.

The training data set is used to calculate the correlation coefficients among all channels of subjects S1, S2, S3, S4 and S5. The correlations are displayed as heat maps, which make them easier to inspect visually. Note that the correlations among the channels differ when the same subject recognizes different characters.
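As a minimal sketch, the channel-by-channel correlation matrix can be computed directly from the standard definition of the Pearson coefficient. The function names and the `(n_samples, n_channels)` array layout are assumptions; drawing the heat map itself is left to whichever plotting library is at hand.

```python
import numpy as np


def pearson(x, y):
    """Pearson correlation coefficient of two equally long 1-D signals."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    dx = x - x.mean()
    dy = y - y.mean()
    return float((dx * dy).sum() / np.sqrt((dx ** 2).sum() * (dy ** 2).sum()))


def correlation_matrix(data):
    """data: (n_samples, n_channels) -> symmetric (n_channels, n_channels) matrix."""
    data = np.asarray(data, dtype=float)
    n = data.shape[1]
    r = np.eye(n)  # every channel correlates perfectly with itself
    for i in range(n):
        for j in range(i + 1, n):
            r[i, j] = r[j, i] = pearson(data[:, i], data[:, j])
    return r
```

The explicit loop mirrors the formula; `np.corrcoef(data, rowvar=False)` gives the same matrix in one call.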

The optimal channels and the optimal channel combination are then selected. For subject S4, the heat map and the correlation coefficients show that channel "Cz" is negatively correlated with every other channel, while the correlation coefficients among the remaining 19 channels are almost identical, indicating that these 19 channels fluctuate in the same way and in the same direction. This would not normally occur and is suspected to be caused by an acquisition error of the device, which is consistent with the earlier analysis. To reduce errors, no channel selection analysis is performed for subject S4.

To obtain a channel combination with at least 10 channels, the following rule was adopted after repeated tests and comparisons: for each subject, a channel that is negatively correlated with other channels more than 10 times is removed, which gives the retained channels of that subject; the intersection of the retained channels of all subjects is selected as the optimal channel combination.
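The rejection rule can be sketched as a count of negative entries per row of the correlation matrix. The threshold of 10 comes from the text; the function name and the interpretation that each negatively correlated partner channel counts once are assumptions.

```python
import numpy as np


def retained_channels(corr, names, max_negative=10):
    """Keep channels that are negatively correlated with at most max_negative others."""
    corr = np.asarray(corr, dtype=float)
    # The diagonal is 1, so a channel never counts itself.
    negatives = (corr < 0).sum(axis=1)
    return [name for name, count in zip(names, negatives) if count <= max_negative]
```

For example, a channel that is negatively correlated with all 19 other channels is removed, while each of those 19 channels has only one negative partner and is kept.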

The channel selection is shown in Table 2-1, where the symbol "×" marks a rejected channel and the symbol "√" marks a retained channel.

Table 2-1: Retained channels of each subject and the optimal channel combination

S1 S2 S3 Common channel
Fz × × × ×
F3 × ×
F4 × ×
Cz × × × ×
C3 × × ×
C4 × ×
T7
T8
CP3
CP4
CP5
CP6
Pz ×
P3 ×
P4
P7 × ×
P8
Oz
O1
O2

From the heat maps and Table 2-1, the channels selected for subject S1 are ['F3', 'F4', 'C4', 'T7', 'T8', 'CP3', 'CP4', 'CP5', 'CP6', 'Pz', 'P3', 'P4', 'P7', 'P8', 'Oz', 'O1', 'O2'], 17 in total.

The channels selected for subject S2 are ['F3', 'C3', 'C4', 'T7', 'T8', 'CP3', 'CP4', 'CP5', 'CP6', 'Pz', 'P3', 'P4', 'P8', 'Oz', 'O1', 'O2'], 16 in total.

The channels selected for subject S3 are ['C3', 'T7', 'T8', 'CP3', 'CP4', 'CP5', 'CP6', 'Pz', 'P3', 'P4', 'P7', 'P8', 'Oz', 'O1', 'O2'], 15 in total.

The channels selected for subject S5 are ['Fz', 'F3', 'Cz', 'C4', 'T7', 'T8', 'CP3', 'CP4', 'CP5', 'CP6', 'P4', 'P8', 'Oz', 'O1', 'O2'], 15 in total.

Finally, the intersection is taken to obtain the optimal channel combination ['T7', 'T8', 'CP3', 'CP4', 'CP5', 'CP6', 'P4', 'P8', 'Oz', 'O1', 'O2'], 11 channels in total.
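The intersection step can be reproduced directly from the four per-subject lists quoted above. Only the variable names are introduced here; the channel lists and the recording order of the 20 channels are taken from the text.

```python
# The 20 recording channels in the order listed in the text.
ALL_CHANNELS = ["Fz", "F3", "F4", "Cz", "C3", "C4", "T7", "T8", "CP3", "CP4",
                "CP5", "CP6", "Pz", "P3", "P4", "P7", "P8", "Oz", "O1", "O2"]

# Channels retained for each subject (S4 is excluded from the analysis).
selected = {
    "S1": ["F3", "F4", "C4", "T7", "T8", "CP3", "CP4", "CP5", "CP6",
           "Pz", "P3", "P4", "P7", "P8", "Oz", "O1", "O2"],
    "S2": ["F3", "C3", "C4", "T7", "T8", "CP3", "CP4", "CP5", "CP6",
           "Pz", "P3", "P4", "P8", "Oz", "O1", "O2"],
    "S3": ["C3", "T7", "T8", "CP3", "CP4", "CP5", "CP6",
           "Pz", "P3", "P4", "P7", "P8", "Oz", "O1", "O2"],
    "S5": ["Fz", "F3", "Cz", "C4", "T7", "T8", "CP3", "CP4", "CP5", "CP6",
           "P4", "P8", "Oz", "O1", "O2"],
}

# Channels retained for every subject, reported in recording order.
common = set.intersection(*(set(channels) for channels in selected.values()))
optimal_channels = [ch for ch in ALL_CHANNELS if ch in common]
```

Running this yields the same 11 channels as stated in the text, which confirms that the four lists and the reported intersection are mutually consistent.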

Semi-supervised learning is the training of a class prediction model in a dataset containing a small number of labeled samples and a large number of unlabeled samples. The model must perform a classification learning task from a small number of labeled samples and somehow utilize a larger unlabeled dataset to improve the performance of the supervised task in order to classify new examples in the future.

A Generative Adversarial Network (GAN) is an architecture that trains an image generation model against an image discrimination model and thereby makes effective use of large unlabeled data sets. The discriminator in a GAN is trained to predict whether a given image is real (from the data set) or fake (produced by the generative model), and in certain cases the discriminative model can serve as a starting point for developing a semi-supervised classifier.

The semi-supervised generative adversarial network (SGAN) model is an extension of the GAN architecture for semi-supervised learning problems. It trains a supervised classification model, an unsupervised discriminant model and a generative model simultaneously. The trained supervised classification model generalizes well to unlabeled samples, and after iterative optimization the generative model produces samples that can pass for real ones. The SGAN flow chart is shown in Fig. 5.

The SGAN model is realized by the following steps:

1. Initialize the supervised classification model, the unsupervised discriminant model and the generative model. The supervised classification model and the unsupervised discriminant model share the weights of the hidden layers, which consist of 3 fully connected layers with the ReLU activation function; the parameter initialization method is standardized Kaiming initialization. The generative model is used to generate pseudo data, and its output has the same format as the training data; its hidden layers consist of two fully connected layers with 600 neurons each.

2. Construct the SGAN model from the models initialized in the first step.

3. Train the model iteratively. The training process can be subdivided into:

(a) training the supervised classification model separately using the labeled samples;

(b) training the unsupervised discriminant model using unlabeled samples and "pseudo" samples;

(c) fixing the weights of the supervised classification model and the unsupervised discriminant model, and training the SGAN model with "pseudo" samples so as to improve the reliability of the generative model.

4. After sufficient training, the obtained supervised classification model can be applied to the classification of new, unlabeled data.

Although the input and hidden layers of the supervised classification model and the unsupervised discriminant model are the same, their outputs differ. The supervised classification model assigns labels to the sample data, that is, it distinguishes positive samples from negative samples, while the unsupervised discriminant model distinguishes whether a sample originates from the "real data" of the data set or from the "pseudo data" produced by the generative model.
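The architecture described above can be sketched as a forward pass in NumPy. The text fixes three shared fully connected hidden layers with ReLU, two generator hidden layers of 600 neurons, and Kaiming initialization. Everything else is an assumption for illustration: "standardized Kaiming initialization" is read as Kaiming normal initialization, the shared hidden width (128), the latent size (100), the flattened input of 30 samples by 11 channels, the ReLU activations in the generator, and all names. The three training steps are omitted because they need gradient computation from a deep-learning framework.

```python
import numpy as np

rng = np.random.default_rng(0)

INPUT_DIM = 30 * 11   # 120 ms at 250 Hz x 11 channels (assumed flattening)
LATENT_DIM = 100      # size of the generator noise vector (assumed)
HIDDEN = 128          # width of the shared hidden layers (assumed)


class Dense:
    """Fully connected layer with Kaiming normal weights, std = sqrt(2 / fan_in)."""

    def __init__(self, fan_in, fan_out):
        self.w = rng.normal(0.0, np.sqrt(2.0 / fan_in), size=(fan_in, fan_out))
        self.b = np.zeros(fan_out)

    def __call__(self, x):
        return x @ self.w + self.b


def relu(x):
    return np.maximum(x, 0.0)


# Hidden layers shared by the classifier and the discriminator.
shared = [Dense(INPUT_DIM, HIDDEN), Dense(HIDDEN, HIDDEN), Dense(HIDDEN, HIDDEN)]
class_head = Dense(HIDDEN, 2)   # P300 present / absent
disc_head = Dense(HIDDEN, 1)    # real / pseudo
generator = [Dense(LATENT_DIM, 600), Dense(600, 600), Dense(600, INPUT_DIM)]


def features(x):
    for layer in shared:
        x = relu(layer(x))
    return x


def classify(x):
    """Supervised head: softmax probabilities over the two classes."""
    logits = class_head(features(x))
    e = np.exp(logits - logits.max(axis=1, keepdims=True))
    return e / e.sum(axis=1, keepdims=True)


def discriminate(x):
    """Unsupervised head: probability that x is real data."""
    return 1.0 / (1.0 + np.exp(-disc_head(features(x))))


def generate(n):
    """Generator: noise -> pseudo samples in the same format as the training data."""
    h = rng.normal(size=(n, LATENT_DIM))
    h = relu(generator[0](h))
    h = relu(generator[1](h))
    return generator[2](h)
```

Both heads call the same `features` function, which is how the weight sharing between the supervised classification model and the unsupervised discriminant model is expressed in this sketch.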

Using the optimal channel combination obtained above, the data of subject S5 are sliced for each known character, with a slice length of 120 ms, to obtain the training samples. Of these, 30 are positive samples that contain the P300 signal, as shown in Fig. 6, and 30 are negative samples without the P300 signal, as shown in Fig. 7. Then 5 samples are randomly selected from the positive samples and 5 from the negative samples as labeled samples, and the remaining 50 samples are used as unlabeled samples to train the SGAN model, which is then saved.
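The labeled/unlabeled split can be sketched with the standard library. The counts (30 positive, 30 negative, 50 unlabeled) and the slice length come from the text; reading "5 samples from the positive and negative samples" as 5 per class follows from the 50 remaining samples. The function name, the label encoding (1 for P300, 0 for no P300) and the fixed seed are assumptions.

```python
import random

SLICE_SAMPLES = 120 * 250 // 1000  # samples in one 120 ms slice at 250 Hz


def split_labeled_unlabeled(positives, negatives, n_labeled_per_class=5, seed=0):
    """Pick a few labeled samples per class; the rest become unlabeled samples."""
    rng = random.Random(seed)
    pos_idx = set(rng.sample(range(len(positives)), n_labeled_per_class))
    neg_idx = set(rng.sample(range(len(negatives)), n_labeled_per_class))
    labeled = [(positives[i], 1) for i in sorted(pos_idx)]
    labeled += [(negatives[i], 0) for i in sorted(neg_idx)]
    unlabeled = [s for i, s in enumerate(positives) if i not in pos_idx]
    unlabeled += [s for i, s in enumerate(negatives) if i not in neg_idx]
    rng.shuffle(unlabeled)  # mix the classes so the order carries no label information
    return labeled, unlabeled
```

With 30 positive and 30 negative slices this returns 10 labeled pairs and 50 unlabeled slices, each slice being 30 samples long per channel.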

Because the recognition process of one character comprises 60 row and column flashes, the data of each character to be verified are sliced into row data and column data to form the test set. Finally, the supervised classification model is extracted from the trained SGAN model and applied to the test set to obtain, for each row and column, a predicted label indicating whether a P300 signal is present; the predicted character is then determined from the row number and the column number. The results are shown in Table 3.1; the average prediction accuracy is 15%.

Table 3.1: Model prediction statistics for the known characters of subject S5

Similarly, the unknown character data of subject S5 are sliced and applied to the model to obtain the prediction results shown in Table 3.2.

Table 3.2: Model prediction statistics for the unknown characters of subject S5

Therefore, by adopting the method, the problem of low recognition rate of the electroencephalogram P300 signal is solved, and the problem of low robustness of signal recognition is solved.

The above description is only for the purpose of illustrating the preferred embodiments of the present invention and is not to be construed as limiting the invention, and any modifications, equivalents and improvements made within the spirit and principle of the present invention are intended to be included within the scope of the present invention.
