Noise classification method based on BP network

Document No.: 1467523  Published: 2020-02-21

Abstract: The technique "Noise classification method based on BP network" (一种基于bp网络的噪声分类方法) was devised by 张涛 (Zhang Tao), 耿彦章 (Geng Yanzhang) and 邵洋洋 (Shao Yangyang) on 2019-10-10. A noise classification method based on a BP (back-propagation) network comprises the following steps: preprocessing an input noise signal; performing a Fourier transform on each frame of the preprocessed noise signal to obtain its power spectrum; using the power spectrum of each frame to calculate the Mel-frequency cepstral coefficients (MFCC) of that frame and their first-order difference (ΔMFCC); calculating the Gammatone frequency cepstral coefficients (GFCC) of each frame; combining the MFCC, ΔMFCC and GFCC of each frame into the joint feature of that frame, with one part of the joint features of all frames used as training data and the other part as test data; training a first-level BP network and a second-level BP network separately; and testing the first-level and second-level BP networks jointly to obtain the final noise signal classification result. The invention achieves higher noise classification accuracy.

1. A noise classification method based on a BP network is characterized by comprising the following steps:

1) preprocessing an input noise signal, including framing and windowing;

2) performing Fourier transform on each frame of the preprocessed noise signals respectively to obtain a noise signal power spectrum;

3) calculating, from the power spectrum of each frame of the noise signal, the Mel-frequency cepstral coefficients (MFCC) of that frame and the first-order difference of the Mel-frequency cepstral coefficients (ΔMFCC);

4) calculating the Gammatone frequency cepstral coefficients (GFCC) of each frame of the noise signal;

5) combining the Mel-frequency cepstral coefficients, their first-order difference and the Gammatone frequency cepstral coefficients of each frame of the noise signal into the joint feature of that frame, using one part of the joint features of all frames as training data and the other part as test data;

6) training a first-level BP network;

7) training a second-level BP network;

8) testing the first-level BP network and the second-level BP network jointly to obtain the final noise signal classification result.

2. The BP network-based noise classification method according to claim 1, wherein step 2) performs the Fourier transform on each frame of the noise signal using the following formula:

$$X(i,k) = \mathrm{FFT}[x_i(n)]$$

wherein $X(i,k)$ is the power spectrum of the i-th frame signal at the k-th spectral line, FFT denotes the Fourier transform, $x_i(n)$ denotes the i-th frame signal, and $n$ is the sample index.

3. The BP network-based noise classification method according to claim 1, wherein the step 3) comprises:

(3.1) calculating spectral line energy for the power spectrum of each frame of noise signal:

$$E(i,k) = [X(i,k)]^2$$

wherein $E(i,k)$ is the spectral line energy of the i-th frame at the k-th spectral line, and $X(i,k)$ is the power spectrum of the i-th frame signal at the k-th spectral line;

(3.2) calculating the energy of each frame of the noise signal after it passes through the Mel filters:

wherein $S(i,m)$ is the energy of the i-th frame signal after the m-th Mel filter, $M$ is the total number of Mel filters, $H_m(k)$ is the frequency-domain response of the m-th Mel filter at the k-th spectral line, and $N$ is the total number of spectral lines;

(3.3) calculating mel frequency cepstrum coefficients:

wherein $\mathrm{mfcc}(i,n)$ is the n-th Mel-frequency cepstral coefficient of the i-th frame of the noise signal;

(3.4) calculating the first order difference of the mel-frequency cepstrum coefficients:

$$\Delta\mathrm{mfcc}(i,k) = -2\,\mathrm{mfcc}(i-2,k) - \mathrm{mfcc}(i-1,k) + \mathrm{mfcc}(i+1,k) + 2\,\mathrm{mfcc}(i+2,k)$$

wherein $\Delta\mathrm{mfcc}(i,k)$ is the first-order difference of the k-th Mel-frequency cepstral coefficient of the i-th frame signal, and $\mathrm{mfcc}(i-2,k)$, $\mathrm{mfcc}(i-1,k)$, $\mathrm{mfcc}(i+1,k)$ and $\mathrm{mfcc}(i+2,k)$ are the k-th Mel-frequency cepstral coefficients of frames i-2, i-1, i+1 and i+2, respectively.

4. The BP network-based noise classification method according to claim 1, wherein step 4) comprises:

(4.1) calculating, from the spectral line energy obtained above, the energy after the Gammatone filters:

Figure FDA0002228629700000021

wherein $R(i,p)$ is the energy of the i-th frame of the noise signal after the p-th Gammatone filter, $P$ is the total number of Gammatone filters, $H_p(k)$ is the frequency-domain response of the p-th Gammatone filter at the k-th spectral line, $N$ is the total number of spectral lines, and $e(f)$ is the exponential compression value;

(4.2) calculating the Gammatone frequency cepstral coefficients:

wherein $\mathrm{gfcc}(i,k)$ is the k-th Gammatone frequency cepstral coefficient of the i-th frame signal.

5. The BP network-based noise classification method according to claim 1, wherein training the first-level BP network in step 6) comprises inputting the training data into the first-level BP network for training, obtaining the major category of each frame of the noise signal once the first-level BP network has been trained, and storing the network weights of the trained first-level BP network.

6. The BP network-based noise classification method according to claim 1, wherein training the second-level BP network in step 7) comprises combining the Mel-frequency cepstral coefficients, their first-order difference and the Gammatone frequency cepstral coefficients of each frame of the noise signal in each category obtained from the training of the first-level BP network, inputting the combined features into the second-level BP network of the corresponding category for training, obtaining the recognition result of each frame of the noise signal in each category once the second-level BP network has been trained, and storing the network weights of the trained second-level BP network.

7. The BP network-based noise classification method according to claim 1, wherein in step 8) the stored network weights of the first-level BP network and of the second-level BP network are used, and the test data are input into a joint network formed by connecting the first-level BP network and the second-level BP network in series, so as to obtain the recognition result of each frame of the noise signal in the test data.

Technical Field

The invention relates to a noise classification method, and in particular to a noise classification method based on a BP network.

Background

In speech signal processing, noise contamination is unavoidable. As digital speech signals become widely used in scientific research and daily life, the influence of noise on them becomes increasingly evident, and how to suppress noise effectively and improve the quality and intelligibility of speech signals has become a research hotspot. One difficulty in speech enhancement is the large variety of noise sources. Different noises have different statistical characteristics, so in practical applications, to achieve a better signal processing result, noises with different characteristics need to be processed differently according to the application scenario.

Solving the noise classification problem generally involves two key technical points: first, which features to extract in order to distinguish noise types; second, which classification technique to apply to the extracted features. For the first point, commonly used noise features currently include adaptive wavelet features, the Short-time Autocorrelation Function (SACF), Bark-domain energy distribution, Mel-Frequency Cepstral Coefficients (MFCC), first-order difference Mel-Frequency Cepstral Coefficients (ΔMFCC), discrete Fourier coefficients, linear predictive coding coefficients, Gammatone filter coefficients, and so on. For the second point, commonly used techniques currently include noise classification algorithms based on the Hidden Markov Model (HMM), the Gaussian Mixture Model (GMM), the Support Vector Machine (SVM), neural networks, and so on.

However, the noise classification methods proposed so far have low classification accuracy. Because the accuracy of noise classification directly affects signal processing performance, providing a high-accuracy noise classification method has become a new challenge in the field of signal processing.

Disclosure of Invention

The invention aims to solve the technical problem of providing a noise classification method based on a BP network with higher classification accuracy.

The technical scheme adopted by the invention is as follows: a noise classification method based on a BP network comprises the following steps:

1) preprocessing an input noise signal, including framing and windowing;

2) performing Fourier transform on each frame of the preprocessed noise signals respectively to obtain a noise signal power spectrum;

3) calculating, from the power spectrum of each frame of the noise signal, the Mel-frequency cepstral coefficients of that frame and the first-order difference of the Mel-frequency cepstral coefficients;

4) calculating the Gammatone frequency cepstral coefficients of each frame of the noise signal;

5) combining the Mel-frequency cepstral coefficients, their first-order difference and the Gammatone frequency cepstral coefficients of each frame of the noise signal into the joint feature of that frame, using one part of the joint features of all frames as training data and the other part as test data;

6) training a first-level BP network;

7) training a second-level BP network;

8) testing the first-level BP network and the second-level BP network jointly to obtain the final noise signal classification result.

In step 2), the Fourier transform is performed on each frame of the noise signal using the following formula:

$$X(i,k) = \mathrm{FFT}[x_i(n)]$$

where $X(i,k)$ is the power spectrum of the i-th frame signal at the k-th spectral line, FFT denotes the Fourier transform, $x_i(n)$ denotes the i-th frame signal, and $n$ is the sample index.
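The framing, windowing and transform steps can be sketched in Python with NumPy. This is only an illustrative sketch: the function names are hypothetical, the frame length (256 points), frame shift (128 points) and Hamming window are taken from the specific example given later in this document, and $X(i,k)$ is read here as the complex FFT output whose squared magnitude gives the spectral line energy used in step 3), which is an interpretation rather than something the text states.

```python
import numpy as np

def frame_signal(x, frame_len=256, hop=128):
    """Split a 1-D signal into overlapping frames (one frame per row).

    Trailing samples that do not fill a whole frame are dropped.
    """
    x = np.asarray(x, dtype=float)
    if len(x) < frame_len:
        raise ValueError("signal is shorter than one frame")
    n_frames = 1 + (len(x) - frame_len) // hop
    idx = np.arange(frame_len)[None, :] + hop * np.arange(n_frames)[:, None]
    return x[idx]

def spectral_line_energy(frames):
    """Apply a Hamming window, take the FFT of each frame, return |X(i, k)|^2."""
    windowed = frames * np.hamming(frames.shape[1])
    X = np.fft.rfft(windowed, axis=1)  # one-sided spectrum: frame_len // 2 + 1 lines
    return np.abs(X) ** 2

# Usage on synthetic data (white noise stands in for a recorded noise sample).
rng = np.random.default_rng(0)
x = rng.standard_normal(16000)
frames = frame_signal(x)
E = spectral_line_energy(frames)
print(frames.shape, E.shape)
```

Using the one-sided `rfft` keeps only the non-redundant spectral lines of a real-valued frame; a full `fft` would work equally well if all N lines are wanted.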

The step 3) comprises the following steps:

(3.1) calculating spectral line energy for the power spectrum of each frame of noise signal:

$$E(i,k) = [X(i,k)]^2$$

where $E(i,k)$ is the spectral line energy of the i-th frame at the k-th spectral line, and $X(i,k)$ is the power spectrum of the i-th frame signal at the k-th spectral line;

(3.2) calculating the energy of each frame of the noise signal after it passes through the Mel filters:

Figure BDA0002228629710000021

where $S(i,m)$ is the energy of the i-th frame signal after the m-th Mel filter, $M$ is the total number of Mel filters, $H_m(k)$ is the frequency-domain response of the m-th Mel filter at the k-th spectral line, and $N$ is the total number of spectral lines;

(3.3) calculating mel frequency cepstrum coefficients:

Figure BDA0002228629710000022

where $\mathrm{mfcc}(i,n)$ is the n-th Mel-frequency cepstral coefficient of the i-th frame of the noise signal;

(3.4) calculating the first order difference of the mel-frequency cepstrum coefficients:

$$\Delta\mathrm{mfcc}(i,k) = -2\,\mathrm{mfcc}(i-2,k) - \mathrm{mfcc}(i-1,k) + \mathrm{mfcc}(i+1,k) + 2\,\mathrm{mfcc}(i+2,k)$$

where $\Delta\mathrm{mfcc}(i,k)$ is the first-order difference of the k-th Mel-frequency cepstral coefficient of the i-th frame signal, and $\mathrm{mfcc}(i-2,k)$, $\mathrm{mfcc}(i-1,k)$, $\mathrm{mfcc}(i+1,k)$ and $\mathrm{mfcc}(i+2,k)$ are the k-th Mel-frequency cepstral coefficients of frames i-2, i-1, i+1 and i+2, respectively.
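Sub-steps (3.1) to (3.4) can be sketched as follows. The equation images for the Mel filter energy and the cepstral coefficients are not available in this text, so the sketch falls back on common textbook choices that are assumptions here: triangular filters on the Mel scale mel = 2595·log10(1 + f/700), a natural logarithm, and a DCT-II. The function names, the 24 filters and the 12 retained coefficients are likewise hypothetical. The difference uses the antisymmetric weights (-2, -1, +1, +2) without normalisation and replicates the edge frames, which is also an assumption.

```python
import numpy as np

def hz_to_mel(f):
    return 2595.0 * np.log10(1.0 + np.asarray(f, dtype=float) / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (np.asarray(m, dtype=float) / 2595.0) - 1.0)

def mel_filterbank(n_filters, n_fft, fs):
    """Triangular responses H_m(k) on the one-sided FFT grid: (n_filters, n_fft // 2 + 1)."""
    n_bins = n_fft // 2 + 1
    freqs = np.linspace(0.0, fs / 2.0, n_bins)
    edges = mel_to_hz(np.linspace(0.0, hz_to_mel(fs / 2.0), n_filters + 2))
    H = np.zeros((n_filters, n_bins))
    for m in range(n_filters):
        lo, mid, hi = edges[m], edges[m + 1], edges[m + 2]
        rising = (freqs - lo) / (mid - lo)
        falling = (hi - freqs) / (hi - mid)
        H[m] = np.clip(np.minimum(rising, falling), 0.0, None)
    return H

def dct_ii(n_coeffs, n_inputs):
    """DCT-II basis; row n is sqrt(2/M) * cos(pi * n * (2m + 1) / (2M))."""
    n = np.arange(n_coeffs)[:, None]
    m = np.arange(n_inputs)[None, :]
    return np.sqrt(2.0 / n_inputs) * np.cos(np.pi * n * (2 * m + 1) / (2 * n_inputs))

def mfcc(E, H, n_coeffs=12):
    """E: spectral line energy (frames x lines). Returns frames x n_coeffs."""
    S = E @ H.T                    # S(i, m) = sum_k E(i, k) * H_m(k)
    log_S = np.log(S + 1e-12)      # small floor avoids log(0)
    basis = dct_ii(n_coeffs + 1, H.shape[0])[1:]  # drop the 0th coefficient
    return log_S @ basis.T

def delta(c):
    """Five-point first-order difference along the frame axis (edges replicated)."""
    p = np.pad(c, ((2, 2), (0, 0)), mode="edge")
    return -2 * p[:-4] - p[1:-3] + p[3:-1] + 2 * p[4:]

# Usage with synthetic energies standing in for E(i, k).
rng = np.random.default_rng(0)
E = rng.random((50, 129)) + 1e-3
H = mel_filterbank(24, 256, 16000)
c = mfcc(E, H)
features = np.hstack([c, delta(c)])
print(features.shape)
```

Stacking 12 coefficients with their 12 differences is one way to arrive at a 24-dimensional MFCC and ΔMFCC feature; the document does not say how its 24 dimensions are divided.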

Step 4) comprises the following steps:

(4.1) calculating, from the spectral line energy obtained above, the energy after the Gammatone filters:

Figure BDA0002228629710000023

where $R(i,p)$ is the energy of the i-th frame of the noise signal after the p-th Gammatone filter, $P$ is the total number of Gammatone filters, $H_p(k)$ is the frequency-domain response of the p-th Gammatone filter at the k-th spectral line, $N$ is the total number of spectral lines, and $e(f)$ is the exponential compression value;

(4.2) calculating the Gammatone frequency cepstral coefficients:

Figure BDA0002228629710000024

where $\mathrm{gfcc}(i,k)$ is the k-th Gammatone frequency cepstral coefficient of the i-th frame signal.
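A rough sketch of sub-steps (4.1) and (4.2) is given below. Because the equation images are not available in this text, every modelling choice is an assumption: centre frequencies equally spaced on the ERB-rate scale, the common fourth-order approximation of the Gammatone power response with bandwidth 1.019·ERB, a cube-root power law standing in for the compression $e(f)$, and a DCT-II to obtain the cepstral coefficients. All function names and default values are hypothetical.

```python
import numpy as np

def erb(fc):
    """Equivalent rectangular bandwidth in Hz (Glasberg and Moore approximation)."""
    return 24.7 * (4.37 * np.asarray(fc, dtype=float) / 1000.0 + 1.0)

def erb_space(f_low, f_high, n):
    """n centre frequencies equally spaced on the ERB-rate scale."""
    lo = 21.4 * np.log10(4.37e-3 * f_low + 1.0)
    hi = 21.4 * np.log10(4.37e-3 * f_high + 1.0)
    return (10.0 ** (np.linspace(lo, hi, n) / 21.4) - 1.0) / 4.37e-3

def gammatone_filterbank(n_filters, n_fft, fs, f_low=50.0, order=4):
    """Approximate power responses H_p(k): (n_filters, n_fft // 2 + 1)."""
    freqs = np.linspace(0.0, fs / 2.0, n_fft // 2 + 1)
    fc = erb_space(f_low, fs / 2.0, n_filters)
    b = 1.019 * erb(fc)
    detune = (freqs[None, :] - fc[:, None]) / b[:, None]
    return (1.0 + detune ** 2) ** (-order)

def gfcc(E, H, n_coeffs=24, exponent=1.0 / 3.0):
    """E: spectral line energy (frames x lines). Returns frames x n_coeffs."""
    R = E @ H.T                  # R(i, p) = sum_k E(i, k) * H_p(k)
    compressed = R ** exponent   # assumed power-law compression
    P = H.shape[0]
    n = np.arange(n_coeffs)[:, None]
    p = np.arange(P)[None, :]
    basis = np.sqrt(2.0 / P) * np.cos(np.pi * n * (2 * p + 1) / (2 * P))
    return compressed @ basis.T

# Usage with synthetic energies standing in for E(i, k).
rng = np.random.default_rng(0)
E = rng.random((50, 129)) + 1e-3
H = gammatone_filterbank(24, 256, 16000)
g = gfcc(E, H)
print(g.shape)
```

Unlike the triangular Mel filters, these responses never reach exactly zero, so every spectral line contributes a little to every channel.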

In step 6), training the first-level BP network means inputting the training data into the first-level BP network for training, obtaining the major category of each frame of the noise signal once the first-level BP network has been trained, and storing the network weights of the trained first-level BP network.

In step 7), training the second-level BP network means combining the Mel-frequency cepstral coefficients, their first-order difference and the Gammatone frequency cepstral coefficients of each frame of the noise signal in each category obtained from the training of the first-level BP network, inputting the combined features into the second-level BP network of the corresponding category for training, obtaining the recognition result of each frame of the noise signal in each category once the second-level BP network has been trained, and storing the network weights of the trained second-level BP network.

In step 8), using the stored network weights of the first-level BP network and of the second-level BP network, the test data are input into a joint network formed by connecting the first-level BP network and the second-level BP network in series, which yields the recognition result of each frame of the noise signal in the test data.
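The cascade described in steps 6) to 8) can be illustrated with a small NumPy sketch. It is not the patented implementation: the class `BPNet`, the tanh hidden layer, the softmax output, the learning rate and the epoch count are all assumptions, and for simplicity the second-level networks are trained on frames grouped by their true category rather than by the first-level network's output. Only the hidden size of 49 nodes is taken from the example later in the document.

```python
import numpy as np

class BPNet:
    """One-hidden-layer network trained by back-propagation (batch gradient descent)."""

    def __init__(self, n_in, n_hidden, n_out, seed=0):
        rng = np.random.default_rng(seed)
        self.W1 = rng.normal(0.0, 0.1, (n_in, n_hidden))
        self.b1 = np.zeros(n_hidden)
        self.W2 = rng.normal(0.0, 0.1, (n_hidden, n_out))
        self.b2 = np.zeros(n_out)

    def forward(self, X):
        h = np.tanh(X @ self.W1 + self.b1)
        z = h @ self.W2 + self.b2
        z = z - z.max(axis=1, keepdims=True)   # numerically stable softmax
        p = np.exp(z)
        return h, p / p.sum(axis=1, keepdims=True)

    def fit(self, X, y, epochs=500, lr=0.1):
        Y = np.eye(self.b2.size)[y]            # one-hot targets
        for _ in range(epochs):
            h, p = self.forward(X)
            dz = (p - Y) / len(X)              # softmax + cross-entropy gradient
            dh = (dz @ self.W2.T) * (1.0 - h ** 2)
            self.W2 -= lr * (h.T @ dz)
            self.b2 -= lr * dz.sum(axis=0)
            self.W1 -= lr * (X.T @ dh)
            self.b1 -= lr * dh.sum(axis=0)
        return self

    def predict(self, X):
        return self.forward(X)[1].argmax(axis=1)

def train_two_stage(X, coarse, fine, n_coarse, n_fine, n_hidden=49):
    """Train one first-level net and one second-level net per coarse category."""
    first = BPNet(X.shape[1], n_hidden, n_coarse).fit(X, coarse)
    second = [BPNet(X.shape[1], n_hidden, n_fine, seed=c + 1)
              .fit(X[coarse == c], fine[coarse == c]) for c in range(n_coarse)]
    return first, second

def predict_two_stage(first, second, X):
    """Route each frame through the first-level net, then the matching second-level net."""
    coarse_hat = first.predict(X)
    fine_hat = np.empty(len(X), dtype=int)
    for c, net in enumerate(second):
        mask = coarse_hat == c
        if mask.any():
            fine_hat[mask] = net.predict(X[mask])
    return coarse_hat, fine_hat

# Usage on synthetic clusters: 3 coarse categories with 2 fine classes each.
rng = np.random.default_rng(0)
cluster = np.repeat(np.arange(6), 100)
X = 3.0 * np.eye(6, 8)[cluster] + rng.normal(0.0, 0.3, (600, 8))
coarse, fine = cluster // 2, cluster % 2
first, second = train_two_stage(X, coarse, fine, n_coarse=3, n_fine=2)
coarse_hat, fine_hat = predict_two_stage(first, second, X)
print("first-level accuracy:", (coarse_hat == coarse).mean())
print("joint accuracy:", ((coarse_hat == coarse) & (fine_hat == fine)).mean())
```

A frame misrouted by the first-level network cannot be recovered by the second level, so the accuracy of the cascade is bounded by the accuracy of the first stage. The trained weight arrays could be stored with, for example, `np.savez` to mirror the weight-saving step.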

The BP network-based noise classification method of the invention identifies the noise category of a noise signal with a two-level BP network; compared with a noise classification method that uses only a single-level BP network, this network structure achieves higher noise classification accuracy. In addition, the proposed scheme is widely applicable and well suited to experimentation, and can serve as a reference for noise classification work.

Drawings

Fig. 1 is a flowchart of a noise classification method based on a BP network according to the present invention.

Detailed Description

The following describes a noise classification method based on a BP network in detail with reference to embodiments and drawings.

As shown in fig. 1, a noise classification method based on a BP network of the present invention includes the following steps:

1) preprocessing an input noise signal, including framing and windowing;

2) performing Fourier transform on each frame of the preprocessed noise signals respectively to obtain a noise signal power spectrum; specifically, the following formula is adopted to perform Fourier transform on each frame of noise signals:

$$X(i,k) = \mathrm{FFT}[x_i(n)]$$

where $X(i,k)$ is the power spectrum of the i-th frame signal at the k-th spectral line, FFT denotes the Fourier transform, $x_i(n)$ denotes the i-th frame signal, and $n$ is the sample index.

3) Calculating, from the power spectrum of each frame of the noise signal, the Mel-frequency cepstral coefficients (MFCC) of that frame and the first-order difference of the Mel-frequency cepstral coefficients (ΔMFCC); specifically:

(3.1) calculating the spectral line energy from the power spectrum of each frame of the noise signal:

$$E(i,k) = [X(i,k)]^2$$

where $E(i,k)$ is the spectral line energy of the i-th frame at the k-th spectral line, and $X(i,k)$ is the power spectrum of the i-th frame signal at the k-th spectral line;

(3.2) calculating the energy of each frame of the noise signal after it passes through the Mel filters:

Figure BDA0002228629710000031

where $S(i,m)$ is the energy of the i-th frame signal after the m-th Mel filter, $M$ is the total number of Mel filters, $H_m(k)$ is the frequency-domain response of the m-th Mel filter at the k-th spectral line, and $N$ is the total number of spectral lines;

(3.3) calculating mel frequency cepstrum coefficients:

Figure BDA0002228629710000041

where $\mathrm{mfcc}(i,n)$ is the n-th Mel-frequency cepstral coefficient of the i-th frame of the noise signal;

(3.4) calculating the first-order difference of the Mel-frequency cepstral coefficients (ΔMFCC):

$$\Delta\mathrm{mfcc}(i,k) = -2\,\mathrm{mfcc}(i-2,k) - \mathrm{mfcc}(i-1,k) + \mathrm{mfcc}(i+1,k) + 2\,\mathrm{mfcc}(i+2,k)$$

where $\Delta\mathrm{mfcc}(i,k)$ is the first-order difference of the k-th Mel-frequency cepstral coefficient of the i-th frame signal, and $\mathrm{mfcc}(i-2,k)$, $\mathrm{mfcc}(i-1,k)$, $\mathrm{mfcc}(i+1,k)$ and $\mathrm{mfcc}(i+2,k)$ are the k-th Mel-frequency cepstral coefficients of frames i-2, i-1, i+1 and i+2, respectively.

4) Calculating the Gammatone frequency cepstral coefficients (GFCC) of each frame of the noise signal; specifically:

(4.1) calculating, from the spectral line energy obtained above, the energy after the Gammatone filters:

Figure BDA0002228629710000042

where $R(i,p)$ is the energy of the i-th frame of the noise signal after the p-th Gammatone filter, $P$ is the total number of Gammatone filters, $H_p(k)$ is the frequency-domain response of the p-th Gammatone filter at the k-th spectral line, $N$ is the total number of spectral lines, and $e(f)$ is the exponential compression value;

(4.2) calculating the Gammatone frequency cepstral coefficients:

Figure BDA0002228629710000043

where $\mathrm{gfcc}(i,k)$ is the k-th Gammatone frequency cepstral coefficient of the i-th frame signal.

5) Combining the Mel-frequency cepstral coefficients, their first-order difference and the Gammatone frequency cepstral coefficients of each frame of the noise signal into the joint feature of that frame, using one part of the joint features of all frames as training data and the other part as test data;

6) Training the first-level BP network: the training data are input into the first-level BP network for training, the major category of each frame of the noise signal is obtained once the first-level BP network has been trained, and the network weights of the trained first-level BP network are stored.

7) Training the second-level BP network: the Mel-frequency cepstral coefficients, their first-order difference and the Gammatone frequency cepstral coefficients of each frame of the noise signal in each category obtained from the training of the first-level BP network are combined and input into the second-level BP network of the corresponding category for training, the recognition result of each frame of the noise signal in each category is obtained once the second-level BP network has been trained, and the network weights of the trained second-level BP network are stored.

8) Joint testing: using the stored network weights of the first-level BP network and of the second-level BP network, the test data are input into a joint network formed by connecting the first-level BP network and the second-level BP network in series, which yields the recognition result of each frame of the noise signal in the test data.
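Step 5) above, assembling the joint feature and dividing the frames into training and test data, can be sketched as follows. The document does not say how the training frames are selected, so the random shuffle is an assumption, as are the function names and the 12 + 12 + 24 division of the feature dimensions used in the usage lines.

```python
import numpy as np

def joint_features(mfcc, d_mfcc, gfcc):
    """Concatenate per-frame MFCC, delta-MFCC and GFCC into one feature matrix."""
    return np.hstack([mfcc, d_mfcc, gfcc])

def split_frames(X, y, n_train, seed=0):
    """Shuffle the frames, keep n_train for training and the rest for testing."""
    order = np.random.default_rng(seed).permutation(len(X))
    train, test = order[:n_train], order[n_train:]
    return X[train], y[train], X[test], y[test]

# Usage with random stand-in features for 1000 frames and 15 noise labels.
rng = np.random.default_rng(1)
mfcc = rng.random((1000, 12))
d_mfcc = rng.random((1000, 12))
gfcc = rng.random((1000, 24))
labels = rng.integers(0, 15, 1000)

X = joint_features(mfcc, d_mfcc, gfcc)
X_train, y_train, X_test, y_test = split_frames(X, labels, n_train=900)
print(X.shape, X_train.shape, X_test.shape)
```

Splitting at the frame level, as done here, means that frames from the same recording can land in both sets; splitting by recording segment would give a stricter test.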

Specific examples are given below:

(I) Preprocessing the input noise signal:

1. Data selection:

Fifteen noises are selected as samples from the Noisex-92 standard noise library: Pink, Factory1, F16, Destroyerengine, Buccaneer1; Babble, White, Hfchannel, Factory2, Buccaneer2; Volvo, Machinegun, M109, Leopard, Destroyerops. The sampling frequency is 16 kHz. The noises are divided into three categories, which serve as the classification targets of the first-level BP network:

Category A1: Pink, Factory1, F16, Destroyerengine, Buccaneer1;

Category A2: Babble, White, Hfchannel, Factory2, Buccaneer2;

Category A3: Volvo, Machinegun, M109, Leopard, Destroyerops.

2. Framing and windowing:

(1) Framing: the frame length is 256 points and the frame shift is 128 points;

(2) Windowing: the window function is a Hamming window.

After preprocessing, each noise yields 36713 frames, for a total of 550695 frames across the 15 noises.

(II) Performing a Fourier transform on each frame of data to obtain the signal power spectrum;

(III) Calculating the 24-dimensional MFCC and ΔMFCC coefficients of each of the 550695 frames of data;

(IV) Calculating the 24-dimensional GFCC coefficients of each of the 550695 frames of data;

(V) Combining the 24-dimensional MFCC and ΔMFCC coefficients and the 24-dimensional GFCC coefficients of each frame into a 48-dimensional joint feature, selecting 500000 of the 550695 frames as training data, and using the remaining frames as test data;

(VI) Inputting the training data from step (V) into the first-level BP network (BP0) for training. The parameters of the BP0 network are set as follows: the input layer has 48 nodes, the hidden layer has 49 nodes, and the output layer has 3 nodes. After training succeeds, the trained network weights are stored;

(VII) Inputting the 48-dimensional joint features of the 5 noises in each of the categories A1, A2 and A3 into three second-level BP networks (BP1, BP2 and BP3), respectively, for training. The parameters of the BP1, BP2 and BP3 networks are set as follows: the input layer has 48 nodes, the hidden layer has 49 nodes, and the output layer has 5 nodes. After training succeeds, the trained network weights are stored;

(VIII) Testing the trained first-level and second-level BP networks jointly, with the test data obtained in step (V) as the input of the joint network.

The test output of the joint network is the category that the joint network assigns to each of the 15 noises; the classification accuracy of the method for each noise is then calculated, and the specific results are shown in Table 1. BP0 denotes the classification accuracy of the first-level BP network over the three major categories of noise, BP denotes the classification accuracy of the three second-level BP networks within each major category, and BP0+BP denotes the classification accuracy of the final joint network over the 15 noises.
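One way to compute the three accuracy columns from per-frame predictions is sketched below. The document does not give the exact definition of the BP column, so computing it only over the frames that the first level routed to the correct category is an assumption; the function name and the encoding of predictions as a category index (0 to 2) plus a within-category index (0 to 4) are hypothetical. The noise names follow the Noisex-92 library and the grouping follows categories A1 to A3 of the example.

```python
import numpy as np

NOISES = ["Pink", "Factory1", "F16", "Destroyerengine", "Buccaneer1",    # A1
          "Babble", "White", "Hfchannel", "Factory2", "Buccaneer2",      # A2
          "Volvo", "Machinegun", "M109", "Leopard", "Destroyerops"]      # A3
GROUP_OF = np.repeat(np.arange(3), 5)  # noise index -> category index

def accuracy_table(true_noise, pred_group, pred_fine):
    """Per-noise (BP0, BP, BP0+BP) accuracies from per-frame predictions."""
    true_noise = np.asarray(true_noise)
    pred_group = np.asarray(pred_group)
    pred_noise = 5 * pred_group + np.asarray(pred_fine)
    table = {}
    for i, name in enumerate(NOISES):
        sel = true_noise == i
        if not sel.any():
            continue
        stage1_ok = pred_group[sel] == GROUP_OF[i]   # routed to the right category
        final_ok = pred_noise[sel] == i              # right noise after both stages
        # Second-level accuracy, restricted to correctly routed frames (assumption).
        bp = final_ok[stage1_ok].mean() if stage1_ok.any() else float("nan")
        table[name] = (stage1_ok.mean(), bp, final_ok.mean())
    return table

# Toy usage: four Pink frames, one misrouted and one misclassified at the second level.
table = accuracy_table(true_noise=[0, 0, 0, 0],
                       pred_group=[0, 0, 0, 1],
                       pred_fine=[0, 0, 1, 0])
print(table)
```

With this definition the BP0+BP value of a noise equals the product of its BP0 and BP values, which makes the contribution of each stage to the final error easy to read off.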

Table 1. Classification accuracy of the BP networks for the 15 noise types

Figure BDA0002228629710000051

Figure BDA0002228629710000061
