Voice signal separation method

Document No.: 1339716. Publication date: 2020-07-17.

Abstract: Voice signal separation method, created by 李一兵, 吴静, 孙骞, 吕威 and 田园 on 2020-03-19. The invention provides a voice signal separation method. A linear instantaneous mixing model of the observation signals is established first. To address the marked drop in separation accuracy as the number of source signals increases, an improved minimum $l_1$-norm algorithm is proposed. The algorithm first preprocesses the observation signals and the mixing matrix, then uses the lengths and directions of the column vectors to find the vectors closest to the observation signal. On this basis the form of the mixing matrix is changed, the changed mixing matrix is used to estimate the source signals at a given time instant, and the source signals at all time instants are then estimated. The proposed method alleviates the marked drop in separation accuracy as the number of source signals increases, while separating the source signals effectively.

1. A method for separating a speech signal, comprising the steps of:

step 1: establishing a linear instantaneous mixing model of the observation signals, specifically:

$$x(t) = A s(t)$$

where $x(t) = [x_1(t), x_2(t), \ldots, x_N(t)]^T$ is the N-dimensional observation signal vector, $A = [a_1, a_2, \ldots, a_M]$ is the $N \times M$ mixing matrix, $s(t) = [s_1(t), s_2(t), \ldots, s_M(t)]^T$ is the M-dimensional source signal vector, $t$ is the time sample index, and $a_i$ denotes the $i$-th column vector of the mixing matrix;

step 2: removing all zero column vectors from the observation signals, and then reflecting the observation signals into the upper half-plane;

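step 3: separating the source signals with the improved minimum $l_1$-norm method, where the minimum $l_1$-norm criterion is:

$$\min_{s(t)} \sum_{i=1}^{M} |s_i(t)| \quad \text{subject to} \quad x(t) = A s(t)$$

the specific steps being as follows: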

(3a) calculating the observation signal direction angle $\theta(t)$ at time $t$ and the column vector direction angles $\alpha_i$ of the mixing matrix, using:

$$\theta(t) = \arctan\left(\frac{x_2(t)}{x_1(t)}\right)$$

$$\alpha_i = \arctan\left(\frac{a_{i2}}{a_{i1}}\right), \quad i = 1, 2, \ldots, M$$

where $x_1(t)$ and $x_2(t)$ are the two observation signals, and $a_{in}$ is the $n$-th element of the $i$-th column vector of the mixing matrix;

(3b) calculating the direction obtained by mixing any two column vectors of the mixing matrix, using the law of sines and the law of cosines, as follows:

$$\angle AOB = \angle AOx - \angle BOx$$

$$AB^2 = OA^2 + OB^2 - 2\,OA \cdot OB \cos\angle AOB$$

$$OC^2 = OA^2 + AC^2 - 2\,OA \cdot AC \cos\angle OAC$$

$$\angle COx = \angle AOx - \angle AOC$$

where the vectors OA and OB are any two column vectors of the estimated mixing matrix, $\angle AOx$ and $\angle BOx$ are the direction angles of the corresponding column vectors, and the lengths of OA and OB are the lengths of those column vectors;

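(3c) calculating the angle $\Delta\theta$ between $\theta(t)$ and $\alpha_i$:

if $\Delta\theta = 0$, using:

$$x(t) = a_i s_i(t)$$

where $x(t)$ is the observed signal vector at time $t$, $a_i$ is the $i$-th column vector of the mixing matrix, and $s_i(t)$ is the estimate of the $i$-th source signal at time $t$;

if $\Delta\theta \neq 0$, using:

$$s_r(t) = W_r x(t)$$

where $W_r = A_r^{-1}$, $A_r = [a^c, a^d]$, $s_r(t)$ collects the two source signals associated with $a^c$ and $a^d$, and $a^c$ and $a^d$ are the two column vectors closest to the observed signal vector at time $t$;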

(3d) traversing all time instants to obtain the source signal estimates $s(t)$ at all time instants.

Technical Field

The invention relates to a voice signal separation method, in particular to a voice signal separation method under an underdetermined model, and belongs to the field of signal processing.

Background

In recent years, the separation of speech signals has become a research hotspot in the field of signal processing, with applications in teleconferencing, hearing aids and machine speech recognition. Because received sound is usually noisy, identifying the sound of interest and recovering it clearly in such an environment is an important problem, known as the blind source separation problem.

Blind source separation is generally classified by the relative numbers of source signals and observation signals into overdetermined, determined and underdetermined blind source separation. The underdetermined case is closer to practical conditions, more widely applicable, and more challenging. Underdetermined blind source separation refers to the case where the number of sensors or microphones is smaller than the number of source signals. In general, methods that solve the underdetermined problem are also suitable for the overdetermined and determined cases, so research on underdetermined blind source separation is necessary. The usual approach to underdetermined blind source separation is sparse component analysis, commonly referred to as the "two-step" method: the first step estimates the mixing matrix from the observation signals, and the second step separates the source signals using the estimated mixing matrix. In the current state of research, existing source signal separation algorithms generally suffer a marked drop in separation accuracy as the number of source signals increases.

Disclosure of Invention

In view of the above prior art, the technical problem to be solved by the present invention is to provide a speech signal separation method based on improved minimum $l_1$-norm that mitigates the marked drop in separation accuracy as the number of source signals increases.

In order to solve the above technical problem, the present invention provides a method for separating a voice signal, comprising the following steps:

step 1: establishing a linear instantaneous mixing model of the observation signals, specifically:

$$x(t) = A s(t)$$

where $x(t) = [x_1(t), x_2(t), \ldots, x_N(t)]^T$ is the N-dimensional observation signal vector, $A = [a_1, a_2, \ldots, a_M]$ is the $N \times M$ mixing matrix, $s(t) = [s_1(t), s_2(t), \ldots, s_M(t)]^T$ is the M-dimensional source signal vector, $t$ is the time sample index, and $a_i$ denotes the $i$-th column vector of the mixing matrix;

step 2: removing all zero column vectors from the observation signals, and then reflecting the observation signals into the upper half-plane;

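step 3: separating the source signals with the improved minimum $l_1$-norm method, where the minimum $l_1$-norm criterion is:

$$\min_{s(t)} \sum_{i=1}^{M} |s_i(t)| \quad \text{subject to} \quad x(t) = A s(t)$$

the specific steps being as follows: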

(3a) calculating the observation signal direction angle $\theta(t)$ at time $t$ and the column vector direction angles $\alpha_i$ of the mixing matrix, using:

$$\theta(t) = \arctan\left(\frac{x_2(t)}{x_1(t)}\right)$$

$$\alpha_i = \arctan\left(\frac{a_{i2}}{a_{i1}}\right), \quad i = 1, 2, \ldots, M$$

where $x_1(t)$ and $x_2(t)$ are the two observation signals, and $a_{in}$ is the $n$-th element of the $i$-th column vector of the mixing matrix.

(3b) Calculating the direction obtained by mixing any two column vectors of the mixing matrix, using the law of sines and the law of cosines, as follows:

$$\angle AOB = \angle AOx - \angle BOx$$

$$AB^2 = OA^2 + OB^2 - 2\,OA \cdot OB \cos\angle AOB$$

$$OC^2 = OA^2 + AC^2 - 2\,OA \cdot AC \cos\angle OAC$$

$$\angle COx = \angle AOx - \angle AOC$$

where the vectors OA and OB are any two column vectors of the estimated mixing matrix, $\angle AOx$ and $\angle BOx$ are the direction angles of the corresponding column vectors, and the lengths of OA and OB are the lengths of those column vectors.

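(3c) Calculating the angle $\Delta\theta$ between $\theta(t)$ and $\alpha_i$:

If $\Delta\theta = 0$, use:

$$x(t) = a_i s_i(t)$$

where $x(t)$ is the observed signal vector at time $t$, $a_i$ is the $i$-th column vector of the mixing matrix, and $s_i(t)$ is the estimate of the $i$-th source signal at time $t$.

If $\Delta\theta \neq 0$, use:

$$s_r(t) = W_r x(t)$$

where $W_r = A_r^{-1}$, $A_r = [a^c, a^d]$, $s_r(t)$ collects the two source signals associated with $a^c$ and $a^d$, and $a^c$ and $a^d$ are the two column vectors closest to the observed signal vector at time $t$.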

(3d) Traversing all time instants to obtain the source signal estimates $s(t)$ at all time instants.

The beneficial effects of the invention are as follows. The invention addresses the second step of the sparse component analysis method, source signal separation, for which it adopts a separation method based on improved minimum $l_1$-norm.

(1) The proposed source signal separation algorithm is applicable to two observation signals;

(2) as the number of source signals increases, the separation accuracy of the algorithm declines more gradually.

Drawings

FIG. 1 is a flow chart of the algorithm of the present invention;

FIG. 2 shows the three original source signals;

FIG. 3 shows the two mixed observation signals;

FIG. 4 is a schematic diagram of the mixing of any two column vectors;

FIG. 5 shows the three separated source signals.

Detailed Description

The method first finds the vectors closest to the observed signal according to the lengths and angles of the column vectors, then changes the form of the mixing matrix, estimates the source signals at a given time instant using the changed mixing matrix, and finally estimates the source signals at all time instants.

The invention is described in detail below with reference to the accompanying drawings and specific embodiments.

Referring to FIG. 1, the speech signal separation method based on improved minimum $l_1$-norm of the present invention comprises the following steps:

Step 1: establishing a linear instantaneous mixing model of the observation signals. FIG. 2 shows the three original source signals, and FIG. 3 shows the two mixed observation signals.

In step 1, the mathematical model established is a linear instantaneous mixing model. Speech signals are chosen as the source signals, the noise considered is additive noise, and the signal-to-noise ratio is 30 dB.

The linear instantaneous mixing model of the observation signals is:

$$x(t) = A s(t)$$

where $x(t) = [x_1(t), x_2(t), \ldots, x_N(t)]^T$ is the N-dimensional observation signal vector, $A = [a_1, a_2, \ldots, a_M]$ is the $N \times M$ mixing matrix, $s(t) = [s_1(t), s_2(t), \ldots, s_M(t)]^T$ is the M-dimensional source signal vector, $t$ is the time sample index, and $a_i$ denotes the $i$-th column vector of the mixing matrix.
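The mixing model can be illustrated with a small numerical sketch. The matrix `A`, the toy sources `S` and the helper name `mix` below are illustrative assumptions, not values from the embodiment; the code only evaluates $x(t) = A s(t)$ for three sources and two observations, without noise.

```python
def mix(A, S):
    """Return X with X[n][t] = sum_m A[n][m] * S[m][t], i.e. x(t) = A s(t)."""
    n_obs, n_src = len(A), len(S)
    n_samples = len(S[0])
    return [
        [sum(A[n][m] * S[m][t] for m in range(n_src)) for t in range(n_samples)]
        for n in range(n_obs)
    ]

# Hypothetical 2 x 3 mixing matrix (N = 2 observations, M = 3 sources).
A = [[1.0, 0.5, -0.5],
     [0.0, 1.0, 1.0]]

# Three toy sources with four samples each; at most one is active per sample.
S = [[2.0, 0.0, 0.0, 0.0],
     [0.0, 3.0, 0.0, 0.0],
     [0.0, 0.0, -1.0, 0.0]]

X = mix(A, S)  # X[0] and X[1] are the two observation signals
```

Because each toy sample has a single active source, every non-zero observation sample lies along the direction of one column of `A`, which is the sparsity property that the later steps rely on.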

Step 2: removing all zero column vectors from the observation signals, and then reflecting the observation signals into the upper half-plane.

In step 2, all zero column vectors are removed because they contribute nothing to separating the source signals. To simplify subsequent processing, the observation signals are reflected into the upper half-plane.
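A minimal sketch of this preprocessing is given below. It assumes that reflecting into the upper half-plane means negating every sample that lies below the $x_1$ axis (or on its negative half), which keeps the sample on the same line through the origin; the text does not spell out the exact rule, so this is an assumption. The function name `preprocess` and the returned bookkeeping lists are hypothetical.

```python
def preprocess(X):
    """Drop all-zero samples and reflect the rest into the upper half-plane.

    X is a list of (x1, x2) observation samples. Returns (kept, signs, index),
    where signs[k] is -1.0 if sample k was negated and index[k] is its
    original position, so that estimates can be mapped back afterwards.
    """
    kept, signs, index = [], [], []
    for t, (x1, x2) in enumerate(X):
        if x1 == 0.0 and x2 == 0.0:
            continue  # a zero sample carries no information about the sources
        # Assumption: samples below the x1 axis (or on its negative half)
        # are negated so that every direction angle falls in [0, pi).
        flip = x2 < 0.0 or (x2 == 0.0 and x1 < 0.0)
        sign = -1.0 if flip else 1.0
        kept.append((sign * x1, sign * x2))
        signs.append(sign)
        index.append(t)
    return kept, signs, index

samples = [(2.0, 0.0), (1.5, 3.0), (0.5, -1.0), (0.0, 0.0), (-1.0, 0.0)]
kept, signs, index = preprocess(samples)
```

Keeping `signs` and `index` is a design choice of this sketch: a source estimate computed from a negated sample has to be multiplied by the same sign to undo the reflection.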

And step 3: using improved minimization₁The norm separates the source signals.

For separating the source signals, the invention uses minimization₁Norm criterion.

The method comprises the following specific steps:

(3a) Calculating the observation signal direction angle $\theta(t)$ at time $t$ and the column vector direction angles $\alpha_i$ of the mixing matrix.

The source signals at each sampling instant can be separated from the observed signal $x(t)$ at that instant, so the source separation problem reduces to a separation problem at a single sampling instant. The observed signal direction $\theta(t)$ at time $t$ and the column vector directions $\alpha_i$ of the mixing matrix are calculated first.

The calculation formulas are as follows:

$$\theta(t) = \arctan\left(\frac{x_2(t)}{x_1(t)}\right)$$

$$\alpha_i = \arctan\left(\frac{a_{i2}}{a_{i1}}\right), \quad i = 1, 2, \ldots, M$$

where $x_1(t)$ and $x_2(t)$ are the two observation signals, and $a_{in}$ is the $n$-th element of the $i$-th column vector of the mixing matrix.
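The two angle formulas can be sketched as follows. The sketch uses `math.atan2` folded into $[0, \pi)$ rather than a plain arctangent of the ratio; this is an assumption made here so that a zero first component is handled and so that the angles match samples reflected into the upper half-plane. The function names and the example matrix are hypothetical.

```python
import math

def direction_angle(v1, v2):
    """Angle of the 2-D vector (v1, v2), folded into [0, pi)."""
    # atan2 is used instead of arctan(v2 / v1) so that v1 == 0 is handled;
    # the modulo maps opposite directions onto the same line angle.
    return math.atan2(v2, v1) % math.pi

def column_angles(A):
    """Direction angle alpha_i of every column of a 2 x M mixing matrix."""
    return [direction_angle(A[0][i], A[1][i]) for i in range(len(A[0]))]

# Hypothetical 2 x 3 mixing matrix and one observation sample.
A = [[1.0, 0.5, -0.5],
     [0.0, 1.0, 1.0]]
alphas = column_angles(A)
theta = direction_angle(1.5, 3.0)  # direction of the sample x(t) = (1.5, 3.0)
```

In this example `theta` coincides with the angle of the second column, which corresponds to the case $\Delta\theta = 0$ treated in step (3c).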

(3b) Calculating the direction obtained by mixing any two column vectors of the mixing matrix, using the law of sines and the law of cosines. FIG. 4 is a schematic diagram of the mixing of any two column vectors.

Because the length and the direction of the column vectors are both taken into account when minimizing the sum of the moduli of the source signals, the direction obtained by mixing any two column vectors must be computed from the known lengths and directions of the column vectors of the mixing matrix, using the law of sines and the law of cosines.

The specific process is as follows:

$$\angle AOB = \angle AOx - \angle BOx$$

$$AB^2 = OA^2 + OB^2 - 2\,OA \cdot OB \cos\angle AOB$$

$$OC^2 = OA^2 + AC^2 - 2\,OA \cdot AC \cos\angle OAC$$

$$\angle COx = \angle AOx - \angle AOC$$

where the vectors OA and OB are any two column vectors of the mixing matrix, $\angle AOx$ and $\angle BOx$ are the direction angles of the corresponding column vectors, and the lengths of OA and OB are the lengths of those column vectors.
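The triangle relations above can be sketched numerically as follows. The text does not state how the point C is chosen, so the sketch assumes that C lies on the segment AB and takes the distance AC as a hypothetical input `ac`; the function names are also hypothetical. The angles are recovered with the cosine form throughout, since the arccosine is unambiguous for obtuse angles, whereas the law of sines would need an extra case distinction.

```python
import math

def _acos(value):
    """arccos with the argument clamped to [-1, 1] against rounding error."""
    return math.acos(max(-1.0, min(1.0, value)))

def mixed_direction(len_a, ang_a, len_b, ang_b, ac):
    """Return (OC, angle COx) for a point C on segment AB with |AC| = ac.

    OA and OB are two column vectors given by length and direction angle,
    with ang_a > ang_b. The position of C is an assumption of this sketch.
    """
    aob = ang_a - ang_b                                  # angle AOB
    ab = math.sqrt(len_a ** 2 + len_b ** 2
                   - 2 * len_a * len_b * math.cos(aob))  # law of cosines
    # Angle OAC equals angle OAB because C is assumed to lie on AB.
    oac = _acos((len_a ** 2 + ab ** 2 - len_b ** 2) / (2 * len_a * ab))
    oc = math.sqrt(len_a ** 2 + ac ** 2
                   - 2 * len_a * ac * math.cos(oac))     # law of cosines
    aoc = _acos((len_a ** 2 + oc ** 2 - ac ** 2) / (2 * len_a * oc))
    return oc, ang_a - aoc                               # COx = AOx - AOC

# Hypothetical column vectors: |OA| = 2 at 120 degrees, |OB| = 1 at 30 degrees,
# with C taken as the midpoint of AB.
ab_length = math.sqrt(2.0 ** 2 + 1.0 ** 2)  # angle AOB is 90 degrees here
oc, cox = mixed_direction(2.0, 2 * math.pi / 3, 1.0, math.pi / 6, ab_length / 2)
```

For this right-angled example the midpoint of the hypotenuse is equidistant from O, A and B, so `oc` comes out as half the length of AB.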

(3c) Calculating the angle $\Delta\theta$ between $\theta(t)$ and $\alpha_i$:

If $\Delta\theta = 0$, the direction of the observation sample coincides with the direction of one column vector of the mixing matrix, and the source signal is obtained from:

$$x(t) = a_i s_i(t)$$

where $x(t)$ is the observed signal vector at time $t$, $a_i$ is the $i$-th column vector of the mixing matrix, and $s_i(t)$ is the estimate of the $i$-th source signal at time $t$.

If $\Delta\theta \neq 0$, the direction of the observation sample differs from the direction of every column vector of the mixing matrix. In this case, the directions obtained in (3b) by mixing any two column vectors are used to find the two column vectors $a^c$ and $a^d$ that minimize the sum of the moduli of the source signals. The source signals at the corresponding time instant are then obtained from:

$$s_r(t) = W_r x(t)$$

where $W_r = A_r^{-1}$, $A_r = [a^c, a^d]$, $s_r(t)$ collects the two source signals associated with $a^c$ and $a^d$, and $a^c$ and $a^d$ are the two column vectors closest to $x(t)$ at time $t$.

(3d) Traversing all time instants to obtain the source signal estimates $s(t)$ at all time instants. FIG. 5 shows the three separated source signals.
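Steps (3c) and (3d) can be sketched for two observation signals as follows. Note the assumption: when $\Delta\theta \neq 0$, this sketch picks the two columns whose direction angles enclose $\theta(t)$, which is the conventional minimum $l_1$-norm selection by direction only; it does not implement the length-aware selection of the improved method. The function names, the tolerance `tol` and the example data are hypothetical, and the columns of `A` are assumed to have pairwise distinct directions.

```python
import math

def line_angle(v1, v2):
    """Direction of (v1, v2) as a line angle in [0, pi)."""
    return math.atan2(v2, v1) % math.pi

def angle_gap(a, b):
    """Smallest gap between two line angles, accounting for the wrap at pi."""
    d = abs(a - b)
    return min(d, math.pi - d)

def separate(X, A, tol=1e-9):
    """Estimate S (M x T) from X (2 x T) and a known 2 x M mixing matrix A."""
    n_src, n_samples = len(A[0]), len(X[0])
    cols = [(A[0][i], A[1][i]) for i in range(n_src)]
    order = sorted(range(n_src), key=lambda i: line_angle(*cols[i]))
    alphas = [line_angle(*cols[i]) for i in order]
    S = [[0.0] * n_samples for _ in range(n_src)]
    for t in range(n_samples):            # step (3d): visit every instant
        x1, x2 = X[0][t], X[1][t]
        if x1 == 0.0 and x2 == 0.0:
            continue                      # zero sample: estimates stay 0
        theta = line_angle(x1, x2)
        hit = [k for k, a in enumerate(alphas) if angle_gap(theta, a) < tol]
        if hit:
            # Delta theta = 0: solve x(t) = a_i s_i(t) by projection onto a_i.
            i = order[hit[0]]
            a1, a2 = cols[i]
            S[i][t] = (a1 * x1 + a2 * x2) / (a1 * a1 + a2 * a2)
            continue
        # Delta theta != 0: take the two columns whose angles enclose theta
        # (wrapping around at 0 / pi) and invert the 2 x 2 matrix [a_c, a_d].
        k = sum(1 for a in alphas if a < theta)
        c, d = order[k - 1], order[k % n_src]
        (c1, c2), (d1, d2) = cols[c], cols[d]
        det = c1 * d2 - d1 * c2
        S[c][t] = (x1 * d2 - d1 * x2) / det   # Cramer's rule for s_c(t)
        S[d][t] = (c1 * x2 - x1 * c2) / det   # Cramer's rule for s_d(t)
    return S

A = [[1.0, 0.5, -0.5],
     [0.0, 1.0, 1.0]]
X = [[2.0, 1.5, 0.5, 0.0, 1.0],
     [0.0, 3.0, -1.0, 0.0, 1.0]]
S = separate(X, A)
```

The first three samples are aligned with one column each and recover a single source, while the last sample, $x(t) = (1, 1)$, falls between the first two columns and is split between the corresponding two sources. Because the 2 x 2 solve is linear, the sketch works on signed samples directly and does not need the reflection of step 2.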

The speech signal separation method based on improved minimum $l_1$-norm of the invention has the advantage that the separation accuracy declines only gradually as the number of source signals increases.

The speech signal separation method based on improved minimum $l_1$-norm of the invention is applicable only to the case of two observation signals.

In summary, the invention provides a speech signal separation method based on improved minimum $l_1$-norm. A linear instantaneous mixing model of the observation signals is established first, and an improved minimum $l_1$-norm algorithm is proposed to address the marked drop in separation accuracy as the number of source signals increases. The algorithm first preprocesses the observation signals and the mixing matrix, then finds the vectors closest to the observation signal according to the lengths and directions of the column vectors. On this basis it changes the form of the mixing matrix, uses the changed mixing matrix to estimate the source signals at a given time instant, and then estimates the source signals at all time instants. The proposed method alleviates the marked drop in separation accuracy as the number of source signals increases, while separating the source signals effectively.

It should be noted that the above embodiments do not limit the present invention in any way, and all technical solutions obtained by using equivalent alternatives or equivalent variations fall within the scope of the present invention.
