Voice noise reduction processing method and device, computer equipment and storage medium

Document No.: 1536661. Publication date: 2020-02-14.

Reading note: this technology, "Voice noise reduction processing method and device, computer equipment and storage medium" (语音降噪处理方法、装置、计算机设备及存储介质), was designed by 肖强, 肖全之, 黄荣均, 方桂萍 and 闫玉凤 on 2019-10-21. Main content: The application relates to a voice noise reduction processing method and device, computer equipment and a storage medium. The voice noise reduction processing method comprises: when it is detected that the distance between the voice acquisition device and a target object reaches a preset value, acquiring the noisy voice signal collected by the voice acquisition device and performing frequency division processing on it to obtain a low-frequency band signal; acquiring the amplitude spectrum and the phase spectrum of the low-frequency band signal; acquiring the modulation domain signal corresponding to the amplitude spectrum; processing the modulation domain amplitude spectrum or power spectrum by spectral subtraction to obtain a noise-reduced modulation domain amplitude spectrum or power spectrum; compensating the modulation domain phase spectrum to obtain a compensated modulation domain phase spectrum; and obtaining a noise-reduced low-frequency band signal based on the compensated modulation domain phase spectrum, the noise-reduced modulation domain amplitude spectrum and the phase spectrum of the low-frequency band signal. The noise-reduced low-frequency band signal is used for synthesizing the noise-reduced voice signal. The smoothed over-subtraction factor used in the spectral subtraction avoids the excessive removal of speech information that occurs when the over-subtraction factor is too large.

1. A speech noise reduction processing method is characterized by comprising the following steps:

when the distance between the signal acquisition devices reaches a preset value, acquiring a noise-containing voice signal acquired by the voice acquisition equipment, and carrying out frequency division processing on the noise-containing voice signal to obtain a low-frequency band signal;

acquiring an amplitude spectrum and a phase spectrum of the low-frequency band signal;

acquiring a modulation domain signal corresponding to the amplitude spectrum; the modulation domain signal comprises a modulation domain amplitude spectrum, a modulation domain power spectrum and a modulation domain phase spectrum;

processing the modulation domain amplitude spectrum or the modulation domain power spectrum by adopting a spectral subtraction method to obtain a modulation domain amplitude spectrum after noise reduction; the smooth over-subtraction factor in the spectral subtraction is obtained according to the posterior signal-to-noise ratio and the smooth factor of the modulation domain;

compensating the modulation domain phase spectrum to obtain a compensated modulation domain phase spectrum;

obtaining a low-frequency band signal subjected to noise reduction based on the compensated modulation domain phase spectrum, the modulation domain amplitude spectrum subjected to noise reduction and the phase spectrum of the low-frequency band signal; and the low-frequency band signal after noise reduction is used for synthesizing a noise-reduced voice signal.

2. The speech noise reduction processing method according to claim 1, further comprising the steps of:

carrying out frequency division processing on the noisy speech signal to obtain a high-frequency band signal;

acquiring an amplitude spectrum and a phase spectrum of the high-frequency band signal;

processing the amplitude spectrum of the high-frequency band signal by adopting a transfer gain function to obtain the amplitude spectrum of the high-frequency band signal subjected to noise reduction; obtaining a noise-reduced high-frequency band signal based on the amplitude spectrum of the noise-reduced high-frequency band signal and the phase spectrum of the high-frequency band signal; the transfer gain function is obtained according to the power spectrum and the cross-power spectrum of the high-frequency band signal and the estimated noise cross-power spectrum;

and synthesizing the noise-reduced high-frequency band signal and the noise-reduced low-frequency band signal to obtain the noise-reduced voice signal.

3. The speech noise reduction processing method according to claim 1, wherein in the step of obtaining the noise-reduced modulation domain amplitude spectrum by processing the modulation domain amplitude spectrum or the modulation domain power spectrum by spectral subtraction, the noise-reduced modulation domain amplitude spectrum is obtained based on the following formula:

Figure FDA0002241015970000021

Figure FDA0002241015970000022

wherein u is a modulation frame variable; w is a discrete frequency variable; k is a modulation domain variable; |S(u, w, k)| is the modulation domain amplitude spectrum after noise reduction; p is the type of spectral subtraction method, using modulation domain amplitude spectral subtraction when p = 1 and modulation domain power spectral subtraction when p = 2; when p = 1, |V(u, w, k)|^p is the estimated noise modulation domain amplitude spectrum, and when p = 2, |V(u, w, k)|^p is the estimated noise modulation domain power spectrum; α(k) is the modulation domain smoothed over-subtraction factor; |YLF(u, w, k)|^p is the modulation domain amplitude spectrum or power spectrum; SNRpost(u, w, k) is the posterior signal-to-noise ratio of the modulation domain; θ is the smoothing factor; α0 is a constant.

4. The speech noise reduction processing method according to claim 1, wherein the step of compensating the modulation domain phase spectrum to obtain a compensated modulation domain phase spectrum comprises:

adopting an antisymmetric function and an estimated noise modulation domain amplitude spectrum to carry out conjugate angle adjustment on the modulation domain phase spectrum to obtain phase compensation;

or, the anti-symmetric function and the estimated noise modulation domain power spectrum are adopted to carry out conjugate angle adjustment on the modulation domain phase spectrum to obtain phase compensation;

and obtaining the compensated modulation domain phase spectrum according to the phase compensation and the modulation domain phase spectrum.

5. The speech noise reduction processing method according to claim 4, wherein the step of obtaining the phase compensation by performing a conjugate angle adjustment on the modulation domain phase spectrum using an antisymmetric function and the estimated noise modulation domain amplitude spectrum is performed based on the following formula:

Figure FDA0002241015970000023

wherein u is a modulation frame variable; w is a discrete frequency variable; k is a modulation domain variable; Λ (u, w, k) is the phase compensation;

Figure FDA0002241015970000031

in the step of obtaining phase compensation, the phase compensation is obtained based on the following formula by using an antisymmetric function and an estimated noise modulation domain power spectrum to perform conjugate angle adjustment on the modulation domain phase spectrum:

Figure FDA0002241015970000032

wherein u is a modulation frame variable; w is a discrete frequency variable; k is a modulation domain variable; Λ (u, w, k) is the phase compensation;

Figure FDA0002241015970000033

in the step of obtaining the compensated modulation domain phase spectrum according to the phase compensation and the modulation domain phase spectrum, the compensated phase spectrum is obtained based on the following formula:

angle[S(u,w,k)]=angle[YLF(u,w,k)]+Λ(u,w,k);

wherein angle[S(u,w,k)] is the compensated phase spectrum; angle[YLF(u,w,k)] is the modulation domain phase spectrum.

6. The speech noise reduction processing method according to claim 1, wherein the step of obtaining the noise-reduced low-frequency band signal based on the compensated modulation domain phase spectrum, the noise-reduced modulation domain amplitude spectrum and the phase spectrum of the low-frequency band signal comprises:

sequentially carrying out Fourier inverse transformation and overlap addition processing on the modulation domain amplitude spectrum subjected to noise reduction and the modulation domain phase spectrum subjected to compensation to obtain the amplitude spectrum of the low-frequency band signal subjected to noise reduction;

and sequentially carrying out Fourier inverse transformation processing and overlap addition processing on the amplitude spectrum and the low-frequency band signal phase spectrum of the low-frequency band signal subjected to noise reduction to obtain the low-frequency band signal subjected to noise reduction.

7. The speech noise reduction processing method according to claim 2, wherein the step of obtaining the noise-reduced high-band signal based on the amplitude spectrum of the noise-reduced high-band signal and the phase spectrum of the high-band signal comprises:

and sequentially carrying out Fourier inverse transformation processing and overlap addition on the phase spectrum of the high-frequency band signal and the amplitude spectrum of the noise-reduced high-frequency band signal to obtain the noise-reduced high-frequency band signal.

8. The speech noise reduction processing method according to claim 2, wherein in the step of processing the amplitude spectrum of the high-band signal by using a transfer gain function to obtain the amplitude spectrum of the noise-reduced high-band signal, the amplitude spectrum of the noise-reduced high-band signal is obtained based on the following formula:

|S(u,w)|=|YHF(u,w)|H(u,w);

Figure FDA0002241015970000041

Figure FDA0002241015970000042

Figure FDA0002241015970000043

Figure FDA0002241015970000044

wherein |S(u, w)| is the amplitude spectrum of the noise-reduced high-frequency band signal; H(u, w) is the transfer gain function; |YHF(u, w)| is the high-frequency band amplitude spectrum of the noisy speech;

Figure FDA0002241015970000045

9. The speech noise reduction processing method according to claim 1, wherein the step of obtaining the magnitude spectrum and the phase spectrum of the low-frequency band signal comprises:

preprocessing the low-frequency band signal to obtain a stable low-frequency band signal;

and processing the stable low-frequency band signal by adopting Fourier transform to obtain the amplitude spectrum and the phase spectrum of the low-frequency band signal.

10. The speech noise reduction processing method according to claim 9, wherein the step of preprocessing the low-frequency band signal to obtain a stable low-frequency band signal comprises:

and sequentially performing framing processing and windowing processing on the low-frequency band signal to obtain the stable low-frequency band signal.

11. The speech noise reduction processing method according to claim 2, wherein the step of obtaining the magnitude spectrum and the phase spectrum of the high-band signal comprises:

preprocessing the high-frequency band signal to obtain a stable high-frequency band signal;

and carrying out Fourier transform processing on the stable high-frequency band signal to obtain an amplitude spectrum and a phase spectrum of the high-frequency band signal.

12. The speech noise reduction processing method according to claim 11, wherein the step of preprocessing the high-frequency band signal to obtain a stable high-frequency band signal comprises:

and sequentially performing framing processing and windowing processing on the high-frequency band signal to obtain the stable high-frequency band signal.

13. The speech noise reduction processing method according to claim 1, wherein the step of obtaining the modulation domain signal corresponding to the amplitude spectrum comprises:

and processing the amplitude spectrum by adopting Fourier transform to obtain the modulation domain signal.

14. The speech noise reduction processing method according to any one of claims 2 to 13, wherein the step of performing frequency division processing on the noisy speech signal to obtain a low-frequency band signal comprises:

carrying out non-convex low-pass filtering processing on the noisy speech signal to obtain a low-frequency band signal;

the step of performing frequency division processing on the noisy speech signal to obtain a high-frequency band signal comprises the following steps:

and carrying out non-convex high-pass filtering processing on the noisy speech signal to obtain the high-frequency band signal.

15. A speech noise reduction processing apparatus, comprising:

the voice signal acquisition module is used for acquiring a noise-containing voice signal acquired by the voice acquisition equipment when the distance between the signal acquisition devices reaches a preset value;

the low-frequency filtering module is used for carrying out frequency division processing on the noise-containing voice signal to obtain a low-frequency band signal;

the low-frequency band signal frequency spectrum acquisition module is used for acquiring a magnitude spectrum and a phase spectrum of the low-frequency band signal;

the modulation domain signal acquisition module is used for acquiring a modulation domain signal corresponding to the amplitude spectrum; the modulation domain signal comprises a modulation domain amplitude spectrum, a modulation domain power spectrum and a modulation domain phase spectrum;

the modulation domain amplitude spectrum processing module is used for processing the modulation domain amplitude spectrum or the power spectrum by adopting a spectral subtraction method to obtain a modulation domain amplitude spectrum after noise reduction; the smooth over-subtraction factor in the spectral subtraction is obtained according to the posterior signal-to-noise ratio and the smooth factor of the modulation domain;

the compensation module is used for compensating the modulation domain phase spectrum to obtain a compensated modulation domain phase spectrum;

a low-frequency band signal denoising module, configured to obtain a low-frequency band signal after denoising based on the compensated modulation domain phase spectrum, the modulation domain amplitude spectrum after denoising, and the phase spectrum of the low-frequency band signal; and the low-frequency band signal after noise reduction is used for synthesizing a noise-reduced voice signal.

16. The speech noise reduction processing apparatus according to claim 15, further comprising:

the high-frequency filtering module is used for carrying out frequency division processing on the voice signal containing the noise to obtain a high-frequency band signal;

the high-frequency band signal frequency spectrum acquisition module is used for acquiring an amplitude spectrum and a phase spectrum of the high-frequency band signal;

the high-frequency band signal noise reduction module is used for processing the amplitude spectrum of the high-frequency band signal by adopting a transfer gain function to obtain the amplitude spectrum of the high-frequency band signal subjected to noise reduction; obtaining a noise-reduced high-frequency band signal based on the amplitude spectrum of the noise-reduced high-frequency band signal and the phase spectrum of the high-frequency band signal; the transfer gain function is obtained according to the power spectrum and the cross-power spectrum of the high-frequency band signal and the estimated noise cross-power spectrum;

and the synthesis module is used for synthesizing the noise-reduced high-frequency band signal and the noise-reduced low-frequency band signal to obtain the noise-reduced voice signal.

17. A computer device comprising a memory and a processor, the memory storing a computer program, wherein the processor implements the steps of the method of any one of claims 1 to 14 when executing the computer program.

18. A computer-readable storage medium, on which a computer program is stored, which, when being executed by a processor, carries out the steps of the method of any one of claims 1 to 14.

Technical Field

The present application relates to the field of speech noise reduction technologies, and in particular, to a speech noise reduction processing method and apparatus, a computer device, and a storage medium.

Background

Noise strongly affects the acoustic analysis and characteristics of speech: different noises produce different striations in a spectrogram, so that acoustic features cannot be correctly identified and analyzed. In a voice transmission system, noise masks the speech and affects whether its content can be heard clearly, and excessive noise reduces the speech recognition rate. Voice noise reduction processing is therefore required to reduce the influence of noise on speech and to bring out the characteristics of the speech.

In the course of implementation, the inventors found at least the following problem in the conventional technology: conventional methods remove too much of the speech information.

Disclosure of Invention

In view of the foregoing, it is desirable to provide a voice noise reduction processing method, apparatus, computer device and storage medium capable of avoiding excessive removal of speech information.

In order to achieve the above object, an embodiment of the present invention provides a speech noise reduction processing method, including:

when the distance between the signal acquisition devices reaches a preset value, acquiring a noise-containing voice signal acquired by the voice acquisition equipment, and carrying out frequency division processing on the noise-containing voice signal to obtain a low-frequency band signal;

acquiring an amplitude spectrum and a phase spectrum of a low-frequency band signal;

acquiring a modulation domain signal corresponding to the amplitude spectrum; the modulation domain signal comprises a modulation domain amplitude spectrum, a modulation domain power spectrum and a modulation domain phase spectrum;

processing the modulation domain amplitude spectrum or the modulation domain power spectrum by adopting a spectral subtraction method to obtain a modulation domain amplitude spectrum after noise reduction; the smooth over-subtraction factor in the spectral subtraction is obtained according to the posterior signal-to-noise ratio and the smooth factor of the modulation domain;

compensating the modulation domain phase spectrum to obtain a compensated modulation domain phase spectrum;

obtaining a low-frequency band signal subjected to noise reduction based on the compensated modulation domain phase spectrum, the modulation domain amplitude spectrum subjected to noise reduction and the phase spectrum of the low-frequency band signal; the noise-reduced low-frequency band signal is used for synthesizing a noise-reduced speech signal.

In one embodiment, the method further comprises the following steps:

carrying out frequency division processing on the voice signal containing noise to obtain a high-frequency band signal;

obtaining an amplitude spectrum and a phase spectrum of the high-frequency band signal;

processing the amplitude spectrum of the high-frequency band signal by adopting a transfer gain function to obtain the amplitude spectrum of the noise-reduced high-frequency band signal; obtaining a noise-reduced high-frequency band signal based on the amplitude spectrum of the noise-reduced high-frequency band signal and the phase spectrum of the high-frequency band signal; the transfer gain function is obtained according to the power spectrum and the cross-power spectrum of the high-frequency band signal and the estimated noise cross-power spectrum;

and synthesizing the noise-reduced high-frequency band signal and the noise-reduced low-frequency band signal to obtain a noise-reduced voice signal.

In one embodiment, in the step of processing the modulation domain amplitude spectrum or the power spectrum by using a spectral subtraction method to obtain the modulation domain amplitude spectrum after noise reduction, the modulation domain amplitude spectrum after noise reduction is obtained based on the following formula:

Figure BDA0002241015980000021

Figure BDA0002241015980000022

wherein u is a modulation frame variable; w is a discrete frequency variable; k is a modulation domain variable; |S(u, w, k)| is the modulation domain amplitude spectrum after noise reduction; p is the type of spectral subtraction method, using modulation domain amplitude spectral subtraction when p = 1 and modulation domain power spectral subtraction when p = 2; when p = 1, |V(u, w, k)|^p is the estimated noise modulation domain amplitude spectrum, and when p = 2, |V(u, w, k)|^p is the estimated noise modulation domain power spectrum; α(k) is the modulation domain smoothed over-subtraction factor; |YLF(u, w, k)|^p is the modulation domain amplitude spectrum or power spectrum; SNRpost(u, w, k) is the posterior signal-to-noise ratio of the modulation domain; θ is the smoothing factor; α0 is a constant.
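
As a rough illustration of this spectral subtraction step, the following Python sketch applies over-subtraction with a recursively smoothed factor. The two formulas of this embodiment are given only as figures, so the specific rule below is an assumption and not the formula of the application: the instantaneous factor decreases linearly with the posterior SNR in dB (slope 0.15, clipped to [1, alpha0 + 0.75]), it is smoothed across modulation frames u with the smoothing factor theta, and a small spectral floor replaces negative results. The function name, the floor and all default values are hypothetical.

```python
import numpy as np

def smoothed_spectral_subtraction(Y_mag, V_mag, p=2, alpha0=4.0, theta=0.8, floor=0.01):
    """Return (|S|, alpha) for noisy and noise modulation amplitude spectra.

    Y_mag, V_mag: arrays whose first axis is the modulation frame index u.
    p = 1 subtracts amplitude spectra, p = 2 subtracts power spectra.
    All default values are illustrative assumptions.
    """
    Y_mag = np.asarray(Y_mag, dtype=float)
    V_mag = np.asarray(V_mag, dtype=float)
    eps = 1e-12
    # Posterior SNR in dB: noisy power over estimated noise power.
    snr_post_db = 10.0 * np.log10((Y_mag ** 2 + eps) / (V_mag ** 2 + eps))
    # Assumed mapping: subtract more at low SNR, less at high SNR.
    alpha_inst = np.clip(alpha0 - 0.15 * snr_post_db, 1.0, alpha0 + 0.75)
    # Recursive smoothing over u limits abrupt changes between adjacent frames.
    alpha = np.empty_like(alpha_inst)
    alpha[0] = alpha_inst[0]
    for u in range(1, alpha_inst.shape[0]):
        alpha[u] = theta * alpha[u - 1] + (1.0 - theta) * alpha_inst[u]
    Yp, Vp = Y_mag ** p, V_mag ** p
    # Over-subtract, then keep a small noise floor instead of negative values.
    Sp = np.maximum(Yp - alpha * Vp, floor * Vp)
    return Sp ** (1.0 / p), alpha

Y = np.array([[10.0], [10.0], [0.1], [10.0]])  # one bin, four modulation frames
V = np.ones_like(Y)                             # flat noise estimate
S, alpha = smoothed_spectral_subtraction(Y, V)
```

In this toy input the third frame has a very low posterior SNR, yet the smoothed factor rises only part of the way toward its instantaneous value, which is the behaviour the smoothing factor is meant to provide.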

In one embodiment, the step of compensating the modulation domain phase spectrum to obtain a compensated modulation domain phase spectrum includes:

adopting an antisymmetric function and the estimated noise modulation domain amplitude spectrum to carry out conjugate angle adjustment on the modulation domain phase spectrum to obtain phase compensation;

or, the anti-symmetric function and the estimated noise modulation domain power spectrum are adopted to carry out conjugate angle adjustment on the modulation domain phase spectrum to obtain phase compensation;

and obtaining a compensated modulation domain phase spectrum according to the phase compensation and the modulation domain phase spectrum.

In one embodiment, in the step of performing conjugate angle adjustment on the modulation domain phase spectrum using the antisymmetric function and the estimated noise modulation domain amplitude spectrum to obtain the phase compensation, the phase compensation is obtained based on the following formula:

Figure BDA0002241015980000031

wherein u is a modulation frame variable; w is a discrete frequency variable; k is a modulation domain variable; Λ (u, w, k) is phase compensation;

Figure BDA0002241015980000032

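is an anti-symmetric function; ξ is a constant; |V(u, w, k)|^1 is the estimated noise modulation domain amplitude spectrum;

in the step of performing conjugate angle adjustment on the modulation domain phase spectrum using the antisymmetric function and the estimated noise modulation domain power spectrum to obtain the phase compensation, the phase compensation is obtained based on the following formula: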

Figure BDA0002241015980000033

wherein u is a modulation frame variable; w is a discrete frequency variable; k is a modulation domain variable; Λ (u, w, k) is phase compensation;

Figure BDA0002241015980000034

is an anti-symmetric function; ξ is a constant; |V(u, w, k)|^2 is the estimated noise modulation domain power spectrum;

in the step of obtaining the compensated modulation domain phase spectrum according to the phase compensation and the modulation domain phase spectrum, the compensated phase spectrum is obtained based on the following formula:

angle[S(u,w,k)]=angle[YLF(u,w,k)]+Λ(u,w,k);

wherein angle[S(u,w,k)] is the compensated phase spectrum; angle[YLF(u,w,k)] is the modulation domain phase spectrum.
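
The phase compensation can be sketched in Python as follows. The expressions for Λ(u, w, k) and for the antisymmetric function appear only as figures, so this sketch assumes a simple form: the antisymmetric function is +1 on the first half of the modulation bins, -1 on the second half and 0 at k = 0, and Λ is ξ times that function times the estimated noise spectrum |V|^p. Only the final line, angle[S] = angle[YLF] + Λ, is taken from the text; the function names and the value of xi are hypothetical.

```python
import numpy as np

def antisymmetric(K):
    """Assumed Psi(k): +1 for 0 < k < K/2, -1 for K/2 < k < K, 0 elsewhere."""
    psi = np.zeros(K)
    psi[1:(K + 1) // 2] = 1.0
    psi[K // 2 + 1:] = -1.0
    return psi

def compensate_phase(Y_mod, V_mag, xi=0.1, p=1):
    """Return (compensated phase, Lambda) along the last (modulation) axis k.

    p = 1 uses the noise amplitude spectrum, p = 2 the noise power spectrum.
    """
    K = Y_mod.shape[-1]
    # Assumed form of the phase compensation term.
    lam = xi * antisymmetric(K) * np.asarray(V_mag, dtype=float) ** p
    # angle[S(u,w,k)] = angle[YLF(u,w,k)] + Lambda(u,w,k)
    return np.angle(Y_mod) + lam, lam

trajectory = np.arange(8.0)            # toy real-valued amplitude trajectory
Y_mod = np.fft.fft(trajectory)         # its modulation spectrum
V_mag = np.full(8, 2.0)                # flat noise estimate
phase_c, lam = compensate_phase(Y_mod, V_mag)
restored = np.fft.ifft(np.abs(Y_mod) * np.exp(1j * phase_c))
```

Because the assumed compensation term is antisymmetric in k, the compensated spectrum of a real trajectory stays conjugate-symmetric, so the inverse transform remains real up to rounding error.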

In one embodiment, the step of obtaining the noise-reduced low-frequency band signal based on the compensated modulation domain phase spectrum, the noise-reduced modulation domain amplitude spectrum, and the phase spectrum of the low-frequency band signal includes:

sequentially carrying out Fourier inverse transformation processing and overlap addition processing on the noise-reduced modulation domain amplitude spectrum and the compensated modulation domain phase spectrum to obtain the amplitude spectrum of the noise-reduced low-frequency band signal;

and sequentially carrying out Fourier inverse transformation processing and overlap addition processing on the amplitude spectrum of the noise-reduced low-frequency band signal and the phase spectrum of the low-frequency band signal to obtain the noise-reduced low-frequency band signal.
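
The two reconstruction stages can be sketched in Python as below: an inverse transform and overlap-add over the modulation axis recovers the amplitude spectrum, and a second inverse transform and overlap-add with the original phase spectrum recovers the waveform. This is a minimal sketch under assumptions that are not in the text: periodic Hann analysis windows with 50% overlap (so that overlap-add sums to one), the frame sizes, and all function names. The demo feeds unmodified spectra through both stages to check that the chain is invertible away from the signal edges.

```python
import numpy as np

def overlap_add(frames, hop):
    """Sum frames of shape (n_frames, frame_len, ...) at offsets of `hop`."""
    n_frames, frame_len = frames.shape[:2]
    out = np.zeros((hop * (n_frames - 1) + frame_len,) + frames.shape[2:])
    for i in range(n_frames):
        out[i * hop:i * hop + frame_len] += frames[i]
    return out

def modulation_to_amplitude(mod_mag, mod_phase, mod_hop):
    """Stage 1: (n_mod, n_bins, mod_len) spectra -> amplitude (n_frames, n_bins)."""
    segments = np.fft.ifft(mod_mag * np.exp(1j * mod_phase), axis=2).real
    return overlap_add(np.transpose(segments, (0, 2, 1)), mod_hop)

def amplitude_to_signal(amplitude, phase, hop, frame_len):
    """Stage 2: amplitude and phase spectra (n_frames, n_bins) -> waveform."""
    frames = np.fft.irfft(amplitude * np.exp(1j * phase), n=frame_len, axis=1)
    return overlap_add(frames, hop)

# Analysis helpers used only by this round-trip demo (hypothetical names).
def periodic_hann(n):
    return 0.5 - 0.5 * np.cos(2.0 * np.pi * np.arange(n) / n)

def split(x, length, hop):
    n = 1 + (len(x) - length) // hop
    return x[np.arange(length)[None, :] + hop * np.arange(n)[:, None]]

N, HOP, M, MHOP = 64, 32, 8, 4
x = np.random.default_rng(0).standard_normal(N + HOP * 31)      # 32 frames
spec = np.fft.rfft(split(x, N, HOP) * periodic_hann(N), axis=1)
amp, phase = np.abs(spec), np.angle(spec)
mod_segments = split(amp, M, MHOP) * periodic_hann(M)[:, None]   # (n_mod, M, n_bins)
mod = np.fft.fft(np.transpose(mod_segments, (0, 2, 1)), axis=2)

amp_rec = modulation_to_amplitude(np.abs(mod), np.angle(mod), MHOP)
x_rec = amplitude_to_signal(amp_rec, phase, HOP, N)
```

In the actual method the noise-reduced modulation amplitude spectrum and the compensated modulation phase spectrum would take the place of `np.abs(mod)` and `np.angle(mod)`.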

In one embodiment, the step of obtaining the noise-reduced high-frequency band signal based on the amplitude spectrum of the noise-reduced high-frequency band signal and the phase spectrum of the high-frequency band signal includes:

and sequentially carrying out Fourier inverse transformation processing and overlap addition on the phase spectrum of the high-frequency band signal and the amplitude spectrum of the noise-reduced high-frequency band signal to obtain the noise-reduced high-frequency band signal.

In one embodiment, in the step of processing the amplitude spectrum of the high-frequency band signal by using the transfer gain function to obtain the amplitude spectrum of the noise-reduced high-frequency band signal, the amplitude spectrum of the noise-reduced high-frequency band signal is obtained based on the following formula:

|S(u,w)|=|YHF(u,w)|H(u,w);

Figure BDA0002241015980000051

Figure BDA0002241015980000052

Figure BDA0002241015980000053

Figure BDA0002241015980000054

wherein |S(u, w)| is the amplitude spectrum of the noise-reduced high-frequency band signal; H(u, w) is the transfer gain function; |YHF(u, w)| is the high-frequency band amplitude spectrum of the noisy speech;

Figure BDA0002241015980000055

is the cross-power spectrum of the high-frequency band of the noisy speech;

Figure BDA0002241015980000056

is the estimated high-frequency band noise cross-power spectrum; i and j denote the high-frequency band signals collected by the two voice acquisition devices respectively;

Figure BDA0002241015980000057

is the modified posterior signal-to-noise ratio;

Figure BDA0002241015980000058

is a gain function; g, b and h are constants; |Yi(u, w)Yj(u, w)| is the cross-amplitude spectrum of the high-frequency band signals.
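
For orientation only, the Python sketch below shows one simple way a gain can be built from the auto-power spectra, the cross-power spectrum and an estimated noise cross-power spectrum of two channels. It is a simplified stand-in and not the transfer gain function of this embodiment, whose formulas appear only as figures: the modified posterior signal-to-noise ratio and the gain function with constants g, b and h are omitted. The recursive averaging weight, the gain floor and the function name are assumptions.

```python
import numpy as np

def cross_power_gain(Yi, Yj, noise_cross_power, lam=0.8, gain_floor=0.05):
    """Per-frame gain H(u, w) from two-channel high-band spectra.

    Yi, Yj: complex spectra of shape (n_frames, n_bins), one per device.
    noise_cross_power: shape (n_bins,), magnitude of the estimated noise
    cross-power spectrum. lam and gain_floor are illustrative values.
    """
    n_frames, n_bins = Yi.shape
    P_ii = np.zeros(n_bins)
    P_jj = np.zeros(n_bins)
    P_ij = np.zeros(n_bins, dtype=complex)
    H = np.empty((n_frames, n_bins))
    for u in range(n_frames):
        keep = 0.0 if u == 0 else lam  # the first frame initialises the averages
        P_ii = keep * P_ii + (1.0 - keep) * np.abs(Yi[u]) ** 2
        P_jj = keep * P_jj + (1.0 - keep) * np.abs(Yj[u]) ** 2
        P_ij = keep * P_ij + (1.0 - keep) * Yi[u] * np.conj(Yj[u])
        # Remove the noise part of the cross-power, never going below zero.
        speech_cross = np.maximum(np.abs(P_ij) - noise_cross_power, 0.0)
        H[u] = np.clip(speech_cross / (np.sqrt(P_ii * P_jj) + 1e-12), gain_floor, 1.0)
    return H

rng = np.random.default_rng(0)
Y = rng.standard_normal((5, 4)) + 1j * rng.standard_normal((5, 4))
H_clean = cross_power_gain(Y, Y, np.zeros(4))       # identical channels, no noise
H_noisy = cross_power_gain(Y, Y, np.full(4, 1e6))   # noise estimate dominates
S_mag = np.abs(Y) * H_clean                         # |S(u, w)| = |YHF(u, w)| H(u, w)
```

With identical channels and no noise the gain stays at one, and when the noise estimate dominates it falls to the floor, which is the qualitative behaviour expected of a gain driven by inter-channel correlation.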

In one embodiment, the step of obtaining the magnitude spectrum and the phase spectrum of the low-band signal comprises:

preprocessing a low-frequency band signal to obtain a stable low-frequency band signal;

and processing the stable low-frequency band signal by adopting Fourier transform to obtain the amplitude spectrum and the phase spectrum of the low-frequency band signal.

In one embodiment, the step of preprocessing the low-frequency band signal to obtain a stable low-frequency band signal comprises:

and sequentially performing framing processing and windowing processing on the low-frequency band signal to obtain a stable low-frequency band signal.
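
The framing, windowing and Fourier transform steps for a band signal can be sketched in Python as follows. The frame length, hop size and Hann window are assumptions chosen for illustration, as are the function names; the text does not specify them.

```python
import numpy as np

def frame_and_window(x, frame_len=256, hop=128):
    """Split x into overlapping frames and apply a Hann window to each one."""
    x = np.asarray(x, dtype=float)
    n_frames = 1 + (len(x) - frame_len) // hop
    idx = np.arange(frame_len)[None, :] + hop * np.arange(n_frames)[:, None]
    return x[idx] * np.hanning(frame_len)

def amplitude_and_phase(frames):
    """Fourier-transform each frame and return amplitude and phase spectra."""
    spectrum = np.fft.rfft(frames, axis=1)
    return np.abs(spectrum), np.angle(spectrum)

fs = 8000                                   # assumed sampling rate in Hz
t = np.arange(fs) / fs
band_signal = np.sin(2 * np.pi * 440 * t)   # toy stand-in for a band signal
frames = frame_and_window(band_signal)
amplitude, phase = amplitude_and_phase(frames)
```

Each row of `amplitude` and `phase` corresponds to one short frame, within which the signal can be treated as approximately stable.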

In one embodiment, the step of obtaining the magnitude spectrum and the phase spectrum of the high-band signal comprises:

preprocessing the high-frequency band signal to obtain a stable high-frequency band signal;

and carrying out Fourier transform processing on the stable high-frequency band signal to obtain the amplitude spectrum and the phase spectrum of the high-frequency band signal.

In one embodiment, the step of preprocessing the high-frequency band signal to obtain a stable high-frequency band signal comprises:

and sequentially performing framing processing and windowing processing on the high-frequency band signal to obtain a stable high-frequency band signal.

In one embodiment, the step of acquiring the modulation domain signal corresponding to the amplitude spectrum includes:

and processing the amplitude spectrum by adopting Fourier transform to obtain a modulation domain signal.
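
A minimal Python sketch of this step is given below: for each frequency bin w, the time trajectory of the amplitude spectrum is cut into overlapping modulation frames u, windowed, and Fourier-transformed along time to give a spectrum over the modulation variable k. The modulation frame length, hop and Hamming window are assumptions, and the function name is hypothetical.

```python
import numpy as np

def modulation_transform(amplitude, mod_len=32, mod_hop=8):
    """Map an amplitude spectrogram (n_frames, n_bins) to Y(u, w, k).

    Returns a complex array of shape (n_mod_frames, n_bins, mod_len).
    """
    n_frames = amplitude.shape[0]
    n_mod = 1 + (n_frames - mod_len) // mod_hop
    idx = np.arange(mod_len)[None, :] + mod_hop * np.arange(n_mod)[:, None]
    # Reorder to (u, w, frame-within-segment) before transforming along time.
    segments = np.transpose(amplitude[idx], (0, 2, 1))
    return np.fft.fft(segments * np.hamming(mod_len), axis=2)

frame_index = np.arange(64)
# Toy amplitude trajectories: a slow fluctuation around 1 in each of 3 bins.
amplitude = 1.0 + 0.5 * np.cos(2 * np.pi * 4 * frame_index / 32)[:, None] * np.ones((1, 3))
Y_mod = modulation_transform(amplitude)
mod_amplitude = np.abs(Y_mod)     # modulation domain amplitude spectrum
mod_power = mod_amplitude ** 2    # modulation domain power spectrum
mod_phase = np.angle(Y_mod)       # modulation domain phase spectrum
```

The three derived arrays correspond to the modulation domain amplitude spectrum, power spectrum and phase spectrum that make up the modulation domain signal.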

In one embodiment, the step of performing frequency division processing on the noisy speech signal to obtain a low-frequency band signal includes:

carrying out non-convex low-pass filtering processing on the noisy speech signal to obtain a low-frequency band signal;

the step of performing frequency division processing on the noisy speech signal to obtain a high-frequency band signal comprises the following steps:

and carrying out non-convex high-pass filtering processing on the noisy speech signal to obtain a high-frequency band signal.
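
The frequency division into a low band and a high band can be illustrated with the Python sketch below. The text does not define the "non-convex" filters, so this sketch substitutes an ordinary windowed-sinc FIR low-pass filter and takes the high band as the complement of the low band; the cutoff frequency, filter length, sampling rate and function name are all assumptions.

```python
import numpy as np

def split_bands(x, fs, cutoff_hz=1000.0, n_taps=101):
    """Return (low, high) bands of x using a zero-delay FIR low-pass filter.

    n_taps must be odd so that the symmetric filter is centred on a sample.
    """
    x = np.asarray(x, dtype=float)
    n = np.arange(n_taps) - (n_taps - 1) / 2
    fc = cutoff_hz / fs                        # normalised cutoff (cycles/sample)
    h = 2.0 * fc * np.sinc(2.0 * fc * n)       # ideal low-pass impulse response
    h *= np.hamming(n_taps)                    # taper to reduce ripple
    h /= h.sum()                               # unit gain at 0 Hz
    low = np.convolve(x, h, mode="same")       # centred, so no time shift
    high = x - low                             # complementary high band
    return low, high

fs = 8000
t = np.arange(fs) / fs
tone_low = np.sin(2 * np.pi * 200 * t)
tone_high = np.sin(2 * np.pi * 3000 * t)
x = tone_low + tone_high
low, high = split_bands(x, fs)
```

Defining the high band as the remainder guarantees that the two bands add back to the input exactly, which keeps the later synthesis of the two noise-reduced bands simple.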

An embodiment of the present invention further provides a speech noise reduction processing apparatus, including:

the voice signal acquisition module is used for acquiring a noise-containing voice signal acquired by the voice acquisition equipment when the distance between the signal acquisition devices reaches a preset value;

the low-frequency filtering module is used for carrying out frequency division processing on the voice signal containing noise to obtain a low-frequency band signal;

the low-frequency band signal frequency spectrum acquisition module is used for acquiring an amplitude spectrum and a phase spectrum of the low-frequency band signal;

the modulation domain signal acquisition module is used for acquiring a modulation domain signal corresponding to the amplitude spectrum; the modulation domain signal comprises a modulation domain amplitude spectrum, a modulation domain power spectrum and a modulation domain phase spectrum;

the modulation domain amplitude spectrum processing module is used for processing the modulation domain amplitude spectrum or the power spectrum by adopting a spectral subtraction method to obtain a modulation domain amplitude spectrum after noise reduction; the smooth over-subtraction factor in the spectral subtraction is obtained according to the posterior signal-to-noise ratio of the modulation domain and the smooth factor;

the compensation module is used for compensating the modulation domain phase spectrum to obtain a compensated modulation domain phase spectrum;

the low-frequency band signal denoising module is used for obtaining a low-frequency band signal subjected to denoising based on the compensated modulation domain phase spectrum, the denoised modulation domain amplitude spectrum and the phase spectrum of the low-frequency band signal; the noise-reduced low-band signal is used to synthesize a noise-reduced speech signal.

In one embodiment, the speech noise reduction processing apparatus further includes:

the high-frequency filtering module is used for carrying out frequency division processing on the voice signal containing noise to obtain a high-frequency band signal;

the high-frequency band signal frequency spectrum acquisition module is used for acquiring an amplitude spectrum and a phase spectrum of the high-frequency band signal;

the high-frequency band signal noise reduction module is used for processing the amplitude spectrum of the high-frequency band signal by adopting a transfer gain function to obtain the amplitude spectrum of the noise-reduced high-frequency band signal; obtaining a noise-reduced high-frequency band signal based on the amplitude spectrum of the noise-reduced high-frequency band signal and the phase spectrum of the high-frequency band signal; the transfer gain function is obtained according to the power spectrum and the cross-power spectrum of the high-frequency band signal and the estimated noise cross-power spectrum;

and the synthesis module is used for synthesizing the noise-reduced high-frequency band signal and the noise-reduced low-frequency band signal to obtain the noise-reduced voice signal.

The embodiment of the invention also provides computer equipment which comprises a memory and a processor, wherein the memory stores computer programs, and the processor realizes the steps of the method when executing the computer programs.

Embodiments of the present invention further provide a computer-readable storage medium, on which a computer program is stored, where the computer program is executed by a processor to implement the steps of the above-mentioned method.

One of the above technical solutions has the following advantages and beneficial effects:

According to the voice noise reduction processing method, when the distance between the voice acquisition devices reaches a preset value, the acquired voice signal contains strongly correlated noise in the low frequency band. Therefore, when that distance reaches the preset value, the noisy speech signal collected by the voice acquisition device is acquired and subjected to frequency division processing to obtain the low-frequency band signal. A noise-reduced voice signal is then obtained by acquiring the modulation domain signal corresponding to the amplitude spectrum of the low-frequency band signal and processing that modulation domain signal. Specifically, the modulation domain amplitude spectrum or the modulation domain power spectrum is processed by a spectral subtraction method to obtain a noise-reduced modulation domain amplitude spectrum, and the smooth over-subtraction factor used in the spectral subtraction is obtained according to the posterior signal-to-noise ratio of the modulation domain and the smoothing factor. In this way, the over-subtraction factor is neither so large that too much speech information is removed nor so small that noise suppression is insufficient, and it does not change abruptly between adjacent frames, which effectively improves the quality of the voice signal. Furthermore, the modulation domain phase spectrum is compensated to obtain a compensated modulation domain phase spectrum, which further suppresses the background noise of the voice signal and improves its quality. Therefore, the noise-reduced low-frequency band signal obtained from the compensated modulation domain phase spectrum, the noise-reduced modulation domain amplitude spectrum and the phase spectrum of the low-frequency band signal contains little noise, and a higher-quality voice signal is obtained.

Drawings

Other features, objects and advantages of the present application will become more apparent upon reading of the following detailed description of non-limiting embodiments thereof, made with reference to the accompanying drawings in which:

FIG. 1 is a first schematic flow chart diagram illustrating a speech noise reduction processing method according to an embodiment;

FIG. 2 is a second schematic flow chart diagram illustrating a speech noise reduction processing method according to an embodiment;

FIG. 3 is a flowchart illustrating steps of compensating a modulation domain phase spectrum to obtain a compensated modulation domain phase spectrum according to an embodiment;

FIG. 4 is a flowchart illustrating steps of obtaining a noise-reduced low-frequency band signal based on the compensated modulation domain phase spectrum, the noise-reduced modulation domain amplitude spectrum, and the phase spectrum of the low-frequency band signal in one embodiment;

FIG. 5 is a flow diagram illustrating the steps of obtaining a magnitude spectrum and a phase spectrum of a low band signal in one embodiment;

FIG. 6 is a schematic flow chart diagram illustrating the steps for obtaining the magnitude and phase spectra of a high-band signal in one embodiment;

FIG. 7 is a block diagram showing a first schematic configuration of a speech noise reduction processing apparatus according to an embodiment;

FIG. 8 is a block diagram showing a second schematic configuration of a speech noise reduction processing apparatus according to an embodiment;

FIG. 9 is a diagram illustrating an internal structure of a computer device according to an embodiment.

Detailed Description

In order to make the objects, technical solutions and advantages of the present application more apparent, the present application is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the present application and are not intended to limit the present application.

In one embodiment, as shown in fig. 1, there is provided a speech noise reduction processing method, including the steps of:

S110, when the distance between the signal acquisition devices reaches a preset value, acquiring a noisy speech signal collected by the voice acquisition device, and performing frequency division processing on the noisy speech signal to obtain a low-frequency band signal;

The voice acquisition device can be any device in the art for acquiring voice. In one specific example, the voice acquisition device is a microphone. The preset value may be 16 cm.

Specifically, the noisy speech signal collected by the voice acquisition device can be obtained by any method in the art. In one specific example, obtaining the noisy speech signal includes retrieving it from the voice acquisition device. The low-frequency band signal can be obtained by performing frequency division processing on the noisy speech signal in any manner in the art. Optionally, the noisy speech signal is low-pass filtered to obtain the low-frequency band signal. In a specific example, bumpless low-pass filtering is performed on the noisy speech signal at the frequency division point to obtain the low-frequency band signal, so that no bump is produced at the frequency division point in the subsequent speech synthesis process.

S120, obtaining a magnitude spectrum and a phase spectrum of the low-frequency band signal;

Specifically, Fourier transform processing is performed on the low-frequency band signal to obtain the amplitude spectrum and the phase spectrum of the low-frequency band signal. It should be noted that the amplitude spectrum and the phase spectrum of the low-frequency band signal may also be obtained by any other means in the art.

S130, acquiring a modulation domain signal corresponding to the amplitude spectrum; the modulation domain signal comprises a modulation domain amplitude spectrum, a modulation domain power spectrum and a modulation domain phase spectrum;

Specifically, the modulation domain signal corresponding to the amplitude spectrum may be obtained by any technical means in the art. In a specific example, the step of acquiring the modulation domain signal corresponding to the amplitude spectrum may include processing the amplitude spectrum by Fourier transform to obtain the modulation domain signal. Further, the modulation domain signal can be obtained by performing a short-time Fourier transform, frame by frame, on the time trajectory of the amplitude spectrum at each frequency point. The modulation domain signal comprises a modulation domain amplitude spectrum, a modulation domain power spectrum and a modulation domain phase spectrum.
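The frame-by-frame short-time Fourier transform of the amplitude spectrum described above can be sketched as follows. This is a minimal illustration only, not the implementation of the present application: the function name `modulation_transform`, the Hann window, and the parameters `mod_len` and `mod_hop` are all assumptions chosen for the example. The output is indexed as (u, w, k), matching the modulation frame variable, the discrete frequency variable and the modulation domain variable.

```python
import numpy as np

def modulation_transform(mag, mod_len=32, mod_hop=8):
    """Second-stage STFT over time, applied to each frequency bin.

    mag: acoustic amplitude spectrum, shape (frames, bins).
    Returns a complex array indexed (u, w, k).
    mod_len and mod_hop are illustrative values, not taken from the source.
    """
    mag = np.asarray(mag, dtype=float)
    window = np.hanning(mod_len)[:, None]           # assumed window choice
    n_mod = 1 + (mag.shape[0] - mod_len) // mod_hop
    # Cut the time trajectory of every bin into overlapping modulation frames.
    segments = np.stack([
        mag[u * mod_hop: u * mod_hop + mod_len] * window
        for u in range(n_mod)
    ])                                              # (u, n, w)
    spec = np.fft.fft(segments, axis=1)             # FFT along time -> (u, k, w)
    return np.transpose(spec, (0, 2, 1))            # reorder to (u, w, k)

# The three modulation domain quantities named in S130:
# Y = modulation_transform(mag)
# mod_amplitude, mod_power, mod_phase = np.abs(Y), np.abs(Y) ** 2, np.angle(Y)
```

A full (two-sided) FFT is used here so that the conjugate symmetry of the modulation domain signal, which follows from the amplitude spectrum being real, is visible along the k axis.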

S140, processing the modulation domain amplitude spectrum or the modulation domain power spectrum by adopting a spectral subtraction method to obtain a modulation domain amplitude spectrum after noise reduction; the smooth over-subtraction factor in the spectral subtraction is obtained according to the posterior signal-to-noise ratio and the smooth factor of the modulation domain;

the modulation domain amplitude spectrum or the modulation domain power spectrum is processed by adopting a spectral subtraction method, so that the modulation domain amplitude spectrum after noise reduction can be obtained.

Specifically, the modulation domain amplitude spectrum is processed by adopting a spectral subtraction method to obtain a modulation domain amplitude spectrum after noise reduction. Or, the modulation domain power spectrum is processed by a spectral subtraction method, so that a modulation domain power spectrum after noise reduction can be obtained, the modulation domain power spectrum after noise reduction can be converted into a modulation domain amplitude spectrum after noise reduction, and the conversion method is not described herein again.

In one embodiment, in the step of processing the modulation domain amplitude spectrum or the power spectrum by using a spectral subtraction method to obtain the modulation domain amplitude spectrum after noise reduction, the modulation domain amplitude spectrum after noise reduction is obtained based on the following formula:

Figure BDA0002241015980000101

Figure BDA0002241015980000102

wherein u is a modulation frame variable; w is a discrete frequency variable; k is a modulation domain variable; |S(u, w, k)| is the modulation domain amplitude spectrum after noise reduction; p is the type of spectral subtraction, with modulation domain amplitude spectrum subtraction used when p = 1 and modulation domain power spectrum subtraction used when p = 2; |V(u, w, k)|^p is the estimated noise modulation domain amplitude spectrum when p = 1 and the estimated noise modulation domain power spectrum when p = 2; α(k) is the smooth over-subtraction factor of the modulation domain; |Y_LF(u, w, k)|^p is the modulation domain amplitude spectrum or power spectrum; SNR_post(u, w, k) is the posterior signal-to-noise ratio of the modulation domain; θ is the smoothing factor; and α_0 is a constant.
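Because the formula images are not reproduced in this text, the following is only a hedged sketch of one common way to realise spectral subtraction with a smoothed, SNR-dependent over-subtraction factor. The linear mapping from posterior SNR to the instantaneous factor, the recursive smoothing across modulation frames, the spectral floor `floor`, and the default values of `alpha0` and `theta` are all assumptions for illustration and are not taken from the present application.

```python
import numpy as np

def spectral_subtract(Y_mag, V_mag, p=2, alpha0=4.0, theta=0.8, floor=0.01):
    """Modulation domain spectral subtraction (illustrative sketch).

    Y_mag: noisy modulation domain amplitude spectrum, shape (U, W, K).
    V_mag: estimated noise amplitude spectrum, broadcastable to (W, K).
    p = 1 subtracts amplitude spectra, p = 2 subtracts power spectra.
    """
    Yp = np.asarray(Y_mag, dtype=float) ** p
    Vp = np.asarray(V_mag, dtype=float) ** p
    # Posterior SNR in dB, limited to the range where the mapping is linear.
    snr_db = 10.0 * np.log10(np.maximum(Yp, 1e-12) / np.maximum(Vp, 1e-12))
    snr_db = np.clip(snr_db, -5.0, 20.0)
    # Assumed mapping: larger factor at low SNR, smaller factor at high SNR.
    alpha_inst = alpha0 - 0.15 * snr_db
    # Recursive smoothing over modulation frames avoids abrupt frame-to-frame jumps.
    alpha = np.empty_like(alpha_inst)
    alpha[0] = alpha_inst[0]
    for u in range(1, alpha.shape[0]):
        alpha[u] = theta * alpha[u - 1] + (1.0 - theta) * alpha_inst[u]
    # Subtract the scaled noise estimate and keep a small spectral floor.
    Sp = np.maximum(Yp - alpha * Vp, floor * Vp)
    return Sp ** (1.0 / p)
```

With p = 2 the function returns the noise-reduced amplitude spectrum directly, since the final step takes the p-th root of the subtracted power spectrum.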

S150, compensating the modulation domain phase spectrum to obtain a compensated modulation domain phase spectrum;

In voice noise reduction processing, processing the modulation domain phase spectrum further suppresses background noise and thereby improves voice quality. Because the amplitude spectrum is a real signal, the resulting modulation domain signal is conjugate symmetric. On this basis, the angular relationship between the conjugate terms can be changed by an anti-symmetric function, so that the modulation domain phase spectrum is compensated and the compensated modulation domain phase spectrum is obtained.

S160, obtaining a low-frequency band signal subjected to noise reduction based on the compensated modulation domain phase spectrum, the modulation domain amplitude spectrum subjected to noise reduction and the phase spectrum of the low-frequency band signal; the noise-reduced low-frequency band signal is used for synthesizing a noise-reduced speech signal.

Specifically, the low-frequency band signal after noise reduction can be obtained according to the compensated modulation domain phase spectrum, the modulation domain amplitude spectrum after noise reduction, and the phase spectrum of the low-frequency band signal. The noise-reduced low-band signal may be synthesized into a noise-reduced speech signal. Optionally, a high-frequency band signal of the speech signal may be obtained, and the high-frequency band signal and the noise-reduced low-frequency band signal are synthesized to obtain a noise-reduced speech signal; or acquiring a high-frequency band signal of the voice signal, then performing noise reduction processing on the high-frequency band signal, and synthesizing the noise-reduced high-frequency band signal and the noise-reduced low-frequency band signal to obtain the noise-reduced voice signal. The noise reduction processing method performed on the high-frequency band signal may be any method in the art, and is not specifically limited herein. Signal synthesis may also be performed in the manner used in the art.

According to the voice noise reduction processing method, when the distance between the voice acquisition devices reaches a preset value, the acquired voice signal contains strongly correlated noise in the low frequency band. Therefore, when that distance reaches the preset value, the noisy speech signal collected by the voice acquisition device is acquired and subjected to frequency division processing to obtain the low-frequency band signal. A noise-reduced voice signal is then obtained by acquiring the modulation domain signal corresponding to the amplitude spectrum of the low-frequency band signal and processing that modulation domain signal. Specifically, the modulation domain amplitude spectrum or the modulation domain power spectrum is processed by a spectral subtraction method to obtain a noise-reduced modulation domain amplitude spectrum, and the smooth over-subtraction factor used in the spectral subtraction is obtained according to the posterior signal-to-noise ratio of the modulation domain and the smoothing factor. In this way, the over-subtraction factor is neither so large that too much speech information is removed nor so small that noise suppression is insufficient, and it does not change abruptly between adjacent frames, which effectively improves the quality of the voice signal. Furthermore, the modulation domain phase spectrum is compensated to obtain a compensated modulation domain phase spectrum, which further suppresses the background noise of the voice signal and improves its quality. Therefore, the noise-reduced low-frequency band signal obtained from the compensated modulation domain phase spectrum, the noise-reduced modulation domain amplitude spectrum and the phase spectrum of the low-frequency band signal contains little noise, and a higher-quality voice signal is obtained.

In one embodiment, as shown in fig. 2, further comprising the steps of:

S210, performing frequency division processing on the noisy speech signal to obtain a high-frequency band signal;

Specifically, the noisy speech signal collected by the voice acquisition device can be obtained by any method in the art. In one specific example, obtaining the noisy speech signal includes retrieving it from the voice acquisition device. The high-frequency band signal can be obtained by performing frequency division processing on the noisy speech signal in any manner in the art. In a specific example, bumpless high-pass filtering is performed on the noisy speech signal at the frequency division point to obtain the high-frequency band signal, so that no bump is produced at the frequency division point in the subsequent speech synthesis process. It should be noted that the step of performing frequency division processing on the noisy speech signal to obtain the high-frequency band signal may be performed together with step S110; that is, the high-frequency band signal and the low-frequency band signal may both be obtained by a single frequency division operation.
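A single frequency division that yields both bands can be sketched as follows. The source does not specify the filter design, so this example assumes a simple frequency-domain low-pass with a raised-cosine transition and derives the high band as the exact complement; the function name `split_bands` and the parameters `crossover_hz` and `transition_hz` are hypothetical. Because the two bands sum back to the input, no bump or dip appears at the frequency division point when they are recombined.

```python
import numpy as np

def split_bands(x, fs, crossover_hz, transition_hz=200.0):
    """Split x into complementary low and high bands (illustrative sketch).

    fs: sampling rate in Hz. crossover_hz: frequency division point.
    transition_hz: assumed width of the raised-cosine transition band.
    """
    x = np.asarray(x, dtype=float)
    freqs = np.fft.rfftfreq(len(x), d=1.0 / fs)
    # Low-pass weight: 1 below the transition band, 0 above it,
    # and a raised cosine inside it.
    t = np.clip((freqs - (crossover_hz - transition_hz / 2.0)) / transition_hz,
                0.0, 1.0)
    lowpass = 0.5 * (1.0 + np.cos(np.pi * t))
    low = np.fft.irfft(np.fft.rfft(x) * lowpass, n=len(x))
    high = x - low   # complementary by construction, so low + high == x
    return low, high
```

A real-time system would more likely use a causal crossover filter pair; the whole-signal FFT here is chosen only to keep the complementary property easy to verify.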

S220, obtaining an amplitude spectrum and a phase spectrum of the high-frequency band signal;

Specifically, Fourier transform processing is performed on the high-frequency band signal to obtain the amplitude spectrum and the phase spectrum of the high-frequency band signal. It should be noted that the amplitude spectrum and the phase spectrum of the high-frequency band signal may also be obtained by any other means in the art.

S230, processing the amplitude spectrum of the high-frequency band signal by adopting a transfer gain function to obtain the amplitude spectrum of the high-frequency band signal subjected to noise reduction; obtaining a noise-reduced high-frequency band signal based on the amplitude spectrum and the phase spectrum of the noise-reduced high-frequency band signal; the transfer gain function is obtained by estimating according to the power spectrum and the cross-power spectrum of the high-frequency band signal and the estimated noise cross-power spectrum;

Specifically, the amplitude spectrum of the high-frequency band signal is processed by the transfer gain function to obtain the amplitude spectrum of the noise-reduced high-frequency band signal. It should be noted that when the distance between the two microphones reaches the preset value, the acquired voice signal contains weakly correlated noise in the high frequency band. Processing the amplitude spectrum of the high-frequency band signal with the transfer gain function effectively suppresses this weakly correlated noise.

In one embodiment, in the step of processing the amplitude spectrum of the high-frequency band signal by using the transfer gain function to obtain the amplitude spectrum of the noise-reduced high-frequency band signal, the amplitude spectrum of the noise-reduced high-frequency band signal is obtained based on the following formula:

|S(u, w)| = |Y_HF(u, w)| H(u, w);

Figure BDA0002241015980000132

Figure BDA0002241015980000133

Figure BDA0002241015980000134

wherein |S(u, w)| is the amplitude spectrum of the high-frequency band signal after noise reduction; H(u, w) is the transfer gain function; |Y_HF(u, w)| is the high-frequency band amplitude spectrum of the noisy speech;

Figure BDA0002241015980000135

is the cross-power spectrum of the high-frequency band of the noisy speech;

Figure BDA0002241015980000136

is the estimated high-frequency band noise cross-power spectrum; i and j denote the high-frequency band signals collected by the two voice acquisition devices, respectively;

Figure BDA0002241015980000137

is the modified posterior signal-to-noise ratio;

Figure BDA0002241015980000138

is a gain function; g, b and h are constants; |Y_i(u, w) Y_j(u, w)| is the cross-amplitude spectrum of the high-frequency band signals.

In a specific example, the step of obtaining the noise-reduced high-frequency band signal based on the amplitude spectrum of the noise-reduced high-frequency band signal and the phase spectrum of the high-frequency band signal includes: sequentially performing inverse Fourier transform processing and overlap-add processing on the phase spectrum of the high-frequency band signal and the amplitude spectrum of the noise-reduced high-frequency band signal to obtain the noise-reduced high-frequency band signal. The noise-reduced high-frequency band signal may also be obtained by any other means in the art.

S240, synthesizing the noise-reduced high-frequency band signal and the noise-reduced low-frequency band signal to obtain a noise-reduced voice signal.

The noise-reduced high-frequency band signal and the noise-reduced low-frequency band signal are synthesized to obtain a noise-reduced voice signal. The synthesis treatment means may be any one of those in the art, and is not particularly limited herein.

According to the voice noise reduction processing method, when the distance between the voice acquisition devices reaches the preset value, weak correlated noise exists in the acquired voice signals in the high frequency band, therefore, when the distance between the voice acquisition devices reaches the preset value, the noise-containing voice signals acquired by the acquisition devices are acquired, frequency division processing is carried out on the noise-containing voice signals, and the high frequency band signals are acquired. And obtaining the amplitude spectrum of the high-frequency band signal after noise reduction by adopting transfer gain function processing to the amplitude spectrum of the high-frequency band signal. Based on the amplitude spectrum and the phase spectrum of the high-frequency band signal after noise reduction, the obtained high-frequency band signal after noise reduction has less noise. And aiming at the noise characteristics of the high-frequency band signal and the low-frequency band signal, different noise reduction processing methods are carried out, so that the voice signal with less noise is obtained.

In one embodiment, as shown in fig. 3, the step of compensating the modulation domain phase spectrum to obtain a compensated modulation domain phase spectrum includes:

S310, adopting an anti-symmetric function and the estimated noise modulation domain amplitude spectrum to perform conjugate angle adjustment on the modulation domain phase spectrum to obtain phase compensation;

or, adopting the anti-symmetric function and the estimated noise modulation domain power spectrum to perform conjugate angle adjustment on the modulation domain phase spectrum to obtain phase compensation;

In voice noise reduction processing, processing the modulation domain phase spectrum further suppresses background noise and thereby improves voice quality. Because the amplitude spectrum is a real signal, the resulting modulation domain signal is conjugate symmetric. On this basis, the angular relationship between the conjugate terms of the modulation domain phase spectrum can be changed by means of the anti-symmetric function and the estimated noise modulation domain amplitude spectrum, so that the modulation domain phase spectrum is compensated and the compensated modulation domain phase spectrum is obtained.

It should be noted that the estimated noise modulation domain amplitude spectrum or the estimated noise modulation domain power spectrum may be obtained according to the modulation domain amplitude spectrum of the noisy speech signal.

In one embodiment, in the step of adopting an anti-symmetric function and the estimated noise modulation domain amplitude spectrum to perform conjugate angle adjustment on the modulation domain phase spectrum to obtain phase compensation, the phase compensation is obtained based on the following formula:

wherein u is a modulation frame variable; w is a discrete frequency variable; k is a modulation domain variable; Λ (u, w, k) is phase compensation;

Figure BDA0002241015980000152

is an anti-symmetric function; ξ is a constant; |V(u, w, k)|^1 is the estimated noise modulation domain amplitude spectrum;

or, in the step of adopting the anti-symmetric function and the estimated noise modulation domain power spectrum to perform conjugate angle adjustment on the modulation domain phase spectrum to obtain phase compensation, the phase compensation is obtained based on the following formula:

Figure BDA0002241015980000153

wherein u is a modulation frame variable; w is a discrete frequency variable; k is a modulation domain variable; Λ(u, w, k) is the phase compensation; the anti-symmetric function and the constant ξ are as defined for the preceding formula; |V(u, w, k)|^2 is the estimated noise modulation domain power spectrum;

S320, obtaining a compensated modulation domain phase spectrum according to the phase compensation and the modulation domain phase spectrum.

Specifically, after the phase compensation is obtained, the modulation domain phase spectrum may be processed based on the phase compensation to obtain a compensated modulation domain phase spectrum.

In a specific example, in the step of obtaining a compensated modulation domain phase spectrum according to the phase compensation and the modulation domain phase spectrum, the compensated phase spectrum is obtained based on the following formula:

angle[S(u, w, k)] = angle[Y_LF(u, w, k)] + Λ(u, w, k);

According to the voice noise reduction processing method, the phase compensation is obtained and the modulation domain phase spectrum is processed based on it to obtain the compensated modulation domain phase spectrum, thereby suppressing the strongly correlated noise of the low-frequency band signal.
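Steps S310 and S320 can be sketched as follows. The formula images for the phase compensation are not reproduced in this text, so the sketch assumes the form Λ(u, w, k) = ξ · Ψ(k) · |V(u, w, k)|^p, where Ψ(k) is taken to be +1 on the first half of the modulation bins, -1 on the conjugate half and 0 at the self-conjugate bins. That form, the function names `antisymmetric` and `compensate_phase`, and the default value of `xi` are assumptions for illustration; only the final addition of Λ to the phase follows the formula given in the text.

```python
import numpy as np

def antisymmetric(K):
    """Assumed anti-symmetric function over K modulation bins:
    +1 for 0 < k < K/2, -1 for K/2 < k < K, and 0 at k = 0 (and k = K/2)."""
    psi = np.zeros(K)
    psi[1:(K + 1) // 2] = 1.0
    psi[K // 2 + 1:] = -1.0
    return psi

def compensate_phase(Y_mod, V_mag, xi=0.1, p=1):
    """Return the compensated modulation domain phase spectrum.

    Y_mod: complex modulation domain signal, shape (U, W, K), from a full FFT.
    V_mag: estimated noise modulation amplitude spectrum, broadcastable to (W, K).
    p = 1 uses the noise amplitude spectrum, p = 2 the noise power spectrum.
    """
    K = Y_mod.shape[-1]
    lam = xi * antisymmetric(K) * np.asarray(V_mag, dtype=float) ** p
    # angle[S(u, w, k)] = angle[Y_LF(u, w, k)] + Lambda(u, w, k)
    return np.angle(Y_mod) + lam
```

Because Ψ has opposite signs on conjugate bins, the offset pushes each conjugate pair in opposite angular directions, which is how the angular relationship between the conjugate terms is changed.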

In one embodiment, as shown in fig. 4, the step of obtaining the noise-reduced low-frequency band signal based on the compensated modulation domain phase spectrum, the noise-reduced modulation domain amplitude spectrum, and the phase spectrum of the low-frequency band signal includes:

S410, sequentially performing inverse Fourier transform processing and overlap-add processing on the noise-reduced modulation domain amplitude spectrum and the compensated modulation domain phase spectrum to obtain a noise-reduced amplitude spectrum;

The inverse Fourier transform is a common signal processing operation and is not described in detail here.

Specifically, inverse Fourier transform processing is performed on the noise-reduced modulation domain amplitude spectrum and the compensated modulation domain phase spectrum, and overlap-add processing is then performed on the result to obtain the noise-reduced amplitude spectrum.

S420, performing inverse Fourier transform processing and overlap-add processing on the noise-reduced amplitude spectrum and the phase spectrum of the low-frequency band signal to obtain the noise-reduced low-frequency band signal.

Specifically, the amplitude spectrum and the low-frequency band signal phase spectrum after noise reduction are subjected to inverse fourier transform processing, and then the obtained results are subjected to overlap-add processing to obtain a low-frequency band signal after noise reduction.
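The two inverse stages of S410 and S420 can be sketched as follows. This is an illustration under stated assumptions rather than the implementation of the present application: Hann analysis windows are assumed in both forward transforms, the overlap-add is normalised by the summed window, and the reconstructed amplitude spectrum is clipped at zero. The function names `overlap_add`, `inverse_modulation` and `inverse_acoustic` are hypothetical.

```python
import numpy as np

def overlap_add(frames, hop, window):
    """Overlap-add frames of shape (n_frames, frame_len, ...) along axis 0,
    normalising by the summed analysis window (assumed applied once)."""
    n_frames, frame_len = frames.shape[:2]
    out = np.zeros((hop * (n_frames - 1) + frame_len,) + frames.shape[2:])
    norm = np.zeros(out.shape[0])
    for i in range(n_frames):
        out[i * hop: i * hop + frame_len] += frames[i]
        norm[i * hop: i * hop + frame_len] += window
    norm = np.maximum(norm, 1e-8)   # guard samples where the window sum is zero
    return out / norm.reshape((-1,) + (1,) * (out.ndim - 1))

def inverse_modulation(S_mag, S_phase, mod_hop):
    """S410: modulation domain (u, w, k) -> noise-reduced amplitude spectrum."""
    spec = S_mag * np.exp(1j * S_phase)
    segments = np.fft.ifft(spec, axis=-1).real       # (u, w, n)
    segments = np.transpose(segments, (0, 2, 1))     # (u, n, w)
    window = np.hanning(segments.shape[1])
    mag = overlap_add(segments, mod_hop, window)     # (frames, bins)
    return np.maximum(mag, 0.0)                      # assumed: amplitudes kept non-negative

def inverse_acoustic(mag, phase, hop, frame_len):
    """S420: amplitude spectrum plus low-band phase spectrum -> time signal."""
    frames = np.fft.irfft(mag * np.exp(1j * phase), n=frame_len, axis=1)
    return overlap_add(frames, hop, np.hanning(frame_len))
```

Taking the real part after the inverse FFT in `inverse_modulation` discards any small imaginary residue that appears once the phase compensation has broken exact conjugate symmetry.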

In one embodiment, as shown in fig. 5, the step of acquiring the magnitude spectrum and the phase spectrum of the low-frequency band signal comprises:

S510, preprocessing the low-frequency band signal to obtain a stable low-frequency band signal;

Specifically, the low-frequency band signal is preprocessed to obtain a stable low-frequency band signal. The preprocessing may be performed by any method in the art and is not specifically limited herein.

S520, processing the stable low-frequency band signal by adopting Fourier transform to obtain an amplitude spectrum and a phase spectrum of the low-frequency band signal.

Fourier transform processing is performed on the stable low-frequency band signal to obtain its frequency-domain expression, from which the amplitude spectrum and the phase spectrum of the low-frequency band signal are obtained.

In one embodiment, the step of preprocessing the low-frequency band signal to obtain a stable low-frequency band signal comprises: sequentially performing framing processing and windowing processing on the low-frequency band signal to obtain the stable low-frequency band signal.

According to the voice noise reduction method, the low-frequency band signal is preprocessed to obtain a stable low-frequency band signal. Specifically, a voice signal is non-stationary over long time spans but approximately stationary over short ones, so the voice signal can be divided into frames. After framing, the voice signal can be windowed, that is, each frame is multiplied by a window function, which reduces spectral leakage.
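The framing, windowing and Fourier transform steps can be sketched as follows. The frame length, hop size and Hann window are illustrative assumptions, since the source leaves the preprocessing open to any method in the art; the function name `stft` is likewise chosen only for this example.

```python
import numpy as np

def stft(x, frame_len=256, hop=64):
    """Frame, window and transform a band signal (illustrative sketch).

    Returns (amplitude spectrum, phase spectrum), each of shape
    (n_frames, frame_len // 2 + 1).
    """
    x = np.asarray(x, dtype=float)
    window = np.hanning(frame_len)                   # assumed window function
    n_frames = 1 + (len(x) - frame_len) // hop
    # Framing: short overlapping segments over which speech is roughly stationary.
    frames = np.stack([x[i * hop: i * hop + frame_len] for i in range(n_frames)])
    # Windowing reduces spectral leakage before the Fourier transform.
    spectrum = np.fft.rfft(frames * window, axis=1)
    return np.abs(spectrum), np.angle(spectrum)
```

The same routine applies unchanged to the high-frequency band signal, since both bands are framed, windowed and transformed in the same way.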

In one embodiment, as shown in fig. 6, the step of acquiring the magnitude spectrum and the phase spectrum of the high-band signal comprises:

S610, preprocessing the high-frequency band signal to obtain a stable high-frequency band signal;

Specifically, the high-frequency band signal is preprocessed to obtain a stable high-frequency band signal. The preprocessing may be performed by any method in the art and is not specifically limited herein.

S620, Fourier transform processing is carried out on the stable high-frequency band signal, and the amplitude spectrum and the phase spectrum of the high-frequency band signal are obtained.

Fourier transform processing is performed on the stable high-frequency band signal to obtain its frequency-domain expression, from which the amplitude spectrum and the phase spectrum of the high-frequency band signal are obtained.

In one embodiment, the step of preprocessing the high-frequency band signal to obtain a stable high-frequency band signal comprises: sequentially performing framing processing and windowing processing on the high-frequency band signal to obtain the stable high-frequency band signal.

According to the voice noise reduction method, the high-frequency band signal is preprocessed to obtain a stable high-frequency band signal. Specifically, a voice signal is non-stationary over long time spans but approximately stationary over short ones, so the voice signal can be divided into frames. After framing, the voice signal can be windowed, that is, each frame is multiplied by a window function, which reduces spectral leakage.

In a specific embodiment, the step of performing frequency division processing on the noisy speech signal to obtain a low-frequency band signal includes:

carrying out bumpless low-pass filtering processing on the noisy speech signal to obtain a low-frequency band signal;

the step of performing frequency division processing on the noisy speech signal to obtain a high-frequency band signal comprises the following steps:

carrying out bumpless high-pass filtering processing on the noisy speech signal to obtain a high-frequency band signal.

According to the voice noise reduction processing method, the low-frequency band signal and the high-frequency band signal are obtained by performing bumpless low-pass filtering and bumpless high-pass filtering on the noisy speech signal at the frequency division point, so that no bump is produced at the frequency division point.

It should be understood that although the various steps in the flow charts of fig. 1-6 are shown in order as indicated by the arrows, the steps are not necessarily performed in order as indicated by the arrows. The steps are not performed in a strict order unless explicitly stated herein, and may be performed in other orders. Moreover, at least some of the steps in fig. 1-6 may include multiple sub-steps or multiple stages that are not necessarily performed at the same time, but may be performed at different times, and the order of performance of the sub-steps or stages is not necessarily sequential, but may be performed in turn or alternating with other steps or at least some of the sub-steps or stages of other steps.

In one embodiment, as shown in fig. 7, there is provided a speech noise reduction processing apparatus including:

the voice signal acquisition module 10 is configured to acquire a noisy voice signal acquired by the voice acquisition device when a distance between the signal acquisition devices reaches a preset value;

the low-frequency filtering module 20 is configured to perform frequency division processing on the noisy speech signal to obtain a low-frequency band signal;

a low-band signal spectrum acquisition module 30, configured to acquire a magnitude spectrum and a phase spectrum of the low-band signal;

a modulation domain signal obtaining module 40, configured to obtain a modulation domain signal corresponding to the amplitude spectrum; the modulation domain signal comprises a modulation domain amplitude spectrum, a modulation domain power spectrum and a modulation domain phase spectrum;

a modulation domain amplitude spectrum processing module 50, configured to process the modulation domain amplitude spectrum or the power spectrum by using a spectral subtraction method, so as to obtain a noise-reduced modulation domain amplitude spectrum; the smooth over-subtraction factor in the spectral subtraction is obtained according to the posterior signal-to-noise ratio and the smooth factor of the modulation domain;

the compensation module 60 is configured to compensate the modulation domain phase spectrum to obtain a compensated modulation domain phase spectrum;

a low-frequency band signal denoising module 70, configured to obtain a denoised low-frequency band signal based on the compensated modulation domain phase spectrum, the denoised modulation domain amplitude spectrum, and the phase spectrum of the low-frequency band signal; the noise-reduced low-frequency band signal is used for synthesizing a noise-reduced speech signal.

In one embodiment, as shown in fig. 8, the speech noise reduction processing apparatus further includes:

the high-frequency filtering module 80 is configured to perform frequency division processing on the noisy speech signal to obtain a high-frequency band signal;

a high-band signal spectrum acquisition module 90, configured to acquire a magnitude spectrum and a phase spectrum of the high-band signal;

the high-frequency band signal denoising module 100 is configured to process the magnitude spectrum of the high-frequency band signal by using a transfer gain function to obtain a noise-reduced magnitude spectrum of the high-frequency band signal, and to obtain a noise-reduced high-frequency band signal based on the noise-reduced magnitude spectrum and the phase spectrum of the high-frequency band signal; the transfer gain function is estimated from the power spectrum and the cross power spectrum of the high-frequency band signal and the estimated noise cross power spectrum;

and a synthesis module 110, configured to synthesize the noise-reduced high-frequency band signal and the noise-reduced low-frequency band signal, so as to obtain a noise-reduced speech signal.
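As a rough illustration of the high-frequency band path, the sketch below applies a gain to the magnitude spectrum and recombines it with the unmodified phase spectrum. This section does not state the formula of the transfer gain function, so the gain used here is an assumption: a two-channel cross-power spectral subtraction form, (|P12| - |N12|) / sqrt(P11 · P22). The two-channel setup, the function names, and the lower bound `g_min` are all hypothetical.

```python
import numpy as np

def transfer_gain(p11, p22, p12, noise_p12, g_min=0.05):
    """Assumed gain form: (|P12| - |N12|) / sqrt(P11 * P22), clipped to [g_min, 1]."""
    eps = 1e-12  # guards against division by zero
    gain = (np.abs(p12) - np.abs(noise_p12)) / (np.sqrt(p11 * p22) + eps)
    return np.clip(gain, g_min, 1.0)

def denoise_high_band_frame(spec1, spec2, noise_p12):
    """spec1, spec2: complex spectra of one frame from two channels (an assumption).
    Instantaneous spectra stand in for time-averaged power spectrum estimates."""
    p11 = np.abs(spec1) ** 2            # power spectrum, channel 1
    p22 = np.abs(spec2) ** 2            # power spectrum, channel 2
    p12 = spec1 * np.conj(spec2)        # cross power spectrum
    gain = transfer_gain(p11, p22, p12, noise_p12)
    magnitude = gain * np.abs(spec1)    # noise-reduced magnitude spectrum
    return magnitude * np.exp(1j * np.angle(spec1))  # reuse the original phase spectrum

# Toy usage with made-up spectra.
s1 = np.array([1 + 1j, 2 - 1j, -0.5 + 0.2j])
s2 = np.array([0.9 + 1.1j, 2.1 - 0.8j, -0.4 + 0.3j])
denoised = denoise_high_band_frame(s1, s2, noise_p12=np.full(3, 0.5))
```

In practice the power and cross power spectra would be smoothed over several frames; with instantaneous values, as here, the gain reduces to 1 - |N12| / (|X1||X2|).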

For the specific limitations of the speech noise reduction processing apparatus, reference may be made to the limitations of the speech noise reduction processing method above, which are not repeated here. The modules in the speech noise reduction processing apparatus may be implemented wholly or partly in software, hardware, or a combination thereof. The modules may be embedded in, or independent of, a processor of the computer device in hardware form, or may be stored in a memory of the computer device in software form, so that the processor can invoke them and perform the operations corresponding to each module.

In one embodiment, a computer device is provided. The computer device may be a terminal, and its internal structure may be as shown in fig. 9. The computer device includes a processor, a memory, a network interface, a display screen, and an input device connected by a system bus. The processor provides computing and control capabilities. The memory comprises a non-volatile storage medium and an internal memory. The non-volatile storage medium stores an operating system and a computer program. The internal memory provides an environment for running the operating system and the computer program stored in the non-volatile storage medium. The network interface is used to communicate with an external terminal through a network connection. The computer program, when executed by the processor, implements a speech noise reduction processing method. The display screen may be a liquid crystal display or an electronic ink display. The input device may be a touch layer covering the display screen; a key, trackball, or touchpad arranged on the housing of the computer device; or an external keyboard, touchpad, or mouse.

Those skilled in the art will appreciate that the structure shown in fig. 9 is merely a block diagram of part of the structure related to the solution of the present application and does not limit the computer devices to which the solution applies; a particular computer device may include more or fewer components than shown, may combine certain components, or may have a different arrangement of components.

In one embodiment, a computer device is provided, comprising a memory and a processor, the memory having a computer program stored therein, the processor implementing the following steps when executing the computer program:

when it is detected that the distance between the voice acquisition device and a target object reaches a preset value, acquiring a noisy voice signal collected by the voice acquisition device, and performing frequency division processing on the noisy voice signal to obtain a low-frequency band signal;

acquiring an amplitude spectrum and a phase spectrum of a low-frequency band signal;

acquiring a modulation domain signal corresponding to the amplitude spectrum; the modulation domain signal comprises a modulation domain amplitude spectrum, a modulation domain power spectrum and a modulation domain phase spectrum;

processing the modulation domain amplitude spectrum or the modulation domain power spectrum by adopting a spectral subtraction method to obtain a noise-reduced modulation domain amplitude spectrum; the smooth over-subtraction factor in the spectral subtraction is obtained according to the posterior signal-to-noise ratio and the smoothing factor of the modulation domain;

compensating the modulation domain phase spectrum to obtain a compensated modulation domain phase spectrum;

obtaining a low-frequency band signal subjected to noise reduction based on the compensated modulation domain phase spectrum, the modulation domain amplitude spectrum subjected to noise reduction and the phase spectrum of the low-frequency band signal; the noise-reduced low-frequency band signal is used for synthesizing a noise-reduced speech signal.
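The spectral subtraction step above can be sketched as follows. This section states only that the smooth over-subtraction factor depends on the posterior signal-to-noise ratio and a smoothing factor; it does not give the formula. The sketch therefore assumes a Berouti-style piecewise-linear mapping from posterior SNR to a raw factor, followed by first-order recursive smoothing across frames. The parameter values (`alpha0`, `smooth`, `floor`) and function names are hypothetical.

```python
import numpy as np

def raw_over_subtraction(post_snr_db, alpha0=4.0):
    """Assumed Berouti-style rule: larger factor at low SNR, 1.0 at and above 20 dB."""
    snr = np.clip(post_snr_db, -5.0, 20.0)
    return alpha0 - 3.0 * snr / 20.0

def spectral_subtract(mod_power, noise_power, smooth=0.8, floor=0.01):
    """mod_power: (frames, bins) noisy modulation domain power spectrum.
    noise_power: (bins,) estimated noise power.
    Returns the noise-reduced modulation domain amplitude spectrum, shape (frames, bins)."""
    eps = 1e-12
    alpha = np.ones(mod_power.shape[1])  # smoothed factor, carried from frame to frame
    out = np.empty_like(mod_power)
    for t, frame in enumerate(mod_power):
        post_snr_db = 10.0 * np.log10((frame + eps) / (noise_power + eps))
        # Recursive smoothing limits sudden jumps of the over-subtraction factor.
        alpha = smooth * alpha + (1.0 - smooth) * raw_over_subtraction(post_snr_db)
        cleaned = frame - alpha * noise_power
        # A spectral floor keeps the result non-negative before the square root.
        out[t] = np.sqrt(np.maximum(cleaned, floor * frame))
    return out

# Toy usage: one high-SNR frame and one frame below the noise level.
noise = np.ones(4)
noisy = np.array([[100.0] * 4, [0.5] * 4])
clean_amp = spectral_subtract(noisy, noise)
```

Smoothing the factor over time is one way to keep it from becoming abruptly large in a single low-SNR frame, which is the failure mode of removing too much speech.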

In one embodiment, the processor, when executing the computer program, further performs the steps of:

carrying out frequency division processing on the voice signal containing noise to obtain a high-frequency band signal;

obtaining an amplitude spectrum and a phase spectrum of the high-frequency band signal;

processing the amplitude spectrum of the high-frequency band signal by adopting a transfer gain function to obtain the amplitude spectrum of the high-frequency band signal subjected to noise reduction; obtaining a noise-reduced high-frequency band signal based on the amplitude spectrum and the phase spectrum of the noise-reduced high-frequency band signal; the transfer gain function is obtained according to the power spectrum and the cross power spectrum of the high-frequency band signal and the estimated noise cross power spectrum;

and synthesizing the noise-reduced high-frequency band signal and the noise-reduced low-frequency band signal to obtain a noise-reduced voice signal.
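The frequency division and synthesis steps can be illustrated with a complementary band split. The sketch below is an assumption, not the filter design of the application: it masks the FFT of the whole signal at a hypothetical crossover frequency so that the two bands sum back to the input exactly, which makes the synthesis step a plain addition.

```python
import numpy as np

def split_bands(x, fs, crossover_hz):
    """Split a real signal into complementary low and high bands with an FFT mask."""
    spectrum = np.fft.rfft(x)
    freqs = np.fft.rfftfreq(len(x), d=1.0 / fs)
    low_mask = freqs < crossover_hz
    low = np.fft.irfft(spectrum * low_mask, n=len(x))
    high = np.fft.irfft(spectrum * ~low_mask, n=len(x))
    return low, high

def synthesize(low, high):
    """Recombine the (noise-reduced) bands into one signal."""
    return low + high

# Toy usage: a 100 Hz tone plus a 3 kHz tone, split at an assumed 1 kHz crossover.
fs = 8000
t = np.arange(fs) / fs
tone_low = np.sin(2 * np.pi * 100 * t)
tone_high = 0.3 * np.sin(2 * np.pi * 3000 * t)
low, high = split_bands(tone_low + tone_high, fs, crossover_hz=1000.0)
restored = synthesize(low, high)
```

A real-time system would more likely use a filter bank than a whole-signal FFT; the mask is used here only because it makes the perfect-reconstruction property easy to see.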

In one embodiment, a computer-readable storage medium is provided, having stored thereon a computer program which, when executed by a processor, performs the steps of:

when it is detected that the distance between the voice acquisition device and a target object reaches a preset value, acquiring a noisy voice signal collected by the voice acquisition device, and performing frequency division processing on the noisy voice signal to obtain a low-frequency band signal;

acquiring an amplitude spectrum and a phase spectrum of a low-frequency band signal;

acquiring a modulation domain signal corresponding to the amplitude spectrum; the modulation domain signal comprises a modulation domain amplitude spectrum, a modulation domain power spectrum and a modulation domain phase spectrum;

processing the modulation domain amplitude spectrum or the modulation domain power spectrum by adopting a spectral subtraction method to obtain a noise-reduced modulation domain amplitude spectrum; the smooth over-subtraction factor in the spectral subtraction is obtained according to the posterior signal-to-noise ratio and the smoothing factor of the modulation domain;

compensating the modulation domain phase spectrum to obtain a compensated modulation domain phase spectrum;

obtaining a low-frequency band signal subjected to noise reduction based on the compensated modulation domain phase spectrum, the modulation domain amplitude spectrum subjected to noise reduction and the phase spectrum of the low-frequency band signal; the noise-reduced low-frequency band signal is used for synthesizing a noise-reduced speech signal.
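This section does not specify how the modulation domain phase spectrum is compensated. As a hedged illustration, the sketch below uses one phase spectrum compensation scheme known from the speech enhancement literature: a real, antisymmetric offset scaled by the noise magnitude estimate is added to the complex spectrum, and the angle of the result is taken as the compensated phase. The scaling constant `lam`, the function name, and the toy data are assumptions.

```python
import numpy as np

def compensate_phase(spectrum, noise_mag, lam=3.0):
    """spectrum: full N-point complex FFT of a real frame.
    noise_mag: (N,) noise magnitude estimate.
    Returns the compensated phase spectrum, shape (N,)."""
    n = len(spectrum)
    psi = np.zeros(n)
    psi[1:(n + 1) // 2] = 1.0       # positive-frequency bins
    psi[n // 2 + 1:] = -1.0         # negative-frequency bins (DC and Nyquist stay 0)
    offset = lam * psi * noise_mag  # real-valued, antisymmetric offset
    return np.angle(spectrum + offset)

# Toy usage on a single real frame.
frame = np.array([0.0, 1.0, 0.5, -0.3, -1.0, 0.2, 0.4, -0.1])
spectrum = np.fft.fft(frame)
noise_mag = np.full(len(frame), 0.2)
phase = compensate_phase(spectrum, noise_mag)

# Recombine an amplitude spectrum (here the unmodified one, as a stand-in for the
# noise-reduced amplitude) with the compensated phase and return to the time domain.
denoised_amp = np.abs(spectrum)
rebuilt = np.fft.ifft(denoised_amp * np.exp(1j * phase)).real
```

Because the offset breaks conjugate symmetry, the inverse transform is complex and only its real part is kept. Bins whose magnitude is small relative to the noise estimate have their phase altered most and partly cancel in that real part.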

In one embodiment, the computer program when executed by the processor further performs the steps of:

carrying out frequency division processing on the voice signal containing noise to obtain a high-frequency band signal;

obtaining an amplitude spectrum and a phase spectrum of the high-frequency band signal;

processing the amplitude spectrum of the high-frequency band signal by adopting a transfer gain function to obtain the amplitude spectrum of the high-frequency band signal subjected to noise reduction; obtaining a noise-reduced high-frequency band signal based on the amplitude spectrum and the phase spectrum of the noise-reduced high-frequency band signal; the transfer gain function is obtained according to the power spectrum and the cross power spectrum of the high-frequency band signal and the estimated noise cross power spectrum;

and synthesizing the noise-reduced high-frequency band signal and the noise-reduced low-frequency band signal to obtain a noise-reduced voice signal.

It will be understood by those skilled in the art that all or part of the processes of the methods in the embodiments described above can be implemented by a computer program instructing the relevant hardware. The computer program can be stored in a non-volatile computer-readable storage medium and, when executed, can include the processes of the method embodiments described above. Any reference to memory, storage, database, or other medium used in the embodiments provided herein may include non-volatile and/or volatile memory. Non-volatile memory can include read-only memory (ROM), programmable ROM (PROM), electrically programmable ROM (EPROM), electrically erasable programmable ROM (EEPROM), or flash memory. Volatile memory can include random access memory (RAM) or external cache memory. By way of illustration and not limitation, RAM is available in a variety of forms, such as static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double data rate SDRAM (DDR SDRAM), enhanced SDRAM (ESDRAM), synchronous link DRAM (SLDRAM), Rambus direct RAM (RDRAM), direct Rambus dynamic RAM (DRDRAM), and Rambus dynamic RAM (RDRAM).

The technical features of the embodiments described above may be arbitrarily combined, and for the sake of brevity, all possible combinations of the technical features in the embodiments described above are not described, but should be considered as being within the scope of the present specification as long as there is no contradiction between the combinations of the technical features.

The above embodiments express only several implementations of the present application. Their description is relatively specific and detailed, but they should not be construed as limiting the scope of the patent. It should be noted that a person skilled in the art can make several variations and improvements without departing from the concept of the present application, all of which fall within the scope of protection of the present application. Therefore, the scope of protection of this patent shall be subject to the appended claims.
