Voice enhancement analysis method and system for walking aid assistive device

Document No.: 1965040 Publication date: 2021-12-14

Reading note: This technology, "Voice enhancement analysis method and system for walking aid assistive device" (一种用于助行辅具的语音增强分析方法及系统), was designed by 韩宇菲, 史金飞, 黄家才, 盛云龙, 陈国军, 陈伟, 李毅搏 and 张铎 on 2021-08-30. Main content: The embodiment of the invention discloses a voice enhancement analysis method and system for a walking aid, relates to the technical field of voice recognition, and can alleviate the insufficient noise reduction effect and robustness of voice enhancement analysis in existing walking aids. The invention comprises a voice enhancement system, a reference microphone, an environmental noise microphone, a voice recognition system and a walking aid drive, wherein the reference microphone, the environmental noise microphone and the voice recognition system are all connected to the voice enhancement system. The voice enhancement system processes the external environment noise in the voice signal through a mixed parameter model combining a variable step size and a variable order, and outputs the noise-reduced voice signal to the voice recognition system. The invention is suitable for voice control of walking aids.

1. A speech enhancement analysis system for a walking aid, comprising:

a voice enhancement system, a reference microphone, an environmental noise microphone, a voice recognition system and a walking aid drive, wherein the reference microphone, the environmental noise microphone and the voice recognition system are all connected to the voice enhancement system;

the reference microphone is used for acquiring voice signals and sending the acquired voice signals to the voice enhancement system and the environmental noise microphone;

the environment noise microphone is used for extracting external environment noise from the voice signal and sending the external environment noise to the voice enhancement system;

the voice enhancement system is used for processing the external environment noise in the voice signal through a mixed parameter model combining variable step length and variable order, and outputting the voice signal after noise reduction to a voice recognition system;

the voice recognition system is used for triggering the walking aid drive to execute the corresponding action according to the noise-reduced voice signal;

the walking aid drive is used for driving the walking aid to respond to the corresponding action.

2. The system of claim 1, wherein the speech enhancement system is further configured to adjust weight coefficients of a mixed parameter model with a combination of a variable step size and a variable order running in the speech enhancement system.

3. A speech enhancement analysis method for a walking aid, comprising:

S1, collecting a voice signal that includes external environment noise;

S2, extracting the external environment noise from the voice signal;

S3, processing the external environment noise in the voice signal through a mixed parameter model combining a variable step size and a variable order to obtain a noise-reduced voice signal;

S4, controlling the walking aid to perform actions according to the noise-reduced voice signal.

4. The method according to claim 3, further comprising, after step S1:

adjusting the weight coefficients of the mixed parameter model combining the variable step size and the variable order according to the voice signal and the external environment noise.

5. The method according to claim 3 or 4, wherein the processing the external environment noise in the speech signal through the mixed parameter model combining the variable step size and the variable order number comprises:

wherein […] represents the root mean square of the smoothed error value at the previous instant, […] represents the smoothed error value at the previous instant, λ represents the iteration coefficient, which is generally smaller than and close to 1, and n represents the number of iterations.

6. The method of claim 5, further comprising:

wherein […] represents the variance of the system noise, […] represents an approximately calculated a priori error, k represents a count, max represents taking the maximum, and […] represents the square of the smoothed a priori error.

7. The method of claim 6, further comprising:

wherein μ (n) represents a step size, D represents a set constant, and β represents a regularization factor;

and l_f(n) represents the intermediate fractional order, α represents the iteration coefficient, γ represents the step size of the variable order, L(n) represents the order, […] represents the complete error of order L(n), Δ represents the error width, and […] represents the partial error.

8. The method of claim 7, further comprising:

wherein FE represents the difference between the complete error and the partial error, M(n) represents the order used for illustration, […] represents the complete error of order M(n), and […] represents the partial error obtained by taking the part of order M(n) − Δ when the overall order is M(n); q₁(n) = λq₁(n-1) + (1-λ)|FE(n)|, where q₁(n) represents an intermediate quantity of the iterative computation; and q₂(n) = λq₂(n-1) + (1-λ)FE(n), where q₂(n) represents another intermediate quantity of the iterative computation.

9. The method of claim 8, further comprising:

γ(n) = ρ₁q₁(n), where γ(n) denotes the step size of the variable order and ρ₁ represents a constant coefficient;

Δ(n) = min(Δ_max, ρ₂|q₂(n)|), where Δ(n) denotes the error width, Δ_max denotes the maximum error width, and ρ₂ represents a constant coefficient.

Technical Field

The invention relates to the technical field of voice recognition, in particular to a voice enhancement analysis method and system for a walking aid.

Background

With the aging of the population, the proportion of people over 60 years old is expected to double in the next 50 years, and the number of people disabled by disasters and diseases is also increasing year by year, with impairments of varying degrees in walking, eyesight, hand function and speech. Providing high-performance walking tools for the elderly and the disabled has therefore become an important concern for society as a whole. As a service appliance, a walking aid can greatly improve the quality of daily life and work of the elderly and the disabled. One of the key technologies of a walking aid is human-computer interaction with the user, which has two elements: on the one hand, the user can control the walking aid more naturally, and on the other hand, the walking aid can better understand the user's intentions and commands.

However, existing electric walking aid products generally suffer from unfriendly human-computer interaction, inconvenient operation and low safety. In addition, acoustic noise is present in everyday living and working environments, and a voice signal carrying external environment noise is much harder to recognize, which lowers the recognition rate.

Adaptive beamforming voice enhancement is a core technology for improving the recognition rate, but current schemes suffer from slow convergence and poor steady-state performance, which seriously degrades the noise reduction effect. Moreover, existing adaptive filtering schemes are not suited to nonlinear, non-stationary noise sequences, and they have poor robustness, a large computational load and high hardware requirements. Because noise in daily life varies greatly, current schemes still struggle to meet the real-time requirements of the system. Conventional noise reduction systems are relatively simple in structure and lack the performance to cope with sudden changes in noise, and the noise reduction system itself easily generates additional noise, which strongly affects the noise reduction effect. Together, these factors leave the noise reduction effect and robustness insufficient.

Disclosure of Invention

The embodiment of the invention provides a method and a system for voice enhancement analysis of a walking aid, which can alleviate the problems of insufficient noise reduction effect and robustness in the voice enhancement analysis of the existing walking aid.

In order to achieve the above purpose, the embodiment of the invention adopts the following technical scheme:

in one aspect, a speech enhancement analysis system for a walking aid is provided, comprising:

the reference microphone is used for acquiring voice signals and sending the acquired voice signals to the voice enhancement system and the environmental noise microphone;

the environment noise microphone is used for extracting external environment noise from the voice signal and sending the external environment noise to the voice enhancement system;

the voice recognition system is used for triggering the walking aid drive to execute the corresponding action according to the noise-reduced voice signal;

the walking aid drive is used for driving the walking aid to respond to the corresponding action.

In another aspect, a speech enhancement analysis method for a walking aid is provided, comprising:

S1, collecting a voice signal that includes external environment noise;

S2, extracting the external environment noise from the voice signal;

S3, processing the external environment noise in the voice signal through a mixed parameter model combining a variable step size and a variable order to obtain a noise-reduced voice signal;

S4, controlling the walking aid to perform actions according to the noise-reduced voice signal.

After step S1, the method further includes: adjusting the weight coefficients of the mixed parameter model combining the variable step size and the variable order according to the voice signal and the external environment noise.

The processing of the external environment noise in the voice signal through the mixed parameter model combining the variable step size and the variable order uses the following quantities, wherein […] represents the root mean square of the smoothed error value at the previous instant, […] represents the smoothed error value at the previous instant, λ represents the iteration coefficient, which is generally smaller than and close to 1, and n represents the number of iterations.

Wherein […] represents the variance of the system noise, […] represents an approximately calculated a priori error, k represents a count, max represents taking the maximum, and […] represents the square of the smoothed a priori error.

Wherein μ(n) represents the step size, D represents a set constant, and β represents a regularization factor; and l_f(n) represents the intermediate fractional order, α represents the iteration coefficient, γ represents the step size of the variable order, L(n) represents the order, […] represents the complete error of order L(n), Δ represents the error width, and […] represents the partial error.

Wherein FE represents the difference between the complete error and the partial error, M(n) represents the order used for illustration, […] represents the complete error of order M(n), and […] represents the partial error obtained by taking the part of order M(n) − Δ when the overall order is M(n); q₁(n) = λq₁(n-1) + (1-λ)|FE(n)|, where q₁(n) represents an intermediate quantity of the iterative computation; and q₂(n) = λq₂(n-1) + (1-λ)FE(n), where q₂(n) represents another intermediate quantity of the iterative computation.

γ(n) = ρ₁q₁(n), where γ(n) denotes the step size of the variable order and ρ₁ denotes a constant coefficient; Δ(n) = min(Δ_max, ρ₂|q₂(n)|), where Δ(n) denotes the error width, Δ_max denotes the maximum error width, and ρ₂ represents a constant coefficient.
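The four explicit recursions for q₁(n), q₂(n), γ(n) and Δ(n) can be illustrated with a minimal Python sketch. It covers only these recursions; how FE(n) is computed from the complete and partial errors is not reproduced here. The names `OrderControlState` and `update_order_control` are hypothetical, and initializing q₁ and q₂ to zero is an assumption.

```python
from dataclasses import dataclass


@dataclass
class OrderControlState:
    """Running quantities q1(n) and q2(n); zero initial values are an assumption."""
    q1: float = 0.0
    q2: float = 0.0


def update_order_control(state, fe, lam, rho1, rho2, delta_max):
    """Apply one iteration of the smoothing recursions.

    fe is FE(n), the difference between the complete and the partial error.
    Returns the new state together with gamma(n) and delta(n).
    """
    q1 = lam * state.q1 + (1.0 - lam) * abs(fe)  # q1(n) = lam*q1(n-1) + (1-lam)*|FE(n)|
    q2 = lam * state.q2 + (1.0 - lam) * fe       # q2(n) = lam*q2(n-1) + (1-lam)*FE(n)
    gamma = rho1 * q1                            # gamma(n) = rho1*q1(n)
    delta = min(delta_max, rho2 * abs(q2))       # delta(n) = min(delta_max, rho2*|q2(n)|)
    return OrderControlState(q1, q2), gamma, delta
```

Because q₁ smooths |FE(n)| while q₂ smooths the signed FE(n), a difference that keeps changing sign keeps γ(n) large but lets Δ(n) shrink, whereas a consistently signed difference raises both until Δ(n) is capped at Δ_max.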

Compared with the prior art, this embodiment makes full use of a new adaptive beamforming voice enhancement method. It achieves a good noise control effect, copes effectively with different noise scenes, recognizes and classifies noise intensity, uses stored scene characteristic quantities to make an intelligent judgment and select the most suitable system coefficients, and finally achieves good voice control of the walking aid through the voice recognition system and the walking aid drive. The new high-performance model is applied to voice control of the walking aid; by adjusting its key parameters, the model can greatly reduce the amount of computation and improve convergence speed, steady-state performance and other properties, and it is easy to implement, so the noise reduction effect and robustness are enhanced. The model adopts a robust design and, by adjusting its key parameters, tracks changes in the noise signal more quickly and accurately, thereby greatly improving the noise reduction effect.

Drawings

In order to more clearly illustrate the technical solutions in the embodiments of the present invention, the drawings needed to be used in the embodiments will be briefly described below, and it is obvious that the drawings in the following description are only some embodiments of the present invention, and it is obvious for those skilled in the art that other drawings can be obtained according to the drawings without creative efforts.

Fig. 1 is a schematic diagram of the connection structure between the system units according to an embodiment of the present invention;

Fig. 2 is a schematic flow chart of the voice control mode according to an embodiment of the present invention;

Fig. 3 is a schematic structural diagram of the adaptive model part according to an embodiment of the present invention;

Fig. 4 is a schematic diagram of the execution flow according to an embodiment of the present invention.

Detailed Description

In order to make the technical solutions of the present invention better understood, the present invention will be described in further detail with reference to the accompanying drawings and specific embodiments. Reference will now be made in detail to embodiments of the present invention, examples of which are illustrated in the accompanying drawings, wherein like reference numerals refer to the same or similar elements or elements having the same or similar function throughout. The embodiments described below with reference to the accompanying drawings are illustrative only for the purpose of explaining the present invention, and are not to be construed as limiting the present invention. As used herein, the singular forms "a", "an" and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise. It will be further understood that the terms "comprises" and/or "comprising," when used in this specification, specify the presence of stated features, integers, steps, operations, elements, and/or components, but do not preclude the presence or addition of one or more other features, integers, steps, operations, elements, components, and/or groups thereof. It will be understood that when an element is referred to as being "connected" or "coupled" to another element, it can be directly connected or coupled to the other element or intervening elements may also be present. Further, "connected" or "coupled" as used herein may include wirelessly connected or coupled. As used herein, the term "and/or" includes any and all combinations of one or more of the associated listed items. It will be understood by those skilled in the art that, unless otherwise defined, all terms (including technical and scientific terms) used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this invention belongs.
It will be further understood that terms, such as those defined in commonly used dictionaries, should be interpreted as having a meaning that is consistent with their meaning in the context of the prior art and will not be interpreted in an idealized or overly formal sense unless expressly so defined herein.

The design objectives of this embodiment are to give the walking aid a voice control mode, to improve the convenience of voice control, and to enhance the noise reduction effect and robustness.

To achieve the above purpose, the design idea of this embodiment is to provide a voice control mode for the walking aid based on a mixed parameter adaptive beamforming voice enhancement model. It comprises a reference microphone, an environmental noise microphone, a voice enhancement system, a voice recognition system and a walking aid drive, wherein the reference microphone, the environmental noise microphone and the voice recognition system are connected to the voice enhancement system. The reference microphone collects a voice signal containing external environment noise and sends the collected signal to the voice enhancement system. The environmental noise microphone collects the external environment noise and transmits it to the voice enhancement system. The voice enhancement system processes the noisy voice signal and the noise signal collected by the microphones through a mixed parameter model combining a variable step size and a variable order, and outputs the clearer, noise-reduced voice signal to the voice recognition system of the walking aid. Specifically, the voice enhancement model formed by mixed variable-parameter adaptive beamforming performs noise reduction on noisy speech in various real environments, improving the clarity and intelligibility of the voice signal and ultimately the success rate of voice recognition, so that the voice control mode is practical and safe in complex noise scenes such as streets, shopping malls and stations. The voice recognition system recognizes the action that the walking aid should perform from the received noise-reduced voice signal. The walking aid drive drives the walking aid to respond with the action corresponding to the voice signal.

The overall flow of the voice control mode is as follows. The user speaks the wake-up word to wake the walking aid, and the walking aid enters the voice control mode only after the wake-up word, which avoids false wake-ups and mistaken commands and improves safety. The current wake-up word is "Xiaojie". There are more than ten action command words, such as 'forward', 'backward', 'left turn', 'right turn', 'up', 'down', 'ease' and 'stop'. After a command word such as 'forward' or 'left turn' is spoken, the walking aid actuates the mechanical drive to perform the corresponding action; once the user's goal is reached, for example when the walking aid has advanced to the destination or turned left to the target angle, the user says 'stop'. The function can also be extended: combined with online voice recognition and a voice assistant system, voice interaction can be realized.
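The wake-word gating described above can be sketched as a small state machine. This is only an illustration under assumptions: the class `VoiceController`, its `drive` callback and the exact-match comparison of recognized text are hypothetical, and only the command words listed in the text are included.

```python
WAKE_WORD = "Xiaojie"  # wake-up word named in the text
COMMANDS = {"forward", "backward", "left turn", "right turn",
            "up", "down", "ease", "stop"}  # subset listed in the text


class VoiceController:
    """Hypothetical controller: idle until the wake word, then dispatch commands."""

    def __init__(self, drive):
        self.drive = drive   # callable taking a command word; stands in for the walking aid drive
        self.awake = False
        self.moving = None   # command currently being executed, if any

    def on_recognized(self, text):
        """Handle one recognized utterance; return the dispatched command or None."""
        word = text.strip().lower()
        if not self.awake:
            # Before wake-up, everything except the wake word is ignored,
            # which is what prevents false triggering.
            if word == WAKE_WORD.lower():
                self.awake = True
            return None
        if word not in COMMANDS:
            return None
        self.moving = None if word == "stop" else word
        self.drive(word)
        return word
```

For example, with `log = []` and `VoiceController(log.append)`, saying "forward" before "Xiaojie" leaves `log` empty, while the same word after the wake-up word is passed to the drive and remains the active motion until "stop" is recognized.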

In addition, the voice enhancement system jointly adjusts its weight coefficients according to the noisy voice signal collected by the reference microphone and the external environment noise collected by the environmental noise microphone, and it uses the mixed parameter model to improve the convergence speed and reduce the steady-state error and the amount of computation. Specifically, the weight coefficients of the voice enhancement system can be adjusted automatically: the model is used to generate a set of noise control weight coefficients w, the coefficients in w that have a larger influence on the noise are then selected and recombined to obtain an optimal weight vector, and the final output signal is calculated from it.

The principle of the hybrid parameter adaptive model combining the variable step length and the variable order is as follows:

The order and the step size both have a large impact on the performance of the adaptive model. An order that is too small may cause the model to fall into a locally optimal solution, and the model may diverge when the order is less than that required by the actual system. To avoid this, a general model is usually assigned a relatively large order. However, this increases the amount of computation and degrades convergence and steady-state performance, so the order should be adjusted reasonably. Likewise, a large step size typically improves convergence but reduces steady-state performance and robustness, while a small step size guarantees steady-state performance and robustness at the cost of convergence, so the step size also needs a variation strategy. Compared with adjusting only the step size or only the order, adjusting both simultaneously so that each takes a reasonable value may further improve model performance. During convergence, a suitably large step size improves convergence while the order is also adjusted and converges quickly toward the optimal order; the closer the order is to the optimal value, the larger the change in the mean square deviation (MSD) and the better the convergence performance. Under the combined action of the step size and the order, the convergence speed of the model can be further improved. In the steady state, a small step size improves steady-state performance and the order settles within a reasonable range of the optimal order; the closer the order is to the optimal value, the smaller the error caused by the unmodelled part of the order, so the steady-state mean square error (MSE) is smaller and steady-state performance improves. Therefore, under the combined action of the step size and the order, the steady-state performance of the model can also be further improved.
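The step-size trade-off can be observed in a small simulation. The sketch below is not the model of this embodiment: it uses a standard fixed-order normalized LMS (NLMS) filter to identify a random FIR system with a large and a small fixed step size and compares the mean square deviation early and late in the run. The function names, the filter order of 8, the noise level and the seed are all assumptions chosen for illustration.

```python
import random


def nlms_misalignment(mu, n_iter=5000, order=8, noise_std=0.05, seed=0):
    """Identify a random FIR system with fixed-step NLMS and return the
    squared weight-error norm (MSD) at every iteration."""
    rng = random.Random(seed)
    h = [rng.gauss(0.0, 1.0) for _ in range(order)]  # unknown system to identify
    w = [0.0] * order                                 # adaptive weights
    x = [0.0] * order                                 # input delay line
    msd = []
    for _ in range(n_iter):
        x = [rng.gauss(0.0, 1.0)] + x[:-1]
        d = sum(hi * xi for hi, xi in zip(h, x)) + rng.gauss(0.0, noise_std)
        e = d - sum(wi * xi for wi, xi in zip(w, x))  # a priori error
        norm = sum(xi * xi for xi in x) + 1e-8        # input energy, regularized
        w = [wi + mu * e * xi / norm for wi, xi in zip(w, x)]
        msd.append(sum((hi - wi) ** 2 for hi, wi in zip(h, w)))
    return msd


def compare_step_sizes(mu_large=0.5, mu_small=0.05):
    """Average MSD shortly after start-up and near the end, for both step sizes."""
    def mean(values):
        return sum(values) / len(values)

    large = nlms_misalignment(mu_large)
    small = nlms_misalignment(mu_small)
    return {
        "early_large": mean(large[100:200]),
        "early_small": mean(small[100:200]),
        "late_large": mean(large[4000:]),
        "late_small": mean(small[4000:]),
    }
```

In this setup the large step size should show the lower MSD in the early window and the small step size the lower MSD in the late window, which is the tension that motivates varying the step size over time rather than fixing it.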

However, previous work on variable-step-size or variable-order models has attended to only one of the step size and the order and neglected the other, so the two cannot be coordinated and considerable room for performance improvement remains. In addition, among conventional variation strategies, some cannot accurately adjust the order and the step size according to the state of the model, and some cannot remove the interference of system errors, so the step size or the order is not optimal at each instant. This easily leads to model imbalance, serious order overestimation, undermodelling and similar problems. Many models also lack corresponding performance and parameter selection analysis and were tested only in limited experimental environments, which makes them impractical.

The invention discloses a voice control mode for a walking aid based on a mixed parameter adaptive beamforming voice enhancement model. External noise and the noisy voice signal are collected by the microphones and sent to the voice enhancement system. The voice enhancement system processes them with a mixed variable-step-size, variable-order model with variable iteration parameters and outputs the noise-reduced voice signal. The voice recognition system receives the clearer, noise-reduced voice signal and controls the walking aid to respond with the corresponding action. The voice control method adopts a robust design and processes the noise signal iteratively, so it can select the most suitable weight coefficients and track changes in the noise signal more quickly and accurately. This greatly improves the noise reduction effect and the purity of the voice signal, and ultimately improves the recognition rate of voice commands in walking aid voice control.

An embodiment of the present invention provides a voice enhancement analysis system for a walking aid; the connection and interaction between the system units are shown in Fig. 1. The system comprises:

a voice enhancement system, a reference microphone, an environmental noise microphone, a voice recognition system and a walking aid drive, wherein the reference microphone, the environmental noise microphone and the voice recognition system are all connected to the voice enhancement system, for example by a bus.

The reference microphone is used for acquiring a voice signal and sending the acquired voice signal to the voice enhancement system and the environmental noise microphone.

The environmental noise microphone is used for extracting the external environment noise from the voice signal and sending the external environment noise to the voice enhancement system.

The voice enhancement system is used for processing the external environment noise in the voice signal through a mixed parameter model combining a variable step size and a variable order, and outputting the noise-reduced voice signal to the voice recognition system.

That is, the voice enhancement system processes the noisy voice signal and the noise signal collected by the microphones through the mixed parameter model combining a variable step size and a variable order, and outputs the clearer, noise-reduced voice signal to the voice recognition system of the walking aid. Specifically, the voice enhancement model formed by mixed variable-parameter adaptive beamforming performs noise reduction on noisy speech in various real environments, improving the clarity and intelligibility of the voice signal and ultimately the success rate of voice recognition, so that the voice control mode is practical and safe in complex noise scenes such as streets, shopping malls and stations.

The voice recognition system is used for triggering the walking aid drive to execute the corresponding action according to the noise-reduced voice signal.

The walking aid drive is used for driving the walking aid to respond with the corresponding action. After receiving the noise-reduced voice signal, the voice recognition system recognizes the action that the walking aid should perform, and the walking aid drive drives the walking aid to respond with that action.

Fig. 1 shows the communication structure between the system units. The reference microphone collects a voice signal containing external environment noise and sends it to the voice enhancement system, whose processing yields the noisy input voice signal x(n). The environmental noise microphone collects the external environment noise and transmits it to the voice enhancement system, yielding another signal d(n). The key components are the voice enhancement control system and the noise front-end identification and detection system, which work together to produce a better noise reduction effect.

Taking the voice control flow shown in Fig. 2 as an example, the user speaks the wake-up word to wake the walking aid, and the walking aid enters the voice control mode only after the wake-up word, which avoids false wake-ups and mistaken commands and improves safety. There are more than ten action command words, such as 'forward', 'backward', 'left turn', 'right turn', 'up', 'down', 'ease' and 'stop'. After a command word such as 'forward' or 'left turn' is spoken, the walking aid actuates the mechanical drive to perform the corresponding action; once the user's goal is reached, for example when the walking aid has advanced to the destination or turned left to the target angle, the user says 'stop'. The function can also be extended: combined with online voice recognition and a voice assistant system, voice interaction can be realized.

Specifically, the voice enhancement system is further configured to adjust the weight coefficients of the mixed parameter model, combining a variable step size and a variable order, that runs in the voice enhancement system. For example, the voice enhancement system jointly adjusts its weight coefficients according to the noisy voice signal collected by the reference microphone and the external environment noise collected by the environmental noise microphone, and it uses the mixed parameter model to improve the convergence speed and reduce the steady-state error and the amount of computation. Specifically, the weight coefficients of the voice enhancement system can be adjusted automatically: the model is used to generate a first set of noise control weight coefficients w, the coefficients in this first set that have a larger influence on the noise are then selected and recombined to obtain an optimal weight vector, and the final output signal is calculated from it.

Based on the above system, this embodiment further provides a voice enhancement analysis method for a walking aid which, as shown in Fig. 4, comprises:

S1, collecting a voice signal that includes external environment noise.

S2, extracting the external environment noise from the voice signal.

S3, processing the external environment noise in the voice signal through a mixed parameter model combining a variable step size and a variable order to obtain a noise-reduced voice signal.

S4, controlling the walking aid to perform actions according to the noise-reduced voice signal.

The voice enhancement system jointly adjusts its weight coefficients according to the noisy voice signal collected by the reference microphone and the external environment noise collected by the environmental noise microphone, and it uses the mixed parameter model to improve the convergence speed and reduce the steady-state error and the amount of computation. Specifically, the weight coefficients of the voice enhancement system can be adjusted automatically: the model is used to generate a set of noise control weight coefficients w, the coefficients in w that have a larger influence on the noise are then selected and recombined to obtain an optimal weight vector, and the final output signal is calculated from it.

The above steps may be repeated to perform loop iteration.

Further, after step S1, the method also includes:

adjusting the weight coefficients of the mixed parameter model combining variable step length and variable order according to the voice signal and the external environment noise.

The weight coefficients of the mixed parameter model are adjusted as follows: a noise-control weight coefficient w is obtained from the noisy speech signal x(n) acquired by the reference microphone and the external environment noise signal d(n) acquired by the environment noise microphone.

In step S3, the weight coefficients w that have a large influence on the external environment noise are selected and recombined to obtain an optimal weight vector, and the external environment noise in the voice signal is processed using this optimal weight vector.

Fig. 3 is a structural block diagram of the new hybrid parameter adaptive beamforming speech enhancement model. The noisy speech signal x(n) is acquired by the reference microphone and the external environment noise signal d(n) is acquired by the environment noise microphone; both are transmitted to the speech enhancement system at the front end of Fig. 2 for processing, and the model generates a set of noise-control weight coefficients w. Finally, from the weight coefficients w of the feedforward speech enhancement system, the system selects those with a large influence on the noise and recombines them into an optimal weight vector used to calculate the final output signal.

Further, the noise-control weight coefficient w is obtained as follows:

The method applies an improved model that adjusts the order and the step length simultaneously. The main calculation process of the speech enhancement model is as follows.

The weight value updating formula in the adaptive model can be expressed as follows:

W(n+1)＝W(n)+μe(n)x(n) (1)

where n represents the iteration index, W(n) represents the weight vector, μ represents the step size, e(n) represents the error, and x(n) represents the noisy speech signal.
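As an illustration of update (1), the following minimal Python sketch runs the recursion with a fixed order and a fixed step size. The error definition e(n) = d(n) − Wᵀ(n)x(n), the function name lms_identify, and the 3-tap test system are assumptions made for the example; they are not taken from this embodiment, which additionally varies both the order and the step size.

```python
import random

def lms_identify(x, d, order, mu):
    """Run W(n+1) = W(n) + mu * e(n) * x(n) over a whole record.

    x, d  : input and desired sequences of equal length
    order : number of taps (held fixed in this sketch)
    mu    : step size (held fixed in this sketch)
    Returns the final weight vector and the error sequence.
    """
    w = [0.0] * order
    errors = []
    for n in range(len(x)):
        # Tap-input vector [x(n), x(n-1), ...], zero-padded before the record starts.
        xn = [x[n - k] if n - k >= 0 else 0.0 for k in range(order)]
        y = sum(wk * xk for wk, xk in zip(w, xn))
        e = d[n] - y  # assumed error definition: desired minus filter output
        w = [wk + mu * e * xk for wk, xk in zip(w, xn)]
        errors.append(e)
    return w, errors

# Toy check: identify a hypothetical 3-tap system from noise-free data.
rng = random.Random(0)
w_true = [0.5, -0.3, 0.1]
x = [rng.gauss(0.0, 1.0) for _ in range(5000)]
d = [sum(w_true[k] * (x[n - k] if n - k >= 0 else 0.0) for k in range(3))
     for n in range(len(x))]
w_hat, err = lms_identify(x, d, order=3, mu=0.01)
```

With noise-free data and a small step size, w_hat settles very close to w_true; the variable step length and variable order described in this embodiment are intended to reach such a solution faster and with less computation when the true order is unknown.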

Specifically, processing the external environment noise in the voice signal through the mixed parameter model combining variable step length and variable order comprises the calculations of formulas (2) to (11).

The model is not disturbed by system errors and can accurately adjust the order and the step length according to the iteration state, obtaining optimal values of both at the same time and improving several key performance measures of the model. The model's adjustment strategy for the step length and the order is as follows:

where […] represents the root mean square of the smoothed error value at the previous time, […] represents the smoothed error value at the previous time, λ represents the iteration coefficient, which is generally smaller than and close to 1, and n represents the iteration index.

where […] represents the variance of the system noise, […] represents the approximately calculated a priori error, k represents the count, and max represents the maximum.

where […] represents the squared smoothed a priori error.

where μ(n) represents the step size, D represents a constant, and β represents a regularization factor.

where l_f(n) represents the intermediate fractional order, α represents the iteration coefficient, γ represents the step size of the variable order, L(n) represents the order, […] represents the complete error of order L(n), Δ represents the error width, and […] represents the partial error.

q₁(n)＝λq₁(n-1)+(1-λ)|FE(n)| (8)

where q₁(n) represents an intermediate quantity of the iterative computation.

q₂(n)＝λq₂(n-1)+(1-λ)FE(n) (9)

where q₂(n) represents another intermediate quantity of the iterative computation.

γ(n)＝ρ₁q₁(n) (10)

where γ(n) denotes the step size of the variable order and ρ₁ denotes a constant coefficient.

Δ(n)＝min(Δ_max，ρ₂|q₂(n)|) (11)

where Δ(n) represents the error width, Δ_max denotes the maximum error width, and ρ₂ denotes a constant coefficient.
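Formulas (8) to (11) can be read as one update step, sketched below in Python. The sketch takes FE(n) as a given input, since its definition is not reproduced in the surviving text; the function name update_order_controls and all numeric values are hypothetical and serve only to show how the four quantities depend on one another.

```python
def update_order_controls(q1_prev, q2_prev, fe, lam, rho1, rho2, delta_max):
    """One iteration of formulas (8) to (11).

    q1_prev, q2_prev : q1(n-1) and q2(n-1)
    fe               : FE(n), supplied by the caller (not defined in this sketch)
    lam              : iteration coefficient lambda, slightly below 1
    rho1, rho2       : constant coefficients
    delta_max        : maximum error width
    Returns q1(n), q2(n), gamma(n), delta(n).
    """
    q1 = lam * q1_prev + (1.0 - lam) * abs(fe)   # (8): smoothed magnitude of FE
    q2 = lam * q2_prev + (1.0 - lam) * fe        # (9): smoothed signed FE
    gamma = rho1 * q1                            # (10): step size of the variable order
    delta = min(delta_max, rho2 * abs(q2))       # (11): error width, capped at delta_max
    return q1, q2, gamma, delta

# Usage with made-up FE values: both smoothers start at zero.
q1, q2 = 0.0, 0.0
history = []
for fe in [2.0, 1.5, -0.5, 0.2, 0.1]:
    q1, q2, gamma, delta = update_order_controls(
        q1, q2, fe, lam=0.99, rho1=0.5, rho2=10.0, delta_max=4.0)
    history.append((gamma, delta))
```

Because q₁(n) smooths |FE(n)| while q₂(n) smooths the signed value, γ(n) stays large whenever FE(n) is large in magnitude, whereas Δ(n) shrinks when FE(n) fluctuates around zero.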

In practical applications, the order and the step size both have a large impact on performance. Too small an order traps the model in a locally optimal solution, and divergence results when the order is smaller than the actual system requires. To avoid this, a larger order is generally assigned; however, this increases the amount of computation and degrades convergence and steady-state performance, so the order should be adjusted reasonably. Likewise, a large step size generally improves convergence performance but reduces steady-state performance and robustness, while a small step size guarantees steady-state performance and robustness but slows convergence.

The scheme designed in this embodiment adjusts the order and the step length simultaneously, so that each takes a reasonable value and performance is further improved. During convergence, a suitably large step length improves convergence performance; at the same time the order is adjusted and converges quickly to the optimal order. The closer the order is to the optimal value, the larger the change in mean square deviation (MSD) and the better the convergence performance, so the combined action of step length and order further increases the convergence speed. In the steady state, a small step length improves steady-state performance and the order settles within a reasonable range around the optimal order. The closer the steady-state order is to the optimal value, the smaller the error caused by the unestimated part of the order, and hence the smaller the steady-state mean square error (MSE). The combined action of step length and order therefore also further improves steady-state performance.
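The steady-state part of this argument can be made concrete in a simplified case. Assuming a white, unit-variance input (an assumption made only for this sketch, not a condition of the embodiment), the best filter of a given order matches the leading taps of the unknown system, and the lowest reachable MSE equals the noise variance plus the energy of the taps left unmodelled. The function name and the 4-tap system below are hypothetical.

```python
def min_mse_for_order(w_opt, order, noise_var):
    """Lowest achievable MSE of a length-`order` filter, white unit-variance input."""
    # Taps of the unknown system that a filter of this order cannot model.
    unmodelled = w_opt[order:]
    return noise_var + sum(c * c for c in unmodelled)

w_opt = [1.0, 0.5, 0.25, 0.125]   # hypothetical unknown system, optimal order 4
noise_var = 0.01                  # hypothetical system-noise variance
floor_by_order = {L: min_mse_for_order(w_opt, L, noise_var) for L in range(1, 7)}
```

In this example the floor falls as the order approaches 4 and does not fall further for orders 5 and 6, which only add computation; this is consistent with the statement that the order should settle near its optimal value rather than simply being made large.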

The performance of the model is evaluated by comparison with other classical models, including MPFT-LMS, NFT-LMS, VSTFT-LMS, IAPFT-LMS, and FT-LMS with different parameter values. For a fair comparison, the parameters of each model are set according to the basis given in its respective article. The input signal is obtained by passing typical white Gaussian noise through a shaping filter H(z) = 0.35 + z^-1 + z^-2. A high-noise experimental environment with an SNR of 0 dB is generated by system noise, and four different unknown systems are used for testing:

where a_k, b_k, c_k and d_k all represent zero-mean Gaussian white noise sequences.

W₁ and W₂ are used as the first experimental system, for the smaller-order case: W_opt = W₁ when n < 10000, and W_opt = W₂ when 10000 ≤ n < 20000. W₃ and W₄ are used as the second experimental system, for the larger-order case: W_opt = W₃ when n < 10000, and W_opt = W₄ when 10000 ≤ n < 20000. This arrangement helps verify the ability to cope with sudden changes. The desired sequence is calculated by the following formula:

d(n)＝x^T(n)w_opt+v(n) (14)
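A minimal Python sketch of this test setup follows. It colours white Gaussian noise with the shaping filter as printed in the text, switches the unknown system at n = 10000, and adds noise v(n) at an SNR of 0 dB according to formula (14). The vectors w1 and w2 are hypothetical stand-ins, because the definitions of W₁ to W₄ are not reproduced here; scaling v(n) to the average power of the whole record is likewise an assumption.

```python
import math
import random

def shape(white):
    """Colour white noise with H(z) = 0.35 + z^-1 + z^-2, as printed in the text."""
    h = [0.35, 1.0, 1.0]
    return [sum(h[k] * white[n - k] for k in range(len(h)) if n - k >= 0)
            for n in range(len(white))]

def desired_sequence(x, w_first, w_second, switch_at, snr_db, rng):
    """Return d(n) = x^T(n) w_opt + v(n), with w_opt switching at n = switch_at."""
    clean = []
    for n in range(len(x)):
        w_opt = w_first if n < switch_at else w_second
        clean.append(sum(w_opt[k] * x[n - k]
                         for k in range(len(w_opt)) if n - k >= 0))
    # Scale v(n) so that the whole-record signal-to-noise ratio equals snr_db.
    signal_power = sum(c * c for c in clean) / len(clean)
    noise_std = math.sqrt(signal_power / 10.0 ** (snr_db / 10.0))
    v = [rng.gauss(0.0, noise_std) for _ in clean]
    d = [c + vn for c, vn in zip(clean, v)]
    return d, clean, v

rng = random.Random(1)
x = shape([rng.gauss(0.0, 1.0) for _ in range(20000)])
w1 = [0.8, -0.4, 0.2]          # hypothetical stand-in for W1
w2 = [-0.5, 0.3, 0.1, 0.05]    # hypothetical stand-in for W2
d, clean, v = desired_sequence(x, w1, w2, switch_at=10000, snr_db=0.0, rng=rng)
measured_snr_db = 10.0 * math.log10(
    sum(c * c for c in clean) / sum(vn * vn for vn in v))
```

The abrupt change of the unknown system halfway through the record is what allows the tracking behaviour of each model to be compared.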

The experimental results show that the model designed in this embodiment has the fastest-descending MSD and EMSE curves, and its curve values are also the smallest throughout, which indicates that its weight coefficients are closest to the optimal values. Besides adjusting the order properly, the method also has a reasonable step length adjustment strategy, so that the two variables work in coordination and performance is improved comprehensively.
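For reference, the two measures can be computed per iteration as sketched below, using their common textbook definitions, which this embodiment does not spell out: the squared deviation is the squared distance between the current and optimal weight vectors, and the excess error is the squared a priori error. Because the order varies, the shorter vector is zero-padded; the function names are hypothetical. Averaging these values over independent runs gives the MSD and EMSE curves.

```python
def pad(v, n):
    """Zero-pad vector v to length n."""
    return list(v) + [0.0] * (n - len(v))

def squared_deviation(w_opt, w):
    """||w_opt - w||^2 for one iteration, with zero-padding to a common length."""
    n = max(len(w_opt), len(w))
    return sum((a - b) ** 2 for a, b in zip(pad(w_opt, n), pad(w, n)))

def excess_error_sq(w_opt, w, xn):
    """Squared a priori error (x^T (w_opt - w))^2 for one iteration.

    xn is the tap-input vector and must be at least as long as the longer
    of the two weight vectors.
    """
    n = max(len(w_opt), len(w))
    ea = sum((a - b) * xk for a, b, xk in zip(pad(w_opt, n), pad(w, n), xn))
    return ea * ea

# A 1-tap estimate of a 2-tap system: the missing tap contributes to both measures.
sd = squared_deviation([1.0, 2.0], [1.0])
ee = excess_error_sq([1.0, 2.0], [1.0], [3.0, 0.5])
```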

In conclusion, the model is simple, is easy to apply to speech enhancement, and adjusts key parameters as the noise energy changes, thereby greatly reducing the amount of computation, improving steady-state, convergence, tracking and robustness performance, and improving the noise reduction effect for time-varying noise.

Existing speech enhancement models based on adaptive beamforming are not suited to nonlinear, non-stationary noise sequences and have poor robustness. Noise in daily life varies widely and the system must respond in real time, yet conventional noise reduction systems have a relatively simple structure and cope poorly with abrupt noise changes: they cannot accurately identify the scene and select the most suitable noise reduction mode, and they easily generate extra noise, such as sudden noise, which greatly degrades the noise reduction effect. In contrast, the walking aid of this embodiment is voice-controlled through the mixed parameter adaptive beamforming speech enhancement model. The environment noise microphone collects the ambient noise, the reference microphone collects the voice signal carrying the ambient noise, and the collected signals are sent to the speech enhancement system. The speech enhancement system processes the microphone signals and outputs the clearer, noise-reduced voice signal to the speech recognition system of the walking aid, which recognizes the signal and controls the walking aid to perform the corresponding action. In addition, the speech recognition success rate can be fed back to the speech enhancement system, which then adjusts the weight coefficients of the noise control system according to the recognition success rate of the final noise-reduced voice signal.

The invention starts from the core adaptive model and applies a high-performance model to speech enhancement. The model adjusts its own key parameters, which greatly reduces the amount of computation, improves performance such as convergence speed and steady-state behaviour, is easy to implement, and further strengthens the noise reduction effect and robustness. The invention makes full use of a high-performance speech enhancement model based on adaptive beamforming to achieve good noise reduction for noisy speech. The new model adopts a robust design and, by adjusting the key parameters of the control model, tracks changes in the noise signal more quickly and accurately, thereby greatly improving the noise reduction effect.

The embodiments in the present specification are described in a progressive manner, and the same and similar parts among the embodiments are referred to each other, and each embodiment focuses on the differences from the other embodiments. In particular, for the apparatus embodiment, since it is substantially similar to the method embodiment, it is relatively simple to describe, and reference may be made to some descriptions of the method embodiment for relevant points. The above description is only for the specific embodiment of the present invention, but the scope of the present invention is not limited thereto, and any changes or substitutions that can be easily conceived by those skilled in the art within the technical scope of the present invention are included in the scope of the present invention. Therefore, the protection scope of the present invention shall be subject to the protection scope of the claims.
