Voice compensation method, device and related components

Document No.: 196452 Publication date: 2021-11-02

Abstract: "Voice compensation method, device and related components" was created by 吴琼 on 2021-08-25. The application discloses a voice compensation method applied to an earphone device that includes a microphone. The method comprises: acquiring a voice signal collected by the microphone; judging whether the voice signal is a shielded voice signal; if so, determining a compensation EQ parameter corresponding to the shielded voice signal; and compensating the current EQ parameter of the microphone according to the compensation EQ parameter. The application improves the clarity of the voice signal and thereby the voice call quality of an earphone user who is wearing a mask. The application also discloses a voice compensation device, an earphone device and a computer-readable storage medium, which have the same beneficial effects.

1. A speech compensation method applied to an earphone device including a microphone, the speech compensation method comprising:

acquiring a voice signal collected by the microphone;

judging whether the voice signal is a shielded voice signal;

if so, determining a compensation EQ parameter corresponding to the shielded voice signal;

and compensating the current EQ parameter of the microphone according to the compensation EQ parameter.

2. The speech compensation method according to claim 1, wherein if the voice signal is a shielded voice signal, the speech compensation method further comprises:

determining the obstruction type corresponding to the shielded voice signal;

the process of determining the compensation EQ parameter corresponding to the shielded voice signal comprises:

selecting the compensation EQ parameter corresponding to the obstruction type.

3. The speech compensation method according to claim 1, wherein after acquiring the voice signal collected by the microphone, the speech compensation method further comprises:

performing time-frequency conversion on the voice signal to obtain a first frequency response curve;

subtracting a reference frequency response curve from the first frequency response curve frequency point by frequency point to obtain a difference curve;

calculating an amplitude average value of a target frequency band of the difference curve, wherein the target frequency band is the complete frequency band of the difference curve or an amplitude variation frequency band within the complete frequency band;

the process of judging whether the voice signal is a shielded voice signal comprises:

judging whether the amplitude average value of the target frequency band is greater than a first preset value;

if so, judging that the voice signal is an unshielded voice signal;

if not, judging that the voice signal is a shielded voice signal.

4. The speech compensation method according to claim 2, wherein after acquiring the voice signal collected by the microphone, the speech compensation method further comprises:

performing time-frequency conversion on the voice signal to obtain a first frequency response curve;

subtracting a reference frequency response curve from the first frequency response curve frequency point by frequency point to obtain a difference curve;

dividing the target frequency band of the difference curve into a plurality of sub-bands;

calculating the amplitude average value of each sub-band;

determining the minimum of the amplitude average values of all the sub-bands;

the process of determining the obstruction type corresponding to the shielded voice signal comprises:

determining the obstruction type from the minimum value.

5. The speech compensation method of claim 4, wherein the determining the type of obstruction from the minimum value comprises:

acquiring N second preset values corresponding to the sub-frequency band where the minimum value is located, wherein N is a positive integer;

obtaining N+1 judgment intervals based on the N second preset values, wherein any value in the i-th judgment interval is smaller than any value in the (i+1)-th judgment interval, and i = 1, 2, …, N;

and determining the type of the obstruction according to the judgment interval where the minimum value is positioned.

6. The speech compensation method according to claim 1, wherein after acquiring the voice signal collected by the microphone, the speech compensation method further comprises:

judging whether the voice signal comprises compensation triggering information;

the process of judging whether the voice signal is a shielded voice signal comprises:

when the voice signal comprises the compensation triggering information, judging whether the voice signal is a shielded voice signal.

7. The speech compensation method according to any one of claims 2-6, further comprising:

acquiring a target frequency response curve of an unshielded voice signal and, for each obstruction type, a test frequency response curve of a voice signal shielded by an obstruction of that type;

subtracting the target frequency response curve from each test frequency response curve frequency point by frequency point to obtain reference difference curves;

determining a plurality of sub-bands of the reference difference curve;

calculating the average value of the amplitude of each reference difference curve in each sub-frequency band;

and determining a first preset value or a second preset value corresponding to the sub-frequency band according to all the amplitude average values of the same sub-frequency band.

8. A speech compensation apparatus, applied to an earphone device including a microphone, the speech compensation apparatus comprising:

an acquisition module, configured to acquire a voice signal collected by the microphone;

a first judgment module, configured to judge whether the voice signal is a shielded voice signal and, if so, to trigger a determining module;

the determining module, configured to determine a compensation EQ parameter corresponding to the shielded voice signal;

and a compensation module, configured to compensate the current EQ parameter of the microphone according to the compensation EQ parameter.

9. An earphone device, comprising:

a memory for storing a computer program;

a processor for implementing the steps of the speech compensation method according to any of claims 1-7 when executing said computer program.

10. A computer-readable storage medium, characterized in that a computer program is stored on the computer-readable storage medium, which computer program, when being executed by a processor, carries out the steps of the speech compensation method according to any one of claims 1-7.

Technical Field

The present disclosure relates to the field of speech signal processing technologies, and in particular, to a speech compensation method, apparatus, and related components.

Background

At present, when an earphone user makes a voice call through an earphone, the microphone of the earphone collects the user's voice signal, which is processed by the earphone's built-in noise reduction algorithm and then sent out. If the user is wearing a mask, the mask blocks part of the acoustic signal, so the sound collected by the microphone changes and the clarity of the voice signal drops, which degrades the voice call quality of the earphone user while wearing the mask.

Therefore, how to provide a solution to the above technical problem is a problem that needs to be solved by those skilled in the art.

Disclosure of Invention

The application aims to provide a voice compensation method, a voice compensation device, an earphone device and a computer-readable storage medium that improve the clarity of a voice signal and thereby the voice call quality of an earphone user who is wearing a mask.

In order to solve the above technical problem, the present application provides a speech compensation method, applied to an earphone device, where the earphone device includes a microphone, and the speech compensation method includes:

acquiring a voice signal collected by the microphone;

judging whether the voice signal is a shielded voice signal;

if so, determining a compensation EQ parameter corresponding to the shielded voice signal;

and compensating the current EQ parameter of the microphone according to the compensation EQ parameter.

Optionally, if the voice signal is a shielded voice signal, the speech compensation method further includes:

determining the obstruction type corresponding to the shielded voice signal;

the process of determining the compensation EQ parameter corresponding to the shielded voice signal comprises:

selecting the compensation EQ parameter corresponding to the obstruction type.

Optionally, after acquiring the voice signal collected by the microphone, the voice compensation method further includes:

performing time-frequency conversion on the voice signal to obtain a first frequency response curve;

subtracting a reference frequency response curve from the first frequency response curve frequency point by frequency point to obtain a difference curve;

calculating an amplitude average value of a target frequency band of the difference curve, wherein the target frequency band is the complete frequency band of the difference curve or an amplitude variation frequency band within the complete frequency band;

the process of judging whether the voice signal is a shielded voice signal comprises:

judging whether the amplitude average value of the target frequency band is greater than a first preset value;

if so, judging that the voice signal is an unshielded voice signal;

if not, judging that the voice signal is a shielded voice signal.

Optionally, after acquiring the voice signal collected by the microphone, the voice compensation method further includes:

performing time-frequency conversion on the voice signal to obtain a first frequency response curve;

subtracting a reference frequency response curve from the first frequency response curve frequency point by frequency point to obtain a difference curve;

dividing the target frequency band of the difference curve into a plurality of sub-bands;

calculating the amplitude average value of each sub-band;

determining the minimum of the amplitude average values of all the sub-bands;

the process of determining the obstruction type corresponding to the shielded voice signal comprises:

determining the obstruction type from the minimum value.

Optionally, the process of determining the obstruction type from the minimum value includes:

acquiring N second preset values corresponding to the sub-frequency band where the minimum value is located, wherein N is a positive integer;

obtaining N+1 judgment intervals based on the N second preset values, wherein any value in the i-th judgment interval is smaller than any value in the (i+1)-th judgment interval, and i = 1, 2, …, N;

and determining the type of the obstruction according to the judgment interval where the minimum value is positioned.

Optionally, after acquiring the voice signal collected by the microphone, the voice compensation method further includes:

judging whether the voice signal comprises compensation triggering information or not;

the process of judging whether the voice signal is a shielded voice signal comprises:

when the voice signal comprises the compensation triggering information, judging whether the voice signal is a shielded voice signal.

Optionally, the speech compensation method further includes:

acquiring a target frequency response curve of an unshielded voice signal and, for each obstruction type, a test frequency response curve of a voice signal shielded by an obstruction of that type;

subtracting the target frequency response curve from each test frequency response curve frequency point by frequency point to obtain reference difference curves;

determining a plurality of sub-bands of the reference difference curve;

calculating the average value of the amplitude of each reference difference curve in each sub-frequency band;

and determining a first preset value or a second preset value corresponding to the sub-frequency band according to all the amplitude average values of the same sub-frequency band.
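
The text does not state how a preset value is derived from the amplitude average values of a sub-band. As one possible reading, the sketch below places each second preset value midway between the averages of two adjacent obstruction types. The function name, the dB units and the midpoint rule are all assumptions made for illustration.

```python
def second_presets_for_subband(type_means_db):
    """Derive second preset values for ONE sub-band.

    type_means_db maps each obstruction type to the amplitude average value
    (assumed to be in dB) of its reference difference curve in this sub-band.
    Assumption: each preset is the midpoint between two adjacent averages,
    so T obstruction types yield T-1 presets and therefore T intervals.
    """
    ordered = sorted(type_means_db.values())
    return [(lo + hi) / 2 for lo, hi in zip(ordered, ordered[1:])]


# Hypothetical calibration data for one sub-band.
presets = second_presets_for_subband({"N95": -10.0, "surgical": -4.0})
```

With two obstruction types this gives a single preset value, which matches the case of N = 1 and two judgment intervals.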

In order to solve the above technical problem, the present application further provides a speech compensation apparatus applied to an earphone device, where the earphone device includes a microphone, and the speech compensation apparatus includes:

an acquisition module, configured to acquire a voice signal collected by the microphone;

a first judgment module, configured to judge whether the voice signal is a shielded voice signal and, if so, to trigger a determining module;

the determining module, configured to determine a compensation EQ parameter corresponding to the shielded voice signal;

and a compensation module, configured to compensate the current EQ parameter of the microphone according to the compensation EQ parameter.

In order to solve the above technical problem, the present application further provides an earphone device, including:

a memory for storing a computer program;

a processor for implementing the steps of the speech compensation method as described in any of the above when said computer program is executed.

To solve the above technical problem, the present application further provides a computer-readable storage medium having a computer program stored thereon, where the computer program is executed by a processor to implement the steps of the speech compensation method according to any one of the above.

With the voice compensation method provided by the application, when the voice signal collected by the microphone is judged to be a shielded voice signal, the current EQ parameter of the microphone is compensated with the compensation EQ parameter corresponding to the shielded voice signal. This compensates the voice signal collected by the microphone, improves its clarity, and thereby improves the voice call quality of an earphone user who is wearing a mask. The application also provides a voice compensation device, an earphone device and a computer-readable storage medium, which have the same beneficial effects as the voice compensation method.

Drawings

In order to more clearly illustrate the embodiments of the present application, the drawings needed for the embodiments will be briefly described below, and it is obvious that the drawings in the following description are only some embodiments of the present application, and that other drawings can be obtained by those skilled in the art without inventive effort.

FIG. 1 is a flow chart illustrating steps of a speech compensation method according to the present application;

FIG. 2 is a frequency spectrum diagram of a speech signal provided in the present application;

FIG. 3 is a spectrum diagram of another speech signal provided by the present application;

FIG. 4 is a schematic diagram of a compensated EQ parameter provided herein;

FIG. 5 is a schematic structural diagram of a speech compensation apparatus provided in the present application;

FIG. 6 is a schematic structural diagram of an earphone device provided in the present application.

Detailed Description

The core of the application is to provide a voice compensation method, a voice compensation device, an earphone device and a computer-readable storage medium that improve the clarity of a voice signal and thereby the voice call quality of an earphone user who is wearing a mask.

In order to make the objects, technical solutions and advantages of the embodiments of the present application clearer, the technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application, and it is obvious that the described embodiments are some embodiments of the present application, but not all embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present application.

Referring to FIG. 1, which is a flowchart of the steps of a speech compensation method according to the present application, the speech compensation method includes:

S101: acquiring a voice signal collected by the microphone;

It can be understood that the earphone device includes a microphone. When the earphone user makes a voice call or issues a voice command through the earphone device, the microphone collects the user's voice signal, the built-in noise reduction algorithm processes it, and the operation corresponding to the voice signal is then performed. In this embodiment, the voice signal collected by the microphone may be acquired at a preset acquisition period, or after a trigger signal is received.

S102: judging whether the voice signal is a shielded voice signal, and if so, executing S103;

Specifically, after the voice signal is acquired, it is judged whether the voice signal has been shielded by an obstruction, that is, whether an obstruction lies on the transmission path of the voice signal from the sound source to the microphone. There are various ways to make this judgment: for example, the obstruction may be detected by a sensor, or its presence may be inferred from feature information of the voice signal. If an obstruction lies on the transmission path, it blocks part of the acoustic signal and reduces the clarity of the voice signal collected by the microphone. The feature information of a voice signal collected with an obstruction present therefore differs from that of a voice signal collected without one, and this feature information can be read from the frequency response curve of the voice signal.

S103: determining a compensation EQ parameter corresponding to the shielded voice signal;

S104: compensating the current EQ parameter of the microphone according to the compensation EQ parameter.

Specifically, when the voice signal is judged to be shielded by an obstruction, the compensation EQ parameter corresponding to the shielded voice signal is determined and input to the microphone end to compensate the current EQ parameter there, so that the microphone processes the voice signal with the compensated EQ parameter and the clarity of the voice signal improves. Voice call quality improves when the earphone user makes a call through the earphone device, and control accuracy improves when the user controls the earphone device by voice. If the voice signal is an unshielded voice signal, no compensation is needed and the microphone keeps operating with its current EQ parameter.

It can be seen that, in this embodiment, when the voice signal collected by the microphone is judged to be a shielded voice signal, the current EQ parameter of the microphone is compensated with the compensation EQ parameter corresponding to the shielded voice signal. The voice signal collected by the microphone is thereby compensated, its clarity improves, and the voice call quality of an earphone user who is wearing a mask improves.
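
The control flow of S102 to S104 can be sketched as follows. This is a minimal illustration, not the application's implementation: `detect` and `choose_eq` are hypothetical callables supplied by the caller, and the EQ parameters are assumed to be per-band gains in dB that are compensated by addition, which the text does not specify.

```python
def compensate_step(frame, current_eq_db, detect, choose_eq):
    """Run S102-S104 on one acquired frame (S101 is done by the caller).

    detect(frame)    -> True if the frame is judged to be shielded (S102).
    choose_eq(frame) -> compensation EQ gains in dB, one per band (S103).
    Assumption: compensation means adding the gains band by band (S104).
    """
    if not detect(frame):
        # Unshielded: keep operating with the current EQ parameters.
        return list(current_eq_db)
    compensation_db = choose_eq(frame)
    return [cur + comp for cur, comp in zip(current_eq_db, compensation_db)]


# Hypothetical usage with stand-in callables.
new_eq = compensate_step(
    frame=[0.0] * 8,
    current_eq_db=[0.0, 1.0, 0.0],
    detect=lambda f: True,
    choose_eq=lambda f: [0.0, 3.0, 6.0],
)
```

Passing the detector and the selector as callables keeps the flow independent of how shielding is detected, whether by a sensor or from the frequency response curve.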

On the basis of the above-described embodiment:

as an alternative embodiment, if the voice signal is a shielded voice signal, the speech compensation method further includes:

determining the obstruction type corresponding to the shielded voice signal;

the process of determining the compensation EQ parameter corresponding to the shielded voice signal comprises:

selecting the compensation EQ parameter corresponding to the obstruction type.

Specifically, different types of obstruction block the voice signal to different degrees, so the clarity of the voice signal collected by the microphone differs. Take an earphone user wearing different types of mask as an example. Common mask types at present include the N95 mask and the ordinary medical surgical mask. In general, the N95 mask blocks the voice signal more strongly and the ordinary medical surgical mask more weakly, so a voice signal collected through an N95 mask is noticeably less clear than one collected through an ordinary medical surgical mask. Because different obstruction types lead to different clarity, applying the same compensation EQ parameter to every obstruction type may compensate poorly. In this embodiment the microphone therefore adapts its processing to the obstruction type that shielded the voice signal.

In this embodiment, each obstruction type corresponds to one set of compensation EQ parameters. When the voice signal collected by the microphone is judged to be a shielded voice signal, the obstruction type is determined, and the compensation EQ parameters corresponding to that type are selected and input to the microphone end to compensate the current EQ parameters there. The microphone then processes the voice signal with the compensated EQ parameters, which improves the clarity of the voice signal and the voice call quality of an earphone user who is wearing a mask.
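
The one-set-per-type correspondence can be pictured as a lookup table. The sketch below is illustrative only: the type names, the four-band layout and every gain value are invented placeholders, not figures from the application.

```python
# Hypothetical compensation EQ sets: per-band gains in dB for four bands.
# All values are placeholders chosen for illustration.
COMPENSATION_EQ_DB = {
    "N95": [0.0, 4.0, 8.0, 10.0],
    "surgical": [0.0, 2.0, 4.0, 5.0],
}


def select_compensation_eq(obstruction_type):
    """Return the compensation EQ set stored for the given obstruction type."""
    try:
        return COMPENSATION_EQ_DB[obstruction_type]
    except KeyError:
        raise ValueError(f"no compensation EQ stored for {obstruction_type!r}")


eq_for_n95 = select_compensation_eq("N95")
```

In this placeholder table the N95 entry carries larger gains than the surgical entry, reflecting the stronger blocking effect described above.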

As an alternative embodiment, after acquiring the voice signal collected by the microphone, the voice compensation method further includes:

performing time-frequency conversion on a voice signal to obtain a first frequency response curve;

subtracting the reference frequency response curve from the first frequency response curve frequency point by frequency point to obtain a difference curve;

calculating the amplitude average value of the target frequency band of the difference curve, wherein the target frequency band is the complete frequency band of the difference curve or the amplitude variation frequency band within the complete frequency band;

the process of judging whether the voice signal is a shielded voice signal comprises:

judging whether the amplitude average value of the target frequency band is greater than a first preset value;

if so, judging that the voice signal is an unshielded voice signal;

if not, judging that the voice signal is a shielded voice signal.

The reference frequency response curve is the frequency response curve of a voice signal collected without shielding.

Specifically, this embodiment judges whether the voice signal is shielded, and by which obstruction type, mainly from the energy loss of the voice signal across frequency. The voice signal currently collected by the microphone is converted from the time domain to the frequency domain by a Fast Fourier Transform (FFT) to obtain the first frequency response curve. The reference frequency response curve is subtracted from the first frequency response curve frequency point by frequency point to obtain the difference curve, and a target frequency band is determined on the difference curve. The target frequency band may be the complete frequency band of the difference curve; as shown in FIG. 2, the complete frequency band is 100 Hz to 8 kHz. To reduce the amount of data processing, the target frequency band may instead be the amplitude variation frequency band, that is, the part of the complete frequency band in which the amplitude changes most. Repeated tests show that the amplitude changes most in the high-frequency part of the frequency response curve, for example 1 kHz to 8 kHz as shown in FIG. 2, which makes the judgment easier, so the amplitude variation frequency band of the difference curve may be used as the target frequency band.

It can be understood that if the voice signal currently collected by the microphone is a shielded voice signal, the obstruction has blocked part of the voice signal and the clarity of the collected signal is reduced, so the amplitude of the first frequency response curve at some frequency points is lower than that of the reference frequency response curve at the same frequency points. The frequency-point-by-frequency-point differences are therefore mostly negative, the amplitudes of the difference curve in the target frequency band are mostly negative, and the amplitude average value is negative as well. The stronger the shielding effect, the smaller the amplitude at each frequency point of the difference curve. If the voice signal is unshielded, the amplitudes of the first frequency response curve and the reference frequency response curve differ little at each frequency point, and the amplitude average value of the difference curve in the target frequency band is larger than that of a shielded voice signal. Whether the voice signal is shielded can therefore be judged from the amplitude average value of the difference curve in the target frequency band: if it is greater than the first preset value, the voice signal is judged to be unshielded; otherwise it is judged to be shielded.
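
The judgment described above can be sketched with the standard library alone. The function names, the dB scaling, the use of bin indices to mark the target frequency band and the small floor that avoids log(0) are assumptions for illustration; the text fixes only the FFT, the point-by-point difference, the band average and the comparison with the first preset value.

```python
import cmath
import math


def fft(x):
    """Recursive radix-2 FFT; len(x) must be a power of two."""
    n = len(x)
    if n == 1:
        return [complex(x[0])]
    even, odd = fft(x[0::2]), fft(x[1::2])
    out = [0j] * n
    for k in range(n // 2):
        t = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        out[k], out[k + n // 2] = even[k] + t, even[k] - t
    return out


def response_db(frame):
    """Magnitude in dB at the non-negative frequency bins of one frame."""
    bins = fft(frame)[: len(frame) // 2 + 1]
    return [20 * math.log10(abs(c) + 1e-9) for c in bins]


def is_shielded(frame, reference_db, lo_bin, hi_bin, first_preset_db):
    """Judge shielding from the mean of (first - reference) over the target band.

    The target band is bins lo_bin..hi_bin-1; mapping bins to Hz is left
    to the caller and depends on the sampling rate.
    """
    first_db = response_db(frame)
    diff = [f - r for f, r in zip(first_db, reference_db)]
    band = diff[lo_bin:hi_bin]
    mean = sum(band) / len(band)
    # Greater than the first preset value means unshielded; otherwise shielded.
    return not (mean > first_preset_db)
```

A frame whose high-frequency components are attenuated relative to the reference produces a negative band average and is judged shielded once that average is at or below the chosen preset.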

As an alternative embodiment, after acquiring the voice signal collected by the microphone, the voice compensation method further includes:

performing time-frequency conversion on a voice signal to obtain a first frequency response curve;

subtracting the reference frequency response curve from the first frequency response curve frequency point by frequency point to obtain a difference curve;

dividing the target frequency band of the difference curve into a plurality of sub-bands;

calculating the amplitude average value of each sub-band;

determining the minimum of the amplitude average values of all the sub-bands;

the process of determining the obstruction type corresponding to the shielded voice signal comprises:

determining the obstruction type from the minimum value.

Specifically, the target frequency band of the difference curve may be divided into a plurality of sub-bands in ascending order of frequency. This embodiment uses the amplitude variation frequency band as the target frequency band for explanation. Assuming the target frequency band is 1 kHz to 8 kHz and the sub-bands are 1 kHz to 3 kHz, 3 kHz to 5 kHz, 5 kHz to 7 kHz and 7 kHz to 8 kHz, the amplitude average value of each sub-band is calculated as shown in Table 1, the spectrum difference table of the voice signal.

TABLE 1 Spectrum difference table of the voice signal

Sub-band                   A            B            C            D
Frequency range            1kHz~3kHz    3kHz~5kHz    5kHz~7kHz    7kHz~8kHz
Amplitude average value    a            b            c            d

The smaller the amplitude average value of a sub-band, the larger the difference between the first frequency response curve and the reference frequency response curve in that sub-band, and the more accurately the amplitude average value of that sub-band distinguishes the obstruction types. In this embodiment, therefore, the minimum of a, b, c and d is selected to determine the obstruction type.
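
The per-sub-band averaging and the choice of the minimum can be sketched as below. The function names and the convention that each sub-band includes its lower edge and excludes its upper edge are assumptions; the sub-band edges follow the 1 kHz to 8 kHz example above.

```python
def subband_means(diff_db, freqs_hz, edges_hz):
    """Amplitude average value of the difference curve in each sub-band.

    diff_db[i] is the difference-curve amplitude at frequency freqs_hz[i].
    Sub-band j covers edges_hz[j] <= f < edges_hz[j + 1] (assumed convention).
    """
    means = []
    for lo, hi in zip(edges_hz, edges_hz[1:]):
        values = [d for d, f in zip(diff_db, freqs_hz) if lo <= f < hi]
        if not values:
            raise ValueError(f"no frequency points in {lo}-{hi} Hz")
        means.append(sum(values) / len(values))
    return means


def most_attenuated_subband(means):
    """Index and value of the smallest amplitude average value."""
    index = min(range(len(means)), key=means.__getitem__)
    return index, means[index]


# Sub-bands A-D from the example: 1-3, 3-5, 5-7 and 7-8 kHz.
EDGES_HZ = [1000, 3000, 5000, 7000, 8000]
```

For a difference curve whose averages are, say, -2, -5, -9 and -6 dB in sub-bands A to D, the second function returns index 2, that is, sub-band C.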

As an alternative embodiment, the process of determining the obstruction type from the minimum value includes:

acquiring N second preset values corresponding to the sub-frequency band where the minimum value is located, wherein N is a positive integer;

obtaining N+1 judgment intervals based on the N second preset values, wherein any value in the i-th judgment interval is smaller than any value in the (i+1)-th judgment interval, and i = 1, 2, …, N;

and determining the type of the obstruction according to the judgment interval where the minimum value is positioned.

Specifically, each sub-frequency band has one or more corresponding second preset values used for judging the type of the shielding object, and the number of second preset values is determined by the number of shielding object types. Assume there are two types of shielding object, namely the N95 mask and the general medical surgical mask. It can be understood that, because the N95 mask attenuates the acoustic signal more strongly than the general medical surgical mask, the amplitude at each frequency point of difference curve 1, obtained from the acoustic signal shielded by the N95 mask, is smaller than the amplitude at the same frequency point of difference curve 2, obtained from the acoustic signal shielded by the general medical surgical mask. In this case each sub-band corresponds to one second preset value; let the second preset values corresponding to sub-bands A, B, C and D be a01, b01, c01 and d01. The minimum value is then selected from the amplitude averages a, b, c and d of the sub-bands of the currently acquired voice signal. If c is the minimum value, the type of the shielding object is best distinguished in sub-band C. Two judgment intervals can be obtained based on the second preset value c01, for example (c00, c01) and [c01, c0N), where c00 and c0N are a preset lower judgment limit and a preset upper judgment limit, respectively. c is compared with c01 to determine which judgment interval c falls in: if c is in (c00, c01), the current shielding object is determined to be the N95 mask; if c is in [c01, c0N), it is determined to be the general medical surgical mask.
Assuming that there are three types of shielding object, namely the N95 mask, the general medical surgical mask and the cotton mask, the second preset values corresponding to sub-band C are c01 and c02, with c01 < c02. The judgment intervals obtained from c01 and c02 can be (c00, c01), [c01, c02] and (c02, c0N), and the type of the shielding object is determined by the interval in which c falls: for example, the N95 mask if c is in (c00, c01), the cotton mask if c is in [c01, c02], and the general medical surgical mask if c is in (c02, c0N). Of course, the end points of each judgment interval may be selected according to actual engineering requirements, provided that any value in the i-th judgment interval is smaller than any value in the (i + 1)-th judgment interval; the present application does not specifically limit this.
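The interval lookup described above can be sketched as follows. This is a minimal illustration, not the implementation of the present application: the function name `classify_occluder` is hypothetical, the lower and upper judgment limits are omitted, and a value equal to a second preset value is assumed to fall in the upper interval, which is only one of the end-point choices the description allows. The threshold -2.8 is the sub-band C example value given later in this description.

```python
from bisect import bisect_right


def classify_occluder(min_avg, thresholds, labels):
    """Map the minimum sub-band amplitude average to a shielding object type.

    thresholds: the N second preset values, in ascending order.
    labels: N + 1 type names, ordered from the lowest judgment
            interval to the highest (assumed ordering).
    """
    if len(labels) != len(thresholds) + 1:
        raise ValueError("need exactly one label per judgment interval")
    if list(thresholds) != sorted(thresholds):
        raise ValueError("thresholds must be in ascending order")
    # bisect_right places a value equal to a threshold in the upper
    # interval, so the intervals are (-inf, t1), [t1, t2), ..., [tN, +inf).
    return labels[bisect_right(thresholds, min_avg)]


# Two shielding object types, one second preset value.
two_types = ["N95 mask", "general medical surgical mask"]
result = classify_occluder(-3.0, [-2.8], two_types)
```

With N second preset values the lookup always yields one of N + 1 types, so supporting a further mask type only requires one more threshold and one more label.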

As an alternative embodiment, after acquiring the voice signal collected by the microphone, the voice compensation method further includes:

judging whether the voice signal comprises compensation triggering information or not;

the process of judging whether the voice signal is the voice signal after being shielded comprises the following steps:

and when the voice signal comprises the compensation triggering information, judging whether the voice signal is the voice signal after shielding.

Specifically, the compensation triggering information may include common call-opening phrases such as "wei" ("喂", the usual Chinese telephone greeting) and "hello". When such information is detected in the voice signal, it is determined that the voice compensation function needs to be started; it is then judged whether the voice signal is a shielded voice signal, and the microphone EQ parameter is compensated according to the judgment result.

Of course, the compensation trigger information may include other phrases besides the above common phrases, and may be preset in the headset according to actual requirements.
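As a rough sketch of the trigger check, assume that some upstream keyword-spotting or speech-recognition stage, which is not described in this application, already provides a text transcript of the voice signal. The function name `contains_trigger` and the default phrase list are hypothetical.

```python
def contains_trigger(transcript, trigger_phrases=("hello",)):
    """Return True if the transcript contains any preset trigger phrase.

    transcript: text produced by an assumed upstream recognizer.
    trigger_phrases: phrases preset in the headset (hypothetical default).
    """
    text = transcript.casefold()
    return any(phrase.casefold() in text for phrase in trigger_phrases)


# The shielding judgment would only run when this check succeeds.
should_check_shielding = contains_trigger("Hello, can you hear me?")
```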

As an alternative embodiment, the speech compensation method further comprises:

acquiring a target frequency response curve of an unshielded voice signal and a test frequency response curve of the voice signal shielded by the shielding object of each shielding object type;

each test frequency response curve and the target frequency response curve are subtracted frequency point by frequency point to obtain a reference difference curve;

determining a plurality of sub-bands of a reference difference curve;

calculating the average value of the amplitude of each reference difference curve in each sub-frequency band;

and determining a first preset value or a second preset value corresponding to the sub-frequency band according to the average value of all the amplitude values of the same sub-frequency band.

Specifically, the present embodiment defines how to determine the first preset value and the second preset value. To facilitate understanding, the following takes two mask types, the N95 mask and the general medical surgical mask, as an example. To improve the judgment accuracy, the earphone user records several times while wearing each of the two masks. Suppose the earphone user records 5 times while wearing the N95 mask, giving 5 recording signals; time-frequency conversion is performed on each of the 5 recording signals to obtain 5 test frequency response curves, and these 5 curves are averaged frequency point by frequency point to give the test frequency response curve L1 corresponding to the N95 mask. Likewise, the earphone user records 5 times while wearing the general medical surgical mask, giving 5 recording signals; time-frequency conversion is performed on each of them to obtain 5 test frequency response curves, which are averaged frequency point by frequency point to give the test frequency response curve L2 corresponding to the general medical surgical mask. Finally, the earphone user records 5 times without wearing any mask, that is, without shielding, giving 5 recording signals; time-frequency conversion is performed on each of them to obtain 5 frequency response curves, which are averaged frequency point by frequency point to give the target frequency response curve L0.
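The point-by-point averaging of repeated recordings can be sketched as below. The sketch assumes the time-frequency conversion has already been done, so each curve is simply a list of amplitudes sampled at the same frequency points; the function name `average_curves` and the sample values are illustrative only.

```python
def average_curves(curves):
    """Average several frequency response curves frequency point by frequency point.

    curves: list of equal-length lists, one amplitude per frequency point.
    """
    if not curves:
        raise ValueError("at least one curve is required")
    n_points = len(curves[0])
    if any(len(curve) != n_points for curve in curves):
        raise ValueError("all curves must share the same frequency points")
    # zip(*curves) groups the amplitudes measured at the same frequency point.
    return [sum(points) / len(curves) for points in zip(*curves)]


# Two made-up recordings with two frequency points each.
averaged = average_curves([[1.0, 2.0], [3.0, 4.0]])
```

The same helper would produce L1, L2 and L0 from their respective groups of recordings.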

The test frequency response curve L1 and the target frequency response curve L0 are subtracted frequency point by frequency point to obtain a first reference difference curve, shown as curve i in fig. 3; the test frequency response curve L2 and the target frequency response curve L0 are subtracted frequency point by frequency point to obtain a second reference difference curve, shown as curve ii in fig. 3. The frequency range of the two reference difference curves is divided into a plurality of sub-frequency bands: 100 Hz-1 kHz, 1 kHz-3 kHz, 3 kHz-5 kHz, 5 kHz-7 kHz, 7 kHz-8 kHz and 1 kHz-8 kHz. Referring to fig. 3, wearing a mask mainly affects the high-frequency range, whereas in the 100 Hz-1 kHz band the energy loss of the voice signal changes little, which is inconvenient for analysis; data in this band are therefore not processed in the subsequent steps, which reduces the amount of data processing.
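A minimal sketch of the difference curve and the sub-band averages follows. It assumes the difference is taken as test curve minus target curve, which matches the negative averages reported for the shielded signals, and that each sub-band includes its lower edge and excludes its upper edge; both conventions, the function names and the sample data are assumptions for illustration.

```python
def difference_curve(test_curve, target_curve):
    """Subtract the target curve from the test curve at each frequency point."""
    if len(test_curve) != len(target_curve):
        raise ValueError("curves must share the same frequency points")
    return [t - r for t, r in zip(test_curve, target_curve)]


def band_average(freqs_hz, curve, low_hz, high_hz):
    """Average the curve over the frequency points with low_hz <= f < high_hz."""
    values = [v for f, v in zip(freqs_hz, curve) if low_hz <= f < high_hz]
    if not values:
        raise ValueError("no frequency points fall inside the sub-band")
    return sum(values) / len(values)


# Made-up data: four frequency points and a flat target curve.
freqs = [1000, 2000, 3000, 4000]
diff = difference_curve([-1.0, -3.0, -5.0, -7.0], [0.0, 0.0, 0.0, 0.0])
avg_1k_3k = band_average(freqs, diff, 1000, 3000)
```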

The amplitude average of each reference difference curve in each sub-band is calculated, giving Table 2.

Table 2. Spectrum differences of voice signals shielded by the N95 mask or the general medical surgical mask

It can be understood that sub-band F is used for judging whether the voice signal is a shielded voice signal, so the first preset value can be determined from the amplitude averages of sub-band F. The amplitude average in sub-band F is -2.1 for the voice signal recorded with the N95 mask and -1.1 for the voice signal recorded with the general medical surgical mask, so the first preset value can be set to a value greater than -1.1, for example -0.7. If the amplitude average of the currently acquired voice signal in the range of 1 kHz to 8 kHz is greater than -0.7, the earphone user is not currently wearing a mask and the voice signal is an unshielded voice signal; otherwise, the earphone user is wearing a mask and the voice signal is a shielded voice signal.
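The shielding judgment reduces to a single comparison, sketched here with the example first preset value of -0.7 from the text. The function name `is_shielded` is hypothetical, and the input is assumed to be the amplitude average already computed over the 1 kHz to 8 kHz band.

```python
def is_shielded(band_avg, first_preset=-0.7):
    """Judge whether the voice signal is shielded.

    band_avg: amplitude average of the difference curve over 1 kHz-8 kHz.
    Returns False only when band_avg is greater than the first preset value.
    """
    return not band_avg > first_preset


# The two reference averages from the text both count as shielded.
n95_shielded = is_shielded(-2.1)
surgical_shielded = is_shielded(-1.1)
```

A value exactly equal to the first preset value is treated as shielded here, because the text only classifies values greater than the preset as unshielded.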

Specifically, the choice of the second preset value is described by taking sub-band C as an example; the values for the other sub-bands are chosen in the same way. Referring to Table 2, the amplitude average in sub-band C is -3.7 for the voice signal recorded with the N95 mask and -1.9 for the voice signal recorded with the general medical surgical mask, so the second preset value corresponding to sub-band C can be taken between -3.7 and -1.9, for example -2.8. Let c be the amplitude average in sub-band C of the shielded voice signal currently acquired by the microphone: when c is less than -2.8, the shielding object is determined to be the N95 mask, and when c is greater than or equal to -2.8, it is determined to be the general medical surgical mask.
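One way to pick second preset values is to take the midpoint between the averages of adjacent mask types, which reproduces the -2.8 example. The text only requires a value between the two averages, so the midpoint rule and the function name `midpoint_thresholds` are assumptions of this sketch.

```python
def midpoint_thresholds(type_averages):
    """Derive second preset values for one sub-band.

    type_averages: dict mapping shielding object type to its sub-band
                   amplitude average from the reference recordings.
    Returns (labels, thresholds): labels sorted from the lowest average
    to the highest, and the midpoints between adjacent averages.
    """
    ordered = sorted(type_averages.items(), key=lambda item: item[1])
    labels = [name for name, _ in ordered]
    values = [value for _, value in ordered]
    thresholds = [(low + high) / 2 for low, high in zip(values, values[1:])]
    return labels, thresholds


# Sub-band C averages quoted in the text.
labels, thresholds = midpoint_thresholds(
    {"general medical surgical mask": -1.9, "N95 mask": -3.7}
)
```

For the two averages above, the single threshold comes out at approximately -2.8, up to floating-point rounding.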

Specifically, different mask types correspond to different compensated EQ parameters. The compensated EQ parameters of the N95 mask and of the general medical surgical mask are both shown in fig. 4.

Referring to fig. 5, fig. 5 is a schematic structural diagram of a voice compensation apparatus provided in the present application, applied to a headset device, where the headset device includes a microphone, and the voice compensation apparatus includes:

the acquisition module 11 is configured to acquire a voice signal acquired by a microphone;

the first judging module 12 is configured to judge whether the voice signal is a shielded voice signal, and if so, trigger the determining module 13;

a determining module 13, configured to determine a compensated EQ parameter corresponding to the shielded speech signal;

and the compensation module 14 is configured to compensate the current EQ parameter of the microphone according to the compensated EQ parameter.

It can be seen that, in this embodiment, when it is determined that the voice signal collected by the microphone is the voice signal after being shielded, the current EQ parameter of the microphone is compensated by the compensation EQ parameter corresponding to the voice signal after being shielded, so that the compensation of the voice signal collected by the microphone is realized, the definition of the voice signal is improved, and the voice call quality of the earphone user when wearing the mask is improved.
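The flow through the judging, determining and compensation modules can be sketched as follows. The application does not state how the compensated EQ parameter is combined with the current one; this sketch assumes both are per-band gains in dB that are simply added. The band name, the gain values and the function names are hypothetical placeholders, not values from fig. 4.

```python
def compensate_eq(current_eq_db, compensation_eq_db):
    """Add per-band compensation gains (dB) to the current EQ gains (assumed rule)."""
    return {
        band: gain + compensation_eq_db.get(band, 0.0)
        for band, gain in current_eq_db.items()
    }


def process_frame(band_avg, current_eq_db, compensation_eq_db, first_preset=-0.7):
    """Return the EQ gains to use for one analysed voice signal."""
    if band_avg > first_preset:
        # Unshielded voice signal: leave the microphone EQ unchanged.
        return dict(current_eq_db)
    # Shielded voice signal: apply the compensation gains.
    return compensate_eq(current_eq_db, compensation_eq_db)


# Hypothetical single-band EQ and compensation gain.
current = {"5-7kHz": 0.0}
compensation = {"5-7kHz": 3.0}
shielded_eq = process_frame(-2.1, current, compensation)
```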

As an alternative embodiment, if the speech signal is an occluded speech signal, the speech compensation apparatus further includes:

the type acquisition module is used for determining the type of the sheltering object corresponding to the sheltered voice signal;

the determining module 13 is specifically configured to:

a compensated EQ parameter corresponding to the occlusion type is selected.

As an optional embodiment, after acquiring the voice signal collected by the microphone, the voice compensation apparatus further includes a first data processing module, where the first data processing module includes:

the first conversion unit is used for carrying out time-frequency conversion on the voice signals to obtain a first frequency response curve;

the first calculating unit is used for subtracting the first frequency response curve and the reference frequency response curve frequency point by frequency point to obtain a difference curve;

the second calculating unit is used for calculating the average value of the amplitude of the target frequency band of the difference curve, wherein the target frequency band is the complete frequency band of the difference curve or the amplitude change frequency band in the complete frequency band;

the first judging module 12 is specifically configured to: judge whether the amplitude average of the target frequency band is greater than a first preset value;

if so, judging the voice signal as an unshielded voice signal;

if not, the voice signal is judged to be the voice signal after being shielded.

As an optional embodiment, the speech compensation apparatus further includes a second data processing module, where the second data processing module includes:

the second conversion unit is used for carrying out time-frequency conversion on the voice signal to obtain a first frequency response curve;

the third calculating unit is used for subtracting the first frequency response curve and the reference frequency response curve frequency point by frequency point to obtain a difference curve;

a fourth calculating unit, configured to divide the target frequency band of the difference curve into a plurality of sub-frequency bands;

the fifth calculating unit is used for calculating the amplitude average value of each sub-frequency band respectively and determining the minimum value in the amplitude average values of all the sub-frequency bands;

the type obtaining module is specifically configured to:

determining an obstruction type from the minimum value.

As an alternative embodiment, the process of determining the shade type from the minimum value includes:

acquiring N second preset values corresponding to the sub-frequency band where the minimum value is located, wherein N is a positive integer;

obtaining N + 1 judgment intervals based on the N second preset values, wherein any value in the i-th judgment interval is smaller than any value in the (i + 1)-th judgment interval, and i = 1, 2, …, N;

and determining the type of the obstruction according to the judgment interval where the minimum value is positioned.

As an alternative embodiment, the speech compensation apparatus further comprises:

the second judgment module is used for judging whether the voice signal comprises compensation trigger information or not;

the first judging module 12 is specifically configured to:

and when the voice signal comprises the compensation triggering information, judging whether the voice signal is the voice signal after shielding.

As an optional embodiment, the speech compensation apparatus further includes a preprocessing module, and the preprocessing module includes:

the curve acquisition unit is used for acquiring a target frequency response curve of an unshielded voice signal and a test frequency response curve of the voice signal shielded by the shielding object of each shielding object type;

the sixth calculating unit is used for subtracting each test frequency response curve and the target frequency response curve frequency point by frequency point to obtain a reference difference value curve;

the seventh calculating unit is used for determining a plurality of sub-frequency bands of the reference difference curve and calculating the amplitude average value of each reference difference curve in each sub-frequency band;

and the preset value determining unit is used for determining a first preset value or a second preset value corresponding to the sub-frequency band according to the average value of all the amplitude values of the same sub-frequency band.

Referring to fig. 6, fig. 6 is a schematic structural diagram of an earphone device provided in the present application, where the earphone device includes:

a memory 21 for storing a computer program;

a processor 22 for implementing the steps of the speech compensation method as described in any of the above embodiments when executing the computer program.

Specifically, the memory 21 includes a nonvolatile storage medium and an internal memory. The nonvolatile storage medium stores an operating system and computer-readable instructions, and the internal memory provides an environment for running the operating system and the computer-readable instructions stored in the nonvolatile storage medium. The processor 22 provides the earphone device with calculation and control capabilities, and when executing the computer program stored in the memory 21, the following steps may be implemented: acquiring a voice signal acquired by a microphone; judging whether the voice signal is a shielded voice signal; if so, determining a compensated EQ parameter corresponding to the shielded voice signal; and compensating the current EQ parameter of the microphone according to the compensated EQ parameter.

It can be seen that, in this embodiment, when it is determined that the voice signal collected by the microphone is the voice signal after being shielded, the current EQ parameter of the microphone is compensated by the compensation EQ parameter corresponding to the voice signal after being shielded, so that the compensation of the voice signal collected by the microphone is realized, the definition of the voice signal is improved, and the voice call quality of the earphone user when wearing the mask is improved.

As an alternative embodiment, when the processor 22 executes the computer program stored in the memory 21, the following steps can be implemented: if the voice signal is a shielded voice signal, determining the type of the shielding object corresponding to the shielded voice signal; and selecting a compensated EQ parameter corresponding to the type of the shielding object.

As an alternative embodiment, when the processor 22 executes the computer program stored in the memory 21, the following steps can be implemented: performing time-frequency conversion on the voice signal to obtain a first frequency response curve; subtracting the first frequency response curve and the reference frequency response curve frequency point by frequency point to obtain a difference curve; calculating the amplitude average of the target frequency band of the difference curve, wherein the target frequency band is the complete frequency band of the difference curve or the amplitude variation frequency band within the complete frequency band; judging whether the amplitude average of the target frequency band is greater than a first preset value; if so, judging the voice signal to be an unshielded voice signal; if not, judging the voice signal to be a shielded voice signal.

As an alternative embodiment, when the processor 22 executes the computer program stored in the memory 21, the following steps can be implemented: performing time-frequency conversion on the voice signal to obtain a first frequency response curve; subtracting the first frequency response curve and the reference frequency response curve frequency point by frequency point to obtain a difference curve; dividing the target frequency band of the difference curve into a plurality of sub-frequency bands; calculating the amplitude average of each sub-frequency band; determining the minimum value among the amplitude averages of all the sub-frequency bands; and determining the type of the shielding object from the minimum value.

As an alternative embodiment, when the processor 22 executes the computer program stored in the memory 21, the following steps can be implemented: acquiring N second preset values corresponding to the sub-frequency band where the minimum value is located, wherein N is a positive integer; obtaining N + 1 judgment intervals based on the N second preset values, wherein any value in the i-th judgment interval is smaller than any value in the (i + 1)-th judgment interval, and i = 1, 2, …, N; and determining the type of the shielding object according to the judgment interval in which the minimum value is located.

As an alternative embodiment, when the processor 22 executes the computer program stored in the memory 21, the following steps can be implemented: after acquiring the voice signal acquired by the microphone, judging whether the voice signal includes compensation triggering information; and when the voice signal includes the compensation triggering information, judging whether the voice signal is a shielded voice signal.

As an alternative embodiment, when the processor 22 executes the computer program stored in the memory 21, the following steps can be implemented: acquiring a target frequency response curve of an unshielded voice signal and a test frequency response curve of the voice signal shielded by a shielding object of each shielding object type; subtracting each test frequency response curve and the target frequency response curve frequency point by frequency point to obtain a reference difference curve; determining a plurality of sub-frequency bands of the reference difference curve; calculating the amplitude average of each reference difference curve in each sub-frequency band; and determining a first preset value or a second preset value corresponding to a sub-frequency band according to all the amplitude averages of the same sub-frequency band.

In another aspect, the present application further provides a computer-readable storage medium, on which a computer program is stored, which, when being executed by a processor, implements the steps of the speech compensation method as described in any one of the above embodiments.

For the introduction of a computer-readable storage medium provided in the present application, please refer to the above embodiments, which are not described herein again.

The present application provides a computer-readable storage medium having the same advantageous effects as the above-described voice compensation method.

It is further noted that, in the present specification, relational terms such as first and second are used solely to distinguish one entity or action from another entity or action without necessarily requiring or implying any actual such relationship or order between such entities or actions. Also, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising a …" does not exclude the presence of other identical elements in a process, method, article, or apparatus that comprises the element.

The previous description of the disclosed embodiments is provided to enable any person skilled in the art to make or use the present application. Various modifications to these embodiments will be readily apparent to those skilled in the art, and the generic principles defined herein may be applied to other embodiments without departing from the spirit or scope of the application. Thus, the present application is not intended to be limited to the embodiments shown herein but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.
