Terminal microphone testing method and device, mobile terminal and storage medium

Document No.: 196483. Published: 2021-11-02.

Abstract: This technology, "Terminal microphone testing method and device, mobile terminal and storage medium", was designed by 王艳芬, 严锋贵 and 李应伟 on 2021-09-06. The invention relates to a method and an apparatus for detecting a microphone of a mobile terminal, a mobile terminal, and a storage medium. The method comprises: when the microphone of the mobile terminal is turned on, continuously acquiring the audio collected by the microphone and generating the spectrum corresponding to the audio; obtaining the energy of the high-frequency band and the energy of the low-frequency band of the spectrum, and the difference between the two; and, if the difference between the high-band energy value and the low-band energy value is higher than a threshold, notifying the mobile terminal that the microphone is blocked. The detection method automatically checks whether the sound channel corresponding to the microphone is blocked whenever an application turns on the recording function, which makes the terminal more convenient to use.

1. A detection method for a microphone of a mobile terminal is characterized by comprising the following steps:

acquiring audio collected by a microphone;

determining a high spectral energy value and a low spectral energy value of the audio;

and if the difference value between the high-frequency spectrum energy value and the low-frequency spectrum energy value is higher than a threshold value, informing the mobile terminal that microphone blockage occurs.

2. The detection method according to claim 1, wherein after said acquiring the audio collected by the microphone, the method further comprises:

performing framing processing on the audio to obtain continuous multi-frame data;

down-sampling the multi-frame data, and performing window processing on the down-sampled multi-frame data;

and carrying out Fourier transform on the windowed data to obtain a plurality of spectral band energy values of the audio.

3. The detection method of claim 2, wherein the down-sampling comprises:

and truncating the frequency spectrum corresponding to the audio so that the acquired data lies in the frequency band below 8000 Hz.

4. The detection method of claim 1, wherein the obtaining the high and low spectral energy comprises:

and filtering the frequency spectrum to obtain a high frequency band with a frequency greater than 2000 Hz and a low frequency band with a frequency less than 50 Hz.

5. The detection method of claim 1, wherein the difference between the high spectral energy value and the low spectral energy value is above a threshold value, comprising:

the low spectral energy value increases and the high spectral energy value decreases; or

the low spectral energy value is unchanged and the high spectral energy value decreases.

6. The method of claim 5, wherein the notifying the user equipment of the occurrence of microphone blockage when the high-band energy, the low-band energy and the energy difference satisfy a predetermined relationship further comprises:

acquiring the blocking time length of the sound channel;

and when the blocking time length is greater than or equal to a first time period, informing the user equipment of the occurrence of microphone blockage.

7. The method of claim 5, wherein the notifying the user equipment of the occurrence of microphone blockage when the high-band energy, the low-band energy and the energy difference satisfy a predetermined relationship further comprises:

acquiring the blocking time length of the sound channel and the partial or total recording time length to obtain a blocking time length ratio;

and when the blocking time length ratio is greater than or equal to a first ratio value, informing the user equipment that microphone blockage occurs.

8. The detection method according to claim 6 or 7, wherein the notifying the user equipment of the occurrence of microphone blockage comprises:

when it is confirmed that the user equipment is to be notified of the microphone blockage, informing the user in a pop-up window mode;

the pop-up window mode comprises a modal pop-up window or a non-modal pop-up window.

9. The detection method according to claim 8, further comprising:

and when the pop-up mode is the modal pop-up, sending a request to the user to disable the microphone.

10. The detection method according to claim 9, further comprising:

when the user confirms that the microphone is forbidden, the microphone function is closed;

when the user allows the microphone to be used, the microphone functions normally.

11. An apparatus for detecting a microphone of a mobile terminal, the apparatus comprising:

the first detection module is used for detecting whether the recording function of the microphone is started or not;

the acquisition module is used for acquiring external sound fragments through the corresponding sound channel by the microphone when the recording function is started;

the second detection module is used for detecting whether the sound channel is blocked or not when the recording function is started;

and the notification module is used for notifying a user when the detection result of the sound channel is blockage.

12. A mobile terminal, comprising:

a processor;

a memory for storing executable instructions of the processor;

wherein the processor is configured to perform the method of any of claims 1-7 via execution of the executable instructions.

13. A computer-readable storage medium, in which a computer program is stored, which computer program can be invoked by a processor to perform the method according to any one of claims 1 to 10.

Technical Field

The present application relates to the field of mobile terminal detection technologies, and in particular, to a method and an apparatus for detecting a microphone of a mobile terminal, and a storage medium.

Background

The technological progress makes mobile terminals such as mobile phones and tablet computers become indispensable communication tools in people's daily life and work, and a microphone is needed to record sound in the processes of using the mobile terminals to carry out conversation, voice or video chat, and the like, so the microphone is an indispensable component in the mobile phone terminal. The user may inadvertently jam the microphone with his or her finger while holding the device for use during a daily call or other activity. The detection method provided by the current mobile terminal is that in a special application program, a user needs to manually start a detection program for judging whether a microphone and a corresponding sound channel are blocked, and the use is inconvenient.

Disclosure of Invention

In view of the above problems, the present invention provides a method and an apparatus for detecting a microphone of a mobile terminal, a mobile terminal and a storage medium, which can detect a blockage degree of the microphone of the mobile terminal.

In a first aspect, a method for detecting a microphone of a mobile terminal is characterized by comprising the steps of acquiring audio collected by the microphone, determining a high-frequency-band energy value and a low-frequency-band energy value of the audio, and notifying the mobile terminal of microphone blockage if a difference value between the high-frequency-band energy value and the low-frequency-band energy value is higher than a threshold value.

In a second aspect, the present invention provides a microphone detection apparatus for a mobile terminal, wherein the apparatus comprises: the first detection module is used for detecting whether the recording function of the microphone is started. And the acquisition module is used for acquiring the external sound fragment through the corresponding sound channel by the microphone when the recording function is started. And the second detection module is used for detecting whether the sound channel is blocked or not when the recording function is started. And the notification module is used for notifying a user when the detection result of the sound channel is blockage.

In a third aspect, the invention provides a mobile terminal, which is characterized by comprising a processor and a memory, wherein the memory is used for storing executable instructions of the processor. Wherein the processor is configured to perform the method of the first aspect via execution of the executable instructions.

In a fourth aspect, the present invention provides a computer-readable storage medium. Wherein the computer-readable storage medium has stored thereon a computer program that can be invoked by a processor to perform the method according to the first aspect.

With the scheme provided by the invention, when the mobile terminal enters the recording state, the microphone acquires external sound and blockage detection of the sound channel corresponding to the microphone starts at the same time. If the blockage algorithm confirms that the sound channel corresponding to the microphone is blocked, a notification is sent to the user. Detection of the sound channel is thus started automatically, which reduces the complexity of user operation.

Drawings

In order to more clearly illustrate the technical solutions in the embodiments of the present invention, the drawings used in the description of the embodiments are briefly introduced below. The drawings in the following description are merely exemplary of the invention, and those skilled in the art may derive other drawings from them without inventive effort.

Fig. 1 illustrates a flow chart of a microphone detection method of a mobile terminal according to one embodiment of the present invention;

Fig. 2 illustrates a flow chart of a microphone detection method of a mobile terminal according to another embodiment of the present invention;

Fig. 3 is a schematic diagram illustrating a microphone detection method interface of a mobile terminal according to another embodiment of the present invention;

Fig. 4 is a schematic diagram illustrating a microphone detection method interface of a mobile terminal according to another embodiment of the present invention;

Fig. 5 shows an architectural block diagram of a microphone detection apparatus of a mobile terminal according to an embodiment of the present invention;

Fig. 6 shows a flow chart of a second detection module of the microphone detection apparatus of the mobile terminal according to one embodiment of the present invention;

Fig. 7 shows an architectural block diagram of a mobile terminal according to an embodiment of the present invention.

Detailed Description

In order to make the technical solutions better understood by those skilled in the art, the technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application.

Referring to Fig. 1, Fig. 1 is a flowchart illustrating a microphone detection method of a mobile terminal according to an embodiment of the present application. The method detects whether the sound channel corresponding to the microphone is blocked while external sound is being acquired. To this end, the technical scheme provided by the invention detects whether the recording function of the microphone is started, and uses the start of the recording function as the trigger for starting the algorithm that detects whether the sound channel corresponding to the microphone is blocked. As described in detail below with respect to the flow shown in Fig. 1, the method for detecting a microphone of the mobile terminal may specifically include the following steps:

Step S101, acquiring the audio collected by the microphone.

According to the technical scheme provided by the invention, when the recording function of the microphone is started, the mobile terminal acquires the audio collected by the microphone.

The mobile terminal may be any electronic terminal that includes a microphone, such as a mobile phone, a tablet computer, a laptop, or a wearable device, which is not limited herein.

It is understood that the number of the microphones on the mobile terminal may be one or more, and the number and types of the microphones are not limited in the present invention, as long as the microphones can receive the sound transmitted from the outside, which is within the scope of the microphone definition protected by the present invention.

It is to be understood that the position of the microphone is not particularly limited, and the position of the microphone relative to the mobile terminal may be on the bottom, top or back, and the position of the microphone relative to the mobile terminal is within the protection scope of the present invention.

It can be understood that the recording function refers to any function that receives sound transmitted from the outside. It covers cases in which external sound information is acquired, such as opening a recording APP, making a mobile call, converting voice input into text, and voice chat during a game. Specific APPs can be granted permission by the user manually adding them to a white list. Sound is also not limited to human speech: anything that an object propagates through a medium by vibration, in line with the general definition of sound, falls within the scope of the recording claimed in the present invention, and the sound involved in the recording function is not required to fall within the range that an ordinary human can hear and distinguish.

Step S102, determining a high-frequency spectrum energy value and a low-frequency spectrum energy value of the audio.

When the recording function is started, the microphone acquires external sound fragments through the corresponding sound channel. Whether the sound channel is blocked can be determined by performing frequency analysis on the acquired sound: the audio fragments are processed to obtain their frequency spectrum, and the high-frequency spectrum energy value and the low-frequency spectrum energy value of the audio are determined from it.

To obtain the high and low frequency bands of the audio, the recording data (PCM format) acquired by the microphone is framed, down-sampled and windowed, and an FFT (fast Fourier transform) is performed to calculate the power spectrum of the signal. The power spectrum is then interpolated to refine the frequency resolution, and the energy values of the low frequency band and the high frequency band are accumulated separately. The low frequency band and the high frequency band can also be obtained by low-pass filtering and high-pass filtering with a filter, respectively.
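The windowing, FFT and band-energy steps above can be sketched for a single frame as follows. This is a minimal illustration, not the patented implementation: the function name `band_energies_db`, the Hann window, the dB scaling, and the default band limits (borrowed from the [0, 200] Hz and [2000, f0/2] Hz bands of the Fig. 6 embodiment) are assumptions, and the interpolation step is omitted.

```python
import numpy as np

def band_energies_db(frame, sample_rate, low_band=(0.0, 200.0), high_band=(2000.0, None)):
    """Return (low_db, high_db): band energies of one PCM frame, in dB.

    Hypothetical helper. The band limits and the Hann window are assumptions.
    """
    frame = np.asarray(frame, dtype=np.float64)
    windowed = frame * np.hanning(len(frame))          # windowing
    spectrum = np.fft.rfft(windowed)                   # FFT of a real signal
    power = np.abs(spectrum) ** 2 / len(frame)         # power spectrum
    freqs = np.fft.rfftfreq(len(frame), d=1.0 / sample_rate)

    def energy_db(lo, hi):
        hi = sample_rate / 2.0 if hi is None else hi   # default upper edge: Nyquist
        mask = (freqs >= lo) & (freqs <= hi)
        # Small constant avoids log10(0) for silent bands.
        return 10.0 * np.log10(power[mask].sum() + 1e-12)

    return energy_db(*low_band), energy_db(*high_band)

# Usage: a 32 ms frame at 8000 Hz containing a 100 Hz tone.
t = np.arange(256) / 8000.0
low_db, high_db = band_energies_db(np.sin(2 * np.pi * 100 * t), 8000)
print(low_db > high_db)
```

Summing FFT bins inside each band plays the role of the low-pass and high-pass filtering mentioned in the text; a time-domain filter would be an equally valid choice.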

It is understood that the mobile terminal where the microphone is located communicates with the outside through the sound channel, through which external sound fragments are received and transmitted. The positions of the microphone and the sound channel are not particularly limited: the microphone may be aligned with the sound channel or offset from it at a certain angle, and the number of microphones need not match the number of sound channels. The number of microphones may be greater than or less than the number of sound channels, and the shape of the sound channel may be a circular truncated cone, a cylinder or a prism. The present invention is not limited in this respect, as long as the sound channel can convey external sound to the microphone.

It should be understood that the length and content of the sound segment are not particularly limited, and the sound segment can be regarded as a sound segment satisfying the requirements of the present invention as long as the sound segment satisfies the detection requirement of the blocking algorithm.

It should be understood that the algorithm for detecting whether the sound channel is blocked is not particularly limited. In some embodiments of the present invention, the algorithm includes high-pass filtering, low-pass filtering, band-pass filtering, and related averaging operations, used in combination.

Step S103, if the difference value between the high-frequency spectrum energy value and the low-frequency spectrum energy value is higher than a threshold value, informing the mobile terminal that microphone blockage occurs.

The difference between the energy in the low frequency band and the energy in the high frequency band is calculated and compared with a fixed threshold (an empirical value, e.g., -40 dB), and the comparison result is recorded. Compared with the unblocked case, when the microphone is blocked the low-band energy of the spectrum increases or stays unchanged while the high-band energy decreases, so the difference increases. When the difference is greater than the threshold, it is determined that microphone blockage has occurred, and the mobile terminal is notified.
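The comparison can be written as a one-line per-frame decision. The sketch below assumes that both band energies are expressed in dB and that the difference is taken as low band minus high band, which matches the statement that the difference grows when the microphone is blocked; the text does not state the sign convention explicitly. The function name is hypothetical and the default threshold is the example value from the text.

```python
def frame_is_blocked(low_db: float, high_db: float, threshold_db: float = -40.0) -> bool:
    """Flag one frame as blocked when (low - high) exceeds the threshold.

    Assumption: energies are in dB and the difference is low minus high.
    The default of -40 dB is the empirical example given in the text.
    """
    difference_db = low_db - high_db
    return difference_db > threshold_db

# Usage: strong high band relative to low band, then a suppressed high band.
print(frame_is_blocked(low_db=-80.0, high_db=-20.0))
print(frame_is_blocked(low_db=-30.0, high_db=-45.0))
```

In practice the threshold would be tuned per device, since it is described as an empirical value.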

In another embodiment of the present application, in addition to using the difference between the low-band energy and the high-band energy to determine that the microphone is blocked, the energy of the voice band (voiceBand) may be compared with another fixed threshold (an empirical value, such as -50 dB) to determine whether the microphone is currently picking up sound, that is, whether someone is currently speaking. Since the frequency of normal speech lies within [300, 3400] Hz, whether someone is currently speaking can be determined by comparing the spectrum in this range with a fixed empirical threshold. Combining the difference between the low-band and high-band energy with the voice-band comparison result makes it possible to further distinguish whether the current scene is speech with the microphone blocked or a blocked microphone with no speech.
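A minimal sketch of how the two comparisons might be combined is shown below. The enum `Scene`, the function `classify_scene`, and the rule that voice-band energy above the threshold means speech are illustrative assumptions; the default thresholds are the example values from the text.

```python
from enum import Enum

class Scene(Enum):
    NOT_BLOCKED = "not blocked"
    BLOCKED_WITH_SPEECH = "blocked, someone is speaking"
    BLOCKED_NO_SPEECH = "blocked, nobody is speaking"

def classify_scene(band_difference_db: float,
                   voice_band_db: float,
                   difference_threshold_db: float = -40.0,
                   voice_threshold_db: float = -50.0) -> Scene:
    """Combine the band-difference test with the voice-band test.

    Assumptions: band_difference_db is low-band minus high-band energy in dB,
    and speech is present when the [300, 3400] Hz energy exceeds the threshold.
    """
    if band_difference_db <= difference_threshold_db:
        return Scene.NOT_BLOCKED
    if voice_band_db > voice_threshold_db:
        return Scene.BLOCKED_WITH_SPEECH
    return Scene.BLOCKED_NO_SPEECH

# Usage
print(classify_scene(band_difference_db=-10.0, voice_band_db=-30.0).value)
```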

And when the detection result of the sound channel is blocked, informing a user that the sound channel is blocked.

It should be understood that the present invention is not limited to the manner of notifying users, and some embodiments of the invention include warning, popup, text and graphic prompts, which are all within the scope of the present invention.

It is understood that, in some embodiments of the present invention, whether to notify when the detection result of the sound channel is non-blocked is not limited, in some embodiments of the present invention, the user is notified when the detection result of the sound channel is non-blocked, and in other embodiments of the present invention, the user is not notified when the detection result of the sound channel is non-blocked, and the above embodiments are all within the protection scope of the present invention.

Referring to Fig. 2, Fig. 2 is a flowchart illustrating a microphone detection method of a mobile terminal according to another embodiment of the present invention.

Steps S201 and S202, which acquire the audio collected by the microphone and determine the high-frequency spectrum energy value and the low-frequency spectrum energy value of the audio, are the same as described for Fig. 1 and are not described herein again.

In an embodiment of the present application, in step S203, if the difference between the high-band energy and the low-band energy is higher than a threshold, microphone blockage has occurred. When the recording function is started, the microphone acquires an external sound fragment through the corresponding sound channel, and whether the sound channel is blocked is detected.

Step S204, acquiring the blocking time length of the sound channel, and informing the user of the microphone blockage when the blocking time length is greater than or equal to a first time period.

Detecting the sound channel blockage includes obtaining the blocking time length of the sound channel; when the blocking time length is greater than or equal to a first time period, it is determined that the user needs to be informed by a pop-up window. In this embodiment, the blocking time length is used as a factor in deciding whether to prompt the user. The result of comparing the blocking time length with the first time period is either greater than or equal to the first time period, or less than the first time period. The blocking time length of the sound channel is measured from the moment a blockage is first detected.

For example, in some embodiments, the blocking time length of the sound channel is acquired to be 3.5s, the first time period is acquired to be 3s, and at this time, the blocking time length is greater than or equal to the first time period, and then the detection result of the sound channel is determined to be blocking.

For example, in some embodiments, the blocking time length of the sound channel is obtained to be 2.5s, the first time period is obtained to be 3s, and at this time, the blocking time length is smaller than the first time period, and it is determined that the detection result of the sound channel is non-blocking.

In this way, if the user's finger blocks the microphone only briefly and the user adjusts the finger position in time so that the microphone pickup channel is exposed again, the user is not disturbed, which improves the experience of using the mobile device.

The above-mentioned blocking time length may be understood as a series of frames whose detection result is blocked; the specific blockage detection method is described in detail below. It is understood that the blocking time length and the first time period of the sound channel are not particularly limited by the present invention, and the above embodiments are all within the scope of the present invention.
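The duration check can be sketched over per-frame detection results. In this illustration the blocking time length is taken as the length of the current run of consecutive blocked frames, and the run resets when an unblocked frame appears; the reset rule and the function name are assumptions, since the text only says the duration is measured from the first detected blockage.

```python
from typing import Iterable

def should_notify_by_duration(frame_flags: Iterable[bool],
                              frame_seconds: float,
                              first_period_seconds: float) -> bool:
    """Return True once a run of blocked frames lasts at least the first time period.

    Assumption: an unblocked frame resets the running blockage duration.
    """
    run_length = 0
    for blocked in frame_flags:
        run_length = run_length + 1 if blocked else 0
        if run_length * frame_seconds >= first_period_seconds:
            return True
    return False

# Usage with the figures from the examples above (0.5 s frames for readability):
print(should_notify_by_duration([True] * 7, 0.5, 3.0))  # 3.5 s of blockage
print(should_notify_by_duration([True] * 5, 0.5, 3.0))  # 2.5 s of blockage
```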

In some embodiments, as shown in step S205, a blocking time length ratio is obtained, and when the ratio is greater than or equal to a first ratio value, the user equipment is notified that microphone blockage has occurred. The blocking time length of the sound channel and the partial or total time length of the recording are acquired to obtain a blockage ratio, and when the blockage ratio is greater than or equal to the first ratio value, it is determined that the user needs to be informed. The blockage ratio is calculated by dividing the blocking time length of the sound channel by the total time length of the recording.

For example, in some embodiments, the length of the blockage time of the sound channel is obtained to be 4s, the total recording time is 10s, the first ratio is 50%, and the blockage ratio is 40%, and then the detection result of the sound channel is determined to be non-blocked.

For example, in some embodiments, the blocking time length of the sound channel is 6s, the total recording time is 10s, the first ratio value is 50%, and the blockage ratio is 60%, so the detection result of the sound channel is determined to be blocked.

The total recording time in this embodiment may be understood as the time from the moment the microphone starts picking up external sound, i.e. the start of recording, to any moment during the recording. Specifically, blockage is reported when the number of frames whose detection result is blocked, divided by the total number of frames, exceeds a certain threshold. The specific blockage determination method is described in detail below. It should be understood that the blocking time length of the sound channel, the total recording time length, and the first ratio value are not particularly limited by the present invention, and the above embodiments are all within the scope of the present invention.
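The ratio test reduces to counting blocked frames. The sketch below uses hypothetical function names and treats an empty recording as a ratio of zero, which is an assumption.

```python
from typing import Sequence

def blockage_ratio(frame_flags: Sequence[bool]) -> float:
    """Fraction of frames whose detection result is blocked (0.0 if no frames)."""
    if not frame_flags:
        return 0.0
    blocked_frames = sum(1 for blocked in frame_flags if blocked)
    return blocked_frames / len(frame_flags)

def should_notify_by_ratio(frame_flags: Sequence[bool], first_ratio: float) -> bool:
    """Notify when the blockage ratio is greater than or equal to the first ratio value."""
    return blockage_ratio(frame_flags) >= first_ratio

# Usage with the figures from the examples above (one flag per second of recording):
four_of_ten = [True] * 4 + [False] * 6
six_of_ten = [True] * 6 + [False] * 4
print(blockage_ratio(four_of_ten), should_notify_by_ratio(four_of_ten, 0.5))
print(blockage_ratio(six_of_ten), should_notify_by_ratio(six_of_ten, 0.5))
```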

Referring to fig. 3 and 4, fig. 3 is a schematic diagram illustrating an interface of a microphone detection method of a mobile terminal according to another embodiment of the present invention. Fig. 4 is a schematic interface diagram illustrating a microphone detection method of a mobile terminal according to another embodiment of the present invention.

In another embodiment provided by the present application, when it is confirmed that the user equipment is to be notified of the microphone blockage, the user is notified by a pop-up window, which may be a modal pop-up or a non-modal pop-up. When the pop-up is modal, as shown in Fig. 4, a request is sent asking the user whether to disable the microphone. When the user confirms that the microphone is to be disabled, the microphone function is closed; when the user chooses to ignore the blockage reminder, the microphone function continues to operate. A modal pop-up is one that forces the user to act on it; until the user does so, no other operation can be performed. A non-modal pop-up, by contrast, as shown in Fig. 3, does not require a response and does not affect user operation; it usually has a time limit and disappears automatically after being displayed for a certain period.

In some embodiments, when the detection result of the sound channel is blocking, the user is notified in a pop-up mode, and the pop-up mode comprises a modal pop-up mode or a non-modal pop-up mode. The types of the pop-up window can be generally divided into two types, the modeless pop-up window is generally designed to tell the user the information content without the user performing related operations, and the modeless pop-up window types comprise Toast/HUD and Snackbar. Fig. 3 pertains to modeless pop-up, where the user does not need to perform additional operations on the pop-up.

A modal pop-up informs the user of information and requires the user to act on it. It interrupts what the user is doing and forces a choice; until the user responds, no other operation can be performed. Types of modal pop-up include alert/dialog, Actionbar, Popover and the like. Fig. 4 shows a modal pop-up on which the user must act. For example, clicking the "disable" button disables the microphone when the sound channel corresponding to the microphone is detected to be blocked, while clicking "ignore" ignores the blockage of the sound channel for the current recording session; the next time the recording function is started, the microphone blockage detection method provided by the present invention is performed again. It should be understood that the present invention is not limited as to the pop-up type, and both modal and non-modal pop-ups are within the scope of the present invention.

In some embodiments, no notification is sent to the user when the detection result of the sound channel is non-blocked. This reduces how often the user is disturbed, prevents notifications with no practical effect from interrupting normal use, and keeps the user engaged with what they are doing.

In some embodiments, when the detection result of the sound channel is blocked, the user is notified by a pop-up window. When the pop-up is modal, a request is sent asking the user whether to disable the microphone, and the user must actively respond on the mobile terminal; for example, selecting "disable" disables the microphone. Blockage is thus detected quickly and reported automatically when the recording function is started, without the user having to manually start a check of whether the sound channel corresponding to the microphone is blocked, which makes the terminal more convenient to use. A modal pop-up clearly draws the user's attention, and the modal interaction requires the user to state how they want the microphone blockage handled, leaving the choice to the user. When the user must use the recording function, it should still be provided even if the blocked sound channel degrades the sound quality, and the user can select the "ignore" option. When the user has high requirements for microphone sound quality, the user can disable the microphone and clean the sound channel by selecting the "disable" option.

When the pop-up is non-modal, the system automatically reminds the user that the microphone is currently blocked, for example because a finger or other foreign matter is covering the microphone sound guide hole. The user does not need to interact with the pop-up: the prompt disappears automatically after a preset time, or when the user touches another area of the screen, or when the blockage is cleared because the user has moved the finger and exposed the microphone pickup channel. With a non-modal pop-up, the user can decide to stop recording, to clear dust and foreign matter, or to adjust the finger position and continue the call or recording in the application. If microphone blockage is detected again after the user has dismissed the non-modal pop-up, the pop-up can be shown again. The non-modal pop-up may appear as a small box that disappears automatically after 1-2 seconds. The small box can appear at any position on the screen and contains only text, without icons; the text should be concise and not too long. In an embodiment of the application, other screen controls remain operable while the pop-up is shown, so normal use is not affected. The position of the pop-up is not limited; it can be placed at the top, the bottom or the center of the screen.

Referring to fig. 5, in some embodiments, the present invention provides a microphone testing apparatus 50 for a mobile terminal, the apparatus including: a first detecting module 501, configured to detect whether a recording function of the microphone is turned on. The obtaining module 503, when the recording function is turned on, obtains the external sound clip through the corresponding sound channel by the microphone. The second detecting module 505 detects whether the sound channel is blocked when the recording function is turned on. And a notification module 507 for notifying a user when the detection result of the sound channel is a blockage.

Referring to fig. 6, fig. 6 is a flowchart illustrating a second detection module of the microphone detection apparatus of the mobile terminal according to an embodiment of the invention.

As shown in the figure, the second detection module is implemented as a method. The microphone blockage detection module uses the [0, 200] Hz low band (Low Band), the [2000, f0/2] Hz high band (High Band) (where f0 is the signal sampling rate), the energy difference between the high and low bands, and the energy value of the [300, 3400] Hz voice band (Voice Band) as the detection feature values for identifying microphone blockage. The implementation steps are as follows:

In step S601, the microphone recording data is obtained in PCM format (for example, from a wav file) and divided into frames (Frame). Processing speech signals requires the Fourier transform, which assumes a stationary input signal. Over a long span a speech signal is clearly non-stationary, but a sufficiently short segment can be treated as stationary and Fourier-transformed; such a segment is called a frame. On the one hand, a frame must be short enough that the signal within it is stationary: the mouth shape cannot change significantly within a frame, so a frame should be shorter than a phoneme, and the frame length is typically less than 50 ms. On the other hand, a frame must contain enough vibration periods, because the Fourier transform analyzes frequency, and frequency can only be analyzed when the waveform repeats a sufficient number of times. Because a frame must contain several periods, the frame length is generally at least 20 milliseconds.

In step S602, the framed data is down-sampled to a sampling rate of 8000 Hz. This removes frequency content above the range of interest, which reduces the load on the subsequent calculations and discards information that is of no use. The down-sampled data is then windowed.
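The down-sampling in step S602 can be sketched as follows. This is only an illustration under stated assumptions: the function name `downsample` is hypothetical, the source rate is assumed to be an integer multiple of the target rate, and block averaging stands in for whatever anti-aliasing filter a real implementation would use, since the text does not specify one.

```python
def downsample(samples, src_rate, dst_rate=8000):
    """Reduce the sampling rate by an integer factor.

    Assumption: src_rate is an integer multiple of dst_rate. Each output
    sample is the mean of `factor` consecutive input samples, which acts
    as a crude low-pass filter before the rate is reduced.
    """
    if src_rate % dst_rate != 0:
        raise ValueError("src_rate must be an integer multiple of dst_rate")
    factor = src_rate // dst_rate
    # Drop the trailing samples that do not fill a complete block.
    usable = len(samples) - len(samples) % factor
    return [
        sum(samples[i:i + factor]) / factor
        for i in range(0, usable, factor)
    ]
```

At 8000 Hz the highest representable frequency is 4000 Hz, which still covers the [300, 3400] Hz voice band and the [2000, f0/2] Hz high band described above.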

In step S603, a windowing operation is performed on each frame obtained by framing and down-sampling, that is, the frame is multiplied by a window function. A window function is generally shaped like a downward-opening parabola: its values approach 0 at both ends and its central peak is 1. There are many types of window function, all of them existing techniques, and the choice is not limited here. The purpose of windowing is to make the amplitude of a frame taper gradually to 0 at both ends before the Fourier transform, which makes the peaks in the spectrum narrower and reduces spectral leakage.
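As a minimal sketch of the windowing in step S603, the code below uses a Hann window. The choice of window is an assumption made for illustration; the text leaves the window type open. The names `hann_window` and `apply_window` are hypothetical.

```python
import math


def hann_window(length):
    """Return a symmetric Hann window: 0 at both ends, 1 at the center."""
    if length < 2:
        return [1.0] * length
    return [
        0.5 - 0.5 * math.cos(2 * math.pi * n / (length - 1))
        for n in range(length)
    ]


def apply_window(frame):
    """Multiply a frame sample-by-sample with a Hann window."""
    window = hann_window(len(frame))
    return [sample * weight for sample, weight in zip(frame, window)]
```

Any other window with the same tapering shape (for example Hamming or Blackman) could be substituted without changing the rest of the flow.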

The cost of windowing is that the portions at both ends of a frame are attenuated and carry less weight than the central portion. The remedy is to take frames that overlap one another rather than frames cut back to back. The time difference between the start positions of two adjacent frames is called the frame shift; it is commonly set to half the frame length or fixed at 10 milliseconds.
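Framing with overlap, as described for steps S601 and S603, can be sketched as follows. The function name `split_frames` is hypothetical, and the defaults (20 ms frames, 10 ms frame shift) are taken from the values mentioned in the text as typical, not as fixed requirements.

```python
def split_frames(samples, sample_rate, frame_ms=20, shift_ms=10):
    """Cut a sample sequence into overlapping frames.

    frame_ms is the frame length and shift_ms the frame shift, both in
    milliseconds. Trailing samples that do not fill a whole frame are
    dropped.
    """
    frame_len = sample_rate * frame_ms // 1000
    shift = sample_rate * shift_ms // 1000
    frames = []
    start = 0
    while start + frame_len <= len(samples):
        frames.append(samples[start:start + frame_len])
        start += shift
    return frames
```

With a sampling rate of 8000 Hz, a 20 ms frame holds 160 samples and a 10 ms shift advances by 80 samples, so adjacent frames overlap by half.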

In step S604, a fast Fourier transform (FFT) is performed on the windowed frame data to calculate the power spectrum of the signal.
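The power spectrum of step S604 can be illustrated with a direct discrete Fourier transform. This sketch deliberately swaps in the naive O(N^2) DFT for the FFT so that it needs only the standard library; an FFT produces the same values faster. The names `power_spectrum` and `bin_frequency`, and the normalization by frame length, are assumptions.

```python
import cmath


def power_spectrum(frame):
    """Return the one-sided power spectrum of a real-valued frame.

    The result has len(frame) // 2 + 1 bins. Each bin is the squared
    magnitude of the DFT coefficient divided by the frame length.
    """
    n = len(frame)
    spectrum = []
    for k in range(n // 2 + 1):
        acc = 0j
        for t, x in enumerate(frame):
            acc += x * cmath.exp(-2j * cmath.pi * k * t / n)
        spectrum.append(abs(acc) ** 2 / n)
    return spectrum


def bin_frequency(k, frame_len, sample_rate):
    """Return the center frequency in Hz of spectrum bin k."""
    return k * sample_rate / frame_len
```

For a 160-sample frame at 8000 Hz the bins are spaced 50 Hz apart, so a 2000 Hz tone peaks at bin 40.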

The horizontal axis of the spectrum is frequency and the vertical axis is amplitude. The small peaks on the spectrum are called the fine structure; the spacing between adjacent small peaks along the horizontal axis is the fundamental frequency, which determines the pitch of the voice. The more widely spaced the peaks, the higher the fundamental frequency and the higher the pitch.

In step S605, interpolation (Interpolation) is performed on the obtained power spectrum to refine the frequency resolution, and the average power of the low-frequency spectrum, the high-frequency spectrum, and the voice-band spectrum is computed separately. The spectrum is filtered: low-frequency filtering yields the low-frequency mean spectrum, high-frequency filtering yields the high-frequency mean spectrum, and voice-band filtering yields the voice-band spectrum. The frequency of normal human speech lies within [300, 3400] Hz.
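The band statistics of step S605 can be sketched as below. The sketch omits the interpolation step and simply averages the power-spectrum bins that fall inside each band; it also assumes a one-sided spectrum computed from an even-length frame. The band edges come from the text, while the names `band_mean_db` and `band_features` and the dB conversion with a small floor value are assumptions.

```python
import math

LOW_BAND = (0.0, 200.0)       # Hz, low band from the text
VOICE_BAND = (300.0, 3400.0)  # Hz, voice band from the text


def band_mean_db(power, sample_rate, band, floor=1e-12):
    """Return the mean power of the bins inside `band`, in dB.

    Assumption: `power` is a one-sided spectrum with n/2 + 1 bins taken
    from an even-length frame of n samples.
    """
    frame_len = 2 * (len(power) - 1)
    low_hz, high_hz = band
    values = [
        p for k, p in enumerate(power)
        if low_hz <= k * sample_rate / frame_len <= high_hz
    ]
    if not values:
        raise ValueError("no spectrum bins fall inside the band")
    # The floor avoids log10(0) for silent bands.
    return 10 * math.log10(max(sum(values) / len(values), floor))


def band_features(power, sample_rate):
    """Return the low-band, high-band and voice-band mean power in dB."""
    high_band = (2000.0, sample_rate / 2)  # [2000, f0/2] Hz
    return {
        "low_db": band_mean_db(power, sample_rate, LOW_BAND),
        "high_db": band_mean_db(power, sample_rate, high_band),
        "voice_db": band_mean_db(power, sample_rate, VOICE_BAND),
    }
```

The difference between `low_db` and `high_db` is the quantity that the subsequent threshold comparison operates on.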

In step S606, the difference between the low-frequency mean spectrum and the high-frequency mean spectrum is calculated and compared with a fixed threshold (an empirical value, such as -40 dB), and the comparison result is recorded. In step S607, the voice-band spectrum is compared with another fixed threshold (an empirical value, such as -50 dB), and the comparison result is recorded. The audio characteristics were measured before and after microphone blockage, with and without sound pickup, in different scenes such as a quiet indoor environment, a noisy environment, a booming low-frequency noise environment, white noise, and pink noise. These measurements show that the high-frequency and low-frequency signals rise and fall in a characteristic way before and after blockage in each environment. For example, in a quiet environment, blockage leaves the low-band energy of the spectrum increased or unchanged and decreases the high-band energy, so the difference is larger than when the microphone is not blocked. Similar relationships between the blockage state and the band energies are found in the other environments and are not listed here.

Combining the above, the relationship between the microphone-blocked speaking scene and the change in band energy is shown in Table 1.

Table 1

Whether the microphone is blocked can therefore be judged from the change in the difference between the low-frequency mean spectrum and the high-frequency mean spectrum.

In step S607, in addition to determining whether the microphone is blocked by the above method, whether the microphone is currently picking up sound, that is, whether someone is currently speaking, can be determined by comparing the voice-band spectrum with another fixed threshold. Because the frequency of normal human speech lies within [300, 3400] Hz, comparing the spectrum in this range with a fixed empirical threshold indicates whether someone is speaking. Combining the results of S606 and S607 makes it possible to determine not only whether the microphone is blocked, but also whether the current scene is blocked with speech or blocked without speech.
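The two comparisons of steps S606 and S607 can be sketched for a single frame as follows. Several points are assumptions, because the text does not fix them: the difference is taken as low-band minus high-band power in dB, a frame counts as blocked when that difference exceeds its threshold, and speech counts as present when the voice-band power exceeds its threshold. Both thresholds are empirical and are passed in by the caller. The function name and the scene labels are hypothetical.

```python
def classify_frame(low_db, high_db, voice_db,
                   diff_threshold_db, voice_threshold_db):
    """Compare band features of one frame with two empirical thresholds.

    Returns (blocked, speaking, scene). The sign convention (low minus
    high, larger means blocked) is an assumption of this sketch.
    """
    blocked = (low_db - high_db) > diff_threshold_db   # step S606
    speaking = voice_db > voice_threshold_db           # step S607
    if not blocked:
        scene = "not_blocked"
    elif speaking:
        scene = "blocked_speech"
    else:
        scene = "blocked_no_speech"
    return blocked, speaking, scene
```

Keeping the two boolean results separate, rather than returning only the label, lets a later stage combine them in different ways.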

In step S608, the comparison results of step S606 and step S607 are input to the determination stage (Predicate) of the second detection module to determine whether the microphone is blocked. The determination methods include, but are not limited to:

1. A logical AND is applied to the comparison results of step S606 and step S607.

2. Different weights are assigned to the comparison results of step S606 and step S607, the weighted results are summed, and the determination is made from the sum.

3. Over a certain period of time (for example, 5 seconds), the percentage of comparison results from step S606 and step S607 that satisfy the condition is counted, and the blockage determination is made from that percentage.

Specific determination methods include, but are not limited to, the above three. The determination module can determine which specific state the microphone is currently in, such as blocked with speech or blocked without speech.
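The three determination methods listed for step S608 can be sketched as follows. Each function takes the boolean comparison results of steps S606 and S607. All names, the weights, the cutoff, and the minimum ratio are hypothetical values chosen for illustration; the text gives none of them.

```python
def decide_and(diff_exceeds, voice_exceeds):
    """Method 1: logical AND of the two comparison results."""
    return diff_exceeds and voice_exceeds


def decide_weighted(diff_exceeds, voice_exceeds,
                    w_diff=0.7, w_voice=0.3, cutoff=0.5):
    """Method 2: weighted sum of the two results against a cutoff.

    The weights and the cutoff are illustrative assumptions.
    """
    score = w_diff * diff_exceeds + w_voice * voice_exceeds
    return score >= cutoff


def decide_by_ratio(history, min_ratio=0.8):
    """Method 3: share of frames in a time window that satisfy both.

    `history` is a list of (diff_exceeds, voice_exceeds) pairs collected
    over the window (for example 5 seconds of frames). min_ratio is an
    illustrative assumption.
    """
    if not history:
        return False
    hits = sum(1 for diff, voice in history if diff and voice)
    return hits / len(history) >= min_ratio
```

The ratio-based method trades latency for stability: a single noisy frame cannot trigger a prompt, but the decision is only available after the window has filled.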

Adding the voice-band threshold comparison supplements the embodiment in which only the difference between the high-frequency and low-frequency mean spectra is computed and compared with a threshold, and improves detection accuracy.

Please refer to fig. 7, which shows a block diagram of an electronic device according to an embodiment of the present application. The electronic device 70 may be the mobile terminal or the detection terminal. The electronic device 70 in the present application may include one or more of the following components: a processor 701, a memory 702. One or more applications may be stored in the memory 702 and configured to be executed by the one or more processors 701, the one or more applications configured to perform the methods as described in the foregoing method embodiments.

The processor 701 may include one or more processing cores. The processor 701 connects the various parts of the electronic device 70 through various interfaces and lines, and performs the functions of the electronic device 70 and processes data by running or executing instructions, programs, code sets, or instruction sets stored in the memory 702 and by invoking data stored in the memory 702. Optionally, the processor 701 may be implemented in hardware using at least one of Digital Signal Processing (DSP), a Field-Programmable Gate Array (FPGA), and a Programmable Logic Array (PLA). The processor 701 may integrate one or a combination of a Central Processing Unit (CPU), a microphone detector of the mobile terminal, a Graphics Processing Unit (GPU), a modem, and the like. The CPU mainly handles the operating system, the user interface, application programs, and the like; the GPU is responsible for rendering and drawing display content; and the modem handles wireless communication.

The memory 702 may include a Random Access Memory (RAM) or a Read-Only Memory (ROM). The memory 702 may be used to store instructions, programs, code, sets of codes, or sets of instructions. The memory 702 may include a program storage area and a data storage area. The program storage area may store instructions for implementing an operating system, instructions for implementing at least one function (such as a touch function, a sound playing function, or an image playing function), instructions for implementing the foregoing method embodiments, and the like. The data storage area may store data created by the electronic device 70 during use (for example, phone book, audio and video data, and chat log data).

In addition, the present application shows a block diagram of a computer-readable storage medium provided by an embodiment of the application. The computer-readable storage medium stores program code that can be called by a processor to execute the methods described in the above method embodiments.

The computer-readable storage medium may be an electronic memory such as a flash memory, an EEPROM (electrically erasable programmable read-only memory), an EPROM, a hard disk, or a ROM. Optionally, the computer-readable storage medium includes a non-volatile computer-readable storage medium. The computer-readable storage medium has storage space for program code that performs any of the method steps described above. The program code can be read from or written to one or more computer program products. The program code may, for example, be compressed in a suitable form.

Finally, it should be noted that the above embodiments are only used to illustrate the technical solutions of the present application and not to limit them. Although the present application has been described in detail with reference to the foregoing embodiments, those of ordinary skill in the art will understand that the technical solutions described in the foregoing embodiments may still be modified, or some of their technical features may be equivalently replaced, and that such modifications or replacements do not cause the essence of the corresponding technical solutions to depart from the spirit and scope of the technical solutions of the embodiments of the application.
