Audio information processing method and device and electronic equipment

Document No.: 535102    Publication date: 2021-06-01

Reading note: This technology, "Audio information processing method and device and electronic equipment" (音频信息处理方法、装置及电子设备), was created by 龚永燕 on 2021-02-22. Its main content is as follows: the application provides an audio information processing method, an audio information processing apparatus, and an electronic device. When it is detected that the audio information acquired by the electronic device contains first audio satisfying a specific condition, the electronic device may extract that first audio in order to eliminate its noise interference. Based on the relative positional relationship among the sound source of the first audio, the audio playing device, and the audio receiving object, an initial inverse first audio for the first audio is obtained and sent to the audio playing device for output, so that the target inverse first audio transmitted to the audio receiving object and the first audio cancel each other out. This achieves active noise reduction of the first audio collected by the electronic device, avoids noise interference of the first audio with the user of the electronic device, and improves the user's experience with the electronic device.

1. A method of audio information processing, the method comprising:

acquiring audio information;

under the condition that the audio information comprises first audio meeting a specific condition, extracting the first audio contained in the audio information;

obtaining an initial inverse first audio for the extracted first audio based on a relative positional relationship among a sound source of the first audio, an audio playing device, and an audio receiving object;

and sending the initial inverse first audio to the audio playing device for outputting, so that the target inverse first audio transmitted to the audio receiving object and the first audio cancel each other out.

2. The method of claim 1, wherein, in a case that it is detected that the audio information includes first audio satisfying a specific condition, the extracting the first audio contained in the audio information comprises:

a first processor detecting whether the audio information contains first audio satisfying the specific condition;

if yes, the first processor sending noise reduction prompt information to a second processor;

the second processor extracting the first audio contained in the audio information;

wherein, within a same period of time, first power consumption generated by operation of the first processor is less than second power consumption generated by operation of the second processor.

3. The method of claim 1, wherein the detecting that the audio information includes first audio satisfying a specific condition comprises:

detecting that the audio information includes steady-state noise, and determining the steady-state noise as the first audio; or,

acquiring steady-state noise and non-steady-state noise contained in the audio information;

and determining that the steady-state noise and the non-steady-state noise satisfy a specific condition, or determining that the steady-state noise and the non-steady-state noise satisfy a specific condition and an abnormal frequency signal exists in the steady-state noise, and determining the steady-state noise as the first audio.

4. The method of claim 3, wherein the steady-state noise comprises electrical noise generated by the electronic device itself; and the determining that the steady-state noise and the non-steady-state noise satisfy a specific condition comprises:

acquiring a signal intensity difference between the steady-state noise and the non-steady-state noise;

determining that the signal intensity difference is greater than a steady-state noise reduction threshold, wherein the steady-state noise reduction threshold is determined based on a signal intensity of ambient noise of the electronic device.

5. The method according to any one of claims 1 to 4, wherein obtaining the initial inverse first audio for the extracted first audio based on the relative positional relationship among the sound source of the first audio, an audio playing device, and an audio receiving object comprises:

acquiring a first sound source position of the first audio, a sound outlet position of the audio playing device, and a sound receiving position of the audio receiving object;

performing, according to the first sound source position, the sound outlet position, and the sound receiving position, phase inversion processing on the first audio to obtain the initial inverse first audio;

the sending the initial inverse first audio to the audio playing device for outputting comprises:

mixing the initial inverse first audio with audio to be played, and sending the obtained mixed audio to the audio playing device for outputting.

6. The method according to any one of claims 1 to 4, wherein the detecting that the audio information contains a first audio satisfying a specific condition comprises:

detecting that the audio information contains user audio, and carrying out sound source positioning on the user audio to obtain the position of a speaker of the user audio;

determining that a positional relationship between the position of the speaker and the electronic device satisfies a specific condition, and determining the user audio as the first audio;

the method further comprises the following steps:

based on the first audio, authenticating the speaker of the first audio;

if the identity authentication is passed, waking up a voice recognition engine of the electronic device, and executing, by the voice recognition engine, a voice instruction contained in the first audio.

7. An audio information processing apparatus, the apparatus comprising:

the audio information acquisition module is used for acquiring audio information;

the first audio extraction module is used for extracting the first audio contained in the audio information under the condition that the audio information contains the first audio meeting specific conditions;

the inverse processing module is used for obtaining an initial inverse first audio for the extracted first audio based on a relative positional relationship among a sound source of the first audio, an audio playing device, and an audio receiving object;

and the audio transmission module is used for sending the initial inverse first audio to the audio playing device for outputting, so that a target inverse first audio transmitted to the audio receiving object and the first audio cancel each other out.

8. An electronic device, comprising a body, and an electronic component, an audio acquisition device, an audio playback device, and an audio processing device disposed in the body, wherein:

the audio acquisition device is used for acquiring audio information under the operating condition of the electronic equipment;

the audio processing apparatus is configured to call and execute a first program to implement the steps of the audio information processing method according to any one of claims 1 to 6, where the first program is a program to implement the audio information processing method according to any one of claims 1 to 6.

9. The electronic device of claim 8, the audio processing means comprising:

the first processor is used for acquiring audio information, detecting that the audio information contains a first audio meeting a specific condition, and generating noise reduction prompt information;

the second processor is used for receiving the noise reduction prompt information sent by the first processor, extracting the first audio contained in the audio information, obtaining an initial inverse first audio for the extracted first audio based on a relative positional relationship among a sound source of the first audio, an audio playing device, and an audio receiving object, and sending the initial inverse first audio to the audio playing device for outputting, so that a target inverse first audio transmitted to the audio receiving object and the first audio cancel each other out;

wherein, within a same period of time, first power consumption generated by operation of the first processor is less than second power consumption generated by operation of the second processor.

10. The electronic device of claim 8 or 9, the first processor being disposed in the audio acquisition apparatus;

the audio acquisition device comprises a plurality of audio collectors deployed in an array;

the electronic component includes a capacitive component that is capable of generating steady state noise during operation of the electronic device.

Technical Field

The present application relates to the field of audio signal processing, and in particular, to an audio information processing method and apparatus, and an electronic device.

Background

At present, during operation of an electronic device such as a notebook computer, steady-state noise may be picked up, for example the capacitor electrical noise generated during operation of the low-cost common capacitors typically assembled in a notebook computer, or the noise generated by the fan of the notebook computer. If the notebook computer runs in a very quiet environment, or a fairly quiet environment is required, such steady-state noise causes noise interference to the user and degrades the user's experience with the notebook computer.

Disclosure of Invention

In view of the above, in order to solve the above technical problem, the present application proposes the following technical solutions:

in one aspect, the present application provides an audio information processing method, where the method includes:

acquiring audio information;

under the condition that the audio information comprises first audio meeting a specific condition, extracting the first audio contained in the audio information;

obtaining an initial inverse first audio for the extracted first audio based on a relative positional relationship among a sound source of the first audio, an audio playing device, and an audio receiving object;

and sending the initial inverse first audio to the audio playing device for outputting, so that the target inverse first audio transmitted to the audio receiving object and the first audio cancel each other out.

In some embodiments, in a case that it is detected that the audio information includes first audio satisfying a specific condition, the extracting the first audio contained in the audio information includes:

a first processor detects whether the audio information contains first audio satisfying the specific condition;

if yes, the first processor sends noise reduction prompt information to the second processor;

the second processor extracting the first audio contained in the audio information;

wherein, within a same period of time, first power consumption generated by operation of the first processor is less than second power consumption generated by operation of the second processor.

In some embodiments, the detecting that the audio information includes first audio that satisfies a particular condition includes:

detecting that the audio information includes steady-state noise, and determining the steady-state noise as the first audio; or,

acquiring steady-state noise and non-steady-state noise contained in the audio information;

and determining that the steady-state noise and the non-steady-state noise satisfy a specific condition, or determining that the steady-state noise and the non-steady-state noise satisfy a specific condition and an abnormal frequency signal exists in the steady-state noise, and determining the steady-state noise as the first audio.

In some embodiments, the steady-state noise includes electrical noise generated by the electronic device itself; and the determining that the steady-state noise and the non-steady-state noise satisfy a specific condition includes:

acquiring a signal intensity difference between the steady-state noise and the non-steady-state noise;

determining that the signal intensity difference is greater than a steady-state noise reduction threshold, where the steady-state noise reduction threshold is determined based on a signal intensity of ambient noise of the electronic device.

In some embodiments, the obtaining of the initial inverse first audio for the extracted first audio based on a relative positional relationship between a sound source of the first audio, an audio playback device, and an audio receiving object includes:

acquiring a first sound source position of the first audio, a sound outlet position of the audio playing device, and a sound receiving position of the audio receiving object;

performing, according to the first sound source position, the sound outlet position, and the sound receiving position, phase inversion processing on the first audio to obtain the initial inverse first audio;

the sending the initial inverse first audio to the audio playing device for outputting includes:

mixing the initial inverse first audio with audio to be played, and sending the obtained mixed audio to the audio playing device for outputting.

In some embodiments, the detecting that the audio information includes first audio that satisfies a particular condition includes:

detecting that the audio information contains user audio, and carrying out sound source positioning on the user audio to obtain the position of a speaker of the user audio;

determining that a positional relationship between the position of the speaker and the electronic device satisfies a specific condition, and determining the user audio as the first audio;

the method further comprises the following steps:

based on the first audio, authenticating the speaker of the first audio;

if the identity authentication is passed, a voice recognition engine of the electronic device is woken up, and the voice recognition engine executes a voice instruction contained in the first audio.

In another aspect, the present application also provides an audio information processing apparatus, including:

the audio information acquisition module is used for acquiring audio information;

the first audio extraction module is used for extracting the first audio contained in the audio information under the condition that the audio information contains the first audio meeting specific conditions;

the inverse processing module is used for obtaining an initial inverse first audio for the extracted first audio based on a relative positional relationship among a sound source of the first audio, an audio playing device, and an audio receiving object;

and the audio transmission module is used for sending the initial inverse first audio to the audio playing device for outputting, so that a target inverse first audio transmitted to the audio receiving object and the first audio cancel each other out.

In another aspect, the present application further provides an electronic device, which includes a main body, and an electronic component, an audio collecting device, an audio playing device, and an audio processing device disposed in the main body, wherein:

the audio acquisition device is used for acquiring audio information under the operating condition of the electronic equipment;

the audio processing device is used for calling and executing a first program to realize the steps of the audio information processing method, and the first program is a program for realizing the audio information processing method.

In some embodiments, the audio processing apparatus comprises:

the first processor is used for acquiring audio information, detecting that the audio information contains a first audio meeting a specific condition, and generating noise reduction prompt information;

the second processor is used for receiving the noise reduction prompt information sent by the first processor, extracting the first audio contained in the audio information, obtaining an initial inverse first audio for the extracted first audio based on a relative positional relationship among a sound source of the first audio, an audio playing device, and an audio receiving object, and sending the initial inverse first audio to the audio playing device for outputting, so that a target inverse first audio transmitted to the audio receiving object and the first audio cancel each other out;

wherein, within a same period of time, first power consumption generated by operation of the first processor is less than second power consumption generated by operation of the second processor.

In some embodiments, the first processor is disposed in the audio acquisition device;

the audio acquisition device comprises a plurality of audio collectors deployed in an array;

the electronic component includes a capacitive component that is capable of generating steady state noise during operation of the electronic device.

Therefore, the present application provides an audio information processing method, an audio information processing apparatus, and an electronic device. When it is detected that the audio information acquired by the electronic device contains first audio satisfying a specific condition, the electronic device may extract the first audio contained in the audio information in order to eliminate the noise interference of that first audio. Based on the relative positional relationship among the sound source of the first audio, the audio playing device, and the audio receiving object, an initial inverse first audio for the first audio is obtained and sent to the audio playing device for output, so that the target inverse first audio transmitted to the audio receiving object and the first audio can cancel each other out. This achieves active noise reduction of the first audio collected by the electronic device, avoids noise interference of the first audio with the user of the electronic device, and improves the user's experience with the electronic device.

Drawings

In order to more clearly illustrate the embodiments of the present application or the technical solutions in the prior art, the drawings needed to be used in the description of the embodiments or the prior art will be briefly introduced below, it is obvious that the drawings in the following description are only embodiments of the present application, and for those skilled in the art, other drawings can be obtained according to the provided drawings without creative efforts.

Fig. 1 is a schematic structural diagram of an alternative example of an electronic device suitable for use in the audio information processing method and apparatus proposed in the present application;

fig. 2 is a schematic diagram of a hardware structure of still another alternative example of an electronic device suitable for use in the audio information processing method and apparatus proposed in the present application;

fig. 3 is a schematic structural diagram of yet another alternative example of an electronic device suitable for use in the audio information processing method and apparatus proposed in the present application;

fig. 4 is a schematic diagram of a hardware structure of still another alternative example of an electronic device suitable for the audio information processing method and apparatus proposed in the present application;

FIG. 5 is a schematic flow chart diagram of an alternative example of the audio information processing method proposed in the present application;

fig. 6 is a signaling flow diagram of still another alternative example of the audio information processing method proposed in the present application;

fig. 7 is a schematic flowchart of yet another alternative example of the audio information processing method proposed in the present application;

fig. 8 is a schematic flowchart of yet another alternative example of the audio information processing method proposed in the present application;

fig. 9 is a schematic flowchart of yet another alternative example of the audio information processing method proposed in the present application;

fig. 10 is a schematic structural diagram of an alternative example of an audio information processing apparatus proposed in the present application;

fig. 11 is a schematic structural diagram of still another alternative example of the audio information processing apparatus proposed in the present application.

Detailed Description

The technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application, and it is obvious that the described embodiments are only a part of the embodiments of the present application, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present application.

It should be noted that, for convenience of description, only the portions related to the related invention are shown in the drawings. The embodiments and features of the embodiments in the present application may be combined with each other without conflict.

It should be understood that "system", "apparatus", "unit" and/or "module" as used herein is a method for distinguishing different components, elements, parts or assemblies at different levels. However, other words may be substituted by other expressions if they accomplish the same purpose.

As used in this application and the appended claims, the terms "a," "an," "one," and/or "the" are intended to include the plural forms as well as the singular, unless the context clearly indicates otherwise. In general, the terms "comprises" and "comprising" merely indicate that the explicitly identified steps and elements are included; these steps and elements do not form an exclusive list, and a method or apparatus may include other steps or elements. An element defined by the phrase "comprising a(n) ..." does not exclude the presence of other identical elements in the process, method, article, or apparatus that comprises the element.

In the description of the embodiments herein, "/" means "or" unless otherwise specified; for example, A/B may mean A or B. "And/or" herein merely describes an association between associated objects and indicates that three relationships may exist; for example, A and/or B may mean: A exists alone, A and B exist simultaneously, or B exists alone. In addition, in the description of the embodiments of the present application, "a plurality of" means two or more. The terms "first", "second", and the like are used for descriptive purposes only and are not to be construed as indicating or implying relative importance or implicitly indicating the number of technical features indicated. Thus, a feature defined as "first" or "second" may explicitly or implicitly include one or more of that feature.

Additionally, flow charts are used herein to illustrate operations performed by systems according to embodiments of the present application. It should be understood that the preceding or following operations are not necessarily performed in the exact order in which they are performed. Rather, the various steps may be processed in reverse order or simultaneously. Meanwhile, other operations may be added to the processes, or a certain step or several steps of operations may be removed from the processes.

Referring to fig. 1, which is a schematic structural diagram of an alternative example of an electronic device suitable for the audio information processing method and apparatus provided in the present application, the electronic device may include, but is not limited to, a smart phone, a tablet computer, a wearable device, a personal computer (PC), a netbook, a personal digital assistant (PDA), a smart watch, a smart speaker, a robot, a desktop computer, and the like. The electronic device shown in fig. 1 is only an example and should not impose any limitation on the functions and the scope of use of the embodiments of the present application.

As shown in fig. 1, the electronic device proposed in this embodiment may include a main body 100, and an electronic component 200, an audio collecting device 300, an audio playing device 400, and an audio processing device 500 disposed in the main body 100, wherein:

the electronic element 200 may be configured to build a hardware device that supports normal functions of an electronic device, and it can be understood that electronic elements constituting the hardware device are various and can be determined according to specific implemented functions.

The audio collecting apparatus 300 may be used for collecting audio information under the operating condition of the electronic device.

In this embodiment of the application, the audio information collected by the audio collecting device 300 may include various audios generated under an environment where the electronic device is located, an audio generated by the electronic device itself, and the like.

In some embodiments, in order to facilitate positioning of a sound source, the audio acquisition device 300 may include a plurality of audio collectors disposed in an array, such as an array microphone, and the application does not limit the specific composition structure of the audio acquisition device 300.
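
To illustrate why an array layout helps with sound source positioning, the sketch below estimates a direction of arrival from the time difference between two microphones using cross-correlation. This is only a minimal, illustrative assumption on the author's part; the microphone spacing, sampling rate, and function names are invented, and the application itself does not prescribe any localization algorithm.

```python
import numpy as np

def estimate_direction(mic_a, mic_b, fs, spacing_m, c=343.0):
    """Estimate a direction of arrival (radians) from two microphone signals.

    mic_a, mic_b: 1-D float arrays of equal length (time-aligned capture);
    fs: sampling rate in Hz; spacing_m: microphone spacing in metres;
    c: assumed speed of sound in m/s.
    """
    # Cross-correlate the two channels to find how much later the wavefront
    # reaches mic_b than mic_a (in samples).
    corr = np.correlate(mic_a, mic_b, mode="full")
    delay_samples = (len(mic_b) - 1) - np.argmax(corr)
    tdoa = delay_samples / fs
    # Clamp to the physically possible range before taking arcsin; the sign of
    # the result indicates which side of the array the source lies on.
    sin_theta = np.clip(tdoa * c / spacing_m, -1.0, 1.0)
    return np.arcsin(sin_theta)

# Toy check: a 1 kHz tone arriving at the second microphone 3 samples later.
fs = 48_000
t = np.arange(1024) / fs
tone = np.sin(2 * np.pi * 1000.0 * t)
angle = estimate_direction(tone, np.roll(tone, 3), fs, spacing_m=0.04)
print(f"estimated angle: {np.degrees(angle):.1f} degrees")
```

A practical array would combine several such microphone pairs to resolve the full position rather than a single angle.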

The audio playing apparatus 400 may be configured to output audio generated or obtained by the electronic device itself, and as the case may be, the structure and the application scenario of the audio playing apparatus 400 are not described in detail in this application. Optionally, the audio playing apparatus 400 may include an audio player disposed in a body of the electronic device, and a specific disposition position of the audio player may be determined according to a type and a structural disposition requirement of the electronic device, which is not limited in this application.

In still other embodiments, the audio playing apparatus 400 may further include an independent audio player, which may be connected to the sound output hole of the electronic device through a data transmission line to play the audio output by the electronic device, in which case, the user may adjust the position of the independent audio player and the audio transmission direction thereof according to the requirement, which is not described in detail herein.

The audio processing apparatus 500 may be configured to execute the audio information processing method provided in the embodiments of the present application; specifically, it may implement each step of the audio information processing method described in the corresponding embodiments below by calling and executing a first program, where the first program may be a program for implementing the corresponding audio information processing method.

In some embodiments, the first program may be stored in a memory of the audio processing apparatus 500, and the processor included in the audio processing apparatus 500 calls and executes the first program stored in the memory as required to implement the steps of the audio information processing method described in the following corresponding embodiments, and the specific implementation process of the method is not described in detail in this application.

In this embodiment of the present application, the working environment of the electronic device is often not very quiet, and noise interference generally exists. In order to reduce the interference of at least part of that noise with an audio receiving object (such as a user or an audio collecting device; the present application does not limit the category of the audio receiving object, which may be determined as the case may be), an active noise reduction manner, such as inverse (anti-phase) noise reduction, may be adopted to cancel, at the audio receiving object, the first audio satisfying a specific condition. The specific implementation process of the active noise reduction is not described in detail here; reference may be made to, but is not limited to, the description of the corresponding parts of the following embodiments.

The first audio that requires active noise reduction processing may be of different categories and have different characteristics, and is denoted herein as first audio satisfying a specific condition; the specific content of the specific condition may be determined as appropriate and is not described in detail here.

In summary, in the embodiment of the present application, the audio information acquired by the electronic device may be examined to detect whether first audio satisfying a specific condition exists. If such first audio exists, active noise reduction processing needs to be performed on it: an initial inverse first audio for the first audio is obtained, and it is ensured that the target inverse first audio that the electronic device plays and that reaches the audio receiving object cancels the first audio that also reaches the audio receiving object. Noise interference of the first audio with the audio receiving object is thereby eliminated, so that the audio receiving object can clearly and accurately receive the audio content played by the electronic device.

In some embodiments of the present application, the audio information processing performed by the audio processing apparatus 500 may be implemented based on Artificial Intelligence (AI) technology; specifically, a control chip supporting the AI technology, such as the AI chip in fig. 2, executes a program configured in advance based on the AI technology to implement the audio information processing method provided by the present application. In combination with the above analysis, after the audio information is collected by the audio collecting device 300 (e.g., the microphone Mic in fig. 2), the audio information may be sent to the AI chip through a PDM audio interface (or another data audio interface, which is not limited in this application) for active noise reduction processing. The initial inverse first audio obtained through this processing may be sent to the Audio Codec through an Inter-IC Sound (I2S) bus for audio format conversion, so as to obtain analog audio supported by the audio playing apparatus 400 (Speaker); the codec implementation process of the audio is not described in detail in this application.

During use of the electronic device, the audio playing apparatus 400 may also play other audio generated or received by the electronic device itself, denoted as the audio to be played. The HD (High Definition) audio controller in the corresponding chipset may send the audio to be played to the audio codec, and the audio codec mixes the received audio to be played with the initial inverse first audio and sends the resulting mixed signal to the audio player for output.
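
A rough sketch of that mixing step follows. The function name and the clipping strategy are assumptions for illustration; the actual behaviour of the audio codec is not specified in this application.

```python
import numpy as np

def mix_for_playback(initial_inverse_audio, audio_to_play):
    """Sum the initial inverse first audio with the audio to be played.

    Both inputs are float arrays scaled to [-1.0, 1.0]; the shorter one is
    treated as silence past its end, so the anti-noise keeps playing even when
    no media audio is present.
    """
    length = max(len(initial_inverse_audio), len(audio_to_play))
    mixed = np.zeros(length)
    mixed[:len(initial_inverse_audio)] += initial_inverse_audio
    mixed[:len(audio_to_play)] += audio_to_play
    # Keep the sum inside the DAC's range to avoid clipping distortion.
    return np.clip(mixed, -1.0, 1.0)
```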

It should be noted that, because there is often a certain distance between the audio player and the audio receiving object, if the audio player directly outputs audio whose phase is exactly opposite to that of the first audio, the audio transmitted to the audio receiving object may no longer be exactly opposite in phase to the first audio transmitted there; that is, the phase difference may be greater than or less than 180°, so that the two audio signals cannot cancel each other completely, and the played audio may even introduce additional interference at the audio receiving object.

Therefore, in order to reliably cancel the noise, i.e., the first audio, at the audio receiving object, the present application performs an inverse operation on the collected first audio according to the distances from the sound source to the audio collecting device and to the audio receiving object, and the distance between the audio receiving object and the audio playing device, so as to obtain the initial inverse first audio. Depending on the distance values obtained in a specific scene, this initial inverse first audio may have a phase difference with the first audio that is greater than or less than 180°, or it may simply be the exact anti-phase of the first audio. Either way, the first audio generated by the sound source and the corresponding inverse audio signal arrive at the audio receiving object in opposite phase, i.e., with a phase difference of 180°, so that the two audio paths cancel each other and the audio receiving object is not disturbed by the noise of the first audio while receiving the desired audio. The specific process of obtaining the initial inverse first audio is not described in detail in the present application.
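
A minimal sketch of this idea, assuming free-field propagation and treating the inverse operation as sign inversion plus a time alignment derived from the geometry, is given below. The function and parameter names are assumptions; the application leaves the exact computation open.

```python
import numpy as np

SPEED_OF_SOUND = 343.0  # m/s, assumed free-field propagation speed

def initial_inverse_audio(first_audio, fs,
                          d_source_to_listener, d_speaker_to_listener):
    """Derive the signal the speaker should emit so that it reaches the
    listener in anti-phase with the first audio.

    first_audio: extracted noise samples (float array in [-1, 1]);
    fs: sampling rate in Hz; the two distances (metres) stand in for the
    relative positions of the sound source, the sound outlet of the playing
    device, and the audio receiving object.
    """
    inverse = -first_audio  # ideal 180-degree inversion

    # How much longer the noise travels to the listener than the anti-noise
    # would, if both started travelling at the same instant.
    lead_seconds = (d_source_to_listener - d_speaker_to_listener) / SPEED_OF_SOUND
    shift = int(round(lead_seconds * fs))

    if shift >= 0:
        # Listener is closer to the speaker: hold the anti-noise back so the
        # two paths meet at the listener with opposite phase.
        return np.concatenate([np.zeros(shift), inverse])

    # Listener is closer to the noise source: the first |shift| samples cannot
    # be cancelled in time; for steady-state noise, dropping them effectively
    # advances the anti-noise so that later samples still line up and cancel.
    return inverse[-shift:]
```

A full implementation would also scale the amplitude to compensate for attenuation over the two propagation paths; that is omitted here for brevity.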

In the electronic device shown in fig. 2, to achieve the above noise reduction purpose, the audio acquisition device 300 usually acquires audio information in real time while the electronic device is in an operating state, and sends the audio information to the audio processing device 500 for processing, so as to determine whether active noise reduction processing should be performed on the audio present in the current environment. However, having the AI chip shown in fig. 2, or a processor such as the CPU of the electronic device, execute the program that analyzes the audio information acquired in real time often makes the system power consumption of the electronic device very high.

In order to solve the above high power consumption problem, the present application further provides a low-power-consumption processor, denoted as the first processor, which performs a preliminary analysis on the acquired audio information. After preliminarily determining that the audio information contains first audio that requires active noise reduction (i.e., first audio satisfying a specific condition), the first processor notifies a second processor, such as the above AI chip or CPU, which then performs a complex analysis on the audio information and completes the inverse noise reduction processing of the first audio in the manner described above. Conversely, if the first processor preliminarily determines that the acquired audio information does not contain first audio satisfying the specific condition, the second processor is not required to analyze the audio information.

It can be seen that, because the power consumption generated by the operation of the first processor is lower than that generated by the second processor within the same period of time, the first processor, instead of the second processor, performs the preliminary analysis on the audio information acquired in real time, and whether to trigger the second processor to execute the program implementing the audio information processing method provided by this application is determined according to the preliminary analysis result. The second processor is not required to execute the program in real time, which greatly reduces the power consumption it generates, while the noise cancellation requirement of the application scene is still met and a good recording and playback effect is ensured.

Based on the above technical concept, as shown in fig. 3, the audio processing apparatus 500 may include the first processor 510 and the second processor 520. As described above, the first processor 510 obtains the audio information and detects whether it contains first audio satisfying the specific condition. If not, the first processor 510 may continue to detect the audio information acquired by the audio acquisition apparatus; if so, it may generate noise reduction prompt information and send it to the second processor 520, and the second processor 520 performs the inverse noise reduction processing on the first audio in the manner described above. The specific implementation process of this embodiment is not described in detail here.
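
The division of labour between the two processors can be sketched as below. The queue, function names, and the idea of a sentinel value are invented for illustration; the application does not define a software interface between the processors.

```python
import queue

# Carries the "noise reduction prompt information" from the first processor
# path to the second processor path.
prompts = queue.Queue()

def first_processor_loop(frames, looks_like_first_audio):
    """Low-power path: run only a cheap detector on each captured frame."""
    for frame in frames:
        if looks_like_first_audio(frame):   # e.g. a simple energy / band check
            prompts.put(frame)              # prompt the heavyweight path
    prompts.put(None)                       # sentinel: capture finished

def second_processor_loop(process_first_audio):
    """High-power path: stay idle until prompted, then extract and invert."""
    while True:
        frame = prompts.get()               # blocks, so no busy polling
        if frame is None:
            break
        process_first_audio(frame)
```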

It should be noted that the present application does not limit the specific types of the first processor 510 and the second processor 520 and the deployment locations thereof, and may be determined according to the circumstances.

In still other embodiments, the above-described technical concept of determining whether to trigger the second processor 520 to operate based on the initial analysis result of the audio information obtained by the first processor 510 to implement the low-power consumption audio detection analysis may also be applied to other application scenarios of the electronic device, such as waking up an operating system and a voice recognition engine (i.e., a voice assistant) of the electronic device after a certain condition is met, so as to solve the problem of high power consumption caused by the operating system and the voice recognition engine of the electronic device being always in an operating state. Of course, the technical concept is not limited to the application scenarios listed in the present application, and the present application is not described in detail herein.

For example, while a user uses an electronic device, in order to reduce power consumption, the electronic device is generally controlled to enter a low power consumption state, such as a power-off state or a standby state, when it is not being used, and the operating system of the electronic device is started into normal operation when the user needs to use it again. Alternatively, the electronic device may remain in a normal operating state but refrain from directly analyzing the acquired audio information in real time, so as to reduce the power consumption that the second processor would generate by analyzing the audio information in real time.

Specifically, when the electronic device is in a low power consumption state, the audio collecting apparatus 300 can still collect audio information normally. The first processor 510 performs an initial analysis on this audio information and determines whether first audio satisfying a specific condition exists, for example whether user speech is present; combined with sound source localization, it determines whether the relative positional relationship between the speaker of the user speech and the electronic device satisfies the specific condition, for example whether the user is facing the display screen of the electronic device. In this way it can be determined that the user wants to use the electronic device without the user having to speak a preset wake-up word. Based on this analysis result, the second processor can be notified to perform a complex analysis on the collected audio information, perform identity authentication based on the user speech contained in the audio information, and determine whether the user is a legitimate user of the electronic device, thereby directly waking up the operating system of the electronic device and the voice recognition engine installed in it, so that the voice recognition engine executes the collected user voice instruction and the application requirement is met. The processing is, of course, not limited to the audio information processing method described in this embodiment.
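
That decision flow can be written out as the sketch below. Every callable used here (voice detection, localization, authentication, wake-up) is a placeholder name introduced for illustration, not an API of the electronic device or of any real library.

```python
def handle_low_power_frame(frame,
                           contains_user_voice, locate_speaker, faces_screen,
                           authenticate_speaker, wake_operating_system,
                           wake_voice_engine, execute_voice_command):
    """Wake-word-free flow; every callable argument is a placeholder."""
    if not contains_user_voice(frame):        # first processor: cheap voice check
        return
    speaker_position = locate_speaker(frame)  # sound source localisation (array mics)
    if not faces_screen(speaker_position):    # positional-relationship condition
        return
    # From here on the second processor would take over for the heavy steps.
    if not authenticate_speaker(frame):       # identity authentication (e.g. voiceprint)
        return
    wake_operating_system()
    wake_voice_engine()
    execute_voice_command(frame)              # run the spoken instruction
```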

It can be understood that, under the normal working condition of the electronic device, the first processor and the second processor may also execute the corresponding method steps described above. In this case, there is no need to wake up the operating system or the voice recognition engine; after the user identity authentication is completed, the voice recognition engine may directly execute the user voice instruction contained in the buffered audio information. The specific implementation process is not described in detail in this application.

In a possible implementation manner, the first processor 510 may be disposed in the audio capturing apparatus 300, that is, the audio capturing apparatus 300 may be an intelligent audio capturing apparatus with a built-in first processor, and the present application does not limit a specific hardware disposition relationship between the audio capturing apparatus and the first processor, and the first processor 510 is not limited to the hardware disposition relationship disposed in the audio capturing apparatus 300, and the present application is only described by way of example.

Referring to the schematic structural diagram of yet another alternative example of the electronic device shown in fig. 4, the first processor 510 may be a Neural Network Processing Unit (NPU), that is, a processor built around and accelerated for neural network algorithms. After the audio collector 310 in the audio collecting apparatus 300 collects audio information and sends it to the first processor 510, the first processor 510 may perform a simple signal analysis on the audio information using a neural network algorithm to determine whether first audio satisfying the specific condition exists; the specific analysis implementation process is not described in detail in this application.

As shown in fig. 4, the audio collector 310 in the audio collecting apparatus 300 and the second processor 520 may transmit audio information through a PDM audio interface (but are not limited to this audio interface; the type of audio interface used may be determined according to the requirements of the audio format and is not listed here). The first processor 510 and the second processor 520 exchange data through data interfaces such as SPI (Serial Peripheral Interface) and/or GPIO (General-Purpose Input/Output) ports of the two processors, for example to transmit corresponding instructions and working state information generated based on the preliminary analysis result of the audio information.

In addition, in some application scenarios of the electronic device, after receiving the audio information sent by the audio collector, the second processor 520 may buffer the audio information and send it to other chips in the electronic device for processing when needed. The second processor 520 may therefore be communicatively connected with other chips in the electronic device, and the specific connection manner is not limited; fig. 3 merely takes a USB connection as an example to illustrate the communication connection between the second processor 520 and the chipset of the electronic device, and the specific communication connection manner and the type of the connected chip are given only as an example.

For example, in the wake-word-free voice interaction scenario described above, the second processor 520 may send the received audio information to the chipset. When the second processor 520 determines that the speaker of the user speech contained in the audio information is a legitimate user of the electronic device, that is, the user speech passes the identity authentication, the operating system of the electronic device is woken up and the voice assistant is triggered to enter a working state, i.e., the chipset enters a normal running state. The chip implementing the voice assistant function may then recognize the voice instruction contained in the audio information and send it to the corresponding chip for execution, thereby meeting the application requirement. Similar control processes in other application scenarios are not described in detail in this application.

Referring to fig. 5, which is a schematic flowchart of an alternative example of the audio information processing method proposed by the present application, the method may be applied to an electronic device; the product category of the electronic device and its composition structure are not limited, and reference may be made to, but is not limited to, the structure described in the above embodiments of the electronic device. As shown in fig. 5, the audio information processing method proposed by this embodiment may include:

step S11, acquiring audio information;

in conjunction with the above description, the audio information may be acquired by an audio acquirer in the audio acquisition apparatus 300 of the electronic device, and a specific audio acquisition implementation process is not described in detail. The audio processing device in the electronic device can actively or passively acquire the audio information acquired by the audio acquisition device, and the specific implementation process of acquiring the audio information by the audio processing device is not limited in the application.

It can be understood that, in different application scenarios, the audio content and the category of the audio information may be different, for example, environmental noise of an environment where the electronic device is located (e.g., noise generated by operation of other devices in the environment, noise generated by a natural environment, etc.), electrical noise generated by the electronic device itself, user voice output by the user for the electronic device, etc., and the specific content included in the audio information is not limited in the present application.

Step S12, extracting a first audio included in the audio information when the audio information includes the first audio satisfying a specific condition;

in practical applications, for different application scenarios and application requirements of the electronic device, the content and category of the first audio meeting the specific condition may be different, which is not limited in the present application and may be determined according to the situation.

In a possible implementation manner, the first audio satisfying the specific condition may refer to steady-state audio, such as the noise generated by operation of a system radiator (e.g., a fan) of the electronic device, or the noise generated by operation of a household electric appliance such as a vacuum cleaner or a range hood, which may be determined according to the current environment of the electronic device. In this case, upon detecting that the acquired audio information contains such steady-state audio, the audio processing apparatus may extract the steady-state audio and continue to process it in the manner described below, so as to achieve inverse cancellation of the steady-state audio at the audio receiving object.
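
One common way to isolate a steady-state component is to track the per-frequency magnitude floor over time; this is purely an illustrative assumption on the author's part, since the application does not specify the extraction algorithm.

```python
import numpy as np

def steady_state_profile(frames):
    """Estimate a spectral profile of the steady-state component of captured audio.

    frames: 2-D float array, one windowed time frame per row.
    The minimum magnitude seen in each frequency bin across many frames
    approximates the always-present (steady-state) part, since transient
    sounds rarely occupy every frame.
    """
    spectra = np.abs(np.fft.rfft(frames, axis=1))
    return spectra.min(axis=0)
```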

In yet another possible implementation manner, in some scenarios the signal intensity of the non-steady-state noise in the environment of the electronic device is so large that the steady-state noise in the environment is essentially not perceived by the audio receiving object (such as the user). Taking the capacitor electrical noise generated by a capacitive element in the electronic device as the steady-state noise and the actual environmental noise as the non-steady-state noise for illustration: if the current environment is noisy, the capacitor electrical noise generated by the electronic device is essentially masked, and in this case the electronic device does not need to process the capacitor electrical noise by the inverse noise reduction manner provided in this application; specifically, it may perform no processing at all, or it may use other noise reduction or filtering manners, which is not limited by this application and may be determined as the case may be. Conversely, if the environment of the electronic device is quiet, the capacitor electrical noise may stand out, and in this case it generally needs to be subjected to inverse noise reduction so as to reduce the interference with the user using the electronic device.

Therefore, for the application scenarios described above, the electronic device may acquire the steady-state noise and the non-steady-state noise in the acquired audio information, and then determine whether the two kinds of noise satisfy a specific condition, for example whether the signal intensity difference between the steady-state noise and the non-steady-state noise described in the above scenario is greater than the corresponding steady-state noise reduction threshold, and thereby determine whether inverse noise reduction processing needs to be performed on the steady-state noise.
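
A sketch of such a check follows. The dB-based intensity measure, the mapping from ambient noise level to the threshold, and all numeric constants are assumptions chosen only to make the idea concrete; the application does not fix how the steady-state noise reduction threshold is derived.

```python
import numpy as np

def intensity_db(signal):
    """Signal intensity as an RMS level in dB relative to full scale."""
    rms = np.sqrt(np.mean(np.square(signal))) + 1e-12
    return 20.0 * np.log10(rms)

def steady_noise_satisfies_condition(steady_noise, non_steady_noise, ambient_noise):
    """Cancel the steady-state noise only when it is not masked by the rest of
    the environment: its level must exceed the non-steady-state noise by more
    than a threshold derived from the ambient noise level."""
    difference = intensity_db(steady_noise) - intensity_db(non_steady_noise)
    # Assumed mapping: the noisier the ambient environment, the larger the
    # margin required before inverse noise reduction is considered worthwhile.
    threshold_db = 6.0 + 0.1 * max(0.0, 60.0 + intensity_db(ambient_noise))
    return difference > threshold_db
```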

Further, in another possible implementation manner, some application scenarios may only require active noise reduction processing for abnormal steady-state noise whose frequency changes suddenly. For such application scenarios, in addition to determining that the steady-state noise and the non-steady-state noise satisfy the corresponding specific condition as described above, the present application may further determine whether an abnormal frequency signal exists in the steady-state noise, and thereby decide whether to perform inverse noise reduction processing on the currently detected steady-state noise.
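
One simple way to flag an abnormal frequency signal is to compare each spectral bin of the newest frame against its long-term average; the ratio and the framing are assumptions for illustration, not part of this application.

```python
import numpy as np

def abnormal_frequencies(steady_frames, fs, ratio=4.0):
    """Return the frequencies (Hz) whose magnitude in the newest frame exceeds
    `ratio` times their long-term average over the earlier frames.

    steady_frames: 2-D float array of windowed frames (rows ordered in time);
    fs: sampling rate in Hz; ratio: assumed tuning constant for what counts as
    a sudden, abnormal change in otherwise steady-state noise.
    """
    spectra = np.abs(np.fft.rfft(steady_frames, axis=1))
    baseline = spectra[:-1].mean(axis=0) + 1e-12   # long-term per-bin average
    flagged_bins = np.flatnonzero(spectra[-1] > ratio * baseline)
    return np.fft.rfftfreq(steady_frames.shape[1], d=1.0 / fs)[flagged_bins]
```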

It should be noted that, the first audio meeting the specific condition is not limited to what is described above, and may be determined according to an application scenario of the electronic device, and the detailed description of the application is not provided herein.

Step S13 of obtaining an initial inverse first audio for the extracted first audio based on a relative positional relationship among a sound source of the first audio, an audio playing device, and an audio receiving object;

in combination with the description of the corresponding portion of the above embodiment, distances from the sound source of the first audio and the audio playing device to the audio receiving object are often different, and the audio playing device and the audio receiving object are often at a certain distance, so that the audio signal directly output by the audio player is transmitted to the audio receiving object, and signal attributes are often changed to a certain extent, for example, signal intensity is reduced due to signal attenuation, and a certain difference may exist between a phase of the audio signal directly output by the audio player and a phase of the audio signal transmitted to the audio receiving object by the audio player.

Therefore, in order to ensure that the noise signal to be cancelled and the inverse noise signal used to cancel it arrive at the audio receiving object in opposite phase and can cancel each other, the relative positional relationship among the sound source of the first audio, the audio playing device, and the audio receiving object needs to be taken into account when performing the inverse processing on the extracted first audio. This ensures that, when the audio playing device outputs the calculated initial inverse first audio, the inverse cancellation of the first audio is achieved at the position of the audio receiving object.

It can be understood that if the distance between the audio receiving object and the audio playing apparatus is very small and smaller than the specific distance threshold, the audio signal received by the audio receiving object may be considered to be the same as the audio signal directly played by the audio playing apparatus, in this case, the initial inverse first audio obtained by the present application may be an audio with an opposite phase to the first audio. It can be seen that, for different application scenarios, the phase difference between the obtained initial inverse first audio and the extracted first audio may be different, and in the present application, the corresponding initial inverse first audio needs to be specifically calculated according to the relative position relationship among the sound source of the first audio, the audio playing device, and the audio receiving object in the specific application scenario, and the specific inverse calculation process is not described in detail in the present application.

Step S14, the initial inverse first audio is sent to the audio playing device for output, so that the target inverse first audio transmitted to the audio receiving object and the first audio cancel each other.

As described above, the difference between the distance from the sound source of the first audio to the audio receiving object and the distance from the audio playing device to the audio receiving object is taken into account in the inverse calculation performed on the extracted first audio. If the audio that the audio playing device outputs as the initial inverse first audio is, once transmitted to the audio receiving object, denoted as the target inverse first audio, it can be ensured that the target inverse first audio and the first audio transmitted to the audio receiving object cancel each other, so that the audio receiving object is no longer disturbed by the noise of the first audio.

For example, assume that a first distance between the sound source of the first audio and the audio receiving object is denoted as d1, and a second distance between the audio playing device and the audio receiving object is denoted as d2. In the same environment the propagation speed of an audio signal is essentially fixed, so the time taken for the audio signal to travel the first distance d1 is t1, and the time taken to travel the second distance d2 is t2.

In practical applications, if d1 is smaller than d2, i.e., the audio receiving object is closer to the sound source than to the audio playing apparatus, the time t1 taken for the first audio generated by the sound source to reach the audio receiving object is smaller than the time t2 taken for the initial inverse first audio output by the audio playing apparatus to reach it, and during the period (t2-t1) the audio receiving object still suffers the noise of the first audio. From the end of that period onward, however, the target inverse first audio arrives at the audio receiving object and the first audio arriving there is cancelled.

On the contrary, if d1 is greater than d2, i.e., the audio receiving object is closer to the audio playing device than to the sound source, then t1 is greater than t2. If the initial inverse first audio is obtained as described above and the audio playing device plays it immediately, then, for signals emitted at the same moment, the initial inverse first audio output by the audio playing device reaches the audio receiving object earlier than the first audio generated by the sound source, specifically earlier by the duration (t1-t2), as constrained by the above positional relationship. The initial inverse first audio arriving during that duration is itself an unwanted noise signal for the audio receiving object, which does not need to receive it.

Therefore, when d1 is greater than d2, the initial inverse first audio sent to the audio playing device can be delayed, ensuring that when the played initial inverse first audio reaches the audio receiving object, the first audio generated by the sound source also reaches it, so that the two audio paths arriving at the audio receiving object can cancel each other and the noise interference is reduced. The specific duration of the delay of the initial inverse first audio may be determined according to parameters such as the difference between the first distance d1 and the second distance d2 and the audio propagation speed; the detailed implementation of the delay processing is not described in this application.
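
The delay reasoning above amounts to a few lines once the distances and the speed of sound are known; the sketch below assumes free-field propagation at a nominal 343 m/s and is only illustrative.

```python
SPEED_OF_SOUND = 343.0  # m/s, assumed

def anti_noise_delay_seconds(d1, d2, c=SPEED_OF_SOUND):
    """How long to hold back the initial inverse first audio.

    d1: distance from the noise source to the audio receiving object (m);
    d2: distance from the audio playing device to the audio receiving object (m).
    When d1 > d2 the anti-noise would otherwise arrive (d1 - d2) / c seconds
    too early, so playback is postponed by that amount; when d1 <= d2 no delay
    is applied here.
    """
    t1, t2 = d1 / c, d2 / c
    return max(0.0, t1 - t2)

# Example: noise source 2.0 m away, speaker 0.5 m away -> roughly 4.4 ms delay.
print(f"{anti_noise_delay_seconds(2.0, 0.5) * 1000:.1f} ms")
```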

In summary, in the embodiment of the present application, when it is detected that the audio information acquired by the electronic device contains first audio satisfying a specific condition, the electronic device may extract that first audio in order to eliminate its noise interference. An initial inverse first audio for the first audio is then obtained based on the relative positional relationship among the sound source of the first audio, the audio playing device, and the audio receiving object, and is sent to the audio playing device for output, so that the target inverse first audio transmitted to the audio receiving object and the first audio can cancel each other out. This achieves active noise reduction of the first audio acquired by the electronic device, avoids noise interference of the first audio with the user using the electronic device, and improves the user's experience with the electronic device.

Referring to fig. 6, a signaling flow diagram of another optional example of the audio information processing method proposed in the present application, this embodiment may be an optional detailed implementation of the audio information processing scheme described in the foregoing embodiment. In combination with the above description of the composition of the electronic device, this embodiment can reduce the power consumption of the electronic device while implementing active noise reduction, mainly the power consumed in the audio detection process. The embodiment of the present application may be applied to, but is not limited to, the electronic device shown in fig. 3 and fig. 4. As shown in fig. 6, the audio information processing method proposed in this embodiment may include:

step S21, the first processor acquires audio information;

in conjunction with the description of the corresponding parts of the above embodiments, for a first processor and a second processor in an electronic device, a first power consumption generated by the operation of the first processor is smaller than a second power consumption generated by the operation of the second processor within the same time. In some scenarios, the value of the first power consumption may be much smaller than the value of the second power consumption, for example, the power consumption difference between the second power consumption and the first power consumption is greater than a power consumption threshold (which is a relatively large value and can be determined according to the power consumption value consumed by the electronic device during operation, and the size of the value is not limited by the present application), and the like.

In some embodiments, the audio capture device of the electronic device performs audio capture, and the captured audio information reaches the first processor either actively (the capture device pushes it) or passively (the first processor reads the audio information captured by the audio capture device). In some scenarios the audio capture device may also send the audio information to the second processor, actively or passively, so that the second processor can perform complex analysis on the corresponding audio information; of course, the audio capture device or the first processor may instead send the audio information to the second processor only when the audio information satisfies a certain condition. The stage at which the audio information is sent to the second processor is not limited in the present application.

The audio information collected by the electronic device may include at least one type of audio, and generally includes multiple types of audio because the operating environment of the electronic device is not an absolutely quiet environment, but the audio type included in the audio information is not limited in the present application, and may be determined according to the situation.

Step S22, the first processor detects whether the audio information contains a first audio meeting a specific condition, and if not, continues to detect the acquired audio information; if yes, go to step S23;

in order to avoid the problem of high power consumption caused by the fact that the second processor directly performs real-time analysis on the audio information acquired by the electronic device, the embodiment of the present application selects the first processor with the characteristic of low power consumption to perform real-time detection on the audio information so as to determine whether the acquired audio information includes the first audio meeting the specific condition, and the specific detection implementation process may refer to but is not limited to the description of the corresponding part in the above embodiment, which is not described herein again.

After the detection and analysis, the first processor determines that the acquired audio information contains the first audio meeting specific conditions, and then informs the second processor to work to perform fine detection on the audio information; on the contrary, if it is determined that the acquired audio information does not include the first audio satisfying the specific condition, the second processor may not be notified to detect the audio information, that is, for such audio information that does not include the first audio satisfying the specific condition, the second processor may not be used to perform detection analysis, which reduces power consumption caused by the second processor performing detection analysis on such audio information.

Step S23, the first processor sends noise reduction prompt information to the second processor;

because the noise reduction prompt information is mainly used to inform the second processor to perform complex detection and analysis on the corresponding audio information, in order to ensure that the second processor accurately identifies the audio information containing the first audio, the noise reduction prompt information may include time information corresponding to the acquired audio information, the detection result indicating that the audio information contains the first audio, and the like. The second processor can then determine the audio information corresponding to that time information from the multiple received pieces of audio information and complete the subsequent detection and analysis of it.

Of course, in still other embodiments, if the first processor determines whether to send the obtained audio information to the second processor for further subsequent processing according to the detection result after completing the detection on the obtained audio information, the noise reduction prompt information sent from the first processor to the second processor may not include the time information, and the second processor may also accurately know which audio information needs to be subjected to complex analysis. It can be seen that, in different application scenarios, the contents of the noise reduction prompt information sent by the first processor may be different, and the contents contained in the noise reduction prompt information are not limited in the present application, and may be determined according to the circumstances.

In a possible implementation manner, the noise reduction prompt information may take the form of a high/low level signal, an event, an instruction, or the like; upon receiving the noise reduction prompt information, the second processor may process the corresponding audio information according to a preset audio analysis method.
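Purely as an illustration of the kind of content such prompt information could carry when it is richer than a simple level signal, a hypothetical message structure might look as follows (the field names are assumptions, not part of the present application):

    from dataclasses import dataclass

    @dataclass
    class NoiseReductionPrompt:
        audio_timestamp_ms: int      # time information identifying the audio segment concerned
        contains_first_audio: bool   # detection result produced by the low-power first processor
        detail: str = ""             # optional detail, e.g. the detected noise category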

Step S24, the second processor obtains the audio information, receives the noise reduction prompt information, and extracts a first audio included in the audio information;

as discussed above, the execution steps of the second processor to obtain the audio information are not limited in the embodiments of the present application, and may be determined as the case may be.

In some embodiments, the first processor may include an NPU and the second processor may include an AI chip, although the two processors are not limited to these examples; this configuration is used here only as an example. When the NPU performs a preliminary analysis on the audio information and determines that it contains the first audio, the second processor performs a complex analysis on the audio information based on AI technology, for example extracting the first audio meeting the specific condition from the acquired audio information by means of a neural network, a sound source localization algorithm, and the like.
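As a rough sketch of this two-stage division of labour (the threshold, the function names and the externally supplied extractor are illustrative assumptions only):

    from typing import Callable, Optional
    import numpy as np

    def coarse_detect_first_audio(frame: np.ndarray, energy_threshold: float = 1e-3) -> bool:
        """Stand-in for the low-power first processor: flag a frame whose mean
        energy exceeds an (illustrative) threshold as possibly containing the first audio."""
        return float(np.mean(frame ** 2)) > energy_threshold

    def process_frame(frame: np.ndarray,
                      extract_first_audio: Callable[[np.ndarray], np.ndarray]) -> Optional[np.ndarray]:
        """Invoke the costly extraction (stand-in for the second processor's AI-based
        analysis, e.g. a neural network) only when the coarse detector fires."""
        if not coarse_detect_first_audio(frame):
            return None                      # the second processor stays idle for this frame
        return extract_first_audio(frame)    # complex extraction runs only when needed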

Step S25, the second processor obtains an initial inverse first audio for the extracted first audio based on the relative positional relationship among the sound source of the first audio, the audio playing device, and the audio receiving object;

in step S26, the second processor sends the initial inverse first audio to the audio playing device for output, so that the target inverse first audio transmitted to the audio receiving object and the first audio cancel each other out.

For specific implementation processes of step S25 and step S26, reference may be made to the description of corresponding parts in the foregoing embodiments, which are not described herein again.

In summary, in the embodiment of the present application, the first processor with low power consumption is used to perform real-time analysis on the collected audio information, and when it is determined that the collected audio information includes the first audio meeting the specific condition, the second processor with high power consumption is notified to perform the first audio extraction on the corresponding audio information, so as to construct the corresponding initial inverse first audio, which is output by the audio playing device, so as to ensure that the target inverse first audio and the first audio, which are transmitted to the audio receiving object, can cancel each other, thereby implementing active noise reduction on the first audio, and avoiding noise interference of the first audio on the audio receiving object.

If the first processor determines that the acquired audio information does not contain the first audio, the second processor does not need to perform complex analysis on that audio information; compared with having the second processor complete the whole audio information processing process directly, the power consumption generated by the second processor for audio processing is greatly reduced.

It should be noted that, in practical application of the present application, according to application requirements in different scenes, the audio information processing method provided in the present application may be executed by the second processor in the electronic device, so as to achieve the purpose of actively reducing noise of the first audio transmitted to the audio receiving object according to the above manner; if the power consumption needs to be reduced in the application scenario, the audio information processing method provided by the present application may be implemented by the first processor and the second processor in a manner described in the above embodiment, so as to reduce the power consumption of the system, and the specific implementation process is not described in detail in the present application.

Referring to fig. 7, a schematic flow chart of yet another optional example of the audio information processing method proposed in the present application, this embodiment may be a further optional detailed implementation of the audio information processing scheme described in the foregoing embodiment and mainly takes the case in which the first audio is stationary noise as an example. The specific application scenario for removing stationary noise, and the specific audio content and category of the stationary noise, are not limited and may be determined as the case may be. Based on this, as shown in fig. 7, the method may include:

step S31, acquiring audio information;

for a specific implementation process of step S31, reference may be made to the description of the corresponding parts in the foregoing embodiments, which is not described in detail in this embodiment.

Step S32, detecting that the audio information contains stationary noise, and extracting the stationary noise from the audio information;

in this embodiment of the application, in different application scenarios, a first processor or a second processor may perform signal feature analysis on acquired audio information, determine respective noise types of different audios included in the audio information, and if the audio information includes audio of the type of stationary noise, directly determine the stationary noise as the first audio.

The signal characteristic analysis for audios of different noise categories can be designed around the characteristics of the corresponding noise signals. If attribute values of a certain audio signal path, such as signal amplitude and frequency, vary with time but remain essentially within a fixed range, that path can be regarded as steady-state noise; if the attribute values of a path vary irregularly with time, that path can be regarded as non-steady-state noise. The electronic device can therefore identify the noise category by analyzing how the attribute values of each audio path in the obtained audio information change over time. The application is not limited to these two noise categories and identification methods; in some application scenarios the noise category may also be determined in combination with the position of the noise source. For example, for capacitive electrical noise, if sound source localization shows that the sound source is located in the area where the electronic device itself is, the audio generated by that source can be regarded as capacitive electrical noise. Details are not repeated here.
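For illustration, a minimal sketch of such an attribute-based split could judge a signal path by how much its short-time RMS amplitude fluctuates (the frame length and threshold are illustrative assumptions):

    import numpy as np

    def classify_noise(signal: np.ndarray, frame_len: int = 1024,
                       stability_threshold: float = 0.2) -> str:
        """Rough steady-state / non-steady-state decision based on the relative
        variation of the short-time RMS envelope over time."""
        n_frames = max(1, len(signal) // frame_len)
        frames = signal[:n_frames * frame_len].reshape(n_frames, -1)
        rms = np.sqrt(np.mean(frames ** 2, axis=1)) + 1e-12
        fluctuation = np.std(rms) / np.mean(rms)   # small value -> stable envelope
        return "steady-state" if fluctuation < stability_threshold else "non-steady-state"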

It should be noted that steady-state noise itself covers multiple categories, such as electronic noise generated by the electronic device itself and noise generated by specific categories of household appliances working around the electronic device. In different application scenarios, all steady-state noise may need noise reduction processing, or only steady-state noise of specific categories may need to be processed. Therefore, when it is determined that the audio information contains steady-state noise, all or part of the steady-state noise contained in the audio information, i.e. the steady-state noise requiring noise reduction processing, may be extracted according to the noise reduction requirements of the application scenario and the category of each steady-state noise; the specific implementation process is not described in detail in the present application.

Step S33, obtaining the first sound source position of the steady-state noise, the sound outlet position of the audio playing device and the sound receiving position of the audio receiving object;

as described above, in the case that the first audio is stationary noise, the embodiment of the present application may determine that the acquired audio information contains stationary noise and that inverse noise reduction processing needs to be performed on it. Because anti-phase cancellation of the stationary noise must be achieved at the sound receiving position of the audio receiving object, and the phase deviation caused by transmitting an audio signal over a longer distance must be taken into account, the relative positional relationship among the sound source of the stationary noise (i.e. the first audio), the audio playing device and the audio receiving object is obtained.

Specifically, sound source localization can be performed on each audio path collected by the audio collection device to obtain the sound source position of the steady-state noise, for example by using the attribute values of the audio paths collected by an array microphone together with the positional relationship between the microphones in the array; the concrete implementation of sound source localization is not described in detail here.
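As a simplified sketch of the underlying idea (two microphone channels only; a practical array and localization algorithm may differ considerably), the time difference of arrival between microphones can be estimated by cross-correlation and converted into an arrival angle using the known microphone spacing:

    import numpy as np

    def estimate_tdoa(mic_a: np.ndarray, mic_b: np.ndarray, fs: int) -> float:
        """Estimate the time difference of arrival (in seconds) between two
        microphone channels from the peak of their cross-correlation."""
        corr = np.correlate(mic_a, mic_b, mode="full")
        lag = int(np.argmax(corr)) - (len(mic_b) - 1)
        return lag / fs

    def angle_from_tdoa(tdoa: float, mic_spacing_m: float, c: float = 343.0) -> float:
        """Far-field approximation: cos(theta) = c * tdoa / spacing, with theta in
        radians relative to the axis joining the two microphones."""
        return float(np.arccos(np.clip(c * tdoa / mic_spacing_m, -1.0, 1.0)))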

In addition, the sound outlet position of the audio playing device and the sound receiving position of the audio receiving object (such as the ear position of the user, the sound receiving hole positions of other audio receivers, and the like) need to be acquired. In order to ensure consistency of the acquired position information, the sound source position, the sound outlet position and the sound receiving position can be determined and expressed relative to the same position reference point. The position reference point can be a certain position in the electronic equipment, and its specific location is not limited by the application.

In different application scenarios, the sound receiving position of the audio receiving object may be determined according to the category of the audio receiving object. For example, if the audio receiving object is a user, the sound receiving position is the user's ear; it may then be determined by statistically measuring the distance between the ear and the position reference point while typical users operate the electronic device, and used as a parameter of the audio information processing application. If needed, the user may also adjust it individually so that it better matches the user's actual listening position, further improving the audio information processing effect.

Step S34, according to the first sound source position, the sound outlet hole position and the sound receiving position, carrying out phase inversion processing on the steady-state noise to obtain initial phase-inversion steady-state noise;

in some embodiments, based on position information such as the first sound source position, the sound outlet position and the sound receiving position, it is determined whether the distance between the sound outlet position and the sound receiving position (i.e. the second distance) falls within a certain range. If the second distance is smaller than a certain distance threshold, the audio played from the sound outlet can be considered to have the same attribute information and content as the audio arriving at the sound receiving position of the audio receiving object, with no phase-shift problem. In this case the extracted steady-state noise can be directly inverted to obtain an initial inverse steady-state noise whose phase differs from the steady-state noise by 180°; this is also the target inverse steady-state noise transmitted to the sound receiving position of the audio receiving object, so the corresponding steady-state noise transmitted to that position is cancelled in anti-phase and interference from this steady-state noise is avoided.

In still other embodiments, if the second distance between the sound output hole position and the sound receiving position exceeds the above range, a certain phase deviation may occur due to the transmission of the audio signal in a long distance, so that a phase deviation may exist between the audio directly output by the sound output hole of the audio playing device and the audio transmitted to the sound receiving position of the audio receiving object.

Therefore, in order to achieve the above object, when constructing the anti-phase stationary noise the size of the second distance between the sound outlet position and the sound receiving position is taken into account, so that a certain phase difference exists between the constructed initial anti-phase stationary noise (i.e. the initial inverse first audio) and the extracted stationary noise. This phase difference is not necessarily 180° and may be determined according to the phase deviation accumulated while the audio played by the audio playing device travels to the sound receiving position. After this processing, the played initial inverse stationary noise undergoes the expected phase shift during transmission, and the target inverse stationary noise arriving at the sound receiving position can cancel the stationary noise in anti-phase.

In still other embodiments, in combination with the above embodiments, if the second distance is smaller than the first distance, a delay process may be further required in the process of constructing the initial inverse steady-state noise, so as to avoid noise interference caused by the played initial inverse steady-state noise to the audio receiving object. For a specific implementation process of the delay processing, reference may be made to the description of the corresponding part in the foregoing embodiment, which is not described herein again.
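A minimal sketch combining the two points above, under the simplifying assumption that the outlet-to-reception path mainly introduces a propagation delay (frequency-dependent effects and the exact phase correction described above are ignored), might look like this:

    import numpy as np

    def initial_inverse_noise(stationary_noise: np.ndarray, d1: float, d2: float,
                              fs: int, c: float = 343.0) -> np.ndarray:
        """Build the initial inverse steady-state noise: flip the polarity and,
        when the second distance d2 is smaller than the first distance d1,
        delay the result so that it meets the noise at the sound receiving position."""
        inverse = -stationary_noise
        delay_samples = max(0, int(round((d1 - d2) / c * fs)))
        if delay_samples:
            padded = np.concatenate([np.zeros(delay_samples), inverse])
            inverse = padded[:len(stationary_noise)]
        return inverse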

It is to be understood that, in the case where the extracted stationary noise includes a plurality of stationary noises, the constructed initial inverse stationary noise may include an initial inverse stationary noise for each stationary noise to ensure the overall noise reduction processing for the plurality of extracted stationary noises.

Step S35, mixing the initial reversed-phase steady-state noise and the audio to be played to obtain mixed audio;

step S36, sending the mixed audio to an audio playing device for output, so that the target inverse steady-state noise in the mixed audio transmitted to the sound reception position of the audio receiving object and the steady-state noise cancel each other.

In the embodiment of the application, the content of the audio to be played may be determined by the specific application scenario. For example, when a user controls the electronic device to play music and listens with an earphone, or in application scenarios such as an audio-video conference or an online lesson, the audio to be played may be the music to be played, the speech of other participants, the audio of the lesson, and the like.

It can be understood that, in an application scenario where a user listens to audio with an earphone, after the audio codec obtains the initial inverse first audio, if there is audio to be played at that time, the two audio streams may be mixed and then sent to the earphone for playing. Of course, if the user does not use an earphone but the audio player of the electronic device plays the audio, the sound receiving position of the audio receiving object may still be the ear position of the user. In other application scenarios the audio receiving object is not limited to a user, and the sound receiving position is then not the ear position of a user; this may be determined as the case may be.

In addition, if there is no audio to be played in the electronic equipment, the initial inverse steady-state noise can be sent directly to the audio playing device for output, and it can still be guaranteed that the target inverse steady-state noise transmitted to the sound receiving position of the audio receiving object and the steady-state noise cancel each other.
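A simple mixing sketch (assuming both streams share the same sampling rate and a normalized amplitude range; names are illustrative):

    from typing import Optional
    import numpy as np

    def mix_for_playback(inverse_noise: np.ndarray,
                         audio_to_play: Optional[np.ndarray]) -> np.ndarray:
        """Mix the initial inverse steady-state noise with the audio to be played;
        if there is nothing to play, output the inverse noise on its own."""
        if audio_to_play is None:
            return inverse_noise
        n = min(len(inverse_noise), len(audio_to_play))
        mixed = inverse_noise[:n] + audio_to_play[:n]
        return np.clip(mixed, -1.0, 1.0)   # keep the mix inside the valid sample range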

In summary, in the embodiment of the present application, when the electronic device determines that the acquired audio information contains steady-state noise, in order to eliminate the noise interference of the steady-state noise on the audio receiving object, it may acquire the first sound source position of the steady-state noise, the sound outlet position of the audio playing device and the sound receiving position of the audio receiving object, perform phase inversion processing on the extracted steady-state noise accordingly, mix the obtained initial inverse steady-state noise with the audio to be played, and send the mixed audio to the audio playing device for output. The target inverse steady-state noise in the mixed audio transmitted from the audio playing device to the sound receiving position of the audio receiving object then cancels the steady-state noise there, which avoids noise interference of the steady-state noise on the audio receiving object and enables the audio receiving object to reliably receive the audio to be played.

It can be understood that, in the above audio information processing process, in order to reduce the system power consumption of the electronic device as much as possible, a first processor with low power consumption may be used to perform audio real-time analysis, and when it is determined that steady-state noise exists in the audio information, a second processor with high power consumption performs complex analysis on the audio information to avoid power consumption caused by the complex analysis performed by the second processor on the audio information that does not include steady-state noise.

In practical application of the present application, under different application scenarios the first audio meeting the specific condition is not limited to the presence of the stationary noise described in the above embodiment. It may also be required that the stationary noise and the non-stationary noise in the acquired audio information satisfy a specific condition. In that case, merely determining that the acquired audio information contains stationary noise is not enough to decide that noise reduction processing is needed; it must further be determined whether the stationary noise and the non-stationary noise in the audio information satisfy the specific condition, for example whether the signal intensity difference between the stationary noise and the non-stationary noise is greater than a stationary noise reduction threshold, so as to determine that the stationary noise can interfere with the audio receiving object. The stationary noise reduction threshold is determined based on the signal intensity of the ambient noise of the electronic device; for the specific implementation process, reference may be made to the description of the corresponding parts of the foregoing embodiment, which is not repeated in detail here.

In yet another possible implementation manner, in a case where it is determined that stationary noise and non-stationary noise in the acquired audio information satisfy a specific condition, it may be further determined whether the detected stationary noise is abnormal, so as to determine whether the stationary noise needs to be denoised. It can be seen that, under different application scenarios, the content of the first audio meeting the specific condition may be different, and the present application may be determined according to the specific noise cancellation requirement of the application scenario, which is not listed one by one, and only this further possible implementation manner is taken as an example, and an applicable audio information processing procedure is described herein.

Referring to fig. 8, a schematic flow chart of yet another alternative example of the audio information processing method proposed by the present application, this embodiment differs from the foregoing embodiment in the content of the first audio meeting the specific condition. The processing after the first audio is extracted is similar, and reference may be made to the description of the corresponding parts of the above embodiments, which is not repeated in detail here. Thus, as shown in fig. 8, the method may include:

step S41, acquiring audio information;

step S42, acquiring stationary noise and non-stationary noise contained in the audio information;

step S43, acquiring a signal intensity difference between the stationary noise and the non-stationary noise;

step S44, determining that the signal intensity difference is larger than a steady-state noise reduction threshold value and the steady-state noise has abnormal frequency signals, and determining the steady-state noise as a first audio frequency;

in the embodiment of the present application, steady-state noise and non-steady-state noise included in the audio information may be extracted, and if the steady-state noise is covered by the non-steady-state noise, it may be considered that the steady-state noise does not cause interference to the audio receiving object, and it may not be necessary to perform inverse denoising processing on the steady-state noise and the non-steady-state noise according to the present application.

If, in the above manner, the obtained signal intensity difference is determined to be greater than the steady-state noise reduction threshold, that is, the steady-state noise and the non-steady-state noise satisfy the specific condition, the steady-state noise can be considered strong enough to interfere with the audio receiving object. In some embodiments the steady-state noise may then be used directly as the first audio for subsequent processing without further detection. In the embodiment of the present application, however, it is further checked whether the steady-state noise is abnormal: normal steady-state noise does not need to be filtered by the inverse denoising processing of the present application and may be left unprocessed or handled with other filtering methods, while abnormal steady-state noise is further processed with the audio information processing method provided by the present application, so as to ensure reliable anti-phase cancellation of the abnormal steady-state noise at the sound receiving position of the audio receiving object.

Therefore, in the embodiment of the present application, when the obtained signal intensity difference is determined to be greater than the steady-state noise reduction threshold, it may be further analyzed whether the extracted steady-state noise contains an abnormal frequency signal. Specifically, the spectrogram of the steady-state noise may be analyzed; if a frequency component with an abrupt amplitude exists, the steady-state noise is determined to be abnormal and is determined as the first audio, and the subsequent steps are performed.
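For illustration, the combined condition could be sketched as follows (the threshold values and the peak criterion are assumptions, not values given in the present application):

    import numpy as np

    def steady_noise_is_first_audio(steady: np.ndarray, non_steady: np.ndarray,
                                    noise_reduction_threshold_db: float = 6.0,
                                    peak_factor: float = 10.0) -> bool:
        """True when the steady-state noise is both strong enough relative to the
        non-steady-state noise and contains an abnormally prominent frequency component."""
        def level_db(x: np.ndarray) -> float:
            return 10.0 * np.log10(np.mean(x ** 2) + 1e-12)

        if level_db(steady) - level_db(non_steady) <= noise_reduction_threshold_db:
            return False                                  # masked by non-steady-state noise
        spectrum = np.abs(np.fft.rfft(steady))
        return bool(spectrum.max() > peak_factor * np.median(spectrum))  # abrupt spectral peak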

It can be understood that, in the above different application scenarios, it is determined that the obtained audio information does not include the first audio meeting the specific condition according to the corresponding detection manner, and the subsequent obtained audio information can be continuously detected, and the specific implementation process may be determined in combination with the application scenario for the definition content of the first audio meeting the specific condition, which is not described in detail herein.

Step S45 of obtaining an initial inverse first audio for the extracted first audio based on a relative positional relationship among a sound source of the first audio, an audio playing device, and an audio receiving object;

step S46, the initial inverse first audio is sent to the audio playing device for output, so that the target inverse first audio transmitted to the audio receiving object and the first audio cancel each other.

For specific implementation processes of step S45 and step S46, reference may be made to the description of corresponding parts in the foregoing embodiments, which are not described herein again.

In summary, in the embodiment of the present application, after the electronic device acquires the audio information, it determines that the signal intensity difference between the steady-state noise and the non-steady-state noise contained in the audio information is greater than the steady-state noise reduction threshold, i.e. the signal intensity of the steady-state noise is sufficient to interfere with the audio receiving object, and further determines that the steady-state noise contains an abnormal frequency signal, so anti-phase cancellation of the steady-state noise is required. The steady-state noise is therefore determined as the first audio, and according to the relative positional relationship among the sound source position of the first audio, the audio playing device and the audio receiving object, the corresponding initial inverse first audio is obtained and output by the audio playing device. The target inverse first audio transmitted to the audio receiving object can then cancel the first audio, ensuring that the audio receiving object is not disturbed by it.

In addition, in combination with the description of the other embodiments, in order to reduce system power consumption, the audio information processing process in this embodiment may be implemented by matching a first processor and a second processor in an electronic device, and specific implementation processes of this embodiment are not described herein again.

In still other embodiments provided by the present application, the first audio meeting the specific condition is not limited to steady-state noise; it may also be preset user audio, so that voice control of the electronic device is implemented based on the first audio and the system, the voice recognition engine, and the like of the electronic device can be woken up without the user having to memorize and speak a preset wake-up word, thereby meeting the user's control requirements for the electronic device.

Based on the above analysis, referring to fig. 9, a flowchart of yet another alternative example of the audio information processing method proposed by the present application is different from the definition content of the above detailed embodiment on the first audio, and specifically as shown in fig. 9, the method may include:

step S51, acquiring audio information;

step S52, detecting that the audio information contains user audio, and positioning the sound source of the user audio to obtain the position of the speaker of the user audio;

step S53, determining that the position relation between the position of the speaker and the electronic equipment meets a specific condition, and determining the user audio as a first audio;

in combination with the description of the corresponding part of the electronic device embodiment, in a wake-up-word-free voice interaction scenario, the electronic device may determine, from the user audio, the relative positional relationship between the electronic device and the speaker in order to judge whether the user intends to use the electronic device, and then perform complex analysis on the user audio only when it is determined that such an intention exists.

Based on this, after the audio information is collected in real time by the audio collecting device, sound source localization is performed on the user audio contained in the audio information to determine whether the positional relationship between the speaker position and the electronic equipment meets a specific condition, for example whether the user is speaking while facing the electronic equipment.
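The positional condition itself can be as simple as a check on the localized arrival angle and distance, as in the following sketch (both limits are illustrative assumptions):

    import numpy as np

    def speaker_position_condition_met(arrival_angle_rad: float, distance_m: float,
                                       max_angle_rad: float = np.pi / 6,
                                       max_distance_m: float = 2.0) -> bool:
        """Treat the speech as directed at the device when it arrives roughly from
        the front and from nearby; otherwise assume no intention to use the device."""
        return abs(arrival_angle_rad) <= max_angle_rad and distance_m <= max_distance_m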

Step S54, extracting a first audio included in the audio information;

step S55, based on the first audio, the identity authentication is carried out on the speaker of the first audio;

in step S56, if the identity authentication is passed, the speech recognition engine of the electronic device is awakened, and the speech recognition engine executes the speech command included in the first audio.

In practical applications, in order to reduce the power consumption of the system, especially when the electronic device is in a low power consumption state, the first processor (e.g., NPU integrated in the array audio collector, etc.) may execute the above steps S51 to S53, and notify the second processor to execute the subsequent steps, so as to solve the problem of high power consumption of the system caused by the real-time execution of the steps S51 to S53 by the second processor with high power consumption.

After the electronic device extracts the first audio (i.e., the user audio) contained in the audio information, voiceprint recognition technology can be used, for example, to authenticate the identity of the speaker of the first audio; the specific authentication implementation is not described in detail. For example, the extracted first audio can be input directly into a pre-built identity recognition model to determine whether the speaker of the first audio is a preset legitimate user of the electronic device. If so, the operating system and the voice recognition engine of the electronic equipment are woken up, the cached user audio is sent to the operating system, and the voice recognition engine executes the corresponding voice instruction. It is understood that, to ensure reliable authentication, the first audio acquired over a continuous period of time (e.g., 2 seconds or more) may be processed.
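As a sketch of such a voiceprint-style check (the embedding function and the similarity threshold are assumptions; any concrete identity recognition model could take their place):

    import numpy as np

    def authenticate_speaker(first_audio: np.ndarray, enrolled_embedding: np.ndarray,
                             embed, threshold: float = 0.75) -> bool:
        """Embed the extracted first audio with a pre-built speaker model (`embed`)
        and accept the speaker if the cosine similarity to the enrolled legitimate
        user's embedding is high enough."""
        e = embed(first_audio)
        denom = np.linalg.norm(e) * np.linalg.norm(enrolled_embedding) + 1e-12
        return float(np.dot(e, enrolled_embedding) / denom) >= threshold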

Through the detection and analysis, if the position relation between the position of the speaker and the electronic equipment does not meet a specific condition, the subsequently acquired audio information can be continuously analyzed; if the identity authentication fails, the identity authentication processing may be continued on the subsequently acquired audio information, and after the identity authentication fails for a preset number of times, the electronic device is locked to improve the use security of the electronic device, but the method is not limited to the security measures described in this embodiment.

In addition, in practical application, the collected audio information may only or mainly include the first audio, and in this case, the electronic device may not extract the first audio from the collected audio information any more, and may complete subsequent processing on the audio information as the first audio; if the environment in which the user is located is noisy, the directly collected audio information can be combined with a noise reduction technology to obtain a clearer and cleaner first audio so as to improve the reliability of the voice control of the electronic equipment.

In summary, in the embodiment of the present application, a user can directly speak to the electronic device according to an application requirement without configuring a wake-up word of an operating system and a voice recognition engine of the electronic device in advance, the electronic device determines that the collected audio information includes a user audio, performs sound source localization on the user audio, determines that a position relationship between a position of a speaker and the electronic device satisfies a specific condition, uses the user audio as a first audio, performs identity authentication on the speaker accordingly, and if the identity authentication passes, directly wakes up a system and the voice recognition engine of the electronic device, sends the user audio to the voice recognition engine, determines and executes a corresponding voice instruction, thereby satisfying a voice control requirement on the electronic device.

Based on the technical idea of the audio information processing method described in the above embodiments, the following will describe the composition of a virtual device that implements the audio information processing method, but is not limited to the composition described in the device embodiments below.

Referring to fig. 10, a schematic structural diagram of an alternative example of the audio information processing apparatus proposed in the present application, which may be applied to the electronic device, as shown in fig. 10, may include:

the audio information acquisition module 11 is configured to acquire audio information;

a first audio extracting module 12, configured to extract a first audio included in the audio information when the audio information includes the first audio that meets a specific condition;

an inverse processing module 13, configured to obtain an initial inverse first audio for the extracted first audio based on a relative positional relationship among a sound source of the first audio, an audio playing device, and an audio receiving object;

and the audio transmission module 14 is configured to send the initial inverse first audio to an audio playing device for output, so that the target inverse first audio transmitted to the audio receiving object and the first audio cancel each other out.

In one possible implementation, the inversion processing module 13 may include:

the position acquisition unit is used for acquiring a first sound source position of the first audio, a sound outlet hole position of the audio playing device and a sound receiving position of an audio receiving object;

the reverse phase processing unit is used for performing reverse phase processing on the first audio frequency according to the first sound source position, the sound outlet hole position and the sound receiving position to obtain an initial reverse phase first audio frequency;

accordingly, the audio transmission module 14 may include:

and the audio mixing output unit is used for mixing the initial reverse phase first audio and the audio to be played and sending the obtained mixed audio to an audio playing device for outputting.

In some embodiments, the first audio extraction module 12 may include:

a first detection unit located in a first processor of the electronic device, for detecting whether the audio information contains a first audio meeting a specific condition;

the first noise reduction prompting unit is positioned in the first processor and used for sending noise reduction prompting information to the second processor under the condition that the detection result of the first detection unit is positive;

based on this, the above-mentioned inverting processing module 13 and the audio transmission module 14 may be located in the second processor of the electronic device, so as to ensure that the electronic device utilizes the first processor and the second processor, and in the audio information processing process, the system power consumption is reduced to the greatest extent possible.

In some embodiments, in order to determine that the audio information includes the first audio satisfying a specific condition, the first audio extracting module 12 may include:

a first determining unit, configured to detect that the audio information includes stationary noise, and determine the stationary noise as a first audio; alternatively,

the first acquisition unit is used for acquiring steady-state noise and unsteady-state noise contained in the audio information;

a second determining unit, configured to determine that the stationary noise and the non-stationary noise satisfy a specific condition, and determine the stationary noise as a first audio; or, a third determining unit, configured to determine that the stationary noise and the non-stationary noise satisfy a specific condition, and the stationary noise has an abnormal frequency signal, and determine the stationary noise as the first audio.

In practical applications of the present application, if the steady-state noise includes electrical noise generated by the electronic device itself, the second determining unit and the third determining unit may each include:

a signal intensity difference acquisition unit configured to acquire a signal intensity difference between the stationary noise and the non-stationary noise;

a fourth determination unit configured to determine that the signal strength difference is greater than a steady-state noise reduction threshold, where the steady-state noise reduction threshold is determined based on a signal strength of the ambient noise of the electronic device.

In still other embodiments, as shown in fig. 11, the first audio extracting module 12 may further include:

the sound source positioning unit 121 is configured to detect that the audio information includes a user audio, perform sound source positioning on the user audio, and obtain a speaker position of the user audio;

a position relation determining unit 122, configured to determine that a position relation between the speaker position and the electronic device satisfies a specific condition, and determine the user audio as a first audio;

correspondingly, the above apparatus may further include:

the identity authentication module 15 is configured to authenticate an identity of a speaker of the first audio based on the first audio;

and the voice control module 16 is configured to wake up a voice recognition engine of the electronic device if the identity authentication result of the identity authentication module 15 is that the identity authentication is passed, and execute a voice instruction included in the first audio by the voice recognition engine.

It should be noted that, various modules, units, and the like in the embodiments of the foregoing apparatuses may be stored in the memory as program modules, and the processor executes the program modules stored in the memory to implement corresponding functions, and for the functions implemented by the program modules and their combinations and the achieved technical effects, reference may be made to the description of corresponding parts in the embodiments of the foregoing methods, which is not described in detail in this embodiment.

Finally, it should be noted that, in the present specification, the embodiments are described in a progressive or parallel manner, each embodiment focuses on differences from other embodiments, and the same and similar parts among the embodiments may be referred to each other. The device and the electronic equipment disclosed by the embodiment correspond to the method disclosed by the embodiment, so that the description is relatively simple, and the relevant points can be referred to the method part for description.

The previous description of the disclosed embodiments is provided to enable any person skilled in the art to make or use the present application. Various modifications to these embodiments will be readily apparent to those skilled in the art, and the generic principles defined herein may be applied to other embodiments without departing from the spirit or scope of the application. Thus, the present application is not intended to be limited to the embodiments shown herein but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.
