Sound source positioning method and device

Document No.: 1693830 Publication date: 2019-12-10

Reading note: The technology "Sound source positioning method and device" (声源定位方法和装置) was designed by 夏杰周强 on 2019-09-12. Abstract: The invention discloses a sound source positioning method and device. The method comprises: computing a spatial spectrum from signals received by a microphone array; determining the number of spectral peaks the spatial spectrum has; if the spatial spectrum has a plurality of spectral peaks, forming a plurality of beams in different directions corresponding to the microphone array by using a fixed beamformer, wherein the beams comprise at least a first direction beam and a second direction beam; calculating the energy of the first direction beam, the energy of the second direction beam, and the energy difference between the two beams; determining whether the energy difference is greater than or equal to a preset threshold; and, if the energy difference is greater than or equal to the preset threshold, outputting the angle corresponding to the beam with the larger energy as the direction of arrival. The scheme takes the influence of various interference noises into account and can achieve more accurate sound source positioning.

1. A sound source localization method, comprising:

computing a spatial spectrum from signals received by a microphone array;

determining the number of spectral peaks the spatial spectrum has;

if the spatial spectrum has a plurality of spectral peaks, forming a plurality of beams in different directions corresponding to the microphone array by using a fixed beamformer, wherein the beams in the different directions comprise at least a first direction beam and a second direction beam;

calculating an energy of the first direction beam, an energy of the second direction beam, and an energy difference between the first direction beam and the second direction beam;

determining whether the energy difference is greater than or equal to a preset threshold; and

if the energy difference is greater than or equal to the preset threshold, outputting, as the direction of arrival, the angle corresponding to whichever of the first direction beam and the second direction beam has the larger energy.

2. The method of claim 1, wherein after the determining the number of spectral peaks the spatial spectrum has, the method further comprises:

if the spatial spectrum has only a single spectral peak, outputting the angle corresponding to the single spectral peak as the direction of arrival.

3. The method of claim 2, wherein the method further comprises:

if the energy difference is smaller than the preset threshold, calculating a relative delay of the first direction beam and a relative delay of the second direction beam; and

outputting, as the direction of arrival, the angle corresponding to whichever of the first direction beam and the second direction beam has the smaller relative delay.

4. The method of claim 1, wherein computing the spatial spectrum from the signals received by the microphone array comprises:

obtaining a separation matrix corresponding to the signals received by the plurality of microphones by using independent vector analysis; and

computing the spatial spectrum from the separation matrix corresponding to a wake-up signal capable of waking up the device.

5. The method of claim 4, wherein obtaining the separation matrix corresponding to the signals received by the plurality of microphones by using independent vector analysis comprises:

modeling the signal received by the microphone array as X(t, f) based on a short-time Fourier transform;

filtering the signal received by the microphone array with a separation matrix W(t, f) computed by independent vector analysis to obtain an estimated signal Y(t, f) of the sound source signal, wherein Y(t, f) = W(t, f) X(t, f); and

sending the estimated signal to a wake-up module in the device, and determining the separation matrix corresponding to the wake-up signal.

6. The method of claim 5, wherein the computing the spatial spectrum of the separation matrix corresponding to the wake-up signal comprises:

calculating a spatial covariance matrix by using the separation matrix corresponding to the wake-up signal;

performing eigenvalue decomposition on the calculated spatial covariance matrix, wherein the eigenvector corresponding to the maximum eigenvalue spans the signal space and the remaining eigenvectors form the noise space; and

calculating the spatial spectrum based on the signal space and the noise space.

7. A sound source localization apparatus comprising:

a spatial spectrum calculation module configured to compute a spatial spectrum from signals received by a microphone array;

a spectral peak number determination module configured to determine the number of spectral peaks the spatial spectrum has;

a beam forming module configured to form a plurality of beams in different directions corresponding to the microphone array by using a fixed beamformer if the spatial spectrum has a plurality of spectral peaks, wherein the beams in the different directions comprise at least a first direction beam and a second direction beam;

an energy calculation module configured to calculate an energy of the first direction beam, an energy of the second direction beam, and an energy difference between the first direction beam and the second direction beam;

a judging module configured to determine whether the energy difference is greater than or equal to a preset threshold; and

a first output module configured to output, as the direction of arrival, the angle corresponding to whichever of the first direction beam and the second direction beam has the larger energy if the energy difference is greater than or equal to the preset threshold.

8. The apparatus of claim 7, further comprising:

a second output module configured to output the angle corresponding to a single spectral peak as the direction of arrival if the spatial spectrum has only the single spectral peak.

9. An electronic device, comprising: at least one processor, and a memory communicatively coupled to the at least one processor, wherein the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the steps of the method of any one of claims 1 to 6.

10. A storage medium having a computer program stored thereon, wherein the program, when executed by a processor, carries out the steps of the method of any one of claims 1 to 6.

Technical Field

The invention belongs to the technical field of voice interaction, and particularly relates to a sound source positioning method and device.

Background

In a real scene, when the microphone array is used to locate the speaker direction, interference from other directions, such as interference noise of television, music, etc., is inevitably received. Meanwhile, due to the limitation of power supply, the arrangement position of the microphone array is close to the wall, and the reflected sound waves caused by the wall surface easily influence the positioning accuracy.

In the process of implementing the present application, the inventors found that prior-art solutions have at least the following drawbacks: although the conventional MUSIC (Multiple Signal Classification, a spatial spectrum estimation algorithm) method can locate a plurality of sound sources at the same time, it is difficult to tell which of the located angles corresponds to the wake-up direction. The conventional GCC-PHAT (Generalized Cross-Correlation with Phase Transform) method likewise has difficulty with the inaccurate positioning caused by wall reflection.

Disclosure of Invention

Embodiments of the present invention provide a sound source positioning method and apparatus to solve at least one of the above technical problems.

In a first aspect, an embodiment of the present invention provides a sound source localization method, including: computing a spatial spectrum from signals received by a microphone array; determining the number of spectral peaks the spatial spectrum has; if the spatial spectrum has a plurality of spectral peaks, forming a plurality of beams in different directions corresponding to the microphone array by using a fixed beamformer, wherein the beams in the different directions comprise at least a first direction beam and a second direction beam; calculating an energy of the first direction beam, an energy of the second direction beam, and an energy difference between the first direction beam and the second direction beam; determining whether the energy difference is greater than or equal to a preset threshold; and if the energy difference is greater than or equal to the preset threshold, outputting, as the direction of arrival, the angle corresponding to whichever of the first direction beam and the second direction beam has the larger energy.

In a second aspect, an embodiment of the present invention provides a sound source localization apparatus, including: a spatial spectrum calculation module configured to compute a spatial spectrum from signals received by a microphone array; a spectral peak number determination module configured to determine the number of spectral peaks the spatial spectrum has; a beam forming module configured to form a plurality of beams in different directions corresponding to the microphone array by using a fixed beamformer if the spatial spectrum has a plurality of spectral peaks, wherein the beams in the different directions comprise at least a first direction beam and a second direction beam; an energy calculation module configured to calculate an energy of the first direction beam, an energy of the second direction beam, and an energy difference between the first direction beam and the second direction beam; a judging module configured to determine whether the energy difference is greater than or equal to a preset threshold; and a first output module configured to output, as the direction of arrival, the angle corresponding to whichever of the first direction beam and the second direction beam has the larger energy if the energy difference is greater than or equal to the preset threshold.

In a third aspect, an electronic device is provided, comprising: at least one processor, and a memory communicatively coupled to the at least one processor, wherein the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the steps of the sound source localization method of any of the embodiments of the present invention.

In a fourth aspect, the present invention further provides a computer program product, which includes a computer program stored on a non-volatile computer-readable storage medium, the computer program including program instructions, which, when executed by a computer, cause the computer to execute the steps of the sound source localization method according to any one of the embodiments of the present invention.

According to the scheme provided by the present application, a spatial spectrum is computed, and the number of spectral peaks in it indicates whether a reflected sound beam is present. If there are a plurality of spectral peaks, a reflected sound beam exists, and further computation is needed to eliminate its interference so that the direct sound beam can be identified; the direction corresponding to the direct sound beam is the direction of arrival of the sound source. Because the direct sound beam generally has more energy than the reflected sound beam, when the energy difference exceeds a threshold the beam with the larger energy can be judged to be the direct sound beam, and its beam direction is the direction of arrival of the sound source. The scheme of the present application can therefore compute the direction of arrival of the sound source accurately.

Drawings

To illustrate the technical solutions of the embodiments of the present invention more clearly, the drawings used in the description of the embodiments are briefly introduced below. The drawings described below show some embodiments of the present invention; those skilled in the art can obtain other drawings from them without creative effort.

Fig. 1 is a flowchart of a sound source positioning method according to an embodiment of the present invention;

Fig. 2 is a flowchart of another sound source localization method according to an embodiment of the present invention;

Fig. 3 is a flowchart of a specific example of a sound source localization method according to an embodiment of the present invention;

Fig. 4 is a block diagram of a sound source positioning device according to an embodiment of the present invention;

Fig. 5 is a schematic structural diagram of an electronic device according to an embodiment of the present invention.

Detailed Description

In order to make the objects, technical solutions and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are some, but not all, embodiments of the present invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.

Referring to Fig. 1, a flowchart of an embodiment of the sound source positioning method of the present application is shown. The method of this embodiment can be applied to terminals with voice wake-up, recognition, understanding and feedback capabilities, such as smart voice televisions, smart speakers, smart dialogue toys, and other existing smart voice terminals with voice wake-up capability.

As shown in Fig. 1, in step 101, a spatial spectrum is computed from signals received by a microphone array;

in step 102, the number of spectral peaks the spatial spectrum has is determined;

in step 103, if the spatial spectrum has a plurality of spectral peaks, a fixed beamformer is used to form a plurality of beams in different directions corresponding to the microphone array, wherein the beams in the different directions include at least a first direction beam and a second direction beam;

in step 104, the energy of the first direction beam, the energy of the second direction beam, and the energy difference between the first direction beam and the second direction beam are calculated;

in step 105, it is determined whether the energy difference is greater than or equal to a preset threshold;

in step 106, if the energy difference is greater than or equal to the preset threshold, the angle corresponding to whichever of the first direction beam and the second direction beam has the larger energy is output as the direction of arrival.

In this embodiment, for step 101, the sound source localization apparatus computes a spatial spectrum from the signals received by the microphone array. Then, for step 102, the apparatus determines the number of spectral peaks the spatial spectrum has, which may be a single spectral peak or a plurality of spectral peaks. Then, for step 103, if the apparatus detects that the spatial spectrum has a plurality of spectral peaks, it forms a plurality of beams in different directions corresponding to the microphone array by using the fixed beamformer, wherein the beams include at least a first direction beam and a second direction beam. There may be beams in more than two directions; they are divided here simply into a first direction beam and a second direction beam, each of which may be either the direct sound beam or the reflected sound beam. The present application is not limited in this respect, and this point is not repeated below.

Thereafter, for step 104, the sound source localization apparatus calculates the energy of the first direction beam, the energy of the second direction beam, and the energy difference between the two beams. Then, for step 105, the apparatus determines whether the energy difference between the first direction beam and the second direction beam is greater than or equal to a preset threshold. Finally, for step 106, if the energy difference is greater than or equal to the preset threshold, the angle corresponding to whichever of the two beams has the larger energy is output as the direction of arrival.

In the method of this embodiment, a spatial spectrum is computed, and the number of spectral peaks in it indicates whether a reflected sound beam is present. If there are a plurality of spectral peaks, a reflected sound beam exists, and further computation is needed to eliminate its interference so that the direct sound beam can be identified more accurately; the direction corresponding to the direct sound beam is the direction of arrival of the sound source. Because the direct sound beam generally has more energy than the reflected sound beam, when the energy difference exceeds a threshold the beam with the larger energy can be judged to be the direct sound beam, and its beam direction is the direction of arrival of the sound source. The scheme of the present application can therefore compute the direction of arrival of the sound source accurately.

In some optional embodiments, after determining the number of spectral peaks the spatial spectrum has, the method further comprises: if the spatial spectrum has only a single spectral peak, outputting the angle corresponding to the single spectral peak as the direction of arrival. When there is only a single spectral peak, the corresponding angle can thus be output directly as the direction of arrival.
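The text does not state how the spectral peaks are counted. The following is a minimal sketch only, assuming the spatial spectrum is sampled on a uniform angle grid covering the full circle and that a peak is a local maximum whose height is at least `min_rel_height` times the global maximum. The function name `find_spectral_peaks` and the relative-height rule are illustrative assumptions, not part of the described method.

```python
import numpy as np

def find_spectral_peaks(spectrum, angles_deg, min_rel_height=0.5):
    """Return the angles of local maxima of a circular spatial spectrum.

    Assumption: angles_deg is a uniform grid over the full circle, so the
    first and last samples are neighbours (handled with np.roll).
    """
    p = np.asarray(spectrum, dtype=float)
    left = np.roll(p, 1)     # neighbour at the previous angle
    right = np.roll(p, -1)   # neighbour at the next angle
    is_peak = (p > left) & (p >= right) & (p >= min_rel_height * p.max())
    return [float(a) for a in np.asarray(angles_deg)[is_peak]]

# Toy spectrum with two bumps, at 90 and 270 degrees.
angles = np.arange(0, 360, 10)
spectrum = np.ones(36)
spectrum[9] = 10.0
spectrum[27] = 8.0
peaks = find_spectral_peaks(spectrum, angles)

# With a single bump the angle could be output directly.
single = np.ones(36)
single[9] = 10.0
single_peaks = find_spectral_peaks(single, angles)
```

If `len(peaks)` is 1, the angle is output as the direction of arrival; otherwise the beam comparison is needed.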

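With further reference to Fig. 2, a flowchart of another sound source localization method according to an embodiment of the present invention is shown. In this embodiment, the method further comprises: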

In step 201, if the energy difference is smaller than a preset threshold, calculating a relative delay of the first directional beam and a relative delay of the second directional beam;

In step 202, the angle corresponding to the beam with the smallest delay from among the relative delays of the first directional beam and the second directional beam is output as the direction of arrival.

In this embodiment, for step 201, if the sound source localization apparatus detects that the energy difference between the first direction beam and the second direction beam is smaller than the preset threshold, it calculates the relative delay of the first direction beam and the relative delay of the second direction beam; the delay may be estimated with the generalized cross-correlation function method. Thereafter, for step 202, the smaller of the two relative delays belongs to the direct sound beam, and the direction of the direct sound beam is the direction of arrival, so the angle corresponding to the beam with the smaller delay is output as the direction of arrival of the sound source.

The method of this embodiment identifies the direct sound beam by calculating relative delays: the delay of the direct sound beam is shorter than that of the reflected sound beam, so the beam with the shorter delay is the direct sound beam. Since the direction corresponding to the direct sound beam is the direction of arrival of the sound source, the method can output the direction of arrival of the sound source.
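The delay comparison can be illustrated with a generalized cross-correlation estimate. This is a minimal sketch, assuming time-domain beam outputs and the PHAT weighting (the text only says "generalized cross-correlation", so the weighting is an assumption); `gcc_phat_delay` and `pick_direct_angle` are hypothetical helper names.

```python
import numpy as np

def gcc_phat_delay(sig, ref, fs):
    """Estimate the delay of `sig` relative to `ref` in seconds.

    A positive result means `sig` arrives later than `ref`.
    """
    n = len(sig) + len(ref)                 # zero-pad to avoid circular wrap
    SIG = np.fft.rfft(sig, n=n)
    REF = np.fft.rfft(ref, n=n)
    cross = SIG * np.conj(REF)
    cross /= np.abs(cross) + 1e-12          # PHAT weighting: keep phase only
    cc = np.fft.irfft(cross, n=n)
    max_shift = n // 2
    # Reorder so that index 0 corresponds to lag -max_shift.
    cc = np.concatenate((cc[-max_shift:], cc[:max_shift + 1]))
    shift = int(np.argmax(np.abs(cc))) - max_shift
    return shift / fs

def pick_direct_angle(beam_a, beam_b, angle_a, angle_b, fs):
    """Return the angle of the beam that arrives first (smaller delay)."""
    delay_b_vs_a = gcc_phat_delay(beam_b, beam_a, fs)
    return angle_a if delay_b_vs_a >= 0 else angle_b

# Toy example: beam_b is beam_a delayed by 5 samples.
rng = np.random.default_rng(0)
fs = 16000
beam_a = np.concatenate((rng.standard_normal(1000), np.zeros(24)))
beam_b = np.roll(beam_a, 5)
delay = gcc_phat_delay(beam_b, beam_a, fs)
direct_angle = pick_direct_angle(beam_a, beam_b, 60.0, 240.0, fs)
```

Here `beam_a` leads, so its angle is treated as the direct sound direction.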

In some optional embodiments, computing the spatial spectrum from the signals received by the microphone array comprises: obtaining a separation matrix corresponding to the signals received by the plurality of microphones by using independent vector analysis; and computing the spatial spectrum from the separation matrix corresponding to a wake-up signal capable of waking up the device. The spatial spectrum can be computed quickly in this way.

In a further optional embodiment, obtaining the separation matrix corresponding to the signals received by the plurality of microphones by using independent vector analysis comprises: modeling the signal received by the microphone array as X(t, f) based on a short-time Fourier transform; filtering the signal received by the microphone array with a separation matrix W(t, f) computed by independent vector analysis to obtain an estimated signal Y(t, f) of the sound source signal, wherein Y(t, f) = W(t, f) X(t, f); and sending the estimated signal to a wake-up module in the device and determining the separation matrix corresponding to the wake-up signal. The separation matrix corresponding to the wake-up signal can be determined more quickly in this manner, which facilitates computing the spatial spectrum.

Further optionally, computing the spatial spectrum from the separation matrix corresponding to the wake-up signal includes: calculating a spatial covariance matrix by using the separation matrix corresponding to the wake-up signal; performing eigenvalue decomposition on the calculated spatial covariance matrix, wherein the eigenvector corresponding to the maximum eigenvalue spans the signal space and the remaining eigenvectors form the noise space; and calculating the spatial spectrum based on the signal space and the noise space.

The following description is provided to enable those skilled in the art to better understand the present disclosure by describing some of the problems encountered by the inventors in implementing the present disclosure and by describing one particular embodiment of the finally identified solution.

To address the interference and wall reflection found in real scenes, a specific embodiment of the present application provides the following solution:

First, separation matrices for the different sound sources are obtained with the IVA method, and the corresponding spatial covariance matrices are calculated. Once the spatial covariance matrix corresponding to the wake-up signal has been determined, the wake-up direction is located with the MUSIC method by computing the spatial spectrum of the wake-up signal. If the spatial spectrum has only a single spectral peak, the corresponding angle is output. If it has a plurality of spectral peaks, the spatial covariance matrix of the wake-up signal contains wall reflection, and the directions of the direct sound and the reflected sound must be determined. Beams in different directions are first obtained by fixed beamforming, and the energies of the direct sound beam and the reflected sound beam are then calculated. The energy of the direct sound beam is usually larger than that of the reflected sound beam, so when the difference between the larger and the smaller beam energy exceeds a certain threshold, the beam with the larger energy is judged to be the direct sound beam and its angle is output. If the energies of the two beams are close, the generalized cross-correlation function between the two beams is calculated; because the propagation time of the direct sound is shorter than that of the reflected sound, the beam with the smaller delay is judged to be the direct sound beam and its angle is output.
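The decision flow in this paragraph can be sketched as follows. This is an illustration under assumptions: the beam energies (in dB) and relative delays are taken as already computed, only the two strongest beams are compared, and the default threshold of 1.5 dB is the example value the description mentions. The name `choose_direction` is hypothetical.

```python
from typing import Optional, Sequence

def choose_direction(peak_angles: Sequence[float],
                     beam_energies_db: Optional[Sequence[float]] = None,
                     beam_delays: Optional[Sequence[float]] = None,
                     threshold_db: float = 1.5) -> float:
    """Return the direction of arrival (degrees) from peak-wise measurements."""
    if len(peak_angles) == 0:
        raise ValueError("at least one spectral peak is required")
    # Single spectral peak: output its angle directly.
    if len(peak_angles) == 1:
        return float(peak_angles[0])
    if beam_energies_db is None:
        raise ValueError("beam energies are required for multiple peaks")
    # Multiple peaks: compare the two strongest beams.
    order = sorted(range(len(peak_angles)),
                   key=lambda i: beam_energies_db[i], reverse=True)
    strong, weak = order[0], order[1]
    if beam_energies_db[strong] - beam_energies_db[weak] >= threshold_db:
        return float(peak_angles[strong])   # larger energy -> direct sound
    # Energies too close: fall back to the beam with the smaller delay.
    if beam_delays is None:
        raise ValueError("beam delays are required when energies are close")
    earlier = strong if beam_delays[strong] <= beam_delays[weak] else weak
    return float(peak_angles[earlier])

doa_single = choose_direction([45.0])
doa_energy = choose_direction([60.0, 240.0], [0.0, -4.0])
doa_delay = choose_direction([60.0, 240.0], [0.0, -0.5], [0.002, 0.0])
```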

In a real home environment, locating the user with a microphone array may be affected by interference such as a television. After a short-time Fourier transform, the signal received by the microphone array is modeled as $X(t,f) = A(t,f)\,S(t,f)$.

Here $X(t,f) = [X_1(t,f), X_2(t,f), \ldots, X_K(t,f)]$ denotes the signals received by the microphones, $K$ is the total number of microphones, $t$ is the time index, and $f$ is the frequency band index. $S_1(t,f)$ denotes the desired signal from the user, $S_2(t,f), \ldots, S_N(t,f)$ denote interfering signals from a television or the like, $N$ is the number of sound sources with $N \le K$, and $A(t,f)$ is the acoustic transfer function. The microphone signals are filtered with a separation matrix $W(t,f)$ computed by Independent Vector Analysis (IVA), which gives the estimate of the sound source signals $Y(t,f) = W(t,f)\,X(t,f)$.
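To make the shapes in the model concrete, the sketch below builds a toy mixture $X = A\,S$ per frequency bin and applies a separation matrix to recover $Y = W\,X$. It assumes a time-invariant mixing matrix per bin and uses the ideal inverse of `A` in place of an IVA estimate, purely to show the filtering step; `apply_separation` is a hypothetical helper name.

```python
import numpy as np

def apply_separation(W, X):
    """Apply per-frequency separation matrices.

    W: (F, N, K) separation matrices, X: (F, K, T) microphone STFT frames.
    Returns Y: (F, N, T), i.e. Y[f] = W[f] @ X[f].
    """
    return np.matmul(W, X)

rng = np.random.default_rng(0)
F, K, T = 4, 3, 50          # frequency bins, microphones, frames
N = K                       # as many sources as microphones in this toy case

# Random complex sources and mixing matrices (stand-ins for real data).
S = rng.standard_normal((F, N, T)) + 1j * rng.standard_normal((F, N, T))
A = rng.standard_normal((F, K, N)) + 1j * rng.standard_normal((F, K, N))
X = np.matmul(A, S)         # X(t, f) = A(f) S(t, f)

# Ideal separation matrix (assumption: A is known and invertible).
W_ideal = np.linalg.inv(A)
Y = apply_separation(W_ideal, X)
```

In practice `W` would come from the IVA iterations rather than from a known `A`.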

To estimate the separation matrix, we minimize the cost function as follows:

Here $E[\cdot]$ denotes the expectation operator. The separation matrix $W(t,f)$ is updated iteratively according to the gradient descent rule:

$$W(t,f) = W(t,f) + \mu\,\bigl(I - E[e\,Y^{H}]\bigr)\,W(t,f).$$
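One iteration of this update can be sketched as below. Several points are assumptions rather than statements from the text: the symbol printed as `e` is read as a nonlinear score function of `Y`, chosen here as each source's sample divided by its norm across all frequency bins (a common choice in the IVA literature); the expectation is replaced by an average over frames; and `W` is held constant over time within the batch. The name `iva_update_step` is hypothetical.

```python
import numpy as np

def iva_update_step(W, X, mu=0.1, eps=1e-12):
    """One update W <- W + mu * (I - E[phi(Y) Y^H]) W for every frequency bin.

    W: (F, N, N) separation matrices, X: (F, N, T) microphone STFT frames.
    """
    Y = np.matmul(W, X)                                   # (F, N, T)
    # Assumed score function: normalise each source frame by its norm
    # taken jointly over all frequency bins.
    norm = np.sqrt(np.sum(np.abs(Y) ** 2, axis=0, keepdims=True)) + eps
    phi = Y / norm                                        # (F, N, T)
    num_frames = X.shape[-1]
    # Frame average approximating E[phi(Y) Y^H], one (N, N) matrix per bin.
    corr = np.matmul(phi, np.conj(np.swapaxes(Y, -1, -2))) / num_frames
    identity = np.eye(W.shape[-1])
    return W + mu * np.matmul(identity - corr, W)

rng = np.random.default_rng(0)
F, N, T = 8, 2, 200
X = rng.standard_normal((F, N, T)) + 1j * rng.standard_normal((F, N, T))
W0 = np.tile(np.eye(N, dtype=complex), (F, 1, 1))         # start from identity
W1 = iva_update_step(W0, X, mu=0.1)
```

The step would be repeated until `W` stops changing appreciably; the step size `mu` is a tuning parameter.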

And sending the estimation signal to a wake-up module, and determining a separation matrix corresponding to the wake-up signal.

Using the separation matrix corresponding to the wake-up signal, the spatial covariance matrix of the desired signal is calculated as $R_{SS}(t,f) = W_1(t,f)\,X(t,f)\,X^{H}(t,f)\,W_1^{H}(t,f)$. Eigenvalue decomposition is performed on $R_{SS}(t,f)$: the eigenvector corresponding to the maximum eigenvalue spans the signal space $U_s$, and the remaining $K-1$ eigenvectors form the noise space $U_n$. The spatial spectrum function is calculated using the following formula:

Here $d(t,f,\theta)$ denotes the steering vector in direction $\theta$; the direction $\theta$ is scanned and the angle is estimated by finding the peaks. If the spatial spectrum has only a single spectral peak, the corresponding angle is output. If it has a plurality of spectral peaks, the estimated angles include reflected sound, and the direct sound direction and the reflected sound direction must be further determined.
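A minimal sketch of a MUSIC-style spatial spectrum for one frequency bin follows. It assumes the common pseudo-spectrum form $P(\theta) = 1 / \bigl(d^{H}(\theta)\,U_n U_n^{H}\,d(\theta)\bigr)$, a far-field source, and a uniform circular array; the array geometry, the radius and the function names `uca_steering_vector` and `music_spectrum` are illustrative assumptions, not details given in the text.

```python
import numpy as np

def uca_steering_vector(theta_deg, freq_hz, num_mics, radius_m, c=343.0):
    """Far-field steering vector of a uniform circular array (assumed geometry)."""
    mic_angles = 2 * np.pi * np.arange(num_mics) / num_mics
    theta = np.deg2rad(theta_deg)
    phase = 2 * np.pi * freq_hz * radius_m * np.cos(theta - mic_angles) / c
    return np.exp(1j * phase)

def music_spectrum(R, freq_hz, radius_m, angles_deg):
    """Scan the angles and return the MUSIC pseudo-spectrum for covariance R."""
    num_mics = R.shape[0]
    _, eigvecs = np.linalg.eigh(R)       # eigenvalues in ascending order
    U_n = eigvecs[:, :-1]                # all but the largest: noise space
    spectrum = np.empty(len(angles_deg))
    for i, theta in enumerate(angles_deg):
        d = uca_steering_vector(theta, freq_hz, num_mics, radius_m)
        proj = U_n.conj().T @ d          # projection onto the noise space
        spectrum[i] = 1.0 / (np.real(np.vdot(proj, proj)) + 1e-12)
    return spectrum

# Toy covariance: one plane wave from 60 degrees plus weak white noise.
K, freq, radius = 6, 1000.0, 0.05
d0 = uca_steering_vector(60.0, freq, K, radius)
R = np.outer(d0, d0.conj()) + 1e-3 * np.eye(K)
angles = np.arange(0, 360)
P = music_spectrum(R, freq, radius, angles)
estimated_angle = int(angles[np.argmax(P)])
```

The peaks of `P` over the scanned angles are the candidate directions; a wideband version would combine the spectra of several frequency bins.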

When the microphone array is placed close to a wall, it also receives sound reflected from the wall, and the reflected sound interferes with determining the direct sound angle. To obtain the direct sound angle, we first divide 360 degrees into $K$ subspaces and filter the signals received by the $K$ microphones with $K$ fixed beamformers $h(t,f,\theta_k)$ to obtain $K$ sub-beams. The beam of the subspace containing the direct sound angle is called the direct sound beam, $y_d(t,f) = h(t,f,\theta_i)\,X(t,f)$, and the beam of the subspace containing the reflected sound angle is called the reflected sound beam, $y_r(t,f) = h(t,f,\theta_j)\,X(t,f)$. Since the direct sound energy is generally larger than the reflected sound energy, the direct sound angle can be determined from the energy difference between the direct sound beam and the reflected sound beam, $E_d = 10\log_{10}(|y_d|^2) - 10\log_{10}(|y_r|^2)$. When the energy difference $E_d$ is greater than the threshold $th_0$, the beam with the larger energy is judged to be the direct sound beam and the corresponding DOA angle is the desired signal direction; $th_0$ is obtained from a large number of experiments or from empirical values, for example 1.5 dB.

If the direct sound beam energy and the reflected sound beam energy are close, a further test is needed. Because the reflected sound takes longer to reach the microphone array than the direct sound, the relative delay between the direct-sound fixed beam and the reflected-sound fixed beam can be calculated as $G_d = \mathrm{GCC}(y_d, y_r)$, where $\mathrm{GCC}$ is a normalized generalized cross-correlation function. The beam with the smaller delay is judged to be the direct sound beam, and the corresponding DOA angle is the direction of the desired signal.
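The energy test can be sketched for a single frequency bin as follows. The beamformer design is not specified in the text, so this sketch assumes delay-and-sum weights $h(\theta) = d^{H}(\theta)/K$ on a uniform circular array, averages $|y|^2$ over frames, and simulates a direct path at 60 degrees with a half-amplitude reflection at 240 degrees. All names and numeric values other than the 1.5 dB example threshold are illustrative assumptions.

```python
import numpy as np

def steering_vector(theta_deg, freq_hz, num_mics, radius_m, c=343.0):
    """Far-field steering vector of a uniform circular array (assumed geometry)."""
    mic_angles = 2 * np.pi * np.arange(num_mics) / num_mics
    theta = np.deg2rad(theta_deg)
    phase = 2 * np.pi * freq_hz * radius_m * np.cos(theta - mic_angles) / c
    return np.exp(1j * phase)

def fixed_beam_output(X, theta_deg, freq_hz, radius_m):
    """Delay-and-sum beam y = h(theta) X with h = d^H / K (assumed design)."""
    K = X.shape[0]
    h = steering_vector(theta_deg, freq_hz, K, radius_m).conj() / K
    return h @ X                                  # one value per frame

def energy_db(y):
    """Frame-averaged beam energy in dB."""
    return 10.0 * np.log10(np.mean(np.abs(y) ** 2) + 1e-12)

rng = np.random.default_rng(0)
K, T, freq, radius = 6, 2000, 1000.0, 0.05
s_direct = rng.standard_normal(T) + 1j * rng.standard_normal(T)
s_reflect = rng.standard_normal(T) + 1j * rng.standard_normal(T)
X = (np.outer(steering_vector(60.0, freq, K, radius), s_direct)
     + 0.5 * np.outer(steering_vector(240.0, freq, K, radius), s_reflect))

e_first = energy_db(fixed_beam_output(X, 60.0, freq, radius))
e_second = energy_db(fixed_beam_output(X, 240.0, freq, radius))
energy_diff = abs(e_first - e_second)             # E_d in dB
th0 = 1.5                                         # example threshold in dB
if energy_diff >= th0:
    doa = 60.0 if e_first > e_second else 240.0   # larger energy -> direct
else:
    doa = None                                    # fall back to the delay test
```

When `doa` is `None`, the two beam energies are too close and the generalized cross-correlation delay comparison decides instead.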

Referring to Fig. 4, a block diagram of a sound source positioning device according to an embodiment of the present invention is shown.

As shown in Fig. 4, the sound source localization apparatus 400 includes a spatial spectrum calculation module 410, a spectral peak number determination module 420, a beam forming module 430, an energy calculation module 440, a judgment module 450, and a first output module 460.

The spatial spectrum calculation module 410 is configured to compute a spatial spectrum from signals received by the microphone array; the spectral peak number determination module 420 is configured to determine the number of spectral peaks the spatial spectrum has; the beam forming module 430 is configured to form a plurality of beams in different directions corresponding to the microphone array by using a fixed beamformer if the spatial spectrum has a plurality of spectral peaks, wherein the beams in the different directions include at least a first direction beam and a second direction beam; the energy calculation module 440 is configured to calculate an energy of the first direction beam, an energy of the second direction beam, and an energy difference between the first direction beam and the second direction beam; the judgment module 450 is configured to determine whether the energy difference is greater than or equal to a preset threshold; and the first output module 460 is configured to output, as the direction of arrival, the angle corresponding to whichever of the first direction beam and the second direction beam has the larger energy if the energy difference is greater than or equal to the preset threshold.

In some optional embodiments, the apparatus further comprises a second output module (not shown in the figure) configured to output the angle corresponding to a single spectral peak as the direction of arrival if the spatial spectrum has only the single spectral peak.

It should be understood that the modules depicted in Fig. 4 correspond to the steps of the methods described with reference to Fig. 1 and Fig. 2. The operations, features and technical effects described above for the methods therefore also apply to the modules in Fig. 4 and are not repeated here.

It should be noted that the module names in the embodiments of the present application do not limit the scheme of the present application; for example, the spectral peak number determination module may equally be described as a module that determines the number of spectral peaks the spatial spectrum has. In addition, the functional modules may be implemented by a hardware processor; for example, the spectral peak number determination module may be implemented by a processor, which is not described further here.

In other embodiments, the present invention further provides a non-volatile computer storage medium storing computer-executable instructions that can perform the sound source localization method of any of the above method embodiments.

As one embodiment, the non-volatile computer storage medium of the present invention stores computer-executable instructions configured to perform:

calculating signals received by a microphone array to obtain a spatial spectrum;

determining the number of spectral peaks in the spatial spectrum;

if the spatial spectrum has a plurality of spectral peaks, forming a plurality of beams in different directions corresponding to the microphone array by using a fixed beamformer, wherein the plurality of beams include at least a first direction beam and a second direction beam;

calculating the energy of the first direction beam, the energy of the second direction beam, and the energy difference between the first direction beam and the second direction beam;

determining whether the energy difference is greater than or equal to a preset threshold;

if the energy difference is greater than or equal to the preset threshold, outputting the angle corresponding to whichever of the first direction beam and the second direction beam has the larger energy as the direction of arrival.
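The beam forming and energy calculation steps listed above can be sketched in Python as follows. The text only says "fixed beam former"; the delay-and-sum weights used here are one common fixed design and are an assumption, as are the uniform linear array geometry, the single-frequency (narrowband) snapshots, the speed of sound constant, and all function and variable names. The sign of the steering phase is a convention; it only needs to be used consistently for the signal model and the weights.

```python
import cmath
import math

SPEED_OF_SOUND = 343.0  # m/s, assumed room-temperature value


def steering_vector(mic_x, angle_deg, freq_hz):
    """Far-field phase terms for microphones on a line (positions in metres)."""
    cos_a = math.cos(math.radians(angle_deg))
    return [cmath.exp(2j * math.pi * freq_hz * x * cos_a / SPEED_OF_SOUND)
            for x in mic_x]


def beam_energy(snapshots, mic_x, angle_deg, freq_hz):
    """Mean output power of a delay-and-sum beam steered to angle_deg.

    snapshots: list of snapshots, each holding one complex value per microphone.
    """
    a = steering_vector(mic_x, angle_deg, freq_hz)
    total = 0.0
    for snap in snapshots:
        # Fixed weights a/M applied as w^H x.
        y = sum(w.conjugate() * x for w, x in zip(a, snap)) / len(a)
        total += abs(y) ** 2
    return total / len(snapshots)


# Synthetic check: a unit-amplitude source at 60 degrees, four microphones 5 cm apart.
mic_x = [0.0, 0.05, 0.10, 0.15]
freq_hz = 2000.0
snapshots = [steering_vector(mic_x, 60.0, freq_hz)] * 8
e_first = beam_energy(snapshots, mic_x, 60.0, freq_hz)    # beam aimed at the source
e_second = beam_energy(snapshots, mic_x, 120.0, freq_hz)  # beam aimed elsewhere
energy_diff = e_first - e_second
```

In this synthetic case the beam aimed at the source carries far more energy than the other beam, so `energy_diff` would pass any reasonably small threshold and the first direction would be reported as the direction of arrival.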

The non-volatile computer-readable storage medium may include a program storage area and a data storage area, wherein the program storage area may store an operating system and an application program required for at least one function, and the data storage area may store data created through use of the sound source localization apparatus, and the like. Further, the non-volatile computer-readable storage medium may include high-speed random access memory, and may also include non-volatile memory, such as at least one magnetic disk storage device, flash memory device, or other non-volatile solid-state storage device. In some embodiments, the non-volatile computer-readable storage medium optionally includes memory located remotely from the processor, which may be connected to the sound source localization apparatus over a network. Examples of such networks include, but are not limited to, the internet, intranets, local area networks, mobile communication networks, and combinations thereof.

Embodiments of the present invention also provide a computer program product comprising a computer program stored on a non-transitory computer-readable storage medium, the computer program comprising program instructions that, when executed by a computer, cause the computer to perform any of the sound source localization methods described above.

Fig. 5 is a schematic structural diagram of an electronic device according to an embodiment of the present invention. As shown in Fig. 5, the electronic device includes one or more processors 510 and a memory 520; one processor 510 is taken as an example in Fig. 5. The device for the sound source localization method may further include an input device 530 and an output device 540. The processor 510, the memory 520, the input device 530, and the output device 540 may be connected by a bus or by other means; a bus connection is taken as an example in Fig. 5. The memory 520 is the non-volatile computer-readable storage medium described above. By running the non-volatile software programs, instructions, and modules stored in the memory 520, the processor 510 executes the various functional applications and data processing of the server, thereby implementing the sound source localization method of the above method embodiments. The input device 530 may receive input numeric or character information and generate key signal inputs related to user settings and function control of the sound source localization apparatus. The output device 540 may include a display device such as a display screen.

The product can execute the method provided by the embodiment of the invention, and has corresponding functional modules and beneficial effects of the execution method. For technical details that are not described in detail in this embodiment, reference may be made to the method provided by the embodiment of the present invention.

As an embodiment, the electronic device is applied to a sound source localization apparatus, and includes: at least one processor; and a memory communicatively coupled to the at least one processor; wherein the memory stores instructions executable by the at least one processor to cause the at least one processor to:

calculating signals received by a microphone array to obtain a spatial spectrum;

determining the number of spectral peaks in the spatial spectrum;

if the spatial spectrum has a plurality of spectral peaks, forming a plurality of beams in different directions corresponding to the microphone array by using a fixed beamformer, wherein the plurality of beams include at least a first direction beam and a second direction beam;

calculating the energy of the first direction beam, the energy of the second direction beam, and the energy difference between the first direction beam and the second direction beam;

determining whether the energy difference is greater than or equal to a preset threshold;

if the energy difference is greater than or equal to the preset threshold, outputting the angle corresponding to whichever of the first direction beam and the second direction beam has the larger energy as the direction of arrival.

The electronic device of the embodiments of the present application may take various forms, including but not limited to:

(1) Mobile communication devices: such devices are characterized by mobile communication capabilities and are primarily intended to provide voice and data communication. Such terminals include smartphones (e.g., the iPhone), multimedia phones, feature phones, and low-end phones, among others.

(2) Ultra-mobile personal computer devices: such devices belong to the category of personal computers, have computing and processing functions, and generally also support mobile internet access. Such terminals include PDA, MID, and UMPC devices, such as the iPad.

(3) Portable entertainment devices: such devices can display and play multimedia content. They include audio and video players (e.g., the iPod), handheld game consoles, e-book readers, smart toys, and portable car navigation devices.

(4) Servers: a server has an architecture similar to that of a general-purpose computer, but because it must provide highly reliable services, it has higher requirements for processing capability, stability, reliability, security, scalability, manageability, and the like.

(5) Other electronic devices with data interaction functions.

The apparatus embodiments described above are merely illustrative. The units described as separate parts may or may not be physically separate, and the parts displayed as units may or may not be physical units; they may be located in one place or distributed over a plurality of network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of the present embodiment. One of ordinary skill in the art can understand and implement it without inventive effort.

Through the above description of the embodiments, those skilled in the art will clearly understand that each embodiment can be implemented by software plus a necessary general hardware platform, and certainly can also be implemented by hardware. With this understanding in mind, the above-described technical solutions may be embodied in the form of a software product, which can be stored in a computer-readable storage medium, such as ROM/RAM, magnetic disk, optical disk, etc., and includes instructions for causing a computer device (which may be a personal computer, a server, or a network device, etc.) to execute the methods of the various embodiments or some parts of the embodiments.

Finally, it should be noted that: the above examples are only intended to illustrate the technical solution of the present invention, but not to limit it; although the present invention has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some technical features may be equivalently replaced; and such modifications or substitutions do not depart from the spirit and scope of the corresponding technical solutions of the embodiments of the present invention.
