System and method for audio capture

Document No. 863755 · Published 2021-03-16

Reading note: This patent, "System and method for audio capture", was created by Sean Michael Edlin and Sean Taggart Pentecost on 2019-05-15. Its main content: a method for noise filtering comprising receiving directional data corresponding to the relative directions of at least one noise source and at least one target audio source; capturing noise data from the at least one noise source; capturing target audio data from the at least one target audio source; filtering the noise data from the target audio data using the directional data; and outputting the filtered target audio.

1. A method for noise filtering, comprising:

receiving directional data corresponding to the relative directions of at least one noise source and at least one target audio source;

capturing noise data from the at least one noise source;

capturing target audio data from the at least one target audio source;

filtering the noise data from the target audio data using the directional data; and

outputting the filtered target audio.

2. The method of claim 1, wherein the directional data is used to determine at least one beamforming configuration, wherein at least one beam associated with the beamforming configuration captures at least one of the noise source or the target audio source, and the method comprises applying the beamforming configuration to sound data captured by a sound capture device.

3. The method of claim 2, wherein the noise data and the target audio data are captured using a sound capture device, and the directional data is used at the sound capture device to estimate the power of the noise data and/or the target audio data.

4. The method of claim 3, wherein the sound capture device comprises a target audio capture device and a noise capture device, and the directional data is used at the target audio capture device to estimate the power of the noise data and/or the target audio data.

5. A system for an unmanned aerial vehicle (UAV), comprising:

a sound capture device configured to capture noise data from at least one noise source and target audio data from at least one target audio source; and

a processing unit configured to:

receive directional data corresponding to the relative directions of the at least one noise source and the at least one target audio source;

receive the noise data and the target audio data;

filter the noise data from the target audio data using the directional data; and

output the filtered target audio.

6. The system of claim 5, wherein the directional data is used at the sound capture device to estimate the power of the noise data and/or the target audio data.

7. The system of claim 5, comprising a directional sensor configured to sense a direction of the noise source relative to the UAV and a direction of the target audio source relative to the UAV, and the processing unit is configured to determine a relative direction between the noise source and the target audio source.

8. The system of claim 5, wherein the sound capture device comprises: a noise capture device configured to capture the noise data; and a target audio capture device configured to capture the target audio.

9. The system of claim 5, wherein the system further comprises at least one sensor, and wherein the processing unit is configured to receive sensor data from the at least one sensor and associate the filtered target audio with the sensor data.

10. The system of claim 9, wherein the at least one sensor comprises a video camera configured to capture video data, and the processing unit is configured to associate the filtered target audio with the video data.

11. The system of claim 10, wherein the direction of the target audio capture device is aligned with the direction of the video camera.

12. The system of claim 9, wherein the at least one sensor is a location sensor and the processing unit is configured to associate the filtered target audio with location data.

13. The system of claim 5, wherein the sound capture device comprises a MEMS microphone.

14. The system of claim 13, wherein the sound capture device is an array of microphones.

15. The system of any of claims 5 to 14, wherein the sound capture device is attached to the UAV via a gimbal.

16. An unmanned aerial vehicle (UAV) payload comprising a system as claimed in any of claims 5 to 15.

17. A method for noise filtering, comprising:

receiving directional data corresponding to the relative directions of at least one noise source and at least one target audio source;

steering a sound capture device to capture noise data from the at least one noise source and to capture target audio data from the at least one target audio source;

filtering the noise data from the target audio data using the directional data; and

outputting the filtered target audio.

18. The method of claim 17, wherein steering the sound capture device comprises applying a beamforming configuration to redirect at least one beam to capture the at least one noise source and/or the at least one target audio source.

19. A method according to claim 17 or 18, wherein the sound capture device is mounted via a gimbal, and the step of steering the sound capture device comprises steering the gimbal to redirect the sound capture device.

20. The method of claim 19, wherein the sound capture device comprises a noise capture device for capturing the noise data and a target audio capture device for capturing the target audio, and wherein the target audio capture device is mounted via a gimbal, and the step of steering the sound capture device comprises steering the gimbal to redirect the target audio capture device toward the target audio source.

21. A system for noise filtering for an unmanned aerial vehicle (UAV), comprising:

a sound capture device configured to capture noise from at least one noise source and configured to capture target audio from at least one target audio source, wherein the sound capture device is independently steerable relative to the UAV; and

a processing unit configured to:

receive directional data corresponding to the relative directions of the at least one noise source and the at least one target audio source;

receive the noise data and the target audio data;

filter the noise data from the target audio data using the directional data; and

output the filtered target audio.

22. The system of claim 21, wherein the sound capture device is mounted to the UAV via a gimbal, and the sound capture device is steerable by steering the gimbal.

23. The system of claim 21 or 22, wherein the sound capture device is configured to steer independently relative to the UAV through beamforming.

24. The system of claim 21, wherein the sound capture device comprises a noise capture device to capture the noise data and a target audio capture device to capture the target audio, and the target audio capture device is mounted to the UAV via a gimbal and the target audio capture device is steerable by steering the gimbal.

25. The system of claim 21, wherein the sound capture device comprises a MEMS microphone.

26. The system of claim 25, wherein the sound capture device is an array of microphones.

27. An unmanned aerial vehicle (UAV) payload comprising the system of any of claims 21 to 26.

28. A method for noise filtering, comprising:

determining a beamforming pattern comprising a main beam and a null beam;

capturing noise data using the null beam;

capturing target audio data using the main beam;

filtering the noise data from the target audio data; and

outputting the filtered target audio.

29. The method of claim 28, wherein the null beam captures all significant noise sources.

30. The method of claim 28, wherein the null beam has a beamwidth of at least 180°.

31. The method of claim 28, wherein the null beam has a beamwidth of 360° − X°, where X is a beamwidth of the main beam.

32. The method of claim 28, wherein the null beam has a gain that varies across its beamwidth by less than 20%.

33. The method of claim 28, wherein the null beam has a frequency range defined by low frequencies at which gain varies by less than 20%.

34. The method of claim 28, wherein the step of filtering the noise data from the target audio data uses a simplified 2 x 2 gain matrix, the 2 x 2 gain matrix characterizing gains of the main beam in forward and reverse directions and gains of the null beam in forward and reverse directions.

35. A gimbaled microphone configured to use the method of any of claims 28-34.

36. The gimbaled microphone according to claim 35, wherein said microphone is a MEMS array.

37. The gimbaled microphone of claim 35, wherein said microphone is an end-fire microphone array.

38. The gimbaled microphone of claim 35, further comprising:

one or more sensors for detecting directional data corresponding to the relative orientation of the gimbaled microphone and an Unmanned Aerial Vehicle (UAV) to which it is mounted; and

a processing unit configured to filter the noise data from the target audio data using the directional data.

39. The gimbaled microphone of claim 35, wherein said noise data is filtered from said target audio data without reference to the relative orientation of said gimbaled microphone and without the microphone being mounted to an Unmanned Aerial Vehicle (UAV).

40. An Unmanned Aerial Vehicle (UAV) comprising the gimbaled microphone of any one of claims 35-39.

Technical Field

The invention relates to a system and a method for audio capture. More particularly, but not exclusively, the invention relates to a system and method for filtering noise in audio capture.

Background

Many aircraft, such as Unmanned Aerial Vehicles (UAVs), helicopters, vertical lift systems, and fixed wing aircraft, generate undesirable noise. In a UAV, noise is generated by the engine (due to, for example, exhaust or combustion), the motor assembly (due to, for example, vibration), and the interaction of the airflow with the UAV and/or the propellers of the UAV. When capturing audio with a UAV, the noise generated by the UAV itself may be significantly louder than the target audio signal, and this noise may prevent or hinder audio capture or processing by the UAV.

For UAVs used for video and audio capture (filming), the noise generated by the UAV is a particular problem. Such filming may be used for live broadcasts, recording events (e.g., concerts), or for entertainment and documentary purposes (such as filming for television or movies).

UAV audio capture as currently used for filming requires expensive and time-consuming post-processing to remove the noise generated by the UAV. Typically, during UAV filming, audio is captured by placing a microphone on the ground and/or by having a separate microphone worn by the target of interest. This has the drawback that UAV noise is still picked up by the microphones on the ground and/or on the target of interest, again requiring expensive and time-consuming post-processing. Furthermore, a ground or body-worn microphone must be provided, limiting versatility.

Other areas of concern for UAV noise include UAV audio capture in defense and security, law enforcement, industrial, and telecommunications applications. UAVs are well suited for such applications because they can be deployed quickly, can be deployed remotely, and can cover substantial distances. Audio capture may be used to identify targets (e.g., by spectral analysis), for sound source localization, or for measuring noise levels. In defense and security applications, audio may be captured for gunfire detection. In industrial applications, audio captured by UAVs may be used to detect mechanical faults and assess noise compliance. UAVs may also be used to enable remote communication. For example, in search and rescue, a UAV may be used to capture audio from survivors at a remote location (and transmit it to search and rescue personnel). In another example, in logistics (such as package delivery), audio from the recipient may be captured (and transmitted to a courier company). Hobby, leisure, and selfie UAV users may also wish to record audio.

The invention may provide improved audio capture or at least provide the public or industry with a useful choice.

Disclosure of Invention

According to an example embodiment, there is provided a method for noise filtering, comprising: receiving directional data corresponding to the relative directions of at least one noise source and at least one target audio source; capturing noise data from at least one noise source; capturing target audio data from at least one target audio source; filtering noise data from the target audio data using the directional data; and outputting the filtered target audio.

The directional data may be used to determine at least one beamforming configuration, wherein at least one beam associated with the beamforming configuration captures at least one of a noise source or a target audio source. The method may comprise the step of applying a beamforming configuration to sound data captured by the sound capture device.
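As an illustration of how directional data might determine a beamforming configuration, the sketch below implements a basic frequency-domain delay-and-sum beamformer that steers an array toward a given direction. It is a minimal illustration only, not the claimed implementation; the array geometry, sampling rate, and the `delay_and_sum` interface are assumptions.

```python
import numpy as np

def delay_and_sum(frames, mic_positions, direction, fs, c=343.0):
    """Steer a microphone array toward `direction` (a unit vector) by
    frequency-domain delay-and-sum beamforming.

    frames:        (n_mics, n_samples) array of captured sound data
    mic_positions: (n_mics, 2) microphone coordinates in metres
    fs:            sampling rate in Hz; c is the speed of sound (m/s)
    """
    # Per-microphone delay: projection of each position onto the look
    # direction, divided by the speed of sound.
    delays = mic_positions @ direction / c                    # seconds
    n = frames.shape[1]
    freqs = np.fft.rfftfreq(n, d=1.0 / fs)
    spectra = np.fft.rfft(frames, axis=1)
    # Phase-align every channel to the look direction, then average,
    # so sound arriving from `direction` adds coherently.
    phases = np.exp(2j * np.pi * freqs[None, :] * delays[:, None])
    return np.fft.irfft((spectra * phases).mean(axis=0), n=n)
```

A beamforming configuration in this sense is simply a set of per-channel delays (equivalently, per-bin phase shifts); re-steering the beam means recomputing `delays` from new directional data.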

The noise data and the target audio data may be captured using a sound capture device, and the directional data may be used at the sound capture device to estimate the power of the noise data and/or the target audio data.

The sound capture device may include a target audio capture device and a noise capture device, and the directional data may be used at the target audio capture device to estimate the power of the noise data and/or the target audio data.

According to another example embodiment, there is provided a system for an Unmanned Aerial Vehicle (UAV), comprising: a sound capture device configured to capture noise data from at least one noise source and target audio data from at least one target audio source; and a processing unit configured to: receive directional data corresponding to the relative directions of the at least one noise source and the at least one target audio source; receive the noise data and the target audio data; filter the noise data from the target audio data using the directional data; and output the filtered target audio.

The directional data may be used at the sound capture device to estimate the power of the noise data and/or the target audio data.

The system may include a directional sensor configured to sense a direction of the noise source relative to the UAV and a direction of the target audio source relative to the UAV, and the processing unit may be configured to determine a relative direction between the noise source and the target audio source.

The sound capture device may include: a noise capture device configured to capture noise data; and a target audio capture device configured to capture target audio.

The system may also include at least one sensor, and the processing unit may be configured to receive sensor data from the at least one sensor and associate the filtered target audio with the sensor data. The at least one sensor may include a video camera configured to capture video data, and the processing unit may be configured to associate the filtered target audio with the video data. The direction of the target audio capture device may be aligned with the direction of the video camera. The at least one sensor may be a location sensor, and the processing unit may be configured to associate the filtered target audio with the location data.

The sound capture device may comprise a MEMS microphone. The sound capture device may be an array of microphones.

The sound capture device may be attached to the UAV via a gimbal.

According to yet another example embodiment, there is provided a method for noise filtering, comprising: receiving directional data corresponding to the relative directions of at least one noise source and at least one target audio source; steering the sound capture device to capture noise data from the at least one noise source and to capture target audio data from the at least one target audio source; filtering noise data from the target audio data using the directional data; and outputting the filtered target audio.

The step of steering the sound capture device may comprise applying a beamforming configuration to redirect at least one beam to capture at least one noise source and/or at least one target audio source.

The sound capture device may be mounted via a gimbal, and the step of steering the sound capture device may include steering the gimbal to redirect the sound capture device.

The sound capture device may comprise a noise capture device for capturing noise data and a target audio capture device for capturing target audio, and wherein the target audio capture device may be mounted via a gimbal, and the step of steering the sound capture device may comprise steering the gimbal to redirect the target audio capture device towards the target audio source.

According to another example embodiment, there is provided a system for noise filtering for an Unmanned Aerial Vehicle (UAV), comprising: a sound capture device configured to capture noise from at least one noise source and configured to capture target audio from at least one target audio source, wherein the sound capture device is independently steerable relative to the UAV; and a processing unit configured to: receive directional data corresponding to the relative directions of the at least one noise source and the at least one target audio source; receive the noise data and the target audio data; filter the noise data from the target audio data using the directional data; and output the filtered target audio.

The sound capture device may be mounted to the UAV via a gimbal, and the sound capture device may be steered by steering the gimbal.

The sound capture device may be configured to steer independently with respect to the UAV through beamforming.

The sound capture device may include a noise capture device for capturing noise data and a target audio capture device for capturing target audio, and the target audio capture device may be mounted to the UAV via a gimbal and the target audio capture device may be steered by steering the gimbal.

The sound capture device may comprise a MEMS microphone. The sound capture device may be an array of microphones.

According to another example embodiment, an Unmanned Aerial Vehicle (UAV) payload is provided, comprising a system as described above.

According to yet another example embodiment, there is provided a method for noise filtering, comprising: determining a beamforming pattern comprising a main beam and a null beam; capturing noise data using the null beam; capturing target audio data using the main beam; filtering noise data from the target audio data; and outputting the filtered target audio.

The null beam can capture all significant noise sources.

The null beam may have a beamwidth of at least 180 °.

The null beam may have a beamwidth of 360° − X°, where X is the beamwidth of the main beam.

The null beam may have a gain that varies across its beamwidth by less than 20%.

The null beam may have a frequency range defined by low frequencies where the gain varies by less than 20%.

The step of filtering the noise data from the target audio data may use a simplified 2 x 2 gain matrix that characterizes the gain of the main beam in the forward and reverse directions and the gain of the null beam in the forward and reverse directions.
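Read as a mixing model, the simplified 2 × 2 gain matrix can be inverted directly: the main-beam and null-beam outputs are each a weighted sum of the target (forward) and noise (reverse) signals, so the target is recovered by solving the 2 × 2 system. The sketch below is illustrative only; the gain values would in practice come from the beam characterization described above.

```python
import numpy as np

def unmix_2x2(main_out, null_out, g):
    """Recover the target signal from main-beam and null-beam captures.

    g is the simplified 2x2 gain matrix
        [[g_main_fwd, g_main_rev],
         [g_null_fwd, g_null_rev]]
    giving each beam's gain toward the target (forward) and the noise
    (reverse). The model  [main; null] = g @ [target; noise]  is
    inverted sample by sample.
    """
    mixed = np.vstack([main_out, null_out])     # shape (2, n_samples)
    target, noise = np.linalg.inv(g) @ mixed    # unmix both signals
    return target
```

In practice the gains (and hence the inversion) would likely be applied per frequency band rather than broadband, since beam gain varies with frequency.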

A gimbaled microphone may be configured to use the above method.

The microphone may be a MEMS array.

The microphone may alternatively be an end-fire microphone array.

One or more sensors may detect directional data corresponding to the relative orientation of the gimbaled microphone and an Unmanned Aerial Vehicle (UAV) to which it is mounted, and the processing unit may use the directional data to filter noise data from the target audio data.

Alternatively, the noise data may be filtered from the target audio data without reference to the relative orientation of the gimbaled microphone and without the microphone being mounted to an Unmanned Aerial Vehicle (UAV).

An Unmanned Aerial Vehicle (UAV) may include a gimbaled microphone as above.

It is to be understood that the term "comprising" may, under varying jurisdictions, be attributed with either an exclusive or an inclusive meaning. For the purposes of this specification, and unless otherwise indicated, the term is intended to have an inclusive meaning; i.e., it will be taken to mean an inclusion of the directly referenced components listed, and may also include other unspecified components or elements.

Any reference in this specification to other documents does not constitute an admission that they are prior art, that they may be effectively combined with other references, or that they form part of the common general knowledge.

Drawings

The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments of the invention and, together with a general description of the invention given above, and the detailed description of the embodiments given below, serve to explain the principles of the invention, in which:

FIG. 1 is a block diagram of a UAV including a system for noise filtering;

FIG. 2 is a block diagram of a system for noise filtering in an example context;

fig. 3a and 3b are directional diagrams (polar patterns) of a sound capture device according to one embodiment;

fig. 4a and 4b are directional diagrams of a sound capture device according to another embodiment;

FIG. 5 is a block diagram of a UAV;

FIG. 6 is a flow diagram of noise filtering according to one embodiment;

FIG. 7 is a flow diagram of noise filtering according to one embodiment;

FIG. 8 is a flow diagram of noise filtering according to one embodiment;

FIG. 9 is a schematic diagram of noise filtering according to one embodiment;

FIG. 10 is a schematic diagram of noise filtering according to one embodiment;

FIG. 11 is a schematic diagram of noise filtering according to one embodiment;

FIG. 12 is a diagram of noise filtering according to one embodiment; and

FIG. 13 is a diagram of noise filtering according to one embodiment.

Detailed Description

The systems, methods, and apparatus described herein provide improved target audio capture using an Unmanned Aerial Vehicle (UAV). For ease of illustration, "target audio" will be used to indicate the sound that is desired to be captured. For example, the target audio may include speech of a person being photographed, ambient sound of a scene being photographed, or sound from an industrial location. "noise" will be used to indicate unwanted sounds and/or background sounds that are not the target audio. Although this would primarily include sound generated by the UAV itself (e.g., sound from a motor and/or propeller assembly of the UAV), it is not limited in this respect. For example, background sounds in the scene being photographed, or ambient sounds (when it is not desired to capture these sounds) may also be included.

Before describing the method for noise filtering, it is helpful to outline the system. Fig. 1 shows a block diagram of a UAV 200 that includes a system 100 for noise filtering. System 100 may be located on or around UAV 200. The system 100 may be configured as a payload that is removably or permanently mountable to the UAV. The system may also be incorporated partially or completely into the UAV itself.

The system may include a sound capture device that may include a target audio capture device 102 configured to capture target audio and a noise capture device 104 configured to capture noise. Although the target audio capture device 102 and the noise capture device 104 are shown as distinct elements, in some embodiments they may be part of a single element. Embodiments of the target audio capture device 102 and the noise capture device 104 will be described in more detail below.

The system 100 may include a sensor module 106. The sensor module 106 may include various sensors configured to sense various information about the system, including information about the target audio source, the noise source, and/or the UAV itself. For example, the sensor module 106 may include a GPS sensor 112 configured to sense a GPS location of the system 100 (and thus also of the UAV 200 to which the system 100 is attached). The sensor module may also include a sensor configured to sense the relative orientation of the target audio capture device 102 and/or the noise capture device 104. Other possible sensors are described in more detail below.

The system 100 may include an image capture device 118. The image capture device 118 may be used for applications of the system requiring image capture (e.g., for photographic purposes). The image capture device 118 may include a video camera suitable for the type of filming desired (e.g., for leisure use, the camera may be a relatively light and inexpensive video camera, while for film production, the camera may be a relatively heavy but higher-quality film camera). The image capture device 118 may also include a still camera for capturing still images, which may be more suitable for certain applications where video is not required.

At least some of the components of the system may be mounted on a gimbal (not shown in fig. 1). In one embodiment, the entire system 100 may be gimbaled to the UAV 200 such that the system can be steered relative to the UAV 200. In other embodiments, components of the system may be mounted on gimbals attached to the rest of the system or to the UAV itself. For example, the target audio capture device 102, the noise capture device 104, and the image capture device 118 may be mounted together or individually on a gimbal, which in turn is attached to the system 100. In this way, the devices can be steered relative to the rest of the system 100, relative to the UAV 200, and/or relative to each other.

The system 100 may include a control module 114. The control module 114 may be connected to an actuator 116. The actuator 116 may be one or more gimbal motors that may be controlled to steer one or more of the gimbals described above. For example, where the entire system is mounted on a gimbal, the actuator may be controlled to steer the gimbal so that the target audio capture device 102 is directed toward the target audio source. In another example, if the target audio capture device 102 is mounted on a first gimbal and the image capture device 118 is mounted on a second gimbal, the actuator may include a first gimbal motor and a second gimbal motor, the first gimbal motor may be controlled to steer the first gimbal such that the target audio capture device 102 is directed toward the target audio source, and the second gimbal motor may be controlled to steer the second gimbal such that the image capture device 118 is directed toward the image that is desired to be captured.

The modules and devices described above may all be connected to the processing unit 108. Processing unit 108 may be configured to receive input from, process information from, and generate output that controls the operation of, the various modules and devices. For simplicity, the processing unit 108 is shown in fig. 1 as a single module; however, it may be divided into multiple modules, some of which may be incorporated into other modules of the system. For example, a sensor module may have the ability to undertake some processing of its own information, in which case the processing unit may be considered to be at least partially incorporated into the sensor module.

The system may also include a communication module 110. The communication module may be configured for bi-directional communication with a remote processing unit 120. Such bidirectional communication may be by way of any suitable wired or wireless communication protocol. Although the remote processing unit 120 of fig. 1 is shown as part of the UAV 200, in some implementations the remote processing unit may be remote from the UAV 200 (e.g., a laptop computer located on the ground). In this manner, the processing methods described below may be performed by the processing unit 108, by the remote processing unit 120, or a combination thereof. The communication module may communicate with other communication modules (not shown) incorporated in the UAV. The other communication module may be configured to communicate with a remote device (e.g., a laptop computer used by a user to control operation of the UAV). In this way, the system 100 need not establish a separate communication line with the remote device, but may rely on an existing communication line between the UAV and the remote device.

Various components of the system may be implemented as one or more suitable integrated circuits. For example, the integrated circuit may be an ASIC or FPGA that may be well suited for UAV applications due to relatively small weight, size, and low power consumption.

The system 100 also includes a power supply 122. The power supply supplies power to the various modules and devices of the system 100. The power source may be a replaceable or rechargeable battery. Although the power source 122 is shown as part of the system 100, in other embodiments the power source may be part of the UAV 200. However, providing the system 100 with its own power source (rather than relying on the existing power source of the UAV) is advantageous because the power source of the UAV may be electrically "noisy", which may affect the quality of the signals generated by the various modules and devices and ultimately the quality of the noise filtering.

Fig. 2 is a block diagram illustrating the system 100 and UAV 200 described above in an example context. Fig. 2 shows the system 100 configured as a payload 210 mountable to the UAV 200. Fig. 2 shows a user 201 controlling the UAV 200 and capturing target audio from a target audio source 203 using the system 100. For example, a user may use the UAV 200 to film a meeting where it is desired to capture what the participants are saying. The user 201 uses a remote device that includes a remote processing unit 120, which wirelessly communicates with the communication module to control the system 100. The user 201 may also control the UAV 200 (e.g., fly the UAV) using the remote processing unit 120.

The payload 210 is mounted to the UAV 200 via a gimbal and is able to move independently of the UAV 200. The user may fly the UAV 200 toward the target audio source. The user controls the system 100 such that the image capture device (not shown) is pointed at the target audio source (thereby filming the scene). The target audio capture device 102, which has been configured to align with the axis of the image capture device, will also capture target audio from the target audio source 203.

Noise is captured by the noise capture device 104 from one or more noise sources 205a and 205b, such as the propellers of the UAV 200. As shown in fig. 2, the noise capture device 104 may have two components 104a and 104b. Each component may be located on or around the UAV 200 to better capture noise from the noise sources (e.g., they may be mounted on either side of the UAV 200 to capture noise from the left and right propellers 205a, 205b), and they may be directed at the noise sources. In this embodiment, although the noise capture device 104 may be considered part of the system 100, it is mounted directly to the UAV 200 rather than forming part of the payload 210. In this way, when the user turns the payload (e.g., to point the image capture device toward the scene being filmed), the position and orientation of the noise capture device 104 relative to the noise sources 205a and 205b is not affected.

As will be described in greater detail below, the processing unit of the system 100 uses the sensor module to determine the position and/or orientation of the target audio capture device 102 relative to the noise capture device 104. The position and/or orientation information is used to adjust parameters of the noise filtering algorithm. The processing unit uses the adjusted noise filtering algorithm to output filtered target audio based on the captured target audio. The system is therefore able to capture relatively clean target audio from the target audio source 203, minimizing interference from noise generated by the UAV 200 itself. The filtered target audio is time-synchronized with the captured video. The filtered target audio and video may be streamed to the remote processing unit 120 and viewed by the user 201.

Target audio capture device and noise capture device

In one embodiment, the target audio capture device 102 and the noise capture device 104 may be sound capture devices. A sound capture device may include any suitable number of microphones. Without limitation, these microphones may be MEMS microphones, condenser microphones (e.g., electret condenser microphones), electret microphones, parabolic microphones, moving coil microphones, ribbon microphones, carbon particle microphones, piezoelectric microphones, fiber optic microphones, laser microphones, and/or liquid microphones. Particular microphones may be used for their directivity patterns, for example, shotgun microphones (such as supercardioid or hypercardioid microphones), cardioid microphones, and/or omnidirectional microphones. The microphones may be formed as an array, for example an array of two or three cardioid or omnidirectional microphones, or an array of MEMS microphones.

The microphone may be selected to take advantage of its particular properties. These properties may include directivity (illustrated by its characteristic directivity pattern), frequency response (which may correspond to the target audio and/or noise), or signal-to-noise ratio.

The use of directional beamforming (MVDR, etc.) may require characterization by impulse response in order to generate a beam. Beam performance is best when the sound or noise source lies exactly in the characterized direction, but degrades when the physical orientation of the array changes and the source is no longer aligned. In other words, ideally each UAV would first need to be calibrated across a wide range of relative gimbal orientations. In use, a particular gimbal position would then be approximated to the nearest calibration point, with varying degrees of efficacy.
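For illustration only (not part of any claimed embodiment), the nearest-calibration-point lookup described above can be sketched in Python; the function name and the representation of calibration points as angles are assumptions:

```python
import numpy as np

def nearest_calibration(gimbal_angle_deg, calibrated_angles_deg):
    """Return the calibrated orientation closest to the current gimbal angle.

    Angles are compared on the circle, so 359 degrees is treated as only
    2 degrees away from 1 degree.
    """
    current = np.deg2rad(gimbal_angle_deg)
    cal = np.deg2rad(np.asarray(calibrated_angles_deg, dtype=float))
    # Circular (wrapped) angular distance between each calibration point
    # and the current gimbal angle.
    diff = np.arctan2(np.sin(cal - current), np.cos(cal - current))
    return calibrated_angles_deg[int(np.argmin(np.abs(diff)))]
```

For example, with calibration points at 0, 90, 180, and 270 degrees, a gimbal position of 350 degrees maps to the 0 degree calibration point.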

In one embodiment, the sound capture device may be an array of MEMS microphones that has been configured to have a directivity pattern as shown in fig. 3a. The directivity pattern shows that the sound capture device 311 has two sensitivity lobes. As will be described in more detail below, the captured target audio and noise may then be used for noise filtering to produce filtered target audio.

This sound capture device 311 is advantageous because the directivity pattern is such that, if the first sensitivity lobe 313 is directed towards the target audio source 303, the sound capture device will capture the target audio. The shape of the second sensitivity lobe 315 means that the sound capture device will also capture noise from noise sources other than the target audio source (e.g., the propulsion units 305a and 305b, which are fixed in position relative to the sound capture device 311). Another advantage is that, regardless of the direction of the first sensitivity lobe 313 relative to the noise sources, noise from the noise sources will be captured within the second lobe 315 with a suitably constant gain. This is illustrated by fig. 3b. The target audio source 303 has moved relative to the sound capture device (compare with fig. 3a), and the sound capture device 311 has been redirected such that the first sensitivity lobe 313 is directed towards the target audio source 303. However, the noise from the noise sources 305a and 305b, although now arriving from a different direction relative to the sound capture device, will still be captured by the second lobe 315. Further benefits of this arrangement for noise filtering will become apparent when the method is described in more detail below.

A beam with a wide capture area and approximately equal gain across the frequency may be more suitable for use on a mobile gimbal system mounted on a UAV because it reduces the inaccuracy of the response to changes in the relative position of the noise source.

For the purpose of noise filtering using beamforming to capture sound and noise sources, a wider beam is advantageous because it provides a more complete capture of the noise received by the audio capture device. Smaller variations in the response across the beam acquisition arc also help reduce errors in source separation, which are more severe the further the response is off target.

The array and beamformer may use null beams with a wide and constant response that approximates the ideal case described above. This allows them to be used in a gimbal-mounted drone system that moves the array relative to the noise source while maintaining the performance required for noise filtering.

This also means that a single implementation with a wider null beam can switch between drone airframes with different motor positions while still capturing the motors as noise sources. It may also allow switching not only between airframes of the same drone model, but also between airframes of different drone models.

In one example, the null beam may be wide enough to capture all noise sources. In this case, noise sources such as motors/rotors can be characterized in terms of a known relative direction from the audio recording device (very close), a known relative distance, and a significant signal power compared to the signal of interest (possibly ranging from -5 dB to +10 dB relative to the signal). For example, on a quadrotor UAV, the null beam may be at least 180°, depending on the position of the gimbal. In other examples, the null beam may be 355°, 350°, 340°, 330°, 320°, 310°, 300°, 290°, 280°, 270°, 260°, 250°, 240°, 230°, 220°, 210°, 200°, or 190° for different main beam widths. Alternatively, a null beam may be defined by the absence of a main beam. For example, if the user selects a main beamwidth of X, the null beamwidth may be 360 - X, or may be 360 - X - Y, where Y is a buffer width, which may be fixed, may be user selectable, or may be determined algorithmically.
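The 360 - X (or 360 - X - Y) relationship above can be expressed as a trivial helper. This is an illustrative sketch only; the function name is an assumption:

```python
def null_beamwidth(main_beamwidth_deg, buffer_deg=0.0):
    """Null beamwidth as the angular region not covered by the main beam,
    optionally reserving a buffer region between the two beams."""
    width = 360.0 - main_beamwidth_deg - buffer_deg
    if width < 0:
        raise ValueError("main beam plus buffer exceeds a full circle")
    return width
```

For example, a user-selected main beamwidth of 60° with a 10° buffer leaves a 290° null beam.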

The level of suitably constant gain across the angle of the null beam may vary depending on the application. For example, in commercial audio capture on a typical quadrotor UAV, the gain may vary by less than 20% across the beamwidth. In other examples, the null beam gain may vary by less than 40%, 35%, 30%, or 25% across the beamwidth.
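A simple way to check the "gain varies by less than 20% across the beamwidth" criterion against measured gain samples is sketched below. This is purely illustrative; the function names and data layout are assumptions:

```python
import numpy as np

def gain_variation(gains):
    """Fractional variation of gain samples taken across the null
    beamwidth: (max - min) / max."""
    g = np.asarray(gains, dtype=float)
    return float((g.max() - g.min()) / g.max())

def meets_criterion(gains, limit=0.20):
    """True if the gain varies by less than `limit` across the beamwidth."""
    return gain_variation(gains) < limit
```

For example, gain samples of 1.0, 0.9, and 0.85 vary by 15% and satisfy the 20% criterion.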

The MEMS array may be implemented as an endfire microphone array used as the audio capture device. An endfire array has a form factor similar to a standard shotgun microphone. This shape is better suited to the mounting constraints of a drone gimbal than other array structures.

The endfire array structure also inherently provides the greatest possible directivity at either end of the array, which allows optimal capture of the signal/noise source of interest while maintaining high rejection in other directions.

Such an array may provide the additional advantage of a wider and more effective frequency response. For example, frequency performance may be characterized by the level of gain variation across the beamwidth of the null beam. The level of suitably constant gain across the frequency range may vary depending on the application. For example, the lowest usable frequency, such as 150 Hz or 1 kHz, may be that at which the null beam gain varies by less than 20% across the beamwidth. In commercial audio capture on a typical quadrotor UAV, this may result in a usable frequency range of 150 Hz to 20 kHz for noise filtering purposes.

In another embodiment, beamforming may be used to define sensitivity lobes that capture both the target audio source and the noise source. For example, referring to fig. 4a, a directivity pattern for a sound capture device 411 after appropriate beamforming is shown. A first beam, as shown by the first sensitivity lobe 413, has been configured to capture target audio from the target audio source 403. A second beam, as shown by the second sensitivity lobe 415a, has been configured to capture noise from noise sources (e.g., the propulsion units 405a and 405b, which are fixed in position relative to the sound capture device 411). By beamforming in this way, the sound capture device 411 is able to capture target audio and noise which, as described below, may then be used for noise filtering to produce filtered target audio.

If the target audio source moves relative to the sound capture device, the same beams will no longer be effective. This is illustrated by fig. 4b. In this figure, the target audio source 403 has moved (compared to fig. 4a), requiring the sound capture device 411 to turn such that the first sensitivity lobe 413 is directed towards the target audio source 403. However, because the device has turned, the noise from noise sources 405a and 405b will no longer be captured by the existing sensitivity lobes (as shown by dashed lines 415a and 415b), thereby limiting the captured noise. Therefore, to more accurately capture noise, a new beamforming configuration is implemented (as shown at 417a and 417b). This beamforming may be understood as beam redirection that compensates for the turning of the sound capture device. In another embodiment, rather than physically redirecting the sound capture device, a new beamforming configuration may be implemented if the target audio source is no longer within the first sensitivity lobe, in which the second sensitivity lobe remains unchanged but the first lobe is steered to capture the target audio source. This approach may be advantageous because the sound capture device need not be physically redirected; instead, the beam is steered.

The sound capture device can have a plurality of beamforming configurations. The sound capture device may be configured to achieve an appropriate beamforming configuration depending on the relative position of the target audio source with respect to the noise source. In order to achieve a suitable beamforming configuration, it is necessary to detect the relative direction of the target audio source with respect to the noise source. For UAVs, only the direction of the target audio source relative to the UAV needs to be detected, since the dominant noise source (i.e., the motor and propeller assembly) is fixed relative to the UAV.

In another embodiment, the target audio capture device may be different from the noise capture device. The target audio capture device may be one or more microphones. Without limitation, these microphones may be MEMS microphones, condenser microphones (e.g., electret condenser microphones), electret microphones, parabolic microphones, moving coil microphones, ribbon microphones, carbon particle microphones, piezoelectric microphones, fiber optic microphones, laser microphones, and/or liquid microphones. Particular microphones may be used for their directivity patterns, for example, shotgun microphones (such as supercardioid or hypercardioid microphones), cardioid microphones, and/or omnidirectional microphones. The microphones may be formed as an array, for example an array of two or three cardioid or omnidirectional microphones, or an array of MEMS microphones. The microphone may be selected to take advantage of specific properties appropriate for target audio capture. These properties may include directivity (illustrated by its characteristic directivity pattern), frequency response (which may correspond to the target audio), or signal-to-noise ratio.

If multiple separate microphones are used, the microphones may be evenly distributed around the UAV, and may be selectively switched on to capture the target audio from a particular direction. In other embodiments, the array of target audio capture devices may be evenly radially spaced from each other on the UAV and selectively activated to capture target audio from different directions.

The noise capture device may be one or more microphones. Without limitation, these microphones may be MEMS microphones, condenser microphones (e.g., electret condenser microphones), electret microphones, parabolic microphones, moving coil microphones, ribbon microphones, carbon particle microphones, piezoelectric microphones, fiber optic microphones, laser microphones, and/or liquid microphones. Particular microphones may be used for their directivity patterns, for example, shotgun microphones (such as supercardioid or hypercardioid microphones), cardioid microphones, and/or omnidirectional microphones. The microphones may be formed as an array, for example an array of two or three cardioid or omnidirectional microphones, or an array of MEMS microphones. The microphone may be selected to take advantage of specific properties suitable for noise capture. These properties may include directivity (illustrated by its characteristic directivity pattern), frequency response (which may correspond to the noise), or signal-to-noise ratio.

The target audio capture device may have the same type of microphone as the noise capture device or may have a different type of microphone.

Position of

The sound capture device may be mounted (possibly via a gimbal) to a portion of a load (e.g., load 210 of fig. 2) on the underside of the UAV. Alternatively, the sound capture device may be connected to the UAV directly by a gimbal. In either case, the sound capture device can be steered relative to the UAV.

As shown in fig. 5, the sound capture device 507 may be mounted to the UAV 500 in a space within 10 degrees of the plane of the motor and propeller assembly 509. This is advantageous because noise from the motor and propeller assembly 509 is minimal in this space. The sound capture device may be mounted toward the front or rear of the UAV (rather than the sides) to maintain balance. The sound capture device (or attached gimbal) may be mounted via a connection configured to isolate and/or dampen vibrations generated by the UAV.

Where there are different target audio capture devices and noise capture devices, the devices may be remote from each other in order to minimize the noise picked up by the target audio capture devices. For example, the target audio capture device may be located on a UAV load that is suspended below the UAV and faces the ground at an angle, while the noise capture device may be mounted closer to and directed toward the noise source. In another example, the target audio capture device may be located directly to the side of the UAV within 10 degrees of the plane of the motor and propeller assembly 509 (similar to the arrangement described with respect to fig. 5). The target audio capture device (or attached gimbal) may be mounted via a connection configured to isolate vibrations.

The noise capturing device may be fixed or may be movable relative to the load or the UAV. The noise capturing device is configured to face a source of noise to be filtered from the target audio. Examples of noise include, but are not limited to, noise from UAV motors and/or propeller assemblies or wind noise. The noise capture device may be located near an arm of the UAV or other noise source on the UAV.

Mobility

The sound capture device (or the target audio capture device if different from the noise capture device) is movable relative to the UAV (and/or the noise capture device). For example, they may be mounted via independent steerable gimbals. In the event that the sound capture device is unable to move relative to the UAV, the UAV itself may move to point the sound capture device at the target audio source. In one case, the sound capture device (or the target audio capture device if different from the noise capture device) may be aligned with the image capture device such that the sound capture device is directed toward a target audio source that is also captured by the image capture device. In one embodiment, the sound capture device and the image capture device may be mounted on the same gimbal to ensure that they remain aligned regardless of the direction in which they face.

Sensor module

The sensor module 106 (introduced with the system 100 of fig. 1) may include a target sensor for sensing data about a target audio source. Examples of target sensors include, but are not limited to, visual sensors (e.g., imaging devices capable of detecting visible, infrared, or ultraviolet light, such as cameras or thermographic cameras), proximity sensors (e.g., ultrasonic sensors, lidar, laser rangefinders, time-of-flight cameras), or other field sensors (e.g., magnetometers, electromagnetic sensors). These vision sensors may be attached to or replace the image capture devices described previously. The sensor data may be communicated to the processing unit 108 or to a remote device.

Target data from the target sensor may be used to control the direction of the target audio capture device to track the target audio source. The processing unit may be configured to automatically track the target audio source (e.g., the target audio source may include a radio transceiver whose location the processing unit is able to detect via an appropriate transceiver included in the sensor module) or the tracking may require some input from the user (e.g., the user may visually track the target audio source via video captured by the image capture device or an appropriate visual sensor and manually redirect the target audio capture device).

Where the target audio capture device is attached via a gimbal, it may be steered independently with respect to the load and/or the UAV. Where the load is attached via a gimbal, the load itself may be steered (and thus the attached target audio capture device reoriented) toward the target audio source. The sensor module may include a gimbal sensor to detect the orientation of the gimbal, allowing the relative direction of the target audio source or target audio capture device with respect to the UAV to be determined. For example, if a user manually controls the load to track the target audio source via video feedback, the gimbal sensor may sense the orientation of the gimbal and thus allow the relative direction of the target audio source with respect to the UAV to be determined.

Data from the sensors may be associated with target audio source data and may be used to obtain measurements from the target, map an area, or assist in UAV navigation. The sensor data may be streamed to a remote location in real time, transmitted to a remote device, or stored locally.

The sensor module may include other sensors for determining the position, orientation, and/or motion of the UAV, the load, and/or the target audio capture device, and the noise capture device. Examples of sensors include, but are not limited to, a location sensor (e.g., a GPS sensor or a mobile device transmitter capable of position triangulation), an inertial sensor (e.g., an accelerometer, gyroscope, or inertial measurement unit), an altitude sensor, and/or a pressure sensor. For example, the sensor module may include an electronic compass for measuring the orientation (e.g., azimuth and inclination) of the UAV.

Although the sensors are described above as part of a sensor module belonging to the system 100 mounted on a load attached to the UAV, some sensors may be mounted on or incorporated into the UAV itself. For example, the UAV may include its own GPS sensor, and the system 100 may be configured to receive data from that existing GPS system via a communication module.

The "directional data" as determined by the sensors may include the relative angle of the sound capture device with respect to the drone/noise source. This may be derived from telemetry from the rotating gimbal or other sensors of the mounting device and, as described below, is used to select appropriate input parameters in the spatial noise filtering system.

Target audio source location

Sensor data (including, for example, gimbal data and GPS data) may be combined to calculate an absolute position of a target audio source. For example, a user may remotely point a target audio capture device at a target audio source. A range finder (e.g., a laser range finder aligned with the direction of the target audio capture device) may calculate the distance between the UAV load and the target audio source. The gimbal sensor may detect the relative direction of the target audio capture device with respect to the UAV and the accelerometer may be used to detect the orientation of the UAV. If the relative direction and distance of the target audio source is known, and the absolute position of the UAV is known (e.g., using GPS), the absolute position of the target audio source can be determined.
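As a purely illustrative sketch of the calculation described above, the combination of UAV position, heading, gimbal bearing, and rangefinder distance can be expressed in local flat (east, north) coordinates. A real implementation would work in geodetic coordinates; all names and the flat-earth simplification here are assumptions:

```python
import math

def target_position(uav_xy, uav_heading_deg, gimbal_bearing_deg, range_m):
    """Estimate the target audio source position in local flat
    (east, north) coordinates from the UAV position, the UAV heading,
    the gimbal's bearing relative to the UAV, and a rangefinder distance.
    """
    # Absolute bearing of the target, measured clockwise from north.
    absolute_bearing = math.radians(uav_heading_deg + gimbal_bearing_deg)
    east = uav_xy[0] + range_m * math.sin(absolute_bearing)
    north = uav_xy[1] + range_m * math.cos(absolute_bearing)
    return (east, north)
```

For example, a UAV at the origin heading north with the gimbal pointing straight ahead and a 100 m range reading places the target 100 m north of the UAV.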

Load(s)

As described with respect to fig. 2, the load of the UAV may be located below the UAV. The load may include a system for noise filtering as described herein. However, some aspects of the system may be shared with the UAV (e.g., a power source).

The load may be removably or permanently attached to the UAV. The load may be attached via a gimbal such that it can be steered independently of the UAV (i.e., by controlling the yaw, roll, and pitch of the gimbal, and thereby of the load).

Processing

The system 100 may partially or fully process the audio and noise data on board to produce filtered target audio. Alternatively, the system 100 may store the audio and noise data for post-processing. The system 100 may additionally include a data storage component that stores data collected and/or processed by the processing unit 108. In one embodiment, the data storage component may store data when the connection between the system 100 and the remote processing unit 120 is lost, for subsequent transfer when the connection is restored. The data storage component may be an SD card.

Characteristics of UAV

The system 100 for noise filtering may be combined with other systems and methods capable of reducing noise generated by the UAV itself. For example, the motor and propeller assembly of the UAV may be shrouded, the UAV may include noise absorbing materials, and/or the UAV may be provided with a noise cancellation launcher.

The UAV may navigate by remote control, pre-programming, or autonomous navigation. The UAV may include one or more propulsion units that allow the UAV to move in up to six degrees of freedom. The UAV may be any suitable size. The UAV may include a central body and one or more arms extending outward from the central body that includes a UAV propulsion unit. The central body may include a housing that includes UAV electronics.

The UAV may itself include one or more sensors. Examples of sensors of UAVs include, but are not limited to, position sensors (e.g., GPS sensors, mobile device transmitters capable of position triangulation), visual sensors (e.g., imaging devices capable of detecting visible, infrared, or ultraviolet light, such as cameras), proximity sensors (e.g., ultrasonic sensors, lidar, time-of-flight cameras), inertial sensors (e.g., accelerometers, gyroscopes, or inertial measurement units), altitude sensors, pressure sensors, audio sensors (e.g., microphones), and/or field sensors (e.g., magnetometers or electromagnetic sensors).

The battery may be coupled to the UAV 200. The battery may be coupled to the UAV to provide power to one or more components of the UAV. The battery may provide power to the one or more propulsion units and any other components of the UAV when coupled to the UAV. In some embodiments, the battery may also provide power to a system for noise filtering that includes the target audio capture device. In other embodiments, the system relies on its own power supply.

Although described with respect to UAVs, embodiments may also be used to improve audio capture of a target audio source in any suitable mobile vehicle, including, but not limited to, UAVs, helicopters, gyroplanes, vertical lift systems, and fixed wing aircraft.

Target audio and noise

The UAV may be configured to fly in any suitable environment (including indoor and outdoor environments) for targeted audio capture. The target audio may be ambient audio, sounds produced by humans, animals, machinery, the environment, or any other audio that is desired to be captured.

The distance between the UAV and the target audio source may vary depending on the application.

The noise captured by the noise capture device may include noise generated by the UAV itself or ambient noise. Examples of noise generated by the UAV include noise generated by a motor and propeller assembly, noise generated by an onboard instrument (such as a gimbal motor or a camera), or sound generated by interaction of the UAV with the airflow. Ambient noise may include general wind noise, noise from nearby air vehicles, or other environmental noise, such as traffic noise.

In some implementations, the target audio source may be closer to the target audio capture device than the noise capture device. Alternatively, the noise source may be closer to the target audio capture device than the target audio source.

Remote processing unit

The remote processing unit 120 may be incorporated into any suitable remote device including, but not limited to, a personal computer, a laptop computer, a smart phone, a tablet computer, or a custom device. The remote device may include a user interface for controlling the UAV and/or the system for noise filtering 100 and a display for displaying data from the UAV and/or the system. The data may include sensor data and/or target audio or noise data.

The system 100 may include a control mechanism for starting and stopping audio capture. This is useful, for example, in live broadcasting. A user with a remote device may communicate with the system 100 to start audio and/or video capture, redirect the target audio capture device or image capture device, and stop audio and/or video capture. The user may selectively capture audio alone or video alone.

In some embodiments, the load may include a speaker that allows for remote communication. The remote device captures an audio message from the user (e.g., an instruction for the person receiving a package) and wirelessly transmits the audio message to a communication module of the system, which then emits it through the speaker. In this way, remote communication is achieved.

Filtering methods

Having described the system and apparatus, various methods for noise filtering will now be described. Fig. 6 illustrates a method of generating filtered target audio using the system for noise filtering 100 described above. The steps may be performed by a processing unit or a remote processing unit, or a combination of both.

At step 602, the direction of the target audio source relative to the system is detected. In one embodiment, the target audio source may include a radio transceiver that communicates its location to the system 100, thereby enabling the direction toward the target audio source to be detected. In another embodiment, the user may use a video feed to steer the image capture device toward the target audio source, ensuring that the target audio source is within the field of view of the image capture device. For example, the image capture device may be mounted to the UAV via a gimbal that can be controlled such that the field of view of the image capture device faces the target audio source. In another example, the image capture device may be fixed to the UAV, in which case the user may move the UAV (by flying it to a particular location) so that the image capture device faces the target audio source. By determining the relative orientation of the image capture device with respect to the system, the direction of the target audio source may be detected.

At step 604, the target audio capture device (or sound capture device, in embodiments where the target audio capture device and the noise capture device are disposed in the same device) is pointed at the target audio source. In embodiments where an image capture device has been used to detect the direction of a target audio source, the target audio capture device may be aligned with the image capture device such that it is automatically directed towards the target audio source. In other embodiments, the target audio capture device may be redirected toward the target audio source, for example, by controlling a gimbal to which the target audio capture device is attached.

At step 606, the noise capture device is directed toward the noise source. Where the dominant noise source is noise from the motor or propeller assembly of the UAV, the one or more noise capture devices may have been directed at the noise source.

At step 608, the relative direction between the target audio capture device and the noise capture device is determined. Since the relative direction of the target audio capture device with respect to the system is known (the same as the relative direction of the target audio source detected at step 602) and the direction of the noise capture device is known, the relative direction between the target audio capture device and the noise capture device is determined.
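The determination at step 608 reduces to a signed angular difference between two known directions. The sketch below is illustrative only; the wrap-to-(-180, 180] convention is an assumption:

```python
def relative_direction(target_dir_deg, noise_dir_deg):
    """Signed angle from the noise capture device direction to the target
    audio capture device direction, wrapped to the range (-180, 180]."""
    d = (target_dir_deg - noise_dir_deg) % 360.0
    return d - 360.0 if d > 180.0 else d
```

For example, a target direction of 10° and a noise direction of 350° yield a relative direction of +20°, not 340°.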

At step 610, target audio from a target audio source is captured using a target audio capture device (or sound capture device, in embodiments where the target audio capture device and the noise capture device are provided in the same device). The noise is captured from the noise source using at least one noise capture device (or sound capture device, in embodiments where the target audio capture device and the noise capture device are provided in the same device).

At step 612, parameters of the noise filtering algorithm are adjusted using the directional data obtained at step 608.

At step 614, filtered target audio is generated using the adjusted noise filtering algorithm.

To continuously capture the target audio, the method may continuously or periodically repeat steps 602-608 in the event the target audio source moves relative to the system (e.g., the target audio source may be mobile, or the UAV may move relative to the target audio source).

Fig. 7 shows another embodiment of a method relying on beamforming.

At step 702, the relative direction of the target audio source with respect to the system is detected in much the same manner as described for step 602.

At step 703, the relative direction of the noise source with respect to the system is detected. In the case where the dominant noise source is noise from the motor or propeller assembly of the UAV, the relative direction will be known.

At step 705, the sound capture device is configured with an appropriate beamforming configuration such that beams are directed towards the target audio source and the noise source.

At step 708, the relative direction between the target audio source and the noise source is determined.

At step 710, target audio from a target audio source is captured using a sound capture device and noise is captured from a noise source using the sound capture device.

At step 712, parameters of the noise filtering algorithm are adjusted using the directional data obtained at step 708.

At step 714, filtered target audio is generated using the adjusted noise filtering algorithm.

To continuously capture the target audio, the method may continuously or periodically repeat steps 702-708 in the event the target audio source moves relative to the system.

Fig. 8 shows another embodiment of a method relying on beamforming.

At step 801, a sound capture device is implemented with a first beamforming configuration.

At step 802, the relative direction of the target audio source with respect to the system is detected in much the same manner as described for step 602.

At step 803, the relative direction of the noise source with respect to the system is detected. In the case where the dominant noise source is noise from the motor or propeller assembly of the UAV, the relative direction will be known.

At step 804, the sound capture device is oriented such that the target audio capture beam is directed toward the target audio source.

At step 805, the sound capture device is configured with an appropriate beamforming configuration such that beams are directed towards the target audio source and the noise source.

At step 808, the relative direction between the target audio source and the noise source is determined.

At step 810, target audio from a target audio source is captured using a sound capture device and noise is captured from a noise source using the sound capture device.

At step 812, parameters of the noise filtering algorithm are adjusted using the directional data obtained at step 808.

At step 814, filtered target audio is generated using the adjusted noise filtering algorithm.

To continuously capture the target audio, the method may repeat steps 802 to 808 continuously or periodically if the target audio source moves relative to the system.

Fig. 9 shows a schematic diagram of a method for generating filtered target audio z(t) using a sound capture device according to an embodiment. The sound capture device includes an array of microphones (denoted 1, 2, … M), each capturing sound data in the time domain: X1(t), X2(t), … XM(t). A Fourier transform converts the sound data to the frequency domain: X1(ω), X2(ω), … XM(ω).

The sound data X1(ω), X2(ω), … XM(ω) are passed to beamformer 0, which uses the direction data (e.g. detected at step 702 or step 802 described above) to apply the appropriate beamforming configuration such that the resulting target audio beam Y0(ω) is directed at the target audio source.

The sound data X1(ω), X2(ω), … XM(ω) are also passed to beamformer n, which uses the direction data (e.g. detected at step 703 or step 803 described above) to apply the appropriate beamforming configuration such that the resulting noise beam Yn(ω) is directed at the noise source.
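Although the specification does not prescribe a particular beamformer design, the beam steering described above can be illustrated with a simple frequency-domain delay-and-sum sketch. The array geometry, microphone spacing, look direction and frequency below are assumptions for illustration only, not the patented arrangement:

```python
import numpy as np

def delay_and_sum_weights(mic_positions, direction, freq, c=343.0):
    # Per-microphone propagation delay of a plane wave from `direction`
    # (a unit vector), then phase-compensating weights at frequency `freq`.
    delays = mic_positions @ direction / c
    return np.exp(-2j * np.pi * freq * delays) / len(mic_positions)

# Hypothetical 4-microphone linear array with 5 cm spacing
mics = np.array([[i * 0.05, 0.0] for i in range(4)])
look = np.array([1.0, 0.0])                  # steer along the array axis
w = delay_and_sum_weights(mics, look, freq=2000.0)

# A plane wave arriving from the look direction sums coherently to unit gain
arrival = np.exp(2j * np.pi * 2000.0 * (mics @ look) / 343.0)
gain = abs(w @ arrival)                      # unity in the look direction
```

Sources away from the look direction sum with mismatched phases and are attenuated, which is the spatial filtering the beamformers above rely on.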

The target audio beam Y0(ω) and the noise beam Yn(ω) are provided to a square-law unit, which calculates the energy magnitude in each frequency bin for each beam. The resulting data are provided to a PSD estimation unit, which estimates the power spectral density (PSD) of each beam. This can be performed using Welch's method. The PSD estimation relies on direction data; the directivity data may be pre-computed based on an impulse-response characterization of the system. When estimating the PSD of each beam, the PSD estimation unit uses the direction data to select the appropriate data.
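As a hedged illustration of the Welch estimation step (the segment length, overlap and test signal below are assumptions, not values from the embodiment), a numpy-only sketch:

```python
import numpy as np

def welch_psd(x, fs, seg_len=256, overlap=0.5):
    # Welch's method: split the signal into overlapping windowed segments,
    # take the periodogram of each, and average them.
    step = int(seg_len * (1 - overlap))
    window = np.hanning(seg_len)
    scale = fs * np.sum(window ** 2)         # periodogram normalization
    periodograms = []
    for start in range(0, len(x) - seg_len + 1, step):
        seg = x[start:start + seg_len] * window
        periodograms.append(np.abs(np.fft.rfft(seg)) ** 2 / scale)
    freqs = np.fft.rfftfreq(seg_len, 1.0 / fs)
    return freqs, np.mean(periodograms, axis=0)

# Demo: a 1 kHz tone buried in white noise at fs = 16 kHz
np.random.seed(0)
fs = 16000
t = np.arange(fs) / fs
x = np.sin(2 * np.pi * 1000 * t) + 0.1 * np.random.randn(fs)
freqs, psd = welch_psd(x, fs)
peak_freq = freqs[np.argmax(psd)]            # lands on the 1 kHz tone
```

Averaging the periodograms trades frequency resolution for a lower-variance PSD estimate, which is why it suits the per-beam power estimates used here.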

The PSD estimation unit generates weights, which are provided to a suitable filter (such as a Wiener filter, as shown in Fig. 9) that produces a filter H(ω) applied to the target audio beam Y0(ω). An inverse Fourier transform converts the result to the time domain, producing the filtered target audio z(t).
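The Wiener filtering stage described above can be sketched as follows. For self-containment the per-bin PSDs are taken here as known "oracle" values, whereas in the embodiment they would come from the PSD estimation unit; the signals and sampling rate are likewise illustrative:

```python
import numpy as np

def wiener_filter(beam, psd_target, psd_noise):
    # H(w) = S(w) / (S(w) + N(w)) applied per frequency bin, followed by
    # an inverse FFT back to the time domain to give z(t).
    Y0 = np.fft.rfft(beam)
    H = psd_target / (psd_target + psd_noise + 1e-12)  # guard divide-by-zero
    return np.fft.irfft(H * Y0, n=len(beam))

np.random.seed(0)
fs = 8000
t = np.arange(fs) / fs
clean = np.sin(2 * np.pi * 440 * t)          # stand-in target audio
noise = 0.5 * np.random.randn(fs)
mixed = clean + noise                        # what the target beam captures

# Oracle per-bin PSDs, used only so the demo is self-contained
psd_s = np.abs(np.fft.rfft(clean)) ** 2
psd_n = np.abs(np.fft.rfft(noise)) ** 2

z = wiener_filter(mixed, psd_s, psd_n)
err_before = np.mean((mixed - clean) ** 2)
err_after = np.mean((z - clean) ** 2)        # much smaller after filtering
```

Bins dominated by the target pass with gain near one, while noise-dominated bins are suppressed towards zero.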

Because the sound capture device continuously captures sound data X1(t), X2(t), … XM(t), a new beamforming configuration and PSD estimate are applied whenever the relative direction of the target source with respect to the noise changes (e.g. because the target source moves), thereby improving the filtered target audio z(t).

In embodiments where the sound capture device is physically steered relative to the UAV, there is no need to reconfigure the target audio beam once beamformer 0 has been applied. If the relative direction of the target audio source changes, the sound capture device is redirected so that the target audio beam continues to capture the target audio. However, since the relative direction of the noise source will change, a new noise beam must be computed by beamformer n.

Fig. 10 shows a schematic diagram of a method for producing filtered target audio z(t) using a sound capture device having a directivity pattern as described for Figs. 3a and 3b. Beamformer 0 generates a target audio beam (corresponding to the first sensitivity lobe) and beamformer n generates a noise beam (corresponding to the second sensitivity lobe). As the relative direction of the target audio source changes (e.g., the target audio source moves), the orientation of the sound capture device changes (e.g., it may be mounted via a gimbal that allows it to be steered). The beams themselves do not need to change, since the second sensitivity lobe captures the noise source regardless of direction. Furthermore, since the noise beam has an approximately uniform gain, the energy captured by the noise beam is largely insensitive to the relative position of the noise source. The arrays of Figs. 3a and 3b may therefore be simpler to use, as they do not require periodic updates with new direction data.

In this case, "direction data" may mean the spatial relationship of each element in the array forming the sound capture device to the noise source. This can be used to calculate a beamformer for capturing sound and noise sources before using the capturing device.

Many noise filtering processes are based on estimating the noise mixed into the input audio to filter out only that noise. This means that the accuracy of this estimation is critical to the actual performance of these processes.

In one example of a noise filtering system that can be used with a gimbal-mounted microphone array, the system receives spatially separated target sound and noise sources via two separate beamformers directed at those sources. Although the beams themselves provide spatial filtering, some non-target sound still leaks into each beam. The known response of each beam in each known target direction of interest can be used to estimate which components of the audio originate from those directions:

PSD_sources(f, t) = G⁻¹(f) × PSD_beams(f, t)

where PSD_beams is the power spectral density (PSD) of the audio captured by each beam directed at the target sound source and noise source of interest, G⁻¹ is the inverse of the square matrix containing the gain of each beamformer in each direction of interest, and PSD_sources is the estimated PSD of each of the different sources.
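A small numeric sketch of this recovery step, using an assumed 2 × 2 gain matrix for one target beam and one noise beam (the gain values and source PSDs are illustrative, not measured responses):

```python
import numpy as np

# Assumed 2x2 gain matrix: row = beam, column = source direction.
# Row 0 is the target beam (unit gain toward the target, some leakage of
# the noise source); row 1 is the noise beam.
G = np.array([[1.0, 0.2],
              [0.1, 1.0]])

# Ground-truth per-source PSDs at one frequency bin (for the demo only)
psd_sources_true = np.array([4.0, 9.0])

# Each beam observes a gain-weighted mix: PSD_beams = G x PSD_sources
psd_beams = G @ psd_sources_true

# Invert the known beam gains to recover the per-source estimates
psd_sources_est = np.linalg.inv(G) @ psd_beams
```

With exact gains the inversion recovers the source PSDs perfectly; the discussion below concerns what happens when G is poorly conditioned.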

When the diagonal elements of the gain matrix G are small, the inversion performed on a digital system is prone to numerical precision errors, greatly degrading the achievable performance. The severity of this error grows with the size of the gain matrix G, which becomes larger as the number of beams increases. The error can be mitigated by regularization:

G_regularized = G + R × I

where I is the identity matrix and R is a regularization factor. R is typically small, chosen as any number that ensures the diagonal elements are not too small. However, with regularization the matrix no longer accurately represents the beamformer gains, so the noise filtering performance is still degraded accordingly.
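The effect of this regularization on numerical conditioning can be illustrated as follows; the matrix values and the choice of R are assumptions for the demo:

```python
import numpy as np

def regularize_gain(G, R=1e-3):
    # G_regularized = G + R x I: lifts small diagonal elements so the
    # subsequent inversion stays numerically well conditioned.
    return G + R * np.eye(G.shape[0])

# A gain matrix with a near-zero diagonal element: inverting it amplifies
# floating-point error (large condition number).
G = np.array([[1e-8, 0.0],
              [0.0,  1.0]])
cond_before = np.linalg.cond(G)                          # on the order of 1e8
cond_after = np.linalg.cond(regularize_gain(G, R=1e-3))  # on the order of 1e3
```

The condition number drops by several orders of magnitude, at the cost of the matrix no longer exactly matching the true beamformer gains, as noted above.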

Typically, to meaningfully capture the relevant noise sources in a UAV system using narrow beams, both "sides" of the motors must have their own dedicated beams in addition to the target source beam. In UAV frames whose motors are spatially far apart, more beams may be needed, and the more beams used, the greater the numerical error introduced.

By using a wide null-beam implementation that captures all non-target sources in a beam whose gain is approximately constant, all noise inputs from the capture region can be modelled as a single "noise source" from one direction. In this model there are only two beams capturing two directions of interest: the direction of the target sound source and the direction of the "noise source". The matrix G, made up of the gain of each beamformer in each signal direction of interest, is thus reduced to a 2 × 2 matrix, minimizing the chance of precision errors and reducing the amount of regularization (if any) required. This may improve the accuracy of the resulting noise-filtered output.

This reduction in problem space also means a reduction in computational burden, making it easier to implement on UAVs in real time, where power consumption is a significant design consideration.

Although the response of a wide null beam is relatively angle- and frequency-invariant over large angles, there is still a small deviation from ideal unity gain, and noise filtering performance can be improved if the true response at the particular angle to the dominant noise source is used. In some applications, drone and gimbal telemetry may be used to select the appropriate gain for the matrix G given the current relative angle between the sound capture system and the noise source.

Fig. 11 shows a schematic diagram of a method for generating filtered target audio z(t) using separate noise capture devices (N1, N2) and a target audio capture device (S). In this embodiment, no beamforming is used.

The target audio from the target audio capture device is passed through a Fourier transform unit to produce target audio data XS(ω) in the frequency domain. The noise signals from the noise capture devices N1 and N2 are likewise passed through a Fourier transform unit to produce noise audio data XN1(ω) and XN2(ω).

These data are then passed through a square-law unit, a PSD estimation unit and a Wiener filter in much the same way as described for Fig. 9, producing the filtered target audio z(t).

Where the noise capture devices are fixed (e.g., permanently pointed at the motor and propeller assembly), the target audio capture device may still be steerable relative to the UAV. The relative direction of the target audio source may therefore change, and this information is provided to the PSD estimation unit.

FIG. 12 shows a diagram of noise filtering according to one embodiment, in which an adaptive filter is used to filter noise originating from the noise source out of the target audio, producing a filtered target audio output.
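The figure does not specify which adaptive algorithm is used; a common choice for this noise-canceller arrangement is the LMS algorithm, sketched here with an assumed tap count, step size and test signals (all illustrative, not from the embodiment):

```python
import numpy as np

def lms_cancel(primary, reference, n_taps=16, mu=0.005):
    # LMS adaptive noise canceller: the filter learns to predict the noise
    # component of `primary` from the noise `reference`; the prediction
    # error e is the filtered target audio.
    w = np.zeros(n_taps)
    out = np.zeros(len(primary))
    for n in range(n_taps, len(primary)):
        x = reference[n - n_taps:n][::-1]   # most recent reference samples
        e = primary[n] - w @ x              # error = target estimate
        w += 2 * mu * e * x                 # LMS weight update
        out[n] = e
    return out

np.random.seed(0)
fs = 8000
t = np.arange(2 * fs) / fs
target = np.sin(2 * np.pi * 300 * t)        # stand-in target audio
ref = np.random.randn(len(t))               # noise as seen at the noise device
noise_at_mic = 0.8 * np.roll(ref, 1)        # attenuated, delayed at target mic
primary = target + noise_at_mic

z = lms_cancel(primary, ref)
half = len(t) // 2                          # judge after convergence
err_before = np.mean(noise_at_mic[half:] ** 2)
err_after = np.mean((z[half:] - target[half:]) ** 2)
```

Because the target audio is uncorrelated with the noise reference, the filter converges towards cancelling only the noise path, leaving the target in the error output.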

FIG. 13 shows a schematic diagram of noise filtering according to one embodiment. Noise and target audio are captured. The signal is pre-amplified and then digitized. The noise is then filtered from the target audio. The filtered target audio may be stored/transmitted in digital and/or analog format.

While the present invention has been illustrated by the description of embodiments thereof, and while the embodiments have been described in detail, it is not the intention of the applicants to restrict or in any way limit the scope of the appended claims to such detail. Additional advantages and modifications will readily appear to those skilled in the art. The invention in its broader aspects is therefore not limited to the specific details, representative apparatus and method, and illustrative examples shown and described. Accordingly, departures may be made from such details without departing from the spirit or scope of the applicant's general inventive concept.
