System and method for capturing noise for pattern recognition processing

Document No.: 1026958 Publication date: 2020-10-27

Reading note: This technology, "System and method for capturing noise for pattern recognition processing", was designed by Robert Zopf, Victor Simileysky, Ashutosh Pandey, and Patrick Cruise on 2019-01-25. Main content: Example systems and methods capture a first plurality of portions of audio data by capturing audio data periodically at a first interval. Embodiments detect a speech onset in the audio data. In response to detecting the speech onset, the systems and methods switch from periodically capturing audio data to continuously capturing audio data. Embodiments combine at least one captured portion of the first plurality of captured portions of audio data with the continuously captured audio data to provide continuous audio data.
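The flow summarized above can be sketched in a few lines of Python. This is only an illustrative model, not the patented implementation: the class name `NoiseCapture`, the parameters `interval`, `portion_len`, and `onset_threshold`, and the simple amplitude test standing in for a speech onset detector are all assumptions introduced for this example.

```python
from dataclasses import dataclass, field
from typing import Iterable, List


@dataclass
class NoiseCapture:
    """Toy model: capture periodically, then continuously after a speech onset."""

    interval: int            # hypothetical: frames skipped between periodic captures
    portion_len: int         # hypothetical: frames kept in each periodic capture
    onset_threshold: float   # hypothetical stand-in for a real onset detector
    portions: List[List[float]] = field(default_factory=list)
    continuous: List[float] = field(default_factory=list)
    onset_seen: bool = False

    def process(self, frames: Iterable[float]) -> None:
        period = self.portion_len + self.interval
        current: List[float] = []
        for i, x in enumerate(frames):
            if self.onset_seen:
                # After onset, every frame is captured (continuous capture).
                self.continuous.append(x)
                continue
            # Onset detection looks at the live frame, not the captured portions.
            if abs(x) >= self.onset_threshold:
                self.onset_seen = True
                self.continuous.append(x)
                continue  # any unfinished periodic portion is simply dropped
            # Before onset, keep only the first portion_len frames of each period.
            if i % period < self.portion_len:
                current.append(x)
                if len(current) == self.portion_len:
                    self.portions.append(current)
                    current = []

    def combined(self) -> List[float]:
        """Prepend the most recently captured portion to the continuous data."""
        prefix = self.portions[-1] if self.portions else []
        return prefix + self.continuous


frames = [0.1, 0.1, 0.0, 0.0, 0.0, 0.2, 0.2, 0.0, 0.0, 0.0, 0.9, 0.8, 0.7]
cap = NoiseCapture(interval=3, portion_len=2, onset_threshold=0.5)
cap.process(frames)
result = cap.combined()
```

In this sketch the frames skipped during each interval are never buffered, which is where a real device could keep parts of the system in a lower power mode. The combined output places a short sample of background noise ahead of the speech.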

1. A method, comprising:

capturing a first plurality of portions of audio data by capturing the audio data periodically at a first interval;

detecting a speech onset in the audio data;

in response to detecting the onset of speech, switching from periodically capturing the audio data to continuously capturing the audio data; and

combining at least one captured portion of the first plurality of portions of audio data with the continuously captured audio data to provide continuous audio data.

2. The method of claim 1, further comprising processing the continuous audio data to identify speech in the continuously captured audio data.

3. The method of claim 1, comprising operating at least one power domain in a first power consumption mode to capture the first plurality of portions of audio data, and operating in a second power consumption mode during the first interval, wherein the first power consumption mode has a greater power consumption rate than the second power consumption mode.

4. The method of claim 3, wherein operating in the second power consumption mode comprises operating a sensor power domain in a monitor mode and operating a buffer power domain in a sleep mode.

5. The method of claim 1, wherein periodically capturing the audio data comprises sampling the audio data at a first sampling rate, and detecting the onset of speech comprises sampling the audio data at a second sampling rate, wherein the first sampling rate is greater than the second sampling rate.

6. The method of claim 1, further comprising setting or dynamically adjusting the first interval based on one or more noise characteristics or power consumption requirements.

7. The method of claim 1, wherein periodically capturing the audio data comprises periodically sampling the audio data and periodically buffering the sampled audio data, and continuously capturing the audio data comprises continuously sampling the audio data and continuously buffering the sampled audio data.

8. The method of claim 1, wherein at least one captured portion of the first plurality of captured portions of audio data is a most recently captured portion of the first plurality of captured portions of audio data.

9. The method of claim 8, wherein the combining comprises overlapping a portion of one end of the most recently captured portion of the audio data with a portion of one end of the continuously captured audio data.

10. The method of claim 9, wherein the portion of the one end of the most recently captured portion is less than 20 ms.

11. The method of claim 1, wherein detecting the onset of speech in the audio data comprises detecting the onset of speech without using the captured portions of the audio data.

12. The method of claim 11, wherein detecting the onset of speech in the audio data comprises waking a speech onset detector in response to the audio data meeting or exceeding an activation threshold of an audio interface, and executing a speech onset detection algorithm to determine that a speech-like signal is present in the audio data.

13. The method of claim 12, further comprising:

capturing a second plurality of portions of the audio data by capturing the audio data periodically at a second interval;

calculating another activation threshold using one or more portions of the second plurality of captured portions; and

providing the other activation threshold to the audio interface.

14. The method of claim 13, comprising operating in a first power consumption mode to capture the second plurality of portions of the audio data and calculate the other activation threshold, and operating in a second power consumption mode during the second interval, wherein a rate of power consumption of the first power consumption mode is greater than a rate of power consumption of the second power consumption mode.

15. The method of claim 14, wherein operating in the second power consumption mode comprises operating a sensor power domain in a monitor mode and operating a speech onset detection power domain in a sleep mode.

16. An audio processing device comprising:

an audio interface operable to sample audio data, a speech onset detector, a buffer, a combiner, and an audio interface control, wherein in response to the speech onset detector detecting a speech onset in the audio data, the audio interface control is operable to switch the audio processing device from periodically capturing the audio data at an interval to continuously capturing the audio data, wherein the combiner is operable to provide continuous audio data using at least one of the periodically captured audio data and the continuously captured audio data.

17. The audio processing device of claim 16, further comprising a wake phrase detector operable to process the continuous audio data to identify a wake phrase in the continuously captured audio data.

18. The audio processing device of claim 16, wherein the buffer is in a buffer power domain of the audio processing device, wherein during the interval the buffer power domain is in a sleep mode.

19. The audio processing device of claim 16, wherein the audio interface is configured to provide the audio data to the speech onset detector in response to the audio data meeting or exceeding a threshold activity level, the audio processing device further comprising a threshold calculation module configured to periodically wake up, turn on the audio interface to collect audio data, calculate an updated threshold activity level, provide the updated threshold activity level to the audio interface, and re-enter a sleep mode.

20. An electronic communication device, comprising:

one or more processors, a memory system, a communication interface, and an audio processing device, the audio processing device comprising: an audio interface to process audio data, a speech onset detector to detect a speech onset in the audio data, and an audio interface control to switch a buffer in the memory system from periodically buffering the audio data to continuously buffering the audio data in response to detecting a speech onset, and a wake phrase detector to detect a wake phrase in continuously buffered audio data using a portion of audio data from periodically buffered audio data, wherein the one or more processors cause the communication interface to wirelessly transmit the continuously buffered audio data to a network in response to detection of the wake phrase.

21. The electronic communication device of claim 20, wherein the audio interface control is configured to set or adjust an interval of the periodic buffering.

22. The electronic communication device of claim 20, wherein the audio interface control is configured to cause the audio interface to provide audio data having a first sampling rate to the speech onset detector and to provide audio data having a second sampling rate to the buffer, wherein the first sampling rate is less than the second sampling rate.
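As an illustration of the overlapped combining recited in claims 8 to 10, the following Python sketch joins the most recently captured noise portion to the continuously captured data over a short overlap. The claims say only that the ends overlap and that the overlapped portion is under 20 ms. The linear crossfade, the function name `overlap_combine`, and the parameters `sample_rate_hz` and `overlap_ms` are assumptions made for this example.

```python
from typing import List, Sequence


def overlap_combine(noise: Sequence[float], speech: Sequence[float],
                    sample_rate_hz: int, overlap_ms: float = 10.0) -> List[float]:
    """Join noise and speech, blending the tail of noise with the head of speech.

    A linear crossfade is assumed here; the source does not specify how the
    overlapped samples are merged.
    """
    if not 0.0 <= overlap_ms < 20.0:
        raise ValueError("overlap_ms must be at least 0 and less than 20")
    # Number of overlapped samples, limited by the length of either input.
    n = min(int(sample_rate_hz * overlap_ms / 1000), len(noise), len(speech))
    if n == 0:
        return list(noise) + list(speech)
    start = len(noise) - n
    blended = []
    for k in range(n):
        w = (k + 1) / (n + 1)  # weight ramps from noise toward speech
        blended.append((1.0 - w) * noise[start + k] + w * speech[k])
    return list(noise[:start]) + blended + list(speech[n:])


out = overlap_combine([1.0] * 4, [0.0] * 4, sample_rate_hz=1000, overlap_ms=2.0)
```

With a 1000 Hz sample rate and a 2 ms overlap, two samples are blended, so the eight input samples yield six output samples. Blending over the overlap, rather than butting the two segments together, is one way to soften the discontinuity at the joint.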

Technical Field

The present subject matter relates to the field of pattern recognition solutions. More specifically, but not by way of limitation, the present subject matter discloses techniques for capturing noise for pattern recognition processing.

Background

Devices with "always on" or "always listening" voice interface capabilities, such as voice-enabled digital assistants, smart speakers, and hands-free interfaces, traditionally require constant power, which either drains a battery or requires a power outlet. Portions of a speech-recognition-capable device may remain in a low power consumption mode until a speech-like sound is detected, at which point phrase detection may determine whether a particular word or phrase (i.e., a wake phrase) has been spoken. Implementing wake phrase detection increases power consumption because portions of the device remain powered on (e.g., "always on") for long periods of time.
