System and method for continuous media segment identification

Document No.: 1046968 Published: 2020-10-09

This technology, "System and method for continuous media segment identification", was created by W. Leo Hoarty on 2015-11-30.

Abstract: The present invention provides a means of identifying unknown media programs using the audio component of the program. The invention extracts audio information from media received by consumer electronic devices such as smart televisions and television set-top boxes, then transmits that information to a remote server, which in turn identifies the audio information of unknown identity by testing it against a database of known audio segment information. The system identifies unknown media programs in real time so that time-sensitive services can be provided, such as interactive television applications that offer contextually relevant information or television advertisement substitution. Other uses include tracking media consumption, among many other services.

1. A computer-implemented method for identifying one or more unknown media segments, the method comprising:

receiving audio cues related to an unknown media segment, wherein the audio cues comprise autocorrelation representations of audio frames identified in the unknown media segment;

identifying a plurality of reference audio cues in a reference audio cue database, wherein the plurality of reference audio cues are determined to match the received audio cue, and wherein a reference audio cue of the plurality of reference audio cues comprises an autocorrelation representation of an audio frame identified in a known media segment;

adding a first token to a first bin related to a first known media segment, wherein the first token is added to the first bin based on a determined match between the received audio cue related to the unknown media segment and a first reference audio cue related to the first known media segment;

adding a second token to a second bin related to a second known media segment, wherein the second token is added to the second bin based on a determined match between the received audio cue related to the unknown media segment and a second reference audio cue related to the second known media segment;

determining that a number of tokens in the first bin exceeds a value; and

identifying the unknown media segment as matching the first known media segment when it is determined that the number of tokens in the first bin exceeds the value.
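The token-and-bin steps of claim 1 can be illustrated with a short sketch. It assumes that cue matching has already produced a stream of known-segment identifiers, one per matching reference cue; the function name `identify_segment`, the segment labels, and the threshold value are hypothetical and do not come from the claims.

```python
from collections import Counter


def identify_segment(matched_segment_ids, threshold):
    """Return the first known segment whose bin exceeds `threshold`, else None.

    `matched_segment_ids` is assumed to hold one known-segment ID for each
    reference cue that matched a received cue (hypothetical input format).
    """
    bins = Counter()  # one bin per known media segment
    for segment_id in matched_segment_ids:
        bins[segment_id] += 1  # add a token to that segment's bin
        if bins[segment_id] > threshold:
            return segment_id
    return None


# Illustrative match stream: two candidate segments compete for tokens.
matches = ["show_a", "ad_b", "show_a", "show_a", "ad_b", "show_a"]
winner = identify_segment(matches, threshold=3)
print(winner)
```

In this sketch a segment wins as soon as its bin holds more tokens than the threshold, so a candidate that matches only occasionally never triggers an identification.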

2. The computer-implemented method of claim 1, wherein the autocorrelation representation of the audio frame comprises one or more coefficients.

3. The computer-implemented method of claim 2, the method further comprising generating the one or more coefficients, wherein generating the one or more coefficients comprises: applying an autocorrelation function to the audio frame.
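As a minimal sketch of what applying an autocorrelation function to an audio frame might look like, the following computes unnormalized autocorrelation coefficients for a few lags. The claims do not specify the lag count, windowing, or normalization, so `max_lag` and the plain sum-of-products form are assumptions made for illustration.

```python
def autocorrelate(frame, max_lag):
    """Return unnormalized autocorrelation coefficients r[0..max_lag].

    r[lag] is the sum over i of frame[i] * frame[i + lag].
    """
    n = len(frame)
    return [
        sum(frame[i] * frame[i + lag] for i in range(n - lag))
        for lag in range(max_lag + 1)
    ]


# A tiny illustrative frame; real frames would hold many audio samples.
frame = [1.0, 2.0, 3.0, 4.0]
coeffs = autocorrelate(frame, max_lag=2)
print(coeffs)
```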

4. The computer-implemented method of claim 3, the method further comprising applying one or more transform functions to the one or more coefficients to generate one or more transformed coefficients.

5. The computer-implemented method of claim 4, wherein the one or more transform functions comprise at least a linear predictive coding function.
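Claim 5 names a linear predictive coding function but not a particular algorithm. One common way to derive LPC coefficients from autocorrelation coefficients is the Levinson-Durbin recursion, sketched below as an assumption rather than as the patent's stated method. The function name and the sign convention (error filter A(z) = 1 + a1 z^-1 + ...) are choices made for this example.

```python
def levinson_durbin(r, order):
    """Solve for LPC coefficients from autocorrelation values r[0..order].

    Returns (a, error), where a[0] == 1.0 and a[1..order] are the
    coefficients of the prediction-error filter, and `error` is the
    final prediction error energy.
    """
    a = [1.0] + [0.0] * order
    error = r[0]
    for i in range(1, order + 1):
        if error == 0.0:
            break  # degenerate (all-zero) input; stop early
        acc = r[i] + sum(a[j] * r[i - j] for j in range(1, i))
        k = -acc / error  # reflection coefficient for this stage
        updated = a[:]
        for j in range(1, i):
            updated[j] = a[j] + k * a[i - j]
        updated[i] = k
        a = updated
        error *= 1.0 - k * k
    return a, error


# Autocorrelation of an idealized first-order process with pole at 0.5.
lpc, err = levinson_durbin([1.0, 0.5, 0.25], order=2)
print(lpc, err)
```

For this input the recursion recovers a single significant coefficient (a1 close to -0.5) and a second coefficient close to zero, which is what a first-order signal should produce.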

6. The computer-implemented method of claim 4, further comprising applying one or more normalization functions to the one or more transformed coefficients.

7. The computer-implemented method of claim 6, wherein applying the one or more normalization functions to the one or more transformed coefficients comprises: quantizing the one or more transformed coefficients.
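Quantization as a normalization step can be sketched as mapping each coefficient to the nearest multiple of a step size, so that slightly different measurements of the same audio yield identical values. The uniform step and the name `quantize` are assumptions; the claims do not state the quantization scheme.

```python
def quantize(coeffs, step):
    """Map each coefficient to an integer index of its nearest step multiple."""
    return [round(c / step) for c in coeffs]


# Two noisy measurements of (hypothetically) the same frame.
first = quantize([0.12, -0.47, 0.91], step=0.25)
second = quantize([0.10, -0.45, 0.93], step=0.25)
print(first, second)
```

A coarser step makes cues more tolerant of noise but less able to tell different segments apart, so the step size is a tuning choice.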

8. The computer-implemented method of claim 4, the method further comprising applying one or more additional transform functions to the one or more transformed coefficients to generate one or more further transformed coefficients.

9. The computer-implemented method of claim 8, wherein the one or more additional transform functions comprise at least one of a Line Spectral Pair (LSP) transform function or an Immittance Spectral Frequency (ISF) transform function.

10. The computer-implemented method of claim 8, the method further comprising applying one or more normalization functions to the one or more further transformed coefficients.

11. The computer-implemented method of claim 10, wherein applying the one or more normalization functions to the one or more further transformed coefficients comprises: quantizing the one or more further transformed coefficients.

12. The computer-implemented method of claim 1, wherein an audio frame comprises a period of time in an audio portion of a media segment, and wherein the audio portion has fixed audio signal characteristics during the period of time.

13. The computer-implemented method of claim 1, wherein the audio frames have a fixed size.

14. The computer-implemented method of claim 1, wherein reference audio cues are determined to match the received audio cues when the autocorrelation representations of audio frames identified in the known media segments are within a range of the autocorrelation representations of audio frames identified in the unknown media segments.
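The range-based match of claim 14 might be sketched as a per-coefficient tolerance check. The claims do not say how the range is measured, so the absolute-difference test, the `tolerance` parameter, and the name `cues_match` are illustrative assumptions.

```python
def cues_match(reference, received, tolerance):
    """Return True when every coefficient pair differs by at most `tolerance`."""
    if len(reference) != len(received):
        return False
    return all(abs(a - b) <= tolerance for a, b in zip(reference, received))


# Quantized coefficient vectors (illustrative values).
close = cues_match([10, 4, -3], [11, 4, -2], tolerance=1)
far = cues_match([10, 4, -3], [13, 4, -3], tolerance=1)
print(close, far)
```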

15. The computer-implemented method of claim 1, the method further comprising:

determining content related to the known media segments;

retrieving the content from a database;

sending the retrieved content, wherein the retrieved content is addressed to a media system, and wherein the received audio cue is received from the media system.

16. The computer-implemented method of claim 1, the method further comprising:

removing one or more tokens from the first bin when a period of time has elapsed.
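Removing tokens after a period of time, as in claim 16, can be sketched by storing a timestamp with each token and discarding the oldest ones on each read. The class name `TimedBin`, the `max_age` parameter, and the use of plain numeric timestamps are assumptions for this example.

```python
from collections import deque


class TimedBin:
    """A bin whose tokens expire once they are older than `max_age`."""

    def __init__(self, max_age):
        self.max_age = max_age
        self.tokens = deque()  # token timestamps, oldest first

    def add(self, now):
        """Add one token stamped with the current time."""
        self.tokens.append(now)

    def count(self, now):
        """Drop expired tokens, then return how many remain."""
        while self.tokens and now - self.tokens[0] > self.max_age:
            self.tokens.popleft()
        return len(self.tokens)


bin_a = TimedBin(max_age=5.0)
for t in (0.0, 2.0, 6.0):
    bin_a.add(t)
count_at_6 = bin_a.count(6.0)    # token from t=0.0 has aged out
count_at_12 = bin_a.count(12.0)  # all remaining tokens have aged out
print(count_at_6, count_at_12)
```

Expiring old tokens keeps a bin from crossing the threshold on the strength of matches that happened long before the current segment began.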

17. A computing device for identifying one or more unknown media segments, the computing device comprising:

a storage device;

one or more processors; and

a non-transitory machine-readable storage medium comprising instructions that, when executed on the one or more processors, cause the one or more processors to perform operations comprising:

receiving audio cues related to an unknown media segment, wherein the audio cues comprise autocorrelation representations of audio frames identified in the unknown media segment;

identifying a plurality of reference audio cues in a reference audio cue database, wherein the plurality of reference audio cues are determined to match the received audio cue, and wherein a reference audio cue of the plurality of reference audio cues comprises an autocorrelation representation of an audio frame identified in a known media segment;

adding a first token to a first bin related to a first known media segment, wherein the first token is added to the first bin based on a determined match between the received audio cue related to the unknown media segment and a first reference audio cue related to the first known media segment;

adding a second token to a second bin related to a second known media segment, wherein the second token is added to the second bin based on a determined match between the received audio cue related to the unknown media segment and a second reference audio cue related to the second known media segment;

determining that a number of tokens in the first bin exceeds a value; and

identifying the unknown media segment as matching the first known media segment when it is determined that the number of tokens in the first bin exceeds the value.

18. The computing device of claim 17, wherein the autocorrelation representation of the audio frame comprises one or more coefficients, the computing device further comprising instructions that, when executed on the one or more processors, cause the one or more processors to perform operations comprising generating the one or more coefficients by applying an autocorrelation function to the audio frame.

19. The computing device of claim 18, further comprising instructions that when executed on the one or more processors cause the one or more processors to perform operations comprising applying one or more transform functions to the one or more coefficients to generate one or more transformed coefficients.

20. A computer program product tangibly embodied in a non-transitory machine-readable storage medium of a computing device, the non-transitory machine-readable storage medium comprising instructions configured to cause one or more processors to:

receive audio cues related to an unknown media segment, wherein the audio cues comprise autocorrelation representations of audio frames identified in the unknown media segment;

identify a plurality of reference audio cues in a reference audio cue database, wherein the plurality of reference audio cues are determined to match the received audio cue, and wherein a reference audio cue of the plurality of reference audio cues comprises an autocorrelation representation of an audio frame identified in a known media segment;

add a first token to a first bin related to a first known media segment, wherein the first token is added to the first bin based on a determined match between the received audio cue related to the unknown media segment and a first reference audio cue related to the first known media segment;

add a second token to a second bin related to a second known media segment, wherein the second token is added to the second bin based on a determined match between the received audio cue related to the unknown media segment and a second reference audio cue related to the second known media segment;

determine that a number of tokens in the first bin exceeds a value; and

identify the unknown media segment as matching the first known media segment when it is determined that the number of tokens in the first bin exceeds the value.
