Media channel identification with video multi-match detection and disambiguation based on audio fingerprint

Document No.: 1755585. Publication date: 2019-11-29.

Reading note: This technology, "Media channel identification with video multi-match detection and disambiguation based on audio fingerprint", was designed by 徐忠源, 权宁抚 and 李载炯 on 2017-02-28. Its main content is as follows. Disclosed are methods and systems that help to disambiguate channel identification in a scenario where a video fingerprint of media content matches multiple reference video fingerprints corresponding respectively with multiple different channels. For such a multi-match scenario, an entity can disambiguate based on an audio component of the media content, for example by further determining that an audio fingerprint of the media content at issue matches the audio fingerprint of just one of the multiple channels, thereby establishing that this is the channel on which the media content being presented by the media presentation device is arriving.

1. A method of taking action based on a channel determined through disambiguation based on an audio fingerprint, the method comprising:

determining, by a computing system, that a digital video fingerprint of media content being presented by a media presentation device matches multiple reference video fingerprints each corresponding to a different respective channel;

responsive to at least determining that the digital video fingerprint matches the multiple reference video fingerprints each corresponding to a different respective channel, performing disambiguation by the computing system based at least in part on a determination that a digital audio fingerprint of the media content being presented by the media presentation device matches a reference audio fingerprint corresponding to only a single one of the channels to which the multiple reference video fingerprints correspond, the disambiguation establishing that the media presentation device is receiving the media content on the single channel; and

taking action, by the computing system, based on the determination that the media presentation device is receiving the media content on the single channel.

2. The method according to claim 1, wherein the media content being presented by the media presentation device has a video track and an audio track, wherein the digital video fingerprint is a fingerprint of the video track, and wherein the digital audio fingerprint is a fingerprint of the audio track.

3. The method according to claim 1, wherein the digital audio fingerprint of the media content being presented by the media presentation device represents at least a language track of the media content being presented by the media presentation device.

4. The method according to claim 1, wherein the computing system is an entity other than the media presentation device, and wherein the digital video fingerprint and the digital audio fingerprint are generated by the media presentation device, the method further comprising:

receiving, by the computing system from the media presentation device, the digital video fingerprint and the digital audio fingerprint.

5. The method according to claim 1, wherein the multiple reference video fingerprints are among a number of reference video fingerprints in reference data, and wherein determining that the digital video fingerprint of the media content being presented by the media presentation device matches the multiple reference video fingerprints comprises:

comparing the digital video fingerprint of the media content being presented by the media presentation device with the number of reference video fingerprints in the reference data, and

based on the comparison, determining that the digital video fingerprint of the media content being presented by the media presentation device matches the multiple reference video fingerprints.

6. The method according to claim 1, further comprising detecting and flagging a multi-match group consisting of the multiple reference video fingerprints,

wherein determining that the digital video fingerprint of the media content being presented by the media presentation device matches the multiple reference video fingerprints comprises determining that the digital video fingerprint of the media content being presented by the media presentation device matches the reference video fingerprints of the flagged multi-match group.

7. The method according to claim 6, further comprising:

comparing the reference audio fingerprints that correspond to the reference video fingerprints of the flagged multi-match group;

based on the comparison, detecting a difference between the compared reference audio fingerprints; and

responsive to detecting the difference, further flagging the multi-match group to indicate that audio fingerprint analysis can facilitate disambiguation,

wherein performing the disambiguation based on the determination that the digital audio fingerprint of the media content being presented by the media presentation device matches the reference audio fingerprint corresponding to only the single channel is further responsive to determining that the multi-match group is further flagged to indicate that audio fingerprint analysis can facilitate disambiguation.

8. The method according to claim 1, wherein the computing system is an entity other than the media presentation device, the method further comprising:

responsive to at least determining that the digital video fingerprint matches the multiple reference video fingerprints each corresponding to a different respective channel, requesting and receiving, by the computing system from the media presentation device, the digital audio fingerprint of the media content being presented by the media presentation device, to facilitate performing the disambiguation based on the obtained digital audio fingerprint.

9. The method according to claim 1, further comprising:

responsive to at least determining that the digital video fingerprint matches the multiple reference video fingerprints each corresponding to a different respective channel, generating reference audio fingerprints corresponding to the channels to which the multiple reference video fingerprints correspond, to facilitate the disambiguation.

10. The method according to claim 1, wherein taking action based on the determination that the media presentation device is receiving the media content on the single channel comprises: causing supplemental channel-specific content to be presented together with the media content being presented by the media presentation device.

Background

A typical media presentation device operates to receive an analog or digital media stream representing media content having video and audio components, and to render and output the media content on a user interface that includes a display screen and an audio speaker. Examples of such devices include, without limitation, televisions, multimedia systems, and the like (e.g., with separate or integrated video display and audio presentation components).

In many cases, such a media presentation device may be in communication with a receiver, such as a local set-top box or other similar device or a remote server, that has access to numerous discrete channels of media content and that can selectively deliver the media content of a given such channel to the media presentation device for playout.

For example, a television may be communicatively linked with a cable-TV set-top box that has access to a set of cable-TV channels. The set-top box may be configured to receive user input selecting a particular channel, to responsively tune to the selected channel, and to output the video and audio components of the selected channel to the television, and the television may be configured to render those video and audio components for presentation to a user. As another example, a multimedia presentation system with separate or integrated display and speaker components may be communicatively linked with a computer, set-top box, or other receiver that has access to numerous television or online streaming media channels. The receiver may be configured to receive user input selecting a particular channel, to responsively begin receiving the selected channel, and to provide the video component of the channel to the display for presentation to a user while providing the audio component of the media content to the speaker for presentation to the user.

Summary

When a media presentation device receives and renders media content, the media presentation device may not have an indication of which channel carried the media content. A receiver or other device that selectively tunes to the channel, receives the media content, and provides the media content to the media presentation device may have such information, but the media presentation device that receives the media content from that other device may not. For example, if a television is coupled with a cable-TV set-top box and a user selects a particular cable channel on the set-top box, the set-top box may thereby have an indication of the selected channel, since that is the channel on which the set-top box is receiving the media content that it outputs to the television. The television itself, however, may merely receive and render the media content and may have no indication of the selected channel.

For various reasons, however, it may be useful to determine which of various channels is the channel that carries the content being rendered by a media presentation device. Further, it may be useful to do so without receiving from a channel-selection device (e.g., a receiver or remote control) a report of the channel to which that device is tuned, and perhaps without any involvement of the channel-selection device. For instance, it may be useful for the media presentation device itself, and/or a network server working in cooperation with the media presentation device, to determine the channel based on an evaluation of the media content that is being rendered (e.g., already rendered, currently being rendered, or in queue to be rendered) by the media presentation device. Given knowledge of the channel on which the media content is arriving, the media presentation device, the network server, and/or another entity could then carry out one or more operations keyed to the channel, such as determining and recording an extent to which media content of that channel is being played, selectively replacing a predetermined portion of the media content with alternative content (such as a replacement advertisement), or superimposing channel-specific content over the media content for presentation along with the media content.

One method of determining the channel on which media content is arriving is to have the media presentation device (or perhaps an adjunct device) and/or a network server generate a digital fingerprint of the media content being rendered by the media presentation device, and then to compare that fingerprint with reference fingerprint data established for media content known to be provided on particular channels.

For example, a network server or other such entity may establish or otherwise have access to reference data that includes a reference video fingerprint of each channel on which media content could be provided to the media presentation device (e.g., each channel within a subscription plan for a set-top box that supplies media content to the media presentation device), and that maps each reference video fingerprint to the channel on which the media content is provided. As the media presentation device receives and renders given media content, the media presentation device may then generate a video fingerprint of that media content and, through a network communication interface, report the generated video fingerprint to the network server for analysis. The network server may then compare the reported video fingerprint with the reference video fingerprint data to find a matching reference video fingerprint, and may thereby determine that the channel on which the media content is arriving is the channel to which the reference data maps that reference video fingerprint. Having thus determined the channel on which the media content is arriving, the network server may then communicate an indication of the channel to the media presentation device, and the media presentation device may take channel-specific action. Alternatively, the network server itself or another entity may take channel-specific action based on the determined channel.
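By way of illustration only, the comparison of a reported video fingerprint with reference video fingerprint data could be sketched as follows. The disclosure does not prescribe a fingerprint format or a matching method, so the integer-encoded fingerprints, the Hamming-distance comparison, the `max_distance` tolerance, and the channel names below are all assumptions made for this sketch.

```python
def hamming(a: int, b: int) -> int:
    """Number of differing bits between two integer-encoded fingerprints."""
    return bin(a ^ b).count("1")


def match_channels(query: int, reference: dict[str, int], max_distance: int = 2) -> list[str]:
    """Return every channel whose reference video fingerprint is within
    max_distance bits of the reported fingerprint (hypothetical tolerance)."""
    return sorted(ch for ch, fp in reference.items() if hamming(query, fp) <= max_distance)


# Hypothetical reference data: channel name -> reference video fingerprint.
reference_video = {"channel_7": 0b10110110, "channel_12": 0b01001001}

# A reported fingerprint that differs from channel_7's reference by one bit.
matches = match_channels(0b10110111, reference_video)
```

When `matches` holds exactly one channel, that channel can be treated as the channel on which the media content is arriving; the function returns a list precisely because more than one channel may match.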

Unfortunately, a problem that can arise in this process is that the same media content may be provided on multiple different channels, whether concurrently or at different times, so that a video fingerprint of media content being rendered by a media presentation device may not correlate with just the channel on which the media content is arriving. For example, a sports game or political event might be broadcast concurrently on multiple different channels, or a syndicated television or radio show might be broadcast on multiple different such channels concurrently or at different times. In these or other scenarios, if the media presentation device is presenting such media content and generates and provides a video fingerprint of the media content, that video fingerprint could be mapped to multiple reference video fingerprints, and consequently the channel identification would be inconclusive.

As a specific example of this, consider a scenario where two different content providers each broadcast the same sports game on a separate respective channel, and where an advertiser has contracted with just one of the content providers to present a pop-up advertisement on that provider's broadcast of the game. In this scenario, when a media presentation device is receiving and presenting one of these broadcasts, if the media presentation device generates and provides to the network server a video fingerprint of the broadcast, the network server may determine that the video fingerprint matches both content providers' broadcasts of the game, and so it would be unclear whether the media presentation device should present the pop-up advertisement.

Disclosed herein are methods and systems to help disambiguate channel identification in a scenario where video fingerprint data of media content matches multiple reference video fingerprints corresponding respectively with multiple different channels. In accordance with the disclosure, when a network server or other entity detects such a multi-match scenario, the entity will then perform disambiguation based on an audio component of the media content. In particular, the entity may first detect a multi-match scenario by determining that a video fingerprint of the media content at issue matches reference video fingerprints of multiple different channels. Faced with the detected multi-match scenario, the entity will then perform disambiguation based on a further determination that an audio fingerprint of the media content at issue matches an audio fingerprint of just one of the multiple channels, thereby establishing that this is the channel on which the media content being rendered by the media presentation device is arriving, so as to facilitate taking channel-specific action.

In practice, the audio that forms the basis for this disambiguation could be a language track of the media content. For example, multiple channels may have the same video track as each other, but they may have different audio tracks than each other, such as one recorded or dubbed in English and another recorded or dubbed in Spanish or another language or voice. This could occur, for instance, where the same broadcast is provided on different channels but the channels have different languages to facilitate receipt and enjoyment by users who speak those languages. Faced with a multi-match scenario related to the video tracks of such channels, the entity carrying out this process could then determine which channel is the one being presented by the media presentation device based on an audio fingerprint of the media content being presented.

Alternatively or additionally, the audio that forms the basis for the disambiguation could take other forms, including, for example, background music, sound effects, and/or other audio components.

Accordingly, disclosed is a method that involves taking action based on a channel determined through disambiguation based on an audio fingerprint. The method includes a computing system determining that a video fingerprint of media content being presented by a media presentation device matches multiple reference video fingerprints each corresponding with a different respective channel. Further, the method includes, responsive to at least determining that the video fingerprint matches the multiple reference video fingerprints each corresponding with a different respective channel, performing disambiguation based at least in part on a determination that an audio fingerprint of the media content being presented by the media presentation device matches a reference audio fingerprint corresponding with just a single channel, the disambiguation establishing that the media presentation device is receiving the media content on the single channel. And the method includes taking action based on the determination that the media presentation device is receiving the media content on the single channel.
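The method just summarized can be sketched in code. The following is a minimal illustration under stated assumptions, not the disclosed implementation: fingerprints are modeled as integers, a match is modeled as a Hamming distance within a hypothetical `max_distance`, and the channel names and fingerprint values are invented for the example.

```python
from typing import Optional


def hamming(a: int, b: int) -> int:
    """Number of differing bits between two integer-encoded fingerprints."""
    return bin(a ^ b).count("1")


def identify_channel(
    video_fp: int,
    audio_fp: Optional[int],
    reference: dict[str, tuple[int, int]],  # channel -> (video fp, audio fp)
    max_distance: int = 2,
) -> Optional[str]:
    """Identify the channel, falling back to audio only in a multi-match scenario."""
    video_matches = [
        ch for ch, (ref_video, _) in reference.items()
        if hamming(video_fp, ref_video) <= max_distance
    ]
    if len(video_matches) <= 1:
        # Zero matches or one unambiguous match: no disambiguation needed.
        return video_matches[0] if video_matches else None
    if audio_fp is None:
        return None  # multi-match, but no audio fingerprint to disambiguate with
    audio_matches = [
        ch for ch in video_matches
        if hamming(audio_fp, reference[ch][1]) <= max_distance
    ]
    # Disambiguation succeeds only if exactly one channel's audio matches.
    return audio_matches[0] if len(audio_matches) == 1 else None


# Hypothetical reference data: two channels share video but differ in audio.
reference = {
    "sports_en": (0xA5A5, 0x0F0F),
    "sports_es": (0xA5A5, 0xF0F0),
    "news": (0x1234, 0x5678),
}

resolved = identify_channel(0xA5A5, 0xF0F1, reference)
unresolved = identify_channel(0xA5A5, None, reference)
```

In this sketch the audio comparison is restricted to the channels that already matched on video, which mirrors the requirement that the reference audio fingerprint correspond with just a single one of the multi-match channels.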

Further, disclosed is a system that includes a network communication interface, a processing unit, non-transitory data storage, and program instructions stored in (e.g., on) the non-transitory data storage and executable by the processing unit to carry out various operations. The operations include receiving from a media presentation device, via the network communication interface, a video fingerprint of media content being presented by the media presentation device. Further, the operations include determining that the received video fingerprint matches reference video fingerprints corresponding with multiple channels. In addition, the operations include receiving from the media presentation device, via the network communication interface, an audio fingerprint of the media content being presented by the media presentation device. And the operations include using the received audio fingerprint to determine which of the multiple channels carries the media content being presented by the media presentation device. The operations then include taking action based on the determined channel.

Also disclosed is a non-transitory computer-readable medium having stored thereon instructions executable by a processing unit to carry out various operations such as those described herein.

These as well as other aspects, advantages, and alternatives will become apparent to those of ordinary skill in the art by reading the following detailed description, with reference where appropriate to the accompanying drawings. Further, it should be understood that the descriptions provided in this summary and below are intended to illustrate the invention by way of example only and not by way of limitation.

Brief description of the drawings

Fig. 1 is a simplified block diagram of an example system in which the disclosed principles can be applied.

Fig. 2 is a simplified block diagram of an example network arrangement in which a media presentation device communicates with a network server to facilitate implementing the disclosed principles.

Fig. 3 is a flow chart depicting operations that can be carried out in accordance with the disclosure.

Fig. 4 is a simplified block diagram of an example network server.

Fig. 5 is a simplified block diagram of an example media presentation device.

Detailed description

Referring to the drawings, Fig. 1 is a simplified block diagram of an example system in which the disclosed principles can be applied. It should be understood, however, that this and other arrangements and processes described herein can take various other forms. For instance, elements and operations can be re-ordered, distributed, replicated, combined, omitted, added, or otherwise modified. Further, it should be understood that functions described herein as being carried out by one or more entities could be implemented by and/or on behalf of those entities through hardware, firmware, and/or software, such as by one or more processing units executing program instructions.

As shown in Fig. 1, the example system includes one or more media content sources 12 (e.g., broadcasters, web servers, etc.), one or more media content distributors 14 (e.g., multi-channel distributors, such as cable providers, satellite providers, over-the-air broadcast providers, web aggregators, etc.), one or more media content receivers 16 (e.g., cable receivers, satellite receivers, over-the-air broadcast receivers, computers or other streaming media receivers, etc.), and one or more clients or media presentation devices 18 (e.g., televisions or other display devices, loudspeakers or other audio output devices, etc.).

In practice, for instance, the media content sources 12 could be national broadcasters, such as ABC, NBC, CBS, FOX, HBO, and CNN, the media content distributors 14 could be local affiliates and/or other local content distributors in particular designated market areas (DMAs), and the receivers 16 and media presentation devices 18 could be situated at customer premises, such as homes or business establishments. With this or other arrangements, the content sources 12 could deliver media content to the content distributors 14 for distribution to receivers 16 at customer premises, and the content distributors could distribute the media content to the receivers 16 on discrete channels (e.g., particular frequencies). Each receiver could then respond to user input or one or more other triggers by tuning to a selected channel and outputting to a media presentation device 18 the media content that is arriving on the selected channel. And the media presentation device 18 could receive and render the media content (e.g., display or otherwise present the content).

In this arrangement, as the media presentation device receives and renders this media content, the media presentation device may have no indication of the channel on which the media content is arriving, i.e., of the channel to which the receiver is tuned. Rather, the media presentation device may be configured simply to receive the media content as a media stream from the receiver and to render the received media content. In accordance with the disclosure, however, the media presentation device may be in communication with a network server and may work with the network server to facilitate identification of the channel and thus to facilitate taking useful channel-specific action.

The media presentation device 18 is configured to receive a channel of media content from the receiver 16 and to render the media content for presentation to a user. In this arrangement, the media presentation device could be of the type described above, such as a television or other system including integrated or separate video and audio presentation components (e.g., a video display module with associated software/hardware and an audio output module with associated software/hardware). And the receiver could be of the type described above, such as a cable-TV set-top box, a computer, or the like, configured to selectively tune to and output any of various channels of content. In practice, the media presentation device could have one or more connections (e.g., wired or wireless connections) with the receiver, to facilitate receiving from the receiver the video and audio components (e.g., video and audio tracks) of the channel to which the receiver is tuned. The video and audio presentation components of the media presentation device could then be used to render the video and audio received from the receiver for presentation to the user.

Fig. 2 next illustrates an example network arrangement in which such a media presentation device 18 is in communication with a network server 20 via a network 22, such as the Internet. In practice, the media presentation device 18 may sit as a node on a local area network (LAN) at customer premises, with the media presentation device having an assigned Internet Protocol (IP) address on the LAN and the LAN having an IP address on the Internet. Further, the network server 20 may also be accessible at an IP address on the Internet. With this arrangement, the media presentation device may initiate and engage in IP communication with the network server via the Internet, to report a fingerprint of media content being rendered by the media presentation device, so as to facilitate channel identification and associated action.

As discussed above, the network server 20 or another entity operating in accordance with the disclosure could establish or have access to reference data 24 for media content that is carried or scheduled to be carried on at least each of various channels accessible to the media presentation device 18. The reference data, which could be stored in a relational database or other form of database, could include one or more reference fingerprints for each channel, perhaps a reference fingerprint stream of media content that was most recently carried by the channel (e.g., on a sliding-window basis covering a most recent period of time). Alternatively or additionally, the reference data could include one or more respective reference fingerprints (e.g., reference fingerprint streams) of each media content program (e.g., television broadcast, streaming media file, etc.) that is available and/or scheduled to be carried on a particular channel. Further, the reference data could map each reference fingerprint to the channel on which the associated media content (i.e., the content uniquely identified by the reference video and audio fingerprints) is or may be carried.

Optimally, for instance, the reference data for each channel could include a reference video fingerprint of the video component of the channel and a reference audio fingerprint of the audio component of the channel. For example, for each channel, the reference data could store, in association with a data record, the respective reference video fingerprint (e.g., reference video fingerprint stream) and the respective reference audio fingerprint (e.g., reference audio fingerprint stream). And the reference data could also store, in association with the data record, a mapping to the relevant channel.

In mapping reference fingerprints to channels, the reference data could characterize the channels by various attributes, to help distinguish the channels from each other. For example, where a receiver or other such device provides multiple different channels selectable by channel number, the reference data could characterize the channels by their respective channel numbers. As another example, where each channel carries content of a particular content source (e.g., the content of one of a particular broadcaster), the reference data could characterize the channels by the identities of their respective content sources. Further, where more than one content distributor (e.g., multi-channel distributor) distributes the content of a content source, the reference data could characterize the channels by the identities of their respective content distributors. In practice, the reference data could correlate each reference fingerprint with one or more of these or other attributes.
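As an illustrative sketch only, a per-channel data record of the kind described above could be modeled as follows. The class names, field names, and example values are hypothetical, and the disclosure leaves the actual storage form (a relational database or otherwise) open.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass(frozen=True)
class ChannelAttributes:
    """Attributes that help distinguish channels from each other (all optional)."""
    channel_number: Optional[int] = None
    content_source: Optional[str] = None
    content_distributor: Optional[str] = None


@dataclass
class ReferenceRecord:
    """One data record: the channel plus its reference fingerprint streams."""
    channel: ChannelAttributes
    video_fingerprints: list[int] = field(default_factory=list)
    audio_fingerprints: list[int] = field(default_factory=list)


# Hypothetical record for a channel carrying a national broadcaster's content.
record = ReferenceRecord(
    channel=ChannelAttributes(channel_number=7, content_source="ABC",
                              content_distributor="example cable provider"),
    video_fingerprints=[0xA5A5, 0xA5A7],
    audio_fingerprints=[0x0F0F, 0x0F0E],
)
```

Keeping the video and audio fingerprint streams in the same record makes it straightforward to look up the reference audio fingerprint for any channel that matched on video.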

The network server 20 or another entity operating in accordance with the disclosure could establish some or all of this reference data by analyzing media content arriving on each of the various channels (e.g., at least the media content arriving on each of various channels that are available to a receiver that serves the media presentation device). To facilitate this, as shown in Fig. 2, the server could include or be interconnected with one or more receivers 16 that are configured to receive media content from one or more of the media content distributors 14 on various channels, in much the same way that receivers are configured to receive content at customer premises. For instance, the server could include or be interconnected with one or more cable-TV set-top boxes, computers, or other media receivers, or could be configured to emulate one or more such receivers. The server could then be configured to receive and analyze the respective media content arriving on each channel and to generate for each channel a reference video fingerprint of the video component of the channel and a reference audio fingerprint of the audio component of the channel, using any media fingerprinting process now known or later developed (e.g., computing a hash on a per-frame or other basis, or otherwise identifying, extracting, and digitally representing component features unique to the media content).
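The disclosure leaves the fingerprinting process open ("any media fingerprinting process now known or later developed"). Purely as one possible illustration of computing a hash on a per-frame basis, the sketch below uses a simple average hash, which is a technique chosen for this example and is not stated in the disclosure. A frame is modeled, by assumption, as a small grid of grayscale pixel values.

```python
def average_hash(frame: list[list[int]]) -> int:
    """Encode a grayscale frame as one bit per pixel: 1 if the pixel is
    brighter than the frame's mean brightness, else 0 (row-major order)."""
    pixels = [p for row in frame for p in row]
    mean = sum(pixels) / len(pixels)
    bits = 0
    for p in pixels:
        bits = (bits << 1) | (1 if p > mean else 0)
    return bits


def fingerprint_stream(frames: list[list[list[int]]]) -> list[int]:
    """Per-frame fingerprints for a sequence of frames."""
    return [average_hash(f) for f in frames]


# A hypothetical 2x2 frame: mean brightness is 115, so the bits are 0, 1, 1, 0.
example = average_hash([[10, 200], [220, 30]])
```

A practical system would downscale real frames before hashing and would use a separate process for the audio component; both are outside the scope of this sketch.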

In practice, the server could be configured to receive concurrently on multiple such channels (perhaps all of the channels) and to analyze and generate respective fingerprints for the channels in parallel, or the server could be configured to hop from channel to channel, possibly cycling repeatedly through the channels, to analyze and generate respective video and audio fingerprints for each channel. Further, the server could continue to do this in real time, saving for each channel respective video and audio fingerprints of at least a most recent time window of media content, for reference. And the server could record each channel's reference fingerprints in the reference data in association with a characterization of the channel (e.g., with attributes such as those noted above) and timestamp information indicating a time of receipt of the associated media content. Here, the server would have knowledge of each channel (e.g., channel number), just as a receiver would normally have knowledge of the channel to which the receiver is tuned. Further, the server may have access to guide information or other such data specifying attributes of each such channel (e.g., content source identity, content distributor identity, etc.), so that the server can determine and record channel attributes respectively for each reference fingerprint or channel record.
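Saving fingerprints for "at least a most recent time window" with timestamp information could be sketched as below. This is a minimal illustration; the class name, the use of seconds as the time unit, and the eviction policy are assumptions for the example rather than details from the disclosure.

```python
from collections import deque


class FingerprintWindow:
    """Keeps one channel's timestamped reference fingerprints for a sliding
    window covering the most recent window_seconds of media content."""

    def __init__(self, window_seconds: float) -> None:
        self.window_seconds = window_seconds
        self._entries: deque[tuple[float, int]] = deque()  # oldest first

    def add(self, timestamp: float, fingerprint: int) -> None:
        """Record a fingerprint and evict entries older than the window."""
        self._entries.append((timestamp, fingerprint))
        cutoff = timestamp - self.window_seconds
        while self._entries and self._entries[0][0] < cutoff:
            self._entries.popleft()

    def fingerprints(self) -> list[int]:
        """Fingerprints currently inside the window, oldest first."""
        return [fp for _, fp in self._entries]


window = FingerprintWindow(window_seconds=10)
window.add(0.0, 0x01)
window.add(5.0, 0x02)
window.add(12.0, 0x03)  # cutoff becomes 2.0, so the entry at t=0.0 is evicted
recent = window.fingerprints()
```

One such window per channel, for video and for audio, would give the server a bounded store of recent reference fingerprints while it continues to generate new ones in real time.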

Alternatively or additionally, server can receive or may can be used or plan to provide on a particular channel Media content program establishes such video and audioref fingerprint.For example, supplier or the distribution of various media content programs Person can be equally that media content program generates reference using any media fingerprints recognition methods that is currently known or developing later Video and audio-frequency fingerprint, and those reference fingerprints can be provided to server.As an alternative, server can receive media in advance The copy of content program, and oneself can generate such reference fingerprint.In addition, server can connect from performance guide information It receives or determines that media content program is available or plan provides the channel of media content program, and can be plan and provide in media The date and time of appearance.Then, the reference fingerprint of each media content program can be recorded in reference data by server, ginseng Examine that data and media content program are available or the plan carrying channel of media content program is associated, equally with relevant frequency Road attribute, and it is associated to provide the date and time of media content program with plan.

Further, the server might normally establish only reference video fingerprints, and not reference audio fingerprints, for each channel or media content program, to facilitate channel identification. The server could then begin establishing reference audio fingerprints for each of one or more particular channels in response to determining that a video multi-match situation exists as to those channels, to help resolve the multi-match situation.

Given this or other such reference data, when the server is presented with a fingerprint of media content that was received on an unknown channel, the server could match the fingerprint with one of the stored reference fingerprints, using any fingerprint matching process now known or later developed, and could thereby conclude that the media content at issue arrived on the channel that the reference data maps to the matching reference fingerprint. Thus, if the server is faced with a fingerprint of the media content being presented by the media presentation device 18, the server could compare that fingerprint with the reference fingerprints in the reference data. If the server thereby finds a matching reference fingerprint, the server could identify the channel that the reference data maps to the matching reference fingerprint and could conclude that this is the channel on which the media presentation device is receiving the media content (i.e., the channel carrying the media content that the media presentation device is presenting). The server could then responsively take a channel-specific action based on the identified channel, or cause one or more other entities to take a channel-specific action based on the identified channel.

In practice, a video fingerprint of the channel being presented by the media presentation device, when compared with reference video fingerprints of known channels, will usually suffice as a basis for identifying the channel being presented. Therefore, in normal practice, the media presentation device 18 or another entity could be configured to generate a video fingerprint of the channel being presented by the media presentation device and to send the video fingerprint to the server 20 for analysis.

As explained in this disclosure, however, there may be situations in which the video fingerprint of the channel being presented by the media presentation device matches multiple reference video fingerprints associated with multiple channels. In such a situation, an audio fingerprint of the channel being presented can serve as a basis for disambiguation. Therefore, either in normal practice or in response to the occurrence of such a video multi-match situation, the media presentation device or another entity could be configured to generate an audio fingerprint of the channel being presented by the media presentation device and to send the audio fingerprint to the server 20 for analysis.

To this end, as shown in Fig. 2, the media presentation device 18 could include a video fingerprint generator 26 and an audio fingerprint generator 28, which could be provided as, for example, hardware and/or software (programmed processor) components. The video fingerprint generator 26 could be configured to generate a digital video fingerprint of the media content being presented by the media presentation device, and the audio fingerprint generator 28 could be configured to generate a digital audio fingerprint of that media content. Further, such a fingerprint generator could be configured to generate the fingerprint of the media content as the media presentation device receives the media content from the receiver 16 and/or as the media presentation device processes the media content for presentation. As such, a fingerprint generator could receive as input a copy of the media content arriving at the media presentation device from the receiver and/or being processed for presentation by the media presentation device, and could apply any media fingerprinting process now known or later developed to generate a fingerprint of the media content.

The video fingerprint generator 26 could be configured to generate the video fingerprint as a fingerprint stream on an ongoing basis, such as on a per-frame basis (e.g., a per-key-frame basis) or on another basis. The media presentation device could be configured to send the video fingerprint via the network 22 to the server 20 for analysis. By way of example, the media presentation device could be configured to send to the server, periodically or from time to time, a video fingerprint representing the latest frame, series of frames, or other segment or portion of the media content being presented by the media presentation device. In particular, the media presentation device could generate a message carrying the latest generated video fingerprint, along with one or more timestamps and/or other such data as well as an identifier of the media presentation device, and could send the message to the server's IP address. The server could thereby receive the video fingerprint for analysis.
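A message of the kind just described could be sketched as follows. This is a minimal illustration only: the JSON encoding and the field names `device_id`, `video_fp`, and `captured_at` are assumptions made for the example, since the disclosure does not specify a message format.

```python
import json
import time


def build_fingerprint_message(device_id, video_fp, captured_at=None):
    """Serialize one fingerprint report; all field names are illustrative."""
    return json.dumps({
        "device_id": device_id,               # identifier of the device
        "video_fp": format(video_fp, "x"),    # fingerprint as a hex string
        "captured_at": time.time() if captured_at is None else captured_at,
    })


def parse_fingerprint_message(raw):
    """Server-side counterpart: recover the fields from the message."""
    msg = json.loads(raw)
    return msg["device_id"], int(msg["video_fp"], 16), msg["captured_at"]


raw = build_fingerprint_message("tv-001", 0xBEEF, captured_at=1000.0)
parsed = parse_fingerprint_message(raw)
```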

Alternatively, the media presentation device could send to the server, and the server could thus receive, various data regarding the media content being presented by the media presentation device, likewise on an ongoing or other basis, to enable the server itself or another entity to generate a video fingerprint of that media content. For example, the media presentation device could send to the server portions of the video component being presented, such as individual frames (e.g., snapshots) or other segments of the video component. The server could then apply any video fingerprinting process now known or later developed to generate a video fingerprint of the media content for analysis.

Through the process described above, the server could then compare the video fingerprint of the media content being presented by the media presentation device with the reference video fingerprints in the reference data, using any digital video fingerprint comparison process now known or later developed. As noted above, if the server thereby finds a matching reference video fingerprint, the server could determine the channel that the reference data maps to the matching reference video fingerprint and could conclude that the determined channel is the channel carrying the media content being presented by the media presentation device.

In response to thereby determining the channel at issue, the server could then take, or cause to be taken, one or more channel-specific actions based on the channel determination. In particular, the server itself could take action based on the channel determination, or the server could signal to another entity, perhaps to the media presentation device, to cause that other entity to take action based on the channel determination.

For example, the server could record the fact that the media presentation device is presenting content of that particular channel, as part of a channel rating or analytics system that measures the extent to which particular channels are being presented by media presentation devices. For instance, the media presentation device could regularly (e.g., periodically) report to the server a video fingerprint of the media content that it is presenting, and the server could carry out processes such as those discussed herein to determine the channel being presented. Each time the server thus determines that a channel is being presented, the server could increment a count or other statistic for that channel, as data indicating the extent to which the channel is being presented. Further, these counts or other statistics could be kept per media presentation device (as device-specific viewing analytics), indicating the extent to which a given media presentation device presents the channel at issue.
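The counting just described can be sketched in a few lines. The function name `record_presentation` and the two counter structures are hypothetical, chosen only to show an overall count alongside device-specific counts.

```python
from collections import Counter, defaultdict

channel_counts = Counter()            # presentations per channel, all devices
device_counts = defaultdict(Counter)  # device-specific viewing analytics


def record_presentation(device_id, channel):
    """Increment statistics each time a channel determination is made."""
    channel_counts[channel] += 1
    device_counts[device_id][channel] += 1


reports = [("tv-001", "7"), ("tv-001", "7"), ("tv-002", "7"), ("tv-002", "9")]
for device_id, channel in reports:
    record_presentation(device_id, channel)
```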

As another example, the server could responsively cause the media presentation device to present supplemental content, such as a pop-up advertisement, a commercial, or a channel identification as noted above, possibly as a replacement for one or more portions of the media content, and as video and/or audio content. For instance, knowing the channel at issue, the server could generate or select (e.g., from server data storage) particular supplemental media content associated specifically with the determined channel (and perhaps also based on profile data, such as device-specific viewing analytics, associated with the particular media presentation device), and could send the supplemental media content to the media presentation device for presentation in conjunction with the media content that the media presentation device is receiving from the receiver. The media presentation device could thus receive the supplemental media content from the server and present it together with the media content that it is receiving from the receiver.

In practice, this process could involve the server receiving in real time from the media presentation device a video fingerprint of the media content being presented by the media presentation device, and the server determining that the received video fingerprint matches a reference fingerprint of media content that the server is receiving on a known channel concurrently (or at a scheduled time).

In some cases, there may be a time difference between when the media presentation device presents the media content, timestamps the video fingerprint, and sends it to the server, and when the server receives the media content on the known channel and otherwise timestamps the reference video fingerprint. The server could account for this time difference by comparing the received video fingerprint over a sliding window of the reference video fingerprint, or vice versa. Further, the server could account for the time difference when taking action in response to a determined match between the received video fingerprint and the reference video fingerprint. For example, if the media presentation device receives the media content sufficiently earlier than the server's timestamp for the content (e.g., more than a few seconds earlier), the server could still identify a video fingerprint match and could record analytics data. In response to detecting that time difference, however, the server might forgo having the media presentation device present associated supplemental content, to help avoid a situation in which the media presentation device presents the supplemental content too late (e.g., out of sync) from the user's perspective. On the other hand, if the server detects a video fingerprint match for a sufficiently long period and/or determines that the matching content will continue, the server could have the media presentation device present supplemental content even when faced with such a time difference.
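The decision logic in the preceding paragraph could be sketched as follows. The thresholds `max_diff` and `min_sustained`, the action labels, and the function name are all assumptions made for illustration; the disclosure gives only "more than a few seconds" and "a sufficiently long period" as guidance.

```python
def choose_actions(time_diff_seconds, match_duration_seconds,
                   max_diff=3.0, min_sustained=30.0):
    """Decide which channel-specific actions to take after a match.

    Analytics are always recorded. Supplemental content is presented only
    if the device and server are close enough in time, or if the match has
    persisted long enough to outweigh the time difference.
    """
    actions = ["record_analytics"]
    in_sync = abs(time_diff_seconds) <= max_diff
    sustained = match_duration_seconds >= min_sustained
    if in_sync or sustained:
        actions.append("present_supplemental_content")
    return actions


small_gap = choose_actions(time_diff_seconds=1.0, match_duration_seconds=5.0)
large_gap = choose_actions(time_diff_seconds=10.0, match_duration_seconds=5.0)
large_gap_sustained = choose_actions(time_diff_seconds=10.0,
                                     match_duration_seconds=60.0)
```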

In any event, through these or other such processes, a network server or other entity can determine the channel on which the media presentation device is receiving the media content at issue. Once the entity has determined the channel, the entity could take action based on the channel determination. Alternatively, the entity could signal to another entity, perhaps back to the media presentation device, to cause that other entity to take action based on the channel determination. Other examples are possible as well.

In line with the discussion above, the server 20 or another entity having access to reference data as described above could be configured to identify video multi-match scenarios, perhaps by detecting in the reference data reference video fingerprints that match each other and that each match the video fingerprint of the media content being presented by the media presentation device.

For example, in one implementation, the server could regularly analyze the reference data in search of video multi-match scenarios, comparing pairs of reference video fingerprints in the reference data using any video fingerprint matching process now known or later developed, in an effort to find reference video fingerprints that match each other. Upon finding each such match of at least two reference video fingerprints, the server could then flag those reference video fingerprints as a multi-match group. Such a flag would indicate that a potential ambiguity will exist if the video fingerprint of the media content being presented by the media presentation device matches any of the reference video fingerprints in the flagged multi-match group. The server could flag reference video fingerprints as a multi-match group in various ways. For example, the server could cross-reference the reference video fingerprints of the multi-match group in the reference data, to indicate that they are members of a multi-match group.
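The pairwise search and flagging described above might look like the following minimal sketch. It assumes, purely for illustration, that each channel has one integer fingerprint and that two fingerprints match when their Hamming distance is at most `max_distance`; the disclosure leaves the matching process open. Matching channels are merged with a small union-find so that chains of matches end up in one group.

```python
from itertools import combinations


def hamming(a: int, b: int) -> int:
    """Number of differing bits between two integer fingerprints."""
    return bin(a ^ b).count("1")


def flag_multi_match_groups(reference, max_distance=2):
    """Map each channel in a multi-match group to the members of its group.

    `reference` maps channel -> reference video fingerprint. Channels whose
    fingerprint matches no other channel are absent from the result.
    """
    parent = {ch: ch for ch in reference}

    def find(ch):
        while parent[ch] != ch:
            ch = parent[ch]
        return ch

    for a, b in combinations(reference, 2):
        if hamming(reference[a], reference[b]) <= max_distance:
            parent[find(a)] = find(b)  # merge the two groups

    groups = {}
    for ch in reference:
        groups.setdefault(find(ch), set()).add(ch)
    # Only groups with two or more members are multi-match groups.
    return {ch: frozenset(g)
            for g in groups.values() if len(g) > 1 for ch in g}


reference = {"7": 0b11110000, "8": 0b11110001, "9": 0b00001111}
flags = flag_multi_match_groups(reference)
```

Here channels 7 and 8 cross-reference each other as members of one group, while channel 9 carries no flag.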

With this implementation, when the server receives from the media presentation device a video fingerprint of the media content being presented by the media presentation device, and the server determines that the received video fingerprint matches a reference video fingerprint, the server could then readily determine from the reference data whether a multi-match situation exists. If the matching reference video fingerprint is not flagged as a member of a multi-match group, the server could conclude that a single-match situation (rather than a multi-match situation) exists, in which case, as discussed above, the server could readily determine from the reference data the channel associated with the matching reference fingerprint and could conclude that this is the channel carrying the media content being presented by the media presentation device. If, however, the matching reference video fingerprint is flagged as a member of a multi-match group, the server could conclude that a multi-match situation (rather than a single-match situation) exists, in which case the server may need to perform a disambiguation process to help identify the channel at issue from among the channels associated with the reference video fingerprints of the multi-match group.

Alternatively, in another implementation, the server could identify a video multi-match group at the time the server receives a video fingerprint from the media presentation device. For example, when the server receives the video fingerprint from the media presentation device, the server could compare the received video fingerprint with all of the reference video fingerprints in the reference data. If the server thereby detects that the received video fingerprint matches only one of the reference video fingerprints, the server could conclude that a single-match situation (rather than a multi-match situation) exists, in which case the server could readily determine from the reference data the channel associated with the matching reference fingerprint and could conclude that this is the channel carrying the media content being presented by the media presentation device. If, however, the server detects that the received video fingerprint matches two or more of the reference video fingerprints, the server could conclude that a multi-match situation (rather than a single-match situation) exists, in which case the server may need to perform a disambiguation process to help identify the channel at issue from among the channels associated with the reference video fingerprints of the multi-match group.
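Detection at query time, as described in this implementation, can be sketched as a comparison against every reference fingerprint followed by a count of the matches. As before, the integer fingerprints, the Hamming-distance threshold `max_distance`, and the function and label names are assumptions for the example rather than details of the disclosure.

```python
def hamming(a: int, b: int) -> int:
    """Number of differing bits between two integer fingerprints."""
    return bin(a ^ b).count("1")


def classify_video_match(query_fp, reference, max_distance=2):
    """Compare a received fingerprint with every reference fingerprint.

    Returns a label and the sorted list of matching channels.
    """
    matches = sorted(ch for ch, fp in reference.items()
                     if hamming(query_fp, fp) <= max_distance)
    if not matches:
        return "no_match", matches
    if len(matches) == 1:
        return "single_match", matches   # channel is identified directly
    return "multi_match", matches        # disambiguation is needed


reference = {"7": 0b11110000, "8": 0b11110001, "9": 0b00001111}
single = classify_video_match(0b00001111, reference)
multi = classify_video_match(0b11110000, reference)
none = classify_video_match(0b00111100, reference)
```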

Note that, as with the processes discussed above, the process of detecting a multi-match situation involving two or more reference video fingerprints could be carried out even if the reference video fingerprints represent the same media content carried on two different channels with a time delay relative to each other, that is, where the presentation of the media content on one of the channels is offset in time from the presentation of the same media content on the other channel. The fingerprint matching process could account for this time offset and could still find a match if the video fingerprints otherwise match each other, for instance by comparing one video fingerprint with the other over a sliding window. For example, the process of finding matching reference video fingerprints in the reference data could involve searching for reference video fingerprints that match each other and whose respective times of presentation and/or scheduling are within a threshold time interval of each other.

As noted above, when the server or other entity carrying out this analysis detects that the video fingerprint of the media content being presented by the media presentation device matches multiple reference fingerprints corresponding with multiple channels, the entity could apply a disambiguation process to help determine which of the channels associated with the multi-match group is the actual channel carrying the media content being presented by the media presentation device.

In accordance with the present disclosure, the disambiguation process could be based on a further determination that an audio fingerprint of the media content being presented by the media presentation device matches the reference audio fingerprint of only a single channel.

As noted above, this form of disambiguation could apply where multiple channels provide the same video with associated audio, and where the video on the channels is largely identical but the channels have audio tracks that differ from each other. In that situation, when faced with a video multi-match, the server or other entity could then use an audio fingerprint of the audio component being presented by the media presentation device as a basis for disambiguation. In particular, the server could compare the audio fingerprint received from the media presentation device with the reference audio fingerprints of the channels associated with the video multi-match group, and could thereby determine that the audio fingerprint matches the reference audio fingerprint of only one of those channels, supporting the conclusion that that one channel is the channel carrying the media content being presented by the media presentation device.
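The audio comparison described above could be sketched as follows, again under the illustrative assumptions that fingerprints are integers and that a match means a Hamming distance of at most `max_distance`. The function name `disambiguate_by_audio` is hypothetical. The sketch returns a channel only when exactly one reference audio fingerprint matches.

```python
from typing import Dict, Optional


def hamming(a: int, b: int) -> int:
    """Number of differing bits between two integer fingerprints."""
    return bin(a ^ b).count("1")


def disambiguate_by_audio(query_audio_fp: int,
                          candidate_audio: Dict[str, int],
                          max_distance: int = 2) -> Optional[str]:
    """Pick the one channel of a video multi-match group whose reference
    audio fingerprint matches the received audio fingerprint.

    `candidate_audio` maps each channel of the multi-match group to its
    reference audio fingerprint. Returns None if zero or several match.
    """
    matches = [ch for ch, fp in candidate_audio.items()
               if hamming(query_audio_fp, fp) <= max_distance]
    return matches[0] if len(matches) == 1 else None


# Channels 7 and 8 carry the same video but different audio tracks.
resolved = disambiguate_by_audio(0b10101011, {"7": 0b10101010,
                                              "8": 0b01010101})
# If the audio tracks are also near-identical, audio cannot disambiguate.
unresolved = disambiguate_by_audio(0b1010, {"7": 0b1010, "8": 0b1011})
```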

To facilitate this in practice, as noted above, the server could regularly receive from the media presentation device an audio fingerprint (e.g., an audio fingerprint stream) of the audio component being presented by the media presentation device. The server could also regularly establish or receive reference data that includes reference audio fingerprints (e.g., reference audio fingerprint streams) of the various available channels. Thus, when faced with a video multi-match situation, the server could readily compare the audio fingerprint from the media presentation device with the audio fingerprints of the multi-match group members, in order to disambiguate.

Alternatively, as noted above, the server could regularly receive video fingerprints, without audio fingerprints, from the media presentation device, in which case, upon detecting a video multi-match situation, the server could request the media presentation device to provide an audio fingerprint of the media content being presented, to facilitate disambiguation. Further, the server could regularly generate only reference video fingerprints for each available channel, in which case, upon detecting a video multi-match situation, the server could begin generating audio fingerprints of the channels associated with the video multi-match group, in order to disambiguate.

Note also that, in some situations, the audio tracks of two different channels may differ from each other only in part. For example, two channels could provide movie content that is identical, including the same background music and sound effects, but the channels could have language tracks that differ from each other (e.g., one in English and the other dubbed in Spanish). To account for such partial audio track differences, the server could perform the audio fingerprint comparison over a period of time and could identify and track the differences between the audio tracks, as a basis for disambiguating and resolving the video multi-match situation.

For example, the server could receive an audio fingerprint stream from the media presentation device over a period of time (e.g., approximately 60 to 120 seconds), and the server could determine that one time segment of the audio fingerprint stream matches multiple reference audio fingerprints of the multi-match group but that another time segment of the audio fingerprint stream matches only one of those reference audio fingerprints, thereby establishing that the channel associated with the matching reference audio fingerprint is the channel at issue. U.S. Patent Application Serial No. 15/222,405 provides further discussion of performing such disambiguation based on earlier or later fingerprint time segments, and the principles discussed there could be applied in this scenario as well.
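A minimal sketch of this segment-by-segment comparison follows. It assumes, for simplicity, that the received stream and the reference streams are already time-aligned lists of integer fingerprints, one per time segment, and that a match means a Hamming distance of at most `max_distance`; these are assumptions made for the example, as are the names used.

```python
def hamming(a: int, b: int) -> int:
    """Number of differing bits between two integer fingerprints."""
    return bin(a ^ b).count("1")


def disambiguate_over_time(query_stream, reference_streams, max_distance=2):
    """Scan time segments until one matches exactly one channel.

    `query_stream` is a list of audio fingerprints, one per time segment;
    `reference_streams` maps channel -> list of reference audio
    fingerprints for the same segments. Returns (channel, segment_index),
    or (None, None) if no segment singles out one channel.
    """
    for i, query_fp in enumerate(query_stream):
        matches = [ch for ch, stream in reference_streams.items()
                   if i < len(stream)
                   and hamming(query_fp, stream[i]) <= max_distance]
        if len(matches) == 1:
            return matches[0], i
    return None, None


# The first two segments (e.g. shared background music) are identical on
# both channels; only the third segment (e.g. dialogue) differs.
reference_streams = {
    "english": [0x0F, 0x3C, 0xAA],
    "spanish": [0x0F, 0x3C, 0x55],
}
result = disambiguate_over_time([0x0F, 0x3C, 0x55], reference_streams)
```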

Further, in an implementation where the server evaluates reference video fingerprints in advance and flags video multi-match groups in the reference data, the server could compare the reference audio fingerprints of the channels associated with a multi-match group to determine whether the reference audio fingerprints differ from each other, and, if so, could further flag the video multi-match group to indicate that audio fingerprint analysis can facilitate disambiguation. For example, the server could record, in association with the video multi-match group, a Boolean value or other indication that audio fingerprint analysis can facilitate disambiguation. When the server thereafter detects a video multi-match situation with respect to a video fingerprint provided by a media presentation device, the server could then detect that the video multi-match group has been so further flagged, and, in response to detecting the video multi-match situation and the further flag, the server could responsively begin a process of seeking disambiguation based on audio fingerprint analysis.
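The further flag described above could be computed as in the following sketch, which treats the group as disambiguable by audio when every pair of reference audio fingerprints differs by more than a threshold. The integer fingerprints, the threshold `max_distance`, and the record layout are assumptions made for the example.

```python
from itertools import combinations


def hamming(a: int, b: int) -> int:
    """Number of differing bits between two integer fingerprints."""
    return bin(a ^ b).count("1")


def audio_can_disambiguate(group_audio: dict, max_distance: int = 2) -> bool:
    """True if all reference audio fingerprints in the group differ pairwise.

    `group_audio` maps each channel of a video multi-match group to its
    reference audio fingerprint.
    """
    return all(hamming(a, b) > max_distance
               for a, b in combinations(group_audio.values(), 2))


# Hypothetical record for a flagged video multi-match group, with the
# Boolean indication stored alongside the group's member channels.
multi_match_group = {
    "channels": ["7", "8"],
    "audio_can_disambiguate": audio_can_disambiguate({"7": 0xAA, "8": 0x55}),
}
same_audio = audio_can_disambiguate({"7": 0xAA, "8": 0xAB})
```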

Various aspects discussed above in connection with video fingerprint analysis could also be applied to the audio fingerprint analysis used to disambiguate a multi-match. For example, just as the media presentation device could provide the server with media content data (e.g., individual video frames or other segments of the media content being presented) so that the server itself can generate a video fingerprint of the media content for analysis, the media presentation device could also provide the server with media content (e.g., audio segments of the media content being presented) so that the server itself can generate an audio fingerprint of the media content for analysis. As another example, just as the server could account for time shifting when comparing video fingerprints, and could take care to avoid taking certain actions in response to sufficiently stale video fingerprint data, the server could also account for time shifting when comparing audio fingerprints and could take care to avoid taking certain actions in response to sufficiently stale audio fingerprint data. As yet another example, just as the server could compare video fingerprints using any video fingerprint matching process now known or later developed, the server could also compare audio fingerprints using any audio fingerprint matching process now known or later developed. Other examples are possible as well.

Note further that, although the discussion herein focuses primarily on the server 20 identifying a video multi-match situation and then disambiguating based on audio fingerprint analysis, some or all of the operations described could alternatively be carried out by one or more other entities, in place of the server or in cooperation with the server.

For example, one or more of the operations could be carried out by the media presentation device itself, or by an adjunct system in local communication with the media presentation device. For instance, the media presentation device itself could be provided with, or could have access to, reference data as described above, and the media presentation device itself could refer to the reference data to identify a video multi-match situation and could disambiguate based on audio fingerprint analysis, so as to identify the channel providing the media content being presented by the media presentation device. Further, in response to detecting a video multi-match situation, the media presentation device could then request and receive from the server the reference audio fingerprints of the channels associated with the video multi-match group, or the media presentation device could be separately provided with such reference audio fingerprint data. The media presentation device itself could then disambiguate by comparing those reference audio fingerprints with the audio fingerprint of the audio component being presented. Further, the media presentation device could then itself take a channel-specific action, such as presenting channel-specific content or recording the channel presentation, or could cause one or more other entities to take such action, or could take such action together with one or more other entities.

Fig. 3 is next a flow chart depicting a method that can be carried out in line with the discussion above. One or more of the operations in the method depicted in Fig. 3 could be carried out by one or more entities, including without limitation a network server, a media presentation device, and/or one or more entities operating on behalf of or in cooperation with these or other entities. Any such entity could embody a computing system, such as a programmed processing unit, configured to carry out one or more of the method operations. Further, a non-transitory data storage (e.g., disk storage, flash storage, or other computer-readable medium) could have stored thereon instructions executable by a processing unit to carry out the various depicted operations.

As shown in Fig. 3, at block 30, the method includes a computing system determining that a video fingerprint of media content being presented by a media presentation device matches multiple reference video fingerprints, each corresponding with a different respective channel. At block 32, the method then includes, responsive to at least determining that the video fingerprint matches the multiple reference video fingerprints each corresponding with a different respective channel, performing disambiguation based at least in part on a determination that an audio fingerprint of the media content being presented by the media presentation device matches a reference audio fingerprint corresponding with just a single channel, the disambiguation establishing that the media presentation device is receiving the media content on that single channel. At block 34, the method includes taking action based on the determination that the media presentation device is receiving the media content on the single channel.

As discussed above, the media content being presented by the media presentation device could have a video track and an audio track (e.g., at least a language track), in which case the video fingerprint could be a fingerprint of the video track and the audio fingerprint could be a fingerprint of the audio track.

Further, as discussed above, where the computing system is a server or other entity other than the media presentation device, the media presentation device could generate the video and audio fingerprints of the media content that it is presenting, and the computing system could receive those fingerprints from the media presentation device.

Further, in an example implementation, the multiple reference video fingerprints could be selected from a plurality of reference fingerprints in the reference data, in which case determining that the video fingerprint of the media content being presented by the media presentation device matches the multiple reference video fingerprints could involve (i) comparing the video fingerprint of the media content being presented by the media presentation device with the plurality of reference video fingerprints in the reference data, and (ii) based on the comparison, determining that the video fingerprint of a first segment of the media content being presented by the media presentation device matches the multiple reference video fingerprints.

Moreover, the method could also include detecting and flagging a multi-match group consisting of the multiple reference fingerprints, in which case the act of determining that the video fingerprint of the media content being presented by the media presentation device matches the multiple reference video fingerprints could involve determining that the video fingerprint matches the reference video fingerprints of the flagged multi-match group.

The method could also include (i) comparing the audio fingerprints that correspond with the video fingerprints of the flagged multi-match group, (ii) based on the comparison, detecting a difference between the compared audio fingerprints, and (iii) in response to detecting the difference, further flagging the multi-match group to indicate that audio fingerprint analysis can facilitate disambiguation. In that case, the act of performing disambiguation based on the determination that the audio fingerprint of the media content being presented by the media presentation device matches the reference audio fingerprint corresponding with just a single channel could be further responsive to determining that the multi-match group has been further flagged to indicate that audio fingerprint analysis can facilitate disambiguation.

Also in line with the discussion above, the method could include determining that the audio fingerprint of the media content being presented by the media presentation device matches the reference audio fingerprint corresponding with just a single channel. For example, the method could include (i) comparing the audio fingerprint of the media content being presented by the media presentation device with the reference audio fingerprints corresponding with the same channels as the multiple reference video fingerprints, and (ii) based on the comparison, determining that the audio fingerprint of the media content being presented by the media presentation device matches just a single one of those reference audio fingerprints.

Further, as discussed above, the computing system could be an entity other than the media presentation device. The method could then also include, responsive to at least determining that the video fingerprint matches the multiple reference video fingerprints each corresponding with a different respective channel, the computing system requesting and receiving from the media presentation device the audio fingerprint of the media content being presented by the media presentation device, to facilitate performing disambiguation based on the obtained audio fingerprint.

Still further, the method could include, responsive to at least determining that the video fingerprint matches the multiple reference fingerprints each corresponding with a different respective channel, generating reference audio fingerprints corresponding with the same channels as the multiple reference video fingerprints, to facilitate the comparison.

Further, as discussed above, the act of taking action based on the determination that the media presentation device is receiving the media content on the single channel could involve an operation selected from the following: (i) causing supplemental channel-specific content to be presented together with the media content being presented by the media presentation device, and (ii) recording presentation of the single channel for use in a channel ratings system. Also, as discussed above, the method could be carried out at least in part by the media presentation device.

Next, Fig. 4 is a simplified block diagram of an example system operable in accordance with the disclosure. The system could represent a network server as described above and/or one or more other entities (possibly including the media presentation device). As shown in Fig. 4, the example system includes a network communication interface 40, a processing unit 42, and non-transitory data storage 44, any or all of which could be integrated together or, as shown, communicatively linked together by a system bus, network, or other connection mechanism 46.

Network communication interface 40 may include one or more physical network connection mechanisms to facilitate communication on a network such as network 22 discussed above, and/or for engaging in direct or networked communication with one or more other local or remote entities. As such, the network communication interface may include a wireless or wired Ethernet interface or another type of network interface, for engaging in IP communication and/or other types of network communication.

Processing unit 42 may then include one or more general-purpose processors (e.g., microprocessors) and/or one or more special-purpose processors (e.g., application-specific integrated circuits). And non-transitory data storage 44 may include one or more volatile and/or non-volatile storage components, such as optical, magnetic, or flash storage.

As shown, data storage 44 then stores program instructions 48, which are executable by processing unit 42 to carry out the various operations described herein. For example, the program instructions could be executable to (i) receive from a media presentation device, via the network communication interface, a video fingerprint of media content being presented by the media presentation device, (ii) determine that the received video fingerprint matches reference video fingerprints corresponding to multiple channels, (iii) receive from the media presentation device, via the network communication interface, an audio fingerprint of the media content being presented by the media presentation device, (iv) use the received audio fingerprint to determine which of the multiple channels carries the media content being presented by the media presentation device, and (v) take action based on the determined channel.
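Operations (i) through (iv) can be sketched as a single server-side function. This is a hedged illustration under stated assumptions: fingerprints are modeled as tuples of bits, matching is a bit-difference count within a tolerance, and the audio fingerprint is obtained through a callable so that it is requested only when a video multi-match actually occurs. The names `identify_channel`, `close_enough`, `request_audio_fp`, and `max_diff` are hypothetical and do not come from the disclosure.

```python
from typing import Callable, Dict, Optional, Tuple

Fingerprint = Tuple[int, ...]


def close_enough(a: Fingerprint, b: Fingerprint, max_diff: int = 0) -> bool:
    """Assumed matching rule: same length and at most max_diff differing bits."""
    return len(a) == len(b) and sum(x != y for x, y in zip(a, b)) <= max_diff


def identify_channel(
    video_fp: Fingerprint,
    request_audio_fp: Callable[[], Fingerprint],
    video_refs: Dict[str, Fingerprint],
    audio_refs: Dict[str, Fingerprint],
    max_diff: int = 0,
) -> Optional[str]:
    """Identify the channel carrying the content, using audio only on a multi-match."""
    # (ii) Find every channel whose reference video fingerprint matches.
    candidates = [
        channel
        for channel, reference in video_refs.items()
        if close_enough(video_fp, reference, max_diff)
    ]
    if not candidates:
        return None
    if len(candidates) == 1:
        return candidates[0]  # no multi-match, so no audio is needed

    # (iii) Multi-match: obtain the audio fingerprint from the device.
    audio_fp = request_audio_fp()

    # (iv) Keep only the candidate channels whose reference audio matches.
    narrowed = [
        channel
        for channel in candidates
        if close_enough(audio_fp, audio_refs[channel], max_diff)
    ]
    return narrowed[0] if len(narrowed) == 1 else None
```

Operation (v), taking action on the identified channel, is left to the caller here, since the appropriate action (supplemental content, ratings logging, and so on) depends on the deployment.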

In line with the discussion herein, these operations can take various forms. For example, the act of receiving from the media presentation device the audio fingerprint of the media content being presented could occur at some time before determining that the received video fingerprint matches the reference video fingerprints corresponding to multiple channels, possibly with the audio fingerprint and the video fingerprint being received concurrently. Alternatively, the act of receiving the audio fingerprint could be carried out in response to determining that the received video fingerprint matches the reference video fingerprints corresponding to multiple channels, for example by sending to the media presentation device a request for the audio fingerprint of the media content being presented and receiving the audio fingerprint in response to that request, so that the audio fingerprint is obtained in response to the video multi-match. Other examples are possible as well.

Finally, Fig. 5 is a simplified block diagram of an example media presentation device operable in accordance with the disclosure. In line with the discussion above, the media presentation device could take various forms. For example, it could be a television, a computer monitor, or another device that receives and presents video content, and/or it could be a loudspeaker, a pair of headphones, or another device that receives and presents audio content. Many other examples are possible as well.

As shown in Fig. 5, the example media presentation device includes a media input interface 50, a media presentation interface 52, a network communication interface 54, a processing unit 56, and non-transitory data storage 58, any or all of which could be integrated together or, as shown, communicatively linked together by a system bus, network, or other connection mechanism 60.

Media input interface 50 may include a physical communication interface for receiving the media content to be presented by the media presentation device. As such, the media input interface may include one or more wired and/or wireless interfaces for establishing communication with a receiver or other device or system and receiving media content from it in analog or digital form. For example, the media input interface may include one or more interfaces compliant with protocols such as DVI, HDMI, VGA, USB, Bluetooth, and Wi-Fi.

Media presentation interface 52 may then include one or more components to facilitate presentation of the received media content. By way of example, the media presentation interface may include a user interface, such as a display screen and/or a loudspeaker, as well as one or more drivers or other components for processing the received media content to facilitate presentation of the content on the user interface.

Network communication interface 54 may include a physical network connection mechanism to facilitate communication on a network such as network 22 discussed above, and/or for engaging in direct or networked communication with one or more other local or remote entities. As such, the network communication interface may include a wireless or wired Ethernet interface or another type of network interface, for engaging in IP communication and/or other types of network communication.

Processing unit 56 may then include one or more general-purpose processors (e.g., microprocessors) and/or one or more special-purpose processors (e.g., application-specific integrated circuits). And non-transitory data storage 58 may include one or more volatile and/or non-volatile storage components, such as optical, magnetic, or flash storage. In addition, as shown, data storage 58 stores program instructions 62, which are executable by processing unit 56 to carry out the various operations described herein. For example, the program instructions could be executable to generate, on an ongoing basis or on request, a video fingerprint and an audio fingerprint of the media content being presented by the media presentation device, based on analysis of the media content being received at media input interface 50 and/or being processed at the media presentation interface, and to provide the generated fingerprints on an ongoing basis or on request, to facilitate channel identification as described herein.
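The device-side behavior described above (fingerprinting the content being presented and providing fingerprints on request) might look roughly like the sketch below. The fingerprint functions are deliberately simplistic stand-ins chosen for illustration; the disclosure does not specify these algorithms. The names `toy_video_fingerprint`, `toy_audio_fingerprint`, and `FingerprintReporter` are hypothetical.

```python
from typing import Optional, Sequence, Tuple


def toy_video_fingerprint(frame: Sequence[Sequence[int]]) -> Tuple[int, ...]:
    """Stand-in video fingerprint: one bit per pixel, 1 if brighter than the frame mean."""
    pixels = [p for row in frame for p in row]
    if not pixels:
        raise ValueError("frame must contain at least one pixel")
    mean = sum(pixels) / len(pixels)
    return tuple(int(p > mean) for p in pixels)


def toy_audio_fingerprint(samples: Sequence[float]) -> Tuple[int, ...]:
    """Stand-in audio fingerprint: one bit per adjacent sample pair, 1 if magnitude rises."""
    return tuple(int(abs(b) > abs(a)) for a, b in zip(samples, samples[1:]))


class FingerprintReporter:
    """Keeps the most recent media and produces fingerprints when asked."""

    def __init__(self) -> None:
        self._frame: Optional[Sequence[Sequence[int]]] = None
        self._samples: Optional[Sequence[float]] = None

    def on_media(self, frame: Sequence[Sequence[int]], samples: Sequence[float]) -> None:
        """Record the latest frame and audio samples received for presentation."""
        self._frame = frame
        self._samples = samples

    def video_fingerprint(self) -> Tuple[int, ...]:
        """Fingerprint of the latest frame, e.g. for ongoing reporting."""
        if self._frame is None:
            raise RuntimeError("no media received yet")
        return toy_video_fingerprint(self._frame)

    def audio_fingerprint(self) -> Tuple[int, ...]:
        """Fingerprint of the latest audio, e.g. generated only when requested."""
        if self._samples is None:
            raise RuntimeError("no media received yet")
        return toy_audio_fingerprint(self._samples)
```

Computing the audio fingerprint lazily, inside `audio_fingerprint`, mirrors the on-request option in the text; a device reporting on an ongoing basis could instead call both methods each time new media arrives.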

Exemplary embodiments have been described above. Those skilled in the art will understand, however, that changes and modifications may be made to these embodiments without departing from the true scope and spirit of the invention.

This application discloses the following examples A1 to G51.

A1. A method for taking action based on a channel determined through audio-fingerprint-based disambiguation, the method comprising:

determining, by a computing system, that a digital video fingerprint of media content being presented by a media presentation device matches multiple reference video fingerprints each corresponding to a different respective channel;

in response to at least determining that the digital video fingerprint matches the multiple reference video fingerprints each corresponding to a different respective channel, performing disambiguation, by the computing system, based at least in part on determining that a digital audio fingerprint of the media content being presented by the media presentation device matches a reference audio fingerprint corresponding to only a single one of the channels to which the multiple reference video fingerprints correspond, the disambiguation establishing that the media presentation device is receiving the media content on the single channel; and

taking action, by the computing system, based on the determination that the media presentation device is receiving the media content on the single channel.

A2. The method according to claim A1, wherein the media content being presented by the media presentation device has a video track and an audio track, wherein the digital video fingerprint is a fingerprint of the video track, and the digital audio fingerprint is a fingerprint of the audio track.

A3. The method according to claim A1, wherein the digital audio fingerprint of the media content being presented by the media presentation device represents at least a language track of the media content being presented by the media presentation device.

A4. The method according to claim A1, wherein the computing system is an entity other than the media presentation device, and wherein the digital video fingerprint and the digital audio fingerprint are generated by the media presentation device, the method further comprising:

receiving, by the computing system, the digital video fingerprint and the digital audio fingerprint from the media presentation device.

A5. The method according to claim A1, wherein the multiple reference video fingerprints are among a number of reference video fingerprints in reference data, and wherein determining that the digital video fingerprint of the media content being presented by the media presentation device matches the multiple reference video fingerprints comprises:

comparing the digital video fingerprint of the media content being presented by the media presentation device with the reference video fingerprints in the reference data, and

determining, based on the comparison, that the digital video fingerprint of the media content being presented by the media presentation device matches the multiple reference video fingerprints.

A6. The method according to claim A1, further comprising detecting and flagging a multi-match group made up of the multiple reference video fingerprints,

wherein determining that the digital video fingerprint of the media content being presented by the media presentation device matches the multiple reference video fingerprints comprises determining that the digital video fingerprint of the media content being presented by the media presentation device matches the reference video fingerprints of the flagged multi-match group.

A7. The method according to claim A6, further comprising:

comparing the reference audio fingerprints corresponding to the reference video fingerprints of the flagged multi-match group;

detecting, based on the comparison, a difference between the compared reference audio fingerprints; and

in response to detecting the difference, further flagging the multi-match group to indicate that audio fingerprint analysis can facilitate disambiguation,

wherein performing the disambiguation based on determining that the digital audio fingerprint of the media content being presented by the media presentation device matches the reference audio fingerprint corresponding to only the single channel is further responsive to determining that the multi-match group is further flagged to indicate that audio fingerprint analysis can facilitate disambiguation.

A8. The method according to claim A1, wherein the computing system is an entity other than the media presentation device, the method further comprising:

in response to at least determining that the digital video fingerprint matches the multiple reference video fingerprints each corresponding to a different respective channel, requesting and receiving, by the computing system from the media presentation device, the digital audio fingerprint of the media content being presented by the media presentation device, to facilitate performing the disambiguation based on the obtained digital audio fingerprint.

A9. The method according to claim A1, further comprising:

in response to at least determining that the digital video fingerprint matches the multiple reference fingerprints each corresponding to a different respective channel, generating reference audio fingerprints corresponding to the channels to which the multiple reference video fingerprints correspond, to facilitate the disambiguation.

A10. The method according to claim A1, wherein taking action based on determining that the media presentation device is receiving the media content on the single channel comprises causing supplemental channel-specific content to be presented together with the media content being presented by the media presentation device.

A11. The method according to claim A1, wherein taking action based on determining that the media presentation device is receiving the media content on the single channel comprises selectively replacing a predetermined portion of the media content with a replacement advertisement.

A12. The method according to claim A1, wherein taking action based on determining that the media presentation device is receiving the media content on the single channel comprises recording presentation of the single channel for use in a channel ratings system.

A13. The method according to claim A1, wherein the method is carried out at least in part by the media presentation device.

B14. A non-transitory computer-readable medium having stored thereon instructions executable by a processing unit to carry out operations comprising:

determining that a digital video fingerprint of media content being presented by a media presentation device matches multiple reference video fingerprints each corresponding to a different respective channel;

in response to at least determining that the digital video fingerprint matches the multiple reference video fingerprints each corresponding to a different respective channel, performing disambiguation based at least in part on determining that a digital audio fingerprint of the media content being presented by the media presentation device matches a reference audio fingerprint corresponding to only a single one of the channels to which the multiple reference video fingerprints correspond, the disambiguation establishing that the media presentation device is receiving the media content on the single channel; and

taking action based on the determination that the media presentation device is receiving the media content on the single channel.

B15. The non-transitory computer-readable medium according to claim B14, wherein the digital audio fingerprint of the media content being presented by the media presentation device represents at least a language track of the media content being presented by the media presentation device.

B16. The non-transitory computer-readable medium according to claim B14, wherein taking action based on determining that the media presentation device is receiving the media content on the single channel comprises causing supplemental channel-specific content to be presented together with the media content being presented by the media presentation device.

B17. The non-transitory computer-readable medium according to claim B14, wherein taking action based on determining that the media presentation device is receiving the media content on the single channel comprises recording presentation of the single channel for use in a channel ratings system.

C18. A system for taking action based on a channel determined through audio-fingerprint-based disambiguation, the system comprising:

a network communication interface;

a processing unit;

non-transitory data storage; and

program instructions stored in the non-transitory data storage and executable by the processing unit to carry out operations comprising:

receiving from a media presentation device, via the network communication interface, a digital video fingerprint of media content being presented by the media presentation device,

determining that the received digital video fingerprint matches reference video fingerprints corresponding to multiple channels,

receiving from the media presentation device, via the network communication interface, a digital audio fingerprint of the media content being presented by the media presentation device,

using the received digital audio fingerprint to determine which of the multiple channels carries the media content being presented by the media presentation device,

wherein using the received digital audio fingerprint to determine which of the multiple channels carries the media content being presented by the media presentation device comprises:

determining that the received digital audio fingerprint matches a reference audio fingerprint corresponding to only a single one of the multiple channels, and

taking action based on the determined channel.

C19. The system according to claim C18, wherein the operations further comprise:

in response to at least determining that the received digital video fingerprint matches the reference video fingerprints corresponding to the multiple channels, sending to the media presentation device a request for the audio fingerprint of the media content being presented by the media presentation device,

wherein receiving from the media presentation device the digital audio fingerprint of the media content being presented by the media presentation device occurs in response to sending the request.

C20. The system according to claim C18, wherein taking action based on the determined channel comprises causing supplemental channel-specific content to be presented together with the media content being presented by the media presentation device.

C21. The system according to claim C18, wherein taking action based on the determined channel comprises selectively replacing a predetermined portion of the media content with a replacement advertisement.

C22. The system according to claim C18, wherein taking action based on the determined channel comprises recording presentation of the determined channel for use in a channel ratings system.

D23. A method for taking action based on a channel determined through disambiguation based on audio fingerprint data, the method comprising:

determining, by a computing system, that digital video fingerprint data representing media content being presented by a media presentation device matches reference video fingerprint data corresponding to multiple channels;

in response to at least determining that the digital video fingerprint data matches the reference video fingerprint data corresponding to multiple channels, performing disambiguation, by the computing system, based at least in part on determining that digital audio fingerprint data representing the media content being presented by the media presentation device matches reference audio fingerprint data corresponding to only a single one of the multiple channels, the disambiguation establishing that the media presentation device is receiving the media content on the single channel; and

taking action based on the determination that the media presentation device is receiving the media content on the single channel.

D24. The method according to claim D23, wherein the media content has a video track and an audio track, wherein the digital video fingerprint data is fingerprint data representing the video track, and the digital audio fingerprint data is fingerprint data representing the audio track.

D25. The method according to claim D23, wherein the digital audio fingerprint data represents at least a language track of the media content.

D26. The method according to claim D23, wherein taking action based on determining that the media presentation device is receiving the media content on the single channel comprises causing the media presentation device to present supplemental channel-specific content together with the media content being presented by the media presentation device.

D27. The method according to claim D26, wherein the supplemental channel-specific content comprises at least one of a pop-up advertisement, a commercial, or a channel identification.

D28. The method according to claim D26,

wherein the supplemental channel-specific content comprises an advertisement, and

wherein causing the media presentation device to present the supplemental channel-specific content together with the media content being presented by the media presentation device comprises causing the media presentation device to present the advertisement in place of a portion of the media content.

D29. The method according to claim D23, wherein taking action based on determining that the media presentation device is receiving the media content on the single channel comprises inserting an advertisement in place of a portion of the media content.

E30. A media presentation device, comprising:

a media input interface through which the media presentation device receives media content to be presented by the media presentation device;

a media presentation interface for presenting the media content;

a network communication interface;

a processing unit;

non-transitory data storage; and

program instructions stored in the non-transitory data storage and executable by the processing unit to carry out operations comprising:

generating, based on analysis of the media content, digital video fingerprint data representing the media content, and outputting the generated digital video fingerprint data for transmission to a server via the network communication interface,

after outputting the digital video fingerprint data for transmission to the server, receiving from the server a request for audio fingerprint data representing the media content, the request being responsive to the server determining that the digital video fingerprint data matches reference video fingerprint data corresponding to multiple channels,

in response to the request, outputting digital audio fingerprint data representing the media content, for transmission of the digital audio fingerprint data to the server via the network communication interface, and

causing the media presentation device to present, together with the media content, supplemental channel-specific content associated with one of the multiple channels, the one channel being identified from among the multiple channels based on a determination that the digital audio fingerprint data matches reference audio fingerprint data representing only the one channel of the multiple channels.

E31. The media presentation device according to claim E30, wherein the media content has a video track and an audio track, wherein the digital video fingerprint data is fingerprint data representing the video track, and the digital audio fingerprint data is fingerprint data representing the audio track.

E32. The media presentation device according to claim E30, wherein the digital audio fingerprint data represents at least a language track of the media content.

E33. The media presentation device according to claim E30, wherein the operations further comprise:

in response to the request, generating the digital audio fingerprint data representing the media content.

E34. The media presentation device according to claim E30, wherein the supplemental channel-specific content comprises at least one of a pop-up advertisement, a commercial, or a channel identification.

E35. The media presentation device according to claim E30, wherein the supplemental channel-specific content is a replacement for one or more portions of the media content.

E36. The media presentation device according to claim E30,

wherein the supplemental channel-specific content is an advertisement, and

wherein causing the media presentation device to present the supplemental content together with the media content comprises causing the media presentation device to present the advertisement in place of a portion of the media content.

E37. The media presentation device according to claim E30, wherein the media presentation device comprises a television.

F38. A method for presenting supplemental channel-specific content, the method comprising:

generating, by a media presentation device, based on analysis of media content being presented by the media presentation device, digital video fingerprint data representing the media content being presented by the media presentation device;

outputting, by the media presentation device, the generated video fingerprint data for transmission over a network to a server;

after outputting the digital video fingerprint data for transmission over the network to the server, receiving, by the media presentation device from the server, a request for audio fingerprint data representing the media content, the request being responsive to the server determining that the digital video fingerprint data matches reference video fingerprint data corresponding to multiple channels;

in response to the request, outputting, by the media presentation device, digital audio fingerprint data representing the media content, for transmission of the digital audio fingerprint data over the network to the server; and

presenting, by the media presentation device, together with the media content, supplemental channel-specific content associated with one of the multiple channels, the one channel being identified from among the multiple channels based on a determination that the digital audio fingerprint data matches reference audio fingerprint data corresponding to only the one channel of the multiple channels.

F39. The method according to claim F38, wherein the media content has a video track and an audio track, wherein the digital video fingerprint data is fingerprint data representing the video track, and the digital audio fingerprint data is fingerprint data representing the audio track.

F40. The method according to claim F38, wherein the digital audio fingerprint data represents at least a language track of the media content.

F41. The method according to claim F38, further comprising:

in response to the request, generating the digital audio fingerprint data representing the media content.

F42. The method according to claim F38, wherein the supplemental channel-specific content comprises at least one of a pop-up advertisement, a commercial, or a channel identification.

F43. The method according to claim F38, wherein the supplemental channel-specific content is a replacement for one or more portions of the media content.

F44. The method according to claim F38,

wherein the supplemental channel-specific content is an advertisement, and

G45. A non-transitory computer-readable medium having stored thereon instructions executable by a processing unit to carry out operations comprising:

generating, based on analysis of media content being presented by a media presentation device, digital video fingerprint data representing the media content being presented by the media presentation device, and outputting, by the media presentation device, the generated video fingerprint data for transmission over a network to a server;

after outputting the digital video fingerprint data for transmission to the server, receiving from the server a request for audio fingerprint data representing the media content, once the server has determined that the digital video fingerprint data matches reference video fingerprint data corresponding to multiple channels;

in response to the request, outputting digital audio fingerprint data representing the media content, for transmission of the digital audio fingerprint data over the network to the server; and

causing the media presentation device to present, together with the media content, supplemental channel-specific content associated with one of the multiple channels, the one channel being identified from among the multiple channels based on a determination that the digital audio fingerprint data matches reference audio fingerprint data corresponding to only the one channel of the multiple channels.

G46. The non-transitory computer-readable medium according to claim G45, wherein the media content has a video track and an audio track, wherein the digital video fingerprint data is fingerprint data representing the video track, and the digital audio fingerprint data is fingerprint data representing the audio track.

G47. The non-transitory computer-readable medium according to claim G45, wherein the digital audio fingerprint data represents at least a language track of the media content.

G48. The non-transitory computer-readable medium according to claim G45, wherein the operations further comprise:

in response to the request, generating the digital audio fingerprint data representing the media content.

G49. The non-transitory computer-readable medium according to claim G45, wherein the supplemental channel-specific content comprises at least one of a pop-up advertisement, a commercial, or a channel identification.

G50. The non-transitory computer-readable medium according to claim G45, wherein the supplemental channel-specific content is a replacement for one or more portions of the media content.

G51. The non-transitory computer-readable medium according to claim G45, wherein the non-transitory computer-readable medium is implemented in the media presentation device.
