System and method for improving device map accuracy using media viewing data

Document No.: 1786406 | Published: 2019-12-06

Reading note: This technology, "System and method for improving device map accuracy using media viewing data", was created by Zeev Neumeier (泽埃夫·诺伊迈尔) on 2018-04-05. Methods, devices, and computer program products are provided for determining an accuracy score for a device mapping system. In some examples, the accuracy score may be based on a device map of the device mapping system and viewing data from an automatic content recognition component. In such an example, the accuracy score may indicate whether the device mapping system assigns similar categories to devices having similar media content playback. In such an example, the device map may be determined to be random, indicating that the device mapping system is inaccurate. Conversely, if the device map is determined to have a sufficiently low probability of being merely random in nature, the device mapping system may be determined to be accurate.

1. A system, comprising:

one or more processors; and

one or more non-transitory machine-readable storage media embodying instructions that, when executed on the one or more processors, cause the one or more processors to perform operations comprising:

obtaining a plurality of categories assigned to a group of media player devices, wherein the plurality of categories are determined using a device mapping system, and wherein a category comprises a classification of the group of media player devices;

determining a viewing behavior of the group of media player devices, wherein the viewing behavior is determined using automatic content recognition by matching media content viewed by the media player devices with stored media content;

determining a correlation between the plurality of categories of the group of media player devices and the viewing behavior;

determining an accuracy score of the device mapping system using the determined correlation; and

assigning the accuracy score to the device mapping system, wherein the accuracy score is used to improve the device mapping system.

2. The system of claim 1, wherein the correlation between the plurality of categories of the group of media player devices and the viewing behavior is based on a degree of difference in viewing behavior in the plurality of categories.

3. The system of claim 2, wherein determining an accuracy score of the device mapping system comprises: performing a statistical hypothesis test to determine whether the correlation between the plurality of categories of the media player device group and the viewing behavior is random.

4. The system of claim 3, further comprising instructions that, when executed on the one or more processors, cause the one or more processors to perform operations comprising:

comparing a result of the statistical hypothesis test to a randomness threshold; and

determining that the correlation is random when the result is less than the randomness threshold.

5. The system of claim 4, wherein the accuracy score is determined for the device mapping system based on the comparison of the result of the statistical hypothesis test to the randomness threshold.

6. The system of claim 1, wherein the media content is video content, and wherein performing the automatic content recognition comprises:

receiving a pixel cue point associated with a frame of an unknown video segment, wherein the pixel cue point comprises a set of pixel values corresponding to the frame;

identifying a candidate reference data point in a database of reference data points, wherein the candidate reference data point is similar to the pixel cue point, and wherein the candidate reference data point comprises one or more pixel values corresponding to a candidate frame of a candidate video segment;

adding a token to a bin associated with the candidate reference data point and the candidate video segment;

determining whether a number of tokens in the bin exceeds a value; and

identifying the unknown video segment as matching the candidate video segment when the number of tokens in the bin exceeds the value.

7. The system of claim 1, wherein the viewing behavior comprises at least one or more of: an amount of time that the media player device group views one or more of a plurality of channels, income associated with a user of the media player device group, an age group of the user of the media player device group, an education level of the user of the media player device group, or a number of devices in the media player device group.

8. A method, comprising:

obtaining a plurality of categories assigned to a group of media player devices, wherein the plurality of categories are determined using a device mapping system, and wherein a category comprises a classification of the group of media player devices;

determining a viewing behavior of the group of media player devices, wherein the viewing behavior is determined using automatic content recognition by matching media content viewed by the media player devices with stored media content;

determining a correlation between the plurality of categories of the group of media player devices and the viewing behavior;

determining an accuracy score of the device mapping system using the determined correlation; and

assigning the accuracy score to the device mapping system, wherein the accuracy score is used to improve the device mapping system.

9. The method of claim 8, wherein the correlation between the plurality of categories of the media player device group and the viewing behavior is based on a degree of difference in viewing behavior in the plurality of categories.

10. The method of claim 9, wherein determining the accuracy score of the device mapping system comprises: performing a statistical hypothesis test to determine whether the correlation between the plurality of categories of the group of media player devices and viewing behavior is random.

11. The method of claim 10, further comprising:

comparing a result of the statistical hypothesis test to a randomness threshold; and

determining that the correlation is random when the result is less than the randomness threshold.

12. The method of claim 11, wherein the accuracy score is determined for the device mapping system based on the comparison of the result of the statistical hypothesis test to the randomness threshold.

13. The method of claim 8, wherein the media content is video content, and wherein performing the automatic content recognition comprises:

receiving a pixel cue point associated with a frame of an unknown video segment, wherein the pixel cue point comprises a set of pixel values corresponding to the frame;

identifying a candidate reference data point in a database of reference data points, wherein the candidate reference data point is similar to the pixel cue point, and wherein the candidate reference data point comprises one or more pixel values corresponding to a candidate frame of a candidate video segment;

adding a token to a bin associated with the candidate reference data point and the candidate video segment;

determining whether a number of tokens in the bin exceeds a value; and

identifying the unknown video segment as matching the candidate video segment when the number of tokens in the bin exceeds the value.

14. The method of claim 8, wherein the viewing behavior comprises at least one or more of: an amount of time that the media player device group views one or more of a plurality of channels, income associated with a user of the media player device group, an age group of the user of the media player device group, an education level of the user of the media player device group, or a number of devices in the media player device group.

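15. A computer program product tangibly embodied in a non-transitory machine-readable storage medium, comprising instructions that, when executed by one or more processors, cause the one or more processors to:

obtain a plurality of categories assigned to a group of media player devices, wherein the plurality of categories are determined using a device mapping system, and wherein a category comprises a classification of the group of media player devices;

determine a viewing behavior of the group of media player devices, wherein the viewing behavior is determined using automatic content recognition by matching media content viewed by the media player devices with stored media content;

determine a correlation between the plurality of categories of the group of media player devices and the viewing behavior;

determine an accuracy score of the device mapping system using the determined correlation; and

assign the accuracy score to the device mapping system, wherein the accuracy score is used to improve the device mapping system.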

16. The computer program product of claim 15, wherein the correlation between the plurality of categories of the media player device group and the viewing behavior is based on a degree of difference in viewing behavior in the plurality of categories.

17. The computer program product of claim 16, wherein determining the accuracy score of the device mapping system comprises: performing a statistical hypothesis test to determine whether the correlation between the plurality of categories of the group of media player devices and viewing behavior is random.

18. The computer program product of claim 17, further comprising instructions that, when executed by the one or more processors, cause the one or more processors to:

compare a result of the statistical hypothesis test to a randomness threshold; and

determine that the correlation is random when the result is less than the randomness threshold.

19. The computer program product of claim 18, wherein the accuracy score is determined for the device mapping system based on the comparison of the result of the statistical hypothesis test to the randomness threshold.

20. The computer program product of claim 15, wherein the media content is video content, and wherein performing the automatic content recognition comprises:

receiving a pixel cue point associated with a frame of an unknown video segment, wherein the pixel cue point comprises a set of pixel values corresponding to the frame;

identifying a candidate reference data point in a database of reference data points, wherein the candidate reference data point is similar to the pixel cue point, and wherein the candidate reference data point comprises one or more pixel values corresponding to a candidate frame of a candidate video segment;

adding a token to a bin associated with the candidate reference data point and the candidate video segment;

determining whether a number of tokens in the bin exceeds a value; and

identifying the unknown video segment as matching the candidate video segment when the number of tokens in the bin exceeds the value.

Technical Field

The present disclosure relates generally to improving the accuracy of data derived from analysis of connected devices and their association with particular categories.

Background

Users are increasingly accessing media across a range of devices. However, determining which devices are associated with a particular user can be difficult. Many systems purport to map devices to particular categories (the result is sometimes referred to as a device map or device graph). For example, a device mapping system may generate a device map indicating that a first device and a second device belong to a particular category. In some examples, devices may be mapped to particular users based on the categories assigned to each device. In other examples, devices are assigned to a household-wide device map. However, it is difficult to assess the accuracy of a device map. Therefore, there is a need in the art to determine and improve the accuracy of device maps.

Disclosure of Invention

Methods, devices, and computer program products are provided for determining an accuracy score for a device mapping system by processing media (e.g., video and/or audio data) played by one or more devices. In some examples, the accuracy score may be based on a device map of the device mapping system. In such an example, the device map may include information linking devices that are associated together.

In some examples, the accuracy score may also be based on media content viewing data from an Automatic Content Recognition (ACR) system or other system that can determine media content being viewed by one or more media player devices. In some cases, the media content may include video content (which may include audio content) or audio content. The media content may be processed and analyzed (e.g., using an ACR system) to determine the media content being viewed by one or more media player devices, which may be stored as viewing data. In one illustrative example, when the ACR system is used to determine media content being viewed by a media player device, the media player device can decode video data (and in some cases, audio data) associated with a video program. The media player device may place the decoded content of each frame of video into a video frame buffer in preparation for display or for further processing of the pixel information of the video frame. The media player device can process the buffered video data and can generate an unknown data point (which can be referred to as a "cue point") representing an unknown video segment currently being played by the player device. The matching server can receive the unknown cue point and can compare it to stored candidate cue points to determine a match between a candidate video segment and the unknown video segment.

The viewing data may then be processed to determine an accuracy score. In such examples, the media viewing data (sometimes referred to as viewing behavior) may indicate the media content that the media player device is playing. In some examples, the accuracy score may indicate whether the device mapping system assigned similar categories to devices with similar media content playback. In such an example, the device map may be compared against device-to-category designations that are assigned at random to determine the accuracy of the device mapping system. The device mapping system may be determined to be accurate if the device map is determined to have a sufficiently low probability of being merely random in nature.
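The comparison against random category assignment can be pictured as a permutation test. The sketch below is illustrative only: the disclosure does not name a permutation test, and the function names, the spread statistic, and the sample data are assumptions made for this example. It estimates how often a random relabeling of devices separates viewing hours across categories at least as well as the observed device map does.

```python
import random
from statistics import mean

def between_category_spread(hours, labels):
    """Variance of the per-category mean viewing hours (hypothetical statistic)."""
    groups = {}
    for h, c in zip(hours, labels):
        groups.setdefault(c, []).append(h)
    means = [mean(g) for g in groups.values()]
    grand = mean(means)
    return sum((m - grand) ** 2 for m in means) / len(means)

def random_assignment_probability(hours, labels, trials=2000, seed=0):
    """Fraction of random relabelings whose spread is at least the observed spread."""
    rng = random.Random(seed)
    observed = between_category_spread(hours, labels)
    shuffled = list(labels)
    hits = 0
    for _ in range(trials):
        rng.shuffle(shuffled)  # random device-to-category designation
        if between_category_spread(hours, shuffled) >= observed:
            hits += 1
    return hits / trials

# Hypothetical data: monthly viewing hours per device and the mapped category.
hours = [1, 2, 1, 2, 10, 11, 10, 11]
labels = ["a", "a", "a", "a", "b", "b", "b", "b"]
probability = random_assignment_probability(hours, labels)
```

A small probability suggests the device map is unlikely to be merely random; a value near 1 suggests the categories carry no information about viewing behavior.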

In some examples, the device mapping system may use the accuracy score to improve its process for generating the device map. For example, the device mapping system may modify one or more operations to attempt to improve the accuracy score.

In some examples, there is provided a system comprising: one or more processors and one or more non-transitory machine-readable storage media embodying instructions that, when executed on the one or more processors, cause the one or more processors to perform operations. The operations include obtaining a plurality of categories assigned to a group of media player devices. The plurality of categories are determined using a device mapping system. A category includes a classification of the group of media player devices. The operations also include determining a viewing behavior of the group of media player devices. The viewing behavior is determined using automatic content recognition by matching media content viewed by the media player devices with stored media content. The operations further include determining a correlation between the plurality of categories of the media player device group and the viewing behavior, and determining an accuracy score for the device mapping system using the determined correlation. The operations further include assigning the accuracy score to the device mapping system, wherein the accuracy score is used to improve the device mapping system.

In some examples, there is provided a method comprising obtaining a plurality of categories assigned to a group of media player devices. The plurality of categories are determined using a device mapping system. A category includes a classification of the group of media player devices. The method also includes determining a viewing behavior of the group of media player devices. The viewing behavior is determined using automatic content recognition by matching media content viewed by the media player devices with stored media content. The method also includes determining a correlation between the plurality of categories of the media player device group and the viewing behavior, and determining an accuracy score for the device mapping system using the determined correlation. The method also includes assigning the accuracy score to the device mapping system, wherein the accuracy score is used to improve the device mapping system.

In some examples, there is provided a computer program product, tangibly embodied in a non-transitory machine-readable storage medium, comprising instructions that, when executed by one or more processors, cause the one or more processors to: obtain a plurality of categories assigned to a group of media player devices, wherein the plurality of categories are determined using a device mapping system, and wherein a category comprises a classification of the group of media player devices; determine a viewing behavior of the group of media player devices, wherein the viewing behavior is determined using automatic content recognition by matching media content viewed by the media player devices with stored media content; determine a correlation between the plurality of categories of the group of media player devices and the viewing behavior; determine an accuracy score of the device mapping system using the determined correlation; and assign the accuracy score to the device mapping system, wherein the accuracy score is used to improve the device mapping system.

In some aspects, the correlation between the plurality of categories of the group of media player devices and the viewing behavior is based on a degree of difference (variance) of the viewing behavior in the plurality of categories.

In some aspects, determining the accuracy score of the device mapping system includes performing a statistical hypothesis test to determine whether the correlation between the plurality of categories of the media player device group and the viewing behavior is random.

In some aspects, the systems, methods, and computer program products include comparing results of statistical hypothesis testing to a randomness threshold; and determining that the correlation is random when the result is less than the randomness threshold.

In some aspects, an accuracy score is determined for the device mapping system based on a comparison of the result of the statistical hypothesis test to the randomness threshold.
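As a rough illustration of the hypothesis test and threshold comparison, the sketch below computes a one-way ANOVA F-ratio over viewing hours grouped by category and flags the correlation as random when the F-ratio falls below a threshold. The choice of the F-ratio as the test result, the function names, and the threshold value are assumptions for this example, not details confirmed by the aspects above.

```python
def f_ratio(groups):
    """One-way ANOVA F-ratio: between-category variance over within-category variance.

    groups: list of lists, one list of viewing hours per category.
    Assumes at least two categories and some within-category variation.
    """
    k = len(groups)
    n = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n
    group_means = [sum(g) / len(g) for g in groups]
    ss_between = sum(len(g) * (m - grand_mean) ** 2
                     for g, m in zip(groups, group_means))
    ss_within = sum((x - m) ** 2
                    for g, m in zip(groups, group_means) for x in g)
    return (ss_between / (k - 1)) / (ss_within / (n - k))

def correlation_is_random(groups, randomness_threshold):
    """Treat the correlation as random when the result is below the threshold."""
    return f_ratio(groups) < randomness_threshold

# Hypothetical viewing hours for three categories.
separated = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
overlapping = [[1, 2, 3], [1, 2, 3]]
HYPOTHETICAL_THRESHOLD = 4.0  # illustrative value only
```

With these inputs the well-separated categories yield a large F-ratio, while identical categories yield an F-ratio of zero and are flagged as random.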

In some aspects, the media content is video content, and performing the automatic content recognition comprises: receiving a pixel cue point associated with a frame of an unknown video segment, wherein the pixel cue point comprises a set of pixel values corresponding to the frame; identifying a candidate reference data point in a database of reference data points, wherein the candidate reference data point is similar to the pixel cue point, and wherein the candidate reference data point comprises one or more pixel values corresponding to a candidate frame of a candidate video segment; adding a token to a bin associated with the candidate reference data point and the candidate video segment; determining whether a number of tokens in the bin exceeds a value; and identifying the unknown video segment as matching the candidate video segment when the number of tokens in the bin exceeds the value.

In some aspects, the viewing behavior comprises at least one or more of: an amount of time that the media player device group views one or more of a plurality of channels, income associated with a user of the media player device group, an age group of the user of the media player device group, an education level of the user of the media player device group, or a number of devices in the media player device group.

This summary is not intended to identify key or essential features of the claimed subject matter, nor is it intended to be used alone to determine the scope of the claimed subject matter. The subject matter should be understood by reference to the entire specification of this patent, any or all of the drawings, and appropriate portions of each claim.

The foregoing and other features and embodiments will become more apparent by reference to the following specification, claims and appended drawings.

Drawings

Illustrative embodiments of the invention are described in detail below with reference to the following drawings:

FIG. 1 illustrates an example of a system for updating a device map classification system;

FIG. 2A shows an example of a graph comparing viewing time with channel variance for a first source;

FIG. 2B shows an example of a graph comparing viewing time with channel variance for a second source;

FIG. 2C shows an example of a graph comparing viewing time with channel variance for a third source;

FIG. 3 shows an example of calculating f-ratios for various sources;

FIG. 4 shows an example of a process for assigning accuracy scores to a device map matching process;

FIG. 5 shows an example of a process for evaluating the statistical correlation of a plurality of devices with predicted statistical attributes;

FIG. 6 shows an example of a process for comparing predicted viewing behavior with actual viewing measured by an automatic content recognition component;

FIG. 7 shows an example of a block diagram of a matching system for identifying video content being viewed by a media system;

FIG. 8 shows an example of a process flow for various devices;

FIG. 9 shows an example of income code versus viewing hours per month for a first match rate;

FIG. 10 shows an example of income code versus viewing hours per month for a second match rate; and

FIG. 11 shows an example of income code versus viewing hours per month for media devices found only in data set 2.

Detailed Description

In the following description, for purposes of explanation, specific details are set forth in order to provide a thorough understanding of the embodiments of the invention. It may be evident, however, that the various embodiments may be practiced without these specific details. The drawings and description are not to be taken in a limiting sense.

The following description provides exemplary embodiments only, and is not intended to limit the scope, applicability, or configuration of the disclosure. Rather, the ensuing description of the exemplary embodiments will provide those skilled in the art with an enabling description for implementing an exemplary embodiment. It should be understood that various changes may be made in the function and arrangement of elements without departing from the spirit and scope of the invention as set forth in the appended claims.

In the following description, specific details are given to provide a thorough understanding of the embodiments. However, one of ordinary skill in the art will understand that embodiments may be practiced without these specific details. For example, circuits, systems, networks, processes, and other components may be shown in block diagram form in order not to obscure the embodiments in unnecessary detail. In other instances, well-known circuits, processes, algorithms, structures, and techniques may be shown without unnecessary detail in order to avoid obscuring the embodiments.

Furthermore, it is noted that the various embodiments may be described as a process which is depicted as a flowchart, a flow diagram, a data flow diagram, a structure diagram, or a block diagram. Although a flowchart may describe the operations as a sequential process, many of the operations can be performed in parallel or concurrently. In addition, the order of the operations may be rearranged. A process terminates when its operations are completed, but there may be additional steps not included in the figure. A process may correspond to a method, a function, a procedure, a subroutine, a subprogram, etc. When a process corresponds to a function, its termination may correspond to a return of the function to the calling function or the main function.

The terms "machine-readable storage medium" or "computer-readable storage medium" include, but are not limited to, portable or non-portable storage devices, optical storage devices, and various other media capable of storing, containing, or carrying instruction(s) and/or data. A machine-readable or computer-readable storage medium may include a non-transitory medium that may store data and that does not include carrier waves and/or transitory electronic signals propagated over a wireless or wired connection. Examples of non-transitory media may include, but are not limited to, magnetic disks or tapes, optical storage media such as Compact Disks (CDs) or Digital Versatile Disks (DVDs), flash memory, or memory devices. A computer program product may include code and/or machine-executable instructions that may represent any combination of procedures, functions, subprograms, programs, routines, subroutines, modules, software packages, classes or instructions, data structures, or program statements. A code segment may be coupled to another code segment or a hardware circuit by passing and/or receiving information, data, arguments, parameters, or memory contents. Information, arguments, parameters, data, or other information may be passed, forwarded, or transmitted using any suitable means, including memory sharing, message passing, token passing, network transmission, or other transmission techniques.

Furthermore, embodiments may be implemented by hardware, software, firmware, middleware, microcode, hardware description languages, or any combination thereof. When implemented in software, firmware, middleware, or microcode, the program code or code segments (e.g., computer program products) to perform the necessary tasks may be stored in a machine-readable medium. A processor may perform the necessary tasks.

Some of the systems depicted in the figures may be provided in various configurations. In some embodiments, the system may be configured as a distributed system, where one or more components of the system are distributed over one or more networks in a cloud computing system.

Methods, devices, and computer program products are provided for determining the accuracy of a device mapping system. In some examples, a device mapping system accuracy score may be determined. In some cases, the accuracy score may be based on a device map of a device mapping system. In this case, the device map may include information linking media player devices (also referred to as "devices" or "player devices" or "media devices") that are categorized or associated together. In some examples, a device (or "player device" or "media player device") may be defined as a network-connected device, such as a smartphone, tablet, smart television, laptop, smart watch, or other wearable device, or any other network-connected device that can receive and display media content (e.g., an internet-connected, broadband network-connected, cellular network-connected, or other network-connected device). In some examples, the device map may be generated based on assigning one or more category segments (or "categories") to each device included in the device map. In such examples, the category segments or categories may include demographic attributes, such as annual household income, age group, education level, number of televisions, and/or various preferences regarding entertainment selection, or any suitable combination thereof. However, it should be appreciated that a category segment or category may be any logical group that may associate multiple devices together.

In some examples, the accuracy score may be further based on viewing data from an Automatic Content Recognition (ACR) component or other system that can determine media content being viewed by one or more media players. In some cases, the media content may include video content (which may include audio content) or audio content. The media content may be processed and analyzed (e.g., using an ACR system) to determine which media content is being viewed by one or more media players, which may be stored as viewing data. The viewing data may then be processed to determine an accuracy score. In such an example, the viewing data (sometimes referred to as viewing behavior) may indicate the media content that the media player device is playing. In some examples, the accuracy score may indicate whether the device mapping system assigned similar categories to devices with similar media content playback. In such an example, the device map may be determined to be random, indicating that the device mapping system is inaccurate. In contrast, if the device map is determined to have a sufficiently low probability of being merely random in nature, then the device mapping system may be determined to be accurate.

FIG. 1 shows an example of a system for updating a device map classification system 120 (sometimes referred to as a device mapping system). In some examples, the system may include one or more devices 110, the device map classification system 120, a viewing behavior system 130, or any combination thereof. It should be appreciated that one or more components of the system may be combined into fewer components or divided into more components.

In some examples, data from one or more devices 110 may be processed by one or more components of the system, including the device map classification system 120 and the viewing behavior system 130. The one or more devices 110 may include a laptop (e.g., laptop 112), a tablet (e.g., first tablet 114 or second tablet 115), a telephone (e.g., smartphone 116), a television (e.g., television 118), or any other network-connected device that may receive and display media content (e.g., auditory or visual content). In some examples, one or more devices 110 may be included in one or more networks.

As described above, data from one or more devices 110 may be processed by the device map classification system 120. Processing may include assigning one or more category segments to each of the one or more devices 110 (e.g., using the category segment generator 122), generating a device map for the one or more devices 110 (e.g., using the device map generator 124), and generating a data report for the device map (e.g., using the data report generator 126). In some examples, each device in the device map may be assigned at least one category segment (also referred to as a category). In such an example, the category segment assigned to a device may indicate that the device ranks above a threshold for one or more behaviors and/or one or more characteristics associated with the category segment. In some examples, the device map may indicate links or associations between multiple devices. Illustrative examples of data from one or more devices 110 may include cookies from a browser and IP addresses.

In some examples, the data report generator 126 may generate a data report for the device map and/or for one or more category segments. In such an example, the data report may include information for each of the one or more devices 110 and its corresponding category segments. In one illustrative example, the data report may include information about device type (e.g., to distinguish a smart TV from a mobile tablet) so that playback of television programs (e.g., broadcast television, streaming television, or other television programming) can be differentiated by device. For example, it may be useful to determine whether particular media content is being viewed on a television at home or on a handheld device. There are many other uses for information about category segments that are well known to those skilled in the art.

As described above, data from one or more devices 110 may also be processed by the viewing behavior system 130. In some examples, the viewing behavior system 130 can include an Automatic Content Recognition (ACR) engine 132. The ACR engine 132 can identify media content (e.g., auditory or visual content) that is displayed or played on a device (e.g., a device of the one or more devices 110). In such an example, the ACR engine 132 can also identify channels or other metadata associated with the media content.
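One way to picture the relationship between devices, category segments, and the device map is the small data-structure sketch below. The class name, field names, and the rule that devices sharing a category segment are linked are all assumptions made for illustration; the description does not specify how the device map generator 124 represents links.

```python
from dataclasses import dataclass, field

@dataclass
class Device:
    """Hypothetical record for one network-connected device."""
    device_id: str
    ip_address: str
    categories: set = field(default_factory=set)  # assigned category segments

def build_device_map(devices):
    """Group device IDs by shared category segment (illustrative linking rule)."""
    device_map = {}
    for device in devices:
        for category in device.categories:
            device_map.setdefault(category, set()).add(device.device_id)
    return device_map

# Hypothetical devices and category segments.
devices = [
    Device("tv-118", "203.0.113.7", {"income:high", "age:35-44"}),
    Device("tablet-114", "203.0.113.7", {"income:high"}),
    Device("phone-116", "198.51.100.2", {"age:18-24"}),
]
device_map = build_device_map(devices)
```

Here the television and the tablet end up linked through the shared "income:high" segment, while the phone is linked to no other device.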

While there are many ways in which media content can be identified, one method (described in more detail below with reference to Fig. 7) can include receiving a pixel cue point associated with a frame of an unknown video segment. In some examples, the pixel cue point may include a set of pixel values corresponding to the frame. The method may also include identifying a candidate reference data point in a database of reference data points. In some examples, the candidate reference data point may be similar to the pixel cue point. In such an example, the candidate reference data point can include one or more pixel values corresponding to a candidate frame of a candidate video segment. The method can also include adding a marker to a bin associated with the candidate reference data point and the candidate video segment, and determining whether the number of markers in the bin exceeds a value. The method may further include identifying the unknown video segment as matching the candidate video segment when the number of markers in the bin exceeds the value. Identifying the unknown video segment as the candidate video segment indicates that the media device is playing the candidate video segment.

Fig. 7 illustrates an example of a block diagram of a matching system 700 (e.g., the ACR engine 132) for identifying video content being viewed by a media system. In some examples, the unknown content may include one or more unknown data points. In such an example, the matching system 700 can match the unknown data points with reference data points to identify an unknown video segment associated with the unknown data points. The reference data points may be included in a reference database 716.

The matching system 700 (e.g., the ACR engine 132) can include a player device 702 and a matching server 704. The player device 702 can include a media client 706, an input device 708, an output device 710, and one or more contextual applications 726. The media client 706, which may be a television system, a computer system, or another electronic device capable of connecting to the internet, may decode data (e.g., broadcast signals, data packets, or other frame data) associated with a video program 728. The media client 706 may place the decoded content of each video frame into a video frame buffer, either for display or in preparation for further processing of the frame's pixel information. In some examples, the player device 702 may be any electronic decoding system capable of receiving and decoding video signals. The player device 702 may receive the video program 728 and store the video information in a video buffer (not shown). The player device 702 may process the video buffer information and generate unknown data points (which may be referred to as "cue points"). The media client 706 may transmit the unknown data points to the matching server 704 for comparison with reference data points in the reference database 716.

Input device 708 may include any suitable device that allows requests or other information to be input to media client 706. For example, input device 708 may include a keyboard, a mouse, a voice recognition input device, a wireless interface for receiving wireless input from a wireless device (e.g., from a remote control, a mobile device, or other suitable wireless device), or any other suitable input device. Output device 710 may include any suitable device that can present or otherwise output information, such as a display, a wireless interface for transmitting wireless output to a wireless device (e.g., a mobile device or other suitable wireless device), a printer, or other suitable output device.

The matching system 700 can begin the process of identifying video segments by first collecting data samples from known video data sources 718. For example, the matching server 704 may collect data from a variety of video data sources 718 to build and maintain the reference database 716. The video data sources 718 may include media providers of television programs, movies, or any other suitable video source. Video data from the video data sources 718 may be provided as an over-the-air broadcast, a cable television channel, a streaming source from the internet, or from any other video data source. In some examples, as described below, the matching server 704 may process video received from the video data sources 718 to generate and collect reference video data points in the reference database 716. In some examples, video programs from the video data sources 718 may be processed by a reference video program ingestion system (not shown) that may generate reference video data points and send them to the reference database 716 for storage. The reference data points can be used as described above to determine information that is subsequently used to analyze unknown data points.

The matching server 704 may store reference video data points for each video program received over a period of time (e.g., days, weeks, months, or any other suitable period of time) in the reference database 716. The matching server 704 may build and continuously or periodically update the reference database 716 of television programming samples (e.g., including reference data points, which may also be referred to as cues or cue values). In some examples, the collected data is a compressed representation of video information sampled from periodic video frames (e.g., every fifth video frame, every tenth video frame, every fifteenth video frame, or another suitable interval). In some examples, several bytes of data per frame (e.g., 25 bytes, 50 bytes, 75 bytes, 100 bytes, or any other number of bytes per frame) may be collected for each program source. Any number of program sources may be used to obtain video, such as 25 channels, 50 channels, 75 channels, 100 channels, 200 channels, or any other number of program sources.

Media client 706 can send communication 722 to matching engine 712 of matching server 704. The communication 722 may include a request to the matching engine 712 to identify unknown content. For example, the unknown content may include one or more unknown data points, and the reference database 716 may include a plurality of reference data points. The matching engine 712 may identify unknown content by matching unknown data points to reference data in the reference database 716. In some examples, the unknown content may include unknown video data presented by the display (for video-based ACRs), search queries (for MapReduce systems, Bigtable systems, or other data storage systems), unknown images of faces (for face recognition), unknown images of patterns (for pattern recognition), or any other unknown data that may be matched against a database of reference data. The reference data points may be derived from data received from the video data source 718. For example, data points may be extracted from information provided by the video data source 718 and may be indexed and stored in the reference database 716.

The matching engine 712 may send a request to the candidate determination engine 714 to determine candidate data points from the reference database 716. The candidate data points may be reference data points that are a certain determined distance from the unknown data points. In some examples, the distance between the reference data point and the unknown data point may be determined by comparing one or more pixels of the reference data point (e.g., a single pixel, a value representing a group of pixels (e.g., an average, mean, median, or other value), or other suitable number of pixels) to one or more pixels of the unknown data point. In some examples, the reference data point may be a particular determined distance from the unknown data point when the pixel at each sample location is within a particular range of pixel values.

In one illustrative example, the pixel values of a pixel may include a red value, a green value, and a blue value (in a red-green-blue (RGB) color space). In such an example, a first pixel (or a value representing a first group of pixels) may be compared to a second pixel (or a value representing a second group of pixels) by comparing the corresponding red, green, and blue values and ensuring that each pair of values differs by no more than a particular amount (e.g., by 0 to 5 values). For example, on a scale of 0-255 values, a first pixel may be matched to a second pixel when (1) the red value of the first pixel is within 5 values (plus or minus) of the red value of the second pixel, (2) the green value of the first pixel is within 5 values (plus or minus) of the green value of the second pixel, and (3) the blue value of the first pixel is within 5 values (plus or minus) of the blue value of the second pixel. In such an example, a candidate data point is a reference data point that approximately matches the unknown data point, so a plurality of candidate data points (associated with different media segments) may be identified for the unknown data point. The candidate determination engine 714 may return the candidate data points to the matching engine 712.
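The per-channel tolerance comparison described above can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the implementation of the candidate determination engine 714: the function names `pixels_match` and `is_candidate`, the tuple representation of a pixel, and the default tolerance of 5 are all chosen here for illustration.

```python
from typing import Sequence, Tuple

RGB = Tuple[int, int, int]  # assumed representation: (red, green, blue), each 0-255


def pixels_match(first: RGB, second: RGB, tolerance: int = 5) -> bool:
    """Return True when each of the red, green, and blue values of `first`
    is within +/- `tolerance` of the corresponding value of `second`."""
    return all(abs(a - b) <= tolerance for a, b in zip(first, second))


def is_candidate(unknown: Sequence[RGB], reference: Sequence[RGB],
                 tolerance: int = 5) -> bool:
    """Treat a reference data point as a candidate when the pixel at every
    sample location matches the unknown data point within the tolerance."""
    if len(unknown) != len(reference):
        return False
    return all(pixels_match(u, r, tolerance) for u, r in zip(unknown, reference))
```

A stricter or looser tolerance changes how many candidate data points are returned for each unknown data point, which in turn affects how many bins receive markers in the step that follows.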

For a candidate data point, the matching engine 712 can add a marker to the bin that is associated with the candidate data point and assigned to the identified video segment from which the candidate data point originated. Corresponding markers may be added to all bins corresponding to the identified candidate data points. As the matching server 704 receives more unknown data points (corresponding to the unknown content being viewed) from the player device 702, a similar candidate data point determination process may be performed, and markers may be added to the bins corresponding to the identified candidate data points. Only one of the bins corresponds to the segment of the unknown video content being viewed; the other bins correspond to candidate data points that match because of similar data point values (e.g., similar pixel color values) but do not correspond to the actual segment being viewed. The bin for the segment being viewed will have more markers assigned to it than the bins for segments that are not being viewed. For example, as more unknown data points are received, a greater number of reference data points corresponding to that bin are identified as candidate data points, resulting in more markers being added to the bin. Once a bin includes a particular number of markers, the matching engine 712 can determine that the video segment associated with the bin is currently being displayed on the player device 702. The video segment may comprise an entire video program or a portion of a video program. For example, a video segment may be a video program, a scene of a video program, one or more frames of a video program, or any other portion of a video program. An example of a system for identifying media content is described in U.S. patent application No. 15/240,801, which is incorporated herein by reference in its entirety for all purposes.
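The marker-and-bin voting described above can be illustrated with a short Python sketch. It is a simplification under stated assumptions: one bin is kept per video segment, candidates are found by a brute-force scan rather than an indexed lookup, and the names `identify_segment`, `reference_db`, and `threshold` are hypothetical and do not come from the matching engine 712.

```python
from collections import Counter
from typing import Dict, Iterable, List, Optional, Tuple

DataPoint = Tuple[int, ...]  # assumed: a flat tuple of sampled pixel values


def identify_segment(unknown_points: Iterable[DataPoint],
                     reference_db: Dict[str, List[DataPoint]],
                     threshold: int,
                     tolerance: int = 5) -> Optional[str]:
    """Add one marker to a segment's bin for every reference data point that
    approximately matches an incoming unknown data point. Return the segment
    whose bin first exceeds `threshold` markers, or None if no bin does."""
    bins: Counter = Counter()  # segment id -> number of markers in its bin
    for unknown in unknown_points:
        for segment_id, reference_points in reference_db.items():
            for reference in reference_points:
                close = all(abs(u - r) <= tolerance
                            for u, r in zip(unknown, reference))
                if close:
                    bins[segment_id] += 1
                    if bins[segment_id] > threshold:
                        return segment_id
    return None
```

Bins for segments that merely resemble the viewed content also collect markers, but more slowly, so the threshold controls the trade-off between how quickly a segment is identified and how confident that identification is.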

Referring back to Fig. 1, in some examples, the ACR engine 132 can output an identification of the identified media content (e.g., a video segment being viewed by a media device). In such an example, the ACR engine 132 may send to the statistical correlator 136 the identification of the media content, an identification of the device from which the media content was received, and any other metadata associated with the media content (e.g., the channel on which the media content is playing).

In some examples, the statistical correlator 136 may evaluate the device map using the viewing data output from the ACR engine 132 to determine correlations between categories generated using the device map classification system 120 and viewing behaviors of groups of devices assigned to different categories. In some cases, statistical correlator 136 may determine whether the device mapping system that generated the device map is accurate. Fig. 2A, 2B, and 2C illustrate examples of graphs for different device mapping systems (e.g., source a, source B, and source C).

In one illustrative example, a graph may include an x-axis for average channel viewing time (i.e., total time spent) and a y-axis for channel variance. In such an example, the channel variance may indicate the variance in viewing time between different channels. In some examples, each point (e.g., circle) on the graph may represent a category segment or category (e.g., as described above with respect to the category segment generator 122). For example, when a category segment is for a household (e.g., multiple devices), a circle at (2, 10) may indicate that the household was active for 2 units of time (e.g., hours, minutes, seconds, etc.), during which one or more channels were watched 10 units of time (e.g., hours, minutes, seconds, etc.) more than one or more other channels. As another example, when a category segment is for a single device, a circle at (2, 10) may indicate that the device was active for 2 units of time (e.g., hours, minutes, seconds, etc.), during which one or more channels were viewed 10 units of time (e.g., hours, minutes, seconds, etc.) more than one or more other channels.

Although the examples shown in Figs. 2A, 2B, and 2C use average channel viewing time and channel variance, one of ordinary skill will appreciate that the statistical correlator 136 may use any viewing behavior other than channel viewing time (e.g., the type of viewing, such as Digital Video Recording (DVR) viewing or Video On Demand (VOD) viewing, or the time at which viewing occurs).

In some examples, the statistical correlator 136 can perform a statistical evaluation of the viewing data (e.g., viewing time of a video segment) from the ACR engine 132 and the device maps from the device map classification system 120. The statistical evaluation may represent how accurately the device map predicts the viewing data detected by the ACR engine 132. For example, the statistical evaluation may indicate whether there is a correlation between devices with similar viewing data and the categories assigned to those devices. As another example, the statistical correlator 136 may determine how channel viewing varies between category segments. It should be appreciated that the statistical evaluation may be performed using any suitable statistical evaluation technique, including, for example, analysis of variance (ANOVA), a chi-squared test, an F-test, a t-test, any combination thereof, and the like. For illustrative purposes, ANOVA is used herein as an example. However, one of ordinary skill will appreciate that the statistical correlator 136 may use any other suitable statistical evaluation test to determine correlation.

ANOVA can be used to analyze the differences between the means (or averages) of logical groups. In some examples, a mean of the information associated with the ACR engine 132 may be calculated for each category segment received from the device map classification system 120. For example, for each device, the variance in viewing time between different channels may be calculated (e.g., as shown in Figs. 2A, 2B, and 2C). For each category segment, the variance may be averaged across the devices in that segment to calculate a mean variance. The mean variance may be the mean of the information. In another example, for each household, a mean of the information associated with the ACR engine 132 may be calculated based on the composite device map for the household.

In some examples, ANOVA may compare two types of variance: the variance within each category segment and the variance between different category segments. To calculate the variance, the sum of squares (SS) between the different category segments (referred to as "SS between") can be calculated, in the standard one-way ANOVA form:

$$SS_{between} = \sum_{i=1}^{k} n_i \left(\bar{x}_i - \bar{x}\right)^2$$

where $\bar{x}_i$ is the mean of category segment $i$, $\bar{x}$ is the overall mean, and $n_i$ is the number of values in category segment $i$. In some examples, the variance between different category segments (referred to as "variance between") may be calculated:

$$\text{variance}_{between} = \frac{SS_{between}}{k - 1}$$

where $k$ is the number of different samples. In some examples, the variance within each category segment (referred to as "variance within") may also be calculated. In one illustrative example, the variance within may be obtained using the following equation:

$$\text{variance}_{within} = \frac{\sum_{i=1}^{k} \sum_{j=1}^{n_i} \left(x_{ij} - \bar{x}_i\right)^2}{N - k}$$

where $x_{ij}$ is the $j$-th value in category segment $i$ and $N$ is the total number of values across all category segments.

After calculating the variance between category segments (the "variance between") and the variance within each category segment (the "variance within"), the F-ratio may be calculated. The F-ratio may be based on the variance between and the variance within:

$$F = \frac{\text{variance}_{between}}{\text{variance}_{within}}$$

The F-ratio may indicate an amount of randomness of the data. In some examples, a critical value of the F-ratio may be identified such that when the F value is less than the critical value, the device mapping system fails the verification (i.e., the data is identified as random). Fig. 3 shows an example of calculating the F-ratio for each of the sources described in Figs. 2A, 2B, and 2C. As can be seen, the F-ratios in Fig. 3 indicate that source A and source C (from Figs. 2A and 2C) pass verification with F > Fcrit, while source B (from Fig. 2B) does not. In some examples, the critical value may be adjusted according to the device mapping system being analyzed. In some examples, the lower the F-ratio, the better the implied quality of the underlying match.
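The F-ratio computation can be sketched in plain Python using the standard one-way ANOVA formulas. This is an illustrative sketch only: the function names `f_ratio` and `passes_verification` and the dictionary layout are assumptions, the critical value is supplied by the caller (for example, looked up from an F-distribution table for the chosen significance level), and the code assumes at least two category segments and nonzero variance within the segments.

```python
from statistics import mean
from typing import Dict, List


def f_ratio(groups: Dict[str, List[float]]) -> float:
    """Compute the one-way ANOVA F-ratio for values grouped by category segment.

    F = (SS_between / (k - 1)) / (SS_within / (N - k)), where k is the number
    of category segments and N is the total number of values.
    """
    all_values = [v for values in groups.values() for v in values]
    overall_mean = mean(all_values)
    k = len(groups)
    n_total = len(all_values)

    ss_between = sum(len(values) * (mean(values) - overall_mean) ** 2
                     for values in groups.values())
    ss_within = sum((v - mean(values)) ** 2
                    for values in groups.values() for v in values)

    variance_between = ss_between / (k - 1)
    variance_within = ss_within / (n_total - k)
    return variance_between / variance_within


def passes_verification(f_value: float, f_critical: float) -> bool:
    """Fail verification (data treated as random) when F is below the critical value."""
    return not f_value < f_critical
```

For three segments with values `[1, 2, 3]`, `[2, 3, 4]`, and `[5, 6, 7]`, SS between is 26 and SS within is 6, giving a variance between of 13, a variance within of 1, and an F-ratio of 13.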

In some examples, the statistical correlator 136 may send a message to the device map classification system 120 (e.g., to the category segment generator 122 or the device map generator 124). In such an example, the message may indicate whether the data report is satisfactory in view of the media content viewed on the one or more devices 110. In some examples, the data report may be satisfactory when the categories are determined not to appear random based on the statistical evaluation performed by the statistical correlator 136. By scoring a device mapping system with this method, modifications to the device mapping system can be evaluated by comparing the resulting accuracy score with other accuracy scores to determine the progress of the modifications.

Fig. 4 shows an example of a process 400 for assigning an accuracy score to a device mapping system. In some examples, process 400 may be performed by a computer system.

Process 400 is illustrated as a logical flow diagram, whose operations represent a sequence of operations that can be implemented in hardware, computer instructions, or a combination thereof. In the context of computer instructions, the operations represent computer-executable instructions stored on one or more computer-readable storage media that, when executed by one or more processors, perform the recited operations. Generally, computer-executable instructions include routines, programs, objects, components, data structures, etc. that perform particular functions or implement particular data types. The order in which the operations are described is not intended to be construed as a limitation, and any number of the described operations can be combined in any order and/or performed in parallel to implement a process.

Additionally, process 400 may be performed under the control of one or more computer systems configured with executable instructions and may be implemented as code (e.g., executable instructions, one or more computer programs, or one or more application programs) executing collectively on one or more processors, by hardware, or by a combination thereof. As described above, the code may be stored on a machine-readable storage medium, for example, in the form of a computer program comprising a plurality of instructions executable by one or more processors. The machine-readable storage medium may be non-transitory.

Process 400 may include obtaining a plurality of categories assigned to a group of media player devices (step 410). In some examples, the plurality of categories may be determined using a device mapping system. In such an example, a category can include a classification of the group of media player devices (e.g., a category segment, a device segment, a viewing segment, etc.), such as: an income associated with users of the group of media player devices, an age group of the users, an education level of the users, or a number of devices in the group of media player devices. In some examples, a media player device may be a network-connected device that can receive and display media content. Examples of a media player device include a smartphone, a tablet, a smart television, a laptop, or any other suitable network-connected device.

Process 400 may also include determining the viewing behavior of the group of media player devices (step 420). In some examples, the viewing behavior may include at least one or more of an amount of time that the group of media player devices spends viewing one or more of a plurality of channels, recorded programs (e.g., from a DVR), live programs, on-demand content, content from the internet (e.g., YouTube or Netflix), or a particular program type (e.g., sports or live TV), or any combination thereof. In some examples, automatic content recognition (ACR) may be used to determine the viewing behavior. For example, the ACR may match media content viewed by a media player device with stored media content. In such examples, the media content may be auditory or visual (e.g., audio, video, or still images).

In an example where the media content is video content, performing automatic content recognition can include: receiving a pixel cue point associated with a frame of an unknown video segment, wherein the pixel cue point includes a set of pixel values corresponding to the frame; identifying a candidate reference data point in a database of reference data points, wherein the candidate reference data point is similar to the pixel cue point, and wherein the candidate reference data point comprises one or more pixel values corresponding to a candidate frame of a candidate video segment; adding a marker to a bin associated with the candidate reference data point and the candidate video segment; determining whether the number of markers in the bin exceeds a value; and identifying the unknown video segment as matching the candidate video segment when the number of markers in the bin exceeds the value.

Process 400 may also include determining a correlation between the plurality of categories of the group of media player devices and the viewing behavior (step 430). In some examples, the correlation between the plurality of categories and the viewing behavior may be based on the variance in the viewing behavior across the plurality of categories.

Process 400 may also include determining an accuracy score for the device mapping system using the determined correlation (step 440). In some examples, determining the accuracy score of the device mapping system includes performing a statistical hypothesis test (e.g., the F-ratio test described above) to determine whether the correlation between the plurality of categories of the group of media player devices and the viewing behavior is random. In some examples, process 400 may also include comparing the result of the statistical hypothesis test to a randomness threshold (sometimes referred to as a cut-off value), and determining that the correlation is random when the result is less than the randomness threshold. In some examples, the accuracy score of the device mapping system may be determined based on the comparison of the result of the statistical hypothesis test to the randomness threshold.

Process 400 may also include assigning (or sending) the accuracy score to the device mapping system (step 450). In some examples, the accuracy score may be used to improve the device mapping system. For example, an optimization algorithm (such as hill climbing) may be used to compare the accuracy score with an updated accuracy score, where the updated accuracy score is determined after updating one or more parameters of the device mapping system.
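The hill-climbing comparison of accuracy scores mentioned above might look like the following Python sketch. Everything here is hypothetical: the text does not specify what the parameters of a device mapping system are, so they are modeled as a list of floats, and `score_fn` stands in for the whole pipeline that regenerates the device map from those parameters and returns its accuracy score.

```python
import random
from typing import Callable, List, Tuple


def hill_climb(params: List[float],
               score_fn: Callable[[List[float]], float],
               step: float = 0.1,
               iterations: int = 100,
               seed: int = 0) -> Tuple[List[float], float]:
    """Repeatedly perturb one parameter and keep the change only when the
    updated accuracy score is higher than the current accuracy score."""
    rng = random.Random(seed)  # seeded so that runs are repeatable
    best = list(params)
    best_score = score_fn(best)
    for _ in range(iterations):
        candidate = list(best)
        index = rng.randrange(len(candidate))
        candidate[index] += rng.choice((-step, step))
        updated_score = score_fn(candidate)
        if updated_score > best_score:  # compare updated score to current score
            best, best_score = candidate, updated_score
    return best, best_score
```

Because a change is accepted only when the score improves, the returned score is never lower than the starting score, although the search can stall at a local maximum.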

Fig. 5 shows an example of a process 500 for evaluating the statistical correlation of a plurality of devices to predicted statistical attributes. In some examples, process 500 may be performed by a computer system (such as the viewing behavior system 130).

Process 500 is illustrated as a logical flow diagram, whose operations represent a sequence of operations that can be implemented in hardware, computer instructions, or a combination thereof. In the context of computer instructions, the operations represent computer-executable instructions stored on one or more computer-readable storage media, which when executed by one or more processors perform the recited operations. Generally, computer-executable instructions include routines, programs, objects, components, data structures, etc. that perform particular functions or implement particular data types. The order in which the operations are described is not intended to be construed as a limitation, and any number of the described operations can be combined in any order and/or in parallel to implement a process.

Additionally, process 500 may be performed under the control of one or more computer systems configured with executable instructions and may be implemented as code (e.g., executable instructions, one or more computer programs, or one or more application programs) executing collectively on one or more processors, by hardware, or by a combination thereof. As described above, the code may be stored on a machine-readable storage medium (e.g., in the form of a computer program comprising a plurality of instructions executable by one or more processors). The machine-readable storage medium may be non-transitory.

Process 500 may include calculating a value for each of the one or more devices (step 510). In some examples, the value may be a number of hours the device is tuned to each of one or more channels available to the device. In some examples, one or more devices may be indicated by a device mapping system. In such an example, the device mapping system may provide an indication of one or more devices and a particular category (sometimes referred to as a category segment) for each of the one or more devices.

Process 500 may also include performing a statistical analysis on the values of each of the one or more devices to identify how channel viewing varies between the segments indicated by the device mapping system (step 520). In some examples, the statistical analysis may be an analysis of variance (ANOVA), a chi-squared test, an F-test, a t-test, or the like. If the statistical analysis is ANOVA, process 500 may further include: performing an F-test (or other suitable statistical analysis test or statistical hypothesis test) to determine whether there is a low amount or a high amount of variance between segments; determining, when there is a low amount of variance between segments, that the segments are poorly identified by the device mapping system (step 530); and determining, when there is a high amount of variance between segments, that the segments are correlated with viewing behavior (step 540).
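As a rough illustration of steps 510 and 520, the following Python sketch turns viewing events into hours per channel for each device and then groups those values by the segment that the device mapping system assigned. The event layout `(device_id, channel, seconds)`, the function names, and the choice to count an unwatched channel as 0.0 hours are assumptions made for this sketch.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple

ViewingEvent = Tuple[str, str, float]  # assumed: (device_id, channel, seconds viewed)


def hours_per_channel(events: Iterable[ViewingEvent]) -> Dict[str, Dict[str, float]]:
    """Step 510 (sketch): total hours each device was tuned to each channel."""
    totals: Dict[str, Dict[str, float]] = defaultdict(lambda: defaultdict(float))
    for device_id, channel, seconds in events:
        totals[device_id][channel] += seconds / 3600.0
    return {device: dict(channels) for device, channels in totals.items()}


def group_by_segment(device_hours: Dict[str, Dict[str, float]],
                     device_segments: Dict[str, str],
                     channel: str) -> Dict[str, List[float]]:
    """Collect per-device hours for one channel under each assigned segment,
    producing the grouped values that a test such as ANOVA takes as input."""
    groups: Dict[str, List[float]] = defaultdict(list)
    for device_id, segment in device_segments.items():
        hours = device_hours.get(device_id, {}).get(channel, 0.0)
        groups[segment].append(hours)
    return dict(groups)
```

The grouped output has one list of values per segment, so the variance between segments and within segments can be computed directly from it.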

Fig. 6 shows an example of a process 600 for comparing predicted viewing behavior with actual viewing measured by an automatic content recognition component. In some examples, process 600 may be performed by a computer system (such as viewing behavior system 130).

Process 600 is illustrated as a logical flow diagram, whose operations represent a sequence of operations that can be implemented in hardware, computer instructions, or a combination thereof. In the context of computer instructions, the operations represent computer-executable instructions stored on one or more computer-readable storage media, which when executed by one or more processors perform the recited operations. Generally, computer-executable instructions include routines, programs, objects, components, data structures, etc. that perform particular functions or implement particular data types. The order in which the operations are described is not intended to be construed as a limitation, and any number of the described operations can be combined in any order and/or in parallel to implement a process.

Additionally, process 600 may be performed under the control of one or more computer systems configured with executable instructions and may be implemented as code (e.g., executable instructions, one or more computer programs, or one or more application programs) executing collectively on one or more processors, by hardware, or by a combination thereof. As described above, the code may be stored on a machine-readable storage medium (e.g., in the form of a computer program comprising a plurality of instructions executable by one or more processors). The machine-readable storage medium may be non-transitory.

Process 600 may include obtaining a device map for one or more devices (step 610). In some examples, the one or more devices may each be a media device that is part of a household. In some examples, the device map may be generated by a third-party system. In such an example, the device map may be generated based on raw data (e.g., Internet Protocol (IP) traffic, such as use of a local area connection and/or the internet, including time spent on email, Facebook, YouTube, etc.). In other examples, the raw data may be assembled by collecting browser data, such as cookies and the results of other data mining activities, from the one or more devices.

In some examples, a request for the device map may include an indication of the one or more devices on which the device map should be based. In other examples, the request may include the raw data. One of ordinary skill in the art will recognize that the device map may be generated using proprietary processes that are known in the art. The data used to generate the device map may be derived from analyzing cookies that collect on a device as the user accesses various internet sites. In some examples, the type of an internet-connected device may be derived by remotely querying configuration information within the device.

Process 600 may also include associating (or mapping) the device map to media content viewing (step 620). In some examples, the associating may include associating an IP address of the device map with the IP address of a device being monitored for media content viewing (using some form of content identification). In such an example, the associating may also include associating the media content viewing detected from the device with the IP address of the device map. As an example of associating a device map with media content viewing, the device map may predict that the associated household likes to cook, because its members search websites featuring food recipes and kitchen tools, and that prediction is then mapped to the household's television viewing of food channels.
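The IP-address association in step 620 amounts to a join between device map entries and viewing records that share an IP address. The sketch below shows one way this could be done in Python; the record layouts (dictionaries with `ip`, `segments`, and `content` keys) and the function name `associate_by_ip` are hypothetical and chosen only for illustration.

```python
from collections import defaultdict
from typing import Dict, List


def associate_by_ip(device_map: List[dict],
                    viewing_records: List[dict]) -> Dict[str, dict]:
    """Join device map entries to detected media content viewing by IP address.

    Each device map entry is assumed to look like
        {"ip": "203.0.113.7", "segments": ["cooking"]}
    and each viewing record like
        {"ip": "203.0.113.7", "content": "food channel"}.
    """
    viewed_by_ip: Dict[str, List[str]] = defaultdict(list)
    for record in viewing_records:
        viewed_by_ip[record["ip"]].append(record["content"])

    associated: Dict[str, dict] = {}
    for entry in device_map:
        ip = entry["ip"]
        associated[ip] = {
            "segments": list(entry["segments"]),
            # Device map entries with no detected viewing get an empty list.
            "viewed": list(viewed_by_ip.get(ip, [])),
        }
    return associated
```

In the cooking example above, a household whose device map entry carries a cooking segment would end up paired with its detected food channel viewing, which is the pairing the later statistical evaluation relies on.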

Process 600 may also include generating a first database for viewing statistics based on data from one or more reference sources (e.g., one or more devices such as televisions in the home) (step 630). In some examples, the first database may be generated by a third party using a proprietary process of correlating internet activity collected from one or more devices of the device map. In such an example, the proprietary process does not use the actual viewing records. In some examples, a first database may be used to associate viewer interests with media content viewing. For example, a first database may associate media content viewing with product interests (such as a particular brand of automobile).

Process 600 may also include generating a second database for viewing of the video segment using automatic content recognition (step 640). In some examples, automatic content recognition (as described herein) may identify media content being viewed on one or more media devices. The identified media content may be analyzed to determine what one or more media devices are viewing. Based on the content being viewed, a second database may be generated to include information about the viewing behavior of one or more devices.

Process 600 also includes performing a statistical evaluation (e.g., statistical correlation as described above) using the first database and the second database (step 650). In some examples, the statistical evaluation may compare the first database and the second database.

Process 600 also includes evaluating the device map based on the statistical evaluation (step 660). For example, if the statistical evaluation indicates that one or more category segments were assigned at random, then the device map may be determined to be inaccurate. However, if the statistical evaluation indicates that one or more category segments are correlated with the viewing behavior, then the device map may be determined to be accurate. In some examples, process 600 may loop such that, when step 660 ends, process 600 repeats steps 630, 640, 650, and 660.
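Steps 650 and 660 can be sketched as follows. This is an illustration only: it assumes a permutation test as the statistical evaluation, which is a swapped-in technique and not necessarily the one used by the specification. The function names, the 0.01 threshold, and the sample data are hypothetical.

```python
import random
from statistics import mean


def between_group_spread(hours_by_device, category_by_device):
    """Size-weighted squared distance of each category's mean hours from the overall mean."""
    overall = mean(hours_by_device.values())
    groups = {}
    for device, category in category_by_device.items():
        groups.setdefault(category, []).append(hours_by_device[device])
    return sum(len(v) * (mean(v) - overall) ** 2 for v in groups.values())


def randomness_probability(hours_by_device, category_by_device, trials=2000, seed=0):
    """Estimate the probability that the observed category/viewing link is only random.

    Category labels are shuffled across devices; the estimate is the share of
    shuffles whose spread is at least as large as the observed spread.
    """
    rng = random.Random(seed)
    devices = list(category_by_device)
    labels = [category_by_device[d] for d in devices]
    observed = between_group_spread(hours_by_device, category_by_device)
    at_least_as_extreme = 0
    for _ in range(trials):
        rng.shuffle(labels)
        shuffled = dict(zip(devices, labels))
        if between_group_spread(hours_by_device, shuffled) >= observed:
            at_least_as_extreme += 1
    # Add-one correction keeps the estimate strictly above zero.
    return (at_least_as_extreme + 1) / (trials + 1)


def evaluate_device_map(p_value, threshold=0.01):
    """Step 660 under an assumed threshold: a low probability of randomness means accurate."""
    return "accurate" if p_value < threshold else "inaccurate"


# Illustrative data: ten heavy viewers (100-109 hours) and ten light viewers (20-29 hours).
devices = [f"dev{i}" for i in range(20)]
hours = {d: (100 + i if i < 10 else 10 + i) for i, d in enumerate(devices)}
# A map whose categories track viewing, and one whose categories ignore it.
good_map = {d: ("high" if i < 10 else "low") for i, d in enumerate(devices)}
random_like_map = {d: ("high" if i % 2 == 0 else "low") for i, d in enumerate(devices)}

p_good = randomness_probability(hours, good_map)
p_random = randomness_probability(hours, random_like_map)
```

The first database supplies the category per device and the second supplies the viewing hours. A device map whose categories separate heavy viewers from light viewers yields a small probability of randomness, while a map that alternates categories regardless of viewing does not.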

Fig. 8 shows an example of a process flow among various devices. In some examples, the process flow may include media device 801. Media device 801 may generate cue point data (sometimes referred to as a fingerprint) for a video program currently being displayed on media device 801. Media device 801 may send the cue point data to cue point manager 802. The cue point manager 802 may process the cue point data and/or identify the content being displayed on media device 801 using the cue point data and an automatic content recognition system (as described herein).

In some examples, the process flow may also include a cue point cache 806. Cue point cache 806 may be a storage device that supports ingestion (storage) of cue point data. The process flow may also include a real-time reference database 804. The real-time reference database 804 may be a database of television programming currently available on one or more television channels. Real-time reference database 804 may collect and process one or more television channels for comparison with the cue point data from media device 801 in order to identify the video segment currently being displayed on media device 801.
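The comparison between cue point data and the real-time reference database can be sketched as a nearest-match lookup. This is a simplified illustration, not the ACR system of the specification: it assumes a cue point is a short list of integer samples and that a sum of absolute differences with a fixed threshold is an adequate match test. `identify_segment`, `max_distance`, and the segment identifiers are hypothetical.

```python
def distance(a, b):
    """Sum of absolute differences between two equal-length cue points."""
    return sum(abs(x - y) for x, y in zip(a, b))


def identify_segment(cue_point, reference_db, max_distance=12):
    """Return the reference segment closest to the cue point, or None if none is close enough."""
    best_id, best_dist = None, None
    for segment_id, reference_cue in reference_db.items():
        d = distance(cue_point, reference_cue)
        if best_dist is None or d < best_dist:
            best_id, best_dist = segment_id, d
    if best_dist is not None and best_dist <= max_distance:
        return best_id
    return None


# Illustrative reference data for two channels currently on air.
reference_db = {
    "channel-5/news": [10, 200, 30, 40],
    "channel-9/cooking": [90, 15, 220, 60],
}
matched = identify_segment([88, 17, 219, 62], reference_db)
unknown = identify_segment([0, 0, 0, 0], reference_db)
```

Returning `None` when nothing falls within the threshold keeps unidentified content out of the viewing statistics instead of forcing it onto the nearest channel.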

In some examples, the process flow may also include a search router 803. The search router 803 may accept device map information for one or more devices in the home, such as devices 805A, 805B, 805C, and 805D, for use in associating the device map information with viewing information from the media device 801.

Figs. 9-11 show examples of charts associating household income with monthly television viewing hours. Each chart uses data from a different vendor, correlated with television viewing learned from direct measurement by the ACR system, in order to verify the quality of each vendor's data.

Fig. 9 shows an example of income code versus monthly viewing hours for a first match rate of 47%. It can be seen that the probability that the two factors are uncorrelated is small, at 9 × 10^-17. This probability indicates that the matching process is likely to be good.

Fig. 10 shows an example of income code versus monthly viewing hours for a second match rate of 62%. It can be seen that the probability that the two factors are uncorrelated is reduced by three orders of magnitude compared to Fig. 9. Thus, the second match overall reduces randomness in the system and is better than the first (lower scores are better).
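The comparison between vendors can be sketched as a ranking by probability of randomness, where a lower value is better. The vendor names and probabilities below are hypothetical placeholders chosen to differ by three orders of magnitude; they are not the measured values behind the figures.

```python
import math


def orders_of_magnitude_better(p_first, p_second):
    """How many powers of ten smaller p_second is than p_first."""
    return math.log10(p_first / p_second)


def rank_vendors(p_values):
    """Order vendors from lowest (best) to highest probability of randomness."""
    return sorted(p_values, key=p_values.get)


# Hypothetical probabilities that income code and viewing hours are uncorrelated.
p_values = {"vendor_1": 1e-17, "vendor_2": 1e-20}
improvement = orders_of_magnitude_better(p_values["vendor_1"], p_values["vendor_2"])
ranking = rank_vendors(p_values)
```

Working on a logarithmic scale makes very small probabilities comparable at a glance, which is why the comparison is stated in orders of magnitude.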

Fig. 11 shows an example of income code versus monthly viewing hours for media devices found only in data set 2.

In the foregoing specification, aspects of the present invention have been described with reference to specific embodiments thereof, but those skilled in the art will recognize that the present invention is not limited thereto. Various features and aspects of the above-described invention may be used alone or in combination. Moreover, embodiments may be utilized in any number of environments and applications beyond those described herein without departing from the broader spirit and scope of the specification. The specification and drawings are, accordingly, to be regarded in an illustrative rather than a restrictive sense. Additionally, while a system has been described, it should be appreciated that the system may be one or more servers. In addition, the ACR engine 132, the statistics correlator 136, the external data ingester 134, the category segment generator 122, the device mapping generator 124, the data report generator 126, the device mapping system, the viewing behavior system 130, the device mapping classification system 120, etc., may be implemented by one or more servers.

In the foregoing description, for purposes of explanation, methods have been described in a particular order. It should be understood that in alternative embodiments, the methods may be performed in an order different than that described. It will also be appreciated that the above-described methods may be performed by hardware components or may be embodied in sequences of machine-executable instructions, which may be used to cause a machine, such as a general purpose or special purpose processor or logic circuits programmed with the instructions, to perform the methods. These machine-executable instructions may be stored on one or more machine-readable media, such as a CD-ROM or other type of optical disk, floppy disk, ROM, RAM, EPROM, EEPROM, magnetic or optical card, flash memory, or other type of machine-readable medium suitable for storing electronic instructions. Alternatively, the methods may be performed by a combination of hardware and software.

Where a component is described as being configured to perform certain operations, such configuration may be accomplished, for example, by designing electronic circuitry or other hardware to perform the operations, by programming programmable electronic circuitry (e.g., a microprocessor or other suitable electronic device) to perform the operations, or any combination thereof.

Although illustrative embodiments of the present application have been described in detail herein, it is to be understood that the inventive concepts may be otherwise variously embodied and employed, and that the appended claims are intended to be construed to include such variations, except as limited by the prior art.
