Encoding for prefetching of application streams

Document No.: 1144483. Publication date: 2020-09-11.

Reading note: this technology, Encoding for prefetching of application streams, was designed by 张倬领 on 2016-05-04. Its main content covers systems and methods for encoding for prefetching of application streams. In some implementations, an indication that a client device has representations of a plurality of template frames may be received at a server computer system. A plurality of image frames to be provided by the server computer system to the client device may be processed, wherein the processing of each image frame includes determining whether the image frame satisfies a threshold similarity to one of the template frames. The operations performed for each image frame that satisfies the threshold similarity to one of the template frames may include generating a representation of a difference between the image frame and the template frame to which the image frame satisfies the threshold similarity, generating instructions for rendering the image frame, and providing the instructions to the client device.

1. A computer-implemented method, comprising:

obtaining a set of image frames associated with a particular application;

for each image frame within the set of image frames, determining a respective likelihood that the image frame will be rendered on the particular application;

selecting a subset of a set of image frames as a plurality of image frames, the selection based on a likelihood that the plurality of image frames will be rendered on the particular application;

for each of the plurality of image frames, determining whether an image frame satisfies a threshold difference from another image frame that is previous to the image frame;

generating a template frame for each image frame that satisfies a threshold difference from another image frame that is previous to the image frame; and

providing the template frame to a client device prior to the client device running the particular application.

2. The method of claim 1, wherein template frames for the plurality of image frames are generated using a predefined level of quality.

3. The method of claim 1, wherein determining whether an image frame satisfies a threshold difference from another image frame that is previous to the image frame comprises:

determining a distance score between a signature of the image frame and a signature of another frame preceding the image frame;

determining that a distance score between a signature of the image frame and a signature of another frame previous to the image frame satisfies a distance threshold; and

in response to determining that a distance score between a signature of the image frame and a signature of another frame previous to the image frame satisfies a distance threshold, determining that the image frame satisfies a threshold difference from another frame previous to the image frame.

4. The method of claim 3, wherein determining a distance score between a signature of the image frame and a signature of another frame prior to the image frame comprises:

generating a signature for the image frame based on a luminance histogram of pixels of the image frame.

5. The method of claim 1, wherein:

each of the respective likelihoods determined for the set of image frames includes a score representing a number of times that an image has been previously rendered on the particular application; and

the subset of the set of image frames includes image frames determined to have a score that satisfies a threshold number of times that images have been previously rendered on the particular application.

6. The method of claim 1, further comprising:

determining a size of the template frames;

determining that the size of the template frames exceeds a threshold size, the threshold size being associated with a launch delay of the particular application on the client device; and

in response to determining that the size of the template frames exceeds the threshold size, removing one or more template frames from the template frames prior to the client device running the particular application.

7. The method of claim 1, wherein the threshold difference indicates that an image frame displays a different graphical element than another frame that precedes the image frame.

8. A system, comprising:

one or more computers; and

one or more storage devices storing instructions that, when executed by the one or more computers, cause the one or more computers to perform operations comprising:

obtaining a set of image frames associated with a particular application;

for each image frame within the set of image frames, determining a respective likelihood that the image frame will be rendered on the particular application;

selecting a subset of a set of image frames as a plurality of image frames, the selection based on a likelihood that the plurality of image frames will be rendered on the particular application;

for each of the plurality of image frames, determining whether an image frame satisfies a threshold difference from another image frame that is previous to the image frame;

generating a template frame for each image frame that satisfies a threshold difference from another image frame that is previous to the image frame; and

providing the template frame to a client device prior to the client device running the particular application.

9. The system of claim 8, wherein template frames for the plurality of image frames are generated using a predefined level of quality.

10. The system of claim 8, wherein determining whether an image frame satisfies a threshold difference from another image frame that is prior to the image frame comprises:

determining a distance score between a signature of the image frame and a signature of another frame preceding the image frame;

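determining that a distance score between a signature of the image frame and a signature of another frame previous to the image frame satisfies a distance threshold; and

in response to determining that a distance score between a signature of the image frame and a signature of another frame previous to the image frame satisfies a distance threshold, determining that the image frame satisfies a threshold difference from another frame previous to the image frame.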

11. The system of claim 10, wherein determining a distance score between a signature of the image frame and a signature of another frame prior to the image frame comprises:

generating a signature for the image frame based on a luminance histogram of pixels of the image frame.

12. The system of claim 8, wherein:

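each of the respective likelihoods determined for the set of image frames includes a score representing a number of times that an image has been previously rendered on the particular application; and

the subset of the set of image frames includes image frames determined to have a score that satisfies a threshold number of times that images have been previously rendered on the particular application.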

13. A non-transitory computer-readable storage device encoded with computer program instructions that, when executed by one or more computers, cause the one or more computers to perform operations comprising:

obtaining a set of image frames associated with a particular application;

for each image frame within the set of image frames, determining a respective likelihood that the image frame will be rendered on the particular application;

selecting a subset of a set of image frames as a plurality of image frames, the selection based on a likelihood that the plurality of image frames will be rendered on the particular application;

for each of the plurality of image frames, determining whether an image frame satisfies a threshold difference from another image frame that is previous to the image frame;

generating a template frame for each image frame that satisfies a threshold difference from another image frame that is previous to the image frame; and

providing the template frame to a client device prior to the client device running the particular application.

14. The device of claim 13, wherein template frames for the plurality of image frames are generated using a predefined level of quality.

15. The device of claim 13, wherein determining whether an image frame satisfies a threshold difference from another image frame that is prior to the image frame comprises:

determining a distance score between a signature of the image frame and a signature of another frame preceding the image frame;

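determining that a distance score between a signature of the image frame and a signature of another frame previous to the image frame satisfies a distance threshold; and

in response to determining that a distance score between a signature of the image frame and a signature of another frame previous to the image frame satisfies a distance threshold, determining that the image frame satisfies a threshold difference from another frame previous to the image frame.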

16. The device of claim 15, wherein determining a distance score between a signature of the image frame and a signature of another frame prior to the image frame comprises:

generating a signature for the image frame based on a luminance histogram of pixels of the image frame.

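17. The device of claim 13, wherein:

each of the respective likelihoods determined for the set of image frames includes a score representing a number of times that an image has been previously rendered on the particular application; and

the subset of the set of image frames includes image frames determined to have a score that satisfies a threshold number of times that images have been previously rendered on the particular application.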

18. The device of claim 13, wherein the operations further comprise:

determining the size of a template frame;

determining that a size of the template frame exceeds a threshold size, the threshold size being associated with a launch delay of the particular application on the client device; and

Technical Field

The present specification relates to application streaming.

Background

In application streaming, a server may run an application and stream video rendered for the application to a remote client device used by a user. The user may then interact with the application based on the video streamed to the client device. Video generated by the server may be captured on the server as video frames, encoded as a video bitstream, and sent to the client for decoding and playback.

In order to maintain high interactivity, a reduction in the delay between content generation at the server and content playback at the client is desirable. Delays in application streaming may result in an unsatisfactory user experience. For example, the delay may cause the application to appear unresponsive or to lag. The delay is largely due to three factors: server processing time, client processing time, and network transfer time. Server processing time and client processing time may depend primarily on available computing resources and may not vary much from one video frame to the next. However, for a given network bandwidth, network transfer time may increase as the encoded bitstream size of the video frames increases.

Video frames containing sudden content changes, such as pop-up windows in a desktop application or scene transition animations between levels in a game, can often be difficult to encode due to the amount of new information presented in the content. To encode these frames using conventional video encoding algorithms, the server may either limit the encoded bitstream size to meet the delay requirement by reducing the image quality in the encoder settings, or accept an increased bitstream size to maintain quality at the cost of added delay.

Disclosure of Invention

In general, one innovative aspect of the subject matter described in this specification can be embodied in methods that include the actions of: receiving, at a server computer system, an indication that a client device has representations of a plurality of template frames; and processing, by the server computer system, a plurality of image frames for provision to the client device, wherein the processing of each image frame includes determining whether the image frame satisfies a threshold similarity to one of the template frames. For each image frame that satisfies the threshold similarity to one of the template frames, additional actions include: generating a representation of a difference between the image frame and the template frame to which the image frame satisfies the threshold similarity; generating instructions for rendering the image frame, the instructions including (i) an identification of the representation of the template frame to which the image frame satisfies the threshold similarity, and (ii) the representation of the difference between the image frame and the template frame; and providing the instructions to the client device.

Other versions include corresponding systems, apparatus, and computer programs, encoded on computer storage devices, configured to perform the actions of the methods.

Each of these or other versions may optionally include one or more of the following features. For example, in some implementations, for each image frame that does not satisfy the threshold similarity to one of the template frames, the actions may include providing, to the client device, a representation of the image frame or a representation of a difference between the image frame and another image frame that was processed immediately prior to the image frame.

In some implementations, determining that the image frame satisfies the threshold similarity includes determining that the image frame is more similar to a particular template frame than to any other template frame or to the image frame immediately preceding it.

In some implementations, for each image frame, determining whether the image frame satisfies a threshold similarity to one of the template frames comprises the acts of: determining a distance score between the signature of the image frame and the signature of one of the template frames; determining that the distance score between the signature of the image frame and the signature of the template frame satisfies a distance threshold; and, in response to determining that the distance score satisfies the distance threshold, determining that the image frame satisfies the threshold similarity to that template frame.

In some implementations, determining the distance score between the signature of the image frame and the signature of one of the template frames may include generating the signature of the image frame based on a luminance histogram of pixels of the image frame.

In some implementations, processing the plurality of image frames to provide to the client device may include the acts of: identifying a particular application for which the plurality of image frames are to be rendered; determining, by the server computer system, the template frames associated with the particular application; and providing, by the server computer system, a request to the client device to indicate whether the client device has representations of the plurality of template frames associated with the particular application, wherein the indication that the client device has the representations of the plurality of template frames is received at the server computer system in response to the request.

In some implementations, the representation of the difference between the image frame and the template frame to which the image frame satisfies the threshold similarity may be generated by encoding the image frame as a predictive frame immediately after encoding the template frame as a non-predictive frame at a predefined quality.

In some implementations, the actions can include: obtaining, at the server computer system, a set of image frames from which to generate template frames; processing, by the server computer system, the set of image frames, wherein the processing of each image frame in the set includes determining whether the image frame satisfies a threshold difference from a previous image frame; for each image frame that satisfies the threshold difference from a previous image frame, generating a representation of the image frame as a template frame and providing the generated template frame to the client device; determining whether a total size of the generated template frames satisfies a size threshold; and, in response to determining that the total size of the generated template frames satisfies the size threshold, providing a subset of the generated template frames to the client device.

In some implementations, providing the subset of the generated template frames to the client device can include providing those generated template frames whose corresponding image frames are more dissimilar to their prior image frames than the image frames of the other generated template frames are.
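The template generation and size-limiting steps described above can be sketched as follows. This is a minimal illustration, not the specification's implementation: the names `pick_templates`, `min_distance`, and `max_total_size` are hypothetical, the distance function is one possible reading of the distance score described later in this document, the first frame is skipped because it has no previous frame, and the greedy "most dissimilar first" trimming is an assumed policy.

```python
from typing import List, Sequence


def distance(sig: Sequence[int], prev_sig: Sequence[int]) -> float:
    """Sum of absolute differences, normalized by the total of the first signature."""
    return sum(abs(a - b) for a, b in zip(sig, prev_sig)) / sum(sig)


def pick_templates(signatures: List[Sequence[int]], sizes: List[int],
                   min_distance: float, max_total_size: int) -> List[int]:
    """Return indices of frames to keep as template frames."""
    # A frame becomes a candidate when it differs enough from the frame before it.
    candidates = []  # (distance to previous frame, frame index)
    for i in range(1, len(signatures)):
        d = distance(signatures[i], signatures[i - 1])
        if d >= min_distance:
            candidates.append((d, i))

    # If every candidate fits in the size budget, keep them all.
    if sum(sizes[i] for _, i in candidates) <= max_total_size:
        return [i for _, i in candidates]

    # Otherwise keep the most dissimilar candidates that still fit (assumed policy).
    kept, used = [], 0
    for d, i in sorted(candidates, key=lambda c: -c[0]):
        if used + sizes[i] <= max_total_size:
            kept.append(i)
            used += sizes[i]
    return sorted(kept)
```

With a generous budget every frame that marks a sudden change is kept; with a tight budget only the frames that differ most from their predecessors survive.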

The encoded video frame may be independent of other video frames, e.g., for encoding of non-predictive frames, or dependent on other video frames, e.g., for encoding of predictive frames. For example, in the case where a video frame is very different from a previous frame, the encoded bitstream of the video frame may be independent of any other video frame, and in the case where the video frame is similar to the previous video frame, the encoded bitstream of the video frame may represent the difference between the video frame and the previous video frame. Thus, the size of the encoded bitstream of a video frame may increase as the difference between the video frame and the immediately preceding video frame increases.

The system may reduce the size of the encoded bitstream of video frames by prefetching the encoded bitstream of video frames from the server to the client device. When the server streams video for an application to a client device, the server may determine that a particular video frame is more similar to a video frame having an encoded bitstream that has been pre-fetched on the client device than an immediately preceding encoded video frame. In response to the determination, the server may determine to encode the particular video frame as a predictive frame that depends on a prediction from a video frame corresponding to the prefetched encoded bitstream, provide the encoding to the client device, and instruct the client device to decode the encoding based on the prefetched encoded bitstream instead of the encoded bitstream for the previous video frame. Thus, the system may reduce latency in the application stream by reducing the size of the encoded bitstream of the video frame by using the prefetched encoded bitstream.

In some embodiments, there is provided a computer-implemented method comprising: obtaining a set of image frames associated with a particular application; for each image frame within the set of image frames, determining a respective likelihood that the image frame will be rendered on the particular application; selecting a subset of a set of image frames as a plurality of image frames, the selection based on a likelihood that the plurality of image frames will be rendered on the particular application; for each of the plurality of image frames, determining whether an image frame satisfies a threshold difference from another image frame that is previous to the image frame; generating a template frame for each image frame that satisfies a threshold difference from another image frame that is previous to the image frame; and providing the template frame to a client device prior to the client device running the particular application.

In some embodiments, there is provided a system comprising: one or more computers; and one or more storage devices storing instructions that, when executed by the one or more computers, cause the one or more computers to perform operations comprising: obtaining a set of image frames associated with a particular application; for each image frame within the set of image frames, determining a respective likelihood that the image frame will be rendered on the particular application; selecting a subset of a set of image frames as a plurality of image frames, the selection based on a likelihood that the plurality of image frames will be rendered on the particular application; for each of the plurality of image frames, determining whether an image frame satisfies a threshold difference from another image frame that is previous to the image frame; generating a template frame for each image frame that satisfies a threshold difference from another image frame that is previous to the image frame; and providing the template frame to a client device prior to the client device running the particular application.

In some embodiments, there is provided a non-transitory computer-readable storage device encoded with computer program instructions that, when executed by one or more computers, cause the one or more computers to perform operations comprising: obtaining a set of image frames associated with a particular application; for each image frame within the set of image frames, determining a respective likelihood that the image frame will be rendered on the particular application; selecting a subset of a set of image frames as a plurality of image frames, the selection based on a likelihood that the plurality of image frames will be rendered on the particular application; for each of the plurality of image frames, determining whether an image frame satisfies a threshold difference from another image frame that is previous to the image frame; generating a template frame for each image frame that satisfies a threshold difference from another image frame that is previous to the image frame; and providing the template frame to a client device prior to the client device running the particular application.

The details of one or more implementations of the subject matter described in this specification are set forth in the accompanying drawings and the description below. Other potential features, aspects, and advantages of the subject matter will become apparent from the description, the drawings, and the claims.

Drawings

FIGS. 1A-1B are block diagrams of example systems for an interactive streaming session.

FIG. 2 is a block diagram of an example system for a prefetch preparation session.

FIGS. 3A-3C are flow diagrams of example processes for encoding representations of template frames using prefetching.

FIG. 4 is a diagram of an exemplary computing device.

Like reference symbols in the various drawings indicate like elements.

Detailed Description

Fig. 1A is a block diagram of an example system 100 for an interactive streaming session. Briefly, and as described further below, the system 100 may include a server 110 and a client device 160. The server 110 may include template frames 112, image frames 114, a signature generator 120, a signature comparator 130, a selected template frame 132A, an encoding engine 140, and instructions 150A. Client device 160 may include a video renderer 162 and template representations 164.

Template frames 112 may be frames of video content previously generated for an application on server 110, for which corresponding template representations 164 (i.e., encoded bitstreams) were pre-fetched on client device 160. For example, the template frames 112 may be specific frames rendered for a game running on the server 110, where the specific frames include substantial differences from their immediately preceding frames. Frames that include substantial differences from their immediately preceding frame may include scene change animations, dialog boxes, scoreboards, or other video frames that are routinely displayed and that differ from the previous frame.

Image frames 114 may be frames of video of an application on server 110 that are rendered for streaming to client device 160. For example, the image frames 114 may be frames of a cloud video game application rendered for play by a user of the client device 160.

The signature generator 120 may calculate signatures for the template frames 112 and the image frames 114. The signature of a frame may generally describe the content of the frame. For example, the signature of a frame may be based on the luminance values of pixels in the image. The signature may additionally or alternatively be calculated using other image processing parameters, such as contrast, natural color ratio, pixel color values, and other parameters.

In some implementations, the signature generator 120 computes the signature for each of the template frames 112 and the image frames 114 by initially dividing the luminance of the frame (a 2D array of 1-byte integers with values from 0 to 255) into N smaller 2D blocks. For example, a 960 x 640 pixel frame may be divided into N = 100 blocks, where each block includes 96 x 64 pixels. The signature generator 120 may then calculate a histogram of luminance values for each of the N blocks, the histogram representing the number of occurrences of each luminance value within the block. The signature generator 120 may then concatenate the N histograms to compute the signature for the frame.
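The block-histogram signature just described can be sketched in a few lines. This is an illustrative sketch under stated assumptions: the function name `frame_signature`, the 10 x 10 block grid (which yields 96 x 64 pixel blocks for a 960 x 640 frame), and the configurable number of histogram bins are hypothetical choices, not details given in the specification.

```python
from typing import List

Frame = List[List[int]]  # rows of luminance values, each in 0..255


def frame_signature(luma: Frame, blocks_x: int = 10, blocks_y: int = 10,
                    bins: int = 256) -> List[int]:
    """Concatenate per-block luminance histograms into one signature."""
    height, width = len(luma), len(luma[0])
    if height % blocks_y or width % blocks_x:
        raise ValueError("frame dimensions must divide evenly into blocks")
    block_h, block_w = height // blocks_y, width // blocks_x

    signature: List[int] = []
    for by in range(blocks_y):
        for bx in range(blocks_x):
            hist = [0] * bins
            # Count how often each luminance value occurs inside this block.
            for row in luma[by * block_h:(by + 1) * block_h]:
                for value in row[bx * block_w:(bx + 1) * block_w]:
                    hist[value * bins // 256] += 1
            signature.extend(hist)
    return signature
```

With the defaults, a 960 x 640 frame produces 100 histograms of 256 bins each, so the signature has 25,600 entries. Using fewer bins would shrink the signature at the cost of coarser luminance resolution.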

The signature comparator 130 may compare the signature of the image frame 114 with the signature of one or more of the template frames 112 and with the signature of the immediately preceding image frame. The immediately preceding image frame may be the image frame that is encoded immediately prior to the image frame 114 to be encoded, i.e., the frame that is displayed just before the image frame 114. Using this comparison, the signature comparator 130 may determine whether the image frame 114 satisfies a threshold similarity to one of the template frames. The threshold similarity may be that, out of the template frames 112 and the immediately preceding frame, the image frame is most similar to a particular template frame. In some cases, an image frame 114 may not have an immediately preceding image frame, for example if the image frame 114 is the first frame to be encoded. In these cases, the threshold similarity may be that the image frame is most similar to the particular template frame and that the similarity is above a predetermined minimum similarity threshold.

The signature comparator 130 may use a distance score between two signatures to compare the signature of the image frame 114 with the signature of another frame, such as the previous image frame. For example, the distance score for two signatures may be calculated by summing the differences in the respective luminance values for each of the N blocks of the two respective frames and normalizing the summed differences by the sum of the luminance values for the N blocks of the image frame 114. A lower distance score may correspond to a more similar frame. For example, a distance score of "0" may correspond to a very similar frame.

From among the immediately preceding frame and the template frames 112, the signature comparator may select the frame whose signature is most similar to the signature of the image frame 114 and, when that frame is a template frame, output it as the selected template frame 132A. For example, the image frame 114 may be a start screen following a start animation, where the start screen includes text that changes based on the name of the user, a template frame may correspond to a start screen having a different user's name, and the immediately preceding frame may be the end of the start animation. In this example, the signature comparator 130 may determine that the signature of the image frame 114 is most similar to the signature of the template frame corresponding to the start screen having a different user's name. Signature comparator 130 may determine the most similar frame by identifying the frame whose distance score corresponds to the greatest similarity (e.g., closest to "0").

The selected template frame 132A may be the template frame among the template frames 112 identified by the signature comparator 130 as being most similar to the image frame 114. For example, the signature comparator 130 may compare the distance scores between the image frame 114 and each of the template frames 112 and identify the template frame having the lowest distance score as the selected template frame 132A.

The encoding engine 140 may be a video encoder on the server 110 that encodes the image frames 114. The encoding engine 140 may be configured to encode image frames using a video codec (e.g., MPEG-4, DivX Pro, Xvid, or x264). The encoding engine 140 may receive the selected template frame 132A, or an indication of the selected template frame, and may encode the image frame 114 using the selected template frame. For example, the encoding engine 140 may receive an indication that "template frame X" is most similar to the image frame 114 and, in response, encode the image frame 114 based on the difference between the image frame 114 and "template frame X".
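One possible reading of this distance score is sketched below. The specification does not fix the exact arithmetic, so this sketch assumes that "differences" means absolute differences between corresponding signature entries and that the normalizer is the sum of the image frame's signature entries; the name `distance_score` is hypothetical.

```python
from typing import Sequence


def distance_score(image_sig: Sequence[int], other_sig: Sequence[int]) -> float:
    """Normalized sum of absolute differences between two signatures."""
    if len(image_sig) != len(other_sig):
        raise ValueError("signatures must have the same length")
    total = sum(image_sig)
    if total == 0:
        raise ValueError("image signature must not be all zeros")
    # Sum the entry-by-entry differences, then normalize by the image frame total.
    summed_diff = sum(abs(a - b) for a, b in zip(image_sig, other_sig))
    return summed_diff / total
```

Under these assumptions, identical signatures score 0, and two histogram signatures with no overlapping bins score 2, because every pixel is counted once as missing and once as extra.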

The encoding engine 140 may generate the instructions 150A. The instructions 150A may include a representation 152 and an identification 154. The representation 152 may be a representation of the difference between the image frame 114 and the selected template frame 132A. For example, where the image frame 114 is a frame of a start screen having a user's name and the template frame 112 is a frame of a start screen having a different name, the representation 152 of the differences may be an encoded bitstream representing differences in the pixels corresponding to the different names. The identification 154 may identify a particular template representation, among the template representations 164 stored on the client device 160, for use in rendering the image frame 114. For example, the identification 154 may be "template representation X" corresponding to a representation of "template frame X".

The encoding engine 140 may generate the representation of the difference between the image frame 114 and the selected template frame 132A by first encoding the selected template frame 132A using a video codec and then encoding the image frame 114 using the video codec, together with an indication that the image frame 114 is to be encoded based on the frame just encoded. For example, the encoding engine 140 may first encode a template frame of a start screen having a different name, discard the resulting representation of the template frame, then encode an image frame of the start screen having the user's name with predictive encoding, and use the output as the representation of the difference between the template frame and the image frame.

The selected template frame 132A may be encoded as a non-predictive frame, e.g., independent of any other video frame, and at the same level of quality as the corresponding template representation prefetched by the client device 160. This may ensure that the prediction from the selected template frame 132A generated by the encoding engine 140 is the same frame as the template representation 164 on the client device 160.

Client device 160 may include a video renderer 162 and template representations 164. The template representations 164 may be a pre-fetched set of representations of the template frames 112 on the server 110. Each template representation 164 may be an encoded bitstream of a template frame 112, where the encoding is done by a video codec with an indication that the template frame 112 is to be encoded independently of any other frames.

Video renderer 162 may perform video decoding operations on client device 160. For example, the video renderer 162 may perform the decoding operation using the H.264 video standard or some other video codec. The video renderer 162 may render the image frame 114 on the client device 160 based on the identification 154 of the template representation and the representation 152 of the difference between the image frame 114 and the selected template frame 132A. For example, video renderer 162 may determine that instructions 150A identify "template representation X", corresponding to a representation of a start screen with a default user name, and include a representation of the difference between the start screen with the default user name and the start screen with the name of the current user.

In response, the video renderer 162 may access the template representation pre-fetched on the client device 160 and modify the template representation based on the difference between the image frame 114 and the selected template frame 132A. For example, the video renderer 162 may access "template representation X", decode "template representation X" into "template frame X" without displaying "template frame X", and then decode the representation of the difference between the start screen with the default user name and the start screen with the name of the current user based on the frame just decoded, "template frame X".

Thus, instead of providing a representation of the image frame 114 that is encoded based on the immediately preceding image frame or encoded without reference to any previous frame, the server 110 may reduce the size of the encoding for the image frame 114 by encoding a template frame similar to the image frame 114 and then encoding the image frame 114 based on the encoding of the template frame.
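The exchange between the encoding engine and the video renderer can be illustrated with a toy model. This sketch is not a video codec: it stands in for predictive encoding with a plain map of changed pixels, and the names `RenderInstruction`, `encode_against_template`, and `render` are hypothetical. It only shows the shape of the data flow, namely an identifier for a prefetched template plus a difference that the client applies to it.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple

Frame = List[List[int]]  # toy frame: rows of pixel values


@dataclass
class RenderInstruction:
    template_id: str                  # plays the role of identification 154
    diff: Dict[Tuple[int, int], int]  # plays the role of representation 152


def encode_against_template(image: Frame, template_id: str,
                            template: Frame) -> RenderInstruction:
    """Server side: record only the pixels that differ from the template."""
    diff = {
        (r, c): value
        for r, row in enumerate(image)
        for c, value in enumerate(row)
        if template[r][c] != value
    }
    return RenderInstruction(template_id, diff)


def render(instruction: RenderInstruction,
           prefetched: Dict[str, Frame]) -> Frame:
    """Client side: rebuild the frame from a prefetched template plus the diff."""
    # Copy the template so the prefetched copy stays intact for later frames.
    frame = [row[:] for row in prefetched[instruction.template_id]]
    for (r, c), value in instruction.diff.items():
        frame[r][c] = value
    return frame
```

In this toy model the instruction carries one entry per changed pixel, so a start screen that differs from its template only in the user's name would need far less data than the whole frame.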

Fig. 1B represents a streaming session in which the server 110 determines that the image frame 114 does not satisfy the threshold similarity to any of the template frames 112. For example, the signature comparator 130 may determine that the image frame 114 is more similar to a previous image frame than to any template frame 112. The signature comparator 130 may make this determination based on determining that the distance score between the signatures of the image frame 114 and the previous image frame is lower than the distance score between the signatures of the image frame 114 and each of the template frames 112. For example, the signature comparator 130 may determine that, for an image frame 114 that shows only a slight change (e.g., only a small number of pixels differ) from the previous image frame 132b, the signature of the image frame 114 is more similar to the signature of the previous image frame 132b than to the signature of any template frame 112. In such a case, the encoding engine 140 may encode the image frame 114 by generating a representation of the image frame 114 based on the immediately preceding image frame. In some implementations, when the image frame 114 and the previous image frame 132b do not satisfy the threshold similarity, e.g., the similarity is below 20%, 30%, or some other amount, the encoding engine 140 may encode the image frame 114 as a representation that is not based on any other frame.

FIG. 2 illustrates an example system 200 for a prefetch preparation session. The prefetch preparation session may occur before the application stream is transmitted to the client using the template frames, and may be the session during which the template frames and template representations for the application stream are generated. Briefly, and as described further below, the system 200 may include a server 210 and a client device 270. The server 210 may include image frames 212, a signature generator 220, a signature comparator 230, a template generator 240, a template cropper 250, template frames 252, and template representations 260. The client device 270 may include a video renderer 272 and a template representation 274.

The image frames 212 may be frames of content generated from streaming an application on the server 210 for display on the client device 270. For example, the image frames 212 may include a stream of image frames corresponding to the playing of a game to be streamed, where some of the image frames are similar to previous frames, e.g., the player has moved a cursor in a start menu, and some of the image frames are dissimilar from previous frames, e.g., the player has selected an option in the start menu that has triggered an animation.

Signature generator 220 may calculate a signature for each image frame 212. In some embodiments, signature generator 220 may compute the signature according to the computation previously described with reference to FIG. 1A. The signature comparator 230 may compare the signature of each of the image frames 212 to the signature of its immediately preceding image frame to determine whether each of the image frames 212 satisfies a threshold difference from its immediately preceding image frame. In some implementations, the signature comparator 230 uses comparison techniques similar to those used by the signature comparator 130 to determine whether an image frame 212 satisfies the threshold difference. In one example, where the image frames represent a start menu in which the cursor has moved, the signature comparator 230 may determine that one of the image frames 212 is below the threshold difference from its immediately preceding image frame. In another example, where an image frame represents a transition from displaying a start menu to an animation, the signature comparator 230 may determine that one of the image frames 212 satisfies the threshold difference from its immediately preceding image frame.

The signature comparator 230 may select image frames that meet a threshold difference from their previous frames and provide the selected image frames, or an identification of the selected image frames, to the template generator 240. For example, the signature comparator 230 may select image frames in which a full screen menu option is turned on, and image frames in which animation is started after the start menu, and provide those image frames or an identification of those image frames to the template generator 240.
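The selection rule described above can be sketched as follows. The signature computation itself is defined elsewhere in the document, so this sketch assumes signatures are already available as numeric vectors and uses a sum of absolute differences as a stand-in distance score. `distance`, `select_template_candidates`, and `threshold` are hypothetical names, and skipping the first frame (which has no preceding frame) is also an assumption.

```python
from typing import List, Sequence


def distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Stand-in distance score: sum of absolute differences (assumption)."""
    return sum(abs(x - y) for x, y in zip(a, b))


def select_template_candidates(
    signatures: List[Sequence[float]], threshold: float
) -> List[int]:
    """Return indices of frames that differ enough from their preceding frame."""
    selected = []
    for i in range(1, len(signatures)):  # frame 0 has no preceding frame
        if distance(signatures[i], signatures[i - 1]) >= threshold:
            selected.append(i)
    return selected


# Toy signatures: a cursor move (small change), then a scene change, then
# another small change.
sigs = [[0, 0], [0, 1], [9, 9], [9, 8]]
candidates = select_template_candidates(sigs, threshold=5)
```

In this toy run only the frame at index 2, the scene change, is passed on as a template candidate.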

The template generator 240 may generate a template frame 252 based on the selected image frame 232 sent from the signature comparator. For example, the template generator 240 may designate the selected image frame 232 as a template frame and generate a corresponding template representation of the selected image frame 232 by encoding the selected image frame 232 as a frame independent of any other frames.

In generating the template frame 252 and template representation 260, the template generator 240 may generate a frame array, a signature array, and a representation array that are all initially empty. These arrays may be container objects that each store values of a single type. For example, the frame array may store the template frames 252, the signature array may store the calculated signatures of the image frames 232 selected as the template frames 252, and the representation array may store the encoded bitstreams of the template frames 252. The signature array may be used later during the real-time session so that the signatures of the template frames do not need to be calculated again.

The template generator 240 may add the template frame 252 to the frame array and may also add the corresponding signature of the template frame 252 to the signature array. The template generator 240 may then encode the template frame 252 into a template frame representation using a predefined level of quality. Once the template generator 240 has completed encoding all of the template frames 252 into template frame representations, the template generator 240 may add the template frame representations to the representation array. Each template frame 252 may be encoded as a non-predictive frame, e.g., independent of any other video frame, and at the same predefined level of quality.
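A minimal sketch of the three parallel arrays follows. It is only illustrative: `TemplateStore` and its method names are hypothetical, frames are assumed to be raw byte strings, and `zlib.compress` stands in for the video codec's non-predictive (intra-only) encoding, with the compression level playing the role of the predefined level of quality.

```python
import zlib
from dataclasses import dataclass, field
from typing import List


@dataclass
class TemplateStore:
    """Frame, signature, and representation arrays, all initially empty."""

    frames: List[bytes] = field(default_factory=list)
    signatures: List[List[float]] = field(default_factory=list)
    representations: List[bytes] = field(default_factory=list)

    def add(self, frame: bytes, signature: List[float]) -> None:
        # The frame and its signature are stored at the same index.
        self.frames.append(frame)
        self.signatures.append(signature)

    def encode_all(self, quality_level: int = 6) -> None:
        # Each frame is encoded on its own, with no reference to any other
        # frame, and every frame uses the same quality level.
        self.representations = [
            zlib.compress(frame, quality_level) for frame in self.frames
        ]


store = TemplateStore()
store.add(b"start-menu" * 50, [0.1, 0.9])
store.add(b"options-menu" * 50, [0.7, 0.2])
store.encode_all()
```

Keeping the signature at the same index as its frame is what lets a later real-time session look up a template's signature without recomputing it.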

In some implementations, the template generator 240 may initially receive image frames from one or more practice real-time sessions running on the server 210 to identify image frames that are frequently streamed to the client device. The template generator 240 may compare image frames that are likely to be streamed during the real-time session with the selected image frames sent from the signature comparator 230 to select a set of template frames corresponding to the selected image frames that are likely to be streamed during the real-time session. In other implementations, the template generator 240 may prepare template frames that include all of the selected image frames sent by the signature comparator 230.

The size of the representation array may determine the start-up delay in a real-time session for a client device 270 that prefetches the representation array over the network. For example, when the client device 270 initially requests an application that is streamed from the server, the server may provide the representations of the template frames to the client device 270 to prefetch, and then begin streaming the real-time video of the application. In some instances, the representation array may be large enough to cause a severe delay in the prefetch phase of the real-time session. For example, prefetching a large representation array may take several seconds and may make the application appear unresponsive. Thus, in some implementations, the template generator 240 may compare the size of the generated representation array to a threshold size and, based on a determination that the size of the generated representation array is greater than the threshold size, send the generated representation array to the template cropper 250. The threshold size may be, for example, 200KB, 1MB, 5MB, 10MB, or some other size.

In some implementations, the threshold size may depend on the network bandwidth available for transmission to the client device 270. For example, as the network bandwidth available for transmission to the client device 270 increases, the threshold size may increase.

The template cropper 250 may reduce the size of the generated representation array by removing template frame representations from the representation array and the corresponding template frames from the frame array. For example, the template cropper 250 may initially prioritize the template frame representations by comparing the distance score between the signature of each template frame 252 and the signature of its immediately preceding image frame. In another example, the template cropper 250 may also compare the signatures of the template frames 252 and merge template frames 252 that are similar to each other. The template cropper 250 may determine that a lower distance score indicates an image frame that is less valuable for prediction, because the transition to that image frame may have a smaller performance impact in the real-time session than the transition to an image frame with a larger distance score from its previous image frame. The template cropper 250 may then remove template representations from the representation array, starting with the lowest distance score, until the representation array is below the threshold size.
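The removal order described above can be sketched as a simple greedy trim. This is a sketch under stated assumptions rather than the cropper's actual implementation: `TemplateEntry`, `trim_templates`, and the field names are hypothetical, each entry is assumed to carry the encoded size and the distance score from its preceding frame, and the merging of similar templates is not modeled.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class TemplateEntry:
    template_id: str
    size_bytes: int        # size of the encoded template representation
    distance_score: float  # distance from the frame's preceding image frame


def trim_templates(
    entries: List[TemplateEntry], threshold_bytes: int
) -> List[TemplateEntry]:
    """Drop lowest-distance templates until the total size fits the threshold."""
    total = sum(e.size_bytes for e in entries)
    removed = set()
    # Visit templates from least valuable (lowest distance score) upward.
    for entry in sorted(entries, key=lambda e: e.distance_score):
        if total <= threshold_bytes:
            break
        removed.add(entry.template_id)
        total -= entry.size_bytes
    # Preserve the original array order for the templates that remain.
    return [e for e in entries if e.template_id not in removed]


entries = [
    TemplateEntry("A", 400, 0.9),
    TemplateEntry("B", 300, 0.2),
    TemplateEntry("C", 500, 0.5),
]
kept = trim_templates(entries, threshold_bytes=1000)
```

Here the three representations total 1200 bytes, so the template with the lowest distance score ("B") is removed first, which brings the total to 900 bytes and ends the trim.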

The client device 270 may prefetch the template representations 260 from the server 210 and store them on the client device 270 as template representations 274. The video renderer 272 of the client device 270 may then perform decoding using the template representations 274 (e.g., similar to the video renderer 162).

Figs. 3A-3C illustrate example processes for using encoded representations of template frames in streaming prefetching. Processes 300A, 300B, and 300C are described below as processes performed by components of systems 100 and 200 described with reference to Figs. 1A-1B and 2, respectively. However, processes 300A, 300B, and 300C may be performed by other systems or system configurations.

Fig. 3A is a flow diagram of an example process 300A for streaming an application to a client using template frames. The process 300A may include receiving an indication that a client device has a representation of a template frame (302). For example, the server 110 may receive an indication that the client device 160 has completed prefetching a set of template representations, the set of template representations having been provided by the server 110 to the client device 160 in response to a request from the client device 160 to start a streaming application. In another example, the server 110 may receive a transmission from the client device 160 that includes a request to use an application and indicates that there is a set of pre-fetched template representations 164 on the client device 160.

The process 300A may include processing a plurality of image frames for provision to a client device (304). For example, the server 110 may generate a plurality of image frames corresponding to a video game played by the user using the application. In this example, the plurality of image frames may include image frames corresponding to a player that is running around and opening a full screen options menu. The signature generator 120 may then determine a signature of the image frame 114 and obtain a pre-stored signature of the template frame 112.

The process 300A may include determining whether each image frame satisfies a threshold for similarity (306). For example, the signature comparator 130 may determine, for each image frame, whether the distance score between the signature of the image frame and the signature of a particular template frame is below (i) the distance score between the signature of the image frame and the signature of its immediately preceding image frame, and (ii) the distance score between the signature of the image frame and the signature of any other template frame, and whether that distance score is also below a predetermined minimum distance threshold.
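The three-part test in step 306 can be sketched as follows. The distance function is a stand-in (sum of absolute differences over assumed numeric signature vectors), and `choose_template` and `max_distance` are hypothetical names. A return value of `None` means no template qualifies, in which case the frame would be encoded against its preceding frame or independently.

```python
from typing import Dict, Optional, Sequence


def distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Stand-in distance score: sum of absolute differences (assumption)."""
    return sum(abs(x - y) for x, y in zip(a, b))


def choose_template(
    frame_sig: Sequence[float],
    prev_sig: Sequence[float],
    template_sigs: Dict[str, Sequence[float]],
    max_distance: float,
) -> Optional[str]:
    """Return the id of the template to predict from, or None."""
    if not template_sigs:
        return None
    # Condition (ii): the chosen template is the closest of all templates.
    best_id = min(template_sigs, key=lambda t: distance(frame_sig, template_sigs[t]))
    best = distance(frame_sig, template_sigs[best_id])
    # Condition (i): closer than the immediately preceding frame.
    closer_than_previous = best < distance(frame_sig, prev_sig)
    # Final condition: also below the predetermined distance threshold.
    if closer_than_previous and best < max_distance:
        return best_id
    return None


template_sigs = {"X": [5, 6], "Y": [9, 9]}
picked = choose_template([5, 5], [0, 0], template_sigs, max_distance=3)
fallback = choose_template([5, 5], [5, 5], template_sigs, max_distance=3)
```

In the first call template "X" is at distance 1, closer than both the previous frame (10) and template "Y" (8), so it is selected. In the second call the previous frame is identical to the current one, so no template is chosen.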

The process 300A may include generating a representation of the difference between the image frame and the template frame (308). For example, where a threshold of similarity is met based on a particular template frame, the encoding engine 140 may generate a representation of the difference between the image frame and the particular template frame by first encoding the particular template frame as a non-predictive frame at a predefined level of quality, dropping the encoding of the particular template frame, and then encoding the image frame as an image based on the frame that was just encoded.

The process 300A may include generating instructions for rendering an image frame (310). For example, the encoding engine 140 may generate instructions 150A that include a representation 152 of a difference between a particular image frame to be rendered on the client device 160 and a particular template frame, and an identification 154 of the particular template that identifies to the client device 160 a particular pre-fetched template representation to be used to decode the image frame.

The process 300A may include providing instructions to a client device (312). For example, server 110 may send instructions 150A to client device 160 as a data transmission over a network.

Fig. 3B is a flow diagram of a process 300B for a server to start a streaming application using a template frame. The process 300B may include identifying an application for which a plurality of image frames are to be rendered (314). For example, server 110 may identify that client device 160 has requested to stream a particular application.

The process 300B may include determining that a template frame is associated with the application (316). For example, server 110 may determine that a particular application that client device 160 has requested to stream is associated with a particular set of template frames.

The process 300B may include providing a request to a client device to provide an indication that the client device has a representation of a template frame (318). For example, in response to determining that the template frame is associated with a particular application, server 110 may provide a representation of the template frame to client device 160, and may request that client device 160 provide an indication that client device 160 has received the representation of the template frame. In another example, server 110 may determine that client device 160 may have stored representations of template frames, and may request that client device 160 provide an indication of what representations of template frames client device 160 has stored. The server may provide the request over a network.

The process 300B may include receiving an indication that a client device has a representation of a template frame (320). For example, server 110 may receive a data transmission from client device 160 over the network that identifies the representations of template frames that client device 160 has stored, or that confirms that client device 160 has received the representations of template frames provided by server 110. In response, server 110 may determine which representations of template frames for the application client device 160 has prefetched, and stream the application using the template frames corresponding to the representations prefetched on client device 160.

Fig. 3C is a flow diagram of a process 300C for a client device to render an image frame using a prefetched template representation. The process 300C may include receiving a request to provide an indication (322). For example, the client device 160 may receive a request from the server 110 to provide a list of the template representations on the client device 160, or a confirmation that the client device 160 has completed receiving the set of template representations from the server 110. In some implementations, the request can be specific to a particular application. For example, the request may identify a particular application for which the client device 160 should list available template representations.

The process may include determining that a template frame representation exists (324). For example, client device 160 may determine which template representations are stored on client device 160.

The process may include providing an indication of a template frame representation (326). For example, the client device 160 may send a signal to the server 110 indicating which template representations are on the client device 160, or that the client device 160 has completed receiving the set of template representations from the server 110.

The process may include receiving instructions to render an image frame (328). For example, client device 160 may receive an instruction to render an image frame, where the instruction identifies a particular template representation and includes a representation of a difference between the image frame and the template frame corresponding to the template representation.

The process may include obtaining a representation of a difference between the image frame and the template frame from the instructions (330). For example, the client device 160 may extract the representation 152 of the difference from the instructions 150A received from the server 110.

The process may include rendering the image frame (332). For example, the client device 160 may decode the representation of the template frame without displaying the result of the decoding, and then predictively decode the image frame from the representation of the difference between the image frame and the template frame corresponding to the template representation.

FIG. 4 is a block diagram of computing devices 400, 450, either as clients, or as a server or servers, that may be used to implement the systems and methods described in this document. Computing device 400 is intended to represent various forms of digital computers, such as laptops, desktops, workstations, personal digital assistants, servers, blade servers, mainframes, and other appropriate computers. Computing device 450 is intended to represent various forms of mobile devices, such as personal digital assistants, cellular telephones, smart phones, and other similar computing devices. Additionally, computing device 400 or 450 can include a Universal Serial Bus (USB) flash drive. The USB flash drive may store an operating system and other applications. The USB flash drive can include input/output components such as a wireless transmitter, or a USB connector that can be plugged into a USB port of another computing device. The components shown here, their connections and relationships, and their functions, are meant to be exemplary only, and are not intended to be limited to the implementations described and/or claimed in this document.

Computing device 400 includes a processor 402, memory 404, a storage device 406, a high-speed interface 408 connecting to memory 404 and high-speed expansion ports 410, and a low-speed interface 412 connecting to low-speed bus 414 and storage device 406. Each of the components 402, 404, 406, 408, 410, and 412 is interconnected using various buses and may be mounted on a common motherboard or in other manners as appropriate. The processor 402 is capable of processing instructions for execution within the computing device 400, including instructions stored in the memory 404 or on the storage device 406 to display graphical information for a GUI (Graphical User Interface) on an external input/output device, such as display 416 coupled to high-speed interface 408. In other embodiments, multiple processors and/or multiple buses may be used, as appropriate, along with multiple memories and types of memory. Also, multiple computing devices 400 may be connected, with each device providing portions of the necessary operations (e.g., as a server bank, group of blade servers, or multi-processor system).

The memory 404 stores information within the computing device 400. In one implementation, the memory 404 is a volatile memory unit or units. In another implementation, the memory 404 is a non-volatile memory unit or units. The memory 404 may also be another form of computer-readable medium, such as a magnetic or optical disk.

The storage device 406 can provide mass storage for the computing device 400. In one implementation, the storage device 406 may be or contain a computer-readable medium, such as a floppy disk device, a hard disk device, an optical disk device, or a tape device, a flash memory or other similar solid state memory device, or an array of devices including devices in a storage area network or other configurations. The computer program product can be tangibly embodied with an information carrier. The computer program product may also contain instructions that, when executed, perform one or more methods, such as those described above. The information carrier is a computer-readable or machine-readable medium, such as the memory 404, the storage device 406, or memory on processor 402.

The high-speed controller 408 manages bandwidth-intensive operations for the computing device 400, while the low-speed controller 412 manages less bandwidth-intensive operations. This allocation of functionality is merely exemplary. In one embodiment, the high-speed controller 408 is coupled to memory 404, a display 416 (e.g., through a graphics processor or graphics accelerator), and high-speed expansion ports 410, which may accept various expansion cards (not shown). In this embodiment, low-speed controller 412 is coupled to storage device 406 and low-speed expansion port 414. The low-speed expansion port, which may include various communication ports (e.g., USB, Bluetooth, Ethernet, wireless Ethernet), may be coupled, e.g., through a network adapter, to one or more input/output devices such as a keyboard, a pointing device, a microphone/speaker pair, a scanner, or a network device such as a switch or router.

The computing device 400 may be implemented in a number of different forms as shown in the figures. For example, it may be implemented as a standard server 420, or multiple times in a group of such servers. It may also be implemented as part of a rack server system 424. Further, it may be implemented in a personal computer such as a laptop computer 422. Alternatively, components from computing device 400 may be combined with other components in a mobile device (not shown), such as device 450. Each of such devices may contain one or more of computing devices 400, 450, and an entire system may be made up of multiple computing devices 400, 450 communicating with each other.

The computing device 450 includes a processor 452, memory 464, and input/output devices such as a display 454, a communication interface 466, and a transceiver 468, among other components. The device 450 may also possess a storage device, such as a microdrive or other device, to provide additional storage. Each of the components 450, 452, 464, 454, 466, and 468 are interconnected using various buses, and several of the components may be mounted on a common motherboard or in other manners as appropriate.

The processor 452 is capable of executing instructions within the computing device 450, including instructions stored in the memory 464. The processor may be implemented as a chipset of chips that include separate and multiple analog and digital processors. Additionally, the processor may be implemented using any of several architectures. For example, processor 452 may be a Complex Instruction Set Computer (CISC) processor, a Reduced Instruction Set Computer (RISC) processor, or a Minimal Instruction Set Computer (MISC) processor. The processor may provide, for example, for coordination of the other components of the device 450, such as control of user interfaces, applications run by device 450, and wireless communication by device 450.

The processor 452 may communicate with a user through a control interface 458 and a display interface 456 coupled to the display 454. The display 454 may be, for example, a TFT (Thin-Film-Transistor Liquid Crystal Display) display or an OLED (Organic Light Emitting Diode) display, or other suitable display technology. The display interface 456 may comprise appropriate circuitry for driving the display 454 to present graphical and other information to a user. The control interface 458 may receive commands from a user and convert them for submission to the processor 452. Further, external interface 462 may be provided in communication with processor 452 to enable near field communication of device 450 with other devices. External interface 462 may provide, for example, for wired communication in some implementations, or for wireless communication in other implementations, and multiple interfaces may also be used.

Memory 464 stores information within computing device 450. Memory 464 can be implemented as one or more of the following: a computer-readable medium or media, a volatile memory unit or units, or a non-volatile memory unit or units. Expansion memory 474 may also be provided and connected to device 450 through expansion interface 472, which may include, for example, a SIMM (Single In Line Memory Module) card interface. Such expansion memory 474 may provide additional storage space for device 450, or may also store applications or other information for device 450. Specifically, expansion memory 474 may include instructions to carry out or supplement the processes described above, and may also include secure information. Thus, for example, expansion memory 474 may be provided as a security module for device 450 and may be programmed with instructions that permit secure use of device 450. In addition, secure applications may be provided via the SIMM cards, along with additional information, such as placing identifying information on the SIMM card in a non-hackable manner.

As discussed below, the memory may include, for example, flash memory and/or NVRAM memory. In one embodiment, a computer program product is tangibly embodied in the form of an information carrier. The computer program product contains instructions which, when executed, perform one or more methods, such as those described above. The information carrier is a computer-readable or machine-readable medium, such as the memory 464, expansion memory 474, or memory on processor 452 that may be received via transceiver 468 or external interface 462, for example.

Device 450 may communicate wirelessly through communication interface 466, which communication interface 466 may include digital signal processing circuitry when necessary. Communication interface 466 may provide for communications under various modes or protocols, such as GSM voice calls, SMS, EMS, or MMS messaging, CDMA, TDMA, PDC, WCDMA, CDMA2000, or GPRS, among others. Such communication may occur, for example, through radio-frequency transceiver 468. Further, short-range communications may occur, such as using a Bluetooth, WiFi, or other such transceiver (not shown). In addition, GPS (Global Positioning System) receiver module 470 may provide additional navigation-related and location-related wireless data to device 450, which may be used as appropriate by applications running on device 450.

Device 450 may also communicate audibly using audio codec 460, which may receive verbal information from a user and convert it to usable digital information. Audio codec 460 may likewise generate audible sound for a user, such as through a speaker (e.g., in a handset of device 450). Such sound may include sound from voice telephone calls, may include recorded sound (e.g., voice messages, music files, etc.), and may also include sound generated by applications operating on device 450.

The computing device 450 may be implemented in a number of different forms as shown in the figures. For example, it may be implemented as a cellular telephone 480. It may also be implemented as part of a smart phone 482, part of a personal digital assistant, or part of other similar mobile device.

Various embodiments of the systems and methods described herein can be implemented in the form of digital electronic circuitry, integrated circuitry, specially designed Application Specific Integrated Circuits (ASICs), computer hardware, firmware, software, and/or combinations of these embodiments. These various implementations can include implementation of one or more computer programs that are executable and/or interpretable on a programmable system including at least one programmable processor, which may be special or general purpose, coupled to receive data and instructions from, and to transmit data and instructions to, a storage system, at least one input device, and at least one output device.

These computer programs (also known as programs, software applications, or code) include machine instructions for a programmable processor, and can be implemented in a high-level procedural and/or object-oriented programming language, and/or in assembly/machine language. As used herein, the terms "machine-readable medium" and "computer-readable medium" refer to any computer program product, apparatus, and/or device (e.g., magnetic discs, optical disks, memory, and Programmable Logic Devices (PLDs)) used to provide machine instructions and/or data to a programmable processor, including a machine-readable medium that receives machine instructions as a machine-readable signal. The term "machine-readable signal" refers to any signal used to provide machine instructions and/or data to a programmable processor.

To provide for interaction with a user, the systems and techniques described here can be implemented on a computer having a display device (e.g., a CRT (cathode ray tube) or LCD (liquid crystal display) monitor) for displaying information to the user and a keyboard and a pointing device (e.g., a mouse or a trackball) by which the user can provide input to the computer. Other kinds of devices can also be used to provide for interaction with a user; for example, feedback provided to the user can be any form of sensory feedback (e.g., visual feedback, auditory feedback, or tactile feedback); and input from the user can be received in any form, including acoustic, speech, or tactile input.

The systems and techniques described here can be implemented in a computing system that includes a back-end component (e.g., a data server), or that includes a middleware component (e.g., an application server), or that includes a front-end component (e.g., a client computer having a graphical user interface or a Web browser through which a user can interact with an implementation of the systems and techniques described here), or any combination of such back-end, middleware, or front-end components. The components of the system can be interconnected by any form or medium of digital data communication (e.g., a communication network). Examples of communication networks include a local area network ("LAN"), a wide area network ("WAN"), and the Internet.

The computing system can include clients and servers. A client and server are generally remote from each other and typically interact through a communication network. The relationship of client and server arises by virtue of computer programs running on the respective computers and having a client-server relationship to each other.

Several embodiments have been described. Nevertheless, it will be understood that various modifications may be made without departing from the scope of the invention. Moreover, the logic flow depicted in the accompanying figures does not require the particular order shown, or sequential order, to achieve desirable results. In addition, other steps may be provided, or steps may be eliminated, from the described flowcharts, and other components may be added to, or removed from, the described systems. Accordingly, other embodiments are within the scope of the following claims.
