Spatial layer rate allocation

Document No.: 864166    Publication date: 2021-03-16

Reading note: This technology, "Spatial layer rate allocation," was designed and created by Michael Horowitz and Rasmus Brandt on 2019-06-23. Its main content is summarized in the following abstract.

A method (400) includes receiving transform coefficients (226) corresponding to a scaled video input signal (120), the scaled video input signal (120) comprising a plurality of spatial layers (L) including a base layer (L_0). The method also includes determining a spatial rate factor (332) based on frame samples (S_F) from the scaled video input signal. The spatial rate factor defines a factor for bit-rate allocation at each spatial layer of an encoded bitstream (204) formed from the scaled video input signal. The spatial rate factor is represented by the difference between the bit rate per transform coefficient of the base layer and the average bit rate per transform coefficient (R_L). The method also includes reducing distortion of the plurality of spatial layers by allocating a bit rate to each spatial layer based on the spatial rate factor and the frame samples.

1. A method (400), comprising:

receiving, at data processing hardware (510), transform coefficients (226) corresponding to a scaled video input signal (120), the scaled video input signal (120) comprising a plurality of spatial layers (L) including a base layer (L_0);

determining, by the data processing hardware (510), a spatial rate factor (332) based on frame samples (S_F) from the scaled video input signal (120), the spatial rate factor (332) defining a factor for bit-rate allocation at each spatial layer (L) of an encoded bitstream (204) formed from the scaled video input signal (120), the spatial rate factor (332) being represented by a difference between a bit rate per transform coefficient (226) of the base layer (L_0) and an average bit rate (R_L) per transform coefficient (226) of the plurality of spatial layers (L); and

reducing, by the data processing hardware (510), distortion of the plurality of spatial layers (L) of the encoded bitstream (204) by allocating a bit rate to each spatial layer (L) based on the spatial rate factor (332) and the frame samples (S_F).

2. The method (400) of claim 1, further comprising:

receiving, at the data processing hardware (510), second frame samples (S_F) from the scaled video input signal (120);

modifying, by the data processing hardware (510), the spatial rate factor (332) based on the second frame samples (S_F) from the scaled video input signal (120); and

allocating, by the data processing hardware (510), a modified bit rate to each spatial layer (L) based on the modified spatial rate factor (332) and the second frame samples (S_F).

3. The method (400) of claim 1 or 2, further comprising:

receiving, at the data processing hardware (510), second frame samples (S_F) from the scaled video input signal (120);

modifying, by the data processing hardware (510), the spatial rate factor (332) on a frame-by-frame basis based on an exponential moving average corresponding to at least the frame samples (S_F) and the second frame samples (S_F); and

allocating, by the data processing hardware (510), a modified bit rate to each spatial layer (L) based on the modified spatial rate factor (332).

4. The method (400) of any of claims 1-3, wherein receiving the scaled video input signal (120) comprises:

receiving a video input signal (120);

scaling the video input signal (120) into the plurality of spatial layers (L);

partitioning each spatial layer (L) into sub-blocks;

transforming each sub-block into the transform coefficients (226); and

scalar quantizing the transform coefficients (226) corresponding to each sub-block.

5. The method (400) of claim 4, wherein determining the spatial rate factor (332) based on the frame samples (S_F) from the scaled video input signal (120) comprises: determining a variance estimate (322) for each scalar-quantized transform coefficient (226) based on an average over all transform blocks of a frame of the video input signal (120).

6. The method (400) of claim 4 or 5, wherein the transform coefficients (226) of each sub-block are identically distributed across all sub-blocks.

7. The method (400) according to any one of claims 1-6, wherein the spatial rate factor (332) comprises a single parameter configured to allocate the bit rate to each layer (L) of the encoded bitstream (204).

8. The method (400) according to any one of claims 1-7, further comprising: determining, by the data processing hardware (510), that the spatial rate factor (332) satisfies a spatial rate factor threshold (334).

9. The method (400) of claim 8, wherein a value corresponding to the spatial rate factor (332) satisfies the spatial rate factor threshold (334) when the value is less than about 1.0 and greater than about 0.5.

10. The method (400) of any of claims 1-9, wherein the spatial rate factor (332) comprises a weighted sum corresponding to a ratio of variance products, the ratio comprising a numerator based on an estimated variance of the scalar-quantized transform coefficients (226) from a first spatial layer (L) and a denominator based on an estimated variance of the scalar-quantized transform coefficients (226) from a second spatial layer (L).

11. A system (100), comprising:

data processing hardware (510); and

memory hardware (520), the memory hardware (520) in communication with the data processing hardware (510), the memory hardware (520) storing instructions that, when executed on the data processing hardware (510), cause the data processing hardware (510) to:

receiving transform coefficients (226) corresponding to a scaled video input signal (120), the scaled video input signal (120) comprising a plurality of spatial layers (L) including a base layer (L_0);

determining a spatial rate factor (332) based on frame samples (S_F) from the scaled video input signal (120), the spatial rate factor (332) defining a factor for bit-rate allocation at each spatial layer (L) of an encoded bitstream (204) formed from the scaled video input signal (120), the spatial rate factor (332) being represented by a difference between a bit rate per transform coefficient (226) of the base layer (L_0) and an average bit rate (R_L) per transform coefficient (226) of the plurality of spatial layers (L); and

reducing distortion of the plurality of spatial layers (L) of the encoded bitstream (204) by allocating a bit rate to each spatial layer (L) based on the spatial rate factor (332) and the frame samples (S_F).

12. The system (100) of claim 11, wherein the operations further comprise:

receiving second frame samples (S_F) from the scaled video input signal (120);

modifying the spatial rate factor (332) based on the second frame samples (S_F) from the scaled video input signal (120); and

allocating a modified bit rate to each spatial layer (L) based on the modified spatial rate factor (332) and the second frame samples (S_F).

13. The system (100) of claim 11 or 12, wherein the operations further comprise:

receiving second frame samples (S_F) from the scaled video input signal (120);

modifying the spatial rate factor (332) based on an exponential moving average corresponding to at least the frame samples (S_F) and the second frame samples (S_F); and

allocating a modified bit rate to each spatial layer (L) based on the modified spatial rate factor (332).

14. The system (100) according to any one of claims 11-13, wherein receiving the scaled video input signal (120) further comprises:

receiving a video input signal (120);

scaling the video input signal (120) into the plurality of spatial layers (L);

partitioning each spatial layer (L) into sub-blocks;

transforming each sub-block into the transform coefficients (226); and

scalar quantizing the transform coefficients (226) corresponding to each sub-block.

15. The system (100) of claim 14, wherein determining the spatial rate factor (332) based on the frame samples (S_F) from the scaled video input signal (120) comprises: determining a variance estimate (322) for each scalar-quantized transform coefficient (226) based on an average over all transform blocks of a frame of the video input signal (120).

16. The system (100) according to claim 14 or 15, wherein the transform coefficients (226) of each sub-block are identically distributed across all sub-blocks.

17. The system (100) according to any one of claims 11-16, wherein the spatial rate factor (332) comprises a single parameter configured to allocate the bit rate to each layer (L) of the encoded bitstream (204).

18. The system (100) according to any one of claims 11-17, wherein the operations further include: determining that the spatial rate factor (332) satisfies a spatial rate factor threshold (334).

19. The system (100) of claim 18, wherein a value corresponding to the spatial rate factor (332) satisfies the spatial rate factor threshold (334) when the value is less than about 1.0 and greater than about 0.5.

20. The system (100) of any of claims 11-19, wherein the spatial rate factor (332) comprises a weighted sum corresponding to a ratio of variance products, the ratio comprising a numerator based on an estimated variance of the scalar-quantized transform coefficients (226) from a first spatial layer (L) and a denominator based on an estimated variance of the scalar-quantized transform coefficients (226) from a second spatial layer (L).

21. A method (400), comprising:

receiving, at data processing hardware (510), non-quantized transform coefficients (226) corresponding to a scaled video input signal (120), the scaled video input signal (120) comprising a plurality of spatial layers (L);

determining, by the data processing hardware (510), an allocation factor based on frame samples (S_F) from the scaled video input signal (120), the allocation factor corresponding to a variance estimate of the received non-quantized transform coefficients (226); and

allocating, by the data processing hardware (510), a bit rate to each spatial layer (L) based on the allocation factor and the frame samples (S_F).

Technical Field

The present invention relates to spatial layer rate allocation in the context of scalable video coding.

Background

As video becomes prevalent in a wide range of applications, a video stream may need to be encoded and/or decoded several times depending on the circumstances of the application. For example, different applications and/or devices may need to comply with bandwidth or resource constraints. To satisfy these demands without the expense of encoding the video separately for each setting, codecs have been developed that efficiently compress video at several resolutions. With codecs such as scalable VP9 and H.264, a video bitstream may contain multiple spatial layers that allow a user to reconstruct the original video at different resolutions (i.e., the resolution of each spatial layer). With this scalability, video content can be delivered from device to device with limited further processing.

Disclosure of Invention

One aspect of the present invention provides a method for allocating bit rates. The method includes receiving, at data processing hardware, transform coefficients corresponding to a scaled video input signal, the scaled video input signal including a plurality of spatial layers, the plurality of spatial layers including a base layer. The method also includes determining, by the data processing hardware, a spatial rate factor based on the frame samples from the scaled video input signal. The spatial rate factor defines a factor for bit rate allocation at each spatial layer of an encoded bitstream formed from the scaled video input signal. The spatial rate factor is represented by a difference between a bit rate of each transform coefficient of the base layer and an average bit rate of each transform coefficient of the plurality of spatial layers. The method also includes reducing distortion of a plurality of spatial layers of the encoded bitstream by allocating a bitrate to each spatial layer based on the spatial rate factor and the frame samples.

Implementations of the invention may include one or more of the following optional features. In some implementations, the method further includes receiving, at the data processing hardware, second frame samples from the scaled video input signal; modifying, by the data processing hardware, the spatial rate factor based on the second frame samples from the scaled video input signal; and allocating, by the data processing hardware, a modified bit rate to each spatial layer based on the modified spatial rate factor and the second frame samples. In further implementations, the method includes receiving, at the data processing hardware, second frame samples from the scaled video input signal; modifying, by the data processing hardware, the spatial rate factor on a frame-by-frame basis based on an exponential moving average corresponding to at least the frame samples and the second frame samples; and allocating, by the data processing hardware, a modified bit rate to each spatial layer based on the modified spatial rate factor.
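The frame-by-frame, exponential-moving-average modification described above can be sketched in a few lines. This is an illustrative sketch only; the smoothing weight `alpha` and the function name are assumptions, not values taken from this disclosure.

```python
def update_spatial_rate_factor(prev_factor, frame_factor, alpha=0.1):
    """Blend the spatial rate factor measured on the newest frame samples
    into a running exponential moving average (EMA).

    prev_factor: EMA after the previous frame, or None on the first frame.
    frame_factor: factor computed from the current frame samples.
    alpha: assumed smoothing weight; larger values give the newest frame
           samples more influence over the modified factor.
    """
    if prev_factor is None:  # first frame: no history to smooth against
        return frame_factor
    return alpha * frame_factor + (1.0 - alpha) * prev_factor
```

With each new frame sample, the allocator would apply this update and then re-run the bit-rate allocation using the modified factor.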

In some examples, receiving the scaled video input signal includes receiving the video input signal, scaling the video input signal into the plurality of spatial layers, partitioning each spatial layer into sub-blocks, transforming each sub-block into transform coefficients, and scalar quantizing the transform coefficients corresponding to each sub-block. Determining the spatial rate factor based on the frame samples from the scaled video input signal may include determining a variance estimate for each scalar-quantized transform coefficient based on an average over all transform blocks of a frame of the video input signal. Here, the transform coefficients of each sub-block may be identically distributed across all sub-blocks.
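The per-coefficient variance estimate mentioned above can be sketched as follows. This is a minimal sketch assuming 4x4 transform blocks collected from one frame; the helper name and block size are illustrative, not specified by this disclosure.

```python
import numpy as np

def coefficient_variances(transform_blocks):
    """Estimate, for each transform-coefficient position, its variance by
    averaging over all transform blocks of a frame.

    transform_blocks: array-like of shape (num_blocks, 4, 4) holding the
    transform coefficients of every sub-block in the frame. Because the
    coefficients at a given position are treated as identically
    distributed across sub-blocks, one variance estimate per position
    suffices for the whole frame.
    """
    blocks = np.asarray(transform_blocks, dtype=float)
    return blocks.var(axis=0)  # shape (4, 4): one estimate per position
```

The resulting per-position variances are what a rate allocator could feed into its spatial-rate-factor computation.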

In some implementations, the method further includes determining, by the data processing hardware, that the spatial rate factor satisfies a spatial rate factor threshold. In these implementations, the spatial rate factor threshold may be satisfied when the value to which the spatial rate factor corresponds is less than about 1.0 and greater than about 0.5. The spatial rate factor may comprise a single parameter configured to allocate the bit rate to each layer of the encoded bitstream. In some examples, the spatial rate factor includes a weighted sum corresponding to a ratio of variance products, wherein the ratio includes a numerator based on an estimated variance of the scalar-quantized transform coefficients from the first spatial layer and a denominator based on an estimated variance of the scalar-quantized transform coefficients from the second spatial layer.

Another aspect of the invention provides a system for allocating bit rates. The system includes data processing hardware and memory hardware in communication with the data processing hardware. The memory hardware stores instructions that, when executed by the data processing hardware, cause it to perform operations. The operations include receiving transform coefficients corresponding to a scaled video input signal, the scaled video input signal including a plurality of spatial layers, the plurality of spatial layers including a base layer. The operations also include determining a spatial rate factor based on frame samples from the scaled video input signal. The spatial rate factor defines a factor for bit rate allocation at each spatial layer of an encoded bitstream formed from the scaled video input signal. The spatial rate factor is represented by a difference between a bit rate of each transform coefficient of the base layer and an average bit rate of each transform coefficient of the plurality of spatial layers. The operations also include reducing distortion of a plurality of spatial layers of the encoded bitstream by allocating a bitrate to each spatial layer based on the spatial rate factor and the frame samples.

This aspect may include one or more of the following optional features. In some implementations, the operations further comprise receiving second frame samples from the scaled video input signal; modifying the spatial rate factor based on the second frame samples from the scaled video input signal; and allocating a modified bit rate to each spatial layer based on the modified spatial rate factor and the second frame samples. In further implementations, the operations further comprise receiving second frame samples from the scaled video input signal; modifying the spatial rate factor from frame to frame based on an exponential moving average corresponding to at least the frame samples and the second frame samples; and allocating a modified bit rate to each spatial layer based on the modified spatial rate factor.

In some examples, receiving the scaled video input signal includes receiving the video input signal, scaling the video input signal into the plurality of spatial layers, partitioning each spatial layer into sub-blocks, transforming each sub-block into transform coefficients, and scalar quantizing the transform coefficients corresponding to each sub-block. Determining the spatial rate factor based on the frame samples from the scaled video input signal may include determining a variance estimate for each scalar-quantized transform coefficient based on an average over all transform blocks of a frame of the video input signal. Here, the transform coefficients of each sub-block may be identically distributed across all sub-blocks.

In some implementations, the operations further include determining that the spatial rate factor satisfies a spatial rate factor threshold. In these implementations, the spatial rate factor threshold may be satisfied when the value to which the spatial rate factor corresponds is less than about 1.0 and greater than about 0.5. The spatial rate factor may comprise a single parameter configured to allocate the bit rate to each layer of the encoded bitstream. In some examples, the spatial rate factor includes a weighted sum corresponding to a ratio of variance products, wherein the ratio includes a numerator based on an estimated variance of the scalar-quantized transform coefficients from the first spatial layer and a denominator based on an estimated variance of the scalar-quantized transform coefficients from the second spatial layer.

The details of one or more embodiments of the invention are set forth in the accompanying drawings and the description below. Other aspects, features, and advantages will be apparent from the description and drawings, and from the claims.

Drawings

Fig. 1 is a schematic diagram of an exemplary rate allocation system.

Fig. 2 is a schematic diagram of an exemplary encoder within the rate allocation system of fig. 1.

Fig. 3 is a schematic diagram of an exemplary allocator within the rate allocation system of fig. 1.

Fig. 4 is a flow diagram of an exemplary method for implementing a rate allocation system.

FIG. 5 is a schematic diagram of an exemplary computing device that may be used to implement the systems and methods described herein.

Like reference symbols in the various drawings indicate like elements.

Detailed Description

Fig. 1 is an example of a rate allocation system 100. The rate allocation system 100 generally includes a video source device 110 that transmits captured video as a video input signal 120 to a remote system 140 via a network 130. At the remote system 140, an encoder 200 and an allocator 300 convert the video input signal 120 into an encoded bitstream 204. The encoded bitstream 204 includes one or more spatial layers L_0-L_i, where i denotes the number of spatial layers. Each spatial layer L is a scalable form of the encoded bitstream 204. A scalable video bitstream is a video bitstream from which parts can be removed such that the resulting sub-stream (e.g., a spatial layer L) forms a valid bitstream for some target decoder. More specifically, a sub-stream represents the source content (e.g., the captured video) of the original video input signal 120 at a reconstruction quality lower than that of the original captured video. For example, a first spatial layer L_1 has a 720p high-definition (HD) resolution of 1280x720, while a base layer L_0 is scaled to a resolution of 640x360 as an extended form of video graphics array (VGA) resolution. In terms of scalability, video may generally be scaled temporally (e.g., by frame rate), spatially (e.g., by spatial resolution), and/or in quality (e.g., by fidelity, often referred to as signal-to-noise ratio, SNR).

The rate allocation system 100 is an example environment in which a user 10, 10a captures video at a video source device 110 and communicates the captured video to other users 10, 10b-c. Here, the encoder 200 and the allocator 300 convert the captured video into the encoded bitstream 204 at an allocated bitstream rate before the users 10b, 10c receive the captured video via video receiving devices 150, 150b-c. Each video receiving device 150 may be configured to receive and/or process a different video resolution. Here, a spatial layer L with a greater layer number i refers to a layer L with greater resolution, while i = 0 among the one or more spatial layers L_0-L_i refers to the base layer L_0 with the lowest scalable resolution. Referring to fig. 1, the encoded bitstream 204 includes two spatial layers L_0, L_1. Thus, one video receiving device 150 may receive video content as the lower-resolution spatial layer L_0 while another video receiving device 150 may receive video content as the higher-resolution spatial layer L_1. For example, fig. 1 depicts a first video receiving device 150a belonging to the user 10b receiving the lower spatial resolution layer L_0, while the user 10c, having a second receiving device 150b as a laptop, receives the higher-resolution spatial layer L_1.

When different video receiving devices 150a-b receive different spatial layers L_0-L_i, the video quality of each spatial layer L may depend on the bit rate B_R of the received spatial layer L and/or an allocation factor A_F. Here, the bit rate B_R corresponds to a number of bits per second, while the allocation factor A_F corresponds to a number of bits per sample (i.e., per transform coefficient). In the case of a scalable bitstream (e.g., the encoded bitstream 204), the total bit rate B_Rtot of the scalable bitstream is typically constrained, such that each spatial layer L of the scalable bitstream inherits a similar bit-rate constraint. Because of these constraints, the bit rate B_R associated with one spatial layer L may compromise the quality of another spatial layer L. More specifically, if the quality of the spatial layer L received by a user 10 via a video receiving device 150 is compromised, the compromised quality may negatively affect the user experience. For example, it is increasingly common today to transmit video content as a form of communication via real-time communication (RTC) applications. A user 10 of an RTC application typically chooses an application for communication based on the subjective quality of the application. Thus, an application user 10 generally expects a positive communication experience without quality problems that may result from an insufficient bit rate being allocated to the spatial layer L that the application user 10 receives. To help ensure a positive user experience, the allocator 300 is configured to adaptively determine the allocation factor A_F in order to determine the bit rate B_R of each spatial layer L of the plurality of spatial layers L_0-L_i. By analytically distributing the allocation factor A_F among the plurality of spatial layers L_0-L_i, the allocator 300 seeks to achieve the highest video quality across all spatial layers L_0-L_i for a given total bit rate B_Rtot.
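To make the roles of B_Rtot, the per-coefficient average rate, and a base-layer rate offset concrete, the following sketch splits a total budget across layers under an assumed linear per-coefficient rate model. The model, function name, and numbers are illustrative assumptions, not the allocation rule claimed by this disclosure: the base layer is given `factor` more bits per coefficient than the average, and the slope across layer indices is chosen so the layer rates still sum to the total budget.

```python
def allocate_layer_bitrates(total_bps, coeffs_per_sec, factor):
    """Split a total bit-rate budget across spatial layers.

    total_bps: total bit rate B_Rtot for the whole scalable bitstream.
    coeffs_per_sec: transform coefficients per second for each layer,
                    index 0 being the base layer L_0.
    factor: assumed rate offset, i.e. how many bits per coefficient the
            base layer receives above the average per-coefficient rate.
    """
    n_tot = sum(coeffs_per_sec)
    r_avg = total_bps / n_tot  # average bits per transform coefficient
    # Weighted mean layer index; the slope is chosen so that the weighted
    # average per-coefficient rate stays equal to r_avg (budget preserved).
    mean_idx = sum(l * n for l, n in enumerate(coeffs_per_sec)) / n_tot
    slope = factor / mean_idx if mean_idx else 0.0
    return [n * (r_avg + factor - slope * l)
            for l, n in enumerate(coeffs_per_sec)]
```

For example, with a 1 Mbps budget, a base layer carrying 100k coefficients/s, an enhancement layer carrying 400k coefficients/s, and a factor of 0.5 bit/coefficient, the base layer receives 2.5 bits per coefficient against a 2.0 average, and the two layer rates still sum to 1 Mbps.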

The video source device 110 may be any computing device or data processing hardware capable of communicating captured video and/or a video input signal 120 to the network 130 and/or the remote system 140. In some examples, the video source device 110 includes data processing hardware 112, memory hardware 114, and a video capture device 116. In some implementations, the video capture device 116 is actually an image capture device that transmits a sequence of captured images as video content. For example, some digital cameras and/or webcams are configured to capture images at a particular frequency to form perceived video content. In other examples, the video source device 110 captures video in a continuous analog format that may subsequently be converted to a digital format. In some configurations, the video source device 110 includes an encoder to preliminarily encode or compress the captured data (e.g., analog or digital) into a format for further processing by the encoder 200. In other examples, the video source device 110 is configured to access the encoder 200 remotely. For example, the encoder 200 is a web application hosted on the remote system 140 but accessible by the video source device 110 via a network connection. In yet other examples, some or all of the encoder 200 and/or the allocator 300 is hosted on the video source device 110. For example, the encoder 200 and the allocator 300 are hosted on the video source device 110, while the remote system 140 serves as a backend system that transmits the encoded bitstream 204, including the spatial layers L_0-L_i, to a video receiving device 150 according to the decoding capabilities of the video receiving device 150 and the connection capabilities of the network 130 between the video receiving device 150 and the remote system 140. Additionally or alternatively, the video source device 110 is configured so that the user 10a can communicate with other users 10b-c over the network 130 using the video capture device 116.

The video input signal 120 is a video signal corresponding to the captured video content. Here, the video source device 110 captures the video content. For example, fig. 1 depicts the video source device 110 capturing video content via the webcam 116. In some examples, the video input signal 120 is an analog signal that is processed by the encoder 200 into a digital format. In other examples, the video input signal 120 undergoes some degree of encoding or digital formatting before reaching the encoder 200, such that the encoder 200 performs a re-quantization process.

Much like the video source device 110, the video receiving device 150 may be any computing device or data processing hardware capable of receiving transmitted captured video via the network 130 and/or the remote system 140. In some examples, the video source device 110 and the video receiving device 150 are configured with the same functionality, such that the video receiving device 150 can become a video source device 110 and the video source device 110 can become a video receiving device 150. In either case, the video receiving device 150 includes at least data processing hardware 152 and memory hardware 154. In addition, the video receiving device 150 includes a display 156 configured to display the received video content (e.g., at least one spatial layer L of the encoded bitstream 204). As shown in fig. 1, the users 10b, 10c receive the encoded bitstream 204 as a spatial layer L at a bit rate B_R, and decode and display the encoded bitstream 204 as video on the display 156. In some examples, the video receiving device 150 contains a decoder or is configured to access a decoder (e.g., via the network 130) that allows the video receiving device 150 to display the content of the encoded bitstream 204.

In some examples, the encoder 200 and/or the allocator 300 is an application hosted by the remote system 140 (e.g., a distributed system of a cloud environment) that is accessed via the video source device 110 and/or the video receiving device 150. In some implementations, the encoder 200 and/or the allocator 300 is an application downloaded to the memory hardware 114, 154 of the video source device 110 and/or the video receiving device 150. Regardless of the access point, the encoder 200 and/or the allocator 300 may be configured to communicate with the remote system 140 in order to access its resources 142 (e.g., data processing hardware 144, memory hardware 146, or software resources 148). Access to the resources 142 of the remote system 140 may allow the encoder 200 and/or the allocator 300 to encode the video input signal 120 into the encoded bitstream 204 and/or to allocate a bit rate B_R to each spatial layer L of the one or more spatial layers L_0-L_i of the encoded bitstream 204. Optionally, a real-time communication (RTC) application used for communication between the users 10, 10a-c includes the encoder 200 and/or the allocator 300 as built-in functionality in the form of a software resource 148 of the remote system 140.

Referring in greater detail to fig. 1, three users 10, 10a-c communicate via an RTC application hosted by the remote system 140 (e.g., a WebRTC video application hosted in the cloud). In this example, the first user 10a is in a group video chat with the second user 10b and the third user 10c. As the video capture device 116 captures video of the first user 10a speaking, the captured video carried by the video input signal 120 is processed by the encoder 200 and the allocator 300 and transmitted via the network 130. Here, the encoder 200 and the allocator 300 operate in conjunction with the RTC application to generate an encoded bitstream 204 having one or more spatial layers L_0, L_1, where each spatial layer L has an allocated bit rate B_R0, B_R1 determined from an allocation factor A_F0, A_F1 based on the video input signal 120. Because of the different capabilities of each video receiving device 150a, 150b, each user 10b, 10c receiving the first user 10a's video chat receives a differently scaled version of the original video corresponding to the video input signal 120. For example, the second user 10b receives the base spatial layer L_0 and the third user 10c receives the first spatial layer L_1. Each user 10b, 10c then views the video content received in communication with the RTC application on the displays 156a, 156b. Although an RTC communication application is shown, the encoder 200 and/or the allocator 300 may be used in other applications involving an encoded bitstream 204 having one or more spatial layers L_0-L_i.

Fig. 2 is an example of an encoder 200. The encoder 200 is configured to convert the video input signal 120 as an input 202 into the encoded bitstream as an output 204. Although shown separately, the encoder 200 and the allocator 300 may be integrated into a single device (e.g., as shown in phantom in fig. 1) or may reside separately on multiple devices (e.g., the video source device 110, the video receiving device 150, or the remote system 140). The encoder 200 generally includes a scaler 210, a transformer 220, a quantizer 230, and an entropy coder 240. Although not shown, the encoder 200 may include additional components for generating the encoded bitstream 204, such as prediction components (e.g., motion estimation and intra prediction) and/or loop filters. The prediction components may generate a residual to be passed to the transformer 220 for transformation, where the residual is the difference between the original input frame and a frame prediction (e.g., motion compensation or intra prediction).

The scaler 210 is configured to scale the video input signal 120 into a plurality of spatial layers L0-Li. In some embodiments, the scaler 210 scales the video input signal 120 by determining portions of the video input signal 120 that can be removed to reduce spatial resolution. The scaler 210 forms multiple versions of the video input signal 120 by removing one or more such portions to form the plurality of spatial layers (e.g., sub-streams). The scaler 210 may repeat this process until it forms the base spatial layer L0. In some examples, the scaler 210 scales the video input signal 120 to form a set number of spatial layers L0-Li. In other examples, the scaler 210 is configured to scale the video input signal 120 until it determines that no decoder exists to decode the resulting sub-stream. When the scaler 210 determines that no decoder exists to decode the sub-stream corresponding to a scaled version of the video input signal 120, the scaler 210 identifies the previous version as the base spatial layer L0. Some examples of the scaler 210 include codecs corresponding to a scalable video coding (SVC) extension, such as an extension of the H.264 video compression standard or an extension of the VP9 coding format.

The transformer 220 is configured to receive each spatial layer L corresponding to the video input signal 120 from the scaler 210. For each spatial layer L, the transformer 220 divides the spatial layer L into sub-blocks at operation 222. For each sub-block, the transformer 220 performs a transform at operation 224 to generate transform coefficients 226 (e.g., via a discrete cosine transform (DCT)). By generating the transform coefficients 226, the transformer 220 separates redundant video data from non-redundant video data, helping the encoder 200 remove the redundant video data. In some embodiments, the transform coefficients also allow the allocator 300 to easily determine the number of coefficients of each transform block having non-zero variance in a spatial layer L.

The quantizer 230 is configured to perform a quantization or re-quantization process 232 (i.e., scalar quantization). A quantization process generally converts input parameters (e.g., from a continuous analog data set) into a smaller output data set. Although a quantization process may convert an analog signal into a digital signal, here the quantization process 232 (also sometimes referred to as a re-quantization process) typically further processes a digital signal; either term may be used depending on the form of the video input signal 120. By employing a quantization or re-quantization process, data may be compressed, but at the expense of some data loss, since the smaller output set is a reduction of a larger or continuous data set. Here, the quantization process 232 converts a digital signal. In some examples, the quantizer 230 facilitates forming the encoded bitstream 204 by scalar quantizing the transform coefficients 226 from each sub-block of the transformer 220 into quantization indices 234. Scalar quantizing the transform coefficients 226 allows a lossy encoding to scale each transform coefficient 226 in order to weigh redundant video data (e.g., data that may be removed during encoding) against valuable video data (e.g., data that should not be removed).
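As a loose illustration of the scalar quantization step described above (a plain uniform quantizer with an assumed step size, not the actual quantizer design of the encoder 200), the mapping from transform coefficients to quantization indices and back might look like:

```python
import numpy as np

def quantize(coeffs, step):
    """Uniform scalar quantization: map each transform coefficient to an
    integer quantization index (the lossy step of the pipeline)."""
    return np.round(np.asarray(coeffs, dtype=float) / step).astype(int)

def dequantize(indices, step):
    """Reconstruct approximate coefficients from the indices."""
    return indices * float(step)

coeffs = np.array([12.4, -3.1, 0.4, 7.9])   # made-up transform coefficients
indices = quantize(coeffs, step=2.0)        # -> [6, -2, 0, 4]
recon = dequantize(indices, step=2.0)       # -> [12., -4., 0., 8.]
max_error = float(np.max(np.abs(recon - coeffs)))   # bounded by step/2 = 1.0
```

Larger step sizes compress harder (fewer distinct indices to entropy-code) at the cost of larger reconstruction error.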

The entropy encoder 240 is configured to convert the quantization indices 234 (i.e., the quantized transform coefficients) and side information into bits. Through this conversion, the entropy encoder 240 forms the encoded bitstream 204. In some implementations, the entropy encoder 240, in conjunction with the quantizer 230, enables the encoder 200 to form the encoded bitstream 204 in which each layer L0-Li has a bit rate BR0-BRi based on an allocation factor AF0-AFi determined by the allocator 300.

Fig. 3 is an example of an allocator 300. The allocator 300 is configured to receive the non-quantized transform coefficients 226 associated with more than one spatial layer L0-Li and to determine an allocation factor AF for each received spatial layer L0-Li. In some embodiments, the allocator 300 determines each allocation factor AF based on a high-rate approximation of the squared error for scalar quantization. The high-rate approximation of the squared error allows the system to determine the optimal (under the high-rate approximation) bit rates to allocate to N scalar quantizers. Typically, the optimal bit rates allocated to the N scalar quantizers are determined by rate-distortion optimized quantization. Rate-distortion optimization improves video quality during video compression by minimizing the amount of distortion (i.e., loss of video quality) subject to a bit rate constraint (e.g., a total bit rate BRtot). Here, the allocator 300 applies the principles for determining optimal bit rates for N scalar quantizers to determine an optimal allocation factor for allocating bit rate to each of the more than one spatial layers L0-Li of the encoded bitstream 204.

In general, the high-rate approximation of the squared error of a scalar quantizer can be represented by the following equation:

d_i(r_i) = h_i \sigma_i^2 \, 2^{-2 r_i}    (1)

where h_i depends on the source distribution of the input signal (e.g., transform coefficients) to the ith quantizer, \sigma_i^2 is the variance of the signal, and r_i is the bit rate for the ith quantizer in bits per input symbol. The following derives an expression for the optimal rate allocation for two scalar quantizers using the squared-error high-rate approximation.

The average distortion of the two-quantizer problem, D_2, is equal to D_2 = (d_0 + d_1)/2. Similarly, the average rate of the two-quantizer problem, R_2, is:

R_2 = \frac{r_0 + r_1}{2}    (2)

Here, d_i is the squared-error distortion caused by the ith quantizer, and r_i is the bit rate allocated to the ith quantizer in bits per sample. Although d_i is a function of the rate r_i, so that writing d_i(r_i) would be appropriate, for convenience it is denoted simply as d_i. Substituting the high-rate approximations of d_0 and d_1 into D_2 yields:

D_2 = \frac{1}{2} \left( h_0 \sigma_0^2 \, 2^{-2 r_0} + h_1 \sigma_1^2 \, 2^{-2 r_1} \right)

Using equation (2), r_1 can be replaced by 2 R_2 - r_0, obtaining:

D_2 = \frac{1}{2} \left( h_0 \sigma_0^2 \, 2^{-2 r_0} + h_1 \sigma_1^2 \, 2^{-2 (2 R_2 - r_0)} \right)    (3)

Further, taking the derivative of equation (3) for D_2 with respect to r_0 yields the following expression:

\frac{\partial D_2}{\partial r_0} = \ln 2 \left( -h_0 \sigma_0^2 \, 2^{-2 r_0} + h_1 \sigma_1^2 \, 2^{-2 (2 R_2 - r_0)} \right)    (4)

the above expression (equation (4)) is set to zero and r is solved0An expression for the optimal rate r of the zero quantizer is obtained, expressed as follows:

Since the high-rate distortion expression is convex, the minimum found by setting the derivative to zero is global. Similarly, the optimal rate r_1^* of the first quantizer may be expressed as follows:

r_1^* = R_2 + \frac{1}{4} \log_2 \frac{h_1 \sigma_1^2}{h_0 \sigma_0^2}    (6)

To find the optimal quantizer distortions d_0^* and d_1^*, the optimal rates of equations (5) and (6) are substituted into the high-rate expression for scalar quantizer distortion, as follows:

d_i^* = h_i \sigma_i^2 \, 2^{-2 r_i^*}, \quad i = 0, 1    (7)

the simplified form of equation (7) yields the following equation:

for all i (8)

The same double quantizer analysis can be extended to three quantizers by combining the zero quantizer and the first quantizer into a single quantization system (i.e., a nested system), where the combined quantizer has been solved according to equations (1) - (8). Using a method similar to the two-quantizer rate allocation, a three-quantizer system is derived as follows.

The average single-quantizer distortion resulting from the dual-quantizer system is denoted d_avg. Substituting the d_avg expression into the average triple-quantizer distortion yields the following equation:

D_3 = \frac{2 d_{avg} + d_2}{3}    (9)

Similarly, the average rate of the three-quantizer system is expressed as follows:

R_3 = \frac{2 R_2 + r_2}{3}    (10)

where R_2 = (r_0 + r_1)/2.

Using the optimal distortion result from the dual-quantizer analysis shown in equation (8), the resulting triple-quantizer distortion can be represented by the following equation:

D_3 = \frac{2}{3} \sqrt{h_0 \sigma_0^2 \, h_1 \sigma_1^2} \; 2^{-2 R_2} + \frac{1}{3} h_2 \sigma_2^2 \, 2^{-2 r_2}    (11)

Therefore, solving equation (10) for 2R_2 (i.e., 2R_2 = 3R_3 - r_2) and substituting into equation (11) converts equation (11) into the following expression:

D_3 = \frac{2}{3} \sqrt{h_0 \sigma_0^2 \, h_1 \sigma_1^2} \; 2^{-(3 R_3 - r_2)} + \frac{1}{3} h_2 \sigma_2^2 \, 2^{-2 r_2}    (12)

Using equation (12), the derivative of D_3 with respect to r_2 can be set to zero and solved for r_2, obtaining the following equation:

r_2^* = R_3 + \frac{1}{3} \log_2 \frac{h_2 \sigma_2^2}{\left( h_0 \sigma_0^2 \, h_1 \sigma_1^2 \right)^{1/2}}    (13)

For three quantizers, equation (13) can be expressed more fully as follows:

r_2^* = R_3 + \frac{1}{2} \log_2 \frac{h_2 \sigma_2^2}{\left( h_0 \sigma_0^2 \, h_1 \sigma_1^2 \, h_2 \sigma_2^2 \right)^{1/3}}    (14)

based on the first and second quantizers, an expression of the optimal rate allocation r for the N quantizers can be derived. The expression for the optimal rate for the ith quantizer is as follows:

By substituting the expression for the optimal rate into the high-rate expression for distortion and simplifying in a manner similar to the dual-quantizer case, the resulting expression for the optimal distortion of the N quantizers is:

d_i^* = \left( \prod_{j=0}^{N-1} h_j \sigma_j^2 \right)^{1/N} 2^{-2 R_N} \quad \text{for all } i    (16)
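Equations (15) and (16) can likewise be checked with a small script (illustrative values and a hypothetical helper name only):

```python
import math

def optimal_rates_n_quantizers(hv, R):
    """Per-quantizer optimal rates per equation (15):
    r_i = R + 0.5 * log2(hv[i] / geometric_mean(hv)),
    where hv[i] = h_i * sigma_i**2."""
    n = len(hv)
    log2_geo = sum(math.log2(x) for x in hv) / n   # log2 of the geometric mean
    return [R + 0.5 * (math.log2(x) - log2_geo) for x in hv]

hv = [16.0, 4.0, 1.0]   # assumed h_i * sigma_i^2 per quantizer
R = 2.0                 # average bits per symbol
rates = optimal_rates_n_quantizers(hv, R)          # -> [3.0, 2.0, 1.0]
dists = [x * 2.0 ** (-2.0 * r) for x, r in zip(hv, rates)]   # all equal
d_opt = (16.0 * 4.0 * 1.0) ** (1.0 / 3.0) * 2.0 ** (-2.0 * R)  # equation (16)
```

As the derivation predicts, the optimal allocation equalizes all N distortions at the geometric mean of the h_j sigma_j^2 terms scaled by 2^(-2R).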

Based on the expressions derived in equations (1)-(16), the allocator 300 may apply these optimal-distortion results to determine an allocation factor AF for each layer L of the plurality of spatial layers L0-Li (i.e., to contribute to an optimal bit rate BR). Similar to the derived N-quantizer expressions, the multi-spatial-layer bit rates can be derived from expressions associated with two-layer and three-layer rate allocation systems. In some examples, it is assumed that although the spatial layers L0-Li usually have different spatial dimensions, the spatial layers L0-Li originate from the same video source (e.g., the video source device 110). In some embodiments, the transform structures used to encode the first spatial layer L0 and the second spatial layer L1 are assumed to be identical, even though the values of the scalar quantizers may differ. Furthermore, for each spatial layer L, the number of samples S is typically equal to the number of transform coefficients 226 (i.e., also equal to the number of quantizers).

In the case of a two-spatial-layer rate allocation system, the average distortion of the two spatial layers, D_2, can be expressed as the weighted sum of the average distortions d_0 and d_1 corresponding to the first and second spatial layers L0, L1 (i.e., spatial layers 0 and 1), as follows:

D_2 = \frac{s_0 d_0 + s_1 d_1}{S}    (17)

where s_i is equal to the number of samples in the ith spatial layer L_i and S = s_0 + s_1. Similarly, the average bit rate of the two spatial layers can be expressed as follows:

R_2 = \frac{s_0 r_0 + s_1 r_1}{S}    (18)

where r_0 and r_1 are the average bit rates of the first and second spatial layers L0, L1, respectively. By substituting the expression for the optimal distortion of the N quantizers, i.e., equation (16), into the D_2 equation (17) above, D_2 can be expressed as follows:

D_2 = \frac{1}{S} \left[ s_0 \left( \prod_{j=0}^{s_0-1} h_{j,0} \sigma_{j,0}^2 \right)^{1/s_0} 2^{-2 r_0} + s_1 \left( \prod_{j=0}^{s_1-1} h_{j,1} \sigma_{j,1}^2 \right)^{1/s_1} 2^{-2 r_1} \right]    (19)

where \sigma_{j,i}^2 is the variance of the input signal of the jth scalar quantizer in the ith spatial layer L_i. Solving equation (18) for r_1 and substituting the result into equation (19) yields:

D_2 = \frac{1}{S} \left[ s_0 \left( \prod_{j=0}^{s_0-1} h_{j,0} \sigma_{j,0}^2 \right)^{1/s_0} 2^{-2 r_0} + s_1 \left( \prod_{j=0}^{s_1-1} h_{j,1} \sigma_{j,1}^2 \right)^{1/s_1} 2^{-2 (S R_2 - s_0 r_0)/s_1} \right]    (20)

Furthermore, by setting the derivative of D_2 with respect to r_0 to zero and solving for r_0, r_0 can be represented by the following equation:

r_0^* = R_2 + \frac{s_1}{2 S} \log_2 \frac{\left( \prod_{j=0}^{s_0-1} h_{j,0} \sigma_{j,0}^2 \right)^{1/s_0}}{\left( \prod_{j=0}^{s_1-1} h_{j,1} \sigma_{j,1}^2 \right)^{1/s_1}}    (21)

To simplify equation (21) for ease of representation, the expression P_i = \left( \prod_{j=0}^{s_i-1} h_{j,i} \sigma_{j,i}^2 \right)^{1/s_i} is substituted and the resulting terms are rearranged to form the following expression, similar to the N-quantizer allocation expression in equation (15):

r_0^* = R_2 + \frac{1}{2} \log_2 \frac{P_0}{\left( P_0^{s_0} P_1^{s_1} \right)^{1/S}}    (22)

Alternatively, defining the sample-weighted geometric mean P = \left( P_0^{s_0} P_1^{s_1} \right)^{1/S}, equation (22) can be expressed as the following equation:

r_0^* = R_2 + \frac{1}{2} \log_2 \frac{P_0}{P}    (23)

Based on equations (17)-(23), the optimal two-spatial-layer distortion can be expressed as follows:

D_2^* = \left( P_0^{s_0} P_1^{s_1} \right)^{1/S} 2^{-2 R_2}    (24)

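A small numerical sketch of the two-spatial-layer allocation of equations (22)-(24), using assumed values for P_0, P_1 and the sample counts (the helper `two_layer_rates` is hypothetical):

```python
import math

def two_layer_rates(P0, s0, P1, s1, R2):
    """Per-coefficient optimal rates for two spatial layers, following
    the form of equations (22)-(23): r_i = R2 + 0.5*log2(P_i / G),
    where G is the sample-weighted geometric mean of P0 and P1."""
    S = s0 + s1
    # work in log space to avoid overflow for large sample counts
    log2G = (s0 * math.log2(P0) + s1 * math.log2(P1)) / S
    G = 2.0 ** log2G
    r0 = R2 + 0.5 * (math.log2(P0) - log2G)
    r1 = R2 + 0.5 * (math.log2(P1) - log2G)
    return r0, r1, G

P0, s0 = 2.0, 100   # base layer: assumed coefficient-energy term, sample count
P1, s1 = 8.0, 300   # enhancement layer (assumed values)
R2 = 1.5            # average bits per sample across both layers
r0, r1, G = two_layer_rates(P0, s0, P1, s1, R2)   # -> 0.75, 1.75
avg_rate = (s0 * r0 + s1 * r1) / (s0 + s1)        # recovers R2
D2 = G * 2.0 ** (-2.0 * R2)                       # optimal distortion, equation (24)
```

The sample-weighted average of the per-layer rates reproduces R_2, as equation (18) requires.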
a similar approach can be derived for application to three spatial layers L0-2The optimal allocation factor of. Very similar to the two spatial layers L0、L1,siIs equal to the ith spatial layer L1The number of samples in (1) is such that S is S0+s1+s2. Three spatial layers L0-2Average rate and distortion of R3And D3And can be represented as spatial layers 0,1, and 2 (e.g., three spatial layers L)0-2) Average rate and distortion r of0、r1And r2And d0、d1And d2The weighted sum of (c) is as follows:

and

When techniques similar to those used to extend the dual-quantizer results to three quantizers are applied, R_3 can be expressed as a combination involving the average two-layer rate R_2 using the following equation:

R_3 = \frac{(s_0 + s_1) R_2 + s_2 r_2}{S}    (27)

where

R_2 = \frac{s_0 r_0 + s_1 r_1}{s_0 + s_1}    (28)

Similarly, for three quantizers, the distortion can be expressed as follows:

wherein

Using the optimal two-layer distortion D_2^* of equation (24) and the optimal N-quantizer distortion of equation (16), D_3 in equation (29) can be rewritten to obtain the following expression:

D_3 = \frac{1}{S} \left[ (s_0 + s_1) \left( P_0^{s_0} P_1^{s_1} \right)^{1/(s_0+s_1)} 2^{-2 R_2} + s_2 P_2 \, 2^{-2 r_2} \right]    (31)

where P_2 = \left( \prod_{j=0}^{s_2-1} h_{j,2} \sigma_{j,2}^2 \right)^{1/s_2}. Equation (27) can be solved for R_2 to obtain the following expression:

R_2 = \frac{S R_3 - s_2 r_2}{s_0 + s_1}    (32)

In addition, equations (31) and (32) can be combined by substituting equation (32) into the D_3 equation (31) to form the following equation:

D_3 = \frac{1}{S} \left[ (s_0 + s_1) \left( P_0^{s_0} P_1^{s_1} \right)^{1/(s_0+s_1)} 2^{-2 (S R_3 - s_2 r_2)/(s_0+s_1)} + s_2 P_2 \, 2^{-2 r_2} \right]    (33)

An expression for r_2 can be obtained by taking the derivative of D_3 with respect to r_2 and setting the result to zero. This expression can be represented by the following equation:

r_2^* = R_3 + \frac{s_0 + s_1}{2 S} \log_2 \frac{P_2}{\left( P_0^{s_0} P_1^{s_1} \right)^{1/(s_0+s_1)}}    (34)

When the terms are rearranged, equation (34) takes a form similar to the N-quantizer allocation expression, as follows:

r_2^* = R_3 + \frac{1}{2} \log_2 \frac{P_2}{\left( P_0^{s_0} P_1^{s_1} P_2^{s_2} \right)^{1/S}}    (36)

Applying this form (equation (36)) to the first layer L0 and the second layer L1, the optimal rate for each layer may be expressed as follows:

r_0^* = R_3 + \frac{1}{2} \log_2 \frac{P_0}{\left( P_0^{s_0} P_1^{s_1} P_2^{s_2} \right)^{1/S}}    (37)

and

r_1^* = R_3 + \frac{1}{2} \log_2 \frac{P_1}{\left( P_0^{s_0} P_1^{s_1} P_2^{s_2} \right)^{1/S}}    (38)
The two derivations, for two spatial layers L0-L1 and for three spatial layers L0-L2, illustrate the manner in which the extension to multiple spatial layers optimizes rate allocation at the allocator 300 (e.g., for determining the allocation factor AF used to determine the bit rate BR allocated to each spatial layer L). Here, the above results extend to L spatial layers to obtain the general expression shown in the following equation:

r_0^* = R_L + \frac{1}{2} \log_2 \frac{P_0}{\left( \prod_{i=0}^{L-1} P_i^{s_i} \right)^{1/S}}    (39)

where R_L is the average rate, in bits per sample, over the L spatial layers; S = \sum_{i=0}^{L-1} s_i is the total number of samples over the L spatial layers, where s_i is the number of samples in the ith spatial layer; P_i = \left( \prod_{j=0}^{s_i-1} h_{j,i} \sigma_{j,i}^2 \right)^{1/s_i}, where h_{j,i} depends on the source distribution of the signal quantized by the jth quantizer in the ith spatial layer; and \sigma_{j,i}^2 corresponds to the variance of the jth transform coefficient in the ith spatial layer.
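A hedged numerical sketch of the general base-layer rate of equation (39), with invented per-layer values (the helper `base_layer_rate` is illustrative, not part of the described system):

```python
import math

def base_layer_rate(P, s, R_L):
    """Optimal base-layer rate per the general form of equation (39):
    r0 = R_L + 0.5 * log2(P[0] / (prod_i P[i]**s[i])**(1/S))."""
    S = sum(s)
    # weighted geometric mean of the P_i, computed in log space
    log2G = sum(si * math.log2(Pi) for Pi, si in zip(P, s)) / S
    return R_L + 0.5 * (math.log2(P[0]) - log2G)

P = [4.0, 8.0, 16.0]   # assumed per-layer geometric means of h*sigma^2
s = [100, 200, 400]    # assumed samples (transform coefficients) per layer
R_L = 2.0              # assumed average bits per sample over all layers
r0 = base_layer_rate(P, s, R_L)
rate_difference = r0 - R_L   # the difference r0 - R_L discussed below
```

Because the base layer here has the smallest P_i, its optimal rate falls below the average rate R_L, making the difference r0 - R_L negative in this example.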

In some embodiments, equation (39) has a different form due to various assumptions. Two different forms of equation (39) are shown below.

For example, h_{j,i} depends on the source distribution of the portion of the video input signal 120 quantized by the jth quantizer in the ith spatial layer L_i. In examples with similar source distributions, the value of h_{j,i} does not change between quantizers and therefore cancels out in the ratio of product terms in equation (39). In other words, h_{j,0} = h_{j,1} = h_{j,2} = h. When such cancellation occurs, the term P_i reduces to P_i = \left( \prod_{j=0}^{s_i-1} \sigma_{j,i}^2 \right)^{1/s_i}. This effectively removes the parameter from consideration, since P_i always appears in a ratio in which the h terms in the numerator cancel the corresponding terms in the denominator. In practice, h_{j,0} may differ from h_{j,1} and h_{j,2}, because the base spatial layer L0 uses only temporal prediction, while other spatial layers may use both temporal and spatial prediction. In some configurations, such differences do not significantly affect the allocation factor AF determined by the allocator 300.

In other implementations, the encoder 200 introduces transform blocks that produce the transform coefficients 226. When this occurs, the combination of transform coefficients 226 may change, introducing a variable s_i'. The variable s_i' corresponds to the average number of transform coefficients 226 per transform block having non-zero variance in the ith spatial layer L_i, as used in equation (39a); in contrast, s_i in equation (39b) corresponds to the total number of samples S in the ith spatial layer L_i. Further, in equation (39a), the term P_i = \left( \prod_{k=0}^{s_i'-1} \sigma_{k,i}^2 \right)^{1/s_i'}, where \sigma_{k,i}^2 is the variance of the kth coefficient of a transform block in the ith spatial layer L_i. Practically speaking, equation (39a) represents the optimal bit rate allocation for the ith spatial layer L_i as an expression involving weighted ratios of variance products.

Referring to FIG. 3, in some embodiments, the allocator 300 includes a sampler 310, an estimator 320, and a rate determiner 330. The sampler 310 receives the transform coefficients 226 associated with the plurality of spatial layers L0-Li as the input 302 to the allocator 300. For example, FIG. 2 shows the transform coefficients 226 generated by the transformer 220 being communicated to the allocator 300 via the dashed lines. Using the received non-quantized transform coefficients 226, the sampler 310 identifies frames of the video input signal 120 as samples SF. Based on the samples SF identified by the sampler 310, the allocator 300 determines an allocation factor AF for each spatial layer L. In some embodiments, the allocator 300 is configured to dynamically determine the allocation factor AF for each spatial layer L. In these embodiments, the sampler 310 may be configured to iteratively identify frame samples SF, such that the allocator 300 may adapt the allocation factor AF to each set of samples SF identified by the sampler 310. For example, the allocator 300 determines the allocation factor AF for each spatial layer L based on first samples SF1 of frames of the video input signal 120, and then (e.g., if necessary) continues to adjust or modify the allocation factor AF applied to each spatial layer L based on second samples SF2 of frames of the video input signal 120 identified by the sampler 310 (e.g., changing from a first allocation factor AF1 for the first samples SF1 to a second allocation factor AF2 for the second samples SF2, as shown in FIG. 3). This process may continue iteratively for as long as the allocator 300 receives the video input signal 120. In these examples, the allocator 300 modifies the allocation factor AF such that the spatial rate factor 332 changes based on the first samples SF1 and the second samples SF2 (e.g., from a first spatial rate factor 332_1 to a second spatial rate factor 332_2).
Additionally or alternatively, the allocator 300 may use an exponential moving average to modify the allocation factor AF from frame to frame. The exponential moving average is typically a weighted moving average in which the allocation factor AF determined for the current frame is weighted against the allocation factor AF from previous frames. In other words, each modification of the allocation factor AF is a weighted average of the current and previous allocation factors AF.
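A minimal sketch of such an exponential moving average (the smoothing weight `alpha` is an assumed value, not one specified by the system):

```python
def ema_update(prev_factor, new_factor, alpha=0.2):
    """Exponential moving average: blend the allocation factor estimated
    from the current frame with the running value from previous frames."""
    return alpha * new_factor + (1.0 - alpha) * prev_factor

factor = 0.50   # running allocation factor from earlier frames (assumed)
for frame_estimate in (0.80, 0.80, 0.80):   # per-frame estimates (assumed)
    factor = ema_update(factor, frame_estimate)
# factor moves toward 0.80 but smoothly: 0.56, 0.608, 0.6464
```

The smoothing prevents a single noisy frame sample from abruptly changing the per-layer bit rates.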

The estimator 320 is configured to determine a variance estimate 322 for each transform coefficient 226 received from the encoder 200. In some configurations, the estimator 320 assumes that the transform coefficients 226 in each block from the transformer 220 are similarly distributed. Based on this assumption, the variance of each transform coefficient 226 can be estimated by averaging over all transform blocks in the sample frames SF of the video input signal 120. For example, the kth transform coefficient 226 in the ith spatial layer L_i is modeled as a random variable E_{k,i}, whose variance is estimated as:

\hat{\sigma}_{k,i}^2 \approx \frac{1}{B_i S_F} \sum_{t=1}^{S_F} \sum_{b=1}^{B_i} \varepsilon_{b,k,i,t}^2

where \varepsilon_{b,k,i,t} is the kth transform coefficient 226 in the bth transform block of the ith spatial layer L_i in the tth frame, B_i represents the number of blocks in the ith spatial layer L_i, and S_F represents the number of sample frames used to estimate the variance. In some examples, the estimate of the variance of the kth transform coefficient 226 in the ith spatial layer L_i is independent of the transform block when all transform blocks are assumed to have the same statistics. In practice, however, the statistics of transform blocks may vary across a frame. This is particularly true of video-conferencing content, where blocks at the edges of a frame may have less activity than blocks in the center. Therefore, if these differing statistics negatively impact the accuracy of the rate allocation result, estimating the variance based on blocks located in the center of the frame may mitigate the negative impact. In some configurations, the sub-blocks used to estimate transform-coefficient variances represent a subset of all sub-blocks in a video image (e.g., sub-blocks located at a centermost portion of the video image, or sub-blocks located where the video image has changed relative to a previous image).
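The averaging used by the estimator can be sketched as follows on synthetic data (the helper `estimate_coeff_variance` and the test values are illustrative, assuming zero-mean coefficients):

```python
import numpy as np

def estimate_coeff_variance(blocks):
    """Estimate the variance of each transform-coefficient position by
    averaging the squared coefficients over every transform block in the
    sampled frames (assumes zero-mean, identically distributed blocks)."""
    blocks = np.asarray(blocks, dtype=float)
    return np.mean(blocks ** 2, axis=0)

# Synthetic stand-in for one spatial layer's coefficients:
# 10000 blocks, 4 coefficient positions with true std 4, 2, 1, 0.5.
rng = np.random.default_rng(0)
true_std = np.array([4.0, 2.0, 1.0, 0.5])
blocks = rng.normal(0.0, true_std, size=(10000, 4))
var_est = estimate_coeff_variance(blocks)   # close to [16, 4, 1, 0.25]
```

Restricting `blocks` to sub-blocks from the center of the frame, as the text suggests, would only change which rows are passed to the estimator.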

The rate determiner 330 is configured to determine the spatial rate factor 332 based on the frame samples SF of the video input signal 120 identified by the sampler 310. In some examples, the spatial rate factor 332 defines a factor for determining the bit rate BR at each spatial layer L0-Li of the encoded bitstream 204. In some implementations, the spatial rate factor 332 is the ratio between the bit rate allocated to spatial layer L_{i-1} and the bit rate allocated to spatial layer L_i. In a two-layer example where the spatial rate factor between the spatial layers L0 and L1 equals 0.5 and the bit rate allocated to spatial layer L1 equals 500 kbps, the bit rate allocated to spatial layer L0 equals 250 kbps (i.e., 0.5 times 500 kbps). In these embodiments, the value of the spatial rate factor 332 is set based on the difference between the allocation factor AF of the base layer L0 and the average rate RL (e.g., the expression r_0 - R_L from equation (39)). Here, the allocation factor AF corresponds to the number of bits per transform coefficient of the base layer L0 (also referred to as r_0), and the average rate RL corresponds to the number of bits per transform coefficient over the more than one spatial layer L0-Li. In some configurations, experimental results for two spatial layers indicate that the spatial rate factor 332 corresponds to an expression based on r_0 - R_L. As a single parameter, the spatial rate factor 332 may allow the allocator 300 to easily adjust or modify the bit rate BR for each layer L of the encoded bitstream 204.

Although explained with respect to two spatial layers, the allocator 300 may apply the spatial rate factor 332 and/or the allocation factor AF to any number of spatial layers L0-Li. For example, the allocator 300 determines an allocation factor AF and/or a spatial rate factor 332 for each group of two spatial layers. To illustrate with three layers L0-L2, the allocator 300 first determines an allocation factor AF for the base layer L0 and the first layer L1, and then determines an allocation factor AF for the first layer L1 and the second layer L2. Each allocation factor AF can be used to determine a spatial rate factor 332: a first spatial rate factor 332 for the base layer L0 and the first layer L1, and a second spatial rate factor 332 for the first layer L1 and the second layer L2. With a spatial rate factor 332 for each group of two spatial layers, the allocator 300 may average the spatial rate factors 332 and/or the allocation factors AF (e.g., a weighted average, arithmetic average, geometric average, etc.) to generate an average spatial rate factor and/or an average allocation factor for any number of spatial layers L0-Li.

In some examples, the spatial rate factor 332 must satisfy a spatial rate factor threshold 334 (e.g., fall within a range of values) for the allocator 300 to help determine the bit rate BR based on the spatial rate factor 332. In some embodiments, a value satisfies the spatial rate factor threshold 334 when the value is within an interval of less than about 1.0 and greater than about 0.5. In other embodiments, the spatial rate factor threshold 334 corresponds to a narrower interval of values (e.g., 0.55-0.95, 0.65-0.85, 0.51-0.99, 0.65-1.0, 0.75-1.0, etc.) or a wider interval of values (e.g., 0.40-1.20, 0.35-0.95, 0.49-1.05, 0.42-1.17, 0.75-1.38, etc.). In some configurations, when the spatial rate factor 332 falls outside the interval of the spatial rate factor threshold 334, the allocator 300 adjusts the spatial rate factor 332 to satisfy the spatial rate factor threshold 334. For example, when the spatial rate factor threshold 334 spans the interval 0.45-0.95, a spatial rate factor 332 outside this interval is adjusted to the nearest endpoint of the interval (e.g., a spatial rate factor 332 of 0.3 is adjusted to 0.45, while a spatial rate factor 332 of 1.82 is adjusted to 0.95).
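The clamping behavior described above might be sketched as follows (hypothetical helper; the interval endpoints are the 0.45-0.95 example values from the text):

```python
def clamp_spatial_rate_factor(factor, lo=0.45, hi=0.95):
    """Clamp a spatial rate factor into the threshold interval so the
    allocator only uses values inside the accepted range."""
    return min(max(factor, lo), hi)

low_in = clamp_spatial_rate_factor(0.30)    # below the interval -> 0.45
high_in = clamp_spatial_rate_factor(1.82)   # above the interval -> 0.95
mid_in = clamp_spatial_rate_factor(0.70)    # already valid -> 0.70
```

Different threshold intervals from the text (e.g., 0.55-0.95) would simply be passed as different `lo`/`hi` arguments.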

Based on the determined spatial rate factor 332, the allocator 300 is configured to optimize video quality by reducing the distortion of the one or more spatial layers L0-Li subject to the total bit rate BRtot. To reduce distortion, the allocator 300 uses the spatial rate factor 332 calculated from the frame samples SF to influence (e.g., help the encoder 200 determine) the bit rate BR of each spatial layer L. For example, when the encoded bitstream 204 includes two spatial layers L0, L1, the allocator 300 determines the allocation factor AF used to determine the spatial rate factor 332 to generate a first bit rate BR1 for the first spatial layer L1 and a second bit rate BR0 for the base spatial layer L0, where BRtot corresponds to the total bit rate available for encoding the total bitstream (i.e., all spatial layers L0, L1).
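Given the ratio definition of the spatial rate factor 332 (BR0 = factor x BR1) and a fixed total BRtot, the two-layer split follows algebraically. A sketch (hypothetical helper, not the system's stated equations):

```python
def split_total_bitrate(total_kbps, spatial_rate_factor):
    """Split a total bit rate across two spatial layers where the spatial
    rate factor is the ratio BR0/BR1 of base-layer to enhancement-layer
    rate, so BR1 = total/(1 + factor) and BR0 = factor * BR1."""
    br1 = total_kbps / (1.0 + spatial_rate_factor)
    br0 = total_kbps - br1
    return br0, br1

# Matches the example in the text: factor 0.5, enhancement layer 500 kbps.
br0, br1 = split_total_bitrate(750.0, 0.5)   # -> (250.0, 500.0)
```

By construction, the two rates always sum to BRtot, so changing the factor only redistributes bits between the layers.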

Fig. 4 is an example of a method 400 for implementing the rate allocation system 100. At operation 402, the method 400 receives, at data processing hardware 510, transform coefficients 226 (e.g., non-quantized transform coefficients) corresponding to the video input signal 120. The video input signal 120 includes a plurality of spatial layers L0-Li, where the plurality of spatial layers L0-Li includes a base layer L0. At operation 404, the method 400 determines, by the data processing hardware 510, the spatial rate factor 332 based on frame samples SF from the video input signal 120. The spatial rate factor 332 defines a factor for bit rate allocation at each spatial layer L of the encoded bitstream 204 and is defined by the difference between the bit rate per transform coefficient of the base layer L0 and the average bit rate RL per transform coefficient over the plurality of spatial layers L0-Li. At operation 406, the method 400 reduces, by the data processing hardware 510, the distortion of the plurality of spatial layers L0-Li of the encoded bitstream 204 by allocating a bit rate BR to each spatial layer L based on the spatial rate factor 332 and the frame samples SF.

Fig. 5 is a schematic diagram of an example computing device 500 that may be used to implement the systems and methods described in this document (e.g., encoder 200 and/or allocator 300). Computing device 500 is intended to represent various forms of digital computers, such as laptops, desktops, workstations, personal digital assistants, servers, blade servers, mainframes, and other appropriate computers. The components shown herein, their connections and relationships, and their functions, are meant to be exemplary only, and are not meant to limit embodiments of the inventions described and/or claimed in this document.

Computing device 500 includes data processing hardware 510, memory hardware 520, a storage device 530, a high-speed interface/controller 540 connecting to the memory 520 and high-speed expansion ports 550, and a low-speed interface/controller 560 connecting to a low-speed bus 570 and the storage device 530. Each of the components 510, 520, 530, 540, 550, and 560 is interconnected using various buses and may be mounted on a common motherboard or in other manners as appropriate. The processor 510 can process instructions for execution within the computing device 500, including instructions stored in the memory 520 or on the storage device 530, to display graphical information for a graphical user interface (GUI) on an external input/output device, such as a display 580 coupled to the high-speed interface 540. In other implementations, multiple processors and/or multiple buses may be used, as appropriate, along with multiple memories and types of memory. Also, multiple computing devices 500 may be connected, with each device providing portions of the necessary operations (e.g., as a server bank, a group of blade servers, or a multi-processor system).

The memory 520 stores information non-transitorily within the computing device 500. The memory 520 may be a computer-readable medium, a volatile memory unit, or a non-volatile memory unit. The non-transitory memory 520 may be a physical device used to store programs (e.g., sequences of instructions) or data (e.g., program state information) on a temporary or permanent basis for use by the computing device 500. Examples of non-volatile memory include, but are not limited to, flash memory and read-only memory (ROM)/programmable read-only memory (PROM)/erasable programmable read-only memory (EPROM)/electrically erasable programmable read-only memory (EEPROM) (e.g., typically used for firmware, such as boot programs). Examples of volatile memory include, but are not limited to, random access memory (RAM), dynamic random access memory (DRAM), static random access memory (SRAM), phase change memory (PCM), and disks or tapes.

The storage device 530 is capable of providing mass storage for the computing device 500. In some implementations, the storage device 530 is a computer-readable medium. In various different implementations, the storage device 530 may be a floppy disk device, a hard disk device, an optical disk device, or a tape device, a flash memory or other similar solid state storage device, or an array of devices, including devices in a storage area network or other configurations. In further embodiments, a computer program product is tangibly embodied in an information carrier. The computer program product contains instructions that, when executed, perform one or more methods such as those described above. The information carrier is a computer-or machine-readable medium, such as the memory 520, the storage device 530, or memory on processor 510.

The high-speed controller 540 manages bandwidth-intensive operations for the computing device 500, while the low-speed controller 560 manages lower bandwidth-intensive operations. This allocation of duties is exemplary only. In some embodiments, high-speed controller 540 is coupled to memory 520, display 580 (e.g., through a graphics processor or accelerator), and high-speed expansion ports 550, which high-speed expansion ports 550 may accept various expansion cards (not shown). In some embodiments, low-speed controller 560 is coupled to storage device 530 and low-speed expansion port 590. The low-speed expansion port 590 can include various communication ports (e.g., USB, bluetooth, ethernet, wireless ethernet), and the low-speed expansion port 590 can be coupled (e.g., through a network adapter) to one or more input/output devices, such as a keyboard, a pointing device, a scanner, or a network device, such as a switch or router.

As shown, computing device 500 may be implemented in a number of different forms. For example, it may be implemented as a standard server 500a or multiple times in a group of such servers 500a, as a laptop computer 500b, or as part of a rack server system 500 c.

Various implementations of the systems and techniques described here can be realized in digital electronic and/or optical circuitry, integrated circuitry, specially designed ASICs (application specific integrated circuits), computer hardware, firmware, software, and/or combinations thereof. These various implementations can include implementation in one or more computer programs that are executable and/or interpretable on a programmable system including at least one programmable processor, which may be special or general purpose, coupled to receive data and instructions from, and to transmit data and instructions to, a storage system, at least one input device, and at least one output device.

These computer programs (also known as programs, software applications or code) include machine instructions for a programmable processor, and can be implemented in a high-level procedural and/or object-oriented programming language, and/or in assembly/machine language. As used herein, the terms "machine-readable medium" and "computer-readable medium" refer to any computer program product, non-transitory computer-readable medium, apparatus and/or device (e.g., magnetic discs, optical disks, memory, Programmable Logic Devices (PLDs)) used to provide machine instructions and/or data to a programmable processor, including a machine-readable medium that receives machine instructions as a machine-readable signal. The term "machine-readable signal" refers to any signal used to provide machine instructions and/or data to a programmable processor.

The processes and logic flows described in this specification can be performed by one or more programmable processors executing one or more computer programs to perform functions by operating on input data and generating output. The processes and logic flows can also be performed by special purpose logic circuitry, e.g., an FPGA (field programmable gate array) or an ASIC (application specific integrated circuit). Processors suitable for the execution of a computer program include, by way of example, both general and special purpose microprocessors, and any one or more processors of any kind of digital computer. Generally, a processor will receive instructions and data from a read-only memory or a random access memory or both. The essential elements of a computer are a processor for executing instructions and one or more memory devices for storing instructions and data. Generally, a computer will also include, or be operatively coupled to receive data from, transfer data to, or both, one or more mass storage devices for storing data, e.g., magnetic disks, magneto-optical disks, or optical disks. However, a computer need not have such devices. Computer-readable media suitable for storing computer program instructions and data include all forms of non-volatile memory, media, and memory devices, including by way of example semiconductor memory devices, e.g., EPROM, EEPROM, and flash memory devices; magnetic disks, e.g., internal hard disks or removable disks; magneto-optical disks; and CD-ROM and DVD-ROM disks. The processor and the memory can be supplemented by, or incorporated in, special purpose logic circuitry.

To provide for interaction with a user, one or more aspects of the invention can be implemented on a computer having a display device, e.g., a CRT (cathode ray tube) or LCD (liquid crystal display) monitor, or a touch screen, for displaying information to the user, and a keyboard and a pointing device, e.g., a mouse or a trackball, by which the user can provide input to the computer. Other types of devices can be used to provide for interaction with a user as well; for example, feedback provided to the user can be any form of sensory feedback, e.g., visual feedback, auditory feedback, or tactile feedback; and input from the user can be received in any form, including acoustic, speech, or tactile input. In addition, the computer can interact with the user by sending documents to and receiving documents from a device used by the user; for example, by sending web pages to a web browser on the user's client device in response to requests received from the web browser.

Various embodiments have been described herein. Nevertheless, it will be understood that various modifications may be made without departing from the spirit and scope of the invention. Accordingly, other implementations are within the scope of the following claims.
