Motion vector coding using residual block energy distribution
Reading note: The present technology, Motion vector coding using residual block energy distribution, was designed by 达克·何 on 2018-09-16. A current block is inter-predicted using a motion vector. Encoding may include inter-predicting the current block using the motion vector to determine a residual block, and determining a transform block of transform coefficients for the current block using the residual block. Encoding may also include determining a class of the transform block related to an energy distribution in the residual block using positions of non-zero coefficients of the transform coefficients, the class being one of a plurality of classes and each class of the plurality of classes being defined by a different energy distribution, determining a context for coding the motion vector using the class, the context including a probability distribution for entropy coding the motion vector, and encoding the motion vector using the context. The size of the transform block may also be used to determine the class.
1. A method for encoding, comprising:
determining a motion vector for inter-predicting a current block;
inter-predicting the current block using the motion vector to determine a residual block;
determining a transform block of transform coefficients for the current block using the residual block;
determining a class of the transform block related to an energy distribution in the residual block using positions of non-zero coefficients of the transform coefficients, the class being of a plurality of classes, and each class of the plurality of classes being defined by a different energy distribution;
determining a context for coding the motion vector using the class, the context comprising a probability distribution for entropy coding the motion vector; and
encoding the motion vector using the context.
2. The method of claim 1, wherein the class is determined further based on a size of the transform block.
3. The method of claim 1 or 2, wherein the class is selected from the group consisting of DC_ONLY, AC_LOW, AC_HIGH_X, AC_HIGH_Y, and AC_HIGH, and wherein:
the DC_ONLY class transform block is a transform block where the only non-zero coefficient is located at position (0,0),
the AC_LOW class transform block is a transform block in which there are no non-zero coefficients in columns greater than a horizontal threshold and no non-zero coefficients in rows greater than a vertical threshold,
the AC_HIGH_X class transform block is a transform block that has no non-zero coefficients in rows greater than the vertical threshold and at least one non-zero coefficient in at least one column greater than the horizontal threshold,
the AC_HIGH_Y class transform block is a transform block that has no non-zero coefficients in columns greater than the horizontal threshold and at least one non-zero coefficient in at least one row greater than the vertical threshold, and
the AC_HIGH class transform block is a transform block having a non-zero coefficient at a position (x, y) where x is greater than the horizontal threshold and y is greater than the vertical threshold.
4. The method of any of claims 1-3, wherein:
the motion vector includes a horizontal offset and a vertical offset, and
determining the class of the transform block comprises:
determining a first class of the transform block, the first class being used to determine a horizontal context for coding the horizontal offset, and
determining a second class of the transform block, the second class being used to determine a vertical context for coding the vertical offset.
5. The method of claim 4, wherein:
the first class is selected from the group consisting of DC_ONLY, AC_LOW_X, and AC_HIGH_X,
the second class is selected from the group consisting of DC_ONLY, AC_LOW_Y, and AC_HIGH_Y,
the DC_ONLY class transform block is a transform block where the only non-zero coefficient is located at position (0,0),
the AC_HIGH_X class transform block is a transform block that has no non-zero coefficients in rows greater than a vertical threshold and at least one non-zero coefficient in at least one column greater than a horizontal threshold,
the AC_HIGH_Y class transform block is a transform block that has no non-zero coefficients in columns greater than the horizontal threshold and at least one non-zero coefficient in at least one row greater than the vertical threshold,
the AC_LOW_X class transform block is a transform block in which there are no non-zero coefficients in columns greater than the horizontal threshold, and
the AC_LOW_Y class transform block is a transform block in which there are no non-zero coefficients in rows greater than the vertical threshold.
6. The method of claim 4, wherein:
the first class is selected from the group consisting of AC_LOW_X and AC_HIGH_X,
the second class is selected from the group consisting of AC_LOW_Y and AC_HIGH_Y,
the AC_HIGH_X class transform block is a transform block that has no non-zero coefficients in rows greater than a vertical threshold and at least one non-zero coefficient in at least one column greater than a horizontal threshold,
the AC_HIGH_Y class transform block is a transform block that has no non-zero coefficients in columns greater than the horizontal threshold and at least one non-zero coefficient in at least one row greater than the vertical threshold,
the AC_LOW_X class transform block is a transform block in which there are no non-zero coefficients in columns greater than the horizontal threshold, and
the AC_LOW_Y class transform block is a transform block in which there are no non-zero coefficients in rows greater than the vertical threshold.
7. An apparatus for encoding, comprising:
a memory;
a processor, the memory comprising instructions executable by the processor for:
determining a motion vector for inter-predicting a current block;
inter-predicting the current block using the motion vector to determine a residual block;
determining a transform block for the current block using the residual block;
determining a class of the transform block related to an energy distribution in the residual block using a position of a non-zero coefficient of the transform block and a size of the transform block, the class being of a plurality of classes, and each class of the plurality of classes being defined by a different energy distribution;
determining a context for coding the motion vector using the class; and
encoding the motion vector using the context in a compressed bitstream.
8. The apparatus of claim 7, wherein the instructions for determining the class of the transform block for the current block comprise instructions for:
determining a first class for a first transform block of the current block;
determining a second class for a second transform block of the current block;
selecting the first class when the first transform block has a higher number of non-zero coefficients than the second transform block, and
selecting the second class when the second transform block has a higher number of non-zero coefficients than the first transform block.
9. The apparatus of claim 7 or 8, wherein the class is selected from the group consisting of DC_ONLY, AC_LOW, AC_HIGH_X, AC_HIGH_Y, and AC_HIGH, wherein:
the DC_ONLY class transform block is a transform block where the only non-zero coefficient is located at position (0,0),
the AC_LOW class transform block is a transform block in which there are no non-zero coefficients in columns greater than a horizontal threshold and no non-zero coefficients in rows greater than a vertical threshold,
the AC_HIGH_X class transform block is a transform block that has no non-zero coefficients in rows greater than the vertical threshold and at least one non-zero coefficient in at least one column greater than the horizontal threshold,
the AC_HIGH_Y class transform block is a transform block that has no non-zero coefficients in columns greater than the horizontal threshold and at least one non-zero coefficient in at least one row greater than the vertical threshold, and
the AC_HIGH class transform block is a transform block having a non-zero coefficient at a position (x, y) where x is greater than the horizontal threshold and y is greater than the vertical threshold.
10. The apparatus of any of claims 7-9, wherein:
the motion vector includes a horizontal offset and a vertical offset, and
the instructions for determining the class of the transform block include instructions for:
determining a first class of the transform block, the first class being used to determine a horizontal context for coding the horizontal offset, and
determining a second class of the transform block, the second class being used to determine a vertical context for coding the vertical offset.
11. The apparatus of claim 10, wherein:
the first class is selected from the group consisting of DC_ONLY, AC_LOW_X, and AC_HIGH_X,
the second class is selected from the group consisting of DC_ONLY, AC_LOW_Y, and AC_HIGH_Y,
the DC_ONLY class transform block is a transform block where the only non-zero coefficient is located at position (0,0),
the AC_HIGH_X class transform block is a transform block that has no non-zero coefficients in rows greater than a vertical threshold and at least one non-zero coefficient in at least one column greater than a horizontal threshold,
the AC_HIGH_Y class transform block is a transform block that has no non-zero coefficients in columns greater than the horizontal threshold and at least one non-zero coefficient in at least one row greater than the vertical threshold,
the AC_LOW_X class transform block is a transform block in which there are no non-zero coefficients in columns greater than the horizontal threshold, and
the AC_LOW_Y class transform block is a transform block in which there are no non-zero coefficients in rows greater than the vertical threshold.
12. The apparatus of claim 10, wherein:
the first class is selected from the group consisting of AC_LOW_X and AC_HIGH_X,
the second class is selected from the group consisting of AC_LOW_Y and AC_HIGH_Y,
the AC_HIGH_X class transform block is a transform block that has no non-zero coefficients in rows greater than a vertical threshold and at least one non-zero coefficient in at least one column greater than a horizontal threshold,
the AC_HIGH_Y class transform block is a transform block that has no non-zero coefficients in columns greater than the horizontal threshold and at least one non-zero coefficient in at least one row greater than the vertical threshold,
the AC_LOW_X class transform block is a transform block in which there are no non-zero coefficients in columns greater than the horizontal threshold, and
the AC_LOW_Y class transform block is a transform block in which there are no non-zero coefficients in rows greater than the vertical threshold.
13. An apparatus for decoding, comprising:
a memory;
a processor, the memory comprising instructions executable by the processor for:
decoding a transform block for a current block from an encoded bitstream, the transform block corresponding to a residual block for the current block;
determining a class of the transform block related to an energy distribution in the residual block using positions of non-zero coefficients of the transform block, the class being of a plurality of classes, and each class of the plurality of classes being defined by a different energy distribution;
determining a context for decoding a motion vector for encoding the current block using the class; and
decoding the motion vector from the encoded bitstream using the context.
14. The apparatus of claim 13, wherein the instructions further include instructions for:
inter-predicting the current block using the motion vector to generate a prediction block;
determining the residual block using the transform block; and
reconstructing the current block by combining the residual block with the prediction block.
15. The apparatus of claim 13 or 14, wherein determining the class of the transform block comprises:
determining a first class for a first transform block of the current block;
determining a second class for a second transform block of the current block; and
selecting one of the first class and the second class as the class for the transform block based on which of the first transform block and the second transform block has a higher number of non-zero coefficients.
16. The apparatus of any of claims 13-15, wherein the class is determined further based on a size of the transform block.
17. The apparatus of any of claims 13-16, wherein the class is selected from the group consisting of DC_ONLY, AC_LOW, AC_HIGH_X, AC_HIGH_Y, and AC_HIGH, and wherein:
the DC_ONLY class transform block is a transform block where the only non-zero coefficient is located at position (0,0),
the AC_LOW class transform block is a transform block in which there are no non-zero coefficients in columns greater than a horizontal threshold and no non-zero coefficients in rows greater than a vertical threshold,
the AC_HIGH_X class transform block is a transform block that has no non-zero coefficients in rows greater than the vertical threshold and at least one non-zero coefficient in at least one column greater than the horizontal threshold,
the AC_HIGH_Y class transform block is a transform block that has no non-zero coefficients in columns greater than the horizontal threshold and at least one non-zero coefficient in at least one row greater than the vertical threshold, and
the AC_HIGH class transform block is a transform block having a non-zero coefficient at a position (x, y) where x is greater than the horizontal threshold and y is greater than the vertical threshold.
18. The apparatus of claim 13 or 14, wherein:
the motion vector includes a horizontal offset and a vertical offset, and
the instructions for determining the class of the transform block include instructions for:
determining a first class of the transform block, the first class being used to determine a horizontal context for coding the horizontal offset, and
determining a second class of the transform block, the second class being used to determine a vertical context for coding the vertical offset.
19. The apparatus of claim 18, wherein:
the first class is selected from the group consisting of DC_ONLY, AC_LOW_X, and AC_HIGH_X,
the second class is selected from the group consisting of DC_ONLY, AC_LOW_Y, and AC_HIGH_Y,
the DC_ONLY class transform block is a transform block where the only non-zero coefficient is located at position (0,0),
the AC_HIGH_X class transform block is a transform block that has no non-zero coefficients in rows greater than a vertical threshold and at least one non-zero coefficient in at least one column greater than a horizontal threshold,
the AC_HIGH_Y class transform block is a transform block that has no non-zero coefficients in columns greater than the horizontal threshold and at least one non-zero coefficient in at least one row greater than the vertical threshold,
the AC_LOW_X class transform block is a transform block in which there are no non-zero coefficients in columns greater than the horizontal threshold, and
the AC_LOW_Y class transform block is a transform block in which there are no non-zero coefficients in rows greater than the vertical threshold.
20. The apparatus of claim 18, wherein:
the first class is selected from the group consisting of AC_LOW_X and AC_HIGH_X,
the second class is selected from the group consisting of AC_LOW_Y and AC_HIGH_Y,
the AC_HIGH_X class transform block is a transform block that has no non-zero coefficients in rows greater than a vertical threshold and at least one non-zero coefficient in at least one column greater than a horizontal threshold,
the AC_HIGH_Y class transform block is a transform block that has no non-zero coefficients in columns greater than the horizontal threshold and at least one non-zero coefficient in at least one row greater than the vertical threshold,
the AC_LOW_X class transform block is a transform block in which there are no non-zero coefficients in columns greater than the horizontal threshold, and
the AC_LOW_Y class transform block is a transform block in which there are no non-zero coefficients in rows greater than the vertical threshold.
21. A method for decoding, comprising:
decoding a transform block for a current block from an encoded bitstream, the transform block corresponding to a residual block for the current block;
determining a class of the transform block related to an energy distribution in the residual block using positions of non-zero coefficients of the transform block, the class being of a plurality of classes, and each class of the plurality of classes being defined by a different energy distribution;
determining a context for decoding a motion vector for encoding the current block using the class;
decoding the motion vector from the encoded bitstream using the context; and
reconstructing the current block using the motion vector.
Background
Digital video streams may represent video using a series of frames or still images. Digital video may be used for a variety of applications including, for example, video conferencing, high-definition video entertainment, and sharing of user-generated video.
Techniques for compression use a reference frame and a motion vector to generate a prediction block corresponding to a current block to be encoded. A difference between the prediction block and the current block may be encoded instead of encoding the values of the current block itself, reducing the amount of data encoded.
Disclosure of Invention
The present disclosure relates generally to encoding and decoding video data, and more particularly to improved coding of motion vectors.
A method for encoding described herein includes determining a motion vector for inter-predicting a current block, inter-predicting the current block using the motion vector to determine a residual block, determining a transform block of transform coefficients for the current block using the residual block, determining a class of the transform block related to an energy distribution in the residual block, determining a context for coding the motion vector using the class, and encoding the motion vector using the context.
An apparatus for encoding according to aspects described herein includes a non-transitory storage medium or memory and a processor. The memory includes instructions executable by the processor to determine a motion vector for inter-predicting a current block, inter-predict the current block using the motion vector to determine a residual block, determine a transform block for the current block using the residual block, determine a class of the transform block related to an energy distribution in the residual block using a position of a non-zero coefficient of the transform block and a size of the transform block, the class being one of a plurality of classes and each class of the plurality of classes being defined by a different energy distribution, determine a context for encoding the motion vector using the class, and encode the motion vector using the context in a compressed bitstream.
An apparatus for decoding according to aspects described herein includes a memory and a processor. The memory includes instructions executable by the processor to decode a transform block for a current block from an encoded bitstream, the transform block corresponding to a residual block for the current block, determine a class of the transform block related to an energy distribution in the residual block using positions of non-zero coefficients of the transform block, the class being one of a plurality of classes and each class of the plurality of classes being defined by a different energy distribution, determine a context for decoding a motion vector for encoding the current block using the class, and decode the motion vector from the encoded bitstream using the context.
A method for decoding according to aspects taught herein includes decoding a transform block for a current block from an encoded bitstream, the transform block corresponding to a residual block for the current block, determining a class of the transform block related to an energy distribution in the residual block using positions of non-zero coefficients of the transform block, the class being one of a plurality of classes and each class of the plurality of classes being defined by a different energy distribution, determining a context for decoding a motion vector for encoding the current block using the class, decoding the motion vector from the encoded bitstream using the context, and reconstructing the current block using the motion vector.
These and other aspects of the disclosure are disclosed in the following detailed description of the embodiments, the appended claims and the accompanying drawings.
Drawings
The description herein makes reference to the accompanying drawings described below wherein like reference numerals refer to like parts throughout the several views unless otherwise specified.
Fig. 1 is a schematic diagram of a video encoding and decoding system.
Fig. 2 is a block diagram of an example of a computing device that may implement a transmitting station or a receiving station.
Fig. 3 is a schematic diagram of an example of a video stream to be encoded and subsequently decoded.
Fig. 4 is a block diagram of an encoder according to an embodiment of the present disclosure.
Fig. 5 is a block diagram of a decoder according to an embodiment of the present disclosure.
Fig. 6 is a schematic diagram of motion vectors representing full and sub-pixel motion according to an embodiment of the present disclosure.
Fig. 7 is a schematic diagram of a sub-pixel prediction block according to an embodiment of the present disclosure.
FIG. 8 is a schematic diagram of full and sub-pixel locations according to an embodiment of the present disclosure.
FIG. 9 is a flowchart of a process for encoding a current block of a video frame using inter prediction according to an embodiment of the present disclosure.
FIG. 10 is a flowchart of a process for decoding a current block of a video frame using inter prediction according to an embodiment of the present disclosure.
Detailed Description
The video stream may be encoded into a bitstream that involves compression (i.e., a compressed bitstream). The compressed bitstream may then be transmitted to a decoder, which may decode or decompress the compressed bitstream to prepare it for viewing or further processing.
As described further below, coding the residual block may include generating one or more transform blocks for the residual block.
Each motion vector used to generate a prediction block in an inter-prediction process refers to a frame (i.e., a reference frame) other than the current frame. The reference frame may be located before or after the current frame in a sequence (i.e., display order) of a video stream and may be a frame reconstructed before being used as a reference frame. A forward reference frame is a frame used for forward prediction with respect to the sequence, whereas a backward reference frame is a frame used for backward prediction with respect to the sequence. One or more forward and/or backward reference frames may be used to encode or decode a block.
The motion vector for the current block in motion compensated prediction may be encoded into and decoded from an encoded bitstream. The motion vector for the current block (i.e., the block being encoded) is described with respect to a co-located block in the reference frame. The motion vector describes an offset (i.e., displacement) in the horizontal direction (i.e., mv_x) and a displacement in the vertical direction (i.e., mv_y) from the co-located block in the reference frame. Thus, a motion vector may be characterized as a 3-tuple (f, mv_x, mv_y), where f indicates (e.g., is an index of) a reference frame, mv_x is an offset in the horizontal direction, and mv_y is an offset in the vertical direction. Thus, at least the offsets mv_x and mv_y are written (i.e., encoded) into and read (i.e., decoded) from the encoded bitstream.
For example, the reference motion vector may be one of the neighboring blocks' motion vectors.
In some cases, the prediction block that produces the best residual may not correspond to pixels in the reference frame. The best motion vector may point to a location between pixels of a block in the reference frame. Motion compensated prediction at the sub-pixel level may therefore be useful. Motion compensated prediction may involve the use of sub-pixel interpolation filters that generate filtered sub-pixel values at defined locations between whole pixels (also called integer pixels) along a row, a column, or both.
The same interpolation filters are used to generate sub-pixel prediction blocks for all blocks of the frame.
Coding a motion vector as used herein refers to both coding of the motion vector itself and differential coding of the motion vector. In either case, coding a motion vector includes coding a horizontal offset of the motion vector (i.e., mv_x) and coding a vertical offset of the motion vector (i.e., mv_y).
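The following is a minimal sketch of differential coding of the two offsets, assuming the reference motion vector is already known to both encoder and decoder. The names `MotionVector`, `mv_difference`, and `mv_reconstruct` are hypothetical and are not taken from this disclosure; entropy coding of the resulting differences is omitted.

```python
from typing import NamedTuple, Tuple


class MotionVector(NamedTuple):
    """Hypothetical container for the 3-tuple (f, mv_x, mv_y)."""
    ref_frame: int  # f: index of the reference frame
    mv_x: int       # horizontal offset
    mv_y: int       # vertical offset


def mv_difference(mv: MotionVector, ref_mv: MotionVector) -> Tuple[int, int]:
    """Encoder side: offsets relative to a reference motion vector."""
    return (mv.mv_x - ref_mv.mv_x, mv.mv_y - ref_mv.mv_y)


def mv_reconstruct(diff: Tuple[int, int], ref_mv: MotionVector,
                   ref_frame: int) -> MotionVector:
    """Decoder side: add the decoded differences back to the reference."""
    return MotionVector(ref_frame, ref_mv.mv_x + diff[0], ref_mv.mv_y + diff[1])
```

When the reference motion vector is a good predictor, the differences cluster near zero, which is what makes them cheaper to entropy code than the raw offsets.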
Coding the motion vector may include entropy coding a horizontal offset and a vertical offset of the motion vector. Thus, a context is determined for the motion vector, and a probability model corresponding to the context is used to code the motion vector.
Entropy coding is a technique for "lossless" coding that relies on a probability model that models the distribution of values that occur in an encoded video bitstream. Entropy coding can reduce the number of bits required to represent video data to near a theoretical minimum by using a probabilistic model based on measured or estimated distributions of values. In practice, the actual reduction in the number of bits required to represent the video data may be a function of the accuracy of the probability model, the number of bits used to perform the coding, and the computational accuracy of the fixed point arithmetic used to perform the coding.
The purpose of context modeling is to obtain probability distributions for subsequent entropy coding engines (such as arithmetic coding, Huffman coding, and other variable-length to variable-length coding engines). To achieve good compression performance, a large number of contexts may be required.
The probability distribution may be learned by the decoder and/or included in the header of the frame to be decoded.
Learning may mean that an entropy coding engine of the decoder may adapt a probability distribution (i.e., a probability model) of a context model based on decoded frames and/or decoded blocks. For example, a decoder may have an initial probability distribution available, which the decoder (e.g., an entropy coding engine of the decoder) may continuously update as the decoder decodes additional frames. The updating of the probability model may ensure that the initial probability distribution is updated to reflect the actual distribution in the decoded frame.
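As an illustration of this kind of learning, the sketch below adapts the probability of a binary symbol after each decoded value. The exponential update rule, the class name `AdaptiveBinaryModel`, and the `rate` parameter are assumptions chosen for brevity; the disclosure does not specify how the update is performed.

```python
class AdaptiveBinaryModel:
    """Hypothetical adaptive model for one context of a binary symbol."""

    def __init__(self, p_zero: float = 0.5, rate: float = 1 / 16):
        self.p_zero = p_zero  # current estimate of P(bit == 0)
        self.rate = rate      # assumed adaptation speed

    def update(self, bit: int) -> None:
        """Move the estimate a small step toward the observed bit."""
        target = 1.0 if bit == 0 else 0.0
        self.p_zero += self.rate * (target - self.p_zero)
```

Starting from the initial estimate of 0.5, repeatedly observing zeros moves `p_zero` toward 1, so the model drifts from the initial distribution toward the distribution actually seen in the decoded data.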
In a coding system that includes 3000 contexts and uses 8 bits to encode a probability distribution (coded as an integer value between 1 and 255), for example, 24,000 bits are added to the coded bitstream.
The efficiency of entropy coding may be directly related to the probability model. The model as used herein may be lossless (entropy) coding or may be a parameter in lossless (entropy) coding. The model may be any parameter or method that affects the probability estimation for entropy coding.
For example, when reconstructed pixel values (described further below) are not readily available, they are not used as context information for selecting a probability model for coding the motion vector.
Therefore, it is generally desirable to condition the decoding (and thus encoding) of motion vectors on readily available information, e.g., the inter-prediction modes and motion vectors of previously decoded blocks. The previously decoded block may be a block in the spatial neighborhood and/or temporal neighborhood of the current block being encoded. For example, where a block of a current frame is coded in raster scan order, the spatial neighborhood may include the top (i.e., above) and left neighboring blocks of the current block in the current frame. For example, the temporal neighborhood may include co-located blocks in previously coded frames (i.e., reference frames).
For example, one available inter-prediction mode may indicate that the motion vector of a block is 0, which may be referred to as ZEROMV mode, and another inter-prediction mode may indicate that the motion vector of a block is a reference motion vector, which may be referred to as REMMV mode.
The readily available information for coding the motion vector may also include whether the current block is a luminance block or a chrominance block, and the block size.
As mentioned above, filters with different frequency responses may be used to generate motion vectors at sub-pixel locations. Thus, and due to the use of these filters, reference blocks at different sub-pixel locations may have different characteristics in the transform domain. For example, a reference block at a sub-pixel location generated by a low pass filter is likely to have lower energy in the high frequency band than a reference block at a full pixel location. Since the residual block is the difference between the source block and the reference block, the energy distribution in the residual block is thus related to the energy distribution of the reference block.
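One way to summarize where the energy of a transform block sits is the class scheme recited in the claims (DC_ONLY, AC_LOW, AC_HIGH_X, AC_HIGH_Y, AC_HIGH). The sketch below follows those definitions using the positions of non-zero coefficients. Several details are assumptions made only for illustration: the default thresholds (one quarter of each block dimension), the treatment of an all-zero block as DC_ONLY, the fallback to AC_HIGH for mixed cases the claims do not list, and the `CONTEXT_INDEX` table. Coefficients are indexed as `coeffs[y][x]`, with x the column and y the row.

```python
from typing import List, Optional

# Hypothetical mapping from class to a context index for motion vector coding.
CONTEXT_INDEX = {"DC_ONLY": 0, "AC_LOW": 1, "AC_HIGH_X": 2,
                 "AC_HIGH_Y": 3, "AC_HIGH": 4}


def classify_transform_block(coeffs: List[List[int]],
                             th_x: Optional[int] = None,
                             th_y: Optional[int] = None) -> str:
    """Classify a transform block by the positions of its non-zero coefficients."""
    rows, cols = len(coeffs), len(coeffs[0])
    # Assumed size-dependent defaults; the disclosure gives no specific values.
    if th_x is None:
        th_x = cols // 4
    if th_y is None:
        th_y = rows // 4
    nonzero = [(x, y) for y, row in enumerate(coeffs)
               for x, c in enumerate(row) if c != 0]
    # An all-zero block is treated as DC_ONLY here (an assumption).
    if all(pos == (0, 0) for pos in nonzero):
        return "DC_ONLY"
    high_x = any(x > th_x for x, _ in nonzero)  # energy in high columns
    high_y = any(y > th_y for _, y in nonzero)  # energy in high rows
    if not high_x and not high_y:
        return "AC_LOW"
    if high_x and not high_y:
        return "AC_HIGH_X"
    if high_y and not high_x:
        return "AC_HIGH_Y"
    # Covers a coefficient with both x and y above threshold, and (as an
    # assumed fallback) mixed cases not spelled out in the claims.
    return "AC_HIGH"


def mv_context(coeffs: List[List[int]]) -> int:
    """Pick the (hypothetical) context index used to code the motion vector."""
    return CONTEXT_INDEX[classify_transform_block(coeffs)]
```

Because the decoder can run the same classification on the decoded transform block, it can arrive at the same context as the encoder without any extra signaling.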
The efficiency of entropy coding may be directly related to the probability model, which in turn is selected based on the context model. According to information theory, the entropy H(X) may be a measure of the number of bits required to code a variable X, and the conditional entropy H(X|Y) may be a measure of the number of bits required to code the variable X if the quantity Y is known. H(X) and H(X|Y) are related by the well-known property H(X|Y) ≤ H(X). That is, the conditional entropy H(X|Y) can never exceed H(X). If X represents a motion vector and Y represents energy distribution information contained in a transform block (or, equivalently, a residual), then it follows that the coding of the motion vector (i.e., X) can be improved by using the energy distribution information (i.e., Y). For example, energy distribution information contained in the transform block may be used as additional context information for coding the motion vector.
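The property H(X|Y) ≤ H(X) can be checked numerically on a small joint distribution. In the sketch below, the joint probabilities, the labels "small_mv" and "large_mv", and the helper names are made up purely for illustration; they are not measurements from any codec.

```python
import math
from collections import defaultdict


def entropy(p):
    """Shannon entropy in bits of a distribution given as {outcome: prob}."""
    return -sum(v * math.log2(v) for v in p.values() if v > 0)


def marginal(joint, axis):
    """Marginalize joint[(x, y)] onto axis 0 (X) or axis 1 (Y)."""
    m = defaultdict(float)
    for key, v in joint.items():
        m[key[axis]] += v
    return dict(m)


def conditional_entropy(joint):
    """H(X|Y) = H(X,Y) - H(Y) for joint[(x, y)] = probability."""
    return entropy(joint) - entropy(marginal(joint, 1))


# Illustrative joint distribution of motion vector magnitude (X)
# and transform block class (Y).
joint = {
    ("small_mv", "DC_ONLY"): 0.4, ("large_mv", "DC_ONLY"): 0.1,
    ("small_mv", "AC_HIGH"): 0.1, ("large_mv", "AC_HIGH"): 0.4,
}
h_x = entropy(marginal(joint, 0))          # bits to code X alone
h_x_given_y = conditional_entropy(joint)   # bits to code X when Y is known
```

With these illustrative numbers, H(X) is 1 bit while H(X|Y) is roughly 0.72 bits, so knowing the class reduces the expected cost of coding X. The gain shrinks to zero when X and Y are independent.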
Embodiments according to the present disclosure use energy distribution information available in a transform block to code a motion vector associated with a residual block. Compression performance may be improved using energy distribution information available in the transform block. Details of improved coding of motion vectors are initially described with reference to systems in which the teachings herein may be implemented.
Fig. 1 is a schematic diagram of a video encoding and
In examples, the receiving
Other implementations of the video encoding and
For example, receiving
Fig. 2 is a block diagram of an example of a
Although the disclosed embodiments may be practiced with one processor (e.g., CPU 202) as shown, advantages in speed and efficiency may be realized with more than one processor.
In embodiments the
Although FIG. 2 depicts the
FIG. 3 is a schematic diagram of an example of a
Whether or not the
Fig. 4 is a block diagram of an
The
When the
Next, still referring to FIG. 4, the prediction block may be subtracted from the current block at the intra/
The reconstruction path in fig. 4 (shown by dashed lines) may be used to ensure that
For example, a non-transform based encoder may quantize the residual signal directly without the
Fig. 5 is a block diagram of a
Similar to the reconstruction path of the
When the
Other filtering may be applied to the reconstructed block. In this example,
FIG. 6 is a schematic diagram of motion vectors representing full and sub-pixel motion in accordance with an embodiment of the present disclosure. In FIG. 6,
The
Fig. 7 is a schematic diagram of a sub-pixel prediction block according to an embodiment of the present disclosure. Fig. 7 includes a
In some cases, generating the prediction block may require only one interpolation operation along one of the x-axis and the y-axis. In other cases, a first interpolation operation that generates intermediate pixels is followed by a second interpolation operation that generates the pixels of the prediction block from the intermediate pixels. The first and second interpolation operations may be along the horizontal direction (i.e., along the x-axis) and the vertical direction (i.e., along the y-axis), respectively.
In some examples, interpolation filters such as Finite Impulse Response (FIR) filters are used to perform the interpolation process. The interpolation filters may include 6-tap filters, 8-tap filters, or filters of other sizes.
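A one-dimensional FIR interpolation of this kind can be sketched as follows. The tap values (1, -5, 20, 20, -5, 1) / 32 are an assumption: they are a widely used 6-tap half-pixel filter chosen only for illustration, and this disclosure does not specify filter coefficients. The function name and the edge-clamping behavior are likewise hypothetical.

```python
# Assumed taps (not from this disclosure): a common 6-tap half-pixel filter.
HALF_PEL_TAPS = (1, -5, 20, 20, -5, 1)  # taps sum to 32

def interpolate_half_pel(row, i):
    """Return the half-pixel sample between row[i] and row[i + 1].

    Samples outside the row are replaced by the nearest edge sample.
    The result is rounded and clipped to the 8-bit range [0, 255].
    """
    acc = 0
    for k, tap in enumerate(HALF_PEL_TAPS):
        j = min(max(i - 2 + k, 0), len(row) - 1)  # clamp index at the edges
        acc += tap * row[j]
    return min(max((acc + 16) >> 5, 0), 255)  # divide by 32 with rounding

row = [0, 10, 20, 30, 40, 50, 60, 70]
print(interpolate_half_pel(row, 3))  # sample halfway between 30 and 40
```

A two-dimensional sub-pixel position would apply such a filter first along one axis to produce intermediate pixels and then along the other axis, matching the two-step process described above.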
FIG. 8 is a schematic diagram of full and sub-pixel locations according to an embodiment of the present disclosure. In the example of fig. 8, a 6-tap filter is used. This means that the values of the sub-pixels or
In some embodiments, a set of interpolation filters may be designed for 1/16-pixel accuracy and include at least two of a bilinear filter, an 8-tap filter (EIGHTTAP), a sharp 8-tap filter (EIGHTTAP_SHARP), or a smooth 8-tap filter (EIGHTTAP_SMOOTH).
FIG. 9 is a flow diagram of a method or
At
The
The substitute reference frame may not be displayed.
For example, available space may store the second-to-last frame (i.e., the first frame before the last frame) and/or the third-to-last frame (i.e., the frame two frames before the last frame) as additional forward prediction reference frames (e.g., in addition to the LAST and GOLDEN reference frames). In some examples, a backward frame may be stored as an additional backward prediction reference frame (e.g., in addition to the ALTREF_FRAME reference frame). LAST, GOLDEN, ALTREF_FRAME, etc. may be referred to herein as reference frame identifiers.
In a motion search, a portion of the reference frame may be translated to a series of locations to form individual prediction blocks, which may be subtracted from the current block to form individual residuals.
As indicated above, the prediction block that produces the best residual may not correspond to pixels (i.e., integer pixels) in the reference frame. That is, the best motion vector may point to a location between pixels of a block located in the reference frame.
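An integer-pixel motion search of this kind can be sketched as an exhaustive search. The sketch below assumes the sum of absolute differences (SAD) as the cost of a residual and a square search window; both are illustrative choices rather than requirements of this disclosure, and all function names are hypothetical.

```python
def sad(block_a, block_b):
    """Sum of absolute differences between two equally sized blocks."""
    return sum(abs(a - b)
               for row_a, row_b in zip(block_a, block_b)
               for a, b in zip(row_a, row_b))

def extract(frame, top, left, height, width):
    """Copy a height x width block whose top-left corner is (top, left)."""
    return [row[left:left + width] for row in frame[top:top + height]]

def full_search(current, ref, top, left, search_range):
    """Try every integer displacement within +/- search_range.

    (top, left) is the position of the current block in its own frame.
    Returns ((dx, dy), cost) for the displacement with the lowest SAD.
    Assumes the zero displacement lies inside the reference frame.
    """
    height, width = len(current), len(current[0])
    best = None
    for dy in range(-search_range, search_range + 1):
        for dx in range(-search_range, search_range + 1):
            t, l = top + dy, left + dx
            if t < 0 or l < 0 or t + height > len(ref) or l + width > len(ref[0]):
                continue  # candidate block falls outside the reference frame
            cost = sad(current, extract(ref, t, l, height, width))
            if best is None or cost < best[0]:
                best = (cost, (dx, dy))
    return best[1], best[0]

# Toy reference frame with distinct sample values.
ref = [[r * 8 + c for c in range(8)] for r in range(8)]
current = extract(ref, 3, 4, 2, 2)  # content that sits at row 3, column 4
mv, cost = full_search(current, ref, top=2, left=2, search_range=2)
print(mv, cost)
```

A practical encoder would typically replace the exhaustive loop with a faster search pattern and refine the result at sub-pixel positions.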
At
At
Transform blocks may be organized into two-dimensional blocks. Let (x, y) denote a position in the transform block in the transform domain, and let c(x, y) denote the transform coefficient at position (x, y). Note that in the case of a decoder (such as described with respect to fig. 10), c(x, y) represents the decoded transform coefficient at position (x, y). By way of example, the transform coefficient at position (0, 0) may be referred to as the DC coefficient; the coefficients at any other location may be referred to as AC coefficients. Different transformation modes may result in different arrangements. In any case, transform coefficients having positive or negative values (i.e., non-zero values) are referred to as non-zero transform coefficients, or more simply non-zero coefficients.
The distribution of energy in the residual block may be represented by the distribution of non-zero coefficients within a transform block (e.g., a quantized transform block). Thus, determining the class of the transform block related to the energy distribution in the residual block may comprise determining the class of the transform block using the position of the non-zero coefficient within the transform block. Any number of categories may be available. For example, each category may represent a different energy distribution. The class may be represented by the number of non-zero coefficients located in different partitions of the transform block. The transform block may be partitioned in any manner to determine the class. The transform block may be partitioned differently to determine the class depending on a transform type used to generate the transform block.
Determining the class of a transform block may include determining that the block is in a second class in which a majority of the non-zero coefficients of the block are located in an upper left quadrant of the block and a majority of the remaining non-zero coefficients of the block are located in an upper right quadrant of the block.
In an example, a category is selected from a set comprising the categories DC_ONLY, AC_LOW, AC_HIGH_X, AC_HIGH_Y, and AC_HIGH. Selecting from a set is intended to mean selecting one of the possible values DC_ONLY, AC_LOW, AC_HIGH_X, AC_HIGH_Y, and AC_HIGH.
The class DC_ONLY may indicate that the transform block does not have non-zero AC coefficients. That is, the only non-zero coefficient of the transform block is the DC coefficient (i.e., the coefficient at position (0, 0)).
Thus, in an AC_LOW transform block, c(x, y) = 0 if x > T_x or y > T_y. The AC_LOW class can be roughly interpreted as having non-zero coefficients only in the upper-left portion of the transform block. The thresholds T_x and T_y are described further below.
The category AC_HIGH_X may indicate that the transform block has no non-zero coefficients in rows whose row number is greater than the vertical threshold (i.e., T_y) and at least one non-zero coefficient in at least one column whose column number is greater than the horizontal threshold (i.e., T_x).
The class AC_HIGH_Y may indicate that the transform block has no non-zero coefficients in columns whose column number is greater than the horizontal threshold (i.e., T_x) and at least one non-zero coefficient in at least one row whose row number is greater than the vertical threshold (i.e., T_y). That is, c(x, y) = 0 if x > T_x, and c(x, y) ≠ 0 at some (x, y) where y > T_y.
The AC_HIGH category may indicate that the transform block has a non-zero coefficient at some (x, y) where x > T_x and y > T_y. The AC_HIGH class may be roughly interpreted as the presence of a non-zero coefficient in the lower-right portion of the transform block.
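The five categories can be sketched as a single pass over the coefficient positions. This is a minimal illustration, not the claimed implementation. Two cases are not spelled out in the text and are handled here by assumption: an all-zero block is reported as DC_ONLY, and a block with non-zero coefficients both to the right of T_x and below T_y, but none in the lower-right region, is reported as AC_HIGH. The function name is hypothetical.

```python
def classify_transform_block(coeffs, t_x, t_y):
    """Classify a transform block from the positions of its non-zero
    coefficients. coeffs[y][x] holds c(x, y)."""
    has_ac = beyond_x = beyond_y = False
    for y, row in enumerate(coeffs):
        for x, c in enumerate(row):
            if c == 0:
                continue
            if x > 0 or y > 0:
                has_ac = True      # any non-zero coefficient other than DC
            if x > t_x:
                beyond_x = True    # non-zero coefficient in a high column
            if y > t_y:
                beyond_y = True    # non-zero coefficient in a high row
    if not has_ac:
        return "DC_ONLY"           # assumption: also covers all-zero blocks
    if not beyond_x and not beyond_y:
        return "AC_LOW"
    if beyond_x and not beyond_y:
        return "AC_HIGH_X"
    if beyond_y and not beyond_x:
        return "AC_HIGH_Y"
    return "AC_HIGH"               # assumption: also covers the mixed case

block = [
    [9, 2, 0, 4],
    [1, 0, 0, 0],
    [0, 0, 0, 0],
    [0, 0, 0, 0],
]
print(classify_transform_block(block, t_x=1, t_y=1))
```

Because only the positions of non-zero coefficients are inspected, the same function can run on the encoder side after quantization and on the decoder side after the coefficients are decoded.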
In an example, the horizontal threshold (i.e., T_x) and/or the vertical threshold (i.e., T_y) may be based on the size of the transform block.
The horizontal threshold T_x and the vertical threshold T_y may be based on the filter selected for sub-pixel interpolation. Because the frequency response of the selected filter affects the distribution of non-zero coefficients, the horizontal threshold T_x and the vertical threshold T_y may be set based on that frequency response. That is, the thresholds may be designed (e.g., set, selected, etc.) to correspond to the statistics of the selected filter used to generate the sub-pixel reference block.
In an example, where the prediction block is based on full pixel positions, as described above, the horizontal and vertical thresholds may be based on a linear relationship with the width and height of the transform block, respectively; whereas in the case of sub-pixel interpolation, the threshold may be selected based on the selected interpolation filter (i.e., based on the response characteristics of the selected filter).
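A size-based threshold with a linear relationship to the block dimensions could look like the sketch below. The proportionality factor of one half and the function name are assumptions made for illustration; this disclosure does not give specific values.

```python
def size_based_thresholds(width, height, fraction=0.5):
    """Return (T_x, T_y) as a linear function of the transform block size.

    fraction is a hypothetical proportionality factor: with the default,
    the first half of the columns and rows count as low frequency.
    """
    t_x = max(int(width * fraction) - 1, 0)
    t_y = max(int(height * fraction) - 1, 0)
    return t_x, t_y

print(size_based_thresholds(8, 8))   # square 8x8 transform block
print(size_based_thresholds(4, 16))  # rectangular 4x16 transform block
```

For sub-pixel prediction, a different fraction could be passed for each interpolation filter to reflect that filter's response characteristics; any such per-filter values would have to be derived from the filter statistics and are not given here.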
At
The correlation between the motion vector and the frequency domain characteristics may be used to encode the motion vector.
As described above, one or more of the inter-prediction modes of neighboring blocks, the motion vectors of neighboring blocks, the current block type (e.g., chroma or luminance), and/or the current block size may be used in conjunction with the class as context information for determining the context.
The context index may be used to retrieve a context model (e.g., a probability distribution) from a list of available context models. Combinations of values of the context information may be mapped to index values, and one or more combinations of these values may be mapped to the same index value. That is, one or more context information combinations may be mapped to the same context model.
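The many-to-one mapping from context information to a context model can be sketched as follows. The three probability distributions, the choice of block type as the second piece of context information, and the mapping rules are all hypothetical and serve only to show the lookup structure.

```python
# Hypothetical context models: each is a probability distribution over
# three motion vector symbols (illustration only).
CONTEXT_MODELS = [
    (0.70, 0.20, 0.10),  # index 0
    (0.50, 0.30, 0.20),  # index 1
    (0.40, 0.30, 0.30),  # index 2
]

def context_index(energy_class, block_type):
    """Map a (class, block type) combination to a context index.

    Several combinations deliberately share one index, so they also
    share one context model.
    """
    if energy_class == "DC_ONLY":
        return 0
    if block_type == "chroma" or energy_class == "AC_LOW":
        return 1
    return 2

def context_model(energy_class, block_type):
    """Retrieve the probability distribution for a combination."""
    return CONTEXT_MODELS[context_index(energy_class, block_type)]

print(context_model("AC_HIGH_X", "luma"))
```

Sharing one model across several combinations keeps the number of contexts small, which helps each model gather enough statistics to adapt.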
At
Thus, determining the context at
In an embodiment, the horizontal category may be selected from the set comprising the categories DC_ONLY, AC_LOW_X, and AC_HIGH_X, and the vertical category may be selected from the set comprising the categories DC_ONLY, AC_LOW_Y, and AC_HIGH_Y.
In another embodiment, the horizontal category may be selected from the set comprising the categories AC_LOW_X and AC_HIGH_X, and the vertical category may be selected from the set comprising the categories AC_LOW_Y and AC_HIGH_Y.
As described above, the class DC_ONLY may indicate that the transform block does not have a non-zero AC coefficient. That is, the only non-zero coefficient of the transform block is the DC coefficient (i.e., the coefficient at position (0, 0)).
The class AC_LOW_X may indicate that the transform block does not have non-zero coefficients at high-frequency locations. In an AC_LOW_X transform block, there are no non-zero coefficients in columns that are greater than the horizontal threshold (i.e., T_x). Therefore, in an AC_LOW_X transform block, c(x, y) = 0 if x > T_x. The AC_LOW_X class may be roughly interpreted as the absence of non-zero coefficients in the right part of the transform block.
The category AC_HIGH_X may indicate that the transform block has at least one non-zero coefficient in a column with a large column number. In an AC_HIGH_X transform block, there is at least one non-zero coefficient in a column greater than the horizontal threshold (i.e., T_x).
The class AC_LOW_Y may indicate that the transform block does not have non-zero coefficients at high-frequency locations. In an AC_LOW_Y transform block, there are no non-zero coefficients in rows that are greater than the vertical threshold (i.e., T_y). Therefore, in an AC_LOW_Y transform block, c(x, y) = 0 if y > T_y. The AC_LOW_Y class may be roughly interpreted as the absence of non-zero coefficients in the lower part of the transform block.
The category AC_HIGH_Y may indicate that the transform block has at least one non-zero coefficient in a row with a large row number. In an AC_HIGH_Y transform block, there is at least one non-zero coefficient in a row greater than the vertical threshold (i.e., T_y). Thus, in an AC_HIGH_Y transform block, c(x, y) ≠ 0 at some (x, y) where y > T_y.
The horizontal threshold (i.e., T_x) and the vertical threshold (i.e., T_y) may be as described above.
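Separate horizontal and vertical categories can be sketched in the same way. The sketch follows the embodiment whose sets include DC_ONLY; treating an all-zero block as DC_ONLY in both directions is an assumption, and the function name is hypothetical.

```python
def directional_classes(coeffs, t_x, t_y):
    """Return (horizontal_class, vertical_class) for a transform block.

    coeffs[y][x] holds c(x, y).
    """
    nonzero = [(x, y)
               for y, row in enumerate(coeffs)
               for x, c in enumerate(row) if c != 0]
    # No non-zero AC coefficient (assumption: includes all-zero blocks).
    if all(pos == (0, 0) for pos in nonzero):
        return "DC_ONLY", "DC_ONLY"
    horizontal = "AC_HIGH_X" if any(x > t_x for x, _ in nonzero) else "AC_LOW_X"
    vertical = "AC_HIGH_Y" if any(y > t_y for _, y in nonzero) else "AC_LOW_Y"
    return horizontal, vertical

block = [
    [7, 0, 0, 0],
    [3, 0, 0, 0],
    [0, 0, 0, 0],
    [2, 0, 0, 0],
]
print(directional_classes(block, t_x=1, t_y=1))
```

The horizontal category could then select the context for one component of the motion vector and the vertical category the context for the other, so that each component is conditioned on the energy distribution along its own axis.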
For example, a transform operation such as that performed by
In the case where a plurality of transform blocks correspond to the current block, determining a transform block for the current block at
In an embodiment, any one of the transform blocks may be selected. In another embodiment, the transform block with the highest AC component is selected.
In yet another embodiment, respective categories (such as described with respect to operation 906) are determined for each of the transform blocks. Note that the categories (as described above) may form a progression with respect to the positions of non-zero coefficients in the transform block. For example, in the set {DC_ONLY, AC_LOW_X, AC_HIGH_X}, an AC_LOW_X transform block includes more positions where non-zero coefficients may occur than a DC_ONLY transform block, and an AC_HIGH_X transform block includes more such positions than an AC_LOW_X transform block.
The respective categories may be determined for each of the transform blocks, and the category to be used in motion vector coding may be selected by using a voting process. An example voting process is as follows: each transform block casts one vote for its category, and the category that receives the most votes is selected. In another example, the vote cast by each transform block is weighted by a factor proportional to the transform block size, and the category that receives the highest weighted vote total is selected.
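Both voting variants can be sketched with a single tally. The sketch assumes the block area (width times height) as the size-proportional weight and breaks ties in favor of the category seen first; neither detail is specified in the text, and the function name is hypothetical.

```python
from collections import defaultdict

def vote_for_class(blocks, weighted=False):
    """Select a category by voting.

    blocks is a list of (category, width, height) tuples, one per
    transform block. With weighted=True, each vote counts in proportion
    to the block area (an assumed measure of transform block size).
    Ties go to the category that appeared first (an assumption).
    """
    tally = defaultdict(int)
    for category, width, height in blocks:
        tally[category] += width * height if weighted else 1
    return max(tally, key=tally.get)

blocks = [
    ("AC_LOW", 4, 4),
    ("AC_LOW", 4, 4),
    ("AC_HIGH", 16, 16),
]
print(vote_for_class(blocks))                 # plain majority vote
print(vote_for_class(blocks, weighted=True))  # size-weighted vote
```

In this toy input the two small blocks win the plain vote, while the single large block wins the weighted vote, which shows how weighting shifts influence toward the transform blocks that cover more of the current block.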
As described above, the transform coefficients (i.e., information in the transform block) are used to encode the motion vector. Thus, the
Although there may be a correlation (e.g., statistical correlation) between the residual block (and equivalently, the transform block) and the motion vector as described above, the correlation may not be exact (i.e., non-deterministic).
The encoding of the class may depend on the transform coefficients. For example, the number of non-zero coefficients in a transform block may be used to determine a context for coding the class. As another example, the transform type may be used to determine a context for coding the class. Conditioning the encoding of the class on the transform coefficients may limit the cost (i.e., in bits) associated with the encoding of the class.
The transform type may be a transform type used by the
FIG. 10 is a flow diagram of a method or
At
For example, binary token trees may be used to code the transform coefficients.
At
In an embodiment, and as similarly described with respect to process 900, determining a class may mean decoding the class from a compressed bitstream. The transform coefficients decoded at
As described with respect to fig. 9, the categories may be selected from a set including the categories DC_ONLY, AC_LOW, AC_HIGH_X, AC_HIGH_Y, and AC_HIGH.
At
At
Thus, determining the context at
In an embodiment, the horizontal category is selected from the set comprising the categories DC_ONLY, AC_LOW_X, and AC_HIGH_X, and the vertical category is selected from the set comprising the categories DC_ONLY, AC_LOW_Y, and AC_HIGH_Y.
In an embodiment, the horizontal category is selected from the set comprising the categories AC_LOW_X and AC_HIGH_X, and the vertical category is selected from the set comprising the categories AC_LOW_Y and AC_HIGH_Y.
As described above, in some cases, the current block may correspond to multiple transform blocks. Thus, in some implementations, determining the class of the transform block includes determining a first class for a first transform block of the current block, determining a second class for a second transform block of the current block, and selecting the one of the first class and the second class that corresponds to the higher class relative to the progression described above.
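Selecting the higher class can be sketched by ranking the classes by their position in the progression. The tuple below uses the example progression {DC_ONLY, AC_LOW_X, AC_HIGH_X} described earlier; the function name is hypothetical.

```python
# Example progression from lowest to highest class.
PROGRESSION = ("DC_ONLY", "AC_LOW_X", "AC_HIGH_X")

def highest_class(classes, progression=PROGRESSION):
    """Return the class that ranks highest in the given progression."""
    return max(classes, key=progression.index)

first_class, second_class = "AC_LOW_X", "DC_ONLY"
print(highest_class([first_class, second_class]))
```

The same function extends to more than two transform blocks, since it accepts any non-empty list of classes drawn from the progression.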
For simplicity of explanation, the
However, it should be understood that when those terms are used in the claims, encoding and decoding can mean compression, decompression, transformation, or any other processing or change to data.
As used in this application, the term "or" is intended to mean an inclusive "or" rather than an exclusive "or". That is, unless otherwise specified or clear from the context, "X includes A or B" is intended to mean any of the natural inclusive permutations. That is, if X includes A, X includes B, or X includes both A and B, then "X includes A or B" is satisfied under any of the foregoing instances.
Embodiments of sending
Additionally, in aspects, for example, transmitting
Transmitting
The computer-usable or computer-readable medium may be any apparatus that can, for example, tangibly embody, store, communicate, or transport the program for use by or in connection with any processor.
On the contrary, the invention is intended to cover various modifications and equivalent arrangements included within the scope of the appended claims, which scope is to be accorded the broadest interpretation so as to encompass all such modifications and equivalent structures as are permitted under the law.