Inter prediction in geometric partitioning with adaptive region number

Document No.: 246883    Publication date: 2021-11-12

Reading note: This technology, "Inter prediction in geometric partitioning with adaptive region number," was created by B. Furht, H. Kalva, and V. Adzic on 2020-01-28. Abstract: A decoder comprising circuitry configured to receive a bitstream; partition a current block into a first region, a second region, and a third region via a geometric partition mode; determine a motion vector associated with the first region, the second region, or the third region, the determining comprising constructing a list of candidates; and decode the current block using the determined motion vector. Related apparatus, systems, techniques, and articles are also described.

1. A decoder, the decoder comprising circuitry configured to:

receive a bitstream;

partition a current block into a first region, a second region, and a third region via a geometric partition mode;

determine a motion vector associated with a region of the first region, the second region, and the third region, wherein the determining further comprises constructing a candidate list; and

decode the current block using the determined motion vector.

2. The decoder of claim 1, wherein constructing the candidate list comprises evaluating a lower left candidate, a left candidate, an upper left candidate, an upper candidate, and an upper right candidate.

3. The decoder of claim 2, wherein:

the determined motion vector is for the first region;

the geometric partition mode includes a line segment between a first luma location and a second luma location;

the lower left candidate is positioned at a third luma location immediately to the left of and immediately below the second luma location;

the left candidate is positioned at a fourth luma location immediately to the left of the second luma location;

the upper left candidate is positioned at a fifth luma location immediately above and to the left of the upper-left-most luma location of the first region;

the upper candidate is positioned at a sixth luma location immediately above the first luma location; and

the upper right candidate is positioned at a seventh luma location immediately above and to the right of the first luma location.

4. The decoder of claim 2, wherein:

the determined motion vector is for the second region;

the geometric partition mode includes a line segment between a first luma location and a second luma location;

the lower left candidate is positioned at a third luma location immediately to the left of and below a leftmost luma location of the third region;

the left candidate is positioned at a fourth luma location immediately to the left of a leftmost lower luma location of the third region;

the upper left candidate is positioned at a fifth luma location immediately above the first luma location;

the upper candidate is positioned at a sixth luma location immediately above an upper-rightmost luma location of the second region; and

the upper right candidate is positioned at a seventh luma location immediately above and to the right of the upper-rightmost luma location of the second region.

5. The decoder of claim 2, wherein:

the determined motion vector is for the third region;

the geometric partition mode includes a line segment between a first luma location and a second luma location;

the lower left candidate is positioned at a third luma location immediately to the left of and below a leftmost luma location of the third region;

the left candidate is positioned at a fourth luma location immediately to the left of a leftmost lower luma location of the third region;

the upper left candidate is positioned at a fifth luma location that is co-located with the first region;

the upper candidate is positioned at a sixth luma location immediately to the left of the second luma location; and

the upper right candidate is positioned at a seventh luma location that is co-located with the second region.

6. The decoder of claim 2, wherein the determined motion vector is for the second region, and wherein the decoder is further configured to mark a candidate as unavailable in response to determining that the candidate is co-located with the third region.

7. The decoder of claim 2, wherein:

the determined motion vector is for the second region; and

the decoder is further configured to automatically mark the upper left candidate as unavailable in response to determining that the geometric partition mode is enabled.

8. The decoder of claim 2, wherein:

the determined motion vector is for the third region; and

the decoder is further configured to automatically mark the upper right candidate as unavailable in response to determining that the geometric partition mode is enabled.

9. The decoder of claim 2, wherein:

the determined motion vector is for the third region; and

the decoder is further configured to automatically mark the upper left candidate as unavailable in response to determining that the geometric partition mode is enabled.

10. The decoder of claim 1, further configured to determine that merge mode is enabled for the first region.

11. The decoder of claim 1, further configured to determine that advanced motion vector prediction mode is enabled for the first region.

12. The decoder of claim 1, further configured to reconstruct pixel data of the current block.

13. The decoder of claim 12, wherein the first region and the second region are non-rectangular.

14. The decoder of claim 1, wherein a geometric partition mode flag is included in the bitstream.

15. The decoder of claim 1, wherein partitioning the current block into the first region, the second region, and the third region via the geometric partition mode comprises partitioning the current block with a line segment characterized by a first luma location and a second luma location.

16. The decoder of claim 1, further configured to:

determine whether the geometric partition mode is enabled;

determine a first line segment of the current block; and

determine a second line segment of the current block;

wherein:

the decoding of the current block comprises reconstructing pixel data using the first line segment and the second line segment; and

the first line segment and the second line segment partition the current block into the first region, the second region, and the third region.

17. The decoder of claim 1, wherein the geometric partition mode is available for block sizes greater than or equal to 64 x 64 or 128 x 128 luma samples.

18. The decoder of claim 1, further comprising:

an entropy decoder processor configured to receive the bitstream and decode the bitstream into quantized coefficients;

an inverse quantization and inverse transform processor configured to process the quantized coefficients, including performing an inverse discrete cosine transform;

a deblocking filter;

a frame buffer; and

an intra prediction processor.

19. The decoder of claim 1, wherein the bitstream includes a parameter indicating whether the geometric partition mode is enabled for the current block.

20. The decoder of claim 1, wherein the current block forms part of a quadtree plus binary decision tree.

21. The decoder of claim 20, wherein the current block is a non-leaf node of the quadtree plus binary decision tree.

22. The decoder of claim 1, wherein the current block is a coding tree unit or a coding unit.

23. The decoder of claim 1, wherein the first region is a coding unit or a prediction unit.

24. A method, comprising:

receiving, by a decoder, a bitstream;

partitioning, by the decoder, a current block into a first region, a second region, and a third region through a geometric partition mode;

determining, by the decoder, a motion vector associated with a region of the first region, the second region, and the third region, the determining comprising constructing a candidate list; and

decoding, by the decoder, the current block using the determined motion vector.

25. The method of claim 24, wherein constructing the candidate list comprises evaluating a lower left candidate, a left candidate, an upper left candidate, an upper candidate, and an upper right candidate.

26. The method of claim 25, wherein:

the determined motion vector is for the first region;

the geometric partition mode includes a line segment between a first luma location and a second luma location;

the lower left candidate is positioned at a third luma location immediately to the left of and immediately below the second luma location;

the left candidate is positioned at a fourth luma location immediately to the left of the second luma location;

the upper left candidate is positioned at a fifth luma location immediately above and to the left of the upper-left-most luma location of the first region;

the upper candidate is positioned at a sixth luma location immediately above the first luma location; and

the upper right candidate is positioned at a seventh luma location immediately above and to the right of the first luma location.

27. The method of claim 25, wherein:

the determined motion vector is for the second region;

the geometric partition mode includes a line segment between a first luma location and a second luma location;

the lower left candidate is positioned at a third luma location immediately to the left of and below a leftmost luma location of the third region;

the left candidate is positioned at a fourth luma location immediately to the left of a leftmost lower luma location of the third region;

the upper left candidate is positioned at a fifth luma location immediately above the first luma location;

the upper candidate is positioned at a sixth luma location immediately above an upper-rightmost luma location of the second region; and

the upper right candidate is positioned at a seventh luma location immediately above and to the right of the upper-rightmost luma location of the second region.

28. The method of claim 25, wherein:

the determined motion vector is for the third region;

the geometric partition mode includes a line segment between a first luma location and a second luma location;

the lower left candidate is positioned at a third luma location immediately to the left of and below a leftmost luma location of the third region;

the left candidate is positioned at a fourth luma location immediately to the left of a leftmost lower luma location of the third region;

the upper left candidate is positioned at a fifth luma location that is co-located with the first region;

the upper candidate is positioned at a sixth luma location immediately to the left of the second luma location; and

the upper right candidate is positioned at a seventh luma location that is co-located with the second region.

29. The method of claim 25, wherein the determined motion vector is for the second region, the method further comprising marking a candidate as unavailable in response to determining that the candidate is co-located with the third region.

30. The method of claim 25, wherein the determined motion vector is for the second region, the method further comprising automatically marking the upper left candidate as unavailable in response to determining that the geometric partition mode is enabled.

31. The method of claim 25, wherein the determined motion vector is for the third region, the method further comprising automatically marking the upper right candidate as unavailable in response to determining that the geometric partition mode is enabled.

32. The method of claim 25, wherein the determined motion vector is for the third region, the method further comprising automatically marking the upper left candidate as unavailable in response to determining that the geometric partition mode is enabled.

33. The method of claim 24, further comprising determining that merge mode is enabled for the first region.

34. The method of claim 24, further comprising determining that an advanced motion vector prediction mode is enabled for the first region.

35. The method of claim 24, further comprising reconstructing pixel data of the current block.

36. The method of claim 24, wherein each of the first region and the second region is non-rectangular.

37. The method of claim 24, wherein a geometric partition mode flag is included in the bitstream.

38. The method of claim 24, wherein partitioning the current block into the first region, the second region, and the third region via the geometric partition mode comprises partitioning the current block with a line segment characterized by a first luma location and a second luma location.

39. The method of claim 24, further comprising:

determining whether the geometric partition mode is enabled;

determining a first line segment of the current block; and

determining a second line segment of the current block;

wherein:

the decoding of the current block comprises reconstructing pixel data using the first line segment and the second line segment; and

the first line segment and the second line segment partition the current block into the first region, the second region, and the third region.

40. The method of claim 24, wherein the geometric partitioning mode is available for block sizes greater than or equal to 64 x 64 or 128 x 128 luma samples.

41. The method of claim 24, wherein the decoder further comprises:

an entropy decoder processor configured to receive the bitstream and decode the bitstream into quantized coefficients;

an inverse quantization and inverse transform processor configured to process the quantized coefficients, including performing an inverse discrete cosine transform;

a deblocking filter;

a frame buffer; and

an intra prediction processor.

42. The method of claim 24, wherein the bitstream includes a parameter indicating whether the geometric partition mode is enabled for the current block.

43. The method of claim 24, wherein the current block forms part of a quadtree plus binary decision tree.

44. The method of claim 43, wherein the current block is a non-leaf node of the quadtree plus binary decision tree.

45. The method of claim 24, wherein the current block is a coding tree unit or a coding unit.

46. The method of claim 24, wherein the first region is a coding unit or a prediction unit.

Technical Field

The present invention relates generally to the field of video compression. In particular, the present invention relates to inter prediction in geometric partitioning with adaptive region number.

Background

A video codec may include electronic circuitry or software that compresses or decompresses digital video. It may convert uncompressed video into a compressed format and vice versa. In the context of video compression, the device that compresses video (and/or performs some of its functions) may be generally referred to as an encoder, and the device that decompresses video (and/or performs some of its functions) may be referred to as a decoder.

The format of the compressed data may conform to a standard video compression specification. Compression can be lossy in that the compressed video lacks some of the information present in the original video. A consequence of this is that the decompressed video may have a lower quality than the original uncompressed video, because there is insufficient information to accurately reconstruct the original video.

There may be complex relationships between video quality, the amount of data used to represent the video (e.g., as determined by the bit rate), the complexity of the encoding and decoding algorithms, susceptibility to data loss and errors, ease of editing, random access, end-to-end delay (e.g., delay time), and so forth.

Disclosure of Invention

In one aspect, a decoder includes circuitry configured to receive a bitstream; partition a current block into a first region, a second region, and a third region via a geometric partition mode; determine a motion vector associated with a region of the first region, the second region, and the third region, wherein the determining further comprises constructing a candidate list; and decode the current block using the determined motion vector.

In another aspect, a method includes receiving, by a decoder, a bitstream. The method includes partitioning, by the decoder, a current block into a first region, a second region, and a third region via a geometric partition mode. The method includes determining, by the decoder, a motion vector associated with a region of the first region, the second region, and the third region, the determining comprising constructing a candidate list. The method includes decoding, by the decoder, the current block using the determined motion vector.

The details of one or more variations of the subject matter described herein are set forth in the accompanying drawings and the description below. Other features and advantages of the subject matter described herein will be apparent from the description and drawings, and from the claims.

Drawings

For the purpose of illustrating the invention, the drawings show aspects of one or more embodiments of the invention. It should be understood, however, that the present invention is not limited to the precise arrangements and instrumentalities shown in the attached drawings, wherein:

FIG. 1 is a diagram illustrating an example of a geometrically partitioned residual block (e.g., a current block) in which three regions exist;

FIG. 2 is a diagram illustrating example positions of possible spatial motion vector candidates relative to a first region (region S0) of an example current block partitioned according to geometric partitions;

FIG. 3 shows FIG. 2 with annotations showing the luminance positions, including the upper-left-most luminance position of the first region S0;

FIG. 4 is a diagram illustrating exemplary locations of possible motion vector candidates relative to a second region S1 of an exemplary current block partitioned according to a geometric partition;

FIG. 5 shows FIG. 4 with annotations showing the luminance positions, including the leftmost lower luminance position of the third region S2 and the upper-rightmost luminance position of the second region S1;

FIG. 6 is a diagram illustrating example positions of possible spatial motion vector candidates relative to a third region (region S2) of an example current block partitioned according to geometric partitions;

FIG. 7 shows FIG. 6 with annotations showing the luminance positions, including the leftmost lower luminance position of the third region S2 and the upper-rightmost luminance position of the second region S1;

FIG. 8 is a system block diagram illustrating an example video encoder capable of encoding video using inter prediction with geometric partitioning by adaptive region number;

FIG. 9 is a process flow diagram illustrating an example process for encoding video with geometric partitioning and inter-prediction with adaptive region numbers in accordance with aspects of the present subject matter that may reduce encoding complexity while increasing compression efficiency;

FIG. 10 is a system block diagram illustrating an example decoder capable of decoding a bitstream using inter prediction and geometric partitioning with adaptive region numbers that may improve the complexity and processing performance of video encoding and decoding;

FIG. 11 is a process flow diagram illustrating an example process for decoding a bitstream using inter prediction and geometric partitioning with adaptive region numbers that may improve the complexity and processing performance of video encoding and decoding; and

FIG. 12 is a block diagram of a computing system that may be used to implement any one or more of the methods disclosed herein and any one or more portions thereof.

The figures are not necessarily to scale and may be shown by broken lines, schematic representations and fragmentary views. In certain instances, details that are not necessary for an understanding of the embodiments or that render other details difficult to perceive may have been omitted. Like reference symbols in the various drawings indicate like elements.

Detailed Description

Some embodiments of the present subject matter include performing inter prediction with regions that have been partitioned using a geometric partitioning mode with an adaptive number of regions, in which a rectangular block may be partitioned into three or more non-rectangular regions. Performing inter prediction on such non-rectangular regions may allow the partitions to follow object boundaries more closely, resulting in lower motion compensated prediction error, a smaller residual, and thus improved compression efficiency. During inter prediction, motion compensation may be performed using motion vectors predicted for blocks (e.g., coding units, prediction units, etc.) determined according to the geometric partition mode. A motion vector may be predicted using Advanced Motion Vector Prediction (AMVP) and/or merge mode, in which the motion vector is selected from a list of motion vector candidates without encoding a motion vector difference.

The current subject matter may be applied to larger blocks, such as blocks having a size of 128 x 128 or 64 x 64. In some implementations, geometric partitioning can involve partitioning a current block into an adaptive number of regions, such as three or more regions for a given current block, and motion information may be determined for each region.

Motion compensation may include an approach to predicting a video frame, or a portion thereof, given a reference frame (such as a previous and/or a future frame) by accounting for motion of the camera and/or of objects in the video. Motion compensation may be used in the encoding and decoding of video data for video compression, for example in encoding and decoding using standards such as Moving Picture Experts Group (MPEG)-2 and Advanced Video Coding (AVC, also known as H.264). Motion compensation may describe a picture in terms of the transformation of a reference picture to the current picture. The reference picture may be previous in time or from the future when compared to the current picture. Compression efficiency may improve when pictures can be accurately synthesized from previously transmitted and/or stored pictures.
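To make the idea concrete, the following is a minimal Python sketch of integer-pixel motion compensation. It is illustrative only: the function names, the use of NumPy, and the absence of sub-pixel interpolation and boundary handling are assumptions, not details taken from the source.

```python
import numpy as np

def motion_compensate(reference, x, y, width, height, mv_x, mv_y):
    """Return a prediction block copied from `reference` displaced by the motion vector."""
    ref_x, ref_y = x + mv_x, y + mv_y
    return reference[ref_y:ref_y + height, ref_x:ref_x + width].copy()

# Example: predict a 4x4 block at (8, 8) of the current frame with motion vector (-2, 1).
reference_frame = np.arange(64 * 64, dtype=np.int32).reshape(64, 64)
prediction = motion_compensate(reference_frame, x=8, y=8, width=4, height=4, mv_x=-2, mv_y=1)
current_block = reference_frame[8:12, 8:12]        # stand-in for the block being coded
residual = current_block - prediction              # the residual that would be transform coded
```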

Block partitioning may refer to a method used in video coding to find regions of similar motion. Some form of block partitioning is found in video codec standards, including MPEG-2, H.264 (also referred to as AVC or MPEG-4 Part 10), and H.265 (also referred to as High Efficiency Video Coding (HEVC)). In an example block partitioning approach, non-overlapping blocks of a video frame may be partitioned into rectangular sub-blocks to find block partitions containing pixels with similar motion. This approach works well when all pixels of a block partition have similar motion. The motion of pixels in a block may be determined relative to a previously encoded frame.

Motion vector prediction may be efficiently implemented in geometric partitioning with an adaptive number of regions. In more detail, geometric partitioning with an adaptive number of regions may include techniques for video encoding and decoding in which a rectangular block is further divided into two or more regions, which may be non-rectangular. For example, FIG. 1 is a diagram illustrating an example of a residual block (e.g., a current block) 100 having a geometric partition in which there are three regions S0, S1, and S2. The current block 100 may have a width of M pixels and a height of N pixels, denoted as M x N pixels, such as 64 x 64 or 128 x 128. The current block may be geometrically partitioned according to two line segments (P1P2 and P3P4), which divide the current block into the three regions S0, S1, and S2. When the pixels in region S0 have similar motion, a motion vector may describe the motion of all pixels in that region. As described more fully below, the corresponding motion vector may be determined according to either AMVP mode or merge mode, and may be used to compress region S0. Similarly, when the pixels in region S1 have similar motion, an associated motion vector may describe the motion of the pixels in region S1, and when the pixels in region S2 have similar motion, an associated motion vector may describe the motion of the pixels in region S2. Such geometric partitions may be signaled to a receiver (e.g., a decoder) in the video bitstream by encoding the positions P1, P2, P3, and P4 (or representations of these positions, such as polar coordinates, indices of predetermined templates, or other representations of the partition).
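As an illustration of how two line segments could induce three regions, the Python sketch below classifies each luma sample by which side of each (extended) segment it falls on. The half-plane convention, the helper names, and the example segment coordinates are assumptions rather than the source's normative definition.

```python
def line_side(px, py, ax, ay, bx, by):
    """Signed area test: positive and negative values distinguish the two sides of line A->B."""
    return (bx - ax) * (py - ay) - (by - ay) * (px - ax)

def classify_samples(M, N, p1, p2, p3, p4):
    """Assign every sample of an M x N block to region index 0 (S0), 1 (S1), or 2 (S2)."""
    regions = [[0] * M for _ in range(N)]
    for y in range(N):
        for x in range(M):
            if line_side(x, y, *p1, *p2) < 0:
                regions[y][x] = 0      # one half-plane of line P1P2 (sign depends on point order)
            elif line_side(x, y, *p3, *p4) < 0:
                regions[y][x] = 1      # between the two extended segments
            else:
                regions[y][x] = 2      # beyond line P3P4
    return regions

# Example: a 64 x 64 block cut by two roughly diagonal segments.
region_map = classify_samples(64, 64, p1=(0, 20), p2=(40, 0), p3=(0, 50), p4=(63, 30))
print(sum(row.count(0) for row in region_map), "samples fall in region S0")
```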

With continued reference to FIG. 1, when encoding video data using geometric partitioning at the pixel level, the line segment P1P2 (or, more specifically, the points P1 and P2) that best divides the block must be determined. The number of possible combinations of points P1 and P2 depends on the block width M and height N: for a block of size M x N, there are (M-1) x (N-1) x 3 possible partitions. Identifying the correct partition by evaluating motion estimates for all possible partitions can therefore be a computationally expensive task; this may increase the amount of time and/or processing power required to encode video compared with encoding using only rectangular partitions (e.g., without geometric partitioning at the pixel level). The pixels that make up the best or correct partition may be determined according to a metric, which may vary from implementation to implementation.
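As a worked example of the count above (not a figure taken from the source), a 64 x 64 block yields (64-1) x (64-1) x 3 = 63 x 63 x 3 = 11,907 candidate pixel-level partitions, and a 128 x 128 block yields 127 x 127 x 3 = 48,387.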

In some embodiments, and still referring to FIG. 1, partitioning occurs iteratively; that is, a first partition forming two regions may be determined (e.g., by determining line P1P2 and the associated regions), and one of those regions may then be further partitioned. For example, the partitioning described with reference to FIG. 1 may be performed to partition a block into two regions, and one of those regions may be further partitioned (e.g., to form new region S1 and region S2). The process may continue partitioning at the block level until a stopping criterion is reached.

With continued reference to FIG. 1, inter prediction may be performed using the geometrically partitioned regions. The motion vectors used for motion compensation may be derived using AMVP or merge mode. In AMVP, motion vector prediction is performed by signaling an index into a motion vector candidate list, and a motion vector difference (e.g., a residual) is encoded and included in the bitstream. In merge mode, the motion vector is selected from a motion vector candidate list without encoding a motion vector difference, allowing the current block to adopt the motion information of another, previously decoded block. In both AMVP and merge modes, the candidate list may be constructed by both the encoder and the decoder, and the index into the candidate list is signaled in the bitstream.
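The following sketch contrasts the two ways a motion vector can be recovered from a candidate list. The function and variable names are illustrative assumptions, and the candidate values are made up.

```python
def merge_mv(candidate_list, merge_index):
    """Merge mode: adopt the indexed candidate's motion vector; no difference is coded."""
    return candidate_list[merge_index]

def amvp_mv(candidate_list, mvp_index, mvd):
    """AMVP: the signaled index selects a predictor and a coded difference refines it."""
    mvp_x, mvp_y = candidate_list[mvp_index]
    return (mvp_x + mvd[0], mvp_y + mvd[1])

candidates = [(3, -1), (2, 0), (0, 4)]                      # hypothetical candidate motion vectors
mv_region0 = merge_mv(candidates, merge_index=1)            # -> (2, 0)
mv_region1 = amvp_mv(candidates, mvp_index=0, mvd=(1, 2))   # -> (4, 1)
```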

FIG. 2 is a diagram illustrating non-limiting example locations of potential spatial motion vector candidates relative to the first region (region S0) of an example current block 200 partitioned according to geometric partitioning. The potential spatial motion vector candidates may be considered during AMVP mode or merge mode when building a motion vector candidate list. The current block 200 may be partitioned into three regions S0, S1, and S2 by straight lines between points P0, P1 and P2, P3, respectively. Each of region S0, region S1, and region S2 may be uni-directionally or bi-directionally predicted. The spatial candidates of the first region (region S0), as shown in FIG. 2, may include a lower left candidate A0, a left candidate A1, an upper left candidate B2, an upper candidate B1, and an upper right candidate B0.

Still referring to FIG. 2, as shown and in some embodiments, each location (A0, A1, B2, B1, and B0) may represent a block at the respective location. For example, the upper left candidate B2 may be a block residing at a position immediately to the left of and immediately above region S0; for example, if the top left corner luminance position of S0 is (0, 0), then the upper left candidate B2 may reside at position (-1, -1). The lower left candidate A0 may be positioned immediately to the left of and below P1; for example, if the luminance position of P1 is (P1x, P1y), the lower left candidate A0 may reside at position (P1x-1, P1y+1). The left candidate A1 may be positioned immediately to the left of P1; for example, the left candidate A1 may reside at position (P1x-1, P1y). The upper candidate B1 may be positioned immediately above P0; for example, if the luminance position of P0 is (P0x, P0y), the upper candidate B1 may be located at (P0x, P0y-1). The upper right candidate B0 may be positioned immediately above and to the right of P0; for example, the upper right candidate B0 may reside at position (P0x+1, P0y-1). Other locations are possible, as will be apparent to those skilled in the art upon review of the entirety of this disclosure. FIG. 3 shows FIG. 2 with annotations showing the luminance positions, including the upper-left-most luminance position of the first region S0.
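The example coordinates above can be collected into a small helper; this is purely illustrative and assumes, as in the example, that the top left corner of S0 is at (0, 0).

```python
def s0_candidate_positions(p0, p1):
    """p0 = (P0x, P0y) on the top edge and p1 = (P1x, P1y) on the left edge of the block."""
    p0x, p0y = p0
    p1x, p1y = p1
    return {
        "B2": (-1, -1),             # immediately left of and above S0's top-left corner
        "A0": (p1x - 1, p1y + 1),   # immediately left of and below P1
        "A1": (p1x - 1, p1y),       # immediately left of P1
        "B1": (p0x, p0y - 1),       # immediately above P0
        "B0": (p0x + 1, p0y - 1),   # immediately above and to the right of P0
    }

print(s0_candidate_positions(p0=(24, 0), p1=(0, 24)))
```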

In some embodiments, and still referring to FIG. 3, when building the candidate list for region S0, some potential candidates may be automatically marked as unavailable and removed from the candidate list, because when geometric partitioning is present, the partitioning has likely been performed to separate areas (or objects) within the frame that have different motion information. It may therefore be inferred that the blocks associated with these candidates represent another object with different motion, and these candidates may be automatically marked as unavailable (e.g., removed from the candidate list without further consideration). In the example shown above with reference to FIG. 2, with respect to region S0, the lower left candidate A0 may be automatically marked as unavailable because region S0 likely does not share motion information with the block located at the lower left candidate A0. Similarly, with respect to region S0, the upper right candidate B0 may be automatically marked as unavailable because region S0 likely does not share motion information with the block located at the upper right candidate B0. In some embodiments, whether the lower left candidate A0 and/or the upper right candidate B0 are likely to share motion information may be determined by evaluating the line segment P0P1 (or the points P0, P1), for example by determining the line segment P0P1, extending the line segment into the lower left candidate A0 block and/or the upper right candidate B0 block, and determining whether the lower left candidate A0 and/or the upper right candidate B0 reside on the same side of the extended line segment as the first region S0.
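One plausible realization of that side-of-line test is sketched below. The sign convention, the choice of an interior point of S0, and the decision to treat points falling exactly on the extended line as unavailable are assumptions for illustration.

```python
def same_side_as_s0(candidate_xy, s0_point_xy, p0, p1):
    """True if the candidate and an interior point of S0 lie on the same side of line P0P1."""
    def side(p):
        return (p1[0] - p0[0]) * (p[1] - p0[1]) - (p1[1] - p0[1]) * (p[0] - p0[0])
    s_cand, s_ref = side(candidate_xy), side(s0_point_xy)
    return s_cand != 0 and (s_cand > 0) == (s_ref > 0)

# With P0 = (40, 0) on the top edge and P1 = (0, 20) on the left edge:
p0, p1 = (40, 0), (0, 20)
print(same_side_as_s0((-1, 21), (0, 0), p0, p1))   # A0, just past the extended line -> False
print(same_side_as_s0((-1, 20), (0, 0), p0, p1))   # A1, on S0's side of the line    -> True
```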

FIG. 4 is a diagram illustrating non-limiting example locations of potential spatial motion vector candidates relative to the second region (region S1) of an example current block 400 partitioned according to geometric partitioning. The potential spatial motion vector candidates may be considered during AMVP mode or merge mode when building a motion vector candidate list. The current block 400 may have been partitioned into three regions S0, S1, and S2 by straight lines between points P0, P1 and P2, P3, respectively. Each of region S0, region S1, and region S2 may be uni-directionally or bi-directionally predicted. Non-limiting examples of spatial candidates of the second region (region S1) are shown in FIG. 4 and include a lower left candidate A0, a left candidate A1, an upper left candidate B2, an upper candidate B1, and an upper right candidate B0.

As shown, and still referring to FIG. 4, each location (A0, A1, B2, B1, and B0) may represent a block at the respective location. For example, the upper left candidate B2 may be a block that resides at a luminance position immediately to the left of and immediately above the upper-left-most position of region S1; for example, if the upper-left-most luminance position of S1 is (P0x+1, P0y), adjacent to P0, then the upper left candidate B2 may reside at position (P0x, P0y-1). The lower left candidate A0 may be located immediately below the leftmost lower position of the third region (region S2); for example, if the leftmost lower position of the third region (region S2) is located at (0, N-1), the lower left candidate A0 may reside at position (0, N). The left candidate A1 may be located immediately to the left of the leftmost lower position of the third region (region S2); for example, the left candidate A1 may reside at position (0, N-1). The upper candidate B1 may be located immediately above the upper-rightmost position of region S1; for example, if the upper-rightmost position of region S1 is located at (M-1, 0), then B1 may reside at position (M-1, -1). The upper right candidate B0 may be located immediately above and to the right of the upper-rightmost position of region S1; for example, the upper right candidate B0 may reside at position (M, -1). FIG. 5 shows FIG. 4 with annotations showing the luminance positions, including the leftmost lower luminance position of the third region S2 and the upper-rightmost luminance position of the second region S1.

In some embodiments, and still referring to FIG. 5, when building the candidate list for region S1, some potential candidates may be automatically marked as unavailable and removed from the candidate list, since, where geometric partitioning is present, the partitioning has likely been performed to separate areas (or objects) within the frame that have different motion information. It may therefore be inferred that the blocks associated with these candidates may represent another object with different motion, and these candidates may be automatically marked as unavailable (e.g., removed from the candidate list without further consideration). In one non-limiting example shown above with reference to FIG. 4, with respect to region S1, the upper left candidate B2 may be automatically marked as unavailable because region S1 likely does not share motion information with the block located at the upper left candidate B2. Similarly, in some embodiments, the left candidate A1 may be automatically marked as unavailable for region S1 because region S1 likely does not share motion information with the block located at the left candidate A1 (which may be the third region S2). Similarly, in some embodiments, the lower left candidate A0 may be automatically marked as unavailable for region S1 because region S1 likely does not share motion information with the block located at the lower left candidate A0 (which may be below the third region S2).

FIG. 6 is a diagram illustrating example locations of potential spatial motion vector candidates relative to the third region (region S2) of an example current block 600 partitioned according to geometric partitioning. The potential spatial motion vector candidates may be considered during AMVP mode or merge mode when building a motion vector candidate list. The current block 600 may be partitioned into three regions S0, S1, and S2 by straight lines between points P0, P1 and P2, P3, respectively. Each of region S0, region S1, and region S2 may be uni-directionally or bi-directionally predicted. Non-limiting examples of spatial candidates of the third region (region S2) are shown in FIG. 6 and may include a lower left candidate A0, a left candidate A1, an upper left candidate B2, an upper candidate B1, and an upper right candidate B0.

As shown, and still referring to FIG. 6, each location (A0, A1, B2, B1, and B0) may represent a block at the respective location. For example, the upper left candidate may be a block that resides at a luminance position above and to the left of region S2; for example, the upper left candidate B2 may be the first region S0, and if the upper-left-most position of S0 is located at (0, 0), the upper left candidate B2 may be located at (0, 0). The lower left candidate A0 may be located immediately to the left of and immediately below the leftmost lower position of region S2; for example, if the leftmost lower position of region S2 is located at (0, N-1), then the lower left candidate A0 may reside at (-1, N). The left candidate A1 may be located immediately to the left of the leftmost lower position of region S2; for example, the left candidate A1 may reside at (-1, N-1). The upper candidate B1 may be located above and to the left of region S2, adjacent to point P1; for example, if P1 is located at (P1x, P1y), then the upper candidate B1 may be located at (P1x-1, P1y). The upper right candidate B0 may be a block that resides at a luminance position above and to the right of region S2; for example, the upper right candidate B0 may be the second region S1, and may be located at the upper-rightmost position of S1, which may reside at (M-1, 0). FIG. 7 shows FIG. 6 with annotations showing the luminance positions, including the leftmost lower luminance position of the third region S2 and the upper-rightmost luminance position of the second region S1.
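For region S2, the example coordinates above can likewise be gathered into a helper. Again this is illustrative only and uses the image convention that x grows to the right and y grows downward; the function name and parameters are assumptions.

```python
def s2_candidate_positions(M, N, p1):
    """p1 = (P1x, P1y): the point on the left block boundary between S0 and S2 (per FIG. 6)."""
    p1x, p1y = p1
    return {
        "B2": (0, 0),            # co-located with the first region S0
        "A0": (-1, N),           # immediately left of and below S2's leftmost lower sample
        "A1": (-1, N - 1),       # immediately left of S2's leftmost lower sample
        "B1": (p1x - 1, p1y),    # adjacent to and left of P1, above region S2
        "B0": (M - 1, 0),        # co-located with S1's upper-rightmost sample
    }

print(s2_candidate_positions(M=64, N=64, p1=(0, 24)))
```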

In some embodiments, and still referring to FIG. 7, when building the candidate list for region S2, some potential candidates may be automatically marked as unavailable and removed from the candidate list, since, where geometric partitioning is present, the partitioning has likely been performed to separate areas (or objects) within the frame that have different motion information. It may therefore be inferred that the blocks associated with these candidates may represent another object with different motion, and these candidates may be automatically marked as unavailable (e.g., removed from the candidate list without further consideration). In one non-limiting example provided above in FIG. 6, for region S2, the upper left candidate B2 may be automatically marked as unavailable because region S2 likely does not share motion information with the block located at the upper left candidate B2 (e.g., S0). Similarly, in some embodiments, the upper right candidate B0 may be automatically marked as unavailable for region S2 because region S2 likely does not share motion information with the block located at the upper right candidate B0 (which may be the second region S1). Similarly, in some embodiments, the upper candidate B1 may be automatically marked as unavailable for region S2 because region S2 likely does not share motion information with the block located at the upper candidate B1 (which is to the left of the first region S0).

FIG. 8 is a system block diagram illustrating an example video encoder 800 capable of encoding video using inter prediction with geometric partitioning with an adaptive number of regions. The example video encoder 800 receives an input video 805, which may be initially segmented or partitioned according to a processing scheme such as a tree-structured macroblock partitioning scheme (e.g., quadtree plus binary tree). One example of a tree-structured macroblock partitioning scheme may include partitioning an image frame into large block elements called Coding Tree Units (CTUs). In some implementations, each CTU may be further partitioned, one or more times, into a number of sub-blocks referred to as Coding Units (CUs). The end result of this partitioning may include a group of sub-blocks that may be referred to as Prediction Units (PUs). Transform Units (TUs) may also be utilized. Such a partitioning scheme may include geometric partitioning with an adaptive number of regions according to aspects of the current subject matter.

Still referring to FIG. 8, the example video encoder 800 may include an intra prediction processor 815, a motion estimation/compensation processor 820 (also referred to as an inter prediction processor), a transform/quantization processor 825, an inverse quantization/inverse transform processor 830, an in-loop filter 835, a decoded picture buffer 840, and an entropy encoding processor 845. In some embodiments, the motion estimation/compensation processor 820 may perform geometric partitioning with an adaptive number of regions, including utilizing the AMVP mode and the merge mode. Bitstream parameters representing the geometric partition mode, the AMVP mode, and the merge mode may be input to the entropy encoding processor 845 for inclusion in the output bitstream 850.

In operation, and still referring to FIG. 8, for each block of a frame of input video 805, a decision may be made whether to process the block via intra image prediction or with motion estimation/compensation. The blocks may be provided to an intra prediction processor 810 or a motion estimation/compensation processor 820. If the block is to be processed via intra prediction, intra prediction processor 810 may perform processing to output a predictor. If the block is to be processed via motion estimation/compensation, motion estimation/compensation processor 820 may perform processing to output a predictor, including geometric partitioning and the use of AMVP mode and merge mode.

With continued reference to FIG. 8, a residual may be formed by subtracting the predictor from the input video. The residual may be received by the transform/quantization processor 825, which may perform a transform process (e.g., a discrete cosine transform (DCT)) to produce coefficients that may then be quantized. The quantized coefficients and any associated signaling information may be provided to the entropy encoding processor 845 for entropy encoding and inclusion in the output bitstream 850. The entropy encoding processor 845 may support encoding of signaling information related to the geometric partition mode, the AMVP mode, and the merge mode. In addition, the quantized coefficients may be provided to the inverse quantization/inverse transform processor 830, whose output may be combined with the predictor and processed by the in-loop filter 835, the output of which may be stored in the decoded picture buffer 840 for use by the motion estimation/compensation processor 820, which may be capable of supporting the geometric partition mode, the AMVP mode, and the merge mode.
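A toy version of this residual path (not the encoder of FIG. 8) is sketched below. The 8 x 8 orthonormal DCT matrix, the flat quantization step, and the stand-in predictor are all assumptions made for illustration.

```python
import numpy as np

def dct_matrix(n=8):
    """Orthonormal DCT-II basis matrix of size n x n."""
    k = np.arange(n)
    mat = np.cos(np.pi * (2 * k[None, :] + 1) * k[:, None] / (2 * n))
    mat[0, :] *= 1 / np.sqrt(2)
    return mat * np.sqrt(2 / n)

D = dct_matrix(8)

def forward(block, qp_step=8):
    coeffs = D @ block @ D.T                    # 2-D DCT of the residual
    return np.round(coeffs / qp_step)           # quantization

def inverse(levels, qp_step=8):
    coeffs = levels * qp_step                   # inverse quantization
    return D.T @ coeffs @ D                     # inverse 2-D DCT

source = np.random.randint(0, 256, (8, 8)).astype(float)
predictor = np.full((8, 8), source.mean())      # stand-in for an intra or inter predictor
residual = source - predictor
levels = forward(residual)                      # these levels would go to entropy coding
reconstruction = predictor + inverse(levels)    # kept as a reference for future prediction
```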

FIG. 9 is a process flow diagram illustrating an example process 900 for encoding video with geometric partitioning and inter prediction with an adaptive number of regions, in accordance with aspects of the present subject matter, which may reduce encoding complexity while increasing compression efficiency. At step 910, a video frame may undergo initial block segmentation, for example utilizing a tree-structured macroblock partitioning scheme that may include partitioning the image frame into a plurality of CTUs and a plurality of CUs. At step 920, a block may be selected for geometric partitioning with an adaptive number of regions. Selection may include determining, according to a metric, that the block is to be processed according to the geometric partition mode.

At step 930, and still referring to FIG. 9, a geometric partition having three or more regions may be determined. At least two line segments may be determined that separate the pixels of the block into three or more regions (e.g., region 0, region 1, and region 2) according to their inter-frame motion, such that the pixels (e.g., luma samples) within each respective region have similar motion, which may differ from the motion of pixels within the other regions.

At step 940, and with continued reference to FIG. 9, the motion information for each region may be determined and processed using AMVP mode or merge mode. When a region is processed using AMVP mode, a candidate list may be built by considering both spatial and temporal candidates (which may include the spatial candidates described above, and may include marking some candidates as unavailable); a motion vector may be selected from the motion vector candidate list as the motion vector prediction, a motion vector difference (e.g., a residual) may be calculated, and an index into the candidate list may be determined. When a region is processed using merge mode, a candidate list may likewise be built by considering both spatial and temporal candidates (which may include the spatial candidates described above, and may include marking some candidates as unavailable); a motion vector may be selected from the motion vector candidate list such that the region adopts the motion information of another block, and an index into the candidate list may be determined.

At step 950, and still referring to FIG. 9, the determined geometric partition and motion information may be signaled in the bitstream. Signaling the geometric partition in the bitstream may include, for example, encoding the locations of P0, P1, P2, and P3; an index into one or more predetermined templates; and so on. When a region is processed using AMVP, signaling of the motion information may include encoding a motion vector difference (e.g., a residual) and an index into the motion vector candidate list in the bitstream. When a region is processed using merge mode, signaling of the motion information may include encoding an index into the motion vector candidate list in the bitstream.
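The following sketch shows one hypothetical way the signaled information could be organized before entropy coding; the field names and structure are assumptions, not the bitstream syntax of the source.

```python
def build_signalling(points, regions):
    """points: [(x, y), ...] for P0..P3; regions: per-region motion signalling dicts."""
    payload = {"geo_points": points, "regions": []}
    for r in regions:
        if r["mode"] == "merge":
            payload["regions"].append({"mode": "merge", "cand_idx": r["cand_idx"]})
        else:  # AMVP: candidate index plus a motion vector difference
            payload["regions"].append({"mode": "amvp", "cand_idx": r["cand_idx"], "mvd": r["mvd"]})
    return payload

bitstream_fields = build_signalling(
    points=[(24, 0), (0, 24), (40, 63), (63, 40)],
    regions=[{"mode": "merge", "cand_idx": 1},
             {"mode": "amvp", "cand_idx": 0, "mvd": (2, -1)},
             {"mode": "merge", "cand_idx": 0}])
```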

Fig. 10 is a system block diagram illustrating an example decoder 1000, the example decoder 1000 being capable of decoding a bitstream 1070 using inter-prediction and geometric partitioning with adaptive region numbers, which may improve the complexity and processing performance of video encoding and decoding. The decoder 1000 may include an entropy decoder processor 1010, an inverse quantization and inverse transform processor 1020, a deblocking filter 1030, a frame buffer 1040, a motion compensation processor 1050, and an intra prediction processor 1060. In some implementations, the bitstream 1070 may include parameters that represent a geometric partitioning mode, an AMVP mode, and a merge mode. With geometric partitioning as described herein, the motion compensation processor 1050 can reconstruct pixel information.

In operation, and still referring to FIG. 10, the bitstream 1070 may be received by the decoder 1000 and input to the entropy decoder processor 1010, which may decode the bitstream into quantized coefficients. The quantized coefficients may be provided to the inverse quantization and inverse transform processor 1020, which may perform inverse quantization and an inverse transform to create a residual signal. The residual signal may be added to the output of the motion compensation processor 1050 or the intra prediction processor 1060 according to the processing mode. The outputs of the motion compensation processor 1050 and the intra prediction processor 1060 may include a block prediction based on previously decoded blocks. The sum of the prediction and the residual may be processed by the deblocking filter 1030 and stored in the frame buffer 1040. For a given block (e.g., a CU or a PU), when the bitstream 1070 signals that the partition mode is geometric partitioning, the motion compensation processor 1050 may construct the predictor based on the geometric partitioning approach described herein.

FIG. 11 is a process flow diagram illustrating an example process 1100 for decoding a bitstream using inter prediction in geometric partitioning with an adaptive number of regions, which may improve the complexity and processing performance of video encoding and decoding. At step 1110, a bitstream is received, which may include a current block (e.g., a CTU, CU, or PU). Receiving may include extracting and/or parsing signaling information related to the current block from the bitstream. The decoder may extract or determine one or more parameters that characterize the geometric partition. These parameters may include, for example, the start and end points of the line segments (e.g., P0, P1, P2, P3). The extracting or determining may include identifying and retrieving the parameters from the bitstream (e.g., parsing the bitstream).

At step 1120, and still referring to FIG. 11, a first region, a second region, and a third region of the current block may be determined according to the geometric partition mode. The determining may include determining whether the geometric partition mode is enabled for the current block. If the geometric partition mode is not enabled (e.g., no), the decoder may process the block using an alternative partition mode. If the geometric partition mode is enabled (e.g., yes), three or more regions may be determined and/or processed.

At step 1130, and still referring to FIG. 11, motion vectors associated with the first region, the second region, and the third region may be determined. Determining a motion vector may include determining whether the motion information of the region is coded using AMVP mode or merge mode. When a region is processed using AMVP mode, a candidate list may be built by considering both spatial and temporal candidates (which may include the spatial candidates described above, and may include marking some candidates as unavailable); a motion vector may be selected from the motion vector candidate list as the motion vector prediction, and a motion vector difference (e.g., a residual) may be decoded from the bitstream and added to the motion vector prediction. In merge mode, this determination may include building a candidate list of spatial and temporal candidates for each region. Constructing the candidate list may include automatically marking some candidates as unavailable and removing the unavailable candidates from the candidate list. An index into the constructed candidate list may be parsed from the bitstream and used to select the final candidate from the candidate list. The motion information of the current region may be determined to be the same as the motion information of the final candidate (e.g., the region may adopt the motion vector of the final candidate).
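A minimal sketch of the selection step follows, assuming the availability marking has already been done and the index has already been parsed; the data structures are illustrative only.

```python
def select_candidate(candidates, availability, parsed_index):
    """Drop candidates marked unavailable, then pick the final candidate by the parsed index."""
    pruned = [mv for mv, ok in zip(candidates, availability) if ok]
    return pruned[parsed_index]

# Example: the first candidate was marked unavailable, so index 0 selects the second one.
mv = select_candidate([(3, -1), (2, 0), (0, 4)], [False, True, True], parsed_index=0)  # -> (2, 0)
```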

Still referring to FIG. 11, in step 1140, the current block may be decoded using the determined motion vector.

Although some variations have been described in detail above, other modifications or additions are possible. For example, geometric partitioning may be signaled in the bitstream based on rate-distortion decisions in the encoder. The encoder may predetermine combinations of partitions (e.g., templates), temporal and spatial prediction of partitions, and/or additional offsets based on rules. Each geometrically partitioned region may utilize motion compensated prediction or intra prediction. The boundaries of the predicted regions may be smoothed before the residual is added.

In some embodiments, a quadtree plus binary decision tree (QTBT) may be implemented. In QTBT, at the coding tree unit level, the partition parameters of the QTBT can be dynamically derived to adapt to local characteristics without communicating any overhead. Subsequently, at the coding unit level, the joint classifier decision tree structure may eliminate unnecessary iterations and control the risk of mispredictions. In some embodiments, geometric partitioning by the number of adaptive regions may be available as an additional partitioning option available at each leaf node of the QTBT.

In some embodiments, a decoder may include a partition processor that generates the geometric partition for a current block and provides all partition-related information for dependent processes. The partition processor may directly affect motion compensation, since motion compensation may be performed segment-wise when a block is geometrically partitioned. In addition, the partition processor may provide shape information to the intra prediction processor and to the transform coding processor.

In some embodiments, additional syntax elements may be signaled at different hierarchical levels of the bitstream. To enable geometric partitioning with an adaptive number of regions for an entire sequence, an enable flag may be coded in a Sequence Parameter Set (SPS). Further, a CTU flag may be coded at the Coding Tree Unit (CTU) level to indicate whether any Coding Unit (CU) uses geometric partitioning with an adaptive number of regions. A CU flag may be coded to indicate whether the current coding unit uses geometric partitioning with an adaptive number of regions. The parameters that specify the line segments on the block may be coded. For each region, a flag may be decoded that specifies whether the current region is inter or intra predicted.
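A hedged sketch of that flag hierarchy follows. The syntax element names and the reader callbacks are invented for illustration and are not the source's bitstream syntax.

```python
def parse_geo_flags(read_flag, read_points):
    """read_flag(name) -> bool and read_points() -> list are stand-ins for bitstream parsing."""
    if not read_flag("sps_geo_adaptive_regions_enabled"):   # sequence-level enable flag (SPS)
        return None
    if not read_flag("ctu_geo_used"):                       # CTU-level flag
        return None
    if not read_flag("cu_geo_used"):                        # CU-level flag
        return None
    points = read_points()                                  # e.g. P0..P3 for the line segments
    region_modes = [("intra" if read_flag(f"region{i}_intra") else "inter") for i in range(3)]
    return {"points": points, "region_modes": region_modes}

# Example with stand-in readers that always answer "yes":
example = parse_geo_flags(lambda name: True, lambda: [(24, 0), (0, 24), (40, 63), (63, 40)])
```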

In some embodiments, a minimum zone size may be specified.

The subject matter described herein provides many technical advantages. For example, some embodiments of the present subject matter may provide partitioning of blocks that reduces complexity while increasing compression efficiency. In some embodiments, blocking artifacts at object boundaries may be reduced.

It should be noted that any one or more of the aspects and embodiments described herein may be conveniently implemented using digital electronic circuitry, integrated circuitry, specially designed Application Specific Integrated Circuits (ASICs), Field Programmable Gate Arrays (FPGAs), computer hardware, firmware, software, and/or combinations thereof, as implemented and/or embodied in one or more machines (e.g., one or more computing devices acting as a user computing device for electronic files, one or more server devices such as file servers, etc.) programmed according to the teachings of the present specification, as will be apparent to those skilled in the computer art. These various aspects or features may include implementation in one or more computer programs and/or software that are executable and/or interpretable on a programmable system including at least one programmable processor, which may be special or general purpose, coupled to receive data and instructions from, and to transmit data and instructions to, a storage system, at least one input device, and at least one output device. Suitable software coding can be readily prepared by skilled programmers based on the teachings of the present disclosure, as will be apparent to those skilled in the software art. The aspects and embodiments discussed above that employ software and/or software modules may also include appropriate hardware for facilitating the implementation of the machine-executable instructions of the software and/or software modules.

Such software may be a computer program product employing a machine-readable storage medium. A machine-readable storage medium may be any medium that is capable of storing and/or encoding a sequence of instructions for execution by a machine (e.g., a computing device) and that cause the machine to perform any one of the methods and/or embodiments described herein. Examples of a machine-readable storage medium include, but are not limited to, magnetic disks, optical disks (e.g., CD-R, DVD-R, etc.), magneto-optical disks, read-only memory "ROM" devices, random-access memory "RAM" devices, magnetic cards, optical cards, solid-state memory devices, EPROM, EEPROM, Programmable Logic Devices (PLD), and/or any combination thereof. As used herein, a machine-readable medium is intended to include a single medium, as well as a collection of physically separate media, such as, for example, a compact disc or a collection of one or more hard disk drives in combination with a computer memory. As used herein, a machine-readable storage medium does not include a transitory form of signal transmission.

Such software may also include information (e.g., data) carried as a data signal on a data carrier, such as a carrier wave. For example, machine-executable information may be included as a data-bearing signal embodied in a data carrier, where the signal encodes a sequence of instructions or portions thereof for execution by a machine (e.g., a computing device), and any associated information (e.g., data structures and data) that causes the machine to perform any one of the methods and/or embodiments described herein.

Examples of a computing device include, but are not limited to, an electronic book reading device, a computer workstation, a terminal computer, a server computer, a handheld device (e.g., a tablet computer, a smartphone, etc.), a network appliance, a network router, a network switch, a network bridge, any machine capable of executing a sequence of instructions that specify actions to be taken by that machine, and any combination thereof. In one example, the computing device may include and/or be included in a kiosk.

Fig. 12 shows a schematic representation of one embodiment of a computing device in the exemplary form of a computer system 1200 within which a set of instructions for causing a control system to perform any one or more of the aspects and/or methods of the present disclosure may be executed. It is also contemplated that multiple computing devices may be used to implement a specifically configured set of instructions for causing one or more of the devices to perform any one or more of the aspects and/or methods of the present disclosure. Computer system 1200 includes a processor 1204 and a memory 1208, which communicate with each other and with other components via a bus 1212. Bus 1212 may include any of several types of bus structures, including but not limited to a memory bus, a memory controller, a peripheral bus, a local bus, and any combination thereof, using any of a variety of bus architectures.

Memory 1208 may include various components (e.g., machine-readable media) including, but not limited to, random-access memory components, read-only components, and any combination thereof. In one example, a basic input/output system 1216 (BIOS), including the basic routines that help to transfer information between elements within computer system 1200, such as during start-up, may be stored in memory 1208. Memory 1208 may also include instructions (e.g., software) 1220 (e.g., stored on one or more machine-readable media), the instructions 1220 implementing any one or more of the aspects and/or methods of the present disclosure. In another example, memory 1208 may further include any number of program modules including, but not limited to, an operating system, one or more application programs, other program modules, program data, and any combination thereof.

Computer system 1200 may also include a storage device 1224. Examples of a storage device (e.g., storage device 1224) include, but are not limited to, hard disk drives, magnetic disk drives, optical disk drives in combination with optical media, solid-state memory devices, and any combination thereof. Storage device 1224 may be connected to bus 1212 by an appropriate interface (not shown). Example interfaces include, but are not limited to, SCSI, Advanced Technology Attachment (ATA), Serial ATA, Universal Serial Bus (USB), IEEE 1394 (FireWire), and any combination thereof. In one example, storage device 1224 (or one or more components thereof) may be removably connected with computer system 1200 (e.g., via an external port connector (not shown)). In particular, storage device 1224 and an associated machine-readable medium 1228 may provide nonvolatile and/or volatile storage of machine-readable instructions, data structures, program modules, and/or other data for computer system 1200. In one example, software 1220 may reside, completely or partially, within machine-readable medium 1228. In another example, software 1220 may reside, completely or partially, within processor 1204.

The computer system 1200 may also include an input device 1232. In one example, a user of computer system 1200 can enter commands and/or other information into computer system 1200 via input device 1232. Examples of input device 1232 include, but are not limited to, an alphanumeric input device (e.g., a keyboard), a pointing device, a joystick, a game pad, an audio input device (e.g., a microphone, a voice response system, etc.), a cursor control device (e.g., a mouse), a touch pad, an optical scanner, a video capture device (e.g., a still camera, a video camera), a touch screen, and any combination thereof. An input device 1232 is connectable to the bus 1212 via any of a variety of interfaces (not shown), including, but not limited to, a serial interface, a parallel interface, a game port, a USB interface, a firewire interface, a direct interface to the bus 1212, and any combination thereof. The input device 1232 may include a touch screen interface, which may be part of or separate from the display 1236, discussed further below. The input device 1232 may be used as a user selection device for selecting one or more graphical representations of a graphical interface as described above.

A user may also enter commands and/or other information into the computer system 1200 via the storage devices 1224 (e.g., a removable disk drive, a flash memory drive, etc.) and/or the network interface device 1240. A network interface device, such as network interface device 1240, may be used to connect computer system 1200 to one or more of a variety of networks, such as network 1244, and one or more remote devices 1248 connected thereto. Examples of network interface devices include, but are not limited to, a network interface card (e.g., a mobile network interface card, a LAN card), a modem, and any combination thereof. Examples of networks include, but are not limited to, a wide area network (e.g., the internet, an enterprise network), a local area network (e.g., a network associated with an office, a building, a campus, or other small geographic space), a telephone network, a data network associated with a telephone/voice provider (e.g., a mobile communications provider data and/or voice network), a direct connection between two computing devices, and any combination thereof. Networks, such as network 1244, may employ wired and/or wireless communication modes. In general, any network topology may be used. Information (e.g., data, software 1220, etc.) may be communicated to and/or from the computer system 1200 via the network interface device 1240.

Computer system 1200 may also include a video display adapter 1252 for communicating displayable images to a display device, such as display device 1236. Examples of a display device include, but are not limited to, Liquid Crystal Displays (LCDs), Cathode Ray Tubes (CRTs), plasma displays, Light Emitting Diode (LED) displays, and any combination thereof. Display adapter 1252 and display device 1236 may be used in conjunction with processor 1204 to provide graphical representations of aspects of the present disclosure. In addition to a display device, computer system 1200 may include one or more other peripheral output devices, including but not limited to audio speakers, printers, and any combination thereof. Such peripheral output devices may be connected to bus 1212 via a peripheral interface 1256. Examples of a peripheral interface include, but are not limited to, a serial port, a USB connection, a FireWire connection, a parallel connection, and any combination thereof.

The foregoing has been a detailed description of illustrative embodiments of the invention. Various modifications and additions can be made without departing from the spirit and scope of the invention. Features of each of the various embodiments described above may be combined with features of other described embodiments as appropriate in order to provide a multiplicity of feature combinations in associated new embodiments. Furthermore, while the foregoing describes a number of separate embodiments, what has been described herein is merely illustrative of the application of the principles of the invention. Additionally, although particular methods herein may be shown and/or described as being performed in a specific order, that ordering is highly variable within ordinary skill in the art to achieve embodiments as disclosed herein. Accordingly, this description is meant to be taken only by way of example and not to otherwise limit the scope of the present invention.

In the above description and claims, phrases such as "at least one of" or "one or more of" may occur followed by a conjunctive list of elements or features. The term "and/or" may also occur in a list of two or more elements or features. Such phrases are intended to mean any of the listed elements or features individually, or any of the recited elements or features in combination with any of the other recited elements or features, unless otherwise implied or clearly contradicted by the context in which it is used. For example, the phrases "at least one of A and B", "one or more of A and B", and "A and/or B" are each intended to mean "A alone, B alone, or A and B together". A similar interpretation is also intended for lists including three or more items. For example, the phrases "at least one of A, B, and C", "one or more of A, B, and C", and "A, B, and/or C" are each intended to mean "A alone, B alone, C alone, A and B together, A and C together, B and C together, or A, B, and C together". Furthermore, the term "based on", as used above and in the claims, is intended to mean "based, at least in part, on", such that unrecited features and elements are also permissible.

The subject matter described herein may be implemented in systems, devices, methods, and/or articles, depending on the desired configuration. The embodiments set forth in the foregoing description do not represent all embodiments consistent with the subject matter described herein. Rather, they are merely a few examples consistent with aspects related to the described subject matter. Although some variations have been described in detail above, other modifications or additions are possible. In particular, other features and/or variations may be provided in addition to those set forth herein. For example, the embodiments described above may be directed to various combinations and subcombinations of the disclosed features and/or combinations and subcombinations of some other features disclosed above. Moreover, the logic flows illustrated in the figures and/or described herein do not necessarily require the particular order shown, or sequential order, to achieve desirable results. Other embodiments may fall within the scope of the following claims.
