Image decoding method and apparatus based on affine motion prediction in image coding system

Publication No.: 1327992 Publication date: 2020-07-14

The present technique, "Image decoding method and apparatus based on affine motion prediction in image coding system," was devised by Lee Jae-ho (李在镐) on 2019-07-09. According to the present disclosure, a video decoding method performed by a decoding apparatus includes the steps of: acquiring motion prediction information on a current block from a bitstream; generating an affine motion vector predictor (MVP) candidate list comprising affine MVP candidates for the current block; deriving a CPMVP of a CP of the current block based on one affine MVP candidate among the affine MVP candidates included in the affine MVP candidate list; deriving a CPMVD of the CP of the current block based on the motion prediction information; deriving a CPMV of the CP of the current block based on the CPMVP and the CPMVD; and deriving prediction samples for the current block based on the CPMV.

1. A method of video decoding performed by a decoding device, the method comprising the steps of:

obtaining motion prediction information of a current block from a bitstream;

constructing an affine Motion Vector Predictor (MVP) candidate list comprising affine MVP candidates for the current block;

deriving a Control Point Motion Vector Predictor (CPMVP) for a Control Point (CP) of the current block based on one of the affine MVP candidates included in the affine MVP candidate list;

deriving a Control Point Motion Vector Difference (CPMVD) of the CP for the current block based on the motion prediction information;

deriving a Control Point Motion Vector (CPMV) of the CP for the current block based on the CPMVP and the CPMVD;

deriving prediction samples for the current block based on the CPMV; and

generating a reconstructed picture of the current block based on the derived prediction samples,

wherein the affine MVP candidates comprise a first affine MVP candidate and a second affine MVP candidate,

wherein the first affine MVP candidate is derived based on a first block in a left block group comprising a lower left corner neighboring block and a left neighboring block,

wherein the first block is encoded with an affine motion model and a reference picture of the first block is the same as a reference picture of the current block,

wherein the second affine MVP candidate is derived based on a second block in an upper block group including an upper right neighboring block, an upper neighboring block, and an upper left neighboring block, and

wherein the second block is encoded with the affine motion model and a reference picture of the second block is the same as the reference picture of the current block.

2. The method of claim 1, wherein the first block is a first block that has been verified to satisfy a condition when neighboring blocks within the left block group are examined according to a particular order.

3. The method of claim 1, wherein the second block is a first block that has been verified to satisfy a condition when neighboring blocks within the upper block group are examined according to a particular order.

4. The method of claim 3, wherein the particular order is an order of the upper neighboring block, the upper right neighboring block, and the upper left neighboring block.

5. The method of claim 1, wherein the motion prediction information comprises an affine MVP candidate index for the current block, and

wherein the CPMVP of the CP for the current block is derived based on an affine MVP candidate indicated by the affine MVP candidate index.

6. The method of claim 1, wherein the step of constructing the affine MVP candidate list comprises the steps of:

dividing motion vectors of neighboring blocks of the current block into a first group, a second group, and a third group; and

deriving a CPMVP candidate for CP0 of the current block from the first group, deriving a CPMVP candidate for CP1 of the current block from the second group, deriving a CPMVP candidate for CP2 of the current block from the third group, and deriving a constructed affine MVP candidate including the CPMVP candidates of the CPs.

7. The method of claim 6, wherein the neighboring blocks include neighboring block A, neighboring block B, neighboring block C, neighboring block D, neighboring block E, neighboring block F, and neighboring block G, and

wherein, in a case where the size of the current block is W × H and the x and y components of the top-left sample position of the current block are 0, the neighboring block A is a block including a sample at coordinates (-1, -1), the neighboring block B is a block including a sample at coordinates (0, -1), the neighboring block C is a block including a sample at coordinates (-1, 0), the neighboring block D is a block including a sample at coordinates (W-1, -1), the neighboring block E is a block including a sample at coordinates (W, -1), the neighboring block F is a block including a sample at coordinates (-1, H-1), and the neighboring block G is a block including a sample at coordinates (-1, H).

8. The method of claim 7, wherein the first group includes the motion vector of the neighboring block A, the motion vector of the neighboring block B, and the motion vector of the neighboring block C,

wherein the second group includes the motion vector of the neighboring block D and the motion vector of the neighboring block E, and

wherein the third group includes the motion vector of the neighboring block F and the motion vector of the neighboring block G.

9. The method of claim 8, wherein the CPMVP of the CP0 is a motion vector having a reference picture that is first verified to be the same as a reference picture of the current block when motion vectors within the first group are checked according to a specific order,

wherein the specific order is an order of the neighboring block A, the neighboring block B, and the neighboring block C.

10. The method of claim 8, wherein the CPMVP of the CP1 is a motion vector having a reference picture that is first verified to be the same as a reference picture of the current block when motion vectors within the second group are checked according to a specific order,

wherein the specific order is an order of the neighboring block D and then the neighboring block E.

11. The method of claim 8, wherein the CPMVP of the CP2 is a motion vector having a reference picture that is first verified to be the same as a reference picture of the current block when motion vectors within the third group are checked according to a specific order, wherein the specific order is an order of the neighboring block F and then the neighboring block G.
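The grouping and checking procedure of claims 6 to 11 can be sketched as follows. This is an illustrative Python sketch, not part of the claims; the function name, the dictionary-based block representation, and the availability handling are assumptions made for illustration.

```python
# Hypothetical sketch of the constructed affine MVP candidate derivation
# described in claims 6-11. Block names A-G follow claim 7.

def constructed_affine_mvp(neighbors, cur_ref):
    """neighbors: dict mapping block name -> (motion_vector, ref_pic_index)."""
    # Group the neighbor motion vectors as in claim 8.
    groups = {
        "CP0": ["A", "B", "C"],   # group for the top-left control point
        "CP1": ["D", "E"],        # group for the top-right control point
        "CP2": ["F", "G"],        # group for the bottom-left control point
    }
    cpmvps = {}
    for cp, order in groups.items():
        # Take the first motion vector whose reference picture matches the
        # current block's reference picture (claims 9-11).
        for name in order:
            if name in neighbors:
                mv, ref = neighbors[name]
                if ref == cur_ref:
                    cpmvps[cp] = mv
                    break
    # The constructed candidate is available only if every CP found a CPMVP.
    return cpmvps if len(cpmvps) == 3 else None
```
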

12. A method of video encoding performed by an encoding device, the method comprising the steps of:

constructing an affine Motion Vector Predictor (MVP) candidate list comprising affine MVP candidates of a current block;

deriving a Control Point Motion Vector Predictor (CPMVP) for a Control Point (CP) of the current block based on one of the affine MVP candidates included in the affine MVP candidate list;

deriving a Control Point Motion Vector (CPMV) of the CP for the current block;

deriving a Control Point Motion Vector Difference (CPMVD) for the CP of the current block based on the CPMVP and the CPMV; and

encoding motion prediction information including information on the CPMVD,

wherein the affine MVP candidates comprise a first affine MVP candidate and a second affine MVP candidate,

wherein the first affine MVP candidate is derived based on a first block in a left block group comprising a lower left corner neighboring block and a left neighboring block,

wherein the first block is encoded with an affine motion model and a reference picture of the first block is the same as a reference picture of the current block,

wherein the second affine MVP candidate is derived based on a second block in an upper block group including an upper right neighboring block, an upper neighboring block, and an upper left neighboring block, and

wherein the second block is encoded with the affine motion model and a reference picture of the second block is the same as the reference picture of the current block.

13. The method of claim 12, wherein the first block is a first block that has been verified to satisfy a condition when neighboring blocks within the left block group are examined according to a particular order.

14. The method of claim 12, wherein the second block is a first block that has been verified to satisfy a condition when neighboring blocks within the upper block group are examined according to a particular order.

15. The method of claim 14, wherein the particular order is an order of the upper neighboring block, the upper right neighboring block, and the upper left neighboring block.

Technical Field

The present disclosure relates to an image coding technique and, more particularly, to an image decoding method and apparatus based on affine motion prediction in an image coding system.

Background

In various fields, demand for high-resolution, high-quality images such as HD (high definition) images and UHD (ultra high definition) images is increasing. Since the image data has high resolution and high quality, the amount of information or bits to be transmitted increases relative to conventional image data. Therefore, when image data is transmitted using a medium such as a conventional wired/wireless broadband line or stored using an existing storage medium, transmission costs and storage costs thereof increase.

Accordingly, there is a need for efficient image compression techniques for efficiently transmitting, storing, and reproducing information for high-resolution, high-quality images.

Disclosure of Invention

Technical purpose

It is a technical object of the present disclosure to provide a method and apparatus capable of enhancing image coding efficiency.

Another technical object of the present disclosure is to provide an image decoding method and apparatus that derive an affine MVP candidate list of a current block based on neighboring blocks to which affine prediction is applied, and then perform prediction on the current block based on the derived affine MVP candidate list.

Another technical object of the present disclosure is to provide an image decoding method and apparatus that derive an affine MVP candidate list of a current block by deriving a first affine MVP candidate based on a left block group and deriving a second affine MVP candidate based on an upper block group, and then perform prediction on the current block based on the derived affine MVP candidate list.
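The two-group derivation described above can be illustrated with a short sketch. The `Block` record, the scan orders, and the function names are hypothetical stand-ins for the codec's internal data structures.

```python
# Illustrative sketch of the inherited affine candidate search: one scan of
# the left block group and one scan of the upper block group, each yielding
# at most one candidate.

from typing import NamedTuple, Optional

class Block(NamedTuple):
    affine_coded: bool   # was the neighbor coded with an affine motion model?
    ref_pic: int         # reference picture index of the neighbor
    cpmvs: tuple         # control point motion vectors of the neighbor

def first_valid(group, cur_ref) -> Optional[Block]:
    # The first block that is affine-coded and shares the current block's
    # reference picture provides the candidate.
    for blk in group:
        if blk is not None and blk.affine_coded and blk.ref_pic == cur_ref:
            return blk
    return None

def inherited_candidates(left_group, upper_group, cur_ref):
    # At most one candidate per group, so at most two group scans in total.
    cands = []
    for group in (left_group, upper_group):
        blk = first_valid(group, cur_ref)
        if blk is not None:
            cands.append(blk.cpmvs)
    return cands
```

Limiting the search to one candidate per group is what keeps the list-construction complexity low, as noted later in the effects of the disclosure.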

Technical scheme

According to an embodiment of the present disclosure, there is provided herein a video decoding method performed by a decoding apparatus. The method comprises the following steps: obtaining motion prediction information of a current block from a bitstream; constructing an affine Motion Vector Predictor (MVP) candidate list including affine MVP candidates of the current block; deriving a Control Point Motion Vector Predictor (CPMVP) for a Control Point (CP) of the current block based on one of the affine MVP candidates included in the affine MVP candidate list; deriving a Control Point Motion Vector Difference (CPMVD) for the CP of the current block based on the motion prediction information; deriving a Control Point Motion Vector (CPMV) for the CP of the current block based on the CPMVP and the CPMVD; deriving prediction samples for the current block based on the CPMV; and generating a reconstructed picture of the current block based on the derived prediction samples, wherein the affine MVP candidates include a first affine MVP candidate and a second affine MVP candidate, wherein the first affine MVP candidate is derived based on a first block in a left block group including a lower left neighboring block and a left neighboring block, wherein the first block is encoded with an affine motion model and a reference picture of the first block is the same as a reference picture of the current block, wherein the second affine MVP candidate is derived based on a second block within an upper block group including an upper right neighboring block, an upper neighboring block, and an upper left neighboring block, and wherein the second block is encoded with the affine motion model and a reference picture of the second block is the same as the reference picture of the current block.
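The core arithmetic of the decoding steps above, reconstructing each control point motion vector as its predictor plus the signalled difference, can be sketched as follows (function and variable names are illustrative):

```python
# Minimal sketch of the CPMV reconstruction step: each control point motion
# vector (CPMV) is the sum of its predictor (CPMVP) and the transmitted
# difference (CPMVD). Vectors are modelled as (x, y) integer tuples.

def derive_cpmvs(cpmvps, cpmvds):
    return [(px + dx, py + dy)
            for (px, py), (dx, dy) in zip(cpmvps, cpmvds)]
```
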

According to another embodiment of the present disclosure, there is provided a decoding apparatus performing a video decoding method. The decoding apparatus includes: an entropy decoder that obtains motion prediction information of a current block from a bitstream; a predictor that constructs an affine Motion Vector Predictor (MVP) candidate list including affine MVP candidates of the current block, derives a Control Point Motion Vector Predictor (CPMVP) of a Control Point (CP) of the current block based on one of the affine MVP candidates included in the affine MVP candidate list, derives a Control Point Motion Vector Difference (CPMVD) of the CP of the current block based on the motion prediction information, derives a Control Point Motion Vector (CPMV) of the CP of the current block based on the CPMVP and the CPMVD, and derives prediction samples of the current block based on the CPMV; and an adder that generates a reconstructed picture of the current block based on the derived prediction samples, wherein the affine MVP candidates comprise a first affine MVP candidate and a second affine MVP candidate, wherein the first affine MVP candidate is derived based on a first block in a left block group comprising a lower left corner neighboring block and a left neighboring block, wherein the first block is encoded with an affine motion model and a reference picture of the first block is the same as a reference picture of the current block, wherein the second affine MVP candidate is derived based on a second block within an upper block group including an upper right neighboring block, an upper neighboring block, and an upper left neighboring block, and wherein the second block is encoded with the affine motion model and a reference picture of the second block is the same as the reference picture of the current block.

According to another embodiment of the present disclosure, there is provided herein a video encoding method performed by an encoding apparatus. The method comprises the following steps: constructing an affine Motion Vector Predictor (MVP) candidate list including MVP candidates for the current block; deriving a Control Point Motion Vector Predictor (CPMVP) for a Control Point (CP) of the current block based on one of the affine MVP candidates included in the affine MVP candidate list; deriving a Control Point Motion Vector (CPMV) of the CP of the current block; deriving a Control Point Motion Vector Difference (CPMVD) of a CP of the current block based on the CPMVP and the CPMV; and encoding motion prediction information including information on the CPMVD, wherein the affine MVP candidates include a first affine MVP candidate and a second affine MVP candidate, wherein the first affine MVP candidate is derived based on a first block in a left block group including a lower left neighboring block and a left neighboring block, wherein the first block is encoded with an affine motion model, and a reference picture of the first block is the same as a reference picture of the current block, wherein the second affine MVP candidate is derived based on a second block within an upper block group including an upper right neighboring block, an upper neighboring block, and an upper left neighboring block, and wherein the second block is encoded with the affine motion model, and a reference picture of the second block is the same as a reference picture of the current block.

According to another embodiment of the present disclosure, there is provided an encoding apparatus performing a video encoding method. The encoding device includes: a predictor that constructs an affine Motion Vector Predictor (MVP) candidate list including MVP candidates for a current block, derives a Control Point Motion Vector Predictor (CPMVP) for a Control Point (CP) of the current block based on one of the affine MVP candidates included in the affine MVP candidate list, and derives a Control Point Motion Vector (CPMV) for the CP of the current block; a subtractor that derives a Control Point Motion Vector Difference (CPMVD) of the CP of the current block based on the CPMVP and the CPMV; and an entropy encoder encoding motion prediction information including information on the CPMVD, wherein the affine MVP candidates include a first affine MVP candidate and a second affine MVP candidate, wherein the first affine MVP candidate is derived based on a first block in a left block group including a lower left neighboring block and a left neighboring block, wherein the first block is encoded with an affine motion model, and a reference picture of the first block is the same as a reference picture of the current block, wherein the second affine MVP candidate is derived based on a second block within an upper block group including an upper right neighboring block, an upper neighboring block, and an upper left neighboring block, and wherein the second block is encoded with the affine motion model, and a reference picture of the second block is the same as a reference picture of the current block.

Effects of the disclosure

According to the present disclosure, overall image/video compression efficiency can be enhanced.

According to the present disclosure, the efficiency of affine motion prediction based image encoding can be improved.

According to the present disclosure, when deriving an affine MVP candidate list, neighboring blocks are divided into a left block group and an upper block group, and the affine MVP candidate list may be constructed by deriving MVP candidates from each block group. Therefore, the complexity of the process of constructing the affine MVP candidate list can be reduced, and the encoding efficiency can be enhanced.

Drawings

Fig. 1 is a schematic diagram illustrating a configuration of a video encoding apparatus to which the present disclosure is applied.

Fig. 2 is a schematic diagram illustrating a configuration of a video decoding apparatus to which the present disclosure is applied.

Fig. 3 illustrates a motion expressed by an affine motion model.

Fig. 4 illustrates an affine motion model using motion vectors of 3 control points.

Fig. 5 illustrates an affine motion model using motion vectors of 2 control points.

Fig. 6 illustrates a method of deriving a motion vector on a sub-block basis based on an affine motion model.

Fig. 7 is a flowchart illustrating an affine motion prediction method according to an embodiment of the present disclosure.

Fig. 8 is a diagram for describing a method of deriving a motion vector predictor at a control point according to an embodiment of the present disclosure.

Fig. 9 is a diagram for describing a method of deriving a motion vector predictor at a control point according to an embodiment of the present disclosure.

Fig. 10 illustrates an example of performing affine prediction in the case where the neighboring block a is selected as an affine merge candidate.

FIG. 11 illustrates exemplary neighboring blocks for deriving inherited affine candidates.

Fig. 12 illustrates exemplary spatial candidates of the constructed affine candidates.

Fig. 13 illustrates an example of constructing an affine MVP list.

Fig. 14 illustrates an example of constructing an affine MVP list.

Fig. 15 illustrates an example of constructing an affine MVP list.

Fig. 16 illustrates a general view of an image encoding method performed by an encoding apparatus according to the present disclosure.

Fig. 17 illustrates a general view of an encoding apparatus performing an image encoding method according to the present disclosure.

Fig. 18 illustrates a general view of an image decoding method performed by a decoding apparatus according to the present disclosure.

Fig. 19 illustrates a general view of a decoding apparatus performing an image decoding method according to the present disclosure.

Fig. 20 illustrates a content streaming system structure to which the present disclosure is applied.

Detailed Description

The present disclosure may be modified in various forms, and specific embodiments thereof will be described and illustrated in the accompanying drawings. However, these embodiments are not intended to limit the present disclosure. The terminology used in the following description is for the purpose of describing particular embodiments only and is not intended to limit the disclosure. A singular expression includes a plural expression unless the context clearly indicates otherwise. Terms such as "including" and "having" are intended to indicate the presence of the features, numbers, steps, operations, elements, components, or combinations thereof used in the following description, and should not be understood to exclude the possible presence or addition of one or more other features, numbers, steps, operations, elements, components, or combinations thereof.

Furthermore, the elements in the figures described in this disclosure are drawn separately for the purpose of convenience to illustrate different specific functions, which does not mean that these elements are implemented by separate hardware or separate software. For example, two or more of these elements may be combined to form a single element, or one element may be divided into a plurality of elements. Embodiments in which elements are combined and/or divided are within the present disclosure without departing from the concepts thereof.

Hereinafter, embodiments of the present disclosure will be described in detail with reference to the accompanying drawings. In addition, like reference numerals are used to designate like elements throughout the drawings, and the same description of the like elements will be omitted.

Furthermore, the present disclosure relates to video/image coding. For example, the methods/embodiments disclosed in the present disclosure may be applied to methods disclosed in the Versatile Video Coding (VVC) standard or a next-generation video/image coding standard.

In the present disclosure, in general, a picture means a unit representing an image in a specific time slot, and a slice is a unit constituting a part of the picture in coding. A picture may include multiple slices, and in some cases, the terms picture and slice may be used interchangeably.

A pixel or a pel may mean the smallest unit constituting a picture (or image). In addition, the term "sample" may be used as a term corresponding to a pixel. A sample may generally represent a pixel or a value of a pixel, may represent only a pixel/pixel value of a luminance component, or may represent only a pixel/pixel value of a chrominance component.

In some cases, a unit may be used interchangeably with a block or an area.

Fig. 1 is a diagram briefly illustrating a video encoding apparatus to which the present disclosure is applied.

Referring to fig. 1, the video encoding apparatus 100 may include a picture divider 105, a predictor 110, a residual processor 120, an entropy encoder 130, an adder 140, a filter 150, and a memory 160. The residual processor 120 may include a subtractor 121, a transformer 122, a quantizer 123, a re-arranger 124, an inverse quantizer 125, and an inverse transformer 126.

The picture divider 105 may divide an input picture into at least one processing unit.

The encoding process according to the present disclosure may be performed based on a final coding unit that is no longer divided.

In another example, a processing unit may include a Coding Unit (CU), a Prediction Unit (PU), or a Transform Unit (TU). The coding unit may be partitioned from a largest coding unit (LCU) into deeper coding units according to a quadtree structure.

The predictor 110 may perform prediction on a processing target block (hereinafter, a current block), and may generate a prediction block including prediction samples for the current block. The unit of prediction performed in the predictor 110 may be an encoding block, or may be a transform block, or may be a prediction block.

The predictor 110 may determine whether intra prediction or inter prediction is applied to the current block. For example, the predictor 110 may determine whether to apply intra prediction or inter prediction in units of CUs.

In the case of intra prediction, the predictor 110 may derive prediction samples of the current block based on reference samples outside the current block in the picture to which the current block belongs (hereinafter, the current picture). In this case, the predictor 110 may derive a prediction sample based on an average or interpolation of neighboring reference samples of the current block (case (i)), or may derive a prediction sample based on a reference sample existing in a specific (prediction) direction with respect to the prediction sample among the neighboring reference samples of the current block (case (ii)). Case (i) may be referred to as a non-directional mode or a non-angular mode, and case (ii) may be referred to as a directional mode or an angular mode. In intra prediction, the prediction modes may include, as an example, 33 directional modes and at least two non-directional modes. The non-directional modes may include a DC mode and a planar mode. The predictor 110 may determine the prediction mode to be applied to the current block by using the prediction modes applied to neighboring blocks.

In the case of inter prediction, the predictor 110 may derive prediction samples for the current block based on samples specified by a motion vector on a reference picture. The predictor 110 may derive the prediction samples of the current block by applying any one of a skip mode, a merge mode, and a Motion Vector Prediction (MVP) mode. In the case of the skip mode and the merge mode, the predictor 110 may use motion information of a neighboring block as motion information of the current block. In the case of the skip mode, unlike the merge mode, a difference (residual) between the prediction sample and the original sample is not transmitted. In the case of the MVP mode, the motion vector of a neighboring block is used as a motion vector predictor of the current block to derive the motion vector of the current block.
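The distinction between the skip/merge modes and the MVP mode described above can be sketched as follows; the mode names and the tuple-based motion vectors are illustrative simplifications, not the codec's actual interfaces.

```python
# Sketch of inter-prediction motion derivation per mode:
#  - skip/merge: the neighbor's motion vector is reused as-is,
#  - MVP: the neighbor's motion vector serves as a predictor and is refined
#    by the transmitted motion vector difference (MVD).

def derive_motion(mode, neighbor_mv, mvd=None):
    if mode in ("skip", "merge"):
        return neighbor_mv
    if mode == "mvp":
        return (neighbor_mv[0] + mvd[0], neighbor_mv[1] + mvd[1])
    raise ValueError(f"unknown mode: {mode}")
```
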

In the case of inter prediction, the neighboring blocks may include spatially neighboring blocks existing in the current picture and temporally neighboring blocks existing in the reference picture. A reference picture including temporally adjacent blocks may also be referred to as a collocated picture (colPic). The motion information may include a motion vector and a reference picture index. Information such as prediction mode information and motion information may be (entropy) encoded and then output as a bitstream.

When motion information of a temporally neighboring block is used in the skip mode and the merge mode, the highest picture in the reference picture list may be used as a reference picture. Reference pictures included in a reference picture list may be sorted based on Picture Order Count (POC) differences between the current picture and the corresponding reference picture. POC corresponds to display order and can be distinguished from coding order.
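As a minimal illustration of the POC-based ordering described above (the function name and the list representation are assumptions):

```python
# Sketch: order a reference picture list by the absolute POC distance from
# the current picture, so that temporally closer pictures come first.

def order_ref_list(cur_poc, ref_pocs):
    return sorted(ref_pocs, key=lambda poc: abs(cur_poc - poc))
```
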

The subtractor 121 generates a residual sample, which is the difference between the original sample and the predicted sample. If skip mode is applied, residual samples may not be generated as described above.

For example, if intra prediction is applied to a prediction block or a coding block that overlaps with a transform block that is a 4 × 4 residual array, the residual samples may be transformed using a Discrete Sine Transform (DST) kernel; otherwise, they may be transformed using a Discrete Cosine Transform (DCT) kernel.
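The kernel selection rule above can be sketched as a simple predicate; the function name and argument set are illustrative.

```python
# Sketch of the transform-kernel choice described above: DST for 4x4
# intra-predicted residual blocks, DCT otherwise.

def select_transform_kernel(is_intra, block_w, block_h):
    if is_intra and block_w == 4 and block_h == 4:
        return "DST"
    return "DCT"
```
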

The quantizer 123 may quantize the transform coefficients to generate quantized transform coefficients.

The rearranger 124 rearranges the quantized transform coefficients. The rearranger 124 may rearrange the quantized transform coefficients from a two-dimensional block form into a one-dimensional vector through a coefficient scanning method. Although the rearranger 124 is described as a separate component, the rearranger 124 may be part of the quantizer 123.
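A coefficient scan of the kind performed by the rearranger can be sketched as follows. An up-right diagonal scan is shown purely as an illustration, since the actual scan pattern depends on the codec configuration.

```python
# Sketch: reorder a square 2D block of quantized coefficients into a 1D
# vector using an up-right diagonal (anti-diagonal) scan.

def diagonal_scan(block):
    n = len(block)
    out = []
    for s in range(2 * n - 1):                      # one pass per anti-diagonal
        for y in range(min(s, n - 1), max(-1, s - n), -1):
            out.append(block[y][s - y])             # walk the diagonal upward
    return out
```

The inverse reordering on the decoder side (see the rearranger of the decoding apparatus) would scatter the 1D vector back into the 2D block along the same scan path.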

The entropy encoder 130 may encode, together or separately, information necessary for video reconstruction (e.g., values of syntax elements) in addition to the quantized transform coefficients. The entropy encoder 130 may transmit or store the entropy-encoded information in the form of a bitstream in NAL (network abstraction layer) units.

The inverse quantizer 125 inverse-quantizes the values (transform coefficients) quantized by the quantizer 123, and the inverse transformer 126 inverse-transforms the values inverse-quantized by the inverse quantizer 125 to generate residual samples.

The adder 140 adds the residual samples to the prediction samples to reconstruct the picture. The residual samples may be added to the prediction samples in units of blocks to generate a reconstructed block. Although the adder 140 is described as a separate component, the adder 140 may be part of the predictor 110. Further, the adder 140 may be referred to as a reconstructor or a reconstruction block generator.
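The adder's reconstruction step, prediction sample plus residual sample with clipping to the valid sample range, can be sketched as follows (the bit depth and names are illustrative):

```python
# Sketch of sample reconstruction: reconstructed = prediction + residual,
# clipped to the valid range for the given bit depth.

def reconstruct(pred, resid, bit_depth=8):
    max_val = (1 << bit_depth) - 1
    return [min(max(p + r, 0), max_val) for p, r in zip(pred, resid)]
```
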

The filter 150 may apply an adaptive loop filter (ALF) to the reconstructed picture. The ALF may be applied to the reconstructed picture to which deblocking filtering and/or a sample adaptive offset has been applied.

The memory 160 may store information required for reconstructing a picture (decoded picture) or encoding/decoding. Here, the reconstructed picture may be a reconstructed picture filtered by the filter 150. The stored reconstructed pictures may be used as reference pictures for (inter) prediction of other pictures. For example, the memory 160 may store (reference) pictures used for inter prediction. Here, a picture for inter prediction may be specified according to a reference picture set or a reference picture list.

Fig. 2 is a schematic diagram illustrating a configuration of a video decoding apparatus to which the present disclosure is applied.

Referring to fig. 2, the video decoding apparatus 200 includes an entropy decoder 210, a residual processor 220, a predictor 230, an adder 240, a filter 250, and a memory 260. Here, the residual processor 220 may include a rearranger 221, an inverse quantizer 222, and an inverse transformer 223.

When a bitstream including video information is input, the video decoding apparatus 200 may reconstruct the video in correspondence with the process by which the video information was processed in the video encoding apparatus.

For example, the video decoding apparatus 200 may perform video decoding using a processing unit applied in the video encoding apparatus. Thus, a processing unit block of video decoding may be, in one example, a coding unit and, in another example, a coding unit, a prediction unit, or a transform unit. The coding unit may be partitioned from the largest coding unit according to a quadtree structure and/or a binary tree structure.

In some cases, a prediction unit and a transform unit may also be used, and in this case, the prediction block is a block derived or divided from a coding unit and may be a unit of sample prediction. Here, the prediction unit may be divided into subblocks. The transform unit may be divided from the coding unit according to a quadtree structure, and may be a unit that derives a transform coefficient or a unit that derives a residual signal from the transform coefficient.

For example, the entropy decoder 210 may decode information in the bitstream based on an encoding method such as exponential Golomb coding, CAVLC, or CABAC, and may output values of syntax elements required for video reconstruction and quantized values of transform coefficients for a residual.

More specifically, the CABAC entropy decoding method may receive a bin corresponding to each syntax element in the bitstream, determine a context model using the decoding-target syntax element information and the decoding information of neighboring blocks and the decoding-target block, or the information of the symbol/bin decoded in a previous step, predict the probability of occurrence of a bin according to the determined context model, and perform arithmetic decoding on the bin to generate a symbol corresponding to the value of each syntax element. Here, after determining the context model, the CABAC entropy decoding method may update the context model using the information of the decoded symbol/bin for the context model of the next symbol/bin.

The information for prediction among the information decoded in the entropy decoder 210 may be provided to the predictor 230, and the residual values, i.e., the quantized transform coefficients on which the entropy decoder 210 has performed entropy decoding, may be input to the reorderer 221.

The reorderer 221 may rearrange the quantized transform coefficients into a two-dimensional block form. The reorderer 221 may perform reordering corresponding to coefficient scanning performed by the encoding apparatus. Although the reorderer 221 is described as a separate component, the reorderer 221 may be part of the inverse quantizer 222.

The inverse quantizer 222 may inverse-quantize the quantized transform coefficients based on the (inverse) quantization parameter to output the transform coefficients. In this case, information for deriving the quantization parameter may be signaled from the encoding device.

Inverse transformer 223 may inverse transform the transform coefficients to derive residual samples.

The predictor 230 may perform prediction on the current block and may generate a prediction block including the prediction samples of the current block. The unit of prediction performed in the predictor 230 may be a coding block, a transform block, or a prediction block.

The predictor 230 may determine whether to apply intra prediction or inter prediction based on information used for prediction. In this case, the unit for determining which of intra prediction and inter prediction is to be used may be different from the unit for generating the prediction samples. In addition, the unit for generating the prediction sample may be different between the inter prediction and the intra prediction. For example, which of inter prediction and intra prediction is to be applied may be determined in units of CUs. In addition, for example, in inter prediction, prediction samples may be generated by determining a prediction mode in units of PUs, and in intra prediction, prediction samples may be generated in units of TUs by determining a prediction mode in units of PUs.

In the case of intra prediction, the predictor 230 may derive prediction samples for the current block based on neighboring reference samples in the current picture. The predictor 230 may derive prediction samples of the current block by applying a directional mode or a non-directional mode based on neighboring reference samples of the current block. In this case, the prediction mode to be applied to the current block may be determined by using the intra prediction modes of the neighboring blocks.

In the case of inter prediction, the predictor 230 may derive the prediction samples of the current block based on samples specified in a reference picture according to a motion vector. The predictor 230 may derive the prediction samples of the current block using one of the skip mode, the merge mode, and the MVP mode. Here, the motion information required for inter prediction of the current block provided by the video encoding apparatus, for example, a motion vector and a reference picture index, may be acquired or derived based on the information for prediction.

In the skip mode and the merge mode, motion information of neighboring blocks may be taken as motion information of the current block. Here, the neighboring blocks may include spatially neighboring blocks and temporally neighboring blocks.

The predictor 230 may construct a merge candidate list using the motion information of available neighboring blocks and use the motion information indicated by the merge index on the merge candidate list as the motion information of the current block. The merge index may be signaled by the encoding device. The motion information may include a motion vector and a reference picture. When the motion information of a temporally neighboring block is used in the skip mode and the merge mode, the highest picture in the reference picture list may be used as the reference picture.

In case of the skip mode, unlike the merge mode, a difference (residual) between the prediction sample and the original sample is not transmitted.

In case of the MVP mode, a motion vector of a current block may be derived using motion vectors of neighboring blocks as a motion vector predictor. Here, the neighboring blocks may include spatially neighboring blocks and temporally neighboring blocks.

When the merge mode is applied, for example, the merge candidate list may be generated using motion vectors of reconstructed spatially neighboring blocks and/or motion vectors corresponding to Col blocks, which are temporally neighboring blocks. In the merge mode, a motion vector of a candidate block selected from the merge candidate list is used as a motion vector of the current block. The above-mentioned information for prediction may include a merge index indicating a candidate block having a best motion vector selected from candidate blocks included in the merge candidate list. Here, the predictor 230 may derive a motion vector of the current block using the merge index.

When the MVP (motion vector prediction) mode is applied, as another example, a motion vector predictor candidate list may be generated using the motion vectors of reconstructed spatially neighboring blocks and/or the motion vector corresponding to the Col block, which is a temporally neighboring block. That is, the motion vectors of reconstructed spatially neighboring blocks and/or the motion vector corresponding to the Col block may be used as motion vector candidates. The above-mentioned information for prediction may include a predicted motion vector index indicating the best motion vector selected from the motion vector candidates included in the list. Here, the predictor 230 may select the predicted motion vector of the current block from among the motion vector candidates included in the motion vector candidate list using the motion vector index. The predictor of the encoding apparatus may obtain the motion vector difference (MVD) between the motion vector of the current block and the motion vector predictor, encode the MVD, and output the encoded MVD in the form of a bitstream. That is, the MVD may be obtained by subtracting the motion vector predictor from the motion vector of the current block. Here, the predictor 230 may acquire the motion vector difference included in the information for prediction and derive the motion vector of the current block by adding the motion vector difference to the motion vector predictor. In addition, the predictor may obtain or derive a reference picture index indicating the reference picture from the above-mentioned information for prediction.
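The MV reconstruction in MVP mode described above (decoder adds the signaled MVD to the predictor selected by the motion vector index) can be sketched as follows; this is a minimal illustration with hypothetical names, not the normative integer-clipping process:

```python
def derive_mv(candidate_list, mvp_index, mvd):
    """Reconstruct the current block's motion vector as MVP + MVD.

    candidate_list: list of (x, y) motion vector predictor candidates
    mvp_index: index signaled in the bitstream selecting the predictor
    mvd: (x, y) motion vector difference signaled by the encoder
    """
    mvp = candidate_list[mvp_index]
    return (mvp[0] + mvd[0], mvp[1] + mvd[1])

def derive_mvd(mv, mvp):
    """Encoder-side inverse: MVD = MV - MVP (per component)."""
    return (mv[0] - mvp[0], mv[1] - mvp[1])

# Example: two spatial candidates; the index selects the second one.
mv = derive_mv([(4, -2), (3, 1)], 1, (2, -1))
# mv == (5, 0)
```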

The adder 240 may add the residual samples to the prediction samples to reconstruct the current block or the current picture. The adder 240 may reconstruct the current picture by adding the residual samples to the prediction samples in units of blocks. When the skip mode is applied, the residual is not sent, so the prediction sample can become a reconstructed sample. Although the adder 240 is described as a separate component, the adder 240 may be part of the predictor 230. Further, the adder 240 may be referred to as a reconstructor or a reconstruction block generator.

The filter 250 may apply deblocking filtering, sample adaptive offset (SAO), and/or ALF to the reconstructed picture. Here, the sample adaptive offset may be applied in sample units after deblocking filtering. The ALF may be applied after deblocking filtering and/or after applying the sample adaptive offset.

The memory 260 may store reconstructed pictures (decoded pictures) or information required for decoding. Here, the reconstructed picture may be a reconstructed picture filtered by the filter 250. For example, the memory 260 may store pictures for inter prediction. Here, a picture for inter prediction may be specified according to a reference picture set or a reference picture list. The reconstructed picture may be used as a reference picture for other pictures. The memory 260 may output the reconstructed pictures in output order.

In addition, for inter prediction, an inter prediction method that considers image distortion has been proposed. In particular, an affine motion model has been proposed that efficiently derives motion vectors for the sub-blocks or sample points of a current block and improves the accuracy of inter prediction despite image distortion such as rotation, enlargement, or reduction. Prediction using an affine motion model may be referred to as affine inter prediction or affine motion prediction.

For example, affine inter-frame prediction using an affine motion model can efficiently express four motions, i.e., four deformations, as described below.

Fig. 3 illustrates a motion expressed by an affine motion model. Referring to fig. 3, the motion that can be represented by the affine motion model may include a translational motion, a zooming motion, a rotational motion, and a shearing motion. That is, a zoom motion in which (a part of) an image is scaled according to the passage of time, a rotation motion in which (a part of) an image is rotated according to the passage of time, a shear motion in which (a part of) an image is parallelogram-deformed according to the passage of time, and a translational motion in which (a part of) an image is moved on a plane according to the passage of time can be effectively represented as illustrated in fig. 3.

The encoding apparatus/decoding apparatus may predict a distorted shape of an image based on a motion vector at a Control Point (CP) of a current block through affine inter prediction, and may improve compression performance of the image by increasing prediction accuracy. In addition, since the motion vector of at least one control point of the current block can be derived using the motion vectors of neighboring blocks of the current block, the data amount burden of additional information can be reduced and the inter prediction efficiency can be greatly improved.

As an example of affine inter prediction, motion information at three control points (i.e., three reference points) may be required.

Fig. 4 illustrates an affine motion model using motion vectors of 3 control points.

When the upper-left sample position in the current block (400) is (0,0), the sample positions (0,0), (w,0), and (0,h) may be defined as the control points, as shown in fig. 4. Hereinafter, the control point at sample position (0,0) may be denoted as CP0, the control point at sample position (w,0) may be denoted as CP1, and the control point at sample position (0,h) may be denoted as CP2.

The above-described control points and the motion vectors of the corresponding control points may be used to derive the formula for the affine motion model. The formula for an affine motion model can be expressed as follows:

[formula 1]

vx = ((v1x - v0x)/w) * x + ((v2x - v0x)/h) * y + v0x
vy = ((v1y - v0y)/w) * x + ((v2y - v0y)/h) * y + v0y

Here, w denotes the width of the current block (400) and h denotes the height of the current block (400); v0x and v0y represent the x-component and y-component of the motion vector of CP0, respectively; v1x and v1y represent the x-component and y-component of the motion vector of CP1, respectively; and v2x and v2y represent the x-component and y-component of the motion vector of CP2, respectively. In addition, x represents the x-component of the position of the target sample in the current block (400), y represents the y-component of the position of the target sample in the current block (400), vx represents the x-component of the motion vector of the target sample in the current block (400), and vy represents the y-component of the motion vector of the target sample in the current block (400).

Since the motion vector of CP0, the motion vector of CP1, and the motion vector of CP2 are known, a motion vector for each sample position in the current block can be derived based on equation 1. That is, according to the affine motion model, the motion vectors v0(v0x, v0y), v1(v1x, v1y), and v2(v2x, v2y) at the control points may be scaled based on the distance ratios between the three control points and the coordinates (x, y) of the target sample, to derive the motion vector of the target sample according to its position. That is, according to the affine motion model, a motion vector of each sample in the current block may be derived based on the motion vectors of the control points. Further, the set of motion vectors of the samples in the current block derived from the affine motion model may be referred to as an affine motion vector field (MVF).
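The sample-wise derivation of equation 1 can be sketched as follows; this is an illustrative floating-point helper (a real codec would use the normative integer arithmetic), with a hypothetical function name:

```python
def affine_mv_6param(v0, v1, v2, w, h, x, y):
    """Motion vector of the sample at (x, y) from the three control-point
    motion vectors v0 (CP0), v1 (CP1), and v2 (CP2), per equation 1."""
    vx = (v1[0] - v0[0]) / w * x + (v2[0] - v0[0]) / h * y + v0[0]
    vy = (v1[1] - v0[1]) / w * x + (v2[1] - v0[1]) / h * y + v0[1]
    return (vx, vy)
```

Note that the model interpolates the control points exactly: the sample at (0, 0) gets v0, the sample at (w, 0) gets v1, and the sample at (0, h) gets v2.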

Further, six parameters in equation 1 may be represented by a, b, c, d, e, and f as shown in the following equation, and the equation of the affine motion model represented by these six parameters may be as follows:

[formula 2]

a = (v1x - v0x)/w, b = (v2x - v0x)/h, c = v0x
d = (v1y - v0y)/w, e = (v2y - v0y)/h, f = v0y

vx = a*x + b*y + c
vy = d*x + e*y + f

Here, w denotes the width of the current block (400), h denotes the height of the current block (400), v0x and v0y represent the x-component and y-component of the motion vector of CP0, respectively, v1x and v1y represent the x-component and y-component of the motion vector of CP1, respectively, and v2x and v2y represent the x-component and y-component of the motion vector of CP2, respectively. In addition, x represents the x-component of the position of the target sample in the current block (400), y represents the y-component of the position of the target sample in the current block (400), vx represents the x-component of the motion vector of the target sample in the current block (400), and vy represents the y-component of the motion vector of the target sample in the current block (400).

The affine motion model or affine inter prediction using these six parameters may be referred to as a 6-parameter affine motion model or AF6.

In addition, as an example of affine inter prediction, motion information at two control points (i.e., two reference points) may be required.

Fig. 5 illustrates an affine motion model using motion vectors of two control points. Three motions, including translational, zooming, and rotational motions, can be represented using affine motion models of the two control points. The affine motion model representing these three motions may be referred to as a similarity affine motion model or a simplified affine motion model.

When the upper-left sample position in the current block (500) is (0,0), the sample positions (0,0) and (w,0) may be defined as the control points, as shown in fig. 5. Hereinafter, the control point at sample position (0,0) may be denoted as CP0, and the control point at sample position (w,0) may be denoted as CP1.

The above-described control points and the motion vectors of the corresponding control points may be used to derive the formula for the affine motion model. The formula for an affine motion model can be expressed as follows:

[formula 3]

vx = ((v1x - v0x)/w) * x - ((v1y - v0y)/w) * y + v0x
vy = ((v1y - v0y)/w) * x + ((v1x - v0x)/w) * y + v0y

Here, w denotes the width of the current block (500), v0x and v0y represent the x-component and y-component of the motion vector of CP0, respectively, and v1x and v1y represent the x-component and y-component of the motion vector of CP1, respectively. In addition, x represents the x-component of the position of the target sample in the current block (500), y represents the y-component of the position of the target sample in the current block (500), vx represents the x-component of the motion vector of the target sample in the current block (500), and vy represents the y-component of the motion vector of the target sample in the current block (500).

Further, four parameters in equation 3 may be represented by a, b, c, and d in the following equation, and the equation of the affine motion model represented by these four parameters may be as follows.

[formula 4]

a = (v1x - v0x)/w, b = (v1y - v0y)/w, c = v0x, d = v0y

vx = a*x - b*y + c
vy = b*x + a*y + d

Here, w denotes the width of the current block (500), v0x and v0y represent the x-component and y-component of the motion vector of CP0, respectively, and v1x and v1y represent the x-component and y-component of the motion vector of CP1, respectively. In addition, x represents the x-component of the position of the target sample in the current block (500), y represents the y-component of the position of the target sample in the current block (500), vx represents the x-component of the motion vector of the target sample in the current block (500), and vy represents the y-component of the motion vector of the target sample in the current block (500). An affine motion model using two control points may be represented by the four parameters a, b, c, and d as shown in equation 4, and thus, the affine motion model or affine inter prediction using these four parameters may be referred to as a 4-parameter affine motion model or AF4. That is, according to the affine motion model, a motion vector of each sample in the current block may be derived based on the motion vectors of the control points. Further, the set of motion vectors of the samples in the current block derived from the affine motion model may be referred to as an affine motion vector field (MVF).
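The 4-parameter derivation of equation 3 can be sketched in the same way as the 6-parameter case; again an illustrative floating-point helper with a hypothetical name:

```python
def affine_mv_4param(v0, v1, w, x, y):
    """Motion vector of the sample at (x, y) from the two control-point
    motion vectors v0 (CP0) and v1 (CP1), per equation 3."""
    a = (v1[0] - v0[0]) / w   # combined scale/rotation term
    b = (v1[1] - v0[1]) / w   # rotation term
    return (a * x - b * y + v0[0], b * x + a * y + v0[1])
```

As before, the sample at (0, 0) gets v0 and the sample at (w, 0) gets v1; only translation, zoom, and rotation can be represented, matching the similarity affine motion model described above.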

Further, as described above, the motion vector of a sample unit can be derived by an affine motion model, and therefore, the accuracy of inter prediction can be greatly improved. However, in this case, the complexity of the motion compensation process may increase greatly.

Thus, a restriction may be made such that a motion vector of a sub-block unit in the current block is derived instead of a motion vector of a derived sample unit.

For example, when a subblock is set to a size of n × n (n is a positive integer, e.g., n is 4), a motion vector may be derived in units of n × n blocks in the current block based on the affine motion model, and various methods for deriving a motion vector representing each subblock may be applied.

For example, referring to fig. 6, the motion vector of each sub-block may be derived using the center lower-right sample position of each sub-block as a representative coordinate. Here, the center lower-right position indicates the lower-right sample position among the four samples at the center of the sub-block. For example, when n is an odd number, a single sample is located at the center of the sub-block, and in this case, the center sample position may be used to derive the motion vector of the sub-block. However, when n is an even number, four samples are adjacent at the center of the sub-block, and in this case, the lower-right sample position may be used to derive the motion vector. For example, referring to fig. 6, the representative coordinates of the sub-blocks may be derived as (2, 2), (6, 2), (10, 2), ..., (14, 14), and the encoding/decoding apparatus may derive the motion vector of each sub-block by substituting each representative coordinate of the sub-blocks into equation 1 or equation 3 above. The motion vectors of the sub-blocks in the current block derived by the affine motion model may be referred to as an affine MVF.
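The representative-coordinate derivation for even n (center lower-right sample of each n x n sub-block) can be sketched as follows; the helper name is illustrative:

```python
def subblock_rep_coords(block_w, block_h, n=4):
    """Center lower-right representative coordinate of each n x n sub-block
    (n even), scanning the sub-blocks in raster order."""
    return [(sx + n // 2, sy + n // 2)
            for sy in range(0, block_h, n)
            for sx in range(0, block_w, n)]

# For a 16x16 block this yields (2, 2), (6, 2), (10, 2), ..., (14, 14).
```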

Also, as an example, the size of the sub-block in the current block may be derived based on the following equation.

[formula 5]

M = clip3(4, w, (w * MvPre) / max(abs(v1x - v0x), abs(v1y - v0y)))
N = clip3(4, h, (h * MvPre) / max(abs(v2x - v0x), abs(v2y - v0y)))

Here, M denotes the width of the sub-block, and N denotes the height of the sub-block. In addition, v0x and v0y represent the x-component and y-component of CPMV0 of the current block, respectively, v1x and v1y represent the x-component and y-component of CPMV1 of the current block, respectively, w represents the width of the current block, h represents the height of the current block, and MvPre represents the motion vector fractional precision. For example, the motion vector fractional precision may be set to 1/16.
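Under the reconstruction of formula 5 above, the sub-block size derivation might be sketched as follows; this uses floating-point for readability (a real codec would use integer arithmetic), and it assumes the control-point motion vectors differ so the denominators are non-zero:

```python
def clip3(lo, hi, x):
    """Clip x to the inclusive range [lo, hi]."""
    return max(lo, min(hi, x))

def subblock_size(v0, v1, v2, w, h, mv_pre=1 / 16):
    """Sub-block width M and height N per formula 5.

    v0, v1, v2: CPMV0, CPMV1, CPMV2 as (x, y); w, h: block width/height;
    mv_pre: motion vector fractional precision (e.g., 1/16).
    """
    m = clip3(4, w, w * mv_pre / max(abs(v1[0] - v0[0]), abs(v1[1] - v0[1])))
    n = clip3(4, h, h * mv_pre / max(abs(v2[0] - v0[0]), abs(v2[1] - v0[1])))
    return m, n
```

The lower clip of 4 reflects the minimum 4x4 sub-block granularity mentioned above.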

Further, inter prediction using the above-described affine motion model (i.e., affine motion prediction) may have an affine merge mode (AF_MERGE) and an affine inter mode (AF_INTER). Here, the affine inter mode may be referred to as an affine MVP mode (AF_MVP).

The affine merge mode is similar to the existing merge mode in that the MVDs for the motion vectors of the control points are not sent. That is, similar to the existing skip/merge mode, the affine merge mode may refer to an encoding/decoding method that performs prediction by deriving CPMV for each of two or three control points from neighboring blocks of a current block.

For example, when the AF_MERGE mode is applied to the current block, the MVs of CP0 and CP1 (i.e., CPMV0 and CPMV1) may be derived from a neighboring block to which the affine mode has been applied among the neighboring blocks of the current block. That is, CPMV0 and CPMV1 of the neighboring block to which the affine mode has been applied may be derived as merge candidates, and the merge candidates may be derived as CPMV0 and CPMV1 of the current block.

The affine inter mode may represent inter prediction in which a motion vector predictor (MVP) of the motion vector of each control point is derived, the motion vector of each control point is derived based on a motion vector difference (MVD) and the MVP, and an affine MVF of the current block is derived based on the motion vectors of the control points, with prediction performed based on the affine MVF. Here, the motion vector of a control point may be expressed as a control point motion vector (CPMV), the MVP of a control point may be expressed as a control point motion vector predictor (CPMVP), and the MVD of a control point may be expressed as a control point motion vector difference (CPMVD). Specifically, for example, the encoding apparatus may derive a control point motion vector predictor (CPMVP) and a control point motion vector (CPMV) for each of CP0 and CP1 (or CP0, CP1, and CP2), and may transmit or store information on the CPMVD, which is the difference between the CPMV and the CPMVP, and/or information on the CPMVP.

Here, when the affine inter mode is applied to the current block, the encoding apparatus/decoding apparatus may construct an affine MVP candidate list based on the neighboring blocks of the current block. An affine MVP candidate may be referred to as a CPMVP pair candidate, and the affine MVP candidate list may be referred to as a CPMVP candidate list.

In addition, each of the affine MVP candidates may refer to a combination of CPMVPs of CP0 and CP1 in a 4-parameter affine motion model, and may refer to a combination of CPMVPs of CP0, CP1, and CP2 in a 6-parameter affine motion model.

Fig. 7 illustrates a flow chart of an affine motion prediction method according to an embodiment of the present disclosure.

Referring to fig. 7, the affine motion prediction method may be represented as follows. When the affine motion prediction method starts, first, a CPMV pair may be obtained (S700). Here, when a 4-parameter affine model is used, the CPMV pair may include CPMV0 and CPMV1.

Thereafter, affine motion compensation may be performed based on the CPMV pair (S710), and affine motion prediction may be terminated.

Additionally, there may be two affine prediction modes for determining CPMV0 and CPMV1. Here, the two affine prediction modes may include an affine inter mode and an affine merge mode. In the affine inter mode, CPMV0 and CPMV1 may be determined explicitly by signaling two pieces of motion vector difference (MVD) information, one for CPMV0 and one for CPMV1. In contrast, in the affine merge mode, the CPMV pair may be derived without MVD information being signaled.

In other words, in the affine merge mode, the CPMV of the current block may be derived using the CPMVs of the neighboring blocks encoded in the affine mode, and in the case of determining a motion vector in units of sub-blocks, the affine merge mode may be referred to as a sub-block merge mode.

In the affine merging mode, the encoding apparatus may signal, to the decoding apparatus, indexes of neighboring blocks encoded in the affine mode for deriving the CPMV of the current block, and may also signal a difference value between the CPMV of the neighboring blocks and the CPMV of the current block. Here, in the affine merge mode, an affine merge candidate list may be constructed based on neighboring blocks, and indexes of the neighboring blocks may represent the neighboring blocks to be referenced in order to derive the CPMV of the current block on the affine merge candidate list. The affine merge candidate list may be referred to as a subblock merge candidate list.

The affine inter mode may be referred to as an affine MVP mode. In the affine MVP mode, the CPMV of the current block may be derived based on a Control Point Motion Vector Predictor (CPMVP) and a Control Point Motion Vector Difference (CPMVD). In other words, the encoding device may determine the CPMVP of the CPMV of the current block, derive the CPMVD, which is a difference between the CPMV and the CPMVP of the current block, and signal information on the CPMVP and information on the CPMVD to the decoding device. Here, the affine MVP mode may construct an affine MVP candidate list based on neighboring blocks, and the information on the CPMVP may represent the neighboring blocks to be referenced in order to derive the CPMVP of the CPMV of the current block on the affine MVP candidate list. The affine MVP candidate list may be referred to as a control point motion vector predictor candidate list.

For example, in the case of an affine inter mode to which a 6-parameter affine motion model is applied, the current block may be encoded as follows.

Fig. 8 is a diagram for describing a method for deriving a motion vector predictor at a control point according to an embodiment of the present disclosure.

Referring to fig. 8, the motion vector of CP0 of the current block may be represented as v0, the motion vector of CP1 may be represented as v1, the motion vector of the control point at the lower-left sample position may be represented as v2, and the motion vector of CP2 may be represented as v3. More specifically, v0 may represent the CPMVP of CP0, v1 may represent the CPMVP of CP1, and v2 may represent the CPMVP of CP2.

The affine MVP candidate may be a combination of the CPMVP candidate of CP0, the CPMVP candidate of CP1, and the CPMVP candidate of CP2.

For example, affine MVP candidates may be derived as described below.

More specifically, a maximum of 12 CPMVP candidate combinations may be determined according to the formula shown below.

[ formula 6]

{(v0, v1, v2) | v0 = {vA, vB, vC}, v1 = {vD, vE}, v2 = {vF, vG}}

Here, vA may represent the motion vector of the neighboring block A, vB may represent the motion vector of the neighboring block B, vC may represent the motion vector of the neighboring block C, vD may represent the motion vector of the neighboring block D, vE may represent the motion vector of the neighboring block E, vF may represent the motion vector of the neighboring block F, and vG may represent the motion vector of the neighboring block G.

In addition, the neighboring block a may represent a neighboring block at an upper left side of an upper left sample position of the current block, the neighboring block B may represent a neighboring block at an upper side of the upper left sample position of the current block, and the neighboring block C may represent a neighboring block at a left side of the upper left sample position of the current block. In addition, the neighboring block D may represent a neighboring block at an upper side of a top-right sample position of the current block, and the neighboring block E may represent a neighboring block at an upper-right side of the top-right sample position of the current block. Also, the neighboring block F may represent a neighboring block at the left side of the lower-left sample position of the current block, and the neighboring block G may represent a neighboring block at the lower-left side of the lower-left sample position of the current block.

More specifically, referring to equation 6 above, the CPMVP candidates of CP0 may include the motion vector vA of the neighboring block A, the motion vector vB of the neighboring block B, and/or the motion vector vC of the neighboring block C; the CPMVP candidates of CP1 may include the motion vector vD of the neighboring block D and/or the motion vector vE of the neighboring block E; and the CPMVP candidates of CP2 may include the motion vector vF of the neighboring block F and/or the motion vector vG of the neighboring block G.

In other words, the CPMVP v0 of CP0 may be derived based on the motion vector of at least one of the neighboring blocks A, B, and C of the upper-left sample position. Here, the neighboring block A may represent the block at the upper-left side of the upper-left sample position of the current block, the neighboring block B may represent the block at the upper side of the upper-left sample position of the current block, and the neighboring block C may represent the block at the left side of the upper-left sample position of the current block.

A combination of up to 12 CPMVP candidates including the CPMVP candidate of CP0, the CPMVP candidate of CP1, and the CPMVP candidate of CP2 may be derived based on motion vectors of neighboring blocks.

Thereafter, the derived CPMVP candidate combinations may be sorted in ascending order of their DV values. The first two CPMVP candidate combinations may then be derived as the affine MVP candidates.

The DV of the CPMVP candidate combination can be derived using the following equation.

[ formula 7]

DV=|(v1x-v0x)*h-(v2y-v0y)*w|+|(v1y-v0y)*h+(v2x-v0x)*w|
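The candidate ordering described above can be sketched as follows; the combination enumeration follows equation 6 and the cost follows formula 7, with illustrative function names:

```python
from itertools import product

def dv(v0, v1, v2, w, h):
    """Divergence value (DV) of one CPMVP combination, per formula 7."""
    return (abs((v1[0] - v0[0]) * h - (v2[1] - v0[1]) * w)
            + abs((v1[1] - v0[1]) * h + (v2[0] - v0[0]) * w))

def best_two_cpmvp_combinations(c0, c1, c2, w, h):
    """Enumerate the up-to-12 combinations of CP0/CP1/CP2 candidates and
    return the two with the lowest DV."""
    combos = list(product(c0, c1, c2))
    combos.sort(key=lambda c: dv(c[0], c[1], c[2], w, h))
    return combos[:2]
```

A DV of zero corresponds to a purely translational (or consistently rotated/scaled) combination; larger DV values indicate combinations whose control-point vectors diverge from a coherent affine model.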

Thereafter, the encoding device may determine the CPMV of each of the affine MVP candidates. Then, by comparing the rate-distortion (RD) costs of the CPMVs, the affine MVP candidate with the lowest RD cost may be selected as the best affine MVP candidate for the current block. The encoding device may encode and signal an index indicating the best candidate together with the CPMVD.

In addition, for example, in case of applying the affine merging mode, the current block may be encoded as follows.

Fig. 9 is a diagram for describing a method of deriving a motion vector predictor at a control point according to an embodiment of the present disclosure.

The affine merge candidate list of the current block may be constructed based on neighboring blocks of the current block shown in fig. 9. The neighboring blocks may include a neighboring block a, a neighboring block B, a neighboring block C, a neighboring block D, and a neighboring block E. Also, the neighboring block a may represent a left neighboring block of the current block, the neighboring block B may represent an upper neighboring block of the current block, the neighboring block C may represent an upper right neighboring block of the current block, the neighboring block D may represent a lower left neighboring block of the current block, and the neighboring block E may represent an upper left neighboring block of the current block.

For example, in the case where the size of the current block is WxH, and in the case where the x-component of the upper-left sample position of the current block is 0 and the y-component is 0, the left neighboring block may be a block including samples of coordinates (-1, H-1), the upper neighboring block may be a block including samples of coordinates (W-1, -1), the upper-right neighboring block may be a block including samples of coordinates (W, -1), the lower-left neighboring block may be a block including samples of coordinates (-1, H), and the upper-left neighboring block may be a block including samples of coordinates (-1, -1).
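Assuming a WxH current block with its upper-left sample at (0, 0), the five neighbor sample positions listed above can be tabulated as follows (the dictionary keys are illustrative only):

```python
def affine_merge_neighbor_coords(w, h):
    """Sample coordinates used to locate the five spatial neighbors of a
    WxH block whose upper-left sample is at (0, 0)."""
    return {
        "A_left":        (-1, h - 1),   # left neighboring block
        "B_above":       (w - 1, -1),   # upper neighboring block
        "C_above_right": (w, -1),       # upper-right neighboring block
        "D_below_left":  (-1, h),       # lower-left neighboring block
        "E_above_left":  (-1, -1),      # upper-left neighboring block
    }
```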

More specifically, for example, the encoding apparatus may scan the neighboring blocks A, B, C, D, and E of the current block in a specific scan order, and the neighboring block that is first found, in the scan order, to be encoded in an affine prediction mode may be determined as the candidate block of the affine merge mode, i.e., the affine merge candidate. Herein, for example, the specific scan order may be alphabetical. More specifically, the specific scan order may be neighboring block A, neighboring block B, neighboring block C, neighboring block D, and neighboring block E.
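The scan described above can be sketched as follows (a minimal sketch; the list-of-pairs representation of neighboring blocks is hypothetical):

```python
def first_affine_merge_candidate(neighbors):
    # neighbors: (name, is_affine_coded) pairs in the scan order
    # A, B, C, D, E; return the first neighbor encoded in an affine mode.
    for name, is_affine_coded in neighbors:
        if is_affine_coded:
            return name
    return None  # no affine merge candidate is available
```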

Thereafter, the encoding apparatus may determine an affine motion model of the current block by using the determined candidate block, determine a CPMV of the current block based on the affine motion model, and determine an affine MVF of the current block based on the CPMV.

For example, in the case where the neighboring block a is determined as a candidate block of the current block, the neighboring block a may be encoded as described below.

Fig. 10 illustrates an example of performing affine prediction in the case where the neighboring block a is selected as an affine merge candidate.

Referring to fig. 10, the encoding apparatus may determine the neighboring block A of the current block as the candidate block, and the encoding apparatus may derive the affine motion model of the current block based on the CPMVs v2 and v3 of the neighboring block. Thereafter, the encoding apparatus may determine the CPMVs v0 and v1 of the current block based on the affine motion model. The encoding apparatus may determine an affine MVF based on the CPMVs v0 and v1 of the current block, and may perform an encoding process on the current block based on the affine MVF.
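For reference, the per-sample motion vector field implied by two CPMVs can be sketched with the standard 4-parameter affine model (an assumption here, since the model equation is not reproduced in this section; v0 is the top-left CPMV and v1 the top-right CPMV):

```python
def affine_mv(x, y, v0, v1, w):
    # 4-parameter affine MVF (assumed model): motion vector at sample
    # position (x, y) of a block of width w, given CPMVs v0 and v1.
    ax = (v1[0] - v0[0]) / w  # horizontal gradient of the x component
    ay = (v1[1] - v0[1]) / w  # horizontal gradient of the y component
    mvx = v0[0] + ax * x - ay * y
    mvy = v0[1] + ay * x + ax * y
    return (mvx, mvy)
```

At (0, 0) this reproduces v0, and at (w, 0) it reproduces v1, as expected of a CPMV-based field.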

Furthermore, with regard to affine inter prediction, inherited affine candidates for constructing an affine MVP candidate list are being considered.

Herein, the inherited affine candidates may be as follows.

For example, in case the neighboring blocks of the current block are affine blocks and in case the reference picture of the current block is the same as the reference picture of the neighboring blocks, the affine MVP pair of the current block may be determined from the affine motion models of the neighboring blocks. Herein, an affine block may represent a block to which affine inter prediction is applied. Inherited affine candidates can represent CPMVPs (e.g., affine MVP pairs) derived based on affine motion models of neighboring blocks.

More specifically, for example, inherited affine candidates can be derived as described below.

FIG. 11 illustrates exemplary neighboring blocks for deriving inherited affine candidates.

Referring to FIG. 11, the neighboring blocks of the current block may include a left neighboring block A0 of the current block, a lower left neighboring block A1 of the current block, an upper neighboring block B0 of the current block, an upper right neighboring block B1 of the current block, and an upper left neighboring block B2 of the current block.

For example, in the case where the size of the current block is WxH, and in the case where the x-component of the upper-left sample position of the current block is 0 and the y-component is 0, the left neighboring block may be a block including samples of coordinates (-1, H-1), the upper neighboring block may be a block including samples of coordinates (W-1, -1), the upper-right neighboring block may be a block including samples of coordinates (W, -1), the lower-left neighboring block may be a block including samples of coordinates (-1, H), and the upper-left neighboring block may be a block including samples of coordinates (-1, -1).

The encoding apparatus/decoding apparatus may sequentially check the neighboring blocks a0, a1, B0, B1, and B2. Also, in the case where a neighboring block is encoded by using an affine motion model and the reference picture of the current block is the same as the reference picture of that neighboring block, 2 or 3 CPMVs of the current block may be derived based on the affine motion model of the neighboring block. The CPMVs may be derived as an affine MVP candidate for the current block. The affine MVP candidate may represent an inherited affine candidate.

For example, at most two inherited affine candidates may be derived based on neighboring blocks.

For example, the encoding device/decoding device may derive the first affine MVP candidate based on the first block within the neighboring block. Herein, the first block may be encoded by using an affine motion model, and the reference picture of the first block and the reference picture of the current block may be the same. More specifically, when neighboring blocks are checked in a particular order, the first block may be the first block verified as satisfying the condition. The condition may be that encoding is performed using an affine motion model and that a reference picture of the block is the same as a reference picture of the current block.

Thereafter, the encoding device/decoding device may derive a second affine MVP candidate based on a second block within the neighboring block. Herein, the second block may be encoded by using an affine motion model, and the reference picture of the second block and the reference picture of the current block may be the same. More specifically, when the neighboring blocks are checked in a specific order, the second block may be the second block verified to satisfy the condition. The condition may be that encoding is performed using an affine motion model and that a reference picture of the block is the same as a reference picture of the current block.
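The two-step derivation above amounts to collecting, in checking order, up to two neighbors that satisfy the condition (a sketch; the dictionary fields used to represent each neighboring block are hypothetical):

```python
def derive_inherited_candidates(neighbors, cur_ref_poc, max_num=2):
    # neighbors: dicts with 'is_affine', 'ref_poc' and 'cpmvs' fields,
    # listed in the checking order (e.g., A0, A1, B0, B1, B2).
    candidates = []
    for nb in neighbors:
        if nb["is_affine"] and nb["ref_poc"] == cur_ref_poc:
            # CPMVs derived from the neighbor's affine motion model
            candidates.append(nb["cpmvs"])
            if len(candidates) == max_num:
                break
    return candidates
```

The "at most one inherited affine candidate" variant described next corresponds to calling the same routine with `max_num=1`.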

For example, at most one inherited affine candidate may be derived based on neighboring blocks.

For example, the encoding device/decoding device may derive the first affine MVP candidate based on the first block within the neighboring block. Herein, the first block may be encoded by using an affine motion model, and the reference picture of the first block and the reference picture of the current block may be the same. More specifically, when neighboring blocks are checked in a particular order, the first block may be the first block verified as satisfying the condition. The condition may be that encoding is performed using an affine motion model and that a reference picture of the block is the same as a reference picture of the current block.

The source code for deriving the MVP candidates of the current block may be as shown in the following table.

[ Table 1]

Referring to table 1, the encoding apparatus/decoding apparatus may determine whether to encode the left neighboring block by using an affine motion model, and may also determine whether the reference picture of the current block and the reference picture of the left neighboring block are the same. In the case where the above condition is satisfied, the CPMV derived based on the affine motion model of the left neighboring block may be derived as a CPMVP candidate of the current block.

Thereafter, the encoding device/decoding device may determine whether the number of derived CPMVP candidates is less than 2. In case that the number of derived CPMVP candidates is not less than 2, the CPMVP candidate derivation process may end.

In addition, in the case where the number of derived CPMVP candidates is less than 2, it may be determined whether an upper neighboring block is encoded by using an affine motion model and whether a reference picture of the current block is the same as a reference picture of the upper neighboring block. Also, in case that the above condition is satisfied, the CPMV derived based on the affine motion model of the upper neighboring block may be derived as a CPMVP candidate of the current block.

Thereafter, the encoding device/decoding device may determine whether the number of derived CPMVP candidates is less than 2. In case that the number of derived CPMVP candidates is not less than 2, the CPMVP candidate derivation process may end.

In addition, in the case where the number of derived CPMVP candidates is less than 2, it may be determined whether an upper-right neighboring block is encoded by using an affine motion model and whether a reference picture of the current block is the same as a reference picture of the upper-right neighboring block. Also, in case that the above condition is satisfied, the CPMV derived based on the affine motion model of the upper-right neighboring block may be derived as a CPMVP candidate of the current block.

Thereafter, the encoding device/decoding device may determine whether the number of derived CPMVP candidates is less than 2. In case that the number of derived CPMVP candidates is not less than 2, the CPMVP candidate derivation process may end.

In addition, in the case where the number of derived CPMVP candidates is less than 2, it may be determined whether a lower left neighboring block is encoded by using an affine motion model and whether a reference picture of the current block is the same as a reference picture of the lower left neighboring block. Also, in the case where the above condition is satisfied, the CPMV derived based on the affine motion model of the lower left neighboring block may be derived as a CPMVP candidate of the current block.

Thereafter, the encoding device/decoding device may determine whether the number of derived CPMVP candidates is less than 2. In case that the number of derived CPMVP candidates is not less than 2, the CPMVP candidate derivation process may end.

In addition, in the case where the number of derived CPMVP candidates is less than 2, it may be determined whether an upper-left neighboring block is encoded by using an affine motion model and whether a reference picture of the current block is the same as a reference picture of the upper-left neighboring block. Also, in case that the above condition is satisfied, the CPMV derived based on the affine motion model of the upper left neighboring block may be derived as a CPMVP candidate of the current block.

In addition, inherited affine candidates can be derived as described below, as well.

For example, the encoding apparatus/decoding apparatus may sequentially check the neighboring blocks in a specific order. In the case where a neighboring block is encoded by using an affine motion model and the reference picture of the current block is the same as the reference picture of the neighboring block, an inherited affine candidate to which scaling is not applied may be derived based on the neighboring block. In the case where a neighboring block is encoded by using an affine motion model but the reference picture of the current block is not the same as the reference picture of the neighboring block, an inherited affine candidate to which scaling is applied may be derived based on the neighboring block.

More specifically, in the case where the neighboring blocks are encoded by using the affine motion model and in the case where the reference picture of the current block is the same as the reference picture of the neighboring blocks, as described in the previous embodiment above, the encoding apparatus/decoding apparatus may derive the affine MVP candidate of the current block based on the affine motion model of the neighboring blocks.

In addition, in the case where a neighboring block is encoded by using an affine motion model and in the case where a reference picture of the current block is not identical to that of the neighboring block, the encoding apparatus/decoding apparatus may derive a motion vector of a CP of the current block based on the affine motion model of the neighboring block, may scale the motion vector by using a scaling factor, and may derive the scaled motion vector as an affine MVP candidate. Herein, the scaling factor may be a distance ratio of the first temporal distance to the second temporal distance. More specifically, the scaling factor may be a value obtained by dividing the first time distance by the second time distance. The first temporal distance may be a difference value between a Picture Order Count (POC) of a current picture including the current block and a POC of a reference picture of the current block. And, the second temporal distance may be a difference value between the POC of the current picture and the POC of the reference picture of the neighboring block.
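The POC-based scaling described above can be sketched as follows (a sketch using floating-point arithmetic; a real codec implementation would typically use fixed-point scaling):

```python
def scale_mv(mv, cur_poc, cur_ref_poc, nb_ref_poc):
    # first temporal distance: POC(current picture) - POC(current block's ref)
    first = cur_poc - cur_ref_poc
    # second temporal distance: POC(current picture) - POC(neighbor's ref)
    second = cur_poc - nb_ref_poc
    factor = first / second  # scaling factor = first distance / second distance
    return (mv[0] * factor, mv[1] * factor)
```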

The source code of the above-described MVP candidate derivation for the current block may be as shown in the following table.

[ Table 2]

The performance and encoding/decoding time of the above embodiments are very similar; therefore, after performing a complexity analysis, the implementation with the lower complexity may be selected.

A complexity analysis of these embodiments may be derived as shown in the following table.

[ Table 3]

Referring to table 3, unlike the second embodiment in which both scaled and non-scaled affine MVP candidates are considered, the first embodiment, in which only non-scaled affine MVP candidates are considered, is advantageous in requiring a smaller number of comparisons, shifts, and additions.

Furthermore, the present disclosure proposes a method with performance and encoding/decoding time similar to those of the above-described embodiments, but with significantly lower complexity.

For example, the left predictor and the upper predictor are separated. The left predictor may be derived based on the first affine-coded block, found by scanning the left neighboring block and the lower-left neighboring block in a specific order, that has the same reference picture as the current block. The upper predictor may be derived based on the first affine-coded block, found by scanning the upper-right neighboring block, the upper neighboring block, and the upper-left neighboring block in a specific order, that has the same reference picture as the current block.

More specifically, the encoding apparatus/decoding apparatus may derive a first affine MVP candidate of the current block from a left block group including a left adjacent block and a lower-left adjacent block, and the encoding apparatus/decoding apparatus may derive a second affine MVP candidate of the current block from an upper block group including an upper-right adjacent block, an upper adjacent block, and an upper-left adjacent block.

Herein, the first affine MVP candidate may be derived based on a first block within the left block group, the first block may be encoded by using an affine motion model, and a reference picture of the first block may be the same as a reference picture of the current block. More specifically, the first block may be the first block that has been verified to satisfy the condition when the adjacent blocks within the left block group are checked in a particular order. The condition may be that encoding is performed using an affine motion model and that a reference picture of the block is the same as a reference picture of the current block.

In addition, a second affine MVP candidate may be derived based on a second block within the upper block group, the second block may be encoded by using an affine motion model, and the reference picture of the second block and the reference picture of the current block may be the same. More specifically, the second block may be the first block that has been verified to satisfy the condition when the adjacent blocks within the upper block group are examined in a particular order. The condition may be that encoding is performed using an affine motion model and that a reference picture of the block is the same as a reference picture of the current block.

In addition, for example, the order in which the left neighboring block and the lower-left neighboring block are scanned may be the order of the left neighboring block and the lower-left neighboring block. Also, the order of scanning the upper right neighboring block, the upper neighboring block, and the upper left neighboring block may be the order of the upper right neighboring block, the upper neighboring block, and the upper left neighboring block. In addition, another example of the order of scanning the above-described adjacent blocks may be used in addition to the above-described example.
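The left-group/upper-group split can be sketched as follows (a sketch; the dictionary fields are hypothetical, and each group is given in its scan order):

```python
def group_candidate(group, cur_ref_poc):
    # Return the CPMVs of the first block in the group that is coded
    # with an affine motion model and shares the current reference picture.
    for nb in group:
        if nb["is_affine"] and nb["ref_poc"] == cur_ref_poc:
            return nb["cpmvs"]
    return None

def derive_left_and_upper_candidates(left_group, upper_group, cur_ref_poc):
    # left_group: [left (A0), lower-left (A1)];
    # upper_group: [upper-right (B1), upper (B0), upper-left (B2)]
    return (group_candidate(left_group, cur_ref_poc),
            group_candidate(upper_group, cur_ref_poc))
```

Because each group contributes at most one candidate and each neighboring block is visited once, the number of positions and comparisons stays small, which is the complexity advantage claimed for this method.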

The source code of the above affine MVP candidate derivation for the current block may be as shown in the following table.

[ Table 4]

In addition, a complexity analysis of the affine MVP candidate derivation method proposed in the present disclosure may be derived as shown in the following table.

[ Table 5]

Referring to table 5, the proposed affine MVP candidate derivation method considers the smallest number of positions, candidates, comparisons, shifts, and additions compared to the existing implementation and to the embodiment considering only non-scaled affine MVP candidates. Thus, the complexity of the proposed method is minimized compared to the above-mentioned embodiments. Therefore, since the affine MVP candidate derivation method proposed in the present disclosure has performance and encoding/decoding time similar to those of the existing embodiment while having the lowest complexity, it can be determined that the proposed method is superior to the existing embodiment.

Further, for example, in the case where the available number of inherited affine candidates is less than 2, the constructed affine candidates may be considered. The constructed affine candidates can be derived as follows.

Fig. 12 illustrates exemplary spatial candidates of the constructed affine candidates.

As shown in fig. 12, motion vectors of neighboring blocks of the current block may be divided into 3 groups. Referring to fig. 12, the neighbor blocks may include a neighbor block a, a neighbor block B, a neighbor block C, a neighbor block D, a neighbor block E, a neighbor block F, and a neighbor block G.

The neighboring block a may represent a neighboring block at an upper left side of an upper left sample position of the current block, the neighboring block B may represent a neighboring block at an upper side of the upper left sample position of the current block, and the neighboring block C may represent a neighboring block at a left side of the upper left sample position of the current block. In addition, the neighboring block D may represent a neighboring block at an upper side of a top-right sample position of the current block, and the neighboring block E may represent a neighboring block at an upper-right side of the top-right sample position of the current block. In addition, the neighboring block F may represent a neighboring block at the left side of the lower-left sample position of the current block, and the neighboring block G may represent a neighboring block at the lower-left side of the lower-left sample position of the current block.

For example, the 3 groups may include S0, S1, and S2, and S0, S1, and S2 may be derived as shown in the following table.

[ Table 6]

Herein, mvA may represent the motion vector of the neighboring block A, mvB may represent the motion vector of the neighboring block B, mvC may represent the motion vector of the neighboring block C, mvD may represent the motion vector of the neighboring block D, mvE may represent the motion vector of the neighboring block E, mvF may represent the motion vector of the neighboring block F, and mvG may represent the motion vector of the neighboring block G. S0 may represent the first group, S1 may represent the second group, and S2 may represent the third group.

The encoding apparatus/decoding apparatus may derive mv0 from S0, mv1 from S1, and mv2 from S2, and the encoding apparatus/decoding apparatus may derive an affine MVP candidate including mv0, mv1, and mv2. The affine MVP candidate may represent the constructed affine candidate. In addition, mv0 may be a CPMVP candidate for CP0, mv1 may be a CPMVP candidate for CP1, and mv2 may be a CPMVP candidate for CP2.

Herein, the reference picture for mv0 may be the same as the reference picture of the current block. More specifically, when the motion vectors within S0 are checked according to a specific order, mv0 may be the first motion vector that has been verified to satisfy the condition. The condition may be that the reference picture of the motion vector is the same as the reference picture of the current block.

In addition, the reference picture for mv1 may be the same as the reference picture of the current block. More specifically, when the motion vectors within S1 are checked according to a specific order, mv1 may be the first motion vector that has been verified to satisfy the condition. The condition may be that the reference picture of the motion vector is the same as the reference picture of the current block.

In addition, the reference picture for mv2 may be the same as the reference picture of the current block. More specifically, when the motion vectors within S2 are checked according to a specific order, mv2 may be the first motion vector that has been verified to satisfy the condition. The condition may be that the reference picture of the motion vector is the same as the reference picture of the current block.
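The selection of mv0, mv1, and mv2 above is the same first-match rule applied per group (a sketch representing each group as hypothetical (motion vector, reference POC) pairs in checking order):

```python
def first_matching_mv(group, cur_ref_poc):
    # group: (mv, ref_poc) pairs in checking order,
    # e.g., S0 = [(mvA, pocA), (mvB, pocB), (mvC, pocC)].
    for mv, ref_poc in group:
        if ref_poc == cur_ref_poc:
            return mv  # first motion vector satisfying the condition
    return None
```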

In addition, in the case where only mv0 and mv1 are derived, mv2 may be derived by using the following equation.

[ formula 8]

Herein, mv2^x represents the x component of mv2; mv2^y represents the y component of mv2; mv0^x represents the x component of mv0; mv0^y represents the y component of mv0; mv1^x represents the x component of mv1; and mv1^y represents the y component of mv1. In addition, w denotes the width of the current block, and h denotes the height of the current block.

In addition, in the case where only mv0 and mv2 are derived, mv1 may be derived by using the following equation.

[ formula 9]

Herein, mv1^x represents the x component of mv1; mv1^y represents the y component of mv1; mv0^x represents the x component of mv0; mv0^y represents the y component of mv0; mv2^x represents the x component of mv2; and mv2^y represents the y component of mv2. In addition, w denotes the width of the current block, and h denotes the height of the current block.
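Since the equation images for formulas 8 and 9 are not reproduced here, the following sketch assumes the standard 4-parameter affine relations between the three CPMVPs; the exact coefficients are an assumption:

```python
def derive_mv2(mv0, mv1, w, h):
    # Formula 8 (assumed form): bottom-left CPMVP mv2 from mv0 and mv1,
    # for a block of width w and height h.
    x = mv0[0] - (h / w) * (mv1[1] - mv0[1])
    y = mv0[1] + (h / w) * (mv1[0] - mv0[0])
    return (x, y)

def derive_mv1(mv0, mv2, w, h):
    # Formula 9 (assumed form): top-right CPMVP mv1 from mv0 and mv2.
    x = mv0[0] + (w / h) * (mv2[1] - mv0[1])
    y = mv0[1] - (w / h) * (mv2[0] - mv0[0])
    return (x, y)
```

Under this assumed model, the two relations are mutually consistent: deriving mv2 from (mv0, mv1) and then mv1 from (mv0, mv2) returns the original mv1.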

In addition, in case the number of available inherited affine candidates and/or available constructed affine candidates is less than 2, AMVP processing of the existing HEVC standard may be applied to the affine MVP list construction. More specifically, in the case where the number of available inherited affine candidates and/or available constructed affine candidates is less than 2, the process of configuring the MVP candidates of the existing HEVC standard may be performed.

Further, a flowchart of an embodiment for constructing the above affine MVP list is as follows.

Fig. 13 illustrates an example of constructing an affine MVP list.

Referring to fig. 13, the encoding apparatus/decoding apparatus may add the inherited candidates to the affine MVP list of the current block (S1300). The inherited candidates may represent the inherited affine candidates described above.

More specifically, the encoding device/decoding device may derive at most 2 inherited affine candidates from neighboring blocks of the current block (S1305). Herein, the neighboring blocks may include a left neighboring block A0 of the current block, a lower left neighboring block A1 of the current block, an upper neighboring block B0 of the current block, an upper right neighboring block B1 of the current block, and an upper left neighboring block B2 of the current block.

For example, the encoding device/decoding device may derive the first affine MVP candidate based on the first block within the neighboring block. Herein, the first block may be encoded by using an affine motion model, and the reference picture of the first block and the reference picture of the current block may be the same. More specifically, when neighboring blocks are checked in a particular order, the first block may be the first block verified as satisfying the condition. The condition may be that encoding is performed using an affine motion model and that a reference picture of the block is the same as a reference picture of the current block.

Thereafter, the encoding device/decoding device may derive a second affine MVP candidate based on a second block within the neighboring block. Herein, the second block may be encoded by using an affine motion model, and the reference picture of the second block and the reference picture of the current block may be the same. More specifically, when the neighboring blocks are checked in a specific order, the second block may be the second block verified to satisfy the condition. The condition may be that encoding is performed using an affine motion model and that a reference picture of the block is the same as a reference picture of the current block.

Further, the specific order may be left neighboring block A0 → lower-left neighboring block A1 → upper neighboring block B0 → upper-right neighboring block B1 → upper-left neighboring block B2. In addition, the check may be performed in an order other than the above-described order, and is not limited to the above-described example.

The encoding apparatus/decoding apparatus may add the constructed candidate to the affine MVP list of the current block (S1310). The constructed candidates may represent the constructed affine candidates described above. When the number of available inherited candidates is less than 2, the encoding apparatus/decoding apparatus may add the constructed candidate to the affine MVP list of the current block.

The encoding apparatus/decoding apparatus may derive mv0 from the first group, mv1 from the second group, and mv2 from the third group, and the encoding apparatus/decoding apparatus may derive a constructed affine candidate including mv0, mv1, and mv2. mv0 may be a CPMVP candidate for CP0, mv1 may be a CPMVP candidate for CP1, and mv2 may be a CPMVP candidate for CP2.

Herein, the reference picture for mv0 may be the same as the reference picture of the current block. More specifically, when the motion vectors within the first group are checked according to a specific order, mv0 may be the first motion vector that has been verified to satisfy the condition. The condition may be that the reference picture of the motion vector is the same as the reference picture of the current block. In addition, the reference picture for mv1 may be the same as the reference picture of the current block. More specifically, when the motion vectors within the second group are checked according to a specific order, mv1 may be the first motion vector that has been verified to satisfy the condition. The condition may be that the reference picture of the motion vector is the same as the reference picture of the current block. In addition, the reference picture for mv2 may be the same as the reference picture of the current block. More specifically, when the motion vectors within the third group are checked according to a specific order, mv2 may be the first motion vector that has been verified to satisfy the condition. The condition may be that the reference picture of the motion vector is the same as the reference picture of the current block.

In addition, the first group may include motion vectors of the neighboring block a, motion vectors of the neighboring block B, and motion vectors of the neighboring block C. The second group may include motion vectors of the neighboring block D and motion vectors of the neighboring block E. The third group may include motion vectors of the neighboring block F and motion vectors of the neighboring block G. The neighboring block a may represent a neighboring block at an upper left side of an upper left sample position of the current block, the neighboring block B may represent a neighboring block at an upper side of the upper left sample position of the current block, and the neighboring block C may represent a neighboring block at a left side of the upper left sample position of the current block. In addition, the neighboring block D may represent a neighboring block at an upper side of a top-right sample position of the current block, and the neighboring block E may represent a neighboring block at an upper-right side of the top-right sample position of the current block. In addition, the neighboring block F may represent a neighboring block at the left side of the lower-left sample position of the current block, and the neighboring block G may represent a neighboring block at the lower-left side of the lower-left sample position of the current block.

The encoding device/decoding device may add the HEVC AMVP candidate to the affine MVP list of the current block (S1320). When the number of inherited candidates and/or constructed candidates available is less than 2, the encoding/decoding device may add HEVC AMVP candidates to the affine MVP list of the current block. More specifically, when the number of available inherited candidates and/or constructed candidates is less than 2, the encoding device/decoding device may perform a process of configuring MVP candidates of the existing HEVC standard.
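The flow of fig. 13 (S1300 → S1310 → S1320) can be sketched as filling a 2-entry list from the three candidate sources in order (a sketch; the candidate lists themselves are assumed to be derived as described above):

```python
def build_affine_mvp_list(inherited, constructed, hevc_amvp, list_size=2):
    mvp_list = list(inherited[:list_size])        # S1300: inherited candidates
    if len(mvp_list) < list_size:                 # S1310: constructed candidates
        mvp_list += constructed[:list_size - len(mvp_list)]
    if len(mvp_list) < list_size:                 # S1320: HEVC AMVP candidates
        mvp_list += hevc_amvp[:list_size - len(mvp_list)]
    return mvp_list
```

Each fallback stage is reached only while the list still holds fewer than 2 candidates, matching the "less than 2" checks in the text.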

Fig. 14 illustrates an example of constructing an affine MVP list.

Referring to fig. 14, the encoding apparatus/decoding apparatus may add the inherited candidates to the affine MVP list of the current block (S1400). The inherited candidates may represent the inherited affine candidates described above.

More specifically, the encoding apparatus/decoding apparatus may derive a first affine MVP candidate of the current block from a left block group including a left adjacent block and a lower-left adjacent block (S1405), and the encoding apparatus/decoding apparatus may derive a second affine MVP candidate of the current block from an upper block group including a upper-right adjacent block, an upper adjacent block, and an upper-left adjacent block (S1410).

Herein, the first affine MVP candidate may be derived based on a first block within the left block group, the first block may be encoded by using an affine motion model, and a reference picture of the first block may be the same as a reference picture of the current block. More specifically, when neighboring blocks are checked in a particular order, the first block may be the first block verified as satisfying the condition. The condition may be that encoding is performed using an affine motion model and that a reference picture of the block is the same as a reference picture of the current block.

In addition, a second affine MVP candidate may be derived based on a second block within the upper block group, the second block may be encoded by using an affine motion model, and the reference picture of the second block and the reference picture of the current block may be the same. More specifically, when the neighboring blocks are checked in a specific order, the second block may be the second block verified to satisfy the condition. The condition may be that encoding is performed using an affine motion model and that a reference picture of the block is the same as a reference picture of the current block.

Further, the specific order in which the left block group is checked may be the order of the left neighboring block and then the lower-left neighboring block. Alternatively, it may be the order of the lower-left neighboring block and then the left neighboring block. In addition, the specific order in which the upper block group is checked may be the order of the upper-right neighboring block, the upper neighboring block, and then the upper-left neighboring block. Alternatively, it may be the order of the upper neighboring block, the upper-right neighboring block, and then the upper-left neighboring block. In addition, the check may be performed in an order other than the above-described orders, and is not limited to the above-described examples.

The encoding apparatus/decoding apparatus may add the constructed candidate to the affine MVP list of the current block (S1420). The constructed candidates may represent the constructed affine candidates described above. When the number of available inherited candidates is less than 2, the encoding apparatus/decoding apparatus may add the constructed candidate to the affine MVP list of the current block.

The encoding apparatus/decoding apparatus may derive mv0 from the first group, derive mv1 from the second group, and derive mv2 from the third group, and the encoding apparatus/decoding apparatus may derive a constructed affine candidate including mv0, mv1, and mv2. mv0 may be a CPMVP candidate for CP0, mv1 may be a CPMVP candidate for CP1, and mv2 may be a CPMVP candidate for CP2.

Herein, the reference picture for mv0 may be the same as the reference picture of the current block. More specifically, when the motion vectors within the first group are checked according to a specific order, mv0 may be the first motion vector verified to satisfy the condition. The condition may be that the reference picture of the motion vector is the same as the reference picture of the current block. In addition, the reference picture for mv1 may be the same as the reference picture of the current block. More specifically, when the motion vectors within the second group are checked according to a specific order, mv1 may be the first motion vector verified to satisfy the condition. The condition may be that the reference picture of the motion vector is the same as the reference picture of the current block. In addition, the reference picture for mv2 may be the same as the reference picture of the current block. More specifically, when the motion vectors within the third group are checked according to a specific order, mv2 may be the first motion vector verified to satisfy the condition. The condition may be that the reference picture of the motion vector is the same as the reference picture of the current block.

In addition, the first group may include motion vectors of the neighboring block A, motion vectors of the neighboring block B, and motion vectors of the neighboring block C. The second group may include motion vectors of the neighboring block D and motion vectors of the neighboring block E. The third group may include motion vectors of the neighboring block F and motion vectors of the neighboring block G. The neighboring block A may represent a neighboring block at an upper left side of an upper left sample position of the current block, the neighboring block B may represent a neighboring block at an upper side of the upper left sample position of the current block, and the neighboring block C may represent a neighboring block at a left side of the upper left sample position of the current block. In addition, the neighboring block D may represent a neighboring block at an upper side of a top-right sample position of the current block, and the neighboring block E may represent a neighboring block at an upper right side of the top-right sample position of the current block. In addition, the neighboring block F may represent a neighboring block at the left side of the lower-left sample position of the current block, and the neighboring block G may represent a neighboring block at the lower left side of the lower-left sample position of the current block.
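The per-group selection described above can be sketched as follows: for each of the three groups, take the first motion vector whose reference picture matches the current block's. The tuple-based group layout is an assumption for illustration.

```python
def first_matching_mv(group, current_ref_pic):
    """Return the first motion vector whose reference picture matches."""
    for mv, ref_pic in group:
        if ref_pic == current_ref_pic:
            return mv
    return None

def constructed_candidate(first_group, second_group, third_group, cur_ref):
    mv0 = first_matching_mv(first_group, cur_ref)   # CPMVP candidate for CP0
    mv1 = first_matching_mv(second_group, cur_ref)  # CPMVP candidate for CP1
    mv2 = first_matching_mv(third_group, cur_ref)   # CPMVP candidate for CP2
    return mv0, mv1, mv2

# Groups as (motion vector, reference picture index) pairs, e.g. the first
# group holds the motion vectors of neighboring blocks A, B, C in check order.
g1 = [((1, 1), 1), ((2, 2), 0)]   # A's reference differs, so B's mv is taken
g2 = [((3, 3), 0)]
g3 = [((5, 5), 2)]                # no match: the CP2 candidate stays None here
print(constructed_candidate(g1, g2, g3, cur_ref=0))  # ((2, 2), (3, 3), None)
```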

The encoding/decoding apparatus may add the HEVC AMVP candidate to the affine MVP list of the current block (S1430). When the number of inherited candidates and/or constructed candidates available is less than 2, the encoding/decoding device may add HEVC AMVP candidates to the affine MVP list of the current block. More specifically, in the case where the number of available inherited candidates and/or constructed candidates is less than 2, the encoding device/decoding device may perform a process of configuring MVP candidates of the existing HEVC standard.
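The overall list construction described above (e.g., steps S1420 and S1430) fills the affine MVP list from three pools in priority order until two candidates are present. A minimal sketch, assuming the candidate pools have already been derived:

```python
def build_affine_mvp_list(inherited, constructed, hevc_amvp, max_cands=2):
    """Fill the affine MVP list: inherited candidates first, then constructed
    candidates, then HEVC-style AMVP candidates, until max_cands entries."""
    mvp_list = []
    for pool in (inherited, constructed, hevc_amvp):
        for cand in pool:
            if len(mvp_list) == max_cands:
                return mvp_list
            mvp_list.append(cand)
    return mvp_list

# One inherited candidate available: a constructed candidate fills slot 2,
# and the HEVC AMVP fallback is never reached.
print(build_affine_mvp_list(["inh"], ["con"], ["amvp"]))  # ['inh', 'con']
```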

Fig. 15 illustrates an example of constructing an affine MVP list.

Referring to fig. 15, the encoding apparatus/decoding apparatus may add the inherited candidates to the affine MVP list of the current block (S1500). The inherited candidates may represent the inherited affine candidates described above.

More specifically, the encoding apparatus/decoding apparatus may derive at most one inherited affine candidate from the neighboring blocks of the current block (S1505). Herein, the neighboring blocks may include the left neighboring block A0 of the current block, the lower left neighboring block A1 of the current block, the upper neighboring block B0 of the current block, the upper right neighboring block B1 of the current block, and the upper left neighboring block B2 of the current block.

For example, the encoding device/decoding device may derive the first affine MVP candidate based on the first block within the neighboring block. Herein, the first block may be encoded by using an affine motion model, and the reference picture of the first block and the reference picture of the current block may be the same. More specifically, when neighboring blocks are checked in a particular order, the first block may be the first block verified as satisfying the condition. The condition may be that encoding is performed using an affine motion model and that a reference picture of the block is the same as a reference picture of the current block.

Further, the specific order may be left neighboring block A0 → lower left neighboring block A1 → upper neighboring block B0 → upper right neighboring block B1 → upper left neighboring block B2. In addition, the check may be performed in an order other than the above-described order, and the check may not be limited to the above-described example.

The encoding apparatus/decoding apparatus may add the constructed candidate to the affine MVP list of the current block (S1510). The constructed candidates may represent the constructed affine candidates described above. In the case where the number of available inherited candidates is less than 2, the encoding apparatus/decoding apparatus may add the constructed candidate to the affine MVP list of the current block.

The encoding apparatus/decoding apparatus may derive mv0 from the first group, derive mv1 from the second group, and derive mv2 from the third group, and the encoding apparatus/decoding apparatus may derive a constructed affine candidate including mv0, mv1, and mv2. mv0 may be a CPMVP candidate for CP0, mv1 may be a CPMVP candidate for CP1, and mv2 may be a CPMVP candidate for CP2.

Herein, the reference picture for mv0 may be the same as the reference picture of the current block. More specifically, when the motion vectors within the first group are checked according to a specific order, mv0 may be the first motion vector verified to satisfy the condition. The condition may be that the reference picture of the motion vector is the same as the reference picture of the current block. In addition, the reference picture for mv1 may be the same as the reference picture of the current block. More specifically, when the motion vectors within the second group are checked according to a specific order, mv1 may be the first motion vector verified to satisfy the condition. The condition may be that the reference picture of the motion vector is the same as the reference picture of the current block. In addition, the reference picture for mv2 may be the same as the reference picture of the current block. More specifically, when the motion vectors within the third group are checked according to a specific order, mv2 may be the first motion vector verified to satisfy the condition. The condition may be that the reference picture of the motion vector is the same as the reference picture of the current block.

In addition, the first group may include motion vectors of the neighboring block A, motion vectors of the neighboring block B, and motion vectors of the neighboring block C. The second group may include motion vectors of the neighboring block D and motion vectors of the neighboring block E. The third group may include motion vectors of the neighboring block F and motion vectors of the neighboring block G. The neighboring block A may represent a neighboring block at an upper left side of an upper left sample position of the current block, the neighboring block B may represent a neighboring block at an upper side of the upper left sample position of the current block, and the neighboring block C may represent a neighboring block at a left side of the upper left sample position of the current block. In addition, the neighboring block D may represent a neighboring block at an upper side of a top-right sample position of the current block, and the neighboring block E may represent a neighboring block at an upper right side of the top-right sample position of the current block. In addition, the neighboring block F may represent a neighboring block at the left side of the lower-left sample position of the current block, and the neighboring block G may represent a neighboring block at the lower left side of the lower-left sample position of the current block.

The encoding device/decoding device may add the HEVC AMVP candidate to the affine MVP list of the current block (S1520). In the case where the number of inherited candidates and/or constructed candidates available is less than 2, the encoding/decoding device may add the HEVC AMVP candidate to the affine MVP list of the current block. More specifically, in the case where the number of available inherited candidates and/or constructed candidates is less than 2, the encoding device/decoding device may perform a process of configuring MVP candidates of the existing HEVC standard.

The embodiments disclosed in the above flowcharts differ in the process of deriving inherited affine candidates. Therefore, comparison between the above embodiments can be made by performing complexity analysis on the process of deriving inherited affine candidates.

A complexity analysis of the above-described embodiments can be derived as shown in the table below.

[Table 7]

In addition, the coding performance of the above-described embodiments can be derived as shown in the table below.

[Table 8]

Referring to table 7, it can be verified that the embodiment of fig. 14 and the embodiment of fig. 15 are less complex than the embodiment of fig. 13. In addition, referring to table 8, it can be verified that the encoding performance of the embodiment of fig. 13, the embodiment of fig. 14, and the embodiment of fig. 15 is almost the same. Therefore, the embodiment of fig. 14 and the embodiment of fig. 15, which have lower complexity and the same encoding performance, may be preferred for the encoding/decoding process.

Fig. 16 illustrates a general view of an image encoding method performed by an encoding apparatus according to the present disclosure. The method set forth in fig. 16 may be performed by the encoding apparatus disclosed in fig. 1. More specifically, for example, steps S1600 to S1630 of fig. 16 may be performed by a predictor of the encoding apparatus, and step S1640 may be performed by an entropy encoder of the encoding apparatus. In addition, although not shown in the drawings, the process of deriving the prediction samples of the current block based on the CPMV may be performed by the predictor of the encoding apparatus. The process of deriving residual samples of the current block based on the prediction samples and the original samples of the current block may be performed by a subtractor of the encoding apparatus. The process of generating information related to the residual of the current block based on the residual samples may be performed by a transformer of the encoding apparatus. Also, the process of encoding the information related to the residual may be performed by the entropy encoder of the encoding apparatus.

The encoding apparatus constructs an affine Motion Vector Predictor (MVP) candidate list including MVP candidates for the current block (S1600).

For example, the affine MVP candidates may include a first affine MVP candidate and a second affine MVP candidate.

The first affine MVP candidate may be derived based on a first block within a group of left blocks including a lower left corner neighboring block and a left neighboring block of the current block. Herein, the first block may be encoded by using an affine motion model, and the reference picture of the first block and the reference picture of the current block may be the same.

More specifically, the first block may be the first block that has been verified to satisfy the condition when the adjacent blocks within the left block group are checked according to the specific order. The condition may be that encoding is performed using an affine motion model and that a reference picture of the block is the same as a reference picture of the current block. For example, the encoding device may check whether blocks within the left block group satisfy the condition according to a certain order, then the encoding device may derive a first block that satisfies the condition, and may derive a first affine MVP candidate based on this first block.

More specifically, for example, the encoding apparatus may derive a motion vector of the CP of the current block based on an affine motion model of the first block, and may derive a first affine MVP candidate including the motion vector as the CPMVP candidate. The affine motion model can be derived as shown in equation 1 or equation 3 presented above.
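Equations 1 and 3 are not reproduced in this excerpt; as an assumption, the sketch below uses the common 4-parameter affine form, in which the motion vector at sample position (x, y) is interpolated from the CPMVs v0 (top-left) and v1 (top-right) of a block of width w. Deriving the inherited candidate's CPMVP at each CP of the current block amounts to evaluating this model at the CP positions.

```python
def affine_mv(v0, v1, w, x, y):
    """Evaluate a 4-parameter affine motion model at sample (x, y), given the
    CPMVs v0 (top-left) and v1 (top-right) of a block of width w."""
    ax = (v1[0] - v0[0]) / w   # horizontal gradient of the MV field
    ay = (v1[1] - v0[1]) / w   # rotation/scale cross term
    return (ax * x - ay * y + v0[0],
            ay * x + ax * y + v0[1])

# With identical CPMVs the model degenerates to pure translation.
print(affine_mv((4, 2), (4, 2), w=16, x=8, y=8))  # (4.0, 2.0)
```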

Further, the specific order in which the blocks within the left block group are checked may be the order of the left neighboring block followed by the lower left neighboring block. Alternatively, the specific order in which the left block group is checked may be the order of the lower left neighboring block followed by the left neighboring block.

The second affine MVP candidate may be derived based on a second block within an upper block group including an upper right neighboring block, an upper neighboring block, and an upper left neighboring block of the current block. Herein, the second block may be encoded by using an affine motion model, and the reference picture of the second block and the reference picture of the current block may be the same.

More specifically, the second block may be the first block that has been verified to satisfy the condition when the neighboring blocks within the upper block group are checked according to the specific order. The condition may be that encoding is performed using an affine motion model and that a reference picture of the block is the same as a reference picture of the current block. For example, the encoding device may check whether the blocks within the upper block group satisfy the condition according to a specific order, then the encoding device may derive, as the second block, the first block that satisfies the condition, and may derive the second affine MVP candidate based on the second block.

More specifically, for example, the encoding apparatus may derive a motion vector of the CP of the current block based on an affine motion model of the second block, and may derive a second affine MVP candidate including the motion vector as the CPMVP candidate. The affine motion model can be derived as shown in equation 1 or equation 3 presented above.

Further, the specific order of checking the blocks within the upper block group may be the order of the upper neighboring block, the upper right neighboring block, and the upper left neighboring block.

In addition, as another example, the affine MVP candidate may include a first affine MVP candidate derived as described below.

The first affine MVP candidate may be derived based on a first block within a neighboring block of the current block. Herein, the first block may be encoded by using an affine motion model, and the reference picture of the first block and the reference picture of the current block may be the same. In addition, the neighboring blocks may include a left neighboring block, a lower left neighboring block, an upper right neighboring block, an upper neighboring block, and an upper left neighboring block of the current block.

The first block may be the first block that has been verified to satisfy the condition when the neighboring blocks are checked according to the specific order. The condition may be that encoding is performed using an affine motion model and that a reference picture of the block is the same as a reference picture of the current block. For example, the encoding device may check whether neighboring blocks satisfy the condition according to a certain order, then the encoding device may derive a first block that satisfies the condition, and may derive a first affine MVP candidate based on the first block.

More specifically, for example, the encoding apparatus may derive a motion vector of the CP of the current block based on an affine motion model of the first block, and may derive a first affine MVP candidate including the motion vector as the CPMVP candidate. The affine motion model can be derived as shown in equation 1 or equation 3 presented above.

Further, the specific order of checking the neighboring blocks may be the order of the left neighboring block, the lower left neighboring block, the upper neighboring block, the upper right neighboring block, and the upper left neighboring block.

Herein, for example, in the case where the size of the current block is WxH, and in the case where the x-component of the upper-left sample position of the current block is 0 and the y-component is 0, the left neighboring block may be a block including samples of coordinates (-1, H-1), the upper neighboring block may be a block including samples of coordinates (W-1, -1), the upper-right neighboring block may be a block including samples of coordinates (W, -1), the lower-left neighboring block may be a block including samples of coordinates (-1, H), and the upper-left neighboring block may be a block including samples of coordinates (-1, -1).
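The coordinate conventions above can be captured in a small helper; the function name and string keys are hypothetical, and the coordinates simply restate the listing for a WxH current block whose top-left sample sits at (0, 0).

```python
def neighbor_sample(name, w, h):
    """Sample coordinates identifying each neighboring block of a WxH current
    block whose top-left sample is at (0, 0)."""
    return {
        "left":        (-1, h - 1),
        "upper":       (w - 1, -1),
        "upper_right": (w, -1),
        "lower_left":  (-1, h),
        "upper_left":  (-1, -1),
    }[name]

print(neighbor_sample("upper_right", 16, 8))  # (16, -1)
```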

Further, in the case where the first affine MVP candidate and/or the second affine MVP candidate are not derived, that is, in the case where less than 2 affine MVP candidates are derived by performing the above-described processing, the affine MVP candidates may include the constructed affine MVP candidates.

In other words, for example, in the case where the first affine MVP candidate and/or the second affine MVP candidate are not derived, that is, in the case where less than 2 affine MVP candidates are derived by performing the above-described processing, the encoding apparatus may derive the constructed affine MVP candidates based on the neighboring blocks.

More specifically, the encoding apparatus may divide the motion vectors of the neighboring blocks into a first group, a second group, and a third group. The first group may include motion vectors of the neighboring block A, motion vectors of the neighboring block B, and motion vectors of the neighboring block C. The second group may include motion vectors of the neighboring block D and motion vectors of the neighboring block E. And, the third group may include motion vectors of the neighboring block F and motion vectors of the neighboring block G.

In addition, the encoding apparatus may derive a CPMVP candidate of the CP0 of the current block from the first group, derive a CPMVP candidate of the CP1 of the current block from the second group, and derive a CPMVP candidate of the CP2 of the current block from the third group. Also, the encoding apparatus may derive an affine MVP candidate including the construction of the CPMVP candidate of the CP.

The CPMVP candidate of CP0 may be the first motion vector that has been verified to satisfy a condition when motion vectors within the first group are checked according to a specific order. The condition may be that the reference picture of the motion vector is the same as the reference picture of the current block. In other words, the CPMVP candidate of the CP0 may be the motion vector that is first verified, when the motion vectors within the first group are checked according to a specific order, to have a reference picture that is the same as the reference picture of the current block. For example, the encoding apparatus may check the motion vectors within the first group according to a specific order to verify whether the motion vectors satisfy the condition, and the first motion vector verified to satisfy the condition may be derived as the CPMVP candidate of the CP0. Herein, for example, the specific order may be the order of the neighboring block A, the neighboring block B, and the neighboring block C.

The CPMVP candidate of CP1 may be the first motion vector that has been verified to satisfy the condition when the motion vectors within the second group are examined according to a particular order. The condition may be that the reference picture of the motion vector is the same as the reference picture of the current block. In other words, the CPMVP candidate of the CP1 may be a motion vector having a reference picture of which the first is verified to be the same as the reference picture of the current block when motion vectors within the second group are checked according to a specific order. For example, the encoding apparatus may check the motion vectors within the second group according to a specific order to verify whether the motion vectors satisfy the condition, and the first motion vector verified to satisfy the condition may be derived as a CPMVP candidate of the CP 1. Herein, for example, the specific order may be an order starting from the neighboring block D until the neighboring block E.

The CPMVP candidate of CP2 may be the first motion vector that has been verified to satisfy the condition when the motion vectors within the third group are checked according to a specific order. The condition may be that the reference picture of the motion vector is the same as the reference picture of the current block. In other words, the CPMVP candidate of the CP2 may be a motion vector having a reference picture that is first verified to be the same as a reference picture of the current block when motion vectors within the third group are checked according to a specific order. For example, the encoding apparatus may check the motion vectors within the third group according to a specific order to verify whether the motion vectors satisfy the condition, and the first motion vector verified to satisfy the condition may be derived as a CPMVP candidate of the CP 2. Herein, for example, the specific order may be an order starting from the neighboring block F until the neighboring block G.

Herein, the neighboring block A may represent a neighboring block at an upper left side of an upper left sample position of the current block, the neighboring block B may represent a neighboring block at an upper side of the upper left sample position of the current block, and the neighboring block C may represent a neighboring block at a left side of the upper left sample position of the current block. The neighboring block D may represent a neighboring block at an upper side of an upper right sample position of the current block, and the neighboring block E may represent a neighboring block at an upper right side of the upper right sample position of the current block. The neighboring block F may represent a neighboring block at a left side of a lower left sample position of the current block, and the neighboring block G may represent a neighboring block at a lower left side of the lower left sample position of the current block.

Herein, for example, in case that the size of the current block is W × H, and in case that the x-component of the top-left sample position of the current block is 0 and the y-component is 0, the neighboring block A may be a block including samples of coordinates (-1, -1), the neighboring block B may be a block including samples of coordinates (0, -1), the neighboring block C may be a block including samples of coordinates (-1, 0), the neighboring block D may be a block including samples of coordinates (W-1, -1), the neighboring block E may be a block including samples of coordinates (W, -1), the neighboring block F may be a block including samples of coordinates (-1, H-1), and the neighboring block G may be a block including samples of coordinates (-1, H).

Also, in the case where the CPMVP candidate of the CP0 of the current block is derived from the first group and the CPMVP candidate of the CP1 of the current block is derived from the second group, but the CPMVP candidate of the CP2 of the current block is not derived from the third group, the CPMVP candidate of the CP2 may be derived based on the CPMVP candidate of the CP0 and the CPMVP candidate of the CP1 by using equation 8 described above.

In addition, in the case where the CPMVP candidate of the CP0 of the current block is derived from the first group and the CPMVP candidate of the CP2 of the current block is derived from the third group, but the CPMVP candidate of the CP1 of the current block is not derived from the second group, the CPMVP candidate of the CP1 may be derived based on the CPMVP candidate of the CP0 and the CPMVP candidate of the CP2 by using equation 9 described above.
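Equations 8 and 9 are not reproduced in this excerpt; as an assumption, the sketch below uses the standard 4-parameter affine relation between the three corner CPMVs of a WxH block to illustrate what deriving the missing CPMVP from the two available ones may look like.

```python
def derive_cp2(mv0, mv1, w, h):
    """Derive the bottom-left CPMVP from the top-left (mv0) and top-right
    (mv1) CPMVPs under a 4-parameter affine model (stand-in for equation 8)."""
    return (mv0[0] - (mv1[1] - mv0[1]) * h / w,
            mv0[1] + (mv1[0] - mv0[0]) * h / w)

def derive_cp1(mv0, mv2, w, h):
    """Derive the top-right CPMVP from the top-left (mv0) and bottom-left
    (mv2) CPMVPs (stand-in for equation 9)."""
    return (mv0[0] + (mv2[1] - mv0[1]) * w / h,
            mv0[1] - (mv2[0] - mv0[0]) * w / h)

# The two derivations are mutually consistent: deriving CP2 from (mv0, mv1)
# and then CP1 back from (mv0, CP2) recovers the original mv1.
mv0, mv1 = (2, 3), (6, 5)
mv2 = derive_cp2(mv0, mv1, w=16, h=8)
print(derive_cp1(mv0, mv2, w=16, h=8))  # (6.0, 5.0)
```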

Furthermore, in the case where less than 2 affine MVP candidates are derived by performing the above-described processing, the affine MVP candidates may include MVP candidates of the existing HEVC standard.

In other words, for example, in the case where less than 2 affine MVP candidates are derived by performing the above-described processing, the encoding device may derive MVP candidates of the existing HEVC standard.

The encoding apparatus derives a Control Point Motion Vector Predictor (CPMVP) of a Control Point (CP) of the current block based on one affine MVP candidate among affine MVP candidates included in the affine MVP candidate list (S1610). The encoding apparatus may derive the CPMV of the CP of the current block having the best RD cost, and may select an affine MVP candidate closest to the CPMV among the affine MVP candidates as the affine MVP candidate of the current block. Based on the selected affine MVP candidate selected from the affine MVP candidate list, the encoding device may derive a Control Point Motion Vector Predictor (CPMVP) for the Control Point (CP) of the current block.

The encoding apparatus may encode an affine MVP candidate index indicating an affine MVP candidate selected from among affine MVP candidates. The affine MVP candidate index may indicate one affine Motion Vector Predictor (MVP) candidate selected from affine MVP candidates included in an MVP candidate list of the current block.

The encoding apparatus derives the CPMV of the CP of the current block (S1620). The encoding device may derive the CPMV of each of the CPs of the current block.

The encoding apparatus derives a Control Point Motion Vector Difference (CPMVD) of the CP of the current block based on the CPMVP and the CPMV (S1630). The encoding apparatus may derive the CPMVD of each of the CPs of the current block based on the CPMVP and the CPMV of each of the CPs.

The encoding apparatus encodes motion prediction information including information on the CPMVD (S1640). The encoding apparatus may output motion prediction information including information on the CPMVD in a bitstream format. In other words, the encoding apparatus may output image (or video) information including motion prediction information in a bitstream format. The encoding apparatus may encode information regarding the CPMVD of each of the CPs, and the motion prediction information may include information regarding the CPMVD.
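The encoder-side flow above (derive CPMVs, subtract the CPMVPs, signal the per-CP differences) reduces to CPMVD = CPMV - CPMVP for each control point. A minimal sketch, with tuple layouts as assumptions:

```python
def derive_cpmvds(cpmvs, cpmvps):
    """Per-CP motion vector differences: CPMVD = CPMV - CPMVP."""
    return [(mv[0] - p[0], mv[1] - p[1]) for mv, p in zip(cpmvs, cpmvps)]

cpmvs  = [(5, 3), (9, 3), (5, 7)]   # best CPMVs found by the encoder
cpmvps = [(4, 2), (8, 4), (6, 6)]   # predictors from the selected candidate
print(derive_cpmvds(cpmvs, cpmvps))  # [(1, 1), (1, -1), (-1, 1)]
```

The decoder mirrors this by adding each signaled CPMVD back to the corresponding CPMVP to reconstruct the CPMVs.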

In addition, the motion prediction information may include an affine MVP candidate index. The affine MVP candidate index may indicate a selected affine Motion Vector Predictor (MVP) candidate selected from among affine MVP candidates included in an MVP candidate list of the current block.

Also, for example, the encoding apparatus may derive the prediction samples of the current block based on the CPMV, and may derive the residual samples of the current block based on the prediction samples and the original samples of the current block. Then, the encoding apparatus may generate information on a residual of the current block, which may be generated based on the residual samples, and may encode the information on the residual. The image information may include information about the residual.

Further, the bitstream may be transmitted to the decoding apparatus through a network or a (digital) storage medium. Herein, the network may include a broadcasting network and/or a communication network, etc., and the digital storage medium may include various storage media such as USB, SD, CD, DVD, blu-ray, HDD, SSD, etc.

Fig. 17 illustrates a general view of an encoding apparatus performing an image encoding method according to the present disclosure. The method disclosed in fig. 16 may be performed by the encoding device disclosed in fig. 17. More specifically, for example, the predictor of the encoding apparatus of fig. 17 may perform steps S1600 to S1630 of fig. 16, and the entropy encoder of the encoding apparatus of fig. 17 may perform step S1640 of fig. 16. In addition, although not shown in the drawings, the process of deriving the predicted samples of the current block based on the CPMV may be performed by a predictor of the encoding apparatus of fig. 17. The process of deriving residual samples of the current block based on the predicted samples and the original samples of the current block may be performed by a subtractor of the encoding apparatus of fig. 17. The process of generating information related to the residue of the current block based on the residue samples may be performed by a transformer of the encoding apparatus of fig. 17. Also, the process of encoding the information related to the residual may be performed by an entropy encoder of the encoding apparatus of fig. 17.

Fig. 18 illustrates a general view of an image decoding method performed by a decoding apparatus according to the present disclosure. The method set forth in fig. 18 may be performed by the decoding apparatus disclosed in fig. 2. More specifically, for example, steps S1810 to S1850 of fig. 18 may be performed by a predictor of the decoding apparatus, and step S1860 may be performed by an adder of the decoding apparatus. In addition, although not shown in the drawings, the process of obtaining information regarding the residual of the current block through the bitstream may be performed by an entropy decoder of the decoding apparatus, and the process of deriving residual samples of the current block based on the residual information may be performed by an inverse transformer of the decoding apparatus.

The decoding apparatus may obtain motion prediction information from the bitstream (S1800). The decoding apparatus may obtain image (or video) information including motion prediction information from a bitstream.

In addition, for example, the motion prediction information may include information on a Control Point Motion Vector Difference (CPMVD) of a Control Point (CP) of the current block. In other words, the motion prediction information may include information on the CPMVD of each of the CPs of the current block.

In addition, for example, the motion prediction information may include an affine MVP candidate index of the current block. The affine MVP candidate index may indicate one of affine Motion Vector Predictor (MVP) candidates included in an MVP candidate list of the current block.

The decoding apparatus constructs an affine Motion Vector Predictor (MVP) candidate list including MVP candidates for the current block (S1810).

For example, the affine MVP candidates may include a first affine MVP candidate and a second affine MVP candidate.

The first affine MVP candidate may be derived based on a first block within a group of left blocks including a lower left corner neighboring block and a left neighboring block of the current block. Herein, the first block may be encoded by using an affine motion model, and the reference picture of the first block and the reference picture of the current block may be the same.

More specifically, the first block may be the block that is first verified to satisfy the condition when the neighboring blocks within the left block group are checked according to the specific order. The condition may be that the block is coded using an affine motion model and that the reference picture of the block is the same as the reference picture of the current block. For example, the decoding apparatus may check whether the blocks within the left block group satisfy the condition according to the specific order, may then derive the first block satisfying the condition, and may derive the first affine MVP candidate based on this first block.

More specifically, for example, the decoding apparatus may derive a motion vector of the CP of the current block based on an affine motion model of the first block, and may derive a first affine MVP candidate including the motion vector as the CPMVP candidate. The affine motion model can be derived as shown in equation 1 or equation 3 presented above.

Further, the specific order in which the blocks within the left block group are checked may be an order starting from the left neighboring block to the lower-left neighboring block. Alternatively, the specific order in which the left block group is checked may be an order starting from the lower-left neighboring block to the left neighboring block.
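The candidate scan described above can be expressed as a short illustrative sketch. This is not part of the disclosed method: the dictionary representation of a block, the field names, and the helper function are assumptions made only for illustration.

```python
def find_first_valid(blocks, current_ref_pic):
    """Return the first block in scan order that is coded with an affine
    motion model and whose reference picture matches that of the current
    block, or None if no such block exists."""
    for block in blocks:
        if block["affine"] and block["ref_pic"] == current_ref_pic:
            return block
    return None

# Left block group checked from the left neighboring block toward the
# lower-left neighboring block (one of the two orders mentioned above).
left_group = [
    {"name": "left", "affine": False, "ref_pic": 0},
    {"name": "bottom_left", "affine": True, "ref_pic": 0},
]
first = find_first_valid(left_group, current_ref_pic=0)
```

In this toy input the left neighboring block is not affine-coded, so the scan falls through to the lower-left neighboring block, which becomes the basis of the first affine MVP candidate. The same scan, with a different block list and order, applies to the upper block group.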

The second affine MVP candidate may be derived based on a second block within an upper block group including an upper right neighboring block, an upper neighboring block, and an upper left neighboring block of the current block. Herein, the second block may be encoded by using an affine motion model, and the reference picture of the second block and the reference picture of the current block may be the same.

More specifically, the second block may be the block that is first verified to satisfy the condition when the neighboring blocks within the upper block group are checked according to the specific order. The condition may be that the block is coded using an affine motion model and that the reference picture of the block is the same as the reference picture of the current block. For example, the decoding apparatus may check whether the blocks within the upper block group satisfy the condition according to the specific order, may then derive the first block satisfying the condition as the second block, and may derive the second affine MVP candidate based on the second block.

More specifically, for example, the decoding apparatus may derive a motion vector of the CP of the current block based on an affine motion model of the second block, and may derive a second affine MVP candidate including the motion vector as the CPMVP candidate. The affine motion model can be derived as shown in equation 1 or equation 3 presented above.

Further, the specific order of checking the blocks within the upper block group may be an order starting from the upper neighboring block up to the upper right neighboring block and the upper left neighboring block.

In addition, as another example, the affine MVP candidate may include a first affine MVP candidate derived as described below.

The first affine MVP candidate may be derived based on a first block within a neighboring block of the current block. Herein, the first block may be encoded by using an affine motion model, and the reference picture of the first block and the reference picture of the current block may be the same. In addition, the neighboring blocks may include a left neighboring block, a lower left neighboring block, an upper right neighboring block, an upper neighboring block, and an upper left neighboring block of the current block.

The first block may be the first block that has been verified to satisfy the condition when the neighboring blocks are checked according to the specific order. The condition may be that encoding is performed using an affine motion model and that a reference picture of the block is the same as a reference picture of the current block. For example, the decoding device may check whether neighboring blocks satisfy the condition according to a certain order, then the decoding device may derive a first block that satisfies the condition, and may derive a first affine MVP candidate based on the first block.

More specifically, for example, the decoding apparatus may derive a motion vector of the CP of the current block based on an affine motion model of the first block, and may derive a first affine MVP candidate including the motion vector as the CPMVP candidate. The affine motion model can be derived as shown in equation 1 or equation 3 presented above.

Further, the specific order of checking the neighboring blocks may be an order starting from the left neighboring block up to the lower left neighboring block, the upper right neighboring block, and the upper left neighboring block.

Herein, for example, in the case where the size of the current block is W × H, and in the case where the x-component of the upper-left sample position of the current block is 0 and the y-component is 0, the left neighboring block may be a block including samples of coordinates (-1, H-1), the upper neighboring block may be a block including samples of coordinates (W-1, -1), the upper-right neighboring block may be a block including samples of coordinates (W, -1), the lower-left neighboring block may be a block including samples of coordinates (-1, H), and the upper-left neighboring block may be a block including samples of coordinates (-1, -1).
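The neighbor coordinates listed above can be collected in a small sketch; the dictionary keys are illustrative names, not terminology from the disclosure.

```python
def neighbor_positions(w, h):
    """Sample coordinates identifying each neighboring block of a W x H
    current block whose top-left sample is at (0, 0), as listed above."""
    return {
        "left": (-1, h - 1),          # left neighboring block
        "top": (w - 1, -1),           # upper neighboring block
        "top_right": (w, -1),         # upper-right neighboring block
        "bottom_left": (-1, h),       # lower-left neighboring block
        "top_left": (-1, -1),         # upper-left neighboring block
    }

pos = neighbor_positions(16, 8)
```

For a 16x8 block, for instance, the upper-right neighboring block is the block containing the sample at (16, -1), just outside the top-right corner of the current block.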

Further, in the case where the first affine MVP candidate and/or the second affine MVP candidate are not derived, that is, in the case where less than 2 affine MVP candidates are derived by performing the above-described processing, the affine MVP candidates may include the constructed affine MVP candidates.

In other words, for example, in the case where the first affine MVP candidate and/or the second affine MVP candidate are not derived, that is, in the case where less than 2 affine MVP candidates are derived by performing the above-described processing, the decoding apparatus may derive the constructed affine MVP candidates based on the neighboring blocks.

More specifically, the decoding apparatus may divide the motion vectors of the neighboring blocks into a first group, a second group, and a third group. The first group may include motion vectors of neighboring block a, motion vectors of neighboring block B, and motion vectors of neighboring block C. The second group may include motion vectors of the neighboring block D and motion vectors of the neighboring block E. And, the third group may include motion vectors of the neighboring block F and motion vectors of the neighboring block G.

In addition, the decoding apparatus may derive a CPMVP candidate of the CP0 of the current block from the first group, derive a CPMVP candidate of the CP1 of the current block from the second group, and derive a CPMVP candidate of the CP2 of the current block from the third group. Also, the decoding apparatus may derive a constructed affine MVP candidate including the CPMVP candidate of the CP.

The CPMVP candidate of CP0 may be a first motion vector that has been verified to satisfy a condition when motion vectors within the first group are checked according to a specific order. The condition may be that the reference picture of the motion vector is the same as the reference picture of the current block. In other words, the CPMVP candidate of the CP0 may be a motion vector having a reference picture that is first verified to be the same as the reference picture of the current block when motion vectors within the first group are checked according to a specific order. For example, the decoding apparatus may check the motion vectors within the first group according to a specific order to verify whether the motion vectors satisfy the condition, and the first motion vector verified to satisfy the condition may be derived as a CPMVP candidate of the CP 0. Herein, for example, the specific order may be an order starting from the neighboring block a up to the neighboring block B and the neighboring block C.

The CPMVP candidate of CP1 may be the first motion vector that has been verified to satisfy the condition when the motion vectors within the second group are examined according to a particular order. The condition may be that the reference picture of the motion vector is the same as the reference picture of the current block. In other words, the CPMVP candidate of the CP1 may be a motion vector having a reference picture of which the first is verified to be the same as the reference picture of the current block when motion vectors within the second group are checked according to a specific order. For example, the decoding apparatus may check the motion vectors within the second group according to a specific order to verify whether the motion vectors satisfy the condition, and the first motion vector verified to satisfy the condition may be derived as a CPMVP candidate of the CP 1. Herein, for example, the specific order may be an order starting from the neighboring block D until the neighboring block E.

The CPMVP candidate of CP2 may be the first motion vector that has been verified to satisfy the condition when the motion vectors within the third group are checked according to a specific order. The condition may be that the reference picture of the motion vector is the same as the reference picture of the current block. In other words, the CPMVP candidate of the CP2 may be a motion vector having a reference picture that is first verified to be the same as the reference picture of the current block when motion vectors within the third group are checked according to a specific order. For example, the decoding apparatus may check the motion vectors within the third group according to a specific order to verify whether the motion vectors satisfy the condition, and the first motion vector verified to satisfy the condition may be derived as a CPMVP candidate of the CP 2. Herein, for example, the specific order may be an order starting from the neighboring block F until the neighboring block G.
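The per-group scans for CP0, CP1, and CP2 all follow the same pattern, sketched below. The tuple representation of a (motion vector, reference picture) pair and the example values are assumptions for illustration only.

```python
def first_matching_mv(group, current_ref_pic):
    """Return the first motion vector in the group (checked in order)
    whose reference picture equals that of the current block, or None."""
    for mv, ref_pic in group:
        if ref_pic == current_ref_pic:
            return mv
    return None

# First group: motion vectors of neighboring blocks A, B, C, in that
# checking order, each paired with its reference picture index.
group_a_b_c = [((1, 2), 5), ((3, 4), 0), ((5, 6), 0)]
cpmvp0 = first_matching_mv(group_a_b_c, current_ref_pic=0)
```

Here block A points to a different reference picture, so the motion vector of block B is the first to satisfy the condition and becomes the CPMVP candidate of CP0; the second and third groups are scanned the same way for CP1 and CP2.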

Herein, neighboring block A may represent the neighboring block at the upper left side of the top-left sample position of the current block, neighboring block B may represent the neighboring block at the upper side of the top-left sample position of the current block, neighboring block C may represent the neighboring block at the left side of the top-left sample position of the current block, neighboring block D may represent the neighboring block at the upper side of the top-right sample position of the current block, neighboring block E may represent the neighboring block at the upper right side of the top-right sample position of the current block, neighboring block F may represent the neighboring block at the left side of the bottom-left sample position of the current block, and neighboring block G may represent the neighboring block at the lower left side of the bottom-left sample position of the current block.

Herein, for example, in case that the size of the current block is W × H, and in case that the x-component of the top-left sample position of the current block is 0 and the y-component is 0, the neighboring block a may be a block including samples of coordinates (-1, -1), the neighboring block B may be a block including samples of coordinates (0, -1), the neighboring block C may be a block including samples of coordinates (-1, 0), the neighboring block D may be a block including samples of coordinates (W-1, -1), the neighboring block E may be a block including samples of coordinates (W, -1), the neighboring block F may be a block including samples of coordinates (-1, H-1), and the neighboring block G may be a block including samples of coordinates (-1, H).

Also, in the case where the CPMVP candidate of CP0 of the current block is derived from the first group and the CPMVP candidate of CP1 of the current block is derived from the second group, but the CPMVP candidate of CP2 of the current block is not derived from the third group, the CPMVP candidate of CP2 may be derived based on the CPMVP candidate of CP0 and the CPMVP candidate of CP1 by using equation 8 described above.

In addition, in the case where the CPMVP candidate of CP0 of the current block is derived from the first group and the CPMVP candidate of CP2 of the current block is derived from the third group, but the CPMVP candidate of CP1 of the current block is not derived from the second group, the CPMVP candidate of CP1 may be derived based on the CPMVP candidate of CP0 and the CPMVP candidate of CP2 by using equation 9 described above.
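Equations 8 and 9 are not reproduced in this excerpt. The sketch below assumes the 4-parameter affine extrapolation commonly used for this purpose (extrapolating the missing control point from the two available ones using the block width W and height H); it is illustrative only and may differ from the exact equations of the disclosure.

```python
def cp2_from_cp0_cp1(mv0, mv1, w, h):
    """Assumed form of equation 8: extrapolate the bottom-left CPMVP
    (CP2) from the top-left (CP0) and top-right (CP1) CPMVPs."""
    return (mv0[0] - (mv1[1] - mv0[1]) * h / w,
            mv0[1] + (mv1[0] - mv0[0]) * h / w)

def cp1_from_cp0_cp2(mv0, mv2, w, h):
    """Assumed form of equation 9: extrapolate the top-right CPMVP
    (CP1) from the top-left (CP0) and bottom-left (CP2) CPMVPs."""
    return (mv0[0] + (mv2[1] - mv0[1]) * w / h,
            mv0[1] - (mv2[0] - mv0[0]) * w / h)
```

Under this assumed model the two extrapolations are mutually consistent: deriving CP2 from CP0 and CP1 and then recovering CP1 from CP0 and that CP2 returns the original vector.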

Furthermore, in the case where less than 2 affine MVP candidates are derived by performing the above-described processing, the affine MVP candidates may include MVP candidates of the existing HEVC standard.

In other words, for example, in the case where less than 2 affine MVP candidates are derived by performing the above-described processing, the decoding device may derive MVP candidates of the existing HEVC standard.

The decoding apparatus derives a Control Point Motion Vector Predictor (CPMVP) of a Control Point (CP) of the current block based on one affine MVP candidate among affine MVP candidates included in the affine MVP candidate list (S1820).

The decoding device may select a specific affine MVP candidate from among the affine MVP candidates included in the affine MVP candidate list, and then, the decoding device may derive the selected affine MVP candidate as the CPMVP of the CP of the current block. For example, the decoding apparatus may obtain, from the bitstream, an affine MVP candidate index of the current block, and then, the decoding apparatus may derive, as the CPMVP of the CP of the current block, an affine MVP candidate indicated by the affine MVP candidate index among affine MVP candidates included in the affine MVP candidate list.

The decoding apparatus derives a Control Point Motion Vector Difference (CPMVD) of the CP of the current block based on the motion prediction information (S1830). The motion prediction information may include information on the CPMVD of each of the CPs, and the decoding device may derive the CPMVD of each of the CPs of the current block based on the information on the CPMVD of each of the CPs.

The decoding apparatus derives a Control Point Motion Vector (CPMV) of the CP of the current block based on the CPMVP and the CPMVD (S1840). The decoding apparatus may derive the CPMV of each CP based on the CPMVD and the CPMVP of each of the CPs. For example, the decoding apparatus may derive the CPMV of the CP by adding the CPMVD and the CPMVP of each CP.
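Step S1840 is a per-control-point vector addition, sketched below; the list-of-tuples representation of the CPMVPs and CPMVDs is an illustrative assumption.

```python
def derive_cpmvs(cpmvps, cpmvds):
    """Derive the CPMV of each CP by adding its CPMVD to its CPMVP
    (component-wise), as in step S1840."""
    return [(p[0] + d[0], p[1] + d[1]) for p, d in zip(cpmvps, cpmvds)]

cpmvs = derive_cpmvs([(1, 1), (2, 2)], [(1, 0), (0, -1)])
```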

The decoding apparatus derives prediction samples of the current block based on the CPMV (S1850). The decoding apparatus may derive motion vectors of the current block on a sub-block basis or a sample basis based on the CPMV. That is, the decoding apparatus may derive a motion vector for each sub-block or each sample of the current block. The motion vectors of the sub-block units or sample units may be derived based on equation 1 or equation 3 described above. The motion vectors may be indicated as a Motion Vector Field (MVF) or as an array of motion vectors.
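Equations 1 and 3 are not reproduced in this excerpt; the sketch below assumes the common 4-parameter affine motion-field form, with CPMV0 at the top-left corner and CPMV1 at the top-right corner of a block of width W. It is an illustration of the per-position derivation, not the exact equation of the disclosure.

```python
def affine_mv(x, y, mv0, mv1, w):
    """Motion vector at position (x, y) under an assumed 4-parameter
    affine model: mv0 is CPMV0 (top-left), mv1 is CPMV1 (top-right)."""
    ax = (mv1[0] - mv0[0]) / w  # horizontal gradient of the field
    ay = (mv1[1] - mv0[1]) / w  # vertical gradient of the field
    return (ax * x - ay * y + mv0[0],
            ay * x + ax * y + mv0[1])
```

Evaluating this at each sub-block center (or each sample) yields the motion vector field; by construction the field reproduces CPMV0 at (0, 0) and CPMV1 at (W, 0).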

The decoding apparatus may derive prediction samples of the current block based on motion vectors of sub-block units or sample units. The decoding apparatus may derive a reference region within a reference picture based on a motion vector of a sub-block unit or a sample unit, and may generate prediction samples of the current block based on reconstructed samples within the reference region.

The decoding apparatus generates a reconstructed picture of the current block based on the derived prediction samples (S1860). The decoding apparatus may directly use the prediction samples as reconstructed samples depending on the prediction mode, or the decoding apparatus may generate the reconstructed samples by adding residual samples to the prediction samples. In case there are residual samples for the current block, the decoding apparatus may obtain information on the residual of the current block from the bitstream. The information on the residual may include transform coefficients related to the residual samples. The decoding apparatus may derive the residual samples (or a residual sample array) of the current block based on the residual information. The decoding apparatus may generate the reconstructed samples based on the prediction samples and the residual samples, and may derive a reconstructed block or a reconstructed picture based on the reconstructed samples. Thereafter, as described above, in order to enhance subjective/objective image quality as needed, the decoding apparatus may apply an in-loop filtering process, such as deblocking filtering and/or an SAO process, to the reconstructed picture.
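The reconstruction step can be sketched as follows. The flat-list sample representation, the bit-depth parameter, and the clipping to the valid sample range are illustrative assumptions.

```python
def reconstruct(pred, resid=None, bit_depth=8):
    """Reconstruct samples: use the prediction samples directly when no
    residual exists, otherwise add the residual samples and clip the
    result to the valid sample range [0, 2^bit_depth - 1]."""
    if resid is None:
        return list(pred)
    max_val = (1 << bit_depth) - 1
    return [min(max(p + r, 0), max_val) for p, r in zip(pred, resid)]
```

For 8-bit samples the clip keeps every reconstructed value inside [0, 255] even when the sum of prediction and residual overflows that range.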

Fig. 19 illustrates a general view of a decoding apparatus performing an image decoding method according to the present disclosure. The method disclosed in fig. 18 may be performed by the decoding device disclosed in fig. 19. More specifically, for example, the entropy decoder of the decoding apparatus of fig. 19 may perform step S1800 of fig. 18, the predictor of the decoding apparatus of fig. 19 may perform steps S1810 to S1850 of fig. 18, and the adder of the decoding apparatus of fig. 19 may perform step S1860. In addition, although not shown in the drawings, a process of obtaining information regarding a residual of the current block through a bitstream may be performed by an entropy decoder of the decoding apparatus of fig. 19, and a process of deriving residual samples of the current block based on the residual information may be performed by an inverse transformer of the decoding apparatus of fig. 19.

According to the present disclosure described above, the efficiency of affine motion prediction based image encoding can be improved.

According to the present disclosure described above, when deriving an affine MVP candidate list, neighboring blocks are divided into a left block group and an upper block group, and the affine MVP candidate list may be constructed by deriving MVP candidates from each block group. Accordingly, the complexity of the process of constructing the affine MVP candidate list can be reduced, and the encoding efficiency can be enhanced.

In the above embodiments, although the method has been described based on a flowchart using a series of steps or blocks, the present disclosure is not limited to the sequence of these steps, and some of these steps may be performed in a different sequence from the remaining steps or may be performed simultaneously with the remaining steps. Further, those skilled in the art will appreciate that the steps illustrated with the flowcharts are not exclusive, and that other steps may be included or one or more steps in the flowcharts may be deleted without affecting the scope of the present disclosure.

The embodiments described herein may be implemented and executed on a processor, microprocessor, controller, or chip. For example, the functional elements shown in each figure may be implemented and executed on a computer, processor, microprocessor, controller, or chip. In this case, information for implementation (e.g., information about instructions) or algorithms may be stored in the digital storage medium.

In addition, the decoding apparatus and the encoding apparatus to which the present disclosure applies may be included in multimedia broadcast transmission and reception devices, mobile communication terminals, home theater video devices, digital cinema video devices, surveillance cameras, video chat devices, (3D) video devices, video telephony devices, medical video devices, storage media, camcorders, video on demand (VoD) service providing devices, over-the-top video (OTT video) devices, internet streaming service providing devices, vehicle terminals (e.g., car terminals, airplane terminals, ship terminals, and the like), and the like, and may be used to process video signals or data signals. For example, an over-the-top video (OTT video) device may include a game console, a blu-ray player, an internet access TV, a home theater system, a smart phone, a tablet PC, and a Digital Video Recorder (DVR).

In addition, the processing method to which the present disclosure is applied may be produced in the form of a program executed by a computer, and may be stored in a computer-readable recording medium. The multimedia data having the data structure according to the present disclosure may also be stored in a computer-readable recording medium. The computer-readable recording medium includes all kinds of storage devices and distributed storage devices in which computer-readable data is stored. The computer-readable recording medium may be, for example, a blu-ray disc (BD), a Universal Serial Bus (USB), a ROM, a PROM, an EPROM, an EEPROM, a RAM, a CD-ROM, a magnetic tape, a floppy disk, or an optical data storage device. In addition, the computer-readable recording medium includes media implemented in the form of carrier waves (e.g., transmission through the internet). In addition, the bitstream generated by the encoding method may be stored in a computer-readable recording medium or may be transmitted through a wired or wireless communication network.

In addition, the embodiments of the present disclosure may be implemented as a computer program product by program codes, and the program codes may be executed in a computer according to the embodiments of the present disclosure. The program code may be stored on a carrier readable by a computer.

Fig. 20 illustrates a content streaming system structure to which the present disclosure is applied.

A content streaming system to which the present disclosure is applied may include an encoding server, a streaming server, a network server, a media storage, a user device, and a multimedia input device.

The encoding server compresses contents input from a multimedia input device such as a smart phone, a camera, a camcorder, etc. into digital data to generate a bitstream, and transmits the bitstream to the streaming server. As another example, when a multimedia input device such as a smart phone, a camera, a camcorder, etc. directly generates a bitstream, an encoding server may be omitted.

The bitstream may be generated by applying the encoding method or the bitstream generation method of the present disclosure, and the streaming server may temporarily store the bitstream in a process of transmitting or receiving the bitstream.

The streaming server transmits multimedia data to the user device through the web server based on the user request, and the web server serves as an intermediary informing the user of what service is provided. When a user requests a desired service from the web server, the web server transmits the request to the streaming server, and the streaming server transmits multimedia data to the user. Here, the content streaming system may include a separate control server, and in this case, the control server controls commands/responses among the devices in the content streaming system.

The streaming server may receive content from the media store and/or the encoding server. For example, the content may be received in real time as it is received from the encoding server. In this case, in order to provide a smooth streaming service, the streaming server may store the bit stream for a predetermined time.

Examples of user devices include mobile phones, smart phones, laptop computers, digital broadcast terminals, Personal Digital Assistants (PDAs), Portable Multimedia Players (PMPs), navigation devices, touch screen PCs, tablet PCs, ultrabooks, wearable devices (e.g., smart watches, smart glasses, head mounted displays), digital TVs, desktop computers, digital signage, and the like. Each server in the content streaming system may operate as a distributed server, and in this case, data received from each server may be processed in a distributed manner.
