Affine motion prediction based image decoding method and apparatus using affine MVP candidate list in image coding system

Document No.: 1358580    Publication date: 2020-07-24

Description: This technology, "Affine motion prediction based image decoding method and apparatus using affine MVP candidate list in image coding system," was devised by Lee Jae-ho (李在镐) on 2019-09-10. Its main content is as follows: The method of performing image decoding by a decoding apparatus according to the present document includes the steps of: obtaining motion prediction information on a current block from a bitstream; generating an affine MVP candidate list of the current block; deriving the CPMVP of the CP of the current block on the basis of the affine MVP candidate list; deriving a CPMVD of the CP of the current block based on the motion prediction information; deriving the CPMV of the CP of the current block on the basis of the CPMVP and the CPMVD; and deriving prediction samples for the current block based on the CPMV.

1. A video decoding method performed by a decoding apparatus, the method comprising the steps of:

obtaining motion prediction information of a current block from a bitstream;

constructing an affine Motion Vector Predictor (MVP) candidate list of the current block;

deriving a Control Point Motion Vector Predictor (CPMVP) for a Control Point (CP) of the current block based on the affine MVP candidate list;

deriving a Control Point Motion Vector Difference (CPMVD) of the CP of the current block based on the motion prediction information;

deriving a Control Point Motion Vector (CPMV) of the CP of the current block based on the CPMVP and the CPMVD;

deriving prediction samples for the current block based on the CPMV; and

generating a reconstructed picture of the current block based on the derived prediction samples,

wherein the step of constructing the affine MVP candidate list comprises the steps of:

checking whether inherited affine MVP candidates are available, wherein the inherited affine MVP candidates are derived when available;

checking whether a constructed affine MVP candidate is available, wherein the constructed affine MVP candidate is derived when the constructed affine MVP candidate is available and includes a candidate motion vector of the CP0 of the current block, a candidate motion vector of the CP1 of the current block, and a candidate motion vector of the CP2 of the current block;

deriving a first affine MVP candidate when the number of derived affine MVP candidates is less than 2 and a motion vector of the CP0 is available, wherein the first affine MVP candidate is an affine MVP candidate including a motion vector of the CP0 as a candidate motion vector of a CP;

deriving a second affine MVP candidate when the number of derived affine MVP candidates is less than 2 and a motion vector of the CP1 is available, wherein the second affine MVP candidate is an affine MVP candidate including a motion vector of the CP1 as a candidate motion vector of a CP;

deriving a third affine MVP candidate when the number of derived affine MVP candidates is less than 2 and a motion vector of the CP2 is available, wherein the third affine MVP candidate is an affine MVP candidate including a motion vector of the CP2 as a candidate motion vector of a CP;

deriving a fourth affine MVP candidate including a temporal MVP derived based on a temporal neighboring block of the current block as a candidate motion vector of a CP when the number of derived affine MVP candidates is less than 2; and

deriving a fifth affine MVP candidate including a zero motion vector as a candidate motion vector of the CP when the number of derived affine MVP candidates is less than 2.

2. The method of claim 1, wherein the CP0 represents an upper left position of the current block, the CP1 represents an upper right position of the current block, and the CP2 represents a lower left position of the current block; and

the constructed affine MVP candidate is available when the candidate motion vectors are available.

3. The method of claim 2, wherein the candidate motion vector of the CP0 is available when a reference picture of a first block in the first group is the same as a reference picture of the current block,

when the reference picture of the second block in the second group is the same as the reference picture of the current block, the candidate motion vector of the CP1 is available,

when the reference picture of the third block in the third group is the same as the reference picture of the current block, the candidate motion vector of the CP2 is available; and

when the candidate motion vector of the CP0 is available, the candidate motion vector of the CP1 is available, and the candidate motion vector of the CP2 is available, the affine MVP candidate list includes the constructed affine MVP candidate.

4. The method of claim 3, wherein the first group comprises neighboring block A, neighboring block B, and neighboring block C; the second group comprises neighboring block D and neighboring block E; and the third group comprises neighboring block F and neighboring block G; and

when the size of the current block is W × H, and the x-component and the y-component of the upper-left sample position of the current block are 0, the neighboring block A is a block including a sample at coordinates (-1, -1), the neighboring block B is a block including a sample at coordinates (0, -1), the neighboring block C is a block including a sample at coordinates (-1, 0), the neighboring block D is a block including a sample at coordinates (W-1, -1), the neighboring block E is a block including a sample at coordinates (W, -1), the neighboring block F is a block including a sample at coordinates (-1, H-1), and the neighboring block G is a block including a sample at coordinates (-1, H).

5. The method of claim 4, wherein the first block is the block whose reference picture is first confirmed to be the same as the reference picture of the current block while checking the neighboring blocks in the first group in a first particular order,

the second block is the block whose reference picture is first confirmed to be the same as the reference picture of the current block while checking the neighboring blocks in the second group in a second particular order, and

the third block is the block whose reference picture is first confirmed to be the same as the reference picture of the current block while checking the neighboring blocks in the third group in a third particular order.

6. The method of claim 5, wherein the first particular order is an order from the neighboring block A to the neighboring block B, and then to the neighboring block C,

the second particular order is an order from the neighboring block D to the neighboring block E, and

the third particular order is an order from the neighboring block F to the neighboring block G.

7. The method of claim 1, wherein availability of neighboring blocks of the current block is checked in a particular order, and the inherited affine MVP candidate is derived based on the checked available neighboring blocks.

8. The method of claim 7, wherein the available neighboring blocks are neighboring blocks that are encoded according to an affine motion model and have the same reference picture as a reference picture of the current block.

9. The method of claim 8, wherein the neighboring blocks comprise a left neighboring block and an upper neighboring block of the current block.

10. The method of claim 8, wherein the neighboring block comprises a left neighboring block to the current block, and

when an upper neighboring block of the current block is included in a current Coding Tree Unit (CTU) including the current block, the neighboring block includes the upper neighboring block of the current block.

11. The method of claim 9, wherein, when the neighboring blocks include the left neighboring block and the upper neighboring block, the particular order is an order from the left neighboring block to the upper neighboring block.

12. The method of claim 11, wherein, when the size of the current block is W × H and x and y components of an upper-left sample position of the current block are 0, the left neighboring block is a block including a sample at coordinates (-1, H-1) and the upper neighboring block is a block including a sample at coordinates (W-1, -1).

13. The method of claim 1, wherein no pruning checks between the inherited affine MVP candidate and the constructed affine MVP candidate are performed.

14. The method of claim 1, wherein, when an upper neighboring block of the current block is included in a current Coding Tree Unit (CTU) including the current block, the upper neighboring block is used to derive the inherited affine MVP candidate, and

when the upper neighboring block of the current block is not included in the current CTU, the upper neighboring block is not used to derive the inherited affine MVP candidate.

15. A video encoding method performed by an encoding apparatus, the video encoding method comprising the steps of:

constructing an affine Motion Vector Predictor (MVP) candidate list of the current block;

deriving a Control Point Motion Vector Predictor (CPMVP) for a Control Point (CP) of the current block based on the affine MVP candidate list;

deriving a Control Point Motion Vector (CPMV) of the CP of the current block;

deriving a Control Point Motion Vector Difference (CPMVD) of the CP of the current block based on the CPMVP and the CPMV; and

encoding motion prediction information including information on the CPMVD,

wherein the step of constructing the affine MVP candidate list comprises the steps of:

checking whether an inherited affine MVP candidate for the current block is available, wherein the inherited affine MVP candidate is derived when available;

checking whether a constructed affine MVP candidate of the current block is available, wherein the constructed affine MVP candidate is derived when the constructed affine MVP candidate is available and includes a candidate motion vector of CP0 of the current block, a candidate motion vector of CP1 of the current block, and a candidate motion vector of CP2 of the current block;

deriving a first affine MVP candidate when the number of derived affine MVP candidates is less than 2 and a motion vector of the CP0 is available, wherein the first affine MVP candidate is an affine MVP candidate including a motion vector of the CP0 as a candidate motion vector of a CP;

deriving a second affine MVP candidate when the number of derived affine MVP candidates is less than 2 and a motion vector of the CP1 is available, wherein the second affine MVP candidate is an affine MVP candidate including a motion vector of the CP1 as a candidate motion vector of a CP;

deriving a third affine MVP candidate when the number of derived affine MVP candidates is less than 2 and a motion vector of the CP2 is available, wherein the third affine MVP candidate is an affine MVP candidate including a motion vector of the CP2 as a candidate motion vector of a CP;

deriving a fourth affine MVP candidate including a temporal MVP derived based on a temporal neighboring block of the current block as a candidate motion vector of a CP when the number of derived affine MVP candidates is less than 2; and

deriving a fifth affine MVP candidate including a zero motion vector as a candidate motion vector of the CP when the number of derived affine MVP candidates is less than 2.

Technical Field

The present disclosure relates to video encoding technology, and more particularly, to a method and apparatus for video decoding based on affine motion prediction in a video encoding system.

Background

In various fields, demand for high-resolution, high-quality images such as HD (high definition) images and UHD (ultra high definition) images is increasing. Since the image data has high resolution and high quality, the amount of information or bits to be transmitted increases relative to conventional image data. Therefore, when image data is transmitted using a medium such as a conventional wired/wireless broadband line or stored using an existing storage medium, transmission costs and storage costs thereof increase.

Accordingly, there is a need for efficient image compression techniques for efficiently transmitting, storing, and reproducing information for high-resolution, high-quality images.

Disclosure of Invention

Technical problem

It is a technical object of the present disclosure to provide a method and apparatus for improving video coding efficiency.

Another technical object of the present disclosure is to provide a video decoding method and apparatus that construct an affine MVP candidate list of a current block by deriving a constructed affine MVP candidate based on neighboring blocks only when all candidate motion vectors for the CPs are available, and that perform prediction of the current block based on the constructed affine MVP candidate list.

It is still another technical object of the present disclosure to provide a video decoding method and apparatus that derive an additional affine MVP candidate by using a candidate motion vector obtained in the process of deriving a constructed affine MVP candidate when the number of available inherited affine MVP candidates and constructed affine MVP candidates (i.e., the number of candidates in the affine MVP candidate list) is less than the maximum number, and that perform prediction of the current block based on the constructed affine MVP candidate list.

Technical scheme

According to an embodiment of the present disclosure, there is provided a video decoding method performed by a decoding apparatus. The method comprises the following steps: obtaining motion prediction information of a current block from a bitstream; constructing an affine Motion Vector Predictor (MVP) candidate list of the current block; deriving a Control Point Motion Vector Predictor (CPMVP) for a Control Point (CP) of the current block based on the affine MVP candidate list; deriving a Control Point Motion Vector Difference (CPMVD) of the CP of the current block based on the motion prediction information; deriving a Control Point Motion Vector (CPMV) of the CP of the current block based on the CPMVP and the CPMVD; deriving prediction samples for the current block based on the CPMV; and generating a reconstructed picture of the current block based on the derived prediction samples, wherein the step of constructing the affine MVP candidate list comprises: checking whether an inherited affine MVP candidate of the current block is available, wherein the inherited affine MVP candidate is derived when the inherited affine MVP candidate is available; checking whether a constructed affine MVP candidate of the current block is available, wherein the constructed affine MVP candidate is derived when the constructed affine MVP candidate is available, and the constructed affine MVP candidate includes a candidate motion vector of CP0 of the current block, a candidate motion vector of CP1 of the current block, and a candidate motion vector of CP2 of the current block; deriving a first affine MVP candidate when the number of derived affine MVP candidates is less than 2 and a motion vector of CP0 is available, wherein the first affine MVP candidate is an affine MVP candidate including a motion vector of CP0 as a candidate motion vector of CP; deriving a second affine MVP candidate when the number of derived affine MVP candidates is less than 2 and a motion vector of CP1 is available, wherein the second affine MVP candidate is an affine MVP candidate including a motion vector of CP1 as a candidate motion vector of CP; deriving a third affine MVP candidate when the number of derived affine MVP candidates is less than 2 and a motion vector of CP2 is available, wherein the third affine MVP candidate is an affine MVP candidate including a motion vector of CP2 as a candidate motion vector of CP; deriving a fourth affine MVP candidate including a temporal MVP derived based on a temporal neighboring block of the current block as a candidate motion vector of the CP when the number of derived affine MVP candidates is less than 2; and deriving a fifth affine MVP candidate when the number of derived affine MVP candidates is less than 2, the fifth affine MVP candidate including a zero motion vector as a candidate motion vector of the CP.
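For illustration only, the candidate list construction described above can be summarized with the following sketch. It is not the normative text of this document; the caller is assumed to have already derived the inherited candidates, the candidate motion vectors of CP0/CP1/CP2 from the neighboring blocks, and the temporal MVP, and every name used here is a hypothetical placeholder introduced for this example.

```python
MAX_NUM_AFFINE_MVP_CANDIDATES = 2  # the list is filled up to two candidates


def build_affine_mvp_candidate_list(inherited_candidates, cp_candidate_mvs, temporal_mvp):
    """Sketch of the affine MVP candidate list construction described above.

    inherited_candidates: list of (mv_cp0, mv_cp1, mv_cp2) tuples.
    cp_candidate_mvs: [mv_cp0, mv_cp1, mv_cp2], each an (x, y) tuple or None.
    temporal_mvp: (x, y) tuple or None.
    """
    candidates = list(inherited_candidates)

    # Constructed candidate: added only when the candidate motion vectors of
    # CP0, CP1 and CP2 are all available.
    if all(mv is not None for mv in cp_candidate_mvs):
        candidates.append(tuple(cp_candidate_mvs))

    # First/second/third fallback candidates: reuse one available CP motion
    # vector for every control point, only while fewer than two candidates exist.
    for mv in cp_candidate_mvs:
        if len(candidates) < MAX_NUM_AFFINE_MVP_CANDIDATES and mv is not None:
            candidates.append((mv, mv, mv))

    # Fourth fallback candidate: temporal MVP from a temporal neighboring block.
    if len(candidates) < MAX_NUM_AFFINE_MVP_CANDIDATES and temporal_mvp is not None:
        candidates.append((temporal_mvp, temporal_mvp, temporal_mvp))

    # Fifth fallback candidate: zero motion vector.
    while len(candidates) < MAX_NUM_AFFINE_MVP_CANDIDATES:
        candidates.append(((0, 0), (0, 0), (0, 0)))

    return candidates[:MAX_NUM_AFFINE_MVP_CANDIDATES]


# Example: one inherited candidate, only CP0 and CP1 available, no temporal MVP.
mvp_list = build_affine_mvp_candidate_list(
    [((1, 2), (3, 2), (1, 5))], [(4, 0), (6, 1), None], None)
```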

According to another embodiment of the present disclosure, there is provided a decoding apparatus that performs video decoding. The decoding apparatus includes: an entropy decoder that obtains motion prediction information of a current block from a bitstream; a predictor that constructs an affine Motion Vector Predictor (MVP) candidate list of the current block, derives a Control Point Motion Vector Predictor (CPMVP) of a Control Point (CP) of the current block based on the affine MVP candidate list, derives a Control Point Motion Vector Difference (CPMVD) of the CP of the current block based on the motion prediction information, derives a Control Point Motion Vector (CPMV) of the CP of the current block based on the CPMVP and the CPMVD, and derives prediction samples for the current block based on the CPMV; and an adder that generates a reconstructed picture of the current block based on the derived prediction samples, wherein the affine MVP candidate list is constructed based on: checking whether an inherited affine MVP candidate for the current block is available, wherein the inherited affine MVP candidate is derived when the inherited affine MVP candidate is available; checking whether a constructed affine MVP candidate of the current block is available, wherein the constructed affine MVP candidate is derived when the constructed affine MVP candidate is available, and the constructed affine MVP candidate includes a candidate motion vector of CP0 of the current block, a candidate motion vector of CP1 of the current block, and a candidate motion vector of CP2 of the current block; deriving a first affine MVP candidate when the number of derived affine MVP candidates is less than 2 and a motion vector of CP0 is available, wherein the first affine MVP candidate is an affine MVP candidate including a motion vector of CP0 as a candidate motion vector of CP; deriving a second affine MVP candidate when the number of derived affine MVP candidates is less than 2 and a motion vector of CP1 is available, wherein the second affine MVP candidate is an affine MVP candidate including a motion vector of CP1 as a candidate motion vector of CP; deriving a third affine MVP candidate when the number of derived affine MVP candidates is less than 2 and the motion vector of CP2 is available, wherein the third affine MVP candidate is an affine MVP candidate including the motion vector of CP2 as a candidate motion vector of CP; deriving a fourth affine MVP candidate including a temporal MVP derived based on a temporal neighboring block of the current block as a candidate motion vector of the CP when the number of derived affine MVP candidates is less than 2; and deriving a fifth affine MVP candidate when the number of derived affine MVP candidates is less than 2, the fifth affine MVP candidate including a zero motion vector as a candidate motion vector of the CP.

According to still another embodiment of the present disclosure, there is provided a video encoding method performed by an encoding apparatus. The method comprises the following steps: constructing an affine Motion Vector Predictor (MVP) candidate list of the current block; deriving a Control Point Motion Vector Predictor (CPMVP) for a Control Point (CP) of the current block based on the affine MVP candidate list; deriving a CPMV of the CP of the current block; deriving a Control Point Motion Vector Difference (CPMVD) of the CP of the current block based on the CPMVP and the CPMV; and encoding motion prediction information including information on the CPMVD, wherein the constructing of the affine MVP candidate list includes: checking whether an inherited affine MVP candidate for the current block is available, wherein the inherited affine MVP candidate is derived when the inherited affine MVP candidate is available; checking whether a constructed affine MVP candidate of the current block is available, wherein the constructed affine MVP candidate is derived when the constructed affine MVP candidate is available, and the constructed affine MVP candidate includes a candidate motion vector of CP0 of the current block, a candidate motion vector of CP1 of the current block, and a candidate motion vector of CP2 of the current block; deriving a first affine MVP candidate when the number of derived affine MVP candidates is less than 2 and a motion vector of CP0 is available, wherein the first affine MVP candidate is an affine MVP candidate including a motion vector of CP0 as a candidate motion vector of CP; deriving a second affine MVP candidate when the number of derived affine MVP candidates is less than 2 and a motion vector of CP1 is available, wherein the second affine MVP candidate is an affine MVP candidate including a motion vector of CP1 as a candidate motion vector of CP; deriving a third affine MVP candidate when the number of derived affine MVP candidates is less than 2 and the motion vector of CP2 is available, wherein the third affine MVP candidate is an affine MVP candidate including the motion vector of CP2 as a candidate motion vector of CP; deriving a fourth affine MVP candidate including a temporal MVP derived based on a temporal neighboring block of the current block as a candidate motion vector of the CP when the number of derived affine MVP candidates is less than 2; and deriving a fifth affine MVP candidate when the number of derived affine MVP candidates is less than 2, the fifth affine MVP candidate including a zero motion vector as a candidate motion vector of the CP.
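As a minimal illustration of the encoding-side step above, and under the assumption (made only for this example) that each control point motion vector is a plain (x, y) integer tuple, the CPMVD signalled for each control point is the difference between the CPMV chosen by the encoder and the CPMVP taken from the affine MVP candidate list, and the decoder inverts this relationship:

```python
def derive_cpmvds(cpmvs, cpmvps):
    """Encoder side: one motion vector difference per control point (CP0, CP1, CP2)."""
    return [(mv[0] - mvp[0], mv[1] - mvp[1]) for mv, mvp in zip(cpmvs, cpmvps)]


def reconstruct_cpmvs(cpmvps, cpmvds):
    """Decoder side: add each signalled CPMVD back to its CPMVP to recover the CPMV."""
    return [(mvp[0] + mvd[0], mvp[1] + mvd[1]) for mvp, mvd in zip(cpmvps, cpmvds)]
```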

According to still another embodiment of the present disclosure, there is provided a video encoding apparatus. The encoding device includes: a predictor that constructs an affine Motion Vector Predictor (MVP) candidate list of the current block, derives a Control Point Motion Vector Predictor (CPMVP) of a Control Point (CP) of the current block, and derives a CPMV of the CP of the current block; a subtractor which derives a Control Point Motion Vector Difference (CPMVD) of the CP of the current block based on the CPMVP and the CPMV; and an entropy encoder which encodes motion prediction information including information on the CPMVD, wherein the affine MVP candidate list is constructed based on: checking whether an inherited affine MVP candidate for the current block is available, wherein the inherited affine MVP candidate is derived when the inherited affine MVP candidate is available; checking whether a constructed affine MVP candidate of the current block is available, wherein the constructed affine MVP candidate is derived when the constructed affine MVP candidate is available, and the constructed affine MVP candidate includes a candidate motion vector of CP0 of the current block, a candidate motion vector of CP1 of the current block, and a candidate motion vector of CP2 of the current block; deriving a first affine MVP candidate when the number of derived affine MVP candidates is less than 2 and a motion vector of CP0 is available, wherein the first affine MVP candidate is an affine MVP candidate including a motion vector of CP0 as a candidate motion vector of CP; deriving a second affine MVP candidate when the number of derived affine MVP candidates is less than 2 and a motion vector of CP1 is available, wherein the second affine MVP candidate is an affine MVP candidate including a motion vector of CP1 as a candidate motion vector of CP; deriving a third affine MVP candidate when the number of derived affine MVP candidates is less than 2 and the motion vector of CP2 is available, wherein the third affine MVP candidate is an affine MVP candidate including the motion vector of CP2 as a candidate motion vector of CP; deriving a fourth affine MVP candidate including a temporal MVP derived based on a temporal neighboring block of the current block as a candidate motion vector of the CP when the number of derived affine MVP candidates is less than 2; and deriving a fifth affine MVP candidate when the number of derived affine MVP candidates is less than 2, the fifth affine MVP candidate including a zero motion vector as a candidate motion vector of the CP.

Advantageous effects

According to the present disclosure, overall image/video compression efficiency can be improved.

According to the present disclosure, the efficiency of affine motion prediction based video encoding can be improved.

According to the present disclosure, in deriving an affine MVP candidate list, a constructed affine MVP candidate may be added only when all candidate motion vectors of CPs of the constructed affine MVP candidate are available, whereby the complexity of deriving the constructed affine MVP candidate and constructing the affine MVP candidate list may be reduced, and encoding efficiency may be improved.

According to the present disclosure, in deriving an affine MVP candidate list, additional affine MVP candidates may be derived based on candidate motion vectors of CPs derived from a process for deriving a constructed affine MVP candidate, whereby the complexity of constructing the affine MVP candidate list may be reduced and encoding efficiency may be improved.

According to the present disclosure, in deriving an inherited affine MVP candidate, only when an upper neighboring block is included in a current CTU, the inherited affine MVP candidate can be derived by using the upper neighboring block, whereby the storage amount of a line buffer for affine prediction can be reduced and hardware cost can be minimized.
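A minimal sketch of this line-buffer constraint, assuming a 128 × 128 CTU size and (x, y) top-left luma-sample positions (neither of which is specified here), is the following check:

```python
def in_same_ctu(pos_a, pos_b, ctu_size=128):
    """True when both (x, y) positions fall inside the same CTU."""
    return (pos_a[0] // ctu_size, pos_a[1] // ctu_size) == \
           (pos_b[0] // ctu_size, pos_b[1] // ctu_size)


def upper_neighbor_usable(current_pos, upper_neighbor_pos, ctu_size=128):
    # The upper neighboring block contributes an inherited affine MVP candidate only
    # when it lies in the same CTU as the current block, so its affine parameters do
    # not have to be kept in a line buffer across the CTU boundary.
    return in_same_ctu(current_pos, upper_neighbor_pos, ctu_size)
```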

Drawings

Fig. 1 illustrates an example of a video/image encoding system to which the present disclosure may be applied.

Fig. 2 is a schematic diagram illustrating a configuration of a video/image encoding device to which an embodiment of this document can be applied.

Fig. 3 is a schematic diagram illustrating a configuration of a video/image decoding apparatus to which an embodiment of this document can be applied.

Fig. 4 illustrates a motion represented by an affine motion model.

Fig. 5 illustrates an affine motion model using motion vectors of 3 control points.

Fig. 6 illustrates an affine motion model using motion vectors of 2 control points.

Fig. 7 illustrates a method of deriving a motion vector on a sub-block basis based on an affine motion model.

Fig. 8 is a flowchart illustrating an affine motion prediction method according to one embodiment of the present disclosure.

Fig. 9 illustrates a method for deriving a motion vector predictor at a control point according to one embodiment of the present disclosure.

Fig. 10 illustrates a method for deriving a motion vector predictor at a control point according to one embodiment of the present disclosure.

Fig. 11 illustrates one example of affine prediction performed when the neighboring block A is selected as an affine merge candidate.

Fig. 12 illustrates neighboring blocks used to derive inherited affine candidates.

Fig. 13 illustrates spatial candidates for the constructed affine candidates.

Fig. 14 illustrates an example of constructing an affine MVP list.

Fig. 15 illustrates an example of deriving constructed candidates.

Fig. 16 illustrates an example of deriving constructed candidates.

Fig. 17 illustrates positions of neighboring blocks scanned for deriving inherited affine candidates.

Fig. 18 illustrates an example of deriving constructed candidates when a four-parameter affine motion model is applied to the current block.

Fig. 19 illustrates an example of deriving constructed candidates when a six-parameter affine motion model is applied to the current block.

Fig. 20a and 20b illustrate embodiments for deriving inherited affine candidates.

Fig. 21 illustrates a video encoding method performed by an encoding apparatus according to the present disclosure.

Fig. 22 illustrates an encoding apparatus performing a video encoding method according to the present disclosure.

Fig. 23 illustrates a video decoding method performed by a decoding apparatus according to the present disclosure.

Fig. 24 illustrates a decoding apparatus performing a video decoding method according to the present disclosure.

Fig. 25 illustrates a content flow system structure to which an embodiment of the present disclosure is applied.

Detailed Description

The present disclosure may be modified in various forms, and specific embodiments thereof will be described and illustrated in the accompanying drawings. However, the embodiments are not intended to limit the present disclosure. The terminology used in the following description is for the purpose of describing particular embodiments only and is not intended to limit the disclosure. An expression in the singular includes an expression in the plural unless it is clearly read otherwise. Terms such as "include" and "have" are intended to indicate the presence of the features, numbers, steps, operations, elements, components, or combinations thereof used in the following description, and therefore it should be understood that the possibility of the presence or addition of one or more different features, numbers, steps, operations, elements, components, or combinations thereof is not excluded.

On the other hand, elements in the drawings described in the present disclosure are separately drawn for convenience of explaining different specific functions, and it is not meant that the elements are embodied by separate hardware or separate software. For example, two or more of the elements may be combined to form a single element, or one element may be divided into a plurality of elements. Embodiments in which elements are combined and/or divided are within the present disclosure without departing from the concepts of the present disclosure.

Hereinafter, embodiments of the present disclosure will be described in detail with reference to the accompanying drawings. In addition, like reference numerals are used to designate like elements throughout the drawings, and the same description of the like elements will be omitted.

Fig. 1 illustrates an example of a video/image encoding system to which the present disclosure may be applied.

Referring to fig. 1, a video/image encoding system may include a first device (source device) and a second device (sink device). The source device may transmit the encoded video/image information or data to the sink device in the form of a file or stream via a digital storage medium or a network.

The source device may include a video source, an encoding apparatus, and a transmitter. The receiving apparatus may include a receiver, a decoding device, and a renderer. The encoding device may be referred to as a video/image encoding device, and the decoding device may be referred to as a video/image decoding device. The transmitter may be comprised in an encoding device. The receiver may be comprised in a decoding device. The renderer may include a display, and the display may be configured as a separate device or an external component.

The video source may acquire the video/image by capturing, synthesizing, or generating the video/image. The video source may include a video/image capture device and/or a video/image generation device. The video/image capture device may include, for example, one or more cameras, video/image archives including previously captured video/images, and the like. The video/image generation means may comprise, for example, a computer, a tablet computer and a smartphone, and may generate the video/image (electronically). For example, the virtual video/image may be generated by a computer or the like. In this case, the video/image capturing process may be replaced by a process of generating the relevant data.

The encoding apparatus may encode the input video/image. An encoding apparatus may perform a series of processes such as prediction, transformation, and quantization to achieve compression and encoding efficiency. The encoded data (encoded video/image information) may be output in the form of a bitstream.

The transmitter may transmit the encoded image/image information or data, which is output in the form of a bitstream, to the receiver of the receiving apparatus in the form of a file or a stream through a digital storage medium or a network. The digital storage medium may include various storage media such as USB, SD, CD, DVD, blu-ray, HDD, SSD, and the like. The transmitter may include elements for generating a media file through a predetermined file format, and may include elements for transmitting through a broadcast/communication network. The receiver may receive/extract a bitstream and transmit the received bitstream to the decoding apparatus.

The decoding apparatus may decode the video/image by performing a series of processes such as inverse quantization, inverse transformation, and prediction corresponding to the operation of the encoding apparatus.

The renderer may render the decoded video/image. The rendered video/image may be displayed by a display.

This document relates to video/image coding. For example, the methods/embodiments disclosed in this document may be applied to methods disclosed in the versatile video coding (VVC) standard, the essential video coding (EVC) standard, the AOMedia Video 1 (AV1) standard, the 2nd generation Audio Video coding Standard (AVS2), or a next-generation video/image coding standard (e.g., H.267, H.268, or the like).

This document presents various embodiments of video/image coding, and unless otherwise mentioned, embodiments may be performed in combination with each other.

A picture may be divided into one or more tile groups, and a tile group may include one or more tiles. A tile is a rectangular region of CTUs within a particular tile column and a particular tile row in a picture. A tile scan is a specific sequential ordering of the CTUs partitioning a picture, in which the CTUs are ordered consecutively in a CTU raster scan within a tile and the tiles of the picture are ordered consecutively in a raster scan of the tiles of the picture. A slice includes an integer number of complete tiles, or an integer number of consecutive complete CTU rows within a tile, of a picture. In this document, a tile group and a slice may be used interchangeably; for example, a tile group/tile group header may also be referred to as a slice/slice header.

A pixel or a pel may represent the smallest unit that constitutes a picture (or image). In addition, "sample" may be used as a term corresponding to a pixel. The samples may generally represent pixels or pixel values and may represent only pixels/pixel values of a luminance component or only pixels/pixel values of a chrominance component.

A unit may include one luma block and two chroma (e.g., cb, cr) blocks.

In this document, the terms "/" and "," should be interpreted as indicating "and/or". For example, the expression "A/B" may mean "A and/or B". Further, "A, B" may mean "A and/or B". Further, "A/B/C" may mean "at least one of A, B, and/or C". In addition, "A, B, C" may mean "at least one of A, B, and/or C".

Furthermore, in this document, the term "or" should be interpreted as indicating "and/or". For example, the expression "A or B" may include 1) only A, 2) only B, and/or 3) both A and B. In other words, the term "or" herein should be interpreted as indicating "additionally or alternatively".

Fig. 2 is a schematic diagram illustrating a configuration of a video/image encoding device to which an embodiment of this document can be applied. Hereinafter, the video encoding apparatus may include an image encoding apparatus.

Referring to fig. 2, the encoding apparatus 200 includes an image partitioner 210, a predictor 220, a residual processor 230, an entropy encoder 240, an adder 250, a filter 260, and a memory 270. The predictor 220 may include an inter predictor 221 and an intra predictor 222. The residual processor 230 may include a transformer 232, a quantizer 233, an inverse quantizer 234, and an inverse transformer 235. The residual processor 230 may further include a subtractor 231. The adder 250 may be referred to as a reconstructor or a reconstruction block generator. According to an embodiment, the image partitioner 210, the predictor 220, the residual processor 230, the entropy encoder 240, the adder 250, and the filter 260 may be configured as at least one hardware component (e.g., an encoder chipset or processor). In addition, the memory 270 may include a decoded picture buffer (DPB) or may be configured as a digital storage medium. The hardware components may further include the memory 270 as an internal/external component.

The image partitioner 210 may partition an input image (or picture or frame) input to the encoding apparatus 200 into one or more processing units. For example, the processing unit may be referred to as a coding unit (CU). In this case, the coding unit may be recursively partitioned from a coding tree unit (CTU) or a largest coding unit (LCU) according to a quad-tree binary-tree ternary-tree (QTBTTT) structure.

In general, an M × N block may represent a set of samples or transform coefficients consisting of M columns and N rows.

In the encoding apparatus 200, a prediction signal (prediction block, prediction sample array) output from the inter predictor 221 or the intra predictor 222 is subtracted from an input image signal (original block, original sample array) to generate a residual signal (residual block, residual sample array) and the generated residual signal is transmitted to the transformer 232. In this case, as shown in the figure, a unit for subtracting a prediction signal (prediction block, prediction sample array) from an input image signal (original block, original sample array) in the encoder 200 may be referred to as a subtractor 231. The predictor may perform prediction on a block to be processed (hereinafter, referred to as a current block) and generate a prediction block including prediction samples of the current block. The predictor may determine whether to apply intra prediction or inter prediction based on the current block or CU. As described later in the description of each prediction mode, the predictor may generate various information related to prediction, such as prediction mode information, and transmit the generated information to the entropy encoder 240. Information on the prediction may be encoded in the entropy encoder 240 and output in the form of a bitstream.

The intra predictor 222 may predict the current block by referring to samples in the current picture. Depending on the prediction mode, the referenced samples may be located near the current block or may be located far away from the current block. In intra prediction, the prediction modes may include a plurality of non-directional modes and a plurality of directional modes. The non-directional mode may include, for example, a DC mode and a planar mode. Depending on the degree of detail of the prediction direction, the directional modes may include, for example, 33 directional prediction modes or 65 directional prediction modes. However, this is merely an example, and more or fewer directional prediction modes may be used depending on the setting. The intra predictor 222 may determine a prediction mode applied to the current block by using prediction modes applied to neighboring blocks.

The inter predictor 221 may derive a prediction block of the current block based on a reference block (reference sample array) specified by a motion vector on a reference picture. Here, in order to reduce the amount of motion information transmitted in the inter prediction mode, the motion information may be predicted in units of blocks, sub-blocks, or samples based on the correlation of motion information between neighboring blocks and the current block. The motion information may include a motion vector and a reference picture index. The motion information may further include inter prediction direction (L0 prediction, L1 prediction, Bi prediction, etc.) information.

The predictor 220 may generate a prediction signal based on various prediction methods described below. For example, the predictor may not only apply intra prediction or inter prediction to predict one block, but also apply both intra prediction and inter prediction at the same time. This may be referred to as combined inter and intra prediction (CIIP). In addition, the predictor may predict the block based on an intra block copy (IBC) prediction mode or a palette mode. The IBC prediction mode or palette mode may be used for content image/video coding of a game or the like, for example, screen content coding (SCC). IBC basically performs prediction in the current picture, but may be performed similarly to inter prediction because the reference block is derived in the current picture. That is, IBC may use at least one of the inter prediction techniques described herein. The palette mode may be considered as an example of intra coding or intra prediction. When the palette mode is applied, the sample values within the picture may be signaled based on information about the palette table and palette indices.

The transformer 232 may generate transform coefficients by applying a transform technique to the residual signal. For example, the transform technique may include at least one of a discrete cosine transform (DCT), a discrete sine transform (DST), a Karhunen-Loève transform (KLT), a graph-based transform (GBT), or a conditionally non-linear transform (CNT).

The quantizer 233 may quantize the transform coefficients and transmit them to the entropy encoder 240, and the entropy encoder 240 may encode the quantized signal (information on the quantized transform coefficients) and output a bitstream. The information on the quantized transform coefficients may be referred to as residual information. The quantizer 233 may rearrange the block-type quantized transform coefficients into a one-dimensional vector form based on a coefficient scan order and generate information on the quantized transform coefficients based on the quantized transform coefficients in the one-dimensional vector form. The entropy encoder 240 may perform various encoding methods such as, for example, exponential Golomb, context-adaptive variable length coding (CAVLC), and context-adaptive binary arithmetic coding (CABAC). The entropy encoder 240 may encode information necessary for video/image reconstruction other than the quantized transform coefficients (e.g., values of syntax elements) together or separately. The encoded information (e.g., encoded video/image information) may be transmitted or stored in units of network abstraction layer (NAL) units in the form of a bitstream. The video/image information may include information on various parameter sets such as an adaptation parameter set (APS), a picture parameter set (PPS), a sequence parameter set (SPS), or a video parameter set (VPS), and may further include general constraint information. The bitstream may be transmitted over a network or stored in a digital storage medium. Here, the network may include a broadcast network and/or a communication network, and the digital storage medium may include various storage media such as USB, SD, CD, DVD, Blu-ray, HDD, and SSD. A transmitter (not shown) that transmits the signal output from the entropy encoder 240 and/or a storage (not shown) that stores the signal may be configured as internal/external elements of the encoding apparatus 200, or the transmitter may be included in the entropy encoder 240.

The quantized transform coefficients output from the quantizer 233 may be used to generate a prediction signal. For example, a residual signal (residual block or residual sample) may be reconstructed by applying inverse quantization and inverse transform to the quantized transform coefficients using inverse quantizer 234 and inverse transformer 235. The adder 250 adds the reconstructed residual signal to the prediction signal output from the inter predictor 221 or the intra predictor 222 to generate a reconstructed signal (reconstructed picture, reconstructed block, reconstructed sample array). If the block to be processed has no residual (such as the case where the skip mode is applied), the prediction block may be used as a reconstructed block. The adder 250 may be referred to as a reconstructor or reconstruction block generator. The generated reconstructed signal may be used for intra prediction of a next block to be processed in a current picture, and may be used for inter prediction of a next picture through filtering as described below.

Further, during picture encoding and/or reconstruction, luma mapping with chroma scaling (LMCS) may be applied.

Filter 260 may improve subjective/objective image quality by applying filtering to the reconstructed signal. For example, the filter 260 may generate a modified reconstructed picture by applying various filtering methods to the reconstructed picture and store the modified reconstructed picture in the memory 270 (specifically, the DPB of the memory 270). The various filtering methods may include, for example, deblocking filtering, sample adaptive offset, adaptive loop filter, bilateral filter, and so on. The filter 260 may generate various information related to filtering and transmit the generated information to the entropy encoder 240, as described later in the description of various filtering methods. The information related to the filtering may be encoded by the entropy encoder 240 and output in the form of a bitstream.

The modified reconstructed picture transmitted to the memory 270 may be used as a reference picture in the inter predictor 221. When inter prediction is applied by the encoding apparatus, prediction mismatch between the encoding apparatus 200 and the decoding apparatus can be avoided and encoding efficiency can be improved.

The DPB of the memory 270 may store the modified reconstructed picture used as a reference picture in the inter predictor 221. The memory 270 may store motion information of a block from which motion information in a current picture is derived (or encoded) and/or motion information of a reconstructed block in a picture. The stored motion information may be transmitted to the inter predictor 221 and used as motion information of a spatial neighboring block or motion information of a temporal neighboring block. The memory 270 may store reconstructed samples of reconstructed blocks in the current picture and may transmit the reconstructed samples to the intra predictor 222.

Fig. 3 is a schematic diagram illustrating a configuration of a video/image decoding apparatus to which an embodiment of this document can be applied.

Referring to fig. 3, the decoding apparatus 300 may include an entropy decoder 310, a residual processor 320, a predictor 330, an adder 340, a filter 350, and a memory 360. The predictor 330 may include an inter predictor 332 and an intra predictor 331. The residual processor 320 may include an inverse quantizer 321 and an inverse transformer 322. According to an embodiment, the entropy decoder 310, the residual processor 320, the predictor 330, the adder 340, and the filter 350 may be formed of hardware components (e.g., a decoder chipset or processor). In addition, the memory 360 may include a Decoded Picture Buffer (DPB), or may be composed of a digital storage medium. The hardware components may also include memory 360 as internal/external components.

When a bitstream including video/image information is input, the decoding apparatus 300 may reconstruct an image in correspondence with the process by which the video/image information was processed in the encoding apparatus of fig. 2. For example, the decoding apparatus 300 may derive units/blocks based on block partition related information obtained from the bitstream. The decoding apparatus 300 may perform decoding using a processing unit applied in the encoding apparatus. Thus, the processing unit of decoding may be, for example, a coding unit, and the coding unit may be partitioned from a coding tree unit or a largest coding unit according to a quad-tree structure, a binary-tree structure, and/or a ternary-tree structure. One or more transform units may be derived from the coding unit. The reconstructed image signal decoded and output by the decoding apparatus 300 may be reproduced by a reproducing apparatus.

The decoding apparatus 300 may receive a signal output from the encoding apparatus of fig. 2 in the form of a bitstream, and the received signal may be decoded through the entropy decoder 310. For example, the entropy decoder 310 may parse the bitstream to derive information (e.g., video/image information) required for image reconstruction (or picture reconstruction). The video/image information may further include information on various parameter sets such as an adaptation parameter set (APS), a picture parameter set (PPS), a sequence parameter set (SPS), or a video parameter set (VPS). In addition, the video/image information may further include general constraint information. The decoding apparatus may further decode a picture based on the information on the parameter sets and/or the general constraint information. Signaled/received information and/or syntax elements described later herein may be decoded through the decoding procedure and obtained from the bitstream. For example, the entropy decoder 310 may decode the information in the bitstream based on a coding method such as exponential Golomb coding, CAVLC, or CABAC, and output values of syntax elements required for image reconstruction and quantized values of transform coefficients for the residual. Information related to prediction among the information decoded by the entropy decoder 310 may be provided to the predictor (the inter predictor 332 and the intra predictor 331), and residual values on which entropy decoding was performed in the entropy decoder 310, that is, the quantized transform coefficients and related parameter information, may be input to the residual processor 320. In addition, information on filtering among the information decoded by the entropy decoder 310 may be provided to the filter 350. Meanwhile, a receiver (not shown) that receives the signal output from the encoding apparatus may be further configured as an internal/external element of the decoding apparatus 300, or the receiver may be a component of the entropy decoder 310.

The inverse quantizer 321 may inverse-quantize the quantized transform coefficient and output the transform coefficient. The inverse quantizer 321 may rearrange the quantized transform coefficients in the form of a two-dimensional block. In this case, the rearrangement may be performed based on the coefficient scan order performed in the encoding apparatus. The inverse quantizer 321 may perform inverse quantization on the quantized transform coefficient by using a quantization parameter (e.g., quantization step information) and obtain a transform coefficient.

The inverse transformer 322 inverse-transforms the transform coefficients to obtain a residual signal (residual block, residual sample array).

The predictor may perform prediction on the current block and generate a prediction block including prediction samples of the current block. The predictor may determine whether to apply intra prediction or inter prediction to the current block based on information regarding prediction output from the entropy decoder 310, and may determine a specific intra/inter prediction mode.

The predictor 320 may generate a prediction signal based on various prediction methods described below. For example, the predictor may not only apply intra prediction or inter prediction to predict one block, but also apply both intra prediction and inter prediction. This may be referred to as Combined Inter and Intra Prediction (CIIP). In addition, the predictor may predict the block based on an Intra Block Copy (IBC) prediction mode or a palette mode. The IBC prediction mode or palette mode may be used for content image/video coding, e.g., Screen Content Coding (SCC), of games and the like. IBC basically performs prediction in a current picture, but may be performed similarly to inter prediction because a reference block is derived in the current picture. That is, IBC may use at least one of the inter prediction techniques described in this document. The palette mode may be considered as an example of intra coding or intra prediction. When the palette mode is applied, the sample values within the picture may be signaled based on information about the palette table and palette indices.

The intra predictor 331 may predict the current block by referring to samples in the current picture. Depending on the prediction mode, the referenced samples may be located near the current block or may be located far away from the current block. In intra prediction, the prediction modes may include a plurality of non-directional modes and a plurality of directional modes. The intra predictor 331 may determine a prediction mode applied to the current block by using a prediction mode applied to the neighboring block.

The inter predictor 332 may derive a prediction block of the current block based on a reference block (reference sample array) specified by a motion vector on a reference picture. In this case, in order to reduce the amount of motion information transmitted in the inter prediction mode, the motion information may be predicted in units of blocks, sub-blocks, or samples based on the correlation of motion information between neighboring blocks and the current block.

The adder 340 may generate a reconstructed signal (reconstructed picture, reconstructed block, reconstructed sample array) by adding the obtained residual signal to a prediction signal (prediction block, predicted sample array) output from a predictor (including the inter predictor 332 and/or the intra predictor 331). If the block to be processed has no residual (e.g., when skip mode is applied), the predicted block may be used as a reconstructed block.

The adder 340 may be referred to as a reconstructor or a reconstruction block generator. The generated reconstructed signal may be used for intra prediction of a next block to be processed in a current picture, may be output through filtering as described below, or may be used for inter prediction of a next picture.

Further, luma mapping with chroma scaling (LMCS) may be applied in the picture decoding process.

Filter 350 may improve subjective/objective image quality by applying filtering to the reconstructed signal. For example, the filter 350 may generate a modified reconstructed picture by applying various filtering methods to the reconstructed picture and store the modified reconstructed picture in the memory 360 (specifically, the DPB of the memory 360). The various filtering methods may include, for example, deblocking filtering, sample adaptive offset, adaptive loop filter, bilateral filter, and so on.

The (modified) reconstructed picture stored in the DPB of the memory 360 may be used as a reference picture in the inter predictor 332. The memory 360 may store motion information of a block from which motion information in a current picture is derived (or decoded) and/or motion information of a reconstructed block in a picture. The stored motion information may be sent to the inter predictor 332 to be utilized as motion information of a spatial neighboring block or motion information of a temporal neighboring block. The memory 360 may store reconstructed samples of a reconstructed block in a current picture and may transmit the reconstructed samples to the intra predictor 331.

In the present disclosure, the embodiments described for the filter 260, the inter predictor 221, and the intra predictor 222 of the encoding apparatus 200 may be applied in the same manner as, or so as to correspond to, the filter 350, the inter predictor 332, and the intra predictor 331 of the decoding apparatus 300, respectively.

Further, with respect to inter prediction, an inter prediction method that takes image distortion into account has been proposed. More specifically, an affine motion model has been proposed that efficiently derives motion vectors for the sub-blocks or sample points of a current block and improves the accuracy of inter prediction even in the presence of deformation due to image rotation, enlargement, or reduction. Prediction using the affine motion model may be referred to as affine inter prediction or affine motion prediction.

For example, affine inter prediction using an affine motion model can effectively represent four motions, i.e., four deformations, as described below.

Fig. 4 illustrates motions represented by an affine motion model. Referring to fig. 4, the motions that can be represented by the affine motion model may include a translational motion, a scaling motion, a rotational motion, and a shearing motion. That is, as shown in fig. 4, the affine motion model can efficiently represent a scaling motion in which (a part of) the image is scaled over time, a rotational motion in which (a part of) the image rotates over time, a shearing motion in which (a part of) the image is deformed into a parallelogram over time, and a translational motion in which (a part of) the image moves in a plane over time.

The encoding apparatus/decoding apparatus can predict a distorted shape of an image based on a motion vector at a Control Point (CP) of a current block through affine inter prediction, and can improve compression performance of the image by improving prediction accuracy. In addition, since a motion vector of at least one control point of the current block can be derived using motion vectors of neighboring blocks of the current block, a burden on the amount of data of additional information can be reduced, and inter prediction efficiency can be significantly improved.

As an example of affine inter prediction, motion information at three control points (i.e., three reference points) may be required.

Fig. 5 illustrates an affine motion model using motion vectors of three control points.

When the upper left sample position in the current block 500 is (0, 0), sample positions (0, 0), (w, 0), and (0, h) may be defined as control points, as shown in fig. 5. Hereinafter, the control point of the sample position (0, 0) may be denoted as CP0, the control point of the sample position (w, 0) may be denoted as CP1, and the control point of the sample position (0, h) may be denoted as CP 2.

The above-described control points and the motion vectors of the respective control points may be used to derive the formula for the affine motion model. The formula for an affine motion model can be expressed as follows.

[ formula 1]

$$v_x=\frac{(v_{1x}-v_{0x})}{w}x+\frac{(v_{2x}-v_{0x})}{h}y+v_{0x},\qquad v_y=\frac{(v_{1y}-v_{0y})}{w}x+\frac{(v_{2y}-v_{0y})}{h}y+v_{0y}$$

Here, w denotes the width of the current block 500, h denotes the height of the current block 500, v0x and v0y represent the x-component and the y-component of the motion vector of CP0, respectively, v1x and v1y represent the x-component and the y-component of the motion vector of CP1, respectively, and v2x and v2y represent the x-component and the y-component of the motion vector of CP2, respectively. In addition, x denotes the x-component of the position of the target sample in the current block 500, y denotes the y-component of the position of the target sample in the current block 500, vx represents the x-component of the motion vector of the target sample in the current block 500, and vy represents the y-component of the motion vector of the target sample in the current block 500.

Since the motion vector of CP0, the motion vector of CP1, and the motion vector of CP2 are known, a motion vector for any sample position in the current block can be derived based on equation 1. That is, according to the affine motion model, the motion vectors v0 (v0x, v0y), v1 (v1x, v1y), and v2 (v2x, v2y) at the control points may be scaled based on the ratio of the coordinates (x, y) of the target sample to the distances between the three control points, so that the motion vector of the target sample is derived according to its position. That is, according to the affine motion model, the motion vector of each sample in the current block may be derived based on the motion vectors of the control points. Furthermore, the set of motion vectors of the samples in the current block derived according to the affine motion model may be referred to as an affine motion vector field (MVF).
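As a rough, non-normative illustration, the per-sample motion vector of equation 1 can be sketched in Python as follows (floating-point arithmetic is used for clarity; an actual codec works with fixed-point sub-pel precision):

```python
# Minimal sketch of the 6-parameter affine motion model in equation 1
# (illustrative only, not the normative derivation).

def affine_mv_6param(cpmv0, cpmv1, cpmv2, x, y, w, h):
    """Return the motion vector (vx, vy) of the sample at (x, y).

    cpmv0, cpmv1, cpmv2 are the (vx, vy) motion vectors of CP0, CP1, CP2;
    w and h are the width and height of the current block.
    """
    v0x, v0y = cpmv0
    v1x, v1y = cpmv1
    v2x, v2y = cpmv2
    vx = (v1x - v0x) / w * x + (v2x - v0x) / h * y + v0x
    vy = (v1y - v0y) / w * x + (v2y - v0y) / h * y + v0y
    return vx, vy
```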

Further, the six parameters of equation 1 may be represented by a, b, c, d, e, and f as shown in the following equation, and the equation of the affine motion model represented by the six parameters may be as follows.

[ formula 2]

$$a=\frac{v_{1x}-v_{0x}}{w},\quad b=\frac{v_{2x}-v_{0x}}{h},\quad c=v_{0x},\quad d=\frac{v_{1y}-v_{0y}}{w},\quad e=\frac{v_{2y}-v_{0y}}{h},\quad f=v_{0y}$$

$$v_x=a\,x+b\,y+c,\qquad v_y=d\,x+e\,y+f$$

Here, w denotes the width of the current block 500, h denotes the height of the current block 500, v0x and v0y represent the x-component and the y-component of the motion vector of CP0, respectively, v1x and v1y represent the x-component and the y-component of the motion vector of CP1, respectively, and v2x and v2y represent the x-component and the y-component of the motion vector of CP2, respectively. In addition, x denotes the x-component of the position of the target sample in the current block 500, y denotes the y-component of the position of the target sample in the current block 500, vx represents the x-component of the motion vector of the target sample in the current block 500, and vy represents the y-component of the motion vector of the target sample in the current block 500.

An affine motion model or affine inter prediction using six parameters may be referred to as a 6-parameter affine motion model or AF6.

Further, as an example of affine inter prediction, motion information at two control points (i.e., two reference points) may be required.

Fig. 6 illustrates an affine motion model using motion vectors of two control points. An affine motion model using two control points can represent three motions including a translational motion, a zooming motion, and a rotational motion. Affine motion models representing three motions may be referred to as similar affine motion models or as simplified affine motion models.

When the upper left sample position in the current block 600 is (0, 0), sample positions (0, 0) and (w, 0) may be defined as control points, as shown in fig. 6. Hereinafter, the control point of the sample position (0, 0) may be denoted as CP0, and the control point of the sample position (w, 0) may be denoted as CP 1.

The above-described control points and the motion vectors of the respective control points may be used to derive the formula for the affine motion model. The formula for an affine motion model can be expressed as follows.

[ formula 3]

$$v_x=\frac{(v_{1x}-v_{0x})}{w}x-\frac{(v_{1y}-v_{0y})}{w}y+v_{0x},\qquad v_y=\frac{(v_{1y}-v_{0y})}{w}x+\frac{(v_{1x}-v_{0x})}{w}y+v_{0y}$$

Here, w denotes the width of the current block 600, v0x and v0y represent the x-component and the y-component of the motion vector of CP0, respectively, and v1x and v1y represent the x-component and the y-component of the motion vector of CP1, respectively. In addition, x denotes the x-component of the position of the target sample in the current block 600, y denotes the y-component of the position of the target sample in the current block 600, vx represents the x-component of the motion vector of the target sample in the current block 600, and vy represents the y-component of the motion vector of the target sample in the current block 600.

Further, the four parameters of equation 3 may be represented by a, b, c, and d in the following equation, and the equation of the affine motion model represented by the four parameters may be as follows.

[ formula 4]

$$a=\frac{v_{1x}-v_{0x}}{w},\quad b=\frac{v_{1y}-v_{0y}}{w},\quad c=v_{0x},\quad d=v_{0y}$$

$$v_x=a\,x-b\,y+c,\qquad v_y=b\,x+a\,y+d$$

Here, w denotes the width of the current block 600, v0x and v0y represent the x-component and the y-component of the motion vector of CP0, respectively, and v1x and v1y represent the x-component and the y-component of the motion vector of CP1, respectively. In addition, x denotes the x-component of the position of the target sample in the current block 600, y denotes the y-component of the position of the target sample in the current block 600, vx represents the x-component of the motion vector of the target sample in the current block 600, and vy represents the y-component of the motion vector of the target sample in the current block 600. The affine motion model using two control points can be represented by the four parameters a, b, c, and d as shown in equation 4, and thus the affine motion model or affine inter prediction using four parameters may be referred to as a 4-parameter affine motion model or AF4. That is, according to the affine motion model, the motion vector of each sample in the current block may be derived based on the motion vectors of the control points. In addition, the set of motion vectors of the samples in the current block derived according to the affine motion model may be referred to as an affine motion vector field (MVF).
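A corresponding non-normative sketch of the 4-parameter model in equation 3 (again floating-point, for illustration only):

```python
# Minimal sketch of the 4-parameter affine motion model in equation 3.

def affine_mv_4param(cpmv0, cpmv1, x, y, w):
    """Return the motion vector (vx, vy) of the sample at (x, y) from CP0 and CP1."""
    v0x, v0y = cpmv0
    v1x, v1y = cpmv1
    vx = (v1x - v0x) / w * x - (v1y - v0y) / w * y + v0x
    vy = (v1y - v0y) / w * x + (v1x - v0x) / w * y + v0y
    return vx, vy
```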

Further, as described above, the motion vector of a sample unit can be derived by an affine motion model, and thus the accuracy of inter prediction can be significantly improved. However, in this case, the complexity in the motion compensation process may be greatly increased.

Therefore, a restriction may be made to derive a motion vector of a sub-block unit of the current block, not a motion vector of a sample unit.

For example, when a subblock is set to a size of n × n (n is a positive integer, e.g., n is 4), a motion vector may be derived in units of n × n subblocks in the current block based on the affine motion model, and various methods for deriving a motion vector representing each subblock may be applied.

For example, referring to fig. 7, the motion vector of each sub-block may be derived using the center lower-right sample position of each sub-block as a representative coordinate. Here, the center lower-right position indicates the sample position located on the lower-right side among the four samples located at the center of the sub-block. For example, when n is an odd number, one sample is located at the center of the sub-block, and in this case, the center sample position may be used to derive the motion vector of the sub-block. However, when n is an even number, four samples are located adjacent to the center of the sub-block, and in this case, the lower-right sample position among them may be used to derive the motion vector. For example, referring to fig. 7, the representative coordinates of the sub-blocks may be derived as (2, 2), (6, 2), (10, 2), ..., (14, 14), and the encoding/decoding apparatus may derive the motion vector of each sub-block by substituting each representative coordinate of the sub-block into equation 1 or equation 3 above, as sketched below. The motion vectors of the sub-blocks in the current block derived according to the affine motion model may be referred to as an affine MVF.
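The sub-block MVF derivation can be sketched as follows; the representative coordinate of each n x n sub-block is its center lower-right sample, e.g. (2, 2), (6, 2), ..., (14, 14) for a 16x16 block with n = 4, and mv_at may be either of the per-sample model sketches given earlier:

```python
# Sketch of deriving one motion vector per n x n sub-block using the
# center lower-right sample of each sub-block as its representative coordinate.

def subblock_representative_coords(w, h, n=4):
    """E.g. for w = h = 16, n = 4: (2, 2), (6, 2), (10, 2), ..., (14, 14)."""
    return [(x0 + n // 2, y0 + n // 2)
            for y0 in range(0, h, n)
            for x0 in range(0, w, n)]

def affine_subblock_mvf(mv_at, w, h, n=4):
    """mv_at(x, y) is any per-sample affine model, e.g. the sketches above."""
    return [mv_at(x, y) for (x, y) in subblock_representative_coords(w, h, n)]
```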

Also, as an example, the size of the sub-block in the current block may be derived based on the following equation.

[ formula 5]

$$M=\mathrm{clip3}\!\left(4,\,w,\,\frac{w\cdot\mathrm{MvPre}}{\max\left(\lvert v_{1x}-v_{0x}\rvert,\,\lvert v_{1y}-v_{0y}\rvert\right)}\right),\qquad N=\mathrm{clip3}\!\left(4,\,h,\,\frac{h\cdot\mathrm{MvPre}}{\max\left(\lvert v_{2x}-v_{0x}\rvert,\,\lvert v_{2y}-v_{0y}\rvert\right)}\right)$$

Here, M denotes the width of the sub-block, and N denotes the height of the sub-block. In addition, v0x and v0y denote the x-component and the y-component of CPMV0 of the current block, v1x and v1y denote the x-component and the y-component of CPMV1 of the current block, w denotes the width of the current block, h denotes the height of the current block, and MvPre denotes the motion vector fraction accuracy. For example, the motion vector fraction accuracy may be set to 1/16.

Further, in inter prediction using the above-described affine motion model (i.e., affine motion prediction), there may be an affine merge mode (AF_MERGE) and an affine inter mode (AF_INTER). Here, the affine inter mode may be referred to as an affine MVP mode (AF_MVP).

The affine merge mode is similar to the existing merge mode in that the MVDs of the motion vectors of the control points are not transmitted. That is, the affine merge mode may refer to an encoding/decoding method that performs prediction by deriving CPMV for each of two or three control points from neighboring blocks of a current block, similar to the existing skip/merge mode.

For example, when the AF_MERGE mode is applied to the current block, the motion vectors of CP0 and CP1 (i.e., CPMV0 and CPMV1) may be derived from a neighboring block, among the neighboring blocks of the current block, to which the affine mode has been applied. In other words, CPMV0 and CPMV1 of the neighboring block to which the affine mode has been applied may be derived as a merge candidate, or CPMV0 and CPMV1 of the current block may be derived based on the merge candidate. An affine motion model may be derived based on CPMV0 and CPMV1 of the neighboring block indicated by the merge candidate, and CPMV0 and CPMV1 of the current block may be derived based on that affine motion model.

The affine inter mode may represent inter prediction in which motion vector predictors (MVPs) of the motion vectors of the control points are derived, the motion vectors of the control points are derived based on received motion vector differences (MVDs) and the MVPs, an affine MVF of the current block is derived based on the motion vectors of the control points, and prediction is performed based on the affine MVF. Here, the motion vector of a control point may be referred to as a control point motion vector (CPMV), the MVP of a control point may be referred to as a control point motion vector predictor (CPMVP), and the MVD of a control point may be referred to as a control point motion vector difference (CPMVD). More specifically, for example, the encoding apparatus may derive the control point motion vector predictor (CPMVP) and the control point motion vector (CPMV) for each of CP0 and CP1 (or CP0, CP1, and CP2) and transmit or store information on the CPMVP and/or the CPMVD representing the difference between the CPMVP and the CPMV.

Here, if the affine inter mode is applied to the current block, the encoding/decoding apparatus may construct an affine MVP candidate list based on neighboring blocks of the current block, wherein the affine MVP candidate may be referred to as a CPMVP pair candidate and the affine MVP candidate list may be referred to as a CPMVP candidate list.

In addition, each affine MVP candidate may represent a combination of CPMVPs of CP0 and CP1 in a four-parameter affine motion model, and a combination of CPMVPs of CP0, CP1, and CP2 in a six-parameter affine motion model.

Fig. 8 is a flowchart illustrating an affine motion prediction method according to one embodiment of the present disclosure.

Referring to fig. 8, the affine motion prediction method may be mainly described as follows. Once the affine motion prediction method starts, a CPMV pair is first obtained at S800. Here, if a four-parameter affine model is used, the CPMV pair may include CPMV0 and CPMV 1.

Thereafter, at S810, affine motion compensation may be performed based on the CPMV pair, after which affine motion prediction may be terminated.

In addition, two affine prediction modes can be defined to determine CPMV0 and CPMV 1. Here, the two affine prediction modes may include an affine inter mode and an affine merge mode. The affine inter mode may signal information about a Motion Vector Difference (MVD) between two motion vectors of CPMV0 and CPMV1 to clearly determine CPMV0 and CPMV 1. On the other hand, the affine merge mode can derive the CPMV pair without signaling MVD information.

In other words, the affine merge mode may derive the CPMV of the current block by using the CPMVs of the neighboring blocks encoded in the affine mode, and if the motion vector is determined by the sub-block unit, the affine merge mode may be referred to as a sub-block merge mode.

In the affine merging mode, the encoding apparatus may signal to the decoding apparatus an index of the neighboring block encoded in the affine mode for deriving the CPMV of the current block, and may also signal a difference value between the CPMV of the neighboring block and the CPMV of the current block. Here, the affine merging mode may construct an affine merging candidate list based on neighboring blocks, wherein an index of the neighboring block may indicate a neighboring block within the affine merging candidate list to be utilized to derive the CPMV of the current block. The affine merge candidate list may also be referred to as a subblock merge candidate list.

The affine inter mode may also be referred to as affine MVP mode. In the affine MVP mode, the CPMV of the current block may be derived based on a Control Point Motion Vector Predictor (CPMVP) and a Control Point Motion Vector Difference (CPMVD). In other words, the encoding apparatus may determine the CPMVP of the CPMV of the current block, derive the CPMVD, which is a difference value between the CPMV and the CPMVP of the current block, and signal information on the CPMVP and information on the CPMVD to the decoding apparatus. Here, the affine MVP mode may construct an affine MVP candidate list based on the neighboring blocks, and the information on the CPMVP may indicate the neighboring blocks to be utilized by the CPMVP, which derives the CPMV of the current block from the affine MVP candidate list. The affine MVP candidate list may also be referred to as a control point motion vector predictor candidate list.

For example, when an affine inter mode of a six-parameter affine motion model is applied, the current block may be encoded as described below.

Fig. 9 illustrates a method for deriving a motion vector predictor at a control point according to one embodiment of the present disclosure.

Referring to fig. 9, the motion vector of CP0 of the current block may be denoted by v0, the motion vector of CP1 by v1, and the motion vector of the control point at the lower-left sample position (CP2) by v2. In other words, v0 may represent the CPMVP of CP0, v1 may represent the CPMVP of CP1, and v2 may represent the CPMVP of CP2.

The affine MVP candidate may be a combination of the CPMVP candidate of CP0, the CPMVP candidate of CP1, and the CPMVP candidate of CP 2.

For example, affine MVP candidates can be derived as follows.

More specifically, a maximum of 12 CPMVP candidate combinations may be determined, as shown in the following equation.

[ formula 6]

$$\left\{(v_0,v_1,v_2)\mid v_0\in\{v_A,v_B,v_C\},\ v_1\in\{v_D,v_E\},\ v_2\in\{v_F,v_G\}\right\}$$

Here, vA represents the motion vector of the neighboring block A, vB represents the motion vector of the neighboring block B, vC represents the motion vector of the neighboring block C, vD represents the motion vector of the neighboring block D, vE represents the motion vector of the neighboring block E, vF represents the motion vector of the neighboring block F, and vG represents the motion vector of the neighboring block G.

Additionally, the neighboring block a may represent a neighboring block located above and to the left of an upper-left sample position of the current block, the neighboring block B may represent a neighboring block located above an upper-left sample position of the current block, and the neighboring block C may represent a neighboring block located to the left of an upper-left sample position of the current block. Additionally, the neighboring block D may represent a neighboring block located above the upper-right sample position of the current block, and the neighboring block E may represent a neighboring block located above and to the right of the upper-right sample position of the current block. Also, the neighboring block F may represent a neighboring block located to the left of the lower-left sample position of the current block, and the neighboring block G may represent a neighboring block located to the lower-left of the lower-left sample position of the current block.

In other words, referring to equation 6 above, the CPMVP candidates of CP0 may include the motion vector vA of the neighboring block A, the motion vector vB of the neighboring block B, and/or the motion vector vC of the neighboring block C; the CPMVP candidates of CP1 may include the motion vector vD of the neighboring block D and/or the motion vector vE of the neighboring block E; and the CPMVP candidates of CP2 may include the motion vector vF of the neighboring block F and/or the motion vector vG of the neighboring block G.

In other words, the CPMVP v0 of CP0 may be derived based on at least one of the motion vectors of the neighboring blocks A, B, and C adjacent to the upper-left sample position. Here, the neighboring block A may represent a block located above and to the left of the upper-left sample position of the current block, the neighboring block B may represent a block located above the upper-left sample position of the current block, and the neighboring block C may represent a block located to the left of the upper-left sample position of the current block.

Based on the motion vectors of the neighboring blocks, a maximum of 12 CPMVP candidate combinations including the CPMVP candidate of CP0, the CPMVP candidate of CP1, and the CPMVP candidate of CP2 may be derived.

Thereafter, the derived CPMVP candidate combinations are arranged in ascending order of DV, and the first two CPMVP candidate combinations may be derived as affine MVP candidates.

The DV of the CPMVP candidate combination is derived by the following formula.

[ formula 7]

$$DV=\lvert(v_{1x}-v_{0x})\cdot h-(v_{2y}-v_{0y})\cdot w\rvert+\lvert(v_{1y}-v_{0y})\cdot h+(v_{2x}-v_{0x})\cdot w\rvert$$
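To illustrate, the selection of the two CPMVP candidate combinations with the smallest DV can be sketched as follows; the candidate lists per CP correspond to {vA, vB, vC}, {vD, vE}, and {vF, vG}, and motion vectors are (x, y) tuples:

```python
# Sketch of ranking CPMVP candidate combinations by the DV measure of
# equation 7 and keeping the two combinations with the smallest DV.

def dv(v0, v1, v2, w, h):
    return (abs((v1[0] - v0[0]) * h - (v2[1] - v0[1]) * w)
            + abs((v1[1] - v0[1]) * h + (v2[0] - v0[0]) * w))

def best_two_combinations(cand_cp0, cand_cp1, cand_cp2, w, h):
    """cand_cpX are ordered lists of candidate MVs, e.g. [vA, vB, vC] for CP0."""
    combos = [(v0, v1, v2) for v0 in cand_cp0 for v1 in cand_cp1 for v2 in cand_cp2]
    combos.sort(key=lambda c: dv(c[0], c[1], c[2], w, h))  # ascending DV
    return combos[:2]
```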

Thereafter, the encoding apparatus may determine CPMVs of the respective affine MVP candidates, compare Rate Distortion (RD) costs between the CPMVs, and select an affine MVP candidate having a minimum RD cost as the best affine MVP candidate for the current block. The encoding device may encode and signal the index indicating the best candidate and the CPMVD.

In addition, for example, if the affine merge mode is applied, the current block may be encoded as follows.

Fig. 10 illustrates a method for deriving a motion vector predictor at a control point according to one embodiment of the present disclosure.

Based on the neighboring blocks of the current block shown in fig. 10, an affine merge candidate list of the current block may be constructed. The neighboring blocks may include neighboring block a, neighboring block B, neighboring block C, neighboring block D, and neighboring block E. The neighboring block a may represent a left neighboring block of the current block, the neighboring block B may represent an upper neighboring block of the current block, the neighboring block C may represent an upper-right neighboring block of the current block, the neighboring block D may represent a lower-left neighboring block of the current block, and the neighboring block E may represent an upper-left neighboring block of the current block.

For example, when the size of the current block is W × H, the x component of the upper-left sample position of the current block is 0, and the y component thereof is 0, the left neighboring block may be a block including a sample at coordinates (-1, H-1), the upper neighboring block may be a block including a sample at coordinates (W-1, -1), the upper-right neighboring block may be a block including a sample at coordinates (W, -1), the lower-left neighboring block may be a block including a sample at coordinates (-1, H), and the upper-left neighboring block may be a block including a sample at coordinates (-1, -1).

More specifically, for example, the encoding apparatus may scan the neighboring block A, the neighboring block B, the neighboring block C, the neighboring block D, and the neighboring block E of the current block in a specific scan order, and the neighboring block first encoded in the affine prediction mode according to the scan order may be determined as the candidate block for the affine merge mode, i.e., the affine merge candidate. Here, the specific scan order may be the order of the neighboring block A, the neighboring block B, the neighboring block C, the neighboring block D, and the neighboring block E.
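A hedged sketch of this scan, assuming a mapping from the labels A to E to neighboring-block objects that expose an is_affine flag (the attribute name is illustrative, not part of any standard API):

```python
# Sketch of the affine merge candidate scan: the first neighboring block coded
# in affine prediction mode, visited in the order A, B, C, D, E, becomes the
# affine merge candidate block.

def first_affine_neighbor(neighbors):
    """neighbors maps labels "A".."E" to block objects or None."""
    for label in ("A", "B", "C", "D", "E"):
        blk = neighbors.get(label)
        if blk is not None and blk.is_affine:
            return blk
    return None
```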

Thereafter, the encoding apparatus may determine an affine motion model of the current block by using the determined CPMV of the candidate block, determine the CPMV of the current block based on the affine motion model, and determine an affine MVF of the current block based on the CPMV.

As one example, if the neighboring block a is determined as a candidate block for the current block, encoding may be performed as described below.

Fig. 11 illustrates one example of affine prediction performed when the adjacent block a is selected as an affine merging candidate.

Referring to fig. 11, the encoding apparatus may determine the neighboring block A of the current block as the candidate block and derive the affine motion model of the current block based on the CPMVs (v2 and v3) of the neighboring block. Thereafter, the encoding apparatus may determine the CPMVs (v0 and v1) of the current block based on the affine motion model. The encoding apparatus may determine the affine MVF based on the CPMVs (v0 and v1) of the current block and perform the process of encoding the current block based on the affine MVF.

Further, in connection with affine inter prediction, as a means of constructing an affine MVP candidate list, inherited affine candidates and constructed affine candidates are being considered.

Here, the inherited affine candidates can be described as follows.

For example, if a neighboring block of the current block is an affine block and a reference picture of the current block is the same as a reference picture of the neighboring block, an affine MVP pair of the current block may be determined from affine motion models of the neighboring blocks. Here, the affine block may represent a block to which affine inter prediction has been applied. The inherited affine candidates may represent CPMVPs (e.g., affine MVP pairs) derived based on the neighboring blocks' affine motion models.

More specifically, as one example, inherited affine candidates can be derived as described below.

Fig. 12 illustrates neighboring blocks used to derive inherited affine candidates.

Referring to FIG. 12, the neighboring blocks of the current block may include a left neighboring block A0 of the current block, a lower left neighboring block A1 of the current block, an upper neighboring block B0 of the current block, an upper right neighboring block B1 of the current block, and an upper left neighboring block B2 of the current block.

For example, when the size of the current block is W × H, the x component of the upper-left sample position of the current block is 0, and the y component thereof is 0, the left neighboring block may be a block including a sample at coordinates (-1, H-1), the upper neighboring block may be a block including a sample at coordinates (W-1, -1), the upper-right neighboring block may be a block including a sample at coordinates (W, -1), the lower-left neighboring block may be a block including a sample at coordinates (-1, H), and the upper-left neighboring block may be a block including a sample at coordinates (-1, -1).

The encoding/decoding apparatus may sequentially check the neighboring blocks a0, a1, B0, B1, and B2, and if the neighboring blocks are encoded according to an affine motion model and a reference picture of the current block is the same as that of the neighboring blocks, may derive two CPMVs or three CPMVs of the current block based on the affine motion models of the neighboring blocks. The CPMV may be derived as an affine MVP candidate for the current block. The affine MVP candidate may represent an inherited affine candidate.

As one example, up to two inherited affine candidates may be derived based on neighboring blocks.

For example, the encoding/decoding apparatus may derive a first affine MVP candidate for the current block based on a first block among the neighboring blocks. Here, the first block may be encoded according to an affine motion model, and the reference picture of the first block may be the same as the reference picture of the current block. In other words, the first block may be the block first confirmed to satisfy the condition when checking the neighboring blocks according to a specific order. The condition may be that the block is encoded according to an affine motion model and that the reference picture of the block is the same as the reference picture of the current block.

Thereafter, the encoding/decoding apparatus may derive a second affine MVP candidate based on a second block among the neighboring blocks. Here, the second block may be encoded according to an affine motion model, and the reference picture of the second block may be the same as the reference picture of the current block. In other words, the second block may be the block second confirmed to satisfy the condition when checking the neighboring blocks according to the specific order. The condition may be that the block is encoded according to an affine motion model and that the reference picture of the block is the same as the reference picture of the current block.
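A sketch of this derivation, under the assumption that each neighboring block exposes an is_affine flag and a ref_pic attribute, and that derive_cpmvs is a caller-supplied helper mapping a neighboring affine block to CPMV candidates for the current block (all names are illustrative):

```python
# Sketch of deriving up to two inherited affine MVP candidates: neighbors are
# checked in a fixed order, and each block that is affine-coded and shares the
# current block's reference picture contributes one candidate.

def inherited_affine_mvp_candidates(neighbors, cur_ref_pic, derive_cpmvs, max_cands=2):
    """neighbors is an ordered list of blocks (or None), e.g. [A0, A1, B0, B1, B2]."""
    cands = []
    for blk in neighbors:
        if blk is None or not blk.is_affine or blk.ref_pic != cur_ref_pic:
            continue
        cands.append(derive_cpmvs(blk))
        if len(cands) == max_cands:
            break
    return cands
```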

Further, for example, when the number of available inherited affine candidates is less than 2 (i.e., when the number of derived inherited affine candidates is less than 2), the constructed affine candidates may be considered. The constructed affine candidates can be derived as follows.

Fig. 13 illustrates spatial candidates of the constructed affine candidates.

As shown in fig. 13, the motion vectors of the neighboring blocks of the current block may be divided into three groups. Referring to fig. 13, the neighboring blocks may include a neighboring block a, a neighboring block B, a neighboring block C, a neighboring block D, a neighboring block E, a neighboring block F, and a neighboring block G.

The neighboring block a may represent a neighboring block located at the upper left of the upper left sample position of the current block; the neighboring block B may represent a neighboring block located above the upper left sample position of the current block; and the neighboring block C may represent a neighboring block located to the left of the upper-left sample position of the current block. In addition, the neighboring block D may represent a neighboring block located above the upper-right sample position of the current block, and the neighboring block E may represent a neighboring block located above and to the right of the upper-right sample position of the current block. In addition, the neighboring block F may represent a neighboring block located at the left side of the lower-left sample position of the current block; and the neighboring block G may represent a neighboring block located at the lower left of the lower left sample position of the current block.

For example, the three groups may include S0, S1, and S2, and S0, S1, and S2 may be derived as shown in the following table.

[ Table 1]

S0 = {mvA, mvB, mvC}    S1 = {mvD, mvE}    S2 = {mvF, mvG}

Here, mvA represents the motion vector of the neighboring block A, mvB represents the motion vector of the neighboring block B, mvC represents the motion vector of the neighboring block C, mvD represents the motion vector of the neighboring block D, mvE represents the motion vector of the neighboring block E, mvF represents the motion vector of the neighboring block F, and mvG represents the motion vector of the neighboring block G. S0 may indicate the first group, S1 may indicate the second group, and S2 may indicate the third group.

The encoding/decoding apparatus may derive mv0 from S0, mv1 from S1, and mv2 from S2, and may derive an affine MVP candidate including mv0, mv1, and mv2. The affine MVP candidate may indicate the constructed affine candidate. In addition, mv0 may be the CPMVP candidate of CP0, mv1 may be the CPMVP candidate of CP1, and mv2 may be the CPMVP candidate of CP2.

Here, the reference picture of mv0 may be the same as the reference picture of the current block. In other words, mv0 may be the motion vector first confirmed to satisfy the condition when checking the motion vectors in S0 according to a specific order. The condition may be that the reference picture of the motion vector is the same as the reference picture of the current block. The specific order may be to check the motion vectors in S0 in the order of the neighboring block A, the neighboring block B, and the neighboring block C. Further, the checking order may be performed differently from the above and is not limited to the above example.

Furthermore, the reference picture of mv1 may be the same as the reference picture of the current block. In other words, mv1 may be the motion vector first confirmed to satisfy the condition when checking the motion vectors in S1 according to a specific order. The condition may be that the reference picture of the motion vector is the same as the reference picture of the current block. The specific order may be to check the motion vectors in S1 in the order of the neighboring block D and the neighboring block E. In addition, the checking order may be performed differently from the above and is not limited to the above example.

Furthermore, the reference picture of mv2 may be the same as the reference picture of the current block. In other words, mv2 may be the motion vector first confirmed to satisfy the condition when checking the motion vectors in S2 according to a specific order. The condition may be that the reference picture of the motion vector is the same as the reference picture of the current block. The specific order may be to check the motion vectors in S2 in the order of the neighboring block F and the neighboring block G. In addition, the checking order may be performed differently from the above and is not limited to the above example.
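A sketch of the per-group check described above, assuming each group is kept as an ordered list of (motion vector, reference picture) pairs, e.g. S0 in the order A, B, C:

```python
# Sketch of picking a CPMVP candidate from one group: the first motion vector,
# in the group's checking order, whose reference picture equals that of the
# current block is used; otherwise the group yields no candidate.

def first_mv_with_same_ref(group, cur_ref_pic):
    """group is an ordered list of (mv, ref_pic) pairs, e.g. S0 = [A, B, C]."""
    for mv, ref_pic in group:
        if ref_pic == cur_ref_pic:
            return mv
    return None
```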

Furthermore, only when mv0 and mv1 are available, that is, only when mv0 and mv1 are derived, mv2 may be derived by the following equation.

[ formula 8]

$$mv_2^x=mv_0^x-\frac{h}{w}\left(mv_1^y-mv_0^y\right),\qquad mv_2^y=mv_0^y+\frac{h}{w}\left(mv_1^x-mv_0^x\right)$$

Here, mv2^x and mv2^y represent the x-component and the y-component of mv2, mv0^x and mv0^y represent the x-component and the y-component of mv0, and mv1^x and mv1^y represent the x-component and the y-component of mv1. In addition, w denotes the width of the current block, and h denotes the height of the current block.

Furthermore, when only mv0 and mv2 are derived, mv1 may be derived by the following equation.

[ formula 9]

$$mv_1^x=mv_0^x+\frac{w}{h}\left(mv_2^y-mv_0^y\right),\qquad mv_1^y=mv_0^y-\frac{w}{h}\left(mv_2^x-mv_0^x\right)$$

Here, mv1^x and mv1^y represent the x-component and the y-component of mv1, mv0^x and mv0^y represent the x-component and the y-component of mv0, and mv2^x and mv2^y represent the x-component and the y-component of mv2. In addition, w denotes the width of the current block, and h denotes the height of the current block.
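A floating-point sketch of equations 8 and 9 as reconstructed above (the relations follow from the 4-parameter model; the exact fixed-point form is implementation-dependent):

```python
# Sketch of completing the missing CPMVP candidate from the two available ones.

def derive_mv2_from_mv0_mv1(mv0, mv1, w, h):
    # equation 8: mv2 from mv0 and mv1
    return (mv0[0] - (mv1[1] - mv0[1]) * h / w,
            mv0[1] + (mv1[0] - mv0[0]) * h / w)

def derive_mv1_from_mv0_mv2(mv0, mv2, w, h):
    # equation 9: mv1 from mv0 and mv2
    return (mv0[0] + (mv2[1] - mv0[1]) * w / h,
            mv0[1] - (mv2[0] - mv0[0]) * w / h)
```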

Furthermore, when the number of inherited affine candidates available and/or the number of constructed affine candidates is less than 2, AMVP processing of the existing HEVC standard may be applied to construct an affine MVP list. In other words, when the number of inherited affine candidates available and/or the number of constructed affine candidates is less than 2, the process for constructing the MVP candidate specified in the existing HEVC standard may be performed.

Further, a flow diagram of an embodiment for constructing an affine MVP list may be described as follows.

Fig. 14 illustrates an example of constructing an affine MVP list.

Referring to fig. 14, at S1400 the encoding/decoding apparatus may add inherited candidates to the affine MVP list of the current block. The inherited candidates may represent the inherited affine candidates described above.

More specifically, at S1405, the encoding/decoding apparatus may derive at most two inherited affine candidates from neighboring blocks of the current block. Here, the neighboring blocks may include a left neighboring block a0 of the current block, a lower left neighboring block a1 of the current block, an upper neighboring block B0 of the current block, an upper right neighboring block B1 of the current block, and an upper left neighboring block B2 of the current block.

For example, the encoding/decoding apparatus may derive a first affine MVP candidate for the current block based on a first block among the neighboring blocks. Here, the first block may be encoded according to an affine motion model, and the reference picture of the first block may be the same as the reference picture of the current block. In other words, the first block may be the block first confirmed to satisfy the condition when checking the neighboring blocks according to a specific order. The condition may be that the block is encoded according to an affine motion model and that the reference picture of the block is the same as the reference picture of the current block.

Thereafter, the encoding/decoding apparatus may derive a second affine MVP candidate based on a second block among the neighboring blocks. Here, the second block may be encoded according to an affine motion model, and the reference picture of the second block may be the same as the reference picture of the current block. In other words, the second block may be the block second confirmed to satisfy the condition when checking the neighboring blocks according to the specific order. The condition may be that the block is encoded according to an affine motion model and that the reference picture of the block is the same as the reference picture of the current block.

Further, the particular order may be such that neighboring blocks are checked in the order of the left neighboring block a0, the lower left neighboring block a1, the upper neighboring block B0, the upper right neighboring block B1, and the upper left neighboring block B2. In addition, the checking order may be performed differently from the above, and may not be limited to the above example.

At S1410, the encoding/decoding apparatus may add the constructed candidate to the affine MVP list of the current block. The constructed candidate may represent the constructed affine candidate described above and may also be referred to as a constructed affine MVP candidate. If the number of available inherited candidates is less than 2, the encoding/decoding apparatus may add the constructed candidate to the affine MVP list of the current block. For example, the encoding/decoding apparatus may derive one constructed affine candidate.

Furthermore, the method for deriving the constructed affine candidate may differ depending on whether the affine motion model applied to the current block is a six-parameter affine motion model or a four-parameter affine motion model. A detailed description of how to derive the constructed candidate is provided later.

At S1420, the encoding/decoding apparatus may add an HEVC AMVP candidate to the affine MVP list of the current block. If the number of available inherited candidates and/or constructed candidates is less than 2, the encoding/decoding apparatus may add an HEVC AMVP candidate to the affine MVP list of the current block. In other words, when the number of available inherited candidates and/or constructed candidates is less than 2, the encoding/decoding apparatus may perform the process for constructing MVP candidates specified in the existing HEVC standard.
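The overall flow of S1400 to S1420 can be summarized with the following sketch; every helper passed in (derive_inherited, derive_constructed, derive_hevc_amvp) is a hypothetical stand-in for the corresponding step in the text, not an actual API:

```python
# High-level sketch of affine MVP list construction: inherited candidates
# first, then a constructed candidate if fewer than two candidates exist,
# then HEVC-style AMVP candidates as filler.

def build_affine_mvp_list(cur_block, derive_inherited, derive_constructed,
                          derive_hevc_amvp, max_num=2):
    mvp_list = list(derive_inherited(cur_block))                 # S1400/S1405
    if len(mvp_list) < max_num:
        constructed = derive_constructed(cur_block)              # S1410
        if constructed is not None:
            mvp_list.append(constructed)
    if len(mvp_list) < max_num:                                  # S1420
        mvp_list += derive_hevc_amvp(cur_block, max_num - len(mvp_list))
    return mvp_list[:max_num]
```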

Furthermore, the method for deriving the constructed candidate may be performed as follows.

For example, if the affine motion model applied to the current block is a six-parameter affine motion model, the constructed candidates may be derived as shown in the embodiment of fig. 15.

Fig. 15 illustrates an example of deriving configuration candidates.

Referring to fig. 15, at S1500 the encoding/decoding apparatus may check mv0, mv1, and mv2 for the current block. In other words, the encoding/decoding apparatus may determine whether mv0, mv1, and mv2 are available among the neighboring blocks of the current block. Here, mv0 may represent the CPMVP candidate of CP0 of the current block, mv1 may represent the CPMVP candidate of CP1 of the current block, and mv2 may represent the CPMVP candidate of CP2 of the current block. In addition, mv0, mv1, and mv2 may represent the candidate motion vectors of the respective CPs.

For example, the encoding/decoding apparatus may check whether the motion vectors of the neighboring blocks in the first group satisfy a specific condition according to a specific order. The encoding/decoding apparatus may derive, as mv0, the motion vector of the neighboring block first confirmed to satisfy the condition during the checking process. In other words, mv0 may be the motion vector first confirmed to satisfy the specific condition when checking the motion vectors in the first group according to the specific order. If the motion vectors of the neighboring blocks in the first group do not satisfy the specific condition, an available mv0 may not exist. Here, for example, the specific order may be the order of the neighboring block A, the neighboring block B, and the neighboring block C in the first group. In addition, for example, the specific condition may be that the reference picture of the motion vector of the neighboring block is the same as the reference picture of the current block.

In addition, for example, the encoding/decoding apparatus may check whether the motion vectors of the neighboring blocks in the second group satisfy a specific condition according to a specific order. The encoding/decoding apparatus may derive, as mv1, the motion vector of the neighboring block first confirmed to satisfy the condition during the checking process. In other words, mv1 may be the motion vector first confirmed to satisfy the specific condition when checking the motion vectors in the second group according to the specific order. If the motion vectors of the neighboring blocks in the second group do not satisfy the specific condition, an available mv1 may not exist. Here, for example, the specific order may be the order of the neighboring block D and the neighboring block E in the second group. In addition, for example, the specific condition may be that the reference picture of the motion vector of the neighboring block is the same as the reference picture of the current block.

In addition, for example, the encoding/decoding apparatus may check whether the motion vectors of the neighboring blocks in the third group satisfy a specific condition according to a specific order. The encoding/decoding apparatus may derive, as mv2, the motion vector of the neighboring block first confirmed to satisfy the condition during the checking process. In other words, mv2 may be the motion vector first confirmed to satisfy the specific condition when checking the motion vectors in the third group according to the specific order. If the motion vectors of the neighboring blocks in the third group do not satisfy the specific condition, an available mv2 may not exist. Here, for example, the specific order may be the order of the neighboring block F and the neighboring block G in the third group. In addition, for example, the specific condition may be that the reference picture of the motion vector of the neighboring block is the same as the reference picture of the current block.

In addition, the first group may include a motion vector of the neighboring block a, a motion vector of the neighboring block B, and a motion vector of the neighboring block C; the second group may include motion vectors of the neighboring block D and motion vectors of the neighboring block E; and the third group may include the motion vector of the neighboring block F and the motion vector of the neighboring block G. The neighboring block a may represent a neighboring block located at the upper left of the upper left sample position of the current block; the neighboring block B may represent a neighboring block located above the upper left sample position of the current block; the neighboring block C may represent a neighboring block located at the left side of the upper-left sample position of the current block; the neighboring block D may represent a neighboring block located above the upper-right sample position of the current block; the neighboring block E may represent a neighboring block located at the upper right of the upper right sample position of the current block; the neighboring block F may represent a neighboring block located at the left of the lower left sample position of the current block; and the neighboring block G may represent a neighboring block located at the lower left of the lower left sample position of the current block.

When only mv0 and mv1 are available for the current block, that is, when only mv0 and mv1 are derived for the current block, the encoding/decoding apparatus may, at S1510, derive mv2 for the current block based on equation 8 above. The encoding/decoding apparatus may derive mv2 by substituting the derived mv0 and mv1 into equation 8 above.

When only mv0 and mv2 are available for the current block, that is, when only mv0 and mv2 are derived for the current block, the encoding/decoding apparatus may, at S1520, derive mv1 for the current block based on equation 9 above. The encoding/decoding apparatus may derive mv1 by substituting the derived mv0 and mv2 into equation 9 above.

At S1530, the encoding/decoding apparatus may use the derived mv0, mv1, and mv2 as the constructed candidate of the current block. When mv0, mv1, and mv2 are available, that is, when mv0, mv1, and mv2 are derived based on the neighboring blocks of the current block, the encoding/decoding apparatus may use the derived mv0, mv1, and mv2 as the constructed candidate of the current block.

In addition, when only mv0 and mv1 are available for the current block, that is, when only mv0 and mv1 are derived for the current block, the encoding/decoding apparatus may use the derived mv0 and mv1, together with the mv2 derived based on equation 8 above, as the constructed candidate of the current block.

In addition, when only mv0 and mv2 are available for the current block, that is, when only mv0 and mv2 are derived for the current block, the encoding/decoding apparatus may use the derived mv0 and mv2, together with the mv1 derived based on equation 9 above, as the constructed candidate of the current block.

In addition, for example, if the affine motion model applied to the current block is a four-parameter affine motion model, the constructed candidate may be derived as shown in the embodiment of fig. 16.

Fig. 16 illustrates an example of deriving the constructed candidate.

Referring to fig. 16, at S1600 the encoding/decoding apparatus may check mv0, mv1, and mv2. In other words, the encoding/decoding apparatus may determine whether mv0, mv1, and mv2 are available among the neighboring blocks of the current block. Here, mv0 may represent the CPMVP candidate of CP0 of the current block, mv1 may represent the CPMVP candidate of CP1 of the current block, and mv2 may represent the CPMVP candidate of CP2 of the current block.

For example, the encoding/decoding apparatus may check whether the motion vectors of the neighboring blocks in the first group satisfy a specific condition according to a specific order. The encoding/decoding apparatus may derive, as mv0, the motion vector of the neighboring block first confirmed to satisfy the condition during the checking process. In other words, mv0 may be the motion vector first confirmed to satisfy the specific condition when checking the motion vectors in the first group according to the specific order. If the motion vectors of the neighboring blocks in the first group do not satisfy the specific condition, an available mv0 may not exist. Here, for example, the specific order may be the order of the neighboring block A, the neighboring block B, and the neighboring block C in the first group. In addition, for example, the specific condition may be that the reference picture of the motion vector of the neighboring block is the same as the reference picture of the current block.

In addition, for example, the encoding/decoding apparatus may check whether the motion vectors of the neighboring blocks in the second group satisfy a specific condition according to a specific order. The encoding/decoding apparatus may derive, as mv1, the motion vector of the neighboring block first confirmed to satisfy the condition during the checking process. In other words, mv1 may be the motion vector first confirmed to satisfy the specific condition when checking the motion vectors in the second group according to the specific order. If the motion vectors of the neighboring blocks in the second group do not satisfy the specific condition, an available mv1 may not exist. Here, for example, the specific order may be the order of the neighboring block D and the neighboring block E in the second group. In addition, for example, the specific condition may be that the reference picture of the motion vector of the neighboring block is the same as the reference picture of the current block.

In addition, for example, the encoding/decoding apparatus may check whether the motion vectors of the neighboring blocks in the third group satisfy a specific condition according to a specific order. The encoding/decoding apparatus may derive, as mv2, the motion vector of the neighboring block first confirmed to satisfy the condition during the checking process. In other words, mv2 may be the motion vector first confirmed to satisfy the specific condition when checking the motion vectors in the third group according to the specific order. If the motion vectors of the neighboring blocks in the third group do not satisfy the specific condition, an available mv2 may not exist. Here, for example, the specific order may be the order of the neighboring block F and the neighboring block G in the third group. In addition, for example, the specific condition may be that the reference picture of the motion vector of the neighboring block is the same as the reference picture of the current block.

In addition, the first group may include a motion vector of the neighboring block a, a motion vector of the neighboring block B, and a motion vector of the neighboring block C; the second group may include motion vectors of the neighboring block D and motion vectors of the neighboring block E; and the third group may include the motion vector of the neighboring block F and the motion vector of the neighboring block G. The neighboring block a may represent a neighboring block located at the upper left of the upper left sample position of the current block; the neighboring block B may represent a neighboring block located above the upper left sample position of the current block; the neighboring block C may represent a neighboring block located at the left side of the upper-left sample position of the current block; the neighboring block D may represent a neighboring block located above the upper-right sample position of the current block; the neighboring block E may represent a neighboring block located at the upper right of the upper right sample position of the current block; the neighboring block F may represent a neighboring block located at the left of the lower left sample position of the current block; and the neighboring block G may represent a neighboring block located at the lower left of the lower left sample position of the current block.

When only mv0 and mv1 are available for the current block, or when mv0, mv1, and mv2 are available for the current block, that is, when only mv0 and mv1 have been derived for the current block or when mv0, mv1, and mv2 have been derived for the current block, the encoding/decoding apparatus may, at S1610, use the derived mv0 and mv1 as the constructed candidate of the current block.

Furthermore, when only mv0 and mv2 are available for the current block, that is, when only mv0 and mv2 have been derived for the current block, the encoding/decoding apparatus may, at S1620, derive mv1 for the current block based on equation 9 above. The encoding/decoding apparatus may derive mv1 by substituting the derived mv0 and mv2 into equation 9 above.

Thereafter, at S1610, the encoding/decoding apparatus may use the derived mv0 and mv1 as the constructed candidate of the current block.

Furthermore, another embodiment for deriving inherited affine candidates according to the present disclosure will be presented. The proposed embodiments may reduce computational complexity when deriving inherited affine candidates, thereby improving coding performance.

Fig. 17 illustrates positions of neighboring blocks scanned for deriving inherited affine candidates.

The encoding/decoding apparatus may derive at most two inherited affine candidates from neighboring blocks of the current block. Fig. 17 illustrates neighboring blocks of inherited affine candidates. For example, the neighboring blocks may include a neighboring block a and a neighboring block B shown in fig. 17. The neighboring block a may represent a left neighboring block a0, and the neighboring block B may represent an upper neighboring block B0.

For example, the encoding/decoding apparatus may check the availability of the neighboring blocks in a specific order and derive an inherited affine candidate of the current block based on the neighboring block first confirmed to be available. In other words, the encoding/decoding apparatus may check, in the specific order, whether the neighboring blocks satisfy a specific condition, and derive an inherited affine candidate of the current block based on the neighboring block first confirmed to satisfy the condition. In addition, the encoding/decoding apparatus may derive another inherited affine candidate of the current block based on the neighboring block second confirmed to satisfy the condition. Here, availability may mean that the block is encoded based on an affine motion model and that the reference picture of the block is the same as the reference picture of the current block; in other words, the specific condition may indicate exactly this. The specific order may be, for example, from the neighboring block A to the neighboring block B. Further, a pruning check process may not be performed between the two derived inherited affine candidates, as sketched below. The pruning check process represents a process of checking whether candidates are identical to each other and removing the later-derived candidate if they are found to be identical.
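A sketch of the proposed simplification under the same illustrative assumptions as before (is_affine, ref_pic, and the caller-supplied derive_cpmvs helper are not standard names):

```python
# Sketch of the simplified derivation: only the left neighbor A (A0) and the
# above neighbor B (B0) are scanned, each may contribute one inherited
# candidate, and no pruning check is performed between the two results.

def simplified_inherited_candidates(block_a, block_b, cur_ref_pic, derive_cpmvs):
    cands = []
    for blk in (block_a, block_b):
        if blk is not None and blk.is_affine and blk.ref_pic == cur_ref_pic:
            cands.append(derive_cpmvs(blk))
    return cands  # no pruning between the two candidates
```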

The above embodiment proposes a method for checking only two neighboring blocks (i.e., the neighboring block a and the neighboring block B) and deriving an inherited affine candidate, instead of checking all existing neighboring blocks (i.e., the neighboring block a, the neighboring block B, the neighboring block C, the neighboring block D, and the neighboring block E) and deriving an inherited affine candidate. Here, the neighboring block C may represent an upper-right neighboring block B1, the neighboring block D may represent a lower-left neighboring block a1, and the neighboring block E represents an upper-left neighboring block B2.

To analyze the spatial correlation between the neighboring blocks and the current block, the probability that affine prediction is applied to the current block when affine prediction is applied to each neighboring block may be used. This probability, for each neighboring block coded with affine prediction, can be derived as shown in the following table.

[ Table 2]

Reference block    A      B      C      D      E
Probability        65%    41%    5%     3%     1%

Referring to table 2 above, it can be found that the spatial correlation with the current block is high for the neighboring block a and the neighboring block B among the neighboring blocks. Therefore, by using only the neighboring block a and the neighboring block B exhibiting high spatial correlation to derive the inherited affine candidate, the processing time can be reduced, and high decoding performance can be achieved.

Further, a pruning check process may be performed to prevent the same candidate from being present in the candidate list. Since the pruning check process can remove redundancy, it is beneficial in terms of coding efficiency, but at the same time the computational complexity increases due to the pruning check process. In particular, since the pruning check process for affine candidates must be performed in consideration of the affine type (e.g., whether the affine motion model is a four-parameter or a six-parameter affine motion model), the reference picture (or reference picture index), and the MVs of CP0, CP1, and CP2, its computational complexity is very high. Therefore, the present embodiment proposes a method of not performing the pruning check process between the inherited affine candidate derived based on the neighboring block A (e.g., inherited_A) and the inherited affine candidate derived based on the neighboring block B (e.g., inherited_B). The neighboring blocks A and B are far from each other and thus show low spatial correlation, so the probability that inherited_A and inherited_B are identical is low. Therefore, it may be desirable not to perform the pruning check process between the inherited affine candidates.

In addition, for the above-described reason, a method of performing the pruning check process as little as possible may be proposed. For example, the encoding/decoding apparatus may perform the pruning check process by comparing only the MVs of CP0 of the inherited affine candidates with each other.
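As one possible illustration of such a reduced pruning check, the following sketch compares only the CP0 motion vectors of the candidates; representing a candidate as a tuple of per-CP motion vectors is an assumption made for the example.

```python
def prune_by_cp0_only(candidates):
    """Keep a candidate only if its CP0 motion vector differs from the CP0 motion
    vectors of all previously kept candidates. The other CP motion vectors, the
    affine type, and the reference index are intentionally not compared, which
    keeps the check cheap."""
    kept = []
    for cand in candidates:                      # cand: tuple of per-CP motion vectors
        if all(cand[0] != prev[0] for prev in kept):
            kept.append(cand)
    return kept
```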

In addition, the present disclosure proposes a method of deriving a constructed candidate that differs from the one obtained by the above embodiment. Compared to the above embodiment for deriving constructed candidates, the proposed embodiment may improve coding performance by reducing complexity. The proposed embodiment can be described as follows. In addition, the constructed affine candidate can be considered when the number of available inherited affine candidates is less than 2 (i.e., when the number of derived inherited affine candidates is less than 2).

For example, the encoding/decoding apparatus may check mv0, mv1, and mv2 for the current block. In other words, the encoding/decoding apparatus may determine whether mv0, mv1, and mv2 are available among the neighboring blocks of the current block. Here, mv0 may represent the CPMVP candidate of CP0 of the current block, mv1 may represent the CPMVP candidate of CP1 of the current block, and mv2 may represent the CPMVP candidate of CP2 of the current block.

Specifically, the neighboring blocks of the current block may be divided into three groups, and the neighboring blocks may include a neighboring block a, a neighboring block B, a neighboring block C, a neighboring block D, a neighboring block E, a neighboring block F, and a neighboring block G. The first group may include motion vectors of the neighboring block a, motion vectors of the neighboring block B, and motion vectors of the neighboring block C; the second group may include motion vectors of the neighboring block D and motion vectors of the neighboring block E; and the third group may include the motion vector of the neighboring block F and the motion vector of the neighboring block G. The neighboring block a may represent a neighboring block located at the upper left of the upper left sample position of the current block; the neighboring block B may represent a neighboring block located above the upper left sample position of the current block; the neighboring block C may represent a neighboring block located at the left side of the upper-left sample position of the current block; the neighboring block D may represent a neighboring block located above the upper-right sample position of the current block; the neighboring block E may represent a neighboring block located at the upper right of the upper right sample position of the current block; the neighboring block F may represent a neighboring block located at the left of the lower left sample position of the current block; and the neighboring block G may represent a neighboring block located at the lower left of the lower left sample position of the current block.

The encoding/decoding apparatus may determine the availability of mv0 within the first group, the availability of mv1 within the second group, and the availability of mv2 within the third group.

More specifically, for example, the encoding/decoding apparatus may check whether the motion vectors of the neighboring blocks within the first group satisfy a specific condition according to a specific order. The encoding/decoding apparatus may derive, as mv0, the motion vector of the neighboring block that is first confirmed to satisfy the condition during the checking process. In other words, mv0 may be the motion vector that is first confirmed to satisfy the specific condition when the motion vectors within the first group are checked according to the specific order. If none of the motion vectors of the neighboring blocks within the first group satisfies the specific condition, an available mv0 may not be present. Here, for example, the specific order may be the order of the neighboring block A, the neighboring block B, and the neighboring block C within the first group. In addition, for example, the specific condition may be that the reference picture of the motion vector of the neighboring block is the same as the reference picture of the current block.

In addition, the encoding/decoding apparatus may check whether the motion vectors of the neighboring blocks within the second group satisfy a specific condition according to a specific order. The encoding/decoding apparatus may derive, as mv1, the motion vector of the neighboring block that is first confirmed to satisfy the condition during the checking process. In other words, mv1 may be the motion vector that is first confirmed to satisfy the specific condition when the motion vectors within the second group are checked according to the specific order. If none of the motion vectors of the neighboring blocks within the second group satisfies the specific condition, an available mv1 may not be present. Here, for example, the specific order may be the order from the neighboring block D to the neighboring block E within the second group. In addition, for example, the specific condition may be that the reference picture of the motion vector of the neighboring block is the same as the reference picture of the current block.

In addition, the encoding/decoding apparatus may check whether the motion vectors of the neighboring blocks within the third group satisfy a specific condition according to a specific order. The encoding/decoding apparatus may derive, as mv2, the motion vector of the neighboring block that is first confirmed to satisfy the condition during the checking process. In other words, mv2 may be the motion vector that is first confirmed to satisfy the specific condition when the motion vectors within the third group are checked according to the specific order. If none of the motion vectors of the neighboring blocks within the third group satisfies the specific condition, an available mv2 may not be present. Here, for example, the specific order may be the order from the neighboring block F to the neighboring block G within the third group. In addition, for example, the specific condition may be that the reference picture of the motion vector of the neighboring block is the same as the reference picture of the current block.
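The three group scans described above may be illustrated with the following sketch, which returns the first motion vector in each group whose reference picture index matches that of the current block. The Neighbor tuple layout and the function names are hypothetical simplifications introduced only for the example.

```python
from typing import List, Optional, Tuple

MV = Tuple[int, int]                 # (mvx, mvy)
Neighbor = Tuple[MV, int]            # (motion vector, reference picture index)

def first_match(group: List[Optional[Neighbor]], cur_ref_idx: int) -> Optional[MV]:
    """Return the first motion vector in the group whose reference picture index
    matches that of the current block, scanning in the group's fixed order."""
    for neighbor in group:
        if neighbor is not None and neighbor[1] == cur_ref_idx:
            return neighbor[0]
    return None

def derive_mv0_mv1_mv2(group1, group2, group3, cur_ref_idx):
    """group1 = {A, B, C}, group2 = {D, E}, group3 = {F, G}, each in scan order."""
    return (first_match(group1, cur_ref_idx),    # mv0, CPMVP candidate of CP0
            first_match(group2, cur_ref_idx),    # mv1, CPMVP candidate of CP1
            first_match(group3, cur_ref_idx))    # mv2, CPMVP candidate of CP2
```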

Then, if the affine motion model applied to the current block is a four-parameter affine motion model and mv0 and mv1 of the current block are available, the encoding/decoding apparatus may use the derived mv0 and mv1 as the constructed candidate of the current block. Furthermore, if mv0 and/or mv1 of the current block are not available, that is, if mv0 and mv1 are not derived from the neighboring blocks of the current block, the encoding/decoding apparatus may not add the constructed candidate to the affine MVP list of the current block.

Further, if the affine motion model applied to the current block is a six-parameter affine motion model and mv0, mv1, and mv2 of the current block are available, the encoding/decoding apparatus may use the derived mv0, mv1, and mv2 as the constructed candidate of the current block. Furthermore, if mv0, mv1, and/or mv2 of the current block are not available, that is, if mv0, mv1, and mv2 are not derived from the neighboring blocks of the current block, the encoding/decoding apparatus may not add the constructed candidate to the affine MVP list of the current block.

The proposed embodiment describes a method of considering the constructed candidate only when all motion vectors of the CPs used to generate the affine motion model of the current block are available. Here, availability may mean that the reference picture of the neighboring block is the same as the reference picture of the current block. In other words, the constructed candidate can be derived only when, for each CP of the current block, there is a motion vector satisfying the condition among the motion vectors of the neighboring blocks of that CP. Therefore, if the affine motion model applied to the current block is a four-parameter affine motion model, the constructed candidate may be considered only when the motion vectors of CP0 and CP1 of the current block (i.e., mv0 and mv1) are available. In addition, if the affine motion model applied to the current block is a six-parameter affine motion model, the constructed candidate may be considered only when the motion vectors of CP0, CP1, and CP2 of the current block (i.e., mv0, mv1, and mv2) are available. Therefore, according to the proposed embodiment, an additional process for deriving a motion vector of a CP based on equation 8 or equation 9 may not be required. Through the proposed embodiment, the computational complexity of deriving the constructed candidate may be reduced. In addition, since the constructed candidate is determined only when CPMVP candidates having the same reference picture are available, overall coding performance can be improved.
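A minimal sketch of this gating rule, assuming mv0, mv1, and mv2 have already been derived as described above (or set to None when unavailable), is given below; the list layout and the affine_type string are illustrative assumptions, not part of the embodiments themselves.

```python
def add_constructed_candidate(mvp_list, affine_type, mv0, mv1, mv2=None):
    """Append the constructed candidate only when every CPMVP it needs is available
    (a four-parameter model needs mv0 and mv1; a six-parameter model also needs mv2).
    No pruning check is performed against candidates already in the list."""
    if affine_type == '4-param' and mv0 is not None and mv1 is not None:
        mvp_list.append((mv0, mv1))
    elif affine_type == '6-param' and None not in (mv0, mv1, mv2):
        mvp_list.append((mv0, mv1, mv2))
    return mvp_list
```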

Further, no pruning check process may be performed between the derived inherited affine candidate and the constructed affine candidate. The pruning checking process may represent a process of checking whether candidates are identical to each other and removing a candidate derived thereafter if the candidates are found to be identical.

The above embodiment may be as shown in fig. 18 and 19.

Fig. 18 illustrates an example of deriving a constructed candidate when a four-parameter affine motion model is applied to the current block.

Referring to FIG. 18, in S1800, the encoding/decoding apparatus may determine whether mv0 and mv1 of the current block are available. In other words, the encoding/decoding apparatus may determine whether available mv0 and mv1 exist among the neighboring blocks of the current block. Here, mv0 may be the CPMVP candidate of CP0 of the current block, and mv1 may be the CPMVP candidate of CP1.

The encoding/decoding apparatus may determine whether mv0 is available in the first group and whether mv1 is available in the second group.

Specifically, the neighboring blocks of the current block may be divided into three groups, and the neighboring blocks may include a neighboring block a, a neighboring block B, a neighboring block C, a neighboring block D, a neighboring block E, a neighboring block F, and a neighboring block G. The first group may include motion vectors of the neighboring block a, motion vectors of the neighboring block B, and motion vectors of the neighboring block C; the second group may include motion vectors of the neighboring block D and motion vectors of the neighboring block E; and the third group may include the motion vector of the neighboring block F and the motion vector of the neighboring block G. The neighboring block a may represent a neighboring block located at the upper left of the upper left sample position of the current block; the neighboring block B may represent a neighboring block located above the upper left sample position of the current block; the neighboring block C may represent a neighboring block located at the left side of the upper-left sample position of the current block; the neighboring block D may represent a neighboring block located above the upper-right sample position of the current block; the neighboring block E may represent a neighboring block located at the upper right of the upper right sample position of the current block; the neighboring block F may represent a neighboring block located at the left of the lower left sample position of the current block; and the neighboring block G may represent a neighboring block located at the lower left of the lower left sample position of the current block.

The encoding/decoding apparatus may check whether the motion vectors of the neighboring blocks within the first group satisfy a specific condition according to a specific order. The encoding/decoding apparatus may derive, as mv0, the motion vector of the neighboring block that is first confirmed to satisfy the condition during the checking process. In other words, mv0 may be the motion vector that is first confirmed to satisfy the specific condition when the motion vectors within the first group are checked according to the specific order. If none of the motion vectors of the neighboring blocks within the first group satisfies the specific condition, an available mv0 may not be present. Here, for example, the specific order may be the order of the neighboring block A, the neighboring block B, and the neighboring block C within the first group. In addition, for example, the specific condition may be that the reference picture of the motion vector of the neighboring block is the same as the reference picture of the current block.

In addition, the encoding/decoding apparatus may check whether the motion vectors of the neighboring blocks within the second group satisfy a specific condition according to a specific order. The encoding/decoding apparatus may derive, as mv1, the motion vector of the neighboring block that is first confirmed to satisfy the condition during the checking process. In other words, mv1 may be the motion vector that is first confirmed to satisfy the specific condition when the motion vectors within the second group are checked according to the specific order. If none of the motion vectors of the neighboring blocks within the second group satisfies the specific condition, an available mv1 may not be present. Here, for example, the specific order may be the order from the neighboring block D to the neighboring block E within the second group. In addition, for example, the specific condition may be that the reference picture of the motion vector of the neighboring block is the same as the reference picture of the current block.

If mv0 and mv1 of the current block are available, that is, if mv0 and mv1 of the current block are derived, then in S1810 the encoding/decoding apparatus may use the derived mv0 and mv1 as the constructed candidate of the current block. Furthermore, if mv0 and mv1 of the current block are not available, that is, if mv0 and mv1 are not derived from the neighboring blocks of the current block, the encoding/decoding apparatus may not add the constructed candidate to the affine MVP list of the current block.

Further, no pruning check process may be performed between the derived inherited affine candidate and the constructed affine candidate. The pruning checking process may represent a process of checking whether candidates are identical to each other and removing a candidate derived thereafter if the candidates are found to be identical.

Fig. 19 illustrates an example of deriving a constructed candidate when a six-parameter affine motion model is applied to the current block.

Referring to FIG. 19, in S1900, the encoding/decoding apparatus may determine whether mv0, mv1, and mv2 of the current block are available among the neighboring blocks of the current block. In other words, the encoding/decoding apparatus may determine whether available mv0, mv1, and mv2 exist among the neighboring blocks of the current block. Here, mv0 may represent the CPMVP candidate of CP0 of the current block, mv1 may represent the CPMVP candidate of CP1 of the current block, and mv2 may represent the CPMVP candidate of CP2 of the current block.

The encoding/decoding apparatus may determine whether mv0 is available in the first group, whether mv1 is available in the second group, and whether mv2 is available in the third group.

Specifically, the neighboring blocks of the current block may be divided into three groups, and the neighboring blocks may include a neighboring block a, a neighboring block B, a neighboring block C, a neighboring block D, a neighboring block E, a neighboring block F, and a neighboring block G. The first group may include motion vectors of the neighboring block a, motion vectors of the neighboring block B, and motion vectors of the neighboring block C; the second group may include motion vectors of the neighboring block D and motion vectors of the neighboring block E; and the third group may include the motion vector of the neighboring block F and the motion vector of the neighboring block G. The neighboring block a may represent a neighboring block located at the upper left of the upper left sample position of the current block; the neighboring block B may represent a neighboring block located above the upper left sample position of the current block; the neighboring block C may represent a neighboring block located at the left side of the upper-left sample position of the current block; the neighboring block D may represent a neighboring block located above the upper-right sample position of the current block; the neighboring block E may represent a neighboring block located at the upper right of the upper right sample position of the current block; the neighboring block F may represent a neighboring block located at the left of the lower left sample position of the current block; and the neighboring block G may represent a neighboring block located at the lower left of the lower left sample position of the current block.

The encoding/decoding apparatus may check whether the motion vectors of the neighboring blocks within the first group satisfy a specific condition according to a specific order. The encoding/decoding apparatus may derive, as mv0, the motion vector of the neighboring block that is first confirmed to satisfy the condition during the checking process. In other words, mv0 may be the motion vector that is first confirmed to satisfy the specific condition when the motion vectors within the first group are checked according to the specific order. If none of the motion vectors of the neighboring blocks within the first group satisfies the specific condition, an available mv0 may not be present. Here, for example, the specific order may be the order of the neighboring block A, the neighboring block B, and the neighboring block C within the first group. In addition, for example, the specific condition may be that the reference picture of the motion vector of the neighboring block is the same as the reference picture of the current block.

In addition, the encoding/decoding apparatus may check whether the motion vectors of the neighboring blocks within the second group satisfy a specific condition according to a specific order. The encoding/decoding apparatus may derive, as mv1, the motion vector of the neighboring block that is first confirmed to satisfy the condition during the checking process. In other words, mv1 may be the motion vector that is first confirmed to satisfy the specific condition when the motion vectors within the second group are checked according to the specific order. If none of the motion vectors of the neighboring blocks within the second group satisfies the specific condition, an available mv1 may not be present. Here, for example, the specific order may be the order from the neighboring block D to the neighboring block E within the second group. In addition, for example, the specific condition may be that the reference picture of the motion vector of the neighboring block is the same as the reference picture of the current block.

In addition, the encoding/decoding apparatus may check whether the motion vectors of the neighboring blocks within the third group satisfy a specific condition according to a specific order. The encoding/decoding apparatus may derive, as mv2, the motion vector of the neighboring block that is first confirmed to satisfy the condition during the checking process. In other words, mv2 may be the motion vector that is first confirmed to satisfy the specific condition when the motion vectors within the third group are checked according to the specific order. If none of the motion vectors of the neighboring blocks within the third group satisfies the specific condition, an available mv2 may not be present. Here, for example, the specific order may be the order from the neighboring block F to the neighboring block G within the third group. In addition, for example, the specific condition may be that the reference picture of the motion vector of the neighboring block is the same as the reference picture of the current block.

If mv0, mv1, and mv2 of the current block are available, that is, if mv0, mv1, and mv2 of the current block are derived, then in S1910 the encoding/decoding apparatus may use the derived mv0, mv1, and mv2 as the constructed candidate of the current block. Furthermore, if mv0, mv1, and/or mv2 of the current block are not available, that is, if mv0, mv1, and mv2 are not derived from the neighboring blocks of the current block, the encoding/decoding apparatus may not add the constructed candidate to the affine MVP list of the current block.

Further, no pruning check process may be performed between the derived inherited affine candidate and the constructed affine candidate.

Furthermore, when the number of derived affine candidates is less than 2 (i.e., when the total number of inherited affine candidates and constructed affine candidates is less than 2), HEVC AMVP candidates may be added to the affine MVP list of the current block.

For example, HEVC AMVP candidates may be derived in the following order.

More specifically, when the number of derived affine candidates is less than 2 and CPMV0 of the constructed affine candidate is available, CPMV0 may be used as an affine MVP candidate. In other words, when the number of derived affine candidates is less than 2 and CPMV0 of the constructed affine candidate is available (i.e., when the number of derived affine candidates is less than 2 and CPMV0 of the constructed affine candidate is derived), a first affine MVP candidate in which CPMV0 of the constructed affine candidate is used as CPMV0, CPMV1, and CPMV2 may be derived.

In addition, next, when the number of derived affine candidates is less than 2 and CPMV1 of the constructed affine candidate is available, CPMV1 may be used as an affine MVP candidate. In other words, when the number of derived affine candidates is less than 2 and CPMV1 of the constructed affine candidate is available (i.e., when the number of derived affine candidates is less than 2 and CPMV1 of the constructed affine candidate is derived), a second affine MVP candidate in which CPMV1 of the constructed affine candidate is used as CPMV0, CPMV1, and CPMV2 may be derived.

In addition, next, when the number of derived affine candidates is less than 2 and CPMV2 of the constructed affine candidate is available, CPMV2 may be used as an affine MVP candidate. In other words, when the number of derived affine candidates is less than 2 and CPMV2 of the constructed affine candidate is available (i.e., when the number of derived affine candidates is less than 2 and CPMV2 of the constructed affine candidate is derived), a third affine MVP candidate in which CPMV2 of the constructed affine candidate is used as CPMV0, CPMV1, and CPMV2 may be derived.

In addition, next, when the number of derived affine candidates is less than 2, the HEVC temporal motion vector predictor (TMVP) may be used as an affine MVP candidate. The HEVC TMVP may be derived based on the motion information of a temporal neighboring block of the current block. In other words, when the number of derived affine candidates is less than 2, a fourth affine MVP candidate in which the motion vector of the temporal neighboring block of the current block is used as CPMV0, CPMV1, and CPMV2 may be derived. The temporal neighboring block may indicate a collocated block within a collocated picture corresponding to the current block.

In addition, next, when the number of derived affine candidates is still less than 2, a zero motion vector (MV) may be used as an affine MVP candidate. In other words, when the number of derived affine candidates is less than 2, a fifth affine MVP candidate in which the zero motion vector is used as CPMV0, CPMV1, and CPMV2 may be derived. A zero motion vector may represent a motion vector whose elements are all zero.
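The padding order described above may be illustrated as follows; this is a simplified sketch assuming that the list holds tuples of per-CP motion vectors, that the maximum list size is 2, and that unavailable motion vectors are passed as None.

```python
def pad_affine_mvp_list(mvp_list, cpmv0, cpmv1, cpmv2, temporal_mv, max_candidates=2):
    """Fill the affine MVP candidate list up to max_candidates with fallback candidates.

    Fallbacks are tried in the order: CPMV0, CPMV1, CPMV2 of the constructed
    candidate, then the temporal MVP, then the zero MV; each fallback replicates
    a single motion vector over all control points. Unavailable inputs are None.
    """
    for mv in (cpmv0, cpmv1, cpmv2, temporal_mv, (0, 0)):
        if len(mvp_list) >= max_candidates:
            break
        if mv is not None:
            mvp_list.append((mv, mv, mv))   # same MV used for CP0, CP1, and CP2
    return mvp_list
```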

The processing step using the CPMV of the constructed affine candidate reuses the MVs already considered to generate the constructed affine candidate, thereby reducing the processing complexity compared to the existing methods for deriving HEVC AMVP candidates.

Furthermore, the present disclosure proposes another embodiment for deriving inherited affine candidates.

To derive the inherited affine candidates, affine prediction information of neighboring blocks is required, and more specifically, the following affine prediction information is required:

1) an affine flag (affine_flag) indicating whether affine-prediction-based coding has been applied to the neighboring block, and

2) motion information of neighboring blocks.

In addition, if a four-parameter affine motion model is applied to the neighboring block, the motion information of the neighboring block may include L0 motion information and L1 motion information for CP0, and L0 motion information and L1 motion information for CP1. If a six-parameter affine motion model is applied to the neighboring block, the motion information of the neighboring block may further include L0 motion information and L1 motion information for CP2. Here, L0 motion information may represent motion information for list 0 (L0), and L1 motion information may represent motion information for list 1 (L1). The L0 motion information may include an L0 reference picture index and an L0 motion vector, and the L1 motion information may include an L1 reference picture index and an L1 motion vector.
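For illustration, the per-neighbor information listed above may be thought of as a record of the following form; the record and field names are hypothetical and are used only to make explicit what would otherwise have to be kept in a line buffer for blocks above the current CTU row.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

MV = Tuple[int, int]  # (mvx, mvy), hypothetical integer representation

@dataclass
class AffineNeighborInfo:
    """Hypothetical per-neighbor record needed to derive an inherited candidate."""
    affine_flag: bool                            # was the neighbor coded with affine prediction?
    l0_ref_idx: Optional[int] = None             # L0 reference picture index
    l1_ref_idx: Optional[int] = None             # L1 reference picture index
    l0_cp_mvs: Optional[Tuple[MV, ...]] = None   # L0 motion vectors of the CPs (2 or 3 entries)
    l1_cp_mvs: Optional[Tuple[MV, ...]] = None   # L1 motion vectors of the CPs (2 or 3 entries)
```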

In this regard, the present disclosure proposes an embodiment for deriving inherited affine candidates, which minimizes hardware costs by not storing affine prediction related information in a line buffer or by reducing affine prediction related information in a line buffer.

In the present embodiment, additional information on affine prediction may not be stored in the line buffer, and when it is necessary to generate inherited affine candidates using information within the line buffer, generation of the inherited affine candidates may be restricted.

Fig. 20a and 20b illustrate embodiments for deriving inherited affine candidates.

Referring to fig. 20a, when a neighboring block B of the current block (i.e., a neighboring block above the current block) does not belong to the same CTU as the current block, the neighboring block B may not be used to generate an inherited affine candidate. Furthermore, although the neighboring block a also does not belong to the same CTU as the current block, information on the neighboring block a is not stored in the line buffer, and thus the neighboring block a can be used to generate an inherited affine candidate. Thus, according to the present embodiment, a neighboring block above a current block can be used to derive inherited affine candidates only if the neighboring block and the current block belong to the same CTU. In addition, when a neighboring block above the current block does not belong to the same CTU as the current block, the upper neighboring block may not be used to derive the inherited affine candidate.

Referring to fig. 20B, a neighboring block B of the current block (i.e., a neighboring block above the current block) may belong to the same CTU as the current block. In this case, the encoding/decoding apparatus can generate inherited affine candidates by referring to the adjacent block B.
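The restriction may be summarized by the following sketch, under the assumption that each block can be mapped to the address of the CTU containing it; the function and parameter names are illustrative only.

```python
def can_inherit_from_neighbor(is_above_neighbor: bool,
                              neighbor_ctu_addr: int,
                              current_ctu_addr: int) -> bool:
    """Return True when the neighbor may be used for an inherited affine candidate.

    An above neighbor is usable only when it lies in the same CTU as the current
    block, so that no affine information has to be kept in the line buffer;
    a left neighbor is not restricted by this rule.
    """
    if is_above_neighbor:
        return neighbor_ctu_addr == current_ctu_addr
    return True
```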

Fig. 21 illustrates a video encoding method performed by an encoding apparatus according to the present disclosure. The method disclosed in fig. 21 may be performed by the encoding device disclosed in fig. 2. More specifically, for example, the steps S2100 to S2130 may be performed by a predictor of the encoding apparatus, and the step S2140 may be performed by an entropy encoder of the encoding apparatus. In addition, although not shown in the drawings, the process for deriving the prediction samples of the current block based on the CPMVs may be performed by the predictor of the encoding apparatus, the process for deriving the residual samples of the current block based on the original samples and the prediction samples of the current block may be performed by a subtractor of the encoding apparatus, the process for generating the information about the residual of the current block based on the residual samples may be performed by a transformer of the encoding apparatus, and the process for encoding the information about the residual may be performed by the entropy encoder of the encoding apparatus.

In S2100, the encoding apparatus constructs an affine Motion Vector Predictor (MVP) candidate list of the current block. The encoding apparatus may construct an affine MVP candidate list including affine MVP candidates for the current block. The maximum number of affine MVP candidates in the affine MVP candidate list may be 2.

In addition, as one example, the affine MVP candidate list may include inherited affine MVP candidates. The encoding apparatus may check whether an inherited affine MVP candidate of the current block is available, and may derive the inherited affine MVP candidate if the inherited affine MVP candidate is available. For example, inherited affine MVP candidates may be derived based on neighboring blocks of the current block, and the maximum number of inherited affine MVP candidates may be 2. The availability of neighboring blocks may be checked in a particular order and inherited affine MVP candidates may be derived based on the checked available neighboring blocks. In other words, the availability of neighboring blocks may be checked in a certain order, a first inherited affine MVP candidate may be derived based on the first checked available neighboring blocks, and a second inherited affine MVP candidate may be derived based on the second checked available neighboring blocks. Availability may mean that a neighboring block is encoded based on an affine motion model and a reference picture of the neighboring block is the same as a reference picture of the current block. In other words, the available neighboring blocks may refer to neighboring blocks that are encoded according to an affine motion model (i.e., neighboring blocks to which affine prediction is applied) and whose reference pictures are the same as those of the current block. More specifically, the encoding apparatus may derive a motion vector of a CP of the current block based on an affine motion model of the available neighboring blocks checked first, and derive a first inherited affine MVP candidate including the motion vector as a CPMVP candidate. In addition, the encoding apparatus may derive a motion vector of the CP of the current block based on the affine motion model of the second checked available neighboring block, and derive a second inherited affine MVP candidate including the motion vector as a CPMVP candidate. The affine motion model can be derived as in equation 1 or 3 above.

In addition, in other words, the neighboring blocks may be checked in a specific order to see whether they satisfy a specific condition, and the inherited affine MVP candidates may be derived based on the neighboring blocks confirmed to satisfy the specific condition. In other words, the neighboring blocks may be checked in a specific order to see whether they satisfy the specific condition, a first inherited affine MVP candidate may be derived based on the neighboring block that is first confirmed to satisfy the specific condition, and a second inherited affine MVP candidate may be derived based on the neighboring block that is second confirmed to satisfy the specific condition. More specifically, the encoding apparatus may derive motion vectors of the CPs of the current block based on the affine motion model of the neighboring block that is first confirmed to satisfy the specific condition, and derive the first inherited affine MVP candidate including those motion vectors as CPMVP candidates. In addition, the encoding apparatus may derive motion vectors of the CPs of the current block based on the affine motion model of the neighboring block that is second confirmed to satisfy the specific condition, and derive the second inherited affine MVP candidate including those motion vectors as CPMVP candidates. The affine motion model can be derived as in equation 1 or 3 above. Furthermore, the specific condition may indicate that the neighboring block is coded according to an affine motion model and that the reference picture of the neighboring block is the same as the reference picture of the current block. In other words, a neighboring block satisfying the specific condition is a neighboring block coded according to an affine motion model (i.e., a neighboring block to which affine prediction is applied) whose reference picture is the same as the reference picture of the current block.
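As an illustrative sketch only, and assuming the four-parameter affine motion model takes its common form, the motion vectors of the CPs of the current block could be obtained by evaluating the neighboring block's model at the current block's CP positions, for example as follows; the function names and the coordinate convention are assumptions made for the example.

```python
def affine_4param_mv(cpmv0, cpmv1, neighbor_width, x, y):
    """Evaluate a four-parameter affine motion model, in its common form, at
    position (x, y) given relative to the neighboring block's top-left corner.
    cpmv0 and cpmv1 are the neighbor's top-left and top-right CP motion vectors."""
    dx = (cpmv1[0] - cpmv0[0]) / neighbor_width   # horizontal change of the x-component
    dy = (cpmv1[1] - cpmv0[1]) / neighbor_width   # horizontal change of the y-component
    mvx = cpmv0[0] + dx * x - dy * y
    mvy = cpmv0[1] + dy * x + dx * y
    return (mvx, mvy)

def inherit_cpmvp_candidates(neighbor_cpmv0, neighbor_cpmv1, neighbor_width,
                             current_cp_positions):
    """Derive CPMVP candidates of the current block by evaluating the neighbor's
    affine model at the current block's CP positions (in the neighbor's coordinates)."""
    return [affine_4param_mv(neighbor_cpmv0, neighbor_cpmv1, neighbor_width, x, y)
            for (x, y) in current_cp_positions]
```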

Here, for example, the neighboring blocks may include a left neighboring block, an upper-right neighboring block, a lower-left neighboring block, and an upper-left neighboring block of the current block. In this case, the specific order may be an order from the left neighboring block to the lower left neighboring block to the upper right neighboring block to the upper left neighboring block.

Or, for example, the neighboring blocks may include only a left neighboring block and an upper neighboring block. In this case, the specific order may be an order from the left neighboring block to the upper neighboring block.

Or, for example, the neighboring block may include a left neighboring block, and the neighboring block may further include an upper neighboring block if the upper neighboring block belongs to a current CTU including the current block. In this case, the specific order may be an order from the left neighboring block to the upper neighboring block. In addition, if the upper neighboring block does not belong to the current CTU, the neighboring block may not include the upper neighboring block. In this case, only the left neighboring block may be checked.

Further, when the size of the current block is W × H and the x-component and y-component of the top-left sample position of the current block are 0, the lower-left neighboring block may be a block including the sample at coordinates (-1, H), the left neighboring block may be a block including the sample at coordinates (-1, H-1), the upper-right neighboring block may be a block including the sample at coordinates (W, -1), the upper neighboring block may be a block including the sample at coordinates (W-1, -1), and the upper-left neighboring block may be a block including the sample at coordinates (-1, -1).
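For illustration, the sample positions listed above may be collected as follows; the dictionary keys are hypothetical labels introduced only for the example.

```python
def inherited_candidate_positions(w: int, h: int) -> dict:
    """Sample positions checked for inherited affine MVP candidates, with the
    top-left sample of the current block at (0, 0) and block size w x h."""
    return {
        'lower_left':  (-1, h),        # lower-left neighboring block
        'left':        (-1, h - 1),    # left neighboring block
        'upper_right': (w, -1),        # upper-right neighboring block
        'above':       (w - 1, -1),    # upper neighboring block
        'upper_left':  (-1, -1),       # upper-left neighboring block
    }
```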

In addition, as an example, if the constructed affine MVP candidate is available, the affine MVP candidate list may include the constructed affine MVP candidate. The encoding apparatus may check whether the constructed affine MVP candidate of the current block is available, and may derive the constructed affine MVP candidate if the constructed affine MVP candidate is available. In addition, for example, after deriving the inherited affine MVP candidate, a constructed affine MVP candidate may be derived. If the number of derived affine MVP candidates (i.e., the number of inherited affine MVPs) is less than 2 and the constructed affine MVP candidates are available, the affine MVP candidate list may include the constructed affine MVP candidates. Here, the constructed affine MVP candidate may include a candidate motion vector of the CP. When all candidate motion vectors are available, the constructed affine MVP candidate may be available.

For example, if a four-parameter affine motion model is applied to the current block, the CP of the current block may include CP0 and CP 1. If a candidate motion vector of CP0 is available and a candidate motion vector of CP1 is available, the constructed affine MVP candidate may be available and the affine MVP candidate list may include the constructed affine MVP candidates. Here, the CP0 may represent an upper left position of the current block, and the CP1 may represent an upper right position of the current block.

The constructed affine MVP candidates may include candidate motion vectors of CP0 and candidate motion vectors of CP 1. The candidate motion vector of CP0 may be a motion vector of the first block, and the candidate motion vector of CP1 may be a motion vector of the second block.

In addition, the first block may be a block whose reference picture is first confirmed to be the same as the reference picture of the current block while the neighboring blocks in the first group are checked in the first specific order. In other words, the candidate motion vector of CP0 may be the motion vector of the block whose reference picture is first confirmed to be the same as the reference picture of the current block when the neighboring blocks within the first group are checked according to the first specific order. Availability may indicate that the neighboring block exists and is coded by inter prediction. Here, if the reference picture of the first block within the first group is the same as the reference picture of the current block, the candidate motion vector of CP0 may be available. In addition, for example, the first group may include the neighboring block A, the neighboring block B, and the neighboring block C, and the first specific order may be the order from the neighboring block A to the neighboring block B, and then to the neighboring block C.

In addition, the second block may be a block whose reference picture has been first confirmed to be the same as the reference picture of the current block while checking neighboring blocks in the second group in the second specific order. Here, if the reference picture of the second block within the second group is the same as the reference picture of the current block, the candidate motion vector of the CP1 may be available. In addition, for example, the second group may include a neighboring block D and a neighboring block E, and the second specific order may be an order from the neighboring block D to the neighboring block E.

Also, when the size of the current block is W × H, the x component of the upper-left sample position of the current block is 0, and the y component thereof is 0, the neighboring block A may be a block including a sample at coordinates (-1, -1), the neighboring block B may be a block including a sample at coordinates (0, -1), the neighboring block C may be a block including a sample at coordinates (-1, 0), the neighboring block D may be a block including a sample at coordinates (W-1, -1), and the neighboring block E may be a block including a sample at coordinates (W, -1).

Further, if at least one of the candidate motion vector of CP0 and the candidate motion vector of CP1 is unavailable, the constructed affine MVP candidate may not be available.

Or, for example, if a six-parameter affine motion model is applied to the current block, the CP of the current block may include CP0, CP1, and CP 2. If a candidate motion vector of CP0 is available, a candidate motion vector of CP1 is available, and a candidate motion vector of CP2 is available, the constructed affine MVP candidate may be available and the affine MVP candidate list may include the constructed affine MVP candidate. Here, the CP0 may represent an upper left position of the current block, the CP1 may represent an upper right position of the current block, and the CP2 may represent a lower left position of the current block.

The constructed affine MVP candidates may include a candidate motion vector of CP0, a candidate motion vector of CP1, and a candidate motion vector of CP 2. The candidate motion vector of CP0 may be a motion vector of the first block, the candidate motion vector of CP1 may be a motion vector of the second block, and the candidate motion vector of CP2 may be a motion vector of the third block.

In addition, the first block may be a block whose reference picture has been first determined to be the same as the reference picture of the current block while checking neighboring blocks in the first group in the first specific order. Here, if the reference picture of the first block within the first group is the same as the reference picture of the current block, the candidate motion vector of the CP0 may be available. In addition, for example, the first group may include a neighboring block a, a neighboring block B, and a neighboring block C, and the first specific order may be an order from the neighboring block a to the neighboring block B, and then to the neighboring block C.

In addition, the second block may be a block whose reference picture has been first confirmed to be the same as the reference picture of the current block while checking neighboring blocks in the second group in the second specific order. Here, if the reference picture of the second block within the second group is the same as the reference picture of the current block, the candidate motion vector of the CP1 may be available. In addition, for example, the second group may include a neighboring block D and a neighboring block E, and the second specific order may be an order from the neighboring block D to the neighboring block E.

In addition, the third block may be a block whose reference picture has been first determined to be the same as the reference picture of the current block while checking neighboring blocks in the third group in a third specific order. Here, if the reference picture of the third block within the third group is the same as the reference picture of the current block, a candidate motion vector of the CP2 may be available. In addition, for example, the third group may include a neighboring block F and a neighboring block G, and the third specific order may be an order from the neighboring block F to the neighboring block G.

Further, when the size of the current block is W × H and the x-component and y-component of the top-left sample position of the current block are 0, the neighboring block A may be a block including the sample at coordinates (-1, -1), the neighboring block B may be a block including the sample at coordinates (0, -1), the neighboring block C may be a block including the sample at coordinates (-1, 0), the neighboring block D may be a block including the sample at coordinates (W-1, -1), the neighboring block E may be a block including the sample at coordinates (W, -1), the neighboring block F may be a block including the sample at coordinates (-1, H-1), and the neighboring block G may be a block including the sample at coordinates (-1, H). In other words, the neighboring block A may be the upper-left corner neighboring block of the current block, the neighboring block B may be the upper neighboring block located above the upper-left sample position of the current block, the neighboring block C may be the left neighboring block located at the left of the upper-left sample position of the current block, the neighboring block D may be the upper neighboring block located above the upper-right sample position of the current block, the neighboring block E may be the upper-right corner neighboring block of the current block, the neighboring block F may be the left neighboring block located at the left of the lower-left sample position of the current block, and the neighboring block G may be the lower-left corner neighboring block of the current block.

Further, if at least one of the candidate motion vector of CP0, the candidate motion vector of CP1, and the candidate motion vector of CP2 is unavailable, the constructed affine MVP candidate may be unavailable.

Thereafter, an affine MVP candidate list may be derived based on the steps described below.

For example, when the number of derived affine MVP candidates is less than 2 and a motion vector of CP0 is available, the encoding apparatus may derive the first affine MVP candidate. Here, the first affine MVP candidate may be an affine MVP candidate including a motion vector of CP0 as a candidate motion vector of the CP.

In addition, for example, when the number of derived affine MVP candidates is less than 2 and a motion vector of CP1 is available, the encoding apparatus may derive a second affine MVP candidate. Here, the second affine MVP candidate may be an affine MVP candidate including a motion vector of CP1 as a candidate motion vector of the CP.

In addition, for example, when the number of derived affine MVP candidates is less than 2 and a motion vector of CP2 is available, the encoding apparatus may derive a third affine MVP candidate. Here, the third affine MVP candidate may be an affine MVP candidate including a motion vector of CP2 as a candidate motion vector of the CP.

In addition, for example, when the number of derived affine MVP candidates is less than 2, the encoding apparatus may derive a fourth affine MVP candidate including, as the candidate motion vector of the CPs, a temporal MVP derived based on a temporal neighboring block of the current block. The temporal neighboring block may refer to a collocated block within a collocated picture corresponding to the current block. The temporal MVP may be derived based on the motion vector of the temporal neighboring block.

In addition, for example, when the number of derived affine MVP candidates is less than 2, the encoding apparatus may derive a fifth affine MVP candidate including a zero motion vector as a candidate motion vector of the CP. A zero motion vector may represent a motion vector whose elements are all zero.

At S2110, the encoding apparatus derives a Control Point Motion Vector Predictor (CPMVP) of a Control Point (CP) of the current block based on the affine MVP candidate list. The encoding apparatus may derive the CPMV of the CP of the current block exhibiting the best RD cost, and may select an affine MVP candidate most similar to the CPMV among the affine MVP candidates as the affine MVP candidate of the current block. The encoding apparatus may derive the CPMVP of the CP of the current block based on an affine MVP candidate selected among affine MVP candidates. More specifically, if the affine MVP candidate includes a candidate motion vector of CP0 and a candidate motion vector of CP1, a candidate motion vector of CP0 of the affine MVP candidate may be derived as the CPMVP of CP0, and a candidate motion vector of CP1 of the affine MVP candidate may be derived as the CPMVP of CP 1. In addition, if the affine MVP candidate includes a candidate motion vector of CP0, a candidate motion vector of CP1, and a candidate motion vector of CP2, a candidate motion vector of CP0 of the affine MVP candidate may be derived as CPMVP of CP0, a candidate motion vector of CP1 of the affine MVP candidate may be derived as CPMVP of CP1, and a candidate motion vector of CP2 of the affine MVP candidate may be derived as CPMVP of CP 2. In addition, if the affine MVP candidate includes a candidate motion vector of CP0 and a candidate motion vector of CP2, a candidate motion vector of CP0 of the affine MVP candidate may be derived as the CPMVP of CP0, and a candidate motion vector of CP2 of the affine MVP candidate may be derived as the CPMVP of CP 2.
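A simplified sketch of this selection is given below. An actual encoder would base the decision on rate-distortion cost; here the sum of absolute motion vector differences is used only as a stand-in, and the function name is illustrative.

```python
def select_affine_mvp_candidate(cpmvs, mvp_candidates):
    """Pick the affine MVP candidate whose CPMVPs are closest to the chosen CPMVs.

    A real encoder would weigh the choice with rate-distortion cost; the sum of
    absolute motion vector differences is used here only as a simple stand-in.
    Returns the candidate index (to be signaled) and the selected CPMVPs.
    """
    def distance(candidate):
        return sum(abs(mv[0] - p[0]) + abs(mv[1] - p[1])
                   for mv, p in zip(cpmvs, candidate))

    best_idx = min(range(len(mvp_candidates)),
                   key=lambda i: distance(mvp_candidates[i]))
    return best_idx, mvp_candidates[best_idx]
```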

The encoding apparatus may encode an affine MVP candidate index indicating a selected affine MVP candidate among affine MVP candidates. The affine MVP candidate index may indicate one affine Motion Vector Predictor (MVP) candidate among affine MVP candidates included in an MVP candidate list of the current block.

At S2120, the encoding apparatus derives the CPMV of the CP of the current block. The encoding apparatus may derive the CPMV of each CP of the current block.

In S2130, the encoding apparatus derives a Control Point Motion Vector Difference (CPMVD) of the CP of the current block based on the CPMVP and the CPMV. The encoding apparatus may derive the CPMVD of the CP of the current block based on the CPMVP and the CPMV of the respective CPs.
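The difference computation may be illustrated as follows; the example values in the comment are arbitrary and serve only to show the per-component subtraction.

```python
def derive_cpmvds(cpmvs, cpmvps):
    """CPMVD = CPMV - CPMVP, computed per control point and per component.
    Only the differences (plus the MVP candidate index) need to be signaled."""
    return [(mv[0] - p[0], mv[1] - p[1]) for mv, p in zip(cpmvs, cpmvps)]

# Example with two control points (four-parameter model):
# derive_cpmvds([(5, -3), (7, -2)], [(4, -3), (6, -1)]) -> [(1, 0), (1, -1)]
```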

At S2140, the encoding apparatus encodes motion prediction information including information on the CPMVD. The encoding apparatus may output motion prediction information including information on the CPMVD in the form of a bitstream. In other words, the encoding apparatus may output image information including motion prediction information in the form of a bitstream. The encoding apparatus may encode information on the CPMVDs of the respective CPs, wherein the motion prediction information may include information on the CPMVDs.

In addition, the motion prediction information may include the affine MVP candidate index. The affine MVP candidate index may indicate the selected affine motion vector predictor (MVP) candidate among the affine MVP candidates included in the affine MVP candidate list of the current block.

Also, as one example, the encoding apparatus may derive a predicted sample of the current block based on the CPMV, derive a residual sample of the current block based on the original sample and the predicted sample of the current block, generate information regarding a residual of the current block based on the residual sample, and encode the information regarding the residual. The image information may include information about the residual.

Further, the bitstream may be transmitted to the decoding apparatus through a network or a (digital) storage medium. Here, the network may include a broadcasting network and/or a communication network, and the digital storage medium may include various types of storage media including USB, SD, CD, DVD, blu-ray, HDD, and SSD.

Fig. 22 illustrates an encoding apparatus performing a video encoding method according to the present disclosure. The method disclosed in fig. 21 may be performed by the encoding device disclosed in fig. 22. More specifically, for example, the predictor of the encoding apparatus may perform the steps of S2100 to S2130 of fig. 21, and the entropy encoder of the encoding apparatus of fig. 22 may perform the step of S2140 of fig. 21. In addition, although not shown in the drawings, the process for deriving the prediction sample of the current block based on the CPMV may be performed by a predictor of the encoding apparatus of fig. 22, the process for deriving the residual sample of the current block based on the original sample and the prediction sample of the current block may be performed by a subtractor of the encoding apparatus of fig. 22, the process for generating information about the residual of the current block based on the residual sample may be performed by a transformer of the encoding apparatus, and the process for encoding the information about the residual may be performed by an entropy encoder of the encoding apparatus of fig. 22.

Fig. 23 illustrates a video decoding method performed by a decoding apparatus according to the present disclosure. The method disclosed in fig. 23 may be performed by the decoding device disclosed in fig. 3. More specifically, for example, the S2300 step of fig. 23 may be performed by an entropy decoder of the decoding apparatus, the S2310 to S2350 steps may be performed by a predictor of the decoding apparatus, and the S2360 step may be performed by an adder of the decoding apparatus. In addition, although not shown in the drawings, a process for obtaining information regarding a residual of the current block through a bitstream may be performed by an entropy decoder of the decoding apparatus, and a process for deriving residual samples of the current block based on the residual information may be performed by an inverse transformer of the decoding apparatus.

At S2300, the decoding apparatus acquires motion prediction information of the current block from the bitstream. The decoding apparatus may obtain image information including motion prediction information from a bitstream.

In addition, for example, the motion prediction information may include information on a Control Point Motion Vector Difference (CPMVD) of a Control Point (CP) of the current block. In other words, the motion prediction information may include information on the CPMVDs of the respective CPs of the current block.

In addition, for example, the motion prediction information may include an affine Motion Vector Predictor (MVP) candidate index of the current block. The affine MVP candidate index may indicate one of affine MVP candidates included in the affine MVP candidate list of the current block.

At S2310, the decoding apparatus constructs an affine MVP candidate list of the current block. The decoding device may construct an affine MVP candidate list including affine MVP candidates for the current block. The maximum number of affine MVP candidates in the affine MVP candidate list may be 2.

Further, as one example, the affine MVP candidate list may include inherited affine MVP candidates. The decoding device may check whether the inherited affine MVP candidate of the current block is available, and may derive the inherited affine MVP candidate if the inherited affine MVP candidate is available. For example, inherited affine MVP candidates may be derived based on neighboring blocks of the current block, and the maximum number of inherited affine MVP candidates may be 2. The availability of neighboring blocks may be checked in a particular order and inherited affine MVP candidates may be derived based on the checked available neighboring blocks. In other words, the availability of neighboring blocks may be checked in a certain order, a first inherited affine MVP candidate may be derived based on the first checked available neighboring blocks, and a second inherited affine MVP candidate may be derived based on the second checked available neighboring blocks. Availability may mean that a neighboring block is encoded based on an affine motion model and a reference picture of the neighboring block is the same as a reference picture of the current block. In other words, the available neighboring blocks may refer to neighboring blocks that are encoded according to an affine motion model (i.e., neighboring blocks to which affine prediction is applied) and whose reference pictures are the same as those of the current block. More specifically, the decoding apparatus may derive a motion vector of a CP of the current block based on an affine motion model of the available neighboring blocks checked first, and derive a first inherited affine MVP candidate including the motion vector as a CPMVP candidate. In addition, the decoding apparatus may derive a motion vector of the CP of the current block based on the affine motion model of the second checked available neighboring block, and derive a second inherited affine MVP candidate including the motion vector as a CPMVP candidate. The affine motion model can be derived as in equation 1 or 3 above.

In addition, in other words, the neighboring blocks may be checked in a specific order to see whether they satisfy a specific condition, and the inherited affine MVP candidates may be derived based on the neighboring blocks confirmed to satisfy the specific condition. In other words, the neighboring blocks may be checked in a specific order to see whether they satisfy the specific condition, a first inherited affine MVP candidate may be derived based on the neighboring block that is first confirmed to satisfy the specific condition, and a second inherited affine MVP candidate may be derived based on the neighboring block that is second confirmed to satisfy the specific condition. More specifically, the decoding apparatus may derive motion vectors of the CPs of the current block based on the affine motion model of the neighboring block that is first confirmed to satisfy the specific condition, and derive the first inherited affine MVP candidate including those motion vectors as CPMVP candidates. In addition, the decoding apparatus may derive motion vectors of the CPs of the current block based on the affine motion model of the neighboring block that is second confirmed to satisfy the specific condition, and derive the second inherited affine MVP candidate including those motion vectors as CPMVP candidates. The affine motion model can be derived as in equation 1 or 3 above. Furthermore, the specific condition may indicate that the neighboring block is coded according to an affine motion model and that the reference picture of the neighboring block is the same as the reference picture of the current block. In other words, a neighboring block satisfying the specific condition is a neighboring block coded according to an affine motion model (i.e., a neighboring block to which affine prediction is applied) whose reference picture is the same as the reference picture of the current block.

Here, for example, the neighboring blocks may include a left neighboring block, an upper-right neighboring block, a lower-left neighboring block, and an upper-left neighboring block of the current block. In this case, the specific order may be an order from the left neighboring block to the lower left neighboring block to the upper right neighboring block to the upper left neighboring block.

Or, for example, the neighboring blocks may include only a left neighboring block and an upper neighboring block. In this case, the specific order may be an order from the left neighboring block to the upper neighboring block.

Or, for example, the neighboring blocks may include a left neighboring block, and may further include an upper neighboring block if the upper neighboring block belongs to the current CTU including the current block. In this case, the specific order may be an order from the left neighboring block to the upper neighboring block. In addition, if the upper neighboring block does not belong to the current CTU, the neighboring blocks may not include the upper neighboring block, and only the left neighboring block may be checked. In other words, if the upper neighboring block of the current block belongs to the current Coding Tree Unit (CTU) including the current block, the upper neighboring block may be used to derive an inherited affine MVP candidate; if the upper neighboring block of the current block does not belong to the current CTU, the upper neighboring block may not be used to derive an inherited affine MVP candidate.

Further, when the size of the current block is W × H and the x-component and the y-component of the upper-left sample position of the current block are 0, the lower-left neighboring block may be a block including a sample at coordinates (-1, H), the left neighboring block may be a block including a sample at coordinates (-1, H-1), the upper-right neighboring block may be a block including a sample at coordinates (W, -1), the upper neighboring block may be a block including a sample at coordinates (W-1, -1), and the upper-left neighboring block may be a block including a sample at coordinates (-1, -1).
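The checking procedure for inherited candidates described above can be summarized with the following Python sketch. It is only an illustration under assumed data structures: the Neighbor class, its fields, and derive_cpmvps_from_neighbor are hypothetical names introduced for this example and are not part of the normative text, and the optional same-CTU restriction for the upper neighboring block is exposed as a flag.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional

@dataclass
class Neighbor:                      # hypothetical container for a neighboring block
    is_affine: bool                  # coded with an affine motion model
    ref_pic: int                     # reference picture identifier
    ctu_addr: int                    # address of the CTU containing the block
    is_above: bool                   # True if this is an upper neighboring block

def is_available_for_inheritance(nb: Optional[Neighbor], cur_ref_pic: int,
                                 cur_ctu_addr: int, restrict_above_to_ctu: bool) -> bool:
    """Available = the block exists, is affine-coded, and uses the same reference picture.
    Optionally, an upper neighbor outside the current CTU is treated as unavailable."""
    if nb is None or not nb.is_affine or nb.ref_pic != cur_ref_pic:
        return False
    if restrict_above_to_ctu and nb.is_above and nb.ctu_addr != cur_ctu_addr:
        return False
    return True

def derive_inherited_candidates(neighbors_in_order: List[Optional[Neighbor]],
                                cur_ref_pic: int, cur_ctu_addr: int,
                                derive_cpmvps_from_neighbor: Callable,
                                restrict_above_to_ctu: bool = False,
                                max_candidates: int = 2) -> List:
    """Scan the neighbors in the given order; each available neighbor contributes one
    inherited candidate whose CPMVPs are extrapolated from its affine motion model
    (equation 1 or 3 of this document)."""
    candidates = []
    for nb in neighbors_in_order:
        if len(candidates) == max_candidates:
            break
        if is_available_for_inheritance(nb, cur_ref_pic, cur_ctu_addr, restrict_above_to_ctu):
            candidates.append(derive_cpmvps_from_neighbor(nb))
    return candidates
```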

In addition, as an example, if the constructed affine MVP candidate is available, the affine MVP candidate list may include the constructed affine MVP candidate. The decoding apparatus may check whether the constructed affine MVP candidate of the current block is available, and may derive the constructed affine MVP candidate if the constructed affine MVP candidate is available. In addition, for example, after deriving the inherited affine MVP candidate, a constructed affine MVP candidate may be derived. If the number of derived affine MVP candidates (i.e., the number of inherited affine MVPs) is less than 2 and the constructed affine MVP candidates are available, the affine MVP candidate list may include the constructed affine MVP candidates. Here, the constructed affine MVP candidate may include a candidate motion vector of the CP. When all candidate motion vectors are available, the constructed affine MVP candidate may be available.

For example, if a four-parameter affine motion model is applied to the current block, the CP of the current block may include CP0 and CP1. If a candidate motion vector of CP0 is available and a candidate motion vector of CP1 is available, the constructed affine MVP candidate may be available, and the affine MVP candidate list may include the constructed affine MVP candidate. Here, the CP0 may represent an upper left position of the current block, and the CP1 may represent an upper right position of the current block.

The constructed affine MVP candidate may include a candidate motion vector of CP0 and a candidate motion vector of CP1. The candidate motion vector of CP0 may be a motion vector of the first block, and the candidate motion vector of CP1 may be a motion vector of the second block.

In addition, the first block may be a block whose reference picture has been first confirmed to be the same as the reference picture of the current block while checking neighboring blocks in the first group in the first specific order. In other words, the candidate motion vector of the CP0 may be a motion vector of the block, first confirmed by checking the neighboring blocks within the first group according to the first specific order, whose reference picture is the same as that of the current block. Here, availability may indicate that the neighboring block exists and is encoded by inter prediction. If the reference picture of the first block within the first group is the same as the reference picture of the current block, the candidate motion vector of the CP0 may be available. In addition, for example, the first group may include a neighboring block A, a neighboring block B, and a neighboring block C, and the first specific order may be an order from the neighboring block A to the neighboring block B, and then to the neighboring block C.

In addition, the second block may be a block whose reference picture has been first confirmed to be the same as the reference picture of the current block while checking neighboring blocks in the second group in the second specific order. Here, if the reference picture of the second block within the second group is the same as the reference picture of the current block, the candidate motion vector of the CP1 may be available. In addition, for example, the second group may include a neighboring block D and a neighboring block E, and the second specific order may be an order from the neighboring block D to the neighboring block E.

Also, when the size of the current block is W × H, the x component of the upper-left sample position of the current block is 0, and the y component thereof is 0, the neighboring block A may be a block including a sample at coordinates (-1, -1), the neighboring block B may be a block including a sample at coordinates (0, -1), the neighboring block C may be a block including a sample at coordinates (-1, 0), the neighboring block D may be a block including a sample at coordinates (W-1, -1), and the neighboring block E may be a block including a sample at coordinates (W, -1).

Further, if at least one of the candidate motion vector of CP0 and the candidate motion vector of CP1 is unavailable, the constructed affine MVP candidate may not be available.
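The per-group scan described for CP0 and CP1 can be illustrated with a short Python helper. This is only a sketch under assumed data structures (the is_inter, ref_pic, and mv fields are hypothetical names), not the normative derivation.

```python
from typing import List, Optional, Tuple

def first_same_ref_motion_vector(group: List, cur_ref_pic: int) -> Optional[Tuple[int, int]]:
    """Scan the blocks of one group in order and return the motion vector of the first
    inter-coded block whose reference picture equals that of the current block.
    Returns None when no such block exists (the CP candidate is then unavailable)."""
    for nb in group:
        if nb is not None and nb.is_inter and nb.ref_pic == cur_ref_pic:
            return nb.mv
    return None

# For the four-parameter model described above:
#   mv_cp0 = first_same_ref_motion_vector([block_A, block_B, block_C], cur_ref_pic)
#   mv_cp1 = first_same_ref_motion_vector([block_D, block_E], cur_ref_pic)
# The constructed candidate is available only when both calls return a motion vector.
```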

Or, for example, if a six-parameter affine motion model is applied to the current block, the CP of the current block may include CP0, CP1, and CP2. If a candidate motion vector of CP0 is available, a candidate motion vector of CP1 is available, and a candidate motion vector of CP2 is available, the constructed affine MVP candidate may be available and the affine MVP candidate list may include the constructed affine MVP candidate. Here, the CP0 may represent an upper left position of the current block, the CP1 may represent an upper right position of the current block, and the CP2 may represent a lower left position of the current block.

The constructed affine MVP candidates may include a candidate motion vector of CP0, a candidate motion vector of CP1, and a candidate motion vector of CP2. The candidate motion vector of CP0 may be a motion vector of the first block, the candidate motion vector of CP1 may be a motion vector of the second block, and the candidate motion vector of CP2 may be a motion vector of the third block.

In addition, the first block may be a block whose reference picture has been first determined to be the same as the reference picture of the current block while checking neighboring blocks in the first group in the first specific order. Here, if the reference picture of the first block within the first group is the same as the reference picture of the current block, the candidate motion vector of the CP0 may be available. In addition, for example, the first group may include a neighboring block A, a neighboring block B, and a neighboring block C, and the first specific order may be an order from the neighboring block A to the neighboring block B, and then to the neighboring block C.

In addition, the second block may be a block whose reference picture has been first confirmed to be the same as the reference picture of the current block while checking neighboring blocks in the second group in the second specific order. Here, if the reference picture of the second block within the second group is the same as the reference picture of the current block, the candidate motion vector of the CP1 may be available. In addition, for example, the second group may include a neighboring block D and a neighboring block E, and the second specific order may be an order from the neighboring block D to the neighboring block E.

In addition, the third block may be a block whose reference picture has been first determined to be the same as the reference picture of the current block while checking neighboring blocks in the third group in a third specific order. Here, if the reference picture of the third block within the third group is the same as the reference picture of the current block, a candidate motion vector of the CP2 may be available. In addition, for example, the third group may include a neighboring block F and a neighboring block G, and the third specific order may be an order from the neighboring block F to the neighboring block G.

Further, when the size of the current block is W × H and the x-component and the y-component of the upper-left sample position of the current block are 0, the neighboring block A may be a block including a sample at coordinates (-1, -1), the neighboring block B may be a block including a sample at coordinates (0, -1), the neighboring block C may be a block including a sample at coordinates (-1, 0), the neighboring block D may be a block including a sample at coordinates (W-1, -1), the neighboring block E may be a block including a sample at coordinates (W, -1), the neighboring block F may be a block including a sample at coordinates (-1, H-1), and the neighboring block G may be a block including a sample at coordinates (-1, H). In other words, the neighboring block A may be the upper-left corner neighboring block of the current block, the neighboring block B may be the upper neighboring block adjacent to the upper-left corner of the current block, the neighboring block C may be the left neighboring block adjacent to the upper-left corner of the current block, the neighboring block D may be the upper neighboring block adjacent to the upper-right corner of the current block, the neighboring block E may be the upper-right corner neighboring block of the current block, the neighboring block F may be the left neighboring block adjacent to the lower-left corner of the current block, and the neighboring block G may be the lower-left corner neighboring block of the current block.

Further, if at least one of the candidate motion vector of CP0, the candidate motion vector of CP1, and the candidate motion vector of CP2 is unavailable, the constructed affine MVP candidate may be unavailable.
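Putting the group scans together, the all-or-nothing availability rule of the constructed candidate could be expressed roughly as in the sketch below. All names are hypothetical, and the scan helper is the same one shown in the previous sketch, repeated here so the snippet stands on its own.

```python
def first_same_ref_motion_vector(group, cur_ref_pic):
    """As in the earlier sketch: the motion vector of the first inter-coded block in the
    group whose reference picture equals that of the current block, or None."""
    for nb in group:
        if nb is not None and nb.is_inter and nb.ref_pic == cur_ref_pic:
            return nb.mv
    return None

def derive_constructed_candidate(groups, cur_ref_pic, six_param_model):
    """groups = (group0, group1, group2) = ([A, B, C], [D, E], [F, G]).
    Returns the per-CP candidate motion vectors of the constructed candidate,
    or None when any required CP motion vector is unavailable."""
    num_cps = 3 if six_param_model else 2
    cp_motion_vectors = []
    for group in groups[:num_cps]:
        mv = first_same_ref_motion_vector(group, cur_ref_pic)
        if mv is None:
            return None      # one missing CP makes the whole constructed candidate unavailable
        cp_motion_vectors.append(mv)
    return cp_motion_vectors
```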

Further, no pruning check process may be performed between the inherited affine MVP candidate and the constructed affine MVP candidate. The pruning checking process may represent a process of checking whether the constructed affine MVP candidate is identical to the inherited affine MVP candidate, and if they are found to be identical, not deriving the constructed affine MVP candidate.

Thereafter, an affine MVP candidate list may be derived based on the steps described below.

For example, when the number of derived affine MVP candidates is less than 2 and a motion vector of CP0 is available, the decoding apparatus may derive the first affine MVP candidate. Here, the first affine MVP candidate may be an affine MVP candidate including a motion vector of CP0 as a candidate motion vector of the CP.

In addition, for example, when the number of derived affine MVP candidates is less than 2 and a motion vector of CP1 is available, the decoding apparatus may derive a second affine MVP candidate. Here, the second affine MVP candidate may be an affine MVP candidate including a motion vector of CP1 as a candidate motion vector of the CP.

In addition, for example, when the number of derived affine MVP candidates is less than 2 and a motion vector of CP2 is available, the decoding apparatus may derive a third affine MVP candidate. Here, the third affine MVP candidate may be an affine MVP candidate including a motion vector of CP2 as a candidate motion vector of the CP.

In addition, for example, when the number of derived affine MVP candidates is less than 2, the decoding apparatus may derive a fourth affine MVP candidate including, as a candidate motion vector of the CP, a temporal MVP derived based on a temporal neighboring block of the current block. The temporal neighboring block may refer to a collocated block, corresponding to the current block, within a collocated picture. The temporal MVP may be derived based on the motion vector of the temporal neighboring block.

In addition, for example, when the number of derived affine MVP candidates is less than 2, the decoding apparatus may derive a fifth affine MVP candidate including a zero motion vector as a candidate motion vector of the CP. A zero motion vector may represent a motion vector whose elements are all zero.
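The fall-back rules in the preceding paragraphs amount to padding the list until it holds two entries. The sketch below is a rough illustration only: cp_motion_vectors stands for whichever CP motion vectors were found during the constructed-candidate derivation, each fall-back candidate repeats a single motion vector for every CP, and all names are hypothetical.

```python
def pad_affine_mvp_list(mvp_list, cp_motion_vectors, temporal_mvp, num_cps, max_size=2):
    """Append first/second/third, then temporal, then zero-MV fall-back candidates
    until the affine MVP candidate list holds max_size entries."""
    def uniform_candidate(mv):
        return [mv] * num_cps            # the same MV is used as the candidate of every CP

    for cp_idx in range(num_cps):        # first, second and third fall-back candidates
        if len(mvp_list) >= max_size:
            return mvp_list
        if cp_idx in cp_motion_vectors:  # CP0, CP1, CP2 motion vectors, when available
            mvp_list.append(uniform_candidate(cp_motion_vectors[cp_idx]))

    if len(mvp_list) < max_size and temporal_mvp is not None:   # fourth fall-back candidate
        mvp_list.append(uniform_candidate(temporal_mvp))

    while len(mvp_list) < max_size:                             # fifth fall-back: zero motion vector
        mvp_list.append(uniform_candidate((0, 0)))
    return mvp_list
```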

At S2320, the decoding apparatus derives a Control Point Motion Vector Predictor (CPMVP) of a Control Point (CP) of the current block based on the affine MVP candidate list.

The decoding device may select a specific affine MVP candidate among affine MVP candidates included in the affine MVP candidate list, and derive the selected affine MVP candidate as the CPMVP of the CP of the current block. For example, the decoding apparatus may obtain an affine MVP candidate index of the current block from the bitstream, and derive, among affine MVP candidates included in the affine MVP candidate list, an affine MVP candidate indicated by the affine MVP candidate index as the CPMVP of the CP of the current block. More specifically, if the affine MVP candidate includes a candidate motion vector of CP0 and a candidate motion vector of CP1, a candidate motion vector of CP0 of the affine MVP candidate may be derived as the CPMVP of CP0, and a candidate motion vector of CP1 of the affine MVP candidate may be derived as the CPMVP of CP1. In addition, if the affine MVP candidate includes a candidate motion vector of CP0, a candidate motion vector of CP1, and a candidate motion vector of CP2, a candidate motion vector of CP0 of the affine MVP candidate may be derived as the CPMVP of CP0, a candidate motion vector of CP1 of the affine MVP candidate may be derived as the CPMVP of CP1, and a candidate motion vector of CP2 of the affine MVP candidate may be derived as the CPMVP of CP2. In addition, if the affine MVP candidate includes a candidate motion vector of CP0 and a candidate motion vector of CP2, a candidate motion vector of CP0 of the affine MVP candidate may be derived as the CPMVP of CP0, and a candidate motion vector of CP2 of the affine MVP candidate may be derived as the CPMVP of CP2.

At S2330, the decoding apparatus derives a Control Point Motion Vector Difference (CPMVD) of the CP of the current block based on the motion prediction information. The motion prediction information may include information on a CPMVD of each CP, and the decoding apparatus may derive the CPMVD of each CP of the current block based on the information on the CPMVD of each CP.

At S2340, the decoding apparatus derives a Control Point Motion Vector (CPMV) of the CP of the current block based on the CPMVP and the CPMVD. The decoding apparatus may derive the CPMV of each CP based on the CPMVP and the CPMVD of each CP. For example, the decoding apparatus may derive the CPMV of each CP by adding the CPMVP and the CPMVD of the CP.
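As a minimal illustration of this step (assuming two-dimensional motion vectors represented as integer pairs), the per-CP addition can be written as:

```python
def derive_cpmvs(cpmvps, cpmvds):
    """Add each CP's motion vector difference to its predictor, component-wise."""
    return [(px + dx, py + dy) for (px, py), (dx, dy) in zip(cpmvps, cpmvds)]

# Example: derive_cpmvs([(4, -2), (6, 0)], [(1, 1), (-2, 3)]) returns [(5, -1), (4, 3)].
```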

At S2350, the decoding apparatus derives prediction samples for the current block based on the CPMV. The decoding device may derive a motion vector of the current block in units of sub-blocks or samples based on the CPMV. In other words, the decoding apparatus may derive a motion vector for each sub-block or each sample of the current block based on the CPMV. The motion vector in units of sub-blocks or samples may be derived as in equation 1 or equation 3 above. The motion vectors may be referred to as affine Motion Vector Fields (MVFs) or motion vector arrays.
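For reference, the sub-block motion vector derivation can be sketched as below using the commonly used four- and six-parameter affine model forms, which is what equations 1 and 3 above are understood to express; the exact normative form is given by those equations. The floating-point arithmetic and sub-block center sampling here are simplifications of the fixed-point operations an actual decoder would use.

```python
def affine_mv_field(cpmvs, width, height, sub_w=4, sub_h=4):
    """Derive one motion vector per sub-block of the current block from its CPMVs.
    cpmvs holds two CPMVs (four-parameter model) or three CPMVs (six-parameter model),
    given for the top-left, top-right and, if present, bottom-left control points."""
    (v0x, v0y), (v1x, v1y) = cpmvs[0], cpmvs[1]
    dhx, dhy = (v1x - v0x) / width, (v1y - v0y) / width        # horizontal MV gradient
    if len(cpmvs) == 3:                                        # six-parameter model
        v2x, v2y = cpmvs[2]
        dvx, dvy = (v2x - v0x) / height, (v2y - v0y) / height  # vertical MV gradient
    else:                                                      # four-parameter model
        dvx, dvy = -dhy, dhx

    field = []                                                 # affine motion vector field (MVF)
    for y in range(sub_h // 2, height, sub_h):                 # sub-block center positions
        row = []
        for x in range(sub_w // 2, width, sub_w):
            row.append((v0x + dhx * x + dvx * y,
                        v0y + dhy * x + dvy * y))
        field.append(row)
    return field
```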

The decoding apparatus may derive prediction samples of the current block based on the motion vector in units of subblocks or samples. The decoding apparatus may derive a reference region within a reference picture based on a motion vector in units of subblocks or samples, and generate prediction samples of the current block based on reconstructed samples within the reference region.

At S2360, the decoding apparatus generates a reconstructed picture of the current block based on the derived prediction samples. Depending on the prediction mode, the decoding apparatus may directly use the prediction samples as reconstructed samples, or may generate reconstructed samples by adding residual samples to the prediction samples. When residual samples for the current block exist, the decoding apparatus may obtain information on the residual of the current block from the bitstream. The information on the residual may include transform coefficients of the residual samples. The decoding apparatus may derive the residual samples (or a residual sample array) of the current block based on the residual information. The decoding apparatus may generate reconstructed samples based on the prediction samples and the residual samples, and derive a reconstructed block or a reconstructed picture based on the reconstructed samples. Thereafter, as described above, the decoding apparatus may apply an in-loop filtering process such as deblocking filtering and/or SAO processing to the reconstructed picture, as necessary, to improve subjective/objective image quality.
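A minimal sketch of the reconstruction itself, assuming 8-bit samples stored as nested lists, could be:

```python
def reconstruct_block(pred_samples, residual_samples=None, max_val=255):
    """Reconstructed sample = prediction sample (+ residual sample when present),
    clipped to the valid sample range."""
    if residual_samples is None:
        return [row[:] for row in pred_samples]       # prediction used directly
    return [[min(max_val, max(0, p + r)) for p, r in zip(p_row, r_row)]
            for p_row, r_row in zip(pred_samples, residual_samples)]
```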

Fig. 24 illustrates a decoding apparatus performing a video decoding method according to the present disclosure. The method disclosed in fig. 23 may be performed by the decoding apparatus disclosed in fig. 24. More specifically, for example, the entropy decoder of the decoding apparatus of fig. 24 may perform the S2300 step of fig. 23, the predictor of the decoding apparatus of fig. 24 may perform the S2310 to S2350 steps, and the adder of the decoding apparatus of fig. 24 may perform the S2360 step of fig. 23. In addition, although not shown in the drawings, a process for obtaining image information including information regarding a residual of the current block through a bitstream may be performed by an entropy decoder of the decoding apparatus of fig. 24, and a process for deriving residual samples of the current block based on the residual information may be performed by an inverse transformer of the decoding apparatus of fig. 24.

According to the present disclosure, the efficiency of affine motion prediction based video encoding can be improved.

In addition, according to the present disclosure, in deriving the affine MVP candidate list, the constructed affine MVP candidate may be added only when the candidate motion vectors of the CP of the constructed affine MVP candidate are all available, whereby the complexity of the process for deriving the constructed affine MVP candidate and the process for constructing the affine MVP candidate list may be reduced, and the encoding efficiency may be improved.

In addition, according to the present disclosure, in deriving the affine MVP candidate list, additional affine MVP candidates may be derived based on a candidate motion vector of a CP derived through a process for deriving a constructed affine MVP candidate, whereby the complexity of the process for constructing the affine MVP candidate list may be reduced and encoding efficiency may be improved.

In addition, according to the present disclosure, in deriving an inherited affine MVP candidate, only when an upper neighboring block is included in a current CTU, the inherited affine MVP candidate can be derived by using the upper neighboring block, whereby the storage amount of a line buffer for affine prediction can be reduced, and hardware cost can be minimized.

In the above embodiments, although the method is described based on the flowchart using a series of steps or blocks, the present disclosure does not limit the specific order of the steps, and some steps may be performed in a different order from the rest of the steps or simultaneously with the rest of the steps. Further, those skilled in the art will understand that the steps shown in the flowcharts are not exclusive and may further include other steps or may delete one or more steps in the flowcharts without affecting the technical scope of the present disclosure.

Embodiments according to the present disclosure may be implemented and performed on a processor, microprocessor, controller, or chip. For example, the functional elements shown in each figure may be implemented and executed on a computer, processor, microprocessor, controller, or chip. In this case, information for implementation (e.g., information about instructions) or algorithms may be stored in the digital storage medium.

In addition, the decoding apparatus and the encoding apparatus to which the embodiments of the present disclosure are applied may include a multimedia broadcast transmitting and receiving apparatus, a mobile communication terminal, a home theater video device, a digital cinema video device, a surveillance camera, a video communication device, a real-time communication device for video communication, a mobile streaming device, a storage medium, a camcorder, a video on demand (VoD) service providing device, an over-the-top (OTT) video device, an internet streaming service providing device, a 3D video device, a video telephony device, a vehicle terminal (e.g., a vehicle terminal, an airplane terminal, and a ship terminal), and a medical video device; and may be used to process video signals or data signals. For example, OTT video devices may include game consoles, blu-ray players, internet-connected televisions, home theater systems, smart phones, tablets, Digital Video Recorders (DVRs), and the like.

In addition, the processing method to which the embodiment of the present disclosure is applied may be generated in the form of a program executed by a computer and may be stored in a computer-readable recording medium. The multimedia data having the data structure according to the present disclosure may also be stored in a computer-readable recording medium. The computer-readable recording medium includes all types of storage devices and distributed storage devices in which computer-readable data is stored. The computer-readable recording medium may include, for example, a blu-ray disc (BD), a Universal Serial Bus (USB), a ROM, a PROM, an EPROM, an EEPROM, a RAM, a CD-ROM, a magnetic tape, a floppy disk, and an optical data storage device. In addition, the computer-readable recording medium includes media implemented in the form of carrier waves (e.g., transmission via the internet). In addition, the bitstream generated according to the encoding method may be stored in a computer-readable recording medium or transmitted through a wired/wireless communication network.

In addition, the embodiments of the present disclosure may be implemented as a computer program product in the form of program codes, and the program codes may be executed by a computer according to the embodiments of the present disclosure. The program code may be stored on a computer readable carrier.

Fig. 25 illustrates a content flow system structure to which an embodiment of the present disclosure is applied.

A content streaming system to which embodiments of the present disclosure are applied may mainly include an encoding server, a streaming server, a web server, a media storage, a user device, and a multimedia input device.

The encoding server compresses content input from a multimedia input device such as a smart phone, a camera, or a camcorder into digital data to generate a bitstream and transmits the bitstream to the streaming server. As another example, if a multimedia input device such as a smartphone, camera, or camcorder directly generates a bitstream, the encoding server may be omitted.

The bitstream may be generated by applying the encoding method or the method for generating the bitstream of the embodiments of the present disclosure, and the streaming server may temporarily store the bitstream while transmitting or receiving the bitstream.

The streaming server transmits multimedia data to the user device through the web server based on the user request, and the web server plays a role of informing the user which services are available. If the user requests a desired service from the web server, the web server transmits the request to the streaming server, and then the streaming server transmits multimedia data to the user. In this case, the content streaming system may include a separate control server. In this case, the control server serves to control commands/responses between devices within the content streaming system.

The streaming server may receive content from the media store and/or the encoding server. For example, when receiving content from an encoding server, the content may be received in real time. In this case, in order to provide a smooth streaming service, the streaming server may store the bit stream for a predetermined period of time.

Examples of user devices may include mobile phones, smart phones, laptop computers, digital broadcast terminals, Personal Digital Assistants (PDAs), Portable Multimedia Players (PMPs), navigation terminals, touch screen PCs, tablet PCs, ultrabooks, wearable devices (e.g., smart watches, smart glasses, and Head Mounted Displays (HMDs)), digital TVs, desktop computers, and digital signage. Each individual server within the content streaming system may operate as a distributed server, in which case the data received from each server may be processed in a distributed manner.
