Image/video encoding method, apparatus, system, and computer-readable storage medium

Document No.: 196258. Publication date: 2021-11-02.

Abstract: This application, Image/video encoding method, apparatus, system, and computer-readable storage medium, was created by 江东, 林聚财, and 殷俊 on 2021-06-24. The application discloses an image/video encoding method, apparatus, system, and computer-readable storage medium. The image/video encoding method comprises: acquiring a block to be coded in a current coding frame, and acquiring adjacent coded pixels of the block to be coded; setting a search template based on the block to be coded and/or the adjacent coded pixels; constructing a motion information candidate list of the block to be coded, wherein the motion information candidate list comprises several pieces of motion information; determining a search starting point by using the co-located block of the block to be coded in a reference frame or the motion information in the motion information candidate list, and searching a preset search range for the optimal search block matching the search template; obtaining an optimal matching block according to the optimal search block; and coding the block to be coded by using the optimal matching block to obtain a code stream of the block to be coded. By using a search template to search the reference frame for the matching block of the block to be coded, the application can further improve the compression rate of inter-frame coding.

1. An image/video encoding method, characterized in that the image/video encoding method comprises:

acquiring a block to be coded in a current coding frame and acquiring adjacent coded pixels of the block to be coded;

setting a search template based on the block to be coded and/or the adjacent coded pixels;

constructing a motion information candidate list of the block to be coded, wherein the motion information candidate list comprises a plurality of pieces of motion information;

determining a search starting point by using the co-located block of the block to be coded in the reference frame or the motion information of the motion information candidate list, and searching for an optimal search block matched with the search template in a preset search range;

obtaining an optimal matching block according to the optimal search block;

and coding the block to be coded by using the optimal matching block to obtain a code stream of the block to be coded.

2. The image/video encoding method according to claim 1, wherein

setting the search template based on the block to be coded and/or the adjacent coded pixels comprises:

taking the adjacent coded pixels corresponding to the block to be coded as the search template;

wherein the adjacent coded pixels comprise upper adjacent coded pixels and/or left adjacent coded pixels.

3. The image/video encoding method according to claim 1, wherein

setting the search template based on the block to be coded and/or the adjacent coded pixels comprises:

taking the block to be coded together with the adjacent coded pixels corresponding to the block to be coded as the search template;

wherein the adjacent coded pixels comprise upper adjacent coded pixels and/or left adjacent coded pixels.

4. The image/video encoding method according to claim 1, wherein

determining the search starting point by using the co-located block of the block to be coded in the reference frame or the motion information in the motion information candidate list comprises:

selecting corresponding motion information from the motion information candidate list, and acquiring the corresponding block for the corresponding motion information;

and taking the corresponding block as the search starting point.

5. The image/video encoding method according to claim 4, wherein

selecting the corresponding motion information from the motion information candidate list and acquiring the corresponding block for the corresponding motion information comprises:

selecting at least two pieces of candidate motion information from the motion information candidate list;

calculating an average value of the at least two pieces of candidate motion information as the corresponding motion information;

and acquiring the corresponding block for the corresponding motion information.

6. The image/video encoding method according to claim 5, wherein

calculating the average value of the at least two pieces of candidate motion information as the corresponding motion information comprises:

calculating the average value of the at least two pieces of candidate motion information;

and taking the average value, together with the reference frame corresponding to one of the at least two pieces of candidate motion information, as the corresponding motion information.

7. The image/video encoding method according to claim 1, wherein

determining the search starting point by using the co-located block of the block to be coded in the reference frame or the motion information in the motion information candidate list comprises:

acquiring the optimal motion information in the motion information candidate list through cost comparison in a preset inter-frame prediction mode;

acquiring the corresponding block for the optimal motion information;

and taking the corresponding block as the search starting point.

8. The image/video encoding method according to claim 1, wherein

searching the preset search range for the optimal search block matching the search template comprises:

searching the preset search range for the optimal search block matching the search template by using a preset matching criterion, wherein the preset matching criterion comprises at least one of a minimum sum of absolute differences criterion, a minimum mean square error criterion, a minimum mean absolute error criterion, and a minimum threshold difference count criterion.

9. The image/video encoding method according to claim 1, wherein

the image/video encoding method further comprises:

setting a first syntax element, wherein the first syntax element identifies the image/video encoding method according to claim 1;

setting a second syntax element, wherein the second syntax element comprises at least one of: a search template selection mode, a search starting point determination mode, a search template index, and a search matching mode.

10. An encoding device, characterized by comprising a current acquisition module, a search template module, a candidate list module, a search matching module, and an encoding module;

the current acquisition module is configured to acquire a block to be coded in a current coding frame and acquire adjacent coded pixels of the block to be coded;

the search template module is configured to set a search template based on the block to be coded and/or the adjacent coded pixels;

the candidate list module is configured to construct a motion information candidate list of the block to be coded, wherein the motion information candidate list comprises a plurality of pieces of motion information;

the search matching module is configured to determine a search starting point by using the co-located block of the block to be coded in the reference frame or the motion information in the motion information candidate list, and to search a preset search range for the optimal search block matching the search template;

the search matching module is further configured to obtain an optimal matching block according to the optimal search block;

and the encoding module is configured to code the block to be coded by using the optimal matching block to obtain a code stream of the block to be coded.

11. A codec system comprising a processor, a memory coupled to the processor, wherein,

the memory stores program instructions;

the processor is to execute the memory-stored program instructions to implement:

acquiring a block to be coded in a current coding frame and acquiring adjacent coded pixels of the block to be coded;

setting a search template based on the block to be coded and/or the adjacent coded pixels;

constructing a motion information candidate list of the block to be coded, wherein the motion information candidate list comprises a plurality of pieces of motion information;

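determining a search starting point by using the co-located block of the block to be coded in the reference frame or the motion information in the motion information candidate list, and searching a preset search range for the optimal search block matching the search template;

obtaining an optimal matching block according to the optimal search block;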

and coding the block to be coded by using the optimal matching block to obtain a code stream of the block to be coded.

12. A computer-readable storage medium, wherein the storage medium stores program instructions that, when executed, implement:

acquiring a block to be coded in a current coding frame and acquiring adjacent coded pixels of the block to be coded;

setting a search template based on the block to be coded and/or the adjacent coded pixels;

constructing a motion information candidate list of the block to be coded, wherein the motion information candidate list comprises a plurality of pieces of motion information;

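determining a search starting point by using the co-located block of the block to be coded in the reference frame or the motion information in the motion information candidate list, and searching a preset search range for the optimal search block matching the search template;

obtaining an optimal matching block according to the optimal search block;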

and coding the block to be coded by using the optimal matching block to obtain a code stream of the block to be coded.

Technical Field

The present application relates to the field of video coding technologies, and in particular, to an image/video coding method, apparatus, system, and computer-readable storage medium.

Background

Video coding technology compresses video to reduce its data volume, which lowers both the network bandwidth needed for transmission and the storage space required. Video coding standards generally comprise intra-frame prediction, inter-frame prediction, transformation, quantization, loop filtering, entropy coding, and other stages to achieve data compression.

The current inter-frame prediction technology does not fully utilize the spatial correlation of adjacent blocks in the inter-frame prediction process, and does not fully consider the content correlation of video images, thereby affecting the compression rate of inter-frame coding.

Disclosure of Invention

The application provides an image/video coding method, device, system and computer readable storage medium.

In order to solve the above technical problem, a first technical solution provided by the present application is: there is provided an image/video encoding method including: acquiring a block to be coded in a current coding frame and acquiring adjacent coded pixels of the block to be coded;

setting a search template based on the block to be coded and/or the adjacent coded pixels;

constructing a motion information candidate list of the block to be coded, wherein the motion information candidate list comprises a plurality of pieces of motion information;

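determining a search starting point by using the co-located block of the block to be coded in the reference frame or the motion information in the motion information candidate list, and searching a preset search range for the optimal search block matching the search template;

obtaining an optimal matching block according to the optimal search block;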

and coding the block to be coded by using the optimal matching block to obtain a code stream of the block to be coded.

In order to solve the above technical problem, a second technical solution provided by the present application is: providing an encoding device, wherein the encoding device comprises a current acquisition module, a search template module, a candidate list module, a search matching module and an encoding module;

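the current acquisition module is configured to acquire a block to be coded in a current coding frame and acquire adjacent coded pixels of the block to be coded;

the search template module is configured to set a search template based on the block to be coded and/or the adjacent coded pixels;

the candidate list module is configured to construct a motion information candidate list of the block to be coded, wherein the motion information candidate list comprises a plurality of pieces of motion information;

the search matching module is configured to determine a search starting point by using the co-located block of the block to be coded in the reference frame or the motion information in the motion information candidate list, and to search a preset search range for the optimal search block matching the search template;

the search matching module is further configured to obtain an optimal matching block according to the optimal search block;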

and the coding module is used for coding the block to be coded by using the optimal matching block to obtain a code stream of the block to be coded.

In order to solve the above technical problem, a third technical solution provided by the present application is: providing a coding and decoding system, wherein the coding and decoding system comprises a processor and a memory connected with the processor, wherein the memory stores program instructions; the processor is to execute the memory-stored program instructions to implement: acquiring a block to be coded in a current coding frame and acquiring adjacent coded pixels of the block to be coded;

setting a search template based on the block to be coded and/or the adjacent coded pixels;

constructing a motion information candidate list of the block to be coded, wherein the motion information candidate list comprises a plurality of pieces of motion information;

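determining a search starting point by using the co-located block of the block to be coded in the reference frame or the motion information in the motion information candidate list, and searching a preset search range for the optimal search block matching the search template;

obtaining an optimal matching block according to the optimal search block;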

and coding the block to be coded by using the optimal matching block to obtain a code stream of the block to be coded.

In order to solve the above technical problem, a fourth technical solution provided by the present application is: there is provided a computer readable storage medium storing program instructions that when executed implement: acquiring a block to be coded in a current coding frame and acquiring adjacent coded pixels of the block to be coded;

setting a search template based on the block to be coded and/or the adjacent coded pixels;

constructing a motion information candidate list of the block to be coded, wherein the motion information candidate list comprises a plurality of pieces of motion information;

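determining a search starting point by using the co-located block of the block to be coded in the reference frame or the motion information in the motion information candidate list, and searching a preset search range for the optimal search block matching the search template;

obtaining an optimal matching block according to the optimal search block;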

and coding the block to be coded by using the optimal matching block to obtain a code stream of the block to be coded.

The image/video coding method provided by the application comprises the steps of obtaining a block to be coded in a current coding frame and obtaining adjacent coded pixels of the block to be coded; setting a search template based on the block to be coded and/or the adjacent coded pixels; constructing a motion information candidate list of a block to be coded, wherein the motion information candidate list comprises a plurality of pieces of motion information; determining a search starting point by using the co-located block of the block to be coded in the reference frame or the motion information of the motion information candidate list, and searching for an optimal search block matched with the search template in a preset search range; obtaining an optimal matching block according to the optimal searching block; and coding the block to be coded by using the optimal matching block to obtain a code stream of the block to be coded. According to the method and the device, the matching block of the block to be coded is searched in the reference frame by using the search template, so that the compression rate of inter-frame coding can be further improved.

Drawings

To illustrate the technical solutions in the embodiments of the present application more clearly, the drawings needed for describing the embodiments are briefly introduced below. The drawings described below show only some embodiments of the present application; those skilled in the art can obtain other drawings from them without creative effort.

FIG. 1 is a schematic flowchart illustrating an embodiment of an image/video encoding method provided in the present application;

FIG. 2 is a schematic diagram illustrating an embodiment of a search template provided herein;

FIG. 3 is a schematic structural diagram of an embodiment of a search strategy provided herein;

FIG. 4 is a schematic structural diagram of another embodiment of a search strategy provided herein;

FIG. 5 is a schematic structural diagram of an embodiment of an encoding apparatus provided in the present application;

FIG. 6 is a schematic block diagram of an embodiment of a codec system of the present application;

FIG. 7 is a schematic structural diagram of a computer-readable storage medium of the present application.

Detailed Description

The technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application, and it is obvious that the described embodiments are only a part of the embodiments of the present application, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present application.

As the demand for higher video definition grows, the data volume of video images keeps increasing. The main function of video coding is to compress video pixel data (RGB, YUV, etc.) into a video code stream, thereby reducing the data volume of the video and, in turn, the network bandwidth needed for transmission and the storage space.

Video codec standards include H.264/AVC, H.265/HEVC, H.266/VVC, VP8, VP9, AV1, and AVS. The main purpose of a video codec is to compress the captured video signal into data in one of these standard formats for transmission or storage. When video coding is applied in real scenarios, rate control plays a crucial role in the video encoder: it adjusts the encoder's output bit rate toward a target under limited communication bandwidth or storage space, which prevents coded video frames from being too large or too small.

The present application mainly relates to inter prediction coding. The basic principle of inter prediction is as follows. The luminance and chrominance values of temporally adjacent image pixels are generally close and strongly correlated. Inter prediction uses methods such as motion search to find, in a reference frame, the matching block closest to the current block, and records the motion information between the current block and the matching block, such as the motion vector (MV), the prediction direction, and the reference frame index. The motion information is encoded and transmitted to the decoding end. At the decoding end, once the decoder has parsed the motion information of the current block from the corresponding syntax elements, it can locate the matching block of the current block and copy or compute its pixel values into the current block; the result is the inter prediction value of the current block. This saves data and achieves video compression.

Currently, inter prediction modes are mainly classified into the AMVP mode and the Merge mode. The AMVP prediction process comprises three parts: motion information candidate list construction, motion search, and motion compensation. The Merge prediction process comprises two parts: motion information candidate list construction and motion compensation. Each is described below.

(1) AMVP mode prediction

1) Motion information candidate list construction: a motion information candidate list of length 2 is constructed by using spatial, temporal, history-based, and zero motion vector information, in that order.

2) Motion search: starting from the motion information in the motion information candidate list, a search is performed in the reference frame according to a search rule, such as TZ search, to find the optimal motion information.

3) Motion compensation: the optimal predicted value of the current block is obtained from the optimal motion information by pixel interpolation.

(2) Merge mode prediction

1) Motion information candidate list construction: a motion information candidate list of length 6 is constructed by using spatial, temporal, history-based, and zero motion vector information, in that order.

2) Motion compensation: the motion information in the candidate list is traversed, the predicted value of the current block is obtained for each entry by pixel interpolation, and the rate-distortion costs of the entries are compared. The prediction block with the minimum rate-distortion cost is the optimal predicted value of the current block.

(3) Optimal prediction mode selection

After the current block is predicted by all modes, a prediction block is obtained in each mode, the mode with the minimum cost is found as the optimal prediction mode through the calculation of rate distortion cost Rdcost, and the prediction block corresponding to the optimal prediction mode is the optimal prediction block. The mathematical relationship for the Rdcost calculation is as follows:

Rdcost = D + λ * R

where D and R denote the distortion and the number of bits, respectively, when a given prediction mode is used, and λ is the Lagrange multiplier.
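The mode decision described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names `rd_cost` and `select_best_mode` are hypothetical, and the distortion, bit count, and λ values are made up for the example rather than taken from the application.

```python
def rd_cost(distortion, bits, lam):
    """Rate-distortion cost: Rdcost = D + lambda * R."""
    return distortion + lam * bits


def select_best_mode(candidates, lam):
    """Return (mode_name, cost) for the candidate with the lowest Rdcost.

    `candidates` maps a mode name to a (distortion, bits) pair.
    """
    best_name = min(candidates, key=lambda m: rd_cost(*candidates[m], lam))
    return best_name, rd_cost(*candidates[best_name], lam)


# Illustrative numbers only (assumption, not from the application).
modes = {"AMVP": (1200.0, 40), "Merge": (1350.0, 12)}
print(select_best_mode(modes, lam=10.0))
```

With a larger λ the cheaper-to-signal mode tends to win even when its distortion is higher, which is the trade-off the Lagrange multiplier expresses.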

The present application will be described in detail with reference to the accompanying drawings and examples.

Compared with the prior art, the encoding method can further improve the compression ratio of interframe encoding and reduce the transmission bandwidth of video data and the storage resource of the video data. The whole coding method consists of the setting of a search template, the construction of a motion information candidate list, the process of template matching and the expression of syntax elements.

Referring to fig. 1, fig. 1 is a schematic flowchart illustrating an embodiment of an image/video encoding method provided in the present application.

The image/video encoding method of this embodiment comprises the following steps:

Step S11: Acquire a block to be coded in the current coding frame, and acquire the adjacent coded pixels of the block to be coded.

The adjacent coded pixels of the embodiment of the disclosure are upper adjacent coded pixels and/or left adjacent coded pixels of the block to be coded.

Step S12: a search template is set based on the block to be encoded and/or the adjacent encoded pixels.

In the block matching mode, the encoding apparatus sets a search template according to the block to be coded and its adjacent coded pixels. Referring to fig. 2, fig. 2 is a schematic structural diagram of an embodiment of the search template provided in the present application. The search template of this embodiment can be set in the following three ways:

(1) The search template is the block to be coded, which corresponds to template 1 in fig. 2.

(2) The search template is the region formed by the block to be coded together with its adjacent coded pixels; the search template may be rectangular or non-rectangular. If the adjacent coded pixels include only left adjacent coded pixels, this corresponds to template 2 in fig. 2; if they include only upper adjacent coded pixels, this corresponds to template 3 in fig. 2; if they include both left and upper adjacent coded pixels, this corresponds to template 4 in fig. 2. The upper adjacent coded pixels span M rows and the left adjacent coded pixels span N columns, where M = 1 and N = 1.

(3) The search template is the adjacent coded pixels of the block to be coded. If the adjacent coded pixels include only left adjacent coded pixels, this corresponds to template 5 in fig. 2; if they include only upper adjacent coded pixels, this corresponds to template 6 in fig. 2; if they include both left and upper adjacent coded pixels, this corresponds to template 7 in fig. 2. The upper adjacent coded pixels span M rows and the left adjacent coded pixels span N columns, where M = 1 and N = 1.
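The seven template shapes listed above can be expressed as sets of pixel offsets relative to the top-left corner of the block to be coded. The following Python sketch is a hypothetical illustration: the function name `template_offsets` is not from the application, and it assumes that templates 4 and 7 form an L shape that excludes the top-left corner pixels, which the text does not specify.

```python
def template_offsets(template_id, width, height, m=1, n=1):
    """Return the (dx, dy) offsets covered by search templates 1 to 7.

    Offsets are relative to the top-left pixel of the block to be coded.
    m is the number of upper adjacent rows, n the number of left adjacent columns.
    Assumption: the corner region above-left of the block is not included.
    """
    if template_id not in range(1, 8):
        raise ValueError("template_id must be in 1..7")
    use_block = template_id in (1, 2, 3, 4)  # templates that contain the block itself
    use_left = template_id in (2, 4, 5, 7)   # templates that contain left neighbours
    use_top = template_id in (3, 4, 6, 7)    # templates that contain upper neighbours

    offsets = []
    if use_top:
        offsets += [(x, y) for y in range(-m, 0) for x in range(width)]
    if use_left:
        offsets += [(x, y) for y in range(height) for x in range(-n, 0)]
    if use_block:
        offsets += [(x, y) for y in range(height) for x in range(width)]
    return offsets


# Template 7 for a 4x4 block: one row above plus one column to the left.
print(template_offsets(7, 4, 4))
```

Representing a template as an offset list keeps the matching code independent of the template shape, so the same search routine can serve all seven cases.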

Step S13: Construct a motion information candidate list of the block to be coded, wherein the motion information candidate list comprises several pieces of motion information.

The motion information candidate list of this embodiment is constructed in the same way as in the prior art, so the block matching mode can likewise be divided, following the AMVP mode and the Merge mode, into an AMVP block matching prediction mode and a Merge block matching prediction mode. The length of the motion information candidate list is again 2 in the AMVP block matching prediction mode and 6 in the Merge block matching prediction mode. For how the list is constructed in the AMVP and Merge modes, refer to the description above; it is not repeated here.

Step S14: Determine a search starting point by using the co-located block of the block to be coded in the reference frame or the motion information in the motion information candidate list, and search a preset search range for the optimal search block matching the search template.

The template matching process is a process of searching and matching in a reference frame according to the search template and the motion information in the motion information candidate list, finding the best matching template through a cost comparison rule, and further determining the best predicted value of a block to be coded.

The embodiment of the present disclosure provides the following three search strategies, and it should be noted that the following three search strategies are all applicable to the above seven search templates:

(1) Find the co-located block, that is, the block at the same position as the block to be coded, in the reference frame of the current coding frame. Taking the co-located block as the search starting point, perform integer-pixel or sub-pixel template matching within the set search range [-K, K].

For example, in the Merge block matching mode, the block to be coded performs an integer-pixel search starting from the co-located block in the reference frame, where K is set to 64, i.e., the search range is [-64, 64], and the search template is set to template 7 in fig. 2. As shown in fig. 3, the search template searches within the dashed-line range of the reference frame, and the search range cannot exceed the image boundary. The template search is performed within the search range at a step of 1 pixel, and the matching criterion is SAD (Sum of Absolute Differences): the search block with the minimum SAD in the search area is the optimal search block of the template, and the matching block corresponding to the optimal search block is the optimal matching block of the block to be coded. The SAD is computed as follows:

$$\mathrm{SAD}_k = \sum_{x}\sum_{y} \left| s_k(x, y) - c(x, y) \right|$$

where k is the index of the search block, s_k and c denote the search block and the search template block, respectively, and x and y denote the horizontal and vertical coordinates of corresponding positions within the search block and the search template block.
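The integer-pixel search of strategy (1) can be sketched as follows. This is a minimal illustration, not the application's implementation: frames are plain lists of rows, the names `sad` and `template_search` are hypothetical, and the toy frames are synthetic data chosen so that the true displacement is known.

```python
def sad(cur, ref, positions, dx, dy):
    """Sum of absolute differences between the template pixels in the current
    frame and the same-shaped region of the reference frame shifted by (dx, dy)."""
    return sum(abs(ref[y + dy][x + dx] - cur[y][x]) for x, y in positions)


def template_search(cur, ref, positions, k, start=(0, 0), step=1):
    """Scan displacements start + [-k, k] in both axes; return (best_sad, dx, dy).

    positions: absolute (x, y) coordinates of the template pixels in `cur`.
    start=(0, 0) corresponds to the co-located block.
    Candidates whose template would leave the reference frame are skipped,
    so the search never exceeds the image boundary.
    """
    height, width = len(ref), len(ref[0])
    best = None
    for oy in range(-k, k + 1, step):
        for ox in range(-k, k + 1, step):
            dx, dy = start[0] + ox, start[1] + oy
            inside = all(0 <= x + dx < width and 0 <= y + dy < height
                         for x, y in positions)
            if not inside:
                continue
            cost = sad(cur, ref, positions, dx, dy)
            if best is None or cost < best[0]:
                best = (cost, dx, dy)
    return best


# Toy data (illustrative only): the current frame equals the reference frame
# displaced by 2 pixels horizontally and 1 pixel vertically.
ref = [[x * x + 3 * y * y + x * y for x in range(12)] for y in range(12)]
cur = [[ref[min(y + 1, 11)][min(x + 2, 11)] for x in range(12)] for y in range(12)]
# Template 7 for a 4x4 block at (4, 4): one row above plus one column to the left.
template7 = [(x, 3) for x in range(4, 8)] + [(3, y) for y in range(4, 8)]
print(template_search(cur, ref, template7, k=3))
```

Once the best displacement is known, the optimal matching block is the reference-frame block at the position of the block to be coded shifted by that displacement. The `start` argument lets the same routine serve the other strategies, where the starting point comes from motion information instead of the co-located block.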

(2) Select some or all of the motion information in the motion information candidate list. Use the corresponding block found in the reference frame, either directly by the selected motion information or by motion information derived from it, as the search starting point, and perform integer-pixel or sub-pixel template matching within the set search range [-K, K].

The case in which one piece of motion information is selected from the motion information candidate list and the corresponding block it directly locates in the reference frame is used as the search starting point is as follows:

For example, in the AMVP block matching mode, an AMVP candidate list is first constructed; the list contains two motion information candidates. The corresponding block found in the reference frame by the first candidate is selected as the search starting point, and the motion vector in this motion information has half-pixel precision. K is set to 64, i.e., the search range is [-64, 64], and the search template is set to template 4 in fig. 2. As shown in fig. 4, the search template searches within the dashed-line range of the reference frame, and the search range cannot exceed the image boundary. The template search is performed within the search range at a step of 2 pixels; if a search position is not at an integer pixel position, the search template is obtained by interpolating the adjacent integer pixels. The matching criterion is the same as in the example above.
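When a search position falls between integer pixels, the reference samples have to be interpolated from the adjacent integer pixels. The sketch below uses bilinear interpolation purely as an assumption for illustration; the application does not name a specific interpolation filter, and practical codecs typically use longer filters. The names `sample` and `sad_subpel` are hypothetical.

```python
import math


def sample(ref, x, y):
    """Bilinearly interpolate the reference frame at a fractional position.

    Assumption: (x, y) lies inside the frame; the right and bottom
    neighbours are clamped to the last column and row.
    """
    x0, y0 = math.floor(x), math.floor(y)
    fx, fy = x - x0, y - y0
    x1 = min(x0 + 1, len(ref[0]) - 1)
    y1 = min(y0 + 1, len(ref) - 1)
    top = (1 - fx) * ref[y0][x0] + fx * ref[y0][x1]
    bottom = (1 - fx) * ref[y1][x0] + fx * ref[y1][x1]
    return (1 - fy) * top + fy * bottom


def sad_subpel(cur, ref, positions, mvx, mvy):
    """SAD between template pixels in `cur` and interpolated reference samples
    at a fractional displacement (mvx, mvy)."""
    return sum(abs(sample(ref, x + mvx, y + mvy) - cur[y][x])
               for x, y in positions)


# Tiny illustrative frame (made-up values).
ref = [[0, 10], [20, 30]]
print(sample(ref, 0.5, 0.5))
```

At integer positions the fractional weights are zero, so `sample` returns the stored pixel value and the same routine covers both integer-pixel and sub-pixel search.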

The case in which at least two pieces of motion information are selected from the motion information candidate list and the corresponding blocks they directly locate in the reference frame are used as search starting points is as follows:

For example, in the Merge block matching mode, a Merge candidate list is first constructed; the list contains six motion information candidates. The corresponding blocks found in the reference frame by the first two candidates are selected as search starting points; the motion vector of the first has 1/4-pixel precision and that of the second has integer-pixel precision. K is set to 64, i.e., the search range is [-64, 64], and the search templates are set to template 5, template 6, and template 7 in fig. 2. Similar to the search in fig. 4, the search template searches within the dashed-line range of the reference frame, and the search range cannot exceed the image boundary.

First, with the first motion information as the search starting point, the three templates are traversed and searched within the search range at a step of 4 pixels. If a search position is not at an integer pixel position, the search template is obtained by interpolating the adjacent integer pixels. The matching criterion is the same as in the example above, and the best matching block of the block to be coded for this starting point is selected.

Then, with the second motion information as the search starting point, the three templates are traversed and searched within the search range at a step of 1 pixel. The matching criterion is the same as in the example above, and the best matching block of the block to be coded for this starting point is selected.

Finally, the rate-distortion costs of the two best matching blocks are compared, and the one with the lowest cost is the best matching block of the block to be coded in the Merge block matching mode.

The case in which all of the motion information is selected from the motion information candidate list and the corresponding block found in the reference frame by motion information derived from it is used as the search starting point is as follows:

For example, in the AMVP block matching mode, an AMVP candidate list is first constructed; the list contains two motion information candidates. The average of the two candidates, together with the reference frame of the first candidate, forms the derived motion information, and the corresponding block that this motion information locates in the reference frame is used as the search starting point. The motion vector in this motion information has half-pixel precision. K is set to 64, i.e., the search range is [-64, 64], and the search template is set to template 4 in fig. 2. As shown in fig. 4, the search template searches within the dashed-line range of the reference frame, and the search range cannot exceed the image boundary. The template search is performed within the search range at a step of 2 pixels; if a search position is not at an integer pixel position, the search template is obtained by interpolating the adjacent integer pixels. The matching criterion is the same as in the example above.
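Deriving the starting motion information by averaging can be sketched as below. The `MotionInfo` type, the function names, and the `quantize` helper are hypothetical; in particular, snapping the averaged vector to a half-pixel grid is an assumption added for illustration, and the sketch does not scale motion vectors that point to different reference frames.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MotionInfo:
    mvx: float    # horizontal motion vector component, in pixels
    mvy: float    # vertical motion vector component, in pixels
    ref_idx: int  # reference frame index


def average_motion(candidates):
    """Average the motion vectors of at least two candidates and keep the
    reference frame of the first candidate."""
    if len(candidates) < 2:
        raise ValueError("need at least two candidates")
    n = len(candidates)
    return MotionInfo(sum(c.mvx for c in candidates) / n,
                      sum(c.mvy for c in candidates) / n,
                      candidates[0].ref_idx)


def quantize(value, step=0.5):
    """Snap a motion vector component to a grid (assumed half-pixel here)."""
    return round(value / step) * step


amvp_list = [MotionInfo(4.0, -2.0, 0), MotionInfo(1.0, 3.0, 1)]
print(average_motion(amvp_list))
```

The resulting motion vector gives the displacement of the search starting point relative to the co-located position in the chosen reference frame.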

(3) After the best motion information has been selected in an inter-frame prediction mode through cost comparison, the best motion information is refined by template matching: the corresponding block found in the reference frame by the best motion information is used as the search starting point, and template matching is performed at integer-pixel or fractional-pixel positions within the set search range [-K, K].

For example, in the normal Merge mode, after the optimal motion information is acquired, the corresponding block that it finds in the reference frame is used as the search starting point, and the motion vector in the motion information has integer-pixel precision. K is set to 64, i.e., the search range is [-64, 64], and the search template is set to template 7 in fig. 2. As shown in fig. 4, the search template searches within the dashed-line range of the reference frame, and the search range cannot exceed the image boundary. The template search is performed within the search range at a pitch of 1 pixel, and the matching-block calculation strategy is the same as in the above example.

It should be noted that after the search strategy is determined, each search block needs to be compared with the corresponding search template. The search block closest to the search template is taken as the optimal search block, and the block at the position corresponding to the block to be coded relative to that search block is taken as the optimal matching block of the block to be coded. A matching criterion for this comparison therefore needs to be defined. The present application may adopt the minimum Sum of Absolute Differences (SAD) criterion: the sum of the absolute differences between the pixel values at all corresponding positions of each search block and the search template is calculated and compared, the search block with the smallest sum is selected as the optimal search block, and the block at the position corresponding to the block to be coded relative to that search block is taken as the optimal matching block of the current block. This criterion involves no multiplication or division, which makes it well suited to hardware implementation.
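As a rough illustration of SAD-based template matching, the sketch below searches a reference frame for the position whose template pixels are closest to those of the current block. It assumes one particular template shape (the row above and the column to the left of the block, i.e., adjacent coded pixels only), an integer-pixel full search, and frames stored as lists of rows; these choices and all names are hypothetical rather than taken from the application.

```python
def l_shaped_template(block_w, block_h):
    """Offsets, relative to the block's top-left corner, of the row above
    and the column to the left of the block (an assumed template shape)."""
    top = [(dx, -1) for dx in range(-1, block_w)]
    left = [(-1, dy) for dy in range(block_h)]
    return top + left

def sad(cur, ref, cur_pos, ref_pos, offsets):
    """Sum of absolute differences over the template pixels.

    Uses only additions, subtractions, and absolute values.
    """
    cx, cy = cur_pos
    rx, ry = ref_pos
    return sum(abs(cur[cy + dy][cx + dx] - ref[ry + dy][rx + dx])
               for dx, dy in offsets)

def best_match(cur, ref, block_pos, block_w, block_h, start, k):
    """Full search over [-k, k] around `start` at integer-pixel positions.

    Returns (sad, top-left corner of the best matching block in `ref`).
    """
    offsets = l_shaped_template(block_w, block_h)
    h, w = len(ref), len(ref[0])
    best = None
    for dy in range(-k, k + 1):
        for dx in range(-k, k + 1):
            rx, ry = start[0] + dx, start[1] + dy
            # Keep both the template and the block inside the reference frame.
            if rx < 1 or ry < 1 or rx + block_w > w or ry + block_h > h:
                continue
            cost = sad(cur, ref, block_pos, (rx, ry), offsets)
            if best is None or cost < best[0]:
                best = (cost, (rx, ry))
    return best
```

Because the template sits at a fixed offset from the block, the position returned for the best search block directly gives the position of the best matching block.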

In addition, the matching criterion may be the minimum Mean Square Error (MSE) criterion, the minimum Mean Absolute Difference (MAD) criterion, the minimum Number of Threshold Differences (NTD) criterion, or the like.
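For comparison with SAD, the alternative criteria can be written as small functions over two equally long sequences of pixel values. This is a sketch under the usual textbook definitions; in particular, NTD is read here as the count of pixel pairs whose absolute difference exceeds a threshold, and the threshold value is a free parameter not specified in the application.

```python
def mse(a, b):
    """Mean square error between two pixel sequences."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)

def mad(a, b):
    """Mean absolute difference between two pixel sequences."""
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)

def ntd(a, b, threshold):
    """Number of pixel pairs whose absolute difference exceeds `threshold`."""
    return sum(1 for x, y in zip(a, b) if abs(x - y) > threshold)
```

Unlike SAD, MSE needs a multiplication per pixel and both MSE and MAD need a division, which is consistent with the remark above that SAD is the more hardware-friendly choice.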

Step S15: acquiring the best matching block according to the best search block.

Step S16: encoding the block to be coded by using the optimal matching block to obtain a code stream of the block to be coded.

In the block matching mode, some syntax elements need to be added according to the actual design scheme, i.e., the combination of search template and search strategy, so that a decoder can decode the coded code stream based on these syntax elements. The syntax elements provided by the embodiments of the present application include, but are not limited to, the following categories:

(1) A syntax element identifying the block matching mode is added.

(2) If multiple search templates are selected, syntax elements expressing the index of the search templates need to be added.

(3) If a plurality of search starting points are selected, a syntax element expressing the search starting points needs to be added.

(4) The expression of motion information depends on the selected search template and on decoder capability, and falls into two cases:

In the first case: for templates 1 to 4 in fig. 2, the search template includes the block to be coded, which has not yet been coded, so the decoder cannot rebuild the template itself. If the best prediction mode is the block matching mode, the code stream needs to encode the motion vector, or the motion vector residual, from the block to be coded to the best matching block. If the motion vector residual is coded, the motion vector corresponding to the search starting point can be used as the predicted motion vector of the block to be coded. The decoding end obtains the motion vector directly or indirectly from the code stream and combines it with the decoded residual to obtain the prediction block of the block to be coded.

In the second case: for templates 5 to 7 in fig. 2, the reference pixels are all reconstructed pixels. If the optimal prediction mode is the block matching mode, and the decoding-end complexity is to be kept essentially unchanged while network bandwidth is sufficient, the motion vector or the motion vector residual from the block to be coded to the optimal matching block is coded in the code stream, and the decoding end acquires the prediction block of the block to be coded by decoding the motion vector and the residual from the code stream. If the decoding capability allows, the encoding end does not need to encode the motion vector or the motion vector residual; the decoding end performs the same template search and matching strategy as the encoding end and obtains the prediction block of the block to be coded in combination with the residual. Therefore, in this case, the present application may add a syntax element in any one of the Picture Parameter Set (PPS), the Sequence Parameter Set (SPS), and the slice header. This syntax element controls whether the motion vector or the motion vector residual needs to be transmitted and, when it is not transmitted, which template search and matching strategy the codec adopts.
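The signalling rules in items (1) to (4) can be summarised in a small decision sketch. The field names (`block_match_flag`, `template_idx`, `start_idx`, `mvd`) and the `decoder_side_search` switch are hypothetical placeholders; the application does not name its syntax elements, and the sketch only shows which elements would be present, not how they are entropy coded.

```python
def build_block_match_syntax(template_idx, num_templates,
                             start_idx, num_starts,
                             best_mv, start_mv,
                             decoder_side_search):
    """Decide which (hypothetical) syntax elements to write for one block.

    template_idx follows the numbering of fig. 2 (1 to 7).
    decoder_side_search models the high-level switch (PPS, SPS, or slice
    header) that lets the decoder repeat the template search itself.
    """
    # Templates 1 to 4 contain the uncoded block, so the decoder cannot
    # repeat the search and the motion vector (or its residual) must be sent.
    uses_uncoded_pixels = template_idx <= 4
    send_mv = uses_uncoded_pixels or not decoder_side_search

    syntax = {"block_match_flag": True}       # item (1)
    if num_templates > 1:
        syntax["template_idx"] = template_idx  # item (2)
    if num_starts > 1:
        syntax["start_idx"] = start_idx        # item (3)
    if send_mv:                                # item (4)
        # Residual relative to the motion vector of the search starting point.
        syntax["mvd"] = (best_mv[0] - start_mv[0], best_mv[1] - start_mv[1])
    return syntax
```

In this reading, a block coded with template 7 and decoder-side search enabled carries no motion vector data at all, whereas a block coded with template 2 always carries a motion vector residual.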

In the image/video encoding method provided in the above embodiment, a block to be coded in a current coding frame is acquired, and adjacent coded pixels of the block to be coded are acquired; a search template is set based on the block to be coded and/or the adjacent coded pixels; a motion information candidate list of the block to be coded is constructed, wherein the motion information candidate list comprises a plurality of pieces of motion information; a search starting point is determined by using the co-located block of the block to be coded in the reference frame or the motion information of the motion information candidate list, and an optimal search block matching the search template is searched for within a preset search range; an optimal matching block is obtained according to the optimal search block; and the block to be coded is encoded by using the optimal matching block to obtain a code stream of the block to be coded. By exploiting the similarity between pixel blocks in the reference frame and the search template, the method finds the best matching block of the current block in the reference frame through search matching, which compensates to some extent for the shortcomings of existing inter-frame prediction techniques, helps to further remove temporal redundancy, and improves the compression rate of inter-frame coding.

The above embodiments are only common cases of the present application and do not limit its technical scope; any minor modification, equivalent change, or adaptation made to the above contents according to the essence of the present application still falls within the technical scope of the present application.

Referring to fig. 5, fig. 5 is a schematic structural diagram of an embodiment of an encoding apparatus provided in the present application. The encoding apparatus 50 includes a current obtaining module 51, a search template module 52, a candidate list module 53, a search matching module 54, and an encoding module 55.

The current obtaining module 51 is configured to obtain a block to be coded in a current coding frame, and obtain adjacent coded pixels of the block to be coded.

The search template module 52 is configured to set a search template based on the to-be-coded block and/or the adjacent coded pixels.

The candidate list module 53 is configured to construct a motion information candidate list of the block to be encoded, where the motion information candidate list includes a plurality of pieces of motion information.

The search matching module 54 is configured to determine a search starting point by using the co-located block of the to-be-coded block in the reference frame or the motion information of the motion information candidate list, and search for an optimal search block matching the search template within a preset search range; and is further configured to obtain a best matching block according to the best search block.

The encoding module 55 is configured to encode the block to be encoded by using the optimal matching block, so as to obtain a code stream of the block to be encoded.

Please refer to fig. 6, which is a schematic structural diagram of an embodiment of the encoding and decoding system of the present application. The codec system comprises a memory 62 and a processor 61 connected to each other.

The memory 62 is used to store program instructions implementing the image/video encoding method of any of the above embodiments.

The processor 61 is configured to execute the program instructions stored in the memory 62.

The processor 61 may also be referred to as a CPU (Central Processing Unit). The processor 61 may be an integrated circuit chip having signal processing capabilities. The processor 61 may also be a general purpose processor, a Digital Signal Processor (DSP), an Application Specific Integrated Circuit (ASIC), a Field Programmable Gate Array (FPGA) or other programmable logic device, discrete gate or transistor logic, discrete hardware components. A general purpose processor may be a microprocessor or the processor may be any conventional processor or the like.

The memory 62 may be a memory module, a TF card, or the like, and may store all information in the codec system, including the input raw data, the computer program, the intermediate operation results, and the final operation results. It stores and retrieves information at the location specified by the controller. The memory gives the codec system its storage function and ensures normal operation. By use, the memory of the codec system can be classified into main memory (internal memory) and auxiliary memory (external memory). The external memory is usually a magnetic medium, an optical disk, or the like, and can store information for a long period of time. The internal memory refers to the storage component on the main board that holds the data and programs currently being executed; it stores them only temporarily, and the data is lost when the power is turned off or cut.

In the several embodiments provided in the present application, it should be understood that the disclosed method and apparatus may be implemented in other ways. For example, the above-described apparatus embodiments are merely illustrative: the division into modules or units is only a logical division, and an actual implementation may divide them differently; for instance, a plurality of units or components may be combined or integrated into another system, or some features may be omitted or not executed. In addition, the mutual coupling, direct coupling, or communication connection shown or discussed may be an indirect coupling or communication connection through some interfaces, devices, or units, and may be electrical, mechanical, or of another form.

Units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units; that is, they may be located in one place or distributed over a plurality of network units. Some or all of the units can be selected according to actual needs to achieve the purpose of the embodiment.

In addition, functional units in the embodiments of the present application may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit. The integrated unit can be realized in a form of hardware, and can also be realized in a form of a software functional unit.

The integrated unit, if implemented in the form of a software functional unit and sold or used as a stand-alone product, may be stored in a computer-readable storage medium. Based on such understanding, the technical solution of the present application in essence, or the part that contributes to the prior art, or all or part of the technical solution, may be embodied in the form of a software product. The software product is stored in a storage medium and includes instructions for causing a computer device (which may be a personal computer, a system server, a network device, or the like) or a processor to execute all or part of the steps of the methods of the embodiments of the present application.

Please refer to fig. 7, which is a schematic structural diagram of a computer-readable storage medium according to the present application. The storage medium of the present application stores a program file 71 capable of implementing all of the above image/video encoding methods. The program file 71 may be stored in the storage medium in the form of a software product, and includes several instructions that enable a computer device (which may be a personal computer, a server, or a network device) or a processor to execute all or part of the steps of the methods according to the embodiments of the present application. The aforementioned storage medium includes various media capable of storing program code, such as a USB flash drive, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk, or an optical disk, as well as terminal devices such as a computer, a server, a mobile phone, or a tablet.

The above are merely embodiments of the present application and are not intended to limit its scope. Any equivalent structure or equivalent process transformation made using the contents of the specification and drawings of the present application, or any direct or indirect application in other related technical fields, is likewise included in the scope of protection of the present application.
