Intra-frame coding speed optimization method, device and medium based on historical information

Document No.: 1925601 | Publication date: 2021-12-03

Reading note: This technology, "Intra-frame coding speed optimization method, device and medium based on historical information", was created by 梁凡 (Liang Fan) and 贾一凡 (Jia Yifan) on 2021-08-05. The invention discloses a method, a device and a medium for optimizing intra-frame coding speed based on historical information. The method comprises the following steps: acquiring a coding unit; when the coding unit is a first round coding unit, calculating a correlation index of the first round coding unit, and coding the first round coding unit according to the calculation result of the correlation index; comparing the calculation result of the correlation index with a dynamic threshold to determine the partition type of the coding unit, wherein the dynamic threshold is dynamically adjusted according to historical partition information of the coding units; when the coding unit is a subsequent round coding unit, judging whether the subsequent round coding unit terminates partitioning in advance; if yes, returning to the step of acquiring the coding unit; otherwise, encoding the subsequent round coding unit; and completing the coding operation on all the coding units. The invention has good compatibility and high speed, and can be widely applied in the technical field of video coding.

1. An intra-frame coding speed optimization method based on historical information is characterized by comprising the following steps:

acquiring a coding unit;

when the coding unit is a first round coding unit, calculating a correlation index of the first round coding unit, and coding the first round coding unit according to the calculation result of the correlation index;

comparing the calculation result of the correlation index with a dynamic threshold value to determine the division type of the coding unit; wherein the dynamic threshold is dynamically adjusted according to historical partition information of the coding units;

when the coding unit is a subsequent round coding unit, judging whether the subsequent round coding unit terminates division in advance; if yes, returning to the step of acquiring the coding unit; otherwise, encoding the subsequent round coding unit;

and completing the coding operation on all the coding units.

2. The method of claim 1, wherein the method further comprises:

judging, for each coding unit in the coding tree unit, whether the coding unit is a first round coding unit or a subsequent round coding unit.

3. The method according to claim 1, wherein the correlation indicators comprise texture information, horizontal gradient values and vertical gradient values, and the calculating the correlation indicator of the first round of coding units comprises:

calculating horizontal gradient values of the first round of coding units;

calculating a vertical gradient value of the first round of encoding units;

calculating texture information of the first round of coding units according to the horizontal gradient value and the vertical gradient value;

wherein, the calculation formula of the horizontal gradient value Gx is as follows:

the calculation formula of the vertical gradient value Gy is as follows:

the texture information T (i, j) is calculated by the following formula:

T(i,j)=|Gx(i,j)|+|Gy(i,j)|

where P represents a pixel matrix of 3 × 3 size centered on the pixel value of the (i, j) position; (i, j) represents the position of the jth row and ith column in the image.

4. The method of claim 3, wherein the method further comprises:

in the step of calculating the correlation index of the first round of coding units, an average texture value of a plurality of sub-units in the first round of coding units is calculated.

5. The method as claimed in claim 1, wherein the comparing the calculation result of the correlation index with a dynamic threshold to determine the partition type of the coding unit comprises:

for the first round of coding units, when the rate-distortion cost without partitioning is smaller than the rate-distortion cost corresponding to the partition type under consideration, the dynamic threshold is adjusted;

the adjustment formula of the dynamic threshold is as follows:

where Thr represents the adjusted threshold; Thr_old represents the threshold before adjustment; T represents the average texture value of the first round of coding units.

6. The method of claim 1, wherein the determining whether the subsequent round of coding units terminates partitioning early comprises:

in the sub-strategy aiming at the homogeneity, for the type to be divided, if the average texture value of the coding unit is smaller than the corresponding dynamic threshold value, the type to be divided is skipped;

in the sub-strategy aiming at the directivity, judging whether a first skipping condition is met, if so, skipping the type to be divided;

in the sub-strategy for the texture difference between the sub-parts, whether a second skipping condition is met is judged, and if yes, the current partition type is terminated in advance.

7. The method of claim 6, wherein:

the expression of the first skip condition is:

the expression of the second skip condition is:

Diff_ratio<Thr

wherein Gx represents the average horizontal gradient value of the current coding unit; Gy represents the average vertical gradient value of the current coding unit; Thr represents the decision threshold; BT-V represents the binary tree vertical partition mode; TT-V represents the ternary tree vertical partition mode; BT-H represents the binary tree horizontal partition mode; TT-H represents the ternary tree horizontal partition mode; Diff_ratio represents the sub-block difference; ratio_1 represents sub-block difference 1; ratio_2 represents sub-block difference 2; T_p1 represents the average texture value of the first sub-block; T_p2 represents the average texture value of the second sub-block; T_p3 represents the average texture value of the third sub-block.

8. An intra-coding speed optimization apparatus based on history information, comprising:

a first module for obtaining an encoding unit;

a second module, configured to, when the coding unit is a first-round coding unit, calculate a correlation index of the first-round coding unit, and code the first-round coding unit according to a calculation result of the correlation index;

a third module, configured to compare the calculation result of the correlation index with a dynamic threshold, and determine a partition type of the coding unit; wherein the dynamic threshold is dynamically adjusted according to historical partition information of the coding units;

a fourth module, configured to determine, when the coding unit is a subsequent round coding unit, whether the subsequent round coding unit terminates partitioning in advance; if yes, returning to the step of acquiring the coding unit; otherwise, encoding the subsequent round coding unit;

and the fifth module is used for finishing the coding operation of all the coding units.

9. An electronic device comprising a processor and a memory;

the memory is used for storing programs;

the processor executing the program realizes the method according to any one of claims 1-7.

10. A computer-readable storage medium, characterized in that the storage medium stores a program, which is executed by a processor to implement the method according to any one of claims 1-7.

Technical Field

The invention relates to the technical field of video coding, in particular to a method, a device and a medium for optimizing intra-frame coding speed based on historical information.

Background

VVC (Versatile Video Coding) can improve coding efficiency by 40% or more while maintaining subjective and objective visual quality. This improvement comes from a number of newly adopted coding techniques and tools, such as the QTMT partition scheme, multiple reference line prediction (MRL), matrix-based intra prediction (MIP), multiple transform selection (MTS), the low-frequency non-separable transform (LFNST), and intra sub-partitions (ISP). While these tools effectively improve compression efficiency, they also significantly increase coding complexity. Excessive coding complexity harms the real-time performance of encoding and makes practical deployment more difficult.

Experts have called for effective control of coding complexity in response to the dramatic increase in encoding time. According to reports, compared with HEVC, under the All Intra, Random Access and Low Delay (P/B) configurations, the encoding time of VVC increases by factors of 25, 7 and 6 respectively, while the coding efficiency improves by about 25%, 36% and 32% respectively. The complexity increase of intra coding is clearly far greater than that of inter coding, so controlling the complexity of intra coding is currently the most critical and difficult problem.

Existing fast algorithms for VVC can reduce the encoding time by 20%-50%, but their coding loss is close to or even exceeds 1%. Considering that the overall gain of VVC intra coding over HEVC is only 25%, such a high coding loss (e.g. greater than 1%) is unacceptable. In other words, these existing algorithms still do not achieve a satisfactory trade-off between coding efficiency and encoding time.

Disclosure of Invention

In view of this, embodiments of the present invention provide a method, an apparatus, and a medium for optimizing intra-frame coding speed based on historical information, which are fast and have good compatibility.

One aspect of the present invention provides a method for optimizing intra-frame coding speed based on historical information, including:

acquiring a coding unit;

when the coding unit is a first round coding unit, calculating a correlation index of the first round coding unit, and coding the first round coding unit according to the calculation result of the correlation index;

comparing the calculation result of the correlation index with a dynamic threshold value to determine the division type of the coding unit; wherein the dynamic threshold is dynamically adjusted according to historical partition information of the coding units;

when the coding unit is a subsequent round coding unit, judging whether the subsequent round coding unit terminates division in advance; if yes, returning to the step of acquiring the coding unit; otherwise, encoding the subsequent round coding unit;

and completing the coding operation on all the coding units.

Optionally, the method further comprises:

judging, for each coding unit in the coding tree unit, whether the coding unit is a first round coding unit or a subsequent round coding unit.

Optionally, the correlation indicator includes texture information, a horizontal gradient value and a vertical gradient value, and the calculating the correlation indicator of the first round of encoding units includes:

calculating horizontal gradient values of the first round of coding units;

calculating a vertical gradient value of the first round of encoding units;

calculating texture information of the first round of coding units according to the horizontal gradient value and the vertical gradient value;

wherein, the calculation formula of the horizontal gradient value Gx is as follows:

the calculation formula of the vertical gradient value Gy is as follows:

the texture information T (i, j) is calculated by the following formula:

T(i,j)=|Gx(i,j)|+|Gy(i,j)|

where P represents a pixel matrix of 3 × 3 size centered on the pixel value of the (i, j) position; (i, j) represents the position of the jth row and ith column in the image.

Optionally, the method further comprises:

in the step of calculating the correlation index of the first round of coding units, an average texture value of a plurality of sub-units in the first round of coding units is calculated.

Optionally, the comparing the calculation result of the correlation index with a dynamic threshold to determine the partition type of the coding unit includes:

for the first round of coding units, when the rate-distortion cost without partitioning is smaller than the rate-distortion cost corresponding to the partition type under consideration, the dynamic threshold is adjusted;

the adjustment formula of the dynamic threshold is as follows:

where Thr represents the adjusted threshold; Thr_old represents the threshold before adjustment; T represents the average texture value of the first round of coding units.

Optionally, the determining whether the subsequent round of coding units terminates partitioning in advance includes:

in the sub-strategy aiming at the homogeneity, for the type to be divided, if the average texture value of the coding unit is smaller than the corresponding dynamic threshold value, the type to be divided is skipped;

in the sub-strategy aiming at the directivity, judging whether a first skipping condition is met, if so, skipping the type to be divided;

in the sub-strategy for the texture difference between the sub-parts, whether a second skipping condition is met is judged, and if yes, the current partition type is terminated in advance.

Optionally, the expression of the first skip condition is:

the expression of the second skip condition is:

Diff_ratio<Thr

wherein Gx represents the average horizontal gradient value of the current coding unit; Gy represents the average vertical gradient value of the current coding unit; Thr represents the decision threshold; BT-V represents the binary tree vertical partition mode; TT-V represents the ternary tree vertical partition mode; BT-H represents the binary tree horizontal partition mode; TT-H represents the ternary tree horizontal partition mode; Diff_ratio represents the sub-block difference; ratio_1 represents sub-block difference 1; ratio_2 represents sub-block difference 2; T_p1 represents the average texture value of the first sub-block; T_p2 represents the average texture value of the second sub-block; T_p3 represents the average texture value of the third sub-block.

Another aspect of the embodiments of the present invention further provides an apparatus for optimizing intra-frame coding speed based on historical information, including:

a first module for obtaining an encoding unit;

a second module, configured to, when the coding unit is a first-round coding unit, calculate a correlation index of the first-round coding unit, and code the first-round coding unit according to a calculation result of the correlation index;

a third module, configured to compare the calculation result of the correlation index with a dynamic threshold, and determine a partition type of the coding unit; wherein the dynamic threshold is dynamically adjusted according to historical partition information of the coding units;

a fourth module, configured to determine, when the coding unit is a subsequent round coding unit, whether the subsequent round coding unit terminates partitioning in advance; if yes, returning to the step of acquiring the coding unit; otherwise, encoding the subsequent round coding unit;

and the fifth module is used for finishing the coding operation of all the coding units.

In another aspect, an embodiment of the present invention further provides an electronic device, including a processor and a memory;

the memory is used for storing programs;

the processor executes the program to implement the method as described above.

In another aspect, the present invention provides a computer-readable storage medium, which stores a program, where the program is executed by a processor to implement the method described above.

The embodiment of the invention also discloses a computer program product or a computer program, which comprises computer instructions, and the computer instructions are stored in a computer readable storage medium. The computer instructions may be read by a processor of a computer device from a computer-readable storage medium, and the computer instructions executed by the processor cause the computer device to perform the foregoing method.

The embodiment of the invention first acquires a coding unit; when the coding unit is a first round coding unit, calculates a correlation index of the first round coding unit, and codes the first round coding unit according to the calculation result of the correlation index; compares the calculation result of the correlation index with a dynamic threshold value to determine the division type of the coding unit, wherein the dynamic threshold is dynamically adjusted according to historical partition information of the coding units; when the coding unit is a subsequent round coding unit, judges whether the subsequent round coding unit terminates division in advance; if yes, returns to the step of acquiring the coding unit; otherwise, encodes the subsequent round coding unit; and completes the coding operation on all the coding units. The invention has good compatibility and high speed.

Drawings

In order to illustrate the technical solutions in the embodiments of the present application more clearly, the drawings needed in the description of the embodiments are briefly introduced below. The drawings in the following description show only some embodiments of the present application, and those skilled in the art can obtain other drawings from them without creative effort.

FIG. 1 is an exemplary diagram of different partition combinations of the present invention resulting in the same CU structure;

FIG. 2 is a flowchart illustrating the overall steps of an embodiment of the present invention;

FIG. 3 is a schematic diagram of TT-V division in a CU according to an embodiment of the invention.

Detailed Description

In order to make the objects, technical solutions and advantages of the present application more apparent, the present application is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the present application and are not intended to limit the present application.

The embodiment of the invention provides an intra-frame coding speed optimization method based on historical information, which comprises the following steps:

acquiring a coding unit;

when the coding unit is a first round coding unit, calculating a correlation index of the first round coding unit, and coding the first round coding unit according to the calculation result of the correlation index;

comparing the calculation result of the correlation index with a dynamic threshold value to determine the division type of the coding unit; wherein the dynamic threshold is dynamically adjusted according to historical partition information of the coding units;

when the coding unit is a subsequent round coding unit, judging whether the subsequent round coding unit terminates division in advance; if yes, returning to the step of acquiring the coding unit; otherwise, encoding the subsequent round coding unit;

and completing the coding operation on all the coding units.

Optionally, the method further comprises:

judging, for each coding unit in the coding tree unit, whether the coding unit is a first round coding unit or a subsequent round coding unit.

Optionally, the correlation indicator includes texture information, a horizontal gradient value and a vertical gradient value, and the calculating the correlation indicator of the first round of encoding units includes:

calculating horizontal gradient values of the first round of coding units;

calculating a vertical gradient value of the first round of encoding units;

calculating texture information of the first round of coding units according to the horizontal gradient value and the vertical gradient value;

wherein, the calculation formula of the horizontal gradient value Gx is as follows:

the calculation formula of the vertical gradient value Gy is as follows:

the texture information T (i, j) is calculated by the following formula:

T(i,j)=|Gx(i,j)|+|Gy(i,j)|

where P represents a pixel matrix of 3 × 3 size centered on the pixel value of the (i, j) position; (i, j) represents the position of the jth row and ith column in the image.

Optionally, the method further comprises:

in the step of calculating the correlation index of the first round of coding units, an average texture value of a plurality of sub-units in the first round of coding units is calculated.

Optionally, the comparing the calculation result of the correlation index with a dynamic threshold to determine the partition type of the coding unit includes:

for the first round of coding units, when the rate-distortion cost without partitioning is smaller than the rate-distortion cost corresponding to the partition type under consideration, the dynamic threshold is adjusted;

the adjustment formula of the dynamic threshold is as follows:

where Thr represents the adjusted threshold; Thr_old represents the threshold before adjustment; T represents the average texture value of the first round of coding units.

Optionally, the determining whether the subsequent round of coding units terminates partitioning in advance includes:

in the sub-strategy aiming at the homogeneity, for the type to be divided, if the average texture value of the coding unit is smaller than the corresponding dynamic threshold value, the type to be divided is skipped;

in the sub-strategy aiming at the directivity, judging whether a first skipping condition is met, if so, skipping the type to be divided;

in the sub-strategy for the texture difference between the sub-parts, whether a second skipping condition is met is judged, and if yes, the current partition type is terminated in advance.

Optionally, the expression of the first skip condition is:

the expression of the second skip condition is:

Diff_ratio<Thr

wherein Gx represents the average horizontal gradient value of the current coding unit; Gy represents the average vertical gradient value of the current coding unit; Thr represents the decision threshold; BT-V represents the binary tree vertical partition mode; TT-V represents the ternary tree vertical partition mode; BT-H represents the binary tree horizontal partition mode; TT-H represents the ternary tree horizontal partition mode; Diff_ratio represents the sub-block difference; ratio_1 represents sub-block difference 1; ratio_2 represents sub-block difference 2; T_p1 represents the average texture value of the first sub-block; T_p2 represents the average texture value of the second sub-block; T_p3 represents the average texture value of the third sub-block.

The embodiment of the invention also provides an intra-frame coding speed optimization device based on historical information, comprising:

a first module for obtaining an encoding unit;

a second module, configured to, when the coding unit is a first-round coding unit, calculate a correlation index of the first-round coding unit, and code the first-round coding unit according to a calculation result of the correlation index;

a third module, configured to compare the calculation result of the correlation index with a dynamic threshold, and determine a partition type of the coding unit; wherein the dynamic threshold is dynamically adjusted according to historical partition information of the coding units;

a fourth module, configured to determine, when the coding unit is a subsequent round coding unit, whether the subsequent round coding unit terminates partitioning in advance; if yes, returning to the step of acquiring the coding unit; otherwise, encoding the subsequent round coding unit;

and the fifth module is used for finishing the coding operation of all the coding units.

The embodiment of the invention also provides the electronic equipment, which comprises a processor and a memory;

the memory is used for storing programs;

the processor executes the program to implement the method as described above.

An embodiment of the present invention further provides a computer-readable storage medium, where the storage medium stores a program, and the program is executed by a processor to implement the method described above.

The embodiment of the invention also discloses a computer program product or a computer program, which comprises computer instructions, and the computer instructions are stored in a computer readable storage medium. The computer instructions may be read by a processor of a computer device from a computer-readable storage medium, and the computer instructions executed by the processor cause the computer device to perform the foregoing method.

The following describes in detail the specific implementation principles of the present invention:

It should be noted that under the QTMT partition scheme, different partition combinations may result in the same CU structure (with the same position and size). In other words, a CU of the same position and size may be encoded multiple times. Taking FIG. 1 as an example, after a parent CU (of size w × h) undergoes two consecutive binary tree splits in different directions (e.g. 101 and 102 in FIG. 1), two sub-CUs of size w/2 × h/2 are generated. These two sub-CUs have the same position and size as some of the sub-CUs produced by quadtree partitioning, and are therefore encoded repeatedly. For convenience of explanation, the present invention defines a CU that the encoder encounters and encodes for the first time as a first round CU (1st round CU); a CU encountered again by the encoder, i.e. a CU that has already been generated and encoded through some previous partition combination, is defined as a following rounds CU.
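The distinction between a 1st round CU and a following rounds CU can be sketched as a lookup keyed by position and size. This is only an illustrative sketch: the `RoundTracker` class, its method names, and the `(x, y, width, height)` key layout are assumptions made for this example and do not come from the patent or from any encoder's source code.

```python
from typing import Dict, Tuple

# Hypothetical key: (x, y, width, height) of a CU inside the picture.
CUKey = Tuple[int, int, int, int]


class RoundTracker:
    """Remembers which CUs (by position and size) have already been encoded."""

    def __init__(self) -> None:
        self._history: Dict[CUKey, dict] = {}

    def is_first_round(self, key: CUKey) -> bool:
        # A CU is a 1st round CU if no CU with this position and size was seen yet.
        return key not in self._history

    def record(self, key: CUKey, best_partition: str, best_ipm: int) -> None:
        # Store the first-round result only; later encounters do not overwrite it.
        self._history.setdefault(key, {"partition": best_partition, "ipm": best_ipm})

    def lookup(self, key: CUKey) -> dict:
        return self._history[key]


tracker = RoundTracker()
cu = (0, 0, 16, 16)
first = tracker.is_first_round(cu)    # first encounter
tracker.record(cu, "TT-V", 50)
again = tracker.is_first_round(cu)    # same position and size reached again
```

Storing the optimal partition type and prediction mode alongside the key is what lets the first-round result guide later encounters of the same CU.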

This feature, which originates from the encoder and the QTMT partition scheme itself, motivates the core idea of the present invention that the coding result (including the optimal partition type and prediction mode) of 1st round CU can be used to guide the coding of following rounds CU. Specifically, the redundant partition types are skipped by using the relationship between 1st round CU (first coding) and following rounds CU (encountered again).

First, from the theoretical point of view of video coding, the "history and repetition" feature of the QTMT partition scheme can be utilized to accelerate QTMT partition. Specifically, for a CU, the location and size of its reference region is unchanged when it is first encoded and encountered again for encoding. Although the value of the reference pixel may vary (the structure of the surrounding CU may be different, which may result in a slightly different reconstructed value of the reference pixel), the difference is not large, and the reconstructed value still approaches the same original pixel value. Therefore, theoretically, the 1st round CU and the following round CU have similar reference pixels, and the subsequent optimal partition types and prediction modes thereof are also similar.

Secondly, from the perspective of statistics and optimization space, using the "history and repetition" characteristic of the QTMT partition scheme can effectively improve the encoding speed while keeping the coding loss small. On the one hand, according to experimental statistics, the proportion of 1st round CUs and following rounds CUs that have the same optimal partition type exceeds 80%, consistent with the above theoretical analysis. This also means that coding loss can be effectively controlled if this feature is exploited. On the other hand, according to experimental statistics, about 30%-40% of CUs are encoded multiple times. This means that the space for speed optimization using this feature is large.

As described above, according to the experimental statistics, the ratio of 1st round CU and following rounds CU having the same optimal partition type exceeds 80%. Therefore, the encoding result of 1st round CU can be used to guide and skip the recursive traversal partition of following rounds CU. In order to effectively control the coding loss, it is considered that the accuracy of 80% is insufficient, and the identification and elimination of the redundant partition type needs to ensure the accuracy of at least 95%.

After extensive experimental analysis, we identified several pruning strategies with more than 95% accuracy and summarized them in Table 1. For example, when the optimal partition type of a 1st round CU is vertical ternary tree partition (TT-V) and the optimal prediction mode is concentrated around the vertical direction (IPM: 41-59), its following rounds CU almost never selects horizontal binary tree partition (BT-H); the accuracy of skipping BT-H is as high as 97.59%. In this case, BT-H can be identified as a redundant partition type and skipped to effectively reduce the encoding time. Because of their high accuracy, the present invention applies all the pruning strategies in Table 1 to identify and eliminate redundant partitions of following rounds CUs.

TABLE 1
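A pruning strategy of this kind can be sketched as a small rule table. The sketch below encodes only the single rule quoted in the text (TT-V with IPM 41-59 leads to skipping BT-H); the other rows of Table 1 are not reproduced in this document and are deliberately not guessed. The names `PruningRule`, `RULES` and `redundant_partitions` are assumptions made for this example.

```python
from typing import List, NamedTuple, Set


class PruningRule(NamedTuple):
    best_partition: str   # optimal partition type of the 1st round CU
    ipm_low: int          # inclusive lower bound of the intra prediction mode
    ipm_high: int         # inclusive upper bound of the intra prediction mode
    skip: str             # partition type treated as redundant


# Only the rule quoted in the text is listed here.
RULES: List[PruningRule] = [PruningRule("TT-V", 41, 59, "BT-H")]


def redundant_partitions(best_partition: str, best_ipm: int) -> Set[str]:
    """Partition types a following rounds CU may skip, given the 1st round result."""
    return {
        rule.skip
        for rule in RULES
        if rule.best_partition == best_partition
        and rule.ipm_low <= best_ipm <= rule.ipm_high
    }


skipped = redundant_partitions("TT-V", 50)   # {"BT-H"}
```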

In addition, based on the "history and repetition" characteristic, the invention also provides a hierarchical adaptive threshold QTMT pruning algorithm. Unlike the single threshold used in other documents, the algorithm introduces a multidimensional threshold matrix that takes into account factors such as CU size, partition type, and quantization parameter (QP). In other words, different CU sizes, partition types and QPs have different thresholds, so that the characteristics of the encoder's QTMT partitioning can be fitted better and the coding loss can be effectively controlled.
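One possible way to hold such a threshold matrix is a mapping keyed by CU size, partition type and QP. This is a minimal sketch under assumptions: the key layout `(width, height, partition_type, qp)` and the helper name `get_threshold` are hypothetical, and the patent does not specify a storage format. Missing entries read as zero, in line with the default threshold value described later in the text.

```python
from collections import defaultdict
from typing import DefaultDict, Tuple

# Hypothetical key layout: (cu_width, cu_height, partition_type, qp).
ThresholdKey = Tuple[int, int, str, int]

# Entries that were never adjusted read as 0.0.
thresholds: DefaultDict[ThresholdKey, float] = defaultdict(float)


def get_threshold(width: int, height: int, partition: str, qp: int) -> float:
    """Threshold for one combination of CU size, partition type and QP."""
    return thresholds[(width, height, partition, qp)]


initial = get_threshold(32, 32, "BT-H", 32)
thresholds[(32, 32, "BT-H", 32)] = 12.5        # value chosen only for the demo
adjusted = get_threshold(32, 32, "BT-H", 32)
other = get_threshold(32, 32, "BT-V", 32)      # different partition type, separate entry
```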

The framework of the adaptive threshold QTMT pruning algorithm is shown in FIG. 2. The thresholds of some pruning strategies (described in detail below) are adaptively adjusted during the coding of 1st round CUs. The adjusted thresholds are then used to prune the QTMT partitioning of following rounds CUs, so as to save encoding time. This hierarchical adaptive threshold adjustment has two advantages: first, it can replace the time-consuming offline training of thresholds; second, the coding loss can be better controlled. This adaptive adjustment, which originates from the encoder itself, is more accurate than the manual setting or external adjustment used in other documents.

The adjustment of the threshold matrix needs to be based on the result of correlation calculation in the coding of the 1st round CU, and the contents of the correlation calculation, the adaptive threshold adjustment, and the early termination division are respectively described below according to the flow.

1. Correlation calculation

When each CU in the CTU is coded, if it is determined that the CU is a 1st round CU (first coding), some indexes are calculated, where T (i, j), Gx (i, j), and Gy (i, j) respectively represent texture, horizontal gradient, and vertical gradient values at (i, j), and the calculation method is as follows:

T(i,j)=|Gx(i,j)|+|Gy(i,j)| (2)

wherein w and h represent the width and height of the CU, respectively; T represents the average texture value of the CU and can reflect the homogeneity, flatness and uniformity of the CU; Gx and Gy represent the average horizontal gradient value and the average vertical gradient value of the CU, respectively, and the difference between the two may reflect the texture directionality of the CU.
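The correlation calculation can be illustrated with the following sketch. Because the gradient formulas are not reproduced in this text, the sketch assumes 3 × 3 Sobel kernels applied to the window P, clamps coordinates at the CU border, and averages the absolute gradient values; all three choices are assumptions, as is the function name `cu_metrics`. Only the relation T(i,j) = |Gx(i,j)| + |Gy(i,j)| is taken directly from the text.

```python
from typing import List, Tuple

# Assumption: 3x3 Sobel kernels; the text only states that P is a 3x3 window.
SOBEL_X = ((-1, 0, 1), (-2, 0, 2), (-1, 0, 1))
SOBEL_Y = ((-1, -2, -1), (0, 0, 0), (1, 2, 1))


def cu_metrics(block: List[List[int]]) -> Tuple[float, float, float]:
    """Return (average texture T, average |Gx|, average |Gy|) of a CU."""
    h, w = len(block), len(block[0])
    sum_t = sum_gx = sum_gy = 0
    for j in range(h):              # j: row index
        for i in range(w):          # i: column index
            gx = gy = 0
            for dj in (-1, 0, 1):
                for di in (-1, 0, 1):
                    # Assumption: clamp coordinates at the CU border.
                    row = min(max(j + dj, 0), h - 1)
                    col = min(max(i + di, 0), w - 1)
                    gx += SOBEL_X[dj + 1][di + 1] * block[row][col]
                    gy += SOBEL_Y[dj + 1][di + 1] * block[row][col]
            sum_gx += abs(gx)
            sum_gy += abs(gy)
            sum_t += abs(gx) + abs(gy)   # T(i,j) = |Gx(i,j)| + |Gy(i,j)|
    n = w * h
    return sum_t / n, sum_gx / n, sum_gy / n


# A 4x4 block with a single vertical edge: horizontal gradient only.
edge_block = [[0, 0, 10, 10] for _ in range(4)]
t_avg, gx_avg, gy_avg = cu_metrics(edge_block)   # (20.0, 20.0, 0.0)
```

In this example the average horizontal gradient dominates the vertical one, which is the kind of difference the text uses as an indicator of texture directionality.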

In addition, the average texture values of several sub-parts within the CU are calculated so that the texture difference between sub-parts can be judged later in the flow. Fig. 3 shows an example of calculating the average texture values of three vertical sub-parts within a CU. In fig. 3, each small dot represents the texture value T(i, j) of the pixel at the corresponding position. The average texture values T_p1, T_p2, and T_p3 of the three vertical sub-parts are the averages of the texture values T(i, j) within the corresponding sub-part.

Besides the three vertical sub-parts shown in fig. 3 (corresponding to TT-V partitioning), the average texture values for other directions and numbers of sub-parts are calculated in the same way, with the CU segmented according to the corresponding partition tree and partition direction.
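The sub-part averages can be sketched in Python as follows. The sketch assumes that the sub-parts coincide with the sub-blocks produced by the split itself: two halves for a binary split, and a 1:2:1 division for a ternary split, as in VVC. The function name `subpart_texture_means` is hypothetical, and the exact segmentation used in fig. 3 is not reproduced in the text.

```python
def subpart_texture_means(tex, split):
    """Average texture of each sub-part of a texture map for one split type.

    tex: 2-D list of per-pixel texture values T(i, j).
    split: "BT-H", "BT-V", "TT-H" or "TT-V".
    Assumes the CU is large enough for the split (at least 4 samples
    along the split axis for TT, 2 for BT).
    """
    h, w = len(tex), len(tex[0])
    kind, direction = split.split("-")
    length = w if direction == "V" else h   # vertical splits cut along the width
    if kind == "BT":
        cuts = [0, length // 2, length]
    elif kind == "TT":
        cuts = [0, length // 4, 3 * length // 4, length]
    else:
        raise ValueError(f"unsupported split: {split}")
    means = []
    for start, end in zip(cuts, cuts[1:]):
        if direction == "V":
            values = [tex[i][j] for i in range(h) for j in range(start, end)]
        else:
            values = [tex[i][j] for i in range(start, end) for j in range(w)]
        means.append(sum(values) / len(values))
    return means


tex = [[1, 2, 2, 3] for _ in range(4)]
t_p1, t_p2, t_p3 = subpart_texture_means(tex, "TT-V")
```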

2. Adaptive adjustment of threshold

Whether to skip a particular partition type can be decided directly by comparing the average texture value T with the threshold Thr. The strategy looks simple, but the key is choosing a suitable threshold Thr. If Thr is too large, more partition types are misjudged as redundant and the coding loss increases; if Thr is too small, the encoding time cannot be reduced effectively.

Furthermore, the threshold Thr should differ for different CU sizes, partition types, and QPs. The algorithm resolves this difficulty by exploiting the "history and repetition" characteristic: the threshold Thr (zero by default) first grows from zero and then fluctuates toward a reasonable range. Specifically, for a 1st round CU, the threshold Thr is adjusted if equation 4 is satisfied.

Cost_partitioning_type > Cost_non_split (4)

where Cost_partitioning_type denotes the rate-distortion cost of a given partition type, and Cost_non_split denotes the rate-distortion cost when the CU is not split.

According to equation 4, the threshold is adjusted when the rate-distortion cost of not splitting is lower than that of a given partition type (such as QT, BT-H, BT-V, TT-H, or TT-V). In this case the encoder prefers to leave the CU unsplit rather than apply that partition type, so the existing threshold should be adjusted according to the CU's average texture value T and thereby move gradually toward a reasonable range. The adjustment applied once equation 4 is satisfied is given by equation 5.

where Thr and Thr_old denote the thresholds after and before adjustment, respectively, and T denotes the average texture value of the CU.

As equation 5 shows, the threshold Thr (zero by default) gradually increases as encoding proceeds. Once it reaches a certain size, the min() and max() functions let Thr fluctuate within a reasonable range and so adapt to the characteristics of QTMT partitioning. Notably, the coefficient of the min() function (7/8) is larger than that of the max() function (1/8), which keeps the coding loss under control in two ways. While Thr is growing from zero, the small coefficient (1/8) of the max() function slows its growth and prevents a jump in Thr caused by isolated extreme values. While Thr is fluctuating, the large coefficient (7/8) of the min() function makes Thr fall back quickly to a lower level whenever it becomes too large, which promptly reduces the misjudgment probability and the coding loss.
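Equation 5 itself is not reproduced in this text. The Python sketch below uses one form that is consistent with the description, namely a weight of 7/8 on min(Thr_old, T) and 1/8 on max(Thr_old, T); this form is an assumption rather than the confirmed formula, and the function names are hypothetical. The trigger condition follows equation 4.

```python
def update_threshold(thr_old, t_avg):
    """Assumed adjustment: weight 7/8 on the smaller value, 1/8 on the larger."""
    return (7 * min(thr_old, t_avg) + max(thr_old, t_avg)) / 8


def adjust_if_triggered(thr_old, t_avg, cost_partition, cost_non_split):
    # Equation 4: adjust only when not splitting is cheaper than this partition type.
    if cost_partition > cost_non_split:
        return update_threshold(thr_old, t_avg)
    return thr_old


# Growth phase: starting from zero, only 1/8 of T is taken up per adjustment.
grown = update_threshold(0.0, 80.0)
# Fall-back phase: a large threshold drops quickly toward a small T.
fallen = update_threshold(80.0, 8.0)
```

Under this assumed form, a threshold of zero meeting T = 80 rises only to 10, while a threshold of 80 meeting T = 8 drops to 17, which matches the slow-rise and fast-fall behaviour described above.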

3. Early termination partitioning

For the following rounds CU, QTMT pruning is performed using the threshold matrix adjusted during the 1st round CU. Specifically, the corresponding threshold Thr is first obtained according to the CU size, the partition type, the QP, and the kind of sub-strategy, and the corresponding pruning operation is then carried out. The three pruning sub-strategies, together with their conditions and methods for identifying and skipping redundant partition types, are briefly introduced below.

In the sub-strategy for homogeneity, for a certain partition type, if the average texture value T of a CU is smaller than the corresponding threshold Thr, that is, equation 6 is satisfied, it indicates that the region is relatively flat, and the corresponding partition type is skipped.
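The homogeneity check can be sketched in Python as follows, using only the condition stated above (T smaller than Thr). The function name `prune_by_homogeneity` and the dictionary layout are assumptions made for illustration.

```python
def prune_by_homogeneity(t_avg, thresholds):
    """Return the partition types still to be tested after the homogeneity check.

    thresholds: dict mapping partition type -> Thr for this CU size and QP.
    A partition type is skipped when the average texture T is below its Thr.
    """
    return [part for part, thr in thresholds.items() if not t_avg < thr]


thresholds = {"QT": 5.0, "BT-H": 12.0, "BT-V": 0.0}
remaining = prune_by_homogeneity(8.0, thresholds)
```

Because T is never negative, a threshold that is still at its default of zero never causes a partition type to be skipped, so pruning only takes effect after the threshold has been adjusted.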

In the sub-strategy for directionality, if equation 7 is satisfied, the region contains a distinct texture running in the direction opposite to that of the partition tree, and the corresponding partition type is skipped.

In the sub-strategy for texture differences between sub-parts, Diff_ratio is used to further identify and skip partition types. As equation 9 shows, when Diff_ratio (calculated by equation 8) is smaller than the threshold Thr, the corresponding partition type is terminated early. In this case a significant texture difference appears in the direction opposite to that of the partition tree, so the encoder tends to skip this partition type.

Diff_ratio<Thr (9)

In summary, based on the "history and repetition" characteristic, three sub-strategies are designed for homogeneity, directionality, and texture difference between sub-parts, respectively. The thresholds of the sub-strategies are adaptively adjusted while the 1st round CU is coded and are then used for QTMT pruning of the following rounds CU to terminate partitioning early.

To verify the effect of the algorithm, the history-based QTMT pruning algorithm was implemented, compiled, optimized, and integrated into VTM 10.0. In the tests, 100 frames of every sequence were encoded under the All-Intra configuration in accordance with the common test conditions. The experimental results are shown in Table 2.

The algorithm is evaluated using BD-Rate, BD-PSNR, time saving (TS), and the TS/BD-Rate ratio. The TS/BD-Rate ratio measures the trade-off between encoding speed and loss: the larger the ratio, the better the trade-off. BD-Rate and BD-PSNR measure the coding loss; a positive BD-Rate or a negative BD-PSNR indicates some loss in coding efficiency. TS measures the effect of the speed optimization, and a larger TS indicates a greater speed-up. Its calculation is given by equation 10, where T_o and T_p denote the encoding time before and after speed optimization, respectively.
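Equation 10 is not reproduced in this text. The Python sketch below assumes the conventional definition of time saving, TS = (T_o - T_p) / T_o expressed as a percentage, and computes the TS/BD-Rate ratio from it. The function names and the sample timings are illustrative only and are not taken from Table 2.

```python
def time_saving(t_o, t_p):
    """Time saving in percent, assuming the conventional (T_o - T_p) / T_o form."""
    return (t_o - t_p) / t_o * 100.0


def tradeoff_ratio(ts_percent, bd_rate_percent):
    """TS / BD-Rate ratio; larger means more time saved per unit of coding loss."""
    return ts_percent / bd_rate_percent


# Hypothetical timings in seconds: 1000 s before and 800 s after optimization.
ts = time_saving(1000.0, 800.0)
ratio = tradeoff_ratio(ts, 0.5)
```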

As Table 2 shows, the history-based QTMT pruning algorithm saves about 20% of the encoding time at a coding loss (BD-Rate increase) of only 0.18%. In addition, BD-PSNR drops by less than 0.02 dB for all sequences, indicating that video quality is essentially unaffected. The very low BD-Rate also shows that the algorithm rarely misjudges partition modes as redundant and that the hierarchical threshold adjustment is well founded.

TABLE 2

In summary, the present invention is the first to use the "history and repetition" characteristic to accelerate the QTMT partitioning of VVC. The algorithm offers a new perspective for future algorithm design: besides video content features, the characteristics of the encoder and of the QTMT partition scheme can also be applied to the speed optimization of VVC.

In addition, the adaptive threshold adjusting method introduced in the algorithm can also be popularized and applied to other QTMT pruning algorithms to replace the original artificial threshold to effectively control the coding loss.

It is worth mentioning that the algorithm has good extensibility and compatibility. It does not conflict with other QTMT pruning algorithms and can be combined with them for greater speed optimization. Specifically, other QTMT pruning algorithms can be applied when coding the 1st round CU, and the history-based QTMT pruning algorithm can be executed on the following rounds CU, to achieve better speed optimization.

In alternative embodiments, the functions/acts noted in the block diagrams may occur out of the order noted in the operational illustrations. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality/acts involved. Furthermore, the embodiments presented and described in the flow charts of the present invention are provided by way of example in order to provide a more thorough understanding of the technology. The disclosed methods are not limited to the operations and logic flows presented herein. Alternative embodiments are contemplated in which the order of various operations is changed and in which sub-operations described as part of larger operations are performed independently.

Furthermore, although the present invention is described in the context of functional modules, it should be understood that, unless otherwise stated to the contrary, one or more of the described functions and/or features may be integrated in a single physical device and/or software module, or one or more functions and/or features may be implemented in a separate physical device or software module. It will also be appreciated that a detailed discussion of the actual implementation of each module is not necessary for an understanding of the present invention. Rather, the actual implementation of the various functional modules in the apparatus disclosed herein will be understood within the ordinary skill of an engineer, given the nature, function, and internal relationship of the modules. Accordingly, those skilled in the art can, using ordinary skill, practice the invention as set forth in the claims without undue experimentation. It is also to be understood that the specific concepts disclosed are merely illustrative of and not intended to limit the scope of the invention, which is defined by the appended claims and their full scope of equivalents.

The functions, if implemented in the form of software functional units and sold or used as a stand-alone product, may be stored in a computer-readable storage medium. Based on this understanding, the technical solution of the present invention may be embodied in the form of a software product, which is stored in a storage medium and includes instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the steps of the method according to the embodiments of the present invention. The aforementioned storage medium includes a USB flash drive, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk, an optical disk, and various other media capable of storing program code.

The logic and/or steps represented in the flowcharts or otherwise described herein, e.g., an ordered listing of executable instructions that can be considered to implement logical functions, can be embodied in any computer-readable medium for use by or in connection with an instruction execution system, apparatus, or device, such as a computer-based system, processor-containing system, or other system that can fetch the instructions from the instruction execution system, apparatus, or device and execute the instructions. For the purposes of this description, a "computer-readable medium" can be any means that can contain, store, communicate, propagate, or transport the program for use by or in connection with the instruction execution system, apparatus, or device.

More specific examples (a non-exhaustive list) of the computer-readable medium would include the following: an electrical connection (electronic device) having one or more wires, a portable computer diskette (magnetic device), a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber device, and a portable compact disc read-only memory (CDROM). Additionally, the computer-readable medium could even be paper or another suitable medium upon which the program is printed, as the program can be electronically captured, via for instance optical scanning of the paper or other medium, then compiled, interpreted or otherwise processed in a suitable manner if necessary, and then stored in a computer memory.

It should be understood that portions of the present invention may be implemented in hardware, software, firmware, or a combination thereof. In the above embodiments, the various steps or methods may be implemented in software or firmware stored in memory and executed by a suitable instruction execution system. For example, if implemented in hardware, as in another embodiment, any one or combination of the following techniques, which are known in the art, may be used: a discrete logic circuit having a logic gate circuit for implementing a logic function on a data signal, an application specific integrated circuit having an appropriate combinational logic gate circuit, a Programmable Gate Array (PGA), a Field Programmable Gate Array (FPGA), or the like.

In the description herein, references to the description of the term "one embodiment," "some embodiments," "an example," "a specific example," or "some examples," etc., mean that a particular feature, structure, material, or characteristic described in connection with the embodiment or example is included in at least one embodiment or example of the invention. In this specification, the schematic representations of the terms used above do not necessarily refer to the same embodiment or example. Furthermore, the particular features, structures, materials, or characteristics described may be combined in any suitable manner in any one or more embodiments or examples.

While embodiments of the invention have been shown and described, it will be understood by those of ordinary skill in the art that: various changes, modifications, substitutions and alterations can be made to the embodiments without departing from the principles and spirit of the invention, the scope of which is defined by the claims and their equivalents.

While the preferred embodiments of the present invention have been illustrated and described, it will be understood by those skilled in the art that various changes in form and details may be made therein without departing from the spirit and scope of the invention as defined by the appended claims.
