Encoding method, encoding device, storage medium, and electronic apparatus

Document No.: 1849957 Publication date: 2021-11-16

Reading notes: This technology, "Encoding method, encoding device, storage medium, and electronic apparatus", was designed and created by 何鸣阮良陈功许迅代苑莹程玲韩庆瑞陈丽 on 2021-08-10. Its main content is as follows. The disclosed embodiments relate to an encoding method, an encoding device, a storage medium, and an electronic apparatus, and relate to the field of image processing technologies. The encoding method comprises: updating the number of target frames in the current coding period according to the type of the current frame, the type of the current frame being either a target frame or a non-target frame; in response to the number of target frames in the current coding period being greater than a first preset frame number and the number of coded frames in the current coding period being greater than a second preset frame number, coding the current frame according to a first code rate and ending the current coding period; and in response to the number of target frames in the current coding period being greater than the first preset frame number and the number of coded frames in the current coding period being not greater than the second preset frame number, coding the current frame according to a second code rate. The first code rate is higher than a reference code rate determined according to current network performance, and the second code rate is lower than the reference code rate. The disclosure can improve the quality of the encoded video without affecting the video volume.

1. A method of encoding, comprising:

updating the number of target frames in the current coding period according to the type of the current frame; the types of the current frame comprise a target frame and a non-target frame;

in response to the number of target frames in the current coding period being greater than a first preset frame number and the number of coded frames in the current coding period being greater than a second preset frame number, coding the current frame according to a first code rate, and ending the current coding period;

in response to the number of target frames in the current coding period being greater than the first preset frame number and the number of coded frames in the current coding period being not greater than the second preset frame number, coding the current frame according to a second code rate;

wherein the first code rate is higher than a reference code rate determined according to the current network performance, and the second code rate is lower than the reference code rate determined according to the current network performance.

2. The method of claim 1, further comprising:

in response to the number of target frames in the current coding period being not greater than the first preset frame number and the number of coded frames in the current coding period being greater than a third preset frame number, coding the current frame according to the first code rate, and ending the current coding period;

wherein the third preset frame number is greater than the second preset frame number.

3. The method of claim 2, further comprising:

in response to the number of target frames in the current coding period being not greater than the first preset frame number and the number of coded frames in the current coding period being not greater than the third preset frame number, coding the current frame according to the second code rate.

4. A method according to any one of claims 1 to 3, wherein the second code rate is determined by:

reducing the reference code rate according to a preset reduction proportion to obtain the second code rate.

5. The method of claim 4, wherein the first code rate is determined by:

in the current coding period, acquiring the N coded frames preceding the current frame;

obtaining a code rate difference value between the actual code rate of each coded frame and the reference code rate;

determining the first code rate according to the N code rate difference values and the reference code rate;

wherein N is a positive integer greater than the second preset frame number and less than the third preset frame number.

6. The method of claim 5, wherein the determining the first code rate according to the N code rate differences and the reference code rate comprises:

obtaining a first accumulated value of the N code rate difference values;

obtaining a product of the first accumulated value and the reference code rate;

and acquiring a second accumulated value of the product and the reference code rate, and determining the second accumulated value as the first code rate.

7. The method according to any of claims 1 to 3, wherein the type of the current frame is determined by:

performing saliency detection on the current frame to obtain the area of a salient region;

in response to the area of the salient region being greater than a preset area threshold, determining the current frame as the target frame;

in response to the area of the salient region not being greater than the preset area threshold, determining that the current frame is the non-target frame.

8. An encoding apparatus, comprising:

a frame type detection module, configured to update the number of target frames in the current coding period according to the type of the current frame; the types of the current frame comprise a target frame and a non-target frame;

a first encoding module, configured to, in response to that the number of target frames in the current encoding period is greater than a first preset frame number and the number of encoded frames in the current encoding period is greater than a second preset frame number, perform encoding processing on the current frame according to a first code rate, and end the current encoding period;

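a second encoding module, configured to perform encoding processing on the current frame according to a second code rate in response to the number of target frames in the current encoding period being greater than the first preset frame number and the number of encoded frames in the current encoding period being not greater than the second preset frame number;

wherein the first code rate is higher than a reference code rate determined according to the current network performance, and the second code rate is lower than the reference code rate determined according to the current network performance.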

9. A computer-readable storage medium, on which a computer program is stored, which, when being executed by a processor, carries out the method of any one of claims 1 to 7.

10. An electronic device, comprising:

a processor; and

a memory for storing executable instructions of the processor;

wherein the processor is configured to perform the method of any of claims 1-7 via execution of the executable instructions.

Technical Field

Embodiments of the present disclosure relate to the field of image processing technologies, and in particular, to an encoding method, an encoding apparatus, a computer-readable storage medium, and an electronic device.

Background

This section is intended to provide a background or context to the embodiments of the disclosure recited in the claims and the description herein is not admitted to be prior art by inclusion in this section.

The video code rate (bit rate) is the number of data bits transmitted per unit time during data transmission. It is loosely analogous to a sampling rate: the higher the code rate, the higher the precision and the closer the processed file is to the original file. Currently, video is generally encoded using a Constant Bit Rate (CBR) scheme, in which the bit rate is kept substantially constant.

Disclosure of Invention

However, when the video picture is relatively complex, the current encoding method causes a relatively large loss of picture quality after encoding, resulting in a blurred picture and poor video quality.

Therefore, there is a need for an encoding method that can improve the quality of the encoded video.

In this context, embodiments of the present disclosure are intended to provide an encoding method, an encoding apparatus, a computer-readable storage medium, and an electronic device.

According to a first aspect of the disclosed embodiments, there is provided an encoding method, including: updating the number of target frames in the current coding period according to the type of the current frame; the types of the current frame comprise a target frame and a non-target frame; in response to the fact that the number of target frames in the current coding period is larger than a first preset frame number and the number of coded frames in the current coding period is larger than a second preset frame number, coding the current frame according to a first code rate, and finishing the current coding period; in response to that the number of target frames in the current coding period is greater than the first preset frame number and the number of coded frames in the current coding period is not greater than the second preset frame number, coding the current frame according to a second code rate; wherein the first code rate is higher than a reference code rate determined according to the current network performance, and the second code rate is lower than the reference code rate determined according to the current network performance.

In an optional embodiment, the method further comprises: in response to that the number of target frames in the current coding period is not greater than the first preset frame number and the number of coded frames in the current coding period is greater than a third preset frame number, coding the current frame according to the first code rate, and ending the current coding period; and the third preset frame number is greater than the second preset frame number.

In an optional embodiment, the method further comprises: and in response to the fact that the number of target frames in the current coding period is not greater than the first preset frame number and the number of coded frames in the current coding period is not greater than the third preset frame number, coding the current frame according to the second code rate.
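The conditions of the first aspect and the two optional embodiments above can be combined into a single decision, sketched below in Python. This is only an illustrative reading, not the disclosed implementation: the names `Decision`, `select_rate`, `t1`, `t2` and `t3` are hypothetical stand-ins for the first, second and third preset frame numbers, and the rates are returned as the labels "first" and "second" rather than as numeric values.

```python
from typing import NamedTuple


class Decision(NamedTuple):
    rate: str        # "first" (above reference) or "second" (below reference)
    end_cycle: bool  # True when the current coding period ends after this frame


def select_rate(target_frames: int, encoded_frames: int,
                t1: int, t2: int, t3: int) -> Decision:
    """Pick the code rate for the current frame from the two counters.

    t1, t2, t3 are hypothetical names for the first, second and third
    preset frame numbers; the text requires t3 > t2.
    """
    if t3 <= t2:
        raise ValueError("the third preset frame number must exceed the second")
    # A sufficiently complex period (target frames > t1) is compared against
    # the shorter limit t2; otherwise the longer limit t3 applies.
    limit = t2 if target_frames > t1 else t3
    if encoded_frames > limit:
        return Decision("first", True)   # encode high, then end the period
    return Decision("second", False)     # encode low, stay in the period
```

Because the third preset frame number is larger than the second, a period with many target frames reaches its high-rate frame sooner than a period with few.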

In an alternative embodiment, the second code rate is determined by: and reducing the reference code rate according to a preset reduction proportion to obtain the second code rate.
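One plausible reading of "reducing the reference code rate according to a preset reduction proportion" is a simple scaling, sketched below. The function name, the kbps unit and the interpretation of the proportion as a fraction between 0 and 1 are assumptions made for illustration; the text does not fix them.

```python
def second_code_rate(reference_kbps: float, reduction: float) -> float:
    """Scale the reference code rate down by a preset proportion.

    `reduction` is assumed to be a fraction in (0, 1), e.g. 0.25 for 25 %.
    """
    if not 0.0 < reduction < 1.0:
        raise ValueError("reduction proportion must lie strictly between 0 and 1")
    return reference_kbps * (1.0 - reduction)
```

For instance, `second_code_rate(1000.0, 0.25)` yields a second code rate of 750 kbps under these assumptions.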

In an optional embodiment, the first code rate is determined by: in the current coding period, acquiring the N coded frames preceding the current frame; obtaining a code rate difference value between the actual code rate of each coded frame and the reference code rate; determining the first code rate according to the N code rate difference values and the reference code rate; wherein N is a positive integer greater than the second preset frame number and less than the third preset frame number.

In an optional implementation manner, the determining the first code rate according to the N code rate difference values and the reference code rate includes: obtaining a first accumulated value of the N code rate difference values; obtaining a product of the first accumulated value and the reference code rate; and acquiring a second accumulated value of the product and the reference code rate, and determining the second accumulated value as the first code rate.
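Read literally, the three steps above amount to the expression below, given only as a sketch. The symbols are introduced here for illustration and do not appear in the original: $R_{\text{ref}}$ is the reference code rate, $d_i$ is the code rate difference value of the $i$-th of the N coded frames, $S$ is the first accumulated value, $P$ is the product, and $R_1$ is the first code rate. The sign convention and any normalization of $d_i$ are not stated in this passage, so none is assumed.

$$
S = \sum_{i=1}^{N} d_i, \qquad P = S \cdot R_{\text{ref}}, \qquad R_1 = P + R_{\text{ref}} = R_{\text{ref}}\,(1 + S)
$$

Since the first code rate is required to be higher than the reference code rate, this reading implies $S > 0$ for a positive $R_{\text{ref}}$.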

In an alternative embodiment, the type of the current frame is determined by: performing saliency detection on the current frame to obtain the area of a salient region; in response to the area of the salient region being greater than a preset area threshold, determining the current frame as the target frame; in response to the area of the salient region not being greater than the preset area threshold, determining that the current frame is the non-target frame.

In an alternative embodiment, the type of the current frame is determined by: performing motion detection on the current frame to determine whether the current frame contains a moving target; in response to the moving target being included in the current frame, determining the current frame to be the target frame; in response to the moving target not being included in the current frame, determining the current frame to be the non-target frame.

In an alternative embodiment, the type of the current frame is determined by: performing background detection on the current frame to acquire a background area and a foreground area in the current frame; detecting whether relative motion exists between the foreground area and the background area; determining the current frame as the target frame in response to the foreground region and the background region having relative motion; determining the current frame as the non-target frame in response to an absence of relative motion of the foreground region and the background region.

According to a second aspect of the embodiments of the present disclosure, there is provided an encoding apparatus including: the frame type detection module is used for updating the number of the target frames in the current coding period according to the type of the current frame; the types of the current frame comprise a target frame and a non-target frame; a first encoding module, configured to, in response to that the number of target frames in the current encoding period is greater than a first preset frame number and the number of encoded frames in the current encoding period is greater than a second preset frame number, perform encoding processing on the current frame according to a first code rate, and end the current encoding period; a second encoding module, configured to perform encoding processing on the current frame according to a second code rate in response to that the number of target frames in the current encoding period is greater than the first preset number of frames and the number of encoded frames in the current encoding period is not greater than the second preset number of frames; wherein the first code rate is higher than a reference code rate determined according to the current network performance, and the second code rate is lower than the reference code rate determined according to the current network performance.

In an alternative embodiment, the first encoding module is configured to: in response to that the number of target frames in the current coding period is not greater than the first preset frame number and the number of coded frames in the current coding period is greater than a third preset frame number, coding the current frame according to the first code rate, and ending the current coding period; and the third preset frame number is greater than the second preset frame number.

In an alternative embodiment, the second encoding module is configured to: and in response to the fact that the number of target frames in the current coding period is not greater than the first preset frame number and the number of coded frames in the current coding period is not greater than the third preset frame number, coding the current frame according to the second code rate.

In an alternative embodiment, the second encoding module is configured to: and reducing the reference code rate according to a preset reduction proportion to obtain the second code rate.

In an alternative embodiment, the first encoding module is configured to: in the current coding period, acquire the N coded frames preceding the current frame; obtain a code rate difference value between the actual code rate of each coded frame and the reference code rate; determine the first code rate according to the N code rate difference values and the reference code rate; wherein N is a positive integer greater than the second preset frame number and less than the third preset frame number.

In an alternative embodiment, the first encoding module is configured to: obtaining a first accumulated value of the N code rate difference values; obtaining a product of the first accumulated value and the reference code rate; and acquiring a second accumulated value of the product and the reference code rate, and determining the second accumulated value as the first code rate.

In an optional embodiment, the frame type detection module is configured to: perform saliency detection on the current frame to obtain the area of a salient region; in response to the area of the salient region being greater than a preset area threshold, determine the current frame as the target frame; in response to the area of the salient region not being greater than the preset area threshold, determine that the current frame is the non-target frame.

In an optional embodiment, the frame type detection module is configured to: perform motion detection on the current frame to determine whether the current frame contains a moving target; in response to the moving target being included in the current frame, determine the current frame to be the target frame; in response to the moving target not being included in the current frame, determine the current frame to be the non-target frame.

In an optional embodiment, the frame type detection module is configured to: performing background detection on the current frame to acquire a background area and a foreground area in the current frame; detecting whether relative motion exists between the foreground area and the background area; determining the current frame as the target frame in response to the foreground region and the background region having relative motion; determining the current frame as the non-target frame in response to an absence of relative motion of the foreground region and the background region.

According to a third aspect of embodiments of the present disclosure, there is provided a computer-readable storage medium having stored thereon a computer program which, when executed by a processor, implements the encoding method of the first aspect described above.

According to a fourth aspect of the disclosed embodiments, there is provided an electronic device comprising: a processor; and a memory for storing executable instructions of the processor; wherein the processor is configured to perform the encoding method of the first aspect described above via execution of the executable instructions.

According to the encoding method, the encoding device, the computer-readable storage medium and the electronic device of the embodiments of the disclosure, the complexity of the video picture in each coding period is determined according to the number of target frames in that period. On the premise that the video picture is sufficiently complex (the number of target frames is greater than the first preset frame number), the current frame is encoded at the higher first code rate only if the number of coded frames is greater than the second preset frame number, and is encoded at the lower second code rate if the number of coded frames is not greater than the second preset frame number. Therefore, on the one hand, the disclosure can solve the technical problem in CBR coding that encoding at a constant code rate causes large picture loss and poor video quality when the video picture is complex, thereby improving the quality of the encoded video. On the other hand, the disclosure can solve the problems in ABR (Average Bit Rate) coding of video stuttering and large network delay, which arise because encoding switches to a high code rate as soon as the video picture is detected to be complex, thereby improving the quality of the encoded video without affecting the video volume.

Drawings

The above and other objects, features and advantages of exemplary embodiments of the present disclosure will become readily apparent from the following detailed description read in conjunction with the accompanying drawings. Several embodiments of the present disclosure are illustrated by way of example, and not by way of limitation, in the figures of the accompanying drawings and in which:

FIG. 1 shows a schematic diagram of CBR coding;

FIG. 2 shows a flow diagram of an encoding method according to an embodiment of the present disclosure;

FIG. 3 illustrates a flow chart for determining a type of a current frame according to an embodiment of the present disclosure;

FIG. 4 illustrates another flow chart for determining the type of a current frame according to an embodiment of the present disclosure;

FIG. 5 illustrates yet another flow chart for determining the type of a current frame according to an embodiment of the present disclosure;

FIG. 6 illustrates a sub-flow diagram of an encoding method according to an embodiment of the present disclosure;

FIG. 7 shows a flow diagram for determining a first code rate according to an embodiment of the disclosure;

FIG. 8 illustrates an overall flow diagram of an encoding method according to an embodiment of the present disclosure;

FIG. 9 shows an overall flow diagram of another encoding method according to an embodiment of the present disclosure;

FIG. 10 shows a schematic diagram of an encoding method according to an embodiment of the present disclosure;

FIG. 11 shows a schematic diagram of an encoding apparatus according to an embodiment of the present disclosure;

FIG. 12 shows a block diagram of an electronic device according to an embodiment of the disclosure.

In the drawings, the same or corresponding reference numerals indicate the same or corresponding parts.

Detailed Description

The principles and spirit of the present disclosure will be described with reference to a number of exemplary embodiments. It is understood that these embodiments are given solely for the purpose of enabling those skilled in the art to better understand and to practice the present disclosure, and are not intended to limit the scope of the present disclosure in any way. Rather, these embodiments are provided so that this disclosure will be thorough and complete, and will fully convey the scope of the disclosure to those skilled in the art.

As will be appreciated by one skilled in the art, embodiments of the present disclosure may be embodied as a system, apparatus, device, method, or computer program product. Accordingly, the present disclosure may be embodied in the form of: entirely hardware, entirely software (including firmware, resident software, micro-code, etc.), or a combination of hardware and software.

According to an embodiment of the present disclosure, an encoding method, an encoding apparatus, a computer-readable storage medium, and an electronic device are provided.

In this document, any number of elements in the drawings is by way of example and not by way of limitation, and any nomenclature is used solely for differentiation and not by way of limitation.

The principles and spirit of the present disclosure are explained in detail below with reference to several representative embodiments of the present disclosure.

Summary of the Invention

The inventors have found that the current encoding method encodes only according to a set constant code rate, so that when a video picture is relatively complex, the loss of picture quality after encoding is relatively large and the video quality is poor.

In view of the above, the basic idea of the present disclosure is as follows. The complexity of the video picture in each coding period is determined according to the number of target frames in that period. On the premise that the video picture is sufficiently complex (the number of target frames is greater than a first preset frame number), the current frame is encoded at the higher first code rate only if the number of coded frames is greater than a second preset frame number, and is encoded at the lower second code rate if the number of coded frames is not greater than the second preset frame number. Therefore, on the one hand, the disclosure can solve the technical problem in CBR coding that encoding at a constant code rate causes large picture loss and poor video quality when the video picture is complex, thereby improving the quality of the encoded video. On the other hand, the disclosure can solve the problems in ABR coding of video stuttering and large network delay, which arise because encoding switches to a high code rate as soon as the video picture is detected to be complex, thereby improving the quality of the encoded video without affecting the video volume.

Having described the general principles of the present disclosure, various non-limiting embodiments of the present disclosure are described in detail below.

Application Scenario Overview

It should be noted that the following application scenarios are merely illustrated to facilitate understanding of the spirit and principles of the present disclosure, and embodiments of the present disclosure are not limited in this respect. Rather, embodiments of the present disclosure may be applied to any scenario where applicable.

In live video and video communication scenarios, embodiments of the disclosure detect the type of each video frame in an input video stream. When the current frame is detected as a target frame, the number of target frames detected in the current coding period is updated. Then, in response to the number of target frames in the current coding period being greater than a first preset frame number and the number of coded frames in the current coding period being greater than a second preset frame number, the current frame is coded according to a first code rate (the first code rate is higher than a reference code rate determined according to current network performance), and the current coding period ends so that the next coding period begins. In response to the number of target frames in the current coding period being greater than the first preset frame number and the number of coded frames in the current coding period being not greater than the second preset frame number, the current frame is coded according to a second code rate (the second code rate is lower than the reference code rate determined according to current network performance).

Exemplary method

In the related art, some embodiments employ the following two encoding methods:

① CBR coding. In live video and video communication scenarios, the video code rate often has to be adjusted according to network conditions, and the CBR code rate control mode is introduced to ensure stable transmission of the video. CBR is a code-rate-first coding mode: according to a set target code rate, the CBR algorithm adjusts the QP (quantization parameter) of the coded video in real time. Quantization is an important step in video coding. The higher the QP, the coarser the quantization, the higher the compression ratio, the lower the code rate and the lower the video quality: mosaic (blocking) artifacts become larger, and the picture loses fine detail and becomes blurred. When the set target code rate is lower than the current coding code rate, the QP of the frame to be coded is increased to reduce the coding code rate; when the set target code rate is higher than the current coding code rate, the QP of the frame to be coded is decreased to increase the coding code rate. Fig. 1 shows a schematic diagram of a CBR encoding method. As can be seen from Fig. 1, the code rate in CBR encoding is substantially constant.
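The QP adjustment rule described for CBR can be illustrated with the short Python sketch below. It is a deliberately simplified model and not an actual encoder's rate control: the function name `adjust_qp`, the kbps unit and the fixed step of 1 are assumptions, and the 0 to 51 range is the QP range of H.264 and H.265 for 8-bit video.

```python
QP_MIN, QP_MAX = 0, 51  # QP range of H.264/H.265 for 8-bit video


def adjust_qp(qp: int, target_kbps: float, current_kbps: float, step: int = 1) -> int:
    """Move QP one step toward the target code rate (simplified CBR rule).

    The fixed step size is an assumption; real rate control is more elaborate.
    """
    if target_kbps < current_kbps:
        qp += step   # coarser quantization lowers the code rate
    elif target_kbps > current_kbps:
        qp -= step   # finer quantization raises the code rate
    # Keep the result inside the valid QP range.
    return max(QP_MIN, min(QP_MAX, qp))
```

The clamp at the end reflects that QP cannot grow without bound, which is one reason a complex picture at a fixed target rate ends up visibly degraded.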

② ABR coding. In video-on-demand or live video scenarios, the ABR coding mode may be adopted. In the ABR code rate control mode, the average code rate over a period of time is kept at the set code rate, while different code rates are allocated according to the complexity of the video: a higher code rate is allocated to more complex frames to improve their coding quality, and a lower code rate is allocated to simpler frames, so that the overall quality of the video can be controlled.

However, the above methods have the following drawbacks:

In CBR coding, since the video coding rate is adjusted only according to the set target rate, when the video picture is relatively complex, the loss of picture quality after coding is relatively large and the subjective effect of the video picture is relatively poor. In addition, since CBR maintains a constant code rate for a period of time, the coding loss and coding error in a given frame propagate along the reference relationship between frames, which degrades the video for a period of time and shows up as a blurred picture and obvious blocking artifacts.

In ABR coding, the complexity of the video frames within a period of time is considered as a whole, and different code rates are allocated to video frames of different complexity. For a more complex video picture within a period of time, the video code rate exceeds the set target code rate for a short time. The code rate is directly proportional to the video volume, and an excessively high code rate reduces the video frame rate in network transmission, which shows up as video stuttering and large network delay. In a video communication scenario with high real-time requirements, excessive network delay can significantly affect user experience. Meanwhile, compared with CBR coding, ABR coding has higher complexity and places a heavier coding load on the user side, which may increase Central Processing Unit (CPU) usage and power consumption. ABR coding is therefore generally applied to server-side coding and is not widely applied on the user side.

Therefore, there is a strong need for an encoding method that can improve the display quality of video without affecting the video volume.

An exemplary embodiment of the present disclosure first provides an encoding method. Fig. 2 shows an exemplary flow of the encoding method, which may include the following steps S210 to S230:

In step S210, the number of target frames in the current coding period is updated according to the type of the current frame.

In this embodiment, the current frame may be detected to determine its type (the type of the current frame is either a target frame or a non-target frame). When the current frame is detected as a target frame, the number of target frames in the current coding period is increased by one; when the current frame is detected as a non-target frame, the number of target frames in the current coding period is kept unchanged.

For example, if 3 target frames have already been detected in the current coding period, the number of target frames in the current coding period is updated to 4 when the current frame is detected as a target frame, and remains 3 when the current frame is detected as a non-target frame.
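The counter update of step S210 is small enough to state directly in code. The sketch below mirrors the worked example; the function name `update_target_count` is hypothetical.

```python
def update_target_count(count: int, is_target: bool) -> int:
    """Return the target-frame count after observing the current frame."""
    # Only a target frame increments the counter; a non-target frame leaves it as is.
    return count + 1 if is_target else count
```

With 3 target frames already counted, a target frame gives 4 and a non-target frame leaves the count at 3, as in the example.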

The input video may include a plurality of image frames, each of which represents a still image, and the image being detected or processed at the current time is the current frame.

The target frame may be an image frame satisfying at least one of the following conditions: the area of the salient region in the current frame is larger than a preset area threshold, the current frame comprises a moving target, and the foreground region and the background region of the current frame have relative motion. Accordingly, when the current frame does not satisfy the condition of the target frame, the current frame may be determined to be a non-target frame.
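The "at least one of the following conditions" rule can be sketched as a simple disjunction. In the Python sketch below, the detectors themselves (saliency, motion, foreground/background analysis) are not implemented; their results are assumed to be available in a hypothetical `FrameFeatures` record, and the area is assumed to be measured in pixels.

```python
from dataclasses import dataclass


@dataclass
class FrameFeatures:
    """Hypothetical per-frame detector outputs (the detectors are not shown)."""
    salient_area: float        # area of the salient region, assumed in pixels
    has_moving_target: bool    # result of motion detection
    has_relative_motion: bool  # foreground moves relative to the background


def is_target_frame(features: FrameFeatures, area_threshold: float) -> bool:
    """A frame is a target frame if it meets at least one of the three conditions."""
    return (features.salient_area > area_threshold
            or features.has_moving_target
            or features.has_relative_motion)
```

A frame that meets none of the three conditions is classified as a non-target frame. Note that an area exactly equal to the threshold does not qualify, since the condition is "greater than".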

In step S220, in response to that the number of target frames in the current coding period is greater than a first preset frame number and the number of coded frames in the current coding period is greater than a second preset frame number, the current frame is coded according to the first code rate, and the current coding period is ended.

In this embodiment, when it is detected that the number of target frames in the current coding period is greater than the first preset number of frames and the number of coded frames in the current coding period is greater than the second preset number of frames, the current frame may be coded at the first code rate by an H.264 or H.265 encoder. Thus, even when the video picture is complex, the current frame is coded at the first code rate only when the number of coded frames is greater than the second preset frame number. This avoids the excessive video file volume and unsmooth playback caused by continuous high-code-rate coding, reduces video stuttering caused by delay, and improves the viewing experience of the user.

After the current frame has been encoded according to the first code rate, a jump may be made to the next encoding period (i.e., the number of target frames and the number of encoded frames are both reset to zero). It can be seen that the coding period in the present disclosure does not have a fixed number of frames: each coding period starts with a video frame coded according to the second code rate and ends with a video frame coded according to the first code rate. That is, each time a video frame is coded at the first code rate, the current coding period is ended and the next coding period is entered.

The first preset frame number is a preset threshold for the number of target frames. For example, the first preset frame number may be greater than the second preset frame number and smaller than the third preset frame number: when the second preset frame number is 2 frames and the third preset frame number is 6 frames, the first preset frame number may be set to 5 frames. The specific value may be set according to the actual situation and is not particularly limited by the present disclosure. By setting the judgment condition of the first preset frame number, the complexity of the video picture in the current coding period can be detected; that is, when the number of target frames in the current coding period is greater than the first preset frame number, the video picture in the current coding period is determined to be sufficiently complex.

The second preset frame number is a preset threshold for the number of coded frames in each coding period, for example 2 frames. It can be set according to actual conditions and is not particularly limited by the present disclosure. By setting the judgment condition of the second preset frame number, two frames of images coded at the first code rate can be kept apart, which avoids the stuttering video picture and large network delay caused by continuous high-code-rate coding and ensures the transmission rate of the video.

The first code rate is higher than a reference code rate determined according to current network performance. The network performance may be a network bandwidth (the amount of data that can be transmitted in a unit time), a network delay (the time taken for a message or packet to be transmitted from one end of the network to the other end), a bandwidth-delay product (the product of the capacity of a data link and the round-trip communication delay), and the like. It may be set according to the actual situation and is not particularly limited by the present disclosure.

In step S230, in response to that the number of target frames in the current coding period is greater than a first preset number of frames and the number of coded frames in the current coding period is not greater than a second preset number of frames, the current frame is coded according to a second code rate.

Wherein the second code rate is lower than a reference code rate determined according to current network performance. Therefore, the method and the device can solve the problems of video picture blockage and large network delay caused by coding at a high code rate in ABR coding once a video picture is detected to be complex, and can also reduce the coding complexity and the coding pressure of a user side.

According to the method, the complexity of the video picture in each coding period is determined from the number of target frames in that coding period. On the premise that the video picture is sufficiently complex (the number of target frames is greater than the first preset frame number), the current frame is coded at the higher first code rate only if the number of coded frames is greater than the second preset frame number, and is coded at the lower second code rate if the number of coded frames is not greater than the second preset frame number. Therefore, on the one hand, this solves the problem in CBR coding that, when the video picture is complex, coding at a constant code rate causes large picture loss and poor video quality, so the quality of the coded video is improved. On the other hand, it solves the problem in ABR coding that video stuttering and large network delay are caused by coding at a high code rate as soon as the video picture is detected to be complex, so the quality of the coded video is improved without affecting the video volume.

In the above step S210, in an alternative embodiment, the type of the current frame may be determined by a saliency detection algorithm, and specifically, referring to fig. 3, fig. 3 shows a flowchart for determining the type of the current frame according to an embodiment of the present disclosure, which includes steps S301 to S303:

in step S301, saliency detection is performed on the current frame to obtain the area of a saliency region.

In this embodiment, saliency detection may be performed on the current frame to obtain the area of a saliency region in the current frame. Wherein, the salient region is the region where the most noticeable target object is located in the current frame.

In step S302, in response to the area of the saliency region being greater than the preset area threshold, the current frame is determined to be the target frame.

In this embodiment, if the area of the significant region is greater than a preset area threshold (a preset area value, which may be changed according to an actual situation, and is not particularly limited by this disclosure), the current frame may be determined as the target frame.

In step S303, in response to the area of the saliency region not being greater than the preset area threshold, the current frame is determined to be a non-target frame.

In this embodiment, if the area of the saliency region is not greater than (i.e., less than or equal to) the preset area threshold, the current frame may be determined as the non-target frame.
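Steps S301 to S303 can be illustrated by the following minimal sketch. It assumes that a saliency detector (not shown, and not specified by the disclosure) has already produced a per-pixel saliency map with values in [0, 1]. The binarization level `binarize_at` and the pixel-count definition of "area" are assumptions introduced only for this example.

```python
def salient_area(saliency_map, binarize_at=0.5):
    """Count pixels whose saliency reaches the (assumed) binarization level."""
    return sum(1 for row in saliency_map for value in row
               if value >= binarize_at)


def is_target_by_saliency(saliency_map, area_threshold, binarize_at=0.5):
    """S302/S303: target frame only if the area is strictly greater."""
    return salient_area(saliency_map, binarize_at) > area_threshold


# Hypothetical 2x3 saliency map with three salient pixels.
example_map = [[0.9, 0.8, 0.1],
               [0.7, 0.2, 0.0]]
```

Note that an area exactly equal to the threshold yields a non-target frame, in line with the "not greater than" wording of step S303.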

In the above step S210, in another alternative embodiment, the type of the current frame may be determined by a motion detection algorithm, and specifically, referring to fig. 4, fig. 4 shows another flowchart for determining the type of the current frame according to the embodiment of the present disclosure, which includes steps S401 to S403:

in step S401, motion detection is performed on the current frame to obtain whether the current frame includes a moving object.

In this embodiment, a moving object detection algorithm may be used to perform motion detection on each frame of the input video to determine whether each frame contains a moving object. A moving object is a dynamic object whose position changes in some frame of the input video. The motion detection algorithm may be an inter-frame difference method (i.e., subtracting a selected frame from the picture to be identified), a background difference method (i.e., modeling the background and comparing the background image with the picture to be identified), an optical flow method (i.e., using the change of pixels in the image sequence in the time domain and the correlation between adjacent frames to find the correspondence between the previous frame and the current frame, and thereby calculating the motion information of an object between adjacent frames), a feature matching method, a KNN (K-Nearest Neighbor) algorithm, or a variation of these methods (e.g., a three-frame difference method or a five-frame difference method). The algorithm may be chosen according to actual conditions and is not specially limited by the present disclosure.

In step S402, in response to the moving object being included in the current frame, the current frame is determined to be a target frame.

In this step, when it is detected that the current frame includes the moving target, it may be determined that the image complexity of the current frame is high, that is, the current frame is the target frame.

In step S403, in response to no moving object being included in the current frame, the current frame is determined to be a non-target frame.

In this step, when it is detected that the current frame does not include the moving target, it may be determined that the image complexity of the current frame is low, that is, the current frame is the non-target frame.
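Of the algorithms listed above, the inter-frame difference method is the simplest to illustrate. The sketch below assumes grayscale frames given as equally sized 2D lists of integers; the two thresholds (`pixel_delta` and `min_changed_pixels`) are hypothetical tuning parameters, not values taken from the disclosure.

```python
def has_moving_target(prev_frame, cur_frame,
                      pixel_delta=25, min_changed_pixels=2):
    """Inter-frame difference: count pixels whose intensity changed a lot.

    A frame is assumed to contain a moving target when at least
    `min_changed_pixels` pixels differ by more than `pixel_delta`.
    """
    changed = sum(
        1
        for prev_row, cur_row in zip(prev_frame, cur_frame)
        for p, c in zip(prev_row, cur_row)
        if abs(c - p) > pixel_delta
    )
    return changed >= min_changed_pixels


# Hypothetical 2x3 frames: two pixels brighten between frames.
frame_a = [[10, 10, 10], [10, 10, 10]]
frame_b = [[10, 200, 200], [10, 10, 10]]
```

Under steps S402 and S403, a `True` result would mark the current frame as a target frame and a `False` result as a non-target frame.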

In the above step S210, in yet another alternative embodiment, the type of the current frame may be determined by a background detection algorithm, and specifically, referring to fig. 5, fig. 5 shows a flowchart of determining the type of the current frame according to another embodiment of the present disclosure, which includes steps S501 to S504:

in step S501, background detection is performed on the current frame to obtain a background region and a foreground region in the current frame.

In this embodiment, background detection may be performed on the current frame to obtain a background region and a foreground region in the current frame. Wherein, the foreground region is the region where the subject object (e.g. person) in the picture is located. Background regions, i.e. regions other than foreground regions, for example: external environment, etc., which play a role in setting off the foreground, coordinating the hue, etc., and are generally located behind the foreground.

In step S502, it is detected whether there is relative motion between the foreground region and the background region.

In this embodiment, the relative positions of the foreground region and the background region in the current frame may be detected, and then the relative positions of the foreground region and the background region in the previous frame or the subsequent frame of the current frame may be detected, and the relative positions may be compared to determine whether there is relative motion between the foreground region and the background region in the current frame.

In step S503, in response to the relative motion between the foreground region and the background region, the current frame is determined as the target frame.

In step S504, in response to the foreground region and the background region not having relative motion, the current frame is determined to be a non-target frame.
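One possible reading of steps S502 to S504 is sketched below. It assumes that each region is summarized by the centroid of its pixel coordinates and that "relative position" means the foreground centroid minus the background centroid; both are assumptions made for this example, as is the `tolerance` parameter.

```python
import math


def centroid(points):
    """Mean (x, y) of a non-empty list of pixel coordinates."""
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    return (sum(xs) / len(xs), sum(ys) / len(ys))


def relative_position(fg_points, bg_points):
    """Foreground centroid expressed relative to the background centroid."""
    fx, fy = centroid(fg_points)
    bx, by = centroid(bg_points)
    return (fx - bx, fy - by)


def has_relative_motion(prev_fg, prev_bg, cur_fg, cur_bg, tolerance=1.0):
    """Compare the relative positions in two frames (S502)."""
    px, py = relative_position(prev_fg, prev_bg)
    cx, cy = relative_position(cur_fg, cur_bg)
    return math.hypot(cx - px, cy - py) > tolerance


# Hypothetical regions: a static background and a small foreground object.
background = [(0, 0), (100, 0), (0, 100), (100, 100)]
fg_before = [(10, 10), (12, 10)]
fg_after = [(20, 10), (22, 10)]
```

With this definition, a uniform shift of both regions (for example a camera pan) is not counted as relative motion, whereas the foreground moving against a fixed background is.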

With regard to the above step S220, the case in which the number of target frames in the current encoding period is not greater than the first preset number of frames is handled as shown in fig. 6. Fig. 6 shows a sub-flowchart of an encoding method according to an embodiment of the present disclosure, including steps S601 to S602:

in step S601, in response to that the number of target frames in the current coding period is not greater than a first preset frame number and the number of coded frames in the current coding period is greater than a third preset frame number, the current frame is coded according to the first code rate, and the current coding period is ended.

In this embodiment, when it is detected that the number of target frames in the current coding period is not greater than (less than or equal to) the first preset number of frames and the number of coded frames in the current coding period is greater than the third preset number of frames, the current frame may be coded according to the first code rate, so that the coding process may be forced according to a higher code rate under the condition of a lower complexity of the video picture, so as to avoid the technical problems of blurred video picture and lower quality caused by continuous low-code-rate coding, and improve the video quality.

The third preset frame number is a preset number threshold of coded frames, and the third preset frame number is greater than a second preset frame number. By setting the judgment condition of the third preset frame number, the interval of two frames of images coded at the first code rate can be limited within a controllable range, the technical problems of high video image loss and poor image quality caused by continuous low-code-rate coding are avoided, and the video quality is improved.

In step S602, in response to that the number of target frames in the current coding period is not greater than the first preset number of frames and the number of coded frames in the current coding period is not greater than the third preset number of frames, the current frame is coded according to the second code rate.

In this embodiment, when it is detected that the number of target frames in the current coding period is not greater than (less than or equal to) the first preset number of frames and the number of coded frames in the current coding period is not greater than (less than or equal to) the third preset number of frames, the current frame may be coded according to the second code rate, so that the coding may be performed at the second code rate with a smaller value under the condition of a lower complexity of the video picture, thereby ensuring that the code rate in each coding period is relatively balanced and not affecting the overall volume of the video.
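Taken together, steps S220, S230, S601 and S602 form a four-way decision. The following sketch shows that decision as a single function; the function name, the string labels and the returned "end of period" flag are conventions assumed for this example only.

```python
def select_code_rate(target_frames, coded_frames,
                     first_preset, second_preset, third_preset):
    """Return (rate_label, end_period) for the current frame.

    A complex period (target_frames > first_preset) compares the number of
    coded frames with the second preset frame number; otherwise the third
    preset frame number is used as the threshold.
    """
    threshold = second_preset if target_frames > first_preset else third_preset
    if coded_frames > threshold:
        return ("first", True)    # S220 / S601: high rate, period ends
    return ("second", False)      # S230 / S602: low rate, period continues
```

Using the example thresholds from the text (first = 5, second = 2, third = 6), a period with 6 target frames and 3 coded frames selects the first code rate, while a period with 3 target frames needs more than 6 coded frames before the first code rate is used.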

The second code rate may be determined by multiplying the reference code rate by a preset reduction ratio (a number greater than 0 and less than 1). For example, when the reference code rate is 1000 and the preset reduction ratio is 0.8, the second code rate may be determined as 1000 × 0.8 = 800. After the second code rate is determined, considering that the second code rate is lower than the reference code rate, the QP (quantization parameter) and λ of the current frame may be adjusted so that the encoding code rate of the current frame reaches the second code rate. Here λ is the tangent slope of the rate-distortion optimization curve: in a rate-distortion optimization algorithm, a rate-distortion curve is obtained by Lagrangian optimization, and the slope of the straight line tangent to that curve at each code rate point is the λ value, which is used to weigh the code rate against the loss during encoding.

Specifically, different QPs correspond to different quantization step sizes in coding and λ values in rate distortion optimization, and the larger the QP value is, the larger the quantization step size is, the larger λ is, the smaller the coded code rate is, that is, QP has an inverse relationship with the code rate, and λ has an inverse relationship with the code rate. Therefore, the QP and λ can be properly increased to reduce the coding rate of the current frame, so that the coding rate of the current frame reaches the second code rate.
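The computation of the second code rate and the direction in which QP and λ should move can be sketched as follows. The function names are hypothetical, and the direction helper only encodes the inverse relationship stated above; it does not model how a real encoder maps QP to an actual code rate.

```python
def second_code_rate(reference_rate, reduction_ratio):
    """Reference code rate scaled by a ratio strictly between 0 and 1."""
    if not 0 < reduction_ratio < 1:
        raise ValueError("reduction_ratio must be greater than 0 and less than 1")
    return reference_rate * reduction_ratio


def qp_adjustment_direction(measured_rate, target_rate):
    """Return +1 to raise QP and lambda, -1 to lower them, 0 to keep them.

    Because QP and lambda are inversely related to the code rate, they are
    raised when the measured rate is above the target and lowered when it
    is below.
    """
    if measured_rate > target_rate:
        return 1
    if measured_rate < target_rate:
        return -1
    return 0
```

For the example above, `second_code_rate(1000, 0.8)` yields 800, and a frame currently coded at the reference rate of 1000 would need QP and λ increased to reach it.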

The first code rate may be determined by the following manner, referring to fig. 7, fig. 7 shows a flowchart for determining the first code rate according to an embodiment of the present disclosure, including steps S701 to S703:

in step S701, in the current coding cycle, the first N coded frames of the current frame are acquired.

In this embodiment, N is a positive integer greater than the second preset frame number and less than the third preset frame number; the following description takes N = 3 in the current encoding period as an example. As can be seen from the above explanation of step S220, in the present disclosure, each time the first code rate is used for encoding, the current encoding period is ended and the next encoding period is entered. The first N encoded frames of the current frame are therefore all image frames encoded at the second code rate, that is, the actual code rates of the first N encoded frames of the current frame are all the second code rate.

In step S702, a code rate difference between the actual code rate and the reference code rate of each encoded frame is obtained.

In this embodiment, as can be seen from the above explanation, the actual code rate of each encoded frame is 800 and the reference code rate is 1000, so the code rate difference between the reference code rate and the actual code rate of each encoded frame is 1000 - 800 = 200.

In step S703, a first code rate is determined according to the N code rate differences and the reference code rate.

In this embodiment, a first accumulated value of the N code rate differences may be obtained, i.e., 200 + 200 + 200 = 600; a product of the first accumulated value and the reference code rate is obtained, i.e., 1000 × 600 = 600000; a second accumulated value of the product and the reference code rate is obtained, i.e., 600000 + 1000 = 601000; and the second accumulated value is determined to be the first code rate.
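The three operations of step S703 can be written out as the following sketch, which follows the wording of this embodiment literally (sum of differences, product with the reference code rate, then addition of the reference code rate). The function and variable names are assumptions for illustration.

```python
def first_code_rate(actual_rates, reference_rate):
    """Steps S701 to S703, as worded in this embodiment.

    `actual_rates` holds the actual code rates of the N previously coded
    frames in the current coding period.
    """
    # S702: difference between the reference rate and each actual rate.
    differences = [reference_rate - rate for rate in actual_rates]
    # S703: first accumulated value, product, second accumulated value.
    first_accumulated = sum(differences)
    product = reference_rate * first_accumulated
    return product + reference_rate
```

With N = 3 frames each coded at 800 and a reference code rate of 1000, the intermediate values are 600 and 600000.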

After the first code rate is determined, since the first code rate is higher than the reference code rate, and as can be seen from the related explanation in the above steps, QP is in inverse proportion to the code rate, and λ is in inverse proportion to the code rate, QP and λ can be appropriately reduced so that the coding rate of the current frame reaches the first code rate.

Therefore, the code rate saved by the image frame coded according to the second code rate in one coding period can be used for the image frame coded according to the first code rate, so that the overall code rate of the video is ensured to be balanced, the coding performance is improved, the technical problems of overhigh overall code rate and overlarge video volume of the video are avoided, the transmission bandwidth required by the video is saved, and the quality of the coded video is improved on the premise of not influencing the video volume.

Fig. 8 shows an overall flowchart of an encoding method according to an embodiment of the present disclosure, including steps S801-S813:

in step S801, start;

in step S802, initialization (set the number of target frames detected in the current coding period to zero, set the number of coded frames in the current coding period to zero, set the saved code rate of the coded frames in the current coding period to zero, set the second code rate as the current reference code rate multiplied by the preset reduction ratio);

in step S803, it is detected whether the current frame is a target frame (when the area of the saliency region in the current frame is greater than a preset area threshold, and/or the current frame includes a moving target, and/or when there is relative motion between the foreground region and the background region of the current frame, the current frame is determined to be the target frame);

if yes, go to step S804 to add one to the number of target frames;

if not, directly entering step S805, and detecting whether the number of the target frames in the current coding period is greater than a first preset frame number (by setting a judgment condition of the first preset frame number, the complexity of the video picture in the current coding period can be detected);

if the number of target frames is greater than the first preset number of frames, the method proceeds to step S806, and determines whether the number of coded frames is greater than a second preset number of frames (by setting a determination condition of the second preset number of frames, two frames of images coded at the first code rate can be separated);

if the number of coded frames is greater than the second preset number of frames, the method proceeds to step S807, and encodes the current frame according to the first code rate (higher than the reference code rate determined according to the current network performance);

in step S808, it is determined whether the current frame group is completely traversed;

if the traversal is completed, the step S809 is entered, and the process is ended; otherwise, the process proceeds to step S802.

After the step S805, if the number of target frames is not greater than the first preset frame number, step S810 is performed to determine whether the number of coded frames is greater than a third preset frame number (by setting a determination condition of the third preset frame number, the interval between two frames of images coded at the first code rate can be limited within a controllable range); if the number of coded frames is greater than the third preset frame number, step S807 is executed; if the number of coded frames is not greater than the third preset frame number, the method proceeds to step S811, and the current frame is coded according to a second code rate (lower than the reference code rate determined according to the current network performance);

in step S812, it is determined whether the current frame group is completely traversed; if not, the process proceeds to step S813, where the number of non-target frames is increased by one;

if yes, the process proceeds to step S809 and ends.
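The overall flow of fig. 8 can be sketched as a loop over frames that have already been classified as target or non-target. Two points are assumptions of this sketch rather than statements of the disclosure: the coded-frame counter is incremented after every frame coded at the second code rate, and both counters are reset (re-initialization as in S802) after a frame coded at the first code rate.

```python
from dataclasses import dataclass


@dataclass
class PeriodState:
    target_frames: int = 0   # target frames seen in the current period
    coded_frames: int = 0    # frames already coded in the current period


def plan_rates(target_flags, first_preset, second_preset, third_preset):
    """Return "first" or "second" for each frame, in order."""
    state = PeriodState()
    plan = []
    for frame_is_target in target_flags:
        if frame_is_target:                          # S803 / S804
            state.target_frames += 1
        if state.target_frames > first_preset:       # S805
            use_first = state.coded_frames > second_preset   # S806
        else:
            use_first = state.coded_frames > third_preset    # S810
        if use_first:
            plan.append("first")                     # S807
            state = PeriodState()                    # period ends, S802
        else:
            plan.append("second")                    # S811
            state.coded_frames += 1                  # assumed counter update
    return plan
```

With the example thresholds (5, 2, 6), a run of twelve target frames yields a first-code-rate frame at every sixth position, while a run of non-target frames yields one only after seven frames coded at the second code rate.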

Fig. 9 shows an overall flowchart of another encoding method according to an embodiment of the present disclosure, with reference to fig. 9:

after the frame type detection is carried out on the input video stream, the number of target frames in the current coding period can be updated;

and in response to the fact that the number of the target frames in the current coding period is larger than a first preset frame number and the number of the coded frames in the current coding period is not larger than a second preset frame number, coding the current frame according to a second code rate, and outputting a video code stream.

Fig. 10 shows a schematic diagram of an encoding method according to an embodiment of the present disclosure. Referring to fig. 10, a quality frame in the diagram is an image frame encoded according to the first code rate, and a non-quality frame is an image frame encoded according to the second code rate. Compared with the constant-code-rate CBR encoding method shown in fig. 1, the present disclosure can achieve balanced allocation of code rates and alleviate the problems of blurred encoded video and persistence of video errors, so that the quality of the encoded video is improved.

Exemplary devices

Having introduced the encoding method of the exemplary embodiment of the present disclosure, the encoding apparatus of the exemplary embodiment of the present disclosure is explained next with reference to fig. 11.

Fig. 11 shows a schematic diagram of an encoding apparatus 1100 according to an embodiment of the present disclosure, the encoding apparatus 1100 including:

a frame type detection module 1110, configured to update the number of target frames in the current coding period according to the type of the current frame; the types of the current frame comprise a target frame and a non-target frame;

a first encoding module 1120, configured to, in response to that the number of target frames in the current encoding period is greater than a first preset number of frames and the number of encoded frames in the current encoding period is greater than a second preset number of frames, perform encoding processing on the current frame according to a first code rate, and end the current encoding period;

a second encoding module 1130, configured to perform encoding processing on the current frame according to a second code rate in response to that the number of target frames in the current encoding period is greater than the first preset number of frames and the number of encoded frames in the current encoding period is not greater than the second preset number of frames; wherein the first code rate is higher than a reference code rate determined according to the current network performance, and the second code rate is lower than the reference code rate determined according to the current network performance.

In an alternative embodiment, the first encoding module 1120 is configured to:

and the third preset frame number is greater than the second preset frame number.

In an alternative embodiment, the second encoding module 1130 is configured to: