Encoding and decoding methods and devices, electronic equipment and storage medium

Document No.: 366011 Publication date: 2021-12-07

Reading note: This technology, "Encoding and decoding methods and devices, electronic equipment and storage medium", was created by 要瑞宵张樱凡 on 2021-09-24. Main content: The invention discloses an encoding method, a decoding method, an apparatus, an electronic device and a storage medium. The method comprises: when the resolution is switched, selecting a reliable frame image in the buffer DPB as a first reference frame image, wherein the reliable frame image is an image that can be successfully decoded; scaling the first reference frame image according to the resolution of the frame image to be encoded to obtain a second reference frame image; and encoding the frame image to be encoded according to coding parameters, taking the second reference frame image as a reference. In the embodiment of the invention, the frame image to be encoded is encoded by inter-frame prediction, which improves coding efficiency, avoids the bitrate spike, and improves the fluency of video playing. Moreover, because a reliable frame image is taken as the reference, the opposite end is guaranteed to be able to decode the encoded image successfully.

1. A method of encoding, the method comprising:

selecting a reliable frame image in a buffer DPB as a first reference frame image when the resolution is switched; wherein the reliable frame image is a successfully decodable image;

scaling the first reference frame image according to the resolution of the frame image to be coded to obtain a second reference frame image;

and coding the frame image to be coded according to coding parameters by taking the second reference frame image as a reference.

2. The method of claim 1, wherein the selecting the reliable frame picture in the buffer DPB as the first reference frame picture comprises:

and selecting the latest reliable frame picture in the buffer DPB as a first reference frame picture, wherein the latest reliable frame picture is the picture which is nearest to the frame picture to be coded and can be successfully decoded.

3. The method according to claim 1, wherein before encoding the frame picture to be encoded according to the encoding parameters with the second reference frame picture as a reference, the method further comprises:

and respectively counting the brightness information of the second reference frame image and the frame image to be coded, determining the similarity of the second reference frame image and the frame image to be coded according to the counting result, and if the similarity is greater than a preset similarity threshold, coding the frame image to be coded according to coding parameters by taking the second reference frame image as reference.

4. The method of claim 3, wherein if the similarity is not greater than a predetermined similarity threshold, the method further comprises:

and coding the frame image to be coded as a key frame.

5. The method of claim 1, wherein the selecting the reliable frame picture in the buffer DPB as the first reference frame picture comprises:

judging whether a reliable frame image exists in a buffer DPB at the current coding time, if so, selecting the reliable frame image in the buffer DPB at the current coding time as a first reference frame image, if not, reserving the reliable frame image in the buffer DPB at the previous coding time in the buffer DPB at the current coding time, and selecting the reliable frame image in the buffer DPB at the current coding time as the first reference frame image.

6. The method according to claim 1, wherein before encoding the frame picture to be encoded according to the encoding parameters with the second reference frame picture as a reference, the method further comprises:

and judging whether a generalized B frame coding mode is started or not according to the coding processing capacity, and if not, coding the frame image to be coded according to the coding parameters by taking the second reference frame image as a reference.

7. The method of claim 6, wherein if the generalized B frame coding mode is determined to be turned on, the method further comprises:

and coding the frame image to be coded by taking the second reference frame image and the adjacent frame image of the frame image to be coded as references.

8. A decoding method based on the coding method of any one of claims 1 to 7, characterized in that the method comprises:

when the image after the coding processing is completely received, obtaining coding parameters;

selecting a reliable frame image in a buffer DPB as a first reference frame image; wherein the reliable frame image is a successfully decodable image;

scaling the first reference frame image according to the resolution of the image subjected to encoding processing to obtain a second reference frame image;

and decoding the image after the coding processing according to the coding parameters by taking the second reference frame image as a reference.

9. An encoding apparatus, characterized in that the apparatus comprises:

the first determining unit is used for selecting a reliable frame image in the buffer DPB as a first reference frame image when the resolution is switched; wherein the reliable frame image is a successfully decodable image;

the first scaling processing unit is used for scaling the first reference frame image according to the resolution of the frame image to be coded to obtain a second reference frame image;

and the coding unit is used for coding the frame image to be coded according to coding parameters by taking the second reference frame image as a reference.

10. An apparatus for decoding, the apparatus comprising:

the acquisition module is used for acquiring coding parameters when the coded image is completely received;

the second determining unit is used for selecting the reliable frame image in the buffer DPB as a first reference frame image; wherein the reliable frame image is a successfully decodable image;

the second scaling processing unit is used for scaling the first reference frame image according to the resolution of the image subjected to the encoding processing to obtain a second reference frame image;

and the decoding unit is used for decoding the image after the coding processing according to the coding parameters by taking the second reference frame image as a reference.

11. An electronic device, characterized by comprising a processor, a communication interface, a memory and a communication bus, wherein the processor, the communication interface and the memory communicate with one another through the communication bus;

a memory for storing a computer program;

a processor for implementing the steps of the encoding method of any one of claims 1 to 7 or the decoding method of claim 8 when executing a program stored in a memory.

12. A computer-readable storage medium, characterized in that a computer program is stored in the computer-readable storage medium, which computer program, when being executed by a processor, carries out the encoding method steps of any one of claims 1 to 7 or carries out the decoding method steps of claim 8.

Technical Field

The present invention relates to the field of encoding and decoding technologies, and in particular, to an encoding method, a decoding method, an encoding device, a decoding device, an electronic device, and a storage medium.

Background

The basic frame reference structure under current video coding standards is an IDR frame followed by a series of Inter/Intra frames, where an Inter frame is typically denoted as a P or B frame and an Intra frame as an I frame. The Intra frame is an intra-frame prediction frame, and the Inter frame is an inter-frame prediction frame. The conventional IDR frame is a special Intra frame that carries coding parameters and whose decoding is independent of other frames, so a decoder is guaranteed to be able to decode and play once it receives a complete IDR frame.

In general, the compression efficiency of the Intra frame is lower than that of the Inter frame, so when video quality is kept stable a frame-level bitrate spike forms at each Intra frame, as shown in fig. 1, where the height of a column reflects the size of the frame. Frequent Intra frames in this reference structure are unobjectionable in video storage or non-real-time video applications, but they cause problems in real-time communication (RTC) applications, especially when network conditions are poor: the probability of losing the Intra frame increases, the transmission delay increases, the fluency of video playing decreases, and the user experience is ultimately affected. For this reason, in RTC applications, ordinary Intra frames (I frames) can be kept out of the bitstream by changing the configuration (e.g., not enabling scene-switch detection).

However, when the resolution of video coding changes, a conventional IDR frame (i.e., a special Intra frame) is inevitably inserted, which causes a bitrate spike and reduces the fluency of video playing.

Disclosure of Invention

The embodiment of the invention provides an encoding method, a decoding method, an encoding device, an electronic device and a storage medium, which are used for solving the problem that in the prior art, when the resolution changes, the fluency of video playing is reduced.

The embodiment of the invention provides an encoding method, which comprises the following steps:

selecting a reliable frame image in a buffer DPB as a first reference frame image when the resolution is switched; wherein the reliable frame image is a successfully decodable image;

scaling the first reference frame image according to the resolution of the frame image to be coded to obtain a second reference frame image;

and coding the frame image to be coded according to coding parameters by taking the second reference frame image as a reference.

In another aspect, an embodiment of the present invention provides a decoding method, where the method includes:

when the image after the coding processing is completely received, obtaining coding parameters;

selecting a reliable frame image in a buffer DPB as a first reference frame image; wherein the reliable frame image is a successfully decodable image;

scaling the first reference frame image according to the resolution of the image subjected to encoding processing to obtain a second reference frame image;

and decoding the image after the coding processing according to the coding parameters by taking the second reference frame image as a reference.

In another aspect, an embodiment of the present invention provides an encoding apparatus, where the apparatus includes:

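the first determining unit is used for selecting a reliable frame image in the buffer DPB as a first reference frame image when the resolution is switched; wherein the reliable frame image is a successfully decodable image;

the first scaling processing unit is used for scaling the first reference frame image according to the resolution of the frame image to be coded to obtain a second reference frame image;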

and the coding unit is used for coding the frame image to be coded according to coding parameters by taking the second reference frame image as a reference.

In another aspect, an embodiment of the present invention provides a decoding apparatus, where the apparatus includes:

the acquisition module is used for acquiring coding parameters when the coded image is completely received;

the second determining unit is used for selecting the reliable frame image in the buffer DPB as a first reference frame image; wherein the reliable frame image is a successfully decodable image;

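the second scaling processing unit is used for scaling the first reference frame image according to the resolution of the image subjected to the encoding processing to obtain a second reference frame image;

and the decoding unit is used for decoding the image after the coding processing according to the coding parameters by taking the second reference frame image as a reference.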

In another aspect, an embodiment of the present invention provides an electronic device, including a processor, a communication interface, a memory, and a communication bus, where the processor, the communication interface, and the memory complete mutual communication through the communication bus;

a memory for storing a computer program;

a processor for implementing the steps of the encoding method or the decoding method of any one of the above when executing the program stored in the memory.

In yet another aspect, an embodiment of the present invention provides a computer-readable storage medium, in which a computer program is stored, and the computer program, when executed by a processor, implements the encoding method steps or the decoding method steps of any one of the above.

The embodiment of the invention provides an encoding method, an encoding device, electronic equipment and a storage medium, wherein the method comprises the following steps: selecting a reliable frame image in a buffer DPB as a first reference frame image when the resolution is switched; wherein the reliable frame image is a successfully decodable image; scaling the first reference frame image according to the resolution of the frame image to be coded to obtain a second reference frame image; and coding the frame image to be coded according to coding parameters by taking the second reference frame image as a reference.

In the embodiment of the invention, when the resolution is switched, the reliable frame image in the buffer DPB is selected as the first reference frame image, and the first reference frame image is scaled according to the resolution of the frame image to be coded to obtain the second reference frame image. The second reference frame image has the same resolution as the frame image to be encoded, so the frame image to be encoded can be encoded with the second reference frame image as a reference. Compared with the related-art scheme, in which an IDR frame is obtained by intra-frame prediction coding at a resolution switch, the embodiment of the invention encodes the frame image to be coded by inter-frame prediction, which improves coding efficiency, avoids the bitrate spike, and improves the fluency of video playing. Moreover, because the reliable frame image is taken as the reference, the opposite end is guaranteed to be able to decode the encoded image successfully.

Drawings

In order to illustrate the technical solutions in the embodiments of the present invention more clearly, the drawings used in the description of the embodiments are briefly introduced below. Obviously, the drawings described below show only some embodiments of the present invention, and those skilled in the art can obtain other drawings from them without creative effort.

FIG. 1 is a schematic diagram of the spike effect of the code rate of an I frame in the background art;

fig. 2 is a schematic diagram of an encoding process provided in embodiment 1 of the present invention;

FIG. 3 is a block diagram provided in example 1 of the present invention;

FIG. 4 is a schematic diagram of a DPB provided in example 1 of the present invention;

fig. 5 is a schematic diagram of a dependency relationship between a conventional IDR frame and a New-IDR frame provided in embodiment 1 of the present invention;

fig. 6 is a schematic diagram illustrating a variation of reliable frames in a conventional DPB provided in embodiment 3 of the present invention;

fig. 7 is a schematic diagram illustrating a variation of reliable frames in a DPB provided in embodiment 3 of the present invention;

fig. 8 is a schematic diagram of a scheme in which reference frame numbers in an RTC scene are 1 and 2 according to embodiment 4 of the present invention;

fig. 9 is a schematic diagram of a complexity control process of whether to turn on a generalized B frame according to embodiment 4 of the present invention;

FIG. 10 is a schematic diagram of a decoding process provided in embodiment 5 of the present invention;

fig. 11 is a schematic structural diagram of an encoding apparatus according to embodiment 6 of the present invention;

fig. 12 is a schematic structural diagram of a decoding apparatus according to embodiment 7 of the present invention;

fig. 13 is a schematic structural diagram of an electronic device according to embodiment 8 of the present invention.

Detailed Description

The present invention will be described in further detail with reference to the attached drawings, and it should be understood that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.

Example 1:

fig. 2 is a schematic diagram of an encoding process provided in an embodiment of the present invention, which includes the following steps:

S101: selecting a reliable frame image in a buffer DPB as a first reference frame image when the resolution is switched; wherein the reliable frame picture is a successfully decodable picture.

S102: and carrying out scaling processing on the first reference frame image according to the resolution of the frame image to be coded to obtain a second reference frame image.

S103: and coding the frame image to be coded according to coding parameters by taking the second reference frame image as a reference.

The encoding method provided by the embodiment of the invention is applied to electronic equipment, and the electronic equipment can be a PC (personal computer), a tablet computer, a smart phone and the like. The terms involved in the embodiments of the present invention are explained as follows:

RTC: Real-Time Communication.

IDR frame: Instantaneous Decoding Refresh frame, i.e., a frame at which the decoder is refreshed immediately.

DPB: Decoded Picture Buffer, referred to herein as the buffer.

GPB: Generalized P and B picture, i.e., generalized B frame.

Fig. 3 is a framework diagram provided in an embodiment of the present invention. As shown in fig. 3, a terminal A and a terminal B perform a video call, and the two terminals are peers. Each terminal includes a coding module, a decoding module, and a feedback information module. The coding module encodes the captured video, and the encoded bitstream is transmitted to the opposite end through a network. The decoding module of the opposite end decodes and displays the received video data once it meets the decoding condition. The feedback information module reconstructs, at the local end, a picture of the decoding state of the opposite end from the information fed back by the opposite end, for reference by the coding module of the local end.

There is a DPB in the encoding module that holds reconstructed frames of some encoded frames for reference by subsequent frames, as shown in fig. 4, where n0 is the most recently encoded and reconstructed frame placed in the DPB and is closest to the current frame to be encoded; in actual use, the size of the DPB varies depending on the RTC scheme. In general, reliable frames and unreliable frames both exist in the DPB. A reliable frame is a frame that, according to the feedback information, can be successfully decoded at the decoding end; any other frame is an unreliable frame. As shown in fig. 4, reconstructed frames n4 and n2 are reliable frames, of which n2 is the latest reliable frame, and the other reconstructed frames in the DPB are unreliable frames. Which reconstructed frames are reliable frames, which is the latest reliable frame, and which are unreliable frames is recorded and updated in the feedback information module.

At a resolution switch, the conventional IDR frame is an Intra frame carrying coding parameters, which can be decoded independently without depending on any previous frame, as shown in fig. 5 (a). Unlike the conventional IDR frame, the New-IDR frame provided by the embodiment of the present invention may refer to a previous frame. To ensure that the New-IDR frame can be decoded once the opposite end has received it successfully, the frame that the New-IDR frame refers to must be a reliable frame, as shown in fig. 5 (b). From the perspective of compression efficiency, the New-IDR frame refers only to the reliable frame recorded in the feedback information module. Compared with the related-art scheme, in which the IDR frame is obtained by intra-frame predictive coding at a resolution switch, the embodiment of the present invention encodes the frame image to be coded by inter-frame prediction, which improves coding efficiency, avoids the bitrate spike, and improves the fluency of video playing.

In order to encode with reference to the reliable frame recorded in the feedback information module during resolution switching, the reliable frame image in the buffer DPB is first selected as a first reference frame image, and then the first reference frame image is scaled according to the resolution of the frame image to be encoded, so as to obtain a second reference frame image. The resolution of the frame image to be encoded is consistent with the resolution of the second reference frame image.

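It should be noted that, if there are multiple reliable frame pictures in the buffer DPB, any of them may be selected as the first reference frame picture. To make the encoding more accurate, the selecting the reliable frame picture in the buffer DPB as the first reference frame picture preferably includes:

selecting the latest reliable frame picture in the buffer DPB as the first reference frame picture, wherein the latest reliable frame picture is the picture which is nearest to the frame picture to be coded and can be successfully decoded.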

Because the time interval between the latest reliable frame image and the frame image to be coded is small and the similarity is high, the latest reliable frame image in the buffer DPB is selected as the first reference frame image, so that the subsequent frame image to be coded is coded more accurately.
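The selection and scaling steps described above can be sketched as follows. This is a minimal illustration, not the implementation of the embodiment: the `Frame` type, the function names, and the nearest-neighbour resampling filter are all assumptions introduced here, since the text does not specify a data layout or a scaling filter.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Frame:
    index: int                 # position in coding order (larger = more recent); hypothetical field
    pixels: List[List[int]]    # luma samples as rows of equal length
    reliable: bool = False     # True once feedback confirms the peer can decode it


def latest_reliable(dpb: List[Frame]) -> Optional[Frame]:
    """Return the reliable frame closest to the frame about to be encoded."""
    candidates = [f for f in dpb if f.reliable]
    return max(candidates, key=lambda f: f.index) if candidates else None


def scale_nearest(pixels: List[List[int]], width: int, height: int) -> List[List[int]]:
    """Resample to width x height with nearest-neighbour (placeholder filter)."""
    src_h, src_w = len(pixels), len(pixels[0])
    return [
        [pixels[y * src_h // height][x * src_w // width] for x in range(width)]
        for y in range(height)
    ]


def second_reference(dpb: List[Frame], width: int, height: int) -> Optional[Frame]:
    """First reference = latest reliable frame; second reference = its scaled copy."""
    first = latest_reliable(dpb)
    if first is None:
        return None
    return Frame(first.index, scale_nearest(first.pixels, width, height), True)
```

A real encoder would use a proper resampling filter and operate on all colour planes; the sketch only shows that the second reference frame ends up with the resolution of the frame to be encoded.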

The coding parameters, such as the Sequence Parameter Set (SPS) and the Picture Parameter Set (PPS), are generated in this step in exactly the same way as when encoding a conventional IDR frame, which is not repeated in the embodiment of the invention.

Example 2:

The scheme of obtaining a New-IDR frame image by coding the frame image to be coded with reference to the second reference frame image is called the New-IDR scheme. When the New-IDR scheme is adopted, the compression efficiency of Inter prediction is generally higher than that of the Intra prediction of a conventional IDR frame, which is equivalent to adding an Inter prediction candidate on top of the conventional IDR coding scheme. In the RTC application, the motion intensity of a call scene is generally lower than that of natural video, which gives the New-IDR scheme room to show its advantages. However, if the content difference between the latest reliable frame and the current frame to be coded is large, then even if the New-IDR scheme is enabled, most blocks in the frame will probably still choose Intra prediction when deciding between Intra and Inter prediction, so the advantages of New-IDR cannot be realized and the amount of computation increases instead. The enabling condition for New-IDR is therefore that no scene change occurs between the latest reliable frame and the frame to be encoded. The scene-switch detection algorithm and threshold differ according to the computing power of the terminal; in the embodiment of the invention, whether the scene has switched is determined as follows.

Before the encoding processing is performed on the frame image to be encoded according to the encoding parameter by using the second reference frame image as a reference, the method further includes:

respectively counting the brightness information of the second reference frame image and the frame image to be coded, determining the similarity of the second reference frame image and the frame image to be coded according to the counting result, and if the similarity is greater than a preset similarity threshold, coding the frame image to be coded according to coding parameters by taking the second reference frame image as reference.

If the similarity is not greater than a preset similarity threshold, the method further comprises:

coding the frame image to be coded as a key frame.

In the embodiment of the invention, the luminance information of the second reference frame image and of the frame image to be coded is counted separately to obtain two luminance histograms; for example, the luminance values are divided into 16 groups, and the number of pixels falling in each group is counted. Each luminance histogram corresponds to one vector: for example, the vector corresponding to the second reference frame image is $D_{ref} = \{r_0, r_1, r_2, \ldots, r_{15}\}$, and the vector corresponding to the frame image to be encoded is $D_{cur} = \{c_0, c_1, c_2, \ldots, c_{15}\}$. Then the similarity between the second reference frame image and the frame image to be encoded is determined according to the statistical result; for example, $S = (D_{ref} \cdot D_{cur}) / (|D_{ref}| \, |D_{cur}|)$ is calculated, where the dot denotes the vector inner product, the product in the denominator is ordinary multiplication, $|\cdot|$ denotes the modulus of a vector, and $S$ denotes the similarity. If the similarity is greater than a preset similarity threshold, it is determined that the scene has not switched; in this case the frame image to be coded is coded according to the coding parameters with the second reference frame image as a reference. If the similarity is not greater than the preset similarity threshold, it is determined that the scene has switched; in this case the frame image to be coded is coded as a key frame. The preset similarity threshold may be 0.8, 0.85, etc.

In the embodiment of the invention, before the frame image to be coded is coded according to the coding parameters with the second reference frame image as a reference, it is first judged whether the current scene has switched. If the scene has not switched, the New-IDR scheme is adopted for coding; if the scene has switched, the frame image to be coded is coded as a key frame. This guarantees the accuracy of coding and improves coding efficiency wherever feasible.
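The histogram comparison described above can be sketched as follows. It is an illustration under stated assumptions: 8-bit luma samples supplied as a flat sequence, 16 equal-width bins, and a default threshold of 0.8 taken from the example values in the text. The function names are hypothetical.

```python
import math
from typing import List, Sequence


def luma_histogram(luma: Sequence[int], bins: int = 16, max_value: int = 256) -> List[int]:
    """Count how many luma samples fall into each of `bins` equal-width groups."""
    hist = [0] * bins
    width = max_value // bins
    for v in luma:
        hist[min(v // width, bins - 1)] += 1
    return hist


def cosine_similarity(a: Sequence[int], b: Sequence[int]) -> float:
    """S = (a . b) / (|a| |b|); returns 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def use_new_idr(ref_luma: Sequence[int], cur_luma: Sequence[int], threshold: float = 0.8) -> bool:
    """True: no scene switch, encode with the scaled reference. False: encode as a key frame."""
    s = cosine_similarity(luma_histogram(ref_luma), luma_histogram(cur_luma))
    return s > threshold
```

Because the histograms hold non-negative counts, S lies between 0 and 1, so a threshold such as 0.8 or 0.85 is directly comparable across frame sizes.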

Example 3:

The DPB size at the encoding end is fixed to N, i.e., N reconstructed frames (position index 0, 1, 2, ..., N-1) can be placed. In general, among the N frames there are reliable frames and unreliable frames; but when the network is particularly bad, no reliable frame may be updated, so the frames in the DPB are squeezed out as encoding proceeds, eventually leaving no reliable frame in the DPB, as illustrated in fig. 6. In the schematic diagram, each rectangle represents a frame, the frames inside the DPB box are the encoded reconstructed frames, the black rectangles represent reliable frames, and the frame pointed to by the arrow is the current frame to be encoded. The rows from top to bottom show the change in the DPB as encoding progresses, and it can be seen that there is no reliable frame in the DPB in the last row. At this point the encoder must insert an IDR frame, so that the video frame can be decoded as soon as it is received by the opposite end and the video does not stall. In order to avoid this forced insertion of an IDR frame when the network is poor, in the embodiment of the present invention, the selecting the reliable frame picture in the buffer DPB as the first reference frame picture includes:

judging whether a reliable frame image exists in the buffer DPB at the current coding time; if so, selecting the reliable frame image in the buffer DPB at the current coding time as the first reference frame image; if not, retaining the reliable frame image of the buffer DPB at the previous coding time in the buffer DPB at the current coding time, and selecting the reliable frame image in the buffer DPB at the current coding time as the first reference frame image.

In the embodiment of the present invention, the size of the DPB is still N, except that at least one reliable frame is reserved in the DPB. When the last reliable frame is pushed to the N-1 position, it remains at the N-1 position index of the DPB unless an updated reliable frame appears. As illustrated in fig. 7, where the latest reliable frame is retained at position N-1 in rows 6 and 7, row 8 shows that the outdated reliable frame retained at position N-1 is released when a new reliable frame is updated. This always ensures that reliable frames can be referenced when encoding the current frame, avoiding having to insert an IDR frame when no reliable frame is available.
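The retention rule described above can be sketched as a small buffer class. This is one possible reading rather than the embodiment's actual data structure: the names `Recon` and `RetainingDPB` are hypothetical, index 0 is assumed to hold the newest reconstructed frame, and N is assumed to be at least 2.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Recon:
    frame_id: int
    reliable: bool = False


class RetainingDPB:
    """Fixed-size DPB; index 0 holds the most recently reconstructed frame."""

    def __init__(self, size: int):
        self.size = size            # N, assumed >= 2 in this sketch
        self.frames: List[Recon] = []

    def push(self, frame: Recon) -> None:
        self.frames.insert(0, frame)
        if len(self.frames) <= self.size:
            return
        # Overflow: normally the oldest frame (last position) is released.
        victim = len(self.frames) - 1
        reliable_count = sum(f.reliable for f in self.frames)
        if self.frames[victim].reliable and reliable_count == 1:
            # The oldest frame is the only reliable one: keep it so that it
            # ends up at position N-1, and release the oldest unreliable frame.
            victim -= 1
        del self.frames[victim]

    def mark_reliable(self, frame_id: int) -> None:
        """Record peer feedback that `frame_id` was decoded successfully."""
        for f in self.frames:
            if f.frame_id == frame_id:
                f.reliable = True
```

Once feedback marks a newer frame as reliable, the retained frame is no longer the only reliable one, so the next push releases it in the ordinary way.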

The DPB in the decoder of the opposite end remains unchanged and does not need to be adjusted. In the decoder of the opposite end, frames that do not meet the decoding condition (not completely received, or with a missing reference frame) are not decoded, so no decoded frame from them occupies space in the DPB; once a decodable frame is decoded and put into the DPB, information can be fed back in time so that the encoding end updates its reliable frame promptly, which avoids video stalls.

Example 4:

There are reliable frames and unreliable frames in the DPB. When encoding the current frame, the encoder can refer to the latest reliable frame, as shown in fig. 8 (a); in this case the current frame can certainly be decoded if it is successfully received by the opposite end, and such a frame is called a reliable reference frame. The current frame may also refer to the immediately adjacent unreliable frame in the same temporal layer, as shown in fig. 8 (b); in this case the current frame cannot necessarily be decoded even if it is successfully received by the opposite end, since its reference frame may not have been successfully decoded, and such a frame is called an unreliable reference frame. In the present invention, an unreliable reference frame may, in addition to its single adjacent reference, also refer to the latest reliable frame in the encoding stage, as shown in fig. 8 (c). The number of reference frames in (a) and (b) is 1, and in (c) it is 2. It should be noted that when the current frame is coded as a P frame, even if the number of reference frames is 2, each block can select only one of the two frames as its reference. Increasing the number of reference frames of the unreliable reference frame from 1 to 2 improves compression performance without reducing the decoding success rate, but introduces more computation in the motion estimation stage of encoding.

When generalized B-frame coding is used for the current frame (i.e., coding with two reference frames as shown in fig. 8 (c)), the reference frame lists List1 and List0 are set to be identical, and each block in the current frame is then coded by deciding between unidirectional and bidirectional prediction over the frames in the reference frame lists, so the amount of computation introduced by motion estimation for the generalized B frame is further increased relative to the P frame. In the RTC application, if the terminal computing power is insufficient, increasing the maximum number of reference frames, or further enabling the generalized B frame, carries the risk that encoding cannot run in real time, so it is necessary to decide adaptively, according to the terminal computing power, the maximum number of reference frames and whether the generalized B frame is enabled.

Before the encoding processing is performed on the frame image to be encoded according to the encoding parameter by using the second reference frame image as a reference, the method further includes:

judging whether a generalized B frame coding mode is started according to the coding processing capacity, and if not, coding the frame image to be coded according to the coding parameters by taking the second reference frame image as a reference.

If the generalized B frame coding mode is judged to be started, the method further comprises the following steps:

coding the frame image to be coded by taking the second reference frame image and the adjacent frame image of the frame image to be coded as references.

In the embodiment of the invention, whether to enable the generalized B frame encoding mode is determined according to the encoding processing capability. When the encoding processing capability is sufficient, generalized B frames are enabled, and the frame image to be encoded is encoded with the second reference frame image and the frame image immediately adjacent to the frame image to be encoded as references. When the encoding processing capability is insufficient, generalized B frames are disabled and the frame image to be encoded is encoded with a single frame as a reference; in that case, the frame image to be encoded is encoded according to the encoding parameters with the second reference frame image as the reference. Alternatively, in an actual scenario, the frame image to be encoded may be encoded according to the encoding parameters with the immediately adjacent frame as the reference, as needed.

It should be noted that the scheme of determining, according to the encoding processing capability, whether to enable the generalized B frame encoding mode is applicable to all scenarios. In a resolution switching scenario, if it is determined that the generalized B frame encoding mode is to be enabled, the frame image to be encoded is encoded with the second reference frame image and the frame image immediately adjacent to the frame image to be encoded as references, where the second reference frame image is the scaled image. In a scenario where the resolution is not switched, if it is determined that the generalized B frame encoding mode is to be enabled, the frame image to be encoded is likewise encoded with the second reference frame image and the immediately adjacent frame image as references; because no scaling is needed, the second reference frame image is either the unscaled image or, equivalently, the result of scaling at a ratio of 1:1.

In this embodiment of the present invention, determining whether to enable the generalized B frame encoding mode according to the encoding processing capability includes:

initializing the generalized B frame encoding mode to disabled and the reference frame number to 1;

if the average encoding time of the latest first number of frame images is less than a preset first time threshold, adjusting the reference frame number to 2;

if the current reference frame number is 2 and the average encoding time of the latest second number of frame images is less than a preset second time threshold, enabling the generalized B frame encoding mode;

if the current reference frame number is 2, the generalized B frame encoding mode is enabled, and the average encoding time of the latest third number of frame images is not less than a preset third time threshold, disabling the generalized B frame encoding mode;

if the current reference frame number is 2, the generalized B frame encoding mode is disabled, and the average encoding time of the latest fourth number of frame images is not less than a preset fourth time threshold, adjusting the reference frame number to 1.

Specifically, fig. 9 is a schematic diagram of the complexity control process that decides whether to enable generalized B frames according to an embodiment of the present invention. As shown in fig. 9, when a video call starts, generalized B frames are initialized to disabled (GPB = 0) and the maximum reference frame number is initialized to 1 (ref_num = 1); the expected encoding frame rate is assumed to be F. The scenarios in the flow chart that arise when encoding a frame at the current resolution are described below. Note that the black dot after "no" in fig. 9 means that the current state is kept unchanged.

(1) If the current maximum reference frame number is 1 and the average encoding time of the latest K1 frames is T1 ms, the maximum reference frame number is increased to 2 when T1 < M1 × (1000/F) (in practice this only affects unreliable reference frames).

(2) If the current maximum reference frame number is already 2 and the average encoding time of the latest K1 frames is T3 ms, generalized B frames are enabled when T3 < M2 × (1000/F).

(3) If the maximum reference frame number is 2, generalized B frames are enabled, and the average encoding time of the latest K2 frames is T2 ms, generalized B frames are disabled when T2 ≥ Q × (1000/F).

(4) If the maximum reference frame number is 2, generalized B frames are disabled, and the average encoding time of the latest K2 frames is T2 ms, the maximum reference frame number is restored to 1 when T2 ≥ Q × (1000/F).

One example set of values is K1 = 200, K2 = 5, M1 = 0.5, M2 = 0.3, and Q = 0.8. It should be noted that once the reference frame number has been switched from 2 back to 1, a reference frame number of 2 is not enabled again, and once GPB has been switched from enabled to disabled, GPB is not enabled again.
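The four cases and the one-way fallback rule can be sketched as a small state machine in Python. This is an illustrative sketch only: the class name `ComplexityController`, the `update` method, and the two lock flags are hypothetical, and clearing the timing history after each state change (so that the next decision uses only frames encoded in the new state) is an assumption that the source does not state.

```python
from collections import deque


class ComplexityController:
    """Tracks encode times and adjusts ref_num (1 or 2) and GPB (on or off)."""

    def __init__(self, fps, k1=200, k2=5, m1=0.5, m2=0.3, q=0.8):
        self.budget_ms = 1000.0 / fps      # per-frame time budget, 1000/F
        self.k1, self.k2 = k1, k2
        self.m1, self.m2, self.q = m1, m2, q
        self.ref_num = 1                   # initial maximum reference frame number
        self.gpb = False                   # generalized B frames start disabled
        self.ref2_locked = False           # set once ref_num falls back from 2 to 1
        self.gpb_locked = False            # set once GPB has been switched off
        self.times = deque(maxlen=max(k1, k2))

    def _avg(self, n):
        """Average of the latest n encode times, or None if fewer are recorded."""
        if len(self.times) < n:
            return None
        return sum(list(self.times)[-n:]) / n

    def update(self, encode_ms):
        """Record one frame's encode time and apply at most one state change."""
        self.times.append(encode_ms)
        before = (self.ref_num, self.gpb)
        slow = self._avg(self.k2)          # short window, used for backing off
        fast = self._avg(self.k1)          # long window, used for stepping up
        if self.ref_num == 2 and slow is not None and slow >= self.q * self.budget_ms:
            if self.gpb:                   # case (3): disable GPB for good
                self.gpb, self.gpb_locked = False, True
            else:                          # case (4): fall back to one reference
                self.ref_num, self.ref2_locked = 1, True
        elif fast is not None:
            if (self.ref_num == 1 and not self.ref2_locked
                    and fast < self.m1 * self.budget_ms):
                self.ref_num = 2           # case (1)
            elif (self.ref_num == 2 and not self.gpb and not self.gpb_locked
                    and fast < self.m2 * self.budget_ms):
                self.gpb = True            # case (2)
        if (self.ref_num, self.gpb) != before:
            self.times.clear()             # assumption: restart the measurement
        return self.ref_num, self.gpb
```

An encoder loop would call `update` with the measured encode time of each frame and use the returned pair to configure the next frame. The asymmetric window lengths (K1 much larger than K2) make the controller slow to step up and quick to back off.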

Example 5:

An embodiment of the present invention provides a decoding method based on the encoding method of the above embodiments. As shown in fig. 10, the process includes the following steps:

S201: When the encoded image has been completely received, obtain the encoding parameters.

S202: Select a reliable frame image in the buffer DPB as a first reference frame image, where the reliable frame image is an image that can be successfully decoded.

S203: Scale the first reference frame image according to the resolution of the encoded image to obtain a second reference frame image.

S204: Decode the encoded image according to the encoding parameters, with the second reference frame image as a reference.

In the embodiment of the invention, decoding of a New-IDR frame starts once the New-IDR frame has been completely received, and the encoding parameters (SPS, PPS, and so on) are decoded first. The referenced frame among the reconstructed frames in the DPB is then scaled. This step is required only if the resolution of the referenced frame (the first reference frame image) differs from that of the frame currently to be decoded. Scaling brings the resolution of the referenced frame into line with that of the frame to be decoded; it should be noted that the encoding end and the decoding end must use the same scaling algorithm. The New-IDR frame is then decoded in the same way as a normal inter frame. After successful decoding, the frames preceding the New-IDR frame in the DPB are cleared.
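The reference preparation and DPB clean-up around a New-IDR frame might look as follows in Python. This is a sketch under stated assumptions: the `Frame` type, the function names, and the choice of nearest-neighbour resampling are all hypothetical (the source only requires that both ends use the same scaling algorithm), and the decoder is assumed to pick the latest reliable frame as its reference. Actual inter-frame decoding is outside the sketch.

```python
from dataclasses import dataclass


@dataclass
class Frame:
    index: int             # position in decoding order
    width: int
    height: int
    pixels: list           # `height` rows of `width` luma samples
    reliable: bool = False


def scale_nearest(frame, width, height):
    """Nearest-neighbour resample; encoder and decoder must share this routine."""
    rows = [
        [frame.pixels[y * frame.height // height][x * frame.width // width]
         for x in range(width)]
        for y in range(height)
    ]
    return Frame(frame.index, width, height, rows, frame.reliable)


def prepare_reference(dpb, width, height):
    """Select the latest reliable frame and scale it only if resolutions differ."""
    reliable = [f for f in dpb if f.reliable]
    if not reliable:
        return None        # no usable reference in this sketch
    first_ref = max(reliable, key=lambda f: f.index)
    if (first_ref.width, first_ref.height) == (width, height):
        return first_ref   # same resolution: the scaling step is skipped
    return scale_nearest(first_ref, width, height)


def finish_new_idr(dpb, decoded):
    """After a successful decode, clear every frame preceding the New-IDR frame."""
    dpb[:] = [f for f in dpb if f.index >= decoded.index]
    dpb.append(decoded)
```

Returning the original frame object when the resolutions match mirrors the statement that scaling is needed only on a resolution change.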

The encoding and decoding scheme provided by the embodiment of the invention reduces the number of conventional IDR frames in an RTC scenario by avoiding IDR frames when the resolution is switched and when no reliable frame exists in the DPB. This reduces how often frame-level bit rate spikes occur, which in turn reduces frame loss and delay and improves the subjective video experience. A feasible scheme for increasing the reference frame number in an RTC scenario is also provided, and generalized B frames are further introduced to improve compression performance. In view of the limits of terminal computing power, a complexity control scheme is provided so that terminals with sufficient computing power can achieve a higher video compression rate, further improving the user experience.

Example 6:

Fig. 11 is a schematic structural diagram of an encoding apparatus according to an embodiment of the present invention, where the apparatus includes:

a first determining unit 111, configured to select a reliable frame image in the buffer DPB as a first reference frame image when the resolution is switched, where the reliable frame image is an image that can be successfully decoded;

a first scaling unit 112, configured to scale the first reference frame image according to the resolution of the frame image to be encoded, so as to obtain a second reference frame image;

and an encoding unit 113, configured to encode the frame image to be encoded according to encoding parameters with the second reference frame image as a reference.

The first determining unit 111 is specifically configured to select the latest reliable frame image in the buffer DPB as the first reference frame image, where the latest reliable frame image is the successfully decodable image that is closest to the frame image to be encoded.

The device further comprises:

a third determining unit 114, configured to collect luminance statistics for the second reference frame image and for the frame image to be encoded, determine the similarity between the second reference frame image and the frame image to be encoded from those statistics, and trigger the encoding unit if the similarity is greater than a preset similarity threshold.

The encoding unit 113 is further configured to encode the frame image to be encoded as a key frame if the similarity is not greater than the preset similarity threshold.
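A minimal Python sketch of such a similarity check is given below. The passage here does not say which luminance statistic is used, so the sketch assumes a normalised luminance histogram compared by histogram intersection; the function names, the 16-bin histogram, and the threshold of 0.5 are hypothetical. Pixel lists are assumed to be non-empty and to hold 8-bit luma values.

```python
def luma_histogram(pixels, bins=16):
    """Normalised histogram of 8-bit luma samples given as a flat list."""
    counts = [0] * bins
    for value in pixels:
        counts[value * bins // 256] += 1
    total = len(pixels)
    return [c / total for c in counts]


def luma_similarity(ref_pixels, cur_pixels, bins=16):
    """Histogram intersection in [0, 1]; 1 means identical luma distributions."""
    h_ref = luma_histogram(ref_pixels, bins)
    h_cur = luma_histogram(cur_pixels, bins)
    return sum(min(a, b) for a, b in zip(h_ref, h_cur))


def choose_frame_type(ref_pixels, cur_pixels, threshold=0.5):
    """Inter-code against the scaled reference if similar enough, else key frame."""
    if luma_similarity(ref_pixels, cur_pixels) > threshold:
        return "inter"
    return "key"
```

The check guards against scene changes that coincide with a resolution switch, where inter prediction from the scaled reference would bring little benefit.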

The first determining unit 111 is specifically configured to determine whether a reliable frame image exists in the buffer DPB at the current encoding time; if so, select that reliable frame image as the first reference frame image; if not, retain, in the buffer DPB at the current encoding time, the reliable frame image from the buffer DPB at the previous encoding time, and select the retained reliable frame image as the first reference frame image.
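The retention rule can be sketched in Python as follows. The sketch assumes that the DPB behaves as a sliding window with a fixed capacity of at least 2 and that frames are represented as `(index, reliable)` tuples in decoding order; both the representation and the function names are hypothetical.

```python
def update_dpb(dpb, new_frame, capacity):
    """Return the DPB for the next encoding time.

    The oldest frames are dropped once capacity is exceeded, except that the
    latest reliable frame is carried over if the window would otherwise
    contain no reliable frame.
    """
    candidates = dpb + [new_frame]
    window = candidates[-capacity:]
    if not any(reliable for _, reliable in window):
        carried = [f for f in candidates if f[1]]
        if carried:
            # Keep the most recent reliable frame in place of the oldest entry.
            window = [carried[-1]] + window[1:]
    return window


def select_first_reference(dpb):
    """Latest reliable frame in the DPB, or None if there is none."""
    reliable = [f for f in dpb if f[1]]
    return reliable[-1] if reliable else None
```

With this rule a reliable reference is available whenever one has existed, which is what allows the encoder to avoid falling back to a conventional IDR frame.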

The device further comprises:

a judging unit 115, configured to determine, according to the encoding processing capability, whether to enable the generalized B frame encoding mode and, if not, to trigger the encoding unit, which encodes the frame image to be encoded according to the encoding parameters with the second reference frame image as a reference.

The encoding unit 113 is further configured to, if it is determined that the generalized B frame encoding mode is to be enabled, encode the frame image to be encoded with the second reference frame image and the frame image immediately adjacent to the frame image to be encoded as references.

The judging unit 115 is specifically configured to: initialize the generalized B frame encoding mode to disabled and the reference frame number to 1; if the average encoding time of the latest first number of frame images is less than a preset first time threshold, adjust the reference frame number to 2; if the current reference frame number is 2 and the average encoding time of the latest second number of frame images is less than a preset second time threshold, enable the generalized B frame encoding mode; if the current reference frame number is 2, the generalized B frame encoding mode is enabled, and the average encoding time of the latest third number of frame images is not less than a preset third time threshold, disable the generalized B frame encoding mode; and if the current reference frame number is 2, the generalized B frame encoding mode is disabled, and the average encoding time of the latest fourth number of frame images is not less than a preset fourth time threshold, adjust the reference frame number to 1.

Example 7:

Fig. 12 is a schematic structural diagram of a decoding apparatus according to an embodiment of the present invention, where the apparatus includes:

an obtaining module 121, configured to obtain the encoding parameters when the encoded image has been completely received;

a second determining unit 122, configured to select a reliable frame image in the buffer DPB as a first reference frame image, where the reliable frame image is an image that can be successfully decoded;

a second scaling unit 123, configured to scale the first reference frame image according to the resolution of the encoded image, so as to obtain a second reference frame image;

a decoding unit 124, configured to decode the encoded image according to the encoding parameters with the second reference frame image as a reference.

Example 8:

On the basis of the foregoing embodiments, an embodiment of the present invention further provides an electronic device, as shown in fig. 13, including: a processor 301, a communication interface 302, a memory 303, and a communication bus 304, where the processor 301, the communication interface 302, and the memory 303 communicate with one another through the communication bus 304;

the memory 303 stores a computer program for performing the encoding steps, which, when executed by the processor 301, causes the processor 301 to perform the following steps:

selecting a reliable frame image in the buffer DPB as a first reference frame image when the resolution is switched, where the reliable frame image is an image that can be successfully decoded;

scaling the first reference frame image according to the resolution of the frame image to be encoded to obtain a second reference frame image;

and encoding the frame image to be encoded according to encoding parameters with the second reference frame image as a reference.

Alternatively, the memory 303 stores a computer program for performing the decoding steps, which, when executed by the processor 301, causes the processor 301 to perform the following steps:

obtaining the encoding parameters when the encoded image has been completely received;

selecting a reliable frame image in the buffer DPB as a first reference frame image, where the reliable frame image is an image that can be successfully decoded;

scaling the first reference frame image according to the resolution of the encoded image to obtain a second reference frame image;

and decoding the encoded image according to the encoding parameters with the second reference frame image as a reference.

The electronic device provided by the embodiment of the invention can be used to execute the encoding method or the decoding method provided by any of the above embodiments, and has the corresponding functions and beneficial effects.

Example 9:

On the basis of the foregoing embodiments, an embodiment of the present invention further provides a computer-readable storage medium, in which a computer program executable by an electronic device is stored; when the program runs on the electronic device, the electronic device is caused to execute the following steps:

selecting a reliable frame image in the buffer DPB as a first reference frame image when the resolution is switched, where the reliable frame image is an image that can be successfully decoded;

scaling the first reference frame image according to the resolution of the frame image to be encoded to obtain a second reference frame image;

and encoding the frame image to be encoded according to encoding parameters with the second reference frame image as a reference.

Or to execute the following steps:

obtaining the encoding parameters when the encoded image has been completely received;

selecting a reliable frame image in the buffer DPB as a first reference frame image, where the reliable frame image is an image that can be successfully decoded;

scaling the first reference frame image according to the resolution of the encoded image to obtain a second reference frame image;

and decoding the encoded image according to the encoding parameters with the second reference frame image as a reference.

The computer-readable storage medium provided by the embodiments of the present invention stores a computer program executable by an electronic device; when the program runs on the electronic device, it can be used to execute the encoding method or the decoding method provided by any of the above embodiments, and has the corresponding functions and beneficial effects.

It will be apparent to those skilled in the art that various changes and modifications may be made in the present invention without departing from the spirit and scope of the invention. Thus, if such modifications and variations of the present invention fall within the scope of the claims of the present invention and their equivalents, the present invention is also intended to include such modifications and variations.
