Video processing method and device and computer readable storage medium

Document No.: 1675889. Publication date: 2019-12-31.

Abstract: This technology, "Video processing method and device and computer readable storage medium", was created by Wang Kai (王凯) on 2019-10-21. The embodiment of the invention discloses a video processing method, a video processing device and a computer readable storage medium. The method comprises the following steps: acquiring a first video image, wherein the first video image comprises a key frame and a plurality of prediction frames, and the key frame is positioned between any two adjacent prediction frames; determining the position of the key frame and encoding the first video image. The embodiment of the invention can improve video quality without increasing the bit rate or the coding complexity, thereby improving the visual experience of a user.

1. A video processing method, comprising:

acquiring a first video image, wherein the first video image comprises a key frame and a plurality of prediction frames, and the key frame is positioned between any two adjacent prediction frames;

and determining the position of the key frame and coding the first video image.

2. The method of claim 1, wherein the key frame is located between a first predicted frame and a second predicted frame, wherein the first predicted frame and the second predicted frame are located at an intermediate position of the plurality of predicted frames.

3. The method according to claim 1 or 2, wherein said encoding the first video image comprises:

encoding the key frame;

for the predicted frames positioned before the key frame, referencing backward in sequence, and encoding the predicted frames positioned before the key frame;

and for the predicted frames positioned after the key frame, referencing forward in sequence, and encoding the predicted frames positioned after the key frame.

4. The method of claim 1, further comprising:

preloading a second video image before playing a first video image, wherein the second video image is a part of the first video image and comprises the key frame;

decoding the second video image to obtain the key frame;

setting the key frame as a video cover of the first video image.

5. A video processing apparatus, comprising: the device comprises a video acquisition module, a position determination module and a video coding module;

the video acquisition module is used for acquiring a first video image, wherein the first video image comprises a key frame and a plurality of prediction frames, and the key frame is positioned between any two adjacent prediction frames;

the position determining module is used for determining the position of the key frame;

the video coding module is used for coding the first video image.

6. The apparatus of claim 5, wherein the key frame is located between a first predicted frame and a second predicted frame, wherein the first predicted frame and the second predicted frame are located at an intermediate position of the plurality of predicted frames.

7. The apparatus of claim 5 or 6,

the video encoding module is specifically configured to encode the key frame; for the predicted frames positioned before the key frame, reference backward in sequence and encode the predicted frames positioned before the key frame; and for the predicted frames positioned after the key frame, reference forward in sequence and encode the predicted frames positioned after the key frame.

8. The apparatus of claim 5, further comprising: the system comprises a video preloading module, a video decoding module and a cover setting module;

the video preloading module is used for preloading a second video image before playing a first video image, wherein the second video image is a part of the first video image and comprises the key frame;

the video decoding module is used for decoding the second video image to obtain the key frame;

the cover setting module is used for setting the key frame as a video cover of the first video image.

9. A video processing apparatus, comprising:

one or more processors;

a memory for storing one or more programs;

the one or more programs, when executed by the one or more processors, cause the one or more processors to implement the video processing method of any of claims 1-4.

10. A computer-readable storage medium, on which a computer program is stored which, when being executed by a processor, carries out the video processing method according to any one of claims 1 to 4.

Technical Field

The present invention relates to the field of video technologies, and in particular, to a video processing method and apparatus, and a computer-readable storage medium.

Background

With the continuous development of the internet and terminal devices, various video services enrich the life, work and entertainment of users. Especially, short videos are gradually favored by various large platforms and users due to the characteristics of being suitable for watching in a mobile state and a short leisure state, high-frequency pushing, strong participation and the like.

Due to short time and single scene, a short video often includes only one key frame and a plurality of predicted frames. In the existing video coding method, the first frame is a key frame, a plurality of prediction frames follow the first frame, and the prediction frames are coded by referring to the key frame.

In addition, in the existing short video scheme, a still picture is often generated for the short video as a cover. The cover image is one frame of the short video and is often encoded into a separate JPEG file. The player of the terminal device first downloads this JPEG file and then downloads the video file. The disadvantage is that the amount of data to be transferred increases.

Disclosure of Invention

In order to solve the foregoing technical problems, embodiments of the present invention are expected to provide a video processing method, a video processing apparatus, and a computer-readable storage medium, which can improve video quality without increasing a bit rate and coding complexity, so as to improve visual experience of a user.

The technical scheme of the invention is realized as follows:

in a first aspect, an embodiment of the present invention provides a video processing method, including:

acquiring a first video image, wherein the first video image comprises a key frame and a plurality of prediction frames, and the key frame is positioned between any two adjacent prediction frames;

the location of the key frame is determined and the first video image is encoded.

Optionally, the key frame is located between the first predicted frame and the second predicted frame, wherein the first predicted frame and the second predicted frame are located in the middle of the plurality of predicted frames.

Optionally, the encoding the first video image specifically includes:

encoding the key frame;

for the predicted frames positioned before the key frame, referencing backward in sequence, and encoding the predicted frames positioned before the key frame;

and for the predicted frames positioned after the key frame, referencing forward in sequence, and encoding the predicted frames positioned after the key frame.

Optionally, the method further includes:

preloading a second video image before playing the first video image, wherein the second video image is a part of the first video image and comprises a key frame;

decoding the second video image to obtain a key frame;

the key frame is set as a video cover of the first video image.

In a second aspect, an embodiment of the present invention provides a video processing apparatus, including: the device comprises a video acquisition module, a position determination module and a video coding module;

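the video acquisition module is used for acquiring a first video image, wherein the first video image comprises a key frame and a plurality of prediction frames, and the key frame is positioned between any two adjacent prediction frames;

the position determination module is used for determining the position of the key frame;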

and the video coding module is used for coding the first video image.

Optionally, the video encoding module is specifically configured to encode the key frame; for the predicted frames positioned before the key frame, reference backward in sequence and encode the predicted frames positioned before the key frame; and for the predicted frames positioned after the key frame, reference forward in sequence and encode the predicted frames positioned after the key frame.

Optionally, the apparatus further includes: a video preloading module, a video decoding module and a cover setting module;

the video preloading module is used for preloading a second video image before playing the first video image, wherein the second video image is a part of the first video image and comprises a key frame;

the video decoding module is used for decoding the second video image to obtain a key frame;

and the cover setting module is used for setting the key frame as a video cover of the first video image.

In a third aspect, an embodiment of the present invention provides a video processing apparatus, including:

one or more processors;

a memory for storing one or more programs;

when the one or more programs are executed by the one or more processors, the one or more processors are caused to implement the video processing method according to any one of the first aspect of the embodiments of the present invention.

In a fourth aspect, an embodiment of the present invention further provides a computer-readable storage medium, on which a computer program is stored, where the computer program, when executed by a processor, implements the video processing method according to any one of the first aspect of the embodiments of the present invention.

The embodiment of the invention provides a video processing method, a video processing device and a computer readable storage medium. A first video image comprising a key frame and a plurality of predicted frames is obtained, the position of the key frame is determined, and the first video image is encoded. Compared with the prior art, the key frame is positioned between any two adjacent prediction frames, so that the distance between the key frame and the prediction frames is shortened. When the prediction frames refer to the key frame for coding, the video quality can therefore be improved without increasing the bit rate or the coding complexity, thereby improving the visual experience of a user.

Drawings

FIG. 1 is a schematic diagram of a conventional video encoding structure;

FIG. 2 is a schematic flowchart of a video processing method according to an embodiment of the present invention;

FIG. 3 is a schematic diagram of a video encoding structure according to an embodiment of the present invention;

FIG. 4 is a schematic diagram of another video encoding structure according to an embodiment of the present invention;

FIG. 5 is a schematic flowchart of another video processing method according to an embodiment of the present invention;

FIG. 6 is a schematic structural diagram of a video processing apparatus according to an embodiment of the present invention;

FIG. 7 is a schematic structural diagram of another video processing apparatus according to an embodiment of the present invention;

FIG. 8 is a schematic structural diagram of another video processing apparatus according to an embodiment of the present invention.

Detailed Description

The technical solution in the embodiments of the present invention will be clearly and completely described below with reference to the accompanying drawings in the embodiments of the present invention.

It should be noted that the terms "system" and "network" are often used interchangeably herein in the present invention. Reference to "and/or" in embodiments of the invention is intended to include any and all combinations of one or more of the associated listed items. The terms "first", "second", and the like in the description and claims of the present invention and in the drawings are used for distinguishing between different objects and not for limiting a particular order.

It should be noted that the following embodiments of the present invention may be implemented individually, or may be implemented in combination with each other, and the embodiments of the present invention are not limited in this respect.

With the continuous development of the internet and terminal equipment, the appearance of short videos greatly enriches the life, work and entertainment of users. Short videos generally have the following characteristics compared to normal videos:

1. A short video has a short duration (generally only about 10 seconds) and a single scene. Fig. 1 is a schematic diagram of a conventional video coding structure. As shown in fig. 1, taking a frame rate of 30 frames per second as an example, a 10-second short video includes 300 frames and one scene. The 300 frames include only 1 key frame, i.e., an intra frame (hereinafter referred to as an I frame) in a video coding format, and 299 predicted frames (e.g., P frames or B frames, labeled P0, P1, …, P298 in fig. 1).

In conventional video coding, the key frame is the first frame and is followed by a number of predicted frames. Taking P frames as an example, a P frame located before a key frame cannot refer to the key frame, so the key frame can serve as a reference only for the P frames behind it. The 299 P frames after the I frame in fig. 1 therefore all refer forward in sequence (i.e., the P0 frame refers to the I frame, the P1 frame refers to the P0 frame, the P2 frame refers to the P1 frame, and so on). Moreover, coding is lossy and errors accumulate gradually, so the image quality of the predicted frames after the key frame degrades gradually: the farther a frame is from the key frame, the poorer its image quality.

2. When loading a short video, the terminal device often needs to load a video cover in addition to downloading the short video, and the video cover is often a separate picture file (such as a picture in jpg format). The picture content usually comes from the short video, but is often not the first frame of the short video (because the first frame may be completely black, or completely black with subtitles, or unrepresentative and unappealing to the viewer). Although the content of the video cover comes from the short video, it has to be encoded again in a picture format, which wastes bandwidth and storage resources.

In order to solve the above problem, embodiments of the present invention provide a video processing method, an apparatus, and a computer-readable storage medium, which can improve video quality without increasing a bit rate and coding complexity, so as to improve visual experience of a user.

Hereinafter, a video processing method, apparatus, and technical effects thereof will be described in detail.

Fig. 2 is a schematic flow chart of a video processing method according to an embodiment of the present invention. The method disclosed in the embodiment of the present invention is applicable to a server and/or a terminal device (in which case it may be implemented by an application installed on the terminal device). For example, the terminal device in the embodiment of the present invention may be a smart phone, any other terminal device having a video playing function, such as a notebook computer or a tablet computer, or a terminal device capable of controlling another video playing device to play a video. As shown in fig. 2, the method may include the following steps:

S101, acquiring a first video image, wherein the first video image comprises a key frame and a plurality of prediction frames, and the key frame is positioned between any two adjacent prediction frames.

Fig. 3 is a schematic view of a video encoding structure according to an embodiment of the present invention. As shown in fig. 3, the first video image has a length of 10 seconds and a frame rate of 30 frames per second, so the first video image includes 300 frames and one scene. The key frame (I frame) is located between two adjacent predicted frames (the P8 frame and the P9 frame).

S102, determining the position of the key frame, and encoding the first video image.

After determining the location of the key frame (I-frame), the method of encoding the first video image may comprise the following three steps:

Step 1: Encode the key frame.

Step 2: For the predicted frames located before the key frame, reference backward in sequence and encode those predicted frames.

Step 3: For the predicted frames located after the key frame, reference forward in sequence and encode those predicted frames.
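The three steps above can be sketched as follows. This is a minimal illustration rather than the implementation described by the embodiment: frames are identified only by their display index, the name `build_reference_plan` is hypothetical, and encoding all frames before the key frame ahead of all frames after it is an assumption taken from the step numbering.

```python
from typing import Dict, List, Optional, Tuple


def build_reference_plan(
    total_frames: int, key_index: int
) -> Tuple[List[int], Dict[int, Optional[int]]]:
    """Return (encode_order, reference) for a sequence with one key frame.

    Frames are named by display index 0..total_frames-1 (hypothetical model).
    reference[i] is the display index of the frame that frame i predicts from.
    """
    if not 0 <= key_index < total_frames:
        raise ValueError("key_index must be a valid display index")

    # Step 1: the key frame is encoded first and needs no reference.
    encode_order: List[int] = [key_index]
    reference: Dict[int, Optional[int]] = {key_index: None}

    # Step 2: frames before the key frame reference backward, i.e. each one
    # predicts from the frame that follows it in display order.
    for i in range(key_index - 1, -1, -1):
        reference[i] = i + 1
        encode_order.append(i)

    # Step 3: frames after the key frame reference forward, i.e. each one
    # predicts from the frame that precedes it in display order.
    for i in range(key_index + 1, total_frames):
        reference[i] = i - 1
        encode_order.append(i)

    return encode_order, reference


# 300 frames with the key frame at display index 9 (between P8 and P9).
order, ref = build_reference_plan(total_frames=300, key_index=9)
print(order[:4])        # [9, 8, 7, 6]
print(ref[8], ref[10])  # 9 9
```

Walking outward from the key frame guarantees that every predicted frame is encoded only after the frame it references.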

Referring to fig. 3, after the key frame (I frame) is set between two adjacent predicted frames (the P8 frame and the P9 frame), the predicted frames located before the key frame reference backward in sequence and are encoded, and the predicted frames located after the key frame reference forward in sequence and are encoded.

Thus, the first predicted frame (P0 frame) is 9 predicted frames away from the key frame (I frame), and the last predicted frame (P298 frame) is 290 predicted frames away from the key frame (I frame). Compared with the existing video coding method, the method provided by the invention changes the inter-frame distances over which errors accumulate from 1, 2, …, 299 to 1, 2, …, 9 and 1, 2, …, 290, and the original inter-frame distances 291, 292, …, 299 no longer occur. Video quality can therefore be improved without increasing the bit rate or the coding complexity, which improves the visual experience of the user.
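The distance arithmetic above can be checked with a short sketch. It assumes that the number of prediction steps between a frame and the key frame is simply the difference of their display indices. The function name `chain_lengths` is hypothetical, and the summed distance is only a rough proxy, since the embodiment does not define an error metric.

```python
from typing import List


def chain_lengths(total_frames: int, key_index: int) -> List[int]:
    """Prediction steps separating each frame from the key frame."""
    return [abs(i - key_index) for i in range(total_frames)]


conventional = chain_lengths(300, 0)  # key frame first, as in fig. 1
fig3 = chain_lengths(300, 9)          # key frame between P8 and P9, as in fig. 3

print(max(conventional), max(fig3))   # 299 290
print(fig3[0], fig3[-1])              # 9 290
print(sum(conventional), sum(fig3))   # 44850 42240
```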

To further improve video quality, the key frame is located between the first predicted frame and the second predicted frame, wherein the first predicted frame and the second predicted frame are located at the middle position of the plurality of predicted frames.

Fig. 4 is a schematic view of another video encoding structure according to an embodiment of the present invention. As shown in fig. 4, the first video image has a length of 10 seconds and a frame rate of 30 frames per second, so the first video image includes 300 frames and one scene. The key frame (I frame) is located between the first predicted frame and the second predicted frame (the P148 frame and the P149 frame). Thus, the first predicted frame of the video (P0 frame) is 149 predicted frames away from the key frame (I frame), and the last predicted frame (P298 frame) is 150 predicted frames away from the key frame (I frame). Compared with the existing video coding method, the method provided by the invention changes the inter-frame distances over which errors accumulate from 1, 2, …, 299 to 1, 2, …, 149 and 1, 2, …, 150, and the original inter-frame distances 151, 152, …, 299 no longer occur. The error from the first predicted frame of the video to the key frame is thus approximately equal to the error from the last predicted frame to the key frame, and the video quality is further improved.
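Why the middle position is attractive can be illustrated by searching for the key frame position that minimizes the longest prediction chain. This is a sketch under the assumption that the worst-case distance to the key frame is the quantity to minimize; the embodiment does not state an optimization criterion, and `best_key_index` is a hypothetical name.

```python
def best_key_index(total_frames: int) -> int:
    """Display index of the key frame that minimizes the longest chain."""

    def worst_case(key_index: int) -> int:
        frames_before = key_index
        frames_after = total_frames - 1 - key_index
        return max(frames_before, frames_after)

    # min() returns the first index that achieves the smallest worst case.
    return min(range(total_frames), key=worst_case)


print(best_key_index(300))  # 149
```

For 300 frames, display index 149 is the slot between P148 and P149, which matches the structure of fig. 4. Index 150 gives the same worst case of 150.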

On the basis of the foregoing embodiment of the present invention, fig. 5 is a schematic flowchart of another video processing method according to an embodiment of the present invention, and as shown in fig. 5, in addition to steps S101 to S102 in the foregoing embodiment, the method further includes:

S103, preloading a second video image before playing the first video image, wherein the second video image is a part of the first video image and comprises the key frame.

S104, decoding the second video image to obtain the key frame.

Specifically, when the first video image is stored at the server side, the video cover in the picture format does not need to be stored, but the second video image is preloaded and decoded before the first video image is played, so that the key frame is obtained. The second video image is a portion of the first video image and the second video image includes a key frame.

S105, setting the key frame as a video cover of the first video image.

Thus, a separate picture format file is no longer required as a video cover, thereby saving bandwidth and storage resources.
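One way to picture the preloading step is to locate the byte range of the key frame so that only that part of the file needs to be fetched and decoded for the cover. The sketch below assumes the encoded frames are stored back to back with known sizes. Real container formats add headers and index tables that are not modeled here, and `EncodedFrame` and `key_frame_byte_range` are hypothetical names. Because the key frame is intra-coded, it can be decoded without any other frame.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class EncodedFrame:
    display_index: int  # position of the frame in playback order
    is_key: bool        # True for the single intra-coded key frame
    size_bytes: int     # size of the encoded frame in the stored stream


def key_frame_byte_range(stream: List[EncodedFrame]) -> Tuple[int, int]:
    """Return (start, end) byte offsets of the key frame, end exclusive.

    The stream is listed in storage order (assumed layout, no headers).
    """
    offset = 0
    for frame in stream:
        if frame.is_key:
            return offset, offset + frame.size_bytes
        offset += frame.size_bytes
    raise ValueError("stream contains no key frame")


# Key frame stored first, followed by two predicted frames.
stream = [
    EncodedFrame(display_index=9, is_key=True, size_bytes=5000),
    EncodedFrame(display_index=8, is_key=False, size_bytes=400),
    EncodedFrame(display_index=7, is_key=False, size_bytes=380),
]
print(key_frame_byte_range(stream))  # (0, 5000)
```

A player could request just this range, decode the key frame, and display it as the cover before downloading the rest of the video.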

The embodiment of the invention provides a video processing method, which comprises the following steps: acquiring a first video image, wherein the first video image comprises a key frame and a plurality of prediction frames, and the key frame is positioned between any two adjacent prediction frames; determining the position of the key frame and encoding the first video image. Compared with the prior art, the key frame is positioned between any two adjacent prediction frames, so that the distance between the key frame and the prediction frames is shortened. When the prediction frames refer to the key frame for coding, the video quality can therefore be improved without increasing the bit rate or the coding complexity, thereby improving the visual experience of a user.

Fig. 6 is a schematic structural diagram of a video processing apparatus according to an embodiment of the present invention. Specifically, the video processing apparatus may be configured in a server or a terminal device, and includes: a video acquisition module 10, a position determination module 11 and a video encoding module 12.

A video obtaining module 10, configured to obtain a first video image, where the first video image includes a key frame and multiple prediction frames, and the key frame is located between any two adjacent prediction frames;

a position determining module 11, configured to determine a position of the key frame;

and a video encoding module 12, configured to encode the first video image.

Optionally, the video encoding module 12 is specifically configured to encode the key frame; for the predicted frames positioned before the key frame, reference backward in sequence and encode the predicted frames positioned before the key frame; and for the predicted frames positioned after the key frame, reference forward in sequence and encode the predicted frames positioned after the key frame.

Optionally, building on fig. 6, fig. 7 is a schematic structural diagram of another video processing apparatus according to an embodiment of the present invention, which further includes: a video preloading module 13, a video decoding module 14 and a cover setting module 15.

The video preloading module 13 is configured to preload a second video image before playing the first video image, where the second video image is a part of the first video image and includes a key frame;

the video decoding module 14 is configured to decode the second video image to obtain a key frame;

a cover setting module 15, configured to set the key frame as a video cover of the first video image.

The video processing apparatus provided in the embodiment of the present invention can execute the steps in the video processing method provided in the embodiment of the method of the present invention, and has the corresponding functional modules and beneficial effects of the execution method.

An embodiment of the present invention further provides a video processing apparatus, including: one or more processors; and a memory for storing one or more programs; the one or more programs, when executed by the one or more processors, cause the one or more processors to implement any of the video processing methods described above.

Fig. 8 is a schematic structural diagram of another video processing apparatus according to an embodiment of the present invention, as shown in fig. 8, the video processing apparatus includes a processor 20, a memory 21, an input device 22, and an output device 23; the number of the processors 20 in the video processing apparatus may be one or more, and one processor 20 is taken as an example in fig. 8; the processor 20, the memory 21, the input device 22 and the output device 23 in the video processing device may be connected by a bus or other means, and fig. 8 illustrates an example of connection by a bus. A bus represents one or more of any of several types of bus structures, including a memory bus or memory controller, a peripheral bus, an accelerated graphics port, and a processor or local bus using any of a variety of bus architectures.

The memory 21 is a computer-readable storage medium, and can be used for storing software programs, computer-executable programs, and modules, such as program instructions/modules corresponding to the video processing method in the embodiment of the present invention. The processor 20 executes various functional applications and data processing of the video processing apparatus by executing software programs, instructions, and modules stored in the memory 21, that is, implements the video processing method described above.

The memory 21 may mainly include a storage program area and a storage data area, wherein the storage program area may store an operating system, an application program required for at least one function; the storage data area may store data created according to use of the video processing apparatus, and the like. Further, the memory 21 may include high speed random access memory, and may also include non-volatile memory, such as at least one magnetic disk storage device, flash memory device, or other non-volatile solid state storage device. In some examples, memory 21 may further include memory located remotely from processor 20, which may be connected to a video processing device over a network. Examples of such networks include, but are not limited to, the internet, intranets, local area networks, mobile communication networks, and combinations thereof.

The input device 22 is operable to receive input numeric or character information and to generate key signal inputs relating to user settings and function controls of the video processing device. The output device 23 may include a display device such as a display screen.

Embodiments of the present invention further provide a computer-readable storage medium, on which a computer program is stored, where the computer program, when executed by a processor, implements a video processing method according to embodiments of the present invention. The method may specifically be, but is not limited to, what is disclosed in the foregoing method embodiments.

Computer storage media for embodiments of the invention may employ any combination of one or more computer-readable media. The computer readable medium may be a computer readable signal medium or a computer readable storage medium. A computer readable storage medium may be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the foregoing. More specific examples (a non-exhaustive list) of the computer readable storage medium would include the following: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In the context of this document, a computer readable storage medium may be any tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device.

A computer readable signal medium may include a propagated data signal with computer readable program code embodied therein, for example, in baseband or as part of a carrier wave. Such a propagated data signal may take many forms, including, but not limited to, electro-magnetic, optical, or any suitable combination thereof. A computer readable signal medium may also be any computer readable medium that is not a computer readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device.

Program code embodied on a computer readable medium may be transmitted using any appropriate medium, including but not limited to wireless, wireline, optical fiber cable, RF, etc., or any suitable combination of the foregoing.

Computer program code for carrying out operations for aspects of the present disclosure may be written in any combination of one or more programming languages, including object oriented programming languages such as Java, Smalltalk, C++, Ruby and Go, and conventional procedural programming languages, such as the "C" programming language or similar programming languages. The program code may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on the remote computer or server. In the case of a remote computer, the remote computer may be connected to the user's computer through any type of network, including a Local Area Network (LAN) or a Wide Area Network (WAN), or the connection may be made to an external computer (for example, through the Internet using an Internet service provider).

As will be appreciated by one skilled in the art, embodiments of the present invention may be provided as a method, system, or computer program product. Accordingly, the present invention may take the form of a hardware embodiment, a software embodiment, or an embodiment combining software and hardware aspects. Furthermore, the present invention may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, optical storage, and the like) having computer-usable program code embodied therein.

The present invention is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of the invention. It will be understood that each flow and/or block of the flow diagrams and/or block diagrams, and combinations of flows and/or blocks in the flow diagrams and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.

These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.

These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.

The above description is only a preferred embodiment of the present invention, and is not intended to limit the scope of the present invention.
