Video processing method, device, equipment and storage medium

Document No.: 1188291 · Publication date: 2020-09-22

Reading note: this technique, "Video processing method, device, equipment and storage medium", was designed and created by 黄晓政 (Huang Xiaozheng), 郑云飞 (Zheng Yunfei), and 闻兴 (Wen Xing) on 2020-06-24. Abstract: The disclosure relates to a video processing method, apparatus, device, and storage medium, belonging to the field of multimedia technology. Some embodiments of the present disclosure preprocess and encode an original video, obtain a compression distortion parameter of the video from the encoded video, and adjust the preprocessing parameter according to the compression distortion parameter. Because the compression distortion after video encoding is considered when the preprocessing parameter is adjusted, the accuracy of the adjusted preprocessing parameter is significantly improved; by applying the adjusted preprocessing parameter during preprocessing, a device can reduce the compression distortion produced after encoding, thereby improving video quality and optimizing the device's video preprocessing performance.

1. A method of video processing, the method comprising:

preprocessing the first original video according to the first preprocessing parameter to obtain a preprocessed video;

encoding the pre-processed video to obtain an encoded video;

obtaining a compression distortion parameter according to the encoded video, wherein the compression distortion parameter is used for representing the degree of degradation in video quality of the encoded video compared with the first original video;

adjusting the first preprocessing parameter according to the compression distortion parameter to obtain a second preprocessing parameter;

and preprocessing a second original video according to the second preprocessing parameter.

2. The video processing method of claim 1, wherein the first pre-processing parameter comprises a first filter coefficient, and wherein pre-processing the first original video according to the first pre-processing parameter comprises:

preprocessing the first original video through a filter with the first filter coefficient.

3. The video processing method of claim 1, wherein the first pre-processing parameter comprises a first network parameter, and wherein pre-processing the first original video according to the first pre-processing parameter comprises:

preprocessing the first original video through a neural network with the first network parameters.

4. The video processing method of claim 1, wherein said encoding the pre-processed video comprises:

inputting the preprocessed video into a video encoder, and encoding the preprocessed video through the video encoder.

5. The video processing method of claim 1, wherein said encoding the pre-processed video comprises:

inputting the preprocessed video into a neural network, and encoding the preprocessed video through the neural network, wherein the neural network is used for outputting a video corresponding to a preset code rate or a preset quantization parameter.

6. The video processing method of claim 1, wherein the second pre-processing parameter comprises a second filter coefficient, and wherein pre-processing the second original video according to the second pre-processing parameter comprises:

preprocessing the second original video through a filter with the second filter coefficient.

7. The video processing method of claim 1, wherein the second pre-processing parameter comprises a second network parameter, and wherein pre-processing the second original video according to the second pre-processing parameter comprises:

preprocessing the second original video through a neural network with the second network parameters.

8. A video processing apparatus, comprising:

the preprocessing unit is configured to perform preprocessing on the first original video according to the first preprocessing parameter to obtain a preprocessed video;

the encoding unit is configured to encode the preprocessed video to obtain an encoded video;

an obtaining unit configured to obtain a compression distortion parameter representing a degree of degradation in video quality of the encoded video compared with the first original video, from the encoded video;

an adjusting unit configured to perform adjustment on the first preprocessing parameter according to the compression distortion parameter to obtain a second preprocessing parameter;

the preprocessing unit is further configured to perform preprocessing on a second original video according to the second preprocessing parameter.

9. An electronic device, comprising:

one or more processors;

one or more memories for storing instructions executable by the one or more processors;

wherein the one or more processors are configured to execute the instructions to implement the video processing method of any of claims 1 to 7.

10. A storage medium, wherein instructions in the storage medium, when executed by a processor of an electronic device, enable the electronic device to perform the video processing method of any of claims 1 to 7.

Technical Field

The present disclosure relates to the field of multimedia technologies, and in particular, to a video processing method, apparatus, device, and storage medium.

Background

Video pre-processing refers to operations performed on video at the pixel level prior to video encoding. By preprocessing the video and then encoding it, the compression distortion of the encoded video can be reduced. In the related art, the filter coefficient of the filter is manually preset; after the original video is obtained, the filter performs a low-pass filtering operation on the original video according to the manually set filter coefficient, thereby realizing video preprocessing. The filtered video is then input to a video encoder, which encodes it to obtain the encoded video. With this approach, the compression distortion of the encoded video remains large.

Disclosure of Invention

The present disclosure provides a video processing method, apparatus, device, and storage medium, so as to at least solve the problem in the related art that video exhibits large compression distortion after encoding. The technical scheme of the disclosure is as follows:

according to a first aspect of the embodiments of the present disclosure, there is provided a video processing method, including:

preprocessing the first original video according to the first preprocessing parameter to obtain a preprocessed video;

encoding the pre-processed video to obtain an encoded video;

obtaining a compression distortion parameter according to the encoded video, wherein the compression distortion parameter is used for representing the degree of degradation in video quality of the encoded video compared with the first original video;

adjusting the first preprocessing parameter according to the compression distortion parameter to obtain a second preprocessing parameter;

and preprocessing a second original video according to the second preprocessing parameter.

Optionally, the first preprocessing parameter includes a first filter coefficient, and the preprocessing the first original video according to the first preprocessing parameter includes:

preprocessing the first original video through a filter with the first filter coefficient.

Optionally, the first preprocessing parameter includes a first network parameter, and the preprocessing the first original video according to the first preprocessing parameter includes:

preprocessing the first original video through a neural network with the first network parameters.

Optionally, the encoding the preprocessed video includes:

inputting the preprocessed video into a video encoder, and encoding the preprocessed video through the video encoder.

Optionally, the encoding the preprocessed video includes:

inputting the preprocessed video into a neural network, and encoding the preprocessed video through the neural network, wherein the neural network is used for outputting a video corresponding to a preset code rate or a preset quantization parameter.

Optionally, the adjusting the first preprocessing parameter according to the compression distortion parameter to obtain a second preprocessing parameter includes:

determining, as the second preprocessing parameter, the first preprocessing parameter that makes the value of the compression distortion parameter a minimum value, wherein the minimum value is the smallest among a plurality of compression distortion parameters obtained after the preprocessing parameter is adjusted a plurality of times.

Optionally, the second preprocessing parameter includes a second filter coefficient, and the preprocessing the second original video according to the second preprocessing parameter includes:

preprocessing the second original video through a filter with the second filter coefficient.

Optionally, the second preprocessing parameter includes a second network parameter, and the preprocessing the second original video according to the second preprocessing parameter includes:

preprocessing the second original video through a neural network with the second network parameters.

According to a second aspect of the embodiments of the present disclosure, there is provided a video processing apparatus including:

the preprocessing unit is configured to perform preprocessing on the first original video according to the first preprocessing parameter to obtain a preprocessed video;

the encoding unit is configured to encode the preprocessed video to obtain an encoded video;

an obtaining unit configured to obtain a compression distortion parameter representing a degree of degradation in video quality of the encoded video compared with the first original video, from the encoded video;

an adjusting unit configured to perform adjustment on the first preprocessing parameter according to the compression distortion parameter to obtain a second preprocessing parameter;

the preprocessing unit is further configured to perform preprocessing on a second original video according to the second preprocessing parameter.

Optionally, the first preprocessing parameter includes a first filter coefficient, and the preprocessing unit is configured to perform preprocessing on the first original video through a filter having the first filter coefficient.

Optionally, the first preprocessing parameter includes a first network parameter, and the preprocessing unit is configured to perform preprocessing on the first original video through a neural network having the first network parameter.

Optionally, the encoding unit is configured to perform inputting the pre-processed video into a video encoder, and encoding the pre-processed video by the video encoder.

Optionally, the encoding unit is configured to perform inputting the preprocessed video into a neural network, and encode the preprocessed video through the neural network, where the neural network is configured to output a video corresponding to a preset code rate or a preset quantization parameter.

Optionally, the adjusting unit is configured to determine, as the second preprocessing parameter, the first preprocessing parameter that makes the value of the compression distortion parameter a minimum value, where the minimum value is the smallest among a plurality of compression distortion parameters obtained after the preprocessing parameters are adjusted a plurality of times.

Optionally, the second preprocessing parameter includes a second filter coefficient, and the preprocessing unit is configured to perform preprocessing on the second original video through a filter having the second filter coefficient.

Optionally, the second preprocessing parameter includes a second network parameter, and the preprocessing unit is configured to perform preprocessing on the second original video through a neural network having the second network parameter.

According to a third aspect of the embodiments of the present disclosure, there is provided an electronic apparatus including:

one or more processors;

one or more memories for storing instructions executable by the one or more processors;

wherein the one or more processors are configured to execute the instructions to implement the video processing method described above.

According to a fourth aspect of embodiments of the present disclosure, there is provided a storage medium having instructions that, when executed by a processor of an electronic device, enable the electronic device to perform the above-described video processing method.

According to a fifth aspect of embodiments of the present disclosure, there is provided a computer program product comprising one or more instructions that, when executed by a processor of an electronic device, enable the electronic device to perform the above-described video processing method.

The technical scheme provided by the embodiment of the disclosure at least brings the following beneficial effects:

the method preprocesses and encodes an original video, obtains a compression distortion parameter of the video according to the encoded video, and adjusts the preprocessing parameter according to the compression distortion parameter; because the compression distortion after video encoding is considered when the preprocessing parameter is adjusted, the accuracy of the adjusted preprocessing parameter is remarkably improved.

It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory only and are not restrictive of the disclosure.

Drawings

The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the present disclosure and, together with the description, serve to explain the principles of the disclosure and are not to be construed as limiting the disclosure.

FIG. 1 is a block diagram illustrating an environment for implementing a video processing method in accordance with an exemplary embodiment;

FIG. 2 is a block diagram illustrating a video processing system 200 according to an example embodiment;

FIG. 3 is a flow diagram illustrating a video processing method according to an exemplary embodiment;

FIG. 4 is a flow diagram illustrating a video processing method in accordance with an exemplary embodiment;

FIG. 5 is a block diagram illustrating a video processing device according to an example embodiment;

FIG. 6 is a block diagram illustrating a terminal in accordance with an exemplary embodiment;

FIG. 7 is a block diagram illustrating a server in accordance with an example embodiment.

Detailed Description

In order to make the technical solutions of the present disclosure better understood by those of ordinary skill in the art, the technical solutions in the embodiments of the present disclosure will be clearly and completely described below with reference to the accompanying drawings.

It should be noted that the terms "first," "second," and the like in the description and claims of the present disclosure and in the above-described drawings are used for distinguishing between similar elements and not necessarily for describing a particular sequential or chronological order. It is to be understood that the data so used is interchangeable under appropriate circumstances such that the embodiments of the disclosure described herein are capable of operation in sequences other than those illustrated or otherwise described herein. The implementations described in the exemplary embodiments below are not intended to represent all implementations consistent with the present disclosure. Rather, they are merely examples of apparatus and methods consistent with certain aspects of the present disclosure, as detailed in the appended claims.

Some terms used in the embodiments of the present application are explained below.

(1) Convolutional neural network

Convolutional neural networks are a class of feedforward neural networks that involve convolution computations and have a deep structure; they are among the representative algorithms of deep learning.

(2) Video encoder

A video encoder refers to a program or device capable of compressing or decompressing video.

(3) Video pre-processing

Video pre-processing refers to operating on the video at the pixel level before encoding so that the video achieves a desired effect. At a set bitrate, the video encoder performs lossy compression on the digital video signal: the lower the bitrate, the more noticeable the compression distortion. Moreover, the same bitrate behaves differently for different videos: the more complex the texture and motion in the video content, the more severe the compression distortion. Therefore, pixel-level operations such as filtering applied during video pre-processing can reduce texture information in the video content and thereby reduce compression distortion after encoding.

(4) Distortion of compression

Compression distortion is the visible distortion introduced into multimedia files (including images, audio, and video) by lossy compression.

(5) Code rate

The code rate (bitrate) is the amount of binary data per unit time after a signal is converted into digital form. The higher the bitrate, the more data is transmitted per second and the clearer the picture.
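
As a quick arithmetic illustration of this definition (the 4 Mbps figure and one-minute duration below are assumed example values, not drawn from this disclosure):

    # Data volume implied by a given bitrate (illustrative values only).
    bitrate_bps = 4_000_000            # an assumed 4 Mbps stream
    seconds = 60                       # one minute of video
    total_bits = bitrate_bps * seconds
    print(total_bits / 8 / 1e6)        # -> 30.0, i.e. about 30 MB per minute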

(6) Video quality

Video quality quantifies the degree to which picture quality changes (typically degrades) as a piece of video passes through a video transmission/processing system. Because such a system may introduce some distortion into the video signal, video quality evaluation is very important when selecting a video transmission/processing system.

Having introduced these terms, the following illustrates how the video pre-processing technique is used in specific applications.

In some embodiments, at a set bitrate (generally a low one), a downscaling operation is performed on the video before encoding to reduce its texture information, so that compression distortion after encoding is reduced. However, because this lowers the video's resolution, texture information is removed indiscriminately and the video's sharpness drops considerably.

In other embodiments, at the video pixel level, image low-pass filtering operations (such as Gaussian blur and bilateral filtering) are applied to reduce the video's high-frequency information, so that compression distortion after encoding is reduced. However, this approach applies the same operation to all texture information in the video and does not take into account the characteristics of the video encoder's own compression distortion.
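
A minimal sketch of such pixel-level low-pass operations, using OpenCV in Python, is shown below; the file name, kernel size, and filter parameters are illustrative assumptions, not values taken from this disclosure:

    import cv2

    frame = cv2.imread("frame.png")  # one frame of the original video (hypothetical file)

    # Gaussian blur: attenuates high-frequency texture uniformly across the frame.
    blurred = cv2.GaussianBlur(frame, (5, 5), sigmaX=1.2)

    # Bilateral filter: smooths texture while preserving strong edges.
    smoothed = cv2.bilateralFilter(frame, d=9, sigmaColor=50, sigmaSpace=50)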

In some embodiments provided by the present application, a video pre-processing method is provided in which a module that simulates video coding distortion is added, and the compression distortion measured after video encoding guides the filter coefficients used in pre-processing, so that pre-processing is optimized with the goal of minimizing post-encoding compression distortion.

In the following, a hardware environment of some embodiments of the present disclosure is illustrated.

Fig. 1 is a block diagram illustrating an environment for implementing a video processing method according to an example embodiment. The implementation environment includes: a terminal 101 and a video processing platform 110.

The terminal 101 is connected to the video processing platform 110 through a wireless network or a wired network. The terminal 101 may be at least one of a smart phone, a game console, a desktop computer, a tablet computer, an e-book reader, an MP3 (Moving Picture Experts Group Audio Layer III) player, an MP4 (Moving Picture Experts Group Audio Layer IV) player, or a laptop computer. The terminal 101 installs and runs an application program supporting video processing. The application may be a live-streaming application, a multimedia application, a short-video application, and the like. Illustratively, the terminal 101 is a terminal used by a user, and a user account is logged into the application running in the terminal 101.

The video processing platform 110 includes at least one of a server, a plurality of servers, a cloud computing platform, or a virtualization center. The video processing platform 110 provides background services for applications that support video processing functionality. Optionally, the video processing platform 110 and the terminal 101 cooperate in processing video. For example, the video processing platform 110 undertakes the primary work and the terminal 101 the secondary work; or the video processing platform 110 undertakes the secondary work and the terminal 101 the primary work; or the video processing platform 110 and the terminal 101 each undertake the work independently. As an example, the video processing platform 110 may perform the following embodiments to train a convolutional neural network and transmit it to the terminal 101; the terminal 101 may receive the convolutional neural network from the video processing platform 110 and use it for video pre-processing and video encoding by performing the following method embodiments.

Optionally, the video processing platform 110 comprises: an access server, a video processing server 1101, and a database 1102. The access server provides access services for the terminal 101. The video processing server 1101 provides background services related to video processing, such as training the convolutional neural network and collecting sample videos. There may be one or more video processing servers 1101. When there are multiple video processing servers 1101, at least two of them may provide different services, and/or at least two may provide the same service, for example in a load-balancing manner; the embodiments of the present disclosure do not limit this. A video processing model may be provided in the video processing server 1101. The database 1102 may store sample videos, convolutional neural networks, original images, or other data related to the method embodiments described below, and may provide the stored data to the terminal 101 and the video processing server 1101 as needed.

The terminal 101 may be generally referred to as one of a plurality of terminals, and the embodiment is only illustrated by the terminal 101.

Those skilled in the art will appreciate that the number of terminals 101 may be greater or fewer. For example, there may be only one terminal 101, or there may be tens, hundreds, or more, in which case the video processing system further includes other terminals. The embodiments of the present disclosure do not limit the number or types of terminals.

The hardware environment of the embodiment of the present application is illustrated above, and the logical functional architecture of the embodiment of the present application is illustrated below.

Referring to fig. 2, an embodiment of the present application provides a video processing system 200. The video processing system 200 includes an original video 210, a video pre-processing module 220, a video encoding module 230, an encoded video 240, and an encoding distortion module 250.

When the video processing system 200 processes video, the original video 210 is input to the video pre-processing module 220. The video pre-processing module 220 pre-processes the original video 210 to obtain a pre-processed video, which is input to the video encoding module 230. The video encoding module 230 encodes the pre-processed video to obtain an encoded video 240. The encoded video 240 and the original video 210 are then input to the encoding distortion module 250, which obtains a compression distortion parameter from the video quality difference between the encoded video 240 and the original video 210. The compression distortion parameter is fed back to the video pre-processing module 220, which adjusts its own preprocessing parameters accordingly, so that the preprocessing parameters of the video pre-processing module 220 are optimized.

The implementation manner of the video preprocessing module 220 includes various cases. Optionally, the video pre-processing module 220 is a filter. For example, the video pre-processing module 220 is a multi-tap image filter. Optionally, the video pre-processing module 220 is a neural network, for example, the video pre-processing module 220 is a convolutional neural network.

The implementation of the video encoding module 230 likewise includes various cases. Optionally, the video encoding module 230 is a real video encoder, i.e., a physical device. Optionally, the video encoding module 230 is a simulator of video coding distortion, trained using a convolutional neural network to simulate encoder distortion at a specific bitrate or quantization setting.

Fig. 3 is a flowchart illustrating a video processing method according to an exemplary embodiment. As shown in fig. 3, the method is used in an electronic device and includes the following steps.

In step S310, the electronic device preprocesses the first original video according to the first preprocessing parameter to obtain a preprocessed video.

In step S320, the electronic device encodes the preprocessed video to obtain an encoded video.

In step S330, the electronic device obtains a compression distortion parameter according to the encoded video, where the compression distortion parameter is used to indicate a degree of degradation of the video quality of the encoded video compared with the first original video.

In step S340, the electronic device adjusts the first preprocessing parameter according to the compression distortion parameter to obtain a second preprocessing parameter.

In step S350, the electronic device preprocesses the second original video according to the second preprocessing parameter.
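
The five steps can be summarized in the following schematic Python sketch; the four helper functions are hypothetical stand-ins for the pre-processor, the encoder, the quality metric, and the update rule, none of which are fixed by this disclosure:

    def preprocess(video, params): ...         # filter or neural network (S310/S350)
    def encode(video): ...                     # video encoder, or a CNN simulating one (S320)
    def distortion(encoded, original): ...     # quality degradation of encoded vs. original (S330)
    def adjust(params, d): ...                 # update rule, e.g. one search step (S340)

    def one_pass(first_video, second_video, theta_1):
        pre = preprocess(first_video, theta_1)    # S310: pre-process with the first parameter
        enc = encode(pre)                         # S320: encode the pre-processed video
        d = distortion(enc, first_video)          # S330: compression distortion parameter
        theta_2 = adjust(theta_1, d)              # S340: adjust to obtain the second parameter
        return preprocess(second_video, theta_2)  # S350: apply it to the second original video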

Optionally, the first preprocessing parameter includes a first filter coefficient, and preprocessing the first original video according to the first preprocessing parameter includes:

the first original video is preprocessed by a filter having a first filter coefficient.

Optionally, the first preprocessing parameter includes a first network parameter, and preprocessing the first original video according to the first preprocessing parameter includes:

the first raw video is preprocessed through a neural network with first network parameters.

Optionally, encoding the preprocessed video includes:

inputting the preprocessed video into a video encoder, and encoding the preprocessed video through the video encoder.

Optionally, encoding the preprocessed video includes:

inputting the preprocessed video into a neural network, and encoding the preprocessed video through the neural network, wherein the neural network is used for outputting a video corresponding to a preset code rate or a preset quantization parameter.

Optionally, adjusting the first preprocessing parameter according to the compression distortion parameter to obtain a second preprocessing parameter, including:

determining the first preprocessing parameter that makes the value of the compression distortion parameter the minimum value as the second preprocessing parameter.

Optionally, the second preprocessing parameter includes a second filter coefficient, and preprocessing the second original video according to the second preprocessing parameter includes:

preprocessing the second original video through a filter with the second filter coefficient.

Optionally, the second preprocessing parameter includes a second network parameter, and preprocessing the second original video according to the second preprocessing parameter includes:

preprocessing the second original video through a neural network with the second network parameters.

Fig. 4 is a flowchart illustrating a video processing method according to an exemplary embodiment. As shown in fig. 4, the method is used in an electronic device and includes the following steps.

In step S400, the electronic device acquires a plurality of original videos.

Raw video refers to video that has been neither pre-processed nor encoded. The original video can be obtained in several ways. In some embodiments, the electronic device accesses a database and reads original video previously stored there. In other embodiments, the electronic device records video via a camera to obtain original video. In still other embodiments, the electronic device is a background server of a video application, and a client of the video application records original video and sends it to the electronic device.

In step S410, the electronic device preprocesses the first original video according to the first preprocessing parameter to obtain a first preprocessed video.

The first original video is any one of the original videos.

The preprocessing parameter is a parameter to be used when preprocessing is performed. Optionally, the preprocessing step is implemented by a filter, and the preprocessing parameter is a filter coefficient of the filter. For example, the preprocessing parameter is a filter coefficient of a multi-tap image filter. Optionally, the preprocessing step is implemented by a neural network, and the preprocessing parameters are network parameters of the neural network. For example, the pre-processing parameter is a weight matrix in the neural network.

The preprocessed video refers to a video obtained by preprocessing an original video. The preprocessed video is preprocessed and not encoded. The first preprocessed video refers to a video obtained by preprocessing a first original video.

In this embodiment, the electronic device optimizes the preprocessing parameters by adjusting them. To distinguish the preprocessing parameters before and after optimization, the parameters before optimization are referred to as first preprocessing parameters, and the parameters after optimization as second preprocessing parameters.

Optionally, the first preprocessing parameter is obtained by training on sample videos. When the preprocessing is implemented by a filter, the first preprocessing parameter includes a first filter coefficient, which may be obtained, for example, as follows: determine the structure of the filter; configure the filter coefficient of a filter with that structure to an initial value; filter the sample video through the filter; and adjust the initial value according to the filtered video to obtain the first filter coefficient. When the preprocessing is implemented by a neural network, the first preprocessing parameter includes a first network parameter, which may be obtained, for example, as follows: determine the structure of the neural network; configure the network parameters of a network with that structure to initial values; filter the sample video through the network; and adjust the initial values according to the filtered video to obtain the first network parameter.

Specifically, the preprocessing can be carried out in various ways; a first and a second preprocessing mode are described below as examples.

In the first preprocessing mode, the electronic device preprocesses the first original video through a filter with the first filter coefficient. For example, the first original video is filtered by the filter; for instance, it is low-pass filtered by a low-pass filter. The low-pass filtering includes, but is not limited to, Gaussian blur or bilateral filtering. Performing a low-pass filtering operation reduces the video's high-frequency information, so that compression distortion after encoding is reduced.

In the second preprocessing mode, the electronic device preprocesses the first original video through the neural network with the first network parameters. For example, the first original video is convolved by a convolutional neural network.
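
A minimal PyTorch sketch of such a pre-processing network is given below; the residual three-layer architecture and the frame size are illustrative assumptions, not choices specified by this disclosure:

    import torch
    import torch.nn as nn

    class PreprocessNet(nn.Module):
        """Maps an RGB frame to a pre-processed RGB frame of the same size."""
        def __init__(self):
            super().__init__()
            self.body = nn.Sequential(
                nn.Conv2d(3, 32, kernel_size=3, padding=1), nn.ReLU(),
                nn.Conv2d(32, 32, kernel_size=3, padding=1), nn.ReLU(),
                nn.Conv2d(32, 3, kernel_size=3, padding=1),
            )

        def forward(self, x):
            # Predict a correction and add it to the input, so the untrained
            # network starts out close to an identity mapping.
            return x + self.body(x)

    frames = torch.rand(1, 3, 360, 640)  # dummy batch: one 640x360 RGB frame
    pre = PreprocessNet()(frames)        # pre-processed frame, same shape

The convolution weights of such a network would play the role of the first network parameters described above.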

In step S420, the electronic device encodes the first preprocessed video to obtain a first encoded video.

The encoded video is a video obtained by encoding a preprocessed video. The encoded video is pre-processed and encoded. The first encoded video is a video obtained by encoding the first preprocessed video.

Specifically, the encoding can be carried out in various ways; a first and a second encoding mode are described below as examples.

In the first encoding mode, the electronic device inputs the first preprocessed video into a video encoder, and the video encoder encodes it.

Optionally, the video encoder is a hardware device. The electronic device is connected to the video encoder through a wired or wireless network and sends it an encoding request carrying the first preprocessed video. The video encoder receives the encoding request and encodes the first preprocessed video in response.

In the second encoding mode, the electronic device inputs the first preprocessed video into a neural network and encodes it through the neural network.

Optionally, the neural network is a convolutional neural network. Optionally, the neural network is pre-stored on the electronic device, which invokes it to perform the encoding. The neural network outputs a video corresponding to a preset bitrate or a preset quantization parameter; it is used to simulate coding distortion, and during training it can learn the mapping relation between compression distortion parameters and preprocessing parameters. Optionally, the neural network used for encoding differs from the neural network used for preprocessing. For example, the first original video is preprocessed by a first neural network, and the first preprocessed video is encoded by a second neural network whose network structure differs from that of the first.

Here the quantization parameter is the encoder's QP; the preset quantization parameter is, for example, a QP preset by developers. The QP reflects how aggressively spatial detail is compressed: a small QP means most detail is preserved, while a larger QP discards some detail and lowers the bitrate, at the cost of stronger image distortion and reduced quality.
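
A hedged sketch of training such a distortion-simulating network follows. It assumes training pairs of (pre-processed frame, the same frame after real encoding and decoding at the preset bitrate or QP) are available as tensors; the two-layer architecture and optimizer settings are illustrative assumptions:

    import torch
    import torch.nn as nn

    # Stand-in CNN that learns to reproduce encoder output at one fixed QP setting.
    simulator = nn.Sequential(
        nn.Conv2d(3, 32, kernel_size=3, padding=1), nn.ReLU(),
        nn.Conv2d(32, 3, kernel_size=3, padding=1),
    )
    optimizer = torch.optim.Adam(simulator.parameters(), lr=1e-4)
    loss_fn = nn.MSELoss()

    def train_step(pre_frame, encoded_frame):
        """One step toward matching the real encoder's output at the preset QP."""
        optimizer.zero_grad()
        loss = loss_fn(simulator(pre_frame), encoded_frame)
        loss.backward()
        optimizer.step()
        return loss.item()

Because such a simulator is differentiable, the compression distortion it predicts can in principle be back-propagated into the pre-processing parameters, which is one plausible reading of the mapping relation mentioned above.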

In step S430, the electronic device obtains a compression distortion parameter according to the first encoded video.

The compression distortion parameter is used to indicate the degree of degradation in video quality of the first encoded video compared with the first original video. For example, the larger the compression distortion parameter, the more the first encoded video is distorted relative to the first original video.

In one possible implementation, the electronic device determines a quality parameter of the first original video from the first original video, determines a quality parameter of the first encoded video from the first encoded video, and obtains the compression distortion parameter from these two quality parameters. For example, the electronic device determines the difference between the quality parameter of the first original video and that of the first encoded video as the compression distortion parameter.
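
One hedged, concrete realization of this step uses PSNR as the quality measure (an assumption; the disclosure leaves the quality parameter open). Here the two per-video quality parameters are collapsed into a single full-reference comparison; a no-reference metric applied to each video separately would fit the text equally well:

    import cv2

    def compression_distortion(original_frame, encoded_frame):
        # PSNR in dB of the encoded frame against the original; higher = closer.
        psnr = cv2.PSNR(original_frame, encoded_frame)
        return -psnr  # negate so that a larger value means stronger distortion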

In step S440, the electronic device adjusts the first preprocessing parameter according to the compression distortion parameter to obtain a second preprocessing parameter.

The value of the second preprocessing parameter differs from that of the first. The preprocessing parameter can be adjusted by increasing the value of the first preprocessing parameter, making the second preprocessing parameter larger; or by decreasing it, making the second preprocessing parameter smaller.

Optionally, the electronic device adjusts the first preprocessing parameter one or more times with the goal of minimizing the compression distortion parameter, and determines the first preprocessing parameter that minimizes the value of the compression distortion parameter as the second preprocessing parameter. Alternatively, the electronic device determines the first preprocessing parameter that makes the value of the compression distortion parameter smaller than a preset threshold as the second preprocessing parameter.

In the case where the preprocessing is implemented by a filter, step S440 is, for example, a process of adjusting a filter coefficient of the filter. For example, the first filter coefficient is adjusted according to the compression distortion parameter to obtain a second filter coefficient.

In the case where the preprocessing is implemented by a neural network, step S440 is, for example, a process of adjusting network parameters of the neural network. For example, the first network parameter is adjusted according to the compression distortion parameter to obtain the second network parameter.

Optionally, the process of adjusting the preprocessing parameter comprises multiple iterations. Each iteration comprises one preprocessing, one encoding, and one adjustment of the preprocessing parameter. The stop condition of the iteration is that the value of the compression distortion parameter reaches its minimum. For example, after the first preprocessing parameter is adjusted once, the preprocessing and encoding operations are performed again on the first original video based on the adjusted first preprocessing parameter, the compression distortion parameter is obtained again based on the newly obtained encoded video, and whether the degree of compression distortion has increased or decreased is determined from the new compression distortion parameter. If the degree of compression distortion has increased, the iteration stops; if it has decreased, the first preprocessing parameter continues to be adjusted.

The minimum value is the smallest among the multiple compression distortion parameters obtained after the preprocessing parameter has been adjusted multiple times. Specifically, each adjustment of the preprocessing parameter yields one new compression distortion parameter, so multiple adjustments yield multiple compression distortion parameters, which are compared to determine the minimum. For example, when the preprocessing parameter is adjusted twice: first, compression distortion parameter 1 is obtained from preprocessing parameter 1; then preprocessing parameter 1 is adjusted to obtain preprocessing parameter 2, from which compression distortion parameter 2 is obtained; then preprocessing parameter 2 is adjusted to obtain preprocessing parameter 3, from which compression distortion parameter 3 is obtained. The minimum is then the smallest of compression distortion parameters 1, 2, and 3.
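
A self-contained sketch of this iterative adjustment is shown below. The Gaussian sigma plays the role of the pre-processing parameter, and a JPEG round-trip at a fixed quality stands in for the video encoder; both, along with the file name and candidate values, are assumptions for illustration only:

    import cv2

    def encode_decode(frame, quality=30):
        """Stand-in encoder: JPEG round-trip at one fixed quality setting."""
        ok, buf = cv2.imencode(".jpg", frame, [cv2.IMWRITE_JPEG_QUALITY, quality])
        return cv2.imdecode(buf, cv2.IMREAD_COLOR)

    frame = cv2.imread("frame.png")                    # hypothetical original frame
    best_sigma, best_d = None, float("inf")
    for sigma in (0.5, 1.0, 1.5, 2.0, 2.5):            # successive parameter adjustments
        pre = cv2.GaussianBlur(frame, (5, 5), sigma)   # pre-process
        enc = encode_decode(pre)                       # encode
        d = -cv2.PSNR(frame, enc)                      # compression distortion parameter
        if d < best_d:                                 # keep the minimum over all trials
            best_sigma, best_d = sigma, d
    # best_sigma now plays the role of the second pre-processing parameter.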

In step S450, the electronic device preprocesses the second original video according to the second preprocessing parameter to obtain a second preprocessed video.

The second original video is any one of the original videos. The relationship between the second original video and the first original video includes several cases; two are described below.

In the first case, the second original video and the first original video are different videos; that is, a preprocessing parameter optimized on one original video is applied to the processing of another. For example, multiple videos to be encoded are obtained to form an original video set; some of them are selected as first original videos, on which the method above is executed to obtain compression distortion parameters and the resulting second preprocessing parameter; the remaining videos serve as second original videos, which are preprocessed according to the second preprocessing parameter and then encoded, improving their quality.

In the second case, the second original video and the first original video are the same video; that is, the parameter derived from an original video is applied back to that same video. For example, multiple videos to be encoded are obtained to form an original video set; for each original video in the set, the method above is executed to obtain a compression distortion parameter, the video is preprocessed again according to the adjusted parameter, and then encoded again, improving its quality.

Optionally, the preprocessing in step S450 is performed in the same manner as in step S410. For example, both steps use a filter, or both use a neural network. Specifically, the preprocessing can be carried out in various ways; a first and a second preprocessing mode are described below as examples.

In the first preprocessing mode, the electronic device preprocesses the second original video through a filter with the second filter coefficient. For example, the second original video is filtered by the filter, e.g., low-pass filtered by a low-pass filter.

In the second preprocessing mode, the electronic device preprocesses the second original video through the neural network with the second network parameters. For example, the second original video is convolved by a convolutional neural network.

In step S460, the electronic device encodes the second preprocessed video to obtain a second encoded video.

The second encoded video is a video obtained by encoding the second preprocessed video.

Optionally, the encoding in step S460 is performed in the same manner as in step S420. For example, both steps use a video encoder, or both use a neural network. Specifically, the encoding can be carried out in various ways; a first and a second encoding mode are described below as examples.

In the first encoding mode, the electronic device inputs the second preprocessed video into a video encoder and encodes it through the video encoder.

In the second encoding mode, the electronic device inputs the second preprocessed video into the neural network and encodes it through the neural network.

The second encoded video can serve a variety of application scenarios. For example, it may be played in a client of a video application: the electronic device in this embodiment is a background server of the video application, which encodes the second preprocessed video to obtain the second encoded video and sends it to a terminal running the client; the terminal receives the second encoded video and plays it in the client. As another example, the second encoded video may be used to expand a video database: after obtaining it, the electronic device stores it in the database. As yet another example, the second encoded video may be used to continue optimizing the preprocessing parameters: a compression distortion parameter is obtained from the video quality difference between the second encoded video and the second original video, and the second preprocessing parameter is adjusted accordingly.

According to the method provided by this embodiment, the original video is preprocessed and encoded, the compression distortion parameter of the video is obtained according to the encoded video, and the preprocessing parameter is adjusted according to the compression distortion parameter. Because the compression distortion after video encoding is considered when adjusting the preprocessing parameter, the accuracy of the adjusted preprocessing parameter is significantly improved; applying the adjusted parameter during preprocessing therefore reduces the compression distortion produced after encoding, improving video quality.

Fig. 5 is a block diagram illustrating a video processing apparatus according to an example embodiment. Referring to fig. 5, the apparatus includes a preprocessing unit 510, an encoding unit 520, an obtaining unit 530, and an adjusting unit 540.

A preprocessing unit 510 configured to perform preprocessing on the first original video according to the first preprocessing parameter to obtain a preprocessed video;

an encoding unit 520 configured to perform encoding on the preprocessed video to obtain an encoded video;

an obtaining unit 530 configured to obtain a compression distortion parameter from the encoded video, the compression distortion parameter being used to indicate a degree of degradation of video quality of the encoded video compared with the first original video;

an adjusting unit 540 configured to perform an adjustment on the first preprocessing parameter according to the compression distortion parameter, so as to obtain a second preprocessing parameter;

the preprocessing unit 510 is further configured to perform preprocessing on the second original video according to the second preprocessing parameter.

Optionally, the first pre-processing parameter includes a first filter coefficient, and the pre-processing unit 510 is configured to perform pre-processing on the first original video through a filter having the first filter coefficient.

Optionally, the first preprocessing parameter includes a first network parameter, and the preprocessing unit 510 is configured to perform preprocessing on the first original video through a neural network having the first network parameter.

Optionally, the encoding unit 520 is configured to perform inputting the preprocessed video into a video encoder, and encoding the preprocessed video by the video encoder.

Optionally, the encoding unit 520 is configured to perform inputting the preprocessed video into a neural network, and encoding the preprocessed video through the neural network, where the neural network is configured to output a video corresponding to a preset code rate or a preset quantization parameter.

Optionally, the adjusting unit 540 is configured to determine, as the second preprocessing parameter, the first preprocessing parameter that makes the value of the compression distortion parameter the minimum value.

Optionally, the second pre-processing parameter includes a second filter coefficient, and the pre-processing unit 510 is configured to perform pre-processing on the second original video through a filter having the second filter coefficient.

Optionally, the second pre-processing parameters include second network parameters, and the pre-processing unit 510 is configured to perform pre-processing on the second original video through a neural network with the second network parameters.

With regard to the apparatus in the above-described embodiment, the specific manner in which each module performs the operation has been described in detail in the embodiment related to the method, and will not be elaborated here.

The electronic device in the above method embodiments may be implemented as a terminal or a server. For example, fig. 6 shows a block diagram of a terminal 600 provided in an exemplary embodiment of the present application. The terminal 600 may be a smart phone, a tablet computer, an MP3 (Moving Picture Experts Group Audio Layer III) player, an MP4 (Moving Picture Experts Group Audio Layer IV) player, a notebook computer, or a desktop computer. The terminal 600 may also be referred to by other names such as user equipment, portable terminal, laptop terminal, or desktop terminal.

In general, the terminal 600 includes: one or more processors 601 and one or more memories 602.

The processor 601 may include one or more processing cores, such as a 4-core or an 8-core processor. The processor 601 may be implemented in at least one of the hardware forms DSP (Digital Signal Processor), FPGA (Field-Programmable Gate Array), or PLA (Programmable Logic Array). The processor 601 may also include a main processor and a coprocessor: the main processor processes data in the awake state and is also called a Central Processing Unit (CPU); the coprocessor is a low-power processor that processes data in the standby state. In some embodiments, the processor 601 may integrate a GPU (Graphics Processing Unit), which is responsible for rendering and drawing the content the display screen needs to display. In some embodiments, the processor 601 may also include an AI (Artificial Intelligence) processor for handling machine learning computations.

The memory 602 may include one or more computer-readable storage media, which may be non-transitory. The memory 602 may also include high-speed random access memory, as well as non-volatile memory, such as one or more magnetic disk storage devices, flash memory storage devices. In some embodiments, a non-transitory computer readable storage medium in memory 602 is used to store at least one instruction for execution by processor 601 to implement the video processing method provided by the method embodiments herein.

In some embodiments, the terminal 600 may further optionally include: a peripheral interface 603 and at least one peripheral. The processor 601, memory 602, and peripheral interface 603 may be connected by buses or signal lines. Various peripheral devices may be connected to the peripheral interface 603 via a bus, signal line, or circuit board. Specifically, the peripheral device includes: at least one of a radio frequency circuit 604, a display 605, a camera assembly 606, an audio circuit 607, a positioning component 608, and a power supply 609.

The peripheral interface 603 may be used to connect at least one peripheral related to I/O (Input/Output) to the processor 601 and the memory 602. In some embodiments, the processor 601, memory 602, and peripheral interface 603 are integrated on the same chip or circuit board; in some other embodiments, any one or two of the processor 601, the memory 602, and the peripheral interface 603 may be implemented on a separate chip or circuit board, which is not limited in this embodiment.

The radio frequency circuit 604 is used for receiving and transmitting RF (Radio Frequency) signals, also called electromagnetic signals. The radio frequency circuit 604 communicates with communication networks and other communication devices via electromagnetic signals, converting electrical signals into electromagnetic signals for transmission and received electromagnetic signals back into electrical signals. Optionally, the radio frequency circuit 604 comprises an antenna system, an RF transceiver, one or more amplifiers, a tuner, an oscillator, a digital signal processor, a codec chipset, a subscriber identity module card, and so forth. The radio frequency circuit 604 may communicate with other terminals via at least one wireless communication protocol, including but not limited to: the World Wide Web, metropolitan area networks, intranets, successive generations of mobile communication networks (2G, 3G, 4G, and 5G), wireless local area networks, and/or WiFi (Wireless Fidelity) networks. In some embodiments, the radio frequency circuit 604 may further include NFC (Near Field Communication) related circuits, which is not limited in this application.

The display 605 is used to display a UI (User Interface), which may include graphics, text, icons, video, and any combination thereof. When the display screen 605 is a touch display screen, it can also capture touch signals on or over its surface; such a touch signal may be input to the processor 601 as a control signal for processing. The display 605 may then also provide virtual buttons and/or a virtual keyboard, also referred to as soft buttons and/or a soft keyboard. In some embodiments, there is one display 605, on the front panel of the terminal 600; in other embodiments, there are at least two displays 605, disposed on different surfaces of the terminal 600 or in a folded design; in still other embodiments, the display 605 is a flexible display disposed on a curved or folded surface of the terminal 600. The display 605 may even be arranged in a non-rectangular irregular pattern, i.e., a shaped screen. The display 605 may use an LCD (Liquid Crystal Display) panel, an OLED (Organic Light-Emitting Diode) panel, and the like.

The camera assembly 606 is used to capture images or video. Optionally, the camera assembly 606 includes a front camera and a rear camera. Generally, the front camera is disposed on the front panel of the terminal and the rear camera on its rear surface. In some embodiments, there are at least two rear cameras, each being one of a main camera, a depth-of-field camera, a wide-angle camera, or a telephoto camera, so that the main camera and the depth-of-field camera can be fused to realize background blurring, and the main camera and the wide-angle camera can be fused to realize panoramic shooting, VR (Virtual Reality) shooting, or other fused shooting functions. In some embodiments, the camera assembly 606 may also include a flash, which may be a single-color-temperature or dual-color-temperature flash. A dual-color-temperature flash combines a warm-light flash with a cold-light flash and can be used for light compensation at different color temperatures.

The audio circuit 607 may include a microphone and a speaker. The microphone is used to collect sound waves from the user and the environment, convert them into electrical signals, and input the electrical signals to the processor 601 for processing or to the radio frequency circuit 604 to realize voice communication. For stereo sound collection or noise reduction, a plurality of microphones may be provided at different portions of the terminal 600. The microphone may also be an array microphone or an omnidirectional pickup microphone. The speaker is used to convert electrical signals from the processor 601 or the radio frequency circuit 604 into sound waves. The speaker may be a conventional diaphragm speaker or a piezoelectric ceramic speaker. When the speaker is a piezoelectric ceramic speaker, it can convert an electrical signal not only into sound waves audible to humans, but also into sound waves inaudible to humans for purposes such as distance measurement.In some embodiments, the audio circuit 607 may also include a headphone jack.
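As an illustration of the distance measurement mentioned above, the following minimal Kotlin sketch converts the round-trip time of an inaudible pulse emitted by the speaker into a distance. The speed of sound and the sample value are assumptions for illustration, not values from the disclosure; in practice the echo time would come from the terminal's audio pipeline.

    // Assumed speed of sound in air at roughly 20 °C.
    const val SPEED_OF_SOUND_M_PER_S = 343.0

    // Distance is half the round trip: the pulse travels out and back.
    fun distanceFromEchoMetres(roundTripSeconds: Double): Double =
        SPEED_OF_SOUND_M_PER_S * roundTripSeconds / 2.0

    fun main() {
        // A 5.8 ms round trip corresponds to roughly one metre.
        println("%.2f m".format(distanceFromEchoMetres(0.0058)))
    }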

The positioning component 608 is used to locate the current geographic location of the terminal 600 to implement navigation or LBS (Location Based Service). The positioning component 608 may be a positioning component based on the GPS (Global Positioning System) of the United States, the BeiDou system of China, the GLONASS system of Russia, or the Galileo system of the European Union.

The power supply 609 is used to supply power to the various components in the terminal 600. The power supply 609 may be an alternating-current supply, a direct-current supply, a disposable battery, or a rechargeable battery. When the power supply 609 includes a rechargeable battery, the rechargeable battery may be a wired rechargeable battery or a wireless rechargeable battery: the former is charged through a wired line, and the latter through a wireless coil. The rechargeable battery may also be used to support fast-charge technology.

In some embodiments, the terminal 600 also includes one or more sensors 610. The one or more sensors 610 include, but are not limited to: an acceleration sensor 611, a gyro sensor 612, a pressure sensor 613, a fingerprint sensor 614, an optical sensor 615, and a proximity sensor 616.

The acceleration sensor 611 may detect the magnitude of acceleration along the three coordinate axes of the coordinate system established with the terminal 600. For example, the acceleration sensor 611 may be used to detect the components of gravitational acceleration along the three coordinate axes. The processor 601 may control the display screen 605 to display the user interface in a landscape view or a portrait view according to the gravitational acceleration signal collected by the acceleration sensor 611. The acceleration sensor 611 may also be used to collect motion data for games or for the user.
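To make the landscape/portrait decision concrete, the following minimal Kotlin sketch picks an orientation from the gravity components along the terminal's x and y axes. The axis convention and the comparison rule are assumptions for illustration, not part of the disclosure.

    import kotlin.math.abs

    enum class Orientation { PORTRAIT, LANDSCAPE }

    // If gravity lies mostly along the terminal's y axis, the device is
    // upright (portrait); otherwise it is on its side (landscape).
    fun orientationFromGravity(gx: Double, gy: Double): Orientation =
        if (abs(gy) >= abs(gx)) Orientation.PORTRAIT else Orientation.LANDSCAPE

    fun main() {
        println(orientationFromGravity(gx = 0.3, gy = 9.7))  // PORTRAIT
        println(orientationFromGravity(gx = 9.6, gy = 0.5))  // LANDSCAPE
    }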

The gyro sensor 612 may detect the body direction and rotation angle of the terminal 600, and may cooperate with the acceleration sensor 611 to capture the user's 3D motion on the terminal 600. Based on the data collected by the gyro sensor 612, the processor 601 may implement functions such as motion sensing (for example, changing the UI according to a tilting operation by the user), image stabilization during shooting, game control, and inertial navigation.
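The rotation angle mentioned above can be obtained by integrating the gyro's angular-velocity samples over time. The following minimal Kotlin sketch shows this for a single axis; the sample rate and units (rad/s at 100 Hz) are assumptions for illustration.

    // Sum of angular velocity times time step approximates the rotation angle.
    fun integratedAngleRadians(angularVelocitiesRadPerS: List<Double>, dtSeconds: Double): Double =
        angularVelocitiesRadPerS.sumOf { it * dtSeconds }

    fun main() {
        // Ten samples of 0.5 rad/s at 100 Hz -> 0.05 rad of rotation.
        val samples = List(10) { 0.5 }
        println(integratedAngleRadians(samples, dtSeconds = 0.01))
    }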

The pressure sensor 613 may be disposed on the side frame of the terminal 600 and/or beneath the display screen 605. When the pressure sensor 613 is disposed on the side frame of the terminal 600, it can detect the user's holding signal on the terminal 600, and the processor 601 performs left/right-hand recognition or shortcut operations according to the holding signal collected by the pressure sensor 613. When the pressure sensor 613 is disposed beneath the display screen 605, the processor 601 controls operability controls on the UI according to the user's pressure operation on the display screen 605. The operability controls include at least one of a button control, a scroll-bar control, an icon control, and a menu control.
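One simple way to realize the left/right-hand recognition described above is to compare pressure readings on the two side frames. This minimal Kotlin sketch assumes one aggregate reading per side and an arbitrary decision margin; the sensor layout, units, and threshold are all illustrative assumptions, not part of the disclosure.

    enum class Hand { LEFT, RIGHT, UNKNOWN }

    // Whichever side frame is pressed noticeably harder is taken as the
    // gripping side; near-equal readings yield no decision.
    fun recognizeHoldingHand(leftPressure: Double, rightPressure: Double, margin: Double = 0.2): Hand =
        when {
            leftPressure > rightPressure * (1 + margin) -> Hand.LEFT
            rightPressure > leftPressure * (1 + margin) -> Hand.RIGHT
            else -> Hand.UNKNOWN
        }

    fun main() {
        println(recognizeHoldingHand(leftPressure = 4.0, rightPressure = 1.5))  // LEFT
        println(recognizeHoldingHand(leftPressure = 2.0, rightPressure = 2.1))  // UNKNOWN
    }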

The fingerprint sensor 614 is used to collect the user's fingerprint, and either the processor 601 identifies the user's identity according to the fingerprint collected by the fingerprint sensor 614, or the fingerprint sensor 614 itself identifies the user's identity from the collected fingerprint. Upon identifying the user's identity as trusted, the processor 601 authorizes the user to perform relevant sensitive operations, including unlocking the screen, viewing encrypted information, downloading software, making payments, changing settings, and the like. The fingerprint sensor 614 may be disposed on the front, back, or side of the terminal 600. When a physical button or a manufacturer logo is provided on the terminal 600, the fingerprint sensor 614 may be integrated with the physical button or the manufacturer logo.

The optical sensor 615 is used to collect the ambient light intensity. In one embodiment, the processor 601 may control the display brightness of the display screen 605 according to the ambient light intensity collected by the optical sensor 615: when the ambient light intensity is high, the display brightness of the display screen 605 is increased; when the ambient light intensity is low, the display brightness of the display screen 605 is decreased. In another embodiment, the processor 601 may also dynamically adjust the shooting parameters of the camera assembly 606 according to the ambient light intensity collected by the optical sensor 615.
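The brightness policy just described can be expressed as a simple mapping from ambient light to a normalized brightness level. In this minimal Kotlin sketch, the lux range and the linear mapping are assumptions for illustration, not values from the disclosure.

    // Maps ambient light (lux) to a brightness level in [0.0, 1.0].
    fun displayBrightness(ambientLux: Double, minLux: Double = 10.0, maxLux: Double = 1000.0): Double {
        val clamped = ambientLux.coerceIn(minLux, maxLux)
        return (clamped - minLux) / (maxLux - minLux)
    }

    fun main() {
        println(displayBrightness(5.0))     // dark room  -> 0.0
        println(displayBrightness(500.0))   // indoors    -> ~0.49
        println(displayBrightness(2000.0))  // direct sun -> 1.0
    }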

The proximity sensor 616, also known as a distance sensor, is typically disposed on the front panel of the terminal 600. The proximity sensor 616 is used to collect the distance between the user and the front surface of the terminal 600. In one embodiment, when the proximity sensor 616 detects that the distance between the user and the front surface of the terminal 600 gradually decreases, the processor 601 controls the display screen 605 to switch from the screen-on state to the screen-off state; when the proximity sensor 616 detects that the distance between the user and the front surface of the terminal 600 gradually increases, the processor 601 controls the display screen 605 to switch from the screen-off state to the screen-on state.
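The on/off switching described above is commonly implemented with two thresholds so the screen does not flicker near a single boundary. This minimal Kotlin sketch uses assumed threshold values; the hysteresis scheme is a common technique, not something specified by the disclosure.

    class ProximityScreenController(
        private val nearThresholdCm: Double = 3.0,
        private val farThresholdCm: Double = 5.0,
    ) {
        var screenOn: Boolean = true
            private set

        // Turn off below the near threshold, back on above the far one;
        // the gap between the two thresholds prevents rapid toggling.
        fun onDistanceSample(distanceCm: Double) {
            if (screenOn && distanceCm < nearThresholdCm) screenOn = false
            else if (!screenOn && distanceCm > farThresholdCm) screenOn = true
        }
    }

    fun main() {
        val controller = ProximityScreenController()
        for (d in listOf(10.0, 4.0, 2.0, 4.0, 6.0)) {
            controller.onDistanceSample(d)
            println("distance=$d cm -> screenOn=${controller.screenOn}")
        }
    }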

Those skilled in the art will appreciate that the structure shown in FIG. 6 does not constitute a limitation on the terminal 600, and that the terminal may include more or fewer components than shown, combine some components, or adopt a different component arrangement.

The electronic device in the foregoing method embodiments may also be implemented as a server. For example, FIG. 7 is a schematic structural diagram of a server provided by the present disclosure. The server 700 may vary considerably in configuration or performance, and may include one or more processors (CPUs) 701 and one or more memories 702, where the memory 702 stores at least one instruction that is loaded and executed by the processor 701 to implement the video processing method provided by the foregoing method embodiments. Of course, the server may also have components such as a wired or wireless network interface and an input/output interface for input and output, and may further include other components for implementing device functions, which are not described here again.

In an exemplary embodiment, a storage medium including instructions is also provided, for example, a memory including instructions, where the instructions are executable by a processor of an electronic device to perform the video processing method described above. Optionally, the storage medium may be a non-transitory computer-readable storage medium, for example, a Read-Only Memory (ROM), a Random Access Memory (RAM), a Compact Disc Read-Only Memory (CD-ROM), a magnetic tape, a floppy disk, an optical data storage device, or the like.

The user information involved in the present disclosure may be information authorized by the user or fully authorized by all parties concerned.

Other embodiments of the disclosure will be apparent to those skilled in the art from consideration of the specification and practice of the disclosure disclosed herein. This disclosure is intended to cover any variations, uses, or adaptations of the disclosure following, in general, the principles of the disclosure and including such departures from the present disclosure as come within known or customary practice within the art to which the disclosure pertains. It is intended that the specification and examples be considered as exemplary only, with a true scope and spirit of the disclosure being indicated by the following claims.

It will be understood that the present disclosure is not limited to the precise arrangements described above and shown in the drawings and that various modifications and changes may be made without departing from the scope thereof. The scope of the present disclosure is limited only by the appended claims.
