Video processing method and device, electronic equipment and storage medium

Document No.: 1908492    Publication date: 2021-11-30

Reading note: This technology, "Video processing method and device, electronic equipment and storage medium" (一种视频处理方法、装置、电子设备及存储介质), was designed and created by Li Zhao (李钊) on 2021-07-28. Its main content is as follows: The present disclosure relates to a video processing method, an apparatus, an electronic device, and a storage medium. The method includes: obtaining target music and video clip information, where the video clip information includes the clip durations of the original video clips in a video clip set; determining cropping information of the original video clips according to the beat information of the target music and the clip durations of the original video clips; and displaying a target video on a video processing interface, where the target video is obtained by cropping and splicing the original video clips based on the cropping information, the target video has at least one video splicing point corresponding to a beat point in the beat information, and the beat information of the target music is the original beat information of the target music. Because the cropping information of the original video clips is determined from the beat information of the target music and the clip durations of the original video clips, the approach applies to every piece of music, adapts to the duration of the original video content, reduces labor cost, and improves the flexibility of producing the target video.

1. A video processing method, comprising:

acquiring target music and video clip information; the video clip information comprises clip duration of original video clips in the video clip set;

determining cutting information of the original video clip according to the beat information of the target music and the clip duration of the original video clip;

displaying the target video on the video processing interface; the target video is obtained by cutting and splicing the original video clips based on the cutting information; the target video has at least one video splicing point corresponding to a beat point in the beat information, wherein the beat information of the target music is original beat information of the target music.

2. The video processing method of claim 1, wherein the method further comprises:

performing beat analysis on the target music to obtain beat information of the target music;

or;

and acquiring the beat information of the target music from the beat information storage area according to the identification information of the target music.

3. The video processing method of claim 2, wherein the method further comprises:

sending a music acquisition request to a server, wherein the music acquisition request comprises identification information of the target music;

receiving beat information of the target music sent by the server; the beat information of the target music is obtained by performing beat analysis processing on the target music by the server, and the beat information of the target music is stored in the beat information storage area of the server.

4. The video processing method according to claim 1, wherein said determining the cropping information of the original video segment according to the tempo information of the target music and the segment duration of the original video segment comprises:

for each original video clip, determining a clipping range of the original video clip based on the clip duration of the original video clip;

acquiring all video frame positions in the cutting range of the original video clip;

determining the cutting information of the original video clip according to the positions of all the video frames and the beat points;

and cutting the original video clip according to the cutting information.

5. The video processing method according to claim 39, wherein said determining clipping information of the original video segment according to all the video frame positions and the beat points, and clipping the original video segment according to the clipping information comprises:

if the positions of a plurality of video frames corresponding to the beat points exist in the cutting range, performing quality analysis on the video frames in the cutting range to determine a target video frame;

determining cropping information of the original video clip based on the position of the target video frame;

and cutting the original video clip according to the cutting information.

6. The video processing method of claim 4, wherein the method further comprises:

the clipping range of the original video clip is determined based on a key clip of the original video clip, wherein the key clip is determined based on the highlight value of a video frame;

and/or;

and the cutting range of the original video clip is determined based on cutting indication information, and the cutting indication information is generated according to the acquired user setting information.

7. A video processing apparatus, comprising:

an acquisition module configured to perform acquisition of target music and video clip information; the video clip information comprises clip duration of original video clips in the video clip set;

the cutting information determining module is configured to determine the cutting information of the original video segment according to the beat information of the target music and the segment duration of the original video segment;

a display module configured to perform displaying a target video on a video processing interface; the target video is obtained by cutting and splicing the original video clips based on the cutting information; the target video has at least one video splicing point corresponding to a beat point in the beat information, wherein the beat information of the target music is original beat information of the target music.

8. An electronic device, comprising:

a processor;

a memory for storing the processor-executable instructions;

wherein the processor is configured to execute the instructions to implement the video processing method of any of claims 1 to 6.

9. A computer-readable storage medium, wherein instructions in the computer-readable storage medium, when executed by a processor of an electronic device, enable the electronic device to perform the video processing method of any of claims 1 to 6.

10. A computer program product, characterized in that the computer program product comprises a computer program, which is stored in a readable storage medium, from which at least one processor of a computer device reads and executes the computer program, causing the device to perform the video processing method according to any one of claims 1 to 6.

Technical Field

The present disclosure relates to the field of internet technologies, and in particular, to a video processing method and apparatus, an electronic device, and a storage medium.

Background

The development of network technology makes video applications very popular in people's daily life. The video interaction software provides diversified operation experience for users, and the users can shoot videos with different styles at any time and any place, add various special effects and set different types of background music.

At present, when a user shoots a video with such software, the beat information has to be judged manually, and the segment durations, the time points at which special effects appear, and the like have to be designed by hand; a beat-synced effect is finally achieved by means of a video template. However, this approach only supports some pieces of music, cannot adapt to the duration of the original video content and the like, the labor cost is high, and flexibility is lacking.

Disclosure of Invention

The disclosure provides a video processing method, a video processing device, an electronic device and a storage medium. The technical scheme of the disclosure is as follows:

according to a first aspect of the embodiments of the present disclosure, there is provided a video processing method, including:

acquiring target music and video clip information; the video clip information comprises the clip duration of the original video clip in the video clip set;

determining cutting information of the original video clip according to the beat information of the target music and the clip duration of the original video clip;

displaying the target video on the video processing interface; the target video is obtained by cutting and splicing the original video segments based on the cutting information; the target video has at least one video splicing point corresponding to the beat point in the beat information;

wherein, the beat information of the target music is the original beat information of the target music.

In some possible embodiments, the method further comprises:

performing beat analysis on the target music to obtain beat information of the target music;

or;

and acquiring the beat information of the target music from the beat information storage area according to the identification information of the target music.

In some possible embodiments, the method further comprises:

sending a music acquisition request to a server, wherein the music acquisition request comprises identification information of target music;

receiving beat information of target music sent by a server; the beat information of the target music is obtained by performing beat analysis processing on the target music by the server, and the beat information of the target music is stored in a beat information storage area of the server.

In some possible embodiments, determining the cropping information of the original video segment according to the beat information of the target music and the segment duration of the original video segment includes:

for each original video clip, determining the clipping range of the original video clip based on the clip duration of the original video clip;

acquiring all video frame positions in the cutting range of the original video clip;

determining cutting information of the original video clip according to all video frame positions and beat points;

and cutting the original video clip according to the cutting information.

In some possible embodiments, determining cropping information of the original video segment according to all video frame positions and the beat points, and cropping the original video segment according to the cropping information includes:

if the positions of a plurality of video frames corresponding to the beat points exist in the cutting range, performing quality analysis on the video frames in the cutting range to determine a target video frame;

determining clipping information of an original video clip based on the position of the target video frame;

and cutting the original video clip according to the cutting information.

In some possible embodiments, the method further comprises:

the clipping range of the original video clip is determined based on a key clip of the original video clip, and the key clip is determined based on the highlight value of the video frame;

and/or;

the clipping range of the original video clip is determined based on clipping instruction information generated from the acquired user setting information.

In some possible embodiments, the method further comprises:

determining a first target video clip from the original video clip; the segment duration of the first target video segment is less than or equal to the tailorable duration threshold.

In some possible embodiments, the method further comprises:

determining the cuttable duration threshold according to the video segment durations in target historical videos that meet the requirement, and/or determining the cuttable duration threshold according to the segment durations of the original video segments.

In some possible embodiments, determining the cuttable duration threshold based on the video segment durations in the target historical videos that meet the requirement includes:

acquiring a historical video set;

determining a target historical video meeting the requirement from the historical video set according to the video attribute information, wherein the video attribute information comprises at least one of the number of shares, the number of views, the number of likes, the number of rewards, the number of new followers and the number of comments;

and analyzing the video segment time length in the target historical video to obtain a cuttable time length threshold value.

In some possible embodiments, if there is a first target video segment in the original video segment, determining the cropping information of the original video segment according to the beat information of the target music and the segment duration of the original video segment includes:

determining a second target video segment in the original video segments based on the tailorable time length threshold;

and determining the cutting information of the second target video segment according to the beat information of the target music, the segment duration of the first target video segment and the segment duration of the second target video segment.

In some possible embodiments, determining the cropping information of the second target video segment according to the tempo information of the target music, the segment duration of the first target video segment, and the segment duration of the second target video segment includes:

splicing the first target video clip and the second target video clip based on the splicing number of the first target video clip and the splicing number of the second target video clip to obtain a video to be processed; the duration of the video to be processed is the sum of the segment duration of the first target video segment and the segment duration of the second target video segment;

and sequentially determining the cutting information of the second target video clip based on the beat points in the beat information and the preset processing direction.

In some possible embodiments, the preset processing direction includes a forward direction of the splicing direction, and sequentially determining the cropping information of the second target video segment based on the beat points in the beat information and the preset processing direction includes:

determining a first second target video clip in the video to be processed according to the forward direction of the splicing direction;

if the ending point of the first second target video segment does not have the corresponding beat point, determining a first beat point of the first second target video segment according to the reverse direction of the splicing direction; the first beat point is located between two consecutive video frames;

and determining the cutting information of the first second target video segment based on the time point corresponding to the first beat point and the time point corresponding to the end point.

In some possible embodiments, the cropping information includes a cropping duration, and determining the cropping information of the first second target video segment based on the time point corresponding to the first beat point and the time point corresponding to the end point includes:

determining the cutting duration to be cut from the first second target video segment according to the difference between the time point corresponding to the first beat point and the time point corresponding to the ending point;

cutting the first second target video segment according to a preset cutting mode and the cutting duration;

the preset cutting mode comprises a mode of cutting from the head of the segment, a mode of cutting from the tail of the segment, a mode of cutting from the middle of the segment and a mode of cutting according to the content quality of the segment.

In some possible embodiments, if the preset clipping manner is a clipping manner according to the content quality of the segment, clipping the first second target video segment according to the preset clipping manner and the clipping duration includes:

dividing the first second target video clip into a plurality of sub-clips; the total added time length of the plurality of sub-segments is equal to the segment time length of the first second target video segment;

evaluating the content quality of each sub-segment in the plurality of sub-segments according to the quality evaluation parameters to obtain the content quality value of each sub-segment;

cutting out sub-segments which do not meet the quality requirement from the plurality of sub-segments according to the content quality value and the cutting time length of each sub-segment;

the quality assessment parameters include color saturation, sharpness, richness of content, and brightness.

In some possible embodiments, if the preset clipping manner is a clipping manner according to the content quality of the segment, clipping the first second target video segment according to the preset clipping manner and the clipping duration includes:

dividing the first second target video clip into a plurality of sub-clips; the total duration of the plurality of sub-segments is greater than the segment duration of the first second target video segment; each sub-segment overlaps its adjacent sub-segments, i.e. shares repeated content with them;

evaluating the content quality of each sub-segment in the plurality of sub-segments according to the quality evaluation parameters to obtain the content quality value of each sub-segment;

cutting out sub-segments which do not meet the quality requirement from the plurality of sub-segments according to the content quality value and the cutting time length of each sub-segment;

the quality assessment parameters include color saturation, sharpness, richness, brightness, and/or content coherence.

In some possible embodiments, displaying the target video on the video processing interface includes:

performing a first splicing adjustment on the video to be processed based on the cut first second target video segment;

determining the second second target video segment based on the video to be processed after the first splicing adjustment, and cutting the second second target video segment in the cutting mode of the first second target video segment;

performing a second splicing adjustment on the video to be processed according to the cut second second target video segment, and so on until the cutting of the last second target video segment is completed;

and displaying a target video on the video processing interface, wherein the target video is obtained by cutting and splicing the video to be processed for multiple times.

In some possible embodiments, the beat information further includes a beat speed, and the method further includes:

determining transition effect information corresponding to the beat speed;

and adding transition effect information at the video splicing point corresponding to the beat point.

In some possible embodiments, the method further comprises:

determining an accent (strong-beat) degree value corresponding to each beat point;

determining key beat points whose accent degree value meets a preset degree value;

and adding accent effect information at the video segment splicing position corresponding to the key beat point.

In some possible embodiments,

the duration of the target video is the same as that of the target music;

or;

the time length of the target video is the same as that of the clipped target music.

According to a second aspect of the embodiments of the present disclosure, there is provided a video processing apparatus including:

an acquisition module configured to perform acquisition of target music and video clip information; the video clip information comprises the clip duration of the original video clip in the video clip set;

the cutting information determining module is configured to determine the cutting information of the original video clip according to the beat information of the target music and the clip duration of the original video clip;

a display module configured to perform displaying a target video on a video processing interface; the target video is obtained by cutting and splicing the original video segments based on the cutting information; the target video has at least one video splicing point corresponding to the beat point in the beat information; wherein, the beat information of the target music is the original beat information of the target music.

In some possible embodiments, the apparatus further comprises a beat information acquisition module configured to perform:

performing beat analysis on the target music to obtain beat information of the target music;

or;

and acquiring the beat information of the target music from the beat information storage area according to the identification information of the target music.

In some possible embodiments, the beat information obtaining module is configured to perform:

sending a music acquisition request to a server, wherein the music acquisition request comprises identification information of target music;

receiving beat information of target music sent by a server; the beat information of the target music is obtained by performing beat analysis processing on the target music by the server, and the beat information of the target music is stored in a beat information storage area of the server.

In some possible embodiments, the cropping information determination module is configured to perform:

for each original video clip, determining the clipping range of the original video clip based on the clip duration of the original video clip;

acquiring all video frame positions in the cutting range of the original video clip;

determining cutting information of the original video clip according to all video frame positions and beat points;

and cutting the original video clip according to the cutting information.

In some possible embodiments, the cropping information determination module is configured to perform:

if the positions of a plurality of video frames corresponding to the beat points exist in the cutting range, performing quality analysis on the video frames in the cutting range to determine a target video frame;

determining clipping information of an original video clip based on the position of the target video frame;

and cutting the original video clip according to the cutting information.

In some possible embodiments, the apparatus further comprises:

the clipping range of the original video clip is determined based on a key clip of the original video clip, and the key clip is determined based on the highlight value of the video frame;

and/or;

the clipping range of the original video clip is determined based on clipping instruction information generated from the acquired user setting information.

In some possible embodiments, the apparatus further comprises a target video segment determination module configured to perform:

determining a first target video clip from the original video clip; the segment duration of the first target video segment is less than or equal to the tailorable duration threshold.

In some possible embodiments, the apparatus further comprises a tailorable duration threshold determination module configured to perform:

determining the cuttable duration threshold according to the video segment durations in target historical videos that meet the requirement, and/or determining the cuttable duration threshold according to the segment durations of the original video segments.

In some possible embodiments, the tailorable duration threshold determination module is configured to perform:

acquiring a historical video set;

determining a target historical video meeting the requirement from the historical video set according to the video attribute information, wherein the video attribute information comprises at least one of the number of shares, the number of views, the number of likes, the number of rewards, the number of new followers and the number of comments;

and analyzing the video segment time length in the target historical video to obtain a cuttable time length threshold value.

In some possible embodiments, if there is a first target video segment in the original video segment, the cropping information determining module is configured to perform:

determining a second target video segment in the original video segments based on the tailorable time length threshold;

and determining the cutting information of the second target video segment according to the beat information of the target music, the segment duration of the first target video segment and the segment duration of the second target video segment.

In some possible embodiments, the cropping information determination module is configured to perform:

splicing the first target video clip and the second target video clip based on the splicing number of the first target video clip and the splicing number of the second target video clip to obtain a video to be processed; the duration of the video to be processed is the sum of the segment duration of the first target video segment and the segment duration of the second target video segment;

and sequentially determining the cutting information of the second target video clip based on the beat points in the beat information and the preset processing direction.

In some possible embodiments, the preset processing direction includes a forward direction of the splicing direction, and the cropping information determining module is configured to perform:

determining a first second target video clip in the video to be processed according to the forward direction of the splicing direction;

if the ending point of the first second target video segment does not have the corresponding beat point, determining a first beat point of the first second target video segment according to the reverse direction of the splicing direction; the first beat point is located between two consecutive video frames;

and determining the cutting information of the first second target video segment based on the time point corresponding to the first beat point and the time point corresponding to the end point.

In some possible embodiments, the cropping information includes a cropping duration, and the cropping information determination module is configured to perform:

determining the cutting duration to be cut from the first second target video segment according to the difference between the time point corresponding to the first beat point and the time point corresponding to the ending point;

cutting the first second target video segment according to a preset cutting mode and the cutting duration;

the preset cutting mode comprises a mode of cutting from the head of the segment, a mode of cutting from the tail of the segment, a mode of cutting from the middle of the segment and a mode of cutting according to the content quality of the segment.

In some possible embodiments, if the preset clipping manner is a manner of clipping according to content quality of the segment, the clipping information determining module is configured to perform:

dividing the first second target video clip into a plurality of sub-clips; the total added time length of the plurality of sub-segments is equal to the segment time length of the first second target video segment;

evaluating the content quality of each sub-segment in the plurality of sub-segments according to the quality evaluation parameters to obtain the content quality value of each sub-segment;

cutting out sub-segments which do not meet the quality requirement from the plurality of sub-segments according to the content quality value and the cutting time length of each sub-segment;

the quality assessment parameters include color saturation, sharpness, richness of content, and brightness.

In some possible embodiments, if the preset clipping manner is a manner of clipping according to content quality of the segment, the clipping information determining module is configured to perform:

dividing the first second target video clip into a plurality of sub-clips; the total duration of the plurality of sub-segments is greater than the segment duration of the first second target video segment; each sub-segment overlaps its adjacent sub-segments, i.e. shares repeated content with them;

evaluating the content quality of each sub-segment in the plurality of sub-segments according to the quality evaluation parameters to obtain the content quality value of each sub-segment;

cutting out sub-segments which do not meet the quality requirement from the plurality of sub-segments according to the content quality value and the cutting time length of each sub-segment;

the quality assessment parameters include color saturation, sharpness, richness, brightness, and/or content coherence.

In some possible embodiments, the display module is configured to perform:

performing a first splicing adjustment on the video to be processed based on the cut first second target video segment;

determining the second second target video segment based on the video to be processed after the first splicing adjustment, and cutting the second second target video segment in the cutting mode of the first second target video segment;

performing a second splicing adjustment on the video to be processed according to the cut second second target video segment, and so on until the cutting of the last second target video segment is completed;

and displaying a target video on the video processing interface, wherein the target video is obtained by cutting and splicing the video to be processed for multiple times.

In some possible embodiments, the beat information further includes a beat speed, and the apparatus further includes a transition effect information adding module configured to perform:

determining transition effect information corresponding to the beat speed;

and adding transition effect information at the video splicing point corresponding to the beat point.

In some possible embodiments, the apparatus further comprises an accent effect information adding module configured to perform:

determining an accent (strong-beat) degree value corresponding to each beat point;

determining key beat points whose accent degree value meets a preset degree value;

and adding accent effect information at the video segment splicing position corresponding to the key beat point.

In some possible embodiments,

the duration of the target video is the same as that of the target music;

or;

the time length of the target video is the same as that of the clipped target music.

According to a third aspect of the embodiments of the present disclosure, there is provided an electronic apparatus including: a processor; a memory for storing processor-executable instructions; wherein the processor is configured to execute the instructions to implement the method of any one of the first aspect as described above.

According to a fourth aspect of the embodiments of the present disclosure, there is provided a storage medium, wherein instructions that, when executed by a processor of an electronic device, enable the electronic device to perform any one of the methods of the first aspect of the embodiments of the present disclosure.

According to a fifth aspect of embodiments of the present disclosure, there is provided a computer program product comprising a computer program, the computer program being stored in a readable storage medium, from which at least one processor of a computer device reads and executes the computer program, causing the computer to perform any one of the methods of the first aspect of the embodiments of the present disclosure.

The technical scheme provided by the embodiment of the disclosure at least brings the following beneficial effects:

target music and video clip information are obtained, where the video clip information includes the clip durations of the original video clips in a video clip set; cropping information of the original video clips is determined according to the beat information of the target music and the clip durations of the original video clips; and a target video is displayed on a video processing interface, where the target video is obtained by cropping and splicing the original video clips based on the cropping information, at least one video splicing point in the target video corresponds to a beat point in the beat information, and the beat information of the target music is the original beat information of the target music. Because the cropping information of the original video clips is determined from the beat information of the target music and the clip durations of the original video clips, the approach applies to every piece of music, adapts to the duration of the original video content, reduces labor cost, and improves the flexibility of producing the target video.

It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory only and are not restrictive of the disclosure.

Drawings

The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the present disclosure and, together with the description, serve to explain the principles of the disclosure and are not to be construed as limiting the disclosure.

FIG. 1 is a schematic diagram illustrating an application environment in accordance with an illustrative embodiment;

FIG. 2 is a flow diagram illustrating a video processing method according to an exemplary embodiment;

fig. 3 is a flow diagram illustrating a method of determining beat information according to an example embodiment;

FIG. 4 is a flow diagram illustrating a determination of cropping information in accordance with an exemplary embodiment;

FIG. 5 is a schematic diagram illustrating a spliced video to be processed in accordance with an exemplary embodiment;

FIG. 6 is a flow diagram illustrating a determination of cropping information for a second target video segment in accordance with an exemplary embodiment;

FIG. 7 is a schematic diagram illustrating a cropped second target video segment in accordance with an exemplary embodiment;

FIG. 8 is a block diagram illustrating a video processing device according to an example embodiment;

FIG. 9 is a block diagram illustrating an electronic device for video processing in accordance with an exemplary embodiment.

Detailed Description

In order to make the technical solutions of the present disclosure better understood by those of ordinary skill in the art, the technical solutions in the embodiments of the present disclosure will be clearly and completely described below with reference to the accompanying drawings.

It should be noted that the terms "first," "second," and the like in the description and claims of the present disclosure and in the above-described drawings are used for distinguishing between similar elements and not necessarily for describing a particular sequential or chronological order. It is to be understood that the data so used is interchangeable under appropriate circumstances such that the embodiments of the disclosure described herein are capable of operation in sequences other than those illustrated or otherwise described herein. The implementations described in the exemplary embodiments below are not intended to represent all implementations consistent with the present disclosure. Rather, they are merely examples of apparatus and methods consistent with certain aspects of the present disclosure, as detailed in the appended claims.

Referring to fig. 1, fig. 1 is a schematic diagram of an application environment according to an exemplary embodiment, as shown in fig. 1, including a server 01 and a terminal device 02. Optionally, the server 01 and the terminal device 02 may be connected through a wireless link or a wired link, and the disclosure is not limited herein.

In an alternative embodiment, the server 01 may provide different music to the terminal device 02 for the user to select the target music using the terminal device. Specifically, the server 01 may be an independent physical server, a server cluster or a distributed system formed by a plurality of physical servers, or a cloud server providing basic cloud computing services such as a cloud service, a cloud database, cloud computing, a cloud function, cloud storage, a Network service, cloud communication, a middleware service, a domain name service, a security service, a CDN (Content Delivery Network), a big data and artificial intelligence platform, and the like. Optionally, the operating system running on the server 01 may include, but is not limited to, IOS, Linux, Windows, Unix, Android system, and the like.

In an optional embodiment, the terminal device 02 may obtain target music and video clip information, where the video clip information includes a clip duration of an original video clip in a video clip set, and determine clipping information of the original video clip according to beat information of the target music and the clip duration of the original video clip, so that a target video may be displayed on the video processing interface, where the target video is obtained by clipping and splicing the original video clip based on the clipping information, the target video has at least one video splicing point corresponding to a beat point in the beat information, and the video splicing point is a splicing point between target videos in the target video, where the beat information of the target music is the original beat information of the target music. Terminal device 02 may include, but is not limited to, smart phones, desktop computers, tablet computers, laptop computers, smart speakers, digital assistants, Augmented Reality (AR)/Virtual Reality (VR) devices, smart wearable devices, and like types of electronic devices. Optionally, the operating system running on the electronic device may include, but is not limited to, an android system, an IOS system, linux, windows, and the like.

In addition, it should be noted that fig. 1 shows only one application environment of the video processing method provided by the present disclosure, and in practical applications, other application environments may also be included.

Fig. 2 is a flowchart illustrating a video processing method according to an exemplary embodiment, where the video processing method is applied to an electronic device such as a server, a terminal or other nodes, as shown in fig. 2, and includes the following steps:

in step S201, target music and video clip information is acquired; the video clip information includes clip durations of original video clips in the set of video clips.

In an optional embodiment, when the terminal device opens an application based on a user's application start instruction and switches to the video processing interface, selectable music may be displayed on the display interface. Each selectable piece of music may be indicated by its name, or by its name together with its cover.

In an alternative embodiment, the terminal device may acquire the target music in response to the music selection instruction. Specifically, when the terminal device detects that a touch is made on a corresponding interface area of the target music, the target music can be acquired. Or when the audio input module of the terminal equipment receives the identification of the target music, the target music is acquired. Wherein the identification may be a musical name of the target music. Alternatively, in the case where the target music has been selected previously, the target music is stored in the local storage of the terminal apparatus, and therefore, the terminal apparatus may extract the target music from the local storage. When the target music is not selected by the user, the terminal device may establish a link with the server and download the target music from the server while storing the target music to a local storage.

Alternatively, the target music may be a complete song, or a certain segment of a song repeated several times.

In this embodiment, the video clip information includes a clip duration of each original video clip in the video clip set. Specifically, the video clip set may include a plurality of original video clips uploaded by the user, where the clip duration of each original video clip may be the same or different.

Optionally, the original video clip may be obtained by direct shooting by a user, or obtained by performing stitching processing on different pictures.

In step S203, clip information of the original video clip is determined according to the tempo information of the target music and the clip duration of the original video clip.

In the embodiment of the present application, before step S203, the tempo information of the target music may be determined. There are many ways to determine the tempo information of the target music, and two alternative embodiments are described below.

In an optional implementation manner, after the terminal device obtains the target music, beat analysis may be performed on the target music to obtain beat information of the target music. Specifically, the music beat analysis model can be directly called to perform real-time analysis on the target music, so as to obtain the beat information of the target music.

In another alternative embodiment, the target music may be processed directly by an algorithm to obtain the beat information of the target music. Fig. 3 is a flowchart illustrating a method for determining beat information according to an exemplary embodiment; as shown in fig. 3, the method includes the following steps:

in step S301, sampling quantization processing is performed on the target music, and first data is obtained.

In an alternative embodiment, the terminal device may sample the target music, for example, 1024 sampling points are obtained after each sampling, and the 1024 sampling points may be understood as data points contained in one window. Subsequently, the terminal device may perform quantization processing on the 1024 sampling points.

The sampling and quantization processing may include obtaining a new input stream, performing waveform decoding processing on the new input stream, performing floating-point sampling on the processed input stream to obtain 1024 sampling points, and performing quantization processing. Thus, the processed first data can be obtained.

In step S302, difference processing is performed on the first data to obtain first difference data.

Specifically, the terminal device may subtract the previous window data from the current window data to obtain the difference data. The difference formula may be as the following formula (1).
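Formula (1) itself does not appear in the text above. A plausible reconstruction, assuming the subtraction is performed element-wise between consecutive windows of the first data, is:

```latex
d_{n}[k] = x_{n}[k] - x_{n-1}[k], \qquad k = 0, 1, \dots, 1023
```

where x_n[k] denotes the k-th sample of the n-th window and d_n[k] is the corresponding first difference data.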

In step S303, time-frequency conversion processing is performed on the first difference data to obtain second data.

To facilitate subsequent processing, the terminal device may perform time-frequency conversion processing on the difference data to obtain second data. The time-frequency conversion processing is mainly realized by means of the Fourier transform.

In step S304, the second data is subjected to difference processing to obtain second difference data.

After the Fourier transform and the difference processing, the rhythm of the audio can largely be seen in the data.

In step S305, the second difference data is quantized to obtain data within a preset number of windows.

In an alternative embodiment, the further quantization described above may use a moving-average method. A typical audio sampling rate is 44100 Hz or 48000 Hz; 44100 Hz is used as an example in the following description.

In light of the above, the present application sets the window size to 1024 samples; therefore 1 second contains 43 whole windows, and one window represents 1000/(44100/1024) ≈ 23.22 milliseconds. When the average needs to be calculated over an interval of 0.5 seconds, about 22 windows are needed. Optionally, the average over the preceding 10 windows and the following 10 windows is calculated, so that a smoothed result over those windows is obtained.
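Restating the arithmetic above in one place (these values follow directly from the 44100 Hz sampling rate and the 1024-sample window):

```latex
\frac{44100}{1024} \approx 43.07 \ \text{windows per second}, \qquad
\frac{1024}{44100}\,\text{s} \approx 23.22\,\text{ms per window}, \qquad
\frac{500\,\text{ms}}{23.22\,\text{ms}} \approx 21.5 \approx 22 \ \text{windows}.
```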

In step S306, beat information of the target music is determined according to the data within the preset number of windows.

In an alternative embodiment, the beat information of the target music may be determined from the data within the preset number of windows by peak detection.

Thus, the tempo information of the target music can be determined through the above-described steps S301 to S306.
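The following is a minimal Python sketch of steps S301 to S306, assuming the target music has already been decoded to mono floating-point samples; the function and variable names are illustrative, and the peak-detection rule is only one possible choice:

```python
import numpy as np

def detect_beats(samples, sample_rate=44100, win=1024, smooth_windows=21):
    """Rough beat-point detection following steps S301-S306 (illustrative sketch only)."""
    # S301: split the quantized floating-point samples into fixed-size windows.
    n_win = len(samples) // win
    frames = np.asarray(samples[:n_win * win], dtype=float).reshape(n_win, win)

    # S302: first difference - subtract the previous window's data from the current window's.
    first_diff = np.diff(frames, axis=0)

    # S303: time-frequency conversion (Fourier transform) of the first difference data.
    spectra = np.abs(np.fft.rfft(first_diff, axis=1))

    # S304: second difference on the spectral data; keep only increases in energy
    # and collapse each window to a single onset-strength value.
    second_diff = np.maximum(np.diff(spectra, axis=0), 0.0).sum(axis=1)

    # S305: further quantization by a moving average over roughly the preceding
    # 10 and the following 10 windows (about 0.5 s with 1024-sample windows at 44100 Hz).
    kernel = np.ones(smooth_windows) / smooth_windows
    envelope = np.convolve(second_diff, kernel, mode="same")

    # S306: peak detection - treat a window as a beat point when its onset strength
    # is a local maximum that also rises above the smoothed envelope.
    win_sec = win / sample_rate
    beats = []
    for i in range(1, len(second_diff) - 1):
        if (second_diff[i] > second_diff[i - 1]
                and second_diff[i] >= second_diff[i + 1]
                and second_diff[i] > envelope[i]):
            beats.append((i + 2) * win_sec)  # +2 roughly compensates for the two diffs
    return beats  # beat-point times in seconds
```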

In the embodiment of the application, the terminal device can also obtain the beat information of the target music from the beat information storage area according to the identification information of the target music.

In an alternative embodiment, the terminal device may transmit a music acquisition request to the server, the music acquisition request including identification information of the target music. Correspondingly, after receiving the music acquisition request of the terminal device, the server can extract the identification information of the target music from the music acquisition request, acquire the tempo information of the target music from the tempo information storage area according to the identification information, and send the tempo information of the target music to the terminal device. At this time, the terminal device may receive beat information of the target music transmitted by the server. Optionally, the beat information of the target music is obtained by performing beat analysis processing on the target music by the server.
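A minimal client-side sketch of this exchange, assuming an HTTP/JSON interface; the endpoint path and the field names music_id and beat_points are hypothetical and are not specified by the disclosure:

```python
import requests

def fetch_beat_info(server_url, music_id):
    """Request the beat information of the target music from the server (hypothetical API)."""
    # The music acquisition request carries the identification information of the target music.
    resp = requests.post(f"{server_url}/music/beats", json={"music_id": music_id}, timeout=5)
    resp.raise_for_status()
    # The server reads the pre-computed beat information from its beat information
    # storage area and returns it; here it is assumed to be a list of beat-point times.
    return resp.json()["beat_points"]
```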

Alternatively, the beat information of all the music in the storage area may be determined according to the two embodiments described above, and the beat information of each piece of music is then marked with the identification information of that music.

In this way, the implementation that calculates the beat information of the target music in real time can save a large amount of storage space compared with the implementation that extracts the beat information from the storage area. Conversely, compared with calculating the beat information of the target music in real time, extracting the beat information from the storage area can reduce the processing time of the whole scheme and speed up processing.

In some possible embodiments, for each original video segment, the terminal device may determine a cropping range for the original video segment based on a segment duration of the original video segment. After the cutting range of each original video clip is determined, all video frame positions in the cutting range of the original video clip can be obtained, the cutting information of the original video clip is determined according to all the video frame positions and the beat points, and the original video clip is cut according to the cutting information.

For example, assuming that the duration of an original video segment is 10 seconds, the cropping range of the original video segment can be determined to be between the 5th and the 8th second, and the terminal device can then acquire all video frame positions of the original video segment within the 5th to the 8th second. If it is finally determined that 2 seconds of content needs to be cropped from the original video segment, the end point after cropping can be made to fall exactly on a beat point. The terminal device can determine the beat points between the 5th and the 8th second, determine 2 seconds of content between the 5th and the 8th second as the cropping information according to the video frame positions and the beat points, and crop the original video segment according to the cropping information.
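A minimal sketch of this example, assuming the beat points within the segment are known as times in seconds; it picks a cut interval inside the cropping range whose end lands on a beat point and whose length is as close as possible to the required 2 seconds (function and parameter names are illustrative):

```python
def choose_cut_on_beat(beat_points, crop_start, crop_end, crop_duration):
    """Pick a cut interval whose end lands on a beat point (illustrative sketch).

    beat_points: beat-point times within the segment, in seconds.
    crop_start, crop_end: the cropping range of the segment, e.g. 5.0 and 8.0.
    crop_duration: how much content should be removed, e.g. 2.0 seconds.
    """
    # Candidate end points are the beat points that fall inside the cropping range.
    candidates = [b for b in beat_points if crop_start <= b <= crop_end]
    if not candidates:
        return None  # fall back to some other cropping strategy
    # Choose the beat point that makes the removed span closest to the requested duration,
    # so the end point of the cropped content falls on a beat point.
    best_end = min(candidates, key=lambda b: abs((b - crop_start) - crop_duration))
    return (crop_start, best_end)

# 10-second clip, cropping range 5-8 s, roughly 2 seconds to remove:
print(choose_cut_on_beat([2.9, 5.5, 7.0, 7.8, 9.1], 5.0, 8.0, 2.0))  # -> (5.0, 7.0)
```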

In some optional embodiments in which the cropping information of the original video segment is determined according to all the video frame positions and the beat points, there may be multiple video frame positions within the cropping range that correspond to beat points; for example, between the 5th and the 8th second there may be corresponding beat points at the 5th, 5.5th, 6th, 6.5th, 7th, 7.5th and 8th seconds. In that case the terminal device can perform quality analysis on the video frames within the cropping range to determine a target video frame.

Optionally, the terminal device may perform quality analysis on each video frame between the 5th and the 8th second to determine a target video frame, where the target video frame may be an optimal video frame or the video frame ranked first in quality. In this manner, the terminal device may determine the cropping information of the original video segment based on the position of the target video frame. Continuing the above example, if the terminal device determines that the frames at the 7th, 7.5th and 8th seconds are the optimal video frames, the determined cropping information may be the span from the 5th to the 7th second, and the terminal device may then crop the original video segment according to this cropping information. Alternatively, the terminal device may crop starting from the frame after the optimal video frame and determine cropping information with a length of 2 seconds, or crop forward from the frame before the optimal video frame and determine cropping information with a length of 2 seconds. The 2 seconds to be cropped may be one continuous span, or several small pieces that add up to 2 seconds in total.
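Continuing the example, a sketch that ranks the beat-aligned candidate frames by a quality score and derives the crop interval ending at the best-quality frame; the quality scores are placeholders for whatever frame-quality analysis is actually used:

```python
def crop_by_frame_quality(candidate_times, quality_scores, crop_start, crop_duration):
    """Pick the best-quality beat-aligned frame and crop up to it (illustrative sketch).

    candidate_times: beat-aligned frame times inside the cropping range (seconds).
    quality_scores: one quality value per candidate, higher is better (placeholder metric).
    """
    # The target video frame is the candidate whose quality is ranked first.
    best_time, _ = max(zip(candidate_times, quality_scores), key=lambda pair: pair[1])
    # One option from the text: remove the span that ends at the target video frame,
    # keeping the removed length close to the required cropping duration.
    start = max(crop_start, best_time - crop_duration)
    return (start, best_time)

# If the frame at the 7th second scores highest among the 5-8 s candidates,
# the cropping information becomes the span from the 5th to the 7th second:
print(crop_by_frame_quality([5.0, 6.0, 7.0, 8.0], [0.2, 0.4, 0.9, 0.6], 5.0, 2.0))  # -> (5.0, 7.0)
```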

In some possible embodiments, the cropping range of the original video segment is determined based on a key segment of the original video segment, and the key segment is determined based on the highlight values of the video frames. For example, the portion within the cropping range is the portion with a lower highlight value, so that the key segment is preserved.

In other possible embodiments, the cropping range of the original video segment is determined based on cropping indication information generated from the acquired user setting information.

Therefore, in the process of cutting the video, the video frame with higher quality can be reserved.

In the embodiment of the application, the segment duration of the original video segment acquired by the terminal device may be short, and if the original video segment with the short segment duration is to be cut, the content of the original video segment may not be well expressed.

The above-mentioned cuttable duration threshold refers to a critical value (for example, 2 seconds): if the segment duration of an original video segment is less than or equal to this value, the segment may be left uncropped, because cropping a segment shorter than the critical value would leave its content poorly represented. Therefore, an original video segment whose segment duration is less than or equal to the cuttable duration threshold can be directly determined as a first target video segment and used directly when splicing the target video.

In an alternative embodiment, the above-mentioned tailorable duration threshold (e.g., 2 seconds) may be set based on empirical values.

In another alternative embodiment, the terminal device may determine the cuttable duration threshold according to the video segment durations in target historical videos that meet the requirements. Specifically, the terminal device may obtain a historical video set and determine target historical videos that meet the requirement from the set according to video attribute information. The video attribute information comprises at least one of the number of shares, the number of views, the number of likes, the number of rewards, the number of new followers and the number of comments, and the video segment durations in the target historical videos are analyzed to obtain the cuttable duration threshold.

In a specific embodiment, after the terminal device acquires the historical video set, it can determine the number of shares, the number of views, the number of likes, the number of rewards, the number of new followers and the number of comments of each historical video in the set. The terminal device may determine, as target historical videos, the historical videos whose number of shares satisfies a first number, number of views satisfies a second number, number of likes satisfies a third number, number of rewards satisfies a fourth number, number of new followers satisfies a fifth number, and number of comments satisfies a sixth number, where the first to sixth numbers may be preset. The determined target historical videos are then analyzed to obtain the number of video segments contained in each historical video and the segment duration of each video segment, and the cuttable duration threshold is determined according to these segment durations. Optionally, a historical video is a video finally uploaded to the server by its author through the terminal device, and each video segment of the historical video may carry duration information of that video segment.

In this way, popular target historical videos that meet the requirements can be identified based on big data, and a well-founded cuttable duration threshold can be determined from these target historical videos, which provides effective data support for the threshold.
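A minimal sketch of this selection, assuming each historical video is represented as a dictionary carrying its attribute counts and its clip durations; the dictionary keys and the rule that takes the shortest qualifying clip duration are illustrative assumptions:

```python
def cuttable_threshold_from_history(history, min_counts):
    """Derive a cuttable duration threshold from qualifying historical videos (sketch).

    history: list of dicts such as {"shares": 120, "views": 9000, "likes": 800,
             "rewards": 5, "new_followers": 40, "comments": 60,
             "clip_durations": [2.0, 4.5, 3.0]}  -- a hypothetical structure.
    min_counts: the preset first to sixth numbers, keyed like the dicts above.
    """
    qualifying = [
        video for video in history
        if all(video[key] >= min_counts[key] for key in min_counts)
    ]
    durations = [d for video in qualifying for d in video["clip_durations"]]
    if not durations:
        return None
    # One simple analysis: the shortest clip duration that popular videos still use.
    return min(durations)
```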

In another alternative embodiment, the terminal device may determine the cuttable duration threshold according to the segment duration of each original video segment. For example, assume that there are 3 original video segments: the segment duration of original video segment 1 is 3 seconds, that of original video segment 2 is 10 seconds, and that of original video segment 3 is 20 seconds. Since the duration of original video segment 1 differs greatly from those of original video segments 2 and 3, the segment duration of original video segment 1 can be directly set as the cuttable duration threshold. The above example is merely one alternative way of determining the cuttable duration threshold based on the segment durations of the original video segments; other embodiments may also be included in the present application.

In this way, the cuttable duration threshold can be determined from the actual segment durations of the original video segments, so that the threshold better fits the original video segments currently being processed.

In another alternative embodiment, the terminal device may determine the cuttable duration threshold according to both the video segment durations in the target historical videos that meet the requirements and the segment duration of each original video segment. Specifically, a first cuttable duration threshold may be determined according to the video segment durations in the target historical videos that meet the requirements, a second cuttable duration threshold may be determined according to the segment duration of each original video segment, and a final cuttable duration threshold may then be determined from the first and second cuttable duration thresholds. For example, the final cuttable duration threshold may be determined as the average of the first and second cuttable duration thresholds, or as the sum of the product of the first cuttable duration threshold and a first coefficient and the product of the second cuttable duration threshold and a second coefficient.
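Expressed as formulas, with T1 the threshold derived from the historical videos, T2 the threshold derived from the original segment durations, and α and β the first and second coefficients, the two combination options above read:

```latex
T = \frac{T_1 + T_2}{2} \qquad \text{or} \qquad T = \alpha T_1 + \beta T_2
```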

In this embodiment, optionally, if the first target video segment does not exist in the original video segments, the clipping information of each original video segment may be directly determined according to the beat information of the target music and the segment duration of the original video segment. Optionally, if the first target video segment exists in the original video segment, determining, according to the beat information of the target music and the segment duration of the original video segment, the clipping information of the original video segment except the first target video segment. Optionally, the terminal device may also determine the clipping information of each original video segment directly according to the beat information of the target music and the segment duration of the original video segment, without considering the clipping duration threshold.

The following describes how cropping information for an original video segment is determined based on an alternative embodiment. FIG. 4 is a flowchart illustrating a method of determining cropping information, according to an exemplary embodiment, as shown in FIG. 4, including the steps of:

in step S401, if a first target video segment exists among the original video segments, a second target video segment is determined from the original video segments based on the cuttable duration threshold.

If the first target video segment exists among the original video segments, the original video segments other than the first target video segment may be determined as second target video segments.

In step S403, cropping information of the second target video segment is determined according to the tempo information of the target music, the segment duration of the first target video segment, and the segment duration of the second target video segment.

In an alternative embodiment, the beat information may include a beat duration, which is the time taken by each beat of the target music. Each piece of music has its own tempo, and for most music the tempo is constant, so most music has a single beat duration. In music, time is divided into equal basic units, each of which is called a "beat". The duration of a beat is expressed in terms of note values: one beat may correspond to a quarter note (a quarter note per beat), a half note (a half note per beat) or an eighth note (an eighth note per beat). The beat duration is a relative notion of time. For example, when the specified tempo of the music is 60 beats per minute, each beat takes one second and half a beat takes half a second; when the specified tempo is 120 beats per minute, each beat takes half a second and half a beat takes a quarter of a second, and so on. Once the basic beat duration is determined, notes of the various values are related to the beat. Of course, some music changes tempo, in which case the music has multiple beat durations.
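Put as a formula, the beat duration is simply 60 divided by the tempo in beats per minute; a minimal sketch:

    def beat_duration_seconds(bpm):
        """Time taken by one beat for a given tempo in beats per minute."""
        return 60.0 / bpm

    beat_duration_seconds(60)   # 1.0 s per beat (half a beat lasts 0.5 s)
    beat_duration_seconds(120)  # 0.5 s per beat (half a beat lasts 0.25 s)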

The following description takes target music with a single beat duration as an example. Assume the cuttable duration threshold is 2 seconds, the beat duration is 3 seconds, and the original video segments include one first target video segment of 2 seconds and one second target video segment of 11 seconds. According to the above, the terminal device does not crop the first target video segment.

Alternatively, fig. 5 is a schematic diagram of a spliced video to be processed according to an exemplary embodiment. As shown in fig. 5, the beat information includes a single beat duration (3 seconds) and the beat points of the target music; in other words, a beat point may be marked every 3 seconds on the target music. The 2-second first target video segment is the video segment with splicing number 1, and the 11-second second target video segment is the video segment with splicing number 2, so the two segments can be spliced in the order of their splicing numbers to obtain the video to be processed. The duration of the video to be processed is the sum of the segment duration of the first target video segment and the segment duration of the second target video segment, giving a 13-second video to be processed in which the first target video segment comes first and the second target video segment follows. The cropping information of the second target video segment can then be determined based on the beat points and a preset processing direction. Only one second target video segment exists in this example; if there were several, their cropping information could be determined one by one in the same way. In this way, the splicing points between different video segments fall on the beat points of the target music as far as possible, achieving a beat-synced editing effect.

The following description takes the preset processing direction as the forward splicing direction as an example, where the forward splicing direction is the front-to-back direction. Fig. 6 is a flowchart illustrating a method for determining the cropping information of a second target video segment. As shown in fig. 6, the method comprises the following steps:

in step S601, a first second target video segment in the video to be processed is determined according to the forward direction of the splicing direction.

Optionally, based on the to-be-processed video shown in fig. 5, the forward splicing direction runs from front to back, that is, from the first target video segment towards the second target video segment. Since only one second target video segment exists in this example, it is directly determined as the first second target video segment.

In step S603, if there is no corresponding beat point at the end point of the first second target video segment, determining a first beat point of the first second target video segment according to a reverse direction of the splicing direction; the first beat point is located between two consecutive video frames.

As shown in fig. 5, the end point of the first second target video segment is at 13 seconds, and the target music has no beat point at 13 seconds, so a first beat point is determined in the direction opposite to the splicing direction, that is, in the direction from the second target video segment towards the first target video segment. This first beat point is the fourth beat point from left to right in fig. 5; it corresponds to 12 seconds of the video to be processed and, within the second target video segment, to the 10-second mark of that segment, i.e. the beat point falls exactly at the 10th second of the second target video segment.

In an alternative embodiment, as shown in fig. 5, the second target video segment is composed of a plurality of video frames; for example, the 10th to 11th seconds may contain a plurality of video frames, and the number of video frames shown in fig. 5 is only an example.

In order to ensure the integrity of the cropped video frames or the integrity of the remaining video frames, the first beat point may be located between two consecutive video frames.

In step S605, cropping information of the first second target video segment is determined based on the time point corresponding to the first beat point and the time point corresponding to the end point.

Optionally, according to the 13-second end point of the second target video segment and the 12-second time corresponding to the fourth beat point, the terminal device may determine that the cropping information of the second target video segment at the first cropping position is that a 1-second sub-segment needs to be cropped from it.
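The backward search for the first beat point amounts to rounding the segment's end time down to the nearest beat point of the spliced video; the difference is the duration to crop. A minimal sketch of that computation, assuming (as in fig. 5) that beat points sit at integer multiples of the beat duration:

    import math

    def crop_duration_for_segment(end_time, beat_duration):
        """Seconds to crop so the segment's end lands on the nearest preceding beat point."""
        nearest_beat_before_end = math.floor(end_time / beat_duration) * beat_duration
        return end_time - nearest_beat_before_end

    # End point of the spliced video at 13 s, one beat every 3 s:
    # the nearest preceding beat point is at 12 s, so 1 s must be cropped.
    crop_duration_for_segment(13.0, 3.0)  # -> 1.0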

In this way, the splicing points between different video segments fall on the beat points of the target music as far as possible, achieving a beat-synced editing effect.

The above example only describes the case where the preset processing direction is the forward splicing direction; the preset processing direction may also be the reverse splicing direction.

In the embodiment of the application, the cropping information includes a cropping duration, and the terminal device may determine the cropping duration of the first second target video segment from the difference between the time point corresponding to the first beat point and the time point corresponding to the end point. That is, the cropping information of the first second target video segment is that a 1-second sub-segment needs to be cropped from it. On this basis, the terminal device may crop the second target video segment at the first cropping position according to a preset cropping mode and the cropping duration, obtaining a cropped 10-second target video segment. Optionally, the preset cropping mode is cropping from the head of the segment (for example, cropping the first second of the second target video segment at the first cropping position), cropping from the tail of the segment (for example, cropping the last second of that segment), cropping from the middle of the segment (for example, cropping any one second in the middle of that segment), or cropping according to the content quality of the segment. In this way, the embodiment of the application offers users more cropping choices and achieves diversity in video processing.
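The head, tail and middle cropping modes can be expressed as choosing which intervals of the segment to keep; the quality-based mode is sketched separately below. This is a sketch under the assumption that a segment is described only by its duration:

    def crop_segment(duration, crop_duration, mode="tail"):
        """Intervals (in segment-local seconds) kept after removing crop_duration seconds."""
        if mode == "head":
            return [(crop_duration, duration)]
        if mode == "tail":
            return [(0.0, duration - crop_duration)]
        if mode == "middle":
            # Remove a window centred in the segment; the two remaining parts
            # are spliced back together afterwards.
            cut_start = (duration - crop_duration) / 2
            return [(0.0, cut_start), (cut_start + crop_duration, duration)]
        raise ValueError(f"unknown crop mode: {mode}")

    # The 11 s second target video segment with 1 s to crop:
    crop_segment(11.0, 1.0, "tail")    # keep [(0.0, 10.0)]
    crop_segment(11.0, 1.0, "head")    # keep [(1.0, 11.0)]
    crop_segment(11.0, 1.0, "middle")  # keep [(0.0, 5.0), (6.0, 11.0)]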

There are various ways of cropping the second target video segment at the first cropping position according to the content quality of the segment; two embodiments are described below.

In a first alternative embodiment, the terminal device may divide the second target video segment at the first cropping position into a plurality of sub-segments whose total duration equals the duration of that segment. For example, the 11-second second target video segment may be divided into 11 sub-segments of 1 second each. The content quality of each sub-segment is then evaluated according to the quality evaluation parameters to obtain a content quality value for each sub-segment, and the sub-segments that do not meet the quality requirement are cropped out according to the content quality values and the cropping duration. Assuming the 8th sub-segment has the lowest content quality value, it is cropped from the second target video segment, yielding the cropped second target video segment shown in fig. 7.

The quality evaluation parameters may include color saturation, sharpness, content richness and brightness, among others.
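A minimal sketch of this first quality-based variant follows. The quality_scorer callable stands in for whatever combination of the above parameters an implementation actually uses, and is an assumption of the example:

    def crop_lowest_quality(duration, crop_duration, quality_scorer):
        """Split into equal non-overlapping sub-segments and drop the lowest-scoring one."""
        boundaries, t = [], 0.0
        while t < duration:
            boundaries.append((t, min(t + crop_duration, duration)))
            t += crop_duration
        scores = [quality_scorer(start, end) for start, end in boundaries]
        worst = scores.index(min(scores))
        return [seg for i, seg in enumerate(boundaries) if i != worst]

    # Hypothetical scorer that rates the 8th sub-segment (seconds 7-8) lowest:
    crop_lowest_quality(11.0, 1.0,
                        quality_scorer=lambda start, end: 0.1 if start == 7.0 else 0.9)
    # -> every 1 s sub-segment except (7.0, 8.0)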

In a second alternative embodiment, the terminal device may divide the second target video segment at the first cropping position into a plurality of sub-segments whose total duration is greater than the duration of that segment, adjacent sub-segments sharing a repeated portion. For example, the 11-second second target video segment may be divided into 5 sub-segments of 3 seconds each: sub-segment 1 covering seconds 0-3, sub-segment 2 covering seconds 2-5, sub-segment 3 covering seconds 4-7, sub-segment 4 covering seconds 6-9, and sub-segment 5 covering seconds 8-11. The content quality of each sub-segment is then evaluated according to the quality evaluation parameters to obtain a content quality value for each sub-segment, and the sub-segments that do not meet the quality requirement (say, sub-segment 4 covering seconds 6-9) are cropped out according to the content quality values and the cropping duration. The remaining sub-segments are then spliced with their overlapping portions merged; the resulting uncovered gap is the 7th-8th second, which is exactly the 1-second portion to be cropped.

The quality evaluation parameters include color saturation, sharpness, content richness, brightness and content coherence. Compared with the first alternative embodiment, adjacent sub-segments may share repeated portions, so the continuity of the cropped second target video segment can be preserved while color saturation, sharpness, content richness and brightness are still taken into account. Each sub-segment in the above two embodiments may also be a single video frame or several video frames, and every video frame within a sub-segment is a complete frame.
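A sketch of this second, overlapping variant: split the segment with a sliding window, drop the worst window, and merge what remains; the uncovered gap is what effectively gets cropped. The window and stride values reproduce the example above and are otherwise arbitrary:

    def overlapping_subsegments(duration, window, stride):
        """Sliding-window split; adjacent sub-segments overlap by window - stride seconds."""
        subs, t = [], 0.0
        while t + window <= duration:
            subs.append((t, t + window))
            t += stride
        return subs

    def kept_coverage(subs, dropped_index):
        """Union of the remaining sub-segments; the uncovered gap is the cropped part."""
        remaining = [s for i, s in enumerate(subs) if i != dropped_index]
        merged = []
        for start, end in sorted(remaining):
            if merged and start <= merged[-1][1]:
                merged[-1] = (merged[-1][0], max(merged[-1][1], end))
            else:
                merged.append((start, end))
        return merged

    subs = overlapping_subsegments(11.0, 3.0, 2.0)  # [(0,3), (2,5), (4,7), (6,9), (8,11)]
    kept_coverage(subs, dropped_index=3)            # [(0.0, 7.0), (8.0, 11.0)]: seconds 7-8 removed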

In step S205, displaying the target video on the video processing interface; the target video is obtained by cutting and splicing the original video segments based on the cutting information; the target video has at least one video splicing point corresponding to the beat point in the beat information; wherein, the beat information of the target music is the original beat information of the target music.

Continuing with the example, step S205 may proceed as follows: the terminal device performs a first splicing adjustment on the video to be processed based on the cropped first second target video segment. As shown in fig. 7, after the second target video segment is cropped, a 10-second target video segment is obtained by splicing the sub-segments before and after the removed 8th sub-segment.

If another second target video segment exists, it is determined based on the video to be processed after the first splicing adjustment, cropped in the same way as the first second target video segment, and a second splicing adjustment is performed on the video to be processed according to the cropped segment; this continues until the last second target video segment has been cropped. The target video, obtained after multiple rounds of cropping and splicing adjustments on the video to be processed, can then be displayed on the video processing interface.

Assume the video to be processed contains 3 second target video segments (a first, a second and a third second target video segment). The terminal device crops the first second target video segment in the manner described above, performs the first splicing adjustment on the video to be processed with that cropped segment, and determines the second second target video segment based on the video to be processed after the first splicing adjustment. The terminal device then crops the second second target video segment in the same manner as the first, performs the second splicing adjustment on the video to be processed with that cropped segment, and determines the third second target video segment based on the video to be processed after the second splicing adjustment. Finally, the terminal device crops the third second target video segment in the same manner and performs the third splicing adjustment; the resulting video to be processed may then be the final target video.
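The walkthrough above can be condensed into a front-to-back loop over the spliced timeline: each croppable segment is trimmed so its end lands on the preceding beat point, and the timeline is updated before the next segment is considered. The list-of-dicts representation is an assumption of this sketch, not the disclosed data structure:

    import math

    def crop_and_splice_all(segments, beat_duration):
        """Trim each croppable segment so its splice point falls on a beat point."""
        timeline_end = 0.0
        for seg in segments:
            timeline_end += seg["duration"]
            if not seg["croppable"]:
                continue  # first target video segments are kept as-is
            beat_before_end = math.floor(timeline_end / beat_duration) * beat_duration
            seg["duration"] -= timeline_end - beat_before_end
            timeline_end = beat_before_end  # the splice point now sits on a beat
        return segments

    # The running example: 2 s uncropped segment + 11 s croppable segment,
    # one beat every 3 s -> the second segment is trimmed to 10 s.
    crop_and_splice_all(
        [{"duration": 2.0, "croppable": False}, {"duration": 11.0, "croppable": True}],
        beat_duration=3.0,
    )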

In this way, the video to be processed is cropped in an orderly manner, so that the splicing points between different video segments fall exactly on the beat points of the target music, achieving a beat-synced editing effect.

In the embodiment of the application, an import duration threshold may be preset in the terminal device. When the segment duration of every original video segment in the video segment set is greater than or equal to the import duration threshold, each original video segment is determined as a third target video segment, and when the beat information contains a single beat duration, the cropping information of each third target video segment is determined from the beat duration and the segment duration of that original video segment, so that the segment duration of each cropped third target video segment obtained from the cropping information is an integral multiple of the beat duration. The import duration threshold is greater than or equal to the beat duration.
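For this case, cropping to an integral multiple of the beat duration amounts to keeping the largest whole number of beats the segment can hold; a minimal sketch:

    import math

    def crop_to_beat_multiple(segment_duration, beat_duration):
        """Largest duration not exceeding the segment that is a whole number of beats."""
        return math.floor(segment_duration / beat_duration) * beat_duration

    crop_to_beat_multiple(11.0, 3.0)  # -> 9.0, i.e. three full 3 s beats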

In an optional embodiment, the beat information further includes a beat speed (tempo), and the terminal device may determine transition effect information corresponding to the beat speed and add it at the video segment splicing position corresponding to the beat point. Specifically, the terminal device may match a transition suited to the music style according to the beat speed; for example, fast-paced music may be matched with transitions of large animation amplitude, such as rotation or fast switching.
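A possible mapping from beat speed to transition, sketched only to make the idea concrete; the tempo ranges and effect names are illustrative assumptions, not values from the disclosure:

    def transition_for_tempo(bpm):
        """Pick a transition whose animation amplitude roughly matches the tempo."""
        if bpm >= 120:
            return "rotation"      # large-amplitude transition for fast music
        if bpm >= 90:
            return "fast_switch"
        return "crossfade"         # gentler transition for slower music

    transition_for_tempo(130)  # -> "rotation"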

In another optional embodiment, the terminal device determines an accent (strong-beat) intensity value corresponding to each beat point, determines the key beat points whose accent intensity values satisfy a preset value, and adds accent effect information at the video segment splicing positions corresponding to those key beat points. Specifically, effects such as frame shake and RGB channel separation are added at certain beat points, creating an industrial, cool visual style. In this manner, the final production effect can be presented on the video processing interface.
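Selecting the key beat points is a simple thresholding step over per-beat accent intensities; the intensity values and threshold below are made up for the example, and how accents are scored is outside this sketch:

    def key_beat_points(beat_times, accent_intensities, threshold):
        """Beat points whose accent intensity reaches the preset value."""
        return [t for t, s in zip(beat_times, accent_intensities) if s >= threshold]

    key_beat_points([0.0, 3.0, 6.0, 9.0, 12.0],
                    [0.9, 0.4, 0.8, 0.3, 0.7],
                    threshold=0.7)   # -> [0.0, 6.0, 12.0]
    # Accent effects such as frame shake or RGB channel separation would then be
    # added at the splice points that coincide with these key beat points.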

Optionally, the terminal device may further receive an adjustment instruction triggered via an adjustment control, adjust the transition effect information or the accent effect information, and replace it with new transition or accent effect information preferred by the user.

In the embodiment of the application, the duration of the target video finally presented on the video processing interface is the same as the duration of the target music. Alternatively, if the duration of the target music is longer than that of the target video, the terminal device may trim the target music according to the duration of the target video, so that the duration of the target video equals the duration of the trimmed target music.

Fig. 8 is a block diagram illustrating a video processing device according to an example embodiment. Referring to fig. 8, the apparatus includes an acquisition module 801, a cropping information determination module 802, and a display module 803.

An acquisition module 801 configured to perform acquisition of target music and video clip information; the video clip information comprises the clip duration of the original video clip in the video clip set;

a cropping information determining module 802 configured to perform determining cropping information of the original video clip according to the beat information of the target music and the clip duration of the original video clip;

a display module 803 configured to perform displaying a target video on a video processing interface; the target video is obtained by cutting and splicing the original video segments based on the cutting information; the target video has at least one video splicing point corresponding to the beat point in the beat information; wherein, the beat information of the target music is the original beat information of the target music.

In some possible embodiments, the apparatus further comprises a beat information acquisition module configured to perform:

performing beat analysis on the target music to obtain beat information of the target music;

or;

and acquiring the beat information of the target music from the beat information storage area according to the identification information of the target music.

In some possible embodiments, the beat information obtaining module is configured to perform:

sending a music acquisition request to a server, wherein the music acquisition request comprises identification information of target music;

receiving beat information of target music sent by a server; the beat information of the target music is obtained by performing beat analysis processing on the target music by the server, and the beat information of the target music is stored in a beat information storage area of the server.

In some possible embodiments, the cropping information determination module is configured to perform:

for each original video clip, determining the clipping range of the original video clip based on the clip duration of the original video clip;

acquiring all video frame positions in the cutting range of the original video clip;

determining cutting information of the original video clip according to all video frame positions and beat points;

and cutting the original video clip according to the cutting information.

In some possible embodiments, the cropping information determination module is configured to perform:

if the positions of a plurality of video frames corresponding to the beat points exist in the cutting range, performing quality analysis on the video frames in the cutting range to determine a target video frame;

determining clipping information of an original video clip based on the position of the target video frame;

and cutting the original video clip according to the cutting information.

In some possible embodiments, the apparatus further comprises:

the clipping range of the original video clip is determined based on a key clip of the original video clip, and the key clip is determined based on highlight scores of the video frames;

and/or;

the clipping range of the original video clip is determined based on clipping instruction information generated from the acquired user setting information.

In some possible embodiments, the apparatus further comprises a target video segment determination module configured to perform:

determining a first target video clip from the original video clips; the segment duration of the first target video segment is less than or equal to the cuttable duration threshold.

In some possible embodiments, the apparatus further comprises a cuttable duration threshold determination module configured to perform:

determining the cuttable duration threshold according to the video segment durations in the target historical videos meeting the requirements, and/or determining the cuttable duration threshold according to the segment durations of the original video segments.

In some possible embodiments, the cuttable duration threshold determination module is configured to perform:

acquiring a historical video set;

determining target historical videos meeting the requirements from the historical video set according to video attribute information, wherein the video attribute information comprises at least one of share count, view count, like count, tip count, follower-gain count and comment count;

and analyzing the video segment durations in the target historical videos to obtain the cuttable duration threshold.

In some possible embodiments, if there is a first target video segment in the original video segment, the cropping information determining module is configured to perform:

determining a second target video segment in the original video segments based on the cuttable duration threshold;

and determining the cutting information of the second target video segment according to the beat information of the target music, the segment duration of the first target video segment and the segment duration of the second target video segment.

In some possible embodiments, the cropping information determination module is configured to perform:

splicing the first target video clip and the second target video clip based on the splicing number of the first target video clip and the splicing number of the second target video clip to obtain a video to be processed; the duration of the video to be processed is the sum of the segment duration of the first target video segment and the segment duration of the second target video segment;

and sequentially determining the cutting information of the second target video clip based on the beat points in the beat information and the preset processing direction.

In some possible embodiments, the preset processing direction includes a forward direction of the stitching direction, and the cropping information determining module is configured to perform:

determining a first second target video clip in the video to be processed according to the forward direction of the splicing direction;

if the ending point of the first second target video segment does not have the corresponding beat point, determining a first beat point of the first second target video segment according to the reverse direction of the splicing direction; the first beat point is located between two consecutive video frames;

and determining the cutting information of the first second target video segment based on the time points corresponding to the first beat point and the end point.

In some possible embodiments, the cropping information includes a cropping duration, and the cropping information determination module is configured to perform:

determining the cutting duration to be cut from the first second target video segment according to the difference between the time point corresponding to the first beat point and the time point corresponding to the ending point;

cutting the first second target video segment according to a preset cutting mode and the cutting duration;

the preset cutting mode comprises a mode of cutting from the head of the segment, a mode of cutting from the tail of the segment, a mode of cutting from the middle of the segment and a mode of cutting according to the content quality of the segment.

In some possible embodiments, if the preset clipping manner is a manner of clipping according to content quality of the segment, the clipping information determining module is configured to perform:

dividing the first second target video clip into a plurality of sub-segments; the total duration of the plurality of sub-segments is equal to the segment duration of the first second target video segment;

evaluating the content quality of each sub-segment in the plurality of sub-segments according to the quality evaluation parameters to obtain the content quality value of each sub-segment;

cutting out sub-segments which do not meet the quality requirement from the plurality of sub-segments according to the content quality value and the cutting time length of each sub-segment;

the quality assessment parameters include color saturation, sharpness, richness of content, and brightness.

In some possible embodiments, if the preset clipping manner is a manner of clipping according to content quality of the segment, the clipping information determining module is configured to perform:

dividing the first second target video clip into a plurality of sub-clips; the total duration of the plurality of sub-segments is greater than the segment duration of the first second target video segment; there are repeated segments for each sub-segment and adjacent sub-segments;

evaluating the content quality of each sub-segment in the plurality of sub-segments according to the quality evaluation parameters to obtain the content quality value of each sub-segment;

cutting out sub-segments which do not meet the quality requirement from the plurality of sub-segments according to the content quality value and the cutting time length of each sub-segment;

the quality assessment parameters include color saturation, sharpness, content richness, brightness and content coherence.

In some possible embodiments, the display module is configured to perform:

performing first splicing adjustment on a video to be processed based on the first cut second target video clip;

determining a second second target video clip based on the video to be processed after the first splicing adjustment, and cutting it in the same manner as the first second target video clip;

performing a second splicing adjustment on the video to be processed according to the cut second second target video clip, and so on until the cutting of the last second target video clip is completed;

and displaying a target video on the video processing interface, wherein the target video is obtained by cutting and splicing the video to be processed for multiple times.

In some possible embodiments, the beat information further includes a beat speed, and the apparatus further includes a transition effect information adding module configured to perform:

determining transition effect information corresponding to the beat speed;

and adding transition effect information at the video splicing point corresponding to the beat point.

In some possible embodiments, the apparatus further comprises an accent effect information adding module configured to perform:

determining an accent intensity value corresponding to the beat point;

determining key beat points whose accent intensity values satisfy a preset value;

and adding accent effect information at the video segment splicing position corresponding to the key beat point.

In some possible embodiments,

the duration of the target video is the same as that of the target music;

or;

the time length of the target video is the same as that of the clipped target music.

With regard to the apparatus in the above-described embodiment, the specific manner in which each module performs the operation has been described in detail in the embodiment related to the method, and will not be elaborated here.

Fig. 9 is a block diagram illustrating an electronic device 900 for video processing in accordance with an example embodiment.

The electronic device may be a server, or may be another device having the same function as the server, and its internal structure diagram may be as shown in fig. 9. The electronic device includes a processor, a memory, and a network interface connected by a system bus. Wherein the processor of the electronic device is configured to provide computing and control capabilities. The memory of the electronic equipment comprises a nonvolatile storage medium and an internal memory. The non-volatile storage medium stores an operating system and a computer program. The internal memory provides an environment for the operation of an operating system and computer programs in the non-volatile storage medium. The network interface of the electronic device is used for connecting and communicating with an external terminal through a network. The computer program is executed by a processor to implement a video processing method.

Those skilled in the art will appreciate that the architecture shown in fig. 9 is merely a block diagram of some of the structures associated with the disclosed aspects and does not constitute a limitation on the electronic devices to which the disclosed aspects apply; a particular electronic device may include more or fewer components than those shown, combine certain components, or have a different arrangement of components.

In an exemplary embodiment, there is also provided a server, including: a processor; a memory for storing instructions executable by the processor, wherein the processor is configured to execute the instructions to implement a video processing method as in embodiments of the present disclosure.

In an exemplary embodiment, there is also provided a storage medium having instructions that, when executed by a processor of an electronic device, enable the electronic device to perform a video processing method in an embodiment of the present disclosure.

In an exemplary embodiment, there is also provided a computer program product containing instructions which, when run on a computer, cause the computer to perform the video processing method in the embodiments of the present disclosure.

It will be understood by those skilled in the art that all or part of the processes of the methods of the embodiments described above can be implemented by hardware instructions of a computer program, which can be stored in a non-volatile computer-readable storage medium, and when executed, can include the processes of the embodiments of the methods described above. Any reference to memory, storage, database, or other medium used in the embodiments provided herein may include non-volatile and/or volatile memory, among others. Non-volatile memory can include read-only memory (ROM), Programmable ROM (PROM), Electrically Programmable ROM (EPROM), Electrically Erasable Programmable ROM (EEPROM), or flash memory. Volatile memory can include Random Access Memory (RAM) or external cache memory. By way of illustration and not limitation, RAM is available in a variety of forms such as Static RAM (SRAM), Dynamic RAM (DRAM), Synchronous DRAM (SDRAM), Double Data Rate SDRAM (DDRSDRAM), Enhanced SDRAM (ESDRAM), Synchronous Link DRAM (SLDRAM), Rambus Direct RAM (RDRAM), direct bus dynamic RAM (DRDRAM), and memory bus dynamic RAM (RDRAM).

Other embodiments of the disclosure will be apparent to those skilled in the art from consideration of the specification and practice of the disclosure disclosed herein. This application is intended to cover any variations, uses, or adaptations of the disclosure following, in general, the principles of the disclosure and including such departures from the present disclosure as come within known or customary practice within the art to which the disclosure pertains. It is intended that the specification and examples be considered as exemplary only, with a true scope and spirit of the disclosure being indicated by the following claims.

It will be understood that the present disclosure is not limited to the precise arrangements described above and shown in the drawings and that various modifications and changes may be made without departing from the scope thereof. The scope of the present disclosure is limited only by the appended claims.
