Video switching method, video processing method, apparatus and storage medium

Document No.: 1721485 | Publication date: 2019-12-17

Note: This technology, Video switching method, video processing method, apparatus and storage medium, was designed and created by 左洪涛 (Zuo Hongtao) and 王婷 (Wang Ting) on 2018-06-11. Its main content is as follows. The application discloses a video switching method comprising: when an i-th slice in video data of a first definition is being played, receiving an instruction to switch to a second definition, and requesting a j-th slice in video data of the second definition from a server according to the index number i of the i-th slice, wherein i and j are positive integers, j ≥ i, and slices with the same index number in the video data of the first definition and the video data of the second definition have the same slice timestamp; and receiving the j-th slice in the video data of the second definition. The application also provides a corresponding video processing method, apparatus and storage medium.

1. A video switching method, comprising:

when an i-th slice in video data of a first definition is being played, receiving an instruction to switch to a second definition;

requesting a j-th slice in video data of the second definition from a server according to the index number i of the i-th slice, wherein i and j are positive integers and j ≥ i; and

receiving the j-th slice in the video data of the second definition.

2. The method of claim 1, wherein each slice in the video data of the first definition and each slice in the video data of the second definition comprises one or more groups of pictures, and a group of pictures in the video data of the first definition and the corresponding group of pictures in the video data of the second definition have the same playback start time and playback end time;

the method further comprises: playing the next group of pictures of the j-th slice in the video data of the second definition according to the group of pictures of the j-th slice in the video data of the first definition that is currently being played.

3. The method of claim 2, wherein each group of pictures includes a reference frame;

wherein playing the next group of pictures of the j-th slice in the video data of the second definition according to the group of pictures of the j-th slice in the video data of the first definition that is currently being played comprises:

determining the index number m of the reference frame contained in the group of pictures of the j-th slice in the video data of the first definition that is currently being played; and

requesting, according to the index number m, the group of pictures corresponding to the reference frame with index number m+1 in the video data of the second definition, and playing that group of pictures.

4. The method of claim 1, further comprising:

after the j-th slice in the video data of the first definition has been played, playing the j-th slice in the video data of the second definition.

5. The method of claim 1, wherein requesting the j-th slice in the video data of the second definition from the server according to the index number i of the i-th slice comprises:

acquiring the amount of unplayed video data of the i-th slice in a video buffer; and

determining the index number of the j-th slice according to the amount of data, and requesting the j-th slice from the server.

6. A video processing method, comprising:

receiving a request message for a j-th slice in video data of a second definition, the request message being sent by a terminal device in response to an instruction to switch to the second definition received while the terminal device is playing an i-th slice in video data of a first definition, wherein the request carries the index number of the i-th slice, i and j are positive integers, and j ≥ i; and

sending the j-th slice to the terminal device, wherein slices with the same index number in the video data of the first definition and the video data of the second definition have the same slice timestamp.

7. The method of claim 6, further comprising:

dividing a video file into a plurality of slices according to playback time, each slice having a slice timestamp;

encoding the plurality of slices according to a preset first definition to form the plurality of slices of the video data of the first definition; and

encoding the plurality of slices according to a preset second definition to form the plurality of slices of the video data of the second definition.

8. A video switching apparatus, comprising:

a first receiving unit, configured to receive an instruction to switch to a second definition when an i-th slice in video data of a first definition is being played;

a requesting unit, configured to request a j-th slice in video data of the second definition from a server according to the index number i of the i-th slice, wherein i and j are positive integers, j ≥ i, and slices with the same index number in the video data of the first definition and the video data of the second definition have the same slice timestamp; and

a second receiving unit, configured to receive the j-th slice in the video data of the second definition.

9. A video processing apparatus, comprising:

a receiving unit, configured to receive a request message for a j-th slice in video data of a second definition, the request message being sent by a terminal device in response to an instruction to switch to the second definition received while the terminal device is playing an i-th slice in video data of a first definition, wherein the request carries the index number of the i-th slice, i and j are positive integers, and j ≥ i; and

a sending unit, configured to send the j-th slice to the terminal device, wherein slices with the same index number in the video data of the first definition and the video data of the second definition have the same slice timestamp.

10. A computer-readable storage medium storing computer-readable instructions that cause at least one processor to perform the method of any one of claims 1-7.

Technical Field

The present application relates to the field of Internet multimedia technologies, and in particular to a video switching method, a video processing method, an apparatus, and a storage medium.

Background

With the development of network technology, people often use video players to watch video resources they like. A video resource is usually available in several definitions, such as standard definition, high definition, and ultra high definition. Viewers can choose to watch a video at different definitions, and changing from one definition to another is called video definition switching. To give users a better viewing experience during such switching, seamless switching techniques have received more and more attention.

Disclosure of Invention

An embodiment of the present application provides a video switching method, comprising:

when an i-th slice in video data of a first definition is being played, receiving an instruction to switch to a second definition;

requesting a j-th slice in video data of the second definition from a server according to the index number i of the i-th slice, wherein i and j are positive integers and j ≥ i; and

receiving the j-th slice in the video data of the second definition.

An embodiment of the present application further provides a video processing method, comprising:

receiving a request message for a j-th slice in video data of a second definition, the request message being sent by a terminal device in response to an instruction to switch to the second definition received while the terminal device is playing an i-th slice in video data of a first definition, wherein the request carries the index number of the i-th slice, i and j are positive integers, and j ≥ i; and

sending the j-th slice to the terminal device, wherein slices with the same index number in the video data of the first definition and the video data of the second definition have the same slice timestamp.

An embodiment of the present application further provides a video switching apparatus, comprising:

a first receiving unit, configured to receive an instruction to switch to a second definition when an i-th slice in video data of a first definition is being played;

a requesting unit, configured to request a j-th slice in video data of the second definition from a server according to the index number i of the i-th slice, wherein i and j are positive integers, j ≥ i, and slices with the same index number in the video data of the first definition and the video data of the second definition have the same slice timestamp; and

a second receiving unit, configured to receive the j-th slice in the video data of the second definition.

An example of the present application further provides a video processing apparatus, comprising:

a receiving unit, configured to receive a request message for a j-th slice in video data of a second definition, the request message being sent by a terminal device in response to an instruction to switch to the second definition received while the terminal device is playing an i-th slice in video data of a first definition, wherein the request carries the index number of the i-th slice, i and j are positive integers, and j ≥ i; and

a sending unit, configured to send the j-th slice to the terminal device, wherein slices with the same index number in the video data of the first definition and the video data of the second definition have the same slice timestamp.

An example of the present application further provides a computer-readable storage medium storing computer-readable instructions that can cause at least one processor to perform the methods described above.

With the solution provided by this application, slices with the same index number in the video data of the first definition and the video data of the second definition have the same slice timestamp. When the definition is switched, the current playback is not interrupted: the slice of the second definition is requested according to the slice index number, and because slices with the same index number have the same timestamp, seamless switching between video data of different definitions can be achieved.

Drawings

In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings needed for describing the embodiments or the prior art are briefly introduced below. Obviously, the drawings in the following description show only some embodiments of the present invention, and a person skilled in the art can derive other drawings from them without creative effort.

FIG. 1 is a schematic diagram of a system architecture to which some embodiments of the present application relate;

FIG. 2 is a schematic flowchart of a video switching method according to some examples of the present application;

FIG. 3A is another schematic flowchart of a video switching method in some examples of the present application;

FIG. 3B is a schematic diagram of a user interface for definition switching in some examples of the present application;

FIG. 4A is a schematic diagram of the slicing of a video file according to some examples of the present application;

FIG. 4B is a schematic diagram of slice structures of different definitions and the IDR frame structure within a slice according to some examples of the present application;

FIG. 5 is a schematic diagram of the structure of IDR frame sequences in some examples of the present application;

FIG. 6 is a schematic diagram of intra-slice switching in some examples of the present application;

FIG. 7 is a schematic diagram of inter-slice switching in some examples of the present application;

FIG. 8A is a schematic flowchart of definition switching performed by a terminal device in some examples of the present application;

FIG. 8B is another schematic flowchart of definition switching performed by a terminal device in some examples of the present application;

FIG. 9 is a schematic flowchart of a video processing method according to some examples of the present application;

FIG. 10 is a schematic flowchart of encoding a video at different definitions according to some examples of the present application;

FIG. 11 is a schematic structural diagram of a video switching apparatus according to some examples of the present application;

FIG. 12 is a schematic structural diagram of a video processing apparatus in some examples of the present application;

FIG. 13 is a schematic structural diagram of a computing device in some embodiments of the present application; and

FIG. 14 is a schematic structural diagram of a computing device in further embodiments of the present application.

Detailed Description

The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.

When a viewer switches a video between different definitions, one switching scheme interrupts the current playback, displays a loading state on the user interface (UI), requests the playback address of the new definition, and then restarts playback. This causes re-buffering, and the user has to wait. Another scheme does not interrupt the current playback: it first requests the playback address of the new definition, then requests the slice of the other definition corresponding to the index number of the currently playing slice, and then switches. However, because the slices (transport stream, TS, segments) of different definitions are not aligned, that is, slices with the same index number have different playback start and end times, the pictures before and after the switch do not match. The result is a visible picture discontinuity, a poor viewing experience, and the smooth effect of seamless switching cannot be achieved.

FIG. 1 is a schematic diagram of a system architecture 100 according to the present application. As shown in FIG. 1, the system architecture 100 includes a terminal device 101 and a server 102, which are connected through a network 103. In some embodiments, each user is communicatively connected to the server 102 through a video application 108-a to 108-c running on a terminal device 101 (e.g., terminal devices 101a to 101c). The video application may be a video playback app or a video player embedded in a browser.

The server 102 may be a video server implemented on one or more independent data processing devices or on a distributed computer network, and it provides a video playback service to the terminal device 101. The server 102 may be a content delivery edge node of a CDN (content delivery network). Video data of different definitions may be stored on the same server 102 or on different servers 102. For example, when multiple servers are used, one server may store description information of the video, such as the identifiers and addresses of the servers storing each slice of the video, while another server stores the video data itself.

The terminal device 101 may be a portable terminal device such as a mobile phone, a tablet, a handheld computer, or a wearable device; a PC such as a desktop or notebook computer; or any smart device with Internet access and a display interface, such as a smart TV.

In some embodiments, network 103 may include a Local Area Network (LAN) and a Wide Area Network (WAN) such as the Internet. Network 103 may be implemented using any well-known network protocol, including various wired or wireless protocols.

The server 102 stores video data of different definitions for each video file, for example standard-definition, high-definition, and ultra-high-definition encoded data. The encoded data for each definition of a video is divided into slices, and the encoded data of different definitions have the same slice structure. The video application 108 in the terminal device 101 requests video data from the server 102; the request carries a video identifier, a definition identifier (either a default definition or the definition selected by the user), and address information, and the server 102 continuously sends the encoded data of that definition to the client in the terminal device 101. When transmitting encoded data to the terminal device 101, the server 102 sends it with the slice as the basic unit. After receiving a slice, the client decodes and plays it. When the video application 108 receives an instruction to switch to another definition during playback, it requests the slice of the second definition from the server according to the index number of the currently playing slice, thereby switching between definitions.
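For illustration only, the sketch below models the kind of slice request described above; the field names, query keys and URL are assumptions of ours, not a message format defined by this application.

```python
from dataclasses import dataclass
from urllib.parse import urlencode

@dataclass
class SliceRequest:
    video_id: str        # identifier of the requested video (assumed field)
    definition_id: str   # identifier of the requested definition, e.g. "hd" (assumed field)
    slice_index: int     # index number of the requested slice (assumed field)

    def to_url(self, base_url: str) -> str:
        """Render the request as a URL with a query string."""
        query = urlencode({"vid": self.video_id,
                           "def": self.definition_id,
                           "slice": self.slice_index})
        return f"{base_url}?{query}"

# Example: request slice 12 of the high-definition stream (hypothetical address)
print(SliceRequest("v123", "hd", 12).to_url("https://cdn.example.com/slice"))
```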

The video switching method provided by this application can be applied to an online video playback scenario. In that scenario, the video file is stored on the server 102, and the server 102 generates encoded data of the video file at different definitions; the encoded data of different definitions have slices with the same index numbers and the same slice timestamps. The method can also be applied to a live streaming scenario. In that scenario, the server 102 receives a video data stream from a capture device, decodes the encoded data in the stream, and re-encodes it into encoded data of different definitions, again with slices that share the same index numbers and slice timestamps; the re-encoded data is then sent to terminal devices for decoding and playback.

The present application provides a video switching method applied to the terminal device 101. FIG. 2 is a flowchart of a video switching method provided in some embodiments of the present application. As shown in FIG. 2, the method includes the following steps:

S201: when the i-th slice in the video data of the first definition is being played, receive an instruction to switch to the second definition.

In some embodiments, after receiving the user's instruction to switch to the second definition, the terminal device 101 needs to request the slice of the second definition from the server 102. Slices of the video file at different definitions are stored on the server 102. The server 102 may encode the video file in advance into encoded data of different definitions, in which slices with the same index number have the same slice timestamp.

The video file on the server 102 may be a file stored on the server 102 or, in a live scenario, bitstream data received from a capture device. For a stored video file, the server encodes it at the different definitions to form encoded data of different definitions whose slices are aligned. For bitstream data received from a capture device, the received bitstream is decoded and then re-encoded into encoded data of different definitions whose slices are aligned. Slice alignment means that, for the same index number, the encoded data of different definitions have the same slice timestamp, that is, they cover the same playback time period of the video. When the encoded data of different definitions is formed, the video picture sequence is sliced first and then encoded at each definition, so that the resulting encoded data of different definitions are slice-aligned.

S202: request the j-th slice in the video data of the second definition from the server according to the index number i of the i-th slice, where i and j are positive integers, j ≥ i, and slices with the same index number in the video data of the first definition and the video data of the second definition have the same slice timestamp.

In some embodiments, the amount of unplayed video data of the i-th slice in the video buffer may be acquired, the index number of the j-th slice may be determined according to that amount, and the j-th slice may then be requested from the server.

For example, when the amount of data is smaller than a preset value, j = i + 1 or j = i + 2 is determined;

and when the amount of data is larger than the preset value, j = i or j = i + 1 is determined.
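A minimal sketch of this selection rule, in which the function and parameter names are our own; the mapping from switching strategy to index follows the examples given later in this description (intra-slice switching uses j = i or j = i + 1, inter-slice switching uses j = i + 1 or j = i + 2).

```python
def choose_target_slice_index(current_index: int,
                              unplayed_bytes: int,
                              threshold_bytes: int,
                              intra_slice: bool) -> int:
    """Pick the index j of the second-definition slice to request.

    If the buffered, unplayed data of slice i exceeds the preset threshold,
    the switch can happen on slice i itself (intra-slice, j = i) or at the
    next slice boundary (inter-slice, j = i + 1); otherwise one more
    first-definition slice is played first, pushing j one slice further out.
    """
    if unplayed_bytes > threshold_bytes:
        return current_index if intra_slice else current_index + 1
    return current_index + 1 if intra_slice else current_index + 2

# Plenty of buffered data, intra-slice strategy: switch within slice 7 (j = i = 7)
print(choose_target_slice_index(7, unplayed_bytes=4_000_000,
                                threshold_bytes=1_000_000, intra_slice=True))
```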

S203: receive the j-th slice in the video data of the second definition.

In the method provided by this embodiment of the application, slices with the same index number in video data of different definitions have the same slice timestamp. When the definition is switched, the current playback is not interrupted; the slice of the other definition is requested according to the slice index number, and because slices with the same index number have the same timestamp, seamless switching between video data of different definitions can be achieved.

The video switching method provided by this application is described below with reference to FIG. 3A. FIG. 3A is another flowchart of a video switching method according to an embodiment of the present disclosure. As shown in FIG. 3A, the method includes the following operations:

S301: when the i-th slice in the video data of the first definition is being played, receive an instruction to switch to the second definition.

FIG. 3B is a schematic diagram of a user interface for definition switching. The user interface provides a picture-quality control 3001; when the user taps the control 3001, a definition selection box 3002 is displayed, in which the user can choose to play the video at a different definition. The specific implementation of step S301 is similar to step S201 and is not repeated here.

S302: acquire the amount of unplayed video data of the i-th slice in the video buffer.

In some embodiments, to avoid re-buffering during the switch from the first definition to the second definition, this example chooses the index number j of the requested second-definition slice by considering the remaining playback time corresponding to the unplayed data of the i-th slice in the video buffer.

It is judged whether the remaining data of the i-th slice in the current buffer is enough to keep playing until the requested j-th slice of the second definition can be played. In this example, a preset data-amount threshold may be set; when the remaining data of the i-th slice in the buffer exceeds the preset value, the buffered data is considered sufficient to bridge playback until the requested j-th slice of the second definition arrives.

S303: determine the index number of the j-th slice according to the amount of data, and request the j-th slice from the server.

In some examples, when the amount of data is smaller than the preset value, that is, the unplayed data of the i-th slice in the buffer cannot bridge playback until the requested second-definition slice arrives, the (i+1)-th or (i+2)-th slice of the first definition continues to be played, and j = i + 1 or j = i + 2 is determined.

When the amount of data is larger than the preset value, that is, the unplayed data of the i-th slice in the buffer can bridge playback until the requested second-definition slice arrives, no further first-definition slice needs to be played: after the i-th slice of the first definition finishes, the j-th slice of the second definition can be played directly, or playback can switch to the second definition while the i-th slice of the first definition is still playing. Thus j = i or j = i + 1 is determined.

After receiving the j-th slice of the second definition, the terminal device 101 plays it; how the j-th slice is played depends on the switching strategy. With intra-slice switching, playback switches to the j-th slice of the second definition while the j-th slice of the first definition is still being played, so the switch takes place inside a slice. With inter-slice switching, playback switches to the j-th slice of the second definition at a slice boundary, after the preceding slice of the first definition has finished playing.

In some embodiments, after acquiring the j-th slice of the second definition from the server, step S304 may be executed to perform inter-slice switching, as shown in FIG. 7, or step S305 may be executed to perform intra-slice switching, as shown in FIG. 6.

S304: after the j-th slice in the video data of the first definition has been played, play the j-th slice in the video data of the second definition.

In some examples, when the playback time of the remaining data in the video buffer is sufficient to cover requesting the j-th slice of the second definition, j may be set to i + 1. After the (j-1)-th slice (i.e., the i-th slice) of the first definition has been played, the (i+1)-th slice of the second definition is played, completing the switch from the first definition to the second definition.

When the playback time of the remaining data in the video buffer is not sufficient to cover requesting the j-th slice of the second definition, j may be set to i + 2. After the i-th slice of the first definition is played, the (i+1)-th slice of the first definition is played, and after that the (i+2)-th slice of the second definition is played, completing the switch from the first definition to the second definition.

S305: play the next group of pictures of the j-th slice in the video data of the second definition according to the group of pictures of the j-th slice in the video data of the first definition that is currently being played.

In some examples, each slice in the video data of the first definition and each slice in the video data of the second definition contains one or more groups of pictures (GOP), a group of pictures in the video data of the first definition and the corresponding group of pictures in the video data of the second definition have the same playback start time and end time, and each group of pictures contains one reference frame: an IDR (Instantaneous Decoding Refresh) frame. In this embodiment, the reference frames in each slice can be determined first and the different definitions encoded afterwards; in that case the resulting encoded data of different definitions are aligned both at slice boundaries and at the IDR frames within slices. As shown in FIG. 4A, the video file is divided into n slices, each slice contains one or more IDR frames, and an IDR frame together with the picture frames before the next IDR frame forms a GOP. A GOP is a group of consecutive pictures in the encoded video stream, i.e., the span between two IDR frames. Different slices may contain the same or different numbers of IDR frames. This application uses a two-pass algorithm as the control algorithm for re-encoding (transcoding): pass 1 determines the frame information of the video file (the slice information and the IDR frame information within each slice), and pass 2 encodes the video file at the different definitions according to that frame information, so that the slices of the encoded data of different definitions are aligned and the IDR frame sequences within corresponding slices are identical. For example, as shown in FIG. 4B, the 720p and 1080p definitions have the same frame information, i.e., their slices are aligned and the encoded data of the same slice has the same timestamp. In some examples, for a given slice, the encoded data of different definitions have the same IDR frame sequence within the slice, as shown in FIG. 5.
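The sketch below expresses this alignment invariant with simple in-memory structures of our own; the class names and the equality check are illustrative assumptions, not structures defined by this application.

```python
from dataclasses import dataclass
from typing import List

@dataclass
class Gop:
    idr_index: int      # index number of the IDR frame that opens this GOP
    start_time: float   # playback start time of the GOP, in seconds
    end_time: float     # playback end time of the GOP, in seconds

@dataclass
class Slice:
    index: int          # slice index number, shared across definitions
    timestamp: float    # slice timestamp = playback start time of the slice
    gops: List[Gop]

def is_aligned(a: List[Slice], b: List[Slice]) -> bool:
    """Check that two definitions of the same video are slice- and IDR-aligned."""
    if len(a) != len(b):
        return False
    for sa, sb in zip(a, b):
        if sa.index != sb.index or sa.timestamp != sb.timestamp:
            return False
        ga = [(g.idr_index, g.start_time, g.end_time) for g in sa.gops]
        gb = [(g.idr_index, g.start_time, g.end_time) for g in sb.gops]
        if ga != gb:
            return False
    return True
```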

In some examples, when the playback time of the remaining data in the video buffer is sufficient to cover requesting the j-th slice of the second definition, j may be set to i, and the switch from the first definition to the second definition is performed while the i-th slice of the first definition is playing.

When the playback time of the remaining data in the video buffer is not sufficient to cover requesting the j-th slice of the second definition, j may be set to i + 1; after the i-th slice of the first definition is played, the (i+1)-th slice of the first definition is played, and the switch from the first definition to the second definition is performed while that (i+1)-th slice is playing.

With the video switching method provided by this application, the encoded data of different definitions of one video file have the same frame information, that is, their slices are aligned. When the definition is switched during playback, the current playback is not interrupted, and the encoded data of the slice of the second definition is requested, decoded and played according to the slice index number. Because slices with the same index number have the same timestamp, switching between definitions is achieved with a truly smooth, seamless effect.

Furthermore, in some examples of this application, the IDR frame sequences in corresponding slices of the video data of different definitions are also aligned, i.e., IDR frames with the same index number have the same timestamp. Therefore, during switching, intra-slice switching can be chosen in addition to inter-slice switching, so playback can move to the second-definition video data sooner and switching efficiency is improved.

The procedure of intra-slice switching is described below with reference to FIG. 6.

The next group of pictures of the j-th slice in the video data of the second definition is played according to the group of pictures of the j-th slice in the video data of the first definition that is currently being played.

Here the group of pictures is a GOP. In this example, the video switching method uses the intra-slice switching strategy to switch from the first definition to the second definition. As shown in FIG. 6, with ultra high definition as the first definition and high definition as the second definition, the switch takes place within slice T, between the T-th slice of the first definition and the T-th slice of the second definition. Playback has currently reached the I-th GOP of the ultra-high-definition slice; after that GOP finishes, the packet-reading thread of the video player reads the (I+1)-th GOP of the high-definition slice and plays it. Within the T-th slice, each ultra-high-definition GOP is aligned with the corresponding high-definition GOP, i.e., corresponding GOPs have the same playback start and end times.

In some examples, each group of pictures includes a reference frame;

playing the next group of pictures of the j-th slice in the video data of the second definition according to the group of pictures of the j-th slice in the video data of the first definition that is currently being played includes:

determining the index number m of the reference frame contained in the group of pictures of the j-th slice in the video data of the first definition that is currently being played; and

requesting, according to the index number m, the group of pictures corresponding to the reference frame with index number m+1 in the video data of the second definition, and playing that group of pictures.
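A minimal sketch of this intra-slice handover point, reusing the Gop structure from the alignment sketch above; the function name and error handling are our own assumptions.

```python
def next_second_definition_gop(second_def_gops, playing_idr_index: int):
    """Return the GOP of the second-definition slice j that is opened by the
    IDR frame with index m + 1, where m is the index of the IDR frame in the
    GOP currently playing in the first-definition slice j."""
    target = playing_idr_index + 1
    for gop in second_def_gops:   # e.g. the gops list of the second-definition slice j
        if gop.idr_index == target:
            return gop
    raise LookupError(f"no GOP opened by IDR frame {target} in the second-definition slice")
```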

The reference frame is an IDR frame; one IDR frame and the frames before the next IDR frame form a GOP, and the B frames and P frames in a GOP are encoded with reference to its IDR frame. When the intra-slice switching strategy is used, this example shows how the corresponding group of pictures of the second definition is determined from the currently playing group of pictures of the first definition. For intra-slice switching, the slice index of the first definition and the slice index of the second definition are the same, e.g., both j. Each GOP contains one IDR frame; if the IDR frame contained in the currently playing GOP of the j-th slice of the first definition has index number m, the index number of the corresponding IDR frame in the second definition is m + 1. The IDR frame with index number m + 1 in the j-th slice of the second definition is located, and the GOP corresponding to that IDR frame is determined. When switching, this second-definition GOP is played after the first-definition GOP finishes. For example, while the T-th slice of the ultra high definition is playing and the player receives a request to switch to high definition, the player requests the T-th slice of the second definition from the new playback address. When the T-th slice of the second definition is requested, it is determined that packet reading and decoding of the current first-definition T-th slice has reached the I-th GOP; after the I-th GOP has been read, packet reading and decoding switch directly to the (I+1)-th GOP of the second-definition T-th slice.

The switching procedures provided by this application are described below with reference to FIG. 8A and FIG. 8B. When the intra-slice switching strategy is used, the flow of the terminal device 101 switching from the first definition to the second definition is shown in FIG. 8A; the method mainly includes the following steps:

S801: play the T-th slice of the first definition.

S802: while the T-th slice of the first definition is playing, receive an instruction to switch to the second definition.

S803: judge whether the remaining data of the T-th slice in the video buffer is enough to cover requesting the T-th slice of the second definition, that is, whether the remaining first-definition data in the video buffer exceeds the preset value; if so, execute step S804, otherwise execute step S807.

S804: request the T-th slice of the second definition.

S805: determine the I-th GOP of the first-definition T-th slice that is currently being played.

S806: after the I-th GOP of the first-definition T-th slice has been played, play the (I+1)-th GOP of the second-definition T-th slice.

S807: request the (T+1)-th slice of the second definition.

S808: after the T-th slice of the first definition has been played, play the (T+1)-th slice of the first definition.

S809: determine the I-th GOP of the first-definition (T+1)-th slice that is currently being played.

S810: after the I-th GOP of the first-definition (T+1)-th slice has been played, play the (I+1)-th GOP of the second-definition (T+1)-th slice.

When the video switching method provided by this application uses the inter-slice switching strategy, the flow of the terminal device 101 switching from the first definition to the second definition is shown in FIG. 8B; the method mainly includes the following steps:

S8001: play the T-th slice of the first definition.

S8002: while the T-th slice of the first definition is playing, receive an instruction to switch to the second definition.

S8003: judge whether the remaining data of the T-th slice in the video buffer is enough to cover requesting the (T+1)-th slice of the second definition, that is, whether the remaining first-definition data in the video buffer exceeds the preset value; if so, execute step S8004, otherwise execute step S8006.

S8004: request the (T+1)-th slice of the second definition.

S8005: after the T-th slice of the first definition has been played, play the (T+1)-th slice of the second definition.

S8006: request the (T+2)-th slice of the second definition.

S8007: after the T-th slice of the first definition has been played, play the (T+1)-th slice of the first definition, and then play the (T+2)-th slice of the second definition.
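The sketch below condenses steps S8001 to S8007; the play and request callbacks are placeholders standing in for a real player and network layer, not interfaces defined by this application.

```python
def inter_slice_switch(play, request, t: int,
                       unplayed_bytes: int, threshold_bytes: int) -> None:
    """Steps S8001-S8007: switch definitions at a slice boundary.

    `play(definition, index)` and `request(definition, index)` are assumed
    callbacks into the player and the network layer respectively."""
    if unplayed_bytes > threshold_bytes:
        request("second", t + 1)   # S8004: fetch the (T+1)-th slice of the second definition
        play("first", t)           # finish the currently playing T-th slice
        play("second", t + 1)      # S8005: continue in the second definition
    else:
        request("second", t + 2)   # S8006: fetch the (T+2)-th slice of the second definition
        play("first", t)           # finish the currently playing T-th slice
        play("first", t + 1)       # bridge with one more first-definition slice
        play("second", t + 2)      # S8007: continue in the second definition

# Minimal dry run with print-based stand-ins for the player and the network layer
inter_slice_switch(lambda d, n: print("play", d, n),
                   lambda d, n: print("request", d, n),
                   t=5, unplayed_bytes=200_000, threshold_bytes=1_000_000)
```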

The present application further provides a video processing method applied to the server 102. As shown in FIG. 9, it includes the following steps:

S901: receive a request message for the j-th slice in the video data of the second definition, sent by a terminal device in response to an instruction to switch to the second definition received while the i-th slice in the video data of the first definition is playing, wherein the request carries the index number of the i-th slice, i and j are positive integers, and j ≥ i.

S902: send the j-th slice to the terminal device, wherein slices with the same index number in the video data of the first definition and the video data of the second definition have the same slice timestamp.

The related steps in this example correspond to the terminal-side steps S201 to S203 described above and are not repeated here.

In some examples, the video processing method provided by this application further includes the following steps, as shown in FIG. 10:

S1001: divide the video file into a plurality of slices according to playback time, each slice having a slice timestamp.

When transcoding, the server 102 uses the two-pass algorithm. In pass 1, the frame information of the video file is determined; the frame information mainly includes the slice information and the information about the IDR frames within each slice. The transcoding parameters of pass 1 are also determined and may include frame rate, bit rate and resolution, chosen so that pass 2 can transcode to all definitions or formats. Slicing may be done by time, for example 150 s per slice, with at least one GOP per slice; the GOPs within a slice can be determined from the encoding parameters set for pass 1, for example from the frame rate: with a frame rate of 25 frames per second, 25 frames may be grouped into one GOP. The slice timestamp of each slice may be the playback start time of the video content corresponding to that slice.
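A minimal sketch of this time-based slicing, using the 150 s slice length and 25 fps frame rate mentioned above as example values; the one-GOP-per-second simplification and the output layout are our own.

```python
def build_slices(duration_s: float, slice_len_s: float = 150.0,
                 frame_rate: int = 25) -> list:
    """Split a video of duration_s seconds into time-based slices.

    Each slice records its index number, its slice timestamp (the playback
    start time of the slice) and the range of IDR frames it contains. One GOP
    is taken to be frame_rate frames, i.e. one second of content, as in the
    example above; a real transcoder would also account for scene cuts."""
    slices = []
    t, index, idr_index = 0.0, 1, 0
    while t < duration_s:
        end = min(t + slice_len_s, duration_s)
        gop_count = max(1, int(round(end - t)))   # one GOP per second of content
        slices.append({
            "index": index,
            "timestamp": t,                 # slice timestamp = playback start time
            "first_idr_index": idr_index,
            "gop_count": gop_count,
        })
        idr_index += gop_count
        index += 1
        t = end
    return slices

# Example: a 400 s video yields slices starting at 0 s, 150 s and 300 s
for s in build_slices(400):
    print(s)
```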

S1002: encode the plurality of slices according to a preset first definition to form the plurality of slices of the video data of the first definition.

S1003: encode the plurality of slices according to a preset second definition to form the plurality of slices of the video data of the second definition.

In steps S1002 and S1003 above, after the frame information of the video has been determined, the video file is transcoded at the different definitions according to that frame information, forming encoded data of different definitions; this is the pass 2 transcoding. During pass 2, the frame rate is kept the same as in pass 1, and the resolution for each definition is determined by that definition. Because pass 1 and pass 2 use the same frame rate, the bit rate for each definition can be derived from its resolution together with the pass 1 resolution and pass 1 bit rate: the ratio of the definition's resolution to the pass 1 resolution is multiplied by the pass 1 bit rate to obtain the bit rate of that definition. The bit rate, frame rate and resolution of each definition form its pass 2 encoding parameters, and the video file is encoded according to these parameters and the frame information determined in step S1001 to form encoded data of different definitions. In the encoded data of different definitions, slices with the same index number have the same timestamp, i.e., the same playback start time.
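A minimal sketch of this parameter derivation; interpreting the "resolution ratio" as a ratio of pixel counts is our own reading, and the names are illustrative.

```python
def pass2_params(pass1_width: int, pass1_height: int, pass1_bitrate_kbps: float,
                 frame_rate: int, targets: dict) -> dict:
    """Derive pass-2 encoding parameters (resolution, frame rate, bit rate)
    for each definition from the pass-1 parameters.

    `targets` maps a definition name to its (width, height). The bit rate is
    scaled by the pixel-count ratio relative to the pass-1 resolution."""
    pass1_pixels = pass1_width * pass1_height
    params = {}
    for name, (w, h) in targets.items():
        ratio = (w * h) / pass1_pixels
        params[name] = {
            "width": w,
            "height": h,
            "frame_rate": frame_rate,                    # same frame rate as pass 1
            "bitrate_kbps": pass1_bitrate_kbps * ratio,  # scaled bit rate
        }
    return params

# Example: pass 1 run at 1920x1080, 5000 kbps, 25 fps
print(pass2_params(1920, 1080, 5000, 25, {"720p": (1280, 720), "1080p": (1920, 1080)}))
```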

With the video processing method provided by this application, the encoded data of different definitions of one video file have the same frame information, that is, their slices are aligned. Therefore, when the definition is switched, the current playback is not interrupted, and seamless switching between definitions can be achieved according to the slice index number.

In some examples, each slice in the video data of the first definition and each slice in the video data of the second definition contains one or more groups of pictures, and a group of pictures in the video data of the first definition and the corresponding group of pictures in the video data of the second definition have the same playback start time and playback end time.

In this example, a slice contains a plurality of groups of pictures, and corresponding groups of pictures of different definitions have the same playback start and end times; this has already been described in detail for the terminal-side method and is not repeated here.

In some examples, each group of pictures contains a reference frame, and reference frames with the same index number in the video data of the first definition and the video data of the second definition have the same reference frame timestamp.

Here the reference frame is an IDR frame. After the slice information of the video file has been determined, the IDR frames within each slice are determined as well, and the pictures between one IDR frame and the next form a GOP. Encoding is performed in units of GOPs: within a GOP, encoding uses the IDR frame as the reference; the IDR frame in the encoded data carries the full picture information, while the B frames and P frames of the GOP are encoded as changes relative to the I frame. The encoded data of a slice therefore includes the encoded data corresponding to each of the one or more reference frames contained in the slice. In the encoded data of different definitions, IDR frames with the same index number have the same reference frame timestamp, i.e., the same playback start time.

When the method is applied to a live streaming scenario, transcoding keeps the slices of each definition of the live stream strictly aligned, with the IDR frames within each slice also strictly aligned, so the stream can switch directly to a slice of the new definition. Whether the data stream is switched between slices or within a slice, the timestamps line up, and no picture discontinuity or stutter appears. During switching, the current playback need not be interrupted; the UI only needs to be notified after the switch completes, achieving a truly seamless and smooth effect.

The present application further provides a video switching apparatus 1100, as shown in FIG. 11, comprising:

a first receiving unit 1101, configured to receive an instruction to switch to a second definition when an i-th slice in video data of a first definition is being played;

a requesting unit 1102, configured to request a j-th slice in video data of the second definition from a server according to the index number i of the i-th slice, wherein i and j are positive integers, j ≥ i, and slices with the same index number in the video data of the first definition and the video data of the second definition have the same slice timestamp; and

a second receiving unit 1103, configured to receive the j-th slice in the video data of the second definition.

In some examples, each slice in the video data of the first definition and each slice in the video data of the second definition contains one or more groups of pictures, and a group of pictures in the video data of the first definition and the corresponding group of pictures in the video data of the second definition have the same playback start time and playback end time;

the apparatus further comprises a playing unit 1104, configured to play the next group of pictures of the j-th slice in the video data of the second definition according to the group of pictures of the j-th slice in the video data of the first definition that is currently being played.

In some examples, each group of pictures includes a reference frame, and the playing unit 1104 is further configured to:

determine the index number m of the reference frame contained in the group of pictures of the j-th slice in the video data of the first definition that is currently being played; and

request, according to the index number m, the group of pictures corresponding to the reference frame with index number m+1 in the video data of the second definition, and play that group of pictures.

In some examples, the playing unit 1104 is further configured to: play the j-th slice in the video data of the second definition after the j-th slice in the video data of the first definition has been played.

In some examples, the requesting unit 1102 is configured to: acquire the amount of unplayed video data of the i-th slice in the video buffer; and determine the index number of the j-th slice according to that amount and request the j-th slice from the server.

In some examples, the request unit 1102 is configured to: when the data volume is smaller than a preset value, determining that j is i +1 or j is i + 2; and when the data quantity is larger than a preset value, determining that j is i or j is i + 1.

In some examples, the slice timestamp represents the playback start time corresponding to the slice.

The present application also provides a video processing apparatus 1200, as shown in FIG. 12, which mainly includes:

a receiving unit 1201, configured to receive a request message for a j-th slice in video data of a second definition, the request message being sent by a terminal device in response to an instruction to switch to the second definition received while the i-th slice in the video data of the first definition is playing, wherein the request carries the index number of the i-th slice, i and j are positive integers, and j ≥ i; and

a sending unit 1202, configured to send the j-th slice to the terminal device, wherein slices with the same index number in the video data of the first definition and the video data of the second definition have the same slice timestamp.

In some examples, the apparatus further includes a transcoding unit 1203, configured to: divide the video file into a plurality of slices according to playback time, each slice having a slice timestamp; encode the plurality of slices according to a preset first definition to form the plurality of slices of the video data of the first definition; and encode the plurality of slices according to a preset second definition to form the plurality of slices of the video data of the second definition.

In some examples, each slice in the video data of the first definition and each slice in the video data of the second definition contains one or more groups of pictures, and a group of pictures in the video data of the first definition and the corresponding group of pictures in the video data of the second definition have the same playback start time and playback end time.

In some examples, each group of pictures contains a reference frame, and reference frames with the same index number in the video data of the first definition and the video data of the second definition have the same reference frame timestamp.

The present application also provides a computer-readable storage medium storing computer-readable instructions that can cause at least one processor to perform the methods described above.

FIG. 13 shows the structure of the computing device in which the video switching apparatus 1100 is located. As shown in FIG. 13, the computing device includes one or more processors (CPUs) 1302, a communication module 1304, a memory 1306, a user interface 1310, and a communication bus 1308 interconnecting these components.

The processor 1302 may receive and transmit data via the communication module 1304 to enable network communications and/or local communications.

The user interface 1310 includes one or more output devices 1312 including one or more speakers and/or one or more visual displays. The user interface 1310 also includes one or more input devices 1314, including, for example, a keyboard, mouse, voice command input unit or microphone, touch screen display, touch sensitive tablet, gesture capture camera or other input buttons or controls, and the like.

The memory 1306 may be a high-speed random access memory such as DRAM, SRAM, DDR RAM, or other random access solid state memory devices; or non-volatile memory, such as one or more magnetic disk storage devices, optical disk storage devices, flash memory devices, or other non-volatile solid-state storage devices.

Memory 1306 stores a set of instructions executable by processor 1302, including:

An operating system 1316, including programs for handling various basic system services and for performing hardware related tasks;

an application 1318, which includes some or all of the units or modules of the video switching apparatus 1100. At least one unit of the video switching apparatus 1100 or the video processing apparatus 1200 may store machine-executable instructions. The processor 1302 implements the functionality of at least one of the units or modules described above by executing the machine-executable instructions in at least one of the units in the memory 1306.

FIG. 14 shows the structure of the computing device in which the video processing apparatus 1200 is located. As shown in FIG. 14, the computing device includes one or more processors (CPUs) 1402, a communication module 1404, a memory 1406, a user interface 1410, and a communication bus 1408 interconnecting these components.

The processor 1402 can receive and transmit data via the communication module 1404 to enable network communication and/or local communication.

User interface 1410 includes one or more output devices 1412 including one or more speakers and/or one or more visual displays. User interface 1410 also includes one or more input devices 1414, including, for example, a keyboard, a mouse, a voice command input unit or microphone, a touch screen display, a touch-sensitive input pad, a gesture-capture camera or other input buttons or controls, and the like.

Memory 1406 may be high speed random access memory such as DRAM, SRAM, DDR RAM, or other random access solid state memory devices; or non-volatile memory, such as one or more magnetic disk storage devices, optical disk storage devices, flash memory devices, or other non-volatile solid-state storage devices.

The memory 1406 stores a set of instructions executable by the processor 1402, including:

an operating system 1416, including programs for handling various basic system services and for performing hardware related tasks;

an application 1418, which includes some or all of the units or modules of the video processing apparatus 1200. At least one unit of the video processing apparatus 1200 may store machine-executable instructions. The processor 1402 implements the functionality of at least one of the units or modules described above by executing the machine-executable instructions in at least one of the units in the memory 1406.

It should be noted that not all of the steps and modules in the above flows and structures are necessary; some steps or modules may be omitted according to actual needs. The order in which the steps are executed is not fixed and can be adjusted as required. The division into modules is only a functional division adopted for convenience of description; in an actual implementation, one module may be realized by several modules, the functions of several modules may be realized by one module, and these modules may be located in the same device or in different devices.

The hardware modules in the embodiments may be implemented in hardware or a hardware platform plus software. The software includes machine-readable instructions stored on a non-volatile storage medium. Thus, embodiments may also be embodied as software products.

In various examples, the hardware may be implemented by specialized hardware or hardware executing machine-readable instructions. For example, the hardware may be specially designed permanent circuits or logic devices (e.g., special purpose processors, such as FPGAs or ASICs) for performing the specified operations. Hardware may also include programmable logic devices or circuits temporarily configured by software (e.g., including a general purpose processor or other programmable processor) to perform certain operations.

In addition, each example of the present application can be realized by a data processing program executed by a data processing apparatus such as a computer; clearly, such a data processing program constitutes the present application. A data processing program is usually stored on a storage medium and is executed either by reading it directly from the storage medium or by installing or copying it onto a storage device (such as a hard disk and/or memory) of the data processing apparatus. Such a storage medium therefore also constitutes the present application. The present application also provides a non-volatile storage medium storing a data processing program that can be used to carry out any one of the above method examples of the present application.

Machine-readable instructions corresponding to the modules in FIG. 13 or FIG. 14 can cause an operating system or the like running on a computer to perform some or all of the operations described herein. The non-volatile computer-readable storage medium may be a memory in an expansion board inserted into the computer, or the instructions may be written to a memory in an expansion unit connected to the computer. A CPU or the like mounted on the expansion board or the expansion unit can then perform some or all of the actual operations according to the instructions.

The above description is only for the purpose of illustrating the preferred embodiments of the present invention and is not to be construed as limiting the invention, and any modifications, equivalents, improvements and the like made within the spirit and principle of the present invention should be included in the scope of the present invention.
