Video playing method, video player and computer storage medium

Document No.: 1524380 | Published: 2020-02-11

Abstract: This technology, a video playing method, video player and computer storage medium, was designed by 徐明伟, 孟子立, 陈婧, 郭雅宁 and 孙晨 on 2019-09-16. The invention discloses a video playing method, a video player and a computer storage medium. The method is applied to a mobile terminal and comprises: playing a predetermined video composed of a plurality of consecutive video blocks; for each video block in the predetermined video, calculating a first playing state of the video block and using an ABR algorithm to calculate, from the first playing state, the corresponding bit rate decision for the next video block; taking the first playing states of all video blocks in the predetermined video and their corresponding actions as a training data set; generating a decision tree from the training data set using the CART algorithm; and deploying the decision tree into a video player of the mobile terminal so that the video player plays video at the bit rate obtained from the decision tree. The invention converts a complex ABR algorithm into an algorithmically simple decision tree, and a video player on the mobile terminal that plays video at the bit rate obtained from this decision tree can greatly improve the user experience.

1. A video playing method applied to a mobile terminal, characterized by comprising the following steps:

playing a predetermined video, wherein the predetermined video is composed of a plurality of continuous video blocks;

for each video block in the predetermined video, performing the steps of:

calculating a first playing state of the video block;

calculating an action corresponding to the first playing state according to the first playing state by adopting an ABR algorithm, wherein the action corresponding to the first playing state is a bit rate decision of a next video block of the video block;

taking the first playing states and corresponding actions of all video blocks in the predetermined video as a first training data set;

generating a decision tree for determining a bit rate for playing the video based on the first training data set using a CART algorithm;

deploying the decision tree into a video player of a mobile terminal;

and sending, by the video player, a request to a preset video server and, after receiving a message fed back by the video server indicating that the request has passed, playing the video fed back by the video server at the bit rate obtained from the decision tree based on the current network state.

2. The video playback method of claim 1, further comprising:

optimizing the decision tree;

wherein deploying the decision tree into the video player of the mobile terminal comprises: deploying the optimized decision tree into the video player of the mobile terminal; and

playing the video fed back by the video server at the bit rate obtained from the decision tree based on the current network state comprises: playing the video fed back by the video server at the bit rate obtained from the optimized decision tree based on the current network state.

3. The video playback method of claim 2, wherein the generating a decision tree based on the first training data set using the CART algorithm comprises:

using the greedy algorithm in the CART algorithm, selecting playing states in the first training data set as data features to construct leaf nodes, until the number of leaf nodes reaches a first preset threshold or the Gini coefficient of the first training data set is smaller than a second preset threshold.

4. The video playback method of claim 3, wherein the loss function used in generating the decision tree is l(r; r_0):

Figure FDA0002202899220000021

where r = π(s), r_0 = π*(s), π is the currently generated decision tree, π* is the ABR algorithm, and s is the playing state of the video block; R_max is a preset maximum bit rate and R_min is a preset minimum bit rate.

5. The video playback method of claim 4, wherein optimizing the decision tree comprises:

S1: playing the predetermined video;

for each video block in the predetermined video, performing steps S2 and S3:

S2: calculating a second playing state of the video block based on the decision tree;

S3: calculating an action corresponding to the second playing state according to the second playing state by adopting the ABR algorithm, wherein the action corresponding to the second playing state is a bit rate decision of a next video block of the video block;

S4: aggregating the first playing states and the second playing states of all video blocks in the predetermined video to obtain the playing states of the optimization method;

S5: aggregating the actions corresponding to the first playing states of all video blocks in the predetermined video and the actions corresponding to the second playing states of all video blocks to obtain the decision actions of the optimization method;

S6: taking the playing states of the optimization method and the decision actions of the optimization method as a second training data set;

S7: taking a decision tree generated by the CART algorithm based on the second training data set as the optimized decision tree;

repeating steps S1 to S7 until a preset maximum number of iterations is reached.

6. A video player applied to a mobile terminal, comprising:

a video playing module for playing a predetermined video, wherein the predetermined video consists of a plurality of continuous video blocks;

a calculation module for performing the following steps for each video block in the predetermined video:

calculating a first playing state of the video block;

calculating an action corresponding to the first playing state according to the first playing state by adopting an ABR algorithm, wherein the action corresponding to the first playing state is a bit rate decision of a next video block of the video block;

a first training data set acquisition module, configured to use the first playing states and corresponding actions of all video blocks in the predetermined video as a first training data set;

a decision tree generation module for generating a decision tree for determining a bit rate for playing a video based on the first training data set by using a CART algorithm;

a deployment module for deploying the decision tree into the video player;

and a transceiver module for sending a request to a preset video server and, after receiving a message fed back by the video server indicating that the request has passed, notifying the video playing module to play the video fed back by the video server at the bit rate obtained from the decision tree based on the current network state.

7. The video player of claim 6, further comprising:

an optimization module for optimizing the decision tree,

the deployment module is further configured to deploy the optimized decision tree into the video player,

the video playing module is also used for playing the video fed back by the video server according to the bit rate obtained by the optimized decision tree based on the current network state.

8. The video player of claim 7, wherein the decision tree generation module is configured to use the greedy algorithm in the CART algorithm to select playing states in the first training data set as data features to construct leaf nodes, until the number of leaf nodes reaches a first preset threshold or the Gini coefficient of the first training data set is smaller than a second preset threshold.

9. The video player of claim 8, wherein the loss function employed by the decision tree generation module is l(r; r_0):

Figure FDA0002202899220000031

where r = π(s), r_0 = π*(s), π is the currently generated decision tree, π* is the ABR algorithm, and s is the current playing state of the video; R_max is a preset maximum bit rate and R_min is a preset minimum bit rate.

10. A computer storage medium having a computer program stored thereon, wherein the computer program, when executed by a processor, implements a video playback method as claimed in any one of claims 1 to 5.

Technical Field

The present invention relates to the field of internet information technologies, and in particular, to a video playing method, a video player, and a computer storage medium.

Background

In existing network systems, video traffic accounts for a significant portion of total network traffic, and the demand for online video transmission has increased dramatically in recent years. Adaptive bit-rate (ABR) techniques have evolved to optimize the quality of the videos that users watch online. ABR techniques were first proposed by academia in 2011 to optimize users' Quality of Experience (QoE). Briefly, an ABR algorithm running at the client selects the bit rate best suited to the user for video transmission, based on an estimate of the current network conditions. With ABR, a user watching video online can make full use of the currently available network bandwidth while avoiding stalls as far as possible, which improves the user's quality of experience.

In practical deployment, the ABR algorithm must be carefully optimized under the combined effect of differing QoE requirements (some users want the highest possible video quality as long as playback does not stall, while others prefer the opposite trade-off), fluctuating network throughput (future throughput is difficult to predict accurately), and the correlation between decisions (in a sequential decision process, the decisions depend on one another). Various optimization schemes for ABR algorithms exist in the prior art, such as mixed-integer linear programming (MILP), Lyapunov optimization and deep neural networks, to optimize the performance of online video playback.

However, these optimization schemes make the ABR algorithm hard to deploy in practice. At present, most video is played on mobile terminals. Because ABR optimization algorithms are complex while the computing resources of a mobile terminal are usually very limited, the terminal can hardly support solving such complex optimization problems. It is therefore difficult for a video content provider to integrate the ABR algorithm directly into an HTML page and deploy it in a client player, as is done conventionally, and the problem will become more severe as ABR optimization methods grow more complex.

Disclosure of Invention

The technical problem to be solved by the invention is as follows: the ABR optimization algorithm in the prior art is complex, so that the ABR optimization algorithm cannot be directly deployed in a client player, and the watching experience of a user is poor during video playing.

In order to solve the above technical problem, the present invention provides a video playing method, which is applied to a mobile terminal, and comprises:

playing a predetermined video, wherein the predetermined video is composed of a plurality of continuous video blocks;

for each video block in the predetermined video, performing the steps of:

calculating a first playing state of the video block;

calculating an action corresponding to the first playing state according to the first playing state by adopting an ABR algorithm, wherein the action corresponding to the first playing state is a bit rate decision of a next video block of the video block;

taking the first playing states and corresponding actions of all video blocks in the predetermined video as a first training data set;

generating a decision tree for determining a bit rate for playing the video based on the first training data set using a CART algorithm;

deploying the decision tree into a video player of a mobile terminal;

and sending, by the video player, a request to a preset video server and, after receiving a message fed back by the video server indicating that the request has passed, playing the video fed back by the video server at the bit rate obtained from the decision tree based on the current network state.

Further, the method further comprises:

optimizing the decision tree;

wherein deploying the decision tree into the video player of the mobile terminal comprises: deploying the optimized decision tree into the video player of the mobile terminal; and

playing the video fed back by the video server at the bit rate obtained from the decision tree based on the current network state comprises: playing the video fed back by the video server at the bit rate obtained from the optimized decision tree based on the current network state.

Further, the generating a decision tree based on the first training data set by using the CART algorithm includes:

using the greedy algorithm in the CART algorithm, selecting playing states in the first training data set as data features to construct leaf nodes, until the number of leaf nodes reaches a first preset threshold or the Gini coefficient of the first training data set is smaller than a second preset threshold.

Preferably, the loss function employed in generating the decision tree is l(r; r_0):

Figure BDA0002202899230000031

where r = π(s), r_0 = π*(s), π is the currently generated decision tree, π* is the ABR algorithm, and s is the playing state of the video block; R_max is a preset maximum bit rate and R_min is a preset minimum bit rate.

Further, optimizing the decision tree includes:

S1: playing the predetermined video;

for each video block in the predetermined video, performing steps S2 and S3:

S2: calculating a second playing state of the video block based on the decision tree;

S3: calculating an action corresponding to the second playing state according to the second playing state by adopting the ABR algorithm, wherein the action corresponding to the second playing state is a bit rate decision of a next video block of the video block;

S4: aggregating the first playing states and the second playing states of all video blocks in the predetermined video to obtain the playing states of the optimization method;

S5: aggregating the actions corresponding to the first playing states of all video blocks in the predetermined video and the actions corresponding to the second playing states of all video blocks to obtain the decision actions of the optimization method;

S6: taking the playing states of the optimization method and the decision actions of the optimization method as a second training data set;

S7: taking a decision tree generated by the CART algorithm based on the second training data set as the optimized decision tree;

repeating steps S1 to S7 until a preset maximum number of iterations is reached.

The invention also provides a video player applied to the mobile terminal, which comprises:

a video playing module for playing a predetermined video, wherein the predetermined video consists of a plurality of continuous video blocks;

a calculation module for performing the following steps for each video block in the predetermined video:

calculating a first playing state of the video block;

calculating an action corresponding to the first playing state according to the first playing state by adopting an ABR algorithm, wherein the action corresponding to the first playing state is a bit rate decision of a next video block of the video block;

a first training data set acquisition module, configured to use the first playing states and corresponding actions of all video blocks in the predetermined video as a first training data set;

a decision tree generation module for generating a decision tree for determining a bit rate for playing a video based on the first training data set by using a CART algorithm;

a deployment module for deploying the decision tree into the video player;

and a transceiver module for sending a request to a preset video server and, after receiving a message fed back by the video server indicating that the request has passed, notifying the video playing module to play the video fed back by the video server at the bit rate obtained from the decision tree based on the current network state.

Further, the video player further includes:

an optimization module for optimizing the decision tree,

the deployment module is further configured to deploy the optimized decision tree into a video player,

the video playing module is also used for playing the video fed back by the video server according to the bit rate obtained by the optimized decision tree based on the current network state.

Further, the decision tree generation module is configured to select, in the CART algorithm, a playing state in the first training data set as a data feature by using a greedy algorithm to construct a leaf node until the number of leaf nodes reaches a first preset threshold or a Gini coefficient of the first training data set is smaller than a second preset threshold.

Preferably, the loss function employed by the decision tree generation module is l(r; r_0):

Figure BDA0002202899230000041

where r = π(s), r_0 = π*(s), π is the currently generated decision tree, π* is the ABR algorithm, and s is the current playing state of the video; R_max is a preset maximum bit rate and R_min is a preset minimum bit rate.

The present invention also provides a computer storage medium having a computer program stored thereon, the computer program, when executed by a processor, implementing any of the video playback methods described above.

Compared with the prior art, one or more embodiments in the above scheme can have the following advantages or beneficial effects:

By applying the video playing method, the original, computationally expensive ABR algorithm is converted into a simple and lightweight decision tree, which greatly reduces the consumption of computing resources and shortens the decision delay.

Drawings

The scope of the present disclosure may be better understood by reading the following detailed description of exemplary embodiments in conjunction with the accompanying drawings. Wherein the included drawings are:

FIG. 1 is a first flowchart of a method according to an embodiment of the present invention;

FIG. 2 is a second flowchart of a method according to an embodiment of the present invention;

FIG. 3 is a first block diagram of a system according to an embodiment of the present invention;

FIG. 4 is a second block diagram of the system according to the embodiment of the present invention;

FIG. 5 is a diagram illustrating a structure of a decision tree and a decision effect thereof according to an embodiment of the present invention;

FIG. 6 is a diagram illustrating predicted effects of an unoptimized decision tree in an embodiment of the present invention;

FIG. 7 is a schematic diagram of an algorithm according to an embodiment of the present invention.

Detailed Description

In order to make the objects, technical solutions and advantages of the present invention clearer, the following will describe in detail an implementation method of the present invention with reference to the accompanying drawings and embodiments, so that how to apply technical means to solve the technical problems and achieve the technical effects can be fully understood and implemented.

The design goal of the invention is to convert a complex ABR algorithm into a lightweight and efficient model that can be deployed online, while ensuring that the performance of the converted model does not differ from that of the original ABR algorithm. Linear fitting, nonlinear fitting, policy summarization and similar methods are all candidates for the target model. The invention ultimately adopts a decision tree as the target model for the following reasons:

(1) The decision tree is highly expressive. Because a decision tree is a non-parametric representation, it can express complex decision logic. This expressiveness preserves the performance of the ABR algorithm through the conversion. As shown in FIG. 5, even if the decision boundary in the state space is highly non-linear, the decision tree can approximate it with high fidelity, because it can flexibly refine its decision granularity where needed.

(2) The decision tree is sufficiently lightweight. Since a binary decision tree consists of a series of conditional judgments, a network administrator can implement it in a lightweight manner with branch statements in JavaScript at deployment time. Deploying a decision tree with 100 leaf nodes increases the size of the HTML page by less than 1%.
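As an illustration only, the following sketch shows how a binary decision tree might be rendered as nested JavaScript branch statements. It is written in Python; the tree layout (dictionaries with `feature`, `threshold`, `left`, `right` and `bitrate` keys), the function names `tree_to_js` and `export_js`, and the feature names and thresholds are all hypothetical and are not taken from the embodiment.

```python
def tree_to_js(node, indent=1):
    """Render a binary decision tree as nested JavaScript if/else statements.

    A leaf is {"bitrate": kbps}; an internal node is
    {"feature": name, "threshold": t, "left": subtree, "right": subtree},
    where the left branch is taken when state[feature] <= threshold.
    """
    pad = "  " * indent
    if "bitrate" in node:
        return f"{pad}return {node['bitrate']};\n"
    return (
        f"{pad}if (state.{node['feature']} <= {node['threshold']}) {{\n"
        + tree_to_js(node["left"], indent + 1)
        + f"{pad}}} else {{\n"
        + tree_to_js(node["right"], indent + 1)
        + f"{pad}}}\n"
    )


def export_js(tree, name="selectBitrate"):
    """Wrap the rendered branches in a JavaScript function definition."""
    return f"function {name}(state) {{\n{tree_to_js(tree)}}}\n"


# Hypothetical two-feature tree; feature names and thresholds are made up.
example_tree = {
    "feature": "bufferSec", "threshold": 5.0,
    "left": {"bitrate": 300},
    "right": {
        "feature": "throughputKbps", "threshold": 2000,
        "left": {"bitrate": 1200},
        "right": {"bitrate": 2850},
    },
}

js_source = export_js(example_tree)
print(js_source)
```

Each leaf adds only one `return` statement and each internal node one comparison, which is consistent with the small page-size overhead described above.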

(3) The decision logic of a decision tree is similar to that of an ABR algorithm. An ABR algorithm also typically makes decisions by combining a series of conditional judgments. For example, to optimize QoE, a high bit rate should be selected only when the current buffer size and network throughput are both high (to avoid stalling) and the current resolution is also high (to avoid resolution jitter).

However, a decision tree is a supervised learning method designed to optimize a specific loss function (typically the average prediction error). It usually requires a large labeled data set so that it can be optimized over the entire state space. Mathematically, this optimization process can be expressed as:

Figure BDA0002202899230000061

where d_π denotes the probability distribution of states. However, because this distribution is coupled to traffic throughput, video length and the policy itself, it is difficult to compute directly. The dimensionality of the state space is also typically very high (Pensieve's state space has 25 dimensions), making enumeration of all combinations inefficient. Moreover, since states are not uniformly distributed in the real world, uniform sampling of the state space would be biased and would degrade performance. Therefore, we adopt a virtual player and use real network traffic data to simulate the ABR algorithm. A virtual player is fast and efficient compared with packet-level simulation because it computes only video-block-level information. We then collect the state-action pairs produced during simulated playback. Because these data are generated from real-world traffic data, they are unbiased with respect to real production environments.

However, even with a virtual player, converting the ABR algorithm into a decision tree from a given data set remains challenging. Because of the cascading effect in ABR systems, the converted decision tree may perform poorly even when its overall prediction accuracy is high. As shown in FIG. 6, a single wrong decision may take the decision tree into a region of the state space that it never experienced during training. There the decision tree may make further mistakes because it has not learned how that subspace should be handled, which pushes it further off track and degrades performance. To address this challenge, and inspired by recent advances in imitation learning, the present invention repeatedly simulates the decision tree and lets the original ABR algorithm (the teacher) correct the wrong decisions made by the decision tree (the student). Over the iterations, the decision tree gradually learns how to make decisions over the entire state space.

Based on the above analysis, the algorithm of the embodiment of the present invention is shown in FIG. 7. To convert the complex original ABR algorithm into a decision tree, the embodiment uses a virtual player to efficiently simulate the system dynamics of a real video player and uses imitation learning to improve the fidelity of the decision tree. The invention repeatedly simulates the behavior of the decision tree and corrects the errors it makes according to the results of the original ABR algorithm. The following is a schematic of the algorithm code in the embodiment of the present invention:

Based on the above algorithm, the invention provides a video playing method applied to a mobile terminal. As shown in FIG. 1, the method comprises the following steps:

S110, playing a predetermined video, wherein the predetermined video is composed of a plurality of continuous video blocks;

for each video block in the predetermined video, performing step S120 and step S130:

S120, calculating a first playing state of the video block;

S130, calculating an action corresponding to the first playing state by adopting the ABR algorithm to be deployed according to the first playing state, wherein the action corresponding to the first playing state is a bit rate decision of a next video block of the video block;

S140, taking the first playing states and corresponding actions of all video blocks in the predetermined video as a first training data set;

In this embodiment, a preset virtual player plays the predetermined video, which is represented by a network traffic data set and a video manifest divided into video blocks. For each ABR algorithm, this embodiment first simulates the ABR algorithm in the virtual player to collect the initial state-action pairs (S, A) for subsequent decision tree training (line 1 of the algorithm code described above). The decision tree training process in this embodiment is also the process of generating the decision tree. The virtual player is a trace-based, block-level simulator that can accurately simulate the behavior of an actual video player given network traces and a video manifest. For a given ABR algorithm to be deployed, the virtual player takes the network traffic data set and the video manifest as inputs and performs the simulation. In a practical deployment, a content provider may use a public network traffic data set or collect historical data for the simulation. Furthermore, our evaluation shows that the method generalizes well even when the statistics of the network traffic data set used in the training phase differ from those of the network traffic in the test environment.

Specifically, the virtual player continuously calculates the playing state (i.e., the first playing state) of the current video block, which includes parameters such as the current buffer size and the current download time. The original ABR algorithm to be deployed then takes the first playing state and generates a bit rate decision for the next video block; this decision is the action corresponding to the first playing state. The action is sent back to the virtual player, which plays the next video block accordingly and then calculates the state of that block, and so on, until playback of the predetermined video is complete. During this process, the playing state of each video block and its corresponding action are collected; these state-action pairs initialize (S, A) and serve as the training data set for the subsequent generation of the decision tree.
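The state-action collection loop described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the virtual player of the embodiment: the teacher policy `simple_abr` is a made-up stand-in for the ABR algorithm to be deployed, the state fields, bitrate ladder and throughput trace are invented, and each trace entry is assumed to be the average throughput while one video block downloads.

```python
def simple_abr(state, bitrates):
    """Stand-in teacher policy (an assumption, not the patent's ABR algorithm):
    pick the highest bitrate not exceeding the last throughput, stepping down
    one level when the buffer is low."""
    affordable = [i for i, r in enumerate(bitrates) if r <= state["throughput_kbps"]]
    level = affordable[-1] if affordable else 0
    if state["buffer_s"] < 4.0 and level > 0:
        level -= 1
    return level


def simulate(policy, trace_kbps, bitrates, chunk_s=4.0):
    """Block-level playback simulation that records (state, action) pairs.

    trace_kbps[i] is the assumed average throughput while block i downloads.
    Returns the collected pairs and the total stall time in seconds.
    """
    buffer_s, level, pairs, rebuffer_s = 0.0, 0, [], 0.0
    for throughput in trace_kbps:
        # Download the current block at the bitrate chosen for it.
        download_s = bitrates[level] * chunk_s / throughput
        rebuffer_s += max(0.0, download_s - buffer_s)
        buffer_s = max(0.0, buffer_s - download_s) + chunk_s
        # Playing state observed after the block arrives.
        state = {
            "buffer_s": buffer_s,
            "download_s": download_s,
            "throughput_kbps": throughput,
            "last_level": level,
        }
        # The policy's action is the bitrate decision for the next block.
        level = policy(state, bitrates)
        pairs.append((state, level))
    return pairs, rebuffer_s


BITRATES = [300, 750, 1200, 2850]            # kbps, illustrative ladder
TRACE = [1000, 1500, 3000, 3500, 800, 600]   # kbps, made-up trace
pairs, stall = simulate(simple_abr, TRACE, BITRATES)
```

Because the simulator works per video block rather than per packet, one pass over the trace yields one state-action pair per block, which is the shape of data that the decision tree training consumes.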

S150, generating a decision tree based on the first training data set by adopting a CART algorithm;

As shown in line 3 of the above algorithm code, this embodiment first generates a decision tree π (also referred to as the student) from the initialized state-action pairs (S, A) using the Classification and Regression Tree (CART) algorithm. Instead of the 0-1 prediction-accuracy loss used in the prior art (Equation 1), this embodiment uses the normalized square loss as the training loss during generation of the decision tree. The loss function used in this embodiment is as follows:

Figure BDA0002202899230000081

where r = π(s), r_0 = π*(s), π is the currently generated decision tree, π* is the ABR algorithm to be deployed, and s is the current playing state of the video; R_max is a preset maximum bit rate and R_min is a preset minimum bit rate.
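The formula image itself is not reproduced in this text. One form consistent with the description "normalized square loss" and with the symbols defined above would be ((r - r_0) / (R_max - R_min))^2; the sketch below implements that assumed form, and the function name and example bit rates are hypothetical.

```python
def normalized_square_loss(r, r0, r_max, r_min):
    """Assumed form of the loss: the squared bit rate error, normalized by
    the width of the bit rate range so that the result lies in [0, 1]
    whenever r and r0 both lie in [r_min, r_max]."""
    if r_max <= r_min:
        raise ValueError("r_max must be greater than r_min")
    return ((r - r0) / (r_max - r_min)) ** 2


# Illustrative bit rates in kbps (made up for this example).
near = normalized_square_loss(750, 300, 2850, 300)    # student is one step away
far = normalized_square_loss(2850, 300, 2850, 300)    # student is at the other extreme
```

Under a 0-1 loss both mistakes would count equally; under this assumed form the distant mistake costs far more than the near one.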

The rationale for the square loss is to penalize more heavily those bit rate decisions of the student (decision tree) that lie far from the decisions of the teacher (original ABR algorithm), since such errors have a greater impact on video playback quality and the like. Then, the greedy algorithm in the CART algorithm selects playing states in the first training data set as data features to construct leaf nodes so as to minimize the loss function, until the number of leaf nodes reaches a first preset threshold or the Gini coefficient of the first training data set is smaller than a second preset threshold, wherein the first preset threshold is set by the network operator. A Gini coefficient of the first training data set below the second preset threshold indicates that all samples have been completely separated.
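To make the greedy step concrete, the following sketch searches for the single split that most reduces the weighted Gini impurity, which is the standard CART criterion for classification. How the embodiment combines the Gini coefficient with the square loss is not specified in the text, so using Gini impurity as the split score here is an assumption; the function names and sample data are also hypothetical.

```python
from collections import Counter


def gini(labels):
    """Gini impurity of a list of class labels (0.0 means a pure node)."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())


def best_split(rows, labels):
    """Greedy search for the (feature index, threshold) pair with the lowest
    weighted Gini impurity.

    rows is a list of equal-length feature tuples. Returns (split, score),
    where split is None if no candidate improves on the unsplit node.
    """
    best, best_score, n = None, gini(labels), len(rows)
    for f in range(len(rows[0])):
        # Candidate thresholds: every distinct value except the largest.
        for t in sorted({row[f] for row in rows})[:-1]:
            left = [y for row, y in zip(rows, labels) if row[f] <= t]
            right = [y for row, y in zip(rows, labels) if row[f] > t]
            score = (len(left) * gini(left) + len(right) * gini(right)) / n
            if score < best_score:
                best, best_score = (f, t), score
    return best, best_score


# Made-up samples: (buffer in seconds, throughput in kbps) -> bitrate level.
rows = [(2.0, 900), (3.0, 2500), (8.0, 1000), (9.0, 3000)]
labels = [0, 0, 1, 1]
split, score = best_split(rows, labels)
```

A full tree builder would apply `best_split` recursively to the resulting subsets and stop once the leaf count or the impurity crosses the preset thresholds described above.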

And S160, deploying the decision tree to a video player of the mobile terminal, enabling the video player to send a request to a preset video server, and playing the video fed back by the video server according to the bit rate obtained by the decision tree after receiving a message that the request fed back by the video server passes.

After the complex original ABR algorithm has been converted into a simple decision tree, the decision tree can be deployed directly into the video player of a mobile terminal in the conventional manner. Network video is then played at the bit rates that the decision tree continuously generates, which improves the user's video experience. Specifically, the transceiver module in the video player sends a video request to a preset video server; the video server feeds back to the transceiver module a response message indicating that the request has passed, together with the network video; the transceiver module passes the received network video to the video playing module in the video player; and the video playing module plays the network video at the currently calculated bit rate. This playing process is the process of playing the video blocks that constitute the network video.

In order to improve the performance of the obtained decision tree, as shown in fig. 2, the embodiment further includes: s170, before the decision tree is deployed to the mobile terminal, optimizing the obtained decision tree; and S180, deploying the optimized decision tree to a video player of the mobile terminal, enabling the video player to send a request to a preset video server, and playing the video fed back by the video server according to the bit rate obtained by the optimized decision tree after receiving a message that the request fed back by the video server passes.

In this embodiment, optimizing the decision tree includes:

S1: playing the predetermined video;

for each video block in the predetermined video, performing steps S2 and S3:

S2: calculating a second playing state of the current video block based on the decision tree;

S3: calculating an action corresponding to the second playing state by adopting the ABR algorithm to be deployed according to the second playing state, wherein the action corresponding to the second playing state is a bit rate decision for the next video block;

S4: aggregating the first playing states and the second playing states of all video blocks in the predetermined video to obtain the playing states of the optimization method;

S5: aggregating the actions corresponding to the first playing states of all video blocks in the predetermined video and the actions corresponding to the second playing states of all video blocks to obtain the decision actions of the optimization method;

S6: taking the playing states of the optimization method and the decision actions of the optimization method as a second training data set;

S7: taking the decision tree generated by the CART algorithm based on the second training data set as the optimized decision tree;

repeating the above steps S1 to S7 until a preset maximum number of iterations is reached. The optimization of the decision tree in this embodiment is therefore an iterative process in which the teacher (the original ABR algorithm) repeatedly corrects the policy errors made by the student (the decision tree), thereby improving the performance of the decision tree.
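The aggregation loop of steps S1 to S7 can be sketched as below. This is only an illustration under loud assumptions: the playing state is reduced to a single buffer level, the virtual player uses toy buffer dynamics, `teacher` is a hypothetical threshold rule standing in for the ABR algorithm to be deployed, and `fit` is a simple per-bucket majority vote standing in for CART (it is not CART).

```python
import random
from collections import Counter

BITRATES = [300, 750, 1200]  # kbps; hypothetical bit-rate ladder
CHUNK_SECONDS = 4.0          # hypothetical duration of one video block


def teacher(buffer_s):
    """Stand-in for the ABR algorithm to be deployed (the teacher)."""
    if buffer_s < 5:
        return BITRATES[0]
    if buffer_s < 15:
        return BITRATES[1]
    return BITRATES[2]


def fit(states, actions):
    """Stand-in learner (NOT CART): majority action per 1-second buffer bucket."""
    votes = {}
    for s, a in zip(states, actions):
        votes.setdefault(int(s), Counter())[a] += 1
    table = {b: c.most_common(1)[0][0] for b, c in votes.items()}

    def policy(buffer_s):
        nearest = min(table, key=lambda b: abs(b - int(buffer_s)))
        return table[nearest]

    return policy


def simulate(policy, n_chunks, rng):
    """Toy virtual player: returns the playing states visited under `policy`."""
    buffer_s, visited = 10.0, []
    for _ in range(n_chunks):
        visited.append(buffer_s)
        rate = policy(buffer_s)
        throughput = rng.uniform(400.0, 1500.0)           # kbps, random network
        download_s = CHUNK_SECONDS * rate / throughput    # time to fetch the block
        buffer_s = min(30.0, max(0.0, buffer_s - download_s) + CHUNK_SECONDS)
    return visited


def optimize(max_iters, n_chunks=50, seed=0):
    rng = random.Random(seed)
    states = simulate(teacher, n_chunks, rng)             # first playing states
    actions = [teacher(s) for s in states]
    student = fit(states, actions)                        # initial student
    for _ in range(max_iters):
        new_states = simulate(student, n_chunks, rng)     # S1-S2: student plays
        new_actions = [teacher(s) for s in new_states]    # S3: teacher labels
        states = states + new_states                      # S4: aggregate states
        actions = actions + new_actions                   # S5: aggregate actions
        student = fit(states, actions)                    # S6-S7: refit
    return student, states, actions
```

The point of the sketch is the data flow: the student chooses which states are visited, the teacher supplies the labels for those states, and every refit uses the full aggregated data set rather than only the latest iteration.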

Specifically, the present embodiment simulates the decision tree $\pi_i$ in a virtual player and collects a series of new state-action pairs $(S_i, A_i)$ (line 4 of the above algorithm code). At this point, although the student $\pi_i$ (i.e., the decision tree generated in step S150) already knows how to make decisions under the conditions seen in training, running $\pi_i$ on its own may result in poor performance. As shown in Fig. 6, because of the cascade effect, the states $S_i$ experienced by the student $\pi_i$ during simulation may not have been encountered in the iterative training (i.e., steps S1 to S7) of this embodiment. The decision tree policy therefore still needs to be corrected in subsequent steps.

Therefore, the states in $S_i$ are provided to the original ABR algorithm $\pi^*$ (the teacher), and the decision data set generated by the teacher

Figure BDA0002202899230000101

is collected (line 5 of the above algorithm code). Finally, the current state-action pair

Figure BDA0002202899230000102

is aggregated with the accumulated student states and teacher actions $(S, A)$, and the procedure returns to line 2 of the algorithm for the next iteration (line 6 of the above algorithm code). As a result, when the decision tree $\pi_{i+1}$ is trained in the next iteration, it learns from the errors made in the previous iteration. The loop continues in this way until the user-set maximum number of iterations $M$ is reached. The decision tree generated by the last iteration is then deployed into the client video player.

The following theoretical analysis is performed on the embodiments of the present invention:

As mentioned above, the network operator needs to set two hyper-parameters: the maximum number of iterations (M) and the number of leaf nodes (the first preset threshold). We therefore provide a theoretical analysis of the bound on the average loss function (distortion) of the method when the decision tree is actually deployed. We first show that the loss function defined in this embodiment is both Lipschitz continuous and strongly convex:

Conclusion 1: $l(r; r_0)$ in formula 2 is both Lipschitz continuous and strongly convex.

Proof:

Figure BDA0002202899230000103

we have:

$$|l(r_1; r_0) - l(r_2; r_0)| = |(r_1 - r_0)^2 - (r_2 - r_0)^2| = |r_1 + r_2 - 2r_0| \cdot |r_1 - r_2| \le 2(R_{max} - R_{min})\,|r_1 - r_2|$$ (formula 3)

The last inequality holds because $l(r; r_0)$ is defined on $[R_{min}, R_{max}]$, so $r_0$, $r_1$ and $r_2$ all lie in that interval. Thus $l(r; r_0)$ is Lipschitz continuous, with Lipschitz constant $2(R_{max} - R_{min})$ as formula 3 shows. Similarly, we can also show that $l(r; r_0)$ is strongly convex.

Figure BDA0002202899230000105

we have:

Figure BDA0002202899230000111

where the strong convexity coefficient is $v = 2/(R_{max} - R_{min})^2$. This completes the proof.
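The Lipschitz bound in formula 3 can be spot-checked numerically. The sketch below assumes the squared-difference form of the loss exactly as it is written out in formula 3; the bit-rate bounds `R_MIN` and `R_MAX` and the grid size are hypothetical values chosen only for the check.

```python
import itertools

R_MIN, R_MAX = 300.0, 4300.0  # hypothetical bit-rate bounds in kbps


def loss(r, r0):
    """Squared difference, as expanded in formula 3 (assumed form)."""
    return (r - r0) ** 2


def check_formula_3(steps=9):
    """Verify the factorisation and the bound of formula 3 on a grid.

    Returns the largest observed ratio |l(r1) - l(r2)| / |r1 - r2| together
    with the claimed Lipschitz constant 2 * (R_MAX - R_MIN).
    """
    grid = [R_MIN + k * (R_MAX - R_MIN) / (steps - 1) for k in range(steps)]
    lipschitz = 2.0 * (R_MAX - R_MIN)
    worst = 0.0
    for r0, r1, r2 in itertools.product(grid, repeat=3):
        lhs = abs(loss(r1, r0) - loss(r2, r0))
        factored = abs(r1 + r2 - 2.0 * r0) * abs(r1 - r2)
        # Middle equality of formula 3: difference of squares.
        assert abs(lhs - factored) <= 1e-6 * max(1.0, lhs)
        # Final inequality of formula 3.
        assert lhs <= lipschitz * abs(r1 - r2) + 1e-6
        if r1 != r2:
            worst = max(worst, lhs / abs(r1 - r2))
    return worst, lipschitz
```

Calling `check_formula_3()` raises an `AssertionError` if either step of formula 3 fails on the grid. A grid check of this kind is a sanity test, not a substitute for the proof.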

Because the loss function $l(r; r_0)$ is both Lipschitz continuous and strongly convex, the work published by Ross et al. at AISTATS 2011 can be extended to this setting. We can therefore show that, when the decision tree generated by the present method handles video playback on its own, the upper bound of the average loss function satisfies the following conclusion:

Conclusion 2: For any $\delta > 0$, if the loss function value during training is $\epsilon_M$, then there exists a policy such that the average loss function satisfies:

Figure BDA0002202899230000113

When […], the above inequality holds with probability greater than $1 - \delta$. $T$ is the number of video blocks in the simulated playback.

And (3) proving that: order to

Figure BDA0002202899230000115

To take action a in the initial state s and then all take the cost of the strategy pi' in step t, then:

Figure BDA0002202899230000116

where $s_\tau$ is the state at time $\tau$. Therefore:

Figure BDA0002202899230000117

Conclusion 2 then follows from the result of Ross et al.

Figure BDA0002202899230000118

can be found by cross-validation among the decision trees of different iterations; in our experiments it is generally the decision tree $\pi_M$ of the last iteration. We have thus provided an upper bound on the distortion of the method. The training loss $\epsilon_M$ depends on the complexity of the original ABR algorithm and on the number of leaf nodes $N$ (the expressive power of the decision tree).

The present invention also provides a video player applied to a mobile terminal, as shown in fig. 3, the video player including:

a video playing module for playing a predetermined video, the predetermined video being composed of a plurality of consecutive video blocks;

a calculation module for performing the following steps for each video block in the predetermined video:

calculating a first playing state of the video block;

calculating an action corresponding to the first playing state by adopting an ABR algorithm to be deployed according to the first playing state, wherein the action corresponding to the first playing state is a bit rate decision of a next video block of the video block;

a first training data set acquisition module, configured to use the first playing states and corresponding actions of all video blocks in the predetermined video as a first training data set;

a decision tree generation module for generating a decision tree based on the first training data set by using a CART algorithm;

a deployment module for deploying the decision tree to the video player;

and a transceiver module for sending a request to a preset video server and, after receiving a message from the video server indicating that the request has passed, notifying the video playing module to play the video fed back by the video server at the bit rate obtained from the decision tree.

Further, as shown in fig. 4, the video player further includes:

an optimization module for optimizing the decision tree,

the deployment module is further configured to deploy the optimized decision tree into a video player,

and the video playing module is also used for playing the video fed back by the video server according to the bit rate obtained by the optimized decision tree.

In an embodiment of the player, the decision tree generation module is configured to select, in the CART algorithm, a playing state in the first training data set as a data feature by using a greedy algorithm to construct a leaf node until the number of leaf nodes reaches a first preset threshold or a Gini coefficient of the first training data set is smaller than a second preset threshold.
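The greedy construction with its two stopping conditions can be sketched as follows. This is a minimal pure-Python illustration, not the decision tree generation module itself: the function names, the dictionary node layout and the reading of "Gini coefficient of the training data set" as the size-weighted Gini impurity over the current leaves are all assumptions made for the example.

```python
from collections import Counter


def gini(labels):
    """Gini impurity of a list of class labels (bit-rate decisions)."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())


def best_split(rows, labels):
    """Greedy search: (gain, feature index, threshold) of the best split, or None."""
    best = None
    parent = gini(labels) * len(labels)
    for f in range(len(rows[0])):
        for t in sorted({r[f] for r in rows})[:-1]:
            left = [y for r, y in zip(rows, labels) if r[f] <= t]
            right = [y for r, y in zip(rows, labels) if r[f] > t]
            gain = parent - gini(left) * len(left) - gini(right) * len(right)
            if gain > 0 and (best is None or gain > best[0]):
                best = (gain, f, t)
    return best


def grow_tree(rows, labels, max_leaves, gini_threshold):
    """Split the most profitable leaf until either stopping condition is met."""
    root = {"rows": rows, "labels": labels}
    leaves = [root]
    while len(leaves) < max_leaves:                       # first preset threshold
        weighted = sum(gini(l["labels"]) * len(l["labels"]) for l in leaves)
        if weighted / len(rows) < gini_threshold:         # second preset threshold
            break
        options = [(best_split(l["rows"], l["labels"]), l) for l in leaves]
        options = [(s, l) for s, l in options if s is not None]
        if not options:
            break                                         # nothing left to split
        (_, f, t), leaf = max(options, key=lambda o: o[0][0])
        pairs = list(zip(leaf.pop("rows"), leaf.pop("labels")))
        low = [(r, y) for r, y in pairs if r[f] <= t]
        high = [(r, y) for r, y in pairs if r[f] > t]
        leaf.update(feature=f, threshold=t,
                    left={"rows": [r for r, _ in low], "labels": [y for _, y in low]},
                    right={"rows": [r for r, _ in high], "labels": [y for _, y in high]})
        leaves = [l for l in leaves if l is not leaf] + [leaf["left"], leaf["right"]]
    for l in leaves:                                      # leaf decision: majority
        l["action"] = Counter(l["labels"]).most_common(1)[0][0]
    return root


def predict(node, row):
    """Follow comparisons from the root down to a leaf and return its action."""
    while "action" not in node:
        node = node["left"] if row[node["feature"]] <= node["threshold"] else node["right"]
    return node["action"]
```

For instance, with rows of the form `[buffer_seconds, throughput_kbps]` and bit-rate labels, `grow_tree(rows, labels, max_leaves=3, gini_threshold=0.01)` returns a tree with at most three leaves, and `predict(tree, row)` returns the bit rate stored at the leaf that `row` falls into.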

In an embodiment of the player, the loss function employed by the decision tree generation module is $l(r; r_0)$:

Figure BDA0002202899230000121

where $r = \pi(s)$, $r_0 = \pi^*(s)$, $\pi$ is the currently generated decision tree, $\pi^*$ is the ABR algorithm to be deployed, and $s$ is the current playing state of the video; $R_{max}$ is a preset maximum bit rate and $R_{min}$ is a preset minimum bit rate.

The specific working principle, working process and the like of the video player provided by the invention can be referred to the video playing method provided by the invention, and the same technical contents are not repeated here.

The present invention also provides a computer storage medium having a computer program stored thereon, which when executed by a processor implements a video playback method according to an embodiment of the present invention.

The invention is a practical design for ABR algorithm deployment for video playback, which can be universally used for various online video client devices, including but not limited to: personal computers, smart phones, tablet computers, smart televisions, and the like.

The invention provides a video playing method and a video player that take as input the ABR algorithm a network administrator wishes to use, and reduce the resource consumption of ABR algorithm deployment by automatically converting a complex ABR algorithm (such as one based on MILP or a neural network) into a lightweight decision tree algorithm that can be deployed directly. The invention adds a lightweight conversion step to the traditional direct deployment scheme and thereby supports the practical deployment of complex ABR algorithms. In addition, to provide a performance guarantee for the conversion, the invention performs loop-iterative fitting through imitation learning, so that the performance of the decision tree after conversion is close to that of the complex ABR algorithm before conversion, which accelerates the practical adoption of the latest ABR techniques.

The invention identifies the limitations of directly deploying complex ABR algorithms and designs a lightweight deployment conversion method, which improves the practical value of ABR algorithm deployment. By analyzing various candidate conversion targets, the invention selects the decision tree as the scheme that is actually deployed online, so as to reduce the decision delay, memory consumption and page size of the ABR algorithm. The invention introduces a lightweight conversion step before the ABR algorithm is deployed directly on the client, and reduces the resource consumption of a complex ABR algorithm by converting the actually deployed algorithm into a decision tree, without changing how the administrator trains or designs a new ABR algorithm. The invention also analyzes the dependency within the sequential decision process of video transmission and designs a loop-iterative fitting conversion algorithm based on imitation learning, so that the performance of the ABR algorithm is preserved across the conversion. When the decision tree generated by the invention is deployed in the video player of a mobile terminal, the video player requests the video from the video server and plays it at the bit rate obtained from the decision tree, which can greatly improve the user's video experience.

Although the embodiments of the present invention have been described above, the above description is only for the convenience of understanding the present invention, and is not intended to limit the present invention. It will be understood by those skilled in the art that various changes in form and details may be made therein without departing from the spirit and scope of the invention as defined by the appended claims.
