Method and system for maintaining real-time audio stream playback delay in reliable transmission network

Document No.: 1834651. Publication date: 2021-11-12.

Summary: Method and system for maintaining real-time audio stream playback delay in a reliable transmission network (可靠传输网络中维持实时音讯串流播放延迟的方法及系统), created by 李敬祥 and 周志强 on 2020-05-11. The invention discloses a method and a system for maintaining real-time audio stream playback delay in a reliable transmission network. The method comprises the following steps: estimating the receiving time of an audio packet; detecting whether the actual receiving time of the audio packet is earlier than the estimated receiving time and, if so, defining a minimum delay audio packet, and if not, counting the number of detections; determining whether the number of detections equals a preset number and, if not, returning to the first step, and if so, defining a minimum delay audio packet and estimating the playing start time of each audio packet based on the receiving time of the minimum delay audio packet, the system fixed delay time, and the time length of the audio packets; and playing an audio packet when its playing start time does not exceed the preset playing start time.

1. A method for maintaining real-time audio stream playback delay in a reliable transmission network, adapted to a receiving end that receives a plurality of audio packets in sequence from a transmitting end, the method for maintaining real-time audio stream playback delay in a reliable transmission network comprising the steps of:

(a) using the receiving time of a received audio packet as a calculation reference to estimate the receiving time of the next audio packet;

(b) detecting whether the actual receiving time of the next audio packet is earlier than the estimated receiving time; if so, defining the next audio packet as a minimum transmission delay audio packet and returning to step (a), with the minimum transmission delay audio packet serving as the calculation reference for estimating the audio packet that follows it; if not, counting the number of detections and then executing step (c);

(c) determining whether the number of detections is equal to a preset number of detections; if not, returning to step (a) to estimate the next audio packet; if so, defining the audio packet serving as the calculation reference as the minimum transmission delay audio packet and then executing step (d);

(d) adding a system fixed delay time to the receiving time of the minimum transmission delay audio packet to calculate the playing start time of the minimum transmission delay audio packet;

(e) calculating the playing start time of each other audio packet based on the playing start time of the minimum transmission delay audio packet and the playing time length of each audio packet; and

(f) determining whether the playing start time of each audio packet exceeds a preset playing start time; if so, discarding the audio packet, and if not, playing the audio packet when its playing start time is reached.

2. The method of claim 1, wherein the method for maintaining real-time audio stream playback delay in a reliable transmission network further comprises the steps of:

subtracting the playing time lengths of all the audio packets received before the minimum transmission delay audio packet from the playing start time of the minimum transmission delay audio packet, to calculate the playing start time of the earliest received audio packet.

3. The method of claim 1, wherein the method for maintaining real-time audio stream playback delay in a reliable transmission network further comprises the steps of:

subtracting, from the playing start time of the minimum transmission delay audio packet, the playing time length of the audio packet to be calculated and the playing time lengths of all the audio packets received after the audio packet to be calculated and before the minimum transmission delay audio packet, to calculate the playing start time of the audio packet to be calculated.

4. The method of claim 1, wherein the method for maintaining real-time audio stream playback delay in a reliable transmission network further comprises the steps of:

subtracting the playing time length of the audio packet received immediately before the minimum transmission delay audio packet from the playing start time of the minimum transmission delay audio packet, to calculate the playing start time of the audio packet received immediately before the minimum transmission delay audio packet.

5. The method of claim 1, wherein the method for maintaining real-time audio stream playback delay in a reliable transmission network further comprises the steps of:

adding the playing time length of the minimum transmission delay audio packet to the playing start time of the minimum transmission delay audio packet, to calculate the playing start time of the audio packet received immediately after the minimum transmission delay audio packet.

6. A system for maintaining real-time audio stream playback delay in a reliable transmission network, adapted to a receiving end that receives a plurality of audio packets in sequence from a transmitting end, the system for maintaining real-time audio stream playback delay in a reliable transmission network comprising:

a minimum delay detection module configured to calculate a receiving time of a next audio packet by using a receiving time of the received audio packet as a calculation reference, and define the next audio packet as a minimum transmission delay audio packet when detecting that an actual receiving time of the next audio packet is earlier than the calculated receiving time;

a detection counting module, connected to the minimum delay detection module, configured to count detection times when the actual receiving time of the next audio packet is later than the calculated receiving time, and to return the detection times to zero when the minimum transmission delay audio packet is found;

a detection threshold setting module, connected to the minimum delay detection module and the detection counting module, configured to instruct the minimum delay detection module to continue detecting subsequently received audio packets to find the minimum transmission delay audio packet when determining that the number of detections is not equal to a preset number of detections, and to instruct the minimum delay detection module to directly define the audio packet serving as the calculation reference as the minimum transmission delay audio packet when determining that the number of detections is equal to the preset number of detections and the minimum transmission delay audio packet has not been found;

a playing time calculation module, connected to the minimum delay detection module, configured to add the actual receiving time of the minimum transmission delay audio packet to a system fixed delay time to calculate the starting playing time of the minimum transmission delay audio packet, and calculate the starting playing time of each other audio packet based on the starting playing time of the minimum transmission delay audio packet and the playing time length of each audio packet; and

an audio packet screening module, connected to the playing time calculation module, configured to discard an audio packet when its playing start time exceeds a preset playing start time, and to instruct that each audio packet be played when its playing start time does not exceed the preset playing start time.

7. The system of claim 6, wherein the playing time calculation module is configured to subtract the playing time lengths of all the audio packets received before the minimum transmission delay audio packet from the playing start time of the minimum transmission delay audio packet, to calculate the playing start time of the earliest received audio packet.

8. The system of claim 6, wherein the playing time calculation module is configured to subtract, from the playing start time of the minimum transmission delay audio packet, the playing time length of the audio packet to be calculated and the playing time lengths of all the audio packets received after the audio packet to be calculated and before the minimum transmission delay audio packet, to calculate the playing start time of the audio packet to be calculated.

9. The system of claim 6, wherein the playing time calculation module is configured to subtract the playing time length of the audio packet received immediately before the minimum transmission delay audio packet from the playing start time of the minimum transmission delay audio packet, to calculate the playing start time of the audio packet received immediately before the minimum transmission delay audio packet.

10. The system of claim 6, wherein the playing time calculation module is configured to add the playing time length of the minimum transmission delay audio packet to the playing start time of the minimum transmission delay audio packet, to calculate the playing start time of the audio packet received immediately after the minimum transmission delay audio packet.

Technical Field

The present invention relates to audio stream playback, and more particularly, to a method and system for maintaining real-time audio stream playback delay in a reliable transmission network.

Background

In recent years, as Bluetooth headsets have been connected to mobile phones or televisions to watch videos, the problem of audio-video synchronization has received growing attention.

A continuous audio data stream is cut into blocks at the sending end (mobile phone or television) and compressed to form audio packets. The audio packets are transmitted to the receiving end (earphone) over a wired or wireless network, where they are reassembled, decompressed, restored into continuous audio data, and played.

During network transmission, the delay of each audio packet from the sending end to the receiving end often varies (jitter) for reasons such as network congestion, interference, and retransmission of lost packets. The receiving end therefore cannot play an audio packet immediately after receiving it; to avoid playback interruptions caused by an excessively delayed subsequent packet, a jitter buffer is needed. The size of the jitter buffer must be chosen appropriately: the larger it is, the more delay jitter can be tolerated, but the larger the playing delay becomes.

Network transmission can generally be divided into best-effort and reliable modes. In the reliable mode, the sender retransmits an audio packet when it is lost, to ensure that the receiver receives complete data. Since an audio packet may be retransmitted many times, the transmission delay may increase significantly. The sender therefore maintains a larger transmission queue when sending audio data: when the network is congested, audio packets accumulate in the transmission queue, and they are sent out quickly once the network recovers.

In the best-effort mode, audio packets may be lost without being resent, so a large transmission queue is not required and the transmission delay is small. This mode suits applications that require low delay but not high sound quality, such as VoIP. The reliable mode suits applications that require high sound quality but tolerate higher delay, such as playing music and watching movies, because audio packets are not lost.

In video-watching and gaming applications, users want both high sound quality and low delay, so a reliable network transmission mode must be adopted and the jitter buffer at the receiving end kept to no more than 150 ms, so that audio and video do not fall out of synchronization. Typically, received audio packets are placed in the jitter buffer before playback, and playback starts by taking packets out of the jitter buffer once a certain number have accumulated. When network congestion has just been relieved, the large number of audio packets accumulated in the sender's transmission queue produces a transmission burst, which quickly fills the receiver's jitter buffer well beyond its intended level, so that the playing delay greatly exceeds the preset 150 ms.

Disclosure of Invention

To address the deficiencies of the prior art, the present invention provides a method for maintaining real-time audio stream playback delay in a reliable transmission network, suitable for a receiving end that receives a plurality of audio packets in sequence from a transmitting end. The method includes the following steps: (a) using the receiving time of a received audio packet as a calculation reference to estimate the receiving time of the next audio packet; (b) detecting whether the actual receiving time of the next audio packet is earlier than the estimated receiving time; if so, defining the next audio packet as a minimum transmission delay audio packet and returning to step (a), with the minimum transmission delay audio packet serving as the calculation reference for estimating the audio packet that follows it; if not, counting the number of detections and then executing step (c); (c) determining whether the number of detections is equal to a preset number of detections; if not, returning to step (a) to estimate the next audio packet; if so, defining the audio packet serving as the calculation reference as the minimum transmission delay audio packet and then executing step (d); (d) adding a system fixed delay time to the receiving time of the minimum transmission delay audio packet to calculate the playing start time of the minimum transmission delay audio packet; (e) calculating the playing start time of each other audio packet based on the playing start time of the minimum transmission delay audio packet and the playing time length of each audio packet; and (f) determining whether the playing start time of each audio packet exceeds a preset playing start time; if so, discarding the audio packet, and if not, playing the audio packet when its playing start time is reached.

In one embodiment, the method for maintaining real-time audio stream playback delay in the reliable transmission network further comprises the step of: subtracting the playing time lengths of all audio packets received before the minimum transmission delay audio packet from the playing start time of the minimum transmission delay audio packet, to calculate the playing start time of the earliest received audio packet.

In one embodiment, the method for maintaining real-time audio stream playback delay in the reliable transmission network further comprises the step of: subtracting, from the playing start time of the minimum transmission delay audio packet, the playing time length of the audio packet to be estimated and the playing time lengths of all audio packets received after the audio packet to be estimated and before the minimum transmission delay audio packet, to calculate the playing start time of the audio packet to be estimated.

In one embodiment, the method for maintaining real-time audio stream playback delay in the reliable transmission network further comprises the step of: subtracting the playing time length of the audio packet received immediately before the minimum transmission delay audio packet from the playing start time of the minimum transmission delay audio packet, to calculate the playing start time of the audio packet received immediately before the minimum transmission delay audio packet.

In one embodiment, the method for maintaining real-time audio stream playback delay in the reliable transmission network further comprises the step of: adding the playing time length of the minimum transmission delay audio packet to the playing start time of the minimum transmission delay audio packet, to calculate the playing start time of the audio packet received immediately after the minimum transmission delay audio packet.

In addition, the invention provides a system for maintaining real-time audio stream playback delay in a reliable transmission network, suitable for a receiving end that receives a plurality of audio packets in sequence from a sending end. The system comprises a minimum delay detection module, a detection counting module, a detection threshold setting module, a playing time calculation module, and an audio packet screening module. The minimum delay detection module is configured to estimate the receiving time of the next audio packet by using the receiving time of a received audio packet as a calculation reference, and to define the next audio packet as a minimum transmission delay audio packet when detecting that the actual receiving time of the next audio packet is earlier than the estimated receiving time. The detection counting module is connected with the minimum delay detection module and is configured to count the number of detections when the actual receiving time of the next audio packet is later than the estimated receiving time, and to reset the number of detections to zero when the minimum transmission delay audio packet is found. The detection threshold setting module is connected with the minimum delay detection module and the detection counting module. It is configured to instruct the minimum delay detection module to continue detecting subsequently received audio packets to find the minimum transmission delay audio packet when the number of detections is not equal to a preset number of detections, and to instruct the minimum delay detection module to directly define the audio packet serving as the calculation reference as the minimum transmission delay audio packet when the number of detections is equal to the preset number of detections and the minimum transmission delay audio packet has not been found. The playing time calculation module is connected with the minimum delay detection module and is configured to add a system fixed delay time to the actual receiving time of the minimum transmission delay audio packet to calculate the playing start time of the minimum transmission delay audio packet, and to calculate the playing start time of each other audio packet based on the playing start time of the minimum transmission delay audio packet and the playing time length of each audio packet. The audio packet screening module is configured to discard an audio packet when its playing start time exceeds a preset playing start time, and to play each audio packet when its playing start time does not exceed the preset playing start time.
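The screening behaviour described above can be sketched as follows. This is a minimal illustration, not the patented implementation: it reads "exceeds" as a plain greater-than comparison against a single threshold, and the names `screen_packets` and `preset_start_time` are hypothetical. How the preset playing start time is chosen is not specified in this passage.

```python
from typing import List, Sequence, Tuple


def screen_packets(
    start_times: Sequence[float],
    preset_start_time: float,
) -> Tuple[List[int], List[int]]:
    """Split packet indices into (to_play, to_discard).

    Assumption: a packet is discarded when its playing start time is
    strictly greater than the preset playing start time.
    """
    to_play: List[int] = []
    to_discard: List[int] = []
    for index, start in enumerate(start_times):
        if start > preset_start_time:
            to_discard.append(index)   # start time exceeds the preset value
        else:
            to_play.append(index)      # played when its start time is reached
    return to_play, to_discard


# Hypothetical start times in milliseconds and a hypothetical threshold.
print(screen_packets([132, 152, 172, 192, 212], 180))
```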

In one embodiment, the playing time calculation module is configured to subtract the playing time lengths of all audio packets received before the minimum transmission delay audio packet from the playing start time of the minimum transmission delay audio packet, to calculate the playing start time of the earliest received audio packet.

In one embodiment, the playing time calculation module is configured to subtract, from the playing start time of the minimum transmission delay audio packet, the playing time length of the audio packet to be estimated and the playing time lengths of all audio packets received after the audio packet to be estimated and before the minimum transmission delay audio packet, to calculate the playing start time of the audio packet to be estimated.

In one embodiment, the playing time calculation module is configured to subtract the playing time length of the audio packet received immediately before the minimum transmission delay audio packet from the playing start time of the minimum transmission delay audio packet, to calculate the playing start time of the audio packet received immediately before the minimum transmission delay audio packet.

In one embodiment, the playing time calculation module is configured to add the playing time length of the minimum transmission delay audio packet to the playing start time of the minimum transmission delay audio packet, to calculate the playing start time of the audio packet received immediately after the minimum transmission delay audio packet.

As described above, the present invention provides a system and method for maintaining real-time audio stream playback delay in a reliable transmission network, which effectively solves the problem that, in a reliable transmission network, a large accumulation of audio packets in the receiving end's jitter buffer when playback starts greatly increases the playback delay.

For a better understanding of the features and technical content of the present invention, reference should be made to the following detailed description and accompanying drawings, which are provided for purposes of illustration and description only and are not intended to limit the invention.

Drawings

Fig. 1 is a flowchart illustrating steps of a method for maintaining real-time audio stream playback delay in a reliable transmission network according to an embodiment of the present invention.

Fig. 2 is a time axis diagram illustrating the sending time, the arrival time, and the playing time of an audio packet according to the method for maintaining real-time audio stream playing delay in a reliable transmission network according to an embodiment of the present invention.

Fig. 3 is a schematic diagram of a system for maintaining real-time audio stream playback delay in a reliable transmission network according to an embodiment of the present invention, applied to a receiving end that receives a plurality of audio packets sequentially from a transmitting end.

Fig. 4 is a block diagram of a system for maintaining real-time audio stream playback delay in a reliable transmission network according to an embodiment of the present invention, applied to a receiving end that receives a plurality of audio packets sequentially from a transmitting end.

Fig. 5 is a block diagram of a system for maintaining real-time audio stream playback delay in a reliable transmission network according to an embodiment of the present invention.

Detailed Description

The following is a description of embodiments of the present invention with reference to specific embodiments, and those skilled in the art will understand the advantages and effects of the present invention from the disclosure of the present specification. The invention is capable of other and different embodiments and its several details are capable of modifications and various changes in detail, all without departing from the spirit and scope of the present invention. The drawings of the present invention are for illustrative purposes only and are not intended to be drawn to scale. The following embodiments will further explain the related art of the present invention in detail, but the disclosure is not intended to limit the scope of the present invention. In addition, the term "or" as used herein should be taken to include any one or combination of more of the associated listed items as the case may be.

Referring to fig. 1 and fig. 2, fig. 1 is a flowchart illustrating steps of a method for maintaining real-time audio stream playback delay in a reliable transmission network according to an embodiment of the invention; fig. 2 is a time axis diagram illustrating the sending time, the arrival time, and the playing time of an audio packet according to the method for maintaining real-time audio stream playing delay in a reliable transmission network according to an embodiment of the present invention.

As shown in fig. 1, the method for maintaining real-time audio stream playback delay in a reliable transmission network of the present embodiment includes steps S101 to S121, which are applicable to a receiving end sequentially receiving a plurality of audio packets from a transmitting end, and the following description is provided.

In step S101, when the receiving end receives an audio packet (e.g., the first audio packet sent to the receiving end), the receiving time of that audio packet is used as the calculation reference to estimate the receiving time of the next audio packet (e.g., the second audio packet sent to the receiving end), i.e., the time at which the next audio packet is expected to arrive at the receiving end.

In step S103, it is detected whether the time when the receiving end actually receives the next audio packet is earlier than the estimated time for receiving the next audio packet. If so, the actual receiving time of the next audio packet is earlier than the estimated receiving time, and then steps S105 to S107 are performed in sequence. If not, the actual receiving time of the next audio packet is later than the estimated receiving time of the next audio packet, then step S109 is executed.

In step S105, the next audio packet is defined as a minimum transmission delay audio packet, and then step S107 is performed.

In step S107, the number of detections is reset to zero, and then the process returns to step S101. Specifically, the number of detections is initially set to 0. When the actual receiving time of the next audio packet is earlier than the estimated receiving time, the audio packet with the minimum transmission delay (i.e., the minimum transmission delay audio packet) among all the currently received audio packets has been found. In this case, the number of detections is reset to 0. Then, in step S101, the minimum transmission delay audio packet is used as the estimation reference. That is, the receiving times of the audio packets that follow the minimum transmission delay audio packet, starting with the one received immediately after it, are estimated based on the receiving time of the minimum transmission delay audio packet.

In step S109, the number of detections is counted. Specifically, if the actual receiving time of the next audio packet is later than its estimated receiving time, the transmission delay of the next audio packet is greater than the transmission delay of the audio packet serving as the estimation reference, so the next audio packet is not the minimum transmission delay audio packet. In this case, incrementing the number of detections records that the audio packet (e.g., the second audio packet sent to the receiving end) received after the audio packet serving as the estimation reference (e.g., the first audio packet sent to the receiving end) is not the minimum transmission delay audio packet.

In step S111, it is determined whether the number of detections is equal to a preset number of detections. If not, that is, if the number of detections has not reached the preset number of detections, step S101 is executed again to estimate the receiving time of the next audio packet (for example, the third audio packet sent to the receiving end), and it is then detected whether the actual receiving time of that audio packet is earlier than its estimated receiving time (estimated with the first audio packet sent to the receiving end as the reference). Steps S101 to S103 are repeated over successive audio packets until the minimum transmission delay audio packet is found in step S105, or until the number of detections reaches the preset number of detections, after which step S113 is performed.

For example, as shown in fig. 2, the sender sends five audio packets to the receiver at sending time points A1, A2, A3, A4, and A5 in sequence, and the receiver receives the five audio packets at arrival time points B1, B2, B3, B4, and B5 in sequence, respectively. The time between the sending time point A4 and the arrival time point B4 of the fourth audio packet is the shortest, i.e., the fourth audio packet has the shortest transmission delay.
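As a purely illustrative sketch of this comparison, the transmission delay of each packet can be written as its arrival time minus its sending time. The millisecond values below are hypothetical, since fig. 2 gives no numbers, and the receiver in the described method does not actually know the sending times. The sketch only shows what "shortest transmission delay" means.

```python
# Hypothetical sending times A1..A5 and arrival times B1..B5 in milliseconds.
# These values are assumptions for illustration; they are not taken from fig. 2.
send_times = [0, 20, 40, 60, 80]
arrival_times = [40, 65, 90, 92, 115]

# Transmission delay of each packet: arrival time minus sending time.
delays = [b - a for a, b in zip(send_times, arrival_times)]

# 0-based index of the packet with the shortest transmission delay.
min_delay_index = min(range(len(delays)), key=delays.__getitem__)

print(delays)                # per-packet transmission delays
print(min_delay_index + 1)   # 1-based packet number
```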

In step S113, if the actual receiving time of the subsequent audio packets of the audio packet serving as the estimation reference (e.g., the first audio packet addressed to the receiving end) is later than the estimated receiving time, and N audio packets (N reaches the predetermined number of times of detection) are detected, it represents that the delay time length of the subsequent audio packets (e.g., the second and third audio packets addressed to the receiving end) is greater than the delay time length of the audio packet serving as the estimation reference (e.g., the first audio packet addressed to the receiving end). In this case, the audio packet (e.g., the first audio packet to the receiving end) used as the estimation reference in step S101 is defined as the minimum transmission delay audio packet.
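Steps S101 to S113 can be sketched as a single loop. This is a minimal sketch under stated assumptions, not the patented implementation. The text does not say how the receiving time of a later packet is estimated, so the sketch assumes it is the reference packet's receiving time plus the playing time lengths of the packets in between. The function name, parameter names, and the fallback at the end of the input are hypothetical.

```python
from typing import Sequence


def find_min_delay_packet(
    arrival_times: Sequence[float],
    durations: Sequence[float],
    preset_detections: int,
) -> int:
    """Return the 0-based index of the minimum transmission delay packet."""
    if not arrival_times:
        raise ValueError("at least one received packet is required")

    reference = 0     # packet currently used as the estimation reference
    detections = 0    # number of detections, initially 0

    for i in range(1, len(arrival_times)):
        # S101 (assumed formula): reference receiving time plus the playing
        # time lengths of packets reference .. i-1.
        estimated = arrival_times[reference] + sum(durations[reference:i])

        if arrival_times[i] < estimated:
            # S105/S107: packet i arrived early, so it becomes the new
            # reference and the detection count is reset.
            reference = i
            detections = 0
        else:
            # S109: packet i is not the minimum transmission delay packet.
            detections += 1
            # S111/S113: preset count reached, keep the current reference.
            if detections == preset_detections:
                return reference

    # Assumption: if the input ends before the preset count is reached,
    # fall back to the best candidate seen so far.
    return reference


# Hypothetical arrival times (ms) for seven packets of 20 ms each.
arrivals = [40, 65, 90, 92, 115, 135, 155]
print(find_min_delay_packet(arrivals, [20] * 7, preset_detections=3))
```

If the sender emits packets at intervals equal to their playing time lengths, comparing the actual receiving time with this estimate amounts to comparing the packet's transmission delay with that of the reference, which is why the receiver needs no knowledge of the sending times.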

In step S115, calculating the start playing time includes adding a system fixed delay time to the time of receiving the minimum transmission delay audio packet to calculate a start playing time of the minimum transmission delay audio packet, and then calculating the start playing time of each of the other audio packets based on the start playing time of the minimum transmission delay audio packet and the playing time length of each of the audio packets.

For example, as shown in fig. 2, the fourth audio packet is detected to have the shortest transmission delay compared with the first three audio packets, and is defined as the minimum transmission delay audio packet. Therefore, the system fixed delay time is added to the receiving time of the minimum transmission delay audio packet (i.e., the arrival time B4) to calculate the playing start time C4 of the minimum transmission delay audio packet, which is expressed by the following formula: C4 = B4 + DT, where DT represents the system fixed delay time.

After the playing start time of the minimum transmission delay audio packet is calculated, the method further comprises the following step: subtracting the playing time lengths of all the audio packets received before the minimum transmission delay audio packet from the playing start time of the minimum transmission delay audio packet, to calculate the playing start time of the earliest received audio packet. For example, as shown in fig. 2, the respective playing time lengths AT1, AT2, and AT3 of the first three audio packets are subtracted from the playing start time C4 of the fourth audio packet to calculate the playing start time C1 of the first audio packet, which is expressed by the following formula: C1 = C4 - (AT1 + AT2 + AT3).

After calculating the start playing time of the minimum transmission delay audio packet, the method further comprises the following step: the playing time length of the audio packet to be estimated, together with the playing time lengths of all the audio packets received after it and before the minimum transmission delay audio packet, is subtracted from the start playing time of the minimum transmission delay audio packet to estimate the start playing time of that audio packet. For example, as shown in fig. 2, the playing time lengths AT2 and AT3 of the second and third audio packets are subtracted from the start playing time C4 of the fourth audio packet to calculate the start playing time C2 of the second audio packet, which is expressed by the following formula: C2 = C4 - (AT2 + AT3).

After calculating the start playing time of the minimum transmission delay audio packet, the method further comprises the following step: the playing time length of the audio packet received immediately before the minimum transmission delay audio packet is subtracted from the start playing time of the minimum transmission delay audio packet to calculate the start playing time of that preceding audio packet. For example, as shown in fig. 2, the playing time length AT3 of the third audio packet is subtracted from the start playing time C4 of the fourth audio packet to calculate the start playing time C3 of the third audio packet, which is expressed by the following formula: C3 = C4 - AT3.

After calculating the start playing time of the minimum transmission delay audio packet, the method further comprises the following step: the playing time length of the minimum transmission delay audio packet is added to its start playing time to calculate the start playing time of the audio packet that follows it. For example, as shown in fig. 2, the playing time length AT4 of the fourth audio packet is added to the start playing time C4 of the fourth audio packet to calculate the start playing time C5 of the fifth audio packet, which is expressed by the following formula: C5 = C4 + AT4.
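The calculations of step S115 can be illustrated with a short Python sketch. It is only an illustration under stated assumptions, not the implementation of the invention: the function name `start_play_times`, the zero-based index `min_idx`, and the millisecond values are hypothetical. The sketch sets the start playing time of the minimum transmission delay audio packet to its receiving time plus DT, then subtracts playing time lengths going backward and adds them going forward. Applying the forward step to packets beyond the one immediately following is an assumption that the packets are played back to back.

```python
from typing import List, Sequence


def start_play_times(min_idx: int,
                     min_arrival: float,
                     fixed_delay: float,
                     durations: Sequence[float]) -> List[float]:
    """Estimate a start playing time for every packet (illustrative only).

    min_idx     -- zero-based index of the minimum transmission delay packet
    min_arrival -- its actual receiving time (B4 in the fig. 2 example)
    fixed_delay -- system fixed delay time DT
    durations   -- playing time length of each packet (AT1, AT2, ...)
    """
    n = len(durations)
    starts: List[float] = [0.0] * n
    # C_k = B_k + DT for the minimum transmission delay packet.
    starts[min_idx] = min_arrival + fixed_delay
    # Earlier packets: subtract each packet's own playing time length,
    # e.g. C3 = C4 - AT3, C2 = C3 - AT2 = C4 - (AT2 + AT3).
    for i in range(min_idx - 1, -1, -1):
        starts[i] = starts[i + 1] - durations[i]
    # Later packets: add the previous packet's playing time length,
    # e.g. C5 = C4 + AT4 (assumes back-to-back playback).
    for i in range(min_idx + 1, n):
        starts[i] = starts[i - 1] + durations[i - 1]
    return starts


# Hypothetical values in milliseconds: the fourth packet (index 3) has the
# minimum delay, B4 = 126, DT = 50, and AT1..AT5 = 10, 20, 30, 40, 50.
print(start_play_times(3, 126, 50, [10, 20, 30, 40, 50]))
# prints [116, 126, 146, 176, 216]
```

With these hypothetical numbers, C4 = 126 + 50 = 176 and C1 = 176 - (10 + 20 + 30) = 116, which matches the formulas above.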

In step S117, it is determined whether the playing start time of each audio packet exceeds a predetermined playing start time (e.g., the predetermined playing start time of the audio packet is the time point when the action of the game character occurs). If yes, go to step S119. If not, go to step S121.

In step S119, if the playing start time of the audio packet exceeds the predetermined playing start time, the audio packet is discarded, that is, the audio packet is not played, but the sound effect of the next action (that is, the next audio packet) is requested directly at the time when the next action occurs.

In step S121, if the start playing time of the audio packet does not exceed the predetermined start playing time, the audio packet is played when its start playing time is reached.
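The decision made in steps S117 to S121 can be sketched as follows. This is a minimal sketch that takes the wording of those steps literally (a packet whose start playing time exceeds its predetermined start playing time is discarded, otherwise it is played). The function name `screen_packets` and the use of one predetermined time per packet are assumptions made for illustration only.

```python
from typing import List, Sequence, Tuple


def screen_packets(start_times: Sequence[float],
                   predetermined_times: Sequence[float]
                   ) -> Tuple[List[int], List[int]]:
    """Split packet indices into (to_play, discarded), illustrative only.

    start_times         -- estimated start playing time of each packet
    predetermined_times -- predetermined start playing time of each packet
                           (hypothetical per-packet values)
    """
    to_play: List[int] = []
    discarded: List[int] = []
    for i, (start, limit) in enumerate(zip(start_times, predetermined_times)):
        if start > limit:
            # Step S119: the start playing time exceeds the predetermined
            # start playing time, so the packet is not played.
            discarded.append(i)
        else:
            # Step S121: the packet is played when its start time is reached.
            to_play.append(i)
    return to_play, discarded


# Hypothetical values in milliseconds.
print(screen_packets([116, 126, 146], [120, 120, 150]))
# prints ([0, 2], [1])
```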

For example, as shown in fig. 2, the first predetermined playing time represents a time when the audio playing device starts to execute the playing operation. After waiting for a period of time from the first predetermined playing time (i.e., the first predetermined start playing time), the first audio packet is played when its start playing time C1 is reached, and the second, third, fourth and fifth audio packets received in sequence are then played when their start playing times C2, C3, C4 and C5 are reached.

For another example, the second predetermined playing time DT2 represents the time when the audio playing device starts to execute the playing operation. Since the start playing times C1 and C2 of the first and second audio packets are earlier than the second predetermined start playing time DT2, the first and second audio packets are discarded, and the subsequent third, fourth and fifth audio packets are played when their start playing times C3, C4 and C5 are reached.

Please refer to fig. 3-5, wherein fig. 3 is a schematic diagram of a system for maintaining real-time audio stream playback delay in a reliable transmission network applied to a receiving end for sequentially receiving a plurality of audio packets from a transmitting end according to an embodiment of the present invention; FIG. 4 is a block diagram of a system for maintaining real-time audio stream playback delay in a reliable transmission network according to an embodiment of the present invention applied to a receiving end that receives a plurality of audio packets sequentially from a transmitting end; fig. 5 is a block diagram of a system for maintaining real-time audio stream playback delay in a reliable transmission network according to an embodiment of the present invention.

As shown in fig. 3, the transmitting end TX may be a mobile phone and the receiving end RX may be an earphone; these are only examples and are not intended to limit the present invention. As shown in fig. 3 and 4, the transmitting end TX divides the audio data AU into a plurality of audio packets PG1 to PGn and then sends the audio packets PG1 to PGn to the receiving end RX. An audio receiving module RCV of the receiving end RX sequentially receives the audio packets sent by the transmitting end TX in a wired or wireless manner (for example, but not limited to, using Bluetooth wireless transmission technology).

The system addresses the problem that, in a reliable transmission network, the playback delay increases greatly when the jitter buffer of the receiving end holds far more data than expected at the start of playback. As shown in fig. 4, the system SYS for maintaining real-time audio stream playback delay in the reliable transmission network according to the embodiment of the present invention is applied to the receiving end RX and is connected to the audio receiving module RCV and the audio playing module PYT of the receiving end RX. The transmitting end TX is connected to the audio receiving module RCV of the receiving end RX.

As shown in fig. 5, the system SYS for maintaining real-time audio stream playback delay in a reliable transmission network according to an embodiment of the present invention may include a minimum delay detection module 10, a detection counting module 20, a detection threshold setting module 30, a playing time calculating module 40, and an audio packet filtering module 50.

The minimum delay detection module 10 shown in fig. 5 is connected to the audio receiving module RCV shown in fig. 4, the detection counting module 20, the detection threshold setting module 30 and the playing time calculating module 40 shown in fig. 5. The detection counting module 20 is connected with the detection threshold setting module 30. The playing time calculating module 40 is connected to the audio packet filtering module 50.

While the transmitting end TX sequentially sends the audio packets PG1 to PGn to the receiving end RX, the minimum delay detection module 10 sequentially detects the actual receiving times of the audio packets PG1 to PGn.

First, when the receiving end RX receives the audio packet PG1, the minimum delay detection module 10 estimates the receiving time of the next audio packet PG2 based on the receiving time of the audio packet PG1. When the minimum delay detection module 10 detects that the actual receiving time of the next audio packet PG2 is earlier than the estimated receiving time of the next audio packet PG2, the next audio packet PG2 is defined as the minimum transmission delay audio packet.

The detection counting module 20 initially sets the detection count to zero, denoted as C = 0, where C denotes the detection count. When the actual receiving time of the next audio packet PG2 is later than the estimated receiving time of the next audio packet PG2, i.e., no audio packet with a shorter delay than the audio packet PG1 has been found, the detection counting module 20 (e.g., a counter) increments the detection count, denoted as C = 1. This indicates that the audio packet PG2 received after the audio packet PG1 has been detected not to be the minimum transmission delay audio packet.

The detection threshold setting module 30 sets a predetermined detection count. When the detection threshold setting module 30 determines that the current detection count is not equal to the predetermined detection count, for example, when the predetermined detection count is 10 and only one audio packet PG2 has been detected so far (the current detection count is C = 1), the minimum delay detection module 10 goes on to detect the subsequently received audio packets PG3 to PG10.

If the audio packet PG2 received next after the audio packet PG1 is detected not to be the minimum transmission delay audio packet, the next (e.g., the third) received audio packet PG3 is then detected. The minimum delay detection module 10 estimates the receiving time of the audio packet PG3 based on the receiving time of the audio packet PG1.

Similarly, when the minimum delay detection module 10 detects that the actual receiving time of the audio packet PG3 is earlier than the estimated receiving time of the audio packet PG3, the audio packet PG3 is defined as the minimum transmission delay audio packet. Conversely, when the actual receiving time of the audio packet PG3 is later than the estimated receiving time of the audio packet PG3, i.e., no audio packet with a shorter delay than the audio packet PG1 has been found, the detection counting module 20 (e.g., a counter) increments the detection count, denoted as C = 2.

As long as no audio packet with a shorter delay than the audio packet PG1 has been found, the subsequently received audio packets PG4 to PG10 are checked in turn, using the procedure described above, to determine whether any of them is the minimum transmission delay audio packet. If, while detecting the audio packets PG4 to PG10, the delay of the subsequently received audio packet PG4 is found to be shorter than the delay of the audio packet PG1, the audio packet PG4 is defined as the minimum transmission delay audio packet.

When the minimum transmission delay audio packet is found, the detection counting module 20 resets the detection count to zero, denoted as C = 0. Then, the minimum delay detection module 10 estimates the receiving times of the subsequently received audio packets PG5 to PG10 based on the receiving time of the minimum transmission delay audio packet, such as the audio packet PG4, to determine whether any audio packet has a shorter transmission delay than the audio packet PG4.

Each time one of the audio packets PG4 to PG10 is detected not to be the minimum transmission delay audio packet compared with the audio packet PG1, the detection counting module 20 increments the detection count. When the current detection count reaches the predetermined detection count, for example 10, the detection procedure stops, i.e., the subsequent audio packets, such as the audio packets PG11 to PGn, are not detected. If the current detection count reaches the predetermined detection count and none of the audio packets PG4 to PG10 is found to have a shorter delay than the audio packet PG1, the audio packet PG1, which served as the estimation reference for the receiving times of the other audio packets PG2 to PG10, is defined as the minimum transmission delay audio packet.
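The detection procedure carried out by modules 10, 20 and 30 can be sketched in Python as follows. This is a hedged illustration, not the implementation of the invention. The text does not state how the estimated receiving time is computed, so the sketch assumes it equals the receiving time of the reference packet plus the playing time lengths of the packets from the reference up to, but not including, the packet being checked. The function name `find_min_delay_packet`, the zero-based indices, and the treatment of an exactly on-time packet as "not earlier" are also assumptions.

```python
from typing import Sequence


def find_min_delay_packet(arrivals: Sequence[float],
                          durations: Sequence[float],
                          max_checks: int) -> int:
    """Return the zero-based index of the minimum transmission delay packet.

    arrivals   -- actual receiving time of each packet, in arrival order
    durations  -- playing time length of each packet
    max_checks -- predetermined detection count
    """
    ref = 0         # current estimation reference (starts at the first packet)
    checks = 0      # detection count C, initially zero
    elapsed = 0.0   # playing time accumulated since the reference packet
    for j in range(1, len(arrivals)):
        elapsed += durations[j - 1]
        # Assumed estimate: reference receiving time plus accumulated lengths.
        estimated = arrivals[ref] + elapsed
        if arrivals[j] < estimated:
            # Packet j arrived earlier than estimated: it becomes the new
            # reference and the detection count returns to zero.
            ref = j
            checks = 0
            elapsed = 0.0
        else:
            checks += 1
            if checks == max_checks:
                # Threshold reached: stop detecting, keep the current reference.
                break
    return ref


# Hypothetical receiving times in milliseconds, 10 ms per packet.
arrivals = [100, 115, 124, 126, 140, 151]
durations = [10, 10, 10, 10, 10, 10]
print(find_min_delay_packet(arrivals, durations, max_checks=10))  # prints 3
print(find_min_delay_packet(arrivals, durations, max_checks=2))   # prints 0
```

In the first call the fourth packet (index 3) arrives at 126, earlier than the estimate of 130, so it replaces the first packet as the reference. In the second call the threshold of 2 is reached before that packet is examined, so the first packet is kept.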

After the minimum transmission delay audio packet (e.g., the audio packet PG1 or the audio packet PG4) is found, the start playing time of each of the audio packets PG1 to PGn can be estimated. The playing time calculating module 40 adds a system fixed delay time to the actual receiving time of the minimum transmission delay audio packet (e.g., the audio packet PG4) to estimate the start playing time of the minimum transmission delay audio packet.

Then, the playing time calculating module 40 calculates the start playing time of each of the other audio packets among PG1 to PGn based on the start playing time of the minimum transmission delay audio packet (e.g., the audio packet PG1 or the audio packet PG4) and the playing time length of each of the audio packets PG1 to PGn, as described in step S115 above; the details are not repeated here.

When the audio packet filtering module 50 determines that the start playing time of any one of the audio packets PG1 to PGn, for example the audio packet PG5, exceeds (is later than) the predetermined playing time of the audio playing module PYT, the audio packet PG5 is discarded. Conversely, when the audio packet filtering module 50 determines that the start playing time of any one of the audio packets PG1 to PGn does not exceed the predetermined start playing time, it instructs the audio playing module PYT shown in fig. 4 to play that audio packet when its start playing time is reached.

One benefit of the present invention is that the system and method for maintaining real-time audio stream playback delay in a reliable transmission network effectively mitigate the large increase in playback delay that occurs in a reliable transmission network when the jitter buffer of the receiving end holds far more data than expected at the start of playback.

The disclosure above is only a preferred embodiment of the present invention and is not intended to limit the scope of the claims; all equivalent modifications made on the basis of the disclosure and drawings fall within the scope of the claims.
