Audio signal dynamic range compression

Document No.: 1568605 Publication date: 2020-01-24

Abstract: Audio signal dynamic range compression, designed by 游余立 on 2018-11-29. Systems, methods, and techniques for compressing the dynamic range of an audio signal are provided. In one implementation, an input audio signal is obtained, and a time-varying gain signal is provided based on that signal and a desired output range. The time-varying gain signal is then applied to the input audio signal to provide an output audio signal. Providing the time-varying gain signal includes low-pass filtering a signal based on the input audio signal, using an attack gain response time and a release gain response time as filtering parameters. In response to a determination that a transient has occurred in the input audio signal, the attack gain response time is decreased and the release gain response time is increased.

1. A method of compressing the dynamic range of an audio signal, comprising:

(a) obtaining an input audio signal;

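(b) providing a time-varying gain signal based on the input audio signal and a desired output range; and

(c) applying the time-varying gain signal to the input audio signal to provide an output audio signal,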

wherein step (b) comprises: (i) determining and providing an indication of whether a transient has occurred in the input audio signal, (ii) providing an attack gain response time and a release gain response time based on the indication of whether a transient has occurred in the input audio signal, and (iii) low-pass filtering a signal based on the input audio signal using the attack gain response time and the release gain response time as filtering parameters, and

wherein the attack gain response time is decreased and the release gain response time is increased in response to a determination that a transient has occurred in the input audio signal.

2. The method of claim 1, wherein the indication of whether a transient has occurred in the input audio signal indicates a measure of the strength of a detected transient.

3. The method of claim 2, wherein the attack gain response time decreases more and the release gain response time increases more as a result of a greater measure of the strength of the detected transient.

4. The method of claim 1, wherein the indication of whether a transient has occurred in the input audio signal is based on a crest factor detector using peaks.

5. The method of claim 4, wherein the indication of whether a transient has occurred in the input audio signal is also based on the intensity of the input audio signal compared to a specified threshold.

6. The method of claim 1, wherein the indication of whether a transient has occurred in the input audio signal is based on a strength of the input audio signal compared to a specified threshold.

7. The method according to any of claims 1-6, wherein the indication of whether a transient has occurred in the input audio signal is calculated as a delta value by first determining a plurality of preliminary attack values and then calculating the amount of change between a previous one of the plurality of preliminary attack values and a current one of the plurality of preliminary attack values.

8. The method of any of claims 1-6, wherein the input audio signal is frame-based, and the indication of whether a transient has occurred in the input audio signal is determined for each frame of the input audio signal.

9. The method of any one of claims 1-6, wherein step (b) further comprises identifying a gain to achieve a desired static range compression.

10. The method of any of claims 1-6, wherein each of the attack gain response time and the release gain response time is an exponential time constant.

Technical Field

The present invention relates to, among other things, systems, methods, and techniques for compressing the dynamic range of an audio signal (e.g., the range from the signal's minimum level to its maximum level). These may be used, for example, to increase the volume of an audio signal while better preventing or limiting audio distortion and/or damage to output devices such as speakers or headphones, and to improve the listening experience in general.

Background

The dynamic range of some audio channels is sometimes much wider than the range that available output devices (such as loudspeakers) can produce accurately or clearly. For example, when the audio signal is low, the reproduced sound may not be audible, and when the audio signal is high, the reproduced sound may clip or be overloaded. A person watching a movie may need to turn the volume down during loud scenes and up during quiet scenes.

Dynamic range compression attempts to address these problems. It refers to a class of techniques for reducing the dynamic range of an audio signal to accommodate playback device and/or contextual requirements. A review of this technique is given in D. Giannoulis, M. Massberg, and J. Reiss, "Digital Dynamic Range Compressor Design - A Tutorial and Analysis," Journal of the Audio Engineering Society, vol. 60, 2012, beginning on page 399 (referred to herein as "Giannoulis 2012").

One conventional implementation is the Dynamic Range Compressor (DRC) 5 shown in Fig. 1. There, the Abs module 10 takes the absolute value of the input, and the Log module 12 applies a logarithmic function. In a more specific implementation, the Log module 12 converts the input values to decibels, as follows:

$$X_G(n) = 20 \log_{10}\bigl(|x(n)|\bigr)$$

where $x(n)$ represents the input signal and $X_G(n)$ denotes the converted signal at the $n$th sampling period.
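The decibel conversion performed by the Abs and Log modules can be sketched in a few lines of Python. This is only an illustration: the function name `to_decibels` and the `floor_db` clamp are assumptions added here so that a zero-valued sample does not produce negative infinity, and they are not taken from the source.

```python
import math


def to_decibels(x, floor_db=-120.0):
    """Return 20*log10(|x|), clamped at floor_db (an assumed lower bound)."""
    magnitude = abs(x)              # Abs module 10
    if magnitude == 0.0:
        return floor_db             # avoid log10(0)
    # Log module 12: convert the magnitude to decibels
    return max(20.0 * math.log10(magnitude), floor_db)


samples = [1.0, 0.5, -0.1, 0.0]
levels_db = [to_decibels(s) for s in samples]
```

A full-scale sample of 1.0 maps to 0 dB, and each tenfold drop in magnitude lowers the level by 20 dB.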

The gain computer 14 then implements static range compression, for example, as follows (from Giannoulis 2012):

$$Y_G = \begin{cases} X_G, & 2(X_G - T) < -W \\ X_G + \dfrac{(1/R - 1)\,(X_G - T + W/2)^2}{2W}, & 2\,|X_G - T| \le W \\ T + \dfrac{X_G - T}{R}, & 2(X_G - T) > W \end{cases} \quad \text{(Equation 1)}$$

where $T$, $R$, and $W$ are the specified threshold, compression ratio, and knee width, respectively. As used herein, the term "static" refers to the modification of an individual input value without reference to other input values (i.e., input values at other points in time). Other implementations of the gain computer 14 are also possible, such as any of the implementations given in "Dynamics Processors - Technology & Application Tips" by Rane Corporation, 2005 (referred to herein as "Rane 2005"), including, for example, combinations of compressors, expanders, and limiters.
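As a rough illustration of a static gain computer with a threshold, ratio, and knee width, the following Python sketch uses the soft-knee form commonly attributed to Giannoulis 2012. The function name and the example parameter values are assumptions chosen for illustration only.

```python
def static_gain_computer(x_g, threshold_db, ratio, knee_db):
    """Map an input level (dB) to an output level (dB) with a soft knee.

    Each call depends only on the current input value, which is what
    makes the mapping "static".
    """
    over = x_g - threshold_db
    if knee_db > 0.0 and 2.0 * abs(over) <= knee_db:
        # Inside the knee: quadratic blend between the two straight lines
        return x_g + (1.0 / ratio - 1.0) * (over + knee_db / 2.0) ** 2 / (2.0 * knee_db)
    if 2.0 * over > knee_db:
        # Above the knee: slope 1/ratio
        return threshold_db + over / ratio
    # Below the knee: unchanged
    return x_g


# Assumed example settings: -20 dB threshold, 4:1 ratio, 10 dB knee
quiet = static_gain_computer(-40.0, -20.0, 4.0, 10.0)
loud = static_gain_computer(0.0, -20.0, 4.0, 10.0)
```

With these settings a level well below the threshold passes through unchanged, while a 0 dB input is mapped to -15 dB. The knee branch meets both straight-line branches at its edges, so the curve is continuous.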

As shown in Fig. 1, the output of gain computer 14 is then subtracted from the input of gain computer 14 in subtractor 15 to obtain the following negative gain signal:

$$X_L = X_G - Y_G \quad \text{(Equation 2)}$$

which is input into a level detector 16 that essentially applies a smoothing operation to $X_L$ to obtain a smoothed representation of the signal level. There are many possible implementations of the level detector 16, including any of those described in Giannoulis 2012. In one particular example, level detector 16 performs the following operations:

$$Y_1(n) = \max\bigl(X_L(n),\ \alpha_R Y_1(n-1) + (1-\alpha_R)\,X_L(n)\bigr)$$

$$Y_L(n) = \alpha_A Y_L(n-1) + (1-\alpha_A)\,Y_1(n) \quad \text{(Equation 3)}$$

where $Y_1(n)$ is an internal state, and $\alpha_A$ and $\alpha_R$ are the attack and release poles of the respective first-order infinite impulse response (IIR) filters. These poles control the smoothness of $Y_L(n)$, that is, how quickly $Y_L(n)$ responds to changes in $X_L(n)$. Each pole is related to a corresponding time constant (TC) $\tau$ as follows:

$$\alpha = e^{-1/(\tau f_s)}$$

where $f_s$ is the sampling frequency. In other words,

$$\alpha_A = e^{-1/(\tau_A f_s)}, \qquad \alpha_R = e^{-1/(\tau_R f_s)}$$

where $\tau_A$ and $\tau_R$ are the attack TC and release TC, respectively.
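The level detector described above can be sketched as follows. The pole-to-time-constant relation used here, alpha = exp(-1 / (tau * f_s)), is the one commonly given for this kind of first-order smoother and is treated as an assumption; the function names and example values are likewise illustrative.

```python
import math


def pole_from_time_constant(tau_seconds, sample_rate_hz):
    """First-order IIR pole for an exponential time constant (assumed relation)."""
    return math.exp(-1.0 / (tau_seconds * sample_rate_hz))


def smooth_level(x_l, alpha_a, alpha_r):
    """Two-stage smoother: release stage with max(), then attack stage."""
    y1 = 0.0    # internal state Y_1
    y_l = 0.0   # smoothed output Y_L
    out = []
    for x in x_l:
        # Rises jump to x immediately; falls decay with the release pole
        y1 = max(x, alpha_r * y1 + (1.0 - alpha_r) * x)
        # The attack pole then smooths the result
        y_l = alpha_a * y_l + (1.0 - alpha_a) * y1
        out.append(y_l)
    return out


fs = 1000.0                                    # assumed sample rate for the demo
alpha_a = pole_from_time_constant(0.010, fs)   # 10 ms attack
alpha_r = pole_from_time_constant(0.100, fs)   # 100 ms release
step = [10.0] * 500 + [0.0] * 500              # 10 dB of gain reduction, then none
smoothed = smooth_level(step, alpha_a, alpha_r)
```

After one attack time constant (10 samples at this rate) the output has covered about 63% of the step, which is the usual meaning of an exponential time constant.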

After smoothing in the level detector 16, the make-up (compensation) gain 19 ($M$) is added to the negated smoothed negative gain in adder 18, and the result is converted to a linear scale by the exponential function block 20, for example, as follows:

$$K(n) = 10^{(M - Y_L(n))/20}$$

This linear gain is then applied to the (optionally delayed) input signal in multiplier 21 to produce an output signal, as follows:

$$y(n) = K(n)\,x(n-\tau)$$

where τ is an optional delay, provided by optional delay unit 22, that may be used to match the delay within gain computation sidechain 30 and/or to give sidechain 30 the ability to "look ahead" (e.g., to "prepare" DRC 5 to cope better with strong attacks). In some embodiments, however, delay element 22 is omitted entirely.
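Putting the stages together, a conventional feed-forward compressor of this kind might be sketched as below. Several simplifications are assumptions made for brevity: a hard knee instead of a soft one, no delay (τ = 0, as in embodiments that omit the delay element), a -120 dB floor on the level, and the linear gain taken as 10^((M - Y_L)/20). All names and default values are illustrative.

```python
import math


def compress(x, fs, threshold_db=-20.0, ratio=4.0,
             tau_a=0.010, tau_r=0.100, makeup_db=0.0):
    """Feed-forward DRC sketch with fixed attack and release time constants."""
    alpha_a = math.exp(-1.0 / (tau_a * fs))
    alpha_r = math.exp(-1.0 / (tau_r * fs))
    y1 = 0.0
    y_l = 0.0
    out = []
    for sample in x:
        # Abs + Log: level in dB (floored to avoid log10(0))
        x_g = 20.0 * math.log10(max(abs(sample), 1e-6))
        # Gain computer: hard-knee static curve
        if x_g > threshold_db:
            y_g = threshold_db + (x_g - threshold_db) / ratio
        else:
            y_g = x_g
        # Subtractor: negative gain in dB
        x_l = x_g - y_g
        # Level detector: release stage, then attack stage
        y1 = max(x_l, alpha_r * y1 + (1.0 - alpha_r) * x_l)
        y_l = alpha_a * y_l + (1.0 - alpha_a) * y1
        # Adder + exponential block: back to a linear gain
        k = 10.0 ** ((makeup_db - y_l) / 20.0)
        # Multiplier: apply the gain to the (undelayed) input
        out.append(k * sample)
    return out


quiet_out = compress([0.01] * 100, fs=1000.0)
loud_out = compress([1.0] * 2000, fs=1000.0)
```

A signal below the threshold passes through unchanged, while a sustained full-scale signal settles at a gain of 10^(-15/20), about 0.178. The gain gets there gradually, which is the slow response the following discussion is concerned with.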

As the above discussion makes clear, a DRC multiplies the input signal by a time-varying gain, an operation that introduces distortion into the signal. To keep this distortion low, and ideally inaudible, a large TC should be used so that the gain changes slowly. Such large TCs work well for quasi-stationary segments of the audio signal, but audio signals are often subject to transient attacks, i.e., sudden, powerful bursts of sound such as those produced by a percussion instrument or an explosion. When such a strong attack arrives, the slow gain variation produced by a large TC cannot reduce the gain fast enough to keep the attack from exceeding the upper limit of the desired range. As a result, the audio signal may clip in the digital domain (causing distortion), the power amplifier may be overloaded (potentially damaging it), and/or the voice coil of the loudspeaker may strike its back plate (potentially damaging the loudspeaker). All of these produce objectionable sound, and some can cause damage. It is therefore desirable to adapt the TC to the dynamically changing nature of the input signal so as to: (1) allow fast gain reduction during strong attacks, and (2) provide more slowly varying gain during quasi-stationary segments.
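A small calculation shows why a large TC reacts too slowly to a sudden attack. Assuming an ideal first-order exponential response (an idealization, with a function name chosen here for illustration), the fraction of the required gain reduction applied after a given time is 1 - exp(-t / tau):

```python
import math


def gain_reduction_fraction(elapsed_s, tau_s):
    """Fraction of a step change reached after elapsed_s by a first-order smoother."""
    return 1.0 - math.exp(-elapsed_s / tau_s)


# 5 ms after the onset of a strong attack:
slow = gain_reduction_fraction(0.005, 0.100)   # 100 ms attack TC
fast = gain_reduction_fraction(0.005, 0.005)   # 5 ms attack TC
```

Under this idealization, the 100 ms TC has applied less than 5% of the needed gain reduction after 5 ms, whereas the 5 ms TC has applied about 63%.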

Some attempts have been made in this regard. For example, D. Giannoulis, M. Massberg, and J. Reiss, "Parameter Automation in a Dynamic Range Compressor," Journal of the Audio Engineering Society, 2013, pages 716-726 (referred to herein as "Giannoulis 2013"), employs a transient (attack) detector to distinguish between transient and quasi-stationary segments of an input signal, and then uses shorter TCs for transient segments and longer TCs for quasi-stationary segments. Typically, such conventional methods use "standard" attack and release TCs during quasi-stationary segments. For example, during a quasi-stationary segment the attack TC may be 50-100 milliseconds (ms), while the release TC (typically 10 times larger) may be 500-1000 ms. When a transient is detected, these values are typically reduced by a factor of 10, so that the attack TC drops to 5-10 ms and the release TC drops to 50-100 ms.

Summary of The Invention

Unfortunately, while the conventional, straightforward approach of using short attack and release TCs during transients and long attack and release TCs during quasi-stationary segments seems intuitively logical, the inventors have found that it often does not actually provide good results. The present invention addresses this problem, for example, by adjusting the TCs, or other measures of how fast the gain is allowed to change (sometimes referred to herein as "gain response times"), in a manner that differs from the methods used in the past.

Accordingly, one embodiment of the present invention is directed to compressing the dynamic range of an audio signal, for example, in which: an input audio signal is obtained; a time-varying gain signal is provided based on the input audio signal and a desired output range; and the time-varying gain signal is applied to the input audio signal to provide an output audio signal. Providing the time-varying gain signal in this embodiment includes: (i) determining and providing an indication of whether a transient has occurred in the input audio signal, (ii) providing an attack gain response time (e.g., an attack exponential time constant) and a release gain response time (e.g., a release exponential time constant) based on that indication, and (iii) low-pass filtering a signal based on the input audio signal using the attack gain response time and the release gain response time as filtering parameters. In response to a determination that a transient has occurred in the input audio signal, the attack gain response time is decreased and the release gain response time is increased. Preferably, the attack gain response time primarily controls how fast the output audio signal strength is allowed to increase in response to a sudden increase in the input audio signal strength, and the release gain response time primarily controls how fast the output audio signal strength is allowed to decrease in response to a sudden decrease in the input audio signal strength.

As discussed in more detail below, an indication of whether a transient is present in the input audio signal is preferably provided as an attack function value, which also indicates a measure of the strength of any detected transients. Preferably, the attack gain response time decreases more and the release gain response time increases more due to a greater measure of the strength of the detected transient.
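The difference between the conventional adjustment and the one summarized above can be illustrated with a short Python sketch. The baseline values (50 ms attack, 500 ms release, factor of 10) come from the background discussion. The geometric scaling law, the normalization of transient strength to the range 0 to 1, and both function names are hypothetical choices made purely for illustration, since this section does not specify a particular mapping.

```python
def conventional_time_constants(transient, tau_a=0.050, tau_r=0.500, factor=10.0):
    """Conventional approach: shorten BOTH time constants on a transient."""
    if transient:
        return tau_a / factor, tau_r / factor
    return tau_a, tau_r


def adaptive_time_constants(strength, tau_a=0.050, tau_r=0.500, max_factor=10.0):
    """Direction described in the summary: shorter attack, LONGER release.

    `strength` is a hypothetical transient-strength measure clipped to
    [0, 1]; a stronger transient moves both time constants further.
    """
    s = min(max(strength, 0.0), 1.0)
    scale = max_factor ** s          # 1 for no transient, max_factor at full strength
    return tau_a / scale, tau_r * scale


no_transient = adaptive_time_constants(0.0)
mild = adaptive_time_constants(0.5)
strong = adaptive_time_constants(1.0)
```

With these assumed numbers, a full-strength transient gives an attack time constant of about 5 ms in both schemes, but the release time constant becomes about 50 ms in the conventional scheme and about 5 s in the adaptive sketch.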

The attack function according to the present invention can be implemented in a variety of different ways, including any one or any combination of the following: (1) using a crest factor detector that uses peaks; (2) comparing the intensity of the input audio signal to a specified threshold; and/or (3) computing a delta value by first determining preliminary attack values and then calculating the amount of change between a previous preliminary attack value and a current preliminary attack value. The input audio signal may be frame-based, e.g., with the indication of whether a transient has occurred determined for each frame of the input audio signal, or not frame-based, with the indication determined on a sample-by-sample basis.
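The three ingredients listed above can be combined in a frame-based sketch such as the following. Everything specific here is an assumption for illustration: the crest factor is computed per frame as peak over RMS in dB, the intensity test compares the frame peak with a hypothetical -30 dB threshold, and the delta is the raw difference between consecutive preliminary values. The source does not give these details in this section.

```python
import math


def crest_factor_db(frame):
    """Peak-to-RMS ratio of one non-empty frame, in dB (0 dB for silence)."""
    peak = max(abs(s) for s in frame)
    rms = math.sqrt(sum(s * s for s in frame) / len(frame))
    if rms == 0.0:
        return 0.0
    return 20.0 * math.log10(peak / rms)


def preliminary_attack(frame, level_threshold_db=-30.0):
    """Crest factor gated by an intensity threshold (hypothetical combination)."""
    peak = max(abs(s) for s in frame)
    if peak == 0.0 or 20.0 * math.log10(peak) < level_threshold_db:
        return 0.0                       # too quiet to count as a transient
    return crest_factor_db(frame)


def delta_attack(frames, level_threshold_db=-30.0):
    """Change between the previous and current preliminary attack values."""
    previous = 0.0
    deltas = []
    for frame in frames:
        current = preliminary_attack(frame, level_threshold_db)
        deltas.append(current - previous)
        previous = current
    return deltas


steady = [0.5] * 8
impulse = [0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0]
deltas = delta_attack([steady, impulse, steady])
```

A steady frame has a crest factor of 0 dB, while the single-impulse frame has a crest factor of about 9 dB, so the delta value rises sharply on the frame where the impulse arrives and falls again on the next frame.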

In a preferred embodiment, the generation of the time-varying gain signal further comprises identifying a gain (e.g., using a piecewise linear mapping of input audio signal values) that will achieve the desired static range compression.
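One way to picture a piecewise linear mapping for static range compression is a list of (input level, output level) breakpoints in dB with straight lines between them. The sketch below is a hypothetical illustration: the breakpoint values, the function names, and the choice to extend the end segments beyond the outermost breakpoints are all assumptions.

```python
def piecewise_linear_level(x_db, points):
    """Interpolate an output level (dB) from sorted (input_db, output_db) breakpoints.

    Requires at least two breakpoints. Inputs outside the breakpoint range
    are handled by extending the first or last segment.
    """
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        if x_db <= x1:
            break                        # found the segment containing x_db
    slope = (y1 - y0) / (x1 - x0)
    return y0 + slope * (x_db - x0)


def static_gain_db(x_db, points):
    """Gain (dB) that moves the input level onto the mapped output level."""
    return piecewise_linear_level(x_db, points) - x_db


# Assumed curve: unity below -20 dB, 4:1 compression above it
curve = [(-80.0, -80.0), (-20.0, -20.0), (0.0, -15.0)]
gain_quiet = static_gain_db(-40.0, curve)
gain_loud = static_gain_db(-10.0, curve)
```

For this assumed curve, a -40 dB input needs no gain change, while a -10 dB input is mapped to -17.5 dB, i.e. a gain of -7.5 dB.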

The foregoing summary is intended only to provide a brief description of certain aspects of the invention. A more complete understanding of the present invention may be derived by referring to the claims and the following detailed description of the preferred embodiments when considered in conjunction with the figures.

According to an embodiment of the invention, the following is also included:

1) A method of compressing the dynamic range of an audio signal, comprising:

(a) obtaining an input audio signal;

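(b) providing a time-varying gain signal based on the input audio signal and a desired output range; and

(c) applying the time-varying gain signal to the input audio signal to provide an output audio signal,

wherein step (b) comprises: (i) determining and providing an indication of whether a transient has occurred in the input audio signal, (ii) providing an attack gain response time and a release gain response time based on the indication of whether a transient has occurred in the input audio signal, and (iii) low-pass filtering a signal based on the input audio signal using the attack gain response time and the release gain response time as filtering parameters, and

wherein the attack gain response time is decreased and the release gain response time is increased in response to a determination that a transient has occurred in the input audio signal.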

2) The method of 1), wherein the indication of whether a transient occurs in the input audio signal indicates a measure of the strength of a detected transient.

3) The method of 2), wherein the attack gain response time decreases more and the release gain response time increases more as a result of a greater measure of the strength of the detected transient.

4) The method of 1), wherein the indication of whether a transient occurs in the input audio signal is based on a crest factor detector using peaks.

5) The method of 4), wherein the indication of whether a transient has occurred in the input audio signal is also based on the intensity of the input audio signal compared to a specified threshold.

6) The method of 1), wherein the indication of whether a transient has occurred in the input audio signal is based on the intensity of the input audio signal compared to a specified threshold.

7) The method according to any one of 1)-6), wherein the indication of whether a transient has occurred in the input audio signal is calculated as a delta value by first determining a plurality of preliminary attack values and then calculating the amount of change between a previous one of the plurality of preliminary attack values and a current one of the plurality of preliminary attack values.

8) The method according to any of 1) -6), wherein the input audio signal is frame-based, and the indication of whether a transient has occurred in the input audio signal is determined for each frame of the input audio signal.

9) The method of any of 1) -6), wherein step (b) further comprises identifying a gain to achieve a desired static range compression.

10) The method of any of 1) -6), wherein each of the attack gain response time and the release gain response time is an exponential time constant.

11) The method of any of 1) -6), wherein the attack gain response time primarily controls how fast the intensity of the output audio signal is allowed to increase in response to a sudden increase in the intensity of the input audio signal, and the release gain response time primarily controls how fast the intensity of the output audio signal is allowed to decrease in response to a sudden decrease in the intensity of the input audio signal.

12) A system for compressing the dynamic range of an audio signal, comprising:

(a) a system input that accepts an input audio signal;

(b) an adaptive gain generation module having an input coupled to the system input and an output providing a time-varying gain signal based on the input audio signal and a desired output range;

(c) a multiplier having an output, a first input coupled to the system input, and a second input coupled to the output of the adaptive gain generation module,

wherein the adaptive gain generation module comprises a level detector having an input and an output and a gain computer,

wherein the level detector comprises: (i) an attack detection module that determines and provides an indication of whether a transient has occurred in the input audio signal, (ii) a gain response time generator that provides an attack gain response time and a release gain response time based on the indication of whether a transient has occurred in the input audio signal that has been provided by the attack detection module, and (iii) a filter that low-pass filters a signal that has been input into the level detector using the attack gain response time and the release gain response time as filtering parameters, and

wherein the gain response time generator shortens the attack gain response time and increases the release gain response time in response to transient detection by the attack detection module.

13) The system of 12), wherein the indication of whether a transient occurs in the input audio signal provided by the attack detection module indicates a measure of the strength of the detected transient.

14) The system of 13), wherein the attack gain response time decreases more and the release gain response time increases more as a result of a greater measure of the strength of the detected transient.

15) The system of 12), wherein the indication of whether a transient has occurred in the input audio signal is based on a crest factor detector using peaks.

16) The system of 15), wherein the indication of whether a transient has occurred in the input audio signal is also based on the intensity of the input audio signal compared to a specified threshold.

17) The system of 12), wherein the indication of whether a transient has occurred in the input audio signal is based on the intensity of the input audio signal compared to a specified threshold.

18) The system of any of 12)-17), wherein the indication of whether a transient has occurred in the input audio signal is calculated as a delta value by first determining a plurality of preliminary attack values and then calculating the amount of change between a previous one of the plurality of preliminary attack values and a current one of the plurality of preliminary attack values.

19) The system of any of 12) -17), wherein the gain computer identifies a gain that produces static range compression to accommodate the desired output range.

20) The system of any of 12) -17), wherein the attack gain response time primarily controls how fast the intensity of the output audio signal is allowed to increase in response to a sudden increase in the intensity of the input audio signal, and the release gain response time primarily controls how fast the intensity of the output audio signal is allowed to decrease in response to a sudden decrease in the intensity of the input audio signal.

21) A computer readable medium storing a computer program executable to perform the method of any one of 1) -11).

22) An adaptive gain generation module comprising:

(a) an input;

(b) an output;

(d) a level detector having an input and an output,

wherein the level detector comprises: (i) an attack detection module that determines and provides an indication of whether a transient has occurred in an input audio signal, (ii) a gain response time generator that provides an attack gain response time and a release gain response time based on the indication of whether a transient has occurred in the input audio signal that has been provided by the attack detection module, and (iii) a filter that low-pass filters a signal that has been input into the level detector using the attack gain response time and the release gain response time as filtering parameters, and

wherein the gain response time generator shortens the attack gain response time and increases the release gain response time in response to transient detection by the attack detection module.

23) The adaptive gain generation module of 22), wherein the indication of whether a transient occurs in the input audio signal provided by the attack detection module indicates a measure of the strength of the detected transient.

24) The adaptive gain generation module of 23), wherein the attack gain response time decreases more and the release gain response time increases more as a result of a greater measure of the strength of the detected transient.

25) The adaptive gain generation module of 22), wherein the indication of whether a transient has occurred in the input audio signal is based on a crest factor detector using peaks.

26) The adaptive gain generation module of 25), wherein the indication of whether a transient has occurred in the input audio signal is also based on the strength of the input audio signal compared to a specified threshold.

27) The adaptive gain generation module of 22), wherein the indication of whether a transient has occurred in the input audio signal is based on the strength of the input audio signal compared to a specified threshold.

28) The adaptive gain generation module according to any one of 22)-27), wherein the indication of whether a transient has occurred in the input audio signal is calculated as a delta value by first determining a plurality of preliminary attack values and then calculating the amount of change between a previous one of the plurality of preliminary attack values and a current one of the plurality of preliminary attack values.

29) The adaptive gain generation module of any of 22) -27), wherein the gain computer identifies a gain that produces static range compression to accommodate a desired output range.

30) The adaptive gain generation module of any one of 22) -27), wherein the attack gain response time primarily controls how fast the intensity of the output audio signal is allowed to increase in response to an abrupt increase in the intensity of the input audio signal, and the release gain response time primarily controls how fast the intensity of the output audio signal is allowed to decrease in response to an abrupt decrease in the intensity of the input audio signal.
