Audio enhancement in response to compression feedback

Document No.: 835647. Publication date: 2021-03-30.

Abstract: Audio enhancement in response to compression feedback, created on 2019-06-20 by 蒂莫西·艾伦·波特, 威廉·托马斯·罗利, 温斯顿·子维·黄 and 塞巴斯蒂安·P·B·霍尔茨阿普费尔. In some embodiments, the disclosure describes a method for performing enhancement on an audio signal to produce an enhanced audio signal in response to feedback indicative of an amount of compression applied to at least one frequency band of the enhanced audio signal. In a typical embodiment, the enhancement is or includes a bass enhancement. Examples of other types of enhancements performed in other embodiments include dialog enhancement, upmixing, frequency shifting, harmonic injection or transposition, sub-harmonic injection, virtualization, and equalization. Other aspects are systems (e.g., a programmed processor) and devices (e.g., devices with physically limited bass reproduction capability such as notebook computers, tablet computers, mobile phones, or other devices with small speakers) configured to perform any embodiment of the method.

1. A method for audio signal compression and enhancement, comprising the steps of:

performing enhancement on an input audio signal to produce an enhanced audio signal; and

performing multi-band compression on the enhanced audio signal, producing a compressed enhanced audio signal, wherein the enhancement is performed in response to compression feedback indicating an amount of compression applied to each of at least two frequency bands of the enhanced audio signal.

2. The method of claim 1, wherein the enhancement is a bass enhancement, the bass enhancement is or includes at least one of a psychoacoustic bass enhancement or an equalized bass enhancement, and the enhanced audio signal is a bass enhanced audio signal.

3. The method of claim 2, wherein the bass enhancement is performed in response to the compression feedback, including by selectively applying one or the other or both of the psychoacoustic bass enhancement or the equalized bass enhancement to the input audio signal in a manner controlled by the compression feedback.

4. The method of claim 1, wherein the enhancement is or includes at least one of: dialog enhancement, upmixing, frequency shifting, harmonic injection, harmonic transposition, subharmonic injection, virtualization, equalization, volume modeling, volume leveling, or automatic gain control.

5. The method of claim 1, wherein the enhancement is or includes automatic gain control.

6. The method of claim 1, 2, 3, 4, or 5, wherein the multi-band compression of the enhanced audio signal is performed in a manner intended to prevent distortion when playing the compressed enhanced audio signal.

7. The method of claim 1, 2, 3, 4, 5 or 6, wherein the input audio signal is indicative of audio content, the method comprising the steps of:

in response to the input audio signal, generating banded frequency-domain audio data indicative of the audio content such that the banded frequency-domain audio data includes a series of frequency components for each different frequency band in a set of frequency bands, and wherein the enhancing is performed on the banded frequency-domain audio data.

8. A system, comprising:

an enhancement subsystem coupled and configured to perform enhancement on an input audio signal to generate an enhanced audio signal; and

a multi-band compressor coupled and configured to perform multi-band compression on the enhanced audio signal, thereby generating a compressed enhanced audio signal, and to provide compression feedback to the enhancement subsystem, wherein the compression feedback is indicative of an amount of compression applied by the multi-band compressor to each of at least two frequency bands of the enhanced audio signal, and

wherein the enhancement subsystem is configured to perform the enhancement in response to the compression feedback.

9. The system of claim 8, wherein the enhancement is a bass enhancement, the bass enhancement is or includes at least one of a psychoacoustic bass enhancement or an equalized bass enhancement, and the enhanced audio signal is a bass enhanced audio signal.

10. The system of claim 9, wherein the enhancement subsystem is configured to perform the bass enhancement in response to the compression feedback, including by selectively applying one or the other or both of the psychoacoustic bass enhancement or the equalized bass enhancement to the input audio signal in a manner controlled by the compression feedback.

11. The system of claim 8, wherein the enhancement is or includes at least one of: dialog enhancement, upmixing, frequency shifting, harmonic injection, harmonic transposition, subharmonic injection, virtualization, equalization, volume modeling, volume leveling, or automatic gain control.

12. The system of claim 8, wherein the enhancement is or includes automatic gain control.

13. The system of claim 8, 9, 10, 11 or 12, wherein the system is an audio playback system.

14. The system of claim 8, 9, 10, 11, or 12, wherein the system is a processor programmed to implement the enhancement subsystem and the multi-band compressor.

15. The system of claim 8, 9, 10, 11, or 12, wherein the system is a digital signal processor configured to implement the enhancement subsystem and the multi-band compressor.

Technical Field

The invention relates to methods and systems for performing enhancement (e.g., bass enhancement) and compression on an audio signal, in which the enhanced audio signal is generated in response to feedback indicative of compression applied to each of at least two frequency bands of the enhanced audio signal. In some embodiments, the enhancement includes at least one of psychoacoustic bass enhancement (e.g., harmonic transposition) and equalized bass enhancement, and is performed in response to feedback indicative of an amount of compression applied to individual frequency bands of the bass-enhanced audio signal.

Background

There are several known methods to modify an audio signal (thereby producing an enhanced audio signal) to enhance perceived low frequency (bass) content during playback of the enhanced audio signal. These can be classified as:

"equalized bass enhancement" techniques, which enhance the true (physical) bass response of the playback speaker via equalization strategies that boost low-frequency content; or

"psychoacoustic bass enhancement" techniques, which enhance the perceived bass response of the playback speaker (e.g., a small loudspeaker) via a psychoacoustic strategy (e.g., a "virtual bass" synthesis or generation method) designed to increase the perceived level of bass content of an audio signal during playback by at least one loudspeaker that is incapable of physically reproducing the bass frequencies of the audio signal.

Equalization strategies are simpler to implement and are generally considered to provide a better listening experience than psychoacoustic strategies. Thus, if the speaker used to play the audio signal is capable of reproducing real (physical) low-frequency content, equalized bass enhancement is typically applied to the signal rather than psychoacoustic bass enhancement. In some cases (e.g., when the speaker used for playback is not capable of reproducing real low-frequency content), psychoacoustic bass enhancement is employed instead of or in addition to equalized bass enhancement. However, until the present invention, it was not known how to apply the two types of bass enhancement to an audio signal (e.g., in a bass enhancement subsystem of an overall audio signal processing system), for example by selectively applying one or the other or both, in a manner controlled by the amount of compression applied to individual frequency bands of the resulting bass-enhanced signal (e.g., the output of the bass enhancement subsystem).

At volume levels (of the input audio signal undergoing enhancement and playback) that are significantly below the maximum operating level of the speaker, the equalized bass enhancement strategy generally works well. However, at higher volume levels, boosting real (physical) low-frequency content with equalized bass enhancement may cause the speaker to distort at these low frequencies.

It is known to prevent loudspeaker distortion by using a multi-band compressor (e.g., the audio regulator of the Dolby Audio API) that attenuates individual frequency bands of an audio signal according to per-band energy thresholds, which may be configured based on the actual distortion characteristics of the playback system in the individual frequency bands. A multi-band compressor (sometimes referred to herein as a "regulator") may limit or attenuate, but does not boost, the signal level in any frequency band of the audio signal on which it operates.

However, equalized bass enhancement used to boost low-frequency content can be counteracted by the multi-band compression that the regulator applies to reduce speaker distortion (especially when playback at high volume is desired), sometimes to the point of completely canceling the bass enhancement. Applying such bass enhancement and compression may even have the unintended result of reducing the overall playback volume, since the regulator may also attempt to preserve timbre (e.g., by not only attenuating at least one frequency band to prevent distortion, but also attenuating adjacent frequency bands by a similar amount).

A psychoacoustic strategy for bass enhancement (e.g., bass enhancement implemented by the "virtual bass" processing of the Dolby Audio API) supplements the energy of a lower frequency band (which the playback speaker cannot reproduce) with energy in a higher frequency band that the speaker is capable of reproducing. This type of bass enhancement processing is typically used when the speaker is unable to reproduce low-frequency content at any volume level due to the fundamental physical limitations of the speaker. However, this type of bass enhancement processing (used in some embodiments of the invention) may also be used when the speaker is capable of reproducing the relevant low-frequency content but doing so is not desirable (e.g., due to minor system limitations).

One conventional type of psychoacoustic bass enhancement is bass synthesis, a collective term for a class of techniques that add components to the low-frequency range of an audio signal to enhance the bass perceived during playback of the enhanced signal. Some such techniques, sometimes referred to as sub-bass synthesis methods, create low-frequency components below the existing frequency components of the signal in order to extend and strengthen the lowest frequency range. Other techniques in this class, known as "virtual pitch" algorithms, produce audible harmonics from an inaudible bass range (e.g., a bass range that is inaudible when the signal is rendered by a small loudspeaker), such that the produced harmonics improve the perceived bass response. Virtual pitch methods typically exploit the well-known "missing fundamental" phenomenon, in which low pitches (one or more low-frequency fundamentals and the lower harmonics of each fundamental) can sometimes be inferred by the human auditory system from the upper harmonics of the low-frequency fundamentals when the fundamentals and the lower harmonics (e.g., the first harmonic of each fundamental) themselves are missing.

Some virtual pitch methods are designed to increase the perceived level of bass content of an audio signal during playback of the signal by one or more loudspeakers that are unable to physically reproduce the bass frequencies of the audio signal. Such a method generally comprises the steps of: analyzing bass frequencies present in the input audio; and enhancing the input audio by generating audible harmonics (included in the enhanced audio) that aid perception of the lower frequencies that are missing during playback of the enhanced audio (e.g., playback by a small loudspeaker that is unable to physically reproduce the missing lower frequencies). Such methods perform harmonic transposition on frequency components of the input audio that are expected to be inaudible during playback (i.e., frequencies too low to be audible during playback on the expected speakers) to produce audible higher-frequency components (i.e., frequencies high enough to be audible during playback on the expected speakers). For example, an audio signal may have frequency components in an inaudible range and frequency components in an audible range that is higher than the inaudible range. Harmonic transposition of frequency components in the inaudible range may produce transposed frequency components in a portion of the audible range, which may enhance the perceived level of bass content of the audio signal during playback. Such harmonic transposition may include applying a plurality of transposition factors to each relevant frequency component of the input audio to generate a plurality of harmonics of the component.
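The transposition step described above can be illustrated with a minimal Python sketch. It is not the method of any particular product: the function name `transpose_harmonics`, the representation of components as `(frequency_hz, amplitude)` pairs, the choice of factors 2, 3 and 4, and the `amplitude / factor` weighting are all assumptions made for illustration.

```python
def transpose_harmonics(components, cutoff_hz, factors=(2, 3, 4)):
    """Replace each component below cutoff_hz by harmonics at factor * frequency.

    components: list of (frequency_hz, amplitude) pairs (hypothetical format).
    cutoff_hz:  lowest frequency the speaker is assumed to reproduce.
    """
    enhanced = []
    for freq, amp in components:
        if freq >= cutoff_hz:
            # Already in the audible range: pass through unchanged.
            enhanced.append((freq, amp))
            continue
        for k in factors:
            target = freq * k
            # Keep only harmonics that land in the audible range.
            if target >= cutoff_hz:
                # Assumed weighting: higher harmonics get less amplitude.
                enhanced.append((target, amp / k))
    return sorted(enhanced)


result = transpose_harmonics([(50.0, 1.0), (400.0, 0.5)], cutoff_hz=120.0)
```

Here the 50 Hz component is assumed to be inaudible, so it is dropped and replaced by harmonics at 150 Hz and 200 Hz (the 100 Hz harmonic is still below the cutoff), while the 400 Hz component passes through.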

Disclosure of Invention

In a first class of embodiments, the invention is a method for performing enhancement on an audio signal to produce an enhanced audio signal in response to feedback indicative of an amount of compression applied to each of at least two frequency bands of the enhanced audio signal. In typical embodiments in this category, the enhancement is or includes a bass enhancement. Examples of other types of enhancements performed in other embodiments include (but are not limited to): dialog enhancement, upmixing, frequency shifting, harmonic injection or transposition, sub-harmonic injection, virtualization, and equalization.

Some embodiments in the first class include the following steps: enhancing the audio signal (e.g., in an enhancement stage or subsystem) to produce an enhanced audio signal; and performing multi-band compression on the enhanced audio signal (e.g., in a regulator coupled to an output of the enhancement stage or subsystem, in an effort to prevent distortion upon playback), wherein the enhancement is performed in response to compression feedback indicating an amount of compression applied to each of at least two frequency bands of the enhanced audio signal. In some such embodiments, the enhancement is or includes one or both of psychoacoustic bass enhancement (e.g., harmonic transposition) and equalized bass enhancement, performed on the input audio signal to generate a bass-enhanced audio signal, and the bass enhancement is controlled (e.g., to selectively apply one or the other or both of the two types of bass enhancement to the input audio signal) by feedback indicative of an amount of compression applied to each of at least two frequency bands of the bass-enhanced audio signal.

In some embodiments, the present methods and systems implement a dynamic mixing method that uses both (or a selected one) of equalized bass enhancement and psychoacoustic bass enhancement to produce an enhanced signal, and then apply multi-band compression (sometimes referred to as regulation) to the enhanced signal in an effort to prevent distortion upon playback. When the enhancement raises the level of the input audio signal in some frequency bands (typically, low frequency bands) without causing distortion upon playback (or being limited by the regulator in those bands), e.g., at lower volume levels of the input signal, equalized bass enhancement (e.g., relatively more equalized bass enhancement than psychoacoustic bass enhancement) is used to increase the level (energy) in those bands and improve the overall bass response. When the enhancement raises the level of the input signal in a low frequency band such that it would cause distortion upon playback (or be limited by the regulator in that band), e.g., at higher volume levels of the input signal, psychoacoustic bass enhancement (e.g., relatively more psychoacoustic bass enhancement than equalized bass enhancement) is used to increase the level (energy) of higher frequency bands, without increasing the energy of any lower frequency band that is at the edge of distortion or limiting, to improve the overall perceived bass response. In an exemplary embodiment, the determination of when equalized enhancement is preferable to psychoacoustic enhancement is made automatically on a block-by-block basis by generating feedback indicative of an amount of compression applied to each frequency band of the enhanced signal, and implementing a feedback loop in which the feedback is used to control generation of the enhanced signal.
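As a rough illustration of the block-by-block feedback loop, the following Python sketch tracks a single low frequency band in decibels. The function names, the 0.5 dB switching limit, the 6 dB boost and the -6 dB threshold are hypothetical values chosen for the example, and the one-line limiter stands in for a real regulator.

```python
def choose_mode(prev_attenuation_db, limit_db=0.5):
    """Pick the enhancement type for the next block from the previous feedback."""
    return "psychoacoustic" if prev_attenuation_db > limit_db else "equalized"


def run_blocks(low_band_levels_db, threshold_db=-6.0, eq_boost_db=6.0):
    """Simulate the loop for one low band; return the mode used in each block."""
    modes = []
    attenuation_db = 0.0  # feedback from the regulator; none before the first block
    for level_db in low_band_levels_db:
        mode = choose_mode(attenuation_db)
        modes.append(mode)
        # Equalized enhancement raises the low band; in this sketch the
        # psychoacoustic path leaves the low band level unchanged.
        enhanced_db = level_db + (eq_boost_db if mode == "equalized" else 0.0)
        # Stand-in regulator: attenuate only by the excess over the threshold.
        attenuation_db = max(0.0, enhanced_db - threshold_db)
    return modes


modes = run_blocks([-30.0, -20.0, -10.0, -4.0, -4.0])
```

With these assumed numbers the loop uses equalized enhancement for the first three blocks and switches to psychoacoustic enhancement once the regulator starts attenuating. A hard switch like this can toggle back and forth when the level sits near the threshold, so a practical design would presumably smooth the decision or blend the two types.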

Another aspect of the invention is a system (e.g., a device with physically or otherwise limited bass reproduction capability, such as a notebook computer, tablet computer, mobile phone, or other device with a small speaker) configured to perform any embodiment of the inventive method on an input audio signal.

In a class of embodiments, the invention is an audio playback system (e.g., a notebook computer, tablet computer, mobile phone, or other device having a small speaker, or a playback system having limited (e.g., physically limited) bass reproduction capability) that is configured to perform audio enhancement (e.g., bass enhancement) on audio in response to compression feedback (in accordance with any embodiment of the inventive method) to produce enhanced audio, and to play the enhanced audio.

In some embodiments, the present system is or includes a general-purpose or special-purpose processor programmed with software (or firmware) and/or configured to perform an embodiment of the present method. In some embodiments, the present system is a general purpose processor coupled to receive input audio data and programmed (with appropriate software) to generate output audio data by performing embodiments of the present method. In some embodiments, the inventive system is a digital signal processor coupled to receive input audio data and configured (e.g., programmed) to generate output audio data in response to the input audio data by performing an embodiment of the inventive method.

Aspects of the invention include a system configured (e.g., programmed) to perform any embodiment of the inventive method and a computer-readable medium (e.g., disk) storing code for implementing any embodiment of the inventive method.

Drawings

FIG. 1 is a block diagram of a system configured to perform audio enhancement (e.g., bass enhancement) in accordance with an embodiment of the present invention.

FIG. 2 is a block diagram of an embodiment of enhancement subsystem 1 of the system of FIG. 1.

Notation and Nomenclature

Throughout the present disclosure, including in the claims, the expressions "band" and "frequency band" may be used interchangeably as synonyms.

Throughout this disclosure, including in the claims, the expression "multi-band compression" of or on an audio signal (e.g., on frequency-domain data indicative of an enhanced audio signal or other audio signal, or on one or more channels of a multi-channel audio signal) denotes band-by-band compression (in at least two different frequency bands) that is limited in the sense that it does not increase the level of the signal in any frequency band. In each frequency band, multi-band compression reduces the level of the signal (or does not change it by a significant amount). Multi-band compression is sometimes referred to herein as "regulation," and compressors that perform or are configured to perform multi-band compression are sometimes referred to herein as "regulators."

Throughout this disclosure, including in the claims, the expression "enhancement" (or "audio enhancement") of or on an audio signal (e.g., on frequency-domain data indicative of the audio signal, or on one or more channels of a multi-channel audio signal) denotes any enhancement operation performed on the signal. For example, the enhancement may be an enhancement operation performed on the signal band by band (in at least two different frequency bands of the signal). Examples of audio enhancement include, but are not limited to, bass enhancement (e.g., equalized bass enhancement or psychoacoustic bass enhancement), dialog enhancement, upmixing, frequency shifting, harmonic injection or transposition, sub-harmonic injection, virtualization, and equalization.

Throughout this disclosure, including in the claims, the expression "performing an operation on" a signal or data (e.g., filtering, scaling, transforming, or applying gain to the signal or data) is used in a broad sense to denote performing the operation directly on the signal or data, or on a processed version of the signal or data (e.g., on a version of the signal that has undergone preliminary filtering or preprocessing before the operation is performed on it).

Throughout the present disclosure, including in the claims, the expression "system" is used in a broad sense to denote a device, system, or subsystem. For example, a subsystem implementing a decoder may be referred to as a decoder system, and a system including such a subsystem (e.g., a system that generates X output signals in response to multiple inputs, where the subsystem generates M of the inputs and the other X - M inputs are received from an external source) may also be referred to as a decoder system.

Throughout this disclosure, including in the claims, the term "processor" is used in a broad sense to represent a system or device that is programmable or configurable (e.g., with software or firmware) to perform operations on data (e.g., audio or video or other image data). Examples of processors include field programmable gate arrays (or other configurable integrated circuits or chipsets), digital signal processors programmed and/or configured to perform pipelined processing of audio or other sound data, programmable general purpose processors or computers, and programmable microprocessor chips or chipsets.

Throughout the present disclosure, including in the claims, the terms "couples" and "coupled" are used to mean either a direct connection or an indirect connection. Thus, if a first device couples to a second device, that connection may be a direct connection, or an indirect connection via other devices and connections.

Detailed Description

Many embodiments of the invention are technically possible. How these embodiments are implemented will be apparent to those of skill in the art in light of this disclosure. Embodiments of the present systems, methods, and media will be described with reference to FIGS. 1 and 2.

FIG. 1 is a block diagram of a system (9) configured to perform audio enhancement according to an embodiment of the present invention. In the system of FIG. 1, enhancement subsystem 1 is coupled and configured to perform audio enhancement on an input audio signal, producing an enhanced audio signal. Regulator 3 (sometimes referred to as multi-band compression subsystem 3) is coupled to enhancement subsystem 1 and is configured to perform multi-band compression on the enhanced audio signal, producing an output audio signal, which is a compressed enhanced audio signal. In operation, subsystem 3 applies compression to the enhanced audio signal on a band-by-band basis (i.e., it reduces or leaves unchanged the level of each band of the enhanced audio signal at each time in a series of times) with the aim of preventing distortion when the compressed enhanced audio signal output from subsystem 3 is played. Subsystem 3 is also configured to generate a compression signal indicative of the amount of compression (level attenuation) that subsystem 3 applies to each of at least one frequency band of the enhanced audio signal (e.g., to each of at least two individual frequency bands, or to each of an entire set of individual frequency bands), and to provide this compression signal as feedback to enhancement subsystem 1. Thus, the compression signal is a feedback signal indicative of the amount of compression applied by regulator 3 to at least one frequency band (or each of at least two frequency bands) of the enhanced audio signal.
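The behavior of regulator 3 for one instant in time can be sketched as follows in Python. This is a deliberately simple per-band limiter written for illustration: the function name `regulate`, the dB representation of band levels and the fixed per-band thresholds are assumptions, and timbre preservation across adjacent bands is not modeled.

```python
def regulate(band_levels_db, band_thresholds_db):
    """Limit each band to its threshold.

    Returns (compressed_levels_db, attenuation_db), where attenuation_db is
    the per-band compression feedback supplied to the enhancement stage.
    """
    compressed, feedback = [], []
    for level, threshold in zip(band_levels_db, band_thresholds_db):
        # Attenuation is never negative, so no band is ever boosted.
        attenuation = max(0.0, level - threshold)
        compressed.append(level - attenuation)
        feedback.append(attenuation)
    return compressed, feedback


compressed, feedback = regulate([-3.0, -20.0, -12.0], [-9.0, -9.0, -9.0])
```

In this example only the first band exceeds its assumed -9 dB threshold, so it is attenuated by 6 dB and the feedback reports 6 dB for that band and 0 dB for the others.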

Enhancement subsystem 1 is configured to perform the audio enhancement on the input audio signal in response to the compression signal (a feedback signal indicative of an amount of compression applied to at least one frequency band, e.g., each of at least two individual frequency bands, of the enhanced audio signal).

The system of FIG. 1 also includes rendering subsystem 5 (coupled to regulator 3) and speaker 7 (coupled to rendering subsystem 5). In operation, the compressed enhanced audio signal output from regulator 3 is provided to rendering subsystem 5, and subsystem 5 (with speaker 7) performs playback of the audio content of the compressed enhanced audio signal. Subsystem 5 is configured to generate a speaker feed in response to the compressed enhanced audio signal. The speaker feed is provided to speaker 7, and speaker 7 is configured to emit sound in response to the speaker feed. In general, the compression performed by regulator 3 prevents distortion of the sound.

Thus, subsystem 5 is configured to render the audio content indicated by the compressed enhanced audio signal by converting the content into a speaker feed, and subsystem 5 together with speaker 7 is configured to play this audio content by converting the content into a speaker feed and converting the speaker feed into sound.

System 9 of FIG. 1 may be a processor programmed (or otherwise configured) to perform an embodiment of the present enhancement method, in which elements 1 and 3 (and optionally also elements 5 and 7) of FIG. 1 are implemented as subsystems (e.g., stages) of the processor. In another example, system 9 of FIG. 1 is a playback device configured to perform an embodiment of the present enhancement method, in which elements 1, 3, 5, and 7 of FIG. 1 are implemented as subsystems (e.g., stages) of the playback device.

In some embodiments, the inventive system (e.g., enhancement subsystem 1 of FIG. 1) is configured to perform one or both of psychoacoustic bass enhancement (e.g., harmonic transposition) and equalized bass enhancement on an input audio signal to generate a bass-enhanced audio signal, and the bass enhancement is performed in a manner controlled by feedback indicative of an amount of compression applied by a regulator (e.g., regulator 3) to each of at least two frequency bands of the bass-enhanced audio signal (e.g., to selectively apply one or the other or both of the two types of bass enhancement to the input audio signal). FIG. 2 is a block diagram of an example of such an embodiment of enhancement subsystem 1 of the system of FIG. 1.

In FIG. 2, transform subsystem 6 is configured to perform a time-domain to frequency-domain transform (e.g., an FFT) on an input audio signal to generate banded input audio indicative of the audio content of the input audio signal, such that the banded input audio includes a series of frequency components for each different frequency band in a set of frequency bands. Thus, in the embodiment of FIG. 2 (as in some other embodiments of the invention), the input data to be enhanced is banded frequency-domain audio data indicative of the audio content of the input audio signal. A psychoacoustic bass enhancement ("PBE") subsystem 8 is coupled and configured to perform psychoacoustic bass enhancement on the banded input audio, typically including by boosting content (increasing its level) in frequency bands other than the lowest frequency bands. An equalized bass enhancement subsystem ("equalizer") 10 is coupled and configured to perform equalized bass enhancement on the banded input audio (typically to boost content in the low bands). Combining subsystem 12 is coupled to receive the audio output of each of subsystems 8 and 10 and the compression feedback signal ("compression feedback") generated by regulator 3 of FIG. 1, and is configured to generate banded enhanced audio in response to the compression feedback signal. The banded enhanced audio is the output of the FIG. 2 embodiment of enhancement subsystem 1 and is provided to regulator 3 of FIG. 1. The compression feedback indicates the amount of compression applied by regulator 3 to each of at least two frequency bands of the banded enhanced audio. Typically, the frequency bands of the banded enhanced audio (output from combining subsystem 12) are the same as the frequency bands to which compression is applied by regulator 3 (and the same as those of the banded audio output from transform subsystem 6), and the compression feedback indicates the amount of compression applied by regulator 3 to each of these frequency bands.
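The banding performed by the transform stage can be sketched in Python as below. This is an assumption-laden illustration: a naive discrete Fourier transform from the standard library stands in for an FFT, and the function names, the band edges and the 800 Hz sample rate are hypothetical.

```python
import cmath
import math


def dft(block):
    """Naive DFT of a real block; returns bins 0 .. n/2 (a stand-in for an FFT)."""
    n = len(block)
    return [
        sum(x * cmath.exp(-2j * cmath.pi * k * i / n) for i, x in enumerate(block))
        for k in range(n // 2 + 1)
    ]


def band_components(block, sample_rate, band_edges_hz):
    """Group the DFT bins of one block into bands [edge_i, edge_{i+1})."""
    n = len(block)
    bands = [[] for _ in range(len(band_edges_hz) - 1)]
    for k, value in enumerate(dft(block)):
        freq = k * sample_rate / n  # center frequency of bin k
        for b in range(len(bands)):
            if band_edges_hz[b] <= freq < band_edges_hz[b + 1]:
                bands[b].append(value)
                break
    return bands


# One 16-sample block of a 100 Hz sine sampled at 800 Hz (bin spacing 50 Hz).
block = [math.sin(2 * math.pi * 100 * i / 800) for i in range(16)]
bands = band_components(block, sample_rate=800, band_edges_hz=[0, 150, 401])
band_energy = [sum(abs(v) ** 2 for v in band) for band in bands]
```

With these example edges, the low band collects the bins at 0, 50 and 100 Hz and holds essentially all of the block's energy, while the upper band is close to empty. Downstream stages would then operate on the per-band components rather than on time-domain samples.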

Combining subsystem 12 is configured to time-division multiplex or combine the output of subsystem 8 with the output of subsystem 10 to produce the banded enhanced audio as (at any time): the output of one or the other of subsystems 8 and 10 at that time; or a combination (e.g., a linear combination) of the frequency components output from subsystems 8 and 10 at that time.

Combining subsystem 12 is typically configured to generate the banded enhanced audio as a series of banded enhanced audio values, where the values corresponding to each time (or time interval) comprise a value for each of a number of different frequency bands, such that each value for one time (or time interval) and one frequency band is:

a combination (e.g., a linear combination) of the frequency components output from subsystems 8 and 10 for that time (or time interval) and frequency band (e.g., in response to some values of the compression feedback for the corresponding time or interval), or

the frequency component output from one or the other of subsystems 8 and 10 for that time (or time interval) and frequency band (e.g., in response to some other values of the compression feedback for the corresponding time or interval).

For example, when the compression feedback indicates that regulator 3 is not applying compression in any frequency band, the output of subsystem 12 for each frequency band (at the corresponding time or time interval) may be the frequency components output from subsystem 10. If the compression feedback (corresponding to a later time or interval) instead indicates that regulator 3 is applying compression in each frequency band (to prevent distortion), the output of subsystem 12 for each frequency band (at the corresponding time or interval) may be the frequency components output from subsystem 8.

For another example, when the compression feedback indicates that regulator 3 is not applying compression (or is applying a small amount of attenuation) in a frequency band, the output of subsystem 12 for that frequency band (at the corresponding time or time interval) may be a first linear combination of the frequency components output from subsystem 8 and the frequency components output from subsystem 10 (e.g., aX + bY, where a and b are factors, X is the frequency component output from subsystem 8, and Y is the frequency component output from subsystem 10). If the compression feedback (corresponding to a later time or time interval) indicates that regulator 3 is applying compression (or a greater amount of attenuation) in that frequency band, the output of subsystem 12 for that frequency band (at the corresponding time or interval) may be a second linear combination (different from the first) of the frequency components output from subsystem 8 and the frequency components output from subsystem 10 (e.g., cX + dY, where c is a factor different from a, d is a factor different from b, X is the frequency component output from subsystem 8, and Y is the frequency component output from subsystem 10).
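One way to obtain feedback-dependent factors for such a linear combination is a crossfade, sketched below in Python. The text only requires that the factors differ between feedback values; the constraint a + b = 1, the linear mapping from attenuation to a, the 12 dB full-switch point and the function names are all assumptions made for this example.

```python
def mix_coefficients(attenuation_db, max_attenuation_db=12.0):
    """Return (a, b): weights for the PBE output X and the equalizer output Y.

    More attenuation reported by the regulator shifts weight toward X.
    """
    a = max(0.0, min(1.0, attenuation_db / max_attenuation_db))
    return a, 1.0 - a


def combine_band(x_pbe, y_eq, attenuation_db):
    """Compute aX + bY for one frequency component of one band."""
    a, b = mix_coefficients(attenuation_db)
    return a * x_pbe + b * y_eq


no_compression = combine_band(2.0, 4.0, attenuation_db=0.0)
some_compression = combine_band(2.0, 4.0, attenuation_db=6.0)
heavy_compression = combine_band(2.0, 4.0, attenuation_db=12.0)
```

With no attenuation the output equals the equalizer component Y, with 12 dB or more it equals the PBE component X, and in between it is a blend. The same function also works on complex frequency-domain components, since it uses only multiplication and addition.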

Alternatively (or additionally), compression feedback is provided to subsystem 8 and/or subsystem 10 (indicated by dashed lines in fig. 2) to control the manner in which subsystem 8 and/or subsystem 10 performs bass enhancement. For example, operation of one or both of subsystems 8 and 10 (during a time interval) may be enabled or disabled in response to compression feedback, and/or the manner in which one or both of subsystems 8 and 10 perform enhancement on the banded input audio may be controlled by the compression feedback.

For example, the PBE subsystem 8 may perform harmonic transposition using even harmonics in response to some values of the compression feedback and/or perform harmonic transposition using odd harmonics in response to some other values of the compression feedback. In typical operation of the fig. 2 system, bass enhancement by subsystem 8 and/or subsystem 10 is controlled (by compression feedback) to prevent the bass enhancement from causing distortion (when played) in any particular frequency band (in view of the amount of compression applied in that frequency band by regulator 3), and/or to prevent regulator 3 from applying compression in one or more particular frequency bands. In the case where the regulator 3 applies a large amount of attenuation in a frequency band, it may be desirable for the subsystem 8 (or subsystem 10) to perform less processing (or a different type of processing) in that frequency band to prevent distortion.

For example, in situations where the feedback indicates that the regulator 3 applies a large amount of attenuation (e.g., attenuation above a predetermined threshold amount) in a frequency band, such as where too much emphasis (by the subsystem 10) with high compression (by the regulator 3) may result in distortion, emphasis (by the subsystem 10) in that frequency band may be reduced. In some embodiments, the amount or degree of processing by one of subsystems 8 or 10 is determined in response to the amount or degree of processing by the other of subsystems 8 or 10 (which is in turn determined by compression feedback), e.g., to keep the total amount or degree of processing by both subsystems 8 and 10 constant or at a desired amount.
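A minimal sketch of the threshold-based reduction and the constant-total constraint described above follows. The function name, the 6 dB threshold, the linear slope, and the normalized processing "budget" are hypothetical; the text only states that emphasis may be reduced when attenuation exceeds a predetermined threshold and that the processing of one subsystem may be set in response to that of the other.

```python
def allocate_processing(attenuation_db, threshold_db=6.0, total=1.0, slope_per_db=0.1):
    """Split an assumed processing budget between EQ emphasis and PBE in one band.

    attenuation_db is the attenuation the regulator reports for the band.
    Returns (eq_amount, pbe_amount), which always sum to `total`.
    """
    # Attenuation in excess of the predetermined threshold.
    excess = max(attenuation_db - threshold_db, 0.0)
    # Reduce the emphasis of subsystem 10 linearly with the excess (assumed law).
    eq_amount = max(total - slope_per_db * excess, 0.0)
    # Assign the remainder to subsystem 8 so that the total stays constant.
    pbe_amount = total - eq_amount
    return eq_amount, pbe_amount


for att in (3.0, 11.0, 30.0):
    print(att, allocate_processing(att))
```

Below the threshold the band is handled entirely by equalized emphasis; as the regulator attenuates more, the share shifts toward psychoacoustic enhancement.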

Unless the PBE subsystem 8 operates in response to compression feedback indicating the compression applied by the regulator 3, the regulator 3 typically cannot provide reliable speaker distortion protection because psychoacoustic bass enhancement is clearly non-linear.

In the exemplary embodiment of the system 9 of fig. 1, a full bandwidth compression feedback signal (indicative of the compression applied by the regulator 3 in each frequency band of the full set of frequency bands) is fed into the enhancement subsystem 1 (sometimes referred to as the enhancement layer). In response, the enhancement subsystem 1 generates an enhanced audio signal that is fed into the regulator 3, and the regulator 3 and rendering subsystem 5 operate to generate undistorted speaker feeds. The compression feedback signal may have some gating to ensure that there are no undesirable fluctuations in the signal output from the regulator 3.

Typical causes of changes in the amount of limiting (attenuation) applied by the regulator 3 (in at least one frequency band) include changes in the playback volume due to user control, or changes in the level of the audio signal provided to or produced by the inventive system. It is very important to place the regulator in series after the enhancement layer (so that the regulator operates on the output of the enhancement layer) to ensure that the loudspeaker is not fed a signal that would cause it to distort.

Exemplary embodiments of the invention are based, in part, on the inventors' recognition that:

equalization strategies for bass enhancement fail at high system volumes due to protection mechanisms (e.g., multi-band compression) associated with speaker distortion, and

Conventional configurations of equalized bass enhancement and psychoacoustic bass enhancement algorithms do not depend on feedback indicating compression applied to the bass enhanced signal, and

Although sometimes no psychoacoustic bass enhancement is employed at all in systems employing equalized bass enhancement, it will often be desirable to conditionally employ psychoacoustic bass enhancement to supplement the equalized bass enhancement (e.g., under conditions of high energy/high compression levels in the low-band of the bass-enhanced audio signal).

In some embodiments, the present system (e.g., enhancement subsystem 1 of fig. 1) is configured to perform enhancement (other than bass enhancement) using compression feedback to generate an enhanced audio signal (in response to an input audio signal). Some types of such enhancement (e.g., as performed by embodiments of the enhancement subsystem 1) that may be controlled using compression feedback in some embodiments of the present invention are described next. Examples include:

1. dialog enhancement

When performing dialog enhancement (e.g., by operating an embodiment of the enhancement subsystem 1), the level of a dialog enhancement signal (e.g., generated by an embodiment of the enhancement subsystem 1) may be reduced in response to compression feedback from a regulator (e.g., the regulator 3) to limit the maximum level of the dialog enhanced audio signal asserted to the regulator (in one or more particular frequency bands), such that this maximum level is low enough to prevent the regulator from compressing (limiting) audio in these frequency bands. If the level of the dialog enhancement signal is not so reduced when the regulator limits the dialog enhanced audio signal (in at least one frequency band), dialog enhancement will often make the dialog (indicated by the compressed dialog enhanced audio signal output from the regulator) harder rather than easier to understand.

In some alternative embodiments, the shape of the dialog enhancement curve (used to perform dialog enhancement) may be changed in response to the compression feedback to reduce the gain of the dialog enhanced audio signal in each frequency band (outside the typical speech frequency range, i.e., 300-. For example, when the enhancement subsystem 1 is configured to perform dialog enhancement, the gain of the output of the subsystem 1 in each frequency band within the speech frequency range will typically not be reduced in response to the compression feedback (although in some cases the gain of the output of the subsystem 1 in each frequency band outside the speech frequency range will be reduced). This may be done to ensure that the tone preservation mode of the regulator (e.g., regulator 3) does not cause the compressed dialog-enhanced audio signal (output from the regulator) to have dialog that is too quiet, while still ensuring that an increase in user-controlled volume produces an increase in dialog volume;

2. upmixing

When performing upmixing (e.g., by operating an embodiment of the enhancement subsystem 1), when compression feedback from a regulator (e.g., the regulator 3) indicates that the regulator is limiting a relevant frequency band of the upmixed audio signal (i.e., a relevant frequency band of at least one channel of multi-channel upmixed audio), the amount of diffuse content (e.g., produced by an embodiment of the enhancement subsystem 1) may be reduced (without touching the direct content) in response to the compression feedback, to reduce the amount of energy of the upmixed audio signal fed into the regulator. Alternatively, the upmixing may be disabled in response to the compression feedback (such that no upmixing is performed at all) for a particular time interval during which the compression feedback indicates that it should be disabled;

3. volume leveling, modeling, or automatic gain control (e.g., implemented with Dolby Volume)

When performing volume modeling (e.g., by operating an embodiment of the enhancement subsystem 1), the volume modeler may analyze incoming audio, group similar frequencies into critical frequency bands, and apply the appropriate amount of gain to each frequency band in a manner controlled by compression feedback from a regulator (e.g., regulator 3) that compresses the output of the volume modeler. In response to the compression feedback, the volume modeler may adjust the frequency response for different playback levels (relative to an assumed reference level, typically about 85 decibels) to compensate for the way in which humans perceive audio at different playback levels. Thus, volume modeling ensures that the user always hears the correct tonal balance, whether at a high or low playback level.

When performing volume leveling (e.g., by operating an embodiment of the enhancement subsystem 1), the volume leveler may operate in a manner controlled by the compression feedback from a regulator (e.g., regulator 3) that compresses the output of the volume leveler. Regardless of source selection and content, the volume leveler may control the playback level of the input audio to maintain a consistent playback level.

In some example implementations of the enhancement subsystem 1, the enhancement subsystem may be controlled in response to compression feedback in any of the following ways:

a target reference level of the volume leveler or a reference level of the volume modeler (implemented by subsystem 1) may be adjusted in response to the compression feedback to ensure that subsystem 1 does not drive (e.g., continuously drive) regulator 3 to cause the regulator to compress audio in one or more specific frequency bands; or

The gain swing of the Automatic Gain Control (AGC) implemented by the subsystem 1 may be adjusted in response to the compression feedback to limit the maximum level of the output of the subsystem 1 (in one or more specific frequency bands) to be low enough to prevent the regulator 3 from compressing audio in these frequency bands;

4. frequency shift block

To increase speech intelligibility (e.g., of audio captured during a conference call), the enhancement subsystem 1 may be implemented as a frequency shift block. When operating this embodiment of the enhancement subsystem 1, the frequency shift block may operate in a manner controlled by the compression feedback from the regulator (e.g., regulator 3) that compresses the output of the frequency shift block. Typically, when the user increases the volume and the regulator begins to limit the frequency bands in the range of typical speech, the frequency shift block will shift all frequencies in a direction that will increase the perceived volume, taking into account the capabilities of the playback device (and optionally the noise level of the surrounding environment);

5. harmonic injection

When the regulator is limiting the signal in a frequency band, compression feedback from the regulator may be provided to an embodiment of the enhancement subsystem 1. The enhancement subsystem may operate in response to the compression feedback to inject harmonic psychoacoustic frequencies into the audio input signal (e.g., to provide virtual bass) and thereby generate an enhanced signal that is asserted to the input of the regulator. It should be noted that the harmonic injection in this case is not limited to conventional bass frequencies. The harmonic injection may be performed at all frequencies (at fundamental frequencies up to 12 kHz; above that, the second harmonic lies beyond the range of human hearing);

6. sub-harmonic injection

When the regulator limits the signal in the higher frequency bands, compression feedback from the regulator may be provided to an embodiment of the enhancement subsystem 1. The enhancement subsystem may operate in response to the compression feedback to generate a subharmonic (having a frequency equal to (fundamental)/(n), where n is an integer) and insert the subharmonic into the audio input signal, generating an enhanced signal that is asserted to the input of the regulator. This has the advantage that it is effective even up to 24 kHz, and it allows the perceived volume to increase when the user raises the volume control;

7. virtualization

When virtualization is performed (e.g., by operating an embodiment of the enhancement subsystem 1), the virtualizer may operate in a manner controlled by compression feedback from a regulator (e.g., regulator 3) that compresses the output of the virtualizer. Virtualizers typically cause volume changes that may cause the regulator to limit certain frequency bands. In some cases, this will cause the spatial audio to collapse unless the operation of the virtualizer is controlled (in accordance with embodiments of the present invention) by compression feedback.

In one example of such virtualization, the virtualizer does not apply the height filter when the regulator is limiting the relevant frequency band (as indicated by the compression feedback), but instead renders the audio in the listener plane. In another example of such virtualization, when the regulator is limiting the relevant frequency band (as indicated by the compression feedback), the virtualizer reduces the amount of reverberation ("wet" component) within the signal and keeps only the direct feed ("dry" component); or

8. Equalization

When performing equalization (e.g., by operating an embodiment of the enhancement subsystem 1), the equalizer may operate in a manner controlled by compression feedback from a regulator (e.g., regulator 3) that compresses the output of the equalizer. An equalizer preset may cause the regulator to start limiting a particular frequency band. The equalizer may decide (in response to the compression feedback) to change to another preset to avoid the limiting (as indicated by the compression feedback) applied by the regulator.
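Several of the enhancement types listed above share one pattern: when the regulator limits a relevant band, a secondary component is scaled back while the primary component is left untouched (diffuse versus direct content in upmixing, "wet" versus "dry" content in virtualization). The sketch below illustrates that pattern only. The function names, the choice of the most-limited relevant band as the control value, and the 6 dB full-off point are assumptions, not details from the text.

```python
def wet_gain(band_reduction_db, relevant_bands, full_off_db=6.0):
    """Map compression feedback to a gain in [0, 1] for the wet/diffuse component.

    band_reduction_db: per-band attenuation reported by the regulator, in dB.
    relevant_bands: indices of the bands that matter for this enhancement.
    full_off_db: assumed attenuation at which the wet component is removed.
    """
    # Use the most heavily limited relevant band as the control value.
    worst = max(band_reduction_db[i] for i in relevant_bands)
    worst = min(max(worst, 0.0), full_off_db)
    return 1.0 - worst / full_off_db


def mix(dry, wet, band_reduction_db, relevant_bands):
    """Keep the dry/direct samples unchanged and scale only the wet/diffuse samples."""
    g = wet_gain(band_reduction_db, relevant_bands)
    return [d + g * w for d, w in zip(dry, wet)]


# The regulator is not limiting: full wet contribution.
print(mix([1.0, 1.0], [0.5, 0.5], [0.0, 0.0], [0, 1]))
# The regulator is limiting band 0 by 9 dB: only the dry signal remains.
print(mix([1.0, 1.0], [0.5, 0.5], [9.0, 0.0], [0]))
```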

In some embodiments, the present system (e.g., enhancement subsystem 1 of fig. 1) is configured to perform enhancement in response to compression feedback indicating that compression is applied in only one frequency band (e.g., the compression feedback indicates that no compression (zero amount) is applied in any other frequency band). For example, if the regulator 3 is a multiband limiter with bands from 1 Hz to 1000 Hz and from 1000 Hz to 20000 Hz, and if the content consists of a 500 Hz sinusoid that distorts the speaker, the regulator will not apply compression to the high band (1000 Hz to 20000 Hz), and the compression feedback will indicate this.
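The two-band example above can be worked through with a toy model. The band edges come from the text; everything else is assumed for illustration: an idealized filterbank that places all of a pure tone's energy in exactly one band, a hypothetical limiter threshold of -6 dB, and feedback expressed as dB of gain reduction per band.

```python
# Band edges in Hz, taken from the example in the text.
BANDS_HZ = [(1.0, 1000.0), (1000.0, 20000.0)]


def compression_feedback(tone_hz, tone_level_db, threshold_db=-6.0):
    """Return the per-band gain reduction (dB) of an idealized two-band limiter.

    threshold_db is a hypothetical limiter threshold applied to both bands.
    """
    feedback = []
    for lo, hi in BANDS_HZ:
        # Idealization: a pure tone contributes energy only to the band containing it.
        level = tone_level_db if lo <= tone_hz < hi else float("-inf")
        # The limiter attenuates by however much the band level exceeds the threshold.
        feedback.append(max(level - threshold_db, 0.0))
    return feedback


# A 500 Hz sinusoid at 0 dB: compression in the low band only.
print(compression_feedback(500.0, 0.0))
```

Under these assumptions the feedback reports a non-zero amount for the low band and zero for the high band, which is the "compression applied in only one frequency band" case.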

In some embodiments, the present system (e.g., enhancement subsystem 1 of fig. 1) is configured to perform enhancement (in response to compression feedback) in the time domain. For example, the enhancement may apply a parametric filter (which may be implemented as a time-domain biquad filter). These parametric filters may be used to implement equalized bass enhancement. As another example, the enhancement may apply a parametric low pass filter that adjusts its knee point based on compression feedback.
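As a rough time-domain illustration, the sketch below drives the gain of a low-shelf biquad from the compression feedback. The low-shelf coefficient formulas follow the widely used Audio EQ Cookbook by Robert Bristow-Johnson (shelf slope S = 1). The choice of a low shelf, the mapping from feedback to boost, and all names and default values are assumptions rather than details from the text.

```python
import math


def low_shelf_coeffs(gain_db, f0_hz, fs_hz):
    """Low-shelf biquad coefficients (Audio EQ Cookbook, S = 1), normalized so a[0] = 1."""
    A = 10.0 ** (gain_db / 40.0)
    w0 = 2.0 * math.pi * f0_hz / fs_hz
    cw = math.cos(w0)
    alpha = math.sin(w0) / 2.0 * math.sqrt(2.0)
    k = 2.0 * math.sqrt(A) * alpha
    b0 = A * ((A + 1) - (A - 1) * cw + k)
    b1 = 2 * A * ((A - 1) - (A + 1) * cw)
    b2 = A * ((A + 1) - (A - 1) * cw - k)
    a0 = (A + 1) + (A - 1) * cw + k
    a1 = -2 * ((A - 1) + (A + 1) * cw)
    a2 = (A + 1) + (A - 1) * cw - k
    return [b0 / a0, b1 / a0, b2 / a0], [1.0, a1 / a0, a2 / a0]


def boost_from_feedback(reduction_db, max_boost_db=6.0):
    """Assumed mapping: back the bass boost off by the reported attenuation."""
    return max(max_boost_db - reduction_db, 0.0)


def biquad(x, b, a):
    """Filter the sample list x with a direct-form I biquad in the time domain."""
    y = []
    x1 = x2 = y1 = y2 = 0.0
    for xn in x:
        yn = b[0] * xn + b[1] * x1 + b[2] * x2 - a[1] * y1 - a[2] * y2
        x2, x1 = x1, xn
        y2, y1 = y1, yn
        y.append(yn)
    return y


# With 2 dB of reported attenuation, the assumed mapping leaves a 4 dB shelf boost.
b, a = low_shelf_coeffs(boost_from_feedback(2.0), f0_hz=100.0, fs_hz=48000.0)
out = biquad([1.0, 0.0, 0.0, 0.0], b, a)
```

In a running system the coefficients would be recomputed whenever the feedback changes, ideally with smoothing to avoid audible steps. The same structure would apply to a low-pass filter whose knee point is the feedback-controlled parameter.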

Enumerated example embodiments (EEEs) of the invention include the following:

EEE1. A method for audio signal compression and enhancement, comprising:

performing enhancement on an input audio signal to produce an enhanced audio signal; and

performing multi-band compression on the enhanced audio signal to generate a compressed enhanced audio signal, wherein the enhancement is performed in response to compression feedback indicating an amount of compression applied to each of at least two frequency bands of the enhanced audio signal.

EEE2. The method of EEE1, wherein the enhancement is a bass enhancement, the bass enhancement is or includes at least one of a psychoacoustic bass enhancement or an equalized bass enhancement, and the enhanced audio signal is a bass enhanced audio signal.

EEE3. The method of EEE2, wherein the bass enhancement is performed in response to the compression feedback, including by selectively applying one or the other or both of the psychoacoustic bass enhancement or the equalized bass enhancement to the input audio signal in a manner controlled by the compression feedback.

EEE4. The method of EEE1, wherein the enhancement is or includes at least one of: dialog enhancement, upmixing, frequency shifting, harmonic injection, harmonic transposition, subharmonic injection, virtualization, equalization, volume modeling, volume leveling, or automatic gain control.

EEE5. The method according to EEE1, wherein the multiband compression is performed on the enhanced audio signal with the aim of preventing distortion when playing the compressed enhanced audio signal.

EEE6. The method according to EEE1, wherein the input audio signal is indicative of audio content, the method comprising the steps of:

EEE7. The method according to EEE1, wherein the enhancement is or comprises automatic gain control.

EEE8. A system, comprising:

an enhancement subsystem coupled and configured to perform enhancement on an input audio signal to generate an enhanced audio signal; and

EEE9. The system of EEE8, wherein the enhancement is a bass enhancement, the bass enhancement is or includes at least one of a psychoacoustic bass enhancement or an equalized bass enhancement, and the enhanced audio signal is a bass enhanced audio signal.

EEE10. The system of EEE9, wherein the enhancement subsystem is configured to perform the bass enhancement in response to the compression feedback, including by selectively applying one or the other or both of the psychoacoustic bass enhancement or the equalized bass enhancement to the input audio signal in a manner controlled by the compression feedback.

EEE11. The system according to EEE8, wherein the enhancement is or includes at least one of: dialog enhancement, upmixing, frequency shifting, harmonic injection, harmonic transposition, subharmonic injection, virtualization, equalization, volume modeling, volume leveling, or automatic gain control.

EEE12. The system according to EEE8, wherein the enhancement is or comprises automatic gain control.

EEE13. The system according to EEE8, wherein the system is an audio playback system.

EEE14. The system according to EEE8, wherein the system is a processor programmed to implement the enhancement subsystem and the multiband compressor.

EEE15. The system according to EEE14, wherein the system is a digital signal processor configured to implement the enhancement subsystem and the multiband compressor.

In some embodiments, the present invention is a system or device configured to perform any embodiment of the inventive method on an input audio signal (e.g., a playback device or other device, such as a notebook computer, tablet computer, mobile phone, or other device having at least one small speaker, with physically or otherwise limited bass reproduction capability). By way of example, the system 9 of fig. 1 may be: a playback device that includes all of elements 1, 3, 5, and 7 of FIG. 1 (such that the device implements all of these elements); or an audio processor, which includes (in the sense that it implements) all of elements 1, 3 and 5 of fig. 1.

In a class of embodiments, the present invention is an audio playback system (e.g., system 9 implemented as a notebook computer, tablet computer, mobile phone, or other device having a small speaker, or a playback system with limited (e.g., physically limited) bass reproduction capability) and is configured to perform audio enhancement (e.g., bass enhancement) on audio in response to compression feedback (in accordance with any embodiment of the inventive method) to generate enhanced audio and play the enhanced audio.

In typical embodiments, the present system is or includes a general-purpose processor or a special-purpose processor (e.g., an implementation of elements 1, 3, and 5 of system 9 of FIG. 1, or an implementation of element 1 of FIG. 1 or FIG. 2) programmed with software (or firmware) and/or otherwise configured to perform an embodiment of the present method. In some embodiments, the present system is a general purpose processor coupled to receive input audio data and programmed (with appropriate software) to generate output audio data in response to the input audio data by performing an embodiment of the present method. In some embodiments, the present system is a digital signal processor (e.g., an implementation of elements 1, 3, and 5 of system 9 of fig. 1, or an implementation of element 1 of fig. 1 or 2) coupled to receive input audio data and configured (e.g., programmed) to generate output audio data in response to the input audio data by performing an embodiment of the present method.

Although specific embodiments of, and applications for, the present invention have been described herein, it will be apparent to those skilled in the art that many changes can be made to the embodiments and applications described herein without departing from the scope of the invention described and claimed herein. It is to be understood that while certain forms of the invention have been illustrated and described, the invention is not to be limited to the specific embodiments described and illustrated or to the specific methods described.
