Reverberation gain normalization


Reading note: This technology, "Reverberation gain normalization", was designed and created by R. S. Audfray, J.-M. Jot, and S. C. Dicker on 2019-06-14. Abstract: Systems and methods for providing accurate and independently controlled reverberation characteristics are disclosed. In some embodiments, the system may include a reverberation processing system, a direct processing system, and a combiner. The reverberation processing system may include a reverberation initial power (RIP) control system and a reverberator. The RIP control system may include a reverberation initial gain (RIG) and a RIP corrector. The RIG may be configured to apply a RIG value to the input signal and the RIP corrector may be configured to apply a RIP correction factor to the signal from the RIG. The reverberator may be configured to apply a reverberation effect to the signal from the RIP control system. In some embodiments, one or more values and/or correction factors may be calculated and applied such that the signals output from components in the reverberation processing system are normalized to a predetermined value (e.g., unity value (1.0)).

1. A method for rendering an audio signal, the method comprising:

receiving an input signal, the input signal comprising a first portion and a second portion;

using a reverberation processing system to:

applying a reverberation initial gain RIG value to the first portion of the input signal,

applying a reverberation initial power RIP correction factor to the first portion of the input signal, wherein the RIP correction factor is applied after the RIG value is applied, and

introducing a reverberation effect in the first portion of the input signal;

using a direct processing system to:

introducing a delay to the second portion of the input signal, and

applying a gain to the second portion of the input signal;

combining the first portion of the input signal from the reverberation processing system and the second portion of the input signal from the direct processing system; and

outputting the combined first and second portions of the input signal as an output signal, wherein the output signal is the audio signal.

2. The method of claim 1, further comprising:

calculating the RIP correction factor, wherein the RIP correction factor is calculated and applied to the first portion of the input signal by a RIP corrector,

wherein the RIP correction factor is calculated such that a signal output from the RIP corrector is normalized to 1.0.

3. The method of claim 1, wherein the RIP correction factor depends on one or more of: reverberator topology, number and duration of delay units, connection gain, and filter parameters.

4. The method of claim 1, wherein the RIP correction factor is equal to the RMS power of the reverberation impulse response.

5. The method of claim 1, wherein introducing the reverberation effect in the first portion of the input signal comprises: filtering out one or more frequencies.

6. The method of claim 1, wherein introducing the reverberation effect comprises: changing a phase of the first portion of the input signal.

7. The method of claim 1, wherein introducing the reverberation effect comprises: selecting a reverberator topology, and setting internal reverberator parameters.

8. The method of claim 1, wherein the RIG value is equal to 1.0, the method further comprising:

calculating the RIP correction factor such that the RIP of the reverberation processing system is equal to 1.0.

9. The method of claim 1, further comprising:

calculating the RIP correction factor by:

setting the reverberation time to infinity,

recording a reverberator impulse response, and

measuring the reverberation RMS amplitude,

wherein the RIP correction factor is related to the inverse of the reverberation RMS amplitude.

10. The method of claim 1, further comprising:

calculating the RIP correction factor by:

setting the reverberation time to a finite value,

recording a reverberator impulse response,

deriving a reverberation RMS amplitude decay curve, and

determining the RMS amplitude at the transmission time,

wherein the RIP correction factor is related to the inverse of the reverberation RMS amplitude.

11. The method of claim 1, wherein applying the RIG value comprises:

applying a reverberation gain RG value to said first portion of said input signal, and

applying a reverberation energy RE correction factor to the first portion of the input signal, wherein the RE correction factor is applied after applying the RG value.

12. The method of claim 11, further comprising:

calculating the RE correction factor, wherein the RE correction factor is calculated and applied to the first portion of the input signal by an RE corrector,

wherein the RE correction factor is calculated such that a signal output from the RE corrector is normalized to 1.0.

13. The method of claim 11, further comprising:

calculating the RIG value, wherein the RIG value is equal to the RG value multiplied by the RE correction factor.

14. The method of claim 1, wherein the reverberation effect is introduced after applying the RIP correction factor.

15. A system, comprising:

a wearable head device configured to provide an audio signal to a user; and

circuitry configured to render the audio signal, wherein the circuitry comprises:

a reverberation processing system comprising:

a reverberation initial gain RIG configured to apply a RIG value to a first portion of an input signal,

a reverberation initial power RIP corrector configured to apply a RIP correction factor to a signal from the RIG, and

a reverberator configured to introduce a reverberation effect in the signal from the RIP corrector;

a direct processing system, comprising:

a propagation delay configured to introduce a delay in a second portion of the input signal, and

a direct gain configured to apply a gain to the second portion of the input signal; and

a combiner configured to:

combining the first portion of the input signal from the reverberation processing system and the second portion of the input signal from the direct processing system, and

outputting the combined first and second portions of the input signal as an output signal, wherein the output signal is the audio signal.

16. The system of claim 15, wherein the reverberator includes a plurality of comb filters configured to filter out one or more frequencies in the signal from the RIP corrector.

17. The system of claim 16, wherein the reverberator includes a plurality of all-pass filters configured to change the phase of the signals from the plurality of comb filters.

18. The system of claim 15, wherein the RIG comprises a reverberation gain RG configured to apply an RG value to the first portion of the input signal.

19. The system of claim 18, wherein the RIG further comprises a reverberation energy RE corrector configured to apply an RE correction factor to a signal from the RG.

Technical Field

The present disclosure relates generally to reverberation algorithms and to reverberators that use the disclosed reverberation algorithms. More particularly, the present disclosure relates to calculating and applying a Reverberation Initial Power (RIP) correction factor in series with a reverberator. The disclosure also relates to calculating and applying a Reverberation Energy Correction (REC) factor in series with the reverberator.

Background

Virtual environments are ubiquitous in computing, and may find use in video games (where a virtual environment may represent a game world); maps (where a virtual environment may represent terrain to be navigated); simulations (where a virtual environment may simulate a real environment); digital storytelling (where virtual characters may interact with each other in a virtual environment); and many other applications. Modern computer users are often comfortable perceiving and interacting with virtual environments. However, the techniques used to present a virtual environment may limit the user's experience of it. For example, traditional displays (e.g., 2D display screens) and audio systems (e.g., fixed speakers) may not be able to present a virtual environment in a way that creates a compelling, realistic, and immersive experience.

Virtual reality ("VR"), augmented reality ("AR"), mixed reality ("MR"), and related technologies (collectively, "XR") share an ability to present, to a user of an XR system, sensory information corresponding to a virtual environment represented by data in a computer system. By combining virtual visual and audio cues with real sights and sounds, such systems can offer a uniquely heightened sense of immersion and realism. Accordingly, it may be desirable to present digital sound to a user of an XR system in such a way that the sound appears to occur naturally in the user's real environment and conforms to the user's expectations of the sound. In general, users expect that virtual sounds will take on the acoustic properties of the real environment in which they are heard. For instance, a user of an XR system in a large concert hall will expect the virtual sounds of the XR system to have large, cavernous sonic qualities; conversely, a user in a small apartment will expect the sounds to be softer, closer, and more immediate.

Digital or artificial reverberators may be used in audio and music signal processing to simulate the perceptual effects of diffuse acoustic reverberation in rooms. There may be a need for systems that provide, for each digital reverberator, precise and independent control of the reverberation loudness and the reverberation decay, e.g., for intuitive control by sound designers.

Disclosure of Invention

Systems and methods for providing accurate and independent control of reverberation characteristics are disclosed. In some embodiments, the system may include a reverberation processing system, a direct processing system, and a combiner. The reverberation processing system may include a Reverberation Initial Power (RIP) control system and a reverberator. The RIP control system may include a Reverberation Initial Gain (RIG) and a RIP corrector. The RIG may be configured to apply a RIG value to the input signal, and the RIP corrector may be configured to apply a RIP correction factor to the signal from the RIG. The reverberator may be configured to apply a reverberation effect to the signal from the RIP control system.

In some embodiments, the reverberator may include one or more comb filters to filter out one or more frequencies in the system. For example, one or more frequencies may be filtered out to mimic environmental effects. In some embodiments, the reverberator may include one or more all-pass filters. Each all-pass filter may receive a signal from a comb filter and may be configured to pass its input signal without changing its amplitude, but may change the phase of the signal.

In some embodiments, the RIG may include a Reverberation Gain (RG) configured to apply an RG value to the input signal. In some embodiments, the RIG may include a REC configured to apply an RE correction factor to the signal from the RG.

Drawings

Fig. 1 illustrates an example wearable system, in accordance with some embodiments.

Fig. 2 illustrates an example hand-held controller that may be used in conjunction with an example wearable system, in accordance with some embodiments.

Fig. 3 illustrates an example secondary unit that may be used in conjunction with an example wearable system, in accordance with some embodiments.

Fig. 4 illustrates an example functional block diagram for an example wearable system, in accordance with some embodiments.

Fig. 5A illustrates a block diagram of an example audio rendering system, in accordance with some embodiments.

FIG. 5B illustrates a flow of an example process for operating the audio rendering system of FIG. 5A, in accordance with some embodiments.

Fig. 6 shows a graph of an example reverberation RMS amplitude when the reverberation time is set to infinity according to some embodiments.

Fig. 7 shows a graph of example RMS power substantially following an exponential decay after a reverberation start time, in accordance with some embodiments.

FIG. 8 illustrates an example output signal from the reverberator of FIG. 5A in accordance with some embodiments.

FIG. 9 shows the amplitude of the impulse response for an example reverberator that includes only a comb filter, according to some examples.

FIG. 10 shows the amplitude of the impulse response for an example reverberator including an all-pass filter stage, according to an example of the present disclosure.

Fig. 11A illustrates an example reverberation processing system having a reverberator with a comb filter according to some embodiments.

Fig. 11B illustrates a flow of an example process for operating the reverberation processing system of fig. 11A according to some embodiments.

FIG. 12A illustrates an example reverberation processing system having a reverberator with multiple all-pass filters.

Fig. 12B illustrates a flow of an example process for operating the reverberation processing system of fig. 12A according to some embodiments.

Fig. 13 illustrates an impulse response of the reverberation processing system of fig. 12A according to some embodiments.

Fig. 14 illustrates signal inputs and outputs through a reverberation processing system 510 according to some embodiments.

Fig. 15 illustrates a block diagram of an example FDN including a feedback matrix, in accordance with some embodiments.

Fig. 16 illustrates a block diagram of an example FDN including multiple all-pass filters according to some embodiments.

Fig. 17A illustrates a block diagram of an example reverberation processing system including a REC according to some embodiments.

Fig. 17B illustrates a flow of an example process for operating the reverberation processing system of fig. 17A according to some embodiments.

Fig. 18A illustrates an example calculated RE over time for a virtual sound source collocated with a virtual listener, in accordance with some embodiments.

Fig. 18B illustrates an example calculated RE with an instantaneous reverberation start, in accordance with some embodiments.

Fig. 19 illustrates a flow of an example reverberation processing system according to some embodiments.

Detailed Description

In the following description of examples, reference is made to the accompanying drawings which form a part hereof, and in which is shown by way of illustration specific examples that may be practiced. It is to be understood that other examples may be used and structural changes may be made without departing from the scope of the disclosed examples.

Example wearable System

Fig. 1 shows an example wearable head device 100 configured to be worn on a user's head. The wearable head device 100 may be part of a broader wearable system that includes one or more components, such as a head device (e.g., the wearable head device 100), a handheld controller (e.g., the handheld controller 200 described below), and/or an auxiliary unit (e.g., the auxiliary unit 300 described below). In some examples, the wearable head device 100 may be used for virtual reality, augmented reality, or mixed reality systems or applications. The wearable head device 100 may include one or more displays, such as displays 110A and 110B (which may include left and right transmissive displays, and associated components for coupling light from the displays to the user's eyes, such as Orthogonal Pupil Expansion (OPE) grating sets 112A/112B and Exit Pupil Expansion (EPE) grating sets 114A/114B); left and right acoustic structures, such as speakers 120A and 120B (which may be mounted on temples 122A and 122B, respectively, and positioned adjacent to the user's left and right ears); one or more sensors, such as infrared sensors, accelerometers, GPS units, Inertial Measurement Units (IMUs) (e.g., IMU 126), and acoustic sensors (e.g., microphone 150); a quadrature coil electromagnetic receiver (e.g., receiver 127 shown mounted to the left temple arm 122A); left and right cameras oriented away from the user (e.g., depth (time-of-flight) cameras 130A and 130B); and left and right eye cameras oriented toward the user (e.g., to detect eye movement of the user) (e.g., eye cameras 128A and 128B). However, the wearable head device 100 may incorporate any suitable display technology, as well as any suitable number, type, or combination of sensors or other components, without departing from the scope of the present invention. In some examples, the wearable head device 100 may incorporate one or more microphones 150 configured to detect audio signals generated by the user's speech; such microphones may be positioned in the wearable head device adjacent to the user's mouth. In some examples, the wearable head device 100 may incorporate networking features (e.g., Wi-Fi capability) to communicate with other devices and systems, including other wearable systems. The wearable head device 100 may further include components such as a battery, a processor, memory, a storage unit, or various input devices (e.g., buttons, a touch pad); or may be coupled to a handheld controller (e.g., handheld controller 200) or an auxiliary unit (e.g., auxiliary unit 300) that includes one or more such components. In some examples, the sensors may be configured to output a set of coordinates of the head-mounted unit relative to the user's environment, and may provide input to a processor that performs a simultaneous localization and mapping (SLAM) process and/or a visual odometry algorithm. In some examples, the wearable head device 100 may be coupled to the handheld controller 200 and/or the auxiliary unit 300, as described further below.

Fig. 2 illustrates an example mobile handheld controller component 200 of an example wearable system. In some examples, the handheld controller 200 may be in wired or wireless communication with the wearable head device 100 and/or the auxiliary unit 300 described below. In some examples, the handheld controller 200 includes a handle portion 220 to be held by a user, and one or more buttons 240 disposed along a top surface 210. In some examples, the handheld controller 200 may be configured to function as an optical tracking target; for example, a sensor (e.g., a camera or other optical sensor) of the wearable head device 100 may be configured to detect a position and/or orientation of the handheld controller 200, such that the position and/or orientation of the user's hand holding the handheld controller 200 may be estimated by extension. In some examples, such as described above, the handheld controller 200 may include a processor, memory, a storage unit, a display, or one or more input devices. In some examples, the handheld controller 200 includes one or more sensors (e.g., any of the sensors or tracking components described above with respect to the wearable head device 100). In some examples, the sensors may detect a position or orientation of the handheld controller 200 relative to the wearable head device 100 or relative to another component of the wearable system. In some examples, the sensors may be located in the handle portion 220 of the handheld controller 200 and/or may be mechanically coupled to the handheld controller. The handheld controller 200 may be configured to provide one or more output signals, e.g., corresponding to a pressed state of a button 240, or a position, orientation, and/or motion of the handheld controller 200 (e.g., via an IMU). Such output signals may be used as input to a processor of the wearable head device 100, the auxiliary unit 300, or another component of the wearable system. In some examples, the handheld controller 200 may include one or more microphones to detect sounds (e.g., the user's voice, ambient sounds) and, in some cases, provide signals corresponding to the detected sounds to a processor (e.g., a processor of the wearable head device 100).

Fig. 3 illustrates an example auxiliary unit 300 of an example wearable system. In some examples, the auxiliary unit 300 may be in wired or wireless communication with the wearable head device 100 and/or the handheld controller 200. The auxiliary unit 300 may include a battery to provide energy to operate one or more components of the wearable system, such as the wearable head device 100 and/or the handheld controller 200 (including displays, sensors, acoustic structures, processors, microphones, and/or other components of the wearable head device 100 or the handheld controller 200). In some examples, the auxiliary unit 300 may include a processor, memory, a storage unit, a display, one or more input devices, and/or one or more sensors, as described above. In some examples, the auxiliary unit 300 includes a clip 310 for attaching the auxiliary unit to a user (e.g., to a belt worn by the user). An advantage of using the auxiliary unit 300 to house one or more components of the wearable system is that doing so may allow large or heavy components to be carried on the user's waist, chest, or back (which are relatively well suited to supporting large and heavy objects) rather than mounted to the user's head (e.g., if housed in the wearable head device 100) or carried by the user's hand (e.g., if housed in the handheld controller 200). This may be particularly advantageous for relatively heavy or bulky components, such as batteries.

Fig. 4 illustrates an example functional block diagram that may correspond to an example wearable system 400 (which may include the example wearable head device 100, handheld controller 200, and auxiliary unit 300 described above). In some examples, the wearable system 400 may be used for virtual reality, augmented reality, or mixed reality applications. As shown in fig. 4, the wearable system 400 may include an example handheld controller 400B, referred to herein as a "totem" (and which may correspond to the handheld controller 200 described above); the handheld controller 400B may include a six degree of freedom (6DOF) totem subsystem 404A. The wearable system 400 may also include an example wearable head device 400A (which may correspond to the wearable head device 100 described above); the wearable head device 400A includes a totem-to-helmet (headgear) 6DOF helmet subsystem 404B. In this example, the 6DOF totem subsystem 404A and the 6DOF helmet subsystem 404B collectively determine six coordinates (e.g., offsets in three translational directions and rotations about three axes) of the handheld controller 400B relative to the wearable head device 400A. The six degrees of freedom may be expressed relative to a coordinate system of the wearable head device 400A. In such a coordinate system, the three translational offsets may be expressed as X, Y, and Z offsets, as a translation matrix, or as some other representation. The rotational degrees of freedom may be expressed as a sequence of yaw, pitch, and roll rotations; as a vector; as a rotation matrix; as a quaternion; or as some other representation. In some examples, one or more depth cameras 444 (and/or one or more non-depth cameras) included in the wearable head device 400A, and/or one or more optical targets (e.g., buttons 240 of the handheld controller 200 as described above, or dedicated optical targets included in the handheld controller) may be used for 6DOF tracking. In some examples, as described above, the handheld controller 400B may include a camera, and the helmet 400A may include an optical target for optical tracking in conjunction with the camera. In some examples, the wearable head device 400A and the handheld controller 400B each include a set of three orthogonally oriented solenoids for wirelessly transmitting and receiving three distinguishable signals. By measuring the relative amplitudes of the three distinguishable signals received in each of the coils used for reception, the 6DOF of the handheld controller 400B relative to the wearable head device 400A can be determined. In some examples, the 6DOF totem subsystem 404A may include an Inertial Measurement Unit (IMU) that may be used to provide improved accuracy and/or more timely information regarding rapid movements of the handheld controller 400B.

In some examples involving augmented reality or mixed reality applications, it may be desirable to transform coordinates from a local coordinate space (e.g., a coordinate space fixed relative to the wearable head device 400A) to an inertial or environmental coordinate space. For example, such transformations may be necessary for the display of the wearable head device 400A to render a virtual object at an expected position and orientation relative to the real environment (e.g., a virtual avatar sitting on a real chair, facing forward, regardless of the position and orientation of the wearable head device 400A) rather than at a fixed position and orientation on the display (e.g., at the same position in the display of the wearable head device 400A). This may preserve the illusion that the virtual object is present in the real environment (and, for example, does not appear unnaturally repositioned in the real environment as the wearable head device 400A moves and rotates). In some examples, a compensating transformation between coordinate spaces may be determined by processing images from the depth cameras 444 (e.g., using simultaneous localization and mapping (SLAM) and/or visual odometry processes) in order to determine a transformation of the wearable head device 400A relative to an inertial or environmental coordinate system. In the example shown in fig. 4, the depth cameras 444 may be coupled to a SLAM/visual odometry module 406 and may provide images to the module 406. An implementation of the SLAM/visual odometry module 406 may include a processor configured to process the images and determine a position and orientation of the user's head, which may then be used to identify a transformation between a head coordinate space and a real coordinate space. Similarly, in some examples, an additional source of information regarding the user's head pose and position is obtained from an IMU 409 of the wearable head device 400A. Information from the IMU 409 may be integrated with information from the SLAM/visual odometry module 406 to provide improved accuracy and/or more timely information regarding rapid adjustments of the user's head pose and position.

In some examples, the depth cameras 444 may provide 3D images to a gesture tracker 411, which may be implemented in a processor of the wearable head device 400A. The gesture tracker 411 may identify a user's gestures, for example, by matching 3D images received from the depth cameras 444 to stored patterns representing the gestures. Other suitable techniques for recognizing user gestures will be apparent.

In some examples, one or more processors 416 may be configured to receive data from the helmet subsystem 404B, the IMU 409, the SLAM/visual odometry module 406, the depth cameras 444, a microphone (not shown), and/or the gesture tracker 411. The processor 416 may also send and receive control signals from the 6DOF totem system 404A. The processor 416 may be wirelessly coupled to the 6DOF totem system 404A, such as in examples where the handheld controller 400B is untethered. The processor 416 may further communicate with additional components, such as an audiovisual content memory 418, a Graphics Processing Unit (GPU) 420, and/or a Digital Signal Processor (DSP) audio spatializer 422. The DSP audio spatializer 422 may be coupled to a Head Related Transfer Function (HRTF) memory 425. The GPU 420 may include a left channel output coupled to a left source 424 of imagewise modulated light and a right channel output coupled to a right source 426 of imagewise modulated light. The GPU 420 may output stereoscopic image data to the sources of imagewise modulated light 424, 426. The DSP audio spatializer 422 may output audio to a left speaker 412 and/or a right speaker 414. The DSP audio spatializer 422 may receive input from the processor 416 indicating a direction vector from the user to a virtual sound source (which may be moved by the user, e.g., via the handheld controller 400B). Based on the direction vector, the DSP audio spatializer 422 may determine a corresponding HRTF (e.g., by accessing a stored HRTF, or by interpolating multiple HRTFs). The DSP audio spatializer 422 may then apply the determined HRTF to an audio signal, such as an audio signal corresponding to a virtual sound generated by a virtual object. By incorporating the relative position and orientation of the user with respect to the virtual sound in the mixed reality environment, that is, by presenting a virtual sound that matches the user's expectation of what that sound would be like if it were a real sound in a real environment, the believability and realism of the virtual sound may be enhanced.
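Where the spatializer is described as determining an HRTF by lookup or by interpolating multiple HRTFs, the following is a minimal, hypothetical sketch of one such interpolation scheme (nearest-direction blending over a measured HRTF set); the function names, data layout, and blending rule are illustrative assumptions, not the system's actual API:

```python
import numpy as np

def select_hrtf(direction, hrtf_directions, hrtf_filters):
    """Blend the stored HRTFs whose measurement directions are nearest a target.

    direction: unit vector from listener to virtual source.
    hrtf_directions: (M, 3) array of unit vectors for measured HRTFs.
    hrtf_filters: (M, L) array of impulse responses, one per direction.
    Returns an (L,) interpolated impulse response.
    """
    similarity = hrtf_directions @ direction       # cosine similarity per direction
    nearest = np.argsort(similarity)[-2:]          # two closest measurements
    weights = np.clip(similarity[nearest], 0.0, None)
    weights = weights / (weights.sum() + 1e-9)     # normalize blend weights
    return weights @ hrtf_filters[nearest]         # weighted sum of filters

def spatialize(signal, hrtf_ir):
    """Apply the chosen HRTF to a mono signal by convolution."""
    return np.convolve(signal, hrtf_ir)
```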

In some examples, such as shown in fig. 4, one or more of the processor 416, the GPU 420, the DSP audio spatializer 422, the HRTF memory 425, and the audiovisual content memory 418 may be included in an auxiliary unit 400C (which may correspond to the auxiliary unit 300 described above). The auxiliary unit 400C may include a battery 427 to power its components and/or to power the wearable head device 400A and/or the handheld controller 400B. Including such components in an auxiliary unit that can be mounted to the user's waist can limit the size and weight of the wearable head device 400A, which in turn can reduce fatigue of the user's head and neck.

Although fig. 4 presents elements corresponding to various components of the example wearable system 400, various other suitable arrangements of these components will be apparent to those skilled in the art. For example, elements presented in fig. 4 as associated with the auxiliary unit 400C may alternatively be associated with the wearable head device 400A or the handheld controller 400B. Furthermore, some wearable systems may forego the handheld controller 400B or the auxiliary unit 400C altogether. Such variations and modifications are to be understood as being included within the scope of the disclosed examples.

Mixed reality environment

Like all people, a user of a mixed reality system exists in a real environment; that is, a three-dimensional portion of the "real world", and all of its contents, that are perceptible by the user. For example, a user perceives a real environment using one's ordinary human senses (sight, sound, touch, taste, smell) and interacts with the real environment by moving one's own body in the real environment. Locations in a real environment can be described as coordinates in a coordinate space; for example, a coordinate can comprise latitude, longitude, and elevation with respect to sea level; distances in three orthogonal dimensions from a reference point; or other suitable values. Likewise, a vector can describe a quantity having a direction and a magnitude in the coordinate space.

A computing device may maintain a representation of a virtual environment, for example, in memory associated with the device. As used herein, a virtual environment is a computational representation of a three-dimensional space. A virtual environment may include representations of any object, action, signal, parameter, coordinate, vector, or other characteristic associated with that space. In some examples, circuitry (e.g., a processor) of a computing device may maintain and update a state of a virtual environment; that is, the processor may determine the state of the virtual environment at a second time based on data associated with the virtual environment at a first time and/or input provided by a user. For example, if an object in the virtual environment is located at a first coordinate at the first time and has certain programmed physical parameters (e.g., mass, coefficient of friction), and an input received from the user indicates that a force should be applied to the object along a direction vector, the processor may apply laws of kinematics to determine the object's location at the second time using basic mechanics. The processor may use any suitable known information about the virtual environment and/or any suitable input to determine the state of the virtual environment at the second time. In maintaining and updating the state of a virtual environment, the processor may execute any suitable software, including software related to creating and deleting virtual objects in the virtual environment; software (e.g., scripts) for defining the behavior of virtual objects or characters in the virtual environment; software for defining the behavior of signals (e.g., audio signals) in the virtual environment; software for creating and updating parameters associated with the virtual environment; software for generating audio signals in the virtual environment; software for handling input and output; software for implementing network operations; software for applying asset data (e.g., animation data to move a virtual object over time); or many other possibilities.
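As a toy illustration of the state update described above, here is a hedged sketch that advances an object's position under an applied force using basic Euler integration; the class and helper names are invented for this example:

```python
from dataclasses import dataclass

@dataclass
class VirtualObject:
    position: tuple[float, float, float]
    velocity: tuple[float, float, float]
    mass: float

def step(obj: VirtualObject, force: tuple[float, float, float], dt: float) -> VirtualObject:
    """Advance the object's state from a first time to a second time dt seconds later."""
    ax, ay, az = (f / obj.mass for f in force)   # Newton's second law: a = F / m
    vx, vy, vz = (v + a * dt for v, a in zip(obj.velocity, (ax, ay, az)))
    px, py, pz = (p + v * dt for p, v in zip(obj.position, (vx, vy, vz)))
    return VirtualObject((px, py, pz), (vx, vy, vz), obj.mass)
```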

An output device, such as a display or speakers, may present any or all aspects of a virtual environment to a user. For example, a virtual environment may include virtual objects (which may include representations of inanimate objects, people, animals, lights, etc.) that may be presented to the user. A processor may determine a view of the virtual environment (e.g., corresponding to a "camera" with origin coordinates, a view axis, and a frustum) and render to a display a visible scene of the virtual environment corresponding to that view. Any suitable rendering technique may be used for this purpose. In some examples, the visible scene may include only some virtual objects in the virtual environment and exclude certain other virtual objects. Similarly, a virtual environment may include audio aspects that may be presented to the user as one or more audio signals. For example, a virtual object in the virtual environment may generate a sound originating from the object's position coordinates (e.g., a virtual character may speak or cause a sound effect), or the virtual environment may be associated with musical cues or ambient sounds that may or may not be associated with a particular location. A processor may determine an audio signal corresponding to "listener" coordinates, for example, an audio signal corresponding to a composite of sounds in the virtual environment, mixed and processed to simulate the audio signal that would be heard by a listener at the listener coordinates, and present the audio signal to the user via one or more speakers.

Because a virtual environment exists only as a computational structure, a user cannot directly perceive the virtual environment using one's ordinary senses. Instead, a user can perceive a virtual environment only indirectly, as presented to the user, for example, by a display, speakers, haptic output devices, etc. Similarly, a user cannot directly touch, manipulate, or otherwise interact with a virtual environment; but can provide input data, via input devices or sensors, to a processor that may use the device or sensor data to update the virtual environment. For example, a camera sensor may provide optical data indicating that a user is attempting to move an object in the virtual environment, and a processor may use that data to cause the object to respond accordingly in the virtual environment.

Reverberation algorithm and reverberator

In some embodiments, a digital reverberator may be designed based on a delay network with feedback. In such embodiments, reverberator algorithm design criteria may include precise parametric control of the decay time and preservation of the reverberation loudness as the decay time is varied. Relative adjustment of the reverberation loudness may be achieved by providing an adjustable signal amplitude gain in cascade with the digital reverberator. This approach may enable a sound designer or recording engineer to tune the reverberation decay time and the reverberation loudness independently, while audibly monitoring the reverberator output signal, in order to achieve a desired effect.

In a program application, such as a video game or an interactive audio engine for VR/AR/MR, which may simulate multiple moving sound sources at various locations and distances around a listener (e.g., a virtual listener) in a room/environment (e.g., a virtual room/environment), relative reverberation loudness control may not be sufficient. In some embodiments, an absolute reverberation loudness is applied, as would be experienced from each virtual sound source at rendering time. Many factors may adjust this value, such as the listener and sound source locations and the acoustic properties of the room/environment (e.g., as simulated by the reverberator). In some embodiments, such as in interactive audio applications, it is desirable to control the Reverberation Initial Power (RIP) programmatically, for example, as defined in "Analysis and synthesis of room reverberation based on a statistical time-frequency model" by Jean-Marc Jot, Laurent Cerveau, and Olivier Warusfel. The RIP may be used to characterize a virtual room regardless of the location of the virtual listener or virtual sound source.

In some embodiments, the reverberation algorithm (performed by the reverberator) may be configured to perceptually match the acoustic reverberation characteristics of a particular room. Example acoustic reverberation characteristics may include, but are not limited to, Reverberation Initial Power (RIP) and reverberation decay time (T60). In some embodiments, the acoustic reverberation characteristics of a room may be measured in a real room, calculated by computer simulation based on the geometric and/or physical description of the real room or a virtual room, etc.

Example Audio rendering System

Fig. 5A illustrates a block diagram of an example audio rendering system, in accordance with some embodiments. FIG. 5B illustrates a flow of an example process for operating the audio rendering system of FIG. 5A, in accordance with some embodiments.

The audio rendering system 500 may include a reverberation processing system 510A, a direct processing system 530, and a combiner 540. Both the reverberation processing system 510A and the direct processing system 530 may receive the input signal 501.

Reverberation processing system 510A may include a RIP control system 512 and a reverberator 514. RIP control system 512 may receive the input signal 501 and may output a signal to the reverberator 514. RIP control system 512 may include a Reverberation Initial Gain (RIG) 516 and a RIP corrector 518. RIG 516 may receive a first portion of the input signal 501 and may output a signal to the RIP corrector 518. The RIG 516 may be configured to apply a RIG value to the input signal 501 (step 552 of process 550). Setting the RIG value may have the effect of specifying the absolute amount of RIP in the output signal of reverberation processing system 510A.

RIP corrector 518 may receive the signal from RIG 516 and may be configured to calculate a RIP correction factor and apply the RIP correction factor to its input signal (from RIG 516) (step 554). The RIP corrector 518 may output a signal to the reverberator 514. Reverberator 514 may receive the signal from RIP corrector 518 and may be configured to introduce a reverberation effect in the signal (step 556). The reverberation effect may for example be based on a virtual environment. Reverberator 514 is discussed in more detail below.

The direct processing system 530 may include a propagation delay 532 and a direct gain 534. The direct processing system 530 and the propagation delay 532 may receive a second portion of the input signal 501. The propagation delay 532 may be configured to introduce a delay in the input signal 501 (step 558), and the delayed signal may be output to the direct gain 534. The direct gain 534 may receive the signal from the propagation delay 532 and may be configured to apply a gain to the signal (step 560).

Combiner 540 may receive the output signals from both reverberation processing system 510A and direct processing system 530, and may be configured to combine (e.g., add, sum, etc.) the signals (step 562). The output of the combiner 540 may be the output signal 502 of the audio rendering system 500.
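As a concrete illustration of the signal flow in fig. 5A, the following is a minimal sketch (with invented helper names; the reverberator itself is passed in as a stub) of the reverberant path (RIG, then RIP correction, then reverberation), the direct path (delay, then gain), and the combiner:

```python
import numpy as np

def render(input_signal, rig_value, rip_correction, reverberate,
           propagation_delay_samples, direct_gain):
    """Sketch of audio rendering system 500: reverb path plus direct path, summed."""
    # Reverberation processing system: RIG, then RIP corrector, then reverberator.
    wet = reverberate(input_signal * rig_value * rip_correction)

    # Direct processing system: propagation delay, then direct gain.
    dry = np.concatenate([np.zeros(propagation_delay_samples), input_signal])
    dry = dry * direct_gain

    # Combiner: pad to a common length and sum the two paths.
    n = max(len(wet), len(dry))
    out = np.zeros(n)
    out[:len(wet)] += wet
    out[:len(dry)] += dry
    return out
```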

Example Reverberation Initial Power (RIP) normalization

In reverberation processing system 510A, RIG 516 and RIP corrector 518 may apply (and/or calculate) a RIG value and a RIP correction factor, respectively, such that when applied in series, the signal output from RIP corrector 518 may be normalized to a predetermined value (e.g., unity (1.0)). That is, the RIG value of the output signal may be controlled by applying RIG 516 in series with RIP corrector 518. In some embodiments, the RIP correction factor may be applied directly after the RIG value. The RIP normalization process will be discussed in detail below.

In some embodiments, to produce a diffuse reverberation tail, the reverberation algorithm may, for example, include parallel comb filters followed by series all-pass filters. In some embodiments, the digital reverberator may be constructed as a network including one or more delay elements interconnected with feedback and/or feedforward paths, which may also include signal gain scaling or filter elements. The RIP correction factor for a reverberation processing system, such as reverberation processing system 510A of fig. 5A, may depend on one or more parameters, such as the reverberator topology, the number and duration of delay elements included in the network, the connection gains, and the filter parameters.

In some embodiments, when the reverberation time is set to infinity, the RIP correction factor of the reverberation processing system may be equal to the Root Mean Square (RMS) power of the impulse response of the reverberation system. In some embodiments, for example, as shown in FIG. 6, when the reverberation time of the reverberator is set to infinity, the impulse response of the reverberator may be a non-decaying noise-like signal whose RMS amplitude is constant over time.

The RMS power $P_{rms}(t)$ of a digital signal $\{x\}$ at time $t$, expressed in samples, may be equal to the average of the squared signal amplitudes. In some embodiments, the RMS power may be expressed as:

$$P_{rms}(t) = \frac{1}{N} \sum_{n=t}^{t+N-1} x[n]^2 \qquad (1)$$

where $t$ is time, $N$ is the number of consecutive signal samples, and $n$ is the signal sample index. The average may be evaluated over a signal window starting at time $t$ and comprising $N$ consecutive signal samples.

The RMS amplitude may be equal to the square root of the RMS power $P_{rms}(t)$. In some embodiments, the RMS amplitude may be expressed as:

$$A_{rms}(t) = \sqrt{P_{rms}(t)} \qquad (2)$$
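To make equations (1) and (2) concrete, here is a short sketch that computes the windowed RMS power and amplitude of a signal (the helper names are ours):

```python
import numpy as np

def rms_power(x, t, n):
    """Equation (1): mean of squared amplitudes over an n-sample window starting at t."""
    window = x[t:t + n]
    return np.mean(window ** 2)

def rms_amplitude(x, t, n):
    """Equation (2): square root of the RMS power."""
    return np.sqrt(rms_power(x, t, n))

# Example: a constant-amplitude noise-like signal has a flat RMS amplitude over time.
rng = np.random.default_rng(0)
noise = rng.choice([-1.0, 1.0], size=48000)
print(rms_amplitude(noise, t=0, n=1024))   # ~1.0
```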

In some embodiments, the RIP correction factor may be derived as the desired RMS power of the constant-power signal following the onset of reverberation in the impulse response of the reverberator (e.g., as shown in fig. 6), with the reverberation decay time set to infinity. Fig. 8 shows an example output signal obtained by running a single pulse of amplitude 1.0 through the audio rendering system 500 of fig. 5A. In this case, the reverberation decay time is set to infinity, the direct gain is set to 1.0, and the direct signal output is delayed by the source-to-listener propagation delay.

In some embodiments, the reverberation time of the reverberation processing system 510A may be set to a finite value. For such a finite value, the RMS power may essentially follow an exponential decay (after the reverberation start time), as shown in fig. 7. The reverberation time (T60) of the reverberation processing system 510A may generally be defined as the duration over which the RMS power (or amplitude) decays by 60 dB. The RIP correction factor may be defined as the power measured on the RMS power decay curve extrapolated back to time t = 0. Time t = 0 may be the transmission time of the input signal 501 (in fig. 5A).

Example reverberator

In some embodiments, reverberator 514 (of fig. 5A) may be configured to run a reverberation algorithm such as the one described in: J. O. Smith, "Physical Audio Signal Processing", https://ccrma.stanford.edu/~jos/pasp/, online book, 2010 edition. In these embodiments, the reverberator may contain a comb filter stage. The comb filter stage may comprise 16 comb filters (e.g., eight comb filters per ear), where each comb filter may have a different feedback loop delay length.

In some embodiments, the RIP correction factor for the reverberator may be calculated by setting the reverberation time to infinity. Setting the reverberation time to infinity can be equivalent to assuming that the comb filters do not have any built-in attenuation. If a Dirac pulse is input to a comb filter, the output signal of the reverberator 514 may be, for example, a sequence of full-scale pulses.

FIG. 8 illustrates an example output signal from the reverberator 514 of FIG. 5A according to some embodiments. Reverberator 514 may include a comb filter (not shown). If a single comb filter has a feedback loop delay length $d$ in samples, the echo density may be equal to the inverse of the feedback loop delay length $d$. The RMS amplitude may be equal to the square root of the echo density. The RMS amplitude can be expressed as:

$$A_{rms} = \sqrt{1/d} \qquad (3)$$

In some embodiments, the reverberator may have multiple comb filters, and the RMS amplitude may be expressed as:

$$A_{rms} = \sqrt{N/d_{mean}} \qquad (4)$$

where $N$ is the number of comb filters in the reverberator and $d_{mean}$ is the average feedback loop delay length. The average feedback delay length $d_{mean}$ may be expressed in samples and averaged over the $N$ comb filters.

FIG. 9 shows the amplitude of the impulse response of an example reverberator that includes only comb filters, according to some examples. In some embodiments, the reverberator may set the decay time to a finite value. As shown, the RMS amplitude of the reverberator impulse response decreases exponentially with time. On a dB scale, the RMS amplitude falls along a straight line and starts at a value equal to the RIP at time t = 0. Time t = 0 may be the transmission time of a unit pulse at the input (e.g., the time at which a pulse is emitted by a virtual sound source).

FIG. 10 shows the amplitude of the impulse response of an example reverberator that includes an all-pass filter stage, according to examples of the present disclosure. The reverberator may be similar to the one described in: J. O. Smith, "Physical Audio Signal Processing", https://ccrma.stanford.edu/~jos/pasp/, online book, 2010 edition. Since the inclusion of all-pass filters may not significantly affect the RMS amplitude of the reverberator impulse response (as compared to the RMS amplitude of the reverberator impulse response of FIG. 9), the linear decay trend of the RMS amplitude in dB may be the same as that of FIG. 9. In some embodiments, the linear decay trend may start from the same RIP value observed at time t = 0.

Fig. 11A illustrates an example reverberation processing system having a reverberator with a comb filter according to some embodiments. Fig. 11B illustrates a flow of an example process for operating the reverberation processing system of fig. 11A according to some embodiments.

Reverb processing system 510B may include RIP control system 512 and reverberator 1114. RIP control system 512 may include RIG 516 and RIP corrector 518. RIP control system 512 and RIP corrector 518 may be correspondingly similar to those included in reverb processing system 510A (of fig. 5A). Reverb processing system 510B may receive input signal 501 and output signals 502A and 502B. In some embodiments, reverberation processing system 510B may be included in the audio rendering system 500 of fig. 5A in place of reverberation processing system 510A (of fig. 5A).

RIG 516 may be configured to apply a RIG value (step 1152 of process 1150) and RIP corrector 518 may apply a RIP correction factor (step 1154), both in series with reverberator 1114. The series configuration of RIG 516, RIP corrector 518, and reverberator 1114 may make the RIP of reverberation processing system 510B equal to the RIG value.

In some embodiments, the RIP correction factor may be expressed as the inverse of the RMS amplitude of equation (4):

$$RIP_{correction} = 1/A_{rms} = \sqrt{d_{mean}/N} \qquad (5)$$

when the RIG value is set to 1.0, applying the RIP correction factor to the signal may cause the RIP to be set to a predetermined value, such as a unit value (1.0).

Reverberator 1114 may receive the signal from RIP control system 512 and may be configured to introduce a reverberation effect into the first portion of the input signal (step 1156). Reverberator 1114 may include one or more comb filters 1115. The comb filter(s) 1115 may be configured to filter out one or more frequencies in the signal (step 1158). For example, the comb filter(s) 1115 may filter out (e.g., cancel) one or more frequencies to mimic environmental effects (e.g., the walls of a room). Reverberator 1114 may output two or more output signals 502A and 502B (step 1160).
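For reference, a minimal sketch of one feedback comb filter of the kind the text describes (a delayed copy of the output fed back to the input; the parameter names are illustrative):

```python
import numpy as np

def feedback_comb(x, delay, feedback_gain):
    """Feedback comb filter: y[n] = x[n] + g * y[n - delay].

    With feedback_gain = 1.0 (no built-in attenuation), the impulse
    response is a non-decaying pulse train, as in the infinite-decay case.
    """
    y = np.zeros(len(x))
    for n in range(len(x)):
        y[n] = x[n] + (feedback_gain * y[n - delay] if n >= delay else 0.0)
    return y

impulse = np.zeros(8000)
impulse[0] = 1.0
tail = feedback_comb(impulse, delay=1116, feedback_gain=1.0)  # full-scale pulse train
```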

FIG. 12A illustrates an example reverberation processing system having a reverberator with multiple all-pass filters. Fig. 12B illustrates a flow of an example process for operating the reverberation processing system of fig. 12A according to some embodiments.

The reverberation processing system 510C may be similar to reverberation processing system 510B (of fig. 11A), but its reverberator 1214 may additionally include a plurality of all-pass filters 1216. Steps 1252, 1254, 1256, 1258, and 1260 may be similar to steps 1152, 1154, 1156, 1158, and 1160, respectively.

Reverberation processing system 510C may include a RIP control system 512 and a reverberator 1214. RIP control system 512 may include RIG 516 and RIP corrector 518. RIP control system 512 and RIP corrector 518 may be similar to those included in reverberation processing system 510A (of fig. 5A). Reverberation processing system 510C may receive input signal 501 and output signals 502A and 502B. In some embodiments, reverberation processing system 510C may be included in the audio rendering system 500 of fig. 5A in place of reverberation processing system 510A (of fig. 5A) or reverberation processing system 510B (of fig. 11A).

Reverberator 1214 may additionally include all-pass filters 1216 that may receive signals from the comb filters 1115. Each all-pass filter 1216 may receive a signal from the comb filters 1115 and may be configured to pass its input signal without changing the amplitude of the input signal (step 1262). In some embodiments, an all-pass filter 1216 may change the phase of the signal. In some embodiments, each all-pass filter may receive a unique signal from the comb filters. The outputs of the all-pass filters 1216 may be the output signals 502 of reverberation processing system 510C and of the audio rendering system 500. For example, all-pass filter 1216A may receive a unique signal from the comb filters 1115 and may output signal 502A; similarly, all-pass filter 1216B may receive a unique signal from the comb filters 1115 and may output signal 502B.
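A minimal sketch of a Schroeder-style all-pass stage of the kind described (unit-magnitude response, phase-only modification; the coefficient and delay values are illustrative):

```python
import numpy as np

def schroeder_allpass(x, delay, g):
    """Schroeder all-pass: y[n] = -g*x[n] + x[n-delay] + g*y[n-delay].

    The magnitude response is 1 at all frequencies, so the RMS amplitude
    is preserved; only the phase (and echo density) of the signal changes.
    """
    y = np.zeros(len(x))
    for n in range(len(x)):
        xd = x[n - delay] if n >= delay else 0.0
        yd = y[n - delay] if n >= delay else 0.0
        y[n] = -g * x[n] + xd + g * yd
    return y
```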

Comparing fig. 9 and fig. 10, the inclusion of the all-pass filters 1216 may not significantly affect the decay trend of the output RMS amplitude.

When applying the RIP correction factor, if the reverberation time is set to infinity, the RIG value is set to 1.0 and a single unit pulse is input through the reverberation processing system 510C, a noise-like output with a constant RMS level of 1 may be obtained.

Fig. 13 illustrates an example impulse response of the reverberation processing system 510C of fig. 12A according to some embodiments. The reverberation time may be set to a finite value and the RIG may be set to 1.0. On a dB scale, the RMS level may drop along a straight decay line, as in fig. 10. However, due to the RIP correction factor, the RIP observed in fig. 13 is normalized to 0 dB at time t = 0.

In some embodiments, the RIP normalization methods described in conjunction with figs. 5, 6, 7, and 18A may be applied regardless of the particular digital reverberation algorithm implemented in the reverberator 514 of fig. 5A. For example, the reverberator may be constructed from a network of feedback and feedforward delay elements connected to a gain matrix.

Fig. 14 illustrates the signals input to and output from a reverberation processing system 510 according to some embodiments. For example, fig. 14 shows the signal flow of any of the reverberation processing systems 510 discussed above (e.g., those discussed in figs. 5A, 11A, and 12A). The step 1416 of applying the RIG may comprise setting a RIG value and applying it to the input signal 501. The step 1418 of applying the RIP correction factor may comprise calculating the RIP correction factor for the selected reverberator design and internal reverberator parameter settings. Additionally, passing the signal through reverberator 1414 may comprise selecting the reverberator topology and setting the internal reverberator parameters. As shown, the output of reverberator 1414 may be the output signal 502.

Example feedback delay network

According to some embodiments, embodiments disclosed herein may have a reverberator that includes a Feedback Delay Network (FDN). An FDN may include an identity matrix, which may allow the output of each delay unit to be fed back to its own input. Fig. 15 illustrates a block diagram of an example FDN including a feedback matrix, in accordance with some embodiments. The FDN 1515 may include a feedback matrix 1520, a plurality of combiners 1522, a plurality of delays 1524, and a plurality of gains 1526.

The combiner 1522 may receive the input signal 1501 and may be configured to combine (e.g., add, sum, etc.) its inputs (step 1552 of process 1550). The combiner 1522 may also receive signals from the feedback matrix 1520. Delay 1524 may receive the combined signal from combiner 1522 and may be configured to introduce a delay into one or more signals (step 1554). Gain 1526 may receive the signal from delay 1524 and may be configured to introduce gain into one or more signals (step 1556). The output signal from gain 1526 may form output signal 1502 and may also be input into feedback matrix 1520. In some embodiments, the feedback matrix 1520 may be an N × N unitary (energy preserving) matrix.
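A hedged sketch of the FDN loop just described (combiners, delay lines, per-line gains, and a unitary feedback matrix; a Householder matrix is used here as one common energy-preserving choice, which is our assumption, not the patent's specified matrix):

```python
import numpy as np

def fdn(x, delays, gains, num_samples):
    """Feedback delay network: input -> combiners -> delays -> gains -> feedback matrix."""
    n = len(delays)
    # Householder matrix: a standard N x N unitary (energy-preserving) choice.
    A = np.eye(n) - (2.0 / n) * np.ones((n, n))
    buffers = [np.zeros(d) for d in delays]      # delay lines
    idx = [0] * n                                # circular read/write positions
    out = np.zeros(num_samples)
    fb = np.zeros(n)                             # feedback signals
    for t in range(num_samples):
        xin = x[t] if t < len(x) else 0.0
        combined = xin + fb                      # combiners 1522
        delayed = np.array([buffers[i][idx[i]] for i in range(n)])
        for i in range(n):
            buffers[i][idx[i]] = combined[i]     # write into delays 1524
            idx[i] = (idx[i] + 1) % delays[i]
        y = delayed * gains                      # gains 1526
        out[t] = y.sum()                         # tap the output
        fb = A @ y                               # feedback matrix 1520
    return out
```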

In the general case where the feedback matrix 1520 is unitary, the expression for the RIP correction factor can also be given by equation (5), because the total energy transiting around the reverberator feedback loop is preserved by the matrix and the delays.

For example, the RIP correction factors may be calculated for a given arbitrary selection of reverberator design and internal parameter settings. The calculated RIP correction factor may be such that if the RIG value is set to 1.0, the RIP of the entire reverberation processing system 510 is also 1.0.

In some embodiments, the reverberator may include an FDN with one or more all-pass filters. Fig. 16 illustrates a block diagram of an example FDN including multiple all-pass filters according to some embodiments.

The FDN 1615 may include a plurality of all-pass filters 1630, a plurality of delays 1632, and a mixing matrix 1640B. The all-pass filter 1630 may include a plurality of gains 1526, an absorption delay 1632, and another mixing matrix 1640A. The FDN 1615 may also include a plurality of combiners (not shown).

The all-pass filters 1630 may receive the input signal 1501 and may be configured to pass their input signals without changing the amplitude of the input signal. In some embodiments, an all-pass filter 1630 may change the phase of the signal. In some embodiments, each all-pass filter 1630 may be configured such that the power input to the all-pass filter 1630 is equal to the power output from the all-pass filter. In other words, each all-pass filter 1630 may be non-absorbing. In particular, an absorption delay 1632 may receive the input signal 1501 and may be configured to introduce a delay in the signal. In some embodiments, the absorption delay 1632 may delay its input signal by a number of samples. In some embodiments, each absorption delay 1632 may have an absorption level that makes its output signal smaller than its input signal by a specified amount.

Gains 1526A and 1526B may be configured to introduce a gain in their respective input signals. The input to gain 1526A may be the input signal of the absorption delay, and the output of gain 1526B may feed the mixing matrix 1640A.

The output signals from the all-pass filters 1630 may be the input signals to the delays 1632. The delays 1632 may receive the signals from the all-pass filters 1630 and may be configured to introduce a delay into their respective signals. In some embodiments, the output signals from the delays 1632 may be combined to form the output signal 1502, or, in some embodiments, these signals may be treated separately as multiple output channels. In some embodiments, the output signal 1502 may be obtained from other points in the network.

The output signals from the delays 1632 may also be input signals to the mixing matrix 1640B. The mixing matrix 1640B may be configured to receive multiple input signals and may output signals that are fed back to the all-pass filters 1630. In some embodiments, each mixing matrix may be a full mixing matrix.
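The non-absorbing property described above can be illustrated with a classical Schroeder all-pass filter, sketched below. This is only one well-known construction with unit magnitude response at all frequencies; the exact arrangement of gains 1526A/1526B, the absorption delay, and the mixing matrix 1640A in fig. 16 may differ, and the delay length and coefficient here are illustrative.

```python
import numpy as np

def schroeder_allpass(x, delay=347, g=0.5):
    """y[n] = -g*v[n] + v[n-delay], v[n] = x[n] + g*v[n-delay].
    Transfer function H(z) = (-g + z^-D) / (1 - g*z^-D), so |H| = 1:
    the filter changes phase but passes all power through unchanged."""
    buf = np.zeros(delay)          # stores v, the internal delay-line state
    idx = 0
    y = np.zeros(len(x))
    for n in range(len(x)):
        v_delayed = buf[idx]
        v = x[n] + g * v_delayed   # feedback path
        y[n] = -g * v + v_delayed  # feedforward path
        buf[idx] = v
        idx = (idx + 1) % delay
    return y
```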

In these reverberator topologies, the RIP correction factor may likewise be given by equation (5), because the total energy transferred around the reverberator's feedback loop is preserved regardless of the delays. In some embodiments, the FDN 1615 may change the placement of the input and/or output signals to achieve the desired output signal 1502.

The FDN 1615 with all-pass filters 1630 may be a reverberant system that takes the input signal 1501 as its input and creates a multi-channel output that may include a correctly attenuated reverberation signal. The input signal 1501 may be a mono input signal.

In some embodiments, the RIP correction factor may be expressed as a mathematical function of a set of reverberator parameters {P} that determines the reverberation RMS amplitude A_rms({P}) when the reverberation time is set to infinity. For example, the RIP correction factor may be expressed as:

RIP_correction = 1 / A_rms({P})    (6)

For a given reverberator topology and a given setting of the reverberator's delay unit lengths, the RIP correction factor may be calculated by performing the following steps: (1) setting the reverberation time to infinity; (2) recording the impulse response of the reverberator (as shown in FIG. 6); (3) measuring the reverberation RMS amplitude A_rms; and (4) determining the RIP correction factor according to equation (6).
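A sketch of this four-step procedure follows. The `reverb` object, with `set_decay_time` and `process` methods, is a hypothetical stand-in for whatever reverberator implementation is used; only the measurement logic is the point here.

```python
import numpy as np

def rip_correction_infinite_t60(reverb, fs=48000, measure_len_s=2.0):
    reverb.set_decay_time(float('inf'))          # step (1): lossless loop
    unit_pulse = np.zeros(int(fs * measure_len_s))
    unit_pulse[0] = 1.0
    h = reverb.process(unit_pulse)               # step (2): impulse response
    # Step (3): with infinite reverberation time the tail does not decay,
    # so RMS over a late window measures the stationary amplitude A_rms({P}).
    tail = h[len(h) // 2:]
    a_rms = np.sqrt(np.mean(tail ** 2))
    return 1.0 / a_rms                           # step (4): equation (6)
```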

In some embodiments, the RIP correction factor may be calculated by performing the following steps: (1) setting the reverberation time to any finite value; (2) recording the impulse response of the reverberator; (3) deriving the reverberation RMS amplitude decay curve A_rms(t) (as shown in fig. 7A or fig. 7C); (4) determining its extrapolated value at emission time t = 0 (denoted A_rms(0), as shown in fig. 10); and (5) determining the RIP correction factor according to equation (7) below.

RIP_correction = 1 / A_rms(0)    (7)
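The finite-decay-time variant can be sketched as follows; because an exponential decay is linear in log-amplitude, a least-squares line fit extrapolated back to t = 0 yields A_rms(0). The `reverb` API and window size are again hypothetical.

```python
import numpy as np

def rip_correction_finite_t60(reverb, fs=48000, ir_len_s=2.0, win=1024):
    unit_pulse = np.zeros(int(fs * ir_len_s))
    unit_pulse[0] = 1.0
    h = reverb.process(unit_pulse)               # steps (1)-(2)
    # Step (3): short-time RMS over windows gives the decay curve A_rms(t).
    n_win = len(h) // win
    a_rms = np.array([np.sqrt(np.mean(h[i * win:(i + 1) * win] ** 2))
                      for i in range(n_win)])
    t = (np.arange(n_win) + 0.5) * win / fs      # window-center times
    # Step (4): fit log-amplitude vs. time and extrapolate to t = 0.
    slope, intercept = np.polyfit(t, np.log(a_rms + 1e-12), 1)
    a_rms_0 = np.exp(intercept)
    return 1.0 / a_rms_0                         # step (5): equation (7)
```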

Example method for normalization of reverberation energy

In some embodiments, it may be desirable to provide perceptually relevant reverberation gain control methods, e.g., for application developers, sound engineers, etc. For example, in some reverberator or room simulator embodiments, it may be desirable to provide programmatic control over the power amplification factor, which represents the effect of the reverberation processing system on the power of the input signal. The power amplification may be expressed in dB, for example. Programmatic control of the power amplification factor may allow, for example, an application developer or sound engineer to set the balance between the loudness of the reverberant output signal and the loudness of the input signal, or relative to the loudness of the direct sound output signal.

In some embodiments, the system may apply a Reverberation Energy (RE) correction factor. Fig. 17A illustrates a block diagram of an example reverberation processing system including an RE corrector according to some embodiments. Fig. 17B illustrates a flow of an example process for operating the reverberation processing system of fig. 17A according to some embodiments.

Reverb processing system 510D may include a RIP control system 512 and a reverberator 514. RIP control system 512 may include RIG 516 and RIP corrector 518. RIP control system 512, reverberator 514, and RIP corrector 518 may be correspondingly similar to those included in reverberation processing system 510A (of fig. 5A). Reverb processing system 510D may receive input signal 501 and may output signal 502. In some embodiments, reverb processing system 510D may be included in audio rendering system 500 of fig. 5A in place of reverb processing system 510A (of fig. 5A), reverb processing system 510B (of fig. 11A), or reverb processing system 510C (of fig. 12A).

Reverberation processing system 510D may also include RIG 516, which may include a Reverberation Gain (RG) 1716 and an RE corrector 1717. The RG 1716 may receive the input signal 501 and may output a signal to the RE corrector 1717. The RG 1716 may be configured to apply an RG value to the first portion of the input signal 501 (step 1752 of process 1750). In some embodiments, RIG may be implemented by cascading the RG 1716 with the RE corrector 1717 such that the RE correction factor is applied to the first portion of the input signal after the RG value is applied. In some embodiments, RIG 516 may be cascaded with RIP corrector 518, forming RIP control system 512, which is in turn cascaded with reverberator 514.

The RE corrector 1717 may receive the signal from the RG 1716 and may be configured to calculate an RE correction factor and apply it to its input signal from the RG 1716 (step 1754). In some embodiments, the RE correction factor may be calculated such that it represents the total energy in the reverberator impulse response under the following conditions: (1) the RIP is set to 1.0, and (2) the reverberation start time is set equal to the time at which a unit pulse is emitted by the sound source. The RG 1716 and the RE corrector 1717 may apply (and/or calculate) the RG value and the RE correction factor, respectively, so that when applied in series, the signal output from the RE corrector 1717 may be normalized to a predetermined value (e.g., unity (1.0)). The RIP of the output signal may be controlled by applying the reverberation gain in series with the RE correction factor, the RIP correction factor, and the reverberator, as shown in FIG. 17A. The RE normalization process is discussed in detail below.
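The cascade of fig. 17A can be sketched in a few lines; the function name is illustrative, and the scalar factors and `reverb` object stand in for whatever implementations compute them.

```python
def reverb_processing_system_510d(x, rg, rec, rip_correction, reverb):
    """Apply the fig. 17A cascade to the reverb-path signal x."""
    x = rg * x                 # RG 1716 (step 1752)
    x = rec * x                # RE corrector 1717 (step 1754); RIG = RG * REC
    x = rip_correction * x     # RIP corrector 518 (step 1756)
    return reverb.process(x)   # reverberator 514 (step 1758)
```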

RIP corrector 518 may receive the signal from RIG 516 and may be configured to calculate and apply a RIP correction factor to its input signal (from RIG 516) (step 1756). Reverberator 514 may receive the signal from RIP corrector 518 and may be configured to introduce a reverberation effect in the signal (step 1758).

In some embodiments, the RIP of the virtual room may be controlled using the reverberation processing system 510A of fig. 5A (included in the audio rendering system 500), the reverberation processing system 510B of fig. 11A (included in the audio rendering system 500), or both. RIG 516 of reverberation processing system 510A (of FIG. 5A) may directly specify the RIP, and may be physically interpreted as proportional to the inverse of the square root of the cubic volume of the virtual room, as shown, for example, in the paper by Jean-Marc Jot, Laurent Cerveau, and Olivier Warusfel entitled "Analysis and synthesis of room reverberation based on a statistical time-frequency model".

The RG 1716 of the reverberation processing system 510D (of fig. 17A) may indirectly control the RIP of the virtual room by specifying the RE. The RE may be a perceptually relevant quantity proportional to the reverberation energy that the user would be expected to receive from a virtual sound source collocated with the virtual listener in the virtual room. One example of a virtual sound source collocated with the virtual listener is the virtual listener's own voice or footsteps.

In some embodiments, the RE may be calculated and used to represent the amplification of the input signal by the reverberation processing system. The amplification may be expressed in terms of signal power. As shown in fig. 7, the RE may be equal to the area under the reverberation RMS power envelope, integrated from the reverberation start time. In some embodiments, in an interactive audio engine for video games or virtual reality, the reverberation start time may be at least equal to the propagation delay of a given virtual sound source. Thus, the calculation of the RE for a given virtual sound source may depend on the location of the virtual sound source.

Fig. 18A illustrates the RE calculated for a virtual sound source collocated with a virtual listener, in accordance with some embodiments. In some embodiments, the reverberation start time may be assumed equal to the time of sound emission. In this case, the RE may represent the total energy in the reverberator impulse response when the reverberation start time is assumed equal to the time at which the sound source emits a unit pulse. The RE may be equal to the area under the reverberation RMS power envelope, integrated from the reverberation start time.

In some embodiments, the RMS power curve may be represented as a continuous function of time t. In this case, the RE can be represented as:

RE = ∫_0^∞ P_rms(t) dt    (8)

In some embodiments, such as in a discrete-time embodiment of a reverberation processing system, the RMS power curve may be expressed as a function of discrete time t = n/Fs. In this case, the RE can be expressed as:

RE = (1/Fs) * Σ_{n=0}^{∞} P_rms(n/Fs)    (9)

where Fs is the sample rate.
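A minimal sketch of equation (9), assuming `p_rms` holds samples of the reverberation RMS power envelope taken from the reverberation start time onward and `fs` is the sample rate Fs:

```python
import numpy as np

def reverberation_energy(p_rms, fs):
    """Discrete-time RE: the area under the RMS power envelope, eq. (9)."""
    return np.sum(p_rms) / fs

# Example: for an exponential envelope p_rms[n] = exp(-alpha * n / fs),
# the result approximates 1 / alpha.
```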

In some embodiments, an RE correction factor may be calculated and applied in series with the RIP correction factor and the reverberator, so that the RE may be normalized to a predetermined value (e.g., unity (1.0)). The REC may be set equal to the inverse of the square root of the RE, as follows:

REC = 1 / √RE    (10)

in some embodiments, the RIP of the output reverberation signal may be controlled by applying an RG value in series with an RE correction factor, RIP correction factor, and reverberator, such as shown in reverberation processing system 510C of fig. 17A. The RG values and RE corrections can be combined to determine RIG as follows:

RIG = RG * REC    (11)

Thus, the RIP can be controlled through the familiar signal-domain quantity RG, with the RE correction factor (REC) applied in place of a directly specified RIG.

In some embodiments, the RIP may be mapped to a measured signal power amplification derived from the integrated RE of the system impulse response. This mapping allows the RIP to be controlled through the familiar concept of a signal amplification factor (i.e., RG), as shown in equations (10)-(11) above. In some embodiments, as shown in fig. 18B and equations (8)-(9), an advantage of assuming an instantaneous reverberation onset when calculating the RE is that the mapping can be expressed without regard to the location of the user or listener.

In some embodiments, the reverberant RMS power curve of the impulse response of the reverberator 514 may be expressed as a decaying function of time, starting from time t = 0:

P_rms(t) = RIP * e^(−α*t)    (12)

In some embodiments, the decay parameter α may be expressed as a function of the decay time T60, as follows:

α = 3 * log(10) / T60    (13)

The total RE can then be obtained by applying equation (8) to equation (12):

RE = RIP / α    (14)

in some embodiments, RIP may be normalized to a predetermined value (e.g., unity (1.0)), and REC may be expressed as follows:

In some embodiments, evaluating equation (15) with the natural-logarithm convention (3 * ln(10) ≈ 6.91), the REC may be approximated according to the following equation:

REC ≈ √(6.91 / T60) ≈ 2.63 / √T60    (16)
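A sketch combining equations (13)-(16) to derive REC directly from the decay time, assuming the exponential power envelope of equation (12) with RIP normalized to 1.0 and the natural-log convention in equation (13):

```python
import numpy as np

def rec_from_t60(t60):
    """REC as a closed-form function of the decay time T60 (seconds)."""
    alpha = 3.0 * np.log(10.0) / t60    # equation (13), alpha ≈ 6.91 / T60
    re_total = 1.0 / alpha              # equation (14) with RIP = 1.0
    return 1.0 / np.sqrt(re_total)      # equations (10)/(15): REC = sqrt(alpha)

# e.g., rec_from_t60(1.5) ≈ 2.15, matching 2.63 / sqrt(1.5) from eq. (16).
```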

Fig. 19 illustrates a flow of an example reverberation processing system according to some embodiments. For example, fig. 19 may illustrate a flow of the reverberation processing system 510D of fig. 17A. For any given choice of reverberator design and internal parameter settings, the RIP correction factor may be calculated by applying, for example, equations (5)-(7). In some embodiments, for a given run-time adjustment of the reverberation decay time T60, the total RE may be recalculated by applying equations (8)-(9), where the RIP may be assumed to be normalized to 1.0. The REC factor can then be derived from equation (10).

Adjusting the RG value or the reverberation decay time T60 at run time automatically corrects the RIP of the reverberation processing system through the application of the REC factor, so that the RG can serve as the amplification factor of the RMS amplitude of the output signal (e.g., output signal 502) relative to the RMS amplitude of the input signal (e.g., input signal 501). It should be noted that adjusting the reverberation decay time T60 may not require recalculation of the RIP correction factor, as the RIP may not be affected by a modification of the decay time in some embodiments.
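A sketch of this run-time behavior, using a hypothetical state object: changing RG or T60 re-derives only REC (and hence RIG), while the RIP correction factor, fixed once per reverberator topology, is left untouched.

```python
import math

class ReverbGainState:
    """Hypothetical run-time gain state for the fig. 17A cascade."""
    def __init__(self, rg, t60, rip_correction):
        self.rip_correction = rip_correction  # from equations (5)-(7), fixed
        self.update(rg, t60)

    def update(self, rg, t60):
        alpha = 3.0 * math.log(10.0) / t60    # equation (13)
        self.rec = math.sqrt(alpha)           # equation (15), with RIP = 1.0
        self.rig = rg * self.rec              # equation (11)
```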

In some embodiments, the REC may be defined based on measuring the RE as the energy between two points in the reverberation tail, specified in time from the emission of the sound source, after the RIP is set to 1.0 by applying the RIP correction factor. This may be beneficial, for example, when convolution with a measured reverberation tail is used.

In some embodiments, the RE correction factor may be defined based on measuring the RE as the energy between two points in the reverberation tail defined using an energy threshold, after the RIP is set to 1.0 by applying the RIP correction factor. In some embodiments, an energy threshold relative to the direct sound may be used, or an absolute energy threshold may be used.

In some embodiments, the RE correction factor may be defined based on measuring the RE as the energy between one point in the reverberation tail defined in time and another point defined using an energy threshold, after the RIP is set to 1.0 by applying the RIP correction factor.
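The two measurement variants above can be sketched as follows, assuming `h` is a measured reverberation tail whose RIP has already been set to 1.0; function names, the envelope estimate, and the regularization constant are illustrative.

```python
import numpy as np

def re_between_times(h, fs, t_start, t_end):
    """RE between two points given in seconds from emission time."""
    seg = h[int(t_start * fs):int(t_end * fs)]
    return np.sum(seg ** 2) / fs

def re_time_to_threshold(h, fs, t_start, threshold_db):
    """RE from a time-defined point to the first sample whose level falls
    below an absolute threshold (in dB)."""
    level_db = 20.0 * np.log10(np.abs(h) + 1e-12)
    below = np.where(level_db < threshold_db)[0]
    t_end = below[0] / fs if below.size else len(h) / fs
    return re_between_times(h, fs, t_start, t_end)
```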

In some embodiments, the RE correction factor may be calculated by considering a weighted sum of the energies contributed by different coupled spaces, after the RIP of each reverberation tail is set to 1.0 by applying a RIP correction factor to each reverberation. One example application of this RE correction factor calculation is an acoustic environment that includes two or more coupled spaces.
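A sketch of the coupled-spaces case: each space's RE is computed after its RIP normalization, and REC is derived from the weighted sum by applying equation (10). The coupling weights `w` are whatever the application's coupling model provides; they are not specified above.

```python
import numpy as np

def rec_coupled_spaces(re_per_space, w):
    """re_per_space: RE of each tail after RIP normalization;
    w: coupling weight of each space's energy contribution."""
    total_re = float(np.dot(w, re_per_space))   # weighted sum of energies
    return 1.0 / np.sqrt(total_re)              # equation (10) on the sum
```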

With respect to the above-described systems and methods, the elements of the systems and methods may be suitably implemented by one or more computer processors (e.g., a CPU or DSP). The present disclosure is not limited to any particular configuration of computer hardware, including computer processors, for implementing these elements. In some cases, multiple computer systems may be employed to implement the above-described systems and methods. For example, a first computer processor (e.g., a processor of a wearable device coupled to a microphone) may be employed to receive incoming microphone signals and perform initial processing of those signals (e.g., signal conditioning and/or segmentation, such as described above). A second (and perhaps more computationally powerful) processor may then be employed to perform more computationally intensive processing, such as determining probability values associated with the speech segments of those signals. Another computer device, such as a cloud server, may host a speech recognition engine to which the input signal is ultimately provided. Other suitable configurations will be apparent and are within the scope of the present disclosure.

Although the disclosed examples have been fully described with reference to the accompanying drawings, it is to be noted that various changes and modifications will become apparent to those skilled in the art. For example, elements of one or more implementations may be combined, deleted, modified, or supplemented to form further implementations. Such changes and modifications are to be understood as being included within the scope of the disclosed examples as defined by the appended claims.
