Acoustic signal processing device, acoustic signal processing system, acoustic signal processing method, and program

Document No.: 174695  Publication date: 2021-10-29

Note: This technology, "Acoustic signal processing device, acoustic signal processing system, acoustic signal processing method, and program", was designed and created by 渡边隆太郎 (Ryutaro Watanabe) on 2020-01-24. Abstract: The invention realizes the following configuration: performing sound localization processing applying a Head Related Transfer Function (HRTF) corresponding to a user identified by user identification, and outputting from the output unit for each user position. The configuration includes a user identification unit that performs user identification and user position identification processing, and a sound image localization processing unit that performs sound localization processing using a user-specific Head Related Transfer Function (HRTF) as a processing parameter. The sound image localization processing unit performs sound localization processing that takes the HRTF specific to the identified user as a processing parameter, and outputs the signal obtained by the sound localization processing to the output unit for the identified user position. In a case where the user identifying unit identifies a plurality of users, the sound image localization processing unit performs sound localization processing in parallel using the HRTF of each of the plurality of users, and outputs the processed signals to the output unit for each user position.

1. An acoustic signal processing apparatus comprising:

a user identification unit that executes a user identification process;

an acquisition unit that acquires a Head Related Transfer Function (HRTF) unique to the user identified by the user identification unit from among one or more HRTFs; and

a sound image localization processing unit that performs sound image localization processing using the Head Related Transfer Function (HRTF) unique to the user acquired by the acquisition unit as a processing parameter.

2. The acoustic signal processing apparatus of claim 1, wherein

the user identification unit further performs user position identification processing, and

the sound image localization processing unit outputs a signal obtained by the sound image localization processing from a speaker near the user position identified by the user identifying unit.

3. The acoustic signal processing apparatus of claim 1, wherein

the user identification unit performs the user identification process for a plurality of users, and

the sound image localization processing unit performs the sound image localization process in parallel, using the Head Related Transfer Function (HRTF) of each of the plurality of users identified by the user identifying unit as a processing parameter.

4. The acoustic signal processing apparatus of claim 1, wherein

the user identification unit performs, based on an image captured by a camera, the user identification processing or both the user identification processing and user position identification processing.

5. The acoustic signal processing apparatus of claim 1, wherein

the user identification unit performs, based on sensor information, the user identification processing or both the user identification processing and user position identification processing.

6. The acoustic signal processing apparatus of claim 1, further comprising

A Head Related Transfer Function (HRTF) database storing the Head Related Transfer Functions (HRTFs) corresponding to users.

7. The acoustic signal processing apparatus of claim 1, wherein

the acquisition unit accepts user identification information from the user identification unit as an input, acquires the Head Related Transfer Function (HRTF) unique to the user from a database inside the acoustic signal processing apparatus based on the user identification information, and outputs the acquired HRTF to the sound image localization processing unit.

8. The acoustic signal processing apparatus of claim 1, wherein

the acquisition unit accepts user identification information from the user identification unit as an input, acquires the Head Related Transfer Function (HRTF) unique to the user from a database in an external server based on the user identification information, and outputs the acquired HRTF to the sound image localization processing unit.

9. The acoustic signal processing apparatus of claim 1, wherein

the sound image localization processing unit performs processing of stopping, or reducing the level of, the output signal for a position at which the user identification unit has determined that no user is present.

10. The acoustic signal processing apparatus of claim 1, wherein

the user identification unit refers to registration data in a boarding reservation system to perform the user identification process or both the user identification process and the user position identification process.

11. The acoustic signal processing apparatus of claim 1, wherein

the user identification unit performs the user identification processing, or both the user identification processing and user position identification processing, based on information from a sensor worn by the user or information received from a user terminal.

12. The acoustic signal processing apparatus of claim 1, wherein

the user identification unit performs the user identification process with reference to pre-registered user membership information.

13. An acoustic signal processing apparatus comprising:

a storage unit storing a Head Related Transfer Function (HRTF) unique to a user;

an acquisition unit that acquires the Head Related Transfer Function (HRTF) unique to the user from the storage unit; and

a sound image localization processing unit that performs sound image localization processing using the Head Related Transfer Function (HRTF) unique to the user acquired by the acquisition unit as a processing parameter.

14. The acoustic signal processing apparatus of claim 13, wherein

the sound image localization processing unit performs the sound image localization process on an audio signal acquired from an external server.

15. An acoustic signal processing system comprising a user terminal and a server, wherein

the server sends an audio signal to the user terminal, and

the user terminal includes:

a storage unit storing a Head Related Transfer Function (HRTF) unique to a user,

an acquisition unit that acquires the Head Related Transfer Function (HRTF) unique to the user from the storage unit, and

a sound image localization processing unit that performs sound image localization processing on the audio signal received from the server, using the Head Related Transfer Function (HRTF) unique to the user acquired by the acquisition unit as a processing parameter.

16. An acoustic signal processing method performed in an acoustic signal processing apparatus, the method comprising:

performing a user identification process by a user identification unit;

acquiring, by an acquiring unit, a Head Related Transfer Function (HRTF) unique to a user identified by the user identifying unit from one or more HRTFs; and

performing, by a sound image localization processing unit, sound image localization processing using the Head Related Transfer Function (HRTF) unique to the user acquired by the acquisition unit as a processing parameter.

17. An acoustic signal processing method performed in an acoustic signal processing apparatus,

the acoustic signal processing apparatus includes a storage unit storing a user-unique Head Related Transfer Function (HRTF), the method including:

acquiring, by an acquisition unit, the Head Related Transfer Function (HRTF) unique to the user from the storage unit; and

performing, by a sound image localization processing unit, sound image localization processing using the Head Related Transfer Function (HRTF) unique to the user acquired by the acquisition unit as a processing parameter.

18. A program for causing an acoustic signal processing apparatus to execute acoustic signal processing, the acoustic signal processing comprising:

causing a user identification unit to execute a user identification process;

causing an acquisition unit to acquire a Head Related Transfer Function (HRTF) unique to a user identified by the user identification unit from among one or more HRTFs; and

causing a sound image localization processing unit to perform sound image localization processing using the Head Related Transfer Function (HRTF) unique to the user acquired by the acquisition unit as a processing parameter.

Technical Field

The present disclosure relates to an acoustic signal processing apparatus, an acoustic signal processing system, an acoustic signal processing method, and a program. More particularly, the present disclosure relates to an acoustic signal processing apparatus, an acoustic signal processing system, an acoustic signal processing method, and a program that perform signal processing to set an optimum virtual sound source position for each user (listener).

Background

For example, there is a system in which speakers are embedded at left and right positions of a headrest portion of a seat on which a user such as a vehicle driver is seated, and sound is output from the speakers.

However, in the case where the speakers are provided in the headrest portion, a user (listener) such as the driver hears sound from behind the ears, which can feel unnatural and, in some cases, cause listening fatigue.

Sound localization processing is a technique to solve such problems. The sound localization process is an audio signal process that causes a user to perceive sound as if the sound comes from a virtual sound source position different from the actual speaker position, for example, setting the virtual sound source position to a position in front of the listener.

For example, if an audio signal subjected to sound localization processing is output from a speaker behind the ear of a user (listener), the user will perceive the sound as if the sound source is in front of the user.

One example of a related-art technique regarding sound localization processing is disclosed in patent document 1 (Japanese Patent Application Laid-Open No. 2003-111200).

Note that the above patent document discloses a configuration that generates a sound output from a speaker by performing signal processing in consideration of a Head Related Transfer Function (HRTF) from the speaker to an ear of a listener.

Performing signal processing based on a head-related transfer function (HRTF) enables control of an optimal virtual sound source position of a listener.

Reference list

Patent document

Patent document 1: Japanese Patent Application Laid-Open No. 2003-111200.

Disclosure of Invention

Problems to be solved by the disclosure

As described above, by outputting the processed signals based on the Head Related Transfer Functions (HRTFs) from the speakers, it is possible to perform sound image position control that sets the optimum virtual sound source position for the listener.

However, the Head Related Transfer Function (HRTF) is different for each individual. Therefore, in the case of outputting a processed signal to which a Head Related Transfer Function (HRTF) corresponding to a specific user is applied from a speaker, there is a problem that a virtual sound source position is an optimal position for the specific user, but not necessarily an optimal virtual sound source position for another user.

The present disclosure solves such problems and provides an acoustic signal processing apparatus, an acoustic signal processing system, an acoustic signal processing method, and a program capable of controlling the output, from a speaker, of a processed signal to which a Head Related Transfer Function (HRTF) specific to each user (listener) is applied, thereby setting an ideal virtual sound source position for each user (listener).

Solution to the problem

According to a first aspect of the present disclosure,

there is provided an acoustic signal processing apparatus comprising:

a user identification unit that performs a user identification process;

an acquisition unit that acquires a Head Related Transfer Function (HRTF) unique to the user identified by the user identification unit from among the one or more Head Related Transfer Functions (HRTFs); and

a sound image localization processing unit that performs sound localization processing using a Head Related Transfer Function (HRTF) unique to the user acquired by the acquisition unit as a processing parameter.

Further, according to the second aspect of the present disclosure,

there is provided an acoustic signal processing apparatus comprising:

a storage unit storing a Head Related Transfer Function (HRTF) unique to a user;

an acquisition unit that acquires a Head Related Transfer Function (HRTF) unique to a user from the storage unit; and

a sound image localization processing unit that performs sound localization processing using a Head Related Transfer Function (HRTF) unique to the user acquired by the acquisition unit as a processing parameter.

Further, according to a third aspect of the present disclosure,

there is provided an acoustic signal processing system comprising a user terminal and a server, wherein

the server transmits an audio signal to the user terminal, and

the user terminal includes:

a storage unit storing a Head Related Transfer Function (HRTF) unique to a user,

an acquisition unit that acquires the Head Related Transfer Function (HRTF) unique to the user from the storage unit, and

a sound image localization processing unit that performs sound localization processing on the audio signal received from the server, using the Head Related Transfer Function (HRTF) unique to the user acquired by the acquisition unit as a processing parameter.

Further, according to a fourth aspect of the present disclosure,

there is provided an acoustic signal processing method performed in an acoustic signal processing apparatus, the method including:

performing a user identification process by a user identification unit;

acquiring, by an acquisition unit, a Head Related Transfer Function (HRTF) unique to the user identified by the user identifying unit from among one or more Head Related Transfer Functions (HRTFs); and

sound localization processing is performed by the sound image localization processing unit using a Head Related Transfer Function (HRTF) unique to the user acquired by the acquisition unit as a processing parameter.

Further, according to a fifth aspect of the present disclosure,

there is provided an acoustic signal processing method performed in an acoustic signal processing apparatus,

the acoustic signal processing apparatus includes a storage unit storing a Head Related Transfer Function (HRTF) unique to a user, the method including:

acquiring, by an acquisition unit, a Head Related Transfer Function (HRTF) unique to a user from a storage unit; and

sound localization processing is performed by the sound image localization processing unit using a Head Related Transfer Function (HRTF) unique to the user acquired by the acquisition unit as a processing parameter.

Further, according to a sixth aspect of the present disclosure,

there is provided a program for causing an acoustic signal processing apparatus to execute acoustic signal processing, the program including:

causing a user identification unit to execute a user identification process;

causing the acquisition unit to acquire a Head Related Transfer Function (HRTF) unique to the user identified by the user identification unit from among the one or more Head Related Transfer Functions (HRTFs); and

the sound image localization processing unit is caused to perform sound localization processing using a Head Related Transfer Function (HRTF) unique to the user acquired by the acquisition unit as a processing parameter.

Note that the program according to the present disclosure may be provided by a storage medium or a communication medium for providing the program in a computer-readable format to an information processing apparatus or a computer system capable of executing various program codes, for example. Since such a program is provided in a computer-readable format, processing according to the program is executed on an information processing apparatus or a computer system.

Other objects, features and advantages of the present disclosure will become apparent from the detailed description based on the embodiments of the present disclosure and the drawings described later. Note that in this specification, a system refers to a logical set configuration including a plurality of devices, and the devices of the configuration are not necessarily included in the same housing.

Effects of the invention

According to the configuration of the exemplary aspect of the present disclosure, a configuration is realized in which sound localization processing is performed applying a Head Related Transfer Function (HRTF) corresponding to a user identified by a user identification unit, and output from an output unit is performed for each user position.

Specifically, for example, a user identification unit that performs user identification and user position identification processing, and a sound image localization processing unit that performs sound localization processing using a user-specific Head Related Transfer Function (HRTF) as a processing parameter are included. The sound image localization processing unit performs sound localization processing that takes the HRTF specific to the identified user as a processing parameter, and outputs the signal obtained by the sound localization processing to the output unit for the identified user position. In a case where the user identifying unit identifies a plurality of users, the sound image localization processing unit performs sound localization processing in parallel using the HRTF of each of the plurality of users, and outputs the processed signals to the output unit for each user position.

Note that the effects described in this specification are merely non-limiting examples, and there may be additional effects.

Drawings

Fig. 1 is a diagram for describing an overview of audio signal processing based on sound localization processing and Head Related Transfer Functions (HRTFs).

Fig. 2 is a diagram for describing an example of a process of measuring a Head Related Transfer Function (HRTF) which is regarded as a parameter applied to the sound localization process.

Fig. 3 is a diagram showing an exemplary configuration of a device that performs sound localization processing using a Head Related Transfer Function (HRTF).

Fig. 4 is a diagram for describing an example of performing signal processing based on a Head Related Transfer Function (HRTF) corresponding to each user.

Fig. 5 is a diagram for explaining the configuration and processing of embodiment 1 of the present disclosure.

Fig. 6 is a diagram for describing an exemplary configuration of an acoustic signal processing apparatus according to the present disclosure.

Fig. 7 is a diagram for describing an exemplary configuration in which an HRTF database is placed on an external server.

Fig. 8 is a diagram showing a flowchart for describing a sequence of processing performed by the acoustic signal processing apparatus according to the present disclosure.

Fig. 9 is a diagram for describing an embodiment of performing output control according to the presence or absence of a user.

Fig. 10 is a diagram showing a flowchart for describing a sequence of processing performed by the acoustic signal processing apparatus according to the present disclosure.

Fig. 11 is a diagram for describing an embodiment in which the acoustic signal processing apparatus according to the present disclosure is applied to a seat on an airplane.

Fig. 12 is a diagram for describing an embodiment of applying the acoustic signal processing apparatus according to the present disclosure to a seat on an airplane.

Fig. 13 is a diagram showing a flowchart for describing a sequence of processing performed by the acoustic signal processing apparatus according to the present disclosure.

Fig. 14 is a diagram for describing an embodiment of applying the acoustic signal processing apparatus according to the present disclosure to attractions of an amusement park.

Fig. 15 is a diagram showing a flowchart for describing a sequence of processing performed by the acoustic signal processing apparatus according to the present disclosure.

Fig. 16 is a diagram for describing an embodiment of applying the acoustic signal processing apparatus according to the present disclosure to an art museum.

Fig. 17 is a diagram showing a flowchart for describing a sequence of processing performed by the acoustic signal processing apparatus according to the present disclosure.

Fig. 18 is a diagram for describing an embodiment of storing a user-specific Head Related Transfer Function (HRTF) in a user terminal.

Fig. 19 is a diagram for describing an embodiment of storing a user-specific Head Related Transfer Function (HRTF) in a user terminal.

Fig. 20 is a diagram for describing an exemplary hardware configuration of an acoustic signal processing apparatus, a user terminal, a server, and the like.

Detailed Description

Hereinafter, an acoustic signal processing apparatus, an acoustic signal processing system, an acoustic signal processing method, and a program according to the present disclosure will be described in detail with reference to the accompanying drawings. Note that description will be made in the following sections.

1. Overview of Audio Signal processing based on Sound localization processing and Head Related Transfer Function (HRTF)

2. Configuration and processing of acoustic signal processing apparatus according to the present disclosure

3. Embodiment for executing output control according to presence or absence of user

4. Other embodiments

5. Embodiments for storing a user-specific Head Related Transfer Function (HRTF) in a user terminal

6. Exemplary hardware configuration of Acoustic Signal processing apparatus, user terminal, Server, and the like

7. Configuration summary according to the present disclosure

[1. overview of Audio Signal processing based on Sound localization processing and Head Related Transfer Function (HRTF) ]

First, with reference to fig. 1 and subsequent drawings, an overview of audio signal processing based on sound localization processing and Head Related Transfer Functions (HRTFs) will be described.

Fig. 1 shows a motor vehicle 1. A user (listener) 10 is seated in the driver seat. A left speaker 21 and a right speaker 22 are mounted in the headrest portion of the driver seat, and a stereo signal (LR signal) from a sound source (not shown) such as a CD is output from these two speakers.

In the case where the speakers are provided in the headrest portion and the stereo signal (LR signal) from the sound source is simply output in this way, the user (listener) 10 hears sound from behind the ears, which can feel unnatural and, in some cases, cause listening fatigue.

To solve such a problem, the acoustic signal processing apparatus inside the automobile 1 performs signal processing on the LR signal output from the sound source such as a CD, and outputs signals obtained by the signal processing from the left speaker 21 and the right speaker 22. The signal processing is sound localization processing.

As described above, the sound localization processing is signal processing for making a user (listener) perceive sound as if a sound source exists at a virtual sound source position different from an actual speaker position.

In the example shown in fig. 1, the user can be made to perceive sound as if the L signal of a sound source is being output from the virtual left speaker 31 and the R signal of the sound source is being output from the virtual right speaker 32 at a position in front of the user (listener) 10 (virtual sound source position).

An example of the process of measuring the Head Related Transfer Functions (HRTFs) used as parameters in the sound localization process will be described with reference to fig. 2. Note that fig. 2 is a diagram from patent document 1 (Japanese Patent Application Laid-Open No. 2003-111200), the related-art publication on sound localization processing mentioned above. The processing according to the present disclosure can be performed using existing sound localization processing described in patent document 1 and the like.

As shown in fig. 2, in a predetermined playback sound field of, for example, a studio, a real left speaker 41 and a real right speaker 42 are actually installed at left and right virtual speaker positions (positions where speakers are expected to exist) for the user 10.

Thereafter, the sounds emitted by the real left speaker 41 and the real right speaker 42 are picked up near each ear of the user 10, and Head Related Transfer Functions (HRTFs) indicating how the sounds emitted from the real left speaker 41 and the real right speaker 42 change by the time they reach each ear of the user 10 are measured.

In the example shown in fig. 2, M11 is the head related transfer function of sound from the real left speaker 41 to the left ear of the user 10, and M12 is the head related transfer function of sound from the real left speaker 41 to the right ear of the user 10. Similarly, M21 is the head related transfer function of sound from the real right speaker 42 to the left ear of the user 10, and M22 is the head related transfer function of sound from the real right speaker 42 to the right ear of the user 10.

These Head Related Transfer Functions (HRTFs) are parameters applied to signal processing performed on the LR signal output from a sound source (e.g., a CD). Signals obtained by signal processing using these parameters are output from the left speaker 21 and the right speaker 22 in the headrest portion of the driver seat shown in fig. 1. This arrangement allows the user to perceive the sound emitted from the speakers in the headrest portion as if it were being output from the virtual speaker positions.

In other words, the user 10 can be made to perceive sound as if the L signal of the sound source is being output from the virtual left speaker 31 and the R signal of the sound source is being output from the virtual right speaker 32 from a position (virtual sound source position) in front of the user (listener) 10 shown in fig. 1.
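Written out, the four measured functions form a 2 x 2 transfer matrix relating the speaker driving signals to the signals arriving at the ears. The inverse-filter step below is one common transaural formulation, added here only as an illustrative sketch; the document itself does not spell out how the localization filters are derived.

$$
\begin{pmatrix} E_L \\ E_R \end{pmatrix}
=
\begin{pmatrix} M_{11} & M_{21} \\ M_{12} & M_{22} \end{pmatrix}
\begin{pmatrix} S_L \\ S_R \end{pmatrix}
$$

Here $S_L, S_R$ are the speaker signals and $E_L, E_R$ the ear signals, all in the frequency domain. If $H_{\mathrm{real}}$ denotes the matrix measured for the actual headrest speakers and $H_{\mathrm{virtual}}$ the matrix measured for the virtual speaker positions, pre-filtering the source with $F = H_{\mathrm{real}}^{-1} H_{\mathrm{virtual}}$ makes the real speakers reproduce at the ears what the virtual speakers would have produced, assuming $H_{\mathrm{real}}$ is invertible at each frequency.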

Fig. 3 is a diagram showing an exemplary configuration of a device that performs sound localization processing using a Head Related Transfer Function (HRTF).

The L signal and the R signal are reproduced as a stereo signal from a sound source 50 such as a CD. The reproduced signals (Lin, Rin) are input to the HRTF-applied sound image localization processing unit 60.

The HRTF-applied sound image localization processing unit 60 acquires, from the HRTF storage unit 70, a head-related transfer function (HRTF) measured by the measurement processing described above with reference to fig. 2, applies the obtained data to perform signal processing, and generates output signals (Lout, Rout) to, for example, the left speaker 21 and the right speaker 22 of the headrest portion.

The left speaker 21 outputs an output signal (Lout) processed in the HRTF-applied sound image localization processing unit 60.

Further, the right speaker 22 outputs an output signal (Rout) processed in the HRTF-applied sound image localization processing unit 60.

In this way, when the signals subjected to the sound localization processing in the HRTF-applied sound image localization processing unit 60 are output to the left speaker 21 and the right speaker 22 in the headrest portion, the user 10 can perceive the sound as if the sound emitted from the speakers in the headrest portion is located at the virtual speaker position, or in other words, as if the L signal of the sound source is being output from the virtual left speaker 31 and the R signal of the sound source is being output from the virtual right speaker 32 at the position (virtual sound source position) in front of the user 10 shown in fig. 3.

Accordingly, signal processing is performed based on the Head Related Transfer Functions (HRTFs), so that the virtual sound source position optimal for the listener can be controlled.
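As a concrete illustration of the processing performed in the HRTF-applied sound image localization processing unit 60, the following Python sketch applies a set of four localization filters to a stereo input by convolution. The function and filter names (localize, f_ll, and so on) are assumptions made for illustration; the document does not specify how the filters are computed from the measured HRTFs.

    import numpy as np
    from scipy.signal import fftconvolve

    def localize(l_in, r_in, f_ll, f_lr, f_rl, f_rr):
        # f_xy: impulse response routing input channel x to output speaker y.
        # The two inputs are assumed to share one length, as are the four filters.
        l_out = fftconvolve(l_in, f_ll) + fftconvolve(r_in, f_rl)
        r_out = fftconvolve(l_in, f_lr) + fftconvolve(r_in, f_rr)
        return l_out, r_out

    # Usage: one second of white noise through dummy 128-tap filters.
    rng = np.random.default_rng(0)
    l_in, r_in = rng.standard_normal((2, 48000))
    filters = [rng.standard_normal(128) * 0.01 for _ in range(4)]
    l_out, r_out = localize(l_in, r_in, *filters)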

However, as described above, the Head Related Transfer Function (HRTF) is different for each individual. Therefore, in the case of outputting a processed signal to which a Head Related Transfer Function (HRTF) corresponding to a specific user is applied from a speaker, there is a problem that a virtual sound source position may be an optimal position of the specific user, but not necessarily an optimal virtual sound source position of another user.

For example, as shown in fig. 1, it is expected that a plurality of different users will be seated in the driver's seat of the automobile 1.

In this case, the HRTF-applied sound image localization processing unit 60 shown in fig. 3 needs to perform signal processing based on a head-related transfer function (HRTF) corresponding to each user.

As shown in fig. 4, in the case where three users A to C take turns in the seat, it is necessary to perform signal processing applying the Head Related Transfer Function (HRTF) corresponding to each user.

In the example of fig. 4, from time t1, the user A 11 is seated in the driver seat, and in this case, it is necessary to perform signal processing applying the Head Related Transfer Function (HRTF) of the user A 11 to the output from the speakers.

From time t2, the user B 12 is seated in the driver seat, and in this case, it is necessary to perform signal processing applying the Head Related Transfer Function (HRTF) of the user B 12 to the output from the speakers.

Further, from time t3, the user C 13 is seated in the driver seat, and in this case, it is necessary to perform signal processing applying the Head Related Transfer Function (HRTF) of the user C 13 to the output from the speakers.

[2. configuration and processing of an acoustic signal processing apparatus according to the present disclosure ]

Next, the configuration and processing of the acoustic signal processing apparatus according to the present disclosure will be described.

As described above, the Head Related Transfer Function (HRTF) is different for each user, and an optimum virtual sound source position cannot be set unless sound localization processing that applies a Head Related Transfer Function (HRTF) unique to the user as a listener is performed.

An acoustic signal processing apparatus according to the present disclosure, described below, performs user identification processing and user position identification processing, determines the Head Related Transfer Function (HRTF) to be applied to the sound localization processing based on the identification information, and performs signal processing applying the Head Related Transfer Function (HRTF) corresponding to each user. Further, each resulting processed signal is output from the speaker provided at the position of the user whose Head Related Transfer Function (HRTF) was applied in the signal processing.

First, the configuration and processing according to embodiment 1 of the present disclosure will be described with reference to fig. 5 and subsequent drawings.

Fig. 5 shows a car 80. The acoustic signal processing apparatus 100 according to the present disclosure is mounted on the automobile 80. Note that a specific example of the configuration of the acoustic signal processing apparatus 100 according to the present disclosure will be described later.

Four users, user A 110a, user B 110b, user C 110c, and user D 110d, are in the car 80.

An LR speaker pair corresponding to each user is installed in the headrest portion of each user's seat.

In the headrest portion for the user A 110a, a user A left speaker 122aL and a user A right speaker 122aR are installed.

In the headrest portion for the user B 110b, a user B left speaker 122bL and a user B right speaker 122bR are installed.

In the headrest portion for the user C 110c, a user C left speaker 122cL and a user C right speaker 122cR are installed.

In the headrest portion for the user D 110d, a user D left speaker 122dL and a user D right speaker 122dR are installed.

Further, a sensor (camera) 101 that captures an image of the face of each of the users A to D is mounted in the car 80.

The captured face images of the users A to D acquired by the sensor (camera) 101 are input into the acoustic signal processing apparatus 100 according to the present disclosure.

The acoustic signal processing apparatus 100 according to the present disclosure performs user identification and user position identification based on the captured face image of each of the users A to D acquired by the sensor (camera) 101.

The acoustic signal processing apparatus 100 according to the present disclosure acquires the Head Related Transfer Function (HRTF) of each of the users A to D from a database based on the user identification information, and applies the acquired Head Related Transfer Functions (HRTFs) of the users A to D to perform signal processing (sound localization processing) in parallel.

Further, the four pairs of output LR signals obtained by signal processing (sound localization processing) applying the Head Related Transfer Function (HRTF) of each of the users A to D are output from the LR speakers at each user's position, specified based on the user position identification information.

Through these processes, each of the users A to D can separately listen, from the speakers in the headrest portion, to an output signal obtained by signal processing (sound localization processing) applying that user's own Head Related Transfer Function (HRTF), and each user can hear sound from an ideal virtual sound source position.

Fig. 6 is a diagram showing an exemplary configuration of the acoustic signal processing apparatus 100 according to the present disclosure.

As shown in fig. 6, an acoustic signal processing apparatus 100 according to the present disclosure includes a sensor (e.g., a camera) 101, a user and user position recognition unit 102, a user-corresponding HRTF acquisition unit 103, an HRTF database 104, and a sound image localization processing unit 105 to which HRTFs are applied.

The HRTF-applied sound image localization processing unit 105 includes HRTF-applied sound image localization processing units 105-1 to 105-n, which correspond to a plurality of users and can perform processing in parallel.

The sensor (e.g., camera) 101 is a sensor that acquires information usable to identify the user and the user's position, and includes, for example, a camera.

Sensor detection information acquired by a sensor (e.g., a camera) 101, such as an image captured by the camera, is input to a user and a user position recognition unit 102.

The user and user position identification unit 102 identifies the user and the user position based on sensor detection information (e.g., an image captured by a camera) acquired by the sensor (e.g., camera) 101.

For example, the user and user position identification unit 102 identifies the user by comparing a face image contained in an image captured by a camera with user face image information stored in a user database, not shown.

Further, the user and user position identification unit 102 also identifies the position of each identified user. The identification of the user position is performed as a process of determining which speaker's output each user is positioned to hear.
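A minimal sketch of such camera-based identification is given below. The face detector, embedding model, and 0.6 similarity threshold are all assumptions for illustration; the document does not restrict the recognition method.

    import numpy as np

    def identify_users(frame, registered, seat_regions, detect_face, embed, threshold=0.6):
        # registered: user_id -> reference embedding (the user database of the text).
        # seat_regions: seat_id -> image region covering that seat.
        result = {}
        for seat_id, region in seat_regions.items():
            face = detect_face(frame, region)      # None if no face is visible here
            if face is None:
                continue
            v = embed(face)                        # feature vector for this face
            best_id, best_sim = None, threshold    # accept matches above threshold only
            for user_id, ref in registered.items():
                sim = float(np.dot(v, ref) / (np.linalg.norm(v) * np.linalg.norm(ref)))
                if sim > best_sim:
                    best_id, best_sim = user_id, sim
            if best_id is not None:
                result[seat_id] = best_id          # user identified at this position
        return result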

The user identification information and the user position identification information generated by the user and user position identification unit 102 are input to the HRTF obtaining unit 103 corresponding to the user.

The user-corresponding HRTF obtaining unit 103 obtains a head-related transfer function (HRTF) corresponding to each identified user from the HRTF database 104 based on the user identification information input from the user and user position identifying unit 102.

Head Related Transfer Functions (HRTFs) corresponding to each user measured in advance are stored in the HRTF database 104.

In the HRTF database 104, a Head Related Transfer Function (HRTF) corresponding to each user is stored in association with the user identifier. Note that the Head Related Transfer Function (HRTF) corresponding to each user can be measured by the process described with reference to fig. 2 described above.

The user-corresponding HRTF obtaining unit 103 outputs a head-related transfer function (HRTF) corresponding to each identified user, obtained from the HRTF database 104, in association with the user identification information and the user position identification information input from the user and user position identification unit 102.

As described above, the HRTF applied sound image localization processing unit 105 includes the HRTF applied sound image localization processing units 105-1 to 105-n corresponding to a plurality of users.

Each of the user-corresponding HRTF-applied sound image localization processing units 105-1 to 105-n is pre-associated with the LR speakers that output its processed signals (Lout, Rout).

For example, the user-corresponding HRTF-applied sound image localization processing unit 105-1 is connected to the LR speakers of the driver seat, i.e., the user A left speaker 122aL and the user A right speaker 122aR of the driver seat where the user A 110a is seated as shown in fig. 5.

In this way, each of the user-corresponding HRTF-applied sound image localization processing units 105-1 to 105-n is pre-associated with the LR speakers that output its processed signals (Lout, Rout).

Based on the data input from the user-corresponding HRTF acquisition unit 103, in which the Head Related Transfer Function (HRTF) of each identified user is associated with the user identification information and the user position identification information, the HRTF-applied sound image localization processing unit 105 performs signal processing in the user-corresponding HRTF-applied sound image localization processing units 105-1 to 105-n, applying the HRTF corresponding to each user.

Specifically, for example, the user-corresponding HRTF-applied sound image localization processing unit 105-1, which is connected to the user A left speaker 122aL and the user A right speaker 122aR of the driver seat where the user A 110a is seated, performs signal processing (sound localization processing) receiving the Head Related Transfer Function (HRTF) corresponding to the user A as input.

Output signals (Lout-a, Rout-a) are generated by this signal processing. The generated output signals (Lout-a, Rout-a) are output from the user A left speaker 122aL and the user A right speaker 122aR of the driver seat in which the user A 110a is seated.

Similarly, the user-corresponding HRTF-applied sound image localization processing unit 105-n, which is connected to the user N left speaker 122nL and the user N right speaker 122nR of the seat in which the user N 110n shown in fig. 6 is seated, performs signal processing (sound localization processing) receiving the Head Related Transfer Function (HRTF) corresponding to the user N as input.

Output signals (Lout-n, Rout-n) are generated by this signal processing. The generated output signals (Lout-n, Rout-n) are output from the user N left speaker 122nL and the user N right speaker 122nR of the seat in which the user N 110n is seated.

The same applies to the other users: output signals (Lout-x, Rout-x), generated by signal processing (sound localization processing) applying the Head Related Transfer Function (HRTF) corresponding to each user, are output from the speakers at each user's position.

Through these processes, each user can listen, from the speakers in the headrest portion of the seat where that user sits, to signals (Lout-x, Rout-x) obtained by sound localization processing applying that user's own Head Related Transfer Function (HRTF), and thus hears sound from the virtual sound source position optimal for that user.
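The per-user parallel processing described above might be organized as in the following sketch, which runs one localization job per occupied seat and reuses localize() from the earlier sketch. The lookup table and output callback are illustrative assumptions, not part of this document.

    from concurrent.futures import ThreadPoolExecutor

    def render_for_all_users(seat_to_user, hrtf_filters, l_in, r_in, output):
        # seat_to_user: seat_id -> user_id, from the identification step.
        # hrtf_filters: user_id -> (f_ll, f_lr, f_rl, f_rr), per-user HRTF filters.
        # output(seat_id, l, r): sends a stereo signal to that seat's LR speakers.
        def job(seat_id, user_id):
            l_out, r_out = localize(l_in, r_in, *hrtf_filters[user_id])
            output(seat_id, l_out, r_out)
        with ThreadPoolExecutor() as pool:
            jobs = [pool.submit(job, s, u) for s, u in seat_to_user.items()]
            for j in jobs:
                j.result()  # re-raise any per-seat processing error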

Note that the exemplary configuration of the acoustic signal processing apparatus 100 shown in fig. 6 is an example, and other configurations are also possible.

For example, HRTF database 104 of acoustic signal processing apparatus 100 shown in fig. 6 may also be placed on an external server.

This exemplary configuration is shown in fig. 7.

As shown in fig. 7, the acoustic signal processing apparatus 100 built into the automobile is connected through a network 130 and capable of communicating with the management server 120.

The acoustic signal processing apparatus 100 built into the automobile does not include the HRTF database 104 described with reference to fig. 6.

HRTF database 104 is stored in management server 120.

The management server 120 includes an HRTF database 104 that stores Head Related Transfer Functions (HRTFs) measured in advance corresponding to each user. In the HRTF database 104, a Head Related Transfer Function (HRTF) corresponding to each user is stored in association with a user identifier.

The acoustic signal processing apparatus 100 performs a process of searching the HRTF database 104 in the management server 120, based on the user identification information generated by the user and user position identification unit 102, to acquire the Head Related Transfer Function (HRTF) corresponding to each user.

The processing thereafter is similar to that described with reference to fig. 6.

In this way, by placing the HRTF database 104 in the management server 120, signal processing can be performed applying Head Related Transfer Functions (HRTFs) of more users.
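Retrieving a user's HRTF from the management server might look like the sketch below; the endpoint path and JSON layout are assumptions, since the document does not define the server interface. Caching the fetched filters locally would avoid repeating the lookup each time the same user is identified.

    import json
    import urllib.request

    def fetch_hrtf(server_url, user_id):
        # Hypothetical endpoint; the real server API is not specified.
        url = f"{server_url}/hrtf?user_id={user_id}"
        with urllib.request.urlopen(url, timeout=5) as resp:
            data = json.load(resp)
        # Assumed response layout: {"f_ll": [...], "f_lr": [...], "f_rl": [...], "f_rr": [...]}
        return tuple(data[k] for k in ("f_ll", "f_lr", "f_rl", "f_rr"))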

Next, a processing sequence performed by the acoustic signal processing apparatus according to the present disclosure will be described with reference to a flowchart shown in fig. 8.

Note that the processes following the flows in fig. 8 and the subsequent drawings described below may be executed according to, for example, a program stored in a storage unit of the acoustic signal processing apparatus, under the control of a control unit having a program execution function (e.g., a CPU). Hereinafter, the processing in each step of the flow shown in fig. 8 will be described in sequence.

(step S101)

First, in step S101, the acoustic signal processing apparatus performs user identification and user position identification.

This process is performed by the user and user position identification unit 102 shown in fig. 6.

The user and user position identification unit 102 identifies the user and the user position based on sensor detection information (e.g., an image captured by a camera) acquired by the sensor (e.g., camera) 101.

For example, the user and user position identification unit 102 identifies the user by comparing a face image contained in an image captured by a camera with user face image information stored in a user database, not shown.

Further, the user and user position identification unit 102 also identifies the position of each identified user. The identification of the user position is performed as a process of determining which speaker's output each user is positioned to hear.

(step S102)

Next, in step S102, the acoustic signal processing apparatus acquires a Head Related Transfer Function (HRTF) of each identified user from the database.

This process is performed by the HRTF obtaining unit 103 corresponding to the user shown in fig. 6.

The user-corresponding HRTF obtaining unit 103 obtains a head-related transfer function (HRTF) corresponding to each identified user from the HRTF database 104 based on the user identification information input from the user and user position identifying unit 102.

In the HRTF database 104, a Head Related Transfer Function (HRTF) corresponding to each user is stored in association with a user identifier.

The HRTF obtaining unit 103 corresponding to the user performs a database search process based on the user identification information input from the user and the user position identifying unit 102, and obtains a Head Related Transfer Function (HRTF) corresponding to each identified user.

(step S103)

Next, in step S103, the acoustic signal processing apparatus inputs a Head Related Transfer Function (HRTF) of each user to the HRTF applied sound image localization processing units corresponding to the respective users, and generates an output signal corresponding to each user.

This process is performed by the HRTF-applied sound image localization processing unit 105 shown in fig. 6.

As described with reference to fig. 6, the HRTF applied sound image localization processing unit 105 includes a plurality of HRTF applied sound image localization processing units 105-1 to 105-n corresponding to users.

Each of the user-corresponding HRTF-applied sound image localization processing units 105-1 to 105-n is pre-assigned the LR speakers that output its processed signals (Lout, Rout).

Based on the data input from the user-corresponding HRTF acquisition unit 103, in which the Head Related Transfer Function (HRTF) of each identified user is associated with the user identification information and the user position identification information, the HRTF-applied sound image localization processing unit 105 performs signal processing in the user-corresponding HRTF-applied sound image localization processing units 105-1 to 105-n, applying the HRTF corresponding to each user.

(step S104)

Finally, in step S104, the acoustic signal processing apparatus outputs the generated output signal corresponding to each user to a speaker installed at the user position corresponding to each generated signal.

This process is also performed by the HRTF-applied sound image localization processing unit 105 shown in fig. 6.

Output signals (Lout-x, Rout-x) generated by applying signal processing (sound localization processing) corresponding to a Head Related Transfer Function (HRTF) of each user are output from a speaker at each user position.

Through these processes, each user can listen, from the speakers in the headrest portion of the seat where that user sits, to signals (Lout-x, Rout-x) obtained by sound localization processing applying that user's own Head Related Transfer Function (HRTF), and thus hears sound from the virtual sound source position optimal for that user.

[3. embodiment for performing output control according to presence or absence of user ]

Next, as embodiment 2, an embodiment that performs output control according to the presence or absence of a user will be described.

In the example described above with reference to fig. 5, all users (listeners) are seated on the seat of the automobile 80 in which the speakers are installed. In practice, however, in many cases, for example as shown in fig. 9, some seats may be empty.

In this case, outputting sound from the speakers at the empty seats leads to an increase in power consumption. Further, if the output sound from these speakers reaches the ears of a user sitting in another seat, the user may perceive it as unwanted noise.

The embodiment described below solves this problem by stopping or muting the output of the speakers at positions where no user is present.

A processing sequence according to embodiment 2 will be described with reference to a flowchart shown in fig. 10.

Hereinafter, the processing in each step of the flow shown in fig. 10 will be described in sequence.

The flow shown in fig. 10 is obtained by adding steps S101a and S101b between step S101 and step S102 of the flow shown in fig. 8 described above.

The processing in the other steps (step S101 and steps S102 to S104) is similar to the processing described with reference to fig. 8, and thus description is omitted.

Hereinafter, the processing in step S101a and the processing in step S101b will be described.

(step S101a)

In step S101, the acoustic signal processing apparatus performs user identification and user position identification, and then performs the process of step S101a.

In step S101a, the acoustic signal processing apparatus determines whether there is a speaker-mounted seat without a user.

This process is performed by the user and user position identification unit 102 shown in fig. 6.

The user and user position identification unit 102 identifies the user and the user position based on sensor detection information (e.g., an image captured by a camera) acquired by the sensor (e.g., camera) 101. At this time, it is determined whether any speaker-equipped seat is unoccupied.

If every speaker-equipped seat is occupied, the flow proceeds to step S102, and the processing in steps S102 to S104 is performed.

These processes are similar to the processes described above with reference to fig. 8, and signals obtained by performing signal processing (sound localization processing) corresponding to each user are output from speakers in all seats.

On the other hand, if it is determined in the determination process of step S101a that there is a speaker-equipped seat without a user, the flow advances to step S101b.

(step S101b)

If it is determined in the determination process of step S101a that there is a speaker-equipped seat without a user, the flow advances to step S101b.

In step S101b, the acoustic signal processing apparatus stops, or performs mute control on, the output of the speakers at each speaker-equipped seat without a user.

This process is performed by the HRTF-applied sound image localization processing unit 105 shown in fig. 6.

In the case where the output is stopped, output sound is not generated for the speakers in these seats. Among the user-corresponding HRTF-applied sound image localization processing units 105-1 to 105-n shown in fig. 6, the processing units that would generate output sound for the speakers without a user perform no processing.

In addition, in the case where the mute control is performed, the output sound is generated as playback sound limited to a level inaudible to users in nearby seats. Note that the HRTF applied to the signal processing (sound localization processing) in this case is a standard HRTF stored in the HRTF database 104. Alternatively, the playback sound may be output directly from the sound source without performing signal processing (sound localization processing).

Thereafter, in steps S102 to S104, signal processing and output of playback sound are performed only for the speakers at seats where a user is present.

By performing these processes, the output of the speakers at positions where no user is present is stopped or muted, reducing power consumption. In addition, noise entering the ears of users in other seats can be reduced.
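In code, the presence check of steps S101a and S101b reduces to a per-seat branch, as in this sketch; the start and stop_or_mute callbacks stand in for the device's per-seat controls and are assumptions for illustration.

    def update_outputs(seat_to_user, all_seats, start, stop_or_mute):
        # seat_to_user: seats where a user was identified in step S101.
        for seat_id in all_seats:
            if seat_id in seat_to_user:
                start(seat_id, seat_to_user[seat_id])  # steps S102 to S104 for this seat
            else:
                stop_or_mute(seat_id)                  # step S101b: no user at this seat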

[4. other examples ]

The above-described embodiments describe a sound output control configuration inside an automobile, but the processing according to the present disclosure is also applicable to various other settings.

Hereinafter, an embodiment in which the present disclosure is applied to a seat on an airplane, an embodiment in which the present disclosure is applied to an attraction of an amusement park, and an embodiment in which the present disclosure is applied to an art museum will be described.

(a) Embodiments of applying the present disclosure to seats on an aircraft

First, with reference to fig. 11 and subsequent drawings, an embodiment in which the acoustic signal processing apparatus according to the present disclosure is applied to a seat on an airplane will be described.

Seats on airplanes are equipped with a socket (headphone jack) for inserting headphones, and a user (passenger) sitting on the seat can listen to music or the like by inserting headphones.

As shown in fig. 11, some seats are occupied by users (passengers) while other seats are empty. Furthermore, some users are using headphones, while others are not.

The seats are assigned and the seat in which each user sits is predetermined.

A record of which user sits in which seat is recorded in a database of the boarding reservation system.

With this arrangement, the acoustic signal processing apparatus on the airplane can check the seat position of each user (passenger) based on the recorded data in the boarding reservation system.

Fig. 12 shows an exemplary system configuration according to the present embodiment.

The acoustic signal processing apparatus 200 on the airplane is connected to a boarding reservation system 201 and a management server 202 through a network.

Note that the management server 202 includes an HRTF database 210 in which a head-related transfer function (HRTF) of each user (passenger) is recorded. The acoustic signal processing apparatus 200 on the aircraft has a configuration substantially similar to that described above with reference to fig. 6.

However, this configuration omits the HRTF database 104 and also does not include the sensor 101. The user identification and the user position identification are performed using the recorded data in the boarding reservation system 201 connected through the network.

The user and user position identification unit (i.e., the user and user position identification unit 102 shown in fig. 6) of the acoustic signal processing apparatus 200 on the airplane identifies the user at each seat position based on the boarding reservation system 201 connected through the network. Specifically, the user identifier of the user who reserved each seat position is acquired.

Further, the user-corresponding HRTF obtaining unit of the acoustic signal processing apparatus 200 (i.e., the user-corresponding HRTF obtaining unit 103 shown in fig. 6) obtains a Head Related Transfer Function (HRTF) corresponding to each user from the HRTF database 210 of the management server 202, based on the user identifier at each seat position.

Next, the acoustic signal processing apparatus 200 generates output sound for the headphone jack at each seat. The generation of the output sound is a process performed by the HRTF-applied sound image localization processing unit of the acoustic signal processing apparatus 200 (i.e., the HRTF-applied sound image localization processing unit 105 shown in fig. 6).

As described with reference to fig. 6, the HRTF applied sound image localization processing unit 105 includes a plurality of HRTF applied sound image localization processing units 105-1 to 105-n corresponding to users.

Each of the user-corresponding HRTF-applied sound image localization processing units 105-1 to 105-n is pre-assigned the headphone jacks that output its processed signals (Lout, Rout).

Based on the data input from the user-corresponding HRTF acquisition unit 103, in which the Head Related Transfer Function (HRTF) of each identified user is associated with the user (seat) identification information and the user position identification information, the HRTF-applied sound image localization processing unit 105 performs signal processing in the user-corresponding HRTF-applied sound image localization processing units 105-1 to 105-n.

In each of the user-corresponding HRTF-applied sound image localization processing units 105-1 to 105-n, signal processing applying the HRTF corresponding to each identified user is performed to generate a sound localization processing signal corresponding to each user. The signal corresponding to each user is output as output sound from the headphone jack at that user's seat position.

By this processing, each user (passenger) on the airplane can listen to a signal subjected to Head Related Transfer Function (HRTF) processing (sound localization processing) based on the user's own head, and can hear sound from the ideal virtual sound source position.

Next, a processing sequence performed by the acoustic signal processing apparatus according to the present disclosure will be described with reference to a flowchart shown in fig. 13.

Hereinafter, the processing in each step of the flow shown in fig. 13 will be described in sequence.

(step S201)

First, in step S201, the acoustic signal processing apparatus performs user identification and user position identification based on the check-in information.

This process is performed by the user and user position identification unit of the acoustic signal processing apparatus 200 on the airplane shown in fig. 12 (i.e., the user and user position identification unit 102 shown in fig. 6).

The user and user position identification unit of the acoustic signal processing apparatus 200 identifies the user at each seat position based on the boarding reservation system 201 connected through the network. Specifically, the user identifier of the user who reserved each seat is acquired.

(step S202)

Next, in step S202, the acoustic signal processing apparatus acquires a Head Related Transfer Function (HRTF) of each identified user from the database.

This process is performed by the user-corresponding HRTF acquisition unit (i.e., the user-corresponding HRTF acquisition unit 103 shown in fig. 6) of the acoustic signal processing apparatus 200.

The user-corresponding HRTF acquisition unit acquires the Head Related Transfer Function (HRTF) corresponding to each user from the HRTF database 210 of the management server 202, based on the user identifier of the user who reserved each seat.

(step S203)

Next, in step S203, the acoustic signal processing apparatus inputs a Head Related Transfer Function (HRTF) of each user to HRTF-applied sound image localization processing units corresponding to the respective users, and generates an output signal corresponding to each user.

This process is performed by the HRTF-applied sound image localization processing unit 105 shown in fig. 6.

Each of the HRTF-applied sound image localization processing units 105-1 to 105-n corresponding to a plurality of users generates an output signal for its user by performing signal processing (sound localization processing) that uses the Head Related Transfer Function (HRTF) corresponding to the user at each seat position as a processing parameter.

(step S204)

Finally, in step S204, the acoustic signal processing apparatus outputs the generated signal for each user from the headphone jack at that user's seat position.

The output from the headphone jack at each user's seat position is an output signal (Lout-x, Rout-x) generated by signal processing (sound localization processing) that applies the Head Related Transfer Function (HRTF) of that user.

Through these processes, each user (passenger) can listen, at the seat where the user is seated, to signals (Lout-x, Rout-x) obtained by sound localization processing applying the user's own Head Related Transfer Function (HRTF), and can listen to sound from the optimum virtual sound source position for each user.

(b) Embodiment applying the disclosure to an attraction of an amusement park

Next, with reference to fig. 14, an embodiment in which the acoustic signal processing apparatus according to the present disclosure is applied to an attraction of an amusement park will be described.

FIG. 14 shows a user 251 playing at an amusement park attraction.

When the user 251 purchases a ticket at the entrance of the amusement park, the user's information is registered, and during registration the user receives a sensor 252 storing the user identifier, which is worn on the user's arm.

The sensor 252 communicates with the communication devices 263 installed at various locations inside the amusement park, and transmits the user identifier to the acoustic signal processing apparatus disposed in the management center of the amusement park. The acoustic signal processing apparatus arranged in the management center of the amusement park has a configuration substantially similar to the configuration described above with reference to fig. 6.

However, the user and user position identification unit 102 receives user identification information and user position information from the sensor 252 worn by the user 251 shown in fig. 14 through the communication device 263, and identifies each user and the position of each user.

As shown in fig. 14, a plurality of speakers, such as the speaker L 261 and the speaker R 262, are installed at each attraction.

An acoustic signal processing apparatus arranged in the management center of the amusement park outputs from these speakers processed signals (sound localization processing signals) generated by applying the Head Related Transfer Function (HRTF) of the user 251 in front of the speakers as a processing parameter.

A processing sequence performed by the acoustic signal processing apparatus according to the present disclosure will be described with reference to a flowchart shown in fig. 15.

Hereinafter, the processing in each step of the flow shown in fig. 15 will be described in sequence.

(step S301)

First, in step S301, the acoustic signal processing apparatus performs user identification and user position identification based on a signal received from the sensor 252 worn by the user.

This processing is performed by a user and user position identification unit (i.e., the user and user position identification unit 102 shown in fig. 6) of the acoustic signal processing apparatus in the management center of the amusement park.

The user and user position identification unit of the acoustic signal processing apparatus receives the output of the sensor 252 worn by the user shown in fig. 14 through the communication device 263, and thereby performs user identification and user position identification.

(step S302)

Next, in step S302, the acoustic signal processing apparatus acquires a Head Related Transfer Function (HRTF) of each identified user from the database.

This process is performed by a user-corresponding HRTF acquisition unit (i.e., the user-corresponding HRTF acquisition unit 103 shown in fig. 6) of an acoustic signal processing apparatus in the management center of an amusement park.

The user-corresponding HRTF acquisition unit acquires the Head Related Transfer Function (HRTF) corresponding to each user from the HRTF database, based on the user identifier of the user at each attraction.

Note that the HRTF database may be stored in an acoustic signal processing apparatus in a management center of an amusement park in some cases, or may be stored in a management server connected through a network in other cases.

(step S303)

Next, in step S303, the acoustic signal processing apparatus inputs a Head Related Transfer Function (HRTF) of each user to HRTF-applied sound image localization processing units corresponding to the respective users, and generates an output signal corresponding to each user.

This process is performed by the HRTF-applied sound image localization processing unit 105 shown in fig. 6.

Each of the HRTF-applied sound image localization processing units 105-1 to 105-n corresponding to a plurality of users generates an output signal for its user by performing signal processing (sound localization processing) that uses the Head Related Transfer Function (HRTF) corresponding to the user at each attraction as a processing parameter.

(step S304)

Finally, in step S304, the acoustic signal processing apparatus outputs the generated signal for each user from the speakers at the attraction where that user is located.

The output from the speakers at each attraction is an output signal (Lout-x, Rout-x) generated by signal processing (sound localization processing) applying the Head Related Transfer Function (HRTF) of the user playing at that attraction.

Through these processes, the users playing at the attractions can listen to signals (Lout-x, Rout-x) obtained by sound localization processing applying each user's own Head Related Transfer Function (HRTF), and can listen to sound from the optimum virtual sound source position for each user.

(c) Embodiment applying the disclosure to an art museum

Next, with reference to fig. 16 and subsequent drawings, an embodiment in which the acoustic signal processing apparatus according to the present disclosure is applied to an art museum will be described.

FIG. 16 shows a user 271 visiting an art museum.

When the user 271 purchases a ticket at the entrance of the art museum, the user's information is registered, and during registration the user receives a user terminal 272 storing the user identifier.

The user terminal 272 is provided with a headphone jack, and by plugging the headphones 273 into it, the user 271 can listen to various audio commentary.

The user terminal 272 is capable of communicating with an acoustic signal processing device disposed in the management center of an art museum.

The acoustic signal processing apparatus disposed in the management center of the art museum has a configuration substantially similar to that described above with reference to fig. 6.

However, the user and user position identification unit 102 receives user identification information and user position information from the user terminal 272 carried by the user 271 shown in fig. 16, and identifies each user and the position of each user.

Note that, for example, in the case where there is a member database storing registered member information, the database registration information may also be used for user identification.

Further, the acoustic signal processing apparatus disposed in the management center of the art museum outputs, from the headphones 273 used by the user 271, processed signals (sound localization processing signals) generated by applying the Head Related Transfer Function (HRTF) of the user 271 as a processing parameter.

A processing sequence performed by the acoustic signal processing apparatus according to the present disclosure will be described with reference to a flowchart shown in fig. 17.

Hereinafter, the processing in each step of the flow shown in fig. 17 will be described in sequence.

(step S401)

First, in step S401, the acoustic signal processing apparatus performs user identification and user position identification based on a signal received from the user terminal 272 carried by the user, or based on registered member information.

This processing is performed by the user and user position identification unit of the acoustic signal processing apparatus in the management center of the art museum (i.e., the user and user position identification unit 102 shown in fig. 6).

The user and user position identification unit of the acoustic signal processing apparatus performs user identification and user position identification by receiving the output of the user terminal 272 carried by the user shown in fig. 16. Note that user identification may also be performed using registered member information (e.g., a member database referred to during the check on entering the art museum).

(step S402)

Next, in step S402, the acoustic signal processing apparatus acquires a Head Related Transfer Function (HRTF) of each identified user from the database.

This process is performed by the user-corresponding HRTF acquisition unit of the acoustic signal processing apparatus in the management center of the art museum (i.e., the user-corresponding HRTF acquisition unit 103 shown in fig. 6).

The user-corresponding HRTF acquisition unit acquires the Head Related Transfer Function (HRTF) corresponding to each user from the HRTF database based on the user identifier.

Note that the HRTF database may be stored in an acoustic signal processing apparatus in a management center of an art museum in some cases, or may be stored in a management server connected through a network in other cases.

(step S403)

Next, in step S403, the acoustic signal processing apparatus inputs a Head Related Transfer Function (HRTF) of each user to HRTF-applied sound image localization processing units corresponding to the respective users, and generates an output signal corresponding to each user.

This process is performed by the HRTF-applied sound image localization processing unit 105 shown in fig. 6.

Each of the HRTF-applied sound image localization processing units 105-1 to 105-n corresponding to a plurality of users generates an output signal for its user by performing signal processing (sound localization processing) that uses the Head Related Transfer Function (HRTF) corresponding to the user at each position in the art museum as a processing parameter.

(step S404)

Finally, in step S404, the acoustic signal processing apparatus transmits the generated signal for each user to that user's terminal 272, where it is output from the headphones 273 plugged into the user terminal 272.

The output from the headphones 273 plugged into the user terminal 272 carried by a user at any position within the art museum is an output signal (Lout-x, Rout-x) generated by signal processing (sound localization processing) applying the Head Related Transfer Function (HRTF) of that user.

Through these processes, users at different positions within the art museum can listen to signals (Lout-x, Rout-x) obtained by sound localization processing applying each user's own Head Related Transfer Function (HRTF), and can listen to sound from the optimum virtual sound source position for each user.

Note that although the above-described embodiment shows an example in which the acoustic signal processing apparatus disposed in the management center of the art museum generates the sound localization processing signal, the signal processing (sound localization processing) applying the Head Related Transfer Function (HRTF) corresponding to each user may also be configured to be performed in, for example, the user terminal 272 carried by each user.

(d) Embodiments of performing sound localization processing using signal processing other than Head Related Transfer Function (HRTF)

Although the foregoing describes embodiments in which sound localization processing uses a Head Related Transfer Function (HRTF), sound localization processing may also be performed by signal processing using data other than a Head Related Transfer Function (HRTF).

For example, data that determines a Head Related Transfer Function (HRTF) or approximates it may be used in the signal processing. Such data are, for example:

(1) an approximation of a Head Related Transfer Function (HRTF),

(2) parameters determining a Head Related Transfer Function (HRTF), and

(3) parameters determining an approximation of a Head Related Transfer Function (HRTF).

Specifically, in the case of reproducing an HRTF by EQ, parameters such as the center frequency (Fq), gain, and Q factor may be used.

In addition, data based on individual physical characteristics used in signal processing other than sound localization processing, such as an individually optimized filter for noise cancellation, may also be used.

In addition, data based on individual preferences may also be used, such as EQ parameters for adjusting sound quality and volume.

[5. Embodiment of storing a user-specific Head Related Transfer Function (HRTF) in a user terminal ]

Next, with reference to fig. 18 and subsequent drawings, an embodiment of storing a user-specific Head Related Transfer Function (HRTF) in a user terminal will be described.

The foregoing describes an embodiment in which Head Related Transfer Functions (HRTFs) of a plurality of users are stored in an HRTF database.

In contrast, the embodiment described with reference to fig. 18 and subsequent figures is one in which a Head Related Transfer Function (HRTF) 311 unique to a specific user 301 is stored in a user terminal 310 carried by the user 301.

The user terminal 310 outputs an audio signal to the headphones 303 wirelessly or through a headphone jack. The user 301 listens to the audio output from the headphones 303.

The output sound from the headphones 303 is a signal processed by signal processing (sound localization processing) applying the Head Related Transfer Function (HRTF) 311 unique to the user 301.

For example, the user 301 receives music provided by the music delivery server 322 by downloading or streaming it to the user terminal 310.

The user terminal 310 performs signal processing (sound localization processing) that applies the user-unique Head Related Transfer Function (HRTF) 311 of the user 301, stored in the user terminal 310, to the audio signal acquired from the music delivery server 322, and outputs the processed audio signal to the headphones 303.

With this arrangement, the user can listen to an audio signal that has undergone sound localization processing applying the user's own unique Head Related Transfer Function (HRTF).

However, in the case where signal processing (sound localization processing) is performed in the signal processing unit inside the user terminal 310, it may be necessary to acquire authorization information from the management server 321.

The configuration and processing of the present embodiment will be described with reference to fig. 19.

As shown in fig. 19, the user terminal 310 stores the user-unique Head Related Transfer Function (HRTF) 311 of the user carrying the terminal, and includes a signal processing unit 312 that performs signal processing (sound localization processing) applying that user-unique Head Related Transfer Function (HRTF), and a communication unit 313 that outputs the processed signal from the signal processing unit 312 to the headphones 303.

The audio signals (Lin, Rin) of the music 351 provided by the music delivery server 322 are input into the signal processing unit 312 of the user terminal 310.

The signal processing unit 312 performs signal processing (sound localization processing) that applies the user-corresponding Head Related Transfer Function (HRTF) 311 stored in the storage unit of the user terminal 310 to the audio signal acquired from the music delivery server 322.

However, in the case where signal processing (sound localization processing) is performed in the signal processing unit 312, it may be necessary to acquire authorization information from the management server 321.

The user terminal 310 acquires the authorization information 371 from the management server 321. The authorization information 371 is, for example, key information or the like that enables a signal processing (sound localization processing) program to be executed in the signal processing unit 312.

On the condition that the authorization information 371 is acquired from the management server 321, the user terminal 310 performs signal processing (sound localization processing) that applies the user-corresponding Head Related Transfer Function (HRTF) 311 to the audio signal transmitted from the music delivery server 322.

The processed audio signals (Lout, Rout) are output to the headphones 303 through the headphone jack or the communication unit 313.

With this arrangement, the user can listen to the audio signal that has been subjected to the sound localization process to which the Head Related Transfer Function (HRTF) unique to the user is applied.

Note that, in a configuration in which Head Related Transfer Functions (HRTFs) of a plurality of different users are stored in the user terminal 310, the terminal may be configured such that the current user selects which Head Related Transfer Function (HRTF) to use.

Alternatively, the user terminal may be provided with a user identification unit, and audio output control may be performed by applying the Head Related Transfer Function (HRTF) corresponding to the identified user.

As another example, in a configuration in which an audio output system such as an in-vehicle audio system communicates with the user terminal 310, the audio output system may acquire the Head Related Transfer Function (HRTF) stored in the user terminal 310, and audio output control according to the acquired Head Related Transfer Function (HRTF) may be performed on the audio output system side.

Note that although the above-described embodiments take a stereo signal as the example sound source, the processing according to the present disclosure is also applicable to signals other than stereo, such as multi-channel signals, object-based signals in which sound is played back in units of objects, and Ambisonics or Higher Order Ambisonics (HOA) signals reproducing a sound field.

[6. Exemplary hardware configuration of acoustic signal processing apparatus, user terminal, server, etc. ]

Next, an exemplary hardware configuration of the acoustic signal processing apparatus, the user terminal, the server, and the like described in the above-described embodiments will be explained.

The hardware described with reference to fig. 20 is an exemplary hardware configuration of the acoustic signal processing apparatus, the user terminal, the server, and the like described in the above-described embodiments.

A Central Processing Unit (CPU) 501 functions as a control unit and a data processing unit that execute various processes according to programs stored in a Read Only Memory (ROM) 502 or a storage unit 508. For example, the processing according to the sequences described in the above embodiments is performed. A Random Access Memory (RAM) 503 stores the programs executed by the CPU 501, data, and the like. The CPU 501, the ROM 502, and the RAM 503 are connected to each other via a bus 504.

The CPU 501 is connected to an input/output interface 505 via a bus 504, and the input/output interface 505 is connected to an input unit 506 including various switches, a keyboard, a mouse, a microphone, a sensor, and the like, and an output unit 507 including a display, a speaker, and the like. The CPU 501 executes various processes in response to an instruction input from the input unit 506, and outputs a processing result to, for example, the output unit 507.

The storage unit 508 connected to the input/output interface 505 includes, for example, a hard disk or the like, and stores programs executed by the CPU 501 and various data. The communication unit 509 functions as a transmission/reception unit for Wi-Fi communication, bluetooth (registered trademark) (BT) communication, and other data communication via a network such as the internet and a local area network, and communicates with an external device.

A drive 510 connected to the input/output interface 505 drives a removable medium 511 such as a magnetic disk, an optical disk, a magneto-optical disk, or a semiconductor memory such as a memory card, and records or reads data.

[7. Summary of arrangement according to the present disclosure ]

Embodiments of the present disclosure have been described in detail with reference to the above specific embodiments. However, it is apparent that those skilled in the art can make modifications and substitutions to the embodiments without departing from the gist of the present disclosure. In other words, the present disclosure has been disclosed in the form of examples and is not to be construed restrictively. To ascertain the gist of the present disclosure, the claims should be considered.

Further, the present technology disclosed in the present specification may include the following configurations.

(1) An acoustic signal processing apparatus comprising:

a user identification unit that executes a user identification process;

an acquisition unit that acquires a Head Related Transfer Function (HRTF) unique to the user identified by the user identification unit from among the one or more Head Related Transfer Functions (HRTFs); and

a sound image localization processing unit that performs sound localization processing using a Head Related Transfer Function (HRTF) unique to the user acquired by the acquisition unit as a processing parameter.

(2) The acoustic signal processing apparatus according to (1), wherein

The user identification unit additionally performs a user position identification process, and

the sound image localization processing unit outputs the signal obtained by the sound localization processing from a speaker near the user position identified by the user identification unit.

(3) The acoustic signal processing apparatus according to (1) or (2), wherein

The user identification unit performs the user identification processing for a plurality of users, and

the sound image localization processing unit performs the sound localization processing in parallel using the Head Related Transfer Function (HRTF) of each of the plurality of users identified by the user identification unit as a processing parameter.

(4) The acoustic signal processing apparatus according to any one of (1) to (3), wherein

The user identification unit performs the user identification processing, or both the user identification processing and the user position identification processing, based on an image captured by a camera.

(5) The acoustic signal processing apparatus according to any one of (1) to (4), wherein

The user identification unit performs the user identification processing, or both the user identification processing and the user position identification processing, based on sensor information.

(6) The acoustic signal processing apparatus according to any one of (1) to (5), further comprising

a Head Related Transfer Function (HRTF) database storing the Head Related Transfer Function (HRTF) of each user.

(7) The acoustic signal processing apparatus according to any one of (1) to (6), wherein

The acquisition unit accepts user identification information from the user identification unit as input, acquires the Head Related Transfer Function (HRTF) unique to the user from a database inside the acoustic signal processing apparatus based on the user identification information, and outputs it to the sound image localization processing unit.

(8) The acoustic signal processing apparatus according to any one of (1) to (7), wherein

The acquisition unit accepts user identification information from the user identification unit as input, acquires the Head Related Transfer Function (HRTF) unique to the user from a database in an external server based on the user identification information, and outputs it to the sound image localization processing unit.

(9) The acoustic signal processing apparatus according to any one of (1) to (8), wherein

The sound image localization processing unit performs processing of stopping or reducing the signal output to a position where the user identification unit determines that no user exists.

(10) The acoustic signal processing apparatus according to any one of (1) to (9), wherein

The user identification unit refers to registration data in a boarding reservation system to perform the user identification processing, or both the user identification processing and the user position identification processing.

(11) The acoustic signal processing apparatus according to any one of (1) to (10), wherein

The user identification unit performs the user identification processing, or both the user identification processing and the user position identification processing, based on information received from a sensor worn by the user or from a user terminal.

(12) The acoustic signal processing apparatus according to any one of (1) to (11), wherein

The user identification unit performs the user identification processing with reference to user member information registered in advance.

(13) An acoustic signal processing apparatus comprising:

a storage unit storing a Head Related Transfer Function (HRTF) unique to a user;

an acquisition unit that acquires a Head Related Transfer Function (HRTF) unique to a user from the storage unit; and

a sound image localization processing unit that performs sound localization processing using a Head Related Transfer Function (HRTF) unique to the user acquired by the acquisition unit as a processing parameter.

(14) The acoustic signal processing apparatus according to (13), wherein

The sound image localization processing unit performs the sound localization processing on an acoustic signal acquired from an external server.

(15) An acoustic signal processing system comprising a user terminal and a server, wherein

the server transmits an audio signal to the user terminal, and

the user terminal includes:

a storage unit storing a Head Related Transfer Function (HRTF) unique to a user,

an acquisition unit that acquires the Head Related Transfer Function (HRTF) unique to the user from the storage unit, and

a sound image localization processing unit that performs sound localization processing on the audio signal received from the server using the Head Related Transfer Function (HRTF) unique to the user acquired by the acquisition unit as a processing parameter.

(16) An acoustic signal processing method performed in an acoustic signal processing apparatus, the method comprising:

performing a user identification process by a user identification unit;

acquiring, by an acquisition unit, the Head Related Transfer Function (HRTF) unique to the user identified by the user identification unit from among one or more Head Related Transfer Functions (HRTFs); and

sound localization processing is performed by the sound image localization processing unit using a Head Related Transfer Function (HRTF) unique to the user acquired by the acquisition unit as a processing parameter.

(17) An acoustic signal processing method performed in an acoustic signal processing apparatus,

the acoustic signal processing apparatus includes a storage unit storing a Head Related Transfer Function (HRTF) unique to a user, the method including:

acquiring, by an acquisition unit, the Head Related Transfer Function (HRTF) unique to the user from the storage unit; and

sound localization processing is performed by the sound image localization processing unit using a Head Related Transfer Function (HRTF) unique to the user acquired by the acquisition unit as a processing parameter.

(18) A program for causing an acoustic signal processing apparatus to execute acoustic signal processing, the program comprising:

causing a user identification unit to execute a user identification process;

causing the acquisition unit to acquire the Head Related Transfer Function (HRTF) unique to the user identified by the user identification unit from among one or more Head Related Transfer Functions (HRTFs); and

the sound image localization processing unit is caused to perform sound localization processing using a Head Related Transfer Function (HRTF) unique to the user acquired by the acquisition unit as a processing parameter.

Further, the series of processes described herein may be executed by hardware, by software, or by a combined configuration of both. In the case of execution by software, a program recording the processing sequence may be installed in the memory of dedicated hardware incorporated in a computer and executed, or installed in a general-purpose computer capable of executing various types of processing and executed. For example, such a program may be recorded in a recording medium in advance and installed in the computer from the recording medium. Alternatively, the program may be received through a network such as a LAN (local area network) or the internet and installed in a recording medium such as an internal hard disk.

Note that the processes described herein are not necessarily performed in the time-series order described, and may be performed in parallel or individually as needed or in accordance with the processing capability of the apparatus that performs the processes. Further, in this specification, a system refers to a logical set configuration including a plurality of devices, and the devices of the respective configurations are not necessarily included in the same housing.

Industrial applicability

As described above, according to the configuration of the exemplary aspect of the present disclosure, a configuration is realized in which sound localization processing is performed applying a Head Related Transfer Function (HRTF) corresponding to a user identified by user identification, and output from an output unit is performed for each user position.

Specifically, for example, a user identification unit that performs user identification and user position identification processing, and a sound image localization processing unit that performs sound localization processing using a user-specific Head Related Transfer Function (HRTF) as a processing parameter are included. The sound image localization processing unit performs sound localization processing that takes HRTFs specific to the identified user as processing parameters, and outputs signals obtained by the sound localization processing to the output unit for the identified user position. In a case where the user identifying unit identifies a plurality of users, the sound image localization processing unit performs sound localization processing in parallel using HRTFs of each of the plurality of users, and outputs the processed signals to the output unit for each user position.

According to the present configuration, a configuration is realized in which sound localization processing is performed applying a Head Related Transfer Function (HRTF) corresponding to a user identified by user identification, and output from an output unit is performed for each user position.

List of reference numerals

1 automobile

10 users

21 left speaker

22 right speaker

31 virtual left speaker

32 virtual right speaker

41 true left speaker

42 true right speaker

50 sound source

60 HRTF-applied sound image localization processing unit

70 HRTF storage unit

80 automobile

100 acoustic signal processing apparatus

101 sensor

102 user and user position identification unit

103 user-corresponding HRTF acquisition unit

104 user-corresponding HRTF database

105 HRTF-applied sound image localization processing unit

110 users

120 management server

124 HRTF database

200 acoustic signal processing apparatus

201 boarding reservation system

202 management server

210 HRTF database

251 user

252 sensor

261 speaker L

262 speaker R

263 communication device

271 user

272 user terminal

273 headphones

301 user

303 headphones

310 user terminal

311 user-corresponding HRTF

312 Signal processing Unit

313 communication unit

321 management server

322 music delivery server

501 CPU

502 ROM

503 RAM

504 bus

505 input/output interface

506 input unit

507 output unit

508 storage unit

509 communication unit

510 drive

511 removable medium.
