Sound processing device

Document No.: 1895405; Publication date: 2021-11-26

Description: This technology, a sound processing device, was designed and created by 宫阪修二 on 2020-01-17. Summary: The sound processing device (10) includes: a preprocessing unit (30) that extracts a signal of an audio frequency band from a 1st electric signal supplied from a 1st microphone (20) and outputs a 1st output signal; a 1st control unit (50) that generates a 1st amplification factor, which compresses the dynamic range of the intensity of the 1st output signal by being multiplied with the 1st output signal, and that generates a 1st corrected amplification factor by smoothing the 1st amplification factor with a 1st time constant; and a 1st multiplication unit (44) that multiplies the 1st corrected amplification factor by the 1st output signal. The 1st time constant is a 1st rise time constant when the intensity of the 1st output signal increases and a 1st fall time constant when the intensity of the 1st output signal decreases. The 1st rise time constant is equal to or longer than the temporal resolution of the hearing of a hearing-impaired person and shorter than the duration of a sound that triggers the recruitment phenomenon in the hearing-impaired person.

1. A sound processing device for processing a sound signal,

the sound processing device comprising:

a 1st microphone that converts a 1st sound into a 1st electric signal;

a preprocessing unit that extracts a signal of an audio frequency band from the 1st electric signal and outputs a 1st output signal including the signal of the audio frequency band;

a 1st control unit that generates a 1st amplification factor, which is used to compress a dynamic range of intensity of the 1st output signal by being multiplied with the 1st output signal, and generates a 1st corrected amplification factor by smoothing the 1st amplification factor with a 1st time constant; and

a 1st multiplication unit that multiplies the 1st corrected amplification factor by the 1st output signal,

wherein, with respect to the 1st time constant,

the 1st time constant is a 1st rise time constant when the intensity of the 1st output signal increases,

the 1st time constant is a 1st fall time constant when the intensity of the 1st output signal decreases, and

the 1st rise time constant is equal to or longer than the temporal resolution of the hearing of a hearing-impaired person and shorter than the duration of a sound that triggers the recruitment phenomenon in the hearing-impaired person.

2. The sound processing device according to claim 1,

further comprising a 1st setting unit that sets the 1st rise time constant and the 1st fall time constant.

3. The sound processing device according to claim 1 or 2,

wherein the 1st rise time constant is a value of 20 msec or more and less than 200 msec.

4. The sound processing device according to any one of claims 1 to 3,

wherein the 1st rise time constant is larger than the 1st fall time constant.

5. The sound processing device according to any one of claims 1 to 4,

wherein the preprocessing unit includes:

a 1st filter that extracts the signal of the audio frequency band from the 1st electric signal; and

a preprocessing multiplication unit that multiplies an output signal of the 1st filter by a preprocessing amplification factor,

and the preprocessing amplification factor is smaller when the intensity of the output signal of the 1st filter is below a predetermined threshold than when the intensity of the output signal of the 1st filter is above the predetermined threshold.

6. A sound processing device for processing a sound signal,

the sound processing device comprising:

a 1st microphone that converts a 1st sound into a 1st electric signal;

a preprocessing unit that extracts a signal of an audio frequency band from the 1st electric signal and outputs a 1st output signal including the signal of the audio frequency band;

a 1st control unit that generates a 1st amplification factor, which is used to compress a dynamic range of intensity of the 1st output signal by being multiplied with the 1st output signal, and generates a 1st corrected amplification factor by smoothing the 1st amplification factor with a 1st time constant;

a 1st multiplication unit that multiplies the 1st corrected amplification factor by the 1st output signal;

a 2nd microphone that converts a 2nd sound into a 2nd electric signal;

a 2nd control unit that generates a 2nd amplification factor, which is used to compress a dynamic range of intensity of a 2nd output signal corresponding to the 2nd electric signal by being multiplied with the 2nd output signal, and generates a 2nd corrected amplification factor by smoothing the 2nd amplification factor with a 2nd time constant; and

a 2nd multiplication unit that multiplies the 2nd corrected amplification factor by the 2nd output signal,

wherein, with respect to the 1st time constant,

the 1st time constant is a 1st rise time constant when the intensity of the 1st output signal increases,

the 1st time constant is a 1st fall time constant when the intensity of the 1st output signal decreases,

with respect to the 2nd time constant,

the 2nd time constant is a 2nd rise time constant when the intensity of the 2nd output signal increases,

the 2nd time constant is a 2nd fall time constant when the intensity of the 2nd output signal decreases, and

the 1st rise time constant is larger than the 2nd rise time constant.

7. A sound processing device for processing a sound signal,

the sound processing device comprising a rise emphasis unit to which a signal of an audio frequency band is input and which emphasizes a rising portion of the signal for a predetermined time,

wherein the predetermined time is equal to or longer than the temporal resolution of the hearing of a hearing-impaired person and shorter than the duration of a sound that triggers the recruitment phenomenon in the hearing-impaired person.

Technical Field

The present disclosure relates to a sound processing device.

Background

It is known that the following three points should be kept in mind so that a hearing-impaired person can hear speech more easily (see, for example, Non-Patent Document 1).

Consideration item 1: Do not shout; speak clearly in a slightly louder voice than usual.

Consideration item 2: Pronounce the Japanese pa (パ), ta (タ), ka (カ), and sa (サ) rows clearly.

Consideration item 3: Speak the beginning of each word (the word head) with a little extra emphasis and slightly longer.

Meanwhile, technologies have been developed that use digital signal processing to convert ordinary sound into sound that is easier to hear.

Patent Document 1 uses a dynamic range compression technique to amplify quiet sounds and attenuate excessively loud sounds, thereby making sound easier for a hearing-impaired person to hear.

Patent Document 2 detects and emphasizes consonant portions to make sound easier for a hearing-impaired person to hear.

(Prior art documents)

(Patent documents)

Patent Document 1: Japanese Patent No. 5149991

Patent Document 2: Japanese Patent No. 6284003

Non-Patent Document 1: Meiji Yasuda Life Group comprehensive nursing-care information website "MY Kaigo no Hiroba" [online], [retrieved April 5, 2019], Internet <URL: https://www.my-kaigo.com/pub/individual/chiebukuro/taikon/choukku/0030.html>

Non-Patent Document 2: "Frequency selectivity and temporal resolution of hearing-impaired persons", Otolaryngological Outlook, December 15, 2002, Vol. 45, No. 6, pp. 460-468

The technique described in Patent Document 1 addresses consideration item 1 but works against consideration item 3. The reason is as follows.

When dynamic range compression is applied for hearing aid purposes, the following constraints arise.

First, as the 1st constraint, the attack time constant (ATT) must be made small. The attack time constant determines how quickly the attenuation applied to a high-intensity input signal takes effect: the smaller the value, the faster the attenuation; the larger the value, the slower. The value must be kept small in order to prevent the recruitment phenomenon from occurring in a hearing-impaired person. The recruitment phenomenon is frequently observed in hearing-impaired persons: their hearing reacts oversensitively to high-intensity signals, which sound louder to them than to a person with normal hearing. When hearing aid processing is performed for a hearing-impaired person, a suddenly occurring loud sound must therefore be attenuated quickly, which requires a small attack time constant (ATT).

Next, as the 2nd constraint, the release time constant (REL) must be made large. The release time constant determines how quickly the amplification applied to a low-intensity input signal takes effect: the smaller the value, the faster the amplification; the larger the value, the slower. The reason the release time constant must be large is as follows.

Because of the 1st constraint the attack time constant is small, and a small time constant means that the volume is controlled steeply. For a signal whose volume fluctuates frequently, the processed sound therefore flutters unnaturally. To suppress such unnatural volume fluctuation, the release time constant must be set much larger than the attack time constant.

An example of the time constants of hearing aid processing determined under the 1st and 2nd constraints is described with reference to Fig. 14. Fig. 14 is a diagram showing an example of time constants in conventional hearing aid processing, taken from a commercially available hearing aid. In conventional hearing aid processing, the attack time constant ATT and the release time constant REL take values such as those shown in Fig. 14, with the attack time constant ATT smaller than the release time constant REL by about one or two orders of magnitude.

In dynamic range compression for hearing aid purposes, the word head, where the sound grows rapidly, is thus suppressed sharply, which makes the sound harder to hear from the standpoint of consideration item 3. In addition, because the release time constant must be large, the basic hearing aid function of amplifying quiet sounds responds only with a delay.

The technique described in Patent Document 2 addresses consideration item 2, but when the word head is a vowel it is not emphasized, so the resulting sound is not necessarily easy to hear.

Disclosure of Invention

In view of the above-described conventional problems, it is an object of the present disclosure to provide a sound processing device capable of outputting a sound that is easy to hear.

In order to solve the above problem, a sound processing device according to one aspect of the present disclosure includes: a 1st microphone that converts a 1st sound into a 1st electric signal; a preprocessing unit that extracts a signal of an audio frequency band from the 1st electric signal and outputs a 1st output signal including the signal of the audio frequency band; a 1st control unit that generates a 1st amplification factor, which is used to compress the dynamic range of the intensity of the 1st output signal by being multiplied with the 1st output signal, and generates a 1st corrected amplification factor by smoothing the 1st amplification factor with a 1st time constant; and a 1st multiplication unit that multiplies the 1st corrected amplification factor by the 1st output signal. The 1st time constant is a 1st rise time constant when the intensity of the 1st output signal increases and a 1st fall time constant when the intensity of the 1st output signal decreases, and the 1st rise time constant is equal to or longer than the temporal resolution of the hearing of a hearing-impaired person and shorter than the duration of a sound that triggers the recruitment phenomenon in the hearing-impaired person.

Setting the 1st rise time constant to at least the temporal resolution of the hearing-impaired person's hearing prevents the intensity of the sound at the word head from being suppressed abruptly. Moreover, with such a setting, immediately after the intensity of the 1st output signal rises at the word head, the 1st corrected amplification factor still holds the larger amplification factor that applied just before the rise, so the 1st output signal immediately after the rise is amplified. The word head is thereby emphasized, and a sound that is easy to hear can be generated. Furthermore, setting the 1st rise time constant shorter than the duration of a sound that triggers the recruitment phenomenon in the hearing-impaired person suppresses occurrence of the recruitment phenomenon.

The sound processing device according to one aspect of the present disclosure may further include a 1st setting unit that sets the 1st rise time constant and the 1st fall time constant.

This allows the 1st rise time constant and the 1st fall time constant to be set to desired values.

In the sound processing device according to one aspect of the present disclosure, the 1st rise time constant may be a value of 20 msec or more and less than 200 msec.

Setting the 1st rise time constant to 20 msec or more prevents the intensity of the sound at the word head from being suppressed abruptly, and setting it to less than 200 msec suppresses occurrence of the recruitment phenomenon.

In the sound processing device according to one aspect of the present disclosure, the 1st rise time constant may be larger than the 1st fall time constant.

Making the 1st rise time constant large in this way suppresses unnatural fluttering of the sound even when the 1st fall time constant is small.

In the sound processing device according to one aspect of the present disclosure, the preprocessing unit may include: a 1st filter that extracts the signal of the audio frequency band from the 1st electric signal; and a preprocessing multiplication unit that multiplies the output signal of the 1st filter by a preprocessing amplification factor, and the preprocessing amplification factor may be smaller when the intensity of the output signal of the 1st filter is below a predetermined threshold than when the intensity is above the predetermined threshold.

Thus, amplification of a noise component having low intensity can be reduced.

In order to solve the above problem, a sound processing device according to one aspect of the present disclosure includes: a 1st microphone that converts a 1st sound into a 1st electric signal; a preprocessing unit that extracts a signal of an audio frequency band from the 1st electric signal and outputs a 1st output signal including the signal of the audio frequency band; a 1st control unit that generates a 1st amplification factor, which is used to compress the dynamic range of the intensity of the 1st output signal by being multiplied with the 1st output signal, and generates a 1st corrected amplification factor by smoothing the 1st amplification factor with a 1st time constant; a 1st multiplication unit that multiplies the 1st corrected amplification factor by the 1st output signal; a 2nd microphone that converts a 2nd sound into a 2nd electric signal; a 2nd control unit that generates a 2nd amplification factor, which is used to compress the dynamic range of the intensity of a 2nd output signal corresponding to the 2nd electric signal by being multiplied with the 2nd output signal, and generates a 2nd corrected amplification factor by smoothing the 2nd amplification factor with a 2nd time constant; and a 2nd multiplication unit that multiplies the 2nd corrected amplification factor by the 2nd output signal. The 1st time constant is a 1st rise time constant when the intensity of the 1st output signal increases and a 1st fall time constant when the intensity of the 1st output signal decreases; the 2nd time constant is a 2nd rise time constant when the intensity of the 2nd output signal increases and a 2nd fall time constant when the intensity of the 2nd output signal decreases; and the 1st rise time constant is larger than the 2nd rise time constant.

By making the 1st rise time constant larger than the 2nd rise time constant in this way, the intensity of the word head of the 1st sound is prevented from being suppressed abruptly, so a sound that is easy to hear can be generated.

In order to solve the above problem, a sound processing device according to one aspect of the present disclosure includes a rise emphasis unit that receives a signal of an audio frequency band and emphasizes a rising portion of the signal for a predetermined time. The predetermined time is equal to or longer than the temporal resolution of the hearing of a hearing-impaired person and shorter than the duration of a sound that triggers the recruitment phenomenon in the hearing-impaired person.

Emphasizing the rising portion of the signal of the audio frequency band for a time equal to or longer than the temporal resolution of the hearing-impaired person's hearing produces a sound that is easy for the hearing-impaired person to hear. Furthermore, setting the predetermined time shorter than the duration of a sound that triggers the recruitment phenomenon suppresses occurrence of the recruitment phenomenon.

The present disclosure can provide a sound processing device capable of outputting a sound that is easy to hear.

Drawings

Fig. 1 is a block diagram showing an example of the functional configuration of the sound processing device according to Embodiment 1.

Fig. 2 is a block diagram showing an example of the functional configuration of the 1st control unit according to Embodiment 1.

Fig. 3 is a block diagram showing an example of the functional configuration of the correction coefficient generation unit according to Embodiment 1.

Fig. 4 is a graph showing an example of the relationship between the 1st amplification factor and the intensity of the 1st output signal in Embodiment 1.

Fig. 5 is a flowchart showing an example of the method for determining the 1st time constant according to Embodiment 1.

Fig. 6A is a diagram showing an example of the waveform of a rising portion of the 1st output signal in Embodiment 1, together with the magnitude and direction of the 1st corrected amplification factor multiplied by the 1st output signal at each time.

Fig. 6B is a graph showing the waveform of the signal obtained by multiplying the 1st output signal shown in Fig. 6A by the 1st corrected amplification factor.

Fig. 7 is a graph showing the results of a demonstration experiment performed using the sound processing device according to Embodiment 1.

Fig. 8 is a block diagram showing an example of the functional configuration of the sound processing device according to Embodiment 2.

Fig. 9 is a block diagram showing an example of the functional configuration of the preprocessing unit according to Embodiment 2.

Fig. 10 is a graph showing an example of the relationship between the preprocessing amplification factor and the intensity E of the output signal of the filter in Embodiment 2.

Fig. 11 is a block diagram showing an example of the functional configuration of the sound processing device according to Embodiment 3.

Fig. 12 is a diagram for explaining a method of determining the 1st time constant in a modification.

Fig. 13 is a diagram showing an example of the hardware configuration of a computer that realizes the functions of the sound processing device by software.

Fig. 14 is a diagram showing an example of time constants in conventional hearing aid processing.

Fig. 15 is a diagram showing the relationship between hearing level and temporal resolution described in Non-Patent Document 2.

Detailed Description

Embodiments of the present disclosure are described below in detail with reference to the drawings. The embodiments described below are specific examples of the present disclosure. The numerical values, shapes, materials, constituent elements, arrangement positions and connection forms of the constituent elements, steps, order of steps, and the like shown in the following embodiments are merely examples and are not intended to limit the present disclosure. Among the constituent elements in the following embodiments, those not recited in the independent claims, which represent the broadest concept of the present disclosure, are described as optional constituent elements. The drawings are schematic and are not necessarily precise illustrations. In the drawings, substantially identical components are given the same reference numerals, and redundant description may be omitted or simplified.

(Underlying knowledge forming the basis of one aspect of the present disclosure)

Among the considerations described in the Background section for making speech easier for a hearing-impaired person to hear, consideration item 1 and consideration item 3 can conflict with each other. Consideration item 1 says not to produce a loud sound, whereas consideration item 3 recommends producing a strong sound at the head (the rise) of a word.

These items can nevertheless be reconciled as follows: a strong sound is produced at the beginning of the word, but only for as long as it does not continue to the extent that it triggers the recruitment phenomenon. The appropriate length of time for producing a high-intensity sound at the word head can be inferred, as described below, from recent findings in auditory psychology (Non-Patent Document 2).

Fig. 15 is a diagram showing the relationship between hearing level and temporal resolution described in Non-Patent Document 2. Graphs (a) to (d) in Fig. 15 show the relationship between hearing level and temporal resolution obtained by performing a gap detection threshold test on hearing-impaired persons aged 11 to 75. The horizontal axis of each graph shows the hearing level; the farther to the right, the more severe the hearing loss. The vertical axis shows the temporal resolution, that is, the shortest gap in a sound that can be detected; the higher on the axis, the worse the temporal resolution. The black dots in Fig. 15 indicate the data of the individual subjects. Graphs (a), (b), (c), and (d) in Fig. 15 show the results for experimental stimulus frequencies of 1 kHz, 2 kHz, 4 kHz, and 8 kHz, respectively. All the graphs show an increasing tendency: the worse the hearing, the worse the temporal resolution, which in some cases degrades to 20 msec to 30 msec. Degraded temporal resolution means that short sounds cannot be perceived. A hearing-impaired person with this symptom presumably cannot follow the complex frequency changes at the word head and consequently cannot hear spoken words clearly. If, however, the word head is emphasized continuously for roughly 20 msec or more, even a hearing-impaired person with this symptom can be expected to hear more easily.

On the other hand, if the emphasis continues too long, the recruitment phenomenon makes the hearing oversensitive, the word head sounds excessively loud, and listening becomes unpleasant.

In the SISI (Short Increment Sensitivity Index) test, which is used to examine whether a person exhibits the recruitment phenomenon, the stimulus sound lasts 200 msec. This indicates that, for a hearing-impaired person who exhibits recruitment, a strong signal lasting 200 msec is enough to trigger it. The length of time for which the word head is emphasized should therefore be less than 200 msec.

In view of these research findings, in the embodiments described below the length of time for which the word head is emphasized is set to 20 msec or more and less than 200 msec.

Furthermore, demonstration experiments using this method were carried out on many elderly subjects, including hearing-impaired persons, and showed that the method helps the elderly hear speech clearly; the results are also described in the embodiments.

(Embodiment 1)

Next, the sound processing device according to Embodiment 1 will be described.

[Structure]

First, the functional configuration of the sound processing device according to the present embodiment will be described with reference to the drawings. Fig. 1 is a block diagram showing an example of the functional configuration of the sound processing device 10 according to the present embodiment.

The sound processing device 10 is a device that processes the 1st sound and outputs the processed sound from a 1st speaker 80. As shown in Fig. 1, the sound processing device 10 includes, as functional units, a 1st microphone 20, a preprocessing unit 30, and a rise emphasis unit 40. In the present embodiment, the sound processing device 10 further includes the 1st speaker 80.

The 1st microphone 20 converts the 1st sound into a 1st electric signal. In the present embodiment, the 1st microphone 20 is designed to pick up the voice of a speaker who is close to it, so in the 1st sound input to the 1st microphone 20 the speaker's voice is much louder than the surrounding noise.

The preprocessing unit 30 is a processing unit that extracts a signal of the audio frequency band from the 1st electric signal output from the 1st microphone 20 and outputs a 1st output signal including the signal of the audio frequency band. Here, the audio frequency band is a band that includes the frequencies of the human voice, specifically about 70 Hz to 3000 Hz. In the present embodiment, the preprocessing unit 30 has a filter 32. The filter 32 is an example of a 1st filter that extracts the signal of the audio frequency band from the 1st electric signal. In the present embodiment, the filter 32 is a low-pass filter that extracts the band at or below a predetermined frequency from the 1st electric signal; for example, it extracts the band of 8000 Hz or less.

The rise emphasis unit 40 is a processing unit to which a signal of the audio frequency band is input and which emphasizes a rising portion of the signal for a predetermined time. The predetermined time is equal to or longer than the temporal resolution of the hearing of a hearing-impaired person and shorter than the duration of a sound that triggers the recruitment phenomenon in the hearing-impaired person. In the present embodiment, the 1st output signal is input as the signal of the audio frequency band. The rise emphasis unit 40 emphasizes the rising portion, in other words the word-head portion, of the 1st sound. The rise emphasis unit 40 includes a 1st control unit 50 and a 1st multiplication unit 44. In the present embodiment, the rise emphasis unit 40 further includes a 1st setting unit 42.

The 1st control unit 50 is a processing unit that generates a 1st amplification factor, which compresses the dynamic range of the intensity of the 1st output signal output from the preprocessing unit 30 by being multiplied with the 1st output signal, and that generates a 1st corrected amplification factor by smoothing the 1st amplification factor with a 1st time constant τ. The 1st time constant τ is a 1st rise time constant ATT (in other words, an attack time constant) when the intensity of the 1st output signal increases, and a 1st fall time constant REL (in other words, a release time constant) when the intensity of the 1st output signal decreases. The 1st rise time constant ATT is equal to or longer than the temporal resolution of the hearing of the hearing-impaired person and shorter than the duration of a sound that triggers the recruitment phenomenon in the hearing-impaired person. Specifically, the 1st rise time constant ATT is, for example, 20 msec or more and less than 200 msec.

Here, the 1st control unit 50 will be described with reference to Fig. 2. Fig. 2 is a block diagram showing an example of the functional configuration of the 1st control unit 50 according to the present embodiment. As shown in Fig. 2, the 1st control unit 50 includes an intensity detection unit 52, a coefficient generation unit 54, a time constant determination unit 56, and a correction coefficient generation unit 60.

The intensity detection unit 52 detects the intensity E of the 1st output signal and outputs the detected intensity E to the coefficient generation unit 54.

The coefficient generation unit 54 is a processing unit that generates the 1st amplification factor g(t), which increases as the intensity E decreases and decreases as the intensity E increases. In other words, the coefficient generation unit 54 generates the 1st amplification factor g(t) for compressing the dynamic range of the 1st output signal. Here, t denotes time. The coefficient generation unit 54 outputs the generated 1st amplification factor to the time constant determination unit 56 and the correction coefficient generation unit 60.

The time constant determination unit 56 is a processing unit that determines the 1st time constant τ, that is, the time constant of the smoothing processing used in the correction coefficient generation unit 60.

The correction coefficient generation unit 60 is a processing unit that smooths the temporal variation of the 1st amplification factor g(t) generated by the coefficient generation unit 54, converting it into the 1st corrected amplification factor mG(t). A configuration example of the correction coefficient generation unit 60 is described with reference to Fig. 3. Fig. 3 is a block diagram showing an example of the functional configuration of the correction coefficient generation unit 60 according to the present embodiment. As shown in Fig. 3, the correction coefficient generation unit 60 includes a multiplier 62, a multiplier 68, an adder 64, and a delay element 66; in other words, it is a digital filter whose time constant is the 1st time constant τ.

The adder 64 adds the two signals output from the multiplier 62 and the multiplier 68. The delay element 66 delays the signal output from the adder 64 by the unit processing period T. The unit processing period T is the period at which the intensity detection unit 52 detects the intensity E of the 1st output signal and the coefficient generation unit 54 generates the 1st amplification factor. For example, when the sampling frequency of the 1st output signal is 16 kHz, T is 1/16000 sec if the 1st amplification factor is generated for each sample, and 16/16000 sec, that is, 1 msec, if the 1st amplification factor is generated for every 16 samples (one 1st amplification factor computed from 16 samples together).

The multiplier 62 multiplies the 1st amplification factor g(t) by a coefficient b. The multiplier 68 multiplies the signal output from the delay element 66 by a coefficient a, and receives the 1st time constant τ from the time constant determination unit 56. The coefficients a and b are expressed by the following expressions (1) and (2) using the 1st time constant τ and the unit processing period T:

a=τ/(τ+T) (1)

b=1-a=T/(τ+T) (2)
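As a rough illustration only, not the patented implementation itself, the filter formed by the multipliers 62 and 68, the adder 64, and the delay element 66 computes the recurrence mG(t) = a * mG(t - T) + b * g(t). The following Python sketch applies expressions (1) and (2); the function and variable names are ours, not from the embodiment:

```python
def smooth_gain(g_prev_smoothed: float, g_current: float,
                tau: float, T: float) -> float:
    """One-pole smoothing of the 1st amplification factor.

    g_prev_smoothed : mG(t - T), the previous 1st corrected amplification factor
    g_current       : g(t), the 1st amplification factor at the current time
    tau             : 1st time constant (ATT or REL), in seconds
    T               : unit processing period, in seconds
    """
    a = tau / (tau + T)            # expression (1)
    b = T / (tau + T)              # expression (2): b = 1 - a
    return a * g_prev_smoothed + b * g_current
```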

The 1st setting unit 42 shown in Fig. 1 sets the 1st rise time constant ATT and the 1st fall time constant REL and outputs them to the 1st control unit 50. This allows the 1st rise time constant ATT and the 1st fall time constant REL to be set to desired values. Note that the 1st setting unit 42 is not an essential component of the sound processing device 10 according to the present embodiment; for example, the 1st rise time constant ATT and the 1st fall time constant REL may instead be preset in the 1st control unit 50.
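For illustration, a setting unit could validate the values against the 20 msec to 200 msec range discussed in this disclosure before handing them to the 1st control unit 50. This is a hypothetical sketch, not part of the embodiment:

```python
def set_time_constants(att_msec: float, rel_msec: float) -> tuple[float, float]:
    """Hypothetical 1st setting unit: check ATT against the range discussed
    in this disclosure (20 msec or more and less than 200 msec) and return
    both constants in seconds for use by the smoothing filter."""
    if not 20.0 <= att_msec < 200.0:
        raise ValueError("1st rise time constant ATT should be in [20, 200) msec")
    if rel_msec <= 0.0:
        raise ValueError("1st fall time constant REL must be positive")
    return att_msec / 1000.0, rel_msec / 1000.0

# e.g. the values used in the demonstration experiment of this embodiment:
att_sec, rel_sec = set_time_constants(40.0, 20.0)
```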

The 1st multiplication unit 44 is a processing unit that multiplies the 1st output signal by the 1st corrected amplification factor and outputs the resulting signal to the 1st speaker 80.

The 1st speaker 80 emits the output signal of the 1st multiplication unit 44 as sound.

[Operation]

The operation of the sound processing device configured as described above is explained below.

First, the 1st microphone 20 picks up the 1st sound, which includes the speaker's voice, and converts it into the 1st electric signal. Here, the 1st microphone 20 is arranged so that the speaker speaks close to it. For example, the 1st microphone 20 may be designed so that the speaker can visually confirm the position of its sound pickup hole, or so that when the speaker faces a camera photographing the speaker's face, the speaker's mouth is close to the sound pickup hole of the 1st microphone 20. With the 1st microphone 20 arranged in this way, the 1st sound input to the 1st microphone 20 consists mainly of the speaker's voice, so picking up sudden loud noise occurring around the 1st microphone 20 is suppressed. Therefore, even if the 1st rise time constant ATT is set to a large value, the risk that a sudden loud noise is emitted without being suppressed is reduced.

Next, the filter 32 of the preprocessing unit 30 extracts the signal of the audio frequency band from the 1st electric signal from the 1st microphone 20 and outputs the 1st output signal including the signal of the audio frequency band. For example, since the sampling frequency here is 16 kHz, the cutoff frequency of the filter 32 is set to 8 kHz. This helps ensure that the main component of the signal processed in the subsequent stages is the human voice and, as described later, reduces the risk that a suddenly occurring loud sound cannot be suppressed even when the 1st rise time constant ATT of the dynamic range compression is set to a large value. The cutoff frequency of the filter 32 is not limited to 8 kHz; it may be, for example, approximately 3 kHz or more and 24 kHz or less.

Next, the 1st control unit 50 of the rise emphasis unit 40 generates the 1st corrected amplification factor by smoothing the 1st amplification factor with the 1st rise time constant ATT when the 1st amplification factor decreases (that is, when the intensity E of the 1st output signal from the preprocessing unit 30 increases), and with the 1st fall time constant REL when the 1st amplification factor increases (that is, when the intensity E decreases).

Specifically, the intensity detection unit 52 shown in Fig. 2 detects the intensity E of the 1st output signal from the preprocessing unit 30. The intensity E may be, for example, the absolute value of the 1st output signal, or its energy. It may also be obtained as the sum of absolute values, or the sum of squares, of the 1st output signal over several sample times. When the intensity E is obtained for each sample and the sampling frequency is 16 kHz, the unit processing period T is 1/16000 sec; when it is obtained over blocks of 16 samples, T is 16/16000 sec, that is, 1 msec. The value obtained in this way may further be passed through a low-pass filter to give the intensity E.
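As one possible illustration of such an intensity detector, not the only implementation, the following sketch computes a block-wise sum of absolute values followed by simple one-pole low-pass smoothing. The block size of 16 follows the example above; the smoothing coefficient and names are assumptions:

```python
import numpy as np

def block_intensity(signal: np.ndarray, block_size: int = 16,
                    lp_coeff: float = 0.9) -> np.ndarray:
    """Block-wise intensity E: sum of |x| per block, then a one-pole low-pass."""
    n_blocks = len(signal) // block_size
    blocks = signal[:n_blocks * block_size].reshape(n_blocks, block_size)
    raw = np.abs(blocks).sum(axis=1).astype(float)   # sum of absolute values per block
    e = np.empty(n_blocks, dtype=float)
    prev = 0.0
    for i, v in enumerate(raw):                      # simple low-pass smoothing
        prev = lp_coeff * prev + (1.0 - lp_coeff) * v
        e[i] = prev
    return e
```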

Next, the coefficient generation unit 54 shown in Fig. 2 generates the 1st amplification factor, which is larger the smaller the intensity E is and smaller the larger the intensity E is. The 1st amplification factor is described with reference to Fig. 4. Fig. 4 is a graph showing an example of the relationship between the 1st amplification factor and the intensity E of the 1st output signal in the present embodiment; the horizontal axis is the intensity E of the 1st output signal, and the vertical axis is the corresponding 1st amplification factor. The curve has a descending tendency: the larger the intensity E, the smaller the 1st amplification factor, and the smaller the intensity E, the larger the 1st amplification factor. In other words, the 1st amplification factor is monotonically non-increasing with respect to the intensity E; as shown in Fig. 4, this also includes ranges where the 1st amplification factor is constant with respect to the intensity E. In Fig. 4 the 1st amplification factor changes from a positive to a negative value as the intensity E increases, but it need not cross between positive and negative; it is sufficient that the 1st amplification factor does not increase as the intensity E increases.
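Purely as an illustration, a gain curve with the general shape of Fig. 4 (+6 dB for quiet input, -6 dB for loud input, non-increasing in between) might be written as follows; the breakpoint values below are assumptions, not values taken from the embodiment:

```python
def first_amplification_factor_db(intensity_db: float) -> float:
    """Non-increasing gain curve in the spirit of Fig. 4 (breakpoints assumed)."""
    low, high = -60.0, -20.0          # assumed intensity breakpoints in dB
    if intensity_db <= low:
        return 6.0                    # amplify quiet signals
    if intensity_db >= high:
        return -6.0                   # attenuate loud signals
    # linear transition from +6 dB down to -6 dB between the breakpoints
    frac = (intensity_db - low) / (high - low)
    return 6.0 - 12.0 * frac
```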

The correction coefficient generation unit 60 generates the 1st corrected amplification factor by smoothing the temporal variation of the 1st amplification factor generated as described above. The 1st time constant τ used for this smoothing is determined by the time constant determination unit 56; the determination method is described with reference to Fig. 5. Fig. 5 is a flowchart showing an example of the method of determining the 1st time constant τ according to the present embodiment. As shown in Fig. 5, the time constant determination unit 56 first obtains the 1st rise time constant ATT and the 1st fall time constant REL from the 1st setting unit 42. The time constant determination unit 56 also obtains the 1st amplification factor g(t) at the current time from the coefficient generation unit 54 and the 1st corrected amplification factor mG(t - T) at the time one unit processing period T earlier from the correction coefficient generation unit 60 (S12).

Next, the time constant determination unit 56 compares the 1st amplification factor g(t) with the 1st corrected amplification factor mG(t - T) (S14). If g(t) < mG(t - T) (Yes in S14), the 1st amplification factor at the current time is smaller than the past value, meaning that the intensity E at the current time is greater than it was one unit processing period T earlier, that is, an attack (rise) state. In this case, the time constant determination unit 56 selects the 1st rise time constant ATT as the 1st time constant τ used for the smoothing processing (S16). Otherwise (No in S14), the time constant determination unit 56 selects the 1st fall time constant REL (S18). The time constant determination unit 56 outputs the 1st time constant τ determined in this way to the correction coefficient generation unit 60 (S20). Using this 1st time constant τ, the correction coefficient generation unit 60 smooths the temporal variation of the 1st amplification factor and generates the 1st corrected amplification factor.
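Combining the selection of Fig. 5 with the smooth_gain function sketched after expressions (1) and (2), one per-period update might look like the following; again this is an illustrative sketch under the same naming assumptions, not the embodiment itself:

```python
def update_corrected_gain(mg_prev: float, g_now: float,
                          att: float, rel: float, T: float) -> float:
    """Fig. 5 in code: pick ATT while the gain is falling (attack side of S14),
    otherwise REL, then apply the one-pole smoother of expressions (1) and (2)."""
    tau = att if g_now < mg_prev else rel       # steps S14, S16, S18
    return smooth_gain(mg_prev, g_now, tau, T)  # step S20: tau drives unit 60
```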

In the present embodiment, the 1st rise time constant ATT and the 1st fall time constant REL are set by the 1st setting unit 42. Their specific values are determined, for example, as follows.

First, the 1st rise time constant ATT is set equal to or longer than the temporal resolution of the hearing of the hearing-impaired person who listens to the 1st sound processed by the sound processing device 10, and shorter than the duration of a sound that triggers the recruitment phenomenon in that person. Specifically, the 1st rise time constant ATT is set, for example, to 20 msec or more and less than 200 msec. This numerical range is based on the discussion in the section on the underlying knowledge of the present disclosure. The 1st rise time constant ATT may also be determined to suit the individual hearing-impaired person who listens to the 1st sound processed by the sound processing device 10: the temporal resolution of that person's hearing and the duration of a sound that triggers the recruitment phenomenon in that person may be measured, and the 1st rise time constant ATT determined from the measured values.

Setting the 1st rise time constant to at least the temporal resolution of the hearing-impaired person's hearing in this way prevents the intensity of the sound at the word head from being suppressed abruptly, and setting it shorter than the duration of a sound that triggers the recruitment phenomenon suppresses occurrence of the recruitment phenomenon.

In the conventional technique shown in Fig. 14, the release time constant is several tens to several hundreds of times the attack time constant. In the sound processing device 10 of the present embodiment, however, the 1st rise time constant ATT is set to a large value, so the 1st fall time constant REL does not need to be as large as the values shown in Fig. 14. Specifically, the 1st fall time constant REL may be comparable to the 1st rise time constant ATT (for example, less than 10 times it), or smaller than the 1st rise time constant ATT. For example, the 1st rise time constant ATT may be 50 msec with the 1st fall time constant REL at 200 msec, or the 1st rise time constant ATT may be 100 msec with the 1st fall time constant REL at 80 msec. The 1st fall time constant REL may also be less than 40 msec while the 1st rise time constant ATT is 40 msec or more.

Using the 1st time constant τ set in this way, the 1st control unit 50 generates the 1st corrected amplification factor, and the 1st multiplication unit 44 multiplies the 1st output signal by it to obtain the output signal. The 1st speaker 80 receives this output signal and emits sound in accordance with it, so that the processed sound is provided to the listener, a hearing-impaired person.

The reason the sound processing device 10 configured as described above converts the 1st sound into a word-head-emphasized signal that is easy for a hearing-impaired person to hear is explained with reference to Figs. 6A and 6B.

Fig. 6A is a diagram showing an example of the waveform of a rising portion (in other words, a word-head portion) of the 1st output signal in the present embodiment, together with the magnitude and direction of the 1st corrected amplification factor multiplied by the 1st output signal at each time. In Fig. 6A, arrows pointing away from the time axis indicate that the 1st output signal is amplified by the 1st corrected amplification factor, and arrows pointing toward the time axis indicate that it is attenuated. Fig. 6B is a graph showing the waveform of the signal obtained by multiplying the 1st output signal shown in Fig. 6A by the 1st corrected amplification factor.

On the left side of Fig. 6A, that is, before the sound rises, the intensity E of the 1st output signal is small, so the 1st amplification factor has a large value. With the relationship between the intensity E and the 1st amplification factor shown in Fig. 4, for example, the 1st amplification factor remains at +6 dB until the sound rises. Since the 1st amplification factor is constantly +6 dB, the 1st corrected amplification factor is also +6 dB even after its temporal variation is smoothed. This is why the arrows on the left side of Fig. 6A point away from the time axis, and why, on the left side of Fig. 6B, the 1st output signal is amplified by +6 dB.

Then the moment at which the sound rises is reached. The intensity E of the 1st output signal increases, so the 1st amplification factor decreases; in the example of Fig. 4 it becomes -6 dB.

At this point the intensity E of the 1st output signal has increased rapidly, that is, the signal is in the attack state. Following the method of Fig. 5, the 1st amplification factor g(t) immediately after the abrupt increase in the intensity E is -6 dB, while the 1st corrected amplification factor mG(t - T) one unit processing period earlier is +6 dB, so step S14 of Fig. 5 takes the Yes branch (the attack side) and the 1st rise time constant ATT is selected as the 1st time constant τ.

Here the 1st rise time constant ATT is set to a value larger than the temporal resolution of the hearing-impaired person, for example about 40 msec. A time constant of 40 msec means that the 1st corrected amplification factor needs about 40 msec to cover 63% of the gap between its current value and the target value. In the example of Fig. 4, the 1st amplification factor was +6 dB one unit processing period before the present time and is -6 dB at the present time, so a time constant of 40 msec means that about 40 msec is needed to move 63% of the way from +6 dB toward the target of -6 dB. In other words, for at least about 40 msec the amplification factor stays above the target value, so the 1st output signal is emphasized. This is why the arrows in Fig. 6A continue to point away from the time axis for a while after the 1st output signal rises.
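As a quick numerical check of the 63% statement above, using the smoother of expressions (1) and (2) with assumed values T = 1 msec and ATT = 40 msec, and a gain step from +6 dB to -6 dB:

```python
import math

T_ms, att_ms = 1.0, 40.0                 # assumed: 1 msec blocks, ATT = 40 msec
a = att_ms / (att_ms + T_ms)             # expression (1)
b = 1.0 - a                              # expression (2)

mg, target = 6.0, -6.0                   # dB: gain before the rise, gain demanded after it
for _ in range(40):                      # simulate 40 unit processing periods (40 msec)
    mg = a * mg + b * target

covered = (6.0 - mg) / (6.0 - target)    # fraction of the 12 dB gap covered
print(f"after 40 msec: mG = {mg:.2f} dB ({covered:.0%} of the gap covered)")
print(f"theoretical 1 - exp(-1) = {1 - math.exp(-1):.0%}")
```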

As a result, the waveform of the 1st output signal multiplied by the 1st corrected amplification factor has an emphasized rising portion (in other words, an emphasized word head), as shown in Fig. 6B. In the present embodiment, because the 1st rise time constant ATT is larger than the temporal resolution of the hearing-impaired person, the word head is emphasized long enough for even a hearing-impaired person to perceive the sound clearly. If the 1st rise time constant ATT is too large, however, the recruitment phenomenon occurs and the sound becomes unpleasant for the hearing-impaired person. Since a loud sound sustained for 200 msec is known to trigger the recruitment phenomenon, the 1st rise time constant must be less than 200 msec.

The sound processing device 10 according to the present embodiment includes the rise emphasis unit 40, which receives the 1st output signal, a signal of the audio frequency band, and emphasizes its rising portion for a predetermined time. The predetermined time is equal to or longer than the temporal resolution of the hearing of a hearing-impaired person and shorter than the duration of a sound that triggers the recruitment phenomenon in the hearing-impaired person. The word head is therefore emphasized for a length of time that the hearing-impaired person can perceive, while the emphasis is short enough not to cause discomfort from the recruitment phenomenon, so a sound that is easy for a hearing-impaired person to hear is realized.

Further, as described above, the 1st rise time constant may be larger than the 1st fall time constant. Making the 1st rise time constant large in this way suppresses unnatural fluttering of the sound even when the 1st fall time constant is small.

[Results of the demonstration experiment]

Next, the results of a demonstration experiment performed using the sound processing device 10 according to the present embodiment are described with reference to Fig. 7. Fig. 7 is a graph showing the results of this demonstration experiment for 23 subjects. The horizontal axis of Fig. 7 represents the individual subjects, sorted in ascending order of correct answer rate; the vertical axis represents the correct answer rate in each test condition.

In this experiment, a speech intelligibility test was performed using the 67-S word list in accordance with the speech audiometry method established by the Japan Audiological Society in 2003. Specifically, each subject listened to monosyllables from the 67-S word list and wrote down what was heard in hiragana, and the rate of correct answers was examined. The subjects were 3 persons in their 60s, 18 in their 70s, and 3 in their 80s, for a total of 23. The test was performed under two conditions: in one, the subject listened to the monosyllables without processing; in the other, the subject listened to the monosyllables processed by the sound processing device 10 of the present embodiment, with the 1st rise time constant ATT and the 1st fall time constant REL set to 40 msec and 20 msec, respectively.

In Fig. 7, the correct answer rate when each subject listened to the unprocessed sound and the correct answer rate when the subject listened to the sound processed by the sound processing device 10 according to the present embodiment are indicated by black triangles and black dots, respectively, and the difference between the two rates is indicated by an arrow.

As shown in Fig. 7, the correct answer rate for the sound processed by the sound processing device 10 according to the present embodiment is higher than that for the unprocessed sound, not only for subjects with mild hearing loss and high correct answer rates but also for subjects with significant hearing loss and low correct answer rates. The experiment thus confirmed that the sound processing device 10 of the present embodiment can output a sound that is easy to hear.

(Embodiment 2)

Next, the sound processing device according to Embodiment 2 will be described. As described above, in the sound processing device 10 according to Embodiment 1 the word head is reliably emphasized immediately after the rise (attack) and then brought down relatively gradually toward the target level according to the 1st rise time constant. To achieve this, as shown in Fig. 4, the coefficient generation unit 54 generates a large 1st amplification factor when the intensity E of the 1st output signal is small, and the time constant used for smoothing when the 1st amplification factor decreases is made large, so that the amplitude of the signal is reliably raised. In this case, however, low-intensity noise in the 1st output signal is also amplified; for example, the 1st output signal on the left side of Fig. 6B, that is, the signal corresponding to noise before the speaker starts speaking, is amplified.

In view of this, the present embodiment describes a sound processing device capable of suppressing such noise. It differs from the sound processing device 10 of Embodiment 1 in the configuration of the preprocessing unit. The following description focuses on the differences from the sound processing device 10 of Embodiment 1.

First, the functional configuration of the sound processing device according to the present embodiment will be described with reference to Figs. 8 and 9. Fig. 8 is a block diagram showing an example of the functional configuration of the sound processing device 110 according to the present embodiment. Fig. 9 is a block diagram showing an example of the functional configuration of the preprocessing unit 130 according to the present embodiment.

As shown in Fig. 8, the sound processing device 110 according to the present embodiment includes, as functional units, the 1st microphone 20, a preprocessing unit 130, and the rise emphasis unit 40. In the present embodiment, the sound processing device 110 further includes the 1st speaker 80. As shown in Fig. 8, the sound processing device 110 differs from the sound processing device 10 according to Embodiment 1 in the configuration of the preprocessing unit 130.

As shown in Fig. 9, the preprocessing unit 130 of the present embodiment includes the filter 32, a preprocessing control unit 133, and a preprocessing multiplication unit 138. As in Embodiment 1, the filter 32 is an example of a 1st filter that extracts the signal of the audio frequency band from the 1st electric signal output from the 1st microphone 20.

The preprocessing control unit 133 is a processing unit that generates the preprocessing amplification factor from the output signal of the filter 32. As shown in Fig. 9, the preprocessing control unit 133 includes an intensity detection unit 134, a coefficient generation unit 135, and a correction coefficient generation unit 136.

The intensity detection unit 134 detects the intensity E of the output signal of the filter 32, in the same manner as the intensity detection unit 52 of Embodiment 1, and outputs the detected intensity E to the coefficient generation unit 135.

The coefficient generation unit 135 is a processing unit that generates a preprocessing amplification factor. The preprocessing amplification factor is a coefficient that varies according to the intensity E of the output signal of the filter 32, and is smaller when the intensity E is below a predetermined threshold value than when the intensity E is above that threshold value. The preprocessing amplification factor will be described with reference to fig. 10. Fig. 10 is a graph showing an example of the relationship between the preprocessing amplification factor and the intensity E of the output signal of the filter 32 according to the present embodiment. As shown in fig. 10, the preprocessing amplification factor increases as the intensity E of the output signal of the filter 32 increases. For example, the preprocessing amplification factor may be -6 dB in the range of the intensity E where the 1 st amplification factor shown in fig. 3 becomes +6 dB, and 0 dB in the range of the intensity E where the 1 st amplification factor shown in fig. 3 becomes -6 dB. In the example shown in fig. 10, the predetermined threshold value is -48 dB. When the intensity E is smaller than the -48 dB threshold value, the preprocessing amplification factor decreases as the intensity E decreases, and reaches -6 dB at and below a predetermined lower intensity. When the intensity E is larger than the -48 dB threshold value, the preprocessing amplification factor is 0 dB.
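
Read literally from the description of fig. 10, the preprocessing amplification factor could be implemented as the piecewise curve below. The -48 dB threshold and the -6 dB / 0 dB end values come from the description; the lower corner at -60 dB and the linear segment between the two corners are assumptions.

```python
def preprocess_gain_db(intensity_db, threshold_db=-48.0, floor_db=-60.0):
    """Preprocessing amplification factor versus intensity E (cf. fig. 10).
    0 dB above the -48 dB threshold, -6 dB for very weak signals; the
    floor_db corner and the linear slope in between are assumed."""
    if intensity_db >= threshold_db:
        return 0.0
    if intensity_db <= floor_db:
        return -6.0
    # linear interpolation between (floor_db, -6 dB) and (threshold_db, 0 dB)
    return -6.0 * (threshold_db - intensity_db) / (threshold_db - floor_db)

# e.g. preprocess_gain_db(-54.0) gives -3.0 dB, halfway between the two corners
```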

The correction coefficient generation unit 136 is a processing unit that corrects the preprocessing amplification factor by smoothing the temporal variation of the preprocessing amplification factor. The correction coefficient generation unit 136 outputs the corrected preprocessing amplification coefficient to the preprocessing multiplication unit 138. The correction coefficient generation unit 136 is not an essential component of the audio processing device 110. In other words, the preprocessing amplification factor generated by the coefficient generation unit 135 may be directly input to the preprocessing multiplication unit 138.

The preprocessing multiplication unit 138 is a processing unit that multiplies the output signal of the filter 32 by the preprocessing amplification factor. The preprocessing multiplication unit 138 outputs the signal obtained by this multiplication to the rising edge emphasis unit 40 as the 1 st output signal.

As described above, the preprocessing unit 130 of the present embodiment outputs a 1 st output signal in which components with a low intensity E are attenuated. Therefore, the low-intensity noise component that would otherwise be amplified in the rising edge emphasis unit 40 can be reduced while hardly weakening the emphasis of the prefix portion.

(embodiment mode 3)

Next, an audio processing device according to embodiment 3 will be described. The audio processing device of the present embodiment includes a processing unit that processes the 1 st sound and a processing unit that processes the 2 nd sound. The following description of the audio processing device of the present embodiment focuses on the differences from the audio processing device 10 according to embodiment 1.

First, a functional configuration of the audio processing device according to the present embodiment will be described with reference to fig. 11. Fig. 11 is a block diagram showing an example of the functional configuration of the audio processing device 210 according to the present embodiment.

As shown in fig. 11, the audio processing device 210 includes, as functional units, a 1 st microphone 20, a preprocessing unit 30, a rising edge emphasis unit 40, a 2 nd microphone 220, and a compression unit 240. In the present embodiment, the audio processing device 210 further includes the 1 st speaker 80 and the 2 nd speaker 280. For example, by providing the 2 nd microphone 220 in the vicinity of the 1 st speaker 80 and the 2 nd speaker 280 in the vicinity of the 1 st microphone 20, the sound processing device 210 can process sound in both directions, i.e., in the direction from the 1 st microphone 20 to the 1 st speaker 80 and in the direction from the 2 nd microphone 220 to the 2 nd speaker 280. In other words, the sound processing device 210 of the present embodiment can process, in both directions, the sounds uttered in a dialogue between the 1 st user who utters the 1 st sound and the 2 nd user who utters the 2 nd sound.

The 1 st microphone 20, the preprocessing unit 30, the rising edge emphasis unit 40, and the 1 st speaker 80 according to the present embodiment have the same configurations as the 1 st microphone 20, the preprocessing unit 30, the rising edge emphasis unit 40, and the 1 st speaker 80 according to embodiment 1.

The 2 nd microphone 220 is a microphone that converts the 2 nd sound into a 2 nd electric signal. In the present embodiment, the voice input to the 2 nd microphone 220 is not limited to the voice of a speaker located near the microphone. For example, the 2 nd microphone 220 may be designed so that the voice of a speaker who has difficulty approaching the 2 nd microphone 220 is input as the 2 nd sound. In this case, in the 2 nd sound input to the 2 nd microphone 220, the volume of the surrounding noise may be relatively high with respect to the volume of the speaker's voice.

The compression unit 240 is a processing unit that compresses the dynamic range of the 2 nd output signal corresponding to the 2 nd electric signal output from the 2 nd microphone 220. In the present embodiment, the 2 nd output signal is the same as the 2 nd electric signal, but the 2 nd output signal may instead be a signal obtained by processing the 2 nd electric signal in a preprocessing unit similar to the preprocessing unit of embodiment 1 or embodiment 2. In other words, the audio processing device 210 according to the present embodiment may further include a preprocessing unit that generates the 2 nd output signal by processing the 2 nd electric signal from the 2 nd microphone 220 and outputs the 2 nd output signal to the compression unit 240. The compression unit 240 includes a 2 nd control unit 250 and a 2 nd setting unit 242. In the present embodiment, the compression unit 240 further includes a 2 nd multiplication unit 244.

The 2 nd control unit 250 is a processing unit that generates a 2 nd amplification factor, which compresses the dynamic range of the intensity of the 2 nd output signal by being multiplied by the 2 nd output signal, and that generates a 2 nd correction amplification factor by smoothing the 2 nd amplification factor with a 2 nd time constant. Regarding the 2 nd time constant, the 2 nd time constant is a 2 nd rising time constant when the intensity of the 2 nd output signal increases, and a 2 nd falling time constant when the intensity of the 2 nd output signal decreases. The 2 nd control unit 250 has the same configuration as the 1 st control unit 50. Here, the 1 st rising time constant set by the 1 st setting unit 42 is larger than the 2 nd rising time constant set by the 2 nd setting unit 242, and the 1 st falling time constant set by the 1 st setting unit 42 is smaller than the 2 nd falling time constant set by the 2 nd setting unit 242.
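
A hedged sketch of how the two paths can share the same asymmetric smoother while differing only in their time constants is shown below. The numeric values are illustrative assumptions; the description fixes only the inequalities between the 1 st and 2 nd constants.

```python
import numpy as np

def asymmetric_smoother(target_gain_db, rise_ms, fall_ms, fs=1000.0):
    """Smooth a per-frame target amplification factor with different time
    constants depending on whether the signal intensity is rising (target
    gain below the current smoothed gain) or falling."""
    a_rise = np.exp(-1000.0 / (rise_ms * fs))
    a_fall = np.exp(-1000.0 / (fall_ms * fs))
    out = np.empty(len(target_gain_db))
    out[0] = target_gain_db[0]
    for n in range(1, len(target_gain_db)):
        a = a_rise if target_gain_db[n] < out[n - 1] else a_fall
        out[n] = a * out[n - 1] + (1.0 - a) * target_gain_db[n]
    return out

# Illustrative values only; the description fixes just the inequalities
# rise1 > rise2 and fall1 < fall2.
RISE_1, FALL_1 = 100.0, 10.0   # 1st path: slow attack emphasizes the prefix of the 1st sound
RISE_2, FALL_2 = 5.0, 200.0    # 2nd path: fast attack catches loud bursts in the 2nd sound

# Target amplification factor: 0 dB, then -12 dB during a loud noise burst, then 0 dB.
burst = np.concatenate([np.full(100, 0.0), np.full(100, -12.0), np.full(100, 0.0)])
print(asymmetric_smoother(burst, RISE_1, FALL_1)[105])  # about -0.7 dB: reacts slowly
print(asymmetric_smoother(burst, RISE_2, FALL_2)[105])  # about -8.4 dB: clamps the burst quickly
```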

The 2 nd setting unit 242 is a setting unit that sets the 2 nd rising time constant and the 2 nd falling time constant. The 2 nd setting unit 242 outputs the set 2 nd rising time constant and the set 2 nd falling time constant to the 2 nd control unit 250. Accordingly, the 2 nd rising time constant and the 2 nd falling time constant can be set to desired values. The 2 nd setting unit 242 is not an essential component of the audio processing device 210 according to the present embodiment. For example, the 2 nd rising time constant and the 2 nd falling time constant may be set in advance by the 2 nd control unit 250.

The 2 nd multiplication unit 244 is a processing unit that multiplies the 2 nd correction amplification factor by the 2 nd output signal. The 2 nd multiplication unit 244 outputs a signal obtained by multiplying the 2 nd correction amplification factor by the 2 nd output signal to the 2 nd speaker 280.

Since the sound processing device 210 of the present embodiment has the above-described configuration, it can process sound in both directions, i.e., in the direction from the 1 st microphone 20 to the 1 st speaker 80 and in the direction from the 2 nd microphone 220 to the 2 nd speaker 280. In the present embodiment, since the 1 st rising time constant set by the 1 st setting unit 42 is larger than the 2 nd rising time constant set by the 2 nd setting unit 242, the prefix of the 1 st sound is emphasized as in the sound processing device 10 according to embodiment 1. Accordingly, when the sounds uttered in a dialogue between the 1 st user, who is not hearing-impaired and utters the 1 st sound, and the 2 nd user, who is a hearing-impaired person and utters the 2 nd sound, are processed in both directions, the 2 nd user can listen, from the 1 st speaker 80, to an easy-to-hear sound in which the prefix portions are emphasized.

On the other hand, when the 2 nd user utters the 2 nd sound without approaching the 2 nd microphone 220, the 2 nd sound may contain relatively much noise other than the 2 nd user's voice. Even in such a case, since the dynamic range of the 2 nd output signal corresponding to the 2 nd sound is compressed by the compression unit 240, loud noise that may be included in the 2 nd sound can be suppressed.

Because the sound processing device 210 according to the present embodiment achieves the effects described above, it can be applied to, for example, a nurse call system. In this case, the nurse uses the 1 st microphone 20 and the 2 nd speaker 280, and the patient, who is a hearing-impaired person, uses the 2 nd microphone 220 and the 1 st speaker 80, so that the patient can easily hear the nurse's voice. Further, the patient may be unable to approach the 2 nd microphone 220; a sound processing device 210 that can still be used by such a patient can be realized by, for example, setting the sensitivity of the 2 nd microphone 220 higher than the sensitivity of the 1 st microphone 20.

(modification example etc.)

The sound processing device of the present disclosure has been described above based on the embodiments, but the present disclosure is not limited to these embodiments. Various modifications to the embodiments that a person skilled in the art may conceive, and forms obtained by combining some of the components of the embodiments, are also included in the scope of the present disclosure as long as they do not depart from the spirit of the present disclosure.

For example, in the above embodiments, the 1 st time constant is determined based on the 1 st amplification factor and the 1 st correction amplification factor, but the method of determining the 1 st time constant is not limited thereto. For example, the 1 st time constant may be determined according to whether the 1 st output signal is in the initial state. Such a method will be described with reference to fig. 12. Fig. 12 is a diagram for explaining a method of determining the 1 st time constant in the modification. Fig. 12 shows an outline of the 1 st output signal SL in which the energy of the signal changes in a stepwise manner with time, and windows W1 and W2 indicating a detection period for detecting whether the 1 st output signal is in the initial state, with the horizontal axis being a time axis.

As shown in fig. 12, the energy in the window is detected by the window W1 and the window W2 following the window W1, and it is determined whether the 1 st output signal rises (in other words, is in the initial state). For example, when the positions of the windows W1 and W2 are as shown in waveforms (a) to (d), (j), and (k) in fig. 12, since there is no energy change between the two windows, it is determined that the 1 st output signal does not rise (i.e., is not the start). In addition, when the positions of the windows W1 and W2 are as shown in waveforms (e) to (i) of fig. 12, the energy detected in the window W2 is larger than the energy detected in the window W1, and therefore it can be determined that the 1 st output signal is rising (i.e., is the start).

Thus, the 1 st time constant can be determined according to whether the 1 st output signal is in the initial state. When such a method is used, the audio processing apparatus may be provided with a memory for temporarily storing the energy detected by the window W1.
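
A minimal sketch of this two-window decision is given below. The window length, the 6 dB decision margin, and the function name are assumptions; fig. 12 only specifies that the energy detected in W2 exceeds the energy detected in W1 at a rise.

```python
import numpy as np

def is_start(prev_window, curr_window, margin_db=6.0):
    """Return True if the 1st output signal is rising (i.e., at its start),
    judged from the energies of two consecutive windows W1 and W2.
    The margin_db threshold is an assumed decision margin."""
    e1 = float(np.sum(np.square(prev_window)))  # energy in W1, kept in a small memory
    e2 = float(np.sum(np.square(curr_window)))  # energy in the following window W2
    return 10.0 * np.log10((e2 + 1e-12) / (e1 + 1e-12)) > margin_db

# Usage: while is_start(...) holds, smooth the 1st amplification factor with the
# 1st rising time constant; otherwise use the 1st falling time constant.
```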

The forms shown below may also be included within the scope of one or more aspects of the present disclosure.

(1) The hardware configuration of the components constituting the respective audio processing devices is not particularly limited, and may be configured by a computer, for example. An example of such a hardware configuration will be described with reference to fig. 13. Fig. 13 is a diagram showing an example of a hardware configuration of a computer 1000 in which the functions of the audio processing device according to the present disclosure are realized by software.

As shown in fig. 13, the computer 1000 includes an input device 1001, an output device 1002, a CPU 1003, a built-in memory 1004, a RAM 1005, and a bus 1009. The input device 1001, the output device 1002, the CPU 1003, the built-in memory 1004, and the RAM 1005 are connected to one another by the bus 1009.

The input device 1001 is a device that serves as a user interface, such as input buttons or a touch panel, and accepts operations by the user. In addition to touch operations, the input device 1001 may be configured to accept voice operations and remote operations using a remote controller or the like. The input device 1001 may include microphones corresponding to the 1 st microphone 20 and the 2 nd microphone 220.

The output device 1002 is a device that outputs signals from the computer 1000, and may serve not only as a signal output terminal but also as a user interface such as a speaker or a display. The output device 1002 may include speakers corresponding to the 1 st speaker 80 and the 2 nd speaker 280.

The built-in memory 1004 is, for example, a flash memory. The built-in memory 1004 may store at least one of a program for realizing the functions of the audio processing device and an application using the functional configuration of the audio processing device.

The RAM 1005 is a random access memory, and is used for storing data and the like when a program or an application is executed.

The CPU 1003 is a central processing unit. The CPU 1003 copies a program or an application stored in the built-in memory 1004 to the RAM 1005, and sequentially reads out and executes the commands included in the program or the application from the RAM 1005.

For example, the computer 1000 performs, on the 1 st electric signal and the 2 nd electric signal converted into digital signals, the same processing as the preprocessing unit, the rising edge emphasis unit, and the compression unit of each of the above embodiments.

(2) A part of the components constituting each of the audio processing devices may be configured as a single system LSI (Large Scale Integration). A system LSI is an ultra-multifunctional LSI manufactured by integrating a plurality of components on one chip, and is specifically a computer system including a microprocessor, a ROM (Read Only Memory), a RAM, and the like. The RAM stores a computer program. The system LSI achieves its functions by the microprocessor operating in accordance with the computer program.

(3) A part of the components constituting the sound processing device may be an IC card or a single module that can be attached to and detached from each device. The IC card or the module is a computer system including a microprocessor, a ROM, a RAM, and the like. The IC card or the module may include the above-described ultra-multifunctional LSI. The IC card or the module achieves its functions by the microprocessor operating in accordance with a computer program. The IC card or the module may have tamper resistance.

(4) Further, a part of the components constituting the audio processing device may be realized as the computer program or the digital signal recorded on a computer-readable recording medium, such as a flexible disk, a hard disk, a CD-ROM, an MO, a DVD-ROM, a DVD-RAM, a BD (Blu-ray (registered trademark) Disc), or a semiconductor memory. It may also be the digital signal recorded on such a recording medium.

Further, a part of the components constituting the audio processing device may be configured to transmit the computer program or the digital signal via an electric communication line, a wireless or wired communication line, a network typified by the internet, a data broadcast, or the like.

(5) The present disclosure may be the methods described above. It may also be a computer program that realizes these methods by a computer, or a digital signal constituted by the computer program.

(6) The present disclosure may be a computer system including a microprocessor and a memory, the memory storing the computer program, and the microprocessor operating according to the computer program.

(7) The program or the digital signal may be recorded on the recording medium and transferred, or transferred via the network or the like, so that it can be executed by another independent computer system.

(8) The above embodiments and the above modifications may be combined, respectively.

The sound processing device according to the present disclosure can provide sound that a hearing-impaired person can easily hear, and can therefore be used in a doorbell intercom in a house where a hearing-impaired person lives, in a nurse call system in a hospital, and the like. It can also be applied to a television viewed by a hearing-impaired person, making the sound of television programs easier to hear.

Description of the symbols

10, 110, 210 sound processing device

20 1 st microphone

30, 130 preprocessing unit

32 filter

40 rising edge emphasis unit

42 1 st setting unit

44 1 st multiplication unit

50 1 st control unit

52, 134 intensity detection unit

54, 135 coefficient generation unit

56 time constant determination unit

60, 136 correction coefficient generation unit

62, 68 multiplier

64 adder

66 delay element

80 1 st speaker

133 preprocessing control unit

138 preprocessing multiplication unit

220 2 nd microphone

240 compression unit

242 2 nd setting unit

244 2 nd multiplication unit

250 2 nd control unit

280 2 nd speaker

1000 computer

1001 input device

1002 output device

1003 CPU

1004 built-in memory

1005 RAM

1009 bus
