Decoding device, encoding device, methods thereof, and program
Abstract: This technology, "Decoding device, encoding device, methods thereof, and program", was designed by 杉浦亮介, 镰本优, and 守谷健弘 on 2018-12-03. A decoding device includes: a band spreading unit (25) that obtains a decoded spread spectrum sequence by arranging samples, based on K samples included in the frequency-domain sample sequence obtained by decoding, on the higher-band side of that sample sequence; and a fricative adjustment canceling unit (23) that, when input information indicating whether or not the sound is fricative indicates that the sound is fricative, obtains, as the spectrum sequence of the decoded sound signal, a result of swapping all or a part of the low-band-side frequency sample sequence located below a predetermined frequency in the decoded spread spectrum sequence with the same number of samples from all or a part of the high-band-side frequency sample sequence located above the predetermined frequency in the decoded spread spectrum sequence.
1. A decoding apparatus, comprising:
a decoding unit that decodes a spectrum code, supplied in frame units of a predetermined time interval and having no bits allocated to a part of the high-band side, to obtain a frequency-domain sample sequence;
a band spreading unit configured to obtain a decoded spread spectrum sequence by arranging samples, based on K samples included in the frequency-domain sample sequence obtained by the decoding unit, on the higher-band side of that sample sequence, where K is an integer of 2 or more; and
a fricative adjustment canceling unit configured to, when input information indicating whether or not the sound is fricative indicates that the sound is fricative, obtain, as a spectrum sequence of a decoded sound signal, a result of swapping all or a part of a low-band-side frequency sample sequence located below a predetermined frequency in the decoded spread spectrum sequence obtained by the band spreading unit with the same number of samples from all or a part of a high-band-side frequency sample sequence located above the predetermined frequency in the decoded spread spectrum sequence, and to, otherwise, obtain the decoded spread spectrum sequence obtained by the band spreading unit as it is as the spectrum sequence of the decoded sound signal.
2. The decoding device as set forth in claim 1,
wherein the band spreading unit obtains a set of K band spreading gains by decoding a band spreading gain code, and obtains the decoded spread spectrum sequence by arranging K samples, obtained by multiplying K samples included in the frequency-domain sample sequence obtained by the decoding unit by the K band spreading gains, on the higher-band side of that sample sequence.
3. The decoding device as set forth in claim 2,
wherein the band spreading unit stores a plurality of codes, a fricative gain candidate vector corresponding to each of the codes, and a non-fricative gain candidate vector corresponding to each of the codes,
each of the fricative gain candidate vectors and the non-fricative gain candidate vectors contains K gain candidate values, and
the processing by which the band spreading unit decodes the band spreading gain code to obtain the set of K band spreading gains is as follows: when the input information indicating whether or not the sound is fricative indicates that the sound is fricative, the K gain candidate values included in the fricative gain candidate vector corresponding to the code identical to the band spreading gain code, among the plurality of fricative gain candidate vectors, are set as the set of K band spreading gains; otherwise, the K gain candidate values included in the non-fricative gain candidate vector corresponding to the code identical to the band spreading gain code, among the plurality of non-fricative gain candidate vectors, are set as the set of K band spreading gains.
4. A decoding device for decoding a spectrum code in a frame unit of a predetermined time interval to obtain a spectrum sequence of a decoded speech signal, comprising:
a decoding unit configured to, when input information indicating whether or not the sound is fricative indicates that the sound is fricative, decode the spectrum code as a code in which bits are not allocated to a part of the low-band side to obtain a frequency-domain spectrum sequence, and, otherwise, decode the spectrum code as a code in which bits are not allocated to a part of the high-band side to obtain a frequency-domain spectrum sequence; and
a fricative corresponding band extension unit configured to, when the input information indicating whether or not the sound is fricative indicates that the sound is fricative, obtain the spectrum sequence of the decoded sound signal by performing band extension toward the low-band side on the frequency-domain spectrum sequence obtained by the decoding unit, and, otherwise, obtain the spectrum sequence of the decoded sound signal by performing band extension toward the high-band side on the frequency-domain spectrum sequence obtained by the decoding unit.
5. An encoding device including an encoding unit that encodes a sample string of frequencies corresponding to a sound signal in units of frames in a predetermined time interval by an encoding process in which bits are not allocated to a part of a high-band side, thereby obtaining a spectrum code, the encoding device comprising:
a fricative determination unit that determines whether or not the sound signal is fricative sound; and
a fricative adjustment unit that, when the fricative determination unit determines that the sound is fricative, obtains, as an adjusted spectrum sequence, a result of swapping all or a part of a low-band-side spectrum sequence located below a predetermined frequency in a spectrum sequence of the sound signal with the same number of samples from all or a part of a high-band-side spectrum sequence located above the predetermined frequency in the spectrum sequence, and, otherwise, obtains the spectrum sequence corresponding to the sound signal as it is as the adjusted spectrum sequence,
the encoding unit encodes the adjusted spectrum sequence obtained by the fricative adjustment unit as a sample string of frequencies corresponding to the speech signal to obtain a spectrum code,
the encoding device further includes:
a band spreading gain encoding unit that stores a plurality of codes and a gain candidate vector corresponding to each of the codes, each gain candidate vector including K gain candidate values, and that obtains and outputs, as a band spreading gain code, the code corresponding to the gain candidate vector that minimizes an error between (i) a sequence of absolute values of K values obtained by multiplying K adjusted spectra to which bits are allocated by the encoding unit in the adjusted spectrum sequence by the K gain candidate values included in the gain candidate vector and (ii) a sequence of absolute values of K adjusted spectra to which bits are not allocated by the encoding unit in the adjusted spectrum sequence, where K is an integer of 2 or more.
6. The encoding device as set forth in claim 5,
wherein the band spreading gain encoding unit stores a plurality of codes, a fricative gain candidate vector corresponding to each of the codes, and a non-fricative gain candidate vector corresponding to each of the codes, and
the band spreading gain encoding unit uses a fricative gain candidate vector as the gain candidate vector when the fricative determination unit determines that the sound is a fricative sound, and uses a non-fricative gain candidate vector as the gain candidate vector otherwise.
7. The encoding apparatus according to claim 5 or 6,
wherein the fricative determination unit determines that the sound signal is a fricative sound when an index, which takes a larger value as the ratio of the average energy of the high-band-side spectrum to the average energy of the low-band-side spectrum in the spectrum sequence of the frame becomes larger, is larger than a predetermined threshold, or is equal to or larger than the threshold.
8. The encoding apparatus according to claim 5 or 6,
wherein the fricative determination unit determines that the sound signal is a fricative sound when, among a plurality of frames including the frame, the number of frames for which an index, which takes a larger value as the ratio of the average energy of the high-band-side spectrum to the average energy of the low-band-side spectrum in the spectrum sequence becomes larger, is larger than a predetermined threshold (or is equal to or larger than the threshold) is larger than, or is not smaller than, the number of frames for which it is not.
9. A decoding method, comprising:
a decoding step of decoding a spectrum code, supplied in frame units of a predetermined time interval and having no bits allocated to a part of the high-band side, to obtain a frequency-domain sample sequence;
a band spreading step of obtaining a decoded spread spectrum sequence by arranging samples, based on K samples included in the frequency-domain sample sequence obtained in the decoding step, on the higher-band side of that sample sequence, where K is an integer of 2 or more; and
a fricative adjustment canceling step of, when input information indicating whether or not the sound is fricative indicates that the sound is fricative, obtaining, as a spectrum sequence of a decoded sound signal, a result of swapping all or a part of a low-band-side frequency sample sequence located below a predetermined frequency in the decoded spread spectrum sequence obtained in the band spreading step with the same number of samples from all or a part of a high-band-side frequency sample sequence located above the predetermined frequency in the decoded spread spectrum sequence, and, otherwise, obtaining the decoded spread spectrum sequence obtained in the band spreading step as it is as the spectrum sequence of the decoded sound signal.
10. A decoding method for decoding a spectrum code in a frame unit of a predetermined time interval to obtain a spectrum sequence of a decoded speech signal, comprising:
a decoding step of, when input information indicating whether or not the sound is fricative indicates that the sound is fricative, decoding the spectrum code as a code in which bits are not allocated to a part of the low-band side to obtain a frequency-domain spectrum sequence, and, otherwise, decoding the spectrum code as a code in which bits are not allocated to a part of the high-band side to obtain a frequency-domain spectrum sequence; and
a fricative corresponding band extension step of, when the input information indicating whether or not the sound is fricative indicates that the sound is fricative, performing band extension toward the low-band side on the frequency-domain spectrum sequence obtained in the decoding step to obtain the spectrum sequence of the decoded sound signal, and, otherwise, performing band extension toward the high-band side to obtain the spectrum sequence of the decoded sound signal.
11. An encoding method, comprising: an encoding step of encoding a sample string of frequencies corresponding to a frame-unit speech signal in a predetermined time interval by an encoding process in which bits are not allocated to a part of the high-band side, to obtain a spectrum code, the encoding method further comprising:
a fricative determination step of determining whether or not the sound signal is fricative sound;
a fricative adjustment step of, when the fricative determination step determines that the sound is a fricative sound, obtaining, as an adjusted spectrum sequence, a result of swapping all or a part of a low-band-side spectrum sequence located below a predetermined frequency in a spectrum sequence of the sound signal with the same number of samples from all or a part of a high-band-side spectrum sequence located above the predetermined frequency in the spectrum sequence, and, otherwise, obtaining the spectrum sequence corresponding to the sound signal as it is as the adjusted spectrum sequence,
the encoding step is a step of encoding the adjusted spectrum sequence obtained in the fricative adjustment step as a sample string of frequencies corresponding to the speech signal to obtain a spectrum code,
the encoding method further includes:
a band spreading gain encoding step of storing a plurality of codes and gain candidate vectors corresponding to the codes, each of the gain candidate vectors including K gain candidate values, obtaining a code corresponding to a gain candidate vector in which an error between a sequence of absolute values of K values obtained by multiplying K adjusted spectrums to which bits are allocated in the encoding step in the adjusted spectrum sequence and the K gain candidate values included in the gain candidate vectors and a sequence of absolute values of K adjusted spectrums to which bits are not allocated in the encoding step in the adjusted spectrum sequence is minimized, as a band spreading gain code, and outputting the code, where K is an integer of 2 or more.
12. A program for causing a computer to function as each means of the decoding device of any one of claims 1 to 4.
13. A program for causing a computer to function as each unit of the encoding device according to any one of claims 5 to 8.
Technical Field
The present invention relates to a technique for encoding or decoding a sample string derived from a spectrum of a speech signal in a signal processing technique such as a speech signal encoding technique.
Background
When a sound signal is compression-encoded, the signal is conventionally represented as a spectrum sequence, and bits are allocated to that sequence according to perceptual importance in order to improve compression efficiency. Such allocation is performed, for example, by preferentially allocating bits to the samples of the spectrum sequence that correspond to low frequencies. As a result, a structure may be adopted in which no bits are allocated to the samples corresponding to high frequencies, so that the encoding device encodes no direct information about the high-frequency sample sequence. In the corresponding decoding device, the decoded sound would then be obtained with the high-frequency sample values of the spectrum sequence set to 0. For this reason, a band extension technique such as that described in non-patent document 1 may be used, in which the decoding device outputs, as the decoding result for the high-frequency sample sequence, a copy of the low-frequency sample sequence with its amplitude adjusted. This relies on the fact that a listener has low sensitivity to high frequencies and does not sense anything unnatural as long as a sound an octave apart from the low-frequency content is audible. By reallocating the bits saved in the high band to the low band, the information that matters more to human hearing can be represented with high accuracy. Sound coding is therefore typically designed to allocate more bits to the spectrum at lower frequencies.
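The copy-with-amplitude-adjustment idea described above can be illustrated with a minimal Python sketch. It is not the method of non-patent document 1: the function name `band_extend`, the choice of the highest decoded samples as the copy source, and the per-sample `gains` list are all assumptions made for illustration.

```python
def band_extend(decoded, gains):
    """Append amplitude-adjusted copies of decoded samples above the decoded band.

    decoded: frequency-domain samples that received bits (low band).
    gains:   one gain per copied sample (hypothetical parameter).
    Assumption: the copy source is the highest len(gains) decoded samples.
    """
    k = len(gains)
    if k < 1 or k > len(decoded):
        raise ValueError("number of gains must be between 1 and len(decoded)")
    source = decoded[-k:]
    # Scale each copied sample by its gain and place it on the high-band side.
    extension = [g * s for g, s in zip(gains, source)]
    return list(decoded) + extension


extended = band_extend([1.0, 2.0, 3.0, 4.0], [0.5, 0.25])
```

Here the decoded band keeps its four samples and two scaled copies are appended above it, so the high band is no longer all zeros.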
Disclosure of Invention
Problems to be solved by the invention
With the band extension technique of non-patent document 1, a band-extended sound with little perceptual degradation can be obtained from the decoded sound of a decoding device for most natural sounds. However, some natural sounds, such as fricatives in human speech, have their energy concentrated at high frequencies and substantially no energy at low frequencies. If such a sound signal is encoded by an encoding device that allocates bits as described above, the decoding device produces a decoded sound in which the main frequency components are heavily distorted, particularly at low bit rates. If a band-extended sound is then obtained from that decoded sound by the band extension technique of non-patent document 1, the band-extended sound is perceptually degraded.
Therefore, an object of the present invention is to provide an encoding device that performs compression encoding on the premise that band extension is performed on the decoding side, a decoding device that performs decoding accompanied by band extension, and methods and programs therefor, which reduce the perceptual degradation of sound signals such as fricatives.
Means for solving the problems
A decoding device according to an aspect of the present invention includes: a decoding unit that decodes a spectrum code, supplied in frame units of a predetermined time interval and having no bits allocated to a part of the high-band side, to obtain a frequency-domain sample sequence; a band spreading unit that obtains a decoded spread spectrum sequence by arranging samples, based on K samples included in the frequency-domain sample sequence obtained by the decoding unit, on the higher-band side of that sample sequence, where K is an integer of 2 or more; and a fricative adjustment canceling unit that, when input information indicating whether or not the sound is fricative indicates that the sound is fricative, obtains, as a spectrum sequence of the decoded sound signal, a result of swapping all or a part of a low-band-side frequency sample sequence located below a predetermined frequency in the decoded spread spectrum sequence obtained by the band spreading unit with the same number of samples from all or a part of a high-band-side frequency sample sequence located above the predetermined frequency in the decoded spread spectrum sequence.
A decoding device according to an aspect of the present invention is a decoding device that decodes a spectrum code in frame units of a predetermined time interval to obtain a spectrum sequence of a decoded sound signal, and includes: a decoding unit configured to, when input information indicating whether or not the sound is fricative indicates that the sound is fricative, decode the spectrum code as a code in which bits are not allocated to a part of the low-band side to obtain a frequency-domain spectrum sequence, and, otherwise, decode the spectrum code as a code in which bits are not allocated to a part of the high-band side to obtain a frequency-domain spectrum sequence; and a fricative corresponding band extension unit configured to, when the input information indicates that the sound is fricative, obtain the spectrum sequence of the decoded sound signal by performing band extension toward the low-band side on the spectrum sequence obtained by the decoding unit, and, otherwise, obtain the spectrum sequence of the decoded sound signal by performing band extension toward the high-band side on the spectrum sequence obtained by the decoding unit.
An encoding device according to an aspect of the present invention includes an encoding unit that encodes a frequency-domain sample sequence corresponding to a sound signal in frame units of a predetermined time interval by an encoding process in which bits are not allocated to a part of the high-band side, to obtain a spectrum code. The encoding device includes: a fricative determination unit that determines whether or not the sound signal is a fricative sound; and a fricative adjustment unit that, when the fricative determination unit determines that the sound is fricative, obtains, as an adjusted spectrum sequence, a result of swapping all or a part of a low-band-side spectrum sequence located below a predetermined frequency in the spectrum sequence of the sound signal with the same number of samples from all or a part of a high-band-side spectrum sequence located above the predetermined frequency, and, otherwise, obtains the spectrum sequence corresponding to the sound signal as it is as the adjusted spectrum sequence. The encoding unit encodes the adjusted spectrum sequence obtained by the fricative adjustment unit as the frequency-domain sample sequence corresponding to the sound signal to obtain the spectrum code. The encoding device further includes a band spreading gain encoding unit that stores a plurality of codes and a gain candidate vector corresponding to each of the codes, each gain candidate vector including K gain candidate values, and that obtains and outputs, as a band spreading gain code, the code corresponding to the gain candidate vector that minimizes an error between a sequence of absolute values of K values, obtained by multiplying K adjusted spectra to which bits are allocated by the encoding unit in the adjusted spectrum sequence by the K gain candidate values included in the gain candidate vector, and a sequence of absolute values of K adjusted spectra to which bits are not allocated by the encoding unit in the adjusted spectrum sequence, where K is an integer of 2 or more.
Advantageous Effects of Invention
With the encoding device and the decoding device, encoding and decoding can be performed with less perceptual degradation of sound signals such as fricatives.
Drawings
Fig. 1 is a block diagram showing an example of an encoding device according to the first embodiment.
Fig. 2 is a flowchart showing an example of the encoding method according to the first embodiment.
Fig. 3 is a block diagram showing an example of the decoding apparatus according to the first embodiment.
Fig. 4 is a flowchart showing an example of the decoding method according to the first embodiment.
Fig. 5 is a diagram for explaining an example of the fricative adjustment process.
Fig. 6 is a diagram for explaining an example of the fricative adjustment process.
Fig. 7 is a diagram for explaining an example of the fricative adjustment process.
Fig. 8 is a diagram for explaining an example of the fricative adjustment process.
Fig. 9 is a block diagram showing an example of the encoding device according to the second embodiment.
Fig. 10 is a flowchart showing an example of the encoding method according to the second embodiment.
Fig. 11 is a block diagram showing an example of a decoding device according to the second embodiment.
Fig. 12 is a flowchart showing an example of the decoding method according to the second embodiment.
Fig. 13 is a diagram for explaining an example of the band expansion process and the fricative adjustment cancellation process.
Fig. 14 is a diagram for explaining an example of the band expansion process and the fricative adjustment cancellation process.
Detailed Description
< first embodiment >
The first embodiment is an embodiment that is a premise of a second embodiment that is one embodiment of the present invention.
The system of the first embodiment includes an encoding device and a decoding device. The encoding device encodes a time-domain audio signal input in units of frames of a predetermined time length to obtain a code, and outputs the code. The code output by the encoding apparatus is input to the decoding apparatus. The decoding device decodes the input code and outputs a time-domain audio signal in frame units. The speech signal input to the encoding apparatus is, for example, a speech signal or an acoustic signal obtained by collecting speech or music with a microphone and performing AD conversion. The audio signal output from the decoding device is DA-converted, reproduced by a speaker, and listened to.
Coding device
Referring to fig. 1, a process of the encoding device of the first embodiment is explained. As illustrated in fig. 1, the encoding device of the first embodiment includes: a frequency
In addition, a configuration may be adopted in which a frequency-domain audio signal is input to the encoding device instead of the time-domain audio signal. In the case of this configuration, the encoding device may not include the
[ frequency domain converting part 11]
The frequency
The frequency
[ fricative determination unit 12 (fricative determination device) ]
The frictional
The frictional
When MA is an integer value larger than 1 and smaller than N-1 and MB is an integer value larger than MA and smaller than N, the frictional
The integer value MA may be set so that a sample on the low domain side, which is a calculation target of the low domain side average energy in the fricative
When only some of the samples X_0, …, X_{MA} located on the low-band side are used to calculate the index, one or more samples on the lowest-frequency side of X_0, …, X_{MA} may be used. That is, with α a positive integer less than MA, the average of the absolute values, or the average of the squares, of the sample values X_0, …, X_α may be used as the low-band-side average energy. The value of α is determined in advance, by experiment or the like, so that X_0, …, X_α falls within the range in which the spectrum of a sound other than a fricative can normally exist.
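As a rough illustration of an index of this kind, the following Python sketch compares the average energy of a high-band sample group with that of a low-band sample group and applies a threshold. The band boundaries are assumptions: the low band is taken as X_0, …, X_{MA} and the high band as X_{MB}, …, X_{N-1}, and the names `is_fricative`, `threshold`, and `eps` are hypothetical.

```python
def average_energy(samples):
    """Mean of squared sample values (mean absolute value is the other option mentioned)."""
    return sum(x * x for x in samples) / len(samples)


def is_fricative(spectrum, ma, mb, threshold, eps=1e-12):
    """Return True when the high/low average-energy ratio exceeds the threshold.

    Assumed bands: low = X_0..X_MA, high = X_MB..X_{N-1}.
    eps is a hypothetical guard against division by zero.
    """
    n = len(spectrum)
    if not (1 < ma < n - 1 and ma < mb < n):
        raise ValueError("expected 1 < MA < N-1 and MA < MB < N")
    low = spectrum[: ma + 1]
    high = spectrum[mb:]
    index = average_energy(high) / (average_energy(low) + eps)
    return index > threshold


# A frame whose energy sits in the upper half is flagged; the reverse is not.
high_heavy = [0.1] * 16 + [1.0] * 16
low_heavy = [1.0] * 16 + [0.1] * 16
flags = (is_fricative(high_heavy, 7, 16, 10.0), is_fricative(low_heavy, 7, 16, 10.0))
```

Restricting the low band to X_0, …, X_α with α less than MA, as described above, would only change the slice used for `low`.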
In the encoding process in the
Fig. 5 and 6 show the adjustment of fricatives described later when N is 32 and M is 20An example of the
As indicated by a broken line in fig. 1, the
[ fricative sound adjustment unit 13]
The spectrum sequence X output from the
Let M be an integer greater than 1 and less than N. For example, when the sample group X_0, …, X_{M-1} of samples whose sample numbers are less than M in the spectrum sequence X_0, …, X_{N-1} is taken as the low-band-side spectrum sequence, and the sample group X_M, …, X_{N-1} of samples whose sample numbers are M or more in the spectrum sequence X_0, …, X_{N-1} is taken as the high-band-side spectrum sequence, the adjustment process performed by the
[ example 1 of adjustment processing by the fricative sound adjustment unit 13]
When the fricative decision information indicates that the sound is fricative, the
Step 1-1: In the spectrum sequence X_0, …, X_{N-1}, the sample group of samples whose sample numbers are less than M is taken as the low-band-side spectrum sequence X_0, …, X_{M-1}, and the sample group of samples whose sample numbers are M or more is taken as the high-band-side spectrum sequence X_M, …, X_{N-1}.
Step 1-2: C samples (C is a positive integer) included in the low-band-side spectrum sequence X_0, …, X_{M-1} obtained in Step 1-1 are extracted as the adjustment target samples to be moved to the high-band side.
Step 1-3: C samples included in the high-band-side spectrum sequence X_M, …, X_{N-1} obtained in Step 1-1 are extracted as the adjustment target samples to be moved to the low-band side.
Step 1-4: The adjustment target samples to be moved to the low-band side, extracted from the high-band-side spectrum sequence in Step 1-3, are arranged at the sample positions from which the adjustment target samples to be moved to the high-band side were extracted from the low-band-side spectrum sequence in Step 1-2, and the result is obtained as the low-band-side adjusted spectrum sequence Y_0, …, Y_{M-1}.
Step 1-5: The adjustment target samples to be moved to the high-band side, extracted from the low-band-side spectrum sequence in Step 1-2, are arranged at the sample positions from which the adjustment target samples to be moved to the low-band side were extracted from the high-band-side spectrum sequence in Step 1-3, and the result is obtained as the high-band-side adjusted spectrum sequence Y_M, …, Y_{N-1}.
Step 1-6: The low-band-side adjusted spectrum sequence Y_0, …, Y_{M-1} obtained in Step 1-4 and the high-band-side adjusted spectrum sequence Y_M, …, Y_{N-1} obtained in Step 1-5 are combined to obtain the adjusted spectrum sequence Y_0, …, Y_{N-1}.
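Steps 1-1 to 1-6 amount to an in-place exchange of C low-band samples with C high-band samples. The Python sketch below shows one way this could look. The function name `adjust_fricative` and the position lists `low_pos` and `high_pos` are hypothetical, and the positions used in the usage lines are an assumption, not values read from the figures.

```python
def adjust_fricative(x, m, low_pos, high_pos):
    """Swap C low-band samples with C high-band samples (sketch of Steps 1-1 to 1-6).

    x:        spectrum sequence X_0..X_{N-1}
    m:        boundary M; indices below M are the low band (Step 1-1)
    low_pos:  C distinct indices in [0, M) to move to the high band (Step 1-2)
    high_pos: C distinct indices in [M, N) to move to the low band (Step 1-3)
    """
    low_pos, high_pos = list(low_pos), list(high_pos)
    if len(low_pos) != len(high_pos):
        raise ValueError("the same number C of samples must be taken from each band")
    if any(not 0 <= p < m for p in low_pos) or any(not m <= p < len(x) for p in high_pos):
        raise ValueError("positions must lie in their respective bands")
    y = list(x)
    for lp, hp in zip(low_pos, high_pos):
        y[lp] = x[hp]  # Step 1-4: high-band sample placed at the vacated low position
        y[hp] = x[lp]  # Step 1-5: low-band sample placed at the vacated high position
    return y           # Step 1-6: combined sequence Y_0..Y_{N-1}


# N = 32, M = 20, C = 8, with assumed positions 2..9 and 20..27.
x = list(range(32))
y = adjust_fricative(x, 20, range(2, 10), range(20, 28))
restored = adjust_fricative(y, 20, range(2, 10), range(20, 28))
```

Because the exchange is position for position, applying the same function with the same positions a second time restores the original sequence.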
Fig. 5 shows an example of Steps 1-1 to 1-6 when N is 32, M is 20, and C is 8. The
[ example 2 of adjustment processing by the fricative sound adjustment unit 13]
The frictional
Step 1-4': The samples remaining in the low-band-side spectrum sequence after the adjustment target samples to be moved to the high-band side were extracted in Step 1-2 are packed toward the low-band side, the adjustment target samples to be moved to the low-band side, extracted from the high-band-side spectrum sequence in Step 1-3, are arranged at the sample positions thereby vacated on the high-band side, and the result is obtained as the low-band-side adjusted spectrum sequence Y_0, …, Y_{M-1}.
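A minimal Python sketch of Step 1-4' follows, assuming the samples brought in from the high band keep their ascending frequency order. The function name `low_side_packed` and the position arguments are hypothetical.

```python
def low_side_packed(x, m, low_pos, high_pos):
    """Build Y_0..Y_{M-1} as in Step 1-4' (sketch).

    Remaining low-band samples are packed toward index 0, and the samples
    taken from the high band fill the positions freed at the top of the low band.
    """
    low_pos, high_pos = list(low_pos), list(high_pos)
    if len(low_pos) != len(high_pos):
        raise ValueError("the same number C of samples must be taken from each band")
    removed = set(low_pos)
    remaining = [x[i] for i in range(m) if i not in removed]
    imported = [x[i] for i in sorted(high_pos)]  # assumed ordering
    return remaining + imported


# N = 32, M = 20, C = 8, with assumed positions 2..9 and 20..27.
x = list(range(32))
low_band = low_side_packed(x, 20, range(2, 10), range(20, 28))
```

With these assumed positions the imported samples end up directly below the boundary M, next to the high band they came from.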
The
In this way, when the fricative
[ example 3 of adjustment processing by the fricative sound adjustment unit 13]
Similarly, the frictional
Step 1-5': The samples remaining in the high-band-side spectrum sequence after the adjustment target samples to be moved to the low-band side were extracted in Step 1-3 are packed toward the low-band side, the adjustment target samples to be moved to the high-band side, extracted from the low-band-side spectrum sequence in Step 1-2, are arranged at the sample positions thereby vacated on the high-band side, and the result is obtained as the high-band-side adjusted spectrum sequence Y_M, …, Y_{N-1}.
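Step 1-5' can be sketched in Python in the same way, following the wording above literally: remaining high-band samples are packed toward the boundary M and the samples taken from the low band fill the top of the band. The function name `high_side_packed`, the position arguments, and the ascending ordering of the moved samples are assumptions.

```python
def high_side_packed(x, m, low_pos, high_pos):
    """Build Y_M..Y_{N-1} as in Step 1-5' (sketch).

    Remaining high-band samples are packed toward index M, and the samples
    taken from the low band fill the positions freed at the top of the band.
    """
    low_pos, high_pos = list(low_pos), list(high_pos)
    if len(low_pos) != len(high_pos):
        raise ValueError("the same number C of samples must be taken from each band")
    removed = set(high_pos)
    remaining = [x[i] for i in range(m, len(x)) if i not in removed]
    exported = [x[i] for i in sorted(low_pos)]  # assumed ordering
    return remaining + exported


# N = 32, M = 20, C = 8, with assumed positions 2..9 and 20..27.
x = list(range(32))
high_band = high_side_packed(x, 20, range(2, 10), range(20, 28))
```

Under these assumptions the samples moved out of the low band land at the highest sample numbers, which is where an encoder that favors small sample numbers spends the fewest bits.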
The
Fig. 6 shows an example in which, when N is 32, M is 20, and C is 8, Step 1-4' is performed instead of Step 1-4 and Step 1-5' is performed instead of Step 1-5 among Steps 1-1 to 1-6. The
In this way, when the
[ example 4 of adjustment processing by the fricative sound adjustment unit 13]
It is desirable that the
In the above-described examples of Figs. 5 and 6, γ is set to 2, so that X_0 and X_1, the 2 samples from the lowest frequency in the low-band-side spectrum sequence, are not included in the adjustment target samples moved from the low-band-side spectrum sequence to the high-band side.
In other words, when the
[ example 5 of adjustment processing by the fricative sound adjustment unit 13]
In the encoding process in the
In the examples of Figs. 5 and 6 described above, the 4 samples counted from the highest frequency in the high-range-side spectrum sequence, X_28, …, X_31, are not included in the adjustment target samples from the high-range-side spectrum sequence toward the low range side.
In other words, when the
[ encoding section 14]
The adjusted spectral sequence Y output from the
Here, a method of preferentially allocating bits to samples with small sample numbers is, for example, the following: the adjusted spectrum sequence Y_0, …, Y_{N-1} is divided into a plurality of partial sequences, each sample in a partial sequence is divided by a gain that is smaller for partial sequences with smaller sample numbers, and the integer values of the division results are encoded with a variable-length code or a fixed-length code, or are vector-quantized, to obtain the spectrum code corresponding to the adjusted spectrum sequence. In this case, a code need not be obtained for a partial sequence with large sample numbers. That is, no bits need be allocated to a partial sequence with large sample numbers.
For a partial sequence with small sample numbers in the adjusted spectrum sequence Y_0, …, Y_{N-1}, each sample in the partial sequence is divided by a gain of small value to obtain a large integer value, and each of these large integer values is encoded. On the other hand, for a partial sequence with large sample numbers in the adjusted spectrum sequence Y_0, …, Y_{N-1}, each sample in the partial sequence is divided by a gain of large value to obtain a small integer value, and each of these small integer values is encoded. The integer values obtained by dividing the sample values of a partial sequence by a large gain are, in many cases, 0.
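The gain-based bit allocation described above can be illustrated with a minimal sketch. It assumes rounding to the nearest integer as the way of taking "an integer value of a division result", and the function name, the partial-sequence boundaries, the sample values, and the gains are all hypothetical values chosen for illustration; the entropy coding or vector quantization of the integers is not shown.

```python
def quantize_partial_sequences(y, bounds, gains):
    """Divide each partial sequence of y by its own gain and round to integers.

    y      : adjusted spectrum sequence Y_0 .. Y_{N-1}
    bounds : (start, stop) index pairs, one per partial sequence
    gains  : one gain per partial sequence (assumed smaller for low sample numbers)
    """
    q = []
    for (start, stop), gain in zip(bounds, gains):
        # Rounding is an assumption; the text only asks for an integer value.
        q.extend(round(v / gain) for v in y[start:stop])
    return q


# Hypothetical 8-sample adjusted spectrum split into two partial sequences.
y = [12.0, -9.5, 7.2, -6.1, 1.4, -0.9, 0.6, -0.3]
bounds = [(0, 4), (4, 8)]
gains = [0.5, 4.0]  # small gain for low sample numbers, large gain for high ones

q = quantize_partial_sequences(y, bounds, gains)
print(q)
```

With these illustrative values, the low-numbered partial sequence yields large integers that need many bits, while the high-numbered partial sequence collapses to zeros, which is why few or no bits end up being spent there.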
Further, as shown by the chain line in fig. 1, if the
[ multiplexing section 15]
The fricative decision information output from the
Decoding device
Referring to fig. 3, a process of the decoding device of the first embodiment will be described. As illustrated in fig. 3, the decoding device according to the first embodiment includes a demultiplexing unit 21, a decoding unit 22, a fricative adjustment canceling unit 23, and a time domain converting unit 24. The code output by the encoding device is input to the decoding device. The code input to the decoding apparatus is input to the demultiplexing section 21. The decoding device performs processing for each unit in units of frames of a predetermined time length. The decoding method according to the first embodiment is realized by each section of the decoding apparatus performing the following processing of step S21 to step S24 illustrated in fig. 4.
[ demultiplexing unit 21]
The code output from the encoding device is input to the demultiplexing unit 21. The demultiplexing unit 21 separates the input code, in units of frames, into a code corresponding to the fricative determination information and a spectrum code, outputs the fricative determination information obtained from the former code to the fricative adjustment canceling unit 23, and outputs the spectrum code to the decoding unit 22 (step S21).
When the fricative decision information is 1-bit information, the code itself corresponding to the fricative decision information input to the demultiplexing unit 21 may be the fricative decision information.
[ decoding section 22]
The spectrum code output from the demultiplexing unit 21 is input to the decoding unit 22. The decoding unit 22 decodes the input spectrum code in units of frames by a decoding method corresponding to the encoding method performed by the
When the spectrum code is decoded by the decoding method corresponding to the above-described encoding method at the description position of the
In this way, the decoding unit 22 decodes, in units of frames of a predetermined time interval, a spectrum code in which bits are preferentially allocated to the low range side, and obtains a frequency-domain sample sequence corresponding to the decoded sound signal (decoded adjusted spectrum sequence).
[ fricative adjustment canceling unit 23]
The fricative determination information output from the demultiplexing unit 21 and the decoded adjusted spectrum sequence ^Y_0, …, ^Y_{N-1} output from the decoding unit 22 are input to the fricative adjustment canceling unit 23. In units of frames, when the input fricative determination information indicates a fricative sound, the fricative adjustment canceling unit 23 performs the following adjustment canceling process on the input decoded adjusted spectrum sequence ^Y_0, …, ^Y_{N-1} to obtain the decoded spectrum sequence ^X_0, …, ^X_{N-1}, and outputs the obtained decoded spectrum sequence ^X_0, …, ^X_{N-1} to the time domain conversion unit 24. When the fricative determination information indicates a non-fricative sound, it outputs the decoded adjusted spectrum sequence ^Y_0, …, ^Y_{N-1} as it is, as the decoded spectrum sequence ^X_0, …, ^X_{N-1}, to the time domain conversion unit 24 (step S23).
Let M be an integer greater than 1 and less than N, and, for example, let the group of samples ^Y_0, …, ^Y_{M-1} whose sample numbers are less than M in the decoded adjusted spectrum sequence ^Y_0, …, ^Y_{N-1} be the low-range-side decoded adjusted spectrum sequence, and the group of samples ^Y_M, …, ^Y_{N-1} whose sample numbers are M or more be the high-range-side decoded adjusted spectrum sequence. When the fricative determination information indicates a fricative sound, the adjustment canceling process performed by the fricative adjustment canceling unit 23 is as follows: all or a part of the samples of the low-range-side decoded adjusted spectrum sequence ^Y_0, …, ^Y_{M-1} are exchanged with the same number of samples forming all or a part of the high-range-side decoded adjusted spectrum sequence ^Y_M, …, ^Y_{N-1}, and the result is obtained as the decoded spectrum sequence ^X_0, …, ^X_{N-1}. The adjustment canceling process performed by the fricative adjustment canceling unit 23 may be any of various processes including the following examples, but is determined in advance so as to be the inverse process of the adjustment process performed by the fricative
In other words, when the input information indicating whether or not the sound is fricative indicates a fricative sound, the fricative adjustment canceling unit 23 exchanges all or a part of the low-range-side frequency sample sequence (low-range-side decoded adjusted spectrum sequence), which is located on the low range side of a predetermined frequency in the frequency-domain sample sequence obtained by the decoding unit 22, with the same number of samples forming all or a part of the high-range-side frequency sample sequence (high-range-side decoded adjusted spectrum sequence), which is located on the high range side of the predetermined frequency in that sample sequence, and obtains the result of the exchange as the spectrum sequence of the decoded sound signal (decoded spectrum sequence). In all other cases, the fricative adjustment canceling unit 23 obtains the frequency-domain sample sequence obtained by the decoding unit 22 (decoded adjusted spectrum sequence) as it is as the spectrum sequence of the decoded sound signal (decoded spectrum sequence).
[ example 1 of adjustment canceling processing by the frictional sound adjustment canceling unit 23]
When the fricative determination information indicates a fricative sound, the fricative adjustment canceling unit 23 obtains the decoded spectrum sequence ^X_0, …, ^X_{N-1} by performing, for example, the following steps 2-1 to 2-6. The process is divided into the 6 steps 2-1 to 2-6 only to make the operation of the fricative adjustment canceling unit 23 easy to understand; this division is merely an example, and a process equivalent to steps 2-1 to 2-6 may be performed in a single step by swapping array elements or replacing indices.
Step 2-1: The group of samples whose sample numbers are less than M in the decoded adjusted spectrum sequence ^Y_0, …, ^Y_{N-1} is taken as the low-range-side decoded adjusted spectrum sequence ^Y_0, …, ^Y_{M-1}, and the group of samples whose sample numbers are M or more is taken as the high-range-side decoded adjusted spectrum sequence ^Y_M, …, ^Y_{N-1}.
Step 2-2: C samples (C is a positive integer) contained in the low-range-side decoded adjusted spectrum sequence ^Y_0, …, ^Y_{M-1} obtained in step 2-1 are extracted as the adjustment target samples toward the high range side.
Step 2-3: C samples contained in the high-range-side decoded adjusted spectrum sequence ^Y_M, …, ^Y_{N-1} obtained in step 2-1 are extracted as the adjustment target samples toward the low range side.
Step 2-4: The adjustment target samples toward the low range side extracted from the high-range-side decoded adjusted spectrum sequence in step 2-3 are placed at the sample positions from which the adjustment target samples toward the high range side were extracted from the low-range-side decoded adjusted spectrum sequence in step 2-2, and the result is obtained as the low-range-side decoded spectrum sequence ^X_0, …, ^X_{M-1}.
Step 2-5: The adjustment target samples toward the high range side extracted from the low-range-side decoded adjusted spectrum sequence in step 2-2 are placed at the sample positions from which the adjustment target samples toward the low range side were extracted from the high-range-side decoded adjusted spectrum sequence in step 2-3, and the result is obtained as the high-range-side decoded spectrum sequence ^X_M, …, ^X_{N-1}.
Step 2-6: The low-range-side decoded spectrum sequence ^X_0, …, ^X_{M-1} obtained in step 2-4 and the high-range-side decoded spectrum sequence ^X_M, …, ^X_{N-1} obtained in step 2-5 are combined to obtain the decoded spectrum sequence ^X_0, …, ^X_{N-1}.
Fig. 7 shows an example of steps 2-1 to 2-6 when N is 32, M is 20, and C is 8. The fricative adjustment canceling unit 23 first takes ^Y_0, …, ^Y_19 of the decoded adjusted spectrum sequence ^Y_0, …, ^Y_31 as the low-range-side decoded adjusted spectrum sequence and ^Y_20, …, ^Y_31 as the high-range-side decoded adjusted spectrum sequence (step 2-1). It extracts the 8 samples ^Y_2, …, ^Y_9 contained in the low-range-side decoded adjusted spectrum sequence ^Y_0, …, ^Y_19 as the adjustment target samples toward the high range side (step 2-2), and extracts the 8 samples ^Y_20, …, ^Y_27 contained in the high-range-side decoded adjusted spectrum sequence ^Y_20, …, ^Y_31 as the adjustment target samples toward the low range side (step 2-3). It obtains, as the low-range-side decoded spectrum sequence ^X_0, …, ^X_19, the result of placing ^Y_20, …, ^Y_27 at the sample positions that ^Y_2, …, ^Y_9 occupied in the low-range-side decoded adjusted spectrum sequence (step 2-4). It obtains, as the high-range-side decoded spectrum sequence ^X_20, …, ^X_31, the result of placing ^Y_2, …, ^Y_9 at the sample positions that ^Y_20, …, ^Y_27 occupied in the high-range-side decoded adjusted spectrum sequence (step 2-5). Finally, it combines the low-range-side decoded spectrum sequence ^X_0, …, ^X_19 and the high-range-side decoded spectrum sequence ^X_20, …, ^X_31 to obtain the decoded spectrum sequence ^X_0, …, ^X_31 (step 2-6).
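The Fig. 7 example can be written as a single index exchange, in line with the remark that steps 2-1 to 2-6 may be carried out in one step. The following is a minimal sketch under that reading; the function name and the use of string labels in place of spectrum values are illustrative assumptions, not part of the described device.

```python
def cancel_adjustment_swap(y_hat, low_start, high_start, c):
    """Exchange c samples starting at low_start with c samples starting at high_start.

    Corresponds to steps 2-1 to 2-6 of example 1 performed as one index swap.
    """
    x_hat = list(y_hat)
    # Step 2-4: high-range-side samples go to the vacated low-range-side positions.
    x_hat[low_start:low_start + c] = y_hat[high_start:high_start + c]
    # Step 2-5: low-range-side samples go to the vacated high-range-side positions.
    x_hat[high_start:high_start + c] = y_hat[low_start:low_start + c]
    return x_hat


N, M, C = 32, 20, 8
y_hat = [f"Y{i}" for i in range(N)]  # labels stand in for ^Y_0 .. ^Y_31
x_hat = cancel_adjustment_swap(y_hat, low_start=2, high_start=M, c=C)
print(x_hat[:12])
```

Because this variant only exchanges two equally sized blocks, applying the same function a second time restores the input, which is consistent with the canceling process being the inverse of the adjustment process.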
[ example 2 of adjustment canceling processing by the frictional sound adjustment canceling unit 23]
When the
Step 2-4': The samples remaining after the adjustment target samples toward the high range side were extracted from the low-range-side decoded adjusted spectrum sequence in step 2-2 are packed toward the low range side and toward the high range side, the adjustment target samples toward the low range side extracted from the high-range-side decoded adjusted spectrum sequence in step 2-3 are placed at the sample positions left vacant in between, and the result is obtained as the low-range-side decoded spectrum sequence ^X_0, …, ^X_{M-1}.
[ example 3 of adjustment canceling processing by the fricative adjustment canceling unit 23]
When the
Step 2-5': The samples remaining after the adjustment target samples toward the low range side were extracted from the high-range-side decoded adjusted spectrum sequence in step 2-3 are packed toward the high range side, the adjustment target samples toward the high range side extracted from the low-range-side decoded adjusted spectrum sequence in step 2-2 are placed at the sample positions vacated on the low range side, and the result is obtained as the high-range-side decoded spectrum sequence ^X_M, …, ^X_{N-1}.
Fig. 8 shows an example in which, when N is 32, M is 20, and C is 8, step 2-4' is performed instead of step 2-4 and step 2-5' is performed instead of step 2-5 among steps 2-1 to 2-6. The fricative adjustment canceling unit 23 first takes ^Y_0, …, ^Y_19 of the decoded adjusted spectrum sequence ^Y_0, …, ^Y_31 as the low-range-side decoded adjusted spectrum sequence and ^Y_20, …, ^Y_31 as the high-range-side decoded adjusted spectrum sequence (step 2-1). It extracts the 8 samples ^Y_12, …, ^Y_19 contained in the low-range-side decoded adjusted spectrum sequence ^Y_0, …, ^Y_19 as the adjustment target samples toward the high range side (step 2-2), and extracts the 8 samples ^Y_24, …, ^Y_31 contained in the high-range-side decoded adjusted spectrum sequence ^Y_20, …, ^Y_31 as the adjustment target samples toward the low range side (step 2-3). In the low-range-side decoded adjusted spectrum sequence, it packs ^Y_0, ^Y_1 toward the low range side and ^Y_2, …, ^Y_11 toward the high range side, places ^Y_24, …, ^Y_31 in the gap left between them, and obtains the result as the low-range-side decoded spectrum sequence ^X_0, …, ^X_19 (step 2-4'). In the high-range-side decoded adjusted spectrum sequence, it packs ^Y_20, …, ^Y_23 toward the high range side, places ^Y_12, …, ^Y_19 on the low range side of the packed ^Y_20, …, ^Y_23, and obtains the result as the high-range-side decoded spectrum sequence ^X_20, …, ^X_31 (step 2-5'). Finally, it combines the low-range-side decoded spectrum sequence ^X_0, …, ^X_19 and the high-range-side decoded spectrum sequence ^X_20, …, ^X_31 to obtain the decoded spectrum sequence ^X_0, …, ^X_31 (step 2-6).
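The rearrangement in the Fig. 8 example can be traced with a short sketch that follows the steps literally for N = 32, M = 20, C = 8. The function name, the `keep` parameter (the number of lowest samples left in place, 2 in this example), and the string labels are illustrative assumptions.

```python
def cancel_adjustment_fig8(y_hat, m=20, c=8, keep=2):
    """Steps 2-1, 2-2, 2-3, 2-4', 2-5', 2-6 for the Fig. 8 layout."""
    low, high = y_hat[:m], y_hat[m:]       # step 2-1
    to_high = low[m - c:]                  # step 2-2: ^Y_12 .. ^Y_19
    to_low = high[len(high) - c:]          # step 2-3: ^Y_24 .. ^Y_31
    rest_low = low[:m - c]                 # ^Y_0 .. ^Y_11
    rest_high = high[:len(high) - c]       # ^Y_20 .. ^Y_23
    # Step 2-4': ^Y_0, ^Y_1 stay at the low end, ^Y_2 .. ^Y_11 move to the
    # high end of the low range, and the gap in between receives to_low.
    x_low = rest_low[:keep] + to_low + rest_low[keep:]
    # Step 2-5': ^Y_20 .. ^Y_23 move to the high end; to_high fills the low end.
    x_high = to_high + rest_high
    return x_low + x_high                  # step 2-6


y_hat = [f"Y{i}" for i in range(32)]  # labels stand in for ^Y_0 .. ^Y_31
x_hat = cancel_adjustment_fig8(y_hat)
print(x_hat)
```

Unlike the plain exchange of example 1, this variant shifts the untouched samples, so each group of samples keeps its internal frequency order after the rearrangement.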
[ example 4 of adjustment canceling processing by the frictional sound adjustment canceling unit 23]
When the adjustment target samples from the low-side spectral sequence to the high-side in step1-2 do not include 1 or more samples from the lowest frequency in the
[ example 5 of adjustment canceling processing by the frictional sound adjustment canceling unit 23]
When, in the encoding device, the adjustment target samples from the high-range-side spectrum sequence toward the low range side in step 1-3 do not include 1 or more samples counted from the highest frequency, the fricative adjustment canceling unit 23 likewise does not include 1 or more samples counted from the highest frequency in the adjustment target samples from the high-range-side decoded adjusted spectrum sequence toward the low range side in step 2-3.
Further, as shown by the chain line in fig. 3, it can be said that if the decoding unit 22 and the fricative adjustment canceling unit 23 are the fricative-corresponding
[ time domain converting section 24]
The decoded spectrum sequence ^X_0, …, ^X_{N-1} output from the fricative adjustment canceling unit 23 is input to the time domain conversion unit 24. The time domain conversion unit 24 uses a conversion method to the time domain corresponding to the conversion method to the frequency domain performed by the
When the frequency
The decoding device may be configured to output a decoded sound signal in the frequency domain instead of a decoded sound signal in the time domain. In this configuration, the decoding device need not include the time domain conversion unit 24, and the frame-by-frame decoded spectrum sequences obtained by the fricative adjustment canceling unit 23 may be concatenated in time order and output as the decoded sound signal in the frequency domain.
Action and Effect
According to the encoding device and the decoding device of the first embodiment, the fricative adjustment process and the corresponding fricative adjustment canceling process are added to a conventional encoding process designed to allocate more bits to the low-frequency spectrum and to the corresponding decoding process. This makes it possible to perform compression encoding with little perceptible degradation in sound quality even for a sound signal containing fricatives and the like.
As a conventional technique capable of compression encoding with little auditory degradation even for a sound signal containing fricatives and the like, there is an encoding/decoding technique in which bits are preferentially allocated to subbands with large energy. However, this technique requires transmitting the bit allocation information for each subband from the encoding side to the decoding side. In contrast, the encoding device and the decoding device of the first embodiment only need to transmit 1-bit fricative determination information from the encoding side to the decoding side to achieve compression encoding with little auditory degradation even for a sound signal containing fricatives and the like.
< modification of the first embodiment >
The modification of the first embodiment differs from the first embodiment only in the fricative
[ frictional sound determination unit 12]
The frictional
The frictional
The comparison result storage unit stores the comparison result information by a predetermined number of past frames. That is, the fricative
The fricative
In this way, the
The fricative determination information may be, for example, 1-bit information, or may be an average value of the sum of absolute values or an average value of the sum of squares of all or a part of the values of the samples, which is used as the average energy, and is the same as the
Action and Effect
When the processing in the encoding apparatus and the decoding apparatus according to the first embodiment is performed, a decoded sound with less coding distortion of a high-domain component and more coding distortion of a low-domain component is obtained for a frame subjected to the adjustment processing and the adjustment release processing, and a decoded sound with more coding distortion of a high-domain component and less coding distortion of a low-domain component is obtained for a frame not subjected to the adjustment processing and the adjustment release processing, and therefore, a waveform of a decoded sound may be discontinuous at a boundary between a frame subjected to the adjustment processing and the adjustment release processing and a frame not subjected to the adjustment processing and the adjustment release processing. That is, when the determination result of the frictional
In the frictional
< second embodiment >
The system according to the second embodiment of the present invention includes an encoding device and a decoding device, as in the system according to the first embodiment.
The second embodiment differs from the first embodiment in that a spectrum to which no bits are allocated in the encoding device is restored in the decoding device, that is, the band is expanded in the decoding device. The decoding device according to the second embodiment expands the band on the decoded adjusted spectrum sequence, which is the spectrum that has been rearranged according to the fricative determination information. The spectrum to which no bits are allocated in the encoding device lies in the high range for a time section of a non-fricative sound and in the low range for a time section of a fricative sound. Therefore, in the second embodiment, for a time section of a non-fricative sound, the band is expanded by copying the low-range spectrum to reproduce the high-range spectrum, and for a time section of a fricative sound, the band is expanded by copying the high-range spectrum to reproduce the low-range spectrum.
The spectrum replication in the second embodiment is performed by multiplying the spectrum as the source of the replication by a gain. Therefore, the encoding device of the second embodiment obtains the gain used by the decoding device of the second embodiment in addition to the processing performed by the encoding device of the first embodiment, and outputs the code corresponding to the obtained gain.
Coding device
Referring to fig. 9, a process of the encoding device of the second embodiment is explained. As illustrated in fig. 9, the encoding device according to the second embodiment includes: a frequency
In the encoding device, a time-domain audio signal is input in units of frames of a predetermined time length. The time-domain audio signal input to the encoding apparatus is input to the
[ frequency domain converting part 11]
The frequency
[ frictional sound determination unit 12]
The frictional
In other words, the
In addition, the
[ fricative sound adjustment unit 13]
The
The adjustment processing of the frequency spectrum by the
In other words, when the
[ encoding section 14]
The
The method of preferentially allocating bits to samples having a small sample number in the
The encoding unit 14, for example, allocates no bits to the K (K ≤ N/2) adjusted spectra Y_{N-K}, …, Y_{N-1} with the largest sample numbers among the N adjusted spectra of the adjusted spectrum sequence Y_0, …, Y_{N-1}, allocates bits to the remaining N-K adjusted spectra Y_0, …, Y_{N-K-1} with smaller sample numbers, encodes the adjusted spectrum sequence Y_0, …, Y_{N-1} to obtain a spectrum code, and outputs the obtained spectrum code to the
[ band expansion gain encoding section 16]
The band expansion
Only the adjusted spectrum sequence Y is input to the band spreading
In addition, the band spreading
The
Hereinafter, the adjusted spectrum to which bits are allocated by the
[ example 1 of band expansion gain coding section 16]
In this example, it is assumed that J sets of gain candidate vectors each composed of a gain candidate value corresponding to K samples and a code are stored in the
The band expansion
In other words, the band spreading
[ example 2 of band expansion gain coding section 16]
In this example, the
When the inputted fricatives determination information indicates that the input fricatives determination information is fricatives, the band expansion
In other words, when the input fricative decision information indicates that the sound is fricative, the band expansion
As described above, the band expansion
[ example 1 of the band spreading
In examples 1 and 2 described above, the adjusted spectrum to be subjected to multiplication of the gain candidate is set to the adjusted spectrum Y to which bits are allocated from the
[ example 1 of the band spreading
In examples 1 and 2 described above, Y_{N-2K+k}, g_{j,k}, and Y_{N-K+k} in formula (1) are associated with one another in ascending order of k; however, any association may be used as long as it is predetermined.
[ concrete example of band expansion gain coding section 16]
A specific example of the band expansion
Fig. 13 is an example of a case where the fricative determination information indicates that the sound is not fricative. As will be described later, the
Fig. 14 shows an example of a case where the fricative determination information indicates a fricative sound. As will be described later, the band spreading unit 25 of the decoding device takes the 8th to 19th decoded adjusted spectra as the copy source, multiplies the values of these copy-source decoded adjusted spectra by the band expansion gains, and obtains the results, arranged with sample numbers 16 to 19 first followed by sample numbers 8 to 15, as the 20th to 31st decoded spread spectra. Therefore, when the input fricative determination information indicates a fricative sound, the band expansion gain encoding unit 16 obtains, as the band expansion gain code, the code corresponding to the gain candidate vector that, among the fricative gain candidate vectors stored in the storage unit 161, minimizes the sum E_j of the absolute differences ||Y_8 g_{j,0}| - |Y_24||, …, ||Y_15 g_{j,7}| - |Y_31||, ||Y_16 g_{j,8}| - |Y_20||, …, ||Y_19 g_{j,11}| - |Y_23||. Here, |Y_8 g_{j,0}|, …, |Y_19 g_{j,11}| are the absolute values of the products of the 12 adjusted spectra Y_8, …, Y_19, which have the largest sample numbers among the adjusted spectra Y_0, …, Y_19 to which bits are allocated by the encoding unit 14, and the gain candidate values g_{j,0}, …, g_{j,11} constituting the gain candidate vector, and |Y_24|, …, |Y_31|, |Y_20|, …, |Y_23| are the absolute values of the adjusted spectra Y_24, …, Y_31, Y_20, …, Y_23 to which no bits are allocated by the encoding unit 14.
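The gain vector search for the Fig. 14 case can be sketched as an exhaustive comparison of the error sums E_j. In the sketch below, the function names, the spectrum values, and the three candidate vectors are hypothetical; only the pairing of source sample, gain index, and target sample follows the Fig. 14 description (Y_8, …, Y_15 against Y_24, …, Y_31, and Y_16, …, Y_19 against Y_20, …, Y_23).

```python
def fricative_copy_pairs():
    """(source index, gain index, target index) triples for the Fig. 14 layout."""
    pairs = [(8 + k, k, 24 + k) for k in range(8)]        # Y_8..Y_15  vs Y_24..Y_31
    pairs += [(16 + k, 8 + k, 20 + k) for k in range(4)]  # Y_16..Y_19 vs Y_20..Y_23
    return pairs


def select_gain_vector(y, candidates):
    """Return (j, E_j) for the candidate vector with the smallest error sum."""
    pairs = fricative_copy_pairs()

    def error(g):
        # E_j: sum over all pairs of | |Y_s * g_k| - |Y_t| |
        return sum(abs(abs(y[s] * g[k]) - abs(y[t])) for s, k, t in pairs)

    errors = [error(g) for g in candidates]
    j = min(range(len(candidates)), key=errors.__getitem__)
    return j, errors[j]


# Hypothetical adjusted spectrum (N = 32) and J = 3 flat candidate vectors (K = 12).
y = [float(32 - i) for i in range(32)]
candidates = [[1.0] * 12, [0.5] * 12, [0.25] * 12]

j, e_j = select_gain_vector(y, candidates)
print(j, e_j)
```

The index j returned here plays the role of the band expansion gain code; how j is mapped to an actual code word is left to the stored code table and is not modeled in this sketch.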
As described above, the band spreading
The operation of the band expansion
That is, the
[ multiplexing section 15]
The multiplexing
Decoding device
A process of the decoding device according to the second embodiment will be described with reference to fig. 11. As illustrated in fig. 11, the decoding device according to the second embodiment includes a demultiplexing unit 21, a decoding unit 22, a
The code output by the encoding device is input to the decoding device. The code input to the decoding apparatus is input to the demultiplexing section 21. The decoding apparatus performs processing for each component in units of frames of a predetermined time length. The decoding method of the second embodiment is implemented by each component of the decoding apparatus performing the following processing of step S21 to step S25 illustrated in fig. 12.
[ multiplexing/demultiplexing unit 21]
The multiplexing/demultiplexing unit 21 demultiplexes the input code into a code corresponding to the fricative determination information, a band expansion gain code, and a spectrum code, outputs the fricative determination information obtained from the code corresponding to the fricative determination information to the fricative adjustment canceling unit 23 and the
[ decoding section 22]
The decoding unit 22 decodes the input spectrum code in units of frames by the decoding process corresponding to the encoding process performed by the
As described above, since the
Further, the value of the decoded adjusted spectrum of the sample number to which no bit is assigned in the
In this way, the decoding unit 22 decodes, in units of frames of a predetermined time interval, a spectrum code in which no bits are allocated to a part of the high range side, and obtains a frequency-domain sample sequence (decoded adjusted spectrum sequence).
However, as will be described later, when the input information indicating whether or not the sound is fricative indicates that the sound is fricative, the fricative adjustment canceling unit 23 obtains, as a result of exchanging all or a part of the low-range-side frequency sample sequence located on the low range side with respect to the predetermined frequency in the decoded spread spectrum sequence (spectrum sequence based on the decoded adjusted spectrum sequence) obtained by the
The decoding unit 22 of the decoding apparatus according to the first embodiment outputs the obtained decoded adjusted spectral sequence to the fricative adjustment canceling unit 23, whereas the decoding unit 22 of the decoding apparatus according to the second embodiment outputs the obtained decoded adjusted spectral sequence to the
[ band expansion section 25]
The
When the
The
In the storage unit 251 of the
Hereinafter, the adjusted spectrum to which bits are allocated by the
[ example 1 of band expanding section 25]
In this example, the storage unit 251 stores J sets, each consisting of a code and a gain candidate vector composed of gain candidate values for K samples. Hereinafter, the J gain candidate vectors are denoted G_j (j = 0, …, J-1), the code corresponding to each gain candidate vector G_j (j = 0, …, J-1) is denoted C_{G_j} (j = 0, …, J-1), and each gain candidate vector G_j consists of K gain candidate values g_{j,k} (k = 0, …, K-1), one for each of the K samples.
The
[ example 2 of band expanding section 25]
In this example, the storage unit 251 stores J sets of gain candidate vectors and codes as in example 1, but unlike example 1, two kinds of gain candidate vectors are stored: gain candidate vectors for fricatives and gain candidate vectors for non-fricatives. That is, the storage unit 251 stores J sets each consisting of a fricative gain candidate vector, a non-fricative gain candidate vector, and a code, and each fricative gain candidate vector and each non-fricative gain candidate vector is composed of gain candidate values for K samples. Hereinafter, the J fricative gain candidate vectors are denoted G1_j (j = 0, …, J-1), the J non-fricative gain candidate vectors are denoted G2_j (j = 0, …, J-1), and the code corresponding to each pair of fricative gain candidate vector G1_j and non-fricative gain candidate vector G2_j (j = 0, …, J-1) is denoted C_{G_j} (j = 0, …, J-1). Each fricative gain candidate vector G1_j consists of K gain candidate values g1_{j,k} (k = 0, …, K-1), and each non-fricative gain candidate vector G2_j consists of K gain candidate values g2_{j,k} (k = 0, …, K-1), one for each of the K samples.
The
[ example 1 of
In examples 1 and 2 described above, the decoded adjusted spectra to be multiplied by the band expansion gains are the K decoded adjusted spectra ^Y_{N-2K}, …, ^Y_{N-K-1} with the largest sample numbers among the decoded adjusted spectra ^Y_0, …, ^Y_{N-K-1} obtained by decoding the spectrum code. However, the decoded adjusted spectra to be multiplied by the band expansion gains may be any K decoded adjusted spectra, among ^Y_0, …, ^Y_{N-K-1}, that correspond to K predetermined sample numbers.
[ example 1 of
In examples 1 and 2 described above, the decoded adjusted spectrum ^Y_{N-2K+k}, the band expansion gain g_k, and the resulting decoded spread spectrum ~Y_{N-K+k} are associated in ascending order of k, that is, ~Y_{N-K+k} is obtained by multiplying ^Y_{N-2K+k} by g_k; however, any association may be used as long as it is predetermined.
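The default association described above, in which ~Y_{N-K+k} is the product of ^Y_{N-2K+k} and g_k, can be sketched as follows. The function name and the numeric values are hypothetical, and the sketch assumes that the N-K decoded adjusted spectra are passed through unchanged as the lower part of the decoded spread spectrum sequence.

```python
def band_spread(y_hat_coded, gains):
    """Build ~Y_0 .. ~Y_{N-1} from ^Y_0 .. ^Y_{N-K-1} and K band expansion gains.

    The K highest decoded samples ^Y_{N-2K} .. ^Y_{N-K-1} are the copy source;
    each is multiplied by g_k and placed at sample number N-K+k.
    """
    k_count = len(gains)
    if k_count > len(y_hat_coded):
        raise ValueError("K must not exceed N - K (that is, K <= N/2)")
    source = y_hat_coded[len(y_hat_coded) - k_count:]
    extension = [s * g for s, g in zip(source, gains)]
    return list(y_hat_coded) + extension


# Hypothetical values with N = 8 and K = 3.
y_hat_coded = [8.0, 6.0, 4.0, 2.0, 1.0]  # ^Y_0 .. ^Y_4
gains = [0.5, 0.5, 0.25]                 # g_0 .. g_2
y_tilde = band_spread(y_hat_coded, gains)
print(y_tilde)
```

A different predetermined association, such as the reordered copy of the Fig. 14 case, would only change how `source` is indexed before the multiplication.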
[ example of band expanding section 25]
A specific example of the
Fig. 13 is an example of a case where the fricative determination information indicates that the sound is not fricative. The
Fig. 14 shows an example of a case where the fricative determination information indicates that the information is fricative sound. The
The operation of the
In this way, the
More specifically, for example, the
Further, the processing for the
[ fricative adjustment canceling unit 23]
The fricatives decision information outputted from the demultiplexing unit 21 and the decoded spread spectrum sequence Y outputted from the
The adjustment canceling process performed by the fricative adjustment canceling unit 23 is the same process, applied to the decoded spread spectrum sequence ~Y_0, …, ~Y_{N-1}, as the one that the fricative adjustment canceling unit 23 of the decoding device according to the first embodiment applies to the decoded adjusted spectrum sequence ^Y_0, …, ^Y_{N-1}. That is, let M be an integer greater than 1 and less than N, and, for example, let the group of samples ~Y_0, …, ~Y_{M-1} whose sample numbers are less than M in the decoded spread spectrum sequence ~Y_0, …, ~Y_{N-1} be the low-range-side decoded spread spectrum sequence, and the group of samples ~Y_M, …, ~Y_{N-1} whose sample numbers are M or more be the high-range-side decoded spread spectrum sequence. The adjustment canceling process performed by the fricative adjustment canceling unit 23 is then as follows: all or a part of the samples of the low-range-side decoded spread spectrum sequence ~Y_0, …, ~Y_{M-1} are exchanged with the same number of samples forming all or a part of the high-range-side decoded spread spectrum sequence ~Y_M, …, ~Y_{N-1}, and the result is obtained as the decoded spectrum sequence ^X_0, …, ^X_{N-1}.
In other words, when the information indicating whether or not the input sound is fricative indicates that the sound is fricative, the fricative adjustment canceling unit 23 may obtain, as a spectrum sequence of a decoded sound signal (decoded spectrum sequence), a result of exchanging all or a part of a low-range-side frequency sample sequence located on a low range side with respect to a predetermined frequency in the decoded spread spectrum sequence obtained by the
Further, as shown by the chain line in fig. 11, it can be said that if the
[ time domain converting section 24]
The time domain conversion unit (24) converts, for each frame, the decoded spectrum sequence $\hat{X}_0,\ldots,\hat{X}_{N-1}$ into a time domain signal, using a method of conversion to the time domain that corresponds to the method of conversion to the frequency domain performed by the frequency domain conversion unit (11) of the encoding device, and thereby obtains and outputs a frame-unit sound signal (decoded sound signal) (step S24).
Action and Effect
According to the encoding device and the decoding device of the second embodiment, as with those of the first embodiment, performing the fricative adjustment process and the fricative adjustment canceling process causes bits to be allocated preferentially to the high range in time sections of fricative sound and preferentially to the low range in other time sections. This reduces perceptual degradation even for a sound signal that includes fricative sounds and the like.
According to the encoding device and the decoding device of the second embodiment, the band spreading gain is additionally used. In time sections of fricative sound, the band is extended by reproducing the low-range spectrum from a copy of the high-range spectrum, and in other time sections the band is extended by reproducing the high-range spectrum from a copy of the low-range spectrum. As a result, perceptual degradation can be reduced further than in the first embodiment, even for a sound signal that includes fricative sounds and the like. In this case, copying the spectrum with its frequency order maintained, using a band spreading gain based on the amplitude of the spectrum, reproduces the original spectral contour as closely as possible and thereby improves the auditory quality.
Further, when the frictional
[ program and recording Medium ]
Each of the encoding device, the decoding device, and the fricative determination device may be realized by a computer. In this case, the processing contents of the functions to be provided by each of the encoding device, the decoding device, and the fricative determination device are described by a program. Then, by executing the program on a computer, each of the encoding device, the decoding device, and the fricative determination device is realized on the computer.
The program in which the processing contents are described may be recorded on a computer-readable recording medium. The computer-readable recording medium may be any medium, such as a magnetic recording device, an optical disk, a magneto-optical recording medium, or a semiconductor memory.
The processing of each unit may be implemented by executing a predetermined program on a computer, or at least a part of the processing may be implemented in hardware.
It is to be understood that appropriate modifications can be made without departing from the scope of the present invention.