Decoding device, encoding device, methods thereof, and program

Document No.: 1191977 Publication date: 2020-08-28

Reading note: This technology, "Decoding device, encoding device, methods thereof, and program", was designed and created by 杉浦亮介, 镰本优, and 守谷健弘 on 2018-12-03. Its main content is as follows. A decoding device includes: a band spreading unit (25) that obtains a decoded spread spectrum sequence by arranging, on the higher-band side of a frequency-domain sample sequence obtained by decoding, samples based on K samples included in that sample sequence; and a fricative adjustment canceling unit (23) that, when input information indicating whether or not the sound is a fricative sound indicates a fricative sound, obtains, as the spectrum sequence of the decoded sound signal, the result of exchanging all or a part of a low-range-side frequency sample sequence located on the low range side of a predetermined frequency in the decoded spread spectrum sequence with the same number of samples forming all or a part of a high-range-side frequency sample sequence located on the high range side of the predetermined frequency in the decoded spread spectrum sequence.

1. A decoding apparatus, comprising:

a decoding unit that decodes a spectrum code of a frame unit of a predetermined time interval, the spectrum code being a spectrum code in which bits are not allocated to a part of the high-band side, to obtain a sample sequence of a frequency domain;

a band spreading unit configured to obtain a decoded spread spectrum sequence by arranging samples based on K samples included in a sample string of a frequency domain obtained by decoding the spectrum code by the decoding unit, on a higher side than the sample string of the frequency domain obtained by decoding the spectrum code by the decoding unit, where K is an integer of 2 or more; and

a fricative adjustment canceling unit configured to, when input information indicating whether or not the sound is a fricative sound indicates a fricative sound, obtain, as a spectrum sequence of a decoded speech signal, a result of exchanging all or a part of a low-range-side frequency sample sequence located on a low range side with respect to a predetermined frequency in the decoded spread spectrum sequence obtained by the band spreading unit with the same number of samples forming all or a part of a high-range-side frequency sample sequence located on a high range side with respect to the predetermined frequency in the decoded spread spectrum sequence obtained by the band spreading unit, and to, in any other case, obtain the decoded spread spectrum sequence obtained by the band spreading unit as it is as the spectrum sequence of the decoded speech signal.

2. The decoding device as set forth in claim 1,

the band spreading unit decodes a band spreading gain code to obtain a set of K band spreading gains, and obtains the decoded spread spectrum sequence by arranging K samples, obtained by multiplying K samples included in the sample sequence of the frequency domain obtained by the decoding unit decoding the spectrum code by the K band spreading gains, on a higher side than that sample sequence of the frequency domain.

3. The decoding device as set forth in claim 2,

the band spreading unit stores a plurality of codes, a fricative gain candidate vector corresponding to each of the codes, and a non-fricative gain candidate vector corresponding to each of the codes,

each of the fricative gain candidate vector and the non-fricative gain candidate vector contains K gain candidate values,

the processing in which the band spreading unit decodes the band spreading gain code to obtain the set of K band spreading gains is processing of: when the input information indicating whether or not the sound is a fricative sound indicates a fricative sound, setting the K gain candidate values included in the fricative gain candidate vector corresponding to the code identical to the band spreading gain code, among the plurality of fricative gain candidate vectors, as the set of K band spreading gains; and, in any other case, setting the K gain candidate values included in the non-fricative gain candidate vector corresponding to the code identical to the band spreading gain code, among the plurality of non-fricative gain candidate vectors, as the set of K band spreading gains.

4. A decoding device for decoding a spectrum code in a frame unit of a predetermined time interval to obtain a spectrum sequence of a decoded speech signal, comprising:

a decoding unit configured to, when input information indicating whether or not the sound is a fricative sound indicates a fricative sound, decode the spectrum code without allocating bits to a part on the low-band side to obtain a spectrum sequence in a frequency domain, and, when the information indicates otherwise, decode the spectrum code without allocating bits to a part on the high-band side to obtain a spectrum sequence in the frequency domain; and

a fricative corresponding band extension unit configured to, when the information indicates a fricative sound, obtain the spectrum sequence of the decoded sound signal by performing band extension toward the low-band side on the frequency-domain spectrum sequence obtained by the decoding unit, and, when the information indicates otherwise, obtain the spectrum sequence of the decoded sound signal by performing band extension toward the high-band side on the frequency-domain spectrum sequence obtained by the decoding unit.

5. An encoding device including an encoding unit that encodes a sample string of frequencies corresponding to a sound signal in units of frames in a predetermined time interval by an encoding process in which bits are not allocated to a part of a high-band side, thereby obtaining a spectrum code, the encoding device comprising:

a fricative determination unit that determines whether or not the sound signal is fricative sound; and

a fricative adjustment unit that, when the fricative determination unit determines that the sound is a fricative sound, obtains, as an adjusted spectrum sequence, a result of exchanging all or a part of a low-range-side spectrum sequence located on a low range side with respect to a predetermined frequency in a spectrum sequence of the sound signal with the same number of samples forming all or a part of a high-range-side spectrum sequence located on a high range side with respect to the predetermined frequency in the spectrum sequence, and, in any other case, obtains the spectrum sequence of the sound signal as it is as the adjusted spectrum sequence,

the encoding unit encodes the adjusted spectrum sequence obtained by the fricative adjustment unit as a sample string of frequencies corresponding to the speech signal to obtain a spectrum code,

the encoding device further includes:

a band spreading gain encoding unit that stores a plurality of codes and a gain candidate vector corresponding to each of the codes, each gain candidate vector including K gain candidate values, and that obtains and outputs, as a band spreading gain code, the code corresponding to the gain candidate vector that minimizes an error between a sequence of absolute values of K values, obtained by multiplying K adjusted spectra to which bits are allocated by the encoding unit in the adjusted spectrum sequence by the K gain candidate values included in the gain candidate vector, and a sequence of absolute values of K adjusted spectra to which bits are not allocated by the encoding unit in the adjusted spectrum sequence, wherein K is an integer equal to or greater than 2.

6. The encoding device as set forth in claim 5,

the band spreading gain encoding unit stores a plurality of codes, a fricative gain candidate vector corresponding to each of the codes, and a non-fricative gain candidate vector corresponding to each of the codes,

the band spreading gain encoding unit uses the fricative gain candidate vector as the gain candidate vector when the fricative determination unit determines that the sound is a fricative sound, and uses the non-fricative gain candidate vector as the gain candidate vector when the sound is determined not to be a fricative sound.

7. The encoding apparatus according to claim 5 or 6,

the fricative determination unit determines that the sound signal is a fricative sound when an index, which takes a larger value as the ratio of the average energy of the spectrum on the high-range side to the average energy of the spectrum on the low-range side in the spectrum sequence of the frame is larger, is greater than a predetermined threshold or is equal to or greater than the threshold.

8. The encoding apparatus according to claim 5 or 6,

the fricative determination unit determines that the sound signal is a fricative sound when, among a plurality of frames including the frame, the number of frames for which an index, which takes a larger value as the ratio of the average energy of the spectrum on the high-range side to the average energy of the spectrum on the low-range side in the spectrum sequence of the frame is larger, is greater than a predetermined threshold or is equal to or greater than the threshold is larger than the number of frames for which this is not the case, or is not smaller than the number of frames for which this is not the case.

9. A decoding method, comprising:

a decoding step of decoding a spectrum code of a frame unit of a predetermined time interval, the spectrum code being a spectrum code in which bits are not allocated to a part of the high-band side, to obtain a sample sequence of a frequency domain;

a band spreading step of obtaining a decoded spread spectrum sequence by arranging samples based on K samples included in a sample string of a frequency domain obtained by decoding the spectrum code in the decoding step, on a higher side than the sample string of the frequency domain obtained by decoding the spectrum code in the decoding step, where K is an integer of 2 or more; and

a fricative adjustment canceling step of, when input information indicating whether or not the sound is a fricative sound indicates a fricative sound, obtaining, as a spectrum sequence of a decoded speech signal, a result of exchanging all or a part of a low-range-side frequency sample sequence located on a low range side with respect to a predetermined frequency in the decoded spread spectrum sequence obtained in the band spreading step with the same number of samples forming all or a part of a high-range-side frequency sample sequence located on a high range side with respect to the predetermined frequency in the decoded spread spectrum sequence obtained in the band spreading step, and, in any other case, obtaining the decoded spread spectrum sequence obtained in the band spreading step as it is as the spectrum sequence of the decoded speech signal.

10. A decoding method for decoding a spectrum code in a frame unit of a predetermined time interval to obtain a spectrum sequence of a decoded speech signal, comprising:

a decoding step of, when input information indicating whether or not the sound is a fricative sound indicates a fricative sound, decoding the spectrum code without allocating bits to a part on the low-band side to obtain a spectrum sequence in the frequency domain, and, when the information indicates otherwise, decoding the spectrum code without allocating bits to a part on the high-band side to obtain a spectrum sequence in the frequency domain; and

a fricative corresponding band extension step of, when the information indicates a fricative sound, performing band extension toward the low-band side on the frequency-domain spectrum sequence obtained in the decoding step to obtain the spectrum sequence of the decoded sound signal, and, when the information indicates otherwise, performing band extension toward the high-band side on the frequency-domain spectrum sequence obtained in the decoding step to obtain the spectrum sequence of the decoded sound signal.

11. An encoding method, comprising: an encoding step of encoding a sample string of frequencies corresponding to a frame-unit speech signal in a predetermined time interval by an encoding process in which bits are not allocated to a part of the high-band side, to obtain a spectrum code, the encoding method further comprising:

a fricative determination step of determining whether or not the sound signal is fricative sound;

a fricative adjustment step of, when the fricative determination step determines that the sound is a fricative sound, obtaining, as an adjusted spectrum sequence, a result of exchanging all or a part of a low-range-side spectrum sequence located on a low range side with respect to a predetermined frequency in a spectrum sequence of the sound signal with the same number of samples forming all or a part of a high-range-side spectrum sequence located on a high range side with respect to the predetermined frequency in the spectrum sequence, and, in any other case, obtaining the spectrum sequence of the sound signal as it is as the adjusted spectrum sequence,

the encoding step is a step of encoding the adjusted spectrum sequence obtained in the fricative adjustment step as a sample string of frequencies corresponding to the speech signal to obtain a spectrum code,

the encoding method further includes:

a band spreading gain encoding step of storing a plurality of codes and gain candidate vectors corresponding to the codes, each of the gain candidate vectors including K gain candidate values, obtaining a code corresponding to a gain candidate vector in which an error between a sequence of absolute values of K values obtained by multiplying K adjusted spectrums to which bits are allocated in the encoding step in the adjusted spectrum sequence and the K gain candidate values included in the gain candidate vectors and a sequence of absolute values of K adjusted spectrums to which bits are not allocated in the encoding step in the adjusted spectrum sequence is minimized, as a band spreading gain code, and outputting the code, where K is an integer of 2 or more.

12. A program for causing a computer to function as each means of the decoding device of any one of claims 1 to 4.

13. A program for causing a computer to function as each unit of the encoding device according to any one of claims 5 to 8.

Technical Field

The present invention relates to a technique for encoding or decoding a sample string derived from a spectrum of a speech signal in a signal processing technique such as a speech signal encoding technique.

Background

Conventionally, when a sound signal is compression-encoded, the sound signal is represented as a spectrum sequence, and bits are allocated to the spectrum sequence in consideration of perceptual importance in order to improve compression efficiency. Such bit allocation is performed, for example, by preferentially allocating bits to the samples of the spectrum sequence that correspond to low frequencies. As a result, a configuration may be adopted in which no bits are allocated to the samples corresponding to high frequencies, so that the encoding device encodes no direct information about the high-frequency sample sequence. In the decoding device corresponding to such an encoding device, the decoded sound would be obtained with the high-frequency sample values of the spectrum sequence set to 0. For this reason, a band extension technique such as that described in non-patent document 1 may be used, in which the decoding device copies the sample sequence corresponding to low frequencies while adjusting its amplitude and outputs the result as the decoding result for the sample sequence corresponding to high frequencies. This relies on the fact that human hearing has low sensitivity to high frequencies, so that a listener does not perceive the result as unnatural as long as a sound at an octave of the low-frequency sound is heard. By reallocating the bits saved in the high band to the low band, the information that matters more to human auditory characteristics can be represented with high accuracy. For these reasons, sound coding is typically designed to allocate a larger number of bits to the spectrum at lower frequencies.

Disclosure of Invention

Problems to be solved by the invention

According to the band extension technique of non-patent document 1, for most natural sounds a band-extended sound with little degradation in perceptual quality can be obtained from the decoded sound produced by the decoding device. However, some natural sounds, such as fricatives in human speech, have their energy concentrated at high frequencies and almost no energy at low frequencies. If such a sound signal is encoded by an encoding device that allocates bits as described above, then, particularly at low bit rates, the decoding device yields a decoded sound in which the main frequency components of the sound are heavily distorted. If a band-extended sound is then obtained from that decoded sound by the band extension technique of non-patent document 1, the band-extended sound is perceptually degraded.

Therefore, an object of the present invention is to provide an encoding device that performs compression encoding on the encoding side on the premise of band expansion on the decoding side, a decoding device that performs decoding in association with band expansion on the decoding side, methods therefor, and programs therefor, which reduce auditory deterioration of sound signals such as fricatives.

Means for solving the problems

A decoding device according to an aspect of the present invention includes: a decoding unit that decodes a spectrum code of a frame unit of a predetermined time interval, the spectrum code being a spectrum code in which bits are not allocated to a part of the high-band side, to obtain a sample sequence of a frequency domain; a band spreading unit that obtains a decoded spread spectrum sequence by arranging samples based on K samples included in the sample sequence of the frequency domain obtained by the decoding unit decoding the spectrum code, on a higher side than that sample sequence, where K is an integer of 2 or more; and a fricative adjustment canceling unit that, when input information indicating whether or not the sound is a fricative sound indicates a fricative sound, obtains, as a spectrum sequence of the decoded speech signal, a result of exchanging all or a part of a low-range-side frequency sample sequence located on a low range side with respect to a predetermined frequency in the decoded spread spectrum sequence obtained by the band spreading unit with the same number of samples forming all or a part of a high-range-side frequency sample sequence located on a high range side with respect to the predetermined frequency in the decoded spread spectrum sequence obtained by the band spreading unit.
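The decoder-side flow described above (band spreading followed by cancellation of the fricative adjustment) can be sketched roughly as follows. This is only an illustrative sketch: the function names, the choice of the last K decoded samples as the source of the copied samples, and the parameters `low_start`, `high_start`, and `count` that select which samples are exchanged are all assumptions made here, since the text only requires "samples based on K samples" and an exchange of "all or a part" of each side.

```python
from typing import List, Sequence


def band_extend(decoded: Sequence[float], gains: Sequence[float]) -> List[float]:
    """Append K gain-scaled samples on the high-frequency side.

    Assumption: the K source samples are the last K decoded samples.
    """
    k = len(gains)
    if k < 2 or k > len(decoded):
        raise ValueError("K must satisfy 2 <= K <= len(decoded)")
    source = decoded[-k:]
    return list(decoded) + [s * g for s, g in zip(source, gains)]


def cancel_fricative_adjustment(
    spectrum: Sequence[float],
    is_fricative: bool,
    m: int,
    low_start: int,
    high_start: int,
    count: int,
) -> List[float]:
    """Exchange `count` low-side samples with `count` high-side samples.

    Index m is the boundary (hypothetical parameter): indices below m are the
    low side, indices from m upward are the high side. For a non-fricative
    frame the spectrum is returned unchanged.
    """
    out = list(spectrum)
    if not is_fricative:
        return out
    if low_start < 0 or low_start + count > m:
        raise ValueError("low-side span must lie below the boundary m")
    if high_start < m or high_start + count > len(out):
        raise ValueError("high-side span must lie at or above the boundary m")
    low = slice(low_start, low_start + count)
    high = slice(high_start, high_start + count)
    # Right-hand side is evaluated first, so both spans are copied before assignment.
    out[low], out[high] = out[high], out[low]
    return out


extended = band_extend([1.0, 2.0, 3.0, 4.0], gains=[0.5, 0.25])
restored = cancel_fricative_adjustment(extended, True, m=4, low_start=0, high_start=4, count=2)
```

Because the exchange involves equal-length spans, applying the same exchange twice returns the original sequence, which is why the same operation can serve as the adjustment on the encoding side and as its cancellation on the decoding side.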

A decoding device according to an aspect of the present invention is a decoding device that decodes a spectrum code in a frame unit of a predetermined time interval to obtain a spectrum sequence of a decoded speech signal, and includes: a decoding unit configured to, when input information indicating whether or not the sound is a fricative sound indicates a fricative sound, decode the spectrum code without allocating bits to a part on the low-band side to obtain a spectrum sequence in a frequency domain, and, when the information indicates otherwise, decode the spectrum code without allocating bits to a part on the high-band side to obtain a spectrum sequence in the frequency domain; and a fricative corresponding band extension unit configured to, when the information indicates a fricative sound, obtain the spectrum sequence of the decoded speech signal by performing band extension toward the low-band side on the spectrum sequence obtained by the decoding unit, and, when the information indicates otherwise, obtain the spectrum sequence of the decoded speech signal by performing band extension toward the high-band side on the spectrum sequence obtained by the decoding unit.

An encoding device according to an aspect of the present invention includes an encoding unit that encodes, in units of frames of a predetermined time interval, a sample string of frequencies corresponding to a sound signal by an encoding process in which bits are not allocated to a part of the high-band side, to obtain a spectrum code. The encoding device includes: a fricative determination unit that determines whether or not the sound signal is a fricative sound; and a fricative adjustment unit that, when the fricative determination unit determines that the sound is a fricative sound, obtains, as an adjusted spectrum sequence, a result of exchanging all or a part of a low-range-side spectrum sequence located lower than a predetermined frequency in a spectrum sequence of the sound signal with the same number of samples forming all or a part of a high-range-side spectrum sequence located higher than the predetermined frequency in the spectrum sequence, and, in any other case, obtains the spectrum sequence of the sound signal as it is as the adjusted spectrum sequence. The encoding unit encodes the adjusted spectrum sequence obtained by the fricative adjustment unit as the sample string of frequencies corresponding to the sound signal to obtain the spectrum code. The encoding device further includes a band spreading gain encoding unit that stores a plurality of codes and a gain candidate vector corresponding to each of the codes, each gain candidate vector including K gain candidate values, where K is an integer of 2 or more, and that obtains and outputs, as a band spreading gain code, the code corresponding to the gain candidate vector that minimizes the error between a sequence of absolute values of K values, obtained by multiplying K adjusted spectra to which bits are allocated by the encoding unit in the adjusted spectrum sequence by the K gain candidate values included in the gain candidate vector, and a sequence of absolute values of K adjusted spectra to which bits are not allocated by the encoding unit in the adjusted spectrum sequence.
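The gain code search performed by the band spreading gain encoding unit can be illustrated with a small sketch. The text does not specify the error measure, so the sum of squared differences used below is an assumption, as are the function name `select_gain_code` and the representation of the stored codes and gain candidate vectors as a Python dictionary.

```python
from typing import Dict, Hashable, Sequence


def select_gain_code(
    coded: Sequence[float],
    uncoded: Sequence[float],
    codebook: Dict[Hashable, Sequence[float]],
) -> Hashable:
    """Pick the code whose gain vector best maps |coded * gain| onto |uncoded|.

    coded:    K adjusted spectra to which bits are allocated.
    uncoded:  K adjusted spectra to which no bits are allocated.
    codebook: code -> gain candidate vector of K values (hypothetical layout).
    """
    k = len(coded)
    if k < 2 or len(uncoded) != k:
        raise ValueError("need K >= 2 samples in both sequences")
    if not codebook:
        raise ValueError("codebook is empty")

    best_code = None
    best_err = float("inf")
    for code, gains in codebook.items():
        if len(gains) != k:
            raise ValueError("each gain candidate vector must hold K values")
        # Assumed error measure: squared difference between absolute values.
        err = sum(
            (abs(x * g) - abs(y)) ** 2 for x, g, y in zip(coded, gains, uncoded)
        )
        if err < best_err:
            best_code, best_err = code, err
    return best_code


example_codebook = {0: [1.0, 1.0], 1: [0.5, 0.25], 2: [0.25, 0.5]}
chosen = select_gain_code([4.0, 2.0], [-2.0, 0.5], example_codebook)
```

In this toy case the gain vector for code 1 reproduces the magnitudes of the uncoded samples exactly, so code 1 would be output as the band spreading gain code.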

Advantageous effects of the invention

According to the encoding device and the decoding device, encoding and decoding can be performed so that sound signals such as fricatives suffer less perceptual degradation.

Drawings

Fig. 1 is a block diagram showing an example of an encoding device according to the first embodiment.

Fig. 2 is a flowchart showing an example of the encoding method according to the first embodiment.

Fig. 3 is a block diagram showing an example of the decoding apparatus according to the first embodiment.

Fig. 4 is a flowchart showing an example of the decoding method according to the first embodiment.

Fig. 5 is a diagram for explaining an example of the fricative adjustment process.

Fig. 6 is a diagram for explaining an example of the fricative adjustment process.

Fig. 7 is a diagram for explaining an example of the fricative adjustment process.

Fig. 8 is a diagram for explaining an example of the fricative adjustment process.

Fig. 9 is a block diagram showing an example of the encoding device according to the second embodiment.

Fig. 10 is a flowchart showing an example of the encoding method according to the second embodiment.

Fig. 11 is a block diagram showing an example of a decoding device according to the second embodiment.

Fig. 12 is a flowchart showing an example of the decoding method according to the second embodiment.

Fig. 13 is a diagram for explaining an example of the band expansion process and the fricative adjustment cancellation process.

Fig. 14 is a diagram for explaining an example of the band expansion process and the fricative adjustment cancellation process.

Detailed Description

< first embodiment >

The first embodiment serves as the premise for the second embodiment, which is an embodiment of the present invention.

The system of the first embodiment includes an encoding device and a decoding device. The encoding device encodes a time-domain sound signal, input in units of frames of a predetermined time length, to obtain a code, and outputs the code. The code output by the encoding device is input to the decoding device. The decoding device decodes the input code and outputs a time-domain sound signal in units of frames. The sound signal input to the encoding device is, for example, a speech signal or an acoustic signal obtained by picking up sound such as speech or music with a microphone and applying AD conversion. The sound signal output from the decoding device is, for example, DA-converted and reproduced by a speaker so that it can be listened to.

Coding device

The processing of the encoding device of the first embodiment is described with reference to Fig. 1. As illustrated in Fig. 1, the encoding device of the first embodiment includes a frequency domain converting unit 11, a fricative determination unit 12, a fricative adjustment unit 13, an encoding unit 14, and a multiplexing unit 15. The time-domain sound signal input to the encoding device is input to the frequency domain converting unit 11. Each unit of the encoding device performs its processing in units of frames of a predetermined time length. The encoding method according to the first embodiment is realized by the units of the encoding device performing the processing of steps S11 to S15 illustrated in Fig. 2.

Alternatively, a configuration may be adopted in which a frequency-domain sound signal, instead of the time-domain sound signal, is input to the encoding device. In this configuration, the encoding device need not include the frequency domain converting unit 11, and the frequency-domain sound signal in units of frames of the predetermined time length may be input to the fricative determination unit 12 and the fricative adjustment unit 13.

[ frequency domain converting part 11]

The frequency domain converting unit 11 receives the time-domain sound signal input to the encoding device. The frequency domain converting unit 11 converts the input time-domain sound signal, in units of frames of a predetermined time length, into an N-point spectrum sequence X_0, …, X_{N-1} in the frequency domain by, for example, the Modified Discrete Cosine Transform (MDCT), and outputs it (step S11). N is a positive integer, for example N = 32. The subscript of X is a number assigned in ascending order of frequency, starting from the spectrum with the lowest frequency. As the method of transforming into the frequency domain, various known transforms other than the MDCT (for example, the discrete Fourier transform or the short-time Fourier transform) can also be used.
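As a rough illustration of step S11, the following sketch evaluates the standard MDCT definition directly, mapping 2N time-domain samples to N coefficients X_0, …, X_{N-1}. It is a minimal sketch only: windowing and the 50% frame overlap that a practical MDCT codec relies on are omitted, the O(N^2) evaluation is chosen for clarity rather than speed, and the function name `mdct` is introduced here for illustration.

```python
import math
from typing import List, Sequence


def mdct(frame: Sequence[float]) -> List[float]:
    """Direct MDCT: 2N time samples in, N coefficients X_0..X_{N-1} out.

    X_k = sum_{i=0}^{2N-1} x_i * cos(pi/N * (i + 1/2 + N/2) * (k + 1/2))
    No analysis window is applied (simplifying assumption).
    """
    if len(frame) == 0 or len(frame) % 2 != 0:
        raise ValueError("frame length must be a positive even number (2N)")
    n = len(frame) // 2
    return [
        sum(
            x * math.cos(math.pi / n * (i + 0.5 + n / 2) * (k + 0.5))
            for i, x in enumerate(frame)
        )
        for k in range(n)
    ]


# 64 time samples give the N = 32 coefficients used in the example of the text.
spectrum = mdct([math.sin(0.3 * i) for i in range(64)])
```

Lower indices of the returned list correspond to lower frequencies, matching the subscript convention described above.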

The frequency domain converting unit 11 outputs the spectrum sequence obtained by the conversion to the fricative determination unit 12 and the fricative adjustment unit 13. The frequency domain converting unit 11 may also apply filtering processing or companding processing for perceptual weighting to the spectrum sequence obtained by the conversion, and output the sequence after the filtering or companding processing as the spectrum sequence X_0, …, X_{N-1}.

[ Fricative determination unit 12 (fricative determination device) ]

The fricative determination unit 12 receives, for example, the spectrum sequence X_0, …, X_{N-1} output by the frequency domain converting unit 11. Using the input spectrum sequence X_0, …, X_{N-1}, the fricative determination unit 12 determines, in units of frames, whether or not the sound signal is a fricative sound, and outputs the determination result to the fricative adjustment unit 13 and the multiplexing unit 15 as fricative determination information (step S12). For example, 1-bit information may be used as the fricative determination information. That is, when the sound signal of the frame is a fricative sound, the fricative determination unit 12 may output the bit "1" as fricative determination information indicating a fricative sound, and when the sound signal of the frame is not a fricative sound, it may output the bit "0" as fricative determination information indicating a sound that is not a fricative sound.

The fricative determination unit 12 obtains, for example, as an index of how fricative the frame is, an index that takes a larger value as the ratio of the average energy of the samples located on the high-range side of the input spectrum sequence X_0, …, X_{N-1} to the average energy of the samples located on the low-range side of the input spectrum sequence X_0, …, X_{N-1} is larger. The fricative determination unit 12 determines that the sound is a fricative sound when the obtained index is greater than a predetermined threshold, or is equal to or greater than the threshold, and determines that the sound is not a fricative sound otherwise, that is, when the obtained index is equal to or less than the threshold, or is less than the threshold, respectively.

Let MA be an integer larger than 1 and smaller than N-1, and let MB be an integer larger than MA and smaller than N. The fricative determination unit 12, for example, treats the samples of the spectrum sequence X_0, …, X_{N-1} whose sample number is MA or less, that is, X_0, …, X_{MA}, as the samples on the low-range side, and the samples whose sample number is MB or more, that is, X_{MB}, …, X_{N-1}, as the samples on the high-range side. It takes the average of the absolute values, or the average of the squares, of the values of all or some of the samples X_0, …, X_{MA} as the low-range-side average energy, takes the average of the absolute values, or the average of the squares, of the values of all or some of the samples X_{MB}, …, X_{N-1} as the high-range-side average energy, and obtains the value of the high-range-side average energy divided by the low-range-side average energy as the index of how fricative the sound is.
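The index just described can be written down as a short sketch. The function names are introduced here for illustration, the sketch uses all samples on each side (the text also allows using only some of them), and the handling of a zero low-range-side energy is an assumption, since the text does not address that case.

```python
from typing import Sequence


def average_energy(samples: Sequence[float], use_squares: bool = True) -> float:
    """Mean of squares (or mean of absolute values) of the given samples."""
    if not samples:
        raise ValueError("at least one sample is required")
    if use_squares:
        total = sum(s * s for s in samples)
    else:
        total = sum(abs(s) for s in samples)
    return total / len(samples)


def fricative_index(
    spectrum: Sequence[float], ma: int, mb: int, use_squares: bool = True
) -> float:
    """High-range-side average energy divided by low-range-side average energy.

    Low side:  X_0 .. X_MA      (indices 0..ma inclusive)
    High side: X_MB .. X_{N-1}  (indices mb..N-1 inclusive)
    """
    n = len(spectrum)
    if not (1 < ma < n - 1 and ma < mb < n):
        raise ValueError("require 1 < MA < N-1 and MA < MB < N")
    low = average_energy(spectrum[: ma + 1], use_squares)
    high = average_energy(spectrum[mb:], use_squares)
    if low == 0.0:
        # Assumption: treat a silent low band as maximally fricative-like
        # unless the high band is silent too.
        return float("inf") if high > 0.0 else 0.0
    return high / low


toy_spectrum = [1.0] * 4 + [3.0] * 4
index_sq = fricative_index(toy_spectrum, ma=3, mb=4)
index_abs = fricative_index(toy_spectrum, ma=3, mb=4, use_squares=False)
```

With the toy spectrum above, the index is 9 when squares are averaged and 3 when absolute values are averaged, so the threshold would need to be chosen to match whichever energy measure is used.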

The integer MA may be set so that the low-range-side samples used to calculate the low-range-side average energy in the fricative determination unit 12 are included in the low-range-side spectrum sequence of the fricative adjustment unit 13 described later. That is, the integer MA used in the fricative determination unit 12 may be set to a value less than the integer M of the fricative adjustment unit 13 described later. Likewise, the integer MB may be set so that the high-range-side samples used to calculate the high-range-side average energy in the fricative determination unit 12 are included in the high-range-side spectrum sequence of the fricative adjustment unit 13 described later. That is, the integer MB used in the fricative determination unit 12 may be set to a value equal to or larger than the integer M of the fricative adjustment unit 13 described later.

When the values of only some of the low-range-side samples X₀,…,X_MA are used to calculate the index, one or more samples starting from the lowest frequency of X₀,…,X_MA may be used. That is, with α a positive integer less than MA, the mean of the absolute values, or the mean of the squares, of the values of the samples X₀,…,X_α may be set as the low-range-side average energy. The value of α is determined in advance, by experiment or the like, so that X₀,…,X_α covers the range in which the spectrum of sounds other than fricative sounds can normally exist.

In the encoding process of the encoding unit 14 described later, because of the upper limit on the number of bits available to the encoding process, there are cases where no bits at all are allocated to several samples starting from the highest frequency of the adjusted spectrum sequence. In such a case, if no bits are allocated to β samples (β is a positive integer) starting from the highest frequency of the spectrum sequence regardless of whether or not the spectrum adjustment process of the fricative adjustment unit 13 described later is performed, then X_MB,…,X_N-1-β, i.e., X_MB,…,X_N-1 with the β samples from the highest frequency removed, may be used to calculate the index. That is, the mean of the absolute values, or the mean of the squares, of the values of the samples X_MB,…,X_N-1-β may be set as the high-range-side average energy. The value of β may be determined in advance in accordance with the predesigned encoding process of the encoding unit 14 and adjustment process of the fricative adjustment unit 13.

Figs. 5 and 6 show examples of the processing of the fricative adjustment unit 13, described later, when N is 32 and M is 20. In these examples, X₀,…,X₁₉ of the spectrum sequence is set as the low-range-side spectrum sequence and X₂₀,…,X₃₁ of the spectrum sequence is set as the high-range-side spectrum sequence. Accordingly, the fricative determination unit 12 may set MA to a value smaller than 20, for example 19, and MB to a value of 20 or more, for example 20, and may set the mean of the absolute values, or the mean of the squares, of the values of all or some of the samples in X₀,…,X₁₉ as the low-range-side average energy and the mean of the absolute values, or the mean of the squares, of the values of all or some of the samples in X₂₀,…,X₃₁ as the high-range-side average energy. When α is 8, the fricative determination unit 12 may set the mean of the absolute values, or the mean of the squares, of the values of the samples X₀,…,X₈ as the low-range-side average energy. When β is 4, the fricative determination unit 12 may set the mean of the absolute values, or the mean of the squares, of the values of the samples X₂₀,…,X₂₇ as the high-range-side average energy.
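The index calculation described above can be sketched as follows. This is only a minimal illustration under stated assumptions: the function names, the `use_squares` switch, the handling of a zero low-range-side energy, and the threshold value are hypothetical and do not appear in the original text.

```python
def fricative_index(X, MA, MB, alpha=None, beta=0, use_squares=True):
    """Ratio of high-range-side to low-range-side average energy.

    X     : spectrum sequence X_0 .. X_{N-1} (list of numbers)
    MA    : largest sample number of the low-range-side samples
    MB    : smallest sample number of the high-range-side samples
    alpha : if given, only X_0 .. X_alpha are used on the low-range side
    beta  : number of highest-frequency samples excluded on the high-range side
    """
    N = len(X)
    low_end = MA if alpha is None else alpha
    low = X[: low_end + 1]      # X_0 .. X_MA, or X_0 .. X_alpha
    high = X[MB : N - beta]     # X_MB .. X_{N-1-beta}

    # Mean of squares or mean of absolute values, as the text allows either.
    energy = (lambda v: v * v) if use_squares else abs
    low_avg = sum(energy(v) for v in low) / len(low)
    high_avg = sum(energy(v) for v in high) / len(high)

    # Assumption: a silent low range is treated as an unbounded index.
    if low_avg == 0:
        return float("inf")
    return high_avg / low_avg


def is_fricative(X, MA, MB, threshold, **kwargs):
    """Fricative if the index is at or above a (hypothetical) threshold."""
    return fricative_index(X, MA, MB, **kwargs) >= threshold


# Toy frame with N = 32: flat low range, stronger high range.
X = [1.0] * 20 + [3.0] * 12
index_all = fricative_index(X, MA=19, MB=20)                    # all samples
index_part = fricative_index(X, MA=19, MB=20, alpha=8, beta=4)  # X_0..X_8, X_20..X_27
```

With `alpha=8` and `beta=4` the slices correspond to X₀,…,X₈ and X₂₀,…,X₂₇, matching the N = 32 example in the text.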

As indicated by the broken line in Fig. 1, the fricative determination unit 12 may receive the time-domain sound signal input to the encoding device instead of the spectrum sequence output from the frequency domain transform unit 11, and may determine, in frame units, whether or not the sound signal of the frame is fricative using the input time-domain sound signal. This determination may be made, for example, as follows: the number of zero crossings of the input time-domain sound signal is obtained as the index of how fricative the sound of the frame is; if the obtained index is greater than or equal to a predetermined threshold (or greater than the threshold), the sound is determined to be fricative; otherwise, that is, if the obtained index is less than the threshold (or less than or equal to the threshold), the sound is determined not to be fricative.
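As a rough illustration of the time-domain alternative, the zero-crossing count can be computed as below. The function names and the treatment of an exact zero as non-negative are assumptions made for this sketch; the text does not specify them.

```python
def zero_crossings(frame):
    """Count sign changes between consecutive samples of one frame.

    Assumption: a sample equal to 0 is treated as non-negative.
    """
    count = 0
    for prev, cur in zip(frame, frame[1:]):
        if (prev >= 0) != (cur >= 0):
            count += 1
    return count


def is_fricative_time_domain(frame, threshold):
    """Fricative if the zero-crossing count reaches a (hypothetical) threshold."""
    return zero_crossings(frame) >= threshold


# A rapidly alternating frame crosses zero between every pair of samples.
noisy = [1, -1, 1, -1, 1, -1]
smooth = [1, 2, 3, 2, 1, 1]
```

Noise-like fricative frames tend to change sign far more often than voiced frames, which is why the count can serve as the index.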

[ fricative sound adjustment unit 13]

The spectrum sequence X₀,…,X_N-1 output from the frequency domain transform unit 11 and the fricative determination information output from the fricative determination unit 12 are input to the fricative adjustment unit 13. In frame units, when the input fricative determination information indicates that the sound is fricative, the fricative adjustment unit 13 applies the spectrum adjustment process described below to the input spectrum sequence X₀,…,X_N-1 to obtain the adjusted spectrum sequence Y₀,…,Y_N-1 and outputs the obtained adjusted spectrum sequence Y₀,…,Y_N-1 to the encoding unit 14; when the fricative determination information indicates that the sound is not fricative, it outputs the spectrum sequence X₀,…,X_N-1 as-is to the encoding unit 14 as the adjusted spectrum sequence Y₀,…,Y_N-1 (step S13).

Let M be an integer greater than 1 and less than N. If, for example, the sample group X₀,…,X_M-1, i.e., the samples of the spectrum sequence X₀,…,X_N-1 with sample numbers less than M, is taken as the low-range-side spectrum sequence, and the sample group X_M,…,X_N-1, i.e., the samples of the spectrum sequence X₀,…,X_N-1 with sample numbers of M or more, is taken as the high-range-side spectrum sequence, then the adjustment process performed by the fricative adjustment unit 13 when the fricative determination information indicates a fricative sound is a process that obtains, as the adjusted spectrum sequence Y₀,…,Y_N-1, the result of swapping all or some of the samples of the low-range-side spectrum sequence X₀,…,X_M-1 with the same number of samples from all or part of the high-range-side spectrum sequence X_M,…,X_N-1. Examples of the adjustment process performed by the fricative adjustment unit 13 are described below. The adjustment process may be any of various processes, including those exemplified below, but which process is performed is determined in advance.

[ example 1 of adjustment processing by the frictional sound adjustment unit 13]

When the fricative determination information indicates that the sound is fricative, the fricative adjustment unit 13 performs, for example, the following steps 1-1 to 1-6 to obtain the adjusted spectrum sequence Y₀,…,Y_N-1. The processing is divided into six steps here only to make the operation of the fricative adjustment unit 13 easy to understand; performing steps 1-1 to 1-6 separately is merely an example, and the fricative adjustment unit 13 may perform processing equivalent to steps 1-1 to 1-6 in a single step, for example by swapping array elements or remapping indices.

Step 1-1: Set the sample group of the spectrum sequence X₀,…,X_N-1 with sample numbers less than M as the low-range-side spectrum sequence X₀,…,X_M-1, and set the sample group of the spectrum sequence X₀,…,X_N-1 with sample numbers of M or more as the high-range-side spectrum sequence X_M,…,X_N-1.

Step 1-2: Extract C samples (C is a positive integer) contained in the low-range-side spectrum sequence X₀,…,X_M-1 obtained in step 1-1 as the adjustment target samples bound for the high-range side.

Step 1-3: Extract C samples contained in the high-range-side spectrum sequence X_M,…,X_N-1 obtained in step 1-1 as the adjustment target samples bound for the low-range side.

Step 1-4: Obtain, as the low-range-side adjusted spectrum sequence Y₀,…,Y_M-1, the result of placing the adjustment target samples bound for the low-range side, extracted from the high-range-side spectrum sequence in step 1-3, at the sample positions from which the adjustment target samples bound for the high-range side were extracted from the low-range-side spectrum sequence in step 1-2.

Step 1-5: Obtain, as the high-range-side adjusted spectrum sequence Y_M,…,Y_N-1, the result of placing the adjustment target samples bound for the high-range side, extracted from the low-range-side spectrum sequence in step 1-2, at the sample positions from which the adjustment target samples bound for the low-range side were extracted from the high-range-side spectrum sequence in step 1-3.

Step 1-6: Combine the low-range-side adjusted spectrum sequence Y₀,…,Y_M-1 obtained in step 1-4 and the high-range-side adjusted spectrum sequence Y_M,…,Y_N-1 obtained in step 1-5 to obtain the adjusted spectrum sequence Y₀,…,Y_N-1.

Fig. 5 shows an example of steps 1-1 to 1-6 when N is 32, M is 20, and C is 8. The fricative adjustment unit 13 first sets X₀,…,X₁₉ of the spectrum sequence X₀,…,X₃₁ as the low-range-side spectrum sequence and X₂₀,…,X₃₁ as the high-range-side spectrum sequence (step 1-1). The fricative adjustment unit 13 extracts the 8 samples X₂,…,X₉ contained in the low-range-side spectrum sequence X₀,…,X₁₉ as the adjustment target samples bound for the high-range side (step 1-2). The fricative adjustment unit 13 extracts the 8 samples X₂₀,…,X₂₇ contained in the high-range-side spectrum sequence X₂₀,…,X₃₁ as the adjustment target samples bound for the low-range side (step 1-3). The fricative adjustment unit 13 obtains, as the low-range-side adjusted spectrum sequence Y₀,…,Y₁₉, the result of placing X₂₀,…,X₂₇ at the sample positions where X₂,…,X₉ had been in the low-range-side spectrum sequence (step 1-4). The fricative adjustment unit 13 obtains, as the high-range-side adjusted spectrum sequence Y₂₀,…,Y₃₁, the result of placing X₂,…,X₉ at the sample positions where X₂₀,…,X₂₇ had been in the high-range-side spectrum sequence (step 1-5). The fricative adjustment unit 13 combines the low-range-side adjusted spectrum sequence Y₀,…,Y₁₉ and the high-range-side adjusted spectrum sequence Y₂₀,…,Y₃₁ to obtain the adjusted spectrum sequence Y₀,…,Y₃₁ (step 1-6).
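The swap of Fig. 5 can be written as a single in-place exchange of two equal-length slices. The sketch below assumes, as in Fig. 5, that the C samples taken from each side are contiguous; the function name and the parameter `start` (the first low-range-side sample number that is moved, 2 in Fig. 5) are hypothetical names introduced here.

```python
def adjust_swap(X, M, C, start):
    """Example 1: swap X[start .. start+C-1] with X[M .. M+C-1].

    Assumes contiguous blocks, as in Fig. 5; other choices of which
    C samples to move are equally possible.
    """
    if start + C > M or M + C > len(X):
        raise ValueError("blocks must fit inside their respective sides")
    Y = list(X)
    Y[start:start + C] = X[M:M + C]    # step 1-4: high-range samples into the low side
    Y[M:M + C] = X[start:start + C]    # step 1-5: low-range samples into the high side
    return Y


# Use sample numbers as values so the result shows where each X_k ends up.
X = list(range(32))
Y = adjust_swap(X, M=20, C=8, start=2)
# Y is X_0, X_1, X_20..X_27, X_10..X_19, X_2..X_9, X_28..X_31
```

Because two blocks of the same length simply trade places, applying the same function twice restores the original sequence.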

[ example 2 of adjustment processing by the fricative sound adjustment unit 13]

The fricative adjustment unit 13 may perform the following step 1-4' instead of step 1-4 above.

Step 1-4': Pack toward the low-range side the samples that remain in the low-range-side spectrum sequence after the adjustment target samples bound for the high-range side were extracted in step 1-2, place the adjustment target samples bound for the low-range side, extracted from the high-range-side spectrum sequence in step 1-3, at the vacated sample positions on the high-range side, and obtain the result as the low-range-side adjusted spectrum sequence Y₀,…,Y_M-1.

When the fricative adjustment unit 13 performs step 1-4' instead of step 1-4, the subsequent encoding unit 14 can encode the samples so that the lower the frequency of a sample, the higher its perceptual importance.

In this way, when the fricative sound determination unit 12 determines that the sound is a fricative sound, the fricative sound adjustment unit 13 may obtain the adjusted spectrum sequence by configuring the adjusted spectrum sequence with the low-range-side adjusted spectrum sequence and the high-range-side adjusted spectrum sequence, including a part of samples in the low-range-side spectrum sequence in the high-range-side adjusted spectrum sequence, arranging the remaining samples in the low-range-side spectrum sequence on the low-range side in the low-range-side adjusted spectrum sequence, arranging a part of samples in the high-range-side spectrum sequence on the high-range side in the low-range-side adjusted spectrum sequence, and including the remaining samples in the high-range-side spectrum sequence in the high-range-side adjusted spectrum sequence.

[ example 3 of adjustment processing by the fricative sound adjustment unit 13]

Similarly, the fricative adjustment unit 13 may perform the following step 1-5' instead of step 1-5 above.

Step 1-5': Pack toward the low-range side the samples that remain in the high-range-side spectrum sequence after the adjustment target samples bound for the low-range side were extracted in step 1-3, place the adjustment target samples bound for the high-range side, extracted from the low-range-side spectrum sequence in step 1-2, at the vacated sample positions on the high-range side, and obtain the result as the high-range-side adjusted spectrum sequence Y_M,…,Y_N-1.

When the fricative adjustment unit 13 performs step 1-5' instead of step 1-5, the subsequent encoding unit 14 can encode the samples that were originally on the high-range side with higher perceptual importance than the samples that were originally on the low-range side.

Fig. 6 shows an example, for N is 32, M is 20, and C is 8, in which, among steps 1-1 to 1-6, step 1-4' is performed instead of step 1-4 and step 1-5' is performed instead of step 1-5. The fricative adjustment unit 13 first sets X₀,…,X₁₉ of the spectrum sequence X₀,…,X₃₁ as the low-range-side spectrum sequence and X₂₀,…,X₃₁ as the high-range-side spectrum sequence (step 1-1). The fricative adjustment unit 13 extracts the 8 samples X₂,…,X₉ contained in the low-range-side spectrum sequence X₀,…,X₁₉ as the adjustment target samples bound for the high-range side (step 1-2). The fricative adjustment unit 13 extracts the 8 samples X₂₀,…,X₂₇ contained in the high-range-side spectrum sequence X₂₀,…,X₃₁ as the adjustment target samples bound for the low-range side (step 1-3). The fricative adjustment unit 13 packs X₁₀,…,X₁₉ of the low-range-side spectrum sequence toward the low-range side, places X₂₀,…,X₂₇ on the high-range side of the packed X₁₀,…,X₁₉, and obtains the result as the low-range-side adjusted spectrum sequence Y₀,…,Y₁₉ (step 1-4'). The fricative adjustment unit 13 packs X₂₈,…,X₃₁ of the high-range-side spectrum sequence toward the low-range side, places X₂,…,X₉ on the high-range side of the packed X₂₈,…,X₃₁, and obtains the result as the high-range-side adjusted spectrum sequence Y₂₀,…,Y₃₁ (step 1-5'). The fricative adjustment unit 13 combines the low-range-side adjusted spectrum sequence Y₀,…,Y₁₉ and the high-range-side adjusted spectrum sequence Y₂₀,…,Y₃₁ to obtain the adjusted spectrum sequence Y₀,…,Y₃₁ (step 1-6).
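The packed variant of Fig. 6 can be sketched with list concatenation. As before, this is only an illustration that assumes contiguous blocks; the function name and the parameter `start` (the first low-range-side sample number that is moved) are hypothetical.

```python
def adjust_packed(X, M, C, start):
    """Examples 2 and 3 combined (steps 1-4' and 1-5'), assuming contiguous blocks."""
    X = list(X)
    low, high = X[:M], X[M:]                       # step 1-1
    to_high = low[start:start + C]                 # step 1-2
    to_low = high[:C]                              # step 1-3
    low_rest = low[:start] + low[start + C:]       # remaining low-range samples, packed
    high_rest = high[C:]                           # remaining high-range samples, packed
    Y_low = low_rest + to_low                      # step 1-4'
    Y_high = high_rest + to_high                   # step 1-5'
    return Y_low + Y_high                          # step 1-6


X = list(range(32))
Y = adjust_packed(X, M=20, C=8, start=2)
# Y is X_0, X_1, X_10..X_19, X_20..X_27, X_28..X_31, X_2..X_9
```

In this particular configuration the net effect is that X₂,…,X₉ move to the end of the sequence while every other sample keeps its relative order.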

In this way, when the fricative decision unit 12 decides that the sound is a fricative sound, the fricative sound adjustment unit 13 may configure the adjusted spectrum sequence by the low-range adjusted spectrum sequence and the high-range adjusted spectrum sequence, place a part of samples in the low-range spectrum sequence on the high-range side of the high-range adjusted spectrum sequence, include the remaining samples in the low-range spectrum sequence in the low-range adjusted spectrum sequence, place a part of samples in the high-range spectrum sequence in the low-range adjusted spectrum sequence, and place the remaining samples in the high-range spectrum sequence on the low-range side of the high-range adjusted spectrum sequence, thereby obtaining the adjusted spectrum sequence.

[ example 4 of adjustment processing by the fricative sound adjustment unit 13]

It is desirable that the fricative adjustment unit 13 not include one or more samples starting from the lowest frequency of the low-range-side spectrum sequence among the adjustment target samples bound for the high-range side in step 1-2. This is because low-frequency samples contribute to the continuity of the signal waveform between frames, so the encoding unit 14 should encode them with more bits allocated. That is, with γ a positive integer, the C adjustment target samples may be selected from X_γ,…,X_M-1 of the low-range-side spectrum sequence; for example, X_γ,…,X_γ+C-1 may be set as the adjustment target samples. Increasing the value of γ improves the continuity of the signal waveform between frames, but the number of bits the encoding unit 14 allocates to the other samples becomes relatively smaller, so the perceptual quality of the decoded sound within a frame decreases. The value of γ may therefore be determined in advance, by experiment or the like, with these considerations in mind.

In the examples of Figs. 5 and 6 above, γ is set to 2, so that the two samples from the lowest frequency of the low-range-side spectrum sequence, X₀ and X₁, are not included in the adjustment target samples moved from the low-range-side spectrum sequence to the high-range side.

In other words, when the fricative determination unit 12 determines that the sound is fricative, the fricative adjustment unit 13 may obtain, as the adjusted spectrum sequence, the result of swapping a part of the low-range-side spectrum sequence that is located on its high-range side with the same number of samples from all or a part of the high-range-side spectrum sequence.

[ example 5 of adjustment processing by the fricative sound adjustment unit 13]

In the encoding process of the encoding unit 14 described later, because of the upper limit on the number of bits available to the encoding process, there are cases where no bits are allocated to several samples starting from the highest frequency of the adjusted spectrum sequence. In such a case, one or more samples starting from the highest frequency of the high-range-side spectrum sequence X_M,…,X_N-1 may be excluded from encoding, and the remaining samples on the low-range side of the high-range-side spectrum sequence X_M,…,X_N-1 may be set as the encoding targets. Therefore, in this case, the fricative adjustment unit 13 may leave one or more samples starting from the highest frequency of the high-range-side spectrum sequence out of the adjustment target samples moved from the high-range-side spectrum sequence to the low-range side in step 1-3.

In the examples of Figs. 5 and 6 above, the four samples from the highest frequency of the high-range-side spectrum sequence, X₂₈,…,X₃₁, are not included in the adjustment target samples moved from the high-range-side spectrum sequence to the low-range side.

In other words, when the fricative determination unit 12 determines that the sound is fricative, the fricative adjustment unit 13 may obtain, as the adjusted spectrum sequence, the result of swapping all or a part of the low-range-side spectrum sequence with the same number of samples from a part of the high-range-side spectrum sequence that is located on its low-range side.

[ encoding section 14]

The adjusted spectrum sequence Y₀,…,Y_N-1 output from the fricative adjustment unit 13 is input to the encoding unit 14. In frame units, the encoding unit 14 encodes the input adjusted spectrum sequence Y₀,…,Y_N-1 by a method that preferentially allocates bits to samples with small sample numbers, for example the same method as that of non-patent document 1, to obtain a spectrum code, and outputs the obtained spectrum code to the multiplexing unit 15 (step S14).

Here, a method that preferentially allocates bits to samples with small sample numbers is, for example, the following: divide the adjusted spectrum sequence Y₀,…,Y_N-1 into a plurality of partial sequences, divide each sample in a partial sequence by a gain whose value is smaller for partial sequences with smaller sample numbers, and obtain the spectrum code corresponding to the adjusted spectrum sequence by encoding the integer values of the division results with a variable-length code or a fixed-length code, or by vector quantization. In this case, a code may not be obtained at all for a partial sequence with large sample numbers. That is, no bits may be allocated to a partial sequence with large sample numbers.

For a partial sequence of the adjusted spectrum sequence Y₀,…,Y_N-1 with small sample numbers, each sample in the partial sequence is divided by a gain of small value, yielding integer values of large magnitude, and these integer values are encoded. For a partial sequence of the adjusted spectrum sequence Y₀,…,Y_N-1 with large sample numbers, by contrast, each sample in the partial sequence is divided by a gain of large value, yielding integer values of small magnitude, and these integer values are encoded. Most of the integer values obtained by dividing the sample values of a partial sequence by a gain of large value are 0.
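To make the effect of per-band gains concrete, the following toy quantizer divides each partial sequence by its gain and rounds to integers. It is not the method of non-patent document 1; the equal-size band split, the gain values, and the use of Python's built-in `round` are assumptions chosen purely for illustration, and the entropy or vector coding of the integers is omitted.

```python
def quantize_by_band(Y, band_size, gains):
    """Divide each partial sequence by its gain and round to integers.

    Assumptions: equal-size partial sequences, one gain per partial
    sequence, gains increasing with sample number.
    """
    if len(Y) != band_size * len(gains):
        raise ValueError("Y must contain band_size samples per gain")
    q = []
    for b, gain in enumerate(gains):
        band = Y[b * band_size:(b + 1) * band_size]
        q.extend(round(v / gain) for v in band)
    return q


# Small gain for the first partial sequence, large gain for the second.
Y = [8.0, -6.0, 4.0, 3.0, 2.0, 1.0, 0.5, 0.2]
q = quantize_by_band(Y, band_size=4, gains=[1.0, 8.0])
# The second partial sequence collapses to zeros, so it costs few or no bits.
```

This is why moving perceptually important high-range samples to small sample numbers before encoding lets them survive quantization.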

Further, as shown by the chain line in fig. 1, if the fricative adjustment unit 13 and the encoding unit 14 are the fricative correspondence encoding unit 17, it can be said that, when the fricative determination unit 12 determines that the sounds are fricative sounds, the fricative correspondence encoding unit 17 encodes the spectrum sequence by the encoding process of preferentially allocating bits to the high-range side to obtain the spectrum code, and that, when the sounds are other than the above, the fricative correspondence encoding unit 17 encodes the spectrum sequence by the encoding process of preferentially allocating bits to the low-range side to obtain the spectrum code.

[ multiplexing section 15]

The fricative decision information output from the fricative decision unit 12 and the spectrum code output from the encoding unit 14 are input to the multiplexing unit 15. The multiplexing unit 15 outputs a code obtained by concatenating the code corresponding to the inputted fricative determination information and the spectrum code in units of frames (step S15). When the fricative determination information output from the fricative determination unit 12 is 1-bit information, the fricative determination information itself output from the fricative determination unit 12 and input to the multiplexing unit 15 may be a code corresponding to the fricative determination information.

Decoding device

Referring to fig. 3, a process of the decoding device of the first embodiment will be described. As illustrated in fig. 3, the decoding device according to the first embodiment includes a demultiplexing unit 21, a decoding unit 22, a fricative adjustment canceling unit 23, and a time domain converting unit 24. The code output by the encoding device is input to the decoding device. The code input to the decoding apparatus is input to the demultiplexing section 21. The decoding device performs processing for each unit in units of frames of a predetermined time length. The decoding method according to the first embodiment is realized by each section of the decoding apparatus performing the following processing of step S21 to step S24 illustrated in fig. 4.

[ demultiplexing unit 21]

The code output from the encoding device is input to the demultiplexing unit 21. In frame units, the demultiplexing unit 21 separates the input code into the code corresponding to the fricative determination information and the spectrum code, outputs the fricative determination information obtained from the code corresponding to the fricative determination information to the fricative adjustment canceling unit 23, and outputs the spectrum code to the decoding unit 22 (step S21).

When the fricative decision information is 1-bit information, the code itself corresponding to the fricative decision information input to the demultiplexing unit 21 may be the fricative decision information.
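A minimal sketch of the 1-bit case, covering both the multiplexing unit 15 and the demultiplexing unit 21, might look as follows. Representing codes as strings of '0' and '1' and placing the flag first are assumptions for illustration; the text only states that the two codes are concatenated.

```python
def multiplex(is_fricative, spectrum_code):
    """Concatenate a 1-bit fricative flag and the spectrum code (bit string)."""
    return ("1" if is_fricative else "0") + spectrum_code


def demultiplex(code):
    """Split a frame's code back into the fricative flag and the spectrum code."""
    return code[0] == "1", code[1:]


frame_code = multiplex(True, "010011")
flag, spectrum_code = demultiplex(frame_code)
```

Since the flag is a single bit, it serves directly as its own code, as the text notes.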

[ decoding section 22]

The spectrum code output from the demultiplexing unit 21 is input to the decoding unit 22. In frame units, the decoding unit 22 decodes the input spectrum code by a decoding method corresponding to the encoding method used by the encoding unit 14 of the encoding device to obtain the decoded adjusted spectrum sequence ^Y₀,…,^Y_N-1, and outputs the obtained decoded adjusted spectrum sequence ^Y₀,…,^Y_N-1 to the fricative adjustment canceling unit 23 (step S22).

When decoding by the decoding method corresponding to the encoding method exemplified above in the description of the encoding unit 14 of the encoding device, the decoding unit 22 decodes the spectrum code to obtain a sequence of integer values, multiplies each integer value by a gain whose value is smaller for partial sequences with smaller sample numbers to obtain partial sequences of sample values, and combines the plurality of partial sequences to obtain the decoded adjusted spectrum sequence ^Y₀,…,^Y_N-1. For a partial sequence with large sample numbers to which the encoding device allocated no bits, the values of the corresponding decoded adjusted spectrum are set to 0, for example. In addition, multiplying a sample whose integer value is 0 by the gain also gives 0, so the value of its decoded adjusted spectrum is 0. That is, for a partial sequence with large sample numbers, most of the integer values are 0 and most of the decoded adjusted spectrum values are therefore 0.
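The gain-based reconstruction can be sketched as below. This is a toy model, not the actual decoder: equal-size partial sequences, one gain per partial sequence, and a count of coded partial sequences (`n_coded_bands`) passed explicitly are all assumptions introduced for the example.

```python
def dequantize_by_band(q, band_size, gains, n_coded_bands):
    """Rebuild the decoded adjusted spectrum from integer values.

    q holds integers only for the first n_coded_bands partial sequences;
    partial sequences that received no bits are filled with zeros.
    """
    if len(q) != band_size * n_coded_bands:
        raise ValueError("q must hold band_size integers per coded band")
    out = []
    for b, gain in enumerate(gains):
        if b < n_coded_bands:
            band = q[b * band_size:(b + 1) * band_size]
            out.extend(v * gain for v in band)
        else:
            out.extend([0.0] * band_size)   # no bits allocated: decoded value 0
    return out


# Only the first of two partial sequences was coded.
Y_hat = dequantize_by_band([8, -6, 4, 3], band_size=4, gains=[1.0, 8.0], n_coded_bands=1)
```

The zeros left at large sample numbers are what the later band-related processing and the adjustment canceling step have to work around.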

In this way, the decoding unit 22 decodes, in frame units of a predetermined time interval, a spectrum code in which bits are preferentially allocated to the low-range side, and obtains a frequency-domain sample sequence corresponding to the decoded sound signal (the decoded adjusted spectrum sequence).

[ fricative adjustment canceling unit 23]

The fricative determination information output from the demultiplexing unit 21 and the decoded adjusted spectrum sequence ^Y₀,…,^Y_N-1 output from the decoding unit 22 are input to the fricative adjustment canceling unit 23. In frame units, when the input fricative determination information indicates that the sound is fricative, the fricative adjustment canceling unit 23 applies the adjustment canceling process described below to the input decoded adjusted spectrum sequence ^Y₀,…,^Y_N-1 to obtain the decoded spectrum sequence ^X₀,…,^X_N-1 and outputs the obtained decoded spectrum sequence ^X₀,…,^X_N-1 to the time domain conversion unit 24; when the fricative determination information indicates that the sound is not fricative, it outputs the decoded adjusted spectrum sequence ^Y₀,…,^Y_N-1 as-is to the time domain conversion unit 24 as the decoded spectrum sequence ^X₀,…,^X_N-1 (step S23).

Let M be an integer greater than 1 and less than N. If, for example, the sample group ^Y₀,…,^Y_M-1, i.e., the samples of the decoded adjusted spectrum sequence ^Y₀,…,^Y_N-1 with sample numbers less than M, is taken as the low-range-side decoded adjusted spectrum sequence, and the sample group ^Y_M,…,^Y_N-1, i.e., the samples of the decoded adjusted spectrum sequence ^Y₀,…,^Y_N-1 with sample numbers of M or more, is taken as the high-range-side decoded adjusted spectrum sequence, then the adjustment canceling process performed by the fricative adjustment canceling unit 23 when the fricative determination information indicates a fricative sound is a process that obtains, as the decoded spectrum sequence ^X₀,…,^X_N-1, the result of swapping all or some of the samples of the low-range-side decoded adjusted spectrum sequence ^Y₀,…,^Y_M-1 with the same number of samples from all or part of the high-range-side decoded adjusted spectrum sequence ^Y_M,…,^Y_N-1. The adjustment canceling process performed by the fricative adjustment canceling unit 23 may be any of various processes, including those exemplified below, but it is determined in advance so as to be the inverse of the adjustment process performed by the fricative adjustment unit 13 of the corresponding encoding device.

In other words, when the input information indicating whether or not the sound is fricative indicates that the sound is fricative, the fricative adjustment canceling unit 23 swaps all or a part of the low-range-side frequency sample sequence (the low-range-side decoded adjusted spectrum sequence), which is located on the low-range side of a predetermined frequency within the frequency-domain sample sequence obtained by the decoding unit 22, with the same number of samples from all or a part of the high-range-side frequency sample sequence (the high-range-side decoded adjusted spectrum sequence), which is located on the high-range side of the predetermined frequency within the frequency-domain sample sequence obtained by the decoding unit 22, and obtains the result as the spectrum sequence of the decoded sound signal (the decoded spectrum sequence). In all other cases, the fricative adjustment canceling unit 23 obtains the frequency-domain sample sequence obtained by the decoding unit 22 (the decoded adjusted spectrum sequence) as-is as the spectrum sequence of the decoded sound signal (the decoded spectrum sequence).

[ example 1 of adjustment canceling processing by the frictional sound adjustment canceling unit 23]

When the fricative determination information indicates that the sound is fricative, the fricative adjustment canceling unit 23 obtains the decoded spectrum sequence ^X₀,…,^X_N-1 by performing, for example, the following steps 2-1 to 2-6. The processing is divided into six steps here only to make the operation of the fricative adjustment canceling unit 23 easy to understand; performing steps 2-1 to 2-6 separately is merely an example, and the fricative adjustment canceling unit 23 may perform processing equivalent to steps 2-1 to 2-6 in a single step, for example by swapping array elements or remapping indices.

Step 2-1: Set the sample group of the decoded adjusted spectrum sequence ^Y₀,…,^Y_N-1 with sample numbers less than M as the low-range-side decoded adjusted spectrum sequence ^Y₀,…,^Y_M-1, and set the sample group of the decoded adjusted spectrum sequence ^Y₀,…,^Y_N-1 with sample numbers of M or more as the high-range-side decoded adjusted spectrum sequence ^Y_M,…,^Y_N-1.

Step 2-2: Extract C samples (C is a positive integer) contained in the low-range-side decoded adjusted spectrum sequence ^Y₀,…,^Y_M-1 obtained in step 2-1 as the adjustment target samples bound for the high-range side.

Step 2-3: Extract C samples contained in the high-range-side decoded adjusted spectrum sequence ^Y_M,…,^Y_N-1 obtained in step 2-1 as the adjustment target samples bound for the low-range side.

Step 2-4: Obtain, as the low-range-side decoded spectrum sequence ^X₀,…,^X_M-1, the result of placing the adjustment target samples bound for the low-range side, extracted from the high-range-side decoded adjusted spectrum sequence in step 2-3, at the sample positions from which the adjustment target samples bound for the high-range side were extracted from the low-range-side decoded adjusted spectrum sequence in step 2-2.

Step 2-5: Obtain, as the high-range-side decoded spectrum sequence ^X_M,…,^X_N-1, the result of placing the adjustment target samples bound for the high-range side, extracted from the low-range-side decoded adjusted spectrum sequence in step 2-2, at the sample positions from which the adjustment target samples bound for the low-range side were extracted from the high-range-side decoded adjusted spectrum sequence in step 2-3.

Step 2-6: Combine the low-range-side decoded spectrum sequence ^X₀,…,^X_M-1 obtained in step 2-4 and the high-range-side decoded spectrum sequence ^X_M,…,^X_N-1 obtained in step 2-5 to obtain the decoded spectrum sequence ^X₀,…,^X_N-1.

Fig. 7 shows an example of steps 2-1 to 2-6 when N is 32, M is 20, and C is 8. The fricative adjustment canceling unit 23 first sets ^Y₀,…,^Y₁₉ of the decoded adjusted spectrum sequence ^Y₀,…,^Y₃₁ as the low-range-side decoded adjusted spectrum sequence and ^Y₂₀,…,^Y₃₁ as the high-range-side decoded adjusted spectrum sequence (step 2-1). The fricative adjustment canceling unit 23 extracts the 8 samples ^Y₂,…,^Y₉ contained in the low-range-side decoded adjusted spectrum sequence ^Y₀,…,^Y₁₉ as the adjustment target samples bound for the high-range side (step 2-2). The fricative adjustment canceling unit 23 extracts the 8 samples ^Y₂₀,…,^Y₂₇ contained in the high-range-side decoded adjusted spectrum sequence ^Y₂₀,…,^Y₃₁ as the adjustment target samples bound for the low-range side (step 2-3). The fricative adjustment canceling unit 23 obtains, as the low-range-side decoded spectrum sequence ^X₀,…,^X₁₉, the result of placing ^Y₂₀,…,^Y₂₇ at the sample positions where ^Y₂,…,^Y₉ had been in the low-range-side decoded adjusted spectrum sequence (step 2-4). The fricative adjustment canceling unit 23 obtains, as the high-range-side decoded spectrum sequence ^X₂₀,…,^X₃₁, the result of placing ^Y₂,…,^Y₉ at the sample positions where ^Y₂₀,…,^Y₂₇ had been in the high-range-side decoded adjusted spectrum sequence (step 2-5). The fricative adjustment canceling unit 23 combines the low-range-side decoded spectrum sequence ^X₀,…,^X₁₉ and the high-range-side decoded spectrum sequence ^X₂₀,…,^X₃₁ to obtain the decoded spectrum sequence ^X₀,…,^X₃₁ (step 2-6).
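The canceling process of Fig. 7 is the same block exchange as on the encoder side, applied to the decoded values. The sketch below assumes contiguous blocks as in Fig. 7; the function name and the parameter `start` (the first low-range-side sample number involved, 2 in Fig. 7) are hypothetical.

```python
def cancel_swap(Y_hat, M, C, start):
    """Example 1 of the canceling process: undo a contiguous block swap."""
    if start + C > M or M + C > len(Y_hat):
        raise ValueError("blocks must fit inside their respective sides")
    X_hat = list(Y_hat)
    X_hat[start:start + C] = Y_hat[M:M + C]    # step 2-4
    X_hat[M:M + C] = Y_hat[start:start + C]    # step 2-5
    return X_hat


# Decoded adjusted sequence in the Fig. 5 layout, with each value equal to
# the original sample number it carries.
Y_hat = [0, 1] + list(range(20, 28)) + list(range(10, 20)) + list(range(2, 10)) + [28, 29, 30, 31]
X_hat = cancel_swap(Y_hat, M=20, C=8, start=2)
# X_hat is back in natural order 0, 1, ..., 31
```

Because the encoder-side swap is an involution, the decoder needs no extra side information beyond the fricative flag and the predetermined M, C, and block position.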

[Example 2 of the adjustment canceling processing by the fricative adjustment canceling unit 23]

When the fricative adjustment unit 13 of the encoding device performs step 1-4' instead of step 1-4, the fricative adjustment canceling unit 23 performs step 2-4' described below instead of step 2-4 described above.

Step 2-4': the samples remaining in the low-range-side decoded adjusted spectrum sequence after the adjustment-target samples toward the high-range side were extracted in step 2-2 are packed toward the low-range side and toward the high-range side, the adjustment-target samples toward the low-range side extracted from the high-range-side decoded adjusted spectrum sequence in step 2-3 are placed at the sample positions left vacant, and the result is obtained as the low-range-side decoded spectrum sequence ^X₀,…,^X_M-1.

[Example 3 of the adjustment canceling processing by the fricative adjustment canceling unit 23]

When the fricative adjustment unit 13 of the encoding device performs step 1-5' instead of step 1-5, the fricative adjustment canceling unit 23 performs step 2-5' described below instead of step 2-5 described above.

Step 2-5': the samples remaining in the high-range-side decoded adjusted spectrum sequence after the adjustment-target samples toward the low-range side were extracted in step 2-3 are packed toward the high-range side, the adjustment-target samples toward the high-range side extracted from the low-range-side decoded adjusted spectrum sequence in step 2-2 are placed at the sample positions left vacant on the low-range side, and the result is obtained as the high-range-side decoded spectrum sequence ^X_M,…,^X_N-1.

Fig. 8 shows an example in which, when N is 32, M is 20, and C is 8, step 2-4' is performed in place of step 2-4 and step 2-5' is performed in place of step 2-5 among steps 2-1 to 2-6. The fricative adjustment canceling unit 23 first sets ^Y₀,…,^Y₁₉ of the decoded adjusted spectrum sequence ^Y₀,…,^Y₃₁ as the low-range-side decoded adjusted spectrum sequence and ^Y₂₀,…,^Y₃₁ as the high-range-side decoded adjusted spectrum sequence (step 2-1). It extracts the 8 samples ^Y₁₂,…,^Y₁₉ contained in the low-range-side decoded adjusted spectrum sequence ^Y₀,…,^Y₁₉ as the adjustment-target samples toward the high-range side (step 2-2). It extracts the 8 samples ^Y₂₄,…,^Y₃₁ contained in the high-range-side decoded adjusted spectrum sequence ^Y₂₀,…,^Y₃₁ as the adjustment-target samples toward the low-range side (step 2-3). It packs ^Y₀,^Y₁ of the low-range-side decoded adjusted spectrum sequence toward the low-range side and ^Y₂,…,^Y₁₁ toward the high-range side, places ^Y₂₄,…,^Y₃₁ in the gap left vacant, and obtains the result as the low-range-side decoded spectrum sequence ^X₀,…,^X₁₉ (step 2-4'). It packs ^Y₂₀,…,^Y₂₃ of the high-range-side decoded adjusted spectrum sequence toward the high-range side, places ^Y₁₂,…,^Y₁₉ on the low-range side of the packed ^Y₂₀,…,^Y₂₃, and obtains the result as the high-range-side decoded spectrum sequence ^X₂₀,…,^X₃₁ (step 2-5'). Finally, it combines the low-range-side decoded spectrum sequence ^X₀,…,^X₁₉ and the high-range-side decoded spectrum sequence ^X₂₀,…,^X₃₁ to obtain the decoded spectrum sequence ^X₀,…,^X₃₁ (step 2-6).
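The Fig. 8 variant, in which the remaining samples are packed rather than left in place, can be sketched as follows. This is a minimal sketch that mirrors only the Fig. 8 layout: it assumes that the extracted samples are the last C samples of each side, and the function name `cancel_adjustment_packed` and the parameter `keep_low` (how many remaining low-range samples are packed toward the low-range side) are hypothetical names introduced here.

```python
def cancel_adjustment_packed(y_hat, m, c, keep_low):
    """Steps 2-1 to 2-6 with step 2-4' and step 2-5' (Fig. 8 layout).

    Assumes the adjustment-target samples are the last c samples of the
    low-range side and the last c samples of the high-range side.
    keep_low is a hypothetical parameter: the number of remaining low-range
    samples packed toward the low-range side.
    """
    low, high = list(y_hat[:m]), list(y_hat[m:])      # step 2-1
    to_high = low[m - c:]                             # step 2-2
    to_low = high[len(high) - c:]                     # step 2-3
    rest_low = low[:m - c]
    rest_high = high[:len(high) - c]
    # Step 2-4': pack the rest to both ends, fill the gap with to_low.
    x_low = rest_low[:keep_low] + to_low + rest_low[keep_low:]
    # Step 2-5': pack the rest to the high end, put to_high below it.
    x_high = to_high + rest_high
    return x_low + x_high                             # step 2-6


# Fig. 8 setting: N = 32, M = 20, C = 8; sample values equal their indices.
x_hat = cancel_adjustment_packed(list(range(32)), m=20, c=8, keep_low=2)
```

With sample values equal to their sample numbers, the result is the values 0, 1, then 24 to 31, then 2 to 11 on the low-range side, followed by 12 to 19 and 20 to 23 on the high-range side.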

[Example 4 of the adjustment canceling processing by the fricative adjustment canceling unit 23]

When the fricative adjustment unit 13 of the encoding device excludes one or more samples counted from the lowest frequency from the adjustment-target samples moved from the low-range-side spectrum sequence toward the high-range side in step 1-2, the fricative adjustment canceling unit 23 likewise excludes one or more samples counted from the lowest frequency from the adjustment-target samples moved from the low-range-side decoded adjusted spectrum sequence toward the high-range side in step 2-2.

[Example 5 of the adjustment canceling processing by the fricative adjustment canceling unit 23]

When the fricative adjustment unit 13 of the encoding device excludes one or more samples counted from the highest frequency from the adjustment-target samples moved from the high-range-side spectrum sequence toward the low-range side in step 1-3, the fricative adjustment canceling unit 23 likewise excludes one or more samples counted from the highest frequency from the adjustment-target samples moved from the high-range-side decoded adjusted spectrum sequence toward the low-range side in step 2-3.

Further, as indicated by the chain line in Fig. 3, the decoding unit 22 and the fricative adjustment canceling unit 23 can be regarded together as a fricative-corresponding decoding unit 26. In that case, when the input information indicating whether or not the sound is fricative indicates a fricative sound, the fricative-corresponding decoding unit 26 decodes the spectrum code on the basis that bits are preferentially allocated to the high-range side, and obtains a spectrum sequence (decoded spectrum sequence); otherwise, it decodes the spectrum code on the basis that bits are preferentially allocated to the low-range side, and obtains a spectrum sequence (decoded spectrum sequence).

[Time domain conversion unit 24]

The decoded spectrum sequence ^X₀,…,^X_N-1 output by the fricative adjustment canceling unit 23 is input to the time domain conversion unit 24. For each frame, the time domain conversion unit 24 converts the decoded spectrum sequence ^X₀,…,^X_N-1 into a time-domain signal using the time-domain conversion method that corresponds to the frequency-domain conversion method used by the frequency domain conversion unit 11 of the encoding device, for example the inverse MDCT, and obtains and outputs a frame-unit sound signal (decoded sound signal) (step S24).

When the frequency domain conversion unit 11 of the encoding device applies filtering processing or companding processing for perceptual weighting to the spectrum sequence obtained by the conversion, the time domain conversion unit 24 applies the corresponding inverse filtering processing or inverse companding processing to the decoded spectrum sequence, converts the result into a time-domain signal, and outputs the decoded sound signal obtained thereby.

The decoding device may also output a frequency-domain decoded sound signal instead of a time-domain one. In that configuration, the decoding device need not include the time domain conversion unit 24; the frame-unit decoded spectrum sequences obtained by the fricative adjustment canceling unit 23 are concatenated in chronological order and output as the frequency-domain decoded sound signal.

Action and Effect

According to the encoding device and the decoding device of the first embodiment, the fricative adjustment processing and the corresponding fricative adjustment canceling processing are added to a conventional encoding process designed to allocate more bits to low-frequency spectra and to the decoding process corresponding to it. This makes it possible to perform compression encoding with little perceptual degradation of sound quality even for sound signals that include fricatives and the like.

A conventional technique that can compression-encode sound signals containing fricatives and the like with little perceptual degradation is an encoding/decoding technique that preferentially allocates bits to subbands with large energy. That technique, however, must transmit the bit allocation of each subband from the encoding side to the decoding side. In contrast, the encoding device and the decoding device of the first embodiment only need to transmit 1 bit of fricative determination information from the encoding side to the decoding side to achieve compression encoding with little perceptual degradation, even for sound signals containing fricatives and the like.

< modification of the first embodiment >

The modification of the first embodiment differs from the first embodiment only in the fricative sound determination unit 12 included in the encoding device. Other configurations of the encoding device and the decoding device are the same as those of the first embodiment. Hereinafter, operations of the fricative sound determination unit 12 different from those of the first embodiment, and effects of the operations in the encoding device and the decoding device caused by the operations will be described.

[Fricative determination unit 12]

The fricative determination unit 12 of the modification of the first embodiment includes a comparison result storage unit, not shown.

For each frame, the fricative determination unit 12 obtains, as an index of how fricative the sound of the frame is, an index whose value becomes larger as the ratio of the average energy of the samples on the high-range side of the input spectrum sequence X₀,…,X_N-1 of the frame to the average energy of the samples on the low-range side of that spectrum sequence becomes larger, and obtains comparison result information indicating whether the obtained index is larger than a predetermined threshold (or is equal to or larger than the threshold).

The comparison result storage unit stores the comparison result information for a predetermined number of past frames. That is, for each frame, the fricative determination unit 12 newly stores the comparison result information calculated from the spectrum sequence of the frame in the comparison result storage unit and deletes the oldest stored comparison result information.

The fricative determination unit 12 uses the comparison result information calculated from the spectrum sequence of the frame together with the comparison result information of the predetermined number of past frames stored in the comparison result storage unit. If at least half, or more than half, of these pieces of comparison result information indicate that the index is larger than the predetermined threshold (or is equal to or larger than the threshold), the fricative determination unit 12 determines that the sound is fricative; otherwise, it determines that the sound is not fricative. It outputs the determination result to the fricative adjustment unit 13 and the multiplexing unit 15 as fricative determination information.

In this way, the fricative determination unit 12 may determine that the sound signal of a frame is fricative when, among a plurality of frames including that frame, the number of frames for which the index, whose value is larger as the ratio of the average energy of the high-range-side spectrum to the average energy of the low-range-side spectrum in the spectrum sequence of the sound signal is larger, is larger than a predetermined threshold (or is equal to or larger than the threshold) is larger than, or equal to or larger than, the number of frames for which it is not.

As in the fricative determination unit 12 of the first embodiment, the fricative determination information may be, for example, 1-bit information, and the average of the absolute values or the average of the squares of the values of all or a part of the samples may be used as the average energy.
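The frame-wise comparison and the vote over stored comparison results can be sketched as follows. This is an illustrative sketch under assumptions, not the embodiment's implementation: the names `fricative_index` and `FricativeJudge`, the use of the mean of squares as the average energy, the plain energy ratio as the index, the "at least half" rule, and all numeric values are choices made here for illustration.

```python
from collections import deque


def fricative_index(spectrum, split):
    """Ratio of high-range to low-range average energy (mean of squares)."""
    low, high = spectrum[:split], spectrum[split:]
    low_energy = sum(v * v for v in low) / len(low)
    high_energy = sum(v * v for v in high) / len(high)
    return high_energy / max(low_energy, 1e-12)  # guard against division by zero


class FricativeJudge:
    """Votes over the current frame and a fixed number of past frames."""

    def __init__(self, threshold, history):
        self.threshold = threshold
        # Comparison result storage: the oldest entry is dropped automatically.
        self.past = deque(maxlen=history)

    def judge(self, spectrum, split):
        current = fricative_index(spectrum, split) > self.threshold
        votes = list(self.past) + [current]
        # Fricative if at least half of the comparison results exceed the threshold.
        is_fricative = 2 * sum(votes) >= len(votes)
        self.past.append(current)
        return is_fricative


# Toy frames (hypothetical values): energy concentrated high vs. low.
hiss = [0.1] * 4 + [1.0] * 4
vowel = [1.0] * 4 + [0.1] * 4
judge = FricativeJudge(threshold=1.0, history=2)
decisions = [judge.judge(f, split=4) for f in (hiss, vowel, vowel, hiss, hiss)]
```

In this toy run the decision lags the per-frame comparison by one frame in both directions, which is the smoothing behavior the stored comparison results are meant to provide.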

Action and Effect

When the processing of the encoding device and the decoding device of the first embodiment is performed, a frame that undergoes the adjustment processing and the adjustment canceling processing yields decoded sound with little coding distortion in the high-range components and much coding distortion in the low-range components, whereas a frame that does not undergo them yields decoded sound with much coding distortion in the high-range components and little coding distortion in the low-range components. The waveform of the decoded sound may therefore be discontinuous at the boundary between a frame that undergoes the adjustment processing and the adjustment canceling processing and a frame that does not. That is, when the determination result of the fricative determination unit 12 switches frequently, discontinuities in the waveform of the decoded sound occur frequently and may be perceived, degrading the perceptual quality. Compared with the encoding device of the first embodiment, the encoding device of the modification of the first embodiment suppresses frequent switching of the determination result of the fricative determination unit 12, which reduces how often discontinuities occur in the waveform of the decoded sound and suppresses the degradation of perceptual quality caused by perceiving them.

In the fricative determination unit 12 of the modification of the first embodiment, the larger the number of pieces of comparison result information used for the determination, the less frequently the determination result switches and the more the occurrence of discontinuities in the waveform of the decoded sound can be suppressed. The number of pieces of comparison result information used for the determination nevertheless needs to be chosen in consideration of the trade-off between the degradation of perceptual quality caused by perceived discontinuities and the perceptual quality of the decoded sound of each frame. For example, when the frame length is 3 ms, the number of pieces of comparison result information used for the determination may be 16.

< second embodiment >

The system according to the second embodiment of the present invention includes an encoding device and a decoding device, as in the system according to the first embodiment.

The second embodiment differs from the first embodiment in that the spectrum to which no bits are allocated in the encoding device is restored in the decoding device, that is, the band is extended in the decoding device. The decoding device of the second embodiment extends the band on the decoded adjusted spectrum sequence, that is, on the spectrum that has been swapped according to the fricative determination information. The spectrum to which no bits are allocated in the encoding device lies in the high range for a time section of non-fricative sound and in the low range for a time section of fricative sound. Therefore, in the second embodiment, for a time section of non-fricative sound, the band is extended by reproducing the high-range spectrum as a copy of the low-range spectrum, and for a time section of fricative sound, the band is extended by reproducing the low-range spectrum as a copy of the high-range spectrum.

The spectrum replication in the second embodiment is performed by multiplying the spectrum as the source of the replication by a gain. Therefore, the encoding device of the second embodiment obtains the gain used by the decoding device of the second embodiment in addition to the processing performed by the encoding device of the first embodiment, and outputs the code corresponding to the obtained gain.

Coding device

Referring to fig. 9, a process of the encoding device of the second embodiment is explained. As illustrated in fig. 9, the encoding device according to the second embodiment includes: a frequency domain converting unit 11, a fricative determining unit 12, a fricative adjusting unit 13, an encoding unit 14, a band expansion gain encoding unit 16, and a multiplexing unit 15. The coding apparatus according to the second embodiment of fig. 9 differs from the coding apparatus of fig. 1 in that it includes a band expansion gain coding unit 16, and the code output from the multiplexing unit 15 further includes a band expansion gain code output from the band expansion gain coding unit 16. The other configurations of the encoding device of the second embodiment, that is, the operations of the frequency domain converting unit 11, the fricative determining unit 12, the fricative adjusting unit 13, and the encoding unit 14 are the same as those of the encoding device of the first embodiment, and therefore only essential parts of the operations will be described below.

In the encoding device, a time-domain audio signal is input in units of frames of a predetermined time length. The time-domain audio signal input to the encoding apparatus is input to the frequency domain converter 11. The encoding device performs processing for each component in units of frames of a predetermined time length. The encoding method according to the second embodiment is realized by each component of the encoding apparatus performing the following processing of step S11 to step S16 illustrated in fig. 10.

[Frequency domain conversion unit 11]

The frequency domain conversion unit 11 converts the time-domain sound signal input to the encoding device into an N-point frequency-domain spectrum sequence X₀,…,X_N-1 in frame units and outputs it (step S11).

[Fricative determination unit 12]

For each frame, the fricative determination unit 12 determines whether or not the sound is fricative using the spectrum sequence X₀,…,X_N-1 obtained by the frequency domain conversion unit 11 or the time-domain sound signal input to the encoding device, and outputs the determination result as fricative determination information (step S12). The fricative determination unit 12 of the encoding device of the first embodiment outputs the fricative determination information to the fricative adjustment unit 13 and the multiplexing unit 15, whereas the fricative determination unit 12 of the encoding device of the second embodiment outputs it to the band expansion gain encoding unit 16 as well. The fricative determination unit 12 of the encoding device of the second embodiment may operate in the same way as the fricative determination unit 12 of the encoding device of the modification of the first embodiment.

In other words, the fricative determination unit 12 may determine that the sound signal is fricative when the index, whose value is larger as the ratio of the average energy of the high-range-side spectrum to the average energy of the low-range-side spectrum in the spectrum sequence of a certain frame is larger, is larger than a predetermined threshold (or is equal to or larger than the threshold).

Alternatively, the fricative determination unit 12 may determine that the sound signal is fricative when, among a plurality of frames including a certain frame, the number of frames for which the index, whose value is larger as the ratio of the average energy of the high-range-side spectrum to the average energy of the low-range-side spectrum in the spectrum sequence is larger, is larger than a predetermined threshold (or is equal to or larger than the threshold) is larger than, or equal to or larger than, the number of frames for which it is not.

[Fricative adjustment unit 13]

When the fricative determination information obtained by the fricative determination unit 12 indicates that the sound is fricative, the fricative adjustment unit 13 performs, for each frame, the spectrum adjustment processing on the spectrum sequence X₀,…,X_N-1 obtained by the frequency domain conversion unit 11 to obtain the adjusted spectrum sequence Y₀,…,Y_N-1, and outputs the obtained adjusted spectrum sequence Y₀,…,Y_N-1 to the encoding unit 14. When the fricative determination information indicates that the sound is not fricative, the fricative adjustment unit 13 outputs the spectrum sequence X₀,…,X_N-1 obtained by the frequency domain conversion unit 11 as it is to the encoding unit 14 as the adjusted spectrum sequence Y₀,…,Y_N-1 (step S13).

The spectrum adjustment processing by the fricative adjustment unit 13 is as follows: all or a part of the samples of the low-range-side spectrum sequence X₀,…,X_M-1 in the spectrum sequence X₀,…,X_N-1 are swapped with the same number of samples that are all or a part of the high-range-side spectrum sequence X_M,…,X_N-1 in the spectrum sequence X₀,…,X_N-1, and the result of the swap is used as the adjusted spectrum sequence Y₀,…,Y_N-1.

In other words, when the fricative determination unit 12 determines that the sound is fricative, the fricative adjustment unit 13 swaps all or a part of the low-range-side spectrum sequence, which lies on the low-range side of a predetermined frequency in the spectrum sequence of the sound signal, with the same number of samples that are all or a part of the high-range-side spectrum sequence, which lies on the high-range side of the predetermined frequency, and obtains the swapped result as the adjusted spectrum sequence. Otherwise, the fricative adjustment unit 13 obtains the spectrum sequence of the sound signal as it is as the adjusted spectrum sequence.
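The encoder-side adjustment and its relation to the decoder-side cancellation can be sketched as follows. This is a sketch under assumptions, not the embodiment's implementation: the names `swap_blocks` and `adjust_spectrum`, the parameters `low_start` and `high_start`, and the choice of contiguous blocks at sample numbers 2 to 9 and 20 to 27 are introduced here for illustration. For this in-place swap, applying the same swap a second time restores the original sequence, which is what the adjustment canceling processing relies on.

```python
def swap_blocks(seq, c, low_start, high_start):
    """Exchange c contiguous samples at low_start with c samples at high_start."""
    out = list(seq)
    out[low_start:low_start + c] = seq[high_start:high_start + c]
    out[high_start:high_start + c] = seq[low_start:low_start + c]
    return out


def adjust_spectrum(x, is_fricative, c, low_start, high_start):
    """Step S13: swap only when the frame is judged fricative."""
    if is_fricative:
        return swap_blocks(x, c, low_start, high_start)
    return list(x)  # non-fricative: the spectrum passes through unchanged


# Hypothetical frame with N = 32 samples whose values equal their indices.
x = [float(i) for i in range(32)]
y = adjust_spectrum(x, True, c=8, low_start=2, high_start=20)
restored = swap_blocks(y, c=8, low_start=2, high_start=20)   # decoder side
unchanged = adjust_spectrum(x, False, c=8, low_start=2, high_start=20)
```

After the swap, the samples that were in the high range occupy small sample numbers, so an encoder that favors small sample numbers spends its bits on them.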

[Encoding unit 14]

For each frame, the encoding unit 14 encodes the adjusted spectrum sequence Y₀,…,Y_N-1 obtained by the fricative adjustment unit 13 by an encoding method that preferentially allocates bits to samples with small sample numbers, obtains a spectrum code, and outputs the obtained spectrum code to the multiplexing unit 15 (step S14).

In the encoding unit 14 of the encoding device of the first embodiment, the method of preferentially allocating bits to samples with small sample numbers may be either a method that allocates bits to all samples of the adjusted spectrum sequence or a method that allocates no bits to some samples with large sample numbers. In contrast, in the encoding unit 14 of the encoding device of the second embodiment, the method is limited to one that allocates no bits to some of the adjusted spectra with large sample numbers in the adjusted spectrum sequence. The bit allocation method is determined in advance and stored in the encoding unit 14, and is also stored in the band expansion gain encoding unit 16 described later.

For example, the encoding unit 14 allocates no bits to the K (K ≦ N/2) adjusted spectra Y_N-K,…,Y_N-1 with the largest sample numbers among the N adjusted spectra of the adjusted spectrum sequence Y₀,…,Y_N-1, allocates bits to the remaining N-K adjusted spectra Y₀,…,Y_N-K-1 with the smallest sample numbers, encodes the adjusted spectrum sequence Y₀,…,Y_N-1 to obtain the spectrum code, and outputs the obtained spectrum code to the multiplexing unit 15. That is, the encoding unit 14 in effect encodes only the N-K adjusted spectra Y₀,…,Y_N-K-1 with the smallest sample numbers among the N adjusted spectra of the adjusted spectrum sequence Y₀,…,Y_N-1 to obtain the spectrum code.

[Band expansion gain encoding unit 16]

At least the adjusted spectrum sequence Y₀,…,Y_N-1 output from the fricative adjustment unit 13 is input to the band expansion gain encoding unit 16. For each frame, the band expansion gain encoding unit 16 obtains a band expansion gain code from at least the input adjusted spectrum sequence Y₀,…,Y_N-1 as described below, and outputs the obtained band expansion gain code to the multiplexing unit 15 (step S16).

In a configuration in which only the adjusted spectrum sequence Y₀,…,Y_N-1 is input to the band expansion gain encoding unit 16, the band expansion gain encoding unit 16 obtains, for each frame, the band expansion gain code from the input adjusted spectrum sequence Y₀,…,Y_N-1, for example as in example 1 described below, and outputs the obtained band expansion gain code to the multiplexing unit 15.

The band expansion gain encoding unit 16 may also be configured to receive the fricative determination information output from the fricative determination unit 12 in addition to the adjusted spectrum sequence Y₀,…,Y_N-1. In this configuration, the band expansion gain encoding unit 16 obtains, for each frame, the band expansion gain code from the input adjusted spectrum sequence Y₀,…,Y_N-1 and the fricative determination information, for example as in example 2 described below, and outputs the obtained band expansion gain code to the multiplexing unit 15.

The storage unit 161 of the band expansion gain encoding unit 16 stores in advance a plurality of pairs, each consisting of a gain candidate vector, which is a candidate of the gain vector composed of gain candidate values for a plurality of samples, and a code that identifies that gain candidate vector. The band expansion gain encoding unit 16 obtains and outputs, as the band expansion gain code, the code corresponding to the gain candidate vector that minimizes the sum of the absolute values of the differences between (a) the absolute values of the values obtained by multiplying the adjusted spectra to which bits are allocated by the encoding unit 14 by the gain candidate values constituting the gain candidate vector and (b) the absolute values of the adjusted spectra to which no bits are allocated by the encoding unit 14. Squared values or the like may be used instead of absolute values.

The following describes an example in which the adjusted spectra to which bits are allocated by the encoding unit 14 are the N-K adjusted spectra Y₀,…,Y_N-K-1 with the smallest sample numbers in the adjusted spectrum sequence Y₀,…,Y_N-1, and the adjusted spectra to which no bits are allocated by the encoding unit 14 are the K adjusted spectra Y_N-K,…,Y_N-1 with the largest sample numbers in the adjusted spectrum sequence Y₀,…,Y_N-1.

[Example 1 of the band expansion gain encoding unit 16]

In this example, J pairs, each consisting of a gain candidate vector composed of gain candidate values for K samples and a code, are stored in the storage unit 161. Hereinafter, the J gain candidate vectors are denoted G_j (j=0,…,J-1), the code corresponding to each gain candidate vector G_j (j=0,…,J-1) is denoted C_Gj (j=0,…,J-1), and each gain candidate vector G_j consists of K gain candidate values g_j,k (k=0,…,K-1).

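The band expansion gain encoding unit 16 outputs, as the band expansion gain code C_G, the code C_Gj corresponding to the gain candidate vector G_j that minimizes E_j obtained by the following formula (1), among the gain candidate vectors G_j (j=0,…,J-1) stored in the storage unit 161.

$$E_j=\sum_{k=0}^{K-1}\Bigl|\,\bigl|Y_{N-2K+k}\,g_{j,k}\bigr|-\bigl|Y_{N-K+k}\bigr|\,\Bigr| \qquad (1)$$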

In other words, the band expansion gain encoding unit 16 obtains and outputs, as the band expansion gain code, the code corresponding to the gain candidate vector that minimizes the sum E_j of the absolute differences ||Y_N-2K g_j,0|-|Y_N-K||,…,||Y_N-K-1 g_j,K-1|-|Y_N-1|| between the absolute values |Y_N-2K g_j,0|,…,|Y_N-K-1 g_j,K-1| of the values obtained by multiplying the K adjusted spectra Y_N-2K,…,Y_N-K-1 with the largest sample numbers among the adjusted spectra Y₀,…,Y_N-K-1 to which bits are allocated by the encoding unit 14 by the respective gain candidate values g_j,0,…,g_j,K-1 constituting the gain candidate vector, and the absolute values |Y_N-K|,…,|Y_N-1| of the adjusted spectra Y_N-K,…,Y_N-1 to which no bits are allocated by the encoding unit 14.
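The search over gain candidate vectors described above can be sketched as follows. This is a minimal sketch under assumptions: the function name `select_gain_code`, the representation of the stored pairs as `(code, gains)` tuples, and the toy spectrum and codebook values are introduced here for illustration and do not come from the embodiment.

```python
def select_gain_code(y, k, codebook):
    """Return the code whose gain candidate vector minimizes E_j.

    y        : adjusted spectrum sequence Y_0..Y_{N-1} (N >= 2K)
    k        : number of spectra that receive no bits
    codebook : list of (code, gains) pairs, each gains holding K values
               (hypothetical representation of storage unit 161)
    """
    n = len(y)
    source = y[n - 2 * k:n - k]   # Y_{N-2K}..Y_{N-K-1}, bits allocated
    target = y[n - k:]            # Y_{N-K}..Y_{N-1}, no bits allocated
    best_code, best_error = None, float("inf")
    for code, gains in codebook:
        # E_j = sum over k of | |Y_{N-2K+k} * g_{j,k}| - |Y_{N-K+k}| |
        error = sum(abs(abs(s * g) - abs(t))
                    for s, g, t in zip(source, gains, target))
        if error < best_error:
            best_code, best_error = code, error
    return best_code


# Toy values (hypothetical): N = 6, K = 2, three candidate vectors.
y = [9.0, 9.0, 4.0, -2.0, 2.0, 1.0]
codebook = [(0, [1.0, 1.0]), (1, [0.5, 0.5]), (2, [0.25, 0.25])]
code = select_gain_code(y, k=2, codebook=codebook)
```

In this toy case the candidate with gains of 0.5 reproduces the magnitudes of the unencoded spectra exactly, so its code is selected. Because only magnitudes are compared, the sign of a copied spectrum does not affect the choice.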

[Example 2 of the band expansion gain encoding unit 16]

In this example, as in example 1, J pairs of gain candidate vectors and codes are stored in the storage unit 161. Unlike example 1, however, two kinds of gain candidate vector are stored for each code: a gain candidate vector for fricative sounds and a gain candidate vector for non-fricative sounds. That is, J sets, each consisting of a fricative gain candidate vector composed of gain candidate values for K samples, a non-fricative gain candidate vector composed of gain candidate values for K samples, and a code, are stored in the storage unit 161. Hereinafter, the J fricative gain candidate vectors are denoted G1_j (j=0,…,J-1), the J non-fricative gain candidate vectors are denoted G2_j (j=0,…,J-1), and the code corresponding to each set of a fricative gain candidate vector G1_j (j=0,…,J-1) and a non-fricative gain candidate vector G2_j (j=0,…,J-1) is denoted C_Gj (j=0,…,J-1). Each fricative gain candidate vector G1_j consists of K gain candidate values g1_j,k (k=0,…,K-1), and each non-fricative gain candidate vector G2_j consists of K gain candidate values g2_j,k (k=0,…,K-1).

When the input fricative determination information indicates that the sound is fricative, the band expansion gain encoding unit 16 uses the fricative gain candidate vectors G1_j (j=0,…,J-1) stored in the storage unit 161 as the gain candidate vectors G_j (j=0,…,J-1); when the input fricative determination information indicates that the sound is not fricative, it uses the non-fricative gain candidate vectors G2_j (j=0,…,J-1) stored in the storage unit 161 as the gain candidate vectors G_j (j=0,…,J-1). It then outputs, as the band expansion gain code C_G, the code C_Gj corresponding to the gain candidate vector G_j that minimizes E_j obtained by formula (1) above among the gain candidate vectors G_j (j=0,…,J-1).

In other words, the band expansion gain encoding unit 16 uses the fricative gain candidate vectors stored in the storage unit 161 as the gain candidate vectors when the input fricative determination information indicates that the sound is fricative, and uses the non-fricative gain candidate vectors stored in the storage unit 161 as the gain candidate vectors when it indicates that the sound is not fricative. It then obtains and outputs, as the band expansion gain code, the code corresponding to the gain candidate vector that minimizes the sum E_j of the absolute differences ||Y_N-2K g_j,0|-|Y_N-K||,…,||Y_N-K-1 g_j,K-1|-|Y_N-1|| between the absolute values |Y_N-2K g_j,0|,…,|Y_N-K-1 g_j,K-1| of the values obtained by multiplying the K adjusted spectra Y_N-2K,…,Y_N-K-1 with the largest sample numbers among the adjusted spectra Y₀,…,Y_N-K-1 to which bits are allocated by the encoding unit 14 by the respective gain candidate values g_j,0,…,g_j,K-1 constituting the gain candidate vector, and the absolute values |Y_N-K|,…,|Y_N-1| of the adjusted spectra Y_N-K,…,Y_N-1 to which no bits are allocated by the encoding unit 14.

As described above, the band expansion gain encoding unit 16 may store a plurality of codes together with a fricative gain candidate vector and a non-fricative gain candidate vector corresponding to each code, and may use the fricative gain candidate vectors as the gain candidate vectors when the fricative determination unit 12 determines that the sound is fricative and the non-fricative gain candidate vectors as the gain candidate vectors otherwise.
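Example 2 differs from example 1 only in which table of gain candidate vectors is searched. A sketch of this selection follows; the function name `select_gain_code_by_class`, the use of the list position j as the code C_Gj, and the toy table values are assumptions made here for illustration.

```python
def select_gain_code_by_class(y, k, is_fricative, fricative_book, non_fricative_book):
    """Pick the table by the fricative flag, then minimize E_j over it.

    fricative_book[j] and non_fricative_book[j] share the same code j
    (hypothetical layout of storage unit 161).
    """
    book = fricative_book if is_fricative else non_fricative_book
    n = len(y)
    source = y[n - 2 * k:n - k]   # spectra with bits allocated
    target = y[n - k:]            # spectra without bits allocated
    best_code, best_error = None, float("inf")
    for code, gains in enumerate(book):
        error = sum(abs(abs(s * g) - abs(t))
                    for s, g, t in zip(source, gains, target))
        if error < best_error:
            best_code, best_error = code, error
    return best_code


# Toy values (hypothetical): N = 6, K = 2, J = 2 codes per table.
y = [9.0, 9.0, 4.0, -2.0, 2.0, 1.0]
fricative_book = [[1.0, 1.0], [0.25, 0.25]]
non_fricative_book = [[0.5, 0.5], [2.0, 2.0]]
code_fricative = select_gain_code_by_class(y, 2, True, fricative_book, non_fricative_book)
code_non_fricative = select_gain_code_by_class(y, 2, False, fricative_book, non_fricative_book)
```

Since the decoder receives the same fricative determination information, it can look up the same table, so the code length does not grow even though two tables are stored.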

[ Modification 1 of Examples 1 and 2 of the band expansion gain encoding unit 16]

In Examples 1 and 2 above, the adjusted spectra to be multiplied by the gain candidate values are the K adjusted spectra Y_{N-2K}, …, Y_{N-K-1} with the largest sample numbers among the adjusted spectra Y_0, …, Y_{N-K-1} to which the encoding unit 14 allocates bits. However, the adjusted spectra to be multiplied by the gain candidate values may be any K adjusted spectra, among the adjusted spectra Y_0, …, Y_{N-K-1} to which the encoding unit 14 allocates bits, that correspond to K predetermined sample numbers.

[ Modification 2 of Examples 1 and 2 of the band expansion gain encoding unit 16]

In Examples 1 and 2 above, Y_{N-2K+k}, g_{j,k}, and Y_{N-K+k} in formula (1) are associated with one another in ascending order of k. However, any association may be used as long as it is predetermined.

[ concrete example of band expansion gain coding section 16]

A specific example of the band expansion gain coding unit 16 when N is 32 and K is 12 will be described. This specific example corresponds to modification 2 of example 2 of band expansion gain encoding unit 16. Fig. 13 and 14 show examples of a band extending unit 25 and a fricative adjustment canceling unit 23 of a decoding device described later when N is 32 and K is 12.

Fig. 13 shows an example of the case where the fricative determination information indicates a non-fricative sound. As will be described later, the band expansion unit 25 of the decoding device takes the 8th to 19th decoded adjusted spectra as copy sources and sets the values obtained by multiplying the copy sources by the band expansion gains, in order of sample number, as the 20th to 31st decoded spread spectra. Therefore, when the input fricative determination information indicates a non-fricative sound, the band expansion gain encoding unit 16 takes the non-fricative gain candidate vectors stored in the storage unit 161 as the gain candidate vectors, and obtains and outputs, as the band expansion gain code, the code corresponding to the gain candidate vector that minimizes the sum E_j of the absolute differences ||Y_8 g_{j,0}| - |Y_{20}||, …, ||Y_{19} g_{j,11}| - |Y_{31}||. Here |Y_8 g_{j,0}|, …, |Y_{19} g_{j,11}| are the absolute values of the products of the 12 adjusted spectra Y_8, …, Y_{19} with the largest sample numbers among the adjusted spectra Y_0, …, Y_{19} to which the encoding unit 14 allocates bits and the gain candidate values g_{j,0}, …, g_{j,11} constituting the gain candidate vector, and |Y_{20}|, …, |Y_{31}| are the absolute values of the adjusted spectra Y_{20}, …, Y_{31} to which the encoding unit 14 allocates no bits.

Fig. 14 shows an example of the case where the fricative determination information indicates a fricative sound. As will be described later, the band expansion unit 25 of the decoding device takes the 8th to 19th decoded adjusted spectra as copy sources, multiplies the copy sources by the band expansion gains, and arranges the results so that those for the 16th to 19th sample numbers come first, followed by those for the 8th to 15th sample numbers, as the 20th to 31st decoded spread spectra. Therefore, when the input fricative determination information indicates a fricative sound, the band expansion gain encoding unit 16 takes the fricative gain candidate vectors stored in the storage unit 161 as the gain candidate vectors, and obtains and outputs, as the band expansion gain code, the code corresponding to the gain candidate vector that minimizes the sum E_j of the absolute differences ||Y_8 g_{j,0}| - |Y_{24}||, …, ||Y_{15} g_{j,7}| - |Y_{31}||, ||Y_{16} g_{j,8}| - |Y_{20}||, …, ||Y_{19} g_{j,11}| - |Y_{23}||. Here |Y_8 g_{j,0}|, …, |Y_{19} g_{j,11}| are the absolute values of the products of the 12 adjusted spectra Y_8, …, Y_{19} with the largest sample numbers among the adjusted spectra Y_0, …, Y_{19} to which the encoding unit 14 allocates bits and the gain candidate values g_{j,0}, …, g_{j,11} constituting the gain candidate vector, and |Y_{24}|, …, |Y_{31}|, |Y_{20}|, …, |Y_{23}| are the absolute values of the adjusted spectra Y_{24}, …, Y_{31}, Y_{20}, …, Y_{23} to which the encoding unit 14 allocates no bits.
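For N = 32 and K = 12, the two pairings of copy sources and targets described for Fig. 13 and Fig. 14 can be written out explicitly. The sketch below is illustrative only; `target_order` and `gain_error_32_12` are hypothetical names, and the list `Y` is assumed to hold the 32 adjusted spectra known to the encoder.

```python
from typing import List, Sequence


def target_order(is_fricative: bool) -> List[int]:
    """Sample numbers of the unallocated spectra paired, in order, with Y_8..Y_19."""
    if is_fricative:
        # Fig. 14: Y_8..Y_15 are compared with Y_24..Y_31,
        # then Y_16..Y_19 with Y_20..Y_23.
        return list(range(24, 32)) + list(range(20, 24))
    # Fig. 13: Y_8..Y_19 are compared with Y_20..Y_31 in order.
    return list(range(20, 32))


def gain_error_32_12(
    Y: Sequence[float], gains: Sequence[float], is_fricative: bool
) -> float:
    """E_j for N = 32, K = 12 with the pairing selected by the fricative flag."""
    if len(Y) != 32 or len(gains) != 12:
        raise ValueError("expected 32 spectra and 12 gain candidate values")
    sources = range(8, 20)
    return sum(
        abs(abs(Y[s] * g) - abs(Y[t]))
        for s, g, t in zip(sources, gains, target_order(is_fricative))
    )
```

The only difference between the two cases is the order of the targets; the copy sources and the k-th gain candidate value stay paired in both.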

As described above, the band spreading gain encoding unit 16 stores a plurality of codes and gain candidate vectors corresponding to the codes, each of the gain candidate vectors including K (K is an integer equal to or greater than 2) gain candidate values, and the band spreading gain encoding unit 16 obtains and outputs a code corresponding to a gain candidate vector in which an error between a sequence of K absolute values obtained by multiplying K adjusted spectrums in which bits are allocated by the encoding unit 14 in the adjusted spectrum sequence by K gain candidate values included in the gain candidate vector and a sequence of K absolute values of adjusted spectrums in which bits are not allocated by the encoding unit 14 in the adjusted spectrum sequence is minimized, as a band spreading gain code.

The operation of the band expansion gain encoding unit 16 corresponds to the operations of the band expansion unit 25 and the fricative adjustment canceling unit 23 of the decoding device. In the example of Fig. 8, the fricative adjustment canceling unit 23 of the decoding device sets the 20th to 23rd decoded spread spectra, which have the smaller sample numbers among the 20th to 31st decoded spread spectra, as the decoded spectra with sample numbers 28 to 31, and sets the 24th to 31st decoded spread spectra, which have the larger sample numbers, as the decoded spectra with sample numbers 2 to 9. The band expansion unit 25 of the decoding device performs the operation of Fig. 14 taking into account the high/low frequency order of the decoded spectrum obtained by this operation of the fricative adjustment canceling unit 23.

That is, regardless of whether the fricative determination information indicates a fricative sound or a non-fricative sound, the band expansion unit 25 of the decoding device performs processing that preserves the high/low frequency order in the decoded spectrum. The band expansion gain encoding unit 16 therefore also performs an operation corresponding to that of the band expansion unit 25.

[ multiplexing section 15]

The multiplexing unit 15 receives the fricative determination information output from the fricative determination unit 12, the spectrum code output from the encoding unit 14, and the band expansion gain code output from the band expansion gain encoding unit 16. The multiplexing unit 15 outputs a code obtained by concatenating the code corresponding to the input fricative determination information, the spectrum code, and the band expansion gain code (step S15).

Decoding device

A process of the decoding device according to the second embodiment will be described with reference to fig. 11. As illustrated in fig. 11, the decoding device according to the second embodiment includes a demultiplexing unit 21, a decoding unit 22, a band extending unit 25, a fricative adjustment canceling unit 23, and a time domain converting unit 24. The decoding device according to the second embodiment of fig. 11 differs from the decoding device according to the first embodiment of fig. 3 in that it includes a band spreading unit 25, and the multiplexing/demultiplexing unit 21 obtains a band spread gain code from the input code. The operations of the decoding unit 22, the fricative adjustment canceling unit 23, and the time domain conversion unit 24, which are other configurations of the decoding device according to the second embodiment, are the same as those of the decoding device according to the first embodiment, and therefore only essential parts of the operations will be described below.

The code output by the encoding device is input to the decoding device. The code input to the decoding apparatus is input to the demultiplexing section 21. The decoding apparatus performs processing for each component in units of frames of a predetermined time length. The decoding method of the second embodiment is implemented by each component of the decoding apparatus performing the following processing of step S21 to step S25 illustrated in fig. 12.

[ multiplexing/demultiplexing unit 21]

The multiplexing/demultiplexing unit 21 demultiplexes the input code into a code corresponding to the fricative determination information, a band expansion gain code, and a spectrum code, outputs the fricative determination information obtained from the code corresponding to the fricative determination information to the fricative adjustment canceling unit 23 and the band expanding unit 25, outputs the band expansion gain code to the band expanding unit 25, and outputs the spectrum code to the decoding unit 22 (step S21).

[ decoding section 22]

The decoding unit 22 decodes the input spectrum code in units of frames by the decoding process corresponding to the encoding process performed by the encoding unit 14 of the encoding apparatus, obtains a decoded adjusted spectrum sequence, and outputs the decoded adjusted spectrum sequence (step S22).

As described above, the encoding unit 14 of the encoding device according to the second embodiment performs an encoding process that allocates no bits to some samples with large sample numbers, so decoding the spectrum code does not yield values of the decoded adjusted spectra for those sample numbers. In the case of the above example of the encoding unit 14, the decoding unit 22 decodes the spectrum code to obtain, as the decoded adjusted spectrum sequence, the N-K decoded adjusted spectra ^Y_0, …, ^Y_{N-K-1} with the smallest sample numbers.

Alternatively, the values of the decoded adjusted spectra for the sample numbers to which the encoding unit 14 allocates no bits may be set to 0. That is, in the case of the above example of the encoding unit 14, the decoding unit 22 may decode the spectrum code, set each of the K decoded adjusted spectra ^Y_{N-K}, …, ^Y_{N-1} with the largest sample numbers to 0, and thereby obtain the decoded adjusted spectrum sequence ^Y_0, …, ^Y_{N-1}.

In this way, the decoding unit 22 decodes a spectrum code in frame units of a predetermined time interval, the spectrum code being one in which no bits are allocated to a part on the high-frequency side, to obtain a frequency-domain sample sequence (decoded adjusted spectrum sequence).

However, as will be described later, when the input information indicating whether or not the sound is fricative indicates a fricative sound, the fricative adjustment canceling unit 23 obtains, as the spectrum sequence of the decoded sound signal, the result of exchanging all or a part of the low-side frequency sample sequence, located on the low side of a predetermined frequency in the decoded spread spectrum sequence (a spectrum sequence based on the decoded adjusted spectrum sequence) obtained by the band expansion unit 25 described later, with the same number of samples forming all or a part of the high-side frequency sample sequence located on the high side of the predetermined frequency in that decoded spread spectrum sequence. In other cases, the fricative adjustment canceling unit 23 uses the decoded spread spectrum sequence obtained by the band expansion unit 25 as it is as the spectrum sequence of the decoded sound signal. That is, when the input information indicates a fricative sound, the decoding unit 22 can be regarded as decoding a spectrum code in which no bits are allocated to a part on the low side to obtain the frequency-domain spectrum sequence (decoded adjusted spectrum sequence), and in other cases it can be regarded as decoding a spectrum code in which no bits are allocated to a part on the high side to obtain the frequency-domain spectrum sequence (decoded adjusted spectrum sequence).

The decoding unit 22 of the decoding apparatus according to the first embodiment outputs the obtained decoded adjusted spectral sequence to the fricative adjustment canceling unit 23, whereas the decoding unit 22 of the decoding apparatus according to the second embodiment outputs the obtained decoded adjusted spectral sequence to the band extending unit 25.

[ band expansion section 25]

The band expansion unit 25 receives at least the band expansion gain code output from the demultiplexing unit 21 and the decoded adjusted spectrum sequence output from the decoding unit 22. Based on at least the input band expansion gain code and decoded adjusted spectrum sequence, the band expansion unit 25 obtains, in frame units, the decoded spread spectrum sequence ~Y_0, …, ~Y_{N-1} as described below, and outputs the obtained decoded spread spectrum sequence ~Y_0, …, ~Y_{N-1} to the fricative adjustment canceling unit 23 (step S25).

When the band expansion unit 25 is configured to receive only the band expansion gain code and the decoded adjusted spectrum sequence, it obtains, for example as in Example 1 below, the decoded spread spectrum sequence ~Y_0, …, ~Y_{N-1} in frame units from the input band expansion gain code and decoded adjusted spectrum sequence, and outputs the obtained decoded spread spectrum sequence ~Y_0, …, ~Y_{N-1} to the fricative adjustment canceling unit 23.

The band expansion unit 25 may also be configured to receive the fricative determination information output from the demultiplexing unit 21 in addition to the band expansion gain code and the decoded adjusted spectrum sequence. In that configuration, the band expansion unit 25 obtains, for example as in Example 2 below, the decoded spread spectrum sequence ~Y_0, …, ~Y_{N-1} in frame units from the input band expansion gain code, decoded adjusted spectrum sequence, and fricative determination information, and outputs the obtained decoded spread spectrum sequence ~Y_0, …, ~Y_{N-1} to the fricative adjustment canceling unit 23.

As with the storage unit 161 of the band expansion gain encoding unit 16 of the encoding device, the storage unit 251 of the band expansion unit 25 stores in advance a plurality of sets, each consisting of a gain candidate vector that is a candidate for the gain vector and a code that identifies that gain candidate vector; each gain candidate vector consists of gain candidate values for a plurality of samples. The band expansion unit 25 takes all or a part of the decoded adjusted spectra obtained by decoding the spectrum code (the decoded adjusted spectra corresponding to the adjusted spectra to which the encoding unit 14 of the encoding device allocates bits) as copy sources. It obtains, as the decoded spread spectrum sequence, a sequence in which the values obtained by multiplying each copy-source sample value by the corresponding band expansion gain, contained in the gain candidate vector identified by the band expansion gain code, are the decoded spread spectra corresponding to the adjusted spectra to which the encoding unit 14 allocates no bits, and in which the decoded adjusted spectra obtained by decoding the spectrum code are used as they are as the remaining decoded spread spectra.

[ example 1 of band expanding section 25]

In this example, the storage unit 251 stores J sets, each consisting of a code and a gain candidate vector composed of gain candidate values for K samples. Hereinafter, the J gain candidate vectors are denoted G_j (j = 0, …, J-1), the code corresponding to each gain candidate vector G_j (j = 0, …, J-1) is denoted C_Gj (j = 0, …, J-1), and each gain candidate vector G_j consists of K gain candidate values g_{j,k} (k = 0, …, K-1), one per sample.

The band expansion unit 25 uses the decoded adjusted spectra ^Y_0, …, ^Y_{N-K-1} as they are as the N-K decoded spread spectra ~Y_0, …, ~Y_{N-K-1} with the smallest sample numbers in the decoded spread spectrum sequence. The band expansion unit 25 also obtains, as the band expansion gains g_0, …, g_{K-1}, the K gain candidate values contained in the gain candidate vector whose corresponding code C_Gj is identical to the input band expansion gain code, from among the gain candidate vectors G_j (j = 0, …, J-1) stored in the storage unit 251. The band expansion unit 25 further sets the values ^Y_{N-2K} g_0, …, ^Y_{N-K-1} g_{K-1}, obtained by multiplying the K decoded adjusted spectra ^Y_{N-2K}, …, ^Y_{N-K-1} with the largest sample numbers among the decoded adjusted spectra ^Y_0, …, ^Y_{N-K-1} by the band expansion gains g_0, …, g_{K-1} respectively, as the K decoded spread spectra ~Y_{N-K}, …, ~Y_{N-1} with the largest sample numbers in the decoded spread spectrum sequence.
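As a rough sketch of Example 1, the copy-and-scale step can be expressed in a few lines. The function name `band_expand` is hypothetical, and the gains are assumed to have already been looked up from the code.

```python
from typing import List, Sequence


def band_expand(
    decoded_adjusted: Sequence[float], gains: Sequence[float]
) -> List[float]:
    """Build ~Y_0..~Y_{N-1} from ^Y_0..^Y_{N-K-1} and g_0..g_{K-1}.

    The N-K decoded adjusted spectra are kept as they are; the K highest of
    them are multiplied by the gains and appended as ~Y_{N-K}..~Y_{N-1}.
    """
    K = len(gains)
    n_low = len(decoded_adjusted)  # N - K
    if n_low < K:
        raise ValueError("need at least K decoded adjusted spectra")
    sources = decoded_adjusted[n_low - K:]  # ^Y_{N-2K} .. ^Y_{N-K-1}
    high = [y * g for y, g in zip(sources, gains)]
    return list(decoded_adjusted) + high
```

For instance, with four decoded adjusted spectra and K = 2, the last two samples are scaled and placed at sample numbers 4 and 5.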

[ example 2 of band expanding section 25]

In this example, the storage unit 251 stores J sets of gain candidate vectors and codes as in Example 1, but unlike Example 1, two kinds of gain candidate vectors are stored: fricative gain candidate vectors and non-fricative gain candidate vectors. That is, the storage unit 251 stores J sets, each consisting of a fricative gain candidate vector, a non-fricative gain candidate vector, and a code, and each fricative gain candidate vector and each non-fricative gain candidate vector is composed of gain candidate values for K samples. Hereinafter, the J fricative gain candidate vectors are denoted G1_j (j = 0, …, J-1), the J non-fricative gain candidate vectors are denoted G2_j (j = 0, …, J-1), and the code corresponding to each fricative gain candidate vector G1_j and non-fricative gain candidate vector G2_j (j = 0, …, J-1) is denoted C_Gj (j = 0, …, J-1). Each fricative gain candidate vector G1_j consists of K gain candidate values g1_{j,k} (k = 0, …, K-1), one per sample, and each non-fricative gain candidate vector G2_j consists of K gain candidate values g2_{j,k} (k = 0, …, K-1), one per sample.

The band expansion unit 25 uses the decoded adjusted spectra ^Y_0, …, ^Y_{N-K-1} as they are as the N-K decoded spread spectra ~Y_0, …, ~Y_{N-K-1} with the smallest sample numbers in the decoded spread spectrum sequence. When the input fricative determination information indicates a fricative sound, the band expansion unit 25 uses the fricative gain candidate vectors G1_j (j = 0, …, J-1) stored in the storage unit 251 as the gain candidate vectors G_j (j = 0, …, J-1); when it indicates a non-fricative sound, the band expansion unit 25 uses the non-fricative gain candidate vectors G2_j (j = 0, …, J-1) stored in the storage unit 251 as the gain candidate vectors G_j (j = 0, …, J-1). It then obtains, as the band expansion gains g_0, …, g_{K-1}, the K gain candidate values contained in the gain candidate vector, among the gain candidate vectors G_j (j = 0, …, J-1), whose corresponding code C_Gj is identical to the input band expansion gain code. The band expansion unit 25 further sets the values ^Y_{N-2K} g_0, …, ^Y_{N-K-1} g_{K-1}, obtained by multiplying the K decoded adjusted spectra ^Y_{N-2K}, …, ^Y_{N-K-1} with the largest sample numbers among the decoded adjusted spectra ^Y_0, …, ^Y_{N-K-1} by the band expansion gains g_0, …, g_{K-1} respectively, as the K decoded spread spectra ~Y_{N-K}, …, ~Y_{N-1} with the largest sample numbers in the decoded spread spectrum sequence.
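The gain lookup of Example 2 amounts to choosing one of two vectors stored under the same code. The sketch below assumes, purely for illustration, that the storage unit 251 is modelled as a dictionary keyed by a code string, with each entry holding the pair (G1_j, G2_j); the name `lookup_gains` and the bit-string codes are hypothetical.

```python
from typing import Dict, List, Tuple

# Hypothetical model of storage unit 251: code -> (fricative vector, non-fricative vector)
Codebook = Dict[str, Tuple[List[float], List[float]]]


def lookup_gains(codebook: Codebook, code: str, is_fricative: bool) -> List[float]:
    """Return the K band expansion gains for the given code and fricative flag."""
    fricative_vector, non_fricative_vector = codebook[code]
    return fricative_vector if is_fricative else non_fricative_vector


# Usage: the same code selects different gains depending on the flag.
example_codebook: Codebook = {
    "0": ([1.0, 1.0], [0.5, 0.5]),
    "1": ([2.0, 2.0], [0.25, 0.25]),
}
gains_for_fricative_frame = lookup_gains(example_codebook, "1", True)
```

Because the fricative determination information is transmitted separately, the band expansion gain code itself does not need to say which of the two vectors applies.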

[ Modification 1 of Examples 1 and 2 of the band expansion unit 25]

In Examples 1 and 2 above, the decoded adjusted spectra to be multiplied by the band expansion gains are the K decoded adjusted spectra ^Y_{N-2K}, …, ^Y_{N-K-1} with the largest sample numbers among the decoded adjusted spectra ^Y_0, …, ^Y_{N-K-1} obtained by decoding the spectrum code. However, the decoded adjusted spectra to be multiplied by the band expansion gains may be any K decoded adjusted spectra, among the decoded adjusted spectra ^Y_0, …, ^Y_{N-K-1} obtained by decoding the spectrum code, that correspond to K predetermined sample numbers.

[ Modification 2 of Examples 1 and 2 of the band expansion unit 25]

In Examples 1 and 2 above, the decoded adjusted spectra ^Y_{N-2K+k} are multiplied by the band expansion gains g_k to obtain the decoded spread spectra ~Y_{N-K+k}, with all three associated in ascending order of k. However, any association may be used as long as it is predetermined.

[ example of band expanding section 25]

A specific example of the band extending unit 25 when N is 32 and K is 12 will be described. This specific example corresponds to modification 2 of example 2 of band extending unit 25. Fig. 13 and 14 show examples of processing performed by the band extending unit 25 and the fricative adjustment canceling unit 23 when N is 32 and K is 12.

Fig. 13 shows an example of the case where the fricative determination information indicates a non-fricative sound. The band expansion unit 25 uses the decoded adjusted spectra ^Y_0, …, ^Y_{19} obtained by decoding the spectrum code as they are as the decoded spread spectra ~Y_0, …, ~Y_{19}. The band expansion unit 25 also obtains, as the band expansion gains g_0, …, g_{11}, the 12 gain candidate values contained in the gain candidate vector whose corresponding code C_Gj is identical to the input band expansion gain code. The band expansion unit 25 further sets the values ^Y_8 g_0, …, ^Y_{19} g_{11}, obtained by multiplying the 12 decoded adjusted spectra ^Y_8, …, ^Y_{19} with the largest sample numbers among the decoded adjusted spectra ^Y_0, …, ^Y_{19} by the band expansion gains g_0, …, g_{11} respectively, as the 12 decoded spread spectra ~Y_{20}, …, ~Y_{31} with the largest sample numbers in the decoded spread spectrum sequence.

Fig. 14 shows an example of the case where the fricative determination information indicates a fricative sound. The band expansion unit 25 uses the decoded adjusted spectra ^Y_0, …, ^Y_{19} obtained by decoding the spectrum code as they are as the decoded spread spectra ~Y_0, …, ~Y_{19}. The band expansion unit 25 also obtains, as the band expansion gains g_0, …, g_{11}, the 12 gain candidate values contained in the gain candidate vector whose corresponding code C_Gj is identical to the input band expansion gain code. The band expansion unit 25 further sets the values ^Y_8 g_0, …, ^Y_{19} g_{11}, obtained by multiplying the 12 decoded adjusted spectra ^Y_8, …, ^Y_{19} with the largest sample numbers among the decoded adjusted spectra ^Y_0, …, ^Y_{19} by the band expansion gains g_0, …, g_{11} respectively, as the decoded spread spectra ~Y_{24}, …, ~Y_{31}, ~Y_{20}, …, ~Y_{23}, in that order. That is, the band expansion unit 25 takes the 8th to 19th decoded adjusted spectra ^Y_8, …, ^Y_{19} as copy sources, multiplies them by the band expansion gains g_0, …, g_{11} to obtain ^Y_8 g_0, …, ^Y_{19} g_{11}, and arranges the decoded spread spectra ~Y_{20} = ^Y_{16} g_8, …, ~Y_{23} = ^Y_{19} g_{11} corresponding to the 16th to 19th decoded adjusted spectra in order of sample number, followed by the decoded spread spectra ~Y_{24} = ^Y_8 g_0, …, ~Y_{31} = ^Y_{15} g_7 corresponding to the 8th to 15th decoded adjusted spectra in order of sample number, to form the 20th to 31st decoded spread spectra ~Y_{20}, …, ~Y_{31}.
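The two arrangements for N = 32 and K = 12 can be sketched as follows. This is an illustrative sketch with a hypothetical function name; it assumes the 12 gains have already been obtained from the band expansion gain code.

```python
from typing import List, Sequence


def band_expand_32_12(
    decoded_adjusted: Sequence[float],
    gains: Sequence[float],
    is_fricative: bool,
) -> List[float]:
    """Return ~Y_0..~Y_31 from ^Y_0..^Y_19 and g_0..g_11."""
    if len(decoded_adjusted) != 20 or len(gains) != 12:
        raise ValueError("expected 20 decoded adjusted spectra and 12 gains")
    # products[k] = ^Y_{8+k} * g_k for k = 0..11
    products = [decoded_adjusted[8 + k] * gains[k] for k in range(12)]
    spread = list(decoded_adjusted) + [0.0] * 12
    if is_fricative:
        # Fig. 14: ~Y_24..~Y_31 come from ^Y_8..^Y_15,
        # and ~Y_20..~Y_23 come from ^Y_16..^Y_19.
        spread[24:32] = products[0:8]
        spread[20:24] = products[8:12]
    else:
        # Fig. 13: ~Y_20..~Y_31 come from ^Y_8..^Y_19 in order.
        spread[20:32] = products
    return spread
```

In both cases each copy source keeps its own gain; only the position at which the product is stored changes.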

The operation of the band expansion unit 25 corresponds to the operation of the fricative adjustment canceling unit 23. In the example of Fig. 8, the fricative adjustment canceling unit 23 sets the 20th to 23rd decoded spread spectra ~Y_{20}, …, ~Y_{23}, which have the smaller sample numbers among the 20th to 31st decoded spread spectra ~Y_{20}, …, ~Y_{31}, as the decoded spectra ^X_{28}, …, ^X_{31} with sample numbers 28 to 31, and sets the 24th to 31st decoded spread spectra ~Y_{24}, …, ~Y_{31}, which have the larger sample numbers, as the decoded spectra ^X_2, …, ^X_9 with sample numbers 2 to 9. The band expansion unit 25 performs the operation of Fig. 14 taking into account the high/low frequency order of the decoded spectrum obtained by this operation of the fricative adjustment canceling unit 23. That is, regardless of whether the fricative determination information indicates a fricative sound or a non-fricative sound, the band expansion unit 25 of the decoding device performs processing that preserves the high/low frequency order in the decoded spectrum.
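The reason for the reordering in the fricative case can be checked with index arithmetic alone. The sketch below uses only the index mappings stated in the text for Fig. 14 and Fig. 8 (the rest of the Fig. 8 mapping is not modelled), and the function names are hypothetical. It tests whether copy sources with lower sample numbers end up at lower sample numbers in the decoded spectrum.

```python
from typing import Dict


def fricative_copy_sources() -> Dict[int, int]:
    """~Y index -> ^Y copy-source index in the fricative case (Fig. 14)."""
    sources = {20 + i: 16 + i for i in range(4)}
    sources.update({24 + i: 8 + i for i in range(8)})
    return sources


def plain_copy_sources() -> Dict[int, int]:
    """~Y index -> ^Y copy-source index if the Fig. 13 order were used."""
    return {20 + k: 8 + k for k in range(12)}


def fig8_destinations() -> Dict[int, int]:
    """~Y index -> ^X index, as stated in the text for Fig. 8."""
    dest = {20 + i: 28 + i for i in range(4)}
    dest.update({24 + i: 2 + i for i in range(8)})
    return dest


def order_preserved(sources: Dict[int, int]) -> bool:
    """True if copy sources appear in ascending order along the decoded spectrum."""
    dest = fig8_destinations()
    by_decoded_index = sorted((dest[i], sources[i]) for i in sources)
    source_sequence = [s for _, s in by_decoded_index]
    return source_sequence == sorted(source_sequence)
```

With the Fig. 14 arrangement the copied samples land in ascending source order after the adjustment is cancelled, whereas applying the Fig. 13 order to a fricative frame would not give that result.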

In this way, the band spreading unit 25 obtains a decoded spread spectrum sequence by arranging samples based on K (K is an integer of 2 or more) samples included in the frequency domain sample sequence (decoded adjusted spectrum sequence) obtained by the decoding unit 22 decoding the spectrum code, on the higher side than the frequency domain sample sequence (decoded adjusted spectrum sequence) obtained by the decoding unit 22 decoding the spectrum code.

More specifically, for example, the band spreading unit 25 obtains a set of K band spreading gains by decoding a band spreading gain code, and arranges K samples obtained by multiplying K band spreading gains by K samples included in a sample string of the frequency domain obtained by decoding a spectrum code by the decoding unit 22 on the higher side of the sample string of the frequency domain obtained by decoding the spectrum code (decoded adjusted spectrum sequence) by the decoding unit 22, thereby obtaining a decoded spread spectrum sequence.

Further, the process by which the band expansion unit 25 decodes the band expansion gain code to obtain a set of K band expansion gains may be performed as follows. The band expansion unit 25 stores a plurality of codes, a fricative gain candidate vector corresponding to each code, and a non-fricative gain candidate vector corresponding to each code, where each fricative gain candidate vector and each non-fricative gain candidate vector includes K gain candidate values. When the input information indicating whether or not the sound is fricative indicates a fricative sound, the K gain candidate values included in the fricative gain candidate vector that corresponds to the code identical to the band expansion gain code, among the plurality of fricative gain candidate vectors, are used as the set of K band expansion gains. In other cases, the K gain candidate values included in the non-fricative gain candidate vector that corresponds to the code identical to the band expansion gain code, among the plurality of non-fricative gain candidate vectors, are used as the set of K band expansion gains.

[ fricative adjustment canceling unit 23]

The fricative determination information output from the demultiplexing unit 21 and the decoded spread spectrum sequence ~Y_0, …, ~Y_{N-1} output from the band expansion unit 25 are input to the fricative adjustment canceling unit 23. In frame units, when the input fricative determination information indicates a fricative sound, the fricative adjustment canceling unit 23 applies the adjustment canceling process to the input decoded spread spectrum sequence ~Y_0, …, ~Y_{N-1} to obtain a decoded spectrum sequence ^X_0, …, ^X_{N-1}, and outputs the obtained decoded spectrum sequence ^X_0, …, ^X_{N-1} to the time domain conversion unit 24. When the fricative determination information indicates a non-fricative sound, the fricative adjustment canceling unit 23 outputs the decoded spread spectrum sequence ~Y_0, …, ~Y_{N-1} as it is, as the decoded spectrum sequence ^X_0, …, ^X_{N-1}, to the time domain conversion unit 24 (step S23).

The adjustment canceling process performed by the fricative adjustment canceling unit 23 applies to the decoded spread spectrum sequence ~Y_0, …, ~Y_{N-1} the same process that the fricative adjustment canceling unit 23 of the decoding device according to the first embodiment applies to the decoded adjusted spectrum sequence ^Y_0, …, ^Y_{N-1}. That is, let M be an integer greater than 1 and less than N. For example, let the sample group ~Y_0, …, ~Y_{M-1} with sample numbers less than M in the decoded spread spectrum sequence ~Y_0, …, ~Y_{N-1} be the low-side decoded spread spectrum sequence, and let the sample group ~Y_M, …, ~Y_{N-1} with sample numbers M or greater be the high-side decoded spread spectrum sequence. The adjustment canceling process performed by the fricative adjustment canceling unit 23 then obtains, as the decoded spectrum sequence ^X_0, …, ^X_{N-1}, the result of exchanging all or a part of the low-side decoded spread spectrum sequence ~Y_0, …, ~Y_{M-1} with the same number of samples forming all or a part of the high-side decoded spread spectrum sequence ~Y_M, …, ~Y_{N-1}.

In other words, when the information indicating whether or not the input sound is fricative indicates that the sound is fricative, the fricative adjustment canceling unit 23 may obtain, as a spectrum sequence of a decoded sound signal (decoded spectrum sequence), a result of exchanging all or a part of a low-range-side frequency sample sequence located on a low range side with respect to a predetermined frequency in the decoded spread spectrum sequence obtained by the band spreading unit 25 with all or a part of a high-range-side frequency sample sequence located on a high range side with respect to the predetermined frequency in the decoded spread spectrum sequence obtained by the band spreading unit 25 in the same number as the low-range-side frequency sample sequence, and may obtain, in a case other than the above, a spectrum sequence of a decoded sound signal (decoded spectrum sequence) as it is from the band spreading unit 25.
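One simple instance of such an exchange is an in-place swap of a contiguous low-side block with a contiguous high-side block of the same length. The sketch below is an assumption for illustration only: the function name and the parameters `low_start`, `high_start`, and `count` are hypothetical, and the layouts shown in the figures (for example Fig. 8) may place the exchanged samples differently.

```python
from typing import List, Sequence


def cancel_fricative_adjustment(
    spread: Sequence[float],
    is_fricative: bool,
    low_start: int,
    high_start: int,
    count: int,
) -> List[float]:
    """Return ^X from ~Y; swap two equal-length blocks only for fricative frames.

    Assumption: the low block lies entirely below the high block, so the
    predetermined frequency M satisfies low_start + count <= M <= high_start.
    """
    out = list(spread)
    if not is_fricative:
        return out  # the decoded spread spectrum is used as it is
    if not (
        count >= 0
        and 0 <= low_start
        and low_start + count <= high_start
        and high_start + count <= len(out)
    ):
        raise ValueError("blocks must be in range and must not overlap")
    low_block = out[low_start:low_start + count]
    high_block = out[high_start:high_start + count]
    out[low_start:low_start + count] = high_block
    out[high_start:high_start + count] = low_block
    return out
```

For a non-fricative frame the function returns a copy of its input, which mirrors the pass-through behaviour described above.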

Further, as shown by the chain line in fig. 11, it can be said that if the band extending unit 25 and the fricative adjustment canceling unit 23 are the fricative corresponding band extending unit 27, the fricative corresponding band extending unit 27 performs band extension on the frequency domain spectrum sequence (decoded adjusted spectrum sequence) obtained by the decoding unit 22 to the lower domain side to obtain a spectrum sequence (decoded spectrum sequence) of a decoded sound signal when the information indicating whether or not the inputted fricative sound indicates that the.

[ time domain converting section 24]

The time domain conversion unit 24 converts the decoded spectrum sequence ^X_0, …, ^X_{N-1} into a time-domain signal for each frame, using a time-domain conversion method that corresponds to the frequency-domain conversion method used by the frequency domain conversion unit 11 of the encoding device, and obtains and outputs a sound signal in frame units (decoded sound signal) (step S24).

Action and Effect

According to the encoding device and the decoding device of the second embodiment, as with those of the first embodiment, the fricative adjustment process and the fricative adjustment canceling process cause bits to be allocated preferentially to the high range in time sections of fricative sounds and preferentially to the low range in other time sections, so that perceptual degradation can be reduced even for a sound signal that includes fricative sounds and the like.

According to the encoding device and the decoding device of the second embodiment, the band expansion gain is additionally used. In the time sections of fricative sounds, the band is extended by reproducing the low-range spectrum from a copy of the high-range spectrum, and in all other time sections the band is extended by reproducing the high-range spectrum from a copy of the low-range spectrum. As a result, auditory degradation can be reduced further than in the first embodiment, even for a sound signal that includes fricative sounds and the like. Here, the copy preserves the frequency order and is scaled by a band expansion gain based on the amplitude of the spectrum, so the original spectral contour is reproduced as closely as possible, which improves the auditory quality.
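The frequency-order-preserving copy with a band expansion gain can be sketched as follows for the high-side direction. This is a minimal illustration under stated assumptions, not the processing of the band extending unit 25. The function name `extend_band` is hypothetical, the K source samples are assumed to be the K highest decoded samples, the copy is assumed to repeat cyclically until the target length is reached, and the gain is assumed to arrive as a single scalar whose derivation is not shown.

```python
from typing import List, Sequence


def extend_band(
    decoded: Sequence[float],
    k: int,
    gain: float,
    total_len: int,
) -> List[float]:
    """Append gain-scaled copies of K decoded samples above the decoded band.

    Assumptions for this sketch: the source is the last `k` samples of
    `decoded`, and the copy keeps their frequency order, repeating
    cyclically until `total_len` samples exist.
    """
    if k < 2 or k > len(decoded):
        raise ValueError("k must satisfy 2 <= k <= len(decoded)")
    if total_len < len(decoded):
        raise ValueError("total_len must not be shorter than decoded")
    source = list(decoded[-k:])
    extended = list(decoded)
    for i in range(total_len - len(decoded)):
        # i % k walks through the source in ascending frequency order.
        extended.append(gain * source[i % k])
    return extended
```

For example, `extend_band([1.0, 2.0, 3.0, 4.0], 2, 0.5, 8)` keeps the four decoded samples and appends two scaled copies of the samples 3.0 and 4.0 in their original order.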

Further, when the fricative determination unit 12 of the modification of the first embodiment is used as the fricative determination unit 12 of the encoding device of the second embodiment, the determination result switches less frequently than when the fricative determination unit 12 of the first embodiment is used. This reduces how often discontinuities occur in the waveform of the decoded sound and suppresses the deterioration of auditory quality caused by the perceived discontinuity.

[ Program and recording medium ]

Each of the encoding device, the decoding device, and the fricative determination device may be realized by a computer. In this case, the processing contents of the functions to be provided by each of the encoding device, the decoding device, and the fricative determination device are described by a program. Then, by executing the program on a computer, each of the encoding device, the decoding device, and the fricative determination device is realized on the computer.

The program in which the processing contents are described may be recorded on a computer-readable recording medium. The computer-readable recording medium may be any such medium, for example a magnetic recording device, an optical disk, a magneto-optical recording medium, or a semiconductor memory.

The processing of each unit may be implemented by executing a predetermined program on a computer, or at least a part of the processing may be implemented in hardware.

It is to be understood that appropriate modifications can be made without departing from the scope of the present invention.
