Air conditioner indoor unit abnormal sound detection method based on sound classification model

Document No.: 1939911 Publication date: 2021-12-07

Reading note: this technology, "Air conditioner indoor unit abnormal sound detection method based on sound classification model", was designed by 袁东风 and 高东 on 2021-09-07. Main content: The invention relates to a method for detecting abnormal sounds of an air conditioner indoor unit based on a sound classification model, which comprises three parts: data preprocessing, Mel joint feature extraction and a classification network. The specific process is as follows: collecting sound data of the air conditioner to be detected and slicing it; extracting the Mel frequency spectrum features and Mel cepstral coefficients of each segment and combining them into a Mel joint feature; inputting the Mel joint feature into a trained classification network for classification, wherein each segment corresponds to one classification result; and visualizing the result sequence while giving an overall judgment of the air conditioner quality. The method can detect abnormal air conditioner sounds quickly and accurately and automates the quality inspection link, thereby improving production efficiency and reducing production cost.

1. An abnormal sound detection method of an air conditioner indoor unit based on a sound classification model is characterized by comprising the following steps:

(1) collecting and recording running sound signals of an air conditioner indoor unit;

(2) intercepting abnormal parts in the sound signals obtained in the step (1), slicing the intercepted abnormal parts, and labeling each segment according to the abnormal type;

(3) intercepting a normal part in the sound signal obtained in the step (1), slicing the intercepted normal part, and marking each segment as a normal sound;

(4) performing fast Fourier transform and squaring on all the fragments in the step (2) and the step (3) to obtain an energy spectrum, obtaining a Mel frequency spectrum through Mel filtering, and extracting Mel frequency spectrum characteristics based on the Mel frequency spectrum;

(5) carrying out logarithmic compression on the amplitude of the Mel frequency spectrum, and then carrying out inverse fast Fourier transform or discrete cosine transform on the amplitude to obtain Mel cepstral coefficients (MFCCs);

(6) forming Mel frequency spectrum characteristics and MFCCs into Mel joint characteristics, namely characteristic sets, and dividing the characteristic sets into training sets and testing sets;

(7) inputting the training set into a classification network to perform sound classification model training, and selecting a sound classification model with the best classification effect through test set testing;

(8) collecting and recording a new running sound signal of an air conditioner indoor unit to be detected, and slicing the running sound signal;

(9) sequentially carrying out the operations of the step (4), the step (5) and the step (6) on each fragment obtained in the step (8) to obtain a Mel joint characteristic;

(10) inputting the Mel combined features into the trained sound classification model in the step (7) for classification to obtain a classification result sequence of the whole piece of sound data;

(11) visualizing the classification result sequence and simultaneously giving an overall judgment result of the air conditioner quality;

(12) recording the serial number of any air conditioner judged to be unqualified in quality, and simultaneously giving a prompt signal.

2. The method for detecting the abnormal sound of the indoor unit of the air conditioner based on the sound classification model is characterized in that the classification network comprises a five-layer architecture, which is, in order, a sequence input layer, a BiLSTM network layer, a fully connected layer, a softmax layer and a classification output layer;

the sequence input layer is a sequence layer with 24 dimensions; the BiLSTM network layer has 100 neurons, that is, input data is mapped to a 100-dimensional feature space; the result of the BiLSTM network layer is input into the fully connected layer, whose number of neurons is equal to the number of classes, so that the fully connected layer maps the result of the BiLSTM network layer to a 2-dimensional or 3-dimensional classification space in which each dimension represents a class; the softmax layer applies an exponential mapping to the values, the weight of each class is regarded as its probability, and the class is decided according to the probability; the classification output layer is used for calculating the cross entropy loss of the classification.

3. The method as claimed in claim 1, wherein in the steps (1) and (8), when the operating sound signal of the air conditioner indoor unit is collected and recorded, the sampling rate is 48000 Hz, and a mono 32-bit storage format is adopted.

4. The method for detecting the abnormal sound of the air conditioner indoor unit based on the sound classification model as claimed in claim 1, wherein the slicing in the step (2), the step (3) and the step (8) specifically means: the sound signal is further divided into segments with the time length of 0.5 second by taking 0.75 as an overlapping rate;

in the step (2), selecting a segment with the abnormal sound ratio of not less than 0.5 as an abnormal sample, and marking, wherein the label of the grinding vibration sound is B, and the label of the outer membrane sound is C; in the step (3), N is used as a label of normal sound.

5. The method for detecting the abnormal sound of the air conditioner indoor unit based on the sound classification model as claimed in claim 1, wherein in the step (4), the short-time Fourier transform specifically comprises:

firstly, framing a signal, wherein the frame length is 512, and the overlapping rate is 0.5;

then, carrying out fast Fourier transform frame by frame to obtain a frequency spectrum and squaring it to obtain an energy spectrum; the FFT length is 512, and each frame is multiplied by a Hamming window before the fast Fourier transform, which is given by the formula: $w[n] = 0.54 - 0.46\cos\left(\frac{2\pi n}{N-1}\right)$, $0 \le n \le N-1$, where N is the window length, n is the time domain variable, and w[n] is the Hamming window amplitude.

6. The method for detecting the abnormal sound of the air conditioner indoor unit based on the sound classification model as claimed in claim 1, wherein in the step (4), a mel spectrum is obtained through mel filtering, specifically:

the mel filtering is to multiply the energy spectrum by a mel filter in the frequency domain to obtain a mel spectrum, and the specific calculation formula is as follows: $\mathrm{mel\_spectrum}(f) = \mathrm{power\_spectrum}(f) \times \mathrm{mel\_filter}(f)$, where mel_spectrum is the mel spectrum, power_spectrum is the energy spectrum, mel_filter is the mel filter, and f is the frequency variable;

the Mel filter comprises 40 triangular filters with an overlap ratio of 0.5, and the frequency range is 1500-;

normalizing the amplitude of the triangular filter based on the bandwidth, the bandwidth being determined by the center frequency of the adjacent triangular filter;

the center frequency of the triangular filter is obtained by the following method: equally dividing the frequency range into 40 frequency bands, wherein the central frequency of each frequency band is the Mel central frequency, mapping the Mel central frequency according to the Mel mapping formula, and obtaining the result which is the central frequency of the triangular filter;

the amplitude of the triangular filter is obtained by the following method: the lower cut-off frequency of the triangular filter is the center frequency of the previous triangular filter, and the upper cut-off frequency of the triangular filter is the center frequency of the next triangular filter, which determines the bandwidth of the triangular filter; the ratio of the reciprocal of the bandwidth of each triangular filter to the sum of the reciprocals of the bandwidths of all the triangular filters is calculated and taken as the amplitude of the triangular filter; the specific calculation formula is as follows: $\delta(i) = \dfrac{1/B(i)}{\sum_{j=1}^{J} 1/B(j)}$, where δ(i) is the amplitude of the i-th triangular filter, B(i) is the bandwidth of the i-th triangular filter, B(j) is the bandwidth of the j-th triangular filter, and J is the total number of triangular filters.

7. The method for detecting the abnormal sound of the indoor unit of the air conditioner based on the sound classification model as claimed in claim 1, wherein in the step (4), the mel-frequency spectrum characteristics comprise spectrum energy, spectrum centroid, spectrum entropy, spectrum peak value, spectrum attenuation, spectrum flux, spectrum kurtosis, spectrum attenuation point, spectrum skewness, spectrum inclination and spectrum distribution.

8. The method for detecting the abnormal sound of the air conditioner indoor unit based on the sound classification model as claimed in claim 1, wherein in the step (5), the discrete cosine transform formula is shown as formula (I):

in formula (I), m is the frequency domain variable, k is the transform domain variable, M is the number of time domain points, x[m] is the time domain amplitude, and X[k] is the transform domain amplitude.

9. The abnormal sound detection method for the indoor unit of the air conditioner based on the sound classification model as claimed in claim 1, wherein in the step (6), the MFCCs and mel frequency spectrum features are combined into 24-dimensional mel joint features;

in step (7), when the sound classification model with the best classification effect is selected, 5 times of repeated training and testing are carried out on the parameters of each group of classification networks.

10. The abnormal sound detection method for the indoor unit of the air conditioner based on the sound classification model as claimed in any one of claims 1 to 9, wherein in the step (10), the classification result sequence of the whole piece of sound data is composed of B, C and N;

in the step (11), when the classification result sequence is visualized, the result sequence is converted into a numerical sequence, where type B corresponds to a value of -1, type C corresponds to a value of +1, and type N corresponds to a value of 0.

Technical Field

The invention relates to a sound classification model-based abnormal sound detection method for an air conditioner indoor unit, and belongs to the field of sound signal processing, artificial intelligence application and air conditioner quality detection.

Background

In manufacturing, product quality control is an essential link. Fault pre-diagnosis of an air conditioner before delivery helps to reduce the product reject ratio and improve the manufacturer's reputation. In the big data era, quality inspection increasingly relies on artificial intelligence technology, and common approaches include appearance inspection and sound analysis. Appearance inspection relies on sophisticated computer vision techniques to detect leaks during assembly, thereby helping to perfect the manufacturing process. However, appearance inspection stays at the surface and cannot reach the interior of the product. Sound analysis can identify abnormal noise while the machine is running and thereby diagnose the internal quality of the product, so it makes up for the deficiency of appearance inspection.

In actual production, the last procedure before an air conditioner indoor unit leaves the factory is abnormal sound detection, and only qualified products are packaged and delivered. Products with abnormal sounds are reworked and further handled by technicians. Grinding vibration sound and outer membrane sound are two common abnormal sounds: the grinding vibration sound is produced by friction of a bearing or the cross-flow fan during operation, and the outer membrane sound is produced by the outer membrane of the air conditioner shaking under the action of wind. Both abnormal sounds are audibly different from the normal wind sound of an air conditioner. For this reason, a dedicated noise detection unit is set up in the workshop, where experienced workers judge the air conditioner quality by ear. The noise detection unit consists of two parts, a soundproof room and an operation panel, and the whole test is operated manually. The specific flow is:

1) The air conditioner indoor unit is conveyed into the soundproof room on the conveyor belt and confirmed to be in the proper position.

2) The soundproof door is closed and the air conditioner is powered on. The worker wears headphones and judges the sound by ear.

3) After the test is finished, the soundproof door is opened and the air conditioner is transported out. The next processing step is carried out according to the detection result.

4) The above steps are repeated to test the next air conditioner.

The original manual detection method cannot meet actual requirements. Firstly, as order quantities increase, manual detection is far slower than the production line, and indoor units pile up at the noise detection unit, which severely restricts capacity growth. Secondly, the manual method relies on the experience of individual workers; no technical standard has been formed, and the judgment results are not sufficiently objective or stable. Moreover, workshop background noise seriously interferes with the workers' judgment and further affects the objectivity and accuracy of the result. Finally, in the era of big data and 5G, and especially with the application of artificial neural networks, intelligent manufacturing and digital production are gradually becoming the goals of enterprise development, and traditional production modes and manual methods must be replaced by intelligent ones.

Currently, there is relatively little research on intelligent detection of abnormal air conditioner sounds, owing to the lack of available data sets. However, abnormal sound detection for air conditioners is based on a sound classification model, and sound classification is widely applied in many fields, such as scene classification, speaker recognition, underwater target recognition and sound event recognition. Sound classification depends on the differences between classes of sound. Different classes of air conditioner sound signals are difficult to distinguish visually in the time domain but differ clearly in the frequency domain, which makes air conditioner sound classification feasible. These frequency domain differences lie mostly in the mid-low frequency band, and the mel spectrum highlights the mid-low frequency band and masks the high frequency band, so using the mel spectrum can improve the classification effect. Extracting useful features on the basis of the mel spectrum further reduces redundancy and noise interference and reduces the data dimension, giving higher efficiency and accuracy. The mel cepstral coefficients (MFCCs), obtained by applying a cepstrum transform to the mel spectrum, are widely used as audio features, but their successful applications are mainly in speech recognition and instrument recognition, so applying MFCCs in the field of intelligent manufacturing is significant. Furthermore, MFCCs contain only spectral envelope information, so additional features are needed to describe the spectral characteristics more fully.

In the big data age, the artificial neural network has gradually become a main means of feature analysis by virtue of its powerful analysis capability, and neural networks are now widely applied to sound classification and recognition tasks. The convolutional neural network (CNN) has a strong image analysis capability, and sound classification can be realized by feeding a sound feature sequence into a CNN in image form; however, the image form wastes the temporal dependency information of the sound signal, which has the potential to improve classification accuracy. Furthermore, the resolution of the image may also affect classification accuracy. A recurrent neural network (RNN) is suitable for analyzing time series data because it can make effective use of the temporal dependency information of a sound signal, but an RNN is prone to vanishing and exploding gradients when processing long sequences. The long short-term memory (LSTM) network is an important variant of the RNN whose internal neurons alleviate the gradient problem by reducing the memory burden. Although an LSTM can analyze longer time series, it analyzes data in only one direction. The bidirectional long short-term memory (BiLSTM) network can analyze a sequence in both directions by virtue of its bidirectional memory, can find the symmetry between the onset and the termination of an abnormal sound, and uses this symmetry to improve recognition efficiency, so the BiLSTM is suitable for the air conditioner sound classification task.

Disclosure of Invention

Aiming at the defects of the prior art, the invention provides an abnormal sound detection method of an air conditioner indoor unit based on a sound classification model.

Firstly, the air conditioner sound signal is sliced; then the Mel frequency spectrum features and MFCCs of each sound segment are extracted, and the spectrum features and cepstral coefficients are combined into a joint feature; next, the joint features are classified by a classification network; finally, the classification results of all segments of the whole sound signal are visualized as a curve, and a judgment of whether the air conditioner is qualified is given. The classification network needs to be trained and tested in advance using the feature set.

The invention also provides computer equipment and a storage medium.

Interpretation of terms:

1. Fast Fourier transform (FFT), i.e., MATLAB function fft.

2. Mel frequency spectrum: Stevens, Volkmann and Newman proposed the mel scale in 1937, on which equal pitch distances correspond to equal perceived differences, and the mel spectrum was developed on this principle. Mathematically, the mel spectrum corresponds to a logarithmic compression of the Fourier spectrum along the frequency axis, which emphasizes the mid-low frequency components and compresses the high frequency components. In practice, the mel spectrum is obtained by short-time Fourier transform and mel filtering.

3. Spectral energy, i.e., MATLAB function spectral energy.

4. Spectral centroid, i.e. MATLAB function spectral centroid.

5. Spectral entropy, i.e. MATLAB function spectral entropy.

6. Spectral peaks, i.e. MATLAB function spectral crest.

7. Spectral decay, i.e., MATLAB function spectral decay.

8. Spectral flux, i.e., MATLAB function spectral flux.

9. Spectral kurtosis, i.e., MATLAB function spectral kurtosis.

10. The point of spectral decay, namely the MATLAB function spectral roll-off point.

11. Spectral skewness, i.e., MATLAB function spectral skewness.

12. Spectral slope, i.e., MATLAB function spectral slope.

13. Spectral distribution, i.e. MATLAB function spectral spread.

14. Logarithmic compression, which means performing Mel mapping; the formula is: $\mathrm{Mel}(f) = 2595 \times \log_{10}(1 + f/700)$, where Mel(f) is the Mel frequency after compression, and f is the Fourier frequency before compression.

15. Inverse fast Fourier transform (IFFT), i.e., MATLAB function ifft.

16. Classification network: the classification network is a complete network model with a five-layer architecture, in order: a sequence input layer, a BiLSTM network layer, a fully connected layer, a softmax layer and a classification output layer.

17. Mel-frequency cepstral coefficients (MFCCs): proposed by Davis and Mermelstein in 1980. In acoustics, MFCCs are used to characterize formants, i.e., the envelope of the spectrum. Mel cepstral coefficients are widely used as audio features and have been successfully applied in speech recognition and instrument recognition.
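
The Mel mapping given under term 14 can be illustrated with a short Python sketch. The function names are illustrative, and the inverse mapping `mel_to_hz` is not stated in the text; it is derived algebraically here as an assumption about how Mel frequencies would be mapped back to Hz.

```python
import math


def hz_to_mel(f_hz: float) -> float:
    """Mel mapping from term 14: Mel(f) = 2595 * log10(1 + f / 700)."""
    return 2595.0 * math.log10(1.0 + f_hz / 700.0)


def mel_to_hz(mel: float) -> float:
    """Algebraic inverse of hz_to_mel (derived here, not given in the text)."""
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)
```

Because the mapping is logarithmic, equal steps on the Mel axis correspond to progressively wider steps in Hz, which is why the mid-low band is emphasized.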

The technical scheme of the invention is as follows:

an abnormal sound detection method for an air conditioner indoor unit based on a sound classification model comprises the following steps:

(1) collecting and recording running sound signals of an air conditioner indoor unit;

(2) intercepting abnormal parts in the sound signals obtained in the step (1), slicing the intercepted abnormal parts, and labeling each segment according to the abnormal type;

(3) intercepting a normal part in the sound signal obtained in the step (1), slicing the intercepted normal part, and marking each segment as a normal sound;

(4) performing fast Fourier transform on all the fragments in the step (2) and the step (3) to obtain an energy spectrum, obtaining a Mel frequency spectrum through Mel filtering, and extracting Mel frequency spectrum characteristics based on the Mel frequency spectrum;

(5) carrying out logarithmic compression on the amplitude of the Mel frequency spectrum, and then carrying out inverse fast Fourier transform or discrete cosine transform on the amplitude to obtain MFCCs;

(6) forming Mel frequency spectrum characteristics and MFCCs into Mel joint characteristics, namely characteristic sets, and dividing the characteristic sets into training sets and testing sets;

(7) inputting the training set into a classification network to perform sound classification model training, and selecting a sound classification model with the best classification effect through test set testing;

(8) collecting and recording a new running sound signal of an air conditioner indoor unit to be detected, and slicing the running sound signal;

(9) sequentially carrying out the operations of the step (4), the step (5) and the step (6) on each fragment obtained in the step (8) to obtain a Mel joint characteristic;

(10) inputting the Mel combined features into the trained sound classification model in the step (7) for classification to obtain a classification result sequence of the whole piece of sound data;

(11) visualizing the classification result sequence and simultaneously giving an overall judgment result of the air conditioner quality;

(12) recording the serial number of any air conditioner judged to be unqualified in quality, and simultaneously giving a prompt signal.

Preferably, in the step (1) and the step (8), when the operating sound signal of the air conditioner indoor unit is collected and recorded, the sampling rate is 48000 hertz, and a single-channel 32-bit storage format is adopted.

Preferably, the slicing in step (2), step (3) and step (8) specifically means: the sound signal is further sliced into 0.5 second-duration segments with an overlap rate of 0.75.

Preferably, in the step (2), a segment with the abnormal sound ratio of not less than 0.5 is selected as an abnormal sample, and when labeling is carried out, the label of the grinding vibration sound is B, and the label of the outer membrane sound is C; in the step (3), N is used as a label of normal sound.
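
The slicing and labeling rules above can be sketched in Python as follows. This is a minimal illustration, not the patented implementation: the function names, the per-sample `abnormal_mask` annotation, and the choice to drop a trailing partial segment are all assumptions that the text does not specify.

```python
def slice_signal(samples, sample_rate=48000, seg_seconds=0.5, overlap=0.75):
    """Cut a signal into fixed-length segments with the given overlap rate."""
    seg_len = int(round(seg_seconds * sample_rate))   # 24000 samples at 48 kHz
    hop = int(round(seg_len * (1.0 - overlap)))       # 6000 samples at 0.75 overlap
    # Assumption: a trailing partial segment is discarded.
    return [samples[start:start + seg_len]
            for start in range(0, len(samples) - seg_len + 1, hop)]


def label_segment(abnormal_mask, abnormal_label, min_ratio=0.5):
    """Return the abnormal label ('B' or 'C') when the share of abnormal
    samples in the segment is at least min_ratio; otherwise return None,
    meaning the segment is not selected as an abnormal sample.

    abnormal_mask is a hypothetical per-sample 0/1 annotation.
    """
    ratio = sum(abnormal_mask) / len(abnormal_mask)
    return abnormal_label if ratio >= min_ratio else None
```

With these parameters, a one-second recording yields five segments, starting at 0, 6000, 12000, 18000 and 24000 samples.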

Preferably, in step (4), the short-time Fourier transform specifically includes:

firstly, framing a signal, wherein the frame length is 512, and the overlapping rate is 0.5;

then, carrying out fast Fourier transform frame by frame to obtain a frequency spectrum, and squaring the frequency spectrum to obtain an energy spectrum; the FFT length is 512, and each frame needs to be multiplied by a Hamming window before the fast Fourier transform, and the formula is: $w[n] = 0.54 - 0.46\cos\left(\frac{2\pi n}{N-1}\right)$, $0 \le n \le N-1$, where N is the window length, n is the time domain variable, and w[n] is the Hamming window amplitude.
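
The framing, windowing and squaring steps can be sketched in Python with the standard library only. The sketch assumes the standard symmetric Hamming window, keeps only the one-sided spectrum (bins 0 to 256), and uses a small recursive radix-2 FFT written for illustration; none of these details are confirmed by the text, and all function names are illustrative.

```python
import cmath
import math


def hamming(n_points):
    """Symmetric Hamming window (assumed form): 0.54 - 0.46*cos(2*pi*n/(N-1))."""
    return [0.54 - 0.46 * math.cos(2.0 * math.pi * n / (n_points - 1))
            for n in range(n_points)]


def fft(x):
    """Recursive radix-2 FFT; len(x) must be a power of two."""
    n = len(x)
    if n == 1:
        return [complex(x[0])]
    even, odd = fft(x[0::2]), fft(x[1::2])
    out = [0j] * n
    for k in range(n // 2):
        twiddle = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + twiddle
        out[k + n // 2] = even[k] - twiddle
    return out


def power_spectrogram(segment, frame_len=512, overlap=0.5):
    """Frame the segment, window each frame, and return squared FFT magnitudes."""
    hop = int(frame_len * (1.0 - overlap))            # 256 samples
    window = hamming(frame_len)
    frames = []
    for start in range(0, len(segment) - frame_len + 1, hop):
        frame = [segment[start + i] * window[i] for i in range(frame_len)]
        spectrum = fft(frame)
        # Assumption: only the non-redundant half of the spectrum is kept.
        frames.append([abs(c) ** 2 for c in spectrum[:frame_len // 2 + 1]])
    return frames
```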

Preferably, in step (4), the mel spectrum is obtained by mel filtering, specifically:

the mel filtering is to multiply the energy spectrum by a mel filter in the frequency domain to obtain a mel spectrum, and the specific calculation formula is as follows: $\mathrm{mel\_spectrum}(f) = \mathrm{power\_spectrum}(f) \times \mathrm{mel\_filter}(f)$, where mel_spectrum is the mel spectrum, power_spectrum is the energy spectrum, mel_filter is the mel filter, and f is the frequency variable;

the Mel filter comprises 40 triangular filters with an overlap ratio of 0.5, and the frequency range is 1500-;

normalizing the amplitude of the triangular filter based on the bandwidth, the bandwidth being determined by the center frequency of the adjacent triangular filter;

the center frequency of the triangular filter is obtained by the following method: equally dividing the frequency range into 40 frequency bands, wherein the central frequency of each frequency band is the Mel central frequency, mapping the Mel central frequency according to the Mel mapping formula, and obtaining the result which is the central frequency of the triangular filter;

the amplitude of the triangular filter is obtained by the following method: the lower cut-off frequency of the triangular filter is the center frequency of the previous triangular filter, and the upper cut-off frequency of the triangular filter is the center frequency of the next triangular filter, which determines the bandwidth of the triangular filter; the ratio of the reciprocal of the bandwidth of each triangular filter to the sum of the reciprocals of the bandwidths of all the triangular filters is calculated and taken as the amplitude of the triangular filter; the specific calculation formula is as follows: $\delta(i) = \dfrac{1/B(i)}{\sum_{j=1}^{J} 1/B(j)}$, where δ(i) is the amplitude of the i-th triangular filter, B(i) is the bandwidth of the i-th triangular filter, B(j) is the bandwidth of the j-th triangular filter, and J is the total number of triangular filters.
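
The center-frequency and amplitude rules for the triangular filters can be sketched as below. Several points are assumptions: the range is read as being divided equally on the Mel axis and the band centers mapped back to Hz; the range edges stand in for the missing neighbors of the first and last filters; and `F_HIGH` is a hypothetical placeholder, because the upper bound of the frequency range is not legible in the text.

```python
import math


def hz_to_mel(f_hz):
    return 2595.0 * math.log10(1.0 + f_hz / 700.0)


def mel_to_hz(mel):
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


def center_frequencies(f_low, f_high, n_filters=40):
    """Split the range into equal bands on the Mel axis (assumed reading)
    and map each band center back to Hz."""
    mel_low, mel_high = hz_to_mel(f_low), hz_to_mel(f_high)
    band = (mel_high - mel_low) / n_filters
    return [mel_to_hz(mel_low + (i + 0.5) * band) for i in range(n_filters)]


def filter_amplitudes(centers, f_low, f_high):
    """delta(i) = (1/B(i)) / sum_j (1/B(j)), where B(i) is the distance
    between the center frequencies of the previous and next filters."""
    # Assumption: the range edges replace the neighbors that the first
    # and last filters do not have.
    edges = [f_low] + list(centers) + [f_high]
    bandwidths = [edges[i + 2] - edges[i] for i in range(len(centers))]
    inverse_sum = sum(1.0 / b for b in bandwidths)
    return [(1.0 / b) / inverse_sum for b in bandwidths]


F_LOW = 1500.0    # lower bound stated in the text
F_HIGH = 12000.0  # hypothetical placeholder, not taken from the text
centers = center_frequencies(F_LOW, F_HIGH)
amplitudes = filter_amplitudes(centers, F_LOW, F_HIGH)
```

By construction the amplitudes sum to 1, and narrower (lower-frequency) filters receive larger amplitudes than wider ones.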

Preferably, in step (4), the mel-frequency spectrum characteristics include spectrum energy, spectrum centroid, spectrum entropy, spectrum peak, spectrum attenuation, spectrum flux, spectrum kurtosis, spectrum attenuation point, spectrum skewness, spectrum slope and spectrum distribution;
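
Two of the listed features, the spectral centroid and the spectral spread (called spectrum distribution above), can be illustrated with their common textbook definitions. Whether these match the MATLAB functions named in the interpretation of terms exactly is an assumption, and the function names here are illustrative.

```python
import math


def spectral_centroid(freqs, spectrum):
    """Amplitude-weighted mean frequency: sum(f_k * s_k) / sum(s_k)."""
    total = sum(spectrum)
    return sum(f * s for f, s in zip(freqs, spectrum)) / total


def spectral_spread(freqs, spectrum):
    """Amplitude-weighted standard deviation of frequency around the centroid."""
    total = sum(spectrum)
    mu = spectral_centroid(freqs, spectrum)
    variance = sum((f - mu) ** 2 * s for f, s in zip(freqs, spectrum)) / total
    return math.sqrt(variance)
```

For example, a spectrum of `[1, 2, 1]` at frequencies `[100, 200, 300]` Hz has its centroid at 200 Hz.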

Preferably, in step (5), the discrete cosine transform (DCT) formula is shown in formula (I):

in formula (I), m is the frequency domain variable, k is the transform domain variable, M is the number of time domain points, x[m] is the time domain amplitude, and X[k] is the transform domain amplitude.
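
As an illustration of step (5), the sketch below applies a logarithm to Mel band energies and then an unnormalized DCT-II. The exact form and normalization of formula (I) are not recoverable from the text, so the DCT-II used here is an assumption. The default of 13 coefficients is also hypothetical: it equals 24 minus the 11 listed spectrum features, but the text does not state how the 24 dimensions are split. The small `floor` that guards against log(0) is likewise an added assumption.

```python
import math


def dct_ii(x):
    """Unnormalized DCT-II: X[k] = sum_m x[m] * cos(pi * k * (2m + 1) / (2M))."""
    big_m = len(x)
    return [sum(x[m] * math.cos(math.pi * k * (2 * m + 1) / (2 * big_m))
                for m in range(big_m))
            for k in range(big_m)]


def mfcc_from_mel(mel_energies, n_coeffs=13, floor=1e-10):
    """Log-compress the Mel band energies, then keep the first DCT coefficients."""
    log_mel = [math.log10(max(e, floor)) for e in mel_energies]
    return dct_ii(log_mel)[:n_coeffs]
```

A flat Mel spectrum produces energy only in the zeroth coefficient, which reflects that the higher coefficients describe the shape of the spectral envelope.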

Preferably, according to the invention, in step (6), the MFCCs and mel-frequency spectral features are combined into a 24-dimensional joint feature.

Preferably, according to the invention, the classification network comprises a five-layer architecture, which is, in order, a sequence input layer, a BiLSTM network layer, a fully connected layer, a softmax layer and a classification output layer;

the sequence input layer is a sequence layer with 24 dimensions; the BiLSTM network layer has 100 neurons, that is, input data is mapped to a 100-dimensional feature space; the result of the BiLSTM network layer is input into the fully connected layer, whose number of neurons is equal to the number of classes, so that the fully connected layer maps the result of the BiLSTM network layer to a 2-dimensional or 3-dimensional classification space in which each dimension represents a class; the softmax layer applies an exponential mapping to the values, the weight of each class is regarded as its probability, and the class is decided according to the probability; the classification output layer is used for calculating the cross entropy loss of the classification.
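
The last three layers can be illustrated in plain Python. The BiLSTM and fully connected layers themselves are not reproduced here; `scores` stands for the output of the fully connected layer, and the class order (B, C, N) and function names are assumptions made for the sketch.

```python
import math


def softmax(scores):
    """Exponential mapping of the scores, normalized so the outputs sum to 1."""
    peak = max(scores)                      # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def classify(scores, labels=("B", "C", "N")):
    """Return the most probable label together with the class probabilities."""
    probs = softmax(scores)
    best = max(range(len(probs)), key=probs.__getitem__)
    return labels[best], probs


def cross_entropy(probs, true_index):
    """Cross entropy loss for a single sample with a one-hot target."""
    return -math.log(probs[true_index])
```

During training the cross entropy loss is what the network minimizes; at detection time only the label returned by `classify` is needed for each segment.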

Preferably, in step (7), when the sound classification model with the best classification effect is selected, 5 times of repeated training and testing are performed on the parameters of each group of classification networks.

Preferably, in step (10), the classification result sequence of the whole piece of sound data is composed of B, C and N.

Preferably, in step (11), when the classification result sequence is visualized, the result sequence is converted into a numerical sequence, where type B corresponds to a value of -1, type C corresponds to a value of +1, and type N corresponds to a value of 0.
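
The conversion of the result sequence into numbers can be sketched as follows. The mapping values come from the text; `abnormal_fraction` is a hypothetical summary statistic added for illustration only, since the rule for the overall quality judgment is not given here.

```python
VALUE_OF = {"B": -1, "C": 1, "N": 0}   # mapping stated in the text


def to_numeric(result_sequence):
    """Convert a sequence of B/C/N labels into the numerical sequence to plot."""
    return [VALUE_OF[label] for label in result_sequence]


def abnormal_fraction(result_sequence):
    """Share of segments not classified as normal (hypothetical summary)."""
    abnormal = sum(1 for label in result_sequence if label != "N")
    return abnormal / len(result_sequence)
```

Plotting the numerical sequence against segment index gives a curve that sits at 0 for normal sound and steps to -1 or +1 where grinding vibration sound or outer membrane sound is detected.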

A computer device comprises a memory and a processor, wherein the memory stores a computer program, and the processor realizes the steps of the abnormal sound detection method of the air conditioner indoor unit based on the sound classification model when executing the computer program.

A computer-readable storage medium having stored thereon a computer program which, when executed by a processor, implements the steps of a sound classification model-based abnormal sound detection method for an air conditioner indoor unit.

The invention has the beneficial effects that:

1. the abnormal sound detection method of the air conditioner indoor unit based on the sound classification model provided by the invention is characterized in that the air conditioner sound to be detected is sliced and then input into the detection model, then the Mel frequency spectrum characteristics and the MFCCs of the sound fragments are extracted and input into a classification network taking a BiLSTM network as a core as joint characteristics for classification, and a classification detection result of the whole piece of data is obtained, so that whether the air conditioner is qualified or not is judged according to the abnormal degree. The method provided by the invention can quickly and accurately detect the abnormal sound of the air conditioner, and realizes the automation and intellectualization of the quality inspection link, thereby improving the production efficiency and reducing the production cost.

2. The method for detecting the abnormal sound of the air conditioner indoor unit based on the sound classification model takes into account the responsiveness and tolerance of the human ear to abnormal sounds of the indoor unit and performs anomaly detection on short time segments, which can improve the user experience of air conditioner products and the reputation of the manufacturer. Performing anomaly analysis on short sound segments allows the moment of an anomaly to be located fairly precisely, and visualizing the anomalies of the whole recording segment by segment makes it convenient for maintenance staff to study, judge and repair the faults of unqualified air conditioners.

3. According to the method for detecting the abnormal sound of the air conditioner indoor unit based on the sound classification model, provided by the invention, the characteristic extraction is carried out on the basis of the Mel frequency spectrum, and the difference of different types of sound signals is more balanced through the nonlinear mapping function (namely, the middle-low frequency band is highlighted and the high frequency band is masked) of the Mel frequency spectrum, so that a better classification effect is obtained, and the accuracy of abnormal detection is improved.

4. The method for detecting the abnormal sound of the air conditioner indoor unit based on the sound classification model provided by the invention has the advantages that the feature extraction is carried out on the basis of the Mel frequency spectrum, the redundancy and the interference are reduced, the data dimension reduction is realized, the calculation force requirement is reduced, and the detection efficiency is improved.

5. The method for detecting abnormal sound of an air conditioner indoor unit based on a sound classification model uses a BiLSTM network as the core of the classification network. It makes full use of the temporal dependence information of sound signals, can analyze the sequence in both directions, and can find the symmetry between the onset and the termination of an abnormal sound, using this symmetry to improve recognition efficiency.

Drawings

FIG. 1 is an example time-domain diagram of the original sound signal obtained in step (1) and step (8) of the present invention;

FIG. 2 is a diagram of the general (Fourier) spectrum and the Mel frequency spectrum of the sound signal obtained by transformation in step (4) of the present invention;

FIG. 3 is a schematic diagram of the Mel filter used in the Mel filtering in step (4) of the present invention;

FIG. 4 is a schematic diagram of a classification network employed in the present invention;

FIG. 5 is a diagram illustrating a sequence visualization example of classification results of the whole sound data;

FIG. 6 is a schematic flow chart of the method for detecting abnormal sound of an air conditioner indoor unit based on a sound classification model according to the present invention.

Detailed Description

The invention is further described below with reference to the following examples and the accompanying drawings, but is not limited thereto.

Example 1

An abnormal sound detection method for an air conditioner indoor unit based on a sound classification model. As shown in FIG. 6, the detection method comprises the following steps:

(1) collecting and recording running sound signals of an air conditioner indoor unit in a sound insulation chamber;

(2) intercepting abnormal parts in the sound signals obtained in the step (1), slicing the intercepted abnormal parts, and labeling each segment according to the abnormal type;

(3) intercepting a normal part in the sound signal obtained in the step (1), slicing the intercepted normal part, and marking each segment as a normal sound;

(4) performing Fast Fourier Transform (FFT) on all the fragments in the step (2) and the step (3), squaring to obtain an energy spectrum, obtaining a Mel frequency spectrum through Mel filtering, and extracting Mel frequency spectrum characteristics based on the Mel frequency spectrum;

(5) carrying out logarithmic compression on the amplitude of the Mel frequency spectrum, and then carrying out Inverse Fast Fourier Transform (IFFT) or Discrete Cosine Transform (DCT) on the amplitude to obtain MFCCs;

(6) forming Mel frequency spectrum characteristics and MFCCs into Mel joint characteristics, namely characteristic sets, and dividing the characteristic sets into training sets and testing sets;

(7) inputting the training set into a classification network to perform sound classification model training, and selecting a sound classification model with the best classification effect through test set testing;

(8) collecting and recording a new running sound signal of an air conditioner indoor unit to be detected, and slicing the running sound signal;

(9) sequentially carrying out the operations of the step (4), the step (5) and the step (6) on each fragment obtained in the step (8) to obtain a Mel joint characteristic;

(10) inputting the Mel combined features into the trained sound classification model in the step (7) for classification to obtain a classification result sequence of the whole piece of sound data;

(11) visualizing the classification result sequence and simultaneously giving an overall judgment result of the air conditioner quality;

(12) recording the serial number of any air conditioner judged to be unqualified in quality and giving a prompt signal at the same time, so that follow-up processing can be carried out in time.

Example 2

The method for detecting abnormal sound of an air conditioner indoor unit based on a sound classification model described in Example 1 is further characterized in that:

In step (1) and step (8), the running sound signal of the air conditioner indoor unit is collected and recorded at a sampling rate of 48000 Hz in a single-channel 32-bit storage format. As configured, the air conditioner passes through a low wind speed stage and a high wind speed stage in sequence, each of fixed duration. Because the sound in the high wind speed mode is louder and higher in frequency and therefore masks abnormal sound, only the sound data of the low wind speed mode is used for quality detection. The air conditioner sound signals collected in step (1) and step (8) are difficult to distinguish in the time domain, as shown in FIG. 1. The invention therefore adopts the method of step (4) to classify the sound by its frequency-domain differences.

The slicing in the step (2), the step (3) and the step (8) specifically means: the sound signal is further sliced into 0.5 second-duration segments with an overlap rate of 0.75.

The 0.5-second analysis duration was chosen because a single cycle of an abnormal sound is short, and 0.5 seconds can cover the entire process from the onset to the termination of a single abnormal sound. If the duration is too short, a single abnormality cannot be fully contained and the result is easily affected by accidental factors; if it is too long, more redundant information is included, which interferes with abnormal sound identification. In addition, abnormal sound of the air conditioner ultimately affects the user, so the duration of the abnormal sound and the response of the human ear are considered together, and 0.5 seconds is taken as the most suitable analysis duration. Choosing an overlap rate of 0.75 avoids missing or splitting an abnormality at the edge of a segment, and also expands the amount of data and enriches the data set.
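The slicing described above can be sketched as follows. This is a minimal illustration only, assuming the signal is already available as a plain list of samples; the function name `slice_signal` and the choice to drop a trailing partial segment are assumptions, not details stated in the text.

```python
def slice_signal(samples, sample_rate=48000, segment_seconds=0.5, overlap=0.75):
    """Cut a sample sequence into fixed-length, overlapping segments."""
    seg_len = int(round(segment_seconds * sample_rate))  # 24000 samples per segment
    hop = int(round(seg_len * (1.0 - overlap)))          # 6000 samples between segment starts
    segments = []
    start = 0
    # ASSUMPTION: a trailing piece shorter than one segment is discarded.
    while start + seg_len <= len(samples):
        segments.append(samples[start:start + seg_len])
        start += hop
    return segments


signal = [0.0] * (2 * 48000)  # two seconds of placeholder silence
segments = slice_signal(signal)
print(len(segments), len(segments[0]))  # 13 24000
```

With an overlap rate of 0.75, each new segment starts 0.125 seconds after the previous one, so a two-second recording yields 13 segments instead of the 4 that non-overlapping slicing would give.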

In step (2), segments in which the proportion of abnormal sound is not less than 0.5 are selected as abnormal samples and labeled: the label of the grinding vibration sound is B, and the label of the outer membrane sound is C. In step (3), N is used as the label of normal sound. A data set as shown in Table 1 was obtained.

TABLE 1

In step (4), the short-time Fourier transform specifically includes:

firstly, framing a signal, wherein the frame length is 512, and the overlapping rate is 0.5;

then, Fast Fourier Transform (FFT) is carried out frame by frame to obtain a frequency spectrum, which is squared to obtain an energy spectrum. The FFT length is 512. Because framing causes spectral leakage, each frame is multiplied by a Hamming window before the FFT. The window formula is: $w[n] = 0.54 - 0.46\cos\left(\frac{2\pi n}{N-1}\right)$, $0 \le n \le N-1$, where N is the window length, n is the time-domain variable, and w[n] is the Hamming window amplitude.
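The framing and windowing step can be illustrated with a short sketch. It assumes the standard symmetric Hamming window with coefficients 0.54 and 0.46; the function names `hamming_window` and `frame_signal` are hypothetical, and the FFT itself is omitted.

```python
import math


def hamming_window(N):
    """Symmetric Hamming window of length N (standard coefficients, assumed)."""
    return [0.54 - 0.46 * math.cos(2.0 * math.pi * n / (N - 1)) for n in range(N)]


def frame_signal(samples, frame_len=512, overlap=0.5):
    """Split a segment into overlapping frames and apply the window to each."""
    hop = int(frame_len * (1.0 - overlap))  # 256 samples between frame starts
    window = hamming_window(frame_len)
    frames = []
    for start in range(0, len(samples) - frame_len + 1, hop):
        frame = samples[start:start + frame_len]
        frames.append([s * w for s, w in zip(frame, window)])
    return frames


window = hamming_window(512)
frames = frame_signal([1.0] * 24000)  # one 0.5 s segment of constant placeholder samples
print(round(window[0], 2), len(frames[0]))  # 0.08 512
```

The window tapers each frame toward 0.08 at both ends, which reduces the discontinuity at the frame boundaries that causes spectral leakage.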

Although the sound data is collected in the sound-proof room, the frequency band below 2000 Hz of the sound signal contains little useful information and is where the main workshop noise is concentrated. Therefore, the frequency band information below 1500 Hz is discarded in the experiment, and the frequency band information of 1500-.

The Fourier spectrum is transformed into the Mel spectrum in step (4) for the following reason: class C sound has components across the entire frequency domain, while class N and class B sounds have components only in the middle and low frequency bands, as shown in (a) of FIG. 2. This unbalanced distribution of the differences between the three sounds can limit classification accuracy. The Mel spectrum is equivalent to a Fourier spectrum that is logarithmically compressed along the frequency axis, so it emphasizes the middle and low frequency components and compresses the high frequency components. As can be seen from (b) of FIG. 2, the Mel spectrum increases the difference between classes B and N and decreases the difference between classes B and C, which makes the spectral differences among the three classes B, C and N more balanced and thereby gives a better classification effect.

In step (4), the Mel spectrum is obtained through Mel filtering, which specifically means that:

the Mel filtering multiplies the energy spectrum by the Mel filter in the frequency domain. In FIG. 3, the abscissa is frequency and the ordinate is amplitude. The Mel spectrum is obtained by the following formula: $\mathrm{melspectrum} = \mathrm{power\_spectrum}(f) \times \mathrm{melfilter}(f)$, where melspectrum is the Mel spectrum, power_spectrum is the energy spectrum, melfilter is the Mel filter, and f is the frequency variable;

the Mel filter comprises 40 triangular filters with an overlap ratio of 0.5, and the frequency range is 1500-;

normalizing the amplitude of the triangular filter based on the bandwidth, the bandwidth being determined by the center frequency of the adjacent triangular filter;

the center frequency of each triangular filter is determined as follows: the frequency range is equally divided into 40 frequency bands, the center frequency of each band is a Mel center frequency, and the Mel center frequencies are mapped according to the Mel mapping formula $\mathrm{Mel}(f) = 2595 \times \log_{10}(1 + f/700)$; the result is the center frequency of the triangular filter;

the amplitude of each triangular filter is obtained as follows: the lower cut-off frequency of a triangular filter is the center frequency of the previous triangular filter, and its upper cut-off frequency is the center frequency of the next triangular filter, which determines the bandwidth of the filter. The ratio of the reciprocal of each filter's bandwidth to the sum of the reciprocals of the bandwidths of all the triangular filters is taken as the amplitude of that filter. The specific calculation formula is: $\delta(i) = \dfrac{1/B(i)}{\sum_{j=1}^{J} 1/B(j)}$, where δ(i) is the amplitude of the i-th triangular filter, B(i) is the bandwidth of the i-th triangular filter, B(j) is the bandwidth of the j-th triangular filter, and J is the total number of triangular filters.
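The center-frequency and amplitude rules above can be sketched as follows. This is an illustration under stated assumptions, not the implementation used in the invention: the 40 equal bands are taken on the Mel scale and mapped back to Hz with the inverse of the Mel formula, the first and last filters use the range limits as their outer cut-offs, and the upper limit `F_MAX` is a hypothetical value (the Nyquist frequency at 48000 Hz) because the upper bound is not legible in the text.

```python
import math


def hz_to_mel(f):
    """Mel mapping formula from the text: Mel(f) = 2595 * log10(1 + f / 700)."""
    return 2595.0 * math.log10(1.0 + f / 700.0)


def mel_to_hz(m):
    """Inverse of the Mel mapping formula."""
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)


def filter_centers_hz(f_min, f_max, n_filters=40):
    """Centers of n_filters equal-width Mel bands, mapped back to Hz."""
    mel_lo, mel_hi = hz_to_mel(f_min), hz_to_mel(f_max)
    width = (mel_hi - mel_lo) / n_filters
    return [mel_to_hz(mel_lo + (i + 0.5) * width) for i in range(n_filters)]


def filter_amplitudes(centers, f_min, f_max):
    """delta(i) = (1 / B(i)) / sum_j (1 / B(j)), B(i) spanning the neighbouring centers."""
    # ASSUMPTION: the outermost filters use the range limits as their outer cut-offs.
    edges = [f_min] + list(centers) + [f_max]
    bandwidths = [edges[i + 2] - edges[i] for i in range(len(centers))]
    inverse = [1.0 / b for b in bandwidths]
    total = sum(inverse)
    return [v / total for v in inverse]


F_MIN = 1500.0
F_MAX = 24000.0  # ASSUMPTION: hypothetical upper limit, not taken from the text
centers = filter_centers_hz(F_MIN, F_MAX)
amps = filter_amplitudes(centers, F_MIN, F_MAX)
print(len(centers), round(sum(amps), 6))  # 40 1.0
```

Because the filters widen toward high frequencies, the normalization gives narrow low-frequency filters a larger amplitude than wide high-frequency ones, and all amplitudes sum to 1.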

In the step (4), the Mel frequency spectrum characteristics comprise frequency spectrum energy, frequency spectrum centroid, frequency spectrum entropy, frequency spectrum peak value, frequency spectrum attenuation, frequency spectrum flux, frequency spectrum kurtosis, frequency spectrum attenuation point, frequency spectrum skewness, frequency spectrum inclination and frequency spectrum distribution;

The extracted Mel spectrum features should fully describe the characteristics of the original data and, at the same time, discriminate well between the different classes. The invention tests spectral features commonly used in the audio field on samples of the three sounds and, by comparing how well they distinguish the three sounds, screens out 11 effective features: spectral energy, spectral centroid, spectral entropy, spectral peak value, spectral attenuation, spectral flux, spectral kurtosis, spectral attenuation point, spectral skewness, spectral inclination and spectral distribution. These features describe the spectrum from multiple dimensions, including its fine detail, which makes up for the deficiencies of MFCCs. The mathematical expressions of these features are widely available and are not described in detail here.

In step (5), the MFCCs are obtained through the Mel cepstrum transformation: the amplitude of the Mel spectrum is logarithmically compressed, and an IFFT is then applied to the compressed Mel spectrum to obtain the MFCCs. Since the spectrum obtained by the aforementioned FFT is real and even, the IFFT can be replaced by a Discrete Cosine Transform (DCT), which is represented by formula (I):

In formula (I), m is the frequency-domain variable, k is the transform-domain variable, M is the number of time-domain points, x[m] is the time-domain amplitude, and X[k] is the transform-domain amplitude.

MFCC features generally consist of the MFCCs and their differences; in practice, the first 13 MFCCs are generally used. In acoustics, MFCCs are used to characterize formants, that is, the envelope of the frequency spectrum.
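As an illustration of step (5), the sketch below applies logarithmic compression followed by a DCT and keeps the first 13 coefficients. It assumes the common unnormalized DCT-II, $X[k] = \sum_{m=0}^{M-1} x[m]\cos\left(\frac{\pi k (2m+1)}{2M}\right)$, and the natural logarithm; the exact form and normalization of formula (I) may differ. The function names and the sample Mel frame are hypothetical.

```python
import math


def dct_ii(x):
    """Unnormalized DCT-II: X[k] = sum_m x[m] * cos(pi * k * (2m + 1) / (2M))."""
    M = len(x)
    return [
        sum(x[m] * math.cos(math.pi * k * (2 * m + 1) / (2 * M)) for m in range(M))
        for k in range(M)
    ]


def mfcc_from_mel(mel_spectrum, n_coeffs=13, floor=1e-10):
    """Log-compress one frame of Mel spectrum, apply the DCT, keep the first coefficients."""
    # ASSUMPTION: natural log, with a small floor to avoid log(0).
    log_mel = [math.log(max(v, floor)) for v in mel_spectrum]
    return dct_ii(log_mel)[:n_coeffs]


mel_frame = [1.0 + 0.1 * i for i in range(40)]  # hypothetical 40-band Mel spectrum of one frame
coeffs = mfcc_from_mel(mel_frame)
print(len(coeffs))  # 13
```

Keeping only the low-order coefficients retains the slowly varying envelope of the log Mel spectrum and discards its fine detail, which is why the Mel spectrum features are added alongside the MFCCs.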

In step (6), the MFCCs and the Mel spectrum features are combined into 24-dimensional Mel joint features (13 MFCCs plus the 11 Mel spectrum features). The Mel joint features include both the detail and the envelope information of the spectrum and can fully describe its characteristics. In addition, each frame of the original sound segment contains 512 time-domain sampling points, whereas after feature extraction the frame is represented by 24 feature values, so using the joint features brings a large gain in efficiency.

The Mel joint feature in step (7) is a 24 × 92 time sequence; after feature extraction, each sound segment corresponds to one feature sequence.
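The 24 × 92 size can be checked from the parameters given earlier (48000 Hz sampling rate, 0.5-second segments, frame length 512, frame overlap rate 0.5). The short calculation below is only a consistency check; it assumes that a trailing partial frame is dropped, and the constant names are illustrative.

```python
SAMPLE_RATE = 48000      # Hz
SEGMENT_SECONDS = 0.5
FRAME_LEN = 512          # samples per frame
FRAME_OVERLAP = 0.5
N_FEATURES = 24          # 13 MFCCs + 11 Mel spectrum features

segment_len = int(SAMPLE_RATE * SEGMENT_SECONDS)   # 24000 samples per segment
hop = int(FRAME_LEN * (1 - FRAME_OVERLAP))         # 256 samples between frame starts
# ASSUMPTION: only complete frames are counted.
n_frames = 1 + (segment_len - FRAME_LEN) // hop    # 92 frames per segment
feature_shape = (N_FEATURES, n_frames)
values_per_segment = N_FEATURES * n_frames

print(feature_shape, values_per_segment)  # (24, 92) 2208
```

Each segment is thus reduced from 24000 raw samples to 2208 feature values.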

The classification network has a five-layer architecture, as shown in FIG. 4: in sequence, a sequence input layer (input layer), a BiLSTM network layer (BiLSTM layer), a fully connected layer, a softmax layer, and a classification output layer (classification layer).

The sequence input layer is a sequence layer with 24 dimensions. The BiLSTM network layer has 100 neurons, that is, the input data is mapped to a 100-dimensional feature space. The result of the BiLSTM network layer is input into the fully connected layer, whose number of neurons equals the number of classes; the fully connected layer maps the result of the BiLSTM network layer to a 2-dimensional or 3-dimensional classification space in which each dimension represents a class, and the larger the value of the data in a given dimension, the more likely the data belongs to that class. If the values in two dimensions are similar, the class of the data is difficult to judge, so the values are exponentially mapped by the softmax layer to increase the discrimination. The resulting weight of each class is then taken as its probability, and the class is decided according to the probability. The classification output layer is used to calculate the cross-entropy loss of the classification.
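The role of the softmax layer can be illustrated with a small numeric sketch. The scores and the class order below are hypothetical, chosen so that two classes have similar values.

```python
import math


def softmax(scores):
    """Exponentially map raw class scores to probabilities that sum to 1."""
    peak = max(scores)  # subtracting the maximum keeps exp() numerically stable
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


CLASSES = ["B", "C", "N"]   # hypothetical ordering of the three classes
scores = [2.0, 1.8, 0.1]    # hypothetical outputs of the fully connected layer
probs = softmax(scores)
predicted = CLASSES[probs.index(max(probs))]
print(predicted)  # B
```

Here the raw scores of B and C differ by a factor of about 1.11, while their probabilities after the exponential mapping differ by a factor of about 1.22, which is the increase in discrimination described above.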

The core of the classification network in step (7) is a BiLSTM network, for the following reasons. Each 0.5-second sound segment corresponds to a 24-dimensional feature sequence, and the RNN is a neural network dedicated to processing sequences. However, the RNN has only short-term memory, so it can only process shorter sequences. The LSTM adds gating units to the structure of the RNN; the gating units discard redundant information, which extends the sequence processing capability of the LSTM. The BiLSTM consists of a forward LSTM and a backward LSTM; it can process longer time series data and can analyze the forward and reverse regularities of a sequence at the same time. In the set of sound segments, some segments record the onset of an abnormal sound and some record its termination, and these two kinds of sound data are symmetric. The BiLSTM can exploit this symmetry to obtain a better recognition effect.

In step (7), when the sound classification model with the best classification effect is selected, each set of classification network parameters is trained and tested 5 times in order to avoid chance effects caused by the network parameters. The classification results of the experiment are shown in Table 2.

TABLE 2

In step (10), the classification result sequence of the whole piece of sound data is composed of B, C and N.

In step (11), when the classification result sequence is visualized, type B corresponds to the value -1, type C to the value +1, and type N to the value 0, so that the result sequence is converted into a numerical sequence that is easy to visualize, as shown in FIG. 5.
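The conversion of the result sequence can be sketched as follows. The label-to-value mapping is taken from the text; the example sequence, the `abnormal_fraction` helper and the `THRESHOLD` value are hypothetical, since the text does not state the rule used for the overall judgment.

```python
LABEL_TO_VALUE = {"B": -1, "C": 1, "N": 0}  # mapping given in step (11)


def to_numeric(result_sequence):
    """Convert a sequence of class labels into the numerical sequence to be plotted."""
    return [LABEL_TO_VALUE[label] for label in result_sequence]


def abnormal_fraction(result_sequence):
    """Share of segments classified as abnormal (B or C)."""
    if not result_sequence:
        return 0.0
    abnormal = sum(1 for label in result_sequence if label != "N")
    return abnormal / len(result_sequence)


results = ["N", "N", "B", "B", "N", "C", "N", "N"]  # hypothetical per-segment results
values = to_numeric(results)
fraction = abnormal_fraction(results)

THRESHOLD = 0.2  # ASSUMPTION: hypothetical decision threshold, not from the text
verdict = "unqualified" if fraction > THRESHOLD else "qualified"
print(values, fraction, verdict)  # [0, 0, -1, -1, 0, 1, 0, 0] 0.375 unqualified
```

Plotting `values` against the segment index gives a trace that stays at 0 for normal sound and steps to -1 or +1 where an abnormality of type B or C occurs.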
