Near infrared spectrum feature extraction method and device
Reading note: this technology, Near infrared spectrum feature extraction method and device, was designed by 潘天红, 郭威, 李鱼强, 陈山 and 皱小波 on 2019-10-12.
Abstract: The invention discloses a near infrared spectrum feature extraction method and device. The method comprises the following steps: obtaining N samples to be detected; acquiring near infrared spectrum data of the N samples to be detected by using a spectrometer; preprocessing the near infrared spectrum data to obtain two-dimensional near infrared spectrum smooth data; arranging and converting the two-dimensional near infrared spectrum smooth data to obtain four-dimensional spectrogram data; extracting the features of the four-dimensional spectrogram data; and performing feature arrangement on the four-dimensional spectrogram data after feature extraction to obtain two-dimensional feature data. The advantages of the invention are that data integrity is preserved and features are extracted over the full spectral interval, so that information is not lost.
1. A method for near infrared spectral feature extraction, the method comprising:
obtaining N samples to be detected;
acquiring near infrared spectrum data of N samples to be detected by using a spectrometer;
preprocessing the near infrared spectrum data to obtain two-dimensional near infrared spectrum smooth data;
arranging and converting the two-dimensional near infrared spectrum smooth data to obtain four-dimensional spectrogram data;
extracting the features of the four-dimensional spectrogram data;
and performing feature arrangement on the four-dimensional spectrogram data after feature extraction to obtain two-dimensional feature data.
2. The method of claim 1, wherein preprocessing the near infrared spectrum data to obtain two-dimensional near infrared spectrum smooth data comprises: constructing a local model of length 2λ+1 for the current sample to be detected
according to the local model, acquiring the absorption rate model corresponding to the local model
wherein X_t is the wavelength at the center point of the current sample to be detected at time t, and Y_t is the absorption rate corresponding to X_t;
scaling and mapping the local interval [t-λ, t+λ] to the interval [-1, 1], and obtaining the weight function of the local interval
wherein x* is
by the formula
smoothing the absorption rate corresponding to X_t to obtain the smoothed absorption rate data corresponding to X_t
repeating the above steps to smooth the absorption rates corresponding to the M wavelengths in each sample, obtaining N×M two-dimensional near infrared spectrum smooth data
3. The method of claim 2, wherein arranging and converting the two-dimensional near infrared spectrum smooth data to obtain four-dimensional spectrogram data comprises: taking the N×M two-dimensional near infrared spectrum smooth data
wherein
wherein
4. The near infrared spectral feature extraction method of claim 3, wherein the feature extraction of the four-dimensional spectrogram data comprises: taking the four-dimensional spectrogram data as the input layer of a convolutional neural network and operating in the order convolution, pooling, convolution, pooling, … through L convolutional layers and pooling layers to obtain spectrogram features, thereby completing feature extraction of the four-dimensional spectrogram data, wherein each convolutional layer C_i comprises
5. The near infrared spectrum feature extraction method of claim 4, wherein the feature arrangement of the four-dimensional spectrogram data after feature extraction to obtain two-dimensional feature data comprises: performing feature arrangement on the spectrogram features by inverse transformation to obtain two-dimensional feature data.
6. An apparatus for near infrared spectral feature extraction, the apparatus comprising:
the screening module is used for acquiring N samples to be detected;
the spectrum data acquisition module is used for acquiring near infrared spectrum data of the N samples to be detected by using a spectrometer;
the smoothing processing module is used for preprocessing the near infrared spectrum data to obtain two-dimensional near infrared spectrum smoothing data;
the four-dimensional spectrogram data acquisition module is used for acquiring four-dimensional spectrogram data by arranging and converting the two-dimensional near infrared spectrum smooth data;
the characteristic extraction module is used for extracting the characteristics of the four-dimensional spectrogram data;
and the feature arrangement module is used for carrying out feature arrangement on the four-dimensional spectrogram data after feature extraction to obtain two-dimensional feature data.
7. The near infrared spectral feature extraction device of claim 6, wherein the smoothing module is further configured to: constructing a local model of length 2λ+1 for the current sample to be detected
according to the local model, acquiring the absorption rate model corresponding to the local model
wherein X_t is the wavelength at the center point of the current sample to be detected at time t, and Y_t is the absorption rate corresponding to X_t;
scaling and mapping the local interval [t-λ, t+λ] to the interval [-1, 1], and obtaining the weight function of the local interval
wherein x* is
by the formula
repeating the above steps to smooth the absorption rates corresponding to all M wavelengths in each sample, obtaining N×M two-dimensional near infrared spectrum smooth data
8. The near infrared spectral feature extraction device of claim 7, wherein the four-dimensional spectrogram data acquisition module is further configured to: taking the N×M two-dimensional near infrared spectrum smooth data, with M as the axis and a as the cutting step length, and arranging it into b rows so as to convert the two-dimensional near infrared spectrum smooth data
wherein the converted four-dimensional spectrogram data is obtained, r is the spectral data step interval, r' is the RGB step interval, and Dic is the RGB dictionary,
wherein
9. The near infrared spectral feature extraction device of claim 8, wherein the feature extraction module is further configured to: taking the four-dimensional spectrogram data as the input layer of a convolutional neural network and operating in the order convolution, pooling, convolution, pooling, … through L convolutional layers and pooling layers to obtain spectrogram features, thereby completing feature extraction of the four-dimensional spectrogram data, wherein each convolutional layer C_i comprises
10. The near infrared spectral feature extraction device of claim 9, wherein the feature arrangement module is further configured to: performing feature arrangement on the spectrogram features by inverse transformation to obtain two-dimensional feature data.
Technical Field
The invention relates to the field of pattern recognition and nondestructive testing, in particular to a near infrared spectrum feature extraction method and device.
Background
Near infrared spectrum analysis is an analytical method that uses the optical characteristics of chemical substances in the near infrared spectral interval to achieve rapid qualitative and quantitative detection of an object. It offers advantages that conventional detection and analysis methods cannot match: low sample consumption, no damage to samples, high analysis speed, low detection cost, and no waste pollution. After many years of technical development and improvement, the technology is widely applied in nationally important production fields such as agriculture, petroleum, medicine, the chemical industry, and food. With the continued development of the market economy and rising quality-of-life standards in China, the requirements of international markets and ordinary consumers on product quality keep increasing. Traditional analysis methods based mainly on chemical testing are time-consuming and polluting and can no longer meet these requirements, whereas near infrared spectrum analysis can replace them and achieve rapid, nondestructive detection of samples. However, the data obtained while preserving the integrity of the sample are generally high-dimensional, and existing analysis methods have the following disadvantages:
(1) High dependence on the analysis object. The effect of existing feature extraction algorithms varies with the characteristics of the analysis object and of the acquired data. Specifically, no analysis method is universal; each works only on analysis objects with one or a few data structures, and when the detection object changes frequently the effectiveness of existing analysis methods cannot be guaranteed;
(2) Low integrity of feature data. The integrity of the feature data determines the effectiveness, stability and comprehensiveness of the established model. Existing analysis methods can only select or compress data within intervals of the near infrared spectrum data and cannot extract features over the full spectral interval, so the integrity of the final modeling data cannot be guaranteed and existing analysis models are difficult to optimize;
(3) Limited feature extraction results. Existing feature extraction algorithms are based on finding data correlations in a linear space and cannot effectively analyze the nonlinear features of near infrared spectrum data. When the number of samples of near infrared spectrum data is smaller than the data dimension, existing nonlinear kernel function topology methods make the dimension of the hyperplane data lower than the original data dimension, so information is lost.
Chinese patent publication No. CN108446631A discloses an intelligent spectrogram analysis method based on deep learning with a convolutional neural network. The method obtains a set of spectrum images to be analyzed; preprocesses the spectrum images; trains a convolutional neural network (CNN) module; inputs the required spectrum images into the trained CNN for feature extraction and performance analysis; and outputs the result. It solves the problem that the model structure is not universal because the data dimensionality of the processed spectrum data is too high or uncertain. However, its input is a spectrum image and two-dimensional data samples cannot be analyzed, so information in two-dimensional data samples is easily lost, feature extraction over the full spectral interval cannot be achieved, and the integrity of the final data cannot be guaranteed.
Disclosure of Invention
The technical problem to be solved by the invention is how to provide a near infrared spectrum feature extraction method and device which have high data integrity and extract features of a full spectrum interval.
The invention solves the technical problems through the following technical means: a method of near infrared spectral feature extraction, the method comprising:
obtaining N samples to be detected;
acquiring near infrared spectrum data of N samples to be detected by using a spectrometer;
preprocessing the near infrared spectrum data to obtain two-dimensional near infrared spectrum smooth data;
arranging and converting the two-dimensional near infrared spectrum smooth data to obtain four-dimensional spectrogram data;
extracting the features of the four-dimensional spectrogram data;
and performing feature arrangement on the four-dimensional spectrogram data after feature extraction to obtain two-dimensional feature data.
Converting the near infrared spectrum data into four-dimensional spectrogram data and using the four-dimensional spectrogram data as the input variable for feature extraction preserves data integrity and enables nonlinear feature extraction over the full spectral interval. This addresses the loss of feature information in existing analysis methods, increases the effective information of the sample, and improves the accuracy of the system.
Preferably, preprocessing the near infrared spectrum data to obtain two-dimensional near infrared spectrum smooth data includes: constructing a local model of length 2λ+1 for the current sample to be detected
according to the local model, acquiring the absorption rate model corresponding to the local model
wherein X_t is the wavelength at the center point of the current sample to be detected at time t, and Y_t is the absorption rate corresponding to X_t;
scaling and mapping the local interval [t-λ, t+λ] to the interval [-1, 1], and obtaining the weight function of the local interval
wherein x* is the value after scaling and mapping to the interval [-1, 1];
by the formula
smoothing the absorption rate corresponding to X_t to obtain the smoothed absorption rate data corresponding to X_t
repeating the above steps to smooth the absorption rates corresponding to all M wavelengths in each sample, obtaining N×M two-dimensional near infrared spectrum smooth data
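The smoothing step can be illustrated with a minimal sketch. The formulas for the local model, the weight function and the smoothing expression are not reproduced in this text, so the sketch assumes a tricube weight (1 - |x*|^3)^3 on the rescaled interval [-1, 1] and a weighted local linear fit, as in ordinary locally weighted regression. Both choices, and the names tricube, smooth_spectrum and smooth_all, are assumptions made for illustration and are not taken from the invention.

```python
import numpy as np

def tricube(x_star):
    """Assumed weight function on the rescaled interval [-1, 1]."""
    x_star = np.clip(np.abs(x_star), 0.0, 1.0)
    return (1.0 - x_star**3) ** 3

def smooth_spectrum(wavelengths, absorption, lam):
    """Smooth one spectrum using a local window of length 2*lam + 1."""
    if lam < 2:
        raise ValueError("this sketch needs lam >= 2 for a local linear fit")
    wavelengths = np.asarray(wavelengths, dtype=float)
    absorption = np.asarray(absorption, dtype=float)
    m = absorption.size
    smoothed = np.empty(m)
    for t in range(m):
        # The window is truncated at the two ends of the spectrum.
        lo, hi = max(0, t - lam), min(m, t + lam + 1)
        # Map the index interval [t - lam, t + lam] onto [-1, 1].
        x_star = (np.arange(lo, hi) - t) / lam
        w = tricube(x_star)
        # Weighted local linear fit centred on X_t; polyfit applies its
        # weights to the residuals, so pass the square root of w.
        dx = wavelengths[lo:hi] - wavelengths[t]
        slope, intercept = np.polyfit(dx, absorption[lo:hi], 1, w=np.sqrt(w))
        smoothed[t] = intercept  # fitted value at X_t
    return smoothed

def smooth_all(wavelengths, spectra, lam):
    """Apply the smoothing to every row of an N x M matrix of spectra."""
    return np.vstack([smooth_spectrum(wavelengths, row, lam) for row in spectra])
```

Calling smooth_all(wavelengths, spectra, lam) on an N×M array returns an N×M array of smoothed values. A local linear fit leaves an exactly linear spectrum unchanged, which gives a simple sanity check for the sketch.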
Preferably, arranging and converting the two-dimensional near infrared spectrum smooth data to obtain four-dimensional spectrogram data includes: taking the N×M two-dimensional near infrared spectrum smooth data
with M as the axis and a as the cutting step length, and arranging it into b rows, so that the two-dimensional near infrared spectrum smooth data is converted into a × b × N three-dimensional spectral data; and converting the three-dimensional spectral data into four-dimensional spectrogram data through the mapping relation f
wherein
is the converted four-dimensional spectrogram data, r is the spectral data step interval, r' is the RGB step interval, and Dic is the RGB dictionary, wherein
r is the pixel resolution, Ψ_1 = [0, r', 2r', …, 127]^T, and Ψ_2 = [128, 128+r', 128+2r', …, 255]^T.
Preferably, the feature extraction of the four-dimensional spectrogram data includes: taking the four-dimensional spectrogram data as the input layer of a convolutional neural network and operating in the order convolution, pooling, convolution, pooling, … through L convolutional layers and pooling layers to obtain spectrogram features, thereby completing feature extraction of the four-dimensional spectrogram data, wherein each convolutional layer C_i comprises filters of dimension
the input data of the convolutional layer, after the convolution operation, serves as the feature data of the pooling layer, and each pooling layer P_i comprises a pooling window of dimension
Preferably, performing feature arrangement on the four-dimensional spectrogram data after feature extraction to obtain two-dimensional feature data includes: performing feature arrangement on the spectrogram features by inverse transformation to obtain two-dimensional feature data.
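The arrangement, the RGB mapping and the final rearrangement into two-dimensional data can be sketched as array reshapes. This is only an illustration under stated assumptions: it assumes M = a × b, and build_rgb_dictionary is a hypothetical blue-to-red ramp that stands in for the dictionary Dic, whose actual construction from Ψ_1, Ψ_2 and r' is not reproduced in this text. All function names are likewise assumptions.

```python
import numpy as np

def to_three_dim(smoothed, a, b):
    """Cut each length-M spectrum into b rows of length a: (N, M) -> (a, b, N)."""
    n, m = smoothed.shape
    if a * b != m:
        raise ValueError("this sketch assumes M == a * b")
    # Each spectrum becomes b rows of a points; the sample axis is moved last.
    return smoothed.reshape(n, b, a).transpose(2, 1, 0)

def build_rgb_dictionary(levels):
    """Hypothetical RGB dictionary: a blue-to-red ramp with `levels` entries."""
    ramp = np.linspace(0, 255, levels).round().astype(np.uint8)
    return np.stack([ramp, np.zeros_like(ramp), ramp[::-1]], axis=1)  # (levels, 3)

def to_spectrogram(cube, dic):
    """Map every value to an RGB triple: (a, b, N) -> (a, b, 3, N)."""
    lo, hi = cube.min(), cube.max()
    span = hi - lo if hi > lo else 1.0
    # Quantise the values to dictionary indices 0 .. len(dic) - 1.
    idx = np.round((cube - lo) / span * (len(dic) - 1)).astype(int)
    rgb = dic[idx]                    # (a, b, N, 3)
    return rgb.transpose(0, 1, 3, 2)  # (a, b, 3, N)

def to_two_dim(features):
    """Inverse arrangement: one row per sample, all other axes flattened."""
    n = features.shape[-1]
    return features.T.reshape(n, -1)
```

Under these assumptions to_two_dim(to_three_dim(X, a, b)) returns X unchanged, so the arrangement itself loses no values; any loss in this sketch would come only from the quantisation inside to_spectrogram.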
An apparatus for near infrared spectral feature extraction, the apparatus comprising:
the screening module is used for acquiring N samples to be detected;
the spectrum data acquisition module is used for acquiring near infrared spectrum data of the N samples to be detected by using a spectrometer;
the smoothing processing module is used for preprocessing the near infrared spectrum data to obtain two-dimensional near infrared spectrum smoothing data;
the four-dimensional spectrogram data acquisition module is used for acquiring four-dimensional spectrogram data by arranging and converting the two-dimensional near infrared spectrum smooth data;
the characteristic extraction module is used for extracting the characteristics of the four-dimensional spectrogram data;
and the feature arrangement module is used for carrying out feature arrangement on the four-dimensional spectrogram data after feature extraction to obtain two-dimensional feature data.
Preferably, the smoothing module is further configured to: constructing a local model of length 2λ+1 for the current sample to be detected
according to the local model, acquiring the absorption rate model corresponding to the local model
wherein X_t is the wavelength at the center point of the current sample to be detected at time t, and Y_t is the absorption rate corresponding to X_t;
scaling and mapping the local interval [t-λ, t+λ] to the interval [-1, 1], and obtaining the weight function of the local interval
wherein x* is the value after scaling and mapping to the interval [-1, 1];
by the formula
smoothing the absorption rate corresponding to X_t to obtain the smoothed absorption rate data corresponding to X_t
repeating the above steps to smooth the absorption rates corresponding to all M wavelengths in each sample, obtaining N×M two-dimensional near infrared spectrum smooth data
Preferably, the four-dimensional spectrogram data acquisition module is further configured to: taking the N×M two-dimensional near infrared spectrum smooth data
with M as the axis and a as the cutting step length, and arranging it into b rows, so that the two-dimensional near infrared spectrum smooth data is converted into a × b × N three-dimensional spectral data; and converting the three-dimensional spectral data into four-dimensional spectrogram data through the mapping relation f
wherein
is the converted four-dimensional spectrogram data, r is the spectral data step interval, r' is the RGB step interval, and Dic is the RGB dictionary, wherein
r is the pixel resolution, Ψ_1 = [0, r', 2r', …, 127]^T, and Ψ_2 = [128, 128+r', 128+2r', …, 255]^T.
Preferably, the feature extraction module is further configured to: taking the four-dimensional spectrogram data as the input layer of a convolutional neural network and operating in the order convolution, pooling, convolution, pooling, … through L convolutional layers and pooling layers to obtain spectrogram features, thereby completing feature extraction of the four-dimensional spectrogram data, wherein each convolutional layer C_i comprises filters of dimension
the input data of the convolutional layer, after the convolution operation, serves as the feature data of the pooling layer, and each pooling layer P_i comprises a pooling window of dimension
Preferably, the feature arrangement module is further configured to: performing feature arrangement on the spectrogram features by inverse transformation to obtain two-dimensional feature data.
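The alternating convolution and pooling sequence performed by the feature extraction module can be sketched for a single sample in plain NumPy. The filter counts, filter dimensions and pooling window sizes are not reproduced in this text, so they are left as parameters here. The use of max pooling, of stride-1 convolution without padding, and the function names are assumptions for illustration; trained filter weights would be needed in practice.

```python
import numpy as np

def conv2d_valid(image, filters):
    """image: (H, W, C); filters: (K, kh, kw, C) -> maps (H-kh+1, W-kw+1, K)."""
    h, w, c = image.shape
    k, kh, kw, fc = filters.shape
    if fc != c:
        raise ValueError("filter depth must match the number of image channels")
    out = np.zeros((h - kh + 1, w - kw + 1, k))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            patch = image[i:i + kh, j:j + kw, :]
            # The same K filters are applied at every position.
            out[i, j, :] = np.tensordot(filters, patch, axes=([1, 2, 3], [0, 1, 2]))
    return out

def max_pool(maps, size):
    """Non-overlapping size x size max pooling (assumed pooling type)."""
    h, w, k = maps.shape
    h2, w2 = h // size, w // size
    trimmed = maps[:h2 * size, :w2 * size, :]
    return trimmed.reshape(h2, size, w2, size, k).max(axis=(1, 3))

def extract_features(image, layers):
    """Apply convolution, pooling, convolution, pooling, ... in order.

    `layers` is a list of (filters, pool_size) pairs, one pair per
    convolutional layer C_i and pooling layer P_i.
    """
    x = np.asarray(image, dtype=float)
    for filters, pool_size in layers:
        x = max_pool(conv2d_valid(x, filters), pool_size)
    return x
```

For one a × b × 3 spectrogram, extract_features returns a smaller three-dimensional block of feature maps; stacking the blocks of all N samples and flattening each one gives the two-dimensional feature data.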
The invention has the advantages that:
(1) the near infrared spectrum data are converted into four-dimensional spectrogram data, which are used as the input variable for feature extraction, so data integrity is preserved; in addition, a convolutional neural network is used as the analysis model, which enables nonlinear feature extraction over the full spectral interval, addresses the loss of feature information in existing analysis methods, increases the effective information of the sample, and improves the accuracy of the system;
(2) using the four-dimensional spectrogram data as the input variable, combined with the ability of the convolutional neural network to process large data, greatly increases the number of effective input variables; although the number of input variables increases, the parameter sharing and sparse interaction of the convolutional neural network reduce the computation and storage requirements, which improves the speed of the system;
(3) combining image analysis with a convolutional neural network allows features to be extracted from the spectral data of different analysis objects and avoids the drawback of feature extraction that depends on a particular data structure;
(4) with convolutional neural network feature extraction, near infrared spectrum data of different substances can be handled by adjusting only the weights of the fully connected layer that follows feature extraction, which simplifies subsequent maintenance and updating of the model.
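The storage effect of parameter sharing mentioned in advantage (2) can be made concrete with a small count. The sizes below (a 32 × 32 RGB spectrogram and eight 3 × 3 filters) are hypothetical and are not taken from the invention; the sketch also assumes one bias term per filter or per output unit.

```python
def conv_layer_params(kernel_h, kernel_w, in_channels, num_filters):
    """Weights plus one bias per filter; independent of the image size."""
    return (kernel_h * kernel_w * in_channels + 1) * num_filters

def dense_layer_params(in_units, out_units):
    """Weights plus one bias per output unit of a fully connected layer."""
    return (in_units + 1) * out_units

# Hypothetical sizes: a 32 x 32 RGB spectrogram and eight 3 x 3 filters.
height, width, channels, filters, k = 32, 32, 3, 8, 3
out_units = (height - k + 1) * (width - k + 1) * filters  # stride 1, no padding

conv = conv_layer_params(k, k, channels, filters)
dense = dense_layer_params(height * width * channels, out_units)
print(conv, dense)
```

With these sizes the convolutional layer stores 224 parameters, while a fully connected layer producing the same number of outputs would store 22,125,600.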
Drawings
FIG. 1 is an overall architecture diagram of the near infrared spectrum feature extraction method disclosed in embodiment 1 of the present invention;
FIG. 2 is a design flow chart of the near infrared spectrum feature extraction method disclosed in embodiment 1 of the present invention;
FIG. 3 is a schematic diagram of the convolutional neural network processing procedure of the near infrared spectrum feature extraction method disclosed in embodiment 1 of the present invention.
Detailed Description
To make the objects, technical solutions and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments are described clearly and completely below with reference to the embodiments. The described embodiments are some, but not all, of the embodiments of the present invention. All other embodiments obtained by a person skilled in the art from the embodiments given herein without creative effort fall within the protection scope of the present invention.