Method for rapid identification and comparison using fluorescence spectrum characteristic information

Document No.: 1693486 Publication date: 2019-12-10

Reading note: This technology, "Method for rapid identification and comparison using fluorescence spectrum characteristic information", was designed by 何鹰, 魏峨尊, 高贝贝, 王南达, 王欣, 李京都 and 刘莎莎 on 2019-08-29. Main content: The invention discloses a method for rapid identification and comparison using fluorescence spectrum characteristic information. The method comprises 34 steps in total, beginning with (1) saving the valid row and column data in the original fluorescence spectrum file and removing the non-data parts of the file. The disclosed method does not require cumbersome mathematical decomposition methods; it only applies suitable processing to the data collected by the fluorometer. A peak-picking procedure establishes characteristic information such as the fluorescence peak intensities and peak center coordinates, from which related characteristic parameters are derived. These parameters are arranged in a matrix form that is convenient for automatic computation, so that similarity coefficients can be calculated and matched against the samples in a reference database. This yields accurate identification and discrimination information, with high identification and comparison accuracy and fast detection.

1. A method for rapid identification and comparison using fluorescence spectrum characteristic information, characterized by comprising the following steps:

(1) saving the valid row and column data in the original fluorescence spectrum file of the sewage, and removing the non-data parts of the file;

(2) saving the excitation wavelength and emission wavelength information of the fluorescence spectrum;

(3) deducting both the first-order Rayleigh scattering interference peak and the Raman interference peak from the original fluorescence spectrum file of the sewage;

according to the Rayleigh scattering formula, the intensity of Rayleigh scattered light is inversely proportional to the fourth power of the wavelength of the incident light:

$$I \propto \frac{1}{\lambda^{4}}$$

where λ represents the wavelength; with the excitation wavelength Ex set to 220-450 nm and the emission wavelength Em set to 250-600 nm, the excitation-emission coordinate points (250, 250) and (450, 450) are connected on the x-y plane, and the original scan data within ±20 nm of the emission wavelength on this line are deducted; if the excitation scan interval is 5 nm, the original scan data at an excitation wavelength of 245 nm and emission wavelengths of 250-259 nm are additionally deducted, which completes the deduction of the first-order Rayleigh scattering interference peak from the original fluorescence spectrum file of the sewage;

(4) deducting the second-order Rayleigh scattering interference peak from the original fluorescence spectrum file of the sewage: with the excitation wavelength Ex set to 220-450 nm and the emission wavelength Em set to 250-600 nm, the coordinate points (220, 440) and (300, 600) are connected on the x-y plane, and the original scan data within ±10 nm of the emission wavelength on this line are deducted, which completes the deduction of the second-order Rayleigh scattering interference peak;

(5) performing Gaussian low-pass filtering convolution on the spectrogram data obtained in step (4);

(51) setting the convolution parameters and smoothing the original spectrogram data with a Gaussian filter, wherein the density formula of the two-dimensional Gaussian function is:

$$h(x, y, \sigma) = \frac{1}{2\pi\sigma^{2}} \exp\left(-\frac{x^{2} + y^{2}}{2\sigma^{2}}\right)$$

where σ represents the standard deviation, and the Gaussian template matrix uses the discretized form of this function;

(52) performing Gaussian smoothing on the spectrogram data f(x, y) obtained in step (4) to obtain the processed spectrogram data g_s(x, y): g_s(x, y) = h(x, y, σ) * f(x, y), where * denotes convolution; h(x, y, σ) is converted into a two-dimensional template with which the spectrogram data are convolved;

(6) setting the spectrogram data output parameters according to the matrix obtained after convolution and outputting the processed spectrogram data; convolving the spectrogram data with the convolution kernel yields a new spectrogram data matrix;

(7) performing peak searching on the convolved matrix both before and after transposition, that is, searching for peaks in the emission-excitation matrix and in the excitation-emission matrix; setting the minimum distance between peaks, and recording the peak intensity values and the peak center coordinates (the excitation and emission wavelength values) before and after transposition;

(8) as a supplement to the peak search, deciding whether the maxima of the first and last columns of the matrix, before and after transposition, count as peaks: if the maximum of the second column is smaller than the maximum of the first column, the first-column maximum is taken as a peak; similarly, if the maximum of the last column is larger than the maximum of the preceding column, the last-column maximum is taken as a peak;

(9) setting the minimum interval between peaks: first determining the center coordinate of the maximum peak, i.e. its excitation and emission coordinates on the x-y axes, and then confirming other fluorescence peaks on the principle that any other valid peak must lie no less than 20 nm from that coordinate, until all fluorescence peaks meeting the requirement have been screened out;

(10) to prevent peaks of low fluorescence intensity from interfering with peaks of high fluorescence intensity during matching, comparing the maximum peak value with the value of each remaining peak first; if the maximum peak value is more than 1-3 times the compared peak value, the peak with the lower fluorescence intensity is removed;

(11) sorting the peaks in descending order of peak intensity;

(12) arranging the intensity value of each peak alongside its emission wavelength and excitation wavelength to form an m × 3 matrix, where m is the number of valid fluorescence peaks; this matrix is called the peak intensity and peak center coordinate matrix and is denoted m × 3; when m is 3, the matrix is:

(13) after the original matrix is transposed, performing the peak search again, selecting the valid peaks and sorting them by peak height to form the post-transposition peak intensity and peak center coordinate matrix, denoted m' × 3; when m' is 4, the matrix is:

(14) performing the following calculations on the peak intensity and peak center coordinate matrices before and after transposition:

(15) calculating the ratios of the maximum peak intensity in the first row to the remaining peak intensities, denoted R_peak; the ratios to the peak values of the second, third, fourth, ... Nth rows are recorded as R_peak12, R_peak13, R_peak14, … R_peak1N respectively; calculating the ratios of the peak value of the second row to those of the third, fourth, ... Nth rows, recorded as R_peak23, R_peak24, … R_peak2N respectively;

(16) calculating the differences between the emission wavelength of the maximum peak in the first row and the emission wavelengths of the remaining peaks, denoted D_em; the absolute differences to the second, third, fourth, ... Nth rows, |em1-em2|, |em1-em3|, |em1-em4|, … |em1-emN|, are recorded as D_em12, D_em13, D_em14, … D_em1N respectively; calculating the differences between the emission wavelength of the second-row peak and those of the third, fourth, ... Nth rows, |em2-em3|, |em2-em4|, … |em2-emN|, recorded as D_em23, D_em24, … D_em2N respectively;

(17) calculating the differences between the excitation wavelength of the maximum peak in the first row and the excitation wavelengths of the remaining peaks, denoted D_ex; the absolute differences to the second, third, fourth, ... Nth rows, |ex1-ex2|, |ex1-ex3|, |ex1-ex4|, … |ex1-exN|, are recorded as D_ex12, D_ex13, D_ex14, … D_ex1N respectively; calculating the differences between the excitation wavelength of the second-row peak and those of the third, fourth, ... Nth rows, |ex2-ex3|, |ex2-ex4|, … |ex2-exN|, recorded as D_ex23, D_ex24, … D_ex2N respectively;

(18) calculating the distances between the center coordinate of the maximum peak in the first row and the center coordinates of the remaining peaks, denoted D_xy; the distances to the peak centers of the second, third, fourth, ... Nth rows are recorded as D_xy12, D_xy13, D_xy14, … D_xy1N respectively; calculating the distances between the center coordinate of the second-row peak and those of the third, fourth, ... Nth rows, recorded as D_xy23, D_xy24, … D_xy2N; the calculation formula is as follows:

(19) calculating the angle between the line connecting the center coordinates of any two peaks and the x or y axis, denoted cos θ; the angles between the x axis and the lines connecting the strongest peak to the centers of the 2nd, 3rd, 4th, ... Nth strongest peaks are written cos θ12, cos θ13, cos θ14, … cos θ1N, and likewise the angle for the line connecting the centers of the 2nd and 3rd strongest peaks is written cos θ23; the calculation formula is as follows:

(20) calculating the slope in the x-y plane of the line connecting the centers of any two peaks, denoted Slo_k; the slopes of the lines connecting the strongest peak to the centers of the 2nd, 3rd, 4th, ... Nth strongest peaks are written Slo_k12, Slo_k13, Slo_k14, … Slo_k1N, and likewise the slope of the line connecting the centers of the 2nd and 3rd strongest peaks is written Slo_k23; the calculation formula is as follows:

(21) recombining all the calculation results with the peak intensity and peak center coordinate matrix into two matrices, an m × n matrix before transposition and an m' × n matrix after transposition, denoted TA and TB;

(22) combining the matrices TA and TB obtained from all samples into a database, denoted T_data, for the similarity matching calculation;

(23) subdividing the T_data database by building a new feature database according to the values of m and m', the new feature database consisting of data matrices whose values of m and m' are identical;

(24) when similarity matching is performed on an unknown sample, processing the unknown sample's data in the same way as each sample of the database, the processed data being an m × n matrix before transposition and an m' × n matrix after transposition, denoted XA and XB respectively;

(25) establishing the method for comparing and identifying the unknown sample against the feature database samples: performing the similarity matching calculation between the unknown sample matrices XA and XB and the corresponding pre- and post-transposition matrices TA and TB in the T_data database;

(26) the similarity matching calculation method comprises the following steps:

(27) performing similarity matching calculations one by one between matrix XA and every TA matrix in the T_data database, and between matrix XB and every TB matrix in the T_data database;

(28) for the unknown sample, x_mn denotes the value at row m, column n of the pre-transposition matrix XA, and y_m'n denotes the value at row m', column n of the post-transposition matrix XB; for a sample of the feature database, a_mn denotes the value at row m, column n of the pre-transposition matrix TA, and b_m'n denotes the value at row m', column n of the post-transposition matrix TB;

calculating, according to the following formula, the deviation of each element value of the pre-transposition unknown sample from the corresponding element of each sample of the feature database, and taking its absolute value as CV1

calculating, according to the following formula, the deviation of each element value of the post-transposition unknown sample from the corresponding element of each sample of the feature database, and taking its absolute value as CV2

(29) setting a threshold β1 for each element of the matrices CV1 and CV2: if an element is less than or equal to β1 it is recorded as 1, otherwise as 0; the new matrices are denoted SN_CV1 and SN_CV2, and the total number of elements equal to 1 in each matrix is counted;

(30) counting the number of non-zero elements in the unknown sample matrices XA and XB, denoted NZ_A and NZ_B;

(31) calculating the similarity matching coefficients of the unknown sample matrix before and after transposition, denoted X1 and X2; wherein

(32) calculating the total similarity matching coefficient of the unknown sample, denoted TX, wherein

(33) setting a display threshold for the total similarity matching coefficient, denoted β2, and displaying the information of samples whose total similarity matching coefficient is greater than or equal to β2;

(34) displaying the matched sample information from the feature database T_data in order of the number of matched files or of the total similarity matching coefficient.
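The formulas for CV1, CV2, X1, X2 and TX are not reproduced in this text, so the following Python sketch of steps (28) to (33) rests on stated assumptions rather than on the patented formulas: the deviation is taken as a relative deviation |x - a| / |a|, only non-zero elements of the unknown matrix are compared, each coefficient is the number of matched elements divided by the non-zero count, and TX is the mean of X1 and X2. The function names and the sample layout (a dict with `name`, `TA` and `TB`) are hypothetical.

```python
def match_coefficient(unknown, reference, beta1=0.3):
    """Share of non-zero elements of `unknown` that lie within beta1 of `reference`.

    Assumption: relative deviation |x - a| / |a|; the source formula is missing.
    """
    if len(unknown) != len(reference) or len(unknown[0]) != len(reference[0]):
        return 0.0  # matrices with different m (or n) are treated as non-matching
    matched = 0
    nonzero = 0  # plays the role of NZ_A / NZ_B
    for x_row, a_row in zip(unknown, reference):
        for x, a in zip(x_row, a_row):
            if x == 0:
                continue
            nonzero += 1
            if a != 0 and abs(x - a) / abs(a) <= beta1:
                matched += 1  # this element would be a 1 in SN_CV1 / SN_CV2
    return matched / nonzero if nonzero else 0.0


def total_coefficient(xa, xb, ta, tb, beta1=0.3):
    """TX, assumed here to be the mean of X1 and X2."""
    x1 = match_coefficient(xa, ta, beta1)
    x2 = match_coefficient(xb, tb, beta1)
    return (x1 + x2) / 2.0


def rank_matches(xa, xb, t_data, beta1=0.3, beta2=0.7):
    """Return (TX, name) pairs with TX >= beta2, best match first."""
    scored = [(total_coefficient(xa, xb, s["TA"], s["TB"], beta1), s["name"])
              for s in t_data]
    return sorted((item for item in scored if item[0] >= beta2), reverse=True)
```

The defaults β1 = 0.3 and β2 = 0.7 are the boundary values given in claims 8 and 9; any other choice within those ranges works the same way.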

2. The method for realizing rapid identification and comparison by using fluorescence spectrum characteristic information as claimed in claim 1, wherein: in step (52), the convolution template with a template size of 5 × 5 and σ = 1 is:

3. The method for realizing rapid identification and comparison by using fluorescence spectrum characteristic information as claimed in claim 1, wherein: in step (52), σ ranges from 0.5 to 3; the template size is 3 × 3, 5 × 5, 7 × 7 or 9 × 9, with a preferred template size of 3 × 3.

4. The method for realizing rapid identification and comparison by using fluorescence spectrum characteristic information as claimed in claim 1, wherein: in step (6), the image may be convolved again with the Gaussian convolution kernel to obtain an image matrix that has undergone several convolution operations.

5. The method for realizing rapid identification and comparison by using fluorescence spectrum characteristic information as claimed in claim 1, wherein: in the steps (15) - (20), N is an integer greater than 4.

6. The method for realizing rapid identification and comparison by using fluorescence spectrum characteristic information as claimed in claim 1, wherein: in step (21), when m is 4 and n is 9, the matrix may be arranged as follows:

7. The method for realizing rapid identification and comparison by using fluorescence spectrum characteristic information as claimed in claim 1, wherein: in step (25), matrix data in the feature database T_data whose values of m and m' agree with those of the unknown sample matrices XA and XB may be selected for the similarity matching calculation.

8. The method for realizing rapid identification and comparison by using fluorescence spectrum characteristic information as claimed in claim 1, wherein: in step (29), the threshold β1 is not more than 0.3.

9. The method for realizing rapid identification and comparison by using fluorescence spectrum characteristic information as claimed in claim 1, wherein: in step (33), the threshold value β 2 is 0.7 or more.

Technical Field

The invention relates to the fields of environmental science and spectroscopy, and in particular to a method for rapid identification and comparison using fluorescence spectrograms.

Background

In recent years, as national environmental protection requirements have risen, online environmental monitoring devices that offer monitoring, early-warning and source-tracing functions have attracted growing attention. Existing conventional online monitoring devices have a single function: most can only report specific measured values and cannot indicate which pollutants caused those values, where the pollutants came from, which industry they belong to, or which enterprises are discharging illegally or leaking, so the principle of "whoever pollutes is responsible for remediation" cannot be properly enforced. When a fluorescence spectrometer scans sewage, it yields a three-dimensional fluorescence spectrum specific to that sewage, i.e. its "fingerprint spectrum". Because this fingerprint is unique to the wastewater discharged by a given industry or enterprise, it can be applied to the monitoring, early warning and source tracing of sewage.

Every fluorescent substance has its own fluorescence spectrum, and fluorescence detection offers high sensitivity and selectivity, so it is widely used. However, when the fluorescence intensity of the dissolved organic matter differs little from that of the background water, or when the fluorescence spectra of several complex organic substances in a mixed solution overlap, it is extremely difficult to obtain comprehensive and accurate identification information from a single contour-line fluorescence spectrum, which is merely a projection of excitation wavelength, emission wavelength and fluorescence intensity. When the background water fluoresces about as strongly as the dissolved organic matter, instrument noise interferes with the determination of the fluorescence intensity and peak center coordinates of the organic matter. In a mixed system with overlapping spectra, the scanned spectrum is the combined result of interactions among the components, and the fluorescence intensities and peak center coordinates shift even more. A method is therefore needed that can identify the three-dimensional fluorescence spectrum of dissolved organic matter in a single-component or multi-component system and compare its spectral data with existing standard or reference information, so as to complete tasks such as classification, identification, comparison and source tracing.

At present, the fluorescence spectrum identification methods in common use at home and abroad decompose the spectrum with parallel factor analysis, partial least squares, alternating trilinear decomposition, non-negative matrix factorization and similar methods to obtain multi-component fluorescence information, and then build the identification method on that information. However, for solutions whose fluorescence intensities differ greatly or whose spectra are severely aliased, no reliable scientific method yet exists for identifying and comparing the decomposed spectra automatically and effectively. Some identification methods discriminate and compare using only the relationship between peak position and peak intensity. Others identify the components of a mixed three-dimensional fluorescence spectrum by building a comprehensive similarity index from the characteristic peaks and waveform parameters of the two-dimensional spectra produced by a multi-dimensional decomposition algorithm. The former rely only on so-called characteristic parameters such as peak position and fluorescence intensity; for complex multi-component systems whose fluorescence peak centers lie close together, this is highly subjective and gives low identification or comparison accuracy. The latter depend on parallel factor analysis (PARAFAC), whose output order is not deterministic, which can likewise cause identification or comparison errors.

Therefore, using three-dimensional fluorescence spectrograms and data to rapidly identify samples and to authenticate liquor (baijiu) and traditional Chinese medicinal materials is of great significance, as it is for monitoring, detecting and tracing environmental water quality.

Disclosure of Invention

The technical problem to be solved by the invention is to provide a method for rapidly identifying and tracing sewage using three-dimensional fluorescence spectrum characteristic information, so as to overcome the long processing time, low stability and identification and comparison errors that arise when a three-dimensional fluorescence spectrum is analyzed with a multi-dimensional decomposition algorithm.

The invention adopts the following technical scheme:

The improvement of the method for realizing rapid identification and comparison by utilizing fluorescence spectrum characteristic information is that the method comprises the following steps:

(1) storing effective row and column data information in an original fluorescence spectrum file, and removing a non-data part in the file;

(2) storing the excitation wavelength and emission wavelength information of the fluorescence spectrum;

(3) deducting a first-order Rayleigh scattering interference peak and a Raman interference peak of water;

according to the Rayleigh scattering formula, the intensity of Rayleigh scattered light is inversely proportional to the fourth power of the wavelength of the incident light; with the excitation wavelength (Ex) set to 220-450 nm and the emission wavelength (Em) set to 250-600 nm, the excitation-emission coordinate points (250, 250) and (450, 450) are connected on the x-y plane, and the original scan data within ±20 nm of the emission wavelength on this line are deducted; if the excitation scan interval is 5 nm, the original scan data at an excitation wavelength of 245 nm and emission wavelengths of 250-259 nm are additionally deducted, which completes the deduction of the first-order Rayleigh scattering interference peak;

(4) deducting the second-order Rayleigh scattering interference peak: with the excitation wavelength (Ex) set to 220-450 nm and the emission wavelength (Em) set to 250-600 nm, the coordinate points (220, 440) and (300, 600) are connected on the x-y plane, and the original scan data within ±10 nm of the emission wavelength on this line are deducted, which completes the deduction of the second-order Rayleigh scattering interference peak;
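The scatter deduction of steps (3) and (4) can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the first-order band is taken as the line Em = Ex, the second-order band as the line Em = 2·Ex (the line through (220, 440) and (300, 600)), the bands are applied along the whole of each line rather than only between the quoted end points, and "deducting" is implemented by overwriting the affected points with zero. The function name and the `fill_value` parameter are hypothetical.

```python
def mask_scatter(ex_wavelengths, em_wavelengths, intensities,
                 first_order_halfwidth=20.0, second_order_halfwidth=10.0,
                 fill_value=0.0):
    """Return a copy of the excitation-emission matrix with scatter bands blanked.

    intensities[i][j] is the intensity measured at excitation
    ex_wavelengths[i] and emission em_wavelengths[j] (both in nm).
    """
    cleaned = [row[:] for row in intensities]  # leave the input untouched
    for i, ex in enumerate(ex_wavelengths):
        for j, em in enumerate(em_wavelengths):
            # First-order Rayleigh band: within +/-20 nm of the line Em = Ex.
            on_first = abs(em - ex) <= first_order_halfwidth
            # Second-order Rayleigh band: within +/-10 nm of the line Em = 2*Ex.
            on_second = abs(em - 2 * ex) <= second_order_halfwidth
            if on_first or on_second:
                cleaned[i][j] = fill_value  # assumption: removed points become 0
    return cleaned
```

Zero is used as the fill value only for simplicity; a missing-value marker followed by interpolation would be an equally plausible reading of "deducting".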

(5) performing Gaussian low-pass filtering convolution on the scanned spectrogram and setting the convolution parameters

The original image is smoothed with a Gaussian filter. The density formula of the two-dimensional Gaussian function is as follows, and the Gaussian template matrix uses its discretized form:

$$h(x, y, \sigma) = \frac{1}{2\pi\sigma^{2}} \exp\left(-\frac{x^{2} + y^{2}}{2\sigma^{2}}\right)$$

The fspecial function is used to create a predefined filter operator; its syntax is:

h=fspecial(type)

h=fspecial(type,parameters,sigma)

Gaussian smoothing of the original image f(x, y) gives the processed image g_s(x, y): g_s(x, y) = h(x, y, σ) * f(x, y), where * denotes convolution. In the actual calculation, h(x, y, σ) is converted into a two-dimensional template with which the image is convolved, for example a convolution template of size 5 × 5 with σ = 1:

The choice of the standard deviation σ affects the shape of the function. If the standard deviation is too small, the weights of pixels away from the center become very small and the filter hardly smooths the noise; if it is too large, the template degenerates into an averaging template. σ usually ranges from 0.5 to 3; the template size is 3 × 3, 5 × 5, 7 × 7 or 9 × 9, with a preferred template size of 3 × 3;
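To make the discretized template concrete, the Python sketch below builds a Gaussian template for a given size and σ and applies it to a matrix. It is an illustration only: the template is normalized so that its entries sum to 1 (so the constant factor of the Gaussian density cancels), and the borders are handled by repeating the edge values; both choices are assumptions, as the text does not specify them. The function names are hypothetical.

```python
import math


def gaussian_kernel(size=5, sigma=1.0):
    """Discretized 2-D Gaussian template, normalized to sum to 1."""
    half = size // 2
    raw = [[math.exp(-(x * x + y * y) / (2.0 * sigma * sigma))
            for x in range(-half, half + 1)]
           for y in range(-half, half + 1)]
    total = sum(sum(row) for row in raw)
    return [[value / total for value in row] for row in raw]


def smooth(data, kernel):
    """Apply the template to a 2-D list, repeating edge values at the borders.

    The Gaussian template is symmetric, so this sliding weighted sum is
    identical to a true convolution.
    """
    rows, cols = len(data), len(data[0])
    half = len(kernel) // 2
    out = [[0.0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            acc = 0.0
            for dr in range(-half, half + 1):
                for dc in range(-half, half + 1):
                    rr = min(max(r + dr, 0), rows - 1)  # clamp to the matrix edge
                    cc = min(max(c + dc, 0), cols - 1)
                    acc += kernel[dr + half][dc + half] * data[rr][cc]
            out[r][c] = acc
    return out
```

Calling `gaussian_kernel(5, 1.0)` gives a 5 × 5, σ = 1 template of the kind referred to above, and calling `smooth` repeatedly corresponds to the optional repeated convolution.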

(6) outputting the image from the three-dimensional matrix obtained after convolution and setting the image output parameters

Convolving the image with the convolution kernel yields a new image matrix; when necessary, the image is convolved again with the Gaussian convolution kernel to obtain an image matrix that has undergone several convolution operations;

(7) performing peak searching on the convolved matrix both before and after transposition, that is, searching for peaks in the emission-excitation matrix and in the excitation-emission matrix; setting the minimum distance between peaks, and recording the peak intensity values and the peak center coordinates (excitation and emission wavelength values) before and after transposition;

(8) as a supplement to the peak search, deciding whether the maxima of the first and last columns of the matrix, before and after transposition, count as peaks: if the maximum of the second column is smaller than the maximum of the first column, the first-column maximum is taken as a peak; similarly, if the maximum of the last column is larger than the maximum of the preceding column, the last-column maximum is taken as a peak;

(9) because of instrument noise, a minimum interval between peaks sometimes has to be set; usually the center coordinate of the maximum peak, i.e. its coordinate on the x-y (excitation-emission) axes, is determined first, and other fluorescence peaks are then confirmed on the principle that any other valid peak must lie no less than 20 nm from that coordinate, until all fluorescence peaks meeting the requirement have been screened out;

(10) to prevent peaks of low fluorescence intensity from interfering with peaks of high fluorescence intensity during matching, comparing the maximum peak value with the value of each remaining peak first; if the maximum peak value is more than 1-3 times the compared peak value, the peak with the lower fluorescence intensity is removed;

(11) sorting the peaks in descending order of peak intensity;
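Steps (9) to (11) can be sketched in Python as a filter over candidate peaks that have already been found. The sketch makes several assumptions that the text leaves open: the 20 nm separation is measured as a Euclidean distance in the excitation-emission plane, it is enforced against every peak already kept rather than only against the strongest one, and the intensity cut-off is fixed at 3, the upper end of the quoted 1-3 range. The function name and the tuple layout (intensity, emission, excitation) are hypothetical, chosen to mirror the column order of the m × 3 matrix in step (12).

```python
import math


def select_peaks(candidates, min_separation=20.0, max_ratio=3.0):
    """Filter candidate peaks given as (intensity, em, ex) tuples.

    Returns the kept peaks in descending order of intensity, i.e. the rows
    of the m x 3 peak intensity and peak center coordinate matrix.
    """
    ordered = sorted(candidates, key=lambda peak: peak[0], reverse=True)
    kept = []
    for intensity, em, ex in ordered:
        if intensity <= 0:
            continue
        # Step (10): drop peaks that are too weak relative to the strongest peak.
        if kept and kept[0][0] / intensity > max_ratio:
            continue
        # Step (9): require at least min_separation nm to every kept peak.
        if all(math.hypot(em - k_em, ex - k_ex) >= min_separation
               for _, k_em, k_ex in kept):
            kept.append((intensity, em, ex))
    return kept
```

Because the candidates are processed from strongest to weakest, the output is already in the order required by step (11).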

(12) arranging the intensity value of each peak alongside its emission wavelength and excitation wavelength to form an m × 3 matrix, where m is the number of valid fluorescence peaks; this matrix is called the peak intensity and peak center coordinate matrix and is denoted m × 3; if m is 3, the matrix is:

(13) after the original matrix is transposed, performing the peak search again, selecting the valid peaks and sorting them by peak height to form the post-transposition peak intensity and peak center coordinate matrix, denoted m' × 3; if m' is 4, the matrix is:

(14) if multiple peaks exist, performing the following calculations on the peak intensity and peak center coordinate matrices before and after transposition:

(15) calculating the ratio of the maximum peak intensity value of the first line to the rest peak intensity values, and recording the ratio as R _ peak12, R _ peak13, R _ peak14 and … if the ratio is to the peak values of the second line, the third line and the fourth line; calculating the ratio of the peak value of the second line to the peak value of the third line and the fourth line, and respectively recording the ratio as R _ peak23, R _ peak24 and the like;

(16) calculating the difference between the emission wavelength of the maximum peak in the first row and the emission wavelengths of the other peaks, and calculating as D _ em, such as the difference between the emission wavelengths of the second row, the third row and the fourth row, | em1-em2|, | em1-em3|, | em1-em4|,. the emission wavelengths of the second row and the emission wavelengths of the third row and the fourth row are respectively recorded as D _ em12, D _ em13, D _ em14 and …, and the difference between the emission wavelengths of the second row and the emission wavelengths of the third row and the fourth row are calculated as | em2-em3|, | em2-em4|, and the absolute values are respectively recorded as D _ em23, D _ em24 and the like;

(17) calculating the difference between the excitation wavelength of the maximum peak in the first row and the excitation wavelengths of the other peaks, and calculating as D _ ex, such as the difference between the excitation wavelengths of the second, third and fourth rows, | ex1-ex2|, | ex1-ex3|, | ex1-ex4|,. the absolute values are respectively recorded as D _ ex12, D _ ex13, D _ ex14, …, and the difference between the excitation wavelengths of the peaks in the second row and the excitation wavelengths of the peaks in the third and fourth rows, such as | ex2-ex3|, | ex2-ex4|, and the absolute values are respectively recorded as D _ ex23, D _ ex24, and the like;

(18) calculating the distance between the maximum peak center coordinate of the first row and the center coordinates of the other peaks, and calculating the distance as D _ xy, wherein the distances between the maximum peak center coordinate of the first row and the centers of the peaks of the second row, the third row and the fourth row are respectively recorded as D _ xy12, D _ xy13, D _ xy14 and …; calculating the distance between the second peak center coordinate and the third and fourth row peak center coordinates, which are respectively recorded as D _ xy23, D _ xy24 and the like; is calculated by the formula

(19) calculating the included angle between the connecting line of the center coordinates of any two peaks and the x or y axis, and recording the included angle as cos theta, wherein the included angle between the connecting line of the strongest peak and the center coordinates of the 2 nd, 3 rd and 4 th strong peaks and the x axis can be represented as cos theta 12, cos theta 13, cos theta 14 and …, the included angle between the connecting line of the center coordinates of the 2 nd strong peak and the 3 rd strong peak and the x axis can be represented as cos theta 23, and the like, and the calculation formula is

(20) calculating the slope, on the x-y plane, of the line connecting the centers of any two peaks, recorded as Slo_k; for example, the slopes of the lines connecting the center of the strongest peak to the centers of the 2nd, 3rd and 4th strongest peaks are denoted Slo_k12, Slo_k13, Slo_k14, ..., and the slope of the line connecting the centers of the 2nd and 3rd strongest peaks is denoted Slo_k23, and so on; the calculation formula is [formula missing from the source text];

(21) recombining all the calculation results with the matrix of peak intensities and peak center coordinates into two matrices, namely an m × n matrix before transposition and an m' × n matrix after transposition, recorded as TA and TB; for example, for m = 4 and n = 9, the matrix TA may be arranged as follows [matrix missing from the source text];

the transposed matrix TB is arranged similarly to TA;

(22) combining the two matrices TA and TB obtained from all samples into a database, recorded as T_data, which is used for the similarity matching calculation;

(23) where needed, the database can be subdivided, for example by building new feature databases according to the values of m and m', each consisting of the data matrices that share the same value of m and the same value of m'; identification and comparison of unknown samples against such a subdivided feature database is faster and matches more accurately;

(24) when similarity matching identification is carried out on an unknown sample, the unknown sample is first processed by the same data processing method as each sample of the database; the processed data are an m × n matrix before transposition and an m' × n matrix after transposition, recorded as XA and XB respectively;

(25) establishing the method for comparing and identifying the unknown sample against the samples of the feature database: in general, the similarity matching calculation is carried out between the unknown sample matrices XA and XB and the corresponding matrices TA and TB in the T_data database; in particular, to improve the matching identification rate, only the matrices in T_data whose values of m and m' agree with those of XA and XB may be selected for the similarity matching calculation;

(26) The similarity matching calculation method comprises the following steps:

(27) performing the similarity matching calculation between the matrix XA and every TA matrix in T_data, one by one, and between the matrix XB and every TB matrix in T_data, one by one;

(28) the matrices are denoted as follows: the unknown sample before transposition, XA, and after transposition, XB; a sample of the feature database before transposition, TA, and after transposition, TB [the four matrix expressions are missing from the source text];

calculating, for the matrices before transposition, the deviation between each element value of the unknown sample and the corresponding element value of each sample of the feature database, taking the absolute value of the deviation and recording it as CV1 [formula missing from the source text];

calculating, for the matrices after transposition, the deviation between each element value of the unknown sample and the corresponding element value of each sample of the feature database, taking the absolute value of the deviation and recording it as CV2 [formula missing from the source text];

(29) setting a threshold β1 for each element of the matrices CV1 and CV2, for example β1 ≤ 0.3: an element less than or equal to 0.3 is marked as 1, otherwise as 0; the new matrices are recorded as SN_CV1 and SN_CV2, and the total number of elements equal to 1 in each matrix is calculated;

(30) calculating the number of non-zero elements in the unknown sample matrices XA and XB, recorded as NZ_A and NZ_B;

(31) calculating the similarity matching coefficients of the unknown sample matrix before and after transposition, recorded as X1 and X2, wherein [formula missing from the source text];

(32) calculating the total similarity matching coefficient of the unknown sample, recorded as TX, wherein [formula missing from the source text];

(33) setting a display threshold for the total similarity matching coefficient, recorded as β2; for example, with β2 ≥ 0.7, the information of samples whose total similarity matching coefficient is greater than or equal to 0.7 is displayed;

(34) displaying the matched sample information from the feature database T_data, ordered by the number of matched files or by the total similarity matching coefficient.
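
The pairwise feature calculations of steps (16) to (20) can be illustrated with a minimal Python sketch. The distance, angle and slope formulas are not reproduced in this text, so the sketch rests on assumptions that are not confirmed by the source: the x axis is taken as the excitation wavelength and the y axis as the emission wavelength, D_xy is taken as the Euclidean distance, cos θ as the cosine of the angle with the x axis, and Slo_k as Δy/Δx. The function name `pairwise_peak_features` and the example peak coordinates are hypothetical.

```python
import math
from itertools import combinations


def pairwise_peak_features(peaks):
    """Pairwise features for peaks ordered from strongest to weakest.

    peaks: list of (ex, em) peak-center coordinates in nm.
    Returns a dict keyed by the 1-based pair (i, j), e.g. (1, 2) for D_em12.
    """
    features = {}
    for (i, (ex_i, em_i)), (j, (ex_j, em_j)) in combinations(
        enumerate(peaks, start=1), 2
    ):
        d_em = abs(em_i - em_j)  # step (16): |em_i - em_j|
        d_ex = abs(ex_i - ex_j)  # step (17): |ex_i - ex_j|
        # ASSUMPTION: Euclidean distance with x = excitation, y = emission.
        d_xy = math.hypot(d_ex, d_em)  # step (18)
        # ASSUMPTION: cosine of the angle between the connecting line and
        # the x axis; coincident centers are arbitrarily given 0.0.
        cos_theta = d_ex / d_xy if d_xy else 0.0  # step (19)
        # ASSUMPTION: slope = delta y / delta x; a vertical line maps to inf.
        dx = ex_j - ex_i
        slo_k = (em_j - em_i) / dx if dx else math.inf  # step (20)
        features[(i, j)] = {
            "D_em": d_em,
            "D_ex": d_ex,
            "D_xy": d_xy,
            "cos_theta": cos_theta,
            "Slo_k": slo_k,
        }
    return features


if __name__ == "__main__":
    # Hypothetical peak centers (ex, em), strongest first.
    example = [(280.0, 340.0), (283.0, 344.0), (280.0, 350.0)]
    for pair, values in pairwise_peak_features(example).items():
        print(pair, values)
```

Keying the result by the pair (i, j) mirrors the D_em12, D_em13, D_em23 naming used in the steps, which makes it straightforward to lay the values out as rows of the TA matrix in step (21).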

The invention has the beneficial effects that:

Existing methods for identifying and comparing three-dimensional fluorescence spectra rely on decomposition techniques such as parallel factor analysis, partial least squares, alternating trilinear decomposition and non-negative matrix factorization, and build the identification and comparison method on the multi-component fluorescence information so obtained. The method disclosed by the invention does not require these cumbersome mathematical decompositions. It only processes the data collected by the fluorometer appropriately, determines characteristic information such as the fluorescence peak intensity and peak center coordinates through a peak selection program, and derives characteristic parameter indexes from that information. These indexes are arranged in a matrix form suited to automatic computation, so that similarity coefficients can be calculated and matched against the samples of a reference comparison database. The method yields accurate identification and discrimination information, with high identification and comparison accuracy and high detection speed.
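
The similarity calculation and matching of steps (28) to (33) can be sketched in Python as follows. The formulas for CV1, CV2, X1, X2 and TX are not reproduced in this text, so every formula in the sketch is an assumption rather than the patented definition: the deviation is taken as the relative deviation |x - t| / |x|, zero elements of the unknown sample are skipped, each coefficient is taken as the number of matching elements divided by the number of non-zero elements, and TX is taken as the mean of X1 and X2. All function names and the sample data are hypothetical.

```python
def match_coefficient(x, t, beta1=0.3):
    """Similarity coefficient between an unknown matrix x and a database
    matrix t, both given as lists of rows.

    ASSUMPTION: matrices of different shape are treated as non-matching.
    """
    if len(x) != len(t) or any(len(rx) != len(rt) for rx, rt in zip(x, t)):
        return 0.0
    nonzero = 0  # NZ_A or NZ_B of step (30)
    matched = 0  # number of ones in SN_CV1 or SN_CV2 of step (29)
    for row_x, row_t in zip(x, t):
        for vx, vt in zip(row_x, row_t):
            if vx == 0:
                continue  # ASSUMPTION: zero elements are not compared
            nonzero += 1
            cv = abs(vx - vt) / abs(vx)  # ASSUMPTION: relative deviation
            if cv <= beta1:
                matched += 1
    # ASSUMPTION: coefficient = matched elements / non-zero elements.
    return matched / nonzero if nonzero else 0.0


def rank_matches(xa, xb, t_data, beta1=0.3, beta2=0.7):
    """Return (name, TX) pairs with TX >= beta2, best match first.

    t_data: dict mapping a sample name to its (TA, TB) matrices.
    """
    hits = []
    for name, (ta, tb) in t_data.items():
        x1 = match_coefficient(xa, ta, beta1)  # before transposition
        x2 = match_coefficient(xb, tb, beta1)  # after transposition
        tx = (x1 + x2) / 2  # ASSUMPTION: TX is the mean of X1 and X2
        if tx >= beta2:
            hits.append((name, tx))
    return sorted(hits, key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    # Hypothetical one-row matrices: intensity, two wavelengths, padding zero.
    xa = [[100.0, 280.0, 340.0, 0.0]]
    xb = [[100.0, 340.0, 280.0, 0.0]]
    t_data = {
        "sample_1": ([[110.0, 280.0, 345.0, 0.0]], [[110.0, 345.0, 280.0, 0.0]]),
        "sample_2": ([[300.0, 220.0, 500.0, 0.0]], [[300.0, 500.0, 220.0, 0.0]]),
    }
    print(rank_matches(xa, xb, t_data))
```

Restricting `t_data` beforehand to entries whose m and m' agree with those of the unknown sample, as step (25) suggests, would avoid the shape check rejecting candidates one by one.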

Detailed Description

In order to make the objects, technical solutions and advantages of the present invention clearer, the present invention is described in further detail below with reference to the following embodiments. It should be understood that the specific embodiments described herein merely illustrate the invention and are not intended to limit it.
