Method for identifying the species of semen stains based on Fourier transform infrared spectroscopy and chemometrics

Document No.: 969500 Publication date: 2020-11-03

Reading note: this technology, "Method for identifying the species of semen stains based on Fourier transform infrared spectroscopy and chemometrics" (基于傅里叶变换红外光谱和化学计量学鉴别精斑种属的方法), was designed by 王振原 (Wang Zhenyuan), 魏昕 (Wei Xin), 于凯 (Yu Kai), 王功绩 (Wang Gongji), 吴迪 (Wu Di) and 刘睿娜 (Liu Ruina) on 2020-06-19. Main content: The invention discloses a method for identifying the species of semen stains based on Fourier transform infrared spectroscopy and chemometrics. Semen samples are collected from humans and from five common animals (dogs, rabbits, pigs, cattle and sheep). From these semen samples, a total of 420 semen stain samples, divided into a training set and a validation set, are prepared on different carriers and with different formation times and are measured by Fourier transform infrared spectroscopy. Principal component analysis is used for cluster analysis of the measured spectral data, and partial least squares discriminant analysis is finally used to train and validate the model. Compared with traditional methods for identifying the species of semen stains, the method does not destroy the semen stain material under examination, is simple to operate, and can quickly and accurately identify the species origin of a semen stain, which helps to establish the nature of a case quickly.

1. A method for rapidly identifying the species of a semen stain, characterized in that the method comprises the following steps:

1) separating a semen sample from the semen stain whose species is to be identified, and collecting Fourier transform infrared spectrum data of the semen sample;

2) selecting, from the collected Fourier transform infrared spectrum data, the data of the semen stain fingerprint region and calibrating it;

3) inputting the calibrated data of the semen stain fingerprint region into a human/non-human semen stain species identification model, the model outputting an identification result; the semen stain species identification model is established by partial least squares discriminant analysis, using the corresponding Fourier transform infrared spectrum data of semen samples separated from a plurality of human and animal semen stains.

2. The method for rapidly identifying the species of a semen stain according to claim 1, characterized in that: the semen sample is separated by transferring the semen stain from the carrier into physiological saline.

3. The method for rapidly identifying the species of a semen stain according to claim 1, characterized in that: the Fourier transform infrared spectrum data are collected as follows: Fourier transform infrared spectroscopic analysis is performed several times on the semen sample separated from the semen stain, and the results of these analyses are then averaged to obtain the Fourier transform infrared spectrum data corresponding to the semen stain.

4. The method for rapidly identifying the species of a semen stain according to claim 1 or 3, characterized in that: in the Fourier transform infrared spectrum data, the semen stain fingerprint region comprises the absorption peaks of amide I and amide II.

5. The method for rapidly identifying the species of a semen stain according to claim 1, characterized in that: the calibration comprises the following steps: performing mean centering and standard normal variate transformation on the selected infrared spectrum data.

6. The method for rapidly identifying the species of a semen stain according to claim 1, characterized in that: in establishing the semen stain species identification model, the semen samples are separated from a plurality of human and animal semen stains with different formation times; alternatively, the semen samples are separated from two types of semen stains, one type being human and animal semen stains that differ in formation time and the other type being human and animal semen stains that differ in carrier.

7. The method for rapidly identifying the species of a semen stain according to claim 6, characterized in that: the formation time of the human semen stains and the animal semen stains is within 30 days.

8. The method for rapidly identifying the species of a semen stain according to claim 6, characterized in that: the carrier is selected from one or more of a glass substrate and a fiber substrate.

9. The method for rapidly identifying the species of a semen stain according to claim 1 or 6, characterized in that: the animal is selected from one or more of dog, rabbit, pig, cattle and sheep.

Technical Field

The invention belongs to the technical field of rapid, non-destructive examination of semen stains, and relates to the qualitative identification of semen stain components in physical evidence collected during case examination.

Background

Semen is a reproductive fluid that, when ejaculated under sexual stimulation, typically soaks into or adheres to a carrier and dries to form a stain. In a sexual crime or another case involving sexual activity, such a stain, if it remains on a carrier and is recovered at the scene or in a laboratory, yields a semen stain, one of the most important and reliable kinds of evidence. Semen stains can not only confirm that sexual activity occurred but can also be used to identify a criminal suspect. In sexual crime cases where the victim has a mental disability, where there is no witness, or where the victim has died, semen stains formed from residual semen are particularly valuable for forensic examination and case investigation.

Many mature methods for detecting semen stains have been widely applied to actual cases and provide clues for case investigation. However, it has been reported in the literature that presumptive testing for seminal acid phosphatase (SAP) may give false positive results. Similarly, prostate-specific antigen (PSA) is not specific to semen and can be found in many non-prostate tissues and fluids (positive PSA results can also be obtained from the serum of patients with breast, lung, colon and other cancers). Most commercial immunochromatographic tests used at crime scenes (e.g., RSID™ and the product shown in Figure BDA0002548672090000011) are fast but destructive to the sample. Especially when the amount of semen stain material is very limited, this destructiveness affects subsequent DNA analysis. Furthermore, none of the above semen stain detection assays is species-specific, meaning that SAP or PSA can also be detected in the semen of other species. Therefore, there is a need for a repeatable, highly accurate and non-destructive technique for detecting semen stain evidence and distinguishing its species.

Fourier transform infrared (FTIR) spectroscopy is a widely used spectroscopic technique. FTIR spectroscopy studies the absorption of electromagnetic radiation in the mid-infrared region (about 4000-400 cm⁻¹). The absorption bands in this region arise mainly from fundamental transitions of molecular vibrations and are highly characteristic: different chemical functional groups undergo different energy transitions when irradiated with infrared light and therefore show different absorption spectra. As a label-free and non-destructive detection technique, FTIR spectroscopy can show characteristic peaks corresponding to specific chemical functional groups in the sample. Compared with other optical techniques such as phase contrast microscopy and confocal fluorescence, FTIR spectroscopy uses low-energy light, so the sample does not suffer from photobleaching, ionization damage and similar problems. It is therefore particularly suitable for analyzing samples in the biomedical field, and its spectral specificity and ease of detection have also promoted the wide application of FTIR in analytical chemistry.

Chemometrics, also called chemical statistics, was introduced in 1971 by the Swedish scholar Wold. It is a discipline that relates the measured values of a chemical system to the state of that system by statistical or mathematical methods. In recent years, with the development of analytical chemistry, a variety of modern measuring instruments have appeared and produce large amounts of data on the chemical composition of substances, from which valuable information can be extracted by chemometric methods.

In forensic science, species identification is an important step in forensic investigation, and it should be performed on a semen stain regardless of the carrier on which the stain formed. At present, no report has been found of Fourier transform infrared spectroscopy combined with chemometrics being applied to the identification of the species of semen stains.

Disclosure of Invention

To address the problem that the species of a semen stain cannot currently be identified rapidly, accurately and non-destructively at the same time, the invention provides a method for identifying the species of semen stains based on Fourier transform infrared spectroscopy and chemometrics.

In order to achieve the purpose, the invention adopts the following technical scheme:

1) separating a semen sample from the semen stain whose species is to be identified, and collecting Fourier transform infrared spectrum data of the semen sample;

2) selecting the data of the semen stain fingerprint region (1800-900 cm⁻¹) from the Fourier transform infrared spectrum data collected in step 1) and calibrating it;

3) inputting the calibrated data of the semen stain fingerprint region (specifically, the absorbance values at all wavenumber points in the fingerprint region) into a human/non-human semen stain species identification model, the model outputting an identification result; the semen stain species identification model is established by partial least squares discriminant analysis, using the corresponding Fourier transform infrared spectrum data (obtained as in steps 1 and 2) of semen samples separated from a plurality of human and animal semen stains.

Preferably, the semen sample is separated by transferring the semen stain from the carrier into physiological saline.

Preferably, the Fourier transform infrared spectrum data are collected as follows: Fourier transform infrared spectroscopic analysis is performed several times on the semen sample separated from the semen stain, and the results of these analyses are then averaged to obtain the Fourier transform infrared spectrum data corresponding to the semen stain.

Preferably, the semen stain fingerprint region comprises the protein absorption peaks (amide I, amide II and the like) and the carbohydrate absorption peaks of the semen stain in Fourier transform infrared spectroscopic analysis.

Preferably, the calibration comprises the following steps: performing mean centering and standard normal variate transformation on the selected infrared spectrum data.

Preferably, in establishing the semen stain species identification model, the semen samples are separated from a plurality of human and animal semen stains with different formation times; alternatively, the semen samples are separated from two types of semen stains, one type being human and animal semen stains that differ in formation time (same carrier) and the other type being human and animal semen stains that differ in carrier.

Preferably, the formation time of the human semen stains and the animal semen stains is within 30 days.

Preferably, the type of carrier is determined by the actual case (semen stains generally form on a fibrous substrate such as a paper towel or underwear, or on a glass substrate such as a glass slide).

Preferably, the animal is selected from one or more mammals such as dog, rabbit, pig, cattle and sheep.

The invention has the beneficial effects that:

The invention uses semen samples separated from human and animal (such as dog, rabbit, pig, cattle and sheep) semen stains to establish a PLS-DA model for inferring the species origin of a semen stain. It does not destroy the semen stain material under examination, is simple to operate, can quickly and accurately identify the species of a semen stain (whether or not it is human), and helps to establish the nature of a case quickly from semen stain evidence.

Furthermore, the invention requires no chemical pretreatment of the semen stain sample recovered from the crime scene. The semen stain only needs to be mixed with physiological saline to separate the semen sample, whose Fourier transform infrared spectrum can then be measured directly, thereby avoiding damage to trace physical evidence.

Furthermore, the invention establishes the species identification model from semen stain samples with formation times within 30 days. The accuracy of the identification result is not affected by the carrier (glass slide, paper towel, underpants and the like), and the ability to discriminate the species origin is markedly improved.

Furthermore, the invention adds semen stains on different carriers as training samples, which improves the reliability of the established species identification model, so that the species origin can be inferred accurately even for semen stain material whose formation time (for example, more than 30 days) exceeds that of the semen stains in the training samples.

Furthermore, the fingerprint region selected by the invention makes more effective use of the information in the infrared spectrum data of semen stains (invalid data are removed) and shortens the data analysis time, while retaining the infrared spectrum data closely related to the inference accuracy of the classification model.

Furthermore, for the semen sample separated from a single semen stain, the invention calculates an average spectrum from several (parallel) Fourier transform infrared spectra collected in repeated measurements, which makes the sampled infrared spectrum data more representative.

Drawings

FIG. 1 shows the average spectra of human and the 5 animal species used in an example of the invention; legend: Dog: dog; Goat: sheep; Human: human; Bull: cattle; Pig: pig; Rabbit: rabbit.

FIG. 2 is a score plot of the principal component analysis clustering results for the infrared spectrum data of semen stain samples on different carriers in an example of the invention; (a) shows the results labelled by carrier (Panties: underpants; Microscope slides: glass slides; Tissues: paper towels), and (b) shows the results labelled by species.

FIG. 3 is a score plot of the principal component analysis clustering results for the infrared spectrum data of semen stain samples with different formation times in an example of the invention; different colors represent different species and different shapes represent different formation times.

FIG. 4 is a schematic diagram of the partial least squares discriminant analysis model constructed for semen stain species in an example of the invention; (a) is the model constructed from the training set, and (b) is the prediction result for the validation set; discrimy3 shows the decision boundary between human and animal semen stains.

Detailed Description

The present invention will be described in further detail with reference to the accompanying drawings and examples. The examples are given solely for the purpose of illustration and are not intended to limit the scope of the invention.

The invention collects semen samples from humans and from five common animals (dogs, rabbits, pigs, cattle and sheep), prepares semen stain samples on different carriers and with different formation times, measures them by Fourier transform infrared spectroscopy, and performs cluster analysis on the measured spectral data. In the cluster analysis, principal component analysis (PCA) is used to reduce the dimensionality of the spectral data, after which the clustering trend is examined. Looking at how carrier and formation time affect the identification of semen stain species, it was found that the carrier does not affect the species clustering trend of the semen stains, and that the formation time (within 30 days) does not affect it either. Based on this finding and on practical modeling, the invention provides a method for identifying the species origin of semen stains by combining Fourier transform infrared spectroscopy with chemometric analysis. The model is trained and validated by partial least squares discriminant analysis, and its accuracy and precision reach 100%.

First, preparation of semen stain samples

Semen samples were collected from 10 healthy adult men and from 10 individuals of each of 5 animal species (dogs, rabbits, pigs, cattle and sheep), giving a total of 60 semen samples (70 μL each).

Considering that in actual cases semen stains are often found on carriers such as clothes and paper towels, and that the time between the incident and sample recovery is uncertain, two parameters were set in the invention: the carrier (glass slide, paper towel, or underwear whose fabric is 5% spandex and 95% regenerated cellulose fiber) and the formation time (10 min, 8 days, 16 days, 21 days and 30 days). Each semen stain sample was prepared from 10 μL of a semen sample. The preparation process is as follows: the semen is dripped onto a carrier, and the carrier is then kept indoors (temperature 10-30 °C, humidity 25-75%) to simulate as closely as possible the real environment in which semen stains form (semen generally dries naturally into a stain within several minutes).

Second, extraction and separation of semen components from the semen stain samples

After the semen had been left (under the same indoor temperature and humidity conditions) for the preset semen stain formation time (10 min, 8 days, 16 days, 21 days or 30 days), the same volume (10 μL) of physiological saline was dripped onto the semen stain. Once the physiological saline had mixed thoroughly with the semen stain, the sample was handled according to the carrier. For glass slides, the semen-saline mixture (the separated semen sample) was drawn directly into an EP tube with a pipette for measurement. For paper towels and underpants, the part of the carrier bearing the semen stain was cut out along the edge of the stain, placed in a spin column without a filter membrane and centrifuged (5000 rpm for 1 minute) to obtain the eluted separated semen sample for measurement.

Third, acquisition of Fourier transform infrared spectra

Spectroscopic measurements were performed on the separated semen samples using a Nicolet FTIR-5700 Fourier transform infrared spectrometer from Thermo, as follows:

1. A background spectrum (air) is collected before each sample measurement, and the attenuated total reflection (ATR) probe is wiped clean with absolute ethanol before each background spectrum is collected.

2. The Thermo OMNIC software is started, and the background and the sample are scanned with the following settings: wavenumber range 4000-900 cm⁻¹; 32 scans; resolution 4 cm⁻¹; spectra displayed in absorbance mode.

3. Each time, 1 μL of sample is taken so that it fully covers the ATR probe, and scanning starts after the sample has dried naturally. Each sample is measured 3 times (that is, three 1 μL aliquots are taken from the same sample), and 3 spectra are acquired in parallel in each measurement, so that 9 spectra are obtained per sample. Before the data processing described below, the average of these 9 spectra (averaged at each wavenumber point) is used as the Fourier transform infrared spectrum of the corresponding semen stain sample. The spectral data in the range 1800-900 cm⁻¹ are extracted as the fingerprint region.
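The averaging and cropping described in step 3 can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the OMNIC workflow used in the example: the function name `average_and_crop`, the 2 cm⁻¹ wavenumber grid and the synthetic replicate spectra are all hypothetical stand-ins for exported instrument data.

```python
import numpy as np

def average_and_crop(wavenumbers, replicate_spectra, low=900.0, high=1800.0):
    """Average replicate spectra point by point, then keep the fingerprint region."""
    wavenumbers = np.asarray(wavenumbers, dtype=float)
    spectra = np.asarray(replicate_spectra, dtype=float)  # shape: (n_replicates, n_points)
    mean_spectrum = spectra.mean(axis=0)                  # average at each wavenumber point
    mask = (wavenumbers >= low) & (wavenumbers <= high)   # 1800-900 cm^-1 window
    return wavenumbers[mask], mean_spectrum[mask]

# Synthetic stand-in for the 9 spectra of one sample (not measured data).
rng = np.random.default_rng(0)
wavenumbers = np.arange(900.0, 4001.0, 2.0)                    # hypothetical 2 cm^-1 grid
true_spectrum = np.exp(-((wavenumbers - 1650.0) / 40.0) ** 2)  # one band near amide I
replicates = true_spectrum + 0.01 * rng.standard_normal((9, wavenumbers.size))

fp_wavenumbers, fp_absorbance = average_and_crop(wavenumbers, replicates)
print(fp_wavenumbers.min(), fp_wavenumbers.max(), fp_absorbance.shape)
```

Averaging before cropping or cropping before averaging gives the same result, because both operations act independently at each wavenumber point.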

Fourth, preprocessing of the spectral data

The selected fingerprint-region spectra are first preprocessed by mean centering: the average spectrum of the training set is calculated and then subtracted from the original spectrum of each sample (that is, the Fourier transform infrared spectrum of the semen stain sample obtained in the third step), giving mean-centered spectral data. The mean-centered fingerprint-region data are then preprocessed by standard normal variate (SNV) transformation to obtain standard normal spectral data.
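The two preprocessing steps can be written out as follows. This is a sketch under assumptions rather than the PLS Toolbox implementation: the function names are hypothetical, the input arrays are random placeholders, and SNV is taken in its usual form (each spectrum has its own mean subtracted and is divided by its own standard deviation).

```python
import numpy as np

def snv(spectra):
    """Standard normal variate: scale each spectrum (row) to zero mean, unit std."""
    spectra = np.asarray(spectra, dtype=float)
    row_mean = spectra.mean(axis=1, keepdims=True)
    row_std = spectra.std(axis=1, ddof=1, keepdims=True)
    return (spectra - row_mean) / row_std

def preprocess(spectra, train_mean):
    """Subtract the training-set mean spectrum, then apply SNV row by row."""
    centered = np.asarray(spectra, dtype=float) - train_mean
    return centered, snv(centered)

# Placeholder data: 6 training spectra and 2 new spectra with 50 points each.
rng = np.random.default_rng(1)
train_spectra = rng.random((6, 50))
new_spectra = rng.random((2, 50))

train_mean = train_spectra.mean(axis=0)  # computed from the training set only
train_centered, train_snv = preprocess(train_spectra, train_mean)
new_centered, new_snv = preprocess(new_spectra, train_mean)
```

Validation or casework spectra reuse the training-set mean here, so that new samples are processed the same way as the samples the model was built on.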

Fifth, cluster analysis of the spectral data

The preprocessed spectral data of the training set samples were imported into MATLAB, and principal component analysis was performed with PLS Toolbox 8.6. The regions contributing most were found to lie at the carbohydrate and protein absorption peaks. The spectra of pig, cattle and sheep semen stains indicate a lower protein content (for example amide I, amide II and methylated protein) than in human semen stains but a far higher carbohydrate content. The spectra of dog and rabbit semen stains indicate that their main difference from human semen stains lies in the proteins (FIG. 1). Principal component analysis was further performed on the infrared spectrum data of semen stain samples on different carriers and with different formation times, and it was found that neither the carrier type nor the formation time (within 30 days) affects the species clustering trend of the semen stains (FIGS. 2 and 3).
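To make the clustering step concrete, the sketch below computes PCA scores by singular value decomposition of the column-centered data. It is an illustration only, not the PLS Toolbox routine used in the example: `pca_scores` is a hypothetical helper, and the two synthetic groups (one protein-dominated, one carbohydrate-dominated, with arbitrarily chosen band positions and heights) are invented to show how groups separate in a score plot.

```python
import numpy as np

def pca_scores(X, n_components=2):
    """PCA via SVD: returns scores, loadings and explained-variance ratios."""
    X = np.asarray(X, dtype=float)
    Xc = X - X.mean(axis=0)                           # column-wise mean centering
    U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
    scores = U[:, :n_components] * S[:n_components]   # sample coordinates on the PCs
    loadings = Vt[:n_components]                      # contribution of each wavenumber
    explained = (S ** 2) / np.sum(S ** 2)
    return scores, loadings, explained[:n_components]

# Synthetic fingerprint-region spectra (assumed shapes, not measured data).
rng = np.random.default_rng(2)
wn = np.linspace(900.0, 1800.0, 100)
protein_band = np.exp(-((wn - 1650.0) / 30.0) ** 2)   # illustrative amide I position
carb_band = np.exp(-((wn - 1050.0) / 30.0) ** 2)      # illustrative carbohydrate position
group_a = 1.0 * protein_band + 0.3 * carb_band + 0.02 * rng.standard_normal((10, wn.size))
group_b = 0.5 * protein_band + 1.0 * carb_band + 0.02 * rng.standard_normal((10, wn.size))

scores, loadings, explained = pca_scores(np.vstack([group_a, group_b]))
print(scores[:, 0].round(2), explained.round(3))
```

Plotting `scores[:, 0]` against `scores[:, 1]` gives a score plot of the kind described for FIGS. 2 and 3, and the largest absolute values in `loadings[0]` mark the wavenumbers that contribute most to the separation.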

Sixth, model building and validation

Finally, 420 semen stain samples were prepared from the above 60 semen samples: the 336 semen stain samples of the training set were prepared from 8 semen samples of each of human and the 5 animals (see Table 1), and the 84 semen stain samples of the validation set were prepared from the remaining semen samples. The preprocessed spectral data of the training set samples (specifically, the absorbance values at all equally spaced wavenumber points in the fingerprint region) were imported into MATLAB, and a partial least squares discriminant analysis model was built from the training set samples.
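The sample counts follow from the volumes given earlier (70 μL per semen sample, 10 μL per stain) and can be checked with a short calculation. The variable names are hypothetical, and the figure of 7 stains per semen sample is inferred from those volumes rather than stated explicitly in the text.

```python
species = ["human", "dog", "rabbit", "pig", "cattle", "sheep"]
donors_per_species = 10            # 10 semen samples per species
semen_volume_ul = 70               # volume of each semen sample
stain_volume_ul = 10               # volume used per semen stain

# Inferred: each semen sample yields 70 / 10 = 7 stains.
stains_per_semen_sample = semen_volume_ul // stain_volume_ul

total_stains = len(species) * donors_per_species * stains_per_semen_sample
training_stains = len(species) * 8 * stains_per_semen_sample            # 8 donors per species
validation_stains = len(species) * (donors_per_species - 8) * stains_per_semen_sample

print(total_stains, training_stains, validation_stains)
```

Splitting by donor in this way means that all stains made from one semen sample fall entirely in the training set or entirely in the validation set, so the validation stains come from individuals the model has not seen.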

TABLE 1 training set grouping

During model building, each classification model is optimized independently. The optimization depends mainly on the number of latent variables (LVs): for each classification model, 20-fold cross-validation (CV) was used to select the optimal number of latent variables, 5, from candidates 1 to 20. Partial least squares discriminant analysis (PLS-DA) was then used to build the optimal PLS-DA model for identifying the species origin of semen stains.
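The selection of the number of latent variables can be illustrated with a small PLS-DA sketch. This is not the PLS Toolbox model from the example: it assumes a single-response PLS (NIPALS PLS1) fitted to a 0/1 class label (1 = human), a fixed decision threshold of 0.5, candidates of 1 to 10 latent variables, and synthetic two-class data; all function names are hypothetical.

```python
import numpy as np

def pls1_fit(X, y, n_lv):
    """NIPALS PLS1 regression; returns (x_mean, y_mean, regression coefficients)."""
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xr, yr = X - x_mean, y - y_mean
    W, P, q = [], [], []
    for _ in range(n_lv):
        w = Xr.T @ yr
        w = w / np.linalg.norm(w)          # weight vector
        t = Xr @ w                         # scores
        tt = t @ t
        p = Xr.T @ t / tt                  # X loadings
        c = (yr @ t) / tt                  # y loading
        Xr = Xr - np.outer(t, p)           # deflate X
        yr = yr - c * t                    # deflate y
        W.append(w); P.append(p); q.append(c)
    W, P, q = np.array(W).T, np.array(P).T, np.array(q)
    coef = W @ np.linalg.solve(P.T @ W, q)  # B = W (P^T W)^-1 q
    return x_mean, y_mean, coef

def pls1_predict(model, X):
    x_mean, y_mean, coef = model
    return (X - x_mean) @ coef + y_mean

# Synthetic two-class data (1 = human, 0 = non-human); not measured spectra.
rng = np.random.default_rng(3)
axis = np.linspace(0.0, 1.0, 50)
band = np.exp(-((axis - 0.5) / 0.1) ** 2)

def make_set(n_per_class):
    human = 1.0 * band + 0.05 * rng.standard_normal((n_per_class, axis.size))
    animal = 0.5 * band + 0.05 * rng.standard_normal((n_per_class, axis.size))
    return np.vstack([human, animal]), np.concatenate([np.ones(n_per_class), np.zeros(n_per_class)])

X_train, y_train = make_set(40)
X_val, y_val = make_set(10)

# 20-fold cross-validation over the candidate numbers of latent variables.
folds = np.array_split(rng.permutation(len(y_train)), 20)
cv_errors = []
for n_lv in range(1, 11):
    wrong = 0
    for test_idx in folds:
        train_idx = np.setdiff1d(np.arange(len(y_train)), test_idx)
        model = pls1_fit(X_train[train_idx], y_train[train_idx], n_lv)
        pred = pls1_predict(model, X_train[test_idx]) >= 0.5   # assumed threshold
        wrong += int(np.sum(pred != (y_train[test_idx] == 1)))
    cv_errors.append(wrong / len(y_train))

best_lv = int(np.argmin(cv_errors)) + 1     # ties resolve to the fewest LVs
final_model = pls1_fit(X_train, y_train, best_lv)
val_pred = pls1_predict(final_model, X_val) >= 0.5
val_accuracy = float(np.mean(val_pred == (y_val == 1)))
print(best_lv, val_accuracy)
```

Choosing the smallest number of latent variables among those with the lowest cross-validation error is one common way to limit overfitting; the number selected on this toy data has no bearing on the value of 5 reported for the real spectra.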

The validation set data were imported into the PLS-DA model to predict the species of the semen stains, and the predictions were compared with the actual species of the samples. The model judged whether a semen stain was human with an accuracy of 100% (FIG. 4), which verifies that the PLS-DA model is robust and accurate.
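Comparing predictions with the true species reduces to counting agreements. The sketch below shows one way to compute accuracy together with sensitivity, specificity and precision for the human/non-human decision. The helper `binary_metrics` is hypothetical, and the label lists are placeholders sized like the validation set (14 human and 70 animal stains, assuming 2 donors per species with 7 stains each); they are not the patent's data.

```python
def binary_metrics(true_is_human, predicted_is_human):
    """Accuracy, sensitivity, specificity and precision for a human/non-human call."""
    pairs = list(zip(true_is_human, predicted_is_human))
    tp = sum(1 for t, p in pairs if t and p)
    tn = sum(1 for t, p in pairs if not t and not p)
    fp = sum(1 for t, p in pairs if not t and p)
    fn = sum(1 for t, p in pairs if t and not p)

    def ratio(num, den):
        return num / den if den else float("nan")  # undefined when the class is absent

    return {
        "accuracy": ratio(tp + tn, len(pairs)),
        "sensitivity": ratio(tp, tp + fn),   # human stains called human
        "specificity": ratio(tn, tn + fp),   # animal stains called non-human
        "precision": ratio(tp, tp + fp),     # calls of "human" that are correct
    }

# Placeholder labels: every prediction correct, matching the reported outcome.
truth = [True] * 14 + [False] * 70
predictions = list(truth)
metrics = binary_metrics(truth, predictions)
print(metrics)
```

Reporting sensitivity and specificity alongside accuracy is useful when the classes are unbalanced, as they would be with one human class against five animal species.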

In summary, the PLS-DA model established by the invention for identifying the species origin of semen stains by combining Fourier transform infrared spectroscopy with chemometric analysis requires only a small amount of semen stain material, involves no destructive pretreatment of the semen stain sample, makes sample preparation and data processing simple, and identifies the species quickly and accurately.
