High-throughput screening method for non-target biomarkers based on pollutant metabolic disturbance

Document No.: 1294921    Published: 2020-08-07

Reading note: This technology, "High-throughput screening method for non-target biomarkers based on pollutant metabolic disturbance", was created by 韦斯, 李昱茜, 于红霞 and 于南洋 on 2020-04-26. Abstract: The invention discloses a high-throughput screening method for non-target biomarkers based on pollutant metabolic disturbance, belonging to the field of environmental exposure and health. The method comprises the following steps: (1) extracting the sample to obtain an extract to be tested; (2) performing chromatographic analysis to obtain a spectrum containing chromatographic peaks; (3) identifying and labeling the pollutant characteristic peaks, taking the chromatographic peaks other than the pollutant characteristic peaks as potential metabolite characteristic peaks, and labeling the potential metabolite characteristic peaks in a non-target manner; (4) establishing a linear regression model with the peak area of each potential metabolite characteristic peak as the dependent variable and the peak area of the pollutant characteristic peak as the independent variable; (5) running the model to screen biomarkers in a non-target manner, giving a preliminary set of related biomarkers; (6) identifying the primary (MS1) and secondary (MS2) mass spectra of the preliminarily obtained biomarkers, so as to identify the biomarkers related to pollutant exposure. The method markedly improves the accuracy of biomarker screening and also raises its throughput.

1. A high-throughput screening method for non-target biomarkers based on metabolic disturbance of pollutants, comprising the following steps:

(1) Sample extraction: the sample is processed and the pollutants and metabolites in the biological sample are extracted to obtain an extract to be tested;

(2) Chromatographic analysis: the extract to be tested is analyzed in full-scan mode on a high performance liquid chromatography-time-of-flight mass spectrometer to obtain a spectrum containing chromatographic peaks;

(3) Pollutant labeling and non-target labeling of potential metabolites: pollutant characteristic peaks are identified and labeled in the spectrum, the chromatographic peaks other than the pollutant characteristic peaks are taken as potential metabolite characteristic peaks, and the potential metabolite characteristic peaks are labeled in a non-target manner;

(4) Model building: a linear regression model is established with the peak area of each potential metabolite characteristic peak as the dependent variable and the peak area of the pollutant characteristic peak as the independent variable;

(5) Non-target screening of biomarkers: the model is run to screen biomarkers in a non-target manner, giving a preliminary set of related biomarkers;

(6) Biomarker identification: the primary (MS1) and secondary (MS2) mass spectra of the biomarkers obtained in step (5) are identified, so as to identify the biomarkers related to pollutant exposure.

2. The high-throughput screening method for non-target biomarkers based on pollutant metabolic disturbance according to claim 1, wherein the method further comprises step (7): correcting the model using a correction method, running the corrected model, and repeating steps (5) to (6).

3. The high-throughput screening method for non-target biomarkers based on pollutant metabolic disturbance according to claim 1 or 2, wherein the method further comprises a metabolic pathway enrichment step, in which the identified biomarkers are enriched into metabolic pathways to obtain the metabolic pathways disturbed by the pollutants.

4. The high-throughput screening method for non-target biomarkers based on pollutant metabolic disturbance according to claim 2, wherein the correction method comprises false discovery rate correction and interference factor correction; in the false discovery rate correction, the threshold p < 0.05 is replaced by FDR < 20%; in the interference factor correction, the interference factors present in the sample are corrected for by adding them to the model as covariates.

5. The high-throughput screening method for non-target biomarkers based on pollutant metabolic disturbance according to claim 2 or 4, wherein, when the extract to be tested contains a plurality of pollutants, the correction method comprises a joint exposure correction: the model is corrected by multivariate stepwise regression with the plurality of pollutants as potential independent variables.

6. The high-throughput screening method for non-target biomarkers based on pollutant metabolic disturbance according to claim 5, wherein, in the identification of the pollutant characteristic peaks, the spectrum is converted into a WIFF file, and the WIFF file is imported into PeakView software for peak extraction, alignment and analysis, so as to identify the pollutants.

7. The high-throughput screening method for non-target biomarkers based on pollutant metabolic disturbance according to claim 6, wherein, in the non-target labeling of the potential metabolite characteristic peaks, the spectrum is converted into an ABF file, the ABF file is imported into MS-DIAL software for peak extraction and alignment, and characteristic peaks with a detection rate of more than 80% are retained as the potential metabolite characteristic peaks.

8. The high-throughput screening method for non-target biomarkers based on pollutant metabolic disturbance according to claim 5, wherein, in step (4), a model whose significance after running is p < 0.05 is taken as a valid model and used in step (5).

9. The high-throughput screening method for non-target biomarkers based on pollutant metabolic disturbance according to claim 8, wherein, in step (6), the biomarkers are identified using MS-DIAL software in combination with the MetDNA platform.

10. The high-throughput screening method for non-target biomarkers based on pollutant metabolic disturbance according to claim 9, wherein, in step (2), the detection conditions are as follows:

high performance liquid chromatograph: Infinity 1260;

chromatographic column: C18 column, 2.1 mm × 50 mm, 2.5 μm;

column temperature: 40 ℃;

flow rate: 0.4 mL/min;

mobile phase: phase A in positive ion mode: 0.1% formic acid in water; phase A in negative ion mode: 2 mM ammonium acetate in water; phase B: methanol;

the gradient elution conditions were as follows:

mass spectrometer: TripleTOF 4600;

full scan mode: data-dependent acquisition mode;

ion source: electrospray ionization source, positive and negative modes;

full scan mass range: primary (MS1) 50-1250 Da, secondary (MS2) 30-1000 Da;

collision energy: ±40 eV;

collision energy spread: 20 eV;

ion source temperature: 550 ℃.

Technical Field

The invention belongs to the field of environmental exposure and health, and particularly relates to a high-throughput screening method for non-target biomarkers based on pollutant metabolic disturbance.

Background

With the development of science and technology, living standards keep rising and pollutants are generated in ever greater quantities; these substances are released into the environment and accumulate there. Organisms are exposed to various environmental media, and pollutants in the environment can enter them through routes such as skin contact, respiration and diet, so that external exposure is converted into internal exposure. Exogenous environmental pollutants that enter an organism pose a potential threat to it, and there is a large body of evidence that exposure to specific chemicals may lead to disease.

Biomarkers are signal indicators of abnormal changes that can be observed before an organism suffers severe toxic damage. The omics technologies currently used for biomarkers include genomics, proteomics and metabolomics. Metabolomics studies small-molecule metabolites qualitatively and quantitatively and is considered the omics approach closest to the phenotype. Metabolomics has developed rapidly in recent years, but related research still lags behind genomics and proteomics. Studying the metabolome disturbance caused by environmental pollutants can fill this gap, making it possible to predict toxicity and supporting the scientific control of pollutants.

Current metabolomics studies have two problems. On the one hand, screening throughput is insufficient: most studies are targeted, so biomarkers outside the scope of the study are easily overlooked, which affects screening accuracy. On the other hand, because both environmental pollutants and biological metabolites are complex, association studies in metabolomics are difficult and the statistical tools need to be optimized.

In view of these shortcomings of the prior art, a high-throughput, high-accuracy screening method for metabolome biomarkers is needed.

Disclosure of Invention

1. Problems to be solved

To address the insufficient screening throughput and low accuracy of metabolomics research in the prior art, the screening method of the invention first screens potential metabolites in a non-target manner and then narrows the range step by step to a small number of biomarkers. Screening is therefore more comprehensive, biomarkers can be screened and identified accurately and at high throughput, and a scientific basis is provided for toxicity prediction and for the risk assessment and control of pollutants.

2. Technical scheme

In order to solve the problems, the technical scheme adopted by the invention is as follows:

The invention provides a high-throughput screening method for non-target biomarkers based on pollutant metabolic disturbance, which comprises the following steps:

(1) Sample extraction: the sample is processed and the pollutants and metabolites in the biological sample are extracted to obtain an extract to be tested;

(2) Chromatographic analysis: the extract to be tested is analyzed in full-scan mode on a high performance liquid chromatography-time-of-flight mass spectrometer to obtain a spectrum containing chromatographic peaks;

(3) Pollutant labeling and non-target labeling of potential metabolites: pollutant characteristic peaks are identified and labeled in the spectrum, the chromatographic peaks other than the pollutant characteristic peaks are taken as potential metabolite characteristic peaks, and the potential metabolite characteristic peaks are labeled in a non-target manner;

(4) Model building: a linear regression model is established with the peak area of each potential metabolite characteristic peak as the dependent variable and the peak area of the pollutant characteristic peak as the independent variable;

(5) Non-target screening of biomarkers: the model is run to screen biomarkers in a non-target manner, giving a preliminary set of related biomarkers;

(6) Biomarker identification: the primary (MS1) and secondary (MS2) mass spectra of the biomarkers obtained in step (5) are identified, so as to identify the biomarkers related to pollutant exposure.

Preferably, the method further comprises step (7): correcting the model using a correction method, running the corrected model, and repeating steps (5) to (6).

Preferably, the method further comprises a metabolic pathway enrichment step, in which the identified biomarkers are enriched into metabolic pathways to obtain the metabolic pathways disturbed by the pollutants.

Preferably, the correction method comprises false discovery rate correction and interference factor correction. In the false discovery rate correction, the threshold p < 0.05 is replaced by FDR < 20%; in the interference factor correction, the interference factors present in the sample are corrected for by adding them to the model as covariates.
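The false discovery rate threshold can be illustrated with a minimal Python sketch. The method itself applies this correction with the qvalue command in R; the sketch below instead uses the simpler Benjamini-Hochberg adjustment as a stand-in, so its adjusted values can differ from the q-values R would report. The function names and the example p-values are hypothetical.

```python
def bh_adjust(pvalues):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so the adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def select_by_fdr(pvalues_by_feature, fdr=0.20):
    """Return the feature IDs whose adjusted p-value is below the FDR threshold."""
    ids = list(pvalues_by_feature)
    adjusted = bh_adjust([pvalues_by_feature[f] for f in ids])
    return [f for f, q in zip(ids, adjusted) if q < fdr]


# Hypothetical regression p-values for five candidate metabolite peaks.
example_p = {"M1": 0.001, "M2": 0.03, "M3": 0.04, "M4": 0.30, "M5": 0.80}
selected = select_by_fdr(example_p)
print(selected)
```

With several thousand candidate peaks, a raw p < 0.05 cut-off alone would let through many chance associations, which is why the threshold is expressed as a false discovery rate instead.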

Preferably, when the extract to be tested contains a plurality of pollutants, the correction method comprises a joint exposure correction: the model is corrected by multivariate stepwise regression with the plurality of pollutants as potential independent variables.

Preferably, in the identification of the pollutant characteristic peaks, the spectrum is converted into a WIFF file, and the WIFF file is imported into PeakView software for peak extraction, alignment and analysis, so as to identify the pollutants.

Preferably, in the non-target labeling of the potential metabolite characteristic peaks, the spectrum is converted into an ABF file, the ABF file is imported into MS-DIAL software for peak extraction and alignment, and characteristic peaks with a detection rate of more than 80% are retained as the potential metabolite characteristic peaks.

Preferably, in step (4), a model whose significance after running is p < 0.05 is taken as a valid model and used in step (5).

Preferably, in step (6), the biomarkers are identified using MS-DIAL software in combination with the MetDNA platform.

Preferably, step (4) is followed by a prediction of metabolic disturbance capacity. The structure of each identified pollutant molecule is optimized in SYBYL software using the Tripos force field, Gasteiger-Hückel charges and the Powell gradient method until the termination gradient falls below 0.001 kcal/(mol·Å). In addition, the protein structure is obtained from the Protein Data Bank (http://www.rcsb.org), the optimized pollutant ligand is docked into the protein pocket in SYBYL software, and the optimal conformation is selected as the docking result. The larger the total score of the docking result, the stronger the binding and the less pollutant remains in the free state; conversely, the more pollutant remains in the free state, the stronger the transient metabolic disturbance it may cause and the more biomarkers are produced.

Preferably, the method comprises the following specific steps:

(1) Sample extraction: a solid sample (e.g., biological tissue) is homogenized, or a liquid sample (e.g., blood or urine) is taken, and placed in a centrifuge tube. 0.26-0.28 g of a magnesium sulfate-sodium chloride mixture and acetonitrile are added, and the tube is vortexed immediately. The sample is then extracted ultrasonically for 30 min and centrifuged, and the supernatant is transferred. The remaining residue is extracted twice with 95% acetonitrile-water solution, and the extracts are combined. The combined extract is blown to near dryness under nitrogen, transferred to a chromatographic sample vial, and made up to volume with acetonitrile. If a small amount of white solid remains at the bottom at this point, the extract is centrifuged again and the supernatant is transferred to a chromatographic vial.

This pretreatment avoids filtering out metabolites at an early stage, so that biomarkers can subsequently be screened over a more comprehensive range.

(2) Instrumental analysis: the extracted sample is analyzed in full-scan mode on a high performance liquid chromatography-time-of-flight mass spectrometer. The parameters are as follows:

high performance liquid chromatograph: Infinity 1260;

chromatographic column: C18 column (2.1 mm × 50 mm, 2.5 μm);

column temperature: 40 ℃;

flow rate: 0.4 mL/min;

mobile phase: 0.1% formic acid in water (phase A, positive ion mode), 2 mM ammonium acetate in water (phase A, negative ion mode) and methanol (phase B); Table 1 shows the gradient elution conditions.

TABLE 1 gradient elution conditions

mass spectrometer: TripleTOF 4600;

full scan mode: data-dependent acquisition mode;

ion source: electrospray ionization source, positive and negative modes;

full scan mass range: primary (MS1) 50-1250 Da, secondary (MS2) 30-1000 Da;

collision energy: ±40 eV;

collision energy spread: 20 eV;

ion source temperature: 550 ℃.

(3) Labeling and identification of pollutant characteristic peaks: the spectrum obtained from the instrument is converted into a WIFF file and imported into PeakView software for peak extraction, alignment and analysis. Pollutants for which standards are available are identified by comparing retention times and mass spectral fragments with those of the standards. For pollutants without standards, the structure is inferred from the mass spectral fragments using the Formula Finder function.

The parameters are as follows:

peak extraction mass range: 50-1250 Da;

peak extraction mass error: 0.01 Da;

alignment retention time error: 2 min;

alignment mass error: 0.01 Da;

identification mass error: primary (MS1) 0.01 Da, secondary (MS2) 0.005 Da.
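The comparison against standards in step (3) can be illustrated with a minimal Python sketch. It checks only the precursor mass and the retention time; the fragment comparison that the method also uses is omitted. The `Peak` class, the `match_standard` function, the retention-time tolerance `rt_tol` and every numeric value in the `standards` table are hypothetical. Only the 0.01 Da primary mass error is taken from the parameters above.

```python
from dataclasses import dataclass


@dataclass
class Peak:
    mz: float  # measured m/z (Da)
    rt: float  # retention time (min)


def match_standard(peak, standards, mz_tol=0.01, rt_tol=0.2):
    """Return the name of the closest standard within both tolerances, else None."""
    best_name, best_err = None, None
    for name, (std_mz, std_rt) in standards.items():
        mz_err = abs(peak.mz - std_mz)
        if mz_err <= mz_tol and abs(peak.rt - std_rt) <= rt_tol:
            if best_err is None or mz_err < best_err:
                best_name, best_err = name, mz_err
    return best_name


# Illustrative standards table: name -> (m/z, retention time in min).
standards = {"PFOS": (498.930, 6.10), "PFOA": (412.966, 5.40)}

print(match_standard(Peak(mz=498.935, rt=6.15), standards))
print(match_standard(Peak(mz=498.950, rt=6.10), standards))
```

Peaks that match no standard would go on to the formula-based structure inference described above rather than being discarded.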

(4) Non-target labeling of potential metabolites: the spectrum obtained from the instrument is converted into an ABF file and imported into MS-DIAL software for peak extraction and alignment. Chromatographic peaks other than the pollutant characteristic peaks that have a detection rate of more than 80% are taken as potential metabolite characteristic peaks, and the peak area and mass spectrum corresponding to each peak are compiled into a chromatographic peak table. The parameters are as follows:

peak extraction mass range: 30-1250 Da;

peak extraction mass error: 0.01 Da;

alignment retention time error: 0.5 min;

alignment mass error: 0.015 Da.
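The detection-rate filter in step (4) can be sketched in Python as follows. The sketch assumes the aligned peak table has been exported as a mapping from feature ID to one peak area per sample, with a missing detection recorded as `None` or 0. The function names, the table layout and the example values are assumptions and are not part of MS-DIAL.

```python
def detection_rate(areas):
    """Fraction of samples in which the peak was detected (area present and > 0)."""
    detected = sum(1 for a in areas if a is not None and a > 0)
    return detected / len(areas)


def filter_features(peak_table, pollutant_ids, min_rate=0.8):
    """Keep non-pollutant peaks whose detection rate is strictly above min_rate."""
    return {
        fid: areas
        for fid, areas in peak_table.items()
        if fid not in pollutant_ids and detection_rate(areas) > min_rate
    }


# Hypothetical aligned peak table for ten samples (arbitrary area units).
peak_table = {
    "F1": [120, 98, 110, 130, 105, 99, 101, 140, 125, 0],         # detected in 9 of 10
    "F2": [50, 0, 45, 60, None, 52, 48, 55, 51, 47],               # detected in 8 of 10
    "P1": [900, 850, 910, 880, 870, 905, 899, 860, 915, 890],      # pollutant peak
}
candidates = filter_features(peak_table, pollutant_ids={"P1"})
print(sorted(candidates))
```

The strict comparison follows the wording "more than 80%", so a peak detected in exactly 80% of the samples is dropped.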

(5) Non-target screening of biomarkers: a linear regression model is established in SPSS software, with the peak area of each non-target-labeled potential metabolite characteristic peak from step (4) as the dependent variable and the peak area of the pollutant characteristic peak from step (3) as the independent variable. After the model is run, a model with significance p < 0.05 is taken as a valid model, the biomarkers are screened in a non-target manner, and a preliminary set of related biomarkers is obtained;
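The per-metabolite regression in step (5) is run in SPSS in the method itself; the following is only a minimal Python sketch of the same idea using the standard library. It fits one simple linear regression per metabolite peak and keeps the peaks whose slope has p < 0.05. The function names and the example peak areas are hypothetical, and the p-value is obtained by numerically integrating the Student's t density rather than by calling a statistics package.

```python
import math


def t_two_sided_p(t, df, steps=2000):
    """Two-sided Student's t p-value via Simpson's rule (substitution x = sqrt(df) * tan(theta))."""
    upper = math.atan(abs(t) / math.sqrt(df))
    coef = math.exp(math.lgamma((df + 1) / 2) - math.lgamma(df / 2)) / math.sqrt(math.pi)
    h = upper / steps
    total = 1.0 + math.cos(upper) ** (df - 1)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * math.cos(i * h) ** (df - 1)
    return max(0.0, 1.0 - 2.0 * coef * total * h / 3.0)


def simple_regression(x, y):
    """Least-squares fit y = intercept + slope * x; returns (slope, p-value of the slope)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    slope = sxy / sxx
    intercept = my - slope * mx
    sse = sum((b - intercept - slope * a) ** 2 for a, b in zip(x, y))
    if sse == 0.0:  # perfect fit: the slope is significant at any level
        return slope, 0.0
    se = math.sqrt(sse / (n - 2) / sxx)
    return slope, t_two_sided_p(slope / se, n - 2)


def screen_metabolites(pollutant_area, metabolite_table, alpha=0.05):
    """Return {feature_id: (slope, p)} for metabolite peaks associated with the pollutant."""
    hits = {}
    for fid, areas in metabolite_table.items():
        slope, p = simple_regression(pollutant_area, areas)
        if p < alpha:
            hits[fid] = (slope, p)
    return hits


# Hypothetical peak areas for eight samples (arbitrary units).
pollutant = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
metabolites = {
    "M1": [2.1, 3.9, 6.2, 8.1, 9.8, 12.2, 13.9, 16.1],  # rises with the pollutant
    "M2": [5.0, 3.0, 6.0, 4.0, 5.0, 3.0, 6.0, 4.0],      # no trend
}
hits = screen_metabolites(pollutant, metabolites)
print(sorted(hits))
```

Each metabolite peak gets its own model, so the number of tests equals the number of retained peaks; this is what makes the later false discovery rate correction necessary.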

(6) High-throughput identification of biomarkers: the primary (MS1) and secondary (MS2) mass spectra of the related biomarker characteristic peaks obtained by the preliminary screening in step (5) are identified by a programmed, multi-platform procedure.

In MS-DIAL software, MSP files are loaded separately for the positive and negative ion modes and the spectra are compared against the metabolite library. Metabolite characteristic peaks that are not matched are uploaded to the MetDNA platform (http://metdna.zhulab.cn/) for further identification. The identification results are graded according to the confidence levels recommended by the Metabolomics Standards Initiative (MSI): metabolites identified by MS-DIAL are level 2, metabolites identified by MetDNA are level 2, and other metabolites identified by MetDNA are level 3. The MS-DIAL parameters are:

mass error: primary (MS1) 0.01 Da, secondary (MS2) 0.05 Da;

score threshold: 80.

(7) Model correction: the linear regression model is corrected in several ways to reduce false positive results, as follows:

False discovery rate correction: using the qvalue command in R, the threshold p < 0.05 is replaced by FDR < 20%;

Interference factor correction: the interference factors present in the sample are added to the regression model as covariates;

Joint exposure correction: when a plurality of pollutants are studied, multivariate stepwise regression is performed with the pollutants as potential independent variables. Specifically, when the model for a given pollutant is analyzed, the biomarker is taken as the dependent variable and the other pollutants are offered as independent variables in a multivariate regression model using the "stepwise" method; pollutants that are significant are retained as final independent variables, and pollutants that are not significant are removed.

Running the model gives a regression model with ① dependent variable: the biomarker; ② independent variable 1: the specific pollutant under analysis; ③ independent variable 2: those of the other pollutants that are significant. The significance of ② (independent variable 1, the pollutant under analysis) is the corrected p value. Metabolites whose p value is still below 0.05 are taken as the final biomarkers corresponding to the specific pollutant, and the high-throughput identification of step (6) is repeated for them.
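The joint exposure correction can be sketched in Python as follows. This is a simplified, forward-only selection written with the standard library, not the SPSS "stepwise" procedure itself, which also re-tests variables for removal. The pollutant under analysis is always kept in the model, co-pollutants are added while their p-value is below 0.05, and the p-value of the target pollutant in the final model is returned as the corrected value. All function names and data are hypothetical, and the sketch assumes the residual variance is non-zero.

```python
import math


def t_two_sided_p(t, df, steps=2000):
    """Two-sided Student's t p-value via Simpson's rule (substitution x = sqrt(df) * tan(theta))."""
    upper = math.atan(abs(t) / math.sqrt(df))
    coef = math.exp(math.lgamma((df + 1) / 2) - math.lgamma(df / 2)) / math.sqrt(math.pi)
    h = upper / steps
    total = 1.0 + math.cos(upper) ** (df - 1)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * math.cos(i * h) ** (df - 1)
    return max(0.0, 1.0 - 2.0 * coef * total * h / 3.0)


def invert(a):
    """Gauss-Jordan inverse of a small square matrix, with partial pivoting."""
    n = len(a)
    aug = [list(row) + [float(i == j) for j in range(n)] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col:
                f = aug[r][col]
                aug[r] = [v - f * w for v, w in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def ols_pvalues(columns, y):
    """Fit y ~ intercept + columns; return (coefficients, p-values), intercept first."""
    n = len(y)
    X = [[1.0] + [col[i] for col in columns] for i in range(n)]
    k = len(X[0])
    xtx = [[sum(X[r][i] * X[r][j] for r in range(n)) for j in range(k)] for i in range(k)]
    xty = [sum(X[r][i] * y[r] for r in range(n)) for i in range(k)]
    inv = invert(xtx)
    beta = [sum(inv[i][j] * xty[j] for j in range(k)) for i in range(k)]
    resid = [y[r] - sum(b * v for b, v in zip(beta, X[r])) for r in range(n)]
    sigma2 = sum(e * e for e in resid) / (n - k)  # assumed to be non-zero
    pvals = [t_two_sided_p(b / math.sqrt(sigma2 * inv[i][i]), n - k) for i, b in enumerate(beta)]
    return beta, pvals


def joint_exposure_p(target, others, y, alpha=0.05):
    """Corrected p-value of the target pollutant after forward selection of co-pollutants."""
    kept, candidates = [], dict(others)
    while candidates:
        best_name, best_p = None, alpha
        for name, col in candidates.items():
            cols = [target] + [others[k] for k in kept] + [col]
            p = ols_pvalues(cols, y)[1][-1]  # p-value of the candidate just added
            if p < best_p:
                best_name, best_p = name, p
        if best_name is None:
            break
        kept.append(best_name)
        del candidates[best_name]
    pvals = ols_pvalues([target] + [others[k] for k in kept], y)[1]
    return pvals[1], kept


# Hypothetical data: the metabolite follows pollutant B; pollutant A merely co-occurs with B.
A = [2.0, 1.0, 2.0, 5.0, 6.0, 5.0, 6.0, 9.0]
B = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
C = [6.0, 6.0, 4.0, 4.0, 4.0, 4.0, 6.0, 6.0]
metabolite = [2.5, 3.5, 6.5, 7.5, 9.5, 12.5, 13.5, 16.5]

p_uncorrected = ols_pvalues([A], metabolite)[1][1]
p_corrected, kept = joint_exposure_p(A, {"B": B, "C": C}, metabolite)
print(p_uncorrected < 0.05, p_corrected < 0.05, kept)
```

In this constructed example pollutant A looks significant on its own but loses significance once B enters the model, which is the kind of false positive the joint exposure correction is meant to remove.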

(8) Metabolic pathway enrichment of biomarkers: the biomarkers are enriched into metabolic pathways to obtain the metabolic pathways disturbed by the pollutant.
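The text does not state how the enrichment in step (8) is computed. One common choice, shown here purely as an assumption, is over-representation analysis with a hypergeometric test: given N annotated metabolites of which K belong to a pathway, it asks how likely it is that at least k of the n identified biomarkers fall in that pathway by chance. The function name and the counts in the example are hypothetical.

```python
from math import comb


def hypergeom_sf(k, N, K, n):
    """P(X >= k) when n metabolites are drawn from N, of which K belong to the pathway."""
    upper = min(K, n)
    favourable = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return favourable / comb(N, n)


# Hypothetical counts: 20 annotated metabolites, 5 in the pathway,
# 4 identified biomarkers, 3 of which map to the pathway.
p_enrich = hypergeom_sf(k=3, N=20, K=5, n=4)
print(round(p_enrich, 4))
```

A small p-value suggests the pathway contains more biomarkers than expected by chance; when many pathways are tested, the same false discovery rate control as in step (7) would apply.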

3. Advantageous effects

Compared with the prior art, the invention has the beneficial effects that:

(1) In the high-throughput screening method for non-target biomarkers based on pollutant metabolic disturbance of the invention, pollutant characteristic peaks are first identified in the chromatogram obtained by detection, and the other chromatographic peaks with a high detection rate are labeled in a non-target manner as potential metabolites. A linear regression model between the pollutants and the potential metabolites then gives a preliminary screening result for biomarkers related to pollutant exposure. Structural identification of the preliminarily screened biomarkers yields a smaller set of biomarkers that are more accurate and more strongly related to pollutant exposure, and enriching these into metabolic pathways gives the pathways disturbed by the pollutants. Because the range is narrowed step by step, the final screening result is more accurate, so the method is suitable for high-throughput screening while also achieving higher accuracy.

(2) The method first screens potential metabolites in a non-target manner and then narrows the range step by step to a small number of biomarkers, which effectively overcomes the drawback of targeted screening, namely that complete identification of the biomarkers cannot be guaranteed.

(3) The method identifies the screened biomarkers using MS-DIAL software in combination with the MetDNA platform, which improves identification efficiency and increases the throughput of biomarker identification.

(4) The invention corrects the statistical model by joint exposure correction. Under combined exposure to several pollutants, this removes the interference of the other pollutants so that the biomarkers of a specific pollutant can be screened accurately; the model results become more accurate and the actual exposure situation can be analyzed more realistically. This reduces false positives and improves the accuracy of the results.

Drawings

FIG. 1 is a flow chart of the biomarker non-target screening method of example 1;

FIG. 2 is a chromatogram obtained from the analysis of the sample in example 1;

FIG. 3 is a primary mass spectrum obtained by analysis of the sample in example 1;

FIG. 4 is a secondary mass spectrum obtained by analysis of the sample of example 1;

FIG. 5 shows the results of the docking of perfluorooctanesulfonic acid with human serum albumin in example 2.

Detailed Description

The invention is further described with reference to specific examples.
