Method for obtaining stable nucleosome genetic marker and application thereof in degradation DNA detection

Document No.: 1871904. Publication date: 2021-11-23.

Reading note: This technology, "Method for obtaining stable nucleosome genetic marker and application thereof in degradation DNA detection", was designed by Dong Chunnan, Cong Bin, Wang Lin, Li Shujin, and Ma Chunling on 2021-08-26. Main content: The invention provides a method for obtaining stable nucleosome genetic markers and its application in testing degraded DNA, comprising the following steps: (a) pretreating the sample to be tested to obtain a DNA-containing sample; (b) dividing the sample into groups and digesting each group with micrococcal nuclease at a high, medium, or low dose level; (c) extracting DNA from the digested samples with a kit to obtain nucleosome core-region DNA; (d) sequencing the nucleosome core-region DNA and comparing the data of two samples at the same micrococcal nuclease digestion level with the bioinformatics tool DANPOS to screen out stable nucleosome positioning information; and (f) obtaining stable nucleosome genetic markers. The invention provides a reliable new method for DNA typing of highly degraded samples.

1. A method for obtaining a stable nucleosome genetic marker comprising the steps of:

(a) pretreating a sample to be detected to obtain a sample containing DNA;

(b) grouping the samples and digesting the corresponding samples with micrococcal nuclease at high, medium, and low dose levels, respectively, the digestion condition being incubation at 37 °C for 2-4 hours; wherein the high dose level is micrococcal nuclease : naked DNA = 110-130 U : 17 μg, the medium dose level is micrococcal nuclease : naked DNA = 25-40 U : 17 μg, and the low dose level is micrococcal nuclease : naked DNA = 10-20 U : 17 μg;

(c) extracting DNA in a digested sample by using a kit to obtain nucleosome core region DNA;

(d) sequencing the obtained nucleosome core region DNA, importing the sequencing data into bwa software, and aligning the reads to the human reference genome GRCh38; comparing the data of two samples at the same micrococcal nuclease digestion level using the bioinformatics tool DANPOS, and calculating the nucleosome difference signal at single-nucleotide resolution based on the Poisson distribution, thereby obtaining the variation in displacement, fuzziness (ambiguity), and occupancy level of each nucleosome at the same genomic position;

(e) obtaining nucleosome information with stable translational positioning according to set screening conditions;

(f) annotating the SNPs and InDels of the nucleosome sequences using ANNOVAR software, and identifying STRs using lobSTR software.

2. The method of claim 1, wherein in step (a), the sample is peripheral blood of different persons, and the DNA-containing sample is leukocytes obtained by lysing the peripheral blood.

3. The method of claim 1, wherein in step (b), a CaCl2 solution is added to the digestion mixture, the concentration of the CaCl2 solution is 80-120 mM, and the ratio of the volume of CaCl2 solution added to the amount of naked DNA is 230-270 μL : 17 μg.

4. The method of claim 1, wherein in step (b), naked DNA digestion is used as a control for each sample: DNA is first extracted from the sample and then digested with micrococcal nuclease, the digestion condition being incubation at 37 °C for 2-4 hours, with micrococcal nuclease : naked DNA = 0.45 U : 1 μg.

5. The method of claim 4, wherein in the step (d), the data of the medium and high dose digestion levels are corrected by DANPOS using the information obtained from the naked DNA digestion control group as a background.

6. The method of claim 1, wherein in step (d), the reads obtained by sequencing are aligned to the human reference genome GRCh38 using bwa software, the raw reads are finely filtered to obtain clean reads, and the BED files generated from all samples are run through and compared by the DANPOS algorithm.

7. The method of obtaining stable nucleosome localization information according to claim 1, wherein in step (d), the sequencing method and the data analysis method of each sample are consistent.

8. The method for obtaining stable nucleosome positioning information according to claim 1, wherein in the step (e), the stable nucleosome screening conditions are as follows:

the nucleosome center displacement satisfies: treat2control_dis = 0;

the fuzziness satisfies: fuzziness_diff_log10pval ≥ -10, and |fuzziness_log2FC| ≤ 1;

the occupancy level satisfies: control_smt_val > 0, and treat_smt_val > 0.

9. Use of a stable nucleosome genetic marker obtained by the method of any one of claims 1 to 8 in a test for degraded DNA.

10. An assay method for degraded DNA comprising the steps of any one of claims 1 to 8.

Technical Field

The invention relates to the technical field of forensic identification, in particular to a method for obtaining a stable nucleosome genetic marker and application thereof in degradation DNA detection.

Background

Degraded biological samples of many kinds are often encountered in forensic casework. After cell death, biological material is exposed simultaneously to endogenous factors (such as breakdown by endogenous nucleases, various hydrolases, and digestive enzymes secreted by microorganisms) and exogenous factors (such as high temperature, humidity, soil microorganisms, sun exposure, strong acid, strong alkali, and ultraviolet radiation), so that the DNA is degraded. Under these conditions, PCR amplification failure or erroneous amplification often occurs in subsequent experiments, and STR typing shows ladder-like bands, stutter peaks, unbalanced amplification, and allele dropout, which greatly reduces the success rate of DNA typing and makes case investigation difficult. The genetic markers currently used for analyzing degraded DNA, such as miniSTRs and SNPs, still suffer from allele dropout and similar problems and cannot resolve all of these difficulties. Forensic casework therefore still lacks a method that solves this problem.

Nucleosomes are the most basic structural unit of eukaryotic chromatin and consist of DNA and histones. The DNA of each nucleosome is about 200 bp long, of which the core DNA wound around the histone octamer is about 146 bp and the linker DNA between two histone octamers is about 20-50 bp. It has been found that nucleosomal DNA sequences can escape enzymatic cleavage during apoptosis or programmed cell death. Another study sequenced the genome from the hair shaft of an Inuit individual from about 4000 years ago and recovered a nucleosome map, which was attributed to the protective effect of nucleosomes on DNA. These findings suggest that nucleosomes protect the core DNA sequence. Forensic researchers have explored whether this protective effect can be applied to degraded forensic biological samples, but the results are controversial. Freire-Aradas et al. screened 18 SNPs in nucleosome core regions using the bioinformatics software RECON and built a multiplex amplification system; on degraded samples, the typing success rate of this system was 6% higher than that of SNPforID and markedly higher than that of a miniSTR system. Phuvadol Thanakiatkrai et al. scored the nucleosome positioning of 60 STR loci using the bioinformatics software NXSensor and nuScore; the results indicated that these STR loci are protected from degradation by nucleosomes to some extent, and the authors recommended selecting as many protected loci as possible when choosing genetic markers. However, subsequent experiments by the same team showed no difference in detection rate between STR loci with different scores.

Based on the above, we believe that the conflicting results may stem from nucleosome dynamics and from differences in the nucleosome positioning methods used. The teams abroad located nucleosome core regions with prediction software. Such an approach, which classifies and predicts positions as a whole, does not fundamentally solve the problem of nucleosome positioning. At present, micrococcal nuclease digestion combined with high-throughput sequencing (MNase-seq) is the most commonly used nucleosome positioning method and can produce high-resolution maps. However, the method is easily affected by experimental conditions and by nucleosome dynamics, so accurate nucleosome positions cannot be obtained, which in turn affects the selection of genetic markers. It is therefore important to develop a method that yields high-precision nucleosome positioning for the analysis of degraded DNA.

Disclosure of Invention

The invention aims to provide a method for obtaining a stable nucleosome genetic marker and application thereof in degradation DNA detection, so as to solve the problem that the prior art cannot obtain accurate nucleosome positioning and influences the obtaining of the genetic marker.

The purpose of the invention is realized as follows: a method of obtaining a stable nucleosome genetic marker comprising the steps of:

(a) pretreating a sample to be detected to obtain a sample containing DNA;

(b) grouping the samples and digesting the corresponding samples with micrococcal nuclease at high, medium, and low dose levels, respectively, the digestion condition being incubation at 37 °C for 2-4 hours; wherein the high dose level is micrococcal nuclease : naked DNA = 110-130 U : 17 μg, the medium dose level is micrococcal nuclease : naked DNA = 25-40 U : 17 μg, and the low dose level is micrococcal nuclease : naked DNA = 10-20 U : 17 μg;

(c) extracting DNA in a digested sample by using a kit to obtain nucleosome core region DNA;

(d) sequencing the obtained nucleosome core region DNA, importing the sequencing data into bwa software, and aligning the reads to the human reference genome GRCh38; comparing the data of two samples at the same micrococcal nuclease digestion level using the bioinformatics tool DANPOS, and calculating the nucleosome difference signal at single-nucleotide resolution based on the Poisson distribution, thereby obtaining the variation in displacement, fuzziness (ambiguity), and occupancy level of each nucleosome at the same genomic position;

(e) obtaining nucleosome information with stable translational positioning according to set screening conditions;

(f) annotating the SNPs and InDels of the nucleosome sequences using ANNOVAR software, and identifying STRs using lobSTR software.
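As an illustration of the Poisson-based comparison mentioned in step (d), the following minimal sketch computes a log10 p-value for the difference between the read counts of two samples at a single genomic position. It is not DANPOS's actual implementation: the function names, the choice of the control count as the Poisson mean, and the one-sided test direction are all assumptions made for illustration.

```python
import math

def poisson_pmf(i: int, lam: float) -> float:
    """P(X = i) for X ~ Poisson(lam), evaluated in log space for stability."""
    if lam == 0:
        return 1.0 if i == 0 else 0.0
    return math.exp(-lam + i * math.log(lam) - math.lgamma(i + 1))

def poisson_upper_tail(k: int, lam: float) -> float:
    """P(X >= k), summed term by term until the terms become negligible."""
    total, i = 0.0, max(k, 0)
    while True:
        term = poisson_pmf(i, lam)
        total += term
        if i > lam and term <= total * 1e-17:
            break
        i += 1
    return min(total, 1.0)

def poisson_lower_tail(k: int, lam: float) -> float:
    """P(X <= k)."""
    return min(sum(poisson_pmf(i, lam) for i in range(0, k + 1)), 1.0)

def diff_log10_pval(treat: int, control: int) -> float:
    """Hypothetical helper: one-sided log10 p-value of the treat count,
    taking the control count as the Poisson mean (an assumption)."""
    if treat >= control:
        p = poisson_upper_tail(treat, control)
    else:
        p = poisson_lower_tail(treat, control)
    return math.log10(max(p, 1e-300))  # floor avoids log10(0)
```

Under this sketch, equal counts give a log10 p-value close to 0 (no evidence of a difference), while strongly diverging counts give a large negative value.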

In step (a), the sample is peripheral blood of different people, and the sample containing DNA is white blood cells obtained after the peripheral blood is lysed.

In step (b), a CaCl2 solution is added to the digestion mixture; the concentration of the CaCl2 solution is 80-120 mM, and the ratio of the volume of CaCl2 solution added to the amount of naked DNA is 230-270 μL : 17 μg.

In step (b), naked DNA digestion is used as a control for each sample: DNA is first extracted from the sample and then digested with micrococcal nuclease, the digestion condition being incubation at 37 °C for 2-4 hours, with micrococcal nuclease : naked DNA = 0.45 U : 1 μg.
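The dose ratios in step (b) are all stated relative to an amount of naked DNA, so they can be scaled to other DNA amounts by simple proportion. The sketch below shows that arithmetic; the function names are hypothetical, and linear scaling of the stated ratios is an assumption rather than something the method prescribes.

```python
# Enzyme units per 17 ug of naked DNA, as stated for each dose level.
DOSE_U_PER_17UG = {"high": (110, 130), "medium": (25, 40), "low": (10, 20)}

def mnase_units(level: str, dna_ug: float) -> tuple:
    """Return the (min, max) MNase units for `dna_ug` of DNA at a dose level."""
    low, high = DOSE_U_PER_17UG[level]
    scale = dna_ug / 17.0
    return (low * scale, high * scale)

def cacl2_volume_ul(dna_ug: float) -> tuple:
    """Return the (min, max) CaCl2 solution volume in uL (230-270 uL per 17 ug)."""
    return (230 * dna_ug / 17, 270 * dna_ug / 17)

def naked_control_units(dna_ug: float) -> float:
    """MNase units for the naked DNA control (0.45 U per 1 ug)."""
    return 0.45 * dna_ug
```

For example, `mnase_units("medium", 17)` returns the stated medium range of 25-40 U unchanged.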

In the step (d), the data of the medium and high dose digestion levels are corrected respectively using the information obtained from the naked DNA digestion control group as a background and using DANPOS.

In step (d), the reads obtained by sequencing are aligned to the human reference genome GRCh38 using bwa software, the raw reads are finely filtered to obtain clean reads, and the BED files generated from all samples are run through and compared by the DANPOS algorithm.

In step (d), the sequencing method and the data analysis method of each sample are kept consistent.

In step (e), the screening conditions for stable nucleosomes are as follows:

the nucleosome center displacement satisfies: treat2control_dis = 0;

the fuzziness satisfies: fuzziness_diff_log10pval ≥ -10, and |fuzziness_log2FC| ≤ 1;

the occupancy level satisfies: control_smt_val > 0, and treat_smt_val > 0.
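The screening conditions above can be applied row by row to a table of per-nucleosome comparison results. The following is a minimal sketch that assumes a tab-delimited file with a header row whose column names are `treat2control_dis`, `fuzziness_diff_log10pval`, `fuzziness_log2FC`, `control_smt_val`, and `treat_smt_val`; the file layout and the helper names are assumptions for illustration.

```python
import csv

def is_stable(row: dict) -> bool:
    """Apply the three screening conditions to one nucleosome record."""
    return (
        float(row["treat2control_dis"]) == 0                # no center displacement
        and float(row["fuzziness_diff_log10pval"]) >= -10   # no fuzziness difference
        and abs(float(row["fuzziness_log2FC"])) <= 1
        and float(row["control_smt_val"]) > 0                # occupancy detected in both
        and float(row["treat_smt_val"]) > 0
    )

def stable_nucleosomes(handle) -> list:
    """Read a tab-delimited table from an open file handle and keep stable rows."""
    return [row for row in csv.DictReader(handle, delimiter="\t") if is_stable(row)]
```

A nucleosome is kept only when all three conditions hold at once, which is why the fraction of stable nucleosomes is much smaller than the total.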

The stable nucleosome genetic marker obtained by the method is applied to degradation DNA detection.

An assay method for degraded DNA comprising the steps of the above method.

The invention provides a method for obtaining stable nucleosome genetic markers. A stable nucleosome is a nucleosome whose position is consistent across different samples, which makes it more reliable for forensic identification. We screened the genetic markers (STRs, SNPs, and InDels) located on stable nucleosomes for use in degraded DNA analysis, obtaining roughly 300000 SNPs and 40000 InDels. The invention compares the degradation resistance of SNPs located in stable nucleosome regions with that of SNPs in linker regions and verifies the degradation resistance of the former. It thereby provides a reliable new method for DNA typing of highly degraded samples.

The invention minimizes noise in the detection and data analysis process. The main measures are: (1) precise experimental conditions, such as the MNase digestion level and cutting characteristics; (2) a consistent sequencing method, including the sequencing platform, single-end or paired-end sequencing, and sequencing depth; and (3) a consistent data analysis method, such as the algorithm used to detect nucleosome peaks. Finally, by screening for stable nucleosomes and removing noise, high-precision nucleosome positioning information is obtained, from which the genetic marker information of the nucleosomes is derived.

Drawings

FIG. 1 is a fuzzy score distribution of samples before and after naked DNA correction.

FIG. 2 is a graph of the correlation of GC content to occupancy level of samples before and after naked DNA correction. A is the pre-calibration sample data and B is the post-calibration sample data.

FIG. 3 shows the melting curve of SNP site rs2983217.

FIG. 4 shows the trend of the DNA content of the samples at each time point of DNase I digestion.

FIG. 5 shows the DNA content of the samples at each time point of DNase I digestion.

FIG. 6 shows the DNA content of the sites at each time point of DNase I digestion.

Detailed Description

The technical solution of the present invention will be described in detail with reference to specific examples. The test conditions and procedures not mentioned in the examples of the present invention were carried out according to the conventional methods in the art or the conditions suggested by the manufacturer.

Example 1

The method of the embodiment comprises the following steps:

(I) three different MNase digestion degrees are set to obtain a comprehensive and accurate nucleosome positioning map

Leukocytes were collected from the peripheral blood of 2 healthy persons, and the leukocytes obtained after blood lysis were digested with micrococcal nuclease (MNase). The white blood cells were counted under a microscope and aliquoted at 5 × 10⁶ cells per tube. Three groups of leukocyte samples were prepared from each individual. MNase (Thermo Scientific) was added to the three groups from each individual at 15, 30, and 120 U, respectively, together with 250 μL of 100 mM CaCl2 solution; the mixtures were incubated at 37 °C for 3 hours, and the reaction was stopped by adding 150 μL of 500 mM EDTA. Purified DNA was extracted using the E.Z.N.A.™ Blood DNA Midi Kit (Omega). The MNase-digested DNA was analyzed on the GX Touch24 to obtain the concentration of each fragment, and the degree of digestion was calculated. Mononucleosome fragments (around 150 bp) were excised from agarose gels, and the DNA was recovered and purified using the Wizard SV Gel and PCR Clean-Up System (Promega). A naked DNA control sample was prepared for each individual in the same manner as the 6 groups described above, except that DNA was first extracted from the blood leukocytes and then digested with 0.45 U MNase, and the 150 bp band was recovered by gel excision after agarose gel electrophoresis. Nucleosome DNA was sequenced on the NovaSeq 6000 platform in paired-end mode (2 × 150 bp).

Sequencing data were obtained for a total of 8 samples. The reads obtained by sequencing were aligned to the human reference genome GRCh38 using bwa, and the raw reads were finely filtered to yield clean reads. The BED files generated for all samples were run through the DANPOS algorithm, in which duplicate reads were removed to exclude any potential PCR amplification bias and the read length was adjusted to increase the signal-to-noise ratio. DANPOS was run with default parameters. The position, fuzziness score, and occupancy score of each nucleosome across the whole genome were obtained for each sample.
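To make the duplicate-removal idea concrete, the sketch below drops BED records that share the same chromosome, start, end, and strand, keeping the first occurrence. This is only an illustration of the concept; DANPOS performs its own duplicate handling internally, and its exact rule is not reproduced here. The function name is hypothetical.

```python
def dedup_bed(lines) -> list:
    """Keep the first BED record for each (chrom, start, end, strand) key."""
    seen = set()
    kept = []
    for line in lines:
        text = line.rstrip("\n")
        # Skip blank lines and common BED header/comment lines.
        if not text.strip() or text.startswith(("#", "track", "browser")):
            continue
        fields = text.split("\t")
        strand = fields[5] if len(fields) > 5 else "."
        key = (fields[0], int(fields[1]), int(fields[2]), strand)
        if key not in seen:
            seen.add(key)
            kept.append(text)
    return kept
```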

(II) The positioning data of nucleosomes from partial MNase digestion were corrected using the naked DNA control samples to eliminate noise caused by incomplete digestion.

Because MNase shows sequence preference during partial digestion, the naked DNA signal was used as background and the medium- and high-digestion MNase-seq data were each corrected with the bioinformatics tool DANPOS, in order to eliminate this bias, remove background noise, and improve the precision of nucleosome positioning. The position, fuzziness score, and occupancy score of each nucleosome across the corrected whole genome were obtained for each sample. Comparing the same sample before and after correction showed that nucleosome fuzziness decreased after correction in all 4 groups of samples, indicating that the precision of nucleosome positioning improved (FIG. 1). Furthermore, the correlation coefficient R between GC content and nucleosome occupancy level decreased after correction in the 4 groups of samples (FIG. 2), indicating reduced GC bias.
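The GC-bias check described here relies on a correlation coefficient between GC content and occupancy. A minimal sketch of that calculation follows, assuming Pearson's correlation is the coefficient meant by R; the function names and the idea of computing GC content per nucleosome sequence are assumptions for illustration.

```python
import math

def gc_fraction(seq: str) -> float:
    """Fraction of G and C bases in a DNA sequence."""
    seq = seq.upper()
    if not seq:
        raise ValueError("empty sequence")
    return (seq.count("G") + seq.count("C")) / len(seq)

def pearson_r(x, y) -> float:
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two equal-length sequences with at least 2 points")
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("zero variance in one of the inputs")
    return sxy / math.sqrt(sxx * syy)
```

Computing `pearson_r` on GC fractions versus occupancy scores before and after correction would give the two values being compared; a smaller value after correction corresponds to reduced GC bias.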

(III) Nucleosome positioning data at the same MNase digestion level were compared to eliminate noise caused by the digestion level and to screen for stable nucleosomes. The sequencing method and data analysis were kept consistent across samples to eliminate noise as far as possible. In this experiment, the screening conditions for stable nucleosomes were: 1. the displacement of the nucleosome center is 0 (treat2control_dis = 0); 2. no difference in fuzziness (fuzziness_diff_log10pval ≥ -10 and |fuzziness_log2FC| ≤ 1); 3. occupancy level above the detection threshold (control_smt_val > 0 and treat_smt_val > 0). In this example, the stable nucleosomes obtained from the 2 different samples averaged 925897, 843359, and 862330 at the three MNase digestion levels, accounting for 10.13%, 10.47%, and 14.01% of the total nucleosomes, respectively. This is far more than under a single MNase condition.

(IV) Genetic markers within the stable nucleosomes were obtained, including SNVs (single nucleotide variants), InDels (insertion/deletion polymorphisms), and STRs (short tandem repeat polymorphisms). The sequenced paired-end reads were mapped to the Ensembl human genome version 38 using the local alignment option of Bowtie 2. Aligned reads were filtered by mapping quality (MAPQ > 20) and further processed using SAMtools and BEDTools. SNPs and InDels in the nucleosome sequences were annotated using ANNOVAR software, and STRs were identified with lobSTR software.
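The mapping-quality filter (MAPQ > 20) can be illustrated on plain-text SAM records, where MAPQ is the fifth tab-separated field. The sketch below is a simplified stand-in for what would normally be done with a dedicated tool; the function name is hypothetical.

```python
def filter_by_mapq(sam_lines, min_mapq: int = 20):
    """Yield SAM header lines unchanged and alignments with MAPQ > min_mapq."""
    for line in sam_lines:
        if line.startswith("@"):       # header lines carry no MAPQ
            yield line
            continue
        fields = line.rstrip("\n").split("\t")
        if int(fields[4]) > min_mapq:  # column 5 of a SAM record is MAPQ
            yield line
```

Note that the comparison is strict, so an alignment with MAPQ exactly 20 is discarded.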

Example 2

Verification of stability of nucleosome DNA against degradation

(I) From the stable nucleosome SNPs obtained by the earlier sequencing, those recorded in the dbSNP_138 database were selected as study objects, and the final experimental sites were further screened according to criteria such as minor allele frequency. The specific screening criteria are as follows:

(1) minor allele frequency (MAF) of 0.2 or more;

(2) wide distribution across the genome, with no linkage disequilibrium among the sites.

The SNPs in the region of the stable nucleosome core DNA were finally determined according to the above screening criteria and recorded as nucleosome group SNPs. SNPs that were not present in the original sequencing data were randomly selected as non-nucleosome core DNA region SNPs and recorded as non-nucleosome group SNPs.
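The allele-frequency criterion (MAF of 0.2 or more) can be checked with a short calculation. The sketch below takes hypothetical allele counts for one SNP and treats the second most common allele as the minor allele; the function names and the example counts used with it are illustrative only, and the linkage disequilibrium criterion is not covered.

```python
def minor_allele_frequency(allele_counts: dict) -> float:
    """Frequency of the second most common allele at a site."""
    total = sum(allele_counts.values())
    if total == 0 or len(allele_counts) < 2:
        raise ValueError("need counts for at least two alleles")
    ordered = sorted(allele_counts.values(), reverse=True)
    return ordered[1] / total

def passes_maf(allele_counts: dict, threshold: float = 0.2) -> bool:
    """True when the site meets the MAF screening criterion."""
    return minor_allele_frequency(allele_counts) >= threshold
```

For instance, a site with 70 copies of one allele and 30 of the other has a MAF of 0.3 and passes, while a 90/10 split does not.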

(II) Amplification primers were designed so that the fragment length was ≤ 147 bp. The amplification efficiency, fragment length, and other properties of the nucleosome SNPs and non-nucleosome SNPs were equalized, and pre-experiments and statistical analysis confirmed that the two groups did not differ in these respects.

We designed and evaluated PCR amplification primers for SNP sites of nucleosome group and non-nucleosome group according to the following principle. Site flanking sequences were introduced in Primer Premier 5, and PCR primers were designed with the following parameters:

1) length of the primer: 18-30 bases;

2) length of product: ≤ 147 bp, with no difference in amplicon length between nucleosome-group SNPs and non-nucleosome-group SNPs;

3) Tm value: 60 ± 2 °C, with a difference of ≤ 5 °C between the upstream and downstream primers of the same site;

4) GC content: 40-60 percent.

The rest follows the general primer design principle, such as avoiding the formation of dimers or hairpin structures between primers and primers themselves, random distribution of four basic groups, etc.
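The numeric design parameters above lend themselves to an automated check. The sketch below validates a primer pair against the length, product size, Tm, and GC ranges; Tm values are taken as inputs (for example from the design software) rather than calculated, and the function names are hypothetical. Dimer and hairpin checks are not covered.

```python
def gc_percent(seq: str) -> float:
    """GC content of a primer, in percent."""
    seq = seq.upper()
    return 100.0 * (seq.count("G") + seq.count("C")) / len(seq)

def check_primer_pair(fwd: str, rev: str, tm_fwd: float, tm_rev: float,
                      product_len: int) -> list:
    """Return a list of violated design rules (empty if the pair passes)."""
    problems = []
    for name, seq, tm in (("forward", fwd, tm_fwd), ("reverse", rev, tm_rev)):
        if not 18 <= len(seq) <= 30:
            problems.append(f"{name} primer length {len(seq)} outside 18-30")
        if not 58.0 <= tm <= 62.0:
            problems.append(f"{name} primer Tm {tm} outside 60 +/- 2")
        if not 40.0 <= gc_percent(seq) <= 60.0:
            problems.append(f"{name} primer GC content outside 40-60%")
    if abs(tm_fwd - tm_rev) > 5.0:
        problems.append("Tm difference between primers exceeds 5")
    if product_len > 147:
        problems.append(f"product length {product_len} exceeds 147 bp")
    return problems
```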

The general properties of the designed primers were evaluated and optimized using Oligo 7 software. The designed primers were then compared against the genome sequence at http://blast.ncbi.nlm.nih.gov/blast.cgi to ensure primer specificity.

All primers were synthesized by Shanghai Bioengineering, Inc. and purified by PAGE.

(III) Degraded samples were prepared by DNase I digestion for different times (0, 5, 10, 20, and 30 minutes). The degraded samples were derived from the blood of 3 individuals: 10 μg of DNA was digested with DNase I at a final concentration of 0.01 U/μL, and an aliquot was taken at each time point (0, 5, 10, 20, and 30 minutes).

(IV) Real-time fluorescent quantification of the 36 SNPs was performed using the Premix Ex Taq™ II kit (Takara, Japan). The real-time fluorescent quantitative PCR reaction system was prepared with the following components:

Real-time fluorescent quantitative PCR was analyzed quantitatively with ROX as the calibration dye. The PCR conditions were: pre-denaturation at 95 °C for 30 s, followed by 40 cycles of 95 °C for 5 s and 60 °C for 34 s. In each experiment, 2800M male DNA was amplified as a positive control, and the melting curve was analyzed.

Experimental results: stable nucleosome DNA shows good degradation resistance

Through screening, primer design, and verification of primer amplification specificity, 20 SNPs in the nucleosome group and 16 SNPs in the non-nucleosome group were finally selected. The details and primer information of the two sets of SNPs are shown in Tables 1 and 2. The designed primers all amplified successfully in the efficiency verification: the melting curves were all single peaks (FIG. 3), the amplification efficiency was 90-110%, and the correlation coefficient R² was greater than 0.90. To eliminate the effect of amplicon length, we compared the amplicon lengths of the nucleosome group and the non-nucleosome group using an independent two-sample t-test. The results showed no difference in fragment length between the two groups (P = 0.650 > 0.05).
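The amplification efficiency and R² figures come from a qPCR standard curve. The sketch below shows one common way to derive them: fit Ct against log10 of the template quantity by least squares, then convert the slope to an efficiency with E = 10^(-1/slope) - 1. This conversion is the general qPCR convention and is an assumption here, since the source does not state the formula used; the function names are hypothetical.

```python
def fit_standard_curve(log10_qty, ct) -> tuple:
    """Least-squares fit of Ct = slope * log10(quantity) + intercept.

    Returns (slope, intercept, r_squared).
    """
    n = len(log10_qty)
    if n != len(ct) or n < 2:
        raise ValueError("need at least two paired points")
    mx, my = sum(log10_qty) / n, sum(ct) / n
    sxx = sum((x - mx) ** 2 for x in log10_qty)
    syy = sum((y - my) ** 2 for y in ct)
    sxy = sum((x - mx) * (y - my) for x, y in zip(log10_qty, ct))
    if sxx == 0 or syy == 0:
        raise ValueError("zero variance in the standard curve data")
    slope = sxy / sxx
    intercept = my - slope * mx
    return slope, intercept, (sxy * sxy) / (sxx * syy)

def amplification_efficiency(slope: float) -> float:
    """Efficiency as a fraction (1.0 = 100%), using E = 10**(-1/slope) - 1."""
    return 10 ** (-1.0 / slope) - 1.0

def quantity_from_ct(ct: float, slope: float, intercept: float) -> float:
    """Absolute quantity interpolated from the standard curve."""
    return 10 ** ((ct - intercept) / slope)
```

A slope near -3.32 corresponds to an efficiency near 100%, so the reported 90-110% window translates into a band of acceptable slopes around that value.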

For each experiment, a standard curve with 5 dilution gradients was prepared from 2800M DNA, and the nucleosome-group SNPs and non-nucleosome-group SNPs were absolutely quantified in 3 DNA samples (A, B, C) at 5 degradation time points (0, 5, 10, 20, and 30 min). The site content of the two groups at the time points from 0 to 30 min was compared using repeated-measures ANOVA and multivariate ANOVA in SPSS 21.0, with P < 0.05 taken as statistically significant. The results showed that the content of every site in the 3 degraded DNA samples decreased markedly as the DNase I digestion time increased (FIG. 4). The sphericity test gave P < 0.05 (P_A = 0.000 < 0.05, P_B = 0.000 < 0.05, P_C = 0.000 < 0.05), indicating that the 5 repeated measurements were highly correlated, so multivariate analysis of variance was the preferred test. We compared the differences between the two groups of sites at each time point (0, 5, 10, 20, and 30 min) (Table 3, FIG. 5, FIG. 6). The results show that in the initial DNA state (0 min) there was no difference between the nucleosome-group SNPs and the non-nucleosome-group SNPs in the 3 samples, whereas at 5, 10, 20, and 30 min the two groups differed. This indicates that, at a certain degree of DNA degradation (10 μg of intact DNA digested with 0.01 U/μL DNase I for 5, 10, 20, or 30 min), DNA located in the nucleosome region is more likely to be amplified because of the protection afforded by the histone octamer; that is, the protective effect of the nucleosome does exist.

Table 1. Information on the 20 stable nucleosome SNPs

Table 2. Information on the 16 non-nucleosome SNPs

Table 3. Real-time fluorescent quantitative analysis results of the 3 artificially degraded samples

In FIG. 5, asterisks indicate that the DNA content of the stable nucleosome group differed significantly from that of the non-nucleosome group at that time point by multivariate analysis of variance (P < 0.05). The figure shows that, in sample B, the stable nucleosome group differed significantly from the non-nucleosome group at 5, 10, 20, and 30 min of DNase I digestion, with P values of 0.001, 0.001, 0.006, and 0.007 at the respective time points.

In FIG. 6, asterisks indicate that the DNA content of the stable nucleosome group was significantly different from that of the non-nucleosome group at this time point by multivariate analysis of variance (P < 0.05). The stable nucleosomes at the locus rs2983217 were significantly different from the non-nucleosomes at 5,10,20 and 30min of DNase I digestion, and the P values of the two sets of differences at each time point were 0.000, 0.001, 0.000 and 0.03 respectively.
