Method and kit for quickly constructing plasma DNA sequencing library

Document No.: 1138385. Publication date: 2020-10-09.

Summary: This technology, "Method and kit for quickly constructing plasma DNA sequencing library", was designed by 陈迪, 李亚丽, 李天成, 石燕滨, 张江花 and 吴凯 on 2014-06-24. The invention provides a method for constructing a plasma DNA sequencing library, comprising the following steps: 1) extracting plasma DNA; 2) optionally filling in the ends of the plasma DNA to obtain end-filled plasma DNA; 3) adding a 3' overhanging A (3' A-tailing) to the optionally end-filled plasma DNA to obtain 3' A-tailed plasma DNA; 4) ligating the 3' A-tailed plasma DNA to a sequencing adapter to obtain a ligation product; 5) purifying the ligation product to obtain a purified product. The invention also provides a kit for constructing a plasma DNA sequencing library. The invention greatly simplifies the construction of plasma DNA sequencing libraries, eliminates the environmental and sample-to-sample contamination caused by PCR amplification, lowers the environmental requirements of plasma DNA library construction, and also eliminates the non-uniform genome coverage caused by PCR. As a result, plasma sample library construction becomes cheaper, more efficient and faster, and plasma DNA sequencing results become more faithful and reliable.

1. A method for constructing a plasma DNA sequencing library comprising the steps of:

1) extracting plasma DNA;

2) optionally filling in the ends of the plasma DNA to obtain end-filled plasma DNA;

3) adding a 3' overhanging A (3' A-tailing) to the optionally end-filled plasma DNA to obtain 3' A-tailed plasma DNA;

4) ligating the 3' A-tailed plasma DNA to a sequencing adapter to obtain a ligation product;

5) purifying the ligation product to obtain a purified product.

2. The method according to claim 1, characterized in that step 4) is carried out using one or both enzymes selected from the group consisting of: T4 DNA ligase and T7 DNA ligase.

3. The method of claim 1, wherein the sequencing adaptor is a double-stranded sequencing adaptor.

4. The method of claim 1, wherein the 3' A-tailing is performed using Klenow exo- enzyme, Taq enzyme, or a combination of Klenow exo- enzyme and Taq enzyme.

5. The method according to claim 1, characterized in that a purification step is carried out between said steps 2) and 3); or said steps 2) and 3) are carried out in one reaction system without an intermediate purification step.

6. The method according to claim 5, wherein step 2) is performed using T4 DNA polymerase, and step 3) is performed using Taq enzyme.

7. The method according to claim 1, characterized in that a purification step is carried out between said steps 3) and 4); or steps 2), 3) and 4) are carried out in one reaction system without an intermediate purification step.

8. The method according to claim 7, wherein step 2) is performed using T4 DNA polymerase; step 3) is performed using Taq enzyme; and step 4) is performed using T4 DNA ligase and T7 DNA ligase.

9. A kit for constructing a plasma DNA sequencing library, comprising:

- optionally, reagents for end filling-in of plasma DNA, including dNTPs, an enzyme for end filling-in, and an end filling-in buffer;

- reagents for 3' A-tailing, including dATP, an enzyme for 3' A-tailing, and a 3' A-tailing buffer;

- reagents for ligation to a sequencing adapter, including a sequencing adapter, a ligase, and a ligation buffer; and

- reagents and devices for purifying the ligation product.

10. The kit according to claim 9, wherein the ligase is one or both selected from the following: T4 DNA ligase and T7 DNA ligase.

11. The kit of claim 9, wherein the sequencing adaptor is a double-stranded sequencing adaptor.

12. The kit of claim 9, wherein the enzyme for 3' A-tailing is Klenow exo- enzyme, Taq enzyme, or a combination of Klenow exo- enzyme and Taq enzyme.

13. The kit according to claim 9, wherein the enzyme for end filling-in is T4 DNA polymerase.

14. The kit according to claim 12 or 13, wherein the enzyme for end filling-in is T4 DNA polymerase, and the enzyme for 3' A-tailing is Taq enzyme.

15. The kit according to claim 9, wherein the enzyme for end filling-in is T4 DNA polymerase; the enzyme for 3' A-tailing is Taq enzyme; the enzymes for ligation are T4 DNA ligase and T7 DNA ligase; and the adapter used for ligation is a double-stranded adapter.

16. The kit of claim 9, wherein the reagent for purifying the ligation product is selected from sterile dH2O or an elution buffer; and the device for purifying the ligation product is selected from purification columns, Qiagen columns, purification magnetic beads, or Beckman AMPure XP beads.

Technical Field

The present invention relates to a method and a kit for constructing a plasma DNA sequencing library, and more particularly, to a method and a kit for constructing a plasma DNA library for high throughput sequencing.

Background

With the advancement of science and technology, traditional Sanger sequencing can no longer fully meet research needs. Genome sequencing requires technology with lower cost, higher throughput and higher speed, namely high-throughput sequencing (also called second-generation sequencing). The core idea of high-throughput sequencing is sequencing-by-synthesis, i.e., determining the DNA sequence by detecting the labels of newly incorporated terminal nucleotides. Existing platforms mainly include Roche/454 FLX; Illumina HiSeq, MiSeq and NextSeq; and the Life Technologies SOLiD system, PGM and Proton. To date, the HiSeq 2000 can sequence 6 human genomes at 30X coverage per run, approximately 600G of data per run, and the HiSeq 2500 can reach an average rate of one base read every 8 minutes of sequencing time. As second-generation sequencing technology matures, its application to clinical research is developing rapidly.
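The throughput figure quoted above can be checked with a short calculation. The sketch below is illustrative only: the haploid human genome size of about 3.1 gigabases is an assumption that the source does not state, and the function name `required_gigabases` is hypothetical.

```python
# Rough check of the "6 human genomes at 30X per run" throughput figure.
# ASSUMPTION: a haploid human genome of ~3.1 gigabases (not stated in the source).
HUMAN_GENOME_GB = 3.1


def required_gigabases(genomes: int, coverage: float,
                       genome_gb: float = HUMAN_GENOME_GB) -> float:
    """Return the raw sequence yield (in gigabases) needed for the given
    number of genomes at the given average coverage depth."""
    return genomes * coverage * genome_gb


if __name__ == "__main__":
    needed = required_gigabases(genomes=6, coverage=30)
    print(f"Yield needed for 6 genomes at 30X: about {needed:.0f} Gb")
```

Under that assumption the requirement comes to roughly 558 Gb, which is in line with the approximately 600G per run quoted in the text.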

Plasma DNA, also called circulating DNA, is extracellular DNA in blood, typically tens to hundreds of nucleotides in length, and may exist as DNA-protein complexes or as free DNA fragments. Normally, plasma DNA is released from a small number of senescent and dead cells. In a healthy state, the production and clearance of circulating DNA are in dynamic equilibrium, and its level remains relatively constant and low. Circulating DNA reflects the metabolic state of cells in the human body and is an important index for health evaluation. Changes in the quantity and quality of circulating DNA in peripheral blood are closely related to various conditions (including tumors, complex severe trauma, organ transplantation, pregnancy-related diseases, infectious diseases, organ failure and the like). As a noninvasive detection index, circulating DNA is expected to become an important molecular marker for the early diagnosis, disease monitoring, and evaluation of treatment response and prognosis of certain diseases.

Since the discovery of fetal DNA in maternal blood, the noninvasive direct diagnosis and detection of fetal chromosomal abnormalities has been a major research topic. In 2008, Professor Lo Yuk-Ming (Dennis Lo) and his team were the first to analyze the plasma DNA of pregnant women by high-throughput sequencing; by counting the proportional representation of each chromosome, they could detect changes in the chromosome number of the trace amounts of fetal DNA in plasma (Chiu, Chan et al. 2008). As the detection technology has developed and matured, many countries are working to translate this laboratory technique into clinical application. In addition, it has been found that the apoptosis products of cancer cells in cancer patients (including fragmented and free cancer cell DNA) are released into the blood and carried throughout the body by the circulatory system, so that plasma DNA carries the complete genetic information of the cancer cell genome (Schwarzenbach, Hoon et al. 2011). More recently, Professor Lo's group found that cancer patients can be effectively distinguished from healthy people by high-throughput sequencing of their plasma DNA and analysis of the methylation status of repetitive regions (Chan, Jiang et al. 2013).

Improved sequencing efficiency and the spread of multiplexed (pooled multi-sample) sequencing place higher demands on sample preparation efficiency, particularly for large numbers of clinical samples; existing methods for preparing clinical plasma samples have not kept pace with the growth in sequencing capacity. The efficiency and cost of preparing clinical plasma DNA samples for second-generation high-throughput sequencing have therefore become key to whether high-throughput sequencing can be widely adopted.

Preparing a plasma DNA sample for second-generation high-throughput sequencing essentially means inserting DNA of a suitable length into an existing sequencing vector, i.e., ligating known sequencing adapter sequences to both ends of the DNA to be sequenced. At present, plasma DNA library construction mainly comprises end repair and 5'-end phosphorylation of the extracted plasma DNA, followed by 3' A-tailing, adapter ligation, and PCR (FIG. 1), with a purification step required between almost all steps. This construction method needs 6 main enzymes and 4 enzymatic reaction systems in total, plus 4 rounds of cleanup and purification. It is therefore costly and complicated to operate, places strict demands on the environment, readily generates aerosol contamination, requires a high level of molecular biology skill from the operator, and makes it difficult to process many samples at once. In addition, aerosol contamination from the PCR step easily contaminates sequencing samples, and PCR introduces a degree of base bias, so that the sequencing results cover the genome unevenly and the final interpretation becomes inaccurate.

Disclosure of Invention

In view of the problems currently encountered when constructing plasma DNA sequencing libraries, the present inventors have discovered a new, simpler, faster and more uniform method of constructing a plasma DNA sequencing library that can be adapted to a variety of second-generation sequencing platforms, including but not limited to Roche/454 FLX; Illumina HiSeq and MiSeq; and the Life Technologies SOLiD system, PGM and Proton.

The invention is based on the following findings. The inventors found that plasma DNA can be loaded onto the sequencer directly after ligation to a sequencing adapter, without PCR amplification; the PCR amplification step of the existing workflow is unnecessary, and no PCR polymerase is required. The inventors also found that the three steps of end filling-in, 3' A-tailing and sequencing adapter ligation in the existing workflow can be completed in one reaction system with no intermediate purification step, that 5'-end phosphorylation can be omitted during end filling-in of the plasma DNA, and that a ready-to-sequence library is obtained simply by purifying the ligation reaction. According to the invention, plasma DNA that has been fragmented in vivo can still yield a good plasma DNA library and valid sequencing results with as little as 2 ng of input, and PCR-free technology is applied for the first time to very low amounts (less than 10 ng) of plasma DNA.

Accordingly, one aspect of the present invention provides a method for constructing a plasma DNA sequencing library, comprising the steps of:

1) extracting plasma DNA;

2) optionally filling in the ends of the plasma DNA to obtain end-filled plasma DNA;

3) adding a 3' overhanging A (3' A-tailing) to the optionally end-filled plasma DNA to obtain 3' A-tailed plasma DNA;

4) ligating the 3' A-tailed plasma DNA to a sequencing adapter to obtain a ligation product;

5) purifying the ligation product to obtain a purified product.

Thus, according to the present invention, the plasma DNA sequencing library can be obtained without a PCR amplification reaction step for the purified product.

According to a preferred embodiment of the present invention, step 4), in which the 3' A-tailed plasma DNA is ligated to a sequencing adapter to obtain a ligation product, is performed using one or both enzymes selected from the group consisting of: T4 DNA ligase and T7 DNA ligase.

According to a preferred embodiment of the invention, the sequencing adaptor is a double stranded sequencing adaptor.

The double-stranded adapter of the present invention is obtained by annealing two single-stranded DNAs. One of them is a universal sequence, i.e., its sequence is fixed and identical across different libraries, such as the universal adapter shown below. The other is an indexed sequence, i.e., it carries its own specific bases, generally 3 to 8 of them, which distinguish different adapters, while the rest of the sequence is fixed, such as the index adapter shown below.

The adapter sequences used in the present invention are exemplified as follows:

Universal Adapter

5′AATGATACGGCGACCACCGAGATCTACACTCTTTCCCTACACGACGCTCTTCCGATCT3′

Index Adapter

5′GATCGGAAGAGCACACGTCTGAACTCCAGTCACATCACGATCTCGTATGCCGTCTTCTGCTTG3′

The above is just one example of an adapter sequence suitable for use in the present invention. One skilled in the art can ligate different, appropriate adapter sequences to the plasma DNA library of the present invention depending on the sequencing platform, e.g., Illumina, Ion Torrent, or Roche 454.
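To make the structure of the example adapter pair concrete, the following sketch checks how the two strands could anneal and where the index sits. It rests on assumptions the source does not spell out: that the strands pair through the 3' end of the universal strand and the 5' end of the indexed strand, that the terminal 3' T of the universal strand stays unpaired to match the 3' A on the insert, and that the index is the run of bases between the two flanking sequences chosen here for illustration. All function names and the flank arguments are hypothetical.

```python
# Example adapter strands, copied from the sequences listed above (5' -> 3').
UNIVERSAL = "AATGATACGGCGACCACCGAGATCTACACTCTTTCCCTACACGACGCTCTTCCGATCT"
INDEXED = "GATCGGAAGAGCACACGTCTGAACTCCAGTCACATCACGATCTCGTATGCCGTCTTCTGCTTG"

_COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return "".join(_COMPLEMENT[base] for base in reversed(seq))


def annealed_stem_length(universal: str, indexed: str) -> int:
    """Count the consecutive bases at the 5' end of the indexed strand that
    pair with the 3' end of the universal strand.

    ASSUMPTION: the last base of the universal strand is a single unpaired
    3' overhang, so it is excluded before comparing.
    """
    paired_region = reverse_complement(universal[:-1])
    length = 0
    for a, b in zip(paired_region, indexed):
        if a != b:
            break
        length += 1
    return length


def extract_index(adapter: str,
                  left_flank: str = "CCAGTCAC",
                  right_flank: str = "ATCTCGTATGCC") -> str:
    """Return the bases between two fixed flanks of the indexed strand.

    ASSUMPTION: the flank sequences are illustrative choices; the source
    only says that the index is generally 3 to 8 specific bases.
    """
    start = adapter.index(left_flank) + len(left_flank)
    end = adapter.index(right_flank, start)
    return adapter[start:end]


if __name__ == "__main__":
    print("3' overhang base on universal strand:", UNIVERSAL[-1])
    print("Annealed stem length:", annealed_stem_length(UNIVERSAL, INDEXED))
    print("Index bases:", extract_index(INDEXED))
```

Under these assumptions the two strands share a short complementary stem next to a single 3' T, and the index falls within the 3 to 8 base range given in the description.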

According to one embodiment of the invention, after plasma DNA extraction and before ligation of the plasma DNA to the sequencing adapter, the extracted plasma DNA is 3' A-tailed and the A-tailed plasma DNA is purified. Preferably, the 3' A-tailing is performed using Klenow exo- enzyme, Taq enzyme, or a combination of Klenow exo- enzyme and Taq enzyme. Preferably, the 3' A-tailing and the adapter ligation can be performed in one reaction system; that is, the sequencing adapter is ligated directly after A-tailing without purification, and the plasma DNA is purified after the ligation reaction to obtain the ready-to-sequence plasma DNA library. In this case, the 3' A-tailing uses Taq enzyme.

According to another embodiment of the invention, after plasma DNA extraction and before ligation of the plasma DNA to the sequencing adapter, the extracted plasma DNA is end-filled and 3' A-tailed. End filling-in and A-tailing can be carried out in two reaction systems, i.e., the DNA is purified after end filling-in and then A-tailed. Alternatively and preferably, end filling-in and A-tailing are performed in one reaction system and the plasma DNA is purified afterwards, with T4 DNA polymerase used for end filling-in and Taq enzyme for A-tailing. More preferably, end filling-in, A-tailing and ligation of the plasma DNA to the sequencing adapter are all performed in a single reaction system, and the plasma DNA is purified after the ligation reaction to obtain the ready-to-sequence plasma DNA library. In this most preferred embodiment, the plasma DNA passes through only one reaction system, and the final sequencing library is obtained with a single purification step; end filling-in uses T4 DNA polymerase, and A-tailing uses Taq enzyme.

Another aspect of the present invention provides a kit for constructing a plasma DNA sequencing library, comprising:

- optionally, reagents for end filling-in of plasma DNA, including dNTPs, an enzyme for end filling-in, and an end filling-in buffer;

- reagents for 3' A-tailing, including dATP, an enzyme for 3' A-tailing, and a 3' A-tailing buffer;

- reagents for ligation to a sequencing adapter, including a sequencing adapter, a ligase, and a ligation buffer; and

- reagents and devices for purifying the ligation product.

Thus, according to the present invention, the kit need not comprise a polymerase for the PCR amplification reaction.

According to an embodiment of the present invention, a kit for constructing a plasma DNA sequencing library is provided, an important feature of which is that the adapter-ligated plasma DNA, once purified, is a ready-to-sequence library, so that no PCR amplification step is required.

According to the kit of the present invention, preferably, the ligase is one or both selected from the following: T4 DNA ligase and T7 DNA ligase.

According to the kit of the invention, the sequencing adaptor is a double-stranded sequencing adaptor.

In one embodiment, preferably, the enzyme for 3' A-tailing is Klenow exo- enzyme, Taq enzyme, or a combination of Klenow exo- enzyme and Taq enzyme.

In another embodiment, the kit of the invention comprises: reagents for end filling-in and 3' A-tailing of the extracted plasma DNA, including an enzyme for end filling-in, dNTPs, an end filling-in buffer, dATP, an enzyme for A-tailing, and an A-tailing buffer; and reagents and devices for purifying the plasma DNA after end filling-in and A-tailing; wherein end filling-in and A-tailing are carried out in one reaction system, the enzyme for end filling-in is T4 DNA polymerase, and the enzyme for A-tailing is Taq enzyme.

In yet another embodiment, the kit of the invention comprises: reagents for end filling-in and 3' A-tailing of the extracted plasma DNA, including an enzyme for end filling-in, dNTPs, an end filling-in buffer, dATP, an enzyme for A-tailing, and an A-tailing buffer; and reagents for ligating the plasma DNA to the sequencing adapter, including a sequencing adapter, a ligase, and a ligation buffer; wherein end filling-in, A-tailing and ligation of the plasma DNA to the sequencing adapter are performed in one reaction system, the enzyme for end filling-in is T4 DNA polymerase, and the enzyme for A-tailing is Taq enzyme.

According to one embodiment, the kit of the present invention does not include a polymerase for PCR amplification, and need not include reagents and devices for purification after end repair and 3' A-tailing.

According to one embodiment, the reagent for purifying the ligation product is selected from sterile dH2O or an elution buffer; the device for purifying the ligation product is selected from a purification column, a Qiagen column, purification magnetic beads, or Beckman AMPure XP beads.

The invention is based on the discovery that adapter-ligated plasma DNA can be sequenced directly, so that the PCR step of existing plasma DNA library construction methods is omitted and no PCR polymerase is used. On this basis, the invention preferably uses ordinary Taq polymerase in place of Klenow exo- enzyme for the 3' A-tailing step, which makes the individual reactions compatible with one another; end filling-in, A-tailing and adapter ligation can therefore be completed in a single reaction system, and the intermediate purification steps are omitted. The invention thus greatly simplifies plasma DNA sequencing library construction, reduces the sample mix-ups and contamination caused by manual handling, and is better suited to large-scale plasma DNA library construction. At the same time, the invention eliminates the environmental and sample-to-sample contamination caused by PCR amplification, lowers the environmental requirements of plasma DNA library construction, and removes the base bias, non-uniform genome coverage and resulting sequencing inaccuracy introduced by PCR amplification. Plasma sample library construction becomes cheaper, more efficient and faster, and plasma DNA sequencing results become more faithful and reliable. The present invention is therefore better suited to large-scale experimental workflows and plasma DNA sequence analysis.

Drawings

FIG. 1 shows the construction of a second generation high throughput plasma DNA sequencing library commonly used in the prior art.

FIG. 2 illustrates a method of constructing a second generation high throughput plasma DNA sequencing library according to one embodiment of the present invention.

FIG. 3 shows a comparison of two different library construction approaches of the prior art and according to one embodiment of the present invention.

FIG. 4 shows a comparison of four different library construction modes according to examples 1-4 of the present invention.

Detailed Description

As mentioned above, construction of a plasma DNA library for current second-generation high-throughput sequencing platforms mainly comprises the following steps (see FIG. 1): extracting plasma DNA (step 101) → performing end repair (filling-in) and 5'-end phosphorylation on the extracted plasma DNA, then purifying (step 102) → 3' A-tailing the 5'-phosphorylated blunt-ended DNA fragments, then purifying (step 103) → ligating the 3' A-tailed DNA to a sequencing adapter (step 104) → purifying the ligation product to remove unligated adapter (step 105) → performing polymerase chain reaction (PCR) on the purified product (step 106) → purifying the PCR product to obtain a plasma DNA sequencing library (step 107). A purification step is required between almost all of these steps, and the plasma DNA must finally undergo PCR amplification. This method of constructing a second-generation plasma DNA sequencing library therefore needs 6 main enzymes, 4 enzymatic reaction systems, 4 rounds of cleanup and purification, and 1 PCR amplification. It is consequently costly and complicated to operate, places strict demands on the environment, readily generates aerosol contamination, requires a high level of molecular biology skill from the operator, and makes it difficult to process many samples at once. In addition, aerosol contamination from the PCR step easily contaminates sequencing samples, and PCR introduces a degree of base bias, so that the sequencing results cover the genome unevenly and the final interpretation becomes inaccurate.

Repeated experiments by the inventors show that adapter-ligated plasma DNA can be sequenced directly without PCR amplification, and that the PCR amplification step of the conventional workflow is unnecessary. An important feature of the present invention is therefore that construction of the plasma DNA sequencing library does not include a PCR amplification step.

On this basis, the present inventors have also found that, depending on the enzyme chosen for 3' A-tailing, the three reactions of end filling-in, A-tailing and sequencing adapter ligation in the existing workflow need not be performed in three separate reaction systems, but can be performed in a single reaction system without intermediate purification steps.

FIG. 2 shows a method for constructing a second-generation high-throughput plasma DNA sequencing library according to an embodiment of the present invention, which essentially comprises the following steps: extracting plasma DNA (step 201) → end filling-in and 3' A-tailing of the extracted plasma DNA (step 203) → ligating the 3' A-tailed plasma DNA to a sequencing adapter (step 204) → purifying the adapter-ligated plasma DNA to obtain a plasma DNA sequencing library (step 207). According to another embodiment of the present invention, step 203 can also be split across two reaction systems: step 202, in which the extracted plasma DNA undergoes end repair and 5'-end phosphorylation, and step 203, in which the 5'-phosphorylated blunt-ended DNA fragments are 3' A-tailed. Alternatively, a purification can be performed between step 203 and step 204.

Compared with prior-art methods of constructing a plasma DNA sequencing library, the present method yields the final ready-to-sequence library with only one reaction system and one purification step, and without PCR. It markedly simplifies plasma DNA sequencing library construction and removes the base bias and uneven genome coverage introduced by the PCR step, so that plasma sample library construction becomes cheaper, more efficient and faster, plasma DNA sequencing results become more faithful and reliable, and the method is convenient for large-scale use.
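The comparison between the two workflows can be tallied in a few lines. The sketch below is illustrative only: it encodes the prior-art steps of FIG. 1 and the single-reaction-system embodiment as simple lists and counts reaction systems and purifications. The step labels, the `kind` categories and the function name `summarize` are hypothetical and are not part of the patent.

```python
from collections import Counter

# Each step is (label, kind); kind is one of:
# "extraction", "enzymatic", "pcr", "purification".
PRIOR_ART = [
    ("extract plasma DNA", "extraction"),
    ("end repair and 5'-end phosphorylation", "enzymatic"),
    ("purify", "purification"),
    ("3' A-tailing", "enzymatic"),
    ("purify", "purification"),
    ("adapter ligation", "enzymatic"),
    ("purify ligation product", "purification"),
    ("PCR amplification", "pcr"),
    ("purify PCR product", "purification"),
]

# Most preferred embodiment as described: one reaction system, one purification.
SINGLE_SYSTEM = [
    ("extract plasma DNA", "extraction"),
    ("end filling-in, 3' A-tailing and adapter ligation", "enzymatic"),
    ("purify ligation product", "purification"),
]


def summarize(workflow):
    """Count reaction systems (enzymatic plus PCR) and purifications,
    and report whether the workflow includes a PCR step."""
    counts = Counter(kind for _, kind in workflow)
    return {
        "reaction_systems": counts["enzymatic"] + counts["pcr"],
        "purifications": counts["purification"],
        "uses_pcr": counts["pcr"] > 0,
    }


if __name__ == "__main__":
    for name, workflow in (("prior art", PRIOR_ART),
                           ("single system", SINGLE_SYSTEM)):
        print(name, summarize(workflow))
```

Encoded this way, the prior-art list reproduces the 4 reaction systems, 4 purifications and 1 PCR stated in the description, while the single-system embodiment needs one reaction system and one purification.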

The present invention will be described in detail below with reference to the embodiments with reference to the attached drawings. It should be noted that the drawings and their embodiments of the present invention are for illustrative purposes only and are not to be construed as limiting the invention. The embodiments and features of the embodiments in the present application may be combined with each other without contradiction.
