Method and kit for constructing second-generation sequencing library

Document No.: 1884854 Publication date: 2021-11-26

Reading note: this technology, "Method and kit for constructing second-generation sequencing library" (一种用于二代测序文库构建方法及试剂盒), was designed by 黄必胜 and 胡咏武 on 2021-07-20. Main content: The invention discloses a method and a kit for constructing a second-generation sequencing library in the field of nucleic acid sequencing. The method comprises: step 1: extracting the genomic DNA to be sequenced, including but not limited to DNA from various tumor tissues, cell-free DNA from peripheral blood, or DNA from various body fluids such as, but not limited to, pleural effusion, urine and saliva; step 2: designing primer sequences that contain a target-sequence-specific primer together with a sample tag sequence (index) and a sequencing primer, and synthesizing the primers; step 3: mixing the specific primers with the DNA to be tested and performing PCR amplification; step 4: purifying the PCR product; step 5: sequencing on the instrument. The method and kit simplify library construction, reduce its cost, improve detection sensitivity, reduce the amount of sequencing data required for detection, effectively remove false positives, improve the enrichment efficiency of target DNA fragments and reduce wasted sequencing data.

1. A method for constructing a second-generation sequencing library, characterized by comprising the following steps:

step 1: extracting the genomic DNA to be sequenced, including but not limited to DNA from various tumor tissues, cell-free DNA from peripheral blood (of pregnant women, healthy individuals or tumor patients), or DNA from various body fluids such as, but not limited to, pleural effusion, urine and saliva;

step 2: designing primer sequences that contain a target-sequence-specific primer together with a sample tag sequence (index) and a sequencing primer, and synthesizing the primers;

step 3: mixing the specific primers with the DNA to be tested and performing PCR amplification;

step 4: purifying the PCR product;

step 5: sequencing on the instrument;

wherein the primer sequences used in step 2 comprise, as the adapter, the annealing product of the single-stranded DNA shown in SEQ ID NO: 1 and the single-stranded DNA having the nucleotide sequence shown in SEQ ID NO: 2; SEQ ID NO: 1 (A + B): 5'-GATCGGAAGAGCTCGTATGCCGTCTTCTGCTTG(X1)m-3'; SEQ ID NO: 2 (D + E + F): 5'-CAAGCAGAAGACGGCATACGAGAT(N)n(X2)m-3';

wherein (X1)m in SEQ ID NO: 1 (A + B) is the upstream primer, designed adjacent to the target sequence site to be detected; in SEQ ID NO: 2 (D + E + F), (N)n is a tag sequence for distinguishing sequencing data from different samples, n is a positive integer from 6 to 12, and the n bases N are each independently selected from A, T, C and G; (X2)m is the downstream primer; m is a positive integer from 20 to 40; and the target-sequence-specific primers lie 1-50 bp, for example 2-20 bp, from the site.

2. The method for constructing a second-generation sequencing library according to claim 1, characterized in that the PCR reaction system in step 3 is: 20 μL DNA, 5 μL PCR primer mix and 25 μL PCR master mix.

3. The method for constructing a second-generation sequencing library according to claim 1, characterized in that the product in step 4 may be purified with a magnetic-bead or filter-membrane adsorption kit well known in the art.

4. A kit for a second-generation sequencing library according to claim 1, characterized in that the upstream and downstream primers may be, but are not limited to, single-stranded DNA having the nucleotide sequence shown in SEQ ID NO: 1 and single-stranded DNA having the nucleotide sequence shown in SEQ ID NO: 2, or their annealing product.

5. The kit according to claim 4, characterized in that it further comprises the following reagents: Taq enzyme, dNTPs, MgCl2 and PCR buffer.

6. The kit for a second-generation sequencing library according to claim 4 or 5, characterized in that it further comprises a magnetic-bead DNA purification kit or a filter-membrane adsorption DNA purification kit.

Technical Field

The invention relates to the field of nucleic acid sequencing, in particular to a method and a kit for constructing a second-generation sequencing library.

Background

A gene mutation is a change in the base-pair composition or order within the structure of a gene. In pathogen detection, detection of tumor gene mutations, and detection of fetal cell-free DNA in the plasma of pregnant women, a small number of mutant genes must be found among a large excess of normal gene sequences. The high throughput of second-generation sequencing is well suited to this: it can detect a small fraction (less than 1%) of mutant genes in a sample at a cost far lower than first-generation (Sanger) sequencing. Examples include tumor-derived cell-free DNA (ctDNA) in the plasma of cancer patients, low-proportion subclonal mutations in cancer tissue samples (such as FFPE), fetal cell-free DNA in the plasma of pregnant women, and viral DNA in the plasma of patients with AIDS, hepatitis and similar diseases.

However, library construction for conventional second-generation sequencing is complex. It comprises fragmenting the genome; end-repairing the DNA fragments to obtain blunt-ended fragments; adding an A to the 3' end of the blunt-ended fragments; ligating an adapter to the A-tailed fragments; and PCR-amplifying the adapter-ligated fragments to obtain the amplification product. Multiple purification steps are carried out along the way, so a large amount of template DNA is lost, which hampers detection of low-frequency gene mutations. If a capture step for the target fragments is added, detection accuracy becomes even harder to guarantee; in practice the sequencing depth is often increased to compensate, which greatly increases the sequencing cost.

Disclosure of Invention

The invention aims to provide a method and a kit for constructing a second-generation sequencing library, so as to solve the problems raised in the Background: high loss of template DNA, which hampers detection of low-frequency gene mutations; detection accuracy that is difficult to guarantee; and increased sequencing cost.

To achieve this aim, the invention provides the following technical solution: a method for constructing a second-generation sequencing library, comprising the following steps:

Step 1: extracting the genomic DNA to be sequenced, including but not limited to DNA from various tumor tissues, cell-free DNA from peripheral blood (of pregnant women, healthy individuals or tumor patients), or DNA from various body fluids such as, but not limited to, pleural effusion, urine and saliva;

Step 2: designing primer sequences that contain a target-sequence-specific primer together with a sample tag sequence (index) and a sequencing primer, and synthesizing the primers;

Step 3: mixing the specific primers with the DNA to be tested and performing PCR amplification;

Step 4: purifying the PCR product;

Step 5: sequencing on the instrument;

wherein the primer sequences used in step 2 comprise, as the adapter, the annealing product of the single-stranded DNA shown in SEQ ID NO: 1 and the single-stranded DNA having the nucleotide sequence shown in SEQ ID NO: 2; SEQ ID NO: 1 (A + B): 5'-GATCGGAAGAGCTCGTATGCCGTCTTCTGCTTG(X1)m-3'; SEQ ID NO: 2 (D + E + F): 5'-CAAGCAGAAGACGGCATACGAGAT(N)n(X2)m-3';

wherein (X1)m in SEQ ID NO: 1 (A + B) is the upstream primer, designed adjacent to the target sequence site to be detected; in SEQ ID NO: 2 (D + E + F), (N)n is a tag sequence for distinguishing sequencing data from different samples, n is a positive integer from 6 to 12, and the n bases N are each independently selected from A, T, C and G; (X2)m is the downstream primer; m is a positive integer from 20 to 40; and the target-sequence-specific primers lie 1-50 bp, for example 2-20 bp, from the site.
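The primer layout described above can be illustrated with a minimal Python sketch. It simply concatenates the fixed adapter portions of SEQ ID NO: 1 and SEQ ID NO: 2 with an index and a target-specific sequence, and checks the stated length ranges (n = 6-12, m = 20-40). The function names and the two target-specific sequences are hypothetical placeholders introduced for illustration only; they are not sequences from this document.

```python
# Fixed adapter portions, copied from SEQ ID NO: 1 and SEQ ID NO: 2.
ADAPTER_5P = "GATCGGAAGAGCTCGTATGCCGTCTTCTGCTTG"
ADAPTER_3P = "CAAGCAGAAGACGGCATACGAGAT"

BASES = set("ACGT")


def _check_dna(name, seq, min_len, max_len):
    """Raise ValueError unless seq is an A/C/G/T string of allowed length."""
    if not seq or set(seq) - BASES:
        raise ValueError(f"{name} must contain only A, C, G, T")
    if not min_len <= len(seq) <= max_len:
        raise ValueError(f"{name} length {len(seq)} outside {min_len}-{max_len}")


def build_upstream_primer(x1):
    """SEQ ID NO: 1 layout: 5'-adapter + (X1)m-3', with m = 20-40."""
    _check_dna("X1", x1, 20, 40)
    return ADAPTER_5P + x1


def build_downstream_primer(index, x2):
    """SEQ ID NO: 2 layout: 5'-adapter + (N)n + (X2)m-3', n = 6-12, m = 20-40."""
    _check_dna("index", index, 6, 12)
    _check_dna("X2", x2, 20, 40)
    return ADAPTER_3P + index + x2


# Hypothetical placeholder target-specific sequences (NOT from the patent).
X1_DEMO = "ACGTACGTACGTACGTACGTAC"
X2_DEMO = "TGCATGCATGCATGCATGCATGCA"

upstream = build_upstream_primer(X1_DEMO)
# "CGTGATGT" is the 8-nt Index-41 sequence listed in Example 1.
downstream = build_downstream_primer("CGTGATGT", X2_DEMO)
print(upstream)
print(downstream)
```

Validating the lengths at construction time is a design choice of this sketch; it only mirrors the ranges stated in the text and says nothing about primer quality (melting temperature, specificity), which would need separate design tools.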

Preferably, the PCR reaction system in step 3 is: 20 μL DNA, 5 μL PCR primer mix and 25 μL PCR master mix.

Preferably, the product in step 4 may be purified with a magnetic-bead or filter-membrane adsorption kit well known in the art.

The invention also provides a kit for a second-generation sequencing library, which comprises the upstream and downstream primers, wherein the upstream and downstream primers may be, but are not limited to, single-stranded DNA having the nucleotide sequence shown in SEQ ID NO: 1 and single-stranded DNA having the nucleotide sequence shown in SEQ ID NO: 2, or their annealing product.

Preferably, the kit further comprises the following reagents: Taq enzyme, dNTPs, MgCl2 and PCR buffer.

Preferably, the kit further comprises a magnetic bead method DNA purification kit or a filter membrane adsorption method DNA purification kit.

Compared with the prior art, the invention has the following beneficial effects. The method and kit for constructing a second-generation sequencing library omit DNA end filling, 3'-end adenine addition, sequencing-primer ligation and similar steps, and reduce the number of purification steps. Because the sequencing primer and the target-sequence-specific primer are designed as one combined primer, the target-capture step is also omitted. This greatly simplifies library construction, lowers its cost, improves detection sensitivity, reduces the amount of sequencing data required for detection, effectively removes false positives, improves the enrichment efficiency of target DNA fragments and reduces wasted sequencing data. The method can be applied to sequencing of target sequences, including second-generation sequencing detection of tumor-derived cell-free DNA (ctDNA) in the plasma of cancer patients, low-proportion subclonal mutations in cancer tissue samples (such as FFPE), fetal cell-free DNA in the plasma of pregnant women, and viral DNA in the plasma of patients with AIDS, hepatitis and similar diseases.

Drawings

FIG. 1 is a schematic diagram of sequencing according to the present invention;

FIG. 2 is a flowchart of the PCR program settings in an embodiment of the present invention.

Detailed Description

The technical solutions in the embodiments of the present invention are described clearly and completely below. The described embodiments are only some, not all, of the embodiments of the present invention. All other embodiments that a person skilled in the art can derive from the embodiments given herein without creative effort fall within the protection scope of the present invention.

In one aspect, the invention provides a method for constructing a second-generation DNA sequencing library, comprising:

Step 1: extracting the genomic DNA to be sequenced, including but not limited to DNA from various tumor tissues, cell-free DNA from peripheral blood (of pregnant women, healthy individuals or tumor patients), or DNA from various body fluids such as, but not limited to, pleural effusion, urine and saliva;

Step 2: designing primer sequences that contain a target-sequence-specific primer together with a sample tag sequence (index) and a sequencing primer, and synthesizing the primers;

Step 3: mixing the specific primers with the DNA to be tested and performing PCR amplification;

Step 4: purifying the PCR product;

Step 5: sequencing on the instrument.

The primers of step 2 have the structure shown in FIG. 1; specifically, but without limitation, the annealing product of the single-stranded DNA having the nucleotide sequence shown in SEQ ID NO: 1 and the single-stranded DNA having the nucleotide sequence shown in SEQ ID NO: 2 is used as the adapter.

In the library construction method of the present invention, PCR amplification is performed only once, in step 3.

In the library construction method of the present invention, the amount of DNA in step 1 is not particularly limited. It should be noted, however, that the method is suitable for library construction from very small amounts of sample, so the amount of DNA in step 1 may be 1 to 200 ng, for example 5 to 50 ng.
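To give a feel for what these input amounts mean for low-frequency mutation detection, the following rough sketch converts a DNA mass into an approximate number of haploid human genome copies. It assumes about 3.3 pg of DNA per haploid human genome, a commonly cited approximation that does not come from this document; the function names are illustrative only.

```python
HAPLOID_GENOME_PG = 3.3  # assumed approximate mass of one haploid human genome


def genome_copies(input_ng):
    """Approximate haploid genome copies contained in input_ng of human DNA."""
    if input_ng <= 0:
        raise ValueError("input_ng must be positive")
    return input_ng * 1000.0 / HAPLOID_GENOME_PG  # 1 ng = 1000 pg


def expected_mutant_copies(input_ng, allele_fraction):
    """Expected number of mutant template copies at a given allele fraction."""
    if not 0.0 <= allele_fraction <= 1.0:
        raise ValueError("allele_fraction must be between 0 and 1")
    return genome_copies(input_ng) * allele_fraction


# Input amounts spanning the 1-200 ng range, at a 1% mutant allele fraction.
for ng in (1, 10, 200):
    copies = genome_copies(ng)
    mutants = expected_mutant_copies(ng, 0.01)
    print(f"{ng} ng: about {copies:.0f} copies, about {mutants:.1f} mutant copies at 1%")
```

Under this assumption, an input in the low-nanogram range carries only a handful of mutant templates at a 1% allele fraction, which is why every purification step that loses template matters.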

In the library construction method of the present invention, PCR amplification is performed only once, in step 3 (for example, with 10 to 30 temperature cycles), and there is no separate step of PCR-amplifying adapter-ligated DNA fragments. This reduces the mismatches introduced by PCR amplification and effectively reduces false positives.

Sequencing in the DNA mutation detection method of the present invention may be performed, for example, on an Illumina platform (e.g., HiSeq 2500 or NextSeq 500).

In another aspect, the present invention also provides a kit for constructing a second-generation sequencing DNA library, which can be used to perform the library construction method of the present invention. The kit comprises reagents for constructing a second-generation sequencing DNA library, which comprise:

single-stranded DNA having the nucleotide sequence shown in SEQ ID NO: 1 and single-stranded DNA having the nucleotide sequence shown in SEQ ID NO: 2, or their annealing product.

In another aspect, the present invention also provides a kit for detecting low-frequency mutations in DNA, which can be used to perform the detection method of the present invention, comprising:

reagents for constructing a second-generation sequencing DNA library, and reagents for sequencing that library on the instrument;

wherein the reagents for constructing the second-generation sequencing DNA library comprise:

single-stranded DNA having the nucleotide sequence shown in SEQ ID NO: 1 and single-stranded DNA having the nucleotide sequence shown in SEQ ID NO: 2, or their annealing product; and the reagents for sequencing the second-generation sequencing DNA library on the instrument comprise one or more of the following: DNA polymerase, dNTPs, hybridization/wash buffer, 100% formamide (mass/volume), Read 2 sequencing primer, Index i7 sequencing primer, Read 1 sequencing primer, HiSeq Rapid PE Flow Cell, water, and reagents for enhancing light sensitivity/imaging.

The present invention will be described in further detail with reference to the following examples, which are intended to illustrate and not limit the present invention.

Example 1

The library construction method of the invention was used to construct a second-generation sequencing DNA library for detecting mutations in the AKT1, TP53 and PIK3CA genes.

1.1 Specific primer design

Specific primers (corresponding to the single-stranded DNA shown in SEQ ID NO: 3) were designed: AKT1-T219P for detecting AKT1 NM_001014431: c.A655C: p.T219P; TP53-T245P for detecting TP53 NM_001126115: c.A733C: p.T245P; and PIK3CA-H1047R for detecting PIK3CA NM_006218: c.A3140G: p.H1047R.
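The coding-sequence position and the amino-acid position in each variant name are linked by simple arithmetic: a 1-based cDNA position p falls in codon (p - 1) // 3 + 1. The sketch below checks the three target variants for internal consistency, assuming the cDNA positions are numbered from the first base of the start codon; the regular expressions and function names are illustrative helpers, not part of the described method.

```python
import re

# Patterns for the short notations used in the text, e.g. "c.A655C", "p.T219P".
C_RE = re.compile(r"^c\.([ACGT])(\d+)([ACGT])$")
P_RE = re.compile(r"^p\.([A-Z])(\d+)([A-Z])$")


def codon_of(cdna_pos):
    """Return (codon number, position within codon) for a 1-based cDNA position."""
    if cdna_pos < 1:
        raise ValueError("cDNA position must be at least 1")
    return (cdna_pos - 1) // 3 + 1, (cdna_pos - 1) % 3 + 1


def is_consistent(c_change, p_change):
    """True if the cDNA position falls in the codon named by the protein change."""
    c_match = C_RE.match(c_change)
    p_match = P_RE.match(p_change)
    if not c_match or not p_match:
        raise ValueError("unrecognised variant notation")
    codon, _ = codon_of(int(c_match.group(2)))
    return codon == int(p_match.group(2))


VARIANTS = [
    ("AKT1", "c.A655C", "p.T219P"),
    ("TP53", "c.A733C", "p.T245P"),
    ("PIK3CA", "c.A3140G", "p.H1047R"),
]

for gene, c_change, p_change in VARIANTS:
    pos = int(C_RE.match(c_change).group(2))
    print(gene, c_change, p_change, codon_of(pos), is_consistent(c_change, p_change))
```

This check only confirms that the position numbers agree with each other; it does not verify the reference bases or amino acids, which would require the transcript sequences.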

The specific primer sequences are shown in the following table

1.2 DNA extraction

Two plasma samples were selected. Cell-free DNA (samples 1 and 2) was extracted from 2 mL of plasma by the magnetic bead method, and 10 ng of cell-free DNA from each sample was used for library construction. The specific primers were used to detect the target sequences; the two samples were processed identically except for their indexes.

5' Adapter:

5'-GATCGGAAGAGCTCGTATGCCGTCTTCTGCTTG-3'

3' Index-41 primer (for sample 1):

5'-CGTGATGT-3'

3' Index-42 primer (for sample 2):

5'-GTCAGTCG-3'

3' Adapter:

5'-CAAGCAGAAGACGGCATACGAGAT
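Because the two samples differ only in their index, reads can later be assigned to a sample by comparing the observed index with Index-41 and Index-42. The sketch below is a simplified illustration: it compares sequences in the orientation listed above (the orientation actually observed in an index read depends on the platform and primer layout), and the mismatch tolerance of 1 is an assumption, not a setting from this document.

```python
# Index sequences as listed for Example 1 (Index-41 and Index-42).
INDEXES = {"sample 1": "CGTGATGT", "sample 2": "GTCAGTCG"}


def hamming(a, b):
    """Number of positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return sum(x != y for x, y in zip(a, b))


def assign_sample(index_read, max_mismatches=1):
    """Return the unique sample within max_mismatches of index_read, else None."""
    hits = [
        name
        for name, idx in INDEXES.items()
        if len(idx) == len(index_read) and hamming(idx, index_read) <= max_mismatches
    ]
    return hits[0] if len(hits) == 1 else None


print(hamming(INDEXES["sample 1"], INDEXES["sample 2"]))
print(assign_sample("CGTGATGT"))
print(assign_sample("CGTGATGA"))  # one mismatch relative to Index-41
print(assign_sample("AAAAAAAA"))  # matches neither index
```

The two indexes differ at most positions, so a single sequencing error in the index cannot turn one into the other under this tolerance.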

PCR reaction: the PCR program was set as shown in FIG. 2. After the reaction, the samples were promptly removed and stored in a 4 °C refrigerator, and the program was exited or the instrument shut down as required.

1.3 Recovery and purification of PCR products

The PCR product in the reaction system was recovered and purified using 0.9× AMPure magnetic beads and dissolved in 30 μL of EB.
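As a worked volume calculation, the sketch below reads "0.9×" in the usual way, as the ratio of bead volume to sample volume. It assumes that the preferred reaction system given earlier (20 μL DNA, 5 μL primer mix, 25 μL master mix) was used in this example and that the whole reaction went into the cleanup; neither is stated explicitly for Example 1.

```python
# Preferred per-reaction volumes (microlitres) from the description.
REACTION_UL = {"DNA": 20.0, "PCR primer mix": 5.0, "PCR master mix": 25.0}
BEAD_RATIO = 0.9  # "0.9x" bead-to-sample volume ratio from section 1.3


def bead_volume_ul(sample_ul, ratio=BEAD_RATIO):
    """Bead volume for a cleanup at the given bead:sample volume ratio."""
    if sample_ul <= 0 or ratio <= 0:
        raise ValueError("sample volume and ratio must be positive")
    return round(sample_ul * ratio, 2)


total_ul = sum(REACTION_UL.values())
print(f"reaction volume: {total_ul} uL")
print(f"bead volume at {BEAD_RATIO}x: {bead_volume_ul(total_ul)} uL")
```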

1.4 Library quantification

The library was assessed on a 2100 Bioanalyzer (Agilent)/LabChip GX (Caliper) and by qPCR, and its quality was checked.

1.5 The constructed library was sequenced in PE100 mode on an Illumina HiSeq™ 2500.

1.6 The bioinformatics data finally obtained are shown in the following table:

Raw data: the total amount of data produced by sequencing on the instrument;

Q20 and Q30: in high-throughput sequencing, every called base is assigned a quality value that measures sequencing accuracy. In industry usage, Q20 and Q30 denote the percentage of bases with a quality value ≥ 20 or ≥ 30, respectively. A quality value of 20 means that, during base calling, the called base has an error probability of 1%, i.e., an accuracy of 99%; a quality value of 30 means an error probability of 0.1%, i.e., an accuracy of 99.9%;
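The error probabilities quoted for Q20 and Q30 follow from the standard Phred relationship Q = -10·log10(P), where P is the error probability of the base call. The short sketch below applies that relationship and computes Q20 and Q30 percentages for a list of per-base quality values; the quality values are made-up illustrative numbers, not data from this example.

```python
def error_probability(q):
    """Error probability implied by Phred quality value q: P = 10^(-q/10)."""
    return 10 ** (-q / 10)


def percent_at_least(qualities, threshold):
    """Percentage of bases whose quality value is >= threshold."""
    if not qualities:
        raise ValueError("qualities must not be empty")
    return 100.0 * sum(q >= threshold for q in qualities) / len(qualities)


# Hypothetical per-base quality values for illustration only.
quals = [38, 36, 31, 29, 22, 18, 12, 35, 30, 20]

print(error_probability(20), error_probability(30))
print(percent_at_least(quals, 20), percent_at_least(quals, 30))
```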

Alignment rate: the percentage of sequencing data, after low-quality filtering, that aligns to the reference genome;

Targeted capture efficiency: the amount of data aligned to the target region divided by the amount of data aligned to the reference genome, × 100%; that is, the percentage of reference-aligned data that falls within the target region.
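The two ratios defined above can be written out as a small calculation. The counts below are hypothetical and only show how the denominators differ: the alignment rate is taken relative to the filtered data, while the capture efficiency is taken relative to the aligned data. The "amount of data" may be counted in reads or in bases, as long as both numerator and denominator use the same unit.

```python
def percentage(part, whole):
    """Return part as a percentage of whole, with basic sanity checks."""
    if whole <= 0:
        raise ValueError("whole must be positive")
    if not 0 <= part <= whole:
        raise ValueError("part must be between 0 and whole")
    return 100.0 * part / whole


# Hypothetical read counts for illustration only.
FILTERED_READS = 1_000_000   # reads remaining after low-quality filtering
ALIGNED_READS = 950_000      # reads aligned to the reference genome
ON_TARGET_READS = 760_000    # aligned reads falling in the target region

alignment_rate = percentage(ALIGNED_READS, FILTERED_READS)
capture_efficiency = percentage(ON_TARGET_READS, ALIGNED_READS)

print(f"alignment rate: {alignment_rate:.1f}%")
print(f"targeted capture efficiency: {capture_efficiency:.1f}%")
```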

The method and the kit for constructing the second-generation DNA sequencing library can conveniently and quickly construct the sequencing library, greatly reduce the library construction cost and the sequencing cost, effectively remove false positives, improve the enrichment efficiency of target DNA fragments and reduce the waste of sequencing data.

It is noted that, herein, relational terms such as first and second, and the like may be used solely to distinguish one entity or action from another entity or action without necessarily requiring or implying any actual such relationship or order between such entities or actions. Also, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus.

While the invention has been described above with reference to an embodiment, various modifications may be made and equivalents may be substituted for elements thereof without departing from the scope of the invention. In particular, the various features of the embodiments disclosed herein may be used in any combination, provided that there is no structural conflict, and the combinations are not exhaustively described in this specification merely for the sake of brevity and conservation of resources. Therefore, it is intended that the invention not be limited to the particular embodiments disclosed, but that the invention will include all embodiments falling within the scope of the appended claims.
