PCR primer and application thereof in DNA fragment connection

Document No.: 1138390 Publication date: 2020-10-09

Reading note: This technology, "PCR primer and application thereof in DNA fragment connection", was designed and created by 陈智超, 唐冲, 阮凤英, 郭梅, 石卓兴 and 杨林峰 on 2019-03-28. Main content: The invention provides a PCR primer and its application in DNA fragment connection. The PCR primer comprises a palindromic sequence segment and an arbitrary PCR primer sequence segment, wherein the 3' end of the palindromic sequence segment is connected to the 5' end of the arbitrary PCR primer sequence segment, the nucleotide at the 3' end of the palindromic sequence segment is U, and the arbitrary PCR primer sequence segment is complementary to the 3' end of the pre-amplification DNA template. The primer according to the embodiment of the invention can be used to construct a sequencing library. When it is used as the primer for PCR amplification of a DNA template during library construction, both ends of the resulting double-stranded amplification product carry the palindromic sequence structure with a nucleotide U at its 3' end, which lays the foundation for subsequently ligating different double-stranded amplification products to one another.

1. A PCR primer, characterized by comprising a palindromic sequence segment and an arbitrary PCR primer sequence segment, wherein the 3' end of the palindromic sequence segment is connected to the 5' end of the arbitrary PCR primer sequence segment, the nucleotide at the 3' end of the palindromic sequence segment is U, and the arbitrary PCR primer sequence segment is complementary to the 3' end of a pre-amplification DNA template.

2. The primer of claim 1, wherein the palindromic sequence segment is 2 bp, 4 bp, 6 bp, 8 bp, 10 bp, or 12 bp in length;

preferably, the palindromic sequence segment has the nucleotide sequence shown in SEQ ID NO: 1 or 2.

3. The primer of claim 1, wherein the arbitrary PCR primer sequence segment is 15 to 35 bp in length;

preferably, the arbitrary PCR primer sequence segment is 25 bp in length.

4. A method of PCR amplification, wherein a pre-amplification DNA template is amplified using PCR primers and a PCR enzyme, said PCR primers being as defined in any one of claims 1 to 3.

5. The method of claim 4, wherein the pre-amplification DNA template is cDNA.

6. The method of claim 5, wherein the cDNA is obtained by:

(1) reverse transcribing RNA having a polyA tail at its 3' end in the presence of a polyT primer and MMLV reverse transcriptase, wherein, when reverse transcription reaches the 5' end of the RNA, the terminal transferase activity of the MMLV reverse transcriptase adds a plurality of C residues to the 3' end of the single-stranded cDNA reverse transcription product;

(2) annealing a template-switching primer having a plurality of G residues at its 3' end to the single-stranded cDNA obtained in step (1) through C-G base pairing, and extending the template-switching primer in the presence of MMLV reverse transcriptase using the single-stranded cDNA as a template, so as to obtain a double-stranded cDNA reverse transcription product.

7. A method of constructing a sequencing library, comprising:

(1) amplifying a DNA template using the method of any one of claims 4 to 6;

(2) digesting the amplification product obtained in the step (1) by using a USER enzyme so as to generate sticky ends;

(3) subjecting the digestion product obtained in step (2) to a ligation treatment;

(4) subjecting the ligation product obtained in step (3) to adapter ligation to obtain a sequencing library.

8. The method of claim 7, wherein after the ligation treatment, the method further comprises performing end repair and 3' A-tailing on the ligation product, and performing T-A ligation between an adapter with a free T at the 5' end and the end-repaired, A-tailed product to obtain a sequencing library.

9. The method of claim 7, wherein after step (1) and before step (2), further comprising subjecting the amplification product to a first purification treatment;

optionally, after step (3) and before step (4), further comprising subjecting the ligation-treated product to a second purification treatment.

10. A method for sequencing a full-length transcriptome, comprising:

constructing a sequencing library using the method of any one of claims 7 to 9; and

sequencing the sequencing library using the PacBio sequencing platform to obtain sequence information for the full-length transcriptome.

Technical Field

The invention relates to the field of biotechnology, in particular to a PCR primer and its application in DNA fragment connection, and more particularly to a PCR primer, a PCR amplification method, a method for constructing a sequencing library, and a method for sequencing a full-length transcriptome.

Background

Genome and transcriptome sequencing is fundamental work in the life sciences. Because most non-model organisms lack genome data, sequencing of the full-length transcriptome is particularly important: full-length transcripts greatly facilitate basic and applied research on a species' gene function, regulation of gene expression, evolutionary relationships and more. Currently, the vast majority of transcriptome data is obtained with second-generation high-throughput sequencing technologies. However, second-generation reads are short; assembly of short reads cannot recover large numbers of long transcripts and loses important information such as alternative splicing. PacBio third-generation sequencing is therefore used for de novo transcriptome sequencing. When the PacBio RS II was introduced in 2013, its throughput was low and the high sequencing cost deterred researchers. In October 2015, PacBio launched the Sequel platform, which greatly improved the throughput of full-length transcriptome sequencing to 5-10 times that of the PacBio RS II. In view of the potential of third-generation sequencing in scientific research and even clinical applications, many companies have successively introduced the Sequel platform to offer full-length transcriptome sequencing services based on PacBio third-generation sequencing.

Disclosure of Invention

The present application is based on the discovery and recognition by the inventors of the following facts and problems:

Usually, the average transcript length is less than 2 kb, and the average length of a transcriptome library obtained by conventional library construction is also less than 2 kb, whereas the polymerase read length of the PacBio sequencing platform can exceed 20 kb; that is, each molecule of the transcriptome library is sequenced more than 10 times (passes) on average. According to statistics, once the number of passes reaches 3, the sequencing QV value can reach 0.9, which is sufficient for accurate transcriptome analysis. The additional 7 passes can therefore be regarded as redundant data; in other words, the existing full-length transcriptome library construction method cannot make full use of the polymerase read length of the PacBio platform. Recently, PacBio upgraded its reagents and chips, after which the polymerase read length can exceed 80 kb, so transcriptome sequencing would generate even more redundant data if the existing library construction method were still used.
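The arithmetic behind this redundancy argument can be sketched as follows. This is a rough illustration only: the function names are hypothetical, adapter length is ignored, and the 2 kb, 20 kb, 80 kb and 3-pass figures are simply the round numbers quoted above.

```python
def passes(read_length_bp: int, insert_length_bp: int) -> float:
    """Approximate number of passes over one insert within one polymerase read."""
    return read_length_bp / insert_length_bp


def max_concatenated(read_length_bp: int, transcript_length_bp: int,
                     min_passes: int = 3) -> int:
    """Largest number of transcripts per insert that still leaves min_passes passes."""
    return read_length_bp // (min_passes * transcript_length_bp)


TRANSCRIPT_BP = 2_000  # average transcript length quoted in the text

for read_len in (20_000, 80_000):
    n_passes = passes(read_len, TRANSCRIPT_BP)
    redundant = n_passes - 3
    n_joined = max_concatenated(read_len, TRANSCRIPT_BP)
    print(f"read length {read_len} bp: {n_passes:.0f} passes, "
          f"{redundant:.0f} redundant, up to {n_joined} transcripts per insert")
```

Under these assumptions, a 20 kb read covers a 2 kb insert 10 times, leaving 7 passes beyond the 3 that are needed, which is the headroom that joining several transcripts into one insert is meant to use.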

The present invention is directed to solving, at least to some extent, one of the technical problems in the related art.

In a first aspect, the invention provides a PCR primer. According to an embodiment of the invention, the PCR primer comprises a palindromic sequence segment and an arbitrary PCR primer sequence segment, wherein the 3' end of the palindromic sequence segment is connected to the 5' end of the arbitrary PCR primer sequence segment, the nucleotide at the 3' end of the palindromic sequence segment is U, and the arbitrary PCR primer sequence segment is complementary to the 3' end of the pre-amplification DNA template. The primer according to the embodiment of the invention can be used to construct a sequencing library. When it is used as the primer for PCR amplification of a DNA template during library construction, both ends of the resulting double-stranded amplification product carry the palindromic sequence structure with a nucleotide U at its 3' end, which lays the foundation for subsequently ligating different double-stranded amplification products to one another.

According to an embodiment of the present invention, the primer may further have at least one of the following additional features:

according to the embodiment of the present invention, the length of the palindromic sequence segment is not particularly limited, as long as it permits the subsequent sticky-end ligation of the double-stranded DNAs of different amplification products to one another. According to a specific embodiment of the present invention, the length of the palindromic sequence segment is 2 bp, 4 bp, 6 bp, 8 bp, 10 bp or 12 bp.

According to an embodiment of the invention, the palindromic sequence segment has the nucleotide sequence shown in SEQ ID NO: 1 or 2.

ACTAGU (SEQ ID NO: 1).

ACTCAUGAGU (SEQ ID NO: 2).

The inventors found that the palindromic sequence segment shown in SEQ ID NO: 2 contains two U residues and is therefore cut twice during USER digestion. The two strand-melting events of the digestion then take place simultaneously, so the strands separate more easily when the short sticky-end sequence is formed and the digestion proceeds more smoothly.
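The two properties discussed here, that each segment is palindromic and that a second U shortens the pieces that must melt away, can be checked with a short sketch. It treats U as T for base pairing and models USER digestion simply as removal of every U; the helper names are illustrative and not part of the disclosure.

```python
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def as_dna(seq: str) -> str:
    """Treat U like T for base-pairing purposes."""
    return seq.upper().replace("U", "T")


def reverse_complement(seq: str) -> str:
    return "".join(COMPLEMENT[base] for base in reversed(as_dna(seq)))


def is_palindromic(seq: str) -> bool:
    """A segment is palindromic if it equals its own reverse complement."""
    return as_dna(seq) == reverse_complement(seq)


def released_fragments(seq: str) -> list[str]:
    """Pieces left once every U is excised (simplified model of USER digestion)."""
    return [piece for piece in seq.upper().split("U") if piece]


SEGMENTS = {"SEQ ID NO: 1": "ACTAGU", "SEQ ID NO: 2": "ACTCAUGAGU"}

for name, segment in SEGMENTS.items():
    print(name, is_palindromic(segment), released_fragments(segment))
```

In this model SEQ ID NO: 1 releases one 5-nt piece, while SEQ ID NO: 2 releases a 5-nt and a 3-nt piece rather than a single 9-nt piece, which is consistent with the easier melting described above.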

According to the embodiment of the invention, the length of the arbitrary PCR primer sequence segment is 15 to 35 bp, preferably 25 bp. The arbitrary sequence segment according to the embodiment of the invention has the length required by conventional primer design; with a length in this range, the segment can bind specifically to the target gene fragment and amplify it effectively.
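A primer that follows these constraints can be assembled and validated with a few lines of code. This is a minimal sketch: `build_primer` is a hypothetical helper, and the 25-nt target-binding segment below is an arbitrary placeholder, not a sequence from the disclosure.

```python
def build_primer(palindrome: str, target_segment: str) -> str:
    """Join a U-terminated palindromic segment (5') to a target-binding segment (3')."""
    palindrome = palindrome.upper()
    target_segment = target_segment.upper()
    if not palindrome.endswith("U"):
        raise ValueError("the 3' nucleotide of the palindromic segment must be U")
    if len(palindrome) % 2 != 0 or not 2 <= len(palindrome) <= 12:
        raise ValueError("palindromic segment should be 2, 4, ..., 12 nt long")
    if not 15 <= len(target_segment) <= 35:
        raise ValueError("target-binding segment should be 15 to 35 nt long")
    return palindrome + target_segment


# Placeholder 25-nt target-binding segment (hypothetical, for illustration only).
PLACEHOLDER_SEGMENT = "GATTACAGATTACAGATTACAGATT"

primer = build_primer("ACTAGU", PLACEHOLDER_SEGMENT)
print(primer, len(primer))
```

The checks only cover length and the terminal U; whether the target-binding segment is actually specific for a given template still has to be established by conventional primer design.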

In a second aspect, the invention provides a method of PCR amplification. According to an embodiment of the invention, the pre-amplification DNA template is amplified using PCR primers and a PCR enzyme, the PCR primers being the primers described above. Both ends of the double-stranded DNA obtained by this amplification method carry the palindromic sequence structure with a nucleotide U at its 3' end; under the action of an endonuclease, the palindromic sequence structure can be cut on the 3' side of the nucleotide U to generate a sticky end, which lays the foundation for the subsequent pairwise ligation of amplification products.

According to an embodiment of the present invention, the method may further include at least one of the following additional technical features:

according to an embodiment of the invention, the pre-amplified DNA template is cDNA. The PCR amplification method according to the embodiment of the present invention can be used for construction of a transcript library.

According to an embodiment of the present invention, the cDNA is obtained as follows: (1) reverse transcribing RNA having a polyA tail at its 3' end in the presence of a polyT primer and MMLV reverse transcriptase, wherein, when reverse transcription reaches the 5' end of the RNA, the terminal transferase activity of the MMLV reverse transcriptase adds a plurality of C residues to the 3' end of the single-stranded cDNA reverse transcription product; (2) annealing a template-switching primer having a plurality of G residues at its 3' end to the single-stranded cDNA obtained in step (1) through C-G base pairing, and extending the template-switching primer in the presence of MMLV reverse transcriptase using the single-stranded cDNA as a template, so as to obtain a double-stranded cDNA reverse transcription product. With cDNA obtained in this way, the invention uses the template-switching principle to construct transcript cDNA whose two ends are complementary, that is, transcript cDNA without single-stranded regions at either end. This lays the foundation both for the subsequent amplification of the cDNA with the PCR primers described above and for generating identical palindromic sticky ends when both ends of the amplification product are later digested.
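The two reverse transcription steps can be followed on toy sequences. The sketch below is a deliberately simplified string model: the RNA and the template-switching primer are made-up examples, the polyT primer is represented only by the T run complementary to the polyA tail, and exactly three non-templated C residues are assumed.

```python
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def reverse_complement(seq: str) -> str:
    """Reverse complement in the DNA alphabet (U in RNA input is read as T)."""
    dna = seq.upper().replace("U", "T")
    return "".join(COMPLEMENT[base] for base in reversed(dna))


def first_strand_cdna(rna: str, extra_c: int = 3) -> str:
    """Step (1): copy the RNA, then append non-templated C's to the cDNA 3' end."""
    return reverse_complement(rna) + "C" * extra_c


def second_strand(cdna: str, ts_primer: str, extra_c: int = 3) -> str:
    """Step (2): anneal the primer's 3' G's to the cDNA's 3' C's and extend it."""
    if reverse_complement(ts_primer[-extra_c:]) != cdna[-extra_c:]:
        raise ValueError("primer 3' end does not pair with the cDNA 3' end")
    return ts_primer + reverse_complement(cdna[:-extra_c])


rna = "GGAUCCGUAC" + "A" * 12      # hypothetical transcript with a polyA tail
ts_primer = "ACGTTGCA" + "GGG"     # hypothetical template-switching primer

cdna = first_strand_cdna(rna)
print("first strand :", cdna)
print("second strand:", second_strand(cdna, ts_primer))
```

In this model the second strand is the template-switching primer followed by the transcript sequence in the DNA alphabet, so the primer sequence ends up at the end of the cDNA that corresponds to the 5' end of the RNA.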

In a third aspect, the invention provides a method of constructing a sequencing library. According to an embodiment of the invention, the method comprises: (1) amplifying the DNA template using the method described above; (2) digesting the amplification product obtained in step (1) with USER enzyme so as to generate sticky ends; (3) subjecting the digestion product obtained in step (2) to a ligation treatment; (4) subjecting the ligation product obtained in step (3) to adapter ligation to obtain a sequencing library. The sequencing library obtained by this library construction method has an effectively increased average length and can make full use of the polymerase read length of the PacBio platform, which reduces redundant sequencing data, yields more transcript data and improves the effective utilization of the sequencing data.

According to an embodiment of the present invention, the method may further include at least one of the following additional technical features:

according to an embodiment of the present invention, after the ligation treatment, the method further comprises performing end repair and 3' A-tailing on the ligation product, and performing T-A ligation between an adapter with a free T at the 5' end and the end-repaired, A-tailed product, so as to obtain a sequencing library. The inventors found that T-A ligation effectively avoids self-ligation of the adapter and improves the efficiency with which the ligation product is joined to the adapter.

According to an embodiment of the invention, the adapter has a hairpin structure.

According to a specific embodiment of the present invention, the adapter with a free T at the 5' end has the nucleotide sequence shown in SEQ ID NO: 3.

CTGCTCGTCAATCTCTCTCTTTTCCTCCTCCTCCGTTGTTGTTGTTGAGAGAGATTGACGAGCAGT (SEQ ID NO: 3).
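The hairpin character of this adapter can be examined directly from the listed sequence. The following sketch sets aside the last nucleotide of the string as a candidate single-base overhang and then counts how many bases at the two ends of the remainder pair with each other; the variable names are illustrative, and no thermodynamic folding is attempted.

```python
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}

ADAPTER = "CTGCTCGTCAATCTCTCTCTTTTCCTCCTCCTCCGTTGTTGTTGTTGAGAGAGATTGACGAGCAGT"


def stem_length(body: str) -> int:
    """Count consecutive base pairs formed between the two ends of the sequence."""
    pairs = 0
    while pairs < len(body) // 2 and COMPLEMENT[body[pairs]] == body[-1 - pairs]:
        pairs += 1
    return pairs


overhang = ADAPTER[-1]    # last nucleotide of the sequence as listed
body = ADAPTER[:-1]       # remainder, expected to fold back on itself
stem = stem_length(body)
loop = len(body) - 2 * stem

print(f"length {len(ADAPTER)} nt, stem {stem} bp, loop {loop} nt, unpaired end {overhang}")
```

For the sequence as listed, this simple pairing check finds a 19-bp stem closing a 27-nt loop, with a single unpaired T left over at the end of the string, which is the base available for T-A ligation to an A-tailed insert.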

According to an embodiment of the present invention, after step (1) and before step (2), the method further comprises subjecting the amplification product to a first purification treatment. This further improves the digestion efficiency of the USER enzyme.

According to an embodiment of the present invention, after step (3) and before step (4), the method further comprises subjecting the ligation product to a second purification treatment. This further improves the efficiency of adapter ligation.

In a fourth aspect, the invention provides a method for sequencing a full-length transcriptome. According to an embodiment of the invention, the method comprises: constructing a sequencing library using the method described above; and sequencing the sequencing library using the PacBio sequencing platform to obtain sequence information for the full-length transcriptome. The method for sequencing a full-length transcriptome according to the embodiment of the invention makes full use of the polymerase read length of the PacBio platform, reduces redundant sequencing data, yields more transcript data and improves the effective utilization of the sequencing data.

Drawings

FIG. 1 is a schematic representation of a reverse transcription process according to an embodiment of the present invention;

FIG. 2 is a schematic flow diagram of a method of PCR amplification and obtaining a sequencing library according to an embodiment of the invention; and

FIG. 3 is a schematic diagram of a PB dT adapter (an adapter carrying one additional dT) according to an embodiment of the present invention.

Detailed Description

Reference will now be made in detail to embodiments of the present invention, examples of which are illustrated in the accompanying drawings. The embodiments described below with reference to the drawings are illustrative and intended to be illustrative of the invention and are not to be construed as limiting the invention.

To address the problem that the existing full-length transcriptome library construction method cannot make full use of the polymerase read length of the PacBio platform, the invention provides a scheme for constructing a full-length transcriptome library by end-to-end concatenation: the PCR products of the transcripts are joined end to end through sticky ends, which increases the average length of the library so that more transcript data can be obtained after sequencing.

According to the embodiment of the invention, the reverse transcription step is the same as in conventional library construction: transcript cDNA whose two ends are complementary is constructed using the template-switching principle. In the subsequent PCR step, a palindromic sequence containing a U base is introduced on the PCR primer, so that the palindromic sequence is generated at the 5' end of the PCR product of the cDNA. The U base can be digested and removed by the Uracil-DNA Glycosylase (UDG) and Endonuclease VIII in the USER enzyme, which generates a palindromic sticky end, and the products carrying the palindromic sticky ends are then joined using DNA ligase. Because the sticky ends are palindromic, different products can be ligated to one another, and a standard dumbbell-shaped library is obtained after end repair, A-tailing and adapter ligation. According to the embodiment of the invention, ligating the PCR products of the transcripts in this way yields a longer library; sequencing the ligated library maintains data quality while greatly improving data utilization.
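The digestion and ligation steps can be traced on toy sequences. The sketch below is a simplified model under stated assumptions: both primers carry SEQ ID NO: 1, the short 5' piece upstream of each excised U is assumed to dissociate completely, the inserts are made-up sequences, and the function names are hypothetical.

```python
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}
SITE = "ACTAGU"  # SEQ ID NO: 1, carried at the 5' end of both primers


def reverse_complement(dna: str) -> str:
    return "".join(COMPLEMENT[base] for base in reversed(dna))


def amplicon(insert: str, site: str = SITE) -> tuple[str, str]:
    """Top and bottom strands (both written 5'->3') of the PCR product."""
    site_copy = reverse_complement(site.replace("U", "T"))  # polymerase copies U as A
    top = site + insert + site_copy
    bottom = site + reverse_complement(insert) + site_copy
    return top, bottom


def user_digest(strand: str, site: str = SITE) -> str:
    """Drop the 5' palindromic segment up to and including U (simplified model)."""
    if not strand.startswith(site):
        raise ValueError("strand does not start with the palindromic segment")
    return strand[len(site):]


def ligate(upstream: str, downstream: str) -> str:
    """Seal the nick between two annealed molecules on one strand."""
    return upstream + downstream


top_a, bottom_a = (user_digest(s) for s in amplicon("ATGGCC"))
top_b, bottom_b = (user_digest(s) for s in amplicon("TTGCAAGG"))

overhang = top_a[-len(SITE):]               # 3' overhang left on each end
joined_top = ligate(top_a, top_b)           # molecule A followed by molecule B
joined_bottom = ligate(bottom_b, bottom_a)  # same junction on the other strand

print("3' overhang:", overhang)
print("joined top :", joined_top)
```

In this model every digested end carries the same self-complementary 3' overhang, so any two ends can anneal, and the T at the tip of one overhang occupies the position vacated by the U of the partner molecule. Because the overhang is palindromic, molecules can also join in the opposite relative orientation, which this sketch does not enumerate.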

For ease of understanding, applicants show the construction process of the transcriptome library according to an embodiment of the present invention in fig. 1 and 2, wherein fig. 1 is a schematic reverse transcription process, and fig. 2 is a schematic PCR amplification and sequencing library obtaining process.

