Method for eliminating self-ligated adapters from a sequencing library, and application thereof

Document No.: 417583 Publication date: 2021-12-21

Reading note: this technology, "Method for eliminating self-ligated adapters from a sequencing library, and application thereof", was designed and created by 曹德盼, 李东东, 赵翊婷, 程珂燕, 余盼, 贾雪峰 and 蒋智 on 2021-09-22. The application relates to a method for eliminating self-ligated adapters from a sequencing library and to its application. The method takes the sequence of the self-ligated sequencing adapters in the library as the target sequence, designs short guide DNAs against it, and uses them together with an Argonaute endonuclease to introduce double-strand breaks into the dsDNA library molecules, cutting the self-ligated adapters out of the dsDNA library molecules so that they are not amplified in the subsequent PCR reaction. The method can significantly reduce the proportion of self-ligated adapters in the library, increase the number of sequencing clean reads, and improve the effective data rate.

1. A method for eliminating self-ligated adapters from a sequencing library, the method comprising the steps of:

1) designing guide DNA against the sequence of the self-ligated adapter in the sequencing library;

2) adding the guide DNA and an Argonaute endonuclease to the library to be treated, and carrying out a targeted cleavage reaction;

preferably, the method further comprises the following step:

3) removing the Argonaute endonuclease and guide DNA components from the system.

2. The method for eliminating self-ligated adapters from a sequencing library of claim 1, wherein designing the guide DNA against the self-ligated adapter sequence in step 1) comprises:

designing forward and reverse guide DNAs, taking as target sequences the 7-15 bp sequences adjacent to each side of the junction between the two adapters in the self-ligated adapter;

preferably, the guide DNA sequence is complementary to the self-ligated adapter sequence.

3. The method for eliminating self-ligated adapters from a sequencing library according to any one of claims 1-2, wherein the guide DNA satisfies any one or more of:

a. the guide DNA is 5'-phosphorylated;

b. the length of the guide DNA is 15-30 bp;

preferably, the guide DNA further satisfies any one or more of:

c. the 1st base of the guide DNA is T;

d. the 12th base of the guide DNA is not adenosine;

e. the guide DNA sequence has low GC content.

4. The method for eliminating self-ligated adapters from a sequencing library according to any one of claims 1 to 3, wherein the Argonaute endonuclease in step 2) is a TtAgo, pfAgo, AaAgo, MjAgo or pAgo enzyme;

preferably, it is a TtAgo enzyme;

more preferably, the ratio of TtAgo enzyme concentration to guide DNA concentration is 1:3-10;

further preferably, the guide DNA and the TtAgo enzyme are added as follows: the guide DNA and the TtAgo enzyme are mixed in 1× NEB ThermoPol Reaction Buffer and incubated at 70-75 °C for 5-10 min;

still further preferably, the targeted cleavage reaction conditions are: incubation at 70-80 °C for 30-40 min, followed by cooling at 3-5 °C.

5. The method for eliminating self-ligated adapters from a sequencing library according to any one of claims 1 to 4, wherein the removal in step 3) is performed by magnetic bead recovery.

6. A method of constructing a sequencing library, comprising the method of any one of claims 1 to 5 and further comprising:

4) library amplification: recovering the product, amplifying it with universal primers, and recovering the library.

7. The method for eliminating self-ligated adapters according to any one of claims 1 to 6, wherein the sequencing library is a second-generation, third-generation or fourth-generation sequencing library;

preferably, the sequencing library is a second-generation sequencing library;

more preferably, the sequencing library is an Illumina second-generation sequencing library.

8. A kit for eliminating self-ligated adapters from a sequencing library, characterized by comprising guide DNA against a self-ligated adapter sequence, an Argonaute protein endonuclease, and universal primers for library amplification;

preferably, the guide DNA sequence is complementary to the self-ligated adapter sequence, and the guide DNA takes as its target sequence the 7-15 bp sequences adjacent to each side of the junction between the two adapters in the self-ligated adapter;

more preferably, the Argonaute protein endonuclease is TtAgo.

9. A library repair agent comprising guide DNA against a self-ligated adapter sequence and an Argonaute protein endonuclease;

preferably, the guide DNA sequence is complementary to the self-ligated adapter sequence, and the guide DNA takes as its target sequence the 7-15 bp sequences adjacent to each side of the junction between the two adapters in the self-ligated adapter;

more preferably, the Argonaute protein endonuclease is TtAgo.

10. Use of a mixture of guide DNA against a self-ligated adapter sequence and an Argonaute endonuclease in eliminating self-ligated adapters from a sequencing library; preferably, the guide DNA sequence is complementary to the self-ligated adapter sequence, and the guide DNA takes as its target sequence the 7-15 bp sequences adjacent to each side of the junction between the two adapters in the self-ligated adapter;

more preferably, the Argonaute protein endonuclease is TtAgo.

Technical Field

The application relates to the technical field of gene sequencing, in particular to a method for eliminating self-ligated adapters from a sequencing library and to its application.

Background Art

In high-throughput sequencing, library quality largely determines data quality. A low-quality library leads to too many clusters or multi-template clusters and low data quality, or to few reads and low genome coverage. Library quality therefore directly affects the sequencing result, and a high-quality library is critical to obtaining high-quality nucleic acid sequencing data.

A low-quality library generally refers to one with low concentration, dimer contamination, small-fragment contamination, large-fragment contamination, or an overly broad peak profile, any of which can lower the effective data output of the whole lane and reduce the proportion of clean reads. Libraries with high adapter content are a particular problem because they contain adapter dimers. First, during on-machine sequencing the adapter dimers bind to the anchoring sequences on the flow cell and form clusters through bridge PCR amplification, which lowers the effective data yield. Second, because the adapter dimer sequence is short, it is amplified preferentially during cluster generation; its sequence is fixed, its base complexity is low and its length is short, which lowers the sequencing Q30 and affects the clean-read filtering rate. Data output drops sharply as adapter content rises, causing loss of data volume.

The current general solutions for libraries with high adapter dimer content are mainly the following. 1. At the nucleic acid level, during library construction: first, the amount of adapter used in library construction is appropriately reduced; second, the amount of magnetic beads used in library purification is adjusted. However, both measures affect the library yield, so the on-machine requirement may not be met; and if a sample available only in a very small amount shows heavy adapter dimer contamination on sequencing, the experiment fails and the sample is lost. 2. At the library level, working from the finished library: the library is re-amplified and recovered. When the adapter content is extremely high (> 80%), however, library amplification is inefficient, the proportion of target fragments remains low, and the sequencing requirement still cannot be met. A method that effectively eliminates adapter dimers at the library level, reducing adapter content so that detection requirements are met, is therefore urgently needed.

In view of this, the present application is presented.

Disclosure of Invention

The core problem to be solved by the application is to provide a method for eliminating adapter dimers that reduces adapter content and meets detection requirements during sequencing library construction.

In order to solve the above problems, the present application proposes the following technical solutions:

The present application first provides a method for eliminating self-ligated adapters from a sequencing library, comprising the steps of:

1) designing guide DNA against the self-ligated adapter sequence of the sequencing library;

2) adding guide DNA and Argonaute endonuclease into the library to be treated, and carrying out targeted enzyme digestion reaction;

Further, the method also comprises the following step:

3) removing the Argonaute endonuclease and guide DNA components from the system.

Further, designing the guide DNA against the self-ligated adapter sequence of the sequencing library in step 1) comprises:

taking the self-ligated adapter sequence as the target fragment, and designing forward and reverse guide DNAs that target the 7-15 bp adapter sequences adjacent to each side of the insert position between the two adapters.

Further, the guide DNA design includes any one or more of:

a. 5' -phosphorylation of the guide DNA;

b. the length of the guide DNA is 15-30 bp;

preferably, the guide DNA design further includes any one or more of:

c. the 1st base of the guide DNA is T;

d. the 12th base of the guide DNA is not adenosine;

e. the guide DNA sequence has low GC content.

Further, in step 2), the Argonaute endonuclease is a TtAgo, pfAgo, AaAgo, MjAgo or pAgo enzyme; the TtAgo enzyme is preferred.

In some embodiments, the ratio of TtAgo enzyme concentration to guide DNA concentration is 1:3-10;

in some embodiments, the guide DNA and Argonaute endonuclease are added to the library to be treated as follows: the guide DNA and the TtAgo enzyme are mixed in 1× NEB ThermoPol Reaction Buffer and incubated at 70-75 °C for 5-10 minutes.

In some embodiments, the targeted cleavage reaction conditions are: incubation at 70-80 °C for 30-40 min, followed by cooling at 3-5 °C.

Further, the removal in step 3) is performed by recovery with 2× magnetic beads.

The application also provides a method of constructing a sequencing library, comprising any one of the methods above and further comprising:

4) library amplification: recovering the product, amplifying it with universal primers, and recovering the library.

Further, in any of the above methods for eliminating self-ligated adapters from a sequencing library, the sequencing library is a second-generation, third-generation or fourth-generation sequencing library;

in some embodiments, the sequencing library is a second-generation sequencing library;

in some preferred embodiments, the sequencing library is an Illumina second-generation sequencing library.

The application also provides a kit for eliminating self-ligated adapters from a sequencing library, the kit comprising guide DNA against the self-ligated adapter sequence, an Argonaute protein endonuclease, and universal primers for library amplification;

in some embodiments, the Argonaute endonuclease is a TtAgo, pfAgo, AaAgo, MjAgo, or pAgo enzyme; the TtAgo enzyme is preferred.

The present application also provides a library repair agent comprising guide DNA against the self-ligated adapter sequence and an Argonaute protein endonuclease;

in some embodiments, the Argonaute protein endonuclease is TtAgo.

In some embodiments, the guide DNA against the self-ligated adapter sequence is designed in the forward and reverse directions, taking the self-ligated adapter sequence as the target fragment and the 7-15 bp adapter sequences adjacent to each side of the insert position between the adapters as the target sequences.

In some embodiments, the guide DNA design rules include any one or more of:

a. 5' -phosphorylation of the guide DNA;

b. the length of the guide DNA is 15-30 bp;

c. the 1st base of the guide DNA is T;

d. the 12th base of the guide DNA is not adenosine;

e. the guide DNA sequence has low GC content.

In some embodiments, the ratio of the final concentration of TtAgo enzyme to the final concentration of guide DNA in the kit or repair agent is 1:3-5.

The present application also provides a use, namely the use of a mixture of guide DNA against a self-ligated adapter sequence and an Argonaute endonuclease in eliminating self-ligated adapters from a sequencing library.

In some embodiments, the Argonaute endonuclease is a TtAgo, pfAgo, AaAgo, MjAgo, or pAgo enzyme; the TtAgo enzyme is preferred.

In some embodiments, the guide DNA against the self-ligated adapter sequence is designed in the forward and reverse directions, taking the self-ligated adapter sequence as the target fragment and the 7-15 bp adapter sequences adjacent to each side of the insert position between the adapters as the target sequences.

In some embodiments, the guide DNA comprises any one or more of:

a. 5' -phosphorylation of the guide DNA;

b. the length of the guide DNA is 15-30 bp;

preferably, the guide DNA further satisfies any one or more of:

c. the 1st base of the guide DNA is T;

d. the 12th base of the guide DNA is not adenosine;

e. the guide DNA sequence has low GC content.

In some embodiments, the ratio of TtAgo enzyme concentration to guide DNA concentration is 1:3-10;

in some embodiments, the mixture of guide DNA and TtAgo enzyme is prepared as follows: the guide DNA and the TtAgo enzyme are mixed in 1× NEB ThermoPol Reaction Buffer and incubated at 70-75 °C for 5-10 minutes.

In some embodiments, the targeted cleavage reaction conditions are: incubation at 70-80 °C for 30-40 min, followed by cooling at 3-5 °C.

The beneficial technical effects of the application are as follows:

1. By designing guide DNA against the self-ligated adapter and combining it with an Argonaute endonuclease, the application effectively removes self-ligated adapters without affecting the normal library. The method can significantly reduce the adapter proportion of the sequencing library, increase clean reads and improve the effective data rate.

2. The method is universal: because library adapters are universal, the application is expected to work with all libraries for the Illumina platform, whether DNA or RNA libraries, and whether PCR-amplified or PCR-free.

3. The method is effective and requires little starting library: an Illumina library of 1 ng ≤ m ≤ 18 ng is used as the template, and the effect is especially good for libraries with high adapter content (> 90%).

4. The application is simple to perform: it requires only conventional enzyme digestion and PCR operations.

5. Library quality assessment: for a high-quality library, the adapter content changes little after treatment, the slight change being caused by magnetic bead purification; for a low-quality library, the change is marked and the adapter content is greatly reduced.

Drawings

FIG. 1: Schematic of the guide DNA sequence design of the present application.

FIG. 2: Fragment analysis of the library at an initial amount of 1 ng after treatment with different enzyme:guide ratios: a, untreated control library No. 2 (U68); b, library No. 2 (U68) after adapter elimination at enzyme:guide 1:3; c, library No. 2 (U68) after adapter elimination at enzyme:guide 1:5; d, library No. 2 (U68) after adapter elimination at enzyme:guide 1:10; e, library No. 2 (U68) after adapter elimination at enzyme:guide 1:20.

FIG. 3: Fragment analysis of the library at an initial amount of 18 ng after treatment with different enzyme:guide ratios: a, untreated control library No. 2 (U68); b, library No. 2 (U68) after adapter elimination at enzyme:guide 1:3; c, library No. 2 (U68) after adapter elimination at enzyme:guide 1:5; d, library No. 2 (U68) after adapter elimination at enzyme:guide 1:10; e, library No. 2 (U68) after adapter elimination at enzyme:guide 1:20.

FIG. 4: Sequencing adapter proportion analysis after optimization of enzyme and guide DNA concentrations.

FIG. 5: Fragment analysis of libraries at an initial amount of 1 ng before and after TtAgo treatment: a, untreated control library No. 1 (U13); b, adapter-depleted library No. 1 (U13); c, untreated control library No. 2 (U68); d, adapter-depleted library No. 2 (U68).

FIG. 6: Fragment analysis of libraries at an initial amount of 18 ng before and after TtAgo treatment: a, untreated control library No. 1 (U13); b, adapter-depleted library No. 1 (U13); c, untreated control library No. 2 (U68); d, adapter-depleted library No. 2 (U68).

FIG. 7: qPCR quantification standard curve.

FIG. 8: Sequencing adapter proportion analysis.

FIG. 9: Distribution of the first 10 bases of sequences in the U-13 library.

FIG. 10: Distribution of the first 10 bases of sequences in the U-68 library.

Detailed Description

Embodiments of the present application will be described in detail below with reference to examples, but those skilled in the art will appreciate that the following examples are only illustrative of the present application and should not be construed as limiting its scope. Examples in which specific conditions are not specified were conducted under conventional conditions or under conditions recommended by the manufacturer. Reagents or instruments for which no manufacturer is indicated are conventional, commercially available products.

Definition of partial terms

Unless defined otherwise below, all technical and scientific terms used in the detailed description of the present application are intended to have the same meaning as commonly understood by one of ordinary skill in the art. While the following terms are believed to be well understood by those skilled in the art, the following definitions are set forth to better explain the present application.

As used in this application, the terms "comprising," "including," "having," "containing," or "involving" are inclusive or open-ended and do not exclude additional unrecited elements or method steps. The term "consisting of …" is considered to be a preferred embodiment of the term "comprising". If in the following a certain group is defined to comprise at least a certain number of embodiments, this should also be understood as disclosing a group which preferably only consists of these embodiments.

Where an indefinite or definite article is used when referring to a singular noun e.g. "a" or "an", "the", this includes a plural of that noun.

The term "about" in the present application denotes an interval of accuracy that can be understood by a person skilled in the art, which still guarantees the technical effect of the feature in question. The term generally denotes a deviation of ± 10%, preferably ± 5%, from the indicated value.

Furthermore, the terms first, second, third, (a), (b), (c), and the like in the description and in the claims, are used for distinguishing between similar elements and not necessarily for describing a sequential or chronological order. It is to be understood that the terms so used are interchangeable under appropriate circumstances and that the embodiments described herein are capable of operation in other sequences than described or illustrated herein.

The following terms or definitions are provided solely to aid in the understanding of the present application. These definitions should not be construed to have a scope less than understood by those skilled in the art.

The terms "nucleic acid," "polynucleotide," and "nucleotide sequence" as used herein are used interchangeably to refer to polymeric forms of nucleotides of any length, including deoxyribonucleotides, ribonucleotides, combinations thereof, and analogs thereof. "oligonucleotide" and "oligo" are used interchangeably to refer to a short polynucleotide having no more than about 50 nucleotides.

The term "library self-ligated adapter" as used herein refers to the product formed when the adapters used in the sequencing library ligate to each other under the action of a DNA polymerase; it is usually a dimer and has the same meaning as "library adapter self-ligation dimer" as used herein. For example, the structure of a normal Illumina library is: adapter F-TCTTCCGATCTGATCGGAAGAGCACA-sequence of target fragment-adapter R-TGTGCTCTTCCGATCAGATCGGAAGA; the library self-ligated adapter is then the adapter dimer formed when the adapters self-ligate under the action of DNA polymerase: F-TCTTCCGATCTGATCGGAAGAGCACA-TGTGCTCTTCCGATCAGATCGGAAGA.

The terms "adapter self-ligation library sequence" and "self-ligated adapter sequence" as used herein have the same meaning and refer to the contiguous sequence spanning the two adapters in a "library self-ligated adapter", wherein each adapter contributes at least 5 bases and the sequence is 15-30 bp in length; it pairs with the guide DNA according to the rules of complementary base pairing.

As used herein, "guide DNA" refers to a single-stranded DNA oligonucleotide capable of forming a complex with the Argonaute endonuclease of the present application and hybridizing by complementarity to a target nucleic acid (the adapter self-ligation library sequence). It is a 5'-phosphorylated DNA molecule of approximately 15-30 bp that guides DNA editing.

The term "Argonaute endonuclease" as used herein means an endonuclease of the Argonaute protein family, including enzymes such as TtAgo, pfAgo, AaAgo, MjAgo or pAgo. "Argonaute" and "Ago" are used interchangeably and refer to naturally occurring or engineered proteins that, guided by a single-stranded DNA oligonucleotide (i.e., the guide DNA), specifically recognize a target nucleic acid containing the sequence complementary to the guide DNA.

The term "TtAgo enzyme" as used herein refers to an Argonaute DNA-editing endonuclease from Thermus thermophilus, which requires a short 5'-phosphorylated single-stranded DNA to direct it to a specific complementary sequence on the substrate in order to activate its activity.

The terms "TtAgo/guide complex" and "TtAgo/guide DNA complex" as used herein have the same meaning: a mixture of TtAgo endonuclease and guide DNA.

The method described herein is generally as follows: the adapter self-ligation library sequence is taken as the target sequence, a short 5'-phosphorylated sequence is designed as the guide DNA, and the Argonaute endonuclease is directed to the sequence of the adapter self-ligation dimer in the library, forming a double-strand break in the dsDNA library molecule and cutting the adapter dimer out of the dsDNA library molecules, so that the adapter self-ligation dimer is not amplified in the subsequent PCR reaction and only the normal dsDNA library molecules are amplified.
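The selection logic of the method can be illustrated with a toy in-silico model. This is only a sketch under stated assumptions: the 26-nt sequence is the adapter dimer example from the definitions above, the junction is assumed to lie between TCTTCCGATCT and GATCGGAAGAGCACA, the flanking and insert sequences are made up, and cleavage is idealized as complete. The function names are hypothetical.

```python
# Toy model: which library molecules carry the uninterrupted adapter-adapter
# junction (the guide target) and would therefore be cut and not amplified.
# Assumption: cleavage is 100% efficient and perfectly specific.

LEFT_ADAPTER_END = "TCTTCCGATCT"        # assumed bases 5' of the junction
RIGHT_ADAPTER_START = "GATCGGAAGAGCACA"  # assumed bases 3' of the junction
DIMER_JUNCTION = LEFT_ADAPTER_END + RIGHT_ADAPTER_START  # no insert in between


def is_adapter_dimer(top_strand: str, junction: str = DIMER_JUNCTION) -> bool:
    """Return True if the molecule contains the insert-free junction."""
    return junction in top_strand.upper()


def amplifiable_after_treatment(library: list) -> list:
    """Keep only molecules lacking the junction (assumed to escape cleavage)."""
    return [m for m in library if not is_adapter_dimer(m)]


# Hypothetical molecules: one dimer, one normal molecule with a 16-nt insert.
dimer = "AAAA" + DIMER_JUNCTION + "TTTT"
normal = "AAAA" + LEFT_ADAPTER_END + "ACGTACGTACGTACGT" + RIGHT_ADAPTER_START + "TTTT"

kept = amplifiable_after_treatment([dimer, normal])
print(len(kept))  # number of molecules left for PCR in this toy library
```

In a normal molecule the insert separates the two adapter ends, so the contiguous junction is absent and the molecule is retained.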

It will be appreciated that the method is not limited to any particular sequencing platform; it is applicable wherever adapter self-ligation occurs during library construction. Thus, in some embodiments, the present application is applicable to libraries for sequencing methods including, but not limited to, second-generation, third-generation or fourth-generation sequencing; preferably, the sequencing library is a second-generation sequencing library; more preferably, the sequencing library is an Illumina second-generation sequencing library.

Regarding the specific design of the guide DNA, following the design concept of the present application, guide DNA can be designed for self-ligated adapters of any sequence. Normally, forward and reverse guide DNAs are designed taking as target sequences the 7-15 bp sequences adjacent to each side of the junction between the two adapters in the self-ligated adapter; in some preferred embodiments, the guide DNA sequence is complementary to the self-ligated adapter sequence. In some embodiments, the guide DNA is 15-30 bp in length; to ensure targeting, the guide DNA is preferably 5'-phosphorylated; more preferably, the 1st base of the guide DNA is T, the 12th base of the guide DNA is not adenosine, and the guide DNA sequence has low GC content.

According to some aspects of the application, the Argonaute endonuclease in step 2), guided by a single-stranded DNA oligonucleotide (i.e., the guide DNA), specifically recognizes a target containing the sequence complementary to the guide DNA; it may be a TtAgo, pfAgo, AaAgo, MjAgo or pAgo enzyme, and the TtAgo enzyme is preferred herein. In some embodiments, the ratio of TtAgo enzyme concentration to guide DNA concentration is 1:3-10; in some preferred embodiments, the guide DNA and TtAgo enzyme are added as follows: the guide DNA and the TtAgo enzyme are mixed in 1× NEB ThermoPol Reaction Buffer and incubated at 70-75 °C for 5-10 min; in some more preferred embodiments, the targeted cleavage reaction conditions are: incubation at 70-80 °C for 30-40 min, followed by cooling at 3-5 °C.

Without limitation, the present application may further comprise a step of removing the Argonaute endonuclease and guide DNA components of the system after cleaving the linker dimer from the dsDNA library molecules, including but not limited to processing by magnetic bead recovery or the like.

The application is illustrated below with reference to specific examples.

Example 1 Design and optimization of the present application

As described in the background, existing sequencing libraries, especially second-generation sequencing libraries, suffer from adapter interference during construction. Nevertheless, treating the adapters separately is not an obvious solution in the field. When samples are abundant, the usual response is to re-extract and rebuild the library, and if that does not solve the problem, the sample quality is likely to be blamed. Alternatively, the conventional remedies within the second-generation sequencing workflow are to adjust the adapter amount and to purify with magnetic beads or recover by gel excision; treatment at the library level is not considered. After all, samples available only in small amounts are uncommon among research samples, and although they may be common among clinical samples, little effort is spent on studying them. For precious samples, in most cases only one extraction and library construction experiment can be performed, so treatment at the library level can greatly improve the utilization of such samples.

The present application was designed on the basis of an analysis of the original libraries and raw data from the company's Illumina sequencing platform:

in the application, the sequence of the Illumina adapter self-ligation library is taken as the target sequence, a short 5'-phosphorylated sequence is designed as the guide DNA, and TtAgo is used to target the sequence of the adapter self-ligation dimer in the library. The TtAgo/guide complex searches the DNA library for a sequence that base-pairs perfectly with the guide, which induces its endonuclease activity; the target DNA is cut between the bases corresponding to the guide, forming a double-strand break in the dsDNA library molecule and cutting the adapter dimer out of the dsDNA library molecules.

Specifically, the design of the present application is explained in terms of the design of the guide sequence, the Argonaute endonuclease and the concentration of use thereof, etc., as follows:

1) Illustratively, the principle of guide sequence design in the present application is shown in FIG. 1. The adapter self-ligation sequence is taken as the target fragment, and the insert position is designated 0 (in a normal library the insert at this position is the target sequence, but in a self-ligated adapter the two adapters are joined directly, so the position contains no sequence and is marked 0). Taking the adapter sequences on both sides of position 0, the forward guide-F is designed from the 9 bp adapter sequences adjacent to the insert position (9 bp on each side, 18 bp in total), and the reverse guide-R is designed from the complementary 9 bp adapter sequences adjacent to the insert position (9 bp on each side, 18 bp in total).

Through experimental optimization, guide DNA design can take the following into account: ① the first base of the guide nucleotide sequence affects TtAgo activity. A TtAgo guide should start with thymidine (T); because the first base of the guide is not important for base pairing with the target sequence, changing guide position 1 to T, even if it is then not complementary to target position 1, can improve overall reactivity; ② the 12th base of the guide nucleotide sequence affects TtAgo activity, and adenosine should be avoided at position 12 of the TtAgo guide; ③ TtAgo works best on target sequences with low GC content.
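The three design considerations can be expressed as a simple checker. The following is a minimal sketch, not part of the application: the function names are hypothetical, and the 0.6 GC threshold is an arbitrary placeholder because the text only says "low GC content" without giving a number. The two sequences are guide-F and guide-R (SEQ ID NO.1 and 2) of this example.

```python
def gc_fraction(seq: str) -> float:
    """Fraction of G and C bases in the sequence."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)


def check_guide(seq: str, max_gc: float = 0.6) -> dict:
    """Check a guide against the stated rules.

    max_gc is an assumed placeholder threshold; the source gives no value.
    """
    seq = seq.upper()
    return {
        "length_15_30": 15 <= len(seq) <= 30,
        "starts_with_T": seq[0] == "T",                      # rule 1
        "pos12_not_A": len(seq) >= 12 and seq[11] != "A",    # rule 2
        "gc_fraction": round(gc_fraction(seq), 3),
        "gc_ok": gc_fraction(seq) <= max_gc,                 # rule 3 (assumed cutoff)
    }


GUIDE_F = "TTCCGATCTGATCGGAAG"  # SEQ ID NO.1
GUIDE_R = "TCTTCCGATCAGATCGGA"  # SEQ ID NO.2

for name, seq in (("guide-F", GUIDE_F), ("guide-R", GUIDE_R)):
    print(name, check_guide(seq))
```

5'-phosphorylation is a synthesis modification rather than a sequence property, so it is not checked here.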

A plurality of available guide DNA sequences are obtained through optimized screening, and the following guide DNA sequences are taken as examples:

guide-F: 5'P-TTCCGATCTGATCGGAAG-3' (SEQ ID NO.1);

guide-R: 5'P-TCTTCCGATCAGATCGGA-3' (SEQ ID NO.2).

guide-F1: 5'P-TTCCGATCTGATCGG-3' (SEQ ID NO.3);

guide-R1: 5'P-TCTTCCGATCAGATC-3' (SEQ ID NO.4).

guide-F2: 5'P-TCTTCCGATCTGATCGGAAGAGCACA-3' (SEQ ID NO.5);

guide-R2: 5'P-TGTGCTCTTCCGATCAGATCGGAAGA-3' (SEQ ID NO.6).

The guide-F and guide-R pair (SEQ ID NO.1 and 2) is preferred in the examples of the application.
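The listed guides can be reproduced from the adapter dimer sequence by taking a window around the junction and, for the reverse guide, its reverse complement. The sketch below assumes the junction lies between TCTTCCGATCT and GATCGGAAGAGCACA (inferred from the 9 + 9 bp layout of guide-F, not stated explicitly in the text). The 8 + 10 bp window used for guide-R is an observation about SEQ ID NO.2, which appears shifted by one base, possibly so that the guide starts with T; that interpretation is an assumption. Function names are hypothetical.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Reverse complement of a DNA sequence (A/C/G/T only)."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def junction_window(left: str, right: str, n_left: int, n_right: int) -> str:
    """Take n_left bases before and n_right bases after the junction."""
    return left[-n_left:] + right[:n_right]


LEFT = "TCTTCCGATCT"        # assumed adapter bases 5' of the junction
RIGHT = "GATCGGAAGAGCACA"   # assumed adapter bases 3' of the junction

guide_f = junction_window(LEFT, RIGHT, 9, 9)                        # forward, 9 + 9
guide_r = reverse_complement(junction_window(LEFT, RIGHT, 8, 10))   # reverse, 8 + 10
guide_f2 = LEFT + RIGHT                                             # full 26-nt window
guide_r2 = reverse_complement(guide_f2)

print(guide_f, guide_r, guide_f2, guide_r2)
```

Under these assumptions the four derived strings match SEQ ID NO.1, 2, 5 and 6 respectively.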

2) Selection of Argonaute endonuclease

Earlier experiments showed that TtAgo, pfAgo, AaAgo, MjAgo or pAgo enzymes can each be used for the targeting experiments; TtAgo gave the best effect and is preferred.

3) Optimized selection of enzyme and guide DNA concentrations

The ratio of TtAgo enzyme concentration to guide DNA concentration was adjusted and four ratios were tested: 1:3, 1:5, 1:10 and 1:20. Library 43-U68 was used to compare the effect of the systems. Self-ligated adapter elimination was performed on 1 ng and 18 ng of the library in systems prepared with TtAgo:guide DNA gradients of 1:3, 1:5, 1:10 and 1:20, as shown in the following table.
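As a small worked illustration of the ratio gradient, the sketch below computes the guide DNA concentration implied by each tested ratio for a given enzyme concentration. The enzyme concentration of 0.5 (in arbitrary concentration units) is a hypothetical value chosen only for the example; the source does not state the actual concentrations, and the function name is likewise hypothetical.

```python
# Guide parts per 1 part TtAgo, taken from the four ratios tested above.
RATIOS = [3, 5, 10, 20]


def guide_concentration(enzyme_conc: float, guide_parts: int) -> float:
    """Guide concentration for an enzyme:guide ratio of 1:guide_parts."""
    return enzyme_conc * guide_parts


ENZYME_CONC = 0.5  # hypothetical value, same units as the result

plan = {f"1:{r}": guide_concentration(ENZYME_CONC, r) for r in RATIOS}
for ratio, guide in plan.items():
    print(ratio, guide)
```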

The off-machine sequencing results for the systems with different concentrations are shown in FIGS. 2-3, which are fragment analysis graphs of the library at initial amounts of 1 ng and 18 ng after treatment with different enzyme:guide ratios, and in FIG. 4, which is the sequencing adapter proportion analysis. The adapter content of the original library was 88.54% and the effective data rate was 11.5%. After adapter elimination, at an initial amount of 1 ng and an enzyme:guide ratio of 1:3, the adapter content was 7.96%, a reduction of 91%, and the effective data rate was 86.97%, an increase of 6.56-fold. At an initial amount of 18 ng and an enzyme:guide ratio of 1:5, the adapter content was 13.79%, a reduction of 84.4%, and the effective data rate was 76.94%, an increase of 5.69-fold. Overall, the effect was remarkable at enzyme:guide DNA concentration ratios of 1:3-10, for both 1 ng and 18 ng.
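The percentages quoted above can be re-derived from the reported before and after values. The sketch below assumes that "reduction" means the relative decrease in adapter content and that the fold increase is computed as (after - before) / before, which is the reading that reproduces the reported figures; the function names are hypothetical.

```python
def relative_reduction(before: float, after: float) -> float:
    """Relative decrease, as a fraction of the starting value."""
    return (before - after) / before


def fold_increase(before: float, after: float) -> float:
    """Increase expressed as a multiple of the starting value."""
    return (after - before) / before


ADAPTER_BEFORE = 88.54    # % adapter content, original library
EFFECTIVE_BEFORE = 11.5   # % effective data rate, original library

# label: (adapter content after, effective data rate after), in %
CASES = {
    "1 ng, 1:3": (7.96, 86.97),
    "18 ng, 1:5": (13.79, 76.94),
}

results = {}
for label, (adapter_after, effective_after) in CASES.items():
    results[label] = (
        round(100 * relative_reduction(ADAPTER_BEFORE, adapter_after), 1),
        round(fold_increase(EFFECTIVE_BEFORE, effective_after), 2),
    )
    print(label, results[label])
```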

Finally, the following method steps of the application are determined through comprehensive optimization (parameters such as concentration, time, temperature and the like):

1) design the 5' phosphorylated guide DNA for the self-adaptor sequence of the sequencing library: and (3) designing the forward and reverse guide DNA by taking the self-connecting head sequence as a target fragment and aiming at the 7-15bp connecting head sequence adjacent to the two ends of the 0 sequence between the connectors as a target sequence.

2) Add a mixture of guide DNA and TtAgo enzyme to the library to be treated and carry out the targeted digestion reaction: TtAgo enzyme concentration : guide DNA concentration = 1:3-10. The mixture of guide DNA and TtAgo enzyme is prepared by mixing the guide DNA and TtAgo enzyme in 1× NEB ThermoPol reaction buffer and incubating at 70-75 °C for 5-10 min (preferably at 75 °C for 10 min). The targeted digestion reaction is carried out by incubating at 70-80 °C for 30-40 min and then cooling to 3-5 °C (preferably incubating at 80 °C for 40 min and cooling to 4 °C).

3) Remove the Argonaute enzyme and guide DNA components from the system: recover the treated product with 2× magnetic beads to remove the enzyme and excess oligos from the system.

4) Library amplification: amplify the recovered product with universal primers and recover the library.
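Step 1) above can be sketched in code. This is one reading of the design rule and not the applicant's actual procedure: it assumes the target window is made of the last `flank` bases of one adaptor followed by the first `flank` bases of the other, and that the forward and reverse guides are complementary to the two strands of that window. The adaptor sequences and all names below are made up for illustration; 5' phosphorylation is a synthesis option and is not modeled.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A/C/G/T only)."""
    return seq.translate(COMPLEMENT)[::-1]


def design_junction_guides(left_adaptor, right_adaptor, flank=8):
    """Build guides spanning the junction of an insert-free adaptor dimer.

    left_adaptor and right_adaptor are read 5'->3' on the same (top) strand,
    so the dimer is left_adaptor + right_adaptor with nothing in between.
    """
    if not 7 <= flank <= 15:
        raise ValueError("flank must be 7-15 bases, the range given in the text")
    if len(left_adaptor) < flank or len(right_adaptor) < flank:
        raise ValueError("adaptors must be at least `flank` bases long")
    target_top = left_adaptor[-flank:] + right_adaptor[:flank]
    return {
        "target_top_strand": target_top,
        # Complementary to the top strand of the junction window.
        "forward_guide": reverse_complement(target_top),
        # Complementary to the bottom strand of the junction window.
        "reverse_guide": target_top,
    }


# Made-up adaptor ends, used only to exercise the function.
LEFT = "ACGTACGTTAGGCATC"
RIGHT = "GATTCCAGGTCAAGTC"
GUIDES = design_junction_guides(LEFT, RIGHT, flank=8)
for name, seq in GUIDES.items():
    print(f"{name}: 5'-{seq}-3'")
```

Because the window straddles the junction, it only exists contiguously in molecules with no insert between the adaptors, which is consistent with the stated aim of cutting self-ligated adaptors while leaving insert-bearing library molecules intact.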

Example 2 Comparison of effect with untreated library data

First, library screening

Clinical samples are complex in nature. The extraction and isolation efficiency of different samples, and the degree of fragmentation and degradation of the genome, affect the sequence recovery rate. Libraries constructed from some poor-quality samples have a high adaptor content and a low effective sequencing data rate, and cannot meet the requirements of data analysis.

From the off-machine (sequencing output) data of the Illumina sequencing platform, two libraries were selected. The first had a normal nucleic acid concentration and a library concentration that met the Illumina on-machine standard, but its output data efficiency was low, with an adaptor proportion as high as 95%. The second had a low nucleic acid concentration and a normal library concentration that met the Illumina on-machine standard, with an adaptor proportion of 88.5%.

Specifically, the first library had a normal nucleic acid concentration of 11.7 ng/μL and a library construction input of 200 ng; its library concentration of 0.95 ng/μL met the Illumina on-machine standard, but its effective data rate was only 0.88% and its adaptor proportion reached 97%. The second library had a very low nucleic acid concentration of 0.078 ng/μL; its library concentration of 3.04 ng/μL was normal and met the Illumina on-machine standard, but its effective data rate was 11.5% and its adaptor proportion was as high as 88.5%.

TABLE 1 Information on the screened libraries

Sample                                MBXD56754-1-U13    43-U68
Nucleic acid concentration (ng/μL)    11.7               0.078
Library construction type             PCR-free           PCR-8
Library concentration (ng/μL)         0.95               3.04
PCR quantitation (nM)                 0.3                2.64
RawReads (#)                          32,581,927         15,265,747
Adapter_ratio (%)                     97.14              88.54
Duplication (%)                       1.79               2.28
Clean_GC (%)                          39.98              42.7
Clean_Q20                             94.18              94.25
Clean_Q30                             91.92              91.99
CleanReads                            285,732            1,755,732
Effective (%)                         0.88               11.5
AvgQuality                            0.997364121        0.99751773
LowQuality (%)                        0.08               0.23
TooShort (%)                          97.07              85.83
Concentration after PCR-8 (ng/μL)     8.96               70.4
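The Effective (%) row of Table 1 appears to be the share of raw reads that survive as clean reads. This definition is an assumption (the text does not give the formula), but it reproduces both reported values, as the short sketch below shows; the dictionary layout and function name are illustrative.

```python
# Read counts and reported effective rates copied from Table 1.
LIBRARIES = {
    "MBXD56754-1-U13": {
        "raw_reads": 32_581_927,
        "clean_reads": 285_732,
        "reported_effective_pct": 0.88,
    },
    "43-U68": {
        "raw_reads": 15_265_747,
        "clean_reads": 1_755_732,
        "reported_effective_pct": 11.5,
    },
}


def effective_rate_pct(raw_reads, clean_reads):
    """Assumed definition: clean reads as a percentage of raw reads."""
    return 100.0 * clean_reads / raw_reads


for name, lib in LIBRARIES.items():
    computed = effective_rate_pct(lib["raw_reads"], lib["clean_reads"])
    print(f"{name}: computed {computed:.2f}%, "
          f"reported {lib['reported_effective_pct']}%")
```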

Second, library processing method

1. Because the amount and volume of the libraries were small, the target libraries were first amplified for 8 cycles with Illumina universal primers. Once the amount required for the experiment was reached, the following experiments were performed with 18 ng and 1 ng of each library.

TABLE 2 library processing information

2. Untreated library (control): 1 ng and 18 ng of each library were PCR-amplified for 15 cycles with Illumina universal primers according to the reaction system and program in Tables 5 and 6. The library was recovered with 45 μL of magnetic beads and eluted with 30 μL of eluent to serve as the control group, with 3 replicates per library.

3. Self-ligated adaptor elimination: 1 ng and 18 ng of each library were subjected to a targeted digestion reaction at 80 °C with TtAgo and guide DNA according to the system in Table 3 below. Before the reaction, the guide DNA and TtAgo enzyme were mixed in 1× TtAgo buffer without the library and incubated at 75 °C for 10 min (to improve cleavage specificity); the library to be treated was then added and reacted for 40 min according to the program in Table 4. 40 μL of magnetic beads were added to recover the product and remove the enzyme and oligos, and the product was eluted in 20 μL. The eluate was PCR-amplified for 15 cycles with Illumina universal primers according to Tables 5 and 6, recovered with 45 μL of magnetic beads and eluted with 30 μL of eluent to serve as the experimental group, with 2 technical replicates per library.

TABLE 3 digestion reaction System

TABLE 4 reaction procedure

TABLE 5 amplification System

TABLE 6 amplification reaction procedure
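The treatment conditions used in this example can be checked against the optimized ranges given in the method steps of Example 1. The sketch below is illustrative only: the class and field names are hypothetical, the ratio range is read as 1:3 to 1:10, and the enzyme : guide ratio of the example (which is specified in Table 3, not reproduced here) is filled with an assumed value of 1:5.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DigestionConditions:
    guide_per_enzyme: float    # guide DNA per TtAgo, e.g. 5 means 1:5
    preincubation_c: float     # guide + enzyme pre-incubation temperature
    preincubation_min: float   # guide + enzyme pre-incubation time
    digestion_c: float         # targeted digestion temperature
    digestion_min: float       # targeted digestion time


# Ranges taken from the optimized method steps (ratio read as 1:3-10).
RANGES = {
    "guide_per_enzyme": (3, 10),
    "preincubation_c": (70, 75),
    "preincubation_min": (5, 10),
    "digestion_c": (70, 80),
    "digestion_min": (30, 40),
}


def out_of_range(conditions):
    """List every parameter that falls outside its optimized range."""
    problems = []
    for name, (low, high) in RANGES.items():
        value = getattr(conditions, name)
        if not low <= value <= high:
            problems.append(f"{name}={value} outside {low}-{high}")
    return problems


# 75 C / 10 min and 80 C / 40 min come from the text; the 1:5 ratio is assumed.
EXAMPLE_2 = DigestionConditions(
    guide_per_enzyme=5,
    preincubation_c=75,
    preincubation_min=10,
    digestion_c=80,
    digestion_min=40,
)
print(out_of_range(EXAMPLE_2) or "all parameters within the optimized ranges")
```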

Third, quality inspection of library

1. Library fragment analysis by Agilent 2100 Bioanalyzer

The library fragment sizes of the control group and the experimental group were analyzed with an Agilent 2100 Bioanalyzer. The recovered high-concentration libraries were uniformly diluted to about 3 ng/μL; the gel was prepared and loaded, 1 μL of sample was loaded, and the chip was run and analyzed on the instrument.

1) Gel preparation: equilibrate the reagents at room temperature for 30 min, add 15 μL of high-sensitivity DNA dye to the high-sensitivity DNA gel matrix, and mix by vortexing. Transfer all of the gel-dye mixture to a filter tube and centrifuge at 2240 g (6000 rpm) for 15 min at room temperature; discard the filter and keep the gel for later use.

2) Gel loading: add 9 μL of gel to the designated position on the chip according to the instructions, taking care to avoid bubbles; press the gel into the chip, then continue adding gel to the other designated positions on the chip.

3) Sample loading: add 5 μL of high-sensitivity DNA marker to each of the 12 wells other than the gel wells (no sample well may be left empty). Add 1 μL of DNA ladder to the ladder well and 1 μL of sample to each of the other 11 wells, dispensing at the bottom of the well to prevent splashing during vortex mixing.

4) Switch on the instrument and load the chip for detection.

TABLE 7 library quality test data

2. The control group and experimental group libraries were quantified by qPCR.

1) Library dilution: dilute the library 100-fold (2 + 198, that is, 2 volumes of library plus 198 volumes of diluent); performing this dilution twice gives a 10,000-fold dilution.

2) The reaction system was prepared as follows:

3) Library amplification quality testing was performed according to this reaction program:

4) Preparation of the standard curve: standards S1, S2, S3, S4 and S5, at concentrations of 20 pmol, 2 pmol, 0.2 pmol, 0.02 pmol and 0.002 pmol respectively, were run in the SYBR Green system to generate the standard curve;

5) qPCR result processing: calculate the qPCR concentration of the samples to be tested.
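The standard-curve calculation in steps 4) and 5) can be sketched as a linear fit of Cq against log10 of the standard concentration, followed by back-calculation and correction for the 10,000-fold dilution of step 1). The Cq values below are hypothetical placeholders (the measured values are not given in the text), and all names are illustrative; concentrations are in whatever unit the standards are expressed in.

```python
import math


def fit_line(xs, ys):
    """Ordinary least-squares fit y = slope * x + intercept, plus R^2."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    ss_res = sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return slope, intercept, 1.0 - ss_res / ss_tot


# Standards S1-S5 from the text; the Cq values are made up for illustration.
STANDARD_CONC = [20, 2, 0.2, 0.02, 0.002]
STANDARD_CQ = [8.1, 11.4, 14.8, 18.1, 21.5]

SLOPE, INTERCEPT, R2 = fit_line(
    [math.log10(c) for c in STANDARD_CONC], STANDARD_CQ)


def library_concentration(cq, dilution_factor=10_000):
    """Back-calculate the undiluted library concentration from a sample Cq."""
    diluted = 10 ** ((cq - INTERCEPT) / SLOPE)
    return diluted * dilution_factor


print(f"slope={SLOPE:.3f}, intercept={INTERCEPT:.3f}, R^2={R2:.5f}")
print(f"sample at Cq 15.0: {library_concentration(15.0):.1f} (standard units)")
```

A sample should only be quantified this way if its Cq falls between those of the highest and lowest standards, which matches the requirement in the results that the library lie within the detection range of the standard curve.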

Fourth, result analysis

The advantages of the present application are established by the following analysis.

1. Library adaptor ratio analysis

1.1 Compare the 2100 fragment sizes and the adaptor peaks in the electropherograms of the library before and after treatment;

1.2 Compare the qPCR quantification results of the library before and after treatment;

1.3 Compare the effective data rate, clean reads, adaptor proportion and so on of the library before and after treatment;

2. Compare the sequence information of the library before and after treatment.

The specific results are as follows:

1. Library adaptor ratio analysis

1.1 2100 fragment analysis

The results for libraries with the same loading amount before and after treatment are shown in FIGS. 5-6. In the 2100 fragment analysis, the peak at about 145 bp is the self-ligated dimer of the library adaptors; the abscissa is the library fragment size and the ordinate is the fluorescence signal intensity, which reflects the nucleic acid content. After treatment, the amount of adaptor dimer was significantly reduced, and the large fragments formed by cross-linking of the library were reduced or removed. By comparison, the effect on the adaptor dimer was more pronounced with 18 ng of library than with 1 ng; the 18 ng library showed a marked effect after treatment and retained fragments of the target sequence.

1.2 qPCR library quality analysis

TABLE 8 qPCR quality control data

The qPCR standard curve is shown in FIG. 7, and the qPCR quality control data are shown in Table 8. The R² of the standard curve is 0.999 and the libraries fall within the detection range of the standard curve, so the libraries pass quality testing. The molar concentrations of the libraries are greater than 3 nM, which meets the requirement for on-machine analysis. For the same sample, the molar concentration of the library obtained from an 18 ng input is greater than that obtained from a 1 ng input. This is consistent with the 2100 fragment analysis: adaptor removal is better with an 18 ng library input than with a 1 ng input, because the proportion of target library fragments is relatively higher with 18 ng. Thus, the method of the present application is highly efficient and can be adapted to situations where the initial amount of library is low (e.g., 1 ng ≤ m ≤ 18 ng).

1.3 Sequencing adaptor proportion analysis

TABLE 9 sequencing data

Table 9 shows the sequencing data, FIGS. 5-6 show the fragment analysis of the 1 ng and 18 ng input libraries after TtAgo enzyme treatment, and FIG. 8 shows the overall analysis of the adaptor proportion in the sequencing data. The adaptor content of the original 1-U13 library is as high as 97.14%, with an effective data rate of 0.88%. After self-ligated adaptor elimination, with a 1 ng input the adaptor content is 21.9% (a 77% reduction in adaptor proportion) and the effective data rate is 12.9% (an improvement of 13.7-fold); with an 18 ng input the adaptor content is 20.8% (a 79% reduction) and the effective data rate is 23% (an improvement of 25.1-fold).

The adaptor content of the original 43-U68 library is 88.54%, with an effective data rate of 11.5%. After self-ligated adaptor elimination, with a 1 ng input the adaptor content is 23.36% (a 73.6% reduction in adaptor proportion) and the effective data rate is 74.5% (an improvement of 5.5-fold); with an 18 ng input the adaptor content is 13.2% (an 85% reduction) and the effective data rate is 66.67% (an improvement of 4.8-fold).

This result is consistent with the 2100 fragment analysis: adaptor removal is better with an 18 ng library input than with a 1 ng input, because the proportion of target library fragments is relatively higher with 18 ng.

2. Sequence analysis

For the Illumina sequencing results, the frequency of sequences whose first 10 bases perfectly match the adaptor sequence was counted using the uniq tool. As can be seen in FIGS. 9 and 10, the number of sequences whose first 10 bases perfectly match the adaptor sequence decreases by orders of magnitude after self-ligated adaptor elimination. Moreover, the sequence in the first row, which perfectly matches the adaptor sequence, accounts for the majority of the adaptor sequences in the library; its count also falls by orders of magnitude, which shows that the self-ligated adaptor elimination method is effective.
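The frequency count described above can be illustrated with a short sketch that tallies the first 10 bases of each read, much as a `sort | uniq -c` pipeline would, and looks up the count for the adaptor prefix. The adaptor string and the reads below are placeholders chosen for illustration, not the sequences used in the application, and the function names are hypothetical.

```python
from collections import Counter


def prefix_counts(reads, k=10):
    """Count how often each k-base read prefix occurs (reads shorter than k are skipped)."""
    return Counter(read[:k] for read in reads if len(read) >= k)


def adaptor_prefix_hits(reads, adaptor, k=10):
    """Number of reads whose first k bases exactly equal the adaptor's first k bases."""
    return prefix_counts(reads, k)[adaptor[:k]]


# Placeholder adaptor; substitute the adaptor actually used in the library.
ADAPTOR = "AGATCGGAAGAGCACACGTC"

# Toy reads standing in for sequencing output before and after treatment.
BEFORE = [
    "AGATCGGAAGAAAA",
    "AGATCGGAAGCCCC",
    "AGATCGGAAGGG",
    "ACGTACGTACGTAC",
    "AGATCGGAAG",
    "TTTTGGGGCC",
]
AFTER = [
    "ACGTACGTACGTAC",
    "GGGTTTAAACCCGG",
    "AGATCGGAAGAT",
    "CCCCAAAATTTTGG",
]

print("adaptor-prefix reads before:", adaptor_prefix_hits(BEFORE, ADAPTOR))
print("adaptor-prefix reads after: ", adaptor_prefix_hits(AFTER, ADAPTOR))
print("most frequent prefix before:", prefix_counts(BEFORE).most_common(1))
```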

Conclusion: taken together, the 2100 fragment analysis, the qPCR library quantification, the sequencing data analysis and the sequence frequency analysis show that the method for eliminating self-ligated adaptors from Illumina libraries is feasible: it improves library quality, reduces the adaptor proportion and increases the effective data rate.

The above description of the specific embodiments of the present application is not intended to limit the present application, and those skilled in the art may make various changes and modifications according to the present application without departing from the spirit of the present application, which is intended to fall within the scope of the appended claims.
