Methods, compositions and kits for preparing nucleic acid libraries

Document No.: 1471751 Publication date: 2020-02-21

Summary: Methods, compositions and kits for preparing nucleic acid libraries, by 富国良 (Fu Guoliang) and T·邓威尔 (T. Dunwell), dated 2018-04-17. The present invention relates to methods, compositions and kits for extending polynucleotides and methods, compositions and kits for preparing sequencing libraries of polynucleotides, which methods, compositions and kits involve generating modified target polynucleotides on adaptor template oligonucleotides and labeling one or both strands of the target sequence. The sequencing library is suitable for massively parallel sequencing and comprises a plurality of double-stranded nucleic acid molecules.

1. A method for extending a population of target polynucleotides having single-stranded 3' ends, the method comprising:

(i) incubating the target polynucleotide with an Adapter Template Oligonucleotide (ATO) having:

(a) a 3' random sequence;

(b) a 3' end with a blocker rendering the ATO inextensible; and

(c) a universal sequence 5' to the random sequence,

wherein the target polynucleotide hybridizes to the 3' random sequence of the ATO;

(ii) performing polymerase extension of the target polynucleotide using the ATO as a template, thereby producing an extended target polynucleotide having a 3' universal sequence.

2. The method of claim 1, further comprising (iii) generating a first Complementary Sequence (CS) of the modified target polynucleotide, wherein generating the first CS comprises polymerase extension from the 3' universal sequence using the modified target polynucleotide as a template.

3. The method of claim 2, wherein the polymerase extension that generates the first CS is any one of:

(a) in vitro transcription from a double-stranded promoter region in the modified target polynucleotide using RNA polymerase, the double-stranded promoter region in the modified target polynucleotide generated by extension on an ATO comprising an RNA polymerase promoter;

(b) self-initiated extension of the 3' stem-loop structure; or

(c) extension of a primer that hybridizes to the 3' universal sequence.

4. The method of claim 2 or claim 3, wherein excess ATO is removed or digested either before or after generating the first Complementary Sequence (CS) of the modified target polynucleotide.

5. The method of any one of claims 2 to 4, wherein the first CS is extended using the method of claim 1 to generate an extended first complementary sequence having a 3' universal sequence.

6. The method of any one of claims 2 to 4, wherein the first CS is extended using DNA ligase to ligate an adaptor to the first CS.

7. The method of any one of claims 2 to 4, further comprising extending a primer that hybridizes to the first CS or modified first CS, thereby forming a second CS, wherein the primer that hybridizes to the first CS or modified first CS comprises a target-specific portion, or a universal sequence, or both a 3' target-specific sequence and a 5' universal sequence.

8. The method of any preceding claim, wherein the polymerase used in step (ii) has 3' to 5' exonuclease activity.

9. A method for preparing a sequencing library from a population of single-stranded nucleic acids, the method comprising:

(a) performing the method of claim 1 to produce an extended target polynucleotide having a 3' universal sequence;

(b) performing the method of claim 2 to generate a first complementary sequence;

(c) generating a second complementary sequence having a 5' universal sequence and a 3' universal sequence, wherein the 5' universal sequence and the 3' universal sequence are different and not complementary to each other; and

(d) amplifying the first complementary sequence and the second complementary sequence using primers targeting two universal sequences, thereby preparing a sequencing library of double-stranded nucleic acid fragments having known universal ends of different sequences.

10. The method of any one of the preceding claims, wherein the target polynucleotide having a single-stranded 3' end is RNA or DNA, wherein the DNA or the RNA is derived from an FFPE sample, circulating free nucleic acids, or a sample treated with bisulfite.

11. An Adaptor Template Oligonucleotide (ATO) for extending a polynucleotide, the Adaptor Template Oligonucleotide (ATO) comprising:

(a) a 3' random sequence of 3 to 36 'N' bases;

(b) a 3' end with a blocker rendering the ATO inextensible;

(c) a universal sequence 5' to the random sequence;

(d) an optional modified nucleotide or linkage that renders the ATO resistant to 3' exonuclease cleavage; and

(e) a moiety that renders the ATO degradable.

12. The adaptor-template oligonucleotide (ATO) according to claim 11, wherein the degradable moiety is a uracil nucleotide, a ribonucleotide, or a restriction enzyme recognition sequence.

13. The Adaptor Template Oligonucleotide (ATO) of claim 11 or claim 12, wherein the modified nucleotide or linkage that renders the ATO resistant to 3' exonucleolytic cleavage is a phosphorothioate linkage.

14. The Adaptor Template Oligonucleotide (ATO) according to any one of claims 11 to 13, wherein the universal sequence comprises a sequence capable of acting as a RNA polymerase promoter.

15. The Adaptor Template Oligonucleotide (ATO) of any one of claims 11 to 14, wherein the ATO comprises a 5' stem portion sequence, the 5' stem portion sequence being complementary to all or part of the universal sequence and thus being capable of forming a stem-loop structure.

16. The adaptor template oligonucleotide of claim 15, wherein the ATO comprises in 5' to 3' order: a 5' stem portion, an RNA polymerase sequence, a priming site sequence, a single-stranded overhang 3' random sequence, and a 3' end with a blocker.

17. The adaptor template oligonucleotide of claim 15 or claim 16, wherein the stem-loop structure contains a non-replicable linkage.

18. The adaptor template oligonucleotide of claim 17, wherein the non-replicable linkage is a C3 spacer, a triethylene glycol spacer, an 18-atom hexaethylene glycol spacer, or 1',2'-dideoxyribose (dSpacer).

19. The adaptor template oligonucleotide of any one of claims 11-18, wherein the 3' end is blocked by a moiety selected from the group consisting of: at least one ribonucleotide, at least one deoxynucleotide, a C3 spacer, a phosphate, a dideoxynucleotide, an amino group and an inverted deoxythymidine.

20. A composition for extending a polynucleotide, the composition comprising the ATO of any one of claims 11-18 and a polymerase having 3' to 5' exonuclease activity.

21. The method according to any one of claims 1 to 10, wherein the ATO is as defined in any one of claims 11 to 19.

22. A kit for generating a polynucleotide library, the kit comprising the Adaptor Template Oligonucleotide (ATO) of any one of claims 11-19, a polymerase having 3' exonuclease activity, and primers compatible with the NGS platform.

Background

The present invention relates to methods and compositions for extending target polynucleotides. An adaptor sequence is added to the 3' end of the single-stranded nucleic acid.

Next-generation DNA sequencing is expected to revolutionize clinical medicine and basic research. However, while this technique is capable of generating hundreds of billions of nucleotides of DNA sequence in a single experiment, an error rate of ~1% causes hundreds of millions of sequencing errors. These scattered errors become a serious problem when "deep sequencing" genetically heterogeneous mixtures, such as tumors or mixed populations of microorganisms.

To overcome the limitations of sequencing accuracy, several methods have been reported. One is duplex sequencing (Schmitt et al., PNAS 109: 14508-). This method greatly reduces errors by independently tagging and sequencing each of the two strands of a DNA duplex. Because the two strands are complementary, true mutations are found at the same position in both strands. In contrast, PCR or sequencing errors cause mutations in only one strand and can therefore be discounted as technical errors. Kinde et al. reported another method, the Safe-Sequencing System ("Safe-SeqS") (PNAS 2011 Jun 7; 108(23): 9530-5). The key steps of this approach are (i) assigning a Unique Identifier (UID) to each template molecule, (ii) amplifying each uniquely tagged template molecule to create UID families, and (iii) redundant sequencing of the amplified products. A mutation is accepted as real (a "supermutant") only when 95% of the PCR fragments with the same UID contain the same mutation. US patents US8,722,368, US8,685,678 and US8,742,606 describe methods of sequencing polynucleotides attached to degenerate base regions to determine or estimate the number of different starting polynucleotides. However, these methods cannot be readily used for targeted amplicon sequencing and often rely on ligation to attach the degenerate base region. With targeted amplicon-based enrichment and sequencing, fusion or translocation events cannot be easily sequenced. Furthermore, because primer sites cannot be sequenced, it is sometimes difficult to design a primer pair that covers a hot-spot region without leaving part of it unsequenced. In multiplex amplification in a single tube, overlapping regions cannot be amplified, so sequencing information is lost in the primer-binding regions of the target polynucleotide. For small target fragments (such as plasma DNA, small RNAs or miRNAs), designing a primer pair is also difficult because there is little room for primer sequences.
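The UID-family rule described above for Safe-SeqS can be sketched in a few lines of Python. This is an illustrative sketch only, not the published implementation: the read representation (a `(uid, variant)` pair), the function name `call_supermutants`, and the `min_family_size` parameter are assumptions introduced here; only the 95% agreement threshold comes from the text.

```python
from collections import Counter, defaultdict

def call_supermutants(reads, min_fraction=0.95, min_family_size=2):
    """Group (uid, variant) reads into UID families and keep a variant only
    when at least `min_fraction` of the reads in its family carry it.
    `variant` is None for a read that matches the reference."""
    families = defaultdict(list)
    for uid, variant in reads:
        families[uid].append(variant)

    calls = {}
    for uid, variants in families.items():
        if len(variants) < min_family_size:
            continue  # assumed filter: too few reads to form a consensus
        variant, count = Counter(variants).most_common(1)[0]
        if variant is not None and count / len(variants) >= min_fraction:
            calls[uid] = variant
    return calls

# Hypothetical data: one family where every read agrees, and one family
# where a single read carries a PCR or sequencing error.
reads = (
    [("AACGT", "G>T")] * 20
    + [("CCTAG", "G>T")]
    + [("CCTAG", None)] * 19
)
print(call_supermutants(reads))  # {'AACGT': 'G>T'}
```

The isolated error in the second family is outvoted by the other members of that family, which is the point of redundant sequencing per UID.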

Targeted next-generation sequencing often involves analysis of large, complex fragments, and this is achieved by multiplex PCR (simultaneous amplification of different target DNA sequences in a single PCR reaction). However, the results obtained with multiplex PCR are often complicated by artifacts in the amplified products. These include false-negative results due to reaction failure and false-positive results due to non-specific priming events (e.g., amplification of spurious products). Because the probability of non-specific priming increases with each additional primer pair, the conditions must be re-adjusted whenever a primer set is added.

Detailed Description

To facilitate an understanding of the present invention, a number of terms are defined below.

As used herein, "sample" refers to any substance that contains or is suspected of containing nucleic acids, and includes tissue or fluid samples isolated from an individual or multiple individuals. In particular, the nucleic acid sample may be obtained from a single cell, organism, or a combination of organisms selected from viruses, bacteria, fungi, plants, and animals. Preferably, the nucleic acid sample is obtained from a mammal. In a preferred embodiment, the mammal is a human. The nucleic acid sample may be obtained from a body fluid or tissue biopsy of the subject, or from cultured cells. The body fluid may be selected from whole blood, serum, plasma, urine, sputum, bile, stool, bone marrow, lymph fluid, semen, breast exudate, saliva, tears, bronchial washings, gastric lavage, spinal fluid, synovial fluid, peritoneal fluid, pleural effusion and amniotic fluid. A "single sample" may be a single cell (which may be a T cell or a B cell), while multiple samples may be a number of blood cells in a blood sample.

As used herein, the term "nucleotide sequence" refers to a homo-or hetero-polymer of deoxyribonucleotides, ribonucleotides, or other nucleic acids.

As used herein, the term "nucleotide" generally refers to the monomeric component of a nucleotide sequence, although in addition to nucleotides, monomers can also be nucleosides and/or nucleotide analogs, and/or modified nucleosides (e.g., amino-modified nucleosides). In addition, "nucleotide" encompasses non-naturally occurring analog structures. The nucleotides may be deoxyribonucleotides, ribonucleotides, or other nucleic acids.

As used herein, the term "nucleic acid" refers to at least two nucleotides covalently linked together. Nucleic acids will typically contain phosphodiester linkages, although in some cases nucleic acid analogs are included that may have additional backbones. Nucleic acids may be single-stranded or double-stranded, or contain portions of both double-stranded and single-stranded sequences, as specified. The nucleic acid can be DNA, genomic and cDNA, RNA, a mixture of both DNA and RNA, or a DNA-RNA hybrid, wherein the nucleic acid contains any combination of deoxyribonucleotides and ribonucleotides, as well as any combination of bases (including uracil, adenine, thymine, cytosine, guanine, inosine, xanthine, hypoxanthine, etc.). Reference to a "DNA sequence" or "RNA sequence" may encompass single-stranded and double-stranded DNA or RNA. Unless the context indicates otherwise, a specific sequence refers to a single-stranded DNA or RNA of such a sequence, a duplex of such a sequence with its complement (double-stranded DNA or RNA), and/or the complement of such a sequence.

As used herein, "polynucleotide" and "oligonucleotide" are types of "nucleic acids" and generally refer to a primer or oligomer fragment to be detected. There is no expected difference in length between the terms "nucleic acid", "polynucleotide" and "oligonucleotide", and these terms will be used interchangeably. "nucleic acid", "DNA" and "RNA" and similar terms also encompass nucleic acid analogs. Oligonucleotides need not be derived from any existing or natural sequence per se, but can be generated in any manner, including chemical synthesis, DNA replication, reverse transcription, or a combination thereof.

As used herein, the terms "target sequence," "target nucleic acid," "target polynucleotide," and "nucleic acid of interest" are used interchangeably and refer to a desired region to be amplified, detected, or both amplified and detected, or to the object of hybridization to a complementary oligonucleotide or polynucleotide (e.g., a blocking oligomer), or to the object of a primer extension process. The target sequence may be comprised of DNA, RNA, analogs thereof, or combinations thereof. The target sequence may be single-stranded or double-stranded. During the extension process, a target polynucleotide that forms a hybridization duplex with an oligonucleotide (the template) may be referred to as a "primer", whereas a target nucleic acid that forms a hybridization duplex with a primer may be referred to as a "template". The template serves as the pattern for the synthesis of a complementary polynucleotide. The target sequence may be derived from any living or once-living organism (including but not limited to prokaryotes, eukaryotes, plants, animals, and viruses), as well as synthetic and/or recombinant target sequences, or combinations thereof.

As used herein, "primer" refers to an oligonucleotide or polynucleotide (whether naturally occurring or synthetically produced) that is capable of acting as a point of initiation of synthesis when placed under conditions (i.e., in the presence of nucleotides and an agent for polymerization, and at a suitable temperature and in a suitable buffer) in which synthesis of a primer extension product complementary to a nucleic acid strand is induced. Such conditions include the presence of four or more different deoxyribonucleoside triphosphates and a polymerization-inducing agent (e.g., a DNA polymerase, and/or an RNA polymerase, and/or a reverse transcriptase), in a suitable buffer (the "buffer" contains components that are cofactors, or that affect pH, ionic strength, etc.), and at a suitable temperature. The primers herein are selected to be substantially complementary to the strand of each specific sequence to be extended. This means that the primers must have sufficient complementarity to hybridize to their respective strands. Non-complementary nucleotides may be present.

As used herein, the term "complementary" refers to the ability to form a duplex nucleic acid complex either randomly according to the usual Watson-Crick rule or by designing two nucleotide sequences to bind to each other sequence-specifically by hydrogen bonding of their purine and/or pyrimidine bases. It may also refer to the ability of nucleotide sequences (which may comprise modified nucleotides, or analogs of deoxyribonucleotides and ribonucleotides, or combinations thereof) to sequence-specifically bind to each other to form alternative nucleic acid duplex structures according to rules other than the usual Watson-Crick rules.

As used herein, the terms "hybridize" and "anneal" are interchangeable and refer to the process by which two nucleotide sequences that are complementary to each other are joined together to form a duplex sequence or segment.

The terms "duplex" and "double-stranded" are interchangeable, meaning a structure that is formed as a result of hybridization between two complementary nucleic acid sequences. Such duplexes may be formed by complementary binding of two DNA segments to each other, two RNA segments to each other, or a DNA segment to an RNA segment, or two segments consisting of a mixture of RNA and DNA to each other, the latter structure also being referred to as a hybrid duplex. Either or both members of such duplexes may contain modified nucleotides and/or nucleotide analogs as well as nucleoside analogs. As disclosed herein, such duplexes are formed as a result of the binding of one or more blocking oligonucleotides to a sample sequence.

As used herein, the terms "wild-type nucleic acid", "normal nucleic acid", "nucleic acid with normal nucleotides", "wild-type DNA", and "wild-type template" are used interchangeably and refer to a polynucleotide having a nucleotide sequence that is considered normal or unaltered.

As used herein, the terms "mutant polynucleotide," "mutant nucleic acid," "variant nucleic acid," and "nucleic acid having variant nucleotides" refer to a polynucleotide having a nucleotide sequence that is different from the nucleotide sequence of the corresponding wild-type polynucleotide. The difference in nucleotide sequence of a mutant polynucleotide compared to a wild-type polynucleotide is referred to as a nucleotide "mutation", "variant nucleotide" or "variation". The term "one or more variant nucleotides" also refers to one or more nucleotide substitutions, one or more nucleotide deletions, one or more nucleotide insertions, one or more nucleotide methylation, and/or one or more nucleotide modification changes.

As used herein, "amplifying" means using any amplification procedure to increase the concentration or copy number of a particular nucleic acid sequence within a mixture of nucleic acid sequences. The amplification may be linear amplification or exponential amplification.

The term "amplification product" or "amplicon" refers to a fragment of DNA or RNA that is amplified by a polymerase using primers in an amplification method.

The term "primer extension product" refers to a fragment of DNA or RNA that is extended by a polymerase using one or a pair of primers in a reaction that may involve one pass extension (e.g., first strand cDNA synthesis), or multiple extension cycles (which may be linear amplification, or cDNA synthesis), or many extension cycles (which may be exponential amplification (e.g., PCR)).

The term "compatible" refers to a primer sequence or a portion of a primer sequence that is the same, or substantially the same, complementary, substantially complementary, or similar to a PCR primer sequence/sequencing primer sequence used in a massively parallel sequencing platform.

The practice of the present invention will employ, unless otherwise indicated, conventional techniques of molecular biology, microbiology, and recombinant DNA technology, which are within the skill of the art.

In one aspect, the invention provides a Removable Template Oligonucleotide (RTO) for generating a library of polynucleotides, the Removable Template Oligonucleotide (RTO) comprising:

(a) a 3' random sequence;

(b) a blocker moiety attached to the 3' end, the blocker moiety rendering the RTO inextensible;

(c) a universal sequence 5' to the random sequence; and

(d) a nucleotide sequence/modification (NSM) that is recognized by an agent,

wherein the RTO serves as a template, is not incorporated into the reaction product, and is destroyed/removed after the reaction,

wherein the nucleotide sequence/modification facilitates the removal of the RTO.

In one embodiment, the 3' blocker moiety is the same as the NSM. For example, the 3' blocker moiety and the NSM may both be biotin, with the agent being avidin or streptavidin. The NSM allows the RTO to be removed either by digestion that cleaves the strand or by affinity purification. The RTO molecule includes one or more NSM moieties that render the RTO degradable, or non-interfering and non-competitive, in one or more reactions following the extension reaction, wherein the moieties are recognizable by an agent that facilitates digestion/removal of the RTO.

The Removable Template Oligonucleotide (RTO) may alternatively be referred to as an Adaptor Template Oligonucleotide (ATO). The terms RTO and ATO may be used interchangeably and refer to an oligonucleotide that serves as a template for extending the end of a target, so that a known sequence is added to the target by polymerase extension. The term Adaptor Template Oligonucleotide (ATO) or Removable Template Oligonucleotide (RTO) refers to a population of sequences having a region common to every member of the population (the universal region) and a randomly variable sequence (denoted N, where N is any of the four bases). Due to the random nature of the 3' end, a population of Adaptor Template Oligonucleotides (ATOs) comprises many different sequences.

The present disclosure provides an Adaptor Template Oligonucleotide (ATO) for extending a polynucleotide, the Adaptor Template Oligonucleotide (ATO) comprising:

(a) a 3' random sequence;

(b) a blocker attached to the 3' end, the blocker rendering the ATO inextensible; and

(c) a universal sequence 5' to the random sequence;

wherein the ATO serves as a template to direct an extension reaction by a polymerase.
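The roles of elements (a) to (c) can be illustrated with a minimal Python sketch of template-directed extension. It is a toy string model under stated assumptions, not the disclosed protocol: the class and function names, the example sequences, and the choice of "C3 spacer" as blocker label are hypothetical, and hybridization is reduced to exact base pairing between the target 3' end and the random region.

```python
from dataclasses import dataclass

COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA string written 5'->3'."""
    return seq.translate(COMPLEMENT)[::-1]

@dataclass(frozen=True)
class ATO:
    universal: str               # (c) universal sequence, written 5'->3'
    random_len: int              # (a) number of 3' random 'N' positions
    blocker: str = "C3 spacer"   # (b) 3' blocker label; the ATO is never extended

    def member_for(self, target: str) -> str:
        """Pick the population member whose random 3' region pairs with
        the 3' end of the target (assumed exact pairing)."""
        return self.universal + reverse_complement(target[-self.random_len:])

def extend_on_ato(target: str, ato: ATO) -> str:
    """Extend the target 3' end using the ATO as template: the polymerase
    copies the universal region, so the target gains its reverse complement."""
    if len(target) < ato.random_len:
        raise ValueError("target is shorter than the random region")
    return target + reverse_complement(ato.universal)

# Hypothetical sequences for illustration only.
ato = ATO(universal="GATTACA", random_len=6)
target = "TTTTCCGGAACT"
print(ato.member_for(target))       # GATTACAAGTTCC
print(extend_on_ato(target, ato))   # TTTTCCGGAACTTGTAATC
```

Only the target is modelled as growing, because the 3' blocker makes the ATO a template and never a primer.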

The term universal sequence refers to the entirety of the ATOs from their 5' end to the first nucleotide of the random or target-specific sequence, and is so named because it is a "universal sequence" present on all ATOs.

The ATO molecule may also include one or more moieties that render the ATO degradable or non-interfering and non-competitive in one or more reactions following the extension reaction, wherein the moieties are recognized by agents that facilitate digestion/removal of the ATO.

The present disclosure provides an Adaptor Template Oligonucleotide (ATO) for extending a polynucleotide, the Adaptor Template Oligonucleotide (ATO) comprising:

(a) a 3' random sequence of 3 to 36 'N' bases;

(b) a 3' end with a blocker that renders the ATO inextensible;

(c) a universal sequence 5' to the random sequence;

(d) optionally, modified nucleotides or linkages rendering the ATO resistant to 3' exonuclease cleavage; and

(e) a moiety that renders the ATO degradable.
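As a rough illustration of what the 3 to 36 'N' base range in (a) implies, the number of distinct random 3' sequences grows as 4 to the power of the random length, assuming each N position is an equal choice among the four standard bases. The function name below is hypothetical.

```python
def ato_population_size(n_random: int) -> int:
    """Number of distinct random 3' sequences for n_random 'N' positions,
    assuming four possible bases at every position."""
    if not 3 <= n_random <= 36:
        raise ValueError("random length outside the 3 to 36 range given above")
    return 4 ** n_random

for n in (3, 6, 12):
    print(n, ato_population_size(n))
# 3 64
# 6 4096
# 12 16777216
```

Short random regions give a small population in which every member is well represented; long ones give a population far larger than the number of molecules in a typical reaction.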

In one embodiment, the moiety is a uracil nucleotide, wherein the agent is a dU-glycosylase, or a dU-glycosylase and an apurinic/apyrimidinic endonuclease, capable of digesting/removing ATO following the first extension reaction.

In another embodiment, the moiety is a ribonucleotide, wherein the ribonucleotide is incorporated into the ATO during oligonucleotide (oligo) synthesis in place of any or all of the nucleotides; wherein the agent is a ribonuclease capable of digesting/removing ATO after the first extension reaction.

The ATO may be an RNA oligonucleotide, or a DNA oligonucleotide, or a combination of a DNA oligonucleotide and an RNA oligonucleotide.

An ATO may be a combination of one or more different ATOs. The combined ATOs may differ in sequence, in design, or in function. As used herein, the term "ATO" may refer to any combination of one or more ATOs, an ATO of any sequence, an ATO of any design having any combination of ATO design features, or any combination of ATOs having any combination of functions. When a combination of ATOs is used, the universal sequence may vary among the ATOs used; the term universal sequence is still used in that case.

In another embodiment, the degradable moiety is a sequence recognizable by a restriction enzyme, wherein the agent is a restriction enzyme.

The universal sequence may include an RNA polymerase promoter sequence. Any RNA polymerase (such as T7 RNA polymerase, or T3 RNA polymerase, or SP6 RNA polymerase, or a combination thereof) can be used. The universal sequence may include an RNA polymerase promoter sequence and/or a priming site located 3' to the promoter sequence. The priming site provides a primer binding sequence for subsequent amplification.

The universal sequence may be double-stranded or partially double-stranded. Protecting the universal sequence as a double-stranded region prevents it from hybridizing to the target polynucleotide or to the random 3' ends of the ATO. In one embodiment, the ATO comprises a 5' stem portion sequence that is complementary or partially complementary to all or a portion of the universal sequence, and is thereby capable of forming a stem-loop structure or an interrupted stem-loop structure. The ATO molecule may comprise in 5' to 3' order: a 5' stem portion, an RNA polymerase sequence, a priming site sequence, and a 3' sequence that is a random/degenerate sequence, a mixture of random/degenerate and specific sequences, or a specific sequence. The RNA polymerase sequence may be located in the loop portion, or in a portion of the stem and a portion of the loop, or in the stem. The loop portion may or may not include a non-replicable linkage. Likewise, the stem portion may or may not comprise a non-replicable linkage. If the stem portion carries an additional sequence at its 5' end, there may be a non-replicable linkage between the stem portion and the additional sequence. In another embodiment, the 5' stem portion comprises a non-replicable linkage. The non-replicable linkage may be selected from (but is not limited to) the following group: a C3 spacer phosphoramidite, a triethylene glycol spacer, an 18-atom hexaethylene glycol spacer, or 1',2'-dideoxyribose (dSpacer).

The double-stranded stem portion may comprise one or more non-complementary regions, wherein the one or more non-complementary regions in the universal sequence strand comprise one or more random degenerate sequences, or specifically designed mismatches. The stem portion may form two or more interrupted segments separated by one or more non-replicable linkages. The stem portion may form two or more interrupted segments separated by one or more regions of mismatched base pairs.

In another embodiment, the ATO comprises a separate upper strand that is complementary or partially complementary to all or a portion of the lower strand comprising the universal sequence. The 5' end of the separate upper strand may include a phosphate group.
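Whether a candidate 5' stem portion can fold back onto the universal sequence can be checked with a simple reverse-complement search. The sketch below assumes perfect Watson-Crick pairing and ignores loop length, mismatches, non-replicable linkages and thermodynamics; the function name and the sequences are hypothetical.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA string written 5'->3'."""
    return seq.translate(COMPLEMENT)[::-1]

def stem_pairing_site(stem_5p: str, universal_region: str) -> int:
    """Return the 0-based start of the stretch of `universal_region` that the
    5' stem portion can pair with, or -1 if no fully complementary stretch exists."""
    return universal_region.find(reverse_complement(stem_5p))

# Hypothetical universal region and two candidate stems.
universal_region = "GGCATTCGACCTGA"
print(stem_pairing_site("CGAATGCC", universal_region))  # 0
print(stem_pairing_site("AAAAAAAA", universal_region))  # -1
```

A stem that pairs with only part of the universal region leaves the remainder single-stranded, which corresponds to the partially double-stranded case described above.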

An ATO may comprise one or more affinity binding moieties attached at any position of the ATO. The affinity binding moiety may be biotin.

The universal sequence of the ATO may include one or more random sequences or one or more sequence-specific sequences as additional Unique Identifier (UID) sequences. One or more UID sequences may be located within the stem of the ATO. One or more UID sequences may be located within a loop of the ATO. One or more UID sequences may be located within the stem and loop of the ATO, and two or more UIDs may be present within the stem, the loop, or both.

The ATO sequence may include, at any position, one or more non-canonical nucleotides (non-dA, non-dG, non-dT, non-dC) that are naturally occurring nucleotides or artificial nucleotides. The one or more non-canonical nucleotides may be universal nucleotides. Non-canonical nucleotides may include inosine bases. The 3' random sequence may consist partly or entirely of non-canonical nucleotides. The universal sequence may consist partly or entirely of non-canonical nucleotides.

The 3' end of the ATO may include one or more modified nucleotides or linkages that render the ATO resistant to the 3' exonuclease activity of the DNA polymerase. The modified linkage may be a phosphorothioate linkage.

The ATO may also include a specific sequence 3' of the random sequence, wherein the specific sequence either is capable of hybridizing to a specific location of the target polynucleotide or is not designed for a specific target, and a portion of the 3' random/degenerate sequence is used as a template on which the polynucleotide is extended by a polymerase. The 3' random sequence of the ATO can be divided into two or more parts by a specifically designed specific sequence, and a part of the 3' random/degenerate/target-specific sequence is used as a template on which the target polynucleotide is extended by a polymerase.

The present disclosure also provides compositions comprising at least one nucleic acid polymerase and one or more Adaptor Template Oligonucleotides (ATOs) from any combination or mixture of the described Adaptor Template Oligonucleotides (ATOs). The nucleic acid polymerase may be a DNA polymerase, an RNA polymerase, a reverse transcriptase, or any combination of DNA polymerase, RNA polymerase and reverse transcriptase. Preferably, the polymerase has strand displacement activity. The polymerase may have 3' to 5' exonuclease activity. The polymerase may be a mixture of one or more different DNA polymerases, or a mixture of one or more RNA polymerases, or a combination of one or more DNA polymerases and RNA polymerases. The polymerase is a template-dependent polymerase, not a template-independent polymerase.

The present disclosure provides methods of extending a target polynucleotide, the methods comprising incubating the target polynucleotide with a composition described herein under conditions sufficient to allow extension of the 3' end of the target polynucleotide using one or more ATOs as a template (referred to as an "ATO reaction"), wherein the one or more ATOs can hybridize to any position or to specific positions of the target polynucleotide. In another aspect, the method further comprises degrading the one or more ATOs following extension of the target polynucleotide.

The first ATO reaction may be an extension reaction in which the target polynucleotide is extended as a primer and one or more ATO's are used as templates. Alternatively, the first ATO reaction may be an extension-ligation reaction (nick filling) in which the target polynucleotide is extended and ligated to the 5' stem portion or upper strand of the ATO(s). Alternatively, the first reaction may be a ligation reaction only, wherein the target polynucleotide is hybridized to the 3 'random sequence portion of the one or more ATO and is directly ligated to the 5' portion or upper strand of the one or more ATO.

The ATO comprises one or more moieties that render the ATO degradable, wherein the one or more moieties are identifiable by an agent that facilitates digestion/removal of the ATO. Alternatively, the ATO comprises one or more moieties that render the ATO non-interfering and/or non-competitive in the reaction after the extension reaction.

In some embodiments, the ATO molecule comprises a dU base as the moiety and can be degraded by incubation with dU-glycosylase (which creates abasic sites) followed by incubation at a temperature above 80 ℃ (which introduces breaks at the abasic sites), or by incubation with a mixture of dU-glycosylase and an apurinic/apyrimidinic endonuclease. Methods and compositions are provided that include an ATO having a dU base and incubating with a dU-glycosylase to degrade the ATO molecule, or incubating with a dU-glycosylase and subsequently incubating at a temperature greater than 80 ℃ to degrade the ATO molecule, or incubating the ATO with a mixture of a dU-glycosylase and an apurinic/apyrimidinic endonuclease. In a further aspect, the ATO comprises ribonucleotides and is degradable with a ribonuclease under conditions that support ribonuclease activity. In related aspects, the ribonuclease is selected from the group consisting of ribonuclease H (RNase H), ribonuclease HII (RNase HII), ribonuclease A (RNase A), and ribonuclease T1 (RNase T1).
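By way of illustration only, the effect of dU-based degradation can be sketched in Python. The sketch assumes an ATO written as a plain string in which `U` marks a deoxyuridine and `N` a random position, and it treats glycosylase treatment followed by backbone cleavage as simply removing each `U` and breaking the strand there. The function names and the example sequence are hypothetical and are not taken from the disclosure.

```python
from typing import List


def fragment_at_du(ato: str) -> List[str]:
    """Return the fragments left after every dU ('U') residue is excised."""
    # Splitting on 'U' drops the uracil residue itself, mimicking an abasic
    # site that is subsequently cleaved; empty pieces are discarded.
    return [piece for piece in ato.upper().split("U") if piece]


def longest_fragment(ato: str) -> int:
    """Length of the longest surviving fragment (0 if nothing survives)."""
    return max((len(piece) for piece in fragment_at_du(ato)), default=0)


if __name__ == "__main__":
    example_ato = "ACGUACGGUCCAUGNNNNNN"  # hypothetical sequence
    print(fragment_at_du(example_ato))
    print(longest_fragment(example_ato))
```

Under these assumptions, spacing the dU residues more closely leaves shorter fragments, which are less likely to remain annealed in later reactions.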

In other embodiments, a moiety may be a modified nucleotide or nucleotide analog. The modified nucleotide may be a non-canonical nucleotide (non-dA, non-dG, non-dT, non-dC), and the atypical nucleotide is a naturally occurring nucleotide or an artificial nucleotide. Atypical nucleotides may be universal nucleotides. Atypical nucleotides may include inosine bases, where inosine is used to replace guanine positions in the ATO sequence, where deoxyinosine preferentially directs incorporation of dC in the growing nascent strand by DNA polymerase, where it is expected and understood that other residues may not be incorporated frequently.
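As a simplified illustration of the inosine substitution described above, the following Python sketch replaces each G in a template with I and copies the template under the assumption that deoxyinosine always directs incorporation of dC. That assumption is an idealization: as noted, other residues may occasionally be incorporated. All names are hypothetical.

```python
# Template base -> base assumed to be incorporated opposite it.
# 'I' (deoxyinosine) -> 'C' reflects the stated preference only.
PAIRING = {"A": "T", "C": "G", "G": "C", "T": "A", "I": "C"}


def substitute_g_with_inosine(seq: str) -> str:
    """Replace every G in a 5'->3' sequence with inosine ('I')."""
    return seq.upper().replace("G", "I")


def copy_template(template_5to3: str) -> str:
    """Return the nascent strand (5'->3') copied from a 5'->3' template."""
    return "".join(PAIRING[base] for base in reversed(template_5to3.upper()))


if __name__ == "__main__":
    universal = "AGGCT"  # hypothetical universal sequence
    with_inosine = substitute_g_with_inosine(universal)
    # Under the idealized pairing, both templates yield the same copy.
    print(with_inosine, copy_template(with_inosine), copy_template(universal))
```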

The modified nucleotides/analogs can be present in the 5' universal sequence (5' universal portion) or in the 3' random/degenerate portion, or the modified nucleotides/analogs can be present in both the 5' universal sequence and the 3' random sequence (3' random portion).

The moiety may weaken base pairing, or may be a modification that can be degraded or removed after the first ATO extension reaction. Because of the weak binding, the ATO cannot interfere with or compete with normal primers for binding to the same template in subsequent reactions. Likewise, once digested by the agent, the ATO cannot interfere with subsequent reactions.

The modification that provides weaker intermolecular hydrogen bonding than standard canonical base pairing can be any naturally occurring nucleotide or artificial nucleotide analog. A preferred naturally occurring atypical nucleotide for use in the ATO is inosine, which can be used in place of dG to pair with dC, although the dI:dC pair is weaker than dG:dC.

Nucleic acid analogs are compounds that are structurally similar to naturally occurring RNA and DNA. Nucleic acids are strands of nucleotides, each of which consists of three parts: a phosphate backbone, a pentose sugar (ribose or deoxyribose), and one of four nucleobases. An analog may have any of these parts altered. Typically, analog nucleobases confer, among other things, different base pairing and base stacking properties. Examples include universal bases, which can pair with all four canonical bases, and phosphate-sugar backbone analogs, which affect strand properties. The 3' random portion of the ATO, which serves as a template and hybridizes to any position of the polynucleotide, may include one or more universal bases.

Universal bases are analogs that can replace any of the four DNA bases and pair only weakly. Commonly used universal bases include 3-nitropyrrole, 5-nitroindole, and 2'-deoxyinosine. Inosine shows a slight bias in hybridization, with dI:dC being preferred over other pairings.

A typical base may have a carbonyl or amine group on a carbon around the nitrogen atom furthest from the glycosidic bond, which allows it to base pair via hydrogen bonds (amino to keto, purine to pyrimidine) (watson-crick base pairing).

Universal bases can pair indiscriminately with any other base, but generally greatly reduce the melting temperature of the sequence; examples include 2'-deoxyinosine (hypoxanthine deoxynucleotide) and its derivatives, nitroazole analogues, and hydrophobic aromatic non-hydrogen-bonding bases (strong stacking effect). The present disclosure exploits this property so that an inosine-containing ATO has a lower Tm than a normal oligonucleotide and therefore does not interfere with subsequent reactions in which the annealing temperature is higher than the Tm of the universal portion of the ATO.

Deoxyinosine, a naturally occurring base, is considered the first "universal" base, meaning that it can base pair with the other natural bases A, C, G and T. Indeed, studies on deoxyinosine have shown that, although it does not self-aggregate like deoxyguanosine, it acts as a specific analogue of deoxyguanosine.

The weakly pairing nucleotides in the universal portion of the ATO include modified nucleotides, nucleotide analogs, and/or modified linkages, which may be, but are not limited to, inosine, diinosine (dinosine), phosphorothioate linkages, or methylphosphonate linkages. Any modification of the nucleotides and/or linkages may be used as long as it provides weaker pairing than the native nucleotides or linkages. The weak pairing in the universal portion of the ATO can also be defined relative to the primers used in subsequent reactions. If the weakly pairing nucleotide is a normal natural nucleotide, the primer used in the subsequent reaction may include a strongly pairing nucleotide, which is a modified nucleotide and/or a modified linkage. Modified nucleotides that provide strong or weak pairing abilities may include, but are not limited to, LNA, 2'-O-methyl RNA, 2-amino-dA, 2-thio-dT, 2-aminopurine, 2'-fluoro RNA bases, C5-propyne analogs and methyl analogs of AP-dC and dT (methyl analogue), deoxyinosine, deoxyuridine, or superbase nucleotides (Epoch Bioscience).

The ATO may include a 5' stem portion sequence that is complementary to a portion of the universal sequence and is capable of forming a stem-loop structure, wherein the loop portion may include a non-replicable linkage. The non-replicable linkage may be selected from (but is not limited to) the following group: a C3 spacer phosphoramidite, a triethylene glycol spacer, an 18-atom hexaethylene glycol spacer, or 1',2'-dideoxyribose (dSpacer).

The 5' stem portion sequence may comprise a sequence that is complementary or substantially complementary to a portion of the 5' universal sequence, preferably a portion adjacent to the random sequence. In one embodiment, the loop portion may comprise a non-replicable linkage, wherein the polymerase in the ATO reaction may have strand displacement activity or 5' to 3' exonuclease activity. Desirably, the target polynucleotide hybridizes only to the 3' random/degenerate/target-specific sequence of the ATO, and not to the 5' universal sequence. This is achieved by the stem-loop structure, which prevents the target polynucleotide from hybridizing to the universal sequence of the ATO in an ATO reaction. In an ATO reaction, the target polynucleotide hybridizes to the 3' random sequence and is extended by a DNA polymerase, an RNA polymerase, a reverse transcriptase, or any combination of these. The extended strand displaces the stem portion and terminates at the non-replicable linkage. Furthermore, in reactions after the first ATO reaction, the stem-loop structure prevents the ATO from competing for binding to the modified target polynucleotide, so that any added primers can bind effectively to the modified target polynucleotide for amplification. In this embodiment, the stem portion and the non-replicable linkage serve as the moiety that renders the ATO non-interfering and non-competitive in reactions following the ATO reaction. The ATO may or may not include other modifications (such as one or more uracil nucleotides or ribonucleotides) that can be digested after the ATO reaction.
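The stem-loop behaviour described above can be sketched, purely for illustration, as follows. The sketch assumes that all sequences are written 5' to 3', that a `/` character marks the non-replicable linkage in the loop, and that the polymerase copies the template from the hybridization site toward the 5' end until it reaches that linkage. The function names and sequences are hypothetical.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(seq: str) -> str:
    """Reverse complement of a 5'->3' DNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def stem_pairs(stem: str, universal: str) -> bool:
    """True if the 5' stem can pair with some portion of the universal sequence."""
    return revcomp(stem) in universal.upper()


def copy_to_block(template_5to3: str, block: str = "/") -> str:
    """Copy the template region lying 3' of the last non-replicable linkage.

    `template_5to3` is the part of the ATO 5' of the hybridization site;
    extension proceeds toward its 5' end and stops at `block`.
    """
    copied_region = template_5to3.rsplit(block, 1)[-1]
    return revcomp(copied_region)


if __name__ == "__main__":
    stem, loop, universal = "TCCA", "TT/TT", "CCTTGGA"  # hypothetical
    print(stem_pairs(stem, universal))
    print(copy_to_block(stem + loop + universal))
```

In this toy model the stem and the 5' half of the loop are never copied, which corresponds to extension terminating at the non-replicable linkage.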

In another embodiment, the loop portion of the ATO may include nucleotides that can be digested (e.g., one or more uracil nucleotides or ribonucleotides). In an ATO reaction, the target polynucleotide is hybridized to the 3' random sequence and extended by a DNA polymerase. The extended strand meets the 5' end of the stem strand, which carries a 5' phosphate group, and is ligated to it by DNA ligase. The reaction includes a DNA ligase and a DNA polymerase with nick filling activity (e.g., DNA polymerase I or the Klenow large fragment). An ATO molecule may comprise a plurality of uracil nucleotides or ribonucleotides. After the ATO reaction, the ATO molecules may be digested.

In yet another embodiment, the loop portion of the ATO may include nucleotides that can be digested (e.g., uracil nucleotides or ribonucleotides). In the first reaction, the target polynucleotide hybridizes to the 3' random sequence adjacent to the 5' end of the stem strand and is ligated to the 5' end of the stem strand by DNA ligase. The 5' universal sequence portion may comprise a plurality of uracil nucleotides or ribonucleotides. After the first reaction, the 5' universal sequence portion may be digested.

In yet another embodiment, the ATO comprises a separate upper strand that is complementary or substantially complementary to a portion or all of the universal sequence, so that the ATO is capable of forming a partially double-stranded structure. It is desirable that the target polynucleotide hybridizes only to the 3' random sequence of the ATO, and not to the 5' universal sequence. This is achieved by the double-stranded structure provided by the upper strand, which prevents the target polynucleotide from hybridizing to the universal sequence of the ATO in the first ATO reaction. The ATO may include other modifications (such as uracil nucleotides or ribonucleotides) that can be digested after the first reaction. In the first ATO reaction, the target polynucleotide is hybridized to the 3' random sequence and extended by a DNA polymerase. The extended strand may displace the upper strand. Alternatively, the extended strand meets the 5' end of the upper strand, which carries a 5' phosphate group, and is then ligated to it by DNA ligase. The reaction includes a DNA ligase and a DNA polymerase with nick filling activity (e.g., DNA polymerase I or the Klenow large fragment). The 5' universal sequence portion may comprise a plurality of uracil nucleotides or ribonucleotides. After the first reaction, the 5' universal sequence portion may be digested. In another aspect, in the first reaction, the target polynucleotide hybridizes to the 3' random sequence adjacent to the 5' end of the upper strand and is ligated to the 5' end of the upper strand by a DNA ligase. The 5' universal sequence portion and the 3' random portion may include a plurality of uracil nucleotides. After the first reaction, the ATO molecule may be digested.

In embodiments where ligation is used, the 5' end of the separate upper strand may include a phosphate group and the 3' end of the separate upper strand may include biotin. In embodiments where extension is used without ligation, the 5' end of the separate upper strand does not include a phosphate group and the 3' end of the separate upper strand does not include biotin, but the separate upper strand may include nucleotides that can be digested (e.g., uracil nucleotides).

The random sequence portion may be of any length, for example in the range of 3 to 48 nucleotides (preferably 3 to 36 nucleotides, most preferably 12 to 30 nucleotides). Specifically, the random sequence portion has 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, or more nucleotides. The random sequence may comprise completely random nucleotides, wherein any of the four nucleotides may be present at any position. Alternatively, degenerate nucleotides may be present at some positions. The random sequence portion may include specific (non-random) nucleotides at certain positions; for example, the 3' terminal nucleotide may be a specific nucleotide (e.g., a T residue, an A residue, a G residue, or a C residue). The particular 3' terminal nucleotide is selected so that modifications (such as biotin, a spacer or a phosphate blocker) can be attached conveniently and at low cost. The random sequence may comprise degenerate, semi-degenerate or fully random nucleotides together with specific nucleotides, wherein random nucleotides predominate. The random sequence may also include naturally occurring or artificial modified nucleotides. These modified nucleotides may be universal bases as described above. The random sequence may have a sequence bias, such as enrichment of the entire sequence in C and G residues, meaning that more than 50% of the nucleotides are C and G, or enrichment of the entire sequence in A and T residues, meaning that more than 50% of the nucleotides are A and T. The random sequence may be partially or completely replaced by a target-specific sequence.
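For illustration only, the design options listed above (length, G/C or A/T enrichment, and a fixed 3' terminal nucleotide) can be modelled with the short Python sketch below. The parameter names, the default values and the use of a seeded pseudo-random generator are assumptions made for the example; the sequence is written 5' to 3', so the last character is the 3' terminal nucleotide.

```python
import random
from typing import Optional


def random_portion(length: int,
                   gc_fraction: float = 0.5,
                   three_prime: Optional[str] = None,
                   seed: Optional[int] = None) -> str:
    """Draw one random-portion sequence (5'->3') with an optional G/C bias."""
    if length < 1:
        raise ValueError("length must be positive")
    rng = random.Random(seed)
    # Per-base probabilities in the order A, C, G, T.
    weights = [(1 - gc_fraction) / 2, gc_fraction / 2,
               gc_fraction / 2, (1 - gc_fraction) / 2]
    bases = rng.choices("ACGT", weights=weights, k=length)
    if three_prime is not None:
        bases[-1] = three_prime  # fixed (non-random) 3' terminal nucleotide
    return "".join(bases)


def diversity(length: int, fixed_positions: int = 0) -> int:
    """Number of distinct sequences when the remaining positions are fully random."""
    return 4 ** (length - fixed_positions)


if __name__ == "__main__":
    print(random_portion(12, gc_fraction=0.6, three_prime="T", seed=1))
    print(diversity(12, fixed_positions=1))
```

A 12-nucleotide random portion with one fixed position still allows 4^11 (about 4.2 million) distinct sequences in this model.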

The ATO hybridizes to the target polynucleotide through a random sequence under non-stringent conditions, wherein the 3' end of the polynucleotide hybridizes to the random sequence, and in one embodiment the elongation is performed using the ATO as a template.

As described, the random sequence provides a function as a template for hybridization with the target polynucleotide, and the ATO is used as a template to direct extension of the target polynucleotide. In addition, the random sequence also provides a second function as a UID (unique identifier, molecular barcode) in next generation sequencing.

In another embodiment, the ATO molecule may include a 3 'additional specificity sequence located 3' to the random sequence portion. The 3' additional specific sequence may include a sequence capable of hybridizing to a specific location of the target polynucleotide. The 3' additional specificity sequence may include a restriction enzyme recognition sequence that, when hybridized to the target sequence, can cause a nick in the target polynucleotide by the enzyme.

The 3' end of the ATO may have attached a blocker group that renders the ATO inextensible. Any blocker group may be used. Typically, the 3' end of the ATO will be "blocked" to prevent the ATO from acting as a primer, unless elongation of the ATO is required by design (in which case the 3' end will not be blocked and elongation will be allowed). "Blocking" can be achieved by attaching a chemical moiety (such as a biotin or phosphate group) to the 3' hydroxyl group of the last nucleotide; depending on the group chosen, the moiety can serve a dual purpose by also acting as an affinity capture moiety for subsequent removal or capture of the ATO following the ATO reaction. Blocking can also be achieved by removing the 3' OH or by using nucleotides lacking a 3' OH (such as dideoxynucleotides). The blocking group may be selected from the group consisting of: at least one ribonucleotide, at least one deoxynucleotide, a C3 spacer, a phosphate, a dideoxynucleotide, an amino group and an inverted deoxythymidine.

The ATO reaction preferably contains a polymerase having 3' to 5' exonuclease activity, and thus the ATO molecule preferably includes a modified nucleotide or linkage at the 3' end that prevents digestion by the polymerase. Any resistance-conferring modification may be used. In one example, the 3' end includes a modified linkage between the final nucleotides (preferably the last two nucleotides) such that the linkage is a resistant moiety (such as a phosphorothioate) rather than a conventional phosphodiester.

In one embodiment, the 5' universal sequence portion is the portion of the sequence that serves as a template for target polynucleotide extension, the product of which is equivalent to the addition of adaptors via ligation in sequencing library preparation. However, the extended sequences (like adaptors) are synthesized de novo and are not ligated from any added oligonucleotide. The 5' universal sequence portion includes a primer sequence or primer binding sequence that is compatible with next generation (or third generation) sequencing (NGS) or other massively parallel sequencing. For example, the 5' universal sequence portion may include a sequencing primer sequence for the Illumina platform, and/or an anchor primer sequence for the Illumina platform.

The 5' universal sequence portion may form a stem-loop structure or a hairpin structure in which the stem portion ends near the 5' end of the random sequence portion. The 5' end of the stem may form a 5' overhang. The stem portion may have any length, for example from 3 to 30 nucleotides (preferably 4 to 24 nucleotides). The stem portion may be fully double stranded or, preferably, not fully double stranded. The stem portion may include one or more unpaired regions. One or more of the unpaired regions may comprise a random sequence that functions as a UID (molecular barcode) in NGS. The stem portion may include a sequence (e.g., a restriction enzyme site) that can be recognized and cleaved by an enzyme. Stem formation can prevent hybridization of the universal sequence to the target polynucleotide. The loop portion may have any length, for example from 0 to 36 nucleotides (preferably 1 to 30 nucleotides). The loop may include, in part or in whole, nucleotide analogs or other chemical linkages that are not replicable by the polymerase, e.g., abasic sites, hexaethylene glycol (HEG) monomers, 18-atom hexaethylene glycol spacers, or 1',2'-dideoxyribose (dSpacer). The non-replicable linkage prevents polymerase-mediated extension on the 5' portion of the ATO.

The ATO molecule comprises nucleotides selected from the group consisting of: 2'-deoxythymidine 5'-monophosphate (dTMP), 2'-deoxyguanosine 5'-monophosphate (dGMP), 2'-deoxyadenosine 5'-monophosphate (dAMP), 2'-deoxycytidine 5'-monophosphate (dCMP), 2'-deoxyuridine 5'-monophosphate (dUMP), thymidine monophosphate (TMP), guanosine monophosphate (GMP), adenosine monophosphate (AMP), cytidine monophosphate (CMP), uridine monophosphate (UMP), base analogs, and combinations thereof. It is also contemplated that the ATO includes modified nucleotides or linkage modifications as defined herein.

The modified oligonucleotide or modified polynucleotide may be one in which one or more sugars and/or one or more internucleotide linkages of the nucleotide units are replaced by "non-naturally occurring" groups. In one aspect, this embodiment contemplates Peptide Nucleic Acids (PNA). In PNA compounds, the sugar backbone of the polynucleotide is replaced by an amide-containing backbone. The modified oligonucleotide/polynucleotide backbone may contain phosphorus atoms and may comprise, for example, phosphorothioates, chiral phosphorothioates, phosphorodithioates, phosphotriesters, aminoalkyl phosphotriesters, methylphosphonates, and other alkyl phosphonates. The modified oligonucleotide or modified polynucleotide may also contain one or more substituted sugar moieties. Further modifications include those that extend the genetic code (such as, but not limited to, Iso-dC and Iso-dG). Iso-dC and Iso-dG are chemical variants of cytosine and guanine, respectively. Iso-dC will hydrogen bond to Iso-dG but not to dG. Similarly, Iso-dG will base pair with Iso-dC, but not with dC. In one aspect, the modification of the sugar comprises Locked Nucleic Acid (LNA).

The target polynucleotide may be fragmented, naturally or artificially, randomly or specifically, or in a combination of random and specific ways, and after hybridization with the random or target-specific sequence of the ATO, the 3' end of the fragmented target polynucleotide is extended. The combination of the random 3' end sequence of the target polynucleotide and the portion extended on the random template provides a unique identification (UID, molecular barcode) sequence that can be used to group sequencing reads into families. Further, the ATO may include one or more additional UIDs located in the universal sequence. Additional UIDs may be located in the loop, near the stem portion and 5' of the random sequence portion. Additional UIDs may be located anywhere in the stem between the 3' random sequence and the loop. The additional UIDs may be of any length, for example from 2 to 48 nucleotides (preferably from 3 to 36 nucleotides). Additional UIDs may include fully random nucleotides, where any of the four nucleotides may be present at any position. Alternatively, degenerate nucleotides may be present at some positions.
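As a minimal sketch of how a UID can be used to group sequencing reads into families, the following Python example assumes, for simplicity, that the UID occupies a fixed number of bases at the start of each read. In practice the UID position depends on the library design; the function names and example reads are hypothetical.

```python
from collections import defaultdict
from typing import Dict, List


def group_by_uid(reads: List[str], uid_len: int) -> Dict[str, List[str]]:
    """Group reads into families keyed by their leading UID."""
    families = defaultdict(list)
    for read in reads:
        uid, insert = read[:uid_len], read[uid_len:]
        families[uid].append(insert)
    return dict(families)


def family_sizes(families: Dict[str, List[str]]) -> Dict[str, int]:
    """Number of reads supporting each UID family."""
    return {uid: len(members) for uid, members in families.items()}


if __name__ == "__main__":
    reads = ["AAGTTT", "AAGTTT", "CCGAAA"]  # hypothetical reads
    families = group_by_uid(reads, uid_len=3)
    print(families)
    print(family_sizes(families))
```

Reads sharing a UID are presumed to derive from the same original molecule, so family size indicates how often that molecule was sampled.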

ATO molecules may include atypical nucleotides, analogs or modifications that can be recognized and cleaved by agents. Atypical nucleotides are selected from the group consisting of dUMP, dIMP and 5-OH-Me-dCMP. An agent capable of cleaving the base portion of an atypical nucleotide is an N-glycosylase. The N-glycosylase is selected from the group consisting of uracil N-glycosylase (UNG), hypoxanthine N-glycosylase, and hydroxymethylcytosine N-glycosylase. When the atypical nucleotide is dUMP, the enzyme capable of cleaving the base portion of the atypical nucleotide is UNG, and the phosphodiester backbone may then be cleaved with DMED. In one embodiment, uracil nucleotides are incorporated into the ATO in place of thymine nucleotides during oligonucleotide synthesis. An ATO including an atypical nucleotide may be synthesized in the presence of two or more different atypical nucleotides, thereby producing an ATO including two or more different atypical nucleotides. When an ATO including an atypical nucleotide is synthesized in the presence of three typical nucleotides and one atypical nucleotide, or all four typical nucleotides and one atypical nucleotide, the atypical nucleotide is provided in a ratio suitable for degradation of the ATO after the ATO reaction. Generally, the base excision repair enzyme may be selected from the group consisting of DNA glycosylase, AP endonuclease, and deoxyphosphodiesterase. Preferably, the DNA glycosylase may be selected from the group consisting of uracil-DNA glycosylase, 3-methyladenine DNA glycosylase, pyrimidine hydrate-DNA glycosylase, FaPy-DNA glycosylase and thymine mismatch-DNA glycosylase. More preferably, the DNA glycosylase is a uracil-DNA glycosylase. Uracil-DNA glycosylase (UDG), also called uracil-N-glycosylase (UNG), is an enzyme that catalyzes the release of free uracil from single-stranded DNA and from double-stranded DNA of greater than 6 base pairs.

In one embodiment, an ATO molecule may include ribonucleotides, wherein the ribonucleotides are incorporated into the ATO in place of any or all of the nucleotides during oligonucleotide synthesis; wherein the agent is a ribonuclease, which is capable of degrading/removing ATO after the ATO reaction. The ATO may be RNA over the entire length or may be composed partially of RNA. Any portion of the ATO may be RNA, preferably the universal sequence portion is RNA.

In another embodiment, the ATO sequence may include a restriction enzyme recognition sequence located in the universal sequence or in the 3' additional specific sequence; wherein the agent is a restriction enzyme capable of degrading/removing the ATO after the ATO reaction. The restriction enzyme recognition sequence may be located in the stem portion of the ATO, which may be cleaved by the restriction enzyme.

In another embodiment, the ATO molecule may comprise an affinity binding moiety attached in any position of the ATO; wherein the agent is a protein or an antibody capable of removing ATO after the ATO reaction. For example, the affinity binding moiety is biotin; wherein the agent is avidin or streptavidin. In certain embodiments, the ATO may include one or more moieties incorporated into the 5 'terminus or 3' terminus or any internal location of the ATO that allow for affinity removal of the ATO from the reaction mixture following the ATO reaction. Preferred affinity moieties are those that can specifically interact with homologous ligands. For example, the affinity moiety may comprise biotin, digoxigenin, or the like. Other examples of capture groups include ligands, receptors, antibodies, haptens, enzymes, chemical groups or aptamers that can be recognized by antibodies. The affinity moiety may be immobilized on any desired substrate/solid support. Examples of desirable substrates include, for example, particles, beads, magnetic beads, optical trapping beads, microtiter plates, glass slides, paper, test strips, gels, other matrices, nitrocellulose, or nylon. For example, when the capture moiety is biotin, the substrate may comprise streptavidin. In some cases, the solid support is a bead. Examples of beads include, but are not limited to, streptavidin beads, agarose beads, magnetic beads, antibody-conjugated beads (e.g., anti-immunoglobulin microbeads), protein a-conjugated beads, protein G-conjugated beads, protein a/G-conjugated beads, protein L-conjugated beads, oligo-dT-conjugated beads, silica-like beads, anti-biotin beads, and anti-fluorochrome beads.

Although a portion of the random sequence of the RTO serves as a Unique Identification (UID) sequence, the RTO may include an additional UID located in the universal sequence portion.

The present disclosure provides a method for generating a polynucleotide library, the method comprising:

(i) generating a modified target polynucleotide using a target polynucleotide from the sample as a primer and a Removable Template Oligonucleotide (RTO) of any one of the RTOs as a template;

(ii) removing the RTO; and

(iii) generating a first Complementary Sequence (CS) of the modified target polynucleotide using a first primer, the first primer comprising a universal sequence,

wherein the generating operation comprises extending a primer hybridized to the template by a polymerase.

Accordingly, the present disclosure provides a method for extending a population of target polynucleotides having single stranded 3' ends, the method comprising:

(i) incubating the target polynucleotide with an Adaptor Template Oligonucleotide (ATO) having:

(a) a 3' random sequence;

(b) a 3' end with a blocker that renders the ATO inextensible; and

(c) a universal sequence located 5' to the random sequence,

wherein the target polynucleotide hybridizes to a 3' random sequence of the ATO;

(ii) polymerase extension of the target polynucleotide using ATO as a template, thereby producing an extended target polynucleotide having a 3' universal sequence.
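Steps (i) and (ii) above can be illustrated with a simplified Python model. The model assumes that all sequences are written 5' to 3', that hybridization requires an exact match between the last few bases of the target and a site within the random sequence, and that the 3' blocker simply means the ATO itself is never extended. The minimum overlap, the function names and the sequences are hypothetical, and 3' overhang trimming is not modelled.

```python
from typing import Optional

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(seq: str) -> str:
    """Reverse complement of a 5'->3' DNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def ato_extend(target: str, universal: str, random_part: str,
               min_overlap: int = 4) -> Optional[str]:
    """Extend the target 3' end on an ATO (5'-universal + random-3').

    Returns the extended target, or None if the target 3' end does not
    hybridize to the random sequence under this exact-match model.
    """
    ato = universal + random_part
    # Site on the ATO (5'->3') that pairs with the last bases of the target.
    site = revcomp(target[-min_overlap:])
    pos = random_part.find(site)
    if pos < 0:
        return None
    # Everything 5' of the hybridization site serves as the template.
    template = ato[: len(universal) + pos]
    return target + revcomp(template)


if __name__ == "__main__":
    product = ato_extend("GGCATTGCAC", "CCTTGG", "AGTGCTA")  # hypothetical
    print(product)
    print(product.endswith(revcomp("CCTTGG")))
```

In this model the extended target ends with the reverse complement of the universal sequence, and any random bases copied between the hybridization site and the universal sequence become part of the UID.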

The method can further comprise (iii) generating a first Complementary Sequence (CS) of the modified target polynucleotide, wherein generating the first CS comprises performing polymerase extension from the 3' universal sequence using the modified target polynucleotide as a template.

The present disclosure provides a method for extending a polynucleotide, the method comprising:

(i) generating a modified target polynucleotide by incubating the target polynucleotide with a composition as described above, wherein in an enzymatic first ATO reaction the 3' end of the target polynucleotide is hybridized to the 3' random sequence of an adaptor template oligonucleotide (first ATO), wherein the 3' end of the target polynucleotide is extended using the ATO as a template, and wherein, if a 3' overhang is present, the 3' end of the target polynucleotide is trimmed before extension occurs.

In one embodiment, following extension of the 3' end on the ATO, the modified polynucleotide is optionally incubated with single-stranded DNA cycloligase resulting in cyclization of the single-stranded modified polynucleotide.

The method further comprises (ii) generating a first Complementary Sequence (CS) of the modified target polynucleotide.

Step (ii) may comprise using the modified target polynucleotide as a template for one round of amplification, or as a template for successive rounds of amplification separated by a purification step.

In one embodiment, the operation of generating the first CS comprises extension using a first primer and using the modified target polynucleotide as a template, wherein the first primer hybridizes to the modified target polynucleotide and is extended by a polymerase. The first primer anneals to the 3' extended universal region of the modified target polynucleotide. The first primer may include additional sequences compatible with the NGS platform, such as a 5' tail containing the necessary sequences.
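The primer-extension route to the first CS can be sketched as follows, for illustration only. The sketch assumes sequences written 5' to 3', a first primer whose annealing portion is exactly complementary to the 3' end of the modified target polynucleotide, and an optional 5' tail that is simply prepended. The function names and sequences are hypothetical.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(seq: str) -> str:
    """Reverse complement of a 5'->3' DNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def first_cs(modified_target: str, primer: str, tail: str = "") -> str:
    """Return the first complementary sequence (5'->3').

    `primer` is the annealing portion of the first primer; `tail` is an
    optional 5' tail carrying additional platform-compatible sequence.
    """
    if not modified_target.upper().endswith(revcomp(primer)):
        raise ValueError("primer does not anneal to the 3' universal region")
    # Full extension copies the whole template, so the product is the
    # reverse complement of the modified target, which begins with the primer.
    return tail + revcomp(modified_target)


if __name__ == "__main__":
    print(first_cs("GGCATTGCACTCCAAGG", "CCTTGG"))         # hypothetical
    print(first_cs("GGCATTGCACTCCAAGG", "CCTTGG", "AAT"))  # with a 5' tail
```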

In another embodiment, the operation of generating the first CS comprises in vitro transcription by an RNA polymerase from a double-stranded RNA polymerase promoter region in the modified target polynucleotide, the promoter region being generated by extension on an ATO comprising an RNA polymerase promoter. In another embodiment, the double-stranded RNA polymerase promoter region of the modified target polynucleotide is generated by hybridizing an oligonucleotide to the modified target polynucleotide after digestion and/or removal of the ATO.

In yet another embodiment, the operation of generating the first CS comprises thermally denaturing the modified target polynucleotide, self-annealing of the 3' stem-loop structure of the modified target polynucleotide, and self-primed extension to form the first CS.

In yet another embodiment, the operation of generating the first CS comprises annealing a target-specific primer to the modified target polynucleotide and extending the primer with a polymerase. The target-specific primer anneals to the target sequence of interest. The double-stranded end of the first CS may be ligated to an adaptor.

The method can further comprise digesting the ATO before or after the operation of generating the first Complementary Sequence (CS) of the modified target polynucleotide.

The method can further comprise affinity capture before or after the operation of generating the first Complementary Sequence (CS) of the modified target polynucleotide.

In a first ATO reaction, the method can include extension and ligation, wherein a DNA polymerase extends the 3 'end of the target and a DNA ligase ligates the extended target sequence to the 5' stem portion of the ATO or the upper individual strand of the ATO.

The method may further comprise generating a modified first CS by incubating the first CS with a composition as described above, wherein in an enzymatic second ATO reaction the 3' end of the first CS hybridizes to the 3' random sequence of an adaptor template oligonucleotide (second ATO), wherein the 3' end of the first CS is extended using the second ATO as a template, wherein the 3' end of the first CS is trimmed before extension occurs if a 3' overhang is present, and wherein the second ATO comprises a 5' universal sequence different from that of the first ATO.

The method may further comprise generating a modified first CS by ligating an adaptor into the product of step (ii).

The method can further comprise extending a second primer that hybridizes to the first CS or to the modified first CS, wherein the second primer comprises the target-specific portion or the universal sequence, or both the 3 'target-specific sequence and the 5' universal sequence, thereby generating a second CS.

The method may further comprise splitting the modified target polynucleotide reaction into two separate reactions, wherein each reaction contains a primer complementary to the universal region of the modified target polynucleotide, wherein one reaction comprises a target-specific second primer or pool of target-specific primers complementary to the forward strand of the target sequence, and the other reaction comprises a target-specific second primer or pool of target-specific primers complementary to the reverse strand of the target sequence, wherein the forward and reverse strands of the target sequence are complementary.

In another embodiment, the pool of target-specific primers may contain a mixture of primers, wherein each primer may target either the forward or the reverse strand of a different target, such that the final pool may target different regions of both the forward and reverse strands. Here the forward and reverse strands of the target sequence are complementary, and the forward and reverse target-specific primers are divided into two different primer pools, provided that two primers targeting the forward and reverse strands are not added to the same pool if together they could act as PCR primers and generate unwanted PCR products. Furthermore, as noted above, all references to "forward" and/or "reverse" pools allow each pool to contain primers that target both the forward and reverse strands.
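The pooling constraint described above can be illustrated with a minimal sketch. It is not part of the disclosed method: the primer records (`name`, `strand`, `pos`), the single shared coordinate system and the 1000 bp amplicon threshold are all assumptions chosen for illustration. Because a conflict can only arise between one forward-strand primer and one reverse-strand primer, the conflict graph is bipartite and a two-pool assignment always exists.

```python
from collections import deque

def conflicts(a, b, max_amplicon=1000):
    """True if a forward/reverse primer pair could prime an amplicon.

    Assumption: the pair conflicts when the reverse-strand primer lies
    downstream of the forward-strand primer within max_amplicon bases.
    """
    if a["strand"] == b["strand"]:
        return False
    fwd, rev = (a, b) if a["strand"] == "F" else (b, a)
    return 0 < rev["pos"] - fwd["pos"] <= max_amplicon

def assign_pools(primers, max_amplicon=1000):
    """Assign each primer to pool 0 or 1 so that no conflicting pair shares a pool."""
    n = len(primers)
    adjacency = {i: [] for i in range(n)}
    for i in range(n):
        for j in range(i + 1, n):
            if conflicts(primers[i], primers[j], max_amplicon):
                adjacency[i].append(j)
                adjacency[j].append(i)
    pool = {}
    for start in range(n):
        if start in pool:
            continue
        pool[start] = 0
        queue = deque([start])
        while queue:  # breadth-first two-colouring of one connected component
            u = queue.popleft()
            for v in adjacency[u]:
                if v not in pool:
                    pool[v] = 1 - pool[u]
                    queue.append(v)
    return [pool[i] for i in range(n)]

primers = [
    {"name": "F1", "strand": "F", "pos": 100},
    {"name": "R1", "strand": "R", "pos": 400},
    {"name": "F2", "strand": "F", "pos": 5000},
    {"name": "R2", "strand": "R", "pos": 9000},
]
pools = assign_pools(primers)
```

In this example only F1 and R1 face each other within the threshold, so they are separated, while F2 and R2 may share a pool with primers of the opposite strand.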

The method can further include hybridizing primers complementary to the universal region of the modified target polynucleotide, followed by one or more rounds of linear amplification to generate a first Complementary Sequence (CS). The linear amplification reaction product is divided into two separate reactions, wherein each reaction contains a primer complementary to a universal region of the modified target polynucleotide, wherein one reaction includes a target-specific second primer or target-specific pool of primers complementary to the forward strand of the target sequence, and the other reaction includes a target-specific second primer or target-specific pool of primers complementary to the reverse strand of the target sequence, wherein the forward and reverse strands of the target sequence are complementary.

The method may further comprise a step of end repair of the double stranded target polynucleotide followed by ligation of the double stranded adaptors. The ligation product is then hybridized to a primer complementary to the universal region of the target polynucleotide ligation product, followed by one or more rounds of linear amplification. The linear amplification reaction product is divided into two separate reactions, wherein each reaction contains a primer complementary to a universal region of the modified target polynucleotide, wherein one reaction includes a target-specific second primer or target-specific pool of primers complementary to the forward strand of the target sequence, and the other reaction includes a target-specific second primer or target-specific pool of primers complementary to the reverse strand of the target sequence, wherein the forward and reverse strands of the target sequence are complementary.

When the target polynucleotide in the sample is double-stranded and the second primer is a target-specific primer, the reaction for target-specific amplification after the ATO reaction may comprise separating the ATO reaction product or the first CS product of the linear amplification into two separate reactions, a forward reaction comprising one or more target-specific second primers complementary to the forward strand of the target sequence and a reverse reaction comprising one or more target-specific second primers complementary to the reverse strand of the target sequence, wherein the forward and reverse strands of the target sequence are complementary.

In one aspect, generating the first CS comprises performing a single extension or linear amplification using a first primer, the first primer being a universal primer that targets the 3' extended universal portion of the polynucleotide. The linear amplification may have 1-30 cycles, or 2-25 cycles, or 3-24 cycles, or 4-23 cycles, or 5-22 cycles, or 6-21 cycles, or 7-20 cycles, or 8-19 cycles, or 9-18 cycles, or 10-17 cycles.
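As a rough arithmetic illustration of why the cycle number matters, the following sketch compares idealized yields of linear and exponential amplification. It assumes 100% efficiency per cycle, which is an upper bound and not a value given in this disclosure; the function names are hypothetical.

```python
def linear_copies(templates: int, cycles: int) -> int:
    """First-CS copies after linear amplification with a single primer.

    Each cycle copies only the original templates, so the yield grows
    in proportion to the number of cycles.
    """
    return templates * cycles

def exponential_copies(templates: int, cycles: int) -> int:
    """Molecules after exponential amplification with two primers.

    Products serve as templates in later cycles, so the yield doubles
    every cycle.
    """
    return templates * 2 ** cycles

# 100 modified target molecules, 10 cycles
linear_yield = linear_copies(100, 10)
exponential_yield = exponential_copies(100, 10)
```

Under these assumptions, 10 cycles give 1,000 first-CS copies by linear amplification against 102,400 molecules by exponential amplification, while in linear amplification every copy is made directly from an original modified target polynucleotide.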

In another aspect, if the target is RNA, the generating the first CS comprises performing a reverse transcription reaction using a reverse transcriptase.

The method may further comprise performing exponential amplification using the first primer and the second primer. The first primer may be a universal primer that targets the 3' extended universal portion of the modified target polynucleotide; the second primer may be a universal primer that targets the 3' extended universal portion of the first CS. Alternatively, the second primer is a target-specific primer that anneals to the specific region of interest of the first CS. The second primer can be a set of multiple primers targeting multiple regions of sequence of interest. When the second primer is a target-specific primer, a nested target-specific third primer is used for further amplification after linear or exponential amplification using the second primer.

The first, second, or third primer may include a Sample Barcode (SBC) sequence and one or more additional universal sequences necessary for compatibility with the NGS platform.

The method may comprise fragmenting or fragmenting/tagging the target polynucleotide prior to the first ATO reaction.

In one embodiment, fragmenting and/or tagging the target polynucleotide comprises contacting the double-stranded polynucleotide with a transposase bound to a transposon DNA, wherein the transposon DNA comprises a transposase binding site and a universal sequence, wherein the transposase/transposon DNA complex binds to a target location on the double-stranded polynucleotide and cleaves the double-stranded polynucleotide into a plurality of double-stranded fragments, wherein each double-stranded fragment has a transposon DNA bound to each 5' end of the double-stranded fragment. The method further comprises thermally denaturing the fragmented target polynucleotides prior to the first ATO reaction. The transposase can be Tn5 transposase. The transposon DNA can include a barcode sequence and can include a priming site. The transposon DNA may comprise a double-stranded 19 bp Tnp binding site and an overhang, wherein the overhang may comprise a UID and a priming site. Alternatively, the transposon DNA may comprise a double-stranded 19 bp Tnp binding site and a nucleic acid stem-loop structure. The bound transposase can be removed from the double-stranded fragments prior to the first ATO reaction. Tn5 transposase is complexed with transposon DNA, and the Tn5 transposase/transposon DNA complex binds to target locations along the double-stranded genomic DNA, cleaving the double-stranded genomic DNA into multiple double-stranded fragments. The transposon DNA comprises a double-stranded 19 bp Tn5 transposase (Tnp) binding site at one end and a long single-stranded overhang containing barcode regions, priming sites and other sequences. Upon transposition, the Tnp and transposon DNA bind to each other and dimerize to form a transposome. Transposomes then randomly capture or otherwise bind to the target polynucleotide.
The transposases in the transposome then cleave the genomic DNA, with one transposase cleaving the upper strand and one transposase cleaving the lower strand, to create DNA fragments. The transposon DNA is thus randomly inserted into the polynucleotide, leaving a 9 bp gap at both ends of the transposition/insertion site. The result is a DNA fragment having a transposon DNA Tnp binding site attached to the 5' position of the upper strand and a transposon DNA Tnp binding site attached to the 5' position of the lower strand.

In another embodiment, fragmenting the target polynucleotide comprises contacting the double-stranded DNA with a CRISPR/Cas9 enzyme bound to a guide RNA, wherein the CRISPR/Cas9/guide RNA complex binds to a region of the target polynucleotide defined by the sequence of the guide RNA. The CRISPR/Cas9/guide RNA/DNA complex produces a double-strand break at the site determined by the sequence targeted by the guide RNA. In another embodiment, only single-strand breaks are induced.

The fragmentation may include targeted fragmentation using genome editing tools. The genome editing tool can include clustered regularly interspaced short palindromic repeats and CRISPR-associated protein 9 (CRISPR/Cas9). The enzyme may belong to class I. The class I enzyme is a type I enzyme, a type III enzyme or a type IV enzyme. The enzyme may belong to class II. The class II enzyme is a type II enzyme, a type V enzyme or a type VI enzyme. The enzyme may comprise any combination of class I and class II enzymes. The combination may be any combination of a type I enzyme, a type II enzyme, a type III enzyme, a type IV enzyme, a type V enzyme, or a type VI enzyme. Guide RNA libraries are used to target multiple regions of the genome to induce DNA breaks. The DNA break may be a single-strand break or a double-strand break. The DNA breaks may be a combination of double-strand breaks and single-strand breaks. The guide RNA library may consist of any number of guide RNAs.

In another embodiment, fragmenting and labeling a target polynucleotide comprises contacting a single-stranded target polynucleotide with a random primer comprising a 5' universal sequence and a 3' random sequence, and extending the random primer on the target polynucleotide to generate a 5' labeled fragmented polynucleotide.
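The structure of such a random primer can be sketched as follows. The universal sequence `GATTACA`, the random length of 8 and the function name are placeholders chosen for illustration; the disclosure does not prescribe them.

```python
import random

def random_primer(universal, n_random=6, rng=None):
    """Return a primer written 5'->3': universal sequence, then a random 3' tail."""
    rng = rng or random.Random()
    tail = "".join(rng.choice("ACGT") for _ in range(n_random))
    return universal + tail

# A seeded generator makes the example reproducible.
primer = random_primer("GATTACA", n_random=8, rng=random.Random(0))
```

In practice such primers are usually synthesized as a degenerate pool (N positions) rather than generated one by one; the sketch only makes the 5' universal / 3' random layout explicit.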

The target polynucleotide or fragmented target polynucleotide comprises a free 3' hydroxyl group. The target polynucleotide may be single-stranded DNA, or single-stranded RNA, or a combination of single-stranded RNA and single-stranded DNA.

One embodiment provides a method for extending a polynucleotide, the method comprising:

mixing the target polynucleotide with a DNA polymerase and an Adaptor Template Oligonucleotide (ATO) as described above, the ATO comprising a 3' random sequence and having a blocked 3' end that is modified to resist 3' exonuclease activity;

incubating the mixture under conditions that promote annealing, trimming the 3' overhangs (if present), and extending to generate a modified target polynucleotide; and

optionally degrading the ATO.

One embodiment provides a method for generating a sequencing library, the method comprising:

mixing the target polynucleotide with one or more DNA polymerases and an Adaptor Template Oligonucleotide (ATO) as described above, the ATO comprising a 3' random sequence and having a blocked 3' end that is modified to resist 3' exonuclease activity;

incubating the mixture under conditions that promote annealing, trimming the 3' overhangs (if present), and extending to generate modified target polynucleotides;

optionally degrading ATO; and

amplifying the modified target polynucleotide using primers compatible with the NGS platform in one, two, or more rounds of linear amplification and/or exponential amplification.

The method may further comprise fragmenting the target polynucleotide prior to mixing. Alternatively, the target polynucleotide may be a naturally occurring fragmented polynucleotide. Naturally occurring fragmented polynucleotides may be circulating free nucleic acids in plasma. Fragmenting the target polynucleotide may include contacting double-stranded polynucleotides with a transposase bound to transposon DNA, wherein the transposon DNA includes a transposase binding site and a universal sequence, wherein the transposase/transposon DNA complex binds to a target location on the double-stranded polynucleotides and cleaves the double-stranded polynucleotides into multiple double-stranded fragments, wherein each double-stranded fragment has a transposon DNA bound to each 5' end of the double-stranded fragment.

Also provided is a method for generating a sequencing library, the method comprising:

adding adapter sequences to the single-stranded target polynucleotide according to the ATO and composition as described above by extending the single-stranded target polynucleotide on the ATO template; and amplifying the adaptor-tagged target polynucleotide using primers compatible with the NGS platform, wherein the ATO comprises an adaptor sequence in the universal portion. The adaptor sequences provide priming sequences for both amplification and sequencing of nucleic acid fragments, and in some aspects are used for next generation sequencing applications. In a further aspect, an "adaptor sequence" is used as a promoter sequence for generating an RNA molecule, wherein the promoter sequence is, for example and without limitation, a T7 promoter sequence or a SP6 promoter sequence.

The present disclosure provides a method for generating a polynucleotide library, the method comprising:

(i) generating a modified target polynucleotide from a target polynucleotide of a sample, wherein, in an enzymatic first reaction that adds an adaptor sequence to the 3' end of the target polynucleotide, the target polynucleotide hybridizes to the 3' random sequence of any one of the Adaptor Template Oligonucleotides (ATOs) described above (first ATO); and

(ii) generating a first Complementary Sequence (CS) of the modified target polynucleotide using a first primer comprising the universal sequence and using the modified target polynucleotide as a template, wherein the first primer hybridizes to the template and is extended by a polymerase.

The target polynucleotide is preferably fragmented, either naturally or artificially. The target polynucleotide may be any nucleic acid, such as DNA, cDNA, RNA, mRNA, small RNA, or microRNA, or any combination thereof. The target polynucleotide may comprise a plurality of target polynucleotides. Each of the plurality of target polynucleotides may comprise a different sequence or the same sequence. The target polynucleotide or one or more of the plurality of target polynucleotides may comprise a variant sequence.

Depending on the type of target polynucleotide and ATO (which is DNA and/or RNA), the method may utilize reverse transcription or primer extension. The primer extension reaction may be a single primer extension step. The primer extension reaction may comprise extending one or more individual primers once. The primer extension reaction may comprise extending one or more individual primers in one step. In step (i), the 3 ' end or the trimmed 3 ' end of the target polynucleotide serves as a primer and the primer is extended using ATO as a template, and in step (ii), the extension primer or amplification primer is a first primer that anneals to the 3 ' extended portion of the target polynucleotide.

In one embodiment, in step (i), the 3' end of the target polynucleotide, which may be randomly or specifically fragmented, acts as a primer: it hybridizes to the ATO, is trimmed by the 3' to 5' exonuclease activity of the polymerase to remove any 3' overhang (if present), and is extended using the ATO as a template to generate the modified target polynucleotide. Because the 3' random sequence portion of the ATO is random, perfect hybridization between the 3' end of the target polynucleotide and the ATO may be difficult to obtain, so less stringent conditions may be applied. For example, a high concentration of ATO may be used, a low hybridization temperature (e.g., 4 ℃) may be used, and/or multiple extension cycles and/or longer hybridization times may be used. The extension may be performed by any polymerase and/or any reverse transcriptase, or by a mixture of different polymerases. Preferably, the DNA polymerase has 3' to 5' exonuclease activity such that any 3' overhangs (if present) are digested (trimmed) and the trimmed end can be extended. The DNA polymerase may have strand displacement activity such that the stem-loop structure and the double-stranded universal portion of the ATO molecule can be opened and copied. Alternatively, the DNA polymerase may have 5' to 3' exonuclease activity such that the 5' end of the stem-loop structure and of the double-stranded universal portion of the ATO molecule can be digested and the lower strand of the ATO copied. The DNA polymerase is preferably active at low temperatures. The polymerase may be a mixture of different polymerases, which may have 3' to 5' exonuclease activity, 5' to 3' exonuclease activity and/or strand displacement activity.
Polymerases that can be used to practice the methods disclosed herein include, but are not limited to, Deep VentR™ DNA polymerase, LongAmp™ Taq DNA polymerase, Phusion™ High-Fidelity DNA polymerase, Phusion™ Hot Start High-Fidelity DNA polymerase, [trade name missing] DNA polymerase, DyNAzyme™ II Hot Start DNA polymerase, Phire™ Hot Start DNA polymerase, Crimson LongAmp™ Taq DNA polymerase, DyNAzyme™ EXT DNA polymerase, LongAmp™ Taq DNA polymerase, Taq DNA polymerase with standard Taq (Mg-free) buffer, Taq DNA polymerase with standard Taq buffer, Taq DNA polymerase with ThermoPol II (Mg-free) buffer, Taq DNA polymerase with ThermoPol buffer, Crimson Taq™ DNA polymerase, Crimson Taq™ DNA polymerase with (Mg-free) buffer, [trade name missing] (exo-) DNA polymerase, Hemo KlenTaq™, Deep VentR™ (exo-) DNA polymerase, AMV first strand cDNA synthesis kit, M-MuLV first strand cDNA synthesis kit, Bst DNA polymerase (full length), Bst DNA polymerase (large fragment), Taq DNA polymerase with ThermoPol buffer, 9°Nm DNA polymerase, Crimson Taq™ DNA polymerase, Crimson Taq™ DNA polymerase with (Mg-free) buffer, Deep VentR™ (exo-) DNA polymerase, Deep VentR™ DNA polymerase, DyNAzyme™ EXT DNA polymerase, DyNAzyme™ II Hot Start DNA polymerase, Hemo KlenTaq™, Phusion™ High-Fidelity DNA polymerase, Phusion™ Hot Start High-Fidelity DNA polymerase, Sulfolobus DNA polymerase IV, Therminator™ Gamma DNA polymerase, Therminator™ DNA polymerase, Therminator™ II DNA polymerase, Therminator™ III DNA polymerase, [trade name missing] DNA polymerase, [trade name missing] (exo-) DNA polymerase, Bsu DNA polymerase (large fragment), Bst DNA polymerase (large fragment), DNA polymerase I (E. coli), DNA polymerase I large (Klenow) fragment, Klenow fragment (3'→5' exo-), phi29 DNA polymerase, T4 DNA polymerase, T7 DNA polymerase (unmodified), and reverse transcriptases and RNA polymerases, including AMV reverse transcriptase, M-MuLV reverse transcriptase, phi6 RNA polymerase (RdRP), SP6 RNA polymerase and T7 RNA polymerase.

Ligases that may be used to practice the methods of the present disclosure include, but are not limited to, T4 DNA ligase, T4 RNA ligase, E. coli DNA ligase, and E. coli RNA ligase.

Multiple cycles of extension may be performed by thermal cycling of temperature (annealing, extension and denaturation). The modified target polynucleotide has an extended 3' portion, which may include some random sequences and universal sequences that provide primer binding sites. The extended 3' portion may also include additional UIDs.
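The trimming and extension of a target 3' end on an ATO template can be illustrated with a simplified string model. This is a sketch under stated assumptions, not the enzymatic reaction itself: it requires an exact match of at least `min_match` bases between the target and the complement of the ATO random portion, whereas real annealing under low-stringency conditions tolerates mismatches, and it will "trim" an overhang of any length. The sequences and function names are placeholders.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")

def revcomp(seq):
    """Reverse complement of a DNA sequence written 5'->3'."""
    return seq.translate(_COMPLEMENT)[::-1]

def trim_and_extend(target, universal, random_part, min_match=4):
    """Model extension of a target on an ATO (5'-universal-random-3').

    Returns (modified_target, bases_trimmed), or None if no stretch of at
    least min_match bases of the target pairs with the ATO random portion.
    """
    ato = universal + random_part
    template_rc = revcomp(ato)                 # what a full copy of the ATO looks like
    rc_random = template_rc[:len(random_part)]
    # Prefer the longest pairing; for equal lengths, the site nearest the 3' end.
    for m in range(len(rc_random), min_match - 1, -1):
        for j in range(len(rc_random) - m + 1):
            i = target.rfind(rc_random[j:j + m])
            if i >= 0:
                kept = target[:i + m]          # unpaired 3' overhang is trimmed
                extended = kept + template_rc[j + m:]
                return extended, len(target) - len(kept)
    return None

modified, trimmed = trim_and_extend(
    "TTTTGGGCCCAACGTAA",   # target with a 2-base 3' overhang ("AA")
    "GATTACAGATTACA",      # placeholder universal sequence
    "ACGTTG",              # one member of the 3' random population
)
```

Here the two unpaired 3' bases are removed and the target acquires the complement of the universal sequence at its 3' end, which then serves as the binding site for the first primer.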

In one embodiment, in step (i), if the universal portion of the ATO comprises weakly pairing nucleotides (e.g., inosine), it may not be necessary to remove the ATO after the first extension reaction. Otherwise, the ATO may be removed from the reaction mixture or digested by any means. For example, if the ATO includes uracil residues, it is digested/removed by UNG digestion; if the ATO comprises RNA, by ribonuclease digestion; if the ATO comprises a restriction enzyme site, by restriction enzyme digestion; or, if the ATO comprises biotin, by capture on streptavidin beads. In another embodiment, if the ATO includes a hairpin/stem-loop structure (Figs. 3B and 3C), it may not be necessary to digest or remove the ATO from the reaction mixture, because the hairpin structure prevents the ATO from hybridizing to the 3' extended portion of the modified target polynucleotide.

In one embodiment, the first ATO reaction is a primer extension reaction in which the target polynucleotide, acting as a primer, is extended on the ATO template by a DNA polymerase. The DNA polymerase may include strand displacement activity or 5' to 3' exonuclease activity, wherein during extension the stem-loop structure is opened or the upper ATO strand is displaced or digested. Any polymerase can be used, such as Klenow exo- polymerase, Bst polymerase, or T4 DNA polymerase.

In another embodiment, the first reaction is an extension-ligation reaction, wherein the DNA polymerase extends the target and the DNA ligase ligates the extended target sequence to the 5' stem portion of the ATO or the upper strand of the ATO. Any DNA polymerase and DNA ligase (e.g., Klenow large fragment, T4 DNA ligase) can be used.

In another embodiment, the first reaction is a ligation reaction in which a DNA ligase ligates the target polynucleotide to the 5' stem portion of the ATO or the upper strand of the ATO. Any DNA ligase (e.g., T4 DNA ligase) may be used.

After the first reaction, the method may include an operation of digesting a portion of the ATO or removing a portion of the ATO by affinity capture.

The method may comprise generating a modified first CS from the first CS, wherein, in an enzymatic reaction that adds an adaptor sequence to the 3' end of the first CS, the first CS hybridizes to the 3' random sequence of a second ATO, which is any one of the ATOs described above, and wherein the second ATO comprises a 5' universal sequence different from that of the first ATO.

The method may further comprise the act of generating a modified first CS by ligating a double stranded adaptor into the product of step (ii).

The operation of generating the first Complementary Sequence (CS) of the modified target polynucleotide is performed in a step after digestion/removal of ATO or in a step without digestion/removal of ATO. The first Complementary Sequence (CS) can be generated by primer extension. The primer may be a universal primer capable of hybridizing to and extending the 3' extended portion of the modified target polynucleotide. Depending on the type of target polynucleotide (which is DNA, RNA, or a combination of RNA and DNA), the method may utilize reverse transcription and/or primer extension by a DNA polymerase and/or a reverse transcriptase. The primer extension reaction may be a single primer extension step. Alternatively, the primer extension reaction may be a plurality of cycles of linear amplification using the first primer. The first CS generated comprises the 5 'universal sequence and the 3' complementary sequence of the target polynucleotide. The first primer includes a 3' universal sequence that is partially identical or substantially identical to the universal sequence of the ATO. The first primer may also include a 5' additional universal sequence portion that is compatible with the sequencing platform. The first primer may also include a sample barcode Sequence (SBC) between the 3 'universal sequence and the 5' additional universal portion.

In one embodiment, the method further comprises generating a modified first CS using the first CS as a primer and a second Adaptor Template Oligonucleotide (ATO), which is any one of the oligonucleotides described above, as a template. The 3' end of the first CS is extended on the second ATO template to generate an extended first CS comprising a second universal sequence at its 3' end. After removal of the second ATO, the first CS can be PCR amplified using two universal primers.

In another embodiment, the method further comprises: extending a second primer that hybridizes to the first CS or the modified first CS, thereby forming a second CS, wherein the second primer comprises a target-specific portion or a universal sequence, or both a 3' target-specific sequence and a 5' universal sequence. In one aspect, when the second primer is a target-specific primer, it can include a 3' target-specific portion, with or without a 5' universal portion. The extension using the second primer may be a single extension or multiple cycles of linear amplification. Alternatively, step (ii) and this step are combined into a single PCR reaction that generates a first CS using the first primer and a second CS using the second primer, wherein the first CS and the second CS are generated simultaneously after the first PCR cycle. After the PCR reaction or linear amplification, the product can be purified, or the primers can be removed by single-strand-specific nuclease digestion. If the first primer contains a sample barcode, purified PCR products from multiple samples can be pooled together. Further PCR amplification is performed on the (pooled) purified PCR products or linear amplification products using a primer pair consisting of the nested target-specific third primer for the first CS and the universal primer for the second CS. The nested target-specific third primer for the first CS comprises a 3' target-specific portion and a 5' universal sequence portion, wherein the 5' universal sequence portion is compatible with the NGS platform. The PCR product is then purified for sequencing.
In another aspect, when the second primer is a target-specific primer, it can include a 3' target-specific portion and a 5' universal portion, wherein the 5' universal portion is compatible with the NGS platform. Step (ii) and this step are combined into a single PCR reaction that generates a first CS using the first primer and a second CS using the second primer, wherein the first CS and the second CS are generated simultaneously after the first PCR cycle. The PCR product is then purified for sequencing.

When the original target polynucleotide in the sample is double-stranded, the reaction to extend the second primer hybridized to the first CS can be split into two separate reactions, a forward reaction comprising a target-specific second primer complementary to the forward strand of the target sequence, and a reverse reaction comprising a target-specific second primer complementary to the reverse strand of the target sequence, wherein the forward and reverse strands of the target sequence are complementary.

In step (i) or step (ii), the extension operation may comprise one extension or linear amplification using the first primer or the second primer. In step (ii), the extension operation may comprise exponential amplification using a first primer and a second primer.

The first primer may include a Sample Barcode (SBC) sequence and an additional 5' universal sequence compatible with the NGS platform.

In one aspect, the present disclosure provides a method of extending a target polynucleotide, the method comprising: mixing a target polynucleotide with a polymerase and an ATO molecule comprising an NGS adaptor sequence and a cleavable sequence; performing a trimming and extension reaction; thermally inactivating the polymerase; incubating with a single-strand-specific circularizing ligase; and optionally performing either a cleavage reaction or an amplification reaction with a reverse primer and a forward primer, to convert the circular molecules into complete linear NGS library molecules.

In another aspect, the present disclosure provides a method of extending a target polynucleotide, the method comprising: mixing the target polynucleotide with a polymerase and an ATO comprising an NGS adaptor sequence; performing a trimming and extension reaction; incubating with a first primer complementary to a universal NGS adaptor sequence, a DNA polymerase, and dNTPs to perform an extension reaction that generates the CS and a double-stranded substrate molecule; and ligating, using T4 DNA ligase, a blunt-ended or T-tailed adaptor formed by annealing two oligonucleotides that comprise an NGS adaptor sequence and a truncated complement bearing a 3' phosphate, wherein one of the oligonucleotides is ligated to the 5' phosphate of the modified target polynucleotide molecule to complete the linear NGS library molecule.

The present disclosure also provides a method of accurately determining the sequence of a target polynucleotide, the method comprising:

(i) sequencing at least one of the amplified second CSs of any of the methods above;

(ii) aligning at least two sequences from (i) that contain the same UID and/or aligning the same target sequence of two reactions, each reaction generating sequence information of one strand or of the complementary strand of the duplex target sequence; and

(iii) determining a consensus sequence and/or identical variant sequence for both reactions based on (ii), wherein the consensus sequence and/or variant sequence accurately represents the target polynucleotide sequence.

The present disclosure also provides a kit for generating a polynucleotide library, the kit comprising any one of the Adaptor Template Oligonucleotides (ATOs) described above and a primer compatible with the NGS platform.

The kit comprises a composition as described above.

A kit for generating a polynucleotide library includes an Adaptor Template Oligonucleotide (ATO) as described above, a polymerase, and primers compatible with the NGS platform.

The target polynucleotide is a polynucleotide, modified polynucleotide, or a combination thereof as described below. In various embodiments, the target polynucleotide is DNA, RNA, or a combination thereof. In another embodiment, the target polynucleotide is a chemically treated nucleic acid, including but not limited to embodiments wherein the substrate polynucleotide is bisulfite treated DNA to detect methylation status by NGS.

The target polynucleotides are obtained from naturally occurring sources, or they may be synthetic. Naturally occurring sources are RNA and/or genomic DNA from prokaryotes or eukaryotes. For example, but not limited to, the source may be human, mouse, virus, plant, or bacteria. In various aspects, the target polynucleotides are extended at the 3' end with an adapter sequence for use in assays that involve microarrays and create libraries for next generation nucleic acid sequencing.

Fragmentation of genomic DNA/RNA is a general procedure known to those skilled in the art and is performed in vitro by, for example and without limitation, shearing (nebulizing) the DNA/RNA, cleaving the DNA/RNA with endonucleases, sonicating the DNA/RNA, heating the DNA/RNA, irradiating the DNA/RNA using α, β, gamma, or other radiation sources, exposure to light, chemical cleavage of the DNA/RNA in the presence of metal ions, free radical cleavage, and combinations thereof.

As used herein, a "target polynucleotide Complementary Sequence (CS)" is a polynucleotide comprising a sequence complementary to a target sequence, or a sequence complementary to that complement. In some embodiments, the target polynucleotide complementary sequence comprises a first complementary sequence. The "first complementary sequence" is a polynucleotide reverse transcribed from a target polynucleotide, a polynucleotide formed by a primer extension reaction on a target polynucleotide, or an RNA polynucleotide transcribed by an RNA polymerase from a double-stranded RNA polymerase promoter in the modified target polynucleotide. The modified target polynucleotide is a target polynucleotide extended on an ATO including random sequences and universal sequences.

The target polynucleotide complement sequence comprises a second complement sequence. A "second complementary sequence" is a polynucleotide that includes a sequence that is complementary to the first complementary sequence. The target polynucleotide complement sequence may comprise a UID. For example, the first complementary sequence can include a UID provided by the random sequence of the ATO and by the 3' end portion of the randomly fragmented target polynucleotide.

The second primer or the third primer may be a plurality of sets of the second primer or the third primer. Each of the plurality of second primers or third primers is extended simultaneously and in the same reaction chamber.

The amplification step using the target-specific second primer can be divided into two reactions: forward reaction and reverse reaction. The forward reaction includes a forward set of a plurality of target-specific second primers that anneal to first strands from a plurality of target CSs of one sample, and the reverse reaction includes a reverse set of a plurality of target-specific second primers that anneal to second strands from a plurality of target CSs of the same sample. The primers used to generate PCR products in nested PCR can include a universal primer that targets the 5' universal sequence portion of the first primer and a third plurality of target-specific primers that anneal to a second strand of the plurality of target sequences, wherein the third set of target-specific primers (inner primers) is nested to the set of target-specific second primers (outer primers). The universal primers in the forward and reverse reactions may be the same.

The reaction mixture may comprise a plurality of reactions against more than one sample (which may be two samples, three samples, or more than ten samples). Different samples can be processed together in parallel. Each sample may include two reactions: forward reaction and reverse reaction. After step (ii), the different sample reactions (all forward reactions or all reverse reactions) may preferably be mixed, wherein the identity of each sample is assigned in the amplification by the first primer with the SBC. All forward or reverse reactions after step (ii) may be carried out in one mixture.
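The pooling described above relies on the sample barcode (SBC) carried by the first primer to recover sample identity after the reactions are mixed. A minimal Python sketch of that assignment follows. It assumes, for illustration only, that the SBC occupies the start of each read, has a fixed length, and is matched exactly; the function name, barcode sequences, and sample names are hypothetical.

```python
from collections import defaultdict

def demultiplex(reads, barcode_to_sample, barcode_len):
    """Assign pooled reads to samples by their leading sample barcode (SBC).

    Assumes (for illustration) that the SBC occupies the first
    `barcode_len` bases of every read and must match exactly.
    """
    by_sample = defaultdict(list)
    for read in reads:
        sample = barcode_to_sample.get(read[:barcode_len], "unassigned")
        # Strip the barcode from assigned reads; keep unassigned reads whole.
        payload = read if sample == "unassigned" else read[barcode_len:]
        by_sample[sample].append(payload)
    return dict(by_sample)

# Hypothetical barcodes and reads from two pooled samples.
barcodes = {"ACGT": "sample_1", "TGCA": "sample_2"}
pooled = ["ACGTGGGAAA", "TGCACCCTTT", "ACGTGGGAAT", "NNNNGGGAAA"]
result = demultiplex(pooled, barcodes, barcode_len=4)
```

Reads whose leading bases match no known barcode are kept in a separate group rather than discarded, so that barcode errors remain visible.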

The method further includes an operation of analyzing NGS reads derived from forward and reverse reactions representing the two different strands of a target sequence, the analyzing operation including generating an error-corrected consensus sequence by: (i) grouping reads into families containing the same random sequence identifier (UID) sequence; (ii) removing target sequences of the same family that differ from the majority of the family members at one or more nucleotide positions; and (iii) examining whether the same mutation occurs in the two reactions representing the different strands of the target sequence.
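The three analysis steps can be sketched in Python as shown below. This is an illustrative sketch, not the implementation used in the method: it assumes reads arrive as (UID, sequence) pairs, that all reads in a family are pre-aligned and of equal length, and that reverse-reaction reads have already been converted to the reference orientation. All function names are hypothetical.

```python
from collections import Counter, defaultdict

def family_consensus(seqs):
    """Majority base at each position of equal-length, pre-aligned reads.

    Also returns the reads kept after discarding any read that differs
    from the majority at one or more positions (step ii). Ties go to the
    base seen first, an arbitrary choice made for this sketch.
    """
    majority = "".join(Counter(col).most_common(1)[0][0] for col in zip(*seqs))
    kept = [s for s in seqs if s == majority]
    return majority, kept

def consensus_by_uid(reads):
    """Step (i): group (uid, sequence) reads into UID families, then collapse."""
    families = defaultdict(list)
    for uid, seq in reads:
        families[uid].append(seq)
    return {uid: family_consensus(seqs)[0] for uid, seqs in families.items()}

def variants(consensus, reference):
    """Positions where a consensus differs from the reference, as (pos, ref, alt)."""
    return {(i, r, c) for i, (r, c) in enumerate(zip(reference, consensus)) if r != c}

def confirmed_on_both_strands(forward_reads, reverse_reads, reference):
    """Step (iii): keep only variants seen in both the forward and reverse reactions."""
    def all_variants(reads):
        found = set()
        for cons in consensus_by_uid(reads).values():
            found |= variants(cons, reference)
        return found
    return all_variants(forward_reads) & all_variants(reverse_reads)

# Toy data: one read in family "AAC" carries a sequencing error (ACGA).
reference = "ACGT"
forward = [("AAC", "ACGT"), ("AAC", "ACGT"), ("AAC", "ACGA"),
           ("GGT", "ATGT"), ("GGT", "ATGT")]
reverse = [("TTA", "ATGT"), ("TTA", "ATGT"), ("CCA", "ACGT")]
confirmed = confirmed_on_both_strands(forward, reverse, reference)
```

In the toy data, the isolated error in family "AAC" is voted out within its family, while the C-to-T change at position 1 is retained because it appears in families from both reactions.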

The method also includes an operation of analyzing NGS reads derived from forward and reverse reactions representing two different strands of the target sequence, the operation of analyzing including an operation of generating a consensus sequence by grouping into families containing the same random sequence identifier (UID) sequence and counting the number of families. The method provides an accurate count of the number of original target nucleic acids present in the sample.

The method can be used to quantify the starting molecules, and the act of counting the UID family of target sequences can provide accurate counting information compared to other samples or compared between forward and reverse reactions.
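Counting UID families, rather than reads, is what yields an estimate of the number of starting molecules. A minimal Python sketch is given below, assuming each read has already been reduced to its UID; the `min_family_size` threshold is a hypothetical parameter added for illustration and is not specified in the text.

```python
from collections import Counter

def count_original_molecules(uids, min_family_size=1):
    """Count UID families, each taken to represent one starting molecule.

    `min_family_size` (hypothetical) ignores families supported by
    fewer reads than the threshold.
    """
    sizes = Counter(uids)
    return sum(1 for n in sizes.values() if n >= min_family_size)

# Hypothetical UIDs observed in the forward and reverse reactions.
forward_uids = ["AAC", "AAC", "AAC", "GGT", "TTA", "TTA"]
reverse_uids = ["CCA", "CCA", "GAT"]
forward_count = count_original_molecules(forward_uids)
reverse_count = count_original_molecules(reverse_uids)
```

The two counts can then be compared between reactions or between samples, as the paragraph above describes.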

The purpose of the UID is twofold. The first is to assign a unique UID to each original DNA template molecule or RNA template molecule. The second is the amplification of each uniquely labeled template to generate a number of progeny molecules (defined as UID families) with the same UID sequence. If a mutation is pre-existing in the template molecule for amplification, the mutation should be present in each progeny molecule containing the UID.

The universal primer may contain one, or two, or more terminal phosphorothioates to render it resistant to any exonuclease activity. The universal primer may also contain the 5' graft sequences necessary for hybridization to an NGS flow cell (e.g., Illumina GA IIx flow cell). Finally, the universal primer may contain an index sequence between the grafted sequence and the universal tag sequence. The index sequence enables PCR products from multiple different individuals to be analyzed simultaneously in the same flow cell chamber of a sequencer.

The target nucleic acid sequence of the sample may comprise a nucleic acid fragment or gene that contains one or more variant nucleotides and may be selected from the group consisting of one or more disease-associated SNPs/deletions/insertions, one or more chromosomal rearrangements, trisomies, or one or more cancer genes, one or more drug resistance genes, and one or more virulence genes. The disease-associated gene may include, but is not limited to, cancer-associated genes and genes associated with genetic diseases. The sample may be genomic DNA, circulating nucleic acids, RNA, mRNA, small RNA, microRNA, or FFPE DNA or RNA.

The one or more variant nucleotides in the characteristic region(s) (diagnostic region) of the target polynucleotide sequence may comprise one or more nucleotide substitutions, chromosomal rearrangements, deletions, insertions and/or aberrant methylation.

DNA methylation is an important epigenetic modification of the genome. Aberrant DNA methylation can cause silencing of tumor suppressor genes and is common in many human cancer cells. To detect the presence of any aberrant methylation in the target polynucleotide, a pretreatment should be performed prior to carrying out the present method. Preferably, the nucleic acid sample is chemically modified by bisulfite treatment, which converts unmethylated cytosine to uracil but leaves methylated cytosine (i.e., 5-methylcytosine, which is resistant to the conversion) as cytosine. Because of this modification, the methods can be applied to the detection of one or more aberrant methylations in a target nucleic acid. In another embodiment, the modification by bisulfite treatment occurs after the ATO reaction. In another embodiment, the modification by bisulfite treatment occurs after the generation of the first CS.
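The effect of bisulfite treatment on a sequence can be illustrated with a short Python sketch. It assumes an uppercase DNA string and a set of 0-based positions known to be methylated; both function names are hypothetical, and the sketch models idealized, complete conversion.

```python
def bisulfite_convert(seq, methylated_positions=()):
    """Convert unmethylated C to U; methylated C (5-methylcytosine) stays C.

    Positions are 0-based. Complete conversion is assumed for illustration.
    """
    protected = set(methylated_positions)
    return "".join(
        "U" if base == "C" and i not in protected else base
        for i, base in enumerate(seq)
    )

def read_after_amplification(converted):
    """Uracil is copied as thymine, so converted cytosines are read as T."""
    return converted.replace("U", "T")

converted = bisulfite_convert("ACGCT", methylated_positions={3})
sequenced = read_after_amplification(converted)
```

Comparing `sequenced` with the untreated sequence shows which cytosines were protected: any C that survives is inferred to have been methylated.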

The present disclosure provides methods of analyzing a biological sample for the presence and/or amount and/or frequency of mutations or polymorphisms at multiple loci of different target nucleic acid sequences. In another aspect, the present disclosure provides methods of analyzing chromosomal abnormalities (e.g., trisomies) in a biological sample. The ATO reaction may be followed by next generation sequencing, digital PCR, microarray, or other high throughput analysis. The number of multiplex amplifications (multiplexing) of the target locus may be more than 5, or more than 10, or more than 30, or more than 50, or more than 100, or more than 500, more than 1000, or even more than 2000.

When the concentration of a mutant in a sample is very low (e.g., only one or two mutant molecules are present in the sample), only one reaction may contain the mutant after the sample nucleic acid is split into two reactions. Comparison of the two strand sequences from the two reactions can then reveal that only one reaction contains the mutation. If more than one read family contains the same mutation, it is classified as a true mutation even if the mutation occurs in only one reaction.
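The decision rule above, taken together with the rule that a mutation present in both reactions is a true mutation, can be written as a small Python function. This is a sketch under stated assumptions: the inputs are the numbers of UID families carrying the mutation in each reaction, and the labels "unconfirmed" and "not detected" are hypothetical names for the cases the text does not classify.

```python
def classify_mutation(forward_families, reverse_families):
    """Classify a mutation from the number of UID families carrying it.

    Arguments are family counts in the forward and reverse reactions.
    The last two labels are assumptions made for this sketch.
    """
    total = forward_families + reverse_families
    if forward_families > 0 and reverse_families > 0:
        return "true: seen in both reactions"
    if total > 1:
        return "true: several families in one reaction"
    if total == 1:
        return "unconfirmed: single family in one reaction"
    return "not detected"
```

For example, `classify_mutation(2, 0)` falls under the rule stated above: two families in a single reaction are enough to call the mutation true.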

In another embodiment, the modified target polynucleotide may be amplified by one or more rounds of linear amplification or exponential amplification before the sample nucleic acid is split into two reactions. Comparison of the two strand sequences from the two reactions is then more likely to reveal the presence of a mutation, because the copy number of every original molecule has been increased.

In another embodiment, the modified target polynucleotide is not split into two reactions and only the presence of a mutation in one of the two strands is detected.

In another embodiment, the target polynucleotide is not split into two reactions, and the two strands are amplified separately and sequentially to allow analysis of the two strands.

There is a large body of literature describing the release of free DNA from dying tumor cells into the blood in patients with various types of cancer. Studies have shown that circulating tumor DNA can be used as a non-invasive biomarker to detect the presence of malignancy, to track treatment response, or to monitor recurrence. However, current detection methods have significant limitations. Next Generation Sequencing (NGS) approaches have revolutionized genome exploration by allowing simultaneous sequencing of billions of base pairs at a fraction of the time and cost of traditional methods. However, when aiming to identify rare mutations in genetically heterogeneous mixtures (such as tumors and plasma), an error rate of ~1% causes hundreds of millions of sequencing errors, which is unacceptable. The methods of the invention overcome these limitations of sequencing accuracy. cfDNA carrying mutations can be masked by the relative excess of background wild-type DNA, and its detection has proven challenging. By independently labeling and sequencing each original DNA duplex, the method greatly reduces errors.

The method can greatly improve the accuracy of massively parallel sequencing. The method can be readily used to identify rare mutations in a population of DNA templates. Both strands of a target template in a sample are uniquely labeled and sequenced independently. Comparing the sequences of the two strands shows whether they agree or disagree with each other. Agreement provides confidence that a mutation scored as positive is a true positive.

After sequencing, members of each read family are identified and grouped according to their shared UID tag sequence. The sequences of each uniquely UID-tagged family are then compared to one or both strands of the target sequence to create a consensus sequence. This step filters out random errors introduced during sequencing or PCR to generate a set of sequences, each of which is derived from a separate single-stranded DNA molecule.

In addition to its application for highly sensitive detection of rare DNA variants, barcoded random sequence identifiers in target-specific primers can also be used for single molecule counting to accurately determine relative or absolute DNA copy number and/or RNA copy number. Since labeling occurs before the main amplification, the relative abundance of variants in the population can be accurately assessed, given that the proportional representation is not affected by amplification bias.
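The point about amplification bias can be made concrete with a short Python sketch that compares a read-level variant fraction with a family-level one. It assumes reads arrive as (UID, carries_variant) pairs and calls a family variant when more than half of its reads carry the variant; that majority rule and the function name are assumptions made for illustration.

```python
def variant_fraction(reads):
    """Return (fraction of reads, fraction of UID families) carrying a variant.

    `reads` is a sequence of (uid, carries_variant) pairs. A family is
    called variant when more than half of its reads carry the variant
    (an assumption made for this sketch).
    """
    reads = list(reads)
    if not reads:
        raise ValueError("no reads supplied")
    families = {}
    for uid, carries_variant in reads:
        families.setdefault(uid, []).append(bool(carries_variant))
    read_fraction = sum(1 for _, v in reads if v) / len(reads)
    variant_families = sum(
        1 for calls in families.values() if 2 * sum(calls) > len(calls)
    )
    return read_fraction, variant_families / len(families)

# One variant molecule (U1) was over-amplified relative to two wild-type molecules.
reads = [("U1", True)] * 8 + [("U2", False)] + [("U3", False)]
read_fraction, family_fraction = variant_fraction(reads)
```

In this toy input the variant makes up 80% of reads but only one of three starting molecules, and the family-level fraction recovers the latter.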

By tagging each target sequence with a random sequence identifier and sequencing both strands, the method of the invention greatly reduces errors. The analysis provides an error-corrected consensus sequence by grouping the sequenced, uniquely tagged sequences; removing target sequences of the same family that differ from most members of the family at one or more nucleotide positions; and treating a mutation that appears in both populations as a true mutation.

The method can be used to detect mutations in any sample (e.g., FFPE or blood). Accurate counts of sequencing reads reflecting the original molecules present in the sample provide information for prenatal testing for copy number changes or chromosomal abnormalities.

The reagents employed in the methods of the invention may be packaged into kits. The kit comprises one or more ATOs, one or more polymerases, and one or more primers, in separate containers or in a single master mix container. The kit may also contain other reagents in appropriate packages, as well as materials required for extension, amplification, and enrichment (e.g., buffers, dNTPs and/or polymerization means) and for detection assays (e.g., enzymes), as well as instructions for performing the assays.

Brief description of the drawings

Fig. 1A depicts a schematic of an illustrative embodiment. The target polynucleotide (one or more PCR products, or DNA, or RNA, or any mixture thereof) may be double-stranded, with only one of the two strands (the forward strand) shown. In step (i), the ATO molecule hybridizes at one or more positions to the single-stranded target polynucleotide sequence. The 3' end of the target polynucleotide is hybridized to the 3' random sequence portion of the ATO, trimmed (if any 3' overhang is present), and extended using the ATO as a template. Extension generates a modified target polynucleotide that includes a random sequence as a UID and a 3' universal sequence as a priming site. After step (i), the ATO is digested or removed by affinity capture. In some embodiments, the ATO is not digested or removed. In step (ii), a first universal primer is added to hybridize to the modified target polynucleotide and is extended to generate a first Complementary Sequence (CS).
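The strand arithmetic of step (i) can be illustrated with a simplified Python model. It assumes the ATO is written 5' to 3' as universal sequence followed by random sequence, and that the last few bases of the target pair perfectly with the last few bases of the ATO random sequence. Trimming of 3' overhangs, mismatches, and hybridization at internal positions are deliberately ignored, and all names and sequences are hypothetical.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")

def revcomp(seq):
    """Reverse complement of an uppercase DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]

def extend_on_ato(target, ato_universal, ato_random, anchor_len):
    """Return the modified target, or None if the 3' end does not anneal.

    The ATO is read 5'->3' as ato_universal + ato_random (3' blocker implied).
    Simplification: the last `anchor_len` bases of the target must pair
    perfectly with the last `anchor_len` bases of the ATO random sequence.
    """
    if not 0 < anchor_len <= min(len(target), len(ato_random)):
        return None
    if target[-anchor_len:] != revcomp(ato_random[-anchor_len:]):
        return None
    # The polymerase copies the ATO from the annealed site toward its 5' end.
    template = ato_universal + ato_random[:-anchor_len]
    return target + revcomp(template)

# Hypothetical sequences: the target's 3' end "GTG" pairs with "CAC" on the ATO.
modified = extend_on_ato("ACCTGTG", "AGTCGG", "TTGCAC", anchor_len=3)
```

In this example the target gains "CAA" (copied from the unpaired part of the random sequence, serving as the UID) followed by "CCGACT", the complement of the ATO universal sequence. A first universal primer with the same sequence as the ATO universal sequence would therefore anneal to the new 3' end.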

Fig. 1B depicts a schematic of an illustrative embodiment. In step (i), the ATO hybridizes at one or more positions to the single-stranded target polynucleotide sequence. The 3' end of the target polynucleotide is hybridized to the 3' random sequence portion of the ATO, trimmed (if any 3' overhang is present), and extended using the ATO as a template. The ATO includes a stem-loop structure, and the loop portion includes a non-replicable linkage. The extension displaces the 5' stem portion sequence and terminates at the non-replicable linkage. Extension generates a modified target polynucleotide that includes a random sequence as a UID and a 3' universal sequence as a priming site. After step (i), the ATO is digested or removed by affinity capture. In some embodiments, the ATO is not digested or removed. In step (ii), a first universal primer is added to hybridize to the modified target polynucleotide and is extended to generate a first Complementary Sequence (CS).

Fig. 1C depicts a schematic of an illustrative embodiment. In step (i), the ATO hybridizes at one or more positions to the single-stranded target polynucleotide sequence. The 3' end of the target polynucleotide is hybridized to the 3' random sequence portion of the ATO, trimmed (if any 3' overhang is present), and extended using the ATO as a template. The ATO comprises a stem-loop structure, and the loop portion comprises a modification (such as a uracil nucleotide) that can be digested or cleaved, allowing the hairpin to be broken. The extended strand reaches the 5' stem portion sequence and is ligated to the 5' stem portion, which contains a 5' phosphate group. Extension-ligation generates a modified target polynucleotide that includes a random sequence as a UID and a 3' universal sequence as a priming site. In some embodiments, the 3' end of one strand of the target polynucleotide hybridizes to the portion of the 3' random sequence of the ATO immediately adjacent to the 5' stem portion of the ATO and is ligated to the 5' stem portion of the ATO by a DNA ligase. This ligation may occur without the use of a DNA polymerase. After step (i), the ATO may be cleaved at the modification, and the lower part of the ATO may be digested or removed by affinity capture. In step (ii), a first universal primer is added to hybridize to the modified target polynucleotide and is extended to generate a first Complementary Sequence (CS).

Fig. 1D depicts a schematic diagram of an illustrative embodiment. In step (i) of the reaction, the ATO hybridizes at one or more positions to the single-stranded target polynucleotide sequence. The 3' end of one strand of the target polynucleotide is hybridized to the 3' random sequence portion of the ATO, trimmed (if any 3' overhang is present), and extended using the ATO as a template. The ATO comprises a double-stranded structure with a lower part, which comprises a random sequence, and a separate upper part; the universal part of the ATO is formed by two sequences designed to anneal to form a double-stranded region. During extension of the target polynucleotide, the upper part sequence is displaced or digested by a DNA polymerase having strand displacement activity or 5' to 3' exonuclease activity. Extension generates a modified target polynucleotide that includes a random sequence as a UID and a 3' universal sequence as a priming site. The ATO may be digested after the first reaction. In some embodiments, the 3' end of one strand of the target polynucleotide is hybridized to the 3' random sequence portion of the ATO, trimmed (if any 3' overhang is present), and extended using the ATO as a template. The extended strand reaches the 5' end of the upper part sequence and is ligated to the upper part, which contains a 5' phosphate group. Extension-ligation generates a modified target polynucleotide that includes a random sequence as a UID and a 3' universal sequence as a priming site. In some embodiments, the 3' end of one strand of the target polynucleotide hybridizes to the 3' random sequence portion of the ATO immediately adjacent to the 5' end of the upper part of the ATO and is ligated to the 5' end of the upper part of the ATO by a DNA ligase. This ligation may occur without the use of a DNA polymerase. After step (i), the ATO or the lower part of the ATO may be removed by digestion or affinity capture. In step (ii), a first universal primer is added to hybridize to the modified target polynucleotide and is extended to generate a first Complementary Sequence (CS).

Fig. 2 depicts a schematic diagram of an illustrative embodiment.

(A) A target-specific second primer is provided to hybridize to the first CS and is extended to generate a second CS. The target-specific second primer includes a 3' target-specific portion and a 5' universal portion. The extension operation may be one extension, or multiple cycles of linear amplification, or PCR amplification using both the first and second primers. Optionally, the second CS may be further PCR amplified using universal primers compatible with the NGS platform.

(B) A target-specific second primer is provided to hybridize to the first CS and is extended to generate a second CS. The target-specific second primer includes a 3' target-specific portion, with or without a 5' universal portion. The extension operation may be one extension, or multiple cycles of linear amplification, or PCR amplification using both the first and second primers. The second CS is further PCR amplified using nested target-specific third primers and universal primers compatible with the NGS platform. Optionally, the second CS may be further PCR amplified using universal primers compatible with the NGS platform.

(C) The first primer is annealed to the added adaptor sequence of the target polynucleotide and extended to generate double-stranded DNA containing the first CS. The double-stranded DNA containing the first CS is ligated to the adaptor via double-stranded ligation by DNA ligase. Only the CS strand needs to be ligated. The ligation product may optionally be affinity captured to a solid support. After ligation, the product can be amplified by two universal primers.

(D) The target polynucleotide is subjected to bisulfite treatment, which converts unmethylated cytosine (C) to uracil and leaves methylated cytosine intact. The ATO hybridizes to the single-stranded target polynucleotide sequence at one or more positions. The 3' end of one strand of the target polynucleotide is hybridized to the 3' random sequence portion of the ATO, trimmed (if any 3' overhang is present), and extended using the ATO as a template. Extension generates a modified target polynucleotide that includes a random sequence as a UID and a 3' universal sequence as a priming site. The ATO may be digested after the first reaction. The modified target polynucleotide is combined with a universal primer, the 3' portion of which is designed to hybridize to the universal sequence of the modified target polynucleotide (with or without an additional 5' universal sequence), and with a pool of, or one or more, target-specific primers that include a 3' target-specific portion, with or without an additional 5' universal portion. The mixture is amplified with one or more cycles of PCR amplification using both the universal primer and the one or more target-specific primers. The amplified product can be used directly for next generation sequencing or can be further processed to produce a product suitable for next generation sequencing.

(E) The ATO hybridizes to the single-stranded target polynucleotide sequence at one or more positions. The 3' end of one strand of the target polynucleotide is hybridized to the 3' random sequence portion of the ATO, trimmed (if any 3' overhang is present), and extended using the ATO as a template. Extension generates a modified target polynucleotide that includes a random sequence as a UID and a 3' universal sequence as a priming site. The ATO may be digested after the first reaction. The modified target polynucleotide is then subjected to bisulfite treatment. The modified target polynucleotide is combined with a universal primer, the 3' portion of which is designed to hybridize to the universal sequence of the modified target polynucleotide (with or without an additional 5' universal sequence), and with a pool of, or one or more, target-specific primers that include a 3' target-specific portion, with or without an additional 5' universal portion. The mixture is amplified with one or more cycles of PCR amplification using both the universal primer and the one or more target-specific primers. The amplified product can be used directly for next generation sequencing or can be further processed to produce a product suitable for next generation sequencing.

(F) The first CS is hybridized to the second ATO, and a modified first CS having an adaptor sequence added to the 3' end of the first CS is generated. The first CS may be generated by linear amplification using a first primer comprising biotin. Optionally, after linear amplification, the first CS is captured by avidin affinity and the non-first CS is removed by washing. A second ATO is added and the extension reaction is repeated as the first reaction. After the extension reaction, unreacted products are washed away and the products are amplified by two universal primers that target both ends of the universal sequence. The universal primer that targets the 5' end of the first CS can be a nested primer.

(G) The modified target polynucleotide is divided into one or more aliquots. Each aliquot is combined with a universal primer, the 3' portion of which is designed to hybridize to the universal sequence of the modified target polynucleotide and which has a 5' tail containing sequences necessary for next generation sequencing, and a different target-specific primer or pool of target-specific primers designed to amplify the target region of the target polynucleotide is added to each aliquot. The target-specific primers include a 3' target-specific portion and a 5' universal portion containing sequences necessary for next generation sequencing. The mixture is amplified with one or more cycles of PCR amplification using both the universal primers and the one or more target-specific primers. The amplified products are PCR products, each of which contains an amplified target region of the original polynucleotide and all the necessary sequences compatible with next generation sequencing.

(H) The modified target polynucleotide is combined with a universal primer designed to hybridize to the universal sequence of the modified target polynucleotide (with or without an appended 5' universal sequence), and a target-specific primer or pool of target-specific primers is added. The target-specific primer includes a 3' target-specific sequence, with or without a 5' universal sequence. The mixture is amplified with one or more cycles of PCR amplification using both the universal primer and the one or more target-specific primers. The product of this amplification is a PCR fragment containing the amplified target region of the original polynucleotide, with or without the 3' universal sequence and the 5' universal sequence. The first amplification product may be purified to remove reagents from the first PCR reaction that are no longer needed. The PCR product is then combined with a second, nested universal primer, the 3' portion of which is designed to hybridize to the universal sequence of the first PCR product and which has a 5' tail containing sequences necessary for next generation sequencing, and a different nested target-specific primer or pool of nested target-specific primers is added, each nested target-specific primer comprising a 3' target-specific portion and a 5' universal portion containing sequences necessary for next generation sequencing. The mixture is amplified with one or more cycles of PCR amplification using both the universal primer and the one or more target-specific primers. The amplified product is a PCR product that contains the amplified target region of the original polynucleotide and all the necessary sequences compatible with next generation sequencing.

Fig. 3 depicts a schematic diagram of an illustrative embodiment. The adaptor template oligonucleotide comprises:

(A) An ATO that, in addition to any combination of the design components mentioned in (A-T), includes: a 3' random sequence, degenerate sequence, random sequence with nucleotide bias, or target-specific sequence; a blocker portion attached at the extreme 3' end that renders the ATO inextensible; a universal sequence 5' to the random sequence; and one or more nucleotide sequence/modified (NSM) portions that are recognized by an agent.

(B) ATO, in addition to any combination of design components mentioned in (A-T), includes hairpin/stem-loop structures, where the stem portion sequence is indicated. The loop portion may comprise a non-replicable linkage, or one or more cleavable nucleotides.

(C) ATO, in addition to any combination of design components mentioned in (A-T), includes hairpin/stem-loop structures, where the stem portion sequence is indicated. The stem portion may comprise one or more non-replicable linkages, or one or more cleavable nucleotides.

(D) ATO, in addition to any combination of design components mentioned in (A-U), includes hairpin/stem-loop structures, where the stem portion sequence is indicated. The stem portion may comprise one or more non-replicable linkages, or one or more cleavable nucleotides. The loop region may contain random sequences, degenerate sequences, random sequences with nucleotide bias, or target-specific sequences capable of serving as unique identifiers.

(E) An ATO designed with two separate strands that, in addition to any combination of the design components mentioned in (A-J), includes an upper separate strand that is complementary or substantially complementary to all or a portion of the lower strand of the ATO, the lower strand of the ATO being identical to the ATO in (A).

(F) An ATO, in addition to any combination of the design components mentioned in (A-U), wherein the universal site consists in whole or in part of a sequence designed to act as a promoter for RNA polymerase.

(G) An ATO, in addition to any combination of the design components mentioned in (A-U), wherein the universal site consists in part of a sequence designed to act as a promoter for RNA polymerase and a separate sequence designed to act as a priming site.

(H) An ATO, in addition to any combination of the design components mentioned in (A-U), comprising a hairpin/stem-loop structure, wherein a stem portion sequence is indicated. The loop portion may comprise a non-replicable linkage or one or more cleavable nucleotides, and has a sequence designed to act as a promoter for an RNA polymerase.

(I) An ATO, in addition to any combination of the design components mentioned in (A-U), comprising a hairpin/stem-loop structure wherein the stem portion is divided into two or more regions separated by a random sequence, a degenerate sequence, or a random sequence with nucleotide bias, the two or more regions being spanned by a non-replicable linkage or by a random sequence, a degenerate sequence, or a random sequence with nucleotide bias of equivalent length.

(J) An ATO that, in addition to any combination of the design components mentioned in (A-U), includes 3' sequences designed to be target-specific; random sequences, degenerate sequences, or random sequences with nucleotide bias 5' to the target-specific sequences; and universal sequences 5' to the random sequences.

(K) An ATO, in addition to any combination of the design components mentioned in (A-U), wherein the lower strand contains a universal site consisting in part of a sequence designed to act as a promoter for RNA polymerase and a separate sequence designed to act as a priming site, and the upper strand is a second oligonucleotide, partially or fully complementary to the lower strand, capable of forming a double-stranded RNA polymerase promoter.

(L) An ATO that, in addition to any combination of the design components mentioned in (A-U), includes a hairpin/stem-loop structure, where the stem portion sequence is indicated. The loop region may contain random sequences, degenerate sequences, random sequences with nucleotide bias, or target-specific sequences capable of serving as unique identifiers, and includes a non-replicable linkage or one or more cleavable nucleotides.

(M) An ATO that, in addition to any combination of the design components mentioned in (A-U), includes a hairpin/stem-loop structure, where the stem portion sequence is indicated. The stem portion may comprise one or more non-replicable linkages, or one or more cleavable nucleotides. The loop region may contain random sequences, degenerate sequences, random sequences with nucleotide bias, or target-specific sequences capable of serving as unique identifiers.

(N) An ATO, in addition to any combination of the design components mentioned in (A-U), comprising a hairpin/stem-loop structure wherein the stem portion is divided into two or more regions separated by a random sequence, a degenerate sequence, or a random sequence with nucleotide bias, the two or more regions being spanned by a non-replicable linkage or by a random sequence, a degenerate sequence, or a random sequence with nucleotide bias of equivalent length. The loop portion may comprise a non-replicable linkage, or one or more cleavable nucleotides.

(O) An ATO, in addition to any combination of the design components mentioned in (A-U), having a 3' random sequence, degenerate sequence, or random sequence with nucleotide bias, and a 3' end suitable for use as a primer to allow extension from the extreme 3' end of the ATO.

(P) An ATO, in addition to any combination of the design components mentioned in (A-U), having a 3' target-specific sequence and a 3' end suitable for use as a primer to allow extension from the extreme 3' end of the ATO.

(Q) An ATO that, in addition to any combination of the design components mentioned in (A-U), includes a hairpin/stem-loop structure, where the stem portion sequence is indicated. The loop portion may comprise a non-replicable linkage, or one or more cleavable nucleotides. The 5' end of the ATO includes a ligation acceptor (e.g., a phosphate).

(R) An ATO that, in addition to any combination of the design components mentioned in (A-U), includes a hairpin/stem-loop structure, where the stem portion sequence is indicated. The loop portion may comprise a non-replicable linkage, or one or more cleavable nucleotides. The 5' end of the ATO includes an affinity purification moiety (e.g., biotin).

(S) A linear ATO having a 5' end comprising a ligation acceptor (e.g., a phosphate), in addition to any combination of the design components mentioned in (A-U).

(T) A linear ATO having a 5' end comprising an affinity purification moiety (such as biotin), in addition to any combination of the design components mentioned in (A-U).

(U) A linear ATO, in addition to any combination of the design components mentioned in (A-U), consisting only of random sequences, with 3' and 5' phosphate blocks to prevent extension.

Fig. 4 depicts a schematic diagram of an illustrative embodiment. The DNA may be fragmented by means of a transposase, instead of by enzymatic fragmentation using a nuclease or by sonication. In step (i), the target double-stranded polynucleotide (DNA) is incubated with the transposon DNA and the transposase. Once the random transposition reaction has been completed, the target polynucleotide will contain multiple copies of the transposon DNA, which generates a free 3' end on the target polynucleotide and a free 5' end on the transposon DNA. In step (ii), the ATO molecule hybridizes at one or more positions to the single-stranded target polynucleotide sequence. The free 3' end of one strand of the target polynucleotide, generated as a result of random transposition, is hybridized to the 3' random sequence portion of the ATO, trimmed (if any 3' overhang is present), and extended using the ATO as a template. Extension generates a modified target polynucleotide that includes a 5' universal sequence from the transposon DNA, a region of the target polynucleotide 3' to the transposon DNA, a random sequence 3' to the target polynucleotide as the UID, and a 3' second universal sequence as the priming site. After step (ii), the ATO is digested or removed by affinity capture. In some embodiments, the ATO is not digested or removed. In step (iii), the modified target polynucleotide is used as a starting material for Complementary Strand (CS) generation, PCR amplification, or other downstream processes.

Fig. 5 depicts a schematic diagram of an illustrative embodiment. The target polynucleotide (PCR product of any origin, or DNA, or RNA, or any mixture thereof) may be double-stranded, with only one of the two strands (the forward strand) shown. In step (i), the ATO molecule is randomly hybridised at one or more positions to the single stranded target polynucleotide sequence. In step (ii), the 3' end of the ATO molecule is extended using the target polynucleotide as a template. The extension generates a modified copy of the target polynucleotide comprising a 5 'universal sequence, a random sequence as a UID, and a copy of the target polynucleotide at the 3' end. After step (ii), the modified copy of the target polynucleotide is purified to remove unused ATO. In some embodiments, the ATO is not digested or removed. In step (iii), the second ATO hybridizes to the modified target polynucleotide at one or more positions. The 3 ' end of the modified copy of the target polynucleotide is hybridized to the 3 ' random sequence portion of the ATO, trimmed (if any 3 ' overhang is present), and extended using the ATO as a template. Extension generates a double modified target polynucleotide that includes a 3 'universal site and a 5' universal site that can serve as priming sites, inside of these are two different random sequences that serve as UIDs, and in the center is a copy of the target polynucleotide.

Fig. 6 depicts a schematic diagram of an illustrative embodiment. The target polynucleotide (PCR product of any origin, or DNA, or RNA, or any mixture thereof) may be double-stranded, with only one of the two strands (the forward strand) shown. In step (i), the ATO molecule designed with the target-specific sequence hybridizes to the single-stranded target polynucleotide sequence at one or more positions. In step (ii), the 3' end of the ATO molecule is extended using the target polynucleotide as a template. The extension generates a modified copy of the target polynucleotide comprising a 5 'universal sequence, a random sequence as a UID, and a copy of the target polynucleotide at the 3' end. After step (ii), the modified copy of the target polynucleotide is purified to remove unused ATO. In some embodiments, the ATO is not digested or removed. In step (iii), the second ATO hybridizes to the modified target polynucleotide at one or more positions. The 3 ' end of the modified copy of the target polynucleotide is hybridized to the 3 ' random sequence portion of the ATO, trimmed (if any 3 ' overhang is present), and extended using the ATO as a template. Extension generates a double modified target polynucleotide that includes a 3 'universal site and a 5' universal site that can serve as priming sites, inside of these are two different random sequences that serve as UIDs, and in the center is a copy of the target polynucleotide.

Fig. 7 depicts a schematic of an illustrative embodiment. The target polynucleotide (PCR product of any origin, or DNA, or RNA, or any mixture thereof) may be double-stranded, with only one of the two strands (the forward strand) shown. In step (i), the single stranded target polynucleotide is fragmented prior to random hybridisation of the target polynucleotide to the ATO molecule at one or more positions. The 3 ' end of the target polynucleotide is hybridized to the 3 ' random sequence portion of the ATO, trimmed (if any 3 ' overhang is present), and extended using the ATO as a template. Extension generates a modified target polynucleotide that includes a random sequence as a UID and a 3' universal sequence as a priming site.
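By way of non-limiting illustration, the ATO-templated extension described above may be modelled in silico. The following Python sketch is a toy model only: the universal sequence shown is an arbitrary placeholder, the function names are hypothetical, the 3' blocker and any trimming of 3' overhangs are not modelled, and the 3' end of the target is assumed to pair exactly with the last few bases of the ATO random sequence.

```python
import random

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence written 5'->3'."""
    return seq.translate(COMPLEMENT)[::-1]


def make_ato(universal: str, n_random: int, rng: random.Random) -> str:
    """Build a toy ATO: 5'-universal sequence followed by a 3' random sequence.

    The 3' blocker is not modelled; the ATO is simply never extended here.
    """
    random_part = "".join(rng.choice("ACGT") for _ in range(n_random))
    return universal + random_part


def extend_on_ato(target: str, ato: str, overlap: int) -> str:
    """Extend the 3' end of `target` using `ato` as the template.

    Assumption: the last `overlap` bases of the target are paired with the
    last `overlap` bases of the ATO random sequence (no overhang to trim).
    """
    if not 0 < overlap < len(ato):
        raise ValueError("overlap must be between 1 and len(ato) - 1")
    if reverse_complement(target[-overlap:]) != ato[-overlap:]:
        raise ValueError("target 3' end does not hybridize to the ATO 3' end")
    # The polymerase copies the unpaired, 5' part of the ATO onto the target.
    return target + reverse_complement(ato[:-overlap])


rng = random.Random(0)
universal = "ACGTTGCAAGGCTA"            # placeholder universal sequence
ato = make_ato(universal, n_random=8, rng=rng)
target = "TTGACCGTA" + reverse_complement(ato[-4:])
modified = extend_on_ato(target, ato, overlap=4)

# Bases copied from the unpaired part of the random sequence act as the UID;
# they are followed by the complement of the universal sequence (priming site).
uid = modified[len(target):len(modified) - len(universal)]
```

In this model the modified target polynucleotide ends with the complement of the ATO universal sequence, and the bases lying between the original target and that sequence serve as the UID.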

Fig. 8 depicts a schematic of an illustrative embodiment. (a) The target polynucleotide (PCR product of any origin, or DNA, or RNA, or any mixture thereof) may be double-stranded, with only one of the two strands (the forward strand) shown. In step (i), the single stranded target polynucleotide is fragmented prior to random hybridisation of the target polynucleotide at one or more positions to an ATO molecule, the ATO containing an RNA polymerase promoter sequence. The 3 ' end of the target polynucleotide is hybridized to the 3 ' random sequence portion of the ATO, trimmed (if any 3 ' overhang is present), and extended using the ATO as a template. The extension generates a modified target polynucleotide comprising a double-stranded sequence region between the target polynucleotide and the ATO, the modified target polynucleotide comprising a random sequence as the UID and a 3' universal sequence comprising an RNA polymerase promoter. After step (i), optionally purifying the modified target polynucleotide to remove unused ATO. In some embodiments, the ATO is not digested or removed. In step (ii), the double stranded RNA polymerase promoter of the modified target polynucleotide is used to generate an RNA copy (first complementary sequence) of the modified target polynucleotide.

(b) The target polynucleotide (PCR product of any origin, or DNA, or RNA, or any mixture thereof) may be double-stranded, with only one of the two strands (the forward strand) shown. In step (i), the single-stranded target polynucleotide is fragmented prior to random hybridization of the target polynucleotide at one or more positions to an ATO molecule, the ATO containing an RNA polymerase promoter sequence. The 3' end of the target polynucleotide is hybridized to the 3' random sequence portion of the ATO, trimmed (if any 3' overhang is present), and extended using the ATO as a template. Extension generates a modified target polynucleotide that includes a random sequence as a UID and a 3' universal sequence as a priming site, the 3' universal sequence including a complementary region capable of forming a hairpin at the 3' end of the modified target polynucleotide. After step (i), the ATO is digested or removed by affinity capture. In some embodiments, the ATO is not digested or removed. In step (ii), the 3' end of the modified target polynucleotide is allowed to form a small hairpin by annealing to itself, and the 3' end is then used as a primer which, once extended, generates a first complementary sequence comprising a double-stranded RNA polymerase promoter. In step (iii), the double-stranded RNA polymerase promoter of the modified target polynucleotide is used to generate an RNA copy (first complementary sequence) of the modified target polynucleotide.

Fig. 9 depicts a schematic of an illustrative embodiment. The target polynucleotide (PCR product of any origin, or DNA, or RNA, or any mixture thereof) may be double-stranded, with only one of the two strands (the forward strand) shown. In step (i), the single-stranded target polynucleotide is fragmented prior to random hybridization of the target polynucleotide to the ATO molecule at one or more positions. The 3' end of the target polynucleotide is hybridized to the 3' random sequence portion of the ATO, trimmed (if any 3' overhang is present), and extended using the ATO as a template. Extension generates a modified target polynucleotide that includes a random sequence as a UID and a 3' universal sequence as a priming site, the 3' universal sequence including a complementary region capable of forming a hairpin at the 3' end of the modified target polynucleotide. After step (i), the ATO is digested or removed by affinity capture. In some embodiments, the ATO is not digested or removed. In step (ii), the 3' end of the modified target polynucleotide is allowed to form a small hairpin by annealing to itself, and the 3' end is then used as a primer which, once extended, generates the first complementary sequence.

Fig. 10 depicts a schematic of an illustrative embodiment. The target polynucleotide (PCR product of any origin, or DNA, or RNA, or any mixture thereof) may be double-stranded, with only one of the two strands (the forward strand) shown. In step (i), the single-stranded target polynucleotide is fragmented prior to random hybridization of the target polynucleotide to the ATO molecule at one or more positions. The 3' end of the target polynucleotide is hybridized to the 3' random sequence portion of the ATO, trimmed (if any 3' overhang is present), and extended using the ATO as a template. Extension generates a modified target polynucleotide that includes a random sequence as a UID and a 3' universal sequence as a priming site. After step (i), the ATO is digested or removed by affinity capture. In some embodiments, the ATO is not digested or removed. In step (ii), a target-specific primer is added to the modified target polynucleotide, and the primer is extended by a polymerase until it reaches the 5' end of the modified target polynucleotide. In step (iii), the double-stranded adaptor is mixed with the modified target polynucleotide and a suitable enzyme, and the adaptor is then ligated to the double-stranded 5' end of the modified target polynucleotide.

Fig. 11 depicts a schematic of an illustrative embodiment. Quantitative PCR (qPCR) is used to measure the efficiency with which 3' extension of the target polynucleotide occurs to generate a modified target polynucleotide. In one case, the target polynucleotide used is a single-stranded DNA oligonucleotide of known length and sequence (internal control (IC)). In the other case, the target polynucleotide used is genomic DNA mixed with 10% of the amount of the single-stranded DNA oligonucleotide of known length and sequence (IC). Once the modified target polynucleotides have been generated, the two modified target polynucleotides are used independently as templates for three different qPCR reactions: one reaction having two primers located within the target polynucleotide sequence (forward primer 1 and reverse primer 1); one reaction having a primer located within the target polynucleotide sequence and a primer located in the universal sequence added to the modified target polynucleotide (forward primer 2 and reverse primer 2); and one reaction having a primer located within the target polynucleotide sequence and a primer that is located at the end of the universal sequence added to the modified target polynucleotide and bears a tail not homologous to the modified target polynucleotide (forward primer 3 and reverse primer 3). All of the reactions contain a dual-labeled qPCR probe located within the target polynucleotide (probe). The CT value obtained from the fluorescent amplification signal of the IC-specific primers, when compared with the CT value obtained from the fluorescent amplification signal of the primer pair located within the IC and the universal sequence, shows proportionally the efficiency of 3' extension of the target polynucleotide. Since the CT values are very similar, the efficiency can be interpreted to be between 50% and 100%.
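The relationship between the CT values and the extension efficiency may be illustrated by a simple calculation. The sketch below is an assumption-laden illustration rather than the analysis actually performed: it assumes that both primer pairs amplify with the same per-cycle efficiency (perfect doubling by default), and the function and parameter names are hypothetical. Under these assumptions a CT difference of less than one cycle corresponds to an extension efficiency above 50%.

```python
def extension_efficiency(ct_internal: float,
                         ct_universal: float,
                         amp_efficiency: float = 1.0) -> float:
    """Estimate the fraction of target molecules carrying the universal sequence.

    ct_internal    -- CT with both primers inside the target (all molecules)
    ct_universal   -- CT with one primer in the added universal sequence
                      (only extended molecules)
    amp_efficiency -- assumed per-cycle amplification efficiency (1.0 = doubling)
    """
    delta_ct = ct_universal - ct_internal
    fraction = (1.0 + amp_efficiency) ** (-delta_ct)
    # Measurement noise can give a slightly negative delta; cap at 100%.
    return min(1.0, fraction)


# Illustrative values only (not measured data):
same_ct = extension_efficiency(20.0, 20.0)       # identical CT values
one_cycle = extension_efficiency(20.0, 21.0)     # one cycle later
half_cycle = extension_efficiency(20.0, 20.5)    # half a cycle later
```

With perfect doubling, identical CT values correspond to complete extension and a delay of one cycle corresponds to half of the molecules having been extended.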

Fig. 12 depicts a schematic of an illustrative embodiment. The target polynucleotide (PCR product of any origin, or DNA, or RNA, or any mixture thereof) may be double-stranded, with only one of the two strands (the forward strand) shown. In step (i), the single stranded target polynucleotide is fragmented prior to random hybridisation of the target polynucleotide to the ATO molecule at one or more positions. The 3 ' end of the target polynucleotide is hybridized to the 3 ' random sequence portion of the ATO, trimmed (if any 3 ' overhang is present), and extended using the ATO as a template. Extension generates a modified target polynucleotide that includes a random sequence as a UID and a 3 'universal sequence as a priming site, which includes a complementary region capable of forming a hairpin at the 3' end of the modified target polynucleotide. In step (ii), the 3' end of the modified target polynucleotide is allowed to form a small hairpin by annealing to itself. In step (iii), the 3' end is then used as a primer, which upon extension will generate the first complementary strand. After step (iii), the ATO is digested or removed by affinity capture. In some embodiments, the ATO is not digested or removed. In step (iv), the double stranded DNA of the first CS is ligated to the adaptor via double stranded ligation by a DNA ligase. In step (v), the remaining ATO is digested and the hairpin is disrupted by incubating the reaction mixture with a mixture of dU-glycosylase, apurinic/apyrimidinic endonuclease and S1 nuclease. The final double stranded product is then used directly for sequencing on a compatible next generation sequencer (e.g., MiSeq).

Fig. 13 depicts a schematic of an illustrative embodiment. All steps used in the generation of the targeted amplicon next generation sequencing set, from starting material to sequencing data analysis, are depicted. The target polynucleotide (PCR product, or DNA, or RNA, or any mixture thereof) may be double-stranded, with only one of the two strands (the forward strand) shown. In step (i), the target polynucleotide is enzymatically fragmented. In step (ii), the size distribution of the fragmented target polynucleotides is determined using a high-sensitivity bioanalyzer chip. In step (iii), the fragmented target polynucleotides are randomly hybridized to ATO molecules at one or more positions. The 3' end of the target polynucleotide is hybridized to the 3' random sequence portion of the ATO, trimmed (if any 3' overhang is present), and extended using the ATO as a template. Extension generates a modified target polynucleotide that includes a 3' random sequence as the UID and a 3' universal sequence as the priming site. After step (iii), the ATO is digested or removed by affinity capture. In some embodiments, the ATO is not digested or removed. In step (iv), the modified target polynucleotide, the universal primers designed to bind to the universal sites on the ATO, the pool of gene-specific primers, a suitable buffer, a suitable enzyme, dNTPs and other additives are combined and used to exponentially amplify the modified target polynucleotide. In step (v), the PCR product is purified. In step (vi), the first PCR product, the second universal primer designed to bind to the universal site in the PCR product, the second pool of overlapping gene-specific primers, a suitable buffer, a suitable enzyme, dNTPs and other additives are combined and used to exponentially amplify the modified target polynucleotide. In step (vii), the second PCR product is purified. 
In step (viii), the size distribution of the final sequencing library is determined using a high-sensitivity bioanalyzer chip. The library is then sequenced using a MiSeq sequencer with 150 bp paired-end sequencing. The sequencing data are then analyzed using a combination of the BWA aligner and a custom data-filtering Python script. Data plot (H) represents the size distribution of inserts determined from the sequencing data generated by the MiSeq, (I) represents the distribution of reads over the full range of primers present in the pool used in exponential amplification, and (J) shows the number of "barcode families" identified in the sequencing data with family sizes of at least 3 reads.
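The counting of "barcode families" may be illustrated as follows. This sketch is not the custom script referred to above; it is a minimal illustration under the assumption that each aligned read has already been reduced to a tuple of reference name, alignment start position and UID sequence, and that reads sharing all three values belong to one family. The names used are hypothetical.

```python
from collections import Counter
from typing import Dict, Iterable, Tuple

# (reference name, alignment start, UID sequence) -- assumed read summary
ReadKey = Tuple[str, int, str]


def barcode_families(reads: Iterable[ReadKey],
                     min_size: int = 3) -> Dict[ReadKey, int]:
    """Group reads into families and keep families with at least `min_size` reads."""
    counts = Counter(reads)
    return {key: n for key, n in counts.items() if n >= min_size}


# Illustrative reads only (not data from the description):
example_reads = [
    ("chr1", 1000, "ACGTAC"),
    ("chr1", 1000, "ACGTAC"),
    ("chr1", 1000, "ACGTAC"),
    ("chr1", 1000, "TTGCAA"),   # same position, different UID: separate family
    ("chr2", 5000, "ACGTAC"),
    ("chr2", 5000, "ACGTAC"),
]
families = barcode_families(example_reads, min_size=3)
```

Here only the first family reaches the threshold of three reads; the remaining reads fall into families of one or two reads and are not counted.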

Fig. 14 depicts a schematic of an illustrative embodiment. All steps used in the generation of the whole genome DNA library, from the starting material to the final library, are depicted. The target polynucleotide (PCR product, or DNA, or RNA, or any mixture thereof) may be double-stranded, with only one of the two strands (the forward strand) shown. In step (i), the target polynucleotide is enzymatically fragmented. In step (ii), the size distribution of the fragmented target polynucleotides is determined using a high-sensitivity bioanalyzer chip. In step (iii), the fragmented target polynucleotides are randomly hybridized to ATO molecules at one or more positions. The 3' end of the target polynucleotide is hybridized to the 3' random sequence portion of the ATO, trimmed (if any 3' overhang is present), and extended using the ATO as a template. Extension generates a modified target polynucleotide that includes a 3' random sequence as the UID and a 3' universal sequence as the priming site. After step (iii), the ATO is digested or removed by affinity capture. In some embodiments, the ATO is not digested or removed. In step (iv), the modified target polynucleotide, the universal primer designed to bind to the universal site on the ATO, a suitable buffer, a suitable enzyme, dNTPs and other additives are combined and used to perform linear amplification of the modified target polynucleotide to generate a first CS (complementary sequence). The product is then optionally purified. In step (v), the linear CS is randomly hybridized to the ATO molecule at one or more positions. The 3' end of the linear CS is hybridized to the 3' random sequence portion of the ATO, trimmed (if any 3' overhang is present), and extended using the ATO as a template. The extension generates a modified CS that includes a random sequence as a UID at each end, and a different 3' universal sequence and 5' universal sequence as priming sites. 
In step (vi), the ATO is digested or removed by affinity capture. In some embodiments, the ATO is not digested or removed. In step (vii), the modified CS product, two different universal primers designed to bind to the universal sites at the 5' end and 3' end of the modified CS product, and the necessary buffers, enzymes, dNTPs and other additives are combined and used for exponential amplification of the modified CS product. In step (viii), the second PCR product is purified. In step (ix), the size distribution of the final sequencing library is determined using a high-sensitivity bioanalyzer chip.

Fig. 15 depicts a schematic of an illustrative embodiment. The target polynucleotide (PCR product, or DNA, or RNA, or any mixture thereof) may be double-stranded, with only one of the two strands (the forward strand) shown. In step (i), the single-stranded target polynucleotide is randomly hybridized to the ATO molecule at one or more positions. The 3' end of the target polynucleotide is hybridized to the 3' random sequence portion of the ATO, trimmed (if any 3' overhang is present), and extended using the ATO as a template. Extension generates a modified target polynucleotide that includes a random sequence of variable length as the UID and a 3' universal sequence as the priming site. In step (ii), the modified target polynucleotide is purified to remove unused ATO. In some embodiments, the ATO is not digested or removed. In step (iii), the modified target polynucleotide, universal primers designed to bind to universal sites in the PCR product, gene-specific primers, suitable buffers, suitable enzymes, dNTPs and other additives are combined and used to exponentially amplify the modified target polynucleotide.

Fig. 16 depicts a schematic of an illustrative embodiment. The target polynucleotide (PCR product, or DNA, or RNA, or any mixture thereof) may be double-stranded, with only one of the two strands (the forward strand) shown.

(A) The modified target polynucleotide is divided into two approximately equal aliquots. To each of these aliquots is added a universal primer, the 3' portion of which is designed to hybridize to the universal sequence of the modified target polynucleotide and which has a 5' tail containing sequences necessary for next generation sequencing. To each aliquot is added a different target-specific primer or pool of target-specific primers, the primers in each pool being designed to amplify either the forward or reverse strand of the target region of the target polynucleotide. The target-specific primers include a 3' target-specific portion and a 5' universal portion containing sequences necessary for next generation sequencing. The two separate mixtures are amplified with multiple cycles of PCR amplification using both the universal primers and the one or more target-specific primers. The amplified products will be two separate pools of PCR products (each of which has amplified one or the other of the strands of the original polynucleotide) and will contain all the necessary sequences compatible with next generation sequencing.

(B) The modified target polynucleotide was divided into two approximately equal aliquots. To each of these aliquots was added a universal primer designed to hybridize to a universal sequence of the modifier target polynucleotide (with or without an additional 5' universal sequence). To each aliquot is added a different target-specific primer or pool of target-specific primers, the primers in each pool being designed to amplify either the forward or reverse strand of the target region of the target polynucleotide. The target-specific primer includes a 3 ' target-specific sequence, has a 5 ' universal sequence, or does not have a 5 ' universal sequence. Two separate mixtures were amplified with multiple cycles of PCR amplification using both universal primers and a pool of target-specific primers. The products of this amplification will be two separate pools of PCR products (with or without 3 'universal sequence and 5' universal sequence) each of which has been amplified for one or the other of the strands of the original polynucleotide. The first amplification product may be purified to remove reagents that are no longer needed from the first PCR reaction. Combining each of the two separate pools of PCR products with a second universal primer, a universal primer used in the first PCR, or a nested universal primer, the 3 'of which is designed to hybridize to the universal sequence of the first PCR product and has a 5' tail containing sequences necessary for next generation sequencing, adding a different nested target-specific primer or pool of nested target-specific primers, the nested target-specific primer comprising a 3 'target-specific portion and a 5' universal portion containing sequences necessary for next generation sequencing. Two separate mixtures were amplified with multiple cycles of PCR amplification using both universal primers and a pool of target-specific primers. 
The amplified products will be two separate pools of PCR products (each of which has amplified one or the other of the strands of the original polynucleotide) and will contain all the necessary sequences compatible with next generation sequencing.

(C) The modified target polynucleotide is combined with a universal primer designed to hybridize to a universal sequence of the modified target polynucleotide (with or without an additional 5' universal sequence). The modified target polynucleotide is then subjected to linear amplification by one or more rounds of amplification. The linear amplification product can be purified to remove reagents that are no longer needed from the linear amplification reaction. The linear amplification product was divided into two approximately equal aliquots. To each of these aliquots was added a universal primer, 3 'of which was designed to hybridize to the universal sequence of the modified target polynucleotide and has a 5' tail containing sequences necessary for next generation sequencing. To each aliquot is added a different target-specific primer or pool of target-specific primers, each pool containing primers designed to amplify a target region of the forward or reverse strand of the target polynucleotide. The target-specific primers include a 3 'target-specific portion and a 5' universal portion containing sequences necessary for next-generation sequencing. Two separate mixtures were amplified with multiple cycles of PCR amplification using both universal primers and a pool of primers. The amplified products will be two separate pools of PCR products (each of which has amplified one or the other of the strands of the original polynucleotide) and will contain all the necessary sequences compatible with next generation sequencing.

(D) The modified target polynucleotide is combined with a universal primer designed to hybridize to a universal sequence of the modified target polynucleotide (with or without an additional 5' universal sequence). The modified target polynucleotide is then subjected to linear amplification by one or more rounds of amplification. The linear amplification product may be purified to remove reagents that are no longer needed from the first PCR reaction. The linear amplification product was divided into two approximately equal aliquots. To each of these aliquots was added a universal primer designed to hybridize to a universal sequence of the modifier target polynucleotide (with or without an additional 5' universal sequence). To each aliquot is added a different target-specific primer or pool of target-specific primers, each pool containing primers designed to amplify a target region of the forward or reverse strand of the target polynucleotide. The target-specific primer includes a 3 ' target-specific sequence, has a 5 ' universal sequence, or does not have a 5 ' universal sequence. Two separate mixtures were amplified with multiple cycles of PCR amplification using both universal primers and a pool of primers. The products of this amplification will be two separate pools of PCR products (with or without 3 'universal sequence and 5' universal sequence) each of which has been amplified for one or the other of the strands of the original polynucleotide. The first amplification product may be purified to remove reagents that are no longer needed from the first PCR reaction. 
Each of the two separate pools of PCR products is combined with a second nested universal primer, or universal primer used in the first amplification, the 3 'of which is designed to hybridize to the universal sequence of the first PCR product and has a 5' tail containing sequences necessary for next generation sequencing, and a different nested target-specific primer is added, or pool of nested target-specific primers, which includes a 3 'target-specific portion and a 5' universal portion containing sequences necessary for next generation sequencing. Two separate mixtures were amplified with multiple cycles of PCR amplification using both universal primers and a pool of primers. The amplified products will be two separate pools of PCR products (each of which has amplified one or the other of the strands of the original polynucleotide) and will contain all the necessary sequences compatible with next generation sequencing.

(E) The modified target polynucleotide is combined with a universal primer designed to hybridize to a universal sequence of the modified target polynucleotide (with or without an appended 5' universal sequence), and a target-specific primer or pool of target-specific primers designed to amplify either the forward or reverse strand of the target polynucleotide is added. The target-specific primer includes a 3 ' target-specific sequence, has a 5 ' universal sequence, or does not have a 5 ' universal sequence. The mixture is amplified with multiple cycles of PCR amplification using both the universal primers and the one or more target-specific primers. The product of this amplification will be a pool of PCR products (with or without 3 'universal sequence and 5' universal sequence) that have amplified one or the other of the strands of the original polynucleotide. The first amplification product is purified to remove all unused single stranded primers. Combining the purified first amplification product with universal primers designed to hybridize to a universal sequence of a modifier target polynucleotide (with or without an additional 5' universal sequence), adding target-specific primers or a pool of target-specific primers, one or more of which are designed to amplify either the forward or reverse strand of the target polynucleotide (which is not targeted in the first amplification reaction). If the forward strand is targeted in the first reaction, the reverse strand is targeted in the second reaction. The target-specific primer includes a 3 ' target-specific sequence, has a 5 ' universal sequence, or does not have a 5 ' universal sequence. The mixture is amplified with multiple cycles of PCR amplification using both the universal primers and the one or more target-specific primers. The second amplification product is purified to remove all unused primers. The second amplification product was divided into two approximately equal aliquots. 
To each of these aliquots is added a second universal primer, the 3' portion of which is designed to hybridize to the universal sequence of the modified target polynucleotide and which has a 5' tail containing sequences necessary for next generation sequencing. To each aliquot is added a different target-specific nested primer or pool of target-specific nested primers designed to amplify either the forward or reverse strand of the target polynucleotide. The target-specific primers include a 3' target-specific portion and a 5' universal portion containing sequences necessary for next generation sequencing. The two separate mixtures are amplified with multiple cycles of PCR amplification using both the universal primers and the one or more target-specific primers. The amplified products will be two separate pools of PCR products (each of which has amplified one or the other of the strands of the original polynucleotide) and will contain all the necessary sequences compatible with next generation sequencing.

(F) Combining the modified target polynucleotide with a universal primer designed to hybridize to a universal sequence of the modifier target polynucleotide having an appended 5' universal sequence, adding a target-specific primer or pool of target-specific primers designed to amplify either the forward or reverse strand of the target polynucleotide. The target-specific primers include a 3 'target-specific sequence and a 5' universal sequence. The mixture is amplified with multiple cycles of PCR amplification using both the universal primers and the one or more target-specific primers. The product of this amplification will be a pool of PCR products (having 5 'universal sequences and 3' universal sequences), each of which has amplified one or the other of the strands of the original polynucleotide. The first amplification product is purified to remove all unused primers. Combining the purified first amplification product with universal primers designed to hybridize to the universal sequence of the modifier target polynucleotide having an appended 5' universal sequence, adding target-specific primers or a pool of target-specific primers, one or more of which are designed to amplify either the forward strand or the reverse strand of the target polynucleotide (the one not targeted in the first amplification reaction). If the forward strand is targeted in the first reaction, the reverse strand is targeted in the second reaction. The target-specific primers include a 3 'target-specific sequence and a 5' universal sequence. The mixture is amplified with multiple cycles of PCR amplification using both the universal primers and the one or more target-specific primers. The second amplification product is purified to remove all unused primers. 
The second amplification product is mixed with two universal primers, the 3 'of which is designed to hybridize to the universal sequences at the 3' and 5 'positions of the amplification product and has a 5' tail containing sequences necessary for next generation sequencing. The two mixtures were amplified with multiple cycles of PCR amplification using two universal primers. The product of this amplification will be a pool of PCR products (each of which has independently amplified both strands of the original polynucleotide) and will contain all the necessary sequences compatible with next generation sequencing.

Fig. 17 depicts a schematic of an illustrative embodiment. A DNA target polynucleotide is shown. Enzymatic fragmentation using nucleases, physical shearing of DNA by sonication, or the use of transposases can be replaced by single or multiplexed targeting with genome editing tools, such as Cas enzymes of CRISPR type I, type II or type III, or combinations thereof. For example, the CRISPR/Cas9 enzyme is incubated with a mixture of DNA and one or more guide RNAs. This results in association of the DNA, the CRISPR/Cas9 enzyme and the guide RNA, whereby the guide RNA directs either double-stranded or single-stranded cleavage of the DNA. The targeted fragmented DNA may then be purified and used in any subsequent downstream processes (such as the first ATO reaction).
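Guide-directed fragmentation may be illustrated by the following toy sketch. It relies on assumptions that are not part of the description above: the guide is taken to be a protospacer sequence followed in the DNA by an NGG motif, and cleavage is taken to be blunt and to occur three bases upstream of that motif, as is commonly described for Streptococcus pyogenes Cas9. Only the strand given is searched, mismatches are not tolerated, and the function names are hypothetical.

```python
from typing import List


def find_cut_sites(dna: str, guide: str) -> List[int]:
    """Return 0-based cut indices on `dna` for one guide (given strand only).

    Assumptions: exact protospacer match, an NGG motif immediately 3' of the
    protospacer, and a blunt cut 3 bases upstream of that motif.
    """
    protospacer = guide.upper().replace("U", "T")
    dna = dna.upper()
    sites = []
    start = dna.find(protospacer)
    while start != -1:
        end = start + len(protospacer)
        pam = dna[end:end + 3]
        if len(pam) == 3 and pam[1:] == "GG":
            sites.append(end - 3)
        start = dna.find(protospacer, start + 1)
    return sites


def fragment(dna: str, cut_sites: List[int]) -> List[str]:
    """Split `dna` at the given cut indices and return the fragments in order."""
    pieces = []
    previous = 0
    for site in sorted(set(cut_sites)):
        pieces.append(dna[previous:site])
        previous = site
    pieces.append(dna[previous:])
    return pieces


# Illustrative sequences only:
guide = "GAUUACAGAUUACAGAUUAC"           # 20-nt guide written as RNA
dna = "AAAA" + "GATTACAGATTACAGATTAC" + "TGG" + "CCCC"
sites = find_cut_sites(dna, guide)
fragments = fragment(dna, sites)
```

Each resulting fragment has a newly generated 3' end that could, in principle, serve as the starting material for a subsequent ATO reaction.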

Fig. 18 depicts a schematic of an illustrative embodiment. The target polynucleotide (PCR product, DNA, RNA, or any mixture thereof) may be double-stranded; only one of the two strands (the forward strand) is shown. In step (i), the ATO molecule hybridizes at one or more positions to the single-stranded target polynucleotide sequence. The 3' end of the target polynucleotide hybridizes to the 3' random sequence portion of the ATO, is trimmed (if any 3' overhang is present), and is extended using the ATO as a template. The extension generates a modified target polynucleotide that includes a random sequence that serves as a UID. After step (i), the ATO is digested or removed by affinity capture; in some embodiments, the ATO is not digested or removed. In step (ii), the target polynucleotide undergoes an end repair process followed by ligation to a double-stranded adaptor. The product of this ligation reaction is a modified target polynucleotide that contains a short 5' linker sequence and a longer universal 3' linker sequence, and it can then be used in downstream processes.
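
The ATO-templated extension of step (i) can be illustrated with a minimal string-level sketch. It is a simplification under stated assumptions, not the disclosed protocol: sequences are plain 5'-to-3' strings, the target's 3' terminal bases are assumed to pair with the 3'-most bases of the ATO, trimming of 3' overhangs is not modelled, and the function names, the example sequences, and the `min_overlap` parameter are hypothetical.

```python
# Illustrative model only: all names and sequences here are hypothetical.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string written 5'->3'."""
    return seq.translate(COMPLEMENT)[::-1]


def extend_on_ato(target, universal, random_region, min_overlap=4):
    """Extend the 3' end of `target` using an ATO as template.

    The ATO is modelled as universal + random_region (5'->3'); its 3'
    blocker is implicit because the ATO itself is never extended here.
    Assumption: the last k bases of the target pair with the last k bases
    of the ATO. Returns the extended target, or None if no overlap of at
    least `min_overlap` bases is found.
    """
    ato = universal + random_region
    longest = min(len(random_region), len(target))
    for k in range(longest, max(min_overlap, 1) - 1, -1):
        if target[-k:] == reverse_complement(ato[-k:]):
            # The polymerase copies the unpaired 5' part of the ATO:
            # first the remaining random bases (the UID), then the
            # universal sequence, both as reverse complement.
            return target + reverse_complement(ato[:-k])
    return None


universal = "AGATCGGAAG"        # hypothetical universal sequence
random_region = "ACGTTGCA"      # one concrete instance of the random 3' part
target = "GGGCCC" + reverse_complement(random_region[-5:])
extended = extend_on_ato(target, universal, random_region)
```

In this model the extended target ends with the reverse complement of the universal sequence, preceded by the copied random bases that act as the UID.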

Fig. 19 depicts a schematic of an illustrative embodiment. The target polynucleotide (PCR product, DNA, RNA, or any mixture thereof) may be double-stranded; only one of the two strands (the forward strand) is shown. This embodiment exploits the ability of the polymerase to use only DNA (and not RNA) as a primer for extension of the polynucleotide. In the first round, ATO molecules hybridize at one or more positions to the single-stranded target polynucleotide sequence. The 3' end of the target polynucleotide hybridizes to the 3' random sequence portion of the ATO, trimming is performed (if any 3' overhang is present), and only the DNA is extended, using the ATO as a template and a polymerase that cannot use RNA primers. The ATO is then digested or removed by affinity capture; in some embodiments, the ATO is not digested or removed. The 3' end of the modified target DNA polynucleotide is allowed to anneal to itself to form a small hairpin, and the 3' end is then used as a primer which, once extended, generates the first complementary strand. The hairpin-containing double-stranded DNA preferentially reforms the duplex during the second round of ATO hybridization. In the second round, the 3' end of the target RNA polynucleotide hybridizes to the 3' random sequence portion of the ATO and is extended by a polymerase using the ATO as a template. The 3' end of the modified target RNA polynucleotide is allowed to anneal to itself to form a small hairpin, and the 3' end is then used as a primer which, once extended, generates the first complementary strand. The hairpin can then be digested. The resulting DNA:DNA hybrid and RNA:DNA hybrid may then serve as templates for further downstream processing, such as selective amplification of DNA and/or RNA to generate a DNA sequencing library, an RNA sequencing library, or a DNA + RNA sequencing library.

The following paragraphs are provided as clauses and should not be viewed as claims.

According to a first aspect, the present invention provides:

1. A Removable Template Oligonucleotide (RTO) for generating a library of polynucleotides, the Removable Template Oligonucleotide (RTO) comprising:

(a) a 3' random sequence;

(b) a blocker moiety attached to the 3' end, the blocker moiety rendering the RTO inextensible;

(c) a universal sequence 5' to the random sequence; and

(d) a nucleotide sequence/modification (NSM) that is recognized by an agent,

wherein the RTO is used as a template, is not incorporated into the reaction product, and is destroyed/removed after the reaction,

wherein the NSM facilitates removal of the RTO.

2. The removable template oligonucleotide of clause 1, wherein the NSM is a uracil nucleotide, wherein the uracil nucleotide is incorporated into the RTO during oligonucleotide synthesis in place of the thymine nucleotide; wherein the agent is uracil-DNA glycosylase (UNG) which is capable of destroying/removing the RTO after the end of the reaction.

3. The removable template oligonucleotide of clause 1, wherein the NSM is a ribonucleotide, wherein the ribonucleotide is incorporated into the RTO during oligonucleotide synthesis in place of any or all of the nucleotides; wherein the agent is a ribonuclease, which is capable of destroying/removing the RTO after the reaction is completed.

4. The removable template oligonucleotide of clause 1, wherein NSM is a restriction enzyme recognition sequence located in a universal sequence; wherein the agent is a restriction enzyme capable of destroying/removing the RTO after the reaction is complete.

5. The removable template oligonucleotide of clause 1, wherein NSM is an affinity binding moiety attached at any point of the RTO; wherein the agent is a protein or an antibody capable of removing the RTO after the reaction is complete.

6. The removable template oligonucleotide of clause 5, wherein the affinity binding moiety is biotin; wherein the agent is avidin.

7. The removable template oligonucleotide of any one of clauses 1-6, wherein the RTO comprises an additional Unique Identification (UID) sequence located within the universal sequence.

8. The removable template oligonucleotide of any one of clauses 1-7, wherein the RTO comprises a portion that forms a stem-loop structure.

9. A method for generating a polynucleotide library, the method comprising:

(i) generating a modified target polynucleotide using a target polynucleotide from a sample as a primer and using a Removable Template Oligonucleotide (RTO) of any one of clauses 1-8 as a template;

(ii) removing the RTO; and

(iii) generating a first Complementary Sequence (CS) of the modified target polynucleotide using a first primer, the first primer comprising a universal sequence,

wherein the generating operation comprises extending a primer hybridized to the template by a polymerase.

10. The method of clause 9, further comprising: an operation of generating a modified first CS using the first complementary sequence as a primer and a second Removable Template Oligonucleotide (RTO) of any one of clauses 1-8 as a template.

11. The method of clause 9 or 10, further comprising: an operation of extending a second primer that hybridizes to the first CS or the modified first CS, wherein the second primer comprises a target-specific portion or a universal sequence, or both a 3' target-specific sequence and a 5' universal sequence, thereby forming a second CS.

12. The method of clause 11, wherein when the target polynucleotide in the sample is double-stranded, extending the second primer that hybridizes to the first CS comprises separating the first CS into two separate reactions, wherein the first reaction comprises a target-specific second primer that is complementary to a first strand of the target sequence, and the second reaction comprises a target-specific second primer that is complementary to a second strand of the target sequence, wherein the first and second strands of the target sequence are complementary.

13. The method of clause 11 or 12, wherein the extension operation comprises linear amplification using the first primer or the second primer.

14. The method of clause 11 or 12, wherein the extension operation comprises exponential amplification using a first primer and a second primer.

15. The method of any of clauses 9-14, wherein the first primer comprises a Sample Barcode (SBC) sequence and an additional universal sequence compatible with the NGS platform.

16. The method of any of clauses 9-15, wherein when the second primer is a target-specific primer, nested target-specific third primers are used for further amplification following linear amplification or exponential amplification using the second primer.

17. A method of accurately determining the sequence of a target polynucleotide, the method comprising:

(i) sequencing at least one of the amplified second CS of any one of clauses 9-16;

(ii) aligning at least two sequences from (i) that contain the same UID and/or aligning the same target sequence of two reactions, wherein each reaction generates sequence information of one strand or the complementary strand of the duplex target sequence; and

(iii) determining a consensus sequence and/or identical variant sequence for both reactions based on (ii), wherein the consensus sequence and/or variant sequence accurately represents the target polynucleotide sequence.

18. A kit for generating a library of polynucleotides, the kit comprising a Removable Template Oligonucleotide (RTO) of any one of clauses 1-8 and primers compatible with the NGS platform.
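
The UID-based consensus determination of clause 17 can be sketched as follows. This is an illustrative simplification rather than the disclosed analysis pipeline: reads are assumed to be already trimmed and aligned so that equal positions correspond, each read is given as a hypothetical `(uid, sequence)` pair, and the consensus is a plain per-position majority vote. The function name and the `min_family_size` parameter are assumptions made for this example.

```python
from collections import Counter, defaultdict


def consensus_by_uid(reads, min_family_size=2):
    """Group reads by UID and call a majority-vote consensus per group.

    `reads` is an iterable of (uid, sequence) pairs. Groups with fewer
    than `min_family_size` reads are skipped, mirroring the requirement
    that at least two sequences share the same UID.
    """
    families = defaultdict(list)
    for uid, seq in reads:
        families[uid].append(seq)

    consensus = {}
    for uid, seqs in families.items():
        if len(seqs) < min_family_size:
            continue
        # Compare only the positions covered by every read in the family.
        length = min(len(s) for s in seqs)
        consensus[uid] = "".join(
            Counter(s[i] for s in seqs).most_common(1)[0][0]
            for i in range(length)
        )
    return consensus


example_reads = [
    ("AAC", "ACGTA"),
    ("AAC", "ACGTA"),
    ("AAC", "ACCTA"),   # one read carries an error at position 2
    ("GGT", "TTTT"),    # singleton family, not used for consensus
]
result = consensus_by_uid(example_reads)
```

A variant supported by only one read of a UID family is outvoted, whereas a variant present in the original molecule would appear in every read of that family.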

According to a second aspect, the present invention provides:

1. An Adaptor Template Oligonucleotide (ATO) for generating a polynucleotide library, comprising:

(a) a 3' random sequence;

(b) a blocker moiety attached to the 3' end, the blocker moiety rendering the ATO inextensible;

(c) a universal sequence 5' to the random sequence; and

(d) a nucleotide sequence/modification (NSM) that renders the ATO non-interfering and non-competitive;

wherein ATO is used as a template to direct the first extension reaction and is not incorporated into the reaction product.

2. The adaptor template oligonucleotide of clause 1, wherein the NSM is an atypical nucleotide (non-dA, non-dG, non-dT, non-dC), and the atypical nucleotide is a naturally occurring nucleotide or an artificial nucleotide.

3. The adaptor template oligonucleotide of clause 2, wherein the atypical nucleotide is a universal nucleotide.

4. The adaptor template oligonucleotide of clause 2, wherein the atypical nucleotide comprises an inosine base, wherein inosine is used to replace the usual guanine position in the ATO sequence, wherein deoxyinosine preferentially directs incorporation of dC in the growing nascent strand by the DNA polymerase.

5. The adaptor template oligonucleotide of clause 1, wherein the NSM is recognizable by an agent that facilitates digestion/removal of ATO.

6. The adaptor template oligonucleotide of clause 5, wherein the NSM is a uracil nucleotide, wherein the uracil nucleotide is incorporated into the ATO in place of the thymine nucleotide during oligonucleotide synthesis; wherein the agent is uracil-DNA glycosylase (UNG) which is capable of destroying/removing ATO after the end of the first extension reaction.

7. The adaptor template oligonucleotide of clause 5, wherein the NSM is a ribonucleotide, wherein the ribonucleotide is incorporated into the ATO in place of any or all of the nucleotides during oligonucleotide synthesis; wherein the agent is a ribonuclease capable of destroying/removing ATO after the first extension reaction is completed.

8. The adaptor template oligonucleotide of clause 5, wherein NSM is a restriction enzyme recognition sequence located in a universal sequence; wherein the agent is a restriction enzyme capable of destroying/removing ATO after the first extension reaction is completed.

9. The adaptor template oligonucleotide of clause 5, wherein NSM is an affinity binding moiety attached at any point of the ATO; wherein the agent is a protein or an antibody capable of removing ATO after the first extension reaction is completed.

10. The adaptor template oligonucleotide of clause 9, wherein the affinity binding moiety is biotin; wherein the agent is avidin.

11. The adaptor template oligonucleotide of any one of clauses 1-10, wherein the ATO comprises an additional Unique Identification (UID) sequence located in the universal sequence.

12. The adaptor template oligonucleotide of any one of clauses 1-10, wherein the ATO comprises a portion that forms a stem-loop structure.

13. A method for generating a polynucleotide library, the method comprising:

(i) generating a modified target polynucleotide by using a target polynucleotide from a sample as a primer and using an Adaptor Template Oligonucleotide (ATO) as described in any of clauses 1-12 as a template; and

(ii) generating a first Complementary Sequence (CS) of the modified target polynucleotide using a first primer comprising a universal sequence and using the modified target polynucleotide as a template,

wherein the generating operation comprises extending a primer hybridized to the template by a polymerase.

14. The method of clause 13, further comprising: an operation of generating a modified first CS by using the first complementary sequence as a primer and a second Adaptor Template Oligonucleotide (ATO) as described in any of clauses 1-12 as a template, and extending the first CS on the ATO template.

15. The method of clause 13, wherein step (i) further comprises the operation of removing ATO.

16. The method of clause 13, 14 or 15, further comprising: an operation of extending a second primer that hybridizes to the first CS or the modified first CS, wherein the second primer comprises a target-specific portion or a universal sequence, or both a 3' target-specific sequence and a 5' universal sequence, thereby forming a second CS.

17. The method of clause 16, wherein when the target polynucleotide in the sample is double-stranded, extending the second primer that hybridizes to the first CS comprises separating the first CS into two separate reactions, wherein the first reaction comprises a target-specific second primer that is complementary to a first strand of the target sequence, and the second reaction comprises a target-specific second primer that is complementary to a second strand of the target sequence, wherein the first and second strands of the target sequence are complementary.

18. The method of clause 13, 14, 16 or 17, wherein the extension operation comprises linear amplification.

19. The method of clause 16, wherein the extension operation comprises exponential amplification using a first primer and a second primer.

20. The method of any of clauses 13-19, wherein the first primer comprises a Sample Barcode (SBC) sequence and an additional universal sequence compatible with the NGS platform.

21. The method of any of clauses 13-19, wherein when the second primer is a target-specific primer, nested target-specific third primers are used for further amplification following linear amplification or exponential amplification using the second primer.

22. A method of accurately determining the sequence of a target polynucleotide, the method comprising:

(i) sequencing at least one of the amplified second CS of any one of clauses 13-21;

(ii) aligning at least two sequences from (i) that contain the same UID and/or aligning the same target sequence of two reactions, wherein each reaction generates sequence information of one strand or the complementary strand of the duplex target sequence; and

(iii) determining a consensus sequence and/or identical variant sequence for both reactions based on (ii), wherein the consensus sequence and/or variant sequence accurately represents the target polynucleotide sequence.

23. A kit for generating a polynucleotide library, the kit comprising an Adaptor Template Oligonucleotide (ATO) of any one of clauses 1-12 and primers compatible with the NGS platform.
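
The uracil-containing ATO of clause 6 above can be illustrated with a short sketch. It is a toy model under stated assumptions: the ATO is a plain 5'-to-3' string, every thymine is replaced by uracil, and removal of uracil by UNG followed by cleavage at the resulting abasic sites is modelled simply as splitting the string at each U. The function names and the example universal sequence are hypothetical.

```python
import random


def make_du_ato(universal, n_random, seed=None):
    """Build a toy ATO string: universal sequence then a random 3' region.

    Every T is written as U to model uracil incorporated in place of
    thymine during synthesis. The 3' blocker is not represented.
    """
    rng = random.Random(seed)
    random_region = "".join(rng.choice("ACGT") for _ in range(n_random))
    return (universal + random_region).replace("T", "U")


def fragments_after_uracil_removal(ato):
    """Return the fragments left if the strand breaks at every uracil.

    Assumption: each uracil position becomes a strand break, which is a
    simplification of uracil excision followed by abasic-site cleavage.
    """
    return [fragment for fragment in ato.split("U") if fragment]


ato = make_du_ato("AGATCGGAAG", n_random=8, seed=1)
pieces = fragments_after_uracil_removal(ato)
```

The more uracil positions the ATO carries, the shorter the surviving fragments, which is consistent with the aim that the digested ATO no longer interferes or competes in later reactions.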

According to a third aspect, the present invention provides:

1. An Adaptor Template Oligonucleotide (ATO) for generating a polynucleotide library, comprising:

(a) a 3' random sequence;

(b) a blocker moiety attached to the 3' end, the blocker moiety rendering the ATO inextensible;

(c) a universal sequence 5' to the random sequence; and

(d) a nucleotide sequence/modification (NSM);

wherein the ATO is used as a template to direct the first reaction and all or a portion of the ATO is not incorporated into the first reaction product, wherein the NSM renders the ATO non-interfering and non-competitive in reactions subsequent to the first reaction.

2. The adaptor template oligonucleotide of clause 1, wherein the NSM is an atypical nucleotide (non-dA, non-dG, non-dT, non-dC), and the atypical nucleotide is a naturally occurring nucleotide or an artificial nucleotide.

3. The adaptor template oligonucleotide of clause 2, wherein the atypical nucleotide is a universal nucleotide.

4. The adaptor template oligonucleotide of clause 2, wherein the atypical nucleotide comprises an inosine base, wherein inosine is used to replace a guanine position in the ATO sequence, wherein deoxyinosine preferentially directs incorporation of dC in the growing nascent strand by the DNA polymerase.

5. The adaptor template oligonucleotide of clause 1, wherein the NSM is recognizable by an agent that facilitates digestion/removal of ATO.

6. The adaptor template oligonucleotide of clause 5, wherein the NSM is a uracil nucleotide, wherein the uracil nucleotide is incorporated into the ATO in place of the thymine nucleotide during oligonucleotide synthesis; wherein the agent is uracil-DNA glycosylase (UNG) which is capable of destroying/removing ATO after the end of the first extension reaction.

7. The adaptor template oligonucleotide of clause 5, wherein the NSM is a ribonucleotide, wherein the ribonucleotide is incorporated into the ATO in place of any or all of the nucleotides during oligonucleotide synthesis; wherein the agent is a ribonuclease capable of destroying/removing ATO after the first extension reaction is completed.

8. The adaptor template oligonucleotide of clause 5, wherein NSM is a restriction enzyme recognition sequence located in a universal sequence; wherein the agent is a restriction enzyme capable of destroying/removing ATO after the first extension reaction is completed.

9. The adaptor template oligonucleotide of clause 5, wherein NSM is an affinity binding moiety attached at any point of the ATO; wherein the agent is a protein or an antibody capable of removing ATO after the first extension reaction is completed.

10. The adaptor template oligonucleotide of clause 9, wherein the affinity binding moiety is biotin; wherein the agent is avidin.

11. The adaptor template oligonucleotide of any one of clauses 1-10, wherein the ATO comprises an additional Unique Identification (UID) sequence located in the universal sequence.

12. The adaptor template oligonucleotide of clause 1, wherein the ATO comprises a 5' stem portion sequence complementary to a portion of the universal sequence, which is capable of forming a stem-loop structure.

13. The adaptor template oligonucleotide of clause 12, wherein the loop portion comprises a non-replicable linkage.

14. The adaptor template oligonucleotide of clause 13, wherein the non-replicable linkage is a C3 spacer phosphoramidite, or a triethylene glycol spacer, or an 18-atom hexaethylene glycol spacer, or a 1',2'-dideoxyribose (dSpacer).

15. The adaptor template oligonucleotide of clause 12, wherein the loop portion comprises a nucleotide that can be digested.

16. The adaptor template oligonucleotide of clause 12, wherein the 5' end of the stem portion sequence comprises a phosphate group.

17. The adaptor template oligonucleotide of clause 1, wherein the ATO comprises an upper single strand complementary to a portion of the universal sequence, which is capable of forming a partially double-stranded structure.

18. The adaptor template oligonucleotide of clause 17, wherein the upper single strand comprises nucleotides that can be digested.

19. The adaptor template oligonucleotide of clause 17, wherein the 5' end of the upper single strand comprises a phosphate group.

20. The adaptor template oligonucleotide of clause 17, wherein the 3' end of the upper single strand comprises biotin.

21. A method for generating a polynucleotide library, the method comprising:

(i) generating a modified target polynucleotide by using a target polynucleotide from a sample, the target polynucleotide from the sample hybridizing to a 3' random sequence of an Adaptor Template Oligonucleotide (ATO) of any one of clauses 1-20 (first ATO) in an enzymatic first reaction that adds an adaptor sequence to the 3' end of the target polynucleotide; and

(ii) generating a first Complementary Sequence (CS) of the modified target polynucleotide using a first primer comprising the universal sequence and using the modified target polynucleotide as a template, wherein the first primer hybridizes to the template and is extended by a polymerase.

22. The method of clause 21, wherein the first reaction is a primer extension reaction in which the target polynucleotide is used as a primer and is extended on the ATO template by a DNA polymerase.

23. The method of clause 22, wherein the DNA polymerase has strand displacement activity, wherein during extension the stem-loop structure is opened or the upper ATO strand is displaced.

24. The method of clause 22, wherein the first reaction is an extension-ligation reaction in which a DNA polymerase extends the target and a DNA ligase ligates the extended target sequence to the 5' stem portion of the ATO or the upper strand of the ATO.

25. The method of clause 22, wherein the first reaction is a ligation reaction in which a DNA ligase ligates the target sequence to the 5' stem portion of the ATO or the upper strand of the ATO.

26. The method of clause 21, further comprising the act of digesting a portion of the ATO or removing a portion of the ATO by affinity capture after the first reaction.

27. The method of clause 21, further comprising the act of generating a modified first CS by using the first CS, the first CS hybridizing to a 3' random sequence of a second ATO described in any one of clauses 1-20 in an enzymatic reaction that adds an adaptor sequence to the 3' end of the first CS, wherein the second ATO comprises a different 5' universal sequence as compared to the first ATO.

28. The method of clause 21, further comprising the act of generating a modified first CS by ligating a double-stranded adaptor into the product of step (ii).

29. The method of clauses 21, 27 or 28, further comprising the act of extending a second primer that hybridizes to the first CS or the modified first CS, thereby forming a second CS, wherein the second primer comprises the target-specific portion or the universal sequence, or both the 3' target-specific sequence and the 5' universal sequence.

30. The method of clause 29, wherein when the target polynucleotide in the sample is double-stranded and the second primer is a target-specific primer, extending the second primer that hybridizes to the first CS comprises separating the first CS into two separate reactions, wherein the forward reaction comprises a target-specific second primer that is complementary to the forward strand of the target sequence and the reverse reaction comprises a target-specific second primer that is complementary to the reverse strand of the target sequence, wherein the forward and reverse strands of the target sequence are complementary.

31. The method of any of clauses 21-30, wherein generating the first CS comprises linear amplification with 1-30 cycles.

32. The method of any of clauses 21-31, wherein if the target is RNA, generating the first CS comprises a reverse transcription reaction using a reverse transcriptase.

33. The method of any one of clauses 21-32, wherein the extending operation comprises exponential amplification using a first primer and a second primer.

34. The method of any of clauses 21-33, wherein when the second primer is a target-specific primer, and following linear amplification or exponential amplification using the second primer, nested target-specific third primers are used for further amplification.

35. The method of any one of clauses 21-34, wherein the first primer or the fourth primer targeting the universal adaptor sequence comprises a Sample Barcode (SBC) sequence and an additional universal sequence compatible with the NGS platform.

36. A method of accurately determining the sequence of a target polynucleotide, the method comprising:

(i) sequencing at least one of the amplified first CS or the amplified second CS of any one of clauses 21-35;

(ii) aligning at least two sequences from (i) that contain the same UID and/or aligning the same target sequence of two reactions, wherein each reaction generates sequence information of one strand or the complementary strand of the duplex target sequence; and

(iii) determining a consensus sequence and/or identical variant sequence for both reactions based on (ii), wherein the consensus sequence and/or variant sequence accurately represents the target polynucleotide sequence.

37. A kit for generating a polynucleotide library, the kit comprising an Adaptor Template Oligonucleotide (ATO) of any one of clauses 1-36 and primers compatible with the NGS platform.
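
The stem-loop arrangement of clause 12 above, in which a 5' stem portion is complementary to a portion of the universal sequence, can be checked at the sequence level with a small sketch. It only tests perfect Watson-Crick complementarity on plain 5'-to-3' strings and ignores the loop, non-replicable linkages, and thermodynamics. The function names and example sequences are hypothetical.

```python
# Illustrative check only: names and sequences are hypothetical.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string written 5'->3'."""
    return seq.translate(COMPLEMENT)[::-1]


def find_stem_site(stem, universal):
    """Locate where a 5' stem portion can pair with the universal sequence.

    Returns the 0-based start index (5'->3') of the region of `universal`
    that is perfectly complementary to `stem`, or -1 if there is none.
    """
    return universal.find(reverse_complement(stem))


universal = "AGATCGGAAGAGC"                    # hypothetical universal sequence
stem = reverse_complement(universal[2:10])     # stem designed against bases 2..9
site = find_stem_site(stem, universal)
```

A return value of -1 indicates that the candidate stem has no perfectly complementary site in the universal sequence and so would not form the intended stem in this simplified model.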

According to a fourth aspect, the present invention provides:

1. An Adaptor Template Oligonucleotide (ATO) for extending a polynucleotide, comprising:

(a) a 3' random sequence;

(b) a blocker attached to the 3' end, the blocker rendering the ATO inextensible; and

(c) a universal sequence 5' to the random sequence;

wherein the ATO serves as a template to direct an extension reaction by a polymerase.

2. The adaptor template oligonucleotide of clause 1, the Adaptor Template Oligonucleotide (ATO) further comprising one or more moieties that render the ATO degradable or non-interfering and non-competitive in reactions following the extension reaction, wherein the moieties are recognizable by agents that facilitate digestion/removal of the ATO.

3. The adaptor-template oligonucleotide of clause 2, wherein the moiety is a uracil nucleotide, wherein the agent comprises dU-glycosylase, which is capable of digesting/removing ATO after the first extension reaction.

4. The adaptor template oligonucleotide of clause 2, wherein the moiety is a ribonucleotide, wherein the ribonucleotide is incorporated into the ATO in place of any or all of the nucleotides during oligonucleotide synthesis; wherein the agent is a ribonuclease capable of digesting/removing ATO after the first extension reaction.

5. The adaptor template oligonucleotide of clause 1, wherein the ATO is an RNA oligonucleotide.

6. The adaptor template oligonucleotide of clause 1, wherein the ATO is a DNA oligonucleotide.

7. The adaptor template oligonucleotide of clause 1, wherein the ATO is a combination of a DNA oligonucleotide and an RNA oligonucleotide.

8. The adaptor template oligonucleotide of clause 2, wherein the moiety is a restriction enzyme recognition sequence, wherein the agent is a restriction enzyme.

9. The adaptor template oligonucleotide of clause 1, wherein the universal sequence comprises a sequence capable of acting as a promoter for RNA polymerase.

10. The adaptor template oligonucleotide of clause 9, wherein the RNA polymerase is T7 RNA polymerase, T3 RNA polymerase, or SP6 RNA polymerase.

11. The adaptor template oligonucleotide of clause 9, wherein the universal sequence comprises a 5' RNA polymerase promoter sequence and a priming site, the priming site being located 3' of the RNA polymerase promoter sequence.

12. The adaptor template oligonucleotide of clause 1, wherein the universal sequence is double-stranded or partially double-stranded.

13. The adaptor template oligonucleotide of clause 12, wherein the ATO comprises a 5' stem portion sequence complementary or partially complementary to a portion or all of the universal sequence, which is capable of forming a stem-loop structure.

14. The adaptor template oligonucleotide of clause 13, wherein the ATO comprises in 5' to 3' order: a 5' stem portion, an RNA polymerase sequence, a priming site sequence, and a 3' random/degenerate sequence.

15. The adaptor template oligonucleotide of clause 13, wherein the RNA polymerase sequence is located in the loop portion.

16. The adaptor template oligonucleotide of clause 13, wherein the loop portion comprises a non-replicable linkage.

17. The adaptor template oligonucleotide of clause 13, wherein the loop portion does not comprise a non-replicable linkage.

18. The adaptor template oligonucleotide of clause 13, wherein if the 5' end of the stem portion comprises an additional sequence, there is a non-replicable linkage between the stem portion and the additional sequence.

19. The adaptor template oligonucleotide of clause 13, wherein the 5' stem portion comprises a non-replicable linkage.

20. The adaptor template oligonucleotide of any one of clauses 13-19, wherein the non-replicable linkage is a C3 spacer phosphoramidite, or a triethylene glycol spacer, or an 18-atom hexaethylene glycol spacer, or a 1',2'-dideoxyribose (dSpacer).

21. The adaptor template oligonucleotide of clause 13, wherein the double-stranded stem portion comprises non-complementary regions, wherein the non-complementary regions in the universal sequence strand comprise random sequences.

22. The adaptor template oligonucleotide of clause 13, wherein the stem portion forms two or more fragmentation segments separated by one or more non-replicable linkages.

23. The adaptor template oligonucleotide of clause 13, wherein the stem portion forms two or more fragmentation segments separated by one or more regions of mismatched base pairs.

24. The adaptor template oligonucleotide of clause 12, wherein the ATO comprises an upper single strand that is complementary or partially complementary to the universal sequence.

25. The adaptor template oligonucleotide of clause 24, wherein the 5' end of the upper single strand comprises a phosphate group.

26. The adaptor template oligonucleotide of clause 1, wherein the ATO further comprises an affinity binding moiety attached in any position of the ATO.

27. The adaptor template oligonucleotide of clause 26, wherein the affinity binding moiety is biotin.

28. The adaptor template oligonucleotide of any one of the preceding clauses, wherein the 5' end comprises a phosphate group.

29. The adaptor template oligonucleotide of any one of the preceding clauses, wherein the universal sequence comprises a random sequence as an additional Unique Identification (UID) sequence.

30. The adaptor template oligonucleotide of clause 29, wherein the ATO comprises an additional UID within the loop segment.

31. The adaptor template oligonucleotide of clause 29, wherein the ATO comprises an additional UID within the stem segment.

32. The adaptor template oligonucleotide of clause 1, wherein the ATO sequence comprises an atypical nucleotide (non-dA, non-dG, non-dT, non-dC) which is a naturally occurring nucleotide or an artificial nucleotide.

33. The adaptor template oligonucleotide of clause 32, wherein the atypical nucleotide is a universal nucleotide.

34. The adaptor template oligonucleotide of clause 32, wherein the atypical nucleotide comprises an inosine base.

35. The adaptor template oligonucleotide of clause 32, wherein the 3' random sequence comprises atypical nucleotides.

36. The adaptor template oligonucleotide of clause 32, wherein the universal sequence comprises an atypical nucleotide.

37. The adaptor template oligonucleotide of clause 1, wherein the 3' end comprises one or more modified nucleotides or linkages that render the ATO resistant to 3' exonuclease activity of the DNA polymerase.

38. The adaptor template oligonucleotide of clause 37, wherein the modified linkage is a phosphorothioate linkage.

39. The adaptor template oligonucleotide of clause 1, the Adaptor Template Oligonucleotide (ATO) further comprising a specific sequence 3' of the random sequence, wherein the specific sequence is capable of hybridizing to a specific sequence of a polynucleotide, and a portion of the 3' random/degenerate sequence serves as a template upon which the polynucleotide is extended by a polymerase.

40. A composition comprising at least one nucleic acid polymerase and an Adaptor Template Oligonucleotide (ATO) of any of the preceding clauses.

41. The composition of clause 40, wherein the nucleic acid polymerase is a DNA polymerase.

42. The composition of clause 41, wherein the DNA polymerase has strand displacement activity.

43. The composition of clause 41, wherein the DNA polymerase has 3' to 5' exonuclease activity.

44. The composition of clause 41, wherein the DNA polymerase is a template-dependent polymerase and not a template-independent polymerase.

45. A method for extending a target polynucleotide, the method comprising:

(i) generating a modified target polynucleotide by incubating the target polynucleotide with the composition of any one of the preceding clauses, wherein, in an enzymatic first ATO reaction, the 3' end of the target polynucleotide hybridizes to the 3' random sequence of the adaptor template oligonucleotide (first ATO), wherein the 3' end of the target polynucleotide is extended using the first ATO as a template, and wherein, if a 3' overhang is present, the 3' end of the target polynucleotide is trimmed before extension occurs.

46. The method of clause 45, further comprising (ii) an operation to generate a first Complementary Sequence (CS) of the modified target polynucleotide.

47. The method of clause 46, wherein the act of generating the first CS comprises extension using a first primer and using the modified target polynucleotide as a template, wherein the first primer hybridizes to the template and is extended by a polymerase.

48. The method of clause 46, wherein the act of generating the first CS comprises in vitro transcription from a double-stranded promoter region in a modified target polynucleotide using an RNA polymerase, the modified target polynucleotide being generated by extension on an ATO comprising an RNA polymerase promoter.

49. The method of clause 46, wherein the act of generating the first CS comprises thermally denaturing the modified target polynucleotide, annealing the 3' stem-loop structure of the modified target polynucleotide, and self-priming to extend to form the first CS.

50. The method of clause 46, wherein generating the first CS comprises annealing the target-specific primer to a modified target polynucleotide and extension by a polymerase.

51. The method of any one of clauses 46-50, further comprising the act of digesting the ATO either before or after generating the first Complementary Sequence (CS) of the modified target polynucleotide.

52. The method of any one of clauses 46-51, further comprising performing affinity capture either before or after generating the first Complementary Sequence (CS) of the modified target polynucleotide.

53. The method of clause 45, wherein the first ATO reaction comprises extension and ligation, wherein a DNA polymerase extends the 3' end of the target and a DNA ligase ligates the extended target sequence to the 5' stem portion of the ATO or the upper individual strand of the ATO.

54. The method of clause 46, further comprising the act of generating a modified first CS by incubating the first CS with the composition of any one of clauses 35-39, wherein, in an enzymatic second ATO reaction, the 3' end of the first CS hybridizes to the 3' random sequence of an adaptor template oligonucleotide (second ATO), wherein the 3' end of the first CS is extended using the second ATO as a template, wherein, if a 3' overhang is present, the 3' end of the first CS is trimmed before extension occurs, and wherein the second ATO comprises a 5' universal sequence different from that of the first ATO.

55. The method of clause 46, further comprising the act of generating a modified first CS by ligating an adaptor into the product of step (ii).

56. The method of any one of clauses 46-55, further comprising the act of extending a second primer that hybridizes to the first CS or the modified first CS, thereby forming a second CS, wherein the second primer comprises a target-specific portion, a universal sequence, or both a 3' target-specific sequence and a 5' universal sequence.

57. The method of clause 56, wherein when the target polynucleotide in the sample is double-stranded and the second primer is a target-specific primer, extending the second primer that hybridizes to the first CS comprises separating the first CS into two separate reactions, wherein the forward reaction comprises a target-specific second primer that is complementary to the forward strand of the target sequence and the reverse reaction comprises a target-specific second primer that is complementary to the reverse strand of the target sequence, wherein the forward and reverse strands of the target sequence are complementary.

58. The method of clause 47, wherein the act of generating the first CS comprises a linear amplification having 1-30 cycles or more.

59. The method of clause 47 or 49, wherein generating the first CS comprises a reverse transcription reaction using a reverse transcriptase if the target is RNA.

60. The method of any one of clauses 56-59, further comprising exponential amplification using the first primer and the second primer.
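The difference between the linear amplification of clause 58 and the exponential amplification of clause 60 can be sketched with idealized arithmetic. The model below is an assumption for illustration: in linear amplification only the original templates are copied each cycle, whereas in exponential amplification products also serve as templates. The `efficiency` parameter and both function names are hypothetical and do not come from the clauses.

```python
def linear_copies(templates: int, cycles: int) -> int:
    """Idealized linear amplification: each template yields one new copy per cycle."""
    return templates * cycles

def exponential_copies(templates: int, cycles: int, efficiency: float = 1.0) -> float:
    """Idealized exponential amplification: molecules grow by (1 + efficiency) per cycle."""
    return templates * (1.0 + efficiency) ** cycles

if __name__ == "__main__":
    print(linear_copies(10, 30))       # copies made from 10 templates over 30 linear cycles
    print(exponential_copies(10, 10))  # molecules after 10 ideal exponential cycles
```

Under these assumptions, linear amplification grows in proportion to the cycle number, while exponential amplification at most doubles the number of molecules per cycle.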

61. The method of any of clauses 56-60, wherein when the second primer is a target-specific primer, nested target-specific third primers are used for further amplification following linear amplification or exponential amplification using the second primer.

62. The method of any one of clauses 56-61, wherein the first primer or the third primer comprises a Sample Barcode (SBC) sequence and an additional universal sequence compatible with the NGS platform.

63. The method of clause 45, which includes fragmenting the target polynucleotide prior to the first ATO reaction.

64. The method of clause 63, wherein fragmenting the target polynucleotide comprises contacting the double-stranded polynucleotide with a transposase that binds to a transposon DNA, wherein the transposon DNA comprises a transposase binding site and a universal sequence, wherein the transposase/transposon DNA complex binds to a target location on the double-stranded polynucleotide and cleaves the double-stranded polynucleotide into a plurality of double-stranded fragments, wherein each double-stranded fragment has the transposon DNA bound to each 5' end of the double-stranded fragment.

65. The method of clause 63, wherein the fragmentation operation comprises the use of targeted fragmentation using a genome editing tool.

66. The method of clause 65, wherein the genome editing tool comprises clustered regularly interspaced short palindromic repeats (CRISPR) and CRISPR-associated protein 9 (CRISPR/Cas9).

67. The method of clause 63, which includes the act of thermally denaturing the fragmented target polynucleotides prior to the first ATO reaction.

68. The method of clause 64, wherein the transposase is Tn5 transposase.

69. The method of clause 63, wherein the act of fragmenting and labeling the target polynucleotide comprises contacting the single-stranded polynucleotide with a random primer comprising a 5' universal sequence and a 3' random sequence, and extending the random primer on the target polynucleotide to generate a 5' labeled fragmented polynucleotide.

70. The method of clause 45, wherein the target polynucleotide comprises a free 3' hydroxyl group.

71. The method of clause 45, wherein the target polynucleotide is single-stranded DNA, or single-stranded RNA, or a combination of single-stranded RNA and single-stranded DNA.

72. A method for extending a target polynucleotide, the method comprising:

mixing the target polynucleotide with a DNA polymerase and an Adaptor Template Oligonucleotide (ATO) comprising a 3' random sequence, the ATO having a 3' end that is blocked and modified to be resistant to 3' exonuclease activity;

incubating the mixture under conditions that promote annealing, trimming the 3' overhangs (if present), and extending to generate modified target polynucleotides; and

optionally degrading the ATO.
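The annealing, trimming, and extension steps of clause 72 can be pictured with a toy string model. The sketch below is a simplification under stated assumptions: all sequences are written 5' to 3', only one ATO molecule with one fixed "random" sequence is considered, hybridization requires a perfect match of `min_anneal` bases, and the trimming of a 3' overhang is modeled as removing terminal bases until the 3' end can anneal. The function names and the `min_anneal` parameter are hypothetical and are not part of the clauses.

```python
# Toy model of ATO-templated extension; all sequences are written 5'->3'.
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def revcomp(seq: str) -> str:
    """Reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]

def extend_on_ato(target: str, universal: str, random_part: str, min_anneal: int = 4):
    """Return the extended target, or None if its 3' end cannot anneal.

    The ATO is modeled as 5'-universal-random_part-3' (3' end blocked).
    """
    for trim in range(len(target) - min_anneal + 1):
        end = len(target) - trim                 # bases beyond `end` are trimmed
        anchor = target[end - min_anneal:end]    # 3'-terminal bases after trimming
        j = random_part.find(revcomp(anchor))    # where the anchor pairs on the ATO
        if j != -1:
            # Everything 5' of the paired site on the ATO serves as the template.
            template = universal + random_part[:j]
            return target[:end] + revcomp(template)
    return None

if __name__ == "__main__":
    # The two unpaired 3' bases ("CC") are trimmed before extension.
    print(extend_on_ato("GGGGCCGTAACC", "AAGGCC", "TTACGG"))
```

In this model the extended target ends with the reverse complement of the ATO universal sequence, which corresponds to the 3' universal sequence that later primers can bind.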

73. A method for generating a sequencing library, the method comprising:

mixing the target polynucleotide with a DNA polymerase and an Adaptor Template Oligonucleotide (ATO) comprising a 3' random sequence, the ATO being blocked at the 3' end and modified to be resistant to 3' exonuclease activity;

incubating the mixture under conditions that promote annealing, trimming the 3' overhangs (if present), and extending to generate modified target polynucleotides;

optionally degrading the ATO; and

amplifying the modified target polynucleotide using primers compatible with the NGS platform.

74. The method of clause 73, which includes fragmenting the target polynucleotide prior to mixing.

75. The method of clause 73, wherein the target polynucleotide is a naturally occurring fragmented polynucleotide.

76. The method of clause 75, wherein the naturally occurring fragmented polynucleotides are circulating free nucleic acids in plasma.

77. The method of clause 74, wherein fragmenting the target polynucleotide comprises contacting the double-stranded polynucleotide with a transposase that binds to a transposon DNA, wherein the transposon DNA comprises a transposase binding site and a universal sequence, wherein the transposase/transposon DNA complex binds to a target location on the double-stranded polynucleotide and cleaves the double-stranded polynucleotide into a plurality of double-stranded fragments, wherein each double-stranded fragment has the transposon DNA bound to each 5' end of the double-stranded fragment.

78. A method for generating a sequencing library, the method comprising:

adding an adaptor sequence to the single-stranded target polynucleotide defined in any one of clauses 45-69 by extending the single-stranded target polynucleotide on the ATO; and

amplifying the adaptor-tagged target polynucleotide using primers compatible with the NGS platform.

79. A kit comprising the composition of any one of clauses 1 to 44.

80. A kit for generating a library of polynucleotides, the kit comprising an Adaptor Template Oligonucleotide (ATO) as defined in any one of clauses 1 to 41, a polymerase and primers compatible with the NGS platform.

Examples
