Methods for nucleic acid assembly and high throughput sequencing

Document No.: 112582. Publication date: 2021-10-19.

Reading note: This technology, Methods for nucleic acid assembly and high throughput sequencing, was designed by M. E. Hudson, L. A. Kung, D. Schindler, S. Archer, and I. Saaem on 2013-06-24. Abstract: Methods and apparatus of some aspects of the invention relate to the synthesis of high fidelity polynucleotides. In particular, aspects of the invention relate to simultaneously performing enzymatic removal of amplification sequences and ligation of the processed oligonucleotides into nucleic acid assemblies. According to some embodiments, the method comprises the step of providing a plurality of oligonucleotides, wherein each oligonucleotide comprises (i) an internal sequence identical to a different portion of the target nucleic acid sequence, and (ii) a 5' flanking sequence flanking the 5' end of the internal sequence and a 3' flanking sequence flanking the 3' end of the internal sequence, each flanking sequence comprising a primer recognition site for a primer pair and a restriction enzyme recognition site.

1. A method of generating a target nucleic acid having a predetermined sequence, the method comprising:

a) providing a first pool of double-stranded oligonucleotides, wherein the double-stranded oligonucleotides comprise:

(i) internal sequences identical to different portions of the first target nucleic acid sequence; wherein the internal sequence comprises a region of overlap with another double-stranded oligonucleotide in the first pool of double-stranded oligonucleotides; and

(ii) 5' flanking sequences and 3' flanking sequences, each of which comprises a common primer recognition site and a type IIS restriction enzyme recognition site, positioned such that digestion by a type IIS restriction enzyme removes the flanking sequences and exposes the internal sequences; and

b) contacting the first pool of double-stranded oligonucleotides with a ligase and a type IIS restriction enzyme that recognizes the type IIS restriction enzyme recognition site, under conditions suitable to promote simultaneous restriction enzyme digestion and ligation, thereby generating a first target nucleic acid;

wherein the first target nucleic acid comprises:

(i) an internal sequence identical to a portion of the final target nucleic acid;

(ii) 5' flanking sequences and 3' flanking sequences, each of which comprises a restriction enzyme recognition site.

2. The method of claim 1, wherein the first target nucleic acid comprises both strands of a double-stranded molecule.

3. The method of claim 1, wherein the first target nucleic acid does not comprise a substrate for the type IIS restriction enzyme in step b).

4. The method of claim 1, wherein the double-stranded oligonucleotides in the first pool of double-stranded oligonucleotides are generated by amplifying a plurality of single-stranded oligonucleotides, each single-stranded oligonucleotide corresponding to one strand of double-stranded oligonucleotides in the first pool of double-stranded oligonucleotides, wherein the amplification occurs through a common primer recognition site of the single-stranded oligonucleotides.

5. The method of claim 4, further comprising mismatch binding or error removal of the amplified oligonucleotides.

6. The method of claim 5, wherein the amplified oligonucleotide is contacted with a mismatch binding agent, optionally, the mismatch binding agent is MutS.

7. The method of claim 1, further comprising amplifying the first target nucleic acid after step b).

8. The method of claim 1, further comprising confirming the sequence accuracy of the first target nucleic acid and isolating the first target nucleic acid.

9. The method of claim 1, further comprising:

c) providing a mixture comprising the first target nucleic acid and a second target nucleic acid, wherein the second target nucleic acid comprises:

(i) an internal sequence that is different from the internal sequence of the first target nucleic acid and that is identical to a portion of the final target nucleic acid sequence;

(ii) 5' flanking sequences and 3' flanking sequences, each of which comprises a restriction enzyme recognition site; and

d) contacting the mixture with a ligase and a restriction enzyme that recognizes the restriction enzyme recognition site, thereby generating the final target nucleic acid comprising the internal sequence of the first target nucleic acid and the internal sequence of the second target nucleic acid.

10. The method of claim 9, wherein the first target nucleic acid and the second target nucleic acid are subjected to step d) under conditions suitable to facilitate simultaneous digestion and ligation.

11. The method of claim 9, wherein the final target nucleic acid does not comprise a substrate for the restriction enzyme in step d).

12. The method of claim 9, further comprising amplifying the first target nucleic acid and the second target nucleic acid prior to step d).

13. The method of claim 9, wherein the second target nucleic acid is produced by a method comprising:

a) providing a second pool of double-stranded oligonucleotides, wherein the double-stranded oligonucleotides comprise:

(i) internal sequences identical to different portions of the second target nucleic acid sequence; wherein the internal sequence comprises an overlapping region with another double-stranded oligonucleotide in the second pool of double-stranded oligonucleotides; and

(ii) 5' flanking sequences and 3' flanking sequences, each of which comprises a common primer recognition site and a type IIS restriction enzyme recognition site, positioned such that digestion by a type IIS restriction enzyme removes the flanking sequences and exposes the internal sequences; and

b) contacting the second pool of double-stranded oligonucleotides with a ligase and a type IIS restriction enzyme capable of recognizing the type IIS restriction enzyme recognition site, under conditions suitable to promote simultaneous restriction enzyme digestion and ligation, thereby generating a second target nucleic acid.

14. The method of claim 13, wherein the double-stranded oligonucleotides in the second pool of double-stranded oligonucleotides are generated by amplifying a plurality of single-stranded oligonucleotides, each corresponding to one strand of a double-stranded oligonucleotide in the second pool of double-stranded oligonucleotides, wherein amplification is performed using the common primer recognition site of the single-stranded oligonucleotides, optionally further comprising mismatch binding or error removal of the amplified oligonucleotides, e.g., contacting the amplified oligonucleotides with a mismatch binding agent, e.g., MutS.

15. The method of claim 9, the method further comprising: after step d), confirming the sequence accuracy of the final target nucleic acid by sequencing and/or isolating the final target nucleic acid.
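Claims 3 and 11 require that the assembled product not be a substrate for the restriction enzyme used in the one-pot reaction. A minimal sketch of how a designed sequence could be screened for this is shown below. The recognition sequence GGTCTC (that of BsaI) is used only as an example; the claims do not name an enzyme, and the function names are hypothetical.

```python
# Sketch only: the example site and all function names are illustrative
# assumptions; the claims do not name a specific enzyme.

_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def contains_site(seq: str, site: str) -> bool:
    """True if `site` occurs on either strand of the duplex represented by `seq`."""
    seq = seq.upper()
    site = site.upper()
    return site in seq or reverse_complement(site) in seq


# A product is a candidate for one-pot digestion and ligation only if the
# recognition site is absent from both strands.
print(contains_site("ATGCATGCATGC", "GGTCTC"))
```

Checking both orientations matters because a recognition site on the bottom strand appears as its reverse complement in the top-strand sequence.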

Technical Field

Provided herein are methods and apparatus relating to the synthesis and assembly of high fidelity nucleic acids and nucleic acid libraries with predetermined sequences. More particularly, methods and apparatus for polynucleotide synthesis, error reduction, and/or high throughput sequencing validation are provided.

Background

Using recombinant DNA chemistry techniques, it is now common to replicate and amplify DNA sequences from nature and then break them down into component parts. These component parts are subsequently recombined or reassembled into new DNA sequences. However, the dependence on naturally available sequences severely limits the possibilities that researchers can explore. Although short DNA sequences can now be synthesized directly from individual nucleosides, it is generally not feasible to directly construct large fragments or modules of a polynucleotide (i.e., polynucleotide sequences longer than about 400 base pairs).

Oligonucleotide synthesis can be performed by massively parallel custom synthesis on microchips (Zhou et al. (2004) Nucleic Acids Res. 32:5409; Fodor et al. (1991) Science 251:767). However, the surface area of current microchips is very small, and thus only a small quantity of each oligonucleotide can be generated. Upon release into solution, the oligonucleotides are present at picomolar or lower concentrations per sequence, which are insufficient to drive the bimolecular activation reaction efficiently. Current methods for assembling small quantities of variant nucleic acids cannot be scaled up in a cost-effective manner to generate large quantities of a particular variant. Accordingly, there remains a need for improved methods and devices for high fidelity gene assembly and the like.

In addition, oligonucleotides on microchips are usually synthesized by chemical reactions. Spurious chemical reactions cause random base errors in the oligonucleotides. One of the key limiting factors in chemical nucleic acid synthesis is the error rate. The error rate of chemically synthesized oligonucleotides (e.g., deletions at a rate of 1 in every 100 bases, and mismatches and insertions at a rate of 1 in every 400 bases) exceeds the error rate obtained by enzymatically replicating existing nucleic acids (e.g., by PCR). Therefore, there is an urgent need for a new technology to produce high fidelity polynucleotides in high yields in a cost-effective manner.
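To give a feel for what the quoted error rates imply, the following sketch estimates the fraction of chemically synthesized oligonucleotides expected to be entirely error-free. It assumes, purely for illustration, that the two quoted rates act as independent per-base probabilities; this model and the function name are not part of the disclosure.

```python
# Sketch only: treats the quoted rates as independent per-base probabilities,
# which is a simplifying assumption and not a model given in the text.


def error_free_fraction(length: int,
                        deletion_rate: float = 1 / 100,
                        other_error_rate: float = 1 / 400) -> float:
    """Expected fraction of oligonucleotides of `length` bases with no error."""
    per_base_correct = (1 - deletion_rate) * (1 - other_error_rate)
    return per_base_correct ** length


for n in (50, 100, 200):
    print(f"{n} nt: {error_free_fraction(n):.1%} expected error-free")
```

Under these assumptions, fewer than one in three 100-base oligonucleotides would be error-free, and the fraction falls quickly with length, which is why error removal and sequence verification are emphasized.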

Summary

Aspects of the present invention relate to methods, systems, and compositions for making and/or assembling high fidelity polymers. The invention also provides apparatus and methods for performing nucleic acid assembly reactions and assembling nucleic acids. It is an object of the present invention to provide a viable, economical method for synthesizing customized polynucleotides. It is another object of the present invention to provide a method for producing a synthetic polynucleotide having a lower error rate than a synthetic polynucleotide prepared by methods known in the art.

According to some embodiments, the present invention provides a method for producing a target nucleic acid having a predetermined sequence. In some embodiments, the method comprises the step of providing a plurality of oligonucleotides, wherein each oligonucleotide comprises (i) an internal sequence that is identical to a different portion of the sequence of the target nucleic acid, and (ii) a 5' flanking sequence flanking the 5' end of the internal sequence and a 3' flanking sequence flanking the 3' end of the internal sequence, each flanking sequence comprising a primer recognition site for a primer pair and a restriction enzyme recognition site. In some embodiments, the method further comprises amplifying at least a subset of the oligonucleotides using the primer pair, thereby generating a plurality of amplified oligonucleotides. The plurality of amplified oligonucleotides may then be contacted in a single pool with a restriction enzyme and a ligase, wherein the restriction enzyme is capable of recognizing the restriction enzyme recognition site, thereby generating the target nucleic acid.
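As an illustration of the oligonucleotide layout described above, the following minimal sketch builds a flanked oligonucleotide and simulates removal of the flanks. It assumes the restriction enzyme is the type IIS enzyme BsaI (recognition sequence GGTCTC, cutting 1 nt downstream on the top strand and 5 nt downstream on the bottom strand, which leaves a 4-nt 5' overhang). The choice of enzyme, the primer-site sequences, and the function names are hypothetical and are not taken from this disclosure.

```python
# Sketch only: BsaI, the primer sites, and all function names are assumptions
# chosen for illustration; the text does not name a specific enzyme or primer.
from typing import Tuple

BSAI_SITE = "GGTCTC"      # BsaI recognition sequence as read on the top strand
BSAI_SITE_RC = "GAGACC"   # the same site when it lies on the bottom strand
FWD_PRIMER_SITE = "ACGTTGCAAGCTTGACCA"   # hypothetical primer recognition site
REV_PRIMER_SITE = "TGCTAGGATCCAGTCAGT"   # hypothetical primer recognition site


def add_flanks(internal: str) -> str:
    """Top strand: primer site, BsaI site, 1-nt spacer, internal, spacer, site, primer site."""
    five_prime_flank = FWD_PRIMER_SITE + BSAI_SITE + "A"
    three_prime_flank = "T" + BSAI_SITE_RC + REV_PRIMER_SITE
    return five_prime_flank + internal + three_prime_flank


def digest(oligo: str) -> Tuple[str, str, str]:
    """Simulate BsaI digestion of a flanked oligo (internal must not contain the site).

    Returns (left overhang, duplex span, right overhang), all written as
    top-strand sequence. Because the enzyme cuts outside its recognition
    site, both flanks are released and the outermost 4 nt of the internal
    sequence are left single-stranded at each end.
    """
    left_cut = oligo.index(BSAI_SITE) + len(BSAI_SITE) + 1   # skip the spacer
    right_cut = oligo.rindex(BSAI_SITE_RC) - 1               # skip the spacer
    span = oligo[left_cut:right_cut]
    return span[:4], span, span[-4:]


internal = "ATGCCGTTAGGCTAACGT"
left, span, right = digest(add_flanks(internal))
print(left, span, right)
```

Because the recognition sites sit in the flanks and the cuts fall inside the internal sequence, the digested fragment no longer carries a recognition site, which is what allows digestion and ligation to proceed in the same pool.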

In some embodiments, the method comprises sequence verification of the assembled target nucleic acid. In some embodiments, the amplified double-stranded oligonucleotides may comprise sequence errors or mismatches. In some embodiments, the method comprises performing error removal on the plurality of amplified oligonucleotides. In some embodiments, the plurality of amplified oligonucleotides can be contacted with a mismatch binding agent. The mismatch binding agent can selectively bind and cleave double-stranded oligonucleotides that contain a mismatch. In some embodiments, the plurality of amplified oligonucleotides can be contacted with a mismatch recognition agent, e.g., a chemical such as lysine, piperidine, and the like.

In some embodiments, the restriction enzyme and the ligase are added to the amplified oligonucleotides in a single pool under conditions suitable to promote digestion and ligation, thereby generating a mixture comprising the assembled target nucleic acid sequence and the flanking regions. In some embodiments, each flanking region comprises a common primer recognition site. In some embodiments, the restriction enzyme is a type IIS restriction enzyme. Digestion with a type IIS restriction enzyme produces a plurality of double-stranded oligonucleotides with sticky ends that can be ligated in a specific linear arrangement.
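The way sticky ends determine a single linear arrangement can be sketched as follows. Each digested fragment is represented by a name and its left and right overhangs, written as top-strand sequence, and a fragment can ligate to the next one only when its right overhang equals that fragment's left overhang. The data layout, the example overhangs, and the function name are illustrative assumptions.

```python
# Sketch only: the fragment representation and names are hypothetical.
from typing import List, Tuple

Fragment = Tuple[str, str, str]  # (name, left overhang, right overhang)


def order_fragments(fragments: List[Fragment]) -> List[str]:
    """Return fragment names in the one order permitted by their overhangs.

    Raises ValueError if a left overhang is reused, if there is no unique
    starting fragment, or if some fragments cannot be joined to the chain.
    """
    by_left = {}
    for name, left, right in fragments:
        if left in by_left:
            raise ValueError(f"left overhang {left} is used more than once")
        by_left[left] = (name, right)

    right_ends = {right for _, _, right in fragments}
    starts = [left for left in by_left if left not in right_ends]
    if len(starts) != 1:
        raise ValueError("expected exactly one fragment with an unmatched left end")

    order = []
    current = starts[0]
    while current in by_left:
        name, right = by_left.pop(current)
        order.append(name)
        current = right
    if by_left:
        raise ValueError("some fragments could not be placed in the chain")
    return order


pool = [("C", "TTAC", "GGAT"), ("A", "ATGC", "CCGA"), ("B", "CCGA", "TTAC")]
print(order_fragments(pool))
```

The order in which fragments are listed in the pool does not matter; the overhangs alone fix the arrangement, which is the property that makes a single-pool reaction workable.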

In some embodiments, the method comprises amplifying the target nucleic acid using a primer pair capable of recognizing primer recognition sites located at the 5' end and the 3' end of the target nucleic acid. In some embodiments, the method comprises sequencing the target nucleic acid to confirm its sequence accuracy, e.g., by high throughput sequencing. In some embodiments, the method comprises isolating at least one target nucleic acid having a predetermined sequence from a pool of nucleic acid sequences.
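The verification and isolation steps amount to comparing each sequenced candidate against the predetermined sequence and keeping only exact matches. A minimal sketch is given below; it assumes each candidate has already been reduced to a single consensus sequence, and the function names and example data are hypothetical.

```python
# Sketch only: assumes one consensus sequence per candidate; names are hypothetical.
from typing import Dict, List


def mismatch_positions(expected: str, observed: str) -> List[int]:
    """0-based positions at which two equal-length sequences differ."""
    if len(expected) != len(observed):
        raise ValueError("sequences differ in length (insertion or deletion)")
    pairs = zip(expected.upper(), observed.upper())
    return [i for i, (a, b) in enumerate(pairs) if a != b]


def select_verified(expected: str, candidates: Dict[str, str]) -> List[str]:
    """Names of candidates whose sequence matches the expected sequence exactly."""
    target = expected.upper()
    return [name for name, seq in candidates.items() if seq.upper() == target]


expected = "ATGGCTAGCTAG"
candidates = {
    "clone1": "ATGGCTAGCTAG",   # exact match
    "clone2": "ATGGCTTGCTAG",   # one substitution
    "clone3": "ATGGCTAGTAG",    # one deletion
}
print(select_verified(expected, candidates))
print(mismatch_positions(expected, candidates["clone2"]))
```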

According to some embodiments, the present invention provides a method of further processing the isolated nucleic acid. In some embodiments, the method comprises assembling at least two target nucleic acids. The step of assembling may be by hierarchical assembly. In some embodiments, at least two target nucleic acids may be subjected to restriction enzyme digestion and ligation, thereby forming a long target nucleic acid construct, e.g., at least about 10 kilobases or 100 kilobases in length.
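To illustrate how hierarchical assembly scales, the sketch below counts the rounds of digestion and ligation needed when a fixed number of fragments is joined per reaction. The fragment counts and the number of fragments joined per reaction are hypothetical example values, not figures from this disclosure.

```python
# Sketch only: fragment counts and fan-in are hypothetical example values.
import math


def assembly_rounds(n_fragments: int, fragments_per_reaction: int) -> int:
    """Number of hierarchical rounds needed to reduce n_fragments to one construct."""
    if n_fragments < 1:
        raise ValueError("need at least one fragment")
    if fragments_per_reaction < 2:
        raise ValueError("each reaction must join at least two fragments")
    rounds = 0
    while n_fragments > 1:
        n_fragments = math.ceil(n_fragments / fragments_per_reaction)
        rounds += 1
    return rounds


# Example: 100 verified 1-kilobase building blocks joined 10 at a time
# would reach a construct of about 100 kilobases in two rounds.
print(assembly_rounds(100, 10))
```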

According to some embodiments, the present invention provides a method for producing a target nucleic acid having a predetermined sequence in a vector. In some embodiments, a plurality of oligonucleotides are provided, each oligonucleotide comprising (i) an internal sequence identical to a different portion of the sequence of the target nucleic acid, and (ii) a 5' flanking sequence flanking the 5' end of the internal sequence and a 3' flanking sequence flanking the 3' end of the internal sequence, each flanking sequence comprising a primer recognition site for a primer pair and a restriction enzyme recognition site for a restriction enzyme. In some embodiments, at least a subset of the oligonucleotides can be amplified using the primer pair, thereby generating a plurality of amplified oligonucleotides. In some embodiments, error removal and/or correction can be performed on the plurality of amplified oligonucleotides. In some embodiments, a circular vector having a restriction enzyme recognition site for the restriction enzyme is provided. In some embodiments, the plurality of amplified oligonucleotides and the circular vector may be contacted in a single pool with the restriction enzyme and a ligase, wherein the restriction enzyme is capable of recognizing the restriction enzyme recognition site, thereby assembling the target nucleic acid in the vector. In some embodiments, the method further comprises transforming the vector into a host cell and sequencing to verify the target nucleic acid sequence.

According to some embodiments, the present invention provides a composition for assembling a target nucleic acid having a predetermined sequence. In some embodiments, the composition comprises a plurality of oligonucleotides, wherein each oligonucleotide comprises (i) an internal sequence that is identical to a different portion of the sequence of the target nucleic acid, and (ii) a 5' flanking sequence flanking the 5' end of the internal sequence and a 3' flanking sequence flanking the 3' end of the internal sequence, each flanking sequence comprising a primer recognition site for a primer pair and a restriction enzyme recognition site for a restriction endonuclease. In some embodiments, the composition further comprises a restriction enzyme and/or a ligase. In some embodiments, the composition further comprises a vector comprising a pair of enzyme recognition sites for a restriction enzyme. In some embodiments, the restriction enzyme is a type IIS restriction enzyme.

In some embodiments, the plurality of oligonucleotides are amplified and/or error corrected.

In some aspects of the invention, a method of generating a target nucleic acid having a predetermined sequence comprises providing a first mixture comprising (i) a restriction enzyme and (ii) a first pool of oligonucleotides comprising: a first oligonucleotide comprising a sequence identical to the 5' end of the target nucleic acid, a second oligonucleotide comprising a sequence identical to the 3' end of the target nucleic acid, and a set of multiple oligonucleotides comprising a sequence identical to another portion of the target nucleic acid sequence, each oligonucleotide having an overlapping sequence region corresponding to a sequence region in the next oligonucleotide, the oligonucleotides in the first pool together comprising the target nucleic acid sequence; and contacting the mixture with a ligase, thereby generating the target nucleic acid. The target nucleic acid can then be sequence verified.
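The requirement that the pooled oligonucleotides overlap and together comprise the target sequence can be sketched as a simple tiling check. The example below splits a target into segments that share a fixed overlap with the next segment and confirms that joining them at the overlaps regenerates the target. The segment length, the 4-nt overlap, the example sequence, and the function names are illustrative assumptions.

```python
# Sketch only: segment length, overlap size, and names are hypothetical.
from typing import List


def tile_with_overlaps(target: str, segment_len: int, overlap: int) -> List[str]:
    """Split `target` into segments in which each shares `overlap` nt with the next."""
    if not 0 < overlap < segment_len:
        raise ValueError("overlap must be positive and shorter than a segment")
    step = segment_len - overlap
    segments = []
    start = 0
    while True:
        segments.append(target[start:start + segment_len])
        if start + segment_len >= len(target):
            break
        start += step
    return segments


def reassemble(segments: List[str], overlap: int) -> str:
    """Join segments at their overlaps, checking that each junction matches."""
    sequence = segments[0]
    for segment in segments[1:]:
        if sequence[-overlap:] != segment[:overlap]:
            raise ValueError("adjacent segments do not share the expected overlap")
        sequence += segment[overlap:]
    return sequence


target = "ATGGCTAGCTAGGATCCGATTACGCTAGCAT"
segments = tile_with_overlaps(target, segment_len=12, overlap=4)
print(segments)
print(reassemble(segments, overlap=4) == target)
```

The first and last segments correspond to the oligonucleotides matching the 5' and 3' ends of the target, and the remaining segments to those covering its other portions.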

In some embodiments, the methods of the invention comprise providing a pool of construction oligonucleotides and may involve amplifying the oligonucleotides at different stages. The term "construction oligonucleotide" refers to a single-stranded oligonucleotide that can be used to assemble a nucleic acid molecule that is longer than the construction oligonucleotide itself. The construction oligonucleotide may be a single-stranded oligonucleotide or a double-stranded oligonucleotide. In some embodiments, the construction oligonucleotides are synthetic oligonucleotides and may be synthesized in parallel on a substrate.

In some embodiments, the method further comprises the step of providing a plurality of construction oligonucleotides prior to providing the first mixture, wherein each construction oligonucleotide comprises (i) an internal sequence that is identical to a different portion of the sequence of the target nucleic acid, and (ii) a 5' flanking sequence flanking the 5' end of the internal sequence and a 3' flanking sequence flanking the 3' end of the internal sequence, each flanking sequence comprising a primer recognition site for a primer pair and a restriction enzyme recognition site. In some embodiments, each flanking region may comprise a common primer recognition site. In some embodiments, the plurality of construction oligonucleotides may be amplified. In some embodiments, the oligonucleotides may comprise sequence errors or mismatches. In some embodiments, error removal can be performed on the plurality of amplified oligonucleotides. For example, the plurality of amplified oligonucleotides can be contacted with a mismatch binding agent that selectively binds and cleaves double-stranded oligonucleotides comprising mismatches.

In some embodiments, the restriction enzyme and the ligase may be added to the amplified oligonucleotides in a single pool under conditions suitable to promote digestion and ligation, thereby generating a mixture comprising the assembled target nucleic acid sequence and the flanking regions. In some embodiments, the restriction enzyme may be a type IIS restriction enzyme, and digestion with the type IIS restriction enzyme may generate a plurality of double-stranded oligonucleotides with sticky ends, which are ligated in a particular linear arrangement.

In some embodiments, the method further comprises amplifying the target nucleic acid using a primer pair capable of recognizing a primer recognition site located at the 5' end of the first oligonucleotide and the 3' end of the second oligonucleotide.

In some embodiments, the method further comprises sequencing the target nucleic acid to confirm its sequence accuracy, e.g., by high throughput sequencing.

In some embodiments, the method further comprises isolating at least one target nucleic acid having a predetermined sequence from the pool of nucleic acid sequences.

In some embodiments, the method further comprises processing the target nucleic acid.

In some embodiments, the method further comprises providing a second mixture comprising (i) a restriction enzyme and (ii) a second pool of oligonucleotides comprising: a first oligonucleotide comprising a sequence identical to the 5' end of the target nucleic acid, a second oligonucleotide comprising a sequence identical to the 3' end of the target nucleic acid, and a set of multiple oligonucleotides comprising a sequence identical to another portion of the target nucleic acid sequence, each oligonucleotide having an overlapping sequence region corresponding to the sequence region in the next oligonucleotide, the oligonucleotides in the second pool together comprising the second target nucleic acid. In some embodiments, the second mixture is contacted with a ligase, thereby generating a second target nucleic acid. In some embodiments, the second oligonucleotides in the first pool comprise a restriction enzyme recognition site for a restriction enzyme and the first oligonucleotides in the second pool comprise a restriction enzyme recognition site for a restriction enzyme.

In some embodiments, the method further comprises assembling at least two target nucleic acids. In some embodiments, the step of assembling is by hierarchical assembly. In some embodiments, the at least two target nucleic acids are subjected to restriction endonuclease digestion and ligation, thereby forming a long target nucleic acid construct. In some embodiments, the long target nucleic acid construct is at least about 10 kilobases in length or at least about 100 kilobases in length.

In some aspects, the invention relates to a composition for assembling a target nucleic acid having a predetermined sequence, the composition comprising a plurality of oligonucleotides comprising: a first oligonucleotide comprising a sequence identical to the 5' end of the target nucleic acid, a second oligonucleotide comprising a sequence identical to the 3' end of the target nucleic acid, and one or more oligonucleotides comprising a sequence identical to another portion of the target nucleic acid sequence, each oligonucleotide having an overlapping sequence region corresponding to a sequence region in the next oligonucleotide, the plurality of oligonucleotides together comprising the target nucleic acid; and a plurality of common sequences comprising a primer recognition site for a primer pair and a restriction enzyme recognition site. In some embodiments, the composition further comprises a restriction enzyme and/or a ligase. The restriction enzyme may be a type IIS restriction enzyme.

In some embodiments, the plurality of oligonucleotides may be amplified and/or error corrected. In some embodiments, the composition can further comprise a linearized vector comprising a 5' end compatible with the first oligonucleotide and a 3' end compatible with the second oligonucleotide.

In some embodiments, the present invention relates to a method of generating a target nucleic acid having a predetermined sequence, the method comprising:

a) providing a first mixture comprising

(i) a first pool of oligonucleotides, the first pool of oligonucleotides comprising: a first plurality of oligonucleotides comprising a sequence identical to the 5' end of the target nucleic acid, a second plurality of oligonucleotides comprising a sequence identical to the 3' end of the target nucleic acid, and a plurality of oligonucleotides comprising a sequence identical to another portion of the target nucleic acid sequence, each of the oligonucleotides having an overlapping sequence region corresponding to a sequence region in the next oligonucleotide, the oligonucleotides in the first mixture together comprising the target nucleic acid sequence;

(ii) a restriction enzyme, and

b) contacting the first mixture with a ligase, thereby generating the target nucleic acid.

In some embodiments, the method further comprises sequence verification of the target nucleic acid.

In some embodiments, the method further comprises, prior to step (a), providing a plurality of construction oligonucleotides, each comprising (i) an internal sequence identical to a different portion of the target nucleic acid sequence, and (ii) 5' and 3' flanking sequences flanking the 5' and 3' ends of the internal sequence, each flanking sequence comprising a primer recognition site for a primer pair and a restriction enzyme recognition site.

In some embodiments, the method further comprises amplifying the plurality of construction oligonucleotides.

In some embodiments, the method further comprises performing error removal on the plurality of amplified oligonucleotides.

In some embodiments, the plurality of amplified oligonucleotides are contacted with a mismatch binding agent that selectively binds and cleaves double-stranded oligonucleotides comprising mismatches.

In some embodiments, the restriction enzyme and the ligase are added to the amplified oligonucleotides in a single pool under conditions suitable to promote digestion and ligation, thereby generating a mixture comprising the assembled target nucleic acid sequence and the flanking sequences.

In some embodiments, the flanking sequences each comprise a common primer recognition site.

In some embodiments, the restriction enzyme is a type IIS restriction enzyme.

In some embodiments, digestion with the type IIS restriction enzyme produces a plurality of double-stranded oligonucleotides with sticky ends, which are ligated in a unique linear arrangement.

In some embodiments, the method further comprises amplifying the target nucleic acid using a primer pair capable of recognizing a primer recognition site located at the 5' end of the first oligonucleotide and the 3' end of the second oligonucleotide.

In some embodiments, the method further comprises sequencing the target nucleic acid to confirm its sequence accuracy.

In some embodiments, the sequencing step is performed by high throughput sequencing.

In some embodiments, the method further comprises isolating at least one target nucleic acid having a predetermined sequence from the pool of nucleic acid sequences.

In some embodiments, the method further comprises processing the target nucleic acid.

In some embodiments, the method further comprises:

c) providing a second mixture comprising

(i) a second pool of oligonucleotides, the second pool of oligonucleotides comprising: a first plurality of oligonucleotides comprising a sequence identical to the 5' end of the target nucleic acid, a second plurality of oligonucleotides comprising a sequence identical to the 3' end of the target nucleic acid, and a plurality of oligonucleotides comprising a sequence identical to another portion of the target nucleic acid sequence, each of the oligonucleotides having an overlapping sequence region corresponding to a sequence region in the next oligonucleotide, the oligonucleotides in the second mixture together comprising a second target nucleic acid;

(ii) a restriction enzyme, and

d) contacting the second mixture with a ligase, thereby generating a second target nucleic acid.

In some embodiments, the method further comprises assembling at least two target nucleic acids.

In some embodiments, the assembling is performed by hierarchical assembly.

In some embodiments, the second plurality of oligonucleotides in the first pool comprises a restriction enzyme recognition site for a restriction enzyme and the first plurality of oligonucleotides in the second pool comprises a restriction enzyme recognition site for the restriction enzyme.

In some embodiments, the at least two target nucleic acids are subjected to restriction endonuclease digestion and ligation, thereby forming a long target nucleic acid construct.

In some embodiments, the long target nucleic acid construct is at least about 10 kilobases in length.

In some embodiments, the long target nucleic acid construct is at least about 100 kilobases in length.

In some embodiments, the present invention relates to a method of generating a target nucleic acid having a predetermined sequence, the method comprising:

a) providing a plurality of oligonucleotides, each of said oligonucleotides comprising (i) an internal sequence identical to a different portion of a target nucleic acid sequence, and (ii) 5' and 3' flanking sequences flanking the 5' and 3' ends of said internal sequence, each of said flanking sequences comprising a primer recognition site for a primer pair and a restriction enzyme recognition site for a restriction endonuclease;

b) amplifying at least a subset of the oligonucleotides using the primer pair, thereby generating a plurality of amplified oligonucleotides;

c) optionally performing error removal on the plurality of amplified oligonucleotides;

d) providing a circular vector having a restriction enzyme recognition site for the restriction enzyme; and

e) contacting the plurality of amplified oligonucleotides and the circular vector in a single pool with the restriction enzyme and a ligase, the restriction enzyme being capable of recognizing the restriction enzyme recognition site, thereby assembling the target nucleic acid in the vector.

In some embodiments, the method further comprises transforming the vector into a host cell.

In some embodiments, the present invention relates to a composition for assembling a target nucleic acid having a predetermined sequence, the composition comprising:

a) a pool of oligonucleotides, the pool of oligonucleotides comprising: a first plurality of oligonucleotides comprising a sequence identical to the 5' end of the target nucleic acid, a second plurality of oligonucleotides comprising a sequence identical to the 3' end of the target nucleic acid, and one or more sets of a plurality of oligonucleotides comprising a sequence identical to another portion of the target nucleic acid sequence, each of the oligonucleotides having an overlapping sequence region corresponding to a sequence region in the next oligonucleotide, the oligonucleotides in the pool together comprising the target nucleic acid;

b) a plurality of common sequences comprising a primer recognition site for a primer pair and a restriction enzyme recognition site;

c) a restriction enzyme; and

d) a ligase.

In some embodiments, the oligonucleotide is amplified.

In some embodiments, the oligonucleotide is error corrected.

In some embodiments, the composition further comprises a linearized vector comprising a 5' end compatible with the first oligonucleotide and a 3' end compatible with the second oligonucleotide.

In some embodiments, the restriction enzyme is a type IIS restriction enzyme.

Brief description of the drawings

FIG. 1 shows an exemplary process for high fidelity nucleic acid assembly according to one embodiment of the invention.

FIG. 2 shows a non-limiting example of a method of assembly of a polynucleotide having a predetermined sequence.

FIG. 3 shows a non-limiting example of an assembly method for assembling a polynucleotide having a predetermined sequence into a vector.

FIG. 4 shows a non-limiting example of a method of hierarchical assembly of polynucleotides having a predetermined sequence.

FIG. 5 shows the nucleotide sequence of plasmid pG9-1 with restriction enzyme recognition sites (underlined).

FIG. 6 shows a non-limiting exemplary method of sequencing validation.

Detailed Description

Aspects of the invention are useful for optimizing nucleic acid assembly reactions and reducing the number of incorrectly assembled nucleic acids. The methods and compositions of the invention can make it easier to obtain a target nucleic acid having a predetermined sequence. Thus, the methods and compositions of the invention can increase the likelihood of obtaining a correctly assembled nucleic acid, thereby reducing the cost and time associated with producing a nucleic acid having a predetermined sequence.

Aspects of the invention can be used to increase the yield of one or more initial or intermediate assembly reactions. In some embodiments, the methods and compositions of the present invention can improve the efficiency of the overall assembly process by avoiding the need to separate multiple assembly steps (e.g., enzymatic digestion, purification, and ligation steps). Thus, some aspects of the invention allow for predictable and/or reliable assembly strategies and can significantly reduce the time and steps required for gene synthesis and improve the yield and/or accuracy of intermediate or final nucleic acid products.

In some aspects of the invention, the assembly process includes designing and implementing a nucleic acid assembly strategy that takes into account sequence features known or predicted to interfere with one or more assembly steps. For example, nucleic acid sequences to be synthesized may be analyzed for sequence features that interfere with one or more assembly steps, such as repeated sequences, sequences with significantly high or low GC content, and/or other sequences associated with secondary structure. It will be appreciated by those skilled in the art that certain sequence features may interfere with multiplex assembly reactions (e.g., polymerase-based extension reactions) and/or promote the formation of undesired assembly products, thereby reducing or preventing assembly of the correct nucleic acid product. In some embodiments, if multiple interfering sequence features are identified in a target nucleic acid sequence, a useful strategy may involve separating the interfering sequence features from one another during assembly. For example, a target nucleic acid can be assembled in a process involving multiple intermediate fragments or building blocks, each designed to contain only a small number of interfering sequences (e.g., 0, 1, 2, or 3). In some embodiments, each intermediate fragment or building block may contain at most one interfering sequence feature. Thus, the intermediate fragments can be efficiently assembled. In some embodiments, the design of a nucleic acid fragment or building block may exclude interfering sequence features from its 5' and/or 3' ends. Thus, interfering sequence features can be excluded from the complementary overlapping regions between adjacent starting nucleic acids designed for assembly reactions. This prevents or reduces interference with the sequence-specific hybridization reactions that are important for the correct assembly of nucleic acids. 
In some embodiments, it is sufficient to exclude interfering sequence features from the 3' and/or 5' ends of the intermediate building blocks. For example, the interfering sequence features may be located at least one nucleotide from the 3' end and/or 5' end of the building block, preferably 2, 3, 4, 5 or more nucleotides (e.g., 5-10, 10-15, 15-20 or more nucleotides) from the 3' end and/or 5' end of the building block.

Aspects of the invention may be used in conjunction with in vitro and/or in vivo nucleic acid assembly steps.

Aspects of the methods and compositions provided herein are useful for increasing the accuracy, yield, throughput, and/or cost-effectiveness of nucleic acid synthesis and assembly reactions. The terms "nucleic acid," "polynucleotide," and "oligonucleotide" are used interchangeably herein and refer to a naturally occurring or synthetic polymeric form of nucleotides. The oligonucleotides and nucleic acid molecules of the invention may be formed from naturally occurring nucleotides, for example, to form deoxyribonucleic acid (DNA) or ribonucleic acid (RNA) molecules. Alternatively, the naturally occurring oligonucleotides may comprise structural modifications that alter their properties, such as in peptide nucleic acids (PNA) or locked nucleic acids (LNA). Solid phase synthesis of oligonucleotides and nucleic acid molecules having naturally occurring or artificial bases is well known in the art. It is to be understood that these terms encompass equivalents of RNA or DNA generated from nucleotide analogs, and single- or double-stranded polynucleotides as applicable to the embodiment being described. Nucleotides useful in the invention include, for example, naturally occurring nucleotides (e.g., ribonucleotides or deoxyribonucleotides), natural or synthetic modifications of nucleotides, and artificial bases. The term "monomer" as used herein refers to a member of a small set of molecules that can be joined together to form an oligomer, a polymer, or a compound composed of two or more members. The particular order of the monomers in a polymer is referred to herein as the "sequence" of the polymer. Sets of monomers include, but are not limited to, for example, the set of common L-amino acids, the set of D-amino acids, the set of synthetic and/or natural amino acids, the set of nucleotides, and the set of pentoses and hexoses. 
Aspects of the invention described herein relate primarily to the preparation of oligonucleotides, but are readily applicable to the preparation of other polymers such as peptides or polypeptides, polysaccharides, phospholipids, heteropolymers, polyesters, polycarbonates, polyureas, polyamides, polyethyleneimines, polyarylene sulfides, polysiloxanes, polyimides, polyacetates or any other polymer.

Target nucleic acid

The term "predetermined sequence" as used herein refers to a polymer sequence that is known and has been selected prior to synthesis or assembly of the polymer. In particular, aspects of the invention described herein relate generally to the preparation of nucleic acid molecules, where the sequence of the oligonucleotide or polynucleotide is known and selected prior to synthesis or assembly of the nucleic acid molecule. In some embodiments of the technology provided herein, immobilized oligonucleotides or polynucleotides are used as a source of material. In various embodiments, the methods described herein use a plurality of oligonucleotides, each having a sequence determined by the sequence of the final polynucleotide construct to be synthesized. In one embodiment, the oligonucleotides are short nucleic acid molecules. For example, an oligonucleotide may be 10 to about 300 nucleotides, 20 to about 400 nucleotides, 30 to about 500 nucleotides, 40 to about 600 nucleotides, or more than about 600 nucleotides in length. However, shorter or longer oligonucleotides may be used. Oligonucleotides can be designed to have different lengths. In some embodiments, the sequence of a polynucleotide construct can be divided into a plurality of shorter sequences that can be synthesized in parallel and assembled into one or more desired polynucleotide constructs using the methods described herein.
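The division of a construct sequence into shorter, overlapping sequences can be sketched as follows. This is a minimal illustration only: the function names, the fixed oligonucleotide length, and the fixed overlap are assumptions made for the example and are not taken from the specification, which allows oligonucleotides of differing lengths. Only one strand is modeled.

```python
def split_into_oligos(target: str, oligo_len: int, overlap: int) -> list[str]:
    """Tile `target` into fragments of at most `oligo_len` bases.

    Each fragment shares its last `overlap` bases with the start of the
    next fragment (hypothetical fixed-size design, single strand only).
    """
    if not 0 <= overlap < oligo_len:
        raise ValueError("overlap must be non-negative and shorter than oligo_len")
    step = oligo_len - overlap
    oligos = []
    start = 0
    while True:
        oligos.append(target[start:start + oligo_len])
        if start + oligo_len >= len(target):
            break  # this fragment reaches the end of the target
        start += step
    return oligos


def reassemble(oligos: list[str], overlap: int) -> str:
    """Rebuild the target by dropping the duplicated overlap of each later fragment."""
    return oligos[0] + "".join(o[overlap:] for o in oligos[1:])
```

With these assumptions, every fragment after the first begins with the last `overlap` bases of its predecessor, so the fragments together cover the full target.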

In some embodiments, a nucleic acid of interest can have a sequence of a naturally occurring gene and/or other naturally occurring nucleic acid (e.g., a naturally occurring coding sequence, regulatory sequence, non-coding sequence, chromosomal structural sequence (e.g., telomere or centromere sequence), etc., any fragment thereof, or any combination of two or more thereof), or a non-naturally occurring sequence. In some embodiments, a target nucleic acid can be designed to have a sequence that differs from a native sequence in one or more positions. In other embodiments, the target nucleic acid can be designed to have a completely new sequence. However, it is understood that the target nucleic acid may include one or more naturally occurring sequences, non-naturally occurring sequences, or a combination thereof.

In some embodiments, provided herein are methods of assembling a library comprising nucleic acids having predetermined sequence differences. The assembly protocols provided herein can be used to generate very large libraries representing many different nucleic acid sequences of interest. For example, the methods provided herein can be used to assemble libraries having more than 10 different sequence variants. In some embodiments, the nucleic acid library is a library of sequence variants. Sequence variants can be variants of a single naturally occurring protein coding sequence. However, in some embodiments, the sequence variants may be variants of a plurality of different protein-encoding sequences. Accordingly, one aspect of the present invention relates to the design of assembly strategies for preparing accurate, high-density nucleic acid libraries. Another aspect of the technology provided herein relates to assembling an accurate high density nucleic acid library. Aspects of the technology provided herein also relate to accurate high density nucleic acid libraries. A high density nucleic acid library can comprise more than 100 different sequence variants (e.g., about 10^2 to 10^3, about 10^3 to 10^4, about 10^4 to 10^5, about 10^5 to 10^6, about 10^6 to 10^7, about 10^7 to 10^8, about 10^8 to 10^9, about 10^9 to 10^10, about 10^10 to 10^11, about 10^11 to 10^12, about 10^12 to 10^13, about 10^13 to 10^14, about 10^14 to 10^15 or more different sequences), wherein a high percentage of the different sequences are specified sequences (e.g., greater than about 50%, greater than about 60%, greater than about 70%, greater than about 75%, greater than about 80%, greater than about 85%, greater than about 90%, about 91%, about 92%, about 93%, about 94%, about 95%, about 96%, about 97%, about 98%, about 99% or more of the sequences are predetermined sequences of interest), as opposed to random sequences.

In certain embodiments, the target nucleic acid may include a functional sequence (e.g., a protein binding sequence, a regulatory sequence, a sequence encoding a functional protein, etc., or a combination thereof). However, in some embodiments, the target nucleic acid may lack a particular functional sequence (e.g., the target nucleic acid may include only a non-functional fragment or variant of a protein binding sequence, regulatory sequence, or protein coding sequence, or any other non-functional naturally occurring or synthetic sequence, or any non-functional combination thereof). Certain target nucleic acids may include functional and non-functional sequences. These and other aspects of the target nucleic acids and their uses are described in more detail herein.

In some embodiments, the target nucleic acid can be assembled in a single multiplex assembly reaction (e.g., a single oligonucleotide assembly reaction). However, the target nucleic acid may also be assembled from multiple nucleic acid fragments, wherein each nucleic acid fragment may be generated in a separate multiplex oligonucleotide assembly reaction. It is understood that in some embodiments, one or more nucleic acid fragments generated by multiplex oligonucleotide assembly may be combined with one or more nucleic acid molecules obtained from another source (e.g., restriction fragments, nucleic acid amplification products, etc.) to form the target nucleic acid. In some embodiments, the target nucleic acid assembled in a first reaction can be used as an input nucleic acid fragment for a subsequent assembly reaction to generate a larger target nucleic acid. The terms "multiplex assembly" and "multiplex oligonucleotide assembly reaction" as used herein generally refer to an assembly reaction involving a plurality of starting nucleic acids (e.g., a plurality of at least partially overlapping nucleic acids) that are assembled to produce a larger target nucleic acid.

Assembling method

FIG. 1 shows a method for assembling nucleic acids according to one embodiment of the invention. First, sequence information is obtained. The sequence information may be the sequence of the predetermined target nucleic acid to be assembled. In some embodiments, the sequence may be received in the form of an order from a customer. In some embodiments, the sequence may be received in the form of a nucleic acid sequence (e.g., DNA or RNA). In some embodiments, the sequence may be received in the form of a protein sequence. The sequence may be converted to a DNA sequence. For example, if the received sequence is an RNA sequence, each U can be replaced with T to obtain the corresponding DNA sequence. If the received sequence is a protein sequence, the protein sequence may be converted to a DNA sequence using appropriate codons for the amino acids.
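The conversion step can be sketched as follows. This is an illustrative example under stated assumptions: the function name `to_dna`, the `kind` labels, and the single codon chosen per amino acid are hypothetical and are not specified in the text. Each codon used is a valid standard-code codon for its amino acid, but a practical design would choose codons appropriate to the intended host.

```python
# Illustrative codon choices only: one arbitrary standard-code codon per
# amino acid (assumption; the specification does not prescribe a table).
CODON_FOR = {
    "A": "GCT", "R": "CGT", "N": "AAC", "D": "GAT", "C": "TGC",
    "Q": "CAG", "E": "GAA", "G": "GGT", "H": "CAC", "I": "ATC",
    "L": "CTG", "K": "AAA", "M": "ATG", "F": "TTC", "P": "CCG",
    "S": "TCT", "T": "ACC", "W": "TGG", "Y": "TAC", "V": "GTT",
    "*": "TAA",  # stop
}


def to_dna(sequence: str, kind: str) -> str:
    """Convert a received sequence ('dna', 'rna', or 'protein') to DNA."""
    seq = sequence.strip().upper()
    if kind == "dna":
        return seq
    if kind == "rna":
        return seq.replace("U", "T")  # replace each U with T
    if kind == "protein":
        return "".join(CODON_FOR[aa] for aa in seq)  # back-translate
    raise ValueError(f"unknown sequence kind: {kind!r}")
```

Because the genetic code is degenerate, back-translation from protein is not unique; the table above fixes one choice purely so the example is deterministic.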

In some embodiments, the sequence information can be analyzed to determine an assembly strategy, such as the number and sequence of fragments to be assembled (also referred to herein as building blocks, oligonucleotides, or intermediate fragments), to generate the predetermined sequence of the target nucleic acid. In some embodiments, sequence analysis may include scanning for the presence of one or more interfering sequence features that are known or predicted to interfere with oligonucleotide synthesis, amplification, or assembly. For example, an interfering sequence feature can be a sequence with low GC content (e.g., less than 30% GC, less than 20% GC, less than 10% GC, etc.) over a length of at least 10 bases (e.g., 10-20, 20-50, 50-100, or more than 100 bases), or a sequence that can form a secondary structure or a stem-loop structure. Once the sequence has passed this filter, it can be divided into smaller fragments, such as oligonucleotide building blocks.
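A scan for low-GC stretches of the kind described above can be sketched as a sliding window. The function names and the default window of 10 bases and threshold of 30% are taken as illustrative values from the ranges given in the text; the sliding-window approach itself is an assumption, not a method stated in the specification.

```python
def gc_fraction(seq: str) -> float:
    """Fraction of G and C bases in `seq` (0.0 for an empty string)."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq) if seq else 0.0


def low_gc_windows(seq: str, window: int = 10, threshold: float = 0.30) -> list[int]:
    """Start positions of every `window`-base stretch whose GC fraction
    is strictly below `threshold`."""
    return [
        i
        for i in range(len(seq) - window + 1)
        if gc_fraction(seq[i:i + window]) < threshold
    ]
```

Positions returned by `low_gc_windows` could then be used to place fragment boundaries so that a flagged stretch does not fall at a fragment end.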

In some embodiments, after the construct qualification and parsing steps, synthetic oligonucleotides for the assembly can be designed (e.g., their sequence, size, and number). Synthetic oligonucleotides can be produced using standard DNA synthesis chemistry (e.g., phosphoramidite methods). Synthetic oligonucleotides can be synthesized on a solid support (e.g., a microarray) using any suitable technique known in the art or as detailed herein. The oligonucleotides may be eluted from the microarray prior to amplification or amplified on the microarray. It will be appreciated that different oligonucleotides may be designed to have different lengths.

In some embodiments, the building block oligonucleotides of each target sequence can be amplified. For example, the oligonucleotides can be designed to have primer binding sequences at their 3' and 5' ends and can be amplified by polymerase chain reaction (PCR) using an appropriate primer pair.

It is understood that synthetic oligonucleotides may have sequence errors. Thus, oligonucleotide preparations may be selected or screened to remove error-containing molecules as described in more detail herein. An oligonucleotide containing an error can be a homoduplex with the error on both strands (i.e., incorrect complementary nucleotides, deletions, or additions on both strands). In some embodiments, sequence errors can be removed using techniques that involve denaturing and reannealing double-stranded nucleic acids. In some embodiments, single strands containing complementary errors are unlikely to reanneal with each other if nucleic acids containing each individual error are present in the preparation at a lower frequency than nucleic acids having the correct sequence at the same position. Instead, a single strand containing an error may reanneal with a complementary strand that contains no error or that contains one or more different errors. As a result, error-containing strands end up as heteroduplexes in the reannealed reaction product. Error-free strands may reanneal with error-containing strands or with other error-free strands. Reannealed error-free strands form homoduplexes in the reannealed sample. Thus, by removing heteroduplexes from the reannealed oligonucleotide preparation, the amount or frequency of error-containing nucleic acids can be reduced. Any suitable method known in the art for removing heteroduplexes may be used, including chromatography, electrophoresis, selective binding of heteroduplexes, and the like. In some embodiments, a mismatch binding protein that selectively (e.g., specifically) binds to heteroduplexes can be used. In some embodiments, the mismatch binding protein can be used on double-stranded oligonucleotides or polynucleotides in solution or immobilized on a support.
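The frequency argument above can be made concrete with an idealized calculation. The model below is an assumption introduced for illustration, not part of the specification: strands pair at random after denaturation, each distinct error i occurs at frequency f_i on both complementary strands, and every heteroduplex is removed. Under those assumptions a duplex survives filtering only if both strands are correct (probability p^2) or both carry the same complementary error (probability f_i^2).

```python
def error_fraction_after_filtering(correct: float, error_freqs: list[float]) -> float:
    """Fraction of surviving duplexes that still carry an error, assuming
    random reannealing and complete removal of heteroduplexes
    (idealized model, hypothetical function name)."""
    matched_correct = correct ** 2                    # both strands error-free
    matched_error = sum(f ** 2 for f in error_freqs)  # both strands share one error
    return matched_error / (matched_correct + matched_error)


# Example: 80% correct strands, 20 distinct errors at 1% each.
before = 0.20
after = error_fraction_after_filtering(0.80, [0.01] * 20)
print(f"error-containing fraction: {before:.3f} before, {after:.4f} after")
```

The rarer each individual error is relative to the correct sequence, the smaller the f_i^2 terms become, which is why the filtering works best when no single error dominates the preparation.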

In some embodiments, error-containing oligonucleotides are removed using a MutS filtration method (e.g., using MutS, MutS homologs, or a combination thereof). In E. coli, the MutS protein, which appears to function as a dimer, serves as a mismatch recognition factor. In eukaryotic cells, at least three MutS homolog (MSH) proteins have been identified, namely MSH2, MSH3, and MSH6, and they form heterodimers. For example, in S. cerevisiae, the MSH2-MSH6 complex (also known as MutSα) recognizes base mismatches and single-nucleotide insertion/deletion loops, while the MSH2-MSH3 complex (also known as MutSβ) recognizes insertions/deletions of up to 12-16 nucleotides, although the two complexes have substantially redundant functions. The mismatch binding protein may be obtained from recombinant or natural sources. The mismatch binding protein may be thermostable. In some embodiments, thermostable mismatch binding proteins from thermophilic microorganisms may be used. Examples of thermostable DNA mismatch binding proteins include, but are not limited to: Tth MutS (from Thermus thermophilus), Taq MutS (from Thermus aquaticus), Apy MutS (from Aquifex pyrophilus), Tma MutS (from Thermotoga maritima), homologs thereof, any other suitable MutS, or any combination of two or more thereof.

It has been demonstrated that MutS obtained from different species may have different affinities for specific mismatches or for different mismatches. In some embodiments, combinations of different MutS with different affinities for different mismatches may be used.

In some embodiments, enzyme complexes using one or more repair proteins may be used. Examples of repair proteins include, but are not limited to: MutS for mismatch recognition, MutH for introducing a nick in the target strand, and MutL for mediating the interaction between MutH and MutS, homologs thereof, or any combination thereof. In some embodiments, the mismatch binding protein complex is a MutHLS enzyme complex.

In some embodiments, a sliding clamp technique can be used to enrich for error-free double-stranded oligonucleotides. In some embodiments, MutS or a homolog thereof can interact with a DNA clamp protein. Examples of DNA clamp proteins include, but are not limited to, the bacterial sliding clamp protein DnaN, encoded by the dnaN gene, which functions as a homodimer. In some embodiments, the interaction between the MutS protein (or homolog thereof) and the clamp protein may increase the effectiveness of MutS in binding mismatches.

In some embodiments, error-containing oligonucleotides can be removed using an enzyme from the S1 family of proteins (e.g., CELI, CELII, or a homolog thereof, such as RESI, or a combination thereof). Enzymes from the S1 family recognize base mismatches, insertion loops, and deletion loops. In some embodiments, such enzymes may bind preferentially to Holliday junctions, after which the recognition site is cleaved through only one or through both DNA strands. In some embodiments, thermostable equivalents of an S1 protein may be used.

In some embodiments, error-containing oligonucleotides can be removed using small molecules, chemicals, or inorganic materials that bind to mismatched base sites. At a mismatch site, the nucleotide bases are extrahelical and susceptible to chemical modification reactions. Materials such as permanganate, hydroxylamine, lysine, and/or pentaamine ruthenium can be used in a chemical cleavage process to modify mismatched thymines and cytosines. The resulting modified DNA is then treated with piperidine to induce cleavage at the abasic sites. In some embodiments, divalent salts can be used to monitor the specificity of cleavage.

In some embodiments, in a next step, the error-corrected oligonucleotides are processed by sequentially removing the consensus sequences and then ligating the oligonucleotides into longer multi-oligonucleotide constructs.

In some aspects of the invention, the step of removing the consensus sequences by enzymatic digestion is combined with the ligation step. It will be appreciated by those skilled in the art that the method of the invention allows simultaneous removal of the consensus sequences and ligation into the target nucleic acid construct, without requiring a sequence of separate enzymatic removal, bead-based capture, and ligation steps. Furthermore, it will be appreciated by those skilled in the art that the method of the invention has many advantages over standard gene assembly methods, such as:

Increased yield: With standard, separate enzymatic removal of the consensus sequences, the reaction is stopped at a set time point, at which unreacted substrate or undigested oligonucleotides may still be present and subject to further removal. It will be appreciated by those skilled in the art that, since the ligation reaction produces a desired product that is not a substrate for the enzymatic removal, the combined removal and ligation step can irreversibly drive the reaction toward the desired product.

Cost savings: Methods according to aspects of the invention are cost effective because no purification step is required between removal of the consensus sequences and ligation. Because the purification step is eliminated, aspects of the invention also do not require biotin-labeled primers. A related saving is that non-biotinylated primers can be ordered with a shorter lead time than biotinylated primers.

Time efficiency: By eliminating the purification step between enzymatic removal of the consensus sequences and ligation, the time and number of steps required for gene synthesis are reduced.

Ability to add other sequences regardless of their size: Since some of the purification steps used to remove unwanted sequences are based on size, eliminating the purification steps removes any restriction on the size of other sequences that can be added during gene synthesis. This may include a one-step ligation into a vector, or the addition of common flanking sequences.

The method also allows restriction sites that are used in the gene synthesis process itself to be present within the gene. In previous methods, these restriction sites could not be used because cleavage at the sites would produce small DNA fragments that would be removed in a purification step. The use of these restriction sites allows for recursive (hierarchical) gene synthesis to construct longer nucleic acids.
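The combined removal and ligation described above can be sketched on a single strand. Everything specific in this example is an assumption made for illustration: BsaI is used as a representative type IIS enzyme (recognition sequence GGTCTC, cutting 1 nucleotide downstream on the top strand and 5 on the bottom strand, leaving a 4-nucleotide overhang), each oligonucleotide is assumed to carry exactly one forward site in its 5' flank and one reverse-oriented site in its 3' flank, and the internal sequences are assumed to contain no such site. Only the top strand is modeled, and a processed fragment is represented as the span from its left overhang through its right overhang.

```python
SITE = "GGTCTC"      # BsaI recognition sequence (example enzyme, an assumption)
SITE_RC = "GAGACC"   # the same site in reverse orientation, read on the top strand
OVERHANG = 4         # length of the overhang left by BsaI


def digest(oligo: str) -> str:
    """Remove both flanking sequences and return the exposed internal sequence,
    including its 4-nt overhang at each end (top-strand view)."""
    left = oligo.find(SITE)
    right = oligo.rfind(SITE_RC)
    if left == -1 or right == -1 or right <= left:
        raise ValueError("expected a forward site in the 5' flank and a reverse site in the 3' flank")
    start = left + len(SITE) + 1  # skip the site plus the 1-nt spacer
    end = right - 1               # drop the 1-nt spacer before the reverse site
    return oligo[start:end]


def ligate(fragments: list[str]) -> str:
    """Join fragments in the given order when adjacent overhangs match."""
    product = fragments[0]
    for frag in fragments[1:]:
        if product[-OVERHANG:] != frag[:OVERHANG]:
            raise ValueError("incompatible overhangs")
        product += frag[OVERHANG:]
    return product
```

In this simplified view the ligated product no longer contains either flanking site, which reflects why the product is not a substrate for further digestion and the reaction is driven toward the assembled construct.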

It will be appreciated by those skilled in the art that, following assembly of the oligonucleotides, the assembled products (e.g., the final target nucleic acid and intermediate nucleic acid fragments) may contain undesired sequences. Errors may result from sequence errors introduced during oligonucleotide synthesis or during assembly of the oligonucleotides into longer nucleic acids. In some embodiments, a nucleic acid having the correct predetermined sequence may be isolated from other nucleic acid sequences (a process also referred to herein as preparative in vitro cloning). In some embodiments, the correct sequence may be isolated by selectively separating it from incorrect sequences. For example, nucleic acids with the correct sequence can be selectively moved or transferred to a different feature of the support or to another plate. Alternatively, nucleic acids with incorrect sequences can be selectively removed from features comprising the nucleic acid of interest (see, e.g., PCT/US2007/011886, which is incorporated herein by reference in its entirety).

In some embodiments, the assembled construct or copies of the assembled construct may be isolated by clonal isolation after oligonucleotide processing and ligation. Sequence verification of the assembled construct can be performed using, for example, high throughput sequencing. In some embodiments, sequencing of a target nucleic acid sequence can be performed by sequencing individual molecules (e.g., single molecule sequencing) or by sequencing an amplified population of target nucleic acid sequences (e.g., polony sequencing). Any suitable sequencing method may be used, such as sequencing by hybridization, sequencing by ligation, or sequencing by synthesis.

Some aspects of the invention relate to a gene synthesis platform using the methods described herein. In some embodiments, the gene synthesis platform can be used in conjunction with a next generation sequencing platform (e.g., by sequencing by hybridization, by sequencing by synthesis, or by sequencing by ligation, or any other suitable sequencing method).

In some embodiments, the assembly method may include several parallel and/or sequential reaction steps in which multiple different nucleic acids or oligonucleotide sets are synthesized or immobilized, amplified, and combined to assemble (e.g., by extension or ligation as described herein) to generate longer nucleic acid products for further assembly, cloning, or other use (see PCT application PCT/US09/55267, which is incorporated herein by reference in its entirety).

Oligonucleotide synthesis

In some embodiments, the methods and devices provided herein use oligonucleotides immobilized on a surface or substrate (e.g., support-bound oligonucleotides). As used herein, the terms "support" and "substrate" are used interchangeably and refer to a porous or non-porous solvent-insoluble material on which polymers (e.g., nucleic acids) are synthesized or immobilized. As used herein, "porous" means that the material contains pores of substantially uniform diameter (e.g., in the nm range). Porous materials include paper, synthetic filters, and the like. In such porous materials, the reaction may take place in the pores. The support can have any of a number of shapes, such as pins, strips, plates, flat discs, rods, bends, cylindrical structures, particles (including beads, nanoparticles), and the like. The supports may have different widths. The support may be hydrophilic or may be made hydrophilic and comprise inorganic powders (such as silica, magnesium sulphate and alumina), natural polymeric materials (particularly cellulosic materials and cellulose derived materials, for example paper comprising fibres (such as filter paper, chromatography paper and the like)), synthetic or modified naturally occurring polymers (e.g., nitrocellulose, cellulose acetate, poly (vinyl chloride), polyacrylamide, cross-linked dextran, agarose, polyacrylate, polyethylene, polypropylene, poly (4-methylbutene), polystyrene, polymethacrylate, poly (ethylene terephthalate), nylon, poly (vinyl butyrate), polyvinylidene fluoride (PVDF) membrane, glass, controlled pore glass, magnetically controlled pore glass, ceramic, metal, etc.), either alone or in combination with other materials. In some embodiments, the oligonucleotides are synthesized on an array format. For example, single stranded oligonucleotides are synthesized in situ on a common support, wherein each oligonucleotide is synthesized on a separate or discrete feature (or spot) of the substrate. 
In a preferred embodiment, the single stranded oligonucleotide is bound to the surface of a support or feature. The term "array" as used herein refers to an arrangement of discrete features for storing, routing, amplifying and releasing oligonucleotides or complementary oligonucleotides for further reaction. In preferred embodiments, the support or array is addressable: the support comprises two or more discrete addressable features at specific predetermined locations (i.e., "addresses") on the support. Thus, each oligonucleotide molecule on the array is located at a known and defined position on the support. Each oligonucleotide sequence can be determined from its site on the support.

In some embodiments, the oligonucleotides are attached, spotted, immobilized, surface bound, supported, or synthesized on a surface or discrete features of an array. The oligonucleotides may be covalently attached to the surface or deposited on the surface. Arrays can be constructed, custom ordered, or purchased from commercial vendors such as Agilent, Affymetrix, and Nimblegen. Various construction methods are well known in the art, such as maskless array synthesizers, light-directed methods using masks, flow channel methods, spotting methods, and the like. In some embodiments, the construction and/or selection oligonucleotides may be synthesized on a solid support using a Maskless Array Synthesizer (MAS). Maskless array synthesizers are described, for example, in PCT publication No. WO 99/42813 and corresponding U.S. Pat. No. 6,375,903. Other examples are known of maskless instruments that can fabricate custom DNA microarrays in which each feature in the array has a single-stranded DNA molecule of a desired sequence. Other methods of synthesizing construction and/or selection oligonucleotides include, for example, light-directed methods using masks, flow channel methods, spotting methods, pin-based methods, and methods using multiple supports. Light-directed methods using masks (e.g., VLSIPS™ methods) for synthesizing oligonucleotides are described in U.S. Pat. Nos. 5,143,854, 5,510,270, and 5,527,681. These methods involve activating predetermined regions of a solid support and then contacting the support with a preselected monomer solution. The selected regions can be activated by irradiation with a light source through a mask, much in the manner of the photolithography techniques used in the fabrication of integrated circuits. Other regions of the support remain inactive because the irradiation is blocked by the mask and they remain chemically protected. Thus, the light pattern defines which regions of the support react with a given monomer. 
Different arrays of polymers are produced on the support by repeatedly activating different sets of predetermined regions and contacting different monomer solutions to the support. Other steps can optionally be used, such as washing the unreacted monomer solution from the support. Other applicable methods include mechanical techniques such as those described in U.S. Pat. No. 5,384,261. Other methods that can be applied to the synthetic construction and/or selection of oligonucleotides on a single support are described, for example, in U.S. Pat. No. 5,384,261. For example, reagents can be delivered to the support by (1) flowing in a channel defined over a predetermined area, or (2) by "spotting" on a predetermined area. Other methods and combinations of spotting and flow can also be used. In each case, certain activated regions on the support are mechanically separated from other regions as the monomer solution is delivered to the multiple reaction sites. Flow channel methods include, for example, microfluidic systems to control the synthesis of oligonucleotides on solid supports. For example, different polymer sequences may be synthesised on selected regions of a solid support by forming flow channels in the surface of the solid support through or within which appropriate reagents flow. Spotting methods for preparing oligonucleotides on solid supports involve delivery of reactants in relatively small amounts by direct placement in selected areas. In some steps, the entire support surface can be sprayed or covered with the solution, provided that it is more effective to do so. Aliquots of the precisely measured monomer solution can be added dropwise through a dispenser that moves from one area to another. Pin-type methods for synthesizing oligonucleotides on solid supports are described, for example, in U.S. Pat. No. 5,288,514. Pin-type methods utilize supports having multiple pins or other extensions. 
The pins are each inserted simultaneously into individual reagent containers in a tray. An array of 96 pins is commonly used with a 96-container tray, such as a 96-well microtiter dish. Each tray is filled with a particular reagent for coupling in a particular chemical reaction on an individual pin. Accordingly, the trays will often contain different reagents. Since the chemical reactions have been optimized so that each can be performed under a relatively similar set of reaction conditions, multiple chemical coupling steps can be conducted simultaneously.

In another embodiment, a plurality of oligonucleotides can be synthesized on a plurality of supports. One example is the bead-based synthesis method described in, for example, U.S. Pat. Nos. 5,770,358, 5,639,603, and 5,541,061. To synthesize molecules (e.g., oligonucleotides) on beads, a large number of beads are suspended in a suitable carrier (e.g., water) in a container. The beads are provided with optional spacer molecules having an active site to which a protecting group is optionally complexed. In each step of the synthesis, the beads are divided among multiple containers for coupling. After deprotection of the nascent oligonucleotide chains, a different monomer solution is added to each container so that the same nucleotide addition reaction occurs on all beads in a given container. The beads are then washed free of excess reagents, pooled in a single container, mixed, and redistributed into another set of containers in preparation for the next round of synthesis. It should be noted that, because a large number of beads is used at the outset, a large number of beads will likewise be randomly distributed within the container, each having a unique oligonucleotide sequence synthesized on its surface after multiple rounds of random base addition. Individual beads can be tagged with a sequence that is unique to the double-stranded oligonucleotide on them, allowing identification during use.

Pre-synthesized oligonucleotide and/or polynucleotide sequences may be attached to a support or synthesized in situ using light-directed methods, flow channel and spotting methods, inkjet methods, pin-based methods, and bead-based methods, as described in the following references: McGall et al. (1996) Proc. Natl. Acad. Sci. U.S.A. 93:13555; Synthetic DNA Arrays In Genetic Engineering, Vol. 20:111, Plenum Press (1998); Duggan et al. (1999) Nat. Genet. S21:10; Microarrays: Making Them and Using Them In Microarray Bioinformatics, Cambridge University Press, 2003; U.S. Patent Application Publication Nos. 2003/0068633 and 2002/0081582; U.S. Pat. Nos. 6,833,450, 6,830,890, 6,824,866, 6,800,439, 6,375,903 and 5,700,637; and PCT Publication Nos. WO 04/031399, WO 04/031351, WO 04/029586, WO 03/100012, WO 03/066212, WO 03/065038, WO 03/064699, WO 03/064027, WO 03/064026, WO 03/046223, WO 03/040410 and WO 02/24597; the disclosures of which are incorporated herein by reference in their entirety for all purposes. In some embodiments, pre-synthesized oligonucleotides are attached to a support or synthesized using a spotting method in which monomer solutions are deposited dropwise by a dispenser (e.g., an inkjet) that moves from region to region. In some embodiments, oligonucleotides are spotted on the support using, for example, a mechanical wave driven dispenser.

Amplification

In some embodiments, oligonucleotides can be amplified using a suitable pair of primers, one primer for each end of the oligonucleotide (e.g., one complementary to the 3' end of the oligonucleotide and one identical to the 5' end of the oligonucleotide). In some embodiments, the oligonucleotides can be designed to contain a central or internal assembly sequence (corresponding to the target sequence and designed for incorporation into the final product) flanked by a 5' amplification sequence (e.g., a 5' universal sequence or a 5' consensus amplification sequence) and a 3' amplification sequence (e.g., a 3' universal sequence or a 3' consensus amplification sequence).

In some embodiments, the synthetic oligonucleotide may comprise a central assembly sequence flanked by 5' and 3' amplification sequences. The central assembly sequence is designed to be incorporated into the assembled nucleic acid. The flanking sequences are designed for amplification and are not intended to be incorporated into the assembled nucleic acid. Multiple different assembly oligonucleotides having the same amplification sequences but different central assembly sequences can be amplified using the flanking amplification sequences as primer binding sites. In some embodiments, the flanking sequences are removed after amplification to yield an oligonucleotide containing only the assembly sequence.

The oligonucleotides can be amplified using amplification primers (e.g., 10-50 nucleotides long, 15-45 nucleotides long, about 25 nucleotides long, etc.) corresponding to the flanking amplification sequences (e.g., one primer can be complementary to the 3' amplification sequence and one primer can have the same sequence as the 5' amplification sequence). In some embodiments, a plurality of different oligonucleotides (e.g., about 5, 10, 50, 100, or more) having different central assembly sequences can have the same 5' amplification sequence and/or the same 3' amplification sequence. These oligonucleotides can all be amplified in the same reaction using the same amplification primers. The amplification sequences can then be removed from the amplified oligonucleotides using any suitable technique to produce oligonucleotides containing only the assembly sequences. In some embodiments, the amplification sequences are removed by restriction enzymes, as described in more detail herein.
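By way of illustration only, the relationship between the flanking amplification sequences and the primer pair can be sketched in Python. The sequences, the names FIVE_AMP and THREE_AMP, and the helper functions below are hypothetical and are not taken from this disclosure; the sketch merely assumes that every oligonucleotide in a pool shares the same two flanking sequences.

```python
# Hypothetical sequences for illustration; none are taken from the disclosure.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence written 5'->3'."""
    return seq.translate(COMPLEMENT)[::-1]


def build_oligo(five_amp: str, assembly: str, three_amp: str) -> str:
    """Central assembly sequence flanked by 5' and 3' amplification sequences."""
    return five_amp + assembly + three_amp


def primer_pair(five_amp: str, three_amp: str) -> tuple[str, str]:
    """Forward primer is identical to the 5' amplification sequence;
    reverse primer is complementary to the 3' amplification sequence."""
    return five_amp, reverse_complement(three_amp)


FIVE_AMP = "ACGTTGCAAGCTTGCATGCC"   # assumed shared 5' amplification sequence
THREE_AMP = "GGATCCTCTAGAGTCGACCT"  # assumed shared 3' amplification sequence
assemblies = ["ATGGCTAGCAAAGGAGAAG", "CTGTTCACTGGTGTTGTCC"]

# Different central sequences, identical flanks: one primer pair amplifies all.
pool = [build_oligo(FIVE_AMP, a, THREE_AMP) for a in assemblies]
fwd, rev = primer_pair(FIVE_AMP, THREE_AMP)
```

Because every member of the pool begins with the forward primer sequence and ends with the complement of the reverse primer, a single primer pair suffices for the whole pool in this sketch.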

In some embodiments, the oligonucleotide may be amplified while it is still attached to the support. In some embodiments, the oligonucleotides may be removed or excised from the support prior to amplification.

In some embodiments, the method comprises synthesizing a plurality of oligonucleotides or polynucleotides in a chain extension reaction using the first plurality of single-stranded oligonucleotides as templates. As previously described, oligonucleotides may be first synthesized on a discrete set of features on the surface, or may be placed on multiple features of the support. In some embodiments, the oligonucleotide is covalently attached to the support. In some embodiments, the first plurality of oligonucleotides is immobilized on a solid surface. In some embodiments, each feature of the solid surface comprises a high density of oligonucleotides having different predetermined sequences (e.g., between about 10^6 and 10^8 molecules per feature). The support can comprise at least 100, at least 1,000, at least 10^4, at least 10^5, at least 10^6, at least 10^7, or at least 10^8 features. In some embodiments, after amplification, the double-stranded oligonucleotides may be eluted in solution and/or subjected to error reduction and/or assembled to form a longer nucleic acid construct.

Error reduction

In some embodiments, each fragment is assembled and fidelity optimized to remove error-containing nucleic acids (e.g., using one or more of the post-assembly fidelity optimization techniques described herein) prior to processing to generate sticky ends. A sequence error may comprise one or more nucleotide deletions, insertions, substitutions (e.g., transversions or transitions), inversions, duplications, or any combination of two or more thereof. Oligonucleotide errors may occur during oligonucleotide synthesis. Different synthesis techniques may be prone to different error distributions and frequencies. In some embodiments, the error rate may range from 1/10 to 1/200 errors per base, depending on the synthesis scheme used. However, in some embodiments, a lower error rate may be achieved. In addition, the type of error may depend on the synthesis technique used. For example, microarray-based oligonucleotide synthesis may produce relatively more deletions than column-based synthesis techniques.
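To give a sense of what a per-base error rate implies for whole oligonucleotides, the following Python sketch estimates the fraction of error-free molecules. It rests on a simplifying assumption that is not stated in this disclosure, namely that errors occur independently and at the same rate at every position; the function name is hypothetical.

```python
def error_free_fraction(per_base_error_rate: float, length: int) -> float:
    """Expected fraction of molecules with no error, assuming independent,
    uniform per-base errors (a simplifying assumption, not from the source)."""
    if not 0.0 <= per_base_error_rate <= 1.0:
        raise ValueError("per_base_error_rate must be between 0 and 1")
    if length < 0:
        raise ValueError("length must be non-negative")
    return (1.0 - per_base_error_rate) ** length


# Compare the two ends of the quoted range for a 100-nucleotide oligonucleotide.
for rate in (1 / 10, 1 / 200):
    print(f"rate {rate:.3f}: {error_free_fraction(rate, 100):.4f} error-free")
```

Under this assumption the error-free fraction falls geometrically with length, which is one reason error reduction becomes more important as constructs get longer.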

Some aspects of the invention relate to polynucleotide assembly methods in which synthetic oligonucleotides are designed and used to assemble polynucleotides into longer polynucleotide constructs. Errors in the sequence are faithfully replicated during enzymatic amplification or chain extension reactions. As a result, the population of polynucleotides synthesized by this method contains both error-free and error-containing sequences. In some embodiments, because the synthesized oligonucleotides may contain incorrect sequences due to errors introduced during oligonucleotide synthesis, it is useful to remove polynucleotides that have incorporated one or more error-containing oligonucleotides during assembly or extension. In some embodiments, one or more assembled polynucleotides may be sequenced to determine whether they comprise the predetermined sequence. This approach allows fragments with the correct sequence to be identified. In other embodiments, other techniques may be used to remove error-containing nucleic acid fragments. Such nucleic acid fragments may be initially synthesized oligonucleotides or assembled nucleic acid polymers. It will be appreciated that an error-containing nucleic acid may be a homoduplex having the error on both strands (i.e., incorrect complementary nucleotides, deletions, or additions on both strands), because the assembly method may involve one or more rounds of polymerase extension (e.g., during assembly, or after assembly to amplify the assembled product). During polymerase extension, the input nucleic acid containing the error can serve as a template, producing a complementary strand that contains the complementary error. In certain embodiments, a preparation of double-stranded nucleic acid fragments or duplexes may be a mixture comprising nucleic acids having the correct predetermined sequence and nucleic acids comprising one or more sequence errors incorporated during assembly.
The term "duplex" refers to a nucleic acid molecule that is at least partially double-stranded. "Stable duplex" refers to a duplex that has a relatively high probability of remaining hybridized to a complementary sequence under a given set of hybridization conditions. In an exemplary embodiment, a stable duplex refers to a duplex that does not contain base pair mismatches, insertions, or deletions. "Unstable duplex" refers to a duplex that is less likely to remain hybridized to a complementary sequence under a given set of hybridization conditions (e.g., stringent melting). In an exemplary embodiment, an unstable duplex refers to a duplex containing at least one base pair mismatch, insertion, or deletion. The term "stringency" as used herein refers to the conditions of temperature, ionic strength, presence or absence of other compounds (e.g., organic solvents), etc., under which nucleic acid hybridization is carried out. Hybridization stringency can be increased by raising the temperature and/or by adjusting the solution chemistry (e.g., the amount of salt and/or formamide in the hybridization solution) during the hybridization process. Under "high stringency" conditions, nucleic acid base pairing occurs only between nucleic acid fragments that have a high frequency of complementary base sequences. Stringent conditions may be selected to be about 5 °C lower than the thermal melting point (Tm) for a given polynucleotide duplex at a defined ionic strength and pH. The length and GC content of the complementary polynucleotide strands determine the Tm of the duplex and, therefore, the hybridization conditions necessary to achieve the desired hybridization specificity. The Tm is the temperature (under conditions of defined ionic strength and pH) at which 50% of the polynucleotide sequences hybridize to a perfectly matched complementary strand. In some cases, it may be desirable to increase the stringency of hybridization conditions to about the Tm of a particular duplex.
Suitable stringency conditions are known to those skilled in the art or can be determined experimentally by those skilled in the art. See, e.g., Current Protocols in Molecular Biology, John Wiley & Sons, N.Y. (1989), 6.3.1-12.3.6; Sambrook et al., 1989, Molecular Cloning: A Laboratory Manual, Cold Spring Harbor Press, N.Y.; eds., Methods in Molecular Biology, Vol. 20; Tijssen (1993) Laboratory Techniques in Biochemistry and Molecular Biology: Hybridization with Nucleic Acid Probes, e.g., Part 1, Chapter 2, "Overview of principles of hybridization and the strategy of nucleic acid probe assays," Elsevier, New York.
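The dependence of Tm on length and GC content, and the choice of a hybridization temperature about 5 °C below Tm, can be illustrated with a rough Python sketch. The sketch uses the Wallace rule (2 °C per A/T, 4 °C per G/C), a common rule of thumb for short oligonucleotides that is not part of this disclosure and that ignores ionic strength and pH; the function names and the test sequences are hypothetical.

```python
def wallace_tm(seq: str) -> float:
    """Rough Tm estimate in degrees C by the Wallace rule: 2*(A+T) + 4*(G+C).
    A rule of thumb for short oligonucleotides only; not from the disclosure."""
    seq = seq.upper()
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    if at + gc != len(seq):
        raise ValueError("sequence must contain only A, C, G, T")
    return 2.0 * at + 4.0 * gc


def stringent_temperature(tm: float, margin: float = 5.0) -> float:
    """Hybridization temperature set a fixed margin below the estimated Tm."""
    return tm - margin


# Same length, different GC content: the GC-rich duplex has the higher estimate.
for s in ("AAAATTTTAAAATTTT", "ATGCATGCATGCATGC", "GGGGCCCCGGGGCCCC"):
    tm = wallace_tm(s)
    print(s, tm, stringent_temperature(tm))
```

For real designs a nearest-neighbor thermodynamic model with explicit salt correction would normally be preferred over this approximation.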

In some embodiments, sequence errors can be removed using techniques that involve denaturing and reannealing double-stranded nucleic acids. In some embodiments, single-stranded nucleic acids containing complementary errors are unlikely to reanneal to each other if nucleic acids containing each individual error are present in the nucleic acid preparation at a lower frequency than nucleic acids having the correct sequence at the same position. Instead, a single strand containing an error can reanneal to a complementary strand that is free of errors, or to a complementary strand that contains one or more different errors or an error at a different position. As a result, error-containing strands end up as heteroduplexes in the reannealed reaction product. Error-free nucleic acid strands can reanneal with error-containing strands or with other error-free strands. Reannealed error-free strands form homoduplexes in the reannealed sample. Thus, by removing heteroduplex molecules from the reannealed preparation of nucleic acid fragments, the amount or frequency of nucleic acids containing errors can be reduced.
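The reasoning above can be made concrete with a toy probability model, sketched here in Python. The model is an assumption introduced for illustration and is not taken from this disclosure: a fraction f of strands carries an error, the errors are spread evenly over k distinct error types, strands pair at random after denaturation, and heteroduplex removal is perfect. All function names are hypothetical.

```python
def reanneal_fractions(error_fraction: float, distinct_errors: int):
    """Return (correct homoduplex, error homoduplex, heteroduplex) fractions
    after random re-pairing, under the toy assumptions stated above."""
    if distinct_errors < 1:
        raise ValueError("distinct_errors must be at least 1")
    f = error_fraction
    correct_homoduplex = (1.0 - f) ** 2
    # An error strand only forms a homoduplex if it meets the strand carrying
    # the exactly complementary error: k types, each with probability (f/k)^2.
    error_homoduplex = f ** 2 / distinct_errors
    heteroduplex = 1.0 - correct_homoduplex - error_homoduplex
    return correct_homoduplex, error_homoduplex, heteroduplex


def residual_error_fraction(error_fraction: float, distinct_errors: int) -> float:
    """Fraction of error-containing duplexes left once heteroduplexes are removed."""
    correct, error_homo, _ = reanneal_fractions(error_fraction, distinct_errors)
    return error_homo / (correct + error_homo)


before = 0.4
after = residual_error_fraction(before, distinct_errors=10)
print(f"error-containing duplexes: {before:.3f} before, {after:.3f} after")
```

In this model the benefit grows as individual errors become rarer (larger k), which mirrors the condition stated above that each individual error be less frequent than the correct sequence.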

Heteroduplexes are thus formed by a process that can be understood as shuffling, in which nucleic acid strands from different populations hybridize to each other to form both perfectly matched and mismatch-containing duplexes. Suitable methods for removing heteroduplex molecules include chromatography, electrophoresis, and selective binding of heteroduplex molecules by agents that preferentially bind double-stranded DNA having sequence mismatches between the two strands. The term "mismatch" or "base pair mismatch" refers to a base pair combination that does not normally form in nucleic acids according to Watson and Crick base pairing rules. For example, for the bases commonly found in DNA (i.e., adenine, guanine, cytosine, and thymine), base pair mismatches are those base combinations other than the A-T and G-C pairings commonly found in DNA. As described herein, a mismatch can be written, for example, as C/C, meaning that a cytosine residue is opposite another cytosine residue rather than its correct pairing partner, guanine.

In some embodiments, oligonucleotide preparations may be selected or screened to remove error-containing molecules as described in more detail herein. In some embodiments, the mismatch binding agent described herein can be used to error correct an oligonucleotide.

In one aspect, the invention relates to a method for producing high fidelity polynucleotides on a solid support. The synthesized polynucleotide is at least about 1, 2, 3, 4, 5, 8, 10, 15, 20, 25, 30, 40, 50, 75, or 100 kilobases (kb), or 1 megabase (Mb), or longer in length. In exemplary embodiments, a composition of synthetic polynucleotides comprises at least about 1%, 2%, 3%, 4%, 5%, 6%, 7%, 8%, 9%, 10%, 15%, 20%, 25%, 50%, 60%, 70%, 80%, 90%, 95% or more error-free copies (e.g., copies having a sequence that does not deviate from a predetermined sequence). The percentage of error-free copies is based on the number of error-free copies in the composition compared to the total number of polynucleotide copies in the composition intended to have the correct (e.g., predetermined or predicted) sequence.

Some aspects of the invention relate to oligonucleotide design for high fidelity polynucleotide assembly. Aspects of the invention can be used to increase the production rate of a nucleic acid assembly process and/or reduce the number of steps or the amount of reagents used to generate a correctly assembled nucleic acid. In certain embodiments, aspects of the invention can be used to automate nucleic acid assembly to reduce the time, number of steps, amount of reagents, and other factors required for each correct nucleic acid assembly. Thus, these and other aspects of the invention can be used to reduce the cost and time of one or more nucleic acid assembly processes.

Single stranded overhangs

In some aspects of the invention, the nucleic acid fragments to be assembled are designed to have overlapping complementary sequences. In some embodiments, the nucleic acid fragments are double-stranded DNA fragments having 3' and/or 5' single-stranded overhangs. These overhangs may be sticky ends capable of annealing to complementary sticky ends on different nucleic acid fragments. According to aspects of the invention, the presence of complementary sequences (and in particular complementary cohesive ends) on two nucleic acid fragments facilitates their covalent assembly. In some embodiments, multiple nucleic acid fragments with different overlapping complementary single-stranded cohesive ends can be assembled, and their order in the assembled nucleic acid product is determined by the identity of the cohesive ends on each fragment. For example, the nucleic acid fragments can be designed such that the first nucleic acid has a first cohesive end that is complementary to the first cohesive end of the vector and a second cohesive end that is complementary to the first cohesive end of the second nucleic acid. The second cohesive end of the second nucleic acid may be complementary to the first cohesive end of the third nucleic acid. The second cohesive end of the third nucleic acid may be complementary to the first cohesive end of the fourth nucleic acid. And so on through the last nucleic acid, whose first cohesive end is complementary to the second cohesive end of the penultimate nucleic acid.

In certain embodiments, overlapping complementary regions between adjacent nucleic acid fragments are designed (or selected) to be sufficiently different to promote (e.g., thermodynamically favor) assembly of a unique alignment of nucleic acid fragments (e.g., a selected or designed alignment of fragments). It should be understood that overlapping regions of different lengths may be used. In some embodiments, longer sticky ends may be used when assembling larger numbers of nucleic acid fragments. Longer sticky ends may provide more flexibility in designing or selecting sufficiently distinct sequences to discriminate between correct sticky end annealing (e.g., between sticky ends designed to anneal to each other) and incorrect sticky end annealing (e.g., between non-complementary sticky ends).
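How cohesive ends can fix the order of fragments, and how a set of overhangs might be screened for ambiguity, is sketched below in Python. This is an illustrative sketch under stated assumptions, not the design procedure of this disclosure: each fragment is represented only by its two single-stranded overhangs written 5' to 3', two ends are treated as compatible only when they are exact reverse complements, and all names and sequences are hypothetical.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    return seq.translate(COMPLEMENT)[::-1]


def conflicting_junctions(junctions):
    """Return junction sequences that are self-complementary, duplicated, or
    the reverse complement of another junction (all can mis-anneal)."""
    seen, bad = {}, set()
    for j in junctions:
        rc = reverse_complement(j)
        if j == rc:                      # palindromic overhang can self-anneal
            bad.add(j)
        key = min(j, rc)                 # same key for a sequence and its complement
        if key in seen:
            bad.update({j, seen[key]})
        else:
            seen[key] = j
    return bad


def chain_order(fragments, start):
    """Order fragments by following each right overhang to the single unused
    fragment whose left overhang is its reverse complement."""
    order, used, current = [start], {start}, start
    while True:
        target = reverse_complement(fragments[current][1])
        nxt = [n for n, (left, _) in fragments.items()
               if n not in used and left == target]
        if not nxt:
            return order
        if len(nxt) > 1:
            raise ValueError(f"ambiguous partners for {current}: {nxt}")
        current = nxt[0]
        order.append(current)
        used.add(current)


# Fragments listed out of order; (left overhang, right overhang), both 5'->3'.
fragments = {
    "f3": ("GCTT", reverse_complement("CGCT")),
    "f1": ("GGAG", reverse_complement("AATG")),
    "f2": ("AATG", reverse_complement("GCTT")),
}
order = chain_order(fragments, "f1")
print(order, conflicting_junctions(["GGAG", "AATG", "GCTT", "CGCT"]))
```

The exact-match test is deliberately strict; a practical screen would also penalize near matches, since partially complementary ends can still anneal under permissive conditions.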

In some embodiments, complementary sticky ends between two or more pairs of different nucleic acid fragments can be designed or selected to have the same or similar sequence to facilitate assembly of products containing a relatively random arrangement (and/or number) of fragments with similar or identical sticky ends. This can be used to generate libraries of nucleic acid products with different sequence arrangements and/or different copy numbers of certain internal sequence regions.

In some embodiments, the second cohesive end of the last nucleic acid is complementary to the second cohesive end of the vector. According to aspects of the invention, the method can be used to generate vectors containing nucleic acid fragments assembled in a predetermined linear order (e.g., first, second, third, fourth, …, last). In some embodiments, each of the two terminal nucleic acid fragments (e.g., the terminal fragments at each end of the assembled product) can be designed to have a sticky end that is complementary to a sticky end on the vector (e.g., on a linearized vector). These sticky ends may be identical sticky ends that can anneal to identical complementary terminal sequences on the linearized vector. However, in some embodiments, the cohesive ends on the terminal fragments are different and the vector contains two different cohesive ends, one at each end of the linearized vector, each complementary to one of the cohesive ends of the terminal fragments. Thus, the vector may be a linearized plasmid having two cohesive ends, each of which is complementary to one end of the assembled nucleic acid fragment.

Some aspects of the invention comprise double-stranded nucleic acids having single-stranded overhangs. Any suitable technique may be used to create the overhangs. In some embodiments, double-stranded nucleic acid fragments (e.g., fragments assembled in a multiplex assembly) can be digested with appropriate restriction enzymes to generate single-stranded overhangs at the ends. In some embodiments, fragments designed to abut each other in the assembled product may be digested with the same enzyme to expose complementary overhangs. In some embodiments, overhangs may be generated using a type IIS restriction enzyme. Type IIS restriction enzymes are enzymes that bind to double-stranded nucleic acids at a site called the recognition site and make a single double-stranded cleavage outside the recognition site. The site of double-stranded cleavage, referred to as the cleavage site, is typically located 0-20 bases from the recognition site. The recognition site is typically about 4-7 bp in length. All type IIS restriction enzymes exhibit at least partially asymmetric recognition. Asymmetric recognition means that the 5'→3' recognition sequence is different on each strand of the nucleic acid. The enzyme activity also shows polarity, meaning that the cleavage site is located on only one side of the recognition site. Thus, typically each recognition site corresponds to only one double-stranded cleavage. Cleavage typically produces single-stranded overhangs of 1-5 nucleotides at the 5' or 3' end, but some enzymes produce blunt ends. Both types of cleavage are useful in the context of the present invention, although in some cases enzymes that produce single-stranded overhangs are used. To date, approximately 80 type IIS enzymes have been identified.
Examples include, but are not limited to, BstF5 I, BtsC I, BsrD I, Bts I, Alw I, Bcc I, BsmA I, Ear I, Mly I (blunt end), Ple I, Bmr I, Bsa I, BsmB I, Fau I, Mnl I, Sap I, Bbs I, BciV I, Hph I, Mbo II, BfuA I, BspCN I, BspM I, SfaN I, Hga I, BseR I, Bbv I, Eci I, Fok I, BceA I, BsmF I, BtgZ I, Bpu I, Bsg I, Mme I, Bse3D I, BseM I, AclW I, Alw26 I, Bst6 I, BstMA I, Eco57 I, Eco57M I, Gsu I, and Bcg I. Such enzymes and information about their recognition and cleavage sites are available from suppliers such as New England Biolabs, Inc. (Ipswich, Mass., U.S.A.).

In some embodiments, commercially available or engineered restriction enzymes may be used. In some embodiments, type IIS restriction enzymes may be designed and engineered to produce longer overhangs. Designing and engineering restriction enzymes to create longer single-stranded overhangs enables more oligonucleotides to be joined together to form longer nucleic acid constructs. For example, BsaI, which produces a 4-nucleotide single-stranded overhang, can be engineered to produce a single-stranded overhang of 5, 6, or more nucleotides. By increasing the single-stranded overhang length using such an engineered BsaI, the theoretical limit of 17 nucleic acids or oligonucleotides that can be joined can be raised.

In some embodiments, each of the plurality of nucleic acid fragments designed for nucleic acid assembly can have a type IIS restriction site at each end. The type IIS restriction sites can be oriented such that the cleavage site is internal relative to the recognition sequence. As a result, enzymatic digestion exposes internal sequences (e.g., overhangs within the internal sequence) and removes the recognition sequences from the ends. Thus, the same type IIS site can be used at both ends of all nucleic acid fragments prepared for assembly and/or can be used to linearize a suitable vector. However, different type IIS sites may also be used. Two fragments designed to be adjacent in the assembled product may each comprise the same overlapping end sequence and a flanking type IIS site positioned so as to expose complementary overhangs within the overlapping sequence upon restriction enzyme digestion. Thus, multiple nucleic acid fragments with different complementary overhangs can be generated. The restriction sites at each end of a nucleic acid fragment can be positioned such that digestion with a suitable type IIS enzyme removes the restriction sites and exposes a single-stranded region complementary to the single-stranded region on the nucleic acid fragment designed to be adjacent to it in the assembled nucleic acid product. In some embodiments, the ends of each of the two terminal nucleic acid fragments may be designed to have single-stranded overhangs that are complementary to the single-stranded overhangs of the linearized vector nucleic acid (e.g., following digestion with a suitable restriction enzyme). Thus, the resulting nucleic acid fragments and vector can be directly transformed into host cells. Alternatively, the nucleic acid fragments and the vector may be incubated to facilitate hybridization and annealing of complementary sequences prior to transformation into a host cell.
It will be appreciated that the vector may be prepared using any of the techniques described herein or any other suitable technique that produces a single stranded overhang complementary to one end of one of the terminal nucleic acid fragments.
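The positioning of inward-facing type IIS sites can be illustrated with a small Python sketch that digests a flanked fragment in silico. BsaI is assumed here as the enzyme (recognition sequence GGTCTC, cutting 1 nucleotide downstream on the recognition strand and 5 nucleotides downstream on the opposite strand, leaving a 4-nucleotide 5' overhang). The primer sequences, payloads, spacer base, and function names are hypothetical, and the sketch assumes exactly one forward site on the left and one reverse site on the right, with no other sites present.

```python
SITE = "GGTCTC"      # BsaI recognition sequence (assumed enzyme)
SITE_RC = "GAGACC"   # the same site on the opposite strand, i.e. facing inward


def flank(region: str, primer5: str, primer3: str, spacer: str = "A") -> str:
    """Top strand: primer + site + 1-nt spacer + region + spacer + site + primer."""
    return primer5 + SITE + spacer + region + spacer + SITE_RC + primer3


def digest_bsai(top: str):
    """Return (left_overhang, released_top_strand, right_junction).

    left_overhang is the 4-nt 5' overhang on the top strand; right_junction is
    the 4-nt stretch, in top-strand coordinates, covered only by the bottom
    strand after cleavage."""
    left = top.find(SITE)
    right = top.rfind(SITE_RC)
    if left == -1 or right == -1 or right < left + len(SITE) + 9:
        raise ValueError("expected inward-facing sites flanking the payload")
    top_start = left + len(SITE) + 1   # top strand cut 1 nt past the left site
    top_end = right - 5                # top strand cut 5 nt before the right site
    bottom_end = right - 1             # bottom strand cut 1 nt before the right site
    return (top[top_start:top_start + 4],
            top[top_start:top_end],
            top[top_end:bottom_end])


P5, P3 = "ACGTTGCAAGC", "GCTTGCAACGT"            # hypothetical primer sites
oligo_a = flank("AATG" + "GCTAGCAAAGGA" + "GCTT", P5, P3)
oligo_b = flank("GCTT" + "CTGTTCACTGGT" + "CGCT", P5, P3)

a = digest_bsai(oligo_a)
b = digest_bsai(oligo_b)
# Adjacent fragments share the junction sequence, so their overhangs anneal.
joined_top = a[1] + b[1] if a[2] == b[0] else None
print(a, b, joined_top)
```

Because both fragments carry the same 4-nucleotide sequence at their shared junction, digestion leaves complementary overhangs on the two ends, and neither the recognition sites nor the primer sites remain in the released fragments.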

Enzymatic digestion of DNA with type IIS or other site-specific restriction enzymes typically produces overhangs of 4 to 6 nucleotides. These short sticky ends are sufficient to join two nucleic acid fragments containing complementary ends. However, when multiple nucleic acid fragments are joined together, longer complementary cohesive ends are preferred to facilitate assembly and ensure specificity. For example, sticky ends may be long enough to have sufficiently different sequences to prevent or reduce mispairing between similar sticky ends. However, the length is preferably not so long that mismatches between similar cohesive sequences are stabilized. In some embodiments, a length of about 9 to about 15 bases may be used. However, any suitable length may be selected for the region used to create the sticky overhangs. The importance of specificity may depend on the number of different fragments assembled simultaneously. In addition, the appropriate length required to avoid stabilizing mismatched regions may depend on the conditions used to anneal the different sticky ends.

Ligase-based assembly

Ligase-based assembly techniques may involve one or more suitable ligases that catalyze covalent ligation of adjacent 3' and 5' nucleic acid termini (e.g., a 5' phosphate and a 3' hydroxyl of nucleic acids annealed on a complementary template nucleic acid such that the 3' terminus is immediately adjacent to the 5' terminus). Thus, a ligase can catalyze a ligation reaction between the 5' phosphate of a first nucleic acid and the 3' hydroxyl of a second nucleic acid, provided that the first and second nucleic acids anneal adjacent to each other on a template nucleic acid. The ligase can be obtained from recombinant or natural sources. The ligase may be a thermostable ligase. In some embodiments, thermostable ligases from thermophilic microorganisms may be used. Examples of thermostable DNA ligases include, but are not limited to: Tth DNA ligase (from Thermus thermophilus, available from, e.g., Eurogentec and GeneCraft), Pfu DNA ligase (a hyperthermophilic ligase from Pyrococcus furiosus), Taq ligase (from Thermus aquaticus), any other suitable thermostable ligase, or any combination thereof. In some embodiments, one or more lower temperature ligases (e.g., T4 DNA ligase) may be used. Lower temperature ligases may be useful for shorter overhangs (such as overhangs of about 3, about 4, about 5, or about 6 bases) that may be unstable at higher temperatures.

In some embodiments, the ligase may be designed and engineered to have a greater degree of specificity to minimize the formation of unwanted ligation products. In some embodiments, the ligase can be used in conjunction with a protein or can be fused to a protein capable of facilitating the interaction between the ligase and the nucleic acid molecule and/or increasing the specificity of ligation.

Non-enzymatic techniques can be used to ligate nucleic acids. For example, the 5' terminus (e.g., 5' phosphate group) and the 3' terminus (e.g., 3' hydroxyl group) of one or more nucleic acids can be covalently linked together without an enzyme (e.g., without a ligase). In some embodiments, non-enzymatic techniques may provide certain advantages over enzymatic ligation. For example, non-enzymatic techniques can be highly tolerant of non-natural nucleotide analogs in nucleic acid substrates, can be used to ligate short nucleic acid substrates, can be used to ligate RNA substrates, and/or can be less expensive and/or more suitable for certain automated (e.g., high throughput) applications.

Thus, chemical ligation can be used to form a covalent linkage between the 5' terminus of a first nucleic acid and the 3' terminus of a second nucleic acid, where the first and second termini may belong to a single nucleic acid or to separate nucleic acids. In one aspect, chemical ligation may involve at least one nucleic acid substrate having a modified terminus (e.g., a modified 5' and/or 3' terminus) that includes one or more chemically reactive moieties that aid or facilitate formation of the linkage. In some embodiments, chemical ligation occurs when one or more nucleic acid termini are brought into close proximity (e.g., when the termini are brought together by annealing between complementary nucleic acid sequences). Thus, annealing between complementary 3' or 5' overhangs (e.g., overhangs generated by restriction enzyme cleavage of double-stranded nucleic acids), or any combination of complementary nucleic acids that brings a 3' terminus immediately adjacent to a 5' terminus (e.g., the 3' and 5' termini are adjacent to each other when the nucleic acids are annealed to a complementary template nucleic acid), can facilitate template-directed chemical ligation. Examples of chemical reactions may include, but are not limited to, condensation, reduction, and/or photochemical ligation reactions. It is understood that in some embodiments, chemical ligation may be used to generate naturally occurring phosphodiester internucleotide linkages, non-naturally occurring phosphoramide pyrophosphate internucleotide linkages, and/or other non-naturally occurring internucleotide linkages.

Simultaneous enzymatic removal of consensus oligonucleotide sequences and ligation of processed oligonucleotides into longer constructs

FIG. 2 shows a method for assembling nucleic acids according to one embodiment of the invention. In some embodiments, the method comprises simultaneously enzymatically removing the consensus oligonucleotide sequences and ligating the processed oligonucleotide sequences into longer constructs. In some embodiments, oligonucleotides are amplified by PCR and error corrected as described herein. Amplified oligonucleotides (10), each consisting of consensus priming (amplification) sequences (20) and a construct-specific payload or internal sequence region (30), are processed by a suitable restriction endonuclease (40). In some embodiments, the first and last oligonucleotides contain a unique priming sequence (25) for amplification of the target construct. The restriction enzyme catalyzes cleavage of the terminal consensus regions (also referred to herein as amplification regions or primer recognition regions) common to all oligonucleotides (50), leaving an internal region (also referred to herein as the free payload) with terminal single-stranded DNA sequences (60). In some embodiments, the restriction enzyme is a type IIS restriction enzyme. These single-stranded sequences are designed to direct specific interaction of one oligonucleotide with another such that multiple oligonucleotides are arranged linearly in a predetermined order (70). Thus, the terminal single-stranded DNA sequences may direct the oligonucleotides to interact in the correct order such that the ligase (80) catalyzes ligation of the individual oligonucleotides to generate the final target nucleic acid construct (90) or an intermediate nucleic acid construct.

It will be appreciated by those skilled in the art that if an original consensus sequence is ligated back onto a payload (e.g., (50) rejoined to the complementary end of (60)), the presence of the restriction enzyme ensures that it can be cleaved again to regenerate the free ends (60). However, owing to the choice of restriction enzyme, properly ligated junctions (e.g., between 1' and 2') will not be recognized as restriction sites and will not be cleaved. The reaction should therefore naturally proceed toward the desired product (90).
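The release-and-order logic described above can be sketched in a few lines of Python. This is only an illustrative model under stated assumptions, not the procedure of the invention: BsaI (recognition sequence GGTCTC, cutting 1 nt downstream on the top strand and 5 nt downstream on the bottom strand, leaving a 4-nt overhang) is assumed as the type IIS enzyme, each oligonucleotide is assumed to carry one inward-facing site in each flank, and the function names `release_payload` and `assemble`, the flank sequences, and the payloads are hypothetical.

```python
# Minimal sketch (assumptions: BsaI as the type IIS enzyme, one inward-facing
# site per flank, unique 4-nt overhangs; all sequences below are made up).
BSAI_FWD = "GGTCTC"   # site as read on the top strand in the 5' flank
BSAI_REV = "GAGACC"   # reverse complement, as read on the top strand in the 3' flank
OVERHANG = 4          # BsaI leaves a 4-nt single-stranded end


def release_payload(oligo):
    """Return the payload span, including the 4-nt sticky region at each end."""
    s = oligo.upper()
    f = s.find(BSAI_FWD)
    r = s.rfind(BSAI_REV)
    if f == -1 or r == -1:
        raise ValueError("both flanking BsaI sites are required")
    start = f + len(BSAI_FWD) + 1   # top-strand cut: 1 nt past the forward site
    end = r - 1                     # bottom-strand cut: 1 nt before the reverse site
    if end - start < 2 * OVERHANG:
        raise ValueError("payload too short to carry two overhangs")
    return s[start:end]


def assemble(payloads):
    """Order payloads by matching sticky ends and merge them into one sequence."""
    by_left = {p[:OVERHANG]: p for p in payloads}
    if len(by_left) != len(payloads):
        raise ValueError("left overhangs must be unique")
    rights = {p[-OVERHANG:] for p in payloads}
    starts = [p for p in payloads if p[:OVERHANG] not in rights]
    if len(starts) != 1:
        raise ValueError("expected exactly one first fragment")
    product, used = starts[0], 1
    while product[-OVERHANG:] in by_left:
        nxt = by_left[product[-OVERHANG:]]
        product += nxt[OVERHANG:]   # the shared overhang is counted once
        used += 1
        if used > len(payloads):
            raise ValueError("overhangs form a cycle")
    if used != len(payloads):
        raise ValueError("some fragments could not be placed")
    return product


# Hypothetical flanks: primer site + BsaI site + 1-nt spacer.
FLANK5 = "ACGTACGT" + BSAI_FWD + "A"
FLANK3 = "T" + BSAI_REV + "TGCATGCA"
P1, P2, P3 = "AATGCCGTTAGCA", "AGCAAGGCTTACC", "TACCTGATTGCTT"
oligos = [FLANK5 + p + FLANK3 for p in (P3, P1, P2)]   # deliberately out of order
construct = assemble([release_payload(o) for o in oligos])
print(construct)
```

Because the recognition sites sit in the flanks and are removed with them, the joined product in this model contains no BsaI site at its junctions, which mirrors the reason the text gives for the reaction proceeding toward the ligated construct.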

In some embodiments, a variation of the method allows the restriction sites used for removal of the consensus sequences to also be part of the gene to be synthesized. Removing this constraint allows the gene synthesis method to be applied recursively (hierarchically) to construct longer nucleic acid sequences (as shown in FIG. 4). In previous methods, in which removal and ligation were performed in separate steps, such a design was not permitted because a purification step, based in part on size selection, was required between the removal and ligation steps. In such methods, cleaved fragments of the desired target sequence may be lost during purification, resulting in failure to construct the desired target sequence. In some embodiments, by performing the removal and ligation steps of the present invention simultaneously, such cleaved sequences can be cut and re-ligated continuously, so that some amount of the full-length target sequence is present. In some embodiments, the amount of the desired sequence depends on tuning the relative activities of the restriction enzyme and the ligase.

As shown in FIG. 4, gene synthesis fragments (390) and (391) can be assembled from oligonucleotide sets (310) and (311). The oligonucleotide sets can be designed to have matching restriction endonuclease sites (340), so that gene synthesis fragments (390) and (391) can be ligated (after amplification) using the same simultaneous digestion and ligation step. In some embodiments, the second round may instead be designed so that the restriction sites (340) are cut by a second restriction enzyme. However, this may not be desirable because of the complexity of using multiple enzymes in the process. Furthermore, using two restriction enzymes without simultaneous digestion and ligation would prohibit the target sequence from containing the recognition site of either enzyme, which would further constrain the genes that can be synthesized.

Still referring to FIG. 4, nucleic acid fragment (390) may be amplified using primers (325) and nucleic acid fragment (391) may be amplified using primers (326). The nucleic acid fragments are then mixed together and processed in a manner similar to the previous synthesis step to generate the combined nucleic acid fragment (392), with the restriction sites (340) functioning in a manner similar to the sites (350) in the previous round. The combined target sequence (392) can be amplified using the 5' primer from (325) and the 3' primer from (326).

In some embodiments, a hierarchical assembly strategy may be used in accordance with the methods disclosed herein. It will be appreciated by those skilled in the art that the method scales to many nucleic acid fragments, so that the number of nucleic acid fragments in subsequent rounds can be similar to the number in the first round. The hierarchical assembly method can be geometric, building very large target sequences in few rounds. For example, a 1000 base pair (1 kbp) target sequence can be constructed from one of pools (310) or (311). A second round using 10 nucleic acid fragments similar to (390) or (391) will form a target nucleic acid sequence of 10 kbp. A third round using ten 10 kbp nucleic acid sequences would result in a target nucleic acid sequence of 100 kbp, derived from 100 original source pools.
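The geometric growth in this example can be checked with a short calculation. The sketch below is illustrative only: it assumes every round joins the same number of fragments (10 in the example above) and ignores the few bases shared at each junction; the function names `hierarchical_plan` and `rounds_needed` are hypothetical.

```python
# Minimal sketch of the round-by-round arithmetic, assuming a constant
# number of fragments joined per round and no length lost at junctions.
def hierarchical_plan(first_round_bp, fan_in, rounds):
    """Return (round, construct length in bp, source pools consumed) per round."""
    if first_round_bp <= 0 or fan_in < 2 or rounds < 1:
        raise ValueError("need positive length, fan_in >= 2 and rounds >= 1")
    plan = []
    for r in range(1, rounds + 1):
        factor = fan_in ** (r - 1)          # fragments multiply each round
        plan.append((r, first_round_bp * factor, factor))
    return plan


def rounds_needed(target_bp, first_round_bp, fan_in):
    """Smallest number of rounds whose construct reaches target_bp."""
    if first_round_bp <= 0 or fan_in < 2:
        raise ValueError("need positive length and fan_in >= 2")
    rounds, size = 1, first_round_bp
    while size < target_bp:
        size *= fan_in
        rounds += 1
    return rounds


for rnd, length_bp, pools in hierarchical_plan(1000, 10, 3):
    print(f"round {rnd}: {length_bp} bp from {pools} source pool(s)")
```

With a 1 kbp first round and ten fragments per round, the third entry is 100 kbp built from 100 source pools, matching the figures given in the text.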

In some embodiments, multiple assembly reactions may be performed in separate wells. The assembled constructs from the assembly reaction can then be mixed to form longer nucleic acid sequences. In some embodiments, restriction enzymes can be used for hierarchical assembly to form cohesive ends, which can be joined together in a desired order. Oligonucleotides can be designed and synthetically constructed to contain recognition and cleavage sites for one or more restriction enzymes to facilitate ligation of a particular sequence at a site. In some embodiments, one or more type IIS endonuclease recognition sites can be incorporated at the end of the construction oligonucleotide to allow cleavage by a type IIS restriction endonuclease. The order of ligation can be determined by hybridization of complementary cohesive ends.

In some embodiments, the first pool of oligonucleotides comprises a 3' end oligonucleotide designed to have an additional restriction enzyme recognition site at its 3' end, and the second pool of oligonucleotides comprises a 5' end oligonucleotide designed to have an additional restriction enzyme recognition site at its 5' end. In some embodiments, the restriction enzymes are the same. After oligonucleotide assembly in each pool, both sub-assembly constructs can be treated with restriction enzyme and ligase according to the methods described herein.

It will be appreciated by those skilled in the art that the assembly space available for synthesis can be increased significantly (geometrically) by aspects of the present invention. Previously, generating a construct with twice the amount of sequence (2n) required doubling the number of oligonucleotides in a single assembly. For example, to generate construct (390), the number of oligonucleotides (310), and therefore the number of compatible single-stranded ends (360), would need to be doubled. Using the method shown in FIG. 4, the junctions used for (310) and (311) need only be compatible with junction (340), so that only one additional junction is used in assembling a nucleic acid of twice the size. Thus, even if oligonucleotides (310) and (311) have interfering or incompatible ends, they can still be joined by the methods described herein (digestion at (340) and ligation) to prepare the target nucleic acid (392), whereas simply mixing the oligonucleotide pools (310) and (311) together could not achieve this assembly.

FIG. 3 shows a variation in which oligonucleotide processing and assembly into the target construct are carried out simultaneously with insertion into a plasmid. The details of plasmid pG9-1 (SEQ ID NO: 1) are shown in FIG. 5. The plasmid contains restriction enzyme recognition sites (underlined text, FIG. 5) that allow the restriction enzyme (in this example, BsaI) to cut the plasmid at two sites, resulting in defined single-stranded sequences (inverted text, FIG. 5). According to FIG. 3, a plasmid (100) (e.g., pG9-1) is introduced into a pool containing a mixture of oligonucleotides (110) that have been amplified and error corrected as described herein. In some embodiments, these oligonucleotide sequences (110) may have a consensus sequence (120) that is recognized by a particular restriction enzyme (140). In some embodiments, the plasmid (130) may have sequences recognized by the same restriction enzyme (140). The action of the restriction enzyme (140) on these sequences results in the removal of the consensus sequences from the oligonucleotides ((310), (311)) and the plasmid (150), exposing the single-stranded DNA sequences (160). In some embodiments, the restriction enzyme may be a type IIS restriction enzyme. In some embodiments, the single-stranded sequences are designed to direct the specific interaction of one oligonucleotide with another, such that a plurality of oligonucleotides are arranged in a defined order and the arranged oligonucleotide sequences (170) are inserted into the plasmid (100). In some embodiments, the ligase (180) catalyzes the covalent joining of the individual oligonucleotides. The final product is a plasmid (e.g., pG9-1) containing the specific construct derived from the joined oligonucleotides (190). This plasmid (190) can then be transformed into bacteria and sequence verified.
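As an illustration of how two BsaI sites define the single-stranded ends of a vector, the following Python sketch scans a sequence for BsaI sites in both orientations and reports the 4-nt overhang each one is expected to leave. It assumes the standard BsaI geometry (GGTCTC, cutting 1 nt and 5 nt downstream on the top and bottom strands respectively); the function name `bsai_overhangs` is hypothetical, and the excerpt is residues 391 to 440 of SEQ ID NO: 1 as numbered in the sequence listing. The output is a prediction from the sequence alone and is not a reading of FIG. 5.

```python
# Minimal sketch: locate BsaI sites on the top strand and report the 4-nt
# overhang (written as top-strand sequence) expected next to each site.
def bsai_overhangs(seq):
    """Return (orientation, 0-based site index, overhang) for each BsaI site."""
    s = seq.upper()
    hits = []
    for i in range(len(s) - 5):
        site = s[i:i + 6]
        if site == "GGTCTC":                # site reads left to right
            start = i + 7                   # skip the site and 1 spacer nt
            if start + 4 <= len(s):
                hits.append(("forward", i, s[start:start + 4]))
        elif site == "GAGACC":              # site lies on the bottom strand
            start = i - 5                   # 4-nt overhang, then 1 spacer nt
            if start >= 0:
                hits.append(("reverse", i, s[start:start + 4]))
    return hits


# Residues 391-440 of SEQ ID NO: 1, copied from the sequence listing.
PG9_1_EXCERPT = "ccagtgaattagtgttgagaccattcagctccggtctcgacactgagctt"

for orientation, index, overhang in bsai_overhangs(PG9_1_EXCERPT):
    print(orientation, index, overhang)
```

In this excerpt the two sites face away from each other, so digestion would drop the short segment between them and leave the backbone with two different ends, consistent with the text's statement that the plasmid is cut at two sites.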

Aspects of the invention relate to sequence verification of constructs assembled according to the methods of the invention. The sequence verification of the constructs is shown in FIG. 6. In this method, multiple constructs (200, C1-C4) can be generated and transformed into bacteria as shown in FIG. 3. Bacterial transformants containing plasmid DNA can be selected on solid growth plates using the appropriate antibiotic for selection. After growth, single clones may be picked and placed into pools, one clone from each construct plate (220), creating pools of constructs in which each pool contains one copy of each construct. In some embodiments, the number of pools may depend on the number of individual clones of each construct to be sequenced in order to identify constructs with perfect sequences. As shown in FIG. 6, 4 pools of 4 constructs are generated, allowing analysis of 4 members of each construct. Plasmid DNA can then be prepared from the pooled material (230). The pools of plasmid DNA molecules can then be prepared for sequencing. This preparation can be done using one of a variety of methods that fragment the DNA into small pieces and attach the consensus sequences required for sequencing (e.g., next generation high throughput sequencing). These consensus sequences contain short DNA segments that are unique to each of the 4 pools generated. These unique DNA segments can be used to identify the pool from which each sequenced construct originated. Constructs with the correct sequence can then be recovered by returning to the original bacterial growth plate and regrowing the corresponding clone containing the plasmid with the desired construct.
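The pooling and pool-identification bookkeeping described above can be sketched as follows. This is a simplified illustration under stated assumptions: the pool-specific segment is modeled as a short barcode at the start of each read (real library preparations may read the index separately), and the clone names, barcode sequences, and function names `build_pools`, `demultiplex`, and `clone_to_recover` are all hypothetical.

```python
# Minimal sketch: one clone of every construct per pool, reads assigned to
# pools by a leading barcode, and lookup of the clone to regrow.
from collections import defaultdict


def build_pools(clones_by_construct):
    """Pool i receives the i-th picked clone of every construct."""
    depth = min(len(clones) for clones in clones_by_construct.values())
    return [
        {name: clones[i] for name, clones in clones_by_construct.items()}
        for i in range(depth)
    ]


def demultiplex(reads, pool_barcodes):
    """Group reads by pool and strip the barcode; unmatched reads are dropped."""
    by_pool = defaultdict(list)
    for read in reads:
        for pool, barcode in pool_barcodes.items():
            if read.startswith(barcode):
                by_pool[pool].append(read[len(barcode):])
                break
    return dict(by_pool)


def clone_to_recover(pools, pool_index, construct):
    """Map a verified (pool, construct) pair back to the clone on the plate."""
    return pools[pool_index][construct]


# Four constructs with four picked clones each, as in the FIG. 6 example.
clones = {f"C{k}": [f"C{k}-clone{j}" for j in range(1, 5)] for k in range(1, 5)}
pools = build_pools(clones)
barcodes = {0: "ACGT", 1: "TGCA", 2: "GATC", 3: "CTAG"}   # made-up barcodes
reads = ["GATCAAATTT", "ACGTCCC", "NNNNGG"]
print(len(pools), demultiplex(reads, barcodes))
print(clone_to_recover(pools, 2, "C3"))
```

Since each pool holds exactly one clone of each construct, a perfect read for a given construct in a given pool identifies a single clone to regrow from the original plate.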

Vectors and host cells

Any suitable vector may be used, and the present invention is not limited in this respect. For example, the vector may be a plasmid, a bacterial vector, a viral vector, a phage vector, an insect vector, a yeast vector, a mammalian vector, a BAC, a YAC, or any other suitable vector. In some embodiments, the vector may be one that replicates only within one type of organism (e.g., bacteria, yeast, insects, mammals, etc.) or only within one species of organism. Some vectors may have a wide host range. Some vectors may have different functional sequences (e.g., origins of replication, selectable markers, etc.) that function in different organisms. These can be used to shuttle the vector (and any nucleic acid fragments cloned into the vector) between two different types of organisms (e.g., between bacteria and mammals, between yeast and mammals, etc.). In some embodiments, the type of vector used may be determined by the type of host cell chosen.

It will be appreciated that the vector may encode a detectable marker (such as a selectable marker, e.g., antibiotic resistance, etc.) so that transformed cells can be selectively grown, the vector can be isolated, and any insert can be characterized to determine whether it contains the desired assembled nucleic acid. The insert can be characterized using any suitable technique (e.g., size analysis, restriction fragment analysis, sequencing, etc.). In some embodiments, the presence or absence of a correctly assembled nucleic acid in a vector can be determined by determining whether a function predicted to be encoded by the correctly assembled nucleic acid is expressed in the host cell.

In some embodiments, a host cell comprising a vector containing a nucleic acid insert can be selected or enriched by using one or more other detectable or selectable markers that function only when the correct (e.g., designed) terminal nucleic acid fragment is cloned into the vector.

Thus, the host cell should have the appropriate phenotype to allow selection for a drug-resistance marker encoded on one or more vectors (or detection of a detectable marker encoded on one or more vectors). However, any suitable host cell type may be used (e.g., prokaryotic, eukaryotic, bacterial, yeast, insect, mammalian, etc.). For example, the host cell may be a bacterial cell (e.g., Escherichia coli, Bacillus subtilis, Mycobacterium spp., M. tuberculosis, or other suitable bacterial cell), a yeast cell (e.g., Saccharomyces spp., Pichia spp., Candida spp., or other suitable yeast species, such as S. cerevisiae, C. albicans, S. pombe, etc.), a Xenopus cell, a mouse cell, a monkey cell, a human cell, an insect cell (e.g., SF9 cells and Drosophila cells), a worm cell (e.g., Caenorhabditis spp.), a plant cell, or other suitable cell, including, for example, recombinant or other suitable cell lines. In addition, a variety of heterologous cell lines can be used, such as Chinese hamster ovary (CHO) cells.

Applications

Aspects of the invention find use in a variety of applications involving the generation and/or use of synthetic nucleic acids. As described herein, the present invention provides methods for assembling synthetic nucleic acids with increased efficiency. The resulting assembled nucleic acid may be amplified in vitro (e.g., using PCR, LCR, or any suitable amplification technique), amplified in vivo (e.g., by cloning into a suitable vector), isolated, and/or purified. The assembled nucleic acid (alone or cloned into a vector) can be transformed into a host cell (e.g., prokaryotic, eukaryotic, insect, mammalian, or other host cell). In some embodiments, the host cell can be used to propagate the nucleic acid. In certain embodiments, the nucleic acid may be integrated into the genome of the host cell. In some embodiments, the nucleic acid can replace a corresponding nucleic acid region on the genome of the cell (e.g., by homologous recombination). Thus, the nucleic acids may be used to generate recombinant organisms. In some embodiments, the target nucleic acid may be the entire genome or a large fragment of the genome that is used to replace all or part of the genome of the host organism. Recombinant organisms may also be used in a variety of research, industrial, agricultural, and/or medical applications.

Many of the techniques described herein can be used together, applying a combination of one or more extension-based and/or ligation-based assembly techniques at one or more points to generate long nucleic acid molecules. For example, cooperative assembly can be used to assemble oligonucleotide duplexes and nucleic acid fragments of less than 100 to greater than 10000 base pairs in length (e.g., 100-mer to 500-mer, 500-mer to 1000-mer, 1000-mer to 5000-mer, 5000-mer to 10000-mer, 25000-mer, 50000-mer, 75000-mer, 100000-mer, etc.). In an exemplary embodiment, the methods described herein can be used during assembly of the entire genome (or large fragments thereof, e.g., about 10%, 20%, 30%, 40%, 50%, 60%, 70%, 80%, 90% or more) of an organism (e.g., a virus, bacterium, yeast, or other prokaryotic or eukaryotic organism), optionally incorporating specific modifications into the sequence at one or more desired locations.

The nucleic acid molecules generated using the methods of the invention can be incorporated into vectors. The vector may be a cloning vector or an expression vector. The vector may comprise an origin of replication and one or more selectable markers (e.g., antibiotic resistance markers, auxotrophic markers, etc.). In some embodiments, the vector may be a viral vector. The viral vector may comprise a nucleic acid sequence capable of infecting a target cell. Similarly, in some embodiments, prokaryotic expression vectors operably linked to a suitable promoter system may be used to transform a cell of interest. In other embodiments, eukaryotic vectors operably linked to a suitable promoter system may be used to transfect cells or tissues of interest.

Transcription and/or translation of the constructs described herein can be performed in vitro (i.e., using a cell-free system) or in vivo (i.e., expression in a cell). In some embodiments, cell lysates can be prepared. In certain embodiments, the expressed RNA or polypeptide may be isolated or purified.

Aspects of the methods and apparatus provided herein may include automating one or more operations described herein. In some embodiments, one or more steps in an amplification and/or assembly reaction may be automated using one or more automated sample handling devices (e.g., one or more automated liquid or fluid handling devices). Automated devices and methods may be used to deliver reagents including one or more of the following: starting nucleic acids, buffers, enzymes (e.g., one or more ligases and/or polymerases), nucleotides, salts, and any other suitable reagents (e.g., stabilizers). Automated devices and methods may also be used to control reaction conditions. For example, an automated thermal cycler can be used to control the reaction temperature and any temperature cycling that is used. In some embodiments, a scanning laser can be automated to provide one or more reaction temperatures or temperature cycles suitable for incubating the polynucleotides. Similarly, subsequent analysis of the assembled polynucleotide product can be automated. For example, sequencing can be automated using sequencing equipment and automated sequencing protocols. Other steps (e.g., amplification, cloning, etc.) can also be automated using one or more suitable devices and associated protocols. It is to be understood that one or more of the devices or device components described herein may be combined in a system (e.g., a robotic system) or in a microenvironment (e.g., a microfluidic reaction chamber). The assembly reaction mixture (e.g., liquid reaction sample) can be transferred from one component of the system to another using automated devices and methods (e.g., robotic manipulation and/or transfer of samples and/or sample containers, including automated pipetting devices, microsystems, etc.). The system and any components thereof may be controlled by a control system.

Accordingly, method steps and/or aspects of the apparatus provided herein may be automated using, for example, a computer system (e.g., a computer-controlled system). A computer system capable of implementing aspects of the techniques provided herein may include computers for any type of processing (e.g., sequence analysis and/or automated device control as described herein). However, it should be understood that certain processing steps may be provided by one or more automated devices that are part of the assembly system. In some embodiments, a computer system may include two or more computers. For example, one computer may be coupled to a second computer via a network. One computer may perform sequence analysis. The second computer may control one or more automated synthesis and assembly devices in the system. In other aspects, additional computers may be included in the network to control one or more analysis or processing operations. Each computer may include a memory and a processor. The computers may take any form, as aspects of the technology provided herein are not limited to implementation on any particular computer platform. Similarly, the network may take any form, including a private network or a public network (e.g., the Internet). Display devices can be associated with one or more of the devices and computers. Alternatively, or in addition, a display device may be located at a remote site and connected to display the output of an analysis according to the techniques provided herein. The connections between the various components of the system may be via wire, optical fiber, wireless transmission, satellite transmission, any other suitable transmission, or any combination of two or more of the foregoing.

Various aspects, embodiments, or operations of the technology provided herein can be independently automated and performed in any of numerous ways. For example, the various aspects, embodiments or operations can be implemented independently using hardware, software or a combination thereof. When implemented in software, the software code can be executed on any suitable processor or collection of processors, whether provided in a single computer or distributed among multiple computers. It should be appreciated that any component or collection of components that perform the functions described above can be generically considered as one or more controllers that control the functions discussed above. The one or more controllers can be implemented in numerous ways, such as with dedicated hardware or general purpose hardware (e.g., one or more processors) controlled by microcode or software routines to perform functions recited above.

In this regard, it will be appreciated that one implementation of embodiments of the technology provided herein comprises at least one computer-readable medium (e.g., a computer memory, a floppy disk, a compact disk, a magnetic tape, etc.) encoded with a computer program (i.e., a plurality of instructions) that, when executed on a processor, performs one or more of the functions described above for the technology provided herein. The computer-readable medium can be transportable, such that the program stored thereon can be loaded onto any computer system resource to perform one or more functions of the techniques provided herein. Further, it should be understood that a computer program that performs the functions discussed above when executed is not limited to an application running on a host computer. Rather, the term computer program is used herein in a generic sense to refer to any type of computer code (e.g., software or microcode) that can be employed to program a processor to perform the aspects of the techniques provided herein discussed above.

It should be understood that, in accordance with several implementations of the techniques provided herein in which processes are stored on a computer-readable medium, the computer-implemented processes may accept manual input (e.g., from a user) during their execution.

Thus, overall system-level control of an assembly device or components described herein may be performed by a system controller that may provide control signals to the following: associated nucleic acid synthesizers, liquid handling devices, thermal cyclers, sequencing devices, associated robotic components, and other suitable systems for performing the desired input/output or other control functions. Thus, the system controller, together with any device controllers, forms a controller that controls the operation of the nucleic acid assembly system. The controller may comprise a general purpose data processing system, which may be a general purpose computer or a network of general purpose computers, and other associated devices including communications devices, modems, and/or other circuits or components to perform the required input/output or other functions. The controller can also be implemented at least in part as a single special purpose integrated circuit (e.g., an ASIC) or an array of ASICs, each having a main or central processor section for overall, system-level control, and dedicated, separate sections for performing a variety of different specific computations, functions, and other processes under the control of the central processor section. The controller can also be implemented using a plurality of separate dedicated programmable integrated or other electronic circuits or devices, such as hardwired electronic or logic circuits, e.g., discrete element circuits or programmable logic devices. The controller can also include any other components or devices, such as user input/output devices (monitors, displays, printers, keyboards, user pointing devices, touch screens, or other user interfaces, etc.), data storage devices, drive motors, linkages, valve controllers, robotic devices, vacuum and other pumps, pressure sensors, detectors, power supplies, pulse sources, communication devices or other electronic circuits or components, and so forth. The controller may also control the operation of other parts of the system, such as automated customer order processing, quality control, packaging, shipping, invoicing, etc., to perform other suitable functions known in the art and not described in detail herein.

Aspects of the present invention may be used alone, in combination, or in a variety of arrangements not specifically discussed in the embodiments described in the foregoing, and are therefore not limited in their application to the details and arrangement of components set forth in the foregoing description or illustrated in the drawings. For example, aspects described in one embodiment may be combined in any manner with aspects described in other embodiments.

The use of the ordinal terms "first," "second," "third," etc., in the claims to modify a claim element does not by itself connote any priority, precedence, or order of one claim element over another or the temporal order in which acts of a method are performed, but are used merely as labels to distinguish one claim element having a certain name from another element having a same name (but for use of the ordinal term) to distinguish the claim elements.

Also, the phraseology and terminology used herein is for the purpose of description and not of limitation. The use of "including," "comprising," or "having," "containing," "involving," and variations thereof herein, is meant to encompass the items listed thereafter and equivalents thereof as well as additional items.

Equivalents

The present invention provides, inter alia, novel methods and apparatus for high fidelity gene assembly. While specific embodiments of the invention have been discussed, the foregoing description is illustrative only and not limiting. Many variations of the invention will become apparent to those skilled in the art upon reading the specification. The full scope of the invention should be determined by reference to the appended claims, along with the full scope of equivalents to which such claims are entitled, and to the specification, along with such variations.

Incorporation by reference

Reference is made to US application 13/986,368 filed on April 24, 2013, US application 13/524,164 filed on June 15, 2012, and PCT publication PCT/US2009/055267. All publications, patents, patent applications, and sequence database entries mentioned herein are incorporated by reference in their entirety to the same extent as if each individual publication or patent was specifically and individually indicated to be incorporated by reference.

Sequence listing

<110> Gen9, Inc.

Hudson, Michael E.

L-Y.A.Kun (Li-yun A.)

Schinder, Daniel

Archer, Stephen

Saaem, Ishtiaq

<120> Methods for nucleic acid assembly and high throughput sequencing

<130> 127662-013402

<140> unnumbered

<141> 2013-06-24

<150> 61/664,118

<151> 2012-06-25

<150> 61/731,627

<151> 2012-11-30

<160> 1

<170> PatentIn version 3.5

<210> 1

<211> 2608

<212> DNA

<213> Artificial sequence

<220>

<223> synthetic constructs

<400> 1

tcgcgcgttt cggtgatgac ggtgaaaacc tctgacacat gcagctcccg gagacggtca 60

cagcttgtct gtaagcggat gccgggagca gacaagcccg tcagggcgcg tcagcgggtg 120

ttggcgggtg tcggggctgg cttaactatg cggcatcaga gcagattgta ctgagagtgc 180

accatatgcg gtgtgaaata ccgcacagat gcgtaaggag aaaataccgc atcaggcgcc 240

attcgccatt caggctgcgc aactgttggg aagggcgatc ggtgcgggcc tcttcgctat 300

tacgccagct ggcgaaaggg ggatgtgctg caaggcgatt aagttgggta acgccagggt 360

tttcccagtc acgacgttgt aaaacgacgg ccagtgaatt agtgttgaga ccattcagct 420

ccggtctcga cactgagctt ggcgtaatca tggtcatagc tgtttcctgt gtgaaattgt 480

tatccgctca caattccaca caacatacga gccggaagca taaagtgtaa agcctggggt 540

gcctaatgag tgagctaact cacattaatt gcgttgcgct cactgcccgc tttccagtcg 600

ggaaacctgt cgtgccagct gcattaatga atcggccaac gcgcggggag aggcggtttg 660

cgtattgggc gctcttccgc ttcctcgctc actgactcgc tgcgctcggt cgttcggctg 720

cggcgagcgg tatcagctca ctcaaaggcg gtaatacggt tatccacaga atcaggggat 780

aacgcaggaa agaacatgtg agcaaaaggc cagcaaaagg ccaggaaccg taaaaaggcc 840

gcgttgctgg cgtttttcca taggctccgc ccccctgacg agcatcacaa aaatcgacgc 900

tcaagtcaga ggtggcgaaa cccgacagga ctataaagat accaggcgtt tccccctgga 960

agctccctcg tgcgctctcc tgttccgacc ctgccgctta ccggatacct gtccgccttt 1020

ctcccttcgg gaagcgtggc gctttctcat agctcacgct gtaggtatct cagttcggtg 1080

taggtcgttc gctccaagct gggctgtgtg cacgaacccc ccgttcagcc cgaccgctgc 1140

gccttatccg gtaactatcg tcttgagtcc aacccggtaa gacacgactt atcgccactg 1200

gcagcagcca ctggtaacag gattagcaga gcgaggtatg taggcggtgc tacagagttc 1260

ttgaagtggt ggcctaacta cggctacact agaagaacag tatttggtat ctgcgctctg 1320

ctgaagccag ttaccttcgg aaaaagagtt ggtagctctt gatccggcaa acaaaccacc 1380

gctggtagcg gtggtttttt tgtttgcaag cagcagatta cgcgcagaaa aaaaggatct 1440

caagaagatc ctttgatctt ttctacgggg tctgacgctc agtggaacga aaactcacgt 1500

taagggattt tggtcatgag attatcaaaa aggatcttca cctagatcct tttaaattaa 1560

aaatgaagtt ttaaatcaat ctaaagtata tatgagtaaa cttggtctga cagtcagaag 1620

aactcgtcaa gaaggcgata gaaggcgatg cgctgcgaat cgggagcggc gataccgtaa 1680

agcacgagga agcggtcagc ccattcgccg ccaagctctt cagcaatatc acgggtagcc 1740

aacgctatgt cctgatagcg gtccgccaca cccagccggc cacagtcgat gaatccagaa 1800

aagcggccat tttccaccat gatattcggc aagcaggcat cgccatgggt cacgacgaga 1860

tcctcgccgt cgggcatgct cgccttgagc ctggcgaaca gttcggctgg cgcgagcccc 1920

tgatgctctt cgtccagatc atcctgatcg acaagaccgg cttccatccg agtacgtgct 1980

cgctcgatgc gatgtttcgc ttggtggtcg aatgggcagg tagccggatc aagcgtatgc 2040

agccgccgca ttgcatcagc catgatggat actttctcgg caggagcaag gtgagatgac 2100

aggagatcct gccccggcac ttcgcccaat agcagccagt cccttcccgc ttcagtgaca 2160

acgtcgagca cagctgcgca aggaacgccc gtcgtggcca gccacgatag ccgcgctgcc 2220

tcgtcttgca gttcattcag ggcaccggac aggtcggtct tgacaaaaag aaccgggcgc 2280

ccctgcgctg acagccggaa cacggcggca tcagagcagc cgattgtctg ttgtgcccag 2340

tcatagccga atagcctctc cacccaagcg gccggagaac ctgcgtgcaa tccatcttgt 2400

tcaatcatac tcttcctttt tcaatattat tgaagcattt atcagggtta ttgtctcatg 2460

agcggataca tatttgaatg tatttagaaa aataaacaaa taggggttcc gcgcacattt 2520

ccccgaaaag tgccacctga cgtctaagaa accattatta tcatgacatt aacctataaa 2580

aataggcgta tcacgaggcc ctttcgtc 2608
