RNA polymerase variants for co-transcriptional capping

Document No.: 1966735 | Publication date: 2021-12-14

Summary: This technology, RNA polymerase variants for co-transcriptional capping, was devised by Athanasios Dousis, Kanchana Ravichandran, Amy E. Rabideau, Margaret Franklin, and Kevin Smith on 2020-02-19. The present disclosure provides RNA polymerase variants for efficient transcription.

1. A ribonucleic acid (RNA) polymerase variant comprising an RNA polymerase comprising an amino acid substitution at a position selected from the group consisting of E350, D351, K387, N437, K441, D506, R632, D653, S628, P657, F880 and G884 relative to an RNA polymerase comprising the amino acid sequence of SEQ ID NO: 44.

2. The RNA polymerase variant of claim 1, wherein the RNA polymerase comprises an amino acid substitution at E350.

3. The RNA polymerase variant of any preceding claim, wherein the RNA polymerase comprises an amino acid substitution at D351.

4. The RNA polymerase variant of any preceding claim, wherein the RNA polymerase comprises an amino acid substitution at K387.

5. The RNA polymerase variant of any preceding claim, wherein the RNA polymerase comprises an amino acid substitution at N437.

6. The RNA polymerase variant of any preceding claim, wherein the RNA polymerase comprises an amino acid substitution at K441.

7. The RNA polymerase variant of any preceding claim, wherein the RNA polymerase comprises an amino acid substitution at D506.

8. The RNA polymerase variant of any preceding claim, wherein the RNA polymerase comprises an amino acid substitution at R632.

9. The RNA polymerase variant of any preceding claim, wherein the RNA polymerase comprises an amino acid substitution at D653.

10. The RNA polymerase variant of any preceding claim, wherein the RNA polymerase comprises an amino acid substitution at S628.

11. The RNA polymerase variant of any preceding claim, wherein the RNA polymerase comprises an amino acid substitution at P657.

12. The RNA polymerase variant of any preceding claim, wherein the RNA polymerase comprises an amino acid substitution at F880.

13. The RNA polymerase variant of any preceding claim, wherein the RNA polymerase comprises an amino acid substitution at G884.

14. The RNA polymerase variant of claim 1, wherein the RNA polymerase comprises at least two, at least three, at least four, or at least five amino acid substitutions at a position selected from the group consisting of E350, D351, K387, N437, K441, D506, R632, D653, S628, P657, F880, and G884.

15. The RNA polymerase variant of claim 14, wherein the RNA polymerase comprises an amino acid substitution at a position selected from the group consisting of: E350 and D351; E350 and K387; E350 and N437; E350 and K441; E350 and D506; E350 and R632; E350 and D653; E350 and S628; E350 and P657; E350 and F880; E350 and G884; D351 and K387; D351 and N437; D351 and K441; D351 and D506; D351 and R632; D351 and D653; D351 and S628; D351 and P657; D351 and F880; D351 and G884; K387 and N437; K387 and K441; K387 and D506; K387 and R632; K387 and D653; K387 and S628; K387 and P657; K387 and F880; K387 and G884; N437 and K441; N437 and D506; N437 and R632; N437 and D653; N437 and S628; N437 and P657; N437 and F880; N437 and G884; K441 and D506; K441 and R632; K441 and D653; K441 and S628; K441 and P657; K441 and F880; K441 and G884; D506 and R632; D506 and D653; D506 and S628; D506 and P657; D506 and F880; D506 and G884; R632 and D653; R632 and S628; R632 and P657; R632 and F880; R632 and G884; D653 and S628; D653 and P657; D653 and F880; D653 and G884; S628 and P657; S628 and F880; S628 and G884; P657 and F880; P657 and G884; and F880 and G884.

16. The RNA polymerase variant of claim 15, wherein the RNA polymerase comprises an amino acid substitution at a position selected from the group consisting of: K387, D653, and G884; E350, D351, and K387; and D653, P657, and R632.

17. The RNA polymerase variant of any one of the preceding claims, wherein the amino acid substitution at E350 is selected from the group consisting of E350A, E350K, E350N, and E350W, optionally wherein the amino acid substitution at E350 is E350N.

18. The RNA polymerase variant of any preceding claim, wherein the amino acid substitution at D351 is D351V.

19. The RNA polymerase variant of any one of the preceding claims, wherein the amino acid substitution at K387 is selected from the group consisting of K387H, K387N, and K387S, optionally wherein the amino acid substitution at K387 is K387N.

20. The RNA polymerase variant of any one of the preceding claims, wherein the amino acid substitution at N437 is selected from the group consisting of N437F, N437I, N437T and N437Y, optionally wherein the amino acid substitution at N437 is N437F.

21. The RNA polymerase variant of any one of the preceding claims, wherein the amino acid substitution at K441 is K441R.

22. The RNA polymerase variant of any one of the preceding claims, wherein the amino acid substitution at D506 is selected from the group consisting of D506F, D506L, D506R, D506W, and D506Y.

23. The RNA polymerase variant of any preceding claim, wherein the amino acid substitution at R632 is R632K or R632T.

24. The RNA polymerase variant of any preceding claim, wherein the amino acid substitution at D653 is selected from the group consisting of D653A, D653F, D653G, D653H, D653I, D653K, D653L, D653M, D653N, D653P, D653Q, D653R, D653S, D653T, D653V, D653W, and D653Y, optionally wherein the amino acid substitution at D653 is D653W.

25. The RNA polymerase variant of any preceding claim, wherein the amino acid substitution at S628 is S628W.

26. The RNA polymerase variant of any one of the preceding claims, wherein the amino acid substitution at P657 is selected from the group consisting of P657A, P657R, and P657W.

27. The RNA polymerase variant of any preceding claim, wherein the amino acid substitution at F880 is F880Y.

28. The RNA polymerase variant of any one of the preceding claims, wherein the amino acid substitution at G884 is selected from the group consisting of G884A, G884S, G884T, and G884P.

29. An RNA polymerase variant comprising an RNA polymerase comprising amino acid substitutions at two positions selected from the group consisting of E350, D351, K387, and D653 relative to a wild-type RNA polymerase comprising the amino acid sequence of SEQ ID NO: 1.

30. The RNA polymerase variant of claim 29, comprising amino acid substitutions at E350 and D351.

31. The RNA polymerase variant of claim 29, comprising amino acid substitutions at E350 and K387.

32. The RNA polymerase variant of claim 29, comprising amino acid substitutions at K387 and D653.

33. The RNA polymerase variant of any one of claims 29-31, wherein the amino acid substitution at position E350 is E350W, E350A, E350K, or E350N.

34. The RNA polymerase variant of claim 29 or 30, wherein the amino acid substitution at position D351 is D351V.

35. The RNA polymerase variant of any one of claims 29, 31, or 32, wherein the amino acid substitution at position K387 is K387N, K387S, or K387H.

36. The RNA polymerase variant of claim 29 or 32, wherein the amino acid substitution at position D653 is D653T or D653K.

37. An RNA polymerase variant comprising an RNA polymerase comprising amino acid substitutions at positions E350 and K387 relative to a wild-type RNA polymerase comprising the amino acid sequence of SEQ ID NO: 1, optionally wherein the substitutions are E350W and K387N.

38. A variant RNA polymerase comprising an RNA polymerase comprising amino acid substitutions at positions E350 and D351 relative to a wild type RNA polymerase comprising the amino acid sequence of SEQ ID NO:1, optionally wherein the substitutions are E350W and D351V.

39. An RNA polymerase variant comprising an RNA polymerase comprising amino acid substitutions at positions K387 and D653 relative to a wild-type RNA polymerase comprising the amino acid sequence of SEQ ID NO: 1, optionally wherein the substitutions are K387N and D653T.

40. A method comprising producing mRNA in an in vitro transcription reaction comprising a DNA template, nucleoside triphosphates, the RNA polymerase variant of any preceding claim, and optionally a cap analog.

41. The method of claim 40, wherein the reaction comprises a cap analog.

42. The method of claim 40 or 41, wherein the reaction comprises a concentration of the cap analog that is at least 5-fold lower than the concentration of the cap analog required to produce an equivalent amount of mRNA using T7 RNA polymerase comprising the amino acid sequence of SEQ ID NO: 44.

43. The method of any one of the preceding claims, wherein more than 80% of the mRNA produced comprises a functional cap, more than 50% of the mRNA produced is homogeneous at the 3' end, and/or the reaction comprises less than 5 ng dsRNA per 25 μg of mRNA produced.

44. The method of any one of the preceding claims, wherein the cap analog and nucleoside triphosphates are present in the reaction at equimolar concentrations, or the molar ratio of cap analog to nucleoside triphosphates in the reaction is greater than 1:1.

45. The method of any one of the preceding claims, wherein the cap analog is a dinucleotide cap, a trinucleotide cap, or a tetranucleotide cap.

46. The method of any one of the preceding claims, wherein the cap analog is a trinucleotide cap analog comprising a GAG sequence.

47. The method of claim 46 wherein said GAG cap analog comprises a compound selected from:

48. The method of any one of the preceding claims, wherein the cap analog is a tetranucleotide cap analog comprising a GGAG sequence.

49. The method of claim 48, wherein the tetranucleotide cap analog comprises a compound selected from the group consisting of:

50. The method of any one of the preceding claims, wherein the polynucleotide template comprises a 2'-deoxythymidine residue or a 2'-deoxycytidine residue at template position +1.

51. A composition or kit comprising the RNA polymerase variant of any of claims 1-39 and IVT reaction components, optionally selected from the group consisting of a polynucleotide template, nucleoside triphosphates and a cap analog.

52. A nucleic acid encoding the RNA polymerase variant of any of claims 1-39.

Background

In vitro transcription (IVT) uses bacteriophage DNA-dependent ribonucleic acid (RNA) polymerases (e.g., SP6, T3, and T7) to synthesize template-directed mRNA transcripts. Problems in an IVT reaction can result in complete failure (e.g., no transcript produced) or in transcripts of the wrong size (e.g., shorter or longer than expected). Specific problems associated with IVT reactions include, for example, abortive (truncated) transcripts, run-on transcripts, polyA tail variants/3' heterogeneity, mutated transcripts, and/or double-stranded contaminants produced during the reaction.

RNA polymerases exhibit three phases of transcription: initiation, elongation, and termination. During the initiation phase, the RNA polymerase binds to a specific promoter DNA sequence, opens the DNA duplex, and feeds the template strand into the active site. T7 RNA polymerase, for example, forms a structure referred to as the initiation complex, which includes a six-helix bundle subdomain (the promoter binding domain) that interacts with the promoter to initiate DNA duplex melting. While bound to the promoter, the polymerase produces many short (truncated) transcripts of 2-12 nucleotides (nt) in length, a process often referred to as abortive synthesis/initiation. The truncated RNA transcripts cannot be converted to full-length transcripts by RNA polymerase and accumulate as byproducts during transcription. After transitioning to the elongation phase and releasing the promoter, the polymerase proceeds down the DNA template, producing a full-length RNA transcript.

During the elongation phase, RNA polymerase often continues to transcribe DNA beyond the point at which termination should be initiated, generating RNA transcripts that are longer than expected ("run-on transcripts"). T7 RNA polymerase, for example, adds nucleotides to the end of a transcript before "falling off" the template. Studies have shown that more than 70% of transcripts generated by T7 RNA polymerase in vitro can be run-on transcripts. In some cases, these aberrant RNA products are twice the length of the encoded sequence. Because run-on transcription is stochastic, there is often great 3' heterogeneity among the products in a given IVT reaction. This 3' heterogeneity is problematic for downstream applications, such as ligation reactions, which depend on RNA transcripts of a defined length and/or nucleotide composition.

Disclosure of Invention

In some aspects, provided herein are RNA polymerase variants and in vitro transcription methods using these variants. In some embodiments, RNA polymerase variants of the present disclosure have been shown to increase transcription efficiency, increase co-transcriptional capping efficiency, increase RNA yield at half the concentration of cap analog, improve the 3' homogeneity of RNA, improve transcription fidelity, and/or reduce the amount of dsRNA contamination when used, for example, in an in vitro transcription reaction.

Some aspects of the present disclosure provide a ribonucleic acid (RNA) polymerase variant comprising an RNA polymerase comprising at least one amino acid substitution.

In some embodiments, the RNA polymerase variant comprises an RNA polymerase comprising at least one amino acid substitution relative to an RNA polymerase comprising the amino acid sequence of SEQ ID NO: 44 at a position selected from the group consisting of E350, D351, K387, N437, K441, D506, R632, D653, S628, P657, F880, and G884.

In some embodiments, the RNA polymerase variant comprises an RNA polymerase comprising amino acid substitutions at two positions selected from the group consisting of E350, D351, K387, and D653 relative to a wild-type RNA polymerase comprising the amino acid sequence of SEQ ID NO: 1. In some embodiments, the two amino acid substitutions are at E350 and D351. In some embodiments, the two amino acid substitutions are at E350 and K387. In some embodiments, the two amino acid substitutions are at K387 and D653.

In some embodiments, the RNA polymerase comprises an amino acid substitution at E350. In some embodiments, the RNA polymerase comprises an amino acid substitution at D351. In some embodiments, the RNA polymerase comprises an amino acid substitution at K387. In some embodiments, the RNA polymerase comprises an amino acid substitution at N437. In some embodiments, the RNA polymerase comprises an amino acid substitution at K441. In some embodiments, the RNA polymerase comprises an amino acid substitution at D506. In some embodiments, the RNA polymerase comprises an amino acid substitution at R632. In some embodiments, the RNA polymerase comprises an amino acid substitution at D653. In some embodiments, the RNA polymerase comprises an amino acid substitution at S628. In some embodiments, the RNA polymerase comprises an amino acid substitution at P657. In some embodiments, the RNA polymerase comprises an amino acid substitution at F880. In some embodiments, the RNA polymerase comprises an amino acid substitution at G884.

In some embodiments, the RNA polymerase comprises at least two, at least three, at least four, or at least five amino acid substitutions at a position selected from the group consisting of E350, D351, K387, N437, K441, D506, R632, D653, S628, P657, F880, and G884.

In some embodiments, the RNA polymerase comprises an amino acid substitution at a position selected from the group consisting of: E350 and D351; E350 and K387; E350 and N437; E350 and K441; E350 and D506; E350 and R632; E350 and D653; E350 and S628; E350 and P657; E350 and F880; E350 and G884; D351 and K387; D351 and N437; D351 and K441; D351 and D506; D351 and R632; D351 and D653; D351 and S628; D351 and P657; D351 and F880; D351 and G884; K387 and N437; K387 and K441; K387 and D506; K387 and R632; K387 and D653; K387 and S628; K387 and P657; K387 and F880; K387 and G884; N437 and K441; N437 and D506; N437 and R632; N437 and D653; N437 and S628; N437 and P657; N437 and F880; N437 and G884; K441 and D506; K441 and R632; K441 and D653; K441 and S628; K441 and P657; K441 and F880; K441 and G884; D506 and R632; D506 and D653; D506 and S628; D506 and P657; D506 and F880; D506 and G884; R632 and D653; R632 and S628; R632 and P657; R632 and F880; R632 and G884; D653 and S628; D653 and P657; D653 and F880; D653 and G884; S628 and P657; S628 and F880; S628 and G884; P657 and F880; P657 and G884; and F880 and G884.

In some embodiments, the RNA polymerase comprises an amino acid substitution at a position selected from the group consisting of: K387, D653, and G884; E350, D351, and K387; and D653, P657, and R632.
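The pairwise listing above appears to be every unordered pair drawn from the twelve recited positions. As a minimal sketch (the names `POSITIONS` and `pairwise_combinations` are illustrative and not part of the disclosure), the same listing can be generated programmatically, which makes it easy to confirm that no pair is missing:

```python
from itertools import combinations

# The twelve positions, in the order the embodiment recites them.
POSITIONS = ["E350", "D351", "K387", "N437", "K441", "D506",
             "R632", "D653", "S628", "P657", "F880", "G884"]


def pairwise_combinations(positions):
    """Return every unordered pair of distinct positions, in listing order."""
    return list(combinations(positions, 2))


pairs = pairwise_combinations(POSITIONS)
print(len(pairs))           # 12 choose 2 = 66
print(pairs[0], pairs[-1])  # ('E350', 'D351') ('F880', 'G884')
```

Because `combinations` preserves input order, the generated pairs follow the same sequence as the recited list, from "E350 and D351" through "F880 and G884".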

In some embodiments, the amino acid substitution at E350 is selected from the group consisting of E350A, E350K, E350N, and E350W, optionally wherein the amino acid substitution at E350 is E350N.

In some embodiments, the amino acid substitution at D351 is D351V.

In some embodiments, the amino acid substitution at K387 is selected from the group consisting of K387H, K387N, and K387S, optionally wherein the amino acid substitution at K387 is K387N.

In some embodiments, the amino acid substitution at N437 is selected from the group consisting of N437F, N437I, N437T, and N437Y, optionally wherein the amino acid substitution at N437 is N437F.

In some embodiments, the amino acid substitution at K441 is K441R.

In some embodiments, the amino acid substitution at D506 is selected from the group consisting of D506F, D506L, D506R, D506W, and D506Y.

In some embodiments, the amino acid substitution at R632 is R632K or R632T.

In some embodiments, the amino acid substitution at D653 is selected from the group consisting of D653A, D653F, D653G, D653H, D653I, D653K, D653L, D653M, D653N, D653P, D653Q, D653R, D653S, D653T, D653V, D653W, and D653Y, optionally wherein the amino acid substitution at D653 is D653W.

In some embodiments, the amino acid substitution at S628 is S628W.

In some embodiments, the amino acid substitution at P657 is selected from the group consisting of P657A, P657R, and P657W.

In some embodiments, the amino acid substitution at F880 is F880Y.

In some embodiments, the amino acid substitution at G884 is selected from the group consisting of G884A, G884S, G884T, and G884P.
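The substitution labels used above follow the common convention of wild-type residue, 1-based position, replacement residue (for example, E350W replaces glutamate at position 350 with tryptophan). The sketch below, offered only as an illustration, parses such labels and applies them to a sequence string. The helper names and the six-residue toy sequence are hypothetical; the actual reference sequences (SEQ ID NO: 1 and SEQ ID NO: 44) are not reproduced here.

```python
import re
from typing import NamedTuple


class Substitution(NamedTuple):
    wild_type: str  # one-letter code of the residue in the reference
    position: int   # 1-based position in the reference sequence
    variant: str    # one-letter code of the replacement residue


_LABEL = re.compile(r"^([A-Z])(\d+)([A-Z])$")


def parse_substitution(label: str) -> Substitution:
    """Parse a label such as 'E350W' into its three components."""
    match = _LABEL.match(label)
    if match is None:
        raise ValueError(f"not a substitution label: {label!r}")
    return Substitution(match.group(1), int(match.group(2)), match.group(3))


def apply_substitutions(sequence: str, labels) -> str:
    """Apply each substitution, checking the expected wild-type residue."""
    residues = list(sequence)
    for label in labels:
        sub = parse_substitution(label)
        index = sub.position - 1  # convert to 0-based indexing
        if index >= len(residues) or residues[index] != sub.wild_type:
            raise ValueError(f"{label}: reference does not have "
                             f"{sub.wild_type} at position {sub.position}")
        residues[index] = sub.variant
    return "".join(residues)


# Hypothetical toy sequence, used only to show the mechanics.
toy_reference = "MKTAYE"
print(apply_substitutions(toy_reference, ["K2N", "E6W"]))  # MNTAYW
```

Checking the wild-type residue before substituting guards against numbering mismatches between reference sequences, which matters here because positions are recited relative to a specific SEQ ID NO.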

In some embodiments, the RNA polymerase comprises the sequence of any one of claims 61-241.

Other aspects of the disclosure provide a method comprising producing mRNA in an in vitro transcription reaction comprising a DNA template, nucleoside triphosphates, any one of the RNA polymerase variants described herein, and optionally a cap analog. In some embodiments, the reaction comprises a cap analog. In some embodiments, the reaction comprises a concentration of the cap analog that is at least 5-fold lower than the concentration of the cap analog required to produce an equivalent amount of mRNA using a T7 RNA polymerase comprising the amino acid sequence of SEQ ID NO: 44.

In some embodiments, more than 80% of the mRNA produced comprises a functional cap, more than 50% of the mRNA produced is homogeneous at the 3' end, and/or the reaction comprises less than 5 ng dsRNA per 25 μg of mRNA produced.
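The three criteria above are joined by "and/or", so each can be evaluated independently. The following sketch shows one way to do so, including the normalization of a measured dsRNA mass to a 25 μg mRNA basis. The class name, field names, and measurement values are hypothetical and are not data from the disclosure.

```python
from dataclasses import dataclass


@dataclass
class IvtResult:
    percent_capped: float              # % of mRNA with a functional cap
    percent_3prime_homogeneous: float  # % of mRNA homogeneous at the 3' end
    dsrna_ng: float                    # measured dsRNA mass, in ng
    mrna_ug: float                     # mRNA mass in the same sample, in ug


def dsrna_per_25ug(result: IvtResult) -> float:
    """Scale the measured dsRNA mass to a 25 ug mRNA basis."""
    return result.dsrna_ng * 25.0 / result.mrna_ug


def check_thresholds(result: IvtResult) -> dict:
    """Evaluate each criterion separately, since they are joined by and/or."""
    return {
        "capped_over_80pct": result.percent_capped > 80.0,
        "homogeneous_over_50pct": result.percent_3prime_homogeneous > 50.0,
        "dsrna_under_5ng_per_25ug": dsrna_per_25ug(result) < 5.0,
    }


# Hypothetical measurements, for illustration only.
example = IvtResult(percent_capped=92.0, percent_3prime_homogeneous=45.0,
                    dsrna_ng=12.0, mrna_ug=100.0)
print(dsrna_per_25ug(example))   # 3.0
print(check_thresholds(example))
```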

In some embodiments, the cap analog and the nucleoside triphosphate are present in the reaction at equimolar concentrations, or the molar ratio of cap analog to nucleoside triphosphate in the reaction is greater than 1:1. In some embodiments, the cap analog is a dinucleotide cap, a trinucleotide cap, or a tetranucleotide cap. In some embodiments, the cap analogs are trinucleotide cap analogs comprising a GAG sequence.
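The concentration relationships described above (a cap analog to nucleoside triphosphate molar ratio of at least 1:1, and an at-least-5-fold reduction in cap analog concentration relative to a reference polymerase) reduce to simple ratios. The sketch below assumes that the ratio is taken against the concentration of a single nucleoside triphosphate, which the text does not specify; the function names and the millimolar values are hypothetical.

```python
def cap_to_ntp_ratio(cap_mM: float, ntp_mM: float) -> float:
    """Molar ratio of cap analog to one nucleoside triphosphate."""
    if ntp_mM <= 0:
        raise ValueError("NTP concentration must be positive")
    return cap_mM / ntp_mM


def is_equimolar_or_cap_excess(cap_mM: float, ntp_mM: float,
                               tol: float = 1e-9) -> bool:
    """True when the ratio is 1:1 (within tolerance) or greater."""
    return cap_to_ntp_ratio(cap_mM, ntp_mM) >= 1.0 - tol


def fold_reduction(reference_cap_mM: float, variant_cap_mM: float) -> float:
    """How many times lower the variant's cap analog concentration is."""
    if variant_cap_mM <= 0:
        raise ValueError("cap analog concentration must be positive")
    return reference_cap_mM / variant_cap_mM


# Hypothetical concentrations, for illustration only.
print(is_equimolar_or_cap_excess(5.0, 5.0))  # True
print(fold_reduction(10.0, 2.0))             # 5.0
```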

In some embodiments, GAG cap analogs include compounds selected from the group consisting of:

In some embodiments, the cap analog is a tetranucleotide cap analog comprising a GGAG sequence.

In some embodiments, the tetranucleotide cap analog comprises a compound selected from the group consisting of:

In some embodiments, the polynucleotide template comprises a 2'-deoxythymidine residue or a 2'-deoxycytidine residue at template position +1.

Other aspects of the disclosure provide compositions or kits comprising any of the RNA polymerase variants as described herein and an IVT reaction component, optionally selected from the group consisting of a polynucleotide template, nucleoside triphosphates, and a cap analog.

Other aspects of the disclosure provide nucleic acids encoding any of the RNA polymerase variants as described herein.

Drawings

FIGS. 1A-1H show graphs depicting the functional characteristics of transcribed RNA products produced by in vitro transcription (IVT) reactions involving mutant variants of the control T7 RNA polymerase variant (G47A + C-terminal G) in the presence of different levels of GAG cap analog. Following oligo dT purification, the transcribed RNA products were analyzed for yield (FIG. 1A), 3' homogeneity (FIG. 1B), amount of dsRNA (FIG. 1C), percent capped RNA (FIG. 1D and FIG. 1E), purity as determined by the DBAA (dibutylammonium acetate) HPLC method (FIG. 1F), percent tailed RNA (i.e., the percentage of RNA comprising a polyA tail) as determined by the Tris RP (reverse phase) method (FIG. 1G), and frequency of insertions/deletions (indels) (FIG. 1H).

FIGS. 2A-2C show graphs depicting the percentage of capped RNA produced by in vitro transcription (IVT) reactions involving mutant variants of the control T7 RNA polymerase variant (G47A + C-terminal G) in the presence of different levels of GGG cap (FIG. 2A), m6A cap (FIG. 2B), and e6A cap (FIG. 2C).

FIGS. 3A-3E show graphs depicting the functional characteristics of transcribed RNA products produced by in vitro transcription (IVT) reactions involving mutant variants of the control T7 RNA polymerase variant (G47A + C-terminal G) in the presence of different levels of GAG cap analog. Following oligo dT purification, the transcribed RNA products were analyzed for concentration (FIG. 3A), percent tailed RNA (i.e., the percentage of RNA containing a polyA tail) as determined by the Tris RP method (FIG. 3B), purity as determined by the DBAA (dibutylammonium acetate) HPLC method (FIG. 3C), 3' homogeneity (FIG. 3D), and amount of dsRNA (FIG. 3E).

FIGS. 4A-4E show graphs depicting the percentage of capped RNA produced by in vitro transcription (IVT) reactions involving mutant variants of the control T7 RNA polymerase variant (G47A + C-terminal G) in the presence of different levels of GAG cap.

FIGS. 5A-5D show graphs depicting the percentage of capped RNA produced by IVT reactions involving mutant variants of the control T7 RNA polymerase variant (G47A + C-terminal G) in the presence of different levels of the e6A trinucleotide (trinuc).

FIGS. 6A-6D show graphs depicting the percentage of capped RNA produced by IVT reactions involving mutant variants of the control T7 RNA polymerase variant (G47A + C-terminal G) in the presence of different levels of the m6A trinucleotide.

FIG. 7 shows a graph depicting the percentage of capped RNA produced by IVT reactions involving mutant variants of the control T7 RNA polymerase variant (G47A + C-terminal G) in the presence of different levels of the GGAG tetranucleotide (tetranuc). The structure of the GGAG tetranucleotide is provided in the lower half of FIG. 7.

FIGS. 8A-8I show graphs depicting the percentage of capped RNA (FIGS. 8A-8D) and relative RNA yield (FIGS. 8E-8I) produced by IVT reactions involving mutant variants of the control T7 RNA polymerase variant (G47A + C-terminal G) in the presence of GAG trinucleotides, m6A trinucleotides, e6A trinucleotides, or tetranucleotides. FIGS. 8E-8I are normalized to IVT reactions involving wild-type T7 RNA polymerase.

FIGS. 9A-9D show graphs depicting dsRNA content produced by IVT reactions involving mutant variants of the control T7 RNA polymerase in the presence of GAG trinucleotides (FIG. 9A), m6A trinucleotides (FIG. 9B), e6A trinucleotides (FIG. 9C), and GGAG tetranucleotides (FIG. 9D).

FIGS. 10A-10D show graphs depicting 3' homogeneity (FIG. 10A), percentage of capped RNA (FIG. 10B), percentage of full-length RNA product (FIG. 10C), and crude yield over time (FIG. 10D) resulting from IVT reactions involving mutant variants of the control T7 RNA polymerase in the presence of GAG trinucleotides.

FIG. 11 shows a graph depicting the percentage of capped RNA produced by IVT reactions involving D653W + G47A RNA polymerase variants in the presence of different concentrations of the cap analog.

FIG. 12 shows a graph depicting the capping efficiency of multi-substituted RNA polymerase variants in the presence of GAG trinucleotide cap analogs.

FIGS. 13A-13B show graphs depicting the relative RNA yield (FIG. 13A) and the percentage of capped RNA (FIG. 13B) produced by IVT reactions involving multi-substituted RNA polymerase variants in the presence of GGAG tetranucleotide cap analogs.

FIGS. 14A-14E show graphs depicting functional characteristics of transcribed RNA products produced by IVT reactions involving a multi-substituted RNA polymerase variant and three different DNA templates in the presence of GGAG tetranucleotide cap analogs. Following oligo dT purification, the transcribed RNA products were analyzed for percent capped RNA (FIG. 14A), percent tailed RNA (i.e., the percentage of RNA containing a polyA tail) as determined by the Tris RP method (FIG. 14B), purity as determined by the RP HPLC method (FIG. 14C), 3' homogeneity (FIG. 14D), and amount of dsRNA (FIG. 14E).

FIGS. 15A-15E show graphs depicting functional characteristics of transcribed RNA products produced by IVT reactions involving multi-substituted RNA polymerase variants in the presence of GGAG tetranucleotide cap analogs. Following oligo dT purification, the transcribed RNA products were analyzed for RNA yield (FIG. 15A), percent capped RNA (FIG. 15B), amount of dsRNA (FIG. 15C), purity as determined by the RP HPLC method (FIG. 15D), and percent tailed RNA (i.e., the percentage of RNA containing a polyA tail) (FIG. 15E).

Detailed Description

RNA polymerase (DNA-dependent RNA polymerase) is an enzyme that catalyzes the sequential addition of ribonucleotides to the 3' end of a growing RNA strand (RNA is transcribed in the 5' → 3' direction), with nucleoside triphosphates (NTPs) serving as substrates for the enzyme and the nucleotide sequence being specified by a DNA template. Transcription relies on complementary base pairing. The two strands of the double helix separate locally, and one of the separated strands serves as the template (DNA template). RNA polymerase then catalyzes the alignment of free nucleotides on the DNA template by their complementary bases in the template. Thus, an RNA polymerase is considered to have RNA polymerase activity if it catalyzes the sequential addition of ribonucleotides to the 3' end of a growing RNA strand.
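The template-directed, base-by-base addition described above can be sketched in a few lines. This is a toy model of complementary pairing only, not of the enzyme: the function name and the seven-base template are hypothetical, and the template is assumed to be written in the 3' → 5' direction so that the RNA is built 5' → 3'.

```python
# Watson-Crick pairing from a DNA template base to the RNA base it specifies.
_TEMPLATE_TO_RNA = {"A": "U", "T": "A", "G": "C", "C": "G"}


def transcribe(template_3_to_5: str) -> str:
    """Build an RNA 5'->3' from a DNA template strand written 3'->5'.

    Each step appends one ribonucleotide to the 3' end of the growing
    strand, chosen by complementarity to the current template base.
    """
    rna = []
    for base in template_3_to_5.upper():
        if base not in _TEMPLATE_TO_RNA:
            raise ValueError(f"unexpected template base: {base!r}")
        rna.append(_TEMPLATE_TO_RNA[base])  # extend the 3' end
    return "".join(rna)


# Hypothetical template; the transcript begins with G at positions +1 and +2.
print(transcribe("CCTATAG"))  # GGAUAUC
```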

DNA-directed RNA polymerases are capable of initiating RNA synthesis without a primer; the first catalytic stage of initiation is referred to as de novo RNA synthesis. De novo synthesis is a unique phase of the transcription cycle in which the RNA polymerase binds two nucleotides rather than a nascent RNA polymer and a single nucleotide. For bacteriophage T7 RNA polymerase, transcription strongly prefers to begin with GTP at positions +1 and +2. The initiating nucleotides bind the RNA polymerase at locations distinct from those described for elongation complexes (Kennedy WP et al., J Mol Biol. 2007; 370(2): 256-68). Selection bias in favor of GTP as the initiating nucleotide is achieved through shape complementarity, extensive protein side chain interactions, and strong base-stacking interactions for the guanine moiety in the enzyme active site. Thus, an initiating GTP provides the largest stabilization force for the open promoter conformation (Kennedy et al., 2007). In some embodiments, the RNA polymerase variants of the present disclosure comprise one or more amino acid substitutions at one or more binding site residues for de novo RNA synthesis which, without being bound by theory, alter the affinity of the RNA polymerase for the cap analog of an in vitro transcription reaction, for example, such that capping efficiency is increased at low cap analog concentrations.

Thus, in some aspects, the present disclosure provides RNA polymerase variants comprising an RNA polymerase comprising an amino acid substitution at a binding site residue for de novo RNA synthesis. An RNA polymerase variant is an enzyme having RNA polymerase activity and having at least one substitution and/or modification relative to the corresponding wild-type RNA polymerase. In some embodiments, the amino acid substitution at a binding site residue is a substitution at a position selected from the group consisting of positions 350, 351, 387, 394, 425, 427, 437, 441, 506, 628, 632, 653, 657, 811, and 880 relative to a wild-type RNA polymerase, wherein the wild-type RNA polymerase comprises the amino acid sequence of SEQ ID NO: 1. In some embodiments, the amino acid substitution at a binding site residue is a substitution at a position selected from the group consisting of positions 350, 351, 387, 394, 437, 441, 506, 628, 632, 653, and 657 relative to a wild-type RNA polymerase, wherein the wild-type RNA polymerase comprises the amino acid sequence of SEQ ID NO: 1.

Structural studies of T7RNA polymerase have shown that the conformation of the N-terminal domain changes substantially between the initial and extended stages of transcription. The N-terminal domain comprises a C-helix subdomain and a promoter binding domain comprising two segments separated by a subdomain H. The promoter binding domain and bound promoter are rotated about 45 degrees when synthesizing 8-nt RNA transcripts, allowing the promoters to remain in contact as the active site expands to accommodate the growing heteroduplex. The C-helical subdomain moves moderately towards its extended conformation, while subdomain H remains in its initial stage position rather than its extended stage position more than 70 angstroms apart. Structural comparison of the T7RNA polymerase initiation and extension complexes revealed extensive conformational changes within the N-terminal 267 residues (N-terminal domain) and minimal changes in the remainder of the RNA polymerase. Rigid rotation of the promoter binding domain and refolding of the N-terminal C-helix (residues 28-71) and H (residue 151-190) subdomains are responsible for eliminating the promoter binding site, enlarging the active site and establishing an exit tunnel for the RNA transcript. In particular, residues E42-G47 of T7RNA polymerase (present in the β -loop structure in the initial complex) adopt an α -helical structure in the extension complex. Structural changes in the N-terminal domain explain the increased stability and processivity of the extension complexes (see, e.g., Durniak, K.J. et al, Science 322(5901): 553-.

In some aspects, provided herein are RNA polymerase variants that facilitate a conformational change from an RNA polymerase initiation complex to an RNA polymerase extension complex (e.g., T7RNA polymerase variants). In some embodiments, the RNA polymerase variant comprises at least one amino acid modification relative to wild-type RNA polymerase that results in a conformational change in at least one three-dimensional loop structure of the RNA polymerase variant upon transition of the RNA polymerase variant from an initiation complex to an extension complex to form a helical structure. Thus, in some embodiments, at least one amino acid modification has a high helical propensity relative to a wild-type amino acid. In some embodiments, the RNA polymerase variant comprises an amino acid substitution at one or more of positions 42, 43, 44, 45, 46, and 47 relative to a wild-type RNA polymerase, wherein the wild-type RNA polymerase comprises the amino acid sequence of SEQ ID No. 1. In some embodiments, the amino acid substitution at position 47 is G47A.

Examples of loop structures include, but are not limited to, amino acids (aa) 42-47 in the C-helix structure (e.g., aa 28-71 of SEQ ID NO: 1) of the T7 RNA polymerase initiation complex (IC) conformation and aa 257-262 in the C-linker structure of the IC (e.g., aa 258-266 of SEQ ID NO: 1).

Accordingly, some aspects of the present disclosure provide RNA polymerase variants comprising a plurality of amino acid substitutions and/or modifications relative to a wild-type RNA polymerase. In some embodiments, the RNA polymerase variant comprises an RNA polymerase comprising (a) an amino acid substitution at a binding site residue for de novo RNA synthesis; and (b) an amino acid substitution that facilitates a conformational change from the RNA polymerase initiation complex to the RNA polymerase extension complex.

Furthermore, in some embodiments, RNA polymerase variants provided herein include amino acid modifications comprising at least one additional amino acid at the C-terminus of the polymerase. In some embodiments, the at least one additional amino acid is selected from the group consisting of alanine, arginine, asparagine, aspartic acid, cysteine, glutamic acid, glutamine, glycine, histidine, isoleucine, leucine, lysine, methionine, phenylalanine, proline, serine, threonine, tryptophan, tyrosine, and valine. In some embodiments, the at least one additional amino acid is a polar amino acid. In some embodiments, the at least one additional amino acid is a non-polar amino acid. In some embodiments, the at least one additional amino acid is glycine. In some embodiments, the at least one additional amino acid is alanine. In some embodiments, the at least one additional amino acid is serine.

In some embodiments, the RNA polymerase variants of the disclosure are used to increase transcription efficiency relative to a control RNA polymerase, e.g., in an in vitro transcription reaction. For example, the use of an RNA polymerase variant can increase transcription efficiency (e.g., RNA yield and/or transcription rate) by at least 20%. In some embodiments, the use of an RNA polymerase variant increases the transcription efficiency (e.g., RNA yield and/or transcription rate) by at least 30%, at least 40%, at least 50%, at least 60%, at least 70%, at least 80%, at least 90%, or at least 100%. In some embodiments, the use of an RNA polymerase variant increases transcription efficiency by 20-100%, 20-90%, 20-80%, 20-70%, 20-60%, 20-50%, 30-100%, 30-90%, 30-80%, 30-70%, 30-60%, 30-50%, 40-100%, 40-90%, 40-80%, 40-70%, 40-60%, 40-50%, 50-100%, 50-90%, 50-80%, 50-70%, or 50-60%. In some embodiments, the use of an RNA polymerase variant increases total RNA yield by at least 30%, at least 40%, at least 50%, at least 60%, at least 70%, at least 80%, at least 90%, or at least 100%. In some embodiments, the use of an RNA polymerase variant increases total RNA yield by 20-100%, 20-90%, 20-80%, 20-70%, 20-60%, 20-50%, 30-100%, 30-90%, 30-80%, 30-70%, 30-60%, 30-50%, 40-100%, 40-90%, 40-80%, 40-70%, 40-60%, 40-50%, 50-100%, 50-90%, 50-80%, 50-70%, or 50-60%. In some embodiments, the use of an RNA polymerase variant increases the transcription rate by at least 30%, at least 40%, at least 50%, at least 60%, at least 70%, at least 80%, at least 90%, or at least 100%. In some embodiments, the use of an RNA polymerase variant increases the transcription rate by 20-100%, 20-90%, 20-80%, 20-70%, 20-60%, 20-50%, 30-100%, 30-90%, 30-80%, 30-70%, 30-60%, 30-50%, 40-100%, 40-90%, 40-80%, 40-70%, 40-60%, 40-50%, 50-100%, 50-90%, 50-80%, 50-70%, or 50-60%.
In some embodiments, the control RNA polymerase is a wild-type RNA polymerase comprising the amino acid sequence of SEQ ID NO: 1 ("wild-type T7 RNA polymerase"). In other embodiments, the control RNA polymerase is an RNA polymerase variant comprising the amino acid sequence of SEQ ID NO: 1 modified to include a G47A substitution and an additional glycine at its C-terminus ("control T7 RNA polymerase variant" or "G47A + C-terminal G T7 RNA polymerase variant" or "control RNA polymerase variant" or "G47A + C-terminal G RNA polymerase variant").
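
The percentage increases recited above can be read as relative increases over the control RNA polymerase. The following is a minimal sketch under that assumption; the function names and the example yields are hypothetical and are not taken from the data of the disclosure.

```python
def percent_increase(variant_value: float, control_value: float) -> float:
    """Relative increase of the variant over the control, in percent.

    Assumption: "increased by X%" is read as a relative increase over the
    control value; the text does not spell out the formula.
    """
    if control_value <= 0:
        raise ValueError("control value must be positive")
    return 100.0 * (variant_value - control_value) / control_value


def meets_minimum_increase(variant_value: float, control_value: float,
                           minimum_percent: float = 20.0) -> bool:
    """True if the variant exceeds the control by at least minimum_percent."""
    return percent_increase(variant_value, control_value) >= minimum_percent


# Hypothetical RNA yields (arbitrary units) from two parallel reactions.
control_yield = 5.0
variant_yield = 6.5
print(percent_increase(variant_yield, control_yield))
print(meets_minimum_increase(variant_yield, control_yield, 20.0))
```

The same calculation applies to any of the measured quantities (total RNA yield, transcription rate, yield of capped RNA), provided that the variant and control values are measured in the same units.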

Surprisingly, the data provided herein show that the use of RNA polymerase variants of the present disclosure in an in vitro transcription reaction enables the use of much lower concentrations (amounts) of cap analog to produce capped RNA in amounts equivalent to those produced using wild-type T7 RNA polymerase or a control RNA polymerase variant. In some embodiments, the use of an RNA polymerase variant of the present disclosure, e.g., in an in vitro transcription reaction, increases the yield of capped RNA when one-half the concentration of the cap analog is used in the in vitro transcription reaction. In some embodiments, the use of an RNA polymerase variant of the present disclosure, e.g., in an in vitro transcription reaction, increases the yield of capped RNA when only 25%, 50%, or 75% of the concentration of the cap analog is used in the in vitro transcription reaction. For example, the use of an RNA polymerase variant may increase the yield of capped RNA by at least 20% when only 25%, 50%, or 75% of the concentration of the cap analog is used in an in vitro transcription reaction. In some embodiments, the use of the RNA polymerase variant increases the yield of capped RNA by at least 30%, at least 40%, at least 50%, at least 60%, at least 70%, at least 80%, at least 90%, or at least 100% when only 25%, 50%, or 75% of the concentration of the cap analog is used in the in vitro transcription reaction. In some embodiments, the use of the RNA polymerase variant increases the yield of capped RNA by 20-100%, 20-90%, 20-80%, 20-70%, 20-60%, 20-50%, 30-100%, 30-90%, 30-80%, 30-70%, 30-60%, 30-50%, 40-100%, 40-90%, 40-80%, 40-70%, 40-60%, 40-50%, 50-100%, 50-90%, 50-80%, 50-70%, or 50-60% when only 25%, 50%, or 75% of the concentration of the cap analog is used in an in vitro transcription reaction. In some embodiments, the control RNA polymerase is wild-type T7 RNA polymerase.
In other embodiments, the control RNA polymerase is a control RNA polymerase variant.

In some embodiments, the total yield of capped RNA is increased by at least 30%, at least 40%, at least 50%, at least 60%, at least 70%, at least 80%, at least 90%, or at least 100% using the RNA polymerase variant. In some embodiments, the total yield of capped RNA is increased by 20-100%, 20-90%, 20-80%, 20-70%, 20-60%, 20-50%, 30-100%, 30-90%, 30-80%, 30-70%, 30-60%, 30-50%, 40-100%, 40-90%, 40-80%, 40-70%, 40-60%, 40-50%, 50-100%, 50-90%, 50-80%, 50-70%, or 50-60% using the RNA polymerase variant.

In some embodiments, the RNA polymerase variants of the present disclosure are used to increase the efficiency of co-transcriptional capping, e.g., in an in vitro transcription reaction. For example, use of an RNA polymerase variant can increase co-transcriptional capping efficiency (e.g., the percentage of transcripts comprising a cap analog) by at least 20%. In some embodiments, co-transcriptional capping efficiency (e.g., the percentage of transcripts comprising a cap analog) is increased by at least 30%, at least 40%, at least 50%, at least 60%, at least 70%, at least 80%, at least 90%, or at least 100% using an RNA polymerase variant. In some embodiments, co-transcriptional capping efficiency is increased by 20-100%, 20-90%, 20-80%, 20-70%, 20-60%, 20-50%, 30-100%, 30-90%, 30-80%, 30-70%, 30-60%, 30-50%, 40-100%, 40-90%, 40-80%, 40-70%, 40-60%, 40-50%, 50-100%, 50-90%, 50-80%, 50-70%, or 50-60% using an RNA polymerase variant. In some embodiments, the control RNA polymerase is wild-type T7 RNA polymerase. In other embodiments, the control RNA polymerase is a control RNA polymerase variant.

In some embodiments, at least 50% of the mRNA comprises a functional cap analog. For example, at least 50%, at least 55%, at least 60%, at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 95%, or 100% of the mRNA can comprise a cap analog. In some embodiments, 50%-100%, 50%-90%, 50%-80%, or 50%-70% of the mRNA comprises a cap analog.

In some embodiments, the RNA polymerase variants of the present disclosure are used, e.g., in an in vitro transcription reaction, to increase the 3' homogeneity of RNA when one-half the concentration of the cap analog is used in the in vitro transcription reaction. For example, the use of an RNA polymerase variant can increase the 3' homogeneity of RNA by at least 20% when only 25%, 50% or 75% of the cap analog concentration is used in an in vitro transcription reaction. In some embodiments, the use of the RNA polymerase variant increases 3' homogeneity by at least 30%, at least 40%, at least 50%, at least 60%, at least 70%, at least 80%, at least 90%, or at least 100% when only 25%, 50%, or 75% of the concentration of the cap analog is used in the in vitro transcription reaction. In some embodiments, the use of the RNA polymerase variant increases 3' homogeneity by 20-100%, 20-90%, 20-80%, 20-70%, 20-60%, 20-50%, 30-100%, 30-90%, 30-80%, 30-70%, 30-60%, 30-50%, 40-100%, 40-90%, 40-80%, 40-70%, 40-60%, 40-50%, 50-100%, 50-90%, 50-80%, 50-70%, or 50-60% when only 25%, 50%, or 75% of the concentration of the cap analog is used in an in vitro transcription reaction. In some embodiments, the control RNA polymerase is wild-type T7 RNA polymerase. In other embodiments, the control RNA polymerase is a control RNA polymerase variant.

In some embodiments, at least 50% of the mRNA produced in an in vitro transcription reaction comprising an RNA polymerase variant of the present disclosure exhibits 3' homogeneity. For example, at least 55%, at least 60%, at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 95%, or 100% of the mRNA exhibits 3' homogeneity. In some embodiments, 50% to 100%, 50% to 90%, 50% to 80%, or 50% to 70% of the mRNA exhibits 3' homogeneity.

In some embodiments, the mRNA produced in an in vitro transcription reaction comprising an RNA polymerase variant of the present disclosure has 3' homogeneity greater than a threshold. In some embodiments, the threshold is 50% or at least 50%. For example, the threshold may be 55%, 60%, 65%, 70%, 75%, 80%, 85%, or 90%.

In some embodiments, the RNA polymerase variants of the disclosure are used to increase transcription fidelity (e.g., to reduce the mutation rate), for example, in an in vitro transcription reaction. For example, transcription fidelity can be increased by at least 20% using an RNA polymerase variant. In some embodiments, the transcription fidelity is increased by at least 30%, at least 40%, at least 50%, at least 60%, at least 70%, at least 80%, at least 90%, or at least 100% using the RNA polymerase variant. In some embodiments, the transcription fidelity is increased by 20-100%, 20-90%, 20-80%, 20-70%, 20-60%, 20-50%, 30-100%, 30-90%, 30-80%, 30-70%, 30-60%, 30-50%, 40-100%, 40-90%, 40-80%, 40-70%, 40-60%, 40-50%, 50-100%, 50-90%, 50-80%, 50-70%, or 50-60% using the RNA polymerase variant. RNA polymerase variants of the present disclosure that increase transcription fidelity produce RNA transcripts (e.g., mRNA transcripts) that have a lower mutation rate or a lower total number of mutations than RNA transcripts produced using a control RNA polymerase. In some embodiments, the control RNA polymerase is wild-type T7 RNA polymerase. In other embodiments, the control RNA polymerase is a control RNA polymerase variant.

In some embodiments, mRNA produced using the RNA polymerase variants of the disclosure has less than 1 mutation per 100 nucleotides relative to the DNA template. For example, the mRNA produced may have less than 1 mutation per 200, 300, 400, 500, 600, 700, 800, 900, or 1000 nucleotides relative to the DNA template.
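
A threshold of the form "less than 1 mutation per N nucleotides" can be checked as sketched below. This is only an illustration: it assumes that the observed transcript has already been aligned to the expected transcript and has the same length, so that only substitutions are counted, and the sequences and function names are hypothetical.

```python
def count_mismatches(expected_rna: str, observed_rna: str) -> int:
    """Count positions at which the observed transcript differs.

    Assumption: both sequences are the same length and already aligned,
    so insertions and deletions are not handled in this sketch.
    """
    if len(expected_rna) != len(observed_rna):
        raise ValueError("sequences must have the same length")
    return sum(a != b for a, b in zip(expected_rna, observed_rna))


def fewer_than_one_per(mismatches: int, length: int, per_nucleotides: int) -> bool:
    """True if there is less than 1 mutation per `per_nucleotides` nucleotides."""
    # mismatches / length < 1 / per_nucleotides, written without division
    return mismatches * per_nucleotides < length


# Hypothetical 200-nt expected transcript and an observed read with one change.
expected = "AUGC" * 50
observed = expected[:10] + "A" + expected[11:]  # position 11: G replaced by A

n = count_mismatches(expected, observed)
print(n, fewer_than_one_per(n, len(expected), 100))
```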

In some embodiments, the RNA polymerase variants of the present disclosure are used, e.g., in an in vitro transcription reaction, to reduce the amount of double-stranded RNA (dsRNA) contamination in the in vitro transcription reaction. For example, the amount of dsRNA contamination in an in vitro transcription reaction can be reduced by at least 20% using an RNA polymerase variant. In some embodiments, the amount of dsRNA contamination in an in vitro transcription reaction is reduced by at least 30%, at least 40%, at least 50%, at least 60%, at least 70%, at least 80%, at least 90%, or at least 100% using an RNA polymerase variant. In some embodiments, the amount of dsRNA contamination in an in vitro transcription reaction is reduced by 20-100%, 20-90%, 20-80%, 20-70%, 20-60%, 20-50%, 30-100%, 30-90%, 30-80%, 30-70%, 30-60%, 30-50%, 40-100%, 40-90%, 40-80%, 40-70%, 40-60%, 40-50%, 50-100%, 50-90%, 50-80%, 50-70%, or 50-60% using an RNA polymerase variant. In some embodiments, the control RNA polymerase is wild-type T7 RNA polymerase. In other embodiments, the control RNA polymerase is a control RNA polymerase variant.

In some embodiments, the concentration of dsRNA contamination is less than 10 ng/25 μg mRNA product. In some embodiments, the concentration of dsRNA contamination is less than 5 ng/25 μg mRNA product. For example, the concentration of dsRNA contamination may be less than 4 ng/25 μg mRNA product, less than 3 ng/25 μg mRNA product, less than 2 ng/25 μg mRNA product, or less than 1 ng/25 μg mRNA product. In some embodiments, the concentration of dsRNA contamination is 0.5-1, 0.5-2, 0.5-3, 0.5-4, or 0.5-5 ng/25 μg mRNA product.

In some embodiments, mRNA produced in an in vitro transcription reaction comprising an RNA polymerase variant of the present disclosure has less than a threshold amount of dsRNA. In some embodiments, the threshold is 10 ng. In some embodiments, the threshold is 5 ng. In some embodiments, the threshold is 4 ng, 3 ng, 2 ng, or 1 ng.
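
Because the dsRNA thresholds above are expressed per 25 μg of mRNA product, a measurement made on a different amount of mRNA has to be normalized before it is compared with a threshold. The following is a minimal sketch of that normalization; the function names and the example measurement are hypothetical.

```python
def dsrna_ng_per_25ug(dsrna_ng: float, mrna_ug: float) -> float:
    """Normalize a measured dsRNA amount (ng) to ng per 25 ug of mRNA product."""
    if mrna_ug <= 0:
        raise ValueError("mRNA amount must be positive")
    return dsrna_ng * 25.0 / mrna_ug


def below_threshold(dsrna_ng: float, mrna_ug: float,
                    threshold_ng: float = 5.0) -> bool:
    """True if the normalized dsRNA amount is below threshold_ng per 25 ug."""
    return dsrna_ng_per_25ug(dsrna_ng, mrna_ug) < threshold_ng


# Hypothetical measurement: 8 ng dsRNA detected in 100 ug of mRNA product.
print(dsrna_ng_per_25ug(8.0, 100.0))     # normalized amount in ng/25 ug
print(below_threshold(8.0, 100.0, 5.0))  # comparison with a 5 ng threshold
```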

Amino acid substitutions and modifications

The RNA polymerase variants of the present disclosure comprise at least one amino acid substitution relative to wild-type (WT) RNA polymerase. For example, with respect to wild-type T7 RNA polymerase having the amino acid sequence of SEQ ID NO: 1, the glycine at position 47 is considered to be the "wild-type amino acid", while the substitution of glycine at position 47 with alanine is considered to be an "amino acid substitution" with a high helix propensity. In some embodiments, the RNA polymerase variant is a T7 RNA polymerase variant comprising at least one (one or more) amino acid substitution relative to WT RNA polymerase (e.g., wild-type T7 RNA polymerase having the amino acid sequence of SEQ ID NO: 1).

In some embodiments, the RNA polymerase variant comprises an RNA polymerase comprising (at least one) amino acid modification that causes the loop structure of the RNA polymerase variant to undergo a conformational change to form a helical structure upon transition of the RNA polymerase variant from an initiation complex to an extension complex. In some embodiments, the amino acid modification is an amino acid substitution at one or more of positions 42, 43, 44, 45, 46, and 47 relative to a wild-type RNA polymerase, wherein the wild-type RNA polymerase comprises the amino acid sequence of SEQ ID NO: 1. In some embodiments, the amino acid substitution is a high helix-propensity amino acid substitution. Examples of high helix-propensity amino acids include alanine, isoleucine, leucine, arginine, methionine, lysine, glutamine and/or glutamic acid. In some embodiments, the amino acid substitution at position 47 is G47A.

In some embodiments, the RNA polymerase variant comprises an RNA polymerase comprising an additional C-terminal amino acid relative to the wild-type RNA polymerase. In some embodiments, the additional C-terminal amino acid is selected from the group consisting of glycine, alanine, threonine, proline, glutamine, and serine. In some embodiments, the additional C-terminal amino acid (e.g., at position 884 relative to a wild-type RNA polymerase comprising the amino acid sequence of SEQ ID NO: 1) is glycine.
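
The substitution notation used in this section (wild-type residue, position, new residue, e.g., G47A) and the additional C-terminal amino acid can be applied to a sequence string as sketched below. This is an illustrative sketch only: SEQ ID NO: 1 is not reproduced here, so a hypothetical 50-residue placeholder sequence with a glycine at position 47 is used, and the function names are assumptions.

```python
import re


def apply_substitution(sequence: str, substitution: str) -> str:
    """Apply a substitution such as 'G47A' (1-based position) to a sequence."""
    match = re.fullmatch(r"([A-Z])(\d+)([A-Z])", substitution)
    if match is None:
        raise ValueError(f"not a substitution: {substitution!r}")
    wild_type, position, new = match.group(1), int(match.group(2)), match.group(3)
    if not 1 <= position <= len(sequence):
        raise ValueError(f"position {position} is outside the sequence")
    if sequence[position - 1] != wild_type:
        raise ValueError(
            f"expected {wild_type} at position {position}, "
            f"found {sequence[position - 1]}"
        )
    return sequence[:position - 1] + new + sequence[position:]


def add_c_terminal_residue(sequence: str, residue: str = "G") -> str:
    """Append one additional amino acid at the C-terminus."""
    return sequence + residue


# Hypothetical placeholder sequence (NOT SEQ ID NO: 1): glycine at position 47.
toy_wild_type = "M" + "S" * 45 + "G" + "KLE"

variant = add_c_terminal_residue(apply_substitution(toy_wild_type, "G47A"), "G")
print(variant[46], len(toy_wild_type), len(variant))
```

Checking the wild-type residue before substituting guards against numbering errors, for example when a substitution is written against a different reference sequence.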

In some embodiments, the RNA polymerase variant comprises an RNA polymerase comprising (at least one) amino acid modification at the position of a non-conserved amino acid residue. Conserved amino acid residues are amino acids or amino acid types (e.g., a single amino acid such as Gly or Ser, or a group of amino acids with similar properties, such as amino acids with acidic functional groups) that are typically shared between multiple homologous sequences of the same protein. Conserved amino acid residues can be determined using sequence alignment of homologous amino acid sequences. Sequence alignment of approximately 1000 RNA polymerase sequences obtained using Basic Local Alignment Search Tool (BLAST) searches allows the determination of 240 positions of SEQ ID NO: 1 that are most likely conserved among all RNA polymerase sequences. The 240 positions of SEQ ID NO: 1 that are most likely to be conserved in all RNA polymerase sequences are positions 5-6, 39, 269-277, 279, 281-282, 323-333, 411-448, 454-470, 472-474, 497-516, 532-560, 562-573, 626-646, 691, 693-702, 724-738, 775-794, 805-820, 828-833, 865-867 and 877-879. Thus, in some embodiments, the RNA polymerase variants of the present disclosure include RNA polymerases comprising (at least one) amino acid modification at a position other than one of positions 5-6, 39, 269-277, 279, 281-282, 323-333, 411-448, 454-470, 472-474, 497-516, 532-560, 562-573, 626-646, 691, 693-702, 724-738, 775-794, 805-820, 828-833, 865-867 and 877-879 of SEQ ID NO: 1.
In some embodiments, the RNA polymerase variants described herein may further comprise any number of amino acid modifications at any number of positions other than one of positions 5-6, 39, 269-277, 279, 281-282, 323-333, 411-448, 454-470, 472-474, 497-516, 532-560, 562-573, 626-646, 691, 693-702, 724-738, 775-794, 805-820, 828-833, 865-867 and 877-879 of SEQ ID NO: 1. In some embodiments, the RNA polymerase variant comprising the RNA polymerase of any of SEQ ID NOS.2-247 may further comprise (at least one) additional amino acid modification(s) at a position other than one of positions 5-6, 39, 269-277, 279, 281-282, 323-333, 411-448, 454-470, 472-474, 497-516, 532-560, 562-573, 626-646, 691, 693-702, 724-738, 775-794, 805-820, 828-833, 865-867 and 877-879. Conversely, amino acid positions that are not conserved are most likely modified or mutated. Thus, in some embodiments, RNA polymerase variants of the present disclosure include RNA polymerases comprising (at least one) amino acid modification at positions 1-4, 7-38, 40-268, 278, 280, 283-322, 334-410, 449-453, 471, 475-496, 517-531, 561, 574-625, 647-690, 692, 703-723, 739-774, 795-804, 821-827, 834-864, 868-876, and 880-883. In some embodiments, RNA polymerase variants comprising the RNA polymerase of any of SEQ ID NOS.2-247 may further comprise (at least one) additional amino acid modification(s) at positions 1-4, 7-38, 40-268, 278, 280, 283-.
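
The relationship between the conserved positions and the non-conserved positions listed above can be checked by expanding the conserved ranges and taking their complement. The sketch below assumes that SEQ ID NO: 1 is 883 amino acids long, which is inferred from the last non-conserved range (880-883) and from the additional C-terminal amino acid being described at position 884; the function names are illustrative.

```python
CONSERVED_SPEC = (
    "5-6, 39, 269-277, 279, 281-282, 323-333, 411-448, 454-470, 472-474, "
    "497-516, 532-560, 562-573, 626-646, 691, 693-702, 724-738, 775-794, "
    "805-820, 828-833, 865-867, 877-879"
)
WILD_TYPE_LENGTH = 883  # assumption inferred from the text, see lead-in


def expand(spec: str) -> set:
    """Expand a string such as '5-6, 39' into a set of 1-based positions."""
    positions = set()
    for part in spec.split(","):
        part = part.strip()
        if "-" in part:
            start, end = (int(x) for x in part.split("-"))
            positions.update(range(start, end + 1))
        else:
            positions.add(int(part))
    return positions


def collapse(positions) -> str:
    """Collapse positions back into a compact range string."""
    ordered = sorted(positions)
    if not ordered:
        return ""
    runs = []
    start = prev = ordered[0]
    for p in ordered[1:]:
        if p != prev + 1:  # a gap closes the current run
            runs.append((start, prev))
            start = p
        prev = p
    runs.append((start, prev))
    return ", ".join(str(a) if a == b else f"{a}-{b}" for a, b in runs)


conserved = expand(CONSERVED_SPEC)
non_conserved = set(range(1, WILD_TYPE_LENGTH + 1)) - conserved

print(len(conserved))           # number of conserved positions
print(collapse(non_conserved))  # complement, as a range string
```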

In some embodiments, an RNA polymerase variant comprising an RNA polymerase of any of SEQ ID NOs 2-247 may further comprise (at least one) amino acid modification at any amino acid position that does not disrupt the secondary or tertiary structure of the RNA polymerase protein. In some embodiments, an RNA polymerase variant comprising an RNA polymerase of any of SEQ ID NOs 2-247 may further comprise (at least one) amino acid modification at any amino acid position that does not disrupt the folding ability of the RNA polymerase protein. In some embodiments, an RNA polymerase variant comprising an RNA polymerase of any of SEQ ID NOs 2-247 may further comprise (at least one) amino acid modification at any amino acid position that does not disrupt the nucleic acid (e.g. DNA) binding ability of the RNA polymerase protein.

In some embodiments, the RNA polymerase variant comprises an RNA polymerase comprising (a) an amino acid substitution at a position selected from the group consisting of positions 350, 351, 387, 394, 425, 427, 437, 441, 506, 628, 632, 653, 657, 811, and 880 and (b) an additional amino acid substitution and/or amino acid modification at the C-terminus relative to a wild-type RNA polymerase, wherein the wild-type RNA polymerase comprises the amino acid sequence of SEQ ID NO: 1. In some embodiments, the RNA polymerase variant comprises an RNA polymerase comprising (a) an amino acid substitution at a position selected from the group consisting of positions 350, 351, 387, 394, 437, 441, 506, 628, 632, 653, and 657 and (b) an additional amino acid substitution and/or amino acid modification at the C-terminus relative to a wild-type RNA polymerase, wherein the wild-type RNA polymerase comprises the amino acid sequence of SEQ ID No. 1.

In some embodiments, the RNA polymerase variant comprises an RNA polymerase comprising an amino acid substitution at position 47 (e.g., G47A), an amino acid substitution at position 350, and/or an additional amino acid (e.g., G) at the C-terminus (at position 884) relative to a wild-type RNA polymerase, wherein the wild-type RNA polymerase comprises the amino acid sequence of SEQ ID NO: 1. In some embodiments, the RNA polymerase variant comprises an RNA polymerase comprising an amino acid substitution at position 47 (e.g., G47A), a lysine (K) at position 350 (E350K), and/or an additional amino acid (e.g., G) at the C-terminus (at position 884) relative to a wild-type RNA polymerase, wherein the wild-type RNA polymerase comprises the amino acid sequence of SEQ ID NO: 1. In some embodiments, the RNA polymerase variant comprises an RNA polymerase comprising an amino acid substitution at position 47 (e.g., G47A), an asparagine (N) at position 350 (E350N), and/or an additional amino acid (e.g., G) at the C-terminus (at position 884) relative to a wild-type RNA polymerase, wherein the wild-type RNA polymerase comprises the amino acid sequence of SEQ ID NO: 1. In some embodiments, the RNA polymerase variant comprises an RNA polymerase comprising an amino acid substitution at position 47 (e.g., G47A), an alanine (A) at position 350 (E350A), and/or an additional amino acid (e.g., G) at the C-terminus (at position 884) relative to a wild-type RNA polymerase, wherein the wild-type RNA polymerase comprises the amino acid sequence of SEQ ID NO: 1. In some embodiments, the RNA polymerase variant comprises an RNA polymerase comprising an amino acid substitution at position 47 (e.g., G47A), a tryptophan (W) at position 350 (E350W), and/or an additional amino acid (e.g., G) at the C-terminus (at position 884) relative to a wild-type RNA polymerase, wherein the wild-type RNA polymerase comprises the amino acid sequence of SEQ ID NO: 1.

In some embodiments, the RNA polymerase variant comprises an RNA polymerase comprising an amino acid substitution at position 47 (e.g., G47A), an amino acid substitution at position 351, and/or an additional amino acid (e.g., G) at the C-terminus (at position 884) relative to a wild-type RNA polymerase, wherein the wild-type RNA polymerase comprises the amino acid sequence of SEQ ID NO: 1. In some embodiments, the RNA polymerase variant comprises an RNA polymerase comprising an amino acid substitution at position 47 (e.g., G47A), a valine (V) at position 351 (D351V), and/or an additional amino acid (e.g., G) at the C-terminus (at position 884) relative to a wild-type RNA polymerase, wherein the wild-type RNA polymerase comprises the amino acid sequence of SEQ ID NO: 1.

In some embodiments, the RNA polymerase variant comprises an RNA polymerase comprising an amino acid substitution at position 47 (e.g., G47A), an amino acid substitution at position 387, and/or an additional amino acid (e.g., G) at the C-terminus (at position 884) relative to a wild-type RNA polymerase, wherein the wild-type RNA polymerase comprises the amino acid sequence of SEQ ID NO: 1. In some embodiments, the RNA polymerase variant comprises an RNA polymerase comprising an amino acid substitution at position 47 (e.g., G47A), a serine at position 387 (K387S), and/or an additional amino acid at the C-terminus (e.g., G) relative to a wild-type RNA polymerase, wherein the wild-type RNA polymerase comprises the amino acid sequence of SEQ ID NO: 1. In some embodiments, the RNA polymerase variant comprises an RNA polymerase comprising an amino acid substitution at position 47 (e.g., G47A), histidine (H) at position 387 (K387H), and/or an additional amino acid (e.g., G) at the C-terminus (at position 884) relative to a wild-type RNA polymerase, wherein the wild-type RNA polymerase comprises the amino acid sequence of SEQ ID NO: 1. In some embodiments, the RNA polymerase variant comprises an RNA polymerase comprising an amino acid substitution at position 47 (e.g., G47A), asparagine at position 387 (K387N), and/or an additional amino acid (e.g., G) at the C-terminus (at position 884) relative to a wild-type RNA polymerase, wherein the wild-type RNA polymerase comprises the amino acid sequence of SEQ ID NO: 1.

In some embodiments, the RNA polymerase variant comprises an RNA polymerase comprising an amino acid substitution at position 47 (e.g., G47A), an amino acid substitution at position 394, and/or an additional amino acid (e.g., G) at the C-terminus (at position 884) relative to a wild-type RNA polymerase, wherein the wild-type RNA polymerase comprises the amino acid sequence of SEQ ID NO: 1.

In some embodiments, the RNA polymerase variant comprises an RNA polymerase comprising an amino acid substitution at position 47 (e.g., G47A), an amino acid substitution at position 425, and/or an additional amino acid (e.g., G) at the C-terminus (at position 884) relative to a wild-type RNA polymerase, wherein the wild-type RNA polymerase comprises the amino acid sequence of SEQ ID NO: 1.

In some embodiments, the RNA polymerase variant comprises an RNA polymerase comprising an amino acid substitution at position 47 (e.g., G47A), an amino acid substitution at position 427, and/or an additional amino acid (e.g., G) at the C-terminus (at position 884) relative to a wild-type RNA polymerase, wherein the wild-type RNA polymerase comprises the amino acid sequence of SEQ ID NO: 1.

In some embodiments, the RNA polymerase variant comprises an RNA polymerase comprising an amino acid substitution at position 47 (e.g., G47A), an amino acid substitution at position 437, and/or an additional amino acid (e.g., G) at the C-terminus (at position 884) relative to a wild-type RNA polymerase, wherein the wild-type RNA polymerase comprises the amino acid sequence of SEQ ID NO: 1. In some embodiments, the RNA polymerase variant comprises an RNA polymerase comprising an amino acid substitution at position 47 (e.g., G47A), a threonine at position 437 (N437T), and/or an additional amino acid at the C-terminus (at position 884) (e.g., G) relative to a wild-type RNA polymerase, wherein the wild-type RNA polymerase comprises the amino acid sequence of SEQ ID NO: 1. In some embodiments, the RNA polymerase variant comprises an RNA polymerase comprising an amino acid substitution at position 47 (e.g., G47A), isoleucine at position 437 (N437I), and/or an additional amino acid at the C-terminus (at position 884) (e.g., G) relative to a wild-type RNA polymerase, wherein the wild-type RNA polymerase comprises the amino acid sequence of SEQ ID NO: 1. In some embodiments, the RNA polymerase variant comprises an RNA polymerase comprising an amino acid substitution at position 47 (e.g., G47A), tyrosine at position 437 (N437Y), and/or an additional amino acid at the C-terminus (at position 884) (e.g., G) relative to a wild-type RNA polymerase, wherein the wild-type RNA polymerase comprises the amino acid sequence of SEQ ID NO: 1. In some embodiments, the RNA polymerase variant comprises an RNA polymerase comprising an amino acid substitution at position 47 (e.g., G47A), a phenylalanine at position 437 (N437F), and/or an additional amino acid at the C-terminus (at position 884) (e.g., G) relative to a wild-type RNA polymerase, wherein the wild-type RNA polymerase comprises the amino acid sequence of SEQ ID NO: 1.

In some embodiments, the RNA polymerase variant comprises an RNA polymerase comprising an amino acid substitution at position 47 (e.g., G47A), an amino acid substitution at position 441, and/or an additional amino acid (e.g., G) at the C-terminus (at position 884) relative to a wild-type RNA polymerase, wherein the wild-type RNA polymerase comprises the amino acid sequence of SEQ ID NO: 1. In some embodiments, the RNA polymerase variant comprises an RNA polymerase comprising an amino acid substitution at position 47 (e.g., G47A), an arginine at position 441 (K441R), and/or an additional amino acid at the C-terminus (e.g., G) relative to a wild-type RNA polymerase, wherein the wild-type RNA polymerase comprises the amino acid sequence of SEQ ID NO: 1.

In some embodiments, the RNA polymerase variant comprises an RNA polymerase comprising an amino acid substitution at position 47 (e.g., G47A), an amino acid substitution at position 506, and/or an additional amino acid (e.g., G) at the C-terminus (at position 884) relative to a wild-type RNA polymerase, wherein the wild-type RNA polymerase comprises the amino acid sequence of SEQ ID NO: 1. In some embodiments, the RNA polymerase variant comprises an RNA polymerase comprising an amino acid substitution at position 47 (e.g., G47A), a tryptophan (W) at position 506 (D506W), and/or an additional amino acid at the C-terminus (at position 884) (e.g., G) relative to a wild-type RNA polymerase, wherein the wild-type RNA polymerase comprises the amino acid sequence of SEQ ID NO: 1. In some embodiments, the amino acid substitution at position 506 is D506A, D506R, D506N, D506C, D506E, D506Q, D506G, D506H, D506I, D506L, D506K, D506M, D506F, D506P, D506S, D506T, D506W, D506Y, or D506V.
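
A list such as the one recited above for position 506 consists of every standard amino acid other than the wild-type residue. A minimal sketch of how such a list can be enumerated follows; the function name is hypothetical, and the residue order (alphabetical by amino acid name) is chosen to match the order used in the text.

```python
# One-letter codes of the 20 standard amino acids, ordered alphabetically
# by full name (alanine, arginine, asparagine, ..., valine).
STANDARD_AMINO_ACIDS = "ARNDCEQGHILKMFPSTWYV"


def all_substitutions(wild_type: str, position: int) -> list:
    """List every single substitution at a position, e.g. D506A ... D506V."""
    if len(wild_type) != 1 or wild_type not in STANDARD_AMINO_ACIDS:
        raise ValueError(f"unknown amino acid: {wild_type!r}")
    return [
        f"{wild_type}{position}{new}"
        for new in STANDARD_AMINO_ACIDS
        if new != wild_type  # the wild-type residue itself is not a substitution
    ]


d506 = all_substitutions("D", 506)
print(len(d506))
print(", ".join(d506))
```

The same function can be used for the other positions discussed in this section, for example `all_substitutions("S", 628)`.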

In some embodiments, the RNA polymerase variant comprises an RNA polymerase comprising an amino acid substitution at position 47 (e.g., G47A), an amino acid substitution at position 628, and/or an additional amino acid (e.g., G) at the C-terminus (at position 884) relative to a wild-type RNA polymerase, wherein the wild-type RNA polymerase comprises the amino acid sequence of SEQ ID NO: 1. In some embodiments, the RNA polymerase variant comprises an RNA polymerase comprising an amino acid substitution at position 47 (e.g., G47A), a tryptophan (W) at position 628 (S628W), and/or an additional amino acid at the C-terminus (at position 884) (e.g., G) relative to a wild-type RNA polymerase, wherein the wild-type RNA polymerase comprises the amino acid sequence of SEQ ID NO: 1. In some embodiments, the amino acid substitution at position 628 is S628A, S628R, S628N, S628D, S628C, S628E, S628Q, S628G, S628H, S628I, S628L, S628K, S628M, S628F, S628P, S628T, S628W, S628Y, or S628V.

In some embodiments, the RNA polymerase variant comprises an RNA polymerase comprising an amino acid substitution at position 47 (e.g., G47A), an amino acid substitution at position 632, and/or an additional amino acid (e.g., G) at the C-terminus (at position 884) relative to a wild-type RNA polymerase, wherein the wild-type RNA polymerase comprises the amino acid sequence of SEQ ID NO: 1.

In some embodiments, the RNA polymerase variant comprises an RNA polymerase comprising an amino acid substitution at position 47 (e.g., G47A), an amino acid substitution at position 653, and/or an additional amino acid (e.g., G) at the C-terminus (at position 884) relative to a wild-type RNA polymerase, wherein the wild-type RNA polymerase comprises the amino acid sequence of SEQ ID NO: 1. In some embodiments, the RNA polymerase variant comprises an RNA polymerase comprising an amino acid substitution at position 47 (e.g., G47A), a tryptophan (W) at position 653 (D653W), and/or an additional amino acid (e.g., G) at the C-terminus (at position 884) relative to a wild-type RNA polymerase, wherein the wild-type RNA polymerase comprises the amino acid sequence of SEQ ID NO: 1. In some embodiments, the amino acid substitution at position 653 is D653A, D653R, D653N, D653C, D653E, D653Q, D653G, D653H, D653I, D653L, D653K, D653M, D653F, D653P, D653S, D653T, D653W, D653Y, or D653V.

In some embodiments, the RNA polymerase variant comprises an RNA polymerase comprising an amino acid substitution at position 47 (e.g., G47A), an amino acid substitution at position 657, and/or an additional amino acid (e.g., G) at the C-terminus (at position 884) relative to a wild-type RNA polymerase, wherein the wild-type RNA polymerase comprises the amino acid sequence of SEQ ID NO: 1. In some embodiments, the RNA polymerase variant comprises an RNA polymerase comprising an amino acid substitution at position 47 (e.g., G47A), tryptophan (W) at position 657 (P657W), and/or an additional amino acid at the C-terminus (at position 884) (e.g., G) relative to a wild-type RNA polymerase, wherein the wild-type RNA polymerase comprises the amino acid sequence of SEQ ID NO: 1. In some embodiments, the RNA polymerase variant comprises an RNA polymerase comprising an amino acid substitution at position 47 (e.g., G47A), an amino acid substitution at position 811, and/or an additional amino acid (e.g., G) at the C-terminus (at position 884) relative to a wild-type RNA polymerase, wherein the wild-type RNA polymerase comprises the amino acid sequence of SEQ ID NO: 1. In some embodiments, the amino acid substitution at position 657 is P657A, P657R, P657N, P657D, P657C, P657E, P657Q, P657G, P657H, P657I, P657L, P657K, P657M, P657F, P657S, P657T, P657W, P657Y, or P657V.
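The substitution lists given above for positions 628, 653, and 657 each name the wild-type residue, the position, and every other standard amino acid. As an illustrative sketch only (the function name `enumerate_substitutions` and the residue ordering are assumptions made for this example, not part of the disclosure), such labels could be generated as follows:

```python
# One-letter codes of the 20 standard amino acids (ordering chosen for this sketch).
STANDARD_AMINO_ACIDS = "ARNDCEQGHILKMFPSTWYV"


def enumerate_substitutions(wild_type: str, position: int) -> list[str]:
    """Return labels such as 'S628W' for every standard residue except the wild type.

    Hypothetical helper for illustration; it only builds labels and does not
    check them against any particular polymerase sequence.
    """
    if len(wild_type) != 1 or wild_type not in STANDARD_AMINO_ACIDS:
        raise ValueError(f"unknown wild-type residue: {wild_type!r}")
    if position < 1:
        raise ValueError("positions are 1-based")
    return [
        f"{wild_type}{position}{residue}"
        for residue in STANDARD_AMINO_ACIDS
        if residue != wild_type
    ]


substitutions_628 = enumerate_substitutions("S", 628)
print(len(substitutions_628))
print(", ".join(substitutions_628))
```

Each call yields 19 labels, because the wild-type residue itself is excluded. Some lists elsewhere in this disclosure name fewer substitutions, so the helper should be read as a convenience for writing out the notation rather than as a statement of which substitutions are contemplated.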

In some embodiments, the RNA polymerase variant comprises an RNA polymerase comprising an amino acid substitution at position 47 (e.g., G47A), an amino acid substitution at position 880, and/or an additional amino acid (e.g., G) at the C-terminus (at position 884) relative to a wild-type RNA polymerase, wherein the wild-type RNA polymerase comprises the amino acid sequence of SEQ ID NO: 1. In some embodiments, the RNA polymerase variant comprises an RNA polymerase comprising an amino acid substitution at position 47 (e.g., G47A), a tyrosine at position 880 (F880Y), and/or an additional amino acid at the C-terminus (e.g., G) relative to a wild-type RNA polymerase, wherein the wild-type RNA polymerase comprises the amino acid sequence of SEQ ID NO: 1.

In some embodiments, the RNA polymerase variant comprises an RNA polymerase comprising an amino acid substitution at position 47 (e.g., G47A) and an additional amino acid at the C-terminus (at position 884) relative to a wild-type RNA polymerase, wherein the wild-type RNA polymerase comprises the amino acid sequence of SEQ ID NO: 1. In some embodiments, the additional amino acid at the C-terminus is threonine (T). In some embodiments, the additional amino acid at the C-terminus is serine (S). In some embodiments, the additional amino acid at the C-terminus is alanine (A). In some embodiments, the additional amino acid at the C-terminus is proline (P).

In some embodiments, the RNA polymerase variant comprises an RNA polymerase comprising an amino acid substitution at position 350 relative to a wild-type RNA polymerase, wherein the wild-type RNA polymerase comprises the amino acid sequence of SEQ ID No. 1. In some embodiments, the amino acid substitution at position 350 is selected from the group consisting of E350R, E350K, E350H, E350D, E350Q, E350N, E350T, E350S, E350C, E350G, E350A, E350V, E350I, E350M, E350P, E350Y, E350W, and E350F. In some embodiments, the amino acid substitution at position 350 is E350R. In some embodiments, the amino acid substitution at position 350 is E350K. In some embodiments, the amino acid substitution at position 350 is E350H. In some embodiments, the amino acid substitution at position 350 is E350D. In some embodiments, the amino acid substitution at position 350 is E350Q. In some embodiments, the amino acid substitution at position 350 is E350N. In some embodiments, the amino acid substitution at position 350 is E350T. In some embodiments, the amino acid substitution at position 350 is E350S. In some embodiments, the amino acid substitution at position 350 is E350C. In some embodiments, the amino acid substitution at position 350 is E350G. In some embodiments, the amino acid substitution at position 350 is E350A. In some embodiments, the amino acid substitution at position 350 is E350V. In some embodiments, the amino acid substitution at position 350 is E350I. In some embodiments, the amino acid substitution at position 350 is E350M. In some embodiments, the amino acid substitution at position 350 is E350P. In some embodiments, the amino acid substitution at position 350 is E350Y. In some embodiments, the amino acid substitution at position 350 is E350W. In some embodiments, the amino acid substitution at position 350 is E350F.

In some embodiments, the RNA polymerase variant comprises an RNA polymerase comprising an amino acid substitution at position 351 relative to a wild-type RNA polymerase, wherein the wild-type RNA polymerase comprises the amino acid sequence of SEQ ID No. 1. In some embodiments, the amino acid substitution at position 351 is selected from the group consisting of D351R, D351K, D351H, D351E, D351Q, D351N, D351T, D351S, D351C, D351G, D351A, D351V, D351I, D351M, D351P, D351Y, D351W, and D351F. In some embodiments, the amino acid substitution at position 351 is D351R. In some embodiments, the amino acid substitution at position 351 is D351K. In some embodiments, the amino acid substitution at position 351 is D351H. In some embodiments, the amino acid substitution at position 351 is D351E. In some embodiments, the amino acid substitution at position 351 is D351Q. In some embodiments, the amino acid substitution at position 351 is D351N. In some embodiments, the amino acid substitution at position 351 is D351T. In some embodiments, the amino acid substitution at position 351 is D351S. In some embodiments, the amino acid substitution at position 351 is D351C. In some embodiments, the amino acid substitution at position 351 is D351G. In some embodiments, the amino acid substitution at position 351 is D351A. In some embodiments, the amino acid substitution at position 351 is D351V. In some embodiments, the amino acid substitution at position 351 is D351I. In some embodiments, the amino acid substitution at position 351 is D351M. In some embodiments, the amino acid substitution at position 351 is D351P. In some embodiments, the amino acid substitution at position 351 is D351Y. In some embodiments, the amino acid substitution at position 351 is D351W. In some embodiments, the amino acid substitution at position 351 is D351F.

In some embodiments, the RNA polymerase variant comprises an RNA polymerase comprising an amino acid substitution at position 387 relative to a wild type RNA polymerase, wherein the wild type RNA polymerase comprises the amino acid sequence of SEQ ID No. 1. In some embodiments, the amino acid substitution at position 387 is selected from the group consisting of K387R, K387H, K387E, K387D, K387Q, K387N, K387T, K387S, K387C, K387G, K387A, K387V, K387I, K387M, K387P, K387Y, K387W, and K387F. In some embodiments, the amino acid substitution at position 387 is K387R. In some embodiments, the amino acid substitution at position 387 is K387H. In some embodiments, the amino acid substitution at position 387 is K387E. In some embodiments, the amino acid substitution at position 387 is K387D. In some embodiments, the amino acid substitution at position 387 is K387Q. In some embodiments, the amino acid substitution at position 387 is K387N. In some embodiments, the amino acid substitution at position 387 is K387T. In some embodiments, the amino acid substitution at position 387 is K387S. In some embodiments, the amino acid substitution at position 387 is K387C. In some embodiments, the amino acid substitution at position 387 is K387G. In some embodiments, the amino acid substitution at position 387 is K387A. In some embodiments, the amino acid substitution at position 387 is K387V. In some embodiments, the amino acid substitution at position 387 is K387I. In some embodiments, the amino acid substitution at position 387 is K387M. In some embodiments, the amino acid substitution at position 387 is K387P. In some embodiments, the amino acid substitution at position 387 is K387Y. In some embodiments, the amino acid substitution at position 387 is K387W. In some embodiments, the amino acid substitution at position 387 is K387F.

In some embodiments, the RNA polymerase variant comprises an RNA polymerase comprising an amino acid substitution at position 394 relative to a wild-type RNA polymerase, wherein the wild-type RNA polymerase comprises the amino acid sequence of SEQ ID No. 1. In some embodiments, the amino acid substitution at position 394 is selected from the group consisting of R394K, R394H, R394E, R394D, R394Q, R394N, R394T, R394S, R394C, R394G, R394A, R394V, R394I, R394M, R394P, R394Y, R394W, and R394F. In some embodiments, the amino acid substitution at position 394 is R394K. In some embodiments, the amino acid substitution at position 394 is R394H. In some embodiments, the amino acid substitution at position 394 is R394E. In some embodiments, the amino acid substitution at position 394 is R394D. In some embodiments, the amino acid substitution at position 394 is R394Q. In some embodiments, the amino acid substitution at position 394 is R394N. In some embodiments, the amino acid substitution at position 394 is R394T. In some embodiments, the amino acid substitution at position 394 is R394S. In some embodiments, the amino acid substitution at position 394 is R394C. In some embodiments, the amino acid substitution at position 394 is R394G. In some embodiments, the amino acid substitution at position 394 is R394A. In some embodiments, the amino acid substitution at position 394 is R394V. In some embodiments, the amino acid substitution at position 394 is R394I. In some embodiments, the amino acid substitution at position 394 is R394M. In some embodiments, the amino acid substitution at position 394 is R394P. In some embodiments, the amino acid substitution at position 394 is R394Y. In some embodiments, the amino acid substitution at position 394 is R394W. In some embodiments, the amino acid substitution at position 394 is R394F.

In some embodiments, the RNA polymerase variant comprises an RNA polymerase comprising an amino acid substitution at position 425 relative to a wild-type RNA polymerase, wherein the wild-type RNA polymerase comprises the amino acid sequence of SEQ ID No. 1. In some embodiments, the amino acid substitution at position 425 is selected from the group consisting of R425K, R425H, R425E, R425D, R425Q, R425N, R425T, R425S, R425C, R425G, R425A, R425V, R425I, R425M, R425P, R425Y, R425W, and R425F. In some embodiments, the amino acid substitution at position 425 is R425K. In some embodiments, the amino acid substitution at position 425 is R425H. In some embodiments, the amino acid substitution at position 425 is R425E. In some embodiments, the amino acid substitution at position 425 is R425D. In some embodiments, the amino acid substitution at position 425 is R425Q. In some embodiments, the amino acid substitution at position 425 is R425N. In some embodiments, the amino acid substitution at position 425 is R425T. In some embodiments, the amino acid substitution at position 425 is R425S. In some embodiments, the amino acid substitution at position 425 is R425C. In some embodiments, the amino acid substitution at position 425 is R425G. In some embodiments, the amino acid substitution at position 425 is R425A. In some embodiments, the amino acid substitution at position 425 is R425V. In some embodiments, the amino acid substitution at position 425 is R425I. In some embodiments, the amino acid substitution at position 425 is R425M. In some embodiments, the amino acid substitution at position 425 is R425P. In some embodiments, the amino acid substitution at position 425 is R425Y. In some embodiments, the amino acid substitution at position 425 is R425W. In some embodiments, the amino acid substitution at position 425 is R425F.

In some embodiments, the RNA polymerase variant comprises an RNA polymerase comprising an amino acid substitution at position 427 relative to a wild type RNA polymerase, wherein the wild type RNA polymerase comprises the amino acid sequence of SEQ ID No. 1. In some embodiments, the amino acid substitution at position 427 is selected from the group consisting of Y427R, Y427K, Y427H, Y427E, Y427D, Y427Q, Y427N, Y427T, Y427S, Y427C, Y427G, Y427A, Y427V, Y427I, Y427M, Y427P, Y427W, and Y427F. In some embodiments, the amino acid substitution at position 427 is Y427R. In some embodiments, the amino acid substitution at position 427 is Y427K. In some embodiments, the amino acid substitution at position 427 is Y427H. In some embodiments, the amino acid substitution at position 427 is Y427E. In some embodiments, the amino acid substitution at position 427 is Y427D. In some embodiments, the amino acid substitution at position 427 is Y427Q. In some embodiments, the amino acid substitution at position 427 is Y427N. In some embodiments, the amino acid substitution at position 427 is Y427T. In some embodiments, the amino acid substitution at position 427 is Y427S. In some embodiments, the amino acid substitution at position 427 is Y427C. In some embodiments, the amino acid substitution at position 427 is Y427G. In some embodiments, the amino acid substitution at position 427 is Y427A. In some embodiments, the amino acid substitution at position 427 is Y427V. In some embodiments, the amino acid substitution at position 427 is Y427I. In some embodiments, the amino acid substitution at position 427 is Y427M. In some embodiments, the amino acid substitution at position 427 is Y427P. In some embodiments, the amino acid substitution at position 427 is Y427W. In some embodiments, the amino acid substitution at position 427 is Y427F.

In some embodiments, the RNA polymerase variant comprises an RNA polymerase comprising an amino acid substitution at position 437 relative to a wild type RNA polymerase, wherein the wild type RNA polymerase comprises the amino acid sequence of SEQ ID No. 1. In some embodiments, the amino acid substitution at position 437 is selected from the group consisting of N437R, N437K, N437H, N437E, N437D, N437Q, N437T, N437S, N437C, N437G, N437A, N437V, N437I, N437M, N437P, N437Y, N437W, and N437F. In some embodiments, the amino acid substitution at position 437 is N437R. In some embodiments, the amino acid substitution at position 437 is N437K. In some embodiments, the amino acid substitution at position 437 is N437H. In some embodiments, the amino acid substitution at position 437 is N437E. In some embodiments, the amino acid substitution at position 437 is N437D. In some embodiments, the amino acid substitution at position 437 is N437Q. In some embodiments, the amino acid substitution at position 437 is N437T. In some embodiments, the amino acid substitution at position 437 is N437S. In some embodiments, the amino acid substitution at position 437 is N437C. In some embodiments, the amino acid substitution at position 437 is N437G. In some embodiments, the amino acid substitution at position 437 is N437A. In some embodiments, the amino acid substitution at position 437 is N437V. In some embodiments, the amino acid substitution at position 437 is N437I. In some embodiments, the amino acid substitution at position 437 is N437M. In some embodiments, the amino acid substitution at position 437 is N437P. In some embodiments, the amino acid substitution at position 437 is N437Y. In some embodiments, the amino acid substitution at position 437 is N437W. In some embodiments, the amino acid substitution at position 437 is N437F.

In some embodiments, the RNA polymerase variant comprises an RNA polymerase comprising an amino acid substitution at position 441 relative to a wild-type RNA polymerase, wherein the wild-type RNA polymerase comprises the amino acid sequence of SEQ ID No. 1. In some embodiments, the amino acid substitution at position 441 is selected from the group consisting of K441R, K441H, K441E, K441D, K441Q, K441N, K441T, K441S, K441C, K441G, K441A, K441V, K441I, K441M, K441P, K441Y, K441W, and K441F. In some embodiments, the amino acid substitution at position 441 is K441R. In some embodiments, the amino acid substitution at position 441 is K441H. In some embodiments, the amino acid substitution at position 441 is K441E. In some embodiments, the amino acid substitution at position 441 is K441D. In some embodiments, the amino acid substitution at position 441 is K441Q. In some embodiments, the amino acid substitution at position 441 is K441N. In some embodiments, the amino acid substitution at position 441 is K441T. In some embodiments, the amino acid substitution at position 441 is K441S. In some embodiments, the amino acid substitution at position 441 is K441C. In some embodiments, the amino acid substitution at position 441 is K441G. In some embodiments, the amino acid substitution at position 441 is K441A. In some embodiments, the amino acid substitution at position 441 is K441V. In some embodiments, the amino acid substitution at position 441 is K441I. In some embodiments, the amino acid substitution at position 441 is K441M. In some embodiments, the amino acid substitution at position 441 is K441P. In some embodiments, the amino acid substitution at position 441 is K441Y. In some embodiments, the amino acid substitution at position 441 is K441W. In some embodiments, the amino acid substitution at position 441 is K441F.

In some embodiments, the RNA polymerase variant comprises an RNA polymerase comprising an amino acid substitution at position 632 relative to a wild-type RNA polymerase, wherein the wild-type RNA polymerase comprises the amino acid sequence of SEQ ID No. 1. In some embodiments, the amino acid substitution at position 632 is selected from the group consisting of R632K, R632H, R632E, R632D, R632Q, R632N, R632T, R632S, R632C, R632G, R632A, R632V, R632I, R632M, R632P, R632Y, R632W, and R632F. In some embodiments, the amino acid substitution at position 632 is R632K. In some embodiments, the amino acid substitution at position 632 is R632H. In some embodiments, the amino acid substitution at position 632 is R632E. In some embodiments, the amino acid substitution at position 632 is R632D. In some embodiments, the amino acid substitution at position 632 is R632Q. In some embodiments, the amino acid substitution at position 632 is R632N. In some embodiments, the amino acid substitution at position 632 is R632T. In some embodiments, the amino acid substitution at position 632 is R632S. In some embodiments, the amino acid substitution at position 632 is R632C. In some embodiments, the amino acid substitution at position 632 is R632G. In some embodiments, the amino acid substitution at position 632 is R632A. In some embodiments, the amino acid substitution at position 632 is R632V. In some embodiments, the amino acid substitution at position 632 is R632I. In some embodiments, the amino acid substitution at position 632 is R632M. In some embodiments, the amino acid substitution at position 632 is R632P. In some embodiments, the amino acid substitution at position 632 is R632Y. In some embodiments, the amino acid substitution at position 632 is R632W. In some embodiments, the amino acid substitution at position 632 is R632F.

In some embodiments, the RNA polymerase variant comprises an RNA polymerase comprising an amino acid substitution at position 811 relative to a wild type RNA polymerase, wherein the wild type RNA polymerase comprises the amino acid sequence of SEQ ID No. 1. In some embodiments, the amino acid substitution at position 811 is selected from the group consisting of H811R, H811K, H811E, H811D, H811Q, H811N, H811T, H811S, H811C, H811G, H811A, H811V, H811I, H811M, H811P, H811Y, H811W, and H811F. In some embodiments, the amino acid substitution at position 811 is H811R. In some embodiments, the amino acid substitution at position 811 is H811K. In some embodiments, the amino acid substitution at position 811 is H811E. In some embodiments, the amino acid substitution at position 811 is H811D. In some embodiments, the amino acid substitution at position 811 is H811Q. In some embodiments, the amino acid substitution at position 811 is H811N. In some embodiments, the amino acid substitution at position 811 is H811T. In some embodiments, the amino acid substitution at position 811 is H811S. In some embodiments, the amino acid substitution at position 811 is H811C. In some embodiments, the amino acid substitution at position 811 is H811G. In some embodiments, the amino acid substitution at position 811 is H811A. In some embodiments, the amino acid substitution at position 811 is H811V. In some embodiments, the amino acid substitution at position 811 is H811I. In some embodiments, the amino acid substitution at position 811 is H811M. In some embodiments, the amino acid substitution at position 811 is H811P. In some embodiments, the amino acid substitution at position 811 is H811Y. In some embodiments, the amino acid substitution at position 811 is H811W. In some embodiments, the amino acid substitution at position 811 is H811F.

In some embodiments, the RNA polymerase variant comprises an RNA polymerase comprising an amino acid substitution at position 880 relative to a wild-type RNA polymerase, wherein the wild-type RNA polymerase comprises the amino acid sequence of SEQ ID No. 1. In some embodiments, the amino acid substitution at position 880 is selected from the group consisting of F880R, F880K, F880H, F880E, F880D, F880Q, F880N, F880T, F880S, F880C, F880G, F880A, F880V, F880I, F880M, F880P, F880Y, and F880W. In some embodiments, the amino acid substitution at position 880 is F880R. In some embodiments, the amino acid substitution at position 880 is F880K. In some embodiments, the amino acid substitution at position 880 is F880H. In some embodiments, the amino acid substitution at position 880 is F880E. In some embodiments, the amino acid substitution at position 880 is F880D. In some embodiments, the amino acid substitution at position 880 is F880Q. In some embodiments, the amino acid substitution at position 880 is F880N. In some embodiments, the amino acid substitution at position 880 is F880T. In some embodiments, the amino acid substitution at position 880 is F880S. In some embodiments, the amino acid substitution at position 880 is F880C. In some embodiments, the amino acid substitution at position 880 is F880G. In some embodiments, the amino acid substitution at position 880 is F880A. In some embodiments, the amino acid substitution at position 880 is F880V. In some embodiments, the amino acid substitution at position 880 is F880I. In some embodiments, the amino acid substitution at position 880 is F880M. In some embodiments, the amino acid substitution at position 880 is F880P. In some embodiments, the amino acid substitution at position 880 is F880Y. In some embodiments, the amino acid substitution at position 880 is F880W.

It is to be understood that RNA polymerase variants of the present disclosure can include more than one (e.g., 2, 3, 4, 5, or more) amino acid substitutions and/or modifications. It will also be appreciated that any RNA polymerase variant may include a G47A substitution and/or an additional C-terminal amino acid, such as glycine, relative to the wild-type RNA polymerase comprising the amino acid sequence of SEQ ID NO: 1.
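The substitution notation used throughout (wild-type residue, 1-based position, replacement residue, e.g., G47A) and the optional additional C-terminal amino acid can be illustrated with a short sketch. The function `apply_variant` and the six-residue toy sequence below are assumptions made purely for illustration; the toy sequence is not SEQ ID NO: 1, and the positions used are chosen to fit it.

```python
import re

# Pattern for labels such as "G47A": wild-type residue, 1-based position, new residue.
_SUBSTITUTION = re.compile(r"^([A-Z])(\d+)([A-Z])$")


def apply_variant(wild_type_seq: str, substitutions: list[str],
                  c_terminal_addition: str = "") -> str:
    """Apply point substitutions and an optional C-terminal addition.

    Hypothetical helper for illustration. Each label is checked against the
    wild-type sequence so that a mistyped label (e.g., a wrong position) is
    reported instead of being applied silently.
    """
    residues = list(wild_type_seq)
    for label in substitutions:
        match = _SUBSTITUTION.match(label)
        if match is None:
            raise ValueError(f"malformed substitution label: {label!r}")
        old, position, new = match.group(1), int(match.group(2)), match.group(3)
        if not 1 <= position <= len(wild_type_seq):
            raise ValueError(f"position out of range in {label!r}")
        if wild_type_seq[position - 1] != old:
            raise ValueError(
                f"{label!r} expects {old} at position {position}, "
                f"found {wild_type_seq[position - 1]}"
            )
        residues[position - 1] = new
    # The added residue is appended after the last wild-type position.
    return "".join(residues) + c_terminal_addition


toy_wild_type = "MGSEDK"  # toy sequence, NOT SEQ ID NO: 1
variant = apply_variant(toy_wild_type, ["G2A", "E4K"], c_terminal_addition="G")
print(variant)  # MASKDKG
```

Validating each label against the reference sequence is a design choice of this sketch. Appending the extra residue after the final wild-type position mirrors the way the text refers to the additional C-terminal amino acid as position 884.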

In some embodiments, the RNA polymerase variant comprises an RNA polymerase comprising (a) amino acid substitutions at positions 350, 351, and 387 and (b) additional amino acid substitutions and/or amino acid modifications at the C-terminus relative to a wild-type RNA polymerase, wherein the wild-type RNA polymerase comprises the amino acid sequence of SEQ ID NO: 1. In some embodiments, the additional amino acid substitution at position 350 is E350A. In some embodiments, the additional amino acid substitution at position 350 is E350K. In some embodiments, the additional amino acid substitution at position 350 is E350N. In some embodiments, the additional amino acid substitution at position 350 is E350W. In some embodiments, the additional amino acid substitution at position 351 is D351V. In some embodiments, the additional amino acid substitution at position 387 is K387S. In some embodiments, the additional amino acid substitution at position 387 is K387H. In some embodiments, the additional amino acid substitution at position 387 is K387N. In some embodiments, the RNA polymerase variant comprises a G47A substitution. In some embodiments, the RNA polymerase variant comprises an additional glycine at the C-terminus.

In some embodiments, the RNA polymerase variant comprises an RNA polymerase comprising (a) amino acid substitutions at positions 437 and 441 and (b) additional amino acid substitutions and/or amino acid modifications at the C-terminus relative to a wild-type RNA polymerase, wherein the wild-type RNA polymerase comprises the amino acid sequence of SEQ ID NO: 1. In some embodiments, the additional amino acid substitution at position 437 is N437T. In some embodiments, the additional amino acid substitution at position 437 is N437Y. In some embodiments, the additional amino acid substitution at position 437 is N437I. In some embodiments, the additional amino acid substitution at position 437 is N437F. In some embodiments, the additional amino acid substitution at position 441 is K441R. In some embodiments, the RNA polymerase variant comprises a G47A substitution. In some embodiments, the RNA polymerase variant comprises an additional glycine at the C-terminus.

In some embodiments, the RNA polymerase variant comprises an RNA polymerase comprising (a) an amino acid substitution at position 880 and (b) an amino acid modification at the C-terminus relative to a wild-type RNA polymerase, wherein the wild-type RNA polymerase comprises the amino acid sequence of SEQ ID NO: 1. In some embodiments, the additional amino acid substitution at position 880 is F880Y. In some embodiments, the C-terminal amino acid modification is an additional alanine (A). In some embodiments, the C-terminal amino acid modification is an additional serine (S). In some embodiments, the C-terminal amino acid modification is an additional threonine (T). In some embodiments, the C-terminal amino acid modification is an additional proline (P). In some embodiments, the RNA polymerase variant comprises a G47A substitution.

In some embodiments, the RNA polymerase variant comprises an RNA polymerase comprising (a) amino acid substitutions at positions 632, 653, and 657 and (b) additional amino acid substitutions and/or amino acid modifications at the C-terminus relative to a wild-type RNA polymerase, wherein the wild-type RNA polymerase comprises the amino acid sequence of SEQ ID NO: 1. In some embodiments, the additional amino acid substitution at position 632 is R632K. In some embodiments, the additional amino acid substitution at position 632 is R632T. In some embodiments, the additional amino acid substitution at position 653 is D653T. In some embodiments, the additional amino acid substitution at position 653 is D653K. In some embodiments, the additional amino acid substitution at position 657 is P657W. In some embodiments, the additional amino acid substitution at position 657 is P657R. In some embodiments, the additional amino acid substitution at position 657 is P657A. In some embodiments, the RNA polymerase variant comprises a G47A substitution. In some embodiments, the RNA polymerase variant comprises an additional glycine at the C-terminus.

In some embodiments, the RNA polymerase variant comprises an RNA polymerase comprising (a) amino acid substitutions at positions 628, 632, 653, and 657 and (b) an additional amino acid substitution and/or amino acid modification at the C-terminus relative to a wild-type RNA polymerase, wherein the wild-type RNA polymerase comprises the amino acid sequence of SEQ ID NO: 1. In some embodiments, the additional amino acid substitution at position 628 is S628W. In some embodiments, the additional amino acid substitution at position 632 is R632K. In some embodiments, the additional amino acid substitution at position 632 is R632T. In some embodiments, the additional amino acid substitution at position 653 is D653T. In some embodiments, the additional amino acid substitution at position 653 is D653K. In some embodiments, the additional amino acid substitution at position 657 is P657W. In some embodiments, the additional amino acid substitution at position 657 is P657R. In some embodiments, the additional amino acid substitution at position 657 is P657A. In some embodiments, the RNA polymerase variant comprises a G47A substitution. In some embodiments, the RNA polymerase variant comprises an additional glycine at the C-terminus.

In some embodiments, the RNA polymerase variant comprises an RNA polymerase comprising (a) amino acid substitutions at positions 387, 657, and 884 and (b) an additional amino acid substitution and/or amino acid modification at the C-terminus relative to a wild-type RNA polymerase, wherein the wild-type RNA polymerase comprises the amino acid sequence of SEQ ID NO: 1.

It is also understood that the present disclosure includes RNA polymerases that have at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% identity to the RNA polymerase variants described herein. It is also understood that any of the RNA polymerase variants described herein may have at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, or at least 95% identity to an RNA polymerase comprising the amino acid sequence of SEQ ID NO: 1.

The term "identity" refers to the relationship between the sequences of two or more polypeptides (e.g., enzymes) or polynucleotides (nucleic acids) as determined by comparing the sequences. Identity also refers to the degree of sequence relatedness between or among sequences as determined by the number of matches between two or more amino acid residues or strings of nucleic acid residues. An identity metric measures the percentage of identical matches between the smaller of two or more sequences with gap alignments (if any) addressed by a particular mathematical model or computer program (e.g., an "algorithm"). The identity of related proteins or nucleic acids can be readily calculated by known methods. "Percent (%) identity," when applied to a polypeptide or polynucleotide sequence, is defined as the percentage of residues (amino acid residues or nucleic acid residues) in a candidate amino acid or nucleic acid sequence that are identical to the residues in the amino acid sequence or nucleic acid sequence of a second sequence, after aligning the sequences and introducing gaps, if necessary, to achieve the maximum percent identity. Methods and computer programs for alignment are well known in the art. It will be appreciated that identity depends on the calculation of percent identity, but that the value of identity may vary due to gaps and penalties introduced in the calculation. Typically, a variant of a particular polynucleotide or polypeptide (e.g., an antigen) has at least 40%, 45%, 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99%, but less than 100%, sequence identity to that particular reference polynucleotide or polypeptide as determined by sequence alignment programs and parameters described herein and known to those of skill in the art. Such alignment tools include the BLAST suite of tools (Stephen F. Altschul et al. (1997), "Gapped BLAST and PSI-BLAST: a new generation of protein database search programs," Nucleic Acids Res. 25:3389-3402). Another popular local alignment technique is based on the Smith-Waterman algorithm (Smith, T.F. and Waterman, M.S. (1981) "Identification of common molecular subsequences," J. Mol. Biol. 147:195-197). A general global alignment technique based on dynamic programming is the Needleman-Wunsch algorithm (Needleman, S.B. and Wunsch, C.D. (1970) "A general method applicable to the search for similarities in the amino acid sequence of two proteins," J. Mol. Biol. 48:443). A Fast Optimal Global Sequence Alignment Algorithm (FOGSAA) has more recently been developed, which is said to produce global alignments of nucleotide and protein sequences faster than other optimal global alignment methods, including the Needleman-Wunsch algorithm.
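As a non-limiting illustration of how a percent identity value might be obtained, the following sketch performs a Needleman-Wunsch global alignment and then counts identical aligned residues relative to the shorter sequence, consistent with the definition above. The scoring values (match +1, mismatch -1, linear gap penalty -1), the function names, and the toy sequences are assumptions chosen for brevity; practical tools typically use substitution matrices and affine gap penalties, which can give different values.

```python
def needleman_wunsch(a: str, b: str, match: int = 1, mismatch: int = -1,
                     gap: int = -1) -> tuple[str, str]:
    """Globally align a and b; return the two gapped strings ('-' marks a gap)."""
    n, m = len(a), len(b)
    # score[i][j] holds the best score for aligning a[:i] with b[:j].
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            pair = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(score[i - 1][j - 1] + pair,
                              score[i - 1][j] + gap,
                              score[i][j - 1] + gap)
    # Trace back from the bottom-right corner to recover one optimal alignment.
    aligned_a, aligned_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        pair = match if i > 0 and j > 0 and a[i - 1] == b[j - 1] else mismatch
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + pair:
            aligned_a.append(a[i - 1]); aligned_b.append(b[j - 1])
            i -= 1; j -= 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            aligned_a.append(a[i - 1]); aligned_b.append("-")
            i -= 1
        else:
            aligned_a.append("-"); aligned_b.append(b[j - 1])
            j -= 1
    return "".join(reversed(aligned_a)), "".join(reversed(aligned_b))


def percent_identity(a: str, b: str) -> float:
    """Identical aligned residues as a percentage of the shorter sequence length."""
    if not a or not b:
        raise ValueError("both sequences must be non-empty")
    aligned_a, aligned_b = needleman_wunsch(a, b)
    matches = sum(x == y and x != "-" for x, y in zip(aligned_a, aligned_b))
    return 100.0 * matches / min(len(a), len(b))


# Toy sequences only; these are not sequences from the disclosure.
print(percent_identity("MASKDK", "MASKDK"))  # 100.0
print(round(percent_identity("MGSEDK", "MASKDK"), 1))
```

Because the denominator here is the length of the shorter sequence, a sequence that differs from a reference only by one additional C-terminal residue still scores 100% in this sketch. Other conventions divide by the alignment length or the reference length, which is one reason reported identity values can vary.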

Nucleotide cap analogs

Also provided herein are co-transcriptional capping methods for ribonucleic acid (RNA) synthesis using any of the RNA polymerase variants described herein. That is, RNA is produced in a "one-pot" reaction, without the need for a separate capping reaction. Thus, in some embodiments, the method comprises reacting a polynucleotide template with an RNA polymerase variant, nucleoside triphosphates, and a cap analog under in vitro transcription reaction conditions to produce an RNA transcript.

The cap analog can be, for example, a dinucleotide cap, a trinucleotide cap, or a tetranucleotide cap. In some embodiments, the cap analog is a dinucleotide cap. In some embodiments, the cap analog is a trinucleotide cap. In some embodiments, the cap analog is a tetranucleotide cap.

In some embodiments, a nucleotide cap (e.g., a trinucleotide cap or a tetranucleotide cap) comprises a compound of formula (I)

or a stereoisomer, tautomer, or salt thereof, wherein:

ring B1 is a modified or unmodified guanine;

ring B2 and ring B3 are each independently a nucleobase or a modified nucleobase;

X2 is O, S(O)p, NR24, or CR25R26, wherein p is 0, 1, or 2;

Y0 is O or CR6R7;

Y1 is O, S(O)n, CR6R7, or NR8, wherein n is 0, 1, or 2;

each --- is a single bond or is absent, wherein when each --- is a single bond, Y1 is O, S(O)n, CR6R7, or NR8; and when each --- is absent, Y1 is void;

Y2 is (OP(O)R4)m, wherein m is 0, 1, or 2, or -O-(CR40R41)u-Q0-(CR42R43)v-, wherein Q0 is a bond, O, S(O)r, NR44, or CR45R46, r is 0, 1, or 2, and u and v are each independently 1, 2, 3, or 4;

each R2 and R2' is independently halo, LNA, or OR3;

each R3 is independently H, C1-C6 alkyl, C2-C6 alkenyl, or C2-C6 alkynyl, and when R3 is C1-C6 alkyl, C2-C6 alkenyl, or C2-C6 alkynyl, it is optionally substituted with one or more of halo, OH, and C1-C6 alkoxy (itself optionally substituted with one or more OH or OC(O)-C1-C6 alkyl);

each R4 and R4' is independently H, halo, C1-C6 alkyl, OH, SH, SeH, or BH3-;

R6, R7, and R8 are each independently -Q1-T1, wherein Q1 is a bond or a C1-C3 alkyl linker optionally substituted with one or more of halo, cyano, OH, and C1-C6 alkoxy, and T1 is H, halo, OH, COOH, cyano, or Rs1, wherein Rs1 is C1-C3 alkyl, C2-C6 alkenyl, C2-C6 alkynyl, C1-C6 alkoxy, C(O)O-C1-C6 alkyl, C3-C8 cycloalkyl, C6-C10 aryl, NR31R32, (NR31R32R33)+, 4- to 12-membered heterocycloalkyl, or 5- or 6-membered heteroaryl, and Rs1 is optionally substituted with one or more substituents selected from the group consisting of: halo, OH, oxo, C1-C6 alkyl, COOH, C(O)O-C1-C6 alkyl, cyano, C1-C6 alkoxy, NR31R32, (NR31R32R33)+, C3-C8 cycloalkyl, C6-C10 aryl, 4- to 12-membered heterocycloalkyl, and 5- or 6-membered heteroaryl;

R10, R11, R12, R13, R14, and R15 are each independently -Q2-T2, wherein Q2 is a bond or a C1-C3 alkyl linker optionally substituted with one or more of halo, cyano, OH, and C1-C6 alkoxy, and T2 is H, halo, OH, NH2, cyano, NO2, N3, Rs2, or ORs2, wherein Rs2 is C1-C6 alkyl, C2-C6 alkenyl, C2-C6 alkynyl, C3-C8 cycloalkyl, C6-C10 aryl, NHC(O)-C1-C6 alkyl, NR31R32, (NR31R32R33)+, 4- to 12-membered heterocycloalkyl, or 5- or 6-membered heteroaryl, and Rs2 is optionally substituted with one or more substituents selected from the group consisting of: halo, OH, oxo, C1-C6 alkyl, COOH, C(O)O-C1-C6 alkyl, cyano, C1-C6 alkoxy, NR31R32, (NR31R32R33)+, C3-C8 cycloalkyl, C6-C10 aryl, 4- to 12-membered heterocycloalkyl, and 5- or 6-membered heteroaryl; or, alternatively, R12 and R14 together are oxo, or R13 and R15 together are oxo;

R20, R21, R22, and R23 are each independently -Q3-T3, wherein Q3 is a bond or a C1-C3 alkyl linker optionally substituted with one or more of halo, cyano, OH, and C1-C6 alkoxy, and T3 is H, halo, OH, NH2, cyano, NO2, N3, Rs3, or ORs3, wherein Rs3 is C1-C6 alkyl, C2-C6 alkenyl, C2-C6 alkynyl, C3-C8 cycloalkyl, C6-C10 aryl, NHC(O)-C1-C6 alkyl, mono-C1-C6 alkylamino, di-C1-C6 alkylamino, 4- to 12-membered heterocycloalkyl, or 5- or 6-membered heteroaryl, and Rs3 is optionally substituted with one or more substituents selected from the group consisting of: halo, OH, oxo, C1-C6 alkyl, COOH, C(O)O-C1-C6 alkyl, cyano, C1-C6 alkoxy, amino, mono-C1-C6 alkylamino, di-C1-C6 alkylamino, C3-C8 cycloalkyl, C6-C10 aryl, 4- to 12-membered heterocycloalkyl, and 5- or 6-membered heteroaryl;

R24, R25, and R26 are each independently H or C1-C6 alkyl;

R27 and R28 are each independently H or OR29; or R27 and R28 together form O-R30-O; each R29 is independently H, C1-C6 alkyl, C2-C6 alkenyl, or C2-C6 alkynyl, and when R29 is C1-C6 alkyl, C2-C6 alkenyl, or C2-C6 alkynyl, it is optionally substituted with one or more of halo, OH, and C1-C6 alkoxy (itself optionally substituted with one or more OH or OC(O)-C1-C6 alkyl);

R30 is C1-C6 alkylene optionally substituted with one or more of halo, OH, and C1-C6 alkoxy;

R31, R32, and R33 are each independently H, C1-C6 alkyl, C3-C8 cycloalkyl, C6-C10 aryl, 4- to 12-membered heterocycloalkyl, or 5- or 6-membered heteroaryl;

R40, R41, R42, and R43 are each independently H, halo, OH, cyano, N3, OP(O)R47R48, or C1-C6 alkyl optionally substituted with one or more OP(O)R47R48, or one R41 and one R43, together with the carbon atoms to which they are attached and Q0, form C4-C10 cycloalkyl, 4- to 14-membered heterocycloalkyl, C6-C10 aryl, or 5- to 14-membered heteroaryl, and each of the cycloalkyl, heterocycloalkyl, phenyl, or 5- to 6-membered heteroaryl is optionally substituted with one or more of OH, halo, cyano, N3, oxo, OP(O)R47R48, C1-C6 alkyl, C1-C6 haloalkyl, COOH, C(O)O-C1-C6 alkyl, C1-C6 alkoxy, C1-C6 haloalkoxy, amino, mono-C1-C6 alkylamino, and di-C1-C6 alkylamino;

R44 is H, C1-C6 alkyl, or an amine protecting group;

R45 and R46 are each independently H, OP(O)R47R48, or C1-C6 alkyl optionally substituted with one or more OP(O)R47R48; and

R47 and R48 are each independently H, halo, C1-C6 alkyl, OH, SH, SeH, or BH3-.

It should be understood that the cap analogs provided herein may include any of the cap analogs described in International Publication WO 2017/066797, published on April 20, 2017, which is incorporated herein by reference in its entirety.

In some embodiments, the B2 middle position may be a non-ribose molecule, such as arabinose.

In some embodiments, R2 is ethyl-based.

Thus, in some embodiments, the trinucleotide cap comprises the following structure:

In other embodiments, the trinucleotide cap comprises the following structure:

In other embodiments, the trinucleotide cap comprises the following structure:

In other embodiments, the trinucleotide cap comprises the following structure:

Thus, in some embodiments, the tetranucleotide cap comprises the following structure:

In other embodiments, the tetranucleotide cap comprises the following structure:

In other embodiments, the tetranucleotide cap comprises the following structure:

In other embodiments, the tetranucleotide cap comprises the following structure:

In some embodiments, R is alkyl (e.g., C1-C6 alkyl). In some embodiments, R is methyl (e.g., C1 alkyl). In some embodiments, R is ethyl (e.g., C2 alkyl). In some embodiments, R is hydrogen.

In some embodiments, the trinucleotide cap comprises a sequence selected from the group consisting of: GAA, GAC, GAG, GAU, GCA, GCC, GCG, GCU, GGA, GGC, GGG, GGU, GUA, GUC, GUG, and GUU. In some embodiments, the trinucleotide cap comprises GAA. In some embodiments, the trinucleotide cap comprises GAC. In some embodiments, the trinucleotide cap comprises GAG. In some embodiments, the trinucleotide cap comprises GAU. In some embodiments, the trinucleotide cap comprises GCA. In some embodiments, the trinucleotide cap comprises GCC. In some embodiments, the trinucleotide cap comprises GCG. In some embodiments, the trinucleotide cap comprises GCU. In some embodiments, the trinucleotide cap comprises GGA. In some embodiments, the trinucleotide cap comprises GGC. In some embodiments, the trinucleotide cap comprises GGG. In some embodiments, the trinucleotide cap comprises GGU. In some embodiments, the trinucleotide cap comprises GUA. In some embodiments, the trinucleotide cap comprises GUC. In some embodiments, the trinucleotide cap comprises GUG. In some embodiments, the trinucleotide cap comprises GUU.

In some embodiments, the trinucleotide cap comprises a sequence selected from the group consisting of: m7GpppApA, m7GpppApC, m7GpppApG, m7GpppApU, m7GpppCpA, m7GpppCpC, m7GpppCpG, m7GpppCpU, m7GpppGpA, m7GpppGpC, m7GpppGpG, m7GpppGpU, m7GpppUpA, m7GpppUpC, m7GpppUpG, and m7GpppUpU.

In some embodiments, the trinucleotide cap comprises m7GpppApA. In some embodiments, the trinucleotide cap comprises m7GpppApC. In some embodiments, the trinucleotide cap comprises m7GpppApG. In some embodiments, the trinucleotide cap comprises m7GpppApU. In some embodiments, the trinucleotide cap comprises m7GpppCpA. In some embodiments, the trinucleotide cap comprises m7GpppCpC. In some embodiments, the trinucleotide cap comprises m7GpppCpG. In some embodiments, the trinucleotide cap comprises m7GpppCpU. In some embodiments, the trinucleotide cap comprises m7GpppGpA. In some embodiments, the trinucleotide cap comprises m7GpppGpC. In some embodiments, the trinucleotide cap comprises m7GpppGpG. In some embodiments, the trinucleotide cap comprises m7GpppGpU. In some embodiments, the trinucleotide cap comprises m7GpppUpA. In some embodiments, the trinucleotide cap comprises m7GpppUpC. In some embodiments, the trinucleotide cap comprises m7GpppUpG. In some embodiments, the trinucleotide cap comprises m7GpppUpU.

In some embodiments, the trinucleotide cap comprises a sequence selected from the group consisting of: m7G3′OMepppApA, m7G3′OMepppApC, m7G3′OMepppApG, m7G3′OMepppApU, m7G3′OMepppCpA, m7G3′OMepppCpC, m7G3′OMepppCpG, m7G3′OMepppCpU, m7G3′OMepppGpA, m7G3′OMepppGpC, m7G3′OMepppGpG, m7G3′OMepppGpU, m7G3′OMepppUpA, m7G3′OMepppUpC, m7G3′OMepppUpG, and m7G3′OMepppUpU.

In some embodiments, the trinucleotide cap comprises m7G3′OMepppApA. In some embodiments, the trinucleotide cap comprises m7G3′OMepppApC. In some embodiments, the trinucleotide cap comprises m7G3′OMepppApG. In some embodiments, the trinucleotide cap comprises m7G3′OMepppApU. In some embodiments, the trinucleotide cap comprises m7G3′OMepppCpA. In some embodiments, the trinucleotide cap comprises m7G3′OMepppCpC. In some embodiments, the trinucleotide cap comprises m7G3′OMepppCpG. In some embodiments, the trinucleotide cap comprises m7G3′OMepppCpU. In some embodiments, the trinucleotide cap comprises m7G3′OMepppGpA. In some embodiments, the trinucleotide cap comprises m7G3′OMepppGpC. In some embodiments, the trinucleotide cap comprises m7G3′OMepppGpG. In some embodiments, the trinucleotide cap comprises m7G3′OMepppGpU. In some embodiments, the trinucleotide cap comprises m7G3′OMepppUpA. In some embodiments, the trinucleotide cap comprises m7G3′OMepppUpC. In some embodiments, the trinucleotide cap comprises m7G3′OMepppUpG. In some embodiments, the trinucleotide cap comprises m7G3′OMepppUpU.

In other embodiments, the trinucleotide cap comprises a sequence selected from the group consisting of: m7G3′OMepppA2′OMepA, m7G3′OMepppA2′OMepC, m7G3′OMepppA2′OMepG, m7G3′OMepppA2′OMepU, m7G3′OMepppC2′OMepA, m7G3′OMepppC2′OMepC, m7G3′OMepppC2′OMepG, m7G3′OMepppC2′OMepU, m7G3′OMepppG2′OMepA, m7G3′OMepppG2′OMepC, m7G3′OMepppG2′OMepG, m7G3′OMepppG2′OMepU, m7G3′OMepppU2′OMepA, m7G3′OMepppU2′OMepC, m7G3′OMepppU2′OMepG, and m7G3′OMepppU2′OMepU.

In some embodiments, the trinucleotide cap comprises m7G3′OMepppA2′OMepA. In some embodiments, the trinucleotide cap comprises m7G3′OMepppA2′OMepC. In some embodiments, the trinucleotide cap comprises m7G3′OMepppA2′OMepG. In some embodiments, the trinucleotide cap comprises m7G3′OMepppA2′OMepU. In some embodiments, the trinucleotide cap comprises m7G3′OMepppC2′OMepA. In some embodiments, the trinucleotide cap comprises m7G3′OMepppC2′OMepC. In some embodiments, the trinucleotide cap comprises m7G3′OMepppC2′OMepG. In some embodiments, the trinucleotide cap comprises m7G3′OMepppC2′OMepU. In some embodiments, the trinucleotide cap comprises m7G3′OMepppG2′OMepA. In some embodiments, the trinucleotide cap comprises m7G3′OMepppG2′OMepC. In some embodiments, the trinucleotide cap comprises m7G3′OMepppG2′OMepG. In some embodiments, the trinucleotide cap comprises m7G3′OMepppG2′OMepU. In some embodiments, the trinucleotide cap comprises m7G3′OMepppU2′OMepA. In some embodiments, the trinucleotide cap comprises m7G3′OMepppU2′OMepC. In some embodiments, the trinucleotide cap comprises m7G3′OMepppU2′OMepG. In some embodiments, the trinucleotide cap comprises m7G3′OMepppU2′OMepU.

In other embodiments, the trinucleotide cap comprises a sequence selected from the group consisting of: m7GpppA2′OMepA, m7GpppA2′OMepC, m7GpppA2′OMepG, m7GpppA2′OMepU, m7GpppC2′OMepA, m7GpppC2′OMepC, m7GpppC2′OMepG, m7GpppC2′OMepU, m7GpppG2′OMepA, m7GpppG2′OMepC, m7GpppG2′OMepG, m7GpppG2′OMepU, m7GpppU2′OMepA, m7GpppU2′OMepC, m7GpppU2′OMepG, and m7GpppU2′OMepU.

In some embodiments, the trinucleotide cap comprises m7GpppA2′OMepA. In some embodiments, the trinucleotide cap comprises m7GpppA2′OMepC. In some embodiments, the trinucleotide cap comprises m7GpppA2′OMepG. In some embodiments, the trinucleotide cap comprises m7GpppA2′OMepU. In some embodiments, the trinucleotide cap comprises m7GpppC2′OMepA. In some embodiments, the trinucleotide cap comprises m7GpppC2′OMepC. In some embodiments, the trinucleotide cap comprises m7GpppC2′OMepG. In some embodiments, the trinucleotide cap comprises m7GpppC2′OMepU. In some embodiments, the trinucleotide cap comprises m7GpppG2′OMepA. In some embodiments, the trinucleotide cap comprises m7GpppG2′OMepC. In some embodiments, the trinucleotide cap comprises m7GpppG2′OMepG. In some embodiments, the trinucleotide cap comprises m7GpppG2′OMepU. In some embodiments, the trinucleotide cap comprises m7GpppU2′OMepA. In some embodiments, the trinucleotide cap comprises m7GpppU2′OMepC. In some embodiments, the trinucleotide cap comprises m7GpppU2′OMepG. In some embodiments, the trinucleotide cap comprises m7GpppU2′OMepU.

In some embodiments, the trinucleotide cap comprises m7Gpppm6A2′OMepG. In some embodiments, the trinucleotide cap comprises m7Gpppe6A2′OMepG.

In some embodiments, the trinucleotide cap comprises GAG. In some embodiments, the trinucleotide cap comprises GCG. In some embodiments, the trinucleotide cap comprises GUG. In some embodiments, the trinucleotide cap comprises GGG.
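The sixteen-member lists above all follow one pattern: a 5′ guanosine followed by every ordered pair of A, C, G, and U, with optional 3′-O-methylation of the m7G and 2′-O-methylation of the first transcribed nucleotide. As a minimal sketch of that enumeration (the function names and the use of an ASCII apostrophe in place of the prime symbol are assumptions of this example, not notation from the disclosure):

```python
from itertools import product

BASES = "ACGU"  # order used in the lists above


def trinucleotide_sequences():
    """Return the 16 base sequences GAA ... GUU."""
    return ["G" + n1 + n2 for n1, n2 in product(BASES, repeat=2)]


def cap_names(g_3_ome=False, n1_2_ome=False):
    """Return the 16 cap names for one methylation pattern.

    g_3_ome:  add a 3'-O-methyl group to the m7G (hypothetical flag name).
    n1_2_ome: add a 2'-O-methyl group to the first nucleotide after the
              triphosphate bridge (hypothetical flag name).
    """
    prefix = "m7G3'OMeppp" if g_3_ome else "m7Gppp"
    mod = "2'OMe" if n1_2_ome else ""
    return [f"{prefix}{n1}{mod}p{n2}" for n1, n2 in product(BASES, repeat=2)]
```

Calling `cap_names()` gives the unmethylated-ribose series beginning with m7GpppApA, and `cap_names(g_3_ome=True, n1_2_ome=True)` gives the doubly methylated series.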

In some embodiments, the trinucleotide cap comprises any one of the following structures:

In some embodiments, the tetranucleotide cap comprises GGAG.

In some embodiments, the tetranucleotide cap comprises any one of the following structures:

In vitro transcription methods

Some aspects of the disclosure provide methods of producing an RNA transcript (e.g., an mRNA transcript) comprising contacting a DNA template with an RNA polymerase (e.g., a T7 RNA polymerase, such as a T7 RNA polymerase variant) under conditions that result in the production of an RNA transcript.

In some embodiments, the methods comprise contacting the DNA template with a T7 RNA polymerase variant comprising (at least one) additional C-terminal amino acid (e.g., Gly, Ala, GlyGly, AlaAla, GlyAla, or AlaGly).

In some aspects, the disclosure provides methods of performing an IVT reaction comprising contacting a DNA template with an RNA polymerase (e.g., a T7 RNA polymerase, such as a T7 RNA polymerase variant) in the presence of nucleoside triphosphates and a buffer under conditions that result in the production of an RNA transcript.

Other aspects of the disclosure provide co-transcriptional capping methods comprising reacting a polynucleotide template with a T7 RNA polymerase variant, nucleoside triphosphates, and a cap analog under in vitro transcription reaction conditions to produce an RNA transcript.

In some embodiments, a co-transcriptional capping method for RNA synthesis comprises reacting a polynucleotide template with: (a) a T7 RNA polymerase variant comprising at least one amino acid substitution, relative to wild-type RNA polymerase, that causes at least one loop structure of the RNA polymerase variant to undergo a conformational change to a helix structure as the RNA polymerase variant transitions from an initiation complex to an elongation complex (e.g., at least one amino acid substitution at position 42, 43, 44, 45, 46, and/or 47); (b) nucleoside triphosphates; and (c) a trinucleotide cap comprising the sequence GpppA2′OMepG, wherein the polynucleotide template comprises a 2′-deoxythymidine residue at template position +1.

IVT conditions typically require a purified linear DNA template containing a promoter, nucleoside triphosphates, a buffer system containing Dithiothreitol (DTT) and magnesium ions, and an RNA polymerase. The exact conditions used in the transcription reaction depend on the amount of RNA required for a particular application. A typical IVT reaction is performed by incubating a DNA template with RNA polymerase and nucleoside triphosphates (including GTP, ATP, CTP, and UTP (or nucleotide analogs)) in a transcription buffer. This reaction produces an RNA transcript with a 5' terminal guanosine triphosphate.

Deoxyribonucleic acid (DNA) is simply the nucleic acid template for RNA polymerase. The DNA template may include a polynucleotide encoding a polypeptide of interest (e.g., an antigenic polypeptide). In some embodiments, the DNA template includes an RNA polymerase promoter (e.g., a T7 RNA polymerase promoter) located 5' to and operably linked to the polynucleotide encoding the polypeptide of interest. The DNA template may also include a nucleotide sequence encoding a poly(A) tail located at the 3' end of the gene of interest.

Polypeptides of interest include, but are not limited to, biologics, antibodies, antigens (vaccines), and therapeutic proteins. The term "protein" includes peptides.

In some embodiments, the RNA transcript is the product of an IVT reaction. In some embodiments, the RNA transcript is a messenger RNA (mRNA) comprising a nucleotide sequence encoding a polypeptide of interest linked to a poly(A) tail. In some embodiments, the mRNA is a modified mRNA (mmRNA) that includes at least one modified nucleotide.

Nucleotides include a nitrogenous base, a five-carbon sugar (ribose or deoxyribose), and at least one phosphate group. Nucleotides include nucleoside monophosphates, nucleoside diphosphates, and nucleoside triphosphates. A nucleoside monophosphate (NMP) includes a nucleobase linked to a ribose and a single phosphate; a nucleoside diphosphate (NDP) includes a nucleobase linked to a ribose and two phosphates; and a nucleoside triphosphate (NTP) includes a nucleobase linked to a ribose and three phosphates. Nucleotide analogs are compounds that have the general structure of a nucleotide or that are structurally similar to a nucleotide. Nucleotide analogs include, for example, analogs of the nucleobase, sugar, and/or phosphate group of a nucleotide.

Nucleosides include nitrogenous bases and 5-carbon sugars. Thus, the nucleoside plus phosphate group generates a nucleotide. Nucleoside analogs are compounds that have the general structure of a nucleoside or are structurally similar to a nucleoside. Nucleoside analogs include, for example, nucleobase analogs and/or nucleoside sugar analogs.

It is understood that, unless otherwise specified, the term "nucleotide" includes naturally occurring nucleotides, synthetic nucleotides, and modified nucleotides. Examples of naturally occurring nucleotides provided herein for use, e.g., in an IVT reaction to produce RNA include adenosine triphosphate (ATP), guanosine triphosphate (GTP), cytidine triphosphate (CTP), uridine triphosphate (UTP), and 5-methyluridine triphosphate (m5UTP). In some embodiments, adenosine diphosphate (ADP), guanosine diphosphate (GDP), cytidine diphosphate (CDP), and/or uridine diphosphate (UDP) are used.

Examples of nucleotide analogs include, but are not limited to, antiviral nucleotide analogs, phosphate analogs (soluble or immobilized, hydrolyzable or non-hydrolyzable), dinucleotides, trinucleotides, tetranucleotides (e.g., cap analogs) or precursors/substrates for enzymatic capping (vaccinia or ligase), nucleotides labeled with a functional group to facilitate attachment/conjugation of a cap or 5' moiety (IRES), nucleotides labeled with a 5' PO4 to facilitate attachment of a cap or 5' moiety, or nucleotides labeled with a functional/protecting group that can be chemically or enzymatically cleaved. Examples of antiviral nucleotide/nucleoside analogs include, but are not limited to, ganciclovir, entecavir, telbivudine, vidarabine, and cidofovir.

Modified nucleotides may include modified nucleobases. For example, an RNA transcript (e.g., an mRNA transcript) of the present disclosure can include a modified nucleobase selected from: pseudouridine (ψ), 1-methylpseudouridine (m1ψ), 1-ethylpseudouridine, 2-thiouridine, 4'-thiouridine, 2-thio-1-methyl-1-deaza-pseudouridine, 2-thio-1-methyl-pseudouridine, 2-thio-5-aza-uridine, 2-thio-dihydropseudouridine, 2-thio-dihydrouridine, 2-thio-pseudouridine, 4-methoxy-pseudouridine, 4-thio-1-methyl-pseudouridine, 4-thio-pseudouridine, 5-aza-uridine, dihydropseudouridine, 5-methyluridine, 5-methoxyuridine (mo5U), and 2'-O-methyluridine. In some embodiments, an RNA transcript (e.g., an mRNA transcript) includes a combination of at least two (e.g., 2, 3, 4, or more) of the modified nucleobases described above.

Nucleoside triphosphates (NTPs) as provided herein can comprise modified or unmodified ATP, modified or unmodified UTP, modified or unmodified GTP, and/or modified or unmodified CTP. In some embodiments, the NTPs of the IVT reaction comprise unmodified ATP. In some embodiments, the NTPs of the IVT reaction comprise modified ATP. In some embodiments, the NTPs of the IVT reaction comprise unmodified UTP. In some embodiments, the NTPs of the IVT reaction comprise modified UTP. In some embodiments, the NTPs of the IVT reaction comprise unmodified GTP. In some embodiments, the NTPs of the IVT reaction comprise modified GTP. In some embodiments, the NTPs of the IVT reaction comprise unmodified CTP. In some embodiments, the NTPs of the IVT reaction comprise modified CTP.

The concentration of nucleoside triphosphates and cap analogs present in the IVT reaction can vary. In some embodiments, the NTP and the cap analog are present in the reaction at equimolar concentrations. In some embodiments, the molar ratio of cap analog (e.g., trinucleotide cap) to nucleoside triphosphate in the reaction is greater than 1:1. For example, the molar ratio of cap analog to nucleoside triphosphate in the reaction can be 2:1, 3:1, 4:1, 5:1, 6:1, 7:1, 8:1, 9:1, 10:1, 15:1, 20:1, 25:1, 50:1, or 100:1. In some embodiments, the molar ratio of cap analog (e.g., trinucleotide cap) to nucleoside triphosphate in the reaction is less than 1:1. For example, the molar ratio of cap analog (e.g., trinucleotide cap) to nucleoside triphosphate in the reaction can be 1:2, 1:3, 1:4, 1:5, 1:6, 1:7, 1:8, 1:9, 1:10, 1:15, 1:20, 1:25, 1:50, or 1:100.

The composition of NTPs in an IVT reaction can also vary. For example, more ATP than GTP, CTP and UTP may be used. By way of non-limiting example, the IVT reaction may include 7.5 mmol GTP, 7.5 mmol CTP, 7.5 mmol UTP, and 3.75 mmol ATP. The same IVT reaction can include 3.75 mmol of cap analog (e.g., trinucleotide cap). In some embodiments, the molar ratio of G:C:U:A:cap is 1:1:1:0.5:0.5. In some embodiments, the molar ratio of G:C:U:A:cap is 1:1:0.5:1:0.5. In some embodiments, the molar ratio of G:C:U:A:cap is 1:0.5:1:1:0.5. In some embodiments, the molar ratio of G:C:U:A:cap is 0.5:1:1:1:0.5.
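To show how the non-limiting example above maps onto the first stated G:C:U:A:cap ratio, the following sketch normalizes each component amount to a reference component. The function name and the choice of GTP as the reference are assumptions of this example; because only ratios are computed, the result does not depend on the unit used for the amounts.

```python
def molar_ratio(amounts, reference="GTP"):
    """Express each component amount relative to the reference component.

    amounts: mapping of component name to amount (all in the same unit).
    reference: hypothetical default; any key of `amounts` can be used.
    """
    ref = amounts[reference]
    if ref <= 0:
        raise ValueError("reference amount must be positive")
    return {name: value / ref for name, value in amounts.items()}


# Amounts taken from the non-limiting example in the text.
reaction = {"GTP": 7.5, "CTP": 7.5, "UTP": 7.5, "ATP": 3.75, "cap": 3.75}
ratio = molar_ratio(reaction)  # GTP, CTP, UTP -> 1.0; ATP, cap -> 0.5
```

The resulting values correspond to the 1:1:1:0.5:0.5 ratio of G:C:U:A:cap; the other listed ratios are obtained by changing which component is supplied at the lower amount.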

In some embodiments, the RNA transcript (e.g., mRNA transcript) comprises a modified nucleobase selected from pseudouridine (ψ), 1-methylpseudouridine (m1ψ), 5-methoxyuridine (mo5U), 5-methylcytidine (m5C), α-thioguanosine, and α-thioadenosine. In some embodiments, an RNA transcript (e.g., an mRNA transcript) includes a combination of at least two (e.g., 2, 3, 4, or more) of the modified nucleobases described above.

In some embodiments, the RNA transcript (e.g., mRNA transcript) comprises pseudouridine (ψ). In some embodiments, the RNA transcript (e.g., mRNA transcript) comprises 1-methylpseudouridine (m1ψ). In some embodiments, the RNA transcript (e.g., mRNA transcript) comprises 5-methoxyuridine (mo5U). In some embodiments, the RNA transcript (e.g., mRNA transcript) comprises 5-methylcytidine (m5C). In some embodiments, the RNA transcript (e.g., mRNA transcript) comprises α-thioguanosine. In some embodiments, the RNA transcript (e.g., mRNA transcript) comprises α-thioadenosine.

In some embodiments, a polynucleotide (e.g., an RNA polynucleotide, such as an mRNA polynucleotide) is uniformly modified (e.g., fully modified, modified throughout the entire sequence) for a particular modification. For example, a polynucleotide can be uniformly modified with 1-methylpseudouridine (m1ψ), meaning that all uridine residues in the mRNA sequence are replaced with 1-methylpseudouridine (m1ψ). Similarly, a polynucleotide can be uniformly modified for any type of nucleoside residue present in the sequence by replacement with a modified residue, such as any of those listed above. Alternatively, a polynucleotide (e.g., an RNA polynucleotide, such as an mRNA polynucleotide) may not be uniformly modified (e.g., partially modified, modified in part of the sequence). Each possibility represents a separate embodiment of the invention.
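The distinction between a uniformly modified and a partially modified polynucleotide can be illustrated on a toy sequence. In this sketch the sequence is represented as a list of residue tokens, and the token "m1psi" stands in for 1-methylpseudouridine; the function names, the token, and the example sequence are all hypothetical.

```python
def uniformly_modify(rna, target="U", replacement="m1psi"):
    """Replace every occurrence of `target` with `replacement`.

    Returns a list of residue tokens so that multi-character names
    such as "m1psi" remain unambiguous.
    """
    return [replacement if residue == target else residue for residue in rna]


def is_uniformly_modified(tokens, target="U"):
    """True when no unmodified `target` residue remains in the token list."""
    return target not in tokens


modified = uniformly_modify("AUGGCU")
# modified == ["A", "m1psi", "G", "G", "C", "m1psi"]
```

A partially modified polynucleotide would correspond to a token list in which some, but not all, of the uridine positions carry the replacement token, so that `is_uniformly_modified` returns `False`.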

In some embodiments, the buffer system comprises Tris. For example, the concentration of Tris used in an IVT reaction may be at least 10 mM, at least 20 mM, at least 30 mM, at least 40 mM, at least 50 mM, at least 60 mM, at least 70 mM, at least 80 mM, at least 90 mM, at least 100 mM, or at least 110 mM phosphate. In some embodiments, the phosphate is at a concentration of 20-60 mM or 10-100 mM.

In some embodiments, the buffer system comprises dithiothreitol (DTT). For example, the concentration of DTT used in an IVT reaction may be at least 1 mM, at least 5 mM, or at least 50 mM. In some embodiments, the concentration of DTT used in the IVT reaction is 1-50 mM or 5-50 mM. In some embodiments, the concentration of DTT used in the IVT reaction is 5 mM.

In some embodiments, the buffer system comprises magnesium. In some embodiments, the molar ratio of NTP to magnesium ions (Mg2+; e.g., MgCl2) present in the IVT reaction is 1:1 to 1:5. For example, the molar ratio of NTP to magnesium ions may be 1:1, 1:2, 1:3, 1:4, or 1:5.

In some embodiments, the molar ratio of NTP plus cap analog (e.g., a trinucleotide cap, such as GAG) to magnesium ions (Mg2+; e.g., MgCl2) present in the IVT reaction is 1:1 to 1:5. For example, the molar ratio of NTP plus trinucleotide cap (e.g., GAG) to magnesium ions can be 1:1, 1:2, 1:3, 1:4, or 1:5.
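As a worked illustration of the 1:1 to 1:5 ratio, the sketch below computes the corresponding range of magnesium ion amounts for a given amount of NTP plus cap analog. It assumes that "NTP" refers to the summed amount of all four nucleoside triphosphates, which the text does not state explicitly; the function name and parameters are hypothetical, and the output is in whatever unit the inputs use.

```python
def magnesium_range(total_ntp, cap=0.0, low=1.0, high=5.0):
    """Return (minimum, maximum) Mg2+ for an (NTP + cap):Mg2+ ratio of 1:low to 1:high.

    total_ntp: summed amount of all NTPs (assumption of this sketch).
    cap: amount of cap analog, in the same unit as total_ntp.
    """
    if total_ntp < 0 or cap < 0:
        raise ValueError("amounts must be non-negative")
    total = total_ntp + cap
    return total * low, total * high


# Using the amounts from the earlier non-limiting example:
# 7.5 + 7.5 + 7.5 + 3.75 = 26.25 NTP and 3.75 cap analog.
mg_min, mg_max = magnesium_range(26.25, cap=3.75)
```

With these inputs the combined NTP plus cap amount is 30, so the magnesium range spans 30 to 150 in the same unit.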

In some embodiments, the buffer system comprises Tris-HCl, spermidine (e.g., at a concentration of 1-30 mM), Triton X-100 (polyethylene glycol p-(1,1,3,3-tetramethylbutyl)-phenyl ether), and/or polyethylene glycol (PEG).

Addition of nucleoside triphosphates (NTPs) to the 3' end of the growing RNA strand is catalyzed by a polymerase (e.g., a T7 RNA polymerase, such as any one or more of the T7 RNA polymerase variants of the present disclosure (e.g., G47A)). In some embodiments, the RNA polymerase (e.g., T7 RNA polymerase variant) is present in the reaction (e.g., IVT reaction) at a concentration of 0.01 mg/mL to 1 mg/mL. For example, the RNA polymerase may be present in the reaction at a concentration of 0.01 mg/mL, 0.05 mg/mL, 0.1 mg/mL, 0.5 mg/mL, or 1.0 mg/mL.

Surprisingly, use of a T7 RNA polymerase variant (e.g., G47A) and a cap analog (e.g., GpppA2′OMepG) as provided herein, for example, in an in vitro transcription reaction, results in the production of RNA transcripts in which more than 80% of the RNA transcripts produced comprise a functional cap. In some embodiments, more than 85% of the RNA transcripts produced comprise a functional cap. In some embodiments, more than 90% of the RNA transcripts produced comprise a functional cap. In some embodiments, more than 95% of the RNA transcripts produced comprise a functional cap. In some embodiments, more than 96% of the RNA transcripts produced comprise a functional cap. In some embodiments, more than 97% of the RNA transcripts produced comprise a functional cap. In some embodiments, more than 98% of the RNA transcripts produced comprise a functional cap. In some embodiments, more than 99% of the RNA transcripts produced comprise a functional cap.

Also surprising was the finding that use of a polynucleotide template comprising a 2'-deoxythymidine residue or a 2'-deoxycytidine residue at template position +1 results in the production of RNA transcripts in which more than 80% (e.g., more than 85%, more than 90%, or more than 95%) of the RNA transcripts produced comprise a functional cap. Thus, in some embodiments, a polynucleotide (e.g., DNA) template, e.g., used in an IVT reaction, comprises a 2'-deoxythymidine residue at template position +1. In other embodiments, a polynucleotide (e.g., DNA) template, e.g., used in an IVT reaction, comprises a 2'-deoxycytidine residue at template position +1.

Multi-substitution T7 RNA polymerase variants

Various aspects of the present disclosure provide T7 RNA polymerase variants comprising at least two amino acid substitutions. In some embodiments, the T7 RNA polymerase variant comprises at least three amino acid substitutions. In some embodiments, the T7 RNA polymerase variant comprises at least four amino acid substitutions. In some embodiments, the T7 RNA polymerase variant comprises at least five amino acid substitutions. A T7 RNA polymerase variant that includes a G47A substitution relative to wild-type T7 RNA polymerase (e.g., comprising the amino acid sequence of SEQ ID NO: 1) may be referred to herein as a "G47A T7 Pol variant".

Table 1 below provides examples of multi-substitution T7 RNA polymerase variants of the present disclosure. It will be appreciated that each of the T7 RNA polymerase variants included in Table 1 comprises a G47A substitution relative to the wild-type T7 RNA polymerase comprising the amino acid sequence of SEQ ID NO: 1. It will also be appreciated that each of the T7 RNA polymerase variants included in Table 1 comprises an additional C-terminal amino acid at position 884 relative to the wild-type T7 RNA polymerase comprising the amino acid sequence of SEQ ID NO: 1. The additional C-terminal amino acid is glycine (G884), unless otherwise specified: G884T represents a T7 RNA polymerase variant comprising a threonine (instead of glycine) at position 884; G884S represents a T7 RNA polymerase variant comprising a serine (instead of glycine) at position 884; G884P represents a T7 RNA polymerase variant comprising a proline (instead of glycine) at position 884; and G884A represents a T7 RNA polymerase variant comprising an alanine (instead of glycine) at position 884. All substitutions in Table 1 are relative to the wild-type T7 RNA polymerase comprising the amino acid sequence of SEQ ID NO: 1.

Table 1. Multi-substitution T7 RNA polymerase variants
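The substitution notation used here (original residue, 1-based position, replacement residue, as in G47A or G884T) can be applied mechanically to a reference sequence. The sketch below uses a short made-up sequence rather than SEQ ID NO: 1, and models the additional C-terminal glycine by appending "G" to the toy wild-type sequence before substitutions are applied; the function names and the toy sequence are hypothetical.

```python
import re


def parse_substitution(code):
    """Split a code such as 'G47A' into (original, position, replacement)."""
    m = re.fullmatch(r"([A-Z])(\d+)([A-Z])", code)
    if m is None:
        raise ValueError(f"not a substitution code: {code!r}")
    return m.group(1), int(m.group(2)), m.group(3)


def apply_substitutions(reference, codes):
    """Apply substitution codes to a reference sequence (1-based positions)."""
    residues = list(reference)
    for code in codes:
        original, position, replacement = parse_substitution(code)
        if not 1 <= position <= len(residues):
            raise ValueError(f"{code}: position outside the reference")
        if residues[position - 1] != original:
            raise ValueError(f"{code}: reference has {residues[position - 1]}")
        residues[position - 1] = replacement
    return "".join(residues)


toy_wild_type = "MAGKT"           # made-up 5-residue stand-in, not SEQ ID NO: 1
extended = toy_wild_type + "G"    # additional C-terminal glycine (position 6 here)
variant = apply_substitutions(extended, ["G3A", "G6T"])
```

In this toy case "G3A" plays the role of an internal substitution such as G47A, and "G6T" plays the role of G884T, in which the appended C-terminal glycine is replaced by threonine.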

Applications

RNA transcripts produced according to the present disclosure include mRNA (including modified mRNA and/or unmodified RNA), lncRNA, self-replicating RNA, circular RNA, CRISPR guide RNA, and the like. In embodiments, the RNA is an RNA (e.g., mRNA or self-replicating RNA) that encodes a polypeptide (e.g., a therapeutic polypeptide). Thus, RNA transcripts produced using the RNA polymerase variants of the disclosure can be used in a myriad of applications.

For example, RNA transcripts can be used to produce polypeptides of interest, such as therapeutic proteins, vaccine antigens, and the like. In some embodiments, the RNA transcript is a therapeutic RNA. Therapeutic mRNA is mRNA encoding a therapeutic protein (the term "protein" includes peptides). Therapeutic proteins mediate a variety of effects in the host cell or subject to treat the disease or ameliorate the signs and symptoms of the disease. For example, a therapeutic protein can replace a defective or abnormal protein, enhance the function of an endogenous protein, provide a new function to a cell (e.g., inhibit or activate endogenous cellular activity), or serve as a delivery agent for another therapeutic compound (e.g., an antibody-drug conjugate). The therapeutic mRNA can be used to treat the following diseases and conditions: bacterial infections, viral infections, parasitic infections, cell proliferation disorders, genetic disorders, and autoimmune disorders. Other diseases and conditions are included herein.

The protein of interest encoded by the mRNA provided herein can be essentially any protein. In some embodiments, the therapeutic protein is a cytokine, a growth factor, an antibody, or a fusion protein. Non-limiting examples of therapeutic proteins include blood factors (e.g., factor VIII and factor VII), complement factors, Low Density Lipoprotein Receptor (LDLR), and MUT1. Non-limiting examples of cytokines include interleukins, interferons, chemokines, lymphokines, and the like. Non-limiting examples of growth factors include erythropoietin, EGF, PDGF, FGF, TGF, IGF, TNF, CSF, MCSF, GMCSF, and the like. Non-limiting examples of antibodies include adalimumab, infliximab, rituximab, ipilimumab, tocilizumab, canakinumab, itolizumab, and tralokinumab. Non-limiting examples of fusion proteins include etanercept, abatacept, and belatacept.

In some embodiments, the protein of interest is human erythropoietin, LDLR (for use in inhibiting cholesterol), or MUT1 (for use in the treatment of methylmalonic acidemia (MMA)). In other embodiments, the protein of interest encoded by the mRNA is a therapeutic antibody, including but not limited to the antibodies listed above.

RNA transcripts produced using the RNA polymerase variants disclosed herein can encode one or more biological agents. A biological agent is a polypeptide-based molecule that can be used to treat, cure, alleviate, prevent, or diagnose a serious or life-threatening disease or medical condition. Biological agents include, but are not limited to, allergen extracts (e.g., for allergy injection and testing), blood components, gene therapy products, human tissue or cell products for transplantation, vaccines, monoclonal antibodies, cytokines, growth factors, enzymes, thrombolytics, immunomodulators, and the like.

One or more biological agents currently being marketed or in development may be encoded by the RNA of the present disclosure. While not wishing to be bound by theory, it is believed that incorporating the encoding polynucleotides of known biologics into the RNAs of the present disclosure will improve therapeutic efficacy due, at least in part, to the specificity, purity, and/or selectivity of the construct design.

RNA transcripts produced using the RNA polymerase variants disclosed herein may encode one or more antibodies. The term "antibody" includes monoclonal antibodies (including full length antibodies with immunoglobulin Fc regions), antibody compositions with polyepitopic specificity, multispecific antibodies (e.g., bispecific antibodies, diabodies, and single chain molecules), and antibody fragments. The term "immunoglobulin" (Ig) is used interchangeably herein with "antibody". A monoclonal antibody refers to an antibody obtained from a population of substantially homogeneous antibodies, i.e., the individual antibodies comprising the population are identical except for possible naturally occurring mutations and/or post-translational modifications (e.g., isomerization, amidation) that may be present in minor amounts. Monoclonal antibodies are highly specific for a single antigenic site.

Monoclonal antibodies specifically include chimeric antibodies (immunoglobulins) in which a portion of the heavy and/or light chain is identical with or homologous to corresponding sequences in antibodies derived from a particular species or belonging to a particular antibody class or subclass, while the remainder of these chains are identical with or homologous to corresponding sequences in antibodies derived from another species or belonging to another antibody class or subclass; and fragments of such antibodies, so long as they exhibit the desired biological activity. Chimeric antibodies include, but are not limited to, "primatized" antibodies comprising variable domain antigen binding sequences derived from a non-human primate (e.g., Old World Monkey, ape, etc.) and human constant region sequences.

Antibodies encoded by the RNAs of the present disclosure may be used to treat conditions or diseases in many therapeutic areas, such as, but not limited to, blood, cardiovascular, CNS, poisoning (including antivenoms), dermatology, endocrinology, gastrointestinal, medical imaging, musculoskeletal, oncology, immunology, respiratory, sensory, and anti-infective.

RNA transcripts produced using the RNA polymerase variants disclosed herein may encode one or more vaccine antigens. Vaccine antigens are biological agents that enhance immunity to specific diseases or infectious agents. One or more vaccine antigens currently being marketed or in development may be encoded by the RNA of the present disclosure. RNA-encoded vaccine antigens are useful for treating conditions or diseases in many therapeutic areas, such as, but not limited to, cancer, allergy, and infectious disease. In some embodiments, the cancer vaccine may be a personalized cancer vaccine in the form of a concatemer encoding peptide epitopes, individual RNAs, or a combination thereof.

RNA transcripts produced using the RNA polymerase variants disclosed herein can be designed to encode one or more antimicrobial peptides (AMPs) or antiviral peptides (AVPs). AMPs and AVPs have been isolated and described from a wide range of organisms such as, but not limited to, microorganisms, invertebrates, plants, amphibians, birds, fish, and mammals. The antimicrobial polypeptide can block cell fusion and/or viral entry of one or more enveloped viruses (e.g., HIV, HCV). For example, the antimicrobial polypeptide can comprise or consist of a synthetic peptide corresponding to a region, for example a contiguous sequence of at least about 5, 10, 15, 20, 25, 30, 35, 40, 45, 50, 55, or 60 amino acids, of a transmembrane subunit of a viral envelope protein (e.g., HIV-1 gp120 or gp41). The amino acid and nucleotide sequences of HIV-1 gp120 or gp41 are described, for example, in Kuiken et al. (2008), "HIV Sequence Compendium," Los Alamos National Laboratory.

In some embodiments, the RNA transcript is used as a radiolabeled RNA probe. In some embodiments, the RNA transcript is used for non-isotopic RNA labeling. In some embodiments, the RNA transcript is used as a guide RNA (gRNA) for gene targeting. In some embodiments, RNA transcripts (e.g., mRNA) are used for in vitro translation and microinjection. In some embodiments, the RNA transcripts are used for RNA structure, processing, and catalysis studies. In some embodiments, the RNA transcript is used for RNA amplification. In some embodiments, the RNA transcript is used as an antisense RNA for gene expression experiments. Other applications are contemplated by the present disclosure.

Other embodiments

Other embodiments of the present disclosure are encompassed in the following numbered paragraphs:

1. A ribonucleic acid (RNA) polymerase variant comprising an RNA polymerase, the RNA polymerase comprising:

(a) amino acid substitutions at binding site residues for de novo RNA synthesis; and

(b) amino acid modifications that increase transcription efficiency relative to wild-type RNA polymerase.

2. The RNA polymerase variant of paragraph 1, wherein the amino acid modification causes a loop structure of the RNA polymerase variant to undergo a conformational change to a helix structure as the RNA polymerase variant transitions from an initiation complex to an elongation complex.

3. The RNA polymerase variant of paragraph 2, wherein the amino acid modification is an amino acid substitution at position 47 relative to a wild type RNA polymerase, wherein the wild type RNA polymerase comprises the amino acid sequence of SEQ ID No. 1.

4. The RNA polymerase variant of paragraph 3, wherein the amino acid substitution at position 47 is G47A.

5. The RNA polymerase variant of any of paragraphs 1-4, wherein the amino acid modification comprises an additional C-terminal amino acid relative to wild-type RNA polymerase.

6. The RNA polymerase variant of paragraph 5, wherein the additional C-terminal amino acid is glycine.

7. The RNA polymerase variant of any of paragraphs 1-6, wherein the amino acid substitution at a binding site residue results in at least one of the following benefits relative to wild-type RNA polymerase:

(i) increased transcription efficiency;

(ii) increased co-transcriptional capping efficiency;

(iii) increased RNA yield at half the concentration of the cap analog;

(iv) increased 3' homogeneity of the RNA at half the concentration of the cap analog;

(v) improved transcription fidelity; and/or

(vi) a reduced amount of dsRNA contamination.

8. The RNA polymerase variant of any of paragraphs 1-6, wherein the amino acid substitution at a binding site residue, relative to the amino acid modification of (b), results in at least one of the following benefits:

(i) increased transcription efficiency;

(ii) increased co-transcriptional capping efficiency;

(iii) increased RNA yield at half the concentration of the cap analog;

(iv) increased 3' homogeneity of the RNA at half the concentration of the cap analog;

(v) improved transcription fidelity; and/or

(vi) a reduced amount of dsRNA contamination.

9. The RNA polymerase variant of any of paragraphs 1-8, wherein the amino acid substitution at a binding site residue is a substitution at a position selected from the group consisting of positions 350, 351, 387, 394, 425, 427, 437, 441, 506, 628, 632, 653, 657, 811 and 880 relative to wild-type RNA polymerase, wherein the wild-type RNA polymerase comprises the amino acid sequence of SEQ ID No. 1.

10. A variant RNA polymerase comprising an RNA polymerase comprising:

(a) an amino acid substitution at a position selected from the group consisting of positions 350, 351, 387, 394, 425, 427, 437, 441, 506, 628, 632, 653, 657, 811, and 880; and

(b) additional amino acid substitutions and/or amino acid modifications at the C-terminus relative to a wild-type RNA polymerase, wherein the wild-type RNA polymerase comprises the amino acid sequence of SEQ ID NO. 1.
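The split between item (a) and item (b) of paragraph 10 can be illustrated with a short Python sketch that sorts substitution codes by whether their position is among the listed positions. The function name `split_substitutions` and the example codes passed to it are assumptions chosen for illustration; the position list is copied from item (a).

```python
import re

# Positions listed in item (a) of paragraph 10, numbered relative to SEQ ID NO: 1.
LISTED_POSITIONS = frozenset(
    {350, 351, 387, 394, 425, 427, 437, 441, 506, 628, 632, 653, 657, 811, 880}
)

_CODE = re.compile(r"([A-Z])(\d+)([A-Z])")


def split_substitutions(codes):
    """Sort codes into item (a) (listed positions) and item (b) (all others)."""
    listed, additional = [], []
    for code in codes:
        match = _CODE.fullmatch(code)
        if match is None:
            raise ValueError(f"not a substitution code: {code!r}")
        position = int(match.group(2))
        (listed if position in LISTED_POSITIONS else additional).append(code)
    return listed, additional


# Example codes taken from substitutions named elsewhere in the numbered paragraphs.
listed, additional = split_substitutions(["G47A", "E350N", "K387S"])
print(listed)      # substitutions at positions listed in item (a)
print(additional)  # remaining substitutions, such as the G47A of item (b)
```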

11. The RNA polymerase variant of paragraph 10, comprising the additional amino acid substitution of (b).

12. The RNA polymerase variant of paragraph 11, wherein the additional amino acid substitution of (b) is at position 47.

13. The RNA polymerase variant of paragraph 12, wherein the additional amino acid substitution at position 47 is G47A.

14. The RNA polymerase variant of any of paragraphs 10-13, comprising an amino acid modification at the C-terminus.

15. The RNA polymerase variant of paragraph 14, wherein the amino acid modification at the C-terminus comprises an additional C-terminal amino acid.

16. The RNA polymerase variant of paragraph 15, wherein the additional C-terminal amino acid is selected from the group consisting of glycine, serine, alanine, proline, and threonine.

17. The RNA polymerase variant of paragraph 16, wherein the additional C-terminal amino acid is glycine.

18. The RNA polymerase variant of paragraph 16, wherein the additional C-terminal amino acid is alanine.

19. The RNA polymerase variant of paragraphs 17 or 18, comprising an RNA polymerase comprising:

(a) an amino acid substitution at a position selected from the group consisting of positions 350, 351, 387, 394, 425, 427, 437, 441, 506, 628, 632, 653, 657, 811, and 880;

(b) additional amino acid substitutions; and

(c) amino acid modification at the C-terminus relative to a wild-type RNA polymerase, wherein the wild-type RNA polymerase comprises the amino acid sequence of SEQ ID NO. 1.

20. The RNA polymerase variant of paragraph 19, wherein the additional amino acid substitution is at position 47.

21. The RNA polymerase variant of paragraph 20, wherein the additional amino acid substitution at position 47 is G47A.

22. The RNA polymerase variant of any of paragraphs 19-21, wherein the amino acid modification at the C-terminus comprises an additional C-terminal amino acid.

23. The RNA polymerase variant of paragraph 22, wherein the additional C-terminal amino acid is selected from the group consisting of glycine, serine, alanine, proline, glutamine, and threonine.

24. The RNA polymerase variant of paragraph 23, wherein the additional C-terminal amino acid is glycine.

25. The RNA polymerase variant of any of paragraphs 1-24, wherein the additional amino acid substitution of (a) is at a position selected from the group consisting of position 387, 350, 351, 506, 628, 653, and 657, relative to a wild-type RNA polymerase, wherein the wild-type RNA polymerase comprises the amino acid sequence of SEQ ID NO: 1.

26. The RNA polymerase variant of paragraph 25, wherein the additional amino acid substitution is selected from the group consisting of K387S, K387H, and K387N.

27. The RNA polymerase variant of paragraph 25, wherein the additional amino acid substitution is selected from the group consisting of E350K, E350N, E350A, and E350W.

28. The RNA polymerase variant of paragraph 25, wherein the additional amino acid substitution is D351V.

29. The RNA polymerase variant of paragraph 25, wherein the additional amino acid substitution is D506W.

30. The RNA polymerase variant of paragraph 25, wherein the additional amino acid substitution is S628W.

31. The RNA polymerase variant of paragraph 25, wherein the additional amino acid substitution is D653W.

32. The RNA polymerase variant of paragraph 25, wherein the additional amino acid substitution is P657W.

33. The RNA polymerase variant of any of paragraphs 1-24, wherein the additional amino acid substitution of (a) is at a position selected from the group consisting of positions 350, 351, 387, and 437 relative to a wild type RNA polymerase, wherein the wild type RNA polymerase comprises the amino acid sequence of SEQ ID No. 1.

34. The RNA polymerase variant of paragraph 33, wherein the additional amino acid substitution of (a) is at position 350 and the additional amino acid substitution at position 350 is selected from the group consisting of E350R, E350K, E350D, E350Q, E350N, E350T, E350S, E350C, E350G, E350A, E350V, E350L, E350I, E350P, E350Y, E350W, and E350F.

35. The RNA polymerase variant of paragraph 33, wherein the additional amino acid substitution of (a) is at position 351 and the additional amino acid substitution at position 351 is selected from the group consisting of D351R, D351K, D351Q, D351T, D351S, D351C, D351V, D351L, D351I, D351M, D351P, D351Y, and D351W.

36. The RNA polymerase variant of paragraph 33, wherein the additional amino acid substitution of (a) is at position 387, and the additional amino acid substitution at position 387 is selected from the group consisting of K387R, K387H, K387T, K387S, K387V, K387L, K387I, and K387M.

37. The RNA polymerase variant of paragraph 33, wherein the additional amino acid substitution of (a) is at position 437 and the additional amino acid substitution at position 437 is selected from the group consisting of N437Q, N437T, N437S, N437G, and N437F.

38. The RNA polymerase variant of paragraph 22, wherein the additional C-terminal amino acid is serine or alanine.

39. The RNA polymerase variant of paragraph 33, wherein the additional amino acid substitution of (a) is at position 350 and the additional amino acid substitution at position 350 is selected from the group consisting of E350N, E350C, E350G, E350Y, E350W, and E350F.

40. The RNA polymerase variant of paragraph 33, wherein the additional amino acid substitution of (a) is at position 351, and the additional amino acid substitution at position 351 is selected from the group consisting of D351R, D351S, D351L, D351M, and D351Y.

41. The RNA polymerase variant of paragraph 33, wherein the additional amino acid substitution of (a) is at position 387, and the additional amino acid substitution at position 387 is selected from the group consisting of K387R, K387T, K387L, and K387M.

42. The RNA polymerase variant of paragraph 33, wherein the additional amino acid substitution of (a) is at position 437 and the additional amino acid substitution at position 437 is selected from the group consisting of N437R, N437K, N437H, N437T, N437V, N437I, and N437W.

43. The RNA polymerase variant of paragraph 22, wherein the additional C-terminal amino acid is glutamine, threonine, or proline.

44. The RNA polymerase variant of any of paragraphs 1-24, wherein the additional amino acid substitution of (a) is at a position selected from the group consisting of positions 350, 351, 387, 437, 441, 632, and 880 relative to wild-type RNA polymerase, wherein the wild-type RNA polymerase comprises the amino acid sequence of SEQ ID NO: 1.

45. The RNA polymerase variant of paragraph 44, wherein the additional amino acid substitution of (a) is at position 350 and the additional amino acid substitution at position 350 is selected from the group consisting of E350R, E350K, E350D, E350Q, E350N, E350T, E350S, E350C, E350G, E350A, E350V, E350L, E350I, E350Y, E350W, and E350F.

46. The RNA polymerase variant of paragraph 44, wherein the additional amino acid substitution of (a) is at position 351, and the additional amino acid substitution at position 351 is selected from the group consisting of D351R, D351K, D351Q, D351T, D351C, D351V, D351L, D351M, and D351W.

47. The RNA polymerase variant of paragraph 44, wherein the additional amino acid substitution of (a) is at position 387, and the additional amino acid substitution at position 387 is selected from the group consisting of K387H, K387E, K387N, K387T, K387S, K387G, K387A, K387Y, K387W, and K387F.

48. The RNA polymerase variant of paragraph 44, wherein the additional amino acid substitution of (a) is at position 437 and the additional amino acid substitution at position 437 is selected from the group consisting of N437T, N437I, N437Y, N437W, and N437F.

49. The RNA polymerase variant of paragraph 44, wherein the additional amino acid substitution of (a) is at position 441 and the additional amino acid substitution at position 441 is K441R.

50. The RNA polymerase variant of paragraph 44, wherein the additional amino acid substitution of (a) is at position 632 and the additional amino acid substitution at position 632 is selected from the group consisting of R632K and R632T.

51. The RNA polymerase variant of paragraph 44, wherein the additional amino acid substitution of (a) is at position 880 and the additional amino acid substitution at position 880 is F880Y.

52. The RNA polymerase variant of paragraph 22, wherein the additional C-terminal amino acid is glutamine, threonine, or proline.

53. The RNA polymerase variant of paragraph 44, wherein the additional amino acid substitution of (a) is at position 350 and the additional amino acid substitution at position 350 is selected from the group consisting of E350K, E350N, E350A, and E350W.

54. The RNA polymerase variant of paragraph 44, wherein the additional amino acid substitution of (a) is at position 351 and the additional amino acid substitution at position 351 is D351V.

55. The RNA polymerase variant of paragraph 44, wherein the additional amino acid substitution of (a) is at position 387, and the additional amino acid substitution at position 387 is selected from the group consisting of K387H, K387N, and K387S.

56. The RNA polymerase variant of paragraph 44, wherein the additional amino acid substitution of (a) is at position 437 and the additional amino acid substitution at position 437 is selected from the group consisting of N437T, N437I, N437Y and N437F.

57. The RNA polymerase variant of paragraph 44, wherein the additional amino acid substitution of (a) is at position 441 and the additional amino acid substitution at position 441 is K441R.

58. The RNA polymerase variant of paragraph 44, wherein the additional amino acid substitution of (a) is at position 880 and the additional amino acid substitution at position 880 is F880Y.

59. The RNA polymerase variant of paragraph 22, wherein the additional C-terminal amino acid is threonine, serine, alanine, or proline.

60. A variant RNA polymerase comprising an RNA polymerase comprising:

(a) amino acid substitutions at positions 350, 351, and 387; and

(b) additional amino acid substitutions and/or amino acid modifications at the C-terminus relative to a wild-type RNA polymerase, wherein the wild-type RNA polymerase comprises the amino acid sequence of SEQ ID NO. 1.

61. The RNA polymerase variant of paragraph 60, wherein:

the additional amino acid substitution at position 350 is selected from the group consisting of E350A, E350K, E350N, and E350W;

the additional amino acid substitution at position 351 is D351V; and/or

The additional amino acid substitution at position 387 is selected from the group consisting of K387S, K387H, and K387N.

62. A variant RNA polymerase comprising an RNA polymerase comprising:

(a) amino acid substitutions at positions 437 and 441; and

(b) additional amino acid substitutions and/or amino acid modifications at the C-terminus relative to a wild-type RNA polymerase, wherein the wild-type RNA polymerase comprises the amino acid sequence of SEQ ID NO. 1.

63. The RNA polymerase variant of paragraph 62, wherein:

the additional amino acid substitution at position 437 is selected from the group consisting of N437T, N437Y, N437I, and N437F; and/or

The additional amino acid substitution at position 441 is K441R.

64. A variant RNA polymerase comprising an RNA polymerase comprising:

(a) an amino acid substitution at position 880; and

(b) amino acid modification at the C-terminus relative to a wild-type RNA polymerase, wherein the wild-type RNA polymerase comprises the amino acid sequence of SEQ ID NO. 1.

65. The RNA polymerase variant of paragraph 64, wherein:

the additional amino acid substitution at position 880 is F880Y; and/or

The amino acid modification at the C-terminus is an additional amino acid selected from alanine, serine, threonine, and proline.

66. A variant RNA polymerase comprising an RNA polymerase comprising:

(a) amino acid substitutions at positions 632, 653, and 657; and

(b) additional amino acid substitutions and/or amino acid modifications at the C-terminus relative to a wild-type RNA polymerase, wherein the wild-type RNA polymerase comprises the amino acid sequence of SEQ ID NO. 1.

67. The RNA polymerase variant of paragraph 66, wherein:

the additional amino acid substitution at position 632 is selected from R632K and R632T;

the additional amino acid substitution at position 653 is selected from D653T and D653K; and/or

the additional amino acid substitution at position 657 is selected from P657W, P657R, and P657A.

68. The RNA polymerase variant of any of paragraphs 60-67, comprising the additional amino acid substitution of (b).

69. The RNA polymerase variant of paragraph 68, wherein the additional amino acid substitution of (b) is at position 47.

70. The RNA polymerase variant of paragraph 69, wherein the additional amino acid substitution of (b) at position 47 is G47A.

71. The RNA polymerase variant of any of paragraphs 60-70, comprising an amino acid modification at the C-terminus.

72. The RNA polymerase variant of paragraph 71, wherein the amino acid modification at the C-terminus comprises an additional C-terminal amino acid.

73. The RNA polymerase variant of paragraph 72, wherein the additional C-terminal amino acid is glycine.

74. The RNA polymerase variant of any of paragraphs 1-73, comprising an amino acid sequence that is at least 90% identical to a wild-type RNA polymerase, wherein the wild-type RNA polymerase comprises the amino acid sequence of SEQ ID No. 1.
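The percent-identity criterion of paragraph 74 can be sketched in Python as follows. This is a minimal illustration under stated assumptions: the two sequences are taken to be already aligned to equal length with `-` marking gaps, and identity is computed as matching columns divided by compared columns. Other definitions of the denominator exist, the disclosure does not specify one here, and the function name and toy sequences are hypothetical.

```python
def percent_identity(aligned_a, aligned_b):
    """Percent identity of two pre-aligned sequences of equal length.

    Assumption: columns where both sequences have a gap are ignored, and a
    residue facing a gap counts as a compared, non-matching column.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == "-" and b == "-":
            continue
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable columns")
    return 100.0 * matches / compared


# Hypothetical 10-residue sequences differing at one position (not SEQ ID NO: 1).
reference = "MKTAYIAKQR"
candidate = "MKTAYIAKQK"
identity = percent_identity(reference, candidate)
print(identity)
print(identity >= 90.0)
```

In practice the alignment itself would come from a sequence alignment tool; an additional C-terminal amino acid would appear as a residue opposite a gap in the wild-type sequence.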

75. A method comprising producing a ribonucleic acid (RNA) transcript in an in vitro transcription reaction comprising a polynucleotide template, nucleoside triphosphates, a cap analog, and an RNA polymerase comprising at least one mutation relative to a wild-type RNA polymerase, wherein the reaction comprises a concentration of the cap analog that is at least 5-fold lower than the concentration of the cap analog required to produce an equivalent amount of the RNA transcript using the wild-type RNA polymerase, optionally wherein the wild-type RNA polymerase is wild-type T7RNA polymerase.

76. The method of paragraph 75, wherein more than 80% of the RNA transcripts produced comprise a functional cap.

77. The method of paragraph 75 or 76, wherein the RNA transcript produced has a 3' homogeneity greater than a threshold, wherein the threshold 3' homogeneity is at least 50% 3' homogeneity.

78. The method of any of paragraphs 75-77, wherein the RNA transcripts produced have less than a threshold amount of dsRNA, wherein the threshold amount of dsRNA is 5 ng dsRNA/25 μg mRNA.

79. A method comprising producing an RNA transcript in an in vitro transcription reaction comprising a polynucleotide template, nucleoside triphosphates, and an RNA polymerase variant of any of paragraphs 1-74.

80. A method comprising producing an RNA transcript in an in vitro transcription reaction comprising a polynucleotide template, nucleoside triphosphates, a cap analog, and an RNA polymerase variant of any of paragraphs 1-72.

81. The method of paragraph 79 or 80, wherein the nucleoside triphosphate comprises unmodified or modified ATP, modified or unmodified UTP, modified or unmodified GTP and/or modified or unmodified CTP.

82. The method of paragraphs 80 or 81, wherein said reaction comprises a concentration of the cap analog that is at least 2-fold lower, at least 5-fold lower, or at least 10-fold lower than the concentration of the cap analog required to produce an equivalent amount of RNA transcript using wild-type RNA polymerase.

83. The method of any of paragraphs 80-82, wherein more than 80%, more than 85%, more than 90% or more than 95% of the RNA transcripts produced comprise a functional cap.

84. The method of any of paragraphs 80-83, wherein the nucleoside triphosphates and the cap analog are present in the reaction at equimolar concentrations.

85. The method of any one of paragraphs 80-84, wherein the molar ratio of cap analog to nucleoside triphosphate in the reaction is greater than 1:1 or equal to 1:1.
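The concentration relationships in paragraphs 82, 84 and 85 reduce to two ratios, sketched below in Python. All numerical values are hypothetical placeholders chosen only to exercise the arithmetic and are not data from the disclosure; the function names are likewise assumptions, and the nucleoside triphosphate concentration is assumed to be given per nucleoside triphosphate.

```python
import math


def fold_reduction(reference_cap_mm, variant_cap_mm):
    """How many times lower the variant reaction's cap analog concentration is."""
    if reference_cap_mm <= 0 or variant_cap_mm <= 0:
        raise ValueError("concentrations must be positive")
    return reference_cap_mm / variant_cap_mm


def cap_to_ntp_ratio(cap_mm, ntp_mm):
    """Molar ratio of cap analog to nucleoside triphosphate."""
    if ntp_mm <= 0:
        raise ValueError("NTP concentration must be positive")
    return cap_mm / ntp_mm


# Hypothetical concentrations in mM (illustrative only).
wild_type_cap_mm = 10.0  # cap analog needed with wild-type polymerase
variant_cap_mm = 2.0     # cap analog used with the variant
ntp_mm = 2.0             # each nucleoside triphosphate

fold = fold_reduction(wild_type_cap_mm, variant_cap_mm)
ratio = cap_to_ntp_ratio(variant_cap_mm, ntp_mm)

# Paragraph 82 thresholds: at least 2-fold, 5-fold, or 10-fold lower.
meets_fold = {threshold: fold >= threshold for threshold in (2, 5, 10)}
# Paragraphs 84 and 85: equimolar, or cap analog : NTP of at least 1:1.
equimolar = math.isclose(ratio, 1.0)
at_least_one_to_one = ratio > 1.0 or equimolar

print(fold, meets_fold)
print(ratio, equimolar, at_least_one_to_one)
```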

86. The method of any of paragraphs 80-85, wherein the cap analog is a dinucleotide cap, a trinucleotide cap, or a tetranucleotide cap.

87. The method of any of paragraphs 80-86, wherein the cap analog is a natural cap analog or a synthetic cap analog.

88. The method of paragraphs 86 or 87, wherein the cap analog is a trinucleotide cap comprising a sequence selected from the group consisting of: GAA, GAC, GAG, GAU, GCA, GCC, GCG, GCU, GGA, GGC, GGG, GGU, GUA, GUC, GUG and GUU.

89. The method of paragraph 88, wherein the trinucleotide cap comprises a sequence selected from the group consisting of: GAG, GCG, GUG and GGG.

90. The method of paragraph 89, wherein the trinucleotide cap comprises sequence GAG.

91. The method of paragraph 90, wherein the trinucleotide cap comprises a sequence selected from the group consisting of:

(a) m7GpppApA, m7GpppApC, m7GpppApG, m7GpppApU, m7GpppCpA, m7GpppCpC, m7GpppCpG, m7GpppCpU, m7GpppGpA, m7GpppGpC, m7GpppGpG, m7GpppGpU, m7GpppUpA, m7GpppUpC, m7GpppUpG and m7GpppUpU;

(b) m7G3′OMepppApA, m7G3′OMepppApC, m7G3′OMepppApG, m7G3′OMepppApU, m7G3′OMepppCpA, m7G3′OMepppCpC, m7G3′OMepppCpG, m7G3′OMepppCpU, m7G3′OMepppGpA, m7G3′OMepppGpC, m7G3′OMepppGpG, m7G3′OMepppGpU, m7G3′OMepppUpA, m7G3′OMepppUpC, m7G3′OMepppUpG and m7G3′OMepppUpU;

(c) m7G3′OMepppA2′OMepA, m7G3′OMepppA2′OMepC, m7G3′OMepppA2′OMepG, m7G3′OMepppA2′OMepU, m7G3′OMepppC2′OMepA, m7G3′OMepppC2′OMepC, m7G3′OMepppC2′OMepG, m7G3′OMepppC2′OMepU, m7G3′OMepppG2′OMepA, m7G3′OMepppG2′OMepC, m7G3′OMepppG2′OMepG, m7G3′OMepppG2′OMepU, m7G3′OMepppU2′OMepA, m7G3′OMepppU2′OMepC, m7G3′OMepppU2′OMepG and m7G3′OMepppU2′OMepU; or

(d) m7GpppA2′OMepA, m7GpppA2′OMepC, m7GpppA2′OMepG, m7GpppA2′OMepU, m7GpppC2′OMepA, m7GpppC2′OMepC, m7GpppC2′OMepG, m7GpppC2′OMepU, m7GpppG2′OMepA, m7GpppG2′OMepC, m7GpppG2′OMepG, m7GpppG2′OMepU, m7GpppU2′OMepA, m7GpppU2′OMepC, m7GpppU2′OMepG and m7GpppU2′OMepU.
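The sixteen trinucleotide sequences of paragraph 88 and the four families (a) to (d) of paragraph 91 follow a regular pattern: a guanosine cap followed by every ordered pair of A, C, G and U, with or without 3′-O-methyl and 2′-O-methyl groups. The Python sketch below enumerates them. The name templates are an assumption inferred from the listed names, written in the same flattened notation as the text; they are not a formal nomenclature.

```python
from itertools import product

BASES = "ACGU"

# Paragraph 88: G followed by every ordered pair of bases (16 sequences).
trinucleotides = ["G" + x + y for x, y in product(BASES, repeat=2)]

# Paragraph 91: name templates for families (a) to (d), where x and y are the
# second and third nucleotides of the trinucleotide cap.
TEMPLATES = {
    "a": "m7Gppp{x}p{y}",
    "b": "m7G3′OMeppp{x}p{y}",
    "c": "m7G3′OMeppp{x}2′OMep{y}",
    "d": "m7Gppp{x}2′OMep{y}",
}

families = {
    key: [template.format(x=x, y=y) for x, y in product(BASES, repeat=2)]
    for key, template in TEMPLATES.items()
}

print(trinucleotides)
for key, names in families.items():
    print(key, len(names), names[0], names[-1])
```

Because `product` iterates the second nucleotide first and the third nucleotide fastest, the generated order matches the order in which the names are listed.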

92. The method of paragraph 91, wherein the trinucleotide cap comprises GpppA2′OMepG.

93. The method of any one of paragraphs 75-92, wherein the polynucleotide template comprises a 2′-deoxythymidine residue or a 2′-deoxycytidine residue at template position +1.

94. The method of any of paragraphs 75-93, wherein the RNA transcript produced stimulates a cytokine response at least 50% lower when delivered to a cell, optionally in an unpurified form, relative to RNA produced using wild-type RNA polymerase.

95. The method of any of paragraphs 75-94, wherein the concentration of double-stranded RNA (dsRNA) transcript produced is at least 50% lower relative to a dsRNA transcript produced using wild-type RNA polymerase.

96. The method of any one of paragraphs 75-95, wherein less than 50%, less than 25% or less than 10% of the RNA transcripts produced are dsRNA.

97. The method of any one of paragraphs 75-96, wherein less than 30% or less than 20% of the RNA transcripts produced exhibit 3' heterogeneity.

98. The method of any one of paragraphs 75-97, wherein less than 50%, less than 25%, or less than 10% of the RNA transcripts produced are run-on RNA transcripts.

99. The method of any one of paragraphs 75-98, wherein the amount of full-length RNA transcript produced is at least 15-fold greater than the amount of polynucleotide template.

100. The method of any of paragraphs 75-99, wherein the ratio of dsRNA to full-length RNA transcript produced is less than 1:1.

101. The method of any one of paragraphs 75-100, wherein the RNA transcript produced has fewer than 1 mutation per 100 nucleotides relative to the polynucleotide template.
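The fidelity criterion of paragraph 101 (fewer than 1 mutation per 100 nucleotides) can be expressed as a simple rate. The Python sketch below is a simplification under stated assumptions: it compares an observed transcript with the RNA sequence expected from the template, position by position, and counts substitutions only; insertions and deletions would need an alignment step that is not shown. The function name and sequences are hypothetical.

```python
def mutations_per_100_nt(expected, observed):
    """Substitution mismatches per 100 nucleotides between two equal-length RNAs."""
    if len(expected) != len(observed):
        raise ValueError("sequences must have equal length")
    if not expected:
        raise ValueError("sequences must not be empty")
    mismatches = sum(1 for e, o in zip(expected, observed) if e != o)
    return 100.0 * mismatches / len(expected)


# Hypothetical 200-nt expected transcript with a single substitution at index 10.
expected = "ACGU" * 50
observed = expected[:10] + "A" + expected[11:]

rate = mutations_per_100_nt(expected, observed)
print(rate)
print(rate < 1.0)  # satisfies "fewer than 1 mutation per 100 nucleotides"
```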

102. A nucleic acid encoding the RNA polymerase variant of any of paragraphs 1-74.

103. A composition comprising an RNA polymerase variant of any of paragraphs 1-74 and optionally nucleoside triphosphates.

104. A kit comprising the RNA polymerase variant of any of paragraphs 1-74 and In Vitro Transcription (IVT) reagents.

105. A ribonucleic acid (RNA), optionally messenger RNA (mRNA), produced by the method of any of paragraphs 75-104.

106. A lipid nanoparticle comprising the RNA of paragraph 103, optionally wherein the lipid nanoparticle comprises a molar ratio of 20-60% ionizable amino lipid, 5-25% non-cationic lipid, 25-55% sterol, and 0.5-15% PEG-modified lipid.
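The optional molar ranges of paragraph 106 lend themselves to a simple range check, sketched below in Python. The dictionary keys, the function name and both example compositions are assumptions made for illustration. The sketch also assumes that the four components together account for 100 mol% of the lipid, which the paragraph does not state explicitly.

```python
import math

# Molar percent ranges recited in paragraph 106.
MOLAR_PERCENT_RANGES = {
    "ionizable_amino_lipid": (20.0, 60.0),
    "non_cationic_lipid": (5.0, 25.0),
    "sterol": (25.0, 55.0),
    "peg_modified_lipid": (0.5, 15.0),
}


def check_composition(mol_percent):
    """Return a list of problems; an empty list means every range is satisfied."""
    problems = []
    for component, (low, high) in MOLAR_PERCENT_RANGES.items():
        value = mol_percent.get(component)
        if value is None:
            problems.append(f"{component}: missing")
        elif not low <= value <= high:
            problems.append(f"{component}: {value} outside {low}-{high} mol%")
    total = sum(mol_percent.values())
    # Assumption: the listed components make up the whole lipid fraction.
    if not math.isclose(total, 100.0, abs_tol=1e-6):
        problems.append(f"total is {total} mol%, expected 100")
    return problems


# Hypothetical compositions in mol% (illustrative only).
in_range = {
    "ionizable_amino_lipid": 50.0,
    "non_cationic_lipid": 10.0,
    "sterol": 38.5,
    "peg_modified_lipid": 1.5,
}
out_of_range = {
    "ionizable_amino_lipid": 70.0,
    "non_cationic_lipid": 10.0,
    "sterol": 19.0,
    "peg_modified_lipid": 1.0,
}

print(check_composition(in_range))
print(check_composition(out_of_range))
```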

107. A variant RNA polymerase derived from a starting RNA polymerase having an amino acid modification at position G47 and an additional C-terminal amino acid relative to the wild-type amino acid sequence of T7RNA polymerase comprising the sequence of SEQ ID NO:1, wherein the variant comprises at least one substitution that affects binding of a first nucleotide to a D site within the variant RNA polymerase when the variant RNA polymerase is in a conformational state for de novo initiation of RNA synthesis, and wherein the amino acid substitution results in at least one of the following benefits relative to the starting RNA polymerase:

(i) increased transcription efficiency;

(ii) increased co-transcriptional capping efficiency;

(iii) increased RNA yield;

(iv) increased 3' homogeneity of the RNA transcript;

(v) improved transcription fidelity; and

(vi) a reduced amount of dsRNA in the reaction mixture.

108. An RNA polymerase variant comprising the amino acid sequence of any one of SEQ ID NOs 3-14, 45-48 or 242-247, wherein X is any amino acid selected from the group consisting of R, K, H, E, D, Q, N, T, S, C, G, A, V, L, I, M, P, Y, W and F.

109. The RNA polymerase of paragraph 108, comprising the amino acid sequence of SEQ ID NO: 47.

110. The RNA polymerase of paragraph 109, wherein X is W.

111. The RNA polymerase variant of any of paragraphs 108 and 110 further comprising a G47A substitution.

112. The RNA polymerase variant of any of paragraphs 108 and 111 further comprising an additional C-terminal amino acid.

113. The RNA polymerase variant of paragraph 112, wherein the additional C-terminal amino acid is glycine.

114. An RNA polymerase variant comprising the amino acid sequence of any one of SEQ ID NOS 61-241.

115. A nucleic acid encoding the RNA polymerase variant of paragraph 114.

116. A method comprising producing an RNA transcript in an in vitro transcription reaction comprising a polynucleotide template, nucleoside triphosphates, and an RNA polymerase variant of paragraph 114.

117. A method comprising producing an RNA transcript in an in vitro transcription reaction comprising a polynucleotide template, nucleoside triphosphates, a cap analog, and an RNA polymerase variant of paragraph 114.

118. A variant RNA polymerase comprising an RNA polymerase comprising:

(a) an amino acid substitution at position E350, K387, N437, F880, or D653;

(b) an amino acid substitution at position G47; and/or

(c) an amino acid modification at the C-terminus, relative to a wild-type RNA polymerase comprising the amino acid sequence of SEQ ID NO: 1.

119. The RNA polymerase of paragraph 118 wherein the amino acid substitution of (a) is selected from the group consisting of E350N, K387N, N437F, F880Y and D653W.

120. The RNA polymerase variant of paragraph 119, wherein the amino acid substitution of (a) is D653W.

121. The RNA polymerase variant of any of paragraphs 118-120 wherein the amino acid substitution at position G47 is G47A.

122. The RNA polymerase variant as described in any of paragraphs 118-121, wherein the amino acid modification at the C-terminus is an additional glycine, an additional alanine, an additional threonine or an additional proline.

123. An RNA polymerase variant comprising an RNA polymerase comprising amino acid substitutions at two positions selected from the group consisting of E350, D351, K387, N437, K441, D506, R632, D653, S628, P657 and F880 relative to a wild-type RNA polymerase comprising the amino acid sequence of SEQ ID NO: 1.

124. The RNA polymerase variant of paragraph 123 comprising amino acid substitutions at E350 and D351.

125. The RNA polymerase variant of paragraph 123, comprising amino acid substitutions at E350 and K387.

126. The RNA polymerase variant of paragraph 123, comprising amino acid substitutions at K387 and D653.

127. The RNA polymerase variant of any of paragraphs 123-125 wherein the amino acid substitution at position E350 is E350W, E350A, E350K or E350N.

128. The RNA polymerase variant of paragraphs 123 or 124, wherein the amino acid substitution at position D351 is D351V.

129. The RNA polymerase variant of any of paragraphs 123, 125 or 126, wherein the amino acid substitution at position K387 is K387N, K387S, or K387H.

130. The RNA polymerase variant of paragraphs 123 or 126, wherein the amino acid substitution at position D653 is D653T or D653K.

131. A method comprising producing an RNA transcript in an in vitro transcription reaction comprising a polynucleotide template, nucleoside triphosphates, a cap analog, and an RNA polymerase variant as described in any of the preceding paragraphs, wherein the cap analog is a trinucleotide cap analog or a tetranucleotide cap analog.

132. The method of any one of the preceding paragraphs, wherein the cap analog is a trinucleotide cap analog comprising a GAG.

133. The method of paragraph 132, wherein the GAG cap analogs are selected from:

134. The method of any one of the preceding paragraphs, wherein the cap analog is a tetranucleotide cap analog comprising GGAG.

135. The method of paragraph 134, wherein the tetranucleotide cap analog is selected from the group consisting of:

136. The method of any one of the preceding paragraphs, wherein more than 80%, more than 85%, more than 90%, or more than 95% of the RNA transcripts produced comprise a cap analog.

137. The method of any of the preceding paragraphs, wherein the method produces at least 50%, at least 60%, or at least 75% more RNA transcripts comprising a cap analog than a control in vitro transcription reaction of wild-type RNA polymerase comprising SEQ ID NO: 1.

138. The method of any one of the preceding paragraphs, wherein the molar ratio of the cap analog to the nucleoside triphosphate in the reaction is between 1:10 and 1:1.

139. The method of any one of the preceding paragraphs, wherein less than 1%, less than 0.5%, or less than 0.1% of the RNA transcripts produced are double-stranded RNA (dsRNA).

140. The method of any one of the preceding paragraphs, wherein the reaction produces at least 5 mg/mL, at least 6 mg/mL, at least 7 mg/mL, at least 8 mg/mL, at least 9 mg/mL, or at least 10 mg/mL of the RNA transcript.

141. The method of any one of the preceding paragraphs, wherein at least 85%, at least 90%, or at least 95% of the RNA transcripts produced are full-length RNA transcripts.

142. The method of any one of the preceding paragraphs, wherein the method produces at least 10%, at least 25%, or at least 50% more RNA transcripts comprising the cap analog than a control in vitro transcription reaction comprising a control RNA polymerase variant, wherein the control RNA polymerase variant is derived from SEQ ID NO:1 and comprises the G47A mutation and an additional glycine at the C-terminus.

143. A method comprising producing an RNA transcript in an in vitro transcription reaction comprising a polynucleotide template, nucleoside triphosphates, a cap analog, and a wild-type RNA polymerase, wherein the cap analog is a trinucleotide cap analog or a tetranucleotide cap analog.

144. The method of paragraph 143, wherein the wild type RNA polymerase comprises the amino acid sequence of SEQ ID NO 1.

145. The method of paragraphs 143 or 144 wherein the cap analog is a tetranucleotide cap analog comprising GGAG.

146. The method as described in any of paragraphs 143-145, wherein the tetranucleotide cap analog is selected from the group consisting of:

Wild-type T7 RNA polymerase

MNTINIAKNDFSDIELAAIPFNTLADHYGERLAREQLALEHESYEMGEARFRKMFERQLKAGEVADNAAAKPLITTLLPKMIARINDWFEEVKAKRGKRPTAFQFLQEIKPEAVAYITIKTTLACLTSADNTTVQAVASAIGRAIEDEARFGRIRDLEAKHFKKNVEEQLNKRVGHVYKKAFMQVVEADMLSKGLLGGEAWSSWHKEDSIHVGVRCIEMLIESTGMVSLHRQNAGVVGQDSETIELAPEYAEAIATRAGALAGISPMFQPCVVPPKPWTGITGGGYWANGRRPLALVRTHSKKALMRYEDVYMPEVYKAINIAQNTAWKINKKVLAVANVITKWKHCPVEDIPAIEREELPMKPEDIDMNPEALTAWKRAAAAVYRKDKARKSRRISLEFMLEQANKFANHKAIWFPYNMDWRGRVYAVSMFNPQGNDMTKGLLTLAKGKPIGKEGYYWLKIHGANCAGVDKVPFPERIKFIEENHENIMACAKSPLENTWWAEQDSPFCFLAFCFEYAGVQHHGLSYNCSLPLAFDGSCSGIQHFSAMLRDEVGGRAVNLLPSETVQDIYGIVAKKVNEILQADAINGTDNEVVTVTDENTGEISEKVKLGTKALAGQWLAYGVTRSVTKRSVMTLAYGSKEFGFRQQVLEDTIQPAIDSGKGLMFTQPNQAAGYMAKLIWESVSVTVVAAVEAMNWLKSAAKLLAAEVKDKKTGEILRKRCAVHWVTPDGFPVWQEYKKPIQTRLNLMFLGQFRLQPTINTNKDSEIDAHKQESGIAPNFVHSQDGSHLRKTVVWAHEKYGIESFALIHDSFGTIPADAANLFKAVRETMVDTYESCDVLADFYDQFADQLHESQLDKMPALPAKGNLNLRDILESDFAFA(SEQ ID NO:1)

Control T7 RNA polymerase variant (G47A + C-terminal G)

MNTINIAKNDFSDIELAAIPFNTLADHYGERLAREQLALEHESYEMAEARFRKMFERQLKAGEVADNAAAKPLITTLLPKMIARINDWFEEVKAKRGKRPTAFQFLQEIKPEAVAYITIKTTLACLTSADNTTVQAVASAIGRAIEDEARFGRIRDLEAKHFKKNVEEQLNKRVGHVYKKAFMQVVEADMLSKGLLGGEAWSSWHKEDSIHVGVRCIEMLIESTGMVSLHRQNAGVVGQDSETIELAPEYAEAIATRAGALAGISPMFQPCVVPPKPWTGITGGGYWANGRRPLALVRTHSKKALMRYEDVYMPEVYKAINIAQNTAWKINKKVLAVANVITKWKHCPVEDIPAIEREELPMKPEDIDMNPEALTAWKRAAAAVYRKDKARKSRRISLEFMLEQANKFANHKAIWFPYNMDWRGRVYAVSMFNPQGNDMTKGLLTLAKGKPIGKEGYYWLKIHGANCAGVDKVPFPERIKFIEENHENIMACAKSPLENTWWAEQDSPFCFLAFCFEYAGVQHHGLSYNCSLPLAFDGSCSGIQHFSAMLRDEVGGRAVNLLPSETVQDIYGIVAKKVNEILQADAINGTDNEVVTVTDENTGEISEKVKLGTKALAGQWLAYGVTRSVTKRSVMTLAYGSKEFGFRQQVLEDTIQPAIDSGKGLMFTQPNQAAGYMAKLIWESVSVTVVAAVEAMNWLKSAAKLLAAEVKDKKTGEILRKRCAVHWVTPDGFPVWQEYKKPIQTRLNLMFLGQFRLQPTINTNKDSEIDAHKQESGIAPNFVHSQDGSHLRKTVVWAHEKYGIESFALIHDSFGTIPADAANLFKAVRETMVDTYESCDVLADFYDQFADQLHESQLDKMPALPAKGNLNLRDILESDFAFAG(SEQ ID NO:44)
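The relationship between the two sequences above (SEQ ID NO: 44 is SEQ ID NO: 1 with a G47A substitution plus one additional C-terminal glycine) can be sketched in code. The following Python sketch is illustrative only: the helper names `apply_substitution` and `add_c_terminal` are hypothetical, and the input is just the first 50 residues of SEQ ID NO: 1 as printed above, not the full-length polymerase.

```python
import re


def apply_substitution(seq: str, sub: str) -> str:
    """Apply a substitution written as <wild-type residue><1-based position><new residue>, e.g. 'G47A'."""
    match = re.fullmatch(r"([A-Z])(\d+)([A-Z])", sub)
    if match is None:
        raise ValueError(f"unrecognized substitution: {sub!r}")
    wild_type, position, new = match.group(1), int(match.group(2)), match.group(3)
    if not 1 <= position <= len(seq):
        raise ValueError(f"position {position} is outside the sequence")
    if seq[position - 1] != wild_type:
        raise ValueError(
            f"expected {wild_type} at position {position}, found {seq[position - 1]}"
        )
    return seq[: position - 1] + new + seq[position:]


def add_c_terminal(seq: str, residue: str = "G") -> str:
    """Append one additional residue at the C-terminus."""
    return seq + residue


# First 50 residues of SEQ ID NO: 1 (illustrative fragment only).
wt_fragment = "MNTINIAKNDFSDIELAAIPFNTLADHYGERLAREQLALEHESYEMGEAR"

variant_fragment = apply_substitution(wt_fragment, "G47A")
print(variant_fragment[40:50])  # residues 41-50 after the G47A substitution
```

On a full-length sequence, the control variant would be obtained by calling `add_c_terminal(apply_substitution(full_length_wt, "G47A"), "G")`, where `full_length_wt` is a placeholder for the complete SEQ ID NO: 1 string. The wild-type residue check guards against off-by-one numbering errors.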

Examples

Example 1. Production of RNA polymerase variants

The RNA polymerase variants were generated using the substitutions shown in Tables 2-6.

TABLE 2 RNA polymerase variants

TABLE 3 exemplary monosubstituted variants

TABLE 4 exemplary polysubstituted variants

TABLE 5 exemplary polysubstituted + C-terminal G variants

TABLE 6 additional multiple substitution variants

Example 2. IVT reactions using polysubstituted + C-terminal G RNA polymerase variants

In vitro transcription (IVT) reactions were performed using DNA templates, GAG cap analogs, and the polysubstituted + C-terminal G RNA polymerase variants provided in Table 5. All polymerase variants used in this example include a G47A mutation, a C-terminal G addition, and an additional amino acid substitution at position E350, D351, K387, R394, R425, Y427, N437, K441, R632, H811, F880 or G884.

The following RNA polymerase variants produced total RNA yields in IVT reactions from 60% to > 100% of the total yield of control IVT reactions performed using the control RNA polymerase variant (G47A + C-terminal G): E350R, E350K, E350D, E350Q, E350N, E350T, E350S, E350C, E350G, E350A, E350V, E350L, E350I, E350P, E350Y, E350W and E350F; D351R, D351K, D351Q, D351T, D351S, D351C, D351V, D351L, D351I, D351M, D351P, D351Y and D351W; K387R, K387H, K387T, K387S, K387V, K387L, K387I, and K387M; R394K; N437Q, N437T, N437S, N437G and N437F; F880Y; and 884S and 884A (C-terminal addition) (data not shown).

The following RNA polymerase variants produced equal or higher levels of 3' homogeneity of RNA in IVT reactions than in control IVT reactions performed using the control RNA polymerase variant (G47A + C-terminal G): E350N, E350C, E350G, E350Y, E350W and E350F; D351R, D351S, D351L, D351M and D351Y; K387R, K387T, K387L and K387M; R394K; N437R, N437K, N437H, N437T, N437V, N437I and N437W; R632K and R632T; and 884Q, 884T, and 884P (C-terminal addition) (data not shown).

The following RNA polymerase variants produced an equal or higher (up to 20% increase) percentage of capped RNA (the percentage of total RNA comprising a GAG cap) relative to RNA produced in control IVT reactions performed using the control RNA polymerase variant (G47A + C-terminal G): E350R, E350K, E350D, E350Q, E350N, E350T, E350S, E350C, E350G, E350A, E350V, E350L, E350I, E350Y, E350W and E350F; D351R, D351K, D351Q, D351T, D351C, D351V, D351L, D351M and D351W; K387H, K387E, K387N, K387T, K387S, K387G, K387A, K387Y, K387W and K387F; N437T, N437I, N437Y, N437W and N437F; K441R; R632K and R632T; F880Y; and 884Q, 884T, 884S, 884A, and 884P (C-terminal addition) (data not shown).

Example 3. Polysubstituted + C-terminal G RNA polymerase variants produce RNA products with more desirable characteristics relative to control polymerase variants

In vitro transcription reactions were performed using DNA template, GAG cap analogs (0.75 mM, 2.25 mM, 3.75 mM and 7.5 mM) and: (1) G47A + C-terminal G RNA polymerase variant (control polymerase variant; G47A + C-terminal G); (2) G47A/K387S + C-terminal G RNA polymerase variant (K387S); (3) G47A/K387H + C-terminal G RNA polymerase variant (K387H); (4) G47A/K387N + C-terminal G RNA polymerase variant (K387N); (5) G47A/E350K + C-terminal G RNA polymerase variant (E350K); (6) G47A/E350N + C-terminal G RNA polymerase variant (E350N); (7) G47A/E350A + C-terminal G RNA polymerase variant (E350A); (8) G47A/E350W + C-terminal G RNA polymerase variant (E350W); and (9) G47A/D351V + C-terminal G RNA polymerase variant (D351V). Following the IVT reactions, the transcribed RNA products from each reaction were characterized to describe the quality of the RNA products, including percent capping, dsRNA contamination, purity, and 3' homogeneity.

After oligo dT purification, the total yield of total RNA produced using the polysubstituted variants (K387S, K387H, K387N, E350K, E350N, E350A, E350W, and D351V) was comparable to the yield using the control RNA polymerase variant (FIG. 1A). RNA yield was measured by UV absorption.

3' homogeneity of RNA transcripts was measured using RNAse T1 digestion. RNAse T1 specifically cleaves mRNA after G nucleotides. Endonucleolytic cleavage produces a "scar" of 5' hydroxide (OH) and 3' monophosphate (mP), while exonucleolytic cleavage produces a clean 5' OH/3' OH cut. Therefore, RNAse T1 digestion can be used to distinguish transcripts with and without non-templated addition at the 3' end. In this example, RNA produced using the polysubstituted variants had an equal or higher percentage of 3' end homogeneity relative to the control polymerase variant (FIG. 1B). In particular, as shown in FIG. 1B, the K387S, K387H, K387N, and E350N variants produced RNA in which the percentage of transcripts with homogeneous 3' ends was more than 20 percentage points higher than with the control variant.
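The logic of the RNAse T1 assay can be illustrated with a small in silico digestion. This is a minimal sketch and not the analytical method used in the example: the function names and the short toy transcripts are hypothetical, and the code models only the "cleave after every G" specificity, not the chemistry of the resulting ends.

```python
from typing import List


def rnase_t1_fragments(rna: str) -> List[str]:
    """Split an RNA sequence after every G, mimicking RNAse T1 specificity."""
    fragments: List[str] = []
    current = ""
    for base in rna.upper():
        current += base
        if base == "G":
            fragments.append(current)
            current = ""
    if current:  # trailing bases after the last G form the 3'-terminal fragment
        fragments.append(current)
    return fragments


def three_prime_fragment(rna: str) -> str:
    """Return the 3'-terminal digestion fragment (empty string for empty input)."""
    fragments = rnase_t1_fragments(rna)
    return fragments[-1] if fragments else ""


# Hypothetical toy transcripts: one fully templated, one with a single
# non-templated A added at the 3' end.
templated = "AUGCAAUC"
run_on = templated + "A"

print(three_prime_fragment(templated))
print(three_prime_fragment(run_on))
```

Because all internal fragments are identical between the two transcripts, only the 3'-terminal fragment differs, which is why that fragment reports on 3' end homogeneity.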

In this example, dsRNA contaminants (e.g., dsRNA longer than 40 nucleotide base pairs) were assessed after the IVT reaction using a standard ELISA. All IVT reaction mixtures generated from the polysubstituted and control variants contained less than ~4 ng dsRNA per 25 μg mRNA (FIG. 1C). In contrast, IVT reaction mixtures generated by wild-type T7 polymerase contained ~20 ng dsRNA per 25 μg mRNA.
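For readers who prefer a unitless figure, the ng-per-μg values reported above can be converted to a mass percentage. The sketch below is plain unit arithmetic; the function name `dsrna_mass_percent` is hypothetical, and the result is a mass fraction, which is not necessarily the same quantity as a percentage of transcripts.

```python
def dsrna_mass_percent(dsrna_ng: float, mrna_ug: float) -> float:
    """Convert ng dsRNA per ug mRNA into a mass percentage (1 ug = 1000 ng)."""
    if mrna_ug <= 0:
        raise ValueError("mRNA mass must be positive")
    return dsrna_ng / (mrna_ug * 1000.0) * 100.0


# Approximate values reported in this example, both per 25 ug mRNA.
variant_pct = dsrna_mass_percent(4, 25)     # polysubstituted and control variants (upper bound)
wild_type_pct = dsrna_mass_percent(20, 25)  # wild-type T7 polymerase

print(f"{variant_pct:.3f}% vs {wild_type_pct:.3f}%")
```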

Total RNA products were analyzed by LC-MS to determine the percent capped RNA (i.e., the percentage of transcribed RNA that contained a GAG cap). In the initial IVT reaction, all polysubstituted variants produced RNA with a higher percentage of capped RNA relative to the control variant at low and high amounts of GAG cap analog (FIGS. 1D-1E). In particular, as shown in FIGS. 1D-1E, when 0.75 mM GAG cap analog (the lowest cap concentration used in this IVT reaction series) was used, the K387S, K387H, K387N, E350A and D351V variants produced RNA with a percentage of capped RNA 10-25 percentage points higher than the control variant.

The purity of the transcribed RNA was assessed using the DBAA (dibutylammonium acetate) HPLC method. The polysubstituted variants produced RNAs of comparable purity (>90% purity in most experimental examples) relative to the control variant (FIG. 1F).

The percentage of tailed RNA (i.e., the percentage of transcribed RNA comprising a polyA tail) was assessed using the Tris RP (reverse phase) method. The polysubstituted variants produced RNAs with comparable percent tailing (>85% tailed) relative to the control variant (FIG. 1G).

The frequency of indels (insertions/deletions/single point mutations) in the transcribed RNA produced by all the polysubstituted variants was comparable to that produced by the control variant polymerase (FIG. 1H). On homopolymer stretches of >7 A (A7 in FIG. 1H), all variants gave an indel frequency of ~25% compared to an incidence of ~15% caused by wild-type polymerase. However, all variants resulted in an indel frequency in the homopolymer stretches of 5 A or 6 A (A5 and A6 in FIG. 1H, respectively) equal to the level caused by the wild-type polymerase.
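The homopolymer stretches referred to above (runs of 5, 6, or more consecutive A residues) can be located in a template or transcript sequence with a short script. This is an illustrative sketch only: the function name `find_a_homopolymers` and the toy sequence are hypothetical, and the script only locates A runs; it does not measure indel frequency.

```python
import re
from typing import List, Tuple


def find_a_homopolymers(seq: str, min_len: int = 5) -> List[Tuple[int, int]]:
    """Return (1-based start position, run length) for each run of A with length >= min_len."""
    if min_len < 1:
        raise ValueError("min_len must be at least 1")
    pattern = re.compile("A{%d,}" % min_len)
    return [(m.start() + 1, len(m.group())) for m in pattern.finditer(seq.upper())]


# Hypothetical toy sequence containing one A5 run and one A7 run.
toy_sequence = "GCAAAAAUCGAAAAAAAG"

runs = find_a_homopolymers(toy_sequence)
print(runs)
```

Grouping sequencing errors by the run length returned here would allow indel frequencies to be tabulated separately for A5, A6, and longer stretches, as in FIG. 1H.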

As demonstrated herein, the polysubstituted variants used in this example produce RNA products with more desirable or improved characteristics in IVT reactions relative to the control polymerase variant. Most notably, the K387S, K387H, K387N, E350K, E350N, E350A, E350W, and D351V variants exhibited increased capping efficiency at all concentrations of GAG cap analogs tested, relative to the control variants.

Example 4. Polysubstituted + C-terminal G RNA polymerase variants produce RNA products with increased capping efficiency relative to control polymerase variants

In vitro transcription reactions were performed using DNA template, one of three cap analogs (GGG cap, Gm6AG cap (named m6A) and Ge6AG cap (named e6A)) at different concentrations, and: (1) G47A + C-terminal G RNA polymerase variant (control polymerase variant; G47A + C-terminal G); (2) G47A/K387S + C-terminal G RNA polymerase variant (K387S); (3) G47A/K387H + C-terminal G RNA polymerase variant (K387H); (4) G47A/K387N + C-terminal G RNA polymerase variant (K387N); (5) G47A/E350K + C-terminal G RNA polymerase variant (E350K); (6) G47A/E350N + C-terminal G RNA polymerase variant (E350N); (7) G47A/E350A + C-terminal G RNA polymerase variant (E350A); (8) G47A/E350W + C-terminal G RNA polymerase variant (E350W); (9) G47A/D351V + C-terminal G RNA polymerase variant (D351V); and (10) G884 RNA polymerase variant (G884 wt). IVT reactions using the GGG cap were initiated with 5' GTP; IVT reactions using the m6A and e6A caps were initiated with 5' ATP (FIGS. 2A-2C). After the IVT reaction, LC-MS was performed for each experiment to determine the percent capped RNA (i.e., the percentage of transcribed RNA that contained the cap).

At all tested concentrations of GGG cap analog, all of the tested polysubstituted variants (K387S, K387H, K387N, E350K, E350N, and E350W) produced significantly higher levels of capped RNA relative to the control variant when the GGG cap analog was incorporated during the IVT reaction (FIG. 2A). In experiments using a 2-fold concentration of GGG cap, the polysubstituted variants produced 50-65% capped RNA, whereas the control variant produced only 30% capped RNA.

At low (0.5-fold concentration) and high (2-fold concentration) concentrations of m6A cap analog, all of the tested polysubstituted variants (K387S, K387H, K387N, E350K, E350N, E350A, E350W and D351V) produced significantly higher levels of capped RNA relative to the control variant when the m6A cap analog was incorporated during the IVT reaction (FIG. 2B). In experiments using a 2-fold concentration of m6A cap, the polysubstituted variants yielded 80-85% capped RNA, whereas the control variant produced only 60% capped RNA. The G884 variant also produced higher levels of capped RNA compared to the control, with >85% capped RNA in experiments using a 2-fold concentration of m6A cap.

At low (0.5-fold concentration) and high (2-fold concentration) concentrations of e6A cap analog, the tested polysubstituted variants (K387S, K387H, K387N, E350K, E350N, E350A, E350W, and D351V) produced higher levels of capped RNA when the e6A cap analog was incorporated during the IVT reaction (FIG. 2C). In experiments using a 2-fold concentration of e6A cap, the polysubstituted variants yielded 80-88% capped RNA, whereas the control variant produced only ~75% capped RNA. The G884 variant also produced higher levels of capped RNA compared to the control, with ~90% capped RNA in experiments using a 2-fold concentration of e6A cap.

As demonstrated herein, the polysubstituted + C-terminal G RNA polymerase variants (e.g., K387S, K387H, K387N, E350K, E350N, E350A, E350W, and D351V) produce transcribed RNA products with higher capping efficiency than the control polymerase variant when various cap analogs are incorporated.

Example 5. Polysubstituted + C-terminal G RNA polymerase variants produce RNA products with more desirable characteristics relative to control polymerase variants

In vitro transcription reactions were performed using DNA template, GAG cap analogues (0.75mM and 7.5mM) and the following: (1) wild Type (WT) RNA polymerase; (2) G47A + C-terminal G RNA polymerase variant (control polymerase variant; G47A + C-terminal G); (3) G47A/D506W + C-terminal G RNA polymerase variant (D506W); (4) G47A/S628W + C-terminal G RNA polymerase variant (S628W); (5) G47A/D653W + C-terminal G RNA polymerase variant (D653W); and (6) G47A/P657W + C-terminal G RNA polymerase variant (P657W). Following IVT reactions, the transcribed RNA products from each reaction were characterized to describe the quality of the RNA products, including percent capping, dsRNA contamination, purity, and 3' homogeneity.

After oligo dT purification, the total yield of total RNA (based on concentration in ng/μL) produced using the S628W polysubstituted variant was comparable to the yield using the control RNA polymerase variant (FIG. 3A). Although the total RNA yield using the D506W, D653W, and P657W polysubstituted variants was lower than the yield using the control RNA polymerase variant, the yields remained viable for downstream experiments and continued use of these variants. RNA yield was measured by UV absorption.

The percentage of tailed RNA (i.e., the percentage of transcribed RNA comprising a polyA tail) was assessed using the Tris RP (reverse phase) method. The polysubstituted variants produced RNA with comparable tailing compared to the control variant and the wild-type polymerase (≥90% tailed) (FIG. 3B).

The purity of the transcribed RNA was assessed using the DBAA (dibutylammonium acetate) HPLC method. The RNA produced by the polysubstituted variants was of comparable purity relative to the control variant and the wild-type polymerase (purity ≥85%) (FIG. 3C).

3' homogeneity of RNA transcripts was measured using RNAse T1 digestion. RNAse T1 specifically cleaves mRNA after G nucleotides. Endonucleolytic cleavage produces 5' hydroxide (OH) and 3' monophosphate (mP) "scars", while transcription terminates at the 3' hydroxide (OH). Since the last templated nucleotide is G, RNAse T1 digestion can be used to distinguish between transcripts with and without non-templated addition at the 3' end. In this example, RNA produced using the polysubstituted variants had an equal or higher percentage of 3' end homogeneity relative to the control polymerase variant (FIG. 3D). In particular, the D506W, D653W, and P657W variants produced a significantly higher percentage of RNA with homogeneous 3' ends than the control variant.

In this example, dsRNA contaminants (e.g., longer than 40 nucleotide base pairs) were assessed after IVT reactions using a standard dsRNA ELISA. All IVT reaction mixtures generated from the polysubstituted and control variants contained less than ~5 ng dsRNA per 25 μg mRNA (FIG. 3E). In contrast, IVT reaction mixtures generated by wild-type T7 polymerase contained greater than ~20 ng dsRNA per 25 μg mRNA.

As demonstrated herein, the polysubstituted variants used in this example (e.g., D506W, D653W, and P657W) produce RNA products in IVT reactions with comparable or improved characteristics relative to the control polymerase variant.

Example 6. Polysubstituted + C-terminal G RNA polymerase variants produce RNA products with increased capping efficiency relative to control polymerase variants

In vitro transcription reactions were performed using DNA template, one of three cap analogs (GAG cap, m6A cap and e6A cap) at different concentrations and: (1) G47A + C-terminal G RNA polymerase variant (control polymerase variant; G47A + C-terminal G); (2) G47A/D506W + C-terminal G RNA polymerase variant (D506W); (3) G47A/S628W + C-terminal G RNA polymerase variant (S628W); (4) G47A/D653W + C-terminal G RNA polymerase variant (D653W); and (5) G47A/P657W + C-terminal G RNA polymerase variant (P657W). DNA templates encoding a 5' A followed by G were used in the IVT reactions with the m6A and e6A cap analogs. After the IVT reaction, LC-MS was performed for each experiment to determine the percent capped RNA (i.e., the percentage of transcribed RNA that contained the cap).

In IVT reactions involving 5 mM of each NTP, all of the polysubstituted variants tested (D653W, D506W, P657W, S628W) required a lower effective concentration of GAG cap analog to produce RNA with 50% cap incorporation (EC50) relative to the control variant (FIGS. 4A-4D). Most notably, D653W significantly improved the EC50 for GAG cap incorporation relative to the control variant, with nearly 100% of total RNA incorporating a GAG cap at GAG concentrations as low as 0.75 mM. Relative to the control variant, D506W, P657W and S628W improved (decreased) the EC50 for GAG cap incorporation by 1.28-, 2.27- and 1.45-fold, respectively. D653W also significantly outperformed the control variant in IVT reactions involving 7.5 mM of each NTP, improving (decreasing) the EC50 for GAG cap incorporation 12.3-fold relative to the control variant (FIG. 4E).
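The fold improvement in EC50 described above is the ratio of the control EC50 to the variant EC50. The sketch below shows one simple way such a ratio could be estimated from a cap titration, using log-linear interpolation between the two concentrations that bracket 50% capping. The source does not state how EC50 values were derived, so the interpolation approach is an assumption, the function name `ec50_by_interpolation` is hypothetical, and the titration data are invented for illustration and are not taken from FIGS. 4A-4E.

```python
import math
from typing import Sequence


def ec50_by_interpolation(concs_mM: Sequence[float], pct_capped: Sequence[float],
                          half: float = 50.0) -> float:
    """Estimate the cap concentration giving `half` percent capped RNA.

    Interpolates linearly in log10(concentration) between the two adjacent
    data points whose responses bracket `half`.
    """
    points = sorted(zip(concs_mM, pct_capped))
    for (c_lo, r_lo), (c_hi, r_hi) in zip(points, points[1:]):
        if r_lo <= half <= r_hi:
            if r_hi == r_lo:
                return c_lo
            frac = (half - r_lo) / (r_hi - r_lo)
            log_c = math.log10(c_lo) + frac * (math.log10(c_hi) - math.log10(c_lo))
            return 10 ** log_c
    raise ValueError("the requested response is not bracketed by the data")


# Hypothetical titration data (NOT from the figures): cap analog in mM, % capped RNA.
control_ec50 = ec50_by_interpolation([1, 2, 4, 8], [20, 40, 60, 80])
variant_ec50 = ec50_by_interpolation([0.5, 1, 2, 4], [40, 60, 80, 95])

fold_improvement = control_ec50 / variant_ec50  # >1 means the variant needs less cap
print(f"{fold_improvement:.2f}-fold")
```

A curve fit (for example, a four-parameter logistic model) would normally be preferred when enough data points are available; interpolation is used here only to keep the example dependency-free.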

In IVT reactions involving 5 mM of each NTP, all of the tested polysubstituted variants (D653W, D506W, P657W, S628W) required a lower effective concentration of e6A cap analog to generate RNA with cap incorporation relative to the control variant (FIGS. 5A-5D). Most notably, with D653W nearly 100% of total RNA had an incorporated e6A cap at 2 mM e6A. In contrast, with the control variant only ~40% of total RNA had an incorporated e6A cap, even at 5 mM e6A.

In IVT reactions involving 5 mM of each NTP, all of the tested polysubstituted variants (D653W, D506W, P657W, S628W) required a lower effective concentration of m6A cap analog to generate RNA with cap incorporation relative to the control variant (FIGS. 6A-6D). Most notably, with D653W nearly 100% of total RNA had an incorporated m6A cap at 5 mM m6A. In contrast, with the control variant less than 30% of total RNA had an incorporated m6A cap, even at 5 mM m6A.

In IVT reactions involving 7.5 mM of each NTP, the D653W polysubstituted variant required a lower effective concentration of GGAG tetranucleotide cap analog to produce RNA with cap incorporation relative to the control variant (FIG. 7). Most notably, with D653W nearly 100% of total RNA had an incorporated GGAG cap at 7.5 mM GGAG tetranucleotide. In contrast, with the control variant less than 70% of total RNA had an incorporated GGAG cap, even at 7.5 mM GGAG tetranucleotide.

As demonstrated herein, the polysubstituted + C-terminal G RNA polymerase variants (e.g., D653W, D506W, P657W, and S628W) produced transcribed RNA products with higher capping efficiency than the control polymerase variant when a variety of different cap analogs (e.g., GAG, e6A, m6A, GGAG tetranucleotide) were incorporated.

Example 7. Polysubstituted + C-terminal G RNA polymerase variants produce RNA products with increased capping efficiency and RNA yield relative to control polymerases

In vitro transcription reactions were performed using DNA template, 5 mM equimolar NTPs, 5 mM cap analog (GAG trinucleotide, e6A trinucleotide, m6A trinucleotide or GGAG tetranucleotide) and 500 nM of one of the following T7 RNA polymerases: (1) G47A + C-terminal G RNA polymerase variant (control polymerase variant; G47A + C-terminal G); (2) G47A/D653W + C-terminal G RNA polymerase variant (D653W); (3) G47A/G884P + C-terminal G RNA polymerase variant (G884P); (4) G47A/G884T + C-terminal G RNA polymerase variant (G884T); (5) G47A/G884A + C-terminal G RNA polymerase variant (G884A); (6) G47A/F880Y + C-terminal G RNA polymerase variant (F880Y); (7) G47A/N437F + C-terminal G RNA polymerase variant (N437F); (8) G47A/K387N + C-terminal G RNA polymerase variant (K387N); or (9) G47A/E350N + C-terminal G RNA polymerase variant (E350N).

Following the IVT reaction, the mRNA product was subjected to oligo-dT purification, then analyzed by LC-MS to determine the capped RNA% (i.e., the percentage of transcribed RNA that contains the cap) and by HPLC to determine the RNA yield of the reaction.

All of the tested polysubstituted variants (D653W, G884P, G884T, G884A, F880Y, N437F, K387N, E350N) produced RNA with a percentage of capped RNA comparable to or higher than the control polymerase variant in the presence of any of the GAG trinucleotide, e6A trinucleotide, m6A trinucleotide or GGAG tetranucleotide (FIGS. 8A-8I). Notably, D653W provided a significantly increased percentage of capped RNA relative to the control polymerase variant or wild-type polymerase, particularly in the presence of the m6A trinucleotide (~85% capped) and the e6A trinucleotide (~90% capped). See FIGS. 8B and 8C.

All of the tested polysubstituted variants (D653W, G884P, G884T, G884A, F880Y, N437F, K387N, E350N) produced higher or comparable total RNA yields than the control polymerase variant in the presence of GAG trinucleotide (FIGS. 8E-8I). The G884A, F880Y, K387N and E350N variants produced higher or comparable total RNA yields than the control polymerase variant in the presence of the m6A trinucleotide.

All of the tested polysubstituted variants (D653W, G884P, G884T, G884A, F880Y, N437F, K387N, E350N) yielded a higher percentage of capped RNA than the control polymerase variant in the presence of GAG trinucleotide (FIGS. 8A-8D). The G884A, F880Y, K387N, and E350N variants produced a higher percentage of capped RNA than the control polymerase variant in the presence of the m6A trinucleotide. In the presence of the e6A trinucleotide, F880Y produced a higher percentage of capped RNA than the control polymerase variant.

The IVT reactions of this example were then further analyzed for their content of double-stranded RNA (dsRNA), an undesired by-product of the IVT reaction, and compared to other IVT reactions (FIGS. 9A-9D). Notably, none of the polysubstituted variants tested (D653W, G884P, G884T, G884A, F880Y, N437F, K387N, E350N) produced more than ~0.75 ng dsRNA per 2 μg total RNA in IVT reactions. This is in contrast to the wild-type T7 polymerase, which produced 2-5 ng dsRNA per 2 μg of total RNA in IVT reactions in the presence of all trinucleotide and tetranucleotide cap analogs tested.

Example 8. G47A/D653W + C-terminal G RNA polymerase produces RNA products with higher 3' homogeneity and capping efficiency relative to related single- and double-mutant RNA polymerases

In vitro transcription reactions were performed using DNA template, 5 mM equimolar NTPs, 0.5 mM GAG trinucleotide and one of the following T7 RNA polymerases: (1) wild-type RNA polymerase; (2) G47A RNA polymerase variant; (3) G884A RNA polymerase variant; (4) D653W RNA polymerase variant; (5) G47A/D653W RNA polymerase variant; (6) D653W + C-terminal G RNA polymerase variant; (7) G47A/D653W + C-terminal G RNA polymerase variant; or (8) G47A + C-terminal G RNA polymerase variant.

Samples of the IVT reactions were collected throughout each reaction (120 min) and analyzed for crude RNA yield over time (FIG. 10D). Following the IVT reaction, mRNA products were oligo-dT purified and then analyzed for 3' homogeneity (FIG. 10A), percent capped RNA (i.e., the percentage of transcribed RNA that contains the cap) (FIG. 10B), and percent full-length product (i.e., the percentage of total RNA that is full-length transcript) (FIG. 10C).

G47A/D653W + C-terminal G RNA polymerase performed best among the polymerases tested, while D653W + C-terminal G RNA polymerase and G47A + C-terminal G RNA polymerase also provided RNA of good quality and yield. With G47A/D653W + C-terminal G RNA polymerase, ~90% of the total RNA produced had homogeneous 3' ends; with D653W + C-terminal G RNA polymerase, 75% of the total RNA produced had homogeneous 3' ends; and with G47A + C-terminal G RNA polymerase, ~70% of the total RNA produced had homogeneous 3' ends. In contrast, only ~10% of the total RNA produced by wild-type polymerase had homogeneous 3' ends. All polymerases tested that contained the D653W mutation produced 90-95% capped RNA. In contrast, wild-type polymerase produced only ~60% capped RNA in these experiments. All mutant variants of RNA polymerase produced a good (>85%) percentage of full-length product. Furthermore, as shown in FIG. 10D, in these experiments the mutant variants of RNA polymerase were able to maintain acceptable RNA yields (5-9 mg/mL at 120 min reaction time), even while producing higher quality RNA (higher 3' homogeneity and a higher percentage of capped RNA) than the wild-type polymerase.

Example 9. D653W + G47A RNA polymerase variants produce RNA products with increased capping efficiency relative to control polymerase variants

In vitro transcription reactions were performed using a DNA template, one of four cap analogs (GGAG cap, Gm6AAG, Gm6AG cap or Ge6AG cap) at different concentrations (1-7 mM cap analog), and either the G47A + C-terminal G RNA polymerase variant (control polymerase variant) or the G47A + D653W RNA polymerase variant. After the IVT reaction, LC-MS was performed for each experiment to determine the percent capped RNA (i.e., the percentage of transcribed RNA that contained the cap).

The G47A + D653W RNA polymerase variant produced RNA with a higher percentage of incorporated cap analog than the control polymerase variant at all cap analog concentrations and for all four cap analogs tested (FIG. 11).

Example 10. A set of multi-substituted RNA polymerase variants produces RNA products with increased capping efficiency relative to control polymerase variants

Individual in vitro transcription reactions were performed using a DNA template, 5 mM equimolar NTPs, 0.5 mM GAG trinucleotide, and one of the T7 RNA polymerase variants shown in Table 7.

Following the IVT reaction, the mRNA product was subjected to oligo-dT purification, then analyzed by LC-MS to determine the capped RNA% (i.e., the percentage of transcribed RNA that contains the cap) and by HPLC to determine the RNA yield of the reaction.

Table 7. RNA polymerase variants used in Example 9

Of the 42 multi-substituted variants tested (Table 7), 41 produced a higher percentage of capped RNA in the presence of GAG trinucleotide than the control polymerase variant (G47A + C-terminal G) or the wild-type RNA polymerase (FIG. 12). Several variants produced more than 85% capped RNA, including G47A + K387N + C-terminal T; E350W + K387N + G47A + C-terminal G; D351V + E350W + K387H + G47A + C-terminal G; G47A + D653T + C-terminal A; D351V + E350W + G47A + C-terminal G; D351V + E350K + K387N + G47A + C-terminal G; K387N + G47A + C-terminal G; D351V + E350K + K387S + G47A + C-terminal G; and D351V + E350A + K387N + G47A + C-terminal G.

Example 11. Multi-substituted RNA polymerase variants produce RNA products with high levels of capping efficiency at low concentrations of GGAG cap analog

In vitro transcription reactions were performed using a DNA template, 6 mM equimolar NTPs, varying amounts of GGAG tetranucleotide cap analog (0.6 mM, i.e., 0.1:1 GGAG:NTP; 0.8 mM; 1.0 mM; 1.2 mM, i.e., 0.2:1 GGAG:NTP; 1.4 mM; or 1.6 mM), and 0.025 mg/mL of one of the following T7 RNA polymerases: (1) G47A + C-terminal G (control polymerase variant); (2) D653T + G47A + C-terminal G; (3) D653W + G47A; (4) E350W + D351V + G47A + C-terminal G; (5) D653T + G47A + C-terminal S (G884S); (6) E350W + K387N + G47A + C-terminal G; or (7) D653T + K387N + G47A + C-terminal G.
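The GGAG:NTP ratios quoted for 0.6 mM and 1.2 mM cap analog are consistent with dividing the cap analog concentration by the 6 mM NTP figure. The short sketch below applies the same arithmetic to all six tested concentrations. It assumes the ratio is defined relative to that 6 mM value, which matches the two ratios given in the text but is not stated explicitly; the function name `cap_to_ntp_ratio` is illustrative only.

```python
from fractions import Fraction

# Assumption: ratios are taken relative to the 6 mM NTP figure in the text.
NTP_MM = Fraction("6")

# GGAG cap analog concentrations tested in this example (mM).
CAP_MM = ["0.6", "0.8", "1.0", "1.2", "1.4", "1.6"]


def cap_to_ntp_ratio(cap_mm, ntp_mm=NTP_MM):
    """Return the cap analog to NTP molar ratio as an exact fraction."""
    return Fraction(cap_mm) / ntp_mm


ratios = {cap: cap_to_ntp_ratio(cap) for cap in CAP_MM}
for cap, ratio in ratios.items():
    print(f"{cap} mM GGAG -> {float(ratio):.3f}:1 GGAG:NTP")
```

Exact fractions are used so that 0.6 mM and 1.2 mM reproduce the stated 0.1:1 and 0.2:1 ratios without floating-point rounding noise.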

Following the IVT reaction, the mRNA product was subjected to oligo-dT purification, then analyzed by LC-MS to determine the capped RNA% (i.e., the percentage of transcribed RNA that contains the cap) and by HPLC to determine the RNA yield of the reaction.

In the presence of GGAG cap analog, all of the multi-substituted variants tested produced RNA with a higher percentage of capped RNA than the control polymerase variant, regardless of the GGAG concentration (FIG. 13B). Even at the lowest concentration of GGAG cap analog tested (0.6 mM), all multi-substituted variants produced at least 80% capped RNA, much higher than the 45% capped RNA produced by the control polymerase variant. At 1.6 mM GGAG cap analog, all variants tested produced approximately 93-97% capped RNA.

Example 12. Multi-substituted RNA polymerase variants produce high-quality RNA products regardless of DNA template

In vitro transcription reactions were performed using three different DNA templates (constructs 1, 2 and 3), 6 mM equimolar NTPs, 1.2 mM GGAG cap analog, and one of the following T7 RNA polymerases: (1) G47A + C-terminal G RNA polymerase variant (control polymerase variant); (2) D653W + G47A RNA polymerase variant; (3) D653T + K387N + G47A + C-terminal G RNA polymerase variant; (4) E350W + D351V + G47A + C-terminal G RNA polymerase variant; (5) E350W + K387N + G47A + C-terminal G RNA polymerase variant; or (6) D653T + G47A + C-terminal G RNA polymerase variant.

Following the IVT reaction, the mRNA product was subjected to oligo-dT purification, then analyzed by LC-MS to determine the capped RNA% (i.e., the percentage of transcribed RNA that contains the cap) and by HPLC to determine the RNA yield of the reaction.

For all three DNA templates, all of the multi-substituted variants tested produced 90-95% capped RNA in the presence of GGAG tetranucleotide (FIG. 14A). Each variant produced a higher percentage of capped RNA than the control polymerase variant.

The percentage of tailed RNA (i.e., the percentage of transcribed RNA comprising a polyA tail) was assessed using the Tris RP (reverse phase) method. For all three DNA templates, the percent tailed RNA generated by the multi-substituted variants was comparable to that of the control variant (≥90% tailed) (FIG. 14B).

The purity of the transcribed RNA was assessed using a reverse phase HPLC method. For all three DNA templates, the multi-substituted variants yielded RNA with purity comparable to that of the control variant and wild-type polymerase (approximately 95% purity) (FIG. 14C).

The 3' homogeneity of RNA transcripts produced from construct 1 was measured using RNase T1 digestion. RNA produced using the multi-substituted variants had a higher percentage of 3' end homogeneity than RNA produced using the control polymerase variant (FIG. 14D), with about 95% of the total RNA having 3' homogeneity.

In this example, dsRNA contaminants (e.g., longer than 40 base pairs) were assessed after the IVT reactions using a standard dsRNA ELISA. For all three DNA templates, all IVT reaction mixtures generated with the multi-substituted and control variants contained less than ~0.015% (wt/wt) dsRNA (FIG. 14E). In particular, for all three DNA templates, the IVT reaction mixtures produced with the following variants contained less than 0.005% (wt/wt) dsRNA: D653T + K387N + G47A + C-terminal G RNA polymerase variant; E350W + D351V + G47A + C-terminal G RNA polymerase variant; E350W + K387N + G47A + C-terminal G RNA polymerase variant; and D653T + G47A + C-terminal G RNA polymerase variant.

Example 13. Multi-substituted RNA polymerase variants produce high-quality RNA products

In vitro transcription reactions were performed using a DNA template, 6 mM equimolar NTPs, 1.5 mM GGAG cap analog, and one of the following T7 RNA polymerases: (1) wild-type RNA polymerase; (2) G47A + C-terminal G RNA polymerase variant; (3) E350W + K387N RNA polymerase variant; (4) E350W + D351V RNA polymerase variant; (5) K387N + D653T RNA polymerase variant; (6) E350W + K387N + G47A + C-terminal G RNA polymerase variant; (7) E350W + D351V + G47A + C-terminal G RNA polymerase variant; or (8) K387N + D653T + G47A + C-terminal G RNA polymerase variant.

Following the IVT reaction, the mRNA product was subjected to oligo-dT purification, then analyzed by LC-MS to determine the capped RNA% (i.e., the percentage of transcribed RNA that contains the cap) and by HPLC to determine the RNA yield of the reaction.

In this example, most of the multi-substituted variants tested produced a total RNA yield in the presence of GGAG tetranucleotide comparable to that of the wild-type polymerase (FIG. 15A), with a total RNA yield of about 5 mg/mL.

All of the multi-substituted variants tested in this example produced a higher percentage of capped RNA in the presence of GGAG tetranucleotide than the wild-type polymerase and the G47A + C-terminal G polymerase variant (FIG. 15B). For each of the following variants, 90-95% of the total RNA produced comprised the GGAG tetranucleotide cap: E350W + K387N RNA polymerase variant; E350W + D351V RNA polymerase variant; K387N + D653T RNA polymerase variant; E350W + K387N + G47A + C-terminal G RNA polymerase variant; E350W + D351V + G47A + C-terminal G RNA polymerase variant; and K387N + D653T + G47A + C-terminal G RNA polymerase variant.

In this example, a standard dsRNA ELISA was used to evaluate dsRNA (e.g., longer than 40 base pairs) produced by the IVT reactions. The double-mutant polymerase variants (E350W + K387N; E350W + D351V; and K387N + D653T) produced approximately 0.4% to 0.6% (wt/wt) dsRNA per total RNA (FIG. 15C). The other multi-substituted variants (E350W + K387N + G47A + C-terminal G; E350W + D351V + G47A + C-terminal G; and K387N + D653T + G47A + C-terminal G) produced less than 0.015% (wt/wt) dsRNA per total RNA.

The purity of the transcribed RNA was assessed using a reverse phase HPLC method. All of the multi-substituted variants tested in this example produced RNA of purity comparable to that of the G47A + C-terminal G variant and wild-type polymerase (approximately 90% pure) (FIG. 15D).

The percentage of tailed RNA (i.e., the percentage of transcribed RNA comprising a polyA tail) was assessed using the Tris RP (reverse phase) method. All of the multi-substituted variants tested in this example produced RNA with a percent tailed comparable to that of the G47A + C-terminal G variant and wild-type polymerase (≥85% tailed) (FIG. 15E).

Example 14. Multi-substituted RNA polymerase variants do not produce RNA with increased insertions, deletions, or point mutations

In vitro transcription reactions were performed using a DNA template, 6 mM equimolar NTPs, 1.5 mM GGAG cap analog, and one of the following T7 RNA polymerases: (1) G47A + C-terminal G variant; (2) D653T + G47A + C-terminal G variant; (3) D653W + G47A variant; (4) E350W + K387N + G47A + C-terminal G variant; (5) E350W + D351V + G47A + C-terminal G variant; or (6) D653T + K387N + G47A + C-terminal G variant.

The resulting mRNA was evaluated using next-generation sequencing (NGS) to test for insertions and deletions (indels), as well as point mutations, in the resulting RNA sequence. Importantly, none of the polymerase variants tested produced mRNA with a large number of indels or point mutations. All variants tested produced mRNAs with 0.0-0.4% indels, below the threshold for percent indels associated with wild-type RNA polymerase. Thus, this example demonstrates that neither the polymerase variants tested nor their individual mutations adversely affect the fidelity of the enzyme.

All references, patents, and patent applications disclosed herein are incorporated by reference with respect to the subject matter for which each is cited, which in some cases may encompass the entirety of the document.

The indefinite articles "a" and "an" as used in this specification and claims are understood to mean "at least one" unless clearly indicated to the contrary.

It will also be understood that, unless clearly indicated to the contrary, in any methods claimed herein that include more than one step or action, the order of the steps or actions of the method is not necessarily limited to the order in which the steps or actions of the method are recited.

In the claims, as well as in the specification above, all transitional phrases such as "comprising," "including," "carrying," "having," "containing," "involving," "holding," "composed of," and the like are to be understood to be open-ended, i.e., to mean including but not limited to. Only the transitional phrases "consisting of" and "consisting essentially of" shall be closed or semi-closed transitional phrases, respectively, as set forth in the United States Patent Office Manual of Patent Examining Procedures, Section 2111.03.
