Integration of a nucleic acid construct into a eukaryotic cell using transposase from medaka

Document No.: 914141 Publication date: 2021-02-26

Abstract: This technology, "Integration of a nucleic acid construct into a eukaryotic cell using transposase from medaka", was created by Jeremy Minshull, Sridhar Govindarajan, and Maggie Lee on 2020-04-07. The present invention provides polynucleotide vectors for high expression of heterologous genes. Some vectors also contain novel transposons and transposases that further improve expression. Further disclosed are vectors that can be used in gene transfer systems to stably introduce nucleic acids into cellular DNA. Gene transfer systems can be used in methods such as gene expression, bioprocessing, gene therapy, insertional mutagenesis, or gene discovery.

1. A polynucleotide comprising an open reading frame encoding a transposase whose amino acid sequence is at least 90% identical to SEQ ID NO: 782, wherein the open reading frame is operably linked to a heterologous promoter.

2. The polynucleotide of claim 1, wherein the transposase comprises, relative to SEQ ID NO: 782, the mutations shown in columns C and D of Table 1.

3. The polynucleotide of claim 2, wherein the transposase comprises, relative to SEQ ID NO: 782, a mutation at an amino acid position selected from 22, 124, 131, 138, 149, 156, 160, 164, 167, 171, 175, 177, 202, 206, 210, 214, 253, 258, 281, 284, 361, 386, 400, 408, 409, 455, 458, 467, 468, 514, 515, 524, 548, 549, 550 and 551.

4. The polynucleotide of claim 3, wherein the transposase comprises, relative to SEQ ID NO: 782, a mutation selected from: E22D, A124C, Q131D, L138V, F149R, L156T, D160E, Y164F, I167L, A171T, R175K, K177N, T202R, I206L, I210L, N214D, V253I, V258L, I281F, A284L, L361I, V386I, M400L, S408E, L409I, F455Y, V458L, V467I, L468I, A514R, V515I, S524P, R548K, D549K, D550R and S551R, the transposase optionally comprising at least 2, 3, 4, or 5 mutations selected from the group.

5. The polynucleotide of claim 2, wherein the amino acid sequence of the transposase is selected from the group consisting of SEQ ID NOs: 782 or 805 and 908.

6. The polynucleotide of any one of the preceding claims, wherein the transposase can excise or transpose the transposon of SEQ ID NO: 41.

7. The polynucleotide of claim 6, wherein the excision or transposition activity of the transposase is at least 10% of the activity of SEQ ID NO: 782.

8. The polynucleotide of any one of the preceding claims, wherein the promoter is active in an in vitro transcription reaction.

9. The polynucleotide of any one of claims 1-7, wherein the promoter is active in eukaryotic cells.

10. The polynucleotide of claim 9, wherein the eukaryotic cell is a mammalian cell, optionally wherein the codons of the open reading frame are selected for mammalian cell expression.

11. An isolated mRNA encoding a polypeptide whose amino acid sequence is at least 90% identical to SEQ ID NO: 782, wherein the mRNA sequence comprises at least 10 synonymous codon differences relative to SEQ ID NO: 781, optionally wherein codons in the mRNA at the corresponding positions are selected for mammalian cell expression.

12. The polynucleotide of any one of claims 1-10, wherein the open reading frame further encodes a nuclear localization sequence fused to the transposase.

13. The polynucleotide of any one of claims 1-10, wherein the open reading frame further encodes a heterologous DNA binding domain fused to the transposase.

14. The polynucleotide of claim 13, wherein the DNA-binding domain is derived from a CRISPR Cas system, or a zinc finger protein, or a TALE protein.

15. A non-naturally occurring polynucleotide encoding a polypeptide whose sequence is at least 90% identical to SEQ ID NO: 782, wherein the polynucleotide sequence has at least 10 synonymous codon differences relative to SEQ ID NO: 781, optionally wherein codons at the corresponding positions in the polynucleotide are selected for mammalian cell expression.

16. A non-naturally occurring polypeptide encoded by the polynucleotide of any one of the preceding claims.

17. A transposon comprising SEQ ID NO: 7 and SEQ ID NO: 8 flanking a heterologous polynucleotide.

18. The transposon of claim 17, further comprising, on one side of the heterologous polynucleotide, a sequence at least 90% identical to SEQ ID NO: 12 and, on the other side, a sequence at least 90% identical to SEQ ID NO: 15.

19. The transposon of claim 17 or 18, wherein the heterologous polynucleotide comprises a heterologous promoter active in a eukaryotic cell.

20. The transposon of claim 19, wherein the promoter is operably linked to one or more of: i) an open reading frame; ii) a nucleic acid encoding a selectable marker; iii) a nucleic acid encoding a counter-selectable marker; iv) a nucleic acid encoding a regulatory protein; v) a nucleic acid encoding an inhibitory RNA.

21. The transposon of claim 19, wherein the heterologous promoter comprises a sequence selected from SEQ ID NO: 325 and 409.

22. The transposon of any one of claims 17-21, wherein the heterologous polynucleotide comprises a heterologous enhancer active in eukaryotic cells.

23. The transposon of claim 22, wherein the heterologous enhancer is selected from SEQ ID NO: 304-324.

24. A transposon according to any one of claims 17 to 23, wherein the heterologous polynucleotide comprises a heterologous intron which is spliced in a eukaryotic cell.

25. The transposon of claim 24, wherein the nucleotide sequence of the heterologous intron is selected from the group consisting of SEQ ID NOs: 412-472.

26. The transposon of any one of claims 17-25, wherein the heterologous polynucleotide comprises an insulator sequence.

27. A transposon according to claim 26, wherein the nucleic acid sequence of the insulator is selected from SEQ ID NO: 286-292.

28. The transposon of any one of claims 17-27, wherein the heterologous polynucleotide comprises or encodes a selectable marker.

29. The transposon of claim 28, wherein the selectable marker is selected from the group consisting of glutamine synthetase, dihydrofolate reductase, puromycin acetyltransferase, blasticidin acetyltransferase, hygromycin B phosphotransferase, aminoglycoside 3' -phosphotransferase, and a fluorescent protein.

30. A eukaryotic cell whose genome comprises SEQ ID NO:7 and SEQ ID NO: 8.

31. The eukaryotic cell of claim 30, wherein the cell is an animal cell.

32. The animal cell of claim 31, wherein the cell is a mammalian cell.

33. The mammalian cell of claim 32, wherein the cell is a rodent cell.

34. The mammalian cell of claim 32, wherein the cell is a human cell.

35. The transposon of any one of claims 17-29, wherein the heterologous polynucleotide comprises two open reading frames, each of which is operably linked to a separate promoter.

36. The transposon of claim 35, wherein the heterologous polynucleotide further comprises a sequence selected from the group consisting of SEQ ID NOs: 596, 779.

37. A method of integrating a transposon into a eukaryotic cell, the method comprising

a. Introducing a transposon into the cell, the transposon comprising the nucleic acid sequence of SEQ ID NO:7 and SEQ ID NO: 8;

b. Introducing into the cell a transposase having a sequence at least 90% identical to SEQ ID NO: 782, wherein the transposase transposes the transposon to produce a genome of the cell comprising SEQ ID NO: 7 and SEQ ID NO: 8.

38. The method of claim 37, wherein the transposase is introduced as a polynucleotide encoding the transposase.

39. The method of claim 38, wherein the polynucleotide encoding the transposase is an mRNA molecule.

40. The method of claim 38, wherein the polynucleotide encoding the transposase is a DNA molecule.

41. The method of claim 37, wherein the transposase is introduced in the form of a protein.

42. The method of any one of claims 37-41, wherein the heterologous polynucleotide encodes a selectable marker, and the method further comprises

c. Selecting cells comprising the selectable marker.

43. The method of any one of claims 37-42, wherein the cell is an animal cell.

44. The method of claim 43, wherein the cell is a mammalian cell.

45. The method of claim 44, wherein the cell is a rodent cell.

46. The method of claim 44, wherein the cell is a human cell.

47. A method of expressing a polypeptide, comprising culturing a eukaryotic cell whose genome comprises a nucleic acid sequence comprising SEQ ID NO: 7 and SEQ ID NO: 8, wherein the polypeptide is expressed.

48. The method of claim 47, further comprising purifying the polypeptide from the culture medium.

49. The method of claim 47 or 48, further comprising incorporating the purified polypeptide into a pharmaceutical composition.

Background

The level of expression of a gene encoded on a polynucleotide integrated into the genome of a cell depends on the configuration of sequence elements within the polynucleotide. The efficiency of integration and thus the copy number of the polynucleotide integrated into each genome, as well as the genomic locus at which integration occurs, also affect the expression level of the genes encoded on the polynucleotide. The efficiency of polynucleotide integration into the genome of a target cell can generally be increased by placing the polynucleotide in a transposon.

A transposon comprises two ends recognized by a transposase. Transposases act on transposons to remove them from one DNA molecule and integrate them into another DNA molecule. The DNA between the two transposon ends is transposed along with the transposon ends by the transposase. "Synthetic transposon" herein refers to a heterologous DNA flanked by a pair of transposon ends so that it is recognized and transposed by a transposase. The introduction of a synthetic transposon and the corresponding transposase into the nucleus of a eukaryotic cell may result in the transposition of the transposon into the genome of the cell. Such transposition is useful because it improves transformation efficiency and because it can increase the expression level of the integrated heterologous DNA. Thus, there is a need in the art for highly active transposases and transposons.

Transposition by a piggyBac-like transposase is fully reversible. The transposon initially integrates at an integration target sequence in the recipient DNA molecule, during which the target sequence is duplicated so that one copy lies at each end of the transposon, adjacent to the inverted terminal repeats (ITRs). Subsequent transposition removes the transposon and allows the recipient DNA to recover its previous sequence, i.e., both the duplicated target sequence copy and the transposon are removed. However, this is not sufficient to remove the transposon from a genome into which it has already been integrated, because it is highly likely that the transposon excises from the first integration target sequence and transposes into a second integration target sequence in the same genome. On the other hand, a transposase lacking an integration (or transposition) function can excise a transposon from a first target sequence, but cannot integrate it into a second target sequence. Thus, an integration-deficient transposase can be used to reverse genomic integration of the transposon.
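The reversibility described above can be illustrated with a small string model. This is only a sketch under stated assumptions: the 5'-TTAA-3' target is the typical target sequence mentioned in the description of FIG. 1, the recipient and cargo sequences are hypothetical placeholders rather than any SEQ ID NO, and the function names `integrate` and `excise` are invented for illustration.

```python
TARGET = "TTAA"  # typical target sequence, per the FIG. 1 description


def integrate(recipient: str, cargo: str, position: int) -> str:
    """Insert cargo at a TTAA site, duplicating the target on both sides."""
    if recipient[position:position + len(TARGET)] != TARGET:
        raise ValueError("no TTAA target site at this position")
    upstream = recipient[:position]
    downstream = recipient[position + len(TARGET):]
    # After integration the cargo is flanked by one TTAA copy on each side.
    return upstream + TARGET + cargo + TARGET + downstream


def excise(genome: str, cargo: str) -> str:
    """Remove the cargo and one target copy, leaving a single TTAA."""
    flanked = TARGET + cargo + TARGET
    index = genome.find(flanked)
    if index == -1:
        raise ValueError("flanked transposon not found")
    return genome[:index] + TARGET + genome[index + len(flanked):]


# Hypothetical sequences for illustration only.
recipient = "GGCATTAACCGT"
cargo = "CCCTAGAAGCTTCTAGGG"  # stand-in for ITR + heterologous DNA + ITR
integrated = integrate(recipient, cargo, recipient.index(TARGET))
restored = excise(integrated, cargo)
```

In this model `restored` equals `recipient`, which mirrors the footprint-free excision described in the text. An integration-deficient transposase would correspond to calling `excise` without a following `integrate`.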

One application of transposases is in engineering eukaryotic genomes. Such engineering may require the integration of more than one different polynucleotide into the genome. These integrations may be simultaneous or sequential. When a first transposon comprising a first heterologous polynucleotide is transposed into the genome by a first transposase, and a second transposon comprising a second heterologous polynucleotide is then transposed into the same genome by a second transposase, it is advantageous that the second transposase is unable to recognize and transpose the first transposon. This is because the location of a polynucleotide sequence within the genome affects the expressibility of the genes encoded on the polynucleotide, and thus transposition of the first transposon to a different chromosomal location by the second transposase can alter the expression characteristics of any gene encoded on the first heterologous polynucleotide. Thus, there is a need for a set of transposons and their corresponding transposases, wherein the transposases in the set recognize and transpose only their corresponding transposons, and not any other transposons in the set.
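The requirement that each transposase in a set acts only on its own transposon can be stated as a check on a table of pairwise activities. The sketch below is an illustration, not a method from this disclosure: the system names, the activity values, and the threshold are all hypothetical.

```python
def is_orthogonal(activity: dict[str, dict[str, float]], threshold: float) -> bool:
    """Return True if every transposase is active only on its cognate transposon.

    activity[transposase][transposon] is a measured transposition activity;
    a pair counts as active when the value is at or above the threshold.
    """
    for transposase, row in activity.items():
        for transposon, value in row.items():
            is_cognate = transposase == transposon
            is_active = value >= threshold
            if is_cognate != is_active:
                return False
    return True


# Hypothetical activities (arbitrary units), for illustration only.
orthogonal_set = {
    "system_A": {"system_A": 1.00, "system_B": 0.00},
    "system_B": {"system_A": 0.01, "system_B": 0.90},
}
cross_reactive_set = {
    "system_A": {"system_A": 1.00, "system_B": 0.60},
    "system_B": {"system_A": 0.01, "system_B": 0.90},
}
```

With a threshold of 0.1, `is_orthogonal(orthogonal_set, 0.1)` is true and `is_orthogonal(cross_reactive_set, 0.1)` is false, because in the second table transposase A would remobilize transposon B.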

piggyBac transposons and transposases from the cabbage looper moth Trichoplusia ni have been widely used for the insertion of heterologous DNA into the genomes of target cells from many different organisms since their discovery in 1983. The piggyBac system is a particularly valuable transposase system because "it is active in a wide range of organisms, it enables efficient integration of multiple large transgenes, it enables the addition of domains to transposases without loss of activity, and excision from the genome without leaving footprint mutations" (Doherty et al., Hum. Gene Ther. 23, 311-320 (2012), at p. 312, LHC, 2).

The value and versatility of the piggyBac system have motivated tremendous efforts to identify other active transposons similar to piggyBac (commonly referred to as piggyBac-like elements, or PLEs), but these efforts have largely been unsuccessful. "Since piggyBac is one of the most popular transposons for transgenesis, much attention has been paid to finding new active PLEs. However, to date, only a few active PLEs have been reported." (Luo et al., BMC Molecular Biology 15, 28 (2014), http://www.biomedcentral.com/1471-2199/15/28.12, page 4, RHC, 1 "Discussion").

Although large numbers of homologues of piggyBac transposons and transposases are present in sequence databases, few active homologues have been identified, because most of them have been inactivated by their hosts to avoid detrimental activity; as described in the following excerpts: "Related piggyBac transposable elements have been found in plants, fungi and animals (including humans) [125], although they may be inactive due to mutation." (Munoz-Lopez and Garcia-Perez, Current Genomics 11, 115-,1). "It is believed that transposons invade the genome and then spread throughout the entire genome during evolution. The mobility of the 'selfish' transposon is detrimental to the host; thus, they are eliminated or inactivated by the host by natural selection. Even harmless transposons eventually lose activity due to the lack of conservative selection. Thus, generally, the life span of transposons in a host is short, and they subsequently become fossils in the genome." (Hikosaka et al., Mol. Biol. Evol. 24, 2648-2656 (2007), at p. 2648, LHC, 1 "Introduction"). "Frequent movement of transposable elements in the genome is detrimental (Belancio et al., 2008; Deininger & Batzer, 1999; Le Rouzic & Capy, 2006; Oliver & Greene, 2009). As a result, most transposable elements are inactivated shortly after invading a new host." (Luo et al., Insect Science 18, 652-,1).

Three types of piggyBac-like elements have been found: (1) those very similar to the original piggyBac from the looper moth (typically >95% identical at the nucleotide level), (2) moderately related ones (typically 30% to 50% identical at the amino acid level), and (3) very distantly related ones (Wu et al., Insect Science 15, 521-528 (2008), at p. 521, RHC, 2).

piggyBac-like transposases that are highly related to the looper moth transposase have been described by several research groups. They are very conserved. Transposase sequences very similar to the original piggyBac (95-98% nucleotide identity) have been reported in three different strains of the oriental fruit fly Bactrocera dorsalis (Handler & McCombs, Insect Molecular Biology 9, 605-). Relatively conserved piggyBac sequences were also found in other Bactrocera species (Bonizzoni et al., Insect Molecular Biology 16, 645-650 (2007)). Two noctuid moths (Helicoverpa zea and Helicoverpa armigera) and other strains of Trichoplusia ni possess genomic copies of the piggyBac transposase with 93-100% nucleotide identity to the original piggyBac sequence (Zimowska & Handler, Insect Biochemistry and Molecular Biology 36, 421-428 (2006)). Zimowska & Handler also found multiple copies of more heavily mutated (and truncated) versions of the piggyBac transposase in the two Helicoverpa species, and a homolog in the armyworm Spodoptera frugiperda. None of these groups attempted to measure any activity of these transposases. Wu et al. (2008), supra, reported isolation of a transposase having 99.5% sequence identity to the looper moth piggyBac from the noctuid moth Macdunnoughia crassisigna. They also measured excision and transposition by this transposon and transposase, demonstrating that they are active. Their discussion summarises the previous results as follows: "Other closely related IFP2-like sequences were reported to be present in various Bactrocera species and in the genomes of T. ni, Helicoverpa armigera and H. zea (Handler & McCombs, 2000; Zimowska & Handler, 2006; Bonizzoni et al., 2007). These sequences are partial fragments of piggyBac-like elements, most of which are truncated or inactivated by the accumulation of random mutations." (Wu et al., Insect Science 15, 521-528 (2008), at p. 526, LHC, 3).

It has proven difficult to identify active piggyBac-like transposases that are moderately related to the looper moth enzyme by merely inspecting their sequences. The presence of known essential features (a full-length open reading frame, catalytic aspartate residues and intact ITRs) has not been shown to predict activity. "Diverse PLEs have been recorded in eukaryotes in computational analysis of genomic sequence data [citations omitted]. However, hardly any elements have been isolated with an intact structure consistent with function, and only the original IFP2 piggyBac was developed as a vector for routine transgenesis." (Wu et al., Genetica 139, 149-154 (2011), at p. 152, RHC, 2). The group of Wu et al. at Nanjing University (the "Nanjing group") published a number of papers over a 6-year period, each identifying moderately related piggyBac homologues. Although the Nanjing group showed in 2008 that they could measure excision and transposition of the Macdunnoughia crassisigna transposon by its corresponding transposase, and in each of the subsequent papers expressed the desire to identify a novel active piggyBac-like transposase, they showed only excision activity, and for only one transposase, from the cotton aphid Aphis gossypii. They concluded that the usefulness of this transposase "remains to be explored in further experiments" (Luo et al. 2011, p. 660, LHC, 2 "Discussion"). No activity was reported in the other papers published by the Nanjing group, which identified piggyBac-like sequences from a range of other insects. A group at a university in Kansas published three papers that identified other putatively active piggyBac-like transposases. None of these papers reports any activity data. Wang et al., Insect Molecular Biology 15, 435-443 (2006) found multiple copies of piggyBac-like sequences in the genome of the tobacco budworm Heliothis virescens.
Many of them have significant mutations or deletions, leading the authors not to consider them as candidate active transposases. Wang et al., Insect Biochemistry and Molecular Biology 38, 490-498 (2008) reported 30 piggyBac-like sequences in the genome of the red flour beetle, Tribolium castaneum. They concluded that: "all TcPLEs identified herein (except TcPLE1) are apparently defective due to the presence of multiple stop codons and/or escape sites in the putative transposase coding region." Even for TcPLE1, "there is no evidence to support recent or current mobilisation events" (page 492, section 3.1, 2 & 3). Wang et al. (2010) used PCR to identify piggyBac-like sequences from the pink bollworm Pectinophora gossypiella. They again found many apparently defective copies, and one transposase with characteristics that the authors considered consistent with activity (page 179, RHC, 2). However, no follow-up report indicating transposase activity has been found. Other groups have also attempted to identify active piggyBac-like transposases. These reports conclude by stating that the identified piggyBac-like elements are being tested for activity, but no subsequent reports of success have appeared. For example, Sarkar et al. (2003) concluded their discussion by restating the value of novel, active piggyBac-like transposons and describing their ongoing efforts to identify them: "The mobility of the original T. ni piggyBac element in various insects suggests that piggyBac family transposons may prove to be useful genetic tools in organisms other than insects. We are currently isolating the complete piggyBac element from An." (Mol. Gen. Genomics 270, 173-180, at p. 179, LHC, 1). There appears to have been no further public report of this putative active transposase. The silkworm genome was analyzed for piggyBac-like sequences (Xu et al., Mol. Gen. Genomics 276, 31-40 (2006)).
They found 98 piggyBac-like sequences and performed various computational analyses of putative transposase sequences and ITR sequences. They concluded that: "We have isolated several complete piggyBac-like elements from Bombyx mori (B. mori) and are currently testing their activity and feasibility to use them as transformation vectors." (p. 38, RHC, 3). There appears to be no further disclosure of these putative active transposases.

Four published papers discuss the third class of piggyBac-like transposases, those that are distantly related. The first three of these showed only excision, a partial reaction that the authors acknowledged to be different from complete transposition. Hikosaka et al., Mol. Biol. Evol. 24, 2648-2656 (2007): "In this study, we demonstrated that Xtr-Uribo2 Tpase has excision activity against the target transposon, although there is currently no evidence of integration of the excised target into the genome." (page 2654, RHC, 2). Luo et al., Insect Science 18, 652-: "These results demonstrate the activity of Ago-PLE1.1 transposase in mediating the first step of the cut-and-paste movement of the element" (page 658, LHC, 1). Daimon et al., Genome 53, 585-. While Daimon et al. reported detection of excision events by PCR, they also reported screening approximately 100,000 recovered plasmids for excision of yabusame-1 and yabusame-W without identifying a single recovered plasmid from which the element had been excised. In contrast, Daimon reported that the transposition frequency of the wild-type piggyBac enzyme was about 0.3 to 1.4%. Thus, yabusame-1 and yabusame-W exhibit an excision frequency of less than 0.001% (1 in 100,000) as reported by Daimon et al. This is at least 2-3 orders of magnitude lower than the levels achievable by the wild-type piggyBac enzyme, and lower still than those of genetically engineered variants of the piggyBac transposase (whose transposition activity is about ten times higher than wild type). The implied order of magnitude of the yabusame-1 transposition frequency from Daimon et al. is also two orders of magnitude lower than the frequency of random integration in mammalian cells (on the order of about 0.1%). Thus, Daimon et al. indicate that yabusame-1 is essentially inactive and cannot be used as a genetic engineering tool.
This view is implicit in Daimon et al.'s own conclusion: "Although we could detect excision events in a highly sensitive PCR-based assay, our data indicate that both elements have almost lost excision activity." This also indicates that the PCR-based excision assays used to show Uribo2 and Ago-PLE1.1 activity are not predictive of the transposition activity needed to insert heterologous DNA into the genome of a target cell. The only reported fully active (both excising and integrating) transposase of the third class, distantly related to the original piggyBac transposase from the looper moth (Trichoplusia ni), is from the bat Myotis lucifugus (Mitra et al., Proc. Natl. Acad. Sci. 110, 234-239 (2013)). These authors used a yeast system to demonstrate excision and transposition activity of the bat transposase. All the work described here shows that, even when a large number of candidate sequences are available, it is difficult to identify a fully active piggyBac-like transposase. Therefore, there is a need for new piggyBac-like transposons and their corresponding transposases.
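The orders-of-magnitude comparison in the preceding paragraph can be reproduced with a few lines of arithmetic. The sketch below uses only the figures quoted above and makes one assumption explicit: the wild-type piggyBac figure of 0.3 to 1.4 is read as a percentage. The variable and function names are illustrative.

```python
import math

# Figures quoted in the text above.
screened_plasmids = 100_000              # plasmids screened with no excision found
yabusame_upper_bound = 1 / screened_plasmids  # below 1 in 100,000, i.e. < 0.001%
wild_type_low = 0.3 / 100                # ASSUMPTION: 0.3 is a percentage
wild_type_high = 1.4 / 100               # ASSUMPTION: 1.4 is a percentage
random_integration = 0.1 / 100           # about 0.1% in mammalian cells


def orders_below(reference: float, value: float) -> float:
    """How many orders of magnitude `value` lies below `reference`."""
    return math.log10(reference / value)


gap_vs_wild_type_low = orders_below(wild_type_low, yabusame_upper_bound)
gap_vs_wild_type_high = orders_below(wild_type_high, yabusame_upper_bound)
gap_vs_random = orders_below(random_integration, yabusame_upper_bound)
```

Under that assumption the gap to wild-type piggyBac comes out at roughly 2.5 to 3.1 orders of magnitude, and the gap to random integration at about 2, in line with the "2-3 orders" and "two orders" stated in the text. Because the yabusame figure is an upper bound, the true gaps could be larger.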

Disclosure of Invention

Heterologous gene expression from a polynucleotide construct stably integrated into the genome of a target cell can be improved by placing the expression polynucleotide between a pair of transposon ends (sequence elements recognized and transposed by a transposase). A DNA sequence inserted between a pair of transposon ends can be excised from one DNA molecule by a transposase and inserted into a second DNA molecule. A novel piggyBac-like transposon-transposase system is disclosed which is not derived from the looper moth Trichoplusia ni. It is derived from medaka of the genus Oryzias (the Oryzias transposase and the Oryzias transposon). The Oryzias transposon contains sequences that act as transposon ends and can be used together with the corresponding Oryzias transposase, which recognizes and acts on these transposon ends, as a gene transfer system for stably introducing nucleic acids into the DNA of cells. The gene transfer system of the present invention can be used in methods including, but not limited to: genome engineering of eukaryotic cells, heterologous gene expression, gene therapy, cell therapy, insertional mutagenesis, or gene discovery.

Transposition can be performed using a polynucleotide comprising an open reading frame encoding an Oryzias transposase whose amino acid sequence is at least 90% identical to SEQ ID NO: 782, the open reading frame being operably linked to a heterologous promoter. The heterologous promoter may be active in eukaryotic cells. The heterologous promoter may be active in mammalian cells. mRNA can be made using a polynucleotide comprising an open reading frame encoding an Oryzias transposase whose amino acid sequence is at least 90% identical to SEQ ID NO: 782, the open reading frame being operably linked to a heterologous promoter active in an in vitro transcription reaction. Relative to SEQ ID NO: 782, the transposase may comprise mutations as shown in columns C and D of Table 1. Relative to SEQ ID NO: 782, the transposase can comprise a mutation at an amino acid position selected from 22, 124, 131, 138, 149, 156, 160, 164, 167, 171, 175, 177, 202, 206, 210, 214, 253, 258, 281, 284, 361, 386, 400, 408, 409, 455, 458, 467, 468, 514, 515, 524, 548, 549, 550, and 551. Relative to SEQ ID NO: 782, the transposase may comprise a mutation selected from E22D, A124C, Q131D, L138V, F149R, L156T, D160E, Y164F, I167L, A171T, R175K, K177N, T202R, I206L, I210L, N214D, V253I, V258L, I281F, A284L, L361I, V386I, M400L, S408E, L409I, F455Y, V458L, V467I, L468I, A514R, V515I, S524P, R548K, D549K, D550R and S551R, the transposase optionally comprising at least 2, 3, 4 or 5 mutations selected from the group. The amino acid sequence of the transposase can be selected from SEQ ID NO: 782 or 805 and 908. The transposase can excise or transpose a sequence derived from SEQ ID NO: 41. The excision activity or transposition activity of the transposase is at least 5% or 10% of the activity of SEQ ID NO: 782. The codons of the transposase open reading frame can be selected for mammalian cell expression.
The isolated mRNA may encode a polypeptide whose amino acid sequence is at least 90% identical to SEQ ID NO: 782, wherein the mRNA sequence has at least 10 synonymous codon differences relative to SEQ ID NO: 781, optionally wherein codons at corresponding positions in the mRNA are selected for mammalian cell expression. The open reading frame encoding the transposase may further encode a heterologous nuclear localization sequence fused to the transposase. The open reading frame encoding the transposase can further encode a heterologous DNA binding domain (e.g., derived from a CRISPR Cas system, a zinc finger protein, or a TALE protein) fused to the transposase. The non-naturally occurring polynucleotide may encode a polypeptide whose sequence is at least 90% identical to SEQ ID NO: 782.
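Substitutions written in the form E22D (wild-type residue, 1-based position, replacement residue) and the "at least 90% identical" criterion can both be made concrete in code. The following is a minimal sketch with important caveats: the 10-residue reference is a hypothetical placeholder and not SEQ ID NO: 782, the helper names are invented, and identity is computed position by position for equal-length sequences, whereas a real comparison would normally start from a sequence alignment.

```python
import re


def apply_substitutions(sequence: str, substitutions: list[str]) -> str:
    """Apply substitutions such as 'E22D' (1-based positions) to a protein."""
    residues = list(sequence)
    for sub in substitutions:
        match = re.fullmatch(r"([A-Z])(\d+)([A-Z])", sub)
        if match is None:
            raise ValueError(f"unrecognized substitution: {sub}")
        wild_type, replacement = match.group(1), match.group(3)
        position = int(match.group(2))
        if position < 1 or position > len(residues):
            raise ValueError(f"position out of range in {sub}")
        # Guard against applying a mutation to the wrong reference sequence.
        if residues[position - 1] != wild_type:
            raise ValueError(f"reference does not have {wild_type} at {position}")
        residues[position - 1] = replacement
    return "".join(residues)


def percent_identity(first: str, second: str) -> float:
    """Ungapped percent identity between two equal-length sequences."""
    if len(first) != len(second):
        raise ValueError("sequences must have equal length for this sketch")
    matches = sum(a == b for a, b in zip(first, second))
    return 100.0 * matches / len(first)


# Hypothetical toy reference, NOT a sequence from this disclosure.
reference = "MKTEAQLVFR"
variant = apply_substitutions(reference, ["E4D", "Q6D"])
identity = percent_identity(reference, variant)
```

Here two substitutions in ten residues give 80% identity, so this toy variant would fall outside a 90% threshold. The wild-type check inside `apply_substitutions` is a design choice that catches numbering mistakes early.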

The Oryzias transposon comprises SEQ ID NO: 7 and SEQ ID NO: 8. The transposon may further comprise, on one side of the heterologous polynucleotide, a sequence at least 90% identical to SEQ ID NO: 12 and, on the other side, a sequence at least 90% identical to SEQ ID NO: 15. The heterologous polynucleotide may comprise a heterologous promoter active in eukaryotic cells. The promoter may be operably linked to one or more of: i) an open reading frame; ii) a nucleic acid encoding a selectable marker; iii) a nucleic acid encoding a counter-selectable marker; iv) a nucleic acid encoding a regulatory protein; v) a nucleic acid encoding an inhibitory RNA. The heterologous promoter may comprise a sequence selected from SEQ ID NO: 325 and 409. The heterologous polynucleotide may comprise a heterologous enhancer active in eukaryotic cells. The heterologous enhancer may be selected from SEQ ID NO: 304-324. The heterologous polynucleotide may comprise a heterologous intron that is spliceable in a eukaryotic cell. The nucleotide sequence of the heterologous intron can be selected from SEQ ID NO: 412-472. The heterologous polynucleotide may comprise an insulator sequence. The nucleic acid sequence of the insulator may be selected from SEQ ID NO: 286-292. The heterologous polynucleotide may comprise two open reading frames, each of which is operably linked to a separate promoter. The heterologous polynucleotide may comprise a sequence selected from SEQ ID NOs: 596, 779. The heterologous polynucleotide may comprise or encode a selectable marker. The selectable marker may be selected from the group consisting of glutamine synthetase, dihydrofolate reductase, puromycin acetyltransferase, blasticidin acetyltransferase, hygromycin B phosphotransferase, aminoglycoside 3'-phosphotransferase, and a fluorescent protein. One embodiment of the invention is a eukaryotic cell whose genome comprises SEQ ID NO: 7 and SEQ ID NO: 8.
The cell may be an animal cell, a mammalian cell, a rodent cell, or a human cell.

The transposon can be integrated into the genome of a eukaryotic cell by: (a) introducing into the cell a transposon comprising SEQ ID NO: 7 and SEQ ID NO: 8, and (b) introducing a transposase into the cell, the transposase having a sequence at least 90% identical to SEQ ID NO: 782, wherein the transposase transposes the transposon to produce a genome of the cell comprising SEQ ID NO: 7 and SEQ ID NO: 8. The transposase can be introduced as a polynucleotide encoding the transposase, which can be an mRNA molecule or a DNA molecule. The transposase can also be introduced as a protein. The heterologous polynucleotide may also encode a selectable marker, and the method may further comprise selecting cells comprising the selectable marker. The cell can be an animal cell, a mammalian cell, a rodent cell, or a human cell. The human cell may be a human immune cell, such as a B cell or a T cell. The heterologous polynucleotide may encode a chimeric antigen receptor. The polypeptide may be expressed from a transposon that is integrated into the genome of a eukaryotic cell. The polypeptide may be purified. The purified polypeptide may be incorporated into a pharmaceutical composition.

Drawings

FIG. 1: Structure of an Oryzias transposon. The Oryzias transposon comprises a left transposon end and a right transposon end flanking a heterologous polynucleotide. The left transposon end comprises (i) a left target sequence, which is typically 5'-TTAA-3', although many other target sequences are used at lower frequency (Li et al., 2013, Proc. Natl. Acad. Sci. vol. 110, no. 6, E478-487); (ii) a left ITR (e.g., SEQ ID NO: 7); and (iii) (optionally) additional left transposon end sequences (e.g., SEQ ID NO: 12). The right transposon end comprises (i) (optionally) additional right transposon end sequences (e.g., SEQ ID NO: 15); (ii) a right ITR (e.g., SEQ ID NO: 8) that is a perfect or imperfect repeat of the left ITR, but in the opposite orientation; and (iii) a right target sequence, which is typically identical to the left target sequence.

Detailed Description

5.1 Definitions

The use of the singular forms "a," "an," and "the" includes the plural forms as well, unless the context clearly dictates otherwise. Thus, for example, reference to "a polynucleotide" includes a plurality of polynucleotides, reference to "a substrate" includes a plurality of such substrates, reference to "a variant" includes a plurality of variants, and the like.

Terms such as "connected," "attached," "linked," and "coupled" are used interchangeably herein and include direct as well as indirect connection, attachment, linking, or coupling, unless the context clearly dictates otherwise. Where a range of values is recited, it is understood that each intervening integer value, and each fraction thereof, between the recited upper and lower limits of that range is also specifically disclosed, as well as each subrange between such values. The upper and lower limits of any range can independently be included in or excluded from the range, and each range where either, neither, or both limits are included is also encompassed within the invention. Where the values in question have inherent limits (e.g., where a component may be present at a concentration of 0 to 100%, or where the pH of an aqueous solution may be in the range of 1 to 14), these inherent limits are specifically disclosed. Where a value is explicitly recited, it is understood that values approximately equal to the recited value are also within the scope of the invention. Where a combination is disclosed, each subcombination of the elements of that combination is also specifically disclosed and is within the scope of the invention. Conversely, where different elements or groups of elements are disclosed separately, combinations thereof are also disclosed. Where any element of the invention is disclosed as having a plurality of alternatives, examples of the invention are also disclosed herein in which each alternative is excluded alone or in any combination with the others. More than one element of the invention may have such exclusions, and all combinations of elements having such exclusions are disclosed herein.

Unless defined otherwise herein, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this invention belongs. Singleton et al., Dictionary of Microbiology and Molecular Biology, second edition, John Wiley and Sons, New York (1994), and Hale & Marham, The Harper Collins Dictionary of Biology, Harper Perennial, NY, 1991, provide the artisan with a general dictionary of many of the terms used in the present invention. Although any methods and materials similar or equivalent to those described herein can be used in the practice or testing of the present invention, the preferred methods and materials are described. Unless otherwise indicated, nucleic acids are written from left to right in the 5' to 3' direction; amino acid sequences are written from left to right in the amino to carboxy direction. The terms defined immediately below are more fully defined by reference to the specification as a whole.

The "configuration" of a polynucleotide refers to the functional sequence elements within the polynucleotide, as well as the order and orientation of those elements.

The terms "corresponding transposon" and "corresponding transposase" are used to indicate an active relationship between the transposase and the transposon. Transposase transposes its corresponding transposon. Many transposases may correspond to a single transposon. Transposons are transposed by their corresponding transposases. Many transposons may correspond to one transposase.

The term "counter-selectable marker" refers to a polynucleotide sequence that confers a selective disadvantage on a host cell. Examples of counter-selectable markers include sacB, rpsL, tetAR, pheS, thyA, gata-1, ccdB, kid, and barnase (Bernard,1995, Journal/Gene,162:159-, 2005, Journal/Appl Environ Microbiol,71: 587-; Yazynin et al, 1999, Journal/FEBS Lett,452: 351-354). Counter-selectable markers generally confer their selective disadvantage only under certain circumstances. For example, they may confer sensitivity to compounds that may be added to the host cell environment, or they may kill a host with a certain genotype but not a host with a different genotype. Conditions that do not impose a selective disadvantage on cells carrying the counter-selectable marker are called "permissive". Conditions that do impose a selective disadvantage on cells carrying the counter-selectable marker are called "restrictive".

The term "coupling element" or "translational coupling element" refers to a DNA sequence that allows for the linkage of expression of a first polypeptide to expression of a second polypeptide. Internal ribosome entry site elements (IRES elements) and cis-acting hydrolase elements (CHYSEL elements) are examples of coupling elements.

The terms "DNA sequence", "RNA sequence", or "polynucleotide sequence" refer to a contiguous nucleic acid sequence. The sequence can range from an oligonucleotide of 2 to 20 nucleotides in length to a full-length genomic sequence comprising thousands of base pairs.

The term "expression construct" refers to any polynucleotide designed to transcribe RNA. An example is a construct comprising at least one promoter that is, or can be, operably linked to a downstream gene, coding region, or polynucleotide sequence (e.g., a cDNA or genomic DNA fragment encoding a polypeptide or protein, or an RNA effector molecule, such as an antisense RNA, a triplex-forming RNA, a ribozyme, an artificially selected high-affinity RNA ligand (aptamer), a double-stranded RNA, such as an RNA molecule comprising a stem-loop or hairpin dsRNA, a bi-finger or multi-finger dsRNA, or a microRNA, or any RNA). An "expression vector" is a polynucleotide comprising a promoter operably linked to a second polynucleotide. Transfection or transformation of the expression construct into a recipient cell causes the cell to express the RNA effector molecule, polypeptide, or protein encoded by the expression construct. The expression construct may be a genetically engineered plasmid, a virus, a recombinant virus, or an artificial chromosome derived from, for example, a bacteriophage, an adenovirus, an adeno-associated virus, a retrovirus, a lentivirus, a poxvirus, or a herpesvirus. Such expression vectors may include sequences from bacteria, viruses, or phages. Such vectors include chromosomal, episomal, and virus-derived vectors, such as vectors derived from bacterial plasmids, bacteriophages, yeast episomes, yeast chromosomal elements, and viruses, and vectors derived from combinations thereof, such as those derived from plasmid and bacteriophage genetic elements, cosmids, and phagemids. Expression constructs may be replicated in living cells or may be prepared synthetically. For the purposes of this application, the terms "expression construct", "expression vector", "vector" and "plasmid" are used interchangeably to illustrate the use of the invention in a general sense and are not intended to limit the invention to a particular type of expression construct.

The term "expression polypeptide" refers to a polypeptide encoded by a gene on an expression construct.

The term "expression system" refers to any in vivo or in vitro biological system for producing one or more gene products encoded by a polynucleotide.

A "gene" refers to a transcriptional unit that includes a promoter and a sequence to be expressed from it as RNA or protein. The sequence to be expressed may be genomic or cDNA, among other possibilities. Other elements, such as introns and other regulatory sequences, may or may not be present.

"Gene transfer system" includes a vector or gene transfer vector, or a polynucleotide comprising a gene to be transferred, which is cloned into a vector ("gene transfer polynucleotide" or "gene transfer construct"). The gene transfer system may also include other features to facilitate the process of gene transfer. For example, a gene transfer system may comprise a vector and a lipid or viral packaging mixture to enable a first polynucleotide to enter a cell, or it may comprise a polynucleotide comprising a transposon and a second polynucleotide sequence encoding a corresponding transposase to enhance productive genomic integration of the transposon. The transposase and transposon of the gene transfer system may be on the same nucleic acid molecule or on different nucleic acid molecules. The transposase of the gene transfer system may be provided in the form of a polynucleotide or polypeptide.

Two elements are "heterologous" to each other if they are not naturally associated. For example, when a nucleic acid sequence encoding a protein is linked to a heterologous promoter, the promoter is one other than the promoter that naturally drives expression of the protein. A heterologous nucleic acid flanked by transposon ends or ITRs is a nucleic acid that is not naturally flanked by those transposon ends or ITRs, e.g., a nucleic acid encoding a polypeptide other than a transposase, including an antibody heavy or light chain. A nucleic acid is heterologous to a cell if it does not naturally occur in the cell, or naturally occurs in the cell but at a different location (e.g., episomal or at a different genomic location).

The term "host" refers to any prokaryotic or eukaryotic organism that can be a recipient of a nucleic acid. The term "host" as used herein includes prokaryotic or eukaryotic organisms that can be genetically engineered. For examples of such hosts, see Maniatis et al., Molecular Cloning: A Laboratory Manual, Cold Spring Harbor Laboratory, Cold Spring Harbor, N.Y. (1982). The terms "host", "host cell", "host system" and "expression host" are used interchangeably herein.

A "high activity" transposase is a transposase that is more active than the naturally occurring transposase from which it is derived. Thus, a "high activity" transposase is not a naturally occurring sequence.

An "integration-deficient" or "transposition-deficient" transposase is a transposase that excises its corresponding transposon, but integrates the excised transposon into the host genome at a lower frequency than the corresponding naturally occurring transposase.

"IRES" or "internal ribosome entry site" refers to a specialized sequence that directly promotes ribosome binding, independent of the cap structure.

An "isolated" polypeptide or polynucleotide refers to a polypeptide or polynucleotide that has been removed from its natural environment, produced using recombinant techniques, or chemically or enzymatically synthesized. The polypeptides or polynucleotides of the invention may be purified, i.e., substantially free of any other polypeptides or polynucleotides and associated cellular products or other impurities.

The terms "nucleoside" and "nucleotide" include those moieties that contain not only the known purine and pyrimidine bases, but also other heterocyclic bases that have been modified. Such modifications include methylated purines or pyrimidines, acylated purines or pyrimidines, or other heterocycles. Modified nucleosides or nucleotides can also include modifications on the sugar moiety, for example, where one or more hydroxyl groups are substituted with halogens, aliphatic groups, or functionalized as ethers, amines, and the like. The term "nucleotidic unit" is intended to encompass nucleosides and nucleotides.

An "open reading frame" or "ORF" refers to a portion of a polynucleotide that, when translated into amino acids, contains no stop codons. The genetic code reads a DNA sequence in sets of three bases, so a double-stranded DNA molecule can be read in any of six possible reading frames: three in the forward direction and three in the reverse direction. An ORF typically also includes a start codon at which translation can begin.
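
The six reading frames described above can be illustrated with a short Python sketch. It is not part of the disclosed methods: the function names, the `min_codons` threshold, and the example sequence are hypothetical, only the stop codons of the standard genetic code are assumed, and an ORF is reported only when it starts at ATG and is closed by a stop codon.

```python
STOPS = {"TAA", "TAG", "TGA"}  # stop codons of the standard genetic code
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(COMPLEMENT)[::-1]

def six_frames(seq):
    """Yield (strand, offset, codons) for three forward and three reverse frames."""
    for strand, s in (("+", seq), ("-", reverse_complement(seq))):
        for offset in range(3):
            # Only complete codons are kept; a trailing partial codon is dropped.
            codons = [s[i:i + 3] for i in range(offset, len(s) - 2, 3)]
            yield strand, offset, codons

def find_orfs(seq, min_codons=2):
    """Return (strand, offset, dna) for stretches that start at ATG and run
    up to, but not including, the next in-frame stop codon."""
    orfs = []
    for strand, offset, codons in six_frames(seq):
        start = None
        for idx, codon in enumerate(codons):
            if start is None and codon == "ATG":
                start = idx
            elif start is not None and codon in STOPS:
                if idx - start >= min_codons:
                    orfs.append((strand, offset, "".join(codons[start:idx])))
                start = None
    return orfs

# Hypothetical 13-base example with one short ORF in forward frame 2.
print(find_orfs("CCATGGCTTAAGG"))  # [('+', 2, 'ATGGCT')]
```

A real ORF search would usually also handle lower-case input, ambiguous bases, and alternative genetic codes, all of which are omitted here for brevity.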

The term "operably linked" refers to a functional linkage between two sequences such that one modifies the behavior of the other. For example, a first polynucleotide comprising a nucleic acid expression control sequence (e.g., a promoter, an IRES sequence, an enhancer, or an array of transcription factor binding sites) is operably linked to a second polynucleotide when the first polynucleotide affects the transcription and/or translation of the second polynucleotide. Similarly, a first amino acid sequence comprising a secretion signal or a subcellular localization signal is operably linked to a second amino acid sequence if the first amino acid sequence results in the second amino acid sequence being secreted or localized at the subcellular location.

The term "orthogonal" refers to the lack of interaction between two systems. If a first transposase does not excise or transpose a second transposon and the second transposase does not excise or transpose the first transposon, then the first transposon and its corresponding first transposase and the second transposon and its corresponding second transposase are orthogonal.
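
The definition of orthogonality can be restated as a simple check over observed cross-activity. The following Python sketch assumes a hypothetical table of experimental observations (`activity`) keyed by (transposase, transposon) pairs; the names in the example are placeholders and do not report any measured result.

```python
def are_orthogonal(activity, pair_a, pair_b):
    """Return True if neither transposase acts on the other pair's transposon.

    activity maps (transposase, transposon) -> True if excision or
    transposition was observed; missing entries are treated as no activity,
    which is an assumption of this sketch.
    """
    transposase_a, transposon_a = pair_a
    transposase_b, transposon_b = pair_b
    a_on_b = activity.get((transposase_a, transposon_b), False)
    b_on_a = activity.get((transposase_b, transposon_a), False)
    return not a_on_b and not b_on_a

# Placeholder observations for two hypothetical systems, X and Y.
observed = {
    ("transposase_X", "transposon_X"): True,
    ("transposase_Y", "transposon_Y"): True,
    ("transposase_X", "transposon_Y"): False,
    ("transposase_Y", "transposon_X"): False,
}
print(are_orthogonal(observed,
                     ("transposase_X", "transposon_X"),
                     ("transposase_Y", "transposon_Y")))  # True
```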

The term "overhang" or "DNA overhang" refers to a single-stranded portion at the end of a double-stranded DNA molecule. Overhangs that base-pair with each other are complementary overhangs.

A "piggyBac-like transposase" refers to a transposase having at least 20% sequence identity, as identified using the TBLASTN algorithm, to the piggyBac transposase from the cabbage looper moth Trichoplusia ni (SEQ ID NO: 909), as described more fully in Sarkar, A. et al. (2003), Mol. Gen. Genomics 270: 173-180, "Molecular evolutionary analysis of the widespread piggyBac transposon family and related 'domesticated' sequences", and further characterized by a DDE-like DDD motif, with aspartic acid residues at positions corresponding to D268, D346, and D447 of the Trichoplusia ni piggyBac transposase at maximum alignment. PiggyBac-like transposases are also characterized by their ability to excise their transposons precisely and at high frequency. A "piggyBac-like transposon" is a transposon having transposon ends that are identical or at least 80% identical, preferably at least 90, 95, 96, 97, 98 or 99% identical, to the transposon ends of a naturally occurring transposon encoding a piggyBac-like transposase. A piggyBac-like transposon contains an inverted terminal repeat (ITR) of about 12 to 16 bases at each end and is flanked by a 4-base sequence corresponding to the integration target sequence, which is duplicated during transposon integration (the target site duplication or target sequence duplication, TSD). PiggyBac-like transposons and transposases occur naturally in a variety of organisms, including Argyrogramma agnata (GU477713), Anopheles gambiae (XP_312615; XP_320414; XP_310729), Aphis gossypii (GU329918), Acyrthosiphon pisum (XP_001948139), Agrotis ypsilon (GU477714), Bombyx mori (BAD11135), Ciona intestinalis (XP_002123602), Chilo suppressalis (JX294476), Drosophila melanogaster (AAL 84), Daphus dapsus (Dapha puaria) (AAM 7676767676), Cochlozia gossypiella (Acipes nigra), Geotrichu sinensis (Achyriopsis larva Ab) and Achyriopsis larva Ab sinensis (Achyriopsis) sp) (Ab sinensis Daphnigeri # 3 741958 42), Gerocinia, Gekko et Geranium sp) (Phoenis sp) (Phoeni nigra) (Phoenis sp) (Phoeni et zans sp) (Phoenis sp) (Phoeni # 3 741958 5), Miyas) (Phoenis sp) (Phoenii) (Phoenis sp) (Phoenis 677676), Miyas sp) (Phoenis 677635), Miyas) (Phoenii et zan, Trichoplusia ni (AAA87375) and Xenopus tropicalis (BAF82026), although transposition activity has been described for very few of them.

The terms "polynucleotide", "oligonucleotide", "nucleic acid" and "nucleic acid molecule" are used interchangeably to refer to a polymeric form of nucleotides of any length, which may comprise ribonucleotides, deoxyribonucleotides, analogs thereof, or mixtures thereof. The term refers only to the primary structure of the molecule. Thus, the term includes triple-, double- and single-stranded deoxyribonucleic acid ("DNA"), as well as triple-, double- and single-stranded ribonucleic acid ("RNA"). It also includes modified forms of the polynucleotide, for example those modified by alkylation and/or by capping, as well as unmodified forms. More specifically, the terms "polynucleotide", "oligonucleotide", "nucleic acid" and "nucleic acid molecule" include polydeoxyribonucleotides (containing 2-deoxy-D-ribose), polyribonucleotides (containing D-ribose), including tRNA, rRNA, hRNA, siRNA and mRNA (whether spliced or not), any other type of polynucleotide that is an N- or C-glycoside of a purine or pyrimidine base, as well as other polymers containing non-nucleotidic backbones, such as polyamide (e.g., peptide nucleic acids ("PNAs")) and polymorpholino (commercially available from Anti-Virals, Inc., Corvallis, Oreg., as Neugene) polymers, and other synthetic sequence-specific nucleic acid polymers, provided that the polymers contain nucleobases in a configuration that allows base pairing and base stacking, such as is found in DNA and RNA. There is no intended distinction in length between the terms "polynucleotide", "oligonucleotide", "nucleic acid" and "nucleic acid molecule", and these terms are used interchangeably herein. These terms refer only to the primary structure of the molecule. Thus, these terms include, for example, 3'-deoxy-2',5'-DNA, oligodeoxyribonucleotide N3' P5' phosphoramidates, 2'-O-alkyl-substituted RNA, double- and single-stranded DNA, and double- and single-stranded RNA, and hybrids thereof, including, for example, hybrids between DNA and RNA or between PNAs and DNA or RNA. They also include known types of modifications, such as labels, alkylation, "caps", substitution of one or more nucleotides with an analog, and internucleotide modifications, such as those with uncharged linkages (e.g., methylphosphonates, phosphotriesters, phosphoramidates, carbamates, etc.), those with negatively charged linkages (e.g., phosphorothioates, phosphorodithioates, etc.), and those with positively charged linkages (e.g., aminoalkylphosphoramidates, aminoalkylphosphotriesters); those containing pendant moieties, such as proteins (including enzymes (e.g., nucleases), toxins, antibodies, signal peptides, poly-L-lysine, etc.); those with intercalators (e.g., acridine, psoralen, etc.); those containing chelates (of, e.g., metals, radioactive metals, boron, oxidative metals, etc.); those containing alkylators; those with modified linkages (e.g., alpha anomeric nucleic acids, etc.); as well as unmodified forms of the polynucleotide or oligonucleotide.

A "promoter" refers to a nucleic acid sequence sufficient to direct transcription of an operably linked nucleic acid molecule. The promoter may be used with or without other transcriptional control elements (e.g., enhancers) sufficient to allow promoter-dependent gene expression to be controllable in a cell-type specific, tissue-specific, or time-specific manner, or inducible by external signals or agents; such elements may be within the 3' region or introns of the gene. Desirably, the promoter is operably linked to a nucleic acid sequence, such as a cDNA or gene sequence, or an effector RNA coding sequence, in a manner that enables expression of the nucleic acid sequence, or the promoter is provided in an expression cassette into which a selected nucleic acid sequence to be transcribed may be conveniently inserted. A regulatory element, such as a promoter active in mammalian cells, refers to a regulatory element that can be configured to result in the expression of at least 1 transcript per cell in a mammalian cell into which the regulatory element has been introduced.

The term "selectable marker" refers to a polynucleotide fragment, or its expression product, that allows selection for or against a molecule or cell comprising it, generally under specific conditions. These markers may encode an activity such as, but not limited to, the production of RNA, peptides or proteins, or may provide binding sites for RNA, peptides, proteins, inorganic and organic compounds or compositions. Examples of selectable markers include, but are not limited to: (1) DNA fragments encoding products that provide resistance to otherwise toxic compounds (e.g., antibiotics); (2) DNA fragments encoding products that are otherwise lacking in the recipient cell (e.g., tRNA genes, auxotrophic markers); (3) DNA fragments encoding products that suppress the activity of a gene product; (4) DNA fragments encoding products that can be readily identified (e.g., phenotypic markers such as β-galactosidase, green fluorescent protein (GFP), and cell surface proteins); (5) DNA fragments that bind products that are otherwise detrimental to cell survival and/or function; (6) DNA fragments that otherwise inhibit the activity of any of the DNA fragments described in (1) to (5) above (e.g., antisense oligonucleotides); (7) DNA fragments that bind products that modify a substrate (e.g., restriction endonucleases); (8) DNA fragments that can be used to isolate a desired molecule (e.g., specific protein binding sites); (9) DNA fragments encoding a specific nucleotide sequence that may otherwise be non-functional (e.g., for PCR amplification of subpopulations of molecules); and/or (10) DNA fragments that, when absent, directly or indirectly confer sensitivity to particular compounds.

Sequence identity can be determined by aligning sequences using algorithms such as BESTFIT, FASTA and TFASTA in the Wisconsin Genetics Software Package Release 7.0 (Genetics Computer Group, 575 Science Dr., Madison, Wis.), using default gap parameters, or by inspection, and selecting the best alignment (i.e., the one resulting in the highest percentage of sequence similarity over the comparison window). Percent sequence identity is determined by comparing two optimally aligned sequences over a comparison window, determining the number of positions at which identical residues occur in both sequences to yield the number of matched positions, dividing the number of matched positions by the total number of matched and unmatched positions in the comparison window (i.e., the window size, not counting gaps), and multiplying the result by 100 to yield the percent sequence identity. Unless otherwise indicated, the window of comparison between two sequences is defined by the full length of the shorter of the two sequences.
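
The percent identity arithmetic described above can be sketched in Python. This sketch assumes the two sequences have already been optimally aligned by some other tool and padded with a gap character; the function name and the example alignment are hypothetical, and the alignment step itself is not shown. Skipping every column that contains a gap is one reading of "not counting gaps" and is an assumption of the sketch.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity over the ungapped columns of a pairwise alignment."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    matched = 0
    compared = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == gap or b == gap:
            continue  # columns containing a gap are not counted
        compared += 1
        if a == b:
            matched += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matched / compared

# Hypothetical alignment: 4 matches out of 5 ungapped columns.
print(percent_identity("ACGT-A", "ACCTGA"))  # 80.0
```

Under this convention, a threshold such as "at least 90% identical" corresponds to `percent_identity(...) >= 90.0` for the chosen alignment.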

The "target nucleic acid" is a nucleic acid into which a transposon is to be inserted. Such targets may be part of a chromosome, episome or vector.

An "integration target sequence" or "target site" of a transposase is a site or sequence in a target DNA molecule into which a transposon can be inserted by the transposase. The piggyBac transposase of Trichoplusia ni inserts its transposon predominantly into the target sequence 5'-TTAA-3'. Other usable target sequences for piggyBac transposons are 5'-CTAA-3', 5'-TTAG-3', 5'-ATAA-3', 5'-TCAA-3', 5'-AGTT-3', 5'-ATTA-3', 5'-GTTA-3', 5'-TTGA-3', 5'-TTTA-3', 5'-TTAC-3', 5'-ACTA-3', 5'-AGGG-3', 5'-CTAG-3', 5'-GTAA-3', 5'-AGGT-3', 5'-ATCA-3', 5'-CTCC-3', 5'-TAAA-3', 5'-TCTC-3', 5'-TGAA-3', 5'-AAAT-3', 5'-AATC-3', 5'-ACAA-3', 5'-ACAT-3', 5'-ACTC-3', 5'-AGTG-3', 5'-ATAG-3', 5'-CAAA-3', 5'-CACA-3', 5'-CATA-3', 5'-CCAG-3', 5'-CCCA-3', 5'-CGTA-3', 5'-CTGA-3', 5'-GTCC-3', 5'-TAAG-3', 5'-TCTA-3', 5'-TGAG-3', 5'-TGTT-3', 5'-TTCA-3', 5'-TTCT-3' and 5'-TTTT-3' (Li et al., 2013, Proc. Natl. Acad. Sci. vol. 110, no. 6, E478-487). PiggyBac-like transposases transpose their transposons using a cut-and-paste mechanism, which results in duplication of their 4 base pair target sequence upon insertion into the DNA molecule. Thus, the target sequence is found on each side of an integrated piggyBac-like transposon.
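
The target site duplication described above can be modeled as a string operation. The following Python sketch is illustrative only: the function names and sequences are hypothetical, a single strand is modeled, and the default 5'-TTAA-3' target is assumed. It is not a model of the enzymatic mechanism.

```python
def find_target_sites(dna, target="TTAA"):
    """Return the 0-based start position of every occurrence of the target."""
    width = len(target)
    return [i for i in range(len(dna) - width + 1) if dna[i:i + width] == target]

def integrate(genome, site, transposon_core, target="TTAA"):
    """Insert transposon_core (ITR, cargo, ITR; no target sequences) at a site.

    The single target sequence in the genome ends up as two copies, one on
    each side of the inserted transposon.
    """
    width = len(target)
    if genome[site:site + width] != target:
        raise ValueError("no target sequence at this position")
    return genome[:site] + target + transposon_core + target + genome[site + width:]

genome = "GGTTAACC"          # hypothetical genomic fragment
sites = find_target_sites(genome)
print(sites)                 # [2]
# "NNNN" is a placeholder for the left ITR, heterologous polynucleotide, and right ITR.
print(integrate(genome, sites[0], "NNNN"))  # GGTTAANNNNTTAACC
```

Passing a different `target` value allows the less frequently used target sequences listed above to be modeled in the same way.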

The term "translation" refers to the process of synthesizing a polypeptide by "reading" the sequence of a polynucleotide by a ribosome.

A "transposase" is a polypeptide that catalyzes the excision of a corresponding transposon from a donor polynucleotide (e.g., a vector) and, provided the transposase is not integration-deficient, the subsequent integration of the transposon into a target nucleic acid. An "Oryzias transposase" refers to a transposase having at least 80, 90, 95, 96, 97, 98, 99 or 100% sequence identity to SEQ ID NO:782, including high activity variants of SEQ ID NO:782. A high activity transposase is a transposase that is more active than the naturally occurring transposase from which it is derived, in excision activity, transposition activity, or both. The activity of a high activity transposase is preferably at least 1.5-fold, at least 2-fold, at least 5-fold, or at least 10-fold higher than that of the naturally occurring transposase from which it is derived, for example 2-5 fold or 1.5-10 fold higher. The transposase may or may not be fused to one or more other domains, such as a nuclear localization sequence or a DNA binding protein.

The term "transposition" as used herein refers to the action of a transposase in excising a transposon from one polynucleotide and then integrating it into a different site of the same polynucleotide or into a second polynucleotide.

The term "transposon" refers to a polynucleotide that can be excised from a first polynucleotide (e.g., a vector) and integrated into a second location of the same polynucleotide or a second polynucleotide (e.g., the genome of a cell or extrachromosomal DNA) by the action of a corresponding trans-acting transposase. The transposon comprises a first transposon end and a second transposon end, which are polynucleotide sequences that are recognized and transposed by a transposase. The transposon typically further comprises a first polynucleotide sequence between the two transposon ends such that the first polynucleotide sequence is transposed with the two transposon ends by the action of the transposase. The first polynucleotide in a native transposon typically comprises an open reading frame encoding a corresponding transposase that recognizes and transposes the transposon. The transposons of the invention are "synthetic transposons" comprising heterologous polynucleotide sequences which can be transposed due to their juxtaposition (juxtaposition) between the two transposon ends. The synthetic transposon may or may not further comprise flanking polynucleotide sequences located outside the ends of the transposon, such as sequences encoding transposase, vector sequences or sequences encoding a selectable marker.

The term "transposon end" refers to a cis-acting nucleotide sequence sufficient to be recognized by and transposed by a corresponding transposase. The transposon ends of piggyBac-like transposons contain perfect or imperfect repeats, such that the respective repeats in the two transposon ends are reverse complements of each other. These are called inverted terminal repeats (ITRs) or terminal inverted repeats (TIRs). A transposon end may or may not include additional sequences adjacent to the ITR that facilitate or enhance transposition.
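
Whether two ITRs are perfect or imperfect inverted repeats of each other can be checked by comparing one end with the reverse complement of the other. The Python sketch below uses short placeholder sequences, not SEQ ID NO:7 or SEQ ID NO:8; the function names are hypothetical, and equal-length ITRs without insertions or deletions are assumed.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(COMPLEMENT)[::-1]

def itr_mismatches(left_itr, right_itr):
    """Count positions where right_itr differs from the reverse complement of left_itr."""
    if len(left_itr) != len(right_itr):
        raise ValueError("this sketch compares ITRs of equal length only")
    expected = reverse_complement(left_itr)
    return sum(1 for a, b in zip(expected, right_itr) if a != b)

def classify_itrs(left_itr, right_itr):
    """Label a pair of ITRs as a 'perfect' or 'imperfect' inverted repeat."""
    return "perfect" if itr_mismatches(left_itr, right_itr) == 0 else "imperfect"

# Placeholder sequences chosen only to exercise both outcomes.
print(classify_itrs("CCCTAGAAG", "CTTCTAGGG"))  # perfect
print(classify_itrs("CCCTAGAAG", "CTTCTAGGA"))  # imperfect
```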

The term "vector" or "DNA vector" or "gene transfer vector" refers to a polynucleotide that is used to perform a "carrying" function on another polynucleotide. For example, vectors are often used to allow a polynucleotide to be propagated within a living cell, or to allow a polynucleotide to be packaged for delivery into a cell, or to integrate a polynucleotide into the genomic DNA of a cell. The vector may further comprise other functional elements, for example it may comprise a transposon.

5.2 description

5.2.1. Gene integration

Expression of a gene from a heterologous polynucleotide in a eukaryotic host cell can be improved if the heterologous polynucleotide is integrated into the genome of the host cell. Integration of a polynucleotide into the genome of a host cell also generally makes it stably heritable, by subjecting it to the same mechanisms that ensure the replication and division of genomic DNA. Such stable heritability is desirable for achieving good and consistent expression over the long term. This is particularly important for cell therapies, in which cells are genetically modified and then placed into the body. It is also important for the manufacture of biomolecules, particularly for therapeutic applications, where the stability of the host and the consistency of expression levels also matter for regulatory purposes. Cells having a gene transfer vector (including a transposon-based gene transfer vector) integrated into their genome are thus important embodiments of the invention.

Heterologous polynucleotides can be integrated into a target genome more efficiently if they are part of a transposon (i.e., located between transposon ITRs), because they can then be integrated by a transposase. A particular benefit of a transposon is that the entire polynucleotide between the transposon ITRs is integrated. A transposon comprising target sites flanking ITRs, which in turn flank a heterologous polynucleotide, integrates at a target site in the genome, so that the genome comprises the heterologous polynucleotide flanked by the ITRs, which are flanked by the target site. This is in contrast to random integration, where a polynucleotide introduced into a eukaryotic cell is generally fragmented at random in the cell and only parts of the polynucleotide are incorporated into the target genome, usually at low frequency. The piggyBac transposon from the cabbage looper moth Trichoplusia ni has been shown to be transposed by its transposase in cells from many organisms (see, e.g., Keith et al. (2008) BMC Molecular Biology 9:72, "Analysis of the piggyBac transposase reveals a functional nuclear targeting signal in the 94 c-terminal residues"). A heterologous polynucleotide incorporated into a piggyBac-like transposon can be integrated into a eukaryotic cell, including an animal cell, a fungal cell, or a plant cell. Preferred animal cells may be vertebrate or invertebrate. Preferred vertebrate cells include cells from mammals, including rodents, such as rats, mice, and hamsters; ungulates, such as cattle, goats or sheep; and pigs. Preferred vertebrate cells also include cells from human tissue and human stem cells.
Target cell types include hepatocytes, nerve cells, muscle cells, blood cells, embryonic stem cells, adult stem cells, hematopoietic cells, embryos, zygotes, sperm cells (some of which may be manipulated in an in vitro environment), and immune cells, including lymphocytes, such as T cells, B cells, and natural killer cells, T helper cells, antigen presenting cells, dendritic cells, neutrophils, and macrophages. Preferred cells may be pluripotent (i.e., cells whose descendants can differentiate into several restricted cell types, such as hematopoietic stem cells or other stem cells) or totipotent (i.e., cells whose descendants can become any cell type in the organism, such as embryonic stem cells). Preferred cultured cells are Chinese hamster ovary (CHO) cells or human embryonic kidney (HEK293) cells. Preferred fungal cells are yeast cells, including Saccharomyces cerevisiae and Pichia pastoris. Preferred plant cells are algae (e.g., Chlorella), tobacco, maize and rice (Nishizawa-Yokoi et al. (2014) Plant J. 77:454-63, "Precise marker excision system using an animal-derived piggyBac transposon in plants").

Preferred gene transfer systems comprise a transposon and a corresponding transposase protein which transposes the transposon, or a nucleic acid which encodes a corresponding transposase protein and which is expressible in the target cell. Preferred gene transfer systems comprise synthetic Oryzias transposons and corresponding Oryzias transposases.

The transposase can be provided as a protein or as a nucleic acid encoding the transposase, for example as a ribonucleic acid, including an mRNA or any polynucleotide recognized by the cellular translation machinery; or as DNA, for example extrachromosomal DNA, including episomal DNA, plasmid DNA, or viral nucleic acid. In addition, nucleic acids encoding transposase proteins can be transfected into cells as nucleic acid vectors (e.g., plasmids) or as gene expression vectors (including viral vectors). The nucleic acid may be circular or linear. mRNA encoding a transposase can be prepared using DNA in which the gene encoding the transposase is operably linked to a heterologous promoter active in vitro (e.g., a bacteriophage T7 promoter). DNA encoding the transposase protein can be stably inserted into the genome of the cell or into a vector for constitutive or inducible expression. When the transposase is transfected into cells or inserted into a vector in the form of DNA, the transposase coding sequence is preferably operably linked to a heterologous promoter. Various promoters can be used, including constitutive promoters, cell-type specific promoters, organism-specific promoters, tissue-specific promoters, inducible promoters, and the like. When DNA encoding a transposase is operably linked to a promoter and transfected into a target cell, the promoter should be operable in the target cell. For example, if the target cell is a mammalian cell, the promoter should be operable in the mammalian cell; if the target cell is a yeast cell, the promoter should be operable in the yeast cell; if the target cell is an insect cell, the promoter should be operable in the insect cell; if the target cell is a human cell, the promoter should be operable in the human cell; if the target cell is a human immune cell, the promoter should be operable in the human immune cell. 
All DNA or RNA sequences encoding piggyBac-like transposase proteins are expressly contemplated. Alternatively, the transposase can be introduced directly into the cell in the form of a protein, for example, using a cell-penetrating peptide (e.g., as described in Ramsey and Flynn (2015) Pharmacol. Ther. 154:78-86 "Cell-penetrating peptides transport therapeutics into cells"); using small molecules including salts and propanebetaine (e.g., as described in Astolfo et al (2015) Cell 161: 674-690); or by electroporation (e.g., as described in Morgan and Day (1995) Methods in Molecular Biology 48:63-71 "The introduction of proteins into mammalian cells by electroporation").

The transposon can also be inserted into the DNA of a cell by non-homologous recombination through a variety of reproducible mechanisms, even without transposase activity. The transposons described herein can be used for gene transfer regardless of the mechanism by which the gene is transferred.

5.2.5 Gene transfer system

The gene transfer system comprises a polynucleotide to be transferred to a host cell. Preferably, the polynucleotide comprises an Oryzias transposon and is to be integrated into the genome of the target cell.

When the gene transfer system has multiple components, e.g., one or more polynucleotides comprising a gene for expression in a target cell and optionally a transposon end and transposase (which may be provided as a protein or encoded by a nucleic acid), these components may be transfected into the cell simultaneously or sequentially. For example, the transposase protein or nucleic acid encoding it can be transfected into the cell before, simultaneously with or after transfection of the corresponding transposon. In addition, administration of any one component of the gene transfer system may be repeated, for example, by administering at least two doses of the component.

Any transposase protein described herein can be encoded by a polynucleotide that includes RNA or DNA. Similarly, a nucleic acid encoding a transposase protein or transposon of the present invention can be transfected into a cell as a plasmid or recombinant viral DNA, in the form of a linear fragment or a circular fragment.

The Oryzias transposase can be provided as a DNA molecule that can be expressed in a target cell. The sequence encoding the Oryzias transposase should be operably linked to a heterologous sequence that enables the transposase to be expressed in the target cell. The sequence encoding the Oryzias transposase can be operably linked to a heterologous promoter active in the target cell. For example, if the target cell is a mammalian cell, the promoter should be active in the mammalian cell. If the target cell is a vertebrate cell, the promoter should be active in the vertebrate cell. If the target cell is a plant cell, the promoter should be active in the plant cell. If the target cell is an insect cell, the promoter should be active in the insect cell. The sequence encoding the Oryzias transposase can also be operably linked to other sequence elements required for expression in the target cell, such as polyadenylation sequences, terminator sequences, and the like.

The Oryzias transposase can be provided as an mRNA that can be expressed in a target cell. The mRNA is preferably prepared in an in vitro transcription reaction. For in vitro transcription, the sequence encoding the Oryzias transposase is operably linked to a promoter active in the in vitro transcription reaction. Exemplary promoters active in in vitro transcription reactions include the T7 promoter (5'-TAATACGACTCACTATAG-3'), which allows transcription by T7 RNA polymerase; the T3 promoter (5'-AATTAACCCTCACTAAAG-3'), which allows transcription by T3 RNA polymerase; and the SP6 promoter (5'-ATTTAGGTGACACTATAG-3'), which allows transcription by SP6 RNA polymerase. Variants of these promoters and other promoters useful for in vitro transcription may also be operably linked to sequences encoding the Oryzias transposase.
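
As an illustration of the template design described above, the following minimal Python sketch places one of the listed promoter sequences immediately upstream of a coding sequence. The function name `build_ivt_template` and the short placeholder open reading frame are hypothetical and are used only for illustration; the three promoter sequences are the ones given in the text.

```python
# Promoter sequences (5'->3') as listed in the text above.
IVT_PROMOTERS = {
    "T7": "TAATACGACTCACTATAG",
    "T3": "AATTAACCCTCACTAAAG",
    "SP6": "ATTTAGGTGACACTATAG",
}


def build_ivt_template(polymerase: str, coding_sequence: str) -> str:
    """Return promoter + coding sequence for the named RNA polymerase.

    Hypothetical helper: it only concatenates the sequences and does not
    model UTRs, poly(A) tails, or transcription start-site preferences.
    """
    key = polymerase.upper()
    if key not in IVT_PROMOTERS:
        raise ValueError(f"no promoter listed for polymerase {polymerase!r}")
    cds = coding_sequence.upper()
    if set(cds) - set("ACGT"):
        raise ValueError("coding sequence must contain only A, C, G, T")
    return IVT_PROMOTERS[key] + cds


# Placeholder ORF for illustration only (not a transposase sequence).
print(build_ivt_template("T7", "atggcttaa"))
```

In practice the coding sequence would be the transposase open reading frame together with any untranslated regions chosen for the mRNA; the sketch only shows the promoter placement.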

If the Oryzias transposase is provided in the form of a polynucleotide (DNA or mRNA) encoding the transposase, it is advantageous to increase the expression of the transposase in the target cell. Thus, sequences other than the naturally occurring sequence are advantageously used to encode the transposase; in other words, the codon preferences of the cell type in which expression is to be effected are used. For example, if the target cell is a mammalian cell, the codons should be biased towards the preferences of the mammalian cell. If the target cell is a vertebrate cell, the codons should be biased towards the preferences of the particular vertebrate cell. If the target cell is a plant cell, the codons should be biased towards the preferences of the plant cell. If the target cell is an insect cell, the codons should be biased towards the preferences of the insect cell.
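
The idea of codon biasing can be sketched as a back-translation that picks one preferred codon per amino acid. The table below is an illustrative placeholder covering only a few residues, and the function name `back_translate` is hypothetical; an actual design would use codon usage data for the intended target cell and would typically weigh further factors that this sketch ignores.

```python
# Illustrative placeholder: one assumed preferred codon per residue.
# A real table would be derived from codon usage data for the target cell.
PREFERRED_CODONS = {
    "M": "ATG",
    "K": "AAG",
    "L": "CTG",
    "E": "GAG",
    "*": "TGA",  # stop codon
}


def back_translate(protein: str, preferred: dict) -> str:
    """Encode a protein using the single preferred codon for each residue."""
    codons = []
    for residue in protein.upper():
        if residue not in preferred:
            raise KeyError(f"no preferred codon defined for {residue!r}")
        codons.append(preferred[residue])
    return "".join(codons)


# Hypothetical five-residue peptide, not part of any transposase.
print(back_translate("MKLE*", PREFERRED_CODONS))
```

Passing a different table to `back_translate` yields a different nucleotide sequence for the same protein, which is the sense in which the encoding is biased towards a particular target cell.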

Preferred RNA molecules include RNA molecules with appropriate cap structures to enhance translation in eukaryotic cells, poly(A) tails and other 3' sequences that enhance mRNA stability in eukaryotic cells, and optional substitutions to reduce toxic effects on cells (e.g., replacement of uridine with pseudouridine, and of cytosine with 5-methylcytosine). An mRNA encoding the Oryzias transposase can be prepared with a 5' cap structure to improve expression in the target cell. Exemplary cap structures are a cap analog (G(5')ppp(5')G), an anti-reverse cap analog (3'-O-Me-m7G(5')ppp(5')G), a clean cap (m7G(5')ppp(5')(2'OMeA)pG), and an mCap (m7G(5')ppp(5')G). The mRNA encoding the Oryzias transposase can be prepared with some bases partially or completely substituted, e.g., uridine replaced with pseudouridine and cytosine replaced with 5-methylcytosine. Any combination of these caps and substitutions may be used.

The components of the gene transfer system can be transfected into one or more cells by techniques such as particle bombardment; electroporation; microinjection; combining the components with lipid-containing vesicles (e.g., cationic lipid vesicles) or DNA-condensing agents (e.g., calcium phosphate, polylysine, or polyethyleneimine); or inserting the components (i.e., their nucleic acids) into a viral vector and contacting the viral vector with the cells. When a viral vector is used, it may be any of a variety of viral vectors known in the art, including retroviral vectors, adenoviral vectors, and adeno-associated viral vectors. The gene transfer system may be formulated in a suitable manner known in the art, or as a pharmaceutical composition or kit.

5.2.3 Sequence elements in gene transfer systems

Gene expression from gene transfer polynucleotides (e.g., piggyBac-like transposons, including Oryzias transposons) integrated into the host cell genome is typically strongly influenced by the chromatin environment into which they integrate. Polynucleotides integrated into euchromatin are expressed at higher levels than polynucleotides integrated into heterochromatin or polynucleotides that are silenced following integration. Silencing may be reduced if the heterologous polynucleotide comprises a chromatin control element. Thus, gene transfer polynucleotides (including any transposons described herein) advantageously comprise chromatin control elements, such as sequences that prevent the spread of heterochromatin (insulators). Advantageous gene transfer polynucleotides include Oryzias transposons comprising an insulator sequence that is at least 95% identical to a sequence selected from SEQ ID NOs: 286-292; they may also comprise ubiquitous chromatin opening elements (UCOEs) or stabilizing and anti-repressor elements (STARs) to increase long-term stable expression from the integrated gene transfer polynucleotide. Advantageous gene transfer polynucleotides may further comprise a matrix attachment region, e.g., a sequence that is at least 95% identical to a sequence selected from SEQ ID NOs: 293-303.

In some cases, it is advantageous for the gene transfer polynucleotide to comprise two insulators, one on each side of the heterologous polynucleotide containing the sequence to be expressed, and within the transposon ITRs. The insulators may be the same or different. Particularly advantageous gene transfer polynucleotides comprise an insulator sequence that is at least 95% identical to SEQ ID NO: 291 or SEQ ID NO: 292 and an insulator sequence that is at least 95% identical to a sequence selected from SEQ ID NOs: 286-290. Insulators also shield expression control elements from each other. For example, when a gene transfer polynucleotide comprises genes encoding two open reading frames, each of which is operably linked to a different promoter, one promoter can reduce expression from the other promoter in a phenomenon known as transcriptional interference. Inserting, between the two transcription units, an insulator sequence that is at least 95% identical to one of SEQ ID NOs: 286-292 can reduce this interference, increasing expression from one or both promoters.
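
The arrangement described above (an insulator on each side of the expressed sequence, both inside the transposon ITRs) can be expressed as a simple layout check. This is a minimal sketch: the element labels ("ITR", "insulator", "promoter", "ORF", "polyA") and the function name `insulators_flank_payload` are hypothetical and stand in for annotated features of a real construct.

```python
def insulators_flank_payload(elements, payload_labels=("promoter", "ORF", "polyA")):
    """Check that an insulator lies on each side of the payload, inside the ITRs.

    `elements` is an ordered list of feature labels along the construct.
    """
    itr_positions = [i for i, e in enumerate(elements) if e == "ITR"]
    if len(itr_positions) != 2:
        return False
    left, right = itr_positions
    payload_positions = [i for i, e in enumerate(elements) if e in payload_labels]
    if not payload_positions:
        return False
    first, last = payload_positions[0], payload_positions[-1]
    if not (left < first and last < right):
        return False  # payload is not located between the two ITRs
    upstream = "insulator" in elements[left + 1:first]
    downstream = "insulator" in elements[last + 1:right]
    return upstream and downstream


layout = ["ITR", "insulator", "promoter", "ORF", "polyA", "insulator", "ITR"]
print(insulators_flank_payload(layout))
```

The check deliberately does not require the two insulators to be the same element, since the text allows them to be the same or different.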

Preferred gene transfer vectors contain expression elements capable of driving high levels of gene expression. In eukaryotic cells, gene expression is regulated by several different types of elements, including enhancers, promoters, introns, RNA export elements, polyadenylation sequences, and transcription terminators.

An advantageous gene transfer polynucleotide for transferring a gene for expression into a eukaryotic cell comprises an enhancer operably linked to a heterologous gene. Advantageous gene transfer polynucleotides for transferring genes for expression into mammalian cells comprise an enhancer of the immediate early gene 1, 2 or 3 of cytomegalovirus (CMV) from human, primate or rodent cells (e.g., a sequence at least 95% identical to the sequence of one of SEQ ID NO: 304-322), the adenovirus major late protein enhancer (e.g., a sequence at least 95% identical to SEQ ID NO: 323), or the SV40 enhancer (e.g., a sequence at least 95% identical to SEQ ID NO: 324), operably linked to a heterologous gene.

An advantageous gene transfer polynucleotide for transferring a gene for expression into a eukaryotic cell comprises a promoter operably linked to a heterologous gene. Advantageous gene transfer polynucleotides for transferring genes for expression into mammalian cells comprise an EF1a promoter (e.g., any of SEQ ID NO: 325-346) from any mammalian or avian species, including human, rat, mouse, chicken and Chinese hamster; a promoter of the immediate early gene 1, 2 or 3 of cytomegalovirus (CMV) from human, primate or rodent cells (e.g., any of SEQ ID NO: 347-357); a promoter of eukaryotic elongation factor 2 (EEF2) from any mammalian or avian species, including human, rat, mouse, chicken and Chinese hamster (e.g., any of SEQ ID NO: 358-368); a GAPDH promoter from any mammalian or yeast species (e.g., any of SEQ ID NO: 379-); a PGK promoter (e.g., any of SEQ ID NO: 396-402) from any mammalian or avian species, including human, rat, mouse, chicken and Chinese hamster; or a ubiquitin promoter (e.g., SEQ ID NO: 403), operably linked to a heterologous gene. The promoter may be operably linked to: i) a heterologous open reading frame; ii) a nucleic acid encoding a selectable marker; iii) a nucleic acid encoding a counter-selectable marker; iv) a nucleic acid encoding a regulatory protein; v) a nucleic acid encoding an inhibitory RNA.

Advantageous gene transfer polynucleotides for transferring genes for expression into eukaryotic cells comprise an intron, within the heterologous polynucleotide, that is spliceable in the target cell. Advantageous gene transfer polynucleotides for transferring genes for expression into mammalian cells comprise an intron of the immediate early gene 1, 2 or 3 of cytomegalovirus (CMV) from human, primate or rodent cells (e.g., a sequence at least 95% identical to any of SEQ ID NO: 412-, ... chicken and Chinese hamster) actin intron (e.g., a sequence at least 95% identical to any of SEQ ID NO: 445-458); a GAPDH intron from any mammalian or avian species, including human, rat, mouse, chicken and Chinese hamster (e.g., a sequence at least 95% identical to any of SEQ ID NO: 459-461); an intron comprising the adenovirus major late protein enhancer (e.g., a sequence at least 95% identical to any of SEQ ID NO: 462-463); or a hybrid/synthetic intron (e.g., a sequence at least 95% identical to any of SEQ ID NO: 423-431).

Advantageous gene transfer polynucleotides for transferring genes for expression into eukaryotic cells comprise enhancers and promoters operably linked to heterologous coding sequences. Such gene transfer polynucleotides may comprise a combination of an enhancer and a promoter in which an enhancer from one gene is combined with a promoter from a different gene, i.e., the enhancer is heterologous to the promoter. For example, in order to transfer a gene for expression into a mammalian cell, the immediate early CMV enhancer (e.g., a sequence selected from SEQ ID NO: 304-322) from a rodent, human or other primate is advantageously followed by a promoter from the EF1a gene (e.g., a sequence selected from SEQ ID NO: 325-346), or from a heterologous CMV gene (e.g., a sequence selected from SEQ ID NO: 347-357), or from the EEF2 gene (e.g., a sequence selected from SEQ ID NO: 358-368), or from the actin gene (e.g., a sequence selected from SEQ ID NO: 369-378), or from the GAPDH gene (e.g., a sequence selected from SEQ ID NO: 379-395), which is operably linked to the heterologous sequence.

An advantageous gene transfer polynucleotide for transferring a gene for expression into a eukaryotic cell comprises a promoter and an intron operably linked to a heterologous open reading frame. Such gene transfer polynucleotides may comprise a combination of a promoter and an intron in which a promoter from one gene is combined with an intron from a different gene, i.e., the intron is heterologous to the promoter. For example, for transferring a gene for expression into a mammalian cell, the immediate early CMV promoter (e.g., a sequence selected from SEQ ID NO: 347-357) from a rodent or a primate is advantageously followed by an intron from the EF1a gene (e.g., a sequence at least 95% identical to a sequence selected from SEQ ID NO: 432-444), or an intron from the EEF2 gene (e.g., a sequence at least 95% identical to a sequence selected from SEQ ID NO: 464-471), or an intron from the actin gene (e.g., a sequence at least 95% identical to a sequence selected from SEQ ID NO: 445-458), which is operably linked to a heterologous sequence.

Advantageous gene transfer polynucleotides for transferring a gene for expression into a eukaryotic cell comprise a composite transcription initiation regulatory element comprising a promoter operably linked to an enhancer and/or an intron, with the composite transcription initiation regulatory element operably linked to a heterologous sequence. Examples of advantageous composite transcription initiation regulatory elements that can be operably linked to a heterologous sequence in a gene transfer polynucleotide to transfer a gene for expression into a mammalian cell are selected from SEQ ID NOs: 473-565.

Expression of two open reading frames from a single polynucleotide may be achieved by operably linking each open reading frame to a separate promoter, each of which may optionally be operably linked to an enhancer and an intron, as described above. This is particularly useful when expressing two polypeptides that need to interact in a specific molar ratio (e.g., antibody chains or bispecific antibody chains, or a receptor and its ligand). It is generally advantageous to prevent transcriptional interference by placing a genetic insulator between the two open reading frames (e.g., 3' of the polyadenylation sequence operably linked to the first open reading frame, and 5' of the promoter operably linked to the second open reading frame encoding the second polypeptide). Transcriptional interference can also be prevented by effectively terminating transcription of the first gene. In many eukaryotic cells, the use of a strong polyA signal sequence between the two open reading frames will reduce transcriptional interference. Examples of polyA signal sequences that can be used to effectively terminate transcription are set forth in SEQ ID NO: 566-. An advantageous gene transfer polynucleotide comprises a heterologous open reading frame operably linked to a sequence that is at least 95% identical to a sequence selected from SEQ ID NO: 566-595. Advantageous composite regulatory elements for terminating transcription of the first gene and initiating transcription of the second gene include SEQ ID NO: 596-779. Particularly advantageous gene transfer polynucleotides for transferring first and second open reading frames for co-expression into mammalian cells comprise a sequence that is at least 90% identical, at least 95% identical, at least 99% identical, or 100% identical to a sequence selected from SEQ ID NOs: 596-779, which separates the two heterologous open reading frames.
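
Several passages above specify sequences by their percent identity to a reference sequence. The source does not state how identity is to be calculated, so the following Python sketch adopts one common convention as an assumption: given two rows of an existing pairwise alignment, it counts identical columns over all aligned columns. The function names and the short example sequences are hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of alignment columns that carry identical residues.

    Both inputs must be rows of the same pairwise alignment, with '-'
    marking a gap. Columns in which both rows are gaps are ignored.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    columns = [
        (a, b)
        for a, b in zip(aligned_a.upper(), aligned_b.upper())
        if not (a == "-" and b == "-")
    ]
    if not columns:
        raise ValueError("alignment contains no residues")
    matches = sum(1 for a, b in columns if a == b)
    return 100 * matches / len(columns)


def meets_identity(aligned_a: str, aligned_b: str, threshold: float) -> bool:
    """True if the two aligned sequences reach the given percent identity."""
    return percent_identity(aligned_a, aligned_b) >= threshold


# Hypothetical 10-column alignment with a single mismatch in the last column.
print(percent_identity("ACGTACGTAC", "ACGTACGTAA"))
```

Other conventions (for example, dividing by the length of the shorter sequence or excluding gapped columns) give different values for the same pair of sequences, so the convention should be fixed before comparing against a threshold such as 90% or 95%.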

5.2.4 Selection of target cells comprising gene transfer polynucleotides

If the gene transfer polynucleotide comprises an open reading frame encoding a selectable marker, target cells whose genome comprises a stably integrated gene transfer polynucleotide can be identified by exposing the target cells to conditions that favor cells expressing the selectable marker ("selection conditions"). The gene transfer polynucleotide advantageously comprises an open reading frame encoding a selectable marker, such as an enzyme conferring resistance to an antibiotic such as neomycin (resistance conferred by an aminoglycoside 3'-phosphotransferase, e.g., a sequence selected from SEQ ID NO: 114-). Other selectable markers include fluorescent markers (e.g., open reading frames encoding GFP, RFP, etc.), which allow selection by, for example, flow cytometry. Other selectable markers include an open reading frame encoding a transmembrane protein that is capable of binding to a second molecule (protein or small molecule) that can be fluorescently labeled, so that cells displaying the transmembrane protein can be selected, for example, using flow cytometry.

The gene transfer polynucleotide may comprise a selectable marker open reading frame encoding a glutamine synthetase (GS, e.g., a sequence selected from SEQ ID NO: 126-130), which allows selection by glutamine metabolism. Glutamine synthetase is an enzyme responsible for the biosynthesis of glutamine from glutamate and ammonia, and is a key component of the sole pathway for glutamine formation in mammalian cells. In the absence of glutamine in the growth medium, the GS enzyme is critical for the survival of mammalian cells in culture. Certain cell lines (e.g., mouse myeloma cells) do not express enough GS enzyme to survive without the addition of glutamine. In these cells, the transfected GS open reading frame can be used as a selectable marker by allowing growth in glutamine-free medium. Other cell lines, such as Chinese Hamster Ovary (CHO) cells, express sufficient GS enzyme to survive without exogenous glutamine addition. These cell lines can be manipulated by gene editing techniques, including CRISPR/Cas9, to reduce or eliminate the activity of the GS enzyme. In all of these cases, a GS inhibitor, such as Methionine Sulfoximine (MSX), can be used to inhibit the endogenous GS activity of the cell. An alternative protocol involves introducing a gene transfer polynucleotide comprising sequences encoding a first polypeptide and a glutamine synthetase selectable marker, and then treating the cell with an inhibitor of glutamine synthetase (e.g., methionine sulfoximine). The higher the level of methionine sulfoximine used, the higher the level of glutamine synthetase expression necessary for the cell to synthesize sufficient glutamine for survival. Some of these cells will also show increased expression of the first polypeptide.

Preferably, the GS open reading frame is operably linked to a weak promoter or other sequence element that attenuates expression as described herein, such that high levels of expression can only occur if there are many copies of the gene transfer polynucleotide or if they are integrated into the genome in a location where high levels of expression occur. In this case, the use of methionine sulfoximine inhibitors may not be required: if expression of glutamine synthetase is attenuated, simply synthesizing sufficient glutamine for cell survival may provide a sufficiently stringent selection.

The gene transfer polynucleotide may comprise a selectable marker open reading frame encoding a dihydrofolate reductase (DHFR, e.g., a sequence selected from SEQ ID NO: 112-113), which is required for catalyzing the reduction of 5,6-dihydrofolate (DHF) to 5,6,7,8-tetrahydrofolate (THF). Some cell lines fail to express sufficient DHFR to survive without the addition of hypoxanthine and thymidine (HT). In these cells, a transfected DHFR open reading frame can function as a selectable marker by allowing growth in medium free of hypoxanthine and thymidine. DHFR-deficient cell lines, for example derived from Chinese Hamster Ovary (CHO) cells, can be generated by gene editing techniques, including CRISPR/Cas9, that reduce or eliminate the activity of the endogenous DHFR enzyme. DHFR confers resistance to methotrexate (MTX), although DHFR can itself be inhibited by higher levels of methotrexate. The selection protocol involves introducing a construct comprising sequences encoding the first polypeptide and a DHFR selectable marker into cells with or without a functional endogenous DHFR gene, and then treating the cells with a DHFR inhibitor (e.g., methotrexate). The higher the level of methotrexate used, the higher the level of DHFR expression the cells need in order to survive. Some of these cells will also show increased expression of the first polypeptide. Preferably, the DHFR open reading frame is operably linked to a weak promoter or other sequence element that attenuates expression as described above, such that high levels of expression can only occur if there are many copies of the gene transfer polynucleotide or if they are integrated into the genome in a location where high levels of expression occur.

Genes encoded on gene transfer polynucleotides that are integrated into a highly transcriptionally active genomic region, or integrated into the genome in multiple copies, or present extrachromosomally in multiple copies, can have high levels of expression. It is often advantageous to operably link the open reading frame encoding the selectable marker to expression control elements that result in low expression levels of the selectable polypeptide from the gene transfer polynucleotide, and/or to use conditions that provide more stringent selection. Under these conditions, for the expressing cell to produce enough of the selectable polypeptide encoded on the gene transfer polynucleotide to survive the selection conditions, either the gene transfer polynucleotide must be present at a location in the genome of the cell that is favorable for high levels of expression, or a sufficiently high number of copies of the gene transfer polynucleotide must be present, such that these factors compensate for the low expression levels due to the expression control elements.
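
The reasoning above can be made concrete with a deliberately simple toy model. The multiplicative form, the parameter names, and all numeric values below are assumptions introduced purely for illustration; they are not taken from the source and are not measured quantities.

```python
def survives_selection(copies: int, position_factor: float,
                       promoter_strength: float, threshold: float) -> bool:
    """Toy model: marker output = copies x position factor x promoter strength.

    A cell survives if its marker output reaches the selection threshold.
    All quantities are in arbitrary, hypothetical units.
    """
    marker_level = copies * position_factor * promoter_strength
    return marker_level >= threshold


# With a strong promoter, a single copy at an average position survives,
# so selection does not distinguish high-expressing cells.
print(survives_selection(copies=1, position_factor=1.0,
                         promoter_strength=1.0, threshold=1.0))

# With an attenuated promoter, a single copy at an average position fails...
print(survives_selection(copies=1, position_factor=1.0,
                         promoter_strength=0.25, threshold=1.0))

# ...but four copies, or one copy at a fourfold more favorable position, survive.
print(survives_selection(copies=4, position_factor=1.0,
                         promoter_strength=0.25, threshold=1.0))
print(survives_selection(copies=1, position_factor=4.0,
                         promoter_strength=0.25, threshold=1.0))
```

In this sketch, raising `threshold` plays the role of more stringent selection conditions, and lowering `promoter_strength` plays the role of attenuating expression of the selectable marker.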

When a selectable marker in a transposon is operably linked to a regulatory element that only weakly expresses the selectable marker, it is often necessary for the transposon to be inserted into the target genome by a transposase in order to obtain cells with genomically integrated transposons, see, e.g., section 6.1.3. Operably linking the selectable marker to an element that results in weak expression selects for cells that have incorporated multiple copies of the transposon, or cells in which the transposon has integrated at a genomic location favorable for high expression. Using a gene transfer system comprising a transposon and a corresponding transposase increases the likelihood of generating cells with multiple transposon copies, or cells in which the transposon has integrated at a genomic location favorable for high expression. Thus, a gene transfer system comprising a transposon and a corresponding transposase is particularly advantageous when the transposon comprises a selectable marker operably linked to a weak promoter.

Nucleic acids to be expressed as RNA or protein and selectable markers may be included on the same gene transfer polynucleotide but operably linked to different promoters. In this case, low expression levels of the selectable marker can be achieved by using weakly active constitutive promoters, such as the phosphoglycerate kinase (PGK) promoter (e.g., a promoter selected from SEQ ID NO: 396-402), the herpes simplex virus thymidine kinase (HSV-TK) promoter (e.g., SEQ ID NO: 405), the MC1 promoter (e.g., SEQ ID NO: 406), or the ubiquitin promoter (e.g., SEQ ID NO: 403). Other weakly active promoters may be deliberately constructed, for example promoters attenuated by truncation, such as a truncated SV40 promoter (e.g., a sequence selected from SEQ ID NO: 407-408) or a truncated HSV-TK promoter (e.g., SEQ ID NO: 404), or promoters attenuated by inserting a 5' UTR that is unfavorable for expression (e.g., a sequence selected from SEQ ID NO: 410-411) between the promoter and the open reading frame encoding the selectable polypeptide. Particularly advantageous gene transfer polynucleotides comprise a promoter sequence selected from SEQ ID NOs: 396-409 operably linked to an open reading frame encoding a selectable marker.

The expression level of the selectable marker may also be advantageously reduced by other mechanisms, such as insertion of the SV40 small t antigen intron after the open reading frame of the selectable marker. The SV40 small t intron accepts aberrant 5' splice sites, which may result in a deletion within the preceding open reading frame in a portion of the spliced mRNA, thereby reducing expression of the selectable marker. Particularly advantageous gene transfer polynucleotides comprise the intron SEQ ID NO: 472 operably linked to an open reading frame encoding a selectable marker. For this attenuation mechanism to be effective, the open reading frame encoding the selectable marker preferably contains a strong intron donor within its coding region. DNA sequences SEQ ID NO: 131-134 are exemplary nucleic acid sequences encoding the glutamine synthetases of SEQ ID NOs: 126-129. Each of these nucleic acid sequences comprises an intron donor and can be operably linked to the SV40 small t antigen intron by placing the intron in the 3' UTR of the glutamine synthetase open reading frame. SEQ ID NO: 123 is a nucleic acid sequence encoding the puromycin acetyltransferase of SEQ ID NO: 122; it comprises an intron donor and can be operably linked to the SV40 small t antigen intron by placing the intron in the 3' UTR of the puromycin acetyltransferase open reading frame. Advantageous gene transfer polynucleotides comprise a sequence that is at least 90% identical, at least 95% identical, at least 99% identical, or 100% identical to a sequence selected from SEQ ID NOs: 123 or 131-134, operably linked to SEQ ID NO: 472.

The expression level of the selectable marker may also be advantageously reduced by other mechanisms, such as the insertion of an inhibitory 5' UTR, such as SEQ ID NO: 410-411, into the transcript. Particularly advantageous gene transfer polynucleotides comprise a promoter operably linked to an open reading frame encoding a selectable marker, wherein a sequence that is at least 90% identical, at least 95% identical, at least 99% identical, or 100% identical to SEQ ID NO: 410-411 is located between the promoter and the selectable marker.

Exemplary nucleic acid sequences comprising a glutamine synthetase coding sequence operably linked to regulatory sequences expressible in a mammalian cell include SEQ ID NOs: 152-221 and 283-285. Gene transfer polynucleotides comprising a sequence selected from SEQ ID NOs: 152-221 or 283-285 express glutamine synthetase after integration into the genome of the target cell, thereby facilitating growth of the cell in the absence of added glutamine or in the presence of MSX. The regulatory elements in these sequences have been balanced to produce low levels of glutamine synthetase expression, providing a selective advantage for target cells whose genomes contain multiple copies of the gene transfer polynucleotide, or for target cells that contain copies of the gene transfer polynucleotide in a region of the genome that is favorable for expression of the encoded gene. Advantageous gene transfer polynucleotides comprise a sequence selected from SEQ ID NOs: 152-221 or 283-285, and they may further comprise a left transposon end and a right transposon end.

Exemplary nucleic acid sequences comprising a blasticidin-S-transferase coding sequence operably linked to regulatory sequences expressible in mammalian cells include SEQ ID NOs: 222-228. Gene transfer polynucleotides comprising a sequence selected from SEQ ID NOs: 222-228 express blasticidin-S-transferase after integration into the genome of the target cell, thereby facilitating growth of the cell in the presence of added blasticidin. The regulatory elements in these sequences have been balanced to produce low levels of blasticidin-S-transferase expression, providing a selective advantage for target cells whose genomes contain multiple copies of the gene transfer polynucleotide, or for target cells that contain copies of the gene transfer polynucleotide in a region of the genome that is favorable for expression of the encoded gene. Advantageous gene transfer polynucleotides comprise a sequence selected from SEQ ID NOs: 222-228, and they may further comprise a left transposon end and a right transposon end.

Exemplary nucleic acid sequences comprising a hygromycin B phosphotransferase coding sequence operably linked to regulatory sequences expressible in mammalian cells include SEQ ID NOs: 229-230. A gene transfer polynucleotide comprising a sequence selected from SEQ ID NOs: 229-230 expresses hygromycin B phosphotransferase after integration into the genome of the target cell, thereby facilitating growth of the cell in the presence of added hygromycin. The regulatory elements in these sequences have been balanced to produce low levels of hygromycin B phosphotransferase expression, providing a selective advantage for target cells whose genomes contain multiple copies of the gene transfer polynucleotide, or that contain copies of the gene transfer polynucleotide in a region of the genome that is favorable for expression of the encoded gene. Advantageous gene transfer polynucleotides comprise a nucleotide sequence selected from SEQ ID NOs: 229-230, and may further comprise a left transposon end and a right transposon end.

Exemplary nucleic acid sequences comprising an aminoglycoside 3'-phosphotransferase coding sequence operably linked to regulatory sequences expressible in mammalian cells include SEQ ID NOs: 221-223 and 259-260. A gene transfer polynucleotide comprising a sequence selected from SEQ ID NOs: 221-223 and 259-260 expresses aminoglycoside 3'-phosphotransferase after integration into the genome of the target cell, thereby facilitating growth of the cell in the presence of added neomycin. The regulatory elements in these sequences have been balanced to produce low levels of aminoglycoside 3'-phosphotransferase expression, providing a selective advantage for target cells whose genomes contain multiple copies of the gene transfer polynucleotide, or that contain copies of the gene transfer polynucleotide in a region of the genome that is favorable for expression of the encoded gene. Advantageous gene transfer polynucleotides comprise a nucleotide sequence selected from SEQ ID NOs: 221-223 and 259-260, and may further comprise a left transposon end and a right transposon end.

Exemplary nucleic acid sequences comprising a puromycin acetyltransferase coding sequence operably linked to regulatory sequences expressible in mammalian cells include SEQ ID NOs: 234-253 and 261-285. A gene transfer polynucleotide comprising a sequence selected from SEQ ID NOs: 234-253 and 261-285 expresses puromycin acetyltransferase after integration into the genome of the target cell, thereby facilitating growth of the cell in the presence of added puromycin. The regulatory elements in these sequences have been balanced to produce low levels of puromycin acetyltransferase expression, providing a selective advantage for target cells whose genomes contain multiple copies of the gene transfer polynucleotide, or that contain copies of the gene transfer polynucleotide in a region of the genome that is favorable for expression of the encoded gene. Advantageous gene transfer polynucleotides comprise a nucleotide sequence selected from SEQ ID NOs: 234-253 and 261-285, and may further comprise a left transposon end and a right transposon end.

Exemplary nucleic acid sequences comprising a ble gene coding sequence operably linked to regulatory sequences expressible in mammalian cells include SEQ ID NOs: 254-258. A gene transfer polynucleotide comprising a sequence selected from SEQ ID NOs: 254-258 expresses the ble gene after integration into the genome of the target cell, thereby facilitating growth of the cell in the presence of added zeocin. The regulatory elements in these sequences have been balanced to produce low levels of expression of the ble gene product, providing a selective advantage for target cells whose genomes contain multiple copies of the gene transfer polynucleotide, or that contain copies of the gene transfer polynucleotide in a region of the genome that is favorable for expression of the encoded gene. Advantageous gene transfer polynucleotides comprise a nucleotide sequence selected from SEQ ID NOs: 254-258, and may further comprise a left transposon end and a right transposon end.

Exemplary nucleic acid sequences comprising a dihydrofolate reductase coding sequence operably linked to regulatory sequences expressible in mammalian cells include SEQ ID NOs: 135-151 and 259-282. A gene transfer polynucleotide comprising a sequence selected from SEQ ID NOs: 135-151 and 259-282 expresses dihydrofolate reductase after integration into the genome of the target cell, thereby facilitating growth of the cell in the absence of added hypoxanthine and thymidine or in the presence of MTX. The regulatory elements in these sequences have been balanced to produce low levels of dihydrofolate reductase expression, providing a selective advantage for target cells whose genomes contain multiple copies of the gene transfer polynucleotide, or that contain copies of the gene transfer polynucleotide in a region of the genome that is favorable for expression of the encoded gene. Advantageous gene transfer polynucleotides comprise a nucleotide sequence selected from SEQ ID NOs: 135-151 and 259-282, and may further comprise a left transposon end and a right transposon end.

The use of transposons and transposases in combination with weakly expressed selectable markers has several advantages over non-transposon constructs. The first is that linkage between expression of the first polypeptide and expression of the selectable marker is better for transposons, because the transposase integrates the entire sequence located between the two transposon ends into the genome. In contrast, when heterologous DNA is introduced into the nucleus of a eukaryotic cell (e.g., a mammalian cell) without a transposase, it is gradually broken into random fragments that are either integrated into the genome of the cell or degraded. Thus, if a gene transfer polynucleotide comprising a sequence encoding the first polypeptide and a selectable marker is introduced into a population of cells, some cells will integrate the sequence encoding the selectable marker but not the sequence encoding the first polypeptide, and vice versa. Selection of cells expressing high levels of the selectable marker is therefore only weakly correlated with high-level expression of the first polypeptide. In contrast, because the transposase integrates all of the sequence between the transposon ends, cells expressing high levels of the selectable marker are likely to also express high levels of the first polypeptide.

A second advantage of transposons and transposases is their greater efficiency in integrating DNA sequences into the genome. Thus, a higher proportion of cells in the cell population may integrate one or more copies of the gene transfer polynucleotide into their genome, and thus the likelihood of good stable expression of the selectable marker and the first polypeptide is correspondingly higher.

A third advantage of piggyBac-like transposons and transposases is: piggyBac-like transposases tend to insert their corresponding transposons into transcriptionally active chromatin. Thus, each cell may integrate the gene transfer polynucleotide into a region of the genome where the gene is well expressed, and thus the likelihood of good stable expression of the selectable marker and the first polypeptide is correspondingly higher.

5.2.5 A novel piggyBac-like transposase from medaka (Oryzias latipes)

Natural DNA transposons move by a "cut-and-paste" mechanism in which the transposon is excised from a first DNA molecule and then inserted into a second DNA molecule. DNA transposons are characterized by inverted terminal repeats (ITRs) and are mobilized by a transposase encoded by the element. The piggyBac transposon/transposase system is particularly useful because of the high precision with which the transposon is integrated and excised (see, e.g., Fraser, M.J. (2001) "The TTAA-Specific Family of Transposable Elements: Identification, Functional Characterization, and Utility for Transformation of Insects", in Insect Transgenesis: Methods and Applications, A.M. Handler and A.A. James (eds.), Boca Raton, Fla., CRC Press: 249-268; and US 20070204356 A1, "PiggyBac constructs in vertebrates", and references therein).

Many sequences with similarity to the piggyBac transposase from the looper moth (Trichoplusia ni) have been found in the genomes of phylogenetically distinct species, from fungi to mammals, but very few have been shown to have transposase activity (see, e.g., Wu M, et al. (2011) Genetica 139:149-54, "Cloning and characterization of piggyBac-like elements in lepidopteran insects", and references therein).

Two properties of transposases are particularly important for genome modification: their ability to integrate a polynucleotide into a target genome, and their ability to precisely excise a polynucleotide from a target genome. Both properties can be measured using a suitable system.

A system for measuring a first step of transposition (excision of a transposon from a first polynucleotide) comprises the following components: (i) a first polynucleotide encoding a first selectable marker operably linked to a sequence that causes it to be expressed in a selection host, and (ii) a transposon comprising a transposon end recognized by a transposase. The transposon is present in the first selectable marker and disrupts the coding sequence of the first selectable marker to render the first selectable marker inactive. The transposon is placed in the first selectable marker such that precise excision of the first transposon results in reconstitution of the first selectable marker. If an active transposase capable of excising the first transposon is introduced into a host cell containing the first polynucleotide, the host cell will express an active first selectable marker. The activity of the transposase to excise the transposon can be measured as the frequency at which the host cell is able to grow under conditions that require the first selectable marker to be active.

If the transposon comprises a second selectable marker operably linked to a sequence that makes the second selectable marker expressible in the selection host, transposing the second selectable marker into the genome of the host cell will produce a genome comprising the active first and second selectable markers. The activity of the transposase to transpose the transposon to the second genomic position can be measured as the frequency with which the host cell can grow under conditions that require the first and second selectable markers to be active. In contrast, if the first selectable marker is present, but the second selectable marker is not present, it indicates that the transposon has excised from the first polynucleotide, but is not subsequently transposed into the second polynucleotide. The selectable marker can be, for example, an open reading frame encoding an antibiotic resistance protein, an auxotrophic marker, or any other selectable marker.
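The two frequencies described above can be sketched as simple ratios of colony counts. This is only an illustration: the function name, the plating scheme and the example counts are hypothetical and are not taken from the experiments in section 6.1.

```python
def assay_frequencies(cells_plated, colonies_marker1, colonies_marker1_and_2):
    """Return (excision_frequency, transposition_frequency).

    colonies_marker1: colonies that grow when only the first selectable
        marker is required (transposon excised, first marker reconstituted).
    colonies_marker1_and_2: colonies that grow when both markers are required
        (transposon excised and re-integrated into the host genome).
    Both counts are assumed to come from plating the same number of cells.
    """
    if cells_plated <= 0:
        raise ValueError("cells_plated must be positive")
    excision = colonies_marker1 / cells_plated
    transposition = colonies_marker1_and_2 / cells_plated
    return excision, transposition


# Hypothetical counts, for illustration only.
excision_freq, transposition_freq = assay_frequencies(1_000_000, 500, 200)
```

Cells that carry an active first marker but not the second correspond to excision without subsequent re-integration.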

We used this system to test the activity of putative transposase/transposon combinations as described in section 6.1. We used computational methods to search publicly available sequenced genomes for open reading frames with homology to known active piggyBac transposases. We selected transposase sequences that appeared to have the DDDE motif characteristic of active piggyBac-like transposases, and searched the DNA sequences flanking these putative transposases for inverted repeats adjacent to a 5'-TTAA-3' target sequence. We identified putative transposons with intact transposases from the following organisms:
Spodoptera litura (Genbank accession number MTZO01002002.1, protein accession number XP_022823959), whose open reading frame encodes a putative transposase, SEQ ID NO: 21, flanked by a putative left end, SEQ ID NO: 68, and a putative right end, SEQ ID NO: 69;
Pieris rapae (NCBI genomic reference sequence NW_019093607.1, Genbank protein accession number XP_022123753.1), whose open reading frame encodes a putative transposase, SEQ ID NO: 22, flanked by a putative left end, SEQ ID NO: 70, and a putative right end, SEQ ID NO: 71;
Myzus persicae (NCBI genomic reference sequence NW_019100532.1, protein accession number XP_022166603), whose open reading frame encodes a putative transposase, SEQ ID NO: 23, flanked by a putative left end, SEQ ID NO: 72, and a putative right end, SEQ ID NO: 73;
dung beetle (Onthophagus taurus) (NCBI genomic reference sequence NW_019280463, protein accession number XP_022900752), whose open reading frame encodes a putative transposase, SEQ ID NO: 24, flanked by a putative left end, SEQ ID NO: 74, and a putative right end, SEQ ID NO: 75;
ant (Temnothorax curvispinosus) (NCBI genomic reference sequence NW_020220783.1, protein accession number XP_024881886), whose open reading frame encodes a putative transposase, SEQ ID NO: 25, flanked by a putative left end, SEQ ID NO: 76, and a putative right end, SEQ ID NO: 77;
Agrilus planipennis (NCBI genomic reference sequence NW_020442437.1, protein accession number XP_025836109), whose open reading frame encodes a putative transposase, SEQ ID NO: 26, flanked by a putative left end, SEQ ID NO: 78, and a putative right end, SEQ ID NO: 79;
spider (Parasteatoda tepidariorum) (NCBI genomic reference sequence NW_018371884.1, protein accession number XP_015905033), whose open reading frame encodes a putative transposase, SEQ ID NO: 27, flanked by a putative left end, SEQ ID NO: 80, and a putative right end, SEQ ID NO: 81;
pink bollworm (Pectinophora gossypiella) (Genbank accession number GU270322.1, protein ID ADB45159.1, also described in Wang et al., 2010, Insect Mol. Biol. 19, 177-184, "piggyBac-like elements in the pink bollworm, Pectinophora gossypiella"), whose open reading frame encodes a putative transposase, SEQ ID NO: 28, flanked by a putative left end, SEQ ID NO: 82, and a putative right end, SEQ ID NO: 83;
Ctenoplusia agnata (NCBI accession number GU477713.1, protein accession number ADV17598.1, also described in Wu M, et al. (2011) Genetica 139:149-54, "Cloning and characterization of piggyBac-like elements in lepidopteran insects"), whose open reading frame encodes a putative transposase, SEQ ID NO: 29, flanked by a putative left end, SEQ ID NO: 84, and a putative right end, SEQ ID NO: 85;
flatworm (Macrostomum lignano) (NCBI genomic reference sequence NIVC01003029.1, protein accession number PAA53757), whose open reading frame encodes a putative transposase, SEQ ID NO: 30, flanked by a putative left end, SEQ ID NO: 86, and a putative right end, SEQ ID NO: 87;
wasp (Orussus abietinus) (NCBI accession number XM_012421754, protein accession number XP_012277177), whose open reading frame encodes a putative transposase, SEQ ID NO: 31, flanked by a putative left end, SEQ ID NO: 88, and a putative right end, SEQ ID NO: 89;
orchid bee (Eufriesea mexicana) (NCBI genomic reference sequence NIVC01003029.1, protein accession number XP_017759329), whose open reading frame encodes a putative transposase, SEQ ID NO: 32, flanked by a putative left end, SEQ ID NO: 90, and a putative right end, SEQ ID NO: 91;
Spodoptera litura (NCBI genomic reference sequence NC_036206.1, protein accession number XP_022824855), whose open reading frame encodes a putative transposase, SEQ ID NO: 33, flanked by a putative left end, SEQ ID NO: 92, and a putative right end, SEQ ID NO: 93;
Vanessa tameamea (NCBI genomic reference sequence NW_020663261.1, protein accession number XP_026490968), whose open reading frame encodes a putative transposase, SEQ ID NO: 34, flanked by a putative left end, SEQ ID NO: 94, and a putative right end, SEQ ID NO: 95;
German cockroach (Blattella germanica) (NCBI genomic reference sequence PYGN01002011.1, protein accession number PSN31819), whose open reading frame encodes a putative transposase, SEQ ID NO: 35, flanked by a putative left end, SEQ ID NO: 96, and a putative right end, SEQ ID NO: 97;
dung beetle (Onthophagus taurus) (NCBI genomic reference sequence NW_019281532.1, protein accession number XP_022910826), whose open reading frame encodes a putative transposase, SEQ ID NO: 36, flanked by a putative left end, SEQ ID NO: 98, and a putative right end, SEQ ID NO: 99;
dung beetle (Onthophagus taurus) (NCBI genomic reference sequence NW_019281689.1, protein accession number XP_022911139), whose open reading frame encodes a putative transposase, SEQ ID NO: 37, flanked by a putative left end, SEQ ID NO: 100, and a putative right end, SEQ ID NO: 101;
dung beetle (Onthophagus taurus) (NCBI genomic reference sequence NW_019286114.1, protein accession number XP_022913435), whose open reading frame encodes a putative transposase, SEQ ID NO: 38, flanked by a putative left end, SEQ ID NO: 102, and a putative right end, SEQ ID NO: 103;
alfalfa leafcutting bee (Megachile rotundata) (NCBI genomic reference sequence NW_003797295, protein accession number XP_012145925), whose open reading frame encodes a putative transposase, SEQ ID NO: 39, flanked by a putative left end, SEQ ID NO: 104, and a putative right end, SEQ ID NO: 105;
Xiphophorus maculatus (NCBI genomic reference sequence NC_036460.1, protein accession number XP_023207869), whose open reading frame encodes a putative transposase, SEQ ID NO: 40, flanked by a putative left end, SEQ ID NO: 106, and a putative right end, SEQ ID NO: 107; and
medaka (Oryzias latipes) (NCBI accession number NC_019868.2, protein accession number XP_023815209), whose open reading frame encodes a putative transposase, SEQ ID NO: 782, flanked by a putative left end, SEQ ID NO: 1, and a putative right end, SEQ ID NO: 2.
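The search for inverted repeats adjacent to a 5'-TTAA-3' target sequence could be sketched roughly as follows. This is not the computational method actually used; the function names, the fixed repeat length, the mismatch tolerance and the toy flanking sequences are all assumptions made for illustration.

```python
def reverse_complement(seq):
    """Reverse complement of an uppercase/lowercase A/C/G/T string."""
    comp = {"A": "T", "C": "G", "G": "C", "T": "A"}
    return "".join(comp[base] for base in reversed(seq.upper()))


def find_ttaa_flanked_itrs(left_flank, right_flank, itr_len=12, max_mismatch=2):
    """Find candidate ITR pairs adjacent to TTAA sites.

    A hit is a TTAA in left_flank followed by itr_len bases whose reverse
    complement (allowing max_mismatch mismatches) immediately precedes a
    TTAA in right_flank. Returns (left_ttaa_index, right_ttaa_index,
    left_itr, right_itr) tuples.
    """
    left, right = left_flank.upper(), right_flank.upper()
    hits = []
    i = left.find("TTAA")
    while i != -1:
        left_itr = left[i + 4:i + 4 + itr_len]
        if len(left_itr) == itr_len:
            wanted = reverse_complement(left_itr)
            j = right.find("TTAA")
            while j != -1:
                if j >= itr_len:
                    right_itr = right[j - itr_len:j]
                    mismatches = sum(a != b for a, b in zip(wanted, right_itr))
                    if mismatches <= max_mismatch:
                        hits.append((i, j, left_itr, right_itr))
                j = right.find("TTAA", j + 1)
        i = left.find("TTAA", i + 1)
    return hits


# Toy flanks (hypothetical, not taken from any SEQ ID NO).
left_flank = "GGCC" + "TTAA" + "CCCTAGAAGCCC" + "ACGT"
right_flank = "ACGT" + "GGGCTTCTAGGG" + "TTAA" + "GGCC"
hits = find_ttaa_flanked_itrs(left_flank, right_flank)
```

A real search would also have to consider variable repeat lengths and both strand orientations of the genomic sequence.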

5.2.5.1 Oryzias transposase and corresponding transposon

As described in section 6.1.2, one transposase and corresponding transposon identified as active by transposition activity in yeast are the Oryzias transposase and the Oryzias transposon. An Oryzias transposase comprises a polypeptide sequence that is at least 80% identical, or at least 90% identical, or at least 93% identical, or at least 95% identical, or at least 96% identical, or at least 97% identical, or at least 98% identical, or at least 99% identical, or 100% identical to the sequence of SEQ ID NO: 782, and that is capable of transposing the transposon from the transposase reporter construct SEQ ID NO: 41, as described in section 6.1.2. Exemplary non-native Oryzias transposases include SEQ ID NOs: 805-908.

The Oryzias transposase can be provided as a protein as part of a gene transfer system, or as a polynucleotide encoding an Oryzias transposase, wherein the polynucleotide is expressible in a target cell. When provided in the form of a polynucleotide, the Oryzias transposase may be provided in the form of DNA or mRNA. If provided in DNA form, the open reading frame encoding the Oryzias transposase is preferably operably linked to a heterologous regulatory element comprising a promoter active in the target cell (e.g., a promoter active in a eukaryotic or vertebrate or mammalian cell) such that the transposase is expressible in the target cell. If provided in the form of an mRNA, the mRNA may be prepared in vitro from a DNA molecule in which the open reading frame encoding the Oryzias transposase is preferably operably linked to a heterologous promoter active in the in vitro transcription system used to prepare the mRNA, e.g., the T7 promoter.

The Oryzias transposon comprises a heterologous polynucleotide flanked by a left transposon end and a right transposon end, the left transposon end comprising a left ITR having the sequence given by SEQ ID NO: 7, and the right transposon end comprising a right ITR having the sequence given by SEQ ID NO: 8, wherein the distal end of each ITR is immediately adjacent to a target sequence. Here and elsewhere, when an inverted repeat is defined by a sequence that includes a nucleotide defined by an ambiguity code, the identity of that nucleotide can be selected independently in the two repeats. A preferred target sequence is 5'-TTAA-3', although other target sequences may be used. Preferably, the target sequence on one side of the transposon is a direct repeat of the target sequence on the other side of the transposon. The left transposon end may also comprise additional sequence proximal to the ITR, e.g., a sequence that is at least 90% identical or 100% identical to a sequence selected from SEQ ID NOs: 5, 11 or 12. The right transposon end may also comprise additional sequence proximal to the ITR, e.g., a sequence that is at least 90% identical or 100% identical to a sequence selected from SEQ ID NOs: 6, 13, 14 or 15. The structure of a representative Oryzias transposon is shown in figure 1. The Oryzias transposon can be transposed by a transposase having the polypeptide sequence given by SEQ ID NO: 782, for example a transposase encoded by a polynucleotide having the sequence given by SEQ ID NO: 780.

Transposon ends including ITRs and target sequences can be added to the ends of a heterologous polynucleotide sequence to produce a synthetic Oryzias transposon that can be efficiently transposed into a target eukaryotic genome by an Oryzias transposase. For example, SEQ ID NOs: 1, 16 and 17 each comprise a left 5'-TTAA-3' target sequence followed by a left transposon ITR, followed by additional end sequence, and can be added to one side of the heterologous polynucleotide with the target sequence distal to the heterologous polynucleotide to produce a synthetic Oryzias transposon. SEQ ID NOs: 2, 18, 19 and 20 each comprise additional end sequence, followed by a right transposon ITR, followed by a right 5'-TTAA-3' target sequence, and can be added to the other side of the heterologous polynucleotide with the target sequence distal to the heterologous polynucleotide to produce a synthetic Oryzias transposon. The transposon end sequences above contain 5'-TTAA-3' as the target sequence, but the target sequence may be removed from both ends of the synthetic Oryzias transposon and replaced by an alternative target sequence.

Oryzias transposases recognize synthetic Oryzias transposons. They excise the transposon from a first DNA molecule by cutting the DNA at the target sequence at the left transposon end and at the target sequence at the right transposon end, and then re-joining the cut ends of the first DNA molecule to leave a single copy of the target sequence. The excised transposon sequence, including any heterologous DNA located between the transposon ends, is integrated by the transposase into a target sequence in a second DNA molecule, e.g., the genome of the target cell. Cells whose genome comprises a synthetic Oryzias transposon are an embodiment of the invention.
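The excision and integration steps just described can be modeled at the level of plain DNA strings. The sketch below is a toy model under stated assumptions: the function names and the example sequences are hypothetical, the transposon body stands in for ITRs plus heterologous DNA, and strand orientation and enzymology are ignored.

```python
TARGET = "TTAA"


def excise(donor, transposon_body):
    """Cut TTAA + body + TTAA out of the donor and re-join it with one TTAA."""
    unit = TARGET + transposon_body + TARGET
    start = donor.find(unit)
    if start == -1:
        raise ValueError("transposon flanked by target sequences not found")
    repaired_donor = donor[:start] + TARGET + donor[start + len(unit):]
    return repaired_donor, transposon_body


def integrate(genome, transposon_body, site_index=0):
    """Insert the body at the chosen TTAA site of the genome.

    After integration the target sequence is present on both sides of the
    transposon, as a direct repeat.
    """
    sites = []
    pos = genome.find(TARGET)
    while pos != -1:
        sites.append(pos)
        pos = genome.find(TARGET, pos + 1)
    if not sites:
        raise ValueError("no target sequence in genome")
    pos = sites[site_index]
    return (genome[:pos] + TARGET + transposon_body + TARGET
            + genome[pos + len(TARGET):])


# Toy sequences (hypothetical).
body = "CCCGGGACGT"
donor = "GGG" + TARGET + body + TARGET + "CCC"
repaired, cargo = excise(donor, body)
modified_genome = integrate("ACGTTAAGCA", cargo)
```

In this model the donor is left with a single TTAA, and the recipient gains the transposon flanked by a TTAA on each side.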

5.2.5.2 Oryzias transposases are active in mammalian cells

The piggyBac transposase from the looper moth (Trichoplusia ni) has been shown to be active in a wide variety of eukaryotic cells. In section 6.1.2, we show that the Oryzias transposase can transpose its corresponding transposon into the genome of the yeast Saccharomyces cerevisiae. In section 6.1.3, we show that the Oryzias transposase can transpose its corresponding transposon into the genome of mammalian CHO cells. These results provide evidence that, like other known active piggyBac-like transposases, the Oryzias transposase can transpose its corresponding transposon into the genomes of most eukaryotic cells. Although the Oryzias transposase is active in a variety of eukaryotic cells, the natural open reading frame encoding it (given by SEQ ID NO: 781) is unlikely to express well in a similarly broad range of cells, because optimal codon usage differs significantly between cell types. It is therefore advantageous to encode the transposase with a sequence other than the native one, in other words, to choose codons that match the codon preferences of the cell type in which expression is to be performed. Similarly, promoters and other regulatory sequences are selected to be active in the cell type in which expression is to be performed. Advantageous polynucleotides for expressing the Oryzias transposase differ from SEQ ID NO: 781 by synonymous codon changes at corresponding positions, optionally wherein the codons at those positions in the polynucleotide are selected for mammalian cell expression. An exemplary polynucleotide sequence encoding the Oryzias transposase with the polypeptide sequence given by SEQ ID NO: 782 is SEQ ID NO: 780, in which codons at positions corresponding to those of SEQ ID NO: 781 are selected for mammalian cell expression. The polynucleotide may be DNA or mRNA.
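Re-encoding a protein with codons chosen for a particular host can be sketched as a table lookup. The codon table below is an illustrative assumption (one commonly used codon per amino acid for mammalian expression), not the codon choice used in SEQ ID NO: 780, and the function names and example peptide are hypothetical.

```python
# One codon per amino acid; an illustrative, assumed preference table.
PREFERRED_CODON = {
    "A": "GCC", "R": "AGA", "N": "AAC", "D": "GAC", "C": "TGC",
    "Q": "CAG", "E": "GAG", "G": "GGC", "H": "CAC", "I": "ATC",
    "L": "CTG", "K": "AAG", "M": "ATG", "F": "TTC", "P": "CCC",
    "S": "AGC", "T": "ACC", "W": "TGG", "Y": "TAC", "V": "GTG",
    "*": "TGA",
}


def encode_protein(protein, table=PREFERRED_CODON, add_stop=True):
    """Back-translate a protein sequence using one codon per amino acid."""
    codons = []
    for aa in protein.upper():
        if aa not in table:
            raise ValueError(f"no codon for residue {aa!r}")
        codons.append(table[aa])
    if add_stop:
        codons.append(table["*"])
    return "".join(codons)


def codon_differences(orf_a, orf_b):
    """Count codon positions at which two equal-length ORFs differ."""
    if len(orf_a) != len(orf_b) or len(orf_a) % 3 != 0:
        raise ValueError("ORFs must have equal length divisible by 3")
    return sum(orf_a[i:i + 3] != orf_b[i:i + 3]
               for i in range(0, len(orf_a), 3))


# Toy example: a three-residue peptide and a hypothetical "native" ORF.
recoded = encode_protein("MKL")
n_changed = codon_differences("ATGAAACTTTAA", recoded)
```

Practical codon selection usually samples from a codon frequency distribution and avoids unwanted sequence features, rather than always using a single codon per amino acid.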

5.2.6 Highly active Oryzias transposases

Individual advantageous mutations can be combined in a number of different ways, for example by "DNA shuffling" or by the methods described in U.S. Pat. No. 8,635,029 B2 and Liao et al. (2007, BMC Biotechnology 7:16, doi: 10.1186/1472-). By using variants of the selection schemes described herein (e.g., section 6.1.6) with an appropriate corresponding transposon, transposases with modified activity can be obtained, either activity on new target sequences or enhanced activity on existing target sequences.

An alignment of known active piggyBac-like transposases can be used to identify amino acid changes that may lead to enhanced activity. Transposases are generally harmful to their hosts and therefore tend to accumulate mutations that inactivate them. However, the mutations accumulated in different transposases differ, because each mutation occurs randomly. A consensus sequence can therefore be derived from a sequence alignment and used to improve activity (Ivics et al., 1997, Cell 91: 501-). We aligned known active piggyBac-like transposases using the CLUSTAL algorithm and enumerated the amino acids found at each position. Table 1 shows this diversity relative to the Oryzias transposase (SEQ ID NO: 782). The amino acids shown in column C are found at the equivalent positions in the alignment of known active piggyBac-like transposases, and are therefore likely to be acceptable changes in the Oryzias transposase. Column D shows amino acid changes at positions that are well conserved among the other known active piggyBac-like transposases, but at which the amino acid in the Oryzias transposase sequence is an outlier. Mutating a position shown in column A to the amino acid shown in column D is particularly likely to increase transposase activity, because it moves the sequence of the Oryzias transposase toward the consensus.
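The per-position enumeration and the search for outlier positions can be sketched as follows, assuming the alignment has already been computed (for example with a CLUSTAL-family tool) and is supplied as equal-length gapped strings. The function names, the conservation threshold and the toy sequences are hypothetical and do not reproduce Table 1.

```python
from collections import Counter


def column_census(reference, others):
    """For each ungapped reference position, count residues in the others.

    Returns (position, reference_residue, Counter) tuples, with positions
    numbered from 1 along the ungapped reference.
    """
    census = []
    ref_pos = 0
    for col, ref_aa in enumerate(reference):
        if ref_aa == "-":
            continue  # column absent from the reference
        ref_pos += 1
        counts = Counter(seq[col] for seq in others if seq[col] != "-")
        census.append((ref_pos, ref_aa, counts))
    return census


def outlier_substitutions(census, min_fraction=1.0):
    """Substitutions toward a residue conserved in the others.

    A position is reported when at least min_fraction of the other
    sequences share one residue and the reference has a different one.
    """
    found = []
    for pos, ref_aa, counts in census:
        total = sum(counts.values())
        if total == 0:
            continue
        aa, n = counts.most_common(1)[0]
        if aa != ref_aa and n / total >= min_fraction:
            found.append(f"{ref_aa}{pos}{aa}")
    return found


# Toy alignment (hypothetical sequences).
reference = "MEAL-K"
others = ["MDALGK", "MDAVGK", "MDAIGR"]
census = column_census(reference, others)
candidates = outlier_substitutions(census)
```

In this toy case the residues tallied at each position play the role of column C, and the reported outlier substitution plays the role of column D.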

We selected 60 amino acid substitutions from column D of Table 1 to introduce into the sequence of the Oryzias transposase SEQ ID NO: 782. The substituted positions are E22, D82, A124, Q131, L138, F149, L156, D160, Y164, I167, A171, G172, R175, K177, G178, L200, T202, I206, I210, N214, W237, V251, V253, V258, M270, I281, A284, M319, G322, L323, H326, F333, Y337, L361, V386, M400, T402, H404, S408, L409, D422, K435, Y440, F455, V458, D459, S461, A465, V467, L468, W469, A512, A514, V515, S524, R548, D549, D550, S551 and N562. Genes encoding Oryzias transposase variants comprising combinations of these substitutions were synthesized and tested for transposase activity as described in section 6.1.6.

In addition to the naturally occurring sequence SEQ ID NO: 782, we have engineered more than 70 non-native Oryzias transposase variants with excision or transposition activity. Exemplary sequences of active non-native Oryzias transposase variants are provided as SEQ ID NOs: 816-877. Oryzias transposase variants with enhanced excision activity relative to transposition activity are provided as SEQ ID NOs: 805-815.

Thus, an Oryzias transposase can be created that has a non-native sequence but is at least 99% identical, or at least 98% identical, or at least 97% identical, or at least 96% identical, or at least 95% identical, or at least 90% identical, or at least 80% identical to SEQ ID NO: 782. Such variants may retain partial activity of the transposase of SEQ ID NO: 782 (as determined by transposition activity and/or excision activity), may be functionally equivalent to the transposase of SEQ ID NO: 782 in one or both of transposition and excision, or may have enhanced transposition activity, excision activity, or both relative to the transposase of SEQ ID NO: 782. Such variants may include mutations shown herein to increase transposition and/or excision, mutations shown herein to be neutral with respect to transposition and/or excision, and mutations that are deleterious to transposition and/or integration. Preferred variants include mutations shown to be neutral or to enhance transposition and/or excision. Some such variants lack mutations shown to be deleterious to transposition and/or excision. Some such variants include only mutations shown to enhance transposition, only mutations shown to enhance excision, or mutations shown to enhance both transposition and excision.
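A percent-identity figure of the kind used above can be computed from a pairwise alignment. The sketch below assumes the two sequences are already aligned and uses one particular convention (matches divided by columns in which neither sequence has a gap); other conventions exist, and the source does not specify which one applies. The function names and toy sequences are hypothetical.

```python
def percent_identity(aligned_a, aligned_b):
    """Percent identity over columns where neither sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    pairs = [(a, b) for a, b in zip(aligned_a, aligned_b)
             if a != "-" and b != "-"]
    if not pairs:
        raise ValueError("no ungapped columns to compare")
    matches = sum(a == b for a, b in pairs)
    return 100.0 * matches / len(pairs)


def highest_threshold_met(identity, thresholds=(99, 98, 97, 96, 95, 90, 80)):
    """Return the highest listed identity threshold met, or None."""
    met = [t for t in thresholds if identity >= t]
    return max(met) if met else None


# Toy aligned peptides (hypothetical): 4 of 5 ungapped columns match.
identity = percent_identity("MEALK", "MDALK")
tier = highest_threshold_met(identity)
```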

Enhanced activity refers to activity (e.g., transposition or excision activity) beyond experimental error that is greater than the reference transposase from which the variant was derived. The activity may be 1.2, 1.5, 2, 5, 10, 15, 20, 50, or 100 times that of the reference transposase. The enhanced activity may be, for example, in the range of 1.2-100 fold, 2-50 fold, 1.5-50 fold, or 2-10 fold of the reference transposase. Activity here and elsewhere can be measured as shown in the examples.

Functional equivalence means that a variant transposase can mediate transposition and/or excision of the same transposon with efficiency (within experimental error) comparable to that of a reference transposase.

Furthermore, variant sequences of SEQ ID NO: 782 can be created by combining substitutions. Combining beneficial substitutions, such as those shown in column D of Table 1, can result in highly active (hyperactive) variants of SEQ ID NO: 782. Preferred highly active Oryzias transposases comprise an amino acid substitution, relative to SEQ ID NO: 782, at a position selected from amino acids 22, 124, 131, 138, 149, 156, 160, 164, 167, 171, 175, 177, 202, 206, 210, 214, 253, 258, 281, 284, 361, 386, 400, 408, 409, 455, 458, 467, 468, 514, 515, 524, 548, 549, 550 and 551 (see section 6.1.6). Preferably, the substitution is one shown in column C or column D of Table 1. An advantageous highly active Oryzias transposase comprises an amino acid substitution (relative to SEQ ID NO: 782) selected from E22D, A124C, Q131D, L138V, F149R, L156T, D160E, Y164F, I167L, A171T, R175K, K177N, T202R, I206L, I210L, N214D, V253I, V258L, I281F, A284L, L361I, V386I, M400L, S408E, L409I, F455Y, V458L, V467I, L468I, A514R, V515I, S524P, R548K, D549K, D550R and S551R. Some highly active Oryzias transposases may further comprise a heterologous nuclear localization sequence.

Some engineered Oryzias transposases have higher excision activity than transposition activity. Advantageous Oryzias transposases that are highly active in excision comprise an amino acid substitution, relative to SEQ ID NO: 782, at a position selected from amino acids 156, 164, 167, 171, 175, 177, 284 and 455, for example a substitution selected from L156T, Y164F, I167L, A171T, R175K, K177N, A284L and F455Y. These substitutions can be combined to engineer an Oryzias transposase with higher excision activity than transposition activity. Exemplary Oryzias transposases that are highly active in excision include those selected from SEQ ID NOs: 805-815.

A preferred highly active Oryzias transposase comprises an amino acid sequence other than that of a naturally occurring protein (e.g., a transposase whose amino acid sequence does not comprise SEQ ID NO: 782), which has at least 80%, 85%, 90%, 95%, 96%, 97%, 98%, 99% or 100% identity to any one of the amino acid sequences of SEQ ID NOs: 805-877 and which, relative to SEQ ID NO: 782, comprises a substitution at a position selected from amino acids 22, 124, 131, 138, 149, 156, 160, 164, 167, 171, 175, 177, 202, 206, 210, 214, 253, 258, 281, 284, 361, 386, 400, 408, 409, 455, 458, 467, 468, 514, 515, 524, 548, 549, 550 and 551. Preferably, the highly active Oryzias transposase comprises, relative to SEQ ID NO: 782, any amino acid substitution or combination of substitutions selected from E22D, A124C, Q131D, L138V, F149R, L156T, D160E, Y164F, I167L, A171T, R175K, K177N, T202R, I206L, I210L, N214D, V253I, V258L, I281F, A284L, L361I, V386I, M400L, S408E, L409I, V458L, V467I, L468I, A514R, V515I, S524P, R548K, D549K, D550R and S551R, including at least 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 or all of these mutations.
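The two criteria in the preceding paragraph (named substitutions relative to a reference sequence, and a percent-identity threshold) can be illustrated with a minimal Python sketch. The 20-residue reference below is a made-up toy sequence, not SEQ ID NO: 782, and the helper names `apply_substitutions` and `percent_identity` are hypothetical. Identity is computed here only for equal-length, ungapped sequences, which is a simplifying assumption.

```python
import re


def apply_substitutions(reference: str, substitutions: list[str]) -> str:
    """Return a copy of `reference` with substitutions such as 'E22D' applied.

    Positions are 1-based. A ValueError is raised if the residue named in a
    substitution does not match the reference at that position.
    """
    residues = list(reference)
    for sub in substitutions:
        match = re.fullmatch(r"([A-Z])(\d+)([A-Z])", sub)
        if match is None:
            raise ValueError(f"cannot parse substitution {sub!r}")
        old, pos, new = match.group(1), int(match.group(2)), match.group(3)
        if pos < 1 or pos > len(residues) or residues[pos - 1] != old:
            raise ValueError(f"{sub}: reference does not have {old} at position {pos}")
        residues[pos - 1] = new
    return "".join(residues)


def percent_identity(a: str, b: str) -> float:
    """Percent identity of two equal-length, ungapped sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be the same length")
    matches = sum(x == y for x, y in zip(a, b))
    return 100.0 * matches / len(a)


# Toy reference (hypothetical; NOT SEQ ID NO: 782).
toy_reference = "MAEKLVDGHYIRTNSQWFPC"
toy_variant = apply_substitutions(toy_reference, ["E3D", "V6L"])
print(toy_variant, percent_identity(toy_reference, toy_variant))
```

With real sequences of different lengths, identity would normally be computed from a pairwise alignment rather than position by position.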

One aspect of the invention is a method of generating a transgenic cell using a naturally occurring or highly active Oryzias transposase. The method comprises (i) introducing into a eukaryotic cell a naturally occurring or highly active Oryzias transposase (in the form of a protein or a polynucleotide encoding the transposase) and a corresponding Oryzias transposon. The method may further comprise (ii) identifying a cell in which the Oryzias transposon is incorporated into the genome of the eukaryotic cell. Identifying such a cell may comprise selecting the eukaryotic cell based on a selectable marker encoded on the Oryzias transposon, which selectable marker may be any selectable polypeptide, including any selectable polypeptide described herein.

The activity of the transposase can also be increased by fusing a nuclear localization signal (NLS) to the N-terminus, the C-terminus, both termini, or an internal region of the transposase protein, as long as the transposase activity is maintained. A nuclear localization signal or sequence (NLS) is an amino acid sequence that "tags" a protein for import into the nucleus by directly or indirectly facilitating its interaction with nuclear transport proteins. The NLS used can include consensus NLS sequences, viral NLS sequences, cellular NLS sequences, and combinations thereof.

Transposases can also be fused to other functional protein domains. Such functional protein domains may include DNA binding domains, flexible hinge regions that may facilitate fusion of one or more domains, and combinations thereof. These domains may be fused to the N-terminus, C-terminus or an internal region of the transposase protein, as long as the transposase activity is maintained. Fusion to a DNA binding domain can be used to direct the Oryzias transposase to a specific genomic locus or to multiple genomic loci. The DNA binding domain can include a helix-turn-helix domain, a zinc finger domain, a leucine zipper domain, a TALE (transcription activator-like effector) domain, a CRISPR-Cas protein, or a helix-loop-helix domain. The specific DNA binding domain used may include a Gal4 DNA binding domain, a LexA DNA binding domain, or a Zif268 DNA binding domain. The flexible hinge region used may include glycine/serine linkers and variants thereof.

5.3 kits

The invention also relates to a kit comprising an Oryzias transposase and/or an Oryzias transposon as a protein or encoded by a nucleic acid; or a gene transfer system as described herein, comprising an Oryzias transposase as a protein or encoded by a nucleic acid as described herein in combination with an Oryzias transposon; optionally together with a pharmaceutically acceptable carrier, adjuvant or vehicle, and optionally together with instructions for use. Any of the components of the kit of the invention may be administered and/or transfected into a cell sequentially or in parallel, e.g. the Oryzias transposase protein or its encoding nucleic acid may be administered and/or transfected into a cell as defined above, before, simultaneously or after administration and/or transfection of the Oryzias transposon. Alternatively, the Oryzias transposon may be transfected into the cell as defined above prior to, simultaneously with or after transfection of the Oryzias transposase protein or its encoding nucleic acid. If parallel transfection is used, the two components are preferably provided as separate formulations and/or mixed with one another immediately prior to administration to avoid transposition prior to transfection. In addition, the administration and/or transfection of at least one component of the kit can be carried out in a time-staggered pattern, for example by multiple use of the component.

6. Examples of the embodiments

The following examples illustrate the methods, compositions, and kits disclosed herein and should not be construed as limiting in any way. Various equivalents will be apparent from the following examples. Such equivalents are also considered to be part of the invention disclosed herein.

6.1 novel transposases

6.1.1 measurement of transposase Activity

As described in section 5.2.5, the transposition frequency of an active transposase can be measured using a system in which a transposon interrupts a selectable marker. A transposase reporter polynucleotide was constructed in which the URA3 open reading frame of the yeast Saccharomyces cerevisiae was disrupted by a yeast TRP1 open reading frame, the TRP1 open reading frame being operably linked to a promoter and a terminator such that it can be expressed in Saccharomyces cerevisiae. The TRP1 gene was flanked by putative transposon ends with 5'-TTAA-3' target sites, such that excision of the putative transposon would leave a single copy of the 5'-TTAA-3' target site and precisely reconstitute the URA3 open reading frame. Yeast transposase reporter strains were constructed by integrating a transposase reporter polynucleotide into the URA3 gene of haploid yeast strains auxotrophic for LEU2 and TRP1, making the strains LEU2-, URA3- and TRP1+.

Transposases were tested for their ability to transpose transposons containing the TRP1 gene out of the URA3 open reading frame. Each open reading frame encoding a putative transposase was cloned into a Saccharomyces cerevisiae expression vector containing a 2 micron origin of replication and a LEU2 gene that is expressible in yeast. Each transposase open reading frame was operably linked to a Gal1 promoter. Each cloned transposase open reading frame was transformed into a yeast transposase reporter strain and plated on minimal medium lacking leucine. After 2 days, all LEU+ colonies were collected by scraping. The Gal promoter was induced by growth in galactose for 4 hours, and cells were then plated onto 3 different plates: plates lacking leucine only, plates lacking leucine and uracil, and plates lacking leucine, uracil and tryptophan. These plates were incubated for 2-4 days and the colonies on each plate were counted to measure, respectively, the number of viable cells, the number of transposon excision events, and the number of transposon excision and reintegration events (i.e., transposition events).

6.1.2 identification of active Oryzias piggyBac-like transposases

Eleven putative piggyBac-like transposases that are at least 20% identical to the piggyBac transposase from the cabbage looper moth (Trichoplusia ni) were identified from GenBank as described in section 5.2.5. These putative transposases contain the DDDE motif characteristic of active piggyBac-like transposases. Flanking DNA sequences were analyzed for the presence of the following feature of piggyBac transposition: an inverted repeat sequence immediately adjacent to the 5'-TTAA-3' target sequence. Putative left and right transposon end sequences were taken from these flanking sequences, including the sequence between the 5'-TTAA-3' target sequence and the open reading frame encoding the putative transposase. These transposon ends were incorporated into a transposase reporter construct constructed as described in section 6.1.1 and integrated into the genome of Saccharomyces cerevisiae, thereby generating a transposase reporter strain. The corresponding transposase sequence for each reporter strain was back-translated, synthesized, cloned into a Saccharomyces cerevisiae expression vector, and transformed into the reporter strain. Transposase activity was measured as described in section 6.1.1.
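The flanking-sequence check described above (an inverted repeat immediately adjacent to the 5'-TTAA-3' target sequence at each end) can be sketched in Python as follows. This is an illustrative sketch only, not the analysis actually used: the function name `terminal_inverted_repeat_length`, the requirement for a perfect (mismatch-free) repeat, and the toy sequences are all assumptions.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def terminal_inverted_repeat_length(left_flank: str, right_flank: str,
                                    target: str = "TTAA") -> int:
    """Length of the perfect inverted repeat directly inside the target sites.

    left_flank  : sequence that starts with the left target site and reads
                  into the putative transposon.
    right_flank : sequence that reads out of the putative transposon and
                  ends with the right target site.
    Returns 0 if either target site is missing.
    """
    left = left_flank.upper()
    right = right_flank.upper()
    if not left.startswith(target) or not right.endswith(target):
        return 0
    left_inner = left[len(target):]
    # Read the right end inward from its target site, on the opposite strand.
    right_inner = reverse_complement(right[:-len(target)])
    length = 0
    for a, b in zip(left_inner, right_inner):
        if a != b:
            break
        length += 1
    return length


# Toy flanks (hypothetical sequences, not taken from any SEQ ID NO).
print(terminal_inverted_repeat_length("TTAACCCTAGAAGCAT", "GGACTTCTAGGGTTAA"))
```

A real screen would probably tolerate a few mismatches in the repeat; the strict comparison here simply keeps the example short.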

The following twenty combinations showed no excision or transposition: reporter construct SEQ ID NO: 48 (comprising putative left transposon end SEQ ID NO: 68 and putative right transposon end SEQ ID NO: 69) with transposase SEQ ID NO: 21; reporter construct SEQ ID NO: 49 (comprising putative left transposon end SEQ ID NO: 70 and putative right transposon end SEQ ID NO: 71) with transposase SEQ ID NO: 22; reporter construct SEQ ID NO: 50 (comprising putative left transposon end SEQ ID NO: 72 and putative right transposon end SEQ ID NO: 73) with transposase SEQ ID NO: 23; reporter construct SEQ ID NO: 51 (comprising putative left transposon end SEQ ID NO: 74 and putative right transposon end SEQ ID NO: 75) with transposase SEQ ID NO: 24; reporter construct SEQ ID NO: 52 (comprising putative left transposon end SEQ ID NO: 76 and putative right transposon end SEQ ID NO: 77) with transposase SEQ ID NO: 25; reporter construct SEQ ID NO: 53 (comprising putative left transposon end SEQ ID NO: 78 and putative right transposon end SEQ ID NO: 79) with transposase SEQ ID NO: 26; reporter construct SEQ ID NO: 54 (comprising putative left transposon end SEQ ID NO: 80 and putative right transposon end SEQ ID NO: 81) with transposase SEQ ID NO: 27; reporter construct SEQ ID NO: 55 (comprising putative left transposon end SEQ ID NO: 82 and putative right transposon end SEQ ID NO: 83) with transposase SEQ ID NO: 28; reporter construct SEQ ID NO: 56 (comprising putative left transposon end SEQ ID NO: 84 and putative right transposon end SEQ ID NO: 85) with transposase SEQ ID NO: 29; reporter construct SEQ ID NO: 57 (comprising putative left transposon end SEQ ID NO: 86 and putative right transposon end SEQ ID NO: 87) with transposase SEQ ID NO: 30; reporter construct SEQ ID NO: 58 (comprising putative left transposon end SEQ ID NO: 88 and putative right transposon end SEQ ID NO: 89) with transposase SEQ ID NO: 31; reporter construct SEQ ID NO: 59 (comprising putative left transposon end SEQ ID NO: 90 and putative right transposon end SEQ ID NO: 91) with transposase SEQ ID NO: 32; reporter construct SEQ ID NO: 60 (comprising putative left transposon end SEQ ID NO: 92 and putative right transposon end SEQ ID NO: 93) with transposase SEQ ID NO: 33; reporter construct SEQ ID NO: 61 (comprising putative left transposon end SEQ ID NO: 94 and putative right transposon end SEQ ID NO: 95) with transposase SEQ ID NO: 34; reporter construct SEQ ID NO: 62 (comprising putative left transposon end SEQ ID NO: 96 and putative right transposon end SEQ ID NO: 97) with transposase SEQ ID NO: 35; reporter construct SEQ ID NO: 63 (comprising putative left transposon end SEQ ID NO: 98 and putative right transposon end SEQ ID NO: 99) with transposase SEQ ID NO: 36; reporter construct SEQ ID NO: 64 (comprising putative left transposon end SEQ ID NO: 100 and putative right transposon end SEQ ID NO: 101) with transposase SEQ ID NO: 37; reporter construct SEQ ID NO: 65 (comprising putative left transposon end SEQ ID NO: 102 and putative right transposon end SEQ ID NO: 103) with transposase SEQ ID NO: 38; reporter construct SEQ ID NO: 66 (comprising putative left transposon end SEQ ID NO: 104 and putative right transposon end SEQ ID NO: 105) with transposase SEQ ID NO: 39; and reporter construct SEQ ID NO: 67 (comprising putative left transposon end SEQ ID NO: 106 and putative right transposon end SEQ ID NO: 107) with transposase SEQ ID NO: 40.
This is consistent with literature reports: although computational identification of sequences homologous to the piggyBac transposase from Trichoplusia ni is straightforward, most of these sequences are inactive, even though they appear to have intact terminal repeats and the transposases appear to contain the DDDE motif found in active piggyBac-like transposases. Excision and transposition activities must therefore be measured to identify novel active piggyBac-like transposases and transposons.

A transposase that exhibited good activity in excising its corresponding transposon from a reporter construct (shown by the appearance of URA+ colonies) and in transposing the TRP1 gene in the transposon to another genomic position in a Saccharomyces cerevisiae reporter strain is transposase SEQ ID NO: 782. Transposase SEQ ID NO: 782 was able to transpose the transposon from reporter construct SEQ ID NO: 41. This is shown in Table 2: column G shows the number of excision events (measured by the appearance of URA+ colonies); column H shows the number of complete transposition events (measured by the appearance of URA+ TRP+ colonies).

6.1.3 Oryzias transposase active in mammalian cells

PiggyBac-like transposases can transpose their corresponding transposons into the genomes of eukaryotic cells, including yeast cells (e.g., Pichia pastoris and Saccharomyces cerevisiae) and mammalian cells (e.g., human embryonic kidney (HEK) cells and Chinese hamster ovary (CHO) cells). To determine the activity of piggyBac-like transposases in mammalian cells, we constructed gene transfer polynucleotides comprising transposon ends and further comprising a selectable marker encoding a glutamine synthetase with the sequence given by SEQ ID NO: 129, operably linked to regulatory elements providing weak expression of the glutamine synthetase; the sequence of the glutamine synthetase and its associated regulatory elements is given by SEQ ID NO: 172. The gene transfer polynucleotide further comprises open reading frames encoding the heavy and light chains of an antibody, each open reading frame operably linked to a promoter and a polyadenylation signal sequence. The gene transfer polynucleotide (SEQ ID NO: 108) comprises the left transposon end, which contains the 5'-TTAA-3' target integration sequence followed by the Oryzias left transposon end ITR with the sequence given by SEQ ID NO: 9 (which is an embodiment of SEQ ID NO: 7). The gene transfer polynucleotide further comprises the Oryzias right transposon end ITR with the sequence given by SEQ ID NO: 10 (which is an embodiment of SEQ ID NO: 8), followed by a 5'-TTAA-3' target integration sequence. The two Oryzias transposon ends were placed on either side of a heterologous polynucleotide comprising the glutamine synthetase selectable marker and the open reading frames encoding the heavy and light chains of the antibody. The left transposon end further comprises a sequence given by SEQ ID NO: 5, immediately adjacent to the left ITR and proximal to the heterologous polynucleotide.
The right transposon end further comprises a sequence given by SEQ ID NO: 6, immediately adjacent to the right ITR and proximal to the heterologous polynucleotide.

The gene transfer polynucleotide was transfected into CHO cells lacking a functional glutamine synthetase gene. Cells were transfected by electroporation with 25 μg of gene transfer polynucleotide DNA, with or without co-transfection of 3 μg of DNA containing a gene encoding a transposase operably linked to a human CMV promoter and a polyadenylation signal sequence. After electroporation, cells were cultured in medium containing 4 mM glutamine for 48 hours, and then diluted to 300,000 cells per ml in medium lacking glutamine. Cells were transferred to fresh glutamine-free medium every 5 days. Cell viability was measured for each transfection using a Beckman-Coulter Vi-Cell at different times after transfection. The total number of viable cells was also measured with the same instrument. The results are shown in Table 3.

As shown in Table 3, by 12 days after transfection, the viability of cells transfected with the gene transfer polynucleotide but without transposase had decreased to about 27% (column B). Within 7 days, the total number of viable cells had decreased to less than 50,000 per ml (column C). At viable cell densities at or below this level, viability measurements become inaccurate. The culture never recovered. In contrast, when gene transfer polynucleotide SEQ ID NO: 108 was co-transfected with Oryzias transposase SEQ ID NO: 782, the cells returned to greater than 90% viability within 10 days (column D of Table 3), at which time the viable cell density exceeded 200 million per ml (column E of Table 3). This indicates that a gene transfer polynucleotide comprising left and right Oryzias transposon ends can be transposed efficiently into the genome of a mammalian target cell by the corresponding Oryzias transposase.

The recovered CHO cell pool containing piggyBac-like transposons integrated into its genome was grown in a 14-day fed-batch culture using Sigma Advanced Fed Batch medium. Antibody titers were measured in culture supernatants using Octet. Table 4 shows the titers measured at days 7, 10, 12 and 14 of the fed-batch culture. After 14 days, the antibody titer from cells in which gene transfer polynucleotide SEQ ID NO: 108 had been integrated by co-transfection with Oryzias transposase SEQ ID NO: 782 reached about 2 g/L. This indicates that the Oryzias transposons and their corresponding transposases, as described in section 5.2.5, are a novel piggyBac-like transposon/transposase system that is active in mammalian cells and can be used to develop protein-expressing cell lines and to engineer the genomes of mammalian cells.

6.1.4 messenger RNA encoding Oryzias transposase active in mammalian cells

We further tested the gene transfer polynucleotide with SEQ ID NO: 108 (the structure of which is described in section 6.1.3) to determine whether a synthetic Oryzias transposon can be integrated into the genome of a mammalian cell if the corresponding transposase is provided in the form of an mRNA.

mRNA encoding the transposase was prepared by in vitro transcription using T7 RNA polymerase. The mRNA comprised the 5' sequence SEQ ID NO: 109 and the 3' sequence SEQ ID NO: 110. The mRNA had an anti-reverse cap analog (3'-O-Me-m7G(5')ppp(5')G). DNA molecules comprising sequences encoding transposases operably linked to heterologous promoters active in vitro can be used to prepare transposase mRNAs. An isolated mRNA molecule comprising a sequence encoding a transposase can be used to integrate a corresponding transposon into a target genome.

The gene transfer polynucleotide having the sequence of SEQ ID NO: 108 comprises a selectable marker encoding a glutamine synthetase having the amino acid sequence of SEQ ID NO: 129, with the sequence given in SEQ ID NO: 134, operably linked to regulatory elements that result in weak expression of the glutamine synthetase; the sequence of the glutamine synthetase and its associated regulatory elements is given by SEQ ID NO: 172. Gene transfer polynucleotide SEQ ID NO: 108 further comprises open reading frames encoding the heavy and light chains of an antibody, each operably linked to a promoter and a polyadenylation signal sequence. Gene transfer polynucleotide SEQ ID NO: 108 further comprises a left transposon end with the sequence given by SEQ ID NO: 1 and a right transposon end with the sequence given by SEQ ID NO: 2.

mRNA encoding the Oryzias transposase was prepared by in vitro transcription using T7 RNA polymerase. The mRNA comprised the 5' sequence SEQ ID NO: 109, an open reading frame encoding the Oryzias transposase (amino acid sequence SEQ ID NO: 782, nucleotide sequence SEQ ID NO: 780), and the 3' sequence SEQ ID NO: 110. Gene transfer polynucleotide SEQ ID NO: 108 was transfected into CHO cells lacking a functional glutamine synthetase gene. Cells were transfected by electroporation with gene transfer polynucleotide DNA, co-transfected with 3 μg of mRNA comprising an open reading frame encoding the corresponding transposase (amino acid sequence SEQ ID NO: 782, nucleotide sequence SEQ ID NO: 780). After electroporation, the cells were cultured in medium containing 4 mM glutamine for 48 hours, and then diluted to 300,000 cells per ml in medium lacking glutamine. Cells were transferred to fresh glutamine-free medium every 5 days. Cell viability was measured for each transfection using a Beckman-Coulter Vi-Cell at different times after transfection. The total number of viable cells was also measured with the same instrument. The results are shown in Table 5.

When the gene transfer polynucleotide having the sequence of SEQ ID NO: 108 was co-transfected with mRNA encoding Oryzias transposase SEQ ID NO: 782, viability dropped to about 28% by 9 days after transfection (column B of Table 5), at which time the viable cell density was about 40,000 per ml (column C of Table 5). Cell viability and viable cell density then increased until, at 28 days post-transfection, viability exceeded 96% and the viable cell density exceeded 300 million cells per ml. This indicates that a gene transfer polynucleotide comprising left and right Oryzias transposon ends can be efficiently transposed into the genome of a mammalian target cell when co-transfected with an mRNA encoding the corresponding Oryzias transposase.

6.1.5 Oryzias transposon end sequences active in mammalian cells

When we originally tested the Oryzias transposon, we used the entire sequence between the 5'-TTAA-3' target sequence and the transposase open reading frame as the transposon end. We have found that for other piggyBac-like sequences, the entire sequence is not generally required for transposition activity. We therefore constructed synthetic Oryzias transposons with truncated ends to determine whether they could be transposed by the Oryzias transposase. SEQ ID NO: 42 encodes a glutamine synthetase having the sequence given by SEQ ID NO: 130, operably linked to regulatory elements that result in weak expression of the glutamine synthetase, which serves as a selectable marker. On one side of this heterologous polynucleotide is the left Oryzias transposon end, which contains the 5'-TTAA-3' integration target sequence followed by a transposon ITR sequence having SEQ ID NO: 9 (which is an embodiment of SEQ ID NO: 7). On the other side of the heterologous polynucleotide is the right Oryzias transposon end, which comprises a transposon ITR sequence having SEQ ID NO: 10 (which is an embodiment of SEQ ID NO: 8), followed by a 5'-TTAA-3' integration target sequence. The transposon further comprises a sequence selected from SEQ ID NOs: 5, 11 and 12 immediately adjacent to (following) the left transposon ITR sequence. The transposon further comprises a sequence selected from SEQ ID NOs: 6, 13, 14 and 15 immediately preceding the right transposon ITR sequence. The transposon was transfected into CHO cells lacking a functional glutamine synthetase gene. Cells were transfected by electroporation with gene transfer polynucleotide DNA and, optionally, co-transfected with 3 μg of mRNA comprising an open reading frame encoding the corresponding transposase (amino acid sequence SEQ ID NO: 782, nucleotide sequence SEQ ID NO: 780).
After electroporation, cells were incubated in medium containing 4 mM glutamine for 48 hours and then diluted to 300,000 cells per ml in medium lacking glutamine. Cells were transferred to fresh glutamine-free medium every 5 days. Cell viability was measured for each transfection using a Beckman-Coulter Vi-Cell at different times after transfection. The total number of viable cells was also measured with the same instrument. The results are shown in Table 6.

Columns B and C of Table 6 show the decline in cell viability and viable cell density when cells were transfected with a transposon comprising a truncated left transposon end having SEQ ID NO: 11 and a full-length right transposon end having SEQ ID NO: 6. Cell viability and viable cell density decreased throughout the experiment. In contrast, when the same transposon was co-transfected with mRNA encoding the Oryzias transposase, cell viability and viable cell density initially declined, but recovery began on day 14 and was complete between day 19 and day 24 (columns C and D of Table 6). Comparable results were obtained when cells were transfected with a transposon comprising a left transposon end having SEQ ID NO: 12 and a full-length right transposon end having SEQ ID NO: 6 (compare columns E and F with columns G and H of Table 6). Comparable results were also obtained with a transposon comprising a full-length left transposon end having SEQ ID NO: 5 and a truncated right transposon end having SEQ ID NO: 13 (compare columns I and J with columns K and L of Table 6), with a transposon comprising a full-length left transposon end having SEQ ID NO: 5 and a truncated right transposon end having SEQ ID NO: 14 (compare columns M and N with columns O and P of Table 6), and with a transposon comprising a full-length left transposon end having SEQ ID NO: 5 and a truncated right transposon end having SEQ ID NO: 15 (compare columns Q and R with columns S and T of Table 6).
This indicates that, in addition to the integration target sequence immediately adjacent to a transposon ITR sequence having SEQ ID NO: 7, the left transposon end of a synthetic Oryzias transposon may further comprise, adjacent to the left transposon ITR sequence, a sequence selected from SEQ ID NOs: 5, 11 and 12; and the right transposon end of a synthetic Oryzias transposon may comprise, immediately adjacent to a transposon ITR sequence having SEQ ID NO: 8, a sequence selected from SEQ ID NOs: 6, 13, 14 and 15.

6.1.6 high activity Oryzias transposase

To identify Oryzias transposase mutations that result in increased transposition activity or increased excision activity relative to the naturally occurring Oryzias transposase sequence given by SEQ ID NO: 782, we analyzed a CLUSTAL alignment of active piggyBac-like transposases. Column C of Table 1 shows the amino acids found in active piggyBac-like transposases at each position of the Oryzias transposase (positions shown in column A of Table 1). Column B of Table 1 shows the amino acid present at that position in the Oryzias transposase SEQ ID NO: 782. Since transposases are generally harmful to their hosts, they tend to accumulate mutations that inactivate them. The mutations accumulated in different transposases differ because each mutation occurs randomly. Consensus sequences can therefore be used to approximate ancestral sequences prior to the accumulation of deleterious mutations. It is difficult to calculate the ancestral sequence accurately from a small number of extant sequences, so we chose to focus on positions that are well conserved among the active transposases but where the shared amino acid differs from that in the Oryzias transposase. We reasoned that mutating these amino acids to the amino acids common to other active transposases would be likely to increase the activity of the Oryzias transposase. These candidate beneficial amino acid substitutions are shown in column D of Table 1.
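The consensus-based selection described above can be sketched in Python as follows. The toy alignment, the function name `candidate_substitutions` and the 75% conservation threshold are assumptions made for illustration; the source does not state what threshold, if any, was applied when building Table 1.

```python
from collections import Counter


def candidate_substitutions(reference: str, homologs: list[str],
                            min_fraction: float = 0.75) -> list[str]:
    """Propose substitutions toward the consensus of aligned homologs.

    `reference` and every entry of `homologs` are rows of one alignment
    (equal length, '-' for gaps). A substitution such as 'E3D' is proposed
    when the most common homolog residue in a column differs from the
    reference residue and occurs in at least `min_fraction` of the homologs.
    Positions are 1-based and count only ungapped reference residues.
    """
    candidates = []
    position = 0
    for column, ref_residue in enumerate(reference):
        if ref_residue == "-":
            continue  # insertion relative to the reference; no position
        position += 1
        observed = [h[column] for h in homologs if h[column] != "-"]
        if not observed:
            continue
        residue, count = Counter(observed).most_common(1)[0]
        if residue != ref_residue and count / len(homologs) >= min_fraction:
            candidates.append(f"{ref_residue}{position}{residue}")
    return candidates


# Toy alignment (hypothetical sequences, for illustration only).
toy_reference = "MKE-LVA"
toy_homologs = ["MKDGLIA", "MRDGLIA", "MKD-LVA", "MKDGLIS"]
print(candidate_substitutions(toy_reference, toy_homologs))
```

In the toy alignment, reference positions 3 and 5 differ from a well-conserved homolog residue, so the sketch proposes E3D and V5I.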

6.1.6.1 first group of Oryzias transposase variants

A set of 95 polynucleotides was prepared, each encoding a variant Oryzias transposase comprising one or more substitutions at residues selected from the group consisting of E22, D82, A124, Q131, L138, F149, L156, D160, Y164, I167, A171, G172, R175, K177, G178, L200, T202, I206, I210, N214, W237, V251, V253, V258, M270, I281, A284, M319, G322, L323, H326, F333, Y337, L361, V386, M400, T402, H404, S408, L409, K435, Y440, F455, V458, D459, S461, A465, V467, L468, W469, A512, A514, V515, S524, R548, D549, D550, S551 and N. In this set of 95 variants, each substitution is represented at least 5 times, and the number of pairwise combinations of substitutions is maximized, so that each substitution is tested in as many different sequence contexts as possible. Each variant gene was cloned into a vector comprising a leucine selectable marker; each gene encoding a transposase variant was operably linked to a Saccharomyces cerevisiae Gal-1 promoter. Each of these variants was then transformed into a Saccharomyces cerevisiae strain carrying reporter construct SEQ ID NO: 41. After 48 hours, cells were scraped from the plates into minimal medium lacking leucine, with galactose as the carbon source. The A600 of each culture was adjusted to 2. Cultures were grown in galactose for 4 hours to induce transposase expression, then aliquots diluted 1,000x were plated on medium lacking leucine, uracil and tryptophan (to count transposition), aliquots diluted 1,000x were plated on medium lacking leucine and uracil (to count excision), and aliquots diluted 25,000x were plated on medium lacking leucine (to count total viable cells). After two days, colonies were counted to calculate the transposition frequency (= number of colonies on -leu -ura -trp medium divided by (25 x number of colonies on -leu medium)) and the excision frequency (= number of colonies on -leu -ura medium divided by (25 x number of colonies on -leu medium)). The results are shown in Table 7.
More than 60 variants of the Oryzias transposase (having the sequences given by SEQ ID NOs: 816-877) had excision or transposition activity at least 10% of that measured for the naturally occurring Oryzias transposase. Although these activities are lower than that of the naturally occurring transposase, these variants are still highly active and useful transposases for integrating an Oryzias transposon into the genome of a target eukaryotic cell. Relative to SEQ ID NO: 782, some of the Oryzias transposases whose activities are shown in Table 7 are highly active in excision. Exemplary Oryzias transposases that are highly active in excision comprise a sequence selected from SEQ ID NOs: 805, 815. These are functional non-natural Oryzias transposases.
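The frequency calculation described above can be written as a short Python sketch. The factor of 25 arises because the leucine-only plates were plated at a 25,000x dilution while the selective plates were plated at 1,000x. The function name `event_frequency` and the colony counts in the example are hypothetical and are not data from Table 7.

```python
def event_frequency(selective_colonies: int, viable_colonies: int,
                    selective_dilution: float = 1_000,
                    viable_dilution: float = 25_000) -> float:
    """Excision or transposition frequency per viable cell.

    selective_colonies : colonies on -leu -ura (excision) or
                         -leu -ura -trp (transposition) medium.
    viable_colonies    : colonies on -leu medium.
    The ratio of the two dilutions (25 by default) rescales the viable
    count to the dilution used for the selective plates.
    """
    if viable_colonies <= 0:
        raise ValueError("viable colony count must be positive")
    dilution_factor = viable_dilution / selective_dilution
    return selective_colonies / (dilution_factor * viable_colonies)


# Hypothetical colony counts, for illustration only.
excision_frequency = event_frequency(selective_colonies=50, viable_colonies=200)
transposition_frequency = event_frequency(selective_colonies=10, viable_colonies=200)
print(excision_frequency, transposition_frequency)
```

Because every transposition event requires a prior excision, the transposition count from the same culture would not be expected to exceed the excision count.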

The effect of sequence changes on excision and transposition frequencies was modeled as described in U.S. Pat. No. 8,635,029 and Liao et al. (2007, BMC Biotechnology 7:16, doi:10.1186/1472-6750-7-16, "Engineering proteinase K using machine learning and synthetic genes"). The mean and standard deviation of the regression weights for each substitution were calculated and are shown in Table 8. The effect of a single substitution on transposase activity may vary depending on the context (i.e., the other substitutions present). A positive mean regression weight indicates that, averaged over all the different sequence contexts in which it was tested, the substitution has a positive effect on the measured property. Incorporation of substitutions with positive mean regression weights into a sequence generally results in variants with improved activity (Liao et al., supra). The standard deviation of the regression weights is a further measure of the context-dependent variability of the effect of a substitution. A substitution has a positive effect in most contexts if its mean regression weight minus the standard deviation of its regression weights is zero or greater. For thirty-one of the sixty substitutions that we selected by looking for changes toward the consensus of other active piggyBac-like transposases, the mean regression weight minus the standard deviation of the regression weights for excision or transposition is zero or greater: E22D, A124C, Q131D, L138V, D160E, Y164F, I167L, A171T, R175K, T202R, I206L, I210L, N214D, V253I, V258L, I281F, A284L, V386I, M400L, S408E, L409I, F455Y, V458L, V467I, L468I, A514R, V515I, D549K, D550R and S551R (columns F and I of Table 8).
For thirty-six of the substitutions we selected by looking for changes toward the consensus of other active piggyBac-like transposases, the mean regression weight was zero or greater: E22D, A124C, Q131D, L138V, F149R, L156T, D160E, Y164F, I167L, A171T, R175K, K177N, T202R, I206L, I210L, N214D, V253I, V258L, I281F, A284L, L361I, V386I, M400L, S408E, L409I, F455Y, V458L, V467I, L468I, A514R, V515I, S524P, R548K, D549K, D550R and S551R. In addition to identifying specific substitutions that have beneficial effects, this also indicates positions where analogous substitutions may be beneficial. Analogous substitutions are substitutions in which the properties of the amino acid are retained. For example: glycine and alanine are in the "small" amino acid group; valine, leucine, isoleucine and methionine are in the "hydrophobic" amino acid group; phenylalanine, tyrosine and tryptophan are in the "aromatic" amino acid group; aspartate and glutamate are in the "acidic" amino acid group; asparagine and glutamine are in the "amide" amino acid group; histidine, lysine and arginine are in the "basic" amino acid group; cysteine, serine and threonine are in the "nucleophilic" amino acid group. If a substitution at an amino acid position within the Oryzias transposase is beneficial for excision or transposition activity, other substitutions from the same amino acid group may also be beneficial at the same position. For example, since substitution of the nucleophilic residue serine at position 408 with the acidic residue glutamate (S408E) is beneficial, substitution with the acidic residue aspartate (i.e., S408D) may also be beneficial. Similarly, since substitution of the hydrophobic residue valine at position 258 with the hydrophobic residue leucine (V258L) is beneficial, substitution with the hydrophobic residue isoleucine or methionine (i.e., V258I or V258M) may also be beneficial.
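As a minimal sketch of how analogous substitutions could be enumerated from the amino acid groups listed above, the following Python uses the one-letter codes for each group. The function name and the parsing of labels such as "S408E" are illustrative assumptions; the sketch only lists candidates and says nothing about whether a given candidate is actually beneficial.

```python
import re

# Amino acid groups as listed in the text, in one-letter code.
AMINO_ACID_GROUPS = {
    "small": "GA",
    "hydrophobic": "VLIM",
    "aromatic": "FYW",
    "acidic": "DE",
    "amide": "NQ",
    "basic": "HKR",
    "nucleophilic": "CST",
}


def analogous_substitutions(substitution):
    """List substitutions at the same position drawn from the group of the replacement residue."""
    match = re.fullmatch(r"([A-Z])(\d+)([A-Z])", substitution)
    if match is None:
        raise ValueError(f"not a substitution label: {substitution!r}")
    native, position, replacement = match.groups()
    for members in AMINO_ACID_GROUPS.values():
        if replacement in members:
            # Skip the native residue and the substitution already tested.
            return [f"{native}{position}{aa}" for aa in members
                    if aa not in (native, replacement)]
    return []  # replacement residue is not in any listed group


print(analogous_substitutions("S408E"))
print(analogous_substitutions("V258L"))
```

For the two examples given in the text, the sketch yields S408D for S408E, and V258I and V258M for V258L.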
An advantageous highly active Oryzias transposase comprises, relative to SEQ ID NO:782, an amino acid substitution at a position selected from amino acids 22, 124, 131, 138, 160, 164, 167, 171, 175, 202, 206, 210, 214, 253, 258, 281, 284, 386, 400, 408, 409, 455, 458, 467, 468, 514, 515, 548, 549, 550 and 551, for example one or more amino acid substitutions selected from E22D, A124C, Q131D, L138V, D160E, Y164F, I167L, A171T, R175K, T202R, I206L, I210L, N214D, V253I, V258L, I281F, A284L, V386I, M400L, S408E, L409I, F455Y, V458L, V467I, L468I, A514R, V515I, R548K, D549K, D550R and S551R, or an analogous amino acid substitution at one or more of these positions.

Table 8 also shows that some substitutions have positive regression weights for excision but much lower positive, or even negative, regression weights for transposition. These include the amino acid substitutions L156T, Y164F, I167L, A171T, R175K, K177N, A284L and F455Y. Such substitutions can be combined to engineer an Oryzias transposase that is more active in excision than in transposition. An advantageous Oryzias transposase with high excision activity comprises, relative to SEQ ID NO:782, an amino acid substitution at a position selected from amino acids 156, 164, 167, 171, 175, 177, 284 and 455, for example one or more amino acid substitutions selected from L156T, Y164F, I167L, A171T, R175K, K177N, A284L and F455Y, or an analogous substitution at one of these positions.

6.1.6.2 second group of Oryzias transposase variants

Substitutions that have been tested multiple times in the context of different combinations of other substitutions, and that have "positive regression coefficients, weights, or other values describing their relative or absolute contribution to one or more activities", are usefully incorporated into proteins to obtain proteins that have "improvements in respect of one or more attributes, activities, or functions of interest", as described in Liao et al. (2007, BMC Biotechnology 7:16, doi:10.1186/1472-6750-7-16, "Engineering proteinase K using machine learning and synthetic genes") and in sections 5.4.2 and 5.4.3 of U.S. Pat. No. 8,635,029. Based on the substitution weights shown in Table 8, we designed a set of open reading frames encoding 31 new variants (sequences given by SEQ ID NO: 878-908) that combine some of the most positive substitutions (L156T, Y164F, I167L, R175K, K177N, I210L, V258L, A284L, V386I, L409I, F455Y, V458L, A465S, A514R and D550R). In the set of 31 variants, each substitution is represented at least 5 times, and the number of pairwise combinations of substitutions is maximized, so that each substitution is tested in as many different sequence contexts as possible. Each variant open reading frame was cloned into a vector comprising a leucine selectable marker; each gene encoding a transposase variant was operably linked to the Saccharomyces cerevisiae Gal-1 promoter. Each of these variants was then transformed into a Saccharomyces cerevisiae strain comprising SEQ ID NO:41 and plated on medium lacking leucine. After 48 hours, cells were scraped from the plates into minimal medium lacking leucine, with galactose as the carbon source. The A600 of each culture was adjusted to 2.
Cultures were grown in galactose for 4 hours to induce transposase expression. Then 25,000x diluted aliquots were plated on medium lacking leucine, uracil and tryptophan (to measure transposition), 1,000x diluted aliquots were plated on medium lacking leucine and uracil (to measure excision), and 25,000x diluted aliquots were plated on medium lacking leucine (to measure the total number of viable cells). After two days, colonies were counted to calculate the transposition frequency (= number of colonies on leu-ura-trp medium divided by number of colonies on leu medium). The results are shown in Table 9.
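The design constraint stated for this variant set (each substitution represented at least 5 times, with broad coverage of pairwise combinations) can be checked for any candidate design with a short script. The sketch below is an assumption-laden illustration: the function names are hypothetical, and the four-variant toy design is invented for demonstration and is not the set of 31 variants given by SEQ ID NO: 878-908. It checks a design and does not attempt to generate or optimize one.

```python
from collections import Counter
from itertools import combinations


def coverage(variants):
    """Count how often each substitution and each pair of substitutions occurs."""
    single = Counter()
    pairs = Counter()
    for substitutions in variants:
        unique = sorted(set(substitutions))
        single.update(unique)
        pairs.update(combinations(unique, 2))
    return single, pairs


def meets_minimum(variants, substitutions, minimum=5):
    """True if every listed substitution appears in at least `minimum` variants."""
    single, _ = coverage(variants)
    return all(single[s] >= minimum for s in substitutions)


# Toy design for illustration only; NOT the variant set described in the text.
toy_design = [
    ["L156T", "Y164F"],
    ["L156T", "I167L"],
    ["Y164F", "I167L"],
    ["L156T", "Y164F", "I167L"],
]
single_counts, pair_counts = coverage(toy_design)
print(single_counts)
print(len(pair_counts), "distinct pairs observed")
print(meets_minimum(toy_design, ["L156T", "Y164F", "I167L"], minimum=3))
```

Counting distinct pairs gives a simple figure of merit for comparing candidate designs of the same size: a design that observes more distinct pairs tests each substitution in more sequence contexts.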

In addition to the activities of the 31 new Oryzias transposase variants, Table 9 also shows the activity of one variant from the first group, which is the most active variant in that group. The activity of the new set of variants was much higher than that of the first set. None of the variants were inactive, the lowest observed activity (for SEQ ID NO: 899) was that of SEQ ID NO:782, and several variants have higher transposition activity than the naturally occurring Oryzias transposase (SEQ ID NOs: 853, 885, 903 and 905). Preferred Oryzias transposases comprise amino acid substitutions selected from L156T, Y164F, I167L, R175K, K177N, I210L, V258L, A284L, V386I, L409I, F455Y, V458L, A465S, A514R and D550R, or analogous substitutions at the same positions.

Brief description of the tables

Table 1. Amino acid changes that may result in increased transposase activity.

Amino acid substitutions with the potential to improve transposase activity were identified as described in section 5.2.6. Column A shows the position in the Oryzias transposase (relative to SEQ ID NO: 782), column B shows the amino acid in the native protein, and column C shows the amino acids found at the equivalent position in an alignment of known active piggyBac-like transposases. Column D shows amino acid changes found in known active piggyBac-like transposases (other than the Oryzias transposase) at positions that are well conserved among the other transposases but where the amino acid in the Oryzias transposase sequence is an outlier. Mutations at these positions are particularly likely to result in increased transposase activity. More than one amino acid letter in a column indicates that each individual amino acid substitution is acceptable or beneficial; it is not intended to represent a peptide. For example, at position 2, any of the amino acids T, A, R, D or N is acceptable, so column C contains "TARDN" to indicate this.

Table 2. Excision and transposition of transposons in yeast.

Sources of transposons and transposases are listed in column A. Reporter plasmids were constructed as described in section 6.1.2 using the left end sequence whose SEQ ID NO is shown in column B and the right end sequence whose SEQ ID NO is shown in column C. The reporter plasmid has the insertion sequence whose SEQ ID NO is listed in column D. These reporter plasmids were integrated into the Ura3 gene of a Trp strain of Saccharomyces cerevisiae. The amino acid sequence whose SEQ ID NO is shown in column E was reverse translated, synthesized and cloned into a plasmid containing the Leu2 gene expressible in Saccharomyces cerevisiae and a 2 micron origin of replication. The transposase gene was operably linked to the Gal1 promoter. The plasmid containing the transposase was transformed into the reporter strain, expression was induced, and cells were plated as described in section 6.1.1. Induced cultures were diluted 25,000 fold, and then 100 μl was plated on leu dropout plates and 100 μl was plated on leu ura dropout plates or leu ura trp dropout plates. Column F shows the number of colonies on leu dropout plates; column G shows the number of colonies on leu ura dropout plates (indicating that the transposon was excised from the middle of the ura gene in the reporter); column H shows the number of colonies on leu ura trp dropout plates (indicating that the transposon was excised from the middle of the ura gene in the reporter and transposed to another site in the genome).

Table 3. Transposition of transposons into the CHO target cell genome.

Cells were transfected with transposon SEQ ID NO:108 as described in section 6.1.3. The transposase SEQ ID NO is shown in row 1. As indicated in row 2, the viability (percentage of cells surviving) and total viable cell density (in million cells per ml) for each transfection are shown in adjacent columns. Rows 3-17 show these measurements at various times post-transfection, with days elapsed shown in column A.

Table 4. Antibody production from transposons integrated into the genome of CHO target cells.

Cells were transfected with transposons and transposases as described in section 6.1.3. The recovery is shown in Table 3. During 14 days of fed-batch antibody production, the culture supernatants contained the indicated concentrations of antibody (antibody titers): column A shows the titer at day 7; column B shows the titer at day 10; column C shows the titer at day 12; column D shows the titer at day 14.

Table 5. Transposition of transposons into the genome of CHO target cells by transposases encoded by mRNA.

Cells were transfected with transposons and transposases encoded by mRNA as described in section 6.1.4. As indicated in row 3, viability (percentage of cells surviving) and total viable cell density (in million cells per ml) are shown in adjacent columns. Rows 1-12 show these measurements at various times post-transfection, with days elapsed shown in column A.

Table 6. Transposition of transposons having truncated terminal sequences into the genome of CHO target cells.

Cells were transfected with transposons and optionally with transposases encoded by mRNA as described in section 6.1.5. The transposon SEQ ID NO is shown in row 1. Each transposon comprises a left transposon end comprising a 5'-TTAA-3' integration target sequence immediately followed by a transposon ITR sequence having SEQ ID NO:9, immediately followed by the left end sequence whose SEQ ID NO is shown in row 2. The transposon further comprises SEQ ID NO:42: an open reading frame encoding a glutamine synthetase selectable marker operably linked to regulatory sequences expressible in a mammalian cell. The transposon further comprises a right transposon end comprising the right end sequence whose SEQ ID NO is shown in row 3, immediately followed by a transposon ITR sequence having SEQ ID NO:10, immediately followed by a 5'-TTAA-3' integration target sequence. Row 4 shows the SEQ ID NO of the transposase encoded by the transfected mRNA. Viability (percentage of cells surviving) is shown in the columns labeled "V%" in row 5, and total viable cell density (in million cells per milliliter) is shown in the columns labeled "VCD" in row 5. Rows 6-15 show these measurements at various times post-transfection, with days elapsed shown in column U.

Table 7. Transposition and excision activities of Oryzias transposase variants.

Genes encoding the Oryzias transposase variants were designed, synthesized, and cloned as described in section 6.1.6.1. The SEQ ID NO of each variant is given in column A. Each gene was transformed into a strain of Saccharomyces cerevisiae whose genome comprises the transposase reporter SEQ ID NO:41, and the cells were plated on medium lacking leucine. After 48 hours, cells were scraped from the plate into minimal medium lacking leucine, with galactose as the carbon source. The A600 of each culture was adjusted to 2. Cultures were grown in galactose for 4 hours to induce expression of the transposase. Cultures were diluted 1,000 fold into minimal medium lacking leucine. One 100 μl aliquot was plated on minimal medium agar plates lacking leucine and uracil (to measure transposon excision) and another 100 μl aliquot was plated on minimal medium agar plates lacking leucine, tryptophan and uracil (to measure transposon transposition). Each culture was also diluted 25,000 fold, and 100 μl aliquots were then plated on minimal medium agar plates lacking leucine (to measure viable cells). After 48 hours, colonies on each plate were counted. The number of colonies on plates lacking leucine is shown in column B, the number of colonies on plates lacking leucine and uracil is shown in column C, and the number of colonies on plates lacking leucine, uracil and tryptophan is shown in column D. Column E shows the excision frequency (calculated by dividing the number in column C by the number in column B, and then by 25). Column F shows the transposition frequency (calculated by dividing the number in column D by the number in column B, and then by 25).
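The division by 25 in columns E and F follows from the two dilutions used: excision and transposition plates received a 1,000-fold dilution, while the viable-cell plates received a 25,000-fold dilution, and 25,000 / 1,000 = 25. A minimal Python sketch of the dilution-corrected calculation is given below. The function name and the colony counts are hypothetical and are not data from Table 7; equal plated volumes (100 μl) are assumed for both plates.

```python
def event_frequency(event_colonies, viable_colonies,
                    event_dilution=1_000, viable_dilution=25_000):
    """Frequency of excision or transposition events per viable cell.

    Each colony count is scaled by its dilution factor so that both counts
    refer to the same volume of undiluted culture. Equal plated volumes
    are assumed.
    """
    if viable_colonies <= 0:
        raise ValueError("viable colony count must be positive")
    return (event_colonies * event_dilution) / (viable_colonies * viable_dilution)


# Hypothetical colony counts, not data from Table 7.
excision_frequency = event_frequency(event_colonies=50, viable_colonies=100)
print(excision_frequency)  # same value as 50 / 100 / 25
```

When both plates are made from the same dilution, as in the Table 9 protocol, passing equal dilution factors reduces the calculation to a plain ratio of colony counts.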

Table 8. Model weights for amino acid substitutions in Oryzias transposase variants.

The effect of sequence changes on the excision and transposition activity of the Oryzias transposase was modeled as described in U.S. Pat. No. 8,635,029. The mean and standard deviation of the regression weights were calculated for each substitution. The position (relative to SEQ ID NO: 782) is shown in column A, and the amino acid found at this position in SEQ ID NO:782 is shown in column B. The amino acid substitutions tested are shown in column C. The mean regression weight of the substitution for transposition activity is shown in column D, the standard deviation of the regression weight is shown in column E, and the mean weight minus the standard deviation is shown in column F. The mean regression weight of the substitution for excision activity is shown in column G, the standard deviation of the regression weight is shown in column H, and the mean weight minus the standard deviation is shown in column I.

Table 9. Transposition and excision activities of Oryzias transposase variants.

Genes encoding the Oryzias transposase variants were designed, synthesized, and cloned as described in section 6.1.6.2. The SEQ ID NO of each variant is given in column A. Each gene was transformed into a strain of Saccharomyces cerevisiae whose genome comprises the transposase reporter SEQ ID NO:41, and the cells were plated on medium lacking leucine. After 48 hours, cells were scraped from the plate into minimal medium lacking leucine, with galactose as the carbon source. The A600 of each culture was adjusted to 2. Cultures were grown in galactose for 4 hours to induce expression of the transposase. Cultures were diluted 25,000 fold into minimal medium lacking leucine. One 100 μl aliquot was plated on minimal medium agar plates lacking leucine and uracil (to measure transposon excision), another 100 μl aliquot was plated on minimal medium agar plates lacking leucine, tryptophan and uracil (to measure transposon transposition), and a third 100 μl aliquot was plated on minimal medium agar plates lacking leucine (to measure viable cells). After 48 hours, colonies on each plate were counted. The number of colonies on plates lacking leucine is shown in column B, the number of colonies on plates lacking leucine and uracil is shown in column C, and the number of colonies on plates lacking leucine, uracil and tryptophan is shown in column D. Column E shows the excision frequency (calculated by dividing the number in column C by the number in column B). Column F shows the transposition frequency (calculated by dividing the number in column D by the number in column B).

Tables

Tables 1 to 9 (table contents are not reproduced in this text).

References

All references cited herein are incorporated by reference in their entirety for all purposes to the same extent as if each individual publication or patent application was specifically and individually indicated to be incorporated by reference in its entirety for all purposes. To the extent that information associated with a citation may vary over time, it is intended to refer to a version that is valid at the time of the filing date of the present application or of the priority application that first mentioned the citation.

As will be apparent to those skilled in the art, many modifications and variations of this invention can be made without departing from its spirit and scope. The particular embodiments described herein are illustrative only, and the invention is to be limited only by the terms of the appended claims, along with the full scope of equivalents to which such claims are entitled. Any embodiment, aspect, element, feature, or step can be used in combination with any other embodiment, aspect, element, feature, or step, unless otherwise apparent from the context.
