Multiple guide RNAs

Document No.: 1871905. Publication date: 2021-11-23.

Abstract: Multiple guide RNAs, by S. Tsai and K. J. Joung, created 2014-09-18. The present application relates to multiple guide RNAs. Described are methods and constructs for multiplexed expression of highly active CRISPR guide RNAs (gRNAs) from RNA polymerase II and III promoters, optionally in mammalian cells. The present invention is based, at least in part, on the discovery that Csy4, an endoribonuclease that recognizes a short RNA hairpin sequence, can be used to excise multiple functional gRNAs encoded on a single longer RNA transcript (produced from an RNA Pol II or III promoter) in which the individual gRNAs are separated by Csy4 cleavage sites.

1. A deoxyribonucleic acid (DNA) molecule, comprising:

a plurality of sequences encoding crRNAs, wherein each crRNA is flanked by at least one Csy4 cleavage sequence consisting of sequence GTTCACTGCCGTATAGGCAG (SEQ ID NO: 1);

and sequences encoding tracrRNA.

2. The DNA molecule of claim 1, operably linked to a promoter sequence.

3. The DNA molecule of claim 1 or 2, comprising two, three, or more crRNA sequences.

4. The DNA molecule of claim 2, wherein the promoter sequence is an RNA polymerase II (PolII) promoter or a PolIII promoter.

5. The DNA molecule of claim 2, wherein the promoter sequence is an RNA Pol II promoter.

6. The DNA molecule of claim 5, wherein the Pol II promoter is selected from the group consisting of: the CAG, EF1A, CAGGS, PGK, UbiC, CMV, B29, desmin, endoglin, FLT-1, GFPA, and SYN1 promoters.

7. A DNA molecule comprising a promoter sequence linked to one, two, three or more cassettes, the cassettes comprising: a sequence encoding crRNA linked to a Csy4 cleavage site, the Csy4 cleavage site consisting of GTTCACTGCCGTATAGGCAG (SEQ ID NO: 1); and sequences encoding tracrRNA.

8. The DNA molecule of claim 7, comprising:

a Pol II promoter operably linked to a first sequence encoding a first crRNA, the first sequence linked to a Csy4 cleavage site, the Csy4 cleavage site linked to a second sequence encoding a second crRNA, the second sequence linked to a Csy4 cleavage site, the Csy4 cleavage site linked to a third sequence encoding a third crRNA, the third sequence linked to a Csy4 cleavage site.

9. The DNA molecule of claim 7, comprising a crRNA sequence of about 100 nt.

10. The DNA molecule of claim 1 or 7, further comprising a sequence encoding a functional Csy4 enzyme.

11. The DNA molecule of claim 1 or claim 7, wherein the crRNA sequence comprises: (X17-20)GTTTTAGAGCTAGAAA (SEQ ID NO: 15); and the tracrRNA sequence comprises: TAGCAAGTTAAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGC (SEQ ID NO: 16);

wherein X17-20 is a sequence complementary to the complementary strand of 17-20 contiguous nucleotides of the target sequence, preferably a target sequence immediately 5' of a protospacer adjacent motif (PAM).

12. The DNA molecule of claim 1 or claim 7, comprising one or more U at the 3' end of the molecule.

13. An RNA molecule encoded by the DNA molecule of any one of claims 1-12.

14. A vector comprising the DNA molecule of any one of claims 1-12.

15. The vector of claim 14, further comprising a sequence encoding a functional Csy4 enzyme.

16. A method of producing a plurality of crRNAs in a cell, the method comprising contacting the cell in vitro with the RNA molecule of claim 13.

17. The method of claim 16, wherein the cell is a mammalian cell and the cell further expresses an exogenous functional Csy4 enzyme sequence.

18. A method of altering the expression of multiple target genes in a cell, the method comprising contacting the cell in vitro with the DNA molecule of any one of claims 1-12 or the RNA molecule of claim 13, wherein each crRNA comprises a sequence complementary to at least 17-20nt of a target gene.

19. A method of altering expression of a target gene in a cell, the method comprising contacting the cell in vitro with the DNA molecule of any one of claims 1-12 or the RNA molecule of claim 13, wherein each crRNA comprises a sequence complementary to at least 17-20nt of the target gene.

20. Use of a DNA molecule according to any one of claims 1 to 12 or an RNA molecule according to claim 13 in the manufacture of a medicament for altering expression of a plurality of target genes in a cell, wherein each crRNA comprises a sequence complementary to at least 17-20 nt of a target gene.

21. Use of a DNA molecule according to any one of claims 1 to 12 or an RNA molecule according to claim 13 in the manufacture of a medicament for altering expression of a target gene in a cell, wherein each crRNA comprises a sequence complementary to at least 17-20 nt of the target gene.

Technical Field

Described are methods and constructs for the multiplexed expression of highly active CRISPR guide RNAs (gRNAs) from RNA polymerase II and III promoters, optionally in mammalian cells.

Background

The Cas9 nuclease forms the basis of a programmable RNA-guided clustered regularly interspaced short palindromic repeats (CRISPR)/CRISPR-associated (Cas) system (Wiedenheft et al., Nature 482, 331-338 (2012); Horvath et al., Science 327, 167-170 (2010); Terns et al., Curr Opin Microbiol 14, 321-327 (2011)), which can be used to generate site-specific breaks in target DNA sequences in vitro, in mammalian cells, and in live model organisms such as zebrafish (Wang et al., Cell 153, 910-918 (2013); Shen et al., Cell Res (2013); DiCarlo et al., Nucleic Acids Res (2013); Jiang et al., Nat Biotechnol 31, 233-239 (2013); Jinek et al., Elife 2, e00471 (2013); Hwang et al., Nat Biotechnol 31, 227-229 (2013); Cong et al., Science 339, 819-823 (2013); Mali et al., Science 339, 823-826 (2013c); Cho et al., Nat Biotechnol 31, 230-232 (2013); Gratz et al., Genetics 194(4):1029-35 (2013)). A short guide RNA (gRNA) of approximately 100 nt complexes with Cas9 and directs the nuclease to a specific target DNA site. Targeting is mediated by a sequence of at least 17-20 nucleotides (nt) at the 5' end of the gRNA. These first 17-20 nucleotides of the engineered gRNA are designed to base-pair with the complementary strand of a target genomic DNA sequence of interest that lies next to a protospacer adjacent motif (PAM), e.g., a PAM matching the sequence NGG or NAG (Shen et al., Cell Res (2013); DiCarlo et al., Nucleic Acids Res (2013); Jiang et al., Nat Biotechnol 31, 233-239 (2013); Jinek et al., Elife 2, e00471 (2013); Hwang et al., Nat Biotechnol 31, 227-229 (2013); Cong et al., Science 339, 819-823 (2013); Mali et al., Science 339, 823-826 (2013c); Cho et al., Nat Biotechnol 31, 230-232 (2013); Jinek et al., Science 337, 816-821 (2012)).

gRNAs can also direct catalytically inactivated Cas9 protein (called dCas9; see Jinek et al., Science 337, 816-821 (2012)), which in turn can be fused to an effector domain (e.g., a transcriptional activation domain); see, e.g., USSN 61/799,647, filed 3/15/2013, and 61/838,148, filed 6/21/2013, both of which are incorporated herein by reference. The latter system enables RNA-guided recruitment of heterologous effector domains to a genomic locus of interest.

SUMMARY

The present invention is based, at least in part, on the discovery that Csy4, an endoribonuclease that recognizes a short RNA hairpin sequence, can be used to excise multiple functional gRNAs encoded on a single longer RNA transcript (produced from an RNA Pol II or III promoter) in which the individual gRNAs are separated by Csy4 cleavage sites.

Thus, in a first aspect, the invention provides a deoxyribonucleic acid (DNA) molecule comprising a plurality of sequences encoding guide RNAs (gRNAs), wherein each gRNA is flanked by at least one Csy4 cleavage sequence, the Csy4 cleavage sequence comprising or consisting of sequence GTTCACTGCCGTATAGGCAG (SEQ ID NO:1) or GTTCACTGCCGTATAGGCAGCTAAGAAA (SEQ ID NO:2).

In some embodiments, the DNA molecule is operably linked to a promoter sequence.

In some embodiments, the DNA molecule includes two, three, or more gRNA sequences, each flanked by at least one Csy4 cleavage sequence.

In some embodiments, the promoter sequence is an RNA polymerase II (Pol II) promoter or a Pol III promoter, preferably an RNA Pol II promoter. In some embodiments, the Pol II promoter is selected from the group consisting of: the CAG, EF1A, CAGGS, PGK, UbiC, CMV, B29, desmin, endoglin, FLT-1, GFPA, and SYN1 promoters.

In another aspect, the invention provides a DNA molecule comprising a promoter sequence linked to one, two, three or more cassettes, each cassette comprising: a sequence encoding a guide RNA, i.e., a sequence of about 100 nt, e.g., 95-300 nt, e.g., 95-105 nt for an S. pyogenes-based system, linked to a Csy4 cleavage site, e.g., SEQ ID NO:1 or 2.

In some embodiments, the DNA molecule comprises a Pol II promoter operably linked to a first sequence encoding a first guide RNA linked to a Csy4 cleavage site, the Csy4 cleavage site linked to a second sequence encoding a second guide RNA linked to a Csy4 cleavage site, the Csy4 cleavage site linked to a third sequence encoding a third guide RNA linked to a Csy4 cleavage site. In some embodiments, additional guide RNAs linked to Csy4 cleavage sites are included. For example, the DNA molecule may have any of the following structures:

promoter-C4-gRNA-C4-gRNA-C4-gRNA-C4

promoter-C4-gRNA-C4-gRNA-C4-gRNA-C4-gRNA-C4

promoter-C4-gRNA-C4-gRNA-C4-gRNA-C4-gRNA-C4-gRNA-C4

and so on. In this illustration, C4 is a sequence encoding a Csy4 RNA cleavage site and gRNA is a sequence encoding a guide RNA.
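The promoter-C4-gRNA-C4 layout can be illustrated with a minimal Python sketch. The function names `layout` and `build_cassette` are hypothetical and not part of the disclosure; the sketch simply concatenates strings in the stated order, using the truncated Csy4 site (SEQ ID NO:1) as the default separator, and makes no claim about how such constructs are actually cloned.

```python
# Truncated 20-nt Csy4 cleavage sequence (SEQ ID NO:1), in DNA notation.
CSY4_SITE = "GTTCACTGCCGTATAGGCAG"


def layout(n_grnas: int) -> str:
    """Return the schematic layout string for a cassette with n_grnas guides."""
    if n_grnas < 1:
        raise ValueError("at least one gRNA is required")
    return "-".join(["promoter", "C4"] + ["gRNA", "C4"] * n_grnas)


def build_cassette(promoter: str, grnas: list[str], site: str = CSY4_SITE) -> str:
    """Concatenate promoter, a leading Csy4 site, and each gRNA followed by a site."""
    if not grnas:
        raise ValueError("at least one gRNA-encoding sequence is required")
    parts = [promoter, site]
    for grna in grnas:
        parts.append(grna)  # sequence encoding one guide RNA
        parts.append(site)  # every gRNA is followed by a Csy4 site
    return "".join(parts)


if __name__ == "__main__":
    print(layout(3))
```

With n guide sequences the assembled string contains n + 1 copies of the Csy4 site, so every gRNA is flanked on both sides.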

In some embodiments, Cas9 sgRNA includes the following sequence:

(X17-20)GUUUUAGAGCUAUGCUGUUUUG(XN)(SEQ ID NO:4);

(X17-20)GUUUUAGAGCUA (SEQ ID NO:5);

(X17-20)GUUUUAGAGCUAUGCUGUUUUG(SEQ ID NO:6);

(X17-20)GUUUUAGAGCUAUGCU(SEQ ID NO:7);

(X17-20)GUUUUAGAGCUAGAAAUAGCAAGUUAAAAUAAGGCUAGUCCG(XN)(SEQ ID NO:8);

(X17-20)GUUUUAGAGCUAUGCUGAAAAGCAUAGCAAGUUAAAAUAAGGCUAGUCCGUUAUC(XN)(SEQ ID NO:9);

(X17-20)GUUUUAGAGCUAUGCUGUUUUGGAAACAAAACAGCAUAGCAAGUUAAAAUAAGGCUAGUCCGUUAUC(XN)(SEQ ID NO:10);

(X17-20)GUUUUAGAGCUAGAAAUAGCAAGUUAAAAUAAGGCUAGUCCGUUAUCAACUUGAAAAAGUGGCACCGAGUCGGUGC(XN)(SEQ ID NO:11),

(X17-20)GUUUAAGAGCUAGAAAUAGCAAGUUUAAAUAAGGCUAGUCCGUUAUCAACUUGAAAAAGUGGCACCGAGUCGGUGC(SEQ ID NO:12);

(X17-20) GUUUUAGAGCUAUGCUGGAAACAGCAUAGCAAGUUUAAAUAAGGCUAGUCCGUUAUCAACUUGAAAAAGUGGCACCGAGUCGGUGC (SEQ ID NO: 13); or

(X17-20)GUUUAAGAGCUAUGCUGGAAACAGCAUAGCAAGUUUAAAUAAGGCUAGUCCGUUAUCAACUUGAAAAAGUGGCACCGAGUCGGUGC(SEQ ID NO:14),

wherein X17-20 is a sequence complementary to the complementary strand of 17-20 contiguous nucleotides of the target sequence, preferably a target sequence immediately 5' of a protospacer adjacent motif (PAM), and XN is any sequence that does not interfere with binding of the ribonucleic acid to Cas9. Although X17-20 is exemplified herein with the Streptococcus pyogenes Cas9 system, longer sequences may also be used, e.g., as appropriate for other systems.

In some embodiments, the DNA molecule further comprises a sequence encoding a functional Csy4 enzyme.

Also provided herein are vectors comprising the DNA molecules described herein, e.g., vectors optionally comprising sequences encoding a functional Csy4 enzyme. Also provided herein are the multiplex transcripts produced from these DNA molecules, e.g., intact RNAs that have not yet been cleaved by Csy4.

In yet another aspect, provided herein are methods for producing a plurality of guide RNAs in a cell. These methods comprise expressing a DNA molecule described herein in a cell.

In some embodiments, the cell is a mammalian cell and the cell further expresses an exogenous functional Csy4 enzyme sequence, or the method further comprises administering a functional Csy4 enzyme or a nucleic acid encoding a functional Csy4 enzyme.

In another aspect, the invention provides methods for altering expression of one or more target genes in a cell. These methods include expressing a DNA molecule as described herein, e.g., a DNA molecule comprising a plurality of sequences encoding guide RNAs (gRNAs), wherein each gRNA comprises a variable sequence complementary to at least 17-20 nt of one or more target genes, and each gRNA is flanked by at least one Csy4 cleavage sequence that comprises or consists of sequence GTTCACTGCCGTATAGGCAG (SEQ ID NO:1) or GTTCACTGCCGTATAGGCAGCTAAGAAA (SEQ ID NO:2).

In the methods and compositions of the invention, the gRNA may be a single guide RNA comprising a fused tracrRNA and crRNA, as described herein, or may comprise only a crRNA, and the tracrRNA may be expressed from the same or different DNA molecules. Thus, in some embodiments, the DNA molecules described herein further comprise a sequence encoding a tracrRNA. In some embodiments, the methods comprise expressing the separate tracrRNA in a cell, e.g., contacting the cell with a vector or DNA molecule that expresses the tracrRNA.

Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this invention belongs. Methods and materials for use in the present invention are described herein; other suitable methods and materials known in the art may also be used. These materials, methods, and examples are illustrative only and not intended to be limiting. All publications, patent applications, patents, sequences, database entries, and other references mentioned herein are incorporated by reference in their entirety. In case of conflict, the present specification, including definitions, will control.

Other features and advantages of the invention will be apparent from the following detailed description and drawings, and from the claims.

Drawings

Fig. 1 is a schematic illustration of the constructs used in the initial multiplex experiments:

1+2: direct-repeat crRNA array and Cas9, with separate tracrRNA

3+4: short crRNA array separated by Csy4 sites, with Csy4, Cas9, and separate tracrRNA

5+6: full-length chimeric gRNAs separated by Csy4 sites

7: NLS-FLAG-tagged Cas9

Fig. 2 is a bar graph showing experimental results in cells expressing the constructs shown in Fig. 1. The Csy4 site plus full-length gRNA (constructs 5 and 6) was the most efficient multiplex framework.

Fig. 3 is a schematic overview and comparison of exemplary standard and multiplex Csy4-based gRNA frameworks, and the transcripts they produce. Note that Csy4 enables the use of an RNA Pol II promoter (e.g., CAG) as an alternative to U6, an RNA Pol III promoter.

Fig. 4 is a bar graph showing that Csy4 cleavage at a truncated recognition site produces gRNAs with higher activity in human cells. Processing at the truncated site also leaves a clean 5' end, effectively removing the 5' G restriction that the U6 promoter imposes on gRNA target sequences.

Figs. 5A-C are sequences showing evidence of 2-target multiplex editing in single human cells. Individual deletions were observed at intended site 2 or site 3. Multiple deletions on the same sequence were observed for sites 2 and 3. Deletions spanning sites 2 and 3 were also observed.

Fig. 6 is a schematic showing successful multiplexed expression of three gRNAs using a Csy4-based system.

Fig. 7 is a graph showing that gRNAs excised by Csy4 from RNA Pol II-transcribed mRNA can efficiently recruit Cas9 nuclease to specific targets in human cells. In these experiments, the gRNAs were expressed in longer mRNA transcripts made from the RNA Pol II CAG promoter.

Detailed description of the invention

One potential advantage of the Cas9 system is the ability to recruit nuclease activity or heterologous effector domains to more than one genomic locus or target site in a cell. However, such multiplex applications require the ability to efficiently express more than one gRNA in a cell. For mammalian cells, RNA polymerase III promoters (e.g., the U6 promoter) have been used to express single short gRNAs. Previous attempts to express multiple gRNA components from a single transcript in human cells have not proven effective.

An additional desirable capability for the Cas9 system is to generate inducible versions of the components and to achieve tissue-specific expression of the components. Inducible and/or tissue-specific RNA polymerase II promoters have been described previously. However, while Cas9 or dCas9 proteins can be expressed from such RNA Pol II promoters, short, defined gRNAs cannot be expressed in this manner because the start and stop sites of transcription from RNA Pol II are imprecise. Indeed, to date all gRNAs have been expressed from RNA polymerase III promoters, which are ideally suited for the expression of short RNAs.

As demonstrated herein, Csy4, an endoribonuclease that recognizes a short RNA hairpin sequence, can be used to excise multiple functional gRNAs encoded on a single longer RNA transcript (produced from an RNA Pol II or III promoter) in which the individual gRNAs are separated by Csy4 cleavage sites. Functional gRNAs can be successfully cleaved from longer RNA transcripts expressed from an RNA Pol II promoter.

gRNA/Csy4 Multiplex Cassettes

Thus, described herein are DNA molecules that encode longer RNA transcripts, referred to herein as multiplex cassettes, that include two or more gRNA sequences, wherein each gRNA is flanked by Csy4 cleavage sequences. The DNA molecule may also include a promoter, and may optionally include one or more other transcriptional regulatory sequences, such as enhancer, silencer, insulator, and poly(A) sequences. See, e.g., Xu et al., Gene. 2001 Jul 11; 272(1-2):149-56.

Promoters

A number of promoters are known in the art that can be used in the methods of the invention. In some embodiments, the promoter is a Pol II or Pol III promoter, preferably a Pol II promoter. Various Pol II promoters have been described and may be used in the compositions and methods of the invention, including the CAG promoter (see, e.g., Alexopoulou et al., BMC Cell Biology 9:2, 2008; Miyazaki et al., Gene 79(2):269-77 (1989); Niwa et al., Gene 108(2):193-9 (1991)); additional promoters include the EF1A, CAGGS, PGK, UbiC, and CMV promoters, as well as tissue-specific promoters such as B29, desmin, endoglin, FLT-1, GFPA, and SYN1, among others; the sequences of various promoters are known in the art. For example, the CMV and PGK promoters can be amplified from pSicoR and pSicoR PGK, respectively (Ventura et al., Proc Natl Acad Sci USA 101:10380-10385 (2004)), the UbiC promoter can be amplified from pDSL_hpUGIH (ATCC), the CAGGS promoter can be amplified from pCAGGS (BCCM), and the EF1A promoter can be amplified from the pEF6 vector (Invitrogen). Pol II core promoters are described in Butler and Kadonaga, Genes & Dev. 16:2583-2592 (2002). Excision of the gRNA from a larger Pol II-driven transcript makes it possible to produce a gRNA having any nucleotide at the 5'-most position (standard expression from U6 or other RNA polymerase III promoters places a restriction on the identity of this nucleotide).

In some embodiments, a tissue-specific promoter is used, and a short, defined gRNA sequence can be processed out of the RNA Pol II transcript.

A number of Pol III promoters are known in the art, including the U6 small nuclear (sn) RNA promoter, the 7SK promoter, and the H1 promoter. See, e.g., Ro et al., BioTechniques 38(4):625-627 (2005).

Guide RNA

By using a single gRNA carrying at least 17-20 nt at its 5' end that are complementary to the complementary strand of the genomic DNA target site, Cas9 nuclease can be directed to a specific genomic target of at least 17-20 nt that carries an additional proximal protospacer adjacent motif (PAM), e.g., of the sequence NGG.

Thus, the compositions described herein may include a sequence encoding a single guide RNA (sgRNA) comprising a crRNA fused to a normally trans-encoded tracrRNA, e.g., a single Cas9 guide RNA as described in Mali et al., Science 2013 Feb 15; 339(6121):823-6, wherein the sequence at the 5' end is complementary to 17-20 nucleotides (nt) of the target sequence immediately 5' of a protospacer adjacent motif (PAM), such as NGG, NAG, or NNGG.
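The targeting rule above (a 17-20 nt stretch immediately 5' of a PAM such as NGG) can be illustrated with a short Python sketch. The function name `find_targets` is hypothetical, only the NGG PAM and only the given strand are scanned, and no scoring or off-target analysis is attempted; it is a toy illustration rather than a guide-design tool.

```python
import re


def find_targets(dna: str, length: int = 20) -> list[tuple[int, str, str]]:
    """Return (start, protospacer, PAM) for each site where `length` nt precede NGG.

    Only the strand given in `dna` is scanned (assumption of this sketch).
    """
    if not 17 <= length <= 20:
        raise ValueError("this sketch only handles protospacers of 17-20 nt")
    dna = dna.upper()
    # The lookahead lets overlapping candidate sites all be reported.
    pattern = re.compile(r"(?=([ACGT]{%d})([ACGT]GG))" % length)
    return [(m.start(), m.group(1), m.group(2)) for m in pattern.finditer(dna)]


if __name__ == "__main__":
    example = "ACGTACGTACGTACGTACGTAGGTT"  # made-up sequence for illustration
    for start, protospacer, pam in find_targets(example):
        print(start, protospacer, pam)
```

Because the guide is complementary to the complementary strand of the target, the spacer of the gRNA has the same sequence as the reported protospacer, with U in place of T.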

Methods of designing and expressing guide RNAs are known in the art. Guide RNAs generally come in two different systems: 1) system 1 uses separate crRNAs and tracrRNAs that function together to guide cleavage by Cas9; and 2) system 2 uses a chimeric crRNA-tracrRNA hybrid that combines the two separate guide RNAs in a single system (Jinek et al. 2012). The tracrRNA can be variably truncated, and a range of lengths has been shown to function in both the separate system (system 1) and the chimeric gRNA system (system 2). See, e.g., Jinek et al., Science 2012; 337:816-821; Mali et al., Science 2013 Feb 15; 339(6121):823-6; Cong et al., Science 2013 Feb 15; 339(6121):819-23; Hwang and Fu et al., Nat Biotechnol. 2013 Mar; 31(3):227-9; and Jinek et al., Elife 2, e00471 (2013). For system 2, longer chimeric gRNAs have generally shown greater on-target activity; however, the relative specificities of gRNAs of different lengths currently remain undefined, and in certain instances it may therefore be desirable to use shorter gRNAs. In some embodiments, the gRNAs are complementary to a region that is within about 100-800 bp upstream of the transcription start site, e.g., within about 500 bp upstream of the transcription start site, that includes the transcription start site, or that is within about 100-800 bp, e.g., within about 500 bp, downstream of the transcription start site. In some embodiments, vectors (e.g., plasmids) encoding more than one gRNA are used, e.g., plasmids encoding 2, 3, 4, 5, or more gRNAs directed to different sites in the same region of the target gene. Additional guide RNAs, and methods of increasing the specificity of genome editing, are described in provisional patent application serial No. 61/838,178, entitled INCREASING SPECIFICITY FOR RNA-GUIDED GENOME EDITING.

In some embodiments, the gRNA comprises or consists of:

(X17-20)GUUUUAGAGCUAUGCUGUUUUG(XN)(SEQ ID NO:4);

(X17-20)GUUUUAGAGCUA (SEQ ID NO:5);

(X17-20)GUUUUAGAGCUAUGCUGUUUUG(SEQ ID NO:6);

(X17-20)GUUUUAGAGCUAUGCU(SEQ ID NO:7);

(X17-20)GUUUUAGAGCUAGAAAUAGCAAGUUAAAAUAAGGCUAGUCCG(XN)(SEQ ID NO:8);

(X17-20)GUUUUAGAGCUAUGCUGAAAAGCAUAGCAAGUUAAAAUAAGGCUAGUCCGUUAUC(XN)(SEQ ID NO:9);

(X17-20)GUUUUAGAGCUAUGCUGUUUUGGAAACAAAACAGCAUAGCAAGUUAAAAUAAGGCUAGUCCGUUAUC(XN)(SEQ ID NO:10);

(X17-20)GUUUUAGAGCUAGAAAUAGCAAGUUAAAAUAAGGCUAGUCCGUUAUCAACUUGAAAAAGUGGCACCGAGUCGGUGC(XN)(SEQ ID NO:11),

(X17-20)GUUUAAGAGCUAGAAAUAGCAAGUUUAAAUAAGGCUAGUCCGUUAUCAACUUGAAAAAGUGGCACCGAGUCGGUGC(SEQ ID NO:12);

(X17-20) GUUUUAGAGCUAUGCUGGAAACAGCAUAGCAAGUUUAAAUAAGGCUAGUCCGUUAUCAACUUGAAAAAGUGGCACCGAGUCGGUGC (SEQ ID NO: 13); or

(X17-20)GUUUAAGAGCUAUGCUGGAAACAGCAUAGCAAGUUUAAAUAAGGCUAGUCCGUUAUCAACUUGAAAAAGUGGCACCGAGUCGGUGC(SEQ ID NO:14),

wherein X17-20 is a sequence complementary to the complementary strand of at least 17-20 contiguous nucleotides of the target sequence (although in some embodiments this region of complementarity may be longer than 20 nt, e.g., 21, 22, 23, 24, 25, 26, 27, 28, 29, or 30 or more nt, e.g., 17-30 nt), preferably a target sequence immediately 5' of a protospacer adjacent motif (PAM), e.g., NGG, NAG, or NNGG. XN is any sequence, wherein N (in the RNA) may be 0-300 or 0-200, e.g., 0-100, 0-50, or 0-20, that does not interfere with binding of the ribonucleic acid to Cas9. In some embodiments, the RNA comprises one or more U, e.g., 1 to 8 or more U (e.g., U, UU, UUU, UUUU, UUUUU, UUUUUU, UUUUUUU, UUUUUUUU), at the 3' end of the molecule, as a result of the optional presence of one or more T used as a termination signal to terminate RNA Pol III transcription. In some embodiments, the RNA comprises one or more, e.g., up to 3, e.g., one, two, or three, or more additional nucleotides at the 5' end of the RNA molecule that are not complementary to the target sequence. Optionally, one or more of the RNA nucleotides are modified, e.g., locked (2'-O-4'-C methylene bridge), 5'-methylcytidine, 2'-O-methyl-pseudouridine, or nucleotides in which the ribose phosphate backbone has been replaced by a polyamide chain, e.g., one or more of the nucleotides within the sequence X17-20, one or more of the nucleotides within the sequence XN, or one or more of the nucleotides within any sequence of the gRNA.
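As an illustration of how the listed sgRNA sequences are composed, the following Python sketch joins a 17-20 nt spacer (X17-20) to the constant portion of SEQ ID NO:11 and optionally appends a run of U at the 3' end. The function name `make_sgrna` and its parameters are hypothetical, XN is modeled only as an optional U tail, and nucleotide modifications are not represented.

```python
# Constant portion of SEQ ID NO:11 (everything after X17-20), in RNA notation.
SCAFFOLD_SEQ_ID_11 = (
    "GUUUUAGAGCUAGAAAUAGCAAGUUAAAAUAAGGCUAGUCCG"
    "UUAUCAACUUGAAAAAGUGGCACCGAGUCGGUGC"
)


def make_sgrna(spacer: str, tail_u: int = 0, scaffold: str = SCAFFOLD_SEQ_ID_11) -> str:
    """Return spacer + scaffold + optional 3' U tail as an RNA string.

    `spacer` may be given in DNA or RNA notation; T is read as U.
    """
    spacer = spacer.upper().replace("T", "U")
    if set(spacer) - set("ACGU"):
        raise ValueError("spacer must contain only A, C, G, and T/U")
    if not 17 <= len(spacer) <= 20:
        raise ValueError("this sketch expects a spacer of 17-20 nt")
    if tail_u < 0:
        raise ValueError("tail_u must be non-negative")
    return spacer + scaffold + "U" * tail_u


if __name__ == "__main__":
    sgrna = make_sgrna("ACGTACGTACGTACGTACGT", tail_u=4)  # made-up spacer
    print(len(sgrna), sgrna)
```

Swapping in a different `scaffold` string, e.g., the constant portion of another listed SEQ ID NO, gives the corresponding variant.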

For example, in some embodiments, the chimeric guide RNAs described in Jinek et al. (Science 337(6096):816-21 (2012)) can be used, e.g.,

(X17-20)GUUUUAGAGCUAGAAAUAGCAAGUUAAAAUAAGGCUAGUCCG (SEQ ID NO:8); (X17-20)GUUUUAGAGCUAGAAAUAGCAAGUUAAAAUAAGG (SEQ ID NO:9). In some embodiments, the sgRNA carries a 5'-terminal 17-20 nucleotide sequence complementary to the target DNA sequence and a 42-nucleotide 3'-terminal stem-loop structure required for Cas9 binding (described in Jinek et al., Elife 2:e00471 (2013)), e.g., (X17-20)GUUUUAGAGCUAGAAAUAGCAAGUUAAAAUAAGGCUAGUCCG (SEQ ID NO:8).

In some embodiments, the guide RNA includes one or more adenine (A) or uracil (U) nucleotides on the 3' end.

Although the examples described herein utilize a single gRNA, these methods can also be used with dual gRNAs (e.g., the crRNA and tracrRNA found in naturally occurring systems). In this case, a single tracrRNA would be used in conjunction with multiple different crRNAs expressed using the present system, e.g., the following (note that for RNA, T is to be understood as U herein):

crRNA sequence: X17-20-GTTTTAGAGCTAGAAA (SEQ ID NO:15)

tracrRNA sequence: TAGCAAGTTAAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGC (SEQ ID NO: 16). In this case, in the methods and molecules described herein, crRNA is used as the guide RNA, and tracrRNA may be expressed from the same or different DNA molecules.
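The dual-guide arrangement (several crRNAs sharing one tracrRNA, with T read as U for RNA) can be sketched in Python as follows. The helper names `to_rna` and `crrna_set` are hypothetical, and the spacers in the usage example are made-up placeholders rather than real targets.

```python
# Sequences as written in the text, in DNA notation.
CRRNA_CONSTANT_DNA = "GTTTTAGAGCTAGAAA"  # constant part of SEQ ID NO:15
TRACR_DNA = (
    "TAGCAAGTTAAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGC"
)  # SEQ ID NO:16


def to_rna(dna: str) -> str:
    """Rewrite a DNA-notation sequence in RNA notation (T becomes U)."""
    dna = dna.upper()
    if set(dna) - set("ACGT"):
        raise ValueError("sequence must contain only A, C, G, and T")
    return dna.replace("T", "U")


def crrna_set(spacers: list[str]) -> list[str]:
    """Return one crRNA (spacer + constant region) per 17-20 nt spacer."""
    crrnas = []
    for spacer in spacers:
        if not 17 <= len(spacer) <= 20:
            raise ValueError("this sketch expects spacers of 17-20 nt")
        crrnas.append(to_rna(spacer + CRRNA_CONSTANT_DNA))
    return crrnas


if __name__ == "__main__":
    tracr_rna = to_rna(TRACR_DNA)  # a single tracrRNA shared by all crRNAs
    for crrna in crrna_set(["ACGTACGTACGTACGTA", "GGGGCCCCAAAATTTTGGCC"]):
        print(crrna, "+", tracr_rna)
```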

Furthermore, although guide RNAs having complementary 17-20 nucleotide sequences are exemplified herein, in some embodiments, longer sequences, such as 21, 22, 23, 24, 25, 26, 27, 28, 29, or 30 or more nt, such as 17-30nt instead of 17-20nt, may be used.

Csy4 cleavage sequence

In the methods and compositions described herein, the Csy4 cleavage sequence is inserted into the DNA molecule such that each guide RNA is flanked by cleavage sequences, with one or at least one cleavage sequence between each guide RNA. Exemplary Csy4 cleavage sequences include GTTCACTGCCGTATAGGCAG (truncated 20nt) (SEQ ID NO:1) and GTTCACTGCCGTATAGGCAGCTAAGAAA (full 28nt) (SEQ ID NO: 2). As demonstrated herein, the use of a truncated Csy4 cleavage site (SEQ ID NO:1) was more effective in human cells than the use of a standard site. To the knowledge of the present inventors, this was the first demonstration of the use of Csy4 activity in human cells.
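A toy model of Csy4 processing of a multiplex transcript is sketched below in Python. It assumes, for illustration only, that cleavage occurs immediately 3' of each truncated 20-nt recognition sequence, so that the site remains on the 3' end of the upstream fragment and the downstream gRNA starts with a clean 5' end. The function name `csy4_process` is hypothetical, and RNA structure and cleavage efficiency are not modeled.

```python
# Truncated Csy4 site (SEQ ID NO:1) written in RNA notation.
CSY4_SITE_RNA = "GUUCACUGCCGUAUAGGCAG"


def csy4_process(transcript: str, site: str = CSY4_SITE_RNA) -> list[str]:
    """Split an RNA string after every occurrence of `site`.

    Each fragment except possibly the last ends with the site (assumed
    cleavage position: immediately 3' of the recognition sequence).
    """
    fragments = []
    start = 0
    while True:
        hit = transcript.find(site, start)
        if hit == -1:
            break
        end = hit + len(site)
        fragments.append(transcript[start:end])  # site stays on this fragment
        start = end
    if start < len(transcript):
        fragments.append(transcript[start:])  # trailing RNA after the last site
    return fragments


if __name__ == "__main__":
    s = CSY4_SITE_RNA
    # "leader", guide bodies, and "tail" are short placeholders, not real sequences.
    rna = "AAA" + s + "CCCC" + s + "GGGG" + s + "UU"
    for fragment in csy4_process(rna):
        print(fragment)
```

In this model a transcript of the form leader-C4-gRNA-C4-gRNA-C4 yields one fragment per gRNA, each carrying a Csy4 site at its 3' end.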

Functional Csy4 enzyme sequence

In the methods described herein, a functional Csy4 enzyme capable of cleaving transcripts at the Csy4 cleavage site is also expressed in the cell.

Exemplary Csy4 sequences from Csy4 homologs from Pseudomonas aeruginosa UCBPP-PA14 (Pa14), Yersinia pestis AAM85295 (Yp), Escherichia coli UTI89 (Ec89), Dichelobacter nodosus VCS1703A (Dn), Acinetobacter baumannii AB0057 (Ab), Moritella sp. PE36 (MP1, MP01), Shewanella sp. W3-18-1 (SW), Pasteurella multocida Pm70 (Pm), Pectobacterium wasabiae (Pw), and Dickeya dadantii Ech703 (Dd) are set forth in Fig. S6 of Haurwitz et al., Science 329(5997):1355-1358 (2010). In a preferred embodiment, the Csy4 is from Pseudomonas aeruginosa.

In some embodiments, Csy4 is also used to covalently link a heterologous effector domain to the gRNA. Csy4 is believed to be a single-turnover enzyme and remains bound to its target hairpin sequence after cleavage (Sternberg et al., RNA. 2012 Apr; 18(4):661-72). Thus, Csy4 is expected to remain bound to the 3' end of each cleaved gRNA. Since, as demonstrated herein, the cleaved gRNAs appear to function in human cells, the presence of the Csy4 protein at the 3' end of the gRNA does not appear to affect the ability of the gRNA to complex with Cas9 and direct Cas9 activity. It is therefore presumed that these gRNA-Csy4 complexes will also be able to direct Cas9 mutants carrying mutations that inactivate their catalytic nuclease activity (dCas9 proteins). Thus, one can fuse heterologous functional domains (HFDs) to Csy4 (Csy4-HFD), and a dCas9:sgRNA:Csy4-HFD complex can then direct such domains to specific genomic loci. Examples of such HFDs include other nuclease domains, such as the domain from FokI, transcriptional activator or repressor domains, or other domains that modify the methylation state of histones or DNA.

Csy4-HFD is generated by fusing a heterologous domain (e.g., a transcriptional activation domain, such as from VP64 or NF-κB p65) to the N-terminus or C-terminus of Csy4, with or without an intervening linker, such as a linker of about 5-20 or 13-18 nucleotides. The transcriptional activation domain may be fused to the N- or C-terminus of Csy4. In addition to transcriptional activation domains, other heterologous functional domains known in the art may be used, e.g., transcriptional repressors (e.g., KRAB, SID, and others) or silencers such as heterochromatin protein 1 (HP1, also known as swi6), e.g., HP1α or HP1β; proteins or peptides that can recruit long non-coding RNAs (lncRNAs) fused to a fixed RNA binding sequence, such as those bound by the MS2 coat protein, the endoribonuclease Csy4, or the lambda N protein; enzymes that modify the methylation state of DNA (e.g., DNA methyltransferase (DNMT) or TET proteins); or enzymes that modify histone subunits (e.g., histone acetyltransferases (HAT), histone deacetylases (HDAC), histone methyltransferases (e.g., for methylation of lysine or arginine residues), or histone demethylases (e.g., for demethylation of lysine or arginine residues)). The sequences of a number of such domains are known in the art, e.g., a domain that catalyzes the hydroxylation of methylated cytosines in DNA. Exemplary proteins include the ten-eleven translocation (TET) 1-3 family, enzymes that convert 5-methylcytosine (5-mC) to 5-hydroxymethylcytosine (5-hmC) in DNA. See, for example, WO/2014/144761.

The sequence of human TET1-3 is known in the art and is shown in the following table:

Variant (1) represents the longer transcript and encodes the longer isoform (a). Variant (2) differs from variant (1) in the 5' UTR, in the 3' UTR, and in the coding sequence. The resulting isoform (b) is shorter and has a distinct C-terminus compared to isoform (a).

In some embodiments, all or a portion of the full-length sequence of the catalytic domain may be included, such as a catalytic module including a cysteine-rich extension and a 2OGFeDO domain encoded by 7 highly conserved exons, such as the Tet1 catalytic domain including amino acids 1580-. See, e.g., jeer et al, Cell Cycle (Cell Cycle) 2009, 6 months and 1 day; 8(11) 1698 and 710. The electronic publication No. 2009, No. 6, No. 27 exemplifies the alignment of key catalytic residues in all three Tet proteins, and their complementary material to the full-length sequence (available at ftp site ftp. ncbi. nih. gov/pub/aravidin/don/supplementarymaterialdons. html) (see, e.g., seq 2 c); in some embodiments, the sequence includes the corresponding region in amino acids 1418-2136 or Tet2/3 of Tet 1.

Other catalytic modules may be, e.g., from the proteins identified in Iyer et al., 2009.

In some embodiments, the fusion protein comprises a linker between Csy4 and the heterologous domain. Linkers that can be used for these fusion proteins (or between fusion proteins in a linked structure) can include any sequence that does not interfere with the function of the fusion protein. In preferred embodiments, the linker is short, e.g., 2-20 amino acids, and is typically flexible (i.e., includes amino acids with a high degree of freedom, e.g., glycine, alanine, and serine). In some embodiments, the linker comprises one or more units consisting of GGGS (SEQ ID NO:3) or GGGGS (SEQ ID NO:17), such as two, three, four, or more repeats of a GGGS (SEQ ID NO:3) or GGGGS (SEQ ID NO:17) unit. Other linker sequences, such as GGS, GGSG (SEQ ID NO:22), SGSETPGTSESA (SEQ ID NO:23), SGSETPGTSESATPES (SEQ ID NO:24), or SGSETPGTSESATPEGGSGGS (SEQ ID NO:25) may also be used.
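As an illustration of the repeat-unit linkers described above, the following is a minimal Python sketch that assembles a linker from GGGS (SEQ ID NO:3) or GGGGS (SEQ ID NO:17) units and checks it against the preferred 2-20 amino acid range. The function names `build_linker` and `is_short_linker` are hypothetical and introduced only for this example.

```python
# Hypothetical helpers for illustration only; not part of the described constructs.
ALLOWED_UNITS = ("GGGS", "GGGGS")  # SEQ ID NO:3 and SEQ ID NO:17


def build_linker(unit: str, repeats: int) -> str:
    """Return a linker made of `repeats` tandem copies of `unit`."""
    if unit not in ALLOWED_UNITS:
        raise ValueError("unit must be GGGS or GGGGS")
    if repeats < 1:
        raise ValueError("repeats must be at least 1")
    return unit * repeats


def is_short_linker(linker: str, min_len: int = 2, max_len: int = 20) -> bool:
    """Check the 2-20 amino acid range given as preferred in the text."""
    return min_len <= len(linker) <= max_len


if __name__ == "__main__":
    linker = build_linker("GGGGS", 3)
    print(linker, len(linker), is_short_linker(linker))
```

Under these assumptions, four GGGGS repeats (20 residues) still fall within the preferred range, while five repeats do not.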

Cas9

In the methods described herein, Cas9 is also expressed in the cells. A number of bacteria express Cas9 protein variants. The Cas9 from Streptococcus pyogenes is presently the most commonly used; some of the other Cas9 proteins have high levels of sequence identity with the S. pyogenes Cas9 and use the same guide RNAs. Others are more diverse, use different gRNAs, and also recognize different PAM sequences (the 2-5 nucleotide sequence specified by the protein that is adjacent to the sequence specified by the RNA). Chylinski et al. classified Cas9 proteins from a large group of bacteria (RNA Biology 10:5, 1-12; 2013), and a large number of Cas9 proteins are listed in supplementary FIG. 1 and supplementary Table 1 thereof, which are incorporated by reference herein. The constructs and methods described herein can include the use of any of those Cas9 proteins, and their corresponding guide RNAs or other compatible guide RNAs. The Cas9 from the Streptococcus thermophilus LMD-9 CRISPR1 system has also been shown to function in human cells in Cong et al. (Science 339, 819 (2013)). Additionally, Jinek et al. showed in vitro that Cas9 orthologs from S. thermophilus and Listeria innocua (but not from Neisseria meningitidis or Campylobacter jejuni, which likely use a different guide RNA) can be guided by a dual S. pyogenes gRNA to cleave target plasmid DNA, albeit with slightly decreased efficiency.
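To make the relationship between the RNA-specified sequence and the adjacent PAM concrete, the following Python sketch scans a DNA string for candidate target sites. It assumes the NGG PAM commonly associated with S. pyogenes Cas9 and a 20-nucleotide RNA-specified sequence immediately 5' of the PAM; neither value is stated in this passage, and other Cas9 proteins would need different settings. Only the forward strand is scanned, and the function name `find_target_sites` is hypothetical.

```python
import re


def find_target_sites(dna: str, spacer_len: int = 20) -> list:
    """Return (start, spacer, pam) tuples for every spacer followed by an NGG PAM.

    Assumptions (not from the text): NGG PAM, forward strand only.
    """
    dna = dna.upper()
    # A lookahead is used so that overlapping candidate sites are all reported.
    pattern = re.compile(rf"(?=([ACGT]{{{spacer_len}}})([ACGT]GG))")
    return [(m.start(), m.group(1), m.group(2)) for m in pattern.finditer(dna)]


if __name__ == "__main__":
    example = "ACGTACGTACGTACGTACGTAGG"  # made-up 23-nt sequence
    for start, spacer, pam in find_target_sites(example):
        print(start, spacer, pam)
```

A fuller sketch would also scan the reverse complement, since a target site can lie on either strand.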

In some embodiments, the present system utilizes the Cas9 protein from Streptococcus pyogenes, either as encoded in bacteria or codon-optimized for expression in mammalian cells. An exemplary sequence of Streptococcus pyogenes Cas9 (residues 1-1368) fused to an HA epitope (amino acid sequence DAYPYDVPDYASL (SEQ ID NO:18)) and a nuclear localization signal (amino acid sequence PKKKRKVEDPKKKRKVD (SEQ ID NO:19)) is as follows:

See Jinek et al., 2013, supra.

In some embodiments, a Cas9 sequence comprising one of the D10A and H840A mutations is used, rendering the nuclease a nickase, or a Cas9 sequence comprising both the D10A and H840A mutations is used, rendering the nuclease portion of the protein catalytically inactive. The sequence of a catalytically inactive Streptococcus pyogenes Cas9 (dCas9) that can be used in the methods and compositions described herein is as follows; the mutations are in bold and underlined.
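The substitution notation used above (wild-type residue, 1-based position, replacement residue) can be illustrated with a short Python sketch. The helper `apply_substitution` is hypothetical. The example fragment is the first 16 residues listed for SEQ ID NO:20 in the sequence listing, taken here as the Cas9 N-terminus (an assumption about which listing entry corresponds to the Cas9 sequence); H840A would be applied in the same way to a sequence long enough to contain position 840.

```python
def apply_substitution(protein: str, mutation: str) -> str:
    """Apply a point substitution written like 'D10A' to a one-letter protein string."""
    wild_type, new_residue = mutation[0], mutation[-1]
    position = int(mutation[1:-1])  # 1-based residue number
    index = position - 1
    if not 0 <= index < len(protein):
        raise ValueError(f"position {position} is outside the sequence")
    if protein[index] != wild_type:
        raise ValueError(
            f"expected {wild_type} at position {position}, found {protein[index]}"
        )
    return protein[:index] + new_residue + protein[index + 1:]


if __name__ == "__main__":
    # Met Asp Lys Lys Tyr Ser Ile Gly Leu Asp Ile Gly Thr Asn Ser Val
    n_terminal_fragment = "MDKKYSIGLDIGTNSV"
    print(apply_substitution(n_terminal_fragment, "D10A"))
```

Checking the wild-type residue before substituting guards against numbering that is offset by an added tag or a missing initiator methionine.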

See, e.g., Mali et al., 2013, supra; and Jinek et al., 2012, supra. Alternatively, the Cas9 can be a dCas9-heterologous functional domain fusion (dCas9-HFD), as described in the U.S. provisional patent application entitled RNA-GUIDED TARGETING OF GENETIC AND EPIGENOMIC REGULATORY PROTEINS TO SPECIFIC GENOMIC LOCI, filed on June 21, 2013, and assigned serial No. 61/838,148, and in PCT/US2014/027335.

As described herein, Cas9 can be expressed from an expression vector, e.g., an extrachromosomal plasmid or viral vector comprising a sequence encoding Cas9, e.g., a Cas9 cDNA or gene; can be expressed from an exogenous Cas9 cDNA or gene that has integrated into the genome of the cell; can be expressed from an mRNA encoding Cas9; can be the actual Cas9 protein itself; or, in the case of non-mammalian cells, can be an exogenous Cas9.

Expression system

Nucleic acid molecules including expression vectors can be used, e.g., for in vivo or in vitro expression of the Csy4/guide RNA constructs described herein. Vectors for expressing multiple gRNAs (potentially in an inducible or tissue-/cell-type specific manner) can be used for research and therapeutic applications.

In order to use the fusion proteins and multimeric guide RNA cassettes described herein, it may be desirable to express them from the nucleic acids that encode them. This can be done in a number of ways. For example, a nucleic acid encoding a guide RNA cassette or Csy4 or Cas9 protein can be cloned into an intermediate vector for transformation into a prokaryotic or eukaryotic cell for replication and/or expression. The intermediate vector is typically a prokaryotic vector, such as a plasmid, or a shuttle vector, or an insect vector, for storage or manipulation of the nucleic acid encoding the fusion protein or for production of the fusion protein. The nucleic acid encoding the guide RNA or fusion protein can also be cloned into an expression vector for administration into a plant cell, an animal cell, preferably a mammalian cell or a human cell, a fungal cell, a bacterial cell, or a protozoan cell.

To obtain expression, sequences encoding the guide RNA or fusion protein are typically subcloned into an expression vector that contains a promoter to direct transcription. Suitable bacterial and eukaryotic promoters are well known in the art and are described, e.g., in Sambrook et al., Molecular Cloning, A Laboratory Manual (3rd ed., 2001); Kriegler, Gene Transfer and Expression: A Laboratory Manual (1990); and Current Protocols in Molecular Biology (Ausubel et al., eds., 2010). Bacterial expression systems for expressing engineered proteins are available in, e.g., E. coli, Bacillus, and Salmonella (Palva et al., 1983, Gene 22:229-. Kits for such expression systems are commercially available. Eukaryotic expression systems for mammalian cells, yeast, and insect cells are well known in the art and are also commercially available.

A variety of suitable vectors are known in the art, e.g., viral vectors, including recombinant retroviruses, lentiviruses, adenoviruses, adeno-associated viruses, and herpes simplex virus 1, adenovirus-derived vectors, or recombinant bacterial or eukaryotic plasmids. For example, the expression construct can comprise: a coding region; a promoter sequence, e.g., a promoter sequence that restricts expression to a selected cell type, a conditional promoter, or a strong general promoter; an enhancer sequence; untranslated regulatory sequences, e.g., a 5' untranslated region (UTR) or a 3' UTR; a polyadenylation site; and/or an insulator sequence. Such sequences are known in the art, and the skilled artisan will be able to select suitable sequences. See, e.g., Current Protocols in Molecular Biology, Ausubel, F. M. et al. (eds.), Greene Publishing Associates (1989), Sections 9.10-9.14, and other standard laboratory manuals. In some embodiments, expression can be restricted to a particular cell type using a tissue-specific promoter, as is known in the art.

As described above, the vector for expressing the guide RNA can include an RNA Pol II or Pol III promoter to drive expression of the guide RNA. These human promoters allow expression of gRNAs in mammalian cells following plasmid transfection. Alternatively, a T7 promoter may be used, e.g., for in vitro transcription, and the RNA can be transcribed in vitro and purified. The promoter used to direct expression of the nucleic acid depends on the particular application. For example, a strong constitutive promoter is typically used for expression and purification of fusion proteins. In contrast, when the fusion protein is to be administered in vivo for gene regulation, either a constitutive or an inducible promoter can be used, depending on the particular use of the fusion protein. In addition, a preferred promoter for administration of the fusion protein can be a weak promoter, such as HSV TK, or a promoter having similar activity. The promoter can also include elements that are responsive to transactivation, e.g., hypoxia response elements, Gal4 response elements, lac repressor response elements, and small molecule control systems such as tetracycline-regulated systems and the RU-486 system (see, e.g., Gossen & Bujard, 1992, Proc. Natl. Acad. Sci. USA, 89:5547; Oligino et al., 1998, Gene Ther., 5:491-.

In addition to the promoter, the expression vector typically contains a transcription unit or expression cassette that contains all the additional elements required for expression of the nucleic acid in host cells, either prokaryotic or eukaryotic. A typical expression cassette thus contains a promoter operably linked, e.g., to the nucleic acid sequence encoding the fusion protein, and any signals required, e.g., for efficient polyadenylation of the transcript, transcriptional termination, ribosome binding sites, or translation termination. Additional elements of the cassette may include, e.g., enhancers and heterologous spliced intronic signals.

The particular expression vector used to transport the genetic information into the cell is selected with regard to the intended use of the fusion protein, e.g., expression in plants, animals, bacteria, fungi, protozoa, etc. Standard bacterial expression vectors include plasmids such as pBR322-based plasmids, pSKF, pET23D, and commercially available tag-fusion expression systems such as GST and LacZ. A preferred tag-fusion protein is the maltose binding protein (MBP). Such tag-fusion proteins can be used for purification of engineered TALE repeat proteins. Epitope tags, e.g., c-myc or FLAG, can also be added to recombinant proteins to provide convenient methods of isolation, for monitoring expression, and for monitoring cellular and subcellular localization.

Expression vectors containing regulatory elements from eukaryotic viruses are often used in eukaryotic expression vectors, e.g., SV40 vectors, papilloma virus vectors, and vectors derived from Epstein-Barr virus. Other exemplary eukaryotic vectors include pMSG, pAV009/A+, pMTO10/A+, pMAMneo-5, baculovirus pDSVE, and any other vector allowing expression of proteins under the direction of the SV40 early promoter, SV40 late promoter, metallothionein promoter, murine mammary tumor virus promoter, Rous sarcoma virus promoter, polyhedrin promoter, or other promoters shown to be effective for expression in eukaryotic cells.

Some expression systems have markers for selection of stably transfected cell lines, such as thymidine kinase, hygromycin B phosphotransferase, and dihydrofolate reductase. High-yield expression systems are also suitable, for example, using baculovirus vectors in insect cells with the fusion protein coding sequence under the direction of the polyhedrin promoter or other strong baculovirus promoter.

The elements typically included in expression vectors also include replicons that function in E.coli, genes encoding antibiotic resistance to allow selection of bacteria that harbor the recombinant plasmid, and unique restriction sites in non-essential regions of the plasmid to allow insertion of the recombinant sequences.

Standard transfection methods are used to produce bacterial, mammalian, yeast, or insect cell lines that express large quantities of protein, which are then purified using standard techniques (see, e.g., Colley et al., 1989, J. Biol. Chem., 264:17619-22; Guide to Protein Purification, in Methods in Enzymology, vol. 182 (Deutscher, ed., 1990)). Transformation of eukaryotic and prokaryotic cells is performed according to standard techniques (see, e.g., Morrison, 1977, J. Bacteriol. 132:349-351; Clark-Curtiss & Curtiss, Methods in Enzymology 101:347-362 (Wu et al., eds., 1983)).

Any of the known procedures for introducing foreign nucleotide sequences into host cells may be used. These include the use of calcium phosphate transfection, polybrene, protoplast fusion, electroporation, nucleofection, liposomes, microinjection, naked DNA, plasmid vectors, viral vectors, both episomal and integrative, and any of the other well-known methods for introducing cloned genomic DNA, cDNA, synthetic DNA, or other foreign genetic material into a host cell (see, e.g., Sambrook et al., supra). It is only necessary that the particular genetic engineering procedure used be capable of successfully introducing at least one gene into a host cell capable of expressing the protein of choice.

In some embodiments, the Cas9 or Csy4 protein includes a nuclear localization domain that provides for the protein to be translocated to the nucleus. Several nuclear localization sequences (NLS) are known, and any suitable NLS can be used. For example, many NLSs have a plurality of basic amino acids, referred to as bipartite basic repeats (reviewed in Garcia-Bustos et al., 1991, Biochim. Biophys. Acta, 1071:83-101). An NLS containing bipartite basic repeats can be placed in any portion of a chimeric protein and results in the chimeric protein being localized inside the nucleus. In preferred embodiments, a nuclear localization domain is incorporated into the final fusion protein, as the ultimate functions of the fusion proteins described herein will typically require the proteins to be localized in the nucleus. However, it may not be necessary to add a separate nuclear localization domain in cases where the protein has intrinsic nuclear translocation function.

The invention includes vectors and cells comprising vectors.

Libraries

Also provided herein are combinatorial libraries of gRNAs, e.g., in inducible, tissue-specific, or cell-type-specific multiplex vectors for research applications, e.g., for screening potential drug targets or for defining interactions at the gene level.

Methods of Use

The described methods can include expressing in a cell, or contacting a cell with, a multimeric cassette as described herein, plus a nuclease that can be guided by the shortened gRNAs, e.g., a Cas9 nuclease as described above, and a Csy4 nuclease as described above.

The described system is a useful and versatile tool for modifying the expression of multiple exogenous genes simultaneously, or for targeting multiple parts of a single gene. Current methods for achieving this require a separate gRNA-encoding transcript for each site to be targeted. Separate gRNAs are not optimal for multiplex genome editing of cell populations, because it is impossible to guarantee that each cell will express each gRNA; with multiple transcripts, cells receive a complex and non-uniform random mixture of gRNAs. The present system, however, allows expression of multiple gRNAs from a single transcript, which allows multiple sites in the genome to be targeted. Furthermore, with a single-transcript system, each cell should express all gRNAs with similar stoichiometry. This system can therefore readily be used to simultaneously alter expression of a large number of genes, or to recruit multiple Cas9 or HFDs to a single gene, promoter, or enhancer. This capability has broad utility, e.g., in basic biological research, where it can be used to study gene function and to manipulate the expression of multiple genes in a single pathway, and in synthetic biology, where it will enable researchers to create circuits in cells that respond to multiple input signals. The relative ease with which this technology can be implemented and adapted to multiplexing will make it broadly useful across many applications.
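The argument about separate transcripts can be illustrated with a toy calculation. The Python sketch below assumes, purely for illustration, that each separately delivered gRNA construct ends up expressed in a given cell independently with the same probability; this independence model and the function name `fraction_with_all_grnas` are assumptions made for the example and are not taken from the experiments described herein.

```python
def fraction_with_all_grnas(p_single: float, n_grnas: int, single_transcript: bool) -> float:
    """Expected fraction of cells expressing every gRNA under a simple independence model.

    p_single: assumed probability that a cell expresses any one delivered construct.
    """
    if not 0.0 <= p_single <= 1.0:
        raise ValueError("p_single must be between 0 and 1")
    if n_grnas < 1:
        raise ValueError("n_grnas must be at least 1")
    # One transcript carries all gRNAs, so only one delivery event is needed.
    return p_single if single_transcript else p_single ** n_grnas


if __name__ == "__main__":
    p = 0.7  # made-up per-construct probability
    for n in (1, 2, 3):
        separate = fraction_with_all_grnas(p, n, single_transcript=False)
        single = fraction_with_all_grnas(p, n, single_transcript=True)
        print(f"{n} gRNA(s): separate={separate:.3f}, single transcript={single:.3f}")
```

Under this simplified model the fraction of cells carrying the complete set falls geometrically with the number of separate transcripts, whereas it stays constant for a single multiplex transcript.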

The methods described herein include contacting a cell with a nucleic acid encoding a multimeric gRNA cassette as described herein directed to one or more genes, and nucleic acids encoding Csy4 and Cas9, thereby modulating expression of the one or more genes.

Examples

The invention is further illustrated in the following examples, which should not be construed as limiting the scope of the invention as described in the claims.

Example 1. Multiplex Editing Using CRISPR/Cas9

Three strategies were attempted with the goal of performing multiplex editing with CRISPR/Cas9 using arrays of crRNAs or sgRNAs expressed from a single transcript:

1. A crRNA array flanked by direct repeats, expressed with Cas9 and a separate tracrRNA

2. A short crRNA array separated by Csy4 sites, expressed with Csy4, Cas9, and a separate tracrRNA

3. Full-length single guide RNAs (sgRNAs) separated by Csy4 sites

Each set of constructs (exemplified in fig. 1) was tested for its ability to efficiently disrupt EGFP in the U2OS-EGFP disruption assay. The results are shown in table 2. Constructs designed using strategies 1 and 2 showed the least activity in the EGFP disruption assay, even for a single target; further experiments (described below) therefore focused on optimizing strategy 3.

Example 2. Multiplex Expression of Highly Active CRISPR Guide RNAs from RNA Polymerase II and III Promoters in Mammalian Cells

A schematic overview of an exemplary strategy for excising gRNAs from longer transcripts using Csy4 nuclease is shown in fig. 3. In initial proof-of-concept experiments, two versions of the Csy4-cleaved RNA hairpin site were tested for cleavage in human cells. To do this, gRNAs flanked by one of two Csy4 cleavage sites were expressed:

1. GTTCACTGCCGTATAGGCAGCTAAGAAA (full-length 28 nt) (SEQ ID NO:2)

2. GTTCACTGCCGTATAGGCAG (truncated 20 nt) (SEQ ID NO:1)

The results show that gRNAs flanked at their 5' and 3' ends by the truncated 20 nt sequence are more active in mammalian cells than those flanked by the longer 28 nt sequence (fig. 4). To the present inventors' knowledge, this is the first demonstration that Csy4 nuclease can be used to process RNA transcripts in living human cells. An important additional advantage of the truncated 20 nt site is that, unlike the longer 28 nt sequence, it does not leave any additional nucleotides on the 5' end of the gRNA processed from the longer transcript (fig. 4). This enables expression of gRNAs that have any desired nucleotide at the 5'-most position, an improvement over expressing gRNAs from RNA polymerase III promoters, which require one or more specific nucleotides at the 5'-most position.
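The difference between the two flanking sites can be sketched in Python. The model below assumes that Csy4 cuts immediately 3' of the 20-nt core sequence (SEQ ID NO:1) and that the hairpin stays on the 3' end of the upstream fragment; this is a simplification chosen to be consistent with the observation above that the 20 nt site leaves no extra 5' nucleotides. The gRNA strings are made-up placeholders, sequences are written in the DNA alphabet of the encoding construct, and the function names are hypothetical.

```python
CSY4_SITE_20 = "GTTCACTGCCGTATAGGCAG"          # SEQ ID NO:1 (truncated site)
CSY4_SITE_28 = "GTTCACTGCCGTATAGGCAGCTAAGAAA"  # SEQ ID NO:2 (full-length site)


def build_transcript(grnas: list, site: str) -> str:
    """Lay out site-gRNA-site-gRNA-...-site, so every gRNA is flanked on both sides."""
    return site + site.join(grnas) + site


def processed_grnas(transcript: str) -> list:
    """Return the sequence found 5' of each retained hairpin after simulated cleavage.

    Modeling assumption: cleavage occurs right after every 20-nt core sequence.
    Placeholder gRNAs must not themselves contain the core sequence.
    """
    pieces = transcript.replace(CSY4_SITE_20, CSY4_SITE_20 + "|").split("|")
    core_len = len(CSY4_SITE_20)
    # Drop the bare leading hairpin and any trailing remnant without a hairpin.
    return [p[:-core_len] for p in pieces if p.endswith(CSY4_SITE_20) and len(p) > core_len]


if __name__ == "__main__":
    placeholder_grnas = ["AAAACCCC", "TTTTGGGG"]  # hypothetical stand-ins for gRNAs
    for name, site in (("20 nt", CSY4_SITE_20), ("28 nt", CSY4_SITE_28)):
        products = processed_grnas(build_transcript(placeholder_grnas, site))
        print(name, products)
```

In this model the 20 nt site returns each placeholder gRNA with its original 5' end, whereas the 28 nt site leaves the eight extra nucleotides CTAAGAAA in front of each one.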

Using this Csy4-based system, efficient expression of two and three different gRNAs was demonstrated (fig. 5 and 6). Simultaneous expression of gRNAs using this method induced changes at the desired sites in human cells.

These results also demonstrate that this Csy4-based strategy can be used with gRNAs encoded on longer mRNAs produced by an RNA Pol II promoter (fig. 7). In these experiments, one of three different individual gRNAs flanked by truncated Csy4 sites was encoded on an mRNA produced by the CAG promoter (an RNA Pol II promoter). As shown in fig. 7, all three constructs were able to produce functional gRNAs that could direct Cas9 nuclease in human cells, but only in the presence of Csy4. The level of targeted Cas9 activity observed was comparable to (although somewhat lower than) that observed when these gRNAs were expressed individually using a standard RNA Pol III promoter, or as Csy4-flanked transcripts from an RNA Pol III promoter (fig. 7).

In summary, the present results demonstrate that: 1) up to three functional gRNAs can be produced from a single RNA Pol III transcript when they are separated by Csy4 cleavage sites and Csy4 is present in human cells, 2) multiple Csy4-processed gRNAs can be used to direct Cas9 nuclease to induce multiple changes in a single human cell, and 3) a functional gRNA flanked by Csy4 cleavage sites can be excised by Csy4 nuclease from a longer mRNA transcript made from an RNA polymerase II promoter.

Other embodiments

It is to be understood that while the invention has been described in conjunction with the detailed description thereof, the foregoing description is intended to illustrate and not limit the scope of the invention, which is defined by the scope of the appended claims. Other aspects, advantages, and modifications are within the scope of the following claims.

Sequence listing

<110> The General Hospital Corporation

<120> multiple guide RNA

<130> 40174-0012CN2

<150> PCT/US2014/035162

<151> 2014-04-23

<150> 14/211,117

<151> 2014-03-14

<150> PCT/US2014/029068

<151> 2014-03-14

<150> PCT/US2014/028630

<151> 2014-03-14

<150> PCT/US2014/029304

<151> 2014-03-14

<150> 61/921,007

<151> 2013-12-26

<150> 61/930,782

<151> 2014-01-23

<160> 54

<170> PatentIn version 3.5

<210> 1

<211> 20

<212> DNA

<213> Artificial sequence

<220>

<223> Description of Artificial Sequence: Synthetic oligonucleotide

<400> 1

gttcactgcc gtataggcag 20

<210> 2

<211> 28

<212> DNA

<213> Artificial sequence

<220>

<223> Description of Artificial Sequence: Synthetic oligonucleotide

<400> 2

gttcactgcc gtataggcag ctaagaaa 28

<210> 3

<211> 4

<212> PRT

<213> Artificial sequence

<220>

<223> Description of Artificial Sequence: Synthetic peptide

<400> 3

Gly Gly Gly Ser

1

<210> 4

<211> 342

<212> RNA

<213> Artificial sequence

<220>

<223> Description of Artificial Sequence: Synthetic oligonucleotide

<220>

<221> modified base

<222> (1)..(20)

<223> a, c, u or g, and the region may comprise 17-20 nucleotides, some of which may be deleted

<220>

<221> modified base

<222> (43)..(342)

<223> a, c, u or g, and the region may comprise 0-300 nucleotides, wherein certain positions may be deleted

<400> 4

nnnnnnnnnn nnnnnnnnnn guuuuagagc uaugcuguuu ugnnnnnnnn nnnnnnnnnn 60

nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 120

nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 180

nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 240

nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 300

nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nn 342

<210> 5

<211> 32

<212> RNA

<213> Artificial sequence

<220>

<223> Description of Artificial Sequence: Synthetic oligonucleotide

<220>

<221> modified base

<222> (1)..(20)

<223> a, c, u or g, and the region may comprise 17-20 nucleotides, some of which may be deleted

<400> 5

nnnnnnnnnn nnnnnnnnnn guuuuagagc ua 32

<210> 6

<211> 42

<212> RNA

<213> Artificial sequence

<220>

<223> Description of Artificial Sequence: Synthetic oligonucleotide

<220>

<221> modified base

<222> (1)..(20)

<223> a, c, u or g, and the region may comprise 17-20 nucleotides, some of which may be deleted

<400> 6

nnnnnnnnnn nnnnnnnnnn guuuuagagc uaugcuguuu ug 42

<210> 7

<211> 36

<212> RNA

<213> Artificial sequence

<220>

<223> Description of Artificial Sequence: Synthetic oligonucleotide

<220>

<221> modified base

<222> (1)..(20)

<223> a, c, u or g, and the region may comprise 17-20 nucleotides, some of which may be deleted

<400> 7

nnnnnnnnnn nnnnnnnnnn guuuuagagc uaugcu 36

<210> 8

<211> 362

<212> RNA

<213> Artificial sequence

<220>

<223> Description of Artificial Sequence: Synthetic oligonucleotide

<220>

<221> modified base

<222> (1)..(20)

<223> a, c, u or g, and the region may comprise 17-20 nucleotides, some of which may be deleted

<220>

<221> modified base

<222> (63)..(362)

<223> a, c, u or g, and the region may comprise 0-300 nucleotides, wherein certain positions may be deleted

<400> 8

nnnnnnnnnn nnnnnnnnnn guuuuagagc uagaaauagc aaguuaaaau aaggcuaguc 60

cgnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 120

nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 180

nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 240

nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 300

nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 360

nn 362

<210> 9

<211> 375

<212> RNA

<213> Artificial sequence

<220>

<223> Description of Artificial Sequence: Synthetic oligonucleotide

<220>

<221> modified base

<222> (1)..(20)

<223> a, c, u or g, and the region may comprise 17-20 nucleotides, some of which may be deleted

<220>

<221> modified base

<222> (76)..(375)

<223> a, c, u or g, and the region may comprise 0-300 nucleotides, wherein certain positions may be deleted

<400> 9

nnnnnnnnnn nnnnnnnnnn guuuuagagc uaugcugaaa agcauagcaa guuaaaauaa 60

ggcuaguccg uuaucnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 120

nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 180

nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 240

nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 300

nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 360

nnnnnnnnnn nnnnn 375

<210> 10

<211> 387

<212> RNA

<213> Artificial sequence

<220>

<223> Description of Artificial Sequence: Synthetic oligonucleotide

<220>

<221> modified base

<222> (1)..(20)

<223> a, c, u or g, and the region may comprise 17-20 nucleotides, some of which may be deleted

<220>

<221> modified base

<222> (88)..(387)

<223> a, c, u or g, and the region may comprise 0-300 nucleotides, wherein certain positions may be deleted

<400> 10

nnnnnnnnnn nnnnnnnnnn guuuuagagc uaugcuguuu uggaaacaaa acagcauagc 60

aaguuaaaau aaggcuaguc cguuaucnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 120

nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 180

nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 240

nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 300

nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 360

nnnnnnnnnn nnnnnnnnnn nnnnnnn 387

<210> 11

<211> 396

<212> RNA

<213> Artificial sequence

<220>

<223> Description of Artificial Sequence: Synthetic oligonucleotide

<220>

<221> modified base

<222> (1)..(20)

<223> a, c, u or g, and the region may comprise 17-20 nucleotides, some of which may be deleted

<220>

<221> modified base

<222> (97)..(396)

<223> a, c, u or g, and the region may comprise 0-300 nucleotides, wherein certain positions may be deleted

<400> 11

nnnnnnnnnn nnnnnnnnnn guuuuagagc uagaaauagc aaguuaaaau aaggcuaguc 60

cguuaucaac uugaaaaagu ggcaccgagu cggugcnnnn nnnnnnnnnn nnnnnnnnnn 120

nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 180

nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 240

nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 300

nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 360

nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnn 396

<210> 12

<211> 96

<212> RNA

<213> Artificial sequence

<220>

<223> Description of Artificial Sequence: Synthetic oligonucleotide

<220>

<221> modified base

<222> (1)..(20)

<223> a, c, u or g, and the region may comprise 17-20 nucleotides, some of which may be deleted

<400> 12

nnnnnnnnnn nnnnnnnnnn guuuaagagc uagaaauagc aaguuuaaau aaggcuaguc 60

cguuaucaac uugaaaaagu ggcaccgagu cggugc 96

<210> 13

<211> 106

<212> RNA

<213> Artificial sequence

<220>

<223> Description of Artificial Sequence: Synthetic oligonucleotide

<220>

<221> modified base

<222> (1)..(20)

<223> a, c, u or g, and the region may comprise 17-20 nucleotides, some of which may be deleted

<400> 13

nnnnnnnnnn nnnnnnnnnn guuuuagagc uaugcuggaa acagcauagc aaguuuaaau 60

aaggcuaguc cguuaucaac uugaaaaagu ggcaccgagu cggugc 106

<210> 14

<211> 106

<212> RNA

<213> Artificial sequence

<220>

<223> Description of Artificial Sequence: Synthetic oligonucleotide

<220>

<221> modified base

<222> (1)..(20)

<223> a, c, u or g, and the region may comprise 17-20 nucleotides, some of which may be deleted

<400> 14

nnnnnnnnnn nnnnnnnnnn guuuaagagc uaugcuggaa acagcauagc aaguuuaaau 60

aaggcuaguc cguuaucaac uugaaaaagu ggcaccgagu cggugc 106

<210> 15

<211> 36

<212> DNA

<213> Artificial sequence

<220>

<223> Description of Artificial Sequence: Synthetic oligonucleotide

<220>

<223> Description of Combined DNA/RNA Molecule: Synthetic oligonucleotide

<220>

<221> modified base

<222> (1)..(20)

<223> a, c, u or g, and the region may comprise 17-20 nucleotides, some of which may be deleted

<400> 15

nnnnnnnnnn nnnnnnnnnn gttttagagc tagaaa 36

<210> 16

<211> 60

<212> DNA

<213> Artificial sequence

<220>

<223> Description of Artificial Sequence: Synthetic oligonucleotide

<400> 16

tagcaagtta aaataaggct agtccgttat caacttgaaa aagtggcacc gagtcggtgc 60

<210> 17

<211> 5

<212> PRT

<213> Artificial sequence

<220>

<223> Description of Artificial Sequence: Synthetic peptide

<400> 17

Gly Gly Gly Gly Ser

1 5

<210> 18

<211> 13

<212> PRT

<213> Artificial sequence

<220>

<223> Description of Artificial Sequence: Synthetic peptide

<400> 18

Asp Ala Tyr Pro Tyr Asp Val Pro Asp Tyr Ala Ser Leu

1 5 10

<210> 19

<211> 17

<212> PRT

<213> Artificial sequence

<220>

<223> Description of Artificial Sequence: Synthetic peptide

<400> 19

Pro Lys Lys Lys Arg Lys Val Glu Asp Pro Lys Lys Lys Arg Lys Val

1 5 10 15

Asp

<210> 20

<211> 1401

<212> PRT

<213> Artificial sequence

<220>

<223> Description of Artificial Sequence: Synthetic polypeptide

<400> 20

Met Asp Lys Lys Tyr Ser Ile Gly Leu Asp Ile Gly Thr Asn Ser Val

1 5 10 15

Gly Trp Ala Val Ile Thr Asp Glu Tyr Lys Val Pro Ser Lys Lys Phe

20 25 30

Lys Val Leu Gly Asn Thr Asp Arg His Ser Ile Lys Lys Asn Leu Ile

35 40 45

Gly Ala Leu Leu Phe Asp Ser Gly Glu Thr Ala Glu Ala Thr Arg Leu

50 55 60

Lys Arg Thr Ala Arg Arg Arg Tyr Thr Arg Arg Lys Asn Arg Ile Cys

65 70 75 80

Tyr Leu Gln Glu Ile Phe Ser Asn Glu Met Ala Lys Val Asp Asp Ser

85 90 95

Phe Phe His Arg Leu Glu Glu Ser Phe Leu Val Glu Glu Asp Lys Lys

100 105 110

His Glu Arg His Pro Ile Phe Gly Asn Ile Val Asp Glu Val Ala Tyr

115 120 125

His Glu Lys Tyr Pro Thr Ile Tyr His Leu Arg Lys Lys Leu Val Asp

130 135 140

Ser Thr Asp Lys Ala Asp Leu Arg Leu Ile Tyr Leu Ala Leu Ala His

145 150 155 160

Met Ile Lys Phe Arg Gly His Phe Leu Ile Glu Gly Asp Leu Asn Pro

165 170 175

Asp Asn Ser Asp Val Asp Lys Leu Phe Ile Gln Leu Val Gln Thr Tyr

180 185 190

Asn Gln Leu Phe Glu Glu Asn Pro Ile Asn Ala Ser Gly Val Asp Ala

195 200 205

Lys Ala Ile Leu Ser Ala Arg Leu Ser Lys Ser Arg Arg Leu Glu Asn

210 215 220

Leu Ile Ala Gln Leu Pro Gly Glu Lys Lys Asn Gly Leu Phe Gly Asn

225 230 235 240

Leu Ile Ala Leu Ser Leu Gly Leu Thr Pro Asn Phe Lys Ser Asn Phe

245 250 255

Asp Leu Ala Glu Asp Ala Lys Leu Gln Leu Ser Lys Asp Thr Tyr Asp

260 265 270

Asp Asp Leu Asp Asn Leu Leu Ala Gln Ile Gly Asp Gln Tyr Ala Asp

275 280 285

Leu Phe Leu Ala Ala Lys Asn Leu Ser Asp Ala Ile Leu Leu Ser Asp

290 295 300

Ile Leu Arg Val Asn Thr Glu Ile Thr Lys Ala Pro Leu Ser Ala Ser

305 310 315 320

Met Ile Lys Arg Tyr Asp Glu His His Gln Asp Leu Thr Leu Leu Lys

325 330 335

Ala Leu Val Arg Gln Gln Leu Pro Glu Lys Tyr Lys Glu Ile Phe Phe

340 345 350

Asp Gln Ser Lys Asn Gly Tyr Ala Gly Tyr Ile Asp Gly Gly Ala Ser

355 360 365

Gln Glu Glu Phe Tyr Lys Phe Ile Lys Pro Ile Leu Glu Lys Met Asp

370 375 380

Gly Thr Glu Glu Leu Leu Val Lys Leu Asn Arg Glu Asp Leu Leu Arg

385 390 395 400

Lys Gln Arg Thr Phe Asp Asn Gly Ser Ile Pro His Gln Ile His Leu

405 410 415

Gly Glu Leu His Ala Ile Leu Arg Arg Gln Glu Asp Phe Tyr Pro Phe

420 425 430

Leu Lys Asp Asn Arg Glu Lys Ile Glu Lys Ile Leu Thr Phe Arg Ile

435 440 445

Pro Tyr Tyr Val Gly Pro Leu Ala Arg Gly Asn Ser Arg Phe Ala Trp

450 455 460

Met Thr Arg Lys Ser Glu Glu Thr Ile Thr Pro Trp Asn Phe Glu Glu

465 470 475 480

Val Val Asp Lys Gly Ala Ser Ala Gln Ser Phe Ile Glu Arg Met Thr

485 490 495

Asn Phe Asp Lys Asn Leu Pro Asn Glu Lys Val Leu Pro Lys His Ser

500 505 510

Leu Leu Tyr Glu Tyr Phe Thr Val Tyr Asn Glu Leu Thr Lys Val Lys

515 520 525

Tyr Val Thr Glu Gly Met Arg Lys Pro Ala Phe Leu Ser Gly Glu Gln

530 535 540

Lys Lys Ala Ile Val Asp Leu Leu Phe Lys Thr Asn Arg Lys Val Thr

545 550 555 560

Val Lys Gln Leu Lys Glu Asp Tyr Phe Lys Lys Ile Glu Cys Phe Asp

565 570 575

Ser Val Glu Ile Ser Gly Val Glu Asp Arg Phe Asn Ala Ser Leu Gly

580 585 590

Thr Tyr His Asp Leu Leu Lys Ile Ile Lys Asp Lys Asp Phe Leu Asp

595 600 605

Asn Glu Glu Asn Glu Asp Ile Leu Glu Asp Ile Val Leu Thr Leu Thr

610 615 620

Leu Phe Glu Asp Arg Glu Met Ile Glu Glu Arg Leu Lys Thr Tyr Ala

625 630 635 640

His Leu Phe Asp Asp Lys Val Met Lys Gln Leu Lys Arg Arg Arg Tyr

645 650 655

Thr Gly Trp Gly Arg Leu Ser Arg Lys Leu Ile Asn Gly Ile Arg Asp

660 665 670

Lys Gln Ser Gly Lys Thr Ile Leu Asp Phe Leu Lys Ser Asp Gly Phe

675 680 685

Ala Asn Arg Asn Phe Met Gln Leu Ile His Asp Asp Ser Leu Thr Phe

690 695 700

Lys Glu Asp Ile Gln Lys Ala Gln Val Ser Gly Gln Gly Asp Ser Leu

705 710 715 720

His Glu His Ile Ala Asn Leu Ala Gly Ser Pro Ala Ile Lys Lys Gly

725 730 735

Ile Leu Gln Thr Val Lys Val Val Asp Glu Leu Val Lys Val Met Gly

740 745 750

Arg His Lys Pro Glu Asn Ile Val Ile Glu Met Ala Arg Glu Asn Gln

755 760 765

Thr Thr Gln Lys Gly Gln Lys Asn Ser Arg Glu Arg Met Lys Arg Ile

770 775 780

Glu Glu Gly Ile Lys Glu Leu Gly Ser Gln Ile Leu Lys Glu His Pro

785 790 795 800

Val Glu Asn Thr Gln Leu Gln Asn Glu Lys Leu Tyr Leu Tyr Tyr Leu

805 810 815

Gln Asn Gly Arg Asp Met Tyr Val Asp Gln Glu Leu Asp Ile Asn Arg

820 825 830

Leu Ser Asp Tyr Asp Val Asp His Ile Val Pro Gln Ser Phe Leu Lys

835 840 845

Asp Asp Ser Ile Asp Asn Lys Val Leu Thr Arg Ser Asp Lys Asn Arg

850 855 860

Gly Lys Ser Asp Asn Val Pro Ser Glu Glu Val Val Lys Lys Met Lys

865 870 875 880

Asn Tyr Trp Arg Gln Leu Leu Asn Ala Lys Leu Ile Thr Gln Arg Lys

885 890 895

Phe Asp Asn Leu Thr Lys Ala Glu Arg Gly Gly Leu Ser Glu Leu Asp

900 905 910

Lys Ala Gly Phe Ile Lys Arg Gln Leu Val Glu Thr Arg Gln Ile Thr

915 920 925

Lys His Val Ala Gln Ile Leu Asp Ser Arg Met Asn Thr Lys Tyr Asp

930 935 940

Glu Asn Asp Lys Leu Ile Arg Glu Val Lys Val Ile Thr Leu Lys Ser

945 950 955 960

Lys Leu Val Ser Asp Phe Arg Lys Asp Phe Gln Phe Tyr Lys Val Arg

965 970 975

Glu Ile Asn Asn Tyr His His Ala His Asp Ala Tyr Leu Asn Ala Val

980 985 990

Val Gly Thr Ala Leu Ile Lys Lys Tyr Pro Lys Leu Glu Ser Glu Phe

995 1000 1005

Val Tyr Gly Asp Tyr Lys Val Tyr Asp Val Arg Lys Met Ile Ala

1010 1015 1020

Lys Ser Glu Gln Glu Ile Gly Lys Ala Thr Ala Lys Tyr Phe Phe

1025 1030 1035

Tyr Ser Asn Ile Met Asn Phe Phe Lys Thr Glu Ile Thr Leu Ala

1040 1045 1050

Asn Gly Glu Ile Arg Lys Arg Pro Leu Ile Glu Thr Asn Gly Glu

1055 1060 1065

Thr Gly Glu Ile Val Trp Asp Lys Gly Arg Asp Phe Ala Thr Val

1070 1075 1080

Arg Lys Val Leu Ser Met Pro Gln Val Asn Ile Val Lys Lys Thr

1085 1090 1095

Glu Val Gln Thr Gly Gly Phe Ser Lys Glu Ser Ile Leu Pro Lys

1100 1105 1110

Arg Asn Ser Asp Lys Leu Ile Ala Arg Lys Lys Asp Trp Asp Pro

1115 1120 1125

Lys Lys Tyr Gly Gly Phe Asp Ser Pro Thr Val Ala Tyr Ser Val

1130 1135 1140

Leu Val Val Ala Lys Val Glu Lys Gly Lys Ser Lys Lys Leu Lys

1145 1150 1155

Ser Val Lys Glu Leu Leu Gly Ile Thr Ile Met Glu Arg Ser Ser

1160 1165 1170

Phe Glu Lys Asn Pro Ile Asp Phe Leu Glu Ala Lys Gly Tyr Lys

1175 1180 1185

Glu Val Lys Lys Asp Leu Ile Ile Lys Leu Pro Lys Tyr Ser Leu

1190 1195 1200

Phe Glu Leu Glu Asn Gly Arg Lys Arg Met Leu Ala Ser Ala Gly

1205 1210 1215

Glu Leu Gln Lys Gly Asn Glu Leu Ala Leu Pro Ser Lys Tyr Val

1220 1225 1230

Asn Phe Leu Tyr Leu Ala Ser His Tyr Glu Lys Leu Lys Gly Ser

1235 1240 1245

Pro Glu Asp Asn Glu Gln Lys Gln Leu Phe Val Glu Gln His Lys

1250 1255 1260

His Tyr Leu Asp Glu Ile Ile Glu Gln Ile Ser Glu Phe Ser Lys

1265 1270 1275

Arg Val Ile Leu Ala Asp Ala Asn Leu Asp Lys Val Leu Ser Ala

1280 1285 1290

Tyr Asn Lys His Arg Asp Lys Pro Ile Arg Glu Gln Ala Glu Asn

1295 1300 1305

Ile Ile His Leu Phe Thr Leu Thr Asn Leu Gly Ala Pro Ala Ala

1310 1315 1320

Phe Lys Tyr Phe Asp Thr Thr Ile Asp Arg Lys Arg Tyr Thr Ser

1325 1330 1335

Thr Lys Glu Val Leu Asp Ala Thr Leu Ile His Gln Ser Ile Thr

1340 1345 1350

Gly Leu Tyr Glu Thr Arg Ile Asp Leu Ser Gln Leu Gly Gly Asp

1355 1360 1365

Ala Tyr Pro Tyr Asp Val Pro Asp Tyr Ala Ser Leu Gly Ser Gly

1370 1375 1380

Ser Pro Lys Lys Lys Arg Lys Val Glu Asp Pro Lys Lys Lys Arg

1385 1390 1395

Lys Val Asp

1400

<210> 21

<211> 1368

<212> PRT

<213> Artificial sequence

<220>

<223> Description of Artificial Sequence: Synthetic polypeptide

<400> 21

Met Asp Lys Lys Tyr Ser Ile Gly Leu Ala Ile Gly Thr Asn Ser Val

1 5 10 15

Gly Trp Ala Val Ile Thr Asp Glu Tyr Lys Val Pro Ser Lys Lys Phe

20 25 30

Lys Val Leu Gly Asn Thr Asp Arg His Ser Ile Lys Lys Asn Leu Ile

35 40 45

Gly Ala Leu Leu Phe Asp Ser Gly Glu Thr Ala Glu Ala Thr Arg Leu

50 55 60

Lys Arg Thr Ala Arg Arg Arg Tyr Thr Arg Arg Lys Asn Arg Ile Cys

65 70 75 80

Tyr Leu Gln Glu Ile Phe Ser Asn Glu Met Ala Lys Val Asp Asp Ser

85 90 95

Phe Phe His Arg Leu Glu Glu Ser Phe Leu Val Glu Glu Asp Lys Lys

100 105 110

His Glu Arg His Pro Ile Phe Gly Asn Ile Val Asp Glu Val Ala Tyr

115 120 125

His Glu Lys Tyr Pro Thr Ile Tyr His Leu Arg Lys Lys Leu Val Asp

130 135 140

Ser Thr Asp Lys Ala Asp Leu Arg Leu Ile Tyr Leu Ala Leu Ala His

145 150 155 160

Met Ile Lys Phe Arg Gly His Phe Leu Ile Glu Gly Asp Leu Asn Pro

165 170 175

Asp Asn Ser Asp Val Asp Lys Leu Phe Ile Gln Leu Val Gln Thr Tyr

180 185 190

Asn Gln Leu Phe Glu Glu Asn Pro Ile Asn Ala Ser Gly Val Asp Ala

195 200 205

Lys Ala Ile Leu Ser Ala Arg Leu Ser Lys Ser Arg Arg Leu Glu Asn

210 215 220

Leu Ile Ala Gln Leu Pro Gly Glu Lys Lys Asn Gly Leu Phe Gly Asn

225 230 235 240

Leu Ile Ala Leu Ser Leu Gly Leu Thr Pro Asn Phe Lys Ser Asn Phe

245 250 255

Asp Leu Ala Glu Asp Ala Lys Leu Gln Leu Ser Lys Asp Thr Tyr Asp

260 265 270

Asp Asp Leu Asp Asn Leu Leu Ala Gln Ile Gly Asp Gln Tyr Ala Asp

275 280 285

Leu Phe Leu Ala Ala Lys Asn Leu Ser Asp Ala Ile Leu Leu Ser Asp

290 295 300

Ile Leu Arg Val Asn Thr Glu Ile Thr Lys Ala Pro Leu Ser Ala Ser

305 310 315 320

Met Ile Lys Arg Tyr Asp Glu His His Gln Asp Leu Thr Leu Leu Lys

325 330 335

Ala Leu Val Arg Gln Gln Leu Pro Glu Lys Tyr Lys Glu Ile Phe Phe

340 345 350

Asp Gln Ser Lys Asn Gly Tyr Ala Gly Tyr Ile Asp Gly Gly Ala Ser

355 360 365

Gln Glu Glu Phe Tyr Lys Phe Ile Lys Pro Ile Leu Glu Lys Met Asp

370 375 380

Gly Thr Glu Glu Leu Leu Val Lys Leu Asn Arg Glu Asp Leu Leu Arg

385 390 395 400

Lys Gln Arg Thr Phe Asp Asn Gly Ser Ile Pro His Gln Ile His Leu

405 410 415

Gly Glu Leu His Ala Ile Leu Arg Arg Gln Glu Asp Phe Tyr Pro Phe

420 425 430

Leu Lys Asp Asn Arg Glu Lys Ile Glu Lys Ile Leu Thr Phe Arg Ile

435 440 445

Pro Tyr Tyr Val Gly Pro Leu Ala Arg Gly Asn Ser Arg Phe Ala Trp

450 455 460

Met Thr Arg Lys Ser Glu Glu Thr Ile Thr Pro Trp Asn Phe Glu Glu

465 470 475 480

Val Val Asp Lys Gly Ala Ser Ala Gln Ser Phe Ile Glu Arg Met Thr

485 490 495

Asn Phe Asp Lys Asn Leu Pro Asn Glu Lys Val Leu Pro Lys His Ser

500 505 510

Leu Leu Tyr Glu Tyr Phe Thr Val Tyr Asn Glu Leu Thr Lys Val Lys

515 520 525

Tyr Val Thr Glu Gly Met Arg Lys Pro Ala Phe Leu Ser Gly Glu Gln

530 535 540

Lys Lys Ala Ile Val Asp Leu Leu Phe Lys Thr Asn Arg Lys Val Thr

545 550 555 560

Val Lys Gln Leu Lys Glu Asp Tyr Phe Lys Lys Ile Glu Cys Phe Asp

565 570 575

Ser Val Glu Ile Ser Gly Val Glu Asp Arg Phe Asn Ala Ser Leu Gly

580 585 590

Thr Tyr His Asp Leu Leu Lys Ile Ile Lys Asp Lys Asp Phe Leu Asp

595 600 605

Asn Glu Glu Asn Glu Asp Ile Leu Glu Asp Ile Val Leu Thr Leu Thr

610 615 620

Leu Phe Glu Asp Arg Glu Met Ile Glu Glu Arg Leu Lys Thr Tyr Ala

625 630 635 640

His Leu Phe Asp Asp Lys Val Met Lys Gln Leu Lys Arg Arg Arg Tyr

645 650 655

Thr Gly Trp Gly Arg Leu Ser Arg Lys Leu Ile Asn Gly Ile Arg Asp

660 665 670

Lys Gln Ser Gly Lys Thr Ile Leu Asp Phe Leu Lys Ser Asp Gly Phe

675 680 685

Ala Asn Arg Asn Phe Met Gln Leu Ile His Asp Asp Ser Leu Thr Phe

690 695 700

Lys Glu Asp Ile Gln Lys Ala Gln Val Ser Gly Gln Gly Asp Ser Leu

705 710 715 720

His Glu His Ile Ala Asn Leu Ala Gly Ser Pro Ala Ile Lys Lys Gly

725 730 735

Ile Leu Gln Thr Val Lys Val Val Asp Glu Leu Val Lys Val Met Gly

740 745 750

Arg His Lys Pro Glu Asn Ile Val Ile Glu Met Ala Arg Glu Asn Gln

755 760 765

Thr Thr Gln Lys Gly Gln Lys Asn Ser Arg Glu Arg Met Lys Arg Ile

770 775 780

Glu Glu Gly Ile Lys Glu Leu Gly Ser Gln Ile Leu Lys Glu His Pro

785 790 795 800

Val Glu Asn Thr Gln Leu Gln Asn Glu Lys Leu Tyr Leu Tyr Tyr Leu

805 810 815

Gln Asn Gly Arg Asp Met Tyr Val Asp Gln Glu Leu Asp Ile Asn Arg

820 825 830

Leu Ser Asp Tyr Asp Val Asp Ala Ile Val Pro Gln Ser Phe Leu Lys

835 840 845

Asp Asp Ser Ile Asp Asn Lys Val Leu Thr Arg Ser Asp Lys Asn Arg

850 855 860

Gly Lys Ser Asp Asn Val Pro Ser Glu Glu Val Val Lys Lys Met Lys

865 870 875 880

Asn Tyr Trp Arg Gln Leu Leu Asn Ala Lys Leu Ile Thr Gln Arg Lys

885 890 895

Phe Asp Asn Leu Thr Lys Ala Glu Arg Gly Gly Leu Ser Glu Leu Asp

900 905 910

Lys Ala Gly Phe Ile Lys Arg Gln Leu Val Glu Thr Arg Gln Ile Thr

915 920 925

Lys His Val Ala Gln Ile Leu Asp Ser Arg Met Asn Thr Lys Tyr Asp

930 935 940

Glu Asn Asp Lys Leu Ile Arg Glu Val Lys Val Ile Thr Leu Lys Ser

945 950 955 960

Lys Leu Val Ser Asp Phe Arg Lys Asp Phe Gln Phe Tyr Lys Val Arg

965 970 975

Glu Ile Asn Asn Tyr His His Ala His Asp Ala Tyr Leu Asn Ala Val

980 985 990

Val Gly Thr Ala Leu Ile Lys Lys Tyr Pro Lys Leu Glu Ser Glu Phe

995 1000 1005

Val Tyr Gly Asp Tyr Lys Val Tyr Asp Val Arg Lys Met Ile Ala

1010 1015 1020

Lys Ser Glu Gln Glu Ile Gly Lys Ala Thr Ala Lys Tyr Phe Phe

1025 1030 1035

Tyr Ser Asn Ile Met Asn Phe Phe Lys Thr Glu Ile Thr Leu Ala

1040 1045 1050

Asn Gly Glu Ile Arg Lys Arg Pro Leu Ile Glu Thr Asn Gly Glu

1055 1060 1065

Thr Gly Glu Ile Val Trp Asp Lys Gly Arg Asp Phe Ala Thr Val

1070 1075 1080

Arg Lys Val Leu Ser Met Pro Gln Val Asn Ile Val Lys Lys Thr

1085 1090 1095

Glu Val Gln Thr Gly Gly Phe Ser Lys Glu Ser Ile Leu Pro Lys

1100 1105 1110

Arg Asn Ser Asp Lys Leu Ile Ala Arg Lys Lys Asp Trp Asp Pro

1115 1120 1125

Lys Lys Tyr Gly Gly Phe Asp Ser Pro Thr Val Ala Tyr Ser Val

1130 1135 1140

Leu Val Val Ala Lys Val Glu Lys Gly Lys Ser Lys Lys Leu Lys

1145 1150 1155

Ser Val Lys Glu Leu Leu Gly Ile Thr Ile Met Glu Arg Ser Ser

1160 1165 1170

Phe Glu Lys Asn Pro Ile Asp Phe Leu Glu Ala Lys Gly Tyr Lys

1175 1180 1185

Glu Val Lys Lys Asp Leu Ile Ile Lys Leu Pro Lys Tyr Ser Leu

1190 1195 1200

Phe Glu Leu Glu Asn Gly Arg Lys Arg Met Leu Ala Ser Ala Gly

1205 1210 1215

Glu Leu Gln Lys Gly Asn Glu Leu Ala Leu Pro Ser Lys Tyr Val

1220 1225 1230

Asn Phe Leu Tyr Leu Ala Ser His Tyr Glu Lys Leu Lys Gly Ser

1235 1240 1245

Pro Glu Asp Asn Glu Gln Lys Gln Leu Phe Val Glu Gln His Lys

1250 1255 1260

His Tyr Leu Asp Glu Ile Ile Glu Gln Ile Ser Glu Phe Ser Lys

1265 1270 1275

Arg Val Ile Leu Ala Asp Ala Asn Leu Asp Lys Val Leu Ser Ala

1280 1285 1290

Tyr Asn Lys His Arg Asp Lys Pro Ile Arg Glu Gln Ala Glu Asn

1295 1300 1305

Ile Ile His Leu Phe Thr Leu Thr Asn Leu Gly Ala Pro Ala Ala

1310 1315 1320

Phe Lys Tyr Phe Asp Thr Thr Ile Asp Arg Lys Arg Tyr Thr Ser

1325 1330 1335

Thr Lys Glu Val Leu Asp Ala Thr Leu Ile His Gln Ser Ile Thr

1340 1345 1350

Gly Leu Tyr Glu Thr Arg Ile Asp Leu Ser Gln Leu Gly Gly Asp

1355 1360 1365

<210> 22

<211> 4

<212> PRT

<213> Artificial sequence

<220>

<223> Description of Artificial Sequence: Synthetic peptide

<400> 22

Gly Gly Ser Gly

1

<210> 23

<211> 12

<212> PRT

<213> Artificial sequence

<220>

<223> Description of Artificial Sequence: Synthetic peptide

<400> 23

Ser Gly Ser Glu Thr Pro Gly Thr Ser Glu Ser Ala

1 5 10

<210> 24

<211> 16

<212> PRT

<213> Artificial sequence

<220>

<223> Description of Artificial Sequence: Synthetic peptide

<400> 24

Ser Gly Ser Glu Thr Pro Gly Thr Ser Glu Ser Ala Thr Pro Glu Ser

1 5 10 15

<210> 25

<211> 21

<212> PRT

<213> Artificial sequence

<220>

<223> Description of Artificial Sequence: Synthetic peptide

<400> 25

Ser Gly Ser Glu Thr Pro Gly Thr Ser Glu Ser Ala Thr Pro Glu Gly

1 5 10 15

Gly Ser Gly Gly Ser

20

<210> 26

<211> 24

<212> DNA

<213> Artificial sequence

<220>

<223> Description of Artificial Sequence: Synthetic oligonucleotide

<400> 26

gccgaggtga agttcgaggg cgac 24

<210> 27

<211> 23

<212> DNA

<213> Artificial sequence

<220>

<223> Description of Artificial Sequence: Synthetic oligonucleotide

<400> 27

cctacggcgt gcagtgcttc agc 23

<210> 28

<211> 332

<212> DNA

<213> Artificial sequence

<220>

<223> Description of Artificial Sequence: Synthetic oligonucleotide

<400> 28

acgtaaacgg ccacaagttc agcgtgtccg gcgagggcga gggcgatgcc acctacggca 60

agctgaccct gaagttcatc tgcaccaccg gcaagctgcc cgtgccctgg cccaccctcg 120

tgaccaccct gacctacggc gtgcagtgct tcagccgcta ccccgaccac atgaagcagc 180

acgacttctt caagtccgcc atgcccgaag gctacgtcca ggagcgcacc atcttcttca 240

aggacgacgg caactacaag acccgcgccg aggtgaagtt cgagggcgac accctggtga 300

accgcatcga gctgaagggc atcgacttca ag 332

<210> 29

<211> 295

<212> DNA

<213> Artificial sequence

<220>

<223> Description of Artificial Sequence: Synthetic oligonucleotide

<400> 29

gtccggcgag ggcgagggcg atgccaccta cggcaagctg accctgaagt tcatctgcac 60

caccggcaag ctgcccgtgc cctggcccac cctcgtgacc accctgacct acggcgtgca 120

gtgcttcagc cgctaccccg accacatgaa gcagcacgac ttcttcaagt ccgccatgcc 180

cgaaggctac gtccaggagc gcaccatctt cttcaaggac gacggcaact acaagacccg 240

cgtcgagggc gacaccctgg tgaaccgcat cgagctgaag ggcatcgact tcaag 295

<210> 30

<211> 158

<212> DNA

<213> Artificial sequence

<220>

<223> Description of Artificial Sequence: Synthetic oligonucleotide

<400> 30

tccggcgagg gcgagggcga tgccacctac ggcaagctga ccctgaagtt catctgcacc 60

accggcaagc tgcccgtgcc ctggcccacc ctcgtgacca ccctgacgag ggcgacaccc 120

tggtgaaccg catcgagctg aagggcatcg acttcaag 158

<210> 31

<211> 285

<212> DNA

<213> Artificial sequence

<220>

<223> Description of Artificial Sequence: Synthetic oligonucleotide

<400> 31

tccggcgagg gcgagggcga tgccacctac ggcaagctga ccctgaagtt catctgcacc 60

accggcaagc tgcccgtgcc ctggcccacc ctcgtgaaca ccctgaccta cgtgcagtgc 120

ttcagccgct accccgacca catgaagcag cacgacttct tcaagtccgc catgcccgaa 180

ggctacgtcc aggagcgcac catcttcttc aaggacgacg gcaactacaa gttcgagggc 240

gacaccctgg tgaaccgcat cgagctgaag ggcatcgact tcaag 285

<210> 32

<211> 301

<212> DNA

<213> Artificial sequence

<220>

<223> Description of Artificial Sequence: Synthetic oligonucleotide

<400> 32

ccggcgaggg cgagggcgat gccacctacg gcaagctgac cctgaagttc atctgcacca 60

ccggcaagct gcccgtgccc tggcccaccc tcgtgaccac cctgacctac gtgcagtgct 120

tcagccgcta ccccgaccac atgaagcagc acgacttctt caagtccgcc atgcccgaag 180

gctacgtcca ggagcgcacc atcttcttca aggacgacgg caactacaag acccgcgccg 240

aggtgaagtt cgagggcgac accctggtga accgcatcga gctgaagggc atcgacttca 300

a 301

<210> 33

<211> 156

<212> DNA

<213> Artificial sequence

<220>

<223> Description of Artificial Sequence: Synthetic oligonucleotide

<400> 33

tccggcgagg gcgagggcga tgccacctac ggcaagctga ccctgaagtt catctgcacc 60

accggcaagc tgcccgtgcc ctggcccacc ctcgtgacca ccctgaccta cggacaccct 120

ggtgaaccgc atcgagctga agggcatcga cttcaa 156

<210> 34

<211> 197

<212> DNA

<213> Artificial sequence

<220>

<223> Description of Artificial Sequence: Synthetic oligonucleotide

<400> 34

gtaaacggcc acaagttcag cgtgcagtgc ttcagccgct accccgacca catgaagcag 60

cacgacttct tcaagtccgc catgcccgaa ggctacgtcc aggagcgcac catcttcttc 120

aaggacgacg gcaactacaa gacggcaact acaagttcga gggcgacacc ctggtgaacc 180

gcatcgagct gaagggc 197

<210> 35

<211> 283

<212> DNA

<213> Artificial sequence

<220>

<223> Description of Artificial Sequence: Synthetic oligonucleotide

<400> 35

ccacaagttc agcgtgtccg gcgagggcga gggcgatgcc acctacggca agctgaccct 60

gaagttcatc tgcaccaccg gcaagctgcc cgtgccctgg cccaccctcg tgaccaccct 120

gacctacggc gtgcagtgct tcagccgcta ccccgaccac atgaagcagc acgacttctt 180

caagtccgcc atgcccgaag gctacgtcca ggagcgcacc atcttcttca aggacgacgg 240

caactacaag acaccgcatc gagctgaagg gcatcgactt caa 283

<210> 36

<211> 175

<212> DNA

<213> Artificial sequence

<220>

<223> Description of Artificial Sequence: Synthetic oligonucleotide

<400> 36

ggccacaagt tcagcgtgtc cggcgagggc gagggcgatg ccacctacgg caagctgacc 60

ctgaagttca tctgcaccac cggcaccatc aaggacgacg gcaactacaa gacccgcgcc 120

gaagttcgag ggcgacaccc tggtgaaccg catcgagctg aagggcatcg acttc 175

<210> 37

<211> 291

<212> DNA

<213> Artificial sequence

<220>

<223> Description of Artificial Sequence: Synthetic oligonucleotide

<400> 37

taaacggcca caagttcagc gtgtccggcg agggcgaggg cgatgccacc tacggcaagc 60

tgaccctgaa gttcatctgc accaccggca agctgcccgt gccctggccc accctcgtga 120

ccaccctgac ctacggcgtg cagtgcttca gccgctaccc cgaccacatg aagcagcacg 180

acttcttcaa gtccgccatg cccgaaggct acgtccagga gcgcaccatc ttcttcaagg 240

acgacggcaa ctacaagacc tggtgaaccg catcgagctg aagggcatcg a 291

<210> 38

<211> 314

<212> DNA

<213> Artificial sequence

<220>

<223> Description of Artificial Sequence: Synthetic oligonucleotide

<400> 38

cacaagttca gcgtgtccgg cgagggcgag ggcgatgcca cctacggcaa gctgaccctg 60

aagttcatct gcaccaccgg caagctgccc gtgccctggc ccaccctcgt gaccaccctg 120

acctacggcg tgcagtgctt cagccgctac cccgaccaca tgaagcagca cgacttcttc 180

aagtccgcca tgcccgaagg ctacgtccag gagcgcacca tcttcttcaa ggacgacggc 240

aactacaaga cccgcgccga ggttcgaggg cgacaccctg gtgaaccgca tcgagctgaa 300

gggcatcgac ttca 314

<210> 39

<211> 316

<212> DNA

<213> Artificial sequence

<220>

<223> Description of Artificial Sequence: Synthetic oligonucleotide

<400> 39

taaacggcca caagttcagc gtgtccggcg agggcgaggg cgatgccacc tacggcaagc 60

tgaccctgaa gttcatctgc accaccggca agctgcccgt gccctggccc accctcgtga 120

ccaccctgac ctacggcgtg cagtgcttca gccgctaccc cgaccacatg aagcagcacg 180

acttcttcaa gtccgccatg cccgaaggct acgtccagga gcgcaccatc ttcttcaagg 240

acgacggcaa ctacaagacc cgcgccgagg gcgacaccct ggtgaaccgc atcgagctga 300

agggcatcga cttcaa 316

<210> 40

<211> 318

<212> DNA

<213> Artificial sequence

<220>

<223> Description of Artificial Sequence: Synthetic oligonucleotide

<400> 40

aaacggccac aagttcagcg tgtccggcga gggcgagggc gatgccacct acggcaagct 60

gaccctgaag ttcatctgca ccaccggcaa gctgcccgtg ccctggccca ccctcgtgac 120

caccctgacc tacgtgcagt gcttcagccg ctaccccgac cacatgaagc agcacgactt 180

cttcaagtcc gccatgcccg aaggctacgt ccaggagcgc accatcttct tcaaggacga 240

cggcaactac aagacccgcg ccgaggtgaa gttcgagggc gacaccctgg tgaaccgcat 300

cgagctgaag ggcatcga 318

<210> 41

<211> 185

<212> DNA

<213> Artificial sequence

<220>

<223> Description of Artificial Sequence: Synthetic oligonucleotide

<400> 41

acggccacaa gttcagcgtg tccggcgagg gcgagggcga tgccacctac ggcaagctga 60

ccctgaagtt catctgcacc accggcaagc tgcccgtgcc ctggcccacc ctcgtgacca 120

ccctgaccta cgtgaagttc gagggcgaca ccctggtgaa ccgcatcgag ctgaagggca 180

tcgac 185

<210> 42

<211> 180

<212> DNA

<213> Artificial sequence

<220>

<223> Description of Artificial Sequence: Synthetic oligonucleotide

<400> 42

aaacggccac aagttcagcg tgtccggcga gggcgagggc gatgccacct acggcaagct 60

gaccctgaag ttcatctgca ccaccggcaa gctgcccgtg ccctggccca ccctcgtgac 120

caccctgacc tacgtgaagt tcgagggcga caccctggtg aaccgcatcg agctgaaggg 180

<210> 43

<211> 242

<212> DNA

<213> Artificial sequence

<220>

<223> Description of Artificial Sequence: Synthetic oligonucleotide

<220>

<221> modified_base

<222> (132)..(132)

<223> a, c, t, g, unknown or other

<220>

<221> modified_base

<222> (153)..(153)

<223> a, c, t, g, unknown or other

<220>

<221> modified_base

<222> (174)..(174)

<223> a, c, t, g, unknown or other

<400> 43

aaacggccac aagttcagcg tgtccggcga gggcgagggc gatgccacct acggcaagct 60

gaccctgaag ttcatctgca ccaccggcaa gctgcccgtg ccctggccca cccttccgcc 120

atgcccgaag gntacgtcca ggagcgcacc atnttcttca aggacgacgg caantacaag 180

acccgcgccg aagttcgagg gcgacaccct ggtgaaccgc atcgagctga agggcatcga 240

ct 242

<210> 44

<211> 175

<212> DNA

<213> Artificial sequence

<220>

<223> Description of Artificial Sequence: Synthetic oligonucleotide

<400> 44

caccctgacc tacggcgtgc agtgcttcag ccgctacccc gaccacatga agcagcacga 60

cttcttcaag tccgccatgc ccgaaggcta cgtccaggag cgcaccatct tcttcaagga 120

cgacggcaac tacaagaccc gcgccgaggt gaagttcgag ggcgacaccc tggtg 175

<210> 45

<211> 39

<212> DNA

<213> Artificial sequence

<220>

<223> Description of Artificial Sequence: Synthetic oligonucleotide

<400> 45

caccctgacc tacgtgaagt tcgagggcga caccctggt 39

<210> 46

<211> 161

<212> DNA

<213> Artificial sequence

<220>

<223> Description of Artificial Sequence: Synthetic oligonucleotide

<400> 46

caccctgacc tacggcgtgc agtgcttcag ccgctacccc gaccacatga agcagcacga 60

cttcttcaag tccgccatgc ccgaaggcta cgtccaggag cgcaccatct tcttcaagga 120

cgacggcaac tacaagaccc gcgccgaggg cgacaccctg g 161

<210> 47

<211> 145

<212> DNA

<213> Artificial sequence

<220>

<223> Description of Artificial Sequence: Synthetic oligonucleotide

<400> 47

caccctgacc tcagccgcta ccccgaccac atgaagcagc acgacttctt caagtccgcc 60

atgcccgaag gctacgtcca ggagcgcacc atcttcttca aggacgacgg caactacaag 120

acccgcgccg aggcgacacc ctggt 145

<210> 48

<211> 153

<212> DNA

<213> Artificial sequence

<220>

<223> Description of Artificial Sequence: Synthetic oligonucleotide

<400> 48

ttattataca tcggagccct gccaaaaaat caatgtgaag caaatcgcag cccgcctcct 60

gcctccgctc tactcactgg tgttcatctt tggttttgtg ggcaacatgc tggtcatcct 120

catcctgata aactgcaaaa ggctgaagag cat 153

<210> 49

<211> 113

<212> DNA

<213> Artificial sequence

<220>

<223> Description of Artificial Sequence: Synthetic oligonucleotide

<400> 49

ttattataca tcggagccct gccaaaaaat caatgtgaag caaatcgcag cccgcctccg 60

ctctactcac tggtgttcat ctttggtttt gtgggcaaca tgctggtcat cct 113

<210> 50

<211> 70

<212> DNA

<213> Artificial sequence

<220>

<223> Description of Artificial Sequence: Synthetic oligonucleotide

<400> 50

ttattataca tcggagccct gccaaaaaat caatgtgaag caaatcgcag cccgcatgct 60

ggtcatcctc 70

<210> 51

<211> 62

<212> RNA

<213> Artificial sequence

<220>

<223> Description of Artificial Sequence: Synthetic oligonucleotide

<220>

<221> modified_base

<222> (1)..(20)

<223> a, c, u, or g; this region may comprise 17-20 nucleotides, wherein some positions may be absent

<400> 51

nnnnnnnnnn nnnnnnnnnn guuuuagagc uagaaauagc aaguuaaaau aaggcuaguc 60

cg 62

<210> 52

<211> 54

<212> RNA

<213> Artificial sequence

<220>

<223> Description of Artificial Sequence: Synthetic oligonucleotide

<220>

<221> modified_base

<222> (1)..(20)

<223> a, c, u, or g; this region may comprise 17-20 nucleotides, wherein some positions may be absent

<400> 52

nnnnnnnnnn nnnnnnnnnn guuuuagagc uagaaauagc aaguuaaaau aagg 54

<210> 53

<211> 28

<212> RNA

<213> Artificial sequence

<220>

<223> Description of Artificial Sequence: Synthetic oligonucleotide

<400> 53

guucacugcc guauaggcag cuaagaaa 28

<210> 54

<211> 20

<212> RNA

<213> Artificial sequence

<220>

<223> Description of Artificial Sequence: Synthetic oligonucleotide

<400> 54

guucacugcc guauaggcag 20
