Orthogonal Cas9 proteins for RNA-guided gene regulation and editing

Document No.: 1459365 Published: 2020-02-21

Reading note: This technology, Orthogonal Cas9 proteins for RNA-guided gene regulation and editing, was designed by George M. Church, Kevin Esvelt, and Prashant Mali on 2014-07-08. Summary: Orthogonal Cas9 proteins for RNA-guided gene regulation and editing are provided. Methods of modulating expression of a target nucleic acid in a cell are provided, including simultaneously and independently regulating, or simultaneously and independently editing, corresponding genes using multiple orthogonal Cas9 proteins.

1. A method of regulating expression of two or more target nucleic acids in a cell, comprising:

introducing into the cell a first foreign nucleic acid encoding two or more RNAs complementary to the two or more target nucleic acids,

introducing into the cell a second foreign nucleic acid encoding two or more orthogonal RNA-guided nuclease-null DNA binding proteins that respectively bind to the two or more target nucleic acids and are guided by the two or more RNAs,

introducing into the cell a third foreign nucleic acid encoding two or more transcriptional regulator proteins or domains,

wherein the RNA, the orthogonal RNA-guided nuclease null DNA binding protein, and the transcriptional regulator protein or domain are expressed,

wherein two or more co-localized complexes are formed between the RNA, orthogonal RNA guided nuclease null DNA binding protein, transcriptional regulator protein or domain and the target nucleic acid, and wherein the transcriptional regulator protein or domain regulates expression of the target nucleic acid.

2. The method of claim 1, wherein the foreign nucleic acid encoding an orthogonal RNA guided nuclease null DNA binding protein further encodes a transcriptional regulator protein or domain fused to the orthogonal RNA guided nuclease null DNA binding protein.

3. The method of claim 1, wherein the foreign nucleic acid encoding two or more RNAs further encodes a target of an RNA binding domain and the foreign nucleic acid encoding the transcriptional regulator protein or domain further encodes an RNA binding domain fused to the transcriptional regulator protein or domain.

4. The method of claim 1, wherein the cell is a eukaryotic cell.

5. The method of claim 1, wherein the cell is a yeast cell, a plant cell, or an animal cell.

6. The method of claim 1, wherein the RNA is between about 10 to about 500 nucleotides.

7. The method of claim 1, wherein the RNA is between about 20 to about 100 nucleotides.

8. The method of claim 1, wherein the transcriptional regulator protein or domain is a transcriptional activator.

9. The method of claim 1, wherein the transcriptional regulator protein or domain upregulates expression of the target nucleic acid.

10. The method of claim 1, wherein the transcriptional regulator protein or domain upregulates expression of the target nucleic acid to treat a disease or detrimental condition.

Background

Bacterial and archaeal CRISPR-Cas systems rely on short guide RNAs in complex with Cas proteins to direct degradation of complementary sequences present within invading foreign nucleic acid. See Deltcheva, E. et al. CRISPR RNA maturation by trans-encoded small RNA and host factor RNase III. Nature 471, 602-607 (2011); Gasiunas, G., Barrangou, R., Horvath, P. & Siksnys, V. Cas9-crRNA ribonucleoprotein complex mediates specific DNA cleavage for adaptive immunity in bacteria. Proceedings of the National Academy of Sciences of the United States of America 109, E2579-2586 (2012); Jinek, M. et al. A programmable dual-RNA-guided DNA endonuclease in adaptive bacterial immunity. Science 337, 816-821 (2012); Sapranauskas, R. et al. The Streptococcus thermophilus CRISPR/Cas system provides immunity in Escherichia coli. Nucleic Acids Research 39, 9275-9282 (2011); and Bhaya, D., Davison, M. & Barrangou, R. CRISPR-Cas systems in bacteria and archaea: versatile small RNAs for adaptive defense and regulation. Annual Review of Genetics 45, 273-297 (2011). A recent in vitro reconstitution of the Streptococcus pyogenes (S. pyogenes) type II CRISPR system demonstrated that a crRNA ("CRISPR RNA") fused to the normally trans-encoded tracrRNA ("trans-activating CRISPR RNA") is sufficient to direct Cas9 protein to sequence-specifically cleave target DNA sequences matching the crRNA. Expressing a gRNA homologous to a target site results in Cas9 recruitment and degradation of the target DNA. See Deveau, H. et al. Phage response to CRISPR-encoded resistance in Streptococcus thermophilus. Journal of Bacteriology 190, 1390 (Feb, 2008).

Disclosure of Invention

Aspects of the disclosure relate to complexes of a guide RNA, a DNA binding protein, and a double-stranded DNA target sequence. According to certain aspects, DNA binding proteins within the scope of the present disclosure include proteins that form a complex with the guide RNA, with the guide RNA directing the complex to a double-stranded DNA sequence, wherein the complex binds to the DNA sequence. This aspect of the disclosure may be referred to as co-localization of the RNA and the DNA binding protein to or with the double-stranded DNA. In this manner, a DNA binding protein-guide RNA complex can be used to localize a transcriptional regulator protein or domain at the target DNA so as to regulate expression of the target DNA. According to one aspect, two or more orthogonal RNA-guided DNA binding proteins, or a set of orthogonal RNA-guided DNA binding proteins, may be used to simultaneously and independently regulate genes in DNA in a cell. According to one aspect, two or more orthogonal RNA-guided DNA binding proteins, or a set of orthogonal RNA-guided DNA binding proteins, may be used to simultaneously and independently edit genes in DNA in a cell. It is understood that when reference is made to a DNA binding protein or an RNA-guided DNA binding protein, such reference includes an orthogonal DNA binding protein or an orthogonal RNA-guided DNA binding protein. Such orthogonal DNA binding proteins or orthogonal RNA-guided DNA binding proteins may have nuclease activity, may have nickase activity, or may be nuclease null.
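By way of illustration only, the notion of orthogonality used above (each guide RNA directs only its own DNA binding protein) can be sketched as a cross-talk check over a table of activities. The protein and guide names, the activity values, and the 5% cross-talk threshold below are hypothetical assumptions chosen for the example; none of them come from this disclosure.

```python
from itertools import product

# Hypothetical activity table in arbitrary units, keyed by (protein, guide).
# Names and numbers are illustrative assumptions, not measured data.
activity = {
    ("Cas9_A", "guide_A"): 1.00, ("Cas9_A", "guide_B"): 0.01,
    ("Cas9_B", "guide_A"): 0.02, ("Cas9_B", "guide_B"): 0.95,
}


def is_orthogonal(activity, pairs, max_crosstalk=0.05):
    """Return True if every non-cognate protein/guide combination shows at most
    max_crosstalk of the activity that protein has with its own guide."""
    cognate = dict(pairs)  # protein -> its intended guide
    for protein, guide in product(cognate, cognate.values()):
        if cognate[protein] == guide:
            continue  # skip the intended pairing
        ratio = activity[(protein, guide)] / activity[(protein, cognate[protein])]
        if ratio > max_crosstalk:
            return False
    return True


pairs = [("Cas9_A", "guide_A"), ("Cas9_B", "guide_B")]
print(is_orthogonal(activity, pairs))
```

A set that passes such a check could, in principle, be assigned to different functions (regulation or editing) in the same cell without one guide redirecting another protein; the threshold itself is a design choice rather than a value taken from the text.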

According to certain aspects, there is provided a method of regulating expression of a target nucleic acid in a cell, the method comprising introducing into the cell a first foreign nucleic acid encoding one or more RNAs (ribonucleic acids) complementary to DNA (deoxyribonucleic acid), wherein the DNA comprises the target nucleic acid, introducing into the cell a second foreign nucleic acid encoding an RNA-guided nuclease-null DNA binding protein that binds to the DNA and is guided by the one or more RNAs, introducing into the cell a third foreign nucleic acid encoding a transcriptional regulator protein or domain, wherein the one or more RNAs, the RNA-guided nuclease-null DNA binding protein, and the transcriptional regulator protein or domain are expressed, wherein the one or more RNAs, the RNA-guided nuclease-null DNA binding protein, and the transcriptional regulator protein or domain co-localize to the DNA, and wherein the transcriptional regulator protein or domain regulates expression of the target nucleic acid.

According to one aspect, the foreign nucleic acid encoding the RNA guided nuclease null DNA binding protein further encodes a transcriptional regulator protein or domain fused to the RNA guided nuclease null DNA binding protein. According to one aspect, the foreign nucleic acid encoding one or more RNAs further encodes a target for an RNA binding domain and the foreign nucleic acid encoding a transcriptional regulator protein or domain further encodes an RNA binding domain fused to the transcriptional regulator protein or domain.

According to one aspect, the cell is a eukaryotic cell. According to one aspect, the cell is a yeast cell, a plant cell or an animal cell. According to one aspect, the cell is a mammalian cell.

According to one aspect, the RNA is between about 10 to about 500 nucleotides. According to one aspect, the RNA is between about 20 to about 100 nucleotides.

According to one aspect, the transcriptional regulator protein or domain is a transcriptional activator. According to one aspect, the transcriptional regulator protein or domain upregulates expression of the target nucleic acid. According to one aspect, the transcriptional regulator protein or domain upregulates expression of the target nucleic acid to treat a disease or detrimental condition. According to one aspect, the target nucleic acid is associated with a disease or detrimental condition. According to one aspect, the transcriptional regulator protein or domain is a transcriptional repressor. According to one aspect, the transcriptional regulator protein or domain downregulates expression of the target nucleic acid. According to one aspect, the transcriptional regulator protein or domain downregulates expression of the target nucleic acid to treat a disease or detrimental condition. According to one aspect, the target nucleic acid is associated with a disease or detrimental condition.

According to one aspect, the one or more RNAs is a guide RNA. According to one aspect, the one or more RNAs is a tracrRNA-crRNA fusion.

According to one aspect, the DNA is genomic DNA, mitochondrial DNA, viral DNA, or foreign DNA.

According to certain aspects, there is provided a method of regulating expression of a target nucleic acid in a cell, comprising introducing into the cell a first foreign nucleic acid encoding one or more RNAs (ribonucleic acids) complementary to DNA (deoxyribonucleic acid), wherein the DNA comprises the target nucleic acid, introducing into the cell a second foreign nucleic acid encoding an RNA-guided nuclease-null DNA binding protein of a type II CRISPR system that binds to the DNA and is guided by the one or more RNAs, introducing into the cell a third foreign nucleic acid encoding a transcriptional regulator protein or domain, wherein the one or more RNAs, the RNA-guided nuclease-null DNA binding protein of the type II CRISPR system, and the transcriptional regulator protein or domain are expressed, wherein the one or more RNAs, the RNA-guided nuclease-null DNA binding protein of the type II CRISPR system, and the transcriptional regulator protein or domain co-localize to the DNA, and wherein the transcriptional regulator protein or domain regulates expression of the target nucleic acid.

According to one aspect, the foreign nucleic acid encoding an RNA-guided nuclease null DNA-binding protein of a type II CRISPR system further encodes a transcriptional regulator protein or domain fused to the RNA-guided nuclease null DNA-binding protein of the type II CRISPR system. According to one aspect, the foreign nucleic acid encoding one or more RNAs further encodes a target for an RNA binding domain and the foreign nucleic acid encoding a transcriptional regulator protein or domain further encodes an RNA binding domain fused to the transcriptional regulator protein or domain.

According to one aspect, the cell is a eukaryotic cell. According to one aspect, the cell is a yeast cell, a plant cell or an animal cell. According to one aspect, the cell is a mammalian cell.

According to one aspect, the RNA is between about 10 to about 500 nucleotides. According to one aspect, the RNA is between about 20 to about 100 nucleotides.

According to one aspect, the transcriptional regulator protein or domain is a transcriptional activator. According to one aspect, the transcriptional regulator protein or domain upregulates expression of the target nucleic acid. According to one aspect, the transcriptional regulator protein or domain upregulates expression of the target nucleic acid to treat a disease or detrimental condition. According to one aspect, the target nucleic acid is associated with a disease or deleterious condition.

According to one aspect, the one or more RNAs is a guide RNA. According to one aspect, the one or more RNAs is a tracrRNA-crRNA fusion.

According to one aspect, the DNA is genomic DNA, mitochondrial DNA, viral DNA, or foreign DNA.

According to certain aspects, methods of modulating expression of a target nucleic acid in a cell are provided, comprising introducing into the cell a first foreign nucleic acid encoding one or more RNAs (ribonucleic acids) complementary to DNA (deoxyribonucleic acids), wherein the DNA comprises the target nucleic acid, introducing into the cell a second foreign nucleic acid encoding a nuclease-null Cas9 protein that binds to the DNA and is guided by the one or more RNAs, introducing into the cell a third foreign nucleic acid encoding a transcriptional regulator protein or domain, wherein the one or more RNAs, nuclease-null Cas9 protein, and transcriptional regulator protein or domain are expressed, wherein the one or more RNAs, nuclease-null Cas9 protein, and transcriptional regulator protein or domain are co-localized to the DNA, and wherein the transcriptional regulator protein or domain regulates expression of the target nucleic acid.

According to one aspect, the foreign nucleic acid encoding the nuclease-null Cas9 protein further encodes a transcriptional regulator protein or domain fused to the nuclease-null Cas9 protein. According to one aspect, the foreign nucleic acid encoding one or more RNAs further encodes a target for an RNA binding domain and the foreign nucleic acid encoding a transcriptional regulator protein or domain further encodes an RNA binding domain fused to the transcriptional regulator protein or domain.

According to one aspect, the cell is a eukaryotic cell. According to one aspect, the cell is a yeast cell, a plant cell or an animal cell. According to one aspect, the cell is a mammalian cell.

According to one aspect, the RNA is between about 10 to about 500 nucleotides. According to one aspect, the RNA is between about 20 to about 100 nucleotides.

According to one aspect, the transcriptional regulator protein or domain is a transcriptional activator. According to one aspect, the transcriptional regulator protein or domain upregulates expression of the target nucleic acid. According to one aspect, the transcriptional regulator protein or domain upregulates expression of the target nucleic acid to treat a disease or detrimental condition. According to one aspect, the target nucleic acid is associated with a disease or deleterious condition.

According to one aspect, the one or more RNAs is a guide RNA. According to one aspect, the one or more RNAs is a tracrRNA-crRNA fusion.

According to one aspect, the DNA is genomic DNA, mitochondrial DNA, viral DNA, or foreign DNA.

According to one aspect, there is provided a cell comprising: a first foreign nucleic acid encoding one or more RNAs complementary to DNA, wherein the DNA comprises a target nucleic acid, a second foreign nucleic acid encoding an RNA-guided nuclease-null DNA-binding protein, and a third foreign nucleic acid encoding a transcriptional regulator protein or domain, wherein the one or more RNAs, RNA-guided nuclease-null DNA-binding protein, and transcriptional regulator protein or domain are members of a co-localization complex for the target nucleic acid.

According to one aspect, the foreign nucleic acid encoding the RNA guided nuclease null DNA binding protein further encodes a transcriptional regulator protein or domain fused to the RNA guided nuclease null DNA binding protein. According to one aspect, the foreign nucleic acid encoding one or more RNAs further encodes a target for an RNA binding domain and the foreign nucleic acid encoding a transcriptional regulator protein or domain further encodes an RNA binding domain fused to the transcriptional regulator protein or domain.

According to one aspect, the cell is a eukaryotic cell. According to one aspect, the cell is a yeast cell, a plant cell or an animal cell. According to one aspect, the cell is a mammalian cell.

According to one aspect, the RNA is between about 10 to about 500 nucleotides. According to one aspect, the RNA is between about 20 to about 100 nucleotides.

According to one aspect, the transcriptional regulator protein or domain is a transcriptional activator. According to one aspect, the transcriptional regulator protein or domain upregulates expression of the target nucleic acid. According to one aspect, the transcriptional regulator protein or domain upregulates expression of the target nucleic acid to treat a disease or detrimental condition. According to one aspect, the target nucleic acid is associated with a disease or deleterious condition.

According to one aspect, the one or more RNAs is a guide RNA. According to one aspect, the one or more RNAs is a tracrRNA-crRNA fusion.

According to one aspect, the DNA is genomic DNA, mitochondrial DNA, viral DNA, or foreign DNA.

According to certain aspects, the RNA guided nuclease null DNA binding protein is an RNA guided nuclease null DNA binding protein of a type II CRISPR system. According to certain aspects, the RNA-guided nuclease-null DNA-binding protein is a nuclease-null Cas9 protein.

According to one aspect, a method of altering a DNA target nucleic acid in a cell is provided, comprising introducing into the cell a first foreign nucleic acid encoding two or more RNAs, each RNA complementary to an adjacent site in the DNA target nucleic acid, introducing into the cell a second foreign nucleic acid encoding at least one RNA-guided DNA binding protein nickase, which may be an orthogonal RNA-guided DNA binding protein nickase, that is guided by the two or more RNAs, wherein the two or more RNAs and the at least one RNA-guided DNA binding protein nickase are expressed, and wherein the at least one RNA-guided DNA binding protein nickase co-localizes with the two or more RNAs to the DNA target nucleic acid and nicks the DNA target nucleic acid, resulting in two or more adjacent nicks.

According to one aspect, a method of altering a DNA target nucleic acid in a cell is provided, comprising introducing into the cell a first foreign nucleic acid encoding two or more RNAs, each RNA complementary to an adjacent site in the DNA target nucleic acid, introducing into the cell a second foreign nucleic acid encoding at least one RNA-guided DNA binding protein nickase of a type II CRISPR system that is guided by the two or more RNAs, wherein the two or more RNAs and the at least one RNA-guided DNA binding protein nickase of the type II CRISPR system are expressed, and wherein the at least one RNA-guided DNA binding protein nickase of the type II CRISPR system co-localizes with the two or more RNAs to the DNA target nucleic acid and nicks the DNA target nucleic acid, resulting in two or more adjacent nicks.

According to one aspect, a method of altering a DNA target nucleic acid in a cell is provided, comprising introducing into the cell a first foreign nucleic acid encoding two or more RNAs, each RNA complementary to an adjacent site in the DNA target nucleic acid, introducing into the cell a second foreign nucleic acid encoding at least one Cas9 protein nickase having one inactive nuclease domain and guided by the two or more RNAs, wherein the two or more RNAs and the at least one Cas9 protein nickase are expressed, and wherein the at least one Cas9 protein nickase co-localizes with the two or more RNAs to the DNA target nucleic acid and nicks the DNA target nucleic acid, resulting in two or more adjacent nicks.

According to the method of altering a DNA target nucleic acid, two or more adjacent nicks are on the same strand of a double-stranded DNA. According to one aspect, two or more adjacent nicks are on the same strand of the double-stranded DNA and result in homologous recombination. According to one aspect, the two or more adjacent nicks are on different strands of the double-stranded DNA. According to one aspect, two or more adjacent nicks are on different strands of the double-stranded DNA and create a double-stranded break. According to one aspect, two or more adjacent nicks are on different strands of the double stranded DNA and create double stranded breaks, resulting in non-homologous end joining. According to one aspect, the two or more adjacent nicks are on different strands of the double-stranded DNA and are offset with respect to each other. According to one aspect, two or more adjacent nicks are on different strands of the double-stranded DNA and are offset with respect to each other and create a double-stranded break. According to one aspect, two or more adjacent nicks are on different strands of the double stranded DNA and offset relative to each other and create double stranded breaks, resulting in non-homologous end joining. According to one aspect, the method further comprises introducing a third foreign nucleic acid encoding a donor nucleic acid sequence into the cell, wherein the two or more nicks result in homologous recombination of the target nucleic acid with the donor nucleic acid sequence.
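By way of illustration only, the geometry of two offset nicks on different strands can be sketched as follows. The sketch relies on a general property of antiparallel double-stranded DNA rather than on anything specific to this disclosure: when the top-strand nick lies upstream of the bottom-strand nick, the resulting break carries 5' overhangs, and in the opposite arrangement it carries 3' overhangs. The function name and the coordinate convention are assumptions made for the example.

```python
def classify_offset_nicks(top_nick, bottom_nick):
    """Classify the double-stranded break produced by one nick on each strand.

    Both positions use top-strand coordinates (top strand read 5'->3', left
    to right). A value k means the cut falls between base k-1 and base k,
    on the top strand for top_nick and on the opposite strand for bottom_nick.
    Returns (kind, overhang_length_in_nucleotides).
    """
    offset = bottom_nick - top_nick
    if offset == 0:
        return ("blunt", 0)
    # A top-strand nick to the left of the bottom-strand nick leaves
    # single-stranded tails ending in 5' termini; the reverse leaves 3' tails.
    kind = "5' overhang" if offset > 0 else "3' overhang"
    return (kind, abs(offset))


# Sanity checks against two well-known restriction enzymes:
# EcoRI (G^AATTC) leaves 4-nt 5' overhangs; PstI (CTGCA^G) leaves 4-nt 3' overhangs.
print(classify_offset_nicks(top_nick=1, bottom_nick=5))
print(classify_offset_nicks(top_nick=5, bottom_nick=1))
```

The sketch only classifies the ends; whether a given pair of nicks is close enough to yield a double-stranded break in a cell is an experimental question that the code does not address.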

According to one aspect, there is provided a method of altering a DNA target nucleic acid in a cell, comprising introducing into the cell a first foreign nucleic acid encoding two or more RNAs, each RNA complementary to an adjacent site of the DNA target nucleic acid, introducing into the cell a second foreign nucleic acid encoding at least one RNA-guided DNA binding protein nickase and guided by the two or more RNAs, and wherein the two or more RNAs and the at least one RNA-guided DNA binding protein nickase are expressed, and wherein the at least one RNA-guided DNA binding protein nickase is co-localized with the two or more RNAs to the DNA target nucleic acid and cleaves the DNA target nucleic acid resulting in two or more adjacent nicks, and wherein the two or more adjacent nicks are on different strands of the double-stranded DNA and create double-strand breaks, resulting in fragmentation of the target nucleic acid, thereby preventing expression of the target nucleic acid.

According to one aspect, there is provided a method of altering a DNA target nucleic acid in a cell, comprising introducing into the cell a first foreign nucleic acid encoding two or more RNAs, each RNA being complementary to an adjacent site in the DNA target nucleic acid, introducing into the cell a second foreign nucleic acid encoding at least one RNA-guided DNA binding protein nickase of a type II CRISPR system that is guided by the two or more RNAs, wherein the two or more RNAs and the at least one RNA-guided DNA binding protein nickase of the type II CRISPR system are expressed, wherein the at least one RNA-guided DNA binding protein nickase of the type II CRISPR system co-localizes with the two or more RNAs to the DNA target nucleic acid and nicks the DNA target nucleic acid, resulting in two or more adjacent nicks, and wherein the two or more adjacent nicks are on different strands of the double-stranded DNA and create a double-stranded break, resulting in fragmentation of the target nucleic acid, thereby preventing expression of the target nucleic acid.

According to one aspect, there is provided a method of altering a DNA target nucleic acid in a cell, comprising introducing into the cell a first foreign nucleic acid encoding two or more RNAs, each RNA complementary to an adjacent site of the DNA target nucleic acid, introducing into the cell a second foreign nucleic acid encoding at least one Cas9 protein nickase having one inactive nuclease domain and guided by the two or more RNAs, and wherein the two or more RNAs and the at least one Cas9 protein nickase are expressed, and wherein the at least one Cas9 protein nickase is co-localized with the two or more RNAs to the DNA target nucleic acid and cleaves the DNA target nucleic acid resulting in two or more adjacent nicks, and wherein the two or more adjacent nicks are on different strands of the double stranded DNA and create double stranded breaks resulting in fragmentation of the target nucleic acid, thereby preventing expression of the target nucleic acid.

According to one aspect, there is provided a cell comprising: a first foreign nucleic acid encoding two or more RNAs, each RNA complementary to an adjacent site of the DNA target nucleic acid, and a second foreign nucleic acid encoding at least one RNA-guided DNA binding protein nickase, and wherein the two or more RNAs and the at least one RNA-guided DNA binding protein nickase are members of a co-localized complex for the DNA target nucleic acid.

According to one aspect, the RNA-guided DNA binding protein nickase is an RNA-guided DNA binding protein nickase of a type II CRISPR system. According to one aspect, the RNA-guided DNA binding protein nickase is a Cas9 protein nickase having one inactive nuclease domain.

According to one aspect, the cell is a eukaryotic cell. According to one aspect, the cell is a yeast cell, a plant cell or an animal cell. According to one aspect, the cell is a mammalian cell.

According to one aspect, the RNA comprises between about 10 to about 500 nucleotides. According to one aspect, the RNA comprises between about 20 to about 100 nucleotides.

According to one aspect, the target nucleic acid is associated with a disease or deleterious condition.

According to one aspect, the two or more RNAs are guide RNAs. According to one aspect, the two or more RNAs are tracrRNA-crRNA fusions.

According to one aspect, the DNA target nucleic acid is genomic DNA, mitochondrial DNA, viral DNA, or foreign DNA.

According to one aspect, the methods may include the simultaneous use of orthogonal RNA-guided DNA binding protein nickases, orthogonal RNA-guided DNA binding protein nucleases, and orthogonal RNA-guided nuclease-null DNA binding proteins. Accordingly, alterations created by nicking or cutting the DNA and transcriptional regulation can be carried out in the same cell. Further, one or more exogenous donor nucleic acids can also be added to the cell using methods known to those skilled in the art for introducing nucleic acids into cells, such as electroporation, and the one or more exogenous donor nucleic acids can be introduced into the DNA of the cell by recombination, such as homologous recombination, or by other mechanisms known to those skilled in the art. Accordingly, the use of the plurality of orthogonal RNA-guided DNA binding proteins described herein allows a single cell to be altered by nicking or cutting, allows donor nucleic acids to be introduced into the DNA of the cell, and allows genes to be transcriptionally regulated.

Further features and advantages of certain embodiments of the present invention will become more fully apparent from the following description of the embodiments and the accompanying drawings and claims.

Drawings

The above and other features and advantages of this embodiment will be more fully understood from the following detailed description of an exemplary embodiment taken in conjunction with the accompanying drawings, in which:

FIGS. 1A and 1B are schematics of RNA-guided transcriptional activation. FIG. 1C is a design of a reporter construct. FIG. 1D shows data demonstrating that Cas9N-VP64 fusions display RNA-guided transcriptional activation as assayed by both fluorescence-activated cell sorting (FACS) and immunofluorescence assays (IF). FIG. 1E shows FACS and IF assay data demonstrating gRNA sequence-specific transcriptional activation from reporter constructs in the presence of Cas9N, MS2-VP64, and gRNA bearing the appropriate MS2 aptamer binding sites. FIG. 1F depicts data demonstrating transcriptional induction by individual gRNAs and multiple gRNAs.

FIG. 2A depicts a methodology for evaluating the targeting landscape of Cas9-gRNA complexes and TALEs. FIG. 2B depicts data demonstrating that a Cas9-gRNA complex is on average tolerant to 1-3 mutations in its target sequence. FIG. 2C depicts data demonstrating that the Cas9-gRNA complex is largely insensitive to point mutations, except those localized to the PAM sequence. FIG. 2D depicts heat plot data demonstrating that introduction of 2-base mismatches significantly impairs the activity of the Cas9-gRNA complex. FIG. 2E depicts data demonstrating that an 18-mer TALE is on average tolerant to 1-2 mutations in its target sequence. FIG. 2F depicts data demonstrating that the 18-mer TALE, similar to the Cas9-gRNA complex, is largely insensitive to single-base mismatches in its target. FIG. 2G depicts heat plot data demonstrating that introduction of 2-base mismatches significantly impairs the activity of the 18-mer TALE.
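By way of illustration only, the mismatch tolerance described for FIGS. 2B-2G suggests a simple bookkeeping step when candidate sites are compared with a guide sequence: count the mismatched positions and list the candidates that fall within a chosen budget. The sequences, function names, and default budget of 3 below are hypothetical assumptions; the sketch counts mismatches only and does not model position-dependent effects or the PAM.

```python
def count_mismatches(guide, site):
    """Number of positions at which two equal-length sequences differ."""
    if len(guide) != len(site):
        raise ValueError("guide and site must have the same length")
    return sum(1 for g, s in zip(guide.upper(), site.upper()) if g != s)


def sites_within_budget(guide, candidate_sites, max_mismatches=3):
    """Return (site, mismatches) pairs within the budget, best match first."""
    hits = []
    for site in candidate_sites:
        n = count_mismatches(guide, site)
        if n <= max_mismatches:
            hits.append((site, n))
    return sorted(hits, key=lambda hit: hit[1])


# Hypothetical 20-nt guide and candidate sites (made-up sequences).
guide = "GATTACAGATTACAGATTAC"
candidates = [
    "GATTACAGATTACAGATTAC",  # perfect match
    "GATTACAGATTACAGATTAG",  # 1 mismatch
    "GTTTACAGAATACAGATTAG",  # 3 mismatches
    "CCCCCCCCCCCCCCCCCCCC",  # unrelated sequence
]
for site, n in sites_within_budget(guide, candidates):
    print(site, n)
```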

FIG. 3A depicts a schematic of a guide RNA design. FIG. 3B depicts data showing the percentage rate of non-homologous end joining for offset nicks leading to 5' overhangs and offset nicks leading to 3' overhangs. FIG. 3C depicts data showing the percentage rate of targeting for offset nicks leading to 5' overhangs and offset nicks leading to 3' overhangs.

FIG. 4A is a schematic of a metal coordinating residue in RuvC PDB ID: 4EP4 (blue) at position D7 (left), a schematic of HNH endonuclease domains from PDB IDs 3M7K (orange) and 4H9D (cyan), including a coordinated Mg ion (gray sphere) and DNA from 3M7K (purple) (middle), and a list of mutants analyzed (right). FIG. 4B depicts data showing undetectable nuclease activity for Cas9 mutants m3 and m4 and for their respective fusions with VP64. FIG. 4C is a higher-resolution examination of the data in FIG. 4B.

FIG. 5A is a schematic of a homologous recombination assay to determine Cas9-gRNA activity. FIG. 5B depicts guide RNAs with random sequence insertions and the percentage rate of homologous recombination.

Fig. 6A is a schematic diagram of guide RNA for OCT4 gene. Fig. 6B depicts transcriptional activation for promoter-luciferase reporter constructs. Fig. 6C depicts transcriptional activation by qPCR of endogenous genes.

FIG. 7A is a schematic representation of a guide RNA for the REX1 gene. Fig. 7B depicts transcriptional activation for promoter-luciferase reporter constructs. Fig. 7C depicts transcriptional activation by qPCR of endogenous genes.

FIG. 8A depicts in schematic a high-level specificity analysis processing flow for calculating normalized expression levels. FIG. 8B depicts data of the distribution of percentages of binding sites by number of mismatches generated within a biased construct library. Left: theoretical distribution. Right: distribution observed from an actual TALE construct library. FIG. 8C depicts data of the distribution of percentages of tag counts aggregated to binding sites by number of mismatches. Left: distribution observed from the positive control sample. Right: distribution observed from a sample in which a non-control TALE was induced.
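By way of illustration only, one way a theoretical distribution of binding sites by number of mismatches could be obtained is a binomial model. The text here does not state how the theoretical distribution of FIG. 8B was computed, so the model is an assumption: each position of the site is taken to be mutated independently with the same probability. The per-base rate of 0.2 is a hypothetical value; the site length of 18 simply mirrors the 18-mer TALE mentioned elsewhere in the figure descriptions.

```python
from math import comb


def mismatch_distribution(site_length, per_base_mutation_rate):
    """Binomial probability of k mismatches, for k = 0..site_length.

    Assumes every position mutates independently with the same probability,
    which is a modeling assumption rather than a statement from the source.
    """
    p = per_base_mutation_rate
    return [
        comb(site_length, k) * p**k * (1 - p) ** (site_length - k)
        for k in range(site_length + 1)
    ]


dist = mismatch_distribution(site_length=18, per_base_mutation_rate=0.2)
for k, prob in enumerate(dist[:6]):
    print(f"{k} mismatches: {100 * prob:.1f}% of library members")
```

Comparing such a theoretical curve with the distribution observed in a sequenced library is one way to check that the library was synthesized with the intended bias.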

FIG. 9A depicts data from analysis of the targeting landscape of a Cas9-gRNA complex showing tolerance to 1-3 mutations in its target sequence. FIG. 9B depicts data from analysis of the targeting landscape of a Cas9-gRNA complex showing insensitivity to point mutations, except those localized to the PAM sequence. FIG. 9C depicts heat plot data from analysis of the targeting landscape of a Cas9-gRNA complex showing that introduction of 2-base mismatches significantly impairs activity. FIG. 9D depicts data from a nuclease-mediated HR assay determining that the predicted PAM for Streptococcus pyogenes Cas9 is NGG and also NAG.
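By way of illustration only, the NGG and NAG motifs named for FIG. 9D can be used to enumerate candidate target sites in a DNA sequence. The sketch below assumes a 20-nucleotide protospacer immediately 5' of the PAM, a length that is not stated in this passage, and it scans only the strand that is supplied. The function name and the example sequence are hypothetical.

```python
import re


def find_target_sites(dna, protospacer_len=20, pams=("NGG", "NAG")):
    """List (start, protospacer, pam) for every protospacer followed by a PAM.

    Only the given strand is scanned; pass the reverse complement separately
    to cover the other strand. "N" in a PAM matches any of A, C, G, T.
    """
    dna = dna.upper()
    hits = []
    for pam in pams:
        pattern = re.compile(pam.replace("N", "[ACGT]"))
        last_start = len(dna) - protospacer_len - len(pam)
        for start in range(last_start + 1):
            pam_start = start + protospacer_len
            candidate = dna[pam_start:pam_start + len(pam)]
            if pattern.fullmatch(candidate):
                hits.append((start, dna[start:pam_start], candidate))
    return sorted(hits)


# Made-up sequence: 20 A's followed by TGG, then filler.
example = "A" * 20 + "TGG" + "CCCC"
for start, protospacer, pam in find_target_sites(example):
    print(start, protospacer, pam)
```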

FIG. 10A depicts data from a nuclease-mediated HR assay showing that 18-mer TALEs tolerate multiple mutations in their target sequences. FIG. 10B depicts data from analysis of the targeting landscape of TALEs of 3 different sizes (18-mer, 14-mer, and 10-mer). FIG. 10C depicts data for 10-mer TALEs showing near single-base mismatch resolution. FIG. 10D depicts heat plot data for 10-mer TALEs showing near single-base mismatch resolution.

FIG. 11A depicts designed guide RNAs. FIG. 11B depicts the percentage of non-homologous end joining for multiple guide RNAs.

FIGS. 12A-12F depict comparison and characterization of putative orthogonal Cas9 proteins. FIG. 12A: SP, ST1, NM and TD. Bases are colored to indicate the degree of conservation. FIG. 12B: plasmids used for characterization of Cas9 proteins in Escherichia coli (E. coli). FIG. 12C: functional PAMs are depleted from the library when the spacer and protospacer match, due to Cas9 cleavage. FIG. 12D: Cas9 does not cleave when the targeting plasmid spacer and the library protospacer do not match. FIG. 12E: nonfunctional PAMs are never cleaved or depleted. FIG. 12F: the selection scheme used to identify PAMs. Cells expressing a Cas9 protein and one of two spacer-containing targeting plasmids were transformed with one of two libraries having the corresponding protospacers and subjected to antibiotic selection. Surviving uncleaved plasmids were subjected to deep sequencing. Cas9-mediated PAM depletion was quantified by comparing the relative abundance of each sequence within the matched versus the mismatched protospacer libraries.

FIGS. 13A-13F depict the depletion of functional protospacer-adjacent motifs (PAMs) from libraries by Cas9 proteins. The log frequency of each base at each position for matched spacer-protospacer pairs is plotted relative to control conditions in which the spacer and protospacer do not match. The results reflect the mean depletion of libraries by NM (FIG. 13A), ST1 (FIG. 13B) and TD (FIG. 13C), based on two different protospacer sequences (FIG. 13D). The depletion of specific sequences for each protospacer is plotted separately for each Cas9 protein (FIGS. 13E-13F).
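The comparison of matched and mismatched libraries described for FIGS. 12F and 13 can be illustrated with a minimal Python sketch. It assumes that each library has already been reduced to a list of PAM sequences read from surviving plasmids. The function names, the pseudocount, and the toy read counts are hypothetical and are not taken from the disclosure.

```python
import math
from collections import Counter


def relative_abundance(reads):
    """Map each PAM sequence to its fraction of all reads in one library."""
    counts = Counter(reads)
    total = sum(counts.values())
    return {pam: n / total for pam, n in counts.items()}


def log_depletion(matched_reads, mismatched_reads, pseudocount=1e-6):
    """Return log10(mismatched / matched) relative abundance for each PAM.

    Larger values mean the PAM was depleted more strongly when the spacer
    matched the protospacer. The pseudocount (an assumption) avoids division
    by zero for PAMs that vanish from the matched library.
    """
    matched = relative_abundance(matched_reads)
    mismatched = relative_abundance(mismatched_reads)
    return {
        pam: math.log10((freq + pseudocount) / (matched.get(pam, 0.0) + pseudocount))
        for pam, freq in mismatched.items()
    }


# Toy reads (hypothetical): "AGG" is depleted tenfold in the matched library.
mismatched_library = ["AGG"] * 50 + ["ACT"] * 50
matched_library = ["AGG"] * 5 + ["ACT"] * 95
scores = log_depletion(matched_library, mismatched_library)
```

In this toy case the score for "AGG" is close to 1 (a tenfold depletion), while the score for "ACT" is slightly negative because the sequence becomes relatively enriched once functional PAMs are removed.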

FIGS. 14A-14B depict NM-mediated transcriptional repression. FIG. 14A: reporter plasmid used to quantify repression. FIG. 14B: normalized cell fluorescence for matched and mismatched spacer-protospacer pairs. Error bars represent the standard deviation of five replicates.

FIG. 15 depicts orthogonal recognition of crRNAs in E. coli. Cells with all combinations of Cas9 and crRNA were challenged with plasmids bearing matched or mismatched protospacers and appropriate PAMs. Sufficient cells were plated to reliably obtain colonies from matched spacer and protospacer pairings, and total colony counts were used to calculate fold depletion.

FIGS. 16A-16B depict Cas9-mediated gene editing in human cells. FIG. 16A: a homologous recombination assay used to quantify gene editing efficiency. Cas9-mediated double-strand breaks within the protospacer stimulate repair of the interrupted GFP cassette using the donor template, yielding cells with intact GFP. Three different templates were used to provide the correct PAM for each Cas9. Fluorescent cells were quantified by flow cytometry. FIG. 16B: cell sorting results for NM, ST1, and TD in combination with each of their respective sgRNAs. The protospacer and PAM sequences for each Cas9 are shown above each set. The repair efficiency is indicated in the upper right corner of each plot.

FIGS. 17A-17B depict transcriptional activation in human cells. FIG. 17A: reporter constructs for transcriptional activation feature a minimal promoter driving tdTomato. The protospacer and PAM sequences were placed upstream of the minimal promoter. Nuclease-null Cas9-VP64 fusion proteins binding to the protospacers result in transcriptional activation and enhanced fluorescence. FIG. 17B: cells were transfected with all combinations of Cas9 activator and sgRNA, and tdTomato fluorescence was visualized. Transcriptional activation occurred only when each Cas9 was paired with its own sgRNA.

Detailed Description

The supporting references listed herein may be referred to by superscript. It should be understood that each superscript refers to the corresponding reference as if fully cited to support the particular statement.

The CRISPR-Cas systems of bacteria and archaea confer adaptive immunity by incorporating fragments of viral or plasmid DNA into CRISPR loci and using the transcribed crRNAs to guide nucleases to degrade homologous sequences¹,². In type II CRISPR systems, a ternary complex of Cas9 nuclease with crRNA and tracrRNA (trans-activating crRNA) binds to and cleaves dsDNA protospacer sequences that match the crRNA spacer and also contain a short protospacer-adjacent motif (PAM)³,⁴. Fusing the crRNA and tracrRNA produces a single guide RNA (sgRNA) that is sufficient to target Cas9⁴.

As an RNA-guided nuclease and nickase, Cas9 has been used for targeted gene editing⁵⁻⁹ and selection¹⁰ in a variety of organisms. Beyond these successes, nuclease-null Cas9 variants can be used for regulatory purposes, because the ability to localize proteins and RNAs to nearly any set of dsDNA sequences provides great versatility for controlling biological systems¹¹⁻¹⁷. Beginning with obstruction of the promoter and 5'-UTR in bacteria¹⁸, Cas9-mediated regulation has been extended to transcriptional activation by VP64 recruitment in human cells¹⁹. According to certain aspects, DNA binding proteins described herein, including orthogonal RNA-guided DNA binding proteins such as orthogonal Cas9 proteins, may be used with transcriptional activators, repressors, fluorescent protein labels, chromosome tethers, and a variety of other tools known to those of skill in the art. According to this aspect, the use of orthogonal Cas9 allows for genetic modification using any or all of transcriptional activators, repressors, fluorescent protein labels, chromosome tethers, and a variety of other tools known to those skilled in the art. Thus, aspects of the disclosure relate to the use of orthogonal Cas9 proteins for multiplexed RNA-guided transcriptional activation, repression, and gene editing.

Embodiments of the present disclosure relate to characterizing and demonstrating orthogonality between multiple Cas9 proteins in bacterial and human cells. Such orthogonal RNA-guided DNA binding proteins can be used as a plurality or a set to simultaneously and independently regulate the transcription of, label, or edit multiple genes in the DNA of a single cell.

According to one aspect, multiple orthogonal Cas9 proteins are identified within a single family of CRISPR systems. Although clearly related, the Cas9 proteins from Streptococcus pyogenes (S. pyogenes), Neisseria meningitidis (N. meningitidis), Streptococcus thermophilus (S. thermophilus) and Treponema denticola (T. denticola) range in length from 3.25 to 4.6 kb and recognize completely different PAM sequences.

Embodiments of the present disclosure are based on the use of DNA binding proteins to co-localize transcriptional regulatory proteins or domains to DNA in a manner that modulates a target nucleic acid. One skilled in the art will readily recognize that such DNA binding proteins can be used to bind to DNA for a variety of purposes. Such DNA binding proteins may be naturally occurring. DNA binding proteins included within the scope of the present disclosure include those that can be guided by RNA (referred to herein as guide RNA). According to this aspect, the guide RNA and the RNA-guided DNA binding protein form a co-localized complex on the DNA. According to certain aspects, the DNA binding protein may be a nuclease-null DNA binding protein. According to this aspect, a nuclease-null DNA binding protein may result from altering or modifying a DNA binding protein having nuclease activity. Such DNA binding proteins having nuclease activity are known to those of skill in the art and include naturally occurring DNA binding proteins having nuclease activity, such as, for example, Cas9 protein found in type II CRISPR systems. Such Cas9 protein and type II CRISPR systems are well documented in the art. See Makarova et al, Nature Reviews, Microbiology, vol.9, June 2011, pp.467-477, including all supplementary information, which is incorporated herein by reference in its entirety.

According to certain aspects, methods are provided for identifying two or more, or a set of, orthogonal DNA binding proteins, such as orthogonal RNA-guided DNA binding proteins of a type II CRISPR system, such as orthogonal Cas9 proteins, each of which may be nuclease active or nuclease null. According to certain aspects, two or more or a set of orthogonal DNA binding proteins may be used with corresponding guide RNAs to simultaneously and independently regulate genes or edit nucleic acids within a cell. According to certain aspects, nucleic acids encoding two or more or a set of orthogonal DNA binding proteins, corresponding guide RNAs, and two or more or a set of corresponding transcriptional regulators or domains may be introduced into a cell. In this way, many genes can be targeted in parallel for regulation or editing within the same cell. Methods for editing genomic DNA are well known to those skilled in the art.

Exemplary DNA binding proteins with nuclease activity function to nick or cut double-stranded DNA. Such nuclease activity can result from a DNA binding protein having one or more polypeptide sequences that exhibit nuclease activity. Such exemplary DNA binding proteins can have two separate nuclease domains, each domain responsible for cutting or nicking a particular strand of double-stranded DNA. Exemplary polypeptide sequences known to those of skill in the art to have nuclease activity include the McrA-HNH nuclease-related domain and the RuvC-like nuclease domain. Thus, exemplary DNA binding proteins are those that in nature comprise one or more of the McrA-HNH nuclease-related domain and the RuvC-like nuclease domain. According to certain aspects, the DNA binding protein is altered or otherwise modified to inactivate nuclease activity. Such changes or modifications include altering one or more amino acids to inactivate nuclease activity or a nuclease domain. Such modifications also include removing one or more polypeptide sequences exhibiting nuclease activity (i.e., nuclease domains) such that those sequences are not present in the DNA binding protein. Other modifications to inactivate nuclease activity will be apparent to those skilled in the art based on this disclosure. Thus, nuclease-null DNA binding proteins include polypeptide sequences modified to inactivate nuclease activity, or from which one or more polypeptide sequences have been removed to inactivate nuclease activity. Although nuclease activity has been inactivated, nuclease-null DNA binding proteins retain the ability to bind DNA. Thus, a DNA binding protein includes one or more polypeptide sequences required for DNA binding and may lack one or more or all nuclease sequences exhibiting nuclease activity. Likewise, a DNA binding protein includes one or more polypeptide sequences required for DNA binding and may have one or more or all nuclease sequences with their nuclease activity inactivated.

According to one aspect, a DNA binding protein having two or more nuclease domains can be modified or altered to inactivate all but one of the nuclease domains. Such a modified or altered DNA binding protein is referred to as a DNA binding protein nickase, in the sense that it cuts or nicks only one strand of double-stranded DNA. When guided by RNA to DNA, a DNA binding protein nickase is referred to as an RNA-guided DNA binding protein nickase.

Exemplary DNA binding proteins are RNA-guided DNA binding proteins of type II CRISPR systems lacking nuclease activity. An exemplary DNA binding protein is a nuclease-null Cas9 protein. An exemplary DNA binding protein is Cas9 protein nickase.

In Streptococcus pyogenes (S. pyogenes), Cas9 cleavage is mediated by two catalytic domains in the protein: an HNH nuclease domain and a RuvC-like nuclease domain. Cas9 proteins are found in the type II CRISPR systems of many bacterial species and strains. [List of species and strains not recoverable.]

The Cas9 protein may be referred to in the literature by those skilled in the art as Csn1. The Streptococcus pyogenes (S. pyogenes) Cas9 protein sequence that was the subject of the experiments described herein is shown below. See Deltcheva et al., Nature 471, 602-607 (2011), which is incorporated herein by reference in its entirety.

Figure BDA0002240846790000151

According to certain aspects of the RNA-guided genome regulation methods described herein, Cas9 is altered to reduce, substantially reduce or eliminate nuclease activity. Such a Cas9 may be an orthogonal Cas9, such as when more than one Cas9 protein is contemplated. In this context, two or more or a set of orthogonal Cas9 proteins may be used in the methods described herein. According to one aspect, Cas9 nuclease activity is reduced, substantially reduced, or eliminated by altering the RuvC nuclease domain or the HNH nuclease domain. According to one aspect, the RuvC nuclease domain is inactivated. According to one aspect, the HNH nuclease domain is inactivated. According to one aspect, the RuvC nuclease domain and the HNH nuclease domain are inactivated. According to another aspect, Cas9 proteins are provided in which the RuvC nuclease domain and the HNH nuclease domain are inactivated. According to another aspect, nuclease-null Cas9 proteins are provided, in that the RuvC nuclease domain and the HNH nuclease domain are inactivated. According to another aspect, Cas9 nickases are provided in which either the RuvC nuclease domain or the HNH nuclease domain is inactivated, thereby leaving the remaining nuclease domain active. In this way, only one strand of the double-stranded DNA is cut or nicked.

According to another aspect, a nuclease-null Cas9 protein is provided, wherein one or more amino acids in Cas9 are altered or otherwise removed to provide the nuclease-null Cas9 protein. According to one aspect, the amino acids include D10 and H840. See Jinek et al, Science 337, 816-. According to another aspect, the amino acids include D839 and N863. According to one aspect, one or more or all of D10, H840, D839, and N863 are substituted with an amino acid that reduces, substantially eliminates, or eliminates nuclease activity. According to one aspect, one or more or all of D10, H840, D839 and N863 are substituted with alanine. According to one aspect, a Cas9 protein in which one or more or all of D10, H840, D839, and N863 are substituted with an amino acid (such as alanine) that reduces, substantially eliminates, or eliminates nuclease activity is referred to as a nuclease-null Cas9 or Cas9N and exhibits reduced or eliminated nuclease activity, or nuclease activity that is absent or substantially absent within levels of detection. According to this aspect, nuclease activity for Cas9N is not detectable using known assays, i.e., is below the level of detection of known assays.

According to one aspect, nuclease-null Cas9 proteins include homologs and orthologs that retain the ability of the protein to bind to DNA and be guided by RNA. According to one aspect, the nuclease-null Cas9 protein includes the sequence set forth for naturally occurring Cas9 from Streptococcus pyogenes with one or more or all of D10, H840, D839, and N863 substituted with alanine, as well as protein sequences that are at least 30%, 40%, 50%, 60%, 70%, 80%, 90%, 95%, 98% or 99% homologous thereto and that are DNA binding proteins, such as RNA-guided DNA binding proteins.

According to one aspect, the nuclease-null Cas9 protein comprises the sequence set forth for naturally occurring Cas9 from Streptococcus pyogenes, excepting the protein sequences of the RuvC nuclease domain and the HNH nuclease domain, as well as protein sequences that are at least 30%, 40%, 50%, 60%, 70%, 80%, 90%, 95%, 98%, or 99% homologous thereto and that are DNA binding proteins (such as RNA-guided DNA binding proteins). In this way, aspects of the disclosure include the protein sequence responsible for DNA binding, for example for co-localizing with guide RNA and binding to DNA, and protein sequences homologous thereto, and need not include the protein sequences of the RuvC nuclease domain and the HNH nuclease domain (to the extent that they are not needed for DNA binding), as these domains can be either inactivated or removed from the protein sequence of the naturally occurring Cas9 protein to produce a nuclease-null Cas9 protein.

For the purposes of this disclosure, FIG. 4A depicts metal-coordinating residues in known protein structures with homology to Cas9. Residues are labeled based on their position in the Cas9 sequence. Left: RuvC structure, PDB ID: 4EP4 (blue), position D7, which corresponds to D10 in the Cas9 sequence, is highlighted in a Mg-ion coordinating position. Middle: structures of HNH endonuclease domains from PDB IDs 3M7K (orange) and 4H9D (cyan), including a coordinated Mg ion (grey sphere) and DNA from 3M7K (purple). Residues D92 and N113 in 3M7K and positions D53 and N77 in 4H9D, which have sequence homology to Cas9 amino acids D839 and N863, are shown as sticks. Right: list of mutants made and analyzed for nuclease activity: Cas9 wild type; Cas9m1, which substitutes alanine for D10; Cas9m2, which substitutes alanine for D10 and alanine for H840; Cas9m3, which substitutes alanine for D10, alanine for H840, and alanine for D839; and Cas9m4, which substitutes alanine for D10, alanine for H840, alanine for D839, and alanine for N863.
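The mutant series listed for FIG. 4A can be written out as a short Python sketch that applies alanine substitutions by 1-based residue position. The helper name and the toy placeholder sequence are hypothetical. The actual Streptococcus pyogenes Cas9 sequence is not reproduced here, so the example only shows the bookkeeping, including a check that the expected wild-type residue is present before it is replaced.

```python
# Substitution sets as listed for FIG. 4A (wild-type residue + 1-based position).
MUTANTS = {
    "Cas9m1": ["D10"],
    "Cas9m2": ["D10", "H840"],
    "Cas9m3": ["D10", "H840", "D839"],
    "Cas9m4": ["D10", "H840", "D839", "N863"],
}


def substitute_alanine(protein, sites):
    """Return a copy of `protein` with each listed site replaced by alanine.

    Raises ValueError if the wild-type residue named in a site label is not
    found at that position, which guards against off-by-one numbering errors.
    """
    residues = list(protein)
    for site in sites:
        expected, position = site[0], int(site[1:])
        found = residues[position - 1]
        if found != expected:
            raise ValueError(f"expected {expected} at position {position}, found {found}")
        residues[position - 1] = "A"
    return "".join(residues)


# Toy placeholder (hypothetical): glycine everywhere except the four sites.
toy = ["G"] * 900
for label in MUTANTS["Cas9m4"]:
    toy[int(label[1:]) - 1] = label[0]
toy_cas9 = "".join(toy)

cas9m1 = substitute_alanine(toy_cas9, MUTANTS["Cas9m1"])
cas9m4 = substitute_alanine(toy_cas9, MUTANTS["Cas9m4"])
```

Keeping the wild-type residue in each label (for example "D10" rather than just 10) lets the function verify the numbering against whatever sequence is supplied.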

As shown in FIG. 4B, the Cas9 mutants m3 and m4, and also their respective fusions with VP64, showed no detectable nuclease activity upon deep sequencing at the targeted loci. The plots show the mutation frequency versus genomic position, with the red lines demarcating the gRNA target. FIG. 4C is a higher-resolution examination of the data in FIG. 4B and confirms that the mutation landscape shows a profile comparable to that of unmodified loci.

According to one aspect, an engineered Cas9-gRNA system is provided that enables RNA-guided genome regulation in human cells by tethering transcriptional activation domains either to a nuclease-null Cas9 or to guide RNAs. According to one aspect of the disclosure, one or more transcriptional regulatory proteins or domains (these terms are used interchangeably) are joined or otherwise connected to a nuclease-deficient Cas9 or to one or more guide RNAs (gRNAs). The transcriptional regulatory domains correspond to targeted loci. Thus, aspects of the disclosure include methods and materials for localizing transcriptional regulatory domains to targeted loci by fusing, connecting, or joining such domains to either Cas9N or to the gRNA.

According to one aspect, Cas9N fusion proteins capable of transcriptional activation are provided. According to one aspect, a VP64 activation domain (see Zhang et al., Nature Biotechnology 29, 149-153 (2011), incorporated herein by reference in its entirety) is joined, fused, connected, or otherwise tethered to the C-terminus of Cas9N. According to one approach, the transcriptional regulatory domain is provided to the site of the target genomic DNA by the Cas9N protein. According to one approach, Cas9N fused to a transcriptional regulatory domain is provided within a cell along with one or more guide RNAs. Cas9N, with the transcriptional regulatory domain fused thereto, binds at or near the target genomic DNA. The one or more guide RNAs bind at or near the target genomic DNA. The transcriptional regulatory domain regulates expression of the target gene. According to a particular aspect, a Cas9N-VP64 fusion activated transcription of reporter constructs when combined with gRNAs targeting sequences near the promoter, thereby demonstrating RNA-guided transcriptional activation.

According to one aspect, gRNA fusion proteins capable of transcriptional activation are provided. According to one aspect, a VP64 activation domain is added, fused, linked, or otherwise attached to the gRNA. According to one approach, a transcriptional regulatory domain is provided to the site of the target genomic DNA by the gRNA. According to one approach, a gRNA fused to a transcriptional regulatory domain is provided in a cell along with a Cas9N protein. Cas9N binds on or near the target genomic DNA. One or more guide RNAs bind to or near the target genomic DNA with a transcriptional regulator protein or domain fused thereto. The transcriptional regulatory domain regulates expression of a target gene. According to a particular aspect, the Cas9N protein and gRNA fused to a transcriptional regulatory domain activate transcription of the reporter construct, thereby demonstrating RNA-guided transcriptional activation.

gRNA tethers capable of transcriptional regulation were constructed by inserting random sequences into the gRNA and assaying for Cas9 function to identify which regions of the gRNA tolerate modification. gRNAs bearing random sequence insertions at either the 5' end of the crRNA portion or the 3' end of the tracrRNA portion of a chimeric gRNA retain function, while insertions into the tracrRNA scaffold portion of the chimeric gRNA result in loss of function. See FIGS. 5A-5B, which summarize the flexibility of gRNAs to random base insertions. FIG. 5A is a schematic of a homologous recombination (HR) assay to determine Cas9-gRNA activity. As shown in FIG. 5B, gRNAs bearing random sequence insertions at either the 5' end of the crRNA portion or the 3' end of the tracrRNA portion of a chimeric gRNA retain function, while insertions into the tracrRNA scaffold portion of the chimeric gRNA result in loss of function. The points of insertion in the gRNA sequence are indicated by red nucleotides. Without wishing to be bound by scientific theory, the increased activity upon random base insertions at the 5' end may be due to the increased half-life of the longer gRNA.

To attach VP64 to the gRNA, two copies of the RNA stem-loop that binds the MS2 bacteriophage coat protein were appended to the 3' end of the gRNA. See Fusco et al., Current Biology: CB 13, 161-167 (2003), which is incorporated herein by reference in its entirety. These chimeric gRNAs were expressed together with Cas9N and an MS2-VP64 fusion protein. Sequence-specific transcriptional activation from reporter constructs was observed in the presence of all 3 components.

FIG. 1A is a schematic representation of RNA-guided transcriptional activation. As shown in FIG. 1A, to generate a Cas9N fusion protein capable of transcriptional activation, the VP64 activation domain was attached directly to the C-terminus of Cas9N. As shown in FIG. 1B, to generate gRNA tethers capable of transcriptional activation, two copies of the RNA stem-loop that binds the MS2 bacteriophage coat protein were appended to the 3' end of the gRNA. These chimeric gRNAs were expressed together with Cas9N and an MS2-VP64 fusion protein. FIG. 1C shows the design of the reporter constructs used to assay transcriptional activation. The two reporters bear distinct gRNA target sites and share a control TALE-TF target site. As shown in FIG. 1D, Cas9N-VP64 fusions display RNA-guided transcriptional activation as assayed by both fluorescence-activated cell sorting (FACS) and immunofluorescence assays (IF). Specifically, whereas the control TALE-TF activated both reporters, the Cas9N-VP64 fusion activated reporters in a gRNA sequence-specific manner. As shown in FIG. 1E, gRNA sequence-specific transcriptional activation from the reporter constructs was observed by both FACS and IF only in the presence of all 3 components: Cas9N, MS2-VP64, and a gRNA bearing the appropriate MS2 aptamer binding sites.

According to certain aspects, methods are provided for regulating endogenous genes using Cas9N, one or more gRNAs, and a transcriptional regulator protein or domain. According to one aspect, the endogenous gene may be any desired gene, referred to herein as a target gene. According to one exemplary aspect, the genes targeted for regulation include ZFP42 (REX1) and POU5F1 (OCT4), both of which are tightly regulated genes involved in the maintenance of pluripotency. As shown in FIG. 1F, 10 gRNAs targeting a ~5 kb stretch of DNA upstream of the transcription start site (DNase hypersensitive sites are highlighted in green) were designed for the REX1 gene. Transcriptional activation was assayed using either a promoter-luciferase reporter construct (see Takahashi et al., Cell 131, 861-872 (2007), incorporated herein by reference in its entirety) or directly by qPCR of the endogenous genes.

FIGS. 6A-6C relate to RNA-guided OCT4 regulation using Cas9N-VP64. As shown in FIG. 6A, 21 gRNAs targeting a ~5 kb stretch of DNA upstream of the transcription start site were designed for the OCT4 gene. The DNase hypersensitive sites are highlighted in green. FIG. 6B shows transcriptional activation using a promoter-luciferase reporter construct. FIG. 6C shows transcriptional activation measured directly by qPCR of the endogenous genes. Introduction of individual gRNAs modestly stimulated transcription, while multiple gRNAs acted synergistically to stimulate robust multi-fold transcriptional activation.

FIGS. 7A-7C relate to RNA-guided REX1 regulation using Cas9N, MS2-VP64, and gRNA+2X-MS2 aptamers. As shown in FIG. 7A, 10 gRNAs targeting a ~5 kb stretch of DNA upstream of the transcription start site were designed for the REX1 gene. The DNase hypersensitive sites are highlighted in green. FIG. 7B shows transcriptional activation using a promoter-luciferase reporter construct. FIG. 7C shows transcriptional activation measured directly by qPCR of the endogenous genes. Introduction of individual gRNAs modestly stimulated transcription, while multiple gRNAs acted synergistically to stimulate robust multi-fold transcriptional activation. In one aspect, the absence of the 2X-MS2 aptamers on the gRNA does not result in transcriptional activation. See Maeder et al., Nature Methods 10, 243-245 (2013) and Perez-Pinera et al., Nature Methods 10, 239-242 (2013), each of which is incorporated herein by reference in its entirety.

Thus, the methods relate to the use of multiple guide RNAs with a Cas9N protein and a transcriptional regulator protein or domain to regulate expression of a target gene.

Both the Cas9 and gRNA tethering approaches are effective, with the former showing efficiencies that are 1.5-2 fold higher. This difference may be due to the requirement for assembly of a 2-component as opposed to a 3-component complex. However, the gRNA tethering approach in principle enables different effector domains to be recruited by distinct gRNAs, as long as each gRNA uses a different RNA-protein interaction pair. See Karyer-Bibens et al., Biology of the Cell / under the auspices of the European Cell Biology Organization 100, 125-138 (2008), the entire contents of which are incorporated herein by reference. According to one aspect of the disclosure, different target genes can be regulated using specific guide RNAs and a generic Cas9N protein (i.e., the same or a similar Cas9N protein for different target genes). According to one aspect, methods of multiplex gene regulation using the same or similar Cas9N are provided.

The methods of the present disclosure also relate to editing target genes using the Cas9N proteins and guide RNAs described herein to provide multiplex genetic and epigenetic engineering of human cells. Because the specificity of Cas9-gRNA targeting is a concern (see Jiang et al., Nature Biotechnology 31, 233-239 (2013), incorporated herein by reference in its entirety), methods are provided for in-depth interrogation of Cas9 affinity across a very large space of target sequence variations. Thus, aspects of the disclosure provide direct high-throughput readout of Cas9 targeting in human cells, while avoiding the complications introduced by dsDNA cut toxicity and mutagenic repair incurred by specificity testing with native nuclease-active Cas9.

Further aspects of the disclosure generally relate to the use of DNA binding proteins or systems for transcriptional regulation of target genes. Based on the present disclosure, one skilled in the art will readily identify exemplary DNA binding systems. With the naturally occurring Cas9 protein, this DNA binding system does not need to have any nuclease activity. Thus, such DNA binding systems do not require inactivation of nuclease activity. One exemplary DNA binding system is TALE. According to one aspect, TALE specificity is assessed using the method shown in figure 2A. A library of constructs was designed in which each element of the library contained a basic promoter driving dTomato fluorescent protein. Downstream of the transcription start site m, a 24bp (A/C/G) random transcript tag was inserted, while two TF binding sites were placed upstream of the promoter: one is a constant DNA sequence shared by all library elements, and the second is a variable feature of a "biased" library with binding sites engineered to span a large collection of sequences with a combination of many mutations present away from the target sequence to which the programmable DNA targeting complex is designed to bind. This is achieved using degenerate oligonucleotides engineered to carry the nucleotide frequency at each position such that the target sequence nucleotides appear to occur at a frequency of 79% and the nucleotides at a frequency of 7% of each other. See Patwardhan et al, Nature Biotechnology 30,265-270(2012), incorporated herein by reference in its entirety. The reporter library was then sequenced to show the association between the 24bp dTomato transcript tags and their corresponding "biased" target sites in the library elements. 
The high diversity of the transcript tags ensures that sharing of tags between different targets is extremely rare, while the biased construction of the target sequences means that sites with few mutations are associated with more tags than sites with more mutations. Next, transcription of the dTomato reporter gene was stimulated with either a control TF engineered to bind the shared DNA site or a target TF engineered to bind the target site. The abundance of each expressed transcript tag was measured in each sample by performing RNAseq on the stimulated cells, and the tags were then mapped back to their corresponding binding sites using the previously established association table. The control TF is expected to stimulate all library members equally because its binding site is shared by all library elements, while the target TF is expected to skew the distribution of expressed members toward those that it preferentially targets. This assumption was used in step 5 to calculate a normalized expression level for each binding site by dividing the tag counts obtained for the target TF by those obtained for the control TF.
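The normalization just described can be illustrated with a minimal Python sketch. It is not the analysis code used to produce the figures; the function name `normalized_expression`, the dictionary-based association table, and the toy tags below are all assumptions introduced for illustration.

```python
from collections import Counter


def normalized_expression(tag_to_site, target_tags, control_tags):
    """Aggregate expressed tags per binding site, then divide the
    target-TF count by the control-TF count for each site.

    tag_to_site  -- hypothetical association table {tag: binding site}
    target_tags  -- tags read from the sample induced with the target TF
    control_tags -- tags read from the sample induced with the control TF
    """
    target_counts = Counter()
    control_counts = Counter()
    for tag in target_tags:
        site = tag_to_site.get(tag)
        if site is not None:  # ignore tags absent from the association table
            target_counts[site] += 1
    for tag in control_tags:
        site = tag_to_site.get(tag)
        if site is not None:
            control_counts[site] += 1
    # Sites never seen in the control sample are skipped to avoid division by zero.
    return {site: target_counts[site] / ctrl
            for site, ctrl in control_counts.items()}


if __name__ == "__main__":
    table = {"AAC": "site0", "ACA": "site0", "CCA": "site1"}
    target = ["AAC", "AAC", "ACA", "CCA", "GGG"]  # "GGG" is not in the table
    control = ["AAC", "ACA", "CCA", "CCA"]
    print(normalized_expression(table, target, control))
```

In this toy input, site0 collects three target-TF tags against two control-TF tags, and site1 collects one against two, so the sketch reports a higher normalized level for site0.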

As shown in FIG. 2B, the targeting landscape of a Cas9-gRNA complex shows that it tolerates 1-3 mutations on average in its target sequence. As shown in FIG. 2C, the Cas9-gRNA complex is also largely insensitive to point mutations, except those localized to the PAM sequence. It should be noted that these data show that the predicted PAM for Streptococcus pyogenes Cas9 is not only NGG but also NAG. As shown in FIG. 2D, the introduction of 2-base mismatches significantly attenuates the activity of the Cas9-gRNA complex, but only when these are located in the 8-10 bases nearer the 3' end of the gRNA target sequence (in the heat map, the target sequence positions are labeled 1-23 starting from the 5' end).
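The position numbering and PAM rule discussed above can be sketched in a few lines of Python. This is an illustration only: the helper names are hypothetical, the 23-base example sequence is made up, and the sketch assumes that the PAM occupies the last three positions (21-23) of the 23-base site.

```python
def mismatch_positions(target, variant):
    """Return 1-based positions (counted from the 5' end) where the
    variant site differs from the target site."""
    if len(target) != len(variant):
        raise ValueError("target and variant must have the same length")
    return [i + 1 for i, (t, v) in enumerate(zip(target, variant)) if t != v]


def has_spy_pam(site):
    """Check whether the last three bases match NGG or NAG, the two PAMs
    reported in the text for Streptococcus pyogenes Cas9.
    Assumes the PAM sits at the 3' end of the site."""
    pam = site[-3:].upper()
    return len(pam) == 3 and pam[1] in "GA" and pam[2] == "G"


if __name__ == "__main__":
    target = "ACGTACGTACGTACGTACGTTGG"   # made-up 20-base target plus TGG PAM
    variant = "ACTTACGTACGTACGTACGTTAG"  # changes at positions 3 and 22
    print(mismatch_positions(target, variant))
    print(has_spy_pam(target), has_spy_pam(variant))
```

Under these assumptions, the variant differs from the target at one position in the 5' region and one position inside the PAM, yet both sites still carry an accepted PAM (TGG and TAG).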

The mutational tolerance of TALE domains, another widely used genome editing tool, was determined using the transcriptional specificity assay described herein. The TALE off-targeting data for an 18-mer TALE, shown in FIG. 2E, show that it can tolerate 1-2 mutations on average in its target sequence and fails to activate the large majority of 3-base mismatch variants of its target. As shown in FIG. 2F, the 18-mer TALE is, like the Cas9-gRNA complex, largely insensitive to single-base mismatches in its target. As shown in FIG. 2G, the introduction of 2-base mismatches significantly attenuates the activity of the 18-mer TALE. TALE activity is more sensitive to mismatches nearer the 5' end of its target sequence (in the heat map, target sequence positions are labeled 1-18 starting from the 5' end).

These results were confirmed using targeted experiments in a nuclease assay, which is the subject of FIGS. 10A-10D, relating to evaluating the targeting landscape of TALEs of different sizes. As shown in FIG. 10A, using a nuclease-mediated HR assay, it was determined that the 18-mer TALE tolerates multiple mutations in its target sequence. As shown in FIG. 10B, the targeting landscapes of TALEs of 3 different sizes (18-mer, 14-mer, and 10-mer) were analyzed using the method described in FIG. 2A. Shorter TALEs (14-mer and 10-mer) are progressively more specific in their targeting, but their activity is also reduced by nearly an order of magnitude. As shown in FIGS. 10C and 10D, the 10-mer TALE shows near single-base mismatch resolution, losing nearly all activity against targets bearing 2 mismatches (in the heat map, target sequence positions are labeled 1-10 starting from the 5' end). Taken together, these data imply that engineering shorter TALEs can yield higher specificity in genome engineering applications, while the requirement for FokI dimerization in TALE nuclease applications is essential to avoid off-target effects. See Kim et al., Proceedings of the National Academy of Sciences of the United States of America 93, 1156-1160 (1996) and Pattanayak et al., Nature Methods 8, 765-770 (2011), the entire contents of each of which are incorporated herein by reference.

FIGS. 8A-8C relate to the high-level specificity analysis processing flow used to calculate, from the experimental data, the normalized expression levels shown in the examples. As shown in FIG. 8A, a library of constructs was generated with a biased distribution of binding site sequences and random 24 bp sequence tags that are incorporated into the reporter gene transcripts (top). The transcribed tags are highly degenerate, so that they should map many-to-one to Cas9 or TALE binding sequences. The library of constructs was sequenced (level 3, left) to determine which tags co-occur with which binding sites, resulting in an association table of binding sites versus transcribed tags (level 4, left). Library barcodes can be used to sequence at once multiple libraries of constructs created for different binding sites (indicated here by bright blue and bright yellow; levels 1-4, left). The construct library was then transfected into a cell population, and a set of different Cas9/gRNA or TALE transcription factors was induced in samples of the cell population (level 2, right). One sample is always induced with a fixed TALE activator targeted to a fixed binding site sequence within the construct (top level, green box); this sample was used as a positive control (green sample, also indicated by a + sign). The cDNA generated from the reporter mRNA in the induced samples was then sequenced and analyzed to obtain the tag count for each tag in the sample (levels 3 and 4, right). As with the sequencing of the construct library, multiple samples, including the positive control, were sequenced and analyzed together by appending sample barcodes. Here, bright red indicates one non-control sample that was sequenced and analyzed together with the positive control (green).
Since only the transcribed tags, and not the construct binding sites, are present in each read, the binding site-to-tag association table obtained from sequencing of the construct library was then used to tally the total number of tags expressed from each binding site in each sample (level 5). The tallies for each non-positive-control sample were then converted to normalized expression levels by dividing them by the tallies obtained in the positive control sample. Examples of plots of normalized expression level versus number of mismatches are provided in FIGS. 2B and 2E and in FIGS. 9A and 9B. Not shown in the overall processing flow are several levels of filtering for erroneous tags, for tags not associated with a construct library, and for tags that appear to be shared by multiple binding sites. FIG. 8B depicts example distributions of the percentage of binding sites, by number of mismatches, generated within a biased construct library. Left: theoretical distribution. Right: distribution observed from an actual TALE construct library. FIG. 8C depicts example distributions of the percentage of tag counts aggregated to binding sites, by number of mismatches. Left: distribution observed from the positive control sample. Right: distribution observed from a sample induced with a non-control TALE. Because the positive control TALE binds to a fixed site in the construct, its distribution of aggregated tag counts closely reflects the distribution of binding sites in FIG. 8B, whereas for the non-control TALE sample the distribution is skewed to the left, as sites with fewer mismatches induce higher expression levels. Below: calculating the relative enrichment between these, by dividing the tag counts obtained for the target TF by those obtained for the control TF, reveals the average expression level versus the number of mutations in the target site.
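One plausible way to model the theoretical distribution of mismatch counts in a biased library is a binomial distribution, assuming that every position of the binding site is doped independently with the 79% target / 7% per alternative nucleotide frequencies stated earlier. The Python sketch below follows that assumption; the function name and the choice of an 18-position site are illustrative and are not taken from the figures.

```python
from math import comb


def mismatch_distribution(n_positions, p_target=0.79):
    """Probability of observing k mismatches (k = 0..n_positions) when each
    position independently keeps the target nucleotide with probability
    p_target (binomial model; an assumption, not the authors' stated method)."""
    p_mut = 1.0 - p_target
    return [comb(n_positions, k) * p_mut ** k * p_target ** (n_positions - k)
            for k in range(n_positions + 1)]


if __name__ == "__main__":
    dist = mismatch_distribution(18)  # e.g. an 18-base TALE binding site
    for k, p in enumerate(dist[:8]):
        print(f"{k} mismatches: {p:.3f}")
```

Under this model the expected number of mismatches per site is the number of positions times 0.21, so most library members carry a handful of mutations and perfect-match sites are a minority.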

These results were further confirmed by specificity data generated using a different Cas9-gRNA complex. As shown in FIG. 9A, the different Cas9-gRNA complex tolerates 1-3 mutations in its target sequence. As shown in FIG. 9B, this Cas9-gRNA complex is also largely insensitive to point mutations, except those localized to the PAM sequence. However, as shown in FIG. 9C, the introduction of 2-base mismatches significantly attenuates activity (in the heat map, the target sequence positions are labeled 1-23 starting from the 5' end). As shown in FIG. 9D, using a nuclease-mediated HR assay, it was confirmed that the predicted PAM for Streptococcus pyogenes Cas9 is NGG and also NAG.

According to certain aspects, binding specificity is increased according to the methods described herein. Because cooperativity among multiple complexes is a factor in target gene activation by Cas9N-VP64, transcriptional regulation applications of Cas9N are naturally quite specific, as a single off-target binding event should have minimal effect. According to one aspect, offset nicks are used in genome editing methods. The large majority of nicks seldom lead to NHEJ events (see Certo et al., Nature Methods 8, 671-676 (2011), incorporated herein by reference in its entirety), thus minimizing the effects of off-target nicking. In contrast, inducing offset nicks to generate double strand breaks (DSBs) is highly effective in inducing gene disruption. According to certain aspects, 5' overhangs generate more NHEJ events than 3' overhangs. Similarly, 3' overhangs favor HR over NHEJ events, although the total number of HR events is significantly lower than when a 5' overhang is generated. Thus, methods are provided for using nicks for homologous recombination and offset nicks for generating double strand breaks to minimize the effects of off-target Cas9-gRNA activity.

FIGS. 3A-3C relate to multiplex offset nicking and methods for reducing the off-target binding of guide RNAs. As shown in FIG. 3A, a traffic light reporter was used to simultaneously assay HR and NHEJ events upon introduction of targeted nicks or breaks. DNA cleavage events resolved through the HDR pathway restore the GFP sequence, whereas mutagenic NHEJ causes a frameshift that renders GFP out of frame and the downstream mCherry sequence in frame. For the assay, 14 gRNAs covering a 200 bp stretch of DNA were designed: 7 targeting the sense strand (U1-7) and 7 targeting the antisense strand (D1-7). Using the Cas9D10A mutant, which nicks the complementary strand, different two-way combinations of the gRNAs were used to induce a range of programmed 5' or 3' overhangs (the nicking sites for the 14 gRNAs are indicated). As shown in FIG. 3B, inducing offset nicks to generate double strand breaks (DSBs) is highly effective in inducing gene disruption. It should be noted that offset nicks leading to 5' overhangs produce more NHEJ events than those leading to 3' overhangs. As shown in FIG. 3C, generating a 3' overhang also favors the ratio of HR to NHEJ events, but the total number of HR events is significantly lower than when a 5' overhang is generated.
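The geometry of paired offset nicks can be made concrete with a short Python sketch that classifies the overhang produced by one nick on each strand. The coordinate convention is an assumption made for this illustration: both nick positions are expressed as the number of base pairs lying to the left of the nick when the top strand is drawn 5' to 3' from left to right. The function name `classify_overhang` is hypothetical, and the sketch does not model how a given gRNA determines the nick position.

```python
def classify_overhang(top_nick, bottom_nick):
    """Classify the ends left by one nick on each strand of a duplex.

    top_nick    -- base pairs to the left of the top-strand nick
    bottom_nick -- base pairs to the left of the bottom-strand nick
    (top strand drawn 5'->3' left to right; this convention is an assumption)

    Returns (overhang type, overhang length in bases).
    """
    offset = bottom_nick - top_nick
    if offset > 0:
        # Top strand is cut further left: the protruding single strands
        # end in 5' termini.
        return ("5'", offset)
    if offset < 0:
        # Top strand is cut further right: the protruding single strands
        # end in 3' termini.
        return ("3'", -offset)
    return ("blunt", 0)


if __name__ == "__main__":
    # EcoRI-like geometry (G^AATTC on top, cut after CTTAA on the bottom)
    print(classify_overhang(1, 5))
    print(classify_overhang(5, 1))
```

The EcoRI-like case serves as a sanity check of the convention, since that enzyme is known to leave a four-base 5' overhang.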

FIGS. 11A-11B relate to Cas9D10A nickase-mediated NHEJ. As shown in FIG. 11A, a traffic light reporter was used to assay NHEJ events upon introduction of targeted nicks or double strand breaks. Briefly, upon introduction of a DNA cleavage event, if the break is resolved by mutagenic NHEJ, GFP is shifted out of frame and the downstream mCherry sequence is rendered in frame, producing red fluorescence. 14 gRNAs covering a 200 bp stretch of DNA were designed: 7 targeting the sense strand (U1-7) and 7 targeting the antisense strand (D1-7). As shown in FIG. 11B, it was observed that, unlike wild-type Cas9, which results in DSBs and robust NHEJ across all targets, most nicks (using the Cas9D10A mutant) seldom result in NHEJ events. All 14 sites are located within a contiguous 200 bp stretch of DNA, and differences in targeting efficiency of more than 10-fold were observed.

According to certain aspects, described herein are methods of modulating expression of a target nucleic acid in a cell, comprising introducing one or more, or two or more, foreign nucleic acids into the cell. The foreign nucleic acids introduced into the cell encode a guide RNA or guide RNAs, a nuclease-null Cas9 protein or proteins, and a transcriptional regulator protein or domain. Together, the guide RNA, the nuclease-null Cas9 protein, and the transcriptional regulator protein or domain are referred to as a co-localized complex, as that term is understood by one skilled in the art, to the extent that the guide RNA, the nuclease-null Cas9 protein, and the transcriptional regulator protein or domain bind to DNA and regulate expression of the target nucleic acid. According to certain further aspects, the foreign nucleic acids introduced into the cell encode a guide RNA or guide RNAs and a Cas9 protein nickase. Together, the guide RNA and the Cas9 protein nickase are referred to as a co-localized complex, as that term is understood by one skilled in the art, to the extent that the guide RNA and the Cas9 protein nickase bind to DNA and nick the target nucleic acid.

Cells according to the present disclosure include any cell into which foreign nucleic acids can be introduced and expressed as described herein. It should be understood that the basic concepts of the present disclosure described herein are not limited by cell type. Cells according to the present disclosure include eukaryotic cells, prokaryotic cells, animal cells, plant cells, fungal cells, archaeal cells, eubacterial cells, and the like. Cells include eukaryotic cells such as yeast cells, plant cells, and animal cells. Particular cells include mammalian cells. Further, cells include any cell in which it would be beneficial or desirable to regulate a target nucleic acid. Such cells may include those that are deficient in expression of a particular protein, where the deficiency leads to a disease or detrimental condition. Such diseases or detrimental conditions are readily known to those skilled in the art. In accordance with the present disclosure, the nucleic acid responsible for expressing the particular protein can be targeted by the methods described herein using a transcriptional activator, resulting in upregulation of the target nucleic acid and corresponding expression of the particular protein. In this manner, the methods described herein provide therapeutic treatment.

Target nucleic acids include any nucleic acid sequence that a co-localized complex described herein can usefully regulate or nick. Target nucleic acids include genes. For the purposes of this disclosure, DNA, such as double-stranded DNA, can include the target nucleic acid, and a co-localized complex can bind to or otherwise co-localize with the DNA at, adjacent to, or near the target nucleic acid, in a manner such that the co-localized complex can have a desired effect on the target nucleic acid. Such target nucleic acids can include endogenous (or naturally occurring) nucleic acids and exogenous (or foreign) nucleic acids. Based on the present disclosure, the skilled person will be able to readily identify or design guide RNAs and Cas9 proteins that co-localize to DNA comprising the target nucleic acid. The skilled person will further be able to identify transcriptional regulator proteins or domains that likewise co-localize to DNA comprising the target nucleic acid. The DNA includes genomic DNA, mitochondrial DNA, viral DNA, or exogenous DNA.

Foreign nucleic acids (i.e., those that are not a natural nucleic acid component of the cell) can be introduced into the cell using any method known to those of skill in the art for such introduction. Such methods include transfection, transduction, viral transduction, microinjection, lipofection, nucleofection, nanoparticle bombardment, transformation, conjugation, and the like. Such methods will be readily understood and adapted by those skilled in the art using readily identifiable literature sources.

Transcriptional regulators or domains that are transcriptional activators include VP16 and VP64, as well as others that are readily identifiable by one of skill in the art based on the present disclosure.

Diseases and detrimental conditions are those characterized by abnormal loss of expression of a particular protein. Such diseases or detrimental conditions can be treated by upregulation of the particular protein. Thus, methods of treating a disease or detrimental condition are provided in which a co-localized complex described herein associates with or otherwise binds to DNA comprising a target nucleic acid, and the transcriptional activator of the co-localized complex upregulates expression of the target nucleic acid. For example, upregulating PRDM16 and other genes that promote brown fat differentiation and increase metabolic uptake may be used to treat metabolic syndrome or obesity. Activating anti-inflammatory genes is useful in autoimmune and cardiovascular diseases. Activating tumor suppressor genes is useful in the treatment of cancer. Based on the present disclosure, one skilled in the art will readily identify such diseases and detrimental conditions.

The following examples are set forth as being representative of the present disclosure. These examples should not be construed as limiting the scope of the disclosure, as these and other equivalent embodiments will be apparent in light of this disclosure, the accompanying drawings, and the appended claims.
