Genome editing compositions using CRISPR/Cpf1 system and uses thereof

Document No.: 1026673 Publication date: 2020-10-27

Abstract: The present technology, Genome editing compositions using CRISPR/Cpf1 system and uses thereof, was created by 金龙三, 高正宪, 李姃美, and 文秀彬 on 2018-11-21. The present invention relates to genome editing compositions using the CRISPR/Cpf1 system and uses thereof, and more particularly, to a genome editing composition comprising: a CRISPR RNA (crRNA) comprising a guide sequence capable of hybridizing to a target nucleotide sequence and a uridine repeat sequence linked to the 3' end of the guide sequence, or DNA encoding the crRNA; and a Cpf1 protein or DNA encoding the Cpf1 protein; as well as a genome editing method using the composition, a method for constructing a genetically modified organism, and the genetically modified organism. Using the CRISPR/Cpf1 system, the present invention can increase indel efficiency and reduce off-target activity in genome editing, and thus can easily construct a transformed cell or a genetically modified animal or plant having a desired gene inserted therein (knock-in) or deleted therefrom (knock-out).

1. A polynucleotide in a CRISPR/Cpf1 system having a uridine (U) repeat nucleotide sequence linked to the 3' end of a guide sequence complementary to a target nucleotide sequence.

2. The polynucleotide of claim 1, wherein the uridine repeat nucleotide sequence is a nucleotide sequence represented by (U_aV)_nU_b, wherein a and b are integers from 2 to 20, n is an integer from 1 to 5, and V is adenine (A), cytosine (C), or guanine (G).

3. The polynucleotide of claim 2, wherein (U_aV)_nU_b is U4AU6.

4. A composition for genome editing, comprising: CRISPR RNA (crRNA) comprising a guide sequence complementary to a target nucleotide sequence and a uridine repeat sequence linked to the 3' end of said guide sequence, or a DNA encoding said crRNA; and

a Cpf1 protein or DNA encoding said Cpf1 protein.

5. The composition for genome editing according to claim 4, wherein the uridine repeat sequence is a nucleotide sequence comprising 2 to 20 uridines.

6. The composition for genome editing according to claim 4, wherein the uridine repeat sequence is a nucleotide sequence comprising 6 to 10 uridines.

7. The composition for genome editing according to claim 4, wherein the uridine repeat sequence is a nucleotide sequence comprising 8 uridines.

8. The composition for genome editing of claim 4, wherein the uridine repeat sequence is a nucleotide sequence represented by (U_aV)_nU_b, wherein a and b are integers from 2 to 20, n is an integer from 1 to 5, and V is adenine (A), cytosine (C), or guanine (G).

9. The composition for genome editing of claim 8, wherein V is A.

10. The composition for genome editing of claim 8, wherein n is 1.

11. The composition for genome editing of claim 8, wherein (U_aV)_nU_b is U4AU6.

12. The composition for genome editing of claim 4, wherein the guide sequence is 18 to 23nt in length.

13. The composition for genome editing of claim 4, wherein the Cpf1 protein is derived from one or more microorganisms selected from the group consisting of microorganisms of the genus Candidatus, the genus Lachnospira, the genus Butyrivibrio, Peregrinibacteria, the genus Acidaminococcus, the genus Porphyromonas, the genus Prevotella, the genus Francisella, Candidatus Methanoplasma, and the genus Eubacterium.

14. The composition for genome editing of any one of claims 4 to 13, wherein the composition comprises: a PCR amplicon comprising DNA encoding the crRNA, and a recombinant vector comprising DNA encoding the Cpf1 protein.

15. The composition for genome editing of any one of claims 4 to 13, wherein the composition comprises: a recombinant vector comprising a DNA encoding said crRNA, and a recombinant vector comprising a DNA encoding said Cpf1 protein.

16. The composition for genome editing of claim 15, wherein the DNA encoding the crRNA and the DNA encoding the Cpf1 protein are inserted in one recombinant vector or separate vectors.

17. The composition for genome editing according to any one of claims 4 to 13, wherein the composition is for genome editing in a eukaryotic cell or a eukaryotic organism.

18. The composition for genome editing of claim 17, wherein the eukaryote is a eukaryotic animal or a eukaryotic plant.

19. A method for genome editing, the method comprising: introducing the composition of any one of claims 4 to 13 into an isolated cell or organism.

20. The method of claim 19, wherein the introduction of the composition is achieved by local injection, microinjection, electroporation, or lipofection methods.

21. The method of claim 19, wherein the cell or organism is an isolated eukaryotic cell or eukaryotic non-human organism.

22. The method of claim 21, wherein the eukaryotic cell is a cell isolated from a eukaryotic animal or a eukaryotic plant.

23. A method for constructing a genetically modified organism, the method comprising: introducing the composition of any one of claims 4 to 13 into an organism or isolated cell other than a human.

24. A genetically modified organism constructed by the method of claim 23.

Technical Field

The present invention relates to compositions for genome editing using the CRISPR/Cpf1 system and uses thereof, and more particularly, to compositions for genome editing comprising: CRISPR RNA (crRNA) comprising a guide sequence complementary to a target nucleotide sequence and a uridine (U) repeat sequence linked to the 3' end of the guide sequence, or DNA encoding the crRNA; and a Cpf1 protein or a DNA encoding the Cpf1 protein, a genome editing method using the composition, a construction method of a genetically modified organism, and a genetically modified organism.

Background

Genome editing refers to a method of expressing a desired genetic trait by freely correcting the genetic information of an organism. With the development of the CRISPR-associated protein (CRISPR/Cas) system, it has advanced remarkably and is now used in fields ranging from research on gene function to the treatment of disease.

Clustered regularly interspaced short palindromic repeats (CRISPRs) are loci containing multiple short direct repeats that are present in the genomes of bacteria and archaea whose gene sequences have been determined. They function as an acquired prokaryotic immune system that confers resistance to foreign genetic elements such as viruses and bacteriophages. Short sequences of exogenous DNA, called protospacers, are integrated into the genome between the CRISPR repeats and serve as a record of past exposure to external factors. The spacers integrated in this way serve as templates for producing guide RNAs, which direct the cleavage of invading foreign genetic material.

The core of CRISPR-based gene editing technology is the process of recognizing a specific base sequence in an organism using RNA as a mediator, inducing a double-strand break (DSB) at the corresponding gene site by an effector protein (e.g., Cas9), and then repairing the DSB. DSBs produced by CRISPR/Cas systems in eukaryotic cells are repaired either by non-homologous end joining (NHEJ), in which random insertions and deletions (indels) occur in the cleaved base sequence, or by homology-directed repair (HDR), which repairs the cleavage site using as a template a DNA strand having the same base sequence as the region around the cleaved DNA. These repair pathways make it possible to perform knockouts, which induce a frameshift in a specific gene through the insertion or deletion of bases, and knock-ins, which induce the intended insertion or substitution of a specific base sequence in a desired gene. Therefore, DSB frequency and accuracy need to be improved to raise the efficiency of knockout or knock-in at precise locations, and to this end research is continually being conducted to find modifications of existing CRISPR/Cas systems or new CRISPR/Cas systems.
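As a brief illustration of the frameshift mentioned above (not part of the original disclosure): an indel shifts the reading frame of a coding sequence when the net change in length is not a multiple of three. The function name `causes_frameshift` is hypothetical, and the sketch assumes the indel lies entirely within a coding region.

```python
def causes_frameshift(inserted_nt: int, deleted_nt: int) -> bool:
    """Return True if the net length change is not a multiple of 3.

    Assumption: the indel falls inside a protein-coding region, so a net
    change that is not divisible by 3 shifts the downstream reading frame.
    """
    if inserted_nt < 0 or deleted_nt < 0:
        raise ValueError("nucleotide counts must be non-negative")
    return (inserted_nt - deleted_nt) % 3 != 0


if __name__ == "__main__":
    for ins, dele in [(1, 0), (0, 2), (3, 0), (0, 6)]:
        print(f"+{ins}/-{dele} nt -> frameshift: {causes_frameshift(ins, dele)}")
```

In-frame indels (for example, a 3-nt deletion) can still disrupt a protein, so this check only identifies the frameshift case described in the text.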

Recently, Cpf1 (CRISPR from Prevotella and Francisella 1), a type V Cas system, was found, like Cas9, in various types of bacteria. Cpf1 belongs to class 2, which, like Cas9, has a single protein as the effector protein, and under the direction of crRNA (CRISPR RNA) the effector protein causes a DSB in DNA by recognizing a specific protospacer adjacent motif (PAM) sequence.

However, Cas9 requires both crRNA and trans-activating crRNA (tracrRNA) for specific base sequence recognition and cleavage, whereas Cpf1 requires only crRNA. Furthermore, in the PAM sequence through which the complex of effector protein and crRNA recognizes a specific DNA base sequence, Cas9 requires a G-rich sequence, while Cpf1 recognizes a T-rich sequence. The form of the resulting DSB also differs: Cas9 cleaves to leave blunt ends at a site near the PAM, while Cpf1 cleaves to leave staggered ends at a distance of 18 to 23 nucleotides (nt) from the PAM. Additionally, the gene size of Cpf1 is smaller than that of Cas9, and Cpf1 is therefore expected to be more useful for clinical purposes.

The above-described features of Cpf1 may be advantageous in gene therapy. In particular, because there is a limit to the size of genetic material that can be delivered when viruses such as adeno-associated virus (AAV) are used to deliver genetic material for gene editing into humans, the fact that Cpf1 requires only a relatively small protein and a crRNA may be a great advantage over Cas9. In addition, in terms of the stability of gene therapy, the fact that off-target effects of Cpf1 are low compared with Cas9 is an important advantage. However, it has been found to date that the indel efficiency of Cpf1 is relatively low compared with Cas9 or varies widely depending on the gene targeted, which makes it difficult for Cpf1 to replace Cas9. Therefore, to replace or surpass Cas9 while maximizing the advantages of Cpf1, a method for increasing the indel efficiency of Cpf1 must be developed.

The indel efficiency or accuracy for a target gene can be improved by manipulating the effector endonuclease or the guide RNA in the CRISPR/Cas system. Such studies have been actively conducted for Cas9, while studies on the Cpf1 system remain insufficient.

[Detailed Description of the Invention]

[Technical Problem]

Accordingly, as a result of intensive studies to develop a CRISPR/Cpf1 system capable of overcoming the disadvantages of the CRISPR/Cas9 system, the present inventors found that adding a uridine repeat sequence to the 3'-terminal sequence of the crRNA used in the CRISPR/Cpf1 system improves indel efficiency compared with the Cpf1 system using the crRNA of the related art and with the Cas9 system, thereby completing the present invention.

It is an object of the present invention to provide polynucleotides in the CRISPR/Cpf1 system, consisting of a uridine (U) repeat nucleotide sequence linked to the 3' end of a guide sequence complementary to a target nucleotide sequence.

It is another object of the present invention to provide a composition for genome editing comprising: CRISPR RNA (crRNA) comprising a guide sequence complementary to a target nucleotide sequence and a uridine (U) repeat sequence linked to the 3' end of the guide sequence, or DNA encoding the crRNA; and a Cpf1 protein or DNA encoding said Cpf1 protein.

It is yet another object of the present invention to provide a method for genome editing, the method comprising: introducing a composition for genome editing into an isolated cell or organism.

It is a further object of the present invention to provide a method for constructing a genetically modified organism, said method comprising: introducing a composition for genome editing into an isolated cell or organism.

It is a further object of the present invention to provide a genetically modified organism constructed by the method.

[Technical Solution]

One aspect of the present invention provides a polynucleotide in the CRISPR/Cpf1 system consisting of a uridine (U) repeat nucleotide sequence linked to the 3' end of a guide sequence complementary to a target nucleotide sequence.

Another aspect of the invention provides a composition for genome editing comprising: CRISPR RNA (crRNA) comprising a guide sequence complementary to a target nucleotide sequence and a uridine repeat sequence linked to the 3' end of said guide sequence, or DNA encoding said crRNA; and a Cpf1 protein or DNA encoding said Cpf1 protein.

Another aspect of the invention provides a method for genome editing, the method comprising: introducing a composition for genome editing into an isolated cell or organism.

Another aspect of the present invention provides a method for constructing a genetically modified organism, the method comprising: introducing a composition for genome editing into an isolated cell or organism.

In a further aspect of the invention there is provided a genetically modified organism constructed by the method.

[Advantageous Effects]

The present invention can increase indel efficiency and reduce off-target activity in genome editing of eukaryotic cells using the CRISPR/Cpf1 system, and thus can easily construct a genetically modified cell or a genetically modified animal or plant having a desired gene inserted therein (knock-in) or deleted therefrom (knock-out).

Drawings

Figs. 1 to 14 show the results of in vitro experiments confirming that crRNA with a U repeat sequence at the 3' end (U-rich crRNA) increases the dsDNA cleavage efficiency of Cpf1:

Fig. 1 shows results confirming the difference in the indel efficiency of AsCpf1 in vivo according to mutation of the three nucleotides at the 3' end of the crRNA.

Fig. 2 shows results confirming the effect of the U3 end on indel efficiency according to the target (DNMT1, LGALS3BP, and VEGFA).

FIG. 3 shows the results, which confirm that U-rich crRNA at the 3' end increases dsDNA cleavage efficiency of AsCpf1, depending on reaction time and conditions.

FIG. 4 is a diagram showing the ampicillin resistance gene target sequence and the crRNA library sequence.

FIG. 5 is a set of photographs showing colonies of BL21(DE3) E.coli transformed with pET21 plasmid vector according to Colony Forming Units (CFU), wherein an oligonucleotide library was cloned using a sequence-independent and ligation cloning method (Li & Elledge, Methods Mol Biol, 2012).

FIG. 6 is a schematic diagram of an unbiased in vitro experimental approach for finding the optimal crRNA arrangement.

Fig. 7 shows the results of deep sequencing data analysis, which confirmed that a crRNA-encoding plasmid DNA library was prepared such that A, T, G and C were at nearly the same molar ratio at each position.

Fig. 8 shows the probability values, calculated from the reciprocals of the nucleotide ratios at each position, that indicate the optimal crRNA arrangement.

Fig. 9 shows results confirming the change in the activity of AsCpf1 according to the length of the 3'-terminal uridine sequence of the crRNA.

FIG. 10 is a schematic diagram showing an in vitro experimental method for analyzing dsDNA cleavage activity.

FIG. 11 shows results confirming that a U-rich 3' overhang in the crRNA enhances the activity of AsCpf1 (mean ± standard deviation; *: p < 0.05, **: p < 0.01, compared with U8; n = 3).

Fig. 12 is a schematic diagram showing the experimental design for confirming the optimal arrangement of crRNA.

FIG. 13 shows results indicating that the number of reads is inversely proportional to the efficiency of the crRNA.

FIG. 14 shows results confirming, by normalization of the reads, that crRNA with a U8 3' overhang shows the best AsCpf1 activity.

Figs. 15 to 21 show results confirming the optimal crRNA structure for enhancing genome editing efficiency in vivo:

Fig. 15 is a conceptual diagram schematically showing an in vivo analysis method for determining the optimal structure of crRNA according to the present invention.

FIG. 16 shows results confirming that indel efficiency is improved by the U-rich 3' overhang sequence according to the present invention (mean ± standard deviation; representative results of three replicate experiments are shown).

FIG. 17 shows results confirming that the improvement in indel efficiency by a U-rich 3' end of the guide RNA occurs specifically with Cpf1 and not with Cas9 (mean ± standard deviation; representative results of three replicate experiments are shown).

Fig. 18 shows results confirming the change in the indel efficiency of AsCpf1 as the uridine length increases.

FIG. 19 shows results confirming the difference in indel efficiency according to the 3'-terminal sequence of the crRNA (*: p > 0.05, **: p < 0.01; n = 3).

Fig. 20 shows the results, which confirm the optimal target length of uridine for U-rich crRNA.

Figure 21 shows results validating the optimal crRNA structure for improving genome editing efficiency in the CRISPR/Cpf1 system (mean ± standard deviation; representative results of three replicate experiments are shown).

Figs. 22 to 24 show results confirming that knock-in efficiency is improved by crRNA containing a U-rich 3' overhang:

FIG. 22 schematically illustrates that in the presence of crRNA and donor DNA, dsDNA cleavage at the DNMT1 position occurs.

Fig. 23 shows results confirming the indel and knock-in efficiencies at the target site after indel mutation by the CRISPR/Cpf1 system.

Figure 24 shows the results of targeting the same site using AsCpf1 and SpCas9.

Figs. 25 and 26 show the results of a large-scale comparison of the genome editing efficiencies of CRISPR/AsCpf1 and CRISPR/SpCas9:

Fig. 25 shows, as dot plots, the indel efficiencies of AsCpf1 and SpCas9 confirmed for the same target genes in HEK-293T cells, and Fig. 26 shows the same results as box-and-whisker plots.

Fig. 27 to 33 show experimental results confirming that U-rich crRNA according to the present invention does not cause off-target effects:

Fig. 27 shows deep sequencing results comparing the off-target activity of the related-art crRNA sequence and the U-rich crRNA sequence at potential off-target sites.

Fig. 28 shows results comparing the off-target activity of the related-art crRNA and the U-rich crRNA, each having one base mismatched with the target sequence.

FIG. 29 shows results confirming that 98% or more of the genomic DNA was degraded not only by the AsCpf1-U-rich crRNA ribonucleoprotein complex but also by the AsCpf1-standard crRNA ribonucleoprotein complex.

FIG. 30 shows results confirming, with the Integrative Genomics Viewer (IGV), the typical cleavage pattern at positions 18 to 20 of the non-target strand and position 22 of the target strand.

FIG. 31 shows, in a whole-genome Circos plot, the off-target sites for which the DNA cleavage score and the difference between Con-crRNA and U-rich crRNA were confirmed to be 2.5 or more and 6 or less.

FIG. 32 graphically illustrates the number of off-target sites confirmed for the standard and U-rich crRNAs, respectively, and the number of off-target sites common to both.

FIG. 33 is a graph showing the same off-target pattern of whole genome Circos profiles in standard and U-rich crRNAs.

Figs. 34 to 36 show the application of U-rich crRNA according to the present invention to multiplex genome editing and PAM mutation:

Fig. 34 shows results confirming that multiple U-rich crRNAs simultaneously increased the indel efficiency at multiple targets.

Figs. 35 and 36 show the application of U-rich crRNA to AsCpf1 PAM variants (*: p > 0.001, **: p < 0.01; n = 3).

FIGS. 37 to 41 confirm the improved binding affinity of the AsCpf1-U-rich crRNA complex:

Fig. 37 shows crRNA levels determined by Northern blot analysis, performed to confirm whether the increased Cpf1 activity is due to increased crRNA stability or to direct modulation of Cpf1.

Fig. 38 shows results indicating that chemically modified U-rich crRNA shows much higher Cpf1 activity than chemically modified standard crRNA, whereas no significant difference is seen for the chemically modified guide RNA of Cas9.

Figure 39 shows results confirming that 63 nt is the minimum length at which tracrRNA activity does not decrease, and that at this length the presence of U4AU4 does not increase Cas9 activity.

Fig. 40 shows results demonstrating that U-rich crRNA significantly improved binding affinity to AsCpf1 compared with standard crRNA, whereas U-rich sgRNA did not cause a significant difference in binding strength to the SpCas9 complex.

FIG. 41 shows the results of Isothermal Titration Calorimetry (ITC) analysis of U-rich crRNA and standard crRNA, respectively.

[Best Mode]

The present invention has been made in an effort to solve the above-mentioned problems, and provides a polynucleotide in the CRISPR/Cpf1 system, which consists of a uridine (U) repeat nucleotide sequence linked to the 3' end of a guide sequence capable of hybridizing (complementary) to a target nucleotide sequence. Further, the present invention provides a composition for genome editing comprising: CRISPR RNA (crRNA) comprising a guide sequence capable of hybridizing to a target nucleotide sequence and a uridine repeat sequence linked to the 3' end of said guide sequence, or DNA encoding said crRNA; and a Cpf1 protein or DNA encoding said Cpf1 protein.

As used herein, unless specifically indicated otherwise, the term "genome editing" refers to the loss, alteration and/or repair (correction) of gene function by deletion, insertion, substitution, etc., of one or more nucleic acid molecules (e.g., 1 to 100,000bp, 1 to 10,000bp, 1 to 1,000bp, 1 to 100bp, 1 to 70bp, 1 to 50bp, 1 to 30bp, 1 to 20bp, or 1 to 10bp) by cleavage at a target site of a target gene. According to one exemplary embodiment, cleavage at a desired position of a target DNA can be performed by the type V CRISPR/Cpf1 system using the Cpf1 protein, and according to another exemplary embodiment, a specific gene in a cell can be corrected by the type V CRISPR/Cpf1 system using the Cpf1 protein.

Additionally, in the art of delivering a CRISPR/Cpf1 ribonucleoprotein (RNP) or DNA encoding the RNP to cells, methods are provided for overcoming the disadvantages of the existing microinjection method. One example is a technique of editing a genome by incorporating the ribonucleoprotein, or DNA encoding the ribonucleoprotein, into a plasmid or the like and delivering it to a large number of cells at once by electroporation, lipofection, or the like; however, genome editing techniques using the Cpf1 system are not limited thereto.

The CRISPR/Cpf1 gene editing composition may be introduced into a cell or organism in the form of a recombinant vector comprising DNA encoding Cpf1 and a recombinant vector comprising DNA encoding crRNA, or may be introduced into a cell or organism in the form of a mixture comprising a Cpf1 protein and a crRNA or a ribonucleoprotein in which a Cpf1 protein forms a complex with a crRNA.

An exemplary embodiment provides a composition for genome editing comprising a guide sequence capable of hybridizing to a target nucleotide sequence or a DNA encoding the guide sequence, and a Cpf1 protein or a DNA encoding the Cpf1 protein, or a ribonucleoprotein that is a complex of crRNA and a Cpf1 protein.

Another exemplary embodiment provides a method for genome editing of an organism, the method comprising: delivering a ribonucleoprotein comprising a guide RNA (crRNA) and a Cpf1 protein to an organism.

The Cpf1 protein or DNA encoding the Cpf1 protein and the guide RNA or DNA encoding the guide RNA, which are included or used in the composition or the method for genome editing, may be used in the form of a mixture comprising the Cpf1 protein and the guide RNA, or in the form of a ribonucleoprotein (RNP) in which the Cpf1 protein forms a complex with the guide RNA; alternatively, the DNA encoding the Cpf1 protein and the DNA encoding the guide RNA may be contained in separate vectors or together in one vector.

The compositions and methods can be applied to eukaryotes. The eukaryote can be selected from eukaryotic cells (e.g., fungi such as yeast, and cells derived from eukaryotic animals and/or eukaryotic plants, such as embryonic cells, stem cells, somatic cells, and germ cells), eukaryotic animals (e.g., vertebrates or invertebrates, more particularly mammals, including primates such as humans and monkeys, dogs, pigs, cows, sheep, goats, mice, and rats), and eukaryotic plants (e.g., algae such as green algae, and monocots or dicots such as corn, soybean, wheat, and rice).

Another exemplary embodiment provides a method for constructing a genetically modified organism by genome editing using a Cpf1 protein. More specifically, the method for constructing a genetically modified organism may comprise delivering a Cpf1 protein or DNA encoding the Cpf1 protein, and a guide RNA (CRISPR RNA; crRNA) or DNA encoding the guide RNA, to a eukaryotic cell. When the genetically modified organism is a genetically modified eukaryotic animal or a genetically modified eukaryotic plant, the method can further comprise culturing and/or differentiating the eukaryotic cell at the same time as or after the delivery.

Another exemplary embodiment provides a genetically modified organism constructed by the method for constructing a genetically modified organism.

The genetically modified organism can be selected from all eukaryotic cells (e.g., fungi such as yeast, and cells derived from eukaryotic animals and/or eukaryotic plants, such as embryonic cells, stem cells, somatic cells, and germ cells), eukaryotic animals (e.g., vertebrates or invertebrates, more particularly mammals, including primates such as humans and monkeys, dogs, pigs, cows, sheep, goats, mice, and rats), and eukaryotic plants (e.g., algae such as green algae, and monocots or dicots such as corn, soybean, wheat, and rice).

In the method for genome editing and the method for constructing a genetically modified organism provided in the present specification, the eukaryotic animal may be a non-human animal, and the eukaryotic cell may include a cell isolated from a eukaryotic animal including a human.

As used herein, the term "ribonucleoprotein" refers to a protein-ribonucleic acid complex comprising the Cpf1 protein, which is an RNA-guided endonuclease, and a guide RNA (crRNA).

The Cpf1 protein is an endonuclease of a new CRISPR system that differs from the CRISPR/Cas9 system; it is relatively small compared with Cas9, does not require tracrRNA, and can function with a single guide crRNA. In addition, the Cpf1 protein recognizes, as its protospacer adjacent motif (PAM) sequence, a thymine-rich DNA sequence such as 5'-TTN-3' or 5'-TTTN-3' (N is any nucleotide, i.e., a nucleotide having the base A, T, G, or C) located at the 5' end, and cleaves the double strand of the DNA to generate cohesive ends (a cohesive double-strand break). The resulting cohesive ends can facilitate NHEJ-mediated transgene knock-in at the target location (or cleavage site).

The Cpf1 protein of the present invention may be derived from the genus Candidatus, the genus Lachnospira, the genus Butyrivibrio, Peregrinibacteria, the genus Acidaminococcus, the genus Porphyromonas, the genus Prevotella, the genus Francisella, Candidatus Methanoplasma, or the genus Eubacterium, and may be derived from microorganisms such as: Parcubacteria bacterium (GWC2011_GWC2_44_17), Lachnospiraceae bacterium (MC2017), Butyrivibrio proteoclasticus, Peregrinibacteria bacterium (GW2011_GWA_33_10), Acidaminococcus sp. (BV3L6), Porphyromonas macacae, Lachnospiraceae bacterium (ND2006), Porphyromonas crevioricanis, Prevotella disiens, Moraxella bovoculi (237), Smithella sp. (SC_KO8D17), Leptospira inadai, Lachnospiraceae bacterium (MA2020), Francisella novicida (U112), Candidatus Methanoplasma termitum, and Eubacterium eligens, but is not limited thereto. In one example, the Cpf1 protein may be derived from Parcubacteria bacterium (GWC2011_GWC2_44_17), Peregrinibacteria bacterium (GW2011_GWA_33_10), Acidaminococcus sp. (BV3L6), Porphyromonas macacae, Lachnospiraceae bacterium (ND2006), Porphyromonas crevioricanis, Prevotella disiens, Moraxella bovoculi (237), Leptospira inadai, Lachnospiraceae bacterium (MA2020), Francisella novicida (U112), Candidatus Methanoplasma termitum, or Eubacterium eligens, but is not limited thereto.

The Cpf1 protein may be isolated from a microorganism or may be a non-naturally occurring protein produced by recombinant or synthetic methods. The Cpf1 protein may further comprise, but is not limited to, elements commonly used for intranuclear delivery in eukaryotic cells (e.g., a nuclear localization signal (NLS)). The Cpf1 protein may be used in the form of a purified protein, or in the form of DNA encoding the Cpf1 protein or a recombinant vector comprising the DNA.

The crRNA used in the Cpf1 system of the invention is characterized by a uridine repeat sequence linked to the 3' terminus of the guide RNA sequence that hybridizes to the target gene.

In an exemplary embodiment of the present invention, the uridine repeat sequence may be a nucleotide sequence in which uridine is repeated 2 to 20 times. Preferably, the crRNA of the present invention may comprise a sequence of 6 to 10 repeated uridines, more preferably 8 repeated uridines.

In another exemplary embodiment of the present invention, the uridine repeat sequence may be a nucleotide sequence represented by (U_aV)_nU_b. In this case, a and b are integers from 2 to 20, n is an integer from 1 to 5, and V is adenine (A), cytosine (C), or guanine (G).

In a preferred exemplary embodiment of the invention, V is A, and the uridine repeat sequence may be a nucleotide sequence represented by (U_aA)_nU_b.

In a preferred exemplary embodiment of the invention, n is 1, and the uridine repeat sequence may be a nucleotide sequence represented by U_aVU_b.

In a preferred exemplary embodiment of the invention, the uridine repeat sequence may be a nucleotide sequence represented by U4AU6.
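The uridine repeat formula above can be sketched in code. The following is a minimal illustration, not taken from the disclosure: it builds the RNA string for (U_aV)_nU_b under the stated ranges for a, b, n, and V, and links it to the 3' end of a guide written 5' to 3'. The function names and the 20-nt guide in the usage lines are hypothetical placeholders.

```python
def uridine_repeat(a: int, b: int, n: int = 1, v: str = "A") -> str:
    """Return the RNA string for (U_a V)_n U_b.

    Ranges follow the text: a and b from 2 to 20, n from 1 to 5,
    and V one of A, C, or G.
    """
    if not (2 <= a <= 20 and 2 <= b <= 20):
        raise ValueError("a and b must be integers from 2 to 20")
    if not 1 <= n <= 5:
        raise ValueError("n must be an integer from 1 to 5")
    if v not in ("A", "C", "G"):
        raise ValueError("V must be A, C, or G")
    return ("U" * a + v) * n + "U" * b


def append_tail(guide_rna: str, tail: str) -> str:
    """Link the uridine repeat to the 3' end of a guide written 5'->3'."""
    return guide_rna.upper() + tail


if __name__ == "__main__":
    tail = uridine_repeat(a=4, b=6)          # U4AU6
    print(tail)                              # UUUUAUUUUUU
    # Hypothetical 20-nt guide, for illustration only (not a real target).
    print(append_tail("GCUAGCUAGCUAGCUAGCUA", tail))
```

Because the guide is written 5' to 3', simple string concatenation places the repeat at the 3' end, which is the arrangement the text describes.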

In the present invention, a guide sequence capable of hybridizing to a target nucleotide sequence refers to a nucleotide sequence having 50% or more, 60% or more, 70% or more, 80% or more, 90% or more, 95% or more, 99% or more, or 100% sequence complementarity to the nucleotide sequence of a target site of a gene (the target sequence) (hereinafter, the term is used with the same meaning unless otherwise specified; sequence homology can be confirmed using typical sequence comparison means, e.g., BLAST). For example, a crRNA capable of hybridizing to a target sequence may have a sequence complementary to the corresponding sequence on the strand opposite to the nucleic acid strand on which the target sequence is located (i.e., the strand on which the PAM sequence is located). In other words, the crRNA may contain, as its targeting sequence site, the target sequence given as a DNA sequence with T replaced by U.
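To make the percentage thresholds above concrete, the sketch below computes an ungapped, position-by-position complementarity between a guide RNA and the DNA strand it hybridizes to, both written 5' to 3'. This is a simplified stand-in and not the BLAST comparison named in the text; it assumes equal lengths, antiparallel pairing, and Watson-Crick pairs only (no G-U wobble). The names are hypothetical.

```python
# RNA base -> DNA base it pairs with (Watson-Crick only; an assumption).
RNA_TO_DNA_PARTNER = {"A": "T", "U": "A", "G": "C", "C": "G"}


def percent_complementarity(guide_rna: str, dna_strand: str) -> float:
    """Percent of guide positions paired with the antiparallel DNA strand.

    Both sequences are given 5'->3'; the DNA strand is reversed so that
    position i of the guide faces its antiparallel partner.
    """
    guide = guide_rna.upper()
    dna = dna_strand.upper()
    if not guide or len(guide) != len(dna):
        raise ValueError("sequences must be non-empty and of equal length")
    paired = sum(
        RNA_TO_DNA_PARTNER.get(r) == d for r, d in zip(guide, reversed(dna))
    )
    return 100.0 * paired / len(guide)


if __name__ == "__main__":
    print(percent_complementarity("AUGC", "GCAT"))  # fully paired
    print(percent_complementarity("AUGC", "GCAA"))  # one mismatch
```

A gapped or local alignment tool would be needed to handle insertions or length differences; this sketch only covers the equal-length case.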

In the present specification, a crRNA may be denoted by its target sequence; in this case, even if not otherwise mentioned, the crRNA sequence is to be interpreted as the target sequence with T replaced by U.
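The T-to-U convention described above can be written as a one-line conversion. The sketch below is illustrative only; the function name and the example sequence are hypothetical, and the input is assumed to be the target sequence on the PAM-containing strand, written 5' to 3'.

```python
def crrna_guide_from_target(target_dna: str) -> str:
    """Return the crRNA guide for a target sequence given as DNA.

    Follows the convention in the text: the crRNA sequence is the target
    sequence with every T replaced by U.
    """
    target = target_dna.upper()
    invalid = set(target) - set("ACGT")
    if invalid:
        raise ValueError(f"unexpected characters in DNA: {sorted(invalid)}")
    return target.replace("T", "U")


if __name__ == "__main__":
    # Hypothetical target sequence, for illustration only.
    print(crrna_guide_from_target("TTAGGCATCG"))  # UUAGGCAUCG
```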

The nucleotide sequence of the gene target site (target sequence) may be a sequence to the 5' end of which a protospacer adjacent motif (PAM) of TTTN or TTN (N is A, T, C, or G), or a PAM having sequence homology of 50% or more, 66% or more, or 75% or more with TTTN or TTN, is linked (for example, the PAM sequence is linked directly to the 5' end of the target sequence (0 nt distance) or is linked to the 5' end of the target sequence at a distance of 1 to 10 nt). Alternatively, in addition to the PAM sequence at the 5' end, the target sequence may be a sequence to the 3' end of which a sequence complementary to the PAM sequence in the reverse direction (NAAA or NAA, or a sequence having sequence homology of 50% or more, 66% or more, or 75% or more with NAAA or NAA; N is A, T, C, or G; the reverse PAM sequence at the 3' end) is linked (for example, the reverse PAM sequence is linked directly to the 3' end of the target sequence (0 nt distance) or is linked to the 3' end of the target sequence at a distance of 1 to 10 nt).

In an exemplary embodiment of the present invention, the guide sequence included in the crRNA may have a length of 18 to 23nt, but is not limited thereto.
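As a minimal sketch of the target-site definition above, the following scans one DNA strand for a TTTN PAM linked directly (0 nt distance) to the 5' end of a target sequence of a chosen length within the 18 to 23 nt range. The function name and the example input are hypothetical; the sketch considers only exact TTTN PAMs on the given strand and does not cover TTN PAMs, PAM variants with partial homology, spacing of 1 to 10 nt, or the opposite strand.

```python
import re


def find_tttn_targets(dna: str, target_len: int = 23):
    """Return (PAM start index, PAM, target) for every TTTN PAM directly followed by a target."""
    if not 18 <= target_len <= 23:
        raise ValueError("target_len must be between 18 and 23 nt")
    dna = dna.upper()
    # A lookahead is used so that overlapping candidate sites are all reported.
    pattern = re.compile(r"(?=(TTT[ACGT])([ACGT]{%d}))" % target_len)
    return [(m.start(), m.group(1), m.group(2)) for m in pattern.finditer(dna)]


# Hypothetical example: two flanking bases, a TTTC PAM, the 23-nt sequence of SEQ ID NO: 58, two flanking bases.
example = "GG" + "TTTC" + "CTGATGGTCCATGTCTGTTACTC" + "AA"
for start, pam, target in find_tttn_targets(example):
    print(start, pam, target)
```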

In an exemplary embodiment of the present invention, the crRNA may be provided in the form of a PCR amplicon comprising DNA encoding the crRNA or in the form of a recombinant vector. As an example, the present invention may provide a composition for genome editing comprising: a PCR amplicon comprising DNA encoding a crRNA, and a recombinant vector comprising DNA encoding a Cpf1 protein. As another exemplary embodiment, the present invention may provide a composition for genome editing, comprising: a recombinant vector comprising DNA encoding a crRNA, and a recombinant vector comprising DNA encoding a Cpf1 protein. In this case, the recombinant vector may comprise a crRNA expression cassette comprising the crRNA-encoding DNA and/or a transcriptional control sequence, such as a promoter, operably linked thereto.

The DNA encoding the crRNA and the DNA encoding the Cpf1 protein according to the present invention may be inserted (cloned) into one recombinant vector or into separate vectors.

As another example, the following may be delivered into a cell or organism by local injection, microinjection, electroporation, lipofection, and the like: a mixture comprising an RNA-guided endonuclease (RGEN) and a guide RNA, or a ribonucleoprotein (RNP); DNA encoding the RGEN, the guide RNA, and the RNP; or a recombinant vector comprising the DNA.

In the above methods, delivery of a mixture comprising Cpf1 (endonuclease) or DNA encoding Cpf1 and crRNA or DNA encoding crRNA, or of a ribonucleoprotein or DNA encoding the ribonucleoprotein, may be performed by delivering a mixture comprising Cpf1 expressed (purified) in vitro and crRNA, or a ribonucleoprotein in which Cpf1 and crRNA are complexed, to a eukaryotic cell or eukaryotic organism by methods such as microinjection, electroporation, and lipofection. In yet another example, delivery of a mixture comprising Cpf1 or Cpf1-encoding DNA and crRNA or crRNA-encoding DNA, or of a ribonucleoprotein, may be performed by delivering a recombinant vector comprising an expression cassette comprising DNA encoding Cpf1 and an expression cassette comprising DNA encoding crRNA, either in separate vectors or together in one vector, to a eukaryotic cell and/or eukaryotic organism by methods such as local injection, microinjection, electroporation, and lipofection.

In addition to the endonuclease-encoding DNA or crRNA-encoding DNA, the expression cassette may also contain typical gene expression control sequences in the form operably linked to the endonuclease-encoding DNA or crRNA-encoding DNA.

The term "operably linked" refers to a functional association between a gene expression control sequence and another nucleotide sequence.

The gene expression control sequence may be one or more selected from an origin of replication, a promoter, and a transcription termination sequence (terminator).

The promoter described herein is one of the transcription control sequences that regulate the transcription initiation of a particular gene, and is typically a polynucleotide fragment of about 100 to about 2500 bp in length. In an exemplary embodiment, a promoter may be used without limitation as long as the promoter can regulate transcription initiation in a cell, such as a eukaryotic cell. For example, the promoter may be one or more selected from the following: cytomegalovirus (CMV) promoter (e.g., human or mouse CMV immediate early promoter), U6 promoter, EF1-alpha (elongation factor 1-alpha) promoter, EF1-alpha short (EFS) promoter, SV40 promoter, adenovirus promoter (major late promoter), pLλ promoter, trp promoter, lac promoter, tac promoter, T7 promoter, vaccinia virus 7.5K promoter, HSV tk promoter, SV40E1 promoter, Respiratory Syncytial Virus (RSV) promoter, metallothionein promoter, β-actin promoter, ubiquitin C promoter, human interleukin-2 (IL-2) gene promoter, human lymphotoxin gene promoter, human granulocyte macrophage colony stimulating factor (GM-CSF) gene promoter, and the like, but is not limited thereto.

The transcription termination sequence may be a polyadenylation sequence (pA) or the like. The origin of replication may be an f1 origin of replication, SV40 origin of replication, pMB1 origin of replication, adeno origin of replication, AAV origin of replication, BBV origin of replication, and the like.

The vector described herein may be selected from the group consisting of plasmid vectors, cosmid vectors, and viral vectors, such as phage vectors, adenovirus vectors, retrovirus vectors, and adeno-associated virus vectors. A vector that can be used as a recombinant vector can be constructed on the basis of the following, as used in the art: plasmids (e.g., pcDNA series, pSC101, pGV1106, pACYC177, ColE1, pKT230, pME290, pBR322, pUC8/9, pUC6, pBD9, pHC79, pIJ61, pLAFR1, pHV14, pGEX series, pET series, pUC19, etc.), phages (e.g., λgt4λB, λ-Charon, λΔz1, M13, etc.), viral vectors (e.g., adeno-associated virus (AAV) vectors, etc.), or the like, but is not limited thereto.

[Embodiments]

Hereinafter, the present invention will be described in more detail by examples. These examples are only for illustrating the present invention, and it is apparent to those of ordinary skill in the art that the scope of the present invention is not to be construed as being limited by these examples.

<Experimental methods>

1. Cell culture and transfection

HEK-293T cells (293T/17, ATCC) were cultured at 37 °C and 5% CO2 in high-glucose DMEM medium supplemented with 10% heat-inactivated FBS (Corning) and 1 mM penicillin/streptomycin.

Cells were transfected by electroporation or lipofection. Specifically, for electroporation, 2 to 5 μg of a plasmid vector (Addgene) encoding AsCpf1, LbCpf1, or SpCas9 was transfected, together with 1 to 3 μg of a PCR amplicon encoding crRNA or sgRNA, into 5 × 10^5 to 1 × 10^6 HEK-293T cells using a Neon electroporator (Invitrogen). Chemically synthesized crRNA (Bioneer) was used instead of the PCR amplicon when necessary.

For lipofection, 3 to 15 μL of FuGene reagent (Promega) was mixed for 15 minutes with 1 to 5 μg of a plasmid vector encoding AsCpf1, LbCpf1, or SpCas9 and 3 to 15 μg of PCR amplicon. The mixture (300 μL) was added to 5 × 10^5 cells that had been plated in 1 ml of DMEM the day before transfection, and the cells were then cultured for 48 hours.

After incubation, the cells were harvested, and genomic DNA was prepared using the PureHelix™ Genomic DNA preparation kit (NanoHelix) or the Maxwell™ RSC nucleic acid isolation workstation (Promega).

pSpCas9(BB)-2A-GFP (PX458), pY010 (pcDNA3.1-hAsCpf1), and pY016 (pcDNA3.1-hLbCpf1) were obtained from Feng Zhang (Addgene plasmids #48138, #69982, and #69988, respectively). Information on the targets used in the examples of the present invention is shown in [Table 1] and [Table 2] below.

TABLE 1

(The four nucleotides in [ ] in the target sequences above represent PAM sequences.)

TABLE 2

2. AsCpf1 PAM variants

Site-directed mutagenesis was performed on a Veriti thermal cycler (Life Technologies) using the pY010 plasmid vector as a template and mutagenic primers.

The S542R mutation was generated using a mutagenic primer pair (SEQ ID NOS: 52 and 53). Additional mutagenesis primers (SEQ ID NOS: 54 to 57) were used to generate the K607R and K548V/N552R mutations. The primer sequences used in this example are shown in table 3 below.

TABLE 3

Briefly, 100 ng of plasmid template and 15 pmol of each mutagenic primer were added to 50 μl of Toyobo KOD mixture (Takara), and an initial denaturation step (3 min, 94 °C) was followed by 25 cycles of a denaturation step (20 sec, 95 °C), an annealing step (40 sec, 62 °C), and a polymerization step (10 min, 72 °C). The PCR product was reacted with 2 μl of DpnI (New England Biolabs) at 37 °C for 2 hours. The reaction mixture (5 μl) was heat-denatured at 62 °C for 20 minutes and then used to transform BL21(DE3) E. coli cells. Mutagenesis was confirmed by Sanger sequencing.

3. Purification of recombinant AsCpf1

The codon-humanized Cpf1 gene from Acidaminococcus sp. was cloned into the pET-28a(+) plasmid vector (Invitrogen), and the vector construct was transformed into BL21(DE3) E. coli cells.

Colonies of the transformed E. coli were grown in LB medium (LB broth) at 37 °C to an optical density of about 0.7, and the cells were then cultured overnight at 30 °C in the presence of 0.1 mM isopropylthio-β-D-galactoside (IPTG) to induce production of the recombinant protein. Next, the cells were collected by centrifugation at 3,500 g for 30 minutes and disrupted by sonication. The cell lysate was cleared by centrifugation at 15,000 g for 30 minutes and filtered through a 0.45 μm syringe filter (Millipore). The cleared lysate was loaded onto a Ni2+ affinity column using an FPLC purification system (AKTA Purifier, GE Healthcare).

In addition, recombinant AsCpf1 was purified in an automated protein production system (ExiProgen, Bioneer) by adding 1 μg of the genetic construct to the in vitro transcription mixture. The concentration of the produced protein was confirmed by SDS-PAGE and Coomassie blue staining, using bovine serum albumin (BSA) as a reference.

4. AsCpf1 in vitro DNA cleavage

A PCR amplicon carrying a TTTC PAM followed by the DNA sequence 5'-CTGATGGTCCATGTCTGTTACTC-3' (SEQ ID NO: 58) was cloned into the T-Blunt vector (Solgent). The vector construct was amplified in DH5α E. coli cells and purified using the HiGene™ DNA purification kit (Biofact). The target vector (20 ng/μL) was reacted with purified recombinant AsCpf1 protein (50 ng/μL) and chemically synthesized crRNA (10 ng/μL) at 37 °C for 30 to 60 minutes. The reaction mixture was either resolved on a 10% SDS-PAGE gel to quantify the cleavage products, or used to transform DH5α E. coli competent cells by heat shock at 42 °C for 2 minutes. The transformed cells were spread on LB agar plates containing ampicillin (50 ng/μL) and cultured at 37 °C. The number of colonies formed was counted to assess the crRNA-dependent DNA cleavage by AsCpf1.

5. Indel quantification

A T7 endonuclease I (T7E1) assay was performed to assess the indel efficiency of AsCpf1, LbCpf1, or SpCas9 at the targeted loci of HEK-293T cells. PCR products were obtained by amplifying the target site using the SolG™ Pfu-based PCR amplification kit (SolGent). The PCR product (100 to 300 μg) was then reacted with 10 units of T7E1 enzyme (New England Biolabs) in a 25 μL reaction mixture at 37 °C for 1 hour. The reaction mixture was loaded directly onto a 10% SDS-PAGE gel, and the cleaved PCR products were run in a TBE buffer system. The gel was stained with ethidium bromide solution and then digitized on a Printgraph 2M gel imaging system (Atto). To calculate the indel efficiency, the digitized images were analyzed using ImageJ software.
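The specification does not state how the band intensities measured from the digitized gel were converted into an indel efficiency. As a minimal sketch, assuming the estimate commonly used for mismatch-cleavage assays, indel (%) = 100 × (1 − sqrt(1 − f_cut)), where f_cut is the cleaved fraction of the total band intensity, the calculation could look as follows. The function name and the example intensities are hypothetical and are not data from this specification.

```python
import math
from typing import Sequence


def indel_percent(uncut: float, cut_bands: Sequence[float]) -> float:
    """Estimate indel efficiency (%) from T7E1 band intensities.

    Assumption (not stated in the source): indel % = 100 * (1 - sqrt(1 - f_cut)),
    where f_cut = cleaved intensity / (cleaved + uncleaved intensity).
    """
    cut = sum(cut_bands)
    total = uncut + cut
    if total <= 0 or uncut < 0 or any(b < 0 for b in cut_bands):
        raise ValueError("band intensities must be non-negative with a positive total")
    f_cut = cut / total
    return 100.0 * (1.0 - math.sqrt(1.0 - f_cut))


# Hypothetical intensities: one uncleaved band and two cleavage-product bands.
print(round(indel_percent(75.0, [15.0, 10.0]), 2))
```

The square root reflects the assumption that cleavable heteroduplexes form by random reannealing of edited and unedited strands, so the cleaved fraction is not itself the indel frequency.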

6. Off-target activity assessment

Potential off-target sites with two or fewer bulges and mismatches were selected using Cas-OFFinder [21; http://www.rgenome.net/casoffinder; Tables 4 to 9]. After transfection with the AsCpf1 vector construct and the PCR amplicon encoding crRNA, HEK-293T cells were cultured in DMEM for 2 days.
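The selection criterion above can be illustrated, for the mismatch part only, by the following sketch. It is not Cas-OFFinder and does not search a genome or handle bulges; it merely counts mismatches between an on-target sequence and candidate sequences of the same length. The function names and the candidate sequences are hypothetical.

```python
def count_mismatches(on_target: str, candidate: str) -> int:
    """Count positions at which two equal-length sequences differ (no bulges considered)."""
    if len(on_target) != len(candidate):
        raise ValueError("sequences must have the same length")
    return sum(1 for x, y in zip(on_target.upper(), candidate.upper()) if x != y)


def within_mismatch_limit(on_target, candidates, max_mismatches=2):
    """Keep the candidates that differ from the on-target sequence at no more than max_mismatches positions."""
    return [c for c in candidates if count_mismatches(on_target, c) <= max_mismatches]


on_target = "CTGATGGTCCATGTCTGTTACTC"
# Hypothetical candidates with 1, 2, and 3 mismatches, respectively.
candidates = [
    "CTGATGGTCCATGTCTGTTACTA",
    "ATGATGGTCCATGTCTGTTACTA",
    "AAAATGGTCCATGTCTGTTACTC",
]
print(within_mismatch_limit(on_target, candidates))
```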

TABLE 4

(The four nucleotides in [ ] in the target sequences above represent PAM sequences. Lower-case letters indicate mismatches, and a dash (-) indicates a bulge. Positions of the target sequences are given according to Genome Reference Consortium Human Build 38 patch release 11 (GRCh38.p11).)

TABLE 5

TABLE 6

TABLE 7

TABLE 8

TABLE 9

(The occurrence of SNPs in the studied alleles was monitored by comparison with untreated allele sequences. Single-nucleotide variations observed identically in Cpf1-treated and untreated alleles were considered SNPs.)

The on-target and potential off-target sites were amplified by nested PCR and used for library construction. Each library was purified using Agencourt AMPure XP (Beckman Coulter) and quantified by the PicoGreen method using the Quant-iT PicoGreen dsDNA assay kit (Invitrogen).

After the size of each library was confirmed using the Agilent 2100 Bioanalyzer system (Agilent Technologies), qPCR analysis was performed to confirm that the loading amount and cluster density met Illumina's recommendations. Next, paired-end sequencing was performed on the Illumina MiSeq sequencing platform using MiSeq Reagent Kit V3 (Life Sciences). Primer sequences were removed from each set of raw data using the Cutadapt tool (version 1.14). The trimmed sequences were grouped, and sequence comparisons were made. Indel mutations observed within the 23-nt target sequence were regarded as genome editing caused by off-target activity.

Alternatively, indel mutations were induced by electroporating 5 μg of the AsCpf1 vector construct and 3 μg of crRNA having either the on-target sequence or a sequence with only a single-base mismatch into 2 × 10^6 HEK-293T cells, and the DNMT1 target site of the HEK-293T cell line was amplified by PCR. The indel efficiency was measured on SDS-PAGE gels after digestion with T7E1.

7. Unbiased in vitro experiments

crRNA library oligonucleotides with an 11-nt random sequence at the 3' end were synthesized such that each crRNA was present at the same molar ratio (Integrated DNA Technologies). The oligonucleotide library was cloned into the pET21 plasmid vector using the sequence- and ligation-independent cloning (SLIC) method. The cloned plasmid was used to transform BL21(DE3) E. coli cells, ensuring a colony forming unit count of 10^8 CFU/mL or higher. CFU values were calculated by counting the colonies formed from serial dilutions of the transformed cells on ampicillin (+) plates. The transformed cells were grown in LB medium supplemented with 50 ng/mL ampicillin until an optical density of 0.6 was reached. Competent cells (2 × 10^10 cells/mL) were transformed with the pET-28a(+) plasmid vector (50 to 200 ng) carrying dCpf1 or Cpf1 using a Gene Pulser Xcell electroporator (BioRad). The transformed cells were plated on agar plates supplemented with ampicillin and kanamycin, to which 0.1 M IPTG had been added. Plasmid vectors were purified from the colonies collected from each plate. The plasmid vectors were subjected to deep sequencing analysis using an Illumina HiSeq X Ten sequencer (Macrogen, South Korea) to calculate the A/T/G/C frequency at each position of the crRNA.
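The final step above, calculating the A/T/G/C frequency at each position, can be sketched as follows for a set of already extracted, equal-length sequences of the randomized region. How reads were extracted and filtered is not described in this specification, so the function name and the example reads are hypothetical.

```python
from collections import Counter


def position_frequencies(seqs):
    """Return, for each position, the fraction of sequences carrying A, T, G, or C."""
    if not seqs:
        raise ValueError("at least one sequence is required")
    length = len(seqs[0])
    if any(len(s) != length for s in seqs):
        raise ValueError("all sequences must have the same length")
    freqs = []
    for i in range(length):
        # Count the bases observed at position i across all sequences.
        counts = Counter(s[i].upper() for s in seqs)
        freqs.append({base: counts.get(base, 0) / len(seqs) for base in "ATGC"})
    return freqs


# Hypothetical reads of a 4-nt randomized region.
reads = ["TTTA", "TTGA", "TATC", "TTTT"]
for pos, f in enumerate(position_frequencies(reads), start=1):
    print(pos, f)
```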

8. Binding experiments

Binding experiments were performed using Isothermal Titration Calorimetry (ITC) and microscale thermophoresis (MST).

ITC was performed in an Auto-iTC200 microcalorimeter (GE Healthcare). Specifically, the sample cell containing 5 μM of purified recombinant AsCpf1 protein in PBS buffer (pH 7.4) at 25 °C was titrated with chemically synthesized standard or U-rich crRNA (50 μM) at 2 μL per injection. Data analysis was performed using MicroCal Origin™ software (GE Healthcare). The calculated values are the average of three independent experiments. For MST, the binding affinity of the guide RNA to the effector proteins (SpCas9 and AsCpf1) was measured using a Monolith NT.115 (NanoTemper Technologies GmbH). Chemically synthesized crRNA (IDT Technologies) was labeled with Cy5 fluorescent dye. Purified recombinant AsCpf1 at various concentrations (0.25 nM to 50 μM) was mixed with 8 nM labeled RNA in PBS buffer containing 0.05% Tween-20 and 0.05% BSA. Analysis was performed at 24 °C using 5% LED power and 20% MST power.

Meanwhile, in the Cas9 MST experiments, Cy5-labeled crRNA was hybridized with tracrRNA at an equimolar ratio. Specifically, the two RNA oligonucleotides resuspended in Nuclease-Free Duplex Buffer (IDT Technologies) were heated at 95 °C for 5 minutes and then cooled at room temperature. Purified SpCas9 protein at various concentrations (0.1 nM to 15 μM) was mixed with 8 nM labeled RNA in 20 mM HEPES buffer (pH 7.4) containing 150 mM KCl, 0.05% Tween-20, and 0.05% BSA. Analysis was performed at 24 °C using 20% LED power and 20% MST power. All samples were loaded into NanoTemper standard capillaries, and each measurement was repeated at least 3 times. Binding affinity data were analyzed using NanoTemper analysis software.
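To illustrate how a dissociation constant relates to a titration such as those above, the following sketch evaluates the fraction of labeled RNA bound to protein under a simple 1:1 binding model that accounts for depletion of free protein. The model, the function name, and the example Kd of 100 nM are assumptions made for illustration; the specification states only that the binding data were analyzed with the NanoTemper analysis software.

```python
import math


def fraction_bound(protein_nM: float, rna_nM: float, kd_nM: float) -> float:
    """Fraction of labeled RNA in complex for 1:1 binding (quadratic solution)."""
    if protein_nM < 0 or rna_nM <= 0 or kd_nM <= 0:
        raise ValueError("concentrations must be non-negative and rna/kd positive")
    s = protein_nM + rna_nM + kd_nM
    # Complex concentration is the smaller root of C^2 - s*C + P*R = 0.
    complex_nM = (s - math.sqrt(s * s - 4.0 * protein_nM * rna_nM)) / 2.0
    return complex_nM / rna_nM


# 8 nM labeled RNA (as in the description) and a hypothetical Kd of 100 nM,
# evaluated at a few protein concentrations within the 0.25 nM to 50 uM range.
for protein in (0.25, 10.0, 100.0, 1000.0, 50000.0):
    print(protein, round(fraction_bound(protein, 8.0, 100.0), 3))
```

With the labeled RNA concentration well below Kd, the protein concentration at which about half of the RNA is bound approximates Kd.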

9. Northern blot analysis

Total RNA was extracted from HEK-293T cells using the Maxwell RSC miRNA tissue kit (Promega) according to the manufacturer's instructions. After each sample was denatured in RNA denaturation buffer (20% formaldehyde, 50% formamide, 50 mM MOPS, pH 7.0) at 65 °C for 15 minutes, 0.3 to 0.5 μg of the isolated RNA was separated on a 1% agarose/16% formaldehyde gel. The RNA was then transferred in 10× SSC to a positively charged nylon membrane by capillary transfer overnight. The RNA was prehybridized for 30 minutes in DIG Easy Hyb pre-warmed to 50 °C and then hybridized with 20 to 50 ng/ml of PCR DIG probe that had been prepared with the PCR DIG labeling mix (Roche) and denatured at 96 °C for 5 minutes. The blot was washed and immunodetected with Anti-Digoxigenin-AP Fab fragments (Roche). The target RNA-DNA probe hybrids were visualized by chemiluminescence assay using CDP-Star substrate (Roche). The probe sequences (SEQ ID NOS: 69 and 70) are shown in Table 10 below.

TABLE 10

10. Statistical analysis

Statistical analysis of indel efficiency was performed in SigmaPlot using a two-tailed Student's t-test. P values < 0.05 were considered statistically significant.
