Method for creating plant haploid induction line and application thereof

Document No.: 675153. Publication date: 2021-04-30.

Reading note: this technology, Method for creating plant haploid induction line and application thereof (创制植物单倍体诱导系的方法及其应用), was designed and created on 2019-10-29 by 潘燕林丁琦谭超陈雅徐念李弘婧王文舒马崇烈. Main content: The present application relates to a method for creating a plant haploid inducer line, comprising modifying the CENH3 gene in a plant so that its function is partially lost. Specifically, the application relates to editing the rice OsCENH3 gene with the CRISPR/Cas9 system to produce a rice haploid inducer line capable of inducing haploid production, which is of great significance for agricultural production. The application also covers methods of inducing haploids using the haploid inducer line, and the use of the haploid inducer line for breeding, for gene editing, for producing haploids and for producing new haploid inducer lines.

1. A method of creating a haploid inducer line of a plant, comprising modifying the CENH3 gene in the plant such that its function is partially lost, wherein optionally the modification results in a deletion or insertion mutation in the CENH3 gene, and optionally the modification is made by gene editing.

2. The method according to claim 1, wherein the plant is rice, preferably indica or japonica rice, and the CENH3 gene is the OsCENH3 gene, the OsCENH3 gene comprising or consisting of: the nucleotide sequence shown in SEQ ID NO: 4 or a variant sequence having at least 90%, 95% or 99% sequence identity thereto, or a nucleotide sequence encoding the amino acid sequence shown in SEQ ID NO: 5 or a variant sequence having at least 90%, 95% or 99% sequence identity thereto.

3. The method of claim 2, wherein the gene editing is performed by one or more sequence-specific nucleases selected from the group consisting of CRISPR/Cas9, CRISPR/Cpf1, TALEN, meganuclease and ZFN; preferably the gene editing is performed by CRISPR/Cas9; optionally the CRISPR/Cas9 comprises Cas9, wherein the Cas9 comprises SEQ ID NO: 1 or a variant sequence having at least 90%, 95% or 99% sequence identity thereto; optionally the CRISPR/Cas9 comprises a guide RNA (sgRNA), wherein the sgRNA targets exon 4 of the OsCENH3 gene; optionally the sgRNA targets the nucleotide sequence shown in SEQ ID NO: 6; optionally the sgRNA comprises the nucleotide sequence shown in SEQ ID NO: 7.

4. The method of any one of claims 1-3, wherein the modified OsCENH3 gene comprises or consists of: the nucleotide sequence shown in any one of SEQ ID NOs: 10, 12, 14, 16, 18 and 20 or a variant sequence having at least 90%, 95% or 99% sequence identity thereto, or a nucleotide sequence encoding the amino acid sequence shown in any one of SEQ ID NOs: 11, 13, 15, 17, 19 and 21 or a variant sequence having at least 90%, 95% or 99% sequence identity thereto.

5. A method of inducing haploids in a plant comprising crossing a haploid inducer line created by the method of any one of claims 1-4 as male or female parent with another female or male parent, thereby producing haploid progeny, preferably the plant is rice, preferably indica or japonica rice, optionally the other female or male parent is selected from indica male sterile line X17 and BAD2 homozygous mutants.

6. Use of a haploid inducer line created by the method of any of claims 1-4 for breeding or gene editing.

7. Use of a haploid inducer line created by the method of any one of claims 1-4 to produce haploids.

8. Use of a haploid inducer line created by the method of any one of claims 1-4 to generate a new haploid inducer line, wherein the modified CENH3 gene in the haploid inducer line is transferred to other plant lines by means of cross-breeding, thereby generating a new haploid inducer line.

9. A mutant OsCENH3 gene of rice, which comprises or consists of: the nucleotide sequence shown in any one of SEQ ID NOs: 10, 12, 14, 16, 18 and 20 or a variant sequence having at least 90%, 95% or 99% sequence identity thereto, or a nucleotide sequence encoding the amino acid sequence shown in any one of SEQ ID NOs: 11, 13, 15, 17, 19 and 21 or a variant sequence having at least 90%, 95% or 99% sequence identity thereto.

10. An sgRNA capable of targeting the OsCENH3 gene of rice, comprising or consisting of the nucleotide sequence shown in SEQ ID NO: 7.

Technical Field

The present application relates to the field of biotechnology. In particular, the application relates to a method for creating a plant haploid inducer line and application thereof, in particular to a method for creating a plant haploid inducer line by gene editing of a CENH3 gene, so as to be applied to plant breeding.

Background

Doubled haploid breeding is a breeding technique in which haploid plants, produced spontaneously or by artificial induction, are chromosome-doubled to obtain homozygous diploids from which target lines are bred. Compared with conventional breeding, it shortens the breeding cycle and improves breeding efficiency. The haploid inducer Stock6, obtained from a natural maize mutant, can induce maternal haploids when used as the male parent, and a series of inducer lines derived from Stock6 have been used successfully in commercial breeding. At present, haploid breeding in rice relies mainly on in vitro pollen culture or distant hybridization, which are limited by induction efficiency, genotype and other factors, so their application in breeding is very limited. A haploid inducer line serving as the male parent, obtained by editing the OsMATL gene with gene editing technology, has been reported (Liujuntao et al., 2018), but such haploid inducer lines still have a number of shortcomings.

CENH3 is widely present in eukaryotes. The CENH3 gene encodes centromere-specific histone 3, an important component of the centromere that plays an important role in mitosis and meiosis. In plants, the CENH3 gene was first cloned from Arabidopsis thaliana and subsequently from rice, sugarcane, tobacco, barley, maize, carrot and other species. Comparison of the CENH3 amino acid sequences of different species shows that the amino-terminal tail of CENH3 is highly variable, whereas the carboxy-terminal histone fold domain (HFD) is relatively conserved (Britt and Kuppu, 2016). Loop 1 and the α2 helix of the HFD are responsible for targeting CENH3 to the centromere and are therefore termed the CENP-A targeting domain (CATD). Post-translational modification of CENH3 typically occurs on its amino-terminal tail (Bailey et al., 2013). Both the N-terminal tail domain and the HFD are important for the normal function of CENH3.

In 2010, in a complementation experiment on the Arabidopsis thaliana cenh3 mutant, researchers fused GFP to CENH3; when expressed in the cenh3 mutant, the fusion protein efficiently complemented its sterile phenotype. Fusing GFP, the N-terminal tail of histone H3.3 and the HFD of CENH3 gave the GFP-tailswap protein, which also rescued the embryo-lethal phenotype of the cenh3 mutant; the GFP-tailswap line is highly male sterile, and its selfed progeny are normal diploids. When it was crossed with wild-type plants as the male parent, 34% of the progeny were haploid (Ravi and Chan, 2010). Maheshwari et al. introduced GFP-CENH3 and CENH3 from grape and maize into Arabidopsis cenh3 mutants, and all of these versions of CENH3 complemented the mutant phenotype. Swapping the N-terminal tail or the HFD between CENH3 proteins from two different sources produced non-functional or poorly functioning alleles (Maheshwari et al., 2015).

Leucine (Leu, L) at position 92 of barley HvβCENH3 was mutated to phenylalanine (Phe, F) by TILLING (Targeting Induced Local Lesions IN Genomes), so that CENH3 could no longer be loaded onto the centromere. Mutating the corresponding L to F in beet CENH3 and in AtCENH3 gave similar results. AtCENH3(L130F) restored normal fertility to the cenh3 mutant and gave a haploid induction efficiency of 4.8% when crossed as the female parent with wild-type plants (Karimi-Ashtiyani et al., 2015). Kuppu et al. used TILLING to obtain point mutations at conserved amino acid positions in the HFD of Arabidopsis CENH3; the amino acid changes P82S, A86V, G83E, A132T and A136T all complemented the sterile phenotype of the cenh3 mutant, the plants produced normal progeny on selfing, and crosses with wild-type lines gave haploid progeny at varying efficiencies (Kuppu et al., 2015). In addition, modified CENH3 variants were introduced into maize CENH3 RNAi and knock-out lines as recipients to obtain haploid inducer lines. The haploid induction efficiency of the inducer lines in the RNAi background was only 0.16%, whereas that of the inducer lines in the knock-out background reached 3.6%. This was also the first published report of haploid inducer lines obtained by modifying CENH3 in a crop plant (Kelliher et al., 2016).

Disclosure of Invention

The inventors have, for the first time, successfully created a plant haploid inducer line, for example a rice haploid inducer line, by editing the CENH3 gene, in particular the OsCENH3 gene, using gene editing technology, in particular CRISPR/Cas9, and have achieved a good haploid induction effect.

In a first aspect, the present application provides a method of creating a haploid inducer line of a plant, comprising modifying the CENH3 gene in the plant such that its function is partially lost.

In some embodiments, the modification is by gene editing.

In some embodiments, the plant is rice and the CENH3 gene is the OsCENH3 gene.

In some embodiments, the gene editing is performed by one or more sequence-specific nucleases selected from the group consisting of CRISPR/Cas9, CRISPR/Cpf1, TALEN, meganuclease and ZFN.

In some embodiments, the gene editing is performed by CRISPR/Cas9. The CRISPR/Cas9 comprises Cas9; preferably, the Cas9 comprises SEQ ID NO: 1 or a variant sequence having at least 90%, 95% or 99% sequence identity thereto. The CRISPR/Cas9 comprises a guide RNA (sgRNA); preferably, the sgRNA targets exon 4 of the OsCENH3 gene.

In some embodiments, the modified OsCENH3 gene comprises or consists of: the nucleotide sequence shown in any one of SEQ ID NOs: 10, 12, 14, 16, 18 and 20 or a variant sequence having at least 90%, 95% or 99% sequence identity thereto, or a nucleotide sequence encoding the amino acid sequence shown in any one of SEQ ID NOs: 11, 13, 15, 17, 19 and 21 or a variant sequence having at least 90%, 95% or 99% sequence identity thereto.

In a second aspect, the present application provides a method of inducing haploids in a plant comprising crossing a haploid inducer line created by the method of the first aspect as a male or female parent with another female or male parent, thereby producing a haploid progeny.

In a preferred embodiment, the plant is rice.

In a third aspect, the present application provides use of a haploid inducer line created by the method of the first aspect for breeding or gene editing.

In a fourth aspect, the present application provides the use of a haploid inducer line created by the method of the first aspect for generating haploids.

In a fifth aspect, the present application provides the use of a haploid inducer line created by the method of the first aspect for generating a new haploid inducer line, wherein the modified CENH3 gene in the haploid inducer line is transferred to other plant lines by cross-breeding, thereby generating a new haploid inducer line.

In a sixth aspect, the present application provides a haploid inducer line created by the method of the first aspect. In a preferred embodiment, the haploid inducer line is a rice haploid inducer line.

In a seventh aspect, the present application provides seeds produced by the haploid inducer line of the sixth aspect. In a preferred embodiment, the modified CENH3 gene in the seed is homozygous.

In an eighth aspect, the present application provides a haploid plant produced by the method of the second aspect. In a preferred embodiment, the haploid plant is haploid rice.

In a ninth aspect, the present application provides seeds produced by the haploid plant of the eighth aspect.

In a tenth aspect, the present application provides a rice mutant OsCENH3 gene comprising or consisting of: the nucleotide sequence shown in any one of SEQ ID NOs: 10, 12, 14, 16, 18 and 20 or a variant sequence having at least 90%, 95% or 99% sequence identity thereto, or a nucleotide sequence encoding the amino acid sequence shown in any one of SEQ ID NOs: 11, 13, 15, 17, 19 and 21 or a variant sequence having at least 90%, 95% or 99% sequence identity thereto.

In an eleventh aspect, the present application provides an sgRNA capable of targeting the rice OsCENH3 gene, comprising or consisting of the nucleotide sequence shown in SEQ ID NO: 7.

Drawings

FIG. 1 shows a map of the editing vector pZZT000431 constructed in example 1 of the present application.

FIG. 2 shows potassium iodide staining of pollen in Example 4 of the present application: left, pollen of an E1-generation edited rice plant; right, pollen of a wild-type plant.

FIG. 3 shows the seed-setting rate statistics of Example 4 of the present application. The abscissa shows, from left to right: -3, an E1-generation edited rice plant whose OsCENH3 gene is shown in SEQ ID NO: 10; -6, an E1-generation edited rice plant whose OsCENH3 gene is shown in SEQ ID NO: 12; 12, an E1-generation edited rice plant whose OsCENH3 gene is shown in SEQ ID NO: 14; -18, an E1-generation edited rice plant whose OsCENH3 gene is shown in SEQ ID NO: 16; -24, an E1-generation edited rice plant whose OsCENH3 gene is shown in SEQ ID NO: 18; -27, an E1-generation edited rice plant whose OsCENH3 gene is shown in SEQ ID NO: 20; CK, an unedited control plant of the same variety.

FIG. 4 shows haploids and diploids in example 6 of the present application, where A shows the phenotype of the haploid and diploid plants and B shows the flow cytometry results of the haploid and diploid plants.

FIG. 5 shows the haploid induction rate statistics of Example 6 of the present application.

Detailed Description

The following definitions and methods are provided to better define the present application and to guide those of ordinary skill in the art in the practice of the present application. Unless otherwise defined, the terms of the present application are to be understood according to the conventional usage of those of ordinary skill in the relevant art.

Definitions

The term "gene editing" or "genome editing" as used herein refers to an emerging and relatively precise genetic engineering technique for modifying a specific target gene in the genome of an organism. Gene editing allows a target gene to be "edited" at a defined site, so that a specific DNA fragment is modified. It relies on engineered nucleases, also known as "molecular scissors", to generate site-specific double-strand breaks (DSBs) at specific locations in the genome; these induce the organism to repair the DSBs by non-homologous end joining (NHEJ) or homologous recombination (HR), and because the repair process is error-prone, targeted mutations result.

The term "CRISPR/Cas9" as used herein refers to an endonuclease system that uses an RNA guide strand to target the site of endonuclease cleavage. See Jinek et al., Science 337: 816-821 (2012); Cong et al., Science (January 3, 2013); and Mali et al., Science (January 3, 2013).

The term "CRISPR/Cpf1" as used herein refers to a newer class of CRISPR-Cas system. The CRISPR/Cas9 system, which is currently widely used, often suffers from off-target effects. As a potential competitor of the CRISPR/Cas9 system, the CRISPR/Cpf1 system serves as a new gene editing tool that further expands the range of selectable target sites and has almost no off-target effect. Cpf1 is a CRISPR effector protein with many properties that differ from those of Cas9, which helps to overcome some limitations of CRISPR/Cas9 applications.

The term "transcription activator-like effector nuclease", "TAL effector nuclease" or "TALEN" as used herein refers to a class of artificial restriction endonucleases generated by fusing a TAL effector DNA-binding domain to a DNA cleavage domain.

The term "zinc finger nuclease (ZFN)" as used herein refers to a nuclease consisting of a DNA recognition domain and a DNA cleavage domain. The DNA recognition domain is a tandem array of 3-4 zinc fingers (ZFs); each ZF contains about 30 amino acids, is stabilized by one zinc ion, and recognizes and binds one specific base triplet. The DNA cleavage domain consists of the 96 amino acid residues at the carboxyl terminus of the non-specific endonuclease FokI. Each FokI monomer is linked to one zinc finger protein (ZFP) to form one ZFN that recognizes a specific site. When two recognition sites lie at a suitable distance (6-8 bp), the two ZFN monomers interact to become catalytically active and produce a double-strand break, thereby mediating site-specific DNA cleavage.

The term "meganuclease" as used herein refers to a homing endonuclease capable of recognizing a nucleic acid sequence 14-40 bases long. Some meganucleases tolerate small differences in the homing site sequence, yet the large recognition region still ensures high specificity, so these enzymes cause little non-specific cleavage in the genome and have low toxicity. Meganucleases are encoded by open reading frames within the mobile sequences of self-splicing RNA introns or self-splicing inteins.

The term "percent identity" as used herein, in the case of two or more nucleic acids, refers to two or more sequences or subsequences that are the same or have a specified percentage of nucleotides that are the same (i.e., about 60% identity, preferably 65%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or more identity over a specified region when compared and aligned for maximum correspondence over a comparison window or specified region), as measured using the BLAST or BLAST 2.0 sequence comparison algorithm at default parameters or by manual alignment and visual inspection (see, e.g., the NCBI website, etc.).
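As a rough illustration of the percent identity calculation, the sketch below counts identical positions between two sequences that have already been aligned. The function name `percent_identity`, the gap character and the toy sequences are assumptions made for illustration only; the sketch does not reproduce BLAST, which also performs the alignment itself.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity over the aligned columns of two pre-aligned sequences.

    Columns where both sequences have a gap are ignored; a gap opposite a
    residue counts as a mismatch.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    columns = [
        (a, b)
        for a, b in zip(aligned_a.upper(), aligned_b.upper())
        if not (a == gap and b == gap)
    ]
    if not columns:
        raise ValueError("no aligned columns to compare")
    matches = sum(1 for a, b in columns if a == b and a != gap)
    return 100.0 * matches / len(columns)


# Hypothetical toy alignment (not a sequence from this application).
identity = percent_identity("ACGTACGTAC", "ACGTACGAAC")
meets_90 = identity >= 90.0
```

With a helper of this kind, a candidate variant sequence could be checked against a threshold such as 90%, 95% or 99% once it has been aligned to the reference sequence.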

Detailed Description

The haploid inducer line produced by the method for creating a plant haploid inducer line is used as the male or female parent and crossed with other inbred lines; the seeds produced on the female parent include diploid seeds and haploid seeds containing only the chromosomes of the female parent. Haploid seeds can be distinguished from diploid seeds by cytological examination, phenotypic analysis or molecular marker analysis.

The haploid inducer line is of great value in breeding: it speeds up the attainment of homozygosity, shortens breeding time and improves breeding efficiency. The haploid inducer line is a mutant produced by editing an endogenous gene and carries no transgenic components, which makes field planting and management convenient and reduces breeding costs.

In some embodiments, the modification results in a deletion or insertion mutation of the CENH3 gene.

In some embodiments, the modification is by gene editing.

In some embodiments, the plant is rice, including but not limited to indica and japonica rice, and the CENH3 gene is the OsCENH3 gene.

In some embodiments, the OsCENH3 gene comprises or consists of: the nucleotide sequence shown in SEQ ID NO: 4 or a variant sequence having at least 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, 99.5% or 99.9% sequence identity thereto, or a nucleotide sequence encoding the amino acid sequence shown in SEQ ID NO: 5 or a variant sequence having at least 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, 99.5% or 99.9% sequence identity thereto.

As used herein, "variant sequence" refers to a sequence that shares a certain degree of sequence identity with the original sequence and retains the function of the original sequence. For example, an OsCENH3 gene comprising a variant sequence herein may come from another rice line but still encode a functional OsCENH3 protein.

In some embodiments, the gene editing is performed by one or more of the sequence-specific nucleases selected from the group consisting of: CRISPR/Cas9, CRISPR/Cpf1, TALEN, meganuclease, and ZFN.

In a preferred embodiment, gene editing is performed by CRISPR/Cas9. As known in the art, CRISPR/Cas9 comprises Cas9 and a guide RNA (sgRNA) that directs the Cas9 endonuclease to specifically recognize and modify the target sequence targeted by the sgRNA. In the present application, Cas9 may comprise SEQ ID NO: 1 or a variant sequence having at least 90%, 95% or 99% sequence identity thereto. The sgRNA may target exon 4 of the OsCENH3 gene. Preferably, the sgRNA targets the nucleotide sequence shown in SEQ ID NO: 6.

In some embodiments, the sgRNA comprises the nucleotide sequence set forth in SEQ ID No. 7.

In some embodiments, the modified OsCENH3 gene comprises or consists of: the nucleotide sequence shown in any one of SEQ ID NOs: 10, 12, 14, 16, 18 and 20 or a variant sequence having at least 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, 99.5% or 99.9% sequence identity thereto, or a nucleotide sequence encoding the amino acid sequence shown in any one of SEQ ID NOs: 11, 13, 15, 17, 19 and 21 or a variant sequence having at least 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, 99.5% or 99.9% sequence identity thereto.

In some embodiments, the plant is rice. The rice may include, but is not limited to, indica or japonica rice.

In some embodiments, the other female or male parent may include, but is not limited to, the indica male sterile line X17 and a BAD2 homozygous mutant in the background of the japonica rice inbred line KY131.

In a third aspect, the present application provides use of a haploid inducer line created by the method of the first aspect for breeding or gene editing.

In a fourth aspect, the present application provides the use of a haploid inducer line created by the method of the first aspect for generating haploids.

In a seventh aspect, the present application provides seeds produced by the haploid inducer line of the sixth aspect. In a preferred embodiment, the modified CENH3 gene in the seed is homozygous.

In an eighth aspect, the present application provides a haploid plant produced by the method of the second aspect. In a preferred embodiment, the haploid plant is haploid rice.

In a ninth aspect, the present application provides seeds produced by the haploid plant of the eighth aspect.

In a tenth aspect, the present application provides a rice mutant OsCENH3 gene comprising or consisting of: the nucleotide sequence shown in any one of SEQ ID NOs: 10, 12, 14, 16, 18 and 20 or a variant sequence having at least 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, 99.5% or 99.9% sequence identity thereto, or a nucleotide sequence encoding the amino acid sequence shown in any one of SEQ ID NOs: 11, 13, 15, 17, 19 and 21 or a variant sequence having at least 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, 99.5% or 99.9% sequence identity thereto.

In an eleventh aspect, the present application provides an sgRNA capable of targeting the rice OsCENH3 gene, comprising or consisting of the nucleotide sequence shown in SEQ ID NO: 7.

In this specification and claims, the words "comprise," "comprising," and "contain" mean "including but not limited to," and are not intended to exclude other moieties, additives, components, or steps.

It should be understood that features, characteristics, components or steps described in a particular aspect, embodiment or example of the present invention may be applied to any other aspect, embodiment or example described herein unless incompatible therewith.

Examples

The following examples are intended to illustrate the present application but are not intended to limit the scope of the present application. Modifications or substitutions to methods, steps or conditions of the present application are intended to be included within the scope of the present application without departing from the spirit and scope of the present application.

Unless otherwise indicated, the examples follow conventional experimental conditions, such as those in Sambrook et al., Molecular Cloning: A Laboratory Manual, 2001, or the conditions suggested in the manufacturer's instructions.

Unless otherwise specified, the chemical reagents used in the examples are all conventional commercially available reagents, and the technical means used in the examples are conventional means well known to those skilled in the art.

Plant materials used in the following examples include the japonica rice inbred line KY131 and the indica rice male sterile line X17, both provided by China National Seed Group Co., Ltd.

The primers used in the following examples were designed using the Primer3 web service of the European infrastructure for biological information (ELIXIR) and synthesized by Sangon Biotech (Shanghai).

Example 1 construction of CRISPR/Cas9 editing vector

General methods for constructing and identifying vectors are well known to those skilled in the art. Specifically, in the CRISPR/Cas9 vector, a nuclear localization signal (OsHAT-NLS) is fused to the carboxyl terminus of a modified Cas9, the coding sequence is codon-optimized for expression in rice, and the sugarcane ubiquitin promoter SspUbi4 drives expression of Cas9. The nucleotide sequence of the modified and optimized Cas9 is shown in SEQ ID NO: 1. The OsCENH3 gene sequence of the japonica rice inbred line KY131 was obtained from in-house genome sequencing data and verified by cloning and re-sequencing. The editing site in the OsCENH3 gene was chosen so that base insertions or deletions arise at the junction between the amino-terminal and carboxy-terminal regions of the OsCENH3 protein (exon 4), giving 3n indel mutants (indels whose length is a multiple of 3) in which function is not completely inactivated.

The 20 nucleotides immediately upstream of an NGG (where N is any nucleotide) were selected, and the 12-nucleotide core sequence (the 12 nucleotides adjacent to NGG) together with the NGG, 15 nucleotides in total, was analyzed by BLAST against the NCBI database, taking into account several rice genomes with available genome information, including indica and japonica rice, to ensure that there is only one exact match in the rice genome. Based on this sequence information, a guide RNA (sgRNA) (SEQ ID NO: 7) was designed; the sgRNA is a single molecule in which the nucleotide sequence of the guide RNA (gRNA) is joined to that of the scaffold RNA (gRNA scaffold), and its expression is driven by the rice U3 promoter. The complete sgRNA sequence was synthesized in full by GenScript (www.genscript.com). The target site sequence finally selected for the gRNA is as follows:

ACCGCTTCCGTCCAGGCACAG (SEQ ID NO:6, targeting exon 4).
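The target selection step described above (20 nucleotides upstream of an NGG, then a uniqueness check on the 12-nucleotide core plus NGG) can be sketched roughly as follows. This is only an illustration under stated assumptions: the function names `find_ngg_targets` and `seed_plus_pam` and the toy input sequence are hypothetical, only the strand passed in is scanned, and the genome-wide BLAST check is not performed here.

```python
import re


def find_ngg_targets(seq: str, guide_len: int = 20):
    """Return (start, protospacer, pam) for each NGG PAM on the given strand.

    A fuller design would also scan the reverse complement and then check
    each candidate for uniqueness in the genome (for example by BLAST).
    """
    seq = seq.upper()
    hits = []
    # A lookahead is used so that overlapping PAMs are all reported.
    for m in re.finditer(r"(?=([ACGT]GG))", seq):
        pam_start = m.start()
        if pam_start >= guide_len:
            protospacer = seq[pam_start - guide_len:pam_start]
            hits.append((pam_start - guide_len, protospacer, m.group(1)))
    return hits


def seed_plus_pam(protospacer: str, pam: str, seed_len: int = 12) -> str:
    """The 12-nt core next to the PAM plus the 3-nt PAM (15 nt in total)."""
    return protospacer[-seed_len:] + pam


# Hypothetical toy sequence (not taken from the OsCENH3 gene).
toy = "TTTT" + "ACGTACGTACGTACGTACGT" + "AGG" + "TTTT"
candidates = find_ngg_targets(toy)
start, protospacer, pam = candidates[0]
query_15nt = seed_plus_pam(protospacer, pam)
```

The 15-nucleotide string returned by `seed_plus_pam` is what would be submitted to the uniqueness search; candidates with more than one exact match would be discarded.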

A glyphosate resistance gene (CP4) driven by the cauliflower mosaic virus promoter CaMV 35S is used as the transgene selection marker. The editing vector thus constructed was designated pZZT000431, and its map is shown in FIG. 1.

Example 2 Genetic transformation of the japonica rice inbred line KY131

The constructed editing vector pZZT000431 was transformed into Agrobacterium EHA105 (a gift from Huazhong Agricultural University) and verified by sequencing. Embryogenic callus was induced from seeds of the rice inbred line KY131 and incubated with Agrobacterium EHA105 carrying the editing vector pZZT000431, and genetic transformation was completed through a series of steps including selection and regeneration. The procedures for Agrobacterium infection of rice callus and for selection and differentiation follow Nishimura et al. (A protocol for Agrobacterium-mediated transformation in rice. Nature Protocols, 2006, 6: 2796-). Specifically, rice callus was co-cultured with Agrobacterium for 3 days, soaked for 30 min in ddH₂O containing 250 mg/L carbenicillin to remove excess Agrobacterium, and placed on selection medium containing 250 mg/L carbenicillin for 3 rounds of selection; resistant callus was then transferred to regeneration medium and cultured for 4 weeks, and regenerated plantlets were transferred to rooting medium and cultured for a further 2 weeks.

Rice callus and plant regeneration conditions: callus induction for 30 days, callus subculture for 14 days and selection culture for 42 days, all in a dark room; regeneration culture for 28 days and rooting culture for 14 days, in a light room at 26 °C with 16 hours of light and 8 hours of darkness.

Example 3 Screening and identification of OsCENH3 gene-edited rice E0 transformation events

The regenerated plants from Example 2 were planted in a greenhouse, leaves were taken from the regenerated E0 seedlings, and plant genomic DNA was extracted by the CTAB method. The DNA samples were tested for copy number by fluorescent quantitative PCR, and the base changes at the mutation site were examined both by a simple sequence repeat (SSR) marker detection method (see Simple Sequence Repeat Markers in Genetic diversity and Marker-Assisted Selection of Rice primers: A Review. Critical Reviews in Food Science and Nutrition, 2015, 55: 41-49) and by sequencing.

Specifically, the selection marker gene CP4 on the editing vector pZZT000431 was chosen as the copy number detection gene, amplification primers for this gene (SEQ ID NOs: 2 and 3) giving a 200 bp product were designed, amplification and fluorescence detection were performed on a fluorescent quantitative PCR instrument, and the copy number of each plant was estimated from its CT value (results not shown). In the SSR detection experiment, a primer pair (SEQ ID NOs: 8 and 9, where the 5' end of SEQ ID NO: 8 is labeled with the fluorescent dye carboxyfluorescein (FAM)) giving an amplicon of 350-500 bp was designed upstream and downstream of the target site so that the amplified region spans the cleavage site. Amplification was performed with high-fidelity Q5 DNA polymerase (NEB); the amplified fragment was diluted 150-fold, 1 µl of the diluted product was mixed with 9 µl of HiDi reagent, and the fragment length was determined on an ABI 3730. The number of bases inserted or deleted in an edited sample was calculated from its fragment length relative to that of an unedited control sample. For sequencing, a primer pair giving an amplicon of about 500 bp was designed upstream and downstream of the target site, with each primer more than 100 bp away from the editing site (SEQ ID NOs: 8 and 9); the fragment was amplified with high-fidelity Q5 DNA polymerase and then sequenced on an ABI 3730.
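The fragment-length comparison described above amounts to a simple subtraction followed by a reading-frame check. The following minimal sketch illustrates it; the function names and the example fragment lengths are hypothetical, and it assumes the fragment sizes have already been rounded to whole base pairs.

```python
def indel_size(sample_len_bp: int, control_len_bp: int) -> int:
    """Inserted (+) or deleted (-) bases relative to the unedited control."""
    return sample_len_bp - control_len_bp


def is_in_frame(indel_bp: int) -> bool:
    """True for a non-zero indel whose length is a multiple of 3 (a 3n indel)."""
    return indel_bp != 0 and indel_bp % 3 == 0


def classify(sample_len_bp: int, control_len_bp: int) -> str:
    """Label a sample by comparing its fragment length with the control."""
    size = indel_size(sample_len_bp, control_len_bp)
    if size == 0:
        return "unedited"
    return "in-frame (3n) indel" if is_in_frame(size) else "frameshift indel"


# Hypothetical fragment lengths in bp (illustrative values only).
control = 420
calls = {name: classify(length, control)
         for name, length in {"plant_a": 414, "plant_b": 419, "plant_c": 420}.items()}
```

Fragment length alone cannot distinguish an unedited allele from a base substitution, or resolve which bases changed, which is why the sequencing step is used alongside it.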

Plants with a low copy number and a mutation on at least one chromosome were selected as candidate edited plants and were further grown and propagated in the greenhouse.

Primer pair for amplification of the target site:

CENH3 target4-F: CGCTTCCGTCCAGGCACAG (SEQ ID NO: 8)

CENH3 target4-R: GAATAGAAATCAGTGATCTCCC (SEQ ID NO: 9)

Primer pair for CP4 copy number detection:

F: CAGCACAGGTTAAGTCTG (SEQ ID NO: 2)

R: GTCTGTCTCAACGGTAAG (SEQ ID NO: 3)

Example 4 Genotypic and phenotypic identification of E1-generation edited rice plants

Complete inactivation of the OsCENH3 gene causes plant sterility, so only mutants in which the function is partially altered can set seed. Seeds of fertile E0 plants were collected and planted, and GMO-negative plants homozygous for the edit were selected from the segregating E1 generation and grown. GMO detection followed the detection standards for transgenic products, Chinese national standard GB/T 19495-2004 and Ministry of Agriculture Announcement No. 953-6-2007. Sequencing showed that the number of bases deleted at the target site was a multiple of 3 in every case; the resulting nucleotide and amino acid changes in the OsCENH3 gene are shown in SEQ ID NO: 10-21.

The basic agronomic traits of the E1 plants, including plant height, tiller number, leaf length and leaf width, were similar to those of wild-type plants.

Pollen viability was assayed by potassium iodide staining. Spikelets were collected from the upper, middle and lower parts of the rice panicle, the pollen was shaken onto a glass slide and stained with potassium iodide staining solution, the staining was observed under a microscope, and the proportion of stained pollen was counted. The counts showed that the pollen viability of the E1 3n-indel edited plants was about 10% lower than that of wild-type plants (wild-type rice KY131). For the seed-setting rate, the main tillers of E1 plants were selected, the numbers of filled seeds and empty husks on each main tiller were counted, and the seed-setting rate was calculated using the following formula:

Seed-setting rate = number of filled seeds / (number of filled seeds + number of empty husks) × 100%.

Comparing the seed-setting rates of E1 plants carrying different edit types showed that the seed-setting rate of the 3n-indel mutants was 10-20% lower than that of wild-type plants (FIG. 4).
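The seed-setting rate formula given above can be written as a short Python function. This is only an illustrative sketch: the function name and the example counts (90 filled seeds, 30 empty husks) are hypothetical and are not data from this application.

```python
def seed_setting_rate(filled_seeds, empty_husks):
    """Seed-setting rate in percent for one main tiller.

    rate = filled / (filled + empty) * 100
    """
    total = filled_seeds + empty_husks
    if total == 0:
        raise ValueError("no grains were counted on this tiller")
    return 100.0 * filled_seeds / total


if __name__ == "__main__":
    # Hypothetical counts for one main tiller.
    rate = seed_setting_rate(filled_seeds=90, empty_husks=30)
    print("seed-setting rate: {:.1f}%".format(rate))
```

Guarding against a zero total avoids a division error for tillers on which no grains were scored.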

Example 5 Crossing of rice haploid inducer lines with different lines

Seeds of the E1 edited plants were planted; GMO, SSR and sequencing detection were carried out at the seedling stage, and homozygous mutants (3n indels) free of transgenic components were retained as crossing parents. Different batches of the indica rice male sterile line X17 were sown, and the batch suitable for crossing was selected for the experiment. Specifically, when the mutant was used as the male parent, the female parent was the indica rice male sterile line X17; the day before the experiment, the spikelets that had already flowered on the female parent were cut off and the panicle was bagged. At about 10 o'clock on the day of the experiment, when the mutant was flowering vigorously, the panicle of the male parent was brought close to the female parent and pollen was shaken onto the stigmas of the female parent; the panicle was bagged, the pollination time and the identities of the male and female parents were marked, and the seeds were harvested at maturity. When the mutant was used as the female parent, the BAD2 homozygous mutant was used as the male parent. The day before the experiment, the spikelets that had already flowered on the female parent were removed, and the upper part of each unopened floret was cut away until the anthers were just cut. Clean water was then sprayed continuously from a sprayer until all the anthers were immersed; after 20 min the water was flicked off, the anthers were picked out with tweezers, and water was sprayed once more so that any anther remnants left in the glumes were completely inactivated, after which the panicle was bagged. At about 10 o'clock on the day of the experiment, the panicle of the male parent was brought close to the emasculated female parent and pollen was shaken onto its stigmas; the panicle was bagged, the pollination time and parent identities were marked, and the seeds were harvested at maturity.

Example 6 Detection of the haploid induction effect of the rice haploid inducer line

All the seeds obtained in Example 5 were sown, ungerminated seeds were removed, and all surviving plants were sampled for detection. First, the numbers of hybrid and selfed seeds were determined by SSR, and ploidy was then determined with a flow cytometer. Specifically, the seeds that emerged were taken as the total number and all seedlings were subjected to SSR detection. After excluding true hybrids (heterozygous SSR results) and selfed progeny (SSR homozygous and consistent with the female parent genotype), the remaining plants (preliminarily considered haploids, or doubled haploids arising from haploid doubling) were subjected to flow cytometric ploidy detection. The phenotypes and flow cytometry results of the haploid and diploid plants are shown in FIG. 4, A and B, respectively. The results show that the haploid induction rate of the CENH3(-3) mutant is 6.7% (FIG. 5).
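The two-step screening described above (SSR first, then flow cytometry on the remaining plants) can be sketched in Python as follows. The category labels, function names and the ten example seedlings are hypothetical. The sketch also assumes that the induction rate is the number of confirmed haploids divided by the number of emerged seedlings; the text states only that emerged seeds are taken as the total number and does not say whether doubled haploids are counted in the numerator.

```python
from collections import Counter


def classify(ssr_call, ploidy=None):
    """Assign one seedling to a class from its SSR call and, if needed, ploidy.

    ssr_call: "heterozygous", "maternal_homozygous" or "other" (hypothetical labels)
    ploidy:   "1n", "2n" or None when flow cytometry was not run
    """
    if ssr_call == "heterozygous":
        return "hybrid"            # true hybrid, excluded from ploidy testing
    if ssr_call == "maternal_homozygous":
        return "selfed"            # consistent with the female parent genotype
    if ploidy == "1n":
        return "haploid"
    if ploidy == "2n":
        return "doubled haploid"
    return "untested candidate"


def haploid_induction_rate(seedlings):
    """Return (rate in percent, per-class counts) over all emerged seedlings."""
    counts = Counter(classify(ssr, ploidy) for ssr, ploidy in seedlings)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no emerged seedlings")
    return 100.0 * counts["haploid"] / total, counts


if __name__ == "__main__":
    # Hypothetical progeny: 8 hybrids, 1 selfed plant, 1 haploid.
    progeny = [("heterozygous", None)] * 8
    progeny += [("maternal_homozygous", None), ("other", "1n")]
    rate, counts = haploid_induction_rate(progeny)
    print(dict(counts), "induction rate: {:.1f}%".format(rate))
```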

Although the technical solutions of the present application have been described in detail above with general description and specific embodiments, it will be apparent to those skilled in the art that some modifications or improvements may be made on the basis of the technical solutions. Accordingly, such modifications and improvements are intended to be within the scope of this invention as claimed.

The sequences used in this application are as follows:

SEQ ID NO:1

nucleotide sequence of Cas9 expressed in rice

ATGCCGAAGAAGCGCCGCCGCGTGGACAAGAAGTACTCCATCGGCCTCGACAT CGGCACCAACTCCGTGGGCTGGGCCGTGATCACCGACGAGTACAAGGTGCCGT CCAAGAAGTTCAAGGTGCTCGGCAACACCGACCGCCACTCCATCAAGAAGAA CCTCATCGGCGCCCTCCTCTTCGACTCCGGCGAGACCGCCGAGGCCACCCGCC TCAAGCGCACCGCCCGCCGCCGCTACACCCGCCGCAAGAACCGCATCTGCTAC CTCCAGGAGATCTTCTCCAACGAGATGGCCAAGGTGGACGACTCCTTCTTCCA CCGCCTCGAGGAGTCCTTCCTCGTGGAGGAGGACAAGAAGCACGAGCGCCAC CCGATCTTCGGCAACATCGTGGACGAGGTGGCCTACCACGAGAAGTACCCGAC CATCTACCACCTCCGCAAGAAGCTCGTGGACTCCACCGACAAGGCCGACCTCC GCCTCATCTACCTCGCCCTCGCCCACATGATCAAGTTCCGCGGCCACTTCCTCA TCGAGGGCGACCTCAACCCGGACAACTCCGACGTGGACAAGCTCTTCATCCAG CTCGTGCAGACCTACAACCAGCTCTTCGAGGAGAACCCGATCAACGCCTCCGG CGTGGACGCCAAGGCCATCCTCTCCGCCCGCCTCTCCAAGTCCCGCCGCCTCG AGAACCTCATCGCCCAGCTCCCGGGCGAGAAGAAGAACGGCCTCTTCGGCAAC CTCATCGCCCTCTCCCTCGGCCTCACCCCGAACTTCAAGTCCAACTTCGACCTC GCCGAGGACGCCAAGCTCCAGCTCTCCAAGGACACCTACGACGACGACCTCGA CAACCTCCTCGCCCAGATCGGCGACCAGTACGCCGACCTCTTCCTCGCCGCCA AGAACCTCTCCGACGCCATCCTCCTCTCCGACATCCTCCGCGTGAACACCGAG ATCACCAAGGCCCCGCTCTCCGCCTCCATGATCAAGCGCTACGACGAGCACCA CCAGGACCTCACCCTCCTCAAGGCCCTCGTGCGCCAGCAGCTCCCGGAGAAGT ACAAGGAGATCTTCTTCGACCAGTCCAAGAACGGCTACGCCGGCTACATCGAC GGCGGCGCCTCCCAGGAGGAGTTCTACAAGTTCATCAAGCCGATCCTCGAGAA GATGGACGGCACCGAGGAGCTCCTCGTGAAGCTCAACCGCGAGGACCTCCTCC GCAAGCAGCGCACCTTCGACAACGGCTCCATCCCGCACCAGATCCACCTCGGC GAGCTCCACGCCATCCTCCGCCGCCAGGAGGACTTCTACCCGTTCCTCAAGGA CAACCGCGAGAAGATCGAGAAGATCCTCACCTTCCGCATCCCGTACTACGTGG GCCCGCTCGCCCGCGGCAACTCCCGCTTCGCCTGGATGACCCGCAAGTCCGAG GAGACCATCACCCCGTGGAACTTCGAGGAGGTGGTGGACAAGGGCGCCTCCG CCCAGTCCTTCATCGAGCGCATGACCAACTTCGACAAGAACCTCCCGAACGAG AAGGTGCTCCCGAAGCACTCCCTCCTCTACGAGTACTTCACCGTGTACAACGA GCTCACCAAGGTGAAGTACGTGACCGAGGGCATGCGCAAGCCGGCCTTCCTCT CCGGCGAGCAGAAGAAGGCCATCGTGGACCTCCTCTTCAAGACCAACCGCAA GGTGACCGTGAAGCAGCTCAAGGAGGACTACTTCAAGAAGATCGAGTGCTTCG ACTCCGTGGAGATCTCCGGCGTGGAGGACCGCTTCAACGCCTCCCTCGGCACC TACCACGACCTCCTCAAGATCATCAAGGACAAGGACTTCCTCGACAACGAGGA GAACGAGGACATCCTCGAGGACATCGTGCTCACCCTCACCCTCTTCGAGGACC GCGAGATGATCGAGGAGCGCCTCAAGACCTACGCCCACCTCTTCGACGACAAG 
GTGATGAAGCAGCTCAAGCGCCGCCGCTACACCGGCTGGGGCCGCCTCTCCCG CAAGCTCATCAACGGCATCCGCGACAAGCAGTCCGGCAAGACCATCCTCGACT TCCTCAAGTCCGACGGCTTCGCCAACCGCAACTTCATGCAGCTCATCCACGAC GACTCCCTCACCTTCAAGGAGGACATCCAGAAGGCCCAGGTGTCCGGCCAGGG CGACTCCCTCCACGAGCACATCGCCAACCTCGCCGGCTCCCCGGCCATCAAGA AGGGCATCCTCCAGACCGTGAAGGTGGTGGACGAGCTCGTGAAGGTGATGGG CCGCCACAAGCCGGAGAACATCGTGATCGAGATGGCCCGCGAGAACCAGACC ACCCAGAAGGGCCAGAAGAACTCCCGCGAGCGCATGAAGCGCATCGAGGAGG GCATCAAGGAGCTCGGCTCCCAGATCCTCAAGGAGCACCCGGTGGAGAACAC CCAGCTCCAGAACGAGAAGCTCTACCTCTACTACCTCCAGAACGGCCGCGACA TGTACGTGGACCAGGAGCTCGACATCAACCGCCTCTCCGACTACGACGTGGAC CACATCGTGCCGCAGTCCTTCCTCAAGGACGACTCCATCGACAACAAGGTGCT CACCCGCTCCGACAAGAACCGCGGCAAGTCCGACAACGTGCCGTCCGAGGAG GTGGTGAAGAAGATGAAGAACTACTGGCGCCAGCTCCTCAACGCCAAGCTCAT CACCCAGCGCAAGTTCGACAACCTCACCAAGGCCGAGCGCGGCGGCCTCTCCG AGCTCGACAAGGCCGGCTTCATCAAGCGCCAGCTCGTGGAGACCCGCCAGATC ACCAAGCACGTGGCCCAGATCCTCGACTCCCGCATGAACACCAAGTACGACGA GAACGACAAGCTCATCCGCGAGGTGAAGGTGATCACCCTCAAGTCCAAGCTCG TGTCCGACTTCCGCAAGGACTTCCAGTTCTACAAGGTGCGCGAGATCAACAAC TACCACCACGCCCACGACGCCTACCTCAACGCCGTGGTGGGCACCGCCCTCAT CAAGAAGTACCCGAAGCTCGAGTCCGAGTTCGTGTACGGCGACTACAAGGTGT ACGACGTGCGCAAGATGATCGCCAAGTCCGAGCAGGAGATCGGCAAGGCCAC CGCCAAGTACTTCTTCTACTCCAACATCATGAACTTCTTCAAGACCGAGATCAC CCTCGCCAACGGCGAGATCCGCAAGCGCCCGCTCATCGAGACCAACGGCGAG ACCGGCGAGATCGTGTGGGACAAGGGCCGCGACTTCGCCACCGTGCGCAAGG TGCTCTCCATGCCGCAGGTGAACATCGTGAAGAAGACCGAGGTGCAGACCGGC GGCTTCTCCAAGGAGTCCATCCTCCCGAAGCGCAACTCCGACAAGCTCATCGC CCGCAAGAAGGACTGGGACCCGAAGAAGTACGGCGGCTTCGACTCCCCGACC GTGGCCTACTCCGTGCTCGTGGTGGCCAAGGTGGAGAAGGGCAAGTCCAAGAA GCTCAAGTCCGTGAAGGAGCTCCTCGGCATCACCATCATGGAGCGCTCCTCCT TCGAGAAGAACCCGATCGACTTCCTCGAGGCCAAGGGCTACAAGGAGGTGAA GAAGGACCTCATCATCAAGCTCCCGAAGTACTCCCTCTTCGAGCTCGAGAACG GCCGCAAGCGCATGCTCGCCTCCGCCGGCGAGCTCCAGAAGGGCAACGAGCTC GCCCTCCCGTCCAAGTACGTGAACTTCCTCTACCTCGCCTCCCACTACGAGAAG CTCAAGGGCTCCCCGGAGGACAACGAGCAGAAGCAGCTCTTCGTGGAGCAGC ACAAGCACTACCTCGACGAGATCATCGAGCAGATCTCCGAGTTCTCCAAGCGC GTGATCCTCGCCGACGCCAACCTCGACAAGGTGCTCTCCGCCTACAACAAGCA 
CCGCGACAAGCCGATCCGCGAGCAGGCCGAGAACATCATCCACCTCTTCACCC TCACCAACCTCGGCGCCCCGGCCGCCTTCAAGTACTTCGACACCACCATCGAC CGCAAGCGCTACACCTCCACCAAGGAGGTGCTCGACGCCACCCTCATCCACCA GTCCATCACCGGCCTCTACGAGACCCGCATCGACCTCTCCCAGCTCGGCGGCG A C

SEQ ID NO:2

Primer F

CAGCACAGGTTAAGTCTG

SEQ ID NO:3

Primer R

GTCTGTCTCAACGGTAAG

SEQ ID NO:4

OsCENH3 gene sequence of wild-type rice

ACGCCGCTTCAGTTTGAAAACCCACCGCCACGTCGCCGCCGCCGCCGCCGCCG CCGCCGACGCCGAGATGGCTCGCACGAAGCACCCGGCGGTGAGGAAGTCGAA GGCGGAGCCCAAGAAGAAGCTCCAGTTCGAACGCTCCCCTCGGCCGTCGAAG GCGCAGCGCGCTGGTGGTGAGCGCGCGCTCTCTCCCCCTCTGCGTTTCTTTTTT TTTTTCCTTTTTCTTTCAATGGCGGTGGATGGTGAAGCTTATGCCCCCCCCCCCC CCTTCCCGCCTCTTGCTTGTCCCCTTTGCAGGCGGCACGGGTACCTCGGCGACC ACGGTGCGTGCGGGAGCGGGTCTTTCGTTTGGTGATTTTTTGATTTTGTGGGGG GATATGTTTTTGTTTTGTATCTTGGCTGGATGGATGGCTTGCTCACCACCTGTTTG ATGGAATGCAGAGGAGCGCGGCTGGAACATCGGCTTCAGGTGCGTTCTCTTGG GGGGGTTTCTAGGGTTATTCATGGGCTCGTTGGAGCTTTTCCTTTCTGTCTCTTG GATTCCGGGGGACCTGAGGGGCTCAATGTGTCCCTTTTCTTGCTCTGTTTTACCG TGTGCTGTACTTTCCTCATCGTTGTTTTCTGAATATATTATAAGAACAGTAGTTGC AGAAAGATCTTCAATTGCTCATCAGTCAAAGCTTTTCTTGTTTTCATTCTGAAAT AATAGCAAATCCAGTTTGGTCCATGGAGGGGTTATCTGAAACATTATGACCATAA AACATGGTATTAAGCATTGCTAGCCAAGAAATGTGTGGTTTTTAGACACGATGTT GATAGGTGATTTTTATGCTCATCCATTATTAGTCTTTGCATCGTGGGAACTGATTT AGTAAACTTTCTTTAGTGTCATGGTTCAAATAGCGTCTGTTCTACCTAGATGATA GGTATCCATATGGAAGTCTTGGCTTTGGAATTGCTCTCCTTTTGTTCTCCTGTGAT TAAATAACTTTAACATGTGTGTGAAGCAGGGACGCCTAGGCAGCAAACGAAGC AGAGGAAGCCACACCGCTTCCGTCCAGGCACAGTGGCACTGCGGGAGATCAG GAAATTTCAGAAAACCACCGAACTGCTGATCCCGTTTGCACCATTTTCTCGGCT GGTGGGTACATCCTGAACCTGCCTTCTCTCTATATCAAATATTTCGTAGTGCAAA CTTGTGTGATGGAAGCTTTTTGTGCCGATAAAATTTGCAGGTCAGGGAGATCAC TGATTTCTATTCAAAGGATGTGTCACGGTGGACCCTTGAAGCTCTCCTTGCATTG CAAGAGGTCAGTGGTCAAACCTGTTTATTATAAGTTTACAACTGATGGCTTAGTT AGGGAAGGGTCAGACTGAATTATACTGTTTAAATTCCATTCTGCTTCAAGACTCA AGTCACGGCTCAAGAGTGTAACTGAAAAATGTACAAATCTTCCATGATCAATAA AATGAATATCTCTGTGTGTTGATTTATGAGTCAGATTGCTAAATTATTATCCTTTTT CAGTAGAACACCTATATACTACAAATATGCAACCTCCCTATTTTGTTGTGTCTGTT CAAGATTGCTATCATAGAGTATACCAATTTCAGTTCCTTCTTTCCAGCCATGTCTG TTTCTGCATAACCAGGAAAAGGAACAAAGAGCTGACTTAATTCTCACAAAATAA ATTATGTTATTTACTTGCTGTCCTGCAAATTTCCAGTGGTTTTCCCTCTCCTGCAG GCAGCAGAATACCACTTAGTGGACATATTTGAAGTGTCAAATCTCTGCGCCATCC ATGCTAAGCGTGTTACCATCAGTAAGTTGTCATTCTGAATGAACTTTTCTCTTTCT TTTTCTCCCTTTATATTATTATGCTAAATGGATATCATATATGCCACAGCCTACATGA 
TATCATATACGCATCCACTTCAAAAGCATTCTATTTTTTTATAGGAATAACATTCTA ATTGCAGGATGATTCTTAATACATGTGTTTATATTTAATGTCATATCTAGTTTTCAT ACTCTTAAATTTATCATGATTATTGATTAAACATAGGGAGAATTAGTTGGTTTGTG AGTTTTGAGGTGTGAAATATGCTGCTTGCTATTCCCTGTAAAGCTTATCAGCGTT GTCATTGTGTGGTTTAACAAATAAACGTTTGTTCTGCAGTGCAAAAGGACATGC AACTTGCCAGGCGTATCGGTGGGCGGAGGCCATGGTGAAAATTTGTTTGCGAGC CATGCAGCATGATGGACAAGGAGCAACATGTGTCGTTGATTAACATTTTAGAAA GTAGTGTAGATGTATCTTCACATAGGGATCAACTTACCCTTCGTTCCCATTCTAAT TCAGTTGATGTTAGTATTTACCTTTTGCTCCATTTGGATTGGTCGAATTCAGGATT TCATCAAACAGTCGATTGTGAAATGTGAACCAGGAATTGTTGTGTTGATTGCAAT AATGGGTTCCTCTCACCTGCTTCTTCCATC

SEQ ID NO:5

OsCENH3 protein sequence of wild-type rice

MARTKHPAVRKSKAEPKKKLQFERSPRPSKAQRAGGGTGTSATTRSAAGTSASAG TPRQQTKQRKPHRFRPGTVALREIRKFQKTTELLIPFAPFSRLVREITDFYSKDVSRW TLEALLALQEAAEYHLVDIFEVSNLCAIHAKRVTIMQKDMQLARRIGGRRPW

SEQ ID NO:6

Target site sequence targeted by editing vector pZZT000431 (targeting exon 4)

ACCGCTTCCGTCCAGGCACAG

SEQ ID NO:7

Nucleotide sequence of guide RNA (sgRNA)

ACCGCTTCCGTCCAGGCACAGGTTTTAGAGCTAGAAATAGCAAGTTAAAATAA GGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCT

SEQ ID NO:8

Primer CENH3 target4-F

CGCTTCCGTCCAGGCACAG

SEQ ID NO:9

Primer CENH3 target4-R

GAATAGAAATCAGTGATCTCCC

SEQ ID NO:10

CENH3(-3) nucleotide sequence

ACGCCGCTTCAGTTTGAAAACCCACCGCCACGTCGCCGCCGCCGCCGCCGCCG CCGCCGACGCCGAGATGGCTCGCACGAAGCACCCGGCGGTGAGGAAGTCGAA GGCGGAGCCCAAGAAGAAGCTCCAGTTCGAACGCTCCCCTCGGCCGTCGAAG GCGCAGCGCGCTGGTGGTGAGCGCGCGCTCTCTCCCCCTCTGCGTTTCTTTTTT TTTTTCCTTTTTCTTTCAATGGCGGTGGATGGTGAAGCTTATGCCCCCCCCCCCC CCTTCCCGCCTCTTGCTTGTCCCCTTTGCAGGCGGCACGGGTACCTCGGCGACC ACGGTGCGTGCGGGAGCGGGTCTTTCGTTTGGTGATTTTTTGATTTTGTGGGGG GATATGTTTTTGTTTTGTATCTTGGCTGGATGGATGGCTTGCTCACCACCTGTTT GATGGAATGCAGAGGAGCGCGGCTGGAACATCGGCTTCAGGTGCGTTCTCTTG GGGGGGTTTCTAGGGTTATTCATGGGCTCGTTGGAGCTTTTCCTTTCTGTCTCTT GGATTCCGGGGGACCTGAGGGGCTCAATGTGTCCCTTTTCTTGCTCTGTTTTAC CGTGTGCTGTACTTTCCTCATCGTTGTTTTCTGAATATATTATAAGAACAGTAG TTGCAGAAAGATCTTCAATTGCTCATCAGTCAAAGCTTTTCTTGTTTTCATTCTG AAATAATAGCAAATCCAGTTTGGTCCATGGAGGGGTTATCTGAAACATTATGA CCATAAAACATGGTATTAAGCATTGCTAGCCAAGAAATGTGTGGTTTTTAGAC ACGATGTTGATAGGTGATTTTTATGCTCATCCATTATTAGTCTTTGCATCGTGG GAACTGATTTAGTAAACTTTCTTTAGTGTCATGGTTCAAATAGCGTCTGTTCTA CCTAGATGATAGGTATCCATATGGAAGTCTTGGCTTTGGAATTGCTCTCCTTTT GTTCTCCTGTGATTAAATAACTTTAACATGTGTGTGAAGCAGGGACGCCTAGG CAGCAAACGAAGCAGAGGAAGCCACACCGCTTCCGTCCAGCAGTGGCACTGC GGGAGATCAGGAAATTTCAGAAAACCACCGAACTGCTGATCCCGTTTGCACCA TTTTCTCGGCTGGTGGGTACATCCTGAACCTGCCTTCTCTCTATATCAAATATTT CGTAGTGCAAACTTGTGTGATGGAAGCTTTTTGTGCCGATAAAATTTGCAGGTC AGGGAGATCACTGATTTCTATTCAAAGGATGTGTCACGGTGGACCCTTGAAGC TCTCCTTGCATTGCAAGAGGTCAGTGGTCAAACCTGTTTATTATAAGTTTACAA CTGATGGCTTAGTTAGGGAAGGGTCAGACTGAATTATACTGTTTAAATTCCATT CTGCTTCAAGACTCAAGTCACGGCTCAAGAGTGTAACTGAAAAATGTACAAAT CTTCCATGATCAATAAAATGAATATCTCTGTGTGTTGATTTATGAGTCAGATTG CTAAATTATTATCCTTTTTCAGTAGAACACCTATATACTACAAATATGCAACCT CCCTATTTTGTTGTGTCTGTTCAAGATTGCTATCATAGAGTATACCAATTTCAGT TCCTTCTTTCCAGCCATGTCTGTTTCTGCATAACCAGGAAAAGGAACAAAGAG CTGACTTAATTCTCACAAAATAAATTATGTTATTTACTTGCTGTCCTGCAAATTT CCAGTGGTTTTCATACCACTTAGTGGACATATTTGAAGTGTCAAATCTCTGCGC CATCCATGCTAAGCGTGTTACCATCAGTAAGTTGTCATTCTGAATGAACTTTTC TCTTTCTTTTTCTCCCTTTATATTATTATGCTAAATGGATATCATATATGCCACA GCCTACATGATATCATATACGCATCCACTTCAAAAGCATTCTATTTTTTTATAG 
GAATAACATTCTAATTGCAGGATGATTCTTAATACATGTGTTTATATTTAATGT CATATCTAGTTTTCATACTCTTAAATTTATCATGATTATTGATTAAACATAGGG AGAATTAGTTGGTTTGTGAGTTTTGAGGTGTGAAATATGCTGCTTGCTATTCCC TGTAAAGCTTATCAGCGTTGTCATTGTGTGGTTTAACAAATAAACGTTTGTTCT GCAGTGCAAAAGGACATGCAACTTGCCAGGCGTATCGGTGGGCGGAGGCCAT GGTGAAAATTTGTTTGCGAGCCATGCAGCATGATGGACAAGGAGCAACATGTG TCGTTGATTAACATTTTAGAAAGTAGTGTAGATGTATCTTCACATAGGGATCAA CTTACCCTTCGTTCCCATTCTAATTCAGTTGATGTTAGTATTTACCTTTTGCTCC ATTTGGATTGGTCGAATTCAGGATTTCATCAAACAGTCGATTGTGAAATGTGA ACCAGGAATTGTTGTGTTGATTGCAATAATGGGTTCCTCTCACCTGCTTCTTCC ATC

SEQ ID NO:11

CENH3(-3) amino acid sequence

MARTKHPAVRKSKAEPKKKLQFERSPRPSKAQRAGGGTGTSATTRSAAGTSASAG TPRQQTKQRKPHRFRP----TVALREIRKFQKTTELLIPFAPFSRLVREITDFYSKDVSR WTLEALLALQEAAEYHLVDIFEVSNLCAIHAKRVTIMQKDMQLARRIGGRRPW

SEQ ID NO:12

CENH3(-6) nucleotide sequence

ACGCCGCTTCAGTTTGAAAACCCACCGCCACGTCGCCGCCGCCGCCGCCGCCG CCGCCGACGCCGAGATGGCTCGCACGAAGCACCCGGCGGTGAGGAAGTCGAA GGCGGAGCCCAAGAAGAAGCTCCAGTTCGAACGCTCCCCTCGGCCGTCGAAG GCGCAGCGCGCTGGTGGTGAGCGCGCGCTCTCTCCCCCTCTGCGTTTCTTTTTT TTTTTCCTTTTTCTTTCAATGGCGGTGGATGGTGAAGCTTATGCCCCCCCCCCCC CCTTCCCGCCTCTTGCTTGTCCCCTTTGCAGGCGGCACGGGTACCTCGGCGACC ACGGTGCGTGCGGGAGCGGGTCTTTCGTTTGGTGATTTTTTGATTTTGTGGGGG GATATGTTTTTGTTTTGTATCTTGGCTGGATGGATGGCTTGCTCACCACCTGTTT GATGGAATGCAGAGGAGCGCGGCTGGAACATCGGCTTCAGGTGCGTTCTCTTG GGGGGGTTTCTAGGGTTATTCATGGGCTCGTTGGAGCTTTTCCTTTCTGTCTCTT GGATTCCGGGGGACCTGAGGGGCTCAATGTGTCCCTTTTCTTGCTCTGTTTTAC CGTGTGCTGTACTTTCCTCATCGTTGTTTTCTGAATATATTATAAGAACAGTAG TTGCAGAAAGATCTTCAATTGCTCATCAGTCAAAGCTTTTCTTGTTTTCATTCTG AAATAATAGCAAATCCAGTTTGGTCCATGGAGGGGTTATCTGAAACATTATGA CCATAAAACATGGTATTAAGCATTGCTAGCCAAGAAATGTGTGGTTTTTAGAC ACGATGTTGATAGGTGATTTTTATGCTCATCCATTATTAGTCTTTGCATCGTGG GAACTGATTTAGTAAACTTTCTTTAGTGTCATGGTTCAAATAGCGTCTGTTCTA CCTAGATGATAGGTATCCATATGGAAGTCTTGGCTTTGGAATTGCTCTCCTTTT GTTCTCCTGTGATTAAATAACTTTAACATGTGTGTGAAGCAGGGACGCCTAGG CAGCAAACGAAGCAGAGGAAGCCACACCGCTTCCGTCCAGTGGCACTGCGGG AGATCAGGAAATTTCAGAAAACCACCGAACTGCTGATCCCGTTTGCACCATTT TCTCGGCTGGTGGGTACATCCTGAACCTGCCTTCTCTCTATATCAAATATTTCG TAGTGCAAACTTGTGTGATGGAAGCTTTTTGTGCCGATAAAATTTGCAGGTCAG GGAGATCACTGATTTCTATTCAAAGGATGTGTCACGGTGGACCCTTGAAGCTCT CCTTGCATTGCAAGAGGTCAGTGGTCAAACCTGTTTATTATAAGTTTACAACTG ATGGCTTAGTTAGGGAAGGGTCAGACTGAATTATACTGTTTAAATTCCATTCTG CTTCAAGACTCAAGTCACGGCTCAAGAGTGTAACTGAAAAATGTACAAATCTT CCATGATCAATAAAATGAATATCTCTGTGTGTTGATTTATGAGTCAGATTGCTA AATTATTATCCTTTTTCAGTAGAACACCTATATACTACAAATATGCAACCTCCC TATTTTGTTGTGTCTGTTCAAGATTGCTATCATAGAGTATACCAATTTCAGTTCC TTCTTTCCAGCCATGTCTGTTTCTGCATAACCAGGAAAAGGAACAAAGAGCTG ACTTAATTCTCACAAAATAAATTATGTTATTTACTTGCTGTCCTGCAAATTTCC AGTGGTTTTCATACCACTTAGTGGACATATTTGAAGTGTCAAATCTCTGCGCCA TCCATGCTAAGCGTGTTACCATCAGTAAGTTGTCATTCTGAATGAACTTTTCTC TTTCTTTTTCTCCCTTTATATTATTATGCTAAATGGATATCATATATGCCACAGC CTACATGATATCATATACGCATCCACTTCAAAAGCATTCTATTTTTTTATAGGA 
ATAACATTCTAATTGCAGGATGATTCTTAATACATGTGTTTATATTTAATGTCA TATCTAGTTTTCATACTCTTAAATTTATCATGATTATTGATTAAACATAGGGAG AATTAGTTGGTTTGTGAGTTTTGAGGTGTGAAATATGCTGCTTGCTATTCCCTG TAAAGCTTATCAGCGTTGTCATTGTGTGGTTTAACAAATAAACGTTTGTTCTGC AGTGCAAAAGGACATGCAACTTGCCAGGCGTATCGGTGGGCGGAGGCCATGG TGAAAATTTGTTTGCGAGCCATGCAGCATGATGGACAAGGAGCAACATGTGTC GTTGATTAACATTTTAGAAAGTAGTGTAGATGTATCTTCACATAGGGATCAACT TACCCTTCGTTCCCATTCTAATTCAGTTGATGTTAGTATTTACCTTTTGCTCCAT TTGGATTGGTCGAATTCAGGATTTCATCAAACAGTCGATTGTGAAATGTGAAC CAGGAATTGTTGTGTTGATTGCAATAATGGGTTCCTCTCACCTGCTTCTTCCAT C

SEQ ID NO:13

CENH3(-6) amino acid sequence

MARTKHPAVRKSKAEPKKKLQFERSPRPSKAQRAGGGTGTSATTRSAAGTSASAG TPRQQTKQRKPHRFRP----VALREIRKFQKTTELLIPFAPFSRLVREITDFYSKDVSR WTLEALLALQEAAEYHLVDIFEVSNLCAIHAKRVTIMQKDMQLARRIGGRRPW

SEQ ID NO:14

CENH3(-12) nucleotide sequence

ACGCCGCTTCAGTTTGAAAACCCACCGCCACGTCGCCGCCGCCGCCGCCGCCG CCGCCGACGCCGAGATGGCTCGCACGAAGCACCCGGCGGTGAGGAAGTCGAA GGCGGAGCCCAAGAAGAAGCTCCAGTTCGAACGCTCCCCTCGGCCGTCGAAG GCGCAGCGCGCTGGTGGTGAGCGCGCGCTCTCTCCCCCTCTGCGTTTCTTTTTT TTTTTCCTTTTTCTTTCAATGGCGGTGGATGGTGAAGCTTATGCCCCCCCCCCCC CCTTCCCGCCTCTTGCTTGTCCCCTTTGCAGGCGGCACGGGTACCTCGGCGACC ACGGTGCGTGCGGGAGCGGGTCTTTCGTTTGGTGATTTTTTGATTTTGTGGGGG GATATGTTTTTGTTTTGTATCTTGGCTGGATGGATGGCTTGCTCACCACCTGTTT GATGGAATGCAGAGGAGCGCGGCTGGAACATCGGCTTCAGGTGCGTTCTCTTG GGGGGGTTTCTAGGGTTATTCATGGGCTCGTTGGAGCTTTTCCTTTCTGTCTCTT GGATTCCGGGGGACCTGAGGGGCTCAATGTGTCCCTTTTCTTGCTCTGTTTTAC CGTGTGCTGTACTTTCCTCATCGTTGTTTTCTGAATATATTATAAGAACAGTAG TTGCAGAAAGATCTTCAATTGCTCATCAGTCAAAGCTTTTCTTGTTTTCATTCTG AAATAATAGCAAATCCAGTTTGGTCCATGGAGGGGTTATCTGAAACATTATGA CCATAAAACATGGTATTAAGCATTGCTAGCCAAGAAATGTGTGGTTTTTAGAC ACGATGTTGATAGGTGATTTTTATGCTCATCCATTATTAGTCTTTGCATCGTGG GAACTGATTTAGTAAACTTTCTTTAGTGTCATGGTTCAAATAGCGTCTGTTCTA CCTAGATGATAGGTATCCATATGGAAGTCTTGGCTTTGGAATTGCTCTCCTTTT GTTCTCCTGTGATTAAATAACTTTAACATGTGTGTGAAGCAGGGACGCCTAGG CAGCAAACGAAGCAGAGGAAGCCACACCGCTTCAAGGCACTGCGGGAGATCA GGAAATTTCAGAAAACCACCGAACTGCTGATCCCGTTTGCACCATTTTCTCGGC TGGTGGGTACATCCTGAACCTGCCTTCTCTCTATATCAAATATTTCGTAGTGCA AACTTGTGTGATGGAAGCTTTTTGTGCCGATAAAATTTGCAGGTCAGGGAGAT CACTGATTTCTATTCAAAGGATGTGTCACGGTGGACCCTTGAAGCTCTCCTTGC ATTGCAAGAGGTCAGTGGTCAAACCTGTTTATTATAAGTTTACAACTGATGGCT TAGTTAGGGAAGGGTCAGACTGAATTATACTGTTTAAATTCCATTCTGCTTCAA GACTCAAGTCACGGCTCAAGAGTGTAACTGAAAAATGTACAAATCTTCCATGA TCAATAAAATGAATATCTCTGTGTGTTGATTTATGAGTCAGATTGCTAAATTAT TATCCTTTTTCAGTAGAACACCTATATACTACAAATATGCAACCTCCCTATTTT GTTGTGTCTGTTCAAGATTGCTATCATAGAGTATACCAATTTCAGTTCCTTCTTT CCAGCCATGTCTGTTTCTGCATAACCAGGAAAAGGAACAAAGAGCTGACTTAA TTCTCACAAAATAAATTATGTTATTTACTTGCTGTCCTGCAAATTTCCAGTGGTT TTCATACCACTTAGTGGACATATTTGAAGTGTCAAATCTCTGCGCCATCCATGC TAAGCGTGTTACCATCAGTAAGTTGTCATTCTGAATGAACTTTTCTCTTTCTTTT TCTCCCTTTATATTATTATGCTAAATGGATATCATATATGCCACAGCCTACATG ATATCATATACGCATCCACTTCAAAAGCATTCTATTTTTTTATAGGAATAACAT 
TCTAATTGCAGGATGATTCTTAATACATGTGTTTATATTTAATGTCATATCTAGT TTTCATACTCTTAAATTTATCATGATTATTGATTAAACATAGGGAGAATTAGTT GGTTTGTGAGTTTTGAGGTGTGAAATATGCTGCTTGCTATTCCCTGTAAAGCTT ATCAGCGTTGTCATTGTGTGGTTTAACAAATAAACGTTTGTTCTGCAGTGCAAA AGGACATGCAACTTGCCAGGCGTATCGGTGGGCGGAGGCCATGGTGAAAATTT GTTTGCGAGCCATGCAGCATGATGGACAAGGAGCAACATGTGTCGTTGATTAA CATTTTAGAAAGTAGTGTAGATGTATCTTCACATAGGGATCAACTTACCCTTCG TTCCCATTCTAATTCAGTTGATGTTAGTATTTACCTTTTGCTCCATTTGGATTGG TCGAATTCAGGATTTCATCAAACAGTCGATTGTGAAATGTGAACCAGGAATTG TTGTGTTGATTGCAATAATGGGTTCCTCTCACCTGCTTCTTCCATC

SEQ ID NO:15

CENH3(-12) amino acid sequence

MARTKHPAVRKSKAEPKKKLQFERSPRPSKAQRAGGGTGTSATTRSAAGTSASAG TPRQQTKQRKPHRF----KALREIRKFQKTTELLIPFAPFSRLVREITDFYSKDVSRWT LEALLALQEAAEYHLVDIFEVSNLCAIHAKRVTIMQKDMQLARRIGGRRPW

SEQ ID NO:16

CENH3(-18) nucleotide sequence

ACGCCGCTTCAGTTTGAAAACCCACCGCCACGTCGCCGCCGCCGCCGCCGCCG CCGCCGACGCCGAGATGGCTCGCACGAAGCACCCGGCGGTGAGGAAGTCGAA GGCGGAGCCCAAGAAGAAGCTCCAGTTCGAACGCTCCCCTCGGCCGTCGAAG GCGCAGCGCGCTGGTGGTGAGCGCGCGCTCTCTCCCCCTCTGCGTTTCTTTTTT TTTTTCCTTTTTCTTTCAATGGCGGTGGATGGTGAAGCTTATGCCCCCCCCCCCC CCTTCCCGCCTCTTGCTTGTCCCCTTTGCAGGCGGCACGGGTACCTCGGCGACC ACGGTGCGTGCGGGAGCGGGTCTTTCGTTTGGTGATTTTTTGATTTTGTGGGGG GATATGTTTTTGTTTTGTATCTTGGCTGGATGGATGGCTTGCTCACCACCTGTTT GATGGAATGCAGAGGAGCGCGGCTGGAACATCGGCTTCAGGTGCGTTCTCTTG GGGGGGTTTCTAGGGTTATTCATGGGCTCGTTGGAGCTTTTCCTTTCTGTCTCTT GGATTCCGGGGGACCTGAGGGGCTCAATGTGTCCCTTTTCTTGCTCTGTTTTAC CGTGTGCTGTACTTTCCTCATCGTTGTTTTCTGAATATATTATAAGAACAGTAG TTGCAGAAAGATCTTCAATTGCTCATCAGTCAAAGCTTTTCTTGTTTTCATTCTG AAATAATAGCAAATCCAGTTTGGTCCATGGAGGGGTTATCTGAAACATTATGA CCATAAAACATGGTATTAAGCATTGCTAGCCAAGAAATGTGTGGTTTTTAGAC ACGATGTTGATAGGTGATTTTTATGCTCATCCATTATTAGTCTTTGCATCGTGG GAACTGATTTAGTAAACTTTCTTTAGTGTCATGGTTCAAATAGCGTCTGTTCTA CCTAGATGATAGGTATCCATATGGAAGTCTTGGCTTTGGAATTGCTCTCCTTTT GTTCTCCTGTGATTAAATAACTTTAACATGTGTGTGAAGCAGGGACGCCTAGG CAGCAAACGAAGCAGAGGAAGCCAGCAGTGGCACTGCGGGAGATCAGGAAAT TTCAGAAAACCACCGAACTGCTGATCCCGTTTGCACCATTTTCTCGGCTGGTGG GTACATCCTGAACCTGCCTTCTCTCTATATCAAATATTTCGTAGTGCAAACTTG TGTGATGGAAGCTTTTTGTGCCGATAAAATTTGCAGGTCAGGGAGATCACTGA TTTCTATTCAAAGGATGTGTCACGGTGGACCCTTGAAGCTCTCCTTGCATTGCA AGAGGTCAGTGGTCAAACCTGTTTATTATAAGTTTACAACTGATGGCTTAGTTA GGGAAGGGTCAGACTGAATTATACTGTTTAAATTCCATTCTGCTTCAAGACTCA AGTCACGGCTCAAGAGTGTAACTGAAAAATGTACAAATCTTCCATGATCAATA AAATGAATATCTCTGTGTGTTGATTTATGAGTCAGATTGCTAAATTATTATCCT TTTTCAGTAGAACACCTATATACTACAAATATGCAACCTCCCTATTTTGTTGTG TCTGTTCAAGATTGCTATCATAGAGTATACCAATTTCAGTTCCTTCTTTCCAGCC ATGTCTGTTTCTGCATAACCAGGAAAAGGAACAAAGAGCTGACTTAATTCTCA CAAAATAAATTATGTTATTTACTTGCTGTCCTGCAAATTTCCAGTGGTTTTCAT ACCACTTAGTGGACATATTTGAAGTGTCAAATCTCTGCGCCATCCATGCTAAGC GTGTTACCATCAGTAAGTTGTCATTCTGAATGAACTTTTCTCTTTCTTTTTCTCC CTTTATATTATTATGCTAAATGGATATCATATATGCCACAGCCTACATGATATC ATATACGCATCCACTTCAAAAGCATTCTATTTTTTTATAGGAATAACATTCTAA 
TTGCAGGATGATTCTTAATACATGTGTTTATATTTAATGTCATATCTAGTTTTCA TACTCTTAAATTTATCATGATTATTGATTAAACATAGGGAGAATTAGTTGGTTT GTGAGTTTTGAGGTGTGAAATATGCTGCTTGCTATTCCCTGTAAAGCTTATCAG CGTTGTCATTGTGTGGTTTAACAAATAAACGTTTGTTCTGCAGTGCAAAAGGAC ATGCAACTTGCCAGGCGTATCGGTGGGCGGAGGCCATGGTGAAAATTTGTTTG CGAGCCATGCAGCATGATGGACAAGGAGCAACATGTGTCGTTGATTAACATTT TAGAAAGTAGTGTAGATGTATCTTCACATAGGGATCAACTTACCCTTCGTTCCC ATTCTAATTCAGTTGATGTTAGTATTTACCTTTTGCTCCATTTGGATTGGTCGAA TTCAGGATTTCATCAAACAGTCGATTGTGAAATGTGAACCAGGAATTGTTGTGT TGATTGCAATAATGGGTTCCTCTCACCTGCTTCTTCCATC

SEQ ID NO:17

CENH3(-18) amino acid sequence

MARTKHPAVRKSKAEPKKKLQFERSPRPSKAQRAGGGTGTSATTRSAAGTSASAG TPRQQTKQRKP----AVALREIRKFQKTTELLIPFAPFSRLVREITDFYSKDVSRWTLE ALLALQEAAEYHLVDIFEVSNLCAIHAKRVTIMQKDMQLARRIGGRRPW

SEQ ID NO:18

CENH3(-24) nucleotide sequence

ACGCCGCTTCAGTTTGAAAACCCACCGCCACGTCGCCGCCGCCGCCGCCGCCG CCGCCGACGCCGAGATGGCTCGCACGAAGCACCCGGCGGTGAGGAAGTCGAA GGCGGAGCCCAAGAAGAAGCTCCAGTTCGAACGCTCCCCTCGGCCGTCGAAG GCGCAGCGCGCTGGTGGTGAGCGCGCGCTCTCTCCCCCTCTGCGTTTCTTTTTT TTTTTCCTTTTTCTTTCAATGGCGGTGGATGGTGAAGCTTATGCCCCCCCCCCCC CCTTCCCGCCTCTTGCTTGTCCCCTTTGCAGGCGGCACGGGTACCTCGGCGACC ACGGTGCGTGCGGGAGCGGGTCTTTCGTTTGGTGATTTTTTGATTTTGTGGGGG GATATGTTTTTGTTTTGTATCTTGGCTGGATGGATGGCTTGCTCACCACCTGTTT GATGGAATGCAGAGGAGCGCGGCTGGAACATCGGCTTCAGGTGCGTTCTCTTG GGGGGGTTTCTAGGGTTATTCATGGGCTCGTTGGAGCTTTTCCTTTCTGTCTCTT GGATTCCGGGGGACCTGAGGGGCTCAATGTGTCCCTTTTCTTGCTCTGTTTTAC CGTGTGCTGTACTTTCCTCATCGTTGTTTTCTGAATATATTATAAGAACAGTAG TTGCAGAAAGATCTTCAATTGCTCATCAGTCAAAGCTTTTCTTGTTTTCATTCTG AAATAATAGCAAATCCAGTTTGGTCCATGGAGGGGTTATCTGAAACATTATGA CCATAAAACATGGTATTAAGCATTGCTAGCCAAGAAATGTGTGGTTTTTAGAC ACGATGTTGATAGGTGATTTTTATGCTCATCCATTATTAGTCTTTGCATCGTGG GAACTGATTTAGTAAACTTTCTTTAGTGTCATGGTTCAAATAGCGTCTGTTCTA CCTAGATGATAGGTATCCATATGGAAGTCTTGGCTTTGGAATTGCTCTCCTTTT GTTCTCCTGTGATTAAATAACTTTAACATGTGTGTGAAGCAGGGACGCCTAGG CAGCAAACGAAGCAGAGGACAGTGGCACTGCGGGAGATCAGGAAATTTCAGA AAACCACCGAACTGCTGATCCCGTTTGCACCATTTTCTCGGCTGGTGGGTACAT CCTGAACCTGCCTTCTCTCTATATCAAATATTTCGTAGTGCAAACTTGTGTGAT GGAAGCTTTTTGTGCCGATAAAATTTGCAGGTCAGGGAGATCACTGATTTCTAT TCAAAGGATGTGTCACGGTGGACCCTTGAAGCTCTCCTTGCATTGCAAGAGGT CAGTGGTCAAACCTGTTTATTATAAGTTTACAACTGATGGCTTAGTTAGGGAAG GGTCAGACTGAATTATACTGTTTAAATTCCATTCTGCTTCAAGACTCAAGTCAC GGCTCAAGAGTGTAACTGAAAAATGTACAAATCTTCCATGATCAATAAAATGA ATATCTCTGTGTGTTGATTTATGAGTCAGATTGCTAAATTATTATCCTTTTTCAG TAGAACACCTATATACTACAAATATGCAACCTCCCTATTTTGTTGTGTCTGTTC AAGATTGCTATCATAGAGTATACCAATTTCAGTTCCTTCTTTCCAGCCATGTCT GTTTCTGCATAACCAGGAAAAGGAACAAAGAGCTGACTTAATTCTCACAAAAT AAATTATGTTATTTACTTGCTGTCCTGCAAATTTCCAGTGGTTTTCATACCACTT AGTGGACATATTTGAAGTGTCAAATCTCTGCGCCATCCATGCTAAGCGTGTTAC CATCAGTAAGTTGTCATTCTGAATGAACTTTTCTCTTTCTTTTTCTCCCTTTATA TTATTATGCTAAATGGATATCATATATGCCACAGCCTACATGATATCATATACG CATCCACTTCAAAAGCATTCTATTTTTTTATAGGAATAACATTCTAATTGCAGG 
ATGATTCTTAATACATGTGTTTATATTTAATGTCATATCTAGTTTTCATACTCTT AAATTTATCATGATTATTGATTAAACATAGGGAGAATTAGTTGGTTTGTGAGTT TTGAGGTGTGAAATATGCTGCTTGCTATTCCCTGTAAAGCTTATCAGCGTTGTC ATTGTGTGGTTTAACAAATAAACGTTTGTTCTGCAGTGCAAAAGGACATGCAA CTTGCCAGGCGTATCGGTGGGCGGAGGCCATGGTGAAAATTTGTTTGCGAGCC ATGCAGCATGATGGACAAGGAGCAACATGTGTCGTTGATTAACATTTTAGAAA GTAGTGTAGATGTATCTTCACATAGGGATCAACTTACCCTTCGTTCCCATTCTA ATTCAGTTGATGTTAGTATTTACCTTTTGCTCCATTTGGATTGGTCGAATTCAGG ATTTCATCAAACAGTCGATTGTGAAATGTGAACCAGGAATTGTTGTGTTGATTG CAATAATGGGTTCCTCTCACCTGCTTCTTCCATC

SEQ ID NO:19

CENH3(-24) amino acid sequence

MARTKHPAVRKSKAEPKKKLQFERSPRPSKAQRAGGGTGTSATTRSAAGTSASAG TPRQQTKQR----TVALREIRKFQKTTELLIPFAPFSRLVREITDFYSKDVSRWTLEAL LALQEAAEYHLVDIFEVSNLCAIHAKRVTIMQKDMQLARRIGGRRPW

SEQ ID NO:20

CENH3(-27) nucleotide sequence

ACGCCGCTTCAGTTTGAAAACCCACCGCCACGTCGCCGCCGCCGCCGCCGCCG CCGCCGACGCCGAGATGGCTCGCACGAAGCACCCGGCGGTGAGGAAGTCGAA GGCGGAGCCCAAGAAGAAGCTCCAGTTCGAACGCTCCCCTCGGCCGTCGAAG GCGCAGCGCGCTGGTGGTGAGCGCGCGCTCTCTCCCCCTCTGCGTTTCTTTTTT TTTTTCCTTTTTCTTTCAATGGCGGTGGATGGTGAAGCTTATGCCCCCCCCCCCC CCTTCCCGCCTCTTGCTTGTCCCCTTTGCAGGCGGCACGGGTACCTCGGCGACC ACGGTGCGTGCGGGAGCGGGTCTTTCGTTTGGTGATTTTTTGATTTTGTGGGGG GATATGTTTTTGTTTTGTATCTTGGCTGGATGGATGGCTTGCTCACCACCTGTTT GATGGAATGCAGAGGAGCGCGGCTGGAACATCGGCTTCAGGTGCGTTCTCTTG GGGGGGTTTCTAGGGTTATTCATGGGCTCGTTGGAGCTTTTCCTTTCTGTCTCTT GGATTCCGGGGGACCTGAGGGGCTCAATGTGTCCCTTTTCTTGCTCTGTTTTAC CGTGTGCTGTACTTTCCTCATCGTTGTTTTCTGAATATATTATAAGAACAGTAG TTGCAGAAAGATCTTCAATTGCTCATCAGTCAAAGCTTTTCTTGTTTTCATTCTG AAATAATAGCAAATCCAGTTTGGTCCATGGAGGGGTTATCTGAAACATTATGA CCATAAAACATGGTATTAAGCATTGCTAGCCAAGAAATGTGTGGTTTTTAGAC ACGATGTTGATAGGTGATTTTTATGCTCATCCATTATTAGTCTTTGCATCGTGG GAACTGATTTAGTAAACTTTCTTTAGTGTCATGGTTCAAATAGCGTCTGTTCTA CCTAGATGATAGGTATCCATATGGAAGTCTTGGCTTTGGAATTGCTCTCCTTTT GTTCTCCTGTGATTAAATAACTTTAACATGTGTGTGAAGCAGGGACGCCTAGG CAGCAAACGAAGCAGAGGAAGCCACTGCGGGAGATCAGGAAATTTCAGAAAA CCACCGAACTGCTGATCCCGTTTGCACCATTTTCTCGGCTGGTGGGTACATCCT GAACCTGCCTTCTCTCTATATCAAATATTTCGTAGTGCAAACTTGTGTGATGGA AGCTTTTTGTGCCGATAAAATTTGCAGGTCAGGGAGATCACTGATTTCTATTCA AAGGATGTGTCACGGTGGACCCTTGAAGCTCTCCTTGCATTGCAAGAGGTCAG TGGTCAAACCTGTTTATTATAAGTTTACAACTGATGGCTTAGTTAGGGAAGGGT CAGACTGAATTATACTGTTTAAATTCCATTCTGCTTCAAGACTCAAGTCACGGC TCAAGAGTGTAACTGAAAAATGTACAAATCTTCCATGATCAATAAAATGAATA TCTCTGTGTGTTGATTTATGAGTCAGATTGCTAAATTATTATCCTTTTTCAGTAG AACACCTATATACTACAAATATGCAACCTCCCTATTTTGTTGTGTCTGTTCAAG ATTGCTATCATAGAGTATACCAATTTCAGTTCCTTCTTTCCAGCCATGTCTGTTT CTGCATAACCAGGAAAAGGAACAAAGAGCTGACTTAATTCTCACAAAATAAAT TATGTTATTTACTTGCTGTCCTGCAAATTTCCAGTGGTTTTCATACCACTTAGTG GACATATTTGAAGTGTCAAATCTCTGCGCCATCCATGCTAAGCGTGTTACCATC AGTAAGTTGTCATTCTGAATGAACTTTTCTCTTTCTTTTTCTCCCTTTATATTATT ATGCTAAATGGATATCATATATGCCACAGCCTACATGATATCATATACGCATCC ACTTCAAAAGCATTCTATTTTTTTATAGGAATAACATTCTAATTGCAGGATGAT 
TCTTAATACATGTGTTTATATTTAATGTCATATCTAGTTTTCATACTCTTAAATT TATCATGATTATTGATTAAACATAGGGAGAATTAGTTGGTTTGTGAGTTTTGAG GTGTGAAATATGCTGCTTGCTATTCCCTGTAAAGCTTATCAGCGTTGTCATTGT GTGGTTTAACAAATAAACGTTTGTTCTGCAGTGCAAAAGGACATGCAACTTGC CAGGCGTATCGGTGGGCGGAGGCCATGGTGAAAATTTGTTTGCGAGCCATGCA GCATGATGGACAAGGAGCAACATGTGTCGTTGATTAACATTTTAGAAAGTAGT GTAGATGTATCTTCACATAGGGATCAACTTACCCTTCGTTCCCATTCTAATTCA GTTGATGTTAGTATTTACCTTTTGCTCCATTTGGATTGGTCGAATTCAGGATTTC ATCAAACAGTCGATTGTGAAATGTGAACCAGGAATTGTTGTGTTGATTGCAAT AATGGGTTCCTCTCACCTGCTTCTTCCATC

SEQ ID NO:21

CENH3(-27) amino acid sequence

MARTKHPAVRKSKAEPKKKLQFERSPRPSKAQRAGGGTGTSATTRSAAGTSASAG TPRQQTKQRKP----LREIRKFQKTTELLIPFAPFSRLVREITDFYSKDVSRWTLEALL ALQEAAEYHLVDIFEVSNLCAIHAKRVTIMQKDMQLARRIGGRRPW

Note: "----" in SEQ ID NOS: 11, 13, 15, 17, 19 and 21 indicates the position of the amino acid deletion caused by editing.
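Because the dashes mark only the position of the deletion and not its length, the net number of deleted nucleotides can be checked by comparing sequence lengths. The following Python sketch does this for short excerpts around the edit site copied from SEQ ID NO: 5, 11, 13, 15, 17, 19 and 21; the excerpt boundaries and the function name are choices made here for illustration.

```python
# Excerpt of the wild-type protein (SEQ ID NO:5) spanning the edited region.
WILD_TYPE = "KQRKPHRFRPGTVALREIRK"

# Matching excerpts of the edited proteins, dashes kept as in the listing.
MUTANTS = {
    "CENH3(-3)": "KQRKPHRFRP----TVALREIRK",    # SEQ ID NO:11
    "CENH3(-6)": "KQRKPHRFRP----VALREIRK",     # SEQ ID NO:13
    "CENH3(-12)": "KQRKPHRF----KALREIRK",      # SEQ ID NO:15
    "CENH3(-18)": "KQRKP----AVALREIRK",        # SEQ ID NO:17
    "CENH3(-24)": "KQR----TVALREIRK",          # SEQ ID NO:19
    "CENH3(-27)": "KQRKP----LREIRK",           # SEQ ID NO:21
}


def net_deleted_nucleotides(wild_type, mutant):
    """Net coding-sequence loss in nucleotides (3 per missing residue).

    The dashes are position markers only, so they are stripped before
    the lengths are compared.
    """
    residues = mutant.replace("-", "")
    return 3 * (len(wild_type) - len(residues))


if __name__ == "__main__":
    for name, excerpt in MUTANTS.items():
        print(name, net_deleted_nucleotides(WILD_TYPE, excerpt))
```

Each computed value matches the number in the mutant name, consistent with the statement that all deletions are multiples of 3.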

References

[1] Liu Jun Tao, Yao Li, Liangda Wei, Zhang ya, Liu Chun Xia, Liu Yu Bo, Wang Yan Li and T.J.Kaili sea (2018), a rice haploid induction line created by gene editing, a creation method and an application invention patent thereof;

[2]Britt,A.B.,and Kuppu,S.(2016).Cenh3:an emerging player in haploid induction technology.Frontiers in plant science 7,357.

[3]Bailey,A.O.,Panchenko,T.,Sathyan,K.M.,Petkowski,J.J.,Pai,P.-J.,Bai,D.L., Russell,D.H.,Macara,I.G.,Shabanowitz,J.,and Hunt,D.F.(2013).Posttranslational modification of CENP-A influences the conformation of centromeric chromatin. Proceedings of the National Academy of Sciences 110,11827-11832.

[4]Ravi,M.,and Chan,S.W.(2010).Haploid plants produced by centromere-mediated genome elimination.Nature 464,615.

[5]Maheshwari,S.,Tan,E.H.,West,A.,Franklin,F.C.H.,Comai,L.,and Chan,S.W. (2015).Naturally occurring differences in CENH3 affect chromosome segregation in zygotic mitosis of hybrids.PLoS genetics 11,e1004970.

[6]Karimi-Ashtiyani,R.,Ishii,T.,Niessen,M.,Stein,N.,Heckmann,S.,Gurushidze,M., Banaei-Moghaddam,A.M.,Fuchs,J.,Schubert,V.,and Koch,K.(2015).Point mutation impairs centromeric CENH3 loading and induces haploid plants.Proceedings of the National Academy of Sciences 112,11211-11216.

[7]Kuppu,S.,Tan,E.H.,Nguyen,H.,Rodgers,A.,Comai,L.,Chan,S.W.,and Britt,A.B. (2015).Point mutations in centromeric histone induce post-zygotic incompatibility and uniparental inheritance.PLoS genetics 11,e1005494.

[8]Kelliher,T.,Starr,D.,Wang,W.,McCuiston,J.,Zhong,H.,Nuccio,M.L.,and Martin, B.(2016).Maternal haploids are preferentially induced by CENH3-tailswap transgenic complementation in maize.Frontiers in plant science 7,414.

Sequence listing

<110> China seed group Co., Ltd

<120> method for creating plant haploid induction line and application thereof

<130> 19C13564CN

<160> 21

<170> SIPOSequenceListing 1.0

<210> 1

<211> 4125

<212> DNA/RNA

<213> Artificial Sequence (Artificial Sequence)

<400> 1

atgccgaaga agcgccgccg cgtggacaag aagtactcca tcggcctcga catcggcacc 60

aactccgtgg gctgggccgt gatcaccgac gagtacaagg tgccgtccaa gaagttcaag 120

gtgctcggca acaccgaccg ccactccatc aagaagaacc tcatcggcgc cctcctcttc 180

gactccggcg agaccgccga ggccacccgc ctcaagcgca ccgcccgccg ccgctacacc 240

cgccgcaaga accgcatctg ctacctccag gagatcttct ccaacgagat ggccaaggtg 300

gacgactcct tcttccaccg cctcgaggag tccttcctcg tggaggagga caagaagcac 360

gagcgccacc cgatcttcgg caacatcgtg gacgaggtgg cctaccacga gaagtacccg 420

accatctacc acctccgcaa gaagctcgtg gactccaccg acaaggccga cctccgcctc 480

atctacctcg ccctcgccca catgatcaag ttccgcggcc acttcctcat cgagggcgac 540

ctcaacccgg acaactccga cgtggacaag ctcttcatcc agctcgtgca gacctacaac 600

cagctcttcg aggagaaccc gatcaacgcc tccggcgtgg acgccaaggc catcctctcc 660

gcccgcctct ccaagtcccg ccgcctcgag aacctcatcg cccagctccc gggcgagaag 720

aagaacggcc tcttcggcaa cctcatcgcc ctctccctcg gcctcacccc gaacttcaag 780

tccaacttcg acctcgccga ggacgccaag ctccagctct ccaaggacac ctacgacgac 840

gacctcgaca acctcctcgc ccagatcggc gaccagtacg ccgacctctt cctcgccgcc 900

aagaacctct ccgacgccat cctcctctcc gacatcctcc gcgtgaacac cgagatcacc 960

aaggccccgc tctccgcctc catgatcaag cgctacgacg agcaccacca ggacctcacc 1020

ctcctcaagg ccctcgtgcg ccagcagctc ccggagaagt acaaggagat cttcttcgac 1080

cagtccaaga acggctacgc cggctacatc gacggcggcg cctcccagga ggagttctac 1140

aagttcatca agccgatcct cgagaagatg gacggcaccg aggagctcct cgtgaagctc 1200

aaccgcgagg acctcctccg caagcagcgc accttcgaca acggctccat cccgcaccag 1260

atccacctcg gcgagctcca cgccatcctc cgccgccagg aggacttcta cccgttcctc 1320

aaggacaacc gcgagaagat cgagaagatc ctcaccttcc gcatcccgta ctacgtgggc 1380

ccgctcgccc gcggcaactc ccgcttcgcc tggatgaccc gcaagtccga ggagaccatc 1440

accccgtgga acttcgagga ggtggtggac aagggcgcct ccgcccagtc cttcatcgag 1500

cgcatgacca acttcgacaa gaacctcccg aacgagaagg tgctcccgaa gcactccctc 1560

ctctacgagt acttcaccgt gtacaacgag ctcaccaagg tgaagtacgt gaccgagggc 1620

atgcgcaagc cggccttcct ctccggcgag cagaagaagg ccatcgtgga cctcctcttc 1680

aagaccaacc gcaaggtgac cgtgaagcag ctcaaggagg actacttcaa gaagatcgag 1740

tgcttcgact ccgtggagat ctccggcgtg gaggaccgct tcaacgcctc cctcggcacc 1800

taccacgacc tcctcaagat catcaaggac aaggacttcc tcgacaacga ggagaacgag 1860

gacatcctcg aggacatcgt gctcaccctc accctcttcg aggaccgcga gatgatcgag 1920

gagcgcctca agacctacgc ccacctcttc gacgacaagg tgatgaagca gctcaagcgc 1980

cgccgctaca ccggctgggg ccgcctctcc cgcaagctca tcaacggcat ccgcgacaag 2040

cagtccggca agaccatcct cgacttcctc aagtccgacg gcttcgccaa ccgcaacttc 2100

atgcagctca tccacgacga ctccctcacc ttcaaggagg acatccagaa ggcccaggtg 2160

tccggccagg gcgactccct ccacgagcac atcgccaacc tcgccggctc cccggccatc 2220

aagaagggca tcctccagac cgtgaaggtg gtggacgagc tcgtgaaggt gatgggccgc 2280

cacaagccgg agaacatcgt gatcgagatg gcccgcgaga accagaccac ccagaagggc 2340

cagaagaact cccgcgagcg catgaagcgc atcgaggagg gcatcaagga gctcggctcc 2400

cagatcctca aggagcaccc ggtggagaac acccagctcc agaacgagaa gctctacctc 2460

tactacctcc agaacggccg cgacatgtac gtggaccagg agctcgacat caaccgcctc 2520

tccgactacg acgtggacca catcgtgccg cagtccttcc tcaaggacga ctccatcgac 2580

aacaaggtgc tcacccgctc cgacaagaac cgcggcaagt ccgacaacgt gccgtccgag 2640

gaggtggtga agaagatgaa gaactactgg cgccagctcc tcaacgccaa gctcatcacc 2700

cagcgcaagt tcgacaacct caccaaggcc gagcgcggcg gcctctccga gctcgacaag 2760

gccggcttca tcaagcgcca gctcgtggag acccgccaga tcaccaagca cgtggcccag 2820

atcctcgact cccgcatgaa caccaagtac gacgagaacg acaagctcat ccgcgaggtg 2880

aaggtgatca ccctcaagtc caagctcgtg tccgacttcc gcaaggactt ccagttctac 2940

aaggtgcgcg agatcaacaa ctaccaccac gcccacgacg cctacctcaa cgccgtggtg 3000

ggcaccgccc tcatcaagaa gtacccgaag ctcgagtccg agttcgtgta cggcgactac 3060

aaggtgtacg acgtgcgcaa gatgatcgcc aagtccgagc aggagatcgg caaggccacc 3120

gccaagtact tcttctactc caacatcatg aacttcttca agaccgagat caccctcgcc 3180

aacggcgaga tccgcaagcg cccgctcatc gagaccaacg gcgagaccgg cgagatcgtg 3240

tgggacaagg gccgcgactt cgccaccgtg cgcaaggtgc tctccatgcc gcaggtgaac 3300

atcgtgaaga agaccgaggt gcagaccggc ggcttctcca aggagtccat cctcccgaag 3360

cgcaactccg acaagctcat cgcccgcaag aaggactggg acccgaagaa gtacggcggc 3420

ttcgactccc cgaccgtggc ctactccgtg ctcgtggtgg ccaaggtgga gaagggcaag 3480

tccaagaagc tcaagtccgt gaaggagctc ctcggcatca ccatcatgga gcgctcctcc 3540

ttcgagaaga acccgatcga cttcctcgag gccaagggct acaaggaggt gaagaaggac 3600

ctcatcatca agctcccgaa gtactccctc ttcgagctcg agaacggccg caagcgcatg 3660

ctcgcctccg ccggcgagct ccagaagggc aacgagctcg ccctcccgtc caagtacgtg 3720

aacttcctct acctcgcctc ccactacgag aagctcaagg gctccccgga ggacaacgag 3780

cagaagcagc tcttcgtgga gcagcacaag cactacctcg acgagatcat cgagcagatc 3840

tccgagttct ccaagcgcgt gatcctcgcc gacgccaacc tcgacaaggt gctctccgcc 3900

tacaacaagc accgcgacaa gccgatccgc gagcaggccg agaacatcat ccacctcttc 3960

accctcacca acctcggcgc cccggccgcc ttcaagtact tcgacaccac catcgaccgc 4020

aagcgctaca cctccaccaa ggaggtgctc gacgccaccc tcatccacca gtccatcacc 4080

ggcctctacg agacccgcat cgacctctcc cagctcggcg gcgac 4125

<210> 2

<211> 18

<212> DNA/RNA

<213> Artificial Sequence (Artificial Sequence)

<400> 2

cagcacaggt taagtctg 18

<210> 3

<211> 18

<212> DNA/RNA

<213> Artificial Sequence (Artificial Sequence)

<400> 3

gtctgtctca acggtaag 18

<210> 4

<211> 2500

<212> DNA/RNA

<213> Rice (Oryza sativa)

<400> 4

acgccgcttc agtttgaaaa cccaccgcca cgtcgccgcc gccgccgccg ccgccgccga 60

cgccgagatg gctcgcacga agcacccggc ggtgaggaag tcgaaggcgg agcccaagaa 120

gaagctccag ttcgaacgct cccctcggcc gtcgaaggcg cagcgcgctg gtggtgagcg 180

cgcgctctct ccccctctgc gtttcttttt ttttttcctt tttctttcaa tggcggtgga 240

tggtgaagct tatgcccccc cccccccctt cccgcctctt gcttgtcccc tttgcaggcg 300

gcacgggtac ctcggcgacc acggtgcgtg cgggagcggg tctttcgttt ggtgattttt 360

tgattttgtg gggggatatg tttttgtttt gtatcttggc tggatggatg gcttgctcac 420

cacctgtttg atggaatgca gaggagcgcg gctggaacat cggcttcagg tgcgttctct 480

tgggggggtt tctagggtta ttcatgggct cgttggagct tttcctttct gtctcttgga 540

ttccggggga cctgaggggc tcaatgtgtc ccttttcttg ctctgtttta ccgtgtgctg 600

tactttcctc atcgttgttt tctgaatata ttataagaac agtagttgca gaaagatctt 660

caattgctca tcagtcaaag cttttcttgt tttcattctg aaataatagc aaatccagtt 720

tggtccatgg aggggttatc tgaaacatta tgaccataaa acatggtatt aagcattgct 780

agccaagaaa tgtgtggttt ttagacacga tgttgatagg tgatttttat gctcatccat 840

tattagtctt tgcatcgtgg gaactgattt agtaaacttt ctttagtgtc atggttcaaa 900

tagcgtctgt tctacctaga tgataggtat ccatatggaa gtcttggctt tggaattgct 960

ctccttttgt tctcctgtga ttaaataact ttaacatgtg tgtgaagcag ggacgcctag 1020

gcagcaaacg aagcagagga agccacaccg cttccgtcca ggcacagtgg cactgcggga 1080

gatcaggaaa tttcagaaaa ccaccgaact gctgatcccg tttgcaccat tttctcggct 1140

ggtgggtaca tcctgaacct gccttctctc tatatcaaat atttcgtagt gcaaacttgt 1200

gtgatggaag ctttttgtgc cgataaaatt tgcaggtcag ggagatcact gatttctatt 1260

caaaggatgt gtcacggtgg acccttgaag ctctccttgc attgcaagag gtcagtggtc 1320

aaacctgttt attataagtt tacaactgat ggcttagtta gggaagggtc agactgaatt 1380

atactgttta aattccattc tgcttcaaga ctcaagtcac ggctcaagag tgtaactgaa 1440

aaatgtacaa atcttccatg atcaataaaa tgaatatctc tgtgtgttga tttatgagtc 1500

agattgctaa attattatcc tttttcagta gaacacctat atactacaaa tatgcaacct 1560

ccctattttg ttgtgtctgt tcaagattgc tatcatagag tataccaatt tcagttcctt 1620

ctttccagcc atgtctgttt ctgcataacc aggaaaagga acaaagagct gacttaattc 1680

tcacaaaata aattatgtta tttacttgct gtcctgcaaa tttccagtgg ttttccctct 1740

cctgcaggca gcagaatacc acttagtgga catatttgaa gtgtcaaatc tctgcgccat 1800

ccatgctaag cgtgttacca tcagtaagtt gtcattctga atgaactttt ctctttcttt 1860

ttctcccttt atattattat gctaaatgga tatcatatat gccacagcct acatgatatc 1920

atatacgcat ccacttcaaa agcattctat ttttttatag gaataacatt ctaattgcag 1980

gatgattctt aatacatgtg tttatattta atgtcatatc tagttttcat actcttaaat 2040

ttatcatgat tattgattaa acatagggag aattagttgg tttgtgagtt ttgaggtgtg 2100

aaatatgctg cttgctattc cctgtaaagc ttatcagcgt tgtcattgtg tggtttaaca 2160

aataaacgtt tgttctgcag tgcaaaagga catgcaactt gccaggcgta tcggtgggcg 2220

gaggccatgg tgaaaatttg tttgcgagcc atgcagcatg atggacaagg agcaacatgt 2280

gtcgttgatt aacattttag aaagtagtgt agatgtatct tcacataggg atcaacttac 2340

ccttcgttcc cattctaatt cagttgatgt tagtatttac cttttgctcc atttggattg 2400

gtcgaattca ggatttcatc aaacagtcga ttgtgaaatg tgaaccagga attgttgtgt 2460

tgattgcaat aatgggttcc tctcacctgc ttcttccatc 2500

<210> 5

<211> 165

<212> PRT

<213> Rice (Oryza sativa)

<400> 5

Met Ala Arg Thr Lys His Pro Ala Val Arg Lys Ser Lys Ala Glu Pro

1 5 10 15

Lys Lys Lys Leu Gln Phe Glu Arg Ser Pro Arg Pro Ser Lys Ala Gln

20 25 30

Arg Ala Gly Gly Gly Thr Gly Thr Ser Ala Thr Thr Arg Ser Ala Ala

35 40 45

Gly Thr Ser Ala Ser Ala Gly Thr Pro Arg Gln Gln Thr Lys Gln Arg

50 55 60

Lys Pro His Arg Phe Arg Pro Gly Thr Val Ala Leu Arg Glu Ile Arg

65 70 75 80

Lys Phe Gln Lys Thr Thr Glu Leu Leu Ile Pro Phe Ala Pro Phe Ser

85 90 95

Arg Leu Val Arg Glu Ile Thr Asp Phe Tyr Ser Lys Asp Val Ser Arg

100 105 110

Trp Thr Leu Glu Ala Leu Leu Ala Leu Gln Glu Ala Ala Glu Tyr His

115 120 125

Leu Val Asp Ile Phe Glu Val Ser Asn Leu Cys Ala Ile His Ala Lys

130 135 140

Arg Val Thr Ile Met Gln Lys Asp Met Gln Leu Ala Arg Arg Ile Gly

145 150 155 160

Gly Arg Arg Pro Trp

165

<210> 6

<211> 21

<212> DNA/RNA

<213> Artificial Sequence (Artificial Sequence)

<400> 6

accgcttccg tccaggcaca g 21

<210> 7

<211> 98

<212> DNA/RNA

<213> Artificial Sequence (Artificial Sequence)

<400> 7

accgcttccg tccaggcaca ggttttagag ctagaaatag caagttaaaa taaggctagt 60

ccgttatcaa cttgaaaaag tggcaccgag tcggtgct 98

<210> 8

<211> 19

<212> DNA/RNA

<213> Artificial Sequence (Artificial Sequence)

<400> 8

cgcttccgtc caggcacag 19

<210> 9

<211> 22

<212> DNA/RNA

<213> Artificial Sequence (Artificial Sequence)

<400> 9

gaatagaaat cagtgatctc cc 22

<210> 10

<211> 2477

<212> DNA/RNA

<213> Rice (Oryza sativa)

<400> 10