Application of substance for inhibiting OaGS3 gene expression in regulating and controlling length of tetraploid wild rice grains
Reading notes: This technology, "Application of substance for inhibiting OaGS3 gene expression in regulating and controlling length of tetraploid wild rice grains", was designed by Li Jiayang (李家洋), Yu Hong (余泓), Meng Xiangbing (孟祥兵), Zhang Jingkun (张静昆), Liu Guifu (刘贵富), Jing Yanhui (荆彦辉) and Chen Mingjiang (陈明江) on 2021-02-02. Main content: the invention provides the use of a substance that regulates the OaGS3 gene in regulating the yield or grain length of tetraploid wild rice. The specific embodiments of the invention show that editing the OaGS3 gene increases grain length, so that higher-yielding tetraploid wild rice can be bred and the domestication and utilization of tetraploid wild rice can be advanced.
1. Use of a substance that inhibits expression of the OaGS3 gene for increasing the yield or the grain length of tetraploid wild rice, characterized in that the OaGS3 gene encodes an OaGS3 protein, and the OaGS3 protein is a protein of A1, A2 or A3 as follows:
A1, a protein whose amino acid sequence is shown in sequence 5 or sequence 6 in the sequence listing;
A2, a protein that is obtained by substituting and/or deleting and/or adding one or more amino acid residues in the amino acid sequence shown in sequence 5 or sequence 6 in the sequence listing, has more than 80% identity with the protein of A1, and has the same function;
A3, a fusion protein obtained by attaching a protein tag to the N-terminus and/or the C-terminus of the protein of A1 or A2.
2. The use according to claim 1, wherein the OaGS3 gene is the gene shown in E1 or E2 below:
E1, a DNA molecule whose coding sequence on the coding strand is shown in sequence 3 or sequence 4 in the sequence listing;
E2, a DNA molecule whose nucleotide sequence is shown in sequence 1 or sequence 2 in the sequence listing.
3. The use of claim 2, wherein the inhibition of OaGS3 gene expression is achieved by site-directed genome editing of the OaGS3 gene in tetraploid wild rice with a CRISPR/Cas9 system, the target sequence of the CRISPR/Cas9 system being the XXX portion of any XXXNGG sequence located in the nucleotide sequence of the OaGS3 gene; wherein XXX is any nucleic acid sequence of 19-20 bp in the DNA molecule, and N is any one of the nucleotides A, T, G and C.
4. The use of claim 3, wherein the target sequence is the sequence from position 392 to position 411 of sequence 1 in the sequence listing.
5. The use according to any one of claims 1 to 4, wherein the substance is a CRISPR/Cas9 system; the CRISPR/Cas9 system includes the following 1) or 2):
1) an sgRNA whose target is the sequence at positions 392 to 411 of sequence 1 or at positions 579 to 598 of sequence 2;
2) a CRISPR/Cas9 vector expressing the sgRNA.
6. A method for breeding tetraploid wild rice with higher yield, characterized by comprising the following step: inhibiting the expression of the OaGS3 gene in recipient tetraploid wild rice to obtain target tetraploid wild rice with a higher yield than the recipient tetraploid wild rice.
7. A method for breeding tetraploid wild rice with longer grains, characterized by comprising the following step: inhibiting the expression of the OaGS3 gene in recipient tetraploid wild rice to obtain target tetraploid wild rice whose grains are longer than those of the recipient tetraploid wild rice.
8. An agent, wherein the active ingredient of the agent is a substance that inhibits the expression of the gene encoding the OaGS3 protein, reduces the abundance of the OaGS3 protein, and/or knocks out the gene encoding the OaGS3 protein.
9. A CRISPR/Cas9 system, characterized in that the CRISPR/Cas9 system comprises 1) or 2) as follows:
1) an sgRNA whose target is the sequence at positions 392 to 411 of sequence 1 or at positions 579 to 598 of sequence 2;
2) a CRISPR/Cas9 vector expressing the sgRNA.
10. Any one of the following uses Y1-Y5:
Y1, the use according to any one of claims 1 to 5 in domestication breeding of tetraploid wild rice;
Y2, use of the method of claim 6 or 7 in domestication breeding of tetraploid wild rice;
Y3, use of the agent of claim 8 in domestication breeding of tetraploid wild rice;
Y4, use of the CRISPR/Cas9 system of claim 9 in domestication breeding of tetraploid wild rice;
Y5, use of the substance for regulating the OaGS3 gene as defined in any one of claims 1 to 5 in rice breeding.
Technical Field
The invention belongs to the technical field of biology, and particularly relates to application of a substance for inhibiting OaGS3 gene expression in regulating and controlling the length of tetraploid wild rice grains.
Background
Rice is one of the most important food crops in the world, and high, stable rice yields are of great significance for global food security. However, accelerating urbanization has sharply reduced the area of cultivated land, frequent extreme weather has raised the risk to stable grain yields, and traditional breeding techniques increase crop yields only slowly. Food security is therefore an increasingly prominent problem and a new challenge.
The high-stalk tetraploid wild rice Oryza alta (O. alta) is a perennial allotetraploid rice distributed mainly in South America. Compared with diploid cultivated rice, it has a large biomass, a perennial habit, strong regeneration capacity and strong stress adaptability. Because it has not been domesticated, however, its grains are small, with a thousand-kernel weight of only 8.79 g. Increasing the grain length of high-stalk tetraploid wild rice by genome editing increases grain weight and, combined with the other advantages of this species, serves the goal of increasing yield.
Disclosure of Invention
The technical problem to be solved by the invention is how to increase the yield of tetraploid wild rice.
In order to solve the above technical problem, the invention provides the use of a substance that inhibits expression of the OaGS3 gene for increasing the yield or the grain length of tetraploid wild rice, wherein the OaGS3 gene encodes an OaGS3 protein, and the OaGS3 protein is a protein of A1, A2 or A3 as follows:
A1, a protein whose amino acid sequence is shown in sequence 5 or sequence 6 in the sequence listing;
A2, a protein that is obtained by substituting and/or deleting and/or adding one or more amino acid residues in the amino acid sequence shown in sequence 5 or sequence 6 in the sequence listing, has more than 80% identity with the protein of A1, and has the same function;
A3, a fusion protein obtained by attaching a protein tag to the N-terminus and/or the C-terminus of the protein of A1 or A2.
In the above application, sequence 5 in the sequence listing consists of 221 amino acid residues, and sequence 6 in the sequence listing consists of 211 amino acid residues.
In the above application, identity refers to the identity of amino acid sequences. It can be determined with homology search tools available on the Internet, such as the BLAST pages of the NCBI website. For example, in Advanced BLAST 2.1, a value (%) of identity for a pair of amino acid sequences can be obtained by selecting blastp as the program, setting Expect to 10, turning all filters OFF, using BLOSUM62 as the matrix, and setting the gap existence cost, per-residue gap cost and lambda ratio to 11, 1 and 0.85 (the default values), respectively.
In the above application, the identity of 80% or more may be at least 81%, 85%, 90%, 91%, 92%, 95%, 96%, 98%, 99% or 100%.
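The identity threshold described above can be illustrated with a minimal sketch. It assumes the two protein sequences have already been aligned (for example by blastp) and simply counts identical columns over the alignment length. The names `percent_identity` and `meets_identity_threshold` and the toy peptides are hypothetical, and the value BLAST reports may be computed over a different denominator.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percentage of alignment columns that hold the same residue in both rows.

    Both arguments are rows of one pairwise alignment; '-' marks a gap.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    if not aligned_a:
        raise ValueError("alignment is empty")
    matches = sum(1 for x, y in zip(aligned_a, aligned_b) if x == y and x != "-")
    return 100.0 * matches / len(aligned_a)


def meets_identity_threshold(aligned_a: str, aligned_b: str, threshold: float = 80.0) -> bool:
    # Assumption: "80% or more" is read as an inclusive threshold.
    return percent_identity(aligned_a, aligned_b) >= threshold


# Toy alignments (hypothetical, not taken from sequence 5 or sequence 6).
print(percent_identity("MAMAAAAPRP", "MAMAAAAPKP"))        # one substitution in ten columns
print(meets_identity_threshold("MAMA-AAP", "MAMAAAAP"))    # one gap column in eight
```

Whether the threshold is inclusive is a reading of the text, so the comparison is kept in one place where it is easy to change.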
In the above application, the OaGS3 gene may specifically be the gene shown in E1 or E2 as follows:
E1, a DNA molecule whose coding sequence (ORF) on the coding strand is shown in sequence 3 or sequence 4 in the sequence listing;
E2, a DNA molecule whose nucleotide sequence is shown in sequence 1 or sequence 2 in the sequence listing.
In the application, the inhibition of the expression of the OaGS3 gene can be realized by carrying out chemical mutagenesis, physical mutagenesis, RNAi, genome site-directed editing or homologous recombination on the OaGS3 gene in tetraploid wild rice.
In the above application, the site-directed genome editing may be carried out with zinc finger nucleases (ZFN), transcription activator-like effector nucleases (TALEN), the clustered regularly interspaced short palindromic repeats/CRISPR-associated protein 9 (CRISPR/Cas9) system, or any other technology capable of site-directed genome editing.
In the above application, the site-directed genome editing can be carried out with a CRISPR/Cas9 system. The target sequence of the CRISPR/Cas9 system is the XXX portion of any XXXNGG sequence located in the nucleotide sequence of the OaGS3 gene; wherein XXX is any nucleic acid sequence of 19-20 bp in the DNA molecule, and N is any one of the nucleotides A, T, G and C.
The specific target sequence can be the sequence from position 392 to position 411 of sequence 1 in the sequence listing or the sequence from position 579 to position 598 of sequence 2.
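The XXXNGG rule can be sketched as a simple scan for a 20 nt protospacer followed by an NGG motif. The fragment below is copied from positions 361 to 420 of sequence 1 in the sequence listing; the function name `find_targets` is hypothetical, only the listed strand is scanned, and no off-target or efficiency scoring is attempted.

```python
import re

# Positions 361-420 of sequence 1 (OaGS3-CC), copied from the sequence listing.
FRAGMENT = "ggcggcggcggcgccccggcccaagtcgccgccggcagcgcccgacccctgcggccgcca"
FRAGMENT_START = 361  # 1-based coordinate of the first base of FRAGMENT


def find_targets(seq, offset=1, length=20):
    """Yield (first, last, protospacer, pam) for every XXX+NGG site on the given strand.

    A lookahead is used so that overlapping candidate sites are all reported.
    """
    pattern = re.compile(r"(?=([acgt]{%d})([acgt]gg))" % length)
    for match in pattern.finditer(seq.lower()):
        first = offset + match.start()
        yield first, first + length - 1, match.group(1).upper(), match.group(2).upper()


sites = list(find_targets(FRAGMENT, FRAGMENT_START))
for first, last, protospacer, pam in sites:
    print(first, last, protospacer, pam)
```

Among the candidates reported for this fragment is the site at positions 392 to 411, which is followed by the motif CGG.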
In the above application, the substance may be a CRISPR/Cas9 system;
the CRISPR/Cas9 system includes the following 1) or 2):
1) an sgRNA whose target is the sequence at positions 392 to 411 of sequence 1 or at positions 579 to 598 of sequence 2;
2) a CRISPR/Cas9 vector expressing the sgRNA.
In a specific embodiment of the invention, the recombinant vector of the CRISPR/Cas9 system is the recombinant vector VK005-OaGS3gRNA; the recombinant vector VK005-OaGS3gRNA contains an sgRNA expression cassette and a Cas9-encoding gene, and can express the sgRNA and Cas9.
The invention also provides a method for breeding tetraploid wild rice (O. alta) with higher yield, which comprises the following step: inhibiting the expression of the OaGS3 gene in recipient tetraploid wild rice to obtain target tetraploid wild rice with a higher yield than the recipient tetraploid wild rice.
The invention also provides a method for breeding tetraploid wild rice (O. alta) with longer grains, which comprises the following step: inhibiting the expression of the OaGS3 gene in recipient tetraploid wild rice to obtain target tetraploid wild rice whose grains are longer than those of the recipient tetraploid wild rice.
In the above method, the target tetraploid wild rice is tetraploid wild rice satisfying the following conditions: both the CC subgenome and the DD subgenome are mutated in the target region of the sgRNA.
In the above method, the tetraploid wild rice of interest is understood to include not only the first-generation to second-generation transgenic tetraploid wild rice but also its progeny. For transgenic tetraploid wild rice, the gene can be inherited in that species, or can be transferred into other varieties of the same species using conventional breeding techniques. The transgenic tetraploid wild rice includes seed, callus, complete plant and cell.
In order to solve the above technical problem, the present invention also provides an agent for increasing the yield or grain length of tetraploid wild rice, the agent having as an active ingredient a substance that inhibits the expression of the gene encoding the OaGS3 protein, reduces the abundance of the OaGS3 protein, and/or knocks out the gene encoding the OaGS3 protein.
In the reagent, the substance is a CRISPR/Cas9 system;
the CRISPR/Cas9 system includes the following 1) or 2):
1) an sgRNA whose target is the sequence at positions 392 to 411 of sequence 1 or at positions 579 to 598 of sequence 2;
2) a CRISPR/Cas9 vector expressing the sgRNA.
The active ingredient of the agent may further contain other biological ingredients or/and non-biological ingredients, and the other active ingredients of the agent can be determined by those skilled in the art according to the yield-increasing effect or the grain length-increasing effect of tetraploid wild rice.
The invention also protects a CRISPR/Cas9 system comprising 1) or 2) as follows:
1) an sgRNA whose target is the sequence at positions 392 to 411 of sequence 1 or at positions 579 to 598 of sequence 2;
2) a CRISPR/Cas9 vector expressing the sgRNA.
The invention also provides any one of the following uses Y1-Y5:
Y1, the above use in domestication breeding of tetraploid wild rice;
Y2, use of the above method in domestication breeding of tetraploid wild rice;
Y3, use of the above agent in domestication breeding of tetraploid wild rice;
Y4, use of the above CRISPR/Cas9 system in domestication breeding of tetraploid wild rice;
Y5, use of the above substance for regulating the OaGS3 gene in rice breeding.
The invention provides application of a substance for regulating an OaGS3 gene in regulating and controlling the yield/seed length of tetraploid wild rice. The specific embodiment of the invention proves that the increase of the grain length can be realized by editing the gene OaGS3, so that the tetraploid wild rice with higher yield is bred, and the domestication and utilization of the tetraploid wild rice are promoted.
Drawings
FIG. 1 is a diagram of the target site of the OaGS3 gene in O. alta in example 1 of the present invention.
FIG. 2 is a diagram showing the construction of VK005-OaGS3gRNA in example 1 of the present invention.
FIG. 3 shows the electrophoresis bands of OaGS3-CC and OaGS3-DD amplified from some of the T0 plants in example 2 of the present invention.
FIG. 4 shows sequencing results for the mutations in some of the genome-edited T0 materials in example 2 of the present invention. O. alta is the wild type, and T0-1 and T0-2 are genome-edited T0 plants.
FIG. 5 is a photograph of the grain length phenotype of some of the genome-edited materials and the wild type in example 2 of the present invention. O. alta is the wild type, and T0-1 and T0-2 are genome-edited lines.
FIG. 6 shows grain length statistics for some of the genome-edited materials and the wild type in example 2 of the present invention. O. alta is the wild type, and T0-1 and T0-2 are genome-edited lines. Data shown in the figure are mean ± SD with 20 replicates; the significance marker in the figure indicates P < 0.01.
Detailed Description
The following examples are intended to illustrate the invention without limiting its scope. The experimental procedures used in the following examples are all conventional procedures unless otherwise specified. Materials, reagents and the like used in the following examples are commercially available unless otherwise specified.
The binary vector VK005-01 is the Cas9/gRNA vector in the plant Cas9/gRNA plasmid construction kit (catalog number: VK005-01) of Beijing Weishanglide Biotech Co.
The tetraploid wild rice in the following examples is Oryza alta 2007-24, the high-stalk tetraploid wild rice material (O. alta Swallen, chromosome set CCDD, No. YD-7900) described in section 1.1, left column of page 906 of document 1. It is publicly available from the Institute of Genetics and Developmental Biology or from the national wild rice germplasm collection in Nanning. Document 1: Douglas et al., Research on callus-induced differentiation of high-stalk tetraploid wild rice, Journal of Agriculture in the Southwest, 2014, 27(3): 905-909.
The transcript used in the examples below is T01, as an example only, and does not limit the editing sites in the application. The examples were carried out according to the usual experimental conditions or the product specifications, unless otherwise specified.
In the following examples, unless otherwise specified, the 1 st position of each nucleotide sequence in the sequence listing is the 5 'terminal nucleotide of the corresponding DNA/RNA, and the last position is the 3' terminal nucleotide of the corresponding DNA/RNA.
Example 1 Selection of the OaGS3 target site in tetraploid wild rice and construction of the CRISPR/Cas9 gene editing vector
First, selection of target sequences
The high-stalk tetraploid wild rice Oryza alta is an allotetraploid, and most of its genes have one copy on the CC subgenome and one on the DD subgenome. Alignment analysis shows that two homologous OaGS3 genes exist in Oryza alta: the gene located on chromosome 3 of the CC subgenome is named OaGS3-CC (the nucleotide sequence is shown as sequence 1), and the gene located on chromosome 3 of the DD subgenome is named OaGS3-DD (the nucleotide sequence is shown as sequence 2).
OaGS3-CC comprises 5926 nucleotides, and the nucleotide sequence is shown as a sequence 1. The CDS sequence is shown in sequence 3, and the coded protein sequence is shown in sequence 5.
OaGS3-DD consists of 6193 nucleotides, and the nucleotide sequence is shown as a sequence 2. The CDS sequence is shown in sequence 4, and the coded protein sequence is shown in sequence 6.
As shown in FIG. 1, target sequences were selected on conserved regions of the first exons of OaGS3-CC and OaGS3-DD as follows:
target sequence: 5'-CCGGCAGCGCCCGACCCCTG-3' (positions 392 to 411 of sequence 1, or positions 579 to 598 of sequence 2);
the sgRNA was designed according to the target sequence and designated OaGS3-sgRNA.
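That the same 20 nt target is present in both homoeologous copies at the stated coordinates can be checked with a short sketch. The two fragments are copied from the sequence listing (positions 361 to 420 of sequence 1 and positions 541 to 600 of sequence 2); the helper `slice_by_position` is a hypothetical name.

```python
TARGET = "CCGGCAGCGCCCGACCCCTG"

# (fragment, 1-based coordinate of its first base), copied from the sequence listing.
CC_FRAGMENT = ("ggcggcggcggcgccccggcccaagtcgccgccggcagcgcccgacccctgcggccgcca", 361)  # sequence 1
DD_FRAGMENT = ("tgatggcaatggcggcggcgccccggcccaagtcgccgccggcagcgcccgacccctgcg", 541)  # sequence 2


def slice_by_position(fragment, first, last):
    """Return bases first..last (1-based, inclusive) from a (sequence, start) fragment."""
    seq, start = fragment
    if first < start or last > start + len(seq) - 1:
        raise ValueError("requested positions fall outside the fragment")
    return seq[first - start:last - start + 1].upper()


cc_site = slice_by_position(CC_FRAGMENT, 392, 411)
dd_site = slice_by_position(DD_FRAGMENT, 579, 598)
print(cc_site == TARGET, dd_site == TARGET)
```

A single sgRNA against this conserved site is what allows both OaGS3-CC and OaGS3-DD to be edited in one transformation.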
Second, construction of recombinant plasmid
The following single-stranded primers with linker sequences were synthesized:
VK005-OaGS3-F: (the sequence marked with a wavy line is the target sequence, and the underlined sequence is the linker sequence);
VK005-OaGS3-R: (the sequence marked with a wavy line is the reverse complement of the target sequence, and the underlined sequence is the linker sequence).
VK005-OaGS3-F and VK005-OaGS3-R were annealed to form an oligo dimer. The binary vector VK005-01 was digested with BspQI alone, the vector fragment of about 17 kb was recovered by gel purification and ligated to the oligo dimer with T4 ligase to obtain a recombinant vector, which was sequenced with the following sequencing primer:
VK005 primer:5’-GCCATGAATAGGTCTATGACC-3’。
The structure of the positive plasmid verified by sequencing is shown in FIG. 2: the sgRNA expression cassette is correctly inserted at the BspQI restriction site. The positive plasmid was named VK005-OaGS3gRNA. VK005-OaGS3gRNA contains an sgRNA expression cassette and a Cas9-encoding gene, and can express the sgRNA and Cas9.
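The core of the reverse oligo used in the annealing step is the reverse complement of the target sequence, which can be derived as sketched below. The linker sequences of VK005-OaGS3-F and VK005-OaGS3-R are not reproduced in this text, so they are deliberately left out; `reverse_complement` is a hypothetical helper name.

```python
# Map each base to its complement; both cases are handled so listing-style
# lowercase sequences can be passed in unchanged.
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Complement every base, then reverse, giving the opposite strand 5'->3'."""
    return seq.translate(COMPLEMENT)[::-1]


TARGET = "CCGGCAGCGCCCGACCCCTG"      # target sequence from example 1
target_rc = reverse_complement(TARGET)  # target portion of the reverse oligo (linker omitted)
print(target_rc)
```

Applying the function twice returns the original target, which is a quick sanity check before ordering oligos.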
Example 2 Transformation of tetraploid wild rice callus and phenotypic identification
1. Generation of OaGS3-knockout tetraploid wild rice O. alta plants
The plasmid VK005-OaGS3gRNA obtained in example 1 was transformed into Agrobacterium EHA105 by electroporation, callus of the tetraploid wild rice O. alta was infected with the Agrobacterium, and the infected callus was cultured to obtain transformed seedlings. Genomic DNA was extracted from the transformed seedlings, and OaGS3-CC and OaGS3-DD were amplified with specific primers.
The primer pair for amplifying OaGS3-CC consists of OaGS3-CC-F and OaGS3-CC-R:
OaGS3-CC-F:5’-CCTCCGCCATTTATAATCCA-3’;
OaGS3-CC-R:5’-TATGCATTCGTGGTTTCAGC-3’。
The primer pair for amplifying OaGS3-DD consists of OaGS3-DD-F and OaGS3-DD-R:
OaGS3-DD-F:5’-GCTGCCTTTCCATCATCATT-3’;
OaGS3-DD-R:5’-ATGTTGGGCCATGCATATTT-3’。
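Where the four primers sit on the listed sequences can be checked in silico, as a minimal sketch. Only the 60 nt stretches of sequence 1 and sequence 2 that contain the primer sites are copied here, together with the coordinate of their first base; the helper names are hypothetical, and the computed lengths describe the unedited reference sequences only.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    return seq.upper().translate(COMPLEMENT)[::-1]


def primer_span(primer, fragment, fragment_start, reverse=False):
    """Return 1-based (first, last) coordinates of the primer site on the listed strand."""
    site = reverse_complement(primer) if reverse else primer.upper()
    index = fragment.upper().find(site)
    if index < 0:
        raise ValueError("primer site not found in fragment")
    first = fragment_start + index
    return first, first + len(site) - 1


def amplicon_length(fwd, fwd_frag, fwd_start, rev, rev_frag, rev_start):
    """Length from the first base of the forward site to the last base of the reverse site."""
    first, _ = primer_span(fwd, fwd_frag, fwd_start)
    _, last = primer_span(rev, rev_frag, rev_start, reverse=True)
    return last - first + 1


# Stretches copied from the sequence listing; the suffix is the first coordinate.
CC_61 = "cttccatccatccatgtgctcctcctccgccatttataatccaccttactccttatctcc"    # sequence 1
CC_841 = "cacctatagctgaaaccacgaatgcatagcttacctctactcgatctgttgactgttcaa"   # sequence 1
DD_121 = "ctgctcgatgtgccacgcctctcagccttgctgcctttccatcatcattattcattcacg"   # sequence 2
DD_841 = "aagtttaaaacattggaacaacaaatatgcatggcccaacattgctcctctcgatctctt"   # sequence 2

cc_len = amplicon_length("CCTCCGCCATTTATAATCCA", CC_61, 61,
                         "TATGCATTCGTGGTTTCAGC", CC_841, 841)
dd_len = amplicon_length("GCTGCCTTTCCATCATCATT", DD_121, 121,
                         "ATGTTGGGCCATGCATATTT", DD_841, 841)
print(cc_len, dd_len)
```

In both copies the target site lies between the two primer sites, so the amplicons cover the edited region.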
The electrophoresis pattern of the amplification products is shown in FIG. 3. Genome-edited material was selected and sequenced to determine the mutation type. Partial results are shown in FIG. 4: both OaGS3-CC and OaGS3-DD carry frameshift mutations, which change the protein sequence and disrupt protein function. The transformed seedlings carrying the gene editing mutations are T0-generation transgenic tetraploid wild rice (numbered T0-1 and T0-2).
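The frameshift criterion applied when reading the sequencing results can be sketched as follows: an insertion or deletion shifts the reading frame when the net length change is not a multiple of three. The edited alleles below are hypothetical illustrations and are not the alleles shown in FIG. 4; the function names are also hypothetical.

```python
TARGET = "CCGGCAGCGCCCGACCCCTG"  # unedited target region


def is_frameshift(reference: str, edited: str) -> bool:
    """True when the net length change is not a multiple of three."""
    return (len(edited) - len(reference)) % 3 != 0


def edited_in_both(cc_edited: str, dd_edited: str, reference: str = TARGET) -> bool:
    """True when the CC copy and the DD copy both carry a frameshift in the target region."""
    return is_frameshift(reference, cc_edited) and is_frameshift(reference, dd_edited)


# Hypothetical alleles for illustration only.
one_bp_insertion = TARGET[:17] + "A" + TARGET[17:]   # +1 bp, shifts the frame
three_bp_deletion = TARGET[:14] + TARGET[17:]        # -3 bp, keeps the frame

print(is_frameshift(TARGET, one_bp_insertion))
print(is_frameshift(TARGET, three_bp_deletion))
print(edited_in_both(one_bp_insertion, three_bp_deletion))
```

A plant counts as a knockout of both copies only when the CC and DD alleles each pass the check, which matches the requirement that both subgenomes be mutated in the sgRNA target region.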
2. Phenotype of OaGS3-knockout tetraploid wild rice O. alta
The T0-generation lines T0-1 and T0-2 were each planted in the field at the Hainan experimental base of the Institute of Genetics and Developmental Biology, Chinese Academy of Sciences. Under normal agronomic management, seeds were harvested at maturity, the awns were removed, the seeds were fully dried, and grain length was measured with at least 3 biological replicates.
Photographs of the grain length of each line are shown in FIG. 5 and the statistics in FIG. 6; the grains of the gene-edited lines T0-1 and T0-2 are significantly longer than those of the wild type.
Therefore, in the tetraploid wild rice Oryza alta, rapid domestication of grains can be realized by editing the gene OaGS3, so that the grains of the tetraploid wild rice Oryza alta are lengthened, and the yield increasing potential of the high-stalk tetraploid wild rice is improved.
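The grain length comparison summarized as mean ± SD can be sketched with the Python standard library. The source does not name the significance test that was used, so Welch's t statistic is shown here purely as one common choice, and the grain lengths are placeholder values rather than measurements from this document.

```python
import math
import statistics


def mean_sd(values):
    """Return (mean, sample standard deviation) of a list of grain lengths."""
    return statistics.mean(values), statistics.stdev(values)


def welch_t(sample_a, sample_b):
    """Welch's t statistic for two independent samples with unequal variances."""
    mean_a, sd_a = mean_sd(sample_a)
    mean_b, sd_b = mean_sd(sample_b)
    standard_error = math.sqrt(sd_a ** 2 / len(sample_a) + sd_b ** 2 / len(sample_b))
    return (mean_a - mean_b) / standard_error


# Placeholder grain lengths in mm, NOT data from the examples above.
wild_type = [5.0, 5.2, 4.8, 5.1, 4.9]
edited_line = [6.0, 6.2, 5.8, 6.1, 5.9]

for name, values in (("wild type", wild_type), ("edited line", edited_line)):
    mean, sd = mean_sd(values)
    print(f"{name}: {mean:.2f} ± {sd:.2f} mm (n={len(values)})")
print(f"Welch t = {welch_t(edited_line, wild_type):.2f}")
```

Converting the statistic to a P value additionally requires the Welch-Satterthwaite degrees of freedom and a t distribution, which the standard library does not provide.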
The present invention has been described in detail above. It will be apparent to those skilled in the art that the invention can be practiced in a wide range of equivalent parameters, concentrations, and conditions without departing from the spirit and scope of the invention and without undue experimentation. While the invention has been described with reference to specific embodiments, it will be appreciated that the invention can be further modified. In general, this application is intended to cover any variations, uses, or adaptations of the invention following, in general, the principles of the invention and including such departures from the present disclosure as come within known or customary practice within the art to which the invention pertains. The use of some of the essential features is possible within the scope of the claims attached below.
Sequence listing
<110> institute of genetics and developmental biology of Chinese academy of sciences
<120> application of substance inhibiting OaGS3 gene expression in regulating length of tetraploid wild rice seeds
<130> GNCSY210457
<160> 6
<170> SIPOSequenceListing 1.0
<210> 1
<211> 5926
<212> DNA
<213> wild rice (Oryza alta)
<400> 1
gcccactctc tcccctccat cattacttgc ccaaaaacgg caatcgatcc ccttggcctg 60
cttccatcca tccatgtgct cctcctccgc catttataat ccaccttact ccttatctcc 120
cagctccgtc ttccccgctg ctagctccaa ctgatcaaaa cagctagcta gctgctgcag 180
cgccggcccc tacctaccct accctactgc tcctactcat cgtcatcgtc ccggccccta 240
tatatagctg cacctctctc tctctctcca tatatactcc tctcgtttta gagtagcagt 300
agtagtagag ctcatcactt catctccatc ggagtgaata agtgatatgg caatggcggc 360
ggcggcggcg gcgccccggc ccaagtcgcc gccggcagcg cccgacccct gcggccgcca 420
ccgcctccag ctcgccgtcg acgcgctcca ccgcgagatc ggattcctcg aggtacaaat 480
atctatgtag tatatcagta ccattcatag ttcttgctac aaaaacaaaa caaaacaaaa 540
caaaaacatc tttcgtacaa atatctgtcg aatcatttgc ttctgcatgc atcactctca 600
ccttaagttt tccccagctt aaaaccactc cttttatctt catttttcct tttcttgaac 660
cgaactcatt tcatcttcat atactactct tagcatgcac cattttcttc tttcagtccc 720
ccaatactga tcctctcttc ccagccaaat cattaacacg aataagctta atcaaacaag 780
ttgatcaact gatgatcgaa tgaaacaaaa caaaaccgat cgaataggag tcaatgatac 840
cacctatagc tgaaaccacg aatgcatagc ttacctctac tcgatctgtt gactgttcaa 900
cactgcacta agacaccagt taataagccg attaattaaa ccattaaatg agcttaaccg 960
gcggctagct tcctcttcct ctgggcccgt gcggatcgta ccatcggttt gcgcgttcct 1020
ccacccaaac tctctgcctt gttattccct ttgtctacat gcatttgcat catcatcaca 1080
tcaccactac gtttcaattg gattttggtg gtctccaacc atatcatatt gctttttttt 1140
cttctctact acaatctcta gcggttttgt ggagtactaa agaaaacaac taactagcca 1200
gggtttgacg tctaatcaat tggtgatttt gttaatttca cctactattt tagactttac 1260
aggtcttcga gctatactac gatttatcga tgccatgcat caatcgatgc tggtccacgc 1320
tagtttctga gttatgacta gctcttttaa ttgggcttca cctacttaat taattaacca 1380
gtggctgcgt catttattga ccaacattgt catgttaccc caactgattt tttttataaa 1440
aaaactacca ccggatatat tattagttcg tgtatgtatc tgctcaagca gcgcatgcat 1500
atagtttttc gtgacacaaa atatgtactg tatgctcaaa agatctgttg ggattgtcat 1560
attcgccatt ataattaaaa tggtgatgcc cagctttttt tttctccaat aatttattgg 1620
cttaatttcc tgtgctgtta ggagtaaagc tattaattag caccattttt aaagcttcta 1680
aaatctaaat aaagtacgac actgcaacct tactgtatat gctgtgtttg ttcattagca 1740
gtggcagtac ttgctgtcta gctttaaatg ttttgggtgt taaaatatcc cccagacatc 1800
acctgaaaag ttgacaggct aaacacatgc tcatttccca cgtttactta aattaactcg 1860
aataaacaat tactccttct gtttcggaat ataaagaatt ttagataaat gtaacatatc 1920
ctactattac aaatctgaaa aaaaaacctg ttcaaattcg taatactagg atatcatatt 1980
tattcaaaat tcgttatatt ttgagatgta gggagtaata tttcttgcag ggtgaaataa 2040
attcaatcga agggatccac gctgcctcca gatgctgcag agagtaagat agccctgctg 2100
tttctttttg tacttcgatt tcttctcgtc tttagtaccc ttcccatgca ttcgcaaaat 2160
atacttacct cagtttttgg tcatggatag atgaccttcg accgttatct ttgtaagcat 2220
cttcaataaa aatagcttta aatgcagggt attaggtata tactttcaat tgcaaatgta 2280
atgtacagca tattgattct aaaaccttga caaatcaaag taaaaaaacc gaacatttga 2340
tttctggacg ttgggagttt ttttttcttt catcttttac aaatgagtac gttgtggaat 2400
catattgtgc aagcttttga gtctaattag gtatgagata ctatctacta gttgcttgag 2460
gtaaaacaaa atgtaattag gttacattta ctaaactgaa catttagtaa tgttttgtta 2520
aataacggta atttcaatgc atgcatgtcc tcccgtaaaa acagtgcaca gactgttcaa 2580
aaagtcattg cacaataact attcacatgg aactgtgaaa agtatatata ttggaacttt 2640
ctagatcctt ttgggaacat gggaaaagcc aagtcacgcg tggaatccct attccctgtg 2700
ttccttttga aaggatccag ttaagctaga actgaaaatt gtactactac tgagatgaaa 2760
ttactgcagg agcagaaagg ggggaaacct gtaaattaaa caagcctcaa aattcaacat 2820
cagcaaccga ctcatctagt gttcctgtgt tatgaaccag ccggcaacta tgcctctatg 2880
tagtgcattg aagcacctta acaatctatg gcggtcggtt ggaaacctca tataaatacc 2940
aaatttttta accggcatct cattagcaac actgatggag agaactaaag ttgtcggctt 3000
agaaaccaga caccttttag gtttcaaaaa taaaaaaaaa tactgactcg atgcatcgtc 3060
ttgaatctag ccgatgcttc gagcagcccc atcgccccga gaggacgact cccattctaa 3120
tatcatggac atggtaatac ataaatttac gaccagcaca tatgaaataa ccgaattaat 3180
tagttcagag tacttccact ttcgaatctt gtagaacttc tttcctttcg gaaacaaaag 3240
gggggaagaa ggaggtaaca acaaagggaa ccgagaacaa aataaatatg ccgatgtaca 3300
cgagcatgaa aatttaaaaa tctcacagga aaggagatca taaataaaga accaagaacc 3360
agctcactcg gtccagattc atgctcgtta gctagcttcc atgtagcaag aaggcacgca 3420
gccatcgcac atagtgtttg caatttcggc agccactaaa atttccgtgg ttgccgaaat 3480
tttgaaaaat tcatgaacgg tgttttctga ccttttttgt ttgaaatctt aacatatttt 3540
gactgaattt aaacaaattt cgaccaaatt cacaaatatt tgaaaaaaac caataatttc 3600
aggggggtgt ttgatcttgc cggtaggatc cgaaatttca aaccatgatc gcacgcttca 3660
ccgcttgtct gcatcgacgc agctactggc catgggaagc ctccccatgc aaccaagctt 3720
cactgagagg ccgcacccct accgaccgag gagaggtggc cgccgcggag aggagggccg 3780
tcgccatcgc gtcgccagtt agaggcaagc cgtcaccttg gcgtggccgc gtagaggagg 3840
gccgctgctg tcgtgtcgcg atcgtcatcg tcggcttggc ttggagaggt ggagagatgg 3900
aggaggagat aaggtgagag agatgagaga aaatattctg acggcttgtt tggcaacttg 3960
agggaagagg attgggagtt tagacgggaa aattgatggt gagatctgtc gatgggttat 4020
acctgcaaac cgtattcgga agatatcggg gtacgctggt actagggcct acacaatatc 4080
gtatcaagca ctcagagaca aatgtttata cgagttcaag ccctactcca gtttatatgg 4140
attatggatg gaacaccaca aggattacaa taggaaaaca taacaaccta tctaatttgc 4200
cgactaggat ccaccaagaa tggcccaacg agatctaatc gacgaatacg actgcggctt 4260
cctccgggtt gcttctcgct gtggttggtg gttaactccc tagggggcct tgtatttata 4320
ttgatcgctg cctttctctt caagtagaac tcaacaagac atatttggat acgagcctga 4380
ggagacctta tcttctttga ataggacttt aacaccttct gttccgagaa gataatttcc 4440
ttttatcctt gccgtatttt cttaattaaa ggtaccttcc cctatatttt taggtatatg 4500
gtattcgtat actccatgaa tctatgtatg gagatatgtt ttatccttta ccctgaaaag 4560
attaatctga gaatttgatt tttagtctca atctcttatt tgatagagat ggtgagaatt 4620
gatagagagt ttagttgaac attagatttt aaatgaaaat agaaggttga gatttgttta 4680
gacaaagtaa aggaggaatt ctctcttaat taccaccctc ttcccaggta ttagaaagaa 4740
gagagttcct cctgaatttc ctatccccat cccactacaa ctcccatatc tttcaaccaa 4800
acagaacatt taatagcctc atccctttaa acttttaatc cctttcaaaa actccctcca 4860
accaaacgga ccgcgaggag ataagagatg gatggtggag ataaggtcgc gctggtaagc 4920
ggggtccacg gtgatttttt tctccacccg tgttgctgct gcaattctat gtcggctaat 4980
aaaaaaatca gcatctatag tttctttatt tgtcccattg ttttaaaaaa ccgtacttat 5040
aataaatggt gtttttaata aaccagtacc tataaaatgt tataggtgtc ggtttttgat 5100
tttagattca tgtgaggagt cgagtaggcc agaaatagct tcatatttca tatatgcctg 5160
ttcaaggcat atatgaaggt gtccgttatt aggggttttg tagtaatgtt tgttggaatt 5220
aatcacttgt atattatgaa accgatggct ggcgaaactt tgtcaattgt gggcggtaag 5280
ctcacctaaa gcaaattgtt caaaacaagg ttaaagatga attttattaa tggattttgg 5340
tttggaaaaa ttttgtaggg ttgacgaatt catcggaaga actcccgaac cattcataac 5400
gatgtatgaa ttttcaggtc gagaatttgt ctttaacttg gcacaactgt tactttttct 5460
tattaattct ctgtttacaa gcagttcatc cgagaagcga agtcatgatc attctcacca 5520
cttttggaag aagtttcggt acttacttga ttcccggatc ttaatgtacg tatgcatctg 5580
cactgcgcta attattattt acaaaactga actatttaat aaacactaca aaatatttaa 5640
cttgcaaagt acatatttaa tcagggattt agtaaaccca ttccaatcta atgtcaccat 5700
atctgcatgc acggtccaca tggatgctgg tgtattaatt tctttttgtg ctagaatgga 5760
cgttttattt tcaattgtcg ttagtataca cctttacgca tcatctctct cctgcgcatt 5820
tgagaaaact tcttcctttg atgatagacg tggtcttatt cttcagtttg atttagtttc 5880
agatagaata aatattgttt attacttagt ttctctcctt cagtaa 5926
<210> 2
<211> 6193
<212> DNA
<213> wild rice (Oryza alta)
<400> 2
acggctctga tccccgcggc gcagcggatc tagtccggat gggcagacga gacgagacga 60
gatgagatag atactagatc cgtccccgac aatctttcaa gccccgtggc cgtccctcct 120
ctgctcgatg tgccacgcct ctcagccttg ctgcctttcc atcatcatta ttcattcacg 180
cccaaagccc aaagcaattc acctctctct ctcgcacgaa tcgatagata cctgtaattt 240
cccccccgga aaagaaaaga gcggccattt ttctctcccc tccatcatta cttgcccaaa 300
aacggcaatc cctcctcccc catcgcctgc ttccatccat gtgcctccac cacctccgtc 360
ttccctccga ctgatcaaag cagctgctat cgctgctgca gcgccggcca tgcccgccct 420
ctagccccta cctaccctac cctaccctac tcggctactc ctactcctat atataccgtc 480
ctctctctct ctctctctct tgcgatatat ttcaagactc cagcgcgctc tccatcggag 540
tgatggcaat ggcggcggcg ccccggccca agtcgccgcc ggcagcgccc gacccctgcg 600
gccgccaccg cctccagctc gccgtcgacg cgctccaccg cgagatcgga ttcctcgagg 660
tacaacaata tccaatgtct tatcattccc tactccttct cttgcttcaa aaaaacatgt 720
cgtccatacc tcatattcat acacgtacgc tttttgctat cgctctctcc atgttttccc 780
aaccttaaaa accatcatgc atcatttgct tcttccgttc gacaaatcat ttgattagag 840
aagtttaaaa cattggaaca acaaatatgc atggcccaac attgctcctc tcgatctctt 900
ccctcagcca aattaacacg aacaaactta accaaacaag ttcaaccaac gatcggatag 960
gagtcaacca gcgctagcta gctgatgaac tagctagcaa gtccgttgac tgttcaacac 1020
taattaaggc accagttaat aagccgatta attaaacaat tatatgagct taacgggcgg 1080
ctagcttcgt cctctgggtc cgtgcggatc gtaccatcgg tttgcgcgtt cctccaccaa 1140
aactctctgc cttgttattc cctttgccta ggagcacata catttgcatg catcatcatc 1200
acatcaccag ttttaattgg attttaattt ggcggtctcc aaccatatca tattggtttt 1260
gttctctata ctagagtact ataatctcta gcggttttgt ggaactaaga aattataatt 1320
agccagggtt tgttaaccaa ttggtggttt agtttgttaa ttttagctag tactaggatt 1380
ttagacttag aggtcttcga gctactacta cgatatattg atgccatgca tcaatcgatg 1440
ctggtccatg ctagtttctg actagccctc ttaattgggc ttgacctact taattaatta 1500
accagtggct gcgtcactca ttgaccaaca ttgtcatgtt acccggactg atattttttt 1560
tataaaaaaa actaccaccg gtcgatatat tattagttag tgtatatatc tgctcaagca 1620
gcgcatgcac atggattttc gtcaaacaaa atatgtactg tatgctcaat gcatctgtgg 1680
gacgattgtc atattcgcct ttataattaa aatggtggtg cccagttgtt ttttttctcc 1740
aataatttat tgacttaaat ttcctgttct gttaggagta aaactattaa ttagcaccat 1800
tttaaaagtt tctaaaatct taataaagta tatgacactg caaccttact gtatatatat 1860
gttgtgtttg ttcattagca gtggcagtac ttgctgtcta actttaaatg ttttgggcgt 1920
taaaatatcc cccagacatc acctgaaaag ttgacaggct aaacacatgc ccatctccct 1980
cgtttactta aattaattcg aacaaacaac tatttcttgc agggtgaaat aaattcaatc 2040
gaagggatcc acgctgcctc cagatgctgc agagagtaag ttagccctgc tgtttctttt 2100
tgtacttcca tttcttctcg tctttaaact atccgcgacg ctacttactc taaattccta 2160
ctctaaagga atattccatg cctggtcagt gagattcctt actctatatc aaatcttccg 2220
cgctcgtgtt tactctatat cttcctctat aattttttaa ctatattttg aaaagtctat 2280
tttacctcct ccatcccccc aataaaaccc cgcactccta ctgcatcagc ggctcccccc 2340
cccgtcggtt gctccgccac cgccgatccc cctcccccgt gcccgaatcc ccccgccgca 2400
gccgtcgccg aatcccccct cccccagccg catcgcccct ccgccagccg catcggctcc 2460
tcccccgccg cccccgtcgc ccggcgtccc tctccctgcc gcggcttcgt cccgcatcag 2520
ctcctccccc ccgcctgtcg gccttcaccg cgccaacccg catcggctcg gaggcgcgga 2580
ggcggaggag ccgatggggc ggcgcgccgg ggcatcggcg gaggaagggc gtcgtggcgg 2640
atccgccgag ggagagcgct ctccgccgcg ccctccgccg agctctccgc ctcgctcgcc 2700
ctctccgcct cgcttgccct tccgctgtcg ctcgccgctg ctcctgcttc gttcccctcc 2760
tccccttctt ccgtccgtgc gcgggaggga cggacagttg gccgtcgccc gcgcgcgcgg 2820
cctatacagg ctgctacagt gctccgctac ggatagcgga gcactgtagc ccctggacga 2880
tacaggcgcc tttagcggat ggcgctgcag ccgcgccgtc cgctaccgcg gatagatcgc 2940
ggacggcgct gcggatagtc ttagtactct taccatgcat tcacaaaata tacttacctc 3000
ggttgatcat ggatagataa actttgaccg ttatcttttt aagcaacttc aataaaaata 3060
ggtttaaatg cagggtatta ggtatatatt ttcaattgca aatgtaacat acagcattga 3120
ttctaaaaca ttggcaaatc aaagtaaaaa actgaacatt tgatttaatt ctgaacgttg 3180
ggagtatttt ttttcatcat ttacaaatga gtacgttgtg gaattatatt gtgcaagctt 3240
ttgagactaa ttaggttata tgagatacta tctaccaagg ttgtcagtgt cttatgatac 3300
caggatccta ttaaagcgaa acgatattaa tcccatctag tatctaatta tcatagtatc 3360
tcatcttact atttatttcc atatgatcct acgaaatctt aggtagtatc ccgattctac 3420
gatccctctg atcctataaa atattattta aaataaggtg gattgaaaat aatatgaata 3480
aatatactca agtttacata aaaccaatta aatttatcaa aatcaatatt atatgcttta 3540
atttgtacaa catattatat attttatata aaaatgtata ctttctaaaa tctgttagta 3600
ttccgatcct atgatactac caataaaaaa cgattctacc tggtatctcc atatgctatc 3660
tattagttgc ttgaggtaaa acaaaatgta gttaggttac atttactaaa ctgaacattt 3720
agtaatgttt tgttaaataa cggtaatttc aatgcatgca tgtcctcccg taaaaacagt 3780
gcagagactg ttcaaaaagt cattgcacaa taactattca catggaactg tgaaaagtat 3840
atattggaac ttactagatc cttttgggaa catgggaaaa gccaaagtca cgtgtggaat 3900
ccctattccc tgtgttcttt ttgaaaggat ccagttatgg cccgtttagt tcgcgaaatt 3960
tttttcaaaa acatcacatc aaacgtttga ccgaatgtcg ggaggggttt tcggacacga 4020
atgaaaaaac taatttcacg gttagcctgt aaaccacgag acgaattttt ttgagcctaa 4080
ttaagccgtc aacatgtagg ttactgtagc acttacggct aattatggcg taattatgac 4140
aaaattaggc tcaaaagatt cgtctcgtcg tttacaatcc aactgtgcaa ttagtttctt 4200
tttttatcta tatttaatgc ttcatgcatg agtccaaaag tttgatgtga tgtttttggg 4260
gttttcgttt tgggaactaa acaaggcctt agctagaaga gtggaaattg tactactact 4320
gagatgaaat tacagcagga gcagaaaggg ggacaaactg aaaattaaac aagcctcaaa 4380
attcaacatc agcaaccgac tcatctagtg ttcctgtgtt atgaaccagc cggcaactat 4440
gcctctatgt agcgcattga agcaccttaa ctaagggcta tcaaggctgt gttaaactgt 4500
taatggtaca tatgcctttc agtttctgtc tgtactttag gttgctatac atcctatgaa 4560
catgttttct tgtctttctt cttacctcgg ttgaaaaacc catacaaata tcatttttta 4620
actgactact cactaaataa aaagaatcaa atgtgccgat ttaagaactg atgcctttaa 4680
aattttaata ataagaagaa aaataccagt tcgatgcatc gtctcgaatc gaccgaatcg 4740
aaccgatgct tcaaccggcc cccatccctc gagaggacga ctcccattct agaaacatgg 4800
acatggtaac acagaaattt tcgaccagca catatgaaat aaccgaatca attagttcag 4860
aatccttcca ctttcgaatc ttgtagaact tctatccttt cggaaacaag agggggaagg 4920
aggaggcaac aacaaagtat gggaactgag aacaaaataa atatgccgat gtacacaagc 4980
atgaaaagtt aaagatctca aaggaaagga gatcataaat caagaaccaa gaaccagctc 5040
actcggtcca gttcatgctc gttagctagc ttccatgtag caagaaggca cgcagccatc 5100
gcacatagtg tttgtaattt cggtagccac cgaaatttcg gtggttaccg aaattttgaa 5160
aaaaatcatg aatttcggtg ttttctgact gatttttttt tcaattttaa ctgagtttga 5220
acaaatttgg atcaaattca caaatatttg aaaaaatcca aaaaatttgg ggagatttga 5280
tcgtgccgat agggtccgaa atttcaaacc atgatcgcac acttcaccgc ttgtctgcat 5340
cgacgctggc cgtgggaagc ttccccgtgc aaccaagctt caccgagcgg ccgcacccct 5400
gccgatcgaa ggagaggtgg ccgccgtgga gaggagggcc gctgccgtcg cgtcgccggg 5460
tagaggcaag ccttcacctt ggcgtggccg cagacaggag ggccgctgct gtcgcgtaac 5520
ggccgtcgtc gccggcttgg cttggagagg tggagagaga tggaggagga gatgaggtta 5580
gagagagaga ggagaaagag gtaagaaaat atccagagga gataagagat gcatggtgga 5640
gataacgtcg cgctgggcaa gcgtggtcca cactgatttt ttccccgccc gtgtcgctgc 5700
tgcaattctg tgctggctaa taaaaaaaat cagataccta taaaatgtta tagatgtcgg 5760
tttttaattt taaactcatg tgagtagtcg ggtggataaa aaatggtctc gtctgccagt 5820
tgaatatgca tatataaaga cgtccactat tagggtttta tagtagttag ttagaattaa 5880
ttagttgcat attaaccaga aggctggcga aactttgtca attgtgggcg ctaatctcac 5940
ctgaagcaaa ttgttcaaaa caaggttaaa gatgattttt attaatgaat tttggcttgg 6000
aaaaattttg tagggttgac gaattcatcg gaagaactcc cgatccattc ataacgatgt 6060
atggattttc aggtcgagaa tttgtctttt aacttggcac aaccgtactt tttcttatta 6120
attctctgtt tacaagcagt tcatcggaga agcgaagtca tgatcattct caccacttat 6180
tgaagaagtt tcg 6193
<210> 3
<211> 666
<212> DNA
<213> wild rice (Oryza alta)
<400> 3
atggcaatgg cggcggcggc ggcggcgccc cggcccaagt cgccgccggc agcgcccgac 60
ccctgcggcc gccaccgcct ccagctcgcc gtcgacgcgc tccaccgcga gatcggattc 120
ctcgagggtg aaataaattc aatcgaaggg atccacgctg cctccagatg ctgcagagag 180
gttgacgaat tcatcggaag aactcccgaa ccattcataa cgatttcatc cgagaagcga 240
agtcatgatc attctcacca cttttggaag aagtttcggt acttacttga ttcccggatc 300
ttaatgtacg tatgcatctg cactgcgcta attattattt acaaaactga actatttaat 360
aaacactaca aaatatttaa cttgcaaagt acatatttaa tcagggattt agtaaaccca 420
ttccaatcta atgtcaccat atctgcatgc acggtccaca tggatgctgg tgtattaatt 480
tctttttgtg ctagaatgga cgttttattt tcaattgtcg ttagtataca cctttacgca 540
tcatctctct cctgcgcatt tgagaaaact tcttcctttg atgatagacg tggtcttatt 600
cttcagtttg atttagtttc agatagaata aatattgttt attacttagt ttctctcctt 660
cagtaa 666
<210> 4
<211> 651
<212> DNA
<213> wild rice (Oryza alta)
<400> 4
atggcggcgg cgccccggcc caagtcgccg ccggcagcgc ccgacccctg cggccgccac 60
cgcctccagc tcgccgtcga cgcgctccac cgcgagatcg gattcctcga gggtgaaata 120
aattcaatcg aagggatcca cgctgcctcc agatgctgca gagaggttga cgaattcatc 180
ggaagaactc ccgatccatt cataacgatt tcatcggaga agcgaagtca tgatcattct 240
caccacttat tgaagaagtt tcggtactta cttgattccc ggatcttaat gtacatatgc 300
atctgcactg tgctaattgg tgtgcattat atggtcgtaa gtcctagtta cttattattt 360
acaaaactga actaataaac actacaaaat atttaacttg caaagtatcc aatctaatgt 420
cactatatct gcatgcatgg tccacatgga cgctggtgta ttattttttt tgtgctagta 480
tggacgtttt attttcaatt gtcgcttagt atacaccgtt acgcatcata tctctcctgc 540
gcatttgaga aaacttcttc ctttgatgat agacgtgggc ttattcttca gtttgattta 600
gtttcagata gaataaatat tgtttaatac ttagtttctc tccttcagta a 651
<210> 5
<211> 221
<212> PRT
<213> wild rice (Oryza alta)
<400> 5
Met Ala Met Ala Ala Ala Ala Ala Ala Pro Arg Pro Lys Ser Pro Pro
1 5 10 15
Ala Ala Pro Asp Pro Cys Gly Arg His Arg Leu Gln Leu Ala Val Asp
20 25 30
Ala Leu His Arg Glu Ile Gly Phe Leu Glu Gly Glu Ile Asn Ser Ile
35 40 45
Glu Gly Ile His Ala Ala Ser Arg Cys Cys Arg Glu Val Asp Glu Phe
50 55 60
Ile Gly Arg Thr Pro Glu Pro Phe Ile Thr Ile Ser Ser Glu Lys Arg
65 70 75 80
Ser His Asp His Ser His His Phe Trp Lys Lys Phe Arg Tyr Leu Leu
85 90 95
Asp Ser Arg Ile Leu Met Tyr Val Cys Ile Cys Thr Ala Leu Ile Ile
100 105 110
Ile Tyr Lys Thr Glu Leu Phe Asn Lys His Tyr Lys Ile Phe Asn Leu
115 120 125
Gln Ser Thr Tyr Leu Ile Arg Asp Leu Val Asn Pro Phe Gln Ser Asn
130 135 140
Val Thr Ile Ser Ala Cys Thr Val His Met Asp Ala Gly Val Leu Ile
145 150 155 160
Ser Phe Cys Ala Arg Met Asp Val Leu Phe Ser Ile Val Val Ser Ile
165 170 175
His Leu Tyr Ala Ser Ser Leu Ser Cys Ala Phe Glu Lys Thr Ser Ser
180 185 190
Phe Asp Asp Arg Arg Gly Leu Ile Leu Gln Phe Asp Leu Val Ser Asp
195 200 205
Arg Ile Asn Ile Val Tyr Tyr Leu Val Ser Leu Leu Gln
210 215 220
<210> 6
<211> 211
<212> PRT
<213> wild rice (Oryza alta)
<400> 6
Met Ala Ala Ala Pro Arg Pro Lys Ser Pro Pro Ala Ala Pro Asp Pro
1 5 10 15
Cys Gly Arg His Arg Leu Gln Leu Ala Val Asp Ala Leu His Arg Glu
20 25 30
Ile Gly Phe Leu Glu Gly Glu Ile Asn Ser Ile Glu Gly Ile His Ala
35 40 45
Ala Ser Arg Cys Cys Arg Glu Val Asp Glu Phe Ile Gly Arg Thr Pro
50 55 60
Asp Pro Phe Ile Thr Ile Ser Ser Glu Lys Arg Ser His Asp His Ser
65 70 75 80
His His Leu Leu Lys Lys Phe Arg Tyr Leu Leu Asp Ser Arg Ile Leu
85 90 95
Met Tyr Ile Cys Ile Cys Thr Val Leu Ile Gly Val His Tyr Met Val
100 105 110
Val Ser Pro Ser Tyr Leu Leu Phe Thr Lys Leu Asn Thr Leu Gln Asn
115 120 125
Ile Leu Ala Lys Tyr Pro Ile Cys His Tyr Ile Cys Met His Gly Pro
130 135 140
His Gly Arg Trp Cys Ile Ile Phe Phe Val Leu Val Trp Thr Phe Tyr
145 150 155 160
Phe Gln Leu Ser Leu Ser Ile His Arg Tyr Ala Ser Tyr Leu Ser Cys
165 170 175
Ala Phe Glu Lys Thr Ser Ser Phe Asp Asp Arg Arg Gly Leu Ile Leu
180 185 190
Gln Phe Asp Leu Val Ser Asp Arg Ile Asn Ile Val Tyr Leu Val Ser
195 200 205
Leu Leu Gln
210
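
The coding sequences (sequences 3 and 4) and the protein sequences (sequences 5 and 6) listed above can be spot-checked against each other by translating the opening of each coding sequence with the standard genetic code. The sketch below is only an illustration and is not part of the filed listing: the helper name `translate` and the variable names are hypothetical, and only the first 60 nucleotides of sequence 3 and sequence 4 are copied in, so it checks the first 20 residues of each protein and nothing further.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(listing_text: str) -> str:
    """Translate listing-style DNA text into one-letter amino acid codes.

    Spaces and position counters are ignored; a trailing partial codon
    is dropped.
    """
    seq = "".join(ch for ch in listing_text.upper() if ch in "ACGT")
    usable = len(seq) - len(seq) % 3
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, usable, 3))


# First 60 nucleotides of sequence 3 and sequence 4, copied from the listing.
seq3_start = "atggcaatgg cggcggcggc ggcggcgccc cggcccaagt cgccgccggc agcgcccgac"
seq4_start = "atggcggcgg cgccccggcc caagtcgccg ccggcagcgc ccgacccctg cggccgccac"

print(translate(seq3_start))  # MAMAAAAAAPRPKSPPAAPD
print(translate(seq4_start))  # MAAAPRPKSPPAAPDPCGRH
```

The two printed strings correspond to residues 1 to 20 of sequence 5 (Met Ala Met Ala Ala Ala ...) and sequence 6 (Met Ala Ala Ala Pro Arg ...) respectively.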