Photosynthetic organism gene regulation for improved growth
Reading note: This technology, Photosynthetic organism gene regulation for improved growth, was devised by I·阿加维, F·I·库兹米诺夫, R·R·拉达科维茨, J·H·维卢托, S·波茨 and R·斯普雷菲 on 2018-12-27. Main content: Mutant photosynthetic organisms having reduced chlorophyll and increased photosynthetic efficiency are provided. The mutant strains have a mutated or attenuated: chloroplast SRP54 gene and SGI1 gene; chloroplast SRP54 gene and SGI2 gene; or chloroplast SRP54 gene, SGI1 gene and SGI2 gene. The mutant photosynthetic organisms exhibit increased productivity relative to the wild-type strain. Also provided are mutant photosynthetic organisms having a mutated or attenuated cytoplasmic SRP54 gene. Provided herein are methods of producing biomass and other products such as lipids using strains having mutations in the SRP54 gene, the SGI1 gene, the SGI2 gene, the combination of the SGI1 gene and the SRP54 gene, and the combination of the SGI2 gene and the SRP54 gene. Also included are constructs and methods for attenuating or disrupting the SRP54 gene, the SGI1 gene and the SGI2 gene.
1. A mutant photosynthetic organism comprising a mutated or attenuated gene encoding chloroplast signal recognition protein 54 (cpSRP54) and a mutated or attenuated significant growth improving gene 2 (SGI2).
2. A mutant photosynthetic organism comprising a mutated or attenuated gene encoding chloroplast signal recognition protein 54 (cpSRP54) and a mutated or attenuated significant growth improving gene 1 (SGI1).
3. A mutant photosynthetic organism comprising a mutated or attenuated significant growth improving gene 2 (SGI2).
4. A mutant photosynthetic organism comprising a mutated or attenuated gene encoding chloroplast signal recognition protein 54 (cpSRP54), a mutated or attenuated significant growth improving gene 1 (SGI1), and a mutated or attenuated significant growth improving gene 2 (SGI2).
5. The mutant photosynthetic organism of any one of claims 1 to 4, wherein the mutant exhibits, relative to a control photosynthetic organism of the same species, a reduction in chlorophyll under low light conditions and a higher maximum quantum yield of photochemistry in photosystem II (Fv/FM) at all physiologically relevant irradiances above 100 μE m⁻² s⁻¹.
6. A mutant photosynthetic organism according to claim 5 wherein the mutant photosynthetic organism exhibits a reduction in chlorophyll of at least 20% relative to a control photosynthetic organism of the same species.
7. A mutant photosynthetic organism according to claim 6 wherein the reduction in chlorophyll is at least a 30% reduction relative to a control photosynthetic organism of the same species.
8. A mutant photosynthetic organism according to claim 7 wherein the reduction in chlorophyll is at least a 40% reduction relative to a control photosynthetic organism of the same species.
9. A mutant photosynthetic organism according to claim 7 wherein the reduction in chlorophyll is at least a 50% reduction relative to a control photosynthetic organism of the same species.
10. A mutant photosynthetic organism according to claim 9 wherein the reduction in chlorophyll is at least a 60% reduction relative to a control photosynthetic organism of the same species.
11. A mutant photosynthetic organism according to claim 10 wherein the reduction in chlorophyll is at least a 70% reduction relative to a control photosynthetic organism of the same species.
12. The mutant photosynthetic organism of any one of claims 1 to 4, wherein the mutant exhibits lower non-photochemical quenching (NPQ) than a control photosynthetic organism of the same species at all physiologically relevant irradiances above 100 μE m⁻² s⁻¹.
13. The mutant photosynthetic organism of claim 12, wherein the mutant exhibits lower NPQ than a control photosynthetic organism of the same species at all physiologically relevant irradiances above 250 μE m⁻² s⁻¹.
14. A mutant photosynthetic organism according to any one of claims 1 to 4, wherein the mutant exhibits a higher carbon fixation rate on a per chlorophyll basis than a control photosynthetic organism of the same species.
15. A mutant photosynthetic organism according to claim 14 wherein the carbon fixation rate is at least 50% greater than a control photosynthetic organism of the same species.
16. A mutant photosynthetic organism according to claim 15 wherein the carbon fixation rate is at least 100% greater than a control photosynthetic organism of the same species.
17. A mutant photosynthetic organism according to any one of claims 1 to 4 wherein the oxygen evolution rate is at least 100% higher than a control photosynthetic organism of the same species.
18. A mutant photosynthetic organism according to claim 17 wherein the oxygen evolution rate is at least 200% greater than a control photosynthetic organism of the same species.
19. The mutant photosynthetic organism of any one of claims 1 to 4, wherein a culture of the mutant exhibits greater biomass productivity than a culture of a control photosynthetic organism of the same species.
20. The mutant photosynthetic organism of claim 19, wherein the mutant exhibits greater biomass productivity in photoautotrophic cultures.
21. The mutant photosynthetic organism of claim 20, wherein the mutant exhibits greater biomass productivity under continuous light conditions.
22. The mutant photosynthetic organism of claim 20, wherein the mutant exhibits greater biomass productivity under diurnal cycle conditions.
23. The mutant photosynthetic organism of claim 20, wherein the mutant exhibits greater biomass productivity under diurnal cycle conditions in which the light profile mimics a natural daylight profile.
24. A mutant photosynthetic organism according to any one of claims 1 to 4, wherein the mutant has been generated by UV irradiation, gamma irradiation or chemical mutagenesis.
25. A mutant photosynthetic organism according to any one of claims 1 to 4, wherein the mutant is a genetically engineered mutant.
26. The mutant photosynthetic organism of claim 25, wherein the mutant has been genetically engineered by insertional mutagenesis, gene replacement, RNAi, antisense RNA, meganuclease genome engineering, one or more ribozymes, and/or CRISPR/Cas systems.
27. The mutant photosynthetic organism of claim 26, wherein the mutant has been genetically engineered through a CRISPR/Cas system.
28. The mutant photosynthetic organism of any one of claims 1 to 2, wherein prior to the mutation or attenuation of the gene, the cpSRP54 comprises an amino acid sequence having at least 65% identity to an amino acid sequence selected from the group consisting of SEQ ID NOs: 68, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84 and 85.
29. The mutant photosynthetic organism of claim 26, wherein the cpSRP54 gene has at least 50% identity to the nucleic acid sequence of SEQ ID No. 8 prior to the mutation or attenuation of the gene.
30. The mutant photosynthetic organism of claim 28, wherein prior to the mutation or attenuation of the gene, the cpSRP54 has at least 65% identity with an amino acid sequence selected from the group consisting of: SEQ ID NO 2, SEQ ID NO 3, SEQ ID NO 4, SEQ ID NO 5, SEQ ID NO 6, SEQ ID NO 7, SEQ ID NO 8, SEQ ID NO 9, SEQ ID NO 10, SEQ ID NO 11, SEQ ID NO 12, SEQ ID NO 13 and SEQ ID NO 14.
31. A mutant photosynthetic organism according to any one of claims 1 to 2 wherein prior to mutation or attenuation of the gene the SGI1 polypeptide has at least 50% identity to an amino acid sequence selected from the group consisting of SEQ ID NOs: 3, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 34, 35, 36, 37, 38 and 39.
32. The mutant photosynthetic organism of any one of claims 1 to 2, wherein the gene encoding SRP54 protein comprises a mutation that occurs outside of the sequence encoding the first 169 amino acids of the cpSRP54 GTPase domain.
33. The mutant photosynthetic organism of claim 32, wherein the mutation in the gene encoding SRP54 protein occurs outside of the sequence encoding the cpSRP54 GTPase domain.
34. The mutant photosynthetic organism of claim 33, wherein the gene encoding SRP54 protein does not comprise a gene-disrupting mutation in the cpSRP54 GTPase domain.
35. A mutant photosynthetic organism according to claim 1, 3 or 4 wherein the SGI2 gene includes, prior to mutation or attenuation of the gene, a nucleic acid sequence encoding an amino acid sequence having at least 65% identity to an amino acid sequence selected from the group consisting of: SEQ ID NO 5, SEQ ID NO 40, SEQ ID NO 41, SEQ ID NO 42, SEQ ID NO 43, SEQ ID NO 44, SEQ ID NO 45, SEQ ID NO 46, SEQ ID NO 47, SEQ ID NO 48, SEQ ID NO 49, SEQ ID NO 50, SEQ ID NO 51, SEQ ID NO 52, SEQ ID NO 53, SEQ ID NO 54, SEQ ID NO 55 and SEQ ID NO 56.
36. The mutant photosynthetic organism of claim 35, wherein prior to mutation or attenuation of the gene, the SGI2 gene comprises a nucleic acid sequence having at least 80% identity to a nucleic acid sequence selected from the group consisting of: SEQ ID NO 7, SEQ ID NO 57, SEQ ID NO 58, SEQ ID NO 59, SEQ ID NO 60, SEQ ID NO 61, SEQ ID NO 62, SEQ ID NO 63, SEQ ID NO 64, SEQ ID NO 65, SEQ ID NO 66 and SEQ ID NO 66.
37. The mutant photosynthetic organism of claim 35, wherein prior to mutation or attenuation of the gene, the SGI2 gene comprises a nucleic acid sequence encoding an amino acid sequence having at least 80% identity to an amino acid sequence selected from the group consisting of: SEQ ID NO 5, SEQ ID NO 40, SEQ ID NO 41, SEQ ID NO 42, SEQ ID NO 43, SEQ ID NO 44, SEQ ID NO 45, SEQ ID NO 46, SEQ ID NO 47, SEQ ID NO 48, SEQ ID NO 49, SEQ ID NO 50, SEQ ID NO 51, SEQ ID NO 52, SEQ ID NO 53, SEQ ID NO 54, SEQ ID NO 55 and SEQ ID NO 56.
38. A mutant photosynthetic organism according to any one of claims 1 to 4, wherein the photosynthetic organism is an alga, and wherein the mutant alga belongs to a genus selected from the group consisting of: the genus Microcystis (Achnanthes), Coccocus (Amphiora), Geotrichum (Amphiora), Cellulomonas (Ankisstrodes), Asterina (Asteromonas), Euglena (Boekelovia), Borrelia (Bolidomonas), Bordetella (Borodinella), balloonflower (Botrydium), Botryococcus (Botryococcus), Bractenococcus (Bractenococcus), Chaetoceros (Chaetoceros), Tetraflagellata (Carteria), Chlamydomonas (Chlamydomonas), Chlorococcus (Chlorococcum), Chlorella (Chloromonum), Chlorella (Chlorococcum), Chlorella (Chlorella), Cryptomonas (Chlorococcus), Chlorella (Chlorella), Chlorophyces (Chlorococcus), Chlorella (Chlorococcus), Chlorophyces (Cryptococcus (Crypthecodina), Euglea (Crypthecodinium), Euglena (Crypthecodinium), Euglea (Crypthecodinium), Euglena (Crypthecodinium), Euglea (Crypthecodinium), Euglena (Crypthecodinium), Euglea, The genus Leymus (Gloeothamnion), Rhodococcus (Haematococcus), halophil (Halocaceta), Isochrysis (Heterococcus), Hymenomonas (Hymenomonas), Isochrysis (Isochrysis), Leptophyceae (Lepocinclis), Micropteris (Micracystis), Allium (Monodendros), Monochrysis (Monoprophidium), Micropteris (Nannochloropsis), Nannochloropsis (Navicula), Neochloris (Neochloris), Phaeophyceae (Neocalliphyceae), Phaeophyceae (Phaeophyceae), Chlorella (Paphialospora), Porphyromonas (Phormidium), Porphyromonas (Paphidophyllum), Porphyromonas (Pleurophyromonas), Porphyromonas (Piloca), Porphyromonas (Pleurophyromonas), Phaeophyceae (Pleurophyceae), Phaeophyceae (Pachys), Pachys (Pachylinae), Pachys), Pachylinae (Pachylinae), Pachylinae (Pachylinae), Pachys), Pachylinae (Pachylinae, Pachylinae (Pachys (Pachylinae), Pachylinae (, Pseudochlorella (Pseudochlorella), neochlorella (Pseudochlorella), pseudocruciate (pseudostaurospora), talaria (Pyramimonas), plasmodesmata (Pyrobotrys), Scenedesmus (Scenedesmus), 
Skeletonema (Skeletonema), spirulina (spirogyra), schizophyllum (Stichococcus), tetragonococcus (Tetraselmis), thalassonia (thalassosia), xanthomonas (Tribonema), chrysosporium (tribolium), hemicellum (Vaucheria), rhodochrous (Viridiella), wiseriia (vischiselia) and globularia (volvoox).
39. A mutant photosynthetic organism according to any one of claims 1 to 4, wherein the photosynthetic organism is an alga, and wherein the mutant alga is selected from the group consisting of: chlorophyta (chlorophyta), diatom (bactriaphyte), cladophora (prasinophyte), gloeophyte (glaucophyte), dinoflagellate (haloporph), chloranthus (chlorophynioplate), euglenophyta (euglenophyte), chromophyte (chromophyte) and dinoflagellate (dinoflagellate) mutants.
40. The mutant photosynthetic organism of any one of claims 1 to 4, wherein the photosynthetic organism is an alga, and wherein the algal mutant is of the phylum Chlorophyta.
41. An algal mutant according to claim 37, wherein the mutant belongs to a genus selected from the group consisting of: chlorococcus, Asparagus, Tetrastigmatophycus, Chlamydomonas, Chlorococcus, Chlorocycloris, Chlorella, Cryptococcus, Isochrysis, Crypthecodinium, Coccidioides, Dunaliella, Chlamydomonas, Volvocalella, Rhodococcus, Isochrysis, Hymenospora, Isochrysis, Lepidium, Micromannophora, Monoraphidium, Microcosphaera, Neochlorella, Phanerochaenophyta, Phaeophyceae, Chlorella, Paris, Porphyromonas, Primeria, bacteriophage, Microchlorella, Platymonas, Coccomyxophyceae, Phaeophyceae, Sphaceae, Sphaerotheca, Phaeophyceae, Chlorella, Neosarum, Pseudoeuglena, Pseudoperonospora, Scytalium, Scytalidium, Gracilaria, Scytalium, Gracilaria, Porphyceae, Sphaerotheca, and Sphaerotheca, Tetraselmis, Bothrina, Chlorella, and Volvox.
42. A biomass comprising the mutant photosynthetic organism of any one of claims 1 to 4.
43. The biomass of claim 42, wherein said photosynthetic organism is an alga.
44. A method of producing a biological product, the method comprising culturing a mutant photosynthetic organism of any one of claims 1 to 4 and isolating at least one product from the culture.
45. The method of claim 44, wherein the photosynthetic organisms are algae, and wherein the bioproduct is algal biomass.
46. The method of claim 44, wherein the biological product is a lipid, a protein, a peptide, one or more amino acids, an amino acid, one or more nucleotides, a vitamin, a cofactor, a hormone, an antioxidant, or a pigment or colorant.
47. The method of claim 46, wherein the biological product is a lipid.
48. The method of claim 47, wherein the mutant photosynthetic organism is engineered to comprise at least one exogenous gene encoding a polypeptide involved in the production of the lipid.
49. The method of claim 44, wherein the mutant photosynthetic organism is phototrophic.
50. The method of claim 49, wherein the mutant photosynthetic organism is an alga, and wherein the alga is cultured in a pond or raceway.
51. A mutant photosynthetic organism having a mutated or attenuated gene encoding cytoplasmic signal recognition protein 54 (cytoSRP54) and a mutated or attenuated significant growth improving gene 2 (SGI2).
52. A mutant photosynthetic organism having a mutated or attenuated gene encoding cytoplasmic signal recognition protein 54 (cytoSRP54) and a mutated or attenuated significant growth improving gene 1 (SGI1).
53. A mutant photosynthetic organism having a mutated or attenuated gene encoding cytoplasmic signal recognition protein 54 (cytoSRP54), a mutated or attenuated significant growth improving gene 1 (SGI1), and a mutated or attenuated significant growth improving gene 2 (SGI2).
54. The mutant photosynthetic organism of any one of claims 1 to 53, wherein a culture of the mutant photosynthetic organism exhibits greater lipid productivity than a culture of a control photosynthetic organism of the same species.
55. The mutant photosynthetic organism of any one of claims 51 to 53, wherein the mutant exhibits greater lipid productivity in photoautotrophic cultures.
56. The mutant photosynthetic organism of claim 55, wherein the mutant photosynthetic organism is an alga, and wherein the mutant alga exhibits greater biomass productivity under diurnal cycle conditions.
57. The mutant alga of claim 56, wherein the mutant alga exhibits greater biomass productivity under diurnal cycle conditions in which the light profile mimics a natural daylight profile.
58. The mutant photosynthetic organism of any one of claims 51 to 53, wherein the mutant photosynthetic organism has been generated by UV irradiation, gamma irradiation, or chemical mutagenesis.
59. The mutant photosynthetic organism of any one of claims 51 to 53, wherein the mutant photosynthetic organism is a genetically engineered mutant.
60. The mutant photosynthetic organism of claim 58, wherein the mutant photosynthetic organism has been genetically engineered by insertional mutagenesis, gene replacement, RNAi, antisense RNA, meganuclease genome engineering, one or more ribozymes, and/or CRISPR/Cas systems.
61. The mutant photosynthetic organism of claim 59, wherein the mutant has been genetically engineered through a CRISPR/Cas system.
62. The mutant photosynthetic organism of any one of claims 51 to 60, wherein the mutant photosynthetic organism is an alga, and wherein the mutant alga belongs to a genus selected from the group consisting of: genus Triplophytes, Coccomyza, Geotrichum, Celastrus, Celosidium, Chryseophytes, Bordetella, balloonflower, Staphylum, Chrysocola, Chaetoceros, Tetraflagellates, Chlamydomonas, Chlorococcus, Chlorella, Crypthecodinium, Chlorococcus, Chlorella, Haematococcus, Crypthecodinium, Coccodinium, Rhodococcus, Halobacterium, Isochrysis, Phyllostachys, Phaeophyceae, Isochrysis, Isodon, Isochrysis, Photinus, Phaeophyceae, Chlorella, Phaeophyceae, Oocystis, oyster globulina, pavlova, parachloropsis, parva, praguenophyta, phaeodactylum, phage, microalgal, tetraselminthium, crohns, portulaca, prototheca, pseudochlorella, neochlorella, pseudodiadactylum, talocystis, plasmopara, scenedesmus, ostereum, spirulina, schizophyllan, tetrastigmatis, thalassonia, xanthophylla, alexandrium, parachlorophyllum, welshikonium, and clitocystis.
63. The mutant photosynthetic organism of any one of claims 51 to 60, wherein the mutant photosynthetic organism is an alga, and wherein the mutant alga is selected from the group consisting of: diatom, Chlorophyceta (eustigmatophyte) and variegated mutants.
64. The mutant alga of claim 63, wherein the mutant is of the phylum Chlorophyceae.
65. The mutant algae of claim 64, wherein the mutant algae belongs to a genus selected from the group consisting of: ellipsoidea (Ellipsiodion), Euglena, Weissella, Allium, Nannochloropsis, and Pseudodiatella.
66. A method of producing lipids, the method comprising culturing the algal mutant of any one of claims 1-65 and isolating at least one lipid from the culture.
67. A method of increasing the biomass of a photosynthetic organism comprising modulating the genes chloroplast signal recognition protein 54 (cpSRP54) and significant growth improving gene 2 (SGI2).
68. A method of increasing the biomass of a photosynthetic organism comprising modulating the genes chloroplast signal recognition protein 54 (cpSRP54) and significant growth improving gene 1 (SGI1).
69. A method of increasing the biomass of a photosynthetic organism comprising modulating the genes chloroplast signal recognition protein 54 (cpSRP54), significant growth improving gene 1 (SGI1) and significant growth improving gene 2 (SGI2).
70. The method of claim 67, wherein modulating the genes comprises base substitution mutations, insertional mutagenesis, gene replacement, RNAi, antisense RNA, meganuclease genome engineering, one or more ribozymes and/or CRISPR/Cas systems in the cpSRP54 gene and the SGI2 gene.
71. The method of claim 68, wherein modulating the genes comprises base substitution mutations, insertional mutagenesis, gene replacement, RNAi, antisense RNA, meganuclease genome engineering, one or more ribozymes and/or CRISPR/Cas systems in the cpSRP54 gene and the SGI1 gene.
72. The method of claim 69, wherein modulating the genes comprises base substitution mutations, insertional mutagenesis, gene replacement, RNAi, antisense RNA, meganuclease genome engineering, one or more ribozymes and/or CRISPR/Cas systems in the cpSRP54 gene, the SGI1 gene, and the SGI2 gene.
73. The method of any one of claims 67 to 72, wherein increasing biomass of a photosynthetic organism comprises an increase in total organic carbon.
74. The method of any one of claims 67 to 72, wherein increasing biomass of a photosynthetic organism comprises an increase in total lipid content.
75. The method of any one of claims 67 to 72, wherein increasing biomass of a photosynthetic organism comprises an increase in total nitrogen content.
76. The method of any one of claims 67 to 75, wherein the mutant photosynthetic organism is an alga, and wherein the mutant alga belongs to a genus selected from the group consisting of: genus Triplophytes, Coccomyza, Geotrichum, Celastrus, Celosidium, Chryseophytes, Bordetella, balloonflower, Staphylum, Chrysocola, Chaetoceros, Tetraflagellates, Chlamydomonas, Chlorococcus, Chlorella, Crypthecodinium, Chlorococcus, Chlorella, Haematococcus, Crypthecodinium, Coccodinium, Rhodococcus, Halobacterium, Isochrysis, Phyllostachys, Phaeophyceae, Isochrysis, Isodon, Isochrysis, Photinus, Phaeophyceae, Chlorella, Phaeophyceae, Oocystis, oyster globulina, pavlova, parachloropsis, parva, praguenophyta, phaeodactylum, phage, microalgal, tetraselminthium, crohns, portulaca, prototheca, pseudochlorella, neochlorella, pseudodiadactylum, talocystis, plasmopara, scenedesmus, ostereum, spirulina, schizophyllan, tetrastigmatis, thalassonia, xanthophylla, alexandrium, parachlorophyllum, welshikonium, and clitocystis.
77. The method of any one of claims 67 to 76, wherein the mutant photosynthetic organism is a plant.
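The fluorescence and chlorophyll quantities recited in claims 5 to 13 are usually derived from a handful of raw readings. The following is a minimal sketch, assuming the conventional definitions Fv/FM = (FM − F0)/FM and NPQ = (FM − FM′)/FM′, which the claims themselves do not spell out. The function names and all numeric readings are hypothetical illustrations, not data from this disclosure.

```python
def fv_fm(f0: float, fm: float) -> float:
    """Maximum quantum yield of PSII photochemistry: (FM - F0) / FM.

    f0 and fm are dark-adapted minimum and maximum fluorescence
    (assumed conventional definition, not taken from the claims).
    """
    if fm <= 0:
        raise ValueError("FM must be positive")
    return (fm - f0) / fm


def npq(fm_dark: float, fm_light: float) -> float:
    """Non-photochemical quenching: (FM - FM') / FM'.

    fm_dark is dark-adapted FM; fm_light is FM' measured under actinic light.
    """
    if fm_light <= 0:
        raise ValueError("FM' must be positive")
    return (fm_dark - fm_light) / fm_light


def percent_reduction(control: float, mutant: float) -> float:
    """Reduction of a mutant value relative to a control, in percent."""
    if control <= 0:
        raise ValueError("control value must be positive")
    return 100.0 * (control - mutant) / control


if __name__ == "__main__":
    # Hypothetical readings, for illustration only.
    print(fv_fm(f0=200.0, fm=800.0))
    print(npq(fm_dark=800.0, fm_light=400.0))
    print(percent_reduction(control=10.0, mutant=6.0))
```

With helpers of this kind, a threshold such as "at least a 40% reduction in chlorophyll" becomes a direct comparison of `percent_reduction(control, mutant)` against 40.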
Background
Increasing the biomass productivity of photosynthetic organisms is relevant to a range of commercial applications, from biofuels to high-value products. Genetic manipulation to increase the total protein content of biomass is highly desirable, but strategies to do so are not apparent in the art.
Engineering photosynthetic organisms to increase photosynthetic efficiency and thereby achieve higher productivity has been a long-standing goal of plant and algae biologists. US 2014/0220638 and US 2016/0304896 (both incorporated herein by reference) describe mutant screens for obtaining reduced-chlorophyll algal mutants that are impaired in low light adaptation capacity, that is, that maintain the low-chlorophyll state of high-light-adapted cells even under low light. US 2014/0220638 describes algal mutants with mutations in the light adaptation regulator genes LAR1, LAR2 and LAR3, and US 2016/0304896 discloses algal mutants with mutations in the chloroplast SRP54 gene.
Disclosure of Invention
Disclosed herein are photosynthetic organisms having modulated genes and increased photosynthetic efficiency and productivity, their use to produce products under photoautotrophic conditions, and methods of producing such photosynthetic organisms, as well as nucleic acid molecules and constructs for modulating such genes.
In one aspect, a mutant photosynthetic organism is provided that includes a mutated or attenuated significant growth improving gene 2 (SGI2).
In one aspect, a mutant photosynthetic organism is provided that includes a mutated or attenuated gene encoding chloroplast signal recognition protein 54 (cpSRP54) and a mutated or attenuated significant growth improving gene 2 (SGI2).
In one aspect, a mutant photosynthetic organism is provided that includes a mutated or attenuated gene encoding chloroplast signal recognition protein 54 (cpSRP54) and a mutated or attenuated significant growth improving gene 1 (SGI1).
In one aspect, a mutant photosynthetic organism is provided that includes a mutated or attenuated gene encoding chloroplast signal recognition protein 54 (cpSRP54), a mutated or attenuated significant growth improving gene 1 (SGI1), and a mutated or attenuated significant growth improving gene 2 (SGI2).
In one aspect, a mutant photosynthetic organism is provided that includes a mutated or attenuated gene encoding cytoplasmic signal recognition protein 54 (cytoSRP54) and a mutated or attenuated significant growth improving gene 2 (SGI2).
In one aspect, a mutant photosynthetic organism is provided that includes a mutated or attenuated gene encoding cytoplasmic signal recognition protein 54 (cytoSRP54) and a mutated or attenuated significant growth improving gene 1 (SGI1).
In one aspect, a mutant photosynthetic organism is provided that includes a mutated or attenuated gene encoding cytoplasmic signal recognition protein 54 (cytoSRP54), a mutated or attenuated significant growth improving gene 1 (SGI1), and a mutated or attenuated significant growth improving gene 2 (SGI2).
In one aspect, biomass is provided that includes a mutant photosynthetic organism, wherein the mutant photosynthetic organism includes a mutated or attenuated gene encoding chloroplast signal recognition protein 54 (cpSRP54) and a mutated or attenuated significant growth improving gene 1 (SGI1) and/or a mutated or attenuated significant growth improving gene 2 (SGI2).
In one aspect, a method of producing a biological product is provided. The method comprises culturing a mutant photosynthetic organism, wherein the mutant photosynthetic organism comprises a mutated or attenuated gene encoding chloroplast signal recognition protein 54 (cpSRP54) and a mutated or attenuated significant growth improving gene 1 (SGI1) and/or a mutated or attenuated significant growth improving gene 2 (SGI2); and isolating at least one product from the culture.
In one aspect, methods of inserting a single copy of a CRISPR gene into a selected locus of a microorganism are provided. In some embodiments, the CRISPR gene is codon optimized for expression in the microorganism. In some embodiments, the inserted CRISPR gene comprises a plurality of heterologous introns. In some embodiments, the number of heterologous introns may be at least 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 35, 40 or more. Non-limiting examples of such CRISPR genes include Cas9 and Cpf1. In some embodiments, the CRISPR gene can be operably linked to a native promoter of the microorganism. In some embodiments, the promoter is inducible. In some embodiments, the CRISPR gene can be operably linked to a promoter that is heterologous to the microorganism.
In some embodiments, the biological product is a lipid, a protein, a peptide, one or more amino acids, an amino acid, one or more nucleotides, a vitamin, a cofactor, a hormone, an antioxidant, or a pigment or colorant. In some embodiments, the biological product is biomass. In some embodiments, the mutant photosynthetic organism is an alga and the biomass is algal biomass.
In some embodiments, the mutant photosynthetic organism is engineered to comprise at least one exogenous gene encoding a polypeptide involved in the production of the lipid. In some embodiments, the mutant photosynthetic organism is phototrophic. In some embodiments, the mutant photosynthetic organism is an alga, and the alga is cultured in a pond or raceway.
In one aspect, there is provided a nucleic acid molecule construct for homologous recombination, comprising a nucleotide sequence from or adjacent to a naturally occurring photosynthetic organism gene encoding an SGI2 protein, wherein prior to mutation or attenuation of the gene, the SGI2 protein comprises an amino acid sequence having at least 55% identity to an amino acid sequence selected from the group consisting of: SEQ ID NO 5, SEQ ID NO 40, SEQ ID NO 41, SEQ ID NO 42, SEQ ID NO 43, SEQ ID NO 44, SEQ ID NO 45, SEQ ID NO 46, SEQ ID NO 47, SEQ ID NO 48, SEQ ID NO 49, SEQ ID NO 50, SEQ ID NO 51, SEQ ID NO 52, SEQ ID NO 53, SEQ ID NO 54, SEQ ID NO 55 and SEQ ID NO 56.
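The identity thresholds recited throughout (for example, "at least 55% identity") presuppose a way of scoring two aligned sequences. The following is a minimal sketch, assuming the two sequences have already been aligned by some external tool and that identity is counted over columns where neither sequence has a gap. That denominator is only one of several conventions, and this text does not state which one applies. The function names and the toy sequences are hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over aligned columns in which neither sequence is gapped.

    Both inputs must be the same length, with '-' marking alignment gaps.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == "-" or b == "-":
            continue  # skip gapped columns (assumed convention)
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


def meets_identity_threshold(aligned_a: str, aligned_b: str,
                             threshold: float = 55.0) -> bool:
    """True if the pair reaches the given percent-identity threshold."""
    return percent_identity(aligned_a, aligned_b) >= threshold


if __name__ == "__main__":
    # Toy aligned peptides, not sequences from the sequence listing.
    print(percent_identity("MKT-AL", "MKSQAL"))
    print(meets_identity_threshold("MKT-AL", "MKSQAL"))
```

Counting gapped columns in the denominator instead would lower the score for the same alignment, so the convention should be fixed before comparing against a claimed threshold.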
In one aspect, there is provided a plurality of nucleic acid molecule constructs for homologous recombination comprising nucleotide sequences from or adjacent to a naturally occurring photosynthetic organism gene encoding a cpSRP54 protein and a photosynthetic organism gene encoding an SGI1 protein, wherein prior to mutation or attenuation of the genes, the cpSRP54 protein comprises an amino acid sequence having at least 55% identity to SEQ ID NO 68, SEQ ID NO 75, SEQ ID NO 76, SEQ ID NO 77, SEQ ID NO 78, SEQ ID NO 79, SEQ ID NO 80, SEQ ID NO 81, SEQ ID NO 82, SEQ ID NO 83, SEQ ID NO 84, or SEQ ID NO 85, and wherein prior to mutation or attenuation of the SGI1 gene, the SGI1 gene encodes a polypeptide comprising an amino acid sequence having at least 55% identity to an amino acid sequence selected from the group consisting of SEQ ID NOs: 3, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 34, 35, 36, 37, 38 and 39.
In one aspect, there is provided a plurality of nucleic acid molecule constructs for homologous recombination comprising nucleotide sequences from or adjacent to a naturally occurring photosynthetic organism gene encoding a cpSRP54 protein and a photosynthetic organism gene encoding an SGI2 protein, wherein prior to mutation or attenuation of the genes, the cpSRP54 protein comprises an amino acid sequence having at least 55% identity to SEQ ID NO 68, SEQ ID NO 75, SEQ ID NO 76, SEQ ID NO 77, SEQ ID NO 78, SEQ ID NO 79, SEQ ID NO 80, SEQ ID NO 81, SEQ ID NO 82, SEQ ID NO 83, SEQ ID NO 84, or SEQ ID NO 85, and wherein prior to mutation or attenuation of the genes, the SGI2 protein comprises an amino acid sequence having at least 55% identity to SEQ ID NO 5, SEQ ID NO 40, SEQ ID NO 41, SEQ ID NO 42, SEQ ID NO 43, SEQ ID NO 44, SEQ ID NO 45, SEQ ID NO 46, SEQ ID NO 47, SEQ ID NO 48, SEQ ID NO 49, SEQ ID NO 50, SEQ ID NO 51, SEQ ID NO 52, SEQ ID NO 53, SEQ ID NO 54, SEQ ID NO 55, or SEQ ID NO 56.
In one aspect, nucleic acid molecule constructs for expressing antisense RNA, shRNA, microRNA or ribozymes are provided, comprising a nucleotide sequence complementary to at least a portion of a naturally occurring photosynthetic organism gene encoding an SGI2 protein, wherein prior to mutation or attenuation of the gene, the SGI2 protein comprises an amino acid sequence having at least 55% identity to SEQ ID NO:5, SEQ ID NO:40, SEQ ID NO:41, SEQ ID NO:42, SEQ ID NO:43, SEQ ID NO:44, SEQ ID NO:45, SEQ ID NO:46, SEQ ID NO:47, SEQ ID NO:48, SEQ ID NO:49, SEQ ID NO:50, SEQ ID NO:51, SEQ ID NO:52, SEQ ID NO:53, SEQ ID NO:54, SEQ ID NO:55, or SEQ ID NO:56.
In one aspect, a plurality of nucleic acid molecule constructs for expressing antisense RNA, shRNA, microRNA or ribozymes are provided, the nucleic acid molecule constructs comprising nucleotide sequences complementary to at least a portion of a naturally occurring photosynthetic organism gene encoding a cpSRP54 protein and a photosynthetic organism gene encoding an SGI1 protein, wherein prior to mutation or attenuation of the genes, the cpSRP54 protein comprises an amino acid sequence having at least 55% identity to SEQ ID NO:68, SEQ ID NO:75, SEQ ID NO:76, SEQ ID NO:77, SEQ ID NO:78, SEQ ID NO:79, SEQ ID NO:80, SEQ ID NO:81, SEQ ID NO:82, SEQ ID NO:83, SEQ ID NO:84 or SEQ ID NO:85, and wherein prior to mutation or attenuation of the SGI1 gene, the SGI1 protein comprises an amino acid sequence having at least 55% identity to any one of SEQ ID NOs: 3, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 34, 35, 36, 37, 38 or 39.
In some embodiments, the construct comprises at least a portion of a 5' UTR of cpSRP54, SGI1, SGI2, or a combination of two or more of the genes, at least a portion of a promoter region of cpSRP54, SGI1, SGI2, or a combination of two or more of the genes, and/or at least a portion of a 3' UTR of cpSRP54, SGI1, SGI2, or a combination of two or more of the genes. In some examples, the construct may be an RNAi, ribozyme, or antisense construct, and may comprise a sequence from the transcribed region of cpSRP54, SGI1, SGI2, or a combination of two or more of the genes, in sense or antisense orientation. In further examples, a construct may be designed for in vitro or in vivo expression of a guide RNA designed to target cpSRP54, SGI1, SGI2, or a combination of two or more of the genes, and may comprise a sequence homologous to a portion of any of the genes, including, for example, an intron, a 5' UTR, a promoter region, and/or a 3' UTR of a gene. In yet a further example, the construct used to attenuate expression of a gene encoding a cpSRP54, SGI1, or SGI2 polypeptide may be a guide RNA or an antisense oligonucleotide, wherein the sequence is homologous to the transcribed region of cpSRP54, SGI1, SGI2, or a combination of two or more of the genes, in an antisense orientation.
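An antisense sequence of the kind referred to above is, at its simplest, the reverse complement of a stretch of the transcribed region. The following is a minimal sketch of that relationship only. It says nothing about which region, length, or chemistry is effective, and the function names and the six-base input are hypothetical.

```python
# Translation table for Watson-Crick complementation of DNA bases.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna: str) -> str:
    """Reverse complement of a DNA sequence given 5'->3'."""
    seq = dna.upper()
    unexpected = set(seq) - set("ACGT")
    if unexpected:
        raise ValueError(f"unexpected characters: {sorted(unexpected)}")
    return seq.translate(_COMPLEMENT)[::-1]


def antisense_rna(sense_dna: str) -> str:
    """RNA (5'->3') complementary to the transcript of a sense-strand DNA segment."""
    return reverse_complement(sense_dna).replace("T", "U")


if __name__ == "__main__":
    # Toy six-base segment, not a sequence from the sequence listing.
    print(reverse_complement("ATGGCC"))
    print(antisense_rna("ATGGCC"))
```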
In one aspect, a plurality of nucleic acid molecule constructs for expressing antisense RNA, shRNA, microRNA, or ribozymes is provided, the nucleic acid molecule constructs comprising nucleotide sequences complementary to at least a portion of a naturally occurring photosynthetic organism gene encoding a cpSRP54 protein and a photosynthetic organism gene encoding an SGI2 protein, wherein prior to mutation or attenuation of the genes, the cpSRP54 gene encodes a protein comprising an amino acid sequence having at least 55% identity to SEQ ID NO:68, SEQ ID NO:75, SEQ ID NO:76, SEQ ID NO:77, SEQ ID NO:78, SEQ ID NO:79, SEQ ID NO:80, SEQ ID NO:81, SEQ ID NO:82, SEQ ID NO:83, SEQ ID NO:84, or SEQ ID NO:85, and wherein prior to mutation or attenuation of the genes, the SGI2 gene encodes a protein comprising an amino acid sequence having at least 55% identity to SEQ ID NO:5, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, or 56.
In one aspect, a plurality of nucleic acid molecules encoding a guide RNA is provided, wherein the guide RNA comprises at least a portion of a naturally occurring photosynthetic organism gene SGI2, wherein prior to mutation or attenuation of the gene, the SGI2 gene encodes a protein comprising an amino acid sequence having at least 55% identity to SEQ ID NO 5, SEQ ID NO 40, SEQ ID NO 41, SEQ ID NO 42, SEQ ID NO 43, SEQ ID NO 44, SEQ ID NO 45, SEQ ID NO 46, SEQ ID NO 47, SEQ ID NO 48, SEQ ID NO 49, SEQ ID NO 50, SEQ ID NO 51, SEQ ID NO 52, SEQ ID NO 53, SEQ ID NO 54, SEQ ID NO 55, or SEQ ID NO 56.
In one aspect, a plurality of nucleic acid molecules encoding at least two guide RNAs is provided, wherein the guide RNAs comprise at least a portion of a naturally occurring photosynthetic organism gene encoding cpSRP54 and a photosynthetic organism gene encoding SGI1, wherein prior to mutation or attenuation of the genes, the cpSRP54 gene encodes a protein comprising an amino acid sequence having at least 55% identity to SEQ ID NO 68, SEQ ID NO 75, SEQ ID NO 76, SEQ ID NO 77, SEQ ID NO 78, SEQ ID NO 79, SEQ ID NO 80, SEQ ID NO 81, SEQ ID NO 82, SEQ ID NO 83, SEQ ID NO 84, or SEQ ID NO 85, and wherein prior to mutation or attenuation of the SGI1 gene, the SGI1 gene encodes a protein comprising an amino acid sequence having at least 55% identity to SEQ ID NO 3, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 34, 35, 36, 37, 38, or 39.
In one aspect, a plurality of nucleic acid molecules encoding at least two guide RNAs is provided, wherein the guide RNAs comprise at least a portion of a naturally occurring photosynthetic organism cpSRP54 gene and a photosynthetic organism SGI2 gene, wherein prior to mutation or attenuation of the genes, the cpSRP54 gene encodes a protein comprising an amino acid sequence having at least 55% identity to SEQ ID NO 68, SEQ ID NO 75, SEQ ID NO 76, SEQ ID NO 77, SEQ ID NO 78, SEQ ID NO 79, SEQ ID NO 80, SEQ ID NO 81, SEQ ID NO 82, SEQ ID NO 83, SEQ ID NO 84, or SEQ ID NO 85, and wherein prior to mutation or attenuation of the genes, the SGI2 gene encodes a protein comprising an amino acid sequence having at least 55% identity to SEQ ID NO 5, SEQ ID NO 40, SEQ ID NO 41, SEQ ID NO 42, SEQ ID NO 43, SEQ ID NO 44, SEQ ID NO 45, SEQ ID NO 46, SEQ ID NO 47, SEQ ID NO 48, SEQ ID NO 49, SEQ ID NO 50, SEQ ID NO 51, SEQ ID NO 52, SEQ ID NO 53, SEQ ID NO 54, SEQ ID NO 55, or SEQ ID NO 56.
In one aspect, a method of increasing biomass of a photosynthetic organism is provided, the method comprising modulating an SGI2 gene.
In one aspect, there is provided a method of increasing biomass of a photosynthetic organism comprising modulating a chloroplast signal recognition protein 54 (cpSRP54) gene and significant growth improving gene 1 (SGI1), wherein prior to mutation or attenuation of the genes, the cpSRP54 gene encodes a protein comprising an amino acid sequence having at least 55% identity to SEQ ID NO:68, SEQ ID NO:75, SEQ ID NO:76, SEQ ID NO:77, SEQ ID NO:78, SEQ ID NO:79, SEQ ID NO:80, SEQ ID NO:81, SEQ ID NO:82, SEQ ID NO:83, SEQ ID NO:84, or SEQ ID NO:85, and wherein prior to mutation or attenuation of the SGI1 gene, the SGI1 gene encodes a protein comprising an amino acid sequence having at least 55% identity to SEQ ID NO:3, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 34, 35, 36, 37, 38, or 39.
In one aspect, there is provided a method of increasing biomass of a photosynthetic organism comprising modulating a chloroplast signal recognition protein 54 (cpSRP54) gene and significant growth improving gene 2 (SGI2), wherein prior to mutation or attenuation of the genes, the cpSRP54 gene encodes a protein comprising an amino acid sequence having at least 55% identity to SEQ ID NO:68, SEQ ID NO:75, SEQ ID NO:76, SEQ ID NO:77, SEQ ID NO:78, SEQ ID NO:79, SEQ ID NO:80, SEQ ID NO:81, SEQ ID NO:82, SEQ ID NO:83, SEQ ID NO:84, or SEQ ID NO:85, and wherein prior to mutation or attenuation of the genes, the SGI2 gene encodes a protein comprising an amino acid sequence having at least 55% identity to SEQ ID NO:5, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, or 56.
In one aspect, a method of increasing biomass of a photosynthetic organism is provided, the method comprising modulating a cytoplasmic signal recognition protein 54 (cytoSRP54) gene and significant growth improving gene 2 (SGI2), wherein prior to mutation or attenuation of the genes, the SGI2 gene encodes a protein comprising an amino acid sequence having at least 55% identity to SEQ ID NO:5, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, or 56.
In some embodiments, the culture of the mutant photosynthetic organism exhibits greater biomass productivity than a culture of a control photosynthetic organism of the same species. In some embodiments, the mutant photosynthetic organisms exhibit greater biomass productivity in photoautotrophic cultures. In some embodiments, the mutant photosynthetic organism exhibits greater biomass productivity under continuous light conditions than a culture of a control photosynthetic organism of the same species. In some embodiments, the mutant photosynthetic organism exhibits greater biomass productivity under diurnal cycle conditions than a culture of a control photosynthetic organism of the same species. In some embodiments, the mutant photosynthetic organism exhibits greater biomass productivity under diurnal cycle conditions than a culture of a control photosynthetic organism of the same species, wherein the light profile mimics a natural sunlight profile.
In some embodiments, increasing the biomass of the photosynthetic organism comprises an increase in total organic carbon. In some embodiments, increasing the biomass of the photosynthetic organism comprises an increase in total lipid content. In some embodiments, increasing the biomass of the photosynthetic organism comprises an increase in total nitrogen content.
In some embodiments, the mutant photosynthetic organism exhibits a reduction of chlorophyll under low light conditions and a higher maximum photochemical quantum yield of photosystem II (Fv/FM) at all physiologically relevant irradiances greater than 100, 125, 150, 200, or 250 μE m⁻² s⁻¹, relative to a control photosynthetic organism of the same species. In some embodiments, the reduction in chlorophyll is at least a 20%, 30%, 40%, 50%, 60%, or 70% reduction relative to a control photosynthetic organism of the same species. In some embodiments, the mutant photosynthetic organism exhibits lower non-photochemical quenching (NPQ) at all physiologically relevant irradiances greater than 125, 150, 200, or 250 μE m⁻² s⁻¹, relative to a control photosynthetic organism of the same species.
In some embodiments, the mutant photosynthetic organism exhibits a higher carbon fixation rate on a per chlorophyll basis than a control photosynthetic organism of the same species. In some embodiments, the carbon fixation rate is at least 50%, 60%, 70%, 80%, 90%, or 100% greater than that of a control photosynthetic organism of the same species.
In some embodiments, the mutant photosynthetic organism exhibits an oxygen evolution rate per milligram of chlorophyll that is at least 100%, 150%, 200%, 300%, or 400% greater than that of a control photosynthetic organism of the same species. In some embodiments, the mutant photosynthetic organism exhibits an oxygen evolution rate per gram of total organic carbon (TOC) that is at least 100%, 150%, 200%, 300%, or 400% greater than that of a control photosynthetic organism of the same species.
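For readers less familiar with the fluorescence parameters named above, the following Python sketch shows the conventional definitions: the maximum quantum yield of photosystem II is the variable fluorescence divided by the maximum fluorescence, and NPQ is commonly calculated in its Stern-Volmer form. The function names and the numerical readings are hypothetical and are not data from this disclosure.

```python
def fv_fm(f0: float, fm: float) -> float:
    """Maximum PSII quantum yield, (Fm - F0) / Fm, from dark-adapted fluorescence."""
    if fm <= 0:
        raise ValueError("Fm must be positive")
    return (fm - f0) / fm


def npq(fm: float, fm_prime: float) -> float:
    """Stern-Volmer non-photochemical quenching, (Fm - Fm') / Fm'."""
    if fm_prime <= 0:
        raise ValueError("Fm' must be positive")
    return (fm - fm_prime) / fm_prime


# Hypothetical fluorometer readings in arbitrary units.
print(fv_fm(f0=200.0, fm=1000.0))
print(npq(fm=1000.0, fm_prime=500.0))
```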
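The percentage comparisons above depend on the normalization chosen (per milligram of chlorophyll or per gram of TOC). A minimal Python sketch of that arithmetic follows; the function names and all numbers are invented for illustration and do not represent measurements from this disclosure.

```python
def rate_per_unit(gross_rate: float, normalizer: float) -> float:
    """Normalize a gross rate (e.g., O2 evolved per hour) by chlorophyll or TOC mass."""
    if normalizer <= 0:
        raise ValueError("normalizer must be positive")
    return gross_rate / normalizer


def percent_increase(mutant: float, control: float) -> float:
    """Percent by which the mutant value exceeds the control value."""
    return 100.0 * (mutant - control) / control


# Hypothetical case: equal gross O2 evolution, but the mutant holds half the chlorophyll.
control_rate = rate_per_unit(gross_rate=10.0, normalizer=1.0)
mutant_rate = rate_per_unit(gross_rate=10.0, normalizer=0.5)
print(percent_increase(mutant_rate, control_rate))
```

The example shows why a reduced-chlorophyll strain can display a large per-chlorophyll increase even when the gross rate is unchanged, which is why rates normalized to TOC are reported alongside it.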
In some embodiments, the culture of the mutant photosynthetic organism exhibits greater lipid productivity than a culture of a control photosynthetic organism of the same species. In some embodiments, the mutant photosynthetic organisms exhibit greater lipid productivity in photoautotrophic cultures. In some embodiments, the mutant photosynthetic organism is an alga.
In some embodiments, the mutant photosynthetic organism is produced by modulating an SGI2 gene of the organism. In some embodiments, the mutant photosynthetic organism is produced by modulating the cpSRP54 gene and the SGI1 or SGI2 gene of the organism. In some embodiments, modulating the gene comprises UV irradiation, gamma irradiation, or chemical mutagenesis. In some embodiments, modulating the gene comprises base substitution mutation, insertion mutagenesis, gene replacement, RNAi, antisense RNA, meganuclease genome engineering, one or more ribozymes, and/or CRISPR/Cas systems in the cpSRP54 gene, SGI1 gene, SGI2 gene, or a combination of the genes.
In some embodiments, prior to the mutation or attenuation of the gene, the mutant photosynthetic organism comprises a cpSRP54 gene encoding a protein having an amino acid sequence at least 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, or 95% identical to an amino acid sequence selected from the group consisting of SEQ ID NO:68, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, and 85. In some embodiments, prior to the mutation or attenuation of the gene, the mutant photosynthetic organism comprises a cpSRP54 gene encoding a protein having an amino acid sequence at least 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, or 95% identical to at least 30, 35, 40, 45, 50, 60, 70, 80, 100, 150, 200, 250, or 300 amino acids, or to the full length, of an amino acid sequence selected from the group consisting of SEQ ID NO:68, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, and 85.
In some embodiments, the mutant photosynthetic organism comprises a mutation in the cpSRP54 gene that occurs outside of the sequence encoding the first 169 amino acids of the cpSRP54 GTPase domain. In some embodiments, the mutation in the cpSRP54 gene occurs outside of the sequence encoding the cpSRP54 GTPase domain. In some embodiments, the mutation in the cpSRP54 gene does not comprise a gene-disrupting mutation in the sequence encoding the cpSRP54 GTPase domain.
In some embodiments, prior to mutation or attenuation of the gene, the SGI2 gene of the mutant photosynthetic organism encodes a protein having an amino acid sequence at least 50%, 65%, 70%, 75%, 80%, 85%, 90%, or 95% identical to the amino acid sequence of SEQ ID NO 5, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, or 56. In some embodiments, prior to mutation or attenuation of the gene, the SGI2 gene of the mutant photosynthetic organism encodes a protein having an amino acid sequence at least 50%, 65%, 70%, 75%, 80%, 85%, 90%, or 95% identical to at least 30, 35, 40, 45, 50, 60, 70, 80, 100, 150, 200, 250, or 300 amino acids, or to the full length, of the amino acid sequence of SEQ ID NO 5, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, or 56.
In some embodiments, prior to the mutation or attenuation of the SGI1 gene, the SGI1 gene of the mutant photosynthetic organism encodes a protein having an amino acid sequence at least 50%, 65%, 70%, 75%, 80%, 85%, 90%, or 95% identical to the amino acid sequence of SEQ ID NO 3, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 34, 35, 36, 37, 38, or 39. In some embodiments, prior to the mutation or attenuation of the SGI1 gene, the SGI1 gene of the mutant photosynthetic organism encodes a protein having an amino acid sequence at least 50%, 65%, 70%, 75%, 80%, 85%, 90%, or 95% identical to at least 30, 35, 40, 45, 50, 60, 70, 80, 100, 150, 200, 250, or 300 amino acids, or to the full length, of the amino acid sequence of SEQ ID NO 3, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 34, 35, 36, 37, 38, or 39.
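The identity thresholds recited above can be made concrete with a short Python sketch that computes percent identity from two pre-aligned sequences. It assumes the alignment has already been produced by a separate tool, and it uses one of several possible conventions (identical residues divided by the number of ungapped aligned columns); this disclosure does not say which convention or alignment program applies, and the sequences shown are invented.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over the ungapped columns of two pre-aligned sequences."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == "-" or b == "-":
            continue  # assumed convention: gapped columns are not counted
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


# Invented toy alignment: 5 identical residues across 6 ungapped columns.
pid = percent_identity("MKT-LLV", "MKSALLV")
print(round(pid, 1), pid >= 55.0)
```

Counting gapped columns in the denominator, or dividing by the length of the shorter sequence, would give a different value for the same alignment, so the convention should be stated whenever a threshold is applied.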
In some embodiments of the above aspects, the photosynthetic organism is polyploid, e.g., diploid, triploid, or tetraploid. In some embodiments, one or more copies of the cpSRP54, SGI1, or SGI2 gene are mutated or attenuated, leaving the other copies of the gene unmutated or unattenuated, to generate the mutant photosynthetic organism. In some embodiments, the mutant photosynthetic organism thus produced exhibits a reduction of chlorophyll under low light conditions and a higher maximum photochemical quantum yield of photosystem II (Fv/FM) at all physiologically relevant irradiances greater than 100, 125, 150, 200, or 250 μE m⁻² s⁻¹, relative to a control photosynthetic organism of the same species. In some embodiments, the mutant photosynthetic organism thus produced exhibits greater biomass productivity than a control photosynthetic organism of the same species. In some embodiments, the mutant photosynthetic organism thus produced exhibits greater lipid productivity than a control photosynthetic organism of the same species.
In some embodiments of the above aspect, the mutant photosynthetic organism is an alga. In some embodiments, the algae belongs to the genera: the genus Microcystis (Achnanthes), Coccocus (Amphiora), Geotrichum (Amphiora), Cellulomonas (Ankisstrodes), Asterina (Asteromonas), Euglena (Boekelovia), Borrelia (Bolidomonas), Bordetella (Borodinella), balloonflower (Botrydium), Botryococcus (Botryococcus), Bractenococcus (Bractenococcus), Chaetoceros (Chaetoceros), Tetraflagellata (Carteria), Chlamydomonas (Chlamydomonas), Chlorococcus (Chlorococcum), Chlorella (Chloromonum), Chlorella (Chlorococcum), Chlorella (Chlorella), Cryptomonas (Chlorococcus), Chlorella (Chlorella), Chlorophyces (Chlorococcus), Chlorella (Chlorococcus), Chlorophyces (Cryptococcus (Crypthecodina), Euglea (Crypthecodinium), Euglena (Crypthecodinium), Euglea (Crypthecodinium), Euglena (Crypthecodinium), Euglea (Crypthecodinium), Euglena (Crypthecodinium), Euglea, The genus Leymus (Gloeothamnion), Rhodococcus (Haematococcus), halophil (Halocaceta), Isochrysis (Heterococcus), Hymenomonas (Hymenomonas), Isochrysis (Isochrysis), Leptophyceae (Lepocinclis), Micropteris (Micracystis), Allium (Monodendros), Monochrysis (Monoprophidium), Micropteris (Nannochloropsis), Nannochloropsis (Navicula), Neochloris (Neochloris), Phaeophyceae (Neocalliphyceae), Phaeophyceae (Phaeophyceae), Chlorella (Paphialospora), Porphyromonas (Phormidium), Porphyromonas (Paphidophyllum), Porphyromonas (Pleurophyromonas), Porphyromonas (Piloca), Porphyromonas (Pleurophyromonas), Phaeophyceae (Pleurophyceae), Phaeophyceae (Pachys), Pachys (Pachylinae), Pachys), Pachylinae (Pachylinae), Pachylinae (Pachylinae), Pachys), Pachylinae (Pachylinae, Pachylinae (Pachys (Pachylinae), Pachylinae (, Pseudochlorella (Pseudochlorella), neochlorella (Pseudochlorella), pseudocruciate (pseudostaurospora), talaria (Pyramimonas), plasmodesmata (Pyrobotrys), Scenedesmus (Scenedesmus), Skeletonema (Skeletonema), spirulina (spirogyra), schizophyllum 
(Stichococcus), tetragonococcus (Tetraselmis), thalassonia (thalassosia), xanthomonas (Tribonema), chrysosporium (Tribonema), hemina (tribolium), hemicella (Vaucheria), paracoccus (virilia), wishlia (vischirea) and globularia (volvoox). In some embodiments, the mutant photosynthetic organism is a member of the phylum chlorophyta or stonewort, and can be, for example, a member of any one of the phylum chlorophyta: chlorophyceae (Chlorophyceae), Coccidiomycetes (Trebouxiophyceae), Tetraselophyceae (Chlorodermaphyceae), Ulva (Ulvophyceae), Pinophyceae (Pedinophyceae) or Prasinophyceae (Prasinophyceae). For example, the algal mutant may be a species belonging to: chlorophyceae, Coccidioides or Tetraselophyceae. In some embodiments, the mutant algal cell is a chlorella algal cell, and can be a chlorella algal cell of a chlorella class, e.g., a species of gene, such as botryococcus, chlorella, oleaginous microalgae (Auxenochlorella), hevea (hevochlorella), chlorella (Marinichlorella), parachloropsis, pseudochlorella, tetracyclic (Tetrachlorella), unicellular, fucus, miscanthus, microspherococcus, oocyst, microalgal, or prototheca. In some embodiments, the mutant algae can be a species belonging to: oleaginous microalgae, chlorella, Ericaceae, marine chlorella, parachlorococcus, pseudochlorella or Tetracoccus.
In some embodiments, the mutant photosynthetic microorganism is a cyanobacterium. In some embodiments, the cyanobacterium is cyanobacteria, algomenorium (Agmenellum), collaretta, coleopteran, synechocystis, ophyceae, chlorella, Bodinaria, Geotrichum, Coccidioides, Chlorophyceae, Synechococcus, Chroococcus, Phaeococcus, Blueorthogonal, cyanobacteria, Blueocystis, Spirosoma, Blueslea, Podospora, Cytospora, Cellulosia, Microphysalis, Microphyceae, Coxobacter, Ornithogalum, Gliocladium, Gloenophyllum, Phosphaeroides, Halospirium, morphopomorpha, Sphingomonas, Dioscorea, Lyophyllum, Sphingomonas, Sphaerotheca, Microcystis, Coccomyxophyceae, Nostolonia, Oscillatoria, Photinctoria, Photinus, Phyllophysconalia, Chlorophyces, Phaeophyceae, Chlorophyces, Phaeophyceae, Prototheca, Colletotrichum, Leptosphaera, Schizosaccharomyces, Pseudocladocephalus, Spirulina, Staneisseria, Stahlianthus, Eucladocephala, Aphanizomenon, Synechococcus, Synechocystis, Thermococcus (thermosynechocystis), Monoramophyces, Aphanizomenon, Thermoascus, or Isococca species.
In some embodiments, the mutant photosynthetic organism is a plant. Non-limiting examples of plants include monocots and dicots, such as crops including cereal crops (e.g., wheat, corn, rice, millet, barley), fruit crops (e.g., tomato, apple, pear, strawberry, orange), forage crops (e.g., alfalfa), root vegetable crops (e.g., carrot, potato, beet, yam), and leafy vegetable crops (e.g., lettuce, spinach); flowering plants (e.g., petunia, rose); conifers (e.g., pine, fir, spruce); plants used for phytoremediation (e.g., plants that accumulate heavy metals); oil crops (e.g., sunflower, rapeseed); and plants used for experimental purposes (e.g., Arabidopsis).
Non-limiting examples of mutated dicotyledonous plants include plants belonging to the following orders: magnoliaceae, Miciales, Cinnamomum, Pepper, Aristolochiales, Nymphaeaceae, Ranunculaceae, Papaveraceae, Boraginaceae, Kunzendendroles, Hamamelidales, eucommia, Lepidales, Myricales, Petasites, Coumanthaceae, Caryophyllales, Myricales, Phyllanthus, Polygonales, Lanceolares, Dillegiales, Camellia, Malvaceae, Urticales, Barringtonia, Violales, Salicariales, Cleoideae, Photinia, Myrtaceae, Caryophyllales, Dioscorea, Primulinaria, Rosales, Dolicheniales, Hygrophyrida, Microsiales, Myrtaceae, Cornus, Pseudobulbus, Dioscoreales, Salicales, Rhamnales, Sapindales, Geraniales, Umbelliferae, Polygalales, Lamiaceae, Plantaginea, Rutaceae, Euphorbiales, Rhamnales, Rutaceae, Euphorbiales, Rutaceae, Ru.
Non-limiting examples of mutated monocots include plants belonging to the following orders: alismatis, eupolyphaga, euryales, mildewles, dayplantales, pipewort, elephantopus, gramineae, juncus, cyperaceae, typha, pinelliales, zingiberales, areca, cyclophiliales, lofotemrina, asteriales, liliales, and orchids, or plants belonging to the gymnosperms order, for example, those belonging to the following order: pinales, ginkgoles, cycadales, araucales, cypress and ephedra.
In some embodiments, the mutant plant may be Arabidopsis thaliana (Arabidopsis arenicola), Arabidopsis thaliana (Arabidopsis arenosa), Arabidopsis cerealis, Arabidopsis creotica, Arabidopsis thaliana (Arabidopsis thaliana), Arabidopsis negundo, Arabidopsis petermanata, Arabidopsis suberectus subentana, Arabidopsis thaliana (Arabidopsis thaliana), maize (Zea mays), rice (Oryza sativa), wheat (Triticum aestivum), potato (Solanum tuberosum), onion (Allium cepa), garlic (Allium sativum), soybean (Glycine), tomato (Glycax), Brassicoccus terrestris, Brassia (Solanum), Gossicium Solanum nigrum, Gossimum Gossypium, or Gossia herbarum (Gossimum).
In some embodiments, modulation of SRP54, SGI1, SGI2, or a combination of two or more of the genes in a plant may be tissue specific. In some embodiments, the plant tissue may be a leaf, a stem, or a root. In some embodiments, tissue-specific modulation of a gene may be achieved by modulating a tissue-specific non-coding region of the gene, e.g., a promoter, an enhancer, an intron, or a 3'- or 5'-untranslated region. In some embodiments, modulation of SRP54, SGI1, SGI2, or a combination of two or more of the genes in the plant may occur at different developmental stages of the plant.
These and other objects and features of the present invention will become more fully apparent from the following detailed description of the invention when read in conjunction with the accompanying drawings.
Drawings
FIGS. 1A-1B. FIG. 1A shows a schematic of the SGI1 gene. The putative positions of gRNAs designed to disrupt the SGI1 gene (CRISPR targets) are indicated. FIG. 1B shows a schematic of the SRP54 gene. The putative positions of gRNAs designed to disrupt the SRP54 gene (CRISPR targets) are indicated.
FIGS. 2A-2C. FIG. 2A shows a schematic of the SGI1 gene. The putative positions of gRNAs designed to disrupt the SGI1 gene (CRISPR targets) are indicated. FIG. 2B shows a schematic of the SGI1 protein. FIG. 2C shows a schematic of the SRP54 gene. The putative positions of gRNAs designed to disrupt the SRP54 gene (CRISPR targets) are indicated.
FIG. 3 shows an exemplary domain architecture analysis of the Parachlorella sp. SGI2 protein.
FIG. 4 shows an exemplary domain architecture analysis of the Oocystis sp. SGI2 protein.
FIG. 5 shows an exemplary domain architecture analysis of the Tetraselmis sp. SGI2 protein.
FIG. 6 shows an exemplary domain architecture analysis of the Arabidopsis (Arabidopsis thaliana) SGI2 protein.
FIG. 7 shows an exemplary domain architecture analysis of the Arabidopsis SGI2 protein.
FIG. 8 shows an exemplary domain architecture analysis of the Arabidopsis SGI2 protein.
FIG. 9 shows an exemplary domain architecture analysis of the Arabidopsis SGI2 protein.
FIGS. 10A-10B. FIG. 10A shows a schematic of a DNA cassette containing a codon-optimized Cre gene flanked by a nitrite reductase promoter and terminator. FIG. 10B shows a schematic of a DNA cassette comprising the sequences of BleR and GFP.
FIG. 11 shows the results of productivity assays for a Chlorella wild-type strain, an SRP54 knockout strain, an SGI2 knockout strain, and an SGI2/SRP54 double knockout strain.
FIGS. 12A-12B. FIG. 12A shows the results of semi-continuous areal TOC productivity assays for the Chlorella vulgaris wild-type strain (STR00010), the SRP54 knockout mutant (STR00625), the SGI1 knockout mutant (STR24183), and the SGI1/SRP54 double knockout mutants (STR 2438 and STR 245056). FIG. 12B shows the results of batch TOC productivity assays for the Chlorella vulgaris wild-type strain (STR00010), the SRP54 knockout mutant (STR00625), the SGI1 knockout mutant (STR24183), and the SGI1/SRP54 double knockout mutants (STR 24528138 and STR 245051).
FIGS. 13A-13B. FIG. 13A shows the results of semi-continuous areal TOC productivity assays for the Chlorella wild-type strain (STR00010), the SRP54 knockout mutant (STR00625), the SGI1 knockout mutant (STR00012), the SGI2/SRP54 double knockout mutant (STR25761), and the SGI1/SGI2/SRP54 triple knockout mutants (STR25761 and STR25762). FIG. 13B shows the results of batch TOC productivity assays for the Chlorella wild-type strain (STR00010), the SRP54 knockout mutant (STR00625), the SGI1 knockout mutant (STR00012), the SGI2/SRP54 double knockout mutant (STR25761), and the SGI1/SGI2/SRP54 triple knockout mutants (STR25761 and STR25762).
FIG. 14 shows the results of batch FAME productivity assays for the Chlorella vulgaris wild-type strain (STR00010), the SRP54 knockout mutant (STR00625), the SGI1 knockout mutant (STR24183), and the SGI1/SRP54 double knockout mutants (STR 2438 and STR 24522).
FIG. 15 shows the results of batch FAME productivity assays for the Chlorella vulgaris wild-type strain (STR00010), the SGI1 knockout mutant (STR00012), the SGI2/SRP54 double knockout mutant (STR00516), and the SGI1/SGI2/SRP54 triple knockout mutants (STR25761 and STR25762).
FIGS. 16A-16B. FIG. 16A shows a schematic of a selection cassette for knockout of Chlorella SRP54. FIG. 16B shows a schematic of a selection cassette for knockout of Chlorella SGI2.
FIG. 17 shows a schematic diagram of a recombinant pCC1BAC vector including Cas9, GFP, BleR, Cre gene and lox site.
Detailed Description
The inventors of the present application surprisingly and unexpectedly found that modulating the SGI1 and SGI2 genes in photosynthetic organisms results in a reduction of chlorophyll under low light conditions and a higher maximum photochemical quantum yield of photosystem II (Fv/FM) at all physiologically relevant irradiances. In some embodiments, a mutant photosynthetic organism comprising a mutated or attenuated SGI1 or SGI2 gene exhibits lower non-photochemical quenching (NPQ) at all physiologically relevant irradiances. In some embodiments, a mutant photosynthetic organism comprising a mutated or attenuated SGI1 or SGI2 gene exhibits increased biomass compared to a control photosynthetic organism of the same species. In some embodiments, a mutant photosynthetic organism comprising a mutated or attenuated SGI1 or SGI2 gene exhibits a higher carbon fixation rate on a per chlorophyll basis than a control photosynthetic organism of the same species. In some embodiments, a mutant photosynthetic organism comprising a mutated or attenuated SGI1 or SGI2 gene exhibits a higher carbon fixation rate on a per TOC basis than a control photosynthetic organism of the same species. In some embodiments, a mutant photosynthetic organism comprising a mutated or attenuated SGI1 or SGI2 gene exhibits a higher oxygen evolution rate per mg of chlorophyll than a control photosynthetic organism of the same species. In some embodiments, a mutant photosynthetic organism comprising a mutated or attenuated SGI1 or SGI2 gene exhibits a higher oxygen evolution rate on a per TOC basis than a control photosynthetic organism of the same species. In some embodiments, a mutant photosynthetic organism comprising a mutated or attenuated SGI1 or SGI2 gene exhibits higher lipid productivity than a control photosynthetic organism of the same species. In some embodiments, a mutant photosynthetic organism comprising a mutated or attenuated SGI1 or SGI2 gene exhibits greater lipid productivity in photoautotrophic culture.
The inventors of the present application have also surprisingly found that modulating the SGI1 or SGI2 gene together with the SRP54 gene in photosynthetic organisms has a synergistic effect. In some embodiments, mutant photosynthetic organisms in which SRP54 and the SGI1 or SGI2 gene are modulated exhibit a further reduction of chlorophyll, a greater increase in biomass, a higher carbon fixation rate on a per chlorophyll basis, a higher carbon fixation rate on a per TOC basis, and higher lipid productivity, as compared to mutant photosynthetic organisms in which only the SGI1 or SGI2 gene is modulated.
SGI1 gene
As described herein, a significant growth improving gene 1 (SGI1) polypeptide is a polypeptide comprising two domains: a response receiver or "RR" domain (Pfam PF00072) and a Myb domain (Pfam PF00249), wherein the RR domain is N-terminal to the Myb domain. The RR and Myb domains are separated by an amino acid sequence that is poorly conserved or not conserved among SGI1 polypeptides, sometimes referred to herein as a linker between the two domains, where the linker may range in length from, for example, one to 300 amino acids, or ten to 200 amino acids. The linker region may optionally comprise a nuclear localization sequence (NLS).
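The domain arrangement just described (an RR domain N-terminal to a Myb domain, separated by a linker of, for example, one to 300 amino acids) can be expressed as a simple check on domain annotations. The Python sketch below assumes that domain coordinates have already been obtained from a separate domain search; the `Domain` type, the function name, and the Myb coordinates in the example are hypothetical, while the RR coordinates follow the approximate positions 36 to 148 given for SEQ ID NO:3.

```python
from typing import NamedTuple


class Domain(NamedTuple):
    """A domain hit on a protein: Pfam accession plus 1-based inclusive coordinates."""
    accession: str
    start: int
    end: int


def has_sgi1_architecture(domains: list[Domain],
                          min_linker: int = 1,
                          max_linker: int = 300) -> bool:
    """True if an RR domain (PF00072) lies N-terminal to a Myb domain (PF00249)
    with a linker length inside the allowed range."""
    rr_hits = [d for d in domains if d.accession == "PF00072"]
    myb_hits = [d for d in domains if d.accession == "PF00249"]
    for rr in rr_hits:
        for myb in myb_hits:
            linker = myb.start - rr.end - 1  # residues between the two domains
            if min_linker <= linker <= max_linker:
                return True
    return False


# Hypothetical annotation; the Myb coordinates are invented for illustration.
example = [Domain("PF00072", 36, 148), Domain("PF00249", 200, 250)]
print(has_sgi1_architecture(example))
```

A check of this kind screens only for domain order and spacing; it does not replace the sequence identity or HMM score criteria used to define SGI1 polypeptides.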
The presence of the response receiver or "RR" domain (Pfam PF00072) is responsible for the bioinformatic annotation of SGI1 as a CheY-like polypeptide. The RR domain extends from approximately amino acid 36 to amino acid 148 of the Chlorella sp. SGI1 polypeptide (SEQ ID NO:3), and is also characterized in the Conserved Domain Database (CDD) as the "signal receiver domain", cd00156, extending from approximately amino acid 37 to amino acid 154. The RR domain is also characterized in the Clusters of Orthologous Groups (COG) database as the "CheY-like receiver (REC) domain", COG0784, and as the InterPro "CheY-like superfamily" domain, IPR011006, where both of these characterized domains extend from about amino acid 33 to about amino acid 161 of the Chlorella sp. SGI1 polypeptide of SEQ ID NO:3. The RR domain is found in bacterial two-component regulatory systems (such as the bacterial chemotaxis two-component system that includes the polypeptide known as CheY), where it receives a signal from a sensor partner. The RR domain of such systems is typically found N-terminal to a DNA binding domain and contains a phosphoacceptor site that can be phosphorylated, which may be responsible for its activation or deactivation.
The RR domain within the SGI1 protein may be characterized, for example, as Pfam PF00072, or as a "signal receiver domain" or simply a "receiver domain", and/or may be classified as cd00156 in the Conserved Domain Database (CDD), as COG0784 in the Clusters of Orthologous Groups (COG) database, or as the InterPro "CheY-like superfamily" domain, IPR011006.
The myb domain within the SGI1 protein can be characterized, for example, as Pfam PF00249, "Myb-like DNA-binding domain", and/or may be identified as the conserved domain TIGR01557, "myb-like DNA-binding domain, SHAQKYF class" ("SHAQKYF" disclosed as SEQ ID NO:102), or as an InterPro homeodomain superfamily domain (IPR009057) and/or an InterPro Myb domain (IPR017930).
In addition to having an RR domain N-terminal to a myb domain, the SGI1 proteins provided herein may score 300 or more, 320 or more, 340 or more, 350 or more, 360 or more, or 370 or more when scanned with a Hidden Markov Model (HMM) designed to score proteins by the degree to which the query amino acid sequence matches the conserved amino acids of the SGI1 homolog region in algae. In arriving at the score, highly conserved amino acid positions are weighted more heavily than poorly conserved positions within the compared region of the polypeptide. When scanned with an HMM built from algal SGI1 protein sequences, each taken as a single contiguous sequence that includes the RR domain, the linker, and the myb domain, polypeptides that score 350 or higher, such as 370 or higher, include, but are not limited to, the following algal and plant polypeptides: Chlorella 1185 (SEQ ID NO:3), Coccomyxa sp. (SEQ ID NO:9), Ostreococcus lucimarinus (SEQ ID NO:10), Chlamydomonas reinhardtii (SEQ ID NO:11), Volvox carteri (SEQ ID NO:13), Tetraselmis 105 (SEQ ID NOs:14, 15, and 16), Oocystis sp. (SEQ ID NO:17), Micromonas sp. RCC299 (SEQ ID NO:18), Micromonas pusilla (SEQ ID NO:19), Sphagnum fallax (SEQ ID NO:20), Physcomitrella patens (SEQ ID NO:21), Arabidopsis thaliana (SEQ ID NO:22), Arabidopsis halleri (SEQ ID NO:23), Arabidopsis lyrata (SEQ ID NO:24), Helianthus annuus (SEQ ID NO:25), Vitis vinifera (SEQ ID NO:26), Amborella trichopoda (SEQ ID NO:27), Ricinus communis (SEQ ID NO:28), tomato (SEQ ID NO:29), potato (SEQ ID NO:30), upland cotton (SEQ ID NO:31), cocoa (SEQ ID NO:32), common bean (Phaseolus vulgaris) (SEQ ID NO:33), soybean (SEQ ID NO:34), quinoa (SEQ ID NO:35), apple (Malus domestica) (SEQ ID NO:36), maize (SEQ ID NO:37), turnip (SEQ ID NO:38), and rice (SEQ ID NO:39), as well as polypeptides having at least 50%, at least 55%, at least 60%, at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% identity to any of the foregoing, wherein the polypeptide has an RR domain and a myb domain, and the RR domain is N-terminal to the myb domain. In various embodiments, the SGI1 polypeptide is from a plant or algal species. A gene encoding an SGI1 polypeptide as provided herein, e.g., a gene that is disrupted in a mutant or whose expression is attenuated as provided herein, can in various embodiments be a naturally occurring gene of a plant or algal species that encodes a polypeptide as disclosed herein.
In some embodiments, the SGI1 polypeptide as provided herein is an algal SGI1 polypeptide, e.g., one having the sequence of a naturally occurring algal SGI1 polypeptide, wherein the algal polypeptide comprises an RR domain and a myb domain, and the RR domain is N-terminal to the myb domain. Algal polypeptides can optionally be at least 50%, at least 55%, at least 60%, at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% identical to any algal SGI1 polypeptide disclosed herein. In some embodiments, the SGI1 gene may be a gene encoding an algal SGI1 polypeptide, such as a polypeptide having the sequence of a naturally occurring algal SGI1 polypeptide. An SGI1 gene encoding a polypeptide having the sequence of a naturally occurring algal SGI1 polypeptide may have the sequence of a naturally occurring gene or gene coding sequence, or may have a sequence that differs from that of the naturally occurring gene. In various embodiments, as disclosed herein, an SGI1 gene that is attenuated, mutated, or disrupted in a mutant photosynthetic organism can be a gene identified by BLAST, e.g., using the sequences disclosed herein, and/or by HMM scanning, wherein the HMM is based on a contiguous amino acid sequence, e.g., one obtained by aligning at least six SGI1 polypeptides, wherein the contiguous amino acid sequence comprises an RR domain and a myb domain, the RR domain is N-terminal to the myb domain, and a linker sequence that belongs to neither domain is present between the RR and myb domains.
In some embodiments, the SGI1 polypeptide has the sequence of an algal SGI1 polypeptide or is a variant of a naturally occurring algal SGI1 polypeptide having at least 85%, at least 90%, or at least 95% identity to a naturally occurring algal SGI1 polypeptide, and/or has at least 50%, at least 55%, at least 60%, at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% identity to any one of SEQ ID NO:3, SEQ ID NO:9, SEQ ID NO:10, SEQ ID NO:11, SEQ ID NO:12, SEQ ID NO:13, SEQ ID NO:14, SEQ ID NO:15, SEQ ID NO:16, SEQ ID NO:17, SEQ ID NO:18, or SEQ ID NO:19.
In some embodiments, the SGI1 polypeptide has the sequence of a plant SGI1 polypeptide or is a variant of a naturally occurring plant SGI1 polypeptide having at least 85%, at least 90%, or at least 95% identity to a naturally occurring plant SGI1 polypeptide, and/or has at least 50%, at least 55%, at least 60%, at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% identity to any one of SEQ ID NOs:20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, or 39.
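The percent-identity thresholds recited above can be illustrated with a minimal sketch. It assumes the two sequences have already been aligned (gaps written as "-") and counts identity over columns in which neither sequence has a gap; other denominators (for example, full alignment length) are also in common use, so this is one convention rather than the definition used in this disclosure. The function names and the toy sequences are hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over aligned columns where neither sequence has a gap.

    Assumption: inputs are rows of an existing pairwise alignment, with '-'
    marking gaps. Gapped columns are skipped (one of several conventions).
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == "-" or b == "-":
            continue  # skip columns containing a gap
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no aligned residue pairs to compare")
    return 100.0 * matches / compared


def meets_identity_threshold(aligned_a: str, aligned_b: str, threshold: float) -> bool:
    """True when the pair reaches 'at least <threshold>%' identity."""
    return percent_identity(aligned_a, aligned_b) >= threshold


# Hypothetical toy alignment: 4 identical residues out of 5 compared columns.
toy_identity = percent_identity("MKV-LA", "MRVQLA")
```

In practice the alignment itself would come from a dedicated alignment tool; the sketch only shows how a recited threshold such as "at least 80%" would be applied to the result.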
The Chlorella SGI1 gene sequence, provided as SEQ ID NO:1, was found to encode a polypeptide (SEQ ID NO:3) comprising two main functional domains, both of which occur in the N-terminal half of the 619-amino-acid protein. An exemplary Chlorella SGI1 cDNA sequence is provided as SEQ ID NO:2.
No conserved protein domain is found in the region C-terminal to the myb domain of the SGI1 polypeptide, i.e., in the (approximately) C-terminal half of the protein. On the other hand, the RR domain and myb domain architecture (in which the myb domain is located C-terminal to the RR domain) can be found in many proteins encoded in the genomes of green plants (Viridiplantae), which include algae. Bioinformatic analysis was used to identify possible orthologs of SGI1 in additional plant and algal species.
To identify a class of SGI1 proteins in additional photosynthetic organisms, a Hidden Markov Model (HMM) was constructed for the RR domain-myb domain architecture found in the Chlorella SGI1. As a first step, the Chlorella SGI1 polypeptide sequence (SEQ ID NO:3) was used in a BLAST search of the JGI Phytozome database v.12, which contains plant and algal genomes. Four proprietary algal genomes (from the genera Parachlorococcus, Arabidopsis, Tetraselmis, and Oocystis) were also added to the searched databases. The search was stopped when it reached approximately 2,000 hits. These results were then analyzed by InterProScan (available from EMBL-EBI [European Molecular Biology Laboratory-European Bioinformatics Institute], e.g., at ebi.ac.uk) to ensure that the selected results had both an Interpro CheY-like superfamily domain (IPR011006) and an Interpro Homeobox-like or Myb domain (IPR009057 or IPR017930). This step reduced the number of selected hits to between 900 and 1,000, in which a two-domain architecture (RR domain N-terminal to the myb domain) was clearly identified in polypeptides of both algae and higher plants. The resulting sequences were used to assemble phylogenetic trees based on sequence homology. The phylogenetic trees showed clear groupings of related polypeptides from algal species, including SGI1 homologs of the genera Chlorella, Dipasophyta, Oocystis, Chlamydomonas, Hydnococcus, Ostreococcus, Micromonas, and Gloeoscillus.
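The domain-architecture filter described above (retain only hits carrying IPR011006 together with IPR009057 or IPR017930, with the RR domain N-terminal to the myb domain) can be sketched as follows. This is an illustrative sketch, not the workflow actually used: the `DomainHit` record, the hit identifiers, and the myb coordinates are hypothetical, and the rule that the RR domain must end before the myb domain starts is an assumption about how "N-terminal to" might be tested. Only the InterPro accessions and the approximate RR coordinates (33 to 161) come from the text.

```python
from dataclasses import dataclass

RR_IDS = {"IPR011006"}                 # CheY-like superfamily
MYB_IDS = {"IPR009057", "IPR017930"}   # Homeobox-like or Myb domain


@dataclass(frozen=True)
class DomainHit:
    """One annotated domain on a polypeptide (hypothetical record layout)."""
    interpro_id: str
    start: int
    end: int


def has_rr_then_myb(domains: list[DomainHit]) -> bool:
    """True if some RR domain lies entirely N-terminal to some myb domain."""
    rr = [d for d in domains if d.interpro_id in RR_IDS]
    myb = [d for d in domains if d.interpro_id in MYB_IDS]
    return any(r.end < m.start for r in rr for m in myb)


def filter_hits(hits: dict[str, list[DomainHit]]) -> list[str]:
    """Keep the identifiers of hits with the RR-then-myb architecture."""
    return [name for name, domains in hits.items() if has_rr_then_myb(domains)]


# Hypothetical annotations for three BLAST hits.
example_hits = {
    "hit_rr_myb": [DomainHit("IPR011006", 33, 161), DomainHit("IPR017930", 200, 255)],
    "hit_rr_only": [DomainHit("IPR011006", 20, 140)],
    "hit_myb_rr": [DomainHit("IPR009057", 10, 70), DomainHit("IPR011006", 100, 220)],
}
selected = filter_hits(example_hits)
```

Only the first hypothetical hit is retained: the second lacks a myb domain and the third has the domains in the opposite order.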
Table 1: SGI1 orthologs in algal species
Organism | Polypeptide sequence | HMM score
Chlorella 1185 | SEQ ID NO:3 | 400.20
Coccomyxa sp. | SEQ ID NO:9 | 403.0
Ostreococcus lucimarinus | SEQ ID NO:10 | 425.8
Chlamydomonas reinhardtii | SEQ ID NO:11 | 413.3
Chromochloris zofingiensis | SEQ ID NO:12 | 292.6
Volvox carteri | SEQ ID NO:13 | 441.4
Tetraselmis 105 | SEQ ID NO:14 | 403.6
Tetraselmis 105 | SEQ ID NO:15 | 403.0
Tetraselmis 105 | SEQ ID NO:16 | 402.9
Oocystis sp. | SEQ ID NO:17 | 426.9
Micromonas sp. RCC299 | SEQ ID NO:18 | 418.4
Micromonas pusilla | SEQ ID NO:19 | 405.9
To establish criteria for possible SGI1 orthologs in other photosynthetic organisms, a Hidden Markov Model (HMM) was then developed based on the algal cluster of SGI1 polypeptide sequences. The HMM was developed from the N-terminal portion of the SGI1 polypeptides, which encompasses both the RR and myb domains and includes the linker region between the two conserved domains. Sequences C-terminal to the myb domain, which do not contain any recognizable conserved structure, were excluded from model construction. HMMER 3.1b2 was used to construct the HMM from a Multiple Sequence Alignment (MSA) of proprietary sequences of Chlorella, Oocystis, and Tetraselmis polypeptides and of public database sequences of polypeptides from Chlamydomonas reinhardtii, Volvox sp., Chromochloris zofingiensis, Micromonas sp. RCC299, and Ostreococcus lucimarinus. The ETE3 toolkit and the eggnog41 workflow were used to generate the MSA of the N-terminal half of the protein. This workflow internally uses the programs Muscle, MAFFT, Clustal Omega, and M-Coffee for alignment, trimAl for alignment trimming, and PhyML for phylogenetic inference. An HMM captures information from multiple protein sequences, as opposed to the single protein sequence used in a pairwise homology comparison, and is therefore able to distinguish between highly conserved and highly divergent residues and to take this into account when determining sequence relatedness. When an HMM is used to score sequences, highly conserved residues receive more weight than highly divergent residues, which provides better sensitivity and accuracy than simpler pairwise sequence alignments (PSAs).
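The statement that conserved positions carry more weight than divergent positions can be illustrated with a deliberately simplified sketch. It builds a per-column log-odds profile from a toy ungapped alignment; it is not HMMER and omits the insert and delete states, sequence weighting, and priors of a real profile HMM. The toy alignment, the uniform background frequency, and the pseudocount of 1 are assumptions chosen only for illustration.

```python
import math
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def build_profile(alignment: list[str], pseudocount: float = 1.0) -> list[dict[str, float]]:
    """Per-column log2-odds scores against a uniform background (assumption)."""
    background = 1.0 / len(AMINO_ACIDS)
    profile = []
    for column in zip(*alignment):  # iterate over alignment columns
        counts = Counter(column)
        total = len(column) + pseudocount * len(AMINO_ACIDS)
        profile.append({
            aa: math.log2(((counts.get(aa, 0) + pseudocount) / total) / background)
            for aa in AMINO_ACIDS
        })
    return profile


def score_sequence(profile: list[dict[str, float]], sequence: str) -> float:
    """Sum of column scores for an ungapped query of the same length."""
    if len(sequence) != len(profile):
        raise ValueError("query must have the same length as the profile")
    return sum(column[aa] for column, aa in zip(profile, sequence))


# Toy alignment: column 0 is fully conserved (D); column 2 is fully divergent.
toy_alignment = ["DKA", "DRS", "DKT", "DQG"]
toy_profile = build_profile(toy_alignment)

# Cost of a mismatch at the conserved column versus at the divergent column.
conserved_penalty = toy_profile[0]["D"] - toy_profile[0]["E"]
divergent_penalty = toy_profile[2]["A"] - toy_profile[2]["E"]
```

In this toy profile, replacing the conserved residue costs more score than replacing a residue in the divergent column, which is the behavior the paragraph above attributes to HMM scoring.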
The SGI1 HMM was used to assign a score to the polypeptides that were identified in the BLAST search and that had also been validated as having the two conserved domains (RR and myb). In the bioinformatic search, the highest scores, found among polypeptides of algal species and individual plant polypeptides, allowed the identification of proteins of interest in other algal species (Table 1). These represent possible orthologs, whose genes can be attenuated or knocked out to provide high-productivity mutants in other organisms.
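As an illustration of how the score cut-offs recited earlier (for example 350 or 370) would be applied, the sketch below filters the algal HMM scores listed in Table 1. The scores are copied from that table; the dictionary layout and the function name are hypothetical.

```python
# HMM scores from Table 1, keyed by polypeptide sequence identifier.
ALGAL_HMM_SCORES = {
    "SEQ ID NO:3": 400.20,
    "SEQ ID NO:9": 403.0,
    "SEQ ID NO:10": 425.8,
    "SEQ ID NO:11": 413.3,
    "SEQ ID NO:12": 292.6,
    "SEQ ID NO:13": 441.4,
    "SEQ ID NO:14": 403.6,
    "SEQ ID NO:15": 403.0,
    "SEQ ID NO:16": 402.9,
    "SEQ ID NO:17": 426.9,
    "SEQ ID NO:18": 418.4,
    "SEQ ID NO:19": 405.9,
}


def candidates_at_or_above(scores: dict[str, float], threshold: float) -> list[str]:
    """Identifiers scoring at or above the threshold, highest score first."""
    passing = [(score, name) for name, score in scores.items() if score >= threshold]
    return [name for score, name in sorted(passing, reverse=True)]


passing_350 = candidates_at_or_above(ALGAL_HMM_SCORES, 350)
below_350 = sorted(set(ALGAL_HMM_SCORES) - set(passing_350))
```

With these values, eleven of the twelve algal sequences pass a cut-off of 350 and SEQ ID NO:12 (score 292.6) does not, which is consistent with its absence from the list of polypeptides scoring 350 or higher.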
Table 2: SGI1 orthologs in plant species
Organism | Polypeptide sequence | HMM score
Sphagnum fallax | SEQ ID NO:20 | 397.3
Physcomitrella patens | SEQ ID NO:21 | 372.3
Arabidopsis thaliana | SEQ ID NO:22 | 371.1
Arabidopsis halleri | SEQ ID NO:23 | 475.9
Arabidopsis lyrata | SEQ ID NO:24 | 395.5
Sunflower (Helianthus annuus) | SEQ ID NO:25 | 391.2
Grape (Vitis vinifera) | SEQ ID NO:26 | 390.6
Amborella trichopoda | SEQ ID NO:27 | 390.1
Castor bean (Ricinus communis) | SEQ ID NO:28 | 390.1
Tomato | SEQ ID NO:29 | 388.4
Potato | SEQ ID NO:30 | 387.2
Upland cotton | SEQ ID NO:31 | 385.8
Cocoa | SEQ ID NO:32 | 383.0
Common bean (Phaseolus vulgaris) | SEQ ID NO:33 | 381.6
Soybean | SEQ ID NO:34 | 381.4
Quinoa (Chenopodium quinoa) | SEQ ID NO:35 | 373.7
Apple (Malus domestica) | SEQ ID NO:36 | 372.6
Maize | SEQ ID NO:37 | 371.5
Turnip | SEQ ID NO:38 | 370.5
Rice | SEQ ID NO:39 | 369.6
A schematic of the SGI1 gene is shown in fig. 1A.
In some embodiments, modulation (mutation, attenuation, or knockout) of an SGI1 gene, for example the SGI1 gene of an algal species, increases the maximum photochemical quantum yield of photosystem II (Fv/FM) by about 10%-14%, while the mutant exhibits a reduced antenna size (i.e., functional absorption cross-section) compared to the wild-type strain from which it was derived.
In some embodiments, modulation of the SGI1 gene may also result in a reduced antenna size (i.e., functional absorption cross-section) for photosystem II (PSII) and photosystem I (PSI) (a 40%-50% reduction relative to wild-type), a higher electron transfer rate on the acceptor side of PSII (1/τ'Qa) (an increase of about 35% to about 130% at saturating light relative to wild-type), and a higher carbon fixation rate (Pmax) (an increase of at least 30%-40% relative to wild-type in these engineered mutants), while the number of photosystems on a per total organic carbon (TOC) basis, as determined by multiple reaction monitoring protein assays, is maintained.
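For readers unfamiliar with the Fv/FM parameter mentioned above, the sketch below applies its conventional definition, Fv/FM = (FM − F0)/FM, where F0 and FM are the minimum and maximum fluorescence of dark-adapted cells, and then expresses a mutant value as a percent change relative to wild type. The fluorescence readings are hypothetical values chosen only so the result falls in the recited 10%-14% range; they are not data from this disclosure.

```python
def fv_fm(f0: float, fm: float) -> float:
    """Maximum quantum yield of PSII from dark-adapted F0 and FM."""
    if fm <= 0 or f0 < 0 or f0 > fm:
        raise ValueError("expected 0 <= F0 <= FM and FM > 0")
    return (fm - f0) / fm


def percent_change(mutant: float, wild_type: float) -> float:
    """Relative change of the mutant value versus wild type, in percent."""
    if wild_type == 0:
        raise ValueError("wild-type value must be non-zero")
    return 100.0 * (mutant - wild_type) / wild_type


# Hypothetical dark-adapted fluorescence readings (arbitrary units).
wild_type_yield = fv_fm(f0=0.40, fm=1.00)   # 0.60
mutant_yield = fv_fm(f0=0.33, fm=1.00)      # about 0.67
yield_increase = percent_change(mutant_yield, wild_type_yield)
```

The same `percent_change` helper applies to the other relative figures in this passage, such as a reduction in functional absorption cross-section (which would come out negative).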
SGI2 gene
The inventors of the present application have identified significant growth improving gene 2 (SGI2) as an ortholog, present in photosynthetic organisms (e.g., algae and plants), of a class of regulators called two-component systems (TCS). Two-component systems are known to regulate important cellular processes in bacteria, including cell cycle progression and development (Skerker et al. 2005, "Two-component signal transduction pathways regulating growth and cell cycle progression in a bacterium: a system-level analysis", PLoS Biology 3(10): e334), nitrogen sensing (Sanders et al. 1992, "Phosphorylation site of NtrC, a protein phosphatase whose covalent intermediate activates transcription", Journal of Bacteriology 174), and bacterial chemotaxis (Sanders et al. 1989, "Identification of the site of phosphorylation of the chemotaxis response regulator protein, CheY", Journal of Biological Chemistry 264(36): 21770-8). In bacteria, these systems usually consist of a histidine kinase that detects a specific environmental stimulus and a corresponding response regulator domain (PF00072) that mediates the cellular response, primarily through differential expression of target genes. In photosynthetic organisms, however, the SGI2 gene includes the response regulator domain (PF00072) but lacks the other domain of the two-component system.
A schematic of the SGI2 gene is shown in fig. 2A, and a schematic of the corresponding protein is shown in fig. 2B.
An exemplary Chlorella SGI2 gene sequence is provided as SEQ ID NO:4, which encodes a polypeptide (SEQ ID NO:5) that includes a response regulator domain (SEQ ID NO:6).
Exemplary orthologous polypeptide sequences in various photosynthetic organisms are shown in table 3 below.
Table 3: Orthologous SGI2 sequences in various photosynthetic organisms
Photosynthetic organism | Polypeptide sequence
Oocystis sp. | SEQ ID NO:40
Tetraselmis sp. | SEQ ID NO:41
Arabidopsis thaliana | SEQ ID NO:42
Arabidopsis thaliana | SEQ ID NO:43
Arabidopsis thaliana | SEQ ID NO:44
Arabidopsis thaliana | SEQ ID NO:45
Arabidopsis thaliana | SEQ ID NO:46
Soybean | SEQ ID NO:47
Grape | SEQ ID NO:48
Cocoa | SEQ ID NO:49
Rice | SEQ ID NO:50
Maize | SEQ ID NO:51
Physcomitrella patens | SEQ ID NO:52
Volvox carteri | SEQ ID NO:53
Chlamydomonas reinhardtii | SEQ ID NO:54
Chlorella sorokiniana | SEQ ID NO:55
Coccomyxa subellipsoidea C-169 | SEQ ID NO:56
An exemplary Chlorella SGI2 cDNA sequence is provided as SEQ ID NO:7. Orthologous cDNA sequences of SGI2 genes in other photosynthetic organisms are shown in Table 4 below.
Table 4: Orthologous cDNA sequences of the SGI2 gene in other photosynthetic organisms
Photosynthetic organism | cDNA sequence
Oocystis sp. | SEQ ID NO:57
Tetraselmis sp. | SEQ ID NO:58
Soybean | SEQ ID NO:59
Grape | SEQ ID NO:60
Cocoa | SEQ ID NO:61
Rice | SEQ ID NO:62
Maize | SEQ ID NO:63
Physcomitrella patens | SEQ ID NO:64
Volvox carteri | SEQ ID NO:65
Chlamydomonas reinhardtii | SEQ ID NO:66
Coccomyxa sp. | SEQ ID NO:67
In some embodiments, the SGI2 polypeptide of the photosynthetic organism comprises an amino acid sequence that is at least 35%, 40%, 45%, 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, or 95% identical to SEQ ID NO:6. In some embodiments, the SGI2 polypeptide of the photosynthetic organism comprises an amino acid sequence that is at least 35%, 40%, 45%, 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, or 95% identical over at least 100, 150, 200, or 250 amino acids, or over the full length, of SEQ ID NO:5, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, or 56.
In some embodiments, the photosynthetic organism comprises a polynucleotide encoding an SGI2 polypeptide, wherein the nucleic acid sequence of the polynucleotide is at least 35%, 40%, 45%, 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, or 95% identical over at least 100, 150, 200, or 250 nucleotides, or over the full length, of SEQ ID NO:4, 7, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, or 67.
In some embodiments, modulation (mutation, attenuation, or knockout) of an SGI2 gene, for example the SGI2 gene of a photosynthetic organism (e.g., an algal species), results in an increased maximum photochemical quantum yield of photosystem II (Fv/FM) (by about 10%-14%), reduced chlorophyll per total organic carbon (TOC), and increased biomass.
SRP54 gene
Modulation of the SRP54 gene has been described in U.S. patent application publication 2016/0304896, which is incorporated herein by reference in its entirety. An exemplary Chlorella chloroplast SRP54 (cpSRP54) cDNA sequence is provided as SEQ ID NO:8, which encodes a polypeptide having SEQ ID NO:68.
Other non-limiting exemplary cpSRP54 orthologous polypeptides include those with the following GenBank accession numbers: EDP00260 from Chlamydomonas reinhardtii (SEQ ID NO:75); EEH59526 from Micromonas pusilla (SEQ ID NO:76); EEH59526 from Micromonas sp. (SEQ ID NO:77); ACB42577 from Paulinella chromatophora (SEQ ID NO:78); ABO94038 from Ostreococcus lucimarinus (SEQ ID NO:79); Q01H03 from Ostreococcus tauri (SEQ ID NO:80); EFJ41797 from Volvox carteri (SEQ ID NO:81); EEC48599 from Phaeodactylum tricornutum (SEQ ID NO:82); EED94755 from Thalassiosira pseudonana (SEQ ID NO:83); EGB12501 from Aureococcus anophagefferens (SEQ ID NO:84); and CBN76263 from Ectocarpus siliculosus (SEQ ID NO:85).
In some embodiments, the cpSRP54 gene of the photosynthetic organism encodes a polypeptide having at least 55%, at least 60%, at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, or at least 95% sequence identity to a cpSRP54 polypeptide disclosed above.
Modulation of SGI2, the combination of SGI1 and SRP54, the combination of SGI2 and SRP54, or the combination of SGI1, SGI2, and SRP54 genes of photosynthetic organisms
Modulation of the SGI2 gene, the combination of the SGI1 and SRP54 genes, the combination of the SGI2 and SRP54 genes, or the combination of the SGI1, SGI2, and SRP54 genes of a photosynthetic organism produces a mutant photosynthetic organism. The SGI1, SGI2, and SRP54 genes can be modulated by UV mutagenesis, gamma irradiation, or genetic engineering techniques. The gene sequence may be altered or may be partially or completely deleted, and the expression of the gene may be altered.
In some embodiments, the SGI1, SGI2, and/or SRP54 genes can be operably linked to algal promoter and terminator sequences as described in U.S. application publication 2017/0058303, which is incorporated herein by reference in its entirety.
In some embodiments, the mutant photosynthetic organism (e.g., a plant or an alga) has at least a 20% reduction, at least a 30% reduction, at least a 40% reduction, at least a 50% reduction, at least a 55% reduction, at least a 60% reduction, at least a 65% reduction, or at least a 70% reduction in total chlorophyll relative to a control cell, optionally further wherein the mutant has an increased ratio of chlorophyll a to chlorophyll b relative to a control cell, further optionally wherein the ratio of chlorophyll a to chlorophyll b is at least about 2.8:1, at least about 3:1, at least about 3.2:1, at least about 3.3:1, at least about 3.5:1, at least about 3.7:1, at least about 3.9:1, at least about 4:1, or at least about 4.3:1.
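The two chlorophyll criteria above (percent reduction in total chlorophyll and the chlorophyll a to chlorophyll b ratio) reduce to simple arithmetic, sketched below. The pigment values are hypothetical, given in arbitrary but consistent units, and are not measurements from this disclosure.

```python
def total_chlorophyll_reduction(mutant_total: float, control_total: float) -> float:
    """Percent reduction in total chlorophyll of the mutant versus the control."""
    if control_total <= 0:
        raise ValueError("control chlorophyll must be positive")
    return 100.0 * (control_total - mutant_total) / control_total


def chl_a_to_b_ratio(chl_a: float, chl_b: float) -> float:
    """Chlorophyll a : chlorophyll b ratio, returned as a single number (a/b)."""
    if chl_b <= 0:
        raise ValueError("chlorophyll b must be positive")
    return chl_a / chl_b


# Hypothetical pigment content (same units for mutant and control).
control = {"chl_a": 3.0, "chl_b": 1.2}
mutant = {"chl_a": 1.6, "chl_b": 0.4}

reduction = total_chlorophyll_reduction(
    mutant["chl_a"] + mutant["chl_b"],
    control["chl_a"] + control["chl_b"],
)
mutant_ratio = chl_a_to_b_ratio(mutant["chl_a"], mutant["chl_b"])
control_ratio = chl_a_to_b_ratio(control["chl_a"], control["chl_b"])
```

With these hypothetical numbers the mutant would satisfy "at least a 50% reduction" in total chlorophyll and a chlorophyll a to chlorophyll b ratio of "at least about 3.9:1", while the control ratio is lower.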
In some embodiments, the mutant photosynthetic organism (e.g., a plant or an alga) exhibits: (a) higher qP relative to a control photosynthetic organism of the same species at all irradiances between about 100 and about 2800, between about 150 and about 2800, between about 75 and about 2800, between about 40 and about 2800, or between about 10 and about 2800 μmol photons m⁻² sec⁻¹;
(b) lower NPQ relative to control algae at all irradiances between about 100 and about 2800, between about 150 and about 2800, between about 75 and about 2800, between about 40 and about 2800, or between about 10 and about 2800 μmol photons m⁻² sec⁻¹;
(c) higher Y(II) relative to a control photosynthetic organism (e.g., algae) at all irradiances between about 100 and about 2800, between about 150 and about 2800, between about 75 and about 2800, between about 40 and about 2800, or between about 10 and about 2800 μmol photons m⁻² sec⁻¹;
(d) higher Fv/FM relative to control algae at irradiances between about 100 and about 2800, between about 150 and about 2800, between about 75 and about 2800, between about 40 and about 2800, or between about 10 and about 2800 μmol photons m⁻² sec⁻¹;
(e) higher ETR(II) relative to control algae at irradiances between about 250 and about 2800, between about 150 and about 2800, between about 75 and about 2800, between about 40 and about 2800, or between about 10 and about 2800 μmol photons m⁻² sec⁻¹;
(f) an increase in oxygen evolution on a per chlorophyll basis of at least 50%, at least 100%, at least 200%, at least 300%, at least 350%, or at least 400% relative to control algae; and
(g) an increase in carbon fixation on a per chlorophyll basis of at least 50%, at least 60%, at least 70%, at least 80%, at least 90%, or at least 100% relative to a control photosynthetic organism of the same species.
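The fluorescence parameters named in items (a) to (e) are conventionally derived from pulse-amplitude-modulated fluorometry readings. The sketch below uses the commonly cited definitions, Y(II) = (FM' − F)/FM', NPQ = (FM − FM')/FM', and qP = (FM' − F)/(FM' − F0'), and estimates ETR(II) as Y(II) × irradiance × absorptance × PSII fraction. The default absorptance of 0.84 and PSII fraction of 0.5 are conventional instrument defaults assumed here, not values from this disclosure, and the readings are hypothetical.

```python
def y_ii(fm_prime: float, f_steady: float) -> float:
    """Effective PSII quantum yield in the light: (FM' - F) / FM'."""
    return (fm_prime - f_steady) / fm_prime


def npq(fm: float, fm_prime: float) -> float:
    """Non-photochemical quenching: (FM - FM') / FM'."""
    return (fm - fm_prime) / fm_prime


def q_p(fm_prime: float, f_steady: float, f0_prime: float) -> float:
    """Photochemical quenching coefficient: (FM' - F) / (FM' - F0')."""
    return (fm_prime - f_steady) / (fm_prime - f0_prime)


def etr_ii(yield_ii: float, irradiance: float,
           absorptance: float = 0.84, psii_fraction: float = 0.5) -> float:
    """Relative PSII electron transport rate.

    irradiance is in umol photons m^-2 sec^-1; absorptance and psii_fraction
    are assumed conventional defaults and should be measured where possible.
    """
    return yield_ii * irradiance * absorptance * psii_fraction


# Hypothetical readings (arbitrary fluorescence units) at 1000 umol photons m^-2 sec^-1.
readings = {"fm": 1.0, "fm_prime": 0.5, "f_steady": 0.3, "f0_prime": 0.2}
yield_light = y_ii(readings["fm_prime"], readings["f_steady"])
quenching_npq = npq(readings["fm"], readings["fm_prime"])
quenching_qp = q_p(readings["fm_prime"], readings["f_steady"], readings["f0_prime"])
electron_transport = etr_ii(yield_light, irradiance=1000.0)
```

Comparing a mutant with a control under items (a) to (e) would mean computing these quantities for both strains at each irradiance in the recited range and checking the direction of the difference.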
In some embodiments, the mutant photosynthetic organism exhibits a biomass productivity that is at least 5%, at least 6%, at least 8%, or at least 10%, at least 15%, at least 25%, or at least 30% higher than a control algae cultured under the same conditions.
In some embodiments, the mutant photosynthetic organisms (e.g., plants, algae) exhibit greater productivity relative to control algae in a diurnal cycle culture with variable light intensity mimicking natural sunlight, optionally wherein the light intensity peaks at between about 1900 and about 2000 μmol photons m⁻² sec⁻¹.
In some embodiments, the mutant photosynthetic organism (e.g., a plant or an algae) has a greater lipid productivity, e.g., at least 5%, at least 10%, at least 15%, at least 20%, or at least 25% greater lipid productivity, relative to a control photosynthetic organism of the same species that does not have the one or more altered or attenuated genes.
Definitions
Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this invention belongs. In case of conflict, the present application, including definitions, will control. Unless the context requires otherwise, singular terms shall include the plural and plural terms shall include the singular. All publications, patents, and other references cited herein are incorporated by reference in their entirety for all purposes as if each individual publication or patent application were specifically and individually indicated to be incorporated by reference.
As used in this disclosure and in the claims, the singular forms "a", "an" and "the" also include the plural forms unless the context clearly dictates otherwise.
All ranges provided within this application include values at the upper and lower ends of the range.
As used herein, the term "and/or" as used in phrases such as "A and/or B" is intended to include "A and B", "A or B", "A" (alone), and "B" (alone).
The term "gene" is used broadly to refer to any segment of a nucleic acid molecule (typically DNA, but optionally RNA) that encodes a polypeptide or expressed RNA. Thus, a gene comprises a sequence that encodes an expressed RNA (which may comprise a polypeptide coding sequence or, for example, a functional RNA, such as a ribosomal RNA, tRNA, antisense RNA, microRNA, short hairpin RNA, ribozyme, etc.). A gene may further include regulatory sequences required for or affecting its expression, as well as sequences associated with the protein or RNA coding sequence in its native state, such as intron sequences, 5' or 3' untranslated sequences, and the like. In some examples, a "gene" may refer to only the protein-coding portion of a DNA or RNA molecule, which may or may not contain introns. A gene is preferably greater than 50 nucleotides in length, more preferably greater than 100 nucleotides in length, and may be, for example, between 50 and 500,000 nucleotides in length, such as between 100 and 100,000 nucleotides in length, between about 200 and about 50,000 nucleotides in length, or between about 200 and about 20,000 nucleotides in length. Genes can be obtained from a variety of sources, including cloning from a source of interest or synthesis from known or predicted sequence information.
The term "nucleic acid" or "nucleic acid molecule" refers to a segment of DNA or RNA (e.g., mRNA) and also encompasses nucleic acids having a modified backbone (e.g., peptide nucleic acids, locked nucleic acids) or modified or non-naturally occurring nucleobases. The nucleic acid molecule may be double-stranded, partially double-stranded or single-stranded; the single-stranded nucleic acid comprising the gene or portion thereof may be the coding (sense) strand or the non-coding (antisense) strand.
A nucleic acid molecule may be "derived from" the indicated source, comprising isolation (all or part) of the nucleic acid segment from the indicated source. Nucleic acid molecules can also be derived from the indicated source by, for example, direct cloning, PCR amplification or artificial synthesis from the indicated polynucleotide source or based on sequences related to the indicated polynucleotide source. Genes or nucleic acid molecules derived from a particular source or species also include genes or nucleic acid molecules having sequence modifications relative to the source nucleic acid molecule. For example, a gene or nucleic acid molecule derived from a source (e.g., a particular reference gene) may comprise one or more mutations relative to the source gene or nucleic acid molecule that are unintended or intentionally introduced, and if one or more mutations (including substitutions, deletions, or insertions) are intentionally introduced, these sequence alterations may be introduced by random or targeted mutagenesis of the cell or nucleic acid, by amplification or other gene synthesis or molecular biology techniques, or by chemical synthesis, or any combination thereof. A gene or nucleic acid molecule derived from a reference gene or nucleic acid molecule encoding a functional RNA or polypeptide may encode a functional RNA or polypeptide having at least 75%, at least 80%, at least 85%, at least 90% or at least 95% sequence identity to the reference or source functional RNA or polypeptide or to a functional fragment thereof. For example, a gene or nucleic acid molecule derived from a reference gene or nucleic acid molecule encoding a functional RNA or polypeptide may encode a functional RNA or polypeptide having at least 85%, at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% sequence identity to a reference or source functional RNA or polypeptide, or to a functional fragment thereof.
As used herein, an "isolated" nucleic acid or protein is removed from its natural milieu or from the context in which the nucleic acid or protein exists in nature. For example, an isolated protein or nucleic acid molecule is removed from the cell or organism with which it is associated in its native or natural environment. In some examples, an isolated nucleic acid or protein may be partially or substantially purified, but isolation does not require a particular level of purification. Thus, for example, an isolated nucleic acid molecule can be a nucleic acid sequence that has been excised from the chromosome, genome, or episome into which it is integrated in nature.
A "purified" nucleic acid molecule or nucleotide sequence, or protein or polypeptide sequence, is substantially free of cellular material and cellular components. The purified nucleic acid molecule or protein may be substantially free of chemicals other than, for example, buffers or solvents. "Substantially free" is not intended to mean that components other than the novel nucleic acid molecules are undetectable.
The terms "naturally occurring" and "wild-type" refer to the form found in nature. For example, a naturally occurring or wild-type nucleic acid molecule, nucleotide sequence, or protein may be present in material isolated from a natural source and not intentionally modified by human manipulation.
As used herein, "attenuated" means reduced in amount, degree, intensity, or strength. Attenuated gene expression may refer to a significantly reduced amount and/or rate of transcription of the gene in question, or of translation, folding, or assembly of the encoded protein. By way of non-limiting example, an attenuated gene may be a mutated or disrupted gene (e.g., a gene disrupted by partial or complete deletion, truncation, frameshift, or insertion mutation), a gene having reduced expression due to alteration or disruption of its regulatory sequences, or a gene targeted by a construct (e.g., an antisense RNA, microRNA, RNAi molecule, or ribozyme) that reduces gene expression.
"exogenous nucleic acid molecule" or "exogenous gene" refers to a nucleic acid molecule or gene that has been introduced ("transformed") into a cell. The transformed cell may be referred to as a recombinant cell, wherein one or more additional exogenous genes may be introduced. A cell transformed with a nucleic acid molecule is also referred to as "transformed" if its progeny have inherited the exogenous nucleic acid molecule. The exogenous gene may be from a different species (and thus "heterologous") or from the same species (and thus "homologous") relative to the cell being transformed. An "endogenous" nucleic acid molecule, gene, or protein is a native nucleic acid molecule, gene, or protein, as it is present in or naturally produced by the host.
The term "native" as used herein is used to refer to a nucleic acid sequence or amino acid sequence, as it occurs naturally in a host. The term "non-natural" as used herein is used to refer to a nucleic acid sequence or amino acid sequence that does not naturally occur in a host. Nucleic acid sequences or amino acid sequences that have been removed from a cell, subjected to laboratory procedures, and introduced or reintroduced into a host cell are considered "non-native". Synthetic or partially synthetic genes introduced into a host cell are "non-natural". The non-native gene further comprises a gene endogenous to the host microorganism operably linked to one or more heterologous regulatory sequences that have been recombined into the host genome.
A "recombinant" or "engineered" nucleic acid molecule is one that has been altered by human manipulation. As non-limiting examples, a recombinant nucleic acid molecule includes any nucleic acid molecule that: 1) has been partially or completely synthesized or modified in vitro, for example using chemical or enzymatic techniques (e.g., by chemical nucleic acid synthesis, or by the use of enzymes for replication, polymerization, digestion (exonucleolytic or endonucleolytic), ligation, reverse transcription, transcription, base modification (including, for example, methylation), integration, or recombination (including homologous and site-specific recombination) of nucleic acid molecules); 2) includes conjoined nucleotide sequences that are not conjoined in nature; 3) has been engineered using molecular cloning techniques such that it lacks one or more nucleotides relative to the naturally occurring nucleic acid molecule sequence; and/or 4) has been manipulated using molecular cloning techniques such that it has one or more sequence changes or rearrangements relative to the naturally occurring nucleic acid sequence. By way of non-limiting example, a cDNA is a recombinant DNA molecule, as is any nucleic acid molecule that has been generated by one or more in vitro polymerase reactions, or to which linkers have been attached, or that has been integrated into a vector (e.g., a cloning vector or an expression vector).
As used herein, the term "recombinant protein" refers to a protein produced by genetic engineering.
The terms recombinant, engineered, or genetically engineered, when applied to an organism, refer to an organism that has been manipulated by introducing a heterologous or exogenous recombinant nucleic acid sequence into the organism, and include gene knock-outs, targeted mutations, gene substitutions and promoter substitutions, deletions or insertions, as well as the introduction of transgenes or synthetic genes into the organism. The recombinant or genetically engineered organism may also be an organism into which a construct for gene "knock-down" has been introduced. Such constructs include, but are not limited to, RNAi, microRNA, shRNA, siRNA, antisense, and ribozyme constructs. Also included are organisms whose genome has been altered by the activity of a meganuclease, zinc finger nuclease, TALEN, or Cas/CRISPR system. The exogenous or recombinant nucleic acid molecule may be integrated into the genome of the recombinant/genetically engineered organism or, in other examples, may not be integrated into the host genome. As used herein, a "recombinant microorganism" or "recombinant host cell" includes progeny or derivatives of the recombinant microorganism of the present invention. Because certain modifications may occur in succeeding generations due to either mutation or environmental influences, such progeny or derivatives may not, in fact, be identical to the parent cell, but are still included within the scope of the term as used herein.
The term "promoter" refers to a nucleic acid sequence capable of binding RNA polymerase in a cell and initiating transcription of a downstream (3' direction) coding sequence. A promoter contains the minimum number of bases or elements necessary to initiate transcription at a level detectable above background. A promoter may comprise a transcription initiation site as well as protein binding domains (consensus sequences) responsible for binding RNA polymerase. Eukaryotic promoters typically, but not always, contain "TATA" and "CAT" boxes. Prokaryotic promoters may contain -10 and -35 prokaryotic promoter consensus sequences. A large number of promoters, including constitutive, inducible, and repressible promoters, from a variety of different sources are well known in the art. Representative sources include, for example, algal, viral, mammalian, insect, plant, yeast, and bacterial cell types, and suitable promoters from these sources are readily available, can be prepared synthetically based on sequences publicly available online, or can be obtained, for example, from depositories (e.g., ATCC) and other commercial or individual sources. Promoters may be unidirectional (initiating transcription in one direction) or bidirectional (initiating transcription in either direction). The promoter may be a constitutive promoter, a repressible promoter, or an inducible promoter. In addition to the proximal promoter of a gene, to which RNA polymerase binds to initiate transcription, the promoter region may comprise additional sequences upstream of the gene, which may be within 1 kb, 2 kb, 3 kb, 4 kb, 5 kb, or more of the transcription start site of the gene, wherein the additional sequences may affect the transcription rate of the downstream gene and optionally affect the responsiveness of the promoter to developmental, environmental, or biochemical (e.g., metabolic) conditions.
The term "heterologous", when used in reference to a polynucleotide, gene, nucleic acid, polypeptide, or enzyme, refers to a polynucleotide, gene, nucleic acid, polypeptide, or enzyme that is derived from a source other than the host species. In contrast, a "homologous" polynucleotide, gene, nucleic acid, polypeptide, or enzyme is used herein to refer to a polynucleotide, gene, nucleic acid, polypeptide, or enzyme derived from the host species. When referring to a gene regulatory sequence or to an auxiliary nucleic acid sequence used for maintaining or manipulating a gene sequence (e.g., a promoter, 5' untranslated region, 3' untranslated region, poly-A addition sequence, intron sequence, splice site, ribosome binding site, internal ribosome entry sequence, genomic homology region, recombination site, etc.), "heterologous" means that the regulatory sequence or auxiliary sequence is not naturally associated with the gene with which it is juxtaposed in a construct, genome, chromosome, or episome. Thus, a promoter operably linked to a gene to which it is not operably linked in its native state (i.e., in the genome of a non-genetically engineered organism) is referred to herein as a "heterologous promoter," even though the promoter may be derived from the same species (or in some cases, the same organism) as the gene to which it is linked.
As used herein, the term "protein" or "polypeptide" is intended to encompass both the singular "polypeptide" and the plural "polypeptides", and refers to a molecule composed of monomers (amino acids) linearly linked by amide bonds (also known as peptide bonds). The term "polypeptide" refers to any chain or chains of two or more amino acids and does not refer to a particular length of the product. Thus, a peptide, dipeptide, tripeptide, oligopeptide, "protein," "amino acid chain," or any other term used to refer to a chain or chains of two or more amino acids is included within the definition of "polypeptide," and the term "polypeptide" may be used instead of, or interchangeably with, any of these terms.
Gene and protein accession numbers (usually provided in parentheses after the gene or species name) are unique identifiers of sequence records that are publicly available at the National Center for Biotechnology Information (NCBI) website (NCBI. The "GenInfo identifier" (GI) sequence identification number is specific to a nucleotide or amino acid sequence. If the sequence changes in any way, a new GI number is assigned. Sequence revision history tools are available to track the various GI numbers, version numbers, and update dates of sequences appearing in a particular GenBank record. The search and acquisition of nucleic acid or gene sequences or protein sequences based on accession numbers and GI numbers is well known in the fields of, for example, cell biology, biochemistry, molecular biology, and molecular genetics.
As used herein, the term "percent identity" or "homology" with respect to a nucleic acid or polypeptide sequence is defined as the percentage of nucleotide or amino acid residues in a candidate sequence that are identical to those of a known sequence after the sequences are aligned for maximum percent identity and gaps are introduced, if necessary, to achieve the maximum percent homology. N-terminal or C-terminal insertions or deletions should not be construed as affecting homology, and internal deletions and/or insertions of less than about 30, less than about 20, or less than about 10 amino acid residues in a polypeptide sequence should not be construed as affecting homology. Homology or identity at the nucleotide or amino acid sequence level can be determined by BLAST (Basic Local Alignment Search Tool) analysis using the algorithms employed by the programs blastp, blastn, blastx, tblastn, and tblastx (Altschul (1997), Nucleic Acids Res. 25, 3389-). The BLAST program uses a method that first considers similar segments, with and without gaps, between the query sequence and a database sequence, then evaluates the statistical significance of all matches identified, and finally summarizes only those matches that meet a pre-selected significance threshold. For a discussion of the basic problems in sequence database similarity searches, see Altschul (1994), Nature Genetics 6, 119-129. The search parameters histogram, descriptions, alignments, expect (i.e., the statistical significance threshold for reporting matches against database sequences), cutoff, matrix, and filter (low complexity) may be at their default settings. The default scoring matrix used by blastp, blastx, tblastn, and tblastx is the BLOSUM62 matrix (Henikoff (1992), Proc. Natl. Acad. Sci. USA 89, 10915-10919), which is recommended for query sequences (nucleotide bases or amino acids) of length greater than 85.
For blastn, which is designed to compare nucleotide sequences, the scoring matrix is set by the ratio of M (i.e., the reward score for a pair of matching residues) to N (i.e., the penalty score for mismatching residues), where M and N may have default values of +5 and -4, respectively. The four blastn parameters can be adjusted as follows: Q=10 (gap creation penalty); R=10 (gap extension penalty); wink=1 (a word hit is generated at every wink-th position along the query); and gapw=16 (sets the window width within which gapped alignments are generated). The equivalent blastp parameter settings for amino acid sequence comparisons may be: Q=9; R=2; wink=1; and gapw=32. A Bestfit comparison between sequences, available in the GCG software package version 10.0, may use the DNA parameters GAP=50 (gap creation penalty) and LEN=3 (gap extension penalty), and the equivalent settings in protein comparisons may be GAP=8 and LEN=2.
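The identity and scoring definitions above can be illustrated with a minimal Python sketch. It is not the BLAST algorithm: it assumes two sequences that have already been aligned (gaps written as "-"), it ignores gap columns when counting identity (an assumption made here for simplicity), and the function names are hypothetical. Only the match reward M=+5 and mismatch penalty N=-4 come from the text.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percentage of compared (non-gap) alignment columns with identical residues."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    identical = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == "-" or b == "-":
            continue  # assumption: gap columns are not counted
        compared += 1
        if a == b:
            identical += 1
    return 100.0 * identical / compared if compared else 0.0


def match_mismatch_score(aligned_a: str, aligned_b: str,
                         reward: int = 5, penalty: int = -4) -> int:
    """Sum of M (reward) for matches and N (penalty) for mismatches; gaps are skipped."""
    score = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == "-" or b == "-":
            continue  # gap penalties (Q, R) are deliberately not modeled here
        score += reward if a == b else penalty
    return score


if __name__ == "__main__":
    a = "ACGT-ACGT"
    b = "ACGTTACGA"
    print(percent_identity(a, b))      # 7 of 8 compared columns match
    print(match_mismatch_score(a, b))  # 7 matches and 1 mismatch
```

A percent identity computed this way could then be compared against whichever threshold is of interest (for example, at least 80%).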
Thus, when referring to a polypeptide or nucleic acid sequence of the invention, included are sequences having at least 40%, at least 45%, at least 50%, at least 55%, at least 60%, at least 65%, at least 70%, at least 75%, at least 80%, or at least 85%, such as at least 86%, at least 87%, at least 88%, at least 89%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99%, or about 100% sequence identity to the full-length polypeptide or nucleic acid sequence, or to a fragment thereof comprising a contiguous sequence of at least 50, at least 75, at least 100, at least 125, at least 150, or more amino acid residues of the entire protein; and variants of such sequences, for example, wherein at least one amino acid residue has been inserted at the N- and/or C-terminus of, and/or within, the disclosed sequence or sequences, which contain the insertion and substitution. Contemplated variants may additionally or alternatively include those containing predetermined mutations made, for example, by homologous recombination or site-directed or PCR mutagenesis; the corresponding polypeptides or nucleic acids of other species, including but not limited to those described herein; alleles or other naturally occurring variants of the family of polypeptides or nucleic acids that contain an insertion and substitution; and/or derivatives in which the polypeptide has been covalently modified by substitution, chemical, enzymatic, or other appropriate means with a moiety other than a naturally occurring amino acid that contains the insertion and substitution (e.g., a detectable moiety such as an enzyme).
As used herein, the phrase "conservative amino acid substitution" or "conservative mutation" refers to the replacement of one amino acid by another amino acid with a common property. A functional way to define common properties between individual amino acids is to analyze the normalized frequencies of amino acid changes between corresponding proteins of homologous organisms (Schulz (1979), Principles of Protein Structure, Springer-Verlag). From such analyses, groups of amino acids can be defined, wherein the amino acids within a group exchange preferentially with each other and are therefore most similar to each other in their effect on the overall protein structure (Schulz (1979), Principles of Protein Structure, Springer-Verlag). Examples of amino acid groups defined in this way may include: a "charged/polar group" comprising Glu, Asp, Asn, Gln, Lys, Arg, and His; an "aromatic or cyclic group" comprising Pro, Phe, Tyr, and Trp; and an "aliphatic group" comprising Gly, Ala, Val, Leu, Ile, Met, Ser, Thr, and Cys. Within each group, subgroups can also be identified. For example, the group of charged/polar amino acids can be subdivided into subgroups comprising: a "positively charged subgroup" comprising Lys, Arg, and His; a "negatively charged subgroup" comprising Glu and Asp; and a "polar subgroup" comprising Asn and Gln. In another example, the aromatic or cyclic group may be subdivided into subgroups comprising: a "nitrogen ring subgroup" comprising Pro, His, and Trp; and a "phenyl subgroup" comprising Phe and Tyr. In a further example, the aliphatic group may be subdivided into subgroups comprising: a "large aliphatic nonpolar subgroup" comprising Val, Leu, and Ile; an "aliphatic slightly polar subgroup" comprising Met, Ser, Thr, and Cys; and a "small residue subgroup" comprising Gly and Ala.
Examples of conservative mutations include substitutions of amino acids within the above subgroups, such as, but not limited to: Lys for Arg, or vice versa, so that a positive charge can be retained; Glu for Asp, or vice versa, so that a negative charge can be retained; Ser for Thr, or vice versa, so that a free -OH can be retained; and Gln for Asn, or vice versa, so that a free -NH2 can be retained. A "conservative variant" is a polypeptide in which one or more amino acids of a reference polypeptide (e.g., a polypeptide whose sequence has been disclosed in a publication or sequence database, or whose sequence has been determined by nucleic acid sequencing) have been replaced by amino acids having a common property (e.g., belonging to the same amino acid group or subgroup as delineated above).
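As an illustration of how the residue groupings listed above could be applied, the following Python sketch treats a substitution as conservative when both residues fall in at least one common grouping. The dictionary keys are descriptive labels chosen for this example, the function names are hypothetical, and the rule itself is one simple reading of the definition rather than a prescribed test.

```python
# Residue groupings taken from the text; three-letter codes.
SUBGROUPS = {
    "positively charged": {"Lys", "Arg", "His"},
    "negatively charged": {"Glu", "Asp"},
    "polar": {"Asn", "Gln"},
    "nitrogen ring": {"Pro", "His", "Trp"},
    "phenyl": {"Phe", "Tyr"},
    "large aliphatic nonpolar": {"Val", "Leu", "Ile"},
    "aliphatic slightly polar": {"Met", "Ser", "Thr", "Cys"},
    "small residue": {"Gly", "Ala"},
}


def shared_subgroups(res_a: str, res_b: str) -> list:
    """Names of every grouping that contains both residues."""
    return sorted(name for name, members in SUBGROUPS.items()
                  if res_a in members and res_b in members)


def is_conservative(res_a: str, res_b: str) -> bool:
    """True if the two residues differ but share at least one grouping."""
    return res_a != res_b and bool(shared_subgroups(res_a, res_b))


if __name__ == "__main__":
    for pair in [("Lys", "Arg"), ("Glu", "Asp"), ("Ser", "Thr"), ("Lys", "Glu")]:
        print(pair, is_conservative(*pair))
```

Note that His appears in two groupings in the text, so it can pair conservatively with Lys or Arg as well as with Pro or Trp under this reading.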
As used herein, the term "modulating" or "modulation" of a gene refers to altering the nucleic acid sequence of the gene, deleting the gene either completely or partially, disrupting the gene, altering the expression of the gene, inhibiting the expression of the gene, or silencing the expression of the gene. In some embodiments, the sequence of the gene is altered by insertion of one or more nucleotides, deletion of one or more nucleotides, or substitution of one or more nucleotides. The sequence change can be achieved by UV irradiation, gamma irradiation, or genetic engineering.
As used herein, "attenuating gene expression" means reducing or eliminating expression of a gene in any manner that reduces production of a fully functional protein.
As used herein, "expression" includes gene expression at least at the level of RNA production, and "expression product" includes the resulting product, e.g., a polypeptide or functional RNA (e.g., ribosomal RNA, tRNA, antisense RNA, microRNA, shRNA, ribozyme, etc.). The term "increased expression" encompasses alterations in gene expression that facilitate increased mRNA production and/or increased polypeptide expression. "Increased production" includes an increase in the amount of expression of a polypeptide, the level of enzymatic activity of a polypeptide, or a combination of both, as compared to the native production or enzymatic activity of the polypeptide.
Some aspects of the invention comprise partial, substantial, or complete deletion, silencing, inactivation, or downregulation of expression of particular polynucleotide sequences. These genes may be partially, substantially, or completely deleted, silenced, or inactivated, or their expression may be down-regulated, in order to affect the activity performed by the polypeptide they encode, such as the activity of an enzyme. A gene may be partially, substantially, or completely deleted, silenced, inactivated, or down-regulated by inserting a nucleic acid sequence that disrupts the function and/or expression of the gene (e.g., by viral insertion, transposon mutagenesis, meganuclease engineering, homologous recombination, or other methods known in the art). The terms "eliminate", "elimination", and "knock-out" may be used interchangeably with the terms "deletion", "partial deletion", "substantial deletion", or "complete deletion". In certain embodiments, a microorganism of interest can be engineered by site-directed homologous recombination to knock out a particular gene of interest. In still other embodiments, RNAi or antisense DNA (asDNA) constructs may be used to partially, substantially, or completely silence, inactivate, or down-regulate a particular gene of interest.
These insertions, deletions or other modifications of certain nucleic acid molecules or specific polynucleotide sequences may be understood as encompassing "one or more genetic modifications" or "one or more transformations" such that the resulting strains of these microorganisms or host cells may be understood as "genetically modified", "genetically engineered" or "transformed".
As used herein, "up-regulated" or "up-regulation" comprises an increase in the expression or enzymatic activity of a gene or nucleic acid molecule of interest, e.g., an increase in gene expression or enzymatic activity as compared to the expression or activity in an otherwise identical gene or enzyme that has not been up-regulated.
As used herein, "down-regulated" or "down-regulation" comprises a reduction in the expression or enzymatic activity of a gene or nucleic acid molecule of interest, e.g., a reduction in gene expression or enzymatic activity as compared to the expression or activity in an otherwise identical gene or enzyme that has not been down-regulated.
As used herein, a "mutant" refers to an organism that does not occur naturally and has a mutation in a gene that occurs as a result of classical mutagenesis (e.g., using gamma irradiation, UV, or chemical mutagens). As used herein, "mutant" also refers to a recombinant cell having an altered gene structure or expression due to genetic engineering, which may include, by way of non-limiting example, overexpression, including expression of genes under different temporal, biological, or environmental regulation and/or expression of genes to a different extent than naturally occurring and/or expression of genes that are not naturally expressed in the recombinant cell; homologous recombination, including knock-out and knock-in (e.g., gene replacement with a gene encoding a polypeptide having higher or lower activity than the wild-type polypeptide and/or a dominant-negative polypeptide); gene attenuation by RNAi, antisense RNA, ribozyme, or the like; and genome engineering using meganucleases, TALENs, and/or CRISPR techniques, among others. Mutant organisms of interest typically have a phenotype that is different from the phenotype of the corresponding wild-type or progenitor strain lacking the mutation, wherein the phenotype can be assessed by growth assays, product analysis, photosynthetic properties, biochemical assays, and the like. When referring to a gene "mutant", it is meant that the gene has at least one base (nucleotide) alteration, deletion or insertion relative to the natural or wild-type gene. The mutation (alteration, deletion and/or insertion of one or more nucleotides) may be in the coding region of the gene, or may be in an intron, 3'UTR, 5' UTR or promoter region, for example within 2kb of the transcription start site or within 3kb of the translation start site. 
As a non-limiting example, the mutant gene may be a gene having an insertion within the promoter region that can increase or decrease gene expression; may be a gene with a deletion resulting in the production of a non-functional protein, a truncated protein, a dominant negative protein, or no protein; may be a gene having one or more point mutations that result in amino acid changes in the encoded protein or in aberrant splicing of gene transcripts, etc.
The term "Pfam" refers to a large collection of protein domains and protein families maintained by the Pfam Consortium and available at several sponsored sites, including those of: the Wellcome Trust Sanger Institute; pfam.sbc.su.se (Stockholm Bioinformatics Center); Janelia Farm, Howard Hughes Medical Institute; and the Institut national de la Recherche Agronomique. The latest version of Pfam is Pfam 27.0 (March 2013), based on UniProt protein database release 2012_06. Pfam domains and families are identified using multiple sequence alignments and hidden Markov models (HMMs) (Nucleic Acids Research 26, 320-322; Bateman (2000), Nucleic Acids Research 26, 263-266; Bateman (2004), Nucleic Acids Research 32, Database Issue, D138-D141; Finn (2006), Nucleic Acids Research 34, Database Issue, D247-251; Finn (2010), Nucleic Acids Research 38, Database Issue, D211-222). By accessing the Pfam database, e.g., using any of the above-described websites, protein sequences can be queried against the HMMs using HMMER homology search software (e.g., HMMER2, HMMER3, or a higher version). Significant matches that identify a queried protein as being in a Pfam family (or as having a particular Pfam domain) are those in which the bit score is greater than or equal to the gathering threshold for the Pfam domain. The expectation value (e-value) can also be used as a criterion for including a queried protein in a Pfam family or for determining whether a queried protein has a particular Pfam domain, where a low e-value (much less than 1.0, e.g., less than 0.1, or less than or equal to 0.01) indicates a low probability that the match is due to chance.
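The two acceptance criteria described above can be written as simple predicates. The sketch below is illustrative only: it does not call HMMER or the Pfam database, and it assumes the bit score, the domain's threshold, and the e-value have already been obtained from a search report. The function names and the default cutoff of 0.01 are choices made for this example.

```python
def passes_bit_score(bit_score: float, threshold: float) -> bool:
    """Significant match: bit score is greater than or equal to the domain's threshold."""
    return bit_score >= threshold


def passes_e_value(e_value: float, max_e_value: float = 0.01) -> bool:
    """Alternative criterion: a low e-value means the match is unlikely to be due to chance."""
    return e_value <= max_e_value


if __name__ == "__main__":
    # Hypothetical numbers for a single query-versus-domain hit.
    hit = {"bit_score": 42.7, "threshold": 25.0, "e_value": 3e-9}
    print(passes_bit_score(hit["bit_score"], hit["threshold"]))
    print(passes_e_value(hit["e_value"]))
```

Keeping the two tests separate leaves it to the caller to decide whether to require one criterion or both.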
As used herein, the term "photosynthetic organism" refers to an organism that can convert light energy into chemical energy. In some embodiments, chemical energy may be later released to fuel the activities of these organisms (energy conversion). In some embodiments, this chemical energy is stored in carbohydrate molecules (e.g., sugars) that are synthesized from carbon dioxide and water.
Non-limiting examples of photosynthetic organisms include plants, algae, and cyanobacteria. Non-limiting examples of algae belong to the genera: genus Triplophytes, Coccomyza, Geotrichum, Celastrus, Celosidium, Chryseophytes, Bordetella, balloonflower, Staphylum, Chrysocola, Chaetoceros, Tetraflagellates, Chlamydomonas, Chlorococcus, Chlorella, Crypthecodinium, Chlorococcus, Chlorella, Haematococcus, Crypthecodinium, Coccodinium, Rhodococcus, Halobacterium, Isochrysis, Phyllostachys, Phaeophyceae, Isochrysis, Isodon, Isochrysis, Photinus, Phaeophyceae, Chlorella, Phaeophyceae, Oocystis, oyster globulina, pavlova, parachloropsis, parva, praguenophyta, phaeodactylum, phage, microalgal, tetraselminthium, crohns, portulaca, prototheca, pseudochlorella, neochlorella, pseudodiadactylum, talocystis, plasmopara, scenedesmus, ostereum, spirulina, schizophyllan, tetrastigmatis, thalassonia, xanthophylla, alexandrium, parachlorophyllum, welshikonium, and clitocystis.
Non-limiting examples of plants include Arabidopsis arenicola, Arabidopsis thaliana, Arabidopsis cebennensis, Arabidopsis croatica, Arabidopsis neglecta, Arabidopsis pedemontana, Arabidopsis suecica, corn, rice, wheat, potato, onion, garlic, soybean, tomato, Gossypium hirsutum, Gossypium arboreum, Brassica nigra, and Brassica sp.
As used herein, the term "mutant photosynthetic organism" or "mutant algae" refers to a photosynthetic organism or alga in which at least one of the following is modulated: SGI1; SGI2; the combination of SGI1 and SRP54; the combination of SGI2 and SRP54; or the combination of SGI1, SGI2, and SRP54. Such modulation may comprise alteration of the nucleic acid sequence or alteration of the expression of one or more of the genes.
As used herein, modulation of the combination of the SGI1 and SRP54 genes refers to modulation of the SGI1 gene and modulation of the SRP54 gene in the same photosynthetic organism. Similarly, modulation of the combination of the SGI2 and SRP54 genes refers to modulation of the SGI2 gene and modulation of the SRP54 gene in the same photosynthetic organism. Likewise, modulation of the combination of the SGI1, SGI2, and SRP54 genes refers to modulation of the SGI1, SGI2, and SRP54 genes in the same photosynthetic organism.
As used herein, the term "control photosynthetic organism" refers to a photosynthetic organism that is substantially genetically identical to the mutant photosynthetic organism in all relevant respects, except that the control photosynthetic organism does not have a mutated or attenuated SRP54, SGI1, or SGI2 gene, or a combination of two or more of these genes. For example, the control photosynthetic organism is of the same species and is genetically identical to the mutant except for the alteration of the cpSRP54, cytosolic SRP54, SGI1, or SGI2 gene, or the construct used to attenuate the cpSRP54, cytosolic SRP54, SGI1, or SGI2 gene, that is present in the mutant, and except for small genomic changes (e.g., "SNPs") that do not affect cellular physiology and that may arise during mutagenesis or through normal propagation. In various embodiments, the control photosynthetic organism is the strain from which the mutant photosynthetic organism having attenuated expression of cytosolic SRP54, cpSRP54, SGI1, SGI2, or a combination of at least two of these genes is derived.
When referring to a photosynthetic organism (such as an alga), the term "adapted to low light" means that the photosynthetic organism has the increased chlorophyll and the photosynthetic properties it exhibits after exposure to low light intensity for a period of time sufficient for the changes in chlorophyll and photosynthetic properties to stabilize under the low light condition. The low light may be, for example, less than 200 μE·m⁻²·s⁻¹, and preferably about 100 μE·m⁻²·s⁻¹ or less, or 50 μE·m⁻²·s⁻¹ or less, and the time period for adaptation may be at least about four hours, at least about six hours, at least about eight hours, at least about twelve hours, at least 24 hours, or at least 48 hours, and may be as long as 2 days, 3 days, 4 days, or 5 days.
"cDNA" is a DNA molecule that includes at least a portion of the nucleotide sequence of an mRNA molecule, except that the DNA molecule has the nucleobase thymine, or T, in place of the uridine, or U, present in the mRNA sequence. The cDNA may be double-stranded or single-stranded, and may be, for example, the complement of the mRNA sequence. In a preferred example, the cDNA does not contain one or more intron sequences that are present in the naturally occurring gene to which the cDNA corresponds (i.e., the gene as it occurs in the genome of the organism). For example, a cDNA may have a sequence from upstream of an intron of a naturally occurring gene juxtaposed with a sequence from downstream of that intron, wherein the upstream and downstream sequences are not juxtaposed in a DNA molecule in nature (i.e., the sequences are not juxtaposed in the naturally occurring gene). cDNA may be produced by reverse transcription of mRNA molecules, or may be synthesized, for example, by chemical synthesis and/or by using one or more restriction enzymes, one or more ligases, one or more polymerases (including but not limited to thermostable polymerases useful for the polymerase chain reaction (PCR)), one or more recombinases, and the like, based on knowledge of the cDNA sequence, where that knowledge may optionally be based on the identification of coding regions from genomic sequences or compiled from the sequences of multiple partial cDNAs.
An algal mutant that is "deregulated in low light adaptation" (or a "locked in high light adaptation", or LIHLA, mutant) is a mutant that does not exhibit the changes that are characteristic of low light adapted wild type algal cells, including: a significant increase in chlorophyll and a significant increase in the expression of most of the light harvesting complex protein (LHCP) genes. When adapted to low light, an algal mutant that is deregulated in low light adaptation has reduced expression, relative to low light adapted wild type cells, of a plurality of genes (e.g., at least ten, at least twenty, at least thirty, at least forty, or at least fifty genes) that are upregulated during low light adaptation of wild type cells. Further, an algal mutant that is deregulated in low light adaptation has increased expression, relative to low light adapted wild type cells, of genes (e.g., at least five, at least six, at least seven, at least eight, at least nine, or at least ten genes) that are down-regulated during low light adaptation of wild type cells. Further, as disclosed herein, when both the mutant and wild type cells are adapted to low light, the algal mutant that is deregulated in low light adaptation may have photosynthetic properties that are significantly different from those of the wild type cells.
"Photosynthetic properties", "photophysiological properties", or "photophysiological parameters" include, but are not limited to: the maximum photosynthetic rate, Pmax (calculated on a per cell or per milligram chlorophyll basis); the intensity at which photosynthesis saturates, Ek (as measured by oxygen evolution); α ("alpha"), the initial slope of the photosynthesis (oxygen evolution) versus irradiance intensity (P/I) curve; Fv/FM; the photosynthetic quantum yield of photosystem II (PSII), ΦPSII; photochemical quenching, or the proportion of open PSII centers, qP; non-photochemical quenching, NPQ; the PSII electron transfer rate, ETRPSII; the PSI electron transfer rate, ETRPSI; the functional absorption cross-sectional size of PSI (σPSI); and the functional absorption cross section of PSII (σPSII). The list here is not exhaustive, and the term does not exclude other parameters for measuring various aspects of photosynthesis.
Reference to "substantially the same" properties is intended to mean that the properties are within 10%, and preferably within 5% of the reference value.
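For orientation, several of the photophysiological parameters named above can be computed from raw measurements, and two values can be compared under the "within 10%" (or 5%) criterion. The formulas below are the commonly used textbook definitions and are an assumption here, since the text does not state them; the variable names (f0, fm, fs, fm_prime, f0_prime) and all example numbers are hypothetical.

```python
def fv_fm(f0: float, fm: float) -> float:
    """Fv/Fm = (Fm - F0) / Fm, from dark-adapted minimum and maximum fluorescence."""
    return (fm - f0) / fm


def phi_psii(fs: float, fm_prime: float) -> float:
    """PSII quantum yield = (Fm' - Fs) / Fm' in the light-adapted state."""
    return (fm_prime - fs) / fm_prime


def q_p(fs: float, fm_prime: float, f0_prime: float) -> float:
    """Photochemical quenching = (Fm' - Fs) / (Fm' - F0')."""
    return (fm_prime - fs) / (fm_prime - f0_prime)


def npq(fm: float, fm_prime: float) -> float:
    """Non-photochemical quenching = (Fm - Fm') / Fm'."""
    return (fm - fm_prime) / fm_prime


def e_k(p_max: float, alpha: float) -> float:
    """Saturation irradiance taken as Pmax / alpha of the P/I curve."""
    return p_max / alpha


def substantially_same(value: float, reference: float, tolerance: float = 0.10) -> bool:
    """True if value lies within the given fraction (10% by default) of the reference."""
    if reference == 0:
        return value == 0
    return abs(value - reference) <= tolerance * abs(reference)


if __name__ == "__main__":
    wild_type = fv_fm(f0=300.0, fm=1500.0)
    mutant = fv_fm(f0=320.0, fm=1500.0)
    print(wild_type, mutant, substantially_same(mutant, wild_type))
```

Passing `tolerance=0.05` applies the stricter, preferred 5% window.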
Although methods and materials similar or equivalent to those described herein can be used in the practice or testing of the present invention, suitable methods and materials are described below. The materials, methods, and examples are illustrative only and not intended to be limiting. Other features and advantages of the invention will be apparent from the detailed description and from the claims.
Gene attenuation
The mutant photosynthetic organisms may be mutants produced by any feasible method, including but not limited to UV irradiation, gamma irradiation, or chemical mutagenesis, and screening for low chlorophyll mutants having the photosynthetic properties disclosed herein. Methods for generating mutants of microbial strains are well known. Mutants can be identified by methods known in the art, including, for example, genomic sequencing, PCR, immunodetection of cpSRP54 or cytoSRP54 proteins, and expression analysis (e.g., reverse transcription/PCR).
The mutant photosynthetic organisms provided herein can also be genetically engineered in a combination of SGI1, SGI2, cpSRP54, cytoSRP54, SGI1, and cpSRP54 genes, or a combination of SGI2 and cpSRP54, e.g., that have been targeted for knockout or gene replacement by homologous recombination (e.g., with a mutated form of a gene that can encode a polypeptide having reduced activity relative to a wild-type polypeptide). In further examples, the algal strain of interest can be engineered by site-directed homologous recombination to insert a particular gene of interest (e.g., a promoter) with or without expression control sequences into a particular genomic locus, or to insert a promoter into a genetic locus of a host microorganism to affect expression of a particular gene or set of genes at the locus.
For example, gene knockout or replacement by homologous recombination can be performed by transformation of a nucleic acid (e.g., DNA) fragment comprising a sequence homologous to the genomic region to be altered, wherein the homologous sequence is interrupted by an exogenous sequence (typically a selectable marker gene that allows selection of the integrated construct). The length of the genomic homologous flanking sequences on either side of the exogenous sequence or the mutated gene sequence may be, for example, at least 50, at least 100, at least 200, at least 300, at least 400, at least 500, at least 600, at least 700, at least 800, at least 900, at least 1,000, at least 1,200, at least 1,500, at least 1,750, or at least 2,000 nucleotides. Gene knockout or gene "knock-in" constructs (where the exogenous sequence is flanked by target gene sequences) can be provided in vectors that can optionally be linearized, e.g., outside of the region undergoing homologous recombination, or can be provided as linear fragments that are not in the context of a vector; e.g., the knockout or knock-in constructs can be isolated or synthetic fragments, including but not limited to PCR products. In some examples, split-marker systems can be used to generate gene knockouts by homologous recombination, where two DNA fragments can be introduced that can regenerate the selectable marker and disrupt the locus of interest by three crossover events (Jeong et al. (2007), FEMS Microbiol Lett 273:157-163).
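By way of illustration only, the layout of such a knockout construct (left homology flank, exogenous sequence, right homology flank) can be sketched as simple string assembly. The function name, its parameters, and the toy sequences are hypothetical; real construct design would also need to consider cloning sites, marker expression elements, and sequence verification.

```python
def build_knockout_cassette(genome, target_start, target_end, marker,
                            flank_len=500):
    """Assemble left flank + marker + right flank.

    The cassette replaces genome[target_start:target_end] with `marker`,
    keeping `flank_len` nucleotides of homology on each side.
    """
    if target_start - flank_len < 0 or target_end + flank_len > len(genome):
        raise ValueError("not enough sequence for the requested flank length")
    left_flank = genome[target_start - flank_len:target_start]
    right_flank = genome[target_end:target_end + flank_len]
    return left_flank + marker + right_flank


# Toy sequences for illustration only (far shorter than real flanks).
toy_genome = "A" * 10 + "CCCC" + "G" * 10
cassette = build_knockout_cassette(toy_genome, 10, 14, "TTT", flank_len=5)
```

The flank length is a parameter because the description contemplates homology arms ranging from at least 50 to at least 2,000 nucleotides.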
In one aspect, the invention provides genetically modified organisms, such as genetically modified microorganisms, having one or more genetic modifications for attenuating expression of SGI1, SGI2, cpSRP54, cytoSRP54, a combination of SGI1 and cpSRP54 genes, or a combination of SGI2 and cpSRP54 genes. As used herein, "attenuating expression of SGI1, SGI2, cpSRP54, cytoSRP54, a combination of SGI1 and cpSRP54 genes, or a combination of SGI2 and cpSRP54 genes" means reducing or eliminating expression of one or more of the above genes in any manner that reduces production of the fully functional protein.
For example, a recombinant photosynthetic organism engineered to have attenuated expression of SGI1, SGI2, cpSRP54, cytoSRP54, a combination of SGI1 and cpSRP54 genes, or a combination of SGI2 and cpSRP54 genes may have a disrupted SGI1, SGI2, cpSRP54, or cytoSRP54 gene, or a disrupted combination of SGI1 and cpSRP54 genes or of SGI2 and cpSRP54 genes, wherein the disrupted gene comprises at least one insertion, mutation, or deletion that reduces or eliminates gene expression, such that a fully functional gene product is not produced or is produced in a lower amount than that produced by an unmodified photosynthetic organism of the same species. The SGI1, SGI2, cpSRP54, or cytoSRP54 gene, or the combination of SGI1 and cpSRP54 genes or of SGI2 and cpSRP54 genes, can be disrupted, for example, by homologous recombination and/or by insertion or gene replacement mediated by the activity of meganucleases, zinc finger nucleases (Perez-Pinera et al. (2012), Curr. Opin. Chem. Biol. 16:268-277), TALENs (WO 2014/207043; WO 2014/076571), or RNA-guided endonucleases, such as Cas proteins of the CRISPR system (e.g., a Cas9 protein).
CRISPR systems, recently reviewed by Hsu et al. (Cell 157:1262-.
The present invention contemplates the use of two RNA molecules ("crRNA" and "tracrRNA") that can be co-transformed into (or expressed in) a host strain that expresses or is transfected with a Cas protein for genome editing, or the use of a single guide RNA comprising a sequence complementary to a target sequence and a sequence that interacts with a Cas protein. That is, in some strategies, a CRISPR system as used herein may comprise two separate RNA molecules (RNA polynucleotides: "tracr-RNA" and "target-RNA" or "crRNA", see below), and is referred to herein as a "double-molecule DNA-targeting RNA" or "two-molecule DNA-targeting RNA". Alternatively, as shown in the examples, the DNA-targeting RNA may also comprise a trans-activating sequence (in addition to the targeting homology ("cr") sequence) for interacting with the Cas protein, i.e., the DNA-targeting RNA may be a single RNA molecule (single RNA polynucleotide), referred to herein as a "chimeric guide RNA", "single guide RNA", or "sgRNA". The terms "DNA-targeting RNA" and "gRNA" are inclusive and refer to both double-molecule DNA-targeting RNAs and single-molecule DNA-targeting RNAs (i.e., sgRNAs). Both single-molecule guide RNAs and two-RNA systems have been described in detail in the literature and, for example, in U.S. patent application publication No. US 2014/0068797, which is incorporated herein by reference in its entirety.
Any Cas protein can be used in the methods herein, e.g., Cas1, Cas1B, Cas2, Cas3, Cas4, Cas5, Cas6, Cas7, Cas9 (also referred to as Csn1 and Csx12), and members of the Csy, Cse, Csc, Csa, Csn, Csm, Cmr, Csb, and Csx protein families. In some embodiments, the Cas protein is a class II Cas protein. As non-limiting examples, the Cas protein may be a Cas9 protein, such as a Cas9 protein of Streptococcus pyogenes (S. pyogenes), Streptococcus thermophilus (S. thermophilus), Streptococcus pneumoniae (S. pneumoniae), Staphylococcus aureus (S. aureus), or Neisseria meningitidis. Other Cas proteins of interest include, but are not limited to, the Cpf1 RNA-guided endonuclease (Zetsche et al. (2015), Cell 163:1-13) and the C2c1, C2c2, and C2c3 RNA-guided nucleases (Shmakov et al. (2015), Molecular Cell 60:1-13). Also contemplated are the Cas9 proteins provided as SEQ ID NOs 1-256 and 795-1346 in U.S. patent application publication No. US 2014/0068797, chimeric Cas9 proteins that combine domains from more than one Cas9 protein, and variants and mutants of the identified Cas9 proteins. For example, a Cas9 protein encoded by a nucleic acid molecule introduced into a host cell may include at least one mutation relative to a wild-type Cas9 protein; e.g., the Cas9 protein may be inactivated in one of the cleavage domains of the protein, thereby producing a "nickase" variant. Non-limiting examples of such mutations include D10A, H840A, N854A, and N863A. The nucleic acid sequence encoding the Cas protein may be codon optimized for the host cell of interest.
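By way of illustration only, candidate target sites for a guide RNA can be located by scanning a DNA sequence. The sketch below assumes the widely used S. pyogenes Cas9 requirement for an NGG protospacer-adjacent motif (PAM) immediately 3' of a 20-nucleotide target, which is not stated in this description; other Cas proteins use different motifs. The function name and the toy sequence are hypothetical, and only the given strand is scanned.

```python
import re


def find_cas9_targets(seq, guide_len=20):
    """List (start, protospacer, PAM) for each NGG-adjacent site.

    Assumes an NGG PAM (S. pyogenes Cas9); scans the given strand only.
    """
    seq = seq.upper()
    # The lookahead lets overlapping candidate sites all be reported.
    pattern = re.compile(r"(?=([ACGT]{%d})([ACGT]GG))" % guide_len)
    return [(m.start(), m.group(1), m.group(2))
            for m in pattern.finditer(seq)]


# Toy sequence for illustration only.
toy_locus = "ACGTACGTACGTACGTACGT" + "TGG" + "AAAA"
sites = find_cas9_targets(toy_locus)
```

A practical design tool would also scan the reverse complement and screen candidates for off-target matches elsewhere in the genome.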
Cas nuclease activity cleaves the target DNA to generate a double strand break. These breaks are then repaired by the cells in one of two ways: non-homologous end joining or homology directed repair. In non-homologous end joining (NHEJ), double-stranded breaks are repaired by joining the broken ends directly to each other. In this case, no new nucleic acid material is inserted into the site, but some of the nucleic acid material may be lost, resulting in deletions or alterations, often resulting in mutations. In homology-directed repair, a donor polynucleotide (sometimes referred to as "donor DNA" or "editing DNA") that may have homology to the cleaved target DNA sequence is used as a template for repairing the cleaved target DNA sequence, resulting in the transfer of genetic information from the donor polynucleotide into the target DNA. In this way, new nucleic acid material can be inserted/copied into the site. Modification of the target DNA due to NHEJ and/or homology directed repair (e.g., using a donor DNA molecule) can result in, for example, gene correction, gene replacement, gene tagging, transgene insertion, nucleotide deletion, gene disruption, gene mutation, and the like.
In some examples, cleavage of DNA by site-directed modification polypeptides (e.g., Cas nucleases, zinc finger nucleases, meganucleases, or TALENs) can be used to delete nucleic acid material from a target DNA sequence by cleaving the target DNA sequence and allowing the cell to repair the sequence in the absence of an exogenously provided donor polynucleotide. Such NHEJ events can cause mutations ("mis-repair") at the site of reconnection of the cleaved ends, resulting in gene disruption.
Alternatively, if the DNA-targeting RNA and the donor DNA are co-administered to a cell expressing a Cas nuclease, the subject methods may be used to add (i.e., insert or replace) nucleic acid material to the target DNA sequence, e.g., to "knock out" a gene by insertional mutagenesis, or to "knock in" a nucleic acid encoding a protein (e.g., a selectable marker and/or any protein of interest), an siRNA, an miRNA, etc., or to modify the nucleic acid sequence (e.g., to introduce a mutation).
In particular embodiments, the donor DNA may comprise a gene regulatory sequence (e.g., a promoter) that can, using CRISPR targeting, be inserted upstream of the coding region of the gene and upstream of the putative proximal promoter region of the gene, e.g., at least 50 bp, at least 100 bp, at least 120 bp, at least 150 bp, at least 200 bp, at least 250 bp, at least 300 bp, at least 350 bp, at least 400 bp, at least 450 bp, or at least 500 bp upstream of the initiating ATG of the coding region of the cpSRP54 gene. The donor DNA may comprise a sequence, such as a selectable marker or any convenient sequence, that may interfere with the native promoter. The additional sequence inserted upstream of the initiating ATG of the SGI1, SGI2, cpSRP54, or cytoSRP54 open reading frame, or of a combination of these genes (e.g., in the 5' UTR or upstream of the transcription start site of the cpSRP54 gene), may reduce or even eliminate expression of the endogenous SGI1, SGI2, cpSRP54, or cytoSRP54 gene or combination of genes. Alternatively or additionally, the native SGI1, SGI2, cpSRP54, or cytoSRP54 gene, or combination of genes, may have its endogenous promoter replaced, in whole or in part, by a weaker or differently regulated promoter or by a non-promoter sequence.
In some examples, the nucleic acid molecule introduced into the host cell for generating a high-efficiency genome editing cell line encodes a Cas9 enzyme that is mutated relative to the corresponding wild-type enzyme such that the mutated Cas9 enzyme lacks the ability to cleave one or both strands of a target polynucleotide containing a target sequence. For example, an aspartate-to-alanine substitution (D10A) in the RuvC I catalytic domain of Cas9 from Streptococcus pyogenes converts Cas9 from a nuclease that cleaves both strands to a nickase (an enzyme that cleaves a single strand). Other examples of mutations that render Cas9 a nickase include, but are not limited to, H840A, N854A, and N863A. In some embodiments, a Cas9 nickase can be used in combination with one or more guide sequences (e.g., two guide sequences) that target the sense and antisense strands of a DNA target, respectively. This combination allows both strands to be nicked and can be used to induce NHEJ. Two nickase targets (in close proximity but on different strands of the DNA) can be used to induce mutagenic NHEJ. Such targeting of a locus using enzymes that cleave opposite strands at staggered positions can also reduce off-target cleavage, as both strands must be cleaved precisely and specifically to effect a genomic mutation.
In further examples, a mutant Cas9 enzyme that is impaired in its ability to cleave DNA can be expressed in a cell into which one or more guide RNAs are also introduced that target sequences upstream of the transcriptional or translational start site of the gene. In this case, the Cas enzyme can bind to the target sequence and block transcription of the targeted gene (Qi et al. (2013), Cell 152:1173-1183).
In some cases, a Cas polypeptide (e.g., a Cas9 polypeptide) is a fusion polypeptide, including, for example: i) a Cas9 polypeptide (which may optionally be a variant Cas9 polypeptide as described above); and ii) a covalently linked heterologous polypeptide (also referred to as a "fusion partner"). A heterologous nucleic acid sequence can be linked to another nucleic acid sequence (e.g., by genetic engineering) to generate a chimeric nucleotide sequence encoding a chimeric polypeptide. In some embodiments, the Cas9 fusion polypeptide is generated by fusing the Cas9 polypeptide with a heterologous sequence that provides subcellular localization (i.e., the heterologous sequence is a subcellular localization sequence, e.g., a nuclear localization signal (NLS) for targeting the nucleus; a mitochondrial localization signal for targeting mitochondria; a chloroplast localization signal for targeting chloroplasts; an ER retention signal, etc.). In some embodiments, the heterologous sequence can provide a tag (i.e., the heterologous sequence is a detectable label) to facilitate tracking and/or purification (e.g., a fluorescent protein such as green fluorescent protein (GFP), YFP, RFP, CFP, mCherry, or tdTomato; a hemagglutinin (HA) tag; a FLAG tag; a Myc tag, etc.).
The host cell may be genetically engineered (e.g., transduced, transformed, or transfected) with, for example, a vector construct, which may be, for example, a vector for homologous recombination comprising a nucleic acid sequence homologous to a portion of the SGI1, SGI2, cpSRP54, or cytoSRP54 gene locus of the host cell, or of the combination of SGI1 and cpSRP54 or of SGI2 and cpSRP54 gene loci, or to regions adjacent thereto, or may be an expression vector for expressing any one or combination of: Cas proteins (e.g., class II Cas proteins), CRISPR chimeric guide RNAs, crRNAs and/or tracrRNAs, RNAi constructs (e.g., shRNAs), antisense RNAs, or ribozymes. The vector may be in the form of, for example, a plasmid, a viral particle, a phage, or the like. Vectors for expression of polypeptides or RNAs for genome editing may also be designed for integration into the host, e.g., by homologous recombination. Vectors containing the polynucleotide sequences described herein, e.g., sequences having homology to host SGI1, SGI2, cpSRP54, or cytoSRP54 gene sequences, or to a combination of SGI1 and cpSRP54 or of SGI2 and cpSRP54 gene sequences (including sequences upstream and downstream of the cpSRP54 or cytoSRP54 coding sequences), and optionally selectable markers or reporter genes, can be used to transform a suitable host to cause attenuation of SGI1, SGI2, cpSRP54, cytoSRP54, a combination of SGI1 and cpSRP54 genes, or a combination of SGI2 and cpSRP54 genes.
In some examples, the recombinant photosynthetic organism may have reduced but not eliminated expression of SGI1, SGI2, cpSRP54, cytoSRP54, a combination of SGI1 and cpSRP54 genes, or a combination of SGI2 and cpSRP54 genes, and the recombinant photosynthetic organism may have a reduction in chlorophyll of about 10% to about 90%, for example, a reduction in total chlorophyll of about 20% to about 80%. Genetically modified microorganisms as provided herein may, in some examples, comprise nucleic acid constructs for attenuating expression of SGI1, SGI2, cpSRP54, cytoSRP54, a combination of SGI1 and cpSRP54 genes, or a combination of SGI2 and cpSRP54 genes. For example, the host microorganism may comprise a construct for expressing an RNAi molecule, ribozyme, or antisense molecule that reduces expression of SGI1, SGI2, cpSRP54, cytoSRP54, a combination of SGI1 and cpSRP54 genes, or a combination of SGI2 and cpSRP54 genes. In some examples, a recombinant microorganism as provided herein may comprise at least one introduced (exogenous or non-native) construct for reducing expression of SGI1, SGI2, cpSRP54, cytoSRP54, a combination of SGI1 and cpSRP54 genes, or a combination of SGI2 and cpSRP54 genes.
Using methods known in the art, e.g., RNA-Seq or reverse transcription PCR (RT-PCR), engineered strains can be selected in which expression of SGI1, SGI2, cpSRP54, cytoSRP54, a combination of SGI1 and cpSRP54 genes, or a combination of SGI2 and cpSRP54 genes is reduced, but not eliminated, relative to control cells that do not comprise genetic modifications for attenuating expression of those genes.
Genetically engineered strains as provided herein can be engineered to comprise constructs for attenuating gene expression by reducing the amount, stability, or translation of mRNA of genes encoding SGI1, SGI2, cpSRP54, cytoSRP54, a combination of SGI1 and cpSRP54 genes, or a combination of SGI2 and cpSRP54 genes. For example, photosynthetic organisms such as plant, algal, or heterokont strains can be transformed, using methods known in the art, with antisense RNA, RNAi, or ribozyme constructs targeting mRNA of SGI1, SGI2, cpSRP54, cytoSRP54, a combination of SGI1 and cpSRP54 genes, or a combination of SGI2 and cpSRP54 genes. For example, antisense RNA constructs comprising all or part of the transcribed region of a gene can be introduced into a microorganism to reduce gene expression (Schroda et al. (1999), The Plant Cell 11:1165-78; Ngiam et al. (2000), Appl. Environ. Microbiol. 66:775-782; Ohnuma et al. (2009), Protoplasma 236:107-112; Lavaud et al. (2012), PLoS One 7:e36806). Alternatively or additionally, RNAi constructs (e.g., constructs encoding short hairpin RNAs) targeting the cpSRP54 or cytoSRP54 gene may be introduced into a microorganism, such as an alga or a heterokont, for reducing expression of the cpSRP54 or cytoSRP54 gene (see, e.g., Cerutti et al. (2011), Eukaryotic Cell 10:1164-).
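By way of illustration only, the sequence relationship between a transcribed region and an antisense RNA directed against it can be sketched as a reverse complement. The function names and the toy sequence are hypothetical; the sketch says nothing about which portion of a transcript makes an effective antisense target.

```python
# Translation table for complementing DNA bases (upper and lower case).
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(dna):
    """Return the reverse complement of a DNA sequence."""
    return dna.translate(_COMPLEMENT)[::-1]


def antisense_rna(coding_strand):
    """Return the RNA complementary to the mRNA of a coding-strand DNA.

    The mRNA has the same sequence as the coding strand (with U for T),
    so the antisense RNA is the reverse complement written with U.
    """
    return reverse_complement(coding_strand).upper().replace("T", "U")


# Toy sequence for illustration only.
toy_antisense = antisense_rna("ATGGCC")
```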
Ribozymes are RNA-protein complexes that cleave nucleic acids in a site-specific manner. Ribozymes have specific catalytic domains that possess endonuclease activity. For example, U.S. patent No. 5,354,855 reports that certain ribozymes can act as endonucleases with sequence specificities greater than that of known ribonucleases and approaching that of DNA restriction enzymes. Catalytic RNA constructs (ribozymes) can be designed to base-pair with mRNA encoding the genes provided herein to cleave the mRNA target. In some examples, ribozyme sequences may be integrated within antisense RNA constructs to mediate cleavage of the target. Various types of ribozymes are contemplated, the design and use of which are known in the art and are described, for example, in Haseloff et al (1988), Nature 334: 585-591.
Ribozymes are targeted to a given sequence by annealing to the site via complementary base pair interactions. This targeting requires two stretches of homology, which flank the catalytic ribozyme structure defined above. Each stretch of homologous sequence may vary in length from 7 to 15 nucleotides. The only requirement for defining the homologous sequences is that they are separated on the target RNA by a specific sequence that serves as the cleavage site. For hammerhead ribozymes, the cleavage site is a dinucleotide sequence on the target RNA: uracil (U) followed by adenine, cytosine, or uracil (A, C, or U) (Thompson et al. (1995), Nucleic Acids Res. 23:2250-68). The frequency with which this dinucleotide is present in any given RNA is statistically 3 out of 16. Thus, for a given target messenger RNA of 1,000 bases, 187 dinucleotide cleavage sites are statistically likely.
General design and optimization of ribozyme-directed RNA cleavage activity have been discussed in detail (Haseloff and Gerlach (1988), Nature 334:585-591; Symons (1992), Ann Rev Biochem 61:641-71; Chowrira et al. (1994), J Biol Chem 269:25856-64; Thompson et al. (1995), supra). The design and testing of ribozymes for efficient cleavage of a target RNA is a procedure well known to those skilled in the art. Examples of scientific methods for designing and testing ribozymes are described by Chowrira et al. (1994), supra, and Lieber and Strauss (1995), Mol Cell Biol. 15:540-51, each of which is incorporated by reference. The identification of effective and preferred sequences for down-regulating a given gene is a matter of preparing and testing a given sequence and is a routinely practiced "screening" method known to those skilled in the art.
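By way of illustration only, the arithmetic above can be checked directly: three of the sixteen possible dinucleotides (UA, UC, and UU) qualify, so a 1,000-base RNA is expected to contain about 1,000 × 3/16 = 187.5 sites, which the text states as 187. The sketch below also counts actual sites in a given sequence; the function names and the toy sequence are hypothetical, and the expectation assumes that all four bases are equally likely and independent.

```python
def count_hammerhead_sites(rna):
    """Count U followed by A, C, or U in an RNA (or DNA) sequence."""
    rna = rna.upper().replace("T", "U")
    return sum(1 for i in range(len(rna) - 1)
               if rna[i] == "U" and rna[i + 1] in "ACU")


def expected_sites(length, site_fraction=3 / 16):
    """Expected site count for a random sequence of the given length."""
    return length * site_fraction


# Toy sequence for illustration only: sites are UC, UU, and UA.
toy_count = count_hammerhead_sites("AUGUCUUAG")
expected_for_1kb = expected_sites(1000)
```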
The use of RNAi constructs is described in the literature cited above and for example in US2005/0166289 and WO 2013/016267. Double-stranded RNA having homology to the target gene is delivered to the cell or produced in the cell by expression of an RNAi construct (e.g., an RNAi short hairpin (sh) construct). The construct may comprise a sequence identical to the target gene, or at least 70%, 80%, 90%, 95%, or between 95% and 100% identical to the sequence of the target gene. The construct may have at least 20, at least 30, at least 40, at least 50, at least 100, at least 200, at least 300, at least 400, at least 500, at least 600, at least 700, at least 800, at least 900, or at least 1kb of sequences homologous to the target gene. Expression vectors can be engineered using promoters selected for continuous or inducible expression of RNAi constructs (e.g., constructs that produce shRNA).
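By way of illustration only, the percent identity between a construct sequence and the corresponding stretch of a target gene can be computed position by position. The sketch below assumes the two sequences are already aligned and of equal length, which sidesteps the gapped alignment a real comparison would normally use; the function name and the toy sequences are hypothetical.

```python
def percent_identity(seq_a, seq_b):
    """Percent of matching positions between two pre-aligned sequences.

    Assumes equal-length, ungapped sequences; a real comparison would
    typically use an alignment tool to handle insertions and deletions.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be the same length")
    if not seq_a:
        raise ValueError("sequences must not be empty")
    matches = sum(a == b for a, b in zip(seq_a.upper(), seq_b.upper()))
    return 100.0 * matches / len(seq_a)


# Toy sequences for illustration only: 9 of 10 positions match.
toy_identity = percent_identity("ACGTACGTAC", "ACGTACGTAA")
```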
A nucleic acid construct for gene attenuation, e.g., a ribozyme, RNAi, or antisense construct, may comprise at least fifteen, at least twenty, at least thirty, at least forty, at least fifty, or at least sixty nucleotides having at least 80% identity, such as at least 85%, at least 90%, at least 95%, or at least 99% identity, or complementarity, to at least a portion of the sequence of the endogenous SGI1, SGI2, cpSRP54, or cytoSRP54 gene, or of the combination of SGI1 and cpSRP54 genes or of SGI2 and cpSRP54 genes, of the microorganism to be engineered. A nucleic acid construct for gene attenuation, e.g., a ribozyme, RNAi, or antisense construct, may comprise at least fifteen, at least twenty, at least thirty, at least forty, at least fifty, or at least sixty nucleotides having at least 80%, such as at least 95% or about 100%, identity or complementarity to a sequence of a naturally occurring gene (e.g., a gene encoding a polypeptide having at least 50%, at least 55%, at least 60%, at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, or at least 95% sequence identity to an endogenous SGI1, SGI2, cpSRP54, or cytoSRP54 polypeptide). For example, a nucleic acid construct for gene attenuation, such as a ribozyme, RNAi, or antisense construct, can comprise at least fifteen, at least twenty, at least thirty, at least forty, at least fifty, or at least sixty nucleotides having at least 80% identity or complementarity to a sequence of a naturally occurring cpSRP54 gene (such as any of the genes provided herein). The nucleotide sequence may be, for example, about 30 nucleotides to about 3 kilobases or more in length, e.g., 30 to 50 nucleotides, 50 to 100 nucleotides, 100 to 500 nucleotides, 500 nucleotides to 1 kb, 1 kb to 2 kb, or 2 to 5 kb in length. For example, the antisense sequence can be from about 100 nucleotides to about 1 kb in length.
For example, a nucleic acid construct for gene attenuation, e.g., a ribozyme, RNAi, or antisense construct, may comprise at least fifteen, at least twenty, at least thirty, at least forty, at least fifty, at least sixty, or at least 100 nucleotides having at least 50%, at least 55%, at least 60%, at least 65%, at least 70%, at least 75%, at least 80%, or at least 85%, e.g., at least 86%, at least 87%, at least 88%, at least 89%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, or at least 95%, identity or complementarity to an endogenous SGI1, SGI2, cpSRP54, or cytoSRP54 gene, to a combination of SGI1 and cpSRP54 genes or of SGI2 and cpSRP54 genes, or to a portion thereof.
The promoter used in the antisense, RNAi, or ribozyme construct may be any promoter that functions in the host organism and is suitable for reducing expression of the target gene to the desired level. Promoters that function in algae and heterokonts are known in the art and are disclosed herein. The constructs can be transformed into algae using any operable method, including any of the methods disclosed herein. A recombinant organism or microorganism transformed with a nucleic acid molecule for attenuating expression of SGI1, SGI2, cpSRP54, cytoSRP54, a combination of SGI1 and cpSRP54 genes, or a combination of SGI2 and cpSRP54 genes, such as but not limited to an antisense, RNAi, or ribozyme construct, may have the properties of the SGI1, SGI2, cpSRP54, cytoSRP54, SGI1 and cpSRP54, or SGI2 and cpSRP54 mutants described herein, including, for example, reduced chlorophyll, increased photosynthetic efficiency, and increased culture productivity relative to a host organism or microorganism that does not include the exogenous nucleic acid molecule that results in attenuated gene expression.
Nucleic acid molecules and constructs
It will be appreciated by those skilled in the art that many transformation methods can be used for genetic transformation of microorganisms and thus can be used in the methods of the present invention. "Stable transformation" is intended to mean that a nucleic acid construct introduced into an organism is integrated into the genome of the organism, or is part of a stable episomal construct, and is capable of being inherited by its progeny. "transient transformation" is intended to mean the introduction of a polynucleotide into an organism and not integrating into the genome or otherwise being established and stably inherited through successive generations.
Genetic transformation may result in stable insertion and/or expression of transgenes or constructs from either the nucleus or the plastid, and in some cases may result in transient expression of transgenes. These transformation methods can also be used to introduce guide RNAs or editing DNA. Genetic transformation has been reported as successful for more than 30 different strains of microalgae, belonging to at least about 22 species of green, red, and brown algae, diatoms, euglenids, and dinoflagellates (see, e.g., Radakovits et al., Eukaryotic Cell, 2010; and Gong et al., J. Ind. Microbiol. Biotechnol., 2011). Non-limiting examples of such useful transformation methods include agitation of cells in the presence of glass beads or silicon carbide whiskers, as reported by, for example, Dunahay, Biotechniques 15(3):452, 1993; Kindle, Proc. Natl. Acad. Sci. U.S.A., 1990; and Michael and Miller, Plant J. 13:427-435, 1998. Electroporation has been successfully used for the genetic transformation of several microalgal species, including Nannochloropsis sp. (see, e.g., Chen et al., J. Phycol. 44:768-76, 2008), Chlorella sp. (see, e.g., Chen et al., Curr. Genet. 39:365-370, 2001; Chow and Tung, Plant Cell Rep. 18(9):778-780, 1999), Chlamydomonas (Shimogawara et al., Genetics 148:1821-1828, 1998), and Dunaliella (Sun et al., Mol. Biotechnol. 30:185-192, 2005). Microprojectile bombardment, also known as particle bombardment, gene gun transformation, or biolistic bombardment, has been successfully used for several algal species, including, for example, diatom species such as Phaeodactylum (Apt et al., Mol. Gen. Genet. 252:572-579, 1996), Cyclotella and Navicula (Dunahay et al., J. Phycol. 31:1004-1012, 1995), Cylindrotheca (Fischer et al., J. Phycol. 35:113-120, 1999), and Chaetoceros (Miyagawa-Yamaguchi et al., Phycol. Res. 59:113-119, 2011), as well as green algal species such as Chlorella (El-Sheekh, Biologia Plantarum 42(2), 1999) and Volvox (Jakobiak et al., Protist 155:381-93, 2004). In addition, Agrobacterium-mediated gene transfer techniques can also be used for the genetic transformation of microalgae, as has been reported by, for example, Kumar, Plant Sci. 166(3):731-738, 2004, and Cheney et al., J. Phycol. 37, Suppl. 11, 2001.
Transformation vectors or constructs as described herein will typically include a marker gene that confers a selectable or scorable phenotype on the target host cell (e.g., an algal cell), or can be co-transformed with a construct that comprises a marker. Many selectable markers for efficient isolation of algal genetic transformants have been successfully developed. Common selectable markers include antibiotic resistance, fluorescent markers, and biochemical markers. Several different antibiotic resistance genes have been successfully used for selecting microalgal transformants, including those conferring resistance to blasticidin; bleomycin (see, e.g., Apt et al., 1996, supra; Fischer et al., 1999, supra; Fuhrmann et al., Plant J. 19:353-61, 1999; Lumbreras et al., Plant J. 14(4):441-447, 1998; Zaslavskaia et al., J. Phycol. 36:379-386, 2000); spectinomycin (Cerutti et al., Genetics 145:97-110, 1997; Doetsch et al., Curr. Genet. 39:49-60, 2001; Fargo, Mol. Cell. Biol. 19:6980-90, 1999); streptomycin (Berthold et al., Protist 153:401-412, 2002); paromomycin (Jakobiak et al., Protist, supra; Sizova et al., Gene 277:221-; Poulsen and Kroger, FEBS Lett. 272:3413-3423, 2005; Zaslavskaia et al., 2000, supra); hygromycin (Berthold et al., 2002, supra); chloramphenicol (Poulsen and Kroger, 2005, supra); and many others. Additional selectable markers for use in microalgae such as Chlamydomonas may be markers that provide resistance to kanamycin and amikacin (Bateman, Mol. Gen. Genet. 263:404-10, 2000), to zeomycin and phleomycin (e.g., ZEOCIN™ phleomycin D1) (Stevens, Mol. Gen. Genet. 251:23-30, 1996), and to paromomycin and neomycin (Sizova et al., 2001, supra). Other fluorescent or chromogenic markers that have been used include luciferase (Falciatore et al., J. Mar. Biotechnol. 1:239, 1999; Fuhrmann et al., Plant Mol. Biol., 2004; Jarvis and Brown, Curr. Genet. 19:317-322, 1991) and β-glucuronidase (Chen et al., 2001, supra; Cheney et al., 2001, supra).
One skilled in the art will readily appreciate that a variety of known promoter sequences may be usefully deployed in transformation systems for microalgal species according to the invention. For example, promoters commonly used to drive transgene expression in microalgae include various versions of the cauliflower mosaic virus 35S promoter (CaMV35S), which have been used in both dinoflagellates and Chlorophyta (Chow et al., Plant Cell Rep. 18:778-780, 1999; Jarvis and Brown, Curr. Genet. 317-321, 1991; Lohuis and Miller, Plant J. 13:427-435, 1998). The SV40 promoter from simian virus has also been reported to be active in several algae (Gan et al., J. Appl. Phycol. 15:345-349, 2003; Qin et al., Hydrobiologia 398-399:469-472, 1999). Promoters of RBCS2 (ribulose bisphosphate carboxylase, small subunit) (Fuhrmann et al., Plant J. 19:353-361, 1999) and PsaD (an abundant protein of the photosystem I complex; Fischer and Rochaix, FEBS Lett. 581:5555-5560, 2001) from Chlamydomonas may also be useful. The fusion promoters HSP70A/RBCS2 and HSP70A/β2TUB (tubulin) (Schroda et al., Plant J. 21:121-131, 2000) can also be used for improved transgene expression, where the HSP70A promoter can act as a transcriptional activator when placed upstream of other promoters. High-level expression of a gene of interest can also be achieved, for example, in diatom species under the control of a promoter from the fcp gene encoding the diatom fucoxanthin-chlorophyll a/b binding protein (Falciatore et al., J. Mar. Biotechnol. 1:239, 1999). Inducible promoters can provide rapid and tightly controlled gene expression in transgenic microalgae, if desired. For example, the promoter region of the NR gene encoding nitrate reductase can be used as such an inducible promoter. NR promoter activity is normally inhibited by ammonium and induced when ammonium is replaced by nitrate (Poulsen and Kroger, FEBS Lett. 272:3413-3423, 2005), and thus gene expression can be switched off or on when microalgal cells are grown in the presence of ammonium or nitrate, respectively. Additional algal promoters that may find use in the constructs and transformation systems provided herein include those disclosed in U.S. Patent No. 8,883,993; U.S. patent application publication No. US 2013/0023035; U.S. patent application publication No. US 2013/0323780; and U.S. patent application publication No. US 2014/0363892.
The host cell may be an untransformed cell or a cell that has been transfected with at least one nucleic acid molecule. For example, an algal host cell engineered to have attenuated cpSRP54 gene expression may further comprise one or more genes that may confer any desired property, such as, but not limited to, increased production of a biomolecule of interest (e.g., one or more proteins, pigments, alcohols, or lipids).
Method for producing products from photosynthetic organisms
Also provided herein are methods of producing products from photosynthetic organisms (such as algae) by culturing photosynthetic organisms with increased photosynthetic efficiency, such as the SGI1, SGI2, cpSRP54, or cytoSRP54 mutants, the combined SGI1 and cpSRP54 mutants, or the combined SGI2 and cpSRP54 mutants disclosed herein. The method comprises culturing a photosynthetic organism mutant in SGI1, SGI2, cpSRP54, cytoSRP54, a combination of SGI1 and cpSRP54, or a combination of SGI2 and cpSRP54 in a suitable medium to provide a photosynthetic organism culture, and recovering biomass or at least one product from the culture. In some embodiments, the product is a lipid. The culture comprising the photosynthetic organism is preferably a photoautotrophic culture, and the culture medium preferably does not contain a significant amount of reduced carbon, i.e., the culture does not contain reduced carbon in a form or at a level that the algae can use for growth.
In some embodiments, the photosynthetic organisms may be cultured in any suitable vessel, including flasks or bioreactors, wherein the photosynthetic organisms may be exposed to artificial or natural light. A culture comprising a mutated photosynthetic organism may be cultured in a light/dark cycle, which may be, for example, a natural or programmed light/dark cycle, and as illustrative examples may provide twelve hours of light to twelve hours of darkness, fourteen hours of light to ten hours of darkness, sixteen hours of light to eight hours of darkness, and so forth.
Culturing refers to the deliberate promotion of growth (e.g., increase in cell size, cell content, and/or cell viability) and/or propagation (e.g., increase in cell number by mitosis) of one or more cells through the use of selected and/or controlled conditions. The combination of both growth and reproduction may be referred to as proliferation. As the examples herein demonstrate, mutants provided herein that exhibit deregulated adaptation to low light intensities can achieve higher cell densities of the culture over time (e.g., over a period of one week or more) relative to cultured wild-type algal cells of the same strain that are not deregulated in low light adaptation. For example, the cpSRP54 mutant can be cultured for at least five days, at least six days, at least seven days, at least eight days, at least nine days, at least ten days, at least eleven days, at least twelve days, at least thirteen days, at least fourteen days, or at least fifteen days, or at least one week, two weeks, three weeks, four weeks, five weeks, six weeks, seven weeks, eight weeks, nine weeks, or ten weeks or more.
Non-limiting examples of selected and/or controlled conditions that may be used to culture the recombinant microorganism include the use of a defined medium (with known characteristics such as pH, ionic strength, and/or carbon source), specified temperature, oxygen tension, carbon dioxide levels, growth in a bioreactor, and the like, or combinations thereof. In some embodiments, the microorganism or host cell may be grown mixotrophically, using both light and a reduced carbon source. Alternatively, the microorganism or host cell may be cultured phototrophically. When grown phototrophically, the algal strain can advantageously use light as an energy source. An inorganic carbon source (e.g., CO2 or bicarbonate) can be used for the synthesis of biomolecules by the microorganism. As used herein, "inorganic carbon" includes carbon-containing compounds or molecules that cannot be used by an organism as a sustainable energy source. Typically, "inorganic carbon" may be in the form of CO2 (carbon dioxide), carbonic acid, bicarbonate salts, carbonate salts, hydrogen carbonate salts, or the like, or combinations thereof, which cannot be further oxidized for sustainable energy nor used as a source of reducing power by organisms. Microorganisms grown photoautotrophically can be grown in a culture medium in which inorganic carbon is substantially the sole carbon source. For example, in a culture in which inorganic carbon is substantially the sole carbon source, any organic (reduced) carbon molecule or organic carbon compound that may be provided in the culture medium either cannot be taken up and/or metabolized by the cells for energy, or is not present in an amount sufficient to provide sustainable energy for the growth and proliferation of the cell culture.
Microorganisms and host cells that can be used according to the methods of the present invention can be found in various locations and environments throughout the world. The particular growth medium for optimal propagation and production of lipids and/or other products may vary and may be optimized to promote growth, propagation, or production of biomass or products (e.g., lipids, proteins, pigments, antioxidants). Solid and liquid growth media are generally available from a wide variety of sources, as are instructions for the preparation of particular media suitable for a wide variety of microbial strains. For example, various freshwater and saltwater media include those described in Barsanti (2005), Algae: Anatomy, Biochemistry & Biotechnology, CRC Press, for media and methods for culturing algae. Algal media recipes can also be found at the websites of various algal culture collections, including, as non-limiting examples, the UTEX Culture Collection of Algae (www.sbs.utexas.edu/UTEX/media.aspx); the Culture Collection of Algae and Protozoa (www.ccap.ac.uk); and Katedra Botaniky (botanic.natur.cuni.cz/algo/caup-media.html).
The culture method may optionally comprise inducing the expression of one or more genes for producing a product (such as, but not limited to, a protein involved in lipid production, one or more proteins, antioxidants, or pigments), and/or modulating a metabolic pathway in the microorganism. Inducing expression may comprise adding nutrients or compounds to the culture, removing one or more components from the culture medium, increasing or decreasing light and/or temperature, and/or other manipulations that promote expression of the gene of interest. Such manipulations may depend to a large extent on the nature of the (heterologous) promoter operably linked to the gene of interest.
In some embodiments of the invention, microorganisms that are deregulated in adaptation to low light intensities may be cultured in a "photobioreactor" equipped with an artificial light source and/or having one or more walls that are sufficiently transparent to light (including sunlight) to enable, promote, and/or maintain acceptable microorganism growth and proliferation. To produce fatty acid products or triglycerides, the photosynthetic microorganisms or host cells can additionally or alternatively be cultured in shake flasks, test tubes, vials, microtiter dishes, Petri dishes, and the like, or combinations thereof.
Additionally or alternatively, the recombinant photosynthetic microorganisms or host cells can be grown in ponds, waterways, sea-based growth vessels, ditches, raceways, channels, and the like, or combinations thereof. In such systems, the temperature may be unregulated, or various heating or cooling methods or devices may be employed. As in standard bioreactors, a source of inorganic carbon (such as, but not limited to, CO2, bicarbonate, carbonate, and the like), including, but not limited to, air, CO2-enriched air, or a combination thereof, may be supplied to the culture. When supplying flue gas and/or other sources of inorganic carbon that may contain CO in addition to CO2, it may be necessary to pre-treat such sources so that the level of CO introduced into the (photo)bioreactor does not constitute a dangerous and/or lethal dose with respect to the growth, proliferation, and/or survival of the microorganisms.
The mutant photosynthetic organisms may comprise one or more non-native genes encoding polypeptides for producing products such as, but not limited to, lipids, colorants or pigments, antioxidants, vitamins, nucleotides, nucleic acids, amino acids, hormones, cytokines, peptides, proteins, or polymers. For example, the encoded polypeptide can be an enzyme, a metabolic regulator, a cofactor, a carrier protein, or a transporter protein. The method comprises culturing a cpSRP54 mutant or a cytoSRP54 mutant comprising at least one non-native gene encoding a polypeptide involved in product production to produce biomass or at least one algal product. Products (e.g., lipids and proteins) can be recovered from the culture by recovery methods known to those of ordinary skill in the art, such as by whole culture extraction, e.g., using organic solvents. In some cases, recovery of fatty acid products may be enhanced by cell homogenization. For example, lipids (such as fatty acids, fatty acid derivatives, and/or triglycerides) may be separated from algae by extracting the algae with a solvent at elevated temperature and/or pressure, as described in co-pending, commonly assigned U.S. patent application publication No. US2013/0225846, which is incorporated herein by reference in its entirety.
Other alternative embodiments and methods will be apparent to those skilled in the art upon review of this disclosure. The discussion of the general methods presented herein is intended for illustrative purposes only. The following non-limiting examples are provided below.
Examples of the invention
Example 1
Generation of Chlorella strains overexpressing CAS9
The generation of a strain of chlorella that overexpresses Cas9 is described in U.S. patent application publication 2016/0304896, which is incorporated by reference in its entirety.
Briefly, vector pSGE-6709 was engineered for expression of the Streptococcus pyogenes Cas9 gene in Chlorella. The vector comprised the following three elements: (1) a Cas9 expression cassette containing an engineered Cas9 gene codon optimized for Chlorella and containing introns from Chlorella, further containing an N-terminal FLAG tag, a nuclear localization signal, and a peptide linker, operably linked to the Chlorella RPS17 promoter and terminated by the Chlorella RPS17 terminator; (2) a selectable marker expression cassette containing a blasticidin resistance gene from Aspergillus terreus codon optimized for Chlorella and containing Chlorella introns, operably linked to the Chlorella RPS4 promoter and terminated by the Chlorella RPS4 terminator; and (3) a GFP reporter expression cassette containing a GFP gene (Evrogen, Moscow, Russia) driven by the Chlorella ACP1 promoter and terminated by the Chlorella ACP1 terminator.
The vector was transformed into Chlorella by biolistics. The BioRad Gene Gun System was used to transform the Chlorella wild-type strain WT-1185, substantially as described in U.S. Patent Publication No. 2014/0154806, which is incorporated herein by reference. DNA for transformation was precipitated onto gold particles, which were adhered to the inside of lengths of tubing, and a burst of helium gas was fired through the tubing positioned within the gene gun to propel the DNA-coated gold particles into Chlorella sp. WT-1185 cells adhered to solid non-selective medium (2% agar plates containing PM074 algal growth medium). The gene gun was used to fire two bullets per cell circle at 600 psi from a distance of 3-6 cm from the plate. The next day, cells were transferred to selective medium for growth of transformed colonies.
Colonies were screened for full GFP penetrance by flow cytometry, identifying transformed strains that had a single fluorescence peak shifted to a higher value than the wild-type fluorescence peak. Fully penetrant strains, which showed a clearly shifted fluorescence peak relative to untransformed cells, were then tested for Cas9 expression by anti-Cas9 western blot. Based on these screens, isolate 6709-2 was carried forward and given strain identifier GE-15699.
Example 2:
Knock-out of CPSRP54 using the fully penetrant Chlorella CAS9 editing line
Knock-out of cpSRP54 using the fully penetrant Chlorella Cas9 editing line is described in U.S. Patent Application Publication 2016/0304896, which is incorporated by reference in its entirety. Briefly, a chimeric gRNA (SEQ ID NO:103), in which the last three nucleotides represent the PAM, was designed and synthesized in vitro to target the coding sequence of the chloroplast SRP54 gene in Chlorella.
GE-15699 was transformed by electroporation with 1-2 μg of purified chimeric guide RNA and 1 μg of selectable marker DNA containing the bleomycin resistance "BleR" gene codon optimized for Chlorella and containing introns from Chlorella (SEQ ID NO:70). The BleR gene is operably linked to the Chlorella RPS4 promoter (SEQ ID NO:71) and is terminated by the Chlorella RPS4 terminator (SEQ ID NO:72).
Electroporation was performed by inoculating a 100 mL seed culture to 1 × 10⁶ cells/mL six days before transformation, which was used to inoculate a 1 L culture to 1 × 10⁶ cells/mL two days before transformation. On the day of transformation, cells were pelleted by centrifugation at 5000 × g for 20 minutes, washed three times with 0.1 μm-filtered 385 mM sorbitol, and resuspended to 5 × 10⁹ cells/mL in 385 mM sorbitol. Electroporation of 100 μL of concentrated cells was performed in 0.2 cm cuvettes in a BioRad Gene Pulser Xcell™ under varied conditions. The DNA used to optimize electroporation was linearized pSG6640, containing BleR and TurboGFP expression cassettes. The TurboGFP cassette comprises the Chlorella ACP1 promoter (SEQ ID NO:67) operably linked to the TurboGFP gene (SEQ ID NO:24) and the Chlorella ACP1 terminator (SEQ ID NO:68). Immediately after electroporation of the pre-chilled cells and cuvettes, 1 mL of cold sorbitol was added and used to transfer the cells into 10 mL of PM074. After overnight recovery, the cells were concentrated and spread onto 13 cm diameter plates of PM074 medium containing 250 mg/L bleomycin (zeocin) and grown under the conditions listed in the biolistics section.
The electroporation conditions were 1.0-1.2kV (5000-. Using larger amounts of DNA increased the number of resulting bleomycin-resistant colonies, although the effect plateaued at amounts greater than 4 μg. After electroporation, the cells were plated on agar medium (PM130) containing 250 μg/mL bleomycin to select transformants that had incorporated the ble cassette. Transformants were screened by colony PCR using primers designed to amplify across the native targeted locus (oligo-AE596 and oligo-AE597). The primers were designed to produce a 700 bp band in the absence of integration into the locus (i.e., no "knock-in" of the BleR cassette), or a 4.3 kb band if a single ble cassette had integrated into the targeted locus. In addition, colony PCR was performed using primers designed to amplify a fragment extending from the cpSRP54 gene (oligo-AE597) into the selectable marker. Depending on the orientation of the integrated ble cassette, a 1.2 kb band results from amplification with primers 405/597 or primers 406/597, spanning from within the ble cassette out into the cpSRP54 gene. The results showed a high frequency of knock-in of the BleR cassette into the targeted locus in the absence of homology arms (between 40% and 45% in this sample). The cpSRP54 knockout resulted in a pale green phenotype.
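The band-size logic of the colony PCR screen can be sketched in a few lines of Python. This is only an illustration: the function names and the 10% size tolerance are assumptions introduced here, not part of the described protocol; only the 700 bp and 4.3 kb expected sizes come from the text.

```python
# Expected amplicon sizes (bp) for the locus-spanning primer pair
# (oligo-AE596 / oligo-AE597), as stated in the text.
EXPECTED_BANDS = {
    "no integration": 700,
    "single ble cassette integrated": 4300,
}


def classify_band(observed_bp, tolerance=0.10):
    """Return the label whose expected size lies within a fractional
    tolerance of the observed band (the tolerance is an assumed value)."""
    for label, expected_bp in EXPECTED_BANDS.items():
        if abs(observed_bp - expected_bp) <= tolerance * expected_bp:
            return label
    return "unexpected band size"


def knockin_frequency(band_sizes_bp):
    """Fraction of screened colonies whose band indicates cassette knock-in."""
    labels = [classify_band(bp) for bp in band_sizes_bp]
    return labels.count("single ble cassette integrated") / len(labels)


if __name__ == "__main__":
    # Hypothetical gel readings for five colonies.
    print(knockin_frequency([700, 4300, 690, 4350, 720]))
```

A frequency computed this way is directly comparable to the 40% to 45% knock-in rate reported for the sample above.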
Example 3
Knock-out of SGI2 using the fully penetrant Chlorella CAS9 editing line
Knock-out of SGI2 using the fully penetrant Chlorella Cas9 editing line was performed essentially as described above for cpSRP54. Briefly, a chimeric gRNA (SEQ ID NO:104), in which the last three nucleotides represent the PAM, was designed and synthesized in vitro to target the coding sequence of the SGI2 gene in Chlorella.
GE-15699 was transformed by electroporation with 1-2 μg of purified chimeric guide RNA and 1 μg of selectable marker DNA containing the bleomycin resistance "BleR" gene codon optimized for Chlorella and containing introns from Chlorella (SEQ ID NO:70). The BleR gene is operably linked to the Chlorella RPS4 promoter (SEQ ID NO:71) and is terminated by the Chlorella RPS4 terminator (SEQ ID NO:72).
Ble-resistant colonies were selected, and the knockout was confirmed by PCR.
Example 4
Knock-out of SGI1 using the fully penetrant Chlorella CAS9 editing line
SGI1 knockout strain 24183 was generated starting from the Cas9-expressing parent strain GE-15699. GE-15699 cells were electroporated with a chimeric gRNA (SEQ ID NO:105, in which the last three nucleotides represent the PAM) and a DNA cassette, shown in FIG. 10A, containing a codon-optimized Cre gene flanked by a nitrite reductase promoter and terminator. The cassette also contained the ble and GFP genes used previously, flanked by lox2272 sites. When Cre is expressed, the lox sites recombine, looping out the DNA between them. Sequences homologous to the SGI1 gene surrounding the CRISPR target were also placed at the ends of the cassette to enhance single-copy integration. The presence of the cassette in the SGI1 locus was confirmed by DNA sequencing, and single-copy integration was confirmed using ddPCR. The strain was then cultured in ammonium-free medium to induce Cre expression and thereby excise the lox-flanked ble and GFP genes.
Example 5
Double knock-out of SGI2 and CPSRP54 using the fully penetrant Chlorella CAS9 editing line
Double knock-out of SGI2 and cpSRP54 using the fully penetrant Chlorella Cas9 editing line was performed essentially as described above for cpSRP54. Briefly, two chimeric gRNAs, one directed to cpSRP54 (SEQ ID NO:69) and the other to SGI2 (SEQ ID NO:73), in each of which the last three nucleotides represent the PAM, were designed and synthesized in vitro to target the coding sequences of the cpSRP54 and SGI2 genes in Chlorella.
GE-15699 was transformed by electroporation with 1-2 μg of purified chimeric guide RNA and 1 μg of selectable marker DNA containing the bleomycin resistance "BleR" gene codon optimized for Chlorella and containing introns from Chlorella (SEQ ID NO:70). The BleR gene is operably linked to the Chlorella RPS4 promoter (SEQ ID NO:71) and is terminated by the Chlorella RPS4 terminator (SEQ ID NO:72).
Ble-resistant colonies were selected, and the knockouts were confirmed by PCR.
Example 6
Double knock-out of SGI1 and CPSRP54 using the fully penetrant Chlorella CAS9 editing line
The Chlorella SGI1 knockout strain 24183 described above was electroporated with a chimeric gRNA targeting cpSRP54 (SEQ ID NO:69) and a DNA cassette comprising the ble and GFP sequences (FIG. 10B) to generate a double knockout of SGI1 and cpSRP54. Ble-resistant colonies were selected, and the knockout was confirmed by PCR. Three double knockout strains were generated, STR 245638, STR 245640, and STR 245641, which were identical in photophysiological properties and physical phenotype.
Example 7
Generation of a Chlorella SGI1 knock-out strain comprising a single copy of the CAS9 Gene
The bleomycin resistance "BleR" gene, codon optimized for Chlorella and including introns from Chlorella (SEQ ID NO:70), a GFP gene, a Cre gene, lox sites, and a Cas9 gene were cloned into the pCC1BAC vector. The Cas9 gene was operably linked to the Chlorella RPS17 promoter, included 29 native PBP introns, and was located outside the lox2272 sites. The Cas9 gene was terminated by the Chlorella RPS17 terminator. The BleR gene is operably linked to the Chlorella RPS4 promoter (SEQ ID NO:71) and is terminated by the Chlorella RPS4 terminator (SEQ ID NO:72). The GFP gene is operably linked to the Chlorella ACP1 promoter and is terminated by the Chlorella ACP1 terminator. The Cre gene is operably linked to a Chlorella nitrite reductase promoter and a Chlorella nitrite reductase terminator. These genes are flanked by portions of the SGI1 (CheY) sequence that serve as sites for homologous recombination. FIG. 17 shows a schematic diagram of the recombinant pCC1BAC vector.
Transformation of WT chlorella host strains: STR00010
Cas9 gene WT chlorella host strain was co-transformed with gRNA targeting SGI1 gene (SEQ ID NO:74) and PvuI digested and spin-purified selection cassette (NAS00460, SEQ ID NO: 86).
The selection cassette (NAS00460) included a 1.7 kb vector backbone fragment upstream of the SGI1 homologous recombination (HR) up arm (corresponding to positions 1-1761 of SEQ ID NO:86), a non-vector portion downstream of the SGI1 HR down arm, the bleomycin resistance "BleR" gene codon optimized for Chlorella and containing introns from Chlorella (SEQ ID NO:70), the GFP gene (corresponding to positions 8260-8961 of SEQ ID NO:86), and the Cas9 gene. The selection cassette contains ble and GFP within the lox sites. The Cre gene (corresponding to positions 10418-13326 of SEQ ID NO:86) is codon optimized, contains 6 Chlorella nitrite reductase introns, and is under the control of a nitrite reductase inducible promoter (corresponding to positions 9906-10417 of SEQ ID NO:86). The Cre gene is terminated by a nitrite reductase terminator (corresponding to positions 13327-15140 of SEQ ID NO:86). The Cas9 gene, which contains 29 native PBP introns, corresponds to positions 15754-25918 of SEQ ID NO:86. The Cas9 gene is under the control of the Chlorella RPS17 promoter (corresponding to positions 15166-15753 of SEQ ID NO:86) and is located outside the lox sites. The Cas9 gene is terminated by the Chlorella RPS17 terminator (corresponding to positions 25919-26373 of SEQ ID NO:86).
The BleR gene is operably linked to the Chlorella RPS4 promoter (SEQ ID NO:71) and is terminated by the Chlorella RPS4 terminator (SEQ ID NO:72). The GFP gene is operably linked to the Chlorella ACP1 promoter (corresponding to positions 7688-8259 of SEQ ID NO:86) and is terminated by the Chlorella ACP1 terminator (corresponding to positions 8692-9830 of SEQ ID NO:86). The SGI1 HR up arm corresponds to positions 1762-3578 of SEQ ID NO:86. The SGI1 HR down arm corresponds to positions 26448-28447 of SEQ ID NO:86. The 5' lox2272 site corresponds to positions 3831-3864 of SEQ ID NO:86, and the 3' lox2272 site corresponds to positions 9839-9872 of SEQ ID NO:86. All of these elements lie between the 2 kb regions of homology upstream and downstream of the SGI1 CRISPR target.
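The coordinates given for SEQ ID NO:86 can be collected into a simple feature map, which makes the cassette layout easier to check. The following Python sketch is illustrative only: the feature labels and both helper functions are introduced here, and the BleR cassette is omitted because its positions within SEQ ID NO:86 are not stated in the text.

```python
# (label, start, end) positions within SEQ ID NO:86, as listed in the text.
FEATURES = [
    ("vector backbone", 1, 1761),
    ("SGI1 HR up arm", 1762, 3578),
    ("5' lox2272", 3831, 3864),
    ("ACP1 promoter", 7688, 8259),
    ("GFP gene", 8260, 8961),
    ("ACP1 terminator", 8692, 9830),
    ("3' lox2272", 9839, 9872),
    ("nitrite reductase promoter", 9906, 10417),
    ("Cre gene", 10418, 13326),
    ("nitrite reductase terminator", 13327, 15140),
    ("RPS17 promoter", 15166, 15753),
    ("Cas9 gene", 15754, 25918),
    ("RPS17 terminator", 25919, 26373),
    ("SGI1 HR down arm", 26448, 28447),
]


def adjacent_overlaps(features):
    """Return pairs of neighbouring features whose stated ranges overlap."""
    ordered = sorted(features, key=lambda f: f[1])
    found = []
    for (name_a, _, end_a), (name_b, start_b, _) in zip(ordered, ordered[1:]):
        if start_b <= end_a:
            found.append((name_a, name_b))
    return found


def between_lox_sites(features):
    """Return features lying entirely between the two lox2272 sites."""
    left = next(end for name, _, end in features if name == "5' lox2272")
    right = next(start for name, start, _ in features if name == "3' lox2272")
    return [name for name, start, end in features if start > left and end < right]


if __name__ == "__main__":
    print(adjacent_overlaps(FEATURES))
    print(between_lox_sites(FEATURES))
```

With the ranges exactly as printed, the GFP gene and the ACP1 terminator are reported as overlapping, which may reflect a typographical error in one of the stated ranges rather than a real overlap. The second helper shows that the GFP cassette falls between the lox2272 sites, while Cre and Cas9 fall outside them.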
Upon co-transformation of the SGI1 gRNA (SEQ ID NO:105) and the selection cassette (SEQ ID NO:86), the SGI1 gene was knocked out and the selection cassette comprising the Cas9, BleR and GFP genes was inserted into the SGI1 site by homologous recombination. The BleR and GFP genes are flanked by lox2272 sites, while the Cas9 and Cre genes of the selection cassette are located outside the lox2272 site, but within the portion of the SGI1 sequence that serves as the site of homologous recombination.
Once the selection cassette is inserted into the SGI1 locus, the Cre gene is operably linked to an inducible nitrite reductase promoter. Thus, Cre gene expression is induced when the microorganism is grown in a growth medium comprising nitrite. Upon Cre gene expression, the Cre enzyme acts at the lox2272 sites and removes the BleR and GFP sequences flanked by the lox sites. This results in a system in which selectable markers (e.g., GFP, or antibiotic markers such as BleR) can be reintroduced during subsequent transformations with other sequences.
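The marker-recycling step can be pictured with a toy model of Cre recombination between two directly repeated lox2272 sites. This Python sketch is a simplification under stated assumptions: the cassette is represented as an ordered list of element names (the order shown is illustrative), and `cre_excise` is a hypothetical helper, not a description of any real software.

```python
def cre_excise(cassette, site="lox2272"):
    """Model Cre recombination between two directly repeated lox sites:
    the segment between them is looped out and one lox site remains."""
    positions = [i for i, element in enumerate(cassette) if element == site]
    if len(positions) != 2:
        raise ValueError("this sketch expects exactly two lox sites")
    first, second = positions
    excised = cassette[first + 1:second]
    remaining = cassette[:first + 1] + cassette[second + 1:]
    return remaining, excised


if __name__ == "__main__":
    # Simplified, assumed ordering of the integrated cassette elements.
    integrated = [
        "SGI1 up arm", "lox2272", "BleR", "GFP", "lox2272",
        "Cre", "Cas9", "SGI1 down arm",
    ]
    remaining, excised = cre_excise(integrated)
    print(remaining)
    print(excised)
```

In this model the markers are removed while Cre and Cas9 stay at the locus, which is why the same BleR and GFP markers can be used again in a later transformation.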
Screening of transformed Chlorella strains for Cas9 insertion
Transformed Chlorella cells were plated for single colonies on selective plates containing ammonium to repress Cre expression, re-patched onto selective repressing plates, and screened for knockouts using PCR and GFP shift. The PCR primers used to confirm the knockout were as follows:
AE803:AGGCTACTCTCAGACATGACGGTGGCTCTG(SEQ ID NO:87)
ST815:GCCACAAATGAAGGTTGGCAGGGTCAGTGC(SEQ ID NO:88)
PCR-positive reactions were sent for sequencing to confirm the knockout (insertion of the cassette) and perfect HR. The inventors of the present application surprisingly and unexpectedly found that a single copy of the Cas9 gene had been inserted into the SGI1 locus.
Example 8
Triple knockout of SGI1, SGI2 and CPSRP54 using the fully penetrant single copy CAS9 editing line of Chlorella
The Chlorella SGI1 knockout strain STR24129 described above was generated with a single copy of Cas9 and Cre inserted into the SGI1 locus, using the SGI1 knockout guide sequence ACACCACCTTAAGGCACATGAGG (SEQ ID NO:89); the lox-flanked markers (ble/GFP) were then removed.
The SGI1 knockout strain STR24129 was used as the transformation host for knocking out the SGI2 and SRP54 genes. Host strain STR24129 was co-transformed with gRNAs targeting the SGI2 and SRP54 genes and a selection cassette (pSGE06866) amplified with Ultramers containing homologous recombination (HR) arms for each target (i.e., SRP54 and SGI2). The BleR gene is operably linked to the Chlorella RPS4 promoter (SEQ ID NO:71) and is terminated by the Chlorella RPS4 terminator (SEQ ID NO:72). The GFP gene is operably linked to the Chlorella ACP1 promoter and is terminated by the Chlorella ACP1 terminator. The selection cassette includes the ble and GFP markers surrounded by lox sites for potential marker recycling. When Cre is expressed, the lox sites recombine, looping out the DNA between them.
Transformed host cells were plated on selective plates, patched and single colonies were picked, and knockouts were screened using PCR. PCR positive reactions were sent to sequencing to confirm knock-out (insertion of cassette) of each target.
Fig. 16A and 16B show schematic diagrams of selection cassettes for knock-out of chlorella SRP54 and chlorella SGI 2. The sequences of gRNAs and ultramers with HR arms are shown below.
SRP54-EMRE3EUKT592650
gRNA sequence: GGCGTGGGACATGGTGCGCAAGG (SEQ ID NO:90)
Ultramer with HR arm to amplify pSGE 06866:
ST938_HR_SRP54-UP
TGAAGCACCCCCCGGCCTCTCCCCCCGCAGGGCCGCCCCTCCCGCCTCGTCGTGC(SEQ ID NO:91)
ST939_HR_SRP54-DOWN
CGCAACGCTCTCCCTCCCCACCCCCCAGCCTCACATCCGCCTCAAGCAGCGCCCTG(SEQ ID NO:92)
the primer sequence is as follows:
ST949_CasPipe9GT_SRP54-fwd:caagctatgcgaggaagggagggtc(SEQ ID NO:93)
ST950_CasPipe9GT_SRP54-rev:ctgccgcaagtgagtgtgctgtc(SEQ ID NO:94)
other primers used for screening-located in the selection cassette:
JV 946-linker 5-for: caccagatataggtgacccgataac (SEQ ID NO:95)
AE608 ble rev:AAAACTCCACTGCACCTGCAACAT(SEQ ID NO:96)
SGI2-EMRE3EUKT590485
gRNA sequence:
ST937_crRNA_064_EMRE3EUKT590485:TGCGGTGAAGCTTGGAGCTG(SEQ ID NO:97)
ultramer sequence with HR arm placed on PSGE06866
ST940_HR_SGI2-UP
TTGCCGTCGACGAGACTTCGGGGCGCGCATTTATCGACTCTCTTGAAGATACACCGGTT(SEQ ID NO:98)
ST941_HR_SGI2-DOWN
TCCAATTGTAGATATCATATTGTTTCCGGACCTACCTTACGCACTGAGTGCTGCCAGATGTTCTT(SEQ ID NO:99)
The primer sequence is as follows:
ST046CasPipe9GT-064-fwd:gaggtgggtggtagtgcttcgcgaggtg(SEQ ID NO:100)
ST047CasPipe9GT-064-rev:atcacagctcacagggcagacactgcgtc(SEQ ID NO:101)
Primers JV946 and AE608 were also used as screening primers.
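The examples note that the last three nucleotides of several guide target sequences represent the PAM. A minimal Python sketch of that convention follows, assuming the standard Streptococcus pyogenes Cas9 layout of a 20-nucleotide protospacer followed by an NGG PAM; the function name is hypothetical.

```python
import re

# NGG PAM at the 3' end of the target (assumed S. pyogenes Cas9 convention).
NGG_PAM = re.compile(r"[ACGT]GG$")


def split_protospacer_pam(target):
    """Split a 23-nt target into (20-nt protospacer, PAM). Sequences that are
    not 23 nt long or do not end in NGG are returned unchanged with PAM=None."""
    target = target.upper()
    if len(target) == 23 and NGG_PAM.search(target):
        return target[:-3], target[-3:]
    return target, None


if __name__ == "__main__":
    for name, seq in [
        ("SEQ ID NO:89", "ACACCACCTTAAGGCACATGAGG"),
        ("SEQ ID NO:90", "GGCGTGGGACATGGTGCGCAAGG"),
        ("SEQ ID NO:97", "TGCGGTGAAGCTTGGAGCTG"),
    ]:
        print(name, split_protospacer_pam(seq))
```

Under this assumption, SEQ ID NO:89 and SEQ ID NO:90 each split into a 20-nucleotide protospacer plus an AGG PAM, whereas SEQ ID NO:97 is 20 nucleotides long and is listed without a PAM.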
Example 9
Bioinformatic analysis of domain architecture of SGI2 protein
The domain architecture of exemplary SGI2 proteins from Chlorella, Oocystis, Tetraselmis, and Arabidopsis was analyzed using the online tool InterProScan (tool version 5.27, database version 66.0, from EMBL-EBI, Hinxton, Cambridgeshire, CB10 1SD, UK).
As shown in FIGS. 3-9, a single conserved response receiver domain was identified at the N-terminus of the SGI2 proteins.
Example 10
Bioinformatic analysis of response receiving domains of various SGI2 proteins
The Chlorella response receiver domain (SEQ ID NO:6) was aligned pairwise with orthologous proteins from other algal species and various plants by local alignment using the BLOSUM62 matrix, a gap opening penalty of 10, and a gap extension penalty of 0.5. The local alignments of the Chlorella response receiver domain (SEQ ID NO:6) with those of various photosynthetic organisms are shown in Table 5 below.
Table 5: Results of local alignments of the Chlorella response receiver domain with various orthologous proteins.
The Chlorella response receiver domain shows a higher percent identity with those of other algal species and a high degree of similarity with those of various plant species.
Example 11
Screening of Low-chlorophyll Chlorella strain WT-1185 mutant
Following knock-out of SGI1 or SGI2, or double knock-out of SGI1 and cpSRP54 or of SGI2 and cpSRP54, in Chlorella as described above, cells from pale colonies were selected and grown under low light (100 μmol photons m⁻² sec⁻¹) for one to five days, after which they were sorted by flow cytometry using a BD FACSAria II flow cytometer (BD Biosciences, San Jose, CA) to select cells with low chlorophyll fluorescence. Typically, the portion of cells with the lowest approximately 0.5% to 2% of chlorophyll fluorescence compared to the total cell population was selected. After the sorted cells were plated, the reduced-antenna cell lines isolated by flow cytometry were further screened, initially by visual selection of pale green or yellow colonies. To distinguish putative reduced-antenna cell lines from other reduced-pigment mutants and false positives, selected colonies were subjected to a medium-throughput secondary culture screen in which isolates were acclimated to low light conditions prior to photophysiological measurements. Chlorophyll fluorescence was monitored during low light acclimation to select clones that retained the reduced chlorophyll fluorescence characteristic of the high light acclimated state. When shifted from high light to low light, the selected clones showed only a small increase in chlorophyll (relative to wild-type cells).
Semi-continuous culture assays were performed using 165 mL cultures in 75 cm² tissue culture flasks under constant high light (about 1,700 μmol photons m⁻² sec⁻¹) to identify strains with increased productivity (an increased rate of biomass production, measured as total organic carbon (TOC) accumulation) relative to the wild-type progenitor strain WT-1185. Two 75 cm² flasks were inoculated with seed culture of a given mutant strain. The flasks had stoppers with tubing connected to syringe filters for delivering CO2-enriched air (1% CO2) that was bubbled through the cultures. The flasks were positioned with their width (narrowest dimension) against an LED light bank. The depth of the culture (the distance from the wall of the flask nearest the light source to the wall at the back of the flask) was about 8.0 cm. Cultures were diluted daily at the beginning of the light period by removing 65% of the culture volume and replacing it with fresh PM119 medium that had been diluted (212 mL deionized H2O added to 1 L PM119 medium) to adjust for the increase in salinity due to evaporation occurring in the cultures. Samples for TOC analysis were taken from the culture removed for dilution.
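The selection of the lowest-fluorescence fraction of cells can be illustrated with a small Python sketch. It is not the cytometer's gating software: the function name, the rank-based gate, and the synthetic data are assumptions for illustration, with only the roughly 0.5% to 2% fraction taken from the text.

```python
def low_fluorescence_gate(fluorescence, fraction=0.01):
    """Return the indices of events whose chlorophyll fluorescence falls in
    the lowest `fraction` of the population (at least one event is kept
    whenever the population is non-empty)."""
    if not 0 < fraction <= 1:
        raise ValueError("fraction must be in (0, 1]")
    n_keep = max(1, round(len(fluorescence) * fraction))
    ranked = sorted(range(len(fluorescence)), key=lambda i: fluorescence[i])
    return sorted(ranked[:n_keep])


if __name__ == "__main__":
    # Synthetic population of 1000 events with descending fluorescence values.
    events = list(range(1000, 0, -1))
    selected = low_fluorescence_gate(events, fraction=0.02)
    print(len(selected), selected[0], selected[-1])
```

Varying `fraction` between 0.005 and 0.02 corresponds to the range of gate stringency described above.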
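The arithmetic implied by a 65% daily dilution can be made explicit. The Python sketch below rests on assumptions that are not stated in the text: that a culture at steady state regrows to its pre-dilution density within one day, and that daily productivity can be estimated as the TOC concentration multiplied by the fraction removed. The function names are hypothetical.

```python
import math


def required_growth_rate(removed_fraction):
    """Specific growth rate (per day) needed to regrow to the pre-dilution
    density in one day after removing `removed_fraction` of the culture."""
    return math.log(1.0 / (1.0 - removed_fraction))


def daily_productivity(toc_mg_per_l, removed_fraction):
    """Estimated TOC harvested per liter of culture per day (assumed to be
    the TOC concentration times the fraction removed each day)."""
    return toc_mg_per_l * removed_fraction


def medium_fraction(medium_ml=1000.0, water_ml=212.0):
    """Fraction of full-strength PM119 in the diluted replacement medium."""
    return medium_ml / (medium_ml + water_ml)


if __name__ == "__main__":
    print(round(required_growth_rate(0.65), 3))  # about 1.05 per day
    print(round(medium_fraction(), 3))           # about 0.825
```

Removing 65% of the volume each day therefore requires the culture to increase its density nearly threefold per day to hold steady state, and the replacement medium is PM119 at roughly 83% of full strength.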
Example 12
Semi-continuous productivity assay for Chlorella mutants
Chlorella strains found to have reduced chlorophyll under low light conditions were assayed for increased productivity. In the productivity assay, photoautotrophic cultures of the mutants were grown over several days in constant light semi-continuous mode (CL-SCPA), with culture samples removed daily for biomass determination. Light was kept constant at 1900-2000 μmol photons m⁻² sec⁻¹ for 24 hours per day. In this assay, PM119 medium in a 225 cm² flask was inoculated with seed culture of a given mutant strain. Three cultures were initiated for each strain. Each flask contained a stir bar and had a stopper with tubing connected to a syringe filter for delivering CO2-enriched air (1% CO2) that was bubbled through the culture. The flasks were aligned with their width (narrowest dimension) against an LED light bank. The "depth" dimension of the flask, extending back from the light source, was 13.7 cm. Taking into account the positioning of the flask, the farthest distance of cells in the flask from the surface of the light source was about 15.5 cm. Cultures were diluted daily by removing 65% of the culture volume and replacing it with fresh PM119 medium that had been diluted to adjust for the increase in salinity due to evaporation occurring in the cultures. Samples for TOC analysis were taken from the culture removed for dilution. Once the cultures had reached steady state, the semi-continuous productivity assay was run for 12 days.
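By way of non-limiting illustration, the daily dilution regime described above fixes the average growth a culture must achieve to remain at steady state: after a fraction D of the culture is removed, the remaining biomass must increase by a factor of 1/(1 − D) before the next dilution. The following Python sketch computes the implied average specific growth rate. The function names are hypothetical and are not part of the assay as described; the calculation assumes only that the culture returns to the same density before each dilution.

```python
import math


def steady_state_growth_rate(dilution_fraction, interval_days=1.0):
    """Average specific growth rate (per day) implied by a steady-state dilution regime."""
    if not 0.0 < dilution_fraction < 1.0:
        raise ValueError("dilution_fraction must be between 0 and 1")
    # After dilution the flask holds (1 - D) of its biomass; at steady state it
    # must grow back by a factor of 1 / (1 - D) before the next dilution.
    return -math.log(1.0 - dilution_fraction) / interval_days


def doublings_per_day(dilution_fraction, interval_days=1.0):
    """Same quantity expressed as biomass doublings per day."""
    return steady_state_growth_rate(dilution_fraction, interval_days) / math.log(2.0)


mu_65 = steady_state_growth_rate(0.65)  # 65% daily dilution
mu_40 = steady_state_growth_rate(0.40)  # 40% daily dilution
print(f"65% dilution: mu = {mu_65:.3f} per day, {doublings_per_day(0.65):.2f} doublings/day")
print(f"40% dilution: mu = {mu_40:.3f} per day, {doublings_per_day(0.40):.2f} doublings/day")
```

Under these assumptions, a 65% daily dilution corresponds to roughly 1.05 per day (about 1.5 doublings per day), and a 40% daily dilution to roughly 0.51 per day.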
Productivity in the assay was assessed by measuring total organic carbon (TOC) in the samples removed daily. TOC was determined by diluting 2 mL of cell culture to a total volume of 20 mL with deionized water. Three injections per measurement were injected into a Shimadzu TOC-Vcsj analyzer for determination of total carbon (TC) and total inorganic carbon (TIC). The combustion furnace was set to 720 °C, and TOC was determined by subtracting TIC from TC. The 4-point calibration range was 2 ppm to 200 ppm, corresponding to 20-2000 ppm for undiluted cultures, with a correlation coefficient of r² > 0.999.
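As a non-limiting sketch of the arithmetic described above, TOC for the undiluted culture may be computed from the replicate TC and TIC readings and the 10-fold dilution (2 mL diluted to 20 mL). The function names and the example readings below are hypothetical and serve only to illustrate the calculation.

```python
CAL_MIN_PPM = 2.0    # lower end of the 4-point calibration range (diluted sample)
CAL_MAX_PPM = 200.0  # upper end of the 4-point calibration range (diluted sample)


def mean(values):
    return sum(values) / len(values)


def toc_ppm(tc_injections, tic_injections, dilution_factor=10.0):
    """TOC of the undiluted culture (ppm): (mean TC - mean TIC) x dilution factor."""
    return (mean(tc_injections) - mean(tic_injections)) * dilution_factor


def within_calibration(reading_ppm):
    """True if a reading on the diluted sample lies inside the calibrated range."""
    return CAL_MIN_PPM <= reading_ppm <= CAL_MAX_PPM


# Hypothetical triplicate injections of one diluted sample (ppm).
tc = [52.1, 51.9, 52.0]
tic = [4.0, 4.1, 3.9]
toc = toc_ppm(tc, tic)
print(f"TOC of undiluted culture: {toc:.1f} ppm")
print("TC readings within calibration:", all(within_calibration(x) for x in tc))
```

The default dilution factor of 10 reflects the 2 mL to 20 mL dilution and would need to be changed if a different dilution were used.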
Various embodiments of the present invention have been described. However, it is to be understood that elements of the embodiments described herein may be combined to form additional embodiments, and that various modifications may be made without departing from the spirit and scope of the invention. Accordingly, other embodiments, alternatives, and equivalents are within the scope of the invention described and claimed herein.
Example 13
Semi-continuous urea batch assay for Chlorella mutants
In the SCUBA (semi-continuous urea batch assay), photoautotrophic cultures of the mutants were grown for several days in nitrogen-replete, diel-light semi-continuous mode and then in nitrogen-depleted batch mode. The light was programmed to mimic an average May 4th day in the Imperial Valley, California, ranging from darkness to 2000 μmol photons m⁻² sec⁻¹ at noon. Samples were taken daily at "dusk". In this assay, 420 ml of urea-based PM153 medium in a 500 ml square flask was inoculated with seed culture of a given mutant strain.
PM152 is a nutrient-depleted medium based on PM074, but contains urea as a nitrogen source instead of nitrate. It is prepared by adding 1.3 ml of F/2 Algae Feed Part A (Aquatic Eco-Systems) and 1.3 ml of "Solution C" to a final volume of 1 liter of a solution of Instant Ocean salts (17.5 g/L) (Aquatic Eco-Systems, Apopka, Florida). Solution C is 38.75 g/L NaH2PO4·H2O, 758 mg/L thiamine hydrochloride, 3.88 mg/L vitamin B12, and 3.84 mg/L biotin.
Three cultures were initiated for each strain. Each flask contained a stir bar and had a stopper with tubing connected to a syringe filter for delivering CO2-enriched air (1% CO2) that was bubbled through the culture. The flasks were aligned with a 0.0875 m² light aperture, and the "depth" dimension of the flask, extending back from the light source, was 8 cm. For the semi-continuous biomass determination, cultures were diluted daily by removing 40% of the culture volume and replacing it with fresh PM153 medium that had been diluted to adjust for the increase in salinity due to evaporation occurring in the cultures. Samples for TOC analysis were taken from the culture removed for dilution. The semi-continuous productivity assay was run until the cultures reached steady state. After the semi-continuous phase, the cultures were removed from the assay, pelleted by centrifugation, and resuspended in 420 ml of nitrogen-depleted PM152 medium. The cultures were then grown in batch mode for 4-5 days under the same growth conditions as for the semi-continuous mode. During batch mode, FAME samples were collected to determine lipid productivity, and TOC samples were collected to determine FAME/TOC.
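For illustration only, one common way to express semi-continuous productivity on an areal basis is to divide the TOC harvested in the daily dilution volume by the illuminated aperture area. The text above does not state the formula used, so the calculation below is an assumption, and the function name and the example TOC value are hypothetical.

```python
def areal_toc_productivity(toc_mg_per_l, culture_volume_l, dilution_fraction,
                           aperture_area_m2, interval_days=1.0):
    """Areal TOC productivity in g m^-2 day^-1 (assumed formula, see lead-in).

    toc_mg_per_l       TOC of the culture removed at dilution (mg/L, i.e. ppm)
    culture_volume_l   total culture volume in the flask (L)
    dilution_fraction  fraction of the culture removed at each dilution
    aperture_area_m2   illuminated aperture area (m^2)
    """
    removed_volume_l = culture_volume_l * dilution_fraction
    harvested_g = toc_mg_per_l / 1000.0 * removed_volume_l
    return harvested_g / aperture_area_m2 / interval_days


# Flask geometry and dilution taken from the assay description; the TOC value
# of 500 mg/L is a hypothetical placeholder, not a measured result.
example = areal_toc_productivity(
    toc_mg_per_l=500.0,
    culture_volume_l=0.420,
    dilution_fraction=0.40,
    aperture_area_m2=0.0875,
)
print(f"Areal TOC productivity: {example:.2f} g m^-2 day^-1")
```

Because the result scales inversely with the aperture area, the area value entered has a direct effect on the reported productivity and should be checked against the actual light aperture used.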
Example 14
Chlorophyll content, antenna size, and photophysiology of Parachlorella SGI1 and SGI2 knockout mutants and SGI1/SRP54 and SGI2/SRP54 double knockout mutants
The chlorophyll content of the high-productivity mutants was determined by extracting cells with methanol and analyzing the supernatant by spectrophotometry. Briefly, 500 μl aliquots of culture were pipetted into 2.0 ml screw-cap tubes and pelleted using a benchtop microcentrifuge at 15,000 rpm for 10 minutes. The supernatant was aspirated from the pellets, and each pellet was resuspended in 1.5 ml of 99.8% methanol (previously neutralized with magnesium carbonate). 0.2 ml of glass beads (0.1 mm diameter) were added to each tube, and the tubes were bead-beaten for 3 minutes. 1.0 ml of supernatant was transferred to a new 1.7 ml flip-top tube and centrifuged in a benchtop microcentrifuge at 15,000 rpm for 10 minutes. The resulting pellets were white, indicating that extraction was complete. 0.8 ml of each supernatant was pipetted into an optical glass cuvette, and absorbance was read immediately at 720 nm, 665 nm, and 652 nm. Spectrophotometric measurements were performed in dual-beam mode using a 99.8% methanol blank. The following equations were used to calculate chlorophyll concentration: chlorophyll a [g m⁻³] = 16.72(A665 − A720) − 9.16(A652 − A720) and chlorophyll b [g m⁻³] = 34.09(A652 − A720) − 15.28(A665 − A720). The amounts of chlorophyll a and b were normalized on a per cell and per TOC basis. Although the amount of total chlorophyll per cell varied in the SGI1-2261 mutant, it was generally reduced by about 30% to about 65% relative to wild-type cells, consistent with the observed reduction in antenna size. On a per TOC basis, the reduction in total chlorophyll in the SGI1 mutants ranged from about 30% to about 50% relative to wild-type cells.
In addition to chlorophyll content, the PSII functional absorption cross section, the PSI functional absorption cross section, 1/τ'Qa (the light-saturated rate of electron transfer on the acceptor side of photosystem II, a measure of the efficiency of linear photosynthetic electron transfer at light saturation), and the maximum carbon fixation rate Pmax were determined. Cells of the wild-type and mutant strains were cultured in the constant light semi-continuous culture assay (CL-SCPA) described above.
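The chlorophyll calculation described above may be sketched as follows, by way of non-limiting example. The sketch assumes that the coefficient on the 652 nm term for chlorophyll a is negative, as in the published methanol equations from which these coefficients derive. The function names, the absorbance readings, and the simple volume conversion (1.5 ml of methanol extract per 0.5 ml of culture) are hypothetical illustrations.

```python
def chlorophyll_methanol(a665, a652, a720):
    """Chlorophyll a and b in the methanol extract (g m^-3, equal to ug/ml).

    Absorbances at 665 nm and 652 nm are corrected for turbidity by
    subtracting the reading at 720 nm.
    """
    c665 = a665 - a720
    c652 = a652 - a720
    chl_a = 16.72 * c665 - 9.16 * c652   # assumed sign on the 652 nm term, see lead-in
    chl_b = 34.09 * c652 - 15.28 * c665
    return chl_a, chl_b


def per_culture_volume(extract_ug_per_ml, extract_ml=1.5, culture_ml=0.5):
    """Convert an extract concentration to ug chlorophyll per ml of culture."""
    return extract_ug_per_ml * extract_ml / culture_ml


# Hypothetical absorbance readings for one extract.
chl_a, chl_b = chlorophyll_methanol(a665=0.500, a652=0.300, a720=0.010)
total_culture = per_culture_volume(chl_a + chl_b)
print(f"chl a = {chl_a:.3f} ug/ml extract, chl b = {chl_b:.3f} ug/ml extract")
print(f"total chlorophyll = {total_culture:.2f} ug per ml culture")
```

Dividing the per-culture value by cell density or by TOC of the same culture would give the per cell and per TOC normalizations referred to above.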
Various photosynthetic parameters were analyzed using the Fluorescence Induction and Relaxation (FIRe) technique, which was developed to measure a comprehensive suite of photosynthetic and physiological characteristics of photosynthetic organisms (Gorbunov and Falkowski (2005) "Fluorescence Induction and Relaxation (FIRe) Technique and Instrumentation for Monitoring Photosynthetic Processes and Primary Production in Aquatic Ecosystems", in: Photosynthesis: Fundamental Aspects to Global Perspectives, Proceedings of the 13th International Congress of Photosynthesis, Montreal, Aug. 29-Sep. 3, 2004 (eds. A. van der Est and D. Bruce), Allen Press, Vol. 2, pp. 1029-1031). The FIRe technique relies on the measurement and analysis of chlorophyll "variable fluorescence" profiles (reviewed by Falkowski et al., 2004, "Development and Application of Variable Chlorophyll Fluorescence Techniques in Marine Ecosystems", in: Chlorophyll a Fluorescence: A Signature of Photosynthesis (eds. C. Papageorgiou and Govindjee), Springer, pp. 757-778), which depend on the relationship between chlorophyll fluorescence and the efficiency of photosynthetic processes. The technique provides a set of parameters that characterize photosynthetic light-harvesting processes, the photochemistry in PSII, and photosynthetic electron transport down to carbon fixation. A FIRe device is commercially available from a supplier in Canada (satlantic.com and planet-ocean.co.uk). Further information on the use of the FIRe device is provided in the company's manuals.
All measurements were performed using constant light (2000 μmol photons m⁻² sec⁻¹) semi-continuous cultures (CL-SCPA) (see Example 3). To obtain FV/FM and σPSII, fluorescence induction and relaxation (FIRe) kinetic measurements were performed in the dark. The values of FV/FM and σPSII presented in Table 6 were calculated as the average of 6 measurements (3 measurements for each of 2 biological replicates), and the error in these parameters did not exceed 5%.
The PSI cross section was measured using a modified JTS-10 spectrometer with a filter set for measuring the electrochromic shift (ECS) at 520 nm, equipped with a custom-built single turnover flash (STF) lamp. The peak power density in the sample cell was high enough to ensure full closure of the reaction centers within about 10 μs. The resulting excitation rate was about 1-3 hits per reaction center per 10 μs (depending on the functional absorption cross section of the photosystem). The STF generated short, ultra-bright pulses of blue light (455 nm, with a 30 nm half-bandwidth), and pulse timing was controlled by the trigger from the JTS-10 spectrometer. The pulse duration was controlled by the STF pulse control box and was adjustable in the range of 1 to 50 μs using a potentiometer on the front panel. To measure the PSI cross section, cultures were diluted to an OD of about 0.2 at the chlorophyll absorption maximum (about 440 nm), based on measurement of the absorption spectrum of the cell suspension using a Perkin Elmer Lambda 650 spectrophotometer equipped with an integrating sphere. ECS was measured using 10 μs flashes with intensities ranging from 4000 to 120,000 μmol photons m⁻² s⁻¹ in the presence of DCMU and hydroxylamine. The experimental curve was fitted with a simple exponential function:
$$ECS = ECS_M \left(1 - e^{-I_t\,\sigma_{PSI}}\right)$$
where ECS_M is the maximum ECS signal; I_t is the photon density in photons m⁻²; and σPSI is the functional cross section of PSI. The value of the PSI functional cross section obtained for wild-type Chlorella (WT-1185) was (4.0 ± 0.5) × 10⁻¹⁸ m². This value is approximately the same as that obtained for the functional cross section of PSII in cultures grown under the same conditions (σPSII = (4.3 ± 0.1) × 10⁻¹⁸ m²). The error estimate for these parameters does not exceed 20%.
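By way of non-limiting illustration, the flash-saturation fit described above may be sketched in Python. The sketch assumes a saturating exponential of the form ECS = ECS_M × (1 − exp(−I_t × σPSI)), with the symbols as defined in the text, and converts flash intensity to photon density by multiplying by the 10 μs flash duration. The fitting routine (a refined grid search over log10 σ with a closed-form solution for ECS_M), the function names, and the synthetic, noise-free data are hypothetical and are not the procedure actually used.

```python
import math

AVOGADRO = 6.02214076e23  # photons per mole


def flash_photon_density(intensity_umol_m2_s, flash_s=10e-6):
    """Photons per m^2 delivered by a single flash of the given intensity."""
    return intensity_umol_m2_s * 1e-6 * AVOGADRO * flash_s


def ecs_model(photon_density, ecs_max, sigma):
    """Assumed saturation model: ECS = ECS_M * (1 - exp(-I_t * sigma))."""
    return ecs_max * (1.0 - math.exp(-photon_density * sigma))


def fit_sigma(photon_densities, ecs_values, lo=-20.0, hi=-16.0, rounds=4, n=201):
    """Least-squares estimate of (sigma, ecs_max) by grid search over log10(sigma)."""
    best = None  # (sum of squared errors, log10 sigma, ecs_max, sigma)
    for _ in range(rounds):
        step = (hi - lo) / (n - 1)
        for i in range(n):
            log_sigma = lo + i * step
            sigma = 10.0 ** log_sigma
            f = [1.0 - math.exp(-p * sigma) for p in photon_densities]
            # For a fixed sigma the optimal ECS_M follows from linear least squares.
            ecs_max = sum(y * fi for y, fi in zip(ecs_values, f)) / sum(fi * fi for fi in f)
            sse = sum((y - ecs_max * fi) ** 2 for y, fi in zip(ecs_values, f))
            if best is None or sse < best[0]:
                best = (sse, log_sigma, ecs_max, sigma)
        # Narrow the search window around the best grid point and repeat.
        lo, hi = best[1] - step, best[1] + step
    return best[3], best[2]


# Synthetic, noise-free data generated with sigma = 4.0e-18 m^2 and ECS_M = 1.
intensities = [4000, 10000, 20000, 40000, 80000, 120000]  # umol photons m^-2 s^-1
densities = [flash_photon_density(i) for i in intensities]
signals = [ecs_model(p, 1.0, 4.0e-18) for p in densities]

sigma_fit, ecs_max_fit = fit_sigma(densities, signals)
print(f"sigma_PSI = {sigma_fit:.3e} m^2, ECS_M = {ecs_max_fit:.3f}")
```

With a cross section of about 4 × 10⁻¹⁸ m², the stated intensity range corresponds to roughly 0.1 to 3 hits per reaction center per 10 μs flash, which is consistent with the excitation rate given in the text. A general-purpose nonlinear least-squares routine could be substituted for the grid search.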
The carbon fixation rate (¹⁴C Pmax) was measured using cultures normalized to 5 μg chl ml⁻¹ in medium containing 0.5 g l⁻¹ (5.95 mM) sodium bicarbonate. 20.4 μCi ml⁻¹ of ¹⁴C-labeled sodium bicarbonate was added to each culture, and the cultures were exposed to 2500 μE for 10 minutes. Samples were immediately acidified with 2 N HCl and allowed to off-gas overnight. The next day, the samples were measured and quantified using a Beckman LS6500 scintillation counter.
τ'Qa (the electron transfer time on the acceptor side of PSII measured under saturating light, effectively determined by the slowest step of linear photosynthetic electron transfer) was measured from FIRe light curves and dark-induced relaxation kinetics (DIRK) spectra. The volumetric PSII concentration relative to wild type was estimated as (Fv/σ530PSII). The error estimate for these parameters does not exceed 15%. The optical absorption cross section (averaged over the emission spectrum of the light source) was estimated using the following equation:
where [Chl/TOC] is the chlorophyll/TOC of the sample, OD(λ) is the optical density of the sample measured at wavelength λ, Δl is the measuring beam path length in the cuvette (1 cm), and I(λ) is the intensity of the light source used to grow the algae at wavelength λ.
Table 6. Fluorescence and photosynthetic parameters measured by the FIRe technique
Photophysiological data, chlorophyll content, and productivity data were summarized and evaluated for the wild-type Chlorella strain WT-1185, single knockouts of the SRP54 and SGI2 genes, and double knockouts of the SGI2 and SRP54 genes. All measurements were performed using CL-SCPA cultures. To obtain FV/FM and σPSII, fluorescence induction and relaxation (FIRe) kinetic measurements were performed in the dark. The values presented for FV/FM and σPSII were calculated as the average of 6 measurements (3 measurements for each of 2 biological replicates); the error in these parameters did not exceed 5%. τ'Qa (the electron transfer time on the acceptor side of PSII measured under saturating light, effectively determined by the slowest step of linear photosynthetic electron transfer) was measured from FIRe light curves and DIRK spectra. PSI cross sections were measured as described above. The results are summarized in Table 7 below.
Table 7. Photophysiology, chlorophyll, and productivity data
The functional absorption cross section of PSII was significantly reduced (by 50%), and the number of functional PSII complexes was also reduced. The cells had increased carbon fixation capacity (a 26% increase in Pmax). Single knockouts of SGI2 or SRP54 showed at least a 17% increase in TOC productivity compared to the wild-type strain. Overall, the SGI2/SRP54 double knockout strains showed a 32% increase in TOC productivity (both SGI2/SRP54 double knockout strains run in the CL-SCPA assay showed productivity >40 g/m²/day). This was the highest productivity increase observed for Chlorella and was higher than the average increase for the SRP54 or SGI2 single knockouts, as shown in FIG. 11. The results indicate that knocking out both the SGI2 and SRP54 genes appears to have a synergistic effect on productivity.
Photophysiological data were evaluated for the wild-type Chlorella strain WT-1185, single knockouts of the SRP54 and SGI1 genes, and three strains with double knockouts of the SGI1 and SRP54 genes. All measurements were performed using CL-SCPA cultures. To obtain FV/FM and σPSII, fluorescence induction and relaxation (FIRe) kinetic measurements were performed in the dark. The values presented for FV/FM and σPSII were calculated as the average of 6 measurements (3 measurements for each of 2 biological replicates); the error in these parameters did not exceed 5%. τ'Qa (the electron transfer time on the acceptor side of PSII measured under saturating light, effectively determined by the slowest step of linear photosynthetic electron transfer) was measured from FIRe light curves and DIRK spectra. The results are summarized in Table 8.
Table 8. Photophysiology of Chlorella strains
Compared to the SGI1 or SRP54 single knockouts, the functional cross section of PSII in the SGI1/SRP54 double knockout strains was significantly reduced, and the light-saturated electron transfer time (τ'Qa) decreased, indicating an increased rate of photosynthesis. The number of functional PSII complexes was also increased. The maximum quantum yield of photochemistry in photosystem II (FV/FM) was improved in the double knockout strains compared to the SRP54 or SGI1 single knockouts.
Example 15
Microanalysis of SGI1/SGI2, SGI1/SRP54 and SGI1/SGI2/SRP54 knockout mutants
To determine the overall biomass composition of the SGI1/SGI2, SGI1/SRP54, and SGI1/SGI2/SRP54 knockout mutants, quantitative analysis was performed on samples from cultures grown in semi-continuous mode with 40% daily dilution to determine the total organic carbon (TOC) and lipid content of the cells. After the cultures reached steady state, aliquots of the culture removed for daily dilution were used for analysis of lipids, proteins, and carbohydrates. The TOC of the algal culture samples was determined by diluting 2 mL of cell culture to a total volume of 20 mL with deionized water. Three injections per measurement were injected into a Shimadzu TOC-Vcsj analyzer for determination of total carbon (TC) and total inorganic carbon (TIC). The combustion furnace was set to 720 °C, and TOC was determined by subtracting TIC from TC. The 4-point calibration range was 2 ppm to 200 ppm, corresponding to 20-2000 ppm for undiluted cultures, with a correlation coefficient of r² > 0.999.
To determine lipid content, FAME analysis was performed on 2 mL samples that were dried using a GeneVac HT-4X. The following were added to the dried pellets: 500 μL of 500 mM KOH in methanol, 200 μL of tetrahydrofuran containing 0.05% butylated hydroxytoluene, 40 μL of a 2 mg/ml C11:0 free fatty acid/C13:0 triglyceride/C23:0 fatty acid methyl ester internal standard mix, and 500 μL of glass beads (425-600 μm diameter). The vials were capped with open-top PTFE septum-lined caps and placed in a SPEX GenoGrinder at 1.65 krpm for 7.5 minutes. The samples were then heated at 80 °C for five minutes and allowed to cool. For derivatization, 500 μL of 10% boron trifluoride in methanol was added to the samples prior to heating at 80 °C for 30 minutes. The tubes were allowed to cool before 2 mL of heptane and 500 μL of 5 M NaCl were added. The samples were then vortexed for five minutes at 2 krpm and finally centrifuged for three minutes at 1 krpm. The heptane layer was sampled using a Gerstel MPS autosampler. Quantitation used the 80 μg of C23:0 FAME internal standard.
FIGS. 12A and 12B show the semi-continuous areal TOC productivity and batch TOC, respectively, measured for the Parachlorella wild-type strain (STR00010), the SRP54 knockout mutant (STR00625), the SGI1 knockout mutant (STR24183), and the SGI1/SRP54 double knockout mutants (STR 245638 and STR 245634). The SRP54 knockout mutant, the SGI1 knockout mutant, and the SGI1/SRP54 double knockout mutants showed increased TOC productivity relative to the wild-type strain.
FIGS. 13A and 13B show the semi-continuous areal TOC productivity and batch TOC, respectively, measured for the Chlorella wild-type strain (STR00010), the SRP54 knockout mutant (STR00625), the SGI1 knockout mutant (STR00012), the SGI2/SRP54 double knockout mutant (STR00516), and the SGI1/SGI2/SRP54 triple knockout mutants (STR25761 and STR25762). The SGI1 knockout mutant, the SGI2/SRP54 double knockout mutant, and the SGI1/SGI2/SRP54 triple knockout mutants showed increased TOC productivity relative to the wild-type strain.
FIG. 14 shows the results of batch FAME productivity assays for the Chlorella wild-type strain (STR00010), the SRP54 knockout mutant (STR00625), the SGI1 knockout mutant (STR24183), and the SGI1/SRP54 double knockout mutants (STR 24528138 and STR 245840). The SGI1 and SGI1/SRP54 knockout mutants showed increased FAME productivity relative to the wild-type strain.
FIG. 15 shows the results of batch FAME productivity assays for the Chlorella wild-type strain (STR00010), the SGI1 knockout mutant (STR00012), the SGI2/SRP54 double knockout mutant (STR00516), and the SGI1/SGI2/SRP54 triple knockout mutants (STR25761 and STR25762).
The headings in this application are for the convenience of the reader only and do not limit the scope of the invention or its embodiments in any way.
All publications and patent applications mentioned in this specification are herein incorporated by reference to the same extent as if each individual publication or patent application was specifically and individually indicated to be incorporated by reference.
Sequence listing
<110> Synthetic Genomics, Inc.
I. Ajiawei tea
F.i. kutzminoff
R.R. Ladacokawitz
J.H. Weiluo
S. bauz
Sprenfelidae
Lambert of W.F
Grainer J.N
<120> Gene Regulation of photosynthetic organisms for improved growth
<130>SGI2140-2WO
<150>US 62/690,205
<151>2018-06-26
<150>US 62/612,251
<151>2017-12-29
<160>105
<170> PatentIn version 3.5
<210>1
<211>4531
<212>DNA
<213> genus Chlorella
<400>1
atgtctggtt cagctggatc gggccaggct actctcagac atgacggtgg ctctgctggc 60
ggcagtgggc ctgtctcaga cggtttttca ccggccggcc tgaaggtaaa gtagaaagac 120
actcatacac atcttggttc ggcgttgaaa gtaggtcatt aacatactct ataaccaata 180
tttgtaggtt ctggtcgtgg acgacgacct catgtgcctt aaggtggtgt cagccatgtt 240
gaagaggtgc agctatcaag gtgaggtctt tactggtgtc tgttattgct gtaacatcat 300
ttcgctgttg cacaatttaa acatttgtaa tttactgttg ttattgcagt ggccacttgt 360
agcagtggca gcgaggcact gacacttcta cgtgaacgca acgaggacgg atcctccgac 420
cagttcgacc tcgtactgtc agatgtttac atgccgggta tgtcgtattc ctttgtaaac 480
tttacaatat gcgtctagtt tgacgcgtac actttgtaca ctttgcaaaa acgcaccctg 540
cgaggtctgc catttggtca ctacaacttg gccaccttgg ttgcaagttt gcaagttcgc 600
tctacgtcaa cgctgcaaaa tgaaccaatt gttttgcact gaccctgcca accttcattt 660
gtggctgcag acatggacgg tttcaagctg cttgaacaca tcggtctaga gttggagctt 720
cccgttatca gtaagttgat cgagccgagt ccagagcgaa gcctgcttct atactattag 780
cagctgtctt ttgatatttg acagcttgac ttgatatggt cacagagcat acttgcaacc 840
aggttacctg ttgaactagc aactgtgccc aagcatctct tcaagcacct ccgtcagtcc 900
atagggtact gttgatttgt actctgcaat actgcactgt aatgcgctgt gaatcactgc 960
ccttcacctc tagatggtgc ttccctggag ccctccccca cctccgcctc aagcccctca 1020
catgcctctc ccccccctgc agtgatgtca tccaacgggg acacgaatgt cgtgctgcgg 1080
ggggtcaccc acggggctgt ggactttctg atcaagcccg ttcgaattga ggagctgcgg 1140
aacgtgtggc agcacgtggt gcgtcgtcgt tccatggcgc tggccaggac gccagacgag 1200
gggggacact cggacgagga ctctcaggtg cccttggcag cttctgggcg gcttgctgtg 1260
tcggatgcca cttggactgg ggatgcacga ggggtggggg gacaatggga gatgggccat 1320
agtaggccag agttgatggc agtggtggtg ggggggagta ggcgggagag aagcagccat 1380
cctggtgttg gttttgatga ttgagtgcat ggggatgatg cacaggtgag ctgactggat 1440
gccttgtctt gctgtgctgc gctgcagcgg cacagtgtga aacgcaagga gtcggagcag 1500
agcccgctgc agctcagcac agagcagggc gggaacaaga agccaagagt ggtgtggtcg 1560
gtggagatgc accaacaggt gtgcttgcgg gcgggtgtat acgggggagg ggggccagct 1620
gctggctgac ctggcgtgcg cggtgcattg cacttggcga tgaggggcgt gcttcagtat 1680
gtagctggga cgcaattggt tgtgctgtgt gaccagtgca caaaatacat ccctgaattc 1740
cagtgggttg aacagagttg tcctggaggt gggaagcaaa cgcgcacgtg gtagagggga 1800
gcagggtgca gaacagccgc agcaggggtg ttgcgcagtg tgcaggtatc ctgcctccat 1860
gccccgggcc atgggcatac tacgctggta ccgtcaggat gggcgttgag cctggcttgg 1920
ggggcagggg gcgagcgaat gcggaatggg agcggcaggt gctgggaggg tggctgactg 1980
gcttgcagga gcgcaagtcc tgtcgggggc gtcgtcctgt tccctcctgc ccgcttcacc 2040
cacgttcact ctcatgcctc cacactcctg ctgctgacac acctgtcgcc acctccgctg 2100
cagtttgtga acgcggtcaa ctccctgggc attgacaagg cggtgcccaa gcggattctg 2160
gacctgatga acgtggaggg gctgacgcgc gagaacgtgg ccagccatct gcaggtgcct 2220
gccatgaccc ctcccaccag ggacctggtg ttttgacacc ctggaactcc tctttgacgg 2280
agcctccagt tcaattccag caatcgaatt gaatcaaaaa gcatgtgcac ccacgtgctg 2340
tttgaatgtc ccatgtggta ggaaacacaa ctgccccctt gccatttgct ggagggtgcc 2400
cgctgcgcca tgcccgagtg cgctgtgctc agcgttgtgc tgcgcccccc gctgactgaa 2460
gctgacagcg tgcggctgag gagggtactg ggggaggggg ggtgggaggc ggccgctggc 2520
ggcggaaggg agggtgtgca cgcatggaca cagggccttt ccgccctgca cggcctctac 2580
tgcaccctgc cacgtgatgt atcgacatgg tgggccatgc tgtgctgtgc cgctgcagaa 2640
gtaccgcctg tacctgaagc gggtggaggg agtgcaatcg ggtgcggcag cctccaagca 2700
gcaccagcac ccgcagtatc accagcagca gcagcagcag caagcgcaac ctcgtgcagc 2760
tgtctcccct gcagcagctt cctttggtgc cctttccttg ggagccccgc agcaggcgca 2820
gcagggcatg ccgcagctgg ggatgcctgt gcaggtgaag actgcccccc cccccctccc 2880
cctttccatc ttccctccat cagcctgctg ttccttaccc ttgtcaaccc gtctctcctt 2940
tttcgcaagc agcgcaccac cccccatgca cgccttgcct ggcactgttg tcagctgccc 3000
ccctagaaat acacaaggtg tgggtgcaac tggtgggacc ccctcccccc cccccctggg 3060
gctgcagggt ctccctccaa acttggcagc catgggatcc cagccgccgc acatcccctt 3120
ccagcaggcc ctggccatgc aggcggcggc tgcggcggct gcagccagcg gcgcgctccc 3180
cgggagtctg cccccctaca tgccaccccc ggggatgatg ccccccggca tgccgggggg 3240
ggtccccggt atgggagggg tggtggggca tcctcaggta cgggcagcac atgagtgggc 3300
aggggtattg gagaggggaa gggcagggag gttgcatgtg aggggctgca tggcaaagag 3360
gctgcagcgc aggtgttgct tgcagcactt cccctcggtg gcgcttgcat caaattttga 3420
atcctccccc gatgggcacg cccgtgtgtg ggggggggtg ggatggggga tgggggtggt 3480
tttgtggcat gtcgggcgct ttcatctacc cgggcccctg cccctgcctg tacgcgtgcg 3540
catgtgtgca gatgcccgcc ccagggatgg actttgcggg tttcaacggg tatggcaacg 3600
ctgcgggggg gctgatgttt ggcgggcagc agcaggcgca gcacgcgcag cagcacgcgt 3660
cagcgcaagc gggctcgctg gcgcagcagc aggcgcagca agtatccatg ggcttgggcc 3720
ttatgccccc cccgttgggg ttcccgccca cctcgctcgc cgcgccagcc ccgcgctccg 3780
cagcaactga gcccgccgca gccccactcc ccctgacgtc ctcgccgcca gctgcttcag 3840
caggcggcag cggcggccca gcagcagctg ctccgcagca cagcagcggc gccgcagcag 3900
cccaagcccc ccatcaccac ccacagtgct cggagcaggg agcggggggg ctcccgcccc 3960
cgctgcccgc gtccagcgcc ccgcagtcct atcccctccc tcccccctcc tcgcaggccg 4020
ctttgcatga cccggacgaa cactaccccc caggctcggc agaggtgagc acgtcccccc 4080
gccccctccc cccccccccc cccccttccc ttcaccctgg cttggcgtgc aatgaaaccc 4140
taaataaccc taaaacctca ttatcagttg caaattggac ccgtgaagcg ggcgggggca 4200
actgcgctct gctggtgtca gcgctgtctc tgccggttcc tgcccagcgt gcgcctgcat 4260
gcaagggggg atgggggggg ggaggcattt aacaataggc cagtcatctc caatccaccg 4320
tcaatttcag ccccctcccc ccccctccct catccccttg cagatgcacc accagcacct 4380
cccagggctg tgtggcttta acccggacga cctgctgggg gggcagctgg gggacatggg 4440
gttcctgggg gagctggggg gggcggtggg aggaaagcac gaacaggacg acttcctgga 4500
cctgctgctg aagggggagg aggagctgtg a 4531
<210>2
<211>1860
<212>DNA
<213> genus Chlorella
<400>2
atgtctggtt cagctggatc gggccaggct actctcagac atgacggtgg ctctgctggc 60
ggcagtgggc ctgtctcaga cggtttttca ccggccggcc tgaaggttct ggtcgtggac 120
gacgacctca tgtgccttaa ggtggtgtca gccatgttga agaggtgcag ctatcaagtg 180
gccacttgta gcagtggcag cgaggcactg acacttctac gtgaacgcaa cgaggacgga 240
tcctccgacc agttcgacct cgtactgtca gatgtttaca tgccggacat ggacggtttc 300
aagctgcttg aacacatcgg tctagagttg gagcttcccg ttatcatgat gtcatccaac 360
ggggacacga atgtcgtgct gcggggggtc acccacgggg ctgtggactt tctgatcaag 420
cccgttcgaa ttgaggagct gcggaacgtg tggcagcacg tggtgcgtcg tcgttccatg 480
gcgctggcca ggacgccaga cgagggggga cactcggacg aggactctca gcggcacagt 540
gtgaaacgca aggagtcgga gcagagcccg ctgcagctca gcacagagca gggcgggaac 600
aagaagccaa gagtggtgtg gtcggtggag atgcaccaac agtttgtgaa cgcggtcaac 660
tccctgggca ttgacaaggc ggtgcccaag cggattctgg acctgatgaa cgtggagggg 720
ctgacgcgcg agaacgtggc cagccatctg cagaagtacc gcctgtacct gaagcgggtg 780
gagggagtgc aatcgggtgc ggcagcctcc aagcagcacc agcacccgca gtatcaccag 840
cagcagcagc agcagcaagc gcaacctcgt gcagctgtct cccctgcagc agcttccttt 900
ggtgcccttt ccttgggagc cccgcagcag gcgcagcagg gcatgccgca gctggggatg 960
cctgtgcagg gtctccctcc aaacttggca gccatgggat cccagccgcc gcacatcccc 1020
ttccagcagg ccctggccat gcaggcggcg gctgcggcgg ctgcagccag cggcgcgctc 1080
cccgggagtc tgccccccta catgccaccc ccggggatga tgccccccgg catgccgggg 1140
ggggtccccg gtatgggagg ggtggtgggg catcctcaga tgcccgcccc agggatggac 1200
tttgcgggtt tcaacgggta tggcaacgct gcgggggggc tgatgtttgg cgggcagcag 1260
caggcgcagc acgcgcagca gcacgcgtca gcgcaagcgg gctcgctggc gcagcagcag 1320
gcgcagcaag tatccatggg cttgggcctt atgccccccc cgttggggtt cccgcccacc 1380
tcgctcgccg cgccagcccc gcgctccgca gcaactgagc ccgccgcagc cccactcccc 1440
ctgacgtcct cgccgccagc tgcttcagca ggcggcagcg gcggcccagc agcagctgct 1500
ccgcagcaca gcagcggcgc cgcagcagcc caagcccccc atcaccaccc acagtgctcg 1560
gagcagggag cgggggggct cccgcccccg ctgcccgcgt ccagcgcccc gcagtcctat 1620
cccctccctc ccccctcctc gcaggccgct ttgcatgacc cggacgaaca ctacccccca 1680
ggctcggcag agatgcacca ccagcacctc ccagggctgt gtggctttaa cccggacgac 1740
ctgctggggg ggcagctggg ggacatgggg ttcctggggg agctgggggg ggcggtggga 1800
ggaaagcacg aacaggacga cttcctggac ctgctgctga agggggagga ggagctgtga 1860
<210>3
<211>619
<212>PRT
<213> genus Chlorella
<400>3
Met Ser Gly Ser Ala Gly Ser Gly Gln Ala Thr Leu Arg His Asp Gly
1 5 10 15
Gly Ser Ala Gly Gly Ser Gly Pro Val Ser Asp Gly Phe Ser Pro Ala
20 25 30
Gly Leu Lys Val Leu Val Val Asp Asp Asp Leu Met Cys Leu Lys Val
35 40 45
Val Ser Ala Met Leu Lys Arg Cys Ser Tyr Gln Val Ala Thr Cys Ser
50 55 60
Ser Gly Ser Glu Ala Leu Thr Leu Leu Arg Glu Arg Asn Glu Asp Gly
65 70 75 80
Ser Ser Asp Gln Phe Asp Leu Val Leu Ser Asp Val Tyr Met Pro Asp
85 90 95
Met Asp Gly Phe Lys Leu Leu Glu His Ile Gly Leu Glu Leu Glu Leu
100 105 110
Pro Val Ile Met Met Ser Ser Asn Gly Asp Thr Asn Val Val Leu Arg
115 120 125
Gly Val Thr His Gly Ala Val Asp Phe Leu Ile Lys Pro Val Arg Ile
130 135 140
Glu Glu Leu Arg Asn Val Trp Gln His Val Val Arg Arg Arg Ser Met
145 150 155 160
Ala Leu Ala Arg Thr Pro Asp Glu Gly Gly His Ser Asp Glu Asp Ser
165 170 175
Gln Arg His Ser Val Lys Arg Lys Glu Ser Glu Gln Ser Pro Leu Gln
180 185 190
Leu Ser Thr Glu Gln Gly Gly Asn Lys Lys Pro Arg Val Val Trp Ser
195 200 205
Val Glu Met His Gln Gln Phe Val Asn Ala Val Asn Ser Leu Gly Ile
210 215 220
Asp Lys Ala Val Pro Lys Arg Ile Leu Asp Leu Met Asn Val Glu Gly
225 230 235 240
Leu Thr Arg Glu Asn Val Ala Ser His Leu Gln Lys Tyr Arg Leu Tyr
245 250 255
Leu Lys Arg Val Glu Gly Val Gln Ser Gly Ala Ala Ala Ser Lys Gln
260 265 270
His Gln His Pro Gln Tyr His Gln Gln Gln Gln Gln Gln Gln Ala Gln
275 280 285
Pro Arg Ala Ala Val Ser Pro Ala Ala Ala Ser Phe Gly Ala Leu Ser
290 295 300
Leu Gly Ala Pro Gln Gln Ala Gln Gln Gly Met Pro Gln Leu Gly Met
305 310 315 320
Pro Val Gln Gly Leu Pro Pro Asn Leu Ala Ala Met Gly Ser Gln Pro
325 330 335
Pro His Ile Pro Phe Gln Gln Ala Leu Ala Met Gln Ala Ala Ala Ala
340 345 350
Ala Ala Ala Ala Ser Gly Ala Leu Pro Gly Ser Leu Pro Pro Tyr Met
355 360 365
Pro Pro Pro Gly Met Met Pro Pro Gly Met Pro Gly Gly Val Pro Gly
370 375 380
Met Gly Gly Val Val Gly His Pro Gln Met Pro Ala Pro Gly Met Asp
385 390 395 400
Phe Ala Gly Phe Asn Gly Tyr Gly Asn Ala Ala Gly Gly Leu Met Phe
405 410 415
Gly Gly Gln Gln Gln Ala Gln His Ala Gln Gln His Ala Ser Ala Gln
420 425 430
Ala Gly Ser Leu Ala Gln Gln Gln Ala Gln Gln Val Ser Met Gly Leu
435 440 445
Gly Leu Met Pro Pro Pro Leu Gly Phe Pro Pro Thr Ser Leu Ala Ala
450 455 460
Pro Ala Pro Arg Ser Ala Ala Thr Glu Pro Ala Ala Ala Pro Leu Pro
465 470 475 480
Leu Thr Ser Ser Pro Pro Ala Ala Ser Ala Gly Gly Ser Gly Gly Pro
485 490 495
Ala Ala Ala Ala Pro Gln His Ser Ser Gly Ala Ala Ala Ala Gln Ala
500 505 510
Pro His His His Pro Gln Cys Ser Glu Gln Gly Ala Gly Gly Leu Pro
515 520 525
Pro Pro Leu Pro Ala Ser Ser Ala Pro Gln Ser Tyr Pro Leu Pro Pro
530 535 540
Pro Ser Ser Gln Ala Ala Leu His Asp Pro Asp Glu His Tyr Pro Pro
545 550 555 560
Gly Ser Ala Glu Met His His Gln His Leu Pro Gly Leu Cys Gly Phe
565 570 575
Asn Pro Asp Asp Leu Leu Gly Gly Gln Leu Gly Asp Met Gly Phe Leu
580 585 590
Gly Glu Leu Gly Gly Ala Val Gly Gly Lys His Glu Gln Asp Asp Phe
595 600 605
Leu Asp Leu Leu Leu Lys Gly Glu Glu Glu Leu
610 615
<210>4
<211>2802
<212>DNA
<213> genus Chlorella
<400>4
atggctgccc ccccagtatc tatctcttcc aattttccaa aggttagtat ttacgtaaca 60
tttgccgaca gttgggcaat aacgctgagt tggagtgttg ccaacaagct tttgtgccgt 120
ttccagggtt tgcgggttct cttggtcgat caacagccaa gtaggagcca tattgaagcg 180
cagctgatgc agccggatct taattacaca ggtttgctgc agttttgcac attccaagct 240
tggcctttct ccgtgccaaa cccagcgcgc tgagctcttg ttgtttgttg cagttactgg 300
ttgcgagagc gtttctgaag ctctttcata ttgccgctcg ggagtaagca gctttgacgt 360
ggtgcttgcg gaggtgggtg gtagtgcttc gcgaggtgca cagtgcgcac cgggaaaaac 420
ttgaaaagta tttgtaaaat taattttgaa acttctgtat tattttacac ctcttaacaa 480
tgcacccaat gtttgttatg agcgccacgt taccggacgt ttgttgcagg caaggatcgt 540
tgccgtcgac gagacttcgg ggcgcgcatt tatcgactct cttgaagata caccggttat 600
tcttatgtcg gagggaagca cgacgggcga cgttcttcgt gcggtgaagc ttggagctgt 660
ggactggctg gataagcctc tctccgtcct gaagctcaag aacatctggc agcactcagt 720
gcgtaaggta ggtccggaaa caatatgata tctacaattg gaaagttacc agtgtcaact 780
atggaaaacg ttgtactggg tgctagtttc agttgagcca gttgcctgta tatgcatata 840
aggggcagtg acgcagtgtc tgccctgtga gctgtgatgc atcagggtgc acctgaagct 900
ggcagtggat cactccaccc aagatgttgt tgcaatccaa tgtgttgctg atgccttgct 960
tttctgactt gcaaacatgg tgtgggataa aagcgttgct agacagccac cgtgctccac 1020
gttgtcttct gcatgcaaaa ctgcagatga tgcagcgcac cacgttttac gacacttgct 1080
ccgagcagcc aacccagccg gcgcgcagca agctttcttc aggaatcgaa tcgccgagca 1140
cacccacgct gggagactct gtggacttgg acgccatctc ggcggcttcc ttcggcagca 1200
tcaaggactt gaccgatttt tcattttcca gcggagctga ggtgggcatc gctggttgtc 1260
cagcactgca gcatttccca ccagcttggt tggttgcctg tgttttagtg cagagcagag 1320
gccgaggcta ctggttcaac cagcctagtt actcaaaaca attttggcaa cctgctgact 1380
tctctcttaa cctgcagagc gtttcacagc atgtacactt ttcagtgggt ttcgtaattt 1440
cgtagcgcac ccgctggctt ttttctgcag gtcctgagag cctcctttga cagctgtgac 1500
ggctccgagg tcaacctagg cagcgctttg ggccagcctc gcccccctct ggcagtcaag 1560
cccagctcct ttggccccct ggtgagtggc atagctcagc aggagaccca caagtggctg 1620
gaacccacca tgttggcgcg caccctgccc tcgcacgcgc ggctgccgtc tgcgcagcgc 1680
gccgcggtgc gccgctgtgg agttgtggtg ttgcggagtc actcgcgggc cagtgcttca 1740
cagcccattc tcgccgcaca caccctgccc gcacaaatgg ctgccacccc cctaaagcgg 1800
tcctcaggcc acccggatgt gctcaggatg gatccgccga atctcgcacc cctcctccct 1860
caatcccggt tgttcagacg gtttggaacc cctccggctc tctacccctc tgcaggtacc 1920
cgtccctccc acctcccagt ggccccagct gcaggctggc tgcgtgtggg gcactcccgt 1980
gggcggcccg ctggcgcccc cctccatgac caacgcccag catggtgccc cccacagcgt 2040
gcccctggca gacgcacact tggccggcag cgccagttac atgtccctct cctctgtgag 2100
tctcctcccc tccaccccta catcttccaa tcgaacatgc gacgcacgca cacccatagt 2160
ccctaaacaa gtgctttggt gttttttcac ttgcaaaccc caaccctgac acctgaagcg 2220
tgacacaggc gactgcgctg ctccccgccc ccacacgccc ttggttgttt gtgccctgca 2280
cttctgccac gacatgcatg tcatgtcttt tcacgcctgc gatgtcgctg cttaaacttg 2340
aaactcattg tggccggggt gcagctcatg gaggaggaca ccccctgtcc cttggacatg 2400
gatgcaccag aggacgggat gcagcttcct gttgacttcc tgtctgttgc caacgtcagc 2460
agcaatggta ggtccagcac cagacgcctc tgtctgctat gagacgcacc tccagccgcc 2520
ccctctggac agacagcgcg ctgcacgctc tgcgcgctgg accttgccgc acacgcgcgc 2580
gacaaggcct ggtgtgatgc ttggatgtgg aaggttccag catggttgga caagatggta 2640
tcctggcaca catattggta tgcagcatac acccaggctg cccccttacc ctcgcacgcc 2700
ctacccctta ctgcaggcag cggtcccatt gggttgaagc tgaagaaaag caacagcctg 2760
ctgaacatga tcaacgcagc gctgatgtct ggtggtcagt ga 2802
<210>5
<211>359
<212>PRT
<213> genus Chlorella
<400>5
Met Ala Ala Pro Pro Val Ser Ile Ser Ser Asn Phe Pro Lys Gly Leu
1 5 10 15
Arg Val Leu Leu Val Asp Gln Gln Pro Ser Arg Ser His Ile Glu Ala
20 25 30
Gln Leu Met Gln Pro Asp Leu Asn Tyr Thr Val Thr Gly Cys Glu Ser
35 40 45
Val Ser Glu Ala Leu Ser Tyr Cys Arg Ser Gly Val Ser Ser Phe Asp
50 55 60
Val Val Leu Ala Glu Ala Arg Ile Val Ala Val Asp Glu Thr Ser Gly
65 70 75 80
Arg Ala Phe Ile Asp Ser Leu Glu Asp Thr Pro Val Ile Leu Met Ser
85 90 95
Glu Gly Ser Thr Thr Gly Asp Val Leu Arg Ala Val Lys Leu Gly Ala
100 105 110
Val Asp Trp Leu Asp Lys Pro Leu Ser Val Leu Lys Leu Lys Asn Ile
115 120 125
Trp Gln His Ser Val Arg Lys Met Met Gln Arg Thr Thr Phe Tyr Asp
130 135 140
Thr Cys Ser Glu Gln Pro Thr Gln Pro Ala Arg Ser Lys Leu Ser Ser
145 150 155 160
Gly Ile Glu Ser Pro Ser Thr Pro Thr Leu Gly Asp Ser Val Asp Leu
165 170 175
Asp Ala Ile Ser Ala Ala Ser Phe Gly Ser Ile Lys Asp Leu Thr Asp
180 185 190
Phe Ser Phe Ser Ser Gly Ala Glu Val Leu Arg Ala Ser Phe Asp Ser
195 200 205
Cys Asp Gly Ser Glu Val Asn Leu Gly Ser Ala Leu Gly Gln Pro Arg
210 215 220
Pro Pro Leu Ala Val Lys Pro Ser Ser Phe Gly Pro Leu Val Pro Val
225 230 235 240
Pro Pro Thr Ser Gln Trp Pro Gln Leu Gln Ala Gly Cys Val Trp Gly
245 250 255
Thr Pro Val Gly Gly Pro Leu Ala Pro Pro Ser Met Thr Asn Ala Gln
260 265 270
His Gly Ala Pro His Ser Val Pro Leu Ala Asp Ala His Leu Ala Gly
275 280 285
Ser Ala Ser Tyr Met Ser Leu Ser Ser Leu Met Glu Glu Asp Thr Pro
290 295 300
Cys Pro Leu Asp Met Asp Ala Pro Glu Asp Gly Met Gln Leu Pro Val
305 310 315 320
Asp Phe Leu Ser Val Ala Asn Val Ser Ser Asn Gly Ser Gly Pro Ile
325 330 335
Gly Leu Lys Leu Lys Lys Ser Asn Ser Leu Leu Asn Met Ile Asn Ala
340 345 350
Ala Leu Met Ser Gly Gly Gln
355
<210>6
<211>119
<212>PRT
<213> genus Chlorella
<400>6
Gly Leu Arg Val Leu Leu Val Asp Gln Gln Pro Ser Arg Ser His Ile
1 5 10 15
Glu Ala Gln Leu Met Gln Asp Leu Asn Tyr Thr Val Thr Gly Cys Glu
20 25 30
Ser Val Ser Glu Ala Leu Ser Tyr Cys Arg Ser Gly Val Ser Ser Phe
35 40 45
Asp Val Val Leu Ala Glu Ala Arg Ile Val Ala Val Asp Glu Thr Ser
50 55 60
Gly Arg Ala Phe Ile Asp Ser Leu Glu Asp Thr Pro Val Ile Leu Met
65 70 75 80
Ser Glu Gly Ser Thr Thr Gly Asp Val Leu Arg Ala Val Lys Leu Gly
85 90 95
Ala Val Asp Trp Leu Asp Lys Pro Leu Ser Val Leu Lys Leu Lys Asn
100 105 110
Ile Trp Gln His Ser Val Arg
115
<210>7
<211>1080
<212>DNA
<213> genus Chlorella
<400>7
atggctgccc ccccagtatc tatctcttcc aattttccaa agggtttgcg ggttctcttg 60
gtcgatcaac agccaagtag gagccatatt gaagcgcagc tgatgcagcc ggatcttaat 120
tacacagtta ctggttgcga gagcgtttct gaagctcttt catattgccg ctcgggagta 180
agcagctttg acgtggtgct tgcggaggca aggatcgttg ccgtcgacga gacttcgggg 240
cgcgcattta tcgactctct tgaagataca ccggttattc ttatgtcgga gggaagcacg 300
acgggcgacg ttcttcgtgc ggtgaagctt ggagctgtgg actggctgga taagcctctc 360
tccgtcctga agctcaagaa catctggcag cactcagtgc gtaagatgat gcagcgcacc 420
acgttttacg acacttgctc cgagcagcca acccagccgg cgcgcagcaa gctttcttca 480
ggaatcgaat cgccgagcac acccacgctg ggagactctg tggacttgga cgccatctcg 540
gcggcttcct tcggcagcat caaggacttg accgattttt cattttccag cggagctgag 600
gtcctgagag cctcctttga cagctgtgac ggctccgagg tcaacctagg cagcgctttg 660
ggccagcctc gcccccctct ggcagtcaag cccagctcct ttggccccct ggtacccgtc 720
cctcccacct cccagtggcc ccagctgcag gctggctgcg tgtggggcac tcccgtgggc 780
ggcccgctgg cgcccccctc catgaccaac gcccagcatg gtgcccccca cagcgtgccc 840
ctggcagacg cacacttggc cggcagcgcc agttacatgt ccctctcctc tctcatggag 900
gaggacaccc cctgtccctt ggacatggat gcaccagagg acgggatgca gcttcctgtt 960
gacttcctgt ctgttgccaa cgtcagcagc aatggcagcg gtcccattgg gttgaagctg 1020
aagaaaagca acagcctgct gaacatgatc aacgcagcgc tgatgtctgg tggtcagtga 1080
<210>8
<211>1707
<212>DNA
<213> genus Chlorella
<400>8
atgcttcggc agcagctgtt gcacagcggc aggcagccgg gtgcgacatg cagcttacta 60
acctgctcga catggcgacc gtctgccttg ttcggccgtc ctaagcccca aaaactgcac 120
agccagcgct tgcagcatca gggccgcccc tcccgcctcg tcgtgcgcag cgcaatgttc 180
gacaacctga gccgcagcct ggagagggcg tgggacatgg tgcgcaagga cgggcggcta 240
acggcggaca acatcaagga gcccatgcgg gagattcgca gggcgctgct tgaggcggat 300
gtgaggctgg gggcgccgct gatcagattc ttggtatcta cccccccccc ctcccaggtc 360
tccctccccg tggtgcgcaa gtttgtgaag gcggtggagg agaaggcgct gggttctgca 420
gtgaccaagg gtgtcacccc cgaccagcag ctggtgaagg tggtgtacga ccagctgcgg 480
gagctgatgg gggggcagca ggaagggctg gtgcccactt cgccagagga gccgcaggtg 540
atcttgatgg cggggctgca gggcacgggg aagacgacag ctgcggggaa gctggccttg 600
ttcctgcaga agaaggggca gaaggtgctg ctggtggcca ccgacatcta ccgccccgcc 660
gccatcgacc agctggtgaa gctgggcgac aggatagggg tgccggtgtt ccagctggga 720
acccaggtgc agccgccgga gattgcaagg caggggctgg agaaggcgcg agcagagggg 780
tttgacgccg tcatcgtcga cacggcgggg cggctgcaga tcgaccagag catgatggag 840
gagctggtgc agatcaagtc cacggtgaag ccctccgaca cgctgctagt ggtcgatgcg 900
atgacggggc aggaggcagc cgggctggtg aaggcgttca atgatgccgt ggacatcaca 960
ggcgccgtgc tgaccaagct tgacggggac agccgcggcg gcgccgcgct gagcgtgcgc 1020
caggtcagcg ggcggcccat caagtttgtg ggcatggggg agggcatgga ggcgctggag 1080
cccttctacc ccgagcgcat ggccagcagg attctgggca tgggtgacgt ggtcaccctg 1140
gtggagaagg ctgaggagag catcaaggaa gaggaggcgc aggagatatc gcggaagatg 1200
ctgtcggcca aatttgactt tgacgacttc ctgaagcagt acaagatggt ggcggggatg 1260
gggaacatgg cccaaatcat gaagatgctg ccaggcatga acaagtttac ggagaagcag 1320
ctggcgggcg ttgagaagca gtacaaggtg tacgagagca tgatccagag catgacggtg 1380
aaggagcgca agcagccgga gctgttggtg aagtcgccct ccaggaggcg gcgcatagcg 1440
cgcgggtcgg ggcgctcgga gcgggaggtc acagagctgc tgggggtgtt caccaacctg 1500
cggacgcaga tgcagagctt ctccaaaatg atggccatgg gggggatggg catgggctcc 1560
atgatgagcg acgaggagat gatgcaggcc acgctggcag gcgccggccc ccgccccgtg 1620
ccagctggca aggtgcggcg gaagaagctg gccgcggcgg gcgggtcgcg gggcatggct 1680
gagctggcat ccctgaaggc agaatga 1707
<210>9
<211>302
<212>PRT
<213> Gliocladium sp
<400>9
Met Gly Leu Lys Ala Arg Ala Ala Ser Val Ser Val His Ser Ser Ala
1 5 10 15
Asn Asn Thr Ala Ser Pro Leu Ser Ser Gly Arg Arg Gly Phe Pro His
20 25 30
Ser Gly Glu Met Ser Gly Glu Asp Leu Ala Arg Ser Asp Ser Trp Glu
35 40 45
Met Phe Pro Ala Gly Leu Lys Val Leu Val Val Asp Asp Asp Pro Leu
50 55 60
Cys Leu Lys Val Val Glu His Met Leu Arg Arg Cys Asn Tyr Gln Val
65 70 75 80
Thr Thr Cys Pro Asn Gly Lys Ala Ala Leu Glu Lys Leu Arg Asp Arg
85 90 95
Ser Val His Phe Asp Leu Val Leu Ser Asp Val Tyr Met Pro Asp Met
100 105 110
Asp Gly Phe Lys Leu Leu Glu His Ile Gly Leu Glu Leu Asp Leu Pro
115 120 125
Val Ile Met Met Ser Ser Asn Gly Glu Thr Asn Val Val Leu Arg Gly
130 135 140
Val Thr His Gly Ala Val Asp Phe Leu Ile Lys Pro Val Arg Val Glu
145 150 155 160
Glu Leu Arg Asn Val Trp Gln His Val Val Arg Arg Lys Arg Asp Gln
165 170 175
Ala Val Ser Gln Ala Arg Asp Ser Arg Asp Ile Ser Asp Glu Glu Gly
180 185 190
Thr Asp Asp Gly Lys Pro Arg Asp Lys Lys Arg Lys Glu Val Ile Leu
195 200 205
Val Leu Trp Trp Asp Met Gln Arg Arg Asp Ser Asp Asp Gly Val Ser
210 215 220
Ala Lys Lys Ala Arg Val Val Trp Ser Val Glu Met His Gln Gln Phe
225 230 235 240
Val Gln Ala Val Asn Gln Leu Gly Ile Asp Lys Ala Val Pro Lys Arg
245 250 255
Ile Leu Asp Leu Met Asn Val Asp Gly Leu Thr Arg Glu Asn Val Ala
260 265 270
Ser His Leu Gln Val Pro His Leu Ser Ile Phe Ser Pro Leu Phe Ala
275 280 285
Glu Leu Met Ser Thr Leu Pro Arg Arg Cys Phe Tyr Asp Phe
290 295 300
<210>10
<211>269
<212>PRT
<213> Ostreococcus lucimarinus
<400>10
Phe Pro Ala Gly Leu Gly Val Leu Val Val Asp Asp Asp Leu Leu Cys
1 5 10 15
Leu Lys Val Val Glu Lys Met Leu Lys Ala Cys Lys Tyr Lys Val Thr
20 25 30
Ala Cys Ser Thr Ala Lys Thr Ala Leu Glu Ile Leu Arg Thr Arg Lys
35 40 45
Glu Glu Phe Asp Ile Val Leu Ser Asp Val His Met Pro Asp Met Asp
50 55 60
Gly Phe Lys Leu Leu Glu Ile Ile Gln Phe Glu Leu Ala Leu Pro Val
65 70 75 80
Leu Met Met Ser Ala Asn Ser Asp Ser Ser Val Val Leu Arg Gly Ile
85 90 95
Ile His Gly Ala Val Asp Tyr Leu Leu Lys Pro Val Arg Ile Glu Glu
100 105 110
Leu Arg Asn Ile Trp Gln His Val Val Arg Arg Asp Tyr Ser Ser Ala
115 120 125
Lys Ser Ser Gly Ser Glu Asp Val Glu Ala Ser Ser Pro Ser Lys Arg
130 135 140
Ala Lys Thr Ser Gly Ser Asn Ser Lys Ser Glu Glu Val Asp Arg Thr
145 150 155 160
Ala Ser Glu Met Ser Ser Gly Lys Ala Arg Lys Lys Pro Thr Gly Lys
165 170 175
Lys Gly Gly Lys Ser Val Lys Glu Ala Glu Lys Lys Asp Val Val Asp
180 185 190
Asn Ser Asn Ser Lys Lys Pro Arg Val Val Trp Ser Ala Glu Leu His
195 200 205
Ala Gln Phe Val Thr Ala Val Asn Gln Leu Gly Ile Asp Lys Ala Val
210 215 220
Pro Lys Arg Ile Leu Asp Leu Met Gly Val Gln Gly Leu Thr Glu Asn
225 230 235 240
Val Ala Ser His Leu Gln Lys Tyr Arg Leu Tyr Leu Lys Arg Leu Gln
245 250 255
Gly Asn Asp Ala Arg Gly Gly Gly Asn Ala Ser Ser Thr
260 265
<210>11
<211>941
<212>PRT
<213> Chlamydomonas reinhardtii
<400>11
Met Asp Ser Gln Gly Val Lys Leu Glu Glu His Pro Gly His Thr Gly
1 5 10 15
Gly His Trp Gln Gly Phe Pro Ala Gly Leu Arg Leu Leu Val Val Asp
20 25 30
Asp Asp Pro Leu Cys Leu Lys Val Val Glu Gln Met Leu Arg Lys Cys
35 40 45
Ser Tyr Glu Val Thr Val Cys Ser Asn Ala Thr Thr Ala Leu Asn Ile
50 55 60
Leu Arg Asp Lys Asn Thr Glu Tyr Asp Leu Val Leu Ser Asp Val Tyr
65 70 75 80
Met Pro Asp Met Asp Gly Phe Arg Leu Leu Glu Leu Val Gly Leu Glu
85 90 95
Met Asp Leu Pro Val Ile Met Met Ser Ser Asn Gly Asp Thr Ser Asn
100 105 110
Val Leu Arg Gly Val Thr His Gly Ala Cys Asp Tyr Leu Ile Lys Pro
115 120 125
Val Arg Leu Glu Glu Leu Arg Asn Leu Trp Gln His Val Val Arg Arg
130 135 140
Arg Arg Gln His Ala Gln Glu Ile Asp Ser Asp Glu Gln Ser Gln Glu
145 150 155 160
Arg Asp Glu Asp Gln Thr Arg Asn Lys Arg Lys Ala Asp Ala Ala Gly
165 170 175
Val Thr Gly Asp Gln Cys Arg Leu Asn Gly Ser Gly Ser Gly Gly Ala
180 185 190
Ala Gly Pro Gly Ser Gly Gly Gly Ala Gly Gly Met Thr Asp Glu Met
195 200 205
Leu Met Met Ser Gly Gly Glu Asn Gly Ser Asn Lys Lys Ala Arg Val
210 215 220
Val Trp Ser Val Glu Met His Gln Gln Phe Val Asn Ala Val Asn Gln
225 230 235 240
Leu Gly Ile Asp Lys Ala Val Pro Lys Lys Ile Leu Glu Ile Met Gly
245 250 255
Val Asp Gly Ser Ala Gly Arg Leu Ala Asp Thr Ser Gly Arg Asp Val
260 265 270
Cys Gly Thr Val Tyr Arg Leu Tyr Leu Lys Arg Val Ser Gly Val Thr
275 280 285
Pro Ser Gly His His His Asn Ala Ala His Lys Ser Asn Lys Pro Ser
290 295 300
Pro His Thr Thr Pro Pro Pro Pro Ala Leu Pro Gly Gln Ala Gly Thr
305 310 315 320
His Pro Ala Asn Gln Ala Thr Ala Ile Pro Pro Pro Pro Gln Pro Gly
325 330 335
Ser Gly Thr Ala Ala Gly Ala Gly Ala Ala Ala Ala Gly Thr Gly Gly
340 345 350
Gly Ala Ala Ala Ala Asn Gly His Ala Ala Thr Thr Gly Ala Gly Thr
355 360 365
Pro Gly Ala Ala Pro Gly Ala Gly Gly Gly Val Gly Gly Thr Gly Ala
370 375 380
Gly Gly Leu Gly Ser Gly Pro Asp Gly Ala Ala Ala Ala Ala Gly Pro
385 390 395 400
Gly Pro Gly Ala Ala Val Pro Gly Gly Leu Gly Gly Leu Pro Leu Pro
405 410 415
Pro Gly Ala Gly Pro Gly Pro Gly Pro Gly Gly Phe Gly Gly Pro Ser
420 425 430
Pro Pro Pro Pro Pro His Pro Ala Ala Leu Leu Ala Asn Pro Met Ala
435 440 445
Ala Ala Val Ala Gly Leu Asn Gln Ser Leu Leu Asn Ala Met Gly Ser
450 455 460
Leu Gly Val Gly Val Gly Gly Met Ser Pro Leu Gly Pro Val Gly Pro
465 470 475 480
Leu Gly Pro Leu Gly Gly Leu Pro Gly Leu Pro Gly Met Gln Pro Pro
485 490 495
Pro Leu Gly Met Gly Gly Leu Gln Pro Gly Met Gly Pro Leu Gly Pro
500 505 510
Leu Gly Leu Pro Gly Met Gly Gly Leu Pro Gly Leu Pro Gly Met Asn
515 520 525
Pro Met Ala Asn Leu Met Gln Gly Met Ala Ala Gly Met Ala Ala Ala
530 535 540
Asn Gln Met Asn Gly Met Gly Gly His Met Gly Gly His Met Gly Gly
545 550 555 560
Met Asn Gly Pro Met Gly Ala Leu Ala Gly Met Asn Gly Leu Asn Gly
565 570 575
Ala Met Met Gly Gly Leu Pro Gly Met Gly Gly Pro Gln Asn Met Phe
580 585 590
Gln Ala Ala Ala Ala Ala Ala Ala Gln Gln Gln Gln Gln Gln Gln Glu
595 600 605
Gln Gln His Ala Met Met Gln Gln Ala Ala Ala Gly Leu Leu Ala Ser
610 615 620
Gln Gln Gln Gln Gln Gln Gln Gln Gln Gln Gln Gln Gln Gln Gln Ala
625 630 635 640
Leu Gln Gln Gln Gln Gln Gln Gly Met Ala Val Ser Pro Pro Gly Pro
645 650 655
His Asn Ala Thr Pro Asn Gly Gln Leu His Thr His Pro Gln Ala His
660 665 670
His Pro His Gln His Gly Leu His Ala His Ala His Pro His Gln His
675 680 685
Leu Asn Thr Ala Pro Ala Gly Ala Leu Gly Leu Ser Pro Pro Gln Pro
690 695 700
Pro Ala Gly Leu Leu Ser Ala Ser Gly Leu Ser Ser Gly Pro Asp Gly
705 710 715 720
Ser Gly Leu Gly Ser Gly Val Gly Gly Leu Leu Asp Gly Leu Gln Gln
725 730 735
His Pro His His Pro Gln Leu Gln Leu Ala Gly Ser Leu Gly Thr Gly
740 745 750
Gly Thr Gly Arg Ser Ser Gly Ala Ala Gly Arg Gly Ser Leu Asp Leu
755 760 765
Pro Ala Asp Leu Met Gly Met Ala Leu Leu Asp Phe Pro Pro Val Pro
770 775 780
Val Pro Gly Gly Ala Asp Val Gly Met Ala Gly Ala Gly Gly Gly Ala
785 790 795 800
Ala Gly Ala His His His Gly His Gln Gly His Gln Gly Ile Gly Gly
805 810 815
Gly Ala Gly Val Gly Ile Ala Gly Gly Val Gly Cys Gly Val Pro Ala
820 825 830
Ala Ala His Gly Leu Glu Pro Ala Ile Leu Met Asp Asp Pro Ala Asp
835 840 845
Leu Gly Ala Val Phe Ser Asp Val Met Tyr Gly Thr Pro Gly Gly Gly
850 855 860
Gly Val Pro Gly Gly Val Pro Gly Gly Gly Val Gly Leu Gly Leu Gly
865 870 875 880
Ala Gly Gln Val Pro Ser Gly Pro Ala Gly Ala Gly Gly Leu His Ser
885 890 895
His His His Gln His His His His Gln His His Leu Gly His Val Val
900 905 910
Pro Val Gly Gly Val Asp Pro Leu Ala Gly Asp Ala Ala Lys Met Ala
915 920 925
Met Asn Asp Asp Asp Phe Phe Asn Phe Leu Leu Lys Asn
930 935 940
<210>12
<211>523
<212>PRT
<213> Chromochloris zofingiensis
<400>12
Met Asp Gly Phe Lys Leu Leu Glu Thr Val Gly Leu Glu Leu Asp Leu
1 5 10 15
Pro Val Ile Met Met Ser Ser Asn Gly Glu His Thr Thr Val Met Arg
20 25 30
Gly Val Thr His Gly Ala Cys Asp Phe Leu Ile Lys Pro Val Arg Ile
35 40 45
Glu Glu Leu Arg Asn Ile Trp Gln His Val Ile Arg Arg Thr Arg His
50 55 60
Pro Val Phe Arg Asp Leu Glu Pro Asp Asp His Glu Gly Gly Asp Tyr
65 70 75 80
Glu Ala Ser Lys Lys Arg Lys Asp Leu Tyr Arg Gly Glu Asn Ser Ser
85 90 95
Gly Ser Gly Gly Ala Gly Gly Leu Glu Arg Asp Asp Asp Gly Ser Ala
100 105 110
Ser Lys Lys Pro Arg Val Val Trp Ser Val Glu Met His Gln Gln Phe
115 120 125
Val Gln Ala Val Asn Gln Leu Gly Ile Asp Lys Ala Val Pro Lys Lys
130 135 140
Ile Leu Glu Leu Met Asn Val Asp Gly Leu Thr Arg Glu Asn Val Ala
145 150 155 160
Ser His Leu Gln Lys Tyr Arg Leu Tyr Leu Lys Arg Val Gln Gly Val
165 170 175
Gln Ala Pro Phe Gly Leu Pro Asn Ile Gln Leu Pro Arg Gln Thr Ser
180 185 190
Ser Lys Gly Ala Gly Ser Ser Ser Gln Gln Gln His His Gln Gln Gln
195 200 205
Gln His Gln Gln Gln His Gln His Gln His Gln Thr Ala Leu Gly Thr
210 215 220
Gly Gln Gln Gln Ser His Gln Leu Gln Pro Cys Pro Val Ser Thr Ala
225 230 235 240
Thr Pro Val Met Pro Ser Pro Asp Ala Met Val Ala Ala Ser Met Met
245 250 255
Ser Ser Gln Ala Met Ala Ala Met Ala Pro Gly Val Met Asn Pro Met
260 265 270
Thr Ala Met Asn Ser Met Met Ala Gly Leu Asn Pro Asn Met Met Gly
275 280 285
Met Ala Ala Gly Leu Gly Leu Ala Gly Leu Gly Ile Gly Gly Met Ala
290 295 300
Gly His Pro Val Pro Asn Pro Met Leu Ala Gly Met Gly Pro Met Gly
305 310 315 320
Leu Gly Leu Pro Pro Pro Pro Gly Met Pro Pro Pro Pro Pro Gly Met
325 330 335
Pro Pro Gly Met Pro Pro Gly Met Pro Pro Gly Met Pro Ala Met Met
340 345 350
Gln Gly Leu Ser Met Ala Gly Met Ser His Leu Ala Ala Ala Gly Met
355 360 365
Arg Pro Pro Pro Gly Ala Leu Gly Gly His Leu Gly Gly Pro Gly Leu
370 375 380
Ser Pro Phe Gly Pro Pro Pro Pro Pro Gly Ala Asp Pro Ala Asn Met
385 390 395 400
Met Ala Asn Met Ser Ser Met Met Ala Asn Met Gln Ala Ala Leu Ala
405 410 415
Phe Gln Ala Asp Ala Ala Ala Ala Ala Gln His Gln Ala Ala Ser Thr
420 425 430
Gly Ser Val Ala Pro Gly Arg Gln Gln Gln Val His Gln His Gln Gln
435 440 445
Ala Val Gly Met Ala Val Asp Asp Ala Ala Ala Phe Pro Ser Pro Gly
450 455 460
Cys Arg Pro Asn Gly Ser Ala Asp Ala Gly Ala Gln Ser Ala Ala Glu
465 470 475 480
Pro Asn Asp Phe Ser Arg Val Phe Asp Asp Pro Phe Ala Gln Pro Ala
485 490 495
Ala Ser Pro Ser Gly Ala Ala Ala Ala Gly Ser Asn Glu Ala Pro Gly
500 505 510
Met Asp Asp Phe Leu Asp Phe Phe Leu Lys Ser
515 520
<210>13
<211>834
<212>PRT
<213> Volvox carteri
<400>13
Met Asp Gly Arg Ala Glu Gly Thr Val Ala Ile Lys Gln Glu Asp His
1 5 10 15
Ala Ser Gly His Trp His Asn Phe Pro Ala Gly Leu Arg Leu Leu Val
20 25 30
Val Asp Asp Asp Pro Leu Cys Leu Lys Val Val Glu Gln Met Leu Arg
35 40 45
Lys Cys Ser Tyr Asp Val Thr Thr Cys Thr Asn Ala Thr Met Ala Leu
50 55 60
Asn Leu Leu Arg Asp Lys Ser Thr Glu Tyr Asp Leu Val Leu Ser Asp
65 70 75 80
Val Tyr Met Pro Asp Met Asp Gly Phe Lys Leu Leu Glu Val Val Gly
85 90 95
Leu Glu Met Asp Leu Pro Val Ile Met Met Ser Ser Asn Gly Asp Thr
100 105 110
Ser Asn Val Leu Arg Gly Val Thr His Gly Ala Cys Asp Tyr Leu Ile
115 120 125
Lys Pro Val Arg Leu Glu Glu Leu Arg Asn Leu Trp Gln His Val Val
130 135 140
Arg Arg Arg Arg Gln Leu Asn Leu Asp Met Asp Ser Asp Glu His Ser
145 150 155 160
Gln Glu Arg Asp Asp Asp Gln Gly Arg Lys Arg Lys Ala Asp Thr Ala
165 170 175
Gly Cys Ile Gly Asp Gln Leu Arg Met Met Gly Ala Gly Cys Ser Gly
180 185 190
Gly Ala Asn Gly Leu Gly Ser Thr Gly Asn Leu Gly Ala Val Ala Thr
195 200 205
Gly Ser Ala Gly Leu Gly Leu Gly Leu Gly Thr Ala Ala Asp Glu Leu
210 215 220
Gly Leu Gly Leu Asp Asn Gly Ser Ser Lys Lys Ala Arg Val Val Trp
225 230 235 240
Ser Val Glu Met His Gln Gln Phe Val Asn Ala Val Asn Gln Leu Gly
245 250 255
Ile Asp Lys Ala Val Pro Lys Lys Ile Leu Glu Ile Met Asn Val Asp
260 265 270
Gly Leu Thr Arg Glu Asn Val Ala Ser His Leu Gln Lys Tyr Arg Leu
275 280 285
Tyr Leu Lys Arg Val Ser Gly Ala Gln Gln Pro Gly Gln Asn Arg Val
290 295 300
Ser Arg Pro Ser Pro Pro Gln Pro Gln Ser Pro Gln Val Pro Ser Gln
305 310 315 320
Gln Gln Gln Ser Leu Pro Gly Gly Gly Gly Ala Ala Ala Ala Gly Ala
325 330 335
Gly Gln Leu Gln Gly Gly Gly Gly Ala Ala Ala Ala Ala Ala Ser Leu
340 345 350
Ala Ser Ile Leu Ala Gly Gly Gly Pro Ala Gly Gly Gly Ala Gly Ala
355 360 365
Gly Pro Pro Pro Gly Gly Gly Gln Leu Gly Ala Asp Gly Gly Gly Pro
370 375 380
Gly Pro Gly Leu Ser Ser Ala Val Ala Asn Ala Met Ser Ala Ala Ala
385 390 395 400
Ala Ala Gly Gly Phe Pro Thr Pro Pro Pro Pro Pro Pro Pro His Pro
405 410 415
Ala Ala Leu Leu Ala Ala Asn Pro Met Met Ala Ala Ala Ala Gly Leu
420 425 430
Asn Pro Leu Leu Gly Ala Met Gly Gly Leu Gly Val Gly Pro Leu Gly
435 440 445
Pro Leu Asn Pro Leu Asn Gly Met Pro Met Pro Gly Met Gln Pro Pro
450 455 460
Leu Gly Leu Leu Pro Gly Leu Pro Gly Pro Gly Gly Gln Leu Gly Leu
465 470 475 480
Gly Pro Leu Gly Pro Ile Gly Leu Pro Gly Pro Gly Pro Leu Pro Ser
485 490 495
Leu Pro Ala Gly Leu Pro Leu Asn Pro Met Ala Asn Gly Leu Gln Gln
500 505 510
Met Ala Ala Ala Asn Leu Met Gln Gly Met Ala Gly Met Gly Gln Leu
515 520 525
Pro Ala Leu Ser Met Asn Gly Met Asn Gly Ile Met Gly Pro Leu Pro
530 535 540
Gly Val Gly Leu Pro Gly Pro Gln Gln His Leu Phe Pro Gln Gln Gln
545 550 555 560
Gln Pro His Leu Gln Gln Gln Gln Gln Gln Gln Gln Gln Lys Asp Leu
565 570 575
Gln Met Ala Gln Lys Gln His Gln Ala Ala Ala Ala Ala Ala Ala Val
580 585 590
Ala Ala Ala Val Ala Ala Ala Gln His Gln Gln Gln Gln Pro Gln Ala
595 600 605
Gln Gln Gln Pro Gln Pro Gln Gln Gln Gln Gln Gln Pro Gly Lys Leu
610 615 620
Pro Gln Ala Thr Val Gly Thr Pro Ala Leu Ala Ser Pro Ala Gly Ala
625 630 635 640
Leu Pro Arg Gln Pro Ser Gly Gln His Pro His Thr Leu Ser Ser Ser
645 650 655
Ser Leu His Thr Gln Gln Pro His Gln Gln Gln Leu Leu His Ser Gln
660 665 670
Pro Ser Ser Thr His Leu Ala Thr Asn Asn Thr Leu Ala Met Ala Pro
675 680 685
Ala Leu Asn Gly Thr Leu Asp Val Gly Gly Lys Gly His Leu His Ala
690 695 700
Ala Gly Gly Gln Gly Ala Gly Ala Gly Ala Gly Ala Val Leu Asp Ile
705 710 715 720
Pro Pro Asp Leu Ile Gly Gly Leu Ile Glu Asp Gly Phe Gly Ala Pro
725 730 735
Pro Gly Pro Thr Ile Gln Leu Ala His Gly Thr Ala Ala Val Leu Asp
740 745 750
Pro Thr Met Leu Leu Asp Glu Gly Asp Asn Ser Asp Phe Ala Ala Val
755 760 765
Phe Gln Glu Met Ser Ser Tyr Gly Gly Gly Gly Val Ile Gly Gly Gly
770 775 780
Gly Ser Gly Ala Gly Ala Met Gly Val Leu Gly His Gly Leu Leu Ala
785 790 795 800
Ala Gly Gly Pro Val Met Val Asp Val Ala Ala Gly Leu Ala Gly Val
805 810 815
Thr Glu Thr Ala Thr Arg Val Asp Asp Asp Phe Leu Asn Phe Leu Leu
820 825 830
Lys Ser
<210>14
<211>446
<212>PRT
<213> Tetraselmis sp.
<400>14
Met Ser Cys Thr Val Ala Ser Phe Pro Pro Ala Ala Gly Gly Gln Gly
1 5 10 15
Ser Pro Ala Thr Pro Val Pro Tyr Gln Asp Leu Leu Val Lys Arg Gln
20 25 30
Asp Gln Trp Ser Asn Phe Pro Ala Gly Leu Arg Val Leu Val Ala Asp
35 40 45
Asn Asp Pro Ala Ser Leu Gln Gln Val Glu Lys Met Leu Lys Lys Cys
50 55 60
Ser Tyr Gln Val Thr Leu Cys Ser Ser Gly Lys Asn Ser Leu Glu Ile
65 70 75 80
Leu Arg Lys Arg Arg Glu Glu Phe Asp Leu Val Leu Ala Asp Ala Asn
85 90 95
Leu Pro Asp Ile Asp Gly Phe Lys Leu Leu His Val Cys His Thr Glu
100 105 110
Leu Ser Leu Pro Val Val Leu Met Ser Gly Thr Ser Asp Thr Gln Leu
115 120 125
Val Met Arg Gly Val Met Asp Gly Ala Arg Asp Phe Leu Ile Lys Pro
130 135 140
Leu Arg Val Glu Glu Leu Lys Val Leu Trp Gln His Leu Val Arg Phe
145 150 155 160
Thr Ser Glu Ile Thr Lys Thr Asp Ala Gln Leu Asn Val Val Lys Val
165 170 175
Glu Leu Asp Gly Gly Arg Pro Ala Gly Glu Val Ser Thr Ser Gln Asn
180 185 190
Gly Ser Gln Cys Thr Glu Arg Glu Gly Glu Gly Asn Ser Ser Lys Lys
195 200 205
Gln Arg Met Asn Trp Ser Asp Glu Met His Gln Gln Phe Val Asn Ala
210 215 220
Val Asn Gln Leu Gly Ile Asp Lys Ala Val Pro Lys Arg Ile Leu Asp
225 230 235 240
Leu Met Ser Val Glu Gly Leu Thr Arg Glu Asn Val Ala Ser His Leu
245 250 255
Gln Lys Tyr Arg Ile Tyr Leu Lys Arg Met Ala Asn His Gln Glu Asn
260 265 270
Gly Lys Gln Ala Val Met Ser Thr Asp Thr Ile Ala Arg Ala Glu Ala
275 280 285
Ala Tyr Gln Gly Gly Met Pro Gln Gly Gln Gln Met Met Gln Gln Glu
290 295 300
His Ser Gly Gln Ala Val Gln Tyr Ser Gln Pro His Ala Pro Gly Gly
305 310 315 320
Leu His Gln Gln Ala Met Pro Ala Gln Met His Met Gly Met Met Pro
325 330 335
Ala Gly Pro Gln Pro Gly Ser Met Gln Met Ala Pro His His Val Met
340 345 350
Gln Met Pro Asn Gly Gln Val Met Val Met Gln Gln Met Gly Pro Arg
355 360 365
Pro Gly Met Pro Pro Gly Met Pro Gln Gln Met Met Ala Ser Ser Gln
370 375 380
Gln Met Gly Met Leu Gln Pro Gly Met Pro Ala Gly Gln Met Leu His
385 390 395 400
Phe Gln His Pro Gln Gln Val His Gln His Pro Pro Ser Ser Gly Pro
405 410 415
Met His Ala Val Gln His Met Glu Tyr Ala Tyr Ser Gln Pro Met Gln
420 425 430
Met Ala Gly Trp Pro Val Gln Gly Gln Pro Gly Asn Gln Ala
435 440 445
<210>15
<211>490
<212>PRT
<213> Tetraselmis sp.
<400>15
Met Thr Pro Thr Pro Pro Met Ser Cys Thr Val Ala Ser Phe Pro Pro
1 5 10 15
Ala Ala Gly Gly Gln Gly Ser Pro Ala Thr Pro Val Pro Tyr Gln Asp
20 25 30
Leu Leu Val Lys Arg Gln Asp Gln Trp Ser Asn Phe Pro Ala Gly Leu
35 40 45
Arg Val Leu Val Ala Asp Asn Asp Pro Ala Ser Leu Gln Gln Val Glu
50 55 60
Lys Met Leu Lys Lys Cys Ser Tyr Gln Val Thr Leu Cys Ser Ser Gly
65 70 75 80
Lys Asn Ser Leu Glu Ile Leu Arg Lys Arg Arg Glu Glu Phe Asp Leu
85 90 95
Val Leu Ala Asp Ala Asn Leu Pro Asp Ile Asp Gly Phe Lys Leu Leu
100 105 110
His Val Cys His Thr Glu Leu Ser Leu Pro Val Val Leu Met Ser Gly
115 120 125
Thr Ser Asp Thr Gln Leu Val Met Arg Gly Val Met Asp Gly Ala Arg
130 135 140
Asp Phe Leu Ile Lys Pro Leu Arg Val Glu Glu Leu Lys Val Leu Trp
145 150 155 160
Gln His Leu Val Arg Phe Thr Ser Glu Ile Thr Lys Thr Asp Ala Gln
165 170 175
Leu Asn Val Val Lys Val Glu Leu Asp Gly Gly Arg Pro Ala Gly Glu
180 185 190
Val Ser Thr Ser Gln Asn Gly Ser Gln Cys Thr Glu Arg Glu Gly Glu
195 200 205
Gly Asn Ser Ser Lys Lys Gln Arg Met Asn Trp Ser Asp Glu Met His
210 215 220
Gln Gln Phe Val Asn Ala Val Asn Gln Leu Gly Ile Asp Lys Ala Val
225 230 235 240
Pro Lys Arg Ile Leu Asp Leu Met Ser Val Glu Gly Leu Thr Arg Glu
245 250 255
Asn Val Ala Ser His Leu Gln Lys Tyr Arg Ile Tyr Leu Lys Arg Met
260 265 270
Ala Asn His Gln Glu Asn Gly Lys Gln Ala Val Met Ser Thr Asp Thr
275 280 285
Ile Ala Arg Ala Glu Ala Ala Tyr Gln Gly Gly Met Pro Gln Gly Gln
290 295 300
Gln Met Met Gln Gln Glu His Ser Gly Gln Ala Val Gln Tyr Ser Gln
305 310 315 320
Pro His Ala Pro Gly Gly Leu His Gln Gln Ala Met Pro Ala Gln Met
325 330 335
His Met Gly Met Met Pro Ala Gly Pro Gln Pro Gly Ser Met Gln Met
340 345 350
Ala Pro His His Val Met Gln Met Pro Asn Gly Gln Val Met Val Met
355 360 365
Gln Gln Met Gly Pro Arg Pro Gly Met Pro Pro Gly Met Pro Gln Gln
370 375 380
Met Met Ala Ser Ser Gln Gln Met Gly Met Leu Gln Pro Gly Met Pro
385 390 395 400
Ala Gly Gln Met Leu His Phe Gln His Pro Gln Gln Val His Gln His
405 410 415
Pro Pro Ser Ser Gly Pro Met His Ala Gly Gly Glu Met Ile Asp Pro
420 425 430
Gly Ser Met Gln Arg Leu His Gln Gln Pro His Tyr Ile Gly Pro Asn
435 440 445
Gly Gln His Met Pro Ala Pro Ala Met Gly Met Pro Ser Gly Thr Val
450 455 460
Gln His Met Glu Tyr Ala Tyr Ser Gln Pro Met Gln Met Ala Gly Trp
465 470 475 480
Pro Val Gln Gly Gln Pro Gly Asn Gln Ala
485 490
<210>16
<211>574
<212>PRT
<213> Tetraselmis sp.
<400>16
Met Thr Met Pro Leu Gly Gly Gly Leu Cys Met Lys Asp Arg Ile His
1 5 10 15
Gly Asp Glu Arg Tyr Arg Ser Lys Ala Lys Arg Gln Val Asn Thr Ile
20 25 30
Phe Ala Phe Thr Gln Arg Asn Thr Trp Arg Gly Arg Phe Arg Leu Cys
35 40 45
Ser Tyr Arg Thr Thr Glu Leu Leu Gly Gly Ser Lys Thr Thr Glu Pro
50 55 60
Gly Arg Gly Thr Phe Val Leu Gln Ile Phe Met Cys Val Lys Asn Ala
65 70 75 80
Ser Ile Asp Asp Gly Ser Arg His Ile Ser Thr Ser Arg Gly Leu Glu
85 90 95
Ser Val Leu Lys Arg Arg Gly Gly Gln Gly Ala Pro Ala Ala Pro Val
100 105 110
Pro Tyr His Asp Leu Leu Val Lys Arg Gln Asp Gln Trp Ser Asn Phe
115 120 125
Pro Ala Gly Leu Arg Val Leu Val Ala Asp Asn Asp Pro Ala Ser Leu
130 135 140
Gln Gln Val Glu Lys Met Leu Lys Lys Cys Ser Tyr Gln Val Thr Leu
145 150 155 160
Cys Ser Ser Gly Lys Asn Ser Leu Glu Ile Leu Arg Lys Arg Arg Glu
165 170 175
Glu Phe Asp Leu Val Leu Ala Asp Ala Asn Leu Pro Asp Ile Asp Gly
180 185 190
Phe Lys Leu Leu His Val Cys His Thr Glu Leu Ser Leu Pro Val Val
195 200 205
Leu Met Ser Gly Thr Ser Asp Thr Gln Leu Val Met Arg Gly Val Met
210 215 220
Asp Gly Ala Arg Asp Phe Leu Ile Lys Pro Leu Arg Val Glu Glu Leu
225 230 235 240
Lys Val Leu Trp Gln His Leu Val Arg Phe Thr Ser Glu Ile Thr Lys
245 250 255
Thr Asp Ala Gln Leu Asn Val Val Lys Val Glu Leu Asp Ser Gly Arg
260 265 270
Pro Ala Gly Glu Val Ser Thr Ser Gln Asn Gly Ser Gln Cys Ala Glu
275 280 285
Arg Glu Gly Glu Gly Asn Ser Ser Lys Lys Gln Arg Met Asn Trp Ser
290 295 300
Asp Glu Met His Gln Gln Phe Val Asn Ala Val Asn Gln Leu Gly Ile
305 310 315 320
Asp Lys Ala Val Pro Lys Arg Ile Leu Asp Leu Met Ser Val Glu Gly
325 330 335
Leu Thr Arg Glu Asn Val Ala Ser His Leu Gln Lys Tyr Arg Ile Tyr
340 345 350
Leu Lys Arg Met Ala Asn His Gln Glu Asn Gly Lys Gln Ala Val Met
355 360 365
Ser Thr Asp Thr Ile Ala Arg Ala Glu Ala Ala Tyr Gln Gly Gly Met
370 375 380
Pro Gln Gly Gln Gln Met Met Gln Gln Glu His Ser Gly Gln Ala Val
385 390 395 400
Gln Tyr Ser Gln Pro His Ala Pro Ser Gly Leu His Gln Gln Ala Met
405 410 415
Pro Ala Gln Met His Met Gly Met Met Pro Ala Gly Pro Gln Pro Gly
420 425 430
Ser Met Gln Met Ala Pro His His Val Met Gln Met Pro Asn Gly Gln
435 440 445
Val Met Val Met Gln Gln Met Gly Pro Arg Pro Gly Met Pro Pro Gly
450 455 460
Met Pro Gln Gln Met Met Ala Ser Ser Gln Gln Met Gly Met Leu Gln
465 470 475 480
Pro Gly Met Pro Ala Gly Gln Met Leu His Phe Gln His Pro Gln Gln
485 490 495
Val His Gln His Pro Pro Ser Ser Gly Pro Met His Ala Gly Gly Glu
500 505 510
Met Ile Asp Pro Gly Ser Met Gln Arg Leu His Gln Gln Pro His Tyr
515 520 525
Ile Val Pro Asn Ala Gln His Met Pro Ala Pro Ala Met Gly Met Pro
530 535 540
Pro Gly Ala Val Gln His Met Glu Tyr Ala Tyr Ser Gln Pro Met Gln
545 550 555 560
Met Ala Gly Trp Pro Val Gln Gly Gln Pro Gly Ser Gln Ala
565 570
<210>17
<211>674
<212>PRT
<213> genus Oocystis
<400>17
Met Leu Ala Phe Thr His Gln Arg Met Thr Thr Ala Pro Ala Leu Ala
1 5 10 15
Val Ala Thr Ser His Phe Phe Ala His Val Arg Val Thr Thr Gly Ser
20 25 30
Ser Ala Ile Ala Thr Val Phe Ala Ala Arg Ser Arg Gly Ser Gly Leu
35 40 45
Leu Ala Gly Phe Asn Thr Met Glu Asn Val Lys Val Glu Val Pro Glu
50 55 60
Val Val Pro Glu Asn Val Asn Phe Pro Ala Gly Leu Lys Val Leu Val
65 70 75 80
Val Asp Asp Asp Pro Leu Cys Leu Lys Val Ile Asp Gln Met Leu Arg
85 90 95
Arg Cys Asn Tyr Ala Ala Thr Thr Cys Gln Ser Ser Leu Glu Ala Leu
100 105 110
Glu Leu Leu Arg Ser Ser Lys Glu Asn His Phe Asp Leu Val Leu Ser
115 120 125
Asp Val Tyr Met Pro Asp Met Asp Gly Phe Lys Leu Leu Glu Ile Ile
130 135 140
Gly Leu Glu Met Gly Leu Pro Val Ile Met Met Ser Ser Asn Gly Glu
145 150 155 160
Thr Gly Val Val Phe Arg Gly Val Thr His Gly Ala Val Asp Phe Leu
165 170 175
Ile Lys Pro Val Arg Ile Glu Glu Leu Arg Asn Leu Trp Gln His Val
180 185 190
Val Arg Lys Thr Met Val Val Pro Ser Asn Asp Lys Ala Thr Ser Glu
195 200 205
Glu Asp Gly Glu Glu Ser Lys His Arg Val Asp Arg Lys Arg Lys Glu
210 215 220
Ser Phe His Ser Arg Ala Arg Glu Gln Val Glu Ile Ala Cys Ser Val
225 230 235 240
Val Pro Ala Leu Leu Trp Pro Thr Val Pro Pro Ser Ser Val His Pro
245 250 255
Thr Ser Ser Ser Phe Leu Arg Ser His Val Leu Leu Leu Gln Arg Ser
260 265 270
Ser Gly Gly Lys Asp Val Leu Asp Glu Gly Gly Ser Asn Ala Lys Lys
275 280 285
Pro Arg Val Val Trp Ser Val Glu Met His Gln Gln Phe Val Asn Ala
290 295 300
Val Asn Gln Leu Gly Ile Asp Lys Ala Val Pro Lys Arg Ile Leu Asp
305 310 315 320
Leu Met Asn Val Asp Gly Leu Thr Arg Glu Asn Val Ala Ser His Leu
325 330 335
Gln Lys Tyr Arg Leu Tyr Leu Lys Arg Val Ala Gly Ile Asn Thr Ala
340 345 350
Thr Gly Ser Arg Asn Gly Lys Gly Arg Ser Asp Val Ser Gly Leu Ser
355 360 365
Gly Met Pro Asn Gly Ser Leu Pro Met Pro Gly Met Met Pro Pro His
370 375 380
Met Ala Ala Gly Met Leu Leu Ala Gly Met Ala Ala Asp Val Gly Pro
385 390 395 400
Arg Pro His Pro Phe Pro Ile Met Pro Met Pro Ala Met Ala Leu Gln
405 410 415
Gly Met His Gly Gly Met Ala Gln Met Met Gln Leu Pro Pro Gly Met
420 425 430
Pro Pro Pro Met Met Met Pro Met Ala Pro Leu Leu Pro Ser Gln Leu
435 440 445
Ala Ala Leu Gly Gln Gln Gln Gln Gln Gln Gln Gln Gln Gln Val Ala
450 455 460
Arg Ser Glu Ser Met Pro Ser Glu Asn Gly Val Ala Gly Pro Ser Gly
465 470 475 480
Ser Phe Thr Ala Met Leu Asn Gly Pro Ala Pro Met Glu Ser Ser Pro
485 490 495
Phe Ala Ala Leu Gln Val Phe Gly Pro Pro Gln Gly Met Glu Gln Leu
500 505 510
Thr Gln Gln Gln Gln Gln Gln Gln Gln Ala Gly Ala Ala Ala Phe Val
515 520 525
Ala Ala Phe Ala Ala Ala Asn Gly Gly Asp Met Gln Gly Gly Gly Gly
530 535 540
Gly Pro Gly Pro Met Leu Gly Gly Ala Gly Gly Ala Gly Pro Leu Leu
545 550 555 560
Gly Gly Val Gly Gly Gly Asp Pro Leu His Gly Gly Gly Gly Ser Ser
565 570 575
Ala Leu Gly Gly Arg Pro Met Met Ser Ala Glu Gln Pro Met Gly Gly
580 585 590
Ser Gly Gly Leu Ala Ser Asn Ser Leu Thr Val Gln Gln Asn Asp Leu
595 600 605
Ala Gln Met Cys Ser Gln Leu Asp Val Asn Gly Leu Gln Ala Val Ala
610 615 620
Ala Ala Ala Ala Ala Gly Ala Met Gly Ala Pro Gly Gly Ala Gly Gly
625 630 635 640
Ala Met Pro Pro Ser Ser Val Gly Gly Val Gly Pro Asp Met Lys Leu
645 650 655
Thr Glu Gln Asp Asp Phe Phe Ser Phe Leu Leu Lys Asp Ser Asn Leu
660 665 670
Ile Asp
<210>18
<211>488
<212>PRT
<213> Micromonas sp.
<400>18
Met Ser Thr Pro Ala Val Ser Lys Gly Phe Pro Ile Gly Leu Arg Val
1 5 10 15
Leu Val Val Asp Asp Asp Pro Leu Cys Leu Lys Ile Val Glu Lys Met
20 25 30
Leu Lys Arg Cys Gln Tyr Glu Val Thr Thr Phe Ser Arg Gly Ala Glu
35 40 45
Ala Leu Lys Thr Leu Arg Glu Arg Lys Asp Asp Phe Asp Ile Val Leu
50 55 60
Ser Asp Val His Met Pro Asp Met Asp Gly Phe Lys Leu Leu Glu His
65 70 75 80
Ile Ala Leu Glu Leu Asp Ile Pro Val Met Met Met Ser Ala Asn Cys
85 90 95
Ala Thr Asp Val Val Leu Arg Gly Ile Ile His Gly Ala Val Asp Tyr
100 105 110
Leu Leu Lys Pro Val Arg Ile Glu Glu Leu Arg Asn Ile Trp Gln His
115 120 125
Val Val Arg Arg Lys Arg Glu Ser Ser Gln Gly Asn Leu Arg Ser Gly
130 135 140
Glu Gly Gly Ser Asn Gly Arg Thr Val Ser Gly Gly Ser Thr Gly Glu
145 150 155 160
Gly Gly Gly Lys Asp Ser Lys Gly Ser Ser Glu Gln His Gly Asp Ala
165 170 175
Lys Asp Lys Thr Gly Ser Ala Gly Gly Ser Gly Gly Ser Ser Lys Arg
180 185 190
Lys Lys Gly Ser Gly Lys Lys Gly Asp Glu Gly Thr Asp Glu Val Lys
195 200 205
Asp Gly Ser Gly Gly Asp Glu Asn Glu Asp Ser Ser Ala Leu Lys Lys
210 215 220
Pro Arg Val Val Trp Ser Ala Glu Leu His Gln Gln Phe Val Thr Ala
225 230 235 240
Val Asn Gln Leu Gly Ile Asp Lys Ala Val Pro Lys Arg Ile Leu Asp
245 250 255
Leu Met Gly Val Gln Gly Leu Thr Arg Glu Asn Val Ala Ser His Leu
260 265 270
Gln Lys Tyr Arg Leu Tyr Leu Lys Arg Leu Gln Gly Val Asn Ser Gly
275 280 285
Gly Ala Pro Gly Gly Gly Pro Gly Phe Met Ser Pro Ile Ala Leu Asp
290 295 300
Gly Ser Met Val Gln Gly Gly Pro Gly Gly Arg Val Gly Ser Pro Ala
305 310 315 320
Ile Gly Gly Pro Asn Gly Pro Ile Met Val Gly His Gly His Ile Asp
325 330 335
Pro Ala Met Leu Ala Gly Gly Ala Pro Gln Thr Ile Gln Met Gly Met
340 345 350
Val Tyr Gly Gly Pro Gly Met Gly Pro Pro Gln Met Met Ala Pro Asn
355 360 365
Gly Lys Gly Gly Gly Gly Met Pro Gly Gly Tyr Val Met Gln Pro Gly
370 375 380
Gln Met Met Ala Pro Asn Gly Gln Met Met Pro Val Gly Gln Met Gly
385 390 395 400
Pro Gly Gly Met Met Val Gln Gly Pro Gly Gly Gly Met Met Gln Met
405 410 415
His Asp Gly Gly Met Met Asn Gly Asn Gly Ser Tyr Gly Ser Leu Gln
420 425 430
Asn Met Lys Gln Gly Asn Gly Val Val Met Met Pro Asn Gly Gly Met
435 440 445
Gly Gly Val Asp Gly Ala Ile Pro Asn Met Ala Thr Gly Leu Ile Asn
450 455 460
Gly Gln Gly Leu Pro Asp Asp Asp Val Leu Asp Met Phe Leu Lys Asp
465 470 475 480
Gly Leu Pro Glu Gly Glu Gly Phe
485
<210>19
<211>544
<212>PRT
<213> Micromonas pusilla
<400>19
Met Thr Ala Glu Lys Lys Glu Leu Lys Val Phe Pro Ala Gly Leu Arg
1 5 10 15
Val Leu Val Val Asp Asp Asp Pro Leu Cys Leu Arg Ile Val Glu Lys
20 25 30
Met Leu Lys Arg Cys Gln Tyr Glu Val Thr Thr Phe Ser Arg Gly Ala
35 40 45
Glu Ala Leu Glu Thr Leu Arg Ala Arg Arg Asp Asp Phe Asp Ile Val
50 55 60
Leu Ser Asp Val His Met Pro Asp Met Asp Gly Phe Lys Leu Leu Glu
65 70 75 80
His Ile Ala Leu Glu Leu Asp Val Pro Val Met Met Met Ser Ala Asn
85 90 95
Cys Ala Thr Asp Val Val Leu Arg Gly Ile Ile His Gly Ala Val Asp
100 105 110
Tyr Leu Leu Lys Pro Val Arg Leu Glu Glu Leu Arg Asn Ile Trp Gln
115 120 125
His Val Val Arg Arg Gln Arg Glu Pro Ser Lys Asp Gly Ala Ala Gly
130 135 140
Lys Gly Gly Gly Ala Ser Gly Ala Pro Glu Val Ser Gly Asp Thr His
145 150 155 160
Ala Asn Thr Asp Asp Lys Gln Asp Gly Asn Ala Thr Asp Ser Lys Gly
165 170 175
Ser Gly Ser Gln Lys Arg Lys Ser Gly Lys Ser Gly Asp Asp Gly Gly
180 185 190
Lys Asp Gly Gly Gly Ser Gly Gly Lys Asp Gly Asp Ala Ser Asn Lys
195 200 205
Gly Asn Asn Asn Lys Arg Lys Lys Gly Lys Ser Asn Asp Ala Thr Glu
210 215 220
Thr Ala Gly Gly Ala Gly Val Glu Asp Asn Asp Asp Thr Ser Gly Leu
225 230 235 240
Lys Lys Pro Arg Val Val Trp Ser Pro Glu Leu His Gln Gln Phe Val
245 250 255
Thr Ala Val Asn Gln Leu Gly Ile Asp Lys Ala Val Pro Lys Arg Ile
260 265 270
Leu Asp Leu Met Gly Val Gln Gly Leu Thr Arg Glu Asn Val Ala Ser
275 280 285
His Leu Gln Lys Tyr Arg Leu Tyr Leu Lys Arg Leu Gln Gly Val Asn
290 295 300
Asn Asn Gly Thr Val Pro Ser Gly Ala Ala Gly Phe Met Thr Gly Leu
305 310 315 320
Ala Ile Asp Gly Val Gly Gly Val Met Gly Pro Pro Thr Thr Gly Ser
325 330 335
Pro Ala Met Asn Gly Pro Gly Gly Pro Gly Gly Gly Leu Val Met Gly
340 345 350
Pro Gly His Met Gly Gly Pro His Met Asp Gly Ser Gly Met Met His
355 360 365
Met Gly Pro Gly Gly Pro Met Ala Gly Met Thr Val Val Tyr Gly Gly
370 375 380
Gly Met Pro Gly Gly Met Pro Gly Gly Ala Asp Ser Lys Asn Gly Ala
385 390 395 400
Ser Gly Gln Pro Pro Pro Gly Gly Tyr Val Val Met Gly Gly Pro His
405 410 415
Gly Gly Gly Pro Gly Gly Ala Pro Met Met Met Gln His Gly Gly Met
420 425 430
Val Pro Gly Pro Gly Pro Gly Leu Val Pro Gly Pro Gly Gly Ser Leu
435 440 445
Met Met Pro Ala Gly Met Met Pro Asp Gly Gly Gly Gly Met Val Gly
450 455 460
Val His Val Gly Pro Gly Val Val Met Gly Gln His Gln Leu Gly Gly
465 470 475 480
Lys His Ser Ser Gly Gly Ala Gly Met Ala Gly Gly Ser Ala Ala Gly
485 490 495
Lys Gly Ala Gln Arg Gly Gly Val Gly Gly Ala Phe Asp Val Pro Pro
500 505 510
Thr Asn Gly Ser Leu Asp Ala Asp Glu Ile Gly Asp Asp Val Leu Thr
515 520 525
Met Phe Leu Lys Asp Gly Leu Pro Glu Met Asn Asp Gly Asp Ala Leu
530 535 540
<210>20
<211>776
<212>PRT
<213> Sphagnum fallax
<400>20
Met Ser Gly Gly Asp Leu Ser Arg Val Arg Glu Gly Thr Ala Asp Leu
1 5 10 15
Asp Pro Val Met Ala Ser His Gln His Pro Pro Pro Arg Gln Gln Ser
20 25 30
His Gln Gln Pro Lys Asn His Gln Gln Glu Ala His Gln Gln His Cys
35 40 45
Ser Ser Ala Glu Thr Thr Ser Pro Asn Asn Thr Ala Arg Gly Ala Gly
50 55 60
Ala Thr Tyr Gly Lys Met Glu Pro Ala Asp Asp Phe Pro Ala Gly Leu
65 70 75 80
Arg Ile Leu Val Val Asp Asp Asp Pro Thr Cys Leu Ala Ile Leu Lys
85 90 95
Lys Met Leu Gln Gln Cys Ser Tyr Gln Val Thr Thr Cys Gly Arg Ala
100 105 110
Thr Arg Ala Leu Glu Leu Leu Arg Glu Asp Lys Asp Lys Phe Asp Leu
115 120 125
Val Ile Ser Asp Val Tyr Met Pro Asp Met Asp Gly Phe Lys Leu Leu
130 135 140
Glu Leu Val Gly Leu Glu Met Asp Leu Pro Val Ile Met Met Ser Gly
145 150 155 160
Asn Gly Glu Thr Ser Val Val Met Lys Gly Ile Thr His Gly Ala Cys
165 170 175
Asp Tyr Leu Leu Lys Pro Val Arg Ile Glu Glu Leu Ser Asn Ile Trp
180 185 190
Gln His Val Val Arg Lys Leu Arg Ser Glu Pro Lys Glu His Ser Ala
195 200 205
Ser Leu Glu Asp Gly Asp Arg Gln Arg Arg Gly Gly Ala Glu Asp Ala
210 215 220
Asp Asn Thr Ser Ser Ala Ala Asp Thr Ala Asp Gly Ile Trp Arg Asn
225 230 235 240
Lys Lys Lys Lys Glu Ala Lys Glu Asp Glu Glu Asp Phe Glu Gln Asp
245 250 255
Asn Asp Asp Pro Ser Thr Leu Lys Lys Pro Arg Val Val Trp Ser Val
260 265 270
Glu Leu His Gln Gln Phe Val Ser Ala Val Asn Gln Leu Gly Ile Asp
275 280 285
Lys Ala Val Pro Lys Arg Ile Leu Glu Leu Met Ser Val Gln Gly Leu
290 295 300
Thr Arg Glu Asn Val Ala Ser His Leu Gln Lys Tyr Arg Leu Tyr Leu
305 310 315 320
Lys Arg Leu Ser Gly Val Thr Ser Gln Ser Asn Ser Leu Asn Val Ser
325 330 335
Phe Gly Gly Pro Asp Ala Gly Tyr Gly Gly Leu Phe Gly Leu Asp Glu
340 345 350
Met Ser Asp Tyr Arg Asn Leu Val Thr Asn Gly His Leu Pro Ala Gln
355 360 365
Thr Ile Ala Ala Leu His His Ala Asn Met Ala Gly Arg Leu Gly Ala
370 375 380
Ser Ser Gly Met Val Gly Pro Ser Ser Pro Leu Asp Pro Ser Val Leu
385 390 395 400
Ala Gln Ile Ala Ala Leu Gln Ser Gly Ser Leu Pro Arg Pro Gly Met
405 410 415
Asp Gly Ser Leu Gln Gly Asn Gln Ala Gly Leu Leu Gln Ser Leu Ser
420 425 430
Gly Ala Leu Asp Tyr Asn Ser Leu His Gln Ser His Leu Leu Pro Ala
435 440 445
Ile Gly Gln Leu Gly Gln Leu Asp Glu Leu Pro Ser Leu Lys Ser Met
450 455 460
Gln His Gln Leu Gly Met Gly Ser Leu Gly Gly Ser Thr Arg Asn Leu
465 470 475 480
Ala Gly Ser Pro Asn Glu Glu Leu Thr Met Gln Leu Leu Gln Gln Arg
485 490 495
Ala Gln Gln Gln Ser Gly Gly Ser Pro Ile Asn Leu Pro Gln Ala Thr
500 505 510
Gly Ile Leu Arg Pro Leu Ser Ser Asn Ile Asn Gln Gly Gly Ser Val
515 520 525
Pro Asn Leu Val Gly Val Ile Pro Gly Thr Ala Ile Gly Leu Ser Asn
530 535 540
Met Cys Ser Gly Gly Arg Glu Phe Gly Ser Ser Ser Gly Leu Leu Ser
545 550 555 560
Ala Ser Gly Ser Leu Met Gln Ser Ser Thr Val Glu Ala Gln Asn Leu
565 570 575
Asn Phe Gly Gly Ser Ser Gly Ser Ser Gly Cys Ser Phe Gln Ala Ser
580 585 590
Val Leu Ser Ser Lys Thr Gly Gly Leu Glu Asp Leu Asn Pro Ala Lys
595 600 605
Arg Val Arg Thr Thr Tyr Ser Ala Leu Ser His Ser Ser Pro Asp Leu
610 615 620
Gly Gln Ser Ser Arg Pro Ala Trp Leu Gly Ser Gln Glu Gly Leu Val
625 630 635 640
His Gly Asp Pro Val Tyr Ser Pro His Gln Leu Ser Leu Pro Arg Gln
645 650 655
Asp Ile Val Gly Gly Ile Gly Ser Ser Gly Arg Pro Ala Tyr Met Gly
660 665 670
Ser Gln Ser Met Gly Ser Leu Gly Met Asn Phe Pro Leu Ser Leu Ala
675 680 685
Val Asp Ala Gly Ala Val Arg Pro Ser Leu Thr Arg Gly Gln Ser Leu
690 695 700
Thr Glu Gln Val Ala Ala Asn Arg Glu Leu Lys Phe Pro Lys Glu Glu
705 710 715 720
Arg Gly Arg Asp Asn Leu Met Cys Ala Arg Leu Gly Gly Gly Met Ile
725 730 735
Thr Asn Glu Ser Ser Ser Glu Glu Leu Leu Asn Tyr Leu Lys Gln Ser
740 745 750
His Glu Gly Leu Gly Phe Met Glu Gly Asp Leu Val Ser Asp Gly Tyr
755 760 765
Pro Val Asp Asn Leu Tyr Val Lys
770 775
<210>21
<211>715
<212>PRT
<213> Physcomitrella patens
<400>21
Met Gly Gly Gly Tyr Leu Ser Ser Thr Val Asn Met Gly Glu Ser Arg
1 5 10 15
Asp Gly Gly Ser Pro Ala Met Ala Thr Leu Gln Gln Gln Gln Lys His
20 25 30
Gln Pro Leu Asn Pro Asn His Gln Asn Pro Arg Asn Arg Ser Asn Ser
35 40 45
Ser Pro Thr Asn Cys Tyr Ser Asn Thr Ala Trp Gly Ala Lys Pro Ala
50 55 60
Lys Leu Asp Thr Pro Asp Glu Phe Pro Val Gly Met Arg Val Leu Val
65 70 75 80
Val Asp Asp Asn Pro Thr Cys Leu Met Ile Leu Glu Gln Met Leu Val
85 90 95
Arg Cys Ala Tyr Arg Val Thr Thr Cys Gly Lys Ala Thr Glu Ala Leu
100 105 110
Ser Met Leu Arg Glu Asp Ile Gly Lys Phe Asp Val Val Ile Ser Asp
115 120 125
Val Asp Met Pro Asp Met Asp Gly Phe Lys Leu Leu Glu Leu Val Gly
130 135 140
Leu Glu Met Asp Leu Pro Val Ile Met Val Ser Gly Asn Gly Glu Thr
145 150 155 160
Ser Ala Val Met Lys Gly Ile Thr His Gly Ala Cys Asp Tyr Leu Leu
165 170 175
Lys Pro Val Arg Ile Glu Glu Leu Arg Asn Ile Trp Gln His Val Val
180 185 190
Arg Lys Lys Arg Arg Glu Val Lys Ala Val Ala Thr Lys Ser Val Glu
195 200 205
Glu Ala Gly Gly Cys Glu Arg Pro Lys Arg Gly Gly Gly Ala Asp Asp
210 215 220
Ala Asp Tyr Thr Ser Ser Ala Thr Asp Thr Thr Asp Ser Asn Trp Lys
225 230 235 240
Leu Thr Lys Arg Arg Lys Gly Glu Phe Lys Asp Glu Asn Glu Glu Asp
245 250 255
Asn Glu Gln Glu Asn Asp Asp Pro Ser Thr Leu Lys Arg Pro Arg Val
260 265 270
Val Trp Ser Val Glu Leu His Gln Gln Phe Val Ser Ala Val Asn Gln
275 280 285
Leu Gly Ile Asp Lys Ala Val Pro Lys Arg Ile Leu Glu Leu Met Gly
290 295 300
Val Gln Gly Leu Thr Arg Glu Asn Val Ala Ser His Leu Gln Lys Tyr
305 310 315 320
Arg Leu Tyr Leu Lys Arg Leu Ser Gly Val Thr Ser Gln Gln Gly Asn
325 330 335
Met Ser Ala His Phe Gly Gly Ser Asp Pro Phe Cys Met Met Pro Pro
340 345 350
Asp Met Ser Leu Ala Asn Gly Gln Leu Thr Pro Gln Ala Leu Ala Lys
355 360 365
Phe His Met Leu Gly Arg Met Asn Ala Thr Asn Gly Ile Gly Phe Ser
370 375 380
Gly Gly Gly Leu Asp Pro Gly Met Asn Gln Met Phe Leu Gln Asp Leu
385 390 395 400
Pro Arg Pro Pro Gln Leu Asn Ser Met Leu Arg Asn Asn Thr Gly Leu
405 410 415
Leu Ala Ser Val Pro Asn Gly Leu Gln His Leu Glu Gln Leu Ser Glu
420 425 430
Pro His His Val His Val Val Asn Glu Leu Glu His Tyr Pro Ser Asn
435 440 445
Thr Lys Val Tyr Pro Gln Leu Asn Gly Asn Leu Asp Val Ser Val Gly
450 455 460
Pro Leu Gly Ala Ala Asn Gly Asn Leu Ala Ser Asn Pro Asn Ser Asp
465 470 475 480
Thr Leu Leu Met His Ile Leu His Ser Arg Ala Ser Gln Gln Gly Val
485 490 495
Gly Ser Pro Ser Thr Leu Pro Gln Pro Arg Cys Gly Leu Asn Pro Thr
500 505 510
His Leu Leu Ser Asn Asp Ile Asn Phe Ala Pro Val Gly Ser Leu Pro
515 520 525
Asn Leu Ala Gly Ser Leu Gly Pro Ala Val Gly Leu Ser Ala Ile Pro
530 535 540
Gly Ser Ala Gly Gly Arg Asp Leu Ser Pro Ser Val Gly Gly Ser Gly
545 550 555 560
Ala Ser Leu Ser Ser Pro Leu Gly Ser Leu Val Arg Arg Pro Leu Met
565 570 575
Ala Glu Glu Gln Ser Asn Pro Val Asn Ser Thr Asn Gly Thr Tyr Ser
580 585 590
Met Ala His Ser Gly Gln Ser Pro Lys Pro Ser Gly Asp Thr Leu Pro
595 600 605
Thr Pro Leu Asn Glu Gly Leu Glu Gln Gln Gln Pro Leu Trp Ala Leu
610 615 620
Tyr Gln Asn Pro Met Asn Gln Leu Ser His Gly Pro Ser Gln Gly Phe
625 630 635 640
Pro His Asp Ser Leu Gln Trp Ser Val Leu Thr Glu Asn Leu Ser Phe
645 650 655
Gly Asp Met Gly Gln Ser Leu Ser Ala Gly Leu Ile Ser Gln Phe Ser
660 665 670
Ser Gln Gly Gln Asp Asn Gly Ile Gly Phe Ala Pro Pro Ser Gln Arg
675 680 685
Gly Ser Tyr Thr Arg Gln Ser Val Ser Phe Pro Ala Ser Ser Ala Leu
690 695 700
Asp Gly Arg Met Val Arg Ser Ser Tyr Glu Pro
705 710 715
<210>22
<211>664
<212>PRT
<213> Arabidopsis thaliana
<400>22
Met Val Asn Pro Gly His Gly Arg Gly Pro Asp Ser Gly Thr Ala Ala
1 5 10 15
Gly Gly Ser Asn Ser Asp Pro Phe Pro Ala Asn Leu Arg Val Leu Val
20 25 30
Val Asp Asp Asp Pro Thr Cys Leu Met Ile Leu Glu Arg Met Leu Met
35 40 45
Thr Cys Leu Tyr Arg Val Thr Lys Cys Asn Arg Ala Glu Ser Ala Leu
50 55 60
Ser Leu Leu Arg Lys Asn Lys Asn Gly Phe Asp Ile Val Ile Ser Asp
65 70 75 80
Val His Met Pro Asp Met Asp Gly Phe Lys Leu Leu Glu His Val Gly
85 90 95
Leu Glu Met Asp Leu Pro Val Ile Met Met Ser Ala Asp Asp Ser Lys
100 105 110
Ser Val Val Leu Lys Gly Val Thr His Gly Ala Val Asp Tyr Leu Ile
115 120 125
Lys Pro Val Arg Ile Glu Ala Leu Lys Asn Ile Trp Gln His Val Val
130 135 140
Arg Lys Lys Arg Asn Glu Trp Asn Val Ser Glu His Ser Gly Gly Ser
145 150 155 160
Ile Glu Asp Thr Gly Gly Asp Arg Asp Arg Gln Gln Gln His Arg Glu
165 170 175
Asp Ala Asp Asn Asn Ser Ser Ser Val Asn Glu Gly Asn Gly Arg Ser
180 185 190
Ser Arg Lys Arg Lys Glu Glu Glu Val Asp Asp Gln Gly Asp Asp Lys
195 200 205
Glu Asp Ser Ser Ser Leu Lys Lys Pro Arg Val Val Trp Ser Val Glu
210 215 220
Leu His Gln Gln Phe Val Ala Ala Val Asn Gln Leu Gly Val Asp Lys
225 230 235 240
Ala Val Pro Lys Lys Ile Leu Glu Met Met Asn Val Pro Gly Leu Thr
245 250 255
Arg Glu Asn Val Ala Ser His Leu Gln Lys Tyr Arg Ile Tyr Leu Arg
260 265 270
Arg Leu Gly Gly Val Ser Gln His Gln Gly Asn Met Asn His Ser Phe
275 280 285
Met Thr Gly Gln Asp Gln Ser Phe Gly Pro Leu Ser Ser Leu Asn Gly
290 295 300
Phe Asp Leu Gln Ser Leu Ala Val Thr Gly Gln Leu Pro Pro Gln Ser
305 310 315 320
Leu Ala Gln Leu Gln Ala Ala Gly Leu Gly Arg Pro Thr Leu Ala Lys
325 330 335
Pro Gly Met Ser Val Ser Pro Leu Val Asp Gln Arg Ser Ile Phe Asn
340 345 350
Phe Glu Asn Pro Lys Ile Arg Phe Gly Asp Gly His Gly Gln Thr Met
355 360 365
Asn Asn Gly Asn Leu Leu His Gly Val Pro Thr Gly Ser His Met Arg
370 375 380
Leu Arg Pro Gly Gln Asn Val Gln Ser Ser Gly Met Met Leu Pro Val
385 390 395 400
Ala Asp Gln Leu Pro Arg Gly Gly Pro Ser Met Leu Pro Ser Leu Gly
405 410 415
Gln Gln Pro Ile Leu Ser Ser Ser Val Ser Arg Arg Ser Asp Leu Thr
420 425 430
Gly Ala Leu Ala Val Arg Asn Ser Ile Pro Glu Thr Asn Ser Arg Val
435 440 445
Leu Pro Thr Thr His Ser Val Phe Asn Asn Phe Pro Ala Asp Leu Pro
450 455 460
Arg Ser Ser Phe Pro Leu Ala Ser Ala Pro Gly Ile Ser Val Pro Val
465 470 475 480
Ser Val Ser Tyr Gln Glu Glu Val Asn Ser Ser Asp Ala Lys Gly Gly
485 490 495
Ser Ser Ala Ala Thr Ala Gly Phe Gly Asn Pro Ser Tyr Asp Ile Phe
500 505 510
Asn Asp Phe Pro Gln His Gln Gln His Asn Lys Asn Ile Ser Asn Lys
515 520 525
Leu Asn Asp Trp Asp Leu Arg Asn Met Gly Leu Val Phe Ser Ser Asn
530 535 540
Gln Asp Ala Ala Thr Ala Thr Ala Thr Ala Ala Phe Ser Thr Ser Glu
545 550 555 560
Ala Tyr Ser Ser Ser Ser Thr Gln Arg Lys Arg Arg Glu Thr Asp Ala
565 570 575
Thr Val Val Gly Glu His Gly Gln Asn Leu Gln Ser Pro Ser Arg Asn
580 585 590
Leu Tyr His Leu Asn His Val Phe Met Asp Gly Gly Ser Val Arg Val
595 600 605
Lys Ser Glu Arg Val Ala Glu Thr Val Thr Cys Pro Pro Ala Asn Thr
610 615 620
Leu Phe His Glu Gln Tyr Asn Gln Glu Asp Leu Met Ser Ala Phe Leu
625 630 635 640
Lys Gln Glu Gly Ile Pro Ser Val Asp Asn Glu Phe Glu Phe Asp Gly
645 650 655
Tyr Ser Ile Asp Asn Ile Gln Val
660
<210>23
<211>1036
<212>PRT
<213> Arabidopsis halleri
<400>23
Leu Ser Lys Lys Gln Asn Glu Asp Ala Ser Gly Arg Lys Glu Glu Asp
1 5 10 15
Gly Lys Gly Asn Glu His Asn Gly Met Glu Ser Cys Thr Arg Met Lys
20 25 30
Arg Thr Val Trp Thr Val Glu Leu His Gln Lys Phe Val Asn Ala Phe
35 40 45
Gln Gln Leu Gly Leu Asp Lys Ala Ser Pro Glu Gln Ile His Ala Leu
50 55 60
Met Asn Val Glu Gly Leu Pro Val Ile Asn Val Ala Ser His Leu Gln
65 70 75 80
Lys Tyr Arg Leu Phe Leu Lys Lys Ile Tyr Glu Gly Gln Gln Leu Asp
85 90 95
Met Ala Thr Ile Gln Leu Leu Leu Ser Ala Gly Ser His Phe Pro Gln
100 105 110
Thr Pro Trp Thr Asn His Cys Ser Ser Phe Ile Gln Gln Gly His His
115 120 125
Gln Asn Ser Ser Asn Ser Ser Glu Thr Tyr His Thr Thr Leu Ser Pro
130 135 140
Arg Val Gln Lys Val Asn Thr Phe Gln Pro Ser Ser Ser Pro Leu Lys
145 150 155 160
Pro Leu Leu Phe Pro Lys Ser Asn Ile Ser Ala Phe Lys Glu Asp Phe
165 170 175
Lys Ser Ile Lys Glu Pro Ala Ile Val Gly Asp Ser Ser Leu Asp Ser
180 185 190
Ser Lys Pro Arg Asn Ser Phe Gln Thr Ala Ser Lys Phe Pro Lys Thr
195 200 205
Asp Pro Cys Thr Gly Ser Tyr Ile Ile Glu Ile Met Thr Glu Pro Tyr
210 215 220
Tyr Gly Lys Ser Ser Arg Arg His Ser Asn Phe Ser Ala Tyr Met Gly
225 230 235 240
Asp Phe Lys Ser Ile Lys Asp Pro Glu Ile Val Gln Glu Ser Arg Thr
245 250 255
Arg Lys Asn His Gly Arg Val Val Trp Ser His Glu Leu His Gln Lys
260 265 270
Phe Leu Asn Ala Ile Asp Gln Leu Gly Gly Asn Glu Lys Ala Ile Pro
275 280 285
Lys Lys Ile Leu Ala Val Met Asn Val Glu Gly Leu Thr Arg Leu Asn
290 295 300
Val Ala Thr His Leu Gln Lys Tyr Arg Gln Cys Cys Ser Ala Glu Ala
305 310 315 320
Gln Gln Leu Asn Met Ala Thr Arg Lys Leu Pro Ser Ser Glu His Leu
325 330 335
Pro Gln Ser Pro Ser Thr Asn His His Ser Ser Leu Ser Pro Arg Val
340 345 350
Gln Asp Val Asn Ile Arg Leu Trp Ser Ser Ser Pro Lys Arg Gln Asp
355 360 365
Gln Ile Leu Val Tyr Val Leu Phe Ser Phe Glu Asn Asp Asn Gly Arg
370 375 380
Glu Glu Thr Thr Cys Arg Arg Ile Ala Ser Thr Met Glu Leu Gly Ser
385 390 395 400
Thr Glu Asp Gly Arg His Asp Lys Phe Pro Val Gly Met Arg Val Leu
405 410 415
Ala Val Asp Asp Asn Pro Thr Cys Leu Arg Lys Leu Glu Glu Leu Leu
420 425 430
Leu Arg Cys Lys Tyr His Val Thr Lys Thr Met Glu Ser Arg Lys Ala
435 440 445
Leu Glu Leu Leu Arg Glu Asn Ser Asn Met Phe Asp Leu Val Ile Ser
450 455 460
Asp Val Glu Met Pro Asp Thr Asp Gly Phe Lys Leu Leu Glu Ile Gly
465 470 475 480
Leu Glu Met Asp Leu Pro Val Ile Met Leu Ser Ala His Ser Asp Tyr
485 490 495
Asp Ser Val Met Lys Gly Ile Ile His Gly Ala Cys Asp Tyr Leu Val
500 505 510
Lys Pro Val Gly Leu Lys Glu Leu Gln Asn Ile Trp His His Val Val
515 520 525
Lys Lys Asn Ile Lys Ser Tyr Ala Lys Asn Ile Gly Pro Ser Arg Gln
530 535 540
Leu Leu Pro Pro Ser Glu Ser Asn Leu Val Pro Ser Ala Ser Lys Lys
545 550 555 560
Arg Lys Glu Lys Ala Ser Asp Ser Gly Asp Glu Asp Asp Ser Asp Arg
565 570 575
Glu Glu Asp Asp Gly Glu Gly Ser Glu Gln Asp Gly Glu Glu Ser Gly
580 585 590
Thr Arg Lys Lys Pro Arg Val Val Trp Ser Gln Glu Leu His Gln Lys
595 600 605
Phe Val Ser Ala Val Gln Gln Leu Gly Leu Asp Lys Ala Val Pro Lys
610 615 620
Lys Ile Leu Asp Leu Met Ser Ile Glu Gly Leu Thr Arg Glu Asn Val
625 630 635 640
Ala Ser His Leu Gln Lys Tyr Arg Leu Tyr Leu Lys Lys Ile Asp Glu
645 650 655
Gly Gln Gln Gln Asn Met Thr Pro Asp Ala Phe Gly Thr Arg Asp Ser
660 665 670
Ser Tyr Phe Gln Met Ala Gln Leu Asp Gly Leu Arg Asp Phe Thr Ala
675 680 685
Thr Arg Gln Ile Pro Ser Ser Gly Leu Leu Ser Arg Ser His Leu Thr
690 695 700
Lys Leu Gln Pro Pro Met Tyr Ser Ser Ile Asn Leu Gln Gly Met Asn
705 710 715 720
Ser Ser Ser Phe Ile Gln Gln Gly His His His Asn Ser Ser Asn Ser
725 730 735
Ala Asn Pro Phe Gly Thr Tyr His Thr Thr Leu Ser Pro Arg Ile Gln
740 745 750
Asn Val Asn Leu Leu Gln Arg Thr Ser Ser Pro Leu Glu Thr Leu Gln
755 760 765
Phe Pro Arg Ser Lys Ser Tyr Ile Gly Asp Phe Lys Gly Ile Gly Asp
770 775 780
Arg Ala Val Gly Gly Ser Phe Leu Asp Ser Cys Met Pro Phe Gly Ser
785 790 795 800
Ser Ser Thr Ser Leu Pro Ser Ala Ser Thr Asn Thr Leu Met Leu Gln
805 810 815
Ala Asn Tyr Thr Gln Pro Leu His Ile Ala Ser Asp Gly Asn Gln Pro
820 825 830
Cys Ile Glu Gly Thr Pro Ser Asn Ser Ala Ser Pro Asn Ile Ser Phe
835 840 845
Gln Gly Leu Ser Arg Phe Pro Ser His Ser Trp Gln Gly Asn Leu Asn
850 855 860
Thr Thr Arg Phe Pro Pro Ser Ser Leu Pro Leu Asn Gln Ala Phe Leu
865 870 875 880
Pro Asp Gln Val Thr Cys Ala Gly Asn Asn Leu Gly Asp Cys Thr Ser
885 890 895
Leu Val Ser Ala Gly Asn Pro Gly Gly Glu Met Gln Cys Glu Pro Gln
900 905 910
Leu Leu Gly Gly Phe Met Gln Asn Met Asn Pro Leu Asp Gly Gln Lys
915 920 925
Trp Glu Gln Gln Asn Ser Met Leu Asn Asn Pro Phe Gly Asn Ile Glu
930 935 940
Tyr Pro Leu Ser Ala Asp Asn Met Val Phe Arg Asp Asn Asn Ala Thr
945 950 955 960
Arg Asn Lys Gly Leu Asp Glu Ser Leu Met Asn Pro Ile Asp Asn Ser
965 970 975
Gln Glu Tyr Val Gly Lys Ala Thr Thr Met Leu Asp Pro Glu Met Lys
980 985 990
Ser Gly Lys Pro Glu Asn Asp Asn Gln His Asp Val Phe Asp Asp Ile
995 1000 1005
Met Asn Glu Met Met Lys Gln Glu Glu Asn Asn Gly Met Val Ser
1010 1015 1020
Val Ala Thr Arg Phe Gly Phe Asp Ser Phe Pro Pro Pro
1025 1030 1035
<210>24
<211>774
<212>PRT
<213> Arabidopsis lyrata
<400>24
Met Gly Asp Phe Lys Ser Ile Lys Glu Pro Glu Ile Val Gln Glu Ser
1 5 10 15
Arg Thr Arg Lys Asn His Gly Arg Val Val Trp Ser His Glu Leu His
20 25 30
Gln Lys Phe Leu His Ala Ile Asp Gln Leu Gly Gly Asn Asp Lys Ala
35 40 45
Ile Pro Lys Lys Ile Leu Ala Val Met Asn Val Glu Gly Leu Thr Arg
50 55 60
Leu Asn Val Ala Thr His Leu Gln Lys Tyr Arg Gln Cys Cys Ser Thr
65 70 75 80
Glu Ala Gln Gln Leu Asn Met Ala Thr Arg Lys Leu Pro Ser Ser Glu
85 90 95
His Leu Pro Gln Ser Pro Ser Thr Asn His His Ser Ser Leu Ser Pro
100 105 110
Arg Val Gln Asp Asn Asp Asn Gly Arg Glu Glu Thr Thr Cys Arg Arg
115 120 125
Ile Ala Ser Thr Met Glu Leu Gly Ser Thr Glu Asp Gly Arg His Asp
130 135 140
Lys Phe Pro Val Gly Met Arg Val Leu Ala Val Asp Asp Asn Pro Thr
145 150 155 160
Cys Leu Arg Lys Leu Glu Glu Leu Leu Leu Arg Cys Lys Tyr His Val
165 170 175
Thr Lys Thr Met Glu Ser Arg Lys Ala Leu Glu Leu Leu Arg Glu Asn
180 185 190
Ser Asn Met Phe Asp Leu Val Ile Ser Asp Val Glu Met Pro Asp Thr
195 200 205
Asp Gly Phe Lys Leu Leu Glu Ile Gly Leu Glu Met Asp Leu Pro Val
210 215 220
Ile Met Leu Ser Ala His Ser Asp Tyr Asp Ser Val Met Lys Gly Ile
225 230 235 240
Ile His Gly Ala Cys Asp Tyr Leu Val Lys Pro Val Gly Leu Lys Glu
245 250 255
Leu Gln Asn Ile Trp His His Val Val Lys Lys Asn Ile Lys Ser Tyr
260 265 270
Ala Lys Asn Ile Gly Pro Ser Arg Gln Leu Leu Pro Pro Ser Glu Ser
275 280 285
Asn Leu Val Pro Ser Ala Ser Lys Lys Arg Lys Glu Lys Ala Asn Asp
290 295 300
Ser Gly Asp Glu Asp Asp Ser Asp Arg Glu Glu Asp Asp Gly Glu Gly
305 310 315 320
Ser Glu Gln Asp Gly Asp Glu Ala Gly Thr Arg Lys Lys Pro Arg Val
325 330 335
Val Trp Ser Gln Glu Leu His Gln Lys Phe Val Ser Ala Val Gln Gln
340 345 350
Leu Gly Leu Asp Lys Ala Val Pro Lys Lys Ile Leu Asp Leu Met Ser
355 360 365
Ile Glu Gly Leu Thr Arg Glu Asn Val Ala Ser His Leu Gln Lys Tyr
370 375 380
Arg Leu Tyr Leu Lys Lys Ile Asp Glu Gly Gln Gln Gln Asn Met Thr
385 390 395 400
Pro Asp Ala Phe Gly Thr Arg Asp Ser Ser Tyr Phe Gln Met Ala Gln
405 410 415
Leu Asp Gly Leu Arg Asp Phe Thr Ala Thr Arg Gln Ile Pro Ser Ser
420 425 430
Gly Leu Leu Ser Arg Ser His Leu Thr Lys Leu Gln Pro Pro Met Tyr
435 440 445
Ser Ser Ile Asn Leu Gln Gly Met Asn Ser Ser Ser Phe Ile Gln Gln
450 455 460
Gly His His His Asn Ser Ser Asn Ser Ala Asn Pro Phe Gly Thr Tyr
465 470 475 480
His Thr Thr Leu Ser Pro Arg Ile Gln Asn Val Asn Leu Phe Gln Arg
485 490 495
Thr Ser Ser Pro Leu Glu Thr Leu Gln Phe Pro Arg Ser Lys Ser Tyr
500 505 510
Ile Gly Asp Phe Lys Gly Ile Gly Asp Arg Ala Val Gly Gly Ser Phe
515 520 525
Leu Asp Ser Cys Met Pro Phe Gly Ser Ser Ser Thr Ser Leu Pro Ser
530 535 540
Ala Ser Thr Asn Thr Leu Met Leu Gln Ala Asn Tyr Thr Gln Pro Leu
545 550 555 560
His Ile Ser Ser Asp Gly Asn Gln Pro Cys Ile Glu Gly Thr Pro Ser
565 570 575
Asn Ser Ala Ser Pro Asn Ile Ser Phe Gln Gly Leu Ser Arg Phe Pro
580 585 590
Ser His Ser Trp Gln Gly Asn Leu Asn Thr Thr Arg Phe Pro Pro Ser
595 600 605
Ser Leu Pro Leu Asn Pro Ala Phe Leu Pro Asp Gln Val Thr Cys Ala
610 615 620
Gly Asn Asn Leu Gly Asp Cys Thr Ser Leu Val Ser Ala Gly Asn Pro
625 630 635 640
Gly Gly Glu Ile Gln Cys Glu Pro Gln Leu Leu Gly Gly Phe Met Gln
645 650 655
Asn Met Asn Pro Leu Asp Gly Gln Lys Trp Glu Gln Gln Asn Cys Thr
660 665 670
Met Leu Asn Asn Pro Phe Gly Asn Ile Glu Tyr Pro Leu Pro Ala Asp
675 680 685
Asn Met Val Phe Arg Asp Asn Asn Ala Thr Arg Ser Lys Gly Leu Asp
690 695 700
Glu Ser Leu Met Asn Pro Ile Asp Asn Ser Gln Glu Tyr Val Gly Lys
705 710 715 720
Ala Thr Thr Met Leu Asp Pro Glu Met Lys Ser Gly Lys Pro Glu Asn
725 730 735
Asp Asn Gln His Asp Val Phe Asp Asp Leu Met Asn Glu Met Met Lys
740 745 750
Gln Glu Glu Asn Asn Gly Met Val Ser Val Ala Thr Arg Phe Gly Phe
755 760 765
Asp Ser Phe Pro Pro Pro
770
<210>25
<211>578
<212>PRT
<213> Helianthus annuus
<400>25
Met Thr Thr Gly Ser Ser Phe Gly Ser Gly Ser Leu Gly Cys Lys Gln
1 5 10 15
Glu Thr Gly Val Pro Asp Gln Phe Pro Ala Gly Leu Arg Val Leu Val
20 25 30
Val Asp Asp Asp Val Ile Cys Leu Lys Ile Leu Glu Gln Met Leu Arg
35 40 45
Arg Cys Ser Tyr His Val Thr Thr Cys Ser Gln Ala Thr Ala Ala Leu
50 55 60
Asn Leu Leu Arg Glu Arg Lys Gly Cys Phe Asp Val Val Leu Ser Asp
65 70 75 80
Val His Met Pro Asp Met Asp Gly Phe Lys Leu Leu Glu Leu Val Gly
85 90 95
Leu Glu Met Asp Leu Pro Val Ile Met Met Ser Ala Asp Gly Arg Thr
100 105 110
Asn Leu Val Leu Arg Gly Ile Arg His Gly Ala Cys Asp Tyr Leu Ile
115 120 125
Lys Pro Ile Arg Glu Glu Gln Leu Lys Asn Ile Trp Gln His Val Ile
130 135 140
Arg Lys Lys Trp Asn Glu Asn Lys Glu His Glu His Ser Gly Ser Val
145 150 155 160
Asp Asp Lys Asp Arg His Lys Arg Gly Gly Asp Asp Asn Asp Tyr Ala
165 170 175
Ser Ser Val Asn Glu Gly Gly Asp Gly Ile Leu Thr Ser His Lys Lys
180 185 190
Lys Arg His Asn Asn Lys Glu Glu Asp Asp Gly Glu Leu Glu Thr Asp
195 200 205
Glu Pro Gly Gly Ser Lys Lys Ala Arg Val Val Trp Ser Val Glu Leu
210 215 220
His Gln Gln Phe Val Thr Ala Val Asn Gln Leu Gly Ile Asp Lys Ala
225 230 235 240
Val Pro Lys Arg Ile Leu Glu Leu Met Asn Val Pro Gly Leu Thr Arg
245 250 255
Glu Asn Val Ala Ser His Leu Gln Lys Phe Arg Leu Tyr Leu Lys Arg
260 265 270
Leu Ser Gly Val Ala Gln Gln Gly Gly Gly Pro Asn Ser Phe Cys Gly
275 280 285
Ser Ile Asp Gln Asn Pro Lys Leu Ala Ser Tyr Ala Arg Phe Glu Ile
290 295 300
Gln Ala Leu Ala Ala Ser Gly Gln Ile Pro Pro Gln Thr Leu Val Ala
305 310 315 320
Leu His Ala Glu Leu Leu Gly Gln Pro Thr Ala Asn Val Gly Met Pro
325 330 335
Val Leu Asp His Gln Pro Leu Met Gln Pro Ser Lys Cys Gly Pro Val
340 345 350
Asp His Val Met Ser Tyr Gly Gln Thr Leu Pro Ser Asn Val Thr Lys
355 360 365
Gln Val Pro Gln Pro Ala Ile Glu Asp Val His Ser Gly Leu Gly Ala
370 375 380
Trp His Ser Asn Asn Met Val Gly Gly Tyr Gly Gln Leu Gly Gly Gln
385 390 395 400
Asn Trp His Asn Met Leu Leu Gly Met Leu Gln Ser Gln Ser His Gln
405 410 415
Leu Gln Lys Gln Ser Ile Thr Val Gln Pro Ser Arg Leu Val Val Pro
420 425 430
Ser Gln Ser Ser Asn Phe Gln Ala Val Asn Asn Gly Val Pro Val Asn
435 440 445
Gln Thr Thr Gly Phe Asn Asn Ser Thr Val Ile Asn Tyr Ala Val Gly
450 455 460
Gln Arg Thr Glu Arg Asp Val Glu Asn Gln Ile Gly Gly Gln Ser Ser
465 470 475 480
Val Ser Asn Ile Ser Val Lys Glu Met Gly Glu Lys Gln Ile Ser Phe
485 490 495
Gly Glu Ser Val His Val Leu Asp Gln Gly Ser Leu Arg Asn Leu Gly
500 505 510
Phe Val Gly Lys Lys Ser Ser Ile Pro Ser Arg Phe Ala Val Tyr Glu
515 520 525
Ala Ala Glu Ser Leu Thr His Asn Leu Asn Tyr Gly Asp Asn Asn Gly
530 535 540
Glu Arg Arg Val Lys Gln Glu Pro Asn Ile Glu Phe Leu Glu Asn Ser
545 550 555 560
Lys Ala Gly Ala His Arg Val Ser Gln Asn Asp Leu Met Ser Lys Gln
565 570 575
Val Arg
<210>26
<211>428
<212>PRT
<213> Vitis vinifera
<400>26
Met Ala Ala Leu Leu Lys Val Pro Pro Gln Ser Ser Gly Gly Thr Asn
1 5 10 15
Gly Ser Cys Lys Ala Asp Val Val Val Ser Asp Gln Phe Pro Ala Gly
20 25 30
Leu Arg Val Leu Val Val Asp Asp Asp Val Thr Cys Leu Lys Ile Leu
35 40 45
Glu Gln Met Leu Arg Arg Cys Leu Tyr His Val Thr Thr Cys Ser Gln
50 55 60
Ala Thr Ile Ala Leu Asn Ile Leu Arg Glu Lys Lys Gly Cys Phe Asp
65 70 75 80
Ile Val Leu Ser Asp Val His Met Pro Asp Met Asp Gly Tyr Lys Leu
85 90 95
Leu Glu His Val Gly Leu Glu Met Asp Leu Pro Val Ile Met Met Ser
100 105 110
Ala Asp Gly Arg Thr Ser Ala Val Met Arg Gly Ile Arg His Gly Ala
115 120 125
Cys Asp Tyr Leu Ile Lys Pro Ile Arg Glu Glu Glu Leu Lys Asn Ile
130 135 140
Trp Gln His Val Val Arg Lys Lys Trp Asn Glu Asn Lys Glu His Glu
145 150 155 160
His Ser Gly Ser Leu Glu Asp Asn Asp Arg His Lys Arg Gly Gly Glu
165 170 175
Asp Ala Glu Tyr Ala Ser Ser Val Asn Glu Gly Ala Glu Gly Ile Leu
180 185 190
Lys Gly Gln Lys Lys Arg Arg Asp Ser Lys Asp Glu Asp Asp Gly Glu
195 200 205
Leu Glu Asn Glu Asp Pro Ser Thr Ser Lys Lys Pro Arg Val Val Trp
210 215 220
Ser Val Glu Leu His Gln Gln Phe Val Ser Ala Val Asn Gln Leu Gly
225 230 235 240
Ile Asp Lys Ala Val Pro Lys Arg Ile Leu Glu Leu Met Asn Val Pro
245 250 255
Gly Leu Thr Arg Glu Asn Val Ala Ser His Leu Gln Lys Phe Arg Leu
260 265 270
Tyr Leu Lys Arg Leu Ser Gly Val Ala Gln Gln Gln Gly Gly Ile Pro
275 280 285
Asn Ser Phe Cys Gly Pro Val Glu Pro Asn Val Lys Leu Gly Ser Leu
290 295 300
Gly Arg Phe Asp Ile Gln Ala Leu Ala Ala Ser Gly Gln Ile Pro Pro
305 310 315 320
Gln Thr Leu Ala Ala Leu Gln Ala Glu Leu Leu Gly Arg Pro Thr Ser
325 330 335
Asn Leu Val Leu Pro Ala Met Asp Gln Pro Ala Leu Leu Gln Ala Ser
340 345 350
Leu Gln Gly Pro Lys Cys Ile Pro Val Glu His Gly Val Ala Phe Gly
355 360 365
Gln Pro Leu Val Lys Cys Gln Thr Asn Ile Ser Lys His Phe Pro Pro
370 375 380
Thr Val Val Ser Thr Glu Asp Val Pro Ser Gly Phe Gly Ala Trp Pro
385 390 395 400
Ser Asn Ser Leu Gly Thr Val Gly Thr Ser Gly Ser Leu Gly Gly Leu
405 410 415
Ser Ala Gln Asn Asn Asn Ile Leu Met Asp Met Lys
420 425
<210>27
<211>659
<212>PRT
<213> Amborella trichopoda
<400>27
Met Ala Asn Val Gln Lys Leu Pro His Ser Ser Ile Ser Thr Ala Ser
1 5 10 15
Ser Tyr Gly Ser Cys Arg Gly Glu Gly Val Pro Asp Gln Phe Pro Ala
20 25 30
Gly Leu Arg Val Leu Val Val Asp Asp Asp Thr Thr Cys Leu Arg Ile
35 40 45
Leu Glu Gln Met Leu Arg Lys Cys Met Tyr Lys Val Thr Thr Cys Cys
50 55 60
Arg Ala Thr Asp Ala Leu Asp Thr Leu Arg Gly Ser Lys Gly Cys Phe
65 70 75 80
Asp Val Val Ile Ser Asp Val Tyr Met Pro Asp Met Asp Gly Phe Lys
85 90 95
Leu Leu Glu His Val Gly Leu Glu Met Asp Leu Pro Val Ile Met Met
100 105 110
Ser Ala Asp Ala Arg Phe Ser Ala Val Met Lys Gly Ile Lys His Gly
115 120 125
Ala Cys Asp Tyr Leu Ile Lys Pro Val Arg Ile Glu Glu Leu Lys Asn
130 135 140
Ile Trp Gln His Val Val Arg Lys Lys Trp Asn Glu Thr Lys Glu His
145 150 155 160
Asp Gln Ser Gly Ser Ile Glu Asp Asn Glu Arg His Lys Arg Gly Ser
165 170 175
Asp Asp Ala Glu Tyr Ala Ser Ser Val Asn Glu Gly Thr Asp Gly Asn
180 185 190
Trp Lys Val Gln Lys Lys Arg Lys Asp Ser Lys Glu Glu Glu Asp Asp
195 200 205
Gly Glu Gln Glu Asn Glu Asp Pro Ser Ala Ala Lys Lys Pro Arg Val
210 215 220
Val Trp Ser Val Glu Leu His Gln Gln Phe Val Asn Ala Val Asn Gln
225 230 235 240
Leu Gly Ile Asp Lys Ala Val Pro Lys Arg Ile Leu Glu Leu Met Asn
245 250 255
Val Gln Gly Leu Thr Arg Glu Asn Val Ala Ser His Leu Gln Lys Phe
260 265 270
Arg Leu Tyr Leu Lys Arg Leu Ser Gly His Gln Ala Gly Val Ser Ser
275 280 285
Ser Phe Cys Gly Ser Val Asp Pro Asn Ser Lys Leu Gly Pro Leu Ser
290 295 300
Gln Leu Asp Ile Arg Ala Leu Thr Ala Ser Gly Gln Ile Pro Ser Gln
305 310 315 320
Thr Leu Ala Ala Leu Gln Ala Glu Leu Leu Gly Arg Pro Ser Asn Asn
325 330 335
Val Ala Met Pro Val Tyr Gly Gln Thr Leu Val Lys Cys Gln Pro Asn
340 345 350
Leu Pro Lys Gln Phe Pro Gln Pro Asn Leu Pro Val Asp Asp Val Gln
355 360 365
Ser Ser Leu Ser Ile Trp Gln His His Leu Ser Ser Gly Met Pro Leu
370 375 380
Gly Gly Leu Asn Pro Gln Asn Asn Gly Leu Leu Met Gln Gln Gln Gln
385 390 395 400
Gln Leu Thr Ile Glu Ser Asn Arg Pro Cys Asn Val Gln Pro Ser Cys
405 410 415
His Val Ala Pro Ser Asn Gly Gly Phe Thr Met Arg Asn Asn Pro Thr
420 425 430
Ser Ser Asn Ala Ser Ser Val Glu Tyr Asn Ser Leu Leu Ser Ser Gln
435 440 445
Gly Asp Val Gly Gln Ile Ser Gln Ala Ser Gly Ser Asp Leu Ala Thr
450 455 460
Thr Val Gln Ser Asn Gly Gly Phe Lys Ser Leu Asp Tyr Arg Asn Met
465 470 475 480
Gly Gln Val Ser Leu Glu Ser Thr Ser Asp Leu Val Ser Thr Gln Asn
485 490 495
Asn Gly Phe Lys Gly Met Glu Leu Arg Asn Val Gly Ser Leu Gly Gly
500 505 510
Tyr Pro Leu Ser Ser Ser Val Ser Ala Gly Ser Thr Lys Thr Glu Asn
515 520 525
Gly Gln Ser Phe Ser Gln Val Arg Thr Gly Pro Arg Met Ser Met Gly
530 535 540
Pro Thr Gly Gln Phe Val Gly Pro Pro Thr Ile Arg Arg Leu Pro Met
545 550 555 560
Val Asp Gly Gly Thr His Arg Asn Ser Leu Gly Phe Val Gly Lys Gly
565 570 575
Val Ser Ile Pro Ser Arg Phe Met Pro Asp Ser Gly Ser Pro Thr Gly
580 585 590
Val Gly Glu Glu Cys Thr Leu Pro Lys Gln Glu Val Asp Pro Asp Phe
595 600 605
Phe Asp Ser Leu Lys Val Gly Pro Val Gly Val Gln His Tyr Ala Ser
610 615 620
Gly Asp Leu Met Ser Val Leu Ser Lys Gln Gln Gln Ala Ser Thr Gly
625 630 635 640
Asn Leu Asp Cys Glu Phe Gly Ile Asp Gly Tyr Gln Leu Gly Asn Ile
645 650 655
His Val Lys
<210>28
<211>669
<212>PRT
<213> Castor
<400>28
Met Ala Ala Leu Gln Arg Val Ala Ser Ser Val Ser Ala Thr Ala Ser
1 5 10 15
Asn Tyr Ser Ser Cys Lys Gly Asn Gly Val Val Thr Ala Thr Ala Asp
20 25 30
Val Ala Val Ser Asp Gln Phe Pro Ala Gly Leu Arg Val Leu Val Val
35 40 45
Asp Asp Asp Thr Thr Cys Leu Arg Ile Leu Glu Gln Met Leu Arg Arg
50 55 60
Cys Leu Tyr His Val Thr Thr Cys Ser Gln Ala Lys Val Ala Leu Asn
65 70 75 80
Leu Leu Arg Glu Arg Lys Gly Cys Phe Asp Val Val Leu Ser Asp Val
85 90 95
His Met Pro Asp Met Asp Gly Phe Lys Leu Leu Glu His Val Gly Leu
100 105 110
Glu Met Asp Leu Pro Val Ile Met Met Ser Ala Asp Gly Arg Thr Ser
115 120 125
Ala Val Met Arg Gly Ile Arg His Gly Ala Cys Asp Tyr Leu Ile Lys
130 135 140
Pro Ile Arg Glu Glu Glu Leu Lys Asn Ile Trp Gln His Val Val Arg
145 150 155 160
Lys Lys Trp His Glu Asn Lys Glu Ile Glu His Ser Gly Ser Leu Glu
165 170 175
Asp Asn Asp Arg His Lys Arg Gly Asn Glu Asp Ala Glu Tyr Thr Ser
180 185 190
Ser Val Asn Glu Gly Thr Glu Gly Val Leu Lys Gly Gln Lys Arg Arg
195 200 205
Ser Asn Ser Lys Asp Glu Asp Asp Gly Glu Pro Asp Ser Asp Asp Pro
210 215 220
Ser Thr Ser Lys Lys Pro Arg Val Val Trp Ser Val Glu Leu His Gln
225 230 235 240
Gln Phe Val Ser Ala Val Asn Gln Leu Gly Ile Asp Lys Ala Val Pro
245 250 255
Lys Arg Ile Leu Glu Leu Met Asn Val Pro Gly Leu Thr Arg Glu Asn
260 265 270
Val Ala Ser His Leu Gln Lys Phe Arg Leu Tyr Leu Lys Arg Leu Ser
275 280 285
Gly Val Ala Gln Gln Gly Gly Ile Ser Ser Thr Phe Cys Gly Pro Met
290 295 300
Asp Ser Asn Val Lys Leu Asn Ser Leu Gly Arg Phe Asp Ile Gln Ala
305 310 315 320
Leu Ala Ala Ser Gly Gln Ile Pro Pro Gln Thr Leu Ala Ala Leu His
325 330 335
Ala Glu Leu Phe Gly Arg Pro Thr Gly Ser Leu Val Thr Thr Met Asp
340 345 350
Gln Pro Thr Leu Leu Gln Ala Ser Arg Gln Ser Pro Lys Cys Ile Pro
355 360 365
Val Glu His Gly Val Thr Phe Gly Gln Pro Ile Val Lys Cys Ser Ser
370 375 380
Gly Ile Ser Lys His Phe Pro Gln Asn Met Val Ser Val Glu Glu Val
385 390 395 400
Ser Ser Gly Tyr Gly Ala Trp Pro Ser Asn Ser Leu Gly Thr Val Gly
405 410 415
Pro Ser Thr Asn Leu Gly Gly Met Thr Thr Gln Asn Gly Asn Met Leu
420 425 430
Met Asp Ile Phe His Gln Gln Gln Lys Gln Gln Gln Pro Gln Gln Gln
435 440 445
Gln Ser Leu Ala Asp Pro Ser Arg Ser Ile Asn Val Gln Pro Ser Cys
450 455 460
Leu Val Val Pro Ser Gln Ser Ser Ala Cys Phe Gln Ala Gly Asn Ser
465 470 475 480
Pro Ala Ser Val Asn Gln Ser Asn Phe Asn Arg Asn Val Val Ile Asp
485 490 495
Tyr Ser Leu Leu Ser Ser Gln Ser Asn Asn Ser Ala Leu Asn Ile Gly
500 505 510
His Ile Pro Glu Gly Asp Leu Lys Thr Thr Gly Ala Val Asn Gly Tyr
515 520 525
Ser Ala Pro Gly Ser Leu Ser Pro Pro Ala Ser Ser Cys Ser Val Asn
530 535 540
Ala Asp Ser Gly Val Pro Arg Gln Val Gln Asn Pro Thr Leu Ala Phe
545 550 555 560
Gly Ala Val Arg Gln Leu Pro Ala Leu Ser Pro Asn Ile Phe Asn Ile
565 570 575
Gln Gly Ser Tyr Gly Val Arg Ser Asp Asp Ile Leu Asp Gln Gly Pro
580 585 590
Phe Phe Lys Asn Leu Gly Phe Val Gly Lys Gly Thr Cys Ile Pro Ser
595 600 605
Arg Phe Ala Val Asp Glu Phe Glu Thr Pro Ser Ser Asn Leu Ser His
610 615 620
Gly Lys Leu Tyr Val Glu Asn Asn Asp Asn Lys Val Lys Gln Glu Pro
625 630 635 640
Asn Ile Asp Phe Thr Asp Thr Ser Arg Val Gly Ile Pro Val Leu Gln
645 650 655
Gln Tyr Pro Pro Asn Asp Leu Met Ser Val Phe Thr Glu
660 665
<210>29
<211>654
<212>PRT
<213> tomato
<400>29
Met Val Ser Met Ser Gly Glu Val Ala Thr Cys Lys Ser Glu Ala Thr
1 5 10 15
Val Val Thr Asp His Phe Pro Val Gly Leu Arg Val Leu Val Val Asp
20 25 30
Asp Asp Val Val Cys Leu Arg Ile Ile Glu Gln Met Leu Arg Arg Cys
35 40 45
Lys Tyr Ser Val Thr Thr Cys Thr Gln Ala Met Val Ala Leu Asn Leu
50 55 60
Leu Arg Glu Lys Arg Gly Thr Phe Asp Ile Val Leu Ser Asp Val His
65 70 75 80
Met Pro Asp Met Asp Gly Phe Lys Leu Leu Glu Leu Val Gly Leu Glu
85 90 95
Met Asp Leu Pro Val Ile Met Met Ser Gly Asp Gly Arg Thr Asn Leu
100 105 110
Val Met Arg Gly Val Gln His Gly Ala Cys Asp Tyr Leu Ile Lys Pro
115 120 125
Ile Arg Asp Glu Glu Leu Lys Asn Ile Trp Gln His Val Val Arg Lys
130 135 140
Arg Tyr Asn Ser Ser Lys Glu Pro Glu Cys Ser Gly Ser Leu Asp Asp
145 150 155 160
Asn Asp Arg Tyr Arg Arg Arg Ser Asp Asp Ala Glu Cys Ala Ser Ser
165 170 175
Val Ile Glu Gly Ala Asp Gly Val Leu Lys Pro Gln Lys Lys Lys Arg
180 185 190
Glu Ala Lys Glu Asp Asp Thr Glu Met Glu Asn Asp Asp Pro Ser Thr
195 200 205
Thr Lys Lys Pro Arg Val Val Trp Ser Val Glu Leu His Gln Gln Phe
210 215 220
Val Ser Ala Val Asn Gln Leu Gly Ile Asp Lys Ala Val Pro Lys Arg
225 230 235 240
Ile Leu Glu Leu Met Asn Val Pro Gly Leu Thr Arg Glu Asn Val Ala
245 250 255
Ser His Leu Gln Lys Phe Arg Leu Tyr Leu Lys Arg Leu Ser Gly Val
260 265 270
Val Gln Gln Gln Gly Gly Leu Pro Ser Thr Phe Cys Gly Pro Ile Glu
275 280 285
Gln Asn Ser Glu Leu Gly Ser Leu Gly Arg Phe Asp Ile Gln Ala Leu
290 295 300
Ala Ala Ser Gly Gln Ile Pro Pro Glu Thr Leu Thr Ala Leu His Ala
305 310 315 320
Glu Leu Leu Gly Arg Ser Thr Ser Asn Leu Val Leu Pro Ala Val Glu
325 330 335
Gln Gln Asn Leu Val Gln Val Ser Leu Gln Gln Ala Lys Cys Ile Pro
340 345 350
Val Asp Gln Val Met Ala Tyr Gly Gln Pro Leu Leu Lys Cys Pro Ala
355 360 365
Ser Ile Ser Asn Ser Lys His Leu Ser Gln Ala Ile Leu Ser Ala Glu
370 375 380
Asp Val His Ser Gly Phe Gly Ser Gln Arg Ala Lys Asn Ile Cys Met
385 390 395 400
Val Pro Ser Ser Asn Pro Ile Ala Pro Asn Ser Asn Met Leu Thr Ala
405 410 415
Met Met Gln Gln Gln Gln Trp Gln Lys Gln Gln Gln Ile Glu Leu Gln
420 425 430
His Arg Gln Ser Gly Pro Pro Glu Val Asn Arg Ser Ile Asn Val Gln
435 440 445
Pro Ser Cys Leu Val Leu Pro Ser Gln Leu Pro Gly His Phe Gln Val
450 455 460
Gly Asp Ser Pro Ala Ser Ile Ser Arg Ala Gly Ser Leu Ser Lys Ser
465 470 475 480
Ser Val Ile Asp Tyr Gly Val Leu Ser Pro Gln Ser Asn Asn Ser Ser
485 490 495
Gly Val Val Gln Val Leu Asp Arg Glu Leu Lys Pro Glu Cys Gly Leu
500 505 510
Asn Arg Leu Pro Ser Gly Gly Ser Leu Ser Arg Ser Cys Ser Ile Asn
515 520 525
Ala Asp Asn Ser Val Asp Leu Gln Leu His Asn Ser Ser Ser Ala Phe
530 535 540
Gly Ser Ser Lys Gln Leu Pro Gly Leu Ile Pro Ser His Leu Gly Ser
545 550 555 560
Pro Val Pro Tyr Cys Ile Asn Ser Ser Leu Val Leu Asp Gln Gly Arg
565 570 575
Met Lys Gly Ala Ser Ile Pro Ser Arg Phe Ala Val Asp Glu Ser Asp
580 585 590
Ser Pro Met Cys Asn Phe Asn Thr Ala Lys Ile Tyr Leu Glu Glu Thr
595 600 605
Lys Val Lys Gln Glu Pro Asn Met Asn Val Met Glu Asn Ala Lys Val
610 615 620
Gly Pro Ala Ile Phe Gln Lys Phe Gln Pro Gly Asp Leu Met Ser Val
625 630 635 640
Phe Arg Leu Ser Phe Ala Arg Val Lys Val Ser Ser Ser Pro
645 650
<210>30
<211>653
<212>PRT
<213> Potato
<400>30
Met Ser Gly Asp Val Ala Thr Cys Lys Ser Glu Ala Thr Val Val Thr
1 5 10 15
Asp His Phe Pro Leu Gly Leu Arg Val Leu Val Val Asp Asp Asp Val
20 25 30
Val Cys Leu Arg Ile Ile Glu Gln Met Leu Arg Arg Cys Lys Tyr Ser
35 40 45
Val Thr Thr Cys Thr Gln Ala Met Val Ala Leu Asn Leu Leu Arg Glu
50 55 60
Lys Arg Gly Thr Phe Asp Ile Val Leu Ser Asp Val His Met Pro Asp
65 70 75 80
Met Asp Gly Phe Lys Leu Leu Glu Leu Val Gly Leu Glu Met Asp Leu
85 90 95
Pro Val Ile Met Met Ser Gly Asp Gly Arg Thr Asn Leu Val Met Arg
100 105 110
Gly Val Gln His Gly Ala Cys Asp Tyr Leu Ile Lys Pro Ile Arg Asp
115 120 125
Glu Glu Leu Lys Asn Ile Trp Gln His Val Val Arg Lys Arg Tyr Asn
130 135 140
Ser Ser Lys Glu Leu Glu Cys Ser Gly Ser Leu Asp Asp Asn Asp Arg
145 150 155 160
Tyr Lys Arg Gly Ser Asp Asp Ala Glu Cys Ala Ser Ser Val Ile Glu
165 170 175
Gly Ala Asp Gly Val Leu Lys Pro Gln Lys Lys Lys Arg Glu Ala Lys
180 185 190
Glu Glu Asp Asp Thr Glu Met Glu Asn Asp Asp Pro Ser Thr Ser Lys
195 200 205
Lys Pro Arg Val Val Trp Ser Val Glu Leu His Gln Gln Phe Val Ser
210 215 220
Ala Val Asn Gln Leu Gly Ile Asp Lys Ala Val Pro Lys Arg Ile Leu
225 230 235 240
Glu Leu Met Asn Val Pro Gly Leu Thr Arg Glu Asn Val Ala Ser His
245 250 255
Leu Gln Glu Asn Gln Lys Phe Arg Leu Tyr Leu Lys Arg Leu Ser Gly
260 265 270
Val Val Gln Gln Gln Gly Gly Leu Pro Ser Thr Phe Cys Gly Pro Ile
275 280 285
Glu Gln Asn Ser Glu Leu Gly Ser Leu Gly Arg Phe Asp Ile Gln Ala
290 295 300
Leu Ala Ala Ser Gly Gln Ile Pro Pro Glu Thr Leu Thr Ala Leu His
305 310 315 320
Ala Glu Leu Leu Gly Arg Ser Thr Ser Asn Leu Val Leu Pro Ala Val
325 330 335
Glu Ile Gln Asn Leu Leu Gln Ala Ser Leu Gln Gln Ala Lys Cys Ile
340 345 350
Pro Ala Asp Gln Val Met Ala Tyr Gly Gln Pro Leu Leu Lys Cys His
355 360 365
Pro Ser Ile Ser Asn Ser Lys His Leu Ser Gln Ser Ile Leu Ser Ala
370 375 380
Glu Asp Val His Ser Gly Phe Gly Ser Gln Arg Ala Lys Asn Ile Cys
385 390 395 400
Leu Val Pro Ser Ser Asn Pro Ile Gly Leu Ala Ala Pro Asn Ser Asn
405 410 415
Met Leu Met Ala Met Met Gln Gln Gln Gln Trp Gln Lys Gln Gln Gln
420 425 430
Met Glu Leu Gln His Arg Arg Ser Gly Pro Pro Glu Val Asn His Ser
435 440 445
Ile Asn Val Gln Pro Ser Cys Leu Val Leu Pro Ser Gln Leu Pro Gly
450 455 460
Asn Phe Gln Val Gly Asp Ser Pro Ala Ser Ile Ser Arg Ala Gly Ser
465 470 475 480
Leu Ser Lys Ser Ser Val Ile Asp Tyr Gly Val Leu Ser Pro Gln Ser
485 490 495
Asn Asn Ser Ser Gly Val Val Gln Val Leu Asp Arg Glu Leu Lys Pro
500 505 510
Glu Cys Gly Leu Asn Arg Leu Pro Ser Gly Gly Ser Leu Ser Arg Ser
515 520 525
Cys Ser Ile Asn Ala Asp Asn Ser Val Gly Leu Gln Leu His Asn Ser
530 535 540
Ser Ser Ala Phe Gly Ser Ser Lys Gln Leu Pro Ala Leu Ile Pro Asn
545 550 555 560
His Leu Gly Ser Pro Val Pro Tyr Tyr Ile Asn Ser Ser Gln Val Leu
565 570 575
Asp Gln Gly His Thr Arg Asn Pro Gly Val Gly Lys Cys Ala Ser Ile
580 585 590
Pro Ser Arg Phe Ala Val Asp Glu Ser Asp Ser Pro Met Cys Asn Phe
595 600 605
Asn Thr Ala Lys Asn Tyr Leu Glu Glu Thr Lys Val Lys Gln Glu Pro
610 615 620
Asn Met Asn Val Met Glu Asn Ala Lys Val Gly Pro Ala Ile Phe Gln
625 630 635 640
Lys Phe Gln Pro Gly Asp Leu Met Ser Val Phe Ser Asp
645 650
<210>31
<211>669
<212>PRT
<213> upland cotton
<400>31
Met Ala Thr Met His Arg Val Val Gln Ser Ser Val Ser Thr Ser Asp
1 5 10 15
Ala Thr Thr Thr Ser Tyr Asp Gly Leu Thr Ser Cys Lys Ala Ala Asp
20 25 30
Ile Val Ile Ser Asp Gln Phe Pro Ala Gly Leu Arg Val Leu Val Val
35 40 45
Asp Asp Asp Ile Thr Cys Leu Lys Ile Leu Glu Lys Met Leu His Arg
50 55 60
Cys Arg Tyr His Val Thr Thr Cys Pro Gln Ala Lys Val Ala Leu Asn
65 70 75 80
Leu Leu Arg Glu Arg Lys Gly Cys Phe Asp Val Ile Leu Ser Asp Val
85 90 95
Tyr Met Pro Asp Met Asp Gly Tyr Lys Leu Leu Glu His Val Gly Leu
100 105 110
Glu Met Asp Leu Pro Val Ile Met Met Ser Ala Asp Gly Ser Thr Arg
115 120 125
Ala Val Met Lys Gly Ile Arg His Gly Ala Cys Asp Tyr Leu Ile Lys
130 135 140
Pro Ile Arg Glu Glu Glu Leu Lys Asn Ile Trp Gln His Val Val Arg
145 150 155 160
Lys Lys Trp Asn Glu Asn Lys Glu Leu Glu His Ser Gly Ser Leu Asp
165 170 175
Asp Thr Asp Gln His Lys Gln Arg His Asp Asp Ala Glu Tyr Ala Ser
180 185 190
Ser Val Asn Asp Ala Thr Glu Thr Ser Leu Lys Pro Leu Lys Lys Arg
195 200 205
Ser Asn Ser Lys Glu Glu Asp Asp Gly Glu Ile Asp Asn Asp Asp Pro
210 215 220
Ser Thr Ser Lys Lys Pro Arg Val Val Trp Ser Val Glu Leu His Gln
225 230 235 240
Gln Phe Val Ser Ala Val Asn Gln Leu Gly Ile Asp Lys Ala Val Pro
245 250 255
Lys Arg Ile Leu Glu Leu Met Asn Val Pro Gly Leu Thr Arg Glu Asn
260 265 270
Val Ala Ser His Leu Gln Lys Phe Arg Leu Tyr Leu Lys Arg Ile Ser
275 280 285
Gly Val Ala Gln Gln Gly Gly Ile Ala Asn Pro Leu Cys Gly Pro Val
290 295 300
Glu Ala Asn Val Lys Ile Gly Ser Leu Gly Ser Phe Asn Ile Gln Ala
305 310 315 320
Leu Ala Ala Ser Gly Gln Ile Pro Pro Gln Thr Leu Ala Ala Val His
325 330 335
Ala Glu Leu Leu Gly Arg Ser Ala Gly Asn Leu Val Val Ala Thr Asp
340 345 350
Gln Pro Ala Leu Leu Gln Ala Thr Pro Gln Gly Ala Lys Cys Ile Gln
355 360 365
Val Asp Gln Gly Val Ala Phe Val Gln His Ser Val Lys Ser Glu Ser
370 375 380
Ser Ser Ser Lys His Phe Ser Gln Ser Phe Ala Pro Val Glu Asp Val
385 390 395 400
Ala Ser Gly Phe Arg Ser Trp Pro Ser Asn Asn Ile Gly Thr Ala Gly
405 410 415
Pro Ser Asn Ser Gly Gly Leu Ser Ser Gln Asn Gly Asn Met Leu Ile
420 425 430
Asp Leu Leu Gln Gln Gln Gln Gln Leu Gln Lys Pro Gln Gln Arg Ser
435 440 445
Thr Val Ser Glu Leu Arg Arg Ser Ile Asn Val Gln Pro Ser Cys His
450 455 460
Val Val Pro Ser Gln Ser Ser Ala Ser Phe Arg Ala Gly Asn Ser Pro
465 470 475 480
Val Ser Val Thr Gln Asn Gly Ser Tyr Ser Arg Thr Ala Val Ile Asp
485 490 495
Tyr Ser Leu Leu Ser Ser Gln Ser Asn Cys Pro Ser Leu Asn Ile Gly
500 505 510
Gln Val Ser Asp Val Asn Leu Gln Thr Thr Gly Val Leu Ser Gly Tyr
515 520 525
Ile Pro Pro Ala Ser Val Ser Pro Ser Val Ser Ser Cys Ser Val Asn
530 535 540
Ala Asp Asn Cys Ala Ser Gln Gln Val Gln Thr Ser Ser Met Thr Phe
545 550 555 560
Lys Ala Ser Arg His Leu Pro Gly Phe Val His Ser Thr Ser Asn Ile
565 570 575
Pro Asp Pro Tyr Gly Ser Thr Lys Ser Gly Asp Leu Leu Asn Gln Glu
580 585 590
Pro Phe Asn Asn Leu Gly Tyr Ile Asn Lys Gly Thr Cys Leu Pro Ala
595 600 605
Lys Phe Ala Val Asp Glu Phe Gln Ser His Leu Ser Ser Ser Ser His
610 615 620
Gly Lys Val Phe Ser Glu Asn Ile Gly Thr Arg Val Lys Gln Glu Pro
625 630 635 640
Ser Met Glu Phe Gly Asp Asn Ala Lys Val Gly Ile Pro Met Leu Gln
645 650 655
Gln Phe Arg Pro Asn Asp Leu Met Ser Val Phe Thr Glu
660 665
<210>32
<211>681
<212>PRT
<213> cocoa
<400>32
Met Asn Ser Ser Ser Gly Lys Gly Ser Met Ser Ala Ala Ser Ser Ser
1 5 10 15
Ala Ala Trp Lys Ala Gly Asp Val Val Pro Asp Gln Phe Pro Ala Gly
20 25 30
Leu Arg Val Leu Val Val Asp Asp Asp Pro Thr Cys Leu Met Ile Leu
35 40 45
Glu Lys Met Leu Arg Thr Cys Leu Tyr Glu Val Thr Lys Cys Asn Arg
50 55 60
Ala Glu Thr Ala Leu Ser Leu Leu Arg Glu Asn Lys Asn Gly Phe Asp
65 70 75 80
Ile Val Ile Ser Asp Val His Met Pro Asp Met Asp Gly Phe Lys Leu
85 90 95
Leu Glu His Ile Gly Leu Glu Met Asp Leu Pro Val Ile Met Met Ser
100 105 110
Ala Asp Asp Gly Lys His Val Val Met Lys Gly Val Thr His Gly Ala
115 120 125
Cys Asp Tyr Leu Ile Lys Pro Val Arg Ile Glu Ala Leu Lys Asn Ile
130 135 140
Trp Gln His Val Val Arg Lys Arg Lys Asn Glu Trp Lys Asp Phe Glu
145 150 155 160
Gln Ser Gly Ser Val Glu Glu Gly Asp Arg Gln Pro Lys Gln Ser Glu
165 170 175
Glu Ala Asp Tyr Ser Ser Ser Ala Asn Glu Gly Asn Trp Lys Ser Ser
180 185 190
Lys Lys Arg Lys Asp Asp Asp Asp Glu Ala Glu Glu Arg Asp Asp Thr
195 200 205
Ser Thr Leu Lys Lys Pro Arg Val Val Trp Ser Val Glu Leu His Gln
210 215 220
Gln Phe Val Ala Ala Val Asn Gln Leu Gly Ile Asp Lys Ala Val Pro
225 230 235 240
Lys Lys Ile Leu Glu Leu Met Asn Val Pro Gly Leu Thr Arg Glu Asn
245 250 255
Val Ala Ser His Leu Gln Lys Tyr Arg Leu Tyr Leu Arg Arg Leu Ser
260 265 270
Gly Val Ser Gln His Gln Ser Asn Leu Asn Asn Ser Phe Met Ser Pro
275 280 285
Gln Glu Ala Thr Phe Gly Pro Leu Ser Pro Leu Asn Gly Leu Asp Leu
290 295 300
Gln Thr Leu Ala Ala Thr Gly Gln Leu Pro Ala Gln Ser Leu Ala Thr
305 310 315 320
Phe Gln Ala Ala Gly Leu Gly Arg Ser Thr Ala Lys Ser Gly Ile Ala
325 330 335
Met Pro Leu Val Asp Gln Arg Asn Ile Phe Ser Phe Glu Asn Pro Lys
340 345 350
Leu Arg Phe Gly Glu Gly Gln Gln Gln His Met Asn Asn Asn Lys Gln
355 360 365
Leu Asn Leu Leu His Gly Ile Pro Thr Thr Met Glu Pro Lys Gln Leu
370 375 380
Ala Ser Leu His His Ser Ala Gln Ser Ile Gly Asn Ile Asn Met Gln
385 390 395 400
Val Thr Ser His Gly Val Gln Gly Ser Gln Asn Asn Ser Leu Leu Ile
405 410 415
Gln Met Ala Gln Pro Gln Pro Arg Gly Gln Ile Leu Asn Asp Ser Thr
420 425 430
Gly Ser His Ala Pro Arg Leu Pro Ser Thr Leu Gly Gln Pro Ile Leu
435 440 445
Ser Asn Gly Ile Ala Ala Asn Val Ser Thr Arg Asn Gly Ile Pro Glu
450 455 460
Asn Ile Arg Gly Pro Gly Tyr Asn Pro Val Ser Gln Thr Ser Ser Leu
465 470 475 480
Leu Asn Phe Pro Met Asn His Thr Ser Glu Leu Pro Gly Asn Ser Phe
485 490 495
Pro Leu Gly Thr Thr Pro Gly Ile Ser Ser Leu Thr Ser Lys Gly Ala
500 505 510
Phe Gln Glu Asp Ile Asn Ser Asp Val Lys Gly Ser Gly Gly Phe Met
515 520 525
Pro Ser Tyr Asp Ile Phe Asn Asp Leu Asn Gln His Lys Pro Gln Asn
530 535 540
Trp Glu Leu Gln Asn Val Gly Met Thr Phe Asp Ala Ser Gln His Ser
545 550 555 560
Asn Ser Leu Gln Gly Asn Leu Asp Leu Ala Gln Ser Ile Leu Val Gln
565 570 575
Gln Gly Phe Ser Ser Gly Gln Met Asn Gly Gln Asn Arg Ser Ala Ala
580 585 590
Val Val Ser Lys Ala Met Phe Ser Ala Gly Asp Cys Thr Glu Gln Gly
595 600 605
Asn Ala Gln Asn Val Asn His His Leu Asn Asn Leu Leu Val Asp Asn
610 615 620
Thr Ile Arg Ile Lys Ser Glu Arg Val Ala Asp Ala Gly Pro Ala Asn
625 630 635 640
Leu Phe Pro Asp His Phe Gly Gln Glu Asp Leu Met Ser Ala Leu Leu
645 650 655
Lys Gln Gln Asp Gly Ile Ala Pro Ala Glu Asn Glu Phe Asp Phe Asp
660 665 670
Gly Tyr Ser Met Asp Asn Ile Pro Val
675 680
<210>33
<211>579
<212>PRT
<213> beans
<400>33
Met Asn Leu Ser Asn Gly Lys Gly Ser Met Ser Thr Val Thr Thr Thr
1 5 10 15
Ala Val Met Lys Ser Gly Asp Ala Val Ser Asp Gln Phe Pro Ala Gly
20 25 30
Leu Arg Val Leu Val Val Asp Asp Asp Pro Thr Cys Leu Met Ile Leu
35 40 45
Glu Lys Met Leu Arg Thr Cys Leu Tyr Glu Val Thr Lys Cys Asn Arg
50 55 60
Ala Glu Thr Ala Leu Ser Leu Leu Arg Glu Asn Lys Asn Gly Phe Asp
65 70 75 80
Ile Val Ser Ala Asn Glu Gly Ser Trp Arg Asn Ser Lys Lys Arg Arg
85 90 95
Asp Glu Glu Glu Glu Ala Glu Asp Arg Asp Asp Thr Ser Thr Leu Lys
100 105 110
Lys Pro Arg Val Val Trp Ser Val Glu Leu His Gln Gln Phe Val Ala
115 120 125
Ala Val Asp Gln Leu Gly Ile Asp Lys Ala Val Pro Lys Lys Ile Leu
130 135 140
Glu Leu Met Asn Val Pro Gly Leu Thr Arg Glu Asn Val Ala Ser His
145 150 155 160
Leu Gln Lys Tyr Arg Leu Tyr Leu Arg Arg Leu Ser Gly Val Ser Gln
165 170 175
His Gln Asn Asn Leu Asn Asn Ser Phe Leu Gly Ser Gln Glu Ala Thr
180 185 190
Phe Gly Thr Ile Ser Ser Ile Asn Gly Ile Asp Leu Gln Thr Leu Ala
195 200 205
Val Thr Gly Gln Leu Pro Ala Gln Ser Leu Ala Thr Leu Gln Ala Ala
210 215 220
Gly Leu Gly Arg Ser Thr Ala Lys Thr Gly Val Pro Met Pro Leu Met
225 230 235 240
Asp Gln Arg Asn Leu Phe Ser Phe Glu Asn Pro Arg Val Arg Phe Gly
245 250 255
Glu Gly Gln Gln Gln His Leu Ser Ser Ser Lys Pro Met Asn Leu Leu
260 265 270
Leu Gly Ile Pro Thr Asn Met Glu Pro Lys Gln Leu Ala Asn Leu His
275 280 285
Gln Ser Thr Gln Ser Ile Ala Ser Leu Asn Met Arg Val Asn Ala Ser
290 295 300
Ala Thr Gln Gly Asn Pro Leu Met Met Gln Met Pro Gln Ser Gln Pro
305 310 315 320
Arg Gly Gln Met Leu Ser Glu Asn Thr Gly Pro Arg Val Pro Arg Leu
325 330 335
Pro Ser Ser Leu Gly Gln Pro Thr Val Ser Asn Gly Ile Ser Asn Gly
340 345 350
Phe Leu Gly Arg Asn Gly Ile Ala Gly Asn Asn Arg Gly Pro Ala Tyr
355 360 365
Asn Pro Val Pro Pro Asn Ser Ser Leu Leu Ser Phe Pro Met Asn Gln
370 375 380
Ser Ser Glu Val Ser Val Asn Asn Ser Leu Pro Leu Gly Ser Ser Pro
385 390 395 400
Gly Ile Ser Ser Ile Thr Thr Lys Gly Ser Phe Gln Glu Glu Val Thr
405 410 415
Ser Gly Ile Lys Ala Thr Gly Gly Phe Pro Ser Tyr Asp Ile Phe Asn
420 425 430
Glu Leu His His Gln Lys Ser His Asp Trp Glu Ile Thr Asn Pro Ser
435 440 445
Leu Thr Tyr Ser Ala Ser His His Ala Asn Pro Leu Gln Gly Asn Ile
450 455 460
Asp Val Ser Pro Ser Val Leu Val His Gln Gly Phe Ser Ser Thr Gln
465 470 475 480
Gln Asn Gly Gln Ser Arg Asp Ala Thr Leu Ile Gly Lys Ala Met Phe
485 490 495
Ser Leu Gly Glu Gly Ser Glu Gln Asp Asn Leu Gln Asn Ala Val Gln
500 505 510
His Leu His Pro Leu Leu Val Asp Asn Ser Ile Arg Val Lys Ala Glu
515 520 525
Arg Ile Pro Asp Ala Ser Ser Gln Thr Asn Leu Phe Pro Asp His Tyr
530 535 540
Val Gln Glu Asp Leu Met Ser Ala Leu Leu Lys Gln Gln Glu Gly Met
545 550 555 560
Gly Pro Ala Glu Ser Glu Phe Glu Phe Asp Ala Tyr Ser Leu Asp Asn
565 570 575
Ile Pro Val
<210>34
<211>679
<212>PRT
<213> Soybean
<400>34
Met Asn Leu Ser Asn Gly Lys Gly Ser Met Ser Thr Leu Thr Ala Ser
1 5 10 15
Val Val Met Lys Ser Gly Asp Ala Val Ser Asp Gln Phe Pro Ala Gly
20 25 30
Leu Arg Val Leu Val Val Asp Asp Asp Pro Thr Cys Leu Met Ile Leu
35 40 45
Glu Lys Met Leu Arg Thr Cys Leu Tyr Glu Val Thr Lys Cys Asn Arg
50 55 60
Ala Glu Thr Ala Leu Ser Leu Leu Arg Glu Asn Lys Asn Gly Phe Asp
65 70 75 80
Ile Val Ile Ser Asp Val His Met Pro Asp Met Asp Gly Phe Lys Leu
85 90 95
Leu Glu His Ile Gly Leu Glu Met Asp Leu Pro Val Ile Met Met Ser
100 105 110
Ala Asp Asp Gly Lys Ser Val Val Met Lys Gly Val Thr His Gly Ala
115 120 125
Cys Asp Tyr Leu Ile Lys Pro Val Arg Ile Glu Ala Leu Lys Asn Ile
130 135 140
Trp Gln His Val Val Arg Lys Arg Lys Asn Glu Trp Lys Asp Ala Glu
145 150 155 160
Gln Ser Gly Ser Ala Glu Glu Gly Asp Arg Gln Pro Lys Ala Ser Asp
165 170 175
Glu Ala Asp Tyr Ser Ser Ser Ala Asn Glu Gly Ser Trp Arg Asn Ser
180 185 190
Lys Lys Arg Arg Asp Glu Glu Glu Glu Ala Glu Asp Arg Asp Asp Thr
195 200 205
Ser Thr Leu Lys Lys Pro Arg Val Val Trp Ser Val Glu Leu His Gln
210 215 220
Gln Phe Val Ala Ala Val Asp Gln Leu Gly Ile Asp Lys Ala Val Pro
225 230 235 240
Lys Lys Ile Leu Glu Leu Met Asn Val Pro Gly Leu Thr Arg Glu Asn
245 250 255
Val Ala Ser His Leu Gln Lys Tyr Arg Leu Tyr Leu Arg Arg Leu Ser
260 265 270
Gly Val Ser Gln His Gln Asn Asn Met Asn Asn Ser Phe Leu Ser Pro
275 280 285
Gln Glu Ala Thr Phe Gly Thr Ile Ser Ser Ile Asn Gly Ile Asp Leu
290 295 300
Gln Thr Leu Ala Val Ala Gly Gln Leu Pro Ala Gln Ser Leu Ala Thr
305 310 315 320
Leu Gln Ala Ala Gly Leu Gly Arg Pro Thr Gly Lys Ala Gly Val Pro
325 330 335
Met Pro Leu Met Asp Gln Arg Asn Leu Phe Ser Phe Glu Asn Pro Arg
340 345 350
Leu Arg Phe Gly Glu Gly Gln Gln Gln His Leu Ser Thr Ser Lys Pro
355 360 365
Met Asn Leu Leu His Gly Ile Pro Thr Asn Met Glu Pro Lys Gln Leu
370 375 380
Ala Asn Leu His Gln Ser Thr Gln Ser Ile Gly Ser Leu Asn Met Arg
385 390 395 400
Val Asn Ala Ser Ala Thr Gln Gly Ser Pro Leu Leu Met Gln Met Ala
405 410 415
Gln Ser Gln Pro Arg Gly Gln Met Leu Ser Glu Asn Ile Gly Pro Arg
420 425 430
Val Pro Arg Leu Pro Ser Ser Leu Gly Gln Pro Thr Val Ser Asn Gly
435 440 445
Ile Ser Asn Gly Leu Leu Gly Arg Asn Gly Ile Ala Gly Asn Asn Arg
450 455 460
Gly Pro Ala Tyr Asn Pro Val Pro Pro Ser Ser Ser Leu Leu Ser Phe
465 470 475 480
Pro Met Asn Gln Thr Ser Glu Met Ser Val Asn Asn Ser Phe Pro Leu
485 490 495
Gly Ser Thr Pro Gly Ile Ser Ser Ile Thr Thr Lys Gly Ser Phe Gln
500 505 510
Glu Glu Val Thr Ser Gly Ile Lys Gly Ser Gly Gly Phe Pro Ser Tyr
515 520 525
Asp Ile Phe Asn Glu Leu His His Gln Lys Pro His Asp Trp Glu Ile
530 535 540
Thr Asn Pro Asn Leu Thr Tyr Asn Ala Ser Gln His Ala Asn Pro Leu
545 550 555 560
Gln Gly Asn Ile Asp Val Thr Pro Ser Val Leu Val His Gln Gly Phe
565 570 575
Ser Ser Thr Gln Gln Thr Gly Gln Ser Arg Asp Ala Ala Leu Ile Gly
580 585 590
Lys Ala Met Phe Ser Met Gly Glu Gly Leu Glu Gln Asn Asn Phe Gln
595 600 605
Asn Ala Ser Gln Asn Leu Asn Ser Leu Leu Leu Asp Asn Ser Ile Arg
610 615 620
Val Lys Ala Glu Arg Ile Pro Asp Ala Ser Ser Gln Thr Asn Leu Phe
625 630 635 640
Pro Glu His Tyr Gly Gln Glu Asp Leu Met Ser Ala Leu Leu Lys Gln
645 650 655
Gln Glu Gly Met Gly Pro Ser Glu Asn Glu Phe Asp Phe Asp Gly Tyr
660 665 670
Ser Leu Asp Asn Ile Pro Val
675
<210>35
<211>668
<212>PRT
<213> quinoa
<400>35
Met Asn Leu Gly Gly Gly Leu Met Gly Ser Met Ala Met Pro Ser Ser
1 5 10 15
Thr Val Ser Arg Lys Ser Ser Glu Val Val Thr Ala Asp Gln Phe Pro
20 25 30
Val Gly Leu Arg Val Leu Val Val Asp Asp Asp Pro Thr Cys Leu Thr
35 40 45
Ile Leu Glu Lys Met Leu Arg Thr Cys Arg Tyr Glu Val Thr Lys Thr
50 55 60
Asn Arg Ala Glu His Ala Leu Asn Met Leu Arg Glu Asn Lys Asn Gly
65 70 75 80
Phe Asp Val Val Ile Ser Asp Val His Met Pro Asp Met Asp Gly Phe
85 90 95
Lys Leu Leu Glu Gln Val Gly Leu Glu Met Asp Leu Pro Val Ile Met
100 105 110
Met Ser Ala Asp Asp Ser Lys Gln Val Val Met Lys Gly Val Thr His
115 120 125
Gly Ala Cys Asp Tyr Leu Ile Lys Pro Val Arg Ile Glu Ala Leu Lys
130 135 140
Asn Ile Trp Gln His Val Val Arg Lys Lys Lys Tyr Glu Tyr Asn Lys
145 150 155 160
Asp Val Glu Gln Ser Gly Ser Trp Asp Glu Gly Asp Arg Gln Leu Lys
165 170 175
His Asp Asp Ala Val Ser Ser Pro Ala Asn Asp Gly Ser Trp Lys Asn
180 185 190
Ser Lys Arg Lys Ser Gly Glu Asp Asp Glu Ala Asp Asp Lys Asp Asp
195 200 205
Thr Thr Thr Leu Lys Lys Pro Arg Val Val Trp Ser Val Glu Leu His
210 215 220
Gln Gln Phe Val Ala Ala Val Asn Gln Leu Gly Ile Asp Lys Ala Val
225 230 235 240
Pro Lys Lys Ile Leu Glu Leu Met Asn Val Pro Gly Leu Thr Arg Glu
245 250 255
Asn Val Ala Ser His Leu Gln Lys Tyr Arg Leu Tyr Leu Arg Arg Leu
260 265 270
Ser Gly Val Ser Gln His Gln Gly Gly Leu Asn Ser Ser Phe Met Pro
275 280 285
Gln Asp Pro Ser Phe Ser Thr Met Ser Ser Leu Gly Gly Ile Asp Leu
290 295 300
Gln Thr Leu Ala Ala Thr Gly Gln Leu Ser Ala Gln Thr Leu Ala Ala
305 310 315 320
Tyr Thr Arg Leu Pro Pro Thr Ile Lys Pro Gly Ile Ser Met Pro Phe
325 330 335
Val Asp Gln Arg Asn Leu Phe Ser Phe Glu Asn Ser Lys Leu Arg Tyr
340 345 350
Gly Asp Gly Gln Gln Ser Gln Ile Ser Asn Val Ser Lys Gln Met Asn
355 360 365
Leu Leu His Gly Phe Pro Thr Thr Met Glu Pro Lys Gln Leu Ala Val
370 375 380
Leu Asn Gln Ser Ala Gln Thr Leu Gly Ser Met Asn Met Gln Ala Asn
385 390 395 400
Ala Ser Ser Ser His Gln Ser Ser Ser Leu Leu Met Gln Gln Met Val
405 410 415
Pro Gln Gln Arg Gly His Ile Ser Asn Glu Ser Ile Ser Ser Gln Val
420 425 430
Pro Arg Ile Gln Pro Ser Val Gly Gln Pro Leu Gln Ser Asn Gly Asn
435 440 445
Ala Asn Ala Val Leu Ser Arg Asn Gly Ile Pro Tyr Asp Pro Val Asn
450 455 460
Gln Ser Ala Ser Val Val Asp Phe Ser Val Asn His Ile Pro Glu Leu
465 470 475 480
Pro Gly Asn Ser Phe Pro Leu Gly Ser Thr Pro Gly Ile Thr Ser Ile
485 490 495
Thr Ser Lys Gly Phe Asn Gln Glu Glu Ile Gly Ser Asp Ile Lys Val
500 505 510
Ser Arg Gly Phe Val Gly Ser Tyr Asp Met Phe Ser Glu Leu Gln His
515 520 525
Lys Pro Gln Glu Trp Gln Met Gln Asn Pro Asn Met Gly Phe Ala Gly
530 535 540
Ser Ser Gln His Val Pro Ser Val Gln Ser Gly Val Asn Val Ala Pro
545 550 555 560
Ser Ile Met Val Asn Gln Ser Tyr Val Ser Gly Gln Lys Asn Glu Gln
565 570 575
Asn Gly His Ser Met Ala Gly Lys Pro Met Tyr Ser Ala Gly Leu Glu
580 585 590
Asn Gln His Met Gly Met Gln Asn Val Asn Gln Asn Tyr Asn Ser Ile
595 600 605
His Val Asn Asn Ser Ser Arg Val Lys Ala Glu Ser Val Ser Asp Val
610 615 620
Val Asn Leu Gly Ala Asn Leu Phe Asp Tyr Ser Pro Glu Asp Met Leu
625 630 635 640
Ser Thr Ile Met Leu Lys Gln Gln Glu Gly Ile Gly Ser Gly Asp Phe
645 650 655
Asp Phe Asp Gly Tyr Thr Leu Asp Asn Ile Pro Val
660 665
<210>36
<211>670
<212>PRT
<213> apple
<220>
<221>MOD_RES
<222>(195)..(195)
<223> any amino acid
<220>
<221>MOD_RES
<222>(215)..(215)
<223> any amino acid
<220>
<221>MOD_RES
<222>(530)..(530)
<223> any amino acid
<220>
<221>MOD_RES
<222>(540)..(540)
<223> any amino acid
<220>
<221>MOD_RES
<222>(579)..(579)
<223> any amino acid
<400>36
Met Ala Ala Leu Gln Arg Val Ala Gln Ser Ser Val Ser Thr Thr Ala
1 5 10 15
Ser Ser Tyr Gly Ser Cys Lys Val Gly Gly Gly Val Leu Ser Pro Ser
20 25 30
Ala Gly Ile Glu Met Ala Val Pro Asn Gln Phe Pro Ala Gly Leu Arg
35 40 45
Val Leu Val Val Asp Asp Asp Thr Thr Cys Leu Arg Ile Leu Glu Leu
50 55 60
Met Leu Leu Arg Cys Leu Tyr Gln Val Thr Thr Cys Ser Glu Ala Thr
65 70 75 80
Val Ala Leu Asn Leu Leu Arg Glu Arg Lys Asp Cys Phe Asp Val Val
85 90 95
Leu Ser Asp Val His Met Pro Asp Met Asp Gly Phe Lys Leu Leu Glu
100 105 110
His Val Gly Leu Glu Met Asp Leu Pro Val Ile Met Met Ser Ala Asp
115 120 125
Gly Arg Thr Ser Val Val Met Arg Gly Ile Arg His Gly Ala Cys Asp
130 135 140
Phe Leu Ile Lys Pro Ile Ser Glu Ala Glu Leu Lys Asn Ile Trp Gln
145 150 155 160
His Val Val Arg Lys Lys Trp Asn Gly Ser Lys Glu Leu Glu His Ser
165 170 175
Gly Ser Leu Glu Asp Asn Asp Pro His Lys Arg Gly Asn Asn Asp Phe
180 185 190
Glu Tyr Xaa Ser Ser Val Asn Glu Gly Thr Glu Val Ser Leu Lys Gly
195 200 205
His Lys Lys Arg Ile Asn Xaa Lys Glu Asp Asp Asp Gly Asp Thr Glu
210 215 220
Asn Asp Asp Leu Ser Thr Ser Lys Lys Pro Arg Val Val Trp Ser Val
225 230 235 240
Glu Leu His Gln Gln Phe Val Thr Ala Val Asn Gln Leu Gly Leu Asp
245 250 255
Lys Ala Val Pro Lys Arg Ile Leu Glu Leu Met Asn Val Pro Gly Leu
260 265 270
Thr Arg Glu Asn Val Ala Ser His Leu Gln Lys Phe Arg Leu Tyr Leu
275 280 285
Lys Arg Leu Ser Gly Val Ala Gln Gln Gln Ser Gly Ile Ala Asn Pro
290 295 300
Leu Cys Gly Pro Val Asp Ser Asn Gly Lys Leu Gly Ser Leu Ser Arg
305 310 315 320
Phe Asp Phe Gln Ala Leu Ala Ala Ser Gly Gln Ile Pro Pro Gln Thr
325 330 335
Leu Ala Ala Leu Gln Ala Glu Leu Leu Gly Gln Pro Ala Gly Asn Leu
340 345 350
Val Pro Ala Met Asp Gln Pro Ala Leu Leu His Ala Ser Leu Gln Ala
355 360 365
Pro Lys Arg Pro Pro Val Glu His Gly Val Pro Phe Met Gln Pro Phe
370 375 380
Val Lys Ser Gln Ser Asn Val Ser Lys His Phe Pro Gln Ser Val Ile
385 390 395 400
Ser Ala Glu Asp Ala Ser Leu Gly Phe Gly Gln Trp Arg Ser Asn Ser
405 410 415
Arg Ser Thr Val Ala Pro Ser Asn Asp His Gly Gly Leu Ser Thr Gln
420 425 430
Asn Ser Asn Leu Leu Met Gly Ile Val Pro Gln Glu Gln Arg Gln His
435 440 445
Lys Arg Thr Gln Gln Gln Ser Val Leu Thr Glu Pro Ser Arg Ser Phe
450 455 460
Asn Val Gln Pro Ser Cys Leu Val Val Pro Ser Gln Ser Ser Thr Gly
465 470 475 480
Phe Gln Ala Gly Asn Ser Pro Ala Ser Val Asn Gln Ser Ser Ser Phe
485 490 495
Asn Arg Ser Thr Val Val Asp Tyr Ser Leu Pro Ser Asp Gln Ser Asn
500 505 510
Asn Ser Leu Asn Val Gly His Ile Pro Thr Gly Asn Pro Lys Thr Ser
515 520 525
Gly Xaa Leu Gly Gly Tyr Ser Gly Pro Gly Ser Xaa Cys Ala Thr Ser
530 535 540
Cys Leu Val Asn Ala Asp Asn Ser Thr Ser Tyr Gln Asn Ser Thr Ala
545 550 555 560
Thr Phe Ser Asp Ser Arg Glu Leu Pro Gly Phe Leu His Asn Thr Ala
565 570 575
Asn Ser Xaa Gly Phe Tyr Val Asp Lys Ser Gly Glu Met Leu Asp Gln
580 585 590
Gly Pro Leu Arg Asn Leu Gly Phe Val Gly Lys Glu Thr Cys Ile Pro
595 600 605
Ser Arg Phe Ala Val Asp Asp Phe Glu Ser Gln Met Ser Asn Leu Asn
610 615 620
Pro Gly Arg Ile His Val Glu Ser Ser Gly Thr Leu Val Lys Gln Glu
625 630 635 640
Pro Ser Glu Asp Tyr Val Asp Asn Ala Lys Leu Gly Ile Pro Ile Leu
645 650 655
His Gln Tyr Ser Ser Ser Asp Phe Met Ser Pro Phe Ala Asp
660 665 670
<210>37
<211>802
<212>PRT
<213> corn
<400>37
Pro Tyr Pro Thr His Thr Leu Leu Pro Gln Pro His Leu Ser Leu Ser
1 5 10 15
Ala Cys Val Leu Leu Val Leu Leu Ser Leu Ser Ser Pro Ala Leu Thr
20 25 30
Ser Pro Pro Phe Pro Ala Val Ser Trp Ile Ser Arg Ile Gln Thr Thr
35 40 45
Ala Leu Val Ser Leu Pro Ser Cys Leu Leu Pro Ala Tyr Val Gln Glu
50 55 60
Gly Pro Cys Leu Gly Asp Pro Gly Ala Trp Phe Leu Gly Ser Ala Ala
65 70 75 80
Ser Ala Ala Val Gly Phe Ala Glu Pro Glu Pro Pro Glu Met Thr Val
85 90 95
Asp Glu Leu Lys Leu Gln Ala Arg Ala Ser Gly Gly His Gly Ala Lys
100 105 110
Asp Gln Phe Pro Val Gly Met Arg Val Leu Ala Val Asp Asp Asp Pro
115 120 125
Thr Cys Leu Lys Ile Leu Glu Asn Leu Leu Leu Arg Cys Gln Tyr His
130 135 140
Val Thr Thr Thr Gly Gln Ala Ala Thr Ala Leu Lys Leu Leu Arg Glu
145 150 155 160
Lys Lys Asp Gln Phe Asp Leu Val Ile Ser Asp Val His Met Pro Asp
165 170 175
Met Asp Gly Phe Lys Leu Leu Glu Leu Val Gly Leu Glu Met Asp Leu
180 185 190
Pro Val Ile Met Leu Ser Ala Asn Gly Glu Thr Gln Thr Val Met Lys
195 200 205
Gly Ile Thr His Gly Ala Cys Asp Tyr Leu Leu Lys Pro Val Arg Ile
210 215 220
Glu Gln Leu Arg Thr Ile Trp Gln His Val Val Arg Arg Arg Ser Cys
225 230 235 240
Asp Ala Lys Asn Ser Gly Asn Asp Asn Asp Asp Ser Gly Lys Lys Leu
245 250 255
Gln Val Val Ser Ala Glu Gly Asp Asn Gly Gly Val Asn Arg Asn Lys
260 265 270
Arg Ile Ser Arg Lys Gly Arg Asp Asp Asn Gly Asp Asp Gly Asp Asp
275 280 285
Ser Asp Asp Asn Ser Asn Glu Asn Gly Asp Ser Ser Ser Gln Lys Lys
290 295 300
Pro Arg Val Val Trp Ser Val Glu Leu His Arg Lys Phe Val Ala Ala
305 310 315 320
Val Asn Gln Leu Gly Ile Asp Lys Ala Val Pro Lys Lys Ile Leu Asp
325 330 335
Leu Met Asn Val Glu Asn Ile Thr Arg Glu Asn Val Ala Ser His Leu
340 345 350
Gln Lys Tyr Arg Leu Tyr Leu Lys Arg Leu Ser Ala Asp Ala Ser Arg
355 360 365
Gln Ala Asn Leu Thr Ala Ala Phe Gly Gly Arg Asn Pro Ala Tyr Val
370 375 380
Asn Met Gly Leu Asp Ala Phe Arg Gln Tyr Asn Ala Tyr Gly Arg Tyr
385 390 395 400
Arg Pro Val Pro Thr Thr Asn His Ser Gln Pro Asn Asn Leu Leu Ala
405 410 415
Arg Met Asn Ser Pro Ala Phe Gly Met His Gly Leu Leu Pro Ser Gln
420 425 430
Pro Leu Gln Ile Gly His Asn Gln Asn Asn Leu Ser Thr Ser Leu Gly
435 440 445
Asn Val Gly Gly Met Asn Asn Gly Asn Leu Ile Arg Gly Ala His Met
450 455 460
Pro Leu Gln Asp Thr Ser Lys Cys Phe Pro Thr Gly Pro Ser Gly Asn
465 470 475 480
Ser Phe Ala Asn Ile Ser Asn Ser Thr Gln Leu Val Thr Thr Asn Asn
485 490 495
Leu Pro Leu Gln Ser Leu Glu Pro Ser Asn Gln Gln His Leu Gly Arg
500 505 510
Leu His Ser Ser Ala Asp Pro Phe Asn Ser Phe Val Gly Glu Pro Pro
515 520 525
Gln Phe Ala Asp Leu Gly Arg Cys Asn Thr Thr Trp Pro Thr Ala Val
530 535 540
Ser Ser Ser Asn Val Gln Glu Ile Gly Gln Lys Asp Arg Ile Val Asn
545 550 555 560
Arg Pro Lys Leu Glu Pro Leu Ser Ser Phe Thr Glu Ala Ser Ser Gln
565 570 575
Ile Pro Leu Leu Gly Asn Glu Met Gln Ser His Gln Val Ala Ser Leu
580 585 590
Ala Ser Asn Gly Leu Pro Met Pro Phe Thr Gln Glu Ala Val Pro Phe
595 600 605
Ala Tyr Gly Ser Ser Thr Asn Ser Arg Glu Met Leu Asn Asn Asn Leu
610 615 620
Ala Leu Ser Asn Ser Gly Val Asn Ser Thr Leu Pro Asn Leu Arg Ile
625 630 635 640
Asp Gly Ser Val Val Pro Gly Gln Thr Leu Gly Gly Ser Asn Ser Gly
645 650 655
Gly Cys Val Val Pro Pro Leu Gln Asp Gly Arg Ile Asp His Gln Ala
660 665 670
Val Ser Ser His Leu Asn Tyr Asn Asn Glu Leu Met Gly Thr Gly Arg
675 680 685
Leu Gln Arg Gly Leu Ser Gly Gly Leu Asp Asp Ile Val Val Asp Met
690 695 700
Phe Arg Pro Asp Arg Ala Asp Asp Gly Val Ser Phe Ile Asp Gly Asp
705 710 715 720
Trp Glu Leu Arg Pro Gly Ser Ser Val Thr Ser Glu Tyr Gln Leu Cys
725 730 735
Gly Ile Cys Tyr Leu Asn Ser Tyr Asp Tyr Val Phe Lys Ser Gly Val
740 745 750
Asn Cys Gly Tyr Arg Asp Ile Gln His Val Tyr Glu Pro Arg Asn Asp
755 760 765
Val Leu Phe Pro Leu Gly Asn Arg Phe Ala Val Pro Phe Val Asp Cys
770 775 780
His Cys Ile Val Ala Ser Leu Ala Glu Thr Glu Val Lys Gly Lys Asp
785 790 795 800
Gln Ala
<210>38
<211>591
<212>PRT
<213> turnip
<400>38
Met Leu Asn Pro Gly Val Val Gly Gly Ser Ser Asn Ser Asp Pro Phe
1 5 10 15
Pro Ser Gly Leu Arg Val Leu Val Val Asp Asp Asp Pro Thr Cys Leu
20 25 30
Met Ile Leu Glu Arg Met Leu Lys Thr Cys Leu Tyr Arg Val Thr Lys
35 40 45
Cys Asn Arg Ala Glu Ile Ala Leu Ser Leu Leu Arg Lys Asn Lys Asn
50 55 60
Gly Phe Asp Ile Val Ile Ser Asp Val His Met Pro Asp Met Asn Gly
65 70 75 80
Phe Lys Leu Leu Glu His Val Gly Leu Glu Met Asp Leu Pro Val Ile
85 90 95
Met Met Ser Ala Asp Asp Ser Lys Ser Val Val Leu Lys Gly Val Thr
100 105 110
His Gly Ala Val Asp Tyr Leu Ile Lys Pro Val Arg Ile Glu Ala Leu
115 120 125
Lys Asn Ile Trp Gln His Val Val Arg Lys Lys Gln Asn Val Ser Glu
130 135 140
His Ser Gly Ser Val Glu Glu Thr Gly Gly Asp Arg Gln Gln Gln Gln
145 150 155 160
Arg Gly Asp Asp Asp Asp Asp Gly Asn Asn Ser Ser Ser Gly Asn Asn
165 170 175
Glu Gly Asn Leu Arg Lys Arg Lys Glu Glu Glu Gln Gly Asp Asp Lys
180 185 190
Glu Asp Thr Ser Ser Leu Lys Lys Pro Arg Val Val Trp Ser Val Glu
195 200 205
Leu His Gln Gln Phe Val Ala Ala Val Asn His Leu Gly Val Asp Lys
210 215 220
Ala Val Pro Lys Lys Ile Leu Glu Met Met Asn Val Gln Gly Leu Thr
225 230 235 240
Arg Glu Asn Val Ala Ser His Leu Gln Lys Tyr Arg Ile Tyr Leu Lys
245 250 255
Arg Leu Gly Gly Val Ser Gln Gly Asn Met Asn His Ser Phe Leu Thr
260 265 270
Gly Gln Asp Pro Ser Tyr Gly Pro Leu Asn Gly Phe Asp Leu Gln Gly
275 280 285
Leu Ala Thr Ala Gly Gln Leu Gln Ala Gln Ser Leu Ala Gln Leu Gln
290 295 300
Ala Val Gly Leu Gly Gln Ser Ser Ser Pro Leu Ile Lys Pro Gly Ile
305 310 315 320
Thr Ser Val Asp Gln Arg Ser Phe Phe Thr Phe Gln Asn Ser Lys Ser
325 330 335
Arg Phe Gly Asp Gly His Gly Pro Met Met Met Asn Gly Gly Gly Gly
340 345 350
Asn Lys Gln Thr Ser Leu Leu His Gly Val Pro Thr Gly His Met Arg
355 360 365
Leu Gln Gln Gln Gln Met Ala Gly Met Arg Val Ala Gly Pro Ser Met
370 375 380
Gln Gln Gln Gln Gln Gln Ser Met Leu Ser Arg Arg Ser Val Pro Glu
385 390 395 400
Thr Arg Ser Ser Arg Val Leu Pro Ala Ala Thr His Ser Ala Leu Asn
405 410 415
Asn Ser Phe Pro Leu Ala Ser Ala Pro Gly Met Met Ser Val Ser Asp
420 425 430
Thr Lys Gly Val Asn Glu Phe Cys Asn Pro Ser Tyr Asp Ile Leu Asn
435 440 445
Asn Phe Pro Gln Gln Gln His His Asn Asn Asn Asn Asn Arg Val Asn
450 455 460
Glu Trp Asp Leu Arg Asn Val Gly Met Val Phe Asn Ser His Gln Asp
465 470 475 480
Asn Thr Thr Ser Ala Ala Phe Ser Thr Ser Glu Ala Tyr Ser Ser Ser
485 490 495
Ser Thr His Lys Arg Lys Arg Glu Ala Glu Leu Val Val Glu His Gly
500 505 510
Gln Asn Gln Gln Gln Pro Gln Ser Arg Ser Val Lys Pro Met Asn Gln
515 520 525
Thr Tyr Met Asp Gly Gly Gly Ser Val Arg Met Lys Thr Glu Thr Val
530 535 540
Thr Cys Pro Pro Gln Ala Thr Thr Met Phe His Glu Gln Tyr Ser Asn
545 550 555 560
Gln Asp Asp Leu Leu Ser Asp Leu Leu Lys Gln Glu Gly Leu Leu Asp
565 570 575
Thr Glu Phe Asp Phe Glu Gly Tyr Ser Phe Asp Ser Ile Leu Val
580 585 590
<210>39
<211>691
<212>PRT
<213> Rice
<400>39
Met Ala Pro Val Glu Asp Gly Gly Gly Val Glu Phe Pro Val Gly Met
1 5 10 15
Lys Val Leu Val Val Asp Asp Asp Pro Thr Cys Leu Ala Val Leu Lys
20 25 30
Arg Met Leu Leu Glu Cys Arg Tyr Asp Ala Thr Thr Cys Ser Gln Ala
35 40 45
Thr Arg Ala Leu Thr Met Leu Arg Glu Asn Arg Arg Gly Phe Asp Val
50 55 60
Ile Ile Ser Asp Val His Met Pro Asp Met Asp Gly Phe Arg Leu Leu
65 70 75 80
Glu Leu Val Gly Leu Glu Met Asp Leu Pro Val Ile Met Met Ser Ala
85 90 95
Asp Ser Arg Thr Asp Ile Val Met Lys Gly Ile Lys His Gly Ala Cys
100 105 110
Asp Tyr Leu Ile Lys Pro Val Arg Met Glu Glu Leu Lys Asn Ile Trp
115 120 125
Gln His Val Ile Arg Lys Lys Phe Asn Glu Asn Lys Glu His Glu His
130 135 140
Ser Gly Ser Leu Asp Asp Thr Asp Arg Thr Arg Pro Thr Asn Asn Asp
145 150 155 160
Asn Glu Tyr Ala Ser Ser Ala Asn Asp Gly Ala Glu Gly Ser Trp Lys
165 170 175
Ser Gln Lys Lys Lys Arg Asp Lys Asp Asp Asp Asp Gly Glu Leu Glu
180 185 190
Ser Gly Asp Pro Ser Ser Thr Ser Lys Lys Pro Arg Val Val Trp Ser
195 200 205
Val Glu Leu His Gln Gln Phe Val Asn Ala Val Asn His Leu Gly Ile
210 215 220
Asp Lys Ala Val Pro Lys Lys Ile Leu Glu Leu Met Asn Val Pro Gly
225 230 235 240
Leu Thr Arg Glu Asn Val Ala Ser His Leu Gln Lys Phe Arg Leu Tyr
245 250 255
Leu Lys Arg Ile Ala Gln His His Ala Gly Ile Ala Asn Pro Phe Cys
260 265 270
Pro Pro Ala Ser Ser Gly Lys Val Gly Ser Leu Gly Gly Leu Asp Phe
275 280 285
Gln Ala Leu Ala Ala Ser Gly Gln Ile Pro Pro Gln Ala Leu Ala Ala
290 295 300
Leu Gln Asp Glu Leu Leu Gly Arg Pro Thr Asn Ser Leu Val Leu Pro
305 310 315 320
Gly Arg Asp Gln Ser Ser Leu Arg Leu Ala Ala Val Lys Gly Asn Lys
325 330 335
Pro His Gly Glu Arg Glu Ile Ala Phe Gly Gln Pro Ile Tyr Lys Cys
340 345 350
Gln Asn Asn Ala Tyr Gly Ala Phe Pro Gln Ser Ser Pro Ala Val Gly
355 360 365
Gly Met Pro Ser Phe Ser Ala Trp Pro Asn Asn Lys Leu Gly Met Ala
370 375 380
Asp Ser Thr Gly Thr Leu Gly Gly Met Ser Asn Ser Gln Asn Ser Asn
385 390 395 400
Ile Val Leu His Glu Leu Gln Gln Gln Pro Asp Ala Met Leu Ser Gly
405 410 415
Thr Leu His Ser Leu Asp Val Lys Pro Ser Gly Ile Val Met Pro Ser
420 425 430
Gln Ser Leu Asn Thr Phe Ser Ala Ser Glu Gly Leu Ser Pro Asn Gln
435 440 445
Asn Thr Leu Met Ile Pro Ala Gln Ser Ser Gly Phe Leu Ala Ala Met
450 455 460
Pro Pro Ser Met Lys His Glu Pro Val Leu Ala Thr Ser Gln Pro Ser
465 470 475 480
Ser Ser Leu Leu Gly Gly Ile Asp Leu Val Asn Gln Ala Ser Thr Ser
485 490 495
Gln Pro Leu Ile Ser Ala His Gly Gly Gly Asn Leu Ser Gly Leu Val
500 505 510
Asn Arg Asn Pro Asn Val Val Pro Ser Gln Gly Ile Ser Thr Phe His
515 520 525
Thr Pro Asn Asn Pro Tyr Leu Val Ser Pro Asn Ser Met Gly Met Gly
530 535 540
Ser Lys Gln Pro Pro Gly Val Leu Lys Thr Glu Asn Ser Asp Ala Leu
545 550 555 560
Asn His Ser Tyr Gly Tyr Leu Gly Gly Ser Asn Pro Pro Met Asp Ser
565 570 575
Gly Leu Leu Ser Ser Gln Ser Lys Asn Thr Gln Phe Gly Leu Leu Gly
580 585 590
Gln Asp Asp Ile Thr Gly Ser Trp Ser Pro Leu Pro Asn Val Asp Ser
595 600 605
Tyr Gly Asn Thr Val Gly Leu Ser His Pro Gly Ser Ser Ser Ser Ser
610 615 620
Phe Gln Ser Ser Asn Val Ala Leu Gly Lys Leu Pro Asp Gln Gly Arg
625 630 635 640
Gly Lys Asn His Gly Phe Val Gly Lys Gly Thr Cys Ile Pro Ser Arg
645 650 655
Phe Ala Val Asp Glu Ile Glu Ser Pro Thr Asn Asn Leu Ser His Ser
660 665 670
Ile Gly Ser Ser Gly Asp Ile Met Ser Pro Asp Ile Phe Gly Phe Ser
675 680 685
Gly Gln Met
690
<210>40
<211>428
<212>PRT
<213> Oocystis sp.
<400>40
Met Ala Leu Lys Arg Val Pro Ser Phe Ser Gly Arg Pro Asn Phe Pro
1 5 10 15
Ala Gly Leu Gln Ile Leu Val Val Asp Ser Asp Ser Ser Ser Arg Glu
20 25 30
Ala Val Glu Met Gln Leu Lys Ser His Ser Tyr Leu Ala Thr Cys Cys
35 40 45
Cys Thr Cys Gly Glu Ala Val Glu Gln Leu Gly Thr Ser Lys Tyr Asp
50 55 60
Ile Val Leu Ala Glu Ser Lys Leu Val Ala Ala Glu Cys Val Asp Ser
65 70 75 80
Thr Arg Leu Cys Glu Ala Ala Arg Ala Leu Pro Leu Val Leu Met Cys
85 90 95
Glu Asp Ser Thr Ala Asp Asp Val Leu Lys Gly Ile Arg Leu Gly Ala
100 105 110
Cys Asp Phe Leu Glu Lys Pro Leu Ser Pro Leu Lys Leu Lys Asn Ile
115 120 125
Trp Gln His Val Val Arg Lys Met Met Glu Gln Met His Val Arg Arg
130 135 140
Thr Asp Asp Ala Asp Thr Cys Thr Thr Lys Ser Ser Arg Asp Gln Ser
145 150 155 160
Cys Ala Ile Lys Gly Lys Ser Val Ala Ser Thr Pro Ser Cys Pro Lys
165 170 175
Thr Pro Ser Pro Ala Ala Ser Gly Ala Asp Ile Gly Cys Ser Ile Ala
180 185 190
Thr Ser Val Ser Lys Ala Gly Asp Val Val Gly Glu Ser Ser Ser Ser
195 200 205
Glu Thr Arg Lys Glu His Cys Ser Glu Thr Thr Glu Cys Ser Asp Leu
210 215 220
Lys Ser Cys Ala Ala Lys Ser Ala Val Ser Ala Gln Thr Pro Val Ser
225 230 235 240
Thr Ala Thr Val Ala Ala Thr Trp Gly Ala Ser Lys Lys Lys Ser Thr
245 250 255
Ala Ser Ala Thr Thr Ser Ser Val Ser Asn Arg Pro Pro Leu Ala Ile
260 265 270
Lys Met Pro Ala Pro Ala Val Ala Tyr Thr Ser Gly Leu Ala Pro Phe
275 280 285
Pro Pro Pro Met Phe Val Pro Gly Gly Trp Gly Gln Ser Ser Asn Pro
290 295 300
Cys Val Val Gly Thr Pro Met Met Pro Pro Pro Pro Gly Met Gly Met
305 310 315 320
Pro Pro His His His Ala Pro Tyr Gly Gln Val Pro Pro Pro Gly Tyr
325 330 335
Pro Val Ala Cys Met Pro Ser Ala Phe Val Pro Thr Pro Met Gly Pro
340 345 350
Gly Gly Val Ala Phe Ala Pro Pro Pro Gly Ala Ser Cys Thr Ser Ala
355 360 365
Ala Tyr Tyr Pro His Pro Ala Val Asp Ala Ser Ala Ser Ala Thr Ala
370 375 380
Thr Phe Thr Gly His Val Gln Ile Asp Leu Thr Asn Val Ser Ala Glu
385 390 395 400
Glu Pro Ala Pro Ile Gly Leu Ala Leu Arg Lys Thr Ala Ser Leu Leu
405 410 415
Asp Leu Val Ser Asp Arg Leu Gly Gln Arg Ala Cys
420 425
<210>41
<211>341
<212>PRT
<213> Tetraselmis sp.
<400>41
Met Leu Cys Pro Ala Val Gln Val Ala Thr Met Ala Thr Val Leu Ala
1 5 10 15
Ser Thr His Phe Ser Glu Arg Pro Ser Phe Pro Ala Asp Leu Glu Val
20 25 30
Leu Leu Leu Asp Ser Ala Thr Gln Gly Ala Glu Thr Ala Ser Lys Leu
35 40 45
Leu Leu Ser Cys Ser Tyr Arg Val Thr Val Cys Arg Ser Val Ser Glu
50 55 60
Ala Leu Ser His Met Ala Cys Lys Ala Phe Asp Val Val Leu Val Glu
65 70 75 80
Gln Lys Leu Phe Ser Gly Arg Asp Ala Ala Ala Ala Gln Leu Lys Ala
85 90 95
Leu Ala Gly Val Ile Pro Thr Val Val Leu Ser Asp Ser Gly Ser Ala
100 105 110
Lys Asp Thr Trp Ala Ala Ile Val Gly Gln Ala Ala Asp Val Leu Ile
115 120 125
Arg Pro Leu Thr Lys Gln Lys Leu Gln Thr Leu Trp Gln His Thr Val
130 135 140
Arg Met Gln Arg Ala Ala Ser Ser Ala Ser Ala Ala Thr Ser Met Val
145 150 155 160
Ala Lys Pro Val Ala Val Leu Ser Ser Ala Leu Lys Pro Ala Ala Ser
165 170 175
Ser Ala Ser Leu Asp Lys Gly Gln Lys Arg Lys Leu Lys Asp His Met
180 185 190
Met Gly Pro Ile Met Ala His Pro Gln Val Ser Asn Pro Gly Phe Ile
195 200 205
Trp Gly Ala Pro Val Met Gly Val Pro Ala Gly Gln Gln Ala Pro Gln
210 215 220
Lys Ser Glu Ala Pro Val Thr Pro Gln Lys Pro Gly Ser Glu Met His
225 230 235 240
Pro Glu Leu Asp Ala Thr Ser His Ile Ala Met Gly Ser Ser Asp Asn
245 250 255
Phe Asn Val Pro Val Tyr Glu Ser Gly Thr Asp Ser Gln Glu Ser Gln
260 265 270
Pro Thr Cys Asp Pro Thr Ser Leu Asp Asp Ile Asn Glu Asp Asp Tyr
275 280 285
Ala Phe Ile Asp Phe Ala Leu Ser Asp Ser Phe Pro Thr Val Glu Glu
290 295 300
Asp Glu Ile Leu Pro Pro Ile Gly Leu Ser Leu Lys Lys Ser Ser Ser
305 310 315 320
Leu Leu Asn Met Leu Asn Gly Val Leu Leu Ser Ala His Ser Val Pro
325 330 335
Leu Gln Leu Pro Gln
340
<210>42
<211>558
<212>PRT
<213> Arabidopsis thaliana
<400>42
Met Ser Ser Ser Glu Glu Val Val Glu Val Thr Val Val Lys Ala Pro
1 5 10 15
Glu Ala Gly Gly Gly Lys Leu Ser Arg Arg Lys Ile Arg Lys Lys Asp
20 25 30
Ala Gly Val Asp Gly Leu Val Lys Trp Glu Arg Phe Leu Pro Lys Ile
35 40 45
Ala Leu Arg Val Leu Leu Val Glu Ala Asp Asp Ser Thr Arg Gln Ile
50 55 60
Ile Ala Ala Leu Leu Arg Lys Cys Ser Tyr Arg Val Ala Ala Val Pro
65 70 75 80
Asp Gly Leu Lys Ala Trp Glu Met Leu Lys Gly Lys Pro Glu Ser Val
85 90 95
Asp Leu Ile Leu Thr Glu Val Asp Leu Pro Ser Ile Ser Gly Tyr Ala
100 105 110
Leu Leu Thr Leu Ile Met Glu His Asp Ile Cys Lys Asn Ile Pro Val
115 120 125
Ile Met Met Ser Thr Gln Asp Ser Val Asn Thr Val Tyr Lys Cys Met
130 135 140
Leu Lys Gly Ala Ala Asp Tyr Leu Val Lys Pro Leu Arg Arg Asn Glu
145 150 155 160
Leu Arg Asn Leu Trp Gln His Val Trp Arg Arg Gln Thr Ser Leu Ala
165 170 175
Pro Asp Ser Phe Pro Trp Asn Glu Ser Val Gly Gln Gln Lys Ala Glu
180 185 190
Gly Ala Ser Ala Asn Asn Ser Asn Gly Lys Arg Asp Asp His Val Val
195 200 205
Ile Gly Asn Gly Gly Asp Ala Gln Ser Ser Cys Thr Arg Pro Glu Met
210 215 220
Glu Gly Glu Ser Ala Asp Val Glu Val Ser Ala Arg Asp Ala Val Gln
225 230 235 240
Met Glu Cys Ala Lys Ser Gln Phe Asn Glu Thr Gln Leu Leu Ala Asn
245 250 255
Glu Leu Gln Ser Lys Gln Ala Glu Ala Ile Asp Phe Met Gly Ala Ser
260 265 270
Phe Arg Arg Thr Gly Arg Arg Asn Arg Glu Glu Ser Val Ala Gln Tyr
275 280 285
Glu Ser Arg Ile Glu Leu Asp Leu Ser Leu Arg Arg Pro Asn Ala Ser
290 295 300
Glu Asn Gln Ser Ser Gly Asp Arg Pro Ser Leu His Pro Ser Ser Ala
305 310 315 320
Ser Ala Phe Thr Arg Tyr Val His Arg Pro Leu Gln Thr Gln Cys Ser
325 330 335
Ala Ser Pro Val Val Pro Asp Gln Arg Lys Asn Val Ala Ala Ser Gln
340 345 350
Asp Asp Asn Ile Val Leu Met Asn Gln Tyr Asn Thr Ser Glu Pro Pro
355 360 365
Pro Asn Ala Pro Arg Arg Asn Asp Thr Ser Phe Tyr Thr Gly Thr Asp
370 375 380
Ser Pro Gly Pro Pro Phe Ser Asn Gln Met Asn Ser Trp Pro Gly Gln
385 390 395 400
Gly Ser Tyr Pro Thr Pro Thr Pro Ile Asn Asn Ile Gln Phe Arg Gly
405 410 415
Pro Asn Thr Ala Tyr Thr Ser Ala Met Ala Pro Ala Ser Leu Ser Pro
420 425 430
Ser Pro Ser Ser Val Ser Pro His Glu Tyr Ser Ser Met Phe His Pro
435 440 445
Phe Asn Ser Lys Pro Glu Gly Leu Gln Asp Arg Asp Cys Ser Met Asp
450 455 460
Val Asp Asp Arg Arg Tyr Val Ser Ser Ala Thr Glu His Ser Ala Ile
465 470 475 480
Gly Asn His Ile Asp Gln Leu Ile Glu Lys Lys Asn Glu Asp Gly Tyr
485 490 495
Ser Ser Ser Val Gly Lys Ile Gln Gln Ser Leu Gln Arg Glu Ala Ala
500 505 510
Leu Thr Lys Phe Arg Met Lys Arg Lys Asp Arg Cys Phe Glu Lys Lys
515 520 525
Val Arg Tyr Glu Ser Arg Lys Lys Leu Ala Glu Gln Arg Pro Arg Ile
530 535 540
Lys Gly Gln Phe Val Arg Gln Val Gln Ser Thr Gln Ala Pro
545 550 555
<210>43
<211>186
<212>PRT
<213> Arabidopsis thaliana
<400>43
Met Ala Glu Val Met Leu Pro Arg Lys Met Glu Ile Leu Asn His Ser
1 5 10 15
Ser Lys Phe Gly Ser Pro Asp Pro Leu His Val Leu Ala Val Asp Asp
20 25 30
Ser His Val Asp Arg Lys Phe Ile Glu Arg Leu Leu Arg Val Ser Ser
35 40 45
Cys Lys Val Thr Val Val Asp Ser Ala Thr Arg Ala Leu Gln Tyr Leu
50 55 60
Gly Leu Asp Val Glu Glu Lys Ser Val Gly Phe Glu Asp Leu Lys Val
65 70 75 80
Asn Leu Ile Met Thr Asp Tyr Ser Met Pro Gly Met Thr Gly Tyr Glu
85 90 95
Leu Leu Lys Lys Ile Lys Glu Ser Ser Ala Phe Arg Glu Val Pro Val
100 105 110
Val Ile Met Ser Ser Glu Asn Ile Leu Pro Arg Ile Asp Arg Cys Leu
115 120 125
Glu Glu Gly Ala Glu Asp Phe Leu Leu Lys Pro Val Lys Leu Ser Asp
130 135 140
Val Lys Arg Leu Arg Asp Ser Leu Met Lys Val Glu Asp Leu Ser Phe
145 150 155 160
Thr Lys Ser Ile Gln Lys Arg Glu Leu Glu Thr Glu Asn Val Tyr Pro
165 170 175
Val His Ser Gln Leu Lys Arg Ala Lys Ile
180 185
<210>44
<211>727
<212>PRT
<213> Arabidopsis thaliana
<400>44
Met Asn Ala Asn Glu Glu Gly Glu Gly Ser Arg Tyr Pro Ile Thr Asp
1 5 10 15
Arg Lys Thr Gly Glu Thr Lys Phe Asp Arg Val Glu Ser Arg Thr Glu
20 25 30
Lys His Ser Glu Glu Glu Lys Thr Asn Gly Ile Thr Met Asp Val Arg
35 40 45
Asn Gly Ser Ser Gly Gly Leu Gln Ile Pro Leu Ser Gln Gln Thr Ala
50 55 60
Ala Thr Val Cys Trp Glu Arg Phe Leu His Val Arg Thr Ile Arg Val
65 70 75 80
Leu Leu Val Glu Asn Asp Asp Cys Thr Arg Tyr Ile Val Thr Ala Leu
85 90 95
Leu Arg Asn Cys Ser Tyr Glu Val Val Glu Ala Ser Asn Gly Ile Gln
100 105 110
Ala Trp Lys Val Leu Glu Asp Leu Asn Asn His Ile Asp Ile Val Leu
115 120 125
Thr Glu Val Ile Met Pro Tyr Leu Ser Gly Ile Gly Leu Leu Cys Lys
130 135 140
Ile Leu Asn His Lys Ser Arg Arg Asn Ile Pro Val Ile Met Met Ser
145 150 155 160
Ser His Asp Ser Met Gly Leu Val Phe Lys Cys Leu Ser Lys Gly Ala
165 170 175
Val Asp Phe Leu Val Lys Pro Ile Arg Lys Asn Glu Leu Lys Ile Leu
180 185 190
Trp Gln His Val Trp Arg Arg Cys Gln Ser Ser Ser Gly Ser Gly Ser
195 200 205
Glu Ser Gly Thr His Gln Thr Gln Lys Ser Val Lys Ser Lys Ser Ile
210 215 220
Lys Lys Ser Asp Gln Asp Ser Gly Ser Ser Asp Glu Asn Glu Asn Gly
225 230 235 240
Ser Ile Gly Leu Asn Ala Ser Asp Gly Ser Ser Asp Gly Ser Gly Ala
245 250 255
Gln Ser Ser Trp Thr Lys Lys Ala Val Asp Val Asp Asp Ser Pro Arg
260 265 270
Ala Val Ser Leu Trp Asp Arg Val Asp Ser Thr Cys Ala Gln Val Val
275 280 285
His Ser Asn Pro Glu Phe Pro Ser Asn Gln Leu Val Ala Pro Pro Ala
290 295 300
Glu Lys Glu Thr Gln Glu His Asp Asp Lys Phe Glu Asp Val Thr Met
305 310 315 320
Gly Arg Asp Leu Glu Ile Ser Ile Arg Arg Asn Cys Asp Leu Ala Leu
325 330 335
Glu Pro Lys Asp Glu Pro Leu Ser Lys Thr Thr Gly Ile Met Arg Gln
340 345 350
Asp Asn Ser Phe Glu Lys Ser Ser Ser Lys Trp Lys Met Lys Val Gly
355 360 365
Lys Gly Pro Leu Asp Leu Ser Ser Glu Ser Pro Ser Ser Lys Gln Met
370 375 380
His Glu Asp Gly Gly Ser Ser Phe Lys Ala Met Ser Ser His Leu Gln
385 390 395 400
Asp Asn Arg Glu Pro Glu Ala Pro Asn Thr His Leu Lys Thr Leu Asp
405 410 415
Thr Asn Glu Ala Ser Val Lys Ile Ser Glu Glu Leu Met His Val Glu
420 425 430
His Ser Ser Lys Arg His Arg Gly Thr Lys Asp Asp Gly Thr Leu Val
435 440 445
Arg Asp Asp Arg Asn Val Leu Arg Arg Ser Glu Gly Ser Ala Phe Ser
450 455 460
Arg Tyr Asn Pro Ala Ser Asn Ala Asn Lys Ile Ser Gly Gly Asn Leu
465 470 475 480
Gly Ser Thr Ser Leu Gln Asp Asn Asn Ser Gln Asp Leu Ile Lys Lys
485 490 495
Thr Glu Ala Ala Tyr Asp Cys His Ser Asn Met Asn Glu Ser Leu Pro
500 505 510
His Asn His Arg Ser His Val Gly Ser Asn Asn Phe Asp Met Ser Ser
515 520 525
Thr Thr Glu Asn Asn Ala Phe Thr Lys Pro Gly Ala Pro Lys Val Ser
530 535 540
Ser Ala Gly Ser Ser Ser Val Lys His Ser Ser Phe Gln Pro Leu Pro
545 550 555 560
Cys Asp His His Asn Asn His Ala Ser Tyr Asn Leu Val His Val Ala
565 570 575
Glu Arg Lys Lys Leu Pro Pro Gln Cys Gly Ser Ser Asn Val Tyr Asn
580 585 590
Glu Thr Ile Glu Gly Asn Asn Asn Thr Val Asn Tyr Ser Val Asn Gly
595 600 605
Ser Val Ser Gly Ser Gly His Gly Ser Asn Gly Pro Tyr Gly Ser Ser
610 615 620
Asn Gly Met Asn Ala Gly Gly Met Asn Met Gly Ser Asp Asn Gly Ala
625 630 635 640
Gly Lys Asn Gly Asn Gly Asp Gly Ser Gly Ser Gly Ser Gly Ser Gly
645 650 655
Ser Gly Asn Leu Ala Asp Glu Asn Lys Ile Ser Gln Arg Glu Ala Ala
660 665 670
Leu Thr Lys Phe Arg Gln Lys Arg Lys Glu Arg Cys Phe Arg Lys Lys
675 680 685
Val Arg Tyr Gln Ser Arg Lys Lys Leu Ala Glu Gln Arg Pro Arg Val
690 695 700
Arg Gly Gln Phe Val Arg Lys Thr Ala Ala Ala Thr Asp Asp Asn Asp
705 710 715 720
Ile Lys Asn Ile Glu Asp Ser
725
<210>45
<211>444
<212>PRT
<213> Arabidopsis thaliana
<400>45
Met Gly Glu Ile Val Val Leu Ser Ser Asp Asp Gly Met Glu Thr Ile
1 5 10 15
Lys Asn Arg Val Lys Ser Ser Glu Val Val Gln Trp Glu Lys Tyr Leu
20 25 30
Pro Lys Thr Val Leu Arg Val Leu Leu Val Glu Ser Asp Tyr Ser Thr
35 40 45
Arg Gln Ile Ile Thr Ala Leu Leu Arg Lys Cys Cys Tyr Lys Val Val
50 55 60
Ala Val Ser Asp Gly Leu Ala Ala Trp Glu Val Leu Lys Glu Lys Ser
65 70 75 80
His Asn Ile Asp Leu Ile Leu Thr Glu Leu Asp Leu Pro Ser Ile Ser
85 90 95
Gly Phe Ala Leu Leu Ala Leu Val Met Glu His Glu Ala Cys Lys Asn
100 105 110
Ile Pro Val Ile Met Met Ser Ser Gln Asp Ser Ile Lys Met Val Leu
115 120 125
Lys Cys Met Leu Arg Gly Ala Ala Asp Tyr Leu Ile Lys Pro Met Arg
130 135 140
Lys Asn Glu Leu Lys Asn Leu Trp Gln His Val Trp Arg Arg Leu Thr
145 150 155 160
Leu Arg Asp Asp Pro Thr Ala His Ala Gln Ser Leu Pro Ala Ser Gln
165 170 175
His Asn Leu Glu Asp Thr Asp Glu Thr Cys Glu Asp Ser Arg Tyr His
180 185 190
Ser Asp Gln Gly Ser Gly Ala Gln Ala Ile Asn Tyr Asn Gly His Asn
195 200 205
Lys Leu Met Glu Asn Gly Lys Ser Val Asp Glu Arg Asp Glu Phe Lys
210 215 220
Glu Thr Phe Asp Val Thr Met Asp Leu Ile Gly Gly Ile Asp Lys Arg
225 230 235 240
Pro Asp Ser Ile Tyr Lys Asp Lys Ser Arg Asp Glu Cys Val Gly Pro
245 250 255
Glu Leu Gly Leu Ser Leu Lys Arg Ser Cys Ser Val Ser Phe Glu Asn
260 265 270
Gln Asp Glu Ser Lys His Gln Lys Leu Ser Leu Ser Asp Ala Ser Ala
275 280 285
Phe Ser Arg Phe Glu Glu Ser Lys Ser Ala Glu Lys Ala Val Val Ala
290 295 300
Leu Glu Glu Ser Thr Ser Gly Glu Pro Lys Thr Pro Thr Glu Ser His
305 310 315 320
Glu Lys Leu Arg Lys Val Thr Ser Asp Gln Gly Ser Ala Thr Thr Ser
325 330 335
Ser Asn Gln Glu Asn Ile Gly Ser Ser Ser Val Ser Phe Arg Asn Gln
340 345 350
Val Leu Gln Ser Thr Val Thr Asn Gln Lys Gln Asp Ser Pro Ile Pro
355 360 365
Val Glu Ser Asn Arg Glu Lys Ala Ala Ser Lys Glu Val Glu Ala Gly
370 375 380
Ser Gln Ser Thr Asn Glu Gly Ile Ala Gly Gln Ser Ser Ser Thr Glu
385 390 395 400
Lys Pro Lys Glu Glu Glu Ser Ala Lys Gln Arg Trp Ser Arg Ser Gln
405 410 415
Arg Glu Ala Ala Leu Met Lys Phe Arg Leu Lys Arg Lys Asp Arg Cys
420 425 430
Phe Asp Lys Lys Val Arg Asp Thr Gln Ala Ser Ser
435 440
<210>46
<211>204
<212>PRT
<213> Arabidopsis thaliana
<400>46
Met Ala Leu Arg Asp Leu Ser Ser Ser Ser Ser Ser Pro Glu Leu His
1 5 10 15
Val Leu Ala Val Asp Asp Ser Phe Val Asp Arg Lys Val Leu Glu Arg
20 25 30
Leu Leu Lys Ile Ser Ala Cys Lys Val Thr Thr Val Glu Ser Gly Thr
35 40 45
Arg Ala Leu Gln Tyr Leu Gly Leu Asp Gly Asp Asn Gly Ser Ser Gly
50 55 60
Leu Lys Asp Leu Lys Val Asn Leu Ile Val Thr Asp Tyr Ser Met Pro
65 70 75 80
Gly Leu Thr Gly Tyr Glu Leu Leu Lys Lys Ile Lys Glu Ser Ser Ala
85 90 95
Leu Arg Glu Ile Pro Val Val Ile Met Ser Ser Glu Asn Ile Gln Pro
100 105 110
Arg Ile Glu Gln Cys Met Ile Glu Gly Ala Glu Glu Phe Leu Leu Lys
115 120 125
Pro Val Lys Leu Ala Asp Val Lys Arg Leu Lys Glu Leu Ile Met Arg
130 135 140
Gly Gly Glu Ala Glu Glu Gly Lys Thr Lys Lys Leu Ser Pro Lys Arg
145 150 155 160
Ile Leu Gln Asn Asp Ile Asp Ser Ser Pro Ser Ser Ser Ser Ser Thr
165 170 175
Ser Ser Ser Ser Ser Ser His Asp Val Ser Ser Leu Asp Asp Asp Thr
180 185 190
Pro Ser Ser Lys Arg Ile Lys Leu Glu Ser Arg Gly
195 200
<210>47
<211>691
<212>PRT
<213> Soybean
<400>47
Met Gly Glu Val Val Ile Met Ser Gly Glu Lys Lys Ser Val Arg Val
1 5 10 15
Glu Gly Val Glu Lys Glu Asp Ser Gly Gly Ser Gly Ser Lys Ala Gly
20 25 30
Glu Phe Lys Gly Leu Met Arg Trp Glu Lys Phe Leu Pro Lys Met Val
35 40 45
Leu Arg Val Leu Leu Val Glu Ala Asp Asp Ser Thr Arg Gln Ile Ile
50 55 60
Ala Ala Leu Leu Arg Lys Cys Ser Tyr Lys Val Val Ala Val Pro Asp
65 70 75 80
Gly Leu Lys Ala Trp Glu Leu Leu Lys Gly Arg Pro His Asn Val Asp
85 90 95
Leu Ile Leu Thr Glu Val Asp Leu Pro Ser Ile Ser Gly Tyr Ala Leu
100 105 110
Leu Thr Leu Ile Met Glu His Glu Ile Cys Lys Asn Ile Pro Val Ile
115 120 125
Met Met Ser Ser Gln Asp Ser Ile Ser Thr Val Tyr Lys Cys Met Leu
130 135 140
Arg Gly Ala Ala Asp Tyr Leu Val Lys Pro Ile Arg Lys Asn Glu Leu
145 150 155 160
Arg Asn Leu Trp Gln His Val Trp Arg Arg Gln Ser Ser Thr Thr Gly
165 170 175
Ile Asn Gly Leu Gln Asp Glu Ser Val Ala Gln Gln Lys Val Glu Ala
180 185 190
Thr Ala Glu Asn Asn Ala Ala Ser Asn Arg Ser Ser Gly Asp Ala Ala
195 200 205
Cys Ile Gln Arg Asn Ile Glu Leu Ile Glu Lys Gly Ser Asp Ala Gln
210 215 220
Ser Ser Cys Thr Lys Pro Asp Cys Glu Ala Glu Ser Asp Pro Val Gly
225 230 235 240
Asn Met Gln Glu Phe Ser Leu Leu Lys Cys Gly Glu Ala Tyr Pro Ser
245 250 255
Gly Thr Glu Thr Gln Gln Val Glu Thr Ser Phe Arg Leu Gly Gln Thr
260 265 270
Leu Met Met His Asp Cys His Ala Gly Gly Leu Asn Val Ser Ile Arg
275 280 285
Lys Asn Gly Glu Ala Ser Thr Thr Asn Asp Lys Asp Thr Asp Thr Glu
290 295 300
His Phe Gly Asn Ala Ser Ile Ser Gly Glu Ala His Asp Asn Pro Tyr
305 310 315 320
Val Gln Ile Asn Ser Ser Lys Glu Ala Met Asp Leu Ile Gly Ala Phe
325 330 335
His Thr His Pro Asn Cys Ser Leu Lys Asn Ser Thr Val Asn Cys Thr
340 345 350
Gly Asn Phe Asp His Ser Pro Gln Leu Asp Leu Ser Leu Arg Arg Ser
355 360 365
Cys Pro Gly Ser Phe Glu Asn Lys Leu Thr Glu Glu Arg His Thr Leu
370 375 380
Met His Ser Asn Ala Ser Ala Phe Lys Arg Tyr Thr Thr Arg Gln Leu
385 390 395 400
Gln Ile Ser Met Pro Ala Val Leu Ile Asn Phe Ser Asp Gln Gln Arg
405 410 415
Glu Gln Ile Thr Asn Cys Glu Lys Asn Ile Ser His Ile Ala Thr Gly
420 425 430
Ser Asn Ser Asp Ser Ser Thr Pro Met Gln Arg Cys Ile Val Ser Pro
435 440 445
Thr Thr Val Gln Ser Lys Glu Ser Glu Leu Ala Thr Ser His Pro Pro
450 455 460
Gln Gly His Ser Leu Pro Ile Pro Val Lys Gly Val Arg Phe Asn Asp
465 470 475 480
Leu Cys Thr Ala Tyr Gly Ser Val Leu Pro Ser Val Phe His Thr Gln
485 490 495
Ser Gly Pro Pro Ala Met Pro Ser Pro Asn Ser Val Val Leu Leu Glu
500 505 510
Pro Asn Phe Gln Val Asn Ala Phe Tyr Gln Ser Asn Met Lys Glu Ser
515 520 525
Ser Ser Glu Gln Leu Tyr Glu Ser Arg Gly Pro Asn Gly Asn Thr Thr
530 535 540
Gln Asn His Ile Val Tyr Thr Gln Glu His Lys Ser Glu His Ala Glu
545 550 555 560
Asp Arg Gly His Ile Ser Pro Thr Thr Asp Gln Ser Val Ser Ser Ser
565 570 575
Phe Cys Asn Gly Asn Ala Ser His Leu Asn Ser Ile Gly Tyr Gly Ser
580 585 590
Asn Cys Gly Ser Ser Ser Asn Val Asp Gln Val Asn Thr Val Trp Ala
595 600 605
Ala Ser Glu Gly Lys His Glu Asp Leu Thr Asn Asn Ala Asn Ser His
610 615 620
Arg Ser Ile Gln Arg Glu Ala Ala Leu Asn Lys Phe Arg Leu Lys Arg
625 630 635 640
Lys Glu Arg Cys Tyr Glu Lys Lys Val Arg Tyr Glu Ser Arg Lys Lys
645 650 655
Leu Ala Glu Gln Arg Pro Arg Val Lys Gly Gln Phe Val Arg Gln Val
660 665 670
His Pro Asp Pro Leu Val Ala Glu Lys Asp Gly Lys Glu Tyr Asp His
675 680 685
Ser Asp Phe
690
<210>48
<211>747
<212>PRT
<213> grape
<400>48
Met Gly Glu Val Val Val Ser Ser Glu Ala Gly Gly Gly Gly Met Glu
1 5 10 15
Gly Glu Val Glu Lys Lys Glu Val Gly Ser Gly Val Val Arg Trp Glu
20 25 30
Arg Phe Leu Pro Arg Met Val Leu Arg Val Leu Leu Val Glu Ala Asp
35 40 45
Asp Ser Thr Arg Gln Ile Ile Ala Ala Leu Leu Arg Lys Cys Ser Tyr
50 55 60
Lys Val Ala Ala Val Pro Asp Gly Leu Lys Ala Trp Glu Val Leu Lys
65 70 75 80
Ala Arg Pro His Asn Ile Asp Leu Ile Leu Thr Glu Val Glu Leu Pro
85 90 95
Ser Ile Ser Gly Phe Ala Leu Leu Thr Leu Val Met Glu His Glu Ile
100 105 110
Cys Lys Asn Ile Pro Val Ile Met Met Ser Ser His Gly Ser Ile Asn
115 120 125
Thr Val Tyr Lys Cys Met Leu Arg Gly Ala Ala Asp Phe Leu Val Lys
130 135 140
Pro Val Arg Arg Asn Glu Leu Lys Asn Leu Trp Gln His Val Trp Arg
145 150 155 160
Arg Gln Ser Ser Thr Val Ser Gly Asn Gly Pro Gln Asp Glu Ser Val
165 170 175
Ala Gln Gln Lys Val Glu Ala Thr Ser Glu Asn Asn Pro Thr Ser Asn
180 185 190
His Ser Ser Asp His Val Ala Cys Ile Gln Lys Asn Lys Glu Ala Leu
195 200 205
Asn Lys Val Ser Asp Ala Gln Ser Ser Cys Ser Lys Pro Asp Leu Glu
210 215 220
Ala Glu Ser Ala Tyr Met Glu Thr Met Gln Asp Phe Ser Asn Pro Thr
225 230 235 240
Trp Ser Arg Ser Leu Val Ser Asp Thr Lys Met Gln Lys Asn Glu Glu
245 250 255
Cys Ala Lys Leu Gly Pro Lys Phe Leu Met His Asn Lys Glu Ala Gly
260 265 270
Gly Thr Leu Glu Ala Ala Cys Arg Asp Val Asn Thr Met Thr Gln Pro
275 280 285
Glu Ala Val Glu Pro Glu Asn Asp Gly Gln Gly Ala Asn Ala Pro Ser
290 295 300
Glu Ala Cys Gly Asn Asn Ala Ile Leu Gly Ser Ser Ser Arg Glu Ala
305 310 315 320
Ile Asp Leu Ile Gly Val Phe Asp Asn Ser Lys Lys Cys Thr Tyr Gly
325 330 335
Asn Ser Ser Ser Asn Asn Gly Thr Lys Lys Ser Asp Ser Ile Pro Gln
340 345 350
Leu Asp Leu Ser Leu Arg Arg Ser His Pro Ser Ser Pro Glu Asn Gln
355 360 365
Val Ala Asp Glu Arg His Thr Leu Asn His Ser Asn Gly Ser Ala Phe
370 375 380
Ser Arg Tyr Ile Asn Arg Ser Leu Gln Pro Pro His Leu Pro Ser Thr
385 390 395 400
Gly Val Phe Asn Gln Gln Lys Asn Phe Gly Ala Asp Ser Asp Lys Arg
405 410 415
Leu Ser Gln Leu Val Thr Gly Tyr Asn Ser Asp Ile Thr Ser Pro Thr
420 425 430
Leu Ser Thr Gln Arg Ser Val Ile Ser Leu Ala Thr Ser Pro Ser Gly
435 440 445
Arg Val Glu Ile Ala Leu Cys Gly Pro Gln Gln Arg Ala Phe Pro Ala
450 455 460
Pro Val Pro Gln Asn Ala Asn Asn Ser Thr Ser Gln Thr Asn His Lys
465 470 475 480
Pro Glu His Lys Leu Asp Ser Leu Glu Gly Gln Gly His Phe Ser Pro
485 490 495
Ala Thr Asp Gln Asn Ser Ser Ser Ser Phe Gly Asn Gly Gly Ala Ser
500 505 510
Asn Leu Asn Ser Phe Gly Cys Gly Ser Ile Cys Gly Ser Asn Gly Asn
515 520 525
Ala Asn Thr Val Ala Val Val Gln Ala Ala Ala Glu Gly Lys Asn Glu
530 535 540
Glu Gly Ile Phe Ser His Glu Gly His Ser Gln Arg Ser Ile Gln Arg
545 550 555 560
Glu Ala Ala Leu Thr Lys Phe Arg Leu Lys Arg Lys Asp Arg Cys Phe
565 570 575
Glu Lys Lys Val Arg Tyr Glu Ser Arg Lys Lys Leu Ala Glu Gln Arg
580 585 590
Pro Arg Val Lys Gly Gln Phe Val Arg Gln Val His Thr Ile Pro Pro
595 600 605
Pro Ala Glu Pro Asp Thr Tyr Tyr Gly Ser Ser Phe Asp Val Gln Pro
610 615 620
Gln Arg Ser Arg Tyr Leu Ser Ala Gln Pro Leu Arg Ala Ser Ser Ser
625 630 635 640
Gln Leu Leu Tyr Pro Thr His Thr Pro Leu Gln Glu Ser Lys Tyr Glu
645 650 655
Gly His Glu Glu Ser Asn Leu Leu Thr Ala Ser Leu Val Gly Thr Ala
660 665 670
Leu Pro Val Ala Pro Ser Phe Gly Tyr Glu Val Gly Arg Asp Gln Thr
675 680 685
Ala Gly Lys Leu Val Leu Ser Leu Lys Leu Asp Gly Arg Val Arg Trp
690 695 700
Lys Val Gly Thr Trp Val Ser Gly Arg Tyr Arg Leu Asn Val Asn Cys
705 710 715 720
Val Ala Val Met Ala Phe Gly Pro Ser Ile Pro Ser Gly Pro Leu Ser
725 730 735
Ser Lys Glu Gly Thr Gln Cys Ser Thr Thr Val
740 745
<210>49
<211>799
<212>PRT
<213> cocoa
<400>49
Met Gly Ile Val Gln Met Asn Asn Asn Gly Pro Val Ala Asn Gly Leu
1 5 10 15
Val Glu Leu Asn Thr His Ile His Asp Glu His Lys Lys Ile Arg Gly
20 25 30
Gly Val Ile Gly Glu Gly Gln Gly Leu Ser Val Glu Glu Glu Ser Trp
35 40 45
Ile Asn Glu Asp Val Glu Asp Arg Asn Asp Gly Lys Thr Glu Leu Val
50 55 60
Gln Val Gln Gly His Ala His Gly Glu Gln Glu Arg Ser Gln Gln Gln
65 70 75 80
Pro Gln Gly Pro Leu Val His Trp Glu Arg Phe Leu Pro Leu Arg Ser
85 90 95
Leu Lys Val Leu Leu Val Glu Asn Asp Asp Ser Thr Arg His Val Val
100 105 110
Cys Ala Leu Leu Arg Asn Cys Gly Phe Glu Val Thr Ala Val Ser Asn
115 120 125
Gly Leu Gln Ala Trp Lys Ile Leu Glu Asp Leu Thr Asn His Ile Asp
130 135 140
Leu Val Leu Thr Glu Val Val Met Pro Cys Leu Ser Gly Ile Gly Leu
145 150 155 160
Leu Cys Lys Ile Met Ser His Lys Thr Arg Met Asn Ile Pro Val Ile
165 170 175
Met Met Ser Ser His Asp Ser Met Ser Thr Val Phe Arg Cys Leu Ser
180 185 190
Lys Gly Ala Val Asp Phe Leu Val Lys Pro Ile Arg Lys Asn Glu Leu
195 200 205
Lys Asn Leu Trp Gln His Val Trp Arg Lys Cys His Ser Ser Ser Ser
210 215 220
Ser Gly Gly Gln Ser Gly Thr Gln Thr Gln Lys Ser Ser Lys Ser Lys
225 230 235 240
Gly Thr Asp Ser Asp Asn Asn Thr Gly Ser Asn Asp Glu Asp Asp Asn
245 250 255
Gly Ser Val Gly Leu Asn Val Gln Asp Gly Ser Asp Asn Gly Ser Gly
260 265 270
Thr Gln Ser Ser Trp Thr Lys Arg Ala Val Glu Val Asp Ser Ser Gln
275 280 285
Pro Ile Ser Pro Trp Asp Gln Leu Ala Asp Pro Pro His Ser Thr Cys
290 295 300
Ala Gln Val Ile His Ser Arg His Glu Val Leu Gly Asp Ser Trp Val
305 310 315 320
Pro Val Thr Ala Thr Arg Glu Tyr Asp Glu Leu Asp Asn Glu Leu Glu
325 330 335
Asn Val Val Met Gly Lys Asp Leu Glu Ile Gly Val Pro Lys Ile Thr
340 345 350
Ala Ser Gln Leu Glu Asp Pro Ser Glu Lys Val Met Thr Asn Ile Ala
355 360 365
Gly Val Asn Lys Asp Lys Leu Ser Ala Ile Asn Pro Lys Lys Asp Asp
370 375 380
Glu Lys Leu Glu Lys Ala Gln Leu Glu Leu Asn Ser Glu Lys Ser Gly
385 390 395 400
Gly Asp Leu Arg Asn Gln Ala Ala Asp Leu Ile Gly Val Ile Thr Asn
405 410 415
Asn Thr Glu Pro His Ile Glu Ser Ala Val Phe Asp Ile Pro Asn Gly
420 425 430
Leu Pro Lys Val Ser Asp Ala Lys Glu Lys Val Asn Tyr Asp Thr Lys
435 440 445
Glu Met Pro Phe Leu Glu Leu Ser Leu Lys Arg Leu Arg Asp Val Gly
450 455 460
Asp Thr Gly Thr Ser Ala His Glu Arg Asn Val Leu Arg His Ser Asp
465 470 475 480
Leu Ser Ala Phe Ser Arg Tyr Asn Ser Gly Ser Thr Ala Asn Gln Ala
485 490 495
Pro Thr Gly Asn Val Gly Ser Cys Ser Pro Leu Asp Asn Ser Ser Glu
500 505 510
Ala Val Lys Thr Asp Ser Met Lys Asn Phe Gln Ser Thr Ser Asn Ser
515 520 525
Ile Pro Pro Lys Gln Gln Ser Asn Gly Ser Ser Asn Asn Asn Asp Met
530 535 540
Gly Ser Thr Thr Asn Asn Ala Phe Ser Lys Pro Ala Val Leu Ser Asp
545 550 555 560
Lys Pro Ala Pro Lys Thr Ser Ala Lys Ser Phe His Pro Ser Ser Ala
565 570 575
Phe Gln Pro Val Gln Ser Gly His Gly Ser Ala Leu Gln Pro Val Ala
580 585 590
Gln Gly Lys Ala Asp Ala Ala Leu Gly Asn Met Ile Leu Val Lys Ala
595 600 605
Arg Gly Thr Asp Gln Gln Gly Lys Val Gln His His His His His Tyr
610 615 620
His His His His His His His Val His Asn Met Leu Pro Asn Gln Lys
625 630 635 640
Leu Gly Asn His Asp Asp Leu Ser Leu Glu Asn Met Ala Ala Ala Ala
645 650 655
Pro Gln Cys Gly Ser Ser Asn Leu Ser Ser Leu Pro His Val Glu Gly
660 665 670
Asn Ala Ala Asn His Ser Leu Thr Arg Ser Ala Ser Gly Ser Asn His
675 680 685
Gly Ser Asn Gly Gln Asn Gly Ser Ser Thr Val Leu Asn Thr Arg Gly
690 695 700
Met Asn Leu Glu Ser Glu Asn Gly Val Pro Gly Lys Gly Gly Ala Gly
705 710 715 720
Gly Gly Ile Gly Ser Gly Gly Arg Asn Val Val Asp Gln Asn Arg Phe
725 730 735
Ala Gln Arg Glu Ala Ala Leu Asn Lys Phe Arg Gln Lys Arg Lys Glu
740 745 750
Arg Cys Phe Glu Lys Lys Val Arg Tyr Gln Ser Arg Lys Lys Leu Ala
755 760 765
Glu Gln Arg Pro Arg Ile Arg Gly Gln Phe Val Arg Gln Ile Ser Thr
770 775 780
Thr Gly Lys Glu Ala Phe Arg Phe Arg Gly Ala Gly Leu Cys Thr
785 790 795
<210>50
<211>742
<212>PRT
<213> Oryza sativa
<400>50
Met Met Gly Thr Ala His His Asn Gln Thr Ala Gly Ser Ala Leu Gly
1 5 10 15
Val Gly Val Gly Asp Ala Asn Asp Ala Val Pro Gly Ala Gly Gly Gly
20 25 30
Gly Tyr Ser Asp Pro Asp Gly Gly Pro Ile Ser Gly Val Gln Arg Pro
35 40 45
Pro Gln Val Cys Trp Glu Arg Phe Ile Gln Lys Lys Thr Ile Lys Val
50 55 60
Leu Leu Val Asp Ser Asp Asp Ser Thr Arg Gln Val Val Ser Ala Leu
65 70 75 80
Leu Arg His Cys Met Tyr Glu Val Ile Pro Ala Glu Asn Gly Gln Gln
85 90 95
Ala Trp Thr Tyr Leu Glu Asp Met Gln Asn Ser Ile Asp Leu Val Leu
100 105 110
Thr Glu Val Val Met Pro Gly Val Ser Gly Ile Ser Leu Leu Ser Arg
115 120 125
Ile Met Asn His Asn Ile Cys Lys Asn Ile Pro Val Ile Met Met Ser
130 135 140
Ser Asn Asp Ala Met Gly Thr Val Phe Lys Cys Leu Ser Lys Gly Ala
145 150 155 160
Val Asp Phe Leu Val Lys Pro Ile Arg Lys Asn Glu Leu Lys Asn Leu
165 170 175
Trp Gln His Val Trp Arg Arg Cys His Ser Ser Ser Gly Ser Gly Ser
180 185 190
Glu Ser Gly Ile Gln Thr Gln Lys Cys Ala Lys Ser Lys Ser Gly Asp
195 200 205
Glu Ser Asn Asn Asn Asn Gly Ser Asn Asp Asp Asp Asp Asp Asp Gly
210 215 220
Val Ile Met Gly Leu Asn Ala Arg Asp Gly Ser Asp Asn Gly Ser Gly
225 230 235 240
Thr Gln Ala Gln Ser Ser Trp Thr Lys Arg Ala Val Glu Ile Asp Ser
245 250 255
Pro Gln Ala Met Ser Pro Asp Gln Leu Ala Asp Pro Pro Asp Ser Thr
260 265 270
Cys Ala Gln Val Ile His Leu Lys Ser Asp Ile Cys Ser Asn Arg Trp
275 280 285
Leu Pro Cys Thr Ser Asn Lys Asn Ser Lys Lys Gln Lys Glu Thr Asn
290 295 300
Asp Asp Phe Lys Gly Lys Asp Leu Glu Ile Gly Ser Pro Arg Asn Leu
305 310 315 320
Asn Thr Ala Tyr Gln Ser Ser Pro Asn Glu Arg Ser Ile Lys Pro Thr
325 330 335
Asp Arg Arg Asn Glu Tyr Pro Leu Gln Asn Asn Ser Lys Glu Ala Ala
340 345 350
Met Glu Asn Leu Glu Glu Ser Ser Val Arg Ala Ala Asp Leu Ile Gly
355 360 365
Ser Met Ala Lys Asn Met Asp Ala Gln Gln Ala Ala Arg Ala Ala Asn
370 375 380
Ala Pro Asn Cys Ser Ser Lys Val Pro Glu Gly Lys Asp Lys Asn Arg
385 390 395 400
Asp Asn Ile Met Pro Ser Leu Glu Leu Ser Leu Lys Arg Ser Arg Ser
405 410 415
Thr Gly Asp Gly Ala Asn Ala Ile Gln Glu Glu Gln Arg Asn Val Leu
420 425 430
Arg Arg Ser Asp Leu Ser Ala Phe Thr Arg Tyr His Thr Pro Val Ala
435 440 445
Ser Asn Gln Gly Gly Thr Gly Phe Met Gly Ser Cys Ser Leu His Asp
450 455 460
Asn Ser Ser Glu Ala Met Lys Thr Asp Ser Ala Tyr Asn Met Lys Ser
465 470 475 480
Asn Ser Asp Ala Ala Pro Ile Lys Gln Gly Ser Asn Gly Ser Ser Asn
485 490 495
Asn Asn Asp Met Gly Ser Thr Thr Lys Asn Val Val Thr Lys Pro Ser
500 505 510
Thr Asn Lys Glu Arg Val Met Ser Pro Ser Ala Val Lys Ala Asn Gly
515 520 525
His Thr Ser Ala Phe His Pro Ala Gln His Trp Thr Ser Pro Ala Asn
530 535 540
Thr Thr Gly Lys Glu Lys Thr Asp Glu Val Ala Asn Asn Ala Ala Lys
545 550 555 560
Arg Ala Gln Pro Gly Glu Val Gln Ser Asn Leu Val Gln His Pro Arg
565 570 575
Pro Ile Leu His Tyr Val His Phe Asp Val Ser Arg Glu Asn Gly Gly
580 585 590
Ser Gly Ala Pro Gln Cys Gly Ser Ser Asn Val Phe Asp Pro Pro Val
595 600 605
Glu Gly His Ala Ala Asn Tyr Gly Val Asn Gly Ser Asn Ser Gly Ser
610 615 620
Asn Asn Gly Ser Asn Gly Gln Asn Gly Ser Thr Thr Ala Val Asn Ala
625 630 635 640
Glu Arg Pro Asn Met Glu Ile Ala Asn Gly Thr Ile Asn Lys Ser Gly
645 650 655
Pro Gly Gly Gly Asn Gly Ser Gly Ser Gly Ser Gly Asn Asp Met Tyr
660 665 670
Leu Lys Arg Phe Thr Gln Arg Glu His Arg Val Ala Ala Val Ile Lys
675 680 685
Phe Arg Gln Lys Arg Lys Glu Arg Asn Phe Gly Lys Lys Val Arg Tyr
690 695 700
Gln Ser Arg Lys Arg Leu Ala Glu Gln Arg Pro Arg Val Arg Gly Gln
705 710 715 720
Phe Val Arg Gln Ala Val Gln Asp Gln Gln Gln Gln Gly Gly Gly Arg
725 730 735
Glu Ala Ala Ala Asp Arg
740
<210>51
<211>766
<212>PRT
<213> Zea mays
<400>51
Met Gly Ser Ala Cys Gln Ala Gly Thr Asp Gly Pro Ser Arg Lys Asp
1 5 10 15
Val Leu Gly Ile Gly Asn Ala Ala Leu Glu Asn Gly His His Gln Ala
20 25 30
Glu Ala Asp Ala Asp Glu Trp Arg Glu Lys Glu Glu Asp Leu Ala Asn
35 40 45
Asn Gly His Ser Ala Pro Pro Pro Gly Met Gln Gln Val Asp Glu His
50 55 60
Lys Glu Glu Gln Arg Gln Ser Ile His Trp Glu Arg Phe Leu Pro Val
65 70 75 80
Lys Thr Leu Arg Val Leu Leu Val Glu Asn Asp Asp Ser Thr Arg Gln
85 90 95
Val Val Ser Ala Leu Leu Arg Lys Cys Cys Tyr Glu Val Ile Pro Ala
100 105 110
Glu Asn Gly Leu His Ala Trp Arg Tyr Leu Glu Asp Leu Gln Asn Asn
115 120 125
Ile Asp Leu Val Leu Thr Glu Val Phe Met Pro Cys Leu Ser Gly Ile
130 135 140
Gly Leu Leu Ser Lys Ile Thr Ser His Lys Ile Cys Lys Asp Ile Pro
145 150 155 160
Val Ile Met Met Ser Thr Asn Asp Ser Met Ser Met Val Phe Lys Cys
165 170 175
Leu Ser Lys Gly Ala Val Asp Phe Leu Val Lys Pro Leu Arg Lys Asn
180 185 190
Glu Leu Lys Asn Leu Trp Gln His Val Trp Arg Arg Cys His Ser Ser
195 200 205
Ser Gly Ser Glu Ser Gly Ile Gln Thr Gln Lys Cys Ala Lys Leu Asn
210 215 220
Thr Gly Asp Glu Tyr Glu Asn Gly Ser Asp Ser Asn His Asp Asp Glu
225 230 235 240
Glu Asn Asp Asp Gly Asp Asp Asp Asp Phe Ser Val Gly Leu Asn Ala
245 250 255
Arg Asp Gly Ser Asp Asn Gly Ser Gly Thr Gln Ser Ser Trp Thr Lys
260 265 270
Arg Ala Val Glu Ile Asp Ser Pro Gln Pro Ile Ser Pro Asp Gln Leu
275 280 285
Val Asp Pro Pro Asp Ser Thr Cys Ala Gln Val Ile His Pro Arg Ser
290 295 300
Glu Ile Cys Ser Asn Lys Trp Leu Pro Thr Ala Asn Lys Arg Asn Val
305 310 315 320
Lys Lys Gln Lys Glu Asn Lys Asp Glu Ser Met Gly Arg Tyr Leu Gly
325 330 335
Ile Gly Ala Pro Arg Asn Ser Ser Ala Glu Tyr Gln Ser Ser Leu Asn
340 345 350
Asp Val Ser Val Asn Pro Ile Glu Lys Gly His Glu Asn His Met Ser
355 360 365
Lys Cys Lys Ser Lys Lys Glu Thr Met Ala Glu Asp Asp Cys Thr Asn
370 375 380
Met Pro Ser Ala Thr Asn Ala Glu Thr Ala Asp Leu Ile Ser Ser Ile
385 390 395 400
Ala Arg Asn Thr Glu Gly Gln Gln Ala Val Gln Ala Val Asp Ala Pro
405 410 415
Asp Gly Pro Ser Lys Met Ala Asn Gly Asn Asp Lys Asn His Asp Ser
420 425 430
His Ile Glu Val Thr Pro His Glu Leu Gly Leu Lys Arg Ser Arg Thr
435 440 445
Asn Gly Ala Thr Ala Glu Ile His Asp Glu Arg Asn Ile Leu Lys Arg
450 455 460
Ser Asp Gln Ser Ala Phe Thr Arg Tyr His Thr Ser Val Ala Ser Asn
465 470 475 480
Gln Gly Gly Ala Arg Tyr Gly Glu Ser Ser Ser Pro Gln Asp Asn Ser
485 490 495
Ser Glu Ala Met Lys Thr Asp Ser Thr Cys Lys Met Lys Ser Asn Ser
500 505 510
Asp Ala Ala Pro Ile Lys Gln Gly Ser Asn Gly Ser Ser Asn Asn Asp
515 520 525
Val Gly Ser Ser Thr Lys Asn Val Ala Ala Arg Pro Ser Gly Asp Arg
530 535 540
Glu Arg Val Ala Ser Pro Leu Ala Ile Lys Ser Thr Gln His Ala Ser
545 550 555 560
Ala Phe His Thr Ile Gln Asn Gln Thr Ser Pro Ala Asn Leu Ile Gly
565 570 575
Glu Asp Lys Ala Asp Glu Gly Ile Ser Asn Thr Val Lys Met Ser His
580 585 590
Pro Thr Glu Val Pro Gln Gly Cys Val Gln His His His His Val His
595 600 605
Tyr Tyr Leu His Val Met Thr Gln Lys Gln Pro Ser Thr Asp Arg Gly
610 615 620
Ser Ser Asp Val His Cys Gly Ser Ser Asn Val Phe Asp Pro Pro Val
625 630 635 640
Glu Gly His Ala Ala Asn Tyr Ser Val Asn Gly Gly Val Ser Val Gly
645 650 655
His Asn Gly Cys Asn Gly Gln Asn Gly Ser Ser Ala Val Pro Asn Ile
660 665 670
Ala Arg Pro Asn Ile Glu Ser Ile Asn Gly Thr Met Ser Gln Asn Ile
675 680 685
Ala Gly Gly Gly Ile Val Ser Gly Ser Gly Ser Gly Asn Asp Met Tyr
690 695 700
Gln Asn Arg Phe Leu Gln Arg Glu Ala Ala Leu Asn Lys Phe Arg Leu
705 710 715 720
Lys Arg Lys Asp Arg Asn Phe Gly Lys Lys Val Arg Tyr Gln Ser Arg
725 730 735
Lys Arg Leu Ala Glu Gln Arg Pro Arg Val Arg Gly Gln Phe Val Arg
740 745 750
Gln Ser Glu Gln Glu Asp Gln Thr Ala Gln Gly Ser Glu Arg
755 760 765
<210>52
<211>917
<212>PRT
<213> Physcomitrella patens
<400>52
Met Thr Ala Asp Leu Cys Glu Phe Glu Ser Glu Ser Asp Pro Leu Gln
1 5 10 15
Pro Leu Ser Ala Val Gly Arg Ala Trp Val Glu Pro Ile Val Gly Thr
20 25 30
Pro Val Gly Ala Glu Trp Arg Ile Lys Gly Gly Phe Lys Ala His Lys
35 40 45
Glu Val Asp Arg Ser Arg Glu Gln Val Gly Ser Lys Arg Val Asp Asp
50 55 60
Arg Glu Lys Asn Ser Gly Arg Leu Glu Asn Gly Cys Arg Phe Ala Asp
65 70 75 80
Arg Thr Gly Gly Ala Val Leu Lys Ala Arg Glu Asp Pro Lys Asp Ile
85 90 95
Ala Glu Gln Ile Arg Arg Glu Leu Asp His Gln Phe Pro Val Asn Asp
100 105 110
Val Leu Arg Thr Ser Glu Ser Asp Glu Asp Gly Arg Arg Glu Asp Ser
115 120 125
Ala Glu Asp His Tyr Glu Glu Gly Asp Ala Val Ala Ala Val Val Phe
130 135 140
Glu Lys Gln Arg Pro Arg Glu Ile Ala Gln Thr Arg Glu Gln Gln Gln
145 150 155 160
Gly Gly Asn Ala Ala Ala Ala Ala Ala Gly Thr Gln Gly Gly Gly Gly
165 170 175
Trp Glu Ser Phe Leu Leu Lys Arg Asn Leu Lys Val Leu Leu Val Glu
180 185 190
Asp Asp Asp Ala Thr Arg His Val Val Gly Ala Leu Leu Arg Asn Cys
195 200 205
Asn Tyr Glu Val Thr Pro Val Ala Asn Gly Ser Leu Ala Trp Gly Leu
210 215 220
Leu Glu Glu Ala Asn Ser Asn Phe Asp Leu Val Leu Thr Asp Val Val
225 230 235 240
Met Pro Tyr Leu Ser Gly Val Gly Leu Leu Ser Lys Met Met Lys Arg
245 250 255
Glu Ala Cys Lys Arg Val Pro Ile Val Ile Met Ser Ser Tyr Asp Ser
260 265 270
Leu Gly Ile Val Phe Arg Cys Leu Ser Lys Gly Ala Cys Asp Tyr Leu
275 280 285
Val Lys Pro Val Arg Lys Asn Glu Leu Lys Asn Leu Trp Gln His Val
290 295 300
Trp Arg Lys Cys His Ser Ser Ser Gly Ser Arg Ser Gly Ser Gly Ser
305 310 315 320
Gln Thr Gly Glu Val Ala Lys Pro Arg Ser Arg Gly Val Ala Ala Ala
325 330 335
Asp Asn Pro Ser Gly Ser Asn Asp Gly Asn Gly Ser Ser Asp Gly Ser
340 345 350
Asp Asn Gly Ser Ser Arg Val Asn Ala Gln Gly Gly Ser Asp Asn Gly
355 360 365
Ser Gly Asn Gln Ala Cys Met Gln Pro Val Gln Val Leu Arg Asn Ser
370 375 380
Ala Ile Pro Glu Ala Val Asp Gly Asp Glu Glu Gly Gln Ala Thr Ser
385 390 395 400
Gln Asp Lys Gly Ala Asp Leu Asp Gly Glu Met Gly His Asp Leu Glu
405 410 415
Met Ala Thr Arg Arg Ser Ala Cys Val Thr Thr Gly Lys Asp Gln Gln
420 425 430
Pro Glu Asp Ala Gln Lys Gln Asp Glu Asp Ala Val Cys Ile Leu Gln
435 440 445
Asp Ala Gly Pro Ser Pro Asp Gly Ala Asn Ala Glu Ser Pro Ser Ser
450 455 460
Ser Gly Arg Asn Asp Ala Ala Glu Glu Ser Ser Pro Lys Ile Ile Asp
465 470 475 480
Leu Ile Asn Val Ile Ala Cys Gln Pro Gln Thr Gln Asp Ala Glu Pro
485 490 495
Gln Glu Ser Glu Asn Asp Asp Glu Glu Leu Asp Pro Arg Gly Arg Ser
500 505 510
Ser Pro Lys Asn Asn Ser Ala Ser Asp Ser Gly Thr Ser Leu Glu Leu
515 520 525
Ser Leu Lys Arg Pro Arg Ser Ala Val Gly Asn Gly Gly Glu Leu Glu
530 535 540
Glu Arg Gln Pro Leu Arg His Ser Gly Gly Ser Ala Phe Ser Arg Tyr
545 550 555 560
Gly Ser Gly Gly Thr Ile Ile Gln Gln Tyr His Gln Thr Gly Gly Ser
565 570 575
Leu Pro Leu Ser Gly Tyr Pro Val Ser Gly Gly Tyr Gly Val Tyr Gly
580 585 590
Met Ser Gly Gly Ser Pro Gly Gly Ser Leu Arg Leu Gly Met Gly Met
595 600 605
Asp Arg Ser Gly Ser Ser Lys Gly Ser Val Glu Gly Thr Thr Pro Pro
610 615 620
Pro Ser His Pro Gln Ser Met Glu Lys Val Gly Gly Gln Asp Gly Tyr
625 630 635 640
Gly Asn Ala Arg Gln Thr Thr Glu Asp Ala Met Ile Val Pro Gly Met
645 650 655
Pro Met Ala Ile Pro Leu Pro Pro Pro Gly Met Leu Ala Tyr Asp Gly
660 665 670
Val Ile Gly Thr Tyr Gly Pro Ala Met His Pro Met Tyr Tyr Ala His
675 680 685
Pro Ser Ala Trp Met Ala Ala Pro Ser Arg His Met Gly Glu Arg Gly
690 695 700
Asp Val Tyr Asn Gln Ser Pro Ala Phe Gln Glu Gln Asp Ser Gly Ser
705 710 715 720
Gly Asn His Ser Gln Ala Gly Gln Thr His Gln His Met His His His
725 730 735
Gln Gly Asn Gln His His His His His His His His His His Gly Ser
740 745 750
Gly Ala Gln Pro Ser Gly Asn Ala Gly Val Gln Asp Glu Gln Gln Gln
755 760 765
Ser Val Val Pro Pro Gly Ser Ser Ala Pro Arg Cys Gly Ser Thr Gly
770 775 780
Val Asp Gly Arg Ser Gly Ser Ser Asn Gly Tyr Gly Ser Thr Gly Asn
785 790 795 800
Gly Asn Gly Ser Met Asn Gly Ser Ala Ser Gly Ser Asn Thr Gly Val
805 810 815
Asn Asn Gly Gln Ser Gly Phe Gly Ala Thr Pro Met Leu Thr Asp Asn
820 825 830
Ser Gly Ser Asn Gly Val Gly Gly Thr Asp Ala Ala Met Asp Gly Val
835 840 845
Ser Gly Gly Asn Gly Leu Cys Thr Glu Gln Met Arg Phe Ala Arg Arg
850 855 860
Glu Ala Ala Leu Asn Lys Phe Arg Gln Lys Arg Lys Glu Arg Cys Phe
865 870 875 880
Glu Lys Lys Val Arg Tyr Gln Ser Arg Lys Arg Leu Ala Glu Gln Arg
885 890 895
Pro Arg Val Arg Gly Gln Phe Val Arg Gln Ala Val His Asp Pro Ser
900 905 910
Ala Gly Asp Ala Glu
915
<210>53
<211>1359
<212>PRT
<213> Pantoea karezii
<400>53
Met Glu Phe His Val Leu Leu Val Glu Asp Asp Arg Val Thr Leu Lys
1 5 10 15
Thr Val Glu Gln Leu Leu Arg Lys Cys Asn Tyr Lys Val Thr Cys Ala
20 25 30
Ala Asn Gly Arg Glu Ala Ile Lys Val Leu Thr Ala Cys Arg His Ser
35 40 45
Gly Val Lys Val Asp Leu Ile Leu Thr Asp Ile Leu Met Pro Glu Val
50 55 60
Thr Gly Phe Asp Leu Ile Asn Glu Val Val His Gly Asp Thr Phe Cys
65 70 75 80
Asp Val Pro Val Val Val Met Ser Ser Gln Asp Ser Gln Glu Asn Val
85 90 95
Leu Gln Ala Phe Gln Ala Gly Ala Ala Asp Tyr Leu Ile Lys Pro Ile
100 105 110
Arg Lys Asn Glu Leu Ala Thr Leu Trp Gln His Val Trp Arg Ala Asn
115 120 125
Lys Ala Lys Gly Ser Gly Ser Gly Thr Thr Thr Asn Val Thr Gly Gln
130 135 140
Pro Leu Ser Gly Arg Glu Asp Leu Glu Ala Gly Glu Ala Val Ala Val
145 150 155 160
Ala Ala Ala Ala Ala Ala Ala Ser Gly Lys Ala Cys Ala Ala Thr His
165 170 175
Gly His Leu Lys Asp Ser Ser Gly Gly Ser Ser Gly Ala Ala Ala Ser
180 185 190
Val Leu Gln Ser Thr Gly Gly Thr Leu Leu Pro Asp Arg Ala Ala Thr
195 200 205
Val Arg Tyr Pro Ala Ala Ala Ala Ala Pro Pro Pro Pro Gly Ala Ser
210 215 220
Glu Leu Ser Gly Asn Val Thr Ala Gly Glu Ala Gln Gly Ser Arg Thr
225 230 235 240
Gln His Leu Arg His Leu Ser Gly Leu Ala Gly Met Glu Ser Thr Ala
245 250 255
Ala Thr Ser Ala Ala Ala Gln Gly Ser Ser Ala Ala Gly Pro Leu Arg
260 265 270
Gly Cys Gly Gly Ala Gly Thr Ala Ile Ala Gly Gly Pro Arg Ala Pro
275 280 285
Leu Gly Pro Leu Ser Phe Ala Pro Phe Gly Thr Ser Val Ala Val His
290 295 300
Phe Asp Leu Asn Pro Ala Ser Gly Ala Ala Arg Arg Leu Val Asn Ser
305 310 315 320
Ser Gly Ala Ile Asp Ala Ser Thr Gly Ser Gly Thr Ala Gly Val Ala
325 330 335
Ala Ser Ser Arg Cys Ala Ala Gly Thr Ser Ala Thr Val Ile Ser Trp
340 345 350
Ser His Val Asp Pro Thr Glu Thr Asp Pro Ala Glu Ala Glu Pro Met
355 360 365
Tyr Asp Thr Asn Ala Asp Ala Thr Ala Ala Lys Ala Ala Ala Asp Gly
370 375 380
Val Ala Glu Ala Asp Asp Asp Asp Val Gly Asp Asp Gly Gly Ala Gly
385 390 395 400
Pro Asn His Asn Asp Asp Asp Asp Glu Gly Gly Gly Asp Asp Asp Val
405 410 415
Ser Gly Asp Gly Asp Glu Asp Gly Asn Arg Pro Arg Lys Arg Pro Arg
420 425 430
Leu Leu Gln Gly Ser Ser His His His Ser His Gln His Arg Leu His
435 440 445
Ser Leu Gly Gly Thr Thr Thr Asn Thr Thr Thr Thr Thr Thr Ala Ala
450 455 460
Lys Pro Lys Ser Thr Ala Gly Glu Arg Gly Gly Ala Ala Ala Leu Leu
465 470 475 480
Ala Cys Arg Thr Ala Ala Ala Ala Pro Leu Arg Gly Ser Gly Cys Ala
485 490 495
Thr Ala Gly Ala Thr Gly Ala Cys Arg Leu Ala Ala Ala Ala Ala Ala
500 505 510
Ala Glu Gly Ser Gln Gly Ser Arg Ala Ala Ser Ala Ser Ala Gly Pro
515 520 525
Asp Gly Gly Ala Arg Glu Ser Thr Ala Thr Pro Ser Gly Asp Thr Phe
530 535 540
Ala Glu Ser Pro Ser Ala Tyr Thr Ala Thr Ala Thr Thr Thr Ser Thr
545 550 555 560
Ala Thr Thr Ser Thr Thr Thr Gly Ser Gly Ile Glu Met Gln Asp Asp
565 570 575
Glu Gln Gln Gln Arg Gln Gln Pro Lys Gln Arg Pro Pro Ala Ser Gln
580 585 590
Pro Glu Leu Glu Gly His His His Gln Gln Gln Tyr His His Tyr Tyr
595 600 605
Arg Arg Thr Ser Leu Glu Gly Gly Cys Ala Asn Ala Pro Pro Leu Pro
610 615 620
Val Pro Ser Ser Ala Arg Gly Ala Ser Pro Ala Gly Thr Gly Pro Thr
625 630 635 640
Glu Ser Gly Ser Gly Arg Asp Ser Gly Cys Ala Arg Ile Thr Asn Gly
645 650 655
Thr Ala Ala Gly Ala Thr Ala Ala Met Pro Pro Ser His Val Ser Ser
660 665 670
Ala Ser Pro Pro Arg Cys Thr Ala Thr Ser Ala Ala Ala Thr Arg Gly
675 680 685
Ser Ser Gly Ala Ala Thr Ala Ala Ala Gly Ala Met Thr Thr Ala Leu
690 695 700
Ala Thr Ala Gly Ser Tyr Pro Arg Gly Val Asp Ala Ser Pro Pro Pro
705 710 715 720
Asn Arg Ser Met Gly Ser Ser Gly Gly Asp Gly Gly Gly Thr Ala Ala
725 730 735
Ala Ala Ala Gly Thr Ala Arg Gly Ser Ser Pro Ala Ala Ala Thr Pro
740 745 750
Pro Leu Pro Pro Ser Thr Gln Gln His Gly Leu Pro His Pro Ala Ala
755 760 765
Ala Pro Pro Pro Gly Ala Ala Ser Pro Gly Gly Ala Val Thr Leu Pro
770 775 780
Pro Ala Leu Gln Glu Leu Ala Ala Leu Gly Ala Ala Arg His Ala Gly
785 790 795 800
Leu Trp Thr Gln Arg Ala Leu Leu His Gln Gln Gln Leu Leu Leu Gln
805 810 815
Gln Gln Lys Gln Gln Lys Gln Gln Gln His Gln Gln Asp Gln Val Val
820 825 830
Gly Ala Glu Lys Ile His Gly Gly Ser Thr Ser Ala Val Ala Asn Ala
835 840 845
Ala Glu Gln Gln Gln Gln Gln Pro Leu Gly Ala Ala Ala Ala Arg Arg
850 855 860
Pro Ser Lys Ala Gly Val Asp Gly Thr Glu Ala Gly Ser Gly Ala Val
865 870 875 880
Gly Gly Cys Ala Ser Ala Thr Ala Ala Val Met Ala Met Glu Ala Ser
885 890 895
Glu Pro His Gly Ala Val Gly Ser Ser Phe Thr Ala Ala Asp Arg Gln
900 905 910
Glu Thr Pro Leu Gln Pro Leu His Ala Glu Ser Ala Ala Ala Gly Gly
915 920 925
Asp Met Asp Gly Asn Arg Ser Thr Pro Ala Thr Met Pro Ser Gly Pro
930 935 940
Thr Ala Ala Ala Ser Gly Pro Ser Gln Thr Ser Asn Ser Leu Thr Val
945 950 955 960
Leu Arg His Ser Asp Arg Ser Ala Phe Thr Ala Phe Thr Val Phe Leu
965 970 975
Pro Ser Arg Val Ala Gly Ala Ala Ala Ala Ala Ala Ala Ala Ala Ala
980 985 990
Ala Arg Pro Pro Pro Pro Pro Ala Pro Val Gln Pro Pro Ala Pro Ile
995 1000 1005
Phe Thr His Pro Ala Ala Ala Ala Ala Ala Ala Ala Ala Ala Ala
1010 1015 1020
Ala Gly Ser Gly Gly Ala Ala Ser Val Trp Tyr Pro His Leu His
1025 1030 1035
His His His His Tyr Leu Gln Gln Gln Gln Thr His Met Gly Pro
1040 1045 1050
Leu Pro Pro Leu Pro Gly Ala Val His Val Leu Pro Ser Ile Met
1055 1060 1065
Gln Leu His Met Gly Val Leu Ala Pro Gly Pro Pro Pro Gln Gln
1070 1075 1080
Gln Gln Gln Gln His Leu Gln Ala Lys Ala Pro Gln Lys Pro His
1085 1090 1095
Asp Ser Ala Ala Ala Ala Gly Gly Ala Asn Gly Ser Leu Gly Pro
1100 1105 1110
Ala Thr Ser Ala Ala Ala Ala Thr His Met Ser Tyr Thr Gly Met
1115 1120 1125
Gln Gln Arg Pro Gly Ala Ser Ser Ala Thr Thr Thr Ser Ala Gly
1130 1135 1140
Ala Val Ala Phe Gly Gln Ser Pro Pro His Gly Leu Ala Ala Ala
1145 1150 1155
Ala Ala Ala Ala Ser Thr Pro Pro Pro Pro Pro Pro Pro Pro Val
1160 1165 1170
Cys Ile Pro Glu Ser Val Leu Gln Leu Ile Ala His Leu Ser Gly
1175 1180 1185
Arg Ala Ala Ala Glu Leu Pro Val Pro Glu Thr Val Thr Thr Ala
1190 1195 1200
Pro Leu Val Val Gln Lys Ala Pro Ser Ala Ala Arg Leu Ala Ala
1205 1210 1215
Val Ala Lys Tyr Leu Glu Lys Arg Lys His Arg Asn Phe Gln Lys
1220 1225 1230
Lys Val Arg Tyr Glu Ser Arg Lys Arg Leu Ala Glu Ala Arg Pro
1235 1240 1245
Arg Val Arg Gly Gln Phe Val Lys Ala Ser Thr Ser Ala Val Ala
1250 1255 1260
Ala Thr Thr Pro Ala Ala Thr Gly Ala Thr Val Thr Ser Thr Ser
1265 1270 1275
Leu Arg Gln Pro Val Tyr Thr Ala Ala Gly Pro Ala Gly Leu Ala
1280 1285 1290
Leu Pro Pro Ala Ala Ala Ala Ala Ala Ala Ser Ala Ala Ala Ala
1295 1300 1305
Arg Gly Val Pro Pro Pro Ser Ser Arg Ile Gly Ala Val Glu Leu
1310 1315 1320
Ala Glu Leu Val Pro Asp His Asp Ala Asp Ile Glu Asp Glu Gly
1325 1330 1335
Cys Asp Glu Pro Ala Ala Ala Glu Asp Ser Asp Gly Ser Val Ala
1340 1345 1350
Val Glu Leu Ala Glu Val
1355
<210>54
<211>1102
<212>PRT
<213> Chlamydomonas reinhardtii
<400>54
Met Glu Ala Asn Gly Phe His Val Val Leu Val Glu Asp Asp Asn Ile
1 5 10 15
Cys Leu Lys Val Val Glu Gln Leu Leu Arg Lys Leu Ser Tyr Arg Val
20 25 30
Ser Thr Ala Ser Asp Gly Ala Ala Ala Leu Lys Val Leu Ala Asp Cys
35 40 45
Lys Gln Arg Gly Asp Lys Val Asp Leu Ile Leu Thr Asp Ile Leu Met
50 55 60
Pro Glu Val Thr Gly Phe Asp Leu Ile Asn Glu Val Val His Gly Glu
65 70 75 80
Thr Phe Ala Asp Ile Pro Val Val Val Met Ser Ser Gln Asp Ser Gln
85 90 95
Glu Ser Val Leu Gln Ala Phe Gln Ala Gly Ala Ala Asp Tyr Leu Ile
100 105 110
Lys Pro Ile Arg Lys Asn Glu Leu Ala Thr Leu Trp Gln His Val Trp
115 120 125
Arg Ala Asn Arg Ala Lys Gly Gly Gln Thr Ser Ser Gly Ala Ala His
130 135 140
Val Gly Ala Gly Gly Arg Gly Gly Thr Ser Ser Arg Asp Gly Gly Gly
145 150 155 160
Val Ala Gly Thr Arg Cys Gly Pro Gly Asp Arg Gly Gly Ser Gly Gly
165 170 175
Asp Ala Glu Gly Ser Gly Leu Gly Gly Gly Ala Gly Ala Val Lys Asp
180 185 190
Ser Ser Gly Gly Ser Thr Gly Ala Ala Thr Ser Val Leu His Ser Thr
195 200 205
Gly Gly Thr Thr Leu Pro Ser Arg Ala Ala Thr Gly Arg His Ala Ser
210 215 220
Thr Ser Ala Gly His Gly Val Thr Ser Ala Asp Pro Ser Asn Asn Gln
225 230 235 240
Thr Ser His Ala His Ala His Ala His Ala His Ala His Gly Asn Ala
245 250 255
His Ala His Ala His Leu His Met His Gly Ala Thr Asp Arg Ala Ala
260 265 270
Gln Gly Ser Ser Ala Asn Gly Pro Ala Asn His Gly Ala Ala Gly Thr
275 280 285
Gly Leu Gln Ser Ala Gly Met Ala Gly Ser Thr Ala Ala Gly Ala Ala
290 295 300
Ala Pro Ala Gly Glu Ser Leu Ala Lys Pro Pro Phe Ala Ser Leu Ala
305 310 315 320
Val His Phe Asp Leu His Ser Val Leu Ala Gly Ala Gly Ala Ala Ala
325 330 335
Ala Asn Gly Gly Ala Asn Ala Ala Ala His Thr Ala Gly Ala Thr Gly
340 345 350
Arg Glu Ser Gly Gln Ala Ala Gly Ala Ala Thr Gly Gly Ile Ala Ala
355 360 365
Ala Gly Thr Val Ile Gly Trp Ser His Ala Asp Met Asp Val Asp Gly
370 375 380
Gly Glu Ala Gly Ala Gln Asp Glu Asp Asp Glu Asp Glu Asp Asp Gly
385 390 395 400
Val Glu Ala Pro Ala Gly Thr Gln Asn Arg Lys Arg Ala Ala Asp Asp
405 410 415
Ser Gly Cys Asp Gly Ala Ala Ala Asn Asn Asn Gly Asn Thr Ala Ala
420 425 430
Lys Ala Gly Ala Ala Ala Ile Ala Ala Gly Gly Pro Gly Ser Ser Gly
435 440 445
Arg Ala Lys Ala Thr Asp Gly Ala Arg Ala Glu Ile Arg His Asn Gly
450 455 460
Gly Pro Met Ala Ala Arg Met Ala Ala Ala Glu Gly Ser Gln Gly Ser
465 470 475 480
Arg Ala Ala Ser Gly Ser Ala Ala Thr Gly Pro Gly Gly Ala Arg Glu
485 490 495
Gly Thr Ala Thr Pro Ser Gly Asp Thr Phe Ala Glu Ser Pro Ser Thr
500 505 510
Phe Thr Ser Ile Ile Asn Thr Thr Gly Ser Gly Ser Glu Ala Asp Glu
515 520 525
Gln Pro Val Pro Leu Lys His Gln Glu Gln Gln Gln Gln Gln Gln Gln
530 535 540
Gln Arg Val Gly Glu Gly Asp Arg Ala Lys Pro Glu Pro His Pro Gln
545 550 555 560
Asn Pro Ala Gln Ala Ala His Leu Pro His Pro Ser Ala Ala Pro Cys
565 570 575
Ser Gly Gly Gly Gly Ile Ala Gln Ala Ala Leu Pro Leu Gly Leu Gln
580 585 590
Glu Leu Ala Ala Leu Gly Ala Ala Arg His Lys Glu Leu Trp Thr Gln
595 600 605
Arg His Leu Met His Gln Arg Gln Ala Ala Ala Ala Ala Thr Ala Ala
610 615 620
Ala Ala Ser Ala Ala Ala Ala Ala Ala Met Pro Thr Ala Gly Ala Ser
625 630 635 640
Ala Ala Ala Pro Ala Gly Pro Pro Ser Ala Arg Pro Ser Ala Ser Leu
645 650 655
Ala Asp Thr Gly Gly Asp Gly Pro Ala Ala Ala Thr Ala Pro Glu Thr
660 665 670
Arg Ala Asp Gly Pro Ser Gly Pro Ala Thr Thr Gln Gly Pro Lys Arg
675 680 685
Asp Ala Val Ala Gly Ala Ala Ala Val Gly Ser Ser Ala Arg Ser Asp
690 695 700
Ser Pro Leu Pro Ala Ala Ala Ala Ala Thr Ala Gly Ala Asn Gly Ala
705 710 715 720
Ser Gly Ala Ala Ser Asp Val Leu Ala Gly Ala Gly Ser Leu Ala Leu
725 730 735
Leu Arg His Ser Asp Arg Ser Ala Phe Thr Ala Phe Thr Val Phe Leu
740 745 750
Pro Gly Arg Val Ala Ala Ala Ala Ala Ala Ala Ala Ala Ala Ala Ala
755 760 765
Ala Ala Thr Ser Ala Gly Ala Ser Thr Gly Thr Ala Asn Gly Ala Pro
770 775 780
Pro Ala Pro Gly Thr Ala Leu Ala Ala Ala Ala Ala Ala Ala Ala Ala
785 790 795 800
Ala Ala Ser Ala Val Pro Leu Pro His Pro His Thr Ala Pro Pro Ala
805 810 815
Leu Phe Gly Val Pro Pro Pro Ser Ser Val Pro Pro Ser Ser Leu Ser
820 825 830
Val Leu Pro Pro Val Met Pro Leu His Pro Ala Ala Ala Ala Ala Ala
835 840 845
Ala Thr Ala Gly Gly Gly Lys Pro Ser Asp Ala Ala Thr Tyr Ala Ala
850 855 860
Ala Ala Ala Ala Gly Leu Val Pro Tyr Pro Gly Phe Ala Pro Ala Arg
865 870 875 880
Pro Gly Pro Phe Pro Pro Pro Pro Gly Ser Gly Gly Pro Gly Ala Pro
885 890 895
Pro Val Tyr Ile Pro Glu Ser Val Leu Gln Leu Ile Ala His Leu Ser
900 905 910
Gly Arg Ala Ala Ala Glu Ile Pro Ala Val Pro Ala Glu Ser Val Thr
915 920 925
Ala Ala Pro Val Val Val Gln Lys Ser Gly Gly Pro Ala Ser Ala Ala
930 935 940
Arg Leu Ala Ala Val Ala Lys Tyr Leu Glu Lys Arg Lys His Arg Asn
945 950 955 960
Phe Gln Lys Lys Val Arg Tyr Glu Ser Arg Lys Arg Leu Ala Glu Ala
965 970 975
Arg Pro Arg Val Arg Gly Gln Phe Val Lys Ala Gly Thr Ala Gly Ala
980 985 990
Ala Ala Ala Ala Ala Ala Ala Ala Ala Ala Ala Ala Ala Gly Thr Ala
995 1000 1005
Ala Thr Ala Ala Gly Thr Gly Thr Ala Arg Gly Ala Ala Thr Ala
1010 1015 1020
Ser Gly Ala Ala Gly Lys Pro Glu Leu Gln Gly Pro Asp Thr Ala
1025 1030 1035
Glu Glu Ala Ala Ala Ala Thr Leu Leu Ser Ala Ala Ala Ala Met
1040 1045 1050
Ala Ala Ala Ala Ala Gly Thr Ser Gly Pro Ser Gly Ser Gly Ser
1055 1060 1065
Gly Ala Met Asp Val Asp Gly Ala Asp Pro Glu Ala Asp Ala Asp
1070 1075 1080
Val Met Asp Glu Asp Asp Gly Glu Asp Asp Gly Ser Asp Glu Ser
1085 1090 1095
Ala Gly Glu Pro
1100
<210>55
<211>936
<212>PRT
<213> Chromochloris zofingiensis
<400>55
Met Ser Ala Asp Ala Gly Gly Gln Lys Pro Gly Val Ala Glu Pro Gly
1 5 10 15
Ala Arg Thr Gly Pro Gly Phe Ser Val Asn Ser Ser Phe His Val Leu
20 25 30
Leu Val Asp Asp Asp Ala Val Thr Leu Lys Tyr Val Glu Gln Leu Leu
35 40 45
Arg Lys Cys Ser Tyr Glu Val Thr Thr Ala Thr Asn Gly Arg Glu Ala
50 55 60
Ile Glu Val Leu Glu Gly Arg Arg Gly Gln Val His Ile Asp Leu Ile
65 70 75 80
Leu Thr Asp Ile Ser Met Pro Glu Val Asn Gly Val Gln Leu Ile Glu
85 90 95
Glu Val Val Asn Gly Gly Lys Trp Lys Asn Leu Pro Val Ile Val Met
100 105 110
Ser Ser His Glu Ala Gln Ala Asn Val Leu Glu Ala Phe Gln Ala Gly
115 120 125
Ala Ser Asp Tyr Leu Ile Lys Pro Leu Arg Arg Asn Glu Leu Ser Thr
130 135 140
Leu Trp Gln His Val Trp Lys Ala Ser Lys Leu His Gln Pro His Leu
145 150 155 160
His Gly Glu Asp Asp Glu Asp Glu Asp Asp Thr Ala Leu Asp Asn Gly
165 170 175
Lys Phe Asp Ser Ser Ala Gly Asn Asn Lys Gly Ser Ser Gly Ala Ser
180 185 190
Thr Ser Ala Ala Gly Asp Ala Thr Ala Leu Ser Met Ala Asp Ala Ser
195 200 205
Arg Ala Leu Tyr Glu His His Pro Ser His His His Asn His Ile Gly
210 215 220
Glu Pro Ser Ile Asp Thr Gln Ala Ser Gly Gln Val Gly Ser Asn Asp
225 230 235 240
Pro Ser Leu Val Ile His Pro Leu Asp Ile Ser Pro Leu Pro Ala Ala
245 250 255
Ala Pro Pro Leu Ala Val Ala Gly Asp Pro Ala Leu Ala Ala Ala Pro
260 265 270
Leu Gly Thr Gly Gly Gln Asp Thr Pro Gly Ser Gly Asp Glu Gln Ala
275 280 285
Thr Ala Gly Thr Ser Val Gln Gln His Gln His Ser Gln Ala His His
290 295 300
His Ser Arg His Ile Pro Ala Ser Gly Ser Gly Thr Thr Glu His Ala
305 310 315 320
Pro Gln Ser Leu Ser Gln His Pro His His His Asn His Gln His His
325 330 335
His Asn Ser His His His His His Gln His Asp Leu Ala Gln Gln Arg
340 345 350
Gln Gln His His His His His Ser Asn Gly Val Asn Gln Asp His Ser
355 360 365
Gln Pro Asn Pro Asp Leu Thr Gln Met Pro Ser Ala Asp Gln Gln Ser
370 375 380
Leu Leu Thr Leu Pro His Ser Pro Asn Gly Ala Met Pro Leu Phe Lys
385 390 395 400
Pro Ser Thr Ser Ser Ala Ala Met Asp Cys Ser Thr Gln Gln Pro Leu
405 410 415
Gln Gln Gln Gln Gln His Glu His Gly Ser Ser Ser Pro Ala Leu Ser
420 425 430
Arg Pro His Ala Glu Lys Ser Pro Phe Gly Val Arg Tyr Gly Gly Gly
435 440 445
Asn Gly Gly Tyr Ser Ser Ser Met Ser Gly Ala Ser Leu Pro Pro Gly
450 455 460
Leu Gln Glu Leu Ala Val Leu Gly Gln Gln Arg Gln Ala Ala Arg Glu
465 470 475 480
Lys Asp Leu Gln Gln Arg Gln Gln Gln Gln Gln Lys Gln Gln Gln Gln
485 490 495
Gln Gln Gln Thr Ser Ala Leu Arg His Ser Asp Ser Ser Ala Phe Thr
500 505 510
Ala Phe Thr Val Phe Leu Pro Lys Gly Ser Asn Gly Leu Asn Arg Ser
515 520 525
Ser Gly Val Gly Val His Gly Ser Asn Ser Gln Thr Ser Gly Gly Gly
530 535 540
Ala Ala Asp Leu Gly Arg Ser Ala Ser Ser Met Glu Ile Leu Ser Thr
545 550 555 560
Ala Glu Thr Leu Val Gly Gln Thr Ala Gly Gly Ala Gly Val Asn Gly
565 570 575
Val Gly Ser Ala Lys Pro Gly Gly Asp Cys Leu Lys Glu Glu Ser Pro
580 585 590
Asn Asp Ser Thr Pro Ser Ala Glu Glu Gly Asp Glu Gln Asp Val Lys
595 600 605
Pro Pro Gln Ser Thr Ser Gly Ala Ala Ala Ala Glu Pro Ala Val Ala
610 615 620
Thr Ala Ser Gly Arg Ala Ala Thr Ala Ala Ile Ala Val Val Ala Asp
625 630 635 640
Ala Thr Val Ala Lys Pro Asp Ala Pro Val Ala Thr Ser Asp Gln Lys
645 650 655
Gln Val Leu Pro Phe Pro Gly Val Asn Gly Ala Ala His Leu Thr Gly
660 665 670
Met Asn Asn Gly Val Ser His Ser Gly Thr Ala Gly Ser Tyr Ser Glu
675 680 685
Leu Thr Gln Met Leu Tyr Ala Gln Leu Pro His Gln Gly Gln Pro Leu
690 695 700
Pro Asp His Val Met His Phe Leu His Asn Phe Tyr Arg Thr Met Met
705 710 715 720
Glu His Gln His Gln Gln Gln Ser Gln Gln Met Asp Gln Leu His His
725 730 735
His Val Gln Gln Gln Gln Gln Gln Gln Val Gln Gln Gln Gln Arg His
740 745 750
Leu Gln Gln Phe Ala Thr Ala Pro Asn Gly Gln Ala Pro Pro Asn His
755 760 765
Ser Asn Thr Asn Gln His Leu Gln Gln Gln Gln Gln Ala Cys Gly Asn
770 775 780
Gln Pro Leu Gln His Thr Ser Gln Pro His Cys Asn Gly Ala Ala His
785 790 795 800
Leu Gln His Leu Gln Gln Ser His Ser Ala Pro Ser Leu His Thr Pro
805 810 815
Gly Phe Thr Cys Thr Thr Thr Ala Thr Gln Ser Asn Thr Glu Pro Ser
820 825 830
Cys Met Leu Thr Gln Ser Glu Gly Ala Pro Cys Ala Ser Ser Tyr Arg
835 840 845
Ala Ala Ala Val Ala Lys Tyr Arg Glu Lys Arg Lys Asn Arg Asn Tyr
850 855 860
Asp Lys Lys Val Arg Tyr Glu Ser Arg Lys Arg Leu Ala Glu Ser Arg
865 870 875 880
Pro Arg Val Lys Gly Gln Phe Val Lys Gln Glu Val Leu Ala Ala Ala
885 890 895
Gly Leu Thr Ala Leu Ala Glu Leu Ala Thr Ala Asn Lys Arg Ala Arg
900 905 910
Leu Asp Val Asp Tyr Val Thr Ala Thr Gly Met Thr Asp Ala Asp His
915 920 925
Met Asp Thr Ala Glu Glu Ser Ser
930 935
<210>56
<211>444
<212>PRT
<213> Gliocladium sp
<400>56
Met Ala Ala Gly Leu Lys Arg Ile Pro Ser Phe Ser Gly Arg Pro Gly
1 5 10 15
Phe Pro Asn Gly Leu Gln Val Leu Val Val Asp Gly Asp Thr Ser Ser
20 25 30
Ser Gln Cys Leu Arg Gln Lys Leu Glu Glu Leu Ala Tyr Glu Val Ser
35 40 45
Cys Cys Ser Ser Gly Ser Asp Ala Ser Ala Leu Leu Arg Lys Glu Asp
50 55 60
Ser Ser Tyr Asp Ile Leu Leu Val Glu Ala Lys Ala Leu Ala Lys Asp
65 70 75 80
Ala Thr Asp Gly Gly Ser Leu Arg Asp Ser Ala Ala His Leu Pro Leu
85 90 95
Val Leu Met Ser Glu Lys Ser Ser Ser Thr Asp Ala Val Trp Arg Gly
100 105 110
Ile Glu Leu Gly Ala Ala Asp Val Leu Glu Lys Pro Leu Ser Ser Leu
115 120 125
Lys Leu Arg Asn Ile Trp Gln His Val Val Arg Lys Met Met Ser Ser
130 135 140
Ser Gln Asp Ser Ser Arg Glu Ala Val Pro Cys Lys Met Glu Pro Lys
145 150 155 160
Ser Lys Gly Lys Gly Val Ser Ala Pro Ser Ser Pro Arg Thr Pro Ser
165 170 175
Pro Ala Ala Ser Leu Leu Thr Ile Ser Ser Gly Thr Met Thr Glu Lys
180 185 190
Ser Cys Lys Gly Gly Gly Asp Glu Ala Ser Phe Ser Gly Val Gly Asp
195 200 205
Val Lys Met Ser Cys Ser Ala Glu Ala Pro Glu Pro Cys Asp Ser Arg
210 215 220
Ala Thr Ala Glu Ser Pro Ala Ser Thr Gln Thr Lys Val Thr Phe Pro
225 230 235 240
Gly Cys Leu Asn Ser Gly Gly Thr Ala Leu Ala Ala Ser Lys Asn Cys
245 250 255
Ser Arg Lys Arg Lys Ala Lys Ala Pro Asp Thr Pro Ala Ser Val Ala
260 265 270
Ser Arg Pro Pro Leu Ala Ile Arg Pro Pro Ala Trp Ala Ser Pro Phe
275 280 285
Gly Pro Pro His Gln Gly Asn Thr His Val Val Gly Met Ala Pro Pro
290 295 300
Gln Cys Tyr Met Gln Gly Val Asp Pro Thr Asn Gly Cys Val Trp Gly
305 310 315 320
Thr Pro Ala Gly Gly Val Ser Gln Ala Pro Ala Tyr Met Pro Gly Trp
325 330 335
Gly Phe Ser Pro Gln Pro Met Leu Ser Gly Ser Phe Leu Gln His Pro
340 345 350
Ser Thr Ser Asp Leu His Lys Cys Pro Ser Val Gly Ala Ser Ser Leu
355 360 365
Ala Ser Ser Leu Asp Ser Ser Leu Thr Leu Cys Gly Phe Gly Ala Asp
370 375 380
Leu Pro Asp Asp Asp Leu Leu Leu Glu Asp Val Leu Leu Pro Asp Glu
385 390 395 400
Asp Leu Leu Asp Leu Ala Pro Asp Glu Pro Ala Thr Met Lys Ala Pro
405 410 415
Glu Gln Pro Pro Ile Gly Leu Lys Leu Lys Lys Ser Ala Ser Leu Ile
420 425 430
Asp Leu Ile Asn Ala Gln Leu Ser Ala Ala Thr Ala
435 440
<210>57
<211>1284
<212>DNA
<213> Oocystis sp.
<400>57
atggcgctga agcgcgttcc tagcttttcc ggtcggccga actttcccgc cggtctgcag 60
atcctggtgg tggacagcga ttcttcctca agggaggctg tagagatgca actcaaatcg 120
cactcctatc tagcaacctg ttgttgcacc tgcggcgagg ctgtggagca gctcggcacg 180
tcaaagtatg acatcgtgct ggcagagtcc aagctggttg ctgcggagtg cgttgactcg 240
acacggttgt gcgaggccgc aagggctctg cctctggttt tgatgtgcga ggactcgacg 300
gcggacgacg tgttgaaggg aatcaggctc ggcgcttgcg actttctgga gaagccgctg 360
tccccactga agctcaagaa catatggcag cacgttgttc gcaagatgat ggagcagatg 420
cacgtccgcc gcacggacga cgcggatacg tgcactacta agagcagccg cgaccaaagc 480
tgcgcgatca agggcaagtc ggtggcttcc acgccctcgt gtcccaagac accttctccc 540
gcggcttctg gcgcagacat cggctgcagc atagccacgt cggtcagcaa ggccggggac 600
gtggtcggcg agtccagcag ttccgagacg cgcaaggagc attgcagcga gaccacggag 660
tgctccgacc tcaagagctg cgccgcaaag tcagctgtgt cggcgcaaac gccggtatcc 720
accgcgaccg ttgcagctac ctggggtgcg tcgaagaaga agtcgacagc atcagccact 780
accagcagtg tcagcaaccg gccgccgctg gcgatcaaga tgccggcgcc agctgtggca 840
tacacgtcag ggcttgcgcc ctttccgccg ccgatgtttg tacctggcgg ctggggccag 900
tcaagcaacc catgcgtggt gggcacgcca atgatgccac cgccgcccgg catgggcatg 960
ccgccccacc accacgcgcc ctatggccag gtgccgccgc cgggctatcc agtcgcatgc 1020
atgcccagcg cctttgtgcc gacgccgatg ggccctggcg gcgtggcgtt tgcgccgcca 1080
cctggcgcca gctgcacatc tgctgcgtac tacccccatc ctgctgtgga tgcaagcgcg 1140
tctgcaactg ccaccttcac gggccatgtg cagatcgacc tgactaacgt gtctgctgaa 1200
gagccggcgc ccattggttt ggcgctgcgc aagaccgcgt cgctgctcga cctggtcagc 1260
gatcgcctgg gccagcgtgc gtgc 1284
<210>58
<211>1026
<212>DNA
<213> Tetraselmis sp.
<400>58
atgctgtgcc ctgctgtcca ggttgccacc atggccactg tcctggcttc cacgcatttt 60
tcggagcgcc ccagcttccc ggctgatctg gaggtgctgc ttctggattc agcaacgcag 120
ggcgcagaaa ctgcctcgaa gctgttgctg tcgtgttcct atcgtgtcac cgtgtgccga 180
tccgtgtctg aggctctgag ccacatggca tgcaaggctt tcgacgtggt cctggtggag 240
cagaaacttt tcagcggcag ggatgcggcc gctgcgcagc tcaaggccct ggcaggcgtc 300
atccccaccg tggtcctgag tgacagcggc agtgcgaagg atacctgggc tgccatcgtt 360
gggcaggcag ccgatgtcct catccgcccg ctgaccaagc agaagctgca gacgctgtgg 420
cagcacactg tccgtatgca gcgcgcagca tcttcggctt cggcggctac tagcatggtt 480
gccaagcctg ttgccgtgct ctcctcggct ctgaagcccg ctgcttccag tgcttcactg 540
gacaaggggc agaagcgcaa gttgaaggat catatgatgg ggcccatcat ggcacacccg 600
caagtgtcca accctggctt tatctggggc gcaccagtga tgggcgttcc ggctggacag 660
caggctcccc agaagtcaga ggccccggtc accccccaga agccaggctc agagatgcac 720
cccgagctgg atgccacaag ccacatcgcc atgggctcca gcgacaactt caacgtacct 780
gtgtatgaaa gcggcactga cagccaggag tcgcagccaa cctgcgaccc cacctctctt 840
gatgacatca atgaggatga ctacgcgttt atcgatttcg cgctcagcga ttcttttccc 900
actgtggagg aggatgagat ccttccaccc attggccttt cgctgaagaa gtccagctcc 960
ctcctgaaca tgctgaacgg tgtgcttctc tcggctcact ctgtaccgct gcagctgccc 1020
cagtag 1026
<210>59
<211>2076
<212>DNA
<213> Soybean
<400>59
atgggagagg tggtcatcat gagtggagag aagaagtcag ttagagtgga gggggtggag 60
aaggaagata gtggtggaag tgggagcaag gctggtgaat ttaaggggtt gatgaggtgg 120
gagaagttct tgcccaagat ggttttgagg gtgctgttgg ttgaagcaga tgattccaca 180
agacaaatta ttgccgcgct tctcagaaaa tgcagctaca aagtggttgc tgttcctgat 240
ggcttgaagg catgggaatt actcaaggga agaccgcaca atgttgatct aattctgaca 300
gaagtggatt tgccatccat atctggctat gcacttctca cattaattat ggagcacgag 360
atttgcaaaa acatccctgt tataatgatg tcttcccaag attcaattag cacagtatac 420
aaatgcatgt tgagaggtgc tgctgattat cttgttaagc ctattagaaa aaatgaactg 480
aggaacttgt ggcaacatgt ttggagaaga caatcatcaa ccactggtat taatggcctc 540
caagatgaga gtgttgcaca acagaaggtt gaagccactg cagaaaataa tgctgctagt 600
aatcgttcaa gtggtgatgc tgcttgcatt cagagaaata tagaactaat tgagaaggga 660
agtgatgcac agagctcttg taccaagcct gactgtgaag ctgagagtga ccctgtcggt 720
aacatgcagg aattttctct gctgaaatgt ggggaagcat atccaagtgg aacagagaca 780
caacaggttg aaacaagctt tcgcttaggc cagacattaa tgatgcatga ctgtcatgct 840
ggaggattaa atgtgagtat ccgcaaaaat ggtgaggcaa gcacgactaa tgacaaggat 900
actgatacag agcattttgg gaatgctagc atcagtggtg aggctcatga caatccctat 960
gttcaaatta actcttccaa ggaagctatg gacttgattg gagcatttca tactcatcca 1020
aactgttccc tgaaaaattc cacagttaat tgcacaggca actttgacca ttctcctcaa 1080
ttggatcttt ctttgagaag atcttgtccc ggaagctttg agaataaact cactgaagaa 1140
aggcacaccc tgatgcattc taatgcttca gctttcaagc ggtatactac caggcaattg 1200
caaatatcaa tgcctgcagt gttaattaac ttctctgatc aacaaagaga acagataaca 1260
aattgtgaga aaaacatctc acacatcgct actggcagca actcagatag ttcaacacct 1320
atgcaaagat gtattgtgtc tccaactaca gtccaatcaa aagaatctga acttgcaacc 1380
tcacaccccc cgcaaggaca ttctctccca attccagtaa agggtgtaag gttcaatgat 1440
ctatgcacag cctatggttc tgtacttcct tcagtgtttc atacacagtc aggtccacca 1500
gcaatgccaa gtccaaattc agttgtgctc cttgaaccaa actttcaagt aaatgcattt 1560
tatcagtcaa atatgaaaga gagtagttca gagcagcttt atgaatctcg tggtccaaat 1620
ggaaacacca cccaaaacca cattgtgtac acacaggagc acaaatcaga acatgcagaa 1680
gatcgaggac atatctctcc tacaactgat caaagtgtgt caagtagttt ctgcaatgga 1740
aatgcaagcc atcttaacag cattggttat ggaagcaact gtggaagtag cagcaatgtt 1800
gatcaagtta acactgtttg ggcagcttca gagggaaagc atgaagacct cacaaataat 1860
gcaaactctc atcgatctat ccaaagagaa gcagctctaa acaaatttcg cttgaaaagg 1920
aaagagagat gctatgagaa gaaggttcga tacgagagca gaaaaaaact agcagagcag 1980
cgtcccagag ttaaaggaca atttgttcgt caagtgcatc ctgatcctct tgttgcagaa 2040
aaagatggca aagaatatga tcattcagat ttctga 2076
<210>60
<211>2244
<212>DNA
<213> grape
<400>60
atgggtgagg ttgtggtgag cagtgaggca ggaggaggag gcatggaggg tgaggtggag 60
aagaaggagg tgggcagtgg ggttgtgagg tgggagaggt ttcttcccag aatggttctc 120
agggttttgt tggttgaagc ggacgattcc accaggcaaa ttatcgctgc gcttctcagg 180
aaatgcagtt acaaagttgc tgctgttcct gatggcttaa aggcatggga ggtactgaag 240
gctagacccc acaacattga cctcatattg acagaagtgg agttgccatc aatatctggc 300
tttgctctcc tcaccttggt tatggaacat gagatctgca aaaacattcc tgttataatg 360
atgtcctcac atggttcgat aaacacggtt tataaatgca tgttgagagg tgcagctgac 420
tttcttgtta agcctgttag aagaaatgag ctgaagaatt tgtggcaaca tgtctggaga 480
agacaatcgt caactgttag cggaaatggc ccccaagatg agagtgttgc acaacagaag 540
gtcgaagcca cttctgaaaa caaccccaca agtaatcact caagtgatca tgttgcttgt 600
attcagaaaa ataaggaagc actcaataaa gtgagtgatg ctcagagctc ttgttcaaag 660
ccagacttgg aagctgagag tgcctacatg gaaactatgc aggatttctc aaatccgaca 720
tggagcagat ctcttgtgag tgacacaaaa atgcagaaga atgaagaatg tgccaaattg 780
ggcccgaaat ttcttatgca caataaagaa gctgggggaa cactggaggc tgcctgcagg 840
gatgtgaaca caatgactca gcctgaagca gtggaaccag aaaatgatgg gcaaggtgct 900
aacgctccta gtgaggcttg tggtaacaat gccatattgg gcagctcatc tagagaagcc 960
atcgacttga ttggagtatt tgataattct aaaaaatgca cttatggaaa ttcttcttca 1020
aataatggca ccaaaaagag tgattctatt ccacagttgg acctttcctt gagaagatct 1080
catcctagta gccctgagaa tcaagttgct gatgaaaggc atacactgaa ccattctaat 1140
ggctcggcct tttcacgcta cataaacagg tcattgcagc caccacatct accatcaaca 1200
ggtgttttca atcagcagaa aaactttgga gctgattctg ataaacgttt atctcagctg 1260
gttactggtt ataactctga tattactagt cccacactga gtactcaaag aagtgtgatc 1320
tctctagcta ctagtccatc tggacgagtt gaaattgcac tttgtggccc tcaacagaga 1380
gcttttcctg ctccagttcc acaaaatgcc aacaattcca ccagccagac taatcacaag 1440
ccagagcaca aattggactc actggagggt caagggcact tctctcctgc cactgatcag 1500
aattcaagta gtagttttgg taatggtggt gcaagtaatc tgaatagctt tgggtgtgga 1560
agcatttgtg gaagtaatgg gaatgccaat acagttgctg ttgttcaggc cgctgcagag 1620
ggcaagaatg aagaaggtat cttcagtcat gaaggacact ctcaacgatc tatccaaaga 1680
gaagctgctc taaccaagtt tcgcttgaag cggaaagaca gatgctttga gaagaaggtt 1740
cgttatgaaa gcagaaagaa gcttgcagag cagcgacccc gagtaaaagg acagtttgtt 1800
cgacaagtgc ataccatccc cccacctgca gagcctgata catactatgg cagttcgttt 1860
gatgttcagc ctcaaagaag ccgatatcta tcagctcaac ctctcagggc ctcatcttct 1920
caactcctct atccaactca cactcctctc caagaatcca aatacgaagg tcatgaagaa 1980
agcaatctct tgacggcgtc cttggttgga actgccctac cggtggctcc atcttttggt 2040
tatgaagttg gacgtgatca gacggcagga aaacttgttc tgagtttaaa gctcgatggc 2100
cgggttcgat ggaaggtggg gacttgggtt tctggccgat accgacttaa cgttaattgt 2160
gttgctgtga tggcatttgg accctccatc ccatctggtc cactgagttc aaaagaagga 2220
actcagtgct ctactactgt ttga 2244
<210>61
<211>2400
<212>DNA
<213> cocoa
<400>61
atggggatag ttcaaatgaa taataatggt cctgtggcca atgggttggt tgaattgaat 60
acacatattc atgatgagca caagaaaata aggggtgggg tcataggtga ggggcagggc 120
ctctcagtgg aagaagagtc atggattaat gaggatgtgg aagacaggaa tgatgggaag 180
acagagttgg ttcaggttca gggccatgcg catggtgagc aagagaggtc acagcaacag 240
cctcaaggtc ctttggttca ctgggagagg tttttacctc taaggtcttt gaaggttcta 300
ctggtggaaa atgatgactc aactcgccat gttgtctgtg cattgctgcg aaattgtgga 360
tttgaagtta ctgctgtgtc aaatggactg caagcttgga agatcttgga agatctaacc 420
aatcatattg atcttgtttt aactgaggta gtgatgcctt gtttgtcagg cattggcctt 480
ttatgcaaga taatgagcca caaaactcgc atgaatattc cagtgattat gatgtcatct 540
catgattcta tgagtacagt ctttaggtgt ttgtccaagg gtgcagttga ctttttagtg 600
aagcctatac gaaagaatga gcttaaaaat ctttggcagc atgtttggag gaaatgccac 660
agctctagca gtagtggagg ccaaagtggt acacagaccc aaaaatcctc aaaatcaaaa 720
ggtactgatt cagacaacaa tactggaagt aatgatgagg atgacaacgg cagtgttggt 780
ttgaatgttc aggatggaag tgacaatgga agtggcactc agagctcatg gacaaagaga 840
gcagtagaag tcgacagctc ccagccaata tcaccatggg accagttagc tgatcctcct 900
catagcactt gtgcccaggt tatccattct agacatgaag tgttaggtga cagctgggtt 960
ccagtaacag cgacgaggga gtatgatgag ctggataatg aactagaaaa tgttgttatg 1020
ggcaaagact tggagatagg ggtacctaaa attacagctt cgcagcttga agacccaagt 1080
gaaaaagtaa tgaccaacat agctggtgtt aataaagaca aattatctgc aataaaccct 1140
aagaaagatg atgagaaact agagaaagcg caattggaac ttaacagtga gaaatcaggt 1200
ggtgatttga gaaatcaagc tgctgacctg ataggtgtca tcaccaataa tactgaacct 1260
catatagaaa gcgcagtctt tgacatccca aatggcctcc ctaaggtctc tgatgcaaaa 1320
gagaaggtga actacgacac gaaggaaatg ccttttcttg agctcagttt aaagagactg 1380
agagatgtag gagacactgg aacaagtgcc catgaacgaa atgtattgag acattcagac 1440
ctttcagcct tctcaagata caattctggt tcaactgcca atcaggctcc aacaggaaat 1500
gttggtagtt gttctccact tgataatagc tcggaggcag ttaaaacaga ttctatgaag 1560
aattttcagt ctacctcaaa tagcatacct ccaaagcaac agtccaatgg aagtagtaac 1620
aataatgaca tgggttccac cactaataat gccttcagca aaccagcggt actcagtgac 1680
aagccagcac ctaaaacttc agctaaatct ttccatccct cttctgcctt ccaaccagtg 1740
cagagtggcc atggttctgc cctgcaacct gtagcacaag gtaaggctga tgctgcacta 1800
ggtaacatga ttttagttaa agcaaggggc acagaccaac aggggaaagt gcagcatcac 1860
catcatcatt atcaccacca ccaccaccac catgtccata acatgctccc aaatcaaaag 1920
ttaggtaacc atgatgattt atctttggaa aatatggcag cagcagctcc ccagtgtggg 1980
tcatccaatc tgtcaagttt accacatgtt gaaggcaatg ctgctaacca cagtttgact 2040
agaagtgcat caggaagtaa ccatggaagc aatggacaga acgggagcag cactgtgtta 2100
aataccagag gaatgaatct tgaaagtgaa aatggggtgc ctgggaaagg tggagctggc 2160
ggtggaattg gatctggagg caggaatgta gttgatcaaa accgttttgc tcaaagagaa 2220
gctgctttga acaaattccg ccagaaaagg aaagaaagat gctttgagaa gaaggttcga 2280
tatcagagca gaaagaaact ggctgagcag agaccacgca ttcgaggaca gtttgtgcga 2340
cagattagca ctactgggaa ggaagcattc agatttcgtg gtgcaggatt gtgcacttag 2400
<210>62
<211>2229
<212>DNA
<213> Rice
<400>62
atgatgggaa ccgctcatca caaccaaacc gccggctctg ccctcggagt cggagtcgga 60
gatgccaacg acgccgtgcc tggggctggg ggtgggggct acagcgaccc ggatggcgga 120
ccaatctccg gtgtgcagcg gccaccgcag gtctgctggg agcgcttcat ccagaagaag 180
actatcaaag tcttgctagt tgatagcgat gactccacca ggcaggtggt cagtgccctg 240
cttcgtcact gcatgtatga agtcatccct gctgaaaatg gccagcaagc atggacatat 300
ctagaagata tgcaaaacag cattgatctt gttttgacag aggttgttat gcctggtgta 360
tctggaattt ctctattgag taggatcatg aaccacaata tttgcaagaa tattccagtg 420
attatgatgt cttcaaatga tgctatgggt acagttttta agtgtttgtc aaagggcgct 480
gttgacttct tagtcaagcc catacgtaag aatgaactta agaacctatg gcagcatgtg 540
tggagacggt gccacagctc cagtggcagt ggaagtgaaa gtggcattca gacacaaaag 600
tgtgccaaat caaaaagtgg ggatgaatcc aataataaca atggcagcaa tgacgatgat 660
gacgacgatg gtgtaatcat gggacttaat gcaagagatg gcagtgataa cggcagtggc 720
actcaagcgc agagctcatg gacaaagcgc gctgttgaga ttgacagtcc acaggctatg 780
tctccagatc aattagctga tccacctgat agcacttgtg cacaagtgat ccacctgaag 840
tcagatatat gcagcaatag atggttacca tgtacaagca acaaaaattc caagaaacaa 900
aaagaaacta atgatgactt caaggggaag gacttggaaa taggttctcc tagaaattta 960
aacacagctt atcaatcctc tccgaatgag agatccatca aaccaacaga tagacggaat 1020
gaatatccac tgcaaaacaa ttcaaaggag gcagcgatgg aaaatctgga ggagtcaagt 1080
gttcgagctg ctgacttaat tggttcgatg gccaaaaaca tggatgcaca acaggcagca 1140
agagccgcaa atgcccctaa ttgctcctcc aaagtgccag aagggaaaga taagaaccgt 1200
gataatatta tgccatcact tgaattaagt ttgaaaaggt caagatcgac tggggatggt 1260
gcaaacgcaa tccaagagga acaacggaat gttttgagac gatcagatct ctcggcattt 1320
acgaggtacc atacacctgt ggcttccaat caaggtggga caggattcat gggaagctgt 1380
tcgctgcatg ataatagctc agaggctatg aaaacggatt ctgcttacaa catgaagtca 1440
aactcagatg ctgcaccaat aaaacaaggt tctaatggta gtagcaataa caatgacatg 1500
ggttccacta caaagaacgt tgtgacaaag cctagtacaa ataaggagag agtaatgtca 1560
ccctcagctg ttaaggctaa tggacacaca tcagcatttc atcctgcaca gcactggacg 1620
tctccagcta atacaacagg aaaagaaaag actgatgaag tggctaacaa tgcagcaaag 1680
agggctcagc ctggtgaagt acagagcaac ctcgtacaac accctcgccc aatacttcat 1740
tatgttcatt tcgatgtgtc acgtgagaat ggtggatccg gggcccctca atgtggttca 1800
tccaatgtat ttgatcctcc tgtcgaaggt catgctgcca actatggtgt caatggaagc 1860
aactcaggca gtaacaatgg aagcaatggg cagaatggga gtacgactgc tgtaaatgct 1920
gaacggccaa atatggagat cgctaatggc accatcaaca aaagtggacc tggaggtggc 1980
aatggaagtg gaagcggcag tggcaatgac atgtatctga aacgcttcac tcaacgagag 2040
catagagtgg ctgcagtgat caagtttaga cagaaaagga aagagcgcaa cttcggaaaa 2100
aaggtgcggt accagagcag aaagaggctg gccgagcagc ggccaagggt ccgcggacag 2160
ttcgtgcggc aagctgtgca agaccaacaa cagcagggtg gtgggcgcga agcggcagcg 2220
gacagatga 2229
<210>63
<211>2301
<212>DNA
<213> corn
<400>63
atgggcagtg cttgccaagc tggcacagac gggccttccc gcaaggatgt gttagggata 60
gggaatgccg ccttagagaa tggccaccat caggctgaag ctgacgcaga tgaatggagg 120
gaaaaggaag aggacttggc caacaacggg cacagtgcgc caccgccagg catgcagcag 180
gtggatgagc ataaggagga acaaagacaa agcattcact gggagaggtt cctacctgtg 240
aagacactga gagtcttgct ggtggagaat gatgactcta ctcgtcaggt ggtcagtgcc 300
ctgctccgta agtgctgcta tgaagttatt cctgctgaaa atggtttgca tgcatggcga 360
tatcttgaag atctgcagaa caacatcgac cttgtattga ctgaggtttt catgccttgt 420
ctatctggta tcggtctgct tagcaaaatc acaagtcaca aaatttgcaa agacattcct 480
gtgattatga tgtctacgaa tgattctatg agtatggtgt ttaagtgttt gtcgaaggga 540
gcagttgatt tcttggtaaa accactacgt aagaatgagc ttaagaacct ttggcagcat 600
gtttggaggc gatgccacag ttccagtgga agtgaaagtg gcatccagac acagaagtgt 660
gccaaactaa atactggcga cgagtatgag aacggcagtg acagcaatca tgatgatgaa 720
gaaaatgatg acggcgacga tgacgacttc agtgttggac tcaatgctag ggatggaagt 780
gacaatggca gtggtactca aagctcatgg acaaagcgtg ctgtggagat tgacagccca 840
caacctatat ctcccgatca actagttgat ccacctgata gtacatgtgc acaagtaatt 900
caccctagat cagagatatg cagtaacaag tggttaccga cagcaaacaa aaggaatgtc 960
aagaaacaga aggagaataa agatgaatct atgggaagat acttaggaat aggtgctcct 1020
aggaactcaa gtgcagaata tcaatcatct ctcaatgatg tatctgttaa tccaatagaa 1080
aaaggacatg agaatcacat gtccaaatgc aaatctaaaa aggaaacaat ggcagaagat 1140
gattgtacaa acatgcctag tgcaacaaat gctgaaactg ctgatttgat tagctcaata 1200
gccagaaaca cagaaggcca acaagcagta caagccgttg acgcaccaga tggcccttcc 1260
aaaatggcta atggaaatga taagaatcat gattctcata tcgaagtgac accccatgag 1320
ttgggtttga agagatcgag aacaaatgga gctacagcgg aaatccatga tgagcgaaat 1380
attctgaaaa gatcagatca gtcagccttc accaggtacc atacatctgt ggcttccaat 1440
caaggtggag caagatatgg ggaaagctct tcaccacaag ataacagttc tgaggccatg 1500
aaaacggact ctacatgcaa gatgaagtca aattcagatg ctgctccaat aaagcagggc 1560
tccaatggca gtagcaataa cgatgtggga tccagtacaa agaatgttgc tgcaaggcct 1620
tcgggtgaca gggagagagt agcgtcacca ttagccatca aatctaccca gcatgcctca 1680
gcatttcata ctatacagaa tcaaacgtca ccagctaatc tgattgggga agacaaagct 1740
gatgaaggaa tttccaatac agtgaaaatg agccacccaa cagaggttcc acaaggctgc 1800
gtccagcatc atcatcatgt gcattattac ctccatgtta tgacacagaa acagccatca 1860
acagaccgtg gatcatcaga tgttcactgt ggttcgtcaa atgtgtttga tcctcctgtt 1920
gaaggacatg ctgcaaacta cagtgtgaat gggggtgtct cagttggtca taatgggtgc 1980
aatgggcaga atggaagtag cgctgtcccc aatattgcaa gaccaaacat agagagtatt 2040
aatggtacca tgagccaaaa tattgccgga ggtggcattg taagtgggag tgggagtggc 2100
aatgacatgt atcagaatcg gttcctgcaa cgagaagctg cattgaacaa attcagactg 2160
aagcggaaag atcggaactt tggtaaaaag gttcgctacc aaagcaggaa gaggcttgct 2220
gagcagcggc cacgggtccg aggacagttt gtgcgacaat ctgagcaaga agatcaaaca 2280
gcgcaaggtt cagaaagatg a 2301
<210>64
<211>2034
<212>DNA
<213> Physcomitrella patens
<400>64
atgccatatc tgtccggagt tgggcttctg tcgaagatga tgaagcggga agcatgcaag 60
agagtgccta ttgtcatcat gtcatcgtac gacagtcttg gcatcgtgtt ccgctgcctc 120
tcgaaaggag cttgcgacta tctcgtgaaa ccagttagga aaaacgagtt gaagaatctg 180
tggcagcacg tatggaggaa gtgccacagt tcgagtggga gcagaagtgg aagcggaagc 240
cagactgggg aagtagctaa gcctcggagt cgtggtgtag cagccgctga caatcctagt 300
ggaagcaatg atgggaatgg cagcagtgat gggagtgata atgggagcag ccgggtaaat 360
gcccagggtg gaagcgacaa tggtagtggc aatcaagctt gcatgcaacc tgtacaggtt 420
ctgaggaaca gcgcaattcc agaagcagta gacggggatg aggaggggca ggcgacatcg 480
caagataagg gtgctgactt ggatggagag atggggcatg atctggagat ggcaactcga 540
aggtctgctt gtgttaccac cggaaaagat cagcaaccag aggatgccca gaagcaagat 600
gaggatgctg tatgtatctt gcaagatgcg gggccatcac ctgatggggc taatgccgag 660
agcccatcat ctagcggtcg gaatgatgcc gcagaggagt cttctccaaa gatcattgac 720
ctgataaacg tcatagcgtg tcagccacag acccaggatg cagaacctca agaaagtgag 780
aacgatgacg aagaattgga tccgcgggga aggagcagcc ctaaaaacaa ctccgcttca 840
gattccggta cttcgctgga gttaagtttg aaacggccac gatcggcggt tggtaacggc 900
ggagaattag aagagcgtca accactgcga cattcaggag gctcggcctt ttctaggtat 960
ggcagcggag gaaccattat acagcaatac catcagactg gaggttcact ccctctcagt 1020
ggttatcctg tgtctggtgg atatggtgta tatggcatgt ccggcggtag ccctggagga 1080
tctcttcgtc tgggaatggg aatggatcga agtgggtcat cgaaaggaag tgtagagggg 1140
actacacccc caccctcgca tcctcagagc atggagaaag tgggtgggca agatgggtac 1200
ggcaatgcaa gacagactac ggaggatgca atgatcgtac ctggaatgcc catggctatt 1260
cctctcccac cacctgggat gcttgcatat gatggcgtta ttggaacgta tggtccggcg 1320
atgcacccga tgtattatgc tcaccctagc gcgtggatgg cagctccgtc tcgtcacatg 1380
ggagagcggg gagatgtcta caatcaatct cctgcatttc aagagcagga ttctgggtct 1440
gggaatcatt ctcaagcggg gcagactcac cagcacatgc accaccacca aggcaaccag 1500
caccaccatc atcatcacca tcaccaccat gggagtggcg cccagccttc tggaaatgca 1560
ggggtgcaag atgaacaaca gcaatcagtg gtaccgcctg ggtcgagtgc tcctcgctgc 1620
ggctcgaccg gtgtggatgg tcgaagtggt agcagcaacg gctacgggag caccgggaat 1680
gggaatgggt ccatgaacgg aagtgcttcg ggaagtaata ctggcgtgaa caacggtcag 1740
agtggatttg gtgcgacgcc gatgttaact gacaacagtg ggagtaacgg cgtcggtgga 1800
acggatgcag ccatggatgg ggtgagtggg ggcaatgggc tgtgcacaga gcaaatgcgt 1860
ttcgccagac gagaggctgc cttgaataag tttaggcaga agagaaagga gcgatgcttt 1920
gagaagaagg tgcgatacca aagcaggaaa cggcttgcag aacaaagacc acgagtccgc 1980
ggtcagtttg tgcggcaagc ggtacatgat ccgtctgctg gtgacgccga atag 2034
<210>65
<211>4080
<212>DNA
<213> Pantoea karezii
<400>65
atggagttcc acgtactgct ggtcgaagac gacagggtga cgctgaagac agttgagcag 60
ctactccgga aatgcaatta caaagttacc tgtgcagcaa atggacggga ggcaataaag 120
gtccttactg cctgccggca cagcggcgtc aaagtggacc ttattttgac cgatatactg 180
atgccggagg ttaccggctt tgacttaatc aatgaagtgg tacatgggga caccttttgc 240
gatgtgccag tggtcgtcat gtcctctcaa gactcgcagg agaacgtgtt acaggcattc 300
caagcaggcg ctgccgacta ccttataaag cccattcgca aaaatgagct ggctacgctc 360
tggcagcatg tctggcgcgc caacaaggcc aaggggtccg gcagcggcac caccactaac 420
gtcaccgggc agcccctttc cggtcgggag gatctggagg caggcgaagc cgtcgctgtc 480
gccgccgccg ccgccgctgc cagcggcaag gcctgtgcag caacgcatgg gcatttgaag 540
gacagcagcg gcggcagcag cggcgccgcc gcttctgtat tgcagtccac gggcggaaca 600
ctactgccgg accgtgctgc cactgtacgg tatccagctg cggcggcagc gccaccgcca 660
cctggcgcat ccgagctatc agggaacgtg acggcgggcg aagctcaagg gagccgtacg 720
cagcatctgc gccatctgtc cggcttggcg gggatggaaa gcacagcggc gacgtcagcg 780
gcggcgcaag gcagtagcgc agcagggccg ctgcggggct gcggcggtgc tggtactgct 840
atagctggtg ggccgcgcgc gcccttgggc ccactttcat tcgcgccctt cggcacttcc 900
gttgccgtac actttgacct gaaccccgca tccggcgcag ctcgacggct ggtcaactcc 960
agcggcgcca tcgatgcgtc gacgggcagc ggcactgctg gcgtcgccgc ttcatcgcgt 1020
tgcgccgccg gcacctccgc caccgtcatc agttggtcgc acgtcgatcc gacggagacg 1080
gacccagcgg aggcggagcc catgtacgac acgaacgcgg acgccaccgc ggcgaaggca 1140
gcggctgacg gtgtggcgga agctgacgac gacgatgttg gcgacgacgg cggtgctggg 1200
cccaaccaca atgacgatga tgacgagggt ggcggcgacg acgacgtcag cggcgacggt 1260
gacgaggacg gaaaccggcc tcgcaagcgt ccgcggctgc ttcagggatc ctcgcatcac 1320
cacagccacc agcatcgcct tcacagccta ggcggtacga ctaccaacac caccaccact 1380
acgacagccg cgaagcctaa gtcgacagcg ggagaacgcg gcggcgcggc ggcgctactc 1440
gcatgccgta ctgcggcggc cgcaccccta cgcggcagtg gctgcgccac cgctggcgcc 1500
accggagcat gtcgactggc ggcggcggca gcggcggcgg agggctccca gggttctcgc 1560
gccgcgtcgg cgtcggcagg ccctgacggc ggcgcgcgtg agagtacggc tacccccagt 1620
ggtgacacct ttgcagagag cccgtccgcg tacactgcaa ccgccacaac gaccagtacg 1680
gcaacaacca gtacgacaac gggatccggg attgagatgc aggacgacga gcaacagcag 1740
cgacagcagc ctaagcagcg tccgccggca tctcagccgg aactggaggg tcatcatcac 1800
caacaacaat atcaccatta ttatcgacgc accagcctgg agggcggttg cgccaacgca 1860
ccccctctcc ctgtcccttc atctgcacgg ggtgcttccc cggcaggcac gggtccgacg 1920
gaaagcggct ccgggaggga tagcggctgt gccaggatta caaatggtac ggcggcgggg 1980
gcgacggcgg caatgccgcc atctcacgtc agctcggcaa gccccccccg ctgtaccgcc 2040
acttccgcgg cggcgactcg cgggtcctct ggtgctgcta ctgcggcagc gggtgccatg 2100
acaacagcct tggcgacggc cggcagctat ccgcgaggag tggacgccag cccgccgccg 2160
aatagaagta tggggtccag cggcggtgat ggcggcggaa ccgccgctgc agctgccggt 2220
acggcacgag ggagctcgcc tgcggctgct acgccgccgc tgccaccttc tacgcagcag 2280
cacgggttgc cgcatcccgc ggcggcgccg ccgccgggcg ctgcatcgcc tggcggcgcc 2340
gtgacgctgc cgccagcgct tcaggagctg gcggcactgg gggcggcccg ccatgcgggg 2400
ctatggaccc agcgggcctt attgcatcag cagcaattgt tgctgcagca gcagaagcag 2460
cagaagcaac aacagcacca acaagaccag gtagtggggg cagagaagat tcatggtggg 2520
tcgacgtcgg ctgtagccaa cgccgccgag cagcagcagc agcagccgct gggggcggcg 2580
gcggcacgtc gtcccagcaa agcgggcgtg gacggaactg aggcgggaag tggcgcggtc 2640
ggcggatgcg catcggcgac agcggcggtc atggcgatgg aggcgtcgga gccgcatggc 2700
gcggttggca gctcctttac ggcggcagat cggcaggaga cgccgttgca gcctctgcat 2760
gctgaatctg cggcggcagg cggcgacatg gacggcaacc gcagtacacc cgcaactatg 2820
ccgtcggggc ctacggcagc cgcatcgggc ccttcgcaga cgtcgaacag cttgacggtg 2880
ctgcgacata gcgacagatc cgctttcacc gcattcaccg ttttcttgcc aagcagggtt 2940
gccggcgccg cggcggcggc ggcggcggca gcagctgctc ggccgccacc accgccggcg 3000
ccggtgcagc cgccggcgcc aatcttcacg caccctgctg ctgctgctgc agccgcggcg 3060
gcggctgccg ctggcagcgg cggtgcagcc tcagtgtggt atcctcacct ccatcatcac 3120
caccactact tgcagcagca gcagacgcac atgggtccct tgccgccact gccaggtgcc 3180
gtacatgttc tgccgtcgat catgcagctt cacatgggag tactggcgcc agggccgccg 3240
ccacagcagc agcagcagca gcaccttcag gccaaggcgc ctcagaagcc tcatgattcc 3300
gccgccgccg ccggcggagc taacggctcg ctaggtcccg cgacatcggc tgcagcggcc 3360
acgcacatgt cgtacactgg catgcaacag cgcccgggcg cctcatccgc caccaccacc 3420
agcgccggcg ctgtagcgtt cggtcaatct ccacctcacg ggctggcggc ggcggcggcc 3480
gccgctagca cgcctccgcc gcctccaccg ccgcctgttt gtattcccga atcggtacta 3540
cagctcattg cgcatctgtc tggtcgggcg gcggcggagc tgcccgtacc ggaaaccgtc 3600
acgacggcac cgttggtcgt acagaaggcg ccgtcggcag cgcgattggc tgctgtagcg 3660
aagtaccttg aaaagcggaa gcaccgaaac ttccaaaaga aggttcggta cgagagccgt 3720
aaacggctgg cggaggccag gcctcgcgta cgcggccaat tcgtcaaggc aagtacttcc 3780
gcggtggcgg caaccacccc tgccgccacg ggcgccaccg tcacctctac gtcgctccgt 3840
cagcccgttt atacggcggc cggcccggct ggcctggcgc tgccgccggc ggcggcagca 3900
gcggcggcca gcgccgccgc cgcgaggggg gttccgccgc cgtcatcccg catcggagcg 3960
gtggagctgg cggagttggt gcccgaccac gacgccgaca ttgaggacga ggggtgtgac 4020
gagcccgccg ccgccgagga ctccgacggg tccgtcgcgg tggagctggc ggaggtgtag 4080
<210>66
<211>3309
<212>DNA
<213> Chlamydomonas reinhardtii
<400>66
atggaggcta acggcttcca cgtcgtatta gtcgaggatg ataacatttg cctgaaagtg 60
gtggagcagc tgctgcggaa gctttcgtac agagtcagca ccgcatccga tggtgccgca 120
gcgctcaaag tcctggctga ctgcaagcag aggggcgaca aagtagacct cattctcacg 180
gacatcctga tgccagaggt taccgggttt gacctcatca acgaggtcgt gcatggagag 240
acctttgccg atattccggt cgtggttatg tcgtctcaag actcgcagga aagtgtcttg 300
caggcatttc aggcgggcgc agcggactac ctcatcaagc ccattcggaa aaatgagctt 360
gcaacgctct ggcagcacgt ctggcgtgca aaccgcgcca agggtggaca gaccagcagc 420
ggcgccgcgc atgtgggcgc aggcggcagg gggggcacca gcagccgcga tggcggtggc 480
gttgccggga cgcggtgcgg cccaggcgac cgcggcggca gcggcggcga cgctgagggt 540
agtgggctag gcggcggcgc gggtgcagtc aaggacagca gcggcggcag taccggcgcc 600
gccacttcag tgctgcactc cactggtggc acgacgctgc cctcacgtgc ggccaccggt 660
cggcacgcta gcacctcagc tggacacggc gtcaccagcg ctgaccccag caacaaccaa 720
acctcgcacg cgcacgcgca tgcgcatgcg cacgctcacg ggaacgcgca cgcgcacgcg 780
caccttcata tgcacggcgc aacagatcgt gcggcgcagg gcagcagcgc taacggcccg 840
gccaaccacg gggccgctgg gacagggctg cagtccgctg ggatggcagg ttccacggct 900
gcaggcgcgg ctgcgcccgc cggtgagtcg ctggccaagc cgcccttcgc ctccctagcc 960
gtccacttcg acctgcactc agtcctggcg ggcgcgggag cggctgcagc caatggtggc 1020
gccaatgccg cagctcacac tgctggcgcc accgggcgag agagcggcca ggcggcgggc 1080
gcggccacag gcggcattgc cgccgccggc accgtcatcg gctggtcgca tgcggacatg 1140
gacgtggacg gaggggaggc cggcgcgcag gatgaagatg acgaggacga ggacgacggc 1200
gtggaggcgc cggcgggcac acagaaccgg aagcgcgccg cggatgactc gggttgcgac 1260
ggcgccgccg ccaacaacaa cggcaacact gccgcaaagg ctggcgcagc ggcaatcgcc 1320
gcgggcgggc ctgggagctc gggcagggcg aaggccacgg acggcgcccg cgctgagatt 1380
cgccacaacg gtgggccgat ggcggcgcgg atggcggctg cagagggctc tcaaggctcg 1440
cgcgctgcat cgggctcggc ggcaacggga ccgggaggag cgcgggaggg cactgcgacg 1500
cctagcggcg acacctttgc ggagagccct tccaccttca cttccatcat caacaccacc 1560
ggctcgggca gcgaggccga cgagcagcca gtgccgctga agcaccagga acagcaacag 1620
cagcaacagc agcagcgggt cggcgagggt gacagggcga agcccgaacc gcacccacag 1680
aaccctgccc aggcagcaca cctgccgcac ccgtccgcgg ccccatgctc gggcggtggc 1740
ggtattgcgc aagcggccct acccctaggg ctacaggagc tggcagcgct gggggcggct 1800
cggcacaaag agctgtggac gcagcggcac cttatgcatc agcggcaggc ggcggcagcg 1860
gcgacagcag cggcggcctc ggcagctgct gcagcggcaa tgcccacggc cggcgcgagc 1920
gccgcggctc ctgcaggccc accttcggcg cggccctccg cttccttggc agacacgggc 1980
ggcgacggcc ccgcggctgc gacggcgcct gagacgcgcg cagatgggcc ctctggccct 2040
gccacgacgc agggccccaa acgagatgcc gtcgcaggtg ccgcggctgt cggcagctct 2100
gcacggagcg acagtccgct gccggcagcc gccgccgcga cggcaggcgc caacggcgcg 2160
agcggcgccg cttctgacgt gttggcgggc gcaggcagcc ttgcgcttct ccggcacagc 2220
gatcggtctg ccttcaccgc gttcacggtc ttcctgcccg ggcgtgttgc cgccgccgcg 2280
gccgctgcag cggccgccgc cgcagctgct accagcgcgg gcgccagcac cggcactgcc 2340
aacggggctc cgccggcacc gggcaccgct ctggctgccg ctgccgcagc agctgccgcc 2400
gctgcgtcag cagtgccgct gccgcatcca cacacagcgc ccccagcgct gttcggcgtc 2460
cctccgccgt cctccgtgcc tcccagctcg ctttctgtgc tacctcctgt gatgccgctc 2520
catccggccg ctgccgctgc agcggcgacg gcgggtgggg gcaagcccag cgacgcagcc 2580
acgtatgccg cggctgctgc agctggattg gtgccgtatc cagggtttgc gccggcgcgg 2640
ccggggccat ttccgccgcc gccaggttct ggtggccccg gcgcgccgcc tgtgtacata 2700
cccgagtcag tcctgcagct gattgcgcac ctgtccggcc gcgcggctgc ggaaattccg 2760
gcggtgccgg cggagtcagt gacggcagca ccggtggttg tgcagaagag cggcggccct 2820
gcctcggcgg cgcgactggc ggcagtggcc aagtacctgg agaagcggaa gcaccgcaat 2880
ttccagaaga aggtgcgcta cgagagccgc aagcggctcg ccgaggcccg gccacgcgtc 2940
agggggcagt tcgtcaaggc gggcaccgcg ggtgcagcgg cagcggcagc ggcagcggca 3000
gccgcagccg cagccggcac tgccgctact gctgccggca ccggcacggc cagaggtgct 3060
gccaccgctt ctggggctgc tgggaagccg gagctacagg gccccgacac ggcagaagag 3120
gctgcggctg cgacgctgct tagcgcagca gctgctatgg cagcagcggc tgcgggcacc 3180
agtggcccca gcggctctgg gtccggcgcg atggatgtgg acggtgccga cccggaagca 3240
gatgcagacg tcatggatga ggacgatggc gaagacgacg ggtcggacga gtccgctggg 3300
gagccctag 3309
<210>67
<211>1335
<212>DNA
<213> Gliocladium sp
<400>67
atggctgcag gcctcaagcg gatacccagc ttctcggggc gaccaggatt ccccaacggt 60
ctgcaggtgt tggttgtgga cggggacacc agcagcagcc agtgcttgcg gcagaagctg 120
gaggagctgg catatgaagt cagctgctgc tcgtccggat ctgacgcttc ggcgctcctg 180
cgcaaggagg actccagcta cgacattctc ctagttgagg ccaaagctct ggcaaaggat 240
gctactgatg gaggcagtct cagagattct gcagcgcacc tgccgctggt cctcatgtca 300
gaaaagagca gcagcacaga cgctgtatgg cgaggcatag agctcggggc agcggacgtt 360
ctggagaagc cgctgtcctc cttgaagctg cgcaacatct ggcaacatgt cgttcgcaag 420
atgatgagct cgtcccagga cagcagcagg gaggcggtgc cctgcaagat ggagccgaag 480
agcaagggca agggcgtgtc agcgccctcc agccctcgca ctccctcccc tgcagcctcc 540
ctcctcacca tcagcagcgg cacgatgaca gagaagagct gcaagggcgg cggcgatgag 600
gcctccttct caggtgtggg agatgtgaag atgtcctgct cggcagaggc gccggagccc 660
tgcgattcgc gcgcgaccgc tgagtcaccc gccagcacgc agaccaaggt cacgttcccg 720
gggtgcttga atagcggcgg cacggcgctc gcggctagca agaattgcag ccgcaagaga 780
aaggcaaagg cgccggacac tcctgcatcg gtggcgagcc ggccgcctct ggccatcagg 840
ccccccgcat gggcctcccc atttggtccc ccccaccagg gcaacaccca cgtcgtcggc 900
atggccccgc cacagtgcta tatgcagggg gttgacccca cgaacgggtg cgtatggggc 960
acgccagcag ggggcgtcag ccaagcgcca gcctacatgc ccggctgggg cttctcgccg 1020
cagccaatgc tttccggcag cttcttgcag catccctcca ccagcgacct gcacaagtgc 1080
cccagcgtgg gtgccagcag cctggcaagc agcctggaca gcagcctgac gctgtgcggc 1140
tttggcgcgg acctgcctga cgacgatctc ctgttggagg acgtgcttct gccggacgag 1200
gatcttctgg acttggcccc agatgagccc gccaccatga aggcccccga gcagccgccc 1260
atcggcctca agctcaagaa gtccgcttca ctcatcgacc tcatcaatgc gcaactgtcc 1320
gccgccaccg cctga 1335
<210>68
<211>568
<212>PRT
<213> genus Chlorella
<400>68
Met Leu Arg Gln Gln Leu Leu His Ser Gly Arg Gln Pro Gly Ala Thr
1 5 10 15
Cys Ser Leu Leu Thr Cys Ser Thr Trp Arg Pro Ser Ala Leu Phe Gly
20 25 30
Arg Pro Lys Pro Gln Lys Leu His Ser Gln Arg Leu Gln His Gln Gly
35 40 45
Arg Pro Ser Arg Leu Val Val Arg Ser Ala Met Phe Asp Asn Leu Ser
50 55 60
Arg Ser Leu Glu Arg Ala Trp Asp Met Val Arg Lys Asp Gly Arg Leu
65 70 75 80
Thr Ala Asp Asn Ile Lys Glu Pro Met Arg Glu Ile Arg Arg Ala Leu
85 90 95
Leu Glu Ala Asp Val Arg Leu Gly Ala Pro Leu Ile Arg Phe Leu Val
100 105 110
Ser Thr Pro Pro Pro Ser Gln Val Ser Leu Pro Val Val Arg Lys Phe
115 120 125
Val Lys Ala Val Glu Glu Lys Ala Leu Gly Ser Ala Val Thr Lys Gly
130 135 140
Val Thr Pro Asp Gln Gln Leu Val Lys Val Val Tyr Asp Gln Leu Arg
145 150 155 160
Glu Leu Met Gly Gly Gln Gln Glu Gly Leu Val Pro Thr Ser Pro Glu
165 170 175
Glu Pro Gln Val Ile Leu Met Ala Gly Leu Gln Gly Thr Gly Lys Thr
180 185 190
Thr Ala Ala Gly Lys Leu Ala Leu Phe Leu Gln Lys Lys Gly Gln Lys
195 200 205
Val Leu Leu Val Ala Thr Asp Ile Tyr Arg Pro Ala Ala Ile Asp Gln
210 215 220
Leu Val Lys Leu Gly Asp Arg Ile Gly Val Pro Val Phe Gln Leu Gly
225 230 235 240
Thr Gln Val Gln Pro Pro Glu Ile Ala Arg Gln Gly Leu Glu Lys Ala
245 250 255
Arg Ala Glu Gly Phe Asp Ala Val Ile Val Asp Thr Ala Gly Arg Leu
260 265 270
Gln Ile Asp Gln Ser Met Met Glu Glu Leu Val Gln Ile Lys Ser Thr
275 280 285
Val Lys Pro Ser Asp Thr Leu Leu Val Val Asp Ala Met Thr Gly Gln
290 295 300
Glu Ala Ala Gly Leu Val Lys Ala Phe Asn Asp Ala Val Asp Ile Thr
305 310 315 320
Gly Ala Val Leu Thr Lys Leu Asp Gly Asp Ser Arg Gly Gly Ala Ala
325 330 335
Leu Ser Val Arg Gln Val Ser Gly Arg Pro Ile Lys Phe Val Gly Met
340 345 350
Gly Glu Gly Met Glu Ala Leu Glu Pro Phe Tyr Pro Glu Arg Met Ala
355 360 365
Ser Arg Ile Leu Gly Met Gly Asp Val Val Thr Leu Val Glu Lys Ala
370 375 380
Glu Glu Ser Ile Lys Glu Glu Glu Ala Gln Glu Ile Ser Arg Lys Met
385 390 395 400
Leu Ser Ala Lys Phe Asp Phe Asp Asp Phe Leu Lys Gln Tyr Lys Met
405 410 415
Val Ala Gly Met Gly Asn Met Ala Gln Ile Met Lys Met Leu Pro Gly
420 425 430
Met Asn Lys Phe Thr Glu Lys Gln Leu Ala Gly Val Glu Lys Gln Tyr
435 440 445
Lys Val Tyr Glu Ser Met Ile Gln Ser Met Thr Val Lys Glu Arg Lys
450 455 460
Gln Pro Glu Leu Leu Val Lys Ser Pro Ser Arg Arg Arg Arg Ile Ala
465 470 475 480
Arg Gly Ser Gly Arg Ser Glu Arg Glu Val Thr Glu Leu Leu Gly Val
485 490 495
Phe Thr Asn Leu Arg Thr Gln Met Gln Ser Phe Ser Lys Met Met Ala
500 505 510
Met Gly Gly Met Gly Met Gly Ser Met Met Ser Asp Glu Glu Met Met
515 520 525
Gln Ala Thr Leu Ala Gly Ala Gly Pro Arg Pro Val Pro Ala Gly Lys
530 535 540
Val Arg Arg Lys Lys Leu Ala Ala Ala Gly Gly Ser Arg Gly Met Ala
545 550 555 560
Glu Leu Ala Ser Leu Lys Ala Glu
565
<210>69
<211>23
<212>DNA
<213> genus Chlorella
<400>69
gggacatggt gcgcaaggac ggg 23
<210>70
<211>2667
<212>DNA
<213> genus Chlorella
<400>70
atggccaaac tgacatccgc tgttcctgtg ttgacagcaa gagatgttgc aggtgcagtg 60
gagttttgtg agttctgaga agctgattgt tgtttaactt ctttgaaagc tttatcgaag 120
attctgcaag cgatgaacat tgcttgtcaa gaccgagagc tgcatgccca cttgacatcc 180
agctttgaac ggctcttcat gtttgatttg tttctgattg tagggacaga tagactgggg 240
tttagcaggg actttgtgga ggacgatttt gcaggagtgg tgagggatga tgtgacactg 300
tttatctcag cagtgcagga tcaagtgagt gcagcgtcag ctgtggcagt tgttggcttt 360
cgtctcagtc agtagtttgc tgggattgat tatggagggc acagttgcaa ttttgagttg 420
cacgttgcga caagcgtgtt gacaaagcgt ggtcaagccg gccagtcttg ccggtggcgg 480
gtggcttggt ctaacttccg ctctacagca atcgttttgt tcatggttac ggggctggcg 540
tgccagaaag tcctggtcag ccaccctcgc ttcaaagccg tagcccaaca actttgcgaa 600
tatgttcgat ttgcaggtgg tgcccgataa tacactggca tgggtttggg tgagaggtac 660
agctctgcgt gcaacaggtt gcaagatgca gcgcaggtct tccctggtca aacgatgtat 720
gcagagttga gaggcacttg agctgggtga atggcgtggg ctcgtaggta gtgtgcaggg 780
caggaagggc agccaatttt ggagttgtgg tccggtgtcg ttgcttcgag ccttattagg 840
actcttgctc atcaaagcgt tagttgtgaa taagttgatc tgaaaggatg ttatgtacag 900
caagcagcag cagttaagag tctggggagt agctgcacag ggcgaggtgt caagatggga 960
agggtcctgc ctccttatgt gtttttccct gtaggggagg aagcctctta tgggcaatgg 1020
ttgggcatat tttccagcca gcccttcttt ctataggggc cagggtgggc ccagctcgtc 1080
ttggcttcca ccaccaggag agtgagggca ttgaagggcc ataaatagtc ctcccatcta 1140
cgtgcaccag agggtgtcgt ctaggctgtg catgccacga ggggaaggag ccaagaatga 1200
gtgtatgggt tgttttcatg tttaggctgg gataaaactg ttttcaattg cgcctgccgg 1260
gtgaaaacca cagcagcatc agcaagcttg gagaaggcca gcccgcccag cacaggctca 1320
cgttcccact caggcggtca gtcgggcggg ggtgtgagtc aggcaggcga gggtgtctgt 1380
gcctgacatc agcacctctg cttagccact gcagcccctg gagcagggta gggcgtcatt 1440
tgcagcaatc acctgctgcc tcacacgtcg cagcttggaa tttcaacgac catcagcgct 1500
ggggttgttg agggatcata gcagattttg gtgcagcctg gttgtcatgc tctttgtgga 1560
atggcctcta tgttcgagca attcgttgga tgttgaggtg cttggggaca gagagtcgaa 1620
tgatgggcca gggtcaaaca tgcgagcgtt tggctgagtc agcggttttt gctggtcact 1680
ttttcttttg tttcttattt aggtttgatg gatgtgtttt gtgctgctgc cctgaagctg 1740
cagcagcgtg tctgccctgc gctactgcgg gcaccaaggc tatgtgctgg tgcactcggc 1800
tgcgctgcac ctgtgcacct cgcactccgt ccagcctcca tgcagcacac gtactcacgg 1860
tgtcctcctg acctgtcgta cgctattcca aacttgctct tttgctgccg ctgctctcgt 1920
acacaattgc tgttgattat cgatatctaa tcgagcgcct gctgactgaa ctccgcaggt 1980
ttggatgaac tgtatgcaga gtggtctgaa gtggtgagca ccaactttag gtgggtgggc 2040
tctgaaggag gaggagggag cgggtgatta aacagggcct gcatgaagag gagcaggggc 2100
tgcatggaca gcagggggaa ggtgcagaag ggagggtcaa gcggggttca ggtggctgtg 2160
ggtttctgca cgagcagtga aagaagctgt atccttccac ctgctttcac tggcgaaagg 2220
ttgaaaacag gatgtcgcag ctggaaagat gttgcgctgt caagtgcaag ccatggttga 2280
gggtatgcct gtgtgcatgt gcttcttaaa gttactcctg ttctatggtt ctgggtgctt 2340
gttgtttgtg gtgcagggat gcaagcggac ctgcaatgac agagattgga gaacaacctt 2400
ggggaaggga gtttgcattg agagatcctg caggtgaggg ggcatgtaag caatggcagg 2460
caattcaaga acgaatcatt gctgcaaatg ctgggatggt atgcagctga ggtatctatt 2520
gccttgtatt ttgtctcgca ttgcatcggt ggtgcgttct gtggcctgag gcacagttct 2580
tgctgtttga taagggttcg actgagttgt cgtgtgtgct gtgctgcagg caattgcgtg 2640
cactttgttg cagaagaaca ggactga 2667
<210>71
<211>530
<212>DNA
<213> genus Chlorella
<400>71
ccaccatggg ggaggtttga agtgtgcgcc tgatataatc atacacctaa aagcaccact 60
tgctgattgt gaagggacta tgtcgtttat gacgggacgt tacgctggcc gatggtttga 120
atttggacgc tgtggtagaa tgttatatgg acgtaaaggt tggcatattg aaaatcgtct 180
tcgcaggcaa acttctagac gtgtgaccca ccggtaaaac gacaagcgtg gcgcgtcgat 240
tgcgctttga acgtcgtttg ttggactcca gatgaacctc aaaatcaaag cggtgattga 300
cgaaaatcaa atgacagccc gcaaaatttc atcagccttc ggatcggatt ctcagaatct 360
gattgtccct gctggctaca tttatgaaat ttcgtacatt ttggcagaaa tgtcccaata 420
ccatagcact gccgcctgag ctcacccgag caatgcatac tgggtacctc gcccatctcg 480
ccctctttcc aagcccagtg ctgttgtaat agccaaaggg ctcagtaaca 530
<210>72
<211>546
<212>DNA
<213> genus Chlorella
<400>72
gcatagcatc agcctgtggc agggttgtgg tagggctgag tggcagggtt aaaggggttg 60
cctaccccac ccctactctc atgacaccag caacagcagc agctcatgca gtactcaaat 120
cactgatgtc aatggtgtga cacatttggt taaggctgct ttttaaagtg ctgctttggg 180
ggcagtgact gtgcagagct tggagcgtat ccccatgtaa tcagaaccga cgagagttcg 240
gggcaacctt tcatcttcac attttttgtg atcagctaca gagtctgaaa tcaaatagag 300
gctgccatct aaacgcagga gtcacaacga aggcgaaaac tccaattgct gtactcaatg 360
cactaagtga ttgttcaatg gataaataca ctatgctcaa ttcatgccag cagagctgct 420
ccttccagcc agctacaatg gctttttcca cgccttttga agtatgaatg ttcagcttgc 480
tgtgcttgat gcatcaccat aaacacaatt ctacaacatt tcatgccaac aacagtacgg 540
gctttc 546
<210>73
<211>23
<212>DNA
<213> genus Chlorella
<400>73
tgcggtgaag cttggagctg tgg 23
<210>74
<211>23
<212>DNA
<213> genus Chlorella
<400>74
acaccacctt aaggcacatg agg 23
<210>75
<211>549
<212>PRT
<213> Chlamydomonas reinhardtii
<400>75
Met Gln Thr Ala Leu Arg Ala Arg Ser Ala Ala Pro Arg Gly Ala Cys
1 5 10 15
Asn Arg Thr Ala Val Ala Pro Val Ala Ser Ala His Leu Arg Gly Gln
20 25 30
Tyr Ala Pro Phe Ser Gly Ala Gln Ala Arg Pro Ala Leu Gly Arg Gln
35 40 45
Arg Gln Gln Gln Gln Gln Gln Arg Arg Gly Ala Leu Val Ile Arg Ser
50 55 60
Ala Met Phe Asp Ser Leu Ser Arg Ser Ile Glu Lys Ala Gln Arg Leu
65 70 75 80
Ile Gly Lys Ser Gly Thr Leu Thr Ala Glu Asn Met Lys Glu Pro Leu
85 90 95
Lys Glu Val Arg Arg Ala Leu Leu Glu Ala Asp Val Ser Leu Pro Val
100 105 110
Val Arg Arg Phe Ile Lys Lys Val Glu Glu Arg Ala Leu Gly Thr Lys
115 120 125
Val Arg Glu Gly Arg Ala Met Gly Thr Lys Trp Lys Ser Val Val Asn
130 135 140
Cys Pro Leu Gln Asp Gly Leu Gly Asn Arg Gly Val Gly Arg Ala Arg
145 150 155 160
Thr Glu Val Gly His Arg Ala Ala Cys Val His Gly Ala Arg Gly Val
165 170 175
Gly Lys Thr Thr Ala Ala Gly Lys Leu Ala Leu Tyr Leu Lys Lys Ala
180 185 190
Lys Lys Ser Cys Leu Leu Val Ala Thr Asp Val Tyr Arg Pro Ala Ala
195 200 205
Ile Asp Gln Leu Val Lys Leu Gly Ala Ala Ile Asp Val Pro Val Phe
210 215 220
Glu Met Gly Thr Asp Val Ser Pro Val Glu Ile Ala Lys Lys Gly Val
225 230 235 240
Glu Glu Ala Arg Arg Leu Gly Val Asp Ala Val Ile Ile Asp Thr Ala
245 250 255
Gly Arg Leu Gln Val Asp Glu Gly Met Met Ala Glu Leu Arg Asp Val
260 265 270
Lys Ser Ala Val Arg Pro Ser Asp Thr Leu Leu Val Val Asp Ala Met
275 280 285
Thr Gly Gln Glu Ala Ala Asn Leu Val Arg Ser Phe Asn Glu Ala Val
290 295 300
Asp Ile Ser Gly Ala Ile Leu Thr Lys Met Asp Gly Asp Ser Arg Gly
305 310 315 320
Gly Ala Ala Leu Ser Val Arg Glu Val Ser Gly Lys Pro Ile Lys Phe
325 330 335
Val Gly Val Gly Glu Lys Met Glu Ala Leu Glu Pro Phe Tyr Pro Glu
340 345 350
Arg Met Ala Ser Arg Ile Leu Gly Met Gly Asp Val Leu Thr Leu Tyr
355 360 365
Glu Lys Ala Glu Ala Ala Ile Lys Glu Glu Asp Ala Gln Lys Thr Met
370 375 380
Glu Arg Leu Met Glu Glu Lys Phe Asp Phe Asn Asp Phe Leu Asn Gln
385 390 395 400
Trp Lys Ala Met Asn Asn Met Gly Gly Leu Gln Met Leu Lys Met Met
405 410 415
Pro Gly Phe Asn Lys Ile Ser Glu Lys Gln Leu Tyr Glu Ala Glu Lys
420 425 430
Gln Phe Gly Val Tyr Glu Ala Ile Ile Gly Ala Met Asp Glu Glu Glu
435 440 445
Arg Ser Asn Pro Glu Val Leu Ile Lys Asn Leu Ala Arg Arg Arg Arg
450 455 460
Val Ala Gln Asp Ser Gly Lys Ser Glu Ala Glu Val Thr Lys Leu Met
465 470 475 480
Ala Ala Tyr Thr Ser Met Lys Ala Gln Val Gly Gly Met Ser Lys Leu
485 490 495
Leu Lys Leu Gln Lys Ala Gly Ala Asp Pro Gln Lys Ala Asn Ser Leu
500 505 510
Leu Gln Glu Leu Val Ala Ser Ala Gly Lys Lys Val Ala Pro Gly Lys
515 520 525
Val Arg Arg Lys Lys Glu Lys Glu Pro Leu Ser Lys Ala Arg Gly Phe
530 535 540
Gly Ser Ser Ser Lys
545
<210>76
<211>559
<212>PRT
<213> Micromonas pusilla
<400>76
Met Arg His Leu Leu Ser Ser Ala Ser Ile Arg Gln Tyr Asp Lys Trp
1 5 10 15
Ser Leu Val Ser Ser His Ala Lys Lys Pro Ala Leu Val Cys Ala Ser
20 25 30
Lys His Thr Lys Ser Ala Val Lys Leu Gln Cys Thr Ser Arg Gly Ser
35 40 45
Ser Asn Arg Thr Ile Gln Leu Leu Leu Phe Gln Gln Phe Arg Pro Ala
50 55 60
Lys Arg Gly Lys Leu Leu Ile Thr Arg Ala Asp Ser Phe Gly Thr Leu
65 70 75 80
Ser Glu Arg Leu Asn Ser Ala Trp Ser Ala Leu Lys Asp Glu Asp Asp
85 90 95
Leu Ser Val Glu Asn Ile Ser Leu Pro Leu Lys Asp Ile Arg Arg Ala
100 105 110
Leu Leu Glu Ala Asp Val Ser Leu Pro Val Val Arg Arg Phe Ile Lys
115 120 125
Ser Val Glu Glu Lys Ser Ile Gly Val Lys Val Thr Lys Gly Val Ser
130 135 140
Ala Ser Gln Gln Leu Thr Lys Val Val Ala Asp Glu Leu Cys Glu Leu
145 150 155 160
Met Gly Gly Phe Gly Gly Asp Lys Leu Ile Phe Arg Lys Glu Gly Glu
165 170 175
Gly Pro Thr Val Ile Leu Met Ala Gly Leu Gln Gly Val Gly Lys Thr
180 185 190
Thr Ala Cys Gly Lys Leu Ala Leu Phe Leu Lys Ala Gln Gly Lys Gln
195 200 205
Ser Leu Leu Val Ala Thr Asp Val Tyr Arg Pro Ala Ala Ile Asp Gln
210 215 220
Leu Lys Lys Leu Gly Glu Gln Ile Asp Val Pro Val Phe Glu Leu Gly
225 230 235 240
Thr Asp Phe Ser Pro Pro Asp Ile Ala Arg Ser Gly Val Glu Lys Ala
245 250 255
Lys Leu Glu Asn Phe Asp Val Val Ile Val Asp Thr Ala Gly Arg Leu
260 265 270
Gln Val Asp Glu Met Leu Met Ala Glu Leu Leu Ala Thr Lys Ala Ala
275 280 285
Thr Arg Ala Asp Glu Thr Leu Leu Val Val Asp Ala Met Thr Gly Gln
290 295 300
Glu Ala Ala Ser Leu Thr Ala Ala Phe Asn Asp Ala Val Gly Ile Thr
305 310 315 320
Gly Ala Val Leu Thr Lys Met Asp Gly Asp Thr Arg Gly Gly Ala Ala
325 330 335
Leu Ser Val Arg Glu Val Ser Gly Lys Pro Ile Lys Phe Ile Gly Ser
340 345 350
Gly Glu Lys Leu Asp Ala Leu Glu Pro Phe Phe Pro Glu Arg Met Thr
355 360 365
Thr Arg Ile Leu Gly Met Gly Asp Val Val Ser Leu Val Glu Arg Ala
370 375 380
Gln Val Ala Val Lys Glu Glu Gln Ala Asn Leu Met Arg Asp Lys Ile
385 390 395 400
Leu Ser Ala Thr Phe Asp Phe Asn Asp Phe Leu Ser Gln Leu Glu Met
405 410 415
Met Gly Lys Met Gly Gly Met Gly Gly Leu Thr Lys Met Met Pro Gly
420 425 430
Met Asn Thr Met Ser Asp Lys Glu Leu Gln Asp Ala Glu Lys Ser Leu
435 440 445
Ser Val Ala Lys Ser Leu Ile Met Ser Met Thr Pro Arg Glu Arg Gln
450 455 460
Phe Pro Asp Leu Leu Val Ala Gly Ser Ser Ala Ala Ser Arg Arg Gly
465 470 475 480
Arg Val Val Glu Gly Ser Gly Arg Ser Asp Lys Asp Leu Ala Asn Leu
485 490 495
Ile Val Met Phe Gly Ser Met Arg Val Lys Met Gln Ser Leu Ser Ala
500 505 510
Gln Met Asn Gly Thr Ala Lys Glu Val Gly Leu Val Pro Gln Leu Ser
515 520 525
Glu Val Asp Leu Asn Lys Leu Ala Phe Glu Gly Val Gly Lys Arg Val
530 535 540
Ser Pro Gly Met Val Arg Arg Arg Lys Leu Asn Ala Ser Phe Gly
545 550 555
<210>77
<211>568
<212>PRT
<213> Micromonas sp.
<400>77
Met Glu Ala Arg Thr Lys Gln Ala Arg Ala Pro Lys Gly Ser Ile Trp
1 5 10 15
Cys Ala Gln Arg Ala Arg Lys Asp Leu Arg Ala Arg Gly Cys Arg Gly
20 25 30
Leu Gly Ser Arg Ile Ser Lys Gly Gln Pro Phe Ser Pro Leu Thr Leu
35 40 45
Ser Thr Pro Ala Val Thr Glu Ile Gly Phe Gly Thr Leu Leu Tyr Gly
50 55 60
Ser Arg Leu Ser Ala Gly Gly Ser Arg Arg Gly Glu Thr Met Leu Arg
65 70 75 80
Arg Ala Ser Ala Phe Gly Ser Leu Thr Glu Arg Leu Asn Ser Val Trp
85 90 95
Ala Thr Leu Lys Asp Glu Asp Asp Leu Ser Leu Glu Asn Ile Lys Gly
100 105 110
Pro Leu Lys Asp Ile Arg Arg Ala Leu Leu Glu Ala Asp Val Ser Leu
115 120 125
Pro Val Val Arg Arg Phe Ile Lys Asn Ile Glu Gln Lys Ala Ile Gly
130 135 140
Thr Arg Val Thr Lys Gly Val Asn Ala Gly Gln Gln Leu Thr Lys Val
145 150 155 160
Val Ala Asp Glu Leu Cys Glu Leu Met Gly Gly Phe Gly Gly Asp Ser
165 170 175
Leu Ala Phe Lys Asp Pro Ser Met Gly Pro Thr Val Ile Leu Met Ala
180 185 190
Gly Leu Gln Gly Val Gly Lys Thr Thr Ala Cys Gly Lys Leu Ala Leu
195 200 205
Tyr Leu Lys Lys Gln Gly Lys Asp Ser Leu Leu Val Ala Thr Asp Val
210 215 220
Tyr Arg Pro Ala Ala Ile Glu Gln Leu Lys Arg Leu Gly Glu Gln Val
225 230 235 240
Lys Thr Pro Val Phe Asp Met Gly Val Arg Val Asp Pro Pro Glu Val
245 250 255
Ala Arg Leu Gly Leu Glu Lys Ala Arg Ala Glu Gly Ile Asp Val Val
260 265 270
Ile Ile Asp Thr Ala Gly Arg Leu Gln Val Asp Val His Leu Met Glu
275 280 285
Glu Leu Arg Ala Thr Lys Ile Ala Thr Ala Ala Asp Glu Ile Leu Leu
290 295 300
Val Val Asp Ala Met Thr Gly Gln Glu Ala Ala Ala Leu Thr Ala Ala
305 310 315 320
Phe Asp Glu Ala Val Gly Ile Thr Gly Ala Val Leu Thr Lys Met Asp
325 330 335
Gly Asp Thr Arg Gly Gly Ala Ala Leu Ser Val Arg Glu Val Ser Gly
340 345 350
Lys Pro Ile Lys Phe Thr Gly Val Gly Glu Lys Met Glu Ala Leu Glu
355 360 365
Pro Phe Tyr Pro Glu Arg Met Ala Ser Arg Ile Leu Gly Met Gly Asp
370 375 380
Val Val Thr Leu Val Glu Arg Ala Gln Gln Val Val Lys Asn Glu Glu
385 390 395 400
Ala Glu Gln Met Arg Asp Lys Ile Leu Ser Ala Thr Phe Asp Phe Asn
405 410 415
Asp Phe Ile Lys Gln Met Glu Met Met Gly Gln Met Gly Gly Met Asp
420 425 430
Gly Phe Met Lys Leu Leu Pro Gly Met Ser Gly Met Ser Glu Arg Glu
435 440 445
Met Gln Glu Ala Asp Lys Ser Leu Lys Val Ala Lys Ser Leu Ile Leu
450 455 460
Ser Met Thr Ser Lys Glu Arg Gln Phe Pro Asp Ile Leu Val Ala Gly
465 470 475 480
Ala Ser Ala Lys Ser Arg Arg Lys Arg Ile Ile Glu Gly Ala Gly Arg
485 490 495
Ser Glu Lys Asp Leu Ser Gln Leu Ile Val Leu Phe Gly Ser Met Arg
500 505 510
Val Lys Met Gln Lys Met Thr Ala Glu Ile Thr Gly Ala Ser Ala Glu
515 520 525
Val Gly Leu Thr Pro Gln Leu Ser Glu Glu Asp Met Asn Thr Leu Ala
530 535 540
Asn Glu Gly Leu Arg Lys Asn Val Ser Pro Gly Met Val Arg Arg Leu
545 550 555 560
Arg Ile Arg Arg Leu Thr Gly Ser
565
<210>78
<211>481
<212>PRT
<213> tourmaline insect
<400>78
Met Phe Asp Glu Leu Ser Ala Arg Phe Glu Glu Ala Val Lys Ser Leu
1 5 10 15
Lys Gly Leu Ser Ala Ile Thr Glu Asn Asn Val Glu Asn Ala Leu Lys
20 25 30
Gln Val Arg Arg Ala Leu Ile Glu Ala Asp Val Ser Leu Val Val Val
35 40 45
Lys Glu Phe Met Glu Glu Val Arg Ser Lys Ser Ile Gly Ile Glu Val
50 55 60
Val Arg Gly Ile Lys Pro Asp Gln Lys Phe Ile Gln Val Val Tyr Glu
65 70 75 80
Gln Leu Ile Glu Ile Met Gly Ala Asn Asn Thr Pro Leu His Lys Gln
85 90 95
Ser His Thr Val Thr Val Val Leu Met Ala Gly Leu Gln Gly Ala Gly
100 105 110
Lys Thr Thr Ala Ala Ala Lys Leu Ala Leu Tyr Leu Lys Asn Gln Gly
115 120 125
Glu Lys Val Leu Met Val Ala Ala Asp Val Tyr Arg Pro Ala Ala Ile
130 135 140
Asp Gln Leu Phe Val Leu Gly Lys Gln Ile Asp Val Glu Val Phe Thr
145 150 155 160
Leu Asn Pro Glu Ser Ile Pro Glu Asp Ile Ala Ala Ala Gly Leu Gln
165 170 175
Lys Ala Ile Arg Glu Gly Phe Asp Tyr Leu Ile Val Asp Thr Ala Gly
180 185 190
Arg Leu Gln Ile Asp Thr Ala Met Met Gln Glu Met Val Arg Ile Arg
195 200 205
Ser Ala Val Asn Pro Asn Glu Ile Leu Leu Val Val Asp Ser Met Ile
210 215 220
Gly Gln Glu Ala Ala Glu Leu Thr Arg Ala Phe His Glu Gln Ile Gly
225 230 235 240
Ile Thr Gly Ala Val Leu Thr Lys Leu Asp Gly Asp Ala Arg Gly Gly
245 250 255
Ala Ala Leu Ser Ile Arg Lys Val Ser Gly Ala Pro Ile Lys Phe Ile
260 265 270
Gly Thr Gly Glu Lys Val Glu Ala Leu Gln Pro Phe His Pro Glu Arg
275 280 285
Met Ala Ser Arg Ile Leu Gly Met Gly Asp Ile Val Thr Leu Val Glu
290 295 300
Lys Ala Gln Glu Glu Val Glu Leu Ala Asp Val Glu Lys Met Gln Arg
305 310 315 320
Lys Leu Gln Glu Ala Ser Phe Asp Phe Ser Asp Phe Leu Gln Gln Met
325 330 335
Arg Leu Val Lys Arg Met Gly Ser Leu Gly Gly Leu Met Lys Met Ile
340 345 350
Pro Gly Met Asn Lys Ile Asp Ser Thr Met Leu Arg Glu Gly Glu Ala
355 360 365
Gln Leu Lys Arg Ile Glu Ser Met Ile Gly Ser Met Thr Pro Thr Glu
370 375 380
Arg Glu Lys Pro Glu Leu Leu Ala Ser Gln Pro Ser Arg Arg Gly Arg
385 390 395 400
Ile Ala Lys Gly Ser Gly His Lys Ile Ala Asp Val Asp Lys Met Leu
405 410 415
Val Asp Phe Gln Lys Met Arg Gly Phe Met Gln Gln Met Thr Lys Gly
420 425 430
Asn Asn Phe Ala Asn Pro Leu Ser Met Gly Ala Asn Met Phe Ser Gln
435 440 445
Pro Asn Met Thr Val Pro Gln Thr Lys Ile Ser Asn Thr Asn Glu Ser
450 455 460
Arg Met Arg Asn Ser Arg Ala Thr Lys Lys Lys Lys Gly Phe Gly Gln
465 470 475 480
Leu
<210>79
<211>498
<212>PRT
<213> Ostreococcus lucimarinus
<400>79
Met Thr Arg Ala Asp Ala Phe Ala Gly Met Ser Asp Lys Leu Asp Lys
1 5 10 15
Ala Trp Ala Arg Leu Gln Gly Glu Lys Asp Leu Asn Ala Asp Asn Val
20 25 30
Lys Ala Pro Leu Lys Asp Val Arg Arg Ala Leu Leu Glu Ala Asp Val
35 40 45
Ser Leu Pro Val Val Arg Arg Phe Ile Ala Arg Cys Glu Glu Lys Ala
50 55 60
Val Gly Met Lys Val Thr Lys Gly Val Glu Pro Gly Gln Met Leu Val
65 70 75 80
Lys Cys Val Ala Asp Glu Leu Cys Glu Leu Met Gly Gly Val Gly Ala
85 90 95
Glu Gly Ile Lys Phe Arg Asp Asp Gly Glu Pro Thr Val Val Leu Met
100 105 110
Ala Gly Leu Gln Gly Val Gly Lys Thr Thr Ala Cys Gly Lys Leu Ser
115 120 125
Leu Ala Leu Arg Lys Gln Gly Lys Ser Val Leu Leu Val Ala Thr Asp
130 135 140
Val Tyr Arg Pro Ala Ala Ile Asp Gln Leu Lys Thr Leu Gly Lys Gln
145 150 155 160
Ile Gly Val Pro Val Phe Asp Met Gly Val Asp Gly Asn Pro Pro Glu
165 170 175
Ile Ala Ala Arg Gly Val Arg Lys Ala Lys Asp Glu Asp Ile Asp Val
180 185 190
Val Ile Val Asp Thr Ala Gly Arg Leu Asn Ile Asp Glu Lys Leu Met
195 200 205
Gly Glu Leu Lys Ala Thr Lys Glu Ala Thr Ser Ala Asp Glu Thr Leu
210 215 220
Leu Val Val Asp Ala Met Thr Gly Gln Glu Ala Ala Thr Leu Thr Ala
225 230 235 240
Ser Phe Asn Glu Ala Val Glu Ile Thr Gly Ala Ile Leu Thr Lys Met
245 250 255
Asp Gly Asp Thr Arg Gly Gly Ala Ala Leu Ser Val Arg Glu Val Ser
260 265 270
Gly Lys Pro Ile Lys Phe Thr Gly Val Gly Glu Lys Met Asp Ala Leu
275 280 285
Glu Pro Phe Tyr Pro Glu Arg Met Thr Ser Arg Ile Leu Gly Met Gly
290 295 300
Asp Ile Val Ser Leu Val Glu Lys Val Gln Ala Gly Val Lys Glu Glu
305 310 315 320
Glu Ala Glu Lys Ile Lys Gln Lys Ile Met Ser Ala Thr Phe Asp Phe
325 330 335
Asn Asp Phe Val Gly Gln Leu Glu Met Met Asn Asn Met Gly Gly Met
340 345 350
Lys Gln Ile Met Gln Met Met Pro Gly Thr Ala Lys Leu Ser Glu Ala
355 360 365
Asp Met Glu Ala Ala Gly Lys Ser Met Thr Ile Ala Lys Ser Leu Ile
370 375 380
Asn Ser Met Thr Lys Glu Glu Arg Gln Tyr Pro Asp Met Leu Val Ala
385 390 395 400
Ser Thr Thr Ala Asp Ser Arg Arg Gln Arg Ile Val Lys Gly Ser Gly
405 410 415
Arg Thr Glu Ala Asp Leu Ala Gln Leu Ile Met Met Phe Gly Gly Met
420 425 430
Arg Thr Gln Met Gln Lys Met Ser Gly Gln Leu Gly Gly Gln Ala Gly
435 440 445
Asp Val Gly Leu Gln Pro Gln Leu Ser Glu Ala Glu Leu Ser Lys Leu
450 455 460
Ala Met Asn Lys Ile Arg Lys Thr Val Lys Pro Gly Met Val Arg Arg
465 470 475 480
Gln Lys Ala Lys Lys Val Pro Lys Phe Leu Ala Glu Arg Glu Ser Phe
485 490 495
Ser Gln
<210>80
<211>426
<212>PRT
<213> Ostreococcus tauri
<400>80
Met Lys Val Thr Lys Gly Val Glu Pro Gly Gln Met Leu Val Lys Ala
1 5 10 15
Val Ala Asp Glu Leu Cys Glu Leu Met Gly Gly Val Gly Ala Glu Gly
20 25 30
Ile Lys Phe Arg Asp Asp Gly Glu Pro Thr Val Ile Leu Met Ala Gly
35 40 45
Leu Gln Gly Val Gly Lys Thr Thr Ala Cys Gly Lys Leu Ser Leu Ala
50 55 60
Met Arg Lys Gln Gly Lys Thr Val Leu Leu Val Ala Thr Asp Val Tyr
65 70 75 80
Arg Pro Ala Ala Ile Asp Gln Leu Lys Thr Leu Gly Thr Gln Ile Gly
85 90 95
Val Pro Val Phe Asp Met Gly Val Asp Ala Ser Pro Pro Glu Val Ala
100 105 110
Ala Arg Gly Val Arg Lys Ala Lys Glu Glu Asp Ile Asp Val Val Ile
115 120 125
Val Asp Thr Ala Gly Arg Leu Asn Ile Asp Glu Lys Leu Met Ser Glu
130 135 140
Leu Lys Asp Thr Lys Leu Ala Thr Lys Ala Asp Glu Thr Leu Leu Val
145 150 155 160
Val Asp Ala Met Thr Gly Gln Glu Ala Ala Asn Leu Thr Ala Ser Phe
165 170 175
Gln Arg Gly Asp Gly Arg Arg Thr Arg Arg Gly Gly Ala Ala Leu Ser
180 185 190
Val Ala Arg Ser Phe Arg Lys Ala His Gln Phe Thr Ala Ser Val Lys
195 200 205
Met Asp Ala Leu Glu Pro Phe Tyr Pro Glu Arg Met Thr Ser Arg Ile
210 215 220
Leu Gly Met Gly Asp Ile Val Ser Leu Val Glu Lys Val Gln Ser Glu
225 230 235 240
Val Lys Glu Ala Glu Ala Glu Lys Leu Lys Glu Lys Ile Leu Lys Ala
245 250 255
Thr Phe Asp Phe Asn Asp Phe Val Thr Gln Leu Glu Met Met Asn Asn
260 265 270
Met Gly Ser Met Lys Gln Ile Met Gln Met Leu Pro Gly Thr Thr Lys
275 280 285
Leu Ser Glu Ser Glu Met Glu Ala Ala Glu Lys Ser Phe Lys Ile Ala
290 295 300
Arg Ser Leu Ile Asn Ser Met Thr Lys Glu Glu Arg Gln Phe Pro Asp
305 310 315 320
Met Leu Val Ala Ser Thr Thr Ala Glu Ser Arg Arg Ala Arg Ile Val
325 330 335
Lys Gly Ser Gly Arg Thr Glu Ala Asp Leu Ala Gln Leu Ile Ile Met
340 345 350
Phe Gly Ser Met Arg Gly Lys Met Gln Gln Leu Ser Gly Glu Leu Gly
355 360 365
Gly Glu Ala Gly Asn Val Gly Leu Gln Pro Gln Leu Ser Ala Ala Glu
370 375 380
Leu Glu Lys Leu Thr Thr Asn Lys Leu Arg Lys Asn Ile Lys Pro Gly
385 390 395 400
Met Val Arg Arg Leu Lys Ser Lys Lys Ile Pro Ile Ala Lys Asn Gly
405 410 415
Asp Arg Met Gly Ile Ser Ala Ser Ala Asp
420 425
<210>81
<211>510
<212>PRT
<213> Volvox carteri
<400>81
Met Ser Arg Pro Ala Ala Leu Arg Gly Ala Gly Asn Arg Lys Leu Thr
1 5 10 15
Ala Thr Val Thr Ala Ala His Leu Arg Gly Ile Ala Phe Thr Ser Ile
20 25 30
Arg Thr Cys Gln Gly Ala Lys Gly Gly Ser Leu Gly Leu Pro His Pro
35 40 45
Ser Pro Pro Leu Ala Leu Pro Arg Arg Gly Arg Gly Arg Gly Ala Ala
50 55 60
Val Val Val Arg Ala Ala Met Phe Asp Asn Leu Ser Lys Ser Leu Glu
65 70 75 80
Lys Ala Gln Arg Leu Ile Gly Gly Cys Glu Val Pro Gly Val Gly Val
85 90 95
Val Gly Lys Ser Gly Thr Leu Thr Ala Glu Asn Met Lys Glu Pro Leu
100 105 110
Lys Glu Val Arg Arg Ala Leu Leu Glu Ala Asp Val Ser Leu Pro Val
115 120 125
Val Arg Arg Phe Val Lys Lys Val Glu Glu Arg Ala Leu Gly Thr Lys
130 135 140
Val Ile Glu Gly Val Thr Pro Asp Val Gln Phe Ile Lys Val Val Ser
145 150 155 160
Asn Glu Leu Ile Glu Leu Met Gly Gly Gly Val Gly Ala Lys Asp Leu
165 170 175
Glu Pro Gly Phe Pro Gln Ile Ile Leu Met Ala Gly Leu Gln Gly Val
180 185 190
Gly Lys Thr Thr Ala Ala Gly Lys Leu Ala Leu Tyr Leu Lys Lys Ala
195 200 205
Lys Lys Ser Cys Leu Leu Val Ala Thr Asp Val Tyr Arg Pro Ala Ala
210 215 220
Ile Asp Gln Leu Val Lys Leu Gly Ala Ala Ile Asp Val Pro Val Phe
225 230 235 240
Glu Leu Gly Thr Gln Val Ser Gly Lys Pro Ile Lys Phe Val Gly Val
245 250 255
Gly Glu Lys Met Glu Ala Leu Glu Pro Phe Tyr Pro Glu Arg Met Ala
260 265 270
Ser Arg Ile Leu Gly Met Gly Asp Val Leu Thr Leu Tyr Glu Lys Ala
275 280 285
Glu Ala Ala Ile Lys Glu Glu Asp Ala Lys Ala Val Met Asp Arg Leu
290 295 300
Met Glu Glu Lys Phe Asp Phe Asn Asp Phe Leu Asn Gln Trp Lys Ser
305 310 315 320
Met Asn Asn Met Gly Gly Met Gln Ile Leu Lys Met Met Pro Gly Phe
325 330 335
Asn Lys Glu Arg Ser Asn Pro Glu Val Ile Ile Lys Ser Leu Ala Arg
340 345 350
Arg Arg Arg Val Ala Gln Asp Ser Gly His Ser Glu Ala Glu Val Ala
355 360 365
Lys Leu Met Thr Ala Tyr Thr Ala Met Arg Thr Gln Val Gly Gly Met
370 375 380
Ser Lys Leu Leu Lys Leu Gln Lys Ser Gly Gly Asp Pro Ser Gln Ala
385 390 395 400
Glu Lys Leu Leu Lys Glu Leu Val Ala Ser Ala Gly Lys Lys Val Ala
405 410 415
Pro Gly Lys Pro Pro Gly Asp Pro Ala Gly Ser Phe Ile Ser Thr Pro
420 425 430
Arg Thr Pro His Pro Pro Pro Gly Pro Leu Gly Pro Arg Ser Gln Val
435 440 445
Arg Arg Lys Lys Glu Lys Glu Pro Ile Ser Lys Ala Arg Gly Phe Gly
450 455 460
Ser Pro Ser Asn Phe Asn His Asp Leu Ser Pro Pro Gly Ser Ser Pro
465 470 475 480
Ala Ala Tyr Thr Tyr Thr Leu Ser Arg Leu Ser Cys Gln Arg Leu Cys
485 490 495
Asp Gly Gly Gly Leu Leu Asp Asp Trp Asn Leu Trp Arg Arg
500 505 510
<210>82
<211>448
<212>PRT
<213> Phaeodactylum tricornutum
<400>82
Met Ser Glu Ala Ser Ile Gln Pro Ala Leu Arg Glu Val Arg Arg Ala
1 5 10 15
Leu Leu Asp Ala Asp Val Asn Val Asp Val Ala Asp Thr Leu Ile Glu
20 25 30
Gly Val Arg Ala Arg Ser Leu Gly Gln Glu Val Leu Glu Gly Val Thr
35 40 45
Ala Glu Gln Gln Phe Val Lys Ala Met Tyr Asp Glu Leu Leu Asp Met
50 55 60
Met Gly Gly Asp Ser Ser Val Pro Met Ser Asp Gly Pro Ser Asn Val
65 70 75 80
Pro Val Ala Thr Leu Ala Ser Gly Thr Ala Ala Asp Pro Ala Val Ile
85 90 95
Leu Leu Ala Gly Leu Gln Gly Ala Gly Lys Thr Thr Ala Ala Gly Lys
100 105 110
Leu Ala Leu Phe Leu Lys Glu Gln Arg Lys Val Leu Leu Val Ala Ala
115 120 125
Asp Ile Tyr Arg Pro Ala Ala Ile Lys Gln Leu Gln Val Leu Gly Glu
130 135 140
Ser Ile Gly Val Glu Val Phe Thr Lys Gly Thr Asp Val Asp Pro Val
145 150 155 160
Glu Ile Val Asn Ala Gly Ile Gln Lys Ala Arg Asp Glu Gly Tyr Asp
165 170 175
Thr Val Ile Val Asp Thr Ala Gly Arg Gln Val Ile Asp Thr Asp Leu
180 185 190
Met Asp Glu Leu Gln Arg Met Lys Arg Ala Ala Ser Pro Gln Glu Thr
195 200 205
Leu Leu Ile Val Asp Ala Met Thr Gly Gln Glu Ala Ala Ser Leu Thr
210 215 220
Ala Ala Phe Asp Ser Ala Ile Gly Leu Thr Gly Ala Ile Leu Thr Lys
225 230 235 240
Met Asp Gly Asp Ser Arg Gly Gly Ala Ala Val Ser Val Arg Gly Val
245 250 255
Ser Gly Lys Pro Ile Lys Phe Val Gly Thr Gly Glu Lys Thr Ala Asp
260 265 270
Leu Glu Pro Phe Tyr Pro Asp Arg Met Ala Ser Arg Ile Leu Gly Met
275 280 285
Gly Asp Val Val Ser Leu Val Glu Lys Ala Ala Ser Glu Val Ser Asp
290 295 300
Ala Asp Ala Leu Lys Met Gln Gln Lys Met Leu Asp Ala Ser Phe Asp
305 310 315 320
Phe Asp Asp Phe Val Lys Gln Ser Glu Leu Val Thr Lys Met Gly Ser
325 330 335
Val Ala Gly Ile Ala Lys Leu Met Pro Gly Met Ala Asn Gln Leu Asn
340 345 350
Met Asn Gln Ile Arg Glu Val Glu Ala Arg Leu Lys Lys Ser Lys Ser
355 360 365
Met Ile Ser Ser Met Thr Lys Lys Glu Arg Ala Asn Pro Glu Leu Leu
370 375 380
Ile Lys Asp Ser Ser Ala Arg Ser Arg Leu Ile Arg Ile Thr Lys Gly
385 390 395 400
Ser Gly Cys Gly Leu Asp Glu Gly Gln Gln Phe Met Ser Glu Phe Gln
405 410 415
Arg Met Lys Thr Met Met Ser Thr Arg Arg Phe Trp Arg Phe Trp Leu
420 425 430
Met Ile Gln Ser Leu Ala Leu Ala Val Thr Arg Pro Glu Asn Thr Val
435 440 445
<210>83
<211>486
<212>PRT
<213> Thalassiosira pseudonana
<400>83
Met Phe Asp Gln Leu Ser Asn Ala Leu Thr Glu Val Ala Lys Asn Phe
1 5 10 15
Gly Gly Lys Gln Arg Met Thr Glu Asn Ser Ile Gln Pro Ala Leu Lys
20 25 30
Ser Val Arg Arg Ala Leu Leu Asp Ala Asp Val Asn Leu Asp Val Ala
35 40 45
Thr Ala Leu Ile Asp Gly Val Lys Arg Arg Ser Leu Gly Lys Glu Val
50 55 60
Thr Lys Gly Val Thr Ala Glu Gln Gln Phe Ile Lys Ala Met Tyr Asp
65 70 75 80
Glu Leu Leu Asp Met Met Gly Gly Glu Ala Asn Glu Ser Asn Thr Met
85 90 95
Ala Thr Leu Ala His Ser Ser Val Ala Asn Glu Pro Ala Val Ile Leu
100 105 110
Leu Ala Gly Leu Gln Gly Ala Gly Lys Thr Thr Ala Ala Gly Lys Leu
115 120 125
Ala Phe Arg Leu Pro Lys Arg Asn Arg Lys Val Leu Leu Val Ala Ala
130 135 140
Asp Val Tyr Arg Pro Ala Ala Ile Glu Gln Leu Gln Ile Leu Gly Lys
145 150 155 160
Gln Ile Gly Val Glu Val Phe Ser Met Gly Val Asp Ala Asp Pro Ala
165 170 175
Asp Ile Ala Lys Glu Ala Val Glu Lys Ala Lys Arg Glu Gly Phe Asp
180 185 190
Thr Val Val Val Asp Thr Ala Gly Arg Gln Val Val Asp Glu Glu Leu
195 200 205
Met Glu Glu Leu Arg Arg Val Lys Lys Thr Val Glu Pro Asp Glu Thr
210 215 220
Leu Leu Val Val Asp Ala Met Thr Gly Gln Ala Ala Ala Ser Leu Thr
225 230 235 240
Ala Ser Phe Asp Ala Ala Val Gly Ile Ser Gly Ala Ile Leu Thr Lys
245 250 255
Leu Asp Gly Asp Ser Arg Gly Gly Ala Ala Val Ser Ile Arg Gly Val
260 265 270
Ser Gly Lys Pro Ile Lys Phe Val Gly Val Gly Glu Lys Thr Asn Asp
275 280 285
Leu Glu Pro Phe Tyr Pro Asp Arg Met Ala Ser Arg Ile Leu Gly Met
290 295 300
Gly Asp Val Ile Ser Leu Val Glu Lys Ala Ser Met Glu Val Ser Asp
305 310 315 320
Ala Asp Ala Ala Lys Met Gln Glu Lys Met Ala Lys Ala Glu Phe Asp
325 330 335
Phe Asp Asp Phe Met Thr Gln Ser Arg Met Val Ser Lys Met Gly Ser
340 345 350
Met Ala Gly Val Ala Lys Met Leu Pro Gly Met Gly Asn Met Ile Asp
355 360 365
Ser Ser Gln Met Arg Gln Val Glu Glu Arg Ile Lys Arg Ser Glu Ala
370 375 380
Met Ile Cys Ser Met Asn Lys Lys Glu Arg Ala Asn Pro Gly Leu Leu
385 390 395 400
Leu Thr Asp Lys Ser Ala Arg Ser Arg Leu Met Arg Ile Thr Lys Gly
405 410 415
Ser Gly Leu Ala Phe Glu Asp Gly Leu Ala Phe Met Ser Glu Phe Gln
420 425 430
Lys Met Arg Thr Met Ile Ser Arg Met Ala Lys Gln Thr Gly Met Gly
435 440 445
Gln Pro Asp Gly Glu Gly Glu Met Glu Pro Ala Met Ala Gly Asn Arg
450 455 460
Asn Ala Arg Arg Ala Ala Lys Lys Lys Gly Lys Lys Gly Gly Arg Gly
465 470 475 480
Gly Gly Met Gly Phe Ala
485
<210>84
<211>530
<212>PRT
<213> Aureococcus anophagefferens
<400>84
Met Thr Met Ala Arg Arg Ala Ala Thr Ala Ala Leu Val Leu Ala Ala
1 5 10 15
Ala Trp Ala Phe Ala Pro Pro Gln Thr Lys Arg Ala Thr Thr Gln Leu
20 25 30
Tyr Phe Phe Asp Lys Leu Ala Glu Ser Ile Thr Ala Ala Thr Asp Val
35 40 45
Leu Ser Gly Lys Ser Arg Met Thr Glu Ala Asn Thr Lys Ser Ala Leu
50 55 60
Arg Asp Val Arg Arg Ser Leu Leu Asp Ala Asp Val Ala Lys Val Val
65 70 75 80
Val Asp Gly Phe Val Glu Asn Val Gln Ala Ser Ala Leu Asp Gly Glu
85 90 95
Val Ala Glu Gly Val Asp Pro Gly Gln Gln Phe Val Lys Ile Val Tyr
100 105 110
Asp Glu Leu Lys Arg Val Met Gly Gly Asp Asp Asp Glu Leu Leu Phe
115 120 125
Ser Asp Asp Pro Glu Ala Ala Ala Lys Ala Arg Ala Gly Leu Ala Tyr
130 135 140
Arg Asp Asp Gly Ala Pro Thr Val Val Leu Leu Cys Gly Leu Gln Gly
145 150 155 160
Ala Gly Lys Thr Thr Ala Ala Ala Lys Leu Ala Leu Arg Leu Lys Glu
165 170 175
Glu Glu Gly Lys Thr Pro Met Leu Val Ala Ala Asp Val Tyr Arg Pro
180 185 190
Ala Ala Val Glu Gln Leu Gln Ile Leu Gly Glu Gln Val Gly Val Pro
195 200 205
Val Tyr Ala Glu Ala Phe Glu Ala Gly Ala Gly Asp Ala Val Ala Ile
210 215 220
Ala Thr Ala Gly Val Arg Ala Ala Lys Glu Arg Gly Ala Asp Val Val
225 230 235 240
Ile Val Asp Thr Ala Gly Arg Gln Val Ile Glu Glu Ser Leu Met Ala
245 250 255
Glu Leu Arg Ser Val Arg Ala Ala Thr Lys Pro Asp Glu Thr Leu Leu
260 265 270
Val Leu Asp Ala Met Thr Gly Gln Asp Ala Ala Ser Leu Ala Lys Arg
275 280 285
Phe Asp Asp Ala Cys Pro Leu Thr Gly Ser Val Leu Thr Lys Leu Asp
290 295 300
Gly Asp Ala Arg Gly Gly Ala Ala Leu Ser Val Arg Ala Val Ser Gly
305 310 315 320
Lys Pro Ile Lys Phe Val Gly Val Gly Glu Lys Val Gly Asp Leu Glu
325 330 335
Pro Phe Phe Pro Ala Arg Met Ala Ser Arg Ile Leu Gly Met Gly Asp
340 345 350
Val Val Ser Leu Val Glu Lys Ala Ser Lys Gln Gln Ser Ala Ala Glu
355 360 365
Ala Lys Ala Val Met Glu Arg Thr Lys Gln Ala Lys Phe Asn Phe Asp
370 375 380
Asp Tyr Leu Asp Gln Ala Arg Met Val Ser Asn Met Gly Ser Phe Gly
385 390 395 400
Ala Val Ala Lys Met Met Pro Gly Met Gly Gly Ile Asp Asn Asp Gln
405 410 415
Ile Ala Ala Ala Glu Ala Lys Ile Lys Ile Gln Ala Ser Leu Ile Asn
420 425 430
Ser Met Thr Pro Lys Glu Arg Gly Glu Pro Asp Leu Ile Ile Arg Asp
435 440 445
Lys Ser Ala Leu Ala Arg Gln Lys Arg Ile Ala Ala Gly Ser Gly Arg
450 455 460
Ser Val Asp Gln Ala Lys Gln Phe Leu Ser Glu Phe Gln Gln Met Arg
465 470 475 480
Thr Met Met Ala Lys Met Ala Gly Gln Ala Pro Pro Asp Gly Ala Asp
485 490 495
Ala Ala Ala Ala Pro Asp Pro Asp Ala Leu Leu Asn Arg Ala Ala Arg
500 505 510
Arg Ala Lys Lys Lys Lys Gly Gly Lys Arg Lys Leu Lys Thr Ala Gly
515 520 525
Phe Gly
530
<210>85
<211>556
<212>PRT
<213> Ectocarpus siliculosus
<400>85
Met Ile Met Ala Ser Leu Lys His Arg Ser Pro Pro Arg Gly Gly Ala
1 5 10 15
Ala Ala Thr Leu Ser Phe Phe Cys Cys Val Cys Ala Leu Phe Ala Gln
20 25 30
Ser Ser Val Ala Phe Val Pro Ala Gly Gly Leu Ser Arg Cys Gly Val
35 40 45
Asn Asp Arg Ser Ser Ser Ser Cys Arg Ala Ala Ala Ile Gly Ala Ala
50 55 60
Gly Arg Ser Ser Leu Pro Val Ser Arg Ser Ser Ser Arg Arg Gly Arg
65 70 75 80
Arg Gly Gly Cys Ala Gly Gly Ala Ser Ser Pro Leu Gly Met Met Phe
85 90 95
Asp Thr Leu Ala Glu Asn Met Ala Gly Val Ala Asn Leu Phe Thr Gly
100 105 110
Gln Lys Thr Ile Thr Glu Ser Ser Val Glu Gly Ala Leu Asn Glu Val
115 120 125
Lys Arg Ala Leu Leu Asp Ala Asp Leu Asn Leu Met Val Thr Asn Thr
130 135 140
Leu Val Asp Ala Val Lys Ser Lys Ala Val Gly Met Lys Leu Val Asp
145 150 155 160
Gly Val Thr Ala Lys Gln Gln Phe Val Asn Val Met Asn Asp Glu Leu
165 170 175
Val Glu Ile Met Gly Ala Glu Gln Ala Pro Leu Ala Arg Arg Thr Asp
180 185 190
Gly Lys Pro Thr Val Ile Leu Leu Ala Gly Leu Gln Gly Thr Gly Lys
195 200 205
Thr Thr Ala Ala Ala Lys Leu Ala Lys Tyr Leu Gln Gln Glu Glu Glu
210 215 220
Pro Lys Lys Val Leu Leu Val Ala Gly Asp Val Tyr Arg Pro Ala Ile
225 230 235 240
Asp Gln Leu Ile Ser Leu Gly Lys Arg Ile Asp Val Glu Val Phe Ser
245 250 255
Met Gly Gln Gly Val Asp Pro Val Glu Ile Thr Lys Ala Gly Leu Glu
260 265 270
Arg Ala Val Glu Gly Glu Phe Asp Thr Val Ile Val Asp Thr Ala Gly
275 280 285
Arg Gln Val Val Asp Asp Thr Leu Met Thr Glu Leu Lys Asp Ile Gln
290 295 300
Val Ala Ser Glu Ala Asp Glu Val Leu Leu Val Val Asp Ala Met Thr
305 310 315 320
Gly Gln Glu Ala Ala Thr Leu Ala Ser Val Phe Asn Glu Lys Ile Gly
325 330 335
Ile Thr Gly Ala Val Leu Thr Lys Met Asp Gly Asp Thr Arg Gly Gly
340 345 350
Ala Ala Leu Ser Val Gln Gly Val Ser Gln Lys Pro Ile Lys Phe Val
355 360 365
Gly Ile Gly Glu Lys Met Ser Glu Glu Glu Ala Ala Lys Leu Ala Lys
370 375 380
Lys Met Ile Asn Ala Glu Phe Asp Phe Asn Asp Phe Leu Lys Gln Ala
385 390 395 400
Lys Met Met Lys Gly Met Gly Ser Leu Gly Gly Val Ala Asn Met Ile
405 410 415
Pro Gly Met Ala Gly Lys Ile Thr Pro Gln Gln Leu Asn Gln Ala Glu
420 425 430
Glu Gly Val Gln Arg Ala Glu Gly Leu Ile Lys Phe Met Thr Pro Glu
435 440 445
Glu Arg Arg Thr Pro Lys Leu Leu Ile Leu Asp Pro Thr Ser Gln Ala
450 455 460
Arg Cys Arg Arg Ile Ala Arg Asp Ala Gly Val Lys Leu Ser Ala Val
465 470 475 480
Ser Ala Phe Leu Lys Glu Phe Gln Ala Met Gln Ser Asn Met Ser Arg
485 490 495
Met Gly Lys Gln Met Ala Asp Gly Asp Pro Asn Ala Gly Pro Gly Gly
500 505 510
Gln Pro Ser Pro Phe Gln Gly Leu Gly Gly Asp Thr Ala Pro Gly Ala
515 520 525
Ala Pro Ser Met Asn Arg Gln Gln Arg Arg Gln Ser Lys Lys Asn Lys
530 535 540
Ala Gly Arg Ser Ala Ala Pro Ser Lys Gly Phe Gly
545 550 555
<210>86
<211>28452
<212>DNA
<213> Artificial sequence
<220>
<223> description of Artificial sequences-synthetic polynucleotides
<400>86
cgtctgatta aaccacgctg ggagattaga taatgaagcg tgcgcctgtt attccaaaac 60
atacgctcaa tactcaaccg gttgaagata cttcgttatc gacaccagct gccccgatgg 120
tggattcgtt aattgcgcgc gtaggagtaa tggctcgcgg taatgccatt actttgcctg 180
tatgtggtcg ggatgtgaag tttactcttg aagtgctccg gggtgatagt gttgagaaga 240
cctctcgggt atggtcaggt aatgaacgtg accaggagct gcttactgag gacgcactgg 300
atgatctcat cccttctttt ctactgactg gtcaacagac accggcgttc ggtcgaagag 360
tatctggtgt catagaaatt gccgatggga gtcgccgtcg taaagctgct gcacttaccg 420
aaagtgatta tcgtgttctg gttggcgagc tggatgatga gcagatggct gcattatcca 480
gattgggtaa cgattatcgc ccaacaagtg cttatgaacg tggtcagcgt tatgcaagcc 540
gattgcagaa tgaatttgct ggaaatattt ctgcgctggc tgatgcggaa aatatttcac 600
gtaagattat tacccgctgt atcaacaccg ccaaattgcc taaatcagtt gttgctcttt 660
tttctcaccc cggtgaacta tctgcccggt caggtgatgc acttcaaaaa gcctttacag 720
ataaagagga attacttaag cagcaggcat ctaaccttca tgagcagaaa aaagctgggg 780
tgatatttga agctgaagaa gttatcactc ttttaacttc tgtgcttaaa acgtcatctg 840
catcaagaac tagtttaagc tcacgacatc agtttgctcc tggagcgaca gtattgtata 900
agggcgataa aatggtgctt aacctggaca ggtctcgtgt tccaactgag tgtatagaga 960
aaattgaggc cattcttaag gaacttgaaa agccagcacc ctgatgcgac ctcgttttag 1020
tctacgttta tctgtcttta cttaatgtcc tttgttacag gccagaaagc ataactggcc 1080
tgaatattct ctctgggccc actgttccac ttgtatcgtc ggtctgataa tcagactggg 1140
accacggtcc cactcgtatc gtcggtctga ttattagtct gggaccacgg tcccactcgt 1200
atcgtcggtc tgattattag tctgggacca cggtcccact cgtatcgtcg gtctgataat 1260
cagactggga ccacggtccc actcgtatcg tcggtctgat tattagtctg ggaccatggt 1320
cccactcgta tcgtcggtct gattattagt ctgggaccac ggtcccactc gtatcgtcgg 1380
tctgattatt agtctggaac cacggtccca ctcgtatcgt cggtctgatt attagtctgg 1440
gaccacggtc ccactcgtat cgtcggtctg attattagtc tgggaccacg atcccactcg 1500
tgttgtcggt ctgattatcg gtctgggacc acggtcccac ttgtattgtc gatcagacta 1560
tcagcgtgag actacgattc catcaatgcc tgtcaagggc aagtattgac atgtcgtcgt 1620
aacctgtaga acggagtaac ctcggtgtgc ggttgtatgc ctgctgtgga ttgctgctgt 1680
gtcctgctta tccacaacat tttgcgcacg gttatgtgga caaaatacct ggttacccag 1740
gccgtgccgg cacgtgatcg cgcaggctca gctgcacagc agacgcaagg gacagctcag 1800
catctggaac cgccgacacc aggtgctgag gatgctgcac ctttggcaac cccaataggt 1860
gcttttgggc gtactgctgt gcctgcgcct agggacattg actggtggcg ggtcccgaag 1920
gagctgatgg gaagctacgc acaagctgaa gctggagaca gcagctccac caatgttgac 1980
ttctctgggg agcctccggc cagcagcgtg tacaacgaga ggggggacgc gttagtggag 2040
caggaggtga aggcagcaac ggcgggtgtg gattttgctg gcaggaggag ggccaggggg 2100
ttgttggaca atgctgagcc tcccgatctt gataatggcc ctggagagca gccagcaggg 2160
gcaacagtga gttggagctg gaggcgattc agggataggc agggataggc agtgtagtgg 2220
ccagaactgg ctgctggaac ctggggttac tcagggtgaa cgcaggcaaa ggggtgcagg 2280
tgtattgaag ctcttaatat aagagagatg cgtcgaacat atatggtgat agtcttgagt 2340
ggtgtgttgg gtggaaggct gctgtttacg gtgcaggaag ttttcctggt acggtcgtta 2400
tgtaatgcag cagcacgtat gtaagaacca gtcgacattt aacctatgca gtagcatata 2460
gttatgtgtc aaaatcataa attggcccta tttgtggcga gcctatcttt caaatactac 2520
tgttcctcgc actgtcctct aaaatttctc caaacaacgt tgtaaaggtg ctgatgttag 2580
catatcatct ctggcataac tggatgcacc cagccggcta acagtgggaa gatgaagagg 2640
ggcttgtaca ctacactttt cttgccaaga ctgttagctt gcccaagcca gcacagcgat 2700
tttcttgcaa caaacgtgag ccttgcatct gcttttgatt gcaacggccg actggtgagt 2760
tattgtgcaa gcagtgttct gctaaactgt tcagaccagg ttcgcagctg gctaagatcg 2820
gtatctggaa agctccaacg aacaggtttt caatacgtgc tgcgtcaata tgcccttctt 2880
gttcactaca gcgacctttc caatgttgga tgtgaacaaa tgtcgaagcg cacaataacc 2940
tgaaagacat tgttgctcat tcccttttct ttggtagcgt aggtttgtat atttagagtt 3000
ccagttctgt actagttgct ctgcggcaac gattgaagtg tgtaccttat actgcacgtt 3060
aaatatgata ggttcagcgc ggttctttaa atgacaaaat aaatagtatt caacaaaaaa 3120
aaatagttgt ttgacatgtc actttttctt ttacataggt agcatgtcgt caaatcgtca 3180
atgcaaacca gcttgcgact aacgtaagca gtacagggga tagtacaatg agtttttcac 3240
cagcaatttg gtccagtgtt ttcgcaccgc cgtgaagcgc attcacatta aagtagcatc 3300
gctacacctg ttctcatctt gttaggttca aattttgcaa cgtgtagcta caaagtggca 3360
acagcgcagg ctgttggtca ctcgctaagg cttgcattgg caccctcgtt gctctgtgta 3420
ggagcgtgca tttgtgctca agactgttat ttttgacttc aaaaacttta tcgatagcgc 3480
actgcctcgt ttttacaaga tagccttctg tgagcagccc tgccccatgc gcctttaggc 3540
tttctgtggc aatgtctggt tcagctggat cgggccaggc tactctcaga catgacggtg 3600
gctctgctgg cggcagtggg cctgtctcag acggtttttc accggccggc ctgaaggtaa 3660
agtagaaaga cactcataca catcttggtt cggcgttgaa agtaggtcat taacatactc 3720
tataaccaat atttgtaggt tctggtcgtg gacgacgagt taaccttgga aatccctacc 3780
aggctattct ccacagcccg aaccccttaa gctagacgaa cacagttagc ataacttcgt 3840
ataggatact ttatacgaag ttatgcggcc gcccaccatg ggggaggttt gaagtgtgcg 3900
cctgatataa tcatacacct aaaagcacca cttgctgatt gtgaagggac tatgtcgttt 3960
atgacgggac gttacgctgg ccgatggttt gaatttggac gctgtggtag aatgttatat 4020
ggacgtaaag gttggcatat tgaaaatcgt cttcacaggc aaacttctag acgtgtgacc 4080
caccggtaaa acgacaagcg tggcgcgtcg attgcgcttt gaacgtcgtt tgttggactc 4140
cagatgaacc tcaaaatcaa agcggtgatt gacgaaaatc aaatgacagc ccgcaaaatt 4200
tcatcagcct tcggatcgga ttctcagaat ctgattgtcc ctgctggcta catttatgaa 4260
atttcgtaca ttttggcaga aatgtcccaa taccatagca ctgccgcctg agctcacccg 4320
agcaatgcat actgggtacc tcgcccatct cgccctcttt ccaagcccag tgctgttgta 4380
aatagccaaa gggctcagta acaatggcca aactgacatc cgctgttcct gtgttgacag 4440
caagagatgt tgcaggtgca gtggagtttt gtgagttctg agaagctgat tgttgtttaa 4500
cttctttgaa agctttatcg aagattctgc aagcgatgaa cattgcttgt caagaccgag 4560
agctgcatgc ccacttgaca tccagctttg aacggctctt catgtttgat ttgtttctga 4620
ttgtagggac agatagactg gggtttagca gggactttgt ggaggacgat tttgcaggag 4680
tggtgaggga tgatgtgaca ctgtttatct cagcagtgca ggatcaagtg agtgcagcgt 4740
cagctgtggc agttgttggc tttcgtctca gtcagtagtt tgctgggatt gattatggag 4800
ggcacagttg caattttgag ttgcacgttg cgacaagcgt gttgacaaag cgtggtcaag 4860
ccggccagtc ttgccggtgg cgggtggctt ggtctaactt ccgctctaca gcaatcgttt 4920
tgttcatggt tacggggctg gcgtgccaga aagtcctggt cagccaccct cgcttcaaag 4980
ccgtagccca acaactttgc gaatatgttc gatttgcagg tggtgcccga taatacactg 5040
gcatgggttt gggtgagagg tacagctctg cgtgcaacag gttgcaagat gcagcgcagg 5100
tcttccctgg tcaaacgatg tatgcagagt tgagaggcac ttgagctggg tgaatggcgt 5160
gggctcgtag gtagtgtgca gggcaggaag ggcagccaat tttggagttg tggtccggtg 5220
tcgttgcttc gagccttatt aggactcttg ctcatcaaag cgttagttgt gaataagttg 5280
atctgaaagg atgttatgta cagcaagcag cagcagttaa gagtctgggg agtagctgca 5340
cagggcgagg tgtcaagatg ggaagggtcc tgcctcctta tgtgtttttc cctgtagggg 5400
aggaagcctc ttatgggcaa tggttgggca tattttccag ccagcccttc tttctatagg 5460
ggccagggtg ggcccagctc gtcttggctt ccaccaccag gagagtgagg gcattgaagg 5520
gccataaata gtcctcccat ctacgtgcac cagagggtgt cgtctaggct gtgcatgcca 5580
cgaggggaag gagccaagaa tgagtgtatg ggttgttttc atgtttaggc tgggataaaa 5640
ctgttttcaa ttgcgcctgc cgggtgaaaa ccacagcagc atcagcaagc ttggagaagg 5700
ccagcccgcc cagcacaggc tcacgttccc actcaggcgg tcagtcgggc gggggtgtga 5760
gtcaggcagg cgagggtgtc tgtgcctgac atcagcacct ctgcttagcc actgcagccc 5820
ctggagcagg gtagggcgtc atttgcagca atcacctgct gcctcacacg tcgcagcttg 5880
gaatttcaac gaccatcagc gctggggttg ttgagggatc atagcagatt ttggtgcagc 5940
ctggttgtca tgctctttgt ggaatggcct ctatgttcga gcaattcgtt ggatgttgag 6000
gtgcttgggg acagagagtc gaatgatggg ccagggtcaa acatgcgagc gtttggctga 6060
gtcagcggtt tttgctggtc actttttctt ttgtttctta tttaggtttg atggatgtgt 6120
tttgtgctgc tgccctgaag ctgcagcagc gtgtctgccc tgcgctactg cgggcaccaa 6180
ggctatgtgc tggtgcactc ggctgcgctg cacctgtgca cctcgcactc cgtccagcct 6240
ccatgcagca cacgtactca cggtgtcctc ctgacctgtc gtacgctatt ccaaacttgc 6300
tcttttgctg ccgctgctct cgtacacaat tgctgttgat tatcgatatc taatcgagcg 6360
cctgctgact gaactccgca ggtttggatg aactgtatgc agagtggtct gaagtggtga 6420
gcaccaactt taggtgggtg ggctctgaag gaggaggagg gagcgggtga ttaaacaggg 6480
cctgcatgaa gaggagcagg ggctgcatgg acagcagggg gaaggtgcag aagggagggt 6540
caagcggggt tcaggtggct gtgggtttct gcacgagcag tgaaagaagc tgtatccttc 6600
cacctgcttt cactggcgaa aggttgaaaa caggatgtcg cagctggaaa gatgttgcgc 6660
tgtcaagtgc aagccatggt tgagggtatg cctgtgtgca tgtgcttctt aaagttactc 6720
ctgttctatg gttctgggtg cttgttgttt gtggtgcagg gatgcaagcg gacctgcaat 6780
gacagagatt ggagaacaac cttggggaag ggagtttgca ttgagagatc ctgcaggtga 6840
gggggcatgt aagcaatggc aggcaattca agaacgaatc attgctgcaa atgctgggat 6900
ggtatgcagc tgaggtatct attgccttgt attttgtctc gcattgcatc ggtggtgcgt 6960
tctgtggcct gaggcacagt tcttgctgtt tgataagggt tcgactgagt tgtcgtgtgt 7020
gctgtgctgc aggcaattgc gtgcactttg ttgcagaaga acaggactga gcatagcatc 7080
agcctgtggc agggttgtgg tagggctgag tggcagggtt aaaggggttg cctaccccac 7140
ccctactctc atgacaccag caacagcagc agctcatgca gtactcaaat cactgatgtc 7200
aatggtgtga cacatttggt taaggctgct ttttaaagtg ctgctttggg ggcagtgact 7260
gtgcagagct tggagcgtat ccccatgtaa tcagaaccga cgagagttcg gggcaacctt 7320
tcatcttcac attttttgtg atcagctaca gagtctgaaa tcaaatagag gctgccatct 7380
aaacgcagga gtcacaacga aggcgaaaac tccaattgct gtactcaatg cactaagtga 7440
ttgttcaatg gataaataca ctatgctcaa ttcatgccag cagagctgct ccttccagcc 7500
agctacaatg gctttttcca cgccttttga agtatgaatg ttcagcttgc tgtgcttgat 7560
gcatcaccat aaacacaatt ctacaacatt tcatgccaac aacagtacgg gctttccctg 7620
caggcagttg gtacggcata ttatggttta aacatctatc ctccagatca ccagggccag 7680
tgaggccagt ttgcatagtt aagtatgctg gctattgcag taccttatat gcaaacaagt 7740
gctcaatctg tttcatcatt gtctgtgggc aaattgcctg ccaatattct ccagttattg 7800
cctgttgttt caaatgattg aaattggaag ttgtattgct ctacattttt gacttgtgat 7860
tttttcattt gttgatatct gacaactgtg aactgcactg aacttgctgt gcttataaat 7920
gcattttttt gttttgggcc acgttgattc cttgtgatac tttcctgcta tcaaaccaaa 7980
aatatactct catgactgac gtgcaacaaa tgcatggaag ctttcaacgt tacgacagct 8040
gcttgccccc catcagctat tctacatgtg taacctacct tgcatggcca ccacaacgct 8100
actgcatgca agatctggcg caactggatg tcccaatagt agaagtatcc ggattatctc 8160
cgagagtttt acatatgtaa tcgacgccat ttctgtcatc aactataaat ccattgctcc 8220
tgcatttctg gcactgacat tctaccacaa gcaataccaa tgttggagag cgacgagagc 8280
ggcctgcccg ccatggagat cgagtgccgc atcaccggca ccctgaacgg cgtggagttc 8340
gagctggtgg gcggcggaga gggcaccccc gagcagggcc gcatgaccaa caagatgaag 8400
agcaccaaag gcgccctgac cttcagcccc tacctgctga gccacgtgat gggctacggc 8460
ttctaccact tcggcaccta ccccagcggc tacgagaacc ccttcctgca cgccatcaac 8520
aacggcggct acaccaacac ccgcatcgag aagtacgagg acggcggcgt gctgcacgtg 8580
agcttcagct accgctacga ggccggccgc gtgatcggcg acttcaaggt gatgggcacc 8640
ggcttccccg aggacagcgt gatcttcacc gacaagatca tccgcagcaa cgccaccgtg 8700
gagcacctgc accccatggg cgataacgat ctggatggca gcttcacccg caccttcagc 8760
ctgcgcgacg gcggctacta cagctccgtg gtggacagcc acatgcactt caagagcgcc 8820
atccacccca gcatcctgca gaacgggggc cccatgttcg ccttccgccg cgtggaggag 8880
gatcacagca acaccgagct gggcatcgtg gagtaccagc acgccttcaa gaccccggat 8940
gcagatgccg gtgaagaata agcagcagct tgttatgcct tccccatggg catcagcatg 9000
ctgcaagctg tctagatatc cagctttcag tggaggttga gcgagggtca gcagcggttc 9060
cctggcgatg gcggtcagct tttctggaag ccttcactag gactgcgccc agcgcatgtg 9120
acgccaatcg aacttgtgtg caaggccaaa ttttgtgacc ctgtgctgca cttcatgtat 9180
tcaagaattg agaagaaatt tcattgctgc ccttctttca ctttaatttc catccctgga 9240
tccacctccc accattgtgg ttgatgggta ggggttttgg gtaggtgcag ttcgttgtgc 9300
acgttgacat gtgtaacggt gagcaaagga attgctgggc aagtagctat tgcagcttaa 9360
gggcatggtg aaacacttgt gctgtattta cagaggaagc cagacaggta aggagtgtgt 9420
ggcagcttgg aacaggaggg ctggtcgcaa caagtatgca tatcccatga ttgttgacat 9480
aagagcagca ggtgcatatt gccagccttt gtgaaagtgg attgaaaatc aattagttgg 9540
tgtgatagct gaggctaggc actgccaacc tgcagtgaaa tgaggctcca agaccgggta 9600
ataatacagg caatcgaatc cagttgaaat tacggcgatt aaatccaagc gagcgttgta 9660
agaacatctg cacctgtctg aagtagtgag cggataatga gcattgcttg ccttctatca 9720
ctatacctga cagttacgtg tcacacactc tcaagcacaa cacacagcgg caaagttact 9780
tgctaaacct cacagtcaag ctgaaaataa aggctaaatt acgtgagacc ggcgcgccat 9840
aacttcgtat aggatacttt atacgaagtt atcaccagat ataggtgacc cgataactta 9900
attaatcttg cgaagattga attgctaata gaaggttctc atctatacat gagttaccag 9960
tgaaccccat atctgctcta taatatagtc cccgctgagg cgcagtgctg aggttccagc 10020
tcgaacgagc cagtagggct tcgactcacg gctcatttat tttagagcta ggttgacttc 10080
ccagtctcat gcaatacatg agagcaggtg ttggtcgcac gcctctctca cggtgcctct 10140
tgattttcgg ccccttgcac ccgctctcat atgacatatt cgcgctgcac ccttgctcag 10200
agcaggcgca gcatgtggag tagcgggcgc aagccgtaat gaggagtctc agctcaacat 10260
gattgaggtc agcatcactg taacaataca aatcattgtg gtgccttata tatttggtaa 10320
atgctcgctg cagtattcaa atcgaccttc actgcaagca actcgattga actacgcgcg 10380
ttattgaagg cacatacaac cgggagttca gaggagtatg cccaagaaga agcggaaagt 10440
cgggagcaat ctgttgaccg tgcatcaggt atcgagaaga actaaagagc gttcaaacgc 10500
atcaatattt tgctaaagag ctttacatct ttttggggct attttctggc tactcggtag 10560
tgacttgacc actttcttcc caagtggggg caagccgata agccgctgtg accgttgatt 10620
tttttataaa agacgtagac atgttcaatc agccacaatt gatatgcttg aatacagaac 10680
ctgcccgcat tgcctgttga cgcaacatct ggtgagctgc ggttgctatc ctcccaatat 10740
aacctgaagt catgcatata ttcgcactaa tctacatccc atgttgtgtt gagctattcg 10800
gtattgatgc cagctcagtg aactaattat caaatgtata tcggtgctgc cagaatcgat 10860
ccatgtatca atgccacaag taactggaga tacatttgct acatgtagat gaggtgcgca 10920
agaacctgat ggacatgttt agggaccgcc aagccttcag cgagcataca tggaagatgc 10980
tgctgagcgt gtgcagatct tgggcagcat ggtgtaagct gaacaaccgc aagtggttcc 11040
cagcagaacc cgaaggtatg cctgggtaac tgtcaaaatc atgtatattc ccgcaatgca 11100
agtggttcat tgttgtgctt tacgttaaag acgtgtcagc tgcaggagaa ttattttgag 11160
gatgattgtc cgttgttggc gatgtcttgc attgtgaagt atgttttgaa gtcatacagg 11220
aagtgtgaaa tcccaaagca gctggctgcc gctgcatgcg accagtcatt cacctgcatt 11280
gtgtgtgctg tagatgtgag ggactatctg ctgtacctgc aagcaagggg actggcagtg 11340
aagaccattc agcagcatct gggacagctg aacatgctgc ataggaggtc tggactgcct 11400
aggccaagcg atagcaatgc agtgtctctg gtgatgcgcc gcattagaaa ggagaacgtg 11460
gatgctgggg agagggcaaa acaagcactg gcatttgagc gcaccgactt tgaccaagtg 11520
aggtgggctt cgcaactgct gcctgaactt cctgttcctg tgcatgtaca tgagagtcgg 11580
ttggaacagg ctcatactgc gcctgattga taggctgtcc cacattgttt tatttgctgt 11640
atcgatgtat tcattttgca ttgggtcctt tctgctcatg aagcaccaag aaggctggct 11700
gtcaatggca tgccagctca tgccatctgg atgacattat gcaagaccag tgttgactcg 11760
aacatgaatc ttactggaaa ctttaatgaa tgctttcgag ctttttgtgc aggtctctga 11820
tggagaactc agaccgctgc caagacatcc gcaatctggc atttctgggg atcgcctaca 11880
acacactgct gaggattgcc gagatcgcac gcattagggt gaaggacatt agccgcacag 11940
atggagggag gatgctgatc catatcggga ggacaaagac cctggtgagc acagctggag 12000
tggagaaagc actgtctctg ggagtgacca aggtaagctt accatgtgtt tatatgaagc 12060
tgatatttgg aagaaaggag gaagcaacga caacaagggc ggtgcacaat ctattgccgc 12120
ttttgaatct tgcccgcaaa ggcagtcgat gattgctcac tgtatcaggt tgatttagtt 12180
gatgaggtgt agctggggaa gctccaatcc ccagtccaga tagccttggt tatgaattgc 12240
ataatgtagg caccacttgc actggtccta aaccccagtt cattcctgtc cttctcgtgc 12300
attttgtcaa atgaacatgc aaccgagtgt gttttcctac tcgacatgtg tgcgattgcc 12360
cacgtgtgct gcagctggtg gaacggtgga ttagcgtgtc tggagtggca gatgacccca 12420
acaactacct gttttgccgc gtgcgcaaga atggagttgc tgcacctagc gcaaccagtc 12480
aactgtctac aagggcactg gaggggatct ttgaggcaac acatcgcctg atctacgggg 12540
caaaggatga ttctgggcag aggtatctgg cctggtctgg acattctgca agggttggag 12600
cagcaaggga catggcaaga gctggagtga gcattcccga gatcatgtga gaggccccag 12660
caaaaacaac agcactagct gttgctgctc agtttgtgct cgtgatgttt gaaaggaatg 12720
gacaaggttc atccatgatg ttcattatct gggctggtct tgtacatggg gttattctat 12780
actaaacagg agcgatacaa ataacaaaca atcaatgtct atatacacat atacttggct 12840
aaatttttct cccggcctta catacataac aaaggctaaa ctaattgacc caaaataatt 12900
gtatgaataa tcaaattgat gcatacaaat aatcctaaaa atgaaaaaaa tttcattgaa 12960
ataagtatag aaataacaaa tgtttgaccc acagccctca ctctccaacc caatcctgcc 13020
tctcacaaga cttgccatgt accaacttac aatgacagcg agctacaaca agttccatca 13080
aggtgtgggt tgctattagt tggtggaacg tttgtacatt tcacagttgg acatgcactt 13140
gcgaaaaagg cgttggcttc agtgaggcag tgcttgctcg tatcccctcc aagcatgcct 13200
tgtgcaccca ttttgcaacg caggcaagct ggagggtgga caaacgtgaa catcgtgatg 13260
aactacatcc gcaacctgga cagcgagact ggagcaatgg tgagactgct ggaggatggg 13320
gattaatcag gatgttttga gcggttgtag gttctgtagt tgtatggtag gttgcatgga 13380
ggaaataggc caacaacaat tccaaatcaa aggagattgt agcgttgctc ttggtccccc 13440
tgaaaatttt tgttgttatg tgtctataaa tctagttctg caccttgcaa actgtgggat 13500
gccctgtcca gagcagaagg taatcccaaa acagtcgaga aagtctcgtt gggtggttgt 13560
gtaaagtaca aatgtatgtt ttccaccttg tctttgtatt gtgcacgagc tacagcattg 13620
gtggaagggc ttatagctgc tgggtcatca tgctgtcctg ttcttgatgg tttaggtgtc 13680
atccctttca ctgactcagc gaaatcggat gcgtaccatt catgaacggt gttgcacttg 13740
ctgtttgtga aaggtactgc atgtgcattg tacaatagac tactataatg tctcatgcac 13800
gtggtcaatg atgtagattt ctggaatatg catcgtgtaa ttgattcgat gaacccctcg 13860
tttggaactc tatttgaaaa gcaatcgagt gtcattatcc ataatggatg atgatcatga 13920
gcattgcaaa tagcaccatt agaacaaact gaatattgta caccttgacc tggatatgca 13980
tccgtccttc atcccacttt attaaggcag gttataattg gcaaggagtc ggcagaatag 14040
tcgtttggtt ataccccagt tttagtgggg cctttggcag ctatattatg gtcgcgactg 14100
taaccgggtc cgtttaaagt tcgattacat ctcagaaata taattgggct gcatgttaga 14160
aacttttcgc cgggtataac cggggtataa tcggcatact gcccaatgac ggccagccgc 14220
tggtcagtga ccgtcaaacg gtcggacggt ctgcatcgca tgtgcgctga catgtcaagt 14280
gcatgcttct cttacattca ggcaaaagac tacaagtcat tgaagaattg tcaactcagt 14340
aagctgacaa ttacgttcat gaaggtcagt cgtatgaaac tcgtatttct ccctaagtcg 14400
ttactatgga aagtacatcg tgccacgtca tcgtcatcgt ggcaatgaca gatgatggat 14460
agggtggggt tggcattaat tgctatcatt ttctttgcag aaaacaaata cctggcacat 14520
aatttgttga taatcatatg tatgtatgtc cacatgtcaa cgttatatgt ataaaaatca 14580
agacttgttt gcttaactct aaatttaatg taagaatttc ggtaataatc tgatctacat 14640
tatcacttgt gattaatgtt gaaatttgtt atccttaatt atcgtgcttg gcacaacttt 14700
cagattttgt ctgctgtcac attcatgcag tttcatttgc agtaaattct caatcattta 14760
tgtagttgat aagaatattt gatctgcttt tcattaagca aattttgtta gctttctccc 14820
cttgattgtt cattcaatga gattacattg aatgatgtct acacatataa taagaacgca 14880
tgtctacaca aatctaaaaa tcagctgcac gctcccaatt actatcgcac actctgacac 14940
cagaccgtgc tgtgacaata taagctgcac tgacaaattt ggaaaacaca agattcagaa 15000
gaaaacaaat actggaaccc ctcacacacc acctttctac agcacaaaca cgaagcagta 15060
gccaaggtaa gaaaatccga tcaaaataca ttaaatcatg tctaatatac agcataagta 15120
tagctaatga aatcgttggt cgggccttaa taacacacag tctaccaaca cctagttggt 15180
aaataccgtt gctgatattg ctctgtacca gtaaaagagg gctgcgatga gcgtttttag 15240
tgcacttctt caacacggaa tatttttcac aaattggtat gagaaccaat tttgcaaaat 15300
gttcgccctg taaagtatcg ctctgggacg atcagcttga cgtaattgta ggcgaaaagg 15360
gcgttcaaag tgcagcttta tgtatgaacg tcataaaata taaagcatag cacaatcact 15420
gatagaaaat atttgtgcgc attaaaactc tcacttctgt tgcggataca acgacggaaa 15480
tgagaagctt gtgtaagaag caattcaagt tttcattttg tcatctaagg tgtgatcctc 15540
cgatattcat taccgaatgc tgatctgagt tggaaagatg gcaatattta gctgtgcaca 15600
ctttgacctc caggccttgg cgggaattta gtattctagc tttcctattg gaacgatagg 15660
ccagccaagt ctccagcttg tatacgctac accagcagac atgctctcaa tttagctgac 15720
agtgtcttca tatttgtatt atctgttgtg tctatgccga agaagaagcg caaggtgggc 15780
gactacaagg acgacgacga caagctggag ccaggtatgc ataacctttc aatagatgct 15840
gccgcgcctt gggttcgctg cctgtgtcct gaagtacttt tcaccaggtc tacatgcatg 15900
cagcaactaa tcgttagttg ttcctttgta aacagcgttt tctgtcttta ccatgattca 15960
ggcgagaagc cgtacaagtg tccagagtgc ggcaagagct tcagccagtc aggagcactg 16020
acccgccacc agagaacaca tacacgcgac aagaagtaca gcatcggcct ggacatcggc 16080
accaactctg ttggttgggc ggtgatcacc gacgagtaca aggtgccgag gtatgttatc 16140
tttgattgca ctacttgcag tcctggtggg cactattgtt gtgcataggc gctcttttgc 16200
attcatgtat tgaatgtaga gaagttgtac actcctccta ggagactagc tgatggagtc 16260
ctgtattaaa tttgttcaca tcatatgcct tacagcatga tccattagaa gtaactaaat 16320
ttctaagcac ccagtctgag aaaccagatc gatggcaagt tgctcttggc ttgctgtgct 16380
tgcagcaaga agttcaaggt gctgggcaac accgaccgcc acagcatcaa gaagaacctg 16440
atcggcgcgc tgctgttcga ttctggcgag acagcagagg cgacacgcct gaagagaaca 16500
gcacgcagac gctacacacg ccgcaagaac cgcatctgct acctccagga gatcttcagc 16560
aacgagatgg cgaaggtgga cgacagcttc ttccacaggc tggaggagtc gttcctggtg 16620
gaggaggaca agaagcacga gcgccacccg gtaagtcgcg tgccaagcac tagtttacca 16680
tcccacaaat gacaggtctg ggtgggacat ctgcacctga aaatggctta cgacagctgc 16740
ttctcaattc gagtgtgcat attgcaagca ttagattttt tcctgcagat cttcggcaac 16800
atcgtggatg aggtggcgta ccacgagaag tacccgacca tctaccacct gcgcaagaag 16860
ctggtggaca gcaccgacaa ggcggacctg agactgatct acctggcact ggcgcacatg 16920
atcaagttcc gcggccactt cctgatcgag ggtgagtgtg gaatgcatca cagtggaaac 16980
tgctttgtag tacaatttgt ttgtgaagtt tgtgtctaga tgtccatttg atctgtggaa 17040
tgaatgtgct agctctcatg cacagcagta tttggaatgc tgaattacag tgtttccttt 17100
gttggtgtca ggcgatctga acccggacaa cagcgacgtg gacaagctgt tcatccagct 17160
ggtgcagacc tacaaccagc tgttcgagga gaacccgatc aacgcaagcg gcgtggacgc 17220
aaaggtgtct tgatgtaaag tcgaacattg catttgaacg aaggagctcc cttgttggct 17280
aagcatgggt attgactcta ccccagcagg gaatcatctt gctgcaacag ctcacgtcgt 17340
atttgtatgt ggtgcaggcg attctgagcg caaggctgag caagagccgc agactggaga 17400
acctgatcgc gcaactgcca ggcgagaaga agaacggcct gttcggcaac ctgatcgcgc 17460
tgtcactggg cctgacgccg aacttcaaga gcaacttcga cctggcggag gacgcgaagc 17520
tgcaactgag caaggtgaac gtccccctcg gccctgtgct ggtgtgcctg ctgtccaatg 17580
gcacgtttgt gcttcacaat tctacaggtt gatgcaatgt aggttggttg tgctgatgcc 17640
agagatgcac tcaaccaaca ccgtgttgct ttgttggttc ccaaccagcc tgcaatgcaa 17700
cctgtgaatc gtgcaccata cgatctgcat gcaggacacc tacgacgacg acctggacaa 17760
cctgctggcg caaatcggcg accagtacgc agacctgttc ctggcagcga agaacctgag 17820
cgacgcgatt ctgctgagcg acattctgta agtctcagag cacatcacct gcatcacaca 17880
ggatttcttt tgtcagcata tcctgccttt tcgggtcatg tttggatgcc gtgcggctgt 17940
gtgccactgg tccaggcgta ctgggctttc tgacaagctg gatgttatgc ttatattgca 18000
ggcgcgtgaa caccgagatc accaaggtga gccgcacact tgctattgct cgctttcaca 18060
aaatacccgt cgtgaaaacg tcatgtgaag gttgctatca tcgggtcaga gagtatatta 18120
catcatgaac aggctgcaag ggtttgattc ctgcaggcac cactgagcgc gagcatgatc 18180
aagcggtacg acgagcacca ccaggacctg acactgctga aggcactggt gaggcagcag 18240
cttccggaga aatacaagga gatcttcttc gaccagagca agaacggcta cgcgggctac 18300
atcgatggcg gtgcatctca agaggagttc tacaaattca tcaaggtatg tttggcacac 18360
cattgacaga aggggcatgt cttgcccagt gtgcactgct gtcaggtcga tgagagaagt 18420
ggcaatgaaa aattttggtt tgacaacaaa tatgaggggg tactcgggac tgattggcaa 18480
tgcgttagaa actccgtaag atcaaatttc tgaagtggta gcagtggaag ttcctagctg 18540
agggtgtcac tcactcttat ttctgcagcc gatcctggag aagatggacg gcaccgagga 18600
gctgctggtg aagctgaacc gcgaggatct gctgcgcaag cagcgcacat tcgacaatgg 18660
cagcatcccg caccagatcc atctgggtga gctgcacgcg attctgagaa ggcaggagga 18720
cttctacccg ttcctgaagg acaaccgcga gaagatcgag aagatcctgg tacgtggccc 18780
gggttcacct gttgcgtgca tgttgacttc aggacaaagt tagcattatt acacagcggc 18840
agcacagtga gggtcatcat gtggctggct ttccaattgc tccgagggaa taatcggttg 18900
aatgtgtgtt tctcttgcca gtgtgtcctt ggaggtgcgt gcgtgcttcg caaaaaagga 18960
gtacccaata acccttgaaa caaccagttt tgggctgcaa caacacaaga ccgcggttta 19020
ctgcctgact atgcagacgt tccgcatccc gtactacgtc ggtccactgg cacgcggcaa 19080
cagcagattt gcgtggatga cccgcaagag cgaggagaca atcaccccgt ggaacttcga 19140
ggaggtggtg gataagggtg cgtggccagt accagctgca ccccacaggc ggttgttttg 19200
acatttaaac cgctttcagg aagcgtttgt acactcatgc gcttcatggt ctaccagcag 19260
gaggtctgga acacattcag atctaacatg aaatcaagct tgcatttcaa aagcggggca 19320
tccaagtgca gcggggatga actgctgtct catttctatg caggcgcgtc tgcacagagc 19380
ttcatcgagc gcatgaccaa cttcgacaag aacctgccga acgagaaggt gctgccgaag 19440
catagcctgc tgtacgagta tttcaccgtg tacaacgagc tgaccaaggt gaagtacgtg 19500
accgagggca tgcgcaagcc ggcatttctg agcggtgagc aaaagaaggt gggtggtgca 19560
caatgttgat gcagatttga cgctgtatca ctgctgtctc gctgtacagc atctgataca 19620
ctgctgttcc cgctccccgc aggccatcgt ggacctgctg ttcaagacca accgcaaggt 19680
gaccgtgaag cagctgaagg aggactactt caagaagatc gagtgcttcg acagcgtgga 19740
gatcagcggc gtggaggatc gctttaacgc gagtctgggc acctaccacg acctgctgaa 19800
gatcatcaag gacaaggact tcctggacaa cgaggaggtg attgtgggtg gagtgcaccg 19860
cgaatgaatg gggcactgca gcacaatgga gcacacatcc aatccgcaat gagctctcct 19920
gagacttttt ttggctcctg aagcaaacca gacaatgtgc gcctatttca cggacctggc 19980
gcatggaagt agtctggcaa ctatggctgg agcacaacaa tttctggtta ttttgattgg 20040
aatgattggg ggaaaaaaca atgtgttgcc cgcagcacag gccctggtgc agttgagtta 20100
gctgtagcag tagcagaagg catgtcatcg aaaaagtacc gaattgtgcc atcatcccca 20160
ccctgctgca gaacgaggac atcctggagg acatcgtgct gaccctgacc ctgttcgagg 20220
atcgcgagat gatcgaggag cgcctgaaga catacgcgca cctgtgagtg gttgccctgg 20280
acactggaga tttcttgcat gttgggtgtg gctgattgtg cctgcatcac tggatgattg 20340
tggcacattt tcggtttaat attcagggta ctgctgcaaa cgagcttggt tcaactgacg 20400
tacctgaacc agtcgttttg ctgcttgcag gttcgacgac aaggtaagct gtgacaggac 20460
aagctggcag attcttcact tgcacctgtc cagctgaatc tacaaccatg ggtgaaggat 20520
gctgccgttg ctggcagcca cacctgtttg aaactaaaat gggagcaacc tgtgcagcaa 20580
ggtcctacga tatcatacct gcttcttcaa ccatctgatg ccccttatca acaagcgcac 20640
cctgcaggaa ttacccttgc accaaaacct gggcacgttg cctgccgctt gccagaacta 20700
gctgtctgtg ccactcccaa catgtgccta gcatctgtga tatctgctac aggtgatgaa 20760
gcagctgaag cgccgccgct atacaggttg gggtagactg agccgcaagc tgatcaacgg 20820
catccgcgac aagcagagcg gcaagacaat cctggacttc ctgaagagcg acggcttcgc 20880
aaaccgcaac ttcatgcagc tgatccacga cgacaggtga gccaggggag gtgcattcct 20940
agcctgtgct tgcttgtgtg gaccctattt gggaggagga agattgacct ggtatgaaat 21000
gtgaggctag acaacacatg cgactatttc tctccagcag cactggcagg acgatgggac 21060
tgcatgtgag ggcatgtctt gacatgaaat gtcttgccac cagtttgatg tgttgacatc 21120
gaacatcagc cccccttccc cagctattat ctagttctgg tcctatcaga ccatgcgcaa 21180
tctgctggcg gtctcatctt taaaagcatt cttgtcatca ggctgtgcag tggagccagc 21240
aataaaacca acctattgtt ttgcagcctg accttcaagg aggacatcca gaaggcgcag 21300
gtgtctggtc agggcgatag ccttcacgag cacatcgcga acctggcagg ctcaccagcg 21360
atcaagaagg gcatcctgca gaccgtgaag gtggtggatg agctggtgaa ggtgatgggc 21420
cgccacaaac cggagaacat cgtgatcgag atggcgcgcg agaaccagac aacccaaaag 21480
ggccagaaga acagccgcga gcggtacgca gaactctggc gtagccacgc aaatcatgtt 21540
tgcagatgaa agttttgtca tatgcgcaag accagggacc ttctatgtat caaaaggctt 21600
aacagtgtgt tgttggttat gttgtgcagc atgaagcgca tcgaagaggg catcaaggag 21660
ctgggtgagt catgtggaaa ggtatcatac attagatggt gttcccctgt tgtacaagat 21720
ctggcagcat ttggatgctg ccattggaga tttcatgaga tattcagtta aactaaaagc 21780
gtgagttttc gcagcagagg atagagccaa actcacaaat cattttggct tggtgcaggc 21840
agccagatcc tgaaggagca tccagtggag aacacccagc tgcagaacga gaagctgtac 21900
ctgtactacc tgcagaacgg ccgcgacatg tacgtggatc aggagctgga catcaaccgc 21960
ctgagcgact acgacgtgga ccacattgtg ccgcagtcgt tcctgaagga cgacagcatc 22020
gacaacaagg tgctgacccg cagcgacaag aatcgcggca agagcgacaa cgtgccgtct 22080
gaagaggtga ggcatcgcac aggatataca gtgggttcca tgagtgctgt tgtgttgtgc 22140
attgcttcga cccgctttcc aacctgtgcg tggtgtatgg gtttgcacca tggcgtgcac 22200
gggcacaggc atgtcatgct gcaagcaaca gggccgccaa gcttccttca cctgctcggt 22260
gatctttgtc ccttcctcca ccctcccttt ttccccgccc caggtggtga agaagatgaa 22320
gaactactgg cgccagctgc tgaacgcgaa gctgatcaca cagcgcaagt tcgacaacct 22380
gaccaaggca gagaggggtg gcctgtctga gctggataag gcgggcttca tcaagcgcca 22440
gctggtggag acacgccaga tcacaaagca cgtggcgcag atcctggaca gccgcatgaa 22500
caccaagtac gacgagaacg acaagctgat ccgcgaggtg tgacccgggt gtattagaga 22560
gatgcgcaac gcgtgctggt tgttgttgcc gttgcaccta gggagtaggt cgaatgccgc 22620
gttggtgccc gctggggtgg ctgtatcatg ctggatgggg ttgcaatcag acccgggtaa 22680
gaatgaagtg tggagctcac tgttccgtcg agcgcttcag cctgcttgat ggtgatgccg 22740
gtttggcgca ggtgaaggtg atcaccctga agagcaagct gtgagtggcg tgctgcacaa 22800
ttgtttgtca agtgcacttg ttcttgatac aaagttgggc tcgccattga tagcaagaaa 22860
aagaacttgc cacctggata gctgcgtctg gaacatgttg catggaggga attttatggt 22920
gacacccatg gtgacactct tcatggaacc tgctggccac ctgctggtat gcctcttgag 22980
gctggatgat caacaaatga tgtgccgcag tctacagtca atttcagttc acccagtagc 23040
tgtttttcat tcgtgctgca gggtgagcga cttccgcaag gacttccagt tctacaaggc 23100
aagtgccttc tagggttcag atctaagcca gagcagtgaa caactggtgc tattatatcg 23160
tacatatggt gctaattcgc ctgcttgcag ctcagcaggc accattggtg cacaggaaaa 23220
tcggcgcatg atccaagtgc agctgcgcct cgcagcttgt acccctgctg agttttcttt 23280
cggctgttgc ccatgcaggt gcgcgagatc aacaactacc accacgcgca cgacgcctac 23340
ctgaatgcag tggttggcac cgcgctgatc aagaagtacc cgaagctgga gagcgagttc 23400
gtgtacggcg actacaaggt gtacgacgtg cgcaagatga tcgcgaagag cgaggtgagc 23460
actcacaggc agttctgtta ccaacatctg cgattttctt gggcagagag tgtatcttag 23520
acctcattca cctcagattc ctgagcgagc tgcaatgccc gttgtcagcc tgtgcaatga 23580
aggaaaaacc tgtcgtaatg cttgcagcag gagatcggca aggcaaccgc gaagtatttc 23640
ttctactcga acatcatgaa cttcttcaag accgagatca ccctggcgaa cggcgagatt 23700
gtgagtgtca cagtagtgtg catcttcgtt tgatccagtt tgatccacgt gcagctgccc 23760
atcaagtcca ggttgtggac cttcatcttt ggactggcag tgtatgaaaa gtccactggg 23820
aacctgctct ttttcatacc gcatcatgca tatcgtgtcc catcgtgcgt acttcatgag 23880
ttgtccctat ttttattact gtcgtcatca cttccaacgt ccacagagcc aacacgactt 23940
gtgctgaata aaggaatgaa atcgcctatt taatataaac tggtattgtg ggacaaagtc 24000
caattcgcaa gtctgatgcg cacctgtgca gaggaagagg ccgctgatcg agaccaacgg 24060
cgagacaggc gagatcgtgt gggataaggg ccgcgacttt gcgacagtgc gcaaggttct 24120
gagcatgcca caggtgaaca tcgtgaagaa gaccgaggtg cagaccggcg gcttcagcaa 24180
agagagcatc ctgccaaagc gcaacagcga caagctgatc gcgcgcaaga aggactggga 24240
cccgaagaag tatggcggct tcgacagccc aaccgtggca tatagcgtgc tggtggtggc 24300
gaaggtggag aagggcaaga gcaagaagct gaagagcgtg aaggagctgc tgggtgagcg 24360
gccagcacat gcacctaggt tgcctatcac atggcaccaa attgcatagc catttcaggg 24420
tgattcactt cccggtaaca ggcattgtct ggcagcctca tcgtatgcat gaatggagat 24480
gggtcaattc aagcttgcat ttcaaaagca gggcatccaa gtgcagctgg gatcaactgc 24540
tgtctcattt ctatgcaggc atcaccatca tggagaggag cagcttcgag aagaacccca 24600
tcgacttcct ggaggcgaag ggctacaagg aggtgaagaa ggacctgatc atcaagctgc 24660
cgaagtacag cctgttcgag ctggagaatg gccgcaagcg catgctggca tctgcaggtg 24720
ggtggtgcac aatgttgatg atagtgccct gatgtagtgc gcagatttga cgctgtatca 24780
ctgctgtctc gctgtacagc atctgataca ctgctgttcc cgctccccgc aggtgagctg 24840
caaaagggca acgagctggc actgccgagc aagtacgtga acttcctgta cctggcgagc 24900
cactacgaga agctgaaggg ctcaccggag gacaacgagc agaagcagct gttcgtggag 24960
cagcacaagc actacctgga cgagatcatc gagcagatca gcgagttcag caagcgcgtg 25020
atcctggcag acgcgaacct ggataaggta ggaattttcc cctccctgca ggtggccagg 25080
gaaatgaacc ggtcaccatg taccgggtag cacgggtgga cacacggcag tggccaggga 25140
atcgtactgc tgagggtccc cctgcatgca gactgtgggg gttccctcag gctccgtctt 25200
tgttgcacat gcaatggttt gatcggtctc agttggcatc tctattgaaa ctgctatatt 25260
cctatgccag tgacgcagag gtgaggatgg ttgacaaggt tttgacgtag tgggtgttga 25320
gggtgctgtg caggtgctga gcgcgtacaa caagcaccgc gacaagccaa tccgcgagca 25380
agcagagaac atcatccacc tgttcacgct gaccaacctg ggcgcaccag cagcgttcaa 25440
atatttcgac accaccatcg accgcaagcg gtacacaagc accaaggtac tacctgcctg 25500
cccaaatgct gttgggcttt gcagcacaaa ggaaaattct ccagccaggg tttttcctgc 25560
tgcaacactg ttgtatgatc gctcacaata agggggaaat aggtttccaa gtcatggttg 25620
tgacagtgga aaccaagtct tttttgcctc caccaagttt ttgtcctcaa atttaattca 25680
atggtggttt gtaggaggtg ctggacgcga ccctgatcca tcagagcatc acaggtacag 25740
tgcagcagca caatccctcg tcaagcttac ttgtgttgca ttgccaaatt gcccaatttc 25800
ctatgaagtt tgctgtacat ttgatcatgc gctaaattgc ttttacgttc tatcgctttg 25860
tatgcatgca ggcctgtacg agacccgcat cgacctgtct caactgggtg gcgactgagg 25920
tgcgaatagt gcttcagtaa aaaagtagca acttggtgca atatcgtcag ggtcgtgtgg 25980
tctgctcgcc agcaagtttt ttggcacagg agagcgcttt ttccgagtac cgccaaagtt 26040
caagcatgtg ctgtgattcg ctgttgcctc ttatgataat tgctcaaagt ttccaagcat 26100
tctatgtcca ccctgcacca ctaagttgta tggtgcttat tctgcagggg atgattcatg 26160
gtgcctaaaa attttgtgct gctgtcgcgt ctgttttctg tcgcagttta gtgaatgtaa 26220
ctccaaatac caaacttttc atcacaatca tattgatgcc tttgtaagtg aattacagcg 26280
ttttttgcca taaaaagaag taccgtgaca ttggggtcgt cataacaaga agctttatga 26340
acaagcagct tgatctacga gacttataca taaatggttt cgggtaactc ctaatacggg 26400
gctacgttag ttcagcagct gagaacgacc acgaacggga agaattccag ccatgttgaa 26460
gaggtgcagc tatcaaggtg aggtctttac tggtgtctgt tattgctgta acatcatttc 26520
gctgttgcac aatttaaaca tttgtaattt actgttgtta ttgcagtggc cacttgtagc 26580
agtggcagcg aggcactgac acttctacgt gaacgcaacg aggacggatc ctccgaccag 26640
ttcgacctcg tactgtcaga tgtttacatg ccgggtatgt cgtattcctt tgtaaacttt 26700
acaatatgcg tctagtttga cgcgtacact ttgtacactt tgcaaaaacg caccctgcga 26760
ggtctgccat ttggtcacta caacttggcc accttggttg caagtttgca agttcgctct 26820
acgtcaacgc tgcaaaatga accaattgtt ttgcactgac cctgccaacc ttcatttgtg 26880
gctgcagaca tggacggttt caagctgctt gaacacatcg gtctagagtt ggagcttccc 26940
gttatcagta agttgatcga gccgagtcca gagcgaagcc tgcttctata ctattagcag 27000
ctgtcttttg atatttgaca gcttgacttg atatggtcac agagcatact tgcaaccagg 27060
ttacctgttg aactagcaac tgtgcccaag catctcttca agcacctccg tcagtccata 27120
gggtactgtt gatttgtact ctgcaatact gcactgtaat gcgctgtgaa tcactgccct 27180
tcacctctag atggtgcttc cctggagccc tcccccacct ccgcctcaag cccctcacat 27240
gcctctcccc cccctgcagt gatgtcatcc aacggggaca cgaatgtcgt gctgcggggg 27300
gtcacccacg gggctgtgga ctttctgatc aagcccgttc gaattgagga gctgcggaac 27360
gtgtggcagc acgtggtgcg tcgtcgttcc atggcgctgg ccaggacgcc agacgagggg 27420
ggacactcgg acgaggactc tcaggtgccc ttggcagctt ctgggcggct tgctgtgtcg 27480
gatgccactt ggactgggga tgcacgaggg gtggggggac aatgggagat gggccatagt 27540
aggccagagt tgatggcagt ggtggtgggg gggagtaggc gggagagaag cagccatcct 27600
ggtgttggtt ttgatgattg agtgcatggg gatgatgcac aggtgagctg actggatgcc 27660
ttgtcttgct gtgctgcgct gcagcggcac agtgtgaaac gcaaggagtc ggagcagagc 27720
ccgctgcagc tcagcacaga gcagggcggg aacaagaagc caagagtggt gtggtcggtg 27780
gagatgcacc aacaggtgtg cttgcgggcg ggtgtatacg ggggaggggg gccagctgct 27840
ggctgacctg gcgtgcgcgg tgcattgcac ttggcgatga ggggcgtgct tcagtatgta 27900
gctgggacgc aattggttgt gctgtgtgac cagtgcacaa aatacatccc tgaattccag 27960
tgggttgaac agagttgtcc tggaggtggg aagcaaacgc gcacgtggta gaggggagca 28020
gggtgcagaa cagccgcagc aggggtgttg cgcagtgtgc aggtatcctg cctccatgcc 28080
ccgggccatg ggcatactac gctggtaccg tcaggatggg cgttgagcct ggcttggggg 28140
gcagggggcg agcgaatgcg gaatgggagc ggcaggtgct gggagggtgg ctgactggct 28200
tgcaggagcg caagtcctgt cgggggcgtc gtcctgttcc ctcctgcccg cttcacccac 28260
gttcactctc atgcctccac actcctgctg ctgacacacc tgtcgccacc tccgctgcag 28320
tttgtgaacg cggtcaactc cctgggcatt gacaaggcgg tgcccaagcg gattctggac 28380
ctgatgaacg tggaggggct gacgcgcgag aacgtggcca gccatctgca ggtgcctgcc 28440
atgacccgcg at 28452
<210>87
<211>30
<212>DNA
<213> Artificial sequence
<220>
<223> Description of Artificial sequence: synthetic primer
<400>87
aggctactct cagacatgac ggtggctctg 30
<210>88
<211>30
<212>DNA
<213> Artificial sequence
<220>
<223> Description of Artificial sequence: synthetic primer
<400>88
gccacaaatg aaggttggca gggtcagtgc 30
<210>89
<211>23
<212>DNA
<213> Artificial sequence
<220>
<223> Description of Artificial sequence: synthetic oligonucleotide
<400>89
acaccacctt aaggcacatg agg 23
<210>90
<211>23
<212>DNA
<213> Artificial sequence
<220>
<223> Description of Artificial sequence: synthetic oligonucleotide
<400>90
ggcgtgggac atggtgcgca agg 23
<210>91
<211>55
<212>DNA
<213> Artificial sequence
<220>
<223> Description of Artificial sequence: synthetic oligonucleotide
<400>91
tgaagcaccc cccggcctct ccccccgcag ggccgcccct cccgcctcgt cgtgc 55
<210>92
<211>56
<212>DNA
<213> Artificial sequence
<220>
<223> Description of Artificial sequence: synthetic oligonucleotide
<400>92
cgcaacgctc tccctcccca ccccccagcc tcacatccgc ctcaagcagc gccctg 56
<210>93
<211>25
<212>DNA
<213> Artificial sequence
<220>
<223> Description of Artificial sequence: synthetic primer
<400>93
caagctatgc gaggaaggga gggtc 25
<210>94
<211>23
<212>DNA
<213> Artificial sequence
<220>
<223> Description of Artificial sequence: synthetic primer
<400>94
ctgccgcaag tgagtgtgct gtc 23
<210>95
<211>25
<212>DNA
<213> Artificial sequence
<220>
<223> Description of Artificial sequence: synthetic primer
<400>95
caccagatat aggtgacccg ataac 25
<210>96
<211>24
<212>DNA
<213> Artificial sequence
<220>
<223> Description of Artificial sequence: synthetic primer
<400>96
aaaactccac tgcacctgca acat 24
<210>97
<211>20
<212>DNA
<213> Artificial sequence
<220>
<223> Description of Artificial sequence: synthetic oligonucleotide
<400>97
tgcggtgaag cttggagctg 20
<210>98
<211>59
<212>DNA
<213> Artificial sequence
<220>
<223> Description of Artificial sequence: synthetic oligonucleotide
<400>98
ttgccgtcga cgagacttcg gggcgcgcat ttatcgactc tcttgaagat acaccggtt 59
<210>99
<211>65
<212>DNA
<213> Artificial sequence
<220>
<223> Description of Artificial sequence: synthetic oligonucleotide
<400>99
tccaattgta gatatcatat tgtttccgga cctaccttac gcactgagtg ctgccagatg 60
ttctt 65
<210>100
<211>28
<212>DNA
<213> Artificial sequence
<220>
<223> Description of Artificial sequence: synthetic primer
<400>100
gaggtgggtg gtagtgcttc gcgaggtg 28
<210>101
<211>29
<212>DNA
<213> Artificial sequence
<220>
<223> Description of Artificial sequence: synthetic primer
<400>101
atcacagctc acagggcaga cactgcgtc 29
<210>102
<211>7
<212>PRT
<213> Unknown
<220>
<223> Description of Unknown: 'SHAQKYF' family protein sequence
<400>102
Ser His Ala Gln Lys Tyr Phe
1 5
<210>103
<211>23
<212>DNA
<213> Artificial sequence
<220>
<223> Description of Artificial sequence: synthetic oligonucleotide
<400>103
gggacatggt gcgcaaggac ggg 23
<210>104
<211>23
<212>DNA
<213> Artificial sequence
<220>
<223> Description of Artificial sequence: synthetic oligonucleotide
<400>104
tgcggtgaag cttggagctg tgg 23
<210>105
<211>23
<212>DNA
<213> Artificial sequence
<220>
<223> Description of Artificial sequence: synthetic oligonucleotide
<400>105
acaccacctt aaggcacatg agg 23