Rice nitrogen metabolism regulation protein ARE4 and application of coding gene thereof
The invention discloses the rice nitrogen metabolism regulatory protein ARE4 and applications of its encoding gene (inventors: 左建儒, 马晓辉, 粘金沯, 钱前, 李家洋; dated 2020-05-06). Using map-based cloning, the invention identifies a gene, ARE4, that regulates nitrogen metabolism, and verifies its function by transgenic experiments. When the gene is overexpressed in rice, plant height, biomass and yield increase; when its function is lost or reduced, plant height, biomass and yield decrease, showing that the gene can regulate rice biomass and yield. The invention is therefore of significance and practical value for breeding new high-yield rice varieties.
1. Use of any one of the following substances a1) to a3) in at least one of the following b1 to b6:
a1) the protein ARE4;
a2) a nucleic acid molecule encoding the protein ARE4;
a3) a recombinant vector, expression cassette or recombinant bacterium comprising a nucleic acid molecule encoding the protein ARE4;
the protein ARE4 is any one of the following c1) to c5):
c1) a protein consisting of the amino acid sequence shown in sequence 1 of the sequence table;
c2) a protein consisting of the amino acid sequence shown in sequence 7 of the sequence table;
c3) a protein consisting of the amino acid sequence shown at positions 227 to 544 of sequence 7 of the sequence table;
c4) a protein obtained by adding a tag sequence to the end of the amino acid sequence of any one of the proteins c1) to c3);
c5) a protein derived from any one of c1) to c3) by substitution and/or deletion and/or addition of one or more amino acid residues in its amino acid sequence and having the same function;
b1, regulating and controlling nitrogen metabolism of plants;
b2, regulating and controlling the plant height;
b3, regulating and controlling plant biomass;
b4, regulating and controlling the yield of the plants;
b5, regulating and controlling the absorption or transport capacity of the plant to nitrate;
b6, regulating the transcription of target genes.
2. Use according to claim 1, characterized in that:
the nucleic acid molecule encoding the protein ARE4 is a DNA molecule of any one of the following d1) to d7):
d1) a DNA molecule whose coding region is shown in sequence 2 of the sequence table;
d2) a DNA molecule whose coding region is shown in sequence 3 of the sequence table;
d3) a DNA molecule whose coding region is shown at positions 388 to 1341 from the 5' end of sequence 2 of the sequence table;
d4) a DNA molecule whose coding region is shown at positions 391 to 1344 from the 5' end of sequence 3 of the sequence table;
d5) a DNA molecule whose coding region is shown in sequence 4 of the sequence table;
d6) a DNA molecule that hybridizes under stringent conditions with the DNA sequence defined in any one of d1) to d5) and encodes a protein having the same function;
d7) a DNA molecule having at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 96%, at least 97%, at least 98% or at least 99% homology to the DNA sequence defined in any one of d1) to d5) and encoding a protein having the same function.
3. Use according to claim 1 or 2, characterized in that:
the regulation of plant nitrogen metabolism is promotion of plant nitrogen metabolism;
or, the regulation of plant height is an increase in plant height;
or, the regulation of plant biomass is an increase in plant biomass;
or, the regulation of plant yield is an increase in plant yield;
or, the regulation of the plant's capacity to absorb or transport nitrate is an improvement of that capacity;
or, the regulation of target gene transcription is promotion of target gene transcription.
4. Use according to claim 3, characterized in that:
the promotion of the nitrogen metabolism of the plants is to improve the absorption or transport capacity of the plants to nitrate.
5. Use of the substance of any one of claims 1-4 in cultivating plants as shown in B1 to B5:
B1, plants with rapid nitrogen metabolism;
B2, tall plants;
B3, high-biomass plants;
B4, high-yielding plants;
B5, plants with improved nitrate uptake or transport capacity.
6. Use of a substance that inhibits the content or activity of the protein ARE4 in plants in at least one of the following C1 to C5;
or use of a substance that inhibits expression of the gene encoding the protein ARE4 in plants in at least one of the following C1 to C5:
C1, reducing nitrogen metabolism of plants;
C2, reducing plant height;
C3, reducing plant biomass;
C4, reducing plant yield;
C5, reducing the plant's capacity to absorb or transport nitrate;
the protein ARE4 is any one of the following c1) to c5):
c1) a protein consisting of the amino acid sequence shown in sequence 1 of the sequence table;
c2) a protein consisting of the amino acid sequence shown in sequence 7 of the sequence table;
c3) a protein consisting of the amino acid sequence shown at positions 227 to 544 of sequence 7 of the sequence table;
c4) a protein obtained by adding a tag sequence to the end of the amino acid sequence of any one of the proteins c1) to c3);
c5) a protein derived from any one of c1) to c3) by substitution and/or deletion and/or addition of one or more amino acid residues in its amino acid sequence and having the same function.
7. A method for breeding a transgenic plant, which is 1) or 2) as follows:
1) increasing the content and/or activity of the protein ARE4 in a target plant to obtain a transgenic plant;
2) increasing the expression of a nucleic acid molecule encoding the protein ARE4 in a target plant to obtain a transgenic plant;
the transgenic plant has at least one of the following phenotypes D1 to D5:
D1, the nitrogen metabolism of the transgenic plant is faster than that of the target plant;
D2, the plant height of the transgenic plant is greater than that of the target plant;
D3, the biomass of the transgenic plant is greater than that of the target plant;
D4, the yield of the transgenic plant is greater than that of the target plant;
D5, the nitrate uptake or transport capacity of the transgenic plant is greater than that of the target plant;
the protein ARE4 is any one of the following c1) to c5):
c1) a protein consisting of the amino acid sequence shown in sequence 1 of the sequence table;
c2) a protein consisting of the amino acid sequence shown in sequence 7 of the sequence table;
c3) a protein consisting of the amino acid sequence shown at positions 227 to 544 of sequence 7 of the sequence table;
c4) a protein obtained by adding a tag sequence to the end of the amino acid sequence of any one of the proteins c1) to c3);
c5) a protein derived from any one of c1) to c3) by substitution and/or deletion and/or addition of one or more amino acid residues in its amino acid sequence and having the same function.
8. A method for breeding a transgenic plant, which is 1) or 2) as follows:
1) inhibiting or reducing the content and/or activity of the protein ARE4 in a target plant to obtain a transgenic plant;
2) inhibiting or reducing the expression of a nucleic acid molecule encoding the protein ARE4 in a target plant to obtain a transgenic plant;
the transgenic plant has at least one of the following phenotypes E1 to E5:
E1, the nitrogen metabolism of the transgenic plant is slower than that of the target plant;
E2, the plant height of the transgenic plant is lower than that of the target plant;
E3, the biomass of the transgenic plant is less than that of the target plant;
E4, the yield of the transgenic plant is less than that of the target plant;
E5, the nitrate uptake or transport capacity of the transgenic plant is lower than that of the target plant;
the protein ARE4 is any one of the following c1) to c5):
c1) a protein consisting of the amino acid sequence shown in sequence 1 of the sequence table;
c2) a protein consisting of the amino acid sequence shown in sequence 7 of the sequence table;
c3) a protein consisting of the amino acid sequence shown at positions 227 to 544 of sequence 7 of the sequence table;
c4) a protein obtained by adding a tag sequence to the end of the amino acid sequence of any one of the proteins c1) to c3);
c5) a protein derived from any one of c1) to c3) by substitution and/or deletion and/or addition of one or more amino acid residues in its amino acid sequence and having the same function.
9. Use according to any one of claims 1 to 6 or a method according to claim 7 or 8, wherein: the plant is a dicotyledonous plant or a monocotyledonous plant.
10. Use of the protein ARE4 as a transcription factor;
the protein ARE4 is any one of the following c1) to c5):
c1) a protein consisting of the amino acid sequence shown in sequence 1 of the sequence table;
c2) a protein consisting of the amino acid sequence shown in sequence 7 of the sequence table;
c3) a protein consisting of the amino acid sequence shown at positions 227 to 544 of sequence 7 of the sequence table;
c4) a protein obtained by adding a tag sequence to the end of the amino acid sequence of any one of the proteins c1) to c3);
c5) a protein derived from any one of c1) to c3) by substitution and/or deletion and/or addition of one or more amino acid residues in its amino acid sequence and having the same function.
Technical Field
The invention belongs to the technical field of biology, and relates to a rice nitrogen metabolism regulation protein ARE4 and application of an encoding gene thereof.
Background
Plant growth and development require multiple nutrient elements, of which nitrogen (N) is particularly important. Nitrogen generally accounts for 0.3-5% of plant dry weight, and most non-leguminous crops need their roots to absorb about 20-50 g of nitrogen to produce 1 kg of dry matter, so soil nitrogen content is often a key factor limiting crop yield in agricultural production. Over the past fifty years, advances in crop breeding and the heavy use of chemical nitrogen fertilizers have greatly increased global grain yield, but they have also brought problems such as energy consumption, higher agricultural production costs and environmental pollution. Improving the nitrogen use efficiency (NUE) of crops is therefore one of the effective ways to achieve sustainable agriculture, and a growing number of scientists engaged in basic and applied crop research are working on cloning new genes, analysing their functions and identifying superior allelic variants that can improve NUE.
Nitrogen use efficiency refers to the plant biomass or grain yield obtained per unit of nitrogen supplied, and mainly comprises two physiological indexes: nitrogen uptake efficiency (NUpE) and nitrogen utilization (assimilation) efficiency (NUtE). Nitrogen uptake efficiency measures the ability of the root system to obtain nitrogen from the soil around the roots. The main inorganic nitrogen forms that roots absorb from soil are nitrate (NO₃⁻) and ammonium (NH₄⁺), and plants differ in their preference for nitrate or ammonium depending on their growing environment. For example, rice grown in a typical anoxic paddy field prefers to absorb ammonium. Uptake of soil nitrate and ammonium by roots is mainly carried out by nitrate transporters (NRTs) and ammonium transporters (AMTs) located on the cell membranes of the roots. According to their affinity for NO₃⁻, NRTs are divided into high-affinity and low-affinity types: the low-affinity NRTs (NRT1 family proteins) play the major role when the plant grows under high nitrogen concentrations, and the high-affinity NRTs (NRT2.1, NRT2.2, NRT2.4) play the major role under low nitrogen concentrations. After NO₃⁻ is absorbed by the root system, part of it is assimilated or stored in vacuoles, and part is carried by transpiration to the aerial parts, where it is assimilated into organic nitrogen; the low-affinity nitrate transporters AtNRT1.5, AtNRT1.8 and AtNRT1.9 of Arabidopsis thaliana function in this NO₃⁻ transport process.
In Arabidopsis thaliana, at least 6 ammonium transporters (AMTs) take part in NH₄⁺ uptake and transport in roots and function as homo- or heteromultimers; the expression of AtAMT1.1, AtAMT1.3 and AtAMT1.5 is induced by low-nitrogen conditions.
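The definition of nitrogen use efficiency given above can be made concrete with a small calculation. The sketch below assumes the commonly used decomposition in which NUE (yield per unit nitrogen supplied) is the product of uptake efficiency (nitrogen taken up per unit supplied) and utilization efficiency (yield per unit nitrogen taken up); this decomposition and all numbers are illustrative assumptions, not values from this document.

```python
from dataclasses import dataclass


@dataclass
class NitrogenBudget:
    """Hypothetical per-plant nitrogen budget (all values in grams)."""
    n_supplied_g: float   # nitrogen supplied to the plant
    n_taken_up_g: float   # nitrogen recovered in the plant
    yield_g: float        # grain yield or biomass


def uptake_efficiency(b: NitrogenBudget) -> float:
    """NUpE: nitrogen taken up per unit nitrogen supplied."""
    return b.n_taken_up_g / b.n_supplied_g


def utilization_efficiency(b: NitrogenBudget) -> float:
    """NUtE: yield per unit nitrogen taken up."""
    return b.yield_g / b.n_taken_up_g


def nitrogen_use_efficiency(b: NitrogenBudget) -> float:
    """NUE: yield per unit nitrogen supplied."""
    return b.yield_g / b.n_supplied_g


# Made-up example values, for illustration only.
example = NitrogenBudget(n_supplied_g=10.0, n_taken_up_g=4.0, yield_g=200.0)
print(uptake_efficiency(example), utilization_efficiency(example),
      nitrogen_use_efficiency(example))
```

Under this decomposition, a genotype can raise NUE either by taking up more of the supplied nitrogen or by producing more yield per unit of nitrogen taken up.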
NH₄⁺ absorbed by plants from the soil can enter nitrogen assimilation directly, whereas NO₃⁻ in the plant must first be reduced to NH₄⁺ through successive catalysis by nitrate reductase (NR) in the cytoplasm and nitrite reductase (NiR) in the chloroplast. NH₄⁺ is assimilated into glutamine and glutamate through a cycle consisting of glutamine synthetase (GS) and glutamate synthase (GOGAT), which is a key step in plant nitrogen assimilation. Plants have two classes of GS isoenzymes: GS1, localized in the cytoplasm, and GS2, localized in the chloroplast. GOGAT also exists in two forms, classified by electron donor into ferredoxin-dependent glutamate synthase (Fd-GOGAT) and nicotinamide adenine dinucleotide-dependent glutamate synthase (NADH-GOGAT); Fd-GOGAT is localized in the chloroplast stroma and is expressed mainly in green tissues such as leaves, and NADH-GOGAT is localized in the cytoplasm. Studies have shown that, within the GS/GOGAT cycle, chloroplast-localized GS2/Fd-GOGAT is primarily involved in reassimilating the NH₄⁺ produced in leaves by photorespiration, while cytoplasm-localized GS1/NADH-GOGAT is mainly involved in the primary assimilation of nitrogen in roots and the transport of nitrogen in vascular bundles. In addition to the GS/GOGAT cycle, cytoplasm-localized asparagine synthetase (AS) and NADH-dependent glutamate dehydrogenase (GDH) in mitochondria are involved in plant nitrogen assimilation.
The uptake and assimilation of nitrogen by plants require a continuous supply of ATP, NAD(P)H and carbon skeletons, most of which are provided by carbon metabolism. Carbon metabolism mainly comprises photosynthesis-driven sugar anabolism and respiration-mediated sugar catabolism. Through photosynthesis the plant converts light energy into chemical energy (ATP) and reducing power (NADPH), which drive the synthesis of organic substances such as sugars from CO₂ through a series of enzymatic reactions. Respiration, through glycolysis, the tricarboxylic acid cycle, photorespiration and related pathways, breaks down photosynthetic products and intermediates to generate the carbon skeletons, ATP and NAD(P)H required by other biological processes such as nitrogen uptake, assimilation and amino acid synthesis. Under high light, high temperature or low CO₂, the oxygenase activity of ribulose-1,5-bisphosphate carboxylase/oxygenase (RuBisCO), a key enzyme of photosynthesis, increases, and the plant carries out photorespiration, releasing large amounts of CO₂ and NH₃. Excess NH₄⁺ is toxic to cells, so the chloroplast-localized GS2/Fd-GOGAT cycle reassimilates the NH₄⁺ produced by photorespiration. Studies have shown that the NH₄⁺ produced by photorespiration is roughly 10 times the NH₄⁺ that the plant obtains from soil through primary nitrogen assimilation, making it an important source of organic nitrogen in plants. α-Ketoglutarate (2-OG), an important intermediate of the Krebs cycle, is a substrate of the GS/GOGAT nitrogen assimilation cycle, and 2-OG is generally regarded as the junction of carbon and nitrogen metabolism in plants.
At the same time, nitrogen metabolism plays an important role in maintaining the efficiency and stability of carbon metabolism: the proportion of photosynthetically fixed CO₂ that is allocated to metabolites such as sucrose, starch or organic acids is regulated by nitrogen metabolism. The interaction between carbon and nitrogen metabolism depends not only on mutual regulation among metabolites but also on interactions among endogenous signals such as sugars produced by carbon metabolism, amino acids, NO₃⁻ and NH₄⁺.
Studies have reported that transcription factors such as bZIP, Dof and NLP7 have regulatory functions in carbon and nitrogen metabolism. Among them, HY5 (elongated hypocotyl 5) is a bZIP transcription factor and an important regulator that promotes photomorphogenesis; it is involved in growth and development processes regulated by light, hormones and other signals, such as root growth, hypocotyl elongation after seed germination and pigment biosynthesis. HY5 can also promote the transport of photosynthetic products from the aerial parts of the plant and move into the root system as a signal molecule, where it promotes nitrate uptake and transport by the roots, thereby maintaining the carbon-nitrogen balance in the plant and mediating signal exchange between the aerial and underground parts. Dof proteins are plant-specific transcription factors that play important roles in growth and development processes such as seed germination, light response, biotic stress and carbon-nitrogen metabolism. Dof proteins are considered activators of several key genes in organic acid metabolism: maize Dof1 is an activator of the phosphoenolpyruvate carboxylase (PEPC) gene and can activate the expression of genes involved in carbon skeleton synthesis, and overexpression of the maize Dof1 gene in Arabidopsis thaliana improves the nitrogen assimilation efficiency of the transgenic plants under nitrogen deficiency, increasing total nitrogen content by 30%. NLP (NIN-like protein) is an important regulator of nitrate signalling; it senses the nitrate signal and binds to nitrate response elements to activate the expression of a series of nitrate-induced genes.
It has been reported that NLP7 can also regulate the expression of PGD (6-phosphogluconate dehydrogenase), a key gene of the OPPP (oxidative pentose phosphate pathway). The OPPP and its metabolic products and intermediates can raise the accumulation level and transport activity of the nitrate transporter NRT2.1 in Arabidopsis thaliana, and this regulation is independent of the glucose signalling process mediated by HXK1 (hexokinase 1), suggesting that the pathway may influence nitrate uptake through some post-transcriptional regulatory mechanism.
Therefore, identifying more transcription factors and new genes that regulate the interaction between nitrogen and carbon metabolism will further clarify the molecular mechanism of carbon-nitrogen balance in plants, and has important theoretical significance and breeding value for efficient nitrogen use, environmental adaptability and crop yield improvement.
Disclosure of Invention
An object of the present invention is to provide a use of any one of the following substances a1) to a3).
The invention provides use of any one of the substances a1) to a3) in at least one of b1 to b6:
a1) the protein ARE4;
a2) a nucleic acid molecule encoding the protein ARE4;
a3) a recombinant vector, expression cassette or recombinant bacterium comprising a nucleic acid molecule encoding the protein ARE4;
the protein ARE4 is any one of the following c1) to c5):
c1) a protein consisting of the amino acid sequence shown in sequence 1 of the sequence table;
c2) a protein consisting of the amino acid sequence shown in sequence 7 of the sequence table;
c3) a protein consisting of the amino acid sequence shown at positions 227 to 544 of sequence 7 of the sequence table;
c4) a protein obtained by adding a tag sequence to the end of the amino acid sequence of any one of the proteins c1) to c3);
c5) a protein derived from any one of c1) to c3) by substitution and/or deletion and/or addition of one or more amino acid residues in its amino acid sequence and having the same function;
b1, regulating and controlling nitrogen metabolism of plants;
b2, regulating and controlling the plant height;
b3, regulating and controlling plant biomass;
b4, regulating and controlling the yield of the plants;
b5, regulating and controlling the absorption or transport capacity of the plant to nitrate;
b6, regulating the transcription of target genes.
The ARE4 protein of c2) or c3) can be obtained by expression in an Escherichia coli system; its coding gene can be obtained by codon-optimizing the ARE4 coding gene shown in sequence 2 of the sequence table to give the nucleotide sequence shown in sequence 3. To facilitate purification of the ARE4 protein of c3), the tag shown in Table 1 is attached to the amino terminus of the protein consisting of the amino acid sequence of c3), giving c2).
Table 1 shows the tag sequences
The ARE4 protein of c2) can be obtained by synthesizing its coding gene and then expressing it in an Escherichia coli system. The coding gene of the ARE4 protein of c3) can be obtained by deleting or adding the codons of one or more amino acid residues in the DNA sequence shown in sequence 3 of the sequence table and/or by linking the DNA coding sequence of the tag shown in Table 1 to its amino-terminal end.
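Codon optimization, as mentioned above, changes the nucleotide sequence while leaving the encoded protein unchanged. The following minimal sketch illustrates that property with the standard genetic code; the two short DNA strings are invented toy sequences, not fragments of sequence 2 or sequence 3, and the helper name `translate` is hypothetical.

```python
from itertools import product

# Standard genetic code, laid out in the conventional T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna: str) -> str:
    """Translate a coding DNA sequence (length divisible by 3) into protein."""
    dna = dna.upper()
    if len(dna) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


# Toy example: the same peptide written with two different synonymous codon sets.
native_like = "ATGGCTAAGCTT"   # hypothetical "native" codons
recoded = "ATGGCGAAACTG"       # hypothetical "optimized" codons

print(translate(native_like), translate(recoded))
```

A check of this kind is a simple way to confirm that a recoded gene still encodes the intended amino acid sequence before synthesis.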
In the above application, the nucleic acid molecule encoding the protein ARE4 is a DNA molecule of any one of the following d1) to d7):
d1) a DNA molecule whose coding region is shown in sequence 2 of the sequence table;
d2) a DNA molecule whose coding region is shown in sequence 3 of the sequence table;
d3) a DNA molecule whose coding region is shown at positions 388 to 1341 from the 5' end of sequence 2 of the sequence table;
d4) a DNA molecule whose coding region is shown at positions 391 to 1344 from the 5' end of sequence 3 of the sequence table;
d5) a DNA molecule whose coding region is shown in sequence 4 of the sequence table;
d6) a DNA molecule that hybridizes under stringent conditions with the DNA sequence defined in any one of d1) to d5) and encodes a protein having the same function;
d7) a DNA molecule having at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 96%, at least 97%, at least 98% or at least 99% homology to the DNA sequence defined in any one of d1) to d5) and encoding a protein having the same function.
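Two pieces of arithmetic sit behind the list above: the length of a coding region given by inclusive coordinates, and the percentage used for a homology threshold. The sketch below is illustrative only; it assumes that "homology" is measured as percent identity over a pre-computed alignment of equal-length strings, which the text does not specify, and the function names and the 10-nucleotide example are hypothetical.

```python
def span_length(start: int, end: int) -> int:
    """Length of a region given 1-based inclusive coordinates."""
    if end < start:
        raise ValueError("end must not be smaller than start")
    return end - start + 1


def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of alignment columns that are identical.

    Both strings must already be aligned (same length, '-' for gaps);
    a column containing a gap counts as a mismatch.
    """
    if len(aligned_a) != len(aligned_b) or not aligned_a:
        raise ValueError("aligned sequences must be non-empty and equal in length")
    matches = sum(
        1 for x, y in zip(aligned_a.upper(), aligned_b.upper())
        if x == y and x != "-"
    )
    return 100.0 * matches / len(aligned_a)


# Coordinates quoted in the text: both nucleotide ranges span 954 nt,
# i.e. 318 codons, matching the 318 residues from position 227 to 544.
print(span_length(388, 1341), span_length(391, 1344), span_length(227, 544))

# Invented toy alignment with one mismatch in ten columns.
print(percent_identity("ATGGCTAAGC", "ATGGCGAAGC"))
```

The coordinate check shows that the ranges quoted for d3), d4) and c3) are mutually consistent in length; it says nothing about the sequences themselves.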
The stringent conditions may be hybridization and membrane washing at 65 ℃ in 0.1× SSC, 0.1% SDS buffer in a DNA or RNA hybridization experiment.
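As a practical aside, the wash buffer named above is usually prepared by diluting concentrated stocks. The sketch below computes the volumes with the dilution rule C1·V1 = C2·V2; the stock strengths (20× SSC and 10% SDS) are assumed common laboratory stocks, not values given in this document, and the function name is hypothetical.

```python
def wash_buffer_volumes(final_ml: float,
                        ssc_stock_x: float = 20.0,    # assumed 20x SSC stock
                        sds_stock_pct: float = 10.0,  # assumed 10% SDS stock
                        ssc_final_x: float = 0.1,
                        sds_final_pct: float = 0.1) -> dict:
    """Volumes (mL) of each stock and of water for the requested final volume."""
    ssc_ml = final_ml * ssc_final_x / ssc_stock_x
    sds_ml = final_ml * sds_final_pct / sds_stock_pct
    water_ml = final_ml - ssc_ml - sds_ml
    return {"ssc_stock_ml": ssc_ml, "sds_stock_ml": sds_ml, "water_ml": water_ml}


# Example: one litre of 0.1x SSC, 0.1% SDS.
print(wash_buffer_volumes(1000.0))
```

Changing the default arguments adapts the same calculation to other stock strengths.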
The recombinant vector can be constructed from an existing Escherichia coli expression vector or plant expression vector. The Escherichia coli expression vectors include pGEX-4T-1, pGEX-4T-2, pGEX-4T-3 and other derived bacterial expression vectors. The plant expression vectors include binary Agrobacterium vectors and vectors for plant microprojectile bombardment, such as pCAMBIA1300, pCAMBIA1301, pGreen0800, pTCK303 and other derived plant expression vectors. The plant expression vector may further comprise the promoter region and/or the 5' untranslated region and/or the 3' untranslated region of the ARE4 gene. When the gene is used to construct a recombinant expression vector, any enhanced, constitutive, tissue-specific or inducible promoter, such as the cauliflower mosaic virus (CaMV) 35S promoter or the Ubiquitin gene promoter (pUbi), can be placed in front of the transcription initiation nucleotide, alone or in combination with other plant promoters. In addition, enhancers, including translational or transcriptional enhancers, may be used when constructing a recombinant expression vector with the gene of the present invention; these enhancer regions may be the ATG initiation codon or an initiation codon of an adjacent region, but must be in the same reading frame as the coding sequence to ensure correct translation of the entire sequence. The translational control signals and initiation codons may come from a wide range of sources, natural or synthetic. The translation initiation region may be derived from a transcription initiation region or from a structural gene. To facilitate the identification and screening of transgenic cells, tissues or plants, the recombinant expression vector may be modified, for example by adding antibiotic resistance markers (hygromycin marker, kanamycin marker, etc.) or genes that can be expressed in plants and encode enzymes producing a colour change or luminescent compounds (GFP gene, GUS gene, etc.).
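The requirement that an upstream initiation codon be in the same reading frame as the coding sequence can be checked mechanically. The sketch below is a minimal illustration under simple assumptions: positions are 0-based, only the three standard stop codons are considered, and the test sequences are invented toy strings rather than any vector described here; the function name is hypothetical.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def upstream_start_in_frame(seq: str, upstream_start: int, cds_start: int) -> bool:
    """True if an upstream ATG reads through, in frame, into the coding sequence.

    seq            -- DNA sequence containing both start codons
    upstream_start -- 0-based index of the upstream ATG
    cds_start      -- 0-based index of the first base of the coding sequence
    """
    seq = seq.upper()
    if seq[upstream_start:upstream_start + 3] != "ATG":
        return False                      # no start codon at the given position
    if (cds_start - upstream_start) % 3 != 0:
        return False                      # the two starts are in different frames
    for i in range(upstream_start, cds_start, 3):
        if seq[i:i + 3] in STOP_CODONS:
            return False                  # translation would terminate early
    return True


# Toy sequences (illustrative only).
in_frame = "ATGGGTTCTATGGCTAAGTAA"      # upstream ATG at 0, coding ATG at 9
shifted = "ATGGGTTCATGGCTAAGTAA"        # one base deleted: coding ATG at 8
with_stop = "ATGTAATCTATGGCTAAGTAA"     # in frame but interrupted by TAA

print(upstream_start_in_frame(in_frame, 0, 9),
      upstream_start_in_frame(shifted, 0, 8),
      upstream_start_in_frame(with_stop, 0, 9))
```

The same check applies when a tag coding sequence is fused upstream of a gene, since a frame shift or premature stop codon would prevent expression of the full fusion protein.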
In the invention, the promoter that drives transcription of the coding gene in the recombinant vector is specifically the Ubiquitin gene promoter (pUbi) (positions 1 to 1981 of sequence 6) or the endogenous promoter of the rice ARE4 gene (sequence 5).
In the above application, the regulation of plant nitrogen metabolism is promotion of plant nitrogen metabolism, specifically promotion of nitrate uptake or transport by the plant; this promotion is reflected in 1) and/or 2) below: 1) increasing the plant's nitrate uptake activity and transport rate; 2) increasing the expression of genes related to nitrogen uptake and transport in the plant, where the genes may be OsNRT2.1, OsNRT2.2 and OsNRT2.4.
Or, the regulation of plant height is an increase in plant height;
or, the regulation of plant biomass is an increase in plant biomass; the increase in plant biomass is reflected in increased plant height.
Or, the regulation of plant yield is an increase in plant yield; the increase in plant yield is reflected specifically in increased yield per plant.
Or, the regulation of the plant's capacity to absorb or transport nitrate is an improvement of that capacity;
or, the regulation of target gene transcription is promotion of target gene transcription, reflected in activation of expression from the target gene promoter.
In the application, the promotion of the nitrogen metabolism of the plant is the promotion of the absorption or the transportation of the nitrate by the plant.
Use of the substance of the first object in cultivating plants as shown in B1 to B5 is also within the scope of the present invention:
B1, plants with rapid nitrogen metabolism;
B2, tall plants;
B3, high-biomass plants;
B4, high-yielding plants;
B5, plants with improved nitrate uptake or transport capacity.
In practical application, when breeding a rice variety with increased plant height and/or increased yield per plant, transgenic rice with a higher expression level of the gene or of its encoded protein should be used as a parent for hybridization.
Another object of the present invention is the use of a substance which inhibits the content or activity of the protein ARE4 in plants.
The invention provides use of a substance that inhibits the content or activity of the protein ARE4 in plants in at least one of the following C1 to C5;
or the invention provides use of a substance that inhibits expression of the gene encoding the protein ARE4 in plants in at least one of the following C1 to C5:
C1, reducing nitrogen metabolism of plants;
C2, reducing plant height;
C3, reducing plant biomass;
C4, reducing plant yield;
C5, reducing the plant's capacity to absorb or transport nitrate; this reduction is reflected in 1) and/or 2) below: 1) reducing the plant's nitrate uptake activity and transport rate; 2) reducing the expression of genes related to nitrogen uptake and transport in the plant, where the genes may be OsNRT2.1, OsNRT2.2 and OsNRT2.4.
The protein ARE4 is any one of the following c1) to c5):
c1) a protein consisting of the amino acid sequence shown in sequence 1 of the sequence table;
c2) a protein consisting of the amino acid sequence shown in sequence 7 of the sequence table;
c3) a protein consisting of the amino acid sequence shown at positions 227 to 544 of sequence 7 of the sequence table;
c4) a protein obtained by adding a tag sequence to the end of the amino acid sequence of any one of the proteins c1) to c3);
c5) a protein derived from any one of c1) to c3) by substitution and/or deletion and/or addition of one or more amino acid residues in its amino acid sequence and having the same function.
It is a further object of the present invention to provide a method for breeding transgenic plants.
The method provided by the invention is 1) or 2):
1) increasing the content and/or activity of the protein ARE4 in a target plant to obtain a transgenic plant;
2) increasing the expression of a nucleic acid molecule encoding the protein ARE4 in a target plant to obtain a transgenic plant;
the transgenic plant has at least one of the following phenotypes D1 to D5:
D1, the nitrogen metabolism of the transgenic plant is faster than that of the target plant;
D2, the plant height of the transgenic plant is greater than that of the target plant;
D3, the biomass of the transgenic plant is greater than that of the target plant;
D4, the yield of the transgenic plant is greater than that of the target plant;
D5, the nitrate uptake or transport capacity of the transgenic plant is greater than that of the target plant;
the protein ARE4 is any one of the following c1) to c5):
c1) a protein consisting of the amino acid sequence shown in sequence 1 of the sequence table;
c2) a protein consisting of the amino acid sequence shown in sequence 7 of the sequence table;
c3) a protein consisting of the amino acid sequence shown at positions 227 to 544 of sequence 7 of the sequence table;
c4) a protein obtained by adding a tag sequence to the end of the amino acid sequence of any one of the proteins c1) to c3);
c5) a protein derived from any one of c1) to c3) by substitution and/or deletion and/or addition of one or more amino acid residues in its amino acid sequence and having the same function.
It is also an object of the present invention to provide a method for breeding transgenic plants.
The method provided by the invention is 1) or 2):
1) the method comprises the following steps: inhibiting or reducing the content and/or activity of the protein ARE4 in a target plant to obtain a transgenic plant;
2) the method comprises the following steps: inhibiting or reducing expression of a nucleic acid molecule encoding the protein ARE4 in a plant of interest to obtain a transgenic plant;
the transgenic plant has at least one phenotype of E1-E5 as follows:
e1, nitrogen metabolism of the transgenic plant is slower than that of the target plant;
e2, the plant height of the transgenic plant is lower than that of the target plant;
e3, the biomass of the transgenic plant is less than that of the target plant;
e4, the yield of the transgenic plant is less than that of the target plant;
e5, the nitrate absorption or transport capacity of the transgenic plant is smaller than that of the target plant;
the protein ARE4 is any one of the following (c1) - (c 5):
c1) a protein consisting of an amino acid sequence shown in a sequence 1 in a sequence table;
c2) a protein consisting of an amino acid sequence shown as a sequence 7 in a sequence table;
c3) a protein consisting of an amino acid sequence shown from 227 th to 544 th in a sequence 7 in a sequence table;
c4) a protein consisting of a tag sequence added to the end of the amino acid sequence of any one of the proteins c1) to c 3);
c5) a protein which is derived from any one of c1) to c3) and has the same function by substituting and/or deleting and/or adding one or more amino acid residues in the amino acid sequence of the protein of any one of c1) to c 3);
in the above method, the plant is a dicotyledonous plant or a monocotyledonous plant.
The application of the ARE4 protein as a transcription factor is also the protection scope of the invention;
the protein ARE4 is any one of the following (c1) - (c 5):
c1) a protein consisting of an amino acid sequence shown in a sequence 1 in a sequence table;
c2) a protein consisting of an amino acid sequence shown as a sequence 7 in a sequence table;
c3) a protein consisting of an amino acid sequence shown from 227 th to 544 th in a sequence 7 in a sequence table;
c4) a protein consisting of a tag sequence added to the end of the amino acid sequence of any one of the proteins c1) to c 3);
c5) a protein which is derived from any one of c1) to c3) and has the same function by substituting and/or deleting and/or adding one or more amino acid residues in the amino acid sequence of the protein of any one of c1) to c 3).
The invention identifies a gene ARE4 for regulating nitrogen metabolism by a map-based cloning technical method, and verifies the function of the gene by a transgenic experiment. After the gene is over-expressed in rice, the plant height of the rice is increased, the biomass is increased, and the yield is increased; after the function of the gene is deleted or reduced, the plant height of the rice is reduced, the biomass is reduced, and the yield is reduced, which shows that the gene can regulate and control the biomass and the yield of the rice. Therefore, the invention has important significance and application value for cultivating new varieties of high-yield rice.
Drawings
FIG. 1 shows the phenotypic analysis of the abc1-1 are4-1 double mutant and the are4-1 single mutant; FIG. 1a shows the phenotype of rice at the grain-filling stage (scale bar: 15 cm); FIG. 1b is a quantitative analysis of rice plant height at the grain-filling stage, the values represent the mean ± standard deviation, and the sample size is 40; FIG. 1c is a quantitative analysis of rice tiller number at the grain-filling stage, the values represent the mean ± standard deviation, and the sample size is 40; wherein WT is the wild-type japonica rice variety Nipponbare, abc1-1 is a single mutant in the Nipponbare background, abc1-1 are4-1 is a double mutant in the Nipponbare background, and are4-1 is a single mutant in the Nipponbare background.
FIG. 2 shows the map-based cloning of the ARE4 gene; FIG. 2a shows the genetic mapping performed with the BC2F2 population obtained from the indica rice variety Nanjing 6 and the are4-1 mutant; FIG. 2b is the fine mapping, with the numbers below representing the number of recombinants; FIG. 2c shows the predicted genes within the 109 kb region, the black filled arrows representing the predicted genes; FIG. 2d is a schematic structural view of the ARE4 gene cloned in the present invention, the black boxes representing exons, the white open arrows representing 3' untranslated regions, and the horizontal lines in between representing introns; the thin black arrows indicate the site of the single-nucleotide substitution in the are4-1 mutant and the resulting change in the encoded amino acid.
FIG. 3 shows the genetic complementation verification of the ARE4 gene; FIG. 3a shows the phenotype of rice at the grain-filling stage (scale bar: 15 cm);
FIG. 3b is a quantitative analysis of rice plant height at the grain-filling stage; pARE4 denotes the T2 transgenic-positive plants obtained by transferring the ARE4 coding gene into the are4-1 mutant; the values represent the mean ± standard deviation, and the sample size is 40.
FIG. 4 is an analysis of the expression levels of genes encoding nitrate transporters; FIG. 4a shows the expression levels of the nitrate transporter genes OsNRT2.1, OsNRT2.2 and OsNRT2.4 in the aerial parts of the wild type and the are4-1 mutant; FIG. 4b shows the expression levels of the nitrate transporter genes OsNRT2.1, OsNRT2.2 and OsNRT2.4 in the underground parts of the wild type and the are4-1 mutant.
FIG. 5 is the analysis of the nitrate absorption and transport capacity of different mutant materials; FIG. 5a shows the analysis of the transport capacity of different mutant materials for 15N-labeled nitrate; FIG. 5b shows the analysis of the absorption capacity of different mutant materials for 15N-labeled nitrate.
FIG. 6 is a phenotypic analysis of ARE4-overexpressing and ARE4 RNA-interference transgenic rice plants; FIGS. 6a and 6d show the phenotype of rice at the grain-filling stage (scale bar: 15 cm); FIGS. 6b and 6e are quantitative analyses of rice plant height at the grain-filling stage, the values represent the mean ± standard deviation, and the sample size is 40; FIGS. 6c and 6f are quantitative analyses of rice yield per plant, the values represent the mean ± standard deviation, and the sample size is 40.
FIG. 7 shows the results of the gel retardation (electrophoretic mobility shift) assay of the ARE4 recombinant protein bound to the target nucleic acid probe.
FIG. 8 shows the results of dual-luciferase reporter assays for transcriptional activation of target gene expression by ARE4 protein.
Detailed Description
The experimental procedures used in the following examples are all conventional procedures unless otherwise specified.
Materials, reagents and the like used in the following examples are commercially available unless otherwise specified.
The experimental methods used in the following examples are conventional methods unless otherwise specified, and the present invention is not limited to the following examples.
The rice variety Nipponbare is described in the following document: International Rice Genome Sequencing Project & Takuji Sasaki (2005). The map-based sequence of the rice genome. Nature 436, 793-.
The vector pCAMBIA1300 is described in the following document: Roberts, C., Rajagopal, S., Smith, L.M., Nguyen, T.A., Yang, W., Numgrohu, S., Ravi, K.S., Vijayacchandra, K.A., Harcourt, R.L., Drysfield, L.A. (1997). A comprehensive set of modular vectors for advanced manipulations and efficient transformation of plants. pCAMBIA Vector Release Manual.
The vector pTCK303 is described in the following document: Wang, Z., Chen, C., Xu, Y., Jiang, R., Han, Y., Xu, Z., and Chong, K. (2004). A practical vector for efficient knockdown of gene expression in rice (Oryza sativa L.). Plant Mol. Biol. Rep. 22, 409-417.
The vector pGreenII 0800-LUC is described in the following document: Hellens, R.P., Allan, A.C., Friel, E.N., Bolitho, K., Grafton, K., Templeton, M.D., Karunairetnam, S., Gleave, A.P., and Laing, W.A. (2005).
Example 1 obtaining of ARE4 Gene
First, acquisition of ARE4 Gene
1. Isolation and identification of the are4-1 mutant and phenotypic analysis
The ABC1 gene encodes Fd-GOGAT, a key enzyme in nitrogen assimilation, and plays an important role in regulating nitrogen metabolism and the carbon-nitrogen balance (Yang, X., Nian, J., Xie, Q., Feng, J., Zhang, F., Jing, H., Zhang, J., Dong, G., Liang, Y., Peng, J., et al. (2016). Rice ferredoxin-dependent glutamate synthase regulates nitrogen-carbon metabolomes and is genetically differentiated between japonica and indica subspecies. Mol. Plant 9, 1520.).
In order to further analyze the molecular mechanism by which ABC1/Fd-GOGAT regulates nitrogen metabolism and maintains the carbon-nitrogen balance in rice, the invention used EMS mutagenesis to screen for suppressor mutations of the abc1-1 mutant, designated abc1 repressors (are). Among them, the abc1-1 are4-1 double mutant partially rescued the abnormal nitrogen assimilation phenotypes of the abc1-1 mutant, such as leaf yellowing, reduced plant height, reduced tillering and low seed-setting rate (FIG. 1a). Subsequently, the abc1-1 are4-1 double mutant was backcrossed with the wild type (Nipponbare), the F2 population was subjected to phenotypic analysis, and the are4-1 single mutant was isolated and identified. Compared with the wild type, the are4-1 mutant showed reduced plant height and an increased tiller number per plant (FIG. 1b-c).
2. Map-based cloning of ARE4 Gene
The invention crossed the are4-1 mutant with wild-type material. All F1 plants showed the wild-type phenotype, and the phenotypic segregation ratio (mutant : wild type) of the F2 population at the seedling stage was close to 1:3, indicating that the are4-1 phenotype is caused by a recessive mutation of a single nuclear gene. The invention further crossed the are4-1 mutant with the indica rice variety Nanjing 6 to construct a genetic mapping population, and by map-based cloning located the ARE4 gene within an interval of about 109 kb on chromosome 4 (FIG. 2a and FIG. 2b). According to the Rice Genome Annotation Project database, this interval contains 13 predicted open reading frames (FIG. 2c). Subsequent sequencing showed that only one gene, LOC_Os04g58020, carries a single-base mutation (G3587A) in the second exon, which changes the encoded amino acid from alanine to threonine (A312T); this gene was therefore tentatively designated the ARE4 candidate gene (FIG. 2d).
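The single-gene recessive inference from the roughly 1:3 segregation can be illustrated with a chi-square goodness-of-fit test. The following is a minimal sketch only: the text gives no plant counts, so the counts 52 and 148 are hypothetical, and the function name is introduced here for illustration. The critical value 3.841 is the standard threshold for one degree of freedom at the 0.05 level.

```python
def chi_square_1_to_3(mutant: int, wild_type: int) -> float:
    """Chi-square statistic for an expected 1:3 (mutant : wild type) F2 ratio."""
    total = mutant + wild_type
    expected_mutant = total * 0.25   # recessive homozygotes
    expected_wt = total * 0.75       # dominant phenotype
    return ((mutant - expected_mutant) ** 2 / expected_mutant
            + (wild_type - expected_wt) ** 2 / expected_wt)


CRITICAL_005_DF1 = 3.841  # chi-square critical value, df = 1, alpha = 0.05

# Hypothetical counts, not taken from the patent.
CHI2 = chi_square_1_to_3(mutant=52, wild_type=148)
FITS_1_TO_3 = CHI2 < CRITICAL_005_DF1

print(f"chi-square = {CHI2:.4f}, consistent with 1:3 = {FITS_1_TO_3}")
```

A statistic below the critical value means the observed counts do not deviate significantly from 1:3, which is what a single recessive nuclear mutation predicts.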
The ARE4 gene related in the embodiment is derived from a rice variety Nipponbare, the genome sequence of the ARE4 gene is shown as a sequence 4 in a sequence table, the sequence 4 is composed of 4432 nucleotides, the cDNA sequence is a sequence 2 in the sequence table, and the sequence 2 is composed of 1344 nucleotides. The sequence 2 and the sequence 4 encode the protein (ARE4 protein) shown in the sequence 1 in the sequence table, and the sequence 1 consists of 447 amino acid residues. The sequence of the endogenous promoter is shown as a sequence 5 in the sequence table, and the sequence 5 consists of 1843 nucleotides.
Compared with wild-type Nipponbare rice, the are4-1 mutant differs only in that the G at position 3587 of the ARE4 gene (sequence 4) in the genome is mutated to A; the other nucleotide sequences are unchanged.
The are4-1 mutant has a reduced plant height and an increased number of tillers per plant compared to wild-type japonica rice.
Second, transgenic genetic complementation experiment of ARE4 gene
1. Obtaining of ARE4 gene complementation transgenic rice
The recombinant vector pCAMBIA1300-pUbi::ARE4 prepared in step First of Example 3 was transferred into Agrobacterium tumefaciens EHA105 to obtain the recombinant bacterium EHA105/pCAMBIA1300-pUbi::ARE4.
The recombinant bacterium was used to infect the are4-1 mutant; the specific transformation and screening method is described in the literature "Yi Zili, Cao Shouyun, Wang Li, He Sijie, Chu Chengcai, Tang Zuoshun, Zhou Puhua, Tian Wenzhong. Improving the frequency of Agrobacterium-mediated transformation of rice. Acta Genetica Sinica, 2001, 28(4): 352-358". T1 generation ARE4 gene complementation transgenic rice was thereby obtained.
PCR preliminary identification was performed by the method of step Third in Example 3 to obtain HYG-positive ARE4 gene complementation transgenic rice.
Total RNA was extracted from the HYG-positive ARE4 gene complementation transgenic rice and reverse transcribed into cDNA as the template, and the transcription level of the ARE4 gene was detected by real-time quantitative fluorescent PCR with the following primer pair; the experiment was repeated 3 times and the results were averaged.
The primer sequences for detecting the expression level of the ARE4 gene ARE as follows:
qARE4-1F: 5'-AGGACGAGCACAGGCTGTT-3' (positions 830 to 848 of SEQ ID NO: 2);
qARE4-1R: 5'-CCTGAGCAGCAGATGTATCTCC-3' (reverse complement of positions 1027 to 1048 of SEQ ID NO: 2).
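The stated primer coordinates can be cross-checked against the primer lengths, and they imply the size of the qPCR product. The sketch below only does this arithmetic from the positions given above; the helper names are introduced here for illustration, and no sequence of SEQ ID NO: 2 itself is assumed.

```python
FWD_PRIMER = "AGGACGAGCACAGGCTGTT"       # qARE4-1F
REV_PRIMER = "CCTGAGCAGCAGATGTATCTCC"    # qARE4-1R
FWD_POS = (830, 848)     # positions on SEQ ID NO: 2 (1-based, inclusive)
REV_POS = (1027, 1048)   # positions covered by the reverse primer


def span_length(start: int, end: int) -> int:
    """Number of nucleotides in a 1-based inclusive interval."""
    return end - start + 1


def amplicon_length(fwd_pos: tuple, rev_pos: tuple) -> int:
    """Product size from the first base of the forward primer to the
    last base covered by the reverse primer."""
    return span_length(fwd_pos[0], rev_pos[1])


# Each primer should be exactly as long as the interval it is said to cover.
FWD_OK = len(FWD_PRIMER) == span_length(*FWD_POS)
REV_OK = len(REV_PRIMER) == span_length(*REV_POS)
AMPLICON = amplicon_length(FWD_POS, REV_POS)

print(FWD_OK, REV_OK, AMPLICON)
```

Both primers match their stated intervals (19 nt and 22 nt), and the coordinates imply a product of 219 bp.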
The OsActin1 gene is used as an internal reference gene, and a detection primer pair comprises:
OsActin1-F:5’-CAACACCCCTGCTATGTACG-3’;
OsActin1-R:5’-CATCACCAGAGTCCAACACAA-3’。
the expression level of the internal reference gene OsActin1 gene is set as 1, and the relative expression level of ARE4 gene is calculated.
The are4-1 mutant was used as a control.
The results show that, compared with the are4-1 mutant (relative expression level of the ARE4 gene: 0.89), the transcription level of the ARE4 gene in the HYG-positive ARE4 gene complementation transgenic rice (relative expression level of the ARE4 gene: 1.74) was markedly increased, and positive ARE4 gene complementation transgenic rice (named pARE4) was obtained.
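The text states only that the reference gene is set to 1. One common way to obtain such values is the 2^-ΔCt calculation, sketched below as an assumption rather than as the method actually used; the Ct values are hypothetical. The fold change at the end uses the two relative expression levels reported above.

```python
def relative_expression(ct_target: float, ct_reference: float) -> float:
    """Expression of the target relative to the reference gene (reference = 1),
    assuming 2^-dCt with ideal amplification efficiency."""
    return 2.0 ** (ct_reference - ct_target)


def mean(values: list) -> float:
    return sum(values) / len(values)


# Hypothetical Ct values for three repeats (target ARE4, reference OsActin1).
ARE4_CT = [21.0, 21.2, 20.8]
ACTIN_CT = [20.0, 20.2, 19.8]
REL = mean([relative_expression(t, r) for t, r in zip(ARE4_CT, ACTIN_CT)])

# Fold change between the values reported in the text (1.74 vs 0.89).
FOLD_COMPLEMENTED_VS_MUTANT = round(1.74 / 0.89, 2)

print(REL, FOLD_COMPLEMENTED_VS_MUTANT)
```

With the reported values, ARE4 transcript abundance in the complementation line is roughly twice that in the are4-1 mutant.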
2. Phenotypic observation of ARE4 gene complementation transgenic rice
Seeds of the positive ARE4 gene complementation transgenic rice were selected, sown and subjected to field phenotypic analysis; 2 lines were used, with 40 seeds per line. The are4-1 mutant and wild-type rice Nipponbare (WT) were used as controls.
After 140 days of sowing, the plant heights of the individual plants were examined.
As a result, as shown in FIG. 3, it was found that the plant height of the positive ARE4 gene-complemented transgenic rice was restored to that of the wild-type rice.
The results show that the ARE4 gene can improve the plant height of the plant.
Example 2 application of ARE4 Gene in regulating Nitrogen metabolism
First, determination of nitrate absorption transport related gene expression in are4-1 mutant
The aerial and underground parts of seedlings of wild-type rice Nipponbare and the are4-1 mutant were sampled separately, total RNA was extracted and reverse transcribed into cDNA, and the relative expression levels of genes related to nitrate absorption and transport were detected by qRT-PCR: (1) the OsNRT2.1 gene (NC_029257, 655310-657326, 07-AUG-2018), mainly responsible for the absorption of nitrate; (2) the OsNRT2.2 gene (NC_029257, 667179-669065, 07-AUG-2018), mainly responsible for the absorption of nitrate; (3) the OsNRT2.4 gene (NC_029256, 20385987-20388532, 07-AUG-2018), mainly responsible for the absorption and transport of nitrate.
Primer pairs for detecting OsNRT2.1 gene:
OsNRT2.1-F:5’-CACGGTGCAAGTCTCAAG-3’;
OsNRT2.1-R:5’-GGTATAAATGCCTCTCCC-3’。
primer pairs for detecting OsNRT2.2 gene:
OsNRT2.2-F:5’-TGGAACATTTGGATCCTCC-3’
OsNRT2.2-R:5’-CCATGACGACATACTCTAG-3’。
primer pairs for detecting OsNRT2.4 gene:
OsNRT2.4-F:5’-AAAGGTCGCTGGGCGTGGTG-3’
OsNRT2.4-R:5’-CCTGGACCCGCTGAAGAAGAG-3’。
the OsActin1 gene is used as an internal reference gene, and a detection primer pair comprises:
OsActin1-F:5’-CAACACCCCTGCTATGTACG-3’;
OsActin1-R:5’-CATCACCAGAGTCCAACACAA-3’。
the expression level of the internal reference gene OsActin1 gene was set to 1, and the relative expression levels of OsNRT2.1, OsNRT2.2, and OsNRT2.4 genes were calculated.
The results are shown in FIG. 4. Compared with wild-type rice Nipponbare (NPB), the relative expression levels of the OsNRT2.1, OsNRT2.2 and OsNRT2.4 genes in both the aerial and underground parts of the are4-1 mutant were significantly reduced (P < 0.01), indicating that the ARE4 gene positively regulates the expression of genes related to nitrate absorption and transport.
Secondly, analyzing the nitrate absorption and transport capacity of different mutant materials
In order to further confirm whether the ARE4 gene regulates the absorption and transport of nitrate by rice seedlings, the invention carried out 15N-labeled nitrate absorption and transport experiments. First, wild-type rice Nipponbare, the abc1-1 mutant, the abc1-1 are4-1 double mutant and the are4-1 mutant were cultured in Kimura B nutrient solution (2 mM KNO3, 1.8 mM KCl, 0.36 mM CaCl2, 0.54 mM MgSO4·7H2O, 0.18 mM KH2PO4, 40 μM Na2EDTA-Fe(II), 13.4 μM MnCl2·4H2O, 18.8 μM H3BO3, 0.03 μM Na2MoO4·2H2O, 0.3 μM ZnSO4·7H2O, 0.32 μM CuSO4·5H2O and 1.6 mM Na2SiO3·9H2O) for 10 days (12 h light/12 h dark, 28 ℃, 70% humidity), with fresh nutrient solution replaced every day. Before 15N-KNO3 labeling, the roots of the rice seedlings were rinsed twice in clear water and then transferred directly into modified Kimura B nutrient solution containing 5 mM 15N-KNO3 and cultured for 3 hours. Before sampling, the roots of the rice seedlings were again placed in 0.1 mM CaSO4 solution for 2 min to remove residual 15N-NO3- from the root surface; the aerial and underground parts of the seedlings were then sampled separately, put into paper bags, dried at 65 ℃ and ground into powder for later use. The 15N content of each sample was measured with an elemental mass spectrometer (ICP-MS), with 4 replicates per sample (Liu, Y., Hu, B., and Chu, C. (2016). 15N-nitrate Uptake Activity and Root-to-shoot Transport Assay in Rice. Bio-protocol 6, e1897.).
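As a worked example of preparing the nutrient solution, the millimolar concentrations listed above can be converted into milligrams per litre by multiplying by the molar mass. The sketch covers three of the salts only; the molar masses are standard values for the unlabeled compounds, and the dictionary and function names are introduced here for illustration.

```python
# Standard molar masses (g/mol) of the unlabeled salts.
MOLAR_MASS_G_PER_MOL = {"KNO3": 101.10, "KCl": 74.55, "KH2PO4": 136.09}

# Concentrations (mM) as listed for the Kimura B nutrient solution.
CONC_MM = {"KNO3": 2.0, "KCl": 1.8, "KH2PO4": 0.18}


def mg_per_litre(conc_mm: float, molar_mass: float) -> float:
    """mmol/L multiplied by g/mol gives mg/L."""
    return conc_mm * molar_mass


RECIPE_MG_PER_L = {
    salt: round(mg_per_litre(CONC_MM[salt], MOLAR_MASS_G_PER_MOL[salt]), 2)
    for salt in CONC_MM
}

for salt, mass in RECIPE_MG_PER_L.items():
    print(f"{salt}: {mass} mg/L")
```

The same conversion applies to the remaining components; for the 15N-labeled KNO3 the molar mass would differ slightly from the unlabeled value used here.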
The results of the 15N-labeled nitrate absorption and transport assays are shown in FIG. 5. Compared with wild-type rice Nipponbare (NPB or WT), the rates of nitrate absorption and transport in are4-1 mutant seedlings were reduced, and compared with the abc1-1 mutant, those of the abc1-1 are4-1 double mutant were also reduced (P < 0.01), indicating that ARE4 positively regulates nitrate absorption and transport in rice plants.
The results show that ARE4 regulates nitrogen metabolism, particularly promotes nitrate absorption or transport; is embodied in promoting the expression of nitrate absorption or transport related genes.
Example 3 acquisition and phenotypic analysis of ARE4 transgenic plants
The ARE4 gene related in the embodiment is derived from a rice variety Nipponbare, the genome sequence of the ARE4 gene is shown as a sequence 4 in a sequence table, the sequence 4 is composed of 4432 nucleotides, the cDNA sequence is a sequence 2 in the sequence table, and the sequence 2 is composed of 1344 nucleotides. The sequence 2 and the sequence 4 encode ARE4 protein shown in the sequence 1 in the sequence table, and the sequence 1 consists of 447 amino acid residues.
First, construction of plant expression vector pCAMBIA1300-pUbi::ARE4
Extracting genome DNA of a rice variety Nipponbare by a CTAB method to be used as a template, and carrying out PCR amplification by adopting the following primer sequence pairs to obtain a Ubiquitin gene Ubiquitin promoter (pUbi).
The sequences of the primer pairs are as follows:
F:5’-AAGCTTTGCAGCGTGACCCGGTC-3' (recognition sequence for HindIII is underlined);
R:5’-CTGCAGAAGTAACACCAAACAACAGGGTGA-3' (underlined part is the recognition sequence for PstI).
Total RNA of a plant of a rice variety Nipponbare is extracted and reverse transcription cDNA is used as a template, and the following primer sequence pair is adopted for PCR amplification to obtain a coding sequence CDS (without a stop codon) of an ARE4 gene.
The sequences of the primer pairs are as follows:
F:5’-CTGCAGATGTCCGCGTCTGCATCC-3' (the underlined part is the recognition sequence of PstI, and the sequence after the recognition sequence is the 1 st to 18 th positions of the sequence 2 in the sequence table);
R:5’-CCCGGGTCCATGAGATGTTGGTGGCG-3' (the underlined part is the recognition sequence of XmaI, and the sequence following the recognition sequence is the reverse complement of position 1322-1341 of sequence 2 in the sequence list).
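Because the underlining of the restriction sites is lost in plain text, the sites can be recovered by matching the 5' end of each primer against the known recognition sequences. The sketch below does this for the four primers listed above; the recognition sequences (HindIII AAGCTT, PstI CTGCAG, XmaI CCCGGG) are standard, while the function and variable names are illustrative.

```python
SITES = {"HindIII": "AAGCTT", "PstI": "CTGCAG", "XmaI": "CCCGGG"}

PRIMERS = {
    "pUbi-F": "AAGCTTTGCAGCGTGACCCGGTC",
    "pUbi-R": "CTGCAGAAGTAACACCAAACAACAGGGTGA",
    "ARE4-F": "CTGCAGATGTCCGCGTCTGCATCC",
    "ARE4-R": "CCCGGGTCCATGAGATGTTGGTGGCG",
}


def split_primer(primer: str):
    """Return (restriction sites found at the 5' end, remaining annealing part)."""
    found, rest = [], primer.upper()
    matched = True
    while matched:
        matched = False
        for name, site in SITES.items():
            if rest.startswith(site):
                found.append(name)
                rest = rest[len(site):]
                matched = True
                break
    return found, rest


SPLIT = {name: split_primer(seq) for name, seq in PRIMERS.items()}

for name, (sites, annealing) in SPLIT.items():
    print(name, sites, annealing, len(annealing))
```

The annealing parts of the two ARE4 primers are 18 nt and 20 nt long, matching the stated positions 1 to 18 and 1322 to 1341 of sequence 2.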
The PCR products, namely the Ubiquitin gene promoter (pUbi) and the coding sequence (CDS) of the ARE4 gene, were each ligated into the pBluescript SK II (-) vector (Stratagene). After sequencing confirmed that they were correct, the Ubiquitin promoter (pUbi) was excised by HindIII/PstI double digestion and the CDS of the ARE4 gene by PstI/BamHI double digestion, and the two fragments were ligated together into the plant expression vector pCAMBIA1300 (HindIII/BamHI) to obtain the recombinant vector pCAMBIA1300-pUbi::ARE4.
The recombinant vector pCAMBIA1300-pUbi::ARE4 is a vector obtained by replacing the fragment between the HindIII and BamHI restriction sites of the pCAMBIA1300 vector with the DNA molecule shown in sequence 6; in the DNA molecule shown in sequence 6, positions 1 to 1981 are the Ubiquitin promoter sequence, and positions 1988 to 3328 are the CDS of the ARE4 gene.
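The coordinates given for sequence 6 can be checked against the other lengths stated in this example. The sketch below is plain interval arithmetic; reading the 6-nucleotide gap between promoter and CDS as the PstI site used for joining them is an interpretation, not something the text states.

```python
PROMOTER = (1, 1981)      # Ubiquitin promoter in sequence 6
CDS = (1988, 3328)        # ARE4 CDS in sequence 6
CDNA_LENGTH = 1344        # length of sequence 2, as stated in the text
STOP_CODON_LENGTH = 3


def span_length(start: int, end: int) -> int:
    """Number of nucleotides in a 1-based inclusive interval."""
    return end - start + 1


CDS_LENGTH = span_length(*CDS)
LINKER_LENGTH = CDS[0] - PROMOTER[1] - 1   # nucleotides between promoter and CDS

# The CDS was amplified without its stop codon, so it should be 3 nt shorter
# than the cDNA of sequence 2.
CDS_MATCHES_CDNA = CDS_LENGTH == CDNA_LENGTH - STOP_CODON_LENGTH

print(CDS_LENGTH, LINKER_LENGTH, CDS_MATCHES_CDNA)
```

The CDS spans 1341 nt, which equals 1344 minus the stop codon, and the 6-nt gap has the length of a single six-base restriction site such as PstI (CTGCAG).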
Secondly, construction of gene silencing vector pTCK303-ARE4
The silencing target sequence of ARE4 gene (corresponding to the 492 th to 806 th from the 5' end of the sequence 2 in the sequence table) was amplified by PCR using cDNA of Nipponbare of rice variety as a template with the following primer sequence pair.
The sequences of the primer pairs are as follows:
F:5’-GGTACCACTAGTCGAGCAGGAGAAGGCGTTCGAG-3' (the underlined part is the recognition sequence of KpnI SpeI, and the sequence thereafter is sequence No. 492-513 of sequence 2 in the sequence listing);
R:5’-GGATCCGAGCTCCGCTCCTGCTCAGAGGACTTAGC-3' (the underlined part is the recognition sequence for BamHI SacI, and the sequence following it is the reverse complement of positions 784 to 806 of sequence 2 in the sequence listing).
After the PCR product is connected with a pBluescript SK II (-) vector (Stratagene), an intermediate vector connected with an ARE4 gene silencing target sequence is obtained after the sequencing is correct; after the intermediate vector is subjected to SpeI and SacI double enzyme digestion, a target sequence (SpeI/SacI) is connected into a gene silencing vector pTCK303(SpeI/SacI), and a recombinant vector 1 is obtained after a sequencing result is correct; then, the intermediate vector is subjected to double enzyme digestion by KpnI and BamHI to obtain a 315bp target sequence (KpnI/BamHI), the target sequence is substituted for a fragment between KpnI/BamHI digestion sites of the recombinant vector 1, and the gene silencing vector pTCK303-ARE4 is obtained after the sequencing is correct.
The gene silencing vector pTCK303-ARE4 is a vector obtained by replacing the fragment between the SpeI and SacI restriction sites of the gene silencing vector pTCK303 with the fragment shown at positions 492 to 806 of sequence 2, and replacing the fragment between the KpnI and BamHI restriction sites of the gene silencing vector pTCK303 with the fragment shown at positions 492 to 806 of sequence 2.
Thirdly, acquisition and phenotypic analysis of ARE4 transgenic plants
1. Acquisition of ARE4 transgenic plants
The recombinant plant expression vector pCAMBIA1300-pUbi::ARE4 constructed in step First and the gene silencing vector pTCK303-ARE4 constructed in step Second were each transferred into Agrobacterium tumefaciens EHA105 to obtain the recombinant bacteria EHA105/pCAMBIA1300-pUbi::ARE4 and EHA105/pTCK303-ARE4.
The recombinant bacteria were each used to infect callus of the japonica rice variety Nipponbare; the specific transformation and screening method is described in the literature "Yi Zili, Cao Shouyun, Wang Li, He Sijie, Chu Chengcai, Tang Zuoshun, Zhou Puhua, Tian Wenzhong. Improving the frequency of Agrobacterium-mediated transformation of rice. Acta Genetica Sinica, 2001, 28(4): 352-358". Finally, two kinds of transgenic seedlings were obtained, namely rice plants overexpressing the ARE4 gene (named OE-ARE4) and rice plants in which ARE4 gene expression is silenced (named ARE4-RNAi).
2. Identification of ARE4 transgenic plants
The rice plant (named OE-ARE4) with over-expressed ARE4 gene and the rice plant (named ARE4-RNAi) with ARE4 gene expression silencing ARE identified according to the following two methods:
(1) preliminary identification by PCR
Genomic DNA of wild rice variety Nipponbare and the 2 transgenic plants are respectively extracted as templates, the following primer pair aiming at hygromycin gene HYG is adopted for identification through PCR amplification, and the plant containing the HYG gene (the size of the PCR product is 557bp) is identified to be the HYG positive transgenic plant.
HYG-F:5’-GTCTCCGACCTGATGCAGCTCTCGG-3’;
HYG-R:5’-GTCCGTCAGGACATTGTTGGAG-3’。
Obtaining rice plants (named OE-ARE4) with HYG positive overexpression ARE4 genes and rice plants (named ARE4-RNAi) with HYG positive ARE4 genes with silent expression.
(2) Analysis of transcript levels
Extracting total RNA from a wild rice variety Nipponbare, the rice plant with the HYG positive overexpression ARE4 gene obtained in the step (1) and the rice plant with the HYG positive ARE4 gene silent expression obtained in the step (1), carrying out reverse transcription on the total RNA to obtain a template, detecting the transcription level of the ARE4 gene by adopting the following primer sequence pairs through real-time quantitative fluorescent PCR, repeating the experiment for 3 times, and averaging the results.
The primer sequences for detecting the expression level of the ARE4 gene ARE as follows:
qARE4-1F: 5'-AGGACGAGCACAGGCTGTT-3' (positions 830 to 848 of SEQ ID NO: 2);
qARE4-1R: 5'-CCTGAGCAGCAGATGTATCTCC-3' (reverse complement of positions 1027 to 1048 of SEQ ID NO: 2).
The OsActin1 gene is used as an internal reference gene, and a detection primer pair comprises:
OsActin1-F:5’-CAACACCCCTGCTATGTACG-3’;
OsActin1-R:5’-CATCACCAGAGTCCAACACAA-3’。
the expression level of the internal reference gene OsActin1 gene is set as 1, and the relative expression level of ARE4 gene is calculated.
The results show that, compared with wild-type rice Nipponbare (relative expression level of the ARE4 gene: 1), the transcription level of the ARE4 gene in 2 lines of the HYG-positive rice plants overexpressing the ARE4 gene (relative expression levels of the ARE4 gene: 20.9 and 25.5) was markedly increased, and rice positively overexpressing the ARE4 gene was obtained; the expression level of the ARE4 gene in 2 lines of the HYG-positive rice plants with silenced ARE4 gene expression (relative expression levels of the ARE4 gene: 0.45 and 0.54) was about 0.3 to 0.6 times that of Nipponbare, and rice with positively silenced ARE4 gene expression was obtained.
The empty vectors pTCK303 and pCAMBIA1300 were each transferred into Nipponbare by the same method to obtain pTCK303-transgenic rice and pCAMBIA1300-transgenic rice, and the absence of target gene expression from these vectors was verified.
3. ARE4 transgenic plant phenotypic analysis
Seeds of the rice positively overexpressing the ARE4 gene (named OE-ARE4-Flag) and the rice with positively silenced ARE4 gene expression (named ARE4-RNAi) obtained in step 2 were selected, sown and subjected to field phenotypic analysis; 2 lines were used for each, with 40 seeds per line. Wild-type rice Nipponbare (also known as NPB), pTCK303-transgenic rice and pCAMBIA1300-transgenic rice were used as controls.
After 140 days of sowing, the plant height and the individual plant yield of each plant were examined.
Yield per plant: the sum of the mass of all seeds of a single plant.
The results are shown in FIG. 6.
Compared with the control plant Nipponbare (NPB), the plant height of the ARE4-RNAi transgenic plants (#1, #2) was reduced (FIG. 6b), indicating a reduction in biomass (FIG. 6a); the yield per plant of the ARE4-RNAi transgenic plants (#1, #2) was also reduced compared with the control plant Nipponbare (NPB) (FIG. 6c);
Compared with the control plant Nipponbare, the plant height of the OE-ARE4-Flag transgenic plants (#1, #2) was increased (FIG. 6e), indicating an increase in biomass (FIG. 6d); the yield per plant of the OE-ARE4-Flag transgenic plants (#1, #2) was also increased compared with the control plant Nipponbare (FIG. 6f).
The results show that the ARE4 gene can improve the plant height, biomass and single plant yield of plants.
The results of the pTCK303 transgenic rice and pCAMBIA1300 transgenic rice have no significant difference with those of wild rice.
Example 4 use of ARE4 protein in regulating Gene transcription
Construction of recombinant bacteria
1. Recombinant vector
And optimizing the cDNA shown in the sequence 2 according to the codon of the escherichia coli to obtain an optimized sequence 3.
The recombinant vector pGEX-ARE4 is a vector obtained by replacing the fragment between the BamHI and XmaI restriction sites of the pGEX-4T-1 vector (commercially available from GE Healthcare) with the fragment shown at positions 391 to 1344 of sequence 3 (corresponding to positions 388 to 1341 from the 5' end of sequence 2 in the sequence table before optimization). The vector expresses the recombinant ARE4 protein, whose amino acid sequence is sequence 7; positions 1 to 218 of the protein are the GST (glutathione S-transferase) tag, and positions 227 to 544 are the amino acid sequence of the truncated ARE4 protein.
2. Recombinant bacterium
The recombinant vector pGEX-ARE4 is transformed into BL21(DE3) escherichia coli competent cells (commercially available from TransGen Biotech), and the screened ampicillin resistant strain is the escherichia coli recombinant strain containing the recombinant vector pGEX-ARE 4.
3. Purification by inducible expression
The specific methods for induced expression and purification of the protein are described in a published article (Graslund, S., Nordlund, P., Weigelt, J., Hallberg, B.M., Bray, J., Gileadi, O., Knapp, S., Oppermann, U., Arrowsmith, C., Hui, R., et al. (2008). Protein production and purification. Nat. Methods 5, 135-146.). The recombinant bacterium from the previous step was inoculated into 5 mL of liquid LB medium containing 50 μg/mL ampicillin and cultured overnight at 37 ℃ and 220 rpm. The whole culture was then inoculated into 500 mL of liquid LB medium (ampicillin added to a final concentration of 50 μg/mL) and cultured at 37 ℃ with shaking at 220 rpm for 3-4 hours until the OD value reached 0.6-0.8; IPTG was added to a final concentration of 0.5 mM, and expression was induced overnight at 16 ℃ with shaking at 220 rpm. The cells were collected by centrifugation at 6000 rpm for 5 min at room temperature and resuspended in PBS buffer (containing protease inhibitor); after ultrasonication, the lysate was centrifuged at 12000 rpm for 30 min at 4 ℃ and the supernatant was transferred to a new centrifuge tube. 500 μL of Glutathione Sepharose 4B beads (GE Healthcare) were added; after binding for 4 h with gentle shaking at 4 ℃, the beads were washed 6-8 times with PBS buffer, and the protein was eluted with 1 mL of 10 mM GSH to obtain the ARE4 recombinant protein (amino acid sequence: sequence 7; protein concentration: 1 mg/mL; solvent: PBS, formulated as 2 mM KH2PO4, 8 mM Na2HPO4, 136 mM NaCl, 2.6 mM KCl, balance water, pH 7.4).
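The final concentrations in the induction step translate into stock volumes through the dilution relation C1·V1 = C2·V2. The sketch below is illustrative only: the stock concentrations (1 M IPTG and 100 mg/mL ampicillin) are assumptions, since the text specifies only the final concentrations, and the small volume added is neglected relative to the 500 mL culture.

```python
def stock_volume_ml(final_conc: float, final_volume_ml: float,
                    stock_conc: float) -> float:
    """Volume of stock needed, from C1*V1 = C2*V2.
    final_conc and stock_conc must be in the same unit."""
    if stock_conc <= final_conc:
        raise ValueError("stock must be more concentrated than the final solution")
    return final_conc * final_volume_ml / stock_conc


CULTURE_ML = 500

# IPTG: final 0.5 mM from an assumed 1 M (1000 mM) stock.
IPTG_ML = stock_volume_ml(final_conc=0.5, final_volume_ml=CULTURE_ML,
                          stock_conc=1000)

# Ampicillin: final 50 ug/mL from an assumed 100 mg/mL (100000 ug/mL) stock.
AMP_ML = stock_volume_ml(final_conc=50, final_volume_ml=CULTURE_ML,
                         stock_conc=100_000)

print(f"IPTG: {IPTG_ML * 1000:.0f} uL, ampicillin: {AMP_ML * 1000:.0f} uL")
```

With these assumed stocks, both additions come to 250 μL per 500 mL of culture; other stock concentrations scale the volume proportionally.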
Application of ARE4 protein in regulation and control of gene transcription
1. Binding of ARE4 recombinant protein to target nucleic acid probes
Method: For the specific procedure for binding the ARE4 recombinant protein to the target nucleic acid probes, refer to the gel retardation assay (probe synthesis and the related kit are commercially available from Thermo Fisher Scientific; for the specific protocol, refer to the kit instructions). In the invention, the OsNRT2.4 gene promoter region (NC_029256, 20388533-20390532, 07-AUG-2018) is taken as an example of a binding region of the ARE4 recombinant protein for specific experimental verification.
The nucleic acid probes for binding to the OsNRT2.4 gene promoter region are double-stranded DNA obtained by annealing the following primer pairs, wherein the OsNRT2.4-P1m1 to OsNRT2.4-P1m5 probes are obtained by single-base mutation of the OsNRT2.4-P1WT probe sequence:
synthetic primers for OsNRT2.4-P1WT probe:
OsNRT2.4-P1F:ACAGGAATCGgataaGAGAGATAGA
OsNRT2.4-P1R:TCTATCTCTCttatcCGATTCCTGT
synthetic primers for OsNRT2.4-P1m1 probe:
OsNRT2.4-P1Fm1:ACAGGAATCGtataaGAGAGATAGA
OsNRT2.4-P1Rm1:TCTATCTCTCttataCGATTCCTGT
synthetic primers for OsNRT2.4-P1m2 probe:
OsNRT2.4-P1Fm2:ACAGGAATCGgttaaGAGAGATAGA
OsNRT2.4-P1Rm2:TCTATCTCTCttaacCGATTCCTGT
synthetic primers for OsNRT2.4-P1m3 probe:
OsNRT2.4-P1Fm3:ACAGGAATCGgaaaaGAGAGATAGA
OsNRT2.4-P1Rm3:TCTATCTCTCttttcCGATTCCTGT
synthetic primers for OsNRT2.4-P1m4 probe:
OsNRT2.4-P1Fm4:ACAGGAATCGgattaGAGAGATAGA
OsNRT2.4-P1Rm4:TCTATCTCTCtaatcCGATTCCTGT
synthetic primers for OsNRT2.4-P1m5 probe:
OsNRT2.4-P1Fm5:ACAGGAATCGgatatGAGAGATAGA
OsNRT2.4-P1Rm5:TCTATCTCTCatatcCGATTCCTGT
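The primer pairs listed above can be checked for internal consistency: each reverse primer should be the exact reverse complement of its forward primer so that the pair anneals into a blunt 25 bp duplex, and each mutant should differ from the wild-type probe at a single base. The following is a minimal sketch of such a check; the helper names are illustrative and not part of the original disclosure.

```python
# Complement table that preserves upper/lower case, so the lowercase
# GATAA core stays visible after reverse-complementing.
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")

# Forward and reverse primers copied from the list above.
PROBES = {
    "WT": ("ACAGGAATCGgataaGAGAGATAGA", "TCTATCTCTCttatcCGATTCCTGT"),
    "m1": ("ACAGGAATCGtataaGAGAGATAGA", "TCTATCTCTCttataCGATTCCTGT"),
    "m2": ("ACAGGAATCGgttaaGAGAGATAGA", "TCTATCTCTCttaacCGATTCCTGT"),
    "m3": ("ACAGGAATCGgaaaaGAGAGATAGA", "TCTATCTCTCttttcCGATTCCTGT"),
    "m4": ("ACAGGAATCGgattaGAGAGATAGA", "TCTATCTCTCtaatcCGATTCCTGT"),
    "m5": ("ACAGGAATCGgatatGAGAGATAGA", "TCTATCTCTCatatcCGATTCCTGT"),
}


def reverse_complement(seq):
    """Reverse complement of a DNA string, keeping the case of each base."""
    return seq.translate(COMPLEMENT)[::-1]


def mutated_positions(wild_type, mutant):
    """1-based positions at which two equal-length sequences differ (case-insensitive)."""
    pairs = zip(wild_type.upper(), mutant.upper())
    return [i + 1 for i, (a, b) in enumerate(pairs) if a != b]


wt_forward = PROBES["WT"][0]
for name, (forward, reverse) in PROBES.items():
    assert len(forward) == 25, name
    assert reverse_complement(forward) == reverse, name
    if name != "WT":
        print(name, "differs from WT at position(s)", mutated_positions(wt_forward, forward))
```

With the sequences as listed, the five mutants alter positions 11 to 15 of the probe in turn, that is, each base of the GATAA core once.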
Results: To confirm whether the ARE4 recombinant protein can bind directly to the promoter region of the OsNRT2.4 gene, a gel retardation experiment was performed to detect the binding of the ARE4 recombinant protein to a nucleic acid probe (probe length 25 bp) containing the GATAA sequence in the promoter region of the OsNRT2.4 gene.
The results of the gel retardation experiment are shown in FIG. 7, wherein WT is the OsNRT2.4-P1WT probe and M1 to M5 are the OsNRT2.4-P1m1 to OsNRT2.4-P1m5 probes. The ARE4 protein binds directly to the OsNRT2.4 promoter sequence containing the GATAA site and does not bind to promoter sequences in which the GATAA site is mutated, which indicates that the ARE4 protein can bind directly to the OsNRT2.4 promoter in vitro.
2. Transcriptional activation of target gene expression by ARE4 protein
Given that the OsNRT2.1, OsNRT2.2 and OsNRT2.4 genes are likely candidate target genes of the ARE4 protein, detection was performed as follows:
Effectors: divided into an experimental group and a control group;
experimental group: recombinant vector pCAMBIA1300-pUbi::ARE4, the vector constructed in Example 3; control group: pCAMBIA1300;
Reporters: the following three:
pGreenII 0800-LUC-pOsNRT2.1 is a vector obtained by replacing the fragment between the SalI and NcoI restriction sites of the pGreenII 0800-LUC vector with the promoter pOsNRT2.1 sequence (NC_029257, 653310-655309, 07-AUG-2018) of the OsNRT2.1 gene;
pGreenII 0800-LUC-pOsNRT2.2 is a vector obtained by replacing the fragment between the SalI and NcoI restriction sites of the pGreenII 0800-LUC vector with the promoter pOsNRT2.2 sequence (NC_029257, 669066-671065, 07-AUG-2018) of the OsNRT2.2 gene;
pGreenII 0800-LUC-pOsNRT2.4 is a vector obtained by replacing the fragment between the SalI and NcoI restriction sites of the pGreenII 0800-LUC vector with the promoter pOsNRT2.4 sequence (NC_029256, 20388533-20390532, 07-AUG-2018) of the OsNRT2.4 gene.
According to the experimental design, plasmid DNA in different combinations (effector : reporter = 2:1) was transformed into protoplasts of Nipponbare rice and cultured for 12 h; measurement was then performed using a dual-luciferase reporter assay kit (Dual-Luciferase Reporter Assay System, commercially available from Promega Corporation) and a single-tube chemiluminescence detector (GLOMAX 20/20, commercially available from Promega Corporation), with the specific procedures following the instructions of the assay kit and the detector. Control group: only the effector control pCAMBIA1300 was added as the effector.
The results are shown in FIG. 8. The transcriptional regulation of the OsNRT2 genes by the effector ARE4 was detected in rice protoplasts using the dual-luciferase reporter system; compared with the control group, the ARE4 protein significantly activated expression driven by the OsNRT2.1, OsNRT2.2 and OsNRT2.4 gene promoters, which indicates that the ARE4 protein is a transcription factor that promotes gene transcription.
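The text does not state how the luminescence readings were processed. A common way to analyze a dual-luciferase assay is to divide the firefly luciferase (LUC) signal by the Renilla luciferase (REN) signal of the same sample and then express the experimental group relative to the control group. The sketch below illustrates that calculation only; the function names and all readings are hypothetical and are not data from the invention.

```python
from statistics import mean


def relative_luc(luc, ren):
    """LUC/REN ratio for one sample; REN serves as the internal reference."""
    if ren <= 0:
        raise ValueError("REN reading must be positive")
    return luc / ren


def fold_activation(experimental, control):
    """Mean LUC/REN of the experimental group divided by that of the control group.

    Each argument is a list of (LUC, REN) readings, one tuple per replicate.
    """
    exp_mean = mean(relative_luc(luc, ren) for luc, ren in experimental)
    ctrl_mean = mean(relative_luc(luc, ren) for luc, ren in control)
    return exp_mean / ctrl_mean


if __name__ == "__main__":
    # Hypothetical readings for illustration only.
    are4_effector = [(9000, 1000), (11000, 1000)]
    empty_vector = [(2000, 1000), (2000, 1000)]
    print(fold_activation(are4_effector, empty_vector))
```

A fold value above 1 for a given reporter would correspond to activation of that promoter by the effector relative to the empty-vector control; whether a difference is significant still requires a statistical test on the replicates.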
Sequence listing
<110> Institute of Genetics and Developmental Biology, Chinese Academy of Sciences
<120> Application of rice nitrogen metabolism regulatory protein ARE4 and coding gene thereof
<160> 7
<170> PatentIn version 3.5
<210> 1
<211> 447
<212> PRT
<213> Artificial sequence
<400> 1
Met Ser Ala Ser Ala Ser Ala Val Cys Leu Leu Pro Pro Arg Gly Gly
1 5 10 15
Ser Leu Ala Arg Pro Asp Thr Ala Leu Pro Pro Ala Ser Gln Pro Ala
20 25 30
Thr Val Ala Val Asn Gln Asn Ile Pro Arg Leu Ala Ser Pro Arg Leu
35 40 45
Ala Val Thr Ser Ile Thr Leu Leu Pro Arg Arg Gly Arg Arg Cys Ala
50 55 60
Val Asp Leu Leu Leu Leu His Leu His Arg Leu Leu Leu Phe Leu Ser
65 70 75 80
Leu Phe Ser Glu Glu Thr Pro Asn Leu Phe Leu Pro Arg Lys Pro Ala
85 90 95
Ala Phe Leu Lys Arg Ile Lys Ser Pro Ser Leu Ile Arg Arg Cys Asn
100 105 110
Pro Ser Pro Gln Asn Leu Ala Ala Pro Arg Ala Val Leu Gly Phe Glu
115 120 125
Leu Met Ala Val Glu Glu Ala Ser Ser Ser Ser Gly Gly Gly Arg Gly
130 135 140
Gly Gly Gly Gly Gly Gly Gly Glu Glu Gly Leu Ser Gly Cys Gly Gly
145 150 155 160
Gly Trp Thr Arg Glu Gln Glu Lys Ala Phe Glu Asn Ala Leu Ala Thr
165 170 175
Val Gly Asp Asp Glu Glu Glu Gly Asp Gly Leu Trp Glu Lys Leu Ala
180 185 190
Glu Ala Val Glu Gly Lys Thr Ala Asp Glu Val Arg Arg His Tyr Glu
195 200 205
Leu Leu Val Glu Asp Val Asp Gly Ile Glu Ala Gly Arg Val Pro Leu
210 215 220
Leu Val Tyr Ala Gly Asp Gly Gly Val Glu Glu Gly Ser Ala Gly Gly
225 230 235 240
Gly Lys Lys Gly Gly Gly Gly Gly Gly Gly Gly Gly Gly Gly Gly His
245 250 255
Gly Glu Lys Gly Ser Ala Lys Ser Ser Glu Gln Glu Arg Arg Lys Gly
260 265 270
Ile Ala Trp Thr Glu Asp Glu His Arg Leu Phe Leu Leu Gly Leu Glu
275 280 285
Lys Tyr Gly Lys Gly Asp Trp Arg Ser Ile Ser Arg Asn Phe Val Ile
290 295 300
Ser Arg Thr Pro Thr Gln Val Ala Ser His Ala Gln Lys Tyr Phe Ile
305 310 315 320
Arg Leu Asn Ser Met Asn Arg Glu Arg Arg Arg Ser Ser Ile His Asp
325 330 335
Ile Thr Ser Val Asn Asn Gly Asp Thr Ser Ala Ala Gln Gly Pro Ile
340 345 350
Thr Gly Gln Pro Asn Gly Pro Ser Ala Asn Pro Gly Lys Ser Ser Lys
355 360 365
Gln Ser Leu Gln Pro Ala Asn Ala Pro Pro Gly Val Asp Ala Tyr Gly
370 375 380
Thr Thr Ile Gly Gln Pro Val Gly Gly Pro Leu Val Ser Ala Val Gly
385 390 395 400
Thr Pro Val Thr Leu Pro Val Pro Ala Ala Pro His Ile Ala Tyr Gly
405 410 415
Met His Ala Pro Val Pro Gly Ala Val Val Pro Gly Ala Pro Val Asn
420 425 430
Met Pro Pro Met Pro Tyr Pro Met Pro Pro Pro Thr Ser His Gly
435 440 445
<210> 2
<211> 1344
<212> DNA
<213> Artificial sequence
<400> 2
atgtccgcgt ctgcatccgc tgtctgtctc cttcctcccc gaggaggaag cctggcccga 60
cccgacaccg cactcccccc agccagccag ccagccactg tcgcggtgaa ccaaaatatc 120
ccccgcctcg cctcgcctcg cctcgcggta acatccatca cactcctccc ccgccgcggc 180
cgccgctgcg cggtagatct cctcctcctc cacctccacc gtctcctgct cttcctctct 240
ctcttctcgg aggaaacccc caatttattc cttccccgca aacccgcagc cttcctaaaa 300
cgaattaaat ctccctccct tatccgccgc tgcaatccct ccccacaaaa tctcgctgcg 360
ccgcgcgctg ttttagggtt tgagctcatg gcggtggagg aggcgagcag cagcagtggc 420
ggcggtcgtg gtgggggtgg cggtgggggt ggggaggagg ggttgtccgg ttgcggcggt 480
gggtggacgc gcgagcagga gaaggcgttc gagaacgcgc tggcgacggt gggggatgac 540
gaggaggaag gggacgggtt gtgggagaag ctagcggagg ccgtggaggg gaagacggcc 600
gacgaggtga ggcggcacta cgagctgctg gtggaggacg tcgacggcat cgaggccggg 660
cgggtgccgc tcctggtgta cgccggcgac gggggcgtcg aggagggctc tgcgggaggt 720
gggaagaagg ggggtggtgg gggaggaggt ggaggtggag gggggcatgg ggagaagggg 780
tcggctaagt cctctgagca ggagcgccgg aaggggatcg cctggacgga ggacgagcac 840
aggctgttcc ttcttggact tgagaagtac ggcaaaggcg actggaggag tatctcaaga 900
aactttgtga tctcaaggac acccacccaa gtagctagtc atgcacagaa gtattttatt 960
cgcctgaact caatgaacag agagaggcgg cgatcaagta tacatgacat aaccagcgtg 1020
aacaatggag atacatctgc tgctcagggg ccaatcacag gtcagccaaa tggcccatca 1080
gcaaatcctg gaaaatcctc taagcagtct ctacagccag caaatgcgcc tccaggcgtc 1140
gatgcttatg gtacgacaat tggacagcca gttggtggtc ctcttgtgtc cgcagttggc 1200
actcctgtta cacttcctgt tcctgctgca cctcatatag cctatggcat gcatgcccct 1260
gtccctggag ctgtagtccc tggtgcccca gtaaacatgc ctccaatgcc ctaccccatg 1320
ccgccaccaa catctcatgg atga 1344
<210> 3
<211> 1374
<212> DNA
<213> Artificial sequence
<400> 3
catatgagcg cgagcgcgag cgcggtttgc ctgctgccgc cgcgtggtgg tagcctggcg 60
cgtccggata ccgcgctgcc gccggcgagc cagccggcga ccgtggcggt taaccaaaac 120
attccgcgtc tggcgagccc gcgtctggcg gtgaccagca ttaccctgct gccgcgtcgt 180
ggtcgtcgtt gcgcggttga cctgctgctg ctgcacctgc accgtctgct gctgttcctg 240
agcctgttta gcgaggaaac cccgaacctg ttcctgccgc gtaagccggc ggcgtttctg 300
aagcgtatca aaagcccgag cctgattcgt cgttgcaacc cgagcccgca gaacctggcg 360
gcgccgcgtg cggtgctggg tttcgaactg atggcggttg aggaagcgag cagcagcagc 420
ggtggcggtc gtggcggtgg cggtggcggt ggcggtgagg aaggcctgag cggttgcggc 480
ggtggctgga cccgtgaaca agagaaagcg tttgagaacg cgctggcgac cgtgggtgac 540
gatgaggaag agggcgacgg tctgtgggaa aagctggcgg aagcggtgga gggtaaaacc 600
gcggatgagg ttcgtcgtca ctacgaactg ctggttgagg acgttgatgg catcgaagcg 660
ggtcgtgtgc cgctgctggt ttatgcgggt gatggtggcg ttgaagaggg cagcgcgggt 720
ggcggtaaga aaggcggtgg cggtggcggt ggcggtggcg gtggcggtca tggtgaaaag 780
ggtagcgcga aaagcagcga acaggagcgt cgtaagggta ttgcgtggac cgaagacgag 840
caccgtctgt tcctgctggg cctggagaag tacggcaagg gtgattggcg tagcatcagc 900
cgtaacttcg tgattagccg taccccgacc caggttgcga gccacgcgca aaaatatttt 960
atccgtctga acagcatgaa ccgtgagcgt cgtcgtagca gcatccacga cattaccagc 1020
gtgaacaacg gtgataccag cgcggcgcag ggtccgatta ccggtcaacc gaacggtccg 1080
agcgcgaacc cgggcaagag cagcaaacag agcctgcaac cggcgaacgc gccgccgggc 1140
gtggatgcgt acggtaccac cattggtcaa ccggttggcg gtccgctggt gagcgcggtt 1200
ggtaccccgg tgaccctgcc ggttccggcg gcgccgcaca ttgcgtatgg catgcatgcg 1260
ccggtgccgg gtgcggtggt tccgggcgcg ccggttaaca tgccgccgat gccgtatccg 1320
atgccgccgc cgaccagcca cggtcaccac caccaccacc actaatgaaa gctt 1374
<210> 4
<211> 4432
<212> DNA
<213> Artificial sequence
<400> 4
ggccatgtcc gcgtctgcat ccgctgtctg tctccttcct ccccgaggag gaagcctggc 60
ccgacccgac accgcactcc ccccagccag ccagccagcc actgtcgcgg tgaaccaaaa 120
tatcccccgc ctcgcctcgc ctcgcctcgc ggtaacatcc atcacactcc tcccccgccg 180
cggccgccgc tgcgcggtag atctcctcct cctccacctc caccgtctcc tgctcttcct 240
ctctctcttc tcggaggaaa cccccaattt attccttccc cgcaaacccg cagccttcct 300
aaaacgaatt aaatctccct cccttatccg ccgctgcaat ccctccccac aaaatctcgc 360
tgcgccgcgc gctgttttag ggtttgagct catggcggtg gaggaggcga gcagcagcag 420
tggcggcggt cgtggtgggg gtggcggtgg gggtggggag gaggggttgt ccggttgcgg 480
cggtgggtgg acgcgcgagc aggagaaggc gttcgagaac gcgctggcga cggtggggga 540
tgacgaggag gaaggggacg ggttgtggga gaagctagcg gaggccgtgg aggggaagac 600
ggccgacgag gtgaggcggc actacgagct gctggtggag gacgtcgacg gcatcgaggc 660
cgggcgggtg ccgctcctgg tgtacgccgg cgacgggggc gtcgaggagg gctctgcggg 720
aggtgggaag aaggggggtg gtgggggagg aggtggaggt ggaggggggc atggggagaa 780
ggggtcggct aagtcctctg agcaggagcg ccggaagggg atcgcctgga cggaggacga 840
gcacaggtta gctttgcctt cgttcctatc taccaaattg cattgctgct ctagcctaga 900
caatatttga tgattgcaga aactggcttc tgttcggagc ctgtacaact tcactgtttt 960
attgtggatt aatccgtcta gtcattgtaa atggaagttg aaattgaaat gctctgggaa 1020
ttttggaaat cgctgtttag acagtctagc agcttctttt ggttgcagaa ctgcatactt 1080
gtaggtgtgt tgctcttcag ctttctttgc actggacaat taggtagctc ccttttctat 1140
ttcttgaata aatgattgca taggagctag aatagtagtt tatttaatcg tgaattaggt 1200
accgcttctg tgttgttggg aattttgccg attggtttgt cttgttattg cagcctgttt 1260
acgattttgg aaattactgc ttcagacccc aggaagtagg cttgttggtt tgtgtgtgct 1320
aggttttgtc ccgcatggtt tttttatggg tttcaccact tggcagttta tgttacatcc 1380
tgcatagttc ttctcagatg gagcggctga ttacttgtgc tgcatacttg ttttcaactg 1440
gaagtggttc attggagatg ttgtttcttc cttgctagtt tcgacaactc cagtaactac 1500
ttaactcaat caatttttga attataaact gttttctaac aataacctat tcacgttact 1560
tcttatggtg gatcctatgc atttcgagct acatcccact gatttagttt taggtttgtt 1620
tctcatttct tcttgctgaa ggcttgttta attgattcat cttagcatat ttcaccttcc 1680
tcctaaaagg atgagaaaat tctgttcata gcacaattcg taaaatacaa ctaagctttt 1740
tctcactctg cttaatacca tagcgcatca gttttttgtt tgaaaaatac caccgaaaga 1800
ccaaaataat tttacttcaa ttgcaatgca acaagaccta cctcttacgc attcctgttt 1860
gctggggttc ctctggcgcg ttctccttct ctttgttcaa tctgacacct ccattagctc 1920
atacatcatc tttaatcaca aaccaacaga tcaaacagat agtttatcaa atatctttgt 1980
caccattaca gaaatctcct atttcatagg tacaaataaa agaaagaaag caaatctgtc 2040
tgcttgcatc tactttgatt ttccgaacgc caatgctcaa tcccctacat agctccagca 2100
gttcatggaa tacaaaatta aacttggtca tttggcttca aagtggagtg aagaacgatg 2160
ttgtgcatgt gcgccttggc ttccagctag aagatcgatt agccattgac ctgggttaac 2220
acgtggtgag ccagcaggct agcgtcactg gcaagcaacc atacggccac atccaccaag 2280
cgcgtttgct gtgtgcaggt tggagggacg gcaatcacca tggacgtcgc taagcttcga 2340
gctcacagga gttggttgag ggtgaggcgg tggcaccagc aagtgaccac actgccgcgt 2400
ccagcaatcg cgagcataca ggggagggga atggcggcca ccactagccc gtgcctgccg 2460
agaatgacga aacacaagca ggccccagca gcggaagagg gggttagatc aagggatttg 2520
aggatttgga ggagatgggg aattattgtt cagtcaggaa cttacaagat gggttcgaat 2580
catctaggct caatctgagg gacaattttg tccaaataaa ctcccgttag ctcaaaagat 2640
gacggtagtg gtatttcccc tagaaaaacc cacaacctgt ggcattatga aggtttgtgc 2700
tattttctga atttaatggt cagcgtgtgt catggacgaa ttttccctaa aaggagttag 2760
aggaatcagg acaatggtag cttataactg atttttgtcc atcttaaatt tatttcagca 2820
aaatgaacat tttgaacgca tttgtttcac atttcatttt tttttccgta tttgatatgc 2880
taattttagt tgaaattttc ctgctagctt tagaccggaa caagctagta catgcttgct 2940
gtgcgcatat aattggtttt tgtgtctttt agacatagtc ctgttttcat ttgcaacctg 3000
tttcttccat gatattgtgg acagtatttc tggtccaagg agataaaatt ttgatatctt 3060
attggcatat tggtaatatt gaatgtgttt tatcagtata taaggcagta tgcttgtcag 3120
atccccatat aaaatgatgt gattatcgtg tatttttgtc tgtttgaatg gctgaatgag 3180
tgaaatagac ataacatacc aagatgatgc agagcatgaa aatacgtttc aagtagaagt 3240
aacctgatgc cctgatggat gctaatgtgt tgttcgttca ttgccatgta tgcacatgca 3300
tgatattgat tacaactcaa gtgaaagcat ggttgtaaaa atggcacaac taacctcctt 3360
ttgttctgat gcataaattt gtagctatta cccctctctt caggatgaga aatatccaaa 3420
gttgattaat tcatgcagac tgccctttac tcatatggtt attgatgtgt cctaccactt 3480
gattttaatt tgcaggctgt tccttcttgg acttgagaag tacggcaaag gcgactggag 3540
gagtatctca agaaactttg tgatctcaag gacacccacc caagtagcta gtcatgcaca 3600
gaagtatttt attcgcctga actcaatgaa cagagagagg cggcgatcaa gtatacatga 3660
cataaccagc gtgaacaatg gagatacatc tgctgctcag gggccaatca caggtcagcc 3720
aaatggccca tcagcaaatc ctggaaaatc ctctaagcag tctctacagc cagcaaatgc 3780
gcctccaggc gtcgatgctt atggtacgac aattggacag ccagttggtg gtcctcttgt 3840
gtccgcagtt ggcactcctg ttacacttcc tgttcctgct gcacctcata tagcctatgg 3900
catgcatgcc cctgtccctg gagctgtagt ccctggtgcc ccagtaaaca tgcctccaat 3960
gccctacccc atgccgccac caacatctca tggatgaggg ctttgaatac tacagttctt 4020
ctagacaaac tcataatatc tgtcttgttt agagtttcaa tgcatgctgt tatgtctcaa 4080
taaagcaata tcaataaact cttgtacatt acaaatggtt attgaatgta gcattttgag 4140
gacatcctgg actgtattta tcatctttgt tacgcctgca cttcgttcca tcttcaatgt 4200
acgcctgcca ccctgcccca gtcgtaaaat ggttggaatg ctgaatctcc ttcagcccag 4260
atgtagtggt ttattatctg aaaagtaaat atcgagtcaa tacgtaatca tgaacatatg 4320
taatggttca gtaagtatcc gactatctga ttcgtaatta tgaacagatg tattggttca 4380
ctacctgatt cgtactagta atgaacctgg tgcaggtaca aagagatcga aa 4432
<210> 5
<211> 1843
<212> DNA
<213> Artificial sequence
<400> 5
cggagcgaat acgagacgga acgaatacgg tagcgaatat ttatcggtat ataaaaaacc 60
cctcaaattg agtttcttga tcaaggaaga gatatcgctt attattttag ttcaacatct 120
ccaacattta tatcgtcaat tttatagacg gtcccacaac tgtatgtgga aatcgatttt 180
catggctgtt cctctaagag atccatatgc aaatatgatt atcattttct attcccgaga 240
cctttcacta gatgtataac ttacttacca ttgtataaat tggagatttt gtttatttta 300
cttcacatct tcgaaacttg taatgtttgt attgtacttt aaatgctttc aaatacaaat 360
gttataaact gcaaagtggt agatcccatt gagctctaca attttgatat ggaacacatc 420
tcctcagatg tcgttgaatt gtagatctga gattttgtaa aaattaatat ggtatattat 480
aatgaatatt tagacccata aatgacctca aataataaaa tagtcaataa taaagttgta 540
gatctcatcg agctctacaa tgttgatata aagtttgtct tcatctgatt ccgtatgaaa 600
aagttatgta tatatacgtg ttttttatat aatttgctta atgtctgcgg atatctgaaa 660
aaaattctgg atagtttccg accgttttct gattccgacg gatattaccc ttactgtatt 720
cgttttcgtt tccaagaaaa aatatccgaa ttcgtttccg aatccgagaa tttccggata 780
attccgactg aaactatcct aatccgaaaa atggtccgga cggacgaaaa ctatccgaac 840
cagtttcatc cctacttaaa agcttacagg gtggtgctaa tatagctaaa caaataatag 900
ggtgatggga tgatccttct aggtaactaa ccatgttata attctagctc atgaattact 960
cacttcatcc cataatataa agaattttga agggatgtga cacttcttag gactacgaat 1020
ctggataaag agcctgtcaa gattcgtagt cctagaaagt gtcactcccc tctaaaattc 1080
tttgtattat gagacggaag gagtatttgt ttttgtaata tttaaatcca tgatgtagag 1140
tattatggtc aaaaatagcc tttttagata actaatatac tggtagacat agatttacct 1200
ccattttact gcatgaattc cattttttat acatataaga taactagata agtcccatat 1260
atccatgaac tgagaacatt tatttacgcc aaacattata tgtctatatt ggtaaaacaa 1320
ttaaatatat ctgttcttta taaaccgcta tcacgcttga aacacgacaa aacttgttga 1380
atataaataa ttgaaaagat gcaaatcatg ctctcgctat caccggtaat caggagcata 1440
tggacagcga caggaagaga taaacacgaa ctcatgatta atactgccta gacgctgttg 1500
attttttatc taacgtttga tcattcgtct tattcaaaaa atgtatataa ttattattca 1560
ttttagtgtg acttaattca tcatcaaata ttctttaagc atgatataaa tattttcatt 1620
ttacacaaaa ataaaacgaa tagtcaaaca ttggttaaaa agtcaacgac gttatacatt 1680
gaaatacgga gagagtagta accagctagt acatctataa cacccaaaaa gaaaagtccc 1740
tccccacaaa atcacagaaa gagaacaaaa tgaaaaagga aaaaaaaaga aaaaaaaaag 1800
agacggagaa ataatacggc cggcgtcgcg ccagccagcg gcc 1843
<210> 6
<211> 3370
<212> DNA
<213> Artificial sequence
<400> 6
tgcagcgtga cccggtcgtg cccctctcta gagataatga gcattgcatg tctaagttat 60
aaaaaattac cacatatttt ttttgtcaca cttgtttgaa gtgcagttta tctatcttta 120
tacatatatt taaactttac tctacgaata atataatcta tagtactaca ataatatcag 180
tgttttagag aatcatataa atgaacagtt agacatggtc taaaggacaa ttgagtattt 240
tgacaacagg actctacagt tttatctttt tagtgtgcat gtgttctcct ttttttttgc 300
aaatagcttc acctatataa tacttcatcc attttattag tacatccatt tagggtttag 360
ggttaatggt ttttatagac taattttttt agtacatcta ttttattcta ttttagcctc 420
taaattaaga aaactaaaac tctattttag tttttttatt taataattta gatataaaat 480
agaataaaat aaagtgacta aaaattaaac aaataccctt taagaaatta aaaaaactaa 540
ggaaacattt ttcttgtttc gagtagataa tgccagcctg ttaaacgccg tcgacgagtc 600
taacggacac caaccagcga accagcagcg tcgcgtcggg ccaagcgaag cagacggcac 660
ggcatctctg tcgctgcctc tggacccctc tcgagagttc cgctccaccg ttggacttgc 720
tccgctgtcg gcatccagaa attgcgtggc ggagcggcag acgtgagccg gcacggcagg 780
cggcctcctc ctcctctcac ggcaccggca gctacggggg attcctttcc caccgctcct 840
tcgctttccc ttcctcgccc gccgtaataa atagacaccc cctccacacc ctctttcccc 900
aacctcgtgt tgttcggagc gcacacacac acaaccagat ctcccccaaa tccacccgtc 960
ggcacctccg cttcaaggta cgccgctcgt cctccccccc cccccctctc taccttctct 1020
agatcggcgt tccggtccat ggttagggcc cggtagttct acttctgttc atgtttgtgt 1080
tagatccgtg tttgtgttag atccgtgctg ctagcgttcg tacacggatg cgacctgtac 1140
gtcagacacg ttctgattgc taacttgcca gtgtttctct ttggggaatc ctgggatggc 1200
tctagccgtt ccgcagacgg gatcgatttc atgatttttt ttgtttcgtt gcatagggtt 1260
tggtttgccc ttttccttta tttcaatata tgccgtgcac ttgtttgtcg ggtcatcttt 1320
tcatgctttt ttttgtcttg gttgtgatga tgtggtctgg ttgggcggtc gttctagatc 1380
ggagtagaat tctgtttcaa actacctggt ggatttatta attttggatc tgtatgtgtg 1440
tgccatacat attcatagtt acgaattgaa gatgatggat ggaaatatcg atctaggata 1500
ggtatacatg ttgatgcggg ttttactgat gcatatacag agatgctttt tgttcgcttg 1560
gttgtgatga tgtggtgtgg ttgggcggtc gttcattcgt tctagatcgg agtagaatac 1620
tgtttcaaac tacctggtgt atttattaat tttggaactg tatgtgtgtg tcatacatct 1680
tcatagttac gagtttaaga tggatggaaa tatcgatcta ggataggtat acatgttgat 1740
gtgggtttta ctgatgcata tacatgatgg catatgcagc atctattcat atgctctaac 1800
cttgagtacc tatctattat aataaacaag tatgttttat aattattttg atcttgatat 1860
acttggatga tggcatatgc agcagctata tgtggatttt tttagccctg ccttcatacg 1920
ctatttattt gcttggtact gtttcttttg tcgatgctca ccctgttgtt tggtgttact 1980
tctgcagatg tccgcgtctg catccgctgt ctgtctcctt cctccccgag gaggaagcct 2040
ggcccgaccc gacaccgcac tccccccagc cagccagcca gccactgtcg cggtgaacca 2100
aaatatcccc cgcctcgcct cgcctcgcct cgcggtaaca tccatcacac tcctcccccg 2160
ccgcggccgc cgctgcgcgg tagatctcct cctcctccac ctccaccgtc tcctgctctt 2220
cctctctctc ttctcggagg aaacccccaa tttattcctt ccccgcaaac ccgcagcctt 2280
cctaaaacga attaaatctc cctcccttat ccgccgctgc aatccctccc cacaaaatct 2340
cgctgcgccg cgcgctgttt tagggtttga gctcatggcg gtggaggagg cgagcagcag 2400
cagtggcggc ggtcgtggtg ggggtggcgg tgggggtggg gaggaggggt tgtccggttg 2460
cggcggtggg tggacgcgcg agcaggagaa ggcgttcgag aacgcgctgg cgacggtggg 2520
ggatgacgag gaggaagggg acgggttgtg ggagaagcta gcggaggccg tggaggggaa 2580
gacggccgac gaggtgaggc ggcactacga gctgctggtg gaggacgtcg acggcatcga 2640
ggccgggcgg gtgccgctcc tggtgtacgc cggcgacggg ggcgtcgagg agggctctgc 2700
gggaggtggg aagaaggggg gtggtggggg aggaggtgga ggtggagggg ggcatgggga 2760
gaaggggtcg gctaagtcct ctgagcagga gcgccggaag gggatcgcct ggacggagga 2820
cgagcacagg ctgttccttc ttggacttga gaagtacggc aaaggcgact ggaggagtat 2880
ctcaagaaac tttgtgatct caaggacacc cacccaagta gctagtcatg cacagaagta 2940
ttttattcgc ctgaactcaa tgaacagaga gaggcggcga tcaagtatac atgacataac 3000
cagcgtgaac aatggagata catctgctgc tcaggggcca atcacaggtc agccaaatgg 3060
cccatcagca aatcctggaa aatcctctaa gcagtctcta cagccagcaa atgcgcctcc 3120
aggcgtcgat gcttatggta cgacaattgg acagccagtt ggtggtcctc ttgtgtccgc 3180
agttggcact cctgttacac ttcctgttcc tgctgcacct catatagcct atggcatgca 3240
tgcccctgtc cctggagctg tagtccctgg tgccccagta aacatgcctc caatgcccta 3300
ccccatgccg ccaccaacat ctcatggacc cggggattac aaggatgacg acgataagtg 3360
ctaagctagc 3370
<210> 7
<211> 553
<212> PRT
<213> Artificial sequence
<400> 7
Met Ser Pro Ile Leu Gly Tyr Trp Lys Ile Lys Gly Leu Val Gln Pro
1 5 10 15
Thr Arg Leu Leu Leu Glu Tyr Leu Glu Glu Lys Tyr Glu Glu His Leu
20 25 30
Tyr Glu Arg Asp Glu Gly Asp Lys Trp Arg Asn Lys Lys Phe Glu Leu
35 40 45
Gly Leu Glu Phe Pro Asn Leu Pro Tyr Tyr Ile Asp Gly Asp Val Lys
50 55 60
Leu Thr Gln Ser Met Ala Ile Ile Arg Tyr Ile Ala Asp Lys His Asn
65 70 75 80
Met Leu Gly Gly Cys Pro Lys Glu Arg Ala Glu Ile Ser Met Leu Glu
85 90 95
Gly Ala Val Leu Asp Ile Arg Tyr Gly Val Ser Arg Ile Ala Tyr Ser
100 105 110
Lys Asp Phe Glu Thr Leu Lys Val Asp Phe Leu Ser Lys Leu Pro Glu
115 120 125
Met Leu Lys Met Phe Glu Asp Arg Leu Cys His Lys Thr Tyr Leu Asn
130 135 140
Gly Asp His Val Thr His Pro Asp Phe Met Leu Tyr Asp Ala Leu Asp
145 150 155 160
Val Val Leu Tyr Met Asp Pro Met Cys Leu Asp Ala Phe Pro Lys Leu
165 170 175
Val Cys Phe Lys Lys Arg Ile Glu Ala Ile Pro Gln Ile Asp Lys Tyr
180 185 190
Leu Lys Ser Ser Lys Tyr Ile Ala Trp Pro Leu Gln Gly Trp Gln Ala
195 200 205
Thr Phe Gly Gly Gly Asp His Pro Pro Lys Ser Asp Leu Val Pro Arg
210 215 220
Gly Ser Met Ala Val Glu Glu Ala Ser Ser Ser Ser Gly Gly Gly Arg
225 230 235 240
Gly Gly Gly Gly Gly Gly Gly Gly Glu Glu Gly Leu Ser Gly Cys Gly
245 250 255
Gly Gly Trp Thr Arg Glu Gln Glu Lys Ala Phe Glu Asn Ala Leu Ala
260 265 270
Thr Val Gly Asp Asp Glu Glu Glu Gly Asp Gly Leu Trp Glu Lys Leu
275 280 285
Ala Glu Ala Val Glu Gly Lys Thr Ala Asp Glu Val Arg Arg His Tyr
290 295 300
Glu Leu Leu Val Glu Asp Val Asp Gly Ile Glu Ala Gly Arg Val Pro
305 310 315 320
Leu Leu Val Tyr Ala Gly Asp Gly Gly Val Glu Glu Gly Ser Ala Gly
325 330 335
Gly Gly Lys Lys Gly Gly Gly Gly Gly Gly Gly Gly Gly Gly Gly Gly
340 345 350
His Gly Glu Lys Gly Ser Ala Lys Ser Ser Glu Gln Glu Arg Arg Lys
355 360 365
Gly Ile Ala Trp Thr Glu Asp Glu His Arg Leu Phe Leu Leu Gly Leu
370 375 380
Glu Lys Tyr Gly Lys Gly Asp Trp Arg Ser Ile Ser Arg Asn Phe Val
385 390 395 400
Ile Ser Arg Thr Pro Thr Gln Val Ala Ser His Ala Gln Lys Tyr Phe
405 410 415
Ile Arg Leu Asn Ser Met Asn Arg Glu Arg Arg Arg Ser Ser Ile His
420 425 430
Asp Ile Thr Ser Val Asn Asn Gly Asp Thr Ser Ala Ala Gln Gly Pro
435 440 445
Ile Thr Gly Gln Pro Asn Gly Pro Ser Ala Asn Pro Gly Lys Ser Ser
450 455 460
Lys Gln Ser Leu Gln Pro Ala Asn Ala Pro Pro Gly Val Asp Ala Tyr
465 470 475 480
Gly Thr Thr Ile Gly Gln Pro Val Gly Gly Pro Leu Val Ser Ala Val
485 490 495
Gly Thr Pro Val Thr Leu Pro Val Pro Ala Ala Pro His Ile Ala Tyr
500 505 510
Gly Met His Ala Pro Val Pro Gly Ala Val Val Pro Gly Ala Pro Val
515 520 525
Asn Met Pro Pro Met Pro Tyr Pro Met Pro Pro Pro Thr Ser His Gly
530 535 540
Pro Gly Ser Thr Arg Ala Ala Ala Ser
545 550