mRNA and novel coronavirus mRNA vaccine containing same

Document No.: 336580 | Published: 2021-12-03

Abstract: This technology, "mRNA and novel coronavirus mRNA vaccine containing same", was designed by Wang Bing (王冰) and Yu Hang (俞航) on 2020-05-28. The invention provides an mRNA comprising mRNA encoding one, two, three or four of the S protein, E protein, M protein and N protein derived from SARS-CoV-2 virus, or fragments thereof, wherein the sequence of the mRNA encoding the S protein is shown in SEQ ID NO. 18, SEQ ID NO. 19 or SEQ ID NO. 20; the sequence of the mRNA encoding the E protein is shown in SEQ ID NO. 21; the sequence of the mRNA encoding the M protein is shown in SEQ ID NO. 22; and the sequence of the mRNA encoding the N protein is shown in SEQ ID NO. 23. Also provided are a liposomal nanoparticle comprising the mRNA, an mRNA vaccine against the novel coronavirus, and the like. The mRNA of the invention efficiently produces viral proteins at the cellular level, or the proteins produced self-assemble into virus-like particles. When the mRNA of the invention is prepared as a vaccine, the vaccine is safe and effective and does not generate non-neutralizing antibodies, and therefore does not cause antibody-dependent enhancement of infection.

1. An mRNA comprising mRNA encoding one, two, three or four proteins, or fragments thereof, selected from the S protein, the E protein, the M protein and the N protein of SARS-CoV-2 virus,

wherein the sequence of the mRNA encoding the S protein is shown in SEQ ID NO. 18, SEQ ID NO. 19 or SEQ ID NO. 20; the sequence of the mRNA encoding the E protein is shown in SEQ ID NO. 21; the sequence of the mRNA encoding the M protein is shown in SEQ ID NO. 22; and the sequence of the mRNA encoding the N protein is shown in SEQ ID NO. 23.

2. The mRNA of claim 1, wherein the fragment is a fragment of the RBD domain of the S protein, and the mRNA preferably has the sequence shown in SEQ ID No. 37.

3. The mRNA of claim 1 or 2, further comprising one or more of the following (a) to (e):

(a) a 5'-cap structure, preferably 3'-O-Me-m7G(5')ppp(5')G, m7G(5')ppp(5')(2'OMeA)pG or m7(3'OMeG)(5')ppp(5')(2'OMeA)pG;

(b) a 3'-poly(A) tail, whose sequence preferably comprises about 25 to about 400 adenosine nucleotides, preferably about 50 to about 400 adenosine nucleotides, more preferably about 50 to about 300 adenosine nucleotides, even more preferably about 50 to about 250 adenosine nucleotides, even more preferably about 60 to about 250 adenosine nucleotides, and most preferably consists of 120 adenosine nucleotides;

(c) a 5'-UTR, the sequence of which is preferably as shown in SEQ ID NO. 15;

(d) a 3'-UTR, the sequence of which is preferably derived from the 3'-UTR of a gene providing a stable mRNA, more preferably as shown in SEQ ID NO. 16 or SEQ ID NO. 17;

(e) a nucleotide modification, preferably one or more of 5-methyl-CTP, pseudo-UTP, N1-methylpseudo-UTP and 5-methoxy-UTP;

preferably:

the mRNA of the N protein comprises a modification with 5-methyl-CTP, pseudo-UTP, N1-methylpseudo-UTP or 5-methoxy-UTP, or comprises a modification with both 5-methyl-CTP and pseudo-UTP;

and/or, the mRNA of the E protein comprises a modification with 5-methyl-CTP, pseudo-UTP or N1-methylpseudo-UTP;

and/or, when the sequence of the mRNA encoding the S protein is as shown in SEQ ID NO. 18, the mRNA of the S protein comprises a modification with 5-methyl-CTP, pseudo-UTP or N1-methylpseudo-UTP, or comprises a modification with both 5-methyl-CTP and pseudo-UTP, and preferably comprises a modification with pseudo-UTP or N1-methylpseudo-UTP; or, when the sequence of the mRNA encoding the S protein is as shown in SEQ ID NO. 19, the mRNA of the S protein comprises a modification with pseudo-UTP, or comprises a modification with both 5-methyl-CTP and pseudo-UTP; or, when the sequence of the mRNA encoding the S protein is as shown in SEQ ID NO. 20, the mRNA of the S protein comprises a modification with pseudo-UTP or N1-methylpseudo-UTP.

4. The mRNA of any one of claims 1 to 3, wherein the mRNA comprises mRNAs encoding an S protein, an E protein and an M protein derived from SARS-CoV-2 virus, wherein the S protein, the E protein and the M protein are expressed from three separate mRNAs, and the molar ratio of the mRNAs expressing the S protein, the E protein and the M protein is preferably 1:(0.5 to 2):(0.5 to 2), such as 1:1:1;

or, the mRNA comprises mRNA encoding the M protein and the E protein derived from SARS-CoV-2 virus, the mRNA of the M protein and the mRNA of the E protein preferably being linked and then expressed, the linkage preferably being made through an mRNA sequence encoding a 2A peptide; the amino acid sequence of the 2A peptide is preferably as shown in SEQ ID NO. 42 or SEQ ID NO. 43, the DNA sequence encoding the 2A peptide is preferably as shown in SEQ ID NO. 38 or SEQ ID NO. 39, the mRNA sequence encoding the 2A peptide is preferably as shown in SEQ ID NO. 40 or SEQ ID NO. 41, the sequence of the linked mRNA is preferably as shown in SEQ ID NO. 35 or 36, and its DNA sequence is preferably as shown in SEQ ID NO. 28 or 29;

alternatively, the mRNA comprises mRNA encoding an S protein derived from SARS-CoV-2 virus;

alternatively, the mRNA comprises mRNA encoding the RBD domain derived from the S protein of SARS-CoV-2 virus;

or, the mRNA comprises mRNA encoding the M protein, the E protein and the S protein derived from SARS-CoV-2 virus, the mRNA of the M protein and the mRNA of the E protein being linked and then expressed, the linkage preferably being made through an mRNA sequence encoding a 2A peptide; the amino acid sequence of the 2A peptide is preferably as shown in SEQ ID NO. 42 or SEQ ID NO. 43, the DNA sequence encoding the 2A peptide is preferably as shown in SEQ ID NO. 38 or SEQ ID NO. 39, the mRNA sequence encoding the 2A peptide is preferably as shown in SEQ ID NO. 40 or SEQ ID NO. 41, the sequence of the linked mRNA is preferably as shown in SEQ ID NO. 35 or 36, its DNA sequence is preferably as shown in SEQ ID NO. 28 or 29, and the molar ratio of the linked mRNA to the mRNA of the S protein is preferably 1.5:1 to 3:1, such as 2:1.

5. The mRNA of any one of claims 1 to 4, wherein when the mRNA comprises mRNA encoding two, three or four proteins or fragments thereof from the S, E, M and N proteins of SARS-CoV-2 virus, the proteins encoded by the mRNA self-assemble into virus-like particles.

6. A DNA comprising a DNA encoding at least one protein selected from the group consisting of S protein, E protein, M protein and N protein derived from SARS-CoV-2 virus, or a fragment thereof,

wherein the sequence of the DNA encoding the S protein is shown in SEQ ID NO. 3, SEQ ID NO. 4 or SEQ ID NO. 5; the sequence of the DNA encoding the E protein is shown in SEQ ID NO. 8; the sequence of the DNA encoding the M protein is shown in SEQ ID NO. 11; and the sequence of the DNA encoding the N protein is shown in SEQ ID NO. 13;

preferably, the fragment is a fragment of the RBD domain of the S protein, and the DNA sequence of the fragment is preferably shown in SEQ ID NO. 30.

7. A composition comprising more than one of the mRNAs of any one of claims 1 to 5 and/or the DNA of claim 6.

8. A liposomal nanoparticle comprising the mRNA of any one of claims 1 to 5, the DNA of claim 6, and/or the composition of claim 7;

preferably:

the liposomal nanoparticles further comprise a cationic lipid, preferably DLin-MC3-DMA or DOTMA, and a helper lipid, preferably DSPC and/or cholesterol;

and/or the liposome nanoparticle is a long-circulating cationic liposome nanoparticle, preferably a long-circulating cationic liposome nanoparticle modified by PEG or a derivative thereof; the relative molecular mass of the PEG is preferably 2000-5000, such as 2000, 3000, 4000 or 5000; more preferably, the long circulating cationic liposome nanoparticles comprise DMPE-PEG 2000.

9. A virus-like particle self-assembled from proteins expressed from a composition comprising the mRNA of any one of claims 1 to 5, the DNA of claim 6 and/or the composition of claim 7, preferably expressed in a cell, preferably 293T and/or 293A;

preferably:

the virus-like particle is formed by self-assembly of proteins expressed from mRNA encoding two, three or four of the S protein, E protein, M protein and N protein of SARS-CoV-2 virus, or fragments thereof, the proteins preferably being expressed in cells;

more preferably:

the virus-like particle is obtained by self-assembly of the S protein, the E protein and the M protein of SARS-CoV-2 virus expressed from three separate mRNAs, the molar ratio of the mRNAs expressing the S protein, the E protein and the M protein preferably being 1:(0.5 to 2):(0.5 to 2), such as 1:1:1;

or, the virus-like particle is formed by self-assembly of proteins expressed from mRNA of the M protein and the E protein of SARS-CoV-2 virus, the proteins preferably being expressed in cells, the mRNA of the M protein and the mRNA of the E protein preferably being linked and then expressed, the linkage preferably being made through an mRNA sequence encoding a 2A peptide; the amino acid sequence of the 2A peptide is preferably as shown in SEQ ID NO. 42 or SEQ ID NO. 43, the DNA sequence encoding the 2A peptide is preferably as shown in SEQ ID NO. 38 or SEQ ID NO. 39, the mRNA sequence encoding the 2A peptide is preferably as shown in SEQ ID NO. 40 or SEQ ID NO. 41, the sequence of the linked mRNA is preferably as shown in SEQ ID NO. 35 or 36, and its DNA sequence is preferably as shown in SEQ ID NO. 28 or 29;

or, the virus-like particle is formed by self-assembly of proteins expressed from mRNA of the M protein, the E protein and the S protein of SARS-CoV-2 virus, the proteins preferably being expressed in cells, the mRNA of the M protein and the mRNA of the E protein being linked and then expressed, the linkage preferably being made through an mRNA sequence encoding a 2A peptide; the amino acid sequence of the 2A peptide is preferably as shown in SEQ ID NO. 42 or SEQ ID NO. 43, the DNA sequence encoding the 2A peptide is preferably as shown in SEQ ID NO. 38 or SEQ ID NO. 39, the mRNA sequence encoding the 2A peptide is preferably as shown in SEQ ID NO. 40 or SEQ ID NO. 41, the sequence of the linked mRNA is preferably as shown in SEQ ID NO. 35 or 36, its DNA sequence is preferably as shown in SEQ ID NO. 28 or 29, and the molar ratio of the linked mRNA to the mRNA of the S protein is preferably 1.5:1 to 3:1, for example 2:1.

10. An mRNA vaccine against the novel coronavirus, characterized in that it comprises the mRNA of any one of claims 1 to 5, the DNA of claim 6, the composition of claim 7 and/or the liposomal nanoparticle of claim 8;

preferably, the mRNA vaccine induces the production of virus-like particles by cells; and/or, the mRNA vaccine further comprises an adjuvant.

11. A pharmaceutical composition comprising the mRNA of any one of claims 1 to 5, the DNA of claim 6, the composition of claim 7, the liposomal nanoparticle of claim 8, the virus-like particle of claim 9, and/or the mRNA vaccine of claim 10, and optionally a pharmaceutically acceptable carrier.

12. A kit comprising the mRNA of any one of claims 1 to 5, the DNA of claim 6, the composition of claim 7, the liposomal nanoparticle of claim 8, the virus-like particle of claim 9, the mRNA vaccine of claim 10, and/or the pharmaceutical composition of claim 11.

13. An mRNA encoding a 2A peptide, having the sequence shown in SEQ ID NO. 40 and/or SEQ ID NO. 41.

14. A DNA encoding a 2A peptide, having the sequence shown in SEQ ID NO. 38 or SEQ ID NO. 39.

Technical Field

The invention relates to an mRNA and a novel coronavirus mRNA vaccine containing the same, and further relates to liposomal nanoparticles, a pharmaceutical composition, a kit and the like containing the mRNA.

Background

In recent years, messenger RNA (mRNA) therapies based on in vitro transcription (IVT) have shown great potential. The principle is that mRNA prepared in vitro is formulated into a drug that is delivered to tissues in vivo and taken up by cells through endocytosis; once inside the cell, the exogenous mRNA is recognized by ribosomes, which synthesize the corresponding protein according to its coding information. As early as 1990, Wolff et al. demonstrated that mRNA injected into mice could be translated into protein [7]. In 1992, Jirikowski et al. showed that vasopressin mRNA injected into the hypothalamus alleviated the symptoms of diabetes insipidus in rats [8]. mRNA drugs have many theoretical advantages: compared with DNA therapy, mRNA does not need to enter the nucleus, so there is no risk of insertional mutagenesis through genome integration; compared with protein drugs, mRNA uses the cell's own translation machinery to achieve efficient, dose-dependent expression of active protein, which addresses the problem that some proteins are not druggable. However, mRNA has long been hampered by problems of in vitro preparation, stability and delivery. Only recently have IVT combined with chemical and enzymatic capping, the introduction of modified nucleotides, and HPLC purification made large-scale in vitro preparation of mRNA possible [9,10]. Liposomes and lipid nanoparticles, following their success in siRNA delivery, have also been shown to be suitable for mRNA encapsulation and delivery [11]. These breakthroughs have greatly improved the druggability of mRNA; more than 25 mRNA drugs, including mRNA vaccines and protein replacement therapies, are currently under development [12], and competition to bring the first mRNA product to market is in full swing. More and more researchers are turning to applications of mRNA drugs, while research in this field in China is just beginning.

One of the most promising applications of mRNA drugs is vaccines, including tumor vaccines and infectious disease vaccines. mRNA molecules encoding an antigen protein can be used for human immunization once they have been synthesized in vitro and formulated; the process involves no live-virus culture, which greatly shortens development time [13]. mRNA vaccines have seen a series of breakthroughs in recent years. In a 2013 study, researchers designed and prepared an mRNA vaccine against H7N9 influenza virus that succeeded in mouse experiments [14]. In 2015, an mRNA vaccine against HIV produced a humoral immune response in non-human primates. In 2017, a Zika virus mRNA vaccine effectively protected mice against virus challenge [15] and reduced the risk of infection in pregnant mice [16]. Beyond the success in animal trials, mRNA vaccines (e.g., influenza and Zika vaccines) have entered clinical trials, and phase I clinical results for Moderna's influenza virus mRNA vaccine have shown immunogenicity, or even superiority over traditional vaccines [17]. The same company's Zika virus vaccine, mRNA-1893, received U.S. FDA Fast Track designation last year. The technical advantages of IVT mRNA make it possible to cope with the high mutation rate of viruses and to develop vaccines rapidly for newly emerging epidemics, so IVT mRNA is expected to become a breakthrough direction for improving the prevention and treatment of emerging infectious diseases.

Conventional vaccines for virus prevention include recombinant protein vaccines, inactivated vaccines, live attenuated vaccines, and in vitro recombinant virus-like particles (VLPs). Historically, inactivated or attenuated vaccines have been the first choice because they resemble the authentic virus in form and composition and elicit a strong immune response. They nevertheless have unavoidable disadvantages: their production cycle is long; some viruses, such as norovirus, cannot be cultured on a large scale; inactivated virus may fail to induce an immune response; and attenuated vaccines carry the risk of reversion to virulence. An in vitro recombinant VLP vaccine is an empty capsid structure formed by the viral capsid or envelope proteins alone. It can rapidly stimulate humoral and cellular immune responses, contains neither viral genetic material nor immunosuppressive proteins, and is currently regarded as the safest type of novel candidate vaccine; a variety of VLP-based vaccine products already exist [18]. Following the SARS-CoV outbreak in 2002 and the later MERS-CoV outbreak, various vaccine strategies were investigated, including inactivated or attenuated strains, recombinant DNA-based S proteins, and in vitro recombinant virus-like particles [19,20]. The S protein is the major protein mediating virus entry and the major target of neutralizing antibodies, and is therefore of particular interest for vaccine development. Animal experiments have shown that these vaccines all have protective effects, but safety is the greatest concern. For example, vaccines based on the full-length S protein antigen generate large numbers of non-neutralizing antibodies, which play an important role in antibody-dependent enhancement (ADE) of infection [21] and can accelerate disease progression instead of preventing it, posing a major vaccine safety problem.

Because the body can synthesize any protein from the coding information of an administered mRNA drug, mRNA is extremely flexible in the choice of vaccine antigen. Nevertheless, given the advantages of virus-like particles, most viral mRNA vaccines now in clinical development use virus-like particles as the final form of antigen display, for example the Zika virus vaccine [15].

mRNA vaccines offer many advantages, but most of these are still theoretical and require extensive basic and clinical research. An effective mRNA vaccine that induces the synthesis of virus-like particles in vivo must meet two conditions: first, expression must be efficient enough to produce a dose of virus-like particles sufficient to stimulate an immune response; second, the virus-like particles produced must match the real virus in morphology and structural composition, so that the body acquires immunity to the real virus. The nature of coronaviruses, however, poses many challenges for development. Coronaviruses are positive-sense single-stranded RNA viruses whose envelope is a lipid bilayer into which the structural proteins M (membrane), E (envelope) and S (spike) are inserted. Of these, the spike (S) protein is the most important surface protein of the coronavirus and determines the host range and specificity of the virus. The S protein is also the main site of action of host neutralizing antibodies and has therefore become a key target in vaccine design for SARS-CoV and MERS-CoV. Coronaviruses also have a nucleocapsid protein N (nucleoprotein), which wraps the viral genome inside the envelope. In addition to binding the genome, the N protein contributes to shaping the envelope and is therefore also regarded as a structural protein. A characteristic of coronaviruses is that their morphology and size are not completely fixed; their diameters range from 80 to 200 nm. Consequently, even with high-resolution cryo-electron microscopy, the atomic structure of the whole virus cannot be obtained by single-particle analysis. The proportion of structural proteins in the coronavirus envelope is not fixed either, and depends on the amount of each structural protein present when the virus assembles in the cell.

This differs from Zika virus, which is also an enveloped virus but has a fixed morphology, a rigid icosahedral structure, a single structural protein, a fixed copy number, and no spike structure. Thus, although both are designed to produce virus-like particles, a coronavirus mRNA vaccine is much more complex to design than a Zika virus mRNA vaccine. First, the Zika virus mRNA vaccine contains only one mRNA, encoding the prM-E fusion protein, whereas a coronavirus mRNA vaccine must be a combination (cocktail) of at least three mRNAs encoding different structural proteins. Second, many questions remain about how the coronavirus envelope assembles. Studies of SARS-CoV indicate that co-expression of M and E is sufficient to form virus-like particles, but without spikes; co-expressing S with M and E incorporates the S protein and yields VLPs with spikes. However, although virus-like particles form, their protein composition differs greatly from that of the real virus. In addition, the N protein, although it mainly interacts with the viral genome inside the particle, has been shown to enhance the expression and secretion of virus-like particles. At present, several novel coronavirus vaccines have entered clinical trials; all of them use the S protein as the main antigen, their safety and efficacy have not yet been demonstrated, and the risk of failure remains. There is therefore an urgent need to continue developing novel coronavirus vaccines based on multi-antigen strategies.

Disclosure of Invention

The invention aims to overcome deficiencies of the prior art, such as the absence of a commercial novel coronavirus vaccine, and provides an mRNA, a DNA, a novel coronavirus mRNA vaccine containing the mRNA, liposomal nanoparticles, virus-like particles produced by expression from them, a pharmaceutical composition and a kit. The mRNAs of the several proteins required for assembling the novel coronavirus, after codon optimization or further nucleotide modification, can each be highly expressed in cells. When combined in the specific ratios of the invention, the mRNAs efficiently produce viral proteins at the cellular level, or the proteins produced self-assemble into virus-like particles; the virus-like particles are highly expressed and are extremely close to the real virus in size and morphology, so that in subsequent clinical use they can confer on the body the immune capacity to cope with the real virus. For nanoparticles containing the mRNA of the invention, encapsulation and expression efficiency remain high even when several mRNAs are packaged simultaneously in one lipid nanoparticle, so that a sufficient dose of virus-like particles is produced to stimulate an immune response, with high immunogenicity and stability. When the codon-optimized or nucleotide-modified mRNAs of the several proteins required for assembling the novel coronavirus are prepared as a vaccine (for example, in the form of virus-like particles, a vaccine expressing only the S protein, or a vaccine expressing only the RBD region of the S protein), the vaccine is safe and effective and does not generate non-neutralizing antibodies, and therefore does not cause antibody-dependent enhancement of infection.

It is well known to those skilled in the art that coronaviruses are not completely fixed in shape and size, nor are the proportions of structural proteins in their envelopes, so that, even though the goal is likewise to produce virus-like particles, designing a coronavirus mRNA vaccine is much more complex than for other viruses in the prior art. However, through extensive experimentation the inventors surprisingly found that complete expression of virus-like particles can be achieved after specific codon optimization. The inventors also found experimentally that the translation efficiency and stability of in vitro transcribed mRNA are affected by its chemical modifications (the fate of each mRNA in the cell differs greatly depending on the modified nucleotides used), its 5' and 3' untranslated region (UTR) sequences, its 5' capping pattern (using different cap 0 or cap 1 analogues) and the length of its 3' poly(A) tail. Through extensive research, the inventors found that by selecting specific nucleoside chemical modifications, specific UTR sequences and a specifically optimized capping method, high-level protein expression can be achieved as early as half an hour after cells are transfected with the mRNA, and expression can last for one week. Through a large number of experiments, the inventors further found that specific combinations of several modified nucleotides yield still better immunogenicity and stability. In addition, the S protein of the invention is 1273 amino acids long, which is a relatively large protein; combined with the 5' and 3' UTRs, the final mRNA exceeds 4000 nt in total length.

The inventors found experimentally that the synthesis of long-chain mRNA has always been a challenge. By optimizing the mRNA sequence encoding a protein (e.g., the S protein) together with the UTR sequences and modified nucleotides, and screening for expression of the protein, the problems of preparing and purifying the mRNA of an ultra-long gene can be overcome.
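The length estimate above can be checked with simple arithmetic. The following is a minimal sketch, not the inventors' calculation: the function name and the UTR lengths (50 nt and 100 nt) are hypothetical placeholders, because the lengths of the UTR sequences are not stated in this section. Only the 1273-residue S protein and the 120-nt poly(A) tail come from the text.

```python
def mrna_length_nt(protein_aa, utr5_nt, utr3_nt, poly_a_nt, stop_codons=1):
    """Total transcript length in nucleotides (the 5' cap is not counted)."""
    # Each amino acid and each stop codon occupies three nucleotides.
    cds_nt = 3 * (protein_aa + stop_codons)
    return utr5_nt + cds_nt + utr3_nt + poly_a_nt


# Coding sequence of a 1273-residue protein plus one stop codon.
cds_only = mrna_length_nt(1273, 0, 0, 0)        # 3822

# Adding the most preferred 120-nt poly(A) tail.
with_tail = mrna_length_nt(1273, 0, 0, 120)     # 3942

# Hypothetical UTR lengths, chosen only for illustration.
example = mrna_length_nt(1273, 50, 100, 120)    # 4092

print(cds_only, with_tail, example)
```

With a coding region of 3822 nt and a 120-nt tail, fewer than 60 nt of combined UTR sequence is enough to bring the transcript past 4000 nt.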

In order to solve the above technical problems, the present invention provides, in a first aspect, mRNA comprising mRNA encoding one, two, three or four proteins, fragments, variants or derivatives thereof, among an S protein, an E protein, an M protein and an N protein derived from SARS-CoV-2 virus,

wherein the sequence of the mRNA encoding the S protein is shown in SEQ ID NO. 18, SEQ ID NO. 19 or SEQ ID NO. 20; the sequence of the mRNA encoding the E protein is shown in SEQ ID NO. 21; the sequence of the mRNA encoding the M protein is shown in SEQ ID NO. 22; and the sequence of the mRNA encoding the N protein is shown in SEQ ID NO. 23.

Preferably, the fragment is a fragment of the RBD domain of the S protein, and the sequence of the mRNA of the fragment is preferably shown as SEQ ID NO. 37.

Preferably, the mRNA further comprises a 5'-cap structure, preferably 3'-O-Me-m7G(5')ppp(5')G, m7G(5')ppp(5')(2'OMeA)pG or m7(3'OMeG)(5')ppp(5')(2'OMeA)pG.

In the present invention, the structure of 3'-O-Me-m7G(5')ppp(5')G is generally as follows: [structural formula not reproduced]

In the present invention, the structure of m7G(5')ppp(5')(2'OMeA)pG is generally as follows: [structural formula not reproduced]

In the present invention, the structure of m7(3'OMeG)(5')ppp(5')(2'OMeA)pG is generally as follows: [structural formula not reproduced]

Preferably, the mRNA sequence further comprises a 3'-poly(A) tail, whose sequence preferably comprises about 25 to about 400 adenosine nucleotides, preferably about 50 to about 400 adenosine nucleotides, more preferably about 50 to about 300 adenosine nucleotides, even more preferably about 50 to about 250 adenosine nucleotides, even more preferably about 60 to about 250 adenosine nucleotides, and most preferably consists of 120 adenosine nucleotides.
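The nested preference ranges for the poly(A) tail can be read as tiers from narrowest to broadest. The sketch below is one way to look up the narrowest tier that a given tail length satisfies; the tier labels and function name are hypothetical, and treating "about" as an exact bound is a simplifying assumption.

```python
# Tiers ordered from narrowest (most preferred) to broadest, as listed in the text.
# Assumption: "about N" is treated as exactly N.
POLY_A_TIERS = [
    ("most preferred (120)", 120, 120),
    ("about 60 to about 250", 60, 250),
    ("about 50 to about 250", 50, 250),
    ("about 50 to about 300", 50, 300),
    ("about 50 to about 400", 50, 400),
    ("about 25 to about 400", 25, 400),
]


def narrowest_poly_a_tier(tail_length_nt):
    """Return the label of the narrowest tier containing the length, or None."""
    for label, low, high in POLY_A_TIERS:
        if low <= tail_length_nt <= high:
            return label
    return None


for n in (120, 55, 30, 500):
    print(n, narrowest_poly_a_tier(n))
```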

Preferably, the mRNA sequence further comprises a 5'-UTR, the sequence of which is preferably as shown in SEQ ID NO. 15.

Preferably, the mRNA sequence further comprises a 3'-UTR, the sequence of which is preferably derived from the 3'-UTR of a gene providing a stable mRNA, or from a homologue, fragment or variant thereof, and is more preferably as shown in SEQ ID NO. 16 or SEQ ID NO. 17.

Preferably, the mRNA sequence further comprises a nucleotide modification, preferably one or more of 5-methyl-CTP, pseudo-UTP, N1-methylpseudo-UTP and 5-methoxy-UTP. In the present invention, 5-methyl-CTP is commercially available from ApexBio (#B7967), pseudo-UTP from ApexBio (#B7972), N1-methylpseudo-UTP from ApexBio (#B8049) and 5-methoxy-UTP from ApexBio (#B8061).

More preferably, the mRNA of the N protein comprises a modification with 5-methyl-CTP, pseudo-UTP, N1-methylpseudo-UTP or 5-methoxy-UTP, or comprises a modification with both 5-methyl-CTP and pseudo-UTP.

More preferably, the mRNA of the E protein comprises a modification with 5-methyl-CTP, pseudo-UTP or N1-methylpseudo-UTP.

More preferably, when the sequence of the mRNA encoding the S protein is as shown in SEQ ID NO. 18, the mRNA of the S protein comprises a modification with 5-methyl-CTP, pseudo-UTP or N1-methylpseudo-UTP, or comprises a modification with both 5-methyl-CTP and pseudo-UTP, and preferably comprises a modification with pseudo-UTP or N1-methylpseudo-UTP.

More preferably, when the sequence of the mRNA encoding the S protein is as shown in SEQ ID NO. 19, the mRNA of the S protein comprises a modification with pseudo-UTP, or comprises a modification with both 5-methyl-CTP and pseudo-UTP.

More preferably, when the sequence of the mRNA encoding the S protein is as shown in SEQ ID NO. 20, the mRNA of the S protein comprises a modification with pseudo-UTP or N1-methylpseudo-UTP.

Preferably, the mRNA comprises mRNA encoding the S protein, the E protein and the M protein derived from SARS-CoV-2 virus, the S protein, the E protein and the M protein being expressed from three separate mRNAs, and the molar ratio of the mRNAs expressing the S protein, the E protein and the M protein preferably being 1:(0.5 to 2):(0.5 to 2), for example 1:1:1.
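Because the three mRNAs differ greatly in length, a molar ratio does not translate directly into a mass ratio. The following is a minimal sketch, under the assumption that the mass of each transcript is proportional to its length in nucleotides; the function names and the example lengths (4000, 400 and 800 nt) are hypothetical and are not taken from the sequence listing.

```python
def within_preferred_ratio(s, e, m, low=0.5, high=2.0):
    """True if E and M, normalized to S = 1, both lie in [low, high]."""
    if s <= 0:
        raise ValueError("S amount must be positive")
    return low <= e / s <= high and low <= m / s <= high


def mass_fractions(molar_ratio, lengths_nt):
    """Approximate mass fraction of each mRNA.

    Assumption: mass is proportional to moles x length, i.e. all
    transcripts share the same average mass per nucleotide.
    """
    weights = [r * n for r, n in zip(molar_ratio, lengths_nt)]
    total = sum(weights)
    return [w / total for w in weights]


# Example: S:E:M = 1:1:1 with hypothetical transcript lengths.
print(within_preferred_ratio(1, 1, 1))
print(mass_fractions((1, 1, 1), (4000, 400, 800)))
```

In this illustration an equimolar mixture is dominated by the S transcript by mass, which is why the ratio has to be stated explicitly as molar.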

Preferably, the mRNA comprises mRNA encoding the M protein and the E protein derived from SARS-CoV-2 virus, which are preferably linked and then expressed, the linkage preferably being made through an mRNA sequence encoding a 2A peptide (after protein expression, the resulting 2A peptide undergoes "self-cleavage", yielding separate M and E proteins). The amino acid sequence of the 2A peptide is preferably as shown in SEQ ID NO. 42 or SEQ ID NO. 43, the DNA sequence encoding the 2A peptide is further preferably as shown in SEQ ID NO. 38 or SEQ ID NO. 39, and the mRNA sequence encoding the 2A peptide is further preferably as shown in SEQ ID NO. 40 or SEQ ID NO. 41. More preferably, the sequence of the linked mRNA is as shown in SEQ ID NO. 35 or 36, and its DNA sequence is as shown in SEQ ID NO. 28 or 29.
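The linkage described above can be pictured as joining two open reading frames through a linker so that a single reading frame runs through the whole construct. The sketch below illustrates only that general idea and is not the construction of SEQ ID NO. 35 or 36: the function name is hypothetical, and the toy sequences (including the 6-nt linker, which is not a real 2A sequence) are placeholders.

```python
STOP_CODONS = {"UAA", "UAG", "UGA"}


def link_orfs(upstream, linker, downstream):
    """Join two mRNA ORFs through an in-frame linker.

    The stop codon of the upstream ORF is removed so that translation
    continues through the linker into the downstream ORF.
    """
    for name, seq in (("upstream", upstream), ("linker", linker),
                      ("downstream", downstream)):
        if len(seq) % 3 != 0:
            raise ValueError(f"{name} length is not a multiple of 3")
    if upstream[-3:] in STOP_CODONS:
        upstream = upstream[:-3]
    fused = upstream + linker + downstream
    codons = [fused[i:i + 3] for i in range(0, len(fused), 3)]
    # Only the final codon may be a stop codon.
    if any(codon in STOP_CODONS for codon in codons[:-1]):
        raise ValueError("premature in-frame stop codon")
    return fused


# Toy placeholders, not real M, 2A or E sequences.
toy_m = "AUGGCAUAA"
toy_linker = "GGAAGC"
toy_e = "AUGUUCUGA"
print(link_orfs(toy_m, toy_linker, toy_e))  # AUGGCAGGAAGCAUGUUCUGA
```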

Preferably, the mRNA comprises mRNA encoding the S protein derived from SARS-CoV-2 virus.

Preferably, the mRNA comprises mRNA encoding the RBD domain of the S protein derived from SARS-CoV-2 virus.

Preferably, the mRNA comprises mRNA encoding the M protein, the E protein and the S protein derived from SARS-CoV-2 virus, the mRNA of the M protein and the mRNA of the E protein being linked and then expressed, the linkage preferably being made through an mRNA sequence encoding a 2A peptide. The amino acid sequence of the 2A peptide is preferably as shown in SEQ ID NO. 42 or SEQ ID NO. 43, the DNA sequence encoding the 2A peptide is further preferably as shown in SEQ ID NO. 38 or SEQ ID NO. 39, and the mRNA sequence encoding the 2A peptide is further preferably as shown in SEQ ID NO. 40 or SEQ ID NO. 41. More preferably, the sequence of the linked mRNA is as shown in SEQ ID NO. 35 or 36, and its DNA sequence is as shown in SEQ ID NO. 28 or 29. More preferably, the molar ratio of the linked mRNA to the mRNA of the S protein is 1.5:1 to 3:1, for example 2:1.

Preferably, when the mRNA comprises mRNA encoding two, three or four proteins or fragments thereof from the S, E, M and N proteins of SARS-CoV-2 virus, the proteins encoded by the mRNA self-assemble into virus-like particles.

In order to solve the above-mentioned technical problems, the second aspect of the present invention provides a DNA comprising a DNA encoding at least one protein (e.g., one, two, three, four) or a fragment thereof among the S protein, the E protein, the M protein and the N protein derived from SARS-CoV-2 virus,

wherein the sequence of the DNA encoding the S protein is shown as SEQ ID NO.3, SEQ ID NO.4 or SEQ ID NO.5; the sequence of the DNA encoding the E protein is shown as SEQ ID NO.8; the sequence of the DNA encoding the M protein is shown as SEQ ID NO.11; the sequence of the DNA encoding the N protein is shown as SEQ ID NO.13.

Preferably, the fragment is a fragment of the RBD domain of the S protein, and the DNA sequence of the fragment is preferably shown in SEQ ID NO. 30.

In order to solve the above technical problem, the third aspect of the present invention provides a composition comprising more than one of the mRNAs according to the first aspect of the present invention or of the DNAs according to the second aspect of the present invention.

In order to solve the above technical problem, the fourth aspect of the present invention provides a liposome nanoparticle comprising the mRNA according to the first aspect of the present invention, the DNA according to the second aspect of the present invention, or the composition according to the third aspect of the present invention.

Preferably, the liposomal nanoparticles further comprise a cationic lipid, preferably DLin-MC3-DMA or DOTMA, and a helper lipid, preferably DSPC and/or cholesterol.

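In the present invention, the structural formula of DLin-MC3-DMA is generally as follows: [structural formula drawing not reproduced]

In the present invention, the structural formula of DOTMA is generally as follows: [structural formula drawing not reproduced]

In the present invention, the structural formula of DSPC is generally as follows: [structural formula drawing not reproduced]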

Preferably, the liposome nanoparticle is a long-circulating cationic liposome nanoparticle, preferably a long-circulating cationic liposome nanoparticle modified with PEG or a derivative thereof; the PEG preferably has a relative molecular mass of 2000-5000, such as 2000, 3000, 4000 or 5000. In a preferred embodiment of the present invention, the liposome nanoparticle is a long-circulating cationic liposome nanoparticle comprising DMPE-PEG2000.

In order to solve the above technical problems, the fifth aspect of the present invention provides a virus-like particle comprising a self-assembly of the corresponding proteins expressed by the mRNA according to the first aspect of the present invention, by the DNA according to the second aspect of the present invention, and/or by the composition according to the third aspect of the present invention, wherein the mRNA, the DNA and/or the composition is preferably transferred into a cell in which the corresponding proteins are expressed, and the cell is preferably a 293T and/or 293A cell.

Preferably, the virus-like particle is self-assembled from proteins expressed by mRNA encoding two, three or four of the S, E, M and N proteins of the SARS-CoV-2 virus or fragments thereof, preferably in cells, preferably 293T and/or 293A.

More preferably, the virus-like particle is self-assembled from the S protein, the E protein and the M protein of SARS-CoV-2 virus expressed from three separate mRNAs, and the molar ratio of the mRNAs expressing the S protein, the E protein and the M protein is preferably 1:(0.5-2):(0.5-2), for example, 1:1:1.

More preferably, the virus-like particle is self-assembled from proteins expressed by mRNAs encoding the M and E proteins of SARS-CoV-2 virus, preferably the proteins are expressed in cells, preferably 293T and/or 293A. Wherein, the mRNA of the M protein and the E protein is preferably expressed after being connected, and the connection is preferably performed through the sequence of the mRNA encoding the 2A peptide segment. Wherein, the amino acid sequence of the 2A peptide segment is preferably shown as SEQ ID NO.42 or SEQ ID NO.43, the DNA sequence for coding the 2A peptide segment is further preferably shown as SEQ ID NO.38 or SEQ ID NO.39, and the mRNA sequence for coding the 2A peptide segment is further preferably shown as SEQ ID NO.40 or SEQ ID NO. 41. More preferably, the sequence of the ligated mRNA is preferably as shown in SEQ ID NO.35 or 36, and the DNA sequence thereof is preferably as shown in SEQ ID NO.28 or 29.

More preferably, the virus-like particle is self-assembled from proteins expressed by mRNAs encoding the M, E and S proteins of SARS-CoV-2 virus; the proteins are preferably expressed in cells, preferably 293T and/or 293A cells. The mRNAs of the M protein and the E protein are expressed after being linked, preferably through an mRNA sequence encoding a 2A peptide. The amino acid sequence of the 2A peptide is preferably as shown in SEQ ID NO.42 or SEQ ID NO.43, the DNA sequence encoding the 2A peptide is further preferably as shown in SEQ ID NO.38 or SEQ ID NO.39, and the mRNA sequence encoding the 2A peptide is further preferably as shown in SEQ ID NO.40 or SEQ ID NO.41. More preferably, the sequence of the linked mRNA is as shown in SEQ ID NO.35 or 36, and the corresponding DNA sequence is as shown in SEQ ID NO.28 or 29. More preferably, the molar ratio of the linked mRNA to the mRNA of the S protein is 1.5:1 to 3:1, for example, 2:1.

In order to solve the above technical problem, the sixth aspect of the present invention provides an mRNA vaccine against the novel coronavirus, which comprises mRNA according to the first aspect of the present invention, DNA according to the second aspect of the present invention, a composition according to the third aspect of the present invention, and/or a liposomal nanoparticle according to the fourth aspect of the present invention.

Preferably, the mRNA vaccine induces the cells to produce virus-like particles to activate the immune system.

Preferably, the mRNA vaccine further comprises adjuvants conventionally used in the art.

In order to solve the above technical problem, the seventh aspect of the present invention provides a pharmaceutical composition comprising mRNA according to the first aspect of the present invention, DNA according to the second aspect of the present invention, a composition according to the third aspect of the present invention, liposomal nanoparticles according to the fourth aspect of the present invention, virus-like particles according to the fifth aspect of the present invention, and/or mRNA vaccine according to the sixth aspect of the present invention, and optionally a pharmaceutically acceptable carrier.

In order to solve the above technical problems, the eighth aspect of the present invention provides a kit comprising the mRNA according to the first aspect of the present invention, the DNA according to the second aspect of the present invention, the composition according to the third aspect of the present invention, the liposomal nanoparticle according to the fourth aspect of the present invention, the virus-like particle according to the fifth aspect of the present invention, the mRNA vaccine according to the sixth aspect of the present invention, and/or the pharmaceutical composition according to the seventh aspect of the present invention.

In order to solve the technical problem, the invention also provides mRNA for encoding the 2A peptide segment, and the sequence of the mRNA is preferably shown as SEQ ID NO.40 or SEQ ID NO. 41.

In order to solve the technical problems, the invention also provides a DNA for coding the 2A peptide segment, and the sequence of the DNA is shown as SEQ ID NO.38 or SEQ ID NO. 39.

In order to solve the above technical problems, the present invention also provides the use of mRNA according to the first aspect of the present invention or DNA according to the second aspect of the present invention or a composition according to the third aspect of the present invention in the preparation of liposomal nanoparticles according to the fourth aspect of the present invention, virus-like particles according to the fifth aspect of the present invention, mRNA vaccines according to the sixth aspect of the present invention, pharmaceutical compositions according to the seventh aspect of the present invention, and/or kits according to the eighth aspect of the present invention.

In order to solve the above technical problem, the present invention also provides a method for preventing and/or treating a novel coronavirus infection, comprising the step of administering (optionally to a subject in need thereof) an mRNA according to the first aspect of the invention, a DNA according to the second aspect of the invention, a composition according to the third aspect of the invention, a liposomal nanoparticle according to the fourth aspect of the invention, a virus-like particle according to the fifth aspect of the invention, an mRNA vaccine according to the sixth aspect of the invention, a pharmaceutical composition according to the seventh aspect of the invention, and/or a kit according to the eighth aspect of the invention.

In order to solve the above technical problems, the present invention also provides an mRNA according to the first aspect of the present invention, a DNA according to the second aspect of the present invention, a composition according to the third aspect of the present invention, a liposome nanoparticle according to the fourth aspect of the present invention, a virus-like particle according to the fifth aspect of the present invention, an mRNA vaccine according to the sixth aspect of the present invention, a pharmaceutical composition according to the seventh aspect of the present invention, and/or a kit according to the eighth aspect of the present invention, for use in preventing and/or treating a novel coronavirus infection.

In the present invention, the sequence encoding the 2A peptide can be the sequence encoding the 2A peptide of a natural virus, or an optimized sequence (for example, T2A and P2A; the mRNA sequence of T2A can be as shown in SEQ ID NO.40 and the mRNA sequence of P2A as shown in SEQ ID NO.41, the corresponding DNA sequences can be as shown in SEQ ID NO.38 and SEQ ID NO.39, and the amino acid sequences of the polypeptides obtained after translation can be as shown in SEQ ID NO.42 and SEQ ID NO.43). The polypeptide is efficiently self-cleaved into an upstream fragment and a downstream fragment, so that the sequences before and after the 2A sequence are expressed as two independent proteins, which achieves coordinated expression of two independent proteins from a single sequence.
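The self-cleavage described above can be illustrated with a small string-level sketch. It assumes the commonly cited T2A and P2A amino acid sequences, which are not necessarily identical to SEQ ID NO.42 and SEQ ID NO.43, and it uses short hypothetical placeholder fragments in place of the full M and E proteins. The separation is modeled at the usual position between the final glycine and proline of the 2A peptide.

```python
from typing import Tuple

# Commonly cited 2A peptides (assumption: these may differ from
# SEQ ID NO.42 / SEQ ID NO.43 of this document).
T2A = "EGRGSLLTCGDVEENPGP"
P2A = "ATNFSLLKQAGDVEENPGP"


def split_at_2a(polyprotein: str, linker: str) -> Tuple[str, str]:
    """Split a fusion protein between the last G and P of the 2A peptide."""
    start = polyprotein.find(linker)
    if start < 0:
        raise ValueError("2A linker not found in the polyprotein")
    cut = start + len(linker) - 1      # index of the terminal proline
    upstream = polyprotein[:cut]       # keeps the 2A residues up to the glycine
    downstream = polyprotein[cut:]     # begins with the 2A proline
    return upstream, downstream


# Hypothetical short placeholders, not the full M and E protein sequences.
m_stub = "MADSNGTITV"
e_stub = "MYSFVSEETG"

fusion = m_stub + P2A + e_stub
upstream, downstream = split_at_2a(fusion, P2A)
print(upstream)
print(downstream)
```

In this model the upstream protein carries most of the 2A peptide at its C-terminus and the downstream protein starts with one extra proline, which is the general behavior reported for 2A linkers.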

Interpretation of terms

In the present invention, mRNA, also called messenger RNA, is usually a single-stranded ribonucleic acid (ssRNA) that is transcribed from one strand of DNA as a template, carries genetic information and can direct protein synthesis. mRNA is produced in the cell by transcription from a gene as template according to the principle of complementary base pairing; it therefore contains a base sequence corresponding to a functional segment of the DNA molecule and serves as the direct template for protein biosynthesis.
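The base-pairing relationship between DNA and mRNA described above can be sketched as follows. This is a simplified illustration only; the function names and the short example sequence are hypothetical, and the sketch ignores the cap, UTRs and poly(A) tail.

```python
# Pairing of template-strand DNA bases with mRNA bases
COMPLEMENT = {"A": "U", "T": "A", "G": "C", "C": "G"}


def transcribe_coding_strand(coding_dna: str) -> str:
    """mRNA has the same sequence as the coding strand, with T replaced by U."""
    return coding_dna.upper().replace("T", "U")


def transcribe_template_strand(template_3_to_5: str) -> str:
    """Build mRNA (5'->3') by pairing each base of the template read 3'->5'."""
    return "".join(COMPLEMENT[base] for base in template_3_to_5.upper())


coding = "ATGCCT"     # hypothetical coding strand, 5'->3'
template = "TACGGA"   # its template strand, written 3'->5'
print(transcribe_coding_strand(coding))
print(transcribe_template_strand(template))
```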

In the present invention, the mRNA vaccine is generally produced by directly introducing mRNA encoding a viral antigen into a human body, and expressing the viral protein antigen in cells, thereby activating the immune system of the human body and generating a neutralizing antibody against the virus.

In the present invention, the antigen (abbreviated as Ag) refers to any substance that can induce an immune response, and is generally a substance that can induce the production of antibodies.

In the present invention, the antibody generally refers to immunoglobulin produced by plasma cells differentiated from B cells in the body under the stimulation of an antigen substance and capable of specifically binding and reacting with a corresponding antigen.

In the present invention, neutralizing antibodies are generally understood as follows: after a microorganism invades the human body, many antibodies are produced in response, but only some of them can rapidly recognize the microorganism and capture it before it invades human cells, thereby protecting the body from infection. This process is called neutralization, and the antibodies that exert this effect are neutralizing antibodies.

In the present invention, the liposome nanoparticle generally refers to a complex in which liposomes are used to package drug molecules (small-molecule compounds, RNA, DNA or protein drugs) into particles on the order of a hundred nanometers in size for delivery into the body, with advantages such as increasing the solubility of the drug, prolonging its retention time in the body, enhancing its targeting and reducing its toxicity.

In the present invention, the virus-like particles (VLPs) are typically hollow particles containing one or more structural proteins of a virus, do not contain viral nucleic acid, cannot replicate autonomously, and are identical or similar in morphology to authentic virus particles, and are commonly referred to as pseudoviruses.

In the present invention, the S protein of the novel coronavirus is also called the spike protein. The S protein is the most important pathogenic target protein of coronaviruses and comprises two subunits, S1 and S2. S1 mainly contains the receptor binding domain (RBD domain), and it is through the RBD domain that the coronavirus binds to a cell surface receptor to infect cells. The S protein thus mainly mediates binding of the virus to the host cell membrane receptor and membrane fusion. It is also an important site of action of host neutralizing antibodies and a key target for vaccine design.

On the basis of common knowledge in the field, the above preferred conditions can be combined arbitrarily to obtain the preferred embodiments of the invention.

The reagents and starting materials used in the present invention are commercially available.

The positive progress effects of the invention are as follows:

(1) The mRNAs of the several proteins required for assembling the novel coronavirus, which have been codon-optimized or further nucleotide-modified, can each be highly expressed in cells on their own. In addition, mRNAs combined in the specific proportions of the invention efficiently produce viral proteins at the cellular level, or the proteins produced self-assemble into virus-like particles, so that high expression of virus-like particles is achieved. The size and morphology of the virus-like particles are very close to those of the real virus, so that in subsequent clinical use they can enable the body to acquire immunity against the real virus.

(2) The nanoparticles containing the mRNA have high packaging and expression efficiency, and multiple mRNAs are packaged simultaneously in the lipid nanoparticles, so that virus-like particles can be generated in sufficient amounts to stimulate an immune response in the body, with high immunogenicity and stability.

(3) When the mRNAs of the several proteins required for assembling the novel coronavirus, which have been codon-optimized or further nucleotide-modified according to the present invention, are prepared into a vaccine (for example, a vaccine in the form of virus-like particles, a vaccine expressing only the S protein, or a vaccine expressing only the RBD region of the S protein), the vaccine is highly safe and effective and does not generate non-neutralizing antibodies, so that no antibody-dependent enhancement of infection occurs.

Drawings

FIG. 1 shows an overview of an embodiment of the invention. In the examples, mRNA was used to express the structural proteins S, M, E and N of the novel coronavirus and the RBD domain of the S protein. The mRNA is coated with liposomes to form nanoparticles (LNPs) for cell transfection or animal immunization. Multiple mRNAs transfected into cells in vitro can highly express viral proteins and, at the appropriate ratio, the proteins self-assemble into virus-like particles (VLPs). After mice are immunized with LNPs, their immune system is activated to produce antibodies.

FIG. 2 shows the results of Western Blot (WB) detection of protein expression after transfection of 293A cells with mRNA coated with liposomes. Lane 1 is the protein expressed by cap1-modified mRNA, lane 2 by cap1+5mC+pseudoU-modified mRNA, lane 3 by cap1+pseudoU-modified mRNA, lane 4 by cap1+5moU-modified mRNA, lane 5 by cap1+N1-m-pseudoU-modified mRNA, and lane 6 by cap1+5mC-modified mRNA. A shows the WB results for proteins expressed by the mRNA of the N protein and by NBL mRNA; B shows the WB results for proteins expressed by EBL mRNA and MBL mRNA; C shows the WB results for proteins expressed by SGS mRNA, STF mRNA and SBL mRNA; D shows the WB results for proteins expressed by SDC50, SDC54, SDC58 and SDC60; E shows the WB results for the protein expressed by the SGS-RBD domain mRNA; and F shows the WB results for proteins expressed by MP2AE and MT2AE.

FIG. 3 shows an electron micrograph of VLP particles.

FIG. 4 shows a schematic representation of mRNA lipid nanoparticle packaging.

FIG. 5 shows the LNP profiles detected with ZetaView. The upper panel of A is the particle size and distribution profile, before filtration, of LNPs coated with SGS mRNA expressing the S protein, and the lower panel is the particle size distribution profile of the same LNPs after filtration. The upper panel of B is the particle size and distribution profile, before filtration, of LNPs coated with mRNA expressing the RBD domain of the S protein, and the lower panel is the particle size distribution profile of the same LNPs after filtration. The upper panel of C is the particle size and distribution profile, before filtration, of LNPs coated with mRNA expressing the M, E and S proteins, and the lower panel is the particle size distribution profile of the same LNPs after filtration.

FIG. 6 is a graph showing the results of the measurement of the antibody titer in serum by ELISA one week after the first immunization.

FIG. 7 shows the results of the mouse serum neutralizing antibody titer experiments. The mRNA expressing the full-length S protein (Spike) and the mRNA expressing the virus-like particles (SME) induced antibody titers greater than 10^4. The mRNA expressing only the RBD domain (RBD) induced a neutralizing antibody titer slightly higher than that of the blank control (Ctrl).

Detailed Description

The invention is further illustrated by the following examples, which are not intended to limit the scope of the invention. The experimental methods without specifying specific conditions in the following examples were selected according to the conventional methods and conditions, or according to the commercial instructions.

The mRNA vaccine developed in the present invention against the novel coronavirus mainly adopts three approaches, as shown in FIG. 1: (1) expressing multiple viral proteins that assemble into virus-like particles in vivo; (2) expressing the full-length S protein from mRNA; (3) expressing the RBD domain of the S protein.

Example 1 mRNA preparation

The four structural genes S, M, E and N of the novel coronavirus (SARS-CoV-2) were codon-optimized, and several coding sequences were designed for each gene. Each sequence was cloned into an mRNA synthesis vector. For each sequence, two mRNAs were prepared, one encoding the wild-type protein without a tag and one encoding the protein with a Flag tag at the C-terminus for later expression validation. The specific steps are as follows:

Sangon Biotech (Shanghai) was commissioned to synthesize gene sequences, codon-optimized for SARS-CoV-2, encoding the S protein (spike protein), the M protein, the E protein and the N protein. The amino acid sequences of the four proteins are shown in SEQ ID NO.1, SEQ ID NO.9, SEQ ID NO.6 and SEQ ID NO.12, respectively, and their natural gene sequences are shown in SEQ ID NO.2 (linked to 3'-UTR-2), SEQ ID NO.10, SEQ ID NO.7 and SEQ ID NO.14 (linked to 3'-UTR-2), respectively. The optimized sequences of the S protein gene are shown in SEQ ID NO.3 (SGS, linked to 3'-UTR-2), SEQ ID NO.4 (SBL or S-benchling, linked to 3'-UTR-1), SEQ ID NO.5 (STF, linked to 3'-UTR-1), SEQ ID NO.24 (SDC50, linked to 3'-UTR-2), SEQ ID NO.25 (SDC54, linked to 3'-UTR-2), SEQ ID NO.26 (SDC58, linked to 3'-UTR-2) and SEQ ID NO.27 (SDC60, linked to 3'-UTR-2); the optimized sequence of the M protein gene is shown in SEQ ID NO.11 (MBL, linked to 3'-UTR-1); the optimized sequence of the E protein gene is shown in SEQ ID NO.8 (EBL, linked to 3'-UTR-1); the optimized sequence of the N protein gene is shown in SEQ ID NO.13 (NBL, linked to 3'-UTR-2). The codon-optimized gene sequences were then subcloned into a vector containing the T7 promoter, the 5' untranslated region (5' UTR, sequence shown in SEQ ID NO.15) and the 3' untranslated region (3' UTR, sequence shown in SEQ ID NO.16 (3'-UTR-1) or SEQ ID NO.17 (3'-UTR-2)). Two vectors were used: one in which the 5'-UTR and 3'-UTR-1 regions were added to pUC19, and one in which the 5'-UTR and 3'-UTR-2 regions were added to pUC57. The S, E and N proteins and the M protein are labeled at the C-terminus with HA and Flag tags, respectively. After the vector was amplified, it was linearized by digestion with restriction enzymes (all procedures are conventional in the art).
The cleaved fragments were further purified and used as templates for in vitro transcription (IVT) to synthesize modified mRNA. Specifically, IVT was performed using a HyperScribe T7 high yield RNA synthesis kit (ApexBio) with 1-2 μg of template and cap0 or cap1 cap analogs (purchased from ApexBio) (7.5 mM of each modified nucleotide). The reaction was incubated at 37 ℃ for 2-4 hours and then treated with DNase (Thermo). A 3' poly(A) tail was then added to the IVT RNA product using a poly(A) tailing kit (ApexBio). The mRNA was purified using an RNAclean and Concentrator kit (ApexBio). The resulting mRNA sequences of the optimized S protein gene are shown in SEQ ID NO.18 (SGS mRNA), SEQ ID NO.19 (SBL mRNA), SEQ ID NO.20 (STF mRNA), SEQ ID NO.31 (SDC50), SEQ ID NO.32 (SDC54), SEQ ID NO.33 (SDC58) and SEQ ID NO.34 (SDC60); the resulting mRNA sequence of the optimized M protein gene is shown in SEQ ID NO.22 (MBL mRNA); the resulting mRNA sequence of the optimized E protein gene is shown in SEQ ID NO.21 (EBL mRNA); and the resulting mRNA sequence of the optimized N protein gene is shown in SEQ ID NO.23 (NBL mRNA).

Example 2 modified nucleotides incorporated during in vitro transcription

In the in vitro transcription synthesis of modified mRNA described in Example 1, modified nucleotides are added to the reaction system in a certain ratio and are randomly incorporated into the mRNA sequence. The modified nucleotides tested in this example include 5-methyl-CTP (abbreviated as 5mC; ApexBio, #B7967), pseudo-UTP (abbreviated as pseudoU; ApexBio, #B7972), N1-methyl-pseudo-UTP (abbreviated as N1-m-pseudoU; ApexBio, #B8049) and 5-methoxy-UTP (abbreviated as 5moU; ApexBio, #B8061). The modified nucleotides used for 5' capping of the mRNA are 3'-O-Me-m7G(5')ppp(5')G (ARCA, Cap0; APExBIO, #B8175), m7G(5')ppp(5')(2'OMeA)pG (APExBIO EZ Cap, #B8176; Cap1) and m7(3'OMeG)(5')ppp(5')(2'OMeA)pG (APExBIO EZ Cap, #B8178; Cap1 analogue).

The specific experimental steps are as follows:

(1) Incorporation of modified nucleotides into the in vitro transcribed mRNA sequence: modified nucleotides were added to the reaction system at a ratio of modified to unmodified nucleotides of 1:5 and were randomly incorporated during in vitro transcription, using kit #K1047 from APExBIO. The reaction system was set up according to the kit instructions and incubated at 37 ℃ for 2-4 hours.

(2) Addition of the 5' capping nucleotide during transcription: m7(3'OMeG)(5')ppp(5')(2'OMeA)pG, m7G(5')ppp(5')(2'OMeA)pG or 3'-O-Me-m7G(5')ppp(5')G was added to the transcription reaction system at the same time, at a molar ratio of 8:1.

(3) Addition of a 120-nt poly(A) sequence at the 3' end: the 3' poly(A) tail was added to the IVT RNA product using a poly(A) tailing kit (APExBIO, #K1053); the reaction system was set up according to the kit instructions and incubated at 37 ℃ for 1 hour.

(4) Digesting a DNA template by DNase; DNA template digestion was carried out using DNase I (cat # M0303S) from NEB and the reaction was carried out at 37 ℃ for 1 hour.

(5) mRNA purification; after purification of the transcript with Thermo Fisher RiboPure Kit (# AM1924), the DNA template-digested mRNA was eluted with 1mM sodium citrate, pH 6.4. Agarose gel nucleic acid electrophoresis was performed to detect mRNA and the concentration was determined using NanoDrop.

Example 3 mRNA transfected cells

The S, M, E and N mRNAs obtained in Examples 1 and 2 were each transfected into 293A cells using Lipofectamine 2000 (lipo2K, ThermoFisher Scientific #11668019) at a mass-to-volume ratio of 1:2 (mRNA:lipo2K, that is, 1 μg mRNA + 2 μL lipo2K), and protein expression was examined 24 hours later by Western Blot. The results are shown in FIG. 2.

In FIG. 2, the lane numbers represent the different modified nucleotides incorporated into the mRNA sequence: 1. cap1; 2. cap1+5mC+pseudoU; 3. cap1+pseudoU; 4. cap1+5moU; 5. cap1+N1-m-pseudoU; 6. cap1+5mC. The N protein and the E protein expressed by the cells both carry an HA tag; an anti-HA antibody was used in the western blot to detect protein expression in the cells, and the GAPDH protein was used as a positive control. Specifically:

As shown in A of FIG. 2, the N protein is small, and the optimized mRNA of each sequence and modification can express the N protein in cells; the mRNAs with the two modifications cap1+5mC+pseudoU (lane 2) and cap1+5moU (lane 4) give relatively low protein expression.

In B of FIG. 2, the EBL sequence is strongly expressed, and a strong signal is detected with an antibody against the HA tag peptide. The MBL sequence is linked to a Flag tag peptide and is detected with an antibody against the Flag tag peptide; the four modification combinations cap1 (lane 1), cap1+pseudoU (lane 3), cap1+N1-m-pseudoU (lane 5) and cap1+5mC (lane 6) are expressed better, while the mRNAs with the two modification combinations cap1+5mC+pseudoU (lane 2) and cap1+5moU (lane 4) give a low expression level of the E protein.

In C and D of FIG. 2, the sequences expressing the S protein are linked to an HA tag peptide or a Flag tag peptide and detected with anti-HA or anti-Flag antibodies; protein expression differs greatly between sequences. As can be seen in C of FIG. 2, the native S gene sequence without optimization is hardly expressed, or is expressed in very low amounts, in 293A cells. The STF and SBL optimized sequences give slightly higher protein expression than the natural S gene sequence; for STF, the cap1+pseudoU (lane 3) and cap1+N1-m-pseudoU (lane 5) modifications give relatively high protein expression, and for SBL, the cap1+pseudoU (lane 3) modification gives relatively high protein expression. The SGS optimized sequence greatly increases protein expression; the highest expression is obtained with the SGS sequence modified with cap1+pseudoU (lane 3) or cap1+N1-m-pseudoU (lane 5), and the SGS sequence modified with cap1 (lane 1), cap1+5mC+pseudoU (lane 2) or cap1+5mC (lane 6) is also expressed at a relatively high level. As can be seen from D of FIG. 2, the optimized sequences SDC50, SDC54, SDC58 and SDC60 express many proteins, including contaminating (non-target) proteins.

In E of FIG. 2, the SGS-RBD optimized mRNA sequence with an HA tag peptide can highly express the RBD domain of the S protein in cells. The mRNA sequence is shown in SEQ ID NO.37 and the corresponding DNA sequence in SEQ ID NO.30; the mRNA is modified with pseudoU, has a Cap1 5' cap structure and a 120-nt poly(A) tail at the 3' end, and is linked to the 5' UTR shown in SEQ ID NO.15 and the 3' UTR shown in SEQ ID NO.16 or 17.

In F of FIG. 2, a single mRNA sequence is used to express the M and E proteins in tandem. That is, the mRNAs of the M protein and the E protein are linked and then expressed; mRNAs encoding different 2A peptides can be used for the linkage, and the 2A peptide is self-cleaved after protein expression, finally yielding independent M and E proteins. Based on the natural viral 2A sequences, the optimized DNA sequences encoding the T2A and P2A peptides are shown in SEQ ID NO.38 and SEQ ID NO.39, the T2A mRNA sequence is shown in SEQ ID NO.40 and the P2A mRNA sequence in SEQ ID NO.41, and these are translated into the polypeptides of SEQ ID NO.42 and SEQ ID NO.43. The mRNA sequence of MT2AE obtained after linkage is shown in SEQ ID NO.35 (the corresponding DNA sequence is shown in SEQ ID NO.28), and the mRNA sequence of MP2AE obtained after linkage is shown in SEQ ID NO.36 (the corresponding DNA sequence is shown in SEQ ID NO.29). Both are modified with pseudoU, have a Cap1 5' cap structure and a 120-nt poly(A) tail at the 3' end, and are linked to the 5' UTR shown in SEQ ID NO.15 and the 3' UTR shown in SEQ ID NO.16 or 17. The western blot shows a doublet of closely positioned bands representing the M and E proteins. The amounts of the two proteins obtained with the MP2AE optimized mRNA sequence are closer to each other.

Example 4 preparation and Observation of Virus-like particles

To produce virus-like particles (VLPs), the mRNAs expressing the S, M and E proteins (SME mRNA, consisting of SGS mRNA, MBL mRNA and EBL mRNA, all modified with pseudoU, with a Cap1 5' cap structure and a 120-nt poly(A) tail at the 3' end, and linked to the 5' UTR shown in SEQ ID NO.15 and the 3' UTR shown in SEQ ID NO.16 or 17) were coated with lipo2K at a molar ratio of 1:0.5:0.5 and co-transfected into 293A cells, and the supernatant was collected 48 hours after transfection. Alternatively, the mRNA expressing the M and E proteins in tandem (that is, the mRNAs of the M protein and the E protein are linked before the subsequent steps; different linker peptides can be used, and the sequence of the linked mRNA is shown in SEQ ID NO.35 or 36) and the mRNA expressing the S protein (SEQ ID NO.3) were coated with lipo2K at a molar ratio of 2:1 and transfected into 293A cells, and the supernatant was collected 48 hours after transfection.

The collected supernatant was concentrated using Amicon Ultra-15 (Millipore) with a molecular weight cut-off of 100 kDa and then placed in an appropriate solution (20 mM HEPES, pH 7.4, 120 mM NaCl). Immediately after ultracentrifugation at 31,000 rpm (Beckman ultracentrifuge, rotor model SW32) for 90 minutes at 4 ℃, the 30-40% (w/v) sucrose fraction containing the virus-like particles (VLPs) was withdrawn with a 5 mL syringe. The solution containing the VLPs was exchanged into PBS buffer using Amicon Ultra-15 centrifuge tubes with a 100 kDa cut-off. To prepare grids for negative-staining transmission electron microscopy (TEM), 5 μL of VLP solution was adsorbed onto a glow-discharged carbon-coated grid for 2 minutes. The grid was stained drop-wise for 60 seconds and then loaded onto a Talos L120C microscope (Thermo Fisher) to visualize the VLPs. The results are shown in FIG. 3: a of FIG. 3 is an electron micrograph of novel coronavirus-like particles self-assembled from the S, E and M proteins expressed from the mRNA; b of FIG. 3 is a magnified photograph of a single virus-like particle, in which the size of the surface spikes was measured; and c of FIG. 3 is a schematic cartoon of the novel coronavirus-like particle. As can be seen from FIG. 3, the VLP particles are about 90 nm in diameter under the electron microscope, and trimeric spikes similar to those of the natural virus are formed on the surface, with a spike size of about 12 × 13 nm, which is very close to the size and structure of the natural virus.
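Because the mRNAs differ in length, a molar ratio does not translate directly into a mass ratio. The sketch below shows one way to convert a molar ratio such as 1:0.5:0.5 into masses to weigh out. It assumes that the mass per mole of each mRNA is proportional to its length in nucleotides, and the lengths and total mass used are hypothetical round numbers, not values taken from the sequence listing.

```python
def mass_split_ug(total_ug, molar_ratio, lengths_nt):
    """Split a total mRNA mass so that the components meet a molar ratio.

    Assumes mass per mole is proportional to transcript length (same
    average nucleotide mass for every mRNA), which is an approximation.
    """
    weights = {name: molar_ratio[name] * lengths_nt[name] for name in molar_ratio}
    total_weight = sum(weights.values())
    return {name: total_ug * w / total_weight for name, w in weights.items()}


# Molar ratio S:M:E = 1:0.5:0.5 as in the text; lengths are hypothetical.
ratio = {"S": 1.0, "M": 0.5, "E": 0.5}
lengths = {"S": 4000, "M": 1000, "E": 500}

masses = mass_split_ug(9.5, ratio, lengths)
for name, ug in masses.items():
    print(f"{name}: {ug:.2f} ug")
```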

Example 5 mRNA encapsulation method

Following a previously reported method, the mRNA containing modified nucleotides obtained in Example 2 (mRNA expressing the RBD domain of the S protein; SGS mRNA expressing the S protein; or SME mRNA expressing the three proteins S, M and E, mixed at a molar ratio of 1:0.5:0.5; all modified with pseudouridine (pseudoU), carrying a 5' Cap1 cap structure and a 3' poly(A) tail of 120 adenosines, and linked to the 5' UTR shown in SEQ ID NO.15 and the 3' UTR shown in SEQ ID NO.16 or 17) was encapsulated with DLin-MC3-DMA (APBIO, #A8791), a lipid that is ionizable (cationic) at low pH, two helper lipids (DSPC and cholesterol) and a PEGylated lipid (DMPE-PEG2000) to form nanoparticles (as shown in FIG. 4). An aqueous mRNA solution was prepared by mixing mRNA dissolved in ultrapure water with 100 mM citrate buffer (pH 3.0) at 1:1 (v/v). The four lipid components [ionizable lipid : cholesterol : DSPC : DMPE-PEG2000] were dissolved in ethanol (99.5%) at a ratio of 50:10:38.5:1.5 to form a lipid solution. The mRNA solution and the lipid solution were mixed in a NanoAssemblr (Precision NanoSystems) microfluidic mixing system at an aqueous-to-ethanol (Aq:EtOH) volumetric mixing ratio of 3:1 and a constant total flow rate of 12 mL/min, yielding liposomal nanoparticles (LNPs) containing the mRNA.
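
The mixing parameters above can be turned into per-channel flow rates and lipid fractions with simple arithmetic. The following is an illustrative sketch only; the helper names are hypothetical, and the lipid ratio is entered in the order given in the text.

```python
def split_flow(total_ml_min, aqueous_parts, ethanol_parts):
    """Return (aqueous, ethanol) flow rates for a given volumetric ratio."""
    parts = aqueous_parts + ethanol_parts
    return (total_ml_min * aqueous_parts / parts,
            total_ml_min * ethanol_parts / parts)


def normalise(ratio):
    """Convert a dict of ratio parts into fractions that sum to 1."""
    total = sum(ratio.values())
    return {name: value / total for name, value in ratio.items()}


# Aq:EtOH = 3:1 at a total flow rate of 12 mL/min, as stated in the text.
aq_flow, etoh_flow = split_flow(12.0, 3, 1)

# Lipid ratio in the order listed in the text.
lipid_ratio = {
    "ionizable lipid": 50,
    "cholesterol": 10,
    "DSPC": 38.5,
    "DMPE-PEG2000": 1.5,
}
fractions = normalise(lipid_ratio)
```

Here `aq_flow` and `etoh_flow` are the rates to set on the aqueous and ethanol channels, and `fractions` gives each lipid's share of the total.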

To characterize the LNPs prepared as described above, 25 μL of the sample fraction was injected into 975 μL of 10 mM phosphate buffer (pH 7.4) immediately after preparation and used to measure the intensity-averaged particle size (Z-average) on a ZetaSizer (Malvern Instruments Inc.). The sample fractions were immediately transferred to Slide-A-Lyzer G2 dialysis cassettes (10,000 MWCO, Thermo Fisher Scientific Inc.) and dialyzed against PBS (pH 7.4) at 4 °C overnight. The volume of PBS buffer was 650 to 800 times the sample volume. A sample fraction was then collected, 25 μL of it was injected into 975 μL of 10 mM phosphate buffer (pH 7.4), and the particle size was measured again (post-dialysis particle size). As shown in FIG. 5 and Table 1, the LNPs were about 100 nm in diameter both before and after dialysis and were uniform and stable. The dialyzed sample was used for immunization of mice by injection. FIG. 5 shows the results for SGS mRNA expressing the S protein, as well as the encapsulation results for the mRNA expressing the RBD domain of the S protein and for the mRNA expressing S, M and E. These results show that the particle size after liposome encapsulation of the mRNA is between 100 and 110 nm, and the encapsulation efficiency is greater than 90%.
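
The sizing and dialysis volumes above imply a fixed dilution factor and a range of buffer volumes. A small sketch of that arithmetic, with hypothetical helper names:

```python
def dilution_factor(sample_ul, diluent_ul):
    """Fold dilution obtained when a sample is added to a diluent volume."""
    return (sample_ul + diluent_ul) / sample_ul


def dialysis_buffer_range_ml(sample_ml, low_fold=650, high_fold=800):
    """Buffer volume range for dialysis at 650 to 800 times the sample volume."""
    return sample_ml * low_fold, sample_ml * high_fold


# 25 uL of sample in 975 uL of buffer, as used for particle sizing.
factor = dilution_factor(25, 975)

# Buffer volumes for a hypothetical 1 mL sample.
buffer_low, buffer_high = dialysis_buffer_range_ml(1.0)
```

The sizing dilution works out to 40-fold, and a 1 mL sample would need 650 to 800 mL of PBS.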

Table 1 shows the particle size and distribution of the LNP-encapsulated mRNA samples as measured with ZetaView. S-RBD mRNA expresses the RBD domain of the S protein, SGS mRNA expresses the S protein, and SME mRNA expresses the three proteins S, M and E, which can form virus-like particles. The particle size of the encapsulated LNPs is between 100 and 110 nm, which matches the expected size of the nanoparticles. The particle count after dilution is between 100 and 300, indicating that the dilution ratio is suitable. After dialysis against 1× PBS and filtration through 0.22 μm or 0.45 μm filters, the particle size and count remained stable, and the samples could be used for the subsequent animal experiments.

TABLE 1

Example 6 mouse immunization experiment

The encapsulated liposomal nanoparticles described above, expressing novel coronavirus VLPs (containing the SGS mRNA expressing the S protein described in Example 5, or the SME mRNA expressing the three proteins S, M and E) or the RBD (containing the mRNA expressing the RBD domain of the S protein described in Example 5), were injected intramuscularly (i.m.) with an immunoadjuvant into Balb/c mice according to the scheme shown in Table 2 below. Blood samples were collected on day 42, and the sera were analyzed by enzyme-linked immunosorbent assay (Example 7) and in a fluorescent antibody virus neutralization assay (Example 8), as described below.

TABLE 2

Group   Strain   Number of mice   Route and volume   Vaccine dose   Inoculation schedule
1       Balb/c   8                i.m., 50 μl × 3    PBS control    D0, prime; D14, boost; D35, boost
2       Balb/c   8                i.m., 50 μl × 1    mRNA 10 μg     D0, prime
3       Balb/c   8                i.m., 50 μl × 2    mRNA 10 μg     D0, prime; D14, boost
4       Balb/c   8                i.m., 50 μl × 3    mRNA 10 μg     D0, prime; D14, boost; D35, boost

Example 7 measurement of antibody titer in serum by enzyme-linked immunosorbent assay

96-well ELISA plates were coated with antigen protein at 2 μg/ml in PBS, 50 μl/well (100 ng per well), at 4 °C overnight, protected from light. The S protein antigen was purchased from Sino Biological, cat # 40589-V08B1; the RBD domain of the S protein was purchased from Novoprotein, cat # DRA36. The plates were washed 3 times with PBST (0.05% Tween), 200 μl/well; after each wash the plate was inverted and tapped dry. Blocking was performed by adding 2% BSA (in PBST) at 100 μl/well and incubating at room temperature for 1 hr. The plates were again washed 3 times with PBST (0.05% Tween), 200 μl/well, inverting the plate and tapping it dry after each wash. Mouse serum was diluted in PBS (100-fold as the initial dilution, followed by 5-fold serial dilutions, 6 dilutions in total) and mixed well; 100 μl of each dilution was added to the ELISA plate and incubated at room temperature for 2 hr. The serum was obtained from the mice in Example 6 by periorbital bleeding: about 100 μl of blood yielded about 20 μl of serum. After washing, HRP-conjugated anti-mouse IgG (diluted 1:5000 in PBS) was added at 100 μl/well and incubated at room temperature for 1 hr. After the plates were washed, TMB substrate (Thermo Fisher, cat # 34022) was added at 50 μl/well and left at room temperature for 5-15 min (protected from light) to develop a blue color. The reaction was stopped by adding 1 M sulfuric acid at 150 μl/well, upon which the blue color turned yellow. The OD450 was read on a microplate reader.
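
The dilution scheme and coating amount in this protocol can be checked with a few lines of Python. The sketch below also shows one common way to read an endpoint titer from OD450 values. The cutoff rule and the OD values are assumptions for illustration, since the text does not state how the titer was derived.

```python
def serial_dilutions(start, step, n):
    """Dilution factors for an n-point series, e.g. 1:100 then 5-fold steps."""
    return [start * step ** i for i in range(n)]


def coating_mass_ng(conc_ug_per_ml, volume_ul):
    """Antigen mass per well; 1 ug/ml is numerically equal to 1 ng/ul."""
    return conc_ug_per_ml * volume_ul


def endpoint_titer(dilutions, od_values, cutoff):
    """Highest dilution whose OD450 exceeds the cutoff, or None if none does."""
    positive = [d for d, od in zip(dilutions, od_values) if od > cutoff]
    return max(positive) if positive else None


dilutions = serial_dilutions(100, 5, 6)
mass_per_well = coating_mass_ng(2, 50)

# Hypothetical OD450 readings and cutoff, for illustration only.
example_od = [2.8, 2.1, 1.2, 0.6, 0.25, 0.08]
titer = endpoint_titer(dilutions, example_od, cutoff=0.2)
```

The series covers dilutions from 1:100 to 1:312,500, and the coating mass agrees with the 100 ng per well stated in the protocol.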

The results are shown in FIG. 6. The highest antibody titer, up to 10^7, was observed for the mRNA expressing virus-like particles. The mRNA expressing the S protein produced an antibody titer of 10^6, and the mRNA expressing the RBD domain of the S protein produced an antibody titer of 10^4. Therefore, the virus-like particles expressed from the mRNA can effectively activate the immune system of mice, promote the generation of antibodies in serum and effectively function as a vaccine.

Example 8 neutralizing antibody detection assay

Virus-neutralizing antibody responses (specific B-cell immune responses) were detected by a virus neutralization assay. The result of this assay is called the virus neutralization titer (VNT). According to WHO standards, an antibody titer is considered protective if the respective VNT is at least 0.5 IU/ml. Therefore, blood samples were taken on day 42 from the vaccinated mice described in Example 6, and sera were prepared. These sera were used in a fluorescent antibody virus neutralization (FAVN) assay using human CACO-2 cells. Cultured cells were infected with pseudovirions (expressing the novel coronavirus S protein, with EGFP DNA as the core). Briefly, heat-inactivated sera were tested in quadruplicate in serial two-fold dilutions for their potential to neutralize 100 TCID50 (50% tissue culture infectious dose) of pseudovirions in a volume of 50 μl. For this purpose, the serum dilutions were incubated with virus at 37 °C (in a humidified incubator with 5% CO2) for 1 hour, and then trypsinized CACO-2 cells (4 × 10^5 cells/ml; 50 μl/well) were added. The infected cell cultures were incubated for 48 hours in a humidified incubator at 37 °C and 5% CO2. After the cells were fixed with 80% acetone at room temperature, EGFP expression was detected by fluorescence, and its level was used as a marker of cell infection.
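
Two quantities in this assay lend themselves to a short calculation: the number of cells seeded per well, and the 50% neutralization endpoint from replicate wells. The sketch below uses the Spearman-Kärber formula, which is commonly used for such endpoint assays but is not stated in the text as the method applied here. The well counts are hypothetical, and converting the result to IU/ml would additionally require a calibrated reference serum.

```python
import math


def cells_per_well(cells_per_ml, volume_ul):
    """Number of cells delivered to a well from a suspension."""
    return cells_per_ml * volume_ul / 1000.0


def spearman_karber_log10(first_dilution, step, protected, replicates):
    """log10 of the serum dilution protecting 50% of wells.

    protected[i] is the number of non-infected wells at the i-th dilution,
    ordered from the lowest dilution (most serum) upward. The estimate
    assumes full protection at the first dilution and none at the last.
    """
    proportions = [p / replicates for p in protected]
    d = math.log10(step)
    return math.log10(first_dilution) + d * (sum(proportions) - 0.5)


seeded = cells_per_well(4e5, 50)

# Hypothetical quadruplicate readout at dilutions 1:2, 1:4, 1:8, 1:16.
log_nd50 = spearman_karber_log10(2, 2, [4, 4, 2, 0], replicates=4)
nd50 = 10 ** log_nd50
```

In this hypothetical readout half of the wells are protected at 1:8, so the estimated 50% endpoint falls at that dilution.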

The results in FIG. 7 show that the vaccine of the present invention (in the form of virus-like particles, a vaccine expressing only the S protein, or a vaccine expressing only the RBD region of the S protein) effectively activates the immune system of mice and induces antibodies in serum, and is highly safe and effective. Among these, the combination of mRNAs expressing virus-like particles produces the highest neutralizing antibody titer. No non-neutralizing antibodies are produced, and therefore no antibody-dependent enhancement of infection occurs.

References

1. Huang C, Wang Y, Li X, Ren L, Zhao J, et al. 2020. Lancet

2. Zhu N, Zhang D, Wang W, Li X, Yang B, et al. 2020. N Engl J Med

3. de Wit E, van Doremalen N, Falzarano D, Munster VJ. 2016. Nat Rev Microbiol 14:523-34

4. Potter CW. 2001. J Appl Microbiol 91:572-9

5. Smith W, Andrewes CH, Laidlaw PP. 1933. Lancet 2:66-8

6. Barberis I, Myles P, Ault SK, Bragazzi NL, Martini M. 2016. J Prev Med Hyg 57:E115-E20

7. Wolff JA, Malone RW, Williams P, Chong W, Acsadi G, et al. 1990. Science 247:1465-8

8. Jirikowski GF, Sanna PP, Maciejewski-Lenoir D, Bloom FE. 1992. Science 255:996-8

9. Zangi L, Lui KO, von Gise A, Ma Q, Ebina W, et al. 2013. Nat Biotechnol 31:898-907

10. Kariko K, Muramatsu H, Ludwig J, Weissman D. 2011. Nucleic Acids Res 39:e142

11. Reichmuth AM, Oberli MA, Jaklenec A, Langer R, Blankschtein D. 2016. Ther Deliv 7:319-34

12. Sahin U, Kariko K, Tureci O. 2014. Nat Rev Drug Discov 13:759-80

13. Pardi N, Hogan MJ, Porter FW, Weissman D. 2018. Nat Rev Drug Discov 17:261-79

14. Hekele A, Bertholet S, Archer J, Gibson DG, Palladino G, et al. 2013. Emerg Microbes Infect 2:e52

15. Richner JM, Himansu S, Dowd KA, Butler SL, Salazar V, et al. 2017. Cell 169:176

16. Richner JM, Jagger BW, Shan C, Fontes CR, Dowd KA, et al. 2017. Cell 170:273-83 e12

17. Feldman RA, Fuhr R, Smolenov I, Mick Ribeiro A, Panther L, et al. 2019. Vaccine 37:3326-34

18. Chroboczek J, Szurgot I, Szolajska E. 2014. Acta Biochim Pol 61:531-9

19. Yong CY, Ong HK, Yeap SK, Ho KL, Tan WS. 2019. Front Microbiol 10:1781

20. Baric RS, Sheahan T, Deming D, Donaldson E, Yount B, et al. 2006. Adv Exp Med Biol 581:553-60

21. Yip MS, Leung HL, Li PH, Cheung CY, Dutry I, et al. 2016. Hong Kong Med J 22:25-31

22. Millet JK, Tang T, Nathan L, Jaimes JA, Hsu HL, et al. 2019. J Vis Exp

23. Islam MA, Xu Y, Tao W, Ubellacker JM, Lim M, et al. 2018. Nat Biomed Eng 2:850-64

SEQUENCE LISTING

<110> Shanghai blue magpie Bio-pharmaceutical Co Ltd

<120> mRNA and novel coronavirus mRNA vaccine comprising the same

<130> P20011191C

<160> 43

<170> PatentIn version 3.5

<210> 1

<211> 1273

<212> PRT

<213> SARS-COV-2

<400> 1

Met Phe Val Phe Leu Val Leu Leu Pro Leu Val Ser Ser Gln Cys Val

1 5 10 15

Asn Leu Thr Thr Arg Thr Gln Leu Pro Pro Ala Tyr Thr Asn Ser Phe

20 25 30

Thr Arg Gly Val Tyr Tyr Pro Asp Lys Val Phe Arg Ser Ser Val Leu

35 40 45

His Ser Thr Gln Asp Leu Phe Leu Pro Phe Phe Ser Asn Val Thr Trp

50 55 60

Phe His Ala Ile His Val Ser Gly Thr Asn Gly Thr Lys Arg Phe Asp

65 70 75 80

Asn Pro Val Leu Pro Phe Asn Asp Gly Val Tyr Phe Ala Ser Thr Glu

85 90 95

Lys Ser Asn Ile Ile Arg Gly Trp Ile Phe Gly Thr Thr Leu Asp Ser

100 105 110

Lys Thr Gln Ser Leu Leu Ile Val Asn Asn Ala Thr Asn Val Val Ile

115 120 125

Lys Val Cys Glu Phe Gln Phe Cys Asn Asp Pro Phe Leu Gly Val Tyr

130 135 140

Tyr His Lys Asn Asn Lys Ser Trp Met Glu Ser Glu Phe Arg Val Tyr

145 150 155 160

Ser Ser Ala Asn Asn Cys Thr Phe Glu Tyr Val Ser Gln Pro Phe Leu

165 170 175

Met Asp Leu Glu Gly Lys Gln Gly Asn Phe Lys Asn Leu Arg Glu Phe

180 185 190

Val Phe Lys Asn Ile Asp Gly Tyr Phe Lys Ile Tyr Ser Lys His Thr

195 200 205

Pro Ile Asn Leu Val Arg Asp Leu Pro Gln Gly Phe Ser Ala Leu Glu

210 215 220

Pro Leu Val Asp Leu Pro Ile Gly Ile Asn Ile Thr Arg Phe Gln Thr

225 230 235 240

Leu Leu Ala Leu His Arg Ser Tyr Leu Thr Pro Gly Asp Ser Ser Ser

245 250 255

Gly Trp Thr Ala Gly Ala Ala Ala Tyr Tyr Val Gly Tyr Leu Gln Pro

260 265 270

Arg Thr Phe Leu Leu Lys Tyr Asn Glu Asn Gly Thr Ile Thr Asp Ala

275 280 285

Val Asp Cys Ala Leu Asp Pro Leu Ser Glu Thr Lys Cys Thr Leu Lys

290 295 300

Ser Phe Thr Val Glu Lys Gly Ile Tyr Gln Thr Ser Asn Phe Arg Val

305 310 315 320

Gln Pro Thr Glu Ser Ile Val Arg Phe Pro Asn Ile Thr Asn Leu Cys

325 330 335

Pro Phe Gly Glu Val Phe Asn Ala Thr Arg Phe Ala Ser Val Tyr Ala

340 345 350

Trp Asn Arg Lys Arg Ile Ser Asn Cys Val Ala Asp Tyr Ser Val Leu

355 360 365

Tyr Asn Ser Ala Ser Phe Ser Thr Phe Lys Cys Tyr Gly Val Ser Pro

370 375 380

Thr Lys Leu Asn Asp Leu Cys Phe Thr Asn Val Tyr Ala Asp Ser Phe

385 390 395 400

Val Ile Arg Gly Asp Glu Val Arg Gln Ile Ala Pro Gly Gln Thr Gly

405 410 415

Lys Ile Ala Asp Tyr Asn Tyr Lys Leu Pro Asp Asp Phe Thr Gly Cys

420 425 430

Val Ile Ala Trp Asn Ser Asn Asn Leu Asp Ser Lys Val Gly Gly Asn

435 440 445

Tyr Asn Tyr Leu Tyr Arg Leu Phe Arg Lys Ser Asn Leu Lys Pro Phe

450 455 460

Glu Arg Asp Ile Ser Thr Glu Ile Tyr Gln Ala Gly Ser Thr Pro Cys

465 470 475 480

Asn Gly Val Glu Gly Phe Asn Cys Tyr Phe Pro Leu Gln Ser Tyr Gly

485 490 495

Phe Gln Pro Thr Asn Gly Val Gly Tyr Gln Pro Tyr Arg Val Val Val

500 505 510

Leu Ser Phe Glu Leu Leu His Ala Pro Ala Thr Val Cys Gly Pro Lys

515 520 525

Lys Ser Thr Asn Leu Val Lys Asn Lys Cys Val Asn Phe Asn Phe Asn

530 535 540

Gly Leu Thr Gly Thr Gly Val Leu Thr Glu Ser Asn Lys Lys Phe Leu

545 550 555 560

Pro Phe Gln Gln Phe Gly Arg Asp Ile Ala Asp Thr Thr Asp Ala Val

565 570 575

Arg Asp Pro Gln Thr Leu Glu Ile Leu Asp Ile Thr Pro Cys Ser Phe

580 585 590

Gly Gly Val Ser Val Ile Thr Pro Gly Thr Asn Thr Ser Asn Gln Val

595 600 605

Ala Val Leu Tyr Gln Asp Val Asn Cys Thr Glu Val Pro Val Ala Ile

610 615 620

His Ala Asp Gln Leu Thr Pro Thr Trp Arg Val Tyr Ser Thr Gly Ser

625 630 635 640

Asn Val Phe Gln Thr Arg Ala Gly Cys Leu Ile Gly Ala Glu His Val

645 650 655

Asn Asn Ser Tyr Glu Cys Asp Ile Pro Ile Gly Ala Gly Ile Cys Ala

660 665 670

Ser Tyr Gln Thr Gln Thr Asn Ser Pro Arg Arg Ala Arg Ser Val Ala

675 680 685

Ser Gln Ser Ile Ile Ala Tyr Thr Met Ser Leu Gly Ala Glu Asn Ser

690 695 700

Val Ala Tyr Ser Asn Asn Ser Ile Ala Ile Pro Thr Asn Phe Thr Ile

705 710 715 720

Ser Val Thr Thr Glu Ile Leu Pro Val Ser Met Thr Lys Thr Ser Val

725 730 735

Asp Cys Thr Met Tyr Ile Cys Gly Asp Ser Thr Glu Cys Ser Asn Leu

740 745 750

Leu Leu Gln Tyr Gly Ser Phe Cys Thr Gln Leu Asn Arg Ala Leu Thr

755 760 765

Gly Ile Ala Val Glu Gln Asp Lys Asn Thr Gln Glu Val Phe Ala Gln

770 775 780

Val Lys Gln Ile Tyr Lys Thr Pro Pro Ile Lys Asp Phe Gly Gly Phe

785 790 795 800

Asn Phe Ser Gln Ile Leu Pro Asp Pro Ser Lys Pro Ser Lys Arg Ser

805 810 815

Phe Ile Glu Asp Leu Leu Phe Asn Lys Val Thr Leu Ala Asp Ala Gly

820 825 830

Phe Ile Lys Gln Tyr Gly Asp Cys Leu Gly Asp Ile Ala Ala Arg Asp

835 840 845

Leu Ile Cys Ala Gln Lys Phe Asn Gly Leu Thr Val Leu Pro Pro Leu

850 855 860

Leu Thr Asp Glu Met Ile Ala Gln Tyr Thr Ser Ala Leu Leu Ala Gly

865 870 875 880

Thr Ile Thr Ser Gly Trp Thr Phe Gly Ala Gly Ala Ala Leu Gln Ile

885 890 895

Pro Phe Ala Met Gln Met Ala Tyr Arg Phe Asn Gly Ile Gly Val Thr

900 905 910

Gln Asn Val Leu Tyr Glu Asn Gln Lys Leu Ile Ala Asn Gln Phe Asn

915 920 925

Ser Ala Ile Gly Lys Ile Gln Asp Ser Leu Ser Ser Thr Ala Ser Ala

930 935 940

Leu Gly Lys Leu Gln Asp Val Val Asn Gln Asn Ala Gln Ala Leu Asn

945 950 955 960

Thr Leu Val Lys Gln Leu Ser Ser Asn Phe Gly Ala Ile Ser Ser Val

965 970 975

Leu Asn Asp Ile Leu Ser Arg Leu Asp Lys Val Glu Ala Glu Val Gln

980 985 990

Ile Asp Arg Leu Ile Thr Gly Arg Leu Gln Ser Leu Gln Thr Tyr Val

995 1000 1005

Thr Gln Gln Leu Ile Arg Ala Ala Glu Ile Arg Ala Ser Ala Asn

1010 1015 1020

Leu Ala Ala Thr Lys Met Ser Glu Cys Val Leu Gly Gln Ser Lys

1025 1030 1035

Arg Val Asp Phe Cys Gly Lys Gly Tyr His Leu Met Ser Phe Pro

1040 1045 1050

Gln Ser Ala Pro His Gly Val Val Phe Leu His Val Thr Tyr Val

1055 1060 1065

Pro Ala Gln Glu Lys Asn Phe Thr Thr Ala Pro Ala Ile Cys His

1070 1075 1080

Asp Gly Lys Ala His Phe Pro Arg Glu Gly Val Phe Val Ser Asn

1085 1090 1095

Gly Thr His Trp Phe Val Thr Gln Arg Asn Phe Tyr Glu Pro Gln

1100 1105 1110

Ile Ile Thr Thr Asp Asn Thr Phe Val Ser Gly Asn Cys Asp Val

1115 1120 1125

Val Ile Gly Ile Val Asn Asn Thr Val Tyr Asp Pro Leu Gln Pro

1130 1135 1140

Glu Leu Asp Ser Phe Lys Glu Glu Leu Asp Lys Tyr Phe Lys Asn

1145 1150 1155

His Thr Ser Pro Asp Val Asp Leu Gly Asp Ile Ser Gly Ile Asn

1160 1165 1170

Ala Ser Val Val Asn Ile Gln Lys Glu Ile Asp Arg Leu Asn Glu

1175 1180 1185

Val Ala Lys Asn Leu Asn Glu Ser Leu Ile Asp Leu Gln Glu Leu

1190 1195 1200

Gly Lys Tyr Glu Gln Tyr Ile Lys Trp Pro Trp Tyr Ile Trp Leu

1205 1210 1215

Gly Phe Ile Ala Gly Leu Ile Ala Ile Val Met Val Thr Ile Met

1220 1225 1230

Leu Cys Cys Met Thr Ser Cys Cys Ser Cys Leu Lys Gly Cys Cys

1235 1240 1245

Ser Cys Gly Ser Cys Cys Lys Phe Asp Glu Asp Asp Ser Glu Pro

1250 1255 1260

Val Leu Lys Gly Val Lys Leu His Tyr Thr

1265 1270

<210> 2

<211> 3819

<212> DNA

<213> SARS-COV-2

<400> 2

atgtttgttt ttcttgtttt attgccacta gtctctagtc agtgtgttaa tcttacaacc 60

agaactcaat taccccctgc atacactaat tctttcacac gtggtgttta ttaccctgac 120

aaagttttca gatcctcagt tttacattca actcaggact tgttcttacc tttcttttcc 180

aatgttactt ggttccatgc tatacatgtc tctgggacca atggtactaa gaggtttgat 240

aaccctgtcc taccatttaa tgatggtgtt tattttgctt ccactgagaa gtctaacata 300

ataagaggct ggatttttgg tactacttta gattcgaaga cccagtccct acttattgtt 360

aataacgcta ctaatgttgt tattaaagtc tgtgaatttc aattttgtaa tgatccattt 420

ttgggtgttt attaccacaa aaacaacaaa agttggatgg aaagtgagtt cagagtttat 480

tctagtgcga ataattgcac ttttgaatat gtctctcagc cttttcttat ggaccttgaa 540

ggaaaacagg gtaatttcaa aaatcttagg gaatttgtgt ttaagaatat tgatggttat 600

tttaaaatat attctaagca cacgcctatt aatttagtgc gtgatctccc tcagggtttt 660

tcggctttag aaccattggt agatttgcca ataggtatta acatcactag gtttcaaact 720

ttacttgctt tacatagaag ttatttgact cctggtgatt cttcttcagg ttggacagct 780

ggtgctgcag cttattatgt gggttatctt caacctagga cttttctatt aaaatataat 840

gaaaatggaa ccattacaga tgctgtagac tgtgcacttg accctctctc agaaacaaag 900

tgtacgttga aatccttcac tgtagaaaaa ggaatctatc aaacttctaa ctttagagtc 960

caaccaacag aatctattgt tagatttcct aatattacaa acttgtgccc ttttggtgaa 1020

gtttttaacg ccaccagatt tgcatctgtt tatgcttgga acaggaagag aatcagcaac 1080

tgtgttgctg attattctgt cctatataat tccgcatcat tttccacttt taagtgttat 1140

ggagtgtctc ctactaaatt aaatgatctc tgctttacta atgtctatgc agattcattt 1200

gtaattagag gtgatgaagt cagacaaatc gctccagggc aaactggaaa gattgctgat 1260

tataattata aattaccaga tgattttaca ggctgcgtta tagcttggaa ttctaacaat 1320

cttgattcta aggttggtgg taattataat tacctgtata gattgtttag gaagtctaat 1380

ctcaaacctt ttgagagaga tatttcaact gaaatctatc aggccggtag cacaccttgt 1440

aatggtgttg aaggttttaa ttgttacttt cctttacaat catatggttt ccaacccact 1500

aatggtgttg gttaccaacc atacagagta gtagtacttt cttttgaact tctacatgca 1560

ccagcaactg tttgtggacc taaaaagtct actaatttgg ttaaaaacaa atgtgtcaat 1620

ttcaacttca atggtttaac aggcacaggt gttcttactg agtctaacaa aaagtttctg 1680

cctttccaac aatttggcag agacattgct gacactactg atgctgtccg tgatccacag 1740

acacttgaga ttcttgacat tacaccatgt tcttttggtg gtgtcagtgt tataacacca 1800

ggaacaaata cttctaacca ggttgctgtt ctttatcagg atgttaactg cacagaagtc 1860

cctgttgcta ttcatgcaga tcaacttact cctacttggc gtgtttattc tacaggttct 1920

aatgtttttc aaacacgtgc aggctgttta ataggggctg aacatgtcaa caactcatat 1980

gagtgtgaca tacccattgg tgcaggtata tgcgctagtt atcagactca gactaattct 2040

cctcggcggg cacgtagtgt agctagtcaa tccatcattg cctacactat gtcacttggt 2100

gcagaaaatt cagttgctta ctctaataac tctattgcca tacccacaaa ttttactatt 2160

agtgttacca cagaaattct accagtgtct atgaccaaga catcagtaga ttgtacaatg 2220

tacatttgtg gtgattcaac tgaatgcagc aatcttttgt tgcaatatgg cagtttttgt 2280

acacaattaa accgtgcttt aactggaata gctgttgaac aagacaaaaa cacccaagaa 2340

gtttttgcac aagtcaaaca aatttacaaa acaccaccaa ttaaagattt tggtggtttt 2400

aatttttcac aaatattacc agatccatca aaaccaagca agaggtcatt tattgaagat 2460

ctacttttca acaaagtgac acttgcagat gctggcttca tcaaacaata tggtgattgc 2520

cttggtgata ttgctgctag agacctcatt tgtgcacaaa agtttaacgg ccttactgtt 2580

ttgccacctt tgctcacaga tgaaatgatt gctcaataca cttctgcact gttagcgggt 2640

acaatcactt ctggttggac ctttggtgca ggtgctgcat tacaaatacc atttgctatg 2700

caaatggctt ataggtttaa tggtattgga gttacacaga atgttctcta tgagaaccaa 2760

aaattgattg ccaaccaatt taatagtgct attggcaaaa ttcaagactc actttcttcc 2820

acagcaagtg cacttggaaa acttcaagat gtggtcaacc aaaatgcaca agctttaaac 2880

acgcttgtta aacaacttag ctccaatttt ggtgcaattt caagtgtttt aaatgatatc 2940

ctttcacgtc ttgacaaagt tgaggctgaa gtgcaaattg ataggttgat cacaggcaga 3000

cttcaaagtt tgcagacata tgtgactcaa caattaatta gagctgcaga aatcagagct 3060

tctgctaatc ttgctgctac taaaatgtca gagtgtgtac ttggacaatc aaaaagagtt 3120

gatttttgtg gaaagggcta tcatcttatg tccttccctc agtcagcacc tcatggtgta 3180

gtcttcttgc atgtgactta tgtccctgca caagaaaaga acttcacaac tgctcctgcc 3240

atttgtcatg atggaaaagc acactttcct cgtgaaggtg tctttgtttc aaatggcaca 3300

cactggtttg taacacaaag gaatttttat gaaccacaaa tcattactac agacaacaca 3360

tttgtgtctg gtaactgtga tgttgtaata ggaattgtca acaacacagt ttatgatcct 3420

ttgcaacctg aattagactc attcaaggag gagttagata aatattttaa gaatcataca 3480

tcaccagatg ttgatttagg tgacatctct ggcattaatg cttcagttgt aaacattcaa 3540

aaagaaattg accgcctcaa tgaggttgcc aagaatttaa atgaatctct catcgatctc 3600

caagaacttg gaaagtatga gcagtatata aaatggccat ggtacatttg gctaggtttt 3660

atagctggct tgattgccat agtaatggtg acaattatgc tttgctgtat gaccagttgc 3720

tgtagttgtc tcaagggctg ttgttcttgt ggatcctgct gcaaatttga tgaagacgac 3780

tctgagccag tgctcaaagg agtcaaatta cattacaca 3819

<210> 3

<211> 3819

<212> DNA

<213> Artificial Sequence

<220>

<223> sequence after S protein gene optimization (S-GS)

<400> 3

atgttcgtct tcctggtcct gctgcctctg gtctcctcac agtgcgtcaa tctgacaact 60

cggactcagc tgccacctgc ttatactaat agcttcacca gaggcgtgta ctatcctgac 120

aaggtgttta gaagctccgt gctgcactct acacaggatc tgtttctgcc attctttagc 180

aacgtgacct ggttccacgc catccacgtg agcggcacca atggcacaaa gcggttcgac 240

aatcccgtgc tgccttttaa cgatggcgtg tacttcgcct ctaccgagaa gagcaacatc 300

atcagaggct ggatctttgg caccacactg gactccaaga cacagtctct gctgatcgtg 360

aacaatgcca ccaacgtggt catcaaggtg tgcgagttcc agttttgtaa tgatcccttc 420

ctgggcgtgt actatcacaa gaacaataag agctggatgg agtccgagtt tagagtgtat 480

tctagcgcca acaactgcac atttgagtac gtgagccagc ctttcctgat ggacctggag 540

ggcaagcagg gcaatttcaa gaacctgagg gagttcgtgt ttaagaatat cgacggctac 600

ttcaaaatct actctaagca cacccccatc aacctggtgc gcgacctgcc tcagggcttc 660

agcgccctgg agcccctggt ggatctgcct atcggcatca acatcacccg gtttcagaca 720

ctgctggccc tgcacagaag ctacctgaca cccggcgact cctctagcgg atggaccgcc 780

ggcgctgccg cctactatgt gggctacctc cagccccgga ccttcctgct gaagtacaac 840

gagaatggca ccatcacaga cgcagtggat tgcgccctgg accccctgag cgagacaaag 900

tgtacactga agtcctttac cgtggagaag ggcatctatc agacatccaa tttcagggtg 960

cagccaaccg agtctatcgt gcgctttcct aatatcacaa acctgtgccc atttggcgag 1020

gtgttcaacg caacccgctt cgccagcgtg tacgcctgga ataggaagcg gatcagcaac 1080

tgcgtggccg actatagcgt gctgtacaac tccgcctctt tcagcacctt taagtgctat 1140

ggcgtgtccc ccacaaagct gaatgacctg tgctttacca acgtctacgc cgattctttc 1200

gtgatcaggg gcgacgaggt gcgccagatc gcccccggcc agacaggcaa gatcgcagac 1260

tacaattata agctgccaga cgatttcacc ggctgcgtga tcgcctggaa cagcaacaat 1320

ctggattcca aagtgggcgg caactacaat tatctgtacc ggctgtttag aaagagcaat 1380

ctgaagccct tcgagaggga catctctaca gaaatctacc aggccggcag caccccttgc 1440

aatggcgtgg agggctttaa ctgttatttc ccactccagt cctacggctt ccagcccaca 1500

aacggcgtgg gctatcagcc ttaccgcgtg gtggtgctga gctttgagct gctgcacgcc 1560

ccagcaacag tgtgcggccc caagaagtcc accaatctgg tgaagaacaa gtgcgtgaac 1620

ttcaacttca acggcctgac cggcacaggc gtgctgaccg agtccaacaa gaagttcctg 1680

ccatttcagc agttcggcag ggacatcgca gataccacag acgccgtgcg cgacccacag 1740

accctggaga tcctggacat cacaccctgc tctttcggcg gcgtgagcgt gatcacaccc 1800

ggcaccaata caagcaacca ggtggccgtg ctgtatcagg acgtgaattg taccgaggtg 1860

cccgtggcta tccacgccga tcagctgacc ccaacatggc gggtgtacag caccggctcc 1920

aacgtcttcc agacaagagc cggatgcctg atcggagcag agcacgtgaa caattcctat 1980

gagtgcgaca tcccaatcgg cgccggcatc tgtgcctctt accagaccca gacaaactct 2040

cccagaagag cccggagcgt ggcctcccag tctatcatcg cctataccat gtccctgggc 2100

gccgagaaca gcgtggccta ctctaacaat agcatcgcca tcccaaccaa cttcacaatc 2160

tctgtgacca cagagatcct gcccgtgtcc atgaccaaga catctgtgga ctgcacaatg 2220

tatatctgtg gcgattctac cgagtgcagc aacctgctgc tccagtacgg cagcttttgt 2280

acccagctga atagagccct gacaggcatc gccgtggagc aggataagaa cacacaggag 2340

gtgttcgccc aggtgaagca aatctacaag acccccccta tcaaggactt tggcggcttc 2400

aatttttccc agatcctgcc tgatccatcc aagccttcta agcggagctt tatcgaggac 2460

ctgctgttca acaaggtgac cctggccgat gccggcttca tcaagcagta tggcgattgc 2520

ctgggcgaca tcgcagccag ggacctgatc tgcgcccaga agtttaatgg cctgaccgtg 2580

ctgccacccc tgctgacaga tgagatgatc gcacagtaca caagcgccct gctggccggc 2640

accatcacat ccggatggac cttcggcgca ggagccgccc tccagatccc ctttgccatg 2700

cagatggcct ataggttcaa cggcatcggc gtgacccaga atgtgctgta cgagaaccag 2760

aagctgatcg ccaatcagtt taactccgcc atcggcaaga tccaggacag cctgtcctct 2820

acagccagcg ccctgggcaa gctccaggat gtggtgaatc agaacgccca ggccctgaat 2880

accctggtga agcagctgag cagcaacttc ggcgccatct ctagcgtgct gaatgacatc 2940

ctgagccggc tggacaaggt ggaggcagag gtgcagatcg accggctgat caccggccgg 3000

ctccagagcc tccagaccta tgtgacacag cagctgatca gggccgccga gatcagggcc 3060

agcgccaatc tggcagcaac caagatgtcc gagtgcgtgc tgggccagtc taagagagtg 3120

gacttttgtg gcaagggcta tcacctgatg tccttccctc agtctgcccc acacggcgtg 3180

gtgtttctgc acgtgaccta cgtgcccgcc caggagaaga acttcaccac agcccctgcc 3240

atctgccacg atggcaaggc ccactttcca agggagggcg tgttcgtgtc caacggcacc 3300

cactggtttg tgacacagcg caatttctac gagccccaga tcatcaccac agacaacacc 3360

ttcgtgagcg gcaactgtga cgtggtcatc ggcatcgtga acaataccgt gtatgatcca 3420

ctccagcccg agctggacag ctttaaggag gagctggata agtatttcaa gaatcacacc 3480

tcccctgacg tggatctggg cgacatcagc ggcatcaatg cctccgtggt gaacatccag 3540

aaggagatcg accgcctgaa cgaggtggct aagaatctga acgagagcct gatcgacctc 3600

caggagctgg gcaagtatga gcagtacatc aagtggccct ggtacatctg gctgggcttc 3660

atcgccggcc tgatcgccat cgtgatggtg accatcatgc tgtgctgtat gacatcctgc 3720

tgttcttgcc tgaagggctg ctgtagctgt ggctcctgct gtaagtttga cgaggatgac 3780

tctgaacctg tgctgaaggg cgtgaagctg cattacacc 3819

<210> 4

<211> 3819

<212> DNA

<213> Artificial Sequence

<220>

<223> Sequence (SBL) after S protein gene optimization

<400> 4

atgttcgttt tcctcgttct gctgcctctt gtcagctctc agtgtgtgaa cctgacaact 60

agaacacaac tacctcccgc ctacacaaac tctttcaccc ggggcgtgta ctacccagac 120

aaagtgttca ggagctctgt gttgcacagc acccaagacc tgtttttgcc attctttagt 180

aatgtgacct ggtttcacgc tatccatgtg tcgggcacca acgggaccaa aagattcgac 240

aaccccgttc tgccgttcaa cgacggcgtg tacttcgcta gcactgagaa gtccaacatt 300

attcgcgggt ggatcttcgg aactaccttg gactccaaaa cacagtctct actcatcgtg 360

aacaacgcga ctaacgtggt gattaaggtg tgtgaatttc agttctgcaa tgatccattt 420

ttaggagtgt actaccacaa aaataataaa tcatggatgg agtctgaatt tcgcgtatac 480

agtagcgcta ataactgtac attcgaatat gttagccaac cctttttgat ggacttagag 540

gggaagcagg gaaattttaa gaatttgcga gaatttgtgt tcaaaaatat cgatgggtat 600

ttcaagatct actccaagca tactcccata aatctggtgc gcgacttacc tcaagggttc 660

agcgcactgg agccactggt agacctgcca atcggcatca acatcacccg attccagacc 720

ctgcttgctc tgcaccgttc atatctgaca ccaggagatt cgtcttccgg atggacagca 780

ggggccgctg cttactatgt tggttatctt cagcctcgga cctttctgct caagtataat 840

gagaatggga ccattaccga cgctgttgat tgtgctctcg atcccctgtc agaaaccaag 900

tgcacactaa aatctttcac agtcgaaaag gggatctacc agacttctaa ctttcgtgta 960

cagcccaccg agagcatcgt caggttccca aatatcacta acctgtgtcc ttttggcgag 1020

gtgttcaacg ctacaagatt tgctagcgtg tacgcctgga acagaaaaag aatatcaaat 1080

tgcgtagccg attacagcgt cttatataac tctgcatcct tctcaacttt caagtgttat 1140

ggagtgagcc cgactaagct gaatgatttg tgctttacaa atgtttatgc cgattcattc 1200

gtgatccggg gcgacgaggt cagacagatc gcccctggcc aaacaggtaa gattgctgat 1260

tacaactaca aattacctga cgattttaca ggatgcgtta tcgcttggaa ctctaacaat 1320

ctcgattcta aggtcggcgg caattacaat tatctttatc gccttttcag gaagtcaaat 1380

cttaagccat tcgagcgaga catcagtacc gagatatacc aggcggggtc caccccgtgt 1440

aacggtgtcg agggtttcaa ctgctacttt ccactgcagt cctatgggtt ccagcccacc 1500

aatggcgtgg gttaccagcc ctaccgagta gtcgtattgt cttttgagct cttgcacgcc 1560

cccgccacgg tgtgcggtcc aaagaaatca actaacttag ttaagaataa atgtgtgaat 1620

tttaacttta acggcctgac agggacagga gtcctgacag aatccaataa gaagttcctt 1680

ccctttcagc agtttggacg cgacatcgca gacaccacag acgccgtgcg tgacccccaa 1740

actctcgaaa ttctcgatat cacaccctgc agttttggcg gggtcagtgt cattacccct 1800

gggaccaata ctagtaacca ggtcgcagtg ctttaccaag atgtcaactg taccgaggtt 1860

cctgtggcta ttcacgcaga ccaactgact ccgacttggc gggtgtatag tacaggctcc 1920

aatgtgtttc agacccgggc aggctgcctg attggggccg agcatgtaaa taactcctac 1980

gagtgcgata tccccatagg tgctggaata tgtgccagtt atcagaccca gacgaactcg 2040

ccaagacgag ctaggtccgt agcctctcag agcataatcg cgtacactat gagcctgggg 2100

gccgaaaatt ccgtggcata tagcaacaac agcattgcta ttcctactaa ctttacaatt 2160

tcagtcacga cggagatcct gccagtctcc atgactaaaa cctccgtgga ctgtacgatg 2220

tacatttgtg gcgattcaac tgaatgctct aacctgctct tacagtacgg ttctttttgt 2280

acccagctga accgggcatt gacgggcatc gcagttgagc aggacaagaa tactcaggag 2340

gtgtttgcgc aagtgaagca aatttataaa actcctccca ttaaggactt tggcggtttc 2400

aacttctcgc agatcctacc tgacccatca aaacctagca agaggtcttt cattgaagac 2460

cttctgttca acaaggtcac actggctgac gccggcttca ttaaacagta cggagattgt 2520

ctaggtgata ttgcagcgcg cgatctgatt tgcgcacaga agtttaacgg cctgacggtc 2580

ttaccccctc tccttaccga cgaaatgatt gcccagtaca ccagcgccct gctcgctggc 2640

acgattacta gcggatggac atttggggcc ggcgctgccc tccagatacc atttgccatg 2700

cagatggcgt ataggtttaa cggcatagga gtaacccaga acgtgctgta cgagaaccaa 2760

aaactgatag ccaatcaatt caatagtgcc ataggaaaga tacaggacag tctcagcagc 2820

accgcgtccg ctctcggaaa gctacaagat gtggtcaacc agaacgcgca ggcattgaat 2880

acactggtga agcagctctc ctcgaatttt ggagcaatca gcagcgtgct gaatgatatc 2940

ctgtctcggc tggacaaggt tgaagccgaa gtccagatcg acaggttaat caccggtcgg 3000

ctgcagagtc tccagacata tgttacccag caactcatca gagctgccga aatacgcgcc 3060

agtgccaatc ttgcagccac taagatgtcc gagtgcgtgt tggggcaaag taaaagggtt 3120

gatttctgtg gaaaaggata tcatcttatg agtttccctc aatccgcccc tcacggagtt 3180

gtcttcctgc atgtgaccta cgtgccagcg caggagaaga acttcacgac cgcccccgcc 3240

atctgccatg atggcaaggc ccattttccc cgcgaaggag tgttcgtatc caatggcacc 3300

cactggttcg tgacgcagag aaatttttat gagccgcaaa ttatcactac cgacaacaca 3360

ttcgtttccg gcaattgcga tgtcgtaatc gggatcgtga ataatacagt ctatgatcct 3420

cttcagccag aactcgattc attcaaagag gagctggata aatatttcaa gaaccacacc 3480

tcccccgatg tggatctggg tgacatatca ggaattaacg caagcgtcgt gaacattcag 3540

aaggaaatcg acaggctcaa tgaagtagca aagaacttga atgagtctct catcgacttg 3600

caggaactcg gcaaatatga gcagtacatt aaatggccgt ggtatatctg gctaggcttt 3660

atcgccggtc tgattgcaat tgtgatggtt actatcatgt tgtgctgcat gacaagttgc 3720

tgttcatgcc ttaaaggctg ctgctcctgc gggtcatgtt gtaaattcga tgaggacgac 3780

tctgagcccg tgctgaaagg ggtgaaactg cactacacg 3819

<210> 5

<211> 3819

<212> DNA

<213> Artificial Sequence

<220>

<223> S protein Gene optimization sequence 3 (STF)

<400> 5

atgttcgtgt tcctggtgct gctgcctctg gtgtccagcc agtgtgtgaa cctgaccacc 60

agaacacagc tgcctccagc ctacaccaac agctttacca gaggcgtgta ctaccccgac 120

aaggtgttca gatccagcgt gctgcactct acccaggacc tgttcctgcc tttcttcagc 180

aacgtgacct ggttccacgc catccacgtg tccggcacca atggcaccaa gagattcgac 240

aaccccgtgc tgcccttcaa cgacggggtg tactttgcca gcaccgagaa gtccaacatc 300

atcagaggct ggatcttcgg caccacactg gacagcaaga cccagagcct gctgatcgtg 360

aacaacgcca ccaacgtggt catcaaagtg tgcgagttcc agttctgcaa cgaccccttc 420

ctgggcgtct actaccacaa gaacaacaag agctggatgg aaagcgagtt ccgggtgtac 480

agcagcgcca acaactgcac cttcgagtac gtgtcccagc ctttcctgat ggacctggaa 540

ggcaagcagg gcaacttcaa gaacctgcgc gagttcgtgt ttaagaacat cgacggctac 600

ttcaagatct acagcaagca cacccctatc aacctcgtgc gggatctgcc tcagggcttc 660

tctgctctgg aacccctggt ggatctgccc atcggcatca acatcacccg gtttcagaca 720

ctgctggccc tgcacagaag ctacctgaca cctggcgata gcagcagcgg atggacagct 780

ggtgccgccg cttactatgt gggctacctg cagcctagaa ccttcctgct gaagtacaac 840

gagaacggca ccatcaccga cgccgtggat tgtgctctgg atcctctgag cgagacaaag 900

tgcaccctga agtccttcac cgtggaaaag ggcatctacc agaccagcaa cttccgggtg 960

cagcccaccg aatccatcgt gcggttcccc aatatcacca atctgtgccc cttcggcgag 1020

gtgttcaatg ccaccagatt cgcctctgtg tacgcctgga accggaagcg gatcagcaat 1080

tgcgtggccg actactccgt gctgtacaac tccgccagct tcagcacctt caagtgctac 1140

ggcgtgtccc ctaccaagct gaacgacctg tgcttcacaa acgtgtacgc cgacagcttc 1200

gtgatccggg gagatgaagt gcggcagatt gcccctggac agacaggcaa gatcgccgac 1260

tacaactaca agctgcccga cgacttcacc ggctgtgtga ttgcctggaa cagcaacaac 1320

ctggactcca aagtcggcgg caactacaat tacctgtacc ggctgttccg gaagtccaat 1380

ctgaagccct tcgagcggga catctccacc gagatctatc aggccggcag caccccttgt 1440

aacggcgtgg aaggcttcaa ctgctacttc ccactgcagt cctacggctt tcagcccaca 1500

aatggcgtgg gctatcagcc ctacagagtg gtggtgctga gcttcgaact gctgcatgcc 1560

cctgccacag tgtgcggccc taagaaaagc accaatctcg tgaagaacaa atgcgtgaac 1620

ttcaacttca acggcctgac cggcaccggc gtgctgacag agagcaacaa gaagttcctg 1680

ccattccagc agtttggccg ggatatcgcc gataccacag acgccgttag agatccccag 1740

acactggaaa tcctggacat caccccttgc agcttcggcg gagtgtctgt gatcacccct 1800

ggcaccaaca ccagcaatca ggtggcagtg ctgtaccagg acgtgaactg taccgaagtg 1860

cccgtggcca ttcacgccga tcagctgaca cctacatggc gggtgtactc caccggcagc 1920

aatgtgtttc agaccagagc cggctgtctg atcggagccg agcacgtgaa caatagctac 1980

gagtgcgaca tccccatcgg cgctggcatc tgtgccagct accagacaca gacaaacagc 2040

cccagacggg ccagatctgt ggccagccag agcatcattg cctacacaat gtctctgggc 2100

gccgagaaca gcgtggccta ctccaacaac tctatcgcta tccccaccaa cttcaccatc 2160

agcgtgacca cagagatcct gcctgtgtcc atgaccaaga ccagcgtgga ctgcaccatg 2220

tacatctgcg gcgattccac cgagtgctcc aacctgctgc tgcagtacgg cagcttctgc 2280

acccagctga atagagccct gacagggatc gccgtggaac aggacaagaa cacccaagag 2340

gtgttcgccc aagtgaagca gatctacaag acccctccta tcaaggactt cggcggcttc 2400

aatttcagcc agattctgcc cgatcctagc aagcccagca agcggagctt catcgaggac 2460

ctgctgttca acaaagtgac actggccgac gccggcttca tcaagcagta tggcgattgt 2520

ctgggcgaca ttgccgccag ggatctgatt tgcgcccaga agtttaacgg actgacagtg 2580

ctgcctcctc tgctgaccga tgagatgatc gcccagtaca catctgccct gctggccggc 2640

acaatcacaa gcggctggac atttggagct ggcgccgctc tgcagatccc ctttgctatg 2700

cagatggcct accggttcaa cggcatcgga gtgacccaga atgtgctgta cgagaaccag 2760

aagctgatcg ccaaccagtt caacagcgcc atcggcaaga tccaggacag cctgagcagc 2820

acagcaagcg ccctgggaaa gctgcaggac gtggtcaacc agaatgccca ggcactgaac 2880

accctggtca agcagctgtc ctccaacttc ggcgccatca gctctgtgct gaacgatatc 2940

ctgagcagac tggacaaggt ggaagccgag gtgcagatcg acagactgat caccggaagg 3000

ctgcagtccc tgcagaccta cgttacccag cagctgatca gagccgccga gattagagcc 3060

tctgccaatc tggccgccac caagatgtct gagtgtgtgc tgggccagag caagagagtg 3120

gacttttgcg gcaagggcta ccacctgatg agcttccctc agtctgcccc tcacggcgtg 3180

gtgtttctgc acgtgacata cgtgcccgct caagagaaga atttcaccac cgctccagcc 3240

atctgccacg acggcaaagc ccactttcct agagaaggcg tgttcgtgtc caacggcacc 3300

cattggttcg tgacccagcg gaacttctac gagccccaga tcatcaccac cgacaacacc 3360

ttcgtgtctg gcaactgcga cgtcgtgatc ggcattgtga acaataccgt gtacgaccct 3420

ctgcagcccg agctggacag cttcaaagag gaactggata agtactttaa gaaccacaca 3480

agccccgacg tggacctggg cgatatcagc ggaatcaatg ccagcgtcgt gaacatccag 3540

aaagagatcg accggctgaa cgaggtggcc aagaatctga acgagagcct gatcgacctg 3600

caagaactgg ggaagtacga gcagtacatc aagtggccct ggtacatctg gctgggcttt 3660

atcgccggac tgattgccat cgtgatggtc acaatcatgc tgtgttgcat gaccagctgc 3720

tgtagctgcc tgaagggctg ttgtagctgt ggcagctgct gcaagttcga cgaggacgat 3780

tctgagcccg tgctgaaggg cgtgaaactg cactacaca 3819

<210> 6

<211> 75

<212> PRT

<213> SARS-CoV-2

<400> 6

Met Tyr Ser Phe Val Ser Glu Glu Thr Gly Thr Leu Ile Val Asn Ser

1 5 10 15

Val Leu Leu Phe Leu Ala Phe Val Val Phe Leu Leu Val Thr Leu Ala

20 25 30

Ile Leu Thr Ala Leu Arg Leu Cys Ala Tyr Cys Cys Asn Ile Val Asn

35 40 45

Val Ser Leu Val Lys Pro Ser Phe Tyr Val Tyr Ser Arg Val Lys Asn

50 55 60

Leu Asn Ser Ser Arg Val Pro Asp Leu Leu Val

65 70 75

<210> 7

<211> 228

<212> DNA

<213> SARS-CoV-2

<400> 7

atgtactcat tcgtttcgga agagacaggt acgttaatag ttaatagcgt acttcttttt 60

cttgctttcg tggtattctt gctagttaca ctagccatcc ttactgcgct tcgattgtgt 120

gcgtactgct gcaatattgt taacgtgagt cttgtaaaac cttcttttta cgtttactct 180

cgtgttaaaa atctgaattc ttctagagtt cctgatcttc tggtctaa 228

<210> 8

<211> 225

<212> DNA

<213> Artificial Sequence

<220>

<223> E protein gene-optimized sequence (EBL)

<400> 8

atgtacagct ttgtctcaga ggaaaccggc acgctgattg taaacagcgt gttactattc 60

ctcgccttcg ttgtgtttct ccttgttaca ctggcaatac tgactgccct gcggttgtgc 120

gcttactgct gtaatatcgt gaacgtgtct ttggtgaagc ccagtttcta tgtatattcc 180

agagtcaaaa atctcaactc ctctagggtg cctgacctgc ttgtc 225

<210> 9

<211> 222

<212> PRT

<213> SARS-CoV-2

<400> 9

Met Ala Asp Ser Asn Gly Thr Ile Thr Val Glu Glu Leu Lys Lys Leu

1 5 10 15

Leu Glu Gln Trp Asn Leu Val Ile Gly Phe Leu Phe Leu Thr Trp Ile

20 25 30

Cys Leu Leu Gln Phe Ala Tyr Ala Asn Arg Asn Arg Phe Leu Tyr Ile

35 40 45

Ile Lys Leu Ile Phe Leu Trp Leu Leu Trp Pro Val Thr Leu Ala Cys

50 55 60

Phe Val Leu Ala Ala Val Tyr Arg Ile Asn Trp Ile Thr Gly Gly Ile

65 70 75 80

Ala Ile Ala Met Ala Cys Leu Val Gly Leu Met Trp Leu Ser Tyr Phe

85 90 95

Ile Ala Ser Phe Arg Leu Phe Ala Arg Thr Arg Ser Met Trp Ser Phe

100 105 110

Asn Pro Glu Thr Asn Ile Leu Leu Asn Val Pro Leu His Gly Thr Ile

115 120 125

Leu Thr Arg Pro Leu Leu Glu Ser Glu Leu Val Ile Gly Ala Val Ile

130 135 140

Leu Arg Gly His Leu Arg Ile Ala Gly His His Leu Gly Arg Cys Asp

145 150 155 160

Ile Lys Asp Leu Pro Lys Glu Ile Thr Val Ala Thr Ser Arg Thr Leu

165 170 175

Ser Tyr Tyr Lys Leu Gly Ala Ser Gln Arg Val Ala Gly Asp Ser Gly

180 185 190

Phe Ala Ala Tyr Ser Arg Tyr Arg Ile Gly Asn Tyr Lys Leu Asn Thr

195 200 205

Asp His Ser Ser Ser Ser Asp Asn Ile Ala Leu Leu Val Gln

210 215 220

<210> 10

<211> 669

<212> DNA

<213> SARS-CoV-2

<400> 10

atggcagatt ccaacggtac tattaccgtt gaagagctta aaaagctcct tgaacaatgg 60

aacctagtaa taggtttcct attccttaca tggatttgtc ttctacaatt tgcctatgcc 120

aacaggaata ggtttttgta tataattaag ttaattttcc tctggctgtt atggccagta 180

actttagctt gttttgtgct tgctgctgtt tacagaataa attggatcac cggtggaatt 240

gctatcgcaa tggcttgtct tgtaggcttg atgtggctca gctacttcat tgcttctttc 300

agactgtttg cgcgtacgcg ttccatgtgg tcattcaatc cagaaactaa cattcttctc 360

aacgtgccac tccatggcac tattctgacc agaccgcttc tagaaagtga actcgtaatc 420

ggagctgtga tccttcgtgg acatcttcgt attgctggac accatctagg acgctgtgac 480

atcaaggacc tgcctaaaga aatcactgtt gctacatcac gaacgctttc ttattacaaa 540

ttgggagctt cgcagcgtgt agcaggtgac tcaggttttg ctgcatacag tcgctacagg 600

attggcaact ataaattaaa cacagaccat tccagtagca gtgacaatat tgctttgctt 660

gtacagtaa 669

<210> 11

<211> 669

<212> DNA

<213> Artificial Sequence

<220>

<223> M protein gene-optimized sequence (MBL)

<400> 11

atggcagatt ccaacggtac aattaccgtc gaagagctga aaaagctcct tgagcagtgg 60

aacctggtca tagggttcct attcctgaca tggatttgcc tgctgcaatt tgcctatgcc 120

aacaggaata ggtttttgta tataatcaag ctgattttcc tctggctgtt atggccagtg 180

accctggcct gttttgtgct tgccgctgtt tacagaataa attggatcac cggcggaatc 240

gccatcgcaa tggcttgcct tgtaggcttg atgtggctca gctacttcat tgcttctttc 300

cggctgtttg cgcgaacgcg gtccatgtgg tctttcaatc cggagactaa catactcctc 360

aatgtgcccc tccatggcac tattctgacc agacccctgc tagagagtga actcgtcatc 420

ggagctgtga tcctgcgggg gcacctgaga atcgccggac accacttagg ccgctgtgac 480

atcaaggatc tgcctaaaga aatcactgtt gccacatcac gaaccctttc ttattacaag 540

ttgggggcct cgcagcgtgt ggcaggagac tcaggttttg cggcatacag tcgctacagg 600

attggcaact ataaattaaa cacagaccat tccagcagca gcgataatat tgctttgctt 660

gtgcagtga 669

<210> 12

<211> 419

<212> PRT

<213> SARS-CoV-2

<400> 12

Met Ser Asp Asn Gly Pro Gln Asn Gln Arg Asn Ala Pro Arg Ile Thr

1 5 10 15

Phe Gly Gly Pro Ser Asp Ser Thr Gly Ser Asn Gln Asn Gly Glu Arg

20 25 30

Ser Gly Ala Arg Ser Lys Gln Arg Arg Pro Gln Gly Leu Pro Asn Asn

35 40 45

Thr Ala Ser Trp Phe Thr Ala Leu Thr Gln His Gly Lys Glu Asp Leu

50 55 60

Lys Phe Pro Arg Gly Gln Gly Val Pro Ile Asn Thr Asn Ser Ser Pro

65 70 75 80

Asp Asp Gln Ile Gly Tyr Tyr Arg Arg Ala Thr Arg Arg Ile Arg Gly

85 90 95

Gly Asp Gly Lys Met Lys Asp Leu Ser Pro Arg Trp Tyr Phe Tyr Tyr

100 105 110

Leu Gly Thr Gly Pro Glu Ala Gly Leu Pro Tyr Gly Ala Asn Lys Asp

115 120 125

Gly Ile Ile Trp Val Ala Thr Glu Gly Ala Leu Asn Thr Pro Lys Asp

130 135 140

His Ile Gly Thr Arg Asn Pro Ala Asn Asn Ala Ala Ile Val Leu Gln

145 150 155 160

Leu Pro Gln Gly Thr Thr Leu Pro Lys Gly Phe Tyr Ala Glu Gly Ser

165 170 175

Arg Gly Gly Ser Gln Ala Ser Ser Arg Ser Ser Ser Arg Ser Arg Asn

180 185 190

Ser Ser Arg Asn Ser Thr Pro Gly Ser Ser Arg Gly Thr Ser Pro Ala

195 200 205

Arg Met Ala Gly Asn Gly Gly Asp Ala Ala Leu Ala Leu Leu Leu Leu

210 215 220

Asp Arg Leu Asn Gln Leu Glu Ser Lys Met Ser Gly Lys Gly Gln Gln

225 230 235 240

Gln Gln Gly Gln Thr Val Thr Lys Lys Ser Ala Ala Glu Ala Ser Lys

245 250 255

Lys Pro Arg Gln Lys Arg Thr Ala Thr Lys Ala Tyr Asn Val Thr Gln

260 265 270

Ala Phe Gly Arg Arg Gly Pro Glu Gln Thr Gln Gly Asn Phe Gly Asp

275 280 285

Gln Glu Leu Ile Arg Gln Gly Thr Asp Tyr Lys His Trp Pro Gln Ile

290 295 300

Ala Gln Phe Ala Pro Ser Ala Ser Ala Phe Phe Gly Met Ser Arg Ile

305 310 315 320

Gly Met Glu Val Thr Pro Ser Gly Thr Trp Leu Thr Tyr Thr Gly Ala

325 330 335

Ile Lys Leu Asp Asp Lys Asp Pro Asn Phe Lys Asp Gln Val Ile Leu

340 345 350

Leu Asn Lys His Ile Asp Ala Tyr Lys Thr Phe Pro Pro Thr Glu Pro

355 360 365

Lys Lys Asp Lys Lys Lys Lys Ala Asp Glu Thr Gln Ala Leu Pro Gln

370 375 380

Arg Gln Lys Lys Gln Gln Thr Val Thr Leu Leu Pro Ala Ala Asp Leu

385 390 395 400

Asp Asp Phe Ser Lys Gln Leu Gln Gln Ser Met Ser Ser Ala Asp Ser

405 410 415

Thr Gln Ala

<210> 13

<211> 1257

<212> DNA

<213> Artificial Sequence

<220>

<223> N protein gene-optimized sequence (NBL)

<400> 13

atgtcagata acggaccgca gaaccaaagg aacgcccctc ggatcacttt cgggggtcct 60

agcgacagca ctgggtctaa ccaaaatgga gaacgttccg gcgcaagatc caaacagagg 120

aggcctcagg ggcttcctaa caatacagcc tcctggttca cagctctcac acagcatggc 180

aaggaagacc tgaagtttcc tagaggccag ggggttccca tcaatactaa ctcctcccca 240

gacgatcaga ttggttatta tcggcgggct accaggcgga tccggggcgg agacggtaag 300

atgaaggacc tctctccccg ttggtacttt tactacctcg gtacaggccc cgaggctggg 360

cttccgtatg gcgccaataa ggatggaata atttgggtgg ctacggaagg ggccctcaac 420

acaccgaagg atcacattgg cacccgtaat cccgcgaata atgccgccat tgtcctgcag 480

ttgccccagg ggacgacgtt gcccaaaggc ttttacgcag aaggatcgcg cggaggatcc 540

caagcctcca gccgatcaag ctctcgatct cggaactcaa gtcgcaatag cacaccaggg 600

tcttctcgcg ggaccagccc tgcaaggatg gccggaaacg gcggtgatgc tgctttagcg 660

ctgctgctgc tggatagact gaaccaatta gagagtaaaa tgtcaggtaa aggccagcaa 720

cagcaggggc agacagtgac caaaaaaagt gcggccgagg ccagcaagaa accccgccag 780

aaacgaacag ccactaaagc ctacaacgta acccaagcat tcggaaggag aggaccagag 840

cagacccaag gcaattttgg cgatcaagag ctgatccgcc aggggacgga ctataagcat 900

tggccacaga tcgcccagtt cgcacccagt gcttcagcct tcttcggaat gtcgagaatc 960

ggtatggagg tcactccttc tggcacttgg ctgacttata ccggcgcaat aaagctagac 1020

gacaaagacc ctaactttaa ggatcaggtg atcctgctaa ataaacacat tgatgcgtac 1080

aaaacattcc caccaactga gccaaagaag gacaagaaga agaaggcaga tgaaacccag 1140

gctttgcccc agagacagaa aaagcagcag accgtgacct tgctgccagc agccgacctc 1200

gacgattttt caaagcaact tcagcagtcc atgagtagcg ctgacagcac ccaggct 1257

<210> 14

<211> 1257

<212> DNA

<213> SARS-CoV-2

<400> 14

atgtctgata atggacccca aaatcagcga aatgcacccc gcattacgtt tggtggaccc 60

tcagattcaa ctggcagtaa ccagaatgga gaacgcagtg gggcgcgatc aaaacaacgt 120

cggccccaag gtttacccaa taatactgcg tcttggttca ccgctctcac tcaacatggc 180

aaggaagacc ttaaattccc tcgaggacaa ggcgttccaa ttaacaccaa tagcagtcca 240

gatgaccaaa ttggctacta ccgaagagct accagacgaa ttcgtggtgg tgacggtaaa 300

atgaaagatc tcagtccaag atggtatttc tactacctag gaactgggcc agaagctgga 360

cttccctatg gtgctaacaa agacggcatc atatgggttg caactgaggg agccttgaat 420

acaccaaaag atcacattgg cacccgcaat cctgctaaca atgctgcaat cgtgctacaa 480

cttcctcaag gaacaacatt gccaaaaggc ttctacgcag aagggagcag aggcggcagt 540

caagcctctt ctcgttcctc atcacgtagt cgcaacagtt caagaaattc aactccaggc 600

agcagtaggg gaacttctcc tgctagaatg gctggcaatg gcggtgatgc tgctcttgct 660

ttgctgctgc ttgacagatt gaaccagctt gagagcaaaa tgtctggtaa aggccaacaa 720

caacaaggcc aaactgtcac taagaaatct gctgctgagg cttctaagaa gcctcggcaa 780

aaacgtactg ccactaaagc atacaatgta acacaagctt tcggcagacg tggtccagaa 840

caaacccaag gaaattttgg ggaccaggaa ctaatcagac aaggaactga ttacaaacat 900

tggccgcaaa ttgcacaatt tgcccccagc gcttcagcgt tcttcggaat gtcgcgcatt 960

ggcatggaag tcacaccttc gggaacgtgg ttgacctaca caggtgccat caaattggat 1020

gacaaagatc caaatttcaa agatcaagtc attttgctga ataagcatat tgacgcatac 1080

aaaacattcc caccaacaga gcctaaaaag gacaaaaaga agaaggctga tgaaactcaa 1140

gccttaccgc agagacagaa gaaacagcaa actgtgactc ttcttcctgc tgcagatttg 1200

gatgatttct ccaaacaatt gcaacaatcc atgagcagtg ctgactcaac tcaggcc 1257

<210> 15

<211> 46

<212> DNA

<213> Artificial Sequence

<220>

<223> 5'UTR

<400> 15

ggaaataaga gagaaaagaa gagtaagaag aaatataaga gccacc 46

<210> 16

<211> 110

<212> DNA

<213> Artificial Sequence

<220>

<223> 3'UTR-1

<400> 16

gctggagcct cggtggccat gcttcttgcc ccttgggcct ccccccagcc cctcctcccc 60

ttcctgcacc cgtacccccg tggtctttga ataaagtctg agtgggcggc 110

<210> 17

<211> 109

<212> DNA

<213> Artificial Sequence

<220>

<223> 3'UTR-2

<400> 17

gcggccgctt aattaagctg ccttctgcgg ggcttgcctt ctggccatgc ccttcttctc 60

tcccttgcac ctgtacctct tggtctttga ataaagcctg agtaggaag 109

<210> 18

<211> 3819

<212> RNA

<213> Artificial Sequence

<220>

<223> S protein gene-optimized mRNA sequence 1 (S-GS mRNA)

<400> 18

uacaagcaga aggaccagga cgacggagac cagaggagug ucacgcaguu agacuguuga 60

gccugagucg acgguggacg aauaugauua ucgaaguggu cuccgcacau gauaggacug 120

uuccacaaau cuucgaggca cgacgugaga uguguccuag acaaagacgg uaagaaaucg 180

uugcacugga ccaaggugcg guaggugcac ucgccguggu uaccguguuu cgccaagcug 240

uuagggcacg acggaaaauu gcuaccgcac augaagcgga gauggcucuu cucguuguag 300

uagucuccga ccuagaaacc guggugugac cugagguucu gugucagaga cgacuagcac 360

uuguuacggu gguugcacca guaguuccac acgcucaagg ucaaaacauu acuagggaag 420

gacccgcaca ugauaguguu cuuguuauuc ucgaccuacc ucaggcucaa aucucacaua 480

agaucgcggu uguugacgug uaaacucaug cacucggucg gaaaggacua ccuggaccuc 540

ccguucgucc cguuaaaguu cuuggacucc cucaagcaca aauucuuaua gcugccgaug 600

aaguuuuaga ugagauucgu guggggguag uuggaccacg cgcuggacgg agucccgaag 660

ucgcgggacc ucggggacca ccuagacgga uagccguagu uguagugggc caaagucugu 720

gacgaccggg acgugucuuc gauggacugu gggccgcuga ggagaucgcc uaccuggcgg 780

ccgcgacggc ggaugauaca cccgauggag gucggggccu ggaaggacga cuucauguug 840

cucuuaccgu gguagugucu gcgucaccua acgcgggacc ugggggacuc gcucuguuuc 900

acaugugacu ucaggaaaug gcaccucuuc ccguagauag ucuguagguu aaagucccac 960

gucgguuggc ucagauagca cgcgaaagga uuauaguguu uggacacggg uaaaccgcuc 1020

cacaaguugc guugggcgaa gcggucgcac augcggaccu uauccuucgc cuagucguug 1080

acgcaccggc ugauaucgca cgacauguug aggcggagaa agucguggaa auucacgaua 1140

ccgcacaggg gguguuucga cuuacuggac acgaaauggu ugcagaugcg gcuaagaaag 1200

cacuaguccc cgcugcucca cgcggucuag cgggggccgg ucuguccguu cuagcgucug 1260

auguuaauau ucgacggucu gcuaaagugg ccgacgcacu agcggaccuu gucguuguua 1320

gaccuaaggu uucacccgcc guugauguua auagacaugg ccgacaaauc uuucucguua 1380

gacuucggga agcucucccu guagagaugu cuuuagaugg uccggccguc guggggaacg 1440

uuaccgcacc ucccgaaauu gacaauaaag ggugagguca ggaugccgaa ggucgggugu 1500

uugccgcacc cgauagucgg aauggcgcac caccacgacu cgaaacucga cgacgugcgg 1560

ggucguuguc acacgccggg guucuucagg ugguuagacc acuucuuguu cacgcacuug 1620

aaguugaagu ugccggacug gccguguccg cacgacuggc ucagguuguu cuucaaggac 1680

gguaaagucg ucaagccguc ccuguagcgu cuaugguguc ugcggcacgc gcuggguguc 1740

ugggaccucu aggaccugua gugugggacg agaaagccgc cgcacucgca cuaguguggg 1800

ccgugguuau guucguuggu ccaccggcac gacauagucc ugcacuuaac auggcuccac 1860

gggcaccgau aggugcggcu agucgacugg gguuguaccg cccacauguc guggccgagg 1920

uugcagaagg ucuguucucg gccuacggac uagccucguc ucgugcacuu guuaaggaua 1980

cucacgcugu aggguuagcc gcggccguag acacggagaa uggucugggu cuguuugaga 2040

gggucuucuc gggccucgca ccggaggguc agauaguagc ggauauggua cagggacccg 2100

cggcucuugu cgcaccggau gagauuguua ucguagcggu aggguugguu gaaguguuag 2160

agacacuggu gucucuagga cgggcacagg uacugguucu guagacaccu gacguguuac 2220

auauagacac cgcuaagaug gcucacgucg uuggacgacg aggucaugcc gucgaaaaca 2280

ugggucgacu uaucucggga cuguccguag cggcaccucg uccuauucuu guguguccuc 2340

cacaagcggg uccacuucgu uuagauguuc ugggggggau aguuccugaa accgccgaag 2400

uuaaaaaggg ucuaggacgg acuagguagg uucggaagau ucgccucgaa auagcuccug 2460

gacgacaagu uguuccacug ggaccggcua cggccgaagu aguucgucau accgcuaacg 2520

gacccgcugu agcgucgguc ccuggacuag acgcgggucu ucaaauuacc ggacuggcac 2580

gacggugggg acgacugucu acucuacuag cgugucaugu guucgcggga cgaccggccg 2640

ugguagugua ggccuaccug gaagccgcgu ccucggcggg aggucuaggg gaaacgguac 2700

gucuaccgga uauccaaguu gccguagccg cacugggucu uacacgacau gcucuugguc 2760

uucgacuagc gguuagucaa auugaggcgg uagccguucu agguccuguc ggacaggaga 2820

ugucggucgc gggacccguu cgagguccua caccacuuag ucuugcgggu ccgggacuua 2880

ugggaccacu ucgucgacuc gucguugaag ccgcgguaga gaucgcacga cuuacuguag 2940

gacucggccg accuguucca ccuccgucuc cacgucuagc uggccgacua guggccggcc 3000

gaggucucgg aggucuggau acacuguguc gucgacuagu cccggcggcu cuagucccgg 3060

ucgcgguuag accgucguug guucuacagg cucacgcacg acccggucag auucucucac 3120

cugaaaacac cguucccgau aguggacuac aggaagggag ucagacgggg ugugccgcac 3180

cacaaagacg ugcacuggau gcacgggcgg guccucuucu ugaaguggug ucggggacgg 3240

uagacggugc uaccguuccg ggugaaaggu ucccucccgc acaagcacag guugccgugg 3300

gugaccaaac acugugucgc guuaaagaug cucggggucu aguaguggug ucuguugugg 3360

aagcacucgc cguugacacu gcaccaguag ccguagcacu uguuauggca cauacuaggu 3420

gaggucgggc ucgaccuguc gaaauuccuc cucgaccuau ucauaaaguu cuuagugugg 3480

aggggacugc accuagaccc gcuguagucg ccguaguuac ggaggcacca cuuguagguc 3540

uuccucuagc uggcggacuu gcuccaccga uucuuagacu ugcucucgga cuagcuggag 3600

guccucgacc cguucauacu cgucauguag uucaccggga ccauguagac cgacccgaag 3660

uagcggccgg acuagcggua gcacuaccac ugguaguacg acacgacaua cuguaggacg 3720

acaagaacgg acuucccgac gacaucgaca ccgaggacga cauucaaacu gcuccuacug 3780

agacuuggac acgacuuccc gcacuucgac guaaugugg 3819

<210> 19

<211> 3819

<212> RNA

<213> Artificial Sequence

<220>

<223> S protein gene-optimized mRNA sequence (SBL mRNA)

<400> 19

uacaagcaaa aggagcaaga cgacggagaa cagucgagag ucacacacuu ggacuguuga 60

ucuuguguug auggagggcg gauguguuug agaaaguggg ccccgcacau gaugggucug 120

uuucacaagu ccucgagaca caacgugucg uggguucugg acaaaaacgg uaagaaauca 180

uuacacugga ccaaagugcg auagguacac agcccguggu ugcccugguu uucuaagcug 240

uuggggcaag acggcaaguu gcugccgcac augaagcgau cgugacucuu cagguuguaa 300

uaagcgccca ccuagaagcc uugauggaac cugagguuuu gugucagaga ugaguagcac 360

uuguugcgcu gauugcacca cuaauuccac acacuuaaag ucaagacguu acuagguaaa 420

aauccucaca ugaugguguu uuuauuauuu aguaccuacc ucagacuuaa agcgcauaug 480

ucaucgcgau uauugacaug uaagcuuaua caaucgguug ggaaaaacua ccugaaucuc 540

cccuucgucc cuuuaaaauu cuuaaacgcu cuuaaacaca aguuuuuaua gcuacccaua 600

aaguucuaga ugagguucgu augaggguau uuagaccacg cgcugaaugg aguucccaag 660

ucgcgugacc ucggugacca ucuggacggu uagccguagu uguagugggc uaaggucugg 720

gacgaacgag acguggcaag uauagacugu gguccucuaa gcagaaggcc uaccugucgu 780

ccccggcgac gaaugauaca accaauagaa gucggagccu ggaaagacga guucauauua 840

cucuuacccu gguaauggcu gcgacaacua acacgagagc uaggggacag ucuuugguuc 900

acgugugauu uuagaaagug ucagcuuuuc cccuagaugg ucugaagauu gaaagcacau 960

gucggguggc ucucguagca guccaagggu uuauagugau uggacacagg aaaaccgcuc 1020

cacaaguugc gauguucuaa acgaucgcac augcggaccu ugucuuuuuc uuauaguuua 1080

acgcaucggc uaaugucgca gaauauauug agacguagga agaguugaaa guucacaaua 1140

ccucacucgg gcugauucga cuuacuaaac acgaaauguu uacaaauacg gcuaaguaag 1200

cacuaggccc cgcugcucca gucugucuag cggggaccgg uuuguccauu cuaacgacua 1260

auguugaugu uuaauggacu gcuaaaaugu ccuacgcaau agcgaaccuu gagauuguua 1320

gagcuaagau uccagccgcc guuaauguua auagaaauag cggaaaaguc cuucaguuua 1380

gaauucggua agcucgcucu guagucaugg cucuauaugg uccgccccag guggggcaca 1440

uugccacagc ucccaaaguu gacgaugaaa ggugacguca ggauacccaa ggucgggugg 1500

uuaccgcacc caauggucgg gauggcucau cagcauaaca gaaaacucga gaacgugcgg 1560

gggcggugcc acacgccagg uuucuuuagu ugauugaauc aauucuuauu uacacacuua 1620

aaauugaaau ugccggacug ucccuguccu caggacuguc uuagguuauu cuucaaggaa 1680

gggaaagucg ucaaaccugc gcuguagcgu cugugguguc ugcggcacgc acuggggguu 1740

ugagagcuuu aagagcuaua gugugggacg ucaaaaccgc cccagucaca guaaugggga 1800

cccugguuau gaucauuggu ccagcgucac gaaaugguuc uacaguugac auggcuccaa 1860

ggacaccgau aagugcgucu gguugacuga ggcugaaccg cccacauauc auguccgagg 1920

uuacacaaag ucugggcccg uccgacggac uaaccccggc ucguacauuu auugaggaug 1980

cucacgcuau agggguaucc acgaccuuau acacggucaa uagucugggu cugcuugagc 2040

gguucugcuc gauccaggca ucggagaguc ucguauuagc gcaugugaua cucggacccc 2100

cggcuuuuaa ggcaccguau aucguuguug ucguaacgau aaggaugauu gaaauguuaa 2160

agucagugcu gccucuagga cggucagagg uacugauuuu ggaggcaccu gacaugcuac 2220

auguaaacac cgcuaaguug acuuacgaga uuggacgaga augucaugcc aagaaaaaca 2280

ugggucgacu uggcccguaa cugcccguag cgucaacucg uccuguucuu augaguccuc 2340

cacaaacgcg uucacuucgu uuaaauauuu ugaggagggu aauuccugaa accgccaaag 2400

uugaagagcg ucuaggaugg acuggguagu uuuggaucgu ucuccagaaa guaacuucug 2460

gaagacaagu uguuccagug ugaccgacug cggccgaagu aauuugucau gccucuaaca 2520

gauccacuau aacgucgcgc gcuagacuaa acgcgugucu ucaaauugcc ggacugccag 2580

aaugggggag aggaauggcu gcuuuacuaa cgggucaugu ggucgcggga cgagcgaccg 2640

ugcuaaugau cgccuaccug uaaaccccgg ccgcgacggg aggucuaugg uaaacgguac 2700

gucuaccgca uauccaaauu gccguauccu cauugggucu ugcacgacau gcucuugguu 2760

uuugacuauc gguuaguuaa guuaucacgg uauccuuucu auguccuguc agagucgucg 2820

uggcgcaggc gagagccuuu cgauguucua caccaguugg ucuugcgcgu ccguaacuua 2880

ugugaccacu ucgucgagag gagcuuaaaa ccucguuagu cgucgcacga cuuacuauag 2940

gacagagccg accuguucca acuucggcuu caggucuagc uguccaauua guggccagcc 3000

gacgucucag aggucuguau acaauggguc guugaguagu cucgacggcu uuaugcgcgg 3060

ucacgguuag aacgucggug auucuacagg cucacgcaca accccguuuc auuuucccaa 3120

cuaaagacac cuuuuccuau aguagaauac ucaaagggag uuaggcgggg agugccucaa 3180

cagaaggacg uacacuggau gcacggucgc guccucuucu ugaagugcug gcgggggcgg 3240

uagacgguac uaccguuccg gguaaaaggg gcgcuuccuc acaagcauag guuaccgugg 3300

gugaccaagc acugcgucuc uuuaaaaaua cucggcguuu aauagugaug gcuguugugu 3360

aagcaaaggc cguuaacgcu acagcauuag cccuagcacu uauuauguca gauacuagga 3420

gaagucgguc uugagcuaag uaaguuucuc cucgaccuau uuauaaaguu cuuggugugg 3480

agggggcuac accuagaccc acuguauagu ccuuaauugc guucgcagca cuuguaaguc 3540

uuccuuuagc uguccgaguu acuucaucgu uucuugaacu uacucagaga guagcugaac 3600

guccuugagc cguuuauacu cgucauguaa uuuaccggca ccauauagac cgauccgaaa 3660

uagcggccag acuaacguua acacuaccaa ugauaguaca acacgacgua cuguucaacg 3720

acaaguacgg aauuuccgac gacgaggacg cccaguacaa cauuuaagcu acuccugcug 3780

agacucgggc acgacuuucc ccacuuugac gugaugugc 3819

<210> 20

<211> 3819

<212> RNA

<213> Artificial Sequence

<220>

<223> S protein gene-optimized mRNA sequence 3 (STF mRNA)

<400> 20

uacaagcaca aggaccacga cgacggagac cacaggucgg ucacacacuu ggacuggugg 60

ucuugugucg acggaggucg gaugugguug ucgaaauggu cuccgcacau gauggggcug 120

uuccacaagu cuaggucgca cgacgugaga uggguccugg acaaggacgg aaagaagucg 180

uugcacugga ccaaggugcg guaggugcac aggccguggu uaccgugguu cucuaagcug 240

uuggggcacg acgggaaguu gcugccccac augaaacggu cguggcucuu cagguuguag 300

uagucuccga ccuagaagcc guggugugac cugucguucu gggucucgga cgacuagcac 360

uuguugcggu gguugcacca guaguuucac acgcucaagg ucaagacguu gcuggggaag 420

gacccgcaga ugaugguguu cuuguuguuc ucgaccuacc uuucgcucaa ggcccacaug 480

ucgucgcggu uguugacgug gaagcucaug cacagggucg gaaaggacua ccuggaccuu 540

ccguucgucc cguugaaguu cuuggacgcg cucaagcaca aauucuugua gcugccgaug 600

aaguucuaga ugucguucgu guggggauag uuggagcacg cccuagacgg agucccgaag 660

agacgagacc uuggggacca ccuagacggg uagccguagu uguagugggc caaagucugu 720

gacgaccggg acgugucuuc gauggacugu ggaccgcuau cgucgucgcc uaccugucga 780

ccacggcggc gaaugauaca cccgauggac gucggaucuu ggaaggacga cuucauguug 840

cucuugccgu gguaguggcu gcggcaccua acacgagacc uaggagacuc gcucuguuuc 900

acgugggacu ucaggaagug gcaccuuuuc ccguagaugg ucuggucguu gaaggcccac 960

gucggguggc uuagguagca cgccaagggg uuauaguggu uagacacggg gaagccgcuc 1020

cacaaguuac gguggucuaa gcggagacac augcggaccu uggccuucgc cuagucguua 1080

acgcaccggc ugaugaggca cgacauguug aggcggucga agucguggaa guucacgaug 1140

ccgcacaggg gaugguucga cuugcuggac acgaaguguu ugcacaugcg gcugucgaag 1200

cacuaggccc cucuacuuca cgccgucuaa cggggaccug ucuguccguu cuagcggcug 1260

auguugaugu ucgacgggcu gcugaagugg ccgacacacu aacggaccuu gucguuguug 1320

gaccugaggu uucagccgcc guugauguua auggacaugg ccgacaaggc cuucagguua 1380

gacuucggga agcucgcccu guagaggugg cucuagauag uccggccguc guggggaaca 1440

uugccgcacc uuccgaaguu gacgaugaag ggugacguca ggaugccgaa agucgggugu 1500

uuaccgcacc cgauagucgg gaugucucac caccacgacu cgaagcuuga cgacguacgg 1560

ggacgguguc acacgccggg auucuuuucg ugguuagagc acuucuuguu uacgcacuug 1620

aaguugaagu ugccggacug gccguggccg cacgacuguc ucucguuguu cuucaaggac 1680

gguaaggucg ucaaaccggc ccuauagcgg cuaugguguc ugcggcaauc ucuagggguc 1740

ugugaccuuu aggaccugua guggggaacg ucgaagccgc cucacagaca cuagugggga 1800

ccgugguugu ggucguuagu ccaccgucac gacauggucc ugcacuugac auggcuucac 1860

gggcaccggu aagugcggcu agucgacugu ggauguaccg cccacaugag guggccgucg 1920

uuacacaaag ucuggucucg gccgacagac uagccucggc ucgugcacuu guuaucgaug 1980

cucacgcugu agggguagcc gcgaccguag acacggucga uggucugugu cuguuugucg 2040

gggucugccc ggucuagaca ccggucgguc ucguaguaac ggauguguua cagagacccg 2100

cggcucuugu cgcaccggau gagguuguug agauagcgau aggggugguu gaagugguag 2160

ucgcacuggu gucucuagga cggacacagg uacugguucu ggucgcaccu gacgugguac 2220

auguagacgc cgcuaaggug gcucacgagg uuggacgacg acgucaugcc gucgaagacg 2280

ugggucgacu uaucucggga cugucccuag cggcaccuug uccuguucuu guggguucuc 2340

cacaagcggg uucacuucgu cuagauguuc uggggaggau aguuccugaa gccgccgaag 2400

uuaaagucgg ucuaagacgg gcuaggaucg uucgggucgu ucgccucgaa guagcuccug 2460

gacgacaagu uguuucacug ugaccggcug cggccgaagu aguucgucau accgcuaaca 2520

gacccgcugu aacggcgguc ccuagacuaa acgcgggucu ucaaauugcc ugacugucac 2580

gacggaggag acgacuggcu acucuacuag cgggucaugu guagacggga cgaccggccg 2640

uguuaguguu cgccgaccug uaaaccucga ccgcggcgag acgucuaggg gaaacgauac 2700

gucuaccgga uggccaaguu gccguagccu cacugggucu uacacgacau gcucuugguc 2760

uucgacuagc gguuggucaa guugucgcgg uagccguucu agguccuguc ggacucgucg 2820

ugucguucgc gggacccuuu cgacguccug caccaguugg ucuuacgggu ccgugacuug 2880

ugggaccagu ucgucgacag gagguugaag ccgcgguagu cgagacacga cuugcuauag 2940

gacucgucug accuguucca ccuucggcuc cacgucuagc ugucugacua guggccuucc 3000

gacgucaggg acgucuggau gcaauggguc gucgacuagu cucggcggcu cuaaucucgg 3060

agacgguuag accggcggug guucuacaga cucacacacg acccggucuc guucucucac 3120

cugaaaacgc cguucccgau gguggacuac ucgaagggag ucagacgggg agugccgcac 3180

cacaaagacg ugcacuguau gcacgggcga guucucuucu uaaaguggug gcgaggucgg 3240

uagacggugc ugccguuucg ggugaaagga ucucuuccgc acaagcacag guugccgugg 3300

guaaccaagc acugggucgc cuugaagaug cucggggucu aguaguggug gcuguugugg 3360

aagcacagac cguugacgcu gcagcacuag ccguaacacu uguuauggca caugcuggga 3420

gacgucgggc ucgaccuguc gaaguuucuc cuugaccuau ucaugaaauu cuuggugugu 3480

ucggggcugc accuggaccc gcuauagucg ccuuaguuac ggucgcagca cuuguagguc 3540

uuucucuagc uggccgacuu gcuccaccgg uucuuagacu ugcucucgga cuagcuggac 3600

guucuugacc ccuucaugcu cgucauguag uucaccggga ccauguagac cgacccgaaa 3660

uagcggccug acuaacggua gcacuaccag uguuaguacg acacaacgua cuggucgacg 3720

acaucgacgg acuucccgac aacaucgaca ccgucgacga cguucaagcu gcuccugcua 3780

agacucgggc acgacuuccc gcacuuugac gugaugugu 3819

<210> 21

<211> 225

<212> RNA

<213> Artificial Sequence

<220>

<223> E protein gene-optimized mRNA sequence (EBL mRNA)

<400> 21

uacaugucga aacagagucu ccuuuggccg ugcgacuaac auuugucgca caaugauaag 60

gagcggaagc aacacaaaga ggaacaaugu gaccguuaug acugacggga cgccaacacg 120

cgaaugacga cauuauagca cuugcacaga aaccacuucg ggucaaagau acauauaagg 180

ucucaguuuu uagaguugag gagaucccac ggacuggacg aacag 225

<210> 22

<211> 669

<212> RNA

<213> Artificial Sequence

<220>

<223> M protein gene-optimized mRNA sequence (MBL mRNA)

<400> 22

uaccgucuaa gguugccaug uuaauggcag cuucucgacu uuuucgagga acucgucacc 60

uuggaccagu aucccaagga uaaggacugu accuaaacgg acgacguuaa acggauacgg 120

uuguccuuau ccaaaaacau auauuaguuc gacuaaaagg agaccgacaa uaccggucac 180

ugggaccgga caaaacacga acggcgacaa augucuuauu uaaccuagug gccgccuuag 240

cgguagcguu accgaacgga acauccgaac uacaccgagu cgaugaagua acgaagaaag 300

gccgacaaac gcgcuugcgc cagguacacc agaaaguuag gccucugauu guaugaggag 360

uuacacgggg agguaccgug auaagacugg ucuggggacg aucucucacu ugagcaguag 420

ccucgacacu aggacgcccc cguggacucu uagcggccug uggugaaucc ggcgacacug 480

uaguuccuag acggauuucu uuagugacaa cgguguagug cuugggaaag aauaauguuc 540

aacccccgga gcgucgcaca ccguccucug aguccaaaac gccguauguc agcgaugucc 600

uaaccguuga uauuuaauuu gugucuggua aggucgucgu cgcuauuaua acgaaacgaa 660

cacgucacu 669

<210> 23

<211> 1257

<212> RNA

<213> Artificial Sequence

<220>

<223> N protein gene-optimized mRNA sequence (NBL mRNA)

<400> 23

uacagucuau ugccuggcgu cuugguuucc uugcggggag ccuagugaaa gcccccagga 60

ucgcugucgu gacccagauu gguuuuaccu cuugcaaggc cgcguucuag guuugucucc 120

uccggagucc ccgaaggauu guuaugucgg aggaccaagu gucgagagug ugucguaccg 180

uuccuucugg acuucaaagg aucuccgguc ccccaagggu aguuaugauu gaggaggggu 240

cugcuagucu aaccaauaau agccgcccga ugguccgccu aggccccgcc ucugccauuc 300

uacuuccugg agagaggggc aaccaugaaa augauggagc cauguccggg gcuccgaccc 360

gaaggcauac cgcgguuauu ccuaccuuau uaaacccacc gaugccuucc ccgggaguug 420

uguggcuucc uaguguaacc gugggcauua gggcgcuuau uacggcggua acaggacguc 480

aacggggucc ccugcugcaa cggguuuccg aaaaugcguc uuccuagcgc gccuccuagg 540

guucggaggu cggcuaguuc gagagcuaga gccuugaguu cagcguuauc gugugguccc 600

agaagagcgc ccuggucggg acguuccuac cggccuuugc cgccacuacg acgaaaucgc 660

gacgacgacg accuaucuga cuugguuaau cucucauuuu acaguccauu uccggucguu 720

gucguccccg ucugucacug guuuuuuuca cgccggcucc ggucguucuu uggggcgguc 780

uuugcuuguc ggugauuucg gauguugcau uggguucgua agccuuccuc uccuggucuc 840

gucuggguuc cguuaaaacc gcuaguucuc gacuaggcgg uccccugccu gauauucgua 900

accggugucu agcgggucaa gcguggguca cgaagucgga agaagccuua cagcucuuag 960

ccauaccucc agugaggaag accgugaacc gacugaauau ggccgcguua uuucgaucug 1020

cuguuucugg gauugaaauu ccuaguccac uaggacgauu uauuugugua acuacgcaug 1080

uuuuguaagg gugguugacu cgguuucuuc cuguucuucu ucuuccgucu acuuuggguc 1140

cgaaacgggg ucucugucuu uuucgucguc uggcacugga acgacggucg ucggcuggag 1200

cugcuaaaaa guuucguuga agucgucagg uacucaucgc gacugucgug gguccga 1257

<210> 24

<211> 3819

<212> DNA

<213> Artificial Sequence

<220>

<223> SDC50

<400> 24

atgttcgtgt ttctggtgct gctgcctctg gtgtcttctc agtgtgtgaa tctgacaaca 60

agaacacagc tgcctcctgc ctacaccaac agctttacaa gaggagtgta ctaccctgac 120

aaggtgttca gaagcagcgt gctgcattct acacaggacc tgtttctgcc tttcttcagc 180

aacgtgacct ggtttcacgc cattcacgtg tctggcacaa atggaaccaa gaggttcgac 240

aatcctgtgc tgcctttcaa cgatggcgtg tactttgcct ctaccgagaa gagcaacatc 300

atcagaggct ggatctttgg caccacactg gatagcaaga cacagtctct gctgatcgtg 360

aacaatgcca ccaacgtggt gatcaaggtg tgtgagttcc agttctgcaa cgaccctttt 420

ctgggcgtgt actaccacaa gaacaacaag agctggatgg agagcgagtt cagagtgtac 480

agctctgcca acaattgcac ctttgagtac gtgagccagc ctttcctgat ggatctggaa 540

ggaaagcagg gcaatttcaa gaacctgcgg gagttcgtgt tcaagaacat cgacggctac 600

ttcaagatct acagcaagca cacccccatc aatctggtga gagatctgcc tcagggattt 660

tctgctctgg aacctctggt ggatctgcct attggcatca acatcaccag attccagaca 720

ctgctggctc tgcacagatc ttacctgaca cctggagatt cttcttctgg atggacagct 780

ggagctgctg cttattacgt gggctatctg cagcctagaa ccttcctgct gaagtacaac 840

gagaatggca ccatcacaga tgctgtggat tgtgctctgg atcctctgtc tgagaccaag 900

tgtacactga agagcttcac agtggagaag ggcatctacc agaccagcaa tttcagagtg 960

cagcctacag agagcatcgt gagattcccc aacatcacca atctgtgccc ttttggagag 1020

gtgttcaatg ccaccagatt tgcctctgtg tacgcctgga acagaaagag gatcagcaac 1080

tgtgtggccg attactctgt gctgtacaac tctgccagct ttagcacctt caagtgctac 1140

ggagtgtctc ctacaaagct gaacgacctg tgtttcacca acgtgtacgc cgatagcttc 1200

gtgattagag gcgatgaagt gagacagatt gctcctggcc agacaggaaa gatcgccgat 1260

tacaactaca agctgcctga tgacttcacc ggctgtgtga ttgcctggaa tagcaataac 1320

ctggacagca aagtgggcgg caactacaac tacctgtaca gactgttcag gaagagcaac 1380

ctgaagccct tcgagagaga catctctacc gagatttatc aggctggaag caccccttgt 1440

aatggcgtgg aaggcttcaa ctgttacttt cctctgcaga gctacggctt tcagcctacc 1500

aatggagtgg gatatcagcc ttatagagtg gtggtgctga gctttgaact gctgcatgct 1560

cctgctacag tgtgtggacc taagaagagc accaacctgg tgaagaacaa gtgcgtgaac 1620

ttcaacttca acggcctgac aggaacagga gtgctgacag agagcaataa gaagttcctg 1680

cccttccagc agtttggcag agacattgcc gatacaacag atgccgtgag agatcctcag 1740

acactggaga tcctggatat cacaccttgt agctttggcg gcgtgtctgt gattacacct 1800

ggaaccaata ccagcaatca ggtggctgtg ctgtaccagg atgtgaattg cacagaagtg 1860

cctgtggcca ttcatgctga tcagctgaca cctacatgga gagtgtacag caccggctct 1920

aatgtgtttc agaccagagc tggatgtctg attggagccg agcacgtgaa taacagctac 1980

gagtgtgaca tccctattgg agccggaatc tgtgcctctt atcagacaca gaccaactct 2040

cctagaagag ccagatctgt ggcctctcag tctatcatcg cctataccat gtctctggga 2100

gctgagaata gcgtggccta tagcaacaac agcattgcca tccctaccaa cttcaccatc 2160

agcgtgacaa cagagattct gcctgtgagc atgaccaaga catctgtgga ctgcaccatg 2220

tacatctgtg gcgattctac cgagtgtagc aatctgctgc tgcagtacgg ctctttttgt 2280

acccagctga atagagccct gacaggaatt gccgtggaac aggacaagaa tacccaggaa 2340

gtgtttgccc aggtgaagca gatctacaag acccctccta tcaaggactt tggcggcttc 2400

aacttctctc agattctgcc tgatcctagc aagcccagca agagaagttt catcgaggat 2460

ctgctgttca acaaggtgac actggccgat gccggattta tcaagcagta tggagattgt 2520

ctgggcgata tcgccgccag agatctgatt tgtgcccaga agtttaatgg actgaccgtg 2580

ctgcctcctc tgctgacaga tgagatgatt gctcagtata catctgccct gctggccgga 2640

acaatcacat ctggatggac atttggagct ggagctgctc tgcagattcc ttttgccatg 2700

cagatggcct acagattcaa tggcatcggc gtgacacaga atgtgctgta cgagaaccag 2760

aagctgattg ccaaccagtt caacagcgcc attggcaaga tccaggattc tctgtcttct 2820

acagcctctg ctctgggaaa actgcaggat gtggtgaatc agaatgccca ggccctgaat 2880

acactggtga agcagctgtc tagcaatttt ggcgccatct ctagcgtgct gaatgacatc 2940

ctgagcagac tggataaagt ggaggccgaa gtgcagatcg atagactgat cacaggcaga 3000

ctgcagtctc tgcagacata tgtgacacag cagctgatta gagctgccga gatcagagct 3060

tctgctaatc tggctgccac aaagatgtct gagtgtgtgc tgggacagtc taagagagtg 3120

gacttctgtg gcaaaggcta tcacctgatg agctttcctc agtctgctcc tcatggagtg 3180

gtgtttctgc atgtgacata tgtgcctgcc caggagaaga acttcacaac agctcctgcc 3240

atttgtcatg atggcaaggc ccactttcct agagaaggag tgttcgtgtc taatggcaca 3300

cactggttcg tgacacagag gaacttctac gagcctcaga tcatcaccac cgataacacc 3360

ttcgtgtctg gcaattgcga tgtggtgatc ggcatcgtga acaataccgt gtatgatcct 3420

ctgcagcctg agctggatag cttcaaggag gagctggaca agtacttcaa gaaccacacc 3480

tctcctgatg tggatctggg cgatatctct ggcatcaatg cctctgtggt gaacatccag 3540

aaggagatcg acagactgaa tgaggtggcc aagaacctga atgagagcct gatcgatctg 3600

caggaactgg gaaagtacga gcagtacatc aagtggcctt ggtacatctg gctgggattt 3660

attgccggac tgattgccat cgtgatggtg accatcatgc tgtgctgtat gaccagctgt 3720

tgtagctgtc tgaaaggctg ctgtagctgt ggcagctgtt gcaagtttga tgaggatgat 3780

tctgagcctg tgctgaaggg cgtgaagctg cactacacc 3819

<210> 25

<211> 3819

<212> DNA

<213> Artificial Sequence

<220>

<223> SDC54

<400> 25

atgttcgtgt tcctggtgct gctgcctctg gtgagctctc agtgtgtgaa tctgaccaca 60

agaacccagc tgcctcctgc ctacaccaac agctttacca gaggagtgta ctaccccgac 120

aaggtgttca gaagcagcgt gctgcatagc acacaggatc tgttcctgcc cttcttcagc 180

aacgtgacct ggtttcacgc catccatgtg tctggcacca atggcaccaa gagattcgac 240

aaccctgtgc tgcctttcaa cgatggcgtg tacttcgcct ctaccgagaa gagcaacatc 300

atcagaggct ggatcttcgg caccacactg gatagcaaga cccagtctct gctgatcgtg 360

aacaacgcca ccaacgtggt gatcaaggtg tgcgagttcc agttctgcaa cgaccccttc 420

ctgggcgtgt actaccacaa gaacaacaag agctggatgg agagcgagtt cagggtgtac 480

agcagcgcca acaattgcac cttcgagtac gtgagccagc ctttcctgat ggatctggag 540

ggaaagcagg gcaacttcaa gaacctgcgg gagttcgtgt tcaagaacat cgacggctac 600

ttcaagatct acagcaagca cacccccatc aacctggtga gagatctgcc tcagggattt 660

tctgctctgg agcctctggt ggatctgcct atcggcatca acatcaccag attccagaca 720

ctgctggccc tgcacagaag ctacctgaca cctggagatt cttcttctgg ctggacagct 780

ggagctgctg cctattacgt gggctatctg cagcccagaa ccttcctgct gaagtacaac 840

gagaacggca ccatcacaga tgccgtggat tgtgccctgg atcctctgtc tgagaccaag 900

tgtaccctga agagcttcac cgtggagaag ggcatctacc agaccagcaa cttcagagtg 960

cagcctaccg agagcatcgt gagattcccc aacatcacca acctgtgccc ttttggcgag 1020

gtgttcaatg ccaccagatt tgccagcgtg tacgcctgga acaggaagag gatcagcaac 1080

tgtgtggccg attacagcgt gctgtacaac tctgccagct tcagcacctt caagtgctac 1140

ggcgtgtctc ctacaaagct gaacgacctg tgcttcacca acgtgtacgc cgacagcttc 1200

gtgattagag gcgatgaggt gagacagatt gctcctggcc agacaggcaa gattgccgac 1260

tacaactaca agctgcctga cgacttcacc ggctgtgtga ttgcctggaa cagcaacaat 1320

ctggacagca aggtgggcgg caactacaac tacctgtaca ggctgttcag gaagagcaac 1380

ctgaagccct tcgagagaga catcagcacc gagatctatc aggctggaag caccccttgt 1440

aatggcgtgg agggcttcaa ctgttacttc cctctgcaga gctacggctt tcagcctacc 1500

aatggagtgg gctatcagcc ttacagagtg gtggtgctga gctttgaact gctgcatgct 1560

cctgctacag tgtgtggccc caagaagagc accaacctgg tgaagaacaa gtgcgtgaac 1620

ttcaacttca acggcctgac cggaacagga gtgctgacag agagcaacaa gaagttcctg 1680

cccttccagc agttcggcag agatatcgcc gataccacag atgccgtgag agatcctcag 1740

acactggaga tcctggacat cacaccttgc agctttggcg gagtgtctgt gatcacacct 1800

ggcaccaata ccagcaatca ggtggctgtg ctgtaccagg acgtgaattg caccgaagtg 1860

cctgtggcca ttcatgctga tcagctgacc cctacatgga gagtgtacag caccggctct 1920

aatgtgttcc agaccagagc cggatgtctg attggagccg agcacgtgaa taacagctac 1980

gagtgcgaca tccctattgg agccggcatc tgtgcctctt atcagaccca gaccaactct 2040

cctagaagag ccagaagcgt ggcctctcag agcatcattg cctacaccat gtctctggga 2100

gccgagaata gcgtggccta cagcaataac agcatcgcca tccccaccaa cttcaccatc 2160

agcgtgacca cagagattct gcctgtgagc atgaccaaga cctctgtgga ctgcaccatg 2220

tacatctgtg gcgactctac cgagtgcagc aatctgctgc tgcagtatgg cagcttttgt 2280

acccagctga acagagccct gacaggcatt gctgtggagc aggataagaa cacccaggag 2340

gtgtttgccc aggtgaagca gatctacaag acccctccca tcaaggactt cggcggcttt 2400

aacttcagcc agatcctgcc tgatcctagc aagcccagca agaggagctt tatcgaggac 2460

ctgctgttca acaaggtgac cctggccgat gctggcttta tcaagcagta cggagattgt 2520

ctgggcgata tcgccgccag agacctgatt tgtgcccaga agttcaatgg actgaccgtg 2580

ctgcctcctc tgctgacaga tgagatgatt gcccagtaca catctgccct gctggctggc 2640

acaatcacat ctggatggac atttggagct ggagctgccc tgcagatccc ttttgccatg 2700

cagatggcct acagattcaa cggcatcggc gtgacccaga atgtgctgta cgagaaccag 2760

aagctgatcg ccaaccagtt caacagcgcc atcggcaaga tccaggattc tctgtctagc 2820

acagcctctg ctctgggaaa gctgcaggat gtggtgaatc agaatgccca ggccctgaat 2880

acactggtga agcagctgag cagcaacttt ggcgccatca gctctgtgct gaatgacatc 2940

ctgagcagac tggacaaggt ggaggctgaa gtgcagatcg acagactgat cacaggcaga 3000

ctgcagtctc tgcagaccta cgtgacacag cagctgatta gagctgccga gatcagagct 3060

tctgccaatc tggctgccac caagatgtct gagtgtgtgc tgggacagag caagagagtg 3120

gacttctgtg gcaaaggcta ccacctgatg agcttccctc agtctgctcc tcatggagtg 3180

gtgtttctgc acgtgaccta tgtgcctgcc caggagaaga acttcaccac agctcctgcc 3240

atttgtcacg atggcaaggc ccactttcct agagaaggcg tgttcgtgag caatggcaca 3300

cactggttcg tgacccagag gaacttctac gagccccaga tcatcaccac cgataacacc 3360

ttcgtgagcg gcaattgcga cgtggtgatc ggcatcgtga acaataccgt gtacgatcct 3420

ctgcagcctg agctggacag cttcaaggag gagctggaca agtacttcaa gaaccacacc 3480

agccctgatg tggatctggg cgacatctct ggcatcaatg ccagcgtggt gaacatccag 3540

aaggagatcg acaggctgaa cgaggtggcc aagaacctga atgagagcct gatcgatctg 3600

caggagctgg gcaagtacga gcagtacatc aagtggcctt ggtacatctg gctgggcttt 3660

atcgccggac tgattgccat cgtgatggtg accatcatgc tgtgctgcat gaccagctgc 3720

tgtagctgtc tgaagggctg ttgtagctgt ggcagctgtt gcaagttcga cgaggatgat 3780

agcgagcctg tgctgaaagg cgtgaagctg cactacacc 3819

<210> 26

<211> 3819

<212> DNA

<213> Artificial Sequence

<220>

<223> SDC58

<400> 26

atgttcgtgt tcctggtgct gctgcccctg gtgagctctc agtgtgtgaa cctgaccacc 60

agaacccagc tgcctcctgc ctacaccaac agcttcacca gaggcgtgta ctaccccgac 120

aaggtgttca gaagcagcgt gctgcacagc acccaggacc tgttcctgcc cttcttcagc 180

aacgtgacct ggttccacgc catccacgtg tctggcacca atggcaccaa gaggttcgac 240

aaccctgtgc tgcccttcaa cgacggcgtg tacttcgcca gcaccgagaa gagcaacatc 300

atcaggggct ggatcttcgg caccaccctg gacagcaaga cccagagcct gctgatcgtg 360

aacaacgcca ccaacgtggt gatcaaggtg tgcgagttcc agttctgcaa cgaccccttc 420

ctgggcgtgt actaccacaa gaacaacaag agctggatgg agagcgagtt ccgggtgtac 480

agcagcgcca acaactgcac cttcgagtac gtgagccagc ccttcctgat ggacctggag 540

ggcaagcagg gcaacttcaa gaacctgcgg gagttcgtgt tcaagaacat cgacggctac 600

ttcaagatct acagcaagca cacccccatc aacctggtga gagacctgcc tcagggcttt 660

tctgccctgg agcctctggt ggacctgcct atcggcatca acatcaccag gttccagacc 720

ctgctggccc tgcacagaag ctacctgaca cctggcgata gctcttctgg ctggacagct 780

ggagctgctg cctattacgt gggctacctg cagcccagga ccttcctgct gaagtacaac 840

gagaacggca ccatcaccga cgccgtggat tgtgccctgg atcctctgag cgagaccaag 900

tgcaccctga agagcttcac cgtggagaag ggcatctacc agaccagcaa cttccgggtg 960

cagcctaccg agagcatcgt gaggttcccc aacatcacca acctgtgccc tttcggcgag 1020

gtgttcaacg ccaccagatt cgcctctgtg tacgcctgga acaggaagcg gatcagcaac 1080

tgcgtggccg actacagcgt gctgtacaac agcgccagct tcagcacctt caagtgctac 1140

ggcgtgagcc ctaccaagct gaacgacctg tgcttcacca acgtgtacgc cgacagcttc 1200

gtgatcagag gcgatgaggt gagacagatc gcccctggac agaccggcaa gatcgccgac 1260

tacaactaca agctgcccga cgacttcacc ggctgtgtga tcgcctggaa cagcaacaac 1320

ctggacagca aggtgggcgg caactacaac tacctgtacc ggctgttccg gaagagcaac 1380

ctgaagccct tcgagaggga catcagcacc gagatctacc aggccggaag cacaccttgc 1440

aatggcgtgg agggcttcaa ctgctacttc cccctgcaga gctacggctt tcagcctacc 1500

aatggcgtgg gctaccagcc ctacagagtg gtggtgctga gctttgaact gctgcatgcc 1560

cctgccacag tgtgtggccc caagaagagc accaacctgg tgaagaacaa gtgcgtgaac 1620

ttcaacttca acggcctgac cggcacaggc gtgctgaccg agagcaacaa gaagttcctg 1680

cccttccagc agttcggcag agacatcgcc gataccaccg atgccgtgag agatcctcag 1740

accctggaga tcctggacat caccccttgc agctttggcg gagtgagcgt gatcacacct 1800

ggcaccaaca ccagcaatca ggtggccgtg ctgtaccagg acgtgaactg cacagaggtg 1860

cctgtggcca ttcatgccga tcagctgacc cctacctgga gagtgtacag caccggcagc 1920

aatgtgttcc agaccagagc cggctgtctg atcggagccg agcacgtgaa caacagctac 1980

gagtgcgaca tccctatcgg agccggcatc tgcgcctctt accagacaca gaccaacagc 2040

cccagaagag ccagaagcgt ggccagccag tctatcatcg cctacaccat gagcctggga 2100

gccgagaaca gcgtggccta cagcaacaac agcatcgcca tccccaccaa cttcaccatc 2160

agcgtgacca ccgagatcct gcccgtgagc atgaccaaga ccagcgtgga ctgcaccatg 2220

tacatctgcg gcgacagcac agagtgcagc aacctgctgc tgcagtacgg cagcttttgc 2280

acccagctga acagagccct gacaggcatt gccgtggagc aggacaagaa cacccaggag 2340

gtgttcgccc aggtgaagca gatctacaag acccccccca tcaaggactt cggcggcttc 2400

aacttcagcc agatcctgcc tgaccctagc aagcccagca agcggagctt catcgaggac 2460

ctgctgttca acaaggtgac cctggccgat gccggcttca tcaagcagta cggcgattgt 2520

ctgggcgata tcgccgccag agacctgatc tgtgcccaga agttcaacgg cctgaccgtg 2580

ctgcctcctc tgctgacaga tgagatgatc gcccagtaca cctctgccct gctggccgga 2640

accatcacat ctggctggac atttggagct ggagccgccc tgcagatccc tttcgccatg 2700

cagatggcct acaggttcaa cggcatcggc gtgacccaga acgtgctgta cgagaaccag 2760

aagctgatcg ccaaccagtt caacagcgcc atcggcaaga tccaggacag cctgtctagc 2820

acagcctctg ctctgggcaa gctgcaggat gtggtgaacc agaatgccca ggccctgaac 2880

accctggtga agcagctgag cagcaatttc ggcgccatca gcagcgtgct gaacgacatc 2940

ctgagcagac tggacaaggt ggaggccgag gtgcagatcg acagactgat caccggcaga 3000

ctgcagagcc tgcagaccta cgtgacacag cagctgatca gagccgccga gatcagagcc 3060

tctgccaatc tggctgccac caagatgagc gagtgtgtgc tgggccagag caagagagtg 3120

gacttctgcg gcaaaggcta ccacctgatg agcttccccc agtctgctcc tcatggcgtg 3180

gtgtttctgc acgtgaccta cgtgcctgcc caggagaaga acttcaccac agcccctgcc 3240

atctgtcacg atggcaaggc ccacttccct agagagggcg tgttcgtgag caatggcacc 3300

cactggttcg tgacccagcg gaacttctac gagccccaga tcatcaccac cgacaacacc 3360

ttcgtgagcg gcaactgcga cgtggtgatc ggcatcgtga acaacaccgt gtacgaccct 3420

ctgcagcccg agctggacag cttcaaggag gagctggaca agtacttcaa gaaccacacc 3480

agccccgacg tggatctggg cgacatcagc ggcatcaacg ccagcgtggt gaacatccag 3540

aaggagatcg accggctgaa cgaggtggcc aagaacctga acgagagcct gatcgacctg 3600

caggagctgg gcaagtacga gcagtacatc aagtggccct ggtacatctg gctgggcttt 3660

atcgccggcc tgatcgccat cgtgatggtg accatcatgc tgtgctgcat gaccagctgc 3720

tgcagctgcc tgaagggctg ttgtagctgt ggcagctgct gcaagttcga cgaggacgat 3780

agcgagcctg tgctgaaggg cgtgaagctg cactacacc 3819

<210> 27

<211> 3819

<212> DNA

<213> Artificial Sequence

<220>

<223> SDC60

<400> 27

atgttcgtgt tcctggtgct gctgcccctg gtgagcagcc agtgtgtgaa cctgaccacc 60

agaacccagc tgcctcccgc ctacaccaac agcttcacca ggggcgtgta ctaccccgac 120

aaggtgttca ggagcagcgt gctgcacagc acccaggacc tgttcctgcc cttcttcagc 180

aacgtgacct ggttccacgc catccacgtg agcggcacca atggcaccaa gcggttcgac 240

aaccctgtgc tgcccttcaa cgacggcgtg tacttcgcca gcaccgagaa gagcaacatc 300

atccggggct ggatcttcgg caccaccctg gacagcaaga cccagagcct gctgatcgtg 360

aacaacgcca ccaacgtggt gatcaaggtg tgcgagttcc agttctgcaa cgaccccttc 420

ctgggcgtgt actaccacaa gaacaacaag agctggatgg agagcgagtt ccgggtgtac 480

agcagcgcca acaactgcac cttcgagtac gtgagccagc ccttcctgat ggacctggag 540

ggcaagcagg gcaacttcaa gaacctgcgg gagttcgtgt tcaagaacat cgacggctac 600

ttcaagatct acagcaagca cacccccatc aacctggtga gggacctgcc tcagggcttt 660

tctgccctgg agcctctggt ggacctgccc atcggcatca acatcaccag gttccagacc 720

ctgctggccc tgcacaggag ctacctgaca cctggcgata gctcttctgg ctggacagcc 780

ggagctgctg cctactacgt gggctacctg cagccccgga ccttcctgct gaagtacaac 840

gagaacggca ccatcaccga cgccgtggat tgcgccctgg atcctctgag cgagaccaag 900

tgcaccctga agagcttcac cgtggagaag ggcatctacc agaccagcaa cttccgggtg 960

cagcccaccg agagcatcgt gaggttcccc aacatcacca acctgtgccc cttcggcgag 1020

gtgttcaacg ccaccagatt cgccagcgtg tacgcctgga accggaagcg gatcagcaac 1080

tgcgtggccg actacagcgt gctgtacaac agcgccagct tcagcacctt caagtgctac 1140

ggcgtgagcc ccaccaagct gaacgacctg tgcttcacca acgtgtacgc cgacagcttc 1200

gtgatcaggg gcgatgaggt gagacagatc gcccctggcc agaccggcaa gatcgccgac 1260

tacaactaca agctgcccga cgacttcacc ggctgcgtga tcgcctggaa cagcaacaac 1320

ctggacagca aggtgggcgg caactacaac tacctgtacc ggctgttccg gaagagcaac 1380

ctgaagccct tcgagcggga catcagcacc gagatctacc aggccggaag caccccttgc 1440

aacggcgtgg agggcttcaa ctgctacttc cccctgcaga gctacggctt ccagcctacc 1500

aatggcgtgg gctaccagcc ctacagggtg gtggtgctga gctttgagct gctgcatgct 1560

cctgccaccg tgtgcggccc caagaagagc accaacctgg tgaagaacaa gtgcgtgaac 1620

ttcaacttca acggcctgac cggcaccggc gtgctgaccg agagcaacaa gaagttcctg 1680

cccttccagc agttcggcag ggacatcgcc gataccaccg atgccgtgag agaccctcag 1740

accctggaga tcctggacat caccccttgc agcttcggcg gagtgagcgt gatcacacct 1800

ggcaccaaca ccagcaacca ggtggccgtg ctgtaccagg acgtgaactg caccgaggtg 1860

cctgtggcca ttcacgccga tcagctgacc cccacctgga gagtgtacag caccggcagc 1920

aacgtgttcc agaccagagc cggctgtctg atcggcgccg agcacgtgaa caacagctac 1980

gagtgcgaca tccccatcgg cgccggcatc tgtgccagct atcagaccca gaccaacagc 2040

cctaggaggg ccagaagcgt ggccagccag tctatcatcg cctacaccat gagcctgggc 2100

gccgagaaca gcgtggccta cagcaacaac agcatcgcca tccccaccaa cttcaccatc 2160

agcgtgacca ccgagatcct gcccgtgagc atgaccaaga ccagcgtgga ctgcaccatg 2220

tacatctgcg gcgacagcac cgagtgcagc aacctgctgc tgcagtacgg cagcttctgc 2280

acccagctga acagagccct gacaggcatc gccgtggagc aggacaagaa cacccaggag 2340

gtgttcgccc aggtgaagca gatctacaag acccccccca tcaaggactt cggcggcttc 2400

aacttcagcc agatcctgcc tgaccccagc aagcccagca agcggagctt catcgaggac 2460

ctgctgttca acaaggtgac cctggccgac gccggcttca tcaagcagta cggcgactgt 2520

ctgggcgaca tcgccgccag agacctgatc tgtgcccaga agttcaacgg cctgaccgtg 2580

ctgccccctc tgctgaccga tgagatgatc gcccagtaca cctctgccct gctggccggc 2640

accatcacat ctggctggac ctttggagct ggagccgccc tgcagatccc tttcgccatg 2700

cagatggcct accggttcaa cggcatcggc gtgacccaga acgtgctgta cgagaaccag 2760

aagctgatcg ccaaccagtt caacagcgcc atcggcaaga tccaggacag cctgagcagc 2820

accgcctctg ctctgggcaa actgcaggac gtggtgaacc agaacgccca ggccctgaac 2880

accctggtga agcagctgag cagcaacttc ggcgccatca gcagcgtgct gaacgacatc 2940

ctgagcaggc tggacaaggt ggaggccgag gtgcagatcg acaggctgat caccggcaga 3000

ctgcagagcc tgcagaccta cgtgacccag cagctgatca gagccgccga gatcagagcc 3060

tctgccaatc tggccgccac caagatgagc gagtgtgtgc tgggccagag caagagggtg 3120

gacttctgcg gcaagggcta ccacctgatg agcttccccc agtctgcccc tcatggcgtg 3180

gtgttcctgc acgtgaccta cgtgcctgcc caggagaaga acttcaccac cgcccctgcc 3240

atctgccacg atggcaaggc ccacttccct agagagggcg tgttcgtgag caacggcacc 3300

cactggttcg tgacccagcg gaacttctac gagccccaga tcatcaccac cgacaacacc 3360

ttcgtgagcg gcaactgcga cgtggtgatc ggcatcgtga acaacaccgt gtacgacccc 3420

ctgcagcccg agctggacag cttcaaggag gagctggaca agtacttcaa gaaccacacc 3480

agccccgacg tggacctggg cgacatcagc ggcatcaacg ccagcgtggt gaacatccag 3540

aaggagatcg accggctgaa cgaggtggcc aagaacctga acgagagcct gatcgacctg 3600

caggagctgg gcaagtacga gcagtacatc aagtggccct ggtacatctg gctgggcttc 3660

atcgccggcc tgatcgccat cgtgatggtg accatcatgc tgtgctgcat gaccagctgc 3720

tgcagctgcc tgaagggctg ctgcagctgt ggcagctgtt gcaagttcga cgaggacgac 3780

agcgagcccg tgctgaaggg cgtgaagctg cactacacc 3819

<210> 28

<211> 957

<212> DNA

<213> Artificial Sequence

<220>

<223> MT2AE

<400> 28

atggccgatt ctaatggcac catcaccgtg gaagagctga agaagctgct cgagcaatgg 60

aacctggtga tcggatttct gttcctgacc tggatctgtc tgttgcagtt cgcctacgcc 120

aaccggaaca gattcctgta catcatcaaa ctgatcttcc tgtggctgct gtggcctgtg 180

accctggcct gcttcgtgct ggccgccgtg taccggatta actggatcac cggaggcatc 240

gctatcgcca tggcatgcct ggtcggactt atgtggctgt cttatttcat cgccagcttc 300

agactgttcg ctagaaccag aagcatgtgg tcctttaacc ctgagacaaa catcctgctg 360

aacgtgcctc tgcacggcac aatcctgaca cggccactgc tggaaagcga gctggtcatc 420

ggcgccgtga tcctgcgggg ccatctgcgc attgccggac accacctggg cagatgcgac 480

atcaaggacc tgcccaagga aatcaccgtg gccaccagca gaacactgtc ctactacaaa 540

ctgggcgcta gtcagagagt ggccggcgac agcggcttcg ccgcttattc tagatacaga 600

atcggcaact acaagctgaa taccgatcac agcagcagca gcgacaacat cgccctgctg 660

gtgcagggca gcggcgaggg cagaggaagc ctgctgacat gtggcgatgt ggaagagaac 720

cccggccctg ccatgtacag ctttgtgtct gaggaaaccg gcaccctgat cgtgaacagc 780

gtgctgctgt ttctggcctt cgtcgtgttc ctgctggtga cactggctat cctgaccgcc 840

ctgaggctgt gcgcctactg ctgcaacatc gtgaatgtat ccctggtgaa gccttccttc 900

tacgtgtaca gccgggtgaa gaaccttaat agctctagag tgcccgacct gctcgtt 957

<210> 29

<211> 960

<212> DNA

<213> Artificial Sequence

<220>

<223> MP2AE

<400> 29

atggccgaca gcaacggcac aatcacagtg gaagagctga agaagctgct ggagcagtgg 60

aacctggtga ttggatttct tttcctcacc tggatctgcc tgctgcagtt cgcctatgcc 120

aaccggaaca gattcctgta catcatcaag ctgatcttcc tgtggctgct gtggcccgtg 180

accctggcct gttttgtgct ggccgccgtg taccggatca actggatcac cggcggaatc 240

gctatcgcca tggcctgcct ggtgggcctg atgtggctga gctacttcat cgcctccttt 300

agactgttcg ccagaaccag aagcatgtgg tccttcaacc ctgagacaaa tatcctgctc 360

aacgtgcccc tgcacggcac catcctgacc cggcctctgc tcgagagcga gctggtgatc 420

ggcgccgtga tcctgagagg ccacctgaga atcgccggac accacctggg cagatgcgac 480

atcaaggacc tgccaaagga aatcaccgtt gctacaagca gaacactgtc ctactacaag 540

ctgggcgctt ctcaaagagt cgccggcgac agcggcttcg ctgcttatag ccgctacagg 600

attggaaatt acaagctgaa caccgatcat tcttctagca gcgacaacat cgccctgctg 660

gtccagggca gcggcgccac aaacttcagc ctgcttaaac aggccggcga tgtggaagag 720

aaccccggcc ctgccatgta cagcttcgtg tccgaggaaa ccggcaccct gatcgtgaac 780

agcgtgctgc tgttccttgc ttttgtggtg ttcctgctgg tcaccctggc catcctgacc 840

gccctgagac tgtgtgccta ctgctgcaac atcgtgaatg tgtctctggt gaagcctagc 900

ttctacgtgt acagccgggt gaaaaacctg aactctagcc gggtgcctga tctgctggtg 960

<210> 30

<211> 798

<212> DNA

<213> Artificial Sequence

<220>

<223> SGS-RBD

<400> 30

atggagacag acacactcct gctatgggta ctgctgctct gggttccagg ttccaccgga 60

gactgcccat ttggcgaggt gttcaacgca acccgcttcg ccagcgtgta cgcctggaat 120

aggaagcgga tcagcaactg cgtggccgac tatagcgtgc tgtacaactc cgcctctttc 180

agcaccttta agtgctatgg cgtgtccccc acaaagctga atgacctgtg ctttaccaac 240

gtctacgccg attctttcgt gatcaggggc gacgaggtgc gccagatcgc ccccggccag 300

acaggcaaga tcgcagacta caattataag ctgccagacg atttcaccgg ctgcgtgatc 360

gcctggaaca gcaacaatct ggattccaaa gtgggcggca actacaatta tctgtaccgg 420

ctgtttagaa agagcaatct gaagcccttc gagagggaca tctctacaga aatctaccag 480

gccggcagca ccccttgcaa tggcgtggag ggctttaact gttatttccc actccagtcc 540

tacggcttcc agcccacaaa cggcgtgggc tatcagcctt accgcgtggt ggtgctgagc 600

tttgagctgc tgcacgccta cccgtacgac gtgccggact acgccaatgc tgtgggccag 660

gacacgcagg aggtcatcgt ggtgccacac tccttgccct ttaaggtggt ggtgatctca 720

gccatcctgg ccctggtggt gctcaccatc atctccctta tcatcctcat catgctttgg 780

cagaagaagc cacgttag 798

<210> 31

<211> 3819

<212> RNA

<213> Artificial Sequence

<220>

<223> SDC-50 mRNA

<400> 31

uacaagcaca aggaccacga cgacggagac cacagaagag ucacacacuu agacuguugg 60

ucuugggucg acggaggacg gauaugguug ucgaaguguu cuccgcacau gaugggacug 120

uuccacaagu ccagaagaca cgacgugaga uggguccuag acaaggacgg aaagaagucg 180

uugcacugga ccaaagugcg guagguacac agaccguggu uaccgugguu cucuaagcug 240

uuaggacacg acggaaaguu gcuaccgcac augaagcgga gauggcucuu cucguuguag 300

uagucuccga ccuagaaacc guguugggac cuaucguucu gggucagaga cgacuagcac 360

uuguuacggu gguugcacca cuaguuccac acgcucaagg ucaagacguu acugggaaag 420

gacccgcaca ugaugguguu cuuguuguuc ucgaccuacc ucucgcucaa gucccacaug 480

ucgagacggu uguuaacgug gaagcucaug cacucggucg gaaaggacua ccuagaccuu 540

ccuuucgucc cguugaaguu cuuggacgcc cucaagcaca aguucuugua gcugccgaug 600

aaguucuaga ugucguucgu guggggguag uuagaccacu cucuagacgg agucccuaaa 660

agacgagacc uuggagacca ccuagacgga uagccguagu uguagugguc uaaggucugu 720

gacgaccgag acgugucuuc gauagacugu ggaccgcuaa gaagaagacc uaccugucga 780

ccucgacgac gaauaaugca cccgauggac gucggaucuu ggaaggacga cuucauguug 840

cucuuaccgu gguaguggcu acgacaccua acacgggacc uaggagacag acucuguuuc 900

acaugggacu ucucgaagug gcaccucuuc ccguagaugg ucuggucguu aaagucucac 960

gucggauggc ucucguagca cucuaagggg uuguaguggu uagacacggg aaaaccgcuc 1020

cacaaguuac gguggucuaa acggucgcac auacggaccu uguccuucuc uuagucguug 1080

acacaccggc ugaugucgca cgacauguua agacggucga aaucguggaa guucacgaug 1140

ccgcacagag gaugguucga cuuacuggac acaaaguggu ugcacaugcg gcugucgaag 1200

cacuagucuc cucuacuuca cucugucuaa cgaggaccgg ucuguccguu cuagcggcua 1260

auguugaugu ucgacggacu acugaagugg ccgacacacu agcggaccuu aucguuguua 1320

gaccugucgu uucacccgcc guugauguug auggacaugu ccgacaaguc cuucucguug 1380

gacuucggga agcucucucu guagagaugg cucuagauag uccgaccuuc guggggaaca 1440

uuaccgcacc uuccgaaguu gacaaugaag ggagacgucu cgaugccgaa agucggaugg 1500

uuaccucacc cuauagucgg aaugucucac caccacgacu cgaaacuuga cgacguacga 1560

ggacgauguc acacaccggg auucuucucg ugguuggacc acuucuuguu cacgcacuug 1620

aaguugaagu ugccggacug gccuuguccu cacgacuguc ucucguuguu cuucaaggac 1680

gggaaggucg ucaaaccguc ucuguaacgg cuaugguguc uacggcacuc ucuaggaguc 1740

ugugaccucu aggaccuaua guguggaacg ucgaaaccgc cucacagaca cuagugugga 1800

ccuugguuau ggucguuagu ccaccgacac gacauggucc ugcacuuaac gugucuucac 1860

ggacaccggu aaguacgacu agucgacugg ggauguaccu cucacauguc guguccgucg 1920

uuacacaaag ucuggucucg gccuacagac uaaccucgac ucgugcacuu guugucgaug 1980

cucacacugu agggauaacc ucggccuuag acacggucga uagucugugu cugguugaga 2040

ggaucuucuc ggucuagaca ccggucgguc agauaguagc ggauauggua cagagacccu 2100

cgacucuuau cgcaccggau gucguuguug ucguagcggu agggaugguu gaagugguag 2160

ucgcacuguu gucucuagga cggacacucg uacugguucu guagacaccu gacgugguac 2220

auguagacac cgcugucgug ucucacaucg uuagacgacg acgucaugcc gucgaaaaca 2280

ugggucgacu uaucucggga cuguccuuaa cggcaccucg uccuauucuu auggguccuc 2340

cacaaacggg uccacuucgu cuagauguuc uggggaggau aguuccugaa gccgccgaag 2400

uugaagucgg ucuaagacgg acuaggaucg uucgggucgu ucucuucaaa guagcuccua 2460

gacgacaagu uguuccacug ggaccggcua cggccuaaau aguucgucau accgcuaaca 2520

gacccgcuau agcggcgguc ucuagacuaa acacgggucu ucaaguuacc ugacuggcac 2580

gacggaggag acgacugucu acucuacuaa cgagucaugu guagacggga cgaccgaccg 2640

uguuagugua gaccuaccug uaaaccucga ccucgacgag acgucuaggg aaaacgguac 2700

gucuaccgga ugucuaaguu gccguagccu cacugggucu uacacgacau gcucuugguc 2760

uucgacuagc gguuggucaa guugucgcgg uaaccguucu agguccuaag agacagaucg 2820

ugucgaagac gagacccguu ugacguccua caccacuuag ucuuacgagu ccgggacuua 2880

ugggaccacu ucgucgacag aucguuaaaa ccgcgguagu cgucgcacga cuuacuguag 2940

gacucgucug accuauuuca ccuccggcuu cacgucuagc ugucugacua guguccuucu 3000

gacgucagag acgucuggau gcacuguguc gucgacuaau cucgacggcu cuaaucucgg 3060

agacgauuag accgacggug guucuacaga cucacacacg acccugucag auucucucac 3120

cugaagacac cguuuccgau gguggacuac ucgaaaggag ucagacgagg aguaccucac 3180

cacaaagacg ugcacuguau acacggacgg guccucuucu ugaaguggug ucgaggacgg 3240

uaaacagugc uaccguuucg ggugaaagga ucucuuccgc acaagcacuc guuaccuugg 3300

gugaccaaac acugggucuc uuugaagaug cucggggucu aguaguggug gcuguuaugg 3360

aagcacagac cguuaacgcu gcaccacuag ccguagcacu uguuauggca cauacuagga 3420

gacgucggac ucgaccuguc gaaguuccuc cucgaccugu ucaugaaguu cuuggugugg 3480

ucgggacuac accuagaccc gcuauagaga ccguaguuac ggagacacca cuuguagguc 3540

uuccucuagc uguccgacuu acuccaccgg uucuuggacu uacucucgga cuagcuagac 3600

guccucgacc cuuucaugcu cgucauguag uucaccggaa ccauguagac cgacccgaaa 3660

uaacggccug acuaacggua gcacuaccac ugguaguacg acacgacgua cuguucgaca 3720

acaucgacag acuucccgac gacaagaaca ccgucgacaa cguucaagcu acuccuacua 3780

ucgcucggac acgacuuucc gcacuucgac gugaugugg 3819

<210> 32

<211> 3819

<212> RNA

<213> Artificial Sequence

<220>

<223> SDC-54 mRNA

<400> 32

uacaagcaca aggaccacga cgacggagac cacucgagag ucacacacuu agacuggugu 60

ucuugggucg acggaggacg gaugugguug ucgaaauggu cuccucacau gauggggcug 120

uuccacaagu cuucgucgca cgacguaucg uguguccuag acaaggacgg gaagaagucg 180

uugcacugga ccaaagugcg guagguacac agaccguggu uaccgugguu cucuaagcug 240

uugggacacg acggaaaguu gcuaccgcac augaagcgga gauggcucuu cucguuguag 300

uagucuccga ccuagaagcc guggugugac cuaucguucu gggucagaga cgacuagcac 360

uuguugcggu gguugcacca cuaguuccac acgcucaagg ucaagacguu gcuggggaag 420

gacccgcaca ugaugguguu cuuguuguuc ucgaccuacc ucucgcucaa gucccacaug 480

ucgucgcggu uguuaacgug gaagcucaug cacucggucg gaaaggacua ccuagaccuc 540

ccuuucgucc cguugaaguu cuuggacgcc cucaagcaca aguucuugua gcugccgaug 600

aaguucuaga ugucguucgu guggggguag uuggaccacu cucuagacgg agucccuaaa 660

agacgagacc ucggagacca ccuagacgga uagccguagu uguagugguc uaaggucugu 720

gacgaccggg acgugucuuc gauggacugu ggaccucuaa gaagaagacc gaccugucga 780

ccucgacgac ggauaaugca cccgauagac gucgggucuu ggaaggacga cuucauguug 840

cucuugccgu gguagugucu acggcaccua acacgggacc uaggagacag acucugguuc 900

acaugggacu ucucgaagug gcaccucuuc ccguagaugg ucuggucguu gaagucucac 960

gucggauggc ucucguagca cucuaagggg uuguaguggu uggacacggg aaaaccgcuc 1020

cacaaguuac gguggucuaa acggucgcac augcggaccu uguccuucuc cuagucguug 1080

acacaccggc uaaugucgca cgacauguug agacggucga agucguggaa guucacgaug 1140

ccgcacagag gauguuucga cuugcuggac acgaaguggu ugcacaugcg gcugucgaag 1200

cacuaaucuc cgcuacucca cucugucuaa cgaggaccgg ucuguccguu cuaacggcug 1260

auguugaugu ucgacggacu gcugaagugg ccgacacacu aacggaccuu gucguuguua 1320

gaccugucgu uccacccgcc guugauguug auggacaugu ccgacaaguc cuucucguug 1380

gacuucggga agcucucucu guagucgugg cucuagauag uccgaccuuc guggggaaca 1440

uuaccgcacc ucccgaaguu gacaaugaag ggagacgucu cgaugccgaa agucggaugg 1500

uuaccucacc cgauagucgg aaugucucac caccacgacu cgaaacuuga cgacguacga 1560

ggacgauguc acacaccggg guucuucucg ugguuggacc acuucuuguu cacgcacuug 1620

aaguugaagu ugccggacug gccuuguccu cacgacuguc ucucguuguu cuucaaggac 1680

gggaaggucg ucaagccguc ucuauagcgg cuaugguguc uacggcacuc ucuaggaguc 1740

ugugaccucu aggaccugua guguggaacg ucgaaaccgc cucacagaca cuagugugga 1800

ccgugguuau ggucguuagu ccaccgacac gacauggucc ugcacuuaac guggcuucac 1860

ggacaccggu aaguacgacu agucgacugg ggauguaccu cucacauguc guggccgaga 1920

uuacacaagg ucuggucucg gccuacagac uaaccucggc ucgugcacuu auugucgaug 1980

cucacgcugu agggauaacc ucggccguag acacggagaa uagucugggu cugguugaga 2040

ggaucuucuc ggucuucgca ccggagaguc ucguaguaac ggauguggua cagagacccu 2100

cggcucuuau cgcaccggau gucguuauug ucguagcggu aggggugguu gaagugguag 2160

ucgcacuggu gucucuaaga cggacacucg uacugguucu ggagacaccu gacgugguac 2220

auguagacac cgcugagaug gcucacgucg uuagacgacg acgucauacc gucgaaaaca 2280

ugggucgacu ugucucggga cuguccguaa cgacaccucg uccuauucuu guggguccuc 2340

cacaaacggg uccacuucgu cuagauguuc uggggagggu aguuccugaa gccgccgaaa 2400

uugaagucgg ucuaggacgg acuaggaucg uucgggucgu ucuccucgaa auagcuccug 2460

gacgacaagu uguuccacug ggaccggcua cgaccgaaau aguucgucau gccucuaaca 2520

gacccgcuau agcggcgguc ucuggacuaa acacgggucu ucaaguuacc ugacuggcac 2580

gacggaggag acgacugucu acucuacuaa cgggucaugu guagacggga cgaccgaccg 2640

uguuagugua gaccuaccug uaaaccucga ccucgacggg acgucuaggg aaaacgguac 2700

gucuaccgga ugucuaaguu gccguagccg cacugggucu uacacgacau gcucuugguc 2760

uucgacuagc gguuggucaa guugucgcgg uagccguucu agguccuaag agacagaucg 2820

ugucggagac gagacccuuu cgacguccua caccacuuag ucuuacgggu ccgggacuua 2880

ugugaccacu ucgucgacuc gucguugaaa ccgcgguagu cgagacacga cuuacuguag 2940

gacucgucug accuguucca ccuccgacuu cacgucuagc ugucugacua guguccgucu 3000

gacgucagag acgucuggau gcacuguguc gucgacuaau cucgacggcu cuagucucga 3060

agacgguuag accgacggug guucuacaga cucacacacg acccugucuc guucucucac 3120

cugaagacac cguuuccgau gguggacuac ucgaagggag ucagacgagg aguaccucac 3180

cacaaagacg ugcacuggau acacggacgg guccucuucu ugaaguggug ucgaggacgg 3240

uaaacagugc uaccguuccg ggugaaagga ucucuuccgc acaagcacuc guuaccgugu 3300

gugaccaagc acugggucuc cuugaagaug cucggggucu aguaguggug gcuauugugg 3360

aagcacucgc cguuaacgcu gcaccacuag ccguagcacu uguuauggca caugcuagga 3420

gacgucggac ucgaccuguc gaaguuccuc cucgaccugu ucaugaaguu cuuggugugg 3480

ucgggacuac accuagaccc gcuguagaga ccguaguuac ggucgcacca cuuguagguc 3540

uuccucuagc uguccgacuu gcuccaccgg uucuuggacu uacucucgga cuagcuagac 3600

guccucgacc cguucaugcu cgucauguag uucaccggaa ccauguagac cgacccgaaa 3660

uagcggccug acuaacggua gcacuaccac ugguaguacg acacgacgua cuggucgacg 3720

acaucgacag acuucccgac aacaucgaca ccgucgacaa cguucaagcu gcuccuacua 3780

ucgcucggac acgacuuucc gcacuucgac gugaugugg 3819

<210> 33

<211> 3819

<212> RNA

<213> Artificial Sequence

<220>

<223> SDC-58 mRNA

<400> 33

uacaagcaca aggaccacga cgacggggac cacucgagag ucacacacuu ggacuggugg 60

ucuugggucg acggaggacg gaugugguug ucgaaguggu cuccgcacau gauggggcug 120

uuccacaagu cuucgucgca cgacgugucg uggguccugg acaaggacgg gaagaagucg 180

uugcacugga ccaaggugcg guaggugcac agaccguggu uaccgugguu cuccaagcug 240

uugggacacg acgggaaguu gcugccgcac augaagcggu cguggcucuu cucguuguag 300

uaguccccga ccuagaagcc guggugggac cugucguucu gggucucgga cgacuagcac 360

uuguugcggu gguugcacca cuaguuccac acgcucaagg ucaagacguu gcuggggaag 420

gacccgcaca ugaugguguu cuuguuguuc ucgaccuacc ucucgcucaa ggcccacaug 480

ucgucgcggu uguugacgug gaagcucaug cacucggucg ggaaggacua ccuggaccuc 540

ccguucgucc cguugaaguu cuuggacgcc cucaagcaca aguucuugua gcugccgaug 600

aaguucuaga ugucguucgu guggggguag uuggaccacu cucuggacgg agucccgaaa 660

agacgggacc ucggagacca ccuggacgga uagccguagu uguagugguc caaggucugg 720

gacgaccggg acgugucuuc gauggacugu ggaccgcuau cgagaagacc gaccugucga 780

ccucgacgac ggauaaugca cccgauggac gucggguccu ggaaggacga cuucauguug 840

cucuugccgu gguaguggcu gcggcaccua acacgggacc uaggagacuc gcucugguuc 900

acgugggacu ucucgaagug gcaccucuuc ccguagaugg ucuggucguu gaaggcccac 960

gucggauggc ucucguagca cuccaagggg uuguaguggu uggacacggg aaagccgcuc 1020

cacaaguugc gguggucuaa gcggagacac augcggaccu uguccuucgc cuagucguug 1080

acgcaccggc ugaugucgca cgacauguug ucgcggucga agucguggaa guucacgaug 1140

ccgcacucgg gaugguucga cuugcuggac acgaaguggu ugcacaugcg gcugucgaag 1200

cacuagucuc cgcuacucca cucugucuag cggggaccug ucuggccguu cuagcggcug 1260

auguugaugu ucgacgggcu gcugaagugg ccgacacacu agcggaccuu gucguuguug 1320

gaccugucgu uccacccgcc guugauguug auggacaugg ccgacaaggc cuucucguug 1380

gacuucggga agcucucccu guagucgugg cucuagaugg uccggccuuc guguggaacg 1440

uuaccgcacc ucccgaaguu gacgaugaag ggggacgucu cgaugccgaa agucggaugg 1500

uuaccgcacc cgauggucgg gaugucucac caccacgacu cgaaacuuga cgacguacgg 1560

ggacgguguc acacaccggg guucuucucg ugguuggacc acuucuuguu cacgcacuug 1620

aaguugaagu ugccggacug gccguguccg cacgacuggc ucucguuguu cuucaaggac 1680

gggaaggucg ucaagccguc ucuguagcgg cuaugguggc uacggcacuc ucuaggaguc 1740

ugggaccucu aggaccugua guggggaacg ucgaaaccgc cucacucgca cuagugugga 1800

ccgugguugu ggucguuagu ccaccggcac gacauggucc ugcacuugac gugucuccac 1860

ggacaccggu aaguacggcu agucgacugg ggauggaccu cucacauguc guggccgucg 1920

uuacacaagg ucuggucucg gccgacagac uagccucggc ucgugcacuu guugucgaug 1980

cucacgcugu agggauagcc ucggccguag acgcggagaa uggucugugu cugguugucg 2040

gggucuucuc ggucuucgca ccggucgguc agauaguagc ggauguggua cucggacccu 2100

cggcucuugu cgcaccggau gucguuguug ucguagcggu aggggugguu gaagugguag 2160

ucgcacuggu ggcucuagga cgggcacucg uacugguucu ggucgcaccu gacgugguac 2220

auguagacgc cgcugucgug ucucacgucg uuggacgacg acgucaugcc gucgaaaacg 2280

ugggucgacu ugucucggga cuguccguaa cggcaccucg uccuguucuu guggguccuc 2340

cacaagcggg uccacuucgu cuagauguuc uggggggggu aguuccugaa gccgccgaag 2400

uugaagucgg ucuaggacgg acugggaucg uucgggucgu ucgccucgaa guagcuccug 2460

gacgacaagu uguuccacug ggaccggcua cggccgaagu aguucgucau gccgcuaaca 2520

gacccgcuau agcggcgguc ucuggacuag acacgggucu ucaaguugcc ggacuggcac 2580

gacggaggag acgacugucu acucuacuag cgggucaugu ggagacggga cgaccggccu 2640

ugguagugua gaccgaccug uaaaccucga ccucggcggg acgucuaggg aaagcgguac 2700

gucuaccgga uguccaaguu gccguagccg cacugggucu ugcacgacau gcucuugguc 2760

uucgacuagc gguuggucaa guugucgcgg uagccguucu agguccuguc ggacagaucg 2820

ugucggagac gagacccguu cgacguccua caccacuugg ucuuacgggu ccgggacuug 2880

ugggaccacu ucgucgacuc gucguuaaag ccgcgguagu cgucgcacga cuugcuguag 2940

gacucgucug accuguucca ccuccggcuc cacgucuagc ugucugacua guggccgucu 3000

gacgucucgg acgucuggau gcacuguguc gucgacuagu cucggcggcu cuagucucgg 3060

agacgguuag accgacggug guucuacucg cucacacacg acccggucuc guucucucac 3120

cugaagacgc cguuuccgau gguggacuac ucgaaggggg ucagacgagg aguaccgcac 3180

cacaaagacg ugcacuggau gcacggacgg guccucuucu ugaaguggug ucggggacgg 3240

uagacagugc uaccguuccg ggugaaggga ucucucccgc acaagcacuc guuaccgugg 3300

gugaccaagc acugggucgc cuugaagaug cucggggucu aguaguggug gcuguugugg 3360

aagcacucgc cguugacgcu gcaccacuag ccguagcacu uguuguggca caugcuggga 3420

gacgucgggc ucgaccuguc gaaguuccuc cucgaccugu ucaugaaguu cuuggugugg 3480

ucggggcugc accuagaccc gcuguagucg ccguaguugc ggucgcacca cuuguagguc 3540

uuccucuagc uggccgacuu gcuccaccgg uucuuggacu ugcucucgga cuagcuggac 3600

guccucgacc cguucaugcu cgucauguag uucaccggga ccauguagac cgacccgaaa 3660

uagcggccgg acuagcggua gcacuaccac ugguaguacg acacgacgua cuggucgacg 3720

acgucgacgg acuucccgac aacaucgaca ccgucgacga cguucaagcu gcuccugcua 3780

ucgcucggac acgacuuccc gcacuucgac gugaugugg 3819

<210> 34

<211> 3819

<212> RNA

<213> Artificial Sequence

<220>

<223> SDC-60 mRNA

<400> 34

uacaagcaca aggaccacga cgacggggac cacucgucgg ucacacacuu ggacuggugg 60

ucuugggucg acggagggcg gaugugguug ucgaaguggu ccccgcacau gauggggcug 120

uuccacaagu ccucgucgca cgacgugucg uggguccugg acaaggacgg gaagaagucg 180

uugcacugga ccaaggugcg guaggugcac ucgccguggu uaccgugguu cgccaagcug 240

uugggacacg acgggaaguu gcugccgcac augaagcggu cguggcucuu cucguuguag 300

uaggccccga ccuagaagcc guggugggac cugucguucu gggucucgga cgacuagcac 360

uuguugcggu gguugcacca cuaguuccac acgcucaagg ucaagacguu gcuggggaag 420

gacccgcaca ugaugguguu cuuguuguuc ucgaccuacc ucucgcucaa ggcccacaug 480

ucgucgcggu uguugacgug gaagcucaug cacucggucg ggaaggacua ccuggaccuc 540

ccguucgucc cguugaaguu cuuggacgcc cucaagcaca aguucuugua gcugccgaug 600

aaguucuaga ugucguucgu guggggguag uuggaccacu cccuggacgg agucccgaaa 660

agacgggacc ucggagacca ccuggacggg uagccguagu uguagugguc caaggucugg 720

gacgaccggg acguguccuc gauggacugu ggaccgcuau cgagaagacc gaccugucgg 780

ccucgacgac ggaugaugca cccgauggac gucggggccu ggaaggacga cuucauguug 840

cucuugccgu gguaguggcu gcggcaccua acgcgggacc uaggagacuc gcucugguuc 900

acgugggacu ucucgaagug gcaccucuuc ccguagaugg ucuggucguu gaaggcccac 960

gucggguggc ucucguagca cuccaagggg uuguaguggu uggacacggg gaagccgcuc 1020

cacaaguugc gguggucuaa gcggucgcac augcggaccu uggccuucgc cuagucguug 1080

acgcaccggc ugaugucgca cgacauguug ucgcggucga agucguggaa guucacgaug 1140

ccgcacucgg ggugguucga cuugcuggac acgaaguggu ugcacaugcg gcugucgaag 1200

cacuaguccc cgcuacucca cucugucuag cggggaccgg ucuggccguu cuagcggcug 1260

auguugaugu ucgacgggcu gcugaagugg ccgacgcacu agcggaccuu gucguuguug 1320

gaccugucgu uccacccgcc guugauguug auggacaugg ccgacaaggc cuucucguug 1380

gacuucggga agcucgcccu guagucgugg cucuagaugg uccggccuuc guggggaacg 1440

uugccgcacc ucccgaaguu gacgaugaag ggggacgucu cgaugccgaa ggucggaugg 1500

uuaccgcacc cgauggucgg gaugucccac caccacgacu cgaaacucga cgacguacga 1560

ggacgguggc acacgccggg guucuucucg ugguuggacc acuucuuguu cacgcacuug 1620

aaguugaagu ugccggacug gccguggccg cacgacuggc ucucguuguu cuucaaggac 1680

gggaaggucg ucaagccguc ccuguagcgg cuaugguggc uacggcacuc ucugggaguc 1740

ugggaccucu aggaccugua guggggaacg ucgaagccgc cucacucgca cuagugugga 1800

ccgugguugu ggucguuggu ccaccggcac gacauggucc ugcacuugac guggcuccac 1860

ggacaccggu aagugcggcu agucgacugg ggguggaccu cucacauguc guggccgucg 1920

uugcacaagg ucuggucucg gccgacagac uagccgcggc ucgugcacuu guugucgaug 1980

cucacgcugu agggguagcc gcggccguag acacggucga uagucugggu cugguugucg 2040

ggauccuccc ggucuucgca ccggucgguc agauaguagc ggauguggua cucggacccg 2100

cggcucuugu cgcaccggau gucguuguug ucguagcggu aggggugguu gaagugguag 2160

ucgcacuggu ggcucuagga cgggcacucg uacugguucu ggucgcaccu gacgugguac 2220

auguagacgc cgcugucgug gcucacgucg uuggacgacg acgucaugcc gucgaagacg 2280

ugggucgacu ugucucggga cuguccguag cggcaccucg uccuguucuu guggguccuc 2340

cacaagcggg uccacuucgu cuagauguuc uggggggggu aguuccugaa gccgccgaag 2400

uugaagucgg ucuaggacgg acuggggucg uucgggucgu ucgccucgaa guagcuccug 2460

gacgacaagu uguuccacug ggaccggcug cggccgaagu aguucgucau gccgcugaca 2520

gacccgcugu agcggcgguc ucuggacuag acacgggucu ucaaguugcc ggacuggcac 2580

gacgggggag acgacuggcu acucuacuag cgggucaugu ggagacggga cgaccggccg 2640

ugguagugua gaccgaccug gaaaccucga ccucggcggg acgucuaggg aaagcgguac 2700

gucuaccgga uggccaaguu gccguagccg cacugggucu ugcacgacau gcucuugguc 2760

uucgacuagc gguuggucaa guugucgcgg uagccguucu agguccuguc ggacucgucg 2820

uggcggagac gagacccguu ugacguccug caccacuugg ucuugcgggu ccgggacuug 2880

ugggaccacu ucgucgacuc gucguugaag ccgcgguagu cgucgcacga cuugcuguag 2940

gacucguccg accuguucca ccuccggcuc cacgucuagc uguccgacua guggccgucu 3000

gacgucucgg acgucuggau gcacuggguc gucgacuagu cucggcggcu cuagucucgg 3060

agacgguuag accggcggug guucuacucg cucacacacg acccggucuc guucucccac 3120

cugaagacgc cguucccgau gguggacuac ucgaaggggg ucagacgggg aguaccgcac 3180

cacaaggacg ugcacuggau gcacggacgg guccucuucu ugaaguggug gcggggacgg 3240

uagacggugc uaccguuccg ggugaaggga ucucucccgc acaagcacuc guugccgugg 3300

gugaccaagc acugggucgc cuugaagaug cucggggucu aguaguggug gcuguugugg 3360

aagcacucgc cguugacgcu gcaccacuag ccguagcacu uguuguggca caugcugggg 3420

gacgucgggc ucgaccuguc gaaguuccuc cucgaccugu ucaugaaguu cuuggugugg 3480

ucggggcugc accuggaccc gcuguagucg ccguaguugc ggucgcacca cuuguagguc 3540

uuccucuagc uggccgacuu gcuccaccgg uucuuggacu ugcucucgga cuagcuggac 3600

guccucgacc cguucaugcu cgucauguag uucaccggga ccauguagac cgacccgaag 3660

uagcggccgg acuagcggua gcacuaccac ugguaguacg acacgacgua cuggucgacg 3720

acgucgacgg acuucccgac gacgucgaca ccgucgacaa cguucaagcu gcuccugcug 3780

ucgcucgggc acgacuuccc gcacuucgac gugaugugg 3819

<210> 35

<211> 957

<212> RNA

<213> Artificial Sequence

<220>

<223> MT2AE mRNA

<400> 35

uaccggcuaa gauuaccgug guaguggcac cuucucgacu ucuucgacga gcucguuacc 60

uuggaccacu agccuaaaga caaggacugg accuagacag acaacgucaa gcggaugcgg 120

uuggccuugu cuaaggacau guaguaguuu gacuagaagg acaccgacga caccggacac 180

ugggaccgga cgaagcacga ccggcggcac auggccuaau ugaccuagug gccuccguag 240

cgauagcggu accguacgga ccagccugaa uacaccgaca gaauaaagua gcggucgaag 300

ucugacaagc gaucuugguc uucguacacc aggaaauugg gacucuguuu guaggacgac 360

uugcacggag acgugccgug uuaggacugu gccggugacg accuuucgcu cgaccaguag 420

ccgcggcacu aggacgcccc gguagacgcg uaacggccug ugguggaccc gucuacgcug 480

uaguuccugg acggguuccu uuaguggcac cgguggucgu cuugugacag gaugauguuu 540

gacccgcgau cagucucuca ccggccgcug ucgccgaagc ggcgaauaag aucuaugucu 600

uagccguuga uguucgacuu auggcuagug ucgucgucgu cgcuguugua gcgggacgac 660

cacgucccgu cgccgcuccc gucuccuucg gacgacugua caccgcuaca ccuucucuug 720

gggccgggac gguacauguc gaaacacaga cuccuuuggc cgugggacua gcacuugucg 780

cacgacgaca aagaccggaa gcagcacaag gacgaccacu gugaccgaua ggacuggcgg 840

gacuccgaca cgcggaugac gacguuguag cacuuacaua gggaccacuu cggaaggaag 900

augcacaugu cggcccacuu cuuggaauua ucgagaucuc acgggcugga cgagcaa 957

<210> 36

<211> 960

<212> RNA

<213> Artificial Sequence

<220>

<223> MP2AE mRNA

<400> 36

uaccggcugu cguugccgug uuagugucac cuucucgacu ucuucgacga ccucgucacc 60

uuggaccacu aaccuaaaga aaaggagugg accuagacgg acgacgucaa gcggauacgg 120

uuggccuugu cuaaggacau guaguaguuc gacuagaagg acaccgacga caccgggcac 180

ugggaccgga caaaacacga ccggcggcac auggccuagu ugaccuagug gccgccuuag 240

cgauagcggu accggacgga ccacccggac uacaccgacu cgaugaagua gcggaggaaa 300

ucugacaagc ggucuugguc uucguacacc aggaaguugg gacucuguuu auaggacgag 360

uugcacgggg acgugccgug guaggacugg gccggagacg agcucucgcu cgaccacuag 420

ccgcggcacu aggacucucc gguggacucu uagcggccug ugguggaccc gucuacgcug 480

uaguuccugg acgguuuccu uuaguggcaa cgauguucgu cuugugacag gaugauguuc 540

gacccgcgaa gaguuucuca gcggccgcug ucgccgaagc gacgaauauc ggcgaugucc 600

uaaccuuuaa uguucgacuu guggcuagua agaagaucgu cgcuguugua gcgggacgac 660

caggucccgu cgccgcggug uuugaagucg gacgaauuug uccggccgcu acaccuucuc 720

uuggggccgg gacgguacau gucgaagcac aggcuccuuu ggccguggga cuagcacuug 780

ucgcacgacg acaaggaacg aaaacaccac aaggacgacc agugggaccg guaggacugg 840

cgggacucug acacacggau gacgacguug uagcacuuac acagagacca cuucggaucg 900

aagaugcaca ugucggccca cuuuuuggac uugagaucgg cccacggacu agacgaccac 960

<210> 37

<211> 798

<212> RNA

<213> Artificial Sequence

<220>

<223> SGS-RBD mRNA

<400> 37

uaccucuguc ugugugagga cgauacccau gacgacgaga cccaaggucc aagguggccu 60

cugacgggua aaccgcucca caaguugcgu ugggcgaagc ggucgcacau gcggaccuua 120

uccuucgccu agucguugac gcaccggcug auaucgcacg acauguugag gcggagaaag 180

ucguggaaau ucacgauacc gcacaggggg uguuucgacu uacuggacac gaaaugguug 240

cagaugcggc uaagaaagca cuaguccccg cugcuccacg cggucuagcg ggggccgguc 300

uguccguucu agcgucugau guuaauauuc gacggucugc uaaaguggcc gacgcacuag 360

cggaccuugu cguuguuaga ccuaagguuu cacccgccgu ugauguuaau agacauggcc 420

gacaaaucuu ucucguuaga cuucgggaag cucucccugu agagaugucu uuagaugguc 480

cggccgucgu ggggaacguu accgcaccuc ccgaaauuga caauaaaggg ugaggucagg 540

augccgaagg ucggguguuu gccgcacccg auagucggaa uggcgcacca ccacgacucg 600

aaacucgacg acgugcggau gggcaugcug cacggccuga ugcgguuacg acacccgguc 660

cugugcgucc uccaguagca ccacggugug aggaacggga aauuccacca ccacuagagu 720

cgguaggacc gggaccacca cgagugguag uagagggaau aguaggagua guacgaaacc 780

gucuucuucg gugcaauc 798

<210> 38

<211> 66

<212> DNA

<213> Artificial Sequence

<220>

<223> T2A DNA

<400> 38

ggcagcggcg agggcagagg aagcctgctg acatgtggcg atgtggaaga gaaccccggc 60

cctgcc 66

<210> 39

<211> 69

<212> DNA

<213> Artificial Sequence

<220>

<223> P2A DNA

<400> 39

ggcagcggcg ccacaaactt cagcctgctt aaacaggccg gcgatgtgga agagaacccc 60

ggccctgcc 69

<210> 40

<211> 66

<212> RNA

<213> Artificial Sequence

<220>

<223> T2A mRNA

<400> 40

ccgucgccgc ucccgucucc uucggacgac uguacaccgc uacaccuucu cuuggggccg 60

ggacgg 66

<210> 41

<211> 69

<212> RNA

<213> Artificial Sequence

<220>

<223> P2A mRNA

<400> 41

ccgucgccgc gguguuugaa gucggacgaa uuuguccggc cgcuacaccu ucucuugggg 60

ccgggacgg 69

<210> 42

<211> 22

<212> PRT

<213> Thosea asigna virus 2A

<400> 42

Gly Ser Gly Glu Gly Arg Gly Ser Leu Leu Thr Cys Gly Asp Val Glu

1 5 10 15

Glu Asn Pro Gly Pro Ala

20

<210> 43

<211> 23

<212> PRT

<213> Porcine teschovirus-1 2A

<400> 43

Gly Ser Gly Ala Thr Asn Phe Ser Leu Leu Lys Gln Ala Gly Asp Val

1 5 10 15

Glu Glu Asn Pro Gly Pro Ala

20
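
As a reading aid, the 2A linker entries can be cross-checked against one another. The sketch below is an illustration only and is not part of the patent's listing: the helper names `complement_to_rna` and `translate` are hypothetical, and the standard genetic code is assumed. It checks that SEQ ID NO. 40 and 41, as listed, are the base-by-base complements of SEQ ID NO. 38 and 39 (written in the same left-to-right order, with u in place of t), and that translating SEQ ID NO. 38 and 39 yields the peptides of SEQ ID NO. 42 and 43.

```python
# Sequences copied from SEQ ID NO. 38-43 of this listing (spaces removed).
T2A_DNA = "ggcagcggcgagggcagaggaagcctgctgacatgtggcgatgtggaagagaaccccggccctgcc"
P2A_DNA = "ggcagcggcgccacaaacttcagcctgcttaaacaggccggcgatgtggaagagaaccccggccctgcc"
T2A_RNA = "ccgucgccgcucccgucuccuucggacgacuguacaccgcuacaccuucucuuggggccgggacgg"
P2A_RNA = "ccgucgccgcgguguuugaagucggacgaauuuguccggccgcuacaccuucucuuggggccgggacgg"
T2A_PEPTIDE = "GSGEGRGSLLTCGDVEENPGPA"   # SEQ ID NO. 42, one-letter code
P2A_PEPTIDE = "GSGATNFSLLKQAGDVEENPGPA"  # SEQ ID NO. 43, one-letter code


def complement_to_rna(dna: str) -> str:
    """Base-by-base RNA complement, kept in the same left-to-right order."""
    pairs = {"a": "u", "c": "g", "g": "c", "t": "a"}
    return "".join(pairs[base] for base in dna.lower())


def translate(dna: str) -> str:
    """Translate a DNA coding sequence using the standard genetic code."""
    bases = "TCAG"
    amino_acids = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
                   "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
    table = {
        a + b + c: amino_acids[16 * i + 4 * j + k]
        for i, a in enumerate(bases)
        for j, b in enumerate(bases)
        for k, c in enumerate(bases)
    }
    seq = dna.upper()
    usable = len(seq) - len(seq) % 3  # ignore any trailing partial codon
    return "".join(table[seq[i:i + 3]] for i in range(0, usable, 3))


if __name__ == "__main__":
    print(complement_to_rna(T2A_DNA) == T2A_RNA)
    print(complement_to_rna(P2A_DNA) == P2A_RNA)
    print(translate(T2A_DNA) == T2A_PEPTIDE)
    print(translate(P2A_DNA) == P2A_PEPTIDE)
```

The same `complement_to_rna` check could be applied to the longer entries, assuming the corresponding DNA sequences from earlier in the listing are pasted in the same way.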
