mRNA and novel coronavirus mRNA vaccine containing same
Reading note: this technology, "mRNA and novel coronavirus mRNA vaccine containing same", was created by Wang Bing (王冰) and Yu Hang (俞航) on 2020-05-28. Abstract: The invention provides mRNA comprising mRNA encoding one, two, three or four proteins, or fragments thereof, selected from the S protein, E protein, M protein and N protein derived from SARS-CoV-2 virus, wherein the sequence of the mRNA encoding the S protein is shown as SEQ ID NO. 18, SEQ ID NO. 19 or SEQ ID NO. 20; the sequence of the mRNA encoding the E protein is shown as SEQ ID NO. 21; the sequence of the mRNA encoding the M protein is shown as SEQ ID NO. 22; and the sequence of the mRNA encoding the N protein is shown as SEQ ID NO. 23. Also provided are a liposomal nanoparticle comprising the mRNA, an mRNA vaccine against the novel coronavirus, and the like. The mRNA of the invention efficiently produces viral proteins at the cellular level, and the proteins produced can self-assemble into virus-like particles. A vaccine prepared from the mRNA of the invention is highly safe and effective and does not generate non-neutralizing antibodies, and therefore does not produce antibody-dependent enhancement of infection.
1. An mRNA comprising mRNA encoding one, two, three or four proteins, or fragments thereof, selected from the S protein, the E protein, the M protein and the N protein of SARS-CoV-2 virus,
wherein the sequence of the mRNA encoding the S protein is shown as SEQ ID NO. 18, SEQ ID NO. 19 or SEQ ID NO. 20; the sequence of the mRNA encoding the E protein is shown as SEQ ID NO. 21; the sequence of the mRNA encoding the M protein is shown as SEQ ID NO. 22; and the sequence of the mRNA encoding the N protein is shown as SEQ ID NO. 23.
2. The mRNA of claim 1, wherein the fragment is a fragment of the RBD domain of the S protein, and the mRNA preferably has the sequence shown in SEQ ID No. 37.
3. The mRNA of claim 1 or 2, further comprising one or more of the following (a) to (e):
(a) a 5'-cap structure, preferably 3'-O-Me-m7G(5')ppp(5')G, m7G(5')ppp(5')(2'OMeA)pG or m7(3'OMeG)(5')ppp(5')(2'OMeA)pG;
(b) a 3'-poly(A) tail, whose sequence preferably comprises about 25 to about 400 adenosine nucleotides, preferably about 50 to about 400 adenosine nucleotides, more preferably about 50 to about 300 adenosine nucleotides, even more preferably about 50 to about 250 adenosine nucleotides, still more preferably about 60 to about 250 adenosine nucleotides, and most preferably consists of 120 adenosine nucleotides;
(c) a 5'-UTR, the sequence of which is preferably shown in SEQ ID NO. 15;
(d) a 3'-UTR, the sequence of which is preferably derived from the 3'-UTR of a gene providing a stable mRNA, more preferably as shown in SEQ ID NO. 16 or SEQ ID NO. 17;
(e) a nucleotide modification, preferably one or more of 5-methyl-CTP, pseudo-UTP, N1-methylpseudo-UTP and 5-methoxy-UTP;
preferably:
the mRNA of the N protein comprises a modification of 5-methyl-CTP, pseudo-UTP, N1-methylpseudo-UTP or 5-methoxy-UTP, or comprises a modification of both 5-methyl-CTP and pseudo-UTP;
and/or, the mRNA of said E protein comprises a modification of 5-methyl-CTP, pseudo-UTP or N1-methylpseudo-UTP;
and/or, when the sequence of the mRNA encoding said S protein is as shown in SEQ ID NO. 18, the mRNA of said S protein comprises a modification of 5-methyl-CTP, pseudo-UTP or N1-methylpseudo-UTP, or comprises a modification of both 5-methyl-CTP and pseudo-UTP, and preferably comprises a modification of pseudo-UTP or N1-methylpseudo-UTP; or, when the sequence of the mRNA encoding said S protein is as shown in SEQ ID NO. 19, the mRNA of said S protein comprises a modification of pseudo-UTP, or comprises a modification of both 5-methyl-CTP and pseudo-UTP; or, when the sequence of the mRNA encoding said S protein is as shown in SEQ ID NO. 20, the mRNA of said S protein comprises a modification of pseudo-UTP or N1-methylpseudo-UTP.
4. The mRNA of any one of claims 1 to 3, wherein the mRNA comprises mRNAs encoding an S protein, an E protein and an M protein derived from SARS-CoV-2 virus, wherein the S protein, the E protein and the M protein are expressed from three separate mRNAs, and the molar ratio of the mRNAs expressing the S protein, the E protein and the M protein is preferably 1:(0.5-2):(0.5-2), such as 1:1:1;
or, the mRNA comprises mRNA encoding the M protein and the E protein derived from SARS-CoV-2 virus, the mRNA of the M protein and the mRNA of the E protein preferably being linked and then expressed, the linkage preferably being made through a sequence of mRNA encoding a 2A peptide, wherein the amino acid sequence of the 2A peptide is preferably shown as SEQ ID NO. 42 or SEQ ID NO. 43, the DNA sequence encoding the 2A peptide is preferably shown as SEQ ID NO. 38 or SEQ ID NO. 39, the mRNA sequence encoding the 2A peptide is preferably shown as SEQ ID NO. 40 or SEQ ID NO. 41, the sequence of the linked mRNA is preferably shown as SEQ ID NO. 35 or 36, and the DNA sequence thereof is preferably shown as SEQ ID NO. 28 or 29;
alternatively, the mRNA comprises mRNA encoding an S protein derived from SARS-CoV-2 virus;
alternatively, the mRNA comprises mRNA encoding the RBD domain derived from the S protein of SARS-CoV-2 virus;
or, the mRNA comprises mRNA encoding the M protein, the E protein and the S protein derived from SARS-CoV-2 virus, the mRNA of the M protein and the mRNA of the E protein being linked and then expressed, the linkage preferably being made through a sequence of mRNA encoding a 2A peptide, wherein the amino acid sequence of the 2A peptide is preferably shown as SEQ ID NO. 42 or SEQ ID NO. 43, the DNA sequence encoding the 2A peptide is preferably shown as SEQ ID NO. 38 or SEQ ID NO. 39, the mRNA sequence encoding the 2A peptide is preferably shown as SEQ ID NO. 40 or SEQ ID NO. 41, the sequence of the linked mRNA is preferably shown as SEQ ID NO. 35 or 36, the DNA sequence thereof is preferably shown as SEQ ID NO. 28 or 29, and the molar ratio of the linked mRNA to the mRNA of the S protein is preferably 1.5:1 to 3:1, such as 2:1.
5. The mRNA of any one of claims 1 to 4, wherein when the mRNA comprises mRNA encoding two, three or four proteins or fragments thereof from the S, E, M and N proteins of SARS-CoV-2 virus, the proteins encoded by the mRNA self-assemble into virus-like particles.
6. A DNA comprising a DNA encoding at least one protein selected from the group consisting of S protein, E protein, M protein and N protein derived from SARS-CoV-2 virus, or a fragment thereof,
wherein, the sequence of the DNA for coding the S protein is shown as SEQ ID NO.3, SEQ ID NO.4 or SEQ ID NO. 5; the sequence of the DNA for coding the E protein is shown as SEQ ID NO. 8; the sequence of the DNA for coding the M protein is shown as SEQ ID NO. 11; the sequence of the DNA for coding the N protein is shown as SEQ ID NO. 13;
preferably, the fragment is a fragment of the RBD domain of the S protein, and the DNA sequence of the fragment is preferably shown in SEQ ID NO. 30.
7. A composition comprising more than one of the mRNAs of any one of claims 1 to 5 and/or the DNA of claim 6.
8. A liposomal nanoparticle comprising the mRNA of any one of claims 1 to 5, the DNA of claim 6, and/or the composition of claim 7;
preferably:
the liposomal nanoparticles further comprise a cationic lipid, preferably DLin-MC3-DMA or DOTMA, and a helper lipid, preferably DSPC and/or cholesterol;
and/or the liposome nanoparticle is a long-circulating cationic liposome nanoparticle, preferably a long-circulating cationic liposome nanoparticle modified by PEG or a derivative thereof; the relative molecular mass of the PEG is preferably 2000-5000, such as 2000, 3000, 4000 or 5000; more preferably, the long circulating cationic liposome nanoparticles comprise DMPE-PEG 2000.
9. A virus-like particle self-assembled from proteins expressed from a composition comprising the mRNA of any one of claims 1 to 5, the DNA of claim 6 and/or the composition of claim 7, preferably expressed in a cell, preferably 293T and/or 293A;
preferably:
the virus-like particle is formed by self-assembling proteins expressed by mRNA of two, three or four proteins or fragments thereof in S protein, E protein, M protein and N protein of SARS-CoV-2 virus, and preferably expresses the proteins in cells;
more preferably:
the virus-like particle is obtained by self-assembly of the S protein, the E protein and the M protein of SARS-CoV-2 virus expressed from three separate mRNAs encoding them, and the molar ratio of the mRNAs expressing the S protein, the E protein and the M protein is preferably 1:(0.5-2):(0.5-2), such as 1:1:1;
or, the virus-like particle is formed by self-assembly of proteins expressed from mRNA of the M protein and the E protein of SARS-CoV-2 virus, the proteins preferably being expressed in cells, the mRNA of the M protein and the mRNA of the E protein preferably being linked and then expressed, the linkage preferably being made through a sequence of mRNA encoding a 2A peptide, wherein the amino acid sequence of the 2A peptide is preferably shown as SEQ ID NO. 42 or SEQ ID NO. 43, the DNA sequence encoding the 2A peptide is preferably shown as SEQ ID NO. 38 or SEQ ID NO. 39, the mRNA sequence encoding the 2A peptide is preferably shown as SEQ ID NO. 40 or SEQ ID NO. 41, the sequence of the linked mRNA is preferably shown as SEQ ID NO. 35 or 36, and the DNA sequence thereof is preferably shown as SEQ ID NO. 28 or 29;
or, the virus-like particle is formed by self-assembly of proteins expressed from mRNA of the M protein, the E protein and the S protein of SARS-CoV-2 virus, the proteins preferably being expressed in cells, the mRNA of the M protein and the mRNA of the E protein being linked and then expressed, the linkage preferably being made through a sequence of mRNA encoding a 2A peptide, wherein the amino acid sequence of the 2A peptide is preferably shown as SEQ ID NO. 42 or SEQ ID NO. 43, the DNA sequence encoding the 2A peptide is preferably shown as SEQ ID NO. 38 or SEQ ID NO. 39, the mRNA sequence encoding the 2A peptide is preferably shown as SEQ ID NO. 40 or SEQ ID NO. 41, the sequence of the linked mRNA is preferably shown as SEQ ID NO. 35 or 36, the DNA sequence thereof is preferably shown as SEQ ID NO. 28 or 29, and the molar ratio of the linked mRNA to the mRNA of the S protein is preferably 1.5:1 to 3:1, for example 2:1.
10. An mRNA vaccine against the novel coronavirus, characterized in that it comprises the mRNA according to any one of claims 1 to 5, the DNA according to claim 6, the composition according to claim 7 and/or the liposomal nanoparticle according to claim 8;
preferably, the mRNA vaccine induces the production of virus-like particles by cells; and/or, the mRNA vaccine further comprises an adjuvant.
11. A pharmaceutical composition comprising the mRNA of any one of claims 1 to 5, the DNA of claim 6, the composition of claim 7, the liposomal nanoparticle of claim 8, the virus-like particle of claim 9, and/or the mRNA vaccine of claim 10, and optionally a pharmaceutically acceptable carrier.
12. A kit comprising the mRNA of any one of claims 1 to 5, the DNA of claim 6, the composition of claim 7, the liposomal nanoparticle of claim 8, the virus-like particle of claim 9, the mRNA vaccine of claim 10, and/or the pharmaceutical composition of claim 11.
13. An mRNA encoding a 2A peptide, having the sequence shown in SEQ ID NO. 40 and/or SEQ ID NO. 41.
14. A DNA encoding a 2A peptide, having the sequence shown in SEQ ID NO. 38 or SEQ ID NO. 39.
Technical Field
The invention relates to mRNA and a novel coronavirus mRNA vaccine containing the same, and also relates to liposome nanoparticles, a pharmaceutical composition, a kit and the like.
Background
In recent years, messenger RNA (mRNA) therapies based on in vitro transcription (IVT) have shown great potential. The principle is that mRNA prepared in vitro is formulated into a drug that is delivered to tissues in vivo and endocytosed by cells; once inside the cell, the exogenous mRNA is recognized by ribosomes, which synthesize the corresponding protein from its coding information. As early as 1990, Wolff et al. demonstrated that mRNA injected into mice could be translated into protein [7]. In 1992, Jirikowski et al. showed that vasopressin mRNA injected into the hypothalamus alleviated the symptoms of diabetes insipidus in rats [8]. mRNA drugs have many theoretical advantages: compared with DNA therapy, mRNA does not need to enter the nucleus and carries no risk of insertional mutagenesis through genome integration; compared with protein drugs, mRNA uses the cell's own translation machinery to achieve efficient, dose-dependent expression of active protein, which addresses the problem that some proteins cannot be made into drugs. However, mRNA was long hampered by problems of in vitro preparation, stability and delivery. Only recently have IVT combined with chemical and enzymatic capping, the introduction of modified nucleotides, and HPLC purification made large-scale in vitro preparation of mRNA possible [9,10]. Liposomes and lipid nanoparticles, following their success in siRNA delivery, have likewise been shown to be suitable for mRNA encapsulation and delivery [11]. These breakthroughs have greatly improved the druggability of mRNA; more than 25 mRNA drugs, including mRNA vaccines and protein replacement therapies, are currently under development [12], and competition to bring the first mRNA product to market is in full swing. More and more researchers are focusing on applications of mRNA drugs, while research in this field in China is just beginning.
One of the most promising applications of mRNA drugs is vaccines, including tumor vaccines and infectious disease vaccines. An mRNA molecule encoding the antigen protein can be used for human immunization once it has been synthesized in vitro and formulated; the process involves no handling of live virus cultures, which greatly shortens development time [13]. mRNA vaccines have seen continued breakthroughs in recent years. In a 2013 study, researchers designed and prepared an mRNA vaccine against H7N9 influenza virus that succeeded in mouse experiments [14]. In 2015, an mRNA vaccine against HIV produced a humoral immune response in non-human primates. In 2017, an mRNA vaccine against Zika virus effectively protected mice under viral challenge [15] and reduced the risk of infection in pregnant mice [16]. Beyond success in animal experiments, mRNA vaccines (e.g., influenza and Zika vaccines) have entered clinical trials, and phase I clinical results for Moderna's influenza mRNA vaccine showed immunogenicity and possible superiority over traditional vaccines [17]. The same company's Zika virus vaccine, mRNA-1893, was also granted Fast Track designation by the U.S. FDA last year. The technical advantages of IVT mRNA make it possible to respond effectively to the high mutation rate of viruses and to develop vaccines rapidly against newly emerging epidemics; IVT mRNA is therefore expected to become a breakthrough direction for improving the prevention and treatment of emerging infectious diseases.
Conventional vaccines for preventing viral disease include recombinant protein vaccines, inactivated vaccines, live attenuated vaccines, and in vitro recombinant virus-like particles (VLPs). Historically, inactivated or attenuated vaccines have been the first choice because they resemble the authentic virus in form and composition and produce a strong immune response. They nevertheless have unavoidable disadvantages: their production cycle is long; some viruses, such as norovirus, cannot be cultured on a large scale; inactivated virus may fail to induce an immune response; and attenuated vaccines carry a risk of reversion to virulence. An in vitro recombinant VLP vaccine is an empty-shell structure assembled from viral capsid or envelope proteins alone. It can rapidly stimulate humoral and cellular immune responses, contains neither viral genetic material nor immunosuppressive proteins, and is currently regarded as the safest class of novel candidate vaccine; a variety of VLP-based vaccine products exist [18]. Following the SARS-CoV outbreak in 2002 and the later MERS-CoV outbreak, various vaccine approaches were investigated, including inactivated or attenuated strains, recombinant-DNA-based S protein, and in vitro recombinant virus-like particles [19,20]. The S protein is the main protein mediating viral entry and the main target of neutralizing antibodies, and is therefore of particular interest for vaccine development. Animal experiments have shown that these vaccines all have protective effects, but safety is the greatest concern. For example, vaccines based on the full-length S protein antigen generate large amounts of non-neutralizing antibodies, which play an important role in antibody-dependent enhancement (ADE) of infection [21] and can accelerate disease progression rather than prevent it, posing a significant vaccine safety problem.
Because the body can synthesize any protein from the coding information once it receives an mRNA drug, mRNA is extremely flexible in the choice of vaccine antigen. Nevertheless, given the advantages of virus-like particles, most viral mRNA vaccines in the clinic today, such as the Zika virus vaccine [15], use virus-like particles as the final antigen display form.
mRNA vaccines offer many advantages, but most of these remain theoretical and require extensive basic and clinical research. An effective mRNA vaccine that induces the synthesis of virus-like particles in vivo must meet two conditions: first, expression efficiency must be high, so that enough virus-like particles are produced to stimulate an immune response; second, the virus-like particles produced must match the real virus in morphology and structural composition, so that the body acquires immunity to the real virus. The nature of coronaviruses themselves, however, poses many challenges for development. Coronaviruses are positive-sense single-stranded RNA viruses whose envelope is a lipid bilayer into which the structural proteins M (membrane), E (envelope) and S (spike) are inserted. Of these, the spike (S) protein is the most important surface protein of the coronavirus and determines the host range and specificity of the virus. The S protein is also the main site of action of host neutralizing antibodies and has therefore been a key target in vaccine design for SARS-CoV and MERS-CoV. Coronaviruses additionally have a nucleoprotein (N), which wraps the viral genome in the interior of the particle. Besides binding the genome, the N protein contributes to shaping the envelope and is therefore also regarded as a structural protein. One characteristic of coronaviruses is that their morphology and size are not completely fixed; their diameters range from 80 to 200 nm. Consequently, even with high-resolution cryo-electron microscopy, the atomic structure of the whole virus cannot be obtained by single-particle analysis. The proportions of the structural proteins in the coronavirus envelope are likewise not fixed and depend on the amount of each structural protein present when the virus assembles in the cell.
This differs from Zika virus, which is also an enveloped virus but has a fixed morphology, a rigid icosahedral structure, a single structural protein with a fixed copy number, and no spikes. Thus, although both approaches rely on synthesizing virus-like particles, designing a coronavirus mRNA vaccine is far more complex than designing one for Zika virus. First, the Zika virus mRNA vaccine contains only one mRNA, encoding the prM-E fusion protein, whereas a coronavirus mRNA vaccine must be a combination (cocktail) of at least three mRNAs encoding different structural proteins. Second, many questions about the assembly of the coronavirus envelope remain open. According to studies of SARS-CoV, co-expression of M and E is sufficient to form virus-like particles, but these lack spikes; co-expressing S together with M and E incorporates the S protein and yields VLPs with spikes. Even when virus-like particles form, however, their protein composition ratio differs greatly from that of the real virus. In addition, although the N protein mainly interacts with the viral genome in the interior, studies have shown that its presence enhances the expression and secretion of virus-like particles. At present, several novel coronavirus vaccines have entered clinical trials, all of which use the novel coronavirus S protein as the main antigen; their safety and effectiveness have not yet been proven, and a risk of failure remains. There is therefore an urgent need to continue developing novel coronavirus vaccines based on multi-antigen strategies.
Disclosure of Invention
The invention aims to overcome defects of the prior art, such as the absence of a commercially available novel coronavirus vaccine, and provides mRNA, DNA, a novel coronavirus mRNA vaccine containing the mRNA, liposome nanoparticles, virus-like particles produced by their expression, a pharmaceutical composition and a kit. The mRNAs of the several proteins required to assemble the novel coronavirus, after codon optimization or further nucleotide modification, can each be expressed at high levels in cells. mRNAs combined in the specific proportions of the invention efficiently produce viral proteins at the cellular level, and the proteins produced can self-assemble into virus-like particles; the virus-like particles are thus expressed at high levels, their size and morphology are extremely close to those of the real virus, and in subsequent clinical use they can enable the body to acquire immunity against the real virus. When several mRNAs of the invention are packaged simultaneously in lipid nanoparticles, the efficiency and expression efficiency of the nanoparticles remain high, so that virus-like particles are produced in doses sufficient to stimulate an immune response, with high immunogenicity and stability. When the codon-optimized or nucleotide-modified mRNAs of the several proteins required to assemble the novel coronavirus are prepared as a vaccine (for example, in the form of virus-like particles, a vaccine expressing only the S protein, or a vaccine expressing only the RBD region of the S protein), the vaccine is highly safe and effective and does not generate non-neutralizing antibodies, and therefore does not produce antibody-dependent enhancement of infection.
It is well known to those skilled in the art that coronaviruses are not completely fixed in shape and size, and that the proportions of structural proteins in their envelopes are not fixed either; thus, even among approaches that synthesize virus-like particles, coronavirus mRNA vaccine design is far more complex than that for other viruses in the prior art. However, the inventors surprisingly found, through extensive experimentation, that complete expression of virus-like particles can be achieved after specific codon optimization. The inventors also found experimentally that the translation efficiency and stability of in vitro transcribed mRNA are affected by its chemical modifications (the fate of each mRNA in the cell differs greatly depending on which modified nucleotides are used), its 5' and 3' untranslated region (UTR) sequences, its 5' capping pattern (different cap 0 or cap 1 analogues), and the length of its 3' poly(A) tail. Through extensive research, the inventors found that, by selecting specific nucleoside chemical modifications, specific UTR sequences and a specifically optimized capping method, high-level protein expression can begin as early as half an hour after cells are transfected with the mRNA and can last for one week. Through extensive experiments, the inventors further found that specific combinations of several modified nucleotides yield still better immunogenicity and stability. In addition, the S protein of the invention is 1273 amino acids long, which is a relatively large protein; together with the 5' and 3' UTRs, the final total mRNA length exceeds 4000 nt.
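The length figure above can be checked with simple arithmetic. The following is a minimal sketch, not the inventors' calculation: the 1273-residue length and the 120-nucleotide poly(A) tail come from the text, while the UTR lengths passed in below are hypothetical placeholders, since the lengths of SEQ ID NO. 15 to 17 are not stated in this section.

```python
# Back-of-the-envelope length of an S-protein mRNA.
S_PROTEIN_AA = 1273   # residue count given in the text
POLY_A_NT = 120       # most preferred poly(A) length given in the text


def cds_length_nt(n_amino_acids: int) -> int:
    """Coding sequence length: one codon per residue plus one stop codon."""
    return 3 * n_amino_acids + 3


def total_mrna_length_nt(n_amino_acids: int, utr5_nt: int, utr3_nt: int,
                         poly_a_nt: int = POLY_A_NT) -> int:
    """Sum of 5'-UTR, coding sequence, 3'-UTR and poly(A) tail lengths."""
    return utr5_nt + cds_length_nt(n_amino_acids) + utr3_nt + poly_a_nt


cds = cds_length_nt(S_PROTEIN_AA)
# Hypothetical UTR lengths, chosen only to illustrate the sum.
total = total_mrna_length_nt(S_PROTEIN_AA, utr5_nt=50, utr3_nt=110)
print(cds, total)
```

The coding sequence alone is 3822 nt, so any realistic pair of UTRs plus a poly(A) tail pushes the transcript past 4000 nt, consistent with the statement in the text.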
The inventors found experimentally that synthesizing long-chain mRNA is a persistent challenge. By optimizing the mRNA sequence encoding a protein (e.g., the S protein) while also optimizing the UTR sequences and modified nucleotides, and by screening for expression of the protein (e.g., the S protein), the problems of preparing and purifying mRNA of a very long gene can be overcome.
In order to solve the above technical problems, the present invention provides, in a first aspect, mRNA comprising mRNA encoding one, two, three or four proteins, or fragments, variants or derivatives thereof, selected from the S protein, the E protein, the M protein and the N protein derived from SARS-CoV-2 virus,
wherein the sequence of the mRNA encoding the S protein is shown as SEQ ID NO. 18, SEQ ID NO. 19 or SEQ ID NO. 20; the sequence of the mRNA encoding the E protein is shown as SEQ ID NO. 21; the sequence of the mRNA encoding the M protein is shown as SEQ ID NO. 22; and the sequence of the mRNA encoding the N protein is shown as SEQ ID NO. 23.
Preferably, the fragment is a fragment of the RBD domain of the S protein, and the sequence of the mRNA of the fragment is preferably shown as SEQ ID NO. 37.
Preferably, the mRNA further comprises a 5'-cap structure, preferably 3'-O-Me-m7G(5')ppp(5')G, m7G(5')ppp(5')(2'OMeA)pG or m7(3'OMeG)(5')ppp(5')(2'OMeA)pG.
In the present invention, the structure of 3'-O-Me-m7G(5')ppp(5')G is generally as follows: (structural formula not reproduced here)
In the present invention, the structure of m7G(5')ppp(5')(2'OMeA)pG is generally as follows: (structural formula not reproduced here)
In the present invention, the structure of m7(3'OMeG)(5')ppp(5')(2'OMeA)pG is generally as follows: (structural formula not reproduced here)
Preferably, the mRNA sequence further comprises a 3'-poly(A) tail, whose sequence preferably comprises about 25 to about 400 adenosine nucleotides, preferably about 50 to about 400 adenosine nucleotides, more preferably about 50 to about 300 adenosine nucleotides, even more preferably about 50 to about 250 adenosine nucleotides, still more preferably about 60 to about 250 adenosine nucleotides, and most preferably consists of 120 adenosine nucleotides.
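The nested preference ranges for the poly(A) tail can be expressed as a small lookup. This is an illustrative sketch only: the range bounds are taken from the text, but treating "about" as hard inclusive bounds, and the function names `poly_a_length` and `poly_a_tier`, are assumptions made for the example.

```python
# Nested preference ranges for poly(A) length, widest first (bounds from the text).
# Assumption: "about N" is treated as an exact, inclusive bound.
POLY_A_RANGES = [(25, 400), (50, 400), (50, 300), (50, 250), (60, 250), (120, 120)]


def poly_a_length(mrna: str) -> int:
    """Count the run of trailing 'A' characters.

    Note: this also counts any A residues of the 3'-UTR that directly
    precede the tail, so it is an upper bound on the tail length.
    """
    return len(mrna) - len(mrna.rstrip("A"))


def poly_a_tier(length: int) -> int:
    """Return how many of the nested ranges contain `length` (0 = outside all)."""
    return sum(low <= length <= high for low, high in POLY_A_RANGES)


toy_mrna = "AUGGCU" + "A" * 120   # toy sequence, not a SEQ ID from the patent
print(poly_a_length(toy_mrna), poly_a_tier(poly_a_length(toy_mrna)))
```

Because the ranges are nested, a higher tier number means the tail length satisfies a more strongly preferred range; a 120-nucleotide tail satisfies all six.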
Preferably, the mRNA sequence further comprises a 5'-UTR, the sequence of the 5'-UTR preferably being as shown in SEQ ID NO. 15.
Preferably, the mRNA sequence further comprises a 3'-UTR, the sequence of the 3'-UTR preferably being derived from the 3'-UTR of a gene providing a stable mRNA or from a homologue, fragment or variant thereof, more preferably as shown in SEQ ID NO. 16 or SEQ ID NO. 17.
Preferably, the mRNA sequence further comprises a nucleotide modification, preferably one or more of 5-methyl-CTP, pseudo-UTP, N1-methylpseudo-UTP and 5-methoxy-UTP. In the present invention, 5-methyl-CTP is commercially available from ApexBio (#B7967), pseudo-UTP from ApexBio (#B7972), N1-methylpseudo-UTP from ApexBio (#B8049), and 5-methoxy-UTP from ApexBio (#B8061).
More preferably, the mRNA of the N protein comprises a modification of 5-methyl-CTP, pseudo-UTP, N1-methylpseudo-UTP or 5-methoxy-UTP, or comprises a modification of both 5-methyl-CTP and pseudo-UTP.
More preferably, the mRNA of the E protein comprises a modification of 5-methyl-CTP, pseudo-UTP or N1-methylpseudo-UTP.
More preferably, when the sequence of the mRNA encoding said S protein is as shown in SEQ ID NO. 18, the mRNA of said S protein comprises a modification of 5-methyl-CTP, pseudo-UTP or N1-methylpseudo-UTP, or comprises a modification of both 5-methyl-CTP and pseudo-UTP, and preferably comprises a modification of pseudo-UTP or N1-methylpseudo-UTP.
More preferably, when the sequence of the mRNA encoding said S protein is as shown in SEQ ID NO. 19, the mRNA of said S protein comprises a modification of pseudo-UTP, or comprises a modification of both 5-methyl-CTP and pseudo-UTP.
More preferably, when the sequence of the mRNA encoding said S protein is as shown in SEQ ID NO. 20, the mRNA of said S protein comprises a modification of pseudo-UTP or N1-methylpseudo-UTP.
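The per-mRNA modification preferences listed above amount to a lookup table. The sketch below encodes them as sets of allowed combinations; the dictionary keys, the helper names `options` and `is_preferred`, and the spelling of the nucleotide names are choices made for this illustration rather than anything defined in the patent.

```python
# Modified nucleotides named in the text.
M5C = "5-methyl-CTP"
PSI = "pseudo-UTP"
M1PSI = "N1-methylpseudo-UTP"
MO5U = "5-methoxy-UTP"


def options(*combos):
    """Turn lists of nucleotide names into a set of order-independent combinations."""
    return {frozenset(combo) for combo in combos}


# Preferred modifications per mRNA, as listed in the "More preferably" paragraphs.
PREFERRED_MODS = {
    "N (SEQ ID NO. 23)": options([M5C], [PSI], [M1PSI], [MO5U], [M5C, PSI]),
    "E (SEQ ID NO. 21)": options([M5C], [PSI], [M1PSI]),
    "S (SEQ ID NO. 18)": options([M5C], [PSI], [M1PSI], [M5C, PSI]),
    "S (SEQ ID NO. 19)": options([PSI], [M5C, PSI]),
    "S (SEQ ID NO. 20)": options([PSI], [M1PSI]),
}


def is_preferred(mrna: str, mods) -> bool:
    """True if the given modification set is one of the listed options for `mrna`."""
    return frozenset(mods) in PREFERRED_MODS[mrna]


print(is_preferred("S (SEQ ID NO. 19)", [M5C, PSI]))
print(is_preferred("S (SEQ ID NO. 20)", [M5C]))
```

Using `frozenset` makes the check independent of the order in which a double modification is written, so `[PSI, M5C]` and `[M5C, PSI]` are treated as the same option.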
Preferably, the mRNA comprises mRNA encoding the S protein, the E protein and the M protein derived from SARS-CoV-2 virus, the S protein, the E protein and the M protein being expressed from three separate mRNAs, and the molar ratio of the mRNAs expressing the S protein, the E protein and the M protein is preferably 1:(0.5-2):(0.5-2), for example 1:1:1.
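Because the three mRNAs differ greatly in length, equal moles do not mean equal masses. The sketch below shows one way to convert a molar ratio into a mass split and to check a ratio against the preferred range. It assumes that mass per mole scales with transcript length (the same average nucleotide mass for every species), and the transcript lengths used are hypothetical round numbers, not the lengths of the patent's sequences.

```python
def mass_split(total_ug: float, molar_ratio: dict, lengths_nt: dict) -> dict:
    """Split a total mRNA mass so the species are present at the given molar ratio.

    Assumption: mass per mole is proportional to length in nucleotides.
    """
    weights = {name: molar_ratio[name] * lengths_nt[name] for name in molar_ratio}
    total_weight = sum(weights.values())
    return {name: total_ug * w / total_weight for name, w in weights.items()}


def ratio_in_preferred_range(s: float, e: float, m: float) -> bool:
    """Check S:E:M against the preferred 1:(0.5-2):(0.5-2) molar range."""
    return 0.5 <= e / s <= 2 and 0.5 <= m / s <= 2


# Hypothetical transcript lengths, for illustration only.
lengths = {"S": 4000, "E": 500, "M": 1000}
masses = mass_split(11.0, {"S": 1, "E": 1, "M": 1}, lengths)
print(masses)
```

With these placeholder lengths, a 1:1:1 molar mix is dominated by S mRNA on a mass basis, which is why a ratio stated in moles has to be converted before weighing out material.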
Preferably, the mRNA comprises mRNA encoding the M protein and the E protein derived from SARS-CoV-2 virus, which are preferably linked and then expressed, the linkage preferably being made through a sequence of mRNA encoding a 2A peptide (after protein expression, the resulting 2A peptide "self-cleaves", yielding separate M and E proteins). The amino acid sequence of the 2A peptide is preferably shown as SEQ ID NO. 42 or SEQ ID NO. 43, the DNA sequence encoding the 2A peptide is further preferably shown as SEQ ID NO. 38 or SEQ ID NO. 39, and the mRNA sequence encoding the 2A peptide is further preferably shown as SEQ ID NO. 40 or SEQ ID NO. 41. More preferably, the sequence of the linked mRNA is shown as SEQ ID NO. 35 or 36, and the DNA sequence thereof is shown as SEQ ID NO. 28 or 29.
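A linked construct of this kind can be pictured as three segments joined in frame. The sketch below is a generic illustration under stated assumptions, not the construction of SEQ ID NO. 35 or 36: the toy sequences are invented, the linker shown is not a real 2A sequence, the M-then-E order is assumed, and the helper names are hypothetical. It only shows the general requirement that the upstream stop codon be removed and the reading frame preserved.

```python
STOP_CODONS = {"UAA", "UAG", "UGA"}


def strip_stop(orf: str) -> str:
    """Remove a terminal stop codon so translation continues into the next segment."""
    return orf[:-3] if orf[-3:] in STOP_CODONS else orf


def link_with_2a(upstream_orf: str, linker_2a: str, downstream_orf: str) -> str:
    """Join upstream ORF (stop removed), 2A-coding linker and downstream ORF in frame."""
    parts = [strip_stop(upstream_orf), linker_2a, downstream_orf]
    if any(len(part) % 3 for part in parts):
        raise ValueError("every segment must be a whole number of codons")
    return "".join(parts)


def codons(seq: str) -> list:
    """Split a sequence into consecutive triplets."""
    return [seq[i:i + 3] for i in range(0, len(seq), 3)]


# Toy placeholders, NOT the patent's sequences.
M_TOY = "AUGGCAGACUAA"      # AUG GCA GAC UAA
LINKER_TOY = "GGAAGCGGA"    # stand-in for a 2A-coding sequence
E_TOY = "AUGUACUCAUGA"      # AUG UAC UCA UGA

linked = link_with_2a(M_TOY, LINKER_TOY, E_TOY)
print(linked)
```

The in-frame check matters because a single shifted nucleotide in the linker would scramble the downstream protein; the only stop codon in the joined reading frame should be the final one.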
Preferably, the mRNA comprises mRNA encoding the S protein derived from SARS-CoV-2 virus.
Preferably, the mRNA comprises mRNA encoding the RBD domain of the S protein derived from SARS-CoV-2 virus.
Preferably, the mRNA comprises mRNA encoding the M protein, the E protein and the S protein derived from SARS-CoV-2 virus, and the mRNA of the M protein and the mRNA of the E protein are linked and then expressed, the linkage preferably being made through a sequence of mRNA encoding a 2A peptide. The amino acid sequence of the 2A peptide is preferably shown as SEQ ID NO. 42 or SEQ ID NO. 43, the DNA sequence encoding the 2A peptide is further preferably shown as SEQ ID NO. 38 or SEQ ID NO. 39, and the mRNA sequence encoding the 2A peptide is further preferably shown as SEQ ID NO. 40 or SEQ ID NO. 41. More preferably, the sequence of the linked mRNA is shown as SEQ ID NO. 35 or 36, and the DNA sequence thereof is shown as SEQ ID NO. 28 or 29. More preferably, the molar ratio of the linked mRNA to the mRNA of the S protein is 1.5:1 to 3:1, for example 2:1.
Preferably, when the mRNA comprises mRNA encoding two, three or four proteins or fragments thereof from the S, E, M and N proteins of SARS-CoV-2 virus, the proteins encoded by the mRNA self-assemble into virus-like particles.
In order to solve the above-mentioned technical problems, the second aspect of the present invention provides a DNA comprising a DNA encoding at least one protein (e.g., one, two, three, four) or a fragment thereof among the S protein, the E protein, the M protein and the N protein derived from SARS-CoV-2 virus,
wherein the sequence of the DNA encoding the S protein is shown as SEQ ID NO.3, SEQ ID NO.4 or SEQ ID NO.5; the sequence of the DNA encoding the E protein is shown as SEQ ID NO.8; the sequence of the DNA encoding the M protein is shown as SEQ ID NO.11; the sequence of the DNA encoding the N protein is shown as SEQ ID NO.13.
Preferably, the fragment is a fragment of the RBD domain of the S protein, and the DNA sequence of the fragment is preferably shown in SEQ ID NO. 30.
In order to solve the above technical problem, the third aspect of the present invention provides a composition comprising one or more mRNAs according to the first aspect of the present invention or DNAs according to the second aspect of the present invention.
In order to solve the above technical problem, the fourth aspect of the present invention provides a liposome nanoparticle comprising the mRNA according to the first aspect of the present invention, the DNA according to the second aspect of the present invention, or the composition according to the third aspect of the present invention.
Preferably, the liposomal nanoparticles further comprise a cationic lipid, preferably DLin-MC3-DMA or DOTMA, and a helper lipid, preferably DSPC and/or cholesterol.
In the present invention, the structural formula of DLin-MC3-DMA is generally as follows: [structural formula of DLin-MC3-DMA]
In the present invention, the structural formula of DOTMA is generally as follows: [structural formula of DOTMA]
In the present invention, the structural formula of DSPC is generally as follows: [structural formula of DSPC]
Preferably, the liposome nanoparticle is a long-circulating cationic liposome nanoparticle, preferably a long-circulating cationic liposome nanoparticle modified with PEG or a derivative thereof; the PEG preferably has a relative molecular mass of 2000-5000, such as 2000, 3000, 4000 or 5000. In a preferred embodiment of the present invention, the liposome nanoparticle is a long-circulating cationic liposome nanoparticle comprising DMPE-PEG2000.
In order to solve the above technical problems, a fifth aspect of the present invention provides a virus-like particle comprising a self-assembly of the corresponding proteins expressed by the mRNA according to the first aspect of the present invention, by the DNA according to the second aspect of the present invention, and/or by the composition according to the third aspect of the present invention, wherein the mRNA, the DNA and/or the composition are preferably transferred into a cell in which the corresponding proteins are expressed, and the cell is preferably a 293T and/or 293A cell.
Preferably, the virus-like particle is self-assembled from proteins expressed by mRNA encoding two, three or four of the S, E, M and N proteins of the SARS-CoV-2 virus or fragments thereof, preferably in cells, preferably 293T and/or 293A.
More preferably, the virus-like particle is self-assembled from the S protein, the E protein and the M protein of SARS-CoV-2 virus, which are expressed from three separate mRNAs, and the molar ratio of the mRNAs expressing the S protein, the E protein and the M protein is preferably 1:(0.5-2):(0.5-2), for example 1:1:1.
More preferably, the virus-like particle is self-assembled from proteins expressed by mRNAs encoding the M and E proteins of SARS-CoV-2 virus, preferably the proteins are expressed in cells, preferably 293T and/or 293A. Wherein, the mRNA of the M protein and the E protein is preferably expressed after being connected, and the connection is preferably performed through the sequence of the mRNA encoding the 2A peptide segment. Wherein, the amino acid sequence of the 2A peptide segment is preferably shown as SEQ ID NO.42 or SEQ ID NO.43, the DNA sequence for coding the 2A peptide segment is further preferably shown as SEQ ID NO.38 or SEQ ID NO.39, and the mRNA sequence for coding the 2A peptide segment is further preferably shown as SEQ ID NO.40 or SEQ ID NO. 41. More preferably, the sequence of the ligated mRNA is preferably as shown in SEQ ID NO.35 or 36, and the DNA sequence thereof is preferably as shown in SEQ ID NO.28 or 29.
More preferably, the virus-like particle is self-assembled from proteins expressed by mRNAs encoding the M, E and S proteins of SARS-CoV-2 virus; preferably the proteins are expressed in cells, preferably 293T and/or 293A cells. The mRNAs of the M protein and the E protein are ligated and then expressed, and the ligation is preferably performed through an mRNA sequence encoding a 2A peptide segment. The amino acid sequence of the 2A peptide segment is preferably as shown in SEQ ID NO.42 or SEQ ID NO.43, the DNA sequence encoding the 2A peptide segment is further preferably as shown in SEQ ID NO.38 or SEQ ID NO.39, and the mRNA sequence encoding the 2A peptide segment is further preferably as shown in SEQ ID NO.40 or SEQ ID NO.41. More preferably, the sequence of the ligated mRNA is as shown in SEQ ID NO.35 or 36, and the DNA sequence thereof is preferably as shown in SEQ ID NO.28 or 29. More preferably, the molar ratio of the ligated mRNA to the mRNA of the S protein is 1.5:1 to 3:1, for example 2:1.
In order to solve the above technical problem, the sixth aspect of the present invention provides an mRNA vaccine against the novel coronavirus, which comprises mRNA according to the first aspect of the present invention, DNA according to the second aspect of the present invention, a composition according to the third aspect of the present invention, and/or a liposomal nanoparticle according to the fourth aspect of the present invention.
Preferably, the mRNA vaccine induces the cells to produce virus-like particles to activate the immune system.
Preferably, the mRNA vaccine further comprises adjuvants conventionally used in the art.
In order to solve the above technical problem, the seventh aspect of the present invention provides a pharmaceutical composition comprising mRNA according to the first aspect of the present invention, DNA according to the second aspect of the present invention, a composition according to the third aspect of the present invention, liposomal nanoparticles according to the fourth aspect of the present invention, virus-like particles according to the fifth aspect of the present invention, and/or mRNA vaccine according to the sixth aspect of the present invention, and optionally a pharmaceutically acceptable carrier.
In order to solve the above technical problems, the eighth aspect of the present invention provides a kit comprising the mRNA according to the first aspect of the present invention, the DNA according to the second aspect of the present invention, the composition according to the third aspect of the present invention, the liposomal nanoparticle according to the fourth aspect of the present invention, the virus-like particle according to the fifth aspect of the present invention, the mRNA vaccine according to the sixth aspect of the present invention, and/or the pharmaceutical composition according to the seventh aspect of the present invention.
In order to solve the technical problem, the invention also provides mRNA for encoding the 2A peptide segment, and the sequence of the mRNA is preferably shown as SEQ ID NO.40 or SEQ ID NO. 41.
In order to solve the technical problems, the invention also provides a DNA for coding the 2A peptide segment, and the sequence of the DNA is shown as SEQ ID NO.38 or SEQ ID NO. 39.
In order to solve the above technical problems, the present invention also provides the use of mRNA according to the first aspect of the present invention or DNA according to the second aspect of the present invention or a composition according to the third aspect of the present invention in the preparation of liposomal nanoparticles according to the fourth aspect of the present invention, virus-like particles according to the fifth aspect of the present invention, mRNA vaccines according to the sixth aspect of the present invention, pharmaceutical compositions according to the seventh aspect of the present invention, and/or kits according to the eighth aspect of the present invention.
In order to solve the above technical problem, the present invention also provides a method for preventing and/or treating a novel coronavirus infection, comprising the step of administering (optionally to a subject in need thereof) an mRNA according to the first aspect of the invention, a DNA according to the second aspect of the invention, a composition according to the third aspect of the invention, a liposomal nanoparticle according to the fourth aspect of the invention, a virus-like particle according to the fifth aspect of the invention, an mRNA vaccine according to the sixth aspect of the invention, a pharmaceutical composition according to the seventh aspect of the invention, and/or a kit according to the eighth aspect of the invention.
In order to solve the above technical problems, the present invention also provides an mRNA according to the first aspect of the present invention, a DNA according to the second aspect of the present invention, a composition according to the third aspect of the present invention, a liposome nanoparticle according to the fourth aspect of the present invention, a virus-like particle according to the fifth aspect of the present invention, an mRNA vaccine according to the sixth aspect of the present invention, a pharmaceutical composition according to the seventh aspect of the present invention, and/or a kit according to the eighth aspect of the present invention, for use in preventing and/or treating a novel coronavirus infection.
In the invention, the sequence encoding the 2A peptide segment can be the sequence encoding the 2A peptide segment of a natural virus, or an optimized sequence (for example, T2A and P2A; the mRNA sequence of T2A can be as shown in SEQ ID NO.40, the mRNA sequence of P2A can be as shown in SEQ ID NO.41, the corresponding DNA sequences can be as shown in SEQ ID NO.38 and SEQ ID NO.39, and the amino acid sequences of the polypeptides obtained after translation can be as shown in SEQ ID NO.42 and SEQ ID NO.43). The polypeptide efficiently self-cleaves into a front fragment and a rear fragment, so that the sequences upstream and downstream of it are expressed as two independent proteins, which achieves the aim of co-expressing two independent proteins from a single sequence.
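The co-expression principle described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: it uses the commonly cited T2A peptide sequence (EGRGSLLTCGDVEENPGP) and the widely described behaviour that the separation falls between the glycine and the final proline of the 2A motif. The sequences actually used in the invention are those of SEQ ID NO.42 and SEQ ID NO.43, which may differ, and the upstream and downstream protein strings below are hypothetical placeholders rather than the M and E proteins.

```python
# Commonly cited T2A peptide; an assumption for illustration only,
# not necessarily identical to SEQ ID NO.42 or SEQ ID NO.43.
T2A = "EGRGSLLTCGDVEENPGP"


def fuse(upstream: str, downstream: str, linker: str = T2A) -> str:
    """Return the single polypeptide encoded by the ligated reading frame."""
    return upstream + linker + downstream


def self_cleave(polyprotein: str, linker: str = T2A) -> tuple:
    """Split the polyprotein at the 2A site.

    The break is modelled just before the last residue (P) of the linker,
    so the front fragment keeps most of the 2A peptide as a C-terminal
    extension and the rear fragment starts with a proline.
    """
    start = polyprotein.find(linker)
    if start < 0:
        raise ValueError("2A linker not found in polyprotein")
    cut = start + len(linker) - 1
    return polyprotein[:cut], polyprotein[cut:]


# Hypothetical placeholder sequences (not the real M and E proteins).
upstream_placeholder = "MKTAYIAK"
downstream_placeholder = "MSGLVPRG"

front, rear = self_cleave(fuse(upstream_placeholder, downstream_placeholder))
print("front fragment:", front)
print("rear fragment: ", rear)
```

The sketch makes one practical consequence visible: the upstream protein carries residual 2A residues at its C-terminus, while the downstream protein gains a single N-terminal proline.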
Interpretation of terms
In the present invention, mRNA, also called messenger RNA, is usually a single-stranded ribonucleic acid (ssRNA) that is transcribed from a DNA strand as a template, carries genetic information and can direct protein synthesis. After being transcribed in the cell from a gene as template according to the principle of complementary base pairing, the mRNA contains a base sequence corresponding to a functional segment of the DNA molecule and serves as the direct template for protein biosynthesis.
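As a minimal illustration of the complementary base pairing mentioned above, the following Python sketch derives an mRNA from a DNA template strand. The template shown is a hypothetical nine-base example, not a sequence from the invention, and the function name is chosen here only for illustration.

```python
# Pairing rules for transcription: each DNA template base is read and the
# complementary ribonucleotide is incorporated (A pairs with U in RNA).
COMPLEMENT = {"A": "U", "T": "A", "G": "C", "C": "G"}


def transcribe(template_3_to_5: str) -> str:
    """Return the mRNA (written 5'->3') for a template strand written 3'->5'."""
    return "".join(COMPLEMENT[base] for base in template_3_to_5.upper())


template = "TACAAACAA"  # hypothetical template strand, written 3'->5'
mrna = transcribe(template)
print(mrna)  # AUGUUUGUU
```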
In the present invention, the mRNA vaccine is generally produced by directly introducing mRNA encoding a viral antigen into a human body, and expressing the viral protein antigen in cells, thereby activating the immune system of the human body and generating a neutralizing antibody against the virus.
In the present invention, the antigen (abbreviated as Ag) refers to any substance that can induce an immune response, and is generally a substance that can induce the production of antibodies.
In the present invention, the antibody generally refers to immunoglobulin produced by plasma cells differentiated from B cells in the body under the stimulation of an antigen substance and capable of specifically binding and reacting with a corresponding antigen.
In the invention, the term neutralizing antibody reflects the fact that many antibodies are generated when a microorganism invades the human body, but only some of them can rapidly recognize the microorganism and capture it before it invades the cells of the body, thereby protecting the body from infection. This process is called neutralization, and an antibody that exerts this effect is a neutralizing antibody.
In the invention, the liposome nanoparticle generally refers to a complex in which liposomes package drug molecules (small-molecule compounds, RNA, DNA or protein drugs) into particles on the order of a hundred nanometers in size and deliver the drug into the body, with the advantages of increasing the solubility of the drug, prolonging its retention time in the body, enhancing its targeting and reducing its toxicity.
In the present invention, the virus-like particles (VLPs) are typically hollow particles containing one or more structural proteins of a virus, do not contain viral nucleic acid, cannot replicate autonomously, and are identical or similar in morphology to authentic virus particles, and are commonly referred to as pseudoviruses.
In the invention, the novel coronavirus S protein is also called the spike protein. The S protein is the most important pathogenic target protein of coronaviruses and comprises two subunits, S1 and S2. S1 mainly contains the receptor binding domain (RBD), and it is through the RBD that the coronavirus binds to a cell surface receptor to infect cells. The S protein thus mainly mediates binding of the virus to the host cell membrane receptor and membrane fusion. It is also an important site of action of host neutralizing antibodies and a key target for vaccine design.
On the basis of the common knowledge in the field, the above preferred conditions can be combined arbitrarily to obtain the preferred embodiments of the invention.
The reagents and starting materials used in the present invention are commercially available.
The positive progress effects of the invention are as follows:
(1) The mRNAs of the several proteins required for assembling the novel coronavirus, after codon optimization or further nucleotide modification according to the invention, can each be highly expressed in cells independently. In addition, mRNAs combined in the specific proportions of the invention efficiently produce viral proteins at the cellular level, or the produced proteins self-assemble into virus-like particles, so that virus-like particles are produced at a high level. The size and morphology of these virus-like particles are very close to those of the real virus, so that in subsequent clinical use they can enable the organism to acquire immunity against the real virus.
(2) The nanoparticles containing the mRNA have high packaging efficiency and expression efficiency, and multiple mRNAs are packaged simultaneously in the lipid nanoparticles, so that virus-like particles can be produced in a dose sufficient to stimulate an immune response in the body, with high immunogenicity and stability.
(3) When the codon-optimized or further nucleotide-modified mRNAs of the several proteins required for assembling the novel coronavirus according to the present invention are prepared into a vaccine (for example, in the form of a virus-like particle, a vaccine expressing only the S protein, or a vaccine expressing only the RBD region of the S protein), the vaccine has high safety and good effectiveness and does not generate non-neutralizing antibodies, so that no antibody-dependent enhancement of infection occurs.
Drawings
Fig. 1 shows an overview of an embodiment of the invention. In the examples, mRNA was used to express the structural proteins S, M, E and N of the novel coronavirus and the RBD domain of the S protein. The mRNA is coated with liposomes into nanoparticles (LNP) for cell transfection or animal immunization. Multiple mRNAs transfected into cells in vitro highly express the viral proteins, which, at the appropriate ratio, self-assemble into virus-like particles (VLPs). After immunization of mice with LNP, the immune system of the mice is activated to produce antibodies.
FIG. 2 shows the results of Western Blot (WB) detection of protein expression after transfection of 293A cells with liposome-coated mRNA. Lane 1 is the protein expressed by cap1-modified mRNA, lane 2 by cap1+5mC+pseudoU-modified mRNA, lane 3 by cap1+pseudoU-modified mRNA, lane 4 by cap1+5moU-modified mRNA, lane 5 by cap1+N1-m-pseudoU-modified mRNA, and lane 6 by cap1+5mC-modified mRNA. A shows the WB results for the mRNA of the N protein and for the proteins expressed by NBL mRNA; B shows the WB results for the proteins expressed by EBL mRNA and MBL mRNA; C shows the WB results for the proteins expressed by SGS mRNA, STF mRNA and SBL mRNA; D shows the WB results for the proteins expressed by SDC50, SDC54, SDC58 and SDC60; E shows the WB results for the protein expressed by the SGS-RBD domain mRNA; and F shows the WB results for the proteins expressed by MP2AE and MT2AE.
Fig. 3 shows an electron micrograph of VLP particles.
Figure 4 shows a schematic representation of mRNA lipid nanoparticle packaging.
Figure 5 shows the LNP profiles detected with ZetaView. The top panel of A is the particle size and distribution profile, before filtration, of LNP coating SGS mRNA expressing the S protein, and the bottom panel is the particle size distribution profile of the same LNP after filtration. The top panel of B is the particle size and distribution profile, before filtration, of LNP coating mRNA expressing the RBD domain of the S protein, and the bottom panel is the particle size distribution profile of the same LNP after filtration. The top panel of C is the particle size and distribution profile, before filtration, of LNP coating mRNA expressing the M, E and S proteins, and the bottom panel is the particle size distribution profile of the same LNP after filtration.
FIG. 6 is a graph showing the results of the measurement of the antibody titer in serum by ELISA one week after the first immunization.
FIG. 7 shows the results of the mouse serum neutralizing antibody titer experiments. The mRNA expressing the full-length S protein (spike) and the mRNA expressing the virus-like particles (SME) induced antibody titers of greater than 10^4. The mRNA producing the RBD domain alone (RBD) induced a neutralizing antibody titer slightly higher than that of the blank control (Ctrl).
Detailed Description
The invention is further illustrated by the following examples, which are not intended to limit the scope of the invention. The experimental methods without specifying specific conditions in the following examples were selected according to the conventional methods and conditions, or according to the commercial instructions.
The invention relates to an mRNA vaccine developed against the novel coronavirus, which mainly adopts three approaches: (1) expressing multiple viral proteins that assemble into virus-like particles in vivo; (2) expressing the full-length S protein from mRNA; and (3) expressing the RBD domain of the S protein, as shown in FIG. 1.
Example 1 mRNA preparation
The four structural genes S, M, E and N of the novel coronavirus (SARS-CoV-2) were codon-optimized, and several coding sequences were designed for each gene. Each sequence was cloned into an mRNA synthesis vector. For each sequence, two mRNAs were prepared, one encoding the wild-type protein without a tag and one encoding the protein with a Flag tag at the C-terminus for later expression validation. The specific steps are as follows:
Sangon Biotech (Shanghai) was commissioned to synthesize gene sequences, codon-optimized for SARS-CoV-2, carrying the S protein (Spike protein), M protein, E protein and N protein (the amino acid sequences are shown in SEQ ID NO.1, SEQ ID NO.9, SEQ ID NO.6 and SEQ ID NO.12, respectively; the natural gene sequences of the four proteins are shown in SEQ ID NO.2 (linked to 3'-UTR-2), SEQ ID NO.10, SEQ ID NO.7 and SEQ ID NO.14 (linked to 3'-UTR-2), respectively), wherein: the optimized sequences of the S protein gene are shown in SEQ ID NO.3 (SGS, linked to 3'-UTR-2), SEQ ID NO.4 (SBL or S-benchling, linked to 3'-UTR-1), SEQ ID NO.5 (STF, linked to 3'-UTR-1), SEQ ID NO.24 (SDC50, linked to 3'-UTR-2), SEQ ID NO.25 (SDC54, linked to 3'-UTR-2), SEQ ID NO.26 (SDC58, linked to 3'-UTR-2) and SEQ ID NO.27 (SDC60, linked to 3'-UTR-2); the optimized sequence of the M protein gene is shown in SEQ ID NO.11 (MBL, linked to 3'-UTR-1); the optimized sequence of the E protein gene is shown in SEQ ID NO.8 (EBL, linked to 3'-UTR-1); and the optimized sequence of the N protein gene is shown in SEQ ID NO.13 (NBL, linked to 3'-UTR-2). The codon-optimized gene sequences were then subcloned into a vector containing the T7 promoter, the 5' non-coding region (5' UTR, sequence shown in SEQ ID NO.15) and the 3' non-coding region (3' UTR, sequence shown in SEQ ID NO.16 (3'-UTR-1) or SEQ ID NO.17 (3'-UTR-2)). Two vectors were used: one in which the 5'-UTR and 3'-UTR-1 regions were added to pUC19, and one in which the 5'-UTR and 3'-UTR-2 regions were added to pUC57. The S, E and N proteins and the M protein are tagged at the C-terminus with HA and Flag, respectively. After the vector was amplified, it was linearized by digestion with restriction enzymes (all procedures are conventional in the art).
The linearized fragments were further purified and used as templates for in vitro transcription (IVT) to synthesize modified mRNA, specifically: IVT was performed using the HyperScribe T7 high yield RNA synthesis kit (ApexBio) with 1-2 μg of template and cap0 or cap1 analogs (purchased from ApexBio) (7.5 mM of each modified nucleotide). The reaction was incubated at 37 ℃ for 2-4 hours and then treated with DNase (Thermo). A 3' poly(A) tail was then added to the IVT RNA product using the poly(A) tailing kit (ApexBio). The mRNA was purified using the RNA Clean and Concentrator kit (ApexBio). The resulting mRNA sequences of the optimized S protein gene are shown in SEQ ID NO.18 (SGS mRNA), SEQ ID NO.19 (SBL mRNA), SEQ ID NO.20 (STF mRNA), SEQ ID NO.31 (SDC50), SEQ ID NO.32 (SDC54), SEQ ID NO.33 (SDC58) and SEQ ID NO.34 (SDC60), respectively; the resulting mRNA sequence of the optimized M protein gene is shown in SEQ ID NO.22 (MBL mRNA); the resulting mRNA sequence of the optimized E protein gene is shown in SEQ ID NO.21 (EBL mRNA); and the resulting mRNA sequence of the optimized N protein gene is shown in SEQ ID NO.23 (NBL mRNA).
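The general idea of codon optimization used in this example can be sketched as follows. This is a simplified illustration, not the procedure or the tools actually used to design the SGS, SBL, STF or SDC sequences: it back-translates a short test peptide using a small, illustrative table of human-preferred codons (an assumption made here), reports the GC content, and converts the DNA coding sequence to its mRNA form.

```python
# Illustrative preferred-codon table covering only the residues of the test
# peptide; the choices are an assumption for this sketch, not the invention's.
PREFERRED_CODON = {
    "M": "ATG", "F": "TTC", "V": "GTG", "L": "CTG",
    "P": "CCC", "S": "AGC", "Q": "CAG",
}


def back_translate(peptide: str) -> str:
    """Encode a peptide as DNA using one preferred codon per amino acid."""
    return "".join(PREFERRED_CODON[aa] for aa in peptide)


def gc_content(seq: str) -> float:
    """Fraction of G and C bases in a nucleotide sequence."""
    return (seq.count("G") + seq.count("C")) / len(seq)


def to_mrna(dna: str) -> str:
    """Write the coding-strand DNA sequence as mRNA (T replaced by U)."""
    return dna.replace("T", "U")


test_peptide = "MFVFLVLLPLVSSQ"  # short test peptide used for illustration
dna = back_translate(test_peptide)
print(dna)
print(to_mrna(dna))
print(f"GC content: {gc_content(dna):.2f}")
```

Real optimization tools additionally weigh factors such as codon pair usage, secondary structure and repeated motifs, which is one reason why different optimized sequences of the same gene can express at very different levels.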
Example 2 modified nucleotides incorporated during in vitro transcription
In the in vitro transcription synthesis of modified mRNA described in Example 1, modified nucleotides are added to the reaction system in a certain ratio and are randomly incorporated into the mRNA sequence. The modified nucleotides tested in this example include 5-methyl-CTP (abbreviated as 5mC, APExBIO, #B7967), pseudo-UTP (abbreviated as pseudoU, APExBIO, #B7972), N1-methyl-pseudo-UTP (abbreviated as N1-m-pseudoU, APExBIO, #B8049) and 5-methoxy-UTP (abbreviated as 5moU, APExBIO, #B8061); the modified nucleotides used for 5' capping of the mRNA are 3'-O-Me-m7G(5')ppp(5')G (ARCA, Cap0, APExBIO, #B8175), m7G(5')ppp(5')(2'OMeA)pG (Cap1, APExBIO EZ Cap, #B8176) and m7(3'OMeG)(5')ppp(5')(2'OMeA)pG (Cap1 analog, APExBIO EZ Cap, #B8178).
The specific experimental steps are as follows:
(1) Incorporation of modified nucleotides into the in vitro transcribed mRNA sequence: to incorporate modified nucleotides randomly during in vitro transcription, they were added to the reaction system at a modified-to-unmodified nucleotide ratio of 1:5, using APExBIO kit #K1047. The reaction system was prepared according to the kit instructions and reacted at 37 ℃ for 2-4 hours.
(2) Addition of the 5' capping nucleotide during transcription: m7(3'OMeG)(5')ppp(5')(2'OMeA)pG, m7G(5')ppp(5')(2'OMeA)pG or 3'-O-Me-m7G(5')ppp(5')G was added to the transcription reaction system at the same time, at a molar ratio of 8:1.
(3) Addition of a poly(A) sequence of 120 adenylates at the 3' end: the 3' poly(A) tail was added to the IVT RNA product using a poly(A) tailing kit (APExBIO, #K1053); the reaction system was prepared according to the kit instructions, and the reaction was carried out at 37 ℃ for 1 hour.
(4) Digestion of the DNA template with DNase: the DNA template was digested with DNase I from NEB (cat. #M0303S), and the reaction was carried out at 37 ℃ for 1 hour.
(5) mRNA purification: the DNA template-digested transcript was purified with the Thermo Fisher RiboPure Kit (#AM1924), and the mRNA was eluted with 1 mM sodium citrate, pH 6.4. The mRNA was checked by agarose gel nucleic acid electrophoresis and its concentration was determined using a NanoDrop.
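The layout of the finished transcript produced by the steps above (5' cap, 5' UTR, coding sequence, 3' UTR, tail of 120 adenylates) can be summarized in a small data structure. The sketch below is illustrative only: the class name, the UTR strings and the coding sequence are hypothetical placeholders, not SEQ ID NO.15-17 or any sequence of the invention.

```python
from dataclasses import dataclass

STOP_CODONS = {"UAA", "UAG", "UGA"}


@dataclass
class MrnaConstruct:
    five_utr: str
    cds: str
    three_utr: str
    poly_a_len: int = 120  # tail length given in step (3)
    cap: str = "cap1"      # label only; the cap is not part of the base sequence

    def validate(self) -> None:
        """Basic sanity checks on the coding sequence."""
        if len(self.cds) % 3 != 0:
            raise ValueError("CDS length is not a multiple of three")
        if not self.cds.startswith("AUG"):
            raise ValueError("CDS does not start with AUG")
        if self.cds[-3:] not in STOP_CODONS:
            raise ValueError("CDS does not end with a stop codon")

    def sequence(self) -> str:
        """Return 5' UTR + CDS + 3' UTR + poly(A) as one RNA string."""
        self.validate()
        return self.five_utr + self.cds + self.three_utr + "A" * self.poly_a_len


# Hypothetical placeholder elements.
construct = MrnaConstruct(
    five_utr="GGGAGACCACC",
    cds="AUGUUUGUUUAA",
    three_utr="GCUCGCUUUC",
)
full = construct.sequence()
print(len(full), full[-10:])
```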
Example 3 mRNA transfected cells
Lipofectamine 2000 (lipo2K, ThermoFisher Scientific #11668019) was used at a mass-to-volume ratio of 1:2 (mRNA:lipo2K, 1 μg mRNA + 2 μL lipo2K) to transfect the S, M, E and N mRNAs obtained in Examples 1 and 2 into 293A cells, respectively, and protein expression was examined 24 hr later by Western Blot. The results are shown in FIG. 2.
In FIG. 2, the numbers represent the different modified nucleotides incorporated into the mRNA sequence: 1. cap1; 2. cap1+5mC+pseudoU; 3. cap1+pseudoU; 4. cap1+5moU; 5. cap1+N1-m-pseudoU; 6. cap1+5mC. The N protein and the E protein expressed by the cells both carry an HA tag; an anti-HA antibody was used in the western blot to detect protein expression in the cells, and GAPDH protein was used as a positive control. Specifically:
As shown in A of FIG. 2, the N protein is small, and the optimized mRNA of every sequence and modification expresses the N protein in the cells; the mRNAs with the two modifications cap1+5mC+pseudoU (lane 2) and cap1+5moU (lane 4) give relatively low protein expression.
In B of FIG. 2, the EBL sequence is strongly expressed, and a strong signal is detected with an antibody against the HA tag peptide. The MBL sequence is linked to a Flag tag peptide and was detected with an antibody against the Flag tag peptide; the four modification combinations cap1 (lane 1), cap1+pseudoU (lane 3), cap1+N1-m-pseudoU (lane 5) and cap1+5mC (lane 6) are expressed better, whereas the mRNAs with the two modification combinations cap1+5mC+pseudoU (lane 2) and cap1+5moU (lane 4) give a low expression level of the E protein.
In C and D of FIG. 2, the sequences expressing the S protein are linked to an HA tag peptide or a Flag tag peptide, respectively, and detection with anti-HA or anti-Flag antibodies shows very large differences in protein expression. As can be seen in C of FIG. 2, the native S gene sequence without optimization is hardly expressed, or is expressed in very low amounts, in 293A cells. Expression from the STF and SBL optimized sequences is slightly improved compared with the native S gene sequence; for STF, the cap1+pseudoU (lane 3) and cap1+N1-m-pseudoU (lane 5) modifications give relatively high protein expression, and for SBL, the cap1+pseudoU (lane 3) modification gives relatively high protein expression. The SGS optimized sequence greatly increases protein expression; the highest expression levels are obtained with the SGS sequence modified with cap1+pseudoU (lane 3) and cap1+N1-m-pseudoU (lane 5), and the expression levels of the SGS sequence modified with cap1 (lane 1), cap1+5mC+pseudoU (lane 2) and cap1+5mC (lane 6) are also high. As can be seen from D of FIG. 2, the optimized sequences SDC50, SDC54, SDC58 and SDC60 express many proteins, including non-specific proteins.
In E of FIG. 2, the SGS-RBD optimized mRNA sequence with an HA tag peptide (the mRNA sequence is shown in SEQ ID NO.37 and the corresponding DNA sequence in SEQ ID NO.30; it is modified with pseudoU, has a Cap1 5' cap structure and a 3' tail of 120 adenylates, and is linked to the 5' UTR shown in SEQ ID NO.15 and the 3' UTR shown in SEQ ID NO.16 or 17) highly expresses the RBD domain of the S protein in cells.
In F of FIG. 2, a single mRNA sequence is used to express the M and E proteins in tandem, i.e., the mRNAs of the M protein and the E protein are ligated and then expressed. Different mRNAs encoding a 2A peptide segment can be used for the ligation; after protein expression the 2A peptide self-cleaves, finally yielding independent M and E proteins. On the basis of the natural viral 2A sequences, the optimized DNA sequences corresponding to the T2A and P2A polypeptides are shown in SEQ ID NO.38 and SEQ ID NO.39; the T2A mRNA sequence is shown in SEQ ID NO.40 and the P2A mRNA sequence in SEQ ID NO.41, and they are translated into the polypeptides of SEQ ID NO.42 and SEQ ID NO.43. The mRNA sequence of MT2AE obtained after ligation is shown in SEQ ID NO.35 (the corresponding DNA sequence is shown in SEQ ID NO.28), and the mRNA sequence of MP2AE obtained after ligation is shown in SEQ ID NO.36 (the corresponding DNA sequence is shown in SEQ ID NO.29). Both are modified with pseudoU, have a Cap1 5' cap structure and a 3' tail of 120 adenylates, and are linked to the 5' UTR shown in SEQ ID NO.15 and the 3' UTR shown in SEQ ID NO.16 or 17. The western blot shows closely positioned double bands representing the M and E proteins. The amounts of the two proteins obtained with the MP2AE optimized mRNA sequence are closer to each other.
Example 4 preparation and Observation of Virus-like particles
To produce virus-like particles (VLPs), mRNAs expressing the S, M and E proteins (SME mRNA, consisting of SGS mRNA, MBL mRNA and EBL mRNA, all modified with pseudoU, with a Cap1 5' cap structure and a 3' tail of 120 adenylates, linked to the 5' UTR shown in SEQ ID NO.15 and the 3' UTR shown in SEQ ID NO.16 or 17) were coated with lipo2K at a molar ratio of 1:0.5:0.5 and co-transfected into 293A cells, and the supernatant was collected 48 hours after transfection. Alternatively, an mRNA expressing the M and E proteins in tandem (i.e., the mRNAs of the M protein and the E protein are ligated before the subsequent steps; different linker peptides can be used for the ligation, and the sequence of the ligated mRNA is shown in SEQ ID NO.35 or 36) and the mRNA expressing the S protein (SEQ ID NO.3) were coated with lipo2K at a molar ratio of 2:1 and transfected into 293A cells, and the supernatant was collected 48 hours after transfection.
The collected supernatant was concentrated using Amicon Ultra-15 (Millipore) with a 100 kDa molecular weight cut-off and then placed in an appropriate solution (20 mM HEPES, pH 7.4, 120 mM NaCl). Immediately after ultracentrifugation at 31,000 rpm (Beckman ultracentrifuge, rotor model SW32) for 90 minutes at 4 ℃, the 30-40% (w/v) sucrose fraction containing the virus-like particles (VLPs) was withdrawn with a 5 mL syringe. The buffer of the VLP-containing solution was exchanged for PBS using Amicon Ultra-15 centrifugal tubes with a 100 kDa cut-off. To prepare grids for negative-staining transmission electron microscopy (TEM), 5 μL of VLP solution was adsorbed onto a glow-discharged carbon-coated grid for 2 minutes. The grid was stained drop-wise for 60 seconds and then loaded onto a Talos L120C microscope (Thermo Fisher) to visualize the VLPs. The results are shown in FIG. 3: a of FIG. 3 is an electron micrograph of novel coronavirus-like particles self-assembled from the S, E and M proteins expressed from mRNA; b of FIG. 3 is a magnified photograph of a single virus-like particle, on which the size of the surface spikes was measured; and c of FIG. 3 is a schematic cartoon of the novel coronavirus-like particle. As can be seen from FIG. 3, the VLPs have a diameter of about 90 nm under the electron microscope and carry trimeric spikes similar to those of the natural virus on their surface; the spikes measure about 12 x 13 nm, which is very close to the size and structure of the natural virus.
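Because the mRNAs are combined by molar ratio but are usually weighed out by mass, it can help to convert between the two. The following Python sketch does this under the simplifying assumption that the mass of one mole of mRNA is proportional to its length in nucleotides. The transcript lengths are hypothetical round numbers chosen for illustration, not the lengths of the sequences of the invention.

```python
def mass_split(total_ug: float, lengths_nt: dict, molar_ratio: dict) -> dict:
    """Split a total mRNA mass so the species are present at a given molar ratio.

    Assumes mass per mole is proportional to transcript length, so the mass
    share of each species is proportional to (molar ratio x length).
    """
    weights = {name: molar_ratio[name] * lengths_nt[name] for name in molar_ratio}
    total_weight = sum(weights.values())
    return {name: total_ug * w / total_weight for name, w in weights.items()}


# Hypothetical transcript lengths in nucleotides (illustrative only).
lengths = {"S": 4000, "M": 900, "E": 450}
# Molar ratio S:M:E used for co-transfection in this example.
ratio = {"S": 1.0, "M": 0.5, "E": 0.5}

masses = mass_split(2.0, lengths, ratio)
for name, ug in masses.items():
    print(f"{name}: {ug:.3f} ug")
```

With these placeholder lengths most of the mass is contributed by the S mRNA, even though the M and E mRNAs together are present in equal molar amount to it.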
Example 5 mRNA encapsulation method
Following a previously reported method, the mRNAs containing modified nucleotides obtained in Example 2 were encapsulated in lipid nanoparticles. These mRNAs were: mRNA expressing the RBD domain of the S protein; SGS mRNA expressing the S protein; and SME mRNA expressing the three proteins S, M and E, mixed at a molar ratio of 1:0.5:0.5. All were modified with pseudouridine (pseudoU), carried a 5' Cap1 cap structure and a 3' poly(A) tail of 120 adenosines, and were linked to the 5' UTR shown in SEQ ID NO.15 and the 3' UTR shown in SEQ ID NO.16 or 17. The mRNA was encapsulated with DLin-MC3-DMA (APBIO, #A8791), a lipid that is ionizable (cationic) at low pH, two helper lipids (DSPC and cholesterol) and a PEGylated lipid (DMPE-PEG2000) to form nanoparticles (as shown in FIG. 4). An aqueous mRNA solution was prepared by mixing mRNA dissolved in ultrapure water with 100 mM citrate buffer (pH 3.0) at 1:1 (v/v). The four lipid components [ionizable lipid : cholesterol : DSPC : DMPE-PEG2000] were dissolved in ethanol (99.5%) at a ratio of 50:10:38.5:1.5 to form a lipid solution. The mRNA and lipid solutions were mixed in a NanoAssemblr (Precision NanoSystems) microfluidic mixing system at a volumetric mixing ratio of Aq:EtOH = 3:1 and a constant total flow rate of 12 mL/min, yielding liposomal nanoparticles containing mRNA (LNPs).
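The mixing parameters in this example reduce to simple arithmetic. The sketch below works out the per-stream flow rates implied by a 3:1 Aq:EtOH ratio at 12 mL/min total flow, the citrate concentration after 1:1 dilution, and the mole fraction of each lipid. All function and variable names are illustrative assumptions; the lipid labels follow the order in which the text lists the components, and the citrate calculation assumes the mRNA stock in ultrapure water contributes no citrate.

```python
def split_flow(total_ml_min: float, aq_parts: float, etoh_parts: float):
    """Split a total flow rate into aqueous and ethanol streams by volume ratio."""
    parts = aq_parts + etoh_parts
    return (total_ml_min * aq_parts / parts, total_ml_min * etoh_parts / parts)


def diluted_conc(stock_mm: float, stock_parts: float, diluent_parts: float) -> float:
    """Concentration (mM) after mixing a stock with a buffer-free diluent."""
    return stock_mm * stock_parts / (stock_parts + diluent_parts)


def mole_fractions(ratio: dict) -> dict:
    """Normalize a molar ratio so that the fractions sum to 1."""
    total = sum(ratio.values())
    return {name: value / total for name, value in ratio.items()}


# Ratio as written in the text, in the order the text gives the components.
LIPID_RATIO = {
    "ionizable lipid (DLin-MC3-DMA)": 50,
    "cholesterol": 10,
    "DSPC": 38.5,
    "DMPE-PEG2000": 1.5,
}

if __name__ == "__main__":
    aq, etoh = split_flow(12, 3, 1)
    print(f"aqueous stream: {aq} mL/min, ethanol stream: {etoh} mL/min")
    print(f"citrate after 1:1 mixing: {diluted_conc(100, 1, 1)} mM")
    for name, frac in mole_fractions(LIPID_RATIO).items():
        print(f"{name}: {frac:.3f}")
```

Multiplying each mole fraction by a chosen total lipid amount gives the amount of each component to weigh out; the total lipid amount and the lipid-to-mRNA ratio are not stated in the text and would have to be chosen separately.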
To characterize the LNPs prepared as described above, 25 μL of the sample fraction was injected, immediately after preparation, into 975 μL of 10 mM phosphate buffer (pH 7.4) and used to measure the intensity-averaged particle size (Z-average) on a ZetaSizer (Malvern Instruments Inc.). The sample fractions were immediately transferred to Slide-A-Lyzer G2 dialysis cassettes (10,000 MWCO, Thermo Fisher Scientific Inc.) and dialyzed against PBS (pH 7.4) at 4 °C overnight. The volume of PBS buffer was 650-800 times the sample volume. A sample fraction was then collected, 25 μL of it was injected into 975 μL of 10 mM phosphate buffer (pH 7.4), and the particle size was measured again (post-dialysis particle size). The LNPs were about 100 nm in diameter both before and after dialysis and were uniform and stable, as shown in FIG. 5 and Table 1. The dialyzed sample was used to immunize mice by injection. FIG. 5 shows the results for SGS mRNA expressing the S protein, together with the encapsulation results for mRNA expressing the RBD domain of the S protein and for mRNA expressing S, M and E. These results show that the particle size after lipid encapsulation of the mRNA is 100-110 nm and that the encapsulation efficiency is greater than 90%.
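The sample handling described above implies a fixed dilution factor for the size measurement and a buffer volume range for dialysis. A minimal sketch of both calculations follows; the function names and the 1 mL sample volume in the usage example are illustrative assumptions and are not taken from the source.

```python
def dilution_factor(sample_ul: float, diluent_ul: float) -> float:
    """Fold dilution when a sample volume is added to a diluent volume."""
    return (sample_ul + diluent_ul) / sample_ul


def dialysis_buffer_range_ml(sample_ml: float, low: float = 650, high: float = 800):
    """Range of dialysis buffer volume for a buffer-to-sample ratio of low to high."""
    return (sample_ml * low, sample_ml * high)


if __name__ == "__main__":
    # 25 uL of LNP sample injected into 975 uL of phosphate buffer
    print(f"measurement dilution: {dilution_factor(25, 975):.0f}-fold")
    # Hypothetical 1 mL sample, used only to illustrate the 650-800x rule
    lo_ml, hi_ml = dialysis_buffer_range_ml(1.0)
    print(f"PBS needed for a 1 mL sample: {lo_ml:.0f} to {hi_ml:.0f} mL")
```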
The particle size and distribution of the mRNA samples after LNP encapsulation, measured with ZetaView, are shown in Table 1. S-RBD mRNA expresses the RBD domain of the S protein; SGS mRNA expresses the S protein; SME mRNA expresses the three proteins S, M and E, which can form virus-like particles. The particle size of the encapsulated LNPs is between 100 and 110 nm, which matches the expected size of the nanoparticles. The particle number after dilution is between 100 and 300, indicating that the dilution ratio is suitable. After dialysis against 1x PBS and filtration through 0.22 μm or 0.45 μm filters, the particle size and number remained stable, and the samples could be used for subsequent animal experiments.
TABLE 1
Example 6 mouse immunization experiment
The liposomal nanoparticles encapsulated as described above, expressing novel coronavirus VLPs (containing the SGS mRNA expressing the S protein described in Example 5, or the SME mRNA expressing the three proteins S, M and E) or the RBD (containing the mRNA expressing the RBD domain of the S protein described in Example 5), were injected intramuscularly (i.m.) with immunoadjuvant into Balb/c mice; the details are shown in Table 2 below. Blood samples were collected on day 42, and sera were analyzed in a fluorescent antibody virus neutralization assay, as described in Example 8 below.
TABLE 2
Group | Strain | Number | Route and volume | Vaccine dose | Time of inoculation
1 | Balb/c | 8 | i.m., 50 μl × 3 | Control PBS | D0, prime; D14, boost; D35, boost
2 | Balb/c | 8 | i.m., 50 μl × 1 | mRNA 10 μg | D0, prime
3 | Balb/c | 8 | i.m., 50 μl × 2 | mRNA 10 μg | D0, prime; D14, boost
4 | Balb/c | 8 | i.m., 50 μl × 3 | mRNA 10 μg | D0, prime; D14, boost; D35, boost
Example 7 measurement of antibody titer in serum by enzyme-linked immunosorbent assay
96-well ELISA plates were coated with antigen protein at 2 μg/ml in PBS, 50 μl/well (100 ng per well), at 4 °C overnight, protected from light. The S protein antigen was purchased from Sino Biological, cat# 40589-V08B1; the RBD domain of the S protein was purchased from Novoprotein, cat# DRA36. The plates were washed 3 times with PBST (0.05% Tween), 200 μl/well, inverting the ELISA plate and tapping it dry each time. Blocking was performed by adding 100 μl/well of 2% BSA (in PBST) and incubating at room temperature for 1 hr. The plates were again washed 3 times with PBST (0.05% Tween), 200 μl/well, inverting and tapping dry each time. Mouse serum was diluted in PBS (starting at a 100-fold dilution, followed by serial 5-fold dilutions, 6 dilutions in total) and mixed well; 100 μl of each dilution was added to the ELISA plate and incubated at room temperature for 2 hr. For the mice in Example 6, about 100 μl of blood was collected periorbitally from each mouse, yielding about 20 μl of serum. After washing, HRP-anti-mouse IgG (diluted 1:5000 in PBS) was added, 100 μl/well, and incubated at room temperature for 1 hr. After washing the plates, TMB substrate (Thermo Fisher, cat# 34022) was added, 50 μl/well, and the plates were left at room temperature for 5-15 min (protected from light) until a blue color developed. The reaction was stopped by adding 1 M sulfuric acid, 150 μl/well, upon which the blue color turned yellow. OD450 was read on a microplate reader.
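The coating amount and the serum dilution series in this protocol can be checked with a few lines of code. The sketch below also shows one common way of reading an endpoint titer from OD450 values, namely the highest dilution whose reading is still above a cutoff. The source does not state how titers were derived from the readings, so the cutoff rule, the function names and the OD values in the usage example are assumptions made for illustration only.

```python
def coating_mass_ng(conc_ug_per_ml: float, volume_ul: float) -> float:
    """Antigen mass per well; 1 ug/mL equals 1 ng/uL, so the product is in ng."""
    return conc_ug_per_ml * volume_ul


def dilution_series(start: int = 100, factor: int = 5, steps: int = 6) -> list:
    """Reciprocal serum dilutions: 100-fold start, 5-fold steps, 6 dilutions."""
    return [start * factor ** i for i in range(steps)]


def endpoint_titer(od_by_dilution: dict, cutoff: float):
    """Highest reciprocal dilution with OD450 above the cutoff, or None.

    Assumes readings fall monotonically with dilution; the cutoff is chosen
    by the user (for example from negative-control wells).
    """
    positive = [d for d, od in od_by_dilution.items() if od > cutoff]
    return max(positive) if positive else None


if __name__ == "__main__":
    print(coating_mass_ng(2, 50))          # ng of antigen per well
    series = dilution_series()
    print(series)
    # Hypothetical OD450 readings, for illustration only (not data from the source)
    example_od = dict(zip(series, [2.1, 1.8, 1.2, 0.6, 0.2, 0.1]))
    print(endpoint_titer(example_od, cutoff=0.3))
```

With this rule the reported titer can never exceed the last dilution tested, so a longer series or a curve-fitting method is needed when readings are still above the cutoff at the final dilution.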
The results are shown in FIG. 6. The mRNA expressing virus-like particles produced the highest antibody titer, up to 10^7. The mRNA expressing the S protein produced an antibody titer of 10^6, and the mRNA expressing the RBD domain of the S protein produced an antibody titer of 10^4. Thus, the virus-like particles expressed from the mRNA effectively activate the mouse immune system, promote the production of antibodies in serum and function effectively as a vaccine.
Example 8 neutralizing antibody detection assay
Virus neutralizing antibody responses (specific B cell immune responses) were detected by a virus neutralization assay. The result of this assay is called the virus neutralization titer (VNT). According to WHO standards, an antibody titer is considered protective if the respective VNT is at least 0.5 IU/ml. Therefore, blood samples were taken from the vaccinated mice described in Example 6 on day 42 and sera were prepared. These sera were used in a fluorescent antibody virus neutralization (FAVN) assay using human CACO-2 cells and pseudovirions (expressing the novel coronavirus S protein, with EGFP DNA as the core). Briefly, heat-inactivated sera were tested in quadruplicate in serial two-fold dilutions for their potential to neutralize 100 TCID50 (50% tissue culture infectious dose) of pseudovirions in a volume of 50 μl. To this end, the serum dilutions were incubated with virus for 1 hour at 37 °C (in a humidified incubator with 5% CO2), and trypsinized CACO-2 cells (4 × 10^5 cells/ml; 50 μl/well) were then added. The infected cell cultures were incubated for 48 hours in a humidified incubator at 37 °C and 5% CO2. After fixation of the cells with 80% acetone at room temperature, EGFP expression was detected by fluorescence, and its level was used as a marker of cell infection.
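For an assay run in quadruplicate over two-fold serial dilutions, a 50% neutralization endpoint is often estimated with the Spearman-Kärber method. The source does not say which calculation was used, so the sketch below is an assumption for illustration. The function names, the starting dilution of 1:2 and the well counts in the usage example are hypothetical. The code also works out the number of cells seeded per well from the stated density and volume.

```python
import math


def cells_per_well(cells_per_ml: float, volume_ul: float) -> float:
    """Number of cells added per well for a given density and volume."""
    return cells_per_ml * volume_ul / 1000.0


def spearman_karber_titer(protected_wells, wells_per_dilution=4,
                          first_dilution=2, factor=2):
    """Reciprocal serum dilution that neutralizes virus in 50% of wells.

    protected_wells: number of non-infected wells at each dilution, ordered
    from the most concentrated serum to the most dilute. Assumes every well
    would be protected at concentrations above the first dilution tested.
    """
    d = math.log10(factor)
    p_sum = sum(n / wells_per_dilution for n in protected_wells)
    log_titer = math.log10(first_dilution) - d / 2 + d * p_sum
    return 10 ** log_titer


if __name__ == "__main__":
    print(cells_per_well(4e5, 50))
    # Hypothetical plate: all 4 wells protected at 1:2 and 1:4, 2 of 4 at 1:8,
    # none at 1:16 (illustrative counts, not data from the source)
    print(round(spearman_karber_titer([4, 4, 2, 0]), 2))
```

The result is a reciprocal dilution; expressing it in IU/ml additionally requires a reference serum of known potency run in the same assay, which this sketch does not cover.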
The results shown in FIG. 7 indicate that the vaccines of the present invention (a vaccine forming virus-like particles, a vaccine expressing only the S protein, or a vaccine expressing only the RBD region of the S protein) effectively activated the mouse immune system and produced antibodies in serum, and were highly safe and effective. Of these, the combination of mRNAs expressing virus-like particles produced the highest neutralizing antibody titer. Non-neutralizing antibodies are not produced, and therefore no antibody-dependent enhancement of infection occurs.
References
1.Huang C,Wang Y,Li X,Ren L,Zhao J,et al.2020.Lancet
2.Zhu N,Zhang D,Wang W,Li X,Yang B,et al.2020.N Engl J Med
3.de Wit E,van Doremalen N,Falzarano D,Munster VJ.2016.Nat Rev Microbiol 14:523-34
4.Potter CW.2001.J Appl Microbiol 91:572-9
5.Smith W,Andrewes CH,Laidlaw PP.1933.Lancet 2:66-8
6.Barberis I,Myles P,Ault SK,Bragazzi NL,Martini M.2016.J Prev Med Hyg 57:E115-E20
7.Wolff JA,Malone RW,Williams P,Chong W,Acsadi G,et al.1990.Science 247:1465-8
8.Jirikowski GF,Sanna PP,Maciejewski-Lenoir D,Bloom FE.1992.Science 255:996-8
9.Zangi L,Lui KO,von Gise A,Ma Q,Ebina W,et al.2013.Nat Biotechnol 31:898-907
10.Kariko K,Muramatsu H,Ludwig J,Weissman D.2011.Nucleic Acids Res 39:e142
11.Reichmuth AM,Oberli MA,Jaklenec A,Langer R,Blankschtein D.2016.Ther Deliv 7:319-34
12.Sahin U,Kariko K,Tureci O.2014.Nat Rev Drug Discov 13:759-80
13.Pardi N,Hogan MJ,Porter FW,Weissman D.2018.Nat Rev Drug Discov 17:261-79
14.Hekele A,Bertholet S,Archer J,Gibson DG,Palladino G,et al.2013.Emerg Microbes Infect 2:e52
15.Richner JM,Himansu S,Dowd KA,Butler SL,Salazar V,et al.2017.Cell 169:176
16.Richner JM,Jagger BW,Shan C,Fontes CR,Dowd KA,et al.2017.Cell 170:273-83 e12
17.Feldman RA,Fuhr R,Smolenov I,Mick Ribeiro A,Panther L,et al.2019.Vaccine 37:3326-34
18.Chroboczek J,Szurgot I,Szolajska E.2014.Acta Biochim Pol 61:531-9
19.Yong CY,Ong HK,Yeap SK,Ho KL,Tan WS.2019.Front Microbiol 10:1781
20.Baric RS,Sheahan T,Deming D,Donaldson E,Yount B,et al.2006.Adv Exp Med Biol 581:553-60
21.Yip MS,Leung HL,Li PH,Cheung CY,Dutry I,et al.2016.Hong Kong Med J 22:25-31
22.Millet JK,Tang T,Nathan L,Jaimes JA,Hsu HL,et al.2019.J Vis Exp
23.Islam MA,Xu Y,Tao W,Ubellacker JM,Lim M,et al.2018.Nat Biomed Eng 2:850-64
SEQUENCE LISTING
<110> Shanghai blue magpie Bio-pharmaceutical Co Ltd
<120> mRNA and novel coronavirus mRNA vaccine comprising the same
<130> P20011191C
<160> 43
<170> PatentIn version 3.5
<210> 1
<211> 1273
<212> PRT
<213> SARS-COV-2
<400> 1
Met Phe Val Phe Leu Val Leu Leu Pro Leu Val Ser Ser Gln Cys Val
1 5 10 15
Asn Leu Thr Thr Arg Thr Gln Leu Pro Pro Ala Tyr Thr Asn Ser Phe
20 25 30
Thr Arg Gly Val Tyr Tyr Pro Asp Lys Val Phe Arg Ser Ser Val Leu
35 40 45
His Ser Thr Gln Asp Leu Phe Leu Pro Phe Phe Ser Asn Val Thr Trp
50 55 60
Phe His Ala Ile His Val Ser Gly Thr Asn Gly Thr Lys Arg Phe Asp
65 70 75 80
Asn Pro Val Leu Pro Phe Asn Asp Gly Val Tyr Phe Ala Ser Thr Glu
85 90 95
Lys Ser Asn Ile Ile Arg Gly Trp Ile Phe Gly Thr Thr Leu Asp Ser
100 105 110
Lys Thr Gln Ser Leu Leu Ile Val Asn Asn Ala Thr Asn Val Val Ile
115 120 125
Lys Val Cys Glu Phe Gln Phe Cys Asn Asp Pro Phe Leu Gly Val Tyr
130 135 140
Tyr His Lys Asn Asn Lys Ser Trp Met Glu Ser Glu Phe Arg Val Tyr
145 150 155 160
Ser Ser Ala Asn Asn Cys Thr Phe Glu Tyr Val Ser Gln Pro Phe Leu
165 170 175
Met Asp Leu Glu Gly Lys Gln Gly Asn Phe Lys Asn Leu Arg Glu Phe
180 185 190
Val Phe Lys Asn Ile Asp Gly Tyr Phe Lys Ile Tyr Ser Lys His Thr
195 200 205
Pro Ile Asn Leu Val Arg Asp Leu Pro Gln Gly Phe Ser Ala Leu Glu
210 215 220
Pro Leu Val Asp Leu Pro Ile Gly Ile Asn Ile Thr Arg Phe Gln Thr
225 230 235 240
Leu Leu Ala Leu His Arg Ser Tyr Leu Thr Pro Gly Asp Ser Ser Ser
245 250 255
Gly Trp Thr Ala Gly Ala Ala Ala Tyr Tyr Val Gly Tyr Leu Gln Pro
260 265 270
Arg Thr Phe Leu Leu Lys Tyr Asn Glu Asn Gly Thr Ile Thr Asp Ala
275 280 285
Val Asp Cys Ala Leu Asp Pro Leu Ser Glu Thr Lys Cys Thr Leu Lys
290 295 300
Ser Phe Thr Val Glu Lys Gly Ile Tyr Gln Thr Ser Asn Phe Arg Val
305 310 315 320
Gln Pro Thr Glu Ser Ile Val Arg Phe Pro Asn Ile Thr Asn Leu Cys
325 330 335
Pro Phe Gly Glu Val Phe Asn Ala Thr Arg Phe Ala Ser Val Tyr Ala
340 345 350
Trp Asn Arg Lys Arg Ile Ser Asn Cys Val Ala Asp Tyr Ser Val Leu
355 360 365
Tyr Asn Ser Ala Ser Phe Ser Thr Phe Lys Cys Tyr Gly Val Ser Pro
370 375 380
Thr Lys Leu Asn Asp Leu Cys Phe Thr Asn Val Tyr Ala Asp Ser Phe
385 390 395 400
Val Ile Arg Gly Asp Glu Val Arg Gln Ile Ala Pro Gly Gln Thr Gly
405 410 415
Lys Ile Ala Asp Tyr Asn Tyr Lys Leu Pro Asp Asp Phe Thr Gly Cys
420 425 430
Val Ile Ala Trp Asn Ser Asn Asn Leu Asp Ser Lys Val Gly Gly Asn
435 440 445
Tyr Asn Tyr Leu Tyr Arg Leu Phe Arg Lys Ser Asn Leu Lys Pro Phe
450 455 460
Glu Arg Asp Ile Ser Thr Glu Ile Tyr Gln Ala Gly Ser Thr Pro Cys
465 470 475 480
Asn Gly Val Glu Gly Phe Asn Cys Tyr Phe Pro Leu Gln Ser Tyr Gly
485 490 495
Phe Gln Pro Thr Asn Gly Val Gly Tyr Gln Pro Tyr Arg Val Val Val
500 505 510
Leu Ser Phe Glu Leu Leu His Ala Pro Ala Thr Val Cys Gly Pro Lys
515 520 525
Lys Ser Thr Asn Leu Val Lys Asn Lys Cys Val Asn Phe Asn Phe Asn
530 535 540
Gly Leu Thr Gly Thr Gly Val Leu Thr Glu Ser Asn Lys Lys Phe Leu
545 550 555 560
Pro Phe Gln Gln Phe Gly Arg Asp Ile Ala Asp Thr Thr Asp Ala Val
565 570 575
Arg Asp Pro Gln Thr Leu Glu Ile Leu Asp Ile Thr Pro Cys Ser Phe
580 585 590
Gly Gly Val Ser Val Ile Thr Pro Gly Thr Asn Thr Ser Asn Gln Val
595 600 605
Ala Val Leu Tyr Gln Asp Val Asn Cys Thr Glu Val Pro Val Ala Ile
610 615 620
His Ala Asp Gln Leu Thr Pro Thr Trp Arg Val Tyr Ser Thr Gly Ser
625 630 635 640
Asn Val Phe Gln Thr Arg Ala Gly Cys Leu Ile Gly Ala Glu His Val
645 650 655
Asn Asn Ser Tyr Glu Cys Asp Ile Pro Ile Gly Ala Gly Ile Cys Ala
660 665 670
Ser Tyr Gln Thr Gln Thr Asn Ser Pro Arg Arg Ala Arg Ser Val Ala
675 680 685
Ser Gln Ser Ile Ile Ala Tyr Thr Met Ser Leu Gly Ala Glu Asn Ser
690 695 700
Val Ala Tyr Ser Asn Asn Ser Ile Ala Ile Pro Thr Asn Phe Thr Ile
705 710 715 720
Ser Val Thr Thr Glu Ile Leu Pro Val Ser Met Thr Lys Thr Ser Val
725 730 735
Asp Cys Thr Met Tyr Ile Cys Gly Asp Ser Thr Glu Cys Ser Asn Leu
740 745 750
Leu Leu Gln Tyr Gly Ser Phe Cys Thr Gln Leu Asn Arg Ala Leu Thr
755 760 765
Gly Ile Ala Val Glu Gln Asp Lys Asn Thr Gln Glu Val Phe Ala Gln
770 775 780
Val Lys Gln Ile Tyr Lys Thr Pro Pro Ile Lys Asp Phe Gly Gly Phe
785 790 795 800
Asn Phe Ser Gln Ile Leu Pro Asp Pro Ser Lys Pro Ser Lys Arg Ser
805 810 815
Phe Ile Glu Asp Leu Leu Phe Asn Lys Val Thr Leu Ala Asp Ala Gly
820 825 830
Phe Ile Lys Gln Tyr Gly Asp Cys Leu Gly Asp Ile Ala Ala Arg Asp
835 840 845
Leu Ile Cys Ala Gln Lys Phe Asn Gly Leu Thr Val Leu Pro Pro Leu
850 855 860
Leu Thr Asp Glu Met Ile Ala Gln Tyr Thr Ser Ala Leu Leu Ala Gly
865 870 875 880
Thr Ile Thr Ser Gly Trp Thr Phe Gly Ala Gly Ala Ala Leu Gln Ile
885 890 895
Pro Phe Ala Met Gln Met Ala Tyr Arg Phe Asn Gly Ile Gly Val Thr
900 905 910
Gln Asn Val Leu Tyr Glu Asn Gln Lys Leu Ile Ala Asn Gln Phe Asn
915 920 925
Ser Ala Ile Gly Lys Ile Gln Asp Ser Leu Ser Ser Thr Ala Ser Ala
930 935 940
Leu Gly Lys Leu Gln Asp Val Val Asn Gln Asn Ala Gln Ala Leu Asn
945 950 955 960
Thr Leu Val Lys Gln Leu Ser Ser Asn Phe Gly Ala Ile Ser Ser Val
965 970 975
Leu Asn Asp Ile Leu Ser Arg Leu Asp Lys Val Glu Ala Glu Val Gln
980 985 990
Ile Asp Arg Leu Ile Thr Gly Arg Leu Gln Ser Leu Gln Thr Tyr Val
995 1000 1005
Thr Gln Gln Leu Ile Arg Ala Ala Glu Ile Arg Ala Ser Ala Asn
1010 1015 1020
Leu Ala Ala Thr Lys Met Ser Glu Cys Val Leu Gly Gln Ser Lys
1025 1030 1035
Arg Val Asp Phe Cys Gly Lys Gly Tyr His Leu Met Ser Phe Pro
1040 1045 1050
Gln Ser Ala Pro His Gly Val Val Phe Leu His Val Thr Tyr Val
1055 1060 1065
Pro Ala Gln Glu Lys Asn Phe Thr Thr Ala Pro Ala Ile Cys His
1070 1075 1080
Asp Gly Lys Ala His Phe Pro Arg Glu Gly Val Phe Val Ser Asn
1085 1090 1095
Gly Thr His Trp Phe Val Thr Gln Arg Asn Phe Tyr Glu Pro Gln
1100 1105 1110
Ile Ile Thr Thr Asp Asn Thr Phe Val Ser Gly Asn Cys Asp Val
1115 1120 1125
Val Ile Gly Ile Val Asn Asn Thr Val Tyr Asp Pro Leu Gln Pro
1130 1135 1140
Glu Leu Asp Ser Phe Lys Glu Glu Leu Asp Lys Tyr Phe Lys Asn
1145 1150 1155
His Thr Ser Pro Asp Val Asp Leu Gly Asp Ile Ser Gly Ile Asn
1160 1165 1170
Ala Ser Val Val Asn Ile Gln Lys Glu Ile Asp Arg Leu Asn Glu
1175 1180 1185
Val Ala Lys Asn Leu Asn Glu Ser Leu Ile Asp Leu Gln Glu Leu
1190 1195 1200
Gly Lys Tyr Glu Gln Tyr Ile Lys Trp Pro Trp Tyr Ile Trp Leu
1205 1210 1215
Gly Phe Ile Ala Gly Leu Ile Ala Ile Val Met Val Thr Ile Met
1220 1225 1230
Leu Cys Cys Met Thr Ser Cys Cys Ser Cys Leu Lys Gly Cys Cys
1235 1240 1245
Ser Cys Gly Ser Cys Cys Lys Phe Asp Glu Asp Asp Ser Glu Pro
1250 1255 1260
Val Leu Lys Gly Val Lys Leu His Tyr Thr
1265 1270
<210> 2
<211> 3819
<212> DNA
<213> SARS-COV-2
<400> 2
atgtttgttt ttcttgtttt attgccacta gtctctagtc agtgtgttaa tcttacaacc 60
agaactcaat taccccctgc atacactaat tctttcacac gtggtgttta ttaccctgac 120
aaagttttca gatcctcagt tttacattca actcaggact tgttcttacc tttcttttcc 180
aatgttactt ggttccatgc tatacatgtc tctgggacca atggtactaa gaggtttgat 240
aaccctgtcc taccatttaa tgatggtgtt tattttgctt ccactgagaa gtctaacata 300
ataagaggct ggatttttgg tactacttta gattcgaaga cccagtccct acttattgtt 360
aataacgcta ctaatgttgt tattaaagtc tgtgaatttc aattttgtaa tgatccattt 420
ttgggtgttt attaccacaa aaacaacaaa agttggatgg aaagtgagtt cagagtttat 480
tctagtgcga ataattgcac ttttgaatat gtctctcagc cttttcttat ggaccttgaa 540
ggaaaacagg gtaatttcaa aaatcttagg gaatttgtgt ttaagaatat tgatggttat 600
tttaaaatat attctaagca cacgcctatt aatttagtgc gtgatctccc tcagggtttt 660
tcggctttag aaccattggt agatttgcca ataggtatta acatcactag gtttcaaact 720
ttacttgctt tacatagaag ttatttgact cctggtgatt cttcttcagg ttggacagct 780
ggtgctgcag cttattatgt gggttatctt caacctagga cttttctatt aaaatataat 840
gaaaatggaa ccattacaga tgctgtagac tgtgcacttg accctctctc agaaacaaag 900
tgtacgttga aatccttcac tgtagaaaaa ggaatctatc aaacttctaa ctttagagtc 960
caaccaacag aatctattgt tagatttcct aatattacaa acttgtgccc ttttggtgaa 1020
gtttttaacg ccaccagatt tgcatctgtt tatgcttgga acaggaagag aatcagcaac 1080
tgtgttgctg attattctgt cctatataat tccgcatcat tttccacttt taagtgttat 1140
ggagtgtctc ctactaaatt aaatgatctc tgctttacta atgtctatgc agattcattt 1200
gtaattagag gtgatgaagt cagacaaatc gctccagggc aaactggaaa gattgctgat 1260
tataattata aattaccaga tgattttaca ggctgcgtta tagcttggaa ttctaacaat 1320
cttgattcta aggttggtgg taattataat tacctgtata gattgtttag gaagtctaat 1380
ctcaaacctt ttgagagaga tatttcaact gaaatctatc aggccggtag cacaccttgt 1440
aatggtgttg aaggttttaa ttgttacttt cctttacaat catatggttt ccaacccact 1500
aatggtgttg gttaccaacc atacagagta gtagtacttt cttttgaact tctacatgca 1560
ccagcaactg tttgtggacc taaaaagtct actaatttgg ttaaaaacaa atgtgtcaat 1620
ttcaacttca atggtttaac aggcacaggt gttcttactg agtctaacaa aaagtttctg 1680
cctttccaac aatttggcag agacattgct gacactactg atgctgtccg tgatccacag 1740
acacttgaga ttcttgacat tacaccatgt tcttttggtg gtgtcagtgt tataacacca 1800
ggaacaaata cttctaacca ggttgctgtt ctttatcagg atgttaactg cacagaagtc 1860
cctgttgcta ttcatgcaga tcaacttact cctacttggc gtgtttattc tacaggttct 1920
aatgtttttc aaacacgtgc aggctgttta ataggggctg aacatgtcaa caactcatat 1980
gagtgtgaca tacccattgg tgcaggtata tgcgctagtt atcagactca gactaattct 2040
cctcggcggg cacgtagtgt agctagtcaa tccatcattg cctacactat gtcacttggt 2100
gcagaaaatt cagttgctta ctctaataac tctattgcca tacccacaaa ttttactatt 2160
agtgttacca cagaaattct accagtgtct atgaccaaga catcagtaga ttgtacaatg 2220
tacatttgtg gtgattcaac tgaatgcagc aatcttttgt tgcaatatgg cagtttttgt 2280
acacaattaa accgtgcttt aactggaata gctgttgaac aagacaaaaa cacccaagaa 2340
gtttttgcac aagtcaaaca aatttacaaa acaccaccaa ttaaagattt tggtggtttt 2400
aatttttcac aaatattacc agatccatca aaaccaagca agaggtcatt tattgaagat 2460
ctacttttca acaaagtgac acttgcagat gctggcttca tcaaacaata tggtgattgc 2520
cttggtgata ttgctgctag agacctcatt tgtgcacaaa agtttaacgg ccttactgtt 2580
ttgccacctt tgctcacaga tgaaatgatt gctcaataca cttctgcact gttagcgggt 2640
acaatcactt ctggttggac ctttggtgca ggtgctgcat tacaaatacc atttgctatg 2700
caaatggctt ataggtttaa tggtattgga gttacacaga atgttctcta tgagaaccaa 2760
aaattgattg ccaaccaatt taatagtgct attggcaaaa ttcaagactc actttcttcc 2820
acagcaagtg cacttggaaa acttcaagat gtggtcaacc aaaatgcaca agctttaaac 2880
acgcttgtta aacaacttag ctccaatttt ggtgcaattt caagtgtttt aaatgatatc 2940
ctttcacgtc ttgacaaagt tgaggctgaa gtgcaaattg ataggttgat cacaggcaga 3000
cttcaaagtt tgcagacata tgtgactcaa caattaatta gagctgcaga aatcagagct 3060
tctgctaatc ttgctgctac taaaatgtca gagtgtgtac ttggacaatc aaaaagagtt 3120
gatttttgtg gaaagggcta tcatcttatg tccttccctc agtcagcacc tcatggtgta 3180
gtcttcttgc atgtgactta tgtccctgca caagaaaaga acttcacaac tgctcctgcc 3240
atttgtcatg atggaaaagc acactttcct cgtgaaggtg tctttgtttc aaatggcaca 3300
cactggtttg taacacaaag gaatttttat gaaccacaaa tcattactac agacaacaca 3360
tttgtgtctg gtaactgtga tgttgtaata ggaattgtca acaacacagt ttatgatcct 3420
ttgcaacctg aattagactc attcaaggag gagttagata aatattttaa gaatcataca 3480
tcaccagatg ttgatttagg tgacatctct ggcattaatg cttcagttgt aaacattcaa 3540
aaagaaattg accgcctcaa tgaggttgcc aagaatttaa atgaatctct catcgatctc 3600
caagaacttg gaaagtatga gcagtatata aaatggccat ggtacatttg gctaggtttt 3660
atagctggct tgattgccat agtaatggtg acaattatgc tttgctgtat gaccagttgc 3720
tgtagttgtc tcaagggctg ttgttcttgt ggatcctgct gcaaatttga tgaagacgac 3780
tctgagccag tgctcaaagg agtcaaatta cattacaca 3819
<210> 3
<211> 3819
<212> DNA
<213> Artificial Sequence
<220>
<223> sequence after S protein gene optimization (S-GS)
<400> 3
atgttcgtct tcctggtcct gctgcctctg gtctcctcac agtgcgtcaa tctgacaact 60
cggactcagc tgccacctgc ttatactaat agcttcacca gaggcgtgta ctatcctgac 120
aaggtgttta gaagctccgt gctgcactct acacaggatc tgtttctgcc attctttagc 180
aacgtgacct ggttccacgc catccacgtg agcggcacca atggcacaaa gcggttcgac 240
aatcccgtgc tgccttttaa cgatggcgtg tacttcgcct ctaccgagaa gagcaacatc 300
atcagaggct ggatctttgg caccacactg gactccaaga cacagtctct gctgatcgtg 360
aacaatgcca ccaacgtggt catcaaggtg tgcgagttcc agttttgtaa tgatcccttc 420
ctgggcgtgt actatcacaa gaacaataag agctggatgg agtccgagtt tagagtgtat 480
tctagcgcca acaactgcac atttgagtac gtgagccagc ctttcctgat ggacctggag 540
ggcaagcagg gcaatttcaa gaacctgagg gagttcgtgt ttaagaatat cgacggctac 600
ttcaaaatct actctaagca cacccccatc aacctggtgc gcgacctgcc tcagggcttc 660
agcgccctgg agcccctggt ggatctgcct atcggcatca acatcacccg gtttcagaca 720
ctgctggccc tgcacagaag ctacctgaca cccggcgact cctctagcgg atggaccgcc 780
ggcgctgccg cctactatgt gggctacctc cagccccgga ccttcctgct gaagtacaac 840
gagaatggca ccatcacaga cgcagtggat tgcgccctgg accccctgag cgagacaaag 900
tgtacactga agtcctttac cgtggagaag ggcatctatc agacatccaa tttcagggtg 960
cagccaaccg agtctatcgt gcgctttcct aatatcacaa acctgtgccc atttggcgag 1020
gtgttcaacg caacccgctt cgccagcgtg tacgcctgga ataggaagcg gatcagcaac 1080
tgcgtggccg actatagcgt gctgtacaac tccgcctctt tcagcacctt taagtgctat 1140
ggcgtgtccc ccacaaagct gaatgacctg tgctttacca acgtctacgc cgattctttc 1200
gtgatcaggg gcgacgaggt gcgccagatc gcccccggcc agacaggcaa gatcgcagac 1260
tacaattata agctgccaga cgatttcacc ggctgcgtga tcgcctggaa cagcaacaat 1320
ctggattcca aagtgggcgg caactacaat tatctgtacc ggctgtttag aaagagcaat 1380
ctgaagccct tcgagaggga catctctaca gaaatctacc aggccggcag caccccttgc 1440
aatggcgtgg agggctttaa ctgttatttc ccactccagt cctacggctt ccagcccaca 1500
aacggcgtgg gctatcagcc ttaccgcgtg gtggtgctga gctttgagct gctgcacgcc 1560
ccagcaacag tgtgcggccc caagaagtcc accaatctgg tgaagaacaa gtgcgtgaac 1620
ttcaacttca acggcctgac cggcacaggc gtgctgaccg agtccaacaa gaagttcctg 1680
ccatttcagc agttcggcag ggacatcgca gataccacag acgccgtgcg cgacccacag 1740
accctggaga tcctggacat cacaccctgc tctttcggcg gcgtgagcgt gatcacaccc 1800
ggcaccaata caagcaacca ggtggccgtg ctgtatcagg acgtgaattg taccgaggtg 1860
cccgtggcta tccacgccga tcagctgacc ccaacatggc gggtgtacag caccggctcc 1920
aacgtcttcc agacaagagc cggatgcctg atcggagcag agcacgtgaa caattcctat 1980
gagtgcgaca tcccaatcgg cgccggcatc tgtgcctctt accagaccca gacaaactct 2040
cccagaagag cccggagcgt ggcctcccag tctatcatcg cctataccat gtccctgggc 2100
gccgagaaca gcgtggccta ctctaacaat agcatcgcca tcccaaccaa cttcacaatc 2160
tctgtgacca cagagatcct gcccgtgtcc atgaccaaga catctgtgga ctgcacaatg 2220
tatatctgtg gcgattctac cgagtgcagc aacctgctgc tccagtacgg cagcttttgt 2280
acccagctga atagagccct gacaggcatc gccgtggagc aggataagaa cacacaggag 2340
gtgttcgccc aggtgaagca aatctacaag acccccccta tcaaggactt tggcggcttc 2400
aatttttccc agatcctgcc tgatccatcc aagccttcta agcggagctt tatcgaggac 2460
ctgctgttca acaaggtgac cctggccgat gccggcttca tcaagcagta tggcgattgc 2520
ctgggcgaca tcgcagccag ggacctgatc tgcgcccaga agtttaatgg cctgaccgtg 2580
ctgccacccc tgctgacaga tgagatgatc gcacagtaca caagcgccct gctggccggc 2640
accatcacat ccggatggac cttcggcgca ggagccgccc tccagatccc ctttgccatg 2700
cagatggcct ataggttcaa cggcatcggc gtgacccaga atgtgctgta cgagaaccag 2760
aagctgatcg ccaatcagtt taactccgcc atcggcaaga tccaggacag cctgtcctct 2820
acagccagcg ccctgggcaa gctccaggat gtggtgaatc agaacgccca ggccctgaat 2880
accctggtga agcagctgag cagcaacttc ggcgccatct ctagcgtgct gaatgacatc 2940
ctgagccggc tggacaaggt ggaggcagag gtgcagatcg accggctgat caccggccgg 3000
ctccagagcc tccagaccta tgtgacacag cagctgatca gggccgccga gatcagggcc 3060
agcgccaatc tggcagcaac caagatgtcc gagtgcgtgc tgggccagtc taagagagtg 3120
gacttttgtg gcaagggcta tcacctgatg tccttccctc agtctgcccc acacggcgtg 3180
gtgtttctgc acgtgaccta cgtgcccgcc caggagaaga acttcaccac agcccctgcc 3240
atctgccacg atggcaaggc ccactttcca agggagggcg tgttcgtgtc caacggcacc 3300
cactggtttg tgacacagcg caatttctac gagccccaga tcatcaccac agacaacacc 3360
ttcgtgagcg gcaactgtga cgtggtcatc ggcatcgtga acaataccgt gtatgatcca 3420
ctccagcccg agctggacag ctttaaggag gagctggata agtatttcaa gaatcacacc 3480
tcccctgacg tggatctggg cgacatcagc ggcatcaatg cctccgtggt gaacatccag 3540
aaggagatcg accgcctgaa cgaggtggct aagaatctga acgagagcct gatcgacctc 3600
caggagctgg gcaagtatga gcagtacatc aagtggccct ggtacatctg gctgggcttc 3660
atcgccggcc tgatcgccat cgtgatggtg accatcatgc tgtgctgtat gacatcctgc 3720
tgttcttgcc tgaagggctg ctgtagctgt ggctcctgct gtaagtttga cgaggatgac 3780
tctgaacctg tgctgaaggg cgtgaagctg cattacacc 3819
<210> 4
<211> 3819
<212> DNA
<213> Artificial Sequence
<220>
<223> Sequence (SBL) after S protein gene optimization
<400> 4
atgttcgttt tcctcgttct gctgcctctt gtcagctctc agtgtgtgaa cctgacaact 60
agaacacaac tacctcccgc ctacacaaac tctttcaccc ggggcgtgta ctacccagac 120
aaagtgttca ggagctctgt gttgcacagc acccaagacc tgtttttgcc attctttagt 180
aatgtgacct ggtttcacgc tatccatgtg tcgggcacca acgggaccaa aagattcgac 240
aaccccgttc tgccgttcaa cgacggcgtg tacttcgcta gcactgagaa gtccaacatt 300
attcgcgggt ggatcttcgg aactaccttg gactccaaaa cacagtctct actcatcgtg 360
aacaacgcga ctaacgtggt gattaaggtg tgtgaatttc agttctgcaa tgatccattt 420
ttaggagtgt actaccacaa aaataataaa tcatggatgg agtctgaatt tcgcgtatac 480
agtagcgcta ataactgtac attcgaatat gttagccaac cctttttgat ggacttagag 540
gggaagcagg gaaattttaa gaatttgcga gaatttgtgt tcaaaaatat cgatgggtat 600
ttcaagatct actccaagca tactcccata aatctggtgc gcgacttacc tcaagggttc 660
agcgcactgg agccactggt agacctgcca atcggcatca acatcacccg attccagacc 720
ctgcttgctc tgcaccgttc atatctgaca ccaggagatt cgtcttccgg atggacagca 780
ggggccgctg cttactatgt tggttatctt cagcctcgga cctttctgct caagtataat 840
gagaatggga ccattaccga cgctgttgat tgtgctctcg atcccctgtc agaaaccaag 900
tgcacactaa aatctttcac agtcgaaaag gggatctacc agacttctaa ctttcgtgta 960
cagcccaccg agagcatcgt caggttccca aatatcacta acctgtgtcc ttttggcgag 1020
gtgttcaacg ctacaagatt tgctagcgtg tacgcctgga acagaaaaag aatatcaaat 1080
tgcgtagccg attacagcgt cttatataac tctgcatcct tctcaacttt caagtgttat 1140
ggagtgagcc cgactaagct gaatgatttg tgctttacaa atgtttatgc cgattcattc 1200
gtgatccggg gcgacgaggt cagacagatc gcccctggcc aaacaggtaa gattgctgat 1260
tacaactaca aattacctga cgattttaca ggatgcgtta tcgcttggaa ctctaacaat 1320
ctcgattcta aggtcggcgg caattacaat tatctttatc gccttttcag gaagtcaaat 1380
cttaagccat tcgagcgaga catcagtacc gagatatacc aggcggggtc caccccgtgt 1440
aacggtgtcg agggtttcaa ctgctacttt ccactgcagt cctatgggtt ccagcccacc 1500
aatggcgtgg gttaccagcc ctaccgagta gtcgtattgt cttttgagct cttgcacgcc 1560
cccgccacgg tgtgcggtcc aaagaaatca actaacttag ttaagaataa atgtgtgaat 1620
tttaacttta acggcctgac agggacagga gtcctgacag aatccaataa gaagttcctt 1680
ccctttcagc agtttggacg cgacatcgca gacaccacag acgccgtgcg tgacccccaa 1740
actctcgaaa ttctcgatat cacaccctgc agttttggcg gggtcagtgt cattacccct 1800
gggaccaata ctagtaacca ggtcgcagtg ctttaccaag atgtcaactg taccgaggtt 1860
cctgtggcta ttcacgcaga ccaactgact ccgacttggc gggtgtatag tacaggctcc 1920
aatgtgtttc agacccgggc aggctgcctg attggggccg agcatgtaaa taactcctac 1980
gagtgcgata tccccatagg tgctggaata tgtgccagtt atcagaccca gacgaactcg 2040
ccaagacgag ctaggtccgt agcctctcag agcataatcg cgtacactat gagcctgggg 2100
gccgaaaatt ccgtggcata tagcaacaac agcattgcta ttcctactaa ctttacaatt 2160
tcagtcacga cggagatcct gccagtctcc atgactaaaa cctccgtgga ctgtacgatg 2220
tacatttgtg gcgattcaac tgaatgctct aacctgctct tacagtacgg ttctttttgt 2280
acccagctga accgggcatt gacgggcatc gcagttgagc aggacaagaa tactcaggag 2340
gtgtttgcgc aagtgaagca aatttataaa actcctccca ttaaggactt tggcggtttc 2400
aacttctcgc agatcctacc tgacccatca aaacctagca agaggtcttt cattgaagac 2460
cttctgttca acaaggtcac actggctgac gccggcttca ttaaacagta cggagattgt 2520
ctaggtgata ttgcagcgcg cgatctgatt tgcgcacaga agtttaacgg cctgacggtc 2580
ttaccccctc tccttaccga cgaaatgatt gcccagtaca ccagcgccct gctcgctggc 2640
acgattacta gcggatggac atttggggcc ggcgctgccc tccagatacc atttgccatg 2700
cagatggcgt ataggtttaa cggcatagga gtaacccaga acgtgctgta cgagaaccaa 2760
aaactgatag ccaatcaatt caatagtgcc ataggaaaga tacaggacag tctcagcagc 2820
accgcgtccg ctctcggaaa gctacaagat gtggtcaacc agaacgcgca ggcattgaat 2880
acactggtga agcagctctc ctcgaatttt ggagcaatca gcagcgtgct gaatgatatc 2940
ctgtctcggc tggacaaggt tgaagccgaa gtccagatcg acaggttaat caccggtcgg 3000
ctgcagagtc tccagacata tgttacccag caactcatca gagctgccga aatacgcgcc 3060
agtgccaatc ttgcagccac taagatgtcc gagtgcgtgt tggggcaaag taaaagggtt 3120
gatttctgtg gaaaaggata tcatcttatg agtttccctc aatccgcccc tcacggagtt 3180
gtcttcctgc atgtgaccta cgtgccagcg caggagaaga acttcacgac cgcccccgcc 3240
atctgccatg atggcaaggc ccattttccc cgcgaaggag tgttcgtatc caatggcacc 3300
cactggttcg tgacgcagag aaatttttat gagccgcaaa ttatcactac cgacaacaca 3360
ttcgtttccg gcaattgcga tgtcgtaatc gggatcgtga ataatacagt ctatgatcct 3420
cttcagccag aactcgattc attcaaagag gagctggata aatatttcaa gaaccacacc 3480
tcccccgatg tggatctggg tgacatatca ggaattaacg caagcgtcgt gaacattcag 3540
aaggaaatcg acaggctcaa tgaagtagca aagaacttga atgagtctct catcgacttg 3600
caggaactcg gcaaatatga gcagtacatt aaatggccgt ggtatatctg gctaggcttt 3660
atcgccggtc tgattgcaat tgtgatggtt actatcatgt tgtgctgcat gacaagttgc 3720
tgttcatgcc ttaaaggctg ctgctcctgc gggtcatgtt gtaaattcga tgaggacgac 3780
tctgagcccg tgctgaaagg ggtgaaactg cactacacg 3819
<210> 5
<211> 3819
<212> DNA
<213> Artificial Sequence
<220>
<223> S protein Gene optimization sequence 3 (STF)
<400> 5
atgttcgtgt tcctggtgct gctgcctctg gtgtccagcc agtgtgtgaa cctgaccacc 60
agaacacagc tgcctccagc ctacaccaac agctttacca gaggcgtgta ctaccccgac 120
aaggtgttca gatccagcgt gctgcactct acccaggacc tgttcctgcc tttcttcagc 180
aacgtgacct ggttccacgc catccacgtg tccggcacca atggcaccaa gagattcgac 240
aaccccgtgc tgcccttcaa cgacggggtg tactttgcca gcaccgagaa gtccaacatc 300
atcagaggct ggatcttcgg caccacactg gacagcaaga cccagagcct gctgatcgtg 360
aacaacgcca ccaacgtggt catcaaagtg tgcgagttcc agttctgcaa cgaccccttc 420
ctgggcgtct actaccacaa gaacaacaag agctggatgg aaagcgagtt ccgggtgtac 480
agcagcgcca acaactgcac cttcgagtac gtgtcccagc ctttcctgat ggacctggaa 540
ggcaagcagg gcaacttcaa gaacctgcgc gagttcgtgt ttaagaacat cgacggctac 600
ttcaagatct acagcaagca cacccctatc aacctcgtgc gggatctgcc tcagggcttc 660
tctgctctgg aacccctggt ggatctgccc atcggcatca acatcacccg gtttcagaca 720
ctgctggccc tgcacagaag ctacctgaca cctggcgata gcagcagcgg atggacagct 780
ggtgccgccg cttactatgt gggctacctg cagcctagaa ccttcctgct gaagtacaac 840
gagaacggca ccatcaccga cgccgtggat tgtgctctgg atcctctgag cgagacaaag 900
tgcaccctga agtccttcac cgtggaaaag ggcatctacc agaccagcaa cttccgggtg 960
cagcccaccg aatccatcgt gcggttcccc aatatcacca atctgtgccc cttcggcgag 1020
gtgttcaatg ccaccagatt cgcctctgtg tacgcctgga accggaagcg gatcagcaat 1080
tgcgtggccg actactccgt gctgtacaac tccgccagct tcagcacctt caagtgctac 1140
ggcgtgtccc ctaccaagct gaacgacctg tgcttcacaa acgtgtacgc cgacagcttc 1200
gtgatccggg gagatgaagt gcggcagatt gcccctggac agacaggcaa gatcgccgac 1260
tacaactaca agctgcccga cgacttcacc ggctgtgtga ttgcctggaa cagcaacaac 1320
ctggactcca aagtcggcgg caactacaat tacctgtacc ggctgttccg gaagtccaat 1380
ctgaagccct tcgagcggga catctccacc gagatctatc aggccggcag caccccttgt 1440
aacggcgtgg aaggcttcaa ctgctacttc ccactgcagt cctacggctt tcagcccaca 1500
aatggcgtgg gctatcagcc ctacagagtg gtggtgctga gcttcgaact gctgcatgcc 1560
cctgccacag tgtgcggccc taagaaaagc accaatctcg tgaagaacaa atgcgtgaac 1620
ttcaacttca acggcctgac cggcaccggc gtgctgacag agagcaacaa gaagttcctg 1680
ccattccagc agtttggccg ggatatcgcc gataccacag acgccgttag agatccccag 1740
acactggaaa tcctggacat caccccttgc agcttcggcg gagtgtctgt gatcacccct 1800
ggcaccaaca ccagcaatca ggtggcagtg ctgtaccagg acgtgaactg taccgaagtg 1860
cccgtggcca ttcacgccga tcagctgaca cctacatggc gggtgtactc caccggcagc 1920
aatgtgtttc agaccagagc cggctgtctg atcggagccg agcacgtgaa caatagctac 1980
gagtgcgaca tccccatcgg cgctggcatc tgtgccagct accagacaca gacaaacagc 2040
cccagacggg ccagatctgt ggccagccag agcatcattg cctacacaat gtctctgggc 2100
gccgagaaca gcgtggccta ctccaacaac tctatcgcta tccccaccaa cttcaccatc 2160
agcgtgacca cagagatcct gcctgtgtcc atgaccaaga ccagcgtgga ctgcaccatg 2220
tacatctgcg gcgattccac cgagtgctcc aacctgctgc tgcagtacgg cagcttctgc 2280
acccagctga atagagccct gacagggatc gccgtggaac aggacaagaa cacccaagag 2340
gtgttcgccc aagtgaagca gatctacaag acccctccta tcaaggactt cggcggcttc 2400
aatttcagcc agattctgcc cgatcctagc aagcccagca agcggagctt catcgaggac 2460
ctgctgttca acaaagtgac actggccgac gccggcttca tcaagcagta tggcgattgt 2520
ctgggcgaca ttgccgccag ggatctgatt tgcgcccaga agtttaacgg actgacagtg 2580
ctgcctcctc tgctgaccga tgagatgatc gcccagtaca catctgccct gctggccggc 2640
acaatcacaa gcggctggac atttggagct ggcgccgctc tgcagatccc ctttgctatg 2700
cagatggcct accggttcaa cggcatcgga gtgacccaga atgtgctgta cgagaaccag 2760
aagctgatcg ccaaccagtt caacagcgcc atcggcaaga tccaggacag cctgagcagc 2820
acagcaagcg ccctgggaaa gctgcaggac gtggtcaacc agaatgccca ggcactgaac 2880
accctggtca agcagctgtc ctccaacttc ggcgccatca gctctgtgct gaacgatatc 2940
ctgagcagac tggacaaggt ggaagccgag gtgcagatcg acagactgat caccggaagg 3000
ctgcagtccc tgcagaccta cgttacccag cagctgatca gagccgccga gattagagcc 3060
tctgccaatc tggccgccac caagatgtct gagtgtgtgc tgggccagag caagagagtg 3120
gacttttgcg gcaagggcta ccacctgatg agcttccctc agtctgcccc tcacggcgtg 3180
gtgtttctgc acgtgacata cgtgcccgct caagagaaga atttcaccac cgctccagcc 3240
atctgccacg acggcaaagc ccactttcct agagaaggcg tgttcgtgtc caacggcacc 3300
cattggttcg tgacccagcg gaacttctac gagccccaga tcatcaccac cgacaacacc 3360
ttcgtgtctg gcaactgcga cgtcgtgatc ggcattgtga acaataccgt gtacgaccct 3420
ctgcagcccg agctggacag cttcaaagag gaactggata agtactttaa gaaccacaca 3480
agccccgacg tggacctggg cgatatcagc ggaatcaatg ccagcgtcgt gaacatccag 3540
aaagagatcg accggctgaa cgaggtggcc aagaatctga acgagagcct gatcgacctg 3600
caagaactgg ggaagtacga gcagtacatc aagtggccct ggtacatctg gctgggcttt 3660
atcgccggac tgattgccat cgtgatggtc acaatcatgc tgtgttgcat gaccagctgc 3720
tgtagctgcc tgaagggctg ttgtagctgt ggcagctgct gcaagttcga cgaggacgat 3780
tctgagcccg tgctgaaggg cgtgaaactg cactacaca 3819
<210> 6
<211> 75
<212> PRT
<213> SARS-CoV-2
<400> 6
Met Tyr Ser Phe Val Ser Glu Glu Thr Gly Thr Leu Ile Val Asn Ser
1 5 10 15
Val Leu Leu Phe Leu Ala Phe Val Val Phe Leu Leu Val Thr Leu Ala
20 25 30
Ile Leu Thr Ala Leu Arg Leu Cys Ala Tyr Cys Cys Asn Ile Val Asn
35 40 45
Val Ser Leu Val Lys Pro Ser Phe Tyr Val Tyr Ser Arg Val Lys Asn
50 55 60
Leu Asn Ser Ser Arg Val Pro Asp Leu Leu Val
65 70 75
<210> 7
<211> 228
<212> DNA
<213> SARS-CoV-2
<400> 7
atgtactcat tcgtttcgga agagacaggt acgttaatag ttaatagcgt acttcttttt 60
cttgctttcg tggtattctt gctagttaca ctagccatcc ttactgcgct tcgattgtgt 120
gcgtactgct gcaatattgt taacgtgagt cttgtaaaac cttcttttta cgtttactct 180
cgtgttaaaa atctgaattc ttctagagtt cctgatcttc tggtctaa 228
<210> 8
<211> 225
<212> DNA
<213> Artificial Sequence
<220>
<223> Optimized E protein gene sequence (EBL)
<400> 8
atgtacagct ttgtctcaga ggaaaccggc acgctgattg taaacagcgt gttactattc 60
ctcgccttcg ttgtgtttct ccttgttaca ctggcaatac tgactgccct gcggttgtgc 120
gcttactgct gtaatatcgt gaacgtgtct ttggtgaagc ccagtttcta tgtatattcc 180
agagtcaaaa atctcaactc ctctagggtg cctgacctgc ttgtc 225
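As a consistency check on the records above, the following minimal sketch translates the optimized E gene (SEQ ID NO. 8) with the standard genetic code and compares the result with the E protein of SEQ ID NO. 6. The helper names `clean` and `translate`, and the one-letter rendering of SEQ ID NO. 6, are illustrative assumptions and not part of the listing.

```python
from itertools import product

# Standard genetic code (NCBI translation table 1), codons enumerated in TCAG order.
BASES = "tcag"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

# SEQ ID NO. 8 copied as printed, including spaces and position counters.
SEQ_ID_8 = """
atgtacagct ttgtctcaga ggaaaccggc acgctgattg taaacagcgt gttactattc 60
ctcgccttcg ttgtgtttct ccttgttaca ctggcaatac tgactgccct gcggttgtgc 120
gcttactgct gtaatatcgt gaacgtgtct ttggtgaagc ccagtttcta tgtatattcc 180
agagtcaaaa atctcaactc ctctagggtg cctgacctgc ttgtc 225
"""

# SEQ ID NO. 6 rewritten in one-letter amino acid code (assumed rendering).
SEQ_ID_6 = (
    "MYSFVSEETGTLIVNSVLLFLAFVVFLLVTLAILTALRLC"
    "AYCCNIVNVSLVKPSFYVYSRVKNLNSSRVPDLLV"
)

def clean(block: str) -> str:
    """Keep only nucleotide letters; drops spaces, newlines and counters."""
    return "".join(ch for ch in block.lower() if ch in "acgtu")

def translate(dna: str) -> str:
    """Translate complete codons of a DNA string, reading frame 0."""
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))

e_dna = clean(SEQ_ID_8)
e_protein = translate(e_dna)
print(len(e_dna), len(e_protein), e_protein == SEQ_ID_6)
```

The same two helpers can be reused on the other DNA records, provided each sequence is pasted exactly as printed.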
<210> 9
<211> 222
<212> PRT
<213> SARS-CoV-2
<400> 9
Met Ala Asp Ser Asn Gly Thr Ile Thr Val Glu Glu Leu Lys Lys Leu
1 5 10 15
Leu Glu Gln Trp Asn Leu Val Ile Gly Phe Leu Phe Leu Thr Trp Ile
20 25 30
Cys Leu Leu Gln Phe Ala Tyr Ala Asn Arg Asn Arg Phe Leu Tyr Ile
35 40 45
Ile Lys Leu Ile Phe Leu Trp Leu Leu Trp Pro Val Thr Leu Ala Cys
50 55 60
Phe Val Leu Ala Ala Val Tyr Arg Ile Asn Trp Ile Thr Gly Gly Ile
65 70 75 80
Ala Ile Ala Met Ala Cys Leu Val Gly Leu Met Trp Leu Ser Tyr Phe
85 90 95
Ile Ala Ser Phe Arg Leu Phe Ala Arg Thr Arg Ser Met Trp Ser Phe
100 105 110
Asn Pro Glu Thr Asn Ile Leu Leu Asn Val Pro Leu His Gly Thr Ile
115 120 125
Leu Thr Arg Pro Leu Leu Glu Ser Glu Leu Val Ile Gly Ala Val Ile
130 135 140
Leu Arg Gly His Leu Arg Ile Ala Gly His His Leu Gly Arg Cys Asp
145 150 155 160
Ile Lys Asp Leu Pro Lys Glu Ile Thr Val Ala Thr Ser Arg Thr Leu
165 170 175
Ser Tyr Tyr Lys Leu Gly Ala Ser Gln Arg Val Ala Gly Asp Ser Gly
180 185 190
Phe Ala Ala Tyr Ser Arg Tyr Arg Ile Gly Asn Tyr Lys Leu Asn Thr
195 200 205
Asp His Ser Ser Ser Ser Asp Asn Ile Ala Leu Leu Val Gln
210 215 220
<210> 10
<211> 669
<212> DNA
<213> SARS-CoV-2
<400> 10
atggcagatt ccaacggtac tattaccgtt gaagagctta aaaagctcct tgaacaatgg 60
aacctagtaa taggtttcct attccttaca tggatttgtc ttctacaatt tgcctatgcc 120
aacaggaata ggtttttgta tataattaag ttaattttcc tctggctgtt atggccagta 180
actttagctt gttttgtgct tgctgctgtt tacagaataa attggatcac cggtggaatt 240
gctatcgcaa tggcttgtct tgtaggcttg atgtggctca gctacttcat tgcttctttc 300
agactgtttg cgcgtacgcg ttccatgtgg tcattcaatc cagaaactaa cattcttctc 360
aacgtgccac tccatggcac tattctgacc agaccgcttc tagaaagtga actcgtaatc 420
ggagctgtga tccttcgtgg acatcttcgt attgctggac accatctagg acgctgtgac 480
atcaaggacc tgcctaaaga aatcactgtt gctacatcac gaacgctttc ttattacaaa 540
ttgggagctt cgcagcgtgt agcaggtgac tcaggttttg ctgcatacag tcgctacagg 600
attggcaact ataaattaaa cacagaccat tccagtagca gtgacaatat tgctttgctt 660
gtacagtaa 669
<210> 11
<211> 669
<212> DNA
<213> Artificial Sequence
<220>
<223> Optimized M protein gene sequence (MBL)
<400> 11
atggcagatt ccaacggtac aattaccgtc gaagagctga aaaagctcct tgagcagtgg 60
aacctggtca tagggttcct attcctgaca tggatttgcc tgctgcaatt tgcctatgcc 120
aacaggaata ggtttttgta tataatcaag ctgattttcc tctggctgtt atggccagtg 180
accctggcct gttttgtgct tgccgctgtt tacagaataa attggatcac cggcggaatc 240
gccatcgcaa tggcttgcct tgtaggcttg atgtggctca gctacttcat tgcttctttc 300
cggctgtttg cgcgaacgcg gtccatgtgg tctttcaatc cggagactaa catactcctc 360
aatgtgcccc tccatggcac tattctgacc agacccctgc tagagagtga actcgtcatc 420
ggagctgtga tcctgcgggg gcacctgaga atcgccggac accacttagg ccgctgtgac 480
atcaaggatc tgcctaaaga aatcactgtt gccacatcac gaaccctttc ttattacaag 540
ttgggggcct cgcagcgtgt ggcaggagac tcaggttttg cggcatacag tcgctacagg 600
attggcaact ataaattaaa cacagaccat tccagcagca gcgataatat tgctttgctt 660
gtgcagtga 669
<210> 12
<211> 419
<212> PRT
<213> SARS-CoV-2
<400> 12
Met Ser Asp Asn Gly Pro Gln Asn Gln Arg Asn Ala Pro Arg Ile Thr
1 5 10 15
Phe Gly Gly Pro Ser Asp Ser Thr Gly Ser Asn Gln Asn Gly Glu Arg
20 25 30
Ser Gly Ala Arg Ser Lys Gln Arg Arg Pro Gln Gly Leu Pro Asn Asn
35 40 45
Thr Ala Ser Trp Phe Thr Ala Leu Thr Gln His Gly Lys Glu Asp Leu
50 55 60
Lys Phe Pro Arg Gly Gln Gly Val Pro Ile Asn Thr Asn Ser Ser Pro
65 70 75 80
Asp Asp Gln Ile Gly Tyr Tyr Arg Arg Ala Thr Arg Arg Ile Arg Gly
85 90 95
Gly Asp Gly Lys Met Lys Asp Leu Ser Pro Arg Trp Tyr Phe Tyr Tyr
100 105 110
Leu Gly Thr Gly Pro Glu Ala Gly Leu Pro Tyr Gly Ala Asn Lys Asp
115 120 125
Gly Ile Ile Trp Val Ala Thr Glu Gly Ala Leu Asn Thr Pro Lys Asp
130 135 140
His Ile Gly Thr Arg Asn Pro Ala Asn Asn Ala Ala Ile Val Leu Gln
145 150 155 160
Leu Pro Gln Gly Thr Thr Leu Pro Lys Gly Phe Tyr Ala Glu Gly Ser
165 170 175
Arg Gly Gly Ser Gln Ala Ser Ser Arg Ser Ser Ser Arg Ser Arg Asn
180 185 190
Ser Ser Arg Asn Ser Thr Pro Gly Ser Ser Arg Gly Thr Ser Pro Ala
195 200 205
Arg Met Ala Gly Asn Gly Gly Asp Ala Ala Leu Ala Leu Leu Leu Leu
210 215 220
Asp Arg Leu Asn Gln Leu Glu Ser Lys Met Ser Gly Lys Gly Gln Gln
225 230 235 240
Gln Gln Gly Gln Thr Val Thr Lys Lys Ser Ala Ala Glu Ala Ser Lys
245 250 255
Lys Pro Arg Gln Lys Arg Thr Ala Thr Lys Ala Tyr Asn Val Thr Gln
260 265 270
Ala Phe Gly Arg Arg Gly Pro Glu Gln Thr Gln Gly Asn Phe Gly Asp
275 280 285
Gln Glu Leu Ile Arg Gln Gly Thr Asp Tyr Lys His Trp Pro Gln Ile
290 295 300
Ala Gln Phe Ala Pro Ser Ala Ser Ala Phe Phe Gly Met Ser Arg Ile
305 310 315 320
Gly Met Glu Val Thr Pro Ser Gly Thr Trp Leu Thr Tyr Thr Gly Ala
325 330 335
Ile Lys Leu Asp Asp Lys Asp Pro Asn Phe Lys Asp Gln Val Ile Leu
340 345 350
Leu Asn Lys His Ile Asp Ala Tyr Lys Thr Phe Pro Pro Thr Glu Pro
355 360 365
Lys Lys Asp Lys Lys Lys Lys Ala Asp Glu Thr Gln Ala Leu Pro Gln
370 375 380
Arg Gln Lys Lys Gln Gln Thr Val Thr Leu Leu Pro Ala Ala Asp Leu
385 390 395 400
Asp Asp Phe Ser Lys Gln Leu Gln Gln Ser Met Ser Ser Ala Asp Ser
405 410 415
Thr Gln Ala
<210> 13
<211> 1257
<212> DNA
<213> Artificial Sequence
<220>
<223> Optimized N protein gene sequence (NBL)
<400> 13
atgtcagata acggaccgca gaaccaaagg aacgcccctc ggatcacttt cgggggtcct 60
agcgacagca ctgggtctaa ccaaaatgga gaacgttccg gcgcaagatc caaacagagg 120
aggcctcagg ggcttcctaa caatacagcc tcctggttca cagctctcac acagcatggc 180
aaggaagacc tgaagtttcc tagaggccag ggggttccca tcaatactaa ctcctcccca 240
gacgatcaga ttggttatta tcggcgggct accaggcgga tccggggcgg agacggtaag 300
atgaaggacc tctctccccg ttggtacttt tactacctcg gtacaggccc cgaggctggg 360
cttccgtatg gcgccaataa ggatggaata atttgggtgg ctacggaagg ggccctcaac 420
acaccgaagg atcacattgg cacccgtaat cccgcgaata atgccgccat tgtcctgcag 480
ttgccccagg ggacgacgtt gcccaaaggc ttttacgcag aaggatcgcg cggaggatcc 540
caagcctcca gccgatcaag ctctcgatct cggaactcaa gtcgcaatag cacaccaggg 600
tcttctcgcg ggaccagccc tgcaaggatg gccggaaacg gcggtgatgc tgctttagcg 660
ctgctgctgc tggatagact gaaccaatta gagagtaaaa tgtcaggtaa aggccagcaa 720
cagcaggggc agacagtgac caaaaaaagt gcggccgagg ccagcaagaa accccgccag 780
aaacgaacag ccactaaagc ctacaacgta acccaagcat tcggaaggag aggaccagag 840
cagacccaag gcaattttgg cgatcaagag ctgatccgcc aggggacgga ctataagcat 900
tggccacaga tcgcccagtt cgcacccagt gcttcagcct tcttcggaat gtcgagaatc 960
ggtatggagg tcactccttc tggcacttgg ctgacttata ccggcgcaat aaagctagac 1020
gacaaagacc ctaactttaa ggatcaggtg atcctgctaa ataaacacat tgatgcgtac 1080
aaaacattcc caccaactga gccaaagaag gacaagaaga agaaggcaga tgaaacccag 1140
gctttgcccc agagacagaa aaagcagcag accgtgacct tgctgccagc agccgacctc 1200
gacgattttt caaagcaact tcagcagtcc atgagtagcg ctgacagcac ccaggct 1257
<210> 14
<211> 1257
<212> DNA
<213> SARS-CoV-2
<400> 14
atgtctgata atggacccca aaatcagcga aatgcacccc gcattacgtt tggtggaccc 60
tcagattcaa ctggcagtaa ccagaatgga gaacgcagtg gggcgcgatc aaaacaacgt 120
cggccccaag gtttacccaa taatactgcg tcttggttca ccgctctcac tcaacatggc 180
aaggaagacc ttaaattccc tcgaggacaa ggcgttccaa ttaacaccaa tagcagtcca 240
gatgaccaaa ttggctacta ccgaagagct accagacgaa ttcgtggtgg tgacggtaaa 300
atgaaagatc tcagtccaag atggtatttc tactacctag gaactgggcc agaagctgga 360
cttccctatg gtgctaacaa agacggcatc atatgggttg caactgaggg agccttgaat 420
acaccaaaag atcacattgg cacccgcaat cctgctaaca atgctgcaat cgtgctacaa 480
cttcctcaag gaacaacatt gccaaaaggc ttctacgcag aagggagcag aggcggcagt 540
caagcctctt ctcgttcctc atcacgtagt cgcaacagtt caagaaattc aactccaggc 600
agcagtaggg gaacttctcc tgctagaatg gctggcaatg gcggtgatgc tgctcttgct 660
ttgctgctgc ttgacagatt gaaccagctt gagagcaaaa tgtctggtaa aggccaacaa 720
caacaaggcc aaactgtcac taagaaatct gctgctgagg cttctaagaa gcctcggcaa 780
aaacgtactg ccactaaagc atacaatgta acacaagctt tcggcagacg tggtccagaa 840
caaacccaag gaaattttgg ggaccaggaa ctaatcagac aaggaactga ttacaaacat 900
tggccgcaaa ttgcacaatt tgcccccagc gcttcagcgt tcttcggaat gtcgcgcatt 960
ggcatggaag tcacaccttc gggaacgtgg ttgacctaca caggtgccat caaattggat 1020
gacaaagatc caaatttcaa agatcaagtc attttgctga ataagcatat tgacgcatac 1080
aaaacattcc caccaacaga gcctaaaaag gacaaaaaga agaaggctga tgaaactcaa 1140
gccttaccgc agagacagaa gaaacagcaa actgtgactc ttcttcctgc tgcagatttg 1200
gatgatttct ccaaacaatt gcaacaatcc atgagcagtg ctgactcaac tcaggcc 1257
<210> 15
<211> 46
<212> DNA
<213> Artificial Sequence
<220>
<223> 5'UTR
<400> 15
ggaaataaga gagaaaagaa gagtaagaag aaatataaga gccacc 46
<210> 16
<211> 110
<212> DNA
<213> Artificial Sequence
<220>
<223> 3'UTR-1
<400> 16
gctggagcct cggtggccat gcttcttgcc ccttgggcct ccccccagcc cctcctcccc 60
ttcctgcacc cgtacccccg tggtctttga ataaagtctg agtgggcggc 110
<210> 17
<211> 109
<212> DNA
<213> Artificial Sequence
<220>
<223> 3'UTR-2
<400> 17
gcggccgctt aattaagctg ccttctgcgg ggcttgcctt ctggccatgc ccttcttctc 60
tcccttgcac ctgtacctct tggtctttga ataaagcctg agtaggaag 109
<210> 18
<211> 3819
<212> RNA
<213> Artificial Sequence
<220>
<223> mRNA sequence 1 of the optimized S protein gene (S-GS mRNA)
<400> 18
uacaagcaga aggaccagga cgacggagac cagaggagug ucacgcaguu agacuguuga 60
gccugagucg acgguggacg aauaugauua ucgaaguggu cuccgcacau gauaggacug 120
uuccacaaau cuucgaggca cgacgugaga uguguccuag acaaagacgg uaagaaaucg 180
uugcacugga ccaaggugcg guaggugcac ucgccguggu uaccguguuu cgccaagcug 240
uuagggcacg acggaaaauu gcuaccgcac augaagcgga gauggcucuu cucguuguag 300
uagucuccga ccuagaaacc guggugugac cugagguucu gugucagaga cgacuagcac 360
uuguuacggu gguugcacca guaguuccac acgcucaagg ucaaaacauu acuagggaag 420
gacccgcaca ugauaguguu cuuguuauuc ucgaccuacc ucaggcucaa aucucacaua 480
agaucgcggu uguugacgug uaaacucaug cacucggucg gaaaggacua ccuggaccuc 540
ccguucgucc cguuaaaguu cuuggacucc cucaagcaca aauucuuaua gcugccgaug 600
aaguuuuaga ugagauucgu guggggguag uuggaccacg cgcuggacgg agucccgaag 660
ucgcgggacc ucggggacca ccuagacgga uagccguagu uguagugggc caaagucugu 720
gacgaccggg acgugucuuc gauggacugu gggccgcuga ggagaucgcc uaccuggcgg 780
ccgcgacggc ggaugauaca cccgauggag gucggggccu ggaaggacga cuucauguug 840
cucuuaccgu gguagugucu gcgucaccua acgcgggacc ugggggacuc gcucuguuuc 900
acaugugacu ucaggaaaug gcaccucuuc ccguagauag ucuguagguu aaagucccac 960
gucgguuggc ucagauagca cgcgaaagga uuauaguguu uggacacggg uaaaccgcuc 1020
cacaaguugc guugggcgaa gcggucgcac augcggaccu uauccuucgc cuagucguug 1080
acgcaccggc ugauaucgca cgacauguug aggcggagaa agucguggaa auucacgaua 1140
ccgcacaggg gguguuucga cuuacuggac acgaaauggu ugcagaugcg gcuaagaaag 1200
cacuaguccc cgcugcucca cgcggucuag cgggggccgg ucuguccguu cuagcgucug 1260
auguuaauau ucgacggucu gcuaaagugg ccgacgcacu agcggaccuu gucguuguua 1320
gaccuaaggu uucacccgcc guugauguua auagacaugg ccgacaaauc uuucucguua 1380
gacuucggga agcucucccu guagagaugu cuuuagaugg uccggccguc guggggaacg 1440
uuaccgcacc ucccgaaauu gacaauaaag ggugagguca ggaugccgaa ggucgggugu 1500
uugccgcacc cgauagucgg aauggcgcac caccacgacu cgaaacucga cgacgugcgg 1560
ggucguuguc acacgccggg guucuucagg ugguuagacc acuucuuguu cacgcacuug 1620
aaguugaagu ugccggacug gccguguccg cacgacuggc ucagguuguu cuucaaggac 1680
gguaaagucg ucaagccguc ccuguagcgu cuaugguguc ugcggcacgc gcuggguguc 1740
ugggaccucu aggaccugua gugugggacg agaaagccgc cgcacucgca cuaguguggg 1800
ccgugguuau guucguuggu ccaccggcac gacauagucc ugcacuuaac auggcuccac 1860
gggcaccgau aggugcggcu agucgacugg gguuguaccg cccacauguc guggccgagg 1920
uugcagaagg ucuguucucg gccuacggac uagccucguc ucgugcacuu guuaaggaua 1980
cucacgcugu aggguuagcc gcggccguag acacggagaa uggucugggu cuguuugaga 2040
gggucuucuc gggccucgca ccggaggguc agauaguagc ggauauggua cagggacccg 2100
cggcucuugu cgcaccggau gagauuguua ucguagcggu aggguugguu gaaguguuag 2160
agacacuggu gucucuagga cgggcacagg uacugguucu guagacaccu gacguguuac 2220
auauagacac cgcuaagaug gcucacgucg uuggacgacg aggucaugcc gucgaaaaca 2280
ugggucgacu uaucucggga cuguccguag cggcaccucg uccuauucuu guguguccuc 2340
cacaagcggg uccacuucgu uuagauguuc ugggggggau aguuccugaa accgccgaag 2400
uuaaaaaggg ucuaggacgg acuagguagg uucggaagau ucgccucgaa auagcuccug 2460
gacgacaagu uguuccacug ggaccggcua cggccgaagu aguucgucau accgcuaacg 2520
gacccgcugu agcgucgguc ccuggacuag acgcgggucu ucaaauuacc ggacuggcac 2580
gacggugggg acgacugucu acucuacuag cgugucaugu guucgcggga cgaccggccg 2640
ugguagugua ggccuaccug gaagccgcgu ccucggcggg aggucuaggg gaaacgguac 2700
gucuaccgga uauccaaguu gccguagccg cacugggucu uacacgacau gcucuugguc 2760
uucgacuagc gguuagucaa auugaggcgg uagccguucu agguccuguc ggacaggaga 2820
ugucggucgc gggacccguu cgagguccua caccacuuag ucuugcgggu ccgggacuua 2880
ugggaccacu ucgucgacuc gucguugaag ccgcgguaga gaucgcacga cuuacuguag 2940
gacucggccg accuguucca ccuccgucuc cacgucuagc uggccgacua guggccggcc 3000
gaggucucgg aggucuggau acacuguguc gucgacuagu cccggcggcu cuagucccgg 3060
ucgcgguuag accgucguug guucuacagg cucacgcacg acccggucag auucucucac 3120
cugaaaacac cguucccgau aguggacuac aggaagggag ucagacgggg ugugccgcac 3180
cacaaagacg ugcacuggau gcacgggcgg guccucuucu ugaaguggug ucggggacgg 3240
uagacggugc uaccguuccg ggugaaaggu ucccucccgc acaagcacag guugccgugg 3300
gugaccaaac acugugucgc guuaaagaug cucggggucu aguaguggug ucuguugugg 3360
aagcacucgc cguugacacu gcaccaguag ccguagcacu uguuauggca cauacuaggu 3420
gaggucgggc ucgaccuguc gaaauuccuc cucgaccuau ucauaaaguu cuuagugugg 3480
aggggacugc accuagaccc gcuguagucg ccguaguuac ggaggcacca cuuguagguc 3540
uuccucuagc uggcggacuu gcuccaccga uucuuagacu ugcucucgga cuagcuggag 3600
guccucgacc cguucauacu cgucauguag uucaccggga ccauguagac cgacccgaag 3660
uagcggccgg acuagcggua gcacuaccac ugguaguacg acacgacaua cuguaggacg 3720
acaagaacgg acuucccgac gacaucgaca ccgaggacga cauucaaacu gcuccuacug 3780
agacuuggac acgacuuccc gcacuucgac guaaugugg 3819
<210> 19
<211> 3819
<212> RNA
<213> Artificial Sequence
<220>
<223> mRNA sequence of the optimized S protein gene (SBL mRNA)
<400> 19
uacaagcaaa aggagcaaga cgacggagaa cagucgagag ucacacacuu ggacuguuga 60
ucuuguguug auggagggcg gauguguuug agaaaguggg ccccgcacau gaugggucug 120
uuucacaagu ccucgagaca caacgugucg uggguucugg acaaaaacgg uaagaaauca 180
uuacacugga ccaaagugcg auagguacac agcccguggu ugcccugguu uucuaagcug 240
uuggggcaag acggcaaguu gcugccgcac augaagcgau cgugacucuu cagguuguaa 300
uaagcgccca ccuagaagcc uugauggaac cugagguuuu gugucagaga ugaguagcac 360
uuguugcgcu gauugcacca cuaauuccac acacuuaaag ucaagacguu acuagguaaa 420
aauccucaca ugaugguguu uuuauuauuu aguaccuacc ucagacuuaa agcgcauaug 480
ucaucgcgau uauugacaug uaagcuuaua caaucgguug ggaaaaacua ccugaaucuc 540
cccuucgucc cuuuaaaauu cuuaaacgcu cuuaaacaca aguuuuuaua gcuacccaua 600
aaguucuaga ugagguucgu augaggguau uuagaccacg cgcugaaugg aguucccaag 660
ucgcgugacc ucggugacca ucuggacggu uagccguagu uguagugggc uaaggucugg 720
gacgaacgag acguggcaag uauagacugu gguccucuaa gcagaaggcc uaccugucgu 780
ccccggcgac gaaugauaca accaauagaa gucggagccu ggaaagacga guucauauua 840
cucuuacccu gguaauggcu gcgacaacua acacgagagc uaggggacag ucuuugguuc 900
acgugugauu uuagaaagug ucagcuuuuc cccuagaugg ucugaagauu gaaagcacau 960
gucggguggc ucucguagca guccaagggu uuauagugau uggacacagg aaaaccgcuc 1020
cacaaguugc gauguucuaa acgaucgcac augcggaccu ugucuuuuuc uuauaguuua 1080
acgcaucggc uaaugucgca gaauauauug agacguagga agaguugaaa guucacaaua 1140
ccucacucgg gcugauucga cuuacuaaac acgaaauguu uacaaauacg gcuaaguaag 1200
cacuaggccc cgcugcucca gucugucuag cggggaccgg uuuguccauu cuaacgacua 1260
auguugaugu uuaauggacu gcuaaaaugu ccuacgcaau agcgaaccuu gagauuguua 1320
gagcuaagau uccagccgcc guuaauguua auagaaauag cggaaaaguc cuucaguuua 1380
gaauucggua agcucgcucu guagucaugg cucuauaugg uccgccccag guggggcaca 1440
uugccacagc ucccaaaguu gacgaugaaa ggugacguca ggauacccaa ggucgggugg 1500
uuaccgcacc caauggucgg gauggcucau cagcauaaca gaaaacucga gaacgugcgg 1560
gggcggugcc acacgccagg uuucuuuagu ugauugaauc aauucuuauu uacacacuua 1620
aaauugaaau ugccggacug ucccuguccu caggacuguc uuagguuauu cuucaaggaa 1680
gggaaagucg ucaaaccugc gcuguagcgu cugugguguc ugcggcacgc acuggggguu 1740
ugagagcuuu aagagcuaua gugugggacg ucaaaaccgc cccagucaca guaaugggga 1800
cccugguuau gaucauuggu ccagcgucac gaaaugguuc uacaguugac auggcuccaa 1860
ggacaccgau aagugcgucu gguugacuga ggcugaaccg cccacauauc auguccgagg 1920
uuacacaaag ucugggcccg uccgacggac uaaccccggc ucguacauuu auugaggaug 1980
cucacgcuau agggguaucc acgaccuuau acacggucaa uagucugggu cugcuugagc 2040
gguucugcuc gauccaggca ucggagaguc ucguauuagc gcaugugaua cucggacccc 2100
cggcuuuuaa ggcaccguau aucguuguug ucguaacgau aaggaugauu gaaauguuaa 2160
agucagugcu gccucuagga cggucagagg uacugauuuu ggaggcaccu gacaugcuac 2220
auguaaacac cgcuaaguug acuuacgaga uuggacgaga augucaugcc aagaaaaaca 2280
ugggucgacu uggcccguaa cugcccguag cgucaacucg uccuguucuu augaguccuc 2340
cacaaacgcg uucacuucgu uuaaauauuu ugaggagggu aauuccugaa accgccaaag 2400
uugaagagcg ucuaggaugg acuggguagu uuuggaucgu ucuccagaaa guaacuucug 2460
gaagacaagu uguuccagug ugaccgacug cggccgaagu aauuugucau gccucuaaca 2520
gauccacuau aacgucgcgc gcuagacuaa acgcgugucu ucaaauugcc ggacugccag 2580
aaugggggag aggaauggcu gcuuuacuaa cgggucaugu ggucgcggga cgagcgaccg 2640
ugcuaaugau cgccuaccug uaaaccccgg ccgcgacggg aggucuaugg uaaacgguac 2700
gucuaccgca uauccaaauu gccguauccu cauugggucu ugcacgacau gcucuugguu 2760
uuugacuauc gguuaguuaa guuaucacgg uauccuuucu auguccuguc agagucgucg 2820
uggcgcaggc gagagccuuu cgauguucua caccaguugg ucuugcgcgu ccguaacuua 2880
ugugaccacu ucgucgagag gagcuuaaaa ccucguuagu cgucgcacga cuuacuauag 2940
gacagagccg accuguucca acuucggcuu caggucuagc uguccaauua guggccagcc 3000
gacgucucag aggucuguau acaauggguc guugaguagu cucgacggcu uuaugcgcgg 3060
ucacgguuag aacgucggug auucuacagg cucacgcaca accccguuuc auuuucccaa 3120
cuaaagacac cuuuuccuau aguagaauac ucaaagggag uuaggcgggg agugccucaa 3180
cagaaggacg uacacuggau gcacggucgc guccucuucu ugaagugcug gcgggggcgg 3240
uagacgguac uaccguuccg gguaaaaggg gcgcuuccuc acaagcauag guuaccgugg 3300
gugaccaagc acugcgucuc uuuaaaaaua cucggcguuu aauagugaug gcuguugugu 3360
aagcaaaggc cguuaacgcu acagcauuag cccuagcacu uauuauguca gauacuagga 3420
gaagucgguc uugagcuaag uaaguuucuc cucgaccuau uuauaaaguu cuuggugugg 3480
agggggcuac accuagaccc acuguauagu ccuuaauugc guucgcagca cuuguaaguc 3540
uuccuuuagc uguccgaguu acuucaucgu uucuugaacu uacucagaga guagcugaac 3600
guccuugagc cguuuauacu cgucauguaa uuuaccggca ccauauagac cgauccgaaa 3660
uagcggccag acuaacguua acacuaccaa ugauaguaca acacgacgua cuguucaacg 3720
acaaguacgg aauuuccgac gacgaggacg cccaguacaa cauuuaagcu acuccugcug 3780
agacucgggc acgacuuucc ccacuuugac gugaugugc 3819
<210> 20
<211> 3819
<212> RNA
<213> Artificial Sequence
<220>
<223> mRNA sequence 3 of the optimized S protein gene (STF mRNA)
<400> 20
uacaagcaca aggaccacga cgacggagac cacaggucgg ucacacacuu ggacuggugg 60
ucuugugucg acggaggucg gaugugguug ucgaaauggu cuccgcacau gauggggcug 120
uuccacaagu cuaggucgca cgacgugaga uggguccugg acaaggacgg aaagaagucg 180
uugcacugga ccaaggugcg guaggugcac aggccguggu uaccgugguu cucuaagcug 240
uuggggcacg acgggaaguu gcugccccac augaaacggu cguggcucuu cagguuguag 300
uagucuccga ccuagaagcc guggugugac cugucguucu gggucucgga cgacuagcac 360
uuguugcggu gguugcacca guaguuucac acgcucaagg ucaagacguu gcuggggaag 420
gacccgcaga ugaugguguu cuuguuguuc ucgaccuacc uuucgcucaa ggcccacaug 480
ucgucgcggu uguugacgug gaagcucaug cacagggucg gaaaggacua ccuggaccuu 540
ccguucgucc cguugaaguu cuuggacgcg cucaagcaca aauucuugua gcugccgaug 600
aaguucuaga ugucguucgu guggggauag uuggagcacg cccuagacgg agucccgaag 660
agacgagacc uuggggacca ccuagacggg uagccguagu uguagugggc caaagucugu 720
gacgaccggg acgugucuuc gauggacugu ggaccgcuau cgucgucgcc uaccugucga 780
ccacggcggc gaaugauaca cccgauggac gucggaucuu ggaaggacga cuucauguug 840
cucuugccgu gguaguggcu gcggcaccua acacgagacc uaggagacuc gcucuguuuc 900
acgugggacu ucaggaagug gcaccuuuuc ccguagaugg ucuggucguu gaaggcccac 960
gucggguggc uuagguagca cgccaagggg uuauaguggu uagacacggg gaagccgcuc 1020
cacaaguuac gguggucuaa gcggagacac augcggaccu uggccuucgc cuagucguua 1080
acgcaccggc ugaugaggca cgacauguug aggcggucga agucguggaa guucacgaug 1140
ccgcacaggg gaugguucga cuugcuggac acgaaguguu ugcacaugcg gcugucgaag 1200
cacuaggccc cucuacuuca cgccgucuaa cggggaccug ucuguccguu cuagcggcug 1260
auguugaugu ucgacgggcu gcugaagugg ccgacacacu aacggaccuu gucguuguug 1320
gaccugaggu uucagccgcc guugauguua auggacaugg ccgacaaggc cuucagguua 1380
gacuucggga agcucgcccu guagaggugg cucuagauag uccggccguc guggggaaca 1440
uugccgcacc uuccgaaguu gacgaugaag ggugacguca ggaugccgaa agucgggugu 1500
uuaccgcacc cgauagucgg gaugucucac caccacgacu cgaagcuuga cgacguacgg 1560
ggacgguguc acacgccggg auucuuuucg ugguuagagc acuucuuguu uacgcacuug 1620
aaguugaagu ugccggacug gccguggccg cacgacuguc ucucguuguu cuucaaggac 1680
gguaaggucg ucaaaccggc ccuauagcgg cuaugguguc ugcggcaauc ucuagggguc 1740
ugugaccuuu aggaccugua guggggaacg ucgaagccgc cucacagaca cuagugggga 1800
ccgugguugu ggucguuagu ccaccgucac gacauggucc ugcacuugac auggcuucac 1860
gggcaccggu aagugcggcu agucgacugu ggauguaccg cccacaugag guggccgucg 1920
uuacacaaag ucuggucucg gccgacagac uagccucggc ucgugcacuu guuaucgaug 1980
cucacgcugu agggguagcc gcgaccguag acacggucga uggucugugu cuguuugucg 2040
gggucugccc ggucuagaca ccggucgguc ucguaguaac ggauguguua cagagacccg 2100
cggcucuugu cgcaccggau gagguuguug agauagcgau aggggugguu gaagugguag 2160
ucgcacuggu gucucuagga cggacacagg uacugguucu ggucgcaccu gacgugguac 2220
auguagacgc cgcuaaggug gcucacgagg uuggacgacg acgucaugcc gucgaagacg 2280
ugggucgacu uaucucggga cugucccuag cggcaccuug uccuguucuu guggguucuc 2340
cacaagcggg uucacuucgu cuagauguuc uggggaggau aguuccugaa gccgccgaag 2400
uuaaagucgg ucuaagacgg gcuaggaucg uucgggucgu ucgccucgaa guagcuccug 2460
gacgacaagu uguuucacug ugaccggcug cggccgaagu aguucgucau accgcuaaca 2520
gacccgcugu aacggcgguc ccuagacuaa acgcgggucu ucaaauugcc ugacugucac 2580
gacggaggag acgacuggcu acucuacuag cgggucaugu guagacggga cgaccggccg 2640
uguuaguguu cgccgaccug uaaaccucga ccgcggcgag acgucuaggg gaaacgauac 2700
gucuaccgga uggccaaguu gccguagccu cacugggucu uacacgacau gcucuugguc 2760
uucgacuagc gguuggucaa guugucgcgg uagccguucu agguccuguc ggacucgucg 2820
ugucguucgc gggacccuuu cgacguccug caccaguugg ucuuacgggu ccgugacuug 2880
ugggaccagu ucgucgacag gagguugaag ccgcgguagu cgagacacga cuugcuauag 2940
gacucgucug accuguucca ccuucggcuc cacgucuagc ugucugacua guggccuucc 3000
gacgucaggg acgucuggau gcaauggguc gucgacuagu cucggcggcu cuaaucucgg 3060
agacgguuag accggcggug guucuacaga cucacacacg acccggucuc guucucucac 3120
cugaaaacgc cguucccgau gguggacuac ucgaagggag ucagacgggg agugccgcac 3180
cacaaagacg ugcacuguau gcacgggcga guucucuucu uaaaguggug gcgaggucgg 3240
uagacggugc ugccguuucg ggugaaagga ucucuuccgc acaagcacag guugccgugg 3300
guaaccaagc acugggucgc cuugaagaug cucggggucu aguaguggug gcuguugugg 3360
aagcacagac cguugacgcu gcagcacuag ccguaacacu uguuauggca caugcuggga 3420
gacgucgggc ucgaccuguc gaaguuucuc cuugaccuau ucaugaaauu cuuggugugu 3480
ucggggcugc accuggaccc gcuauagucg ccuuaguuac ggucgcagca cuuguagguc 3540
uuucucuagc uggccgacuu gcuccaccgg uucuuagacu ugcucucgga cuagcuggac 3600
guucuugacc ccuucaugcu cgucauguag uucaccggga ccauguagac cgacccgaaa 3660
uagcggccug acuaacggua gcacuaccag uguuaguacg acacaacgua cuggucgacg 3720
acaucgacgg acuucccgac aacaucgaca ccgucgacga cguucaagcu gcuccugcua 3780
agacucgggc acgacuuccc gcacuuugac gugaugugu 3819
<210> 21
<211> 225
<212> RNA
<213> Artificial Sequence
<220>
<223> mRNA sequence of the optimized E protein gene (EBL mRNA)
<400> 21
uacaugucga aacagagucu ccuuuggccg ugcgacuaac auuugucgca caaugauaag 60
gagcggaagc aacacaaaga ggaacaaugu gaccguuaug acugacggga cgccaacacg 120
cgaaugacga cauuauagca cuugcacaga aaccacuucg ggucaaagau acauauaagg 180
ucucaguuuu uagaguugag gagaucccac ggacuggacg aacag 225
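For readers cross-referencing records, the following minimal sketch shows how the RNA printed in SEQ ID NO. 21 relates to the DNA printed in SEQ ID NO. 8: position by position, the listed RNA is the base-wise complement of the listed DNA (a to u, t to a, g to c, c to g), with no reversal. The helper names `clean` and `rna_complement` are illustrative assumptions and not part of the listing.

```python
# Per-base complement mapping from DNA letters to RNA letters.
COMPLEMENT = str.maketrans("acgt", "ugca")

# SEQ ID NO. 8 (DNA) copied as printed, including position counters.
SEQ_ID_8 = """
atgtacagct ttgtctcaga ggaaaccggc acgctgattg taaacagcgt gttactattc 60
ctcgccttcg ttgtgtttct ccttgttaca ctggcaatac tgactgccct gcggttgtgc 120
gcttactgct gtaatatcgt gaacgtgtct ttggtgaagc ccagtttcta tgtatattcc 180
agagtcaaaa atctcaactc ctctagggtg cctgacctgc ttgtc 225
"""

# SEQ ID NO. 21 (RNA) copied as printed, including position counters.
SEQ_ID_21 = """
uacaugucga aacagagucu ccuuuggccg ugcgacuaac auuugucgca caaugauaag 60
gagcggaagc aacacaaaga ggaacaaugu gaccguuaug acugacggga cgccaacacg 120
cgaaugacga cauuauagca cuugcacaga aaccacuucg ggucaaagau acauauaagg 180
ucucaguuuu uagaguugag gagaucccac ggacuggacg aacag 225
"""

def clean(block: str) -> str:
    """Keep only nucleotide letters; drops spaces, newlines and counters."""
    return "".join(ch for ch in block.lower() if ch in "acgtu")

def rna_complement(dna: str) -> str:
    """Return the base-wise RNA complement of a DNA string (not reversed)."""
    return dna.translate(COMPLEMENT)

dna_8 = clean(SEQ_ID_8)
rna_21 = clean(SEQ_ID_21)
print(len(dna_8), len(rna_21), rna_complement(dna_8) == rna_21)
```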
<210> 22
<211> 669
<212> RNA
<213> Artificial Sequence
<220>
<223> mRNA sequence of the optimized M protein gene (MBL mRNA)
<400> 22
uaccgucuaa gguugccaug uuaauggcag cuucucgacu uuuucgagga acucgucacc 60
uuggaccagu aucccaagga uaaggacugu accuaaacgg acgacguuaa acggauacgg 120
uuguccuuau ccaaaaacau auauuaguuc gacuaaaagg agaccgacaa uaccggucac 180
ugggaccgga caaaacacga acggcgacaa augucuuauu uaaccuagug gccgccuuag 240
cgguagcguu accgaacgga acauccgaac uacaccgagu cgaugaagua acgaagaaag 300
gccgacaaac gcgcuugcgc cagguacacc agaaaguuag gccucugauu guaugaggag 360
uuacacgggg agguaccgug auaagacugg ucuggggacg aucucucacu ugagcaguag 420
ccucgacacu aggacgcccc cguggacucu uagcggccug uggugaaucc ggcgacacug 480
uaguuccuag acggauuucu uuagugacaa cgguguagug cuugggaaag aauaauguuc 540
aacccccgga gcgucgcaca ccguccucug aguccaaaac gccguauguc agcgaugucc 600
uaaccguuga uauuuaauuu gugucuggua aggucgucgu cgcuauuaua acgaaacgaa 660
cacgucacu 669
<210> 23
<211> 1257
<212> RNA
<213> Artificial Sequence
<220>
<223> mRNA sequence of the optimized N protein gene (NBL mRNA)
<400> 23
uacagucuau ugccuggcgu cuugguuucc uugcggggag ccuagugaaa gcccccagga 60
ucgcugucgu gacccagauu gguuuuaccu cuugcaaggc cgcguucuag guuugucucc 120
uccggagucc ccgaaggauu guuaugucgg aggaccaagu gucgagagug ugucguaccg 180
uuccuucugg acuucaaagg aucuccgguc ccccaagggu aguuaugauu gaggaggggu 240
cugcuagucu aaccaauaau agccgcccga ugguccgccu aggccccgcc ucugccauuc 300
uacuuccugg agagaggggc aaccaugaaa augauggagc cauguccggg gcuccgaccc 360
gaaggcauac cgcgguuauu ccuaccuuau uaaacccacc gaugccuucc ccgggaguug 420
uguggcuucc uaguguaacc gugggcauua gggcgcuuau uacggcggua acaggacguc 480
aacggggucc ccugcugcaa cggguuuccg aaaaugcguc uuccuagcgc gccuccuagg 540
guucggaggu cggcuaguuc gagagcuaga gccuugaguu cagcguuauc gugugguccc 600
agaagagcgc ccuggucggg acguuccuac cggccuuugc cgccacuacg acgaaaucgc 660
gacgacgacg accuaucuga cuugguuaau cucucauuuu acaguccauu uccggucguu 720
gucguccccg ucugucacug guuuuuuuca cgccggcucc ggucguucuu uggggcgguc 780
uuugcuuguc ggugauuucg gauguugcau uggguucgua agccuuccuc uccuggucuc 840
gucuggguuc cguuaaaacc gcuaguucuc gacuaggcgg uccccugccu gauauucgua 900
accggugucu agcgggucaa gcguggguca cgaagucgga agaagccuua cagcucuuag 960
ccauaccucc agugaggaag accgugaacc gacugaauau ggccgcguua uuucgaucug 1020
cuguuucugg gauugaaauu ccuaguccac uaggacgauu uauuugugua acuacgcaug 1080
uuuuguaagg gugguugacu cgguuucuuc cuguucuucu ucuuccgucu acuuuggguc 1140
cgaaacgggg ucucugucuu uuucgucguc uggcacugga acgacggucg ucggcuggag 1200
cugcuaaaaa guuucguuga agucgucagg uacucaucgc gacugucgug gguccga 1257
<210> 24
<211> 3819
<212> DNA
<213> Artificial Sequence
<220>
<223> SDC50
<400> 24
atgttcgtgt ttctggtgct gctgcctctg gtgtcttctc agtgtgtgaa tctgacaaca 60
agaacacagc tgcctcctgc ctacaccaac agctttacaa gaggagtgta ctaccctgac 120
aaggtgttca gaagcagcgt gctgcattct acacaggacc tgtttctgcc tttcttcagc 180
aacgtgacct ggtttcacgc cattcacgtg tctggcacaa atggaaccaa gaggttcgac 240
aatcctgtgc tgcctttcaa cgatggcgtg tactttgcct ctaccgagaa gagcaacatc 300
atcagaggct ggatctttgg caccacactg gatagcaaga cacagtctct gctgatcgtg 360
aacaatgcca ccaacgtggt gatcaaggtg tgtgagttcc agttctgcaa cgaccctttt 420
ctgggcgtgt actaccacaa gaacaacaag agctggatgg agagcgagtt cagagtgtac 480
agctctgcca acaattgcac ctttgagtac gtgagccagc ctttcctgat ggatctggaa 540
ggaaagcagg gcaatttcaa gaacctgcgg gagttcgtgt tcaagaacat cgacggctac 600
ttcaagatct acagcaagca cacccccatc aatctggtga gagatctgcc tcagggattt 660
tctgctctgg aacctctggt ggatctgcct attggcatca acatcaccag attccagaca 720
ctgctggctc tgcacagatc ttacctgaca cctggagatt cttcttctgg atggacagct 780
ggagctgctg cttattacgt gggctatctg cagcctagaa ccttcctgct gaagtacaac 840
gagaatggca ccatcacaga tgctgtggat tgtgctctgg atcctctgtc tgagaccaag 900
tgtacactga agagcttcac agtggagaag ggcatctacc agaccagcaa tttcagagtg 960
cagcctacag agagcatcgt gagattcccc aacatcacca atctgtgccc ttttggagag 1020
gtgttcaatg ccaccagatt tgcctctgtg tacgcctgga acagaaagag gatcagcaac 1080
tgtgtggccg attactctgt gctgtacaac tctgccagct ttagcacctt caagtgctac 1140
ggagtgtctc ctacaaagct gaacgacctg tgtttcacca acgtgtacgc cgatagcttc 1200
gtgattagag gcgatgaagt gagacagatt gctcctggcc agacaggaaa gatcgccgat 1260
tacaactaca agctgcctga tgacttcacc ggctgtgtga ttgcctggaa tagcaataac 1320
ctggacagca aagtgggcgg caactacaac tacctgtaca gactgttcag gaagagcaac 1380
ctgaagccct tcgagagaga catctctacc gagatttatc aggctggaag caccccttgt 1440
aatggcgtgg aaggcttcaa ctgttacttt cctctgcaga gctacggctt tcagcctacc 1500
aatggagtgg gatatcagcc ttatagagtg gtggtgctga gctttgaact gctgcatgct 1560
cctgctacag tgtgtggacc taagaagagc accaacctgg tgaagaacaa gtgcgtgaac 1620
ttcaacttca acggcctgac aggaacagga gtgctgacag agagcaataa gaagttcctg 1680
cccttccagc agtttggcag agacattgcc gatacaacag atgccgtgag agatcctcag 1740
acactggaga tcctggatat cacaccttgt agctttggcg gcgtgtctgt gattacacct 1800
ggaaccaata ccagcaatca ggtggctgtg ctgtaccagg atgtgaattg cacagaagtg 1860
cctgtggcca ttcatgctga tcagctgaca cctacatgga gagtgtacag caccggctct 1920
aatgtgtttc agaccagagc tggatgtctg attggagccg agcacgtgaa taacagctac 1980
gagtgtgaca tccctattgg agccggaatc tgtgcctctt atcagacaca gaccaactct 2040
cctagaagag ccagatctgt ggcctctcag tctatcatcg cctataccat gtctctggga 2100
gctgagaata gcgtggccta tagcaacaac agcattgcca tccctaccaa cttcaccatc 2160
agcgtgacaa cagagattct gcctgtgagc atgaccaaga catctgtgga ctgcaccatg 2220
tacatctgtg gcgattctac cgagtgtagc aatctgctgc tgcagtacgg ctctttttgt 2280
acccagctga atagagccct gacaggaatt gccgtggaac aggacaagaa tacccaggaa 2340
gtgtttgccc aggtgaagca gatctacaag acccctccta tcaaggactt tggcggcttc 2400
aacttctctc agattctgcc tgatcctagc aagcccagca agagaagttt catcgaggat 2460
ctgctgttca acaaggtgac actggccgat gccggattta tcaagcagta tggagattgt 2520
ctgggcgata tcgccgccag agatctgatt tgtgcccaga agtttaatgg actgaccgtg 2580
ctgcctcctc tgctgacaga tgagatgatt gctcagtata catctgccct gctggccgga 2640
acaatcacat ctggatggac atttggagct ggagctgctc tgcagattcc ttttgccatg 2700
cagatggcct acagattcaa tggcatcggc gtgacacaga atgtgctgta cgagaaccag 2760
aagctgattg ccaaccagtt caacagcgcc attggcaaga tccaggattc tctgtcttct 2820
acagcctctg ctctgggaaa actgcaggat gtggtgaatc agaatgccca ggccctgaat 2880
acactggtga agcagctgtc tagcaatttt ggcgccatct ctagcgtgct gaatgacatc 2940
ctgagcagac tggataaagt ggaggccgaa gtgcagatcg atagactgat cacaggcaga 3000
ctgcagtctc tgcagacata tgtgacacag cagctgatta gagctgccga gatcagagct 3060
tctgctaatc tggctgccac aaagatgtct gagtgtgtgc tgggacagtc taagagagtg 3120
gacttctgtg gcaaaggcta tcacctgatg agctttcctc agtctgctcc tcatggagtg 3180
gtgtttctgc atgtgacata tgtgcctgcc caggagaaga acttcacaac agctcctgcc 3240
atttgtcatg atggcaaggc ccactttcct agagaaggag tgttcgtgtc taatggcaca 3300
cactggttcg tgacacagag gaacttctac gagcctcaga tcatcaccac cgataacacc 3360
ttcgtgtctg gcaattgcga tgtggtgatc ggcatcgtga acaataccgt gtatgatcct 3420
ctgcagcctg agctggatag cttcaaggag gagctggaca agtacttcaa gaaccacacc 3480
tctcctgatg tggatctggg cgatatctct ggcatcaatg cctctgtggt gaacatccag 3540
aaggagatcg acagactgaa tgaggtggcc aagaacctga atgagagcct gatcgatctg 3600
caggaactgg gaaagtacga gcagtacatc aagtggcctt ggtacatctg gctgggattt 3660
attgccggac tgattgccat cgtgatggtg accatcatgc tgtgctgtat gaccagctgt 3720
tgtagctgtc tgaaaggctg ctgtagctgt ggcagctgtt gcaagtttga tgaggatgat 3780
tctgagcctg tgctgaaggg cgtgaagctg cactacacc 3819
<210> 25
<211> 3819
<212> DNA
<213> Artificial Sequence
<220>
<223> SDC54
<400> 25
atgttcgtgt tcctggtgct gctgcctctg gtgagctctc agtgtgtgaa tctgaccaca 60
agaacccagc tgcctcctgc ctacaccaac agctttacca gaggagtgta ctaccccgac 120
aaggtgttca gaagcagcgt gctgcatagc acacaggatc tgttcctgcc cttcttcagc 180
aacgtgacct ggtttcacgc catccatgtg tctggcacca atggcaccaa gagattcgac 240
aaccctgtgc tgcctttcaa cgatggcgtg tacttcgcct ctaccgagaa gagcaacatc 300
atcagaggct ggatcttcgg caccacactg gatagcaaga cccagtctct gctgatcgtg 360
aacaacgcca ccaacgtggt gatcaaggtg tgcgagttcc agttctgcaa cgaccccttc 420
ctgggcgtgt actaccacaa gaacaacaag agctggatgg agagcgagtt cagggtgtac 480
agcagcgcca acaattgcac cttcgagtac gtgagccagc ctttcctgat ggatctggag 540
ggaaagcagg gcaacttcaa gaacctgcgg gagttcgtgt tcaagaacat cgacggctac 600
ttcaagatct acagcaagca cacccccatc aacctggtga gagatctgcc tcagggattt 660
tctgctctgg agcctctggt ggatctgcct atcggcatca acatcaccag attccagaca 720
ctgctggccc tgcacagaag ctacctgaca cctggagatt cttcttctgg ctggacagct 780
ggagctgctg cctattacgt gggctatctg cagcccagaa ccttcctgct gaagtacaac 840
gagaacggca ccatcacaga tgccgtggat tgtgccctgg atcctctgtc tgagaccaag 900
tgtaccctga agagcttcac cgtggagaag ggcatctacc agaccagcaa cttcagagtg 960
cagcctaccg agagcatcgt gagattcccc aacatcacca acctgtgccc ttttggcgag 1020
gtgttcaatg ccaccagatt tgccagcgtg tacgcctgga acaggaagag gatcagcaac 1080
tgtgtggccg attacagcgt gctgtacaac tctgccagct tcagcacctt caagtgctac 1140
ggcgtgtctc ctacaaagct gaacgacctg tgcttcacca acgtgtacgc cgacagcttc 1200
gtgattagag gcgatgaggt gagacagatt gctcctggcc agacaggcaa gattgccgac 1260
tacaactaca agctgcctga cgacttcacc ggctgtgtga ttgcctggaa cagcaacaat 1320
ctggacagca aggtgggcgg caactacaac tacctgtaca ggctgttcag gaagagcaac 1380
ctgaagccct tcgagagaga catcagcacc gagatctatc aggctggaag caccccttgt 1440
aatggcgtgg agggcttcaa ctgttacttc cctctgcaga gctacggctt tcagcctacc 1500
aatggagtgg gctatcagcc ttacagagtg gtggtgctga gctttgaact gctgcatgct 1560
cctgctacag tgtgtggccc caagaagagc accaacctgg tgaagaacaa gtgcgtgaac 1620
ttcaacttca acggcctgac cggaacagga gtgctgacag agagcaacaa gaagttcctg 1680
cccttccagc agttcggcag agatatcgcc gataccacag atgccgtgag agatcctcag 1740
acactggaga tcctggacat cacaccttgc agctttggcg gagtgtctgt gatcacacct 1800
ggcaccaata ccagcaatca ggtggctgtg ctgtaccagg acgtgaattg caccgaagtg 1860
cctgtggcca ttcatgctga tcagctgacc cctacatgga gagtgtacag caccggctct 1920
aatgtgttcc agaccagagc cggatgtctg attggagccg agcacgtgaa taacagctac 1980
gagtgcgaca tccctattgg agccggcatc tgtgcctctt atcagaccca gaccaactct 2040
cctagaagag ccagaagcgt ggcctctcag agcatcattg cctacaccat gtctctggga 2100
gccgagaata gcgtggccta cagcaataac agcatcgcca tccccaccaa cttcaccatc 2160
agcgtgacca cagagattct gcctgtgagc atgaccaaga cctctgtgga ctgcaccatg 2220
tacatctgtg gcgactctac cgagtgcagc aatctgctgc tgcagtatgg cagcttttgt 2280
acccagctga acagagccct gacaggcatt gctgtggagc aggataagaa cacccaggag 2340
gtgtttgccc aggtgaagca gatctacaag acccctccca tcaaggactt cggcggcttt 2400
aacttcagcc agatcctgcc tgatcctagc aagcccagca agaggagctt tatcgaggac 2460
ctgctgttca acaaggtgac cctggccgat gctggcttta tcaagcagta cggagattgt 2520
ctgggcgata tcgccgccag agacctgatt tgtgcccaga agttcaatgg actgaccgtg 2580
ctgcctcctc tgctgacaga tgagatgatt gcccagtaca catctgccct gctggctggc 2640
acaatcacat ctggatggac atttggagct ggagctgccc tgcagatccc ttttgccatg 2700
cagatggcct acagattcaa cggcatcggc gtgacccaga atgtgctgta cgagaaccag 2760
aagctgatcg ccaaccagtt caacagcgcc atcggcaaga tccaggattc tctgtctagc 2820
acagcctctg ctctgggaaa gctgcaggat gtggtgaatc agaatgccca ggccctgaat 2880
acactggtga agcagctgag cagcaacttt ggcgccatca gctctgtgct gaatgacatc 2940
ctgagcagac tggacaaggt ggaggctgaa gtgcagatcg acagactgat cacaggcaga 3000
ctgcagtctc tgcagaccta cgtgacacag cagctgatta gagctgccga gatcagagct 3060
tctgccaatc tggctgccac caagatgtct gagtgtgtgc tgggacagag caagagagtg 3120
gacttctgtg gcaaaggcta ccacctgatg agcttccctc agtctgctcc tcatggagtg 3180
gtgtttctgc acgtgaccta tgtgcctgcc caggagaaga acttcaccac agctcctgcc 3240
atttgtcacg atggcaaggc ccactttcct agagaaggcg tgttcgtgag caatggcaca 3300
cactggttcg tgacccagag gaacttctac gagccccaga tcatcaccac cgataacacc 3360
ttcgtgagcg gcaattgcga cgtggtgatc ggcatcgtga acaataccgt gtacgatcct 3420
ctgcagcctg agctggacag cttcaaggag gagctggaca agtacttcaa gaaccacacc 3480
agccctgatg tggatctggg cgacatctct ggcatcaatg ccagcgtggt gaacatccag 3540
aaggagatcg acaggctgaa cgaggtggcc aagaacctga atgagagcct gatcgatctg 3600
caggagctgg gcaagtacga gcagtacatc aagtggcctt ggtacatctg gctgggcttt 3660
atcgccggac tgattgccat cgtgatggtg accatcatgc tgtgctgcat gaccagctgc 3720
tgtagctgtc tgaagggctg ttgtagctgt ggcagctgtt gcaagttcga cgaggatgat 3780
agcgagcctg tgctgaaagg cgtgaagctg cactacacc 3819
<210> 26
<211> 3819
<212> DNA
<213> Artificial Sequence
<220>
<223> SDC58
<400> 26
atgttcgtgt tcctggtgct gctgcccctg gtgagctctc agtgtgtgaa cctgaccacc 60
agaacccagc tgcctcctgc ctacaccaac agcttcacca gaggcgtgta ctaccccgac 120
aaggtgttca gaagcagcgt gctgcacagc acccaggacc tgttcctgcc cttcttcagc 180
aacgtgacct ggttccacgc catccacgtg tctggcacca atggcaccaa gaggttcgac 240
aaccctgtgc tgcccttcaa cgacggcgtg tacttcgcca gcaccgagaa gagcaacatc 300
atcaggggct ggatcttcgg caccaccctg gacagcaaga cccagagcct gctgatcgtg 360
aacaacgcca ccaacgtggt gatcaaggtg tgcgagttcc agttctgcaa cgaccccttc 420
ctgggcgtgt actaccacaa gaacaacaag agctggatgg agagcgagtt ccgggtgtac 480
agcagcgcca acaactgcac cttcgagtac gtgagccagc ccttcctgat ggacctggag 540
ggcaagcagg gcaacttcaa gaacctgcgg gagttcgtgt tcaagaacat cgacggctac 600
ttcaagatct acagcaagca cacccccatc aacctggtga gagacctgcc tcagggcttt 660
tctgccctgg agcctctggt ggacctgcct atcggcatca acatcaccag gttccagacc 720
ctgctggccc tgcacagaag ctacctgaca cctggcgata gctcttctgg ctggacagct 780
ggagctgctg cctattacgt gggctacctg cagcccagga ccttcctgct gaagtacaac 840
gagaacggca ccatcaccga cgccgtggat tgtgccctgg atcctctgag cgagaccaag 900
tgcaccctga agagcttcac cgtggagaag ggcatctacc agaccagcaa cttccgggtg 960
cagcctaccg agagcatcgt gaggttcccc aacatcacca acctgtgccc tttcggcgag 1020
gtgttcaacg ccaccagatt cgcctctgtg tacgcctgga acaggaagcg gatcagcaac 1080
tgcgtggccg actacagcgt gctgtacaac agcgccagct tcagcacctt caagtgctac 1140
ggcgtgagcc ctaccaagct gaacgacctg tgcttcacca acgtgtacgc cgacagcttc 1200
gtgatcagag gcgatgaggt gagacagatc gcccctggac agaccggcaa gatcgccgac 1260
tacaactaca agctgcccga cgacttcacc ggctgtgtga tcgcctggaa cagcaacaac 1320
ctggacagca aggtgggcgg caactacaac tacctgtacc ggctgttccg gaagagcaac 1380
ctgaagccct tcgagaggga catcagcacc gagatctacc aggccggaag cacaccttgc 1440
aatggcgtgg agggcttcaa ctgctacttc cccctgcaga gctacggctt tcagcctacc 1500
aatggcgtgg gctaccagcc ctacagagtg gtggtgctga gctttgaact gctgcatgcc 1560
cctgccacag tgtgtggccc caagaagagc accaacctgg tgaagaacaa gtgcgtgaac 1620
ttcaacttca acggcctgac cggcacaggc gtgctgaccg agagcaacaa gaagttcctg 1680
cccttccagc agttcggcag agacatcgcc gataccaccg atgccgtgag agatcctcag 1740
accctggaga tcctggacat caccccttgc agctttggcg gagtgagcgt gatcacacct 1800
ggcaccaaca ccagcaatca ggtggccgtg ctgtaccagg acgtgaactg cacagaggtg 1860
cctgtggcca ttcatgccga tcagctgacc cctacctgga gagtgtacag caccggcagc 1920
aatgtgttcc agaccagagc cggctgtctg atcggagccg agcacgtgaa caacagctac 1980
gagtgcgaca tccctatcgg agccggcatc tgcgcctctt accagacaca gaccaacagc 2040
cccagaagag ccagaagcgt ggccagccag tctatcatcg cctacaccat gagcctggga 2100
gccgagaaca gcgtggccta cagcaacaac agcatcgcca tccccaccaa cttcaccatc 2160
agcgtgacca ccgagatcct gcccgtgagc atgaccaaga ccagcgtgga ctgcaccatg 2220
tacatctgcg gcgacagcac agagtgcagc aacctgctgc tgcagtacgg cagcttttgc 2280
acccagctga acagagccct gacaggcatt gccgtggagc aggacaagaa cacccaggag 2340
gtgttcgccc aggtgaagca gatctacaag acccccccca tcaaggactt cggcggcttc 2400
aacttcagcc agatcctgcc tgaccctagc aagcccagca agcggagctt catcgaggac 2460
ctgctgttca acaaggtgac cctggccgat gccggcttca tcaagcagta cggcgattgt 2520
ctgggcgata tcgccgccag agacctgatc tgtgcccaga agttcaacgg cctgaccgtg 2580
ctgcctcctc tgctgacaga tgagatgatc gcccagtaca cctctgccct gctggccgga 2640
accatcacat ctggctggac atttggagct ggagccgccc tgcagatccc tttcgccatg 2700
cagatggcct acaggttcaa cggcatcggc gtgacccaga acgtgctgta cgagaaccag 2760
aagctgatcg ccaaccagtt caacagcgcc atcggcaaga tccaggacag cctgtctagc 2820
acagcctctg ctctgggcaa gctgcaggat gtggtgaacc agaatgccca ggccctgaac 2880
accctggtga agcagctgag cagcaatttc ggcgccatca gcagcgtgct gaacgacatc 2940
ctgagcagac tggacaaggt ggaggccgag gtgcagatcg acagactgat caccggcaga 3000
ctgcagagcc tgcagaccta cgtgacacag cagctgatca gagccgccga gatcagagcc 3060
tctgccaatc tggctgccac caagatgagc gagtgtgtgc tgggccagag caagagagtg 3120
gacttctgcg gcaaaggcta ccacctgatg agcttccccc agtctgctcc tcatggcgtg 3180
gtgtttctgc acgtgaccta cgtgcctgcc caggagaaga acttcaccac agcccctgcc 3240
atctgtcacg atggcaaggc ccacttccct agagagggcg tgttcgtgag caatggcacc 3300
cactggttcg tgacccagcg gaacttctac gagccccaga tcatcaccac cgacaacacc 3360
ttcgtgagcg gcaactgcga cgtggtgatc ggcatcgtga acaacaccgt gtacgaccct 3420
ctgcagcccg agctggacag cttcaaggag gagctggaca agtacttcaa gaaccacacc 3480
agccccgacg tggatctggg cgacatcagc ggcatcaacg ccagcgtggt gaacatccag 3540
aaggagatcg accggctgaa cgaggtggcc aagaacctga acgagagcct gatcgacctg 3600
caggagctgg gcaagtacga gcagtacatc aagtggccct ggtacatctg gctgggcttt 3660
atcgccggcc tgatcgccat cgtgatggtg accatcatgc tgtgctgcat gaccagctgc 3720
tgcagctgcc tgaagggctg ttgtagctgt ggcagctgct gcaagttcga cgaggacgat 3780
agcgagcctg tgctgaaggg cgtgaagctg cactacacc 3819
<210> 27
<211> 3819
<212> DNA
<213> Artificial Sequence
<220>
<223> SDC60
<400> 27
atgttcgtgt tcctggtgct gctgcccctg gtgagcagcc agtgtgtgaa cctgaccacc 60
agaacccagc tgcctcccgc ctacaccaac agcttcacca ggggcgtgta ctaccccgac 120
aaggtgttca ggagcagcgt gctgcacagc acccaggacc tgttcctgcc cttcttcagc 180
aacgtgacct ggttccacgc catccacgtg agcggcacca atggcaccaa gcggttcgac 240
aaccctgtgc tgcccttcaa cgacggcgtg tacttcgcca gcaccgagaa gagcaacatc 300
atccggggct ggatcttcgg caccaccctg gacagcaaga cccagagcct gctgatcgtg 360
aacaacgcca ccaacgtggt gatcaaggtg tgcgagttcc agttctgcaa cgaccccttc 420
ctgggcgtgt actaccacaa gaacaacaag agctggatgg agagcgagtt ccgggtgtac 480
agcagcgcca acaactgcac cttcgagtac gtgagccagc ccttcctgat ggacctggag 540
ggcaagcagg gcaacttcaa gaacctgcgg gagttcgtgt tcaagaacat cgacggctac 600
ttcaagatct acagcaagca cacccccatc aacctggtga gggacctgcc tcagggcttt 660
tctgccctgg agcctctggt ggacctgccc atcggcatca acatcaccag gttccagacc 720
ctgctggccc tgcacaggag ctacctgaca cctggcgata gctcttctgg ctggacagcc 780
ggagctgctg cctactacgt gggctacctg cagccccgga ccttcctgct gaagtacaac 840
gagaacggca ccatcaccga cgccgtggat tgcgccctgg atcctctgag cgagaccaag 900
tgcaccctga agagcttcac cgtggagaag ggcatctacc agaccagcaa cttccgggtg 960
cagcccaccg agagcatcgt gaggttcccc aacatcacca acctgtgccc cttcggcgag 1020
gtgttcaacg ccaccagatt cgccagcgtg tacgcctgga accggaagcg gatcagcaac 1080
tgcgtggccg actacagcgt gctgtacaac agcgccagct tcagcacctt caagtgctac 1140
ggcgtgagcc ccaccaagct gaacgacctg tgcttcacca acgtgtacgc cgacagcttc 1200
gtgatcaggg gcgatgaggt gagacagatc gcccctggcc agaccggcaa gatcgccgac 1260
tacaactaca agctgcccga cgacttcacc ggctgcgtga tcgcctggaa cagcaacaac 1320
ctggacagca aggtgggcgg caactacaac tacctgtacc ggctgttccg gaagagcaac 1380
ctgaagccct tcgagcggga catcagcacc gagatctacc aggccggaag caccccttgc 1440
aacggcgtgg agggcttcaa ctgctacttc cccctgcaga gctacggctt ccagcctacc 1500
aatggcgtgg gctaccagcc ctacagggtg gtggtgctga gctttgagct gctgcatgct 1560
cctgccaccg tgtgcggccc caagaagagc accaacctgg tgaagaacaa gtgcgtgaac 1620
ttcaacttca acggcctgac cggcaccggc gtgctgaccg agagcaacaa gaagttcctg 1680
cccttccagc agttcggcag ggacatcgcc gataccaccg atgccgtgag agaccctcag 1740
accctggaga tcctggacat caccccttgc agcttcggcg gagtgagcgt gatcacacct 1800
ggcaccaaca ccagcaacca ggtggccgtg ctgtaccagg acgtgaactg caccgaggtg 1860
cctgtggcca ttcacgccga tcagctgacc cccacctgga gagtgtacag caccggcagc 1920
aacgtgttcc agaccagagc cggctgtctg atcggcgccg agcacgtgaa caacagctac 1980
gagtgcgaca tccccatcgg cgccggcatc tgtgccagct atcagaccca gaccaacagc 2040
cctaggaggg ccagaagcgt ggccagccag tctatcatcg cctacaccat gagcctgggc 2100
gccgagaaca gcgtggccta cagcaacaac agcatcgcca tccccaccaa cttcaccatc 2160
agcgtgacca ccgagatcct gcccgtgagc atgaccaaga ccagcgtgga ctgcaccatg 2220
tacatctgcg gcgacagcac cgagtgcagc aacctgctgc tgcagtacgg cagcttctgc 2280
acccagctga acagagccct gacaggcatc gccgtggagc aggacaagaa cacccaggag 2340
gtgttcgccc aggtgaagca gatctacaag acccccccca tcaaggactt cggcggcttc 2400
aacttcagcc agatcctgcc tgaccccagc aagcccagca agcggagctt catcgaggac 2460
ctgctgttca acaaggtgac cctggccgac gccggcttca tcaagcagta cggcgactgt 2520
ctgggcgaca tcgccgccag agacctgatc tgtgcccaga agttcaacgg cctgaccgtg 2580
ctgccccctc tgctgaccga tgagatgatc gcccagtaca cctctgccct gctggccggc 2640
accatcacat ctggctggac ctttggagct ggagccgccc tgcagatccc tttcgccatg 2700
cagatggcct accggttcaa cggcatcggc gtgacccaga acgtgctgta cgagaaccag 2760
aagctgatcg ccaaccagtt caacagcgcc atcggcaaga tccaggacag cctgagcagc 2820
accgcctctg ctctgggcaa actgcaggac gtggtgaacc agaacgccca ggccctgaac 2880
accctggtga agcagctgag cagcaacttc ggcgccatca gcagcgtgct gaacgacatc 2940
ctgagcaggc tggacaaggt ggaggccgag gtgcagatcg acaggctgat caccggcaga 3000
ctgcagagcc tgcagaccta cgtgacccag cagctgatca gagccgccga gatcagagcc 3060
tctgccaatc tggccgccac caagatgagc gagtgtgtgc tgggccagag caagagggtg 3120
gacttctgcg gcaagggcta ccacctgatg agcttccccc agtctgcccc tcatggcgtg 3180
gtgttcctgc acgtgaccta cgtgcctgcc caggagaaga acttcaccac cgcccctgcc 3240
atctgccacg atggcaaggc ccacttccct agagagggcg tgttcgtgag caacggcacc 3300
cactggttcg tgacccagcg gaacttctac gagccccaga tcatcaccac cgacaacacc 3360
ttcgtgagcg gcaactgcga cgtggtgatc ggcatcgtga acaacaccgt gtacgacccc 3420
ctgcagcccg agctggacag cttcaaggag gagctggaca agtacttcaa gaaccacacc 3480
agccccgacg tggacctggg cgacatcagc ggcatcaacg ccagcgtggt gaacatccag 3540
aaggagatcg accggctgaa cgaggtggcc aagaacctga acgagagcct gatcgacctg 3600
caggagctgg gcaagtacga gcagtacatc aagtggccct ggtacatctg gctgggcttc 3660
atcgccggcc tgatcgccat cgtgatggtg accatcatgc tgtgctgcat gaccagctgc 3720
tgcagctgcc tgaagggctg ctgcagctgt ggcagctgtt gcaagttcga cgaggacgac 3780
agcgagcccg tgctgaaggg cgtgaagctg cactacacc 3819
<210> 28
<211> 957
<212> DNA
<213> Artificial Sequence
<220>
<223> MT2AE
<400> 28
atggccgatt ctaatggcac catcaccgtg gaagagctga agaagctgct cgagcaatgg 60
aacctggtga tcggatttct gttcctgacc tggatctgtc tgttgcagtt cgcctacgcc 120
aaccggaaca gattcctgta catcatcaaa ctgatcttcc tgtggctgct gtggcctgtg 180
accctggcct gcttcgtgct ggccgccgtg taccggatta actggatcac cggaggcatc 240
gctatcgcca tggcatgcct ggtcggactt atgtggctgt cttatttcat cgccagcttc 300
agactgttcg ctagaaccag aagcatgtgg tcctttaacc ctgagacaaa catcctgctg 360
aacgtgcctc tgcacggcac aatcctgaca cggccactgc tggaaagcga gctggtcatc 420
ggcgccgtga tcctgcgggg ccatctgcgc attgccggac accacctggg cagatgcgac 480
atcaaggacc tgcccaagga aatcaccgtg gccaccagca gaacactgtc ctactacaaa 540
ctgggcgcta gtcagagagt ggccggcgac agcggcttcg ccgcttattc tagatacaga 600
atcggcaact acaagctgaa taccgatcac agcagcagca gcgacaacat cgccctgctg 660
gtgcagggca gcggcgaggg cagaggaagc ctgctgacat gtggcgatgt ggaagagaac 720
cccggccctg ccatgtacag ctttgtgtct gaggaaaccg gcaccctgat cgtgaacagc 780
gtgctgctgt ttctggcctt cgtcgtgttc ctgctggtga cactggctat cctgaccgcc 840
ctgaggctgt gcgcctactg ctgcaacatc gtgaatgtat ccctggtgaa gccttccttc 900
tacgtgtaca gccgggtgaa gaaccttaat agctctagag tgcccgacct gctcgtt 957
<210> 29
<211> 960
<212> DNA
<213> Artificial Sequence
<220>
<223> MP2AE
<400> 29
atggccgaca gcaacggcac aatcacagtg gaagagctga agaagctgct ggagcagtgg 60
aacctggtga ttggatttct tttcctcacc tggatctgcc tgctgcagtt cgcctatgcc 120
aaccggaaca gattcctgta catcatcaag ctgatcttcc tgtggctgct gtggcccgtg 180
accctggcct gttttgtgct ggccgccgtg taccggatca actggatcac cggcggaatc 240
gctatcgcca tggcctgcct ggtgggcctg atgtggctga gctacttcat cgcctccttt 300
agactgttcg ccagaaccag aagcatgtgg tccttcaacc ctgagacaaa tatcctgctc 360
aacgtgcccc tgcacggcac catcctgacc cggcctctgc tcgagagcga gctggtgatc 420
ggcgccgtga tcctgagagg ccacctgaga atcgccggac accacctggg cagatgcgac 480
atcaaggacc tgccaaagga aatcaccgtt gctacaagca gaacactgtc ctactacaag 540
ctgggcgctt ctcaaagagt cgccggcgac agcggcttcg ctgcttatag ccgctacagg 600
attggaaatt acaagctgaa caccgatcat tcttctagca gcgacaacat cgccctgctg 660
gtccagggca gcggcgccac aaacttcagc ctgcttaaac aggccggcga tgtggaagag 720
aaccccggcc ctgccatgta cagcttcgtg tccgaggaaa ccggcaccct gatcgtgaac 780
agcgtgctgc tgttccttgc ttttgtggtg ttcctgctgg tcaccctggc catcctgacc 840
gccctgagac tgtgtgccta ctgctgcaac atcgtgaatg tgtctctggt gaagcctagc 900
ttctacgtgt acagccgggt gaaaaacctg aactctagcc gggtgcctga tctgctggtg 960
<210> 30
<211> 798
<212> DNA
<213> Artificial Sequence
<220>
<223> SGS-RBD
<400> 30
atggagacag acacactcct gctatgggta ctgctgctct gggttccagg ttccaccgga 60
gactgcccat ttggcgaggt gttcaacgca acccgcttcg ccagcgtgta cgcctggaat 120
aggaagcgga tcagcaactg cgtggccgac tatagcgtgc tgtacaactc cgcctctttc 180
agcaccttta agtgctatgg cgtgtccccc acaaagctga atgacctgtg ctttaccaac 240
gtctacgccg attctttcgt gatcaggggc gacgaggtgc gccagatcgc ccccggccag 300
acaggcaaga tcgcagacta caattataag ctgccagacg atttcaccgg ctgcgtgatc 360
gcctggaaca gcaacaatct ggattccaaa gtgggcggca actacaatta tctgtaccgg 420
ctgtttagaa agagcaatct gaagcccttc gagagggaca tctctacaga aatctaccag 480
gccggcagca ccccttgcaa tggcgtggag ggctttaact gttatttccc actccagtcc 540
tacggcttcc agcccacaaa cggcgtgggc tatcagcctt accgcgtggt ggtgctgagc 600
tttgagctgc tgcacgccta cccgtacgac gtgccggact acgccaatgc tgtgggccag 660
gacacgcagg aggtcatcgt ggtgccacac tccttgccct ttaaggtggt ggtgatctca 720
gccatcctgg ccctggtggt gctcaccatc atctccctta tcatcctcat catgctttgg 780
cagaagaagc cacgttag 798
<210> 31
<211> 3819
<212> RNA
<213> Artificial Sequence
<220>
<223> SDC-50 mRNA
<400> 31
uacaagcaca aggaccacga cgacggagac cacagaagag ucacacacuu agacuguugg 60
ucuugggucg acggaggacg gauaugguug ucgaaguguu cuccgcacau gaugggacug 120
uuccacaagu ccagaagaca cgacgugaga uggguccuag acaaggacgg aaagaagucg 180
uugcacugga ccaaagugcg guagguacac agaccguggu uaccgugguu cucuaagcug 240
uuaggacacg acggaaaguu gcuaccgcac augaagcgga gauggcucuu cucguuguag 300
uagucuccga ccuagaaacc guguugggac cuaucguucu gggucagaga cgacuagcac 360
uuguuacggu gguugcacca cuaguuccac acgcucaagg ucaagacguu acugggaaag 420
gacccgcaca ugaugguguu cuuguuguuc ucgaccuacc ucucgcucaa gucccacaug 480
ucgagacggu uguuaacgug gaagcucaug cacucggucg gaaaggacua ccuagaccuu 540
ccuuucgucc cguugaaguu cuuggacgcc cucaagcaca aguucuugua gcugccgaug 600
aaguucuaga ugucguucgu guggggguag uuagaccacu cucuagacgg agucccuaaa 660
agacgagacc uuggagacca ccuagacgga uagccguagu uguagugguc uaaggucugu 720
gacgaccgag acgugucuuc gauagacugu ggaccgcuaa gaagaagacc uaccugucga 780
ccucgacgac gaauaaugca cccgauggac gucggaucuu ggaaggacga cuucauguug 840
cucuuaccgu gguaguggcu acgacaccua acacgggacc uaggagacag acucuguuuc 900
acaugggacu ucucgaagug gcaccucuuc ccguagaugg ucuggucguu aaagucucac 960
gucggauggc ucucguagca cucuaagggg uuguaguggu uagacacggg aaaaccgcuc 1020
cacaaguuac gguggucuaa acggucgcac auacggaccu uguccuucuc uuagucguug 1080
acacaccggc ugaugucgca cgacauguua agacggucga aaucguggaa guucacgaug 1140
ccgcacagag gaugguucga cuuacuggac acaaaguggu ugcacaugcg gcugucgaag 1200
cacuagucuc cucuacuuca cucugucuaa cgaggaccgg ucuguccguu cuagcggcua 1260
auguugaugu ucgacggacu acugaagugg ccgacacacu agcggaccuu aucguuguua 1320
gaccugucgu uucacccgcc guugauguug auggacaugu ccgacaaguc cuucucguug 1380
gacuucggga agcucucucu guagagaugg cucuagauag uccgaccuuc guggggaaca 1440
uuaccgcacc uuccgaaguu gacaaugaag ggagacgucu cgaugccgaa agucggaugg 1500
uuaccucacc cuauagucgg aaugucucac caccacgacu cgaaacuuga cgacguacga 1560
ggacgauguc acacaccggg auucuucucg ugguuggacc acuucuuguu cacgcacuug 1620
aaguugaagu ugccggacug gccuuguccu cacgacuguc ucucguuguu cuucaaggac 1680
gggaaggucg ucaaaccguc ucuguaacgg cuaugguguc uacggcacuc ucuaggaguc 1740
ugugaccucu aggaccuaua guguggaacg ucgaaaccgc cucacagaca cuagugugga 1800
ccuugguuau ggucguuagu ccaccgacac gacauggucc ugcacuuaac gugucuucac 1860
ggacaccggu aaguacgacu agucgacugg ggauguaccu cucacauguc guguccgucg 1920
uuacacaaag ucuggucucg gccuacagac uaaccucgac ucgugcacuu guugucgaug 1980
cucacacugu agggauaacc ucggccuuag acacggucga uagucugugu cugguugaga 2040
ggaucuucuc ggucuagaca ccggucgguc agauaguagc ggauauggua cagagacccu 2100
cgacucuuau cgcaccggau gucguuguug ucguagcggu agggaugguu gaagugguag 2160
ucgcacuguu gucucuagga cggacacucg uacugguucu guagacaccu gacgugguac 2220
auguagacac cgcugucgug ucucacaucg uuagacgacg acgucaugcc gucgaaaaca 2280
ugggucgacu uaucucggga cuguccuuaa cggcaccucg uccuauucuu auggguccuc 2340
cacaaacggg uccacuucgu cuagauguuc uggggaggau aguuccugaa gccgccgaag 2400
uugaagucgg ucuaagacgg acuaggaucg uucgggucgu ucucuucaaa guagcuccua 2460
gacgacaagu uguuccacug ggaccggcua cggccuaaau aguucgucau accgcuaaca 2520
gacccgcuau agcggcgguc ucuagacuaa acacgggucu ucaaguuacc ugacuggcac 2580
gacggaggag acgacugucu acucuacuaa cgagucaugu guagacggga cgaccgaccg 2640
uguuagugua gaccuaccug uaaaccucga ccucgacgag acgucuaggg aaaacgguac 2700
gucuaccgga ugucuaaguu gccguagccu cacugggucu uacacgacau gcucuugguc 2760
uucgacuagc gguuggucaa guugucgcgg uaaccguucu agguccuaag agacagaucg 2820
ugucgaagac gagacccguu ugacguccua caccacuuag ucuuacgagu ccgggacuua 2880
ugggaccacu ucgucgacag aucguuaaaa ccgcgguagu cgucgcacga cuuacuguag 2940
gacucgucug accuauuuca ccuccggcuu cacgucuagc ugucugacua guguccuucu 3000
gacgucagag acgucuggau gcacuguguc gucgacuaau cucgacggcu cuaaucucgg 3060
agacgauuag accgacggug guucuacaga cucacacacg acccugucag auucucucac 3120
cugaagacac cguuuccgau gguggacuac ucgaaaggag ucagacgagg aguaccucac 3180
cacaaagacg ugcacuguau acacggacgg guccucuucu ugaaguggug ucgaggacgg 3240
uaaacagugc uaccguuucg ggugaaagga ucucuuccgc acaagcacuc guuaccuugg 3300
gugaccaaac acugggucuc uuugaagaug cucggggucu aguaguggug gcuguuaugg 3360
aagcacagac cguuaacgcu gcaccacuag ccguagcacu uguuauggca cauacuagga 3420
gacgucggac ucgaccuguc gaaguuccuc cucgaccugu ucaugaaguu cuuggugugg 3480
ucgggacuac accuagaccc gcuauagaga ccguaguuac ggagacacca cuuguagguc 3540
uuccucuagc uguccgacuu acuccaccgg uucuuggacu uacucucgga cuagcuagac 3600
guccucgacc cuuucaugcu cgucauguag uucaccggaa ccauguagac cgacccgaaa 3660
uaacggccug acuaacggua gcacuaccac ugguaguacg acacgacgua cuguucgaca 3720
acaucgacag acuucccgac gacaagaaca ccgucgacaa cguucaagcu acuccuacua 3780
ucgcucggac acgacuuucc gcacuucgac gugaugugg 3819
<210> 32
<211> 3819
<212> RNA
<213> Artificial Sequence
<220>
<223> SDC-54 mRNA
<400> 32
uacaagcaca aggaccacga cgacggagac cacucgagag ucacacacuu agacuggugu 60
ucuugggucg acggaggacg gaugugguug ucgaaauggu cuccucacau gauggggcug 120
uuccacaagu cuucgucgca cgacguaucg uguguccuag acaaggacgg gaagaagucg 180
uugcacugga ccaaagugcg guagguacac agaccguggu uaccgugguu cucuaagcug 240
uugggacacg acggaaaguu gcuaccgcac augaagcgga gauggcucuu cucguuguag 300
uagucuccga ccuagaagcc guggugugac cuaucguucu gggucagaga cgacuagcac 360
uuguugcggu gguugcacca cuaguuccac acgcucaagg ucaagacguu gcuggggaag 420
gacccgcaca ugaugguguu cuuguuguuc ucgaccuacc ucucgcucaa gucccacaug 480
ucgucgcggu uguuaacgug gaagcucaug cacucggucg gaaaggacua ccuagaccuc 540
ccuuucgucc cguugaaguu cuuggacgcc cucaagcaca aguucuugua gcugccgaug 600
aaguucuaga ugucguucgu guggggguag uuggaccacu cucuagacgg agucccuaaa 660
agacgagacc ucggagacca ccuagacgga uagccguagu uguagugguc uaaggucugu 720
gacgaccggg acgugucuuc gauggacugu ggaccucuaa gaagaagacc gaccugucga 780
ccucgacgac ggauaaugca cccgauagac gucgggucuu ggaaggacga cuucauguug 840
cucuugccgu gguagugucu acggcaccua acacgggacc uaggagacag acucugguuc 900
acaugggacu ucucgaagug gcaccucuuc ccguagaugg ucuggucguu gaagucucac 960
gucggauggc ucucguagca cucuaagggg uuguaguggu uggacacggg aaaaccgcuc 1020
cacaaguuac gguggucuaa acggucgcac augcggaccu uguccuucuc cuagucguug 1080
acacaccggc uaaugucgca cgacauguug agacggucga agucguggaa guucacgaug 1140
ccgcacagag gauguuucga cuugcuggac acgaaguggu ugcacaugcg gcugucgaag 1200
cacuaaucuc cgcuacucca cucugucuaa cgaggaccgg ucuguccguu cuaacggcug 1260
auguugaugu ucgacggacu gcugaagugg ccgacacacu aacggaccuu gucguuguua 1320
gaccugucgu uccacccgcc guugauguug auggacaugu ccgacaaguc cuucucguug 1380
gacuucggga agcucucucu guagucgugg cucuagauag uccgaccuuc guggggaaca 1440
uuaccgcacc ucccgaaguu gacaaugaag ggagacgucu cgaugccgaa agucggaugg 1500
uuaccucacc cgauagucgg aaugucucac caccacgacu cgaaacuuga cgacguacga 1560
ggacgauguc acacaccggg guucuucucg ugguuggacc acuucuuguu cacgcacuug 1620
aaguugaagu ugccggacug gccuuguccu cacgacuguc ucucguuguu cuucaaggac 1680
gggaaggucg ucaagccguc ucuauagcgg cuaugguguc uacggcacuc ucuaggaguc 1740
ugugaccucu aggaccugua guguggaacg ucgaaaccgc cucacagaca cuagugugga 1800
ccgugguuau ggucguuagu ccaccgacac gacauggucc ugcacuuaac guggcuucac 1860
ggacaccggu aaguacgacu agucgacugg ggauguaccu cucacauguc guggccgaga 1920
uuacacaagg ucuggucucg gccuacagac uaaccucggc ucgugcacuu auugucgaug 1980
cucacgcugu agggauaacc ucggccguag acacggagaa uagucugggu cugguugaga 2040
ggaucuucuc ggucuucgca ccggagaguc ucguaguaac ggauguggua cagagacccu 2100
cggcucuuau cgcaccggau gucguuauug ucguagcggu aggggugguu gaagugguag 2160
ucgcacuggu gucucuaaga cggacacucg uacugguucu ggagacaccu gacgugguac 2220
auguagacac cgcugagaug gcucacgucg uuagacgacg acgucauacc gucgaaaaca 2280
ugggucgacu ugucucggga cuguccguaa cgacaccucg uccuauucuu guggguccuc 2340
cacaaacggg uccacuucgu cuagauguuc uggggagggu aguuccugaa gccgccgaaa 2400
uugaagucgg ucuaggacgg acuaggaucg uucgggucgu ucuccucgaa auagcuccug 2460
gacgacaagu uguuccacug ggaccggcua cgaccgaaau aguucgucau gccucuaaca 2520
gacccgcuau agcggcgguc ucuggacuaa acacgggucu ucaaguuacc ugacuggcac 2580
gacggaggag acgacugucu acucuacuaa cgggucaugu guagacggga cgaccgaccg 2640
uguuagugua gaccuaccug uaaaccucga ccucgacggg acgucuaggg aaaacgguac 2700
gucuaccgga ugucuaaguu gccguagccg cacugggucu uacacgacau gcucuugguc 2760
uucgacuagc gguuggucaa guugucgcgg uagccguucu agguccuaag agacagaucg 2820
ugucggagac gagacccuuu cgacguccua caccacuuag ucuuacgggu ccgggacuua 2880
ugugaccacu ucgucgacuc gucguugaaa ccgcgguagu cgagacacga cuuacuguag 2940
gacucgucug accuguucca ccuccgacuu cacgucuagc ugucugacua guguccgucu 3000
gacgucagag acgucuggau gcacuguguc gucgacuaau cucgacggcu cuagucucga 3060
agacgguuag accgacggug guucuacaga cucacacacg acccugucuc guucucucac 3120
cugaagacac cguuuccgau gguggacuac ucgaagggag ucagacgagg aguaccucac 3180
cacaaagacg ugcacuggau acacggacgg guccucuucu ugaaguggug ucgaggacgg 3240
uaaacagugc uaccguuccg ggugaaagga ucucuuccgc acaagcacuc guuaccgugu 3300
gugaccaagc acugggucuc cuugaagaug cucggggucu aguaguggug gcuauugugg 3360
aagcacucgc cguuaacgcu gcaccacuag ccguagcacu uguuauggca caugcuagga 3420
gacgucggac ucgaccuguc gaaguuccuc cucgaccugu ucaugaaguu cuuggugugg 3480
ucgggacuac accuagaccc gcuguagaga ccguaguuac ggucgcacca cuuguagguc 3540
uuccucuagc uguccgacuu gcuccaccgg uucuuggacu uacucucgga cuagcuagac 3600
guccucgacc cguucaugcu cgucauguag uucaccggaa ccauguagac cgacccgaaa 3660
uagcggccug acuaacggua gcacuaccac ugguaguacg acacgacgua cuggucgacg 3720
acaucgacag acuucccgac aacaucgaca ccgucgacaa cguucaagcu gcuccuacua 3780
ucgcucggac acgacuuucc gcacuucgac gugaugugg 3819
<210> 33
<211> 3819
<212> RNA
<213> Artificial Sequence
<220>
<223> SDC-58 mRNA
<400> 33
uacaagcaca aggaccacga cgacggggac cacucgagag ucacacacuu ggacuggugg 60
ucuugggucg acggaggacg gaugugguug ucgaaguggu cuccgcacau gauggggcug 120
uuccacaagu cuucgucgca cgacgugucg uggguccugg acaaggacgg gaagaagucg 180
uugcacugga ccaaggugcg guaggugcac agaccguggu uaccgugguu cuccaagcug 240
uugggacacg acgggaaguu gcugccgcac augaagcggu cguggcucuu cucguuguag 300
uaguccccga ccuagaagcc guggugggac cugucguucu gggucucgga cgacuagcac 360
uuguugcggu gguugcacca cuaguuccac acgcucaagg ucaagacguu gcuggggaag 420
gacccgcaca ugaugguguu cuuguuguuc ucgaccuacc ucucgcucaa ggcccacaug 480
ucgucgcggu uguugacgug gaagcucaug cacucggucg ggaaggacua ccuggaccuc 540
ccguucgucc cguugaaguu cuuggacgcc cucaagcaca aguucuugua gcugccgaug 600
aaguucuaga ugucguucgu guggggguag uuggaccacu cucuggacgg agucccgaaa 660
agacgggacc ucggagacca ccuggacgga uagccguagu uguagugguc caaggucugg 720
gacgaccggg acgugucuuc gauggacugu ggaccgcuau cgagaagacc gaccugucga 780
ccucgacgac ggauaaugca cccgauggac gucggguccu ggaaggacga cuucauguug 840
cucuugccgu gguaguggcu gcggcaccua acacgggacc uaggagacuc gcucugguuc 900
acgugggacu ucucgaagug gcaccucuuc ccguagaugg ucuggucguu gaaggcccac 960
gucggauggc ucucguagca cuccaagggg uuguaguggu uggacacggg aaagccgcuc 1020
cacaaguugc gguggucuaa gcggagacac augcggaccu uguccuucgc cuagucguug 1080
acgcaccggc ugaugucgca cgacauguug ucgcggucga agucguggaa guucacgaug 1140
ccgcacucgg gaugguucga cuugcuggac acgaaguggu ugcacaugcg gcugucgaag 1200
cacuagucuc cgcuacucca cucugucuag cggggaccug ucuggccguu cuagcggcug 1260
auguugaugu ucgacgggcu gcugaagugg ccgacacacu agcggaccuu gucguuguug 1320
gaccugucgu uccacccgcc guugauguug auggacaugg ccgacaaggc cuucucguug 1380
gacuucggga agcucucccu guagucgugg cucuagaugg uccggccuuc guguggaacg 1440
uuaccgcacc ucccgaaguu gacgaugaag ggggacgucu cgaugccgaa agucggaugg 1500
uuaccgcacc cgauggucgg gaugucucac caccacgacu cgaaacuuga cgacguacgg 1560
ggacgguguc acacaccggg guucuucucg ugguuggacc acuucuuguu cacgcacuug 1620
aaguugaagu ugccggacug gccguguccg cacgacuggc ucucguuguu cuucaaggac 1680
gggaaggucg ucaagccguc ucuguagcgg cuaugguggc uacggcacuc ucuaggaguc 1740
ugggaccucu aggaccugua guggggaacg ucgaaaccgc cucacucgca cuagugugga 1800
ccgugguugu ggucguuagu ccaccggcac gacauggucc ugcacuugac gugucuccac 1860
ggacaccggu aaguacggcu agucgacugg ggauggaccu cucacauguc guggccgucg 1920
uuacacaagg ucuggucucg gccgacagac uagccucggc ucgugcacuu guugucgaug 1980
cucacgcugu agggauagcc ucggccguag acgcggagaa uggucugugu cugguugucg 2040
gggucuucuc ggucuucgca ccggucgguc agauaguagc ggauguggua cucggacccu 2100
cggcucuugu cgcaccggau gucguuguug ucguagcggu aggggugguu gaagugguag 2160
ucgcacuggu ggcucuagga cgggcacucg uacugguucu ggucgcaccu gacgugguac 2220
auguagacgc cgcugucgug ucucacgucg uuggacgacg acgucaugcc gucgaaaacg 2280
ugggucgacu ugucucggga cuguccguaa cggcaccucg uccuguucuu guggguccuc 2340
cacaagcggg uccacuucgu cuagauguuc uggggggggu aguuccugaa gccgccgaag 2400
uugaagucgg ucuaggacgg acugggaucg uucgggucgu ucgccucgaa guagcuccug 2460
gacgacaagu uguuccacug ggaccggcua cggccgaagu aguucgucau gccgcuaaca 2520
gacccgcuau agcggcgguc ucuggacuag acacgggucu ucaaguugcc ggacuggcac 2580
gacggaggag acgacugucu acucuacuag cgggucaugu ggagacggga cgaccggccu 2640
ugguagugua gaccgaccug uaaaccucga ccucggcggg acgucuaggg aaagcgguac 2700
gucuaccgga uguccaaguu gccguagccg cacugggucu ugcacgacau gcucuugguc 2760
uucgacuagc gguuggucaa guugucgcgg uagccguucu agguccuguc ggacagaucg 2820
ugucggagac gagacccguu cgacguccua caccacuugg ucuuacgggu ccgggacuug 2880
ugggaccacu ucgucgacuc gucguuaaag ccgcgguagu cgucgcacga cuugcuguag 2940
gacucgucug accuguucca ccuccggcuc cacgucuagc ugucugacua guggccgucu 3000
gacgucucgg acgucuggau gcacuguguc gucgacuagu cucggcggcu cuagucucgg 3060
agacgguuag accgacggug guucuacucg cucacacacg acccggucuc guucucucac 3120
cugaagacgc cguuuccgau gguggacuac ucgaaggggg ucagacgagg aguaccgcac 3180
cacaaagacg ugcacuggau gcacggacgg guccucuucu ugaaguggug ucggggacgg 3240
uagacagugc uaccguuccg ggugaaggga ucucucccgc acaagcacuc guuaccgugg 3300
gugaccaagc acugggucgc cuugaagaug cucggggucu aguaguggug gcuguugugg 3360
aagcacucgc cguugacgcu gcaccacuag ccguagcacu uguuguggca caugcuggga 3420
gacgucgggc ucgaccuguc gaaguuccuc cucgaccugu ucaugaaguu cuuggugugg 3480
ucggggcugc accuagaccc gcuguagucg ccguaguugc ggucgcacca cuuguagguc 3540
uuccucuagc uggccgacuu gcuccaccgg uucuuggacu ugcucucgga cuagcuggac 3600
guccucgacc cguucaugcu cgucauguag uucaccggga ccauguagac cgacccgaaa 3660
uagcggccgg acuagcggua gcacuaccac ugguaguacg acacgacgua cuggucgacg 3720
acgucgacgg acuucccgac aacaucgaca ccgucgacga cguucaagcu gcuccugcua 3780
ucgcucggac acgacuuccc gcacuucgac gugaugugg 3819
<210> 34
<211> 3819
<212> RNA
<213> Artificial Sequence
<220>
<223> SDC-60 mRNA
<400> 34
uacaagcaca aggaccacga cgacggggac cacucgucgg ucacacacuu ggacuggugg 60
ucuugggucg acggagggcg gaugugguug ucgaaguggu ccccgcacau gauggggcug 120
uuccacaagu ccucgucgca cgacgugucg uggguccugg acaaggacgg gaagaagucg 180
uugcacugga ccaaggugcg guaggugcac ucgccguggu uaccgugguu cgccaagcug 240
uugggacacg acgggaaguu gcugccgcac augaagcggu cguggcucuu cucguuguag 300
uaggccccga ccuagaagcc guggugggac cugucguucu gggucucgga cgacuagcac 360
uuguugcggu gguugcacca cuaguuccac acgcucaagg ucaagacguu gcuggggaag 420
gacccgcaca ugaugguguu cuuguuguuc ucgaccuacc ucucgcucaa ggcccacaug 480
ucgucgcggu uguugacgug gaagcucaug cacucggucg ggaaggacua ccuggaccuc 540
ccguucgucc cguugaaguu cuuggacgcc cucaagcaca aguucuugua gcugccgaug 600
aaguucuaga ugucguucgu guggggguag uuggaccacu cccuggacgg agucccgaaa 660
agacgggacc ucggagacca ccuggacggg uagccguagu uguagugguc caaggucugg 720
gacgaccggg acguguccuc gauggacugu ggaccgcuau cgagaagacc gaccugucgg 780
ccucgacgac ggaugaugca cccgauggac gucggggccu ggaaggacga cuucauguug 840
cucuugccgu gguaguggcu gcggcaccua acgcgggacc uaggagacuc gcucugguuc 900
acgugggacu ucucgaagug gcaccucuuc ccguagaugg ucuggucguu gaaggcccac 960
gucggguggc ucucguagca cuccaagggg uuguaguggu uggacacggg gaagccgcuc 1020
cacaaguugc gguggucuaa gcggucgcac augcggaccu uggccuucgc cuagucguug 1080
acgcaccggc ugaugucgca cgacauguug ucgcggucga agucguggaa guucacgaug 1140
ccgcacucgg ggugguucga cuugcuggac acgaaguggu ugcacaugcg gcugucgaag 1200
cacuaguccc cgcuacucca cucugucuag cggggaccgg ucuggccguu cuagcggcug 1260
auguugaugu ucgacgggcu gcugaagugg ccgacgcacu agcggaccuu gucguuguug 1320
gaccugucgu uccacccgcc guugauguug auggacaugg ccgacaaggc cuucucguug 1380
gacuucggga agcucgcccu guagucgugg cucuagaugg uccggccuuc guggggaacg 1440
uugccgcacc ucccgaaguu gacgaugaag ggggacgucu cgaugccgaa ggucggaugg 1500
uuaccgcacc cgauggucgg gaugucccac caccacgacu cgaaacucga cgacguacga 1560
ggacgguggc acacgccggg guucuucucg ugguuggacc acuucuuguu cacgcacuug 1620
aaguugaagu ugccggacug gccguggccg cacgacuggc ucucguuguu cuucaaggac 1680
gggaaggucg ucaagccguc ccuguagcgg cuaugguggc uacggcacuc ucugggaguc 1740
ugggaccucu aggaccugua guggggaacg ucgaagccgc cucacucgca cuagugugga 1800
ccgugguugu ggucguuggu ccaccggcac gacauggucc ugcacuugac guggcuccac 1860
ggacaccggu aagugcggcu agucgacugg ggguggaccu cucacauguc guggccgucg 1920
uugcacaagg ucuggucucg gccgacagac uagccgcggc ucgugcacuu guugucgaug 1980
cucacgcugu agggguagcc gcggccguag acacggucga uagucugggu cugguugucg 2040
ggauccuccc ggucuucgca ccggucgguc agauaguagc ggauguggua cucggacccg 2100
cggcucuugu cgcaccggau gucguuguug ucguagcggu aggggugguu gaagugguag 2160
ucgcacuggu ggcucuagga cgggcacucg uacugguucu ggucgcaccu gacgugguac 2220
auguagacgc cgcugucgug gcucacgucg uuggacgacg acgucaugcc gucgaagacg 2280
ugggucgacu ugucucggga cuguccguag cggcaccucg uccuguucuu guggguccuc 2340
cacaagcggg uccacuucgu cuagauguuc uggggggggu aguuccugaa gccgccgaag 2400
uugaagucgg ucuaggacgg acuggggucg uucgggucgu ucgccucgaa guagcuccug 2460
gacgacaagu uguuccacug ggaccggcug cggccgaagu aguucgucau gccgcugaca 2520
gacccgcugu agcggcgguc ucuggacuag acacgggucu ucaaguugcc ggacuggcac 2580
gacgggggag acgacuggcu acucuacuag cgggucaugu ggagacggga cgaccggccg 2640
ugguagugua gaccgaccug gaaaccucga ccucggcggg acgucuaggg aaagcgguac 2700
gucuaccgga uggccaaguu gccguagccg cacugggucu ugcacgacau gcucuugguc 2760
uucgacuagc gguuggucaa guugucgcgg uagccguucu agguccuguc ggacucgucg 2820
uggcggagac gagacccguu ugacguccug caccacuugg ucuugcgggu ccgggacuug 2880
ugggaccacu ucgucgacuc gucguugaag ccgcgguagu cgucgcacga cuugcuguag 2940
gacucguccg accuguucca ccuccggcuc cacgucuagc uguccgacua guggccgucu 3000
gacgucucgg acgucuggau gcacuggguc gucgacuagu cucggcggcu cuagucucgg 3060
agacgguuag accggcggug guucuacucg cucacacacg acccggucuc guucucccac 3120
cugaagacgc cguucccgau gguggacuac ucgaaggggg ucagacgggg aguaccgcac 3180
cacaaggacg ugcacuggau gcacggacgg guccucuucu ugaaguggug gcggggacgg 3240
uagacggugc uaccguuccg ggugaaggga ucucucccgc acaagcacuc guugccgugg 3300
gugaccaagc acugggucgc cuugaagaug cucggggucu aguaguggug gcuguugugg 3360
aagcacucgc cguugacgcu gcaccacuag ccguagcacu uguuguggca caugcugggg 3420
gacgucgggc ucgaccuguc gaaguuccuc cucgaccugu ucaugaaguu cuuggugugg 3480
ucggggcugc accuggaccc gcuguagucg ccguaguugc ggucgcacca cuuguagguc 3540
uuccucuagc uggccgacuu gcuccaccgg uucuuggacu ugcucucgga cuagcuggac 3600
guccucgacc cguucaugcu cgucauguag uucaccggga ccauguagac cgacccgaag 3660
uagcggccgg acuagcggua gcacuaccac ugguaguacg acacgacgua cuggucgacg 3720
acgucgacgg acuucccgac gacgucgaca ccgucgacaa cguucaagcu gcuccugcug 3780
ucgcucgggc acgacuuccc gcacuucgac gugaugugg 3819
<210> 35
<211> 957
<212> RNA
<213> Artificial Sequence
<220>
<223> MT2AE mRNA
<400> 35
uaccggcuaa gauuaccgug guaguggcac cuucucgacu ucuucgacga gcucguuacc 60
uuggaccacu agccuaaaga caaggacugg accuagacag acaacgucaa gcggaugcgg 120
uuggccuugu cuaaggacau guaguaguuu gacuagaagg acaccgacga caccggacac 180
ugggaccgga cgaagcacga ccggcggcac auggccuaau ugaccuagug gccuccguag 240
cgauagcggu accguacgga ccagccugaa uacaccgaca gaauaaagua gcggucgaag 300
ucugacaagc gaucuugguc uucguacacc aggaaauugg gacucuguuu guaggacgac 360
uugcacggag acgugccgug uuaggacugu gccggugacg accuuucgcu cgaccaguag 420
ccgcggcacu aggacgcccc gguagacgcg uaacggccug ugguggaccc gucuacgcug 480
uaguuccugg acggguuccu uuaguggcac cgguggucgu cuugugacag gaugauguuu 540
gacccgcgau cagucucuca ccggccgcug ucgccgaagc ggcgaauaag aucuaugucu 600
uagccguuga uguucgacuu auggcuagug ucgucgucgu cgcuguugua gcgggacgac 660
cacgucccgu cgccgcuccc gucuccuucg gacgacugua caccgcuaca ccuucucuug 720
gggccgggac gguacauguc gaaacacaga cuccuuuggc cgugggacua gcacuugucg 780
cacgacgaca aagaccggaa gcagcacaag gacgaccacu gugaccgaua ggacuggcgg 840
gacuccgaca cgcggaugac gacguuguag cacuuacaua gggaccacuu cggaaggaag 900
augcacaugu cggcccacuu cuuggaauua ucgagaucuc acgggcugga cgagcaa 957
<210> 36
<211> 960
<212> RNA
<213> Artificial Sequence
<220>
<223> MP2AE mRNA
<400> 36
uaccggcugu cguugccgug uuagugucac cuucucgacu ucuucgacga ccucgucacc 60
uuggaccacu aaccuaaaga aaaggagugg accuagacgg acgacgucaa gcggauacgg 120
uuggccuugu cuaaggacau guaguaguuc gacuagaagg acaccgacga caccgggcac 180
ugggaccgga caaaacacga ccggcggcac auggccuagu ugaccuagug gccgccuuag 240
cgauagcggu accggacgga ccacccggac uacaccgacu cgaugaagua gcggaggaaa 300
ucugacaagc ggucuugguc uucguacacc aggaaguugg gacucuguuu auaggacgag 360
uugcacgggg acgugccgug guaggacugg gccggagacg agcucucgcu cgaccacuag 420
ccgcggcacu aggacucucc gguggacucu uagcggccug ugguggaccc gucuacgcug 480
uaguuccugg acgguuuccu uuaguggcaa cgauguucgu cuugugacag gaugauguuc 540
gacccgcgaa gaguuucuca gcggccgcug ucgccgaagc gacgaauauc ggcgaugucc 600
uaaccuuuaa uguucgacuu guggcuagua agaagaucgu cgcuguugua gcgggacgac 660
caggucccgu cgccgcggug uuugaagucg gacgaauuug uccggccgcu acaccuucuc 720
uuggggccgg gacgguacau gucgaagcac aggcuccuuu ggccguggga cuagcacuug 780
ucgcacgacg acaaggaacg aaaacaccac aaggacgacc agugggaccg guaggacugg 840
cgggacucug acacacggau gacgacguug uagcacuuac acagagacca cuucggaucg 900
aagaugcaca ugucggccca cuuuuuggac uugagaucgg cccacggacu agacgaccac 960
<210> 37
<211> 798
<212> RNA
<213> Artificial Sequence
<220>
<223> SGS-RBD mRNA
<400> 37
uaccucuguc ugugugagga cgauacccau gacgacgaga cccaaggucc aagguggccu 60
cugacgggua aaccgcucca caaguugcgu ugggcgaagc ggucgcacau gcggaccuua 120
uccuucgccu agucguugac gcaccggcug auaucgcacg acauguugag gcggagaaag 180
ucguggaaau ucacgauacc gcacaggggg uguuucgacu uacuggacac gaaaugguug 240
cagaugcggc uaagaaagca cuaguccccg cugcuccacg cggucuagcg ggggccgguc 300
uguccguucu agcgucugau guuaauauuc gacggucugc uaaaguggcc gacgcacuag 360
cggaccuugu cguuguuaga ccuaagguuu cacccgccgu ugauguuaau agacauggcc 420
gacaaaucuu ucucguuaga cuucgggaag cucucccugu agagaugucu uuagaugguc 480
cggccgucgu ggggaacguu accgcaccuc ccgaaauuga caauaaaggg ugaggucagg 540
augccgaagg ucggguguuu gccgcacccg auagucggaa uggcgcacca ccacgacucg 600
aaacucgacg acgugcggau gggcaugcug cacggccuga ugcgguuacg acacccgguc 660
cugugcgucc uccaguagca ccacggugug aggaacggga aauuccacca ccacuagagu 720
cgguaggacc gggaccacca cgagugguag uagagggaau aguaggagua guacgaaacc 780
gucuucuucg gugcaauc 798
<210> 38
<211> 66
<212> DNA
<213> Artificial Sequence
<220>
<223> T2A DNA
<400> 38
ggcagcggcg agggcagagg aagcctgctg acatgtggcg atgtggaaga gaaccccggc 60
cctgcc 66
<210> 39
<211> 69
<212> DNA
<213> Artificial Sequence
<220>
<223> P2A DNA
<400> 39
ggcagcggcg ccacaaactt cagcctgctt aaacaggccg gcgatgtgga agagaacccc 60
ggccctgcc 69
<210> 40
<211> 66
<212> RNA
<213> Artificial Sequence
<220>
<223> T2A mRNA
<400> 40
ccgucgccgc ucccgucucc uucggacgac uguacaccgc uacaccuucu cuuggggccg 60
ggacgg 66
<210> 41
<211> 69
<212> RNA
<213> Artificial Sequence
<220>
<223> P2A mRNA
<400> 41
ccgucgccgc gguguuugaa gucggacgaa uuuguccggc cgcuacaccu ucucuugggg 60
ccgggacgg 69
<210> 42
<211> 22
<212> PRT
<213> Thosea asigna virus 2A
<400> 42
Gly Ser Gly Glu Gly Arg Gly Ser Leu Leu Thr Cys Gly Asp Val Glu
1 5 10 15
Glu Asn Pro Gly Pro Ala
20
<210> 43
<211> 23
<212> PRT
<213> Porcine teschovirus-1 2A
<400> 43
Gly Ser Gly Ala Thr Asn Phe Ser Leu Leu Lys Gln Ala Gly Asp Val
1 5 10 15
Glu Glu Asn Pro Gly Pro Ala
20
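
The 2A entries above (SEQ ID NO. 38 to 43) can be cross-checked against one another. The following is a minimal Python sketch, not part of the original listing. It rests on an observation drawn from the listed strings themselves rather than on any statement in the source: the entries labelled "mRNA" (SEQ ID NO. 40 and 41) read as the base-by-base complement of the corresponding DNA entries (a→u, c→g, g→c, t→a), while the DNA entries, read in frame, translate to the peptide entries (SEQ ID NO. 42 and 43) under the standard genetic code. The helper names `clean`, `complement_rna` and `translate` are hypothetical and chosen only for illustration.

```python
from itertools import product


def clean(seq: str) -> str:
    """Remove the block spacing used in the listing and lowercase the sequence."""
    return seq.replace(" ", "").lower()


# Sequences copied from the listing above (position counters omitted).
T2A_DNA = clean("ggcagcggcg agggcagagg aagcctgctg acatgtggcg atgtggaaga gaaccccggc cctgcc")       # SEQ ID NO. 38
P2A_DNA = clean("ggcagcggcg ccacaaactt cagcctgctt aaacaggccg gcgatgtgga agagaacccc ggccctgcc")    # SEQ ID NO. 39
T2A_RNA = clean("ccgucgccgc ucccgucucc uucggacgac uguacaccgc uacaccuucu cuuggggccg ggacgg")       # SEQ ID NO. 40
P2A_RNA = clean("ccgucgccgc gguguuugaa gucggacgaa uuuguccggc cgcuacaccu ucucuugggg ccgggacgg")    # SEQ ID NO. 41

# Assumed mapping (inferred from the listed strings): DNA base -> complementary RNA base.
COMPLEMENT = str.maketrans("acgt", "ugca")


def complement_rna(dna: str) -> str:
    """Return the base-by-base RNA complement of a DNA string (no reversal)."""
    return dna.translate(COMPLEMENT)


# Standard genetic code, codons enumerated in t, c, a, g order.
BASES = "tcag"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(dna: str) -> str:
    """Translate a DNA coding sequence in frame 1 into one-letter amino acids."""
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna) - 2, 3))


if __name__ == "__main__":
    print(complement_rna(T2A_DNA) == T2A_RNA)  # compare with SEQ ID NO. 40
    print(complement_rna(P2A_DNA) == P2A_RNA)  # compare with SEQ ID NO. 41
    print(translate(T2A_DNA))                  # compare with SEQ ID NO. 42
    print(translate(P2A_DNA))                  # compare with SEQ ID NO. 43
```

The same `complement_rna` helper could be applied to any DNA entry elsewhere in the listing to test whether its "mRNA" counterpart follows the same convention; that would need to be checked entry by entry rather than assumed.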