Engineered yeast construction method for glycoprotein preparation and strain thereof

Document No.: 149360 · Publication date: 2021-10-26

Reading note: This technology, "Engineered yeast construction method for glycoprotein preparation and strain thereof" (一种用于糖蛋白制备的工程化酵母构建方法及其菌株), was devised by Wu Jun (吴军), Liu Bo (刘波), Sun Peng (孙鹏), Gong Xin (巩新), Wang Tiantian (王甜甜) and Hou Xuchen (侯旭宸) on 2020-04-24. The invention discloses a method for constructing an engineered yeast for glycoprotein preparation, and a strain obtained by it. It provides a method for constructing an engineered yeast strain capable of a specific mammalian cell glycoform modification, comprising: inactivating the recipient yeast's endogenous alpha-1,6-mannosyltransferase, phosphomannosyltransferase, phosphomannose synthetase, beta-mannosyltransferases I-IV and O-mannosyltransferase I; and expressing exogenous mannosidase I, N-acetylglucosamine transferase I, mannosidase II, N-acetylglucosamine transferase II, galactose isomerase and galactose transferase. The resulting engineered yeast has a short construction cycle, grows quickly, is easy to produce at large scale and is highly safe, so it can be used to prepare ordinary glycoprotein vaccines and is also well suited to efficient vaccine development and large-scale production under emergency conditions such as outbreaks of novel infectious diseases. This is of great significance for medical applications.

1. A method for constructing an engineered Pichia pastoris strain capable of a specific mammalian cell glycoform modification, comprising the following steps:

(A1) inactivating the alpha-1,6-mannosyltransferase, phosphomannosyltransferase, phosphomannose synthetase, beta-mannosyltransferase I, beta-mannosyltransferase II, beta-mannosyltransferase III and beta-mannosyltransferase IV endogenous to a recipient Pichia pastoris to obtain recombinant yeast 1;

(A2) expressing at least one of the following exogenous proteins in the recombinant yeast 1: exogenous mannosidase I, exogenous N-acetylglucosamine transferase I, exogenous mannosidase II, exogenous N-acetylglucosamine transferase II, exogenous galactose isomerase and exogenous galactose transferase, to obtain recombinant yeast 2; the recombinant yeast 2 is an engineered yeast strain capable of the specific mammalian cell glycoform modification;

the specific mammalian cell glycoform is Gal(a)GlcNAc(b)Man(c)GlcNAc2, wherein a: 0-2; b: 0-2; c: 3-5.

2. The method of claim 1, wherein: the method further comprises the following step (A3):

(A3) inactivating the O-mannosyltransferase I endogenous to the recombinant yeast 2 to obtain recombinant yeast 3; the recombinant yeast 3 is also an engineered yeast strain capable of the specific mammalian cell glycoform modification.

3. The method according to claim 1 or 2, characterized in that: when the specific mammalian cell glycoform is Man5GlcNAc2, the exogenous protein expressed in the recombinant yeast 1 in step (A2) is exogenous mannosidase I; or

when the specific mammalian cell glycoform is GlcNAcMan5GlcNAc2, the exogenous proteins expressed in the recombinant yeast 1 in step (A2) are exogenous mannosidase I and exogenous N-acetylglucosamine transferase I; or

when the specific mammalian cell glycoform is GalGlcNAcMan5GlcNAc2, the exogenous proteins expressed in the recombinant yeast 1 in step (A2) are exogenous mannosidase I, exogenous N-acetylglucosamine transferase I, exogenous galactose isomerase and exogenous galactose transferase; or

when the specific mammalian cell glycoform is GalGlcNAcMan3GlcNAc2, the exogenous proteins expressed in the recombinant yeast 1 in step (A2) are exogenous mannosidase I, exogenous N-acetylglucosamine transferase I, exogenous galactose isomerase, exogenous galactose transferase and exogenous mannosidase II; or

when the specific mammalian cell glycoform is Gal2GlcNAc2Man3GlcNAc2, the exogenous proteins expressed in the recombinant yeast 1 in step (A2) are exogenous mannosidase I, exogenous N-acetylglucosamine transferase I, exogenous galactose isomerase, exogenous galactose transferase, exogenous mannosidase II and exogenous N-acetylglucosamine transferase II.

4. The method according to any one of claims 1-3, characterized in that: in step (A1), the alpha-1,6-mannosyltransferase, phosphomannosyltransferase, phosphomannose synthetase, beta-mannosyltransferase I, beta-mannosyltransferase II, beta-mannosyltransferase III and beta-mannosyltransferase IV endogenous to the recipient Pichia pastoris are all inactivated by knocking out their genes through homologous recombination;

and/or

in step (A2), the exogenous protein is expressed in the recombinant yeast 1 by introducing a gene encoding the exogenous protein into the recombinant yeast 1;

further, the gene encoding the exogenous protein is introduced into the recombinant yeast 1 in the form of a recombinant vector; or

further, the gene encoding the exogenous mannosidase I and the gene encoding the exogenous mannosidase II are each introduced into the recombinant yeast 1 twice;

and/or

in step (A3), the O-mannosyltransferase I endogenous to the recombinant yeast 2 is inactivated by insertional inactivation of the gene encoding O-mannosyltransferase I in the genomic DNA of the recombinant yeast 2.

5. The method according to any one of claims 1-4, wherein: the exogenous mannosidase I is derived from Trichoderma viride, and its C-terminus is fused to the endoplasmic reticulum retention signal HDEL;

and/or

the exogenous N-acetylglucosamine transferase I is derived from a mammal, and its N-terminus or C-terminus is fused to an endoplasmic reticulum or medial Golgi localization signal;

further, the exogenous N-acetylglucosamine transferase I is of human origin and contains an mnn9 localization signal;

and/or

the exogenous mannosidase II is derived from filamentous fungi, plants, insects, java or mammals, and its N-terminus or C-terminus is fused to an endoplasmic reticulum or medial Golgi localization signal;

and/or

the exogenous N-acetylglucosamine transferase II is derived from a mammal, and its N-terminus or C-terminus is fused to an endoplasmic reticulum or medial Golgi localization signal;

further, the exogenous mannosidase II is of nematode origin and/or the N-acetylglucosamine transferase II is of human origin, and both contain an mnn2 localization signal;

and/or

the exogenous galactose isomerase and the exogenous galactose transferase are both derived from a mammal, and an endoplasmic reticulum or medial Golgi localization signal is fused at the N-terminus or C-terminus;

further, the exogenous galactose isomerase and the exogenous galactose transferase form a fusion protein, are both of human origin, and share a kre2 localization signal.

6. The method according to any one of claims 1-5, wherein: the alpha-1,6-mannosyltransferase is B1) or B2) as follows:

B1) a protein having an amino acid sequence of SEQ ID No. 1;

B2) the protein which is obtained by substituting and/or deleting and/or adding one or more amino acid residues to the amino acid sequence shown in SEQ ID No.1 and has the same function, or the protein which has more than 99%, more than 95%, more than 90%, more than 85% or more than 80% of homology with the amino acid sequence shown in SEQ ID No.1 and has the same function;

and/or

The phosphomannosyltransferase is B3) or B4) as follows:

B3) a protein having the amino acid sequence of SEQ ID No. 2;

B4) the protein which is obtained by substituting and/or deleting and/or adding one or more amino acid residues to the amino acid sequence shown in SEQ ID No.2 and has the same function, or the protein which has more than 99%, more than 95%, more than 90%, more than 85% or more than 80% of homology with the amino acid sequence shown in SEQ ID No.2 and has the same function;

and/or

The phosphomannose synthetase is B5) or B6) as follows:

B5) a protein having the amino acid sequence of SEQ ID No. 3;

B6) the protein which is obtained by substituting and/or deleting and/or adding one or more amino acid residues to the amino acid sequence shown in SEQ ID No.3 and has the same function, or the protein which has more than 99%, more than 95%, more than 90%, more than 85% or more than 80% of homology with the amino acid sequence shown in SEQ ID No.3 and has the same function;

and/or

The beta-mannosyltransferase I is B7) or B8) as follows:

B7) a protein having an amino acid sequence of SEQ ID No. 4;

B8) the protein which is obtained by substituting and/or deleting and/or adding one or more amino acid residues to the amino acid sequence shown in SEQ ID No.4 and has the same function, or the protein which has more than 99%, more than 95%, more than 90%, more than 85% or more than 80% of homology with the amino acid sequence shown in SEQ ID No.4 and has the same function;

and/or

The beta-mannosyltransferase II is B9) or B10) as follows:

B9) a protein having the amino acid sequence of SEQ ID No. 5;

B10) the protein which is obtained by substituting and/or deleting and/or adding one or more amino acid residues to the amino acid sequence shown in SEQ ID No.5 and has the same function, or the protein which has more than 99%, more than 95%, more than 90%, more than 85% or more than 80% of homology with the amino acid sequence shown in SEQ ID No.5 and has the same function;

and/or

The beta-mannosyltransferase III is B11) or B12) as follows:

B11) a protein having an amino acid sequence of SEQ ID No. 6;

B12) the protein which is obtained by substituting and/or deleting and/or adding one or more amino acid residues to the amino acid sequence shown in SEQ ID No.6 and has the same function, or the protein which has more than 99%, more than 95%, more than 90%, more than 85% or more than 80% of homology with the amino acid sequence shown in SEQ ID No.6 and has the same function;

and/or

The beta-mannosyltransferase IV is B13) or B14) as follows:

B13) a protein having the amino acid sequence of SEQ ID No. 7;

B14) the protein which is obtained by substituting and/or deleting and/or adding one or more amino acid residues to the amino acid sequence shown in SEQ ID No.7 and has the same function, or the protein which has more than 99%, more than 95%, more than 90%, more than 85% or more than 80% of homology with the amino acid sequence shown in SEQ ID No.7 and has the same function;

and/or

The O-mannosyltransferase I is B15) or B16) as follows:

B15) a protein having the amino acid sequence of SEQ ID No. 8;

B16) the protein which is obtained by substituting and/or deleting and/or adding one or more amino acid residues to the amino acid sequence shown in SEQ ID No.8 and has the same function, or the protein which has more than 99%, more than 95%, more than 90%, more than 85% or more than 80% of homology with the amino acid sequence shown in SEQ ID No.8 and has the same function;

and/or

The exogenous mannosidase I is B17) or B18) as follows:

B17) a protein having the amino acid sequence of SEQ ID No. 9;

B18) the protein which is obtained by substituting and/or deleting and/or adding one or more amino acid residues to the amino acid sequence shown in SEQ ID No.9 and has the same function, or the protein which has more than 99%, more than 95%, more than 90%, more than 85% or more than 80% of homology with the amino acid sequence shown in SEQ ID No.9 and has the same function;

and/or

The exogenous N-acetylglucosamine transferase I is B19) or B20) as follows:

B19) a protein having the amino acid sequence of SEQ ID No. 10;

B20) the protein which is obtained by substituting and/or deleting and/or adding one or more amino acid residues to the amino acid sequence shown in SEQ ID No.10 and has the same function, or the protein which has more than 99%, more than 95%, more than 90%, more than 85% or more than 80% of homology with the amino acid sequence shown in SEQ ID No.10 and has the same function;

and/or

The fusion protein consisting of the galactose isomerase and the galactose transferase is B21) or B22) as follows:

B21) a protein having the amino acid sequence of SEQ ID No. 11;

B22) the protein which is obtained by substituting and/or deleting and/or adding one or more amino acid residues to the amino acid sequence shown in SEQ ID No.11 and has the same function, or the protein which has more than 99%, more than 95%, more than 90%, more than 85% or more than 80% of homology with the amino acid sequence shown in SEQ ID No.11 and has the same function;

and/or

The mannosidase II is B23) or B24) as follows:

B23) a protein having the amino acid sequence of SEQ ID No. 12;

B24) the protein which is obtained by substituting and/or deleting and/or adding one or more amino acid residues to the amino acid sequence shown in SEQ ID No.12 and has the same function, or the protein which has more than 99%, more than 95%, more than 90%, more than 85% or more than 80% of homology with the amino acid sequence shown in SEQ ID No.12 and has the same function;

and/or

The N-acetylglucosamine transferase II is B25) or B26) as follows:

B25) a protein having an amino acid sequence of SEQ ID No. 13;

B26) the protein which is obtained by substituting and/or deleting and/or adding one or more amino acid residues to the amino acid sequence shown in SEQ ID No.13 and has the same function, or the protein which has more than 99%, more than 95%, more than 90%, more than 85% or more than 80% of homology with the amino acid sequence shown in SEQ ID No.13 and has the same function;

and/or

The encoding gene of the exogenous mannosidase I is C1) or C2):

C1) a DNA molecule having the nucleotide sequence of SEQ ID No. 14;

C2) a DNA molecule having more than 99%, more than 95%, more than 90%, more than 85% or more than 80% homology with the nucleotide sequence shown in SEQ ID No.14 and encoding the exogenous mannosidase I, or a DNA molecule hybridizing with the DNA molecule defined by C1) under stringent conditions and encoding the exogenous mannosidase I;

and/or

The encoding gene of the exogenous N-acetylglucosamine transferase I is C3) or C4) as follows:

C3) a DNA molecule having the nucleotide sequence of SEQ ID No. 15;

C4) a DNA molecule having 99% or more, 95% or more, 90% or more, 85% or more, or 80% or more homology to the nucleotide sequence represented by SEQ ID No.15 and encoding said exogenous N-acetylglucosamine transferase I, or a DNA molecule hybridizing under stringent conditions with the DNA molecule defined by C3) and encoding said exogenous N-acetylglucosamine transferase I;

and/or

The encoding gene of the fusion protein consisting of the galactose isomerase and the galactose transferase is C5) or C6) as follows:

C5) a DNA molecule having the nucleotide sequence of SEQ ID No. 16;

C6) a DNA molecule which has more than 99%, more than 95%, more than 90%, more than 85% or more than 80% homology with the nucleotide sequence shown in SEQ ID No.16 and encodes the fusion protein, or a DNA molecule which hybridizes with the DNA molecule defined by C5) under strict conditions and encodes the fusion protein;

and/or

The encoding gene of the mannosidase II is C7) or C8) as follows:

C7) a DNA molecule having the nucleotide sequence of SEQ ID No. 17;

C8) a DNA molecule which has more than 99%, more than 95%, more than 90%, more than 85% or more than 80% homology with the nucleotide sequence shown in SEQ ID No.17 and encodes the mannosidase II, or a DNA molecule which hybridizes with the DNA molecule defined by C7) under strict conditions and encodes the mannosidase II;

and/or

The coding gene of the N-acetylglucosamine transferase II is C9) or C10) as follows:

C9) a DNA molecule having the nucleotide sequence of SEQ ID No. 18;

C10) a DNA molecule having 99% or more, 95% or more, 90% or more, 85% or more, or 80% or more homology with the nucleotide sequence shown in SEQ ID No.18 and encoding said N-acetylglucosamine transferase II, or a DNA molecule hybridizing with the DNA molecule defined by C9) under stringent conditions and encoding said N-acetylglucosamine transferase II.

7. An engineered Pichia pastoris strain constructed by the method of any one of claims 1 to 6.

8. The engineered Pichia pastoris strain of claim 7, wherein: the strain is the strain deposited at the China General Microbiological Culture Collection Center (CGMCC) under accession number CGMCC No.19488.

9. Use of the engineered Pichia pastoris strain of claim 7 or 8 in the preparation of a target protein modified with the specific mammalian cell glycoform.

10. A method for preparing a target protein modified with the specific mammalian cell glycoform, comprising the steps of: expressing the target protein in the engineered Pichia pastoris strain of claim 7 or 8 to obtain a recombinant engineered yeast strain; and culturing the recombinant engineered yeast strain to prepare the target protein bearing the specific mammalian cell glycoform.

Technical Field

The invention relates to the field of bioengineering, in particular to a construction method of an engineered yeast for glycoprotein preparation and a strain thereof.

Background

Yeast is an important recombinant protein expression system and has been widely used to express a variety of recombinant proteins. It combines the advantages of prokaryotic systems, such as fast growth, convenient genetic manipulation and large-scale culture, with the post-translational processing of eukaryotic cells and the ability to produce biologically active recombinant proteins. Pichia pastoris is a host for foreign protein expression that has developed rapidly in recent years. Besides the general characteristics of yeast, Pichia pastoris has many advantages: it has a methanol-inducible promoter that allows strict control of foreign protein expression; the product of a foreign gene can be retained in the cell or secreted, so the product can be obtained efficiently; expression vectors are stably inherited; it can be fermented at high density with high yield, which is convenient for industrial production; and it can perform many of the post-translational modifications typical of higher eukaryotes, such as glycosylation.

Glycosylation is critical to the correct folding, stability and biological activity of proteins. In humans, glycosylation is one of the factors that influence the pharmacokinetic properties of proteins, such as tissue distribution and clearance from the blood (Guo Zheng, Carbohydrate Chemistry, Chemical Industry Press, 2005). The glycans of glycoproteins are classified as N-linked or O-linked. An N-glycan is attached to the Asn residue in the conserved sequence Asn-X-Thr/Ser, where X is any amino acid residue except proline. O-glycans are structurally simpler than N-glycans but have more attachment sites, usually on serine (Ser) and threonine (Thr) residues. However, glycosylation of yeast-expressed proteins often results in hypermannosylation. A normal N-glycan generally contains 10-20 monosaccharides and has a molecular weight of 1500-4000. When glycosylation is excessive, each glycan can contain dozens to hundreds of mannose residues, with a molecular weight from 5000 to tens of thousands. The molecular weight of the glycoprotein then increases markedly and, because the hyperglycosylation is heterogeneous, becomes non-uniform, which produces obvious smearing in SDS-PAGE analysis. N-glycosylation occurs at the conserved N-glycosylation site (N-X-S/T). O-glycosylation has no conserved site; it is generally believed to occur in serine- or threonine-rich regions, but whether it occurs in a given protein, at which residues, and to what degree all vary. A serine or threonine in a protein is a potential O-glycosylation site, but not every serine or threonine, nor every protein containing them, is O-glycosylated, and glycosylation differs from protein to protein and between expression systems. Hypermannosylated glycoproteins have a short half-life in the human body, are highly immunogenic and are readily cleared. This defect limits the use of Pichia pastoris in the production of most glycoprotein drugs.

Disclosure of Invention

The invention aims to provide an engineered Pichia pastoris strain capable of a specific mammalian cell glycoform modification, and a method for constructing it. The first technical problem to be solved is that a series of the yeast's own glycosylation-modifying enzymes must be inactivated when constructing a yeast chassis cell; because these enzymes are numerous and inactivating several glycosylation activities can be lethal to the yeast, it is uncertain which modifying enzymes to select. The second technical problem is to construct, on the yeast chassis cell, an engineered Pichia pastoris strain capable of the specific mammalian cell glycoform modification. Because glycosylation occurs throughout eukaryotes (and in recent years has also been found in prokaryotes), there are many choices when introducing glycosylation-modifying enzymes into the chassis cell: different source species, different organelle localization modes, different temperature and pH regulation modes (some organisms are cold-tolerant, heat-tolerant, acid-tolerant, alkali-tolerant, and so on) and different biological activities all need to be considered, which requires a large number of combinatorial experiments and analyses and years of research and trials. The invention provides an engineered Pichia pastoris with specific mammalian cell glycoform modification capability; the glycoform of a protein expressed by the engineered Pichia pastoris is the specific mammalian cell glycoform Gal(a)GlcNAc(b)Man(c)GlcNAc2, wherein a: 0-2; b: 0-2; c: 3-5 (Gal: galactose; GlcNAc: N-acetylglucosamine; Man: mannose). At the same time, O-glycosylation by the yeast is further reduced.

In a first aspect, the invention claims a method for constructing an engineered Pichia pastoris strain capable of a specific mammalian cell glycoform modification.

The claimed method for constructing an engineered Pichia pastoris strain capable of the specific mammalian cell glycoform modification can comprise the following steps:

(A1) inactivating the alpha-1,6-mannosyltransferase, phosphomannosyltransferase, phosphomannose synthetase, beta-mannosyltransferase I, beta-mannosyltransferase II, beta-mannosyltransferase III and beta-mannosyltransferase IV endogenous to a recipient Pichia pastoris to obtain recombinant yeast 1;

(A2) expressing at least one of the following exogenous proteins in the recombinant yeast 1: exogenous mannosidase I, exogenous N-acetylglucosamine transferase I, exogenous mannosidase II, exogenous N-acetylglucosamine transferase II, exogenous galactose isomerase and exogenous galactose transferase, to obtain recombinant yeast 2; the recombinant yeast 2 is an engineered yeast strain capable of the specific mammalian cell glycoform modification;

the specific mammalian cell glycoform is Gal(a)GlcNAc(b)Man(c)GlcNAc2, wherein a: 0-2; b: 0-2; c: 3-5 (Gal: galactose; GlcNAc: N-acetylglucosamine; Man: mannose).
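The glycoform formula can be made concrete with a small calculation. The following is a minimal sketch, not part of the claimed method: it formats a glycoform name from the indices a, b and c and estimates the average mass of the free glycan. The residue masses are standard average values and are an assumption added here, as are the function names.

```python
from itertools import product

# Average residue masses in Da. These are standard textbook values,
# not figures taken from the patent text (assumption).
HEXOSE = 162.1406   # one Gal or Man residue, C6H10O5
HEXNAC = 203.1925   # one GlcNAc residue, C8H13NO5
WATER = 18.0153     # added once for the free reducing end


def glycoform_name(a, b, c):
    """Format Gal(a)GlcNAc(b)Man(c)GlcNAc2 in the compact style used in the text."""
    def part(label, n):
        if n == 0:
            return ""
        return label if n == 1 else f"{label}{n}"
    return part("Gal", a) + part("GlcNAc", b) + part("Man", c) + "GlcNAc2"


def average_mass(a, b, c):
    """Approximate average mass (Da) of the free glycan."""
    hexoses = a + c    # galactose and mannose are both hexoses
    hexnacs = b + 2    # antennary GlcNAc plus the two core GlcNAc residues
    return hexoses * HEXOSE + hexnacs * HEXNAC + WATER


def enumerate_glycoforms():
    """Every (a, b, c) permitted by the ranges a: 0-2, b: 0-2, c: 3-5."""
    return list(product(range(3), range(3), range(3, 6)))


if __name__ == "__main__":
    # The five glycoforms that the text names explicitly.
    named = [(0, 0, 5), (0, 1, 5), (1, 1, 5), (1, 1, 3), (2, 2, 3)]
    for a, b, c in named:
        print(f"{glycoform_name(a, b, c):24s} {average_mass(a, b, c):8.1f} Da")
```

The index ranges alone admit 27 combinations; only the five listed in the `named` list are spelled out in the text, and the sketch makes no claim about which of the others are biochemically reachable.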

When the alpha-1,6-mannosyltransferase, phosphomannose synthetase and beta-mannosyltransferases I-IV are inactivated, N-glycosylation is markedly reduced and the glycosylation background becomes relatively clean, which raises new questions: how can O-glycosylation be reduced? The O-glycosylation family has many members; inactivation of which enzyme is suitable for the present invention and achieves the desired effect? It is known that N-glycosylation occurs at the conserved N-glycosylation site (N-X-S/T). O-glycosylation has no conserved site; it is generally considered to occur in serine- or threonine-rich regions, but whether it occurs in a given protein, at which residues, and to what degree all vary. A serine or threonine in a protein is a potential O-glycosylation site, but not every serine or threonine, nor every protein containing them, is O-glycosylated, and glycosylation differs from protein to protein and between expression systems. In yeast O-glycosylation the sugar residues in the chain are usually mannose, and although each chain is relatively short, the large number of chains can leave a large amount of exposed mannose on the surface of the yeast-expressed protein. Mannosylated glycoproteins have a short half-life in the human body, are highly immunogenic and are readily cleared. This defect limits the use of Pichia pastoris in the production of most protein drugs.

Members of the O-glycosyltransferase family are divided into three subfamilies based on their homology: the PMT1, PMT2 and PMT4 subfamilies. The number of members of the PMT1 and PMT2 subfamilies may vary among species; there are 7 family members in total: PMT1, PMT2, PMT3, PMT4, PMT5, PMT6 and PMT7. In Saccharomyces cerevisiae the PMT1 subfamily includes PMT1, PMT5 and PMT7, and the PMT2 subfamily includes PMT2, PMT3 and PMT6. Members of the Pmt1p subfamily (Pmt1p, Pmt5p) and the Pmt2p subfamily (Pmt2p, Pmt3p) form heterodimers with each other, members of the Pmt4p subfamily form homodimers, and Pmt6p forms neither heterodimers with other Pmtp family members nor homodimers with itself. In wild-type yeast, the complexes formed by members of the Pmt1p and Pmt2p subfamilies are mainly the Pmt1p-Pmt2p and Pmt5p-Pmt3p complexes, with very small amounts of the Pmt1p-Pmt3p and Pmt2p-Pmt5p complexes. In the present invention, however, we have found that, on the basis of alpha-1,6-mannosyltransferase inactivation, phosphomannosyltransferase inactivation and beta-mannosyltransferase I-IV inactivation, further inactivating O-mannosyltransferase I while expressing exogenous mannosidase I, exogenous N-acetylglucosamine transferase I, exogenous mannosidase II, exogenous N-acetylglucosamine transferase II, exogenous galactose isomerase GalE and exogenous galactose transferase GalT from specific sources can significantly reduce O-glycosylation of proteins expressed by the engineered yeast and yield proteins bearing the specific mammalian cell glycoform.

Accordingly, the method may further include the following step (A3):

(A3) inactivating the O-mannosyltransferase I endogenous to the recombinant yeast 2 to obtain recombinant yeast 3; the recombinant yeast 3 is also an engineered yeast strain capable of the specific mammalian cell glycoform modification.

Step (A3) further reduces O-glycosylation by the yeast.

When the specific mammalian cell glycoform is Man5GlcNAc2, the exogenous protein expressed in the recombinant yeast 1 in step (A2) is exogenous mannosidase I.

When the specific mammalian cell glycoform is GlcNAcMan5GlcNAc2, the exogenous proteins expressed in the recombinant yeast 1 in step (A2) are exogenous mannosidase I and exogenous N-acetylglucosamine transferase I.

When the specific mammalian cell glycoform is GalGlcNAcMan5GlcNAc2, the exogenous proteins expressed in the recombinant yeast 1 in step (A2) are exogenous mannosidase I, exogenous N-acetylglucosamine transferase I, exogenous galactose isomerase and exogenous galactose transferase.

When the specific mammalian cell glycoform is GalGlcNAcMan3GlcNAc2, the exogenous proteins expressed in the recombinant yeast 1 in step (A2) are exogenous mannosidase I, exogenous N-acetylglucosamine transferase I, exogenous galactose isomerase, exogenous galactose transferase and exogenous mannosidase II.

When the specific mammalian cell glycoform is Gal2GlcNAc2Man3GlcNAc2, the exogenous proteins expressed in the recombinant yeast 1 in step (A2) are exogenous mannosidase I, exogenous N-acetylglucosamine transferase I, exogenous galactose isomerase, exogenous galactose transferase, exogenous mannosidase II and exogenous N-acetylglucosamine transferase II.
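The correspondence between target glycoform and the exogenous proteins expressed in step (A2) amounts to a lookup table. The sketch below encodes the five cases stated above; the dictionary and function names are illustrative assumptions, and the enzyme order within each list carries no meaning in the text.

```python
# Exogenous proteins expressed in recombinant yeast 1 for each target
# glycoform, transcribed from the five cases described in the text.
GLYCOFORM_ENZYMES = {
    "Man5GlcNAc2": [
        "mannosidase I",
    ],
    "GlcNAcMan5GlcNAc2": [
        "mannosidase I",
        "N-acetylglucosamine transferase I",
    ],
    "GalGlcNAcMan5GlcNAc2": [
        "mannosidase I",
        "N-acetylglucosamine transferase I",
        "galactose isomerase",
        "galactose transferase",
    ],
    "GalGlcNAcMan3GlcNAc2": [
        "mannosidase I",
        "N-acetylglucosamine transferase I",
        "galactose isomerase",
        "galactose transferase",
        "mannosidase II",
    ],
    "Gal2GlcNAc2Man3GlcNAc2": [
        "mannosidase I",
        "N-acetylglucosamine transferase I",
        "galactose isomerase",
        "galactose transferase",
        "mannosidase II",
        "N-acetylglucosamine transferase II",
    ],
}


def enzymes_for(glycoform):
    """Return the exogenous proteins listed for a target glycoform."""
    try:
        return list(GLYCOFORM_ENZYMES[glycoform])
    except KeyError:
        raise ValueError(f"glycoform not described in the text: {glycoform}")


def additional_enzymes(start, target):
    """Proteins needed for `target` that are not already needed for `start`."""
    already = set(enzymes_for(start))
    return [e for e in enzymes_for(target) if e not in already]


if __name__ == "__main__":
    print(additional_enzymes("Man5GlcNAc2", "GalGlcNAcMan5GlcNAc2"))
```

Written this way, each glycoform in the table builds on the enzyme set of the one before it, so `additional_enzymes` shows what one further step adds.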

In steps (A1) and (A3), a glycosylation-modifying enzyme may be inactivated by mutating one or more nucleotides of its gene, by deleting part or all of the gene sequence, by inserting nucleotides to disrupt the original reading frame, by terminating protein synthesis prematurely, and so on. Such mutations, deletions and insertional inactivations can be obtained by conventional mutagenesis, knockout and similar methods. These methods are reported in many references, for example J. Sambrook et al., Molecular Cloning: A Laboratory Manual, 2nd edition, Science Press, 1995. Other methods known in the art can also be used to construct a gene-inactivated yeast strain. The preferred strain is obtained by knocking out a partial sequence of the mannosyltransferase gene. The knocked-out sequence is more than three bases long, preferably more than 100 bases, and more preferably comprises more than 50% of the coding sequence. A strain obtained by knocking out a partial sequence of the glycosylation-modifying enzyme gene does not readily revert, so it is more stable than a strain constructed by methods such as point mutation and is better suited to medical and industrial use.
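The size preferences for the knocked-out sequence can be expressed as a simple check. This is a minimal sketch with a hypothetical function name; the ordering of the tests (the 50% rule taking precedence over the 100-base rule for short coding sequences) is an assumption, since the text does not say how the two preferences interact.

```python
def deletion_tier(deleted_bp, cds_bp):
    """Classify a planned deletion against the preferences stated in the text.

    deleted_bp: number of bases of the coding sequence to be knocked out
    cds_bp:     total length of the coding sequence in bases
    """
    if cds_bp <= 0:
        raise ValueError("coding sequence length must be positive")
    if not 0 <= deleted_bp <= cds_bp:
        raise ValueError("deletion must lie within the coding sequence")

    if deleted_bp <= 3:
        # The text requires "more than three bases".
        return "insufficient"
    if deleted_bp > 0.5 * cds_bp:
        # "more preferably comprises more than 50% of the coding sequence"
        return "more preferred"
    if deleted_bp > 100:
        # "preferably more than 100 bases"
        return "preferred"
    return "acceptable"


if __name__ == "__main__":
    for deleted in (3, 50, 300, 700):
        print(deleted, deletion_tier(deleted, cds_bp=1200))
```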

The method of knocking out a partial sequence of a glycosylation modifying enzyme gene may include: firstly, constructing a plasmid for knocking out the gene: the plasmid comprises homologous arm sequences at two sides of a gene to be knocked out, two homologous arms are selected at two sides of a target gene, the length of the homologous arms is at least more than 200bp, and the optimal size is 500bp-2000 bp. Also, an insertion inactivation method can be used to obtain a nucleotide sequence in which an amino acid sequence is substituted and/or deleted and/or added by one or more amino acid residues so that the nucleotide sequence has no functional activity, and the nucleotide sequence can be constructed into a plasmid. The plasmid also contains URA3 (nucleotide-5' -phosphate decarboxylase) gene, bleomycin, hygromycin B, blestic idin or G418, etc. as selection markers. Nucleic acid polynucleotide sequences encoding the flanking region homology arm fragments, nucleotide sequences of the protein to be functionally disrupted, are available from the published National Center for Biotechnology Information (NCBI). By using a PCR method and taking a pichia pastoris host genome as a template, obtaining flanking homologous regions with a certain length required by an inactivated gene, wherein the flanking homologous regions respectively comprise an upstream flanking homologous region and a downstream flanking homologous region of a target gene (the sequence of the flanking homologous regions is disclosed in NCBI), and adding a proper enzyme cutting site in a primer part. Polynucleotides obtained from the sequence can be obtained by methods well known in the art, such as PCR (J. SammBruk et al, second edition of the molecular cloning instructions, scientific Press, 1995), RT-PCR methods, synthetic methods, genomic DNA, and methods for constructing screened cDNA libraries. 
If desired, the polynucleotides can be mutated, deleted, inserted, linked to other polynucleotides, and the like, using methods well known in the art. The separately obtained upstream (5') and downstream (3') flanking homology arm fragments can be fused by various methods known in the art, such as overlap PCR, while maintaining the size of each fragment, using standard molecular cloning procedures (J. Sambrook et al., Molecular Cloning: A Laboratory Manual, second edition, Science Press, 1995). The nucleic acid containing the fused homology arm fragment of the gene to be inactivated can be cloned into various vectors suitable for yeast by methods known in the art. Alternatively, the arms can be inserted separately into specific regions of the vector using the restriction sites on the respective homology arms, again following standard molecular cloning procedures (J. Sambrook et al., 1995, as above). In this way a recombinant knockout plasmid is constructed. The starting plasmid may be selected from expression vectors suitable for yeast and shuttle vectors; such vectors may carry replication sites, selection markers, auxotrophic markers (URA3, HIS, ADE1, LEU2, ARG4) and the like. Their construction is well documented (e.g., J. Sambrook et al., 1995, as above), and they are commercially available from various companies (e.g., Invitrogen Life Technologies, Carlsbad, California 92008, USA); the preferred vectors are the yeast expression vectors pPICZαA and pYES2. The inactivation vectors are shuttle plasmids that are first replicated and amplified in Escherichia coli and then introduced into host yeast cells, and they should carry a resistance marker gene or an auxotrophic marker gene to facilitate later screening of transformants.

The homologous regions on both sides of the gene to be inactivated (the upstream region is called the 5' arm and the downstream region the 3' arm) are each constructed into a yeast vector to form a recombinant knockout vector. The knockout vector is then linearized at a linearization site in a homology arm, transformed by electroporation into Pichia pastoris or a modified strain thereof, and cultured. Transformation of the desired nucleic acid into the host cell can be carried out by conventional methods such as competent-cell preparation, electroporation or the lithium acetate method (A. Adams et al., Guide to Yeast Genetics Methods, Science Press, 2000). Successfully transformed cells, i.e., cells containing the homologous regions of the gene to be knocked out, can be identified by well-known techniques, for example by collecting and lysing the cells, extracting the DNA and identifying the genotype by PCR; before that, the correct phenotype can be selected using auxotrophic or resistance markers. A transformant with a correct primary recombination event is cultured in yeast minimal medium and spread on a secondary recombination screening plate, such as a 5-fluoroorotic acid plate containing uracil; the clones that grow are then genotyped by PCR. Correct transformants lacking the expected coding region of the gene are selected.
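The arm-length guidance given above (at least 200 bp, preferably 500 bp to 2000 bp) can be expressed as a simple design check. The following Python sketch is illustrative only; the function name and the three category labels are hypothetical and are not part of the described method.

```python
# Minimal sketch: classify a homology arm by the length guidance in the text.
# The thresholds come from the description; names and labels are hypothetical.
MIN_ARM_BP = 200                  # arms should be at least about 200 bp
PREFERRED_RANGE_BP = (500, 2000)  # preferred size range stated in the text


def classify_arm(length_bp: int) -> str:
    """Return 'too short', 'acceptable' or 'preferred' for an arm length."""
    if length_bp < MIN_ARM_BP:
        return "too short"
    low, high = PREFERRED_RANGE_BP
    if low <= length_bp <= high:
        return "preferred"
    return "acceptable"


if __name__ == "__main__":
    for arm in (150, 300, 1000):
        print(arm, classify_arm(arm))
```

A check of this kind would typically be run on both the 5' arm and the 3' arm before primers are ordered.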

In a specific embodiment of the present invention, in step (A1), the α-1,6-mannosyltransferase, phosphomannose synthetase, β-mannosyltransferase I, β-mannosyltransferase II, β-mannosyltransferase III and β-mannosyltransferase IV endogenous to the recipient Pichia pastoris are all inactivated by gene knockout through homologous recombination.

In a specific embodiment of the present invention, in the step (a2), the expression of the foreign protein in the recombinant yeast 1 is carried out by introducing a gene encoding the foreign protein into the recombinant yeast 1.

Further, the gene encoding the foreign protein is introduced into the recombinant yeast 1 in the form of a recombinant vector.

Further, the coding gene for the exogenous mannosidase I and the coding gene for the exogenous mannosidase II are both introduced into the recombinant yeast 1 twice.

In the embodiment of the present invention, in step (A3), the O-mannosyltransferase I endogenous to the recombinant yeast 2 is inactivated not by conventional gene knockout but by insertional inactivation, i.e., by disrupting the nucleotide sequence of the gene encoding O-mannosyltransferase I in the genomic DNA of the recombinant yeast 2.

Specifically, in the present invention, stop codons in different combinations are placed at the front end and the tail end of the target fragment of the O-mannosyltransferase I-encoding gene in the genomic DNA of the recombinant yeast 2, and a terminator (e.g., the CYC1TT terminator) is placed after the terminal stop codons. The target fragment carrying the different combinations of stop codons at its front end and tail end is obtained by PCR amplification using the genomic DNA of Pichia pastoris JC308 as a template and the primers PMT1-IN-5 and PMT1-IN-3.

PMT1-IN-5:5’-tctatgcattaatgatagttaatgactaatagagtaaaacaagtcctcaagaggt-3’;

PMT1-IN-3: 5’-tgacataactaattacatgatctattagtcattaactatcattagatcagagtggggacgactaagaaagc-3’.
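The stop codons carried by the forward primer can be located programmatically. The Python sketch below scans PMT1-IN-5 as written (5' to 3') and groups every TAA, TAG or TGA triplet by reading frame; the function name is hypothetical, and the frame numbering is relative to the first base of the primer, not to the PMT1 coding sequence, which is an assumption made only for illustration.

```python
# Minimal sketch: list stop codons in each of the three forward reading frames
# of the PMT1-IN-5 primer. Frame 0 starts at the first base of the primer;
# this numbering is an illustrative assumption, not taken from the source.
STOP_CODONS = {"TAA", "TAG", "TGA"}

PMT1_IN_5 = "tctatgcattaatgatagttaatgactaatagagtaaaacaagtcctcaagaggt"


def stops_by_frame(seq: str) -> dict:
    """Map frame (0, 1, 2) to a list of (0-based position, codon) tuples."""
    seq = seq.upper()
    found = {0: [], 1: [], 2: []}
    for i in range(len(seq) - 2):
        codon = seq[i:i + 3]
        if codon in STOP_CODONS:
            found[i % 3].append((i, codon))
    return found


frames = stops_by_frame(PMT1_IN_5)

if __name__ == "__main__":
    for frame, hits in frames.items():
        print(f"frame {frame}: {hits}")
```

Run on this primer, the scan finds at least one stop codon in each of the three frames, which is consistent with the aim of terminating translation regardless of the frame in which the inserted fragment is read.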

The technical problem to be solved is how to construct, in a yeast chassis cell, an engineered Pichia pastoris strain capable of mammalian-cell glycoform modification. The glycosylation-modifying enzymes involved in mammalian glycosylation are numerous and complex; which enzyme modification yields which glycoform, and in what proportions the glycoforms arise, was not known before this study. The invention is realized by the following technical approach:

the exogenous mannosidase I is derived from Trichoderma viride, with the endoplasmic reticulum retention signal HDEL fused at its C-terminus.

The exogenous N-acetylglucosamine transferase I may be derived from mammals or the like, such as human N-acetylglucosamine transferase I (GenBank NO NM 002406), Candida albicans N-acetylglucosamine transferase I (GenBank NO NW_139513.1), Dictyophora disclinalis N-acetylglucosamine transferase I (GenBank NO NC_007088.5), etc.; an endoplasmic reticulum or medial Golgi localization signal, such as ScGLS, ScMNS1, PpSEC12 or ScMNN9, may be fused at its N-terminus or C-terminus; it is preferably of human origin and contains a mnn9 localization signal;

the exogenous mannosidase II may be a mannosidase II from filamentous fungi, plants, insects, javanica, mammals, etc., such as fly mannosidase II (GenBank NO X77652), nematode mannosidase II (GenBank NO NM 0735941), human mannosidase II (GenBank NO U31520), etc.; the expressed mannosidase II may be fused at its N-terminus or C-terminus to an endoplasmic reticulum or medial Golgi localization signal, such as ScGLS, ScMNS1, PpSEC12 or ScMNN9; it is preferably of nematode origin and contains a mnn2 localization signal;

the exogenous N-acetylglucosamine transferase II may be derived from mammals or the like, such as human N-acetylglucosamine transferase II (GenBank NO Q10469), murine N-acetylglucosamine transferase II (GenBank NO Q09326), etc.; the expressed N-acetylglucosamine transferase II may be fused at its N-terminus or C-terminus to an endoplasmic reticulum or medial Golgi localization signal, such as ScGLS, ScMNS1, PpSEC12 or ScMNN9; it is preferably of human origin and contains a mnn2 localization signal;

both the mannosidase II and the N-acetylglucosamine transferase II contain a mnn2 localization signal;

the galactose isomerase and the galactose transferase form a fusion protein, are both of human origin, and share a kre2 localization signal.

The galactosyltransferase may be derived from a mammal or the like, such as human beta-1,4-galactosyltransferase (GenBank NO gi:13929461), murine beta-1,4-galactosyltransferase (GenBank NO NC-000081.6), and the like. The expressed galactosyltransferase may be fused at its N-terminus or C-terminus to an endoplasmic reticulum or medial Golgi localization signal, such as ScKRE2, ScGLS, ScMNS1, PpSEC12 or ScMNN9; in the examples of the invention the galactosyltransferase is of human origin and shares a kre2 localization signal;

the alpha-1, 6-mannosyltransferase can be B1) or B2) as follows:

B1) a protein having an amino acid sequence of SEQ ID No. 1;

B2) the protein which is obtained by substituting and/or deleting and/or adding one or more amino acid residues on the amino acid sequence shown in SEQ ID No.1 and has the same function, or the protein which has more than 99%, more than 95%, more than 90%, more than 85% or more than 80% of homology with the amino acid sequence shown in SEQ ID No.1 and has the same function.

The phosphomannosyl transferase may be B3) or B4) as follows:

B3) a protein having the amino acid sequence of SEQ ID No. 2;

B4) the protein which is obtained by substituting and/or deleting and/or adding one or more amino acid residues of the amino acid sequence shown in SEQ ID No.2 and has the same function, or the protein which has more than 99%, more than 95%, more than 90%, more than 85% or more than 80% of homology with the amino acid sequence shown in SEQ ID No.2 and has the same function.

The phosphomannose synthetase may be B5) or B6) as follows:

B5) a protein having the amino acid sequence of SEQ ID No. 3;

B6) the protein which is obtained by substituting and/or deleting and/or adding one or more amino acid residues on the amino acid sequence shown in SEQ ID No.3 and has the same function, or the protein which has more than 99%, more than 95%, more than 90%, more than 85% or more than 80% of homology with the amino acid sequence shown in SEQ ID No.3 and has the same function.

The beta mannosyl transferase I may be B7) or B8) as follows:

B7) a protein having an amino acid sequence of SEQ ID No. 4;

B8) the protein which is obtained by substituting and/or deleting and/or adding one or more amino acid residues on the amino acid sequence shown in SEQ ID No.4 and has the same function, or the protein which has more than 99%, more than 95%, more than 90%, more than 85% or more than 80% of homology with the amino acid sequence shown in SEQ ID No.4 and has the same function.

The beta mannosyl transferase II can be B9) or B10) as follows:

B9) a protein having the amino acid sequence of SEQ ID No. 5;

B10) the protein which is obtained by substituting and/or deleting and/or adding one or more amino acid residues to the amino acid sequence shown in SEQ ID No.5 and has the same function, or the protein which has more than 99%, more than 95%, more than 90%, more than 85% or more than 80% of homology with the amino acid sequence shown in SEQ ID No.5 and has the same function.

The beta mannosyltransferase III can be B11) or B12) as follows:

B11) a protein having an amino acid sequence of SEQ ID No. 6;

B12) the protein which is obtained by substituting and/or deleting and/or adding one or more amino acid residues on the amino acid sequence shown in SEQ ID No.6 and has the same function, or the protein which has more than 99%, more than 95%, more than 90%, more than 85% or more than 80% of homology with the amino acid sequence shown in SEQ ID No.6 and has the same function.

The beta mannosyl transferase IV may be B13) or B14) as follows:

B13) a protein having the amino acid sequence of SEQ ID No. 7;

B14) the protein which is obtained by substituting and/or deleting and/or adding one or more amino acid residues to the amino acid sequence shown in SEQ ID No.7 and has the same function, or the protein which has more than 99%, more than 95%, more than 90%, more than 85% or more than 80% of homology with the amino acid sequence shown in SEQ ID No.7 and has the same function.

The O-mannosyltransferase I can be B15) or B16) as follows:

B15) a protein having the amino acid sequence of SEQ ID No. 8;

B16) the protein which is obtained by substituting and/or deleting and/or adding one or more amino acid residues in the amino acid sequence shown in SEQ ID No.8 and has the same function, or the protein which has more than 99%, more than 95%, more than 90%, more than 85% or more than 80% of homology with the amino acid sequence shown in SEQ ID No.8 and has the same function.

The exogenous mannosidase I may be B17) or B18) as follows:

B17) a protein having the amino acid sequence of SEQ ID No. 9;

B18) the protein which is obtained by substituting and/or deleting and/or adding one or more amino acid residues on the amino acid sequence shown in SEQ ID No.9 and has the same function, or the protein which has more than 99%, more than 95%, more than 90%, more than 85% or more than 80% of homology with the amino acid sequence shown in SEQ ID No.9 and has the same function.

The exogenous N-acetylglucosamine transferase I may be B19) or B20) as follows:

B19) a protein having the amino acid sequence of SEQ ID No. 10;

B20) the protein which is obtained by substituting and/or deleting and/or adding one or more amino acid residues to the amino acid sequence shown in SEQ ID No.10 and has the same function, or the protein which has more than 99%, more than 95%, more than 90%, more than 85% or more than 80% of homology with the amino acid sequence shown in SEQ ID No.10 and has the same function.

The fusion protein consisting of the galactose isomerase and the galactose transferase may be B21) or B22) as follows:

B21) a protein having the amino acid sequence of SEQ ID No. 11;

B22) the protein which is obtained by substituting and/or deleting and/or adding one or more amino acid residues on the amino acid sequence shown in SEQ ID No.11 and has the same function, or the protein which has more than 99%, more than 95%, more than 90%, more than 85% or more than 80% of homology with the amino acid sequence shown in SEQ ID No.11 and has the same function.

The mannosidase II may be B23) or B24) as follows:

B23) a protein having the amino acid sequence of SEQ ID No. 12;

B24) the protein which is obtained by substituting and/or deleting and/or adding one or more amino acid residues to the amino acid sequence shown in SEQ ID No.12 and has the same function, or the protein which has more than 99%, more than 95%, more than 90%, more than 85% or more than 80% of homology with the amino acid sequence shown in SEQ ID No.12 and has the same function.

The N-acetylglucosamine transferase II may be B25) or B26) as follows:

B25) a protein having an amino acid sequence of SEQ ID No. 13;

B26) the protein which is obtained by substituting and/or deleting and/or adding one or more amino acid residues on the amino acid sequence shown in SEQ ID No.13 and has the same function, or the protein which has more than 99%, more than 95%, more than 90%, more than 85% or more than 80% of homology with the amino acid sequence shown in SEQ ID No.13 and has the same function.

The encoding gene of the exogenous mannosidase I can be C1) or C2) as follows:

C1) a DNA molecule having the nucleotide sequence of SEQ ID No. 14;

C2) a DNA molecule which has more than 99%, more than 95%, more than 90%, more than 85% or more than 80% homology with the nucleotide sequence shown in SEQ ID No.14 and encodes the exogenous mannosidase I, or a DNA molecule which hybridizes with the DNA molecule defined by C1) under strict conditions and encodes the exogenous mannosidase I.

The encoding gene of the exogenous N-acetylglucosamine transferase I can be C3) or C4) as follows:

C3) a DNA molecule having the nucleotide sequence of SEQ ID No. 15;

C4) a DNA molecule having 99% or more, 95% or more, 90% or more, 85% or more, or 80% or more homology with the nucleotide sequence shown in SEQ ID No.15 and encoding said exogenous N-acetylglucosamine transferase I, or a DNA molecule hybridizing under stringent conditions with the DNA molecule defined in C3) and encoding said exogenous N-acetylglucosamine transferase I.

The encoding gene of the fusion protein consisting of the galactose isomerase and the galactose transferase may be C5) or C6) as follows:

C5) a DNA molecule having the nucleotide sequence of SEQ ID No. 16;

C6) a DNA molecule which has more than 99 percent, more than 95 percent, more than 90 percent, more than 85 percent or more than 80 percent of homology with the nucleotide sequence shown in SEQ ID No.16 and codes the fusion protein, or a DNA molecule which is hybridized with the DNA molecule limited by C5) under strict conditions and codes the fusion protein.

The encoding gene of the mannosidase II can be C7) or C8) as follows:

C7) a DNA molecule having the nucleotide sequence of SEQ ID No. 17;

C8) a DNA molecule which has more than 99%, more than 95%, more than 90%, more than 85% or more than 80% homology with the nucleotide sequence shown in SEQ ID No.17 and encodes the mannosidase II, or a DNA molecule which hybridizes with the DNA molecule defined by C7) under strict conditions and encodes the mannosidase II.

The gene encoding N-acetylglucosamine transferase II may be C9) or C10) as follows:

C9) a DNA molecule having the nucleotide sequence of SEQ ID No. 18;

C10) a DNA molecule having 99% or more, 95% or more, 90% or more, 85% or more, or 80% or more homology to the nucleotide sequence shown in SEQ ID No.18 and encoding said N-acetylglucosamine transferase II, or a DNA molecule hybridizing under stringent conditions with the DNA molecule defined by C9) and encoding said N-acetylglucosamine transferase II.

In the above proteins, homology means the identity of amino acid sequences. The identity of amino acid sequences can be determined using homology search sites on the Internet, such as the BLAST web page of the NCBI home website. For example, in advanced BLAST 2.1, a value (%) of identity can be obtained by using blastp as the program, setting the Expect value to 10, setting all filters to OFF, using BLOSUM62 as the matrix, setting the gap existence cost, per-residue gap cost and lambda ratio to 11, 1 and 0.85 (default values), respectively, and calculating the identity of a pair of amino acid sequences.

In the above genes, homology means the identity of nucleotide sequences. The identity of nucleotide sequences can be determined using homology search sites on the Internet, such as the BLAST web page of the NCBI home website. For example, in advanced BLAST 2.1, a value (%) of identity can be obtained by using blastp as the program, setting the Expect value to 10, setting all filters to OFF, using BLOSUM62 as the matrix, setting the gap existence cost, per-residue gap cost and lambda ratio to 11, 1 and 0.85 (default values), respectively, and calculating the identity of a pair of nucleotide sequences.

In the above proteins and genes, the homology of 95% or more may be at least 96%, 97%, 98% identity. The homology of 90% or more may be at least 91%, 92%, 93%, 94% identity. The homology of 85% or more may be at least 86%, 87%, 88%, 89% identity. The homology of 80% or more may be at least 81%, 82%, 83%, 84% identity.
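As a worked illustration of the identity thresholds listed above, the Python sketch below computes percent identity over a pair of already aligned sequences and reports the highest listed threshold that the pair reaches. It is a simplified stand-in, not the BLAST calculation itself: identity is taken as matching columns divided by aligned columns, the thresholds are treated as inclusive, and the function names are hypothetical.

```python
# Minimal sketch: percent identity of two pre-aligned sequences ('-' = gap)
# and the highest homology threshold from the text that the pair satisfies.
# This is a simplified illustration, not a reimplementation of BLAST.
THRESHOLDS = (99, 95, 90, 85, 80)


def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Identity (%) = identical non-gap columns / aligned columns."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    columns = [(x, y) for x, y in zip(aligned_a, aligned_b)
               if not (x == "-" and y == "-")]
    matches = sum(1 for x, y in columns if x == y and x != "-")
    return 100.0 * matches / len(columns)


def highest_threshold_met(identity: float):
    """Return the largest listed threshold reached, or None if below 80%."""
    for threshold in THRESHOLDS:
        if identity >= threshold:
            return threshold
    return None


if __name__ == "__main__":
    identity = percent_identity("ACDEFGHIKL", "ACDEFGHIKV")
    print(identity, highest_threshold_met(identity))
```

In this toy example nine of ten aligned residues match, so the pair falls in the "90% or more" tier.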

In the above genes, the stringent conditions may be as follows: hybridization at 50 °C in a mixed solution of 7% sodium dodecyl sulfate (SDS), 0.5 M NaPO4 and 1 mM EDTA, followed by rinsing at 50 °C in 2×SSC, 0.1% SDS; or hybridization at 50 °C in a mixed solution of 7% SDS, 0.5 M NaPO4 and 1 mM EDTA, followed by rinsing at 50 °C in 1×SSC, 0.1% SDS; or hybridization at 50 °C in a mixed solution of 7% SDS, 0.5 M NaPO4 and 1 mM EDTA, followed by rinsing at 50 °C in 0.5×SSC, 0.1% SDS; or hybridization at 50 °C in a mixed solution of 7% SDS, 0.5 M NaPO4 and 1 mM EDTA, followed by rinsing at 50 °C in 0.1×SSC, 0.1% SDS; or hybridization at 50 °C in a mixed solution of 7% SDS, 0.5 M NaPO4 and 1 mM EDTA, followed by rinsing at 65 °C in 0.1×SSC, 0.1% SDS; or hybridization at 65 °C in a solution of 6×SSC, 0.5% SDS, followed by washing the membrane once each with 2×SSC, 0.1% SDS and 1×SSC, 0.1% SDS.

All relevant information of the glycosyl modified enzyme of the invention can be obtained from the National Center for Biotechnology Information (NCBI) or published literature, and the function and definition of the relevant enzyme can also be obtained from the literature. Even in the same bacterium or species, the amino acids of the respective enzymes may be slightly different due to differences in origin, etc., but the functions thereof are substantially the same, and therefore, the enzyme of the present invention may include these variants.

In a second aspect, the invention claims the engineered Pichia pastoris strain constructed by the method of the first aspect.

Further, the engineered Pichia pastoris strain is the strain deposited at the China General Microbiological Culture Collection Center (CGMCC) under accession number CGMCC No. 19488.

In a third aspect, the present invention claims the use of the engineered Pichia pastoris strain of the second aspect in the preparation of a target protein modified with the specific mammalian-cell glycoform.

In a fourth aspect, the invention claims a method for preparing a target protein modified with the specific mammalian-cell glycoform.

The method claimed by the invention for preparing a target protein modified with the specific mammalian-cell glycoform may comprise the following steps: expressing the target protein in the engineered Pichia pastoris strain of the second aspect to obtain a recombinant engineered yeast; and culturing the recombinant engineered yeast to prepare the target protein carrying the specific mammalian-cell glycoform.

In a particular embodiment of the invention, the target protein is an anti-Her2 antibody.

Experiments show that the engineered Pichia pastoris strain obtained by the invention has reduced N-glycosylation and O-glycosylation and is capable of animal-cell glycoform modification. Glycoproteins prepared with this engineered yeast strain avoid problems such as possible allergic reactions caused by fungal glycoform modification. The engineered strain has a short construction period, grows quickly, is easy to scale up and is highly safe, so it can be used not only to prepare conventional glycoprotein vaccines but is also well suited to efficient vaccine development and large-scale production under emergency conditions such as outbreaks of novel infectious diseases. This is of great significance for medical use.

Deposit description

Latin name of the strain: Pichia pastoris

Reference biological material (strain): GJK30

Suggested classification name: Pichia pastoris

Depositary institution: China General Microbiological Culture Collection Center

Abbreviation of the depositary institution: CGMCC

Address: No. 3, Yard No. 1, Beichen West Road, Chaoyang District, Beijing

Date of deposit: March 18, 2020

Registration number of the preservation center: CGMCC No.19488

Drawings

FIG. 1 shows the identification of the och1 gene in the GJK01 strain and the glycoform analysis. A is the och1 gene identification result. M represents Marker; 1: GJK01 strain (och1 knocked out); 2: X33 strain (och1 not knocked out). B is the result of DSA-FACE glycoform analysis of an antibody expressed by the GJK01 strain (och1 knocked out).

FIG. 2 shows the identification result of the pno1 gene. M represents Marker; 1: GJK02 strain (pno1 knocked out); 2: X33 strain (pno1 not knocked out).

FIG. 3 shows the identification result of the mnn4b gene. M represents Marker; 1: GJK03 strain (mnn4b knocked out); 2: X33 strain (mnn4b not knocked out).

FIG. 4 shows the DSA-FACE glycoform analysis results of the GJK01, GJK02 and GJK03 strains (och1, pno1 and mnn4b knocked out).

FIG. 5 shows the identification result of ARM2 gene. M represents Marker; 1: GJK04 strain (ARM2 knocked out); 2: x33 strain (non-knock-out ARM 2).

FIG. 6 shows the identification result of ARM1 gene. M represents Marker; 1: GJK05 strain (ARM1 knocked out); 2: x33 strain (non-knock-out ARM 1).

FIG. 7 shows the identification result of ARM3 gene. M represents Marker; 1: GJK07 strain (ARM3 knocked out); 2: x33 strain (non-knock-out ARM 3).

FIG. 8 shows the identification result of ARM4 gene. M represents Marker; 1: GJK18 strain (ARM4 knocked out); 2: x33 strain (non-knock-out ARM 4).

FIG. 9 shows the results of DSA-FACE glycoform analysis of GJK18 strain.

FIG. 10 shows the results of identification of the TrmdSI gene and DSA-FACE glycoform analysis of the W10 strain. A is the identification result of the TrmdSI gene. M represents Marker; 1: W10 strain (TrmdSI introduced); 2: X33 strain (no TrmdSI). B is the result of DSA-FACE glycoform analysis of the W10 strain.

FIG. 11 shows the results of the identification of the GnTI gene and the results of the DSA-FACE glycoform analysis of 1-8 bacteria. A is the result of GnTI gene identification. M represents Marker; 1: introducing GnTI into the 1-8 bacteria; 2: x33 bacteria were free of GnTI. B is the result of DSA-FACE glycoform analysis of 1-8 bacteria.

FIG. 12 shows the results of the identification of the GalE-GalT gene and the results of the DSA-FACE glycoform analysis of 1-8-4 bacteria. A is the identification result of the GalE-GalT gene. M represents Marker; 1: 1-8-4 bacteria are introduced with GalE-GalT; 2: the X33 strain was devoid of GalE-GalT. B is the result of DSA-FACE glycoform analysis of 1-8-4 bacteria.

FIG. 13 shows the results of the identification of the mdsII gene, GnTII gene and DSA-FACE glycoform analysis of the 52-60 and 150L2 strains. A is the result of MdsII gene identification. M represents Marker; 1: introducing MdsII into the 52-60 bacteria; 2: the X33 strain was MdsII-free. B is the result of GnTII gene identification. M represents Marker; 1: GnTII was introduced into 150L2 strain; 2: the X33 strain was free of GnTII. C is the result of DSA-FACE glycoform analysis of 52-60 bacteria.

FIG. 14 shows the result of identifying PMT1 insertion-inactivated gene. M represents Marker; 1: the X33 bacterium PMT1 is not inactivated; 2: GJK30(PMT1 inactivated).

FIG. 15 shows the results of glycoform structure analysis of the GJK30 engineered strain. A: the Gal2GlcNAc2Man3GlcNAc2 structure previously accounted for less than 50% of the glycoforms; B: the Gal2GlcNAc2Man3GlcNAc2 structure obtained with the GJK30 engineered strain accounts for more than 60% of the glycoforms; C: digestion analysis of the glycoforms with glycosidases (New England Biolabs, Beijing).

Detailed Description

The experimental procedures used in the following examples are all conventional procedures unless otherwise specified.

Materials, reagents and the like used in the following examples are commercially available unless otherwise specified.

Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this invention belongs. Exemplary methods and materials are described below, although methods and materials similar or equivalent to those described herein can also be used in the practice of the present invention, as will be apparent to those skilled in the art. All publications and other references mentioned herein are incorporated by reference in their entirety. The materials, methods, and examples are illustrative only and not intended to be limiting.

The pPICZαA and pYES2 vectors and the Pichia pastoris strains X33 and GS115 are products of Invitrogen.

Pichia pastoris GJK01 CGMCC No.1853 (described in patent ZL200610164912.8, publication No. CN101195809, Pichia pastoris inactivated alpha-1, 6-mannosyltransferase).

Pyrobest enzyme, LA Taq enzyme, dNTPs, restriction enzyme, T4 ligase and the like used in the experiment are purchased from Dalibao bioengineering Co., Ltd, and pfu enzyme, a kit and DH5 alpha competent cells are products of Beijing Quanjin Co., Ltd. Total gene synthesis, nucleotide synthesis, primer synthesis, sequencing and the like are provided by Shanghai Biotechnology services, Inc.

The sequence information of the relevant modified enzymes referred to in the following examples is shown in Table 1.

TABLE 1 related modified enzymes to which the invention relates

Example 1 construction of engineered Pichia pastoris with modification of the glycoform of specific mammalian cells

Construction of phosphomannose transferase gene-inactivated yeast strains

The basic strain adopted by the invention is a GJK01 strain constructed in an earlier stage, the preservation number is CGMCC No.1853, and the granted patent number of the strain is as follows: ZL 200610164912.8. The strain is a pichia pastoris strain inactivated by alpha-1, 6-mannose transferase. The amino acid sequence of alpha-1, 6-mannose transferase (OCH1) is shown in SEQ ID No. 1.

The yeast strain GJK02 inactivated by the phosphomannose transferase gene is obtained by partially knocking out DNA molecules of the phosphomannose transferase shown in SEQ ID No.2 in Pichia pastoris GJK01, namely knocking out the phosphomannose transferase gene in a GJK01 yeast genome, so as to obtain the recombinant yeast.

1. Construction of Gene inactivation vector for phosphomannose transferase

The knockout plasmid pYES2-PNO1 for knocking out the phosphomannose transferase (PNO1) gene is the vector obtained by inserting a gene fragment (SEQ ID No.20) corresponding to the phosphomannose transferase (PNO1) gene between the KpnI and XbaI cleavage sites of the vector pYES2. Nucleotides 7 to 1006 from the 5' end of SEQ ID No.20 are the upstream homology arm of the PNO1 knockout fragment, and nucleotides 1015 to 2017 from the 5' end of SEQ ID No.20 are the downstream homology arm of the PNO1 knockout fragment.
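The arm coordinates above are 1-based and inclusive, which is easy to get wrong when slicing a sequence in code. The Python sketch below converts them to slices and computes the region lengths. The sequence used is a placeholder string, not SEQ ID No.20, and the helper names are hypothetical. The 8-nucleotide gap between the arms matches the length of the Not I site (GCGGCCGC) present in the overlap primers; reading the gap as that site is an inference, not something stated in the text.

```python
# Minimal sketch: handle the 1-based inclusive coordinates of the two
# homology arms in SEQ ID No.20. The sequence below is a placeholder only.
UPSTREAM_ARM = (7, 1006)       # upstream homology arm, 1-based inclusive
DOWNSTREAM_ARM = (1015, 2017)  # downstream homology arm, 1-based inclusive


def region_length(region: tuple) -> int:
    """Length of a 1-based inclusive region."""
    start, end = region
    return end - start + 1


def slice_1based(seq: str, region: tuple) -> str:
    """Extract a 1-based inclusive region from a Python string."""
    start, end = region
    return seq[start - 1:end]


# Region between the two arms (inferred here to hold the Not I linker).
LINKER = (UPSTREAM_ARM[1] + 1, DOWNSTREAM_ARM[0] - 1)

placeholder = "N" * 2023  # stand-in for the real fragment sequence

if __name__ == "__main__":
    print(region_length(UPSTREAM_ARM))    # 1000
    print(region_length(DOWNSTREAM_ARM))  # 1003
    print(LINKER, region_length(LINKER))  # (1007, 1014) 8
    print(len(slice_1based(placeholder, UPSTREAM_ARM)))
```

The arm lengths of about 1 kb each agree with the fragment sizes reported for the PCR products in the steps that follow.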

The method comprises the following specific steps:

The genomic DNA of Pichia pastoris X33 was extracted by the glass bead method (A. Adams et al., Guide to Yeast Genetics Methods, Science Press, 2000) and used as a template to amplify the homology arms on both sides of the phosphomannose transferase (PNO1) gene. Each PNO1 homology arm is about 1 kb long, and about 1.4 kb of coding sequence between them is deleted.

The primers used to amplify the homology arm of the upstream flanking region of PNO1 (PNO1 5' homology arm) are PNO-5-5 and PNO-5-3, with the following sequences, respectively:

5′-AGTGGTACCGCAGTTTAATCATAGCCCACTGC-3' (GGTACC is the Kpn I recognition site);

5′-ATTCCAATACCAAGAAAGTAAAGTgcggccgcAAGTGGAACTGGCGCACCGGT-3' (the lowercase portion is the Not I recognition site).

The primers used to amplify the homology arm of the downstream flanking region of PNO1 (PNO1 3' homology arm) are PNO-3-5 and PNO-3-3, with the following sequences, respectively:

5′-ACCGGTGCGCCAGTTCCACTTgcggccgcACTTTACTTTCTTGGTATTGGAAT-3' (the lowercase portion is the Not I recognition site);

5′-TGTTCTAGATCCGAGATTTTGCGCTATGGAGC-3' (TCTAGA is the Xba I recognition site).
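The four primers can be sanity-checked before ordering. The Python sketch below confirms that each primer contains its stated restriction site and that the two inner primers (PNO-5-3 and PNO-3-5) are exact reverse complements of each other, which is what allows the two arms to anneal in overlap extension PCR. The assignment of primer names to sequences follows the order in which they are listed; the helper names are hypothetical.

```python
# Minimal sketch: verify restriction sites and inner-primer complementarity
# for the PNO1 homology-arm primers. Helper names are hypothetical.
SITES = {"KpnI": "GGTACC", "NotI": "GCGGCCGC", "XbaI": "TCTAGA"}

PRIMERS = {
    "PNO-5-5": "AGTGGTACCGCAGTTTAATCATAGCCCACTGC",
    "PNO-5-3": "ATTCCAATACCAAGAAAGTAAAGTgcggccgcAAGTGGAACTGGCGCACCGGT",
    "PNO-3-5": "ACCGGTGCGCCAGTTCCACTTgcggccgcACTTTACTTTCTTGGTATTGGAAT",
    "PNO-3-3": "TGTTCTAGATCCGAGATTTTGCGCTATGGAGC",
}

EXPECTED_SITE = {
    "PNO-5-5": "KpnI",
    "PNO-5-3": "NotI",
    "PNO-3-5": "NotI",
    "PNO-3-3": "XbaI",
}


def reverse_complement(seq: str) -> str:
    """Reverse complement of a DNA sequence (case-insensitive input)."""
    pairs = {"A": "T", "C": "G", "G": "C", "T": "A"}
    return "".join(pairs[base] for base in reversed(seq.upper()))


# True for each primer that carries its expected recognition site.
sites_found = {
    name: SITES[site] in PRIMERS[name].upper()
    for name, site in EXPECTED_SITE.items()
}

# True if the inner primers fully overlap as reverse complements.
inner_primers_overlap = (
    reverse_complement(PRIMERS["PNO-3-5"]) == PRIMERS["PNO-5-3"].upper()
)

if __name__ == "__main__":
    print(sites_found)
    print(inner_primers_overlap)
```

Because the inner primers overlap over their full 53-nucleotide length, the 3' end of the upstream arm product and the 5' end of the downstream arm product share that sequence, with the Not I site at its centre.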

The PCR conditions for amplifying the two homology arms were as follows: initial denaturation at 94 °C for 5 min; 30 cycles of denaturation at 94 °C for 30 sec, annealing at 55 °C for 30 sec and extension at 72 °C for 1 min 30 sec; and a final extension at 72 °C for 10 min. The target fragments are about 1 kb. The PCR products were purified and recovered using a PCR product recovery and purification kit (purchased from Dingguo Biotechnology Co., Ltd., Beijing). The PNO1 5' and 3' homology arms were fused by overlap extension PCR (see J. Sambrook et al., Molecular Cloning: A Laboratory Manual, second edition, Science Press, 1995) using the PCR products of the PNO1 5' and 3' homology arms as templates and PNO-5-5/PNO-3-3 as primers. The PCR conditions were as follows: initial denaturation at 94 °C for 5 min; 30 cycles of denaturation at 94 °C for 1 min, annealing at 55 °C for 1 min and extension at 72 °C for 3 min 30 sec; and a final extension at 72 °C for 10 min. The target fragment is about 2 kb. The PCR product was purified and recovered using the PCR product recovery and purification kit.
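The two cycling programs above can be written down as data and their programmed hold times added up, which is useful when planning a run. The Python sketch below does this; it counts only the programmed hold times and ignores instrument ramping between temperatures, and the function and variable names are hypothetical.

```python
# Minimal sketch: total programmed hold time for the two PCR programs in the
# text. Ramp times between temperatures are instrument-dependent and ignored.
def hold_time_s(initial_s, cycles, denature_s, anneal_s, extend_s, final_s):
    """Sum of all programmed hold times, in seconds."""
    return initial_s + cycles * (denature_s + anneal_s + extend_s) + final_s


# Homology-arm PCR: 94 C 5 min; 30 x (30 s, 30 s, 1 min 30 s); 72 C 10 min.
ARM_PCR_S = hold_time_s(300, 30, 30, 30, 90, 600)

# Overlap-extension PCR: 94 C 5 min; 30 x (1 min, 1 min, 3 min 30 s); 72 C 10 min.
FUSION_PCR_S = hold_time_s(300, 30, 60, 60, 210, 600)

if __name__ == "__main__":
    print(ARM_PCR_S / 60)     # 90.0 minutes
    print(FUSION_PCR_S / 60)  # 180.0 minutes
```

The fusion program takes twice as long mainly because of the longer extension step needed for the fragment of about 2 kb.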

The PCR product was digested with Kpn I/Xba I (all restriction enzymes used in this experiment were purchased from Takara Bio Inc., Dalian). The digested product was inserted into the vector pYES2 (Invitrogen Corp., USA) digested with the same two enzymes, ligated overnight at 16 ℃ with T4 ligase, and transformed into E. coli DH5α; positive clones were selected on LB plates containing ampicillin (100 μg/mL). Plasmids from positive clones were identified by Kpn I/Xba I double digestion, and the recombinant vector yielding fragments of about 4200 bp and about 2000 bp was named pYES2-PNO1. This recombinant vector is the knockout plasmid for the phosphomannose transferase (PNO1) gene. The upstream and downstream homology arms of the PNO1 gene were finally verified by sequencing.

2. Transformation of Pichia pastoris by knockout plasmid

The knockout plasmid pYES2-PNO1 was transformed into Pichia pastoris GJK01 (described in patent ZL200610164912.8, publication number CN101195809) by electrotransformation, a method well known in the art (e.g., A. Adams et al., A Guide to Yeast Genetics Methods, Science Press, 2000). Before electrotransformation, the knockout plasmid was linearized at the BamH I restriction site upstream of the 5' homology arm. It was then electroporated into prepared competent cells, which were spread on MD medium containing arginine and histidine (YNB 1.34 g/100 mL, biotin 4×10⁻⁵ g/100 mL, glucose 2 g/100 mL, agar 1.5 g/100 mL, arginine 100 mg/mL, histidine 100 mg/mL). After clones grew on the medium, several were picked at random, their genomes were extracted, and PCR was used to identify whether the knockout plasmid had integrated correctly at the target site on the chromosome. The two primers used in the PCR were PNO-5-5OUT, located outside the 5' homology arm of the PNO1 gene: 5'-GCAGTTTAATCATAGCCCACTGCTA-3', and inner01, located on the vector: 5'-AGCGTCGATTTTTGTGATGCTCGTCA-3'. The enzyme used was rTaq (Takara Bio Inc.), and the PCR conditions were: pre-denaturation at 94 ℃ for 5 min; 30 cycles of denaturation at 94 ℃ for 30 sec, annealing at 55 ℃ for 30 sec, and extension at 72 ℃ for 3 min; and a final extension at 72 ℃ for 10 min. The size of the PCR product was analyzed by gel electrophoresis; a clone for which this primer pair amplified a band of about 2.3 kb was considered positive.

3. PCR identification of positive engineering strain

One of the positive clones was inoculated into YPD medium (10 g/L yeast extract, 20 g/L peptone, 20 g/L glucose) and cultured with shaking at 25 ℃ for 12 hours. The culture was then spread on adenine-deficient 5-FOA medium (YNB 1.34 g/100 mL, biotin 4×10⁻⁵ g/100 mL, glucose 2 g/100 mL, agar 1.5 g/100 mL, arginine 100 mg/mL, histidine 100 mg/mL, uracil 100 mg/mL, 5-FOA 0.1%) and cultured at 25 ℃. Here YNB is yeast nitrogen base without amino acids (Beijing Western Biotechnology Ltd.), and 5-FOA is 5-fluoroorotic acid (Sigma-Aldrich, P.O. Box 14508, St. Louis, MO 63178, USA).

After clones grew on the 5-FOA medium, their genomes were extracted for PCR identification. The genome was used as the template, and the identification primers were PNO1-ORF01 and PNO1-ORF02, which lie outside the homology arms of the PNO1 gene on the chromosome. The primer sequences are:

PNO1-ORF01: 5′-GGGAAAGAAAACCTTCAATTT-3′;

PNO1-ORF02: 5′-TACAAGCCAGTTTCGCAATAA-3′.

A PCR reaction using the genome of the wild-type X33 strain (Invitrogen) as the template served as the control. The enzyme used was LA Taq (Takara Bio Inc.), and the PCR conditions were: pre-denaturation at 94 ℃ for 5 min; 30 cycles of denaturation at 94 ℃ for 30 sec, annealing at 55 ℃ for 30 sec, and extension at 72 ℃ for 3 min; and a final extension at 72 ℃ for 10 min.

To identify whether alpha-1,6-mannosyltransferase had been knocked out, a reporter protein was introduced after the GJK01 engineering strain was obtained. An anti-Her2 antibody was used as the reporter protein; the construction of its expression vector and the vector transformation method are disclosed in a patent application (publication number CN101748145A). Following that method, the anti-Her2 antibody expression vector was transferred into the GJK01 host strain to obtain the GJK01-HL engineering strain expressing the anti-Her2 antibody. The method for analyzing oligosaccharide chains by DSA-FACE has been published in Liu Bo et al., "A method for analyzing oligosaccharide chains by using DSA-FACE", Biotechnology Communication, 2008, 19(6): 885-

The PCR product was analyzed by agarose gel electrophoresis. In FIG. 1, A is the identification result of the GJK01 host strain, and B is the DSA-FACE glycoform analysis of the GJK01-HL strain (och1 knockout). In FIG. 2, lane 1 is the PNO1-deficient strain and lane 2 is the wild type. The PCR product from the wild-type X33 genome is about 490 bp, whereas the PNO1-deficient engineering strain gives no amplified band, which shows that the PNO1 gene has been lost and that the phosphomannose transferase knockout strain was constructed correctly. This strain was named GJK02; it is a recombinant Pichia pastoris in which phosphomannose transferase has been knocked out.

Secondly, construction of yeast strain with inactivated phosphomannose synthetase gene

The yeast strain GJK03, in which the phosphomannose synthetase gene is inactivated, was obtained by knocking out part of the DNA molecule of the phosphomannose synthetase shown in SEQ ID No.3 in Pichia pastoris GJK02, that is, by knocking out the phosphomannose synthetase gene in the GJK02 genome. In the resulting recombinant yeast, alpha-1,6-mannosyltransferase, phosphomannose transferase and phosphomannose synthetase are all inactivated.

The vector was constructed by the same method as in step one.

1. Construction of Gene inactivation vector for phosphomannose synthetase

The plasmid pYES2-MNN4B for knocking out the phosphomannose synthetase gene is a vector obtained by inserting the upstream and downstream homology arms flanking the phosphomannose synthetase gene fragment to be deleted between the Stu I and Spe I restriction sites of the vector pYES2.

By the same method, the genomic DNA of Pichia pastoris X33 was extracted by the glass bead method and used as a template to amplify the homology arms on both sides of the phosphomannose synthetase (MNN4B) gene. Each MNN4B homology arm is about 1 kb, and about 1 kb of the coding gene between them is deleted.

The primers used to amplify the homology arm of the upstream flanking region of MNN4B (the MNN4B 5' homology arm) are MNN4B-5-5 and MNN4B-5-3, with the following sequences:

5′-AGTAGGCCTTTCAACGAGTGACCAATGTAGA-3' (the underlined portion is the Stu I recognition site);

5′-TATCTCCATAGTTTCTAAGCAGGGCGGCCGCAATATGTGCGGTGTAGGGAGAAA-3' (the underlined portion is the Not I recognition site).

The primers used to amplify the homology arm of the downstream flanking region of MNN4B (the MNN4B 3' homology arm) are MNN4B-3-5 and MNN4B-3-3, with the following sequences:

5′-TTTCTCCCTACACCGCACATATTGCGGCCGCCCTGCTTAGAAACTATGGAGATA-3' (the underlined portion is the Not I recognition site);

5′-TGTACTAGTTGAAGACGTCCCCTTTGAACA-3' (the underlined portion is the Spe I recognition site).

The PCR amplification conditions, recovery method and restriction digestion of the two homology arms were the same as in step one. The pYES2-MNN4B knockout vector was constructed and finally verified by sequencing.

2. Transformation of Pichia pastoris by knockout plasmid

The knockout plasmid was transformed into the Pichia pastoris engineering strain GJK02 constructed above by electrotransformation; the electrotransformation and identification methods were the same as in step one.

The two primers used in the PCR were MNN4B-5-5OUT, located outside the 5' homology arm of the MNN4B gene: 5'-TAGTCCAAGTACGAAACGACACTA-3', and inner01, located on the vector: 5'-AGCGTCGATTTTTGTGATGCTCGTCA-3'. A clone for which this primer pair amplified a band of about 2 kb was considered positive.

3. PCR identification of positive engineering strain

One positive clone was inoculated on 5-FOA medium (same formula as above). After clones grew, their genomes were extracted for PCR identification, using the genome as the template. The identification primers were MNN4B-ORF01 and MNN4B-ORF02, which lie outside the homology arms of the MNN4B gene on the chromosome, with the following sequences:

MNN4B-ORF01: 5'-AAAACTATCCAATGAGGGTCTC-3';

MNN4B-ORF02: 5'-TCTTCAATGTCTTTAACGGTGT-3'.

PCR amplification was performed with the genomic DNA of the positive clone as the template and MNN4B-ORF01 and MNN4B-ORF02 as primers. The results are shown in FIG. 3, where lane 1 is the MNN4B-deficient strain and lane 2 is the wild type. The PCR product from the wild-type X33 genome is about 912 bp, whereas the MNN4B-deficient engineering strain gives no amplified band, which shows that phosphomannose synthetase has been knocked out. The strain was named GJK03; it is a recombinant Pichia pastoris in which phosphomannose transferase and phosphomannose synthetase have been knocked out.

The results of DSA-FACE glycoform analysis of GJK02 and GJK03 (knock-outs of och1, pno1 and mnn4b) (the same procedure as in example one) are shown in FIG. 4, and it can be seen that mannose phosphate moieties in glycoforms are removed after the pno1 and mnn4b knock-outs.

Thirdly, construction of yeast strain with inactivated beta-mannosyltransferase gene ARM2

The yeast strain GJK04, in which the phosphomannose transferase, phosphomannose synthetase and beta-mannose transferase ARM2 (i.e., beta-mannose transferase II) genes are inactivated, was obtained by knocking out part of the DNA molecule of the beta-mannose transferase ARM2 shown in SEQ ID No.5 in Pichia pastoris GJK03, that is, by knocking out the ARM2 gene in the GJK03 genome. In the resulting recombinant yeast, alpha-1,6-mannosyltransferase, the phosphomannose transferase gene, the phosphomannose synthetase gene and beta-mannose transferase ARM2 are all inactivated.

1. Construction of beta-mannose transferase ARM2 gene inactivation vector

The vector was constructed by the same method as in step one, specifically as follows:

By the same method, the genomic DNA of Pichia pastoris X33 was extracted by the glass bead method and used as a template to amplify the homology arms on both sides of the beta-mannose transferase (ARM2) gene. Each ARM2 homology arm is about 0.6 kb, and about 0.6 kb of the coding gene between them is deleted.

The primers used to amplify the homology arm of the upstream flanking region of ARM2 (the ARM2 5' homology arm) are ARM2-5-5 and ARM2-5-3, with the following sequences:

5′-ActTGGTACCACACGACTCAACTTCCTGCTGCTC-3' (the underlined portion is the Kpn I recognition site);

5′-actGCGGCCGCCACGAAACTTCTTACCTTTGACAA-3' (the underlined portion is the Not I recognition site).

The primers used to amplify the homology arm of the downstream flanking region of ARM2 (the ARM2 3' homology arm) are ARM2-3-5 and ARM2-3-3, with the following sequences:

5′-TTGTCAAAGGTAAGAAGTTTCGTGGCGGCCGCTATCTTGACATTGTCATTCAGTGA-3' (the underlined portion is the Not I recognition site);

5′-caaTCTAGAGCCTCCTTCTTTTCCGCCT-3' (the underlined portion is the Xba I recognition site).

2. Transformation of Pichia pastoris by knockout plasmid

The knockout plasmid was transformed into the Pichia pastoris engineering strain GJK03 constructed above by electrotransformation; the electrotransformation and identification methods were the same as above.

The two primers used in the PCR were ARM2-5-5OUT, located outside the 5' homology arm of the ARM2 gene: 5'-TTTTCCTCAAGCCTTCAAAGACAG-3', and inner01, located on the vector: 5'-AGCGTCGATTTTTGTGATGCTCGTCA-3'. A clone for which this primer pair amplified a band of about 0.8 kb was considered positive.

3. PCR identification of positive engineering strain

One positive clone was inoculated on 5-FOA medium (same formula as above). After clones grew, their genomes were extracted for PCR identification, using the genome as the template. The identification primers were Arm2-ORF-09 and Arm2-ORF-10, which lie outside the homology arms of the ARM2 gene on the chromosome, with the following sequences:

Arm2-ORF-09: 5'-gggcagaagatcctagag-3';

Arm2-ORF-10: 5'-tcgtctccattgctatctacgact-3'.

PCR amplification was performed with the genomic DNA of the positive clone as the template and Arm2-ORF-09 and Arm2-ORF-10 as primers. The results are shown in FIG. 5, where lane 1 is the ARM2-deficient strain and lane 2 is the wild type. The PCR product from the wild-type X33 genome is about 600 bp, whereas the ARM2-deficient engineering strain gives no amplified band, which shows that beta-mannose transferase (ARM2) has been knocked out. The strain was named GJK04; it is a recombinant Pichia pastoris in which phosphomannose transferase, phosphomannose synthetase and beta-mannose transferase II (ARM2) have been knocked out.
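The identification logic for GJK02, GJK03 and GJK04 is the same in each case: the ORF primer pair gives a band of known size from the wild-type genome and no band from the knockout. A minimal Python sketch of that call follows. The function name, the return strings and the 15% tolerance for reading a band size off a gel are illustrative assumptions, not values from the source.

```python
# Approximate wild-type amplicon sizes (bp) stated in the text.
WILD_TYPE_BAND_BP = {"PNO1": 490, "MNN4B": 912, "ARM2": 600}


def call_genotype(gene, observed_bands_bp, tolerance=0.15):
    """Classify a clone from the band sizes seen with the ORF primer pair.

    tolerance is an assumed relative error for sizing a band on a gel.
    """
    expected = WILD_TYPE_BAND_BP[gene]
    for band in observed_bands_bp:
        if abs(band - expected) <= tolerance * expected:
            return "wild-type (gene present)"
    return "knockout candidate (no ORF band)"


if __name__ == "__main__":
    print(call_genotype("ARM2", [610]))  # band near 600 bp
    print(call_genotype("ARM2", []))     # empty lane
```

Because a knockout is called from the absence of a band, the wild-type control lane run in parallel is what shows that the PCR itself worked.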

Fourthly, construction of yeast strains with inactivated beta-mannose transferase ARM1, ARM3 and ARM4 genes

Following the design and construction process used in steps one to three, in particular that for the ARM2-inactivated strain, beta-mannose transferases ARM1, ARM3 and ARM4 (i.e., beta-mannose transferase I, III and IV, with the amino acid sequences of SEQ ID No.4, SEQ ID No.6 and SEQ ID No.7, respectively) were knocked out in sequence starting from the GJK04 engineering strain, giving the engineering strains GJK05, GJK07 and GJK18, respectively.

1. Construction of beta-mannose transferase ARM1, ARM3 and ARM4 gene inactivation vectors

The vector was constructed by the same method as in step three, except for the following:

The primers used to amplify the homology arm of the upstream flanking region of ARM1 (the ARM1 5' homology arm) are ARM1-5-5 and ARM1-5-3, with the following sequences:

ARM1-5-5: 5'-TCAACGCGTTGGCTCTGGATCGTTCTAATA-3' (the underlined portion is the Mlu I recognition site);

ARM1-5-3: 5'-ttctccgttctcctttctccgtGCGGCCGCcagcagcaaggaagataccaa-3' (the underlined portion is the Not I recognition site).

The primers used to amplify the homology arm of the downstream flanking region of ARM1 (the ARM1 3' homology arm) are ARM1-3-5 and ARM1-3-3, with the following sequences:

ARM1-3-5: 5'-ttggtatcttccttgctgctgGCGGCCGCacggagaaaggagaacggagaa-3' (the underlined portion is the Not I recognition site);

ARM1-3-3: 5'-TCAACGCGTTGGCTGGAGGTGACAGAGGAA-3' (the underlined portion is the Mlu I recognition site).

The primers used to amplify the homology arm of the upstream flanking region of ARM3 (the ARM3 5' homology arm) are ARM3-5-5 and ARM3-5-3, with the following sequences:

ARM3-5-5: 5'-TCAACGCGTTAGTAGTGCCGTGCCAAGTAGCG-3' (the underlined portion is the Mlu I recognition site);

ARM3-5-3: 5'-tcctactttgcttatcatctgccGCGGCCGCggtcaggccctcttatggttgtg-3' (the underlined portion is the Not I recognition site).

The primers used to amplify the homology arm of the downstream flanking region of ARM3 (the ARM3 3' homology arm) are ARM3-3-5 and ARM3-3-3, with the following sequences:

ARM3-3-5: 5'-cacaaccataagagggcctgaccGCGGCCGCggcagatgataagcaaagtagga-3' (the underlined portion is the Not I recognition site);

ARM3-3-3: 5'-TCAACGCGTCATAGGTAATGGCACAGGGATAG-3' (the underlined portion is the Mlu I recognition site).

The primers used to amplify the homology arm of the upstream flanking region of ARM4 (the ARM4 5' homology arm) are ARM4-5-5 and ARM4-5-3, with the following sequences:

ARM4-5-5: 5'-TCAACGCGTGCAGCGTTTACGAATAGTGTCC-3' (the underlined portion is the Mlu I recognition site);

ARM4-5-3: 5'-gcatagggctgaagcatactgtGCGGCCGCaatgatatgtacgttcccaaga-3' (the underlined portion is the Not I recognition site).

The primers used to amplify the homology arm of the downstream flanking region of ARM4 (the ARM4 3' homology arm) are ARM4-3-5 and ARM4-3-3, with the following sequences:

ARM4-3-5: 5'-tcttgggaacgtacatatcattGCGGCCGCacagtatgcttcagccctatgc-3' (the underlined portion is the Not I recognition site);

ARM4-3-3: 5'-TCAACGCGTGAGGTGGACAAGAGTTCAACAAAG-3' (the underlined portion is the Mlu I recognition site).
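Since the underlining that marked the restriction sites does not survive in plain text, the sites can be located by scanning each primer for the recognition sequence. The sketch below does this for the twelve ARM1, ARM3 and ARM4 primers. It assumes the standard recognition sequences ACGCGT for Mlu I and GCGGCCGC for Not I; the dictionary and function names are illustrative.

```python
# Standard recognition sequences for the two enzymes named in the text.
SITES = {"Mlu I": "ACGCGT", "Not I": "GCGGCCGC"}

# Primer sequences copied from the listing above (5' -> 3').
PRIMERS = {
    "ARM1-5-5": "TCAACGCGTTGGCTCTGGATCGTTCTAATA",
    "ARM1-5-3": "ttctccgttctcctttctccgtGCGGCCGCcagcagcaaggaagataccaa",
    "ARM1-3-5": "ttggtatcttccttgctgctgGCGGCCGCacggagaaaggagaacggagaa",
    "ARM1-3-3": "TCAACGCGTTGGCTGGAGGTGACAGAGGAA",
    "ARM3-5-5": "TCAACGCGTTAGTAGTGCCGTGCCAAGTAGCG",
    "ARM3-5-3": "tcctactttgcttatcatctgccGCGGCCGCggtcaggccctcttatggttgtg",
    "ARM3-3-5": "cacaaccataagagggcctgaccGCGGCCGCggcagatgataagcaaagtagga",
    "ARM3-3-3": "TCAACGCGTCATAGGTAATGGCACAGGGATAG",
    "ARM4-5-5": "TCAACGCGTGCAGCGTTTACGAATAGTGTCC",
    "ARM4-5-3": "gcatagggctgaagcatactgtGCGGCCGCaatgatatgtacgttcccaaga",
    "ARM4-3-5": "tcttgggaacgtacatatcattGCGGCCGCacagtatgcttcagccctatgc",
    "ARM4-3-3": "TCAACGCGTGAGGTGGACAAGAGTTCAACAAAG",
}


def sites_in(primer):
    """Return the names of the enzymes whose site occurs in the primer."""
    seq = primer.upper()
    return [name for name, site in SITES.items() if site in seq]


# Map each primer name to the list of sites it carries.
site_map = {name: sites_in(seq) for name, seq in PRIMERS.items()}

if __name__ == "__main__":
    for name, found in site_map.items():
        print(f"{name}: {', '.join(found) or 'none'}")
```

In this listing the outer primers (-5-5 and -3-3) carry only the Mlu I site and the inner primers (-5-3 and -3-5) carry only the Not I site.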

2. Transformation of Pichia pastoris by knockout plasmid

The procedure is the same as in step three, except that the primers used in the PCR were as follows:

The primer outside the 5' homology arm of the ARM1 gene, ARM1-5-5OUT: 5'-GTTCTGGTATGCGTTCTATTCTTC-3', and the primer on the vector, inner01: 5'-AGCGTCGATTTTTGTGATGCTCGTCA-3'. A clone for which this primer pair amplified a band of about 3.5 kb was considered positive.

The primer outside the 5' homology arm of the ARM3 gene, ARM3-5-5OUT: 5'-TATTTGCCTTCTTCACCGTTAT-3', and the primer on the vector, inner01: 5'-AGCGTCGATTTTTGTGATGCTCGTCA-3'. A clone for which this primer pair amplified a band of about 3.7 kb was considered positive.

The primer outside the 5' homology arm of the ARM4 gene, ARM4-5-5OUT: 5'-TCCGTTGAGGGTGCTAATGGTA-3', and the primer on the vector, inner01: 5'-AGCGTCGATTTTTGTGATGCTCGTCA-3'. A clone for which this primer pair amplified a band of about 3.7 kb was considered positive.

3. PCR identification of positive engineering strain

The procedure is the same as in step three, except that the following primers were used to confirm that the genes had been knocked out in the engineering strains (FIG. 6, FIG. 7 and FIG. 8):

Arm1-ORF-09: 5'-TAGTCTGGTTTGCGGTAGTGT-3';

Arm1-ORF-10: 5'-AGATTGAGCATAGGAGTGGC-3'.

Arm3-ORF-09: 5'-AAACGGAGTCCAGTTCTTCT-3';

Arm3-ORF-10: 5'-CAACTTTGCCTGTCATTTCC-3'.

Arm4-ORF-09: 5'-CGCTTCAGTTCACGGACATA-3';

Arm4-ORF-10: 5'-GCAACCCAGACCTCCTTACC-3'.

The results of DSA-FACE glycoform analysis of GJK18 are shown in FIG. 9. Because beta-mannose modification is added only at individual mannose termini, the glycoform analysis result is essentially unchanged. However, beta-mannose is a sugar that can potentially cause immunogenicity and therefore poses a potential risk for drugs used in humans. Inactivating all the beta-mannose transferases fundamentally removes beta-mannose without changing the glycoform structure.

Fifthly, constructing a glycosyl engineering yeast strain with the mammalian Man5GlcNAc2 and no fucose glycosylation structure

First, to identify whether the exogenous mannosidase I (MDSI) functions correctly, a reporter protein was introduced into the GJK18 engineering strain in advance. The invention uses an anti-Her2 antibody as the reporter protein and constructs an expression vector for it. The vector construction method and the vector transformation method are disclosed in a patent application (publication number CN101748145A). Following that method, the anti-Her2 antibody expression vector was transferred into the GJK18 host strain to obtain the W2 engineering strain expressing the anti-Her2 antibody.

Second, the glycosyl engineering yeast strain W10, which has the mammalian Man5GlcNAc2 and no fucose glycosylation structure, is an engineering strain obtained by inserting MDSI with a C-terminally fused HDEL sequence (TrmdsI; the nucleotide sequence is shown in SEQ ID No.14 and the MDSI protein in SEQ ID No.9) into the genome of the host strain W2.

1. Construction of exogenous mannosidase I (MDSI) expression vector

The recombinant vector pPIC9-TrmdsI for expressing the exogenous mannosidase I is obtained by inserting the DNA molecule shown in SEQ ID No.14 between the Xho I and EcoR I restriction sites of the pPIC9 vector.

In SEQ ID No.14, nucleotides 1 to 1524 from the 5' end are the optimized mannosidase I coding gene, and nucleotides 1525 to 1536 from the 5' end encode the endoplasmic reticulum retention signal HDEL.

(1) Mannosidase I (MDSI) gene

The exogenous mannosidase I may come from filamentous fungi, plants, insects, java, mammals and so on. In this example, mannosidase I from Trichoderma viride was selected (Janjie, Cloning, expression and activity identification of Trichoderma viride alpha-1,2-mannosidase in Pichia pastoris [master's thesis]), and the endoplasmic reticulum retention signal HDEL was fused to its C-terminus.

Based on the Trichoderma viride mannosidase I sequence disclosed in Zhaojie, Cloning, expression and activity identification of Trichoderma viride alpha-1,2-mannosidase in Pichia pastoris [master's thesis], the coding gene was optimized according to yeast codon preference and the principles of high-level gene expression, and an HDEL sequence was fused at the C-terminus to obtain the gene fragment (SEQ ID No.14).

(2) The following primers were designed and synthesized:

TrmdsI-5: 5’-TCTCTCGAGAAAAGAGAGGCTGAAGCTTATCCAAAGCCGGGCGCCAC-3'; the underlined sequence is the Xho I recognition site.

TrmdsI-3: 5’-AGGGAATTCTTACAACTCGTCGTGAGCAAGGTGGCCGCCCCGTCGTGATG-3'; the underlined sequence is the EcoR I recognition site.

(3) PCR amplification was carried out with the gene fragment obtained in step (1) as the template and TrmdsI-5 and TrmdsI-3 as primers. The PCR product, named TrmdsI, contains SEQ ID No.14.

(4) The PCR product obtained in step (3) was double-digested with Xho I and EcoR I to obtain the gene fragment; the pPIC9 vector was double-digested with Xho I and EcoR I to obtain the large vector fragment; the two fragments were ligated to obtain the recombinant plasmid, named pPIC9-TrmdsI. Sequencing of pPIC9-TrmdsI confirmed that it was correct.
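The reverse primer TrmdsI-3 listed in step (2) is where the HDEL retention signal is fused to the C-terminus. The sketch below reads the 15 nucleotides that follow the EcoR I site on that primer, reverse-complements them back onto the coding strand and translates them. The codon table is deliberately limited to the five codons needed here, and the variable names are illustrative.

```python
# Reverse primer copied from step (2), 5' -> 3'.
TRMDSI_3 = "AGGGAATTCTTACAACTCGTCGTGAGCAAGGTGGCCGCCCCGTCGTGATG"

# Only the codons needed for this check; "*" marks a stop codon.
CODONS = {"CAC": "H", "GAC": "D", "GAG": "E", "TTG": "L", "TAA": "*"}

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


# The 15 nt immediately 3' of the EcoR I site (GAATTC) on the primer.
start = TRMDSI_3.index("GAATTC") + len("GAATTC")
tail_coding = reverse_complement(TRMDSI_3[start:start + 15])

# Translate codon by codon; unknown codons would show up as "?".
tail_peptide = "".join(
    CODONS.get(tail_coding[i:i + 3], "?") for i in range(0, 15, 3)
)

if __name__ == "__main__":
    print(tail_coding, "->", tail_peptide)
```

On the coding strand these 15 nucleotides read CACGACGAGTTGTAA, that is, His-Asp-Glu-Leu followed by a stop codon, consistent with the 12-nucleotide HDEL segment described for SEQ ID No.14.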

2. Construction of recombinant Yeast expressing exogenous mannosidase I

About 10 μg of pPIC9-TrmdsI plasmid was linearized with Sal I, and the linearized plasmid was precipitated with 1/10 volume of 3 M sodium acetate and 3 volumes of absolute ethanol. The pellet was washed twice with 70% (v/v) aqueous ethanol to remove salts and air-dried, then resuspended in about 30 μL of water to give the linearized pPIC9-TrmdsI plasmid for transformation.

Yeast electrocompetent cells were prepared as described in the Invitrogen manual and in "Molecular Cloning: A Laboratory Manual (Fourth Edition)", 2012, Cold Spring Harbor Laboratory Press, Cold Spring Harbor, New York. The host strain was the W2 engineering strain constructed above.

The method comprises the following specific steps:

Pichia pastoris W2 was streaked onto YPD plates (yeast extract 10 g/L, tryptone 20 g/L, glucose 20 g/L, agar 15 g/L) to isolate single clones and incubated at 28 ℃ for 2 days. A single clone was inoculated into a 50 mL Erlenmeyer flask containing 10 mL of YPD liquid medium (yeast extract 10 g/L, tryptone 20 g/L, glucose 20 g/L) and cultured overnight at 28 ℃ to an OD600 of about 2. Then 0.1-0.5 mL of this culture was inoculated into a 3.5 L shake flask containing 500 mL of YPD liquid medium and cultured overnight to an OD600 between 1.3 and 1.5. The culture was transferred to a sterile centrifuge bottle and centrifuged at 1500 g for 10 minutes at 4 ℃. The cells were resuspended in 500 mL of pre-chilled sterile water, harvested by centrifugation at 1500 g for 10 minutes at 4 ℃, and washed once more with 250 mL of pre-chilled sterile water. The cells were then resuspended in 20 mL of pre-chilled sterile 1 M sorbitol, harvested by centrifugation at 1500 g for 10 min at 4 ℃, and resuspended in pre-chilled 1 M sorbitol to a final volume of 1.5 mL to give the cell suspension.

μL of the cell suspension and 10 μL of the linearized pPIC9-TrmdsI plasmid for transformation were mixed in a microcentrifuge tube and placed on ice for 5 min. The mixture was transferred into an ice-cold 0.2 cm electroporation cuvette and the cells were electroporated (Bio-Rad Gene Pulser, 2000 V, 25 μF, 200 Ω). 1 mL of ice-cold 1 M sorbitol was immediately added to the cuvette, and the mixture (transformed cells) was carefully transferred into a 15 mL culture tube.
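The electroporation settings above can be turned into two derived quantities that are often used to compare protocols: the field strength across the cuvette gap and a nominal RC time constant. The sketch below is a simple calculation under stated assumptions. It treats the 200 Ω setting as the entire discharge resistance, so the time constant is a nominal figure; the value reported by the instrument also depends on the conductivity of the sample. The function names are illustrative.

```python
def field_strength_kv_per_cm(voltage_v, gap_cm):
    """Field strength across the cuvette gap, in kV/cm."""
    return voltage_v / gap_cm / 1000.0


def nominal_time_constant_ms(resistance_ohm, capacitance_uf):
    """Nominal tau = R * C; ohm x microfarad gives microseconds."""
    return resistance_ohm * capacitance_uf / 1000.0


# Settings quoted in the text: 2000 V, 0.2 cm cuvette, 25 uF, 200 ohm.
field = field_strength_kv_per_cm(2000.0, 0.2)
tau = nominal_time_constant_ms(200.0, 25.0)

if __name__ == "__main__":
    print(f"field strength: {field:.1f} kV/cm")
    print(f"nominal time constant: {tau:.1f} ms")
```

With the quoted settings this gives a field strength of 10 kV/cm and a nominal time constant of 5 ms.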

The culture tubes were incubated at 28 ℃ for 1 h without shaking. Then 1 mL of YPD liquid medium was added and the tubes were incubated at 28 ℃ for 3 hours on a shaker at 250 rpm. μL of the transformed cells were plated onto MD plates (1.34 g/100 mL YNB, 4×10⁻⁵ g/100 mL biotin, 2 g/100 mL glucose) and incubated at 28 ℃ for 2-5 days until single colonies formed. The resulting clone, W2-Tr, was named W10.

The genomic DNA of W10 was extracted by glass bead preparation, and PCR amplification was carried out using genomic DNA as template and TrMDSI-1.3kb-01 and TrMDSI-1.3kb-02 as primers to obtain a PCR amplification product of about 1.3kb, demonstrating that MDSI was inserted into the genome, i.e., a positive engineered bacterium (A in FIG. 10).

TrMDSI-1.3kb-01: 5’-GAACACGATCCTTCAGTATGTA-3’;

TrMDSI-1.3kb-02: 5’-TGATGATGAACGGATGCTAAAG-3’.

As a result of DSA-FACE glycoform analysis of the strain W10 (the method was the same as that described in example I), it was found that, after transfer into TrmdSI, the glycoform structures of the expressed proteins of the strain W10 were Man5GlcNAc2 and Man6GlcNAc2, in which Man5GlcNAc2 was predominant, as shown in B in FIG. 10.

Sixthly, constructing the glycosyl engineering yeast strain with the mammalian GlcNAcMan5GlcNAc2 and an afucosylated structure

The glycosyl engineering yeast strain 1-8, which has the mammalian GlcNAcMan5GlcNAc2 and no fucose glycosylation structure, is an engineering strain obtained by inserting a DNA fragment of N-acetylglucosamine transferase I (GnTI) carrying the mnn9 localization signal (the nucleotide sequence is shown in SEQ ID No.15 and the encoded protein in SEQ ID No.10) into the genome of the host strain W10.

In SEQ ID No.15, nucleotides 1 to 114 from the 5' end are the mnn9 localization signal, and nucleotides 115 to 1335 from the 5' end are the N-acetylglucosamine transferase I coding gene.

1. Construction of N-acetylglucosamine transferase I (GnTI) expression vector containing mnn9 localization Signal

(1) Obtaining the human gnt1 gene

Using the human gnt1 gene upstream primer mnn9-GnTI-01: 5'-tcagtcagcgctctcgatggcgaccccg-3' and downstream primer GnTI-02: 5'-GCGAATTCTTAGTGCTAATTCCAGCTAGGATCATAG-3' (the underlined portion is the EcoR I restriction site), the full-length fragment of the human gnt1 gene was amplified by PCR from a human fetal liver cDNA library (purchased from Clontech Laboratories Inc., 1290 Terra Bella Ave., Mountain View, CA 94043, USA). PCR conditions: pre-denaturation at 94 ℃ for 5 min; 30 cycles of denaturation at 94 ℃ for 30 sec, annealing at 52 ℃ for 30 sec, and extension at 72 ℃ for 1 min 30 sec; final extension at 72 ℃ for 10 min. The PCR amplification products were separated by 0.8% agarose gel electrophoresis and recovered with a DNA recovery kit.

(2) GnTI DNA fragment containing localization signal mnn9

Upstream primer containing the S. cerevisiae MNN9 Golgi localization signal coding sequence, ScMNN9-03: 5'-tatAATattATGTCACTTTCTCTTGTATCGTACCGCCTAAGAAAGAACCCGTGGGTTAACATTTTTCTACCTGTTTTGGCCATATTTCTAATATATATAATTTTTTTCCAGAGAGATCAATCTtcagtcagcgctctcgatggcgaccccg-3' (the underlined portion is the Ssp I restriction site).

Using the upstream primer ScMNN9-03 and the downstream primer GnTI-02, the recovered and purified 1.2 kb GnTI fragment was joined to the S. cerevisiae MNN9 Golgi localization signal coding sequence by PCR, and the mnn9-gnt1 gene fragment (SEQ ID No.15) was amplified with Pyrobest DNA polymerase.
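The long primer ScMNN9-03 has three parts: a short leader with the Ssp I site, the MNN9 signal coding sequence starting at ATG, and a 3' end identical to primer mnn9-GnTI-01 that anneals to the gnt1 fragment. As a consistency check, the sketch below splits the primer on those landmarks and measures the signal-coding part. The variable names are illustrative, and AATATT is assumed as the Ssp I recognition sequence.

```python
# ScMNN9-03 with the stray spaces from the listing removed (5' -> 3').
SC_MNN9_03 = (
    "tatAATattATGTCACTTTCTCTTGTATCGTACCGCCTAAGAAAGAACCCGTGGGTTAACATT"
    "TTTCTACCTGTTTTGGCCATATTTCTAATATATATAATTTTTTTCCAGAGAGATCAATCT"
    "tcagtcagcgctctcgatggcgaccccg"
)

# Sequence of primer mnn9-GnTI-01, which forms the 3' end of ScMNN9-03.
GNTI_OVERLAP = "tcagtcagcgctctcgatggcgaccccg"

seq = SC_MNN9_03.upper()

# The signal-coding part runs from the first ATG up to the gnt1 overlap.
signal_start = seq.index("ATG")
signal = seq[signal_start:len(seq) - len(GNTI_OVERLAP)]

# Ssp I site (AATATT) directly upstream of the start codon.
has_ssp_i = seq[3:9] == "AATATT"

if __name__ == "__main__":
    print("signal length (nt):", len(signal))
    print("whole codons:", len(signal) // 3, "remainder:", len(signal) % 3)
    print("Ssp I site upstream of ATG:", has_ssp_i)
```

The signal-coding part is 114 nucleotides (38 codons), which agrees with nucleotides 1 to 114 of SEQ ID No.15 being described as the mnn9 localization signal.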

PCR reaction conditions: denaturation at 94 ℃ for 2 min, annealing at 52 ℃ for 30 sec, and extension at 72 ℃ for 5 min; then 30 cycles of denaturation at 94 ℃ for 30 sec, annealing at 52 ℃ for 30 sec, and extension at 72 ℃ for 1 min 30 sec; final extension at 72 ℃ for 10 min.

The PCR amplification product was separated by 0.8% agarose gel electrophoresis (8V/cm, 15 min), and a 1.3kb band was cut out under an ultraviolet lamp with a clean blade and recovered with a DNA recovery kit in the same manner as above.

(3) Construction of PGE-URA3-GAP1-mnn9-GnTI expression vector

The mnn9-gnt1 gene fragment PCR product obtained in step (2) was double-digested with Ssp I and EcoR I to obtain the gene fragment. The PGE-URA3-GAP1 vector (Yang et al., Construction of a Pichia pastoris expression system for Man5GlcNAc2 mammalian mannose-type glycoproteins, Chinese Journal of Biotechnology, 2011; 27: 108-17) was double-digested with Ssp I and EcoR I to obtain the large vector fragment. The gene fragment was ligated with the large vector fragment to obtain the recombinant plasmid, named PGE-URA3-GAP1-mnn9-GnTI. Sequencing confirmed that it was correct.

PGE-URA3-GAP1-mnn9-GnTI is a recombinant vector obtained by inserting a DNA molecule shown in SEQ ID No.15 between the restriction enzyme site Ssp I and EcoR I of the PGE-URA3-GAP1 vector.

2. Construction of recombinant yeast expressing exogenous N-acetylglucosamine transferase I

Yeast electrocompetent cells were prepared as described in step five. About 10 μg of PGE-URA3-GAP1-mnn9-GnTI plasmid was linearized with Nhe I to obtain the linearized PGE-URA3-GAP1-mnn9-GnTI plasmid for transformation.

The selected host bacterium is the W10 engineering bacterium constructed in the fifth step. The single clones formed on MD plates after transformation were named 1-8.

The genomic DNA of strain 1-8 was extracted by the glass bead method. PCR amplification with this genomic DNA as the template and HuGnTI-0.9k-01 and HuGnTI-0.9k-02 as primers gave a product of about 0.9 kb, showing that GnTI had been inserted into the genome, i.e., that the strain is a positive engineering strain (A in FIG. 11).

HuGnTI-0.9k-01: 5’-TGGACAAGCTGCTGCATTATC-3’;

HuGnTI-0.9k-02: 5’-CGGAACTGGAAGGTGACAATA-3’.

The result of DSA-FACE glycoform analysis of strain 1-8 (by the same method as described in example one) is shown in B in FIG. 11. After GnTI was introduced, the major glycoform of the protein expressed by the host strain was GlcNAcMan5GlcNAc2.

Seventhly, constructing glycosyl engineering yeast with mammal GalGlcNAcMan5GlcNAc2 and no fucose glycosylation structure

The glycosyl engineering yeast strain 1-8-4, which has the mammalian GalGlcNAcMan5GlcNAc2 and no fucose glycosylation structure, is an engineering strain obtained by inserting the kre2-GalE-GalT gene fragment (the nucleotide sequence is shown in SEQ ID No.16 and encodes the protein shown in SEQ ID No.11) into the genome of the host strain 1-8.

In SEQ ID No.16, nucleotides 1 to 294 from the 5' end are the kre2 localization signal, nucleotides 295 to 1317 from the 5' end are the galactose isomerase GalE coding gene, and nucleotides 1325 to 2394 from the 5' end are the galactose transferase GalT coding gene.

1. Construction of the galactose isomerase and galactose transferase (GalE+GalT) expression vector containing the kre2 localization signal

(1) Obtaining the human GalE and GalT genes

Full-length fragments of the human GalE and GalT genes were obtained by PCR from a human fetal liver cDNA library (purchased from Clontech Laboratories Inc., 1290 Terra Bella Ave., Mountain View, CA 94043, USA), using the upstream primer GalE5' and downstream primer GalE3' for GalE, and the upstream primer GalT5' and downstream primer GalT3' for GalT. PCR reaction conditions: pre-denaturation at 94 ℃ for 5 min; 30 cycles of denaturation at 94 ℃ for 30 sec, annealing at 52 ℃ for 30 sec, and extension at 72 ℃ for 1 min 30 sec; final extension at 72 ℃ for 10 min. The PCR products were separated by 0.8% agarose gel electrophoresis and recovered with a DNA recovery kit.
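The cycling conditions above can be written down as a small data structure, which makes it easy to add up the total hold time. This is a minimal sketch: the `Step` class and `program_seconds` function are illustrative names, and ramp times between temperatures are ignored.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    name: str
    temp_c: int
    seconds: int


def program_seconds(initial: Step, cycle: list, n_cycles: int, final: Step) -> int:
    """Total programmed hold time in seconds, ignoring ramping between steps."""
    return initial.seconds + n_cycles * sum(s.seconds for s in cycle) + final.seconds


# Conditions stated in the text for amplifying human GalE and GalT.
GAL_PCR = dict(
    initial=Step("pre-denaturation", 94, 5 * 60),
    cycle=[
        Step("denaturation", 94, 30),
        Step("annealing", 52, 30),
        Step("extension", 72, 90),  # 1 min 30 sec
    ],
    n_cycles=30,
    final=Step("final extension", 72, 10 * 60),
)

if __name__ == "__main__":
    total = program_seconds(**GAL_PCR)
    print(f"Total hold time: {total} s ({total / 60:.0f} min)")
```

The same structure can describe the other programs in this example by changing the extension time and the cycle count.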

GalE5’:5’-ATGAGAGTTCTGGTTACCGGTGGTA-3’;

GalE3’:5’-AGGGTACCATCGGGATATCCCTGTGGATGGC-3’(KpnI);

GalT5’:5’-ATGGTACCGGTGGTGGACGTGACCTTTCTCGTCTGCCA-3’(KpnI)。

GalT3’:5’-GCatttaaatttaGCTCGGTGTCCCGATGTCCACTGTGAT-3’(SwaI)。
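The restriction sites annotated after three of these primers can be confirmed by a simple substring search. The sketch below uses the standard recognition sequences of KpnI (GGTACC) and SwaI (ATTTAAAT); the function name `site_position` is illustrative.

```python
# Recognition sequences of the enzymes named in the primer annotations.
SITES = {"KpnI": "GGTACC", "SwaI": "ATTTAAAT"}

# Primer sequences copied from the text, with the enzyme annotated for each.
PRIMERS = {
    "GalE3'": ("AGGGTACCATCGGGATATCCCTGTGGATGGC", "KpnI"),
    "GalT5'": ("ATGGTACCGGTGGTGGACGTGACCTTTCTCGTCTGCCA", "KpnI"),
    "GalT3'": ("GCatttaaatttaGCTCGGTGTCCCGATGTCCACTGTGAT", "SwaI"),
}


def site_position(primer: str, enzyme: str) -> int:
    """Return the 0-based index of the enzyme's site in the primer, or -1."""
    return primer.upper().find(SITES[enzyme])


if __name__ == "__main__":
    for name, (seq, enzyme) in PRIMERS.items():
        pos = site_position(seq, enzyme)
        status = f"found at index {pos}" if pos >= 0 else "NOT FOUND"
        print(f"{name}: {enzyme} site {status}")
```

In each primer the site sits two bases in from the 5' end, leaving a short overhang ahead of the recognition sequence.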

(2) Obtaining the GalE-GalT DNA fragment containing the kre2 localization signal

Kre2 5': 5'-ATAATattAAACGATGGCCCTCTTTCTCAGTAAGAG-3' (SspI site AATatt underlined in the original);

Kre2 3'+GalE5': 5'-CACCGGtAACCAGaACTctCatGATCGGGGCAtctgccttttcagcggcagctttcagagccttggattc-3'.

The kre2 localization signal fragment was amplified from S. cerevisiae genomic DNA by PCR. PCR conditions were as above.

The recovered and purified GalE and GalT fragments were then joined to the coding sequence of the S. cerevisiae Kre2 Golgi localization signal. The Kre2-GalE-GalT gene fragment was amplified by PCR with Pyrobest DNA polymerase, using Kre2 5' as the upstream primer and GalT3' as the downstream primer, so that the product carries the Kre2 Golgi localization signal coding sequence followed by the GalE and GalT coding sequences.
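The primer sequences suggest how the kre2 and GalE fragments can be joined: the 5' end of the Kre2 3'+GalE5' primer appears to be the reverse complement of the start of the GalE5' primer, which would give the two PCR products a shared overlap. The sketch below checks this for the first 22 nucleotides; the variable name `KRE2_3_TAIL` and the choice of 22 nt are assumptions made for illustration.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence in upper case."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


# GalE upstream primer, copied from the text.
GALE_5 = "ATGAGAGTTCTGGTTACCGGTGGTA"

# First 22 nt of the Kre2 3'+GalE5' primer, copied from the text.
KRE2_3_TAIL = "CACCGGtAACCAGaACTctCat"


def tail_matches_start(tail: str, target: str) -> bool:
    """True if the reverse complement of `tail` equals the start of `target`."""
    rc = reverse_complement(tail)
    return target.upper().startswith(rc)


if __name__ == "__main__":
    print("Reverse complement of tail:", reverse_complement(KRE2_3_TAIL))
    print("Matches start of GalE5':", tail_matches_start(KRE2_3_TAIL, GALE_5))
```

If the check holds, the kre2 fragment ends with the first codons of GalE, so the two fragments can anneal to each other during the fusion PCR.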

PCR reaction conditions: denaturation at 94 ℃ for 2 min, annealing at 52 ℃ for 30 sec, and extension at 72 ℃ for 5 min; then 30 cycles of denaturation at 94 ℃ for 30 sec, annealing at 52 ℃ for 30 sec, and extension at 72 ℃ for 4 min 30 sec; final extension at 72 ℃ for 10 min.

The PCR amplification product was separated by 0.8% agarose gel electrophoresis (8V/cm, 15 min), and a 2.4kb band was cut out with a clean blade under an ultraviolet lamp and recovered with a DNA recovery kit in the same manner as above.

(3) Construction of PGE-URA3-GAP1-kre2-GalE-GalT vector

The kre2-GalE-GalT DNA molecule was first digested with SwaI, and the gene fragment was then phosphorylated with T4 PNK (Takara Biotechnology (Dalian) Co., Ltd.). The PGE-URA3-GAP1 vector was double-digested with Ssp I and SwaI to obtain the large vector fragment. The gene fragment was ligated with the large vector fragment to obtain a recombinant plasmid, designated PGE-URA3-GAP1-kre2-GalE-GalT. Sequencing confirmed that the construct was correct.

PGE-URA3-GAP1-kre2-GalE-GalT is a recombinant vector obtained by inserting the kre2-GalE-GalT DNA molecule shown in SEQ ID No.16 between the Ssp I and SwaI sites of the PGE-URA3-GAP1 vector.

2. Construction of recombinant yeast expressing exogenous galactose isomerase and galactose transferase

About 10 μg of PGE-URA3-GAP1-kre2-GalE-GalT plasmid was linearized with Nhe I to obtain the linearized PGE-URA3-GAP1-kre2-GalE-GalT plasmid for transformation. Yeast electroporation competent cells were prepared as described in step five above.

The selected host bacteria are 1-8 engineering bacteria constructed in the sixth step. The single clones formed on MD plates after transformation were named 1-8-4.

Genomic DNA of 1-8-4 was extracted by the glass bead method and used as the template for PCR with primers GalE-T (1.5k)-01 (5'-TGATAACCTCTGTAACAGTAAGCGC-3') and GalE-T (1.5k)-02 (5'-GGAGCTTAGCACGATTGAATATAGT-3'). A PCR product of 1.5 kb was obtained, showing that GalE-T had been inserted into the genome and that the clone was a positive engineered strain (A in FIG. 12).

The DSA-FACE glycoform analysis of strain 1-8-4 (same method as described in Example 1) is shown in B of FIG. 12. After introduction of galactose isomerase and galactose transferase, the major glycoform of the protein expressed by the host strain was GalGlcNAcMan5GlcNAc2.

Eighthly, constructing a glycoengineered yeast strain with the mammalian GalGlcNAcMan3GlcNAc2 glycan structure lacking fucosylation

The glycoengineered yeast strain 52-60, which has the mammalian GalGlcNAcMan3GlcNAc2 glycan structure without fucosylation, is the engineered strain obtained by inserting the MDSII DNA molecule (nucleotide sequence shown in SEQ ID No.17, encoding the protein shown in SEQ ID No.12) into the genome of host strain 1-8-4.

In SEQ ID No.17, nucleotides 1 to 108 from the 5' end are the mnn2 localization signal, and nucleotides 109 to 3303 are the gene encoding mannosidase II.
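The coordinates given for SEQ ID No.17 can be checked for contiguity and converted to segment lengths with a few lines of arithmetic. This is a sketch only; the dictionary keys and function names are illustrative, and the coordinates are the 1-based inclusive positions stated in the text.

```python
# 1-based inclusive coordinates stated in the text for SEQ ID No.17.
SEQ17_SEGMENTS = {
    "mnn2 localization signal": (1, 108),
    "mannosidase II coding region": (109, 3303),
}


def segment_length(start: int, end: int) -> int:
    """Length in nucleotides of a 1-based inclusive interval."""
    return end - start + 1


def is_contiguous(segments: dict) -> bool:
    """True if each segment starts right after the previous one ends."""
    ordered = sorted(segments.values())
    return all(b[0] == a[1] + 1 for a, b in zip(ordered, ordered[1:]))


if __name__ == "__main__":
    for name, (start, end) in SEQ17_SEGMENTS.items():
        n = segment_length(start, end)
        print(f"{name}: {n} nt, {n // 3} codons, remainder {n % 3}")
    print("Contiguous:", is_contiguous(SEQ17_SEGMENTS))
```

Both segments are whole multiples of three nucleotides and abut directly, which is what an in-frame fusion of the localization signal to the enzyme requires.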

1. Construction of mannosidase II (MDSII) expression vector containing mnn2 localization Signal

(1) Synthesis of MDSII Gene containing mnn2 localization Signal by Whole Gene Synthesis

The MDSII gene containing the mnn2 localization signal (SEQ ID No.17) was produced by whole-gene synthesis (Nanjing GenScript) and cloned into the pUC57 cloning vector to obtain pUC57-MDSII.

The upstream primer mnn2-MDSII-01 (5'-ATAATattAAACCatgctgcttaccaaaaggttttcaaagctgttc-3'; SspI site underlined in the original) and the downstream primer MDSII-02 (5'-GCTATTTAAATctattaCCTCAACTGGATTCGGAATGTGCTGATTTCCATTG-3'; SwaI site underlined in the original) were designed, and the full-length human MDSII gene fragment was amplified from pUC57-MDSII by PCR. PCR reaction conditions: pre-denaturation at 94 ℃ for 5 min; 30 cycles of denaturation at 94 ℃ for 30 sec, annealing at 52 ℃ for 30 sec, and extension at 72 ℃ for 4 min 30 sec; final extension at 72 ℃ for 10 min. The PCR product (SEQ ID No.17) was separated by 0.8% agarose gel electrophoresis and recovered with a DNA recovery kit.

(2) Construction of the PGE-URA3-arm3-GAP-mnn2-MDSII expression vector

The PCR product was first digested with SwaI, and the gene fragment was then phosphorylated with T4 PNK (Takara Biotechnology (Dalian) Co., Ltd.). The PGE-URA3-GAP1 vector was double-digested with Ssp I and SwaI to obtain the large vector fragment. The gene fragment was ligated with the large vector fragment to obtain a recombinant plasmid, designated PGE-URA3-arm3-GAP-mnn2-MDSII. Sequencing confirmed that the construct was correct.

PGE-URA3-arm3-GAP-mnn2-MDSII is a recombinant vector obtained by inserting the DNA molecule shown in SEQ ID No.17 into the Ssp I and Swa I cleavage sites of the PGE-URA3-GAP1 vector.

2. Construction of recombinant Yeast expressing exogenous mannosidase II

About 10 μg of PGE-URA3-arm3-GAP-mnn2-MDSII plasmid was linearized with Msc I to obtain the linearized PGE-URA3-arm3-GAP-mnn2-MDSII plasmid for transformation. Yeast electroporation competent cells were prepared as described in step five above.

The selected host strain is the 1-8-4 engineered strain constructed in step seven. The single clones formed on MD plates after transformation were named 52-60.

Genomic DNA of 52-60 was extracted by the glass bead method and used as the template for PCR with primers CeMNSII-1.2k-01 and CeMNSII-1.2k-02. A PCR product of 1.2 kb was obtained, showing that MDSII had been inserted into the genome and that the clone was a positive engineered strain (A in FIG. 13).

CeMNSII-1.2k-01:5’-CAGATGGATGAGCATAGAGTTA-3’;

CeMNSII-1.2k-02:5’-GACAAGAGGATAATGAAGAGAC-3’。

The DSA-FACE glycoform analysis of strain 52-60 is shown in C of FIG. 13. After introduction of exogenous mannosidase II, the major glycoform of the protein expressed by the host strain was GalGlcNAcMan3GlcNAc2.
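The glycoform names used in this example are compact composition strings, so the effect of each engineering step can be read off by counting residues. The sketch below parses such names and compares the major glycoform before and after mannosidase II was introduced; the parser and its function name `composition` are illustrative and handle only the residue labels Gal, GlcNAc and Man.

```python
import re
from collections import Counter

# A residue label followed by an optional count, e.g. "Man5" or "Gal".
_TOKEN = re.compile(r"(GlcNAc|Gal|Man)(\d*)")


def composition(name: str) -> Counter:
    """Parse a glycoform name such as 'GalGlcNAcMan5GlcNAc2' into residue counts."""
    counts = Counter()
    pos = 0
    for match in _TOKEN.finditer(name):
        if match.start() != pos:
            raise ValueError(f"unrecognized text in {name!r} at index {pos}")
        counts[match.group(1)] += int(match.group(2) or 1)
        pos = match.end()
    if pos != len(name):
        raise ValueError(f"unrecognized text in {name!r} at index {pos}")
    return counts


if __name__ == "__main__":
    before = composition("GalGlcNAcMan5GlcNAc2")  # strain 1-8-4
    after = composition("GalGlcNAcMan3GlcNAc2")   # strain 52-60
    print("Before:", dict(before))
    print("After: ", dict(after))
    print("Removed:", dict(before - after))
```

The only difference between the two compositions is two mannose residues, consistent with the trimming from Man5 to Man3 reported for strain 52-60.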

Ninthly, constructing a glycoengineered yeast strain with the mammalian Gal2GlcNAc2Man3GlcNAc2 glycan structure lacking fucosylation

The glycoengineered yeast strain 150L2, which has the mammalian Gal2GlcNAc2Man3GlcNAc2 glycan structure without fucosylation, is the engineered strain obtained by inserting the GnTII DNA molecule (nucleotide sequence shown in SEQ ID No.18, encoding the protein shown in SEQ ID No.13) into the genome of host strain 52-60.

In SEQ ID No.18, nucleotides 1 to 108 from the 5' end are the mnn2 localization signal, and nucleotides 109 to 1185 are the gene encoding N-acetylglucosamine transferase II.

1. Construction of the N-acetylglucosamine transferase II (GnTII) expression vector containing the mnn2 localization signal

(1) GnTII gene synthesized by whole gene synthesis mode

The GnTII gene containing the mnn2 localization signal (SEQ ID No.18) was produced by whole-gene synthesis (Nanjing GenScript) and cloned into the pUC57 cloning vector to obtain pUC57-GnTII.

The upstream primer mnn2-GnTII-01 (5'-ATAATattAAACCatgctgcttaccaaaaggttttcaaagctgttc-3'; SspI site underlined in the original) and the downstream primer GnTII-02 (5'-GCTatttaaatTTAtcactgcagtcttctataacttttac-3'; SwaI site underlined in the original) were designed, and the N-acetylglucosamine transferase II (GnTII) DNA molecule containing the mnn2 localization signal was amplified from pUC57-GnTII by PCR. PCR reaction conditions: pre-denaturation at 94 ℃ for 5 min; 30 cycles of denaturation at 94 ℃ for 30 sec, annealing at 52 ℃ for 30 sec, and extension at 72 ℃ for 2 min 30 sec; final extension at 72 ℃ for 10 min. The PCR products were separated by 0.8% agarose gel electrophoresis and recovered with a DNA recovery kit.

(2) Construction of PGE-URA3-arm3-GAP-mnn2-GnTII expression vector

The digestion and construction procedure was the same as for PGE-URA3-arm3-GAP-mnn2-MDSII; the resulting recombinant plasmid was designated PGE-URA3-arm3-GAP-mnn2-GnTII. Sequencing confirmed that the construct was correct.

PGE-URA3-arm3-GAP-mnn2-GnTII is a recombinant vector obtained by inserting the DNA molecule shown in SEQ ID No.18 into the Ssp I and Swa I cleavage sites of the PGE-URA3-GAP1 vector.

2. Construction of recombinant Yeast expressing exogenous N-acetylglucosamine transferase II

About 10 μg of the PGE-URA3-arm3-GAP-mnn2-GnTII plasmid was linearized with Msc I to obtain the linearized PGE-URA3-arm3-GAP-mnn2-GnTII plasmid for transformation. Yeast electroporation competent cells were prepared as described in step five above.

The selected host strain is the 52-60 engineered strain constructed in step eight. The single clone formed on the MD plate after transformation was designated 150L2.

Genomic DNA of 150L2 was extracted by the glass bead method and used as the template for PCR with primers RnGnTII-0.8k-01 and RnGnTII-0.8k-02. A PCR product of 0.8 kb was obtained, showing that GnTII had been inserted into the genome and that the clone was a positive engineered strain (B in FIG. 13).

RnGnTII-0.8k-01:5’-ATCAACAGTCTGATCTCTAGTG-3’;

RnGnTII-0.8k-02:5’-AGTTCATGGTCCCTAATATCTC-3’。

Tenthly, knocking out the anti-Her2 antibody gene in the engineered strain

The yeast strain 3-5-11, in which the anti-Her2 antibody gene is inactivated, is a recombinant yeast obtained by introducing the DNA molecule shown in SEQ ID No.19 (the knockout sequence for the anti-Her2 antibody light and heavy chain genes) into Pichia pastoris 150L2, where it undergoes homologous recombination with the homologous sequence in the 150L2 genome and knocks out the anti-Her2 antibody light and heavy chain genes in the yeast genome.

Construction of the anti-Her2 antibody light and heavy chain gene inactivation vector, preparation of the knockout plasmid, transformation of Pichia pastoris, and PCR identification of positive engineered strains were carried out by the same methods as in the steps above. The resulting strain with the anti-Her2 antibody gene inactivated was named 3-5-11.

Eleventhly, inactivating the O-mannose transferase I gene in the engineered strain

Because the host strain was found to be unstable and prone to losing the MDSI and MDSII genes, SEQ ID No.17 (MDSII) and SEQ ID No.14 (MDSI) were introduced sequentially into 3-5-11 before the O-mannose transferase I gene was inactivated, using the same methods as in step eight and step five of this example. This ensured two copies of each gene in the engineered strain, and the resulting host strain was named 670.

The yeast strain 7b, in which the O-mannosyltransferase I gene is inactivated, was obtained by insertional inactivation of the O-mannosyltransferase I DNA molecule shown in SEQ ID No.8 in Pichia pastoris 670; it is also designated GJK30. GJK30 was deposited in the China General Microbiological Culture Collection Center on March 18, 2020, under accession number CGMCC No. 19488.

1. Construction of O-mannosyltransferase Gene inactivation vector

The AOX1TT terminator sequence was obtained by PCR using plasmid pPIC9 (Invitrogen) as the template, with primers AOX1TT-5 (5'-tctacgcgtccttagacatgactgttcctcagt-3') and AOX1TT-3 (5'-tctacgcgtaagcttgcacaaacgaacttc-3'). The PCR product was purified and recovered with a PCR product recovery and purification kit (Beijing Dingguo Biotechnology Co., Ltd.) to obtain the AOX1TT terminator fragment.

The vector pYES2 (Invitrogen) used in the present invention carries the yeast URA3 selection marker and can be used for subsequent screening. To prevent the promoter of the URA3 gene on the vector from affecting other genes on the vector, an AOX1TT terminator was added after the URA3 gene. The construction was as follows: the recovered AOX1TT terminator fragment was digested with MluI, and the digested fragment was ligated with vector pYES2 that had been digested with MluI in the same way. The ligation product was transformed into E. coli competent cells Trans5α (Beijing TransGen Biotech Co., Ltd., catalog No. CD201) for amplification. The clone with the correct sequence was named Trans5α-pYES2-URA3-AOX1TT, and plasmid extraction yielded the recombinant vector with the AOX1TT terminator added after the URA3 gene, designated pYES2-URA3-AOX1TT.

To ensure that the constructed vector integrates site-specifically into the Pichia pastoris PMT1 gene, a fragment of the ORF region of the PMT1 gene was amplified by PCR for use as the homologous recombination fragment. To ensure that integration of the inactivation vector into the PMT1 gene inactivates it, stop codons in different combinations were added to the primers at both ends of the fragment, and a CYC1TT terminator was added to the 3' end of the PMT1 gene fragment.

Genomic DNA of Pichia pastoris JC308 (Invitrogen) was extracted by the glass bead method (A. Adams et al., Guide to Yeast Genetics Methods and Experimental Protocols, Science Press, 2000) and used as the template for PCR amplification of the PMT1 gene fragment with primers PMT1-IN-5 and PMT1-IN-3.

PMT1-IN-5:5’-tctatgcattaatgatagttaatgactaatagagtaaaacaagtcctcaagaggt-3’;

PMT1-IN-3: 5'-tgacataactaattacatgatctattagtcattaactatcattagatcagagtggggacgactaagaaagc-3'.

The amplified PMT1 gene fragment, which carries stop codons in different combinations at both ends, was named PMT1-IN.
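One way to see the "different combinations" of stop codons is to scan the PMT1-IN-5 primer in all three forward reading frames. The sketch below does this for the primer sequence given above; the function name `stops_by_frame` is illustrative, and the scan assumes the primer is read 5' to 3' on the sense strand.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}

# Upstream primer for the PMT1 fragment, copied from the text.
PMT1_IN_5 = "tctatgcattaatgatagttaatgactaatagagtaaaacaagtcctcaagaggt"


def stops_by_frame(seq: str) -> dict:
    """List the stop codons found in each of the three forward reading frames."""
    s = seq.upper()
    return {
        frame: [s[i:i + 3] for i in range(frame, len(s) - 2, 3) if s[i:i + 3] in STOP_CODONS]
        for frame in range(3)
    }


if __name__ == "__main__":
    for frame, stops in stops_by_frame(PMT1_IN_5).items():
        print(f"frame {frame}: {', '.join(stops) if stops else 'no stop codon'}")
```

Every frame contains at least one stop codon, so translation entering the fragment from upstream would terminate regardless of its reading frame.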

PCR conditions for amplifying the PMT1 gene fragment: pre-denaturation at 94 ℃ for 5 min; 25 cycles of denaturation at 94 ℃ for 30 s, annealing at 55 ℃ for 30 s, and extension at 72 ℃ for 1 min 40 s; final extension at 72 ℃ for 10 min. The PCR product was recovered to obtain the PMT1 gene fragment.

The CYC1TT terminator fragment was obtained by PCR using plasmid pYES2, which contains the CYC1TT terminator, as the template, with primers CYC1TT-5 (5'-gctttcttagtcgtccccactctgatctaatgatagttaatgactaatagatcatgtaattagttatgtca-3') and CYC1TT-3 (5'-gcaaattaaagccttcgagcgtc-3'). PCR conditions: pre-denaturation at 94 ℃ for 5 min; 25 cycles of denaturation at 94 ℃ for 30 s, annealing at 55 ℃ for 30 s, and extension at 72 ℃ for 1 min; final extension at 72 ℃ for 10 min. The PCR product was recovered as the CYC1TT terminator fragment.

The recovered CYC1TT terminator fragment and PMT1-IN fragment were then used together as templates for PCR with primers PMT1-IN-5 and CYC1TT-3, joining PMT1-IN and CYC1TT into the PMT1-IN-CYC1TT fusion fragment. PCR conditions: pre-denaturation at 94 ℃ for 5 min; 25 cycles of denaturation at 94 ℃ for 30 s, annealing at 55 ℃ for 30 s, and extension at 72 ℃ for 2.4 min; final extension at 72 ℃ for 10 min. The PCR product was recovered as the PMT1-IN-CYC1TT fusion fragment. The recovered product was digested with NsiI, phosphorylated, and ligated with the vector backbone obtained by digesting pYES2-URA3-AOX1TT with NsiI and StuI; the recombinant vector with the correct sequence is the PMT1 insertion inactivation vector PMT1-IN-pYES2.
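The fusion step relies on the PMT1-IN and CYC1TT fragments sharing a complementary end. A quick check of the two inner primers suggests that PMT1-IN-3 and CYC1TT-5 are exact reverse complements of each other. The sketch below tests this; the stray space inside the PMT1-IN-3 sequence as printed is treated as an extraction artifact and removed, which is an assumption.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence in upper case."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


# Inner primers copied from the text (space inside PMT1-IN-3 removed).
PMT1_IN_3 = "tgacataactaattacatgatctattagtcattaactatcattagatcagagtggggacgactaagaaagc"
CYC1TT_5 = "gctttcttagtcgtccccactctgatctaatgatagttaatgactaatagatcatgtaattagttatgtca"


def are_reverse_complements(a: str, b: str) -> bool:
    """True if sequence `a` is the exact reverse complement of sequence `b`."""
    return a.upper() == reverse_complement(b)


if __name__ == "__main__":
    print("PMT1-IN-3 length:", len(PMT1_IN_3))
    print("CYC1TT-5 length: ", len(CYC1TT_5))
    print("Exact reverse complements:", are_reverse_complements(PMT1_IN_3, CYC1TT_5))
```

Because the two primers pair along their full length, the 3' end of the PMT1-IN product and the 5' end of the CYC1TT product can anneal, and the outer primers PMT1-IN-5 and CYC1TT-3 then amplify the joined fragment.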

The amplified PMT1 gene fragment carries stop codons in different combinations at its front and rear ends, and a CYC1TT terminator at its rear end, so that the PMT1 gene cannot be expressed once the vector is correctly integrated into the genome. The pYES2 vector contains the Pichia pastoris URA3 gene, and to prevent the URA3 promoter from driving the PMT1 gene, the AOX1TT terminator was inserted after the URA3 gene. With the designed primers, the CYC1TT terminator fragment (272 bp) and the PMT1 fragment (907 bp) were obtained, consistent with the theoretical sizes. The fusion fragment of PMT1-IN and CYC1TT is 1135 bp. PCR identification and sequencing confirmed that the vector PMT1-IN-pYES2 was constructed successfully.

2. Construction of PMT1 Gene-inactivated Strain

Yeast 670 competent cells were prepared as follows:

A single colony of 670 was picked and inoculated into 2 mL of YPD+U medium (YPD medium supplemented with uracil to 100 μg/mL) and cultured on a shaker at 25 ℃ and 170 r/min for 48 h. Then 500 μL of the culture was inoculated into 100 mL of YPD+U medium and cultured at 25 ℃ and 170 r/min for 24 h, to an OD600 of 1.0. The culture was centrifuged at 4 ℃ and 6000 r/min for 6 min, and the cells were resuspended in 15 mL of cold sterile water; centrifuged again under the same conditions and resuspended in 15 mL of cold sterile water; centrifuged at 4 ℃ and 6000 r/min for 6 min and resuspended in 15 mL of cold 1 mol/L sorbitol; and centrifuged again under the same conditions. The supernatant was discarded and the cells were resuspended in 1 mL of cold 1 mol/L sorbitol, giving a final volume of about 1.5 mL. These are the yeast 670 competent cells, which were kept on ice until use.

Electroporation of the PMT1 insertion inactivation vector PMT1-IN-pYES2: the vector was digested and linearized with EcoRV, and the final product was dissolved in 20 μL of ddH2O to give the linearized plasmid. 85 μL of 670 competent cells and the linearized plasmid were mixed in an electroporation cuvette and kept on ice for 5 min, then electroporated (2 kV) according to the conditions in the Pichia pastoris electroporation manual. Immediately after the pulse, 700 μL of 1 M sorbitol was added, and the mixture was transferred to a 1.5 mL centrifuge tube, left at 25 ℃ for 1 h, spread on MD+RH plates (solid MD medium supplemented with histidine and arginine, each at 100 μg/mL), and cultured at 25 ℃. Genomic DNA was extracted from clones that grew on the plates and identified by PCR with primers PMT1-ORF-OUT-5 and PMT1-ORF-OUT-3, which flank the PMT1 gene in the genome. The clone with the correct identification was named 7b, i.e., GJK30.

PMT1-ORF-OUT-5:5’-aagacccatgccgaacacgac-3’;

PMT1-ORF-OUT-3:5’-gctctgaggcaccttgggtaa-3’。

The insertion inactivation vector integrates into the Pichia pastoris chromosome by insertional integration. Because the vector carries a fragment homologous to the PMT1 gene, integration is in theory site-specific, occurring within the PMT1 gene, and can be identified and screened with the designed specific primers. Clones growing on MD+RH plates were selected using the URA3 marker. PCR identification was performed with primers PMT1-ORF-OUT-5 and PMT1-ORF-OUT-3, which flank the PMT1 gene. If the PMT1-IN-pYES2 vector is correctly integrated into the PMT1 gene, these primers give a fragment of 8.6 kb, whereas the control (yeast X33) gives a fragment of 3 kb (FIG. 14). The results show that the PMT1-IN-pYES2 vector was correctly integrated into the PMT1 gene; this clone was designated 7b, i.e., GJK30. Because stop codons in different combinations and terminators were designed into the inserted vector, correct integration means that the PMT1 gene is not expressed.
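The screening logic described above amounts to comparing an observed band with two expected sizes. A minimal sketch of that comparison follows; the 0.5 kb tolerance and the function name `classify_band` are hypothetical choices for illustration and do not come from the original method.

```python
# Expected PCR product sizes (kb) with the flanking primers, as stated in the text.
EXPECTED_KB = {
    "vector integrated into PMT1": 8.6,
    "unmodified PMT1 locus": 3.0,
}


def classify_band(observed_kb: float, tolerance_kb: float = 0.5) -> str:
    """Assign an observed band to the nearest expected size within a tolerance."""
    label, size = min(EXPECTED_KB.items(), key=lambda kv: abs(kv[1] - observed_kb))
    if abs(size - observed_kb) <= tolerance_kb:
        return label
    return "unexpected size"


def insert_size_kb() -> float:
    """Size difference between the two expected products, in kb."""
    sizes = EXPECTED_KB
    return round(sizes["vector integrated into PMT1"] - sizes["unmodified PMT1 locus"], 1)


if __name__ == "__main__":
    for band in (8.5, 3.1, 5.0):
        print(f"{band} kb -> {classify_band(band)}")
    print("Implied size of integrated DNA:", insert_size_kb(), "kb")
```

The difference between the two expected products corresponds to the extra DNA placed between the flanking primers by integration of the linearized vector.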

Twelfthly, glycoform structure analysis of the GJK30 engineered strain

To check whether the glycoform of the final GJK30 strain was correct, a reporter protein was introduced into it, as in Example 1. The anti-Her2 antibody served as the reporter protein; the construction of its expression vector and the vector transformation method are disclosed in the earlier patent application (see Example 1). By this method the anti-Her2 antibody expression vector was transferred into the GJK30 host strain to obtain the GJK30-HL engineered strain expressing the anti-Her2 antibody. Its glycoform was compared with the one obtained previously from a control recombinant strain, in which the Her2 antibody expression vector had been transferred into the GJK08 strain constructed in Example 1 of Chinese patent application 201410668305.X. The control differs from GJK30-HL in three respects: GJK30-HL has β-mannose transferases I-IV knocked out, whereas the control has only β-mannose transferase II knocked out; GJK30-HL has O-mannose transferase I inactivated, whereas the control does not; and GJK30-HL received exogenous MDSI and MDSII twice, whereas the control received them once. Both glycoforms contain the Gal2GlcNAc2Man3GlcNAc2 structure, but its proportion differs markedly: it is below 50% in the earlier strain (A in FIG. 15) and above 60% in the GJK30 strain, whose overall glycoform profile is also simpler and more uniform (B in FIG. 15).
According to reports in the literature, the Gal2GlcNAc2Man3GlcNAc2 glycan structure affects the biological activity of proteins, such as the ADCC and CDC activity of antibodies, so its proportion directly affects the properties of the protein. The glycoform was analyzed by digestion with commercially available glycosidases (New England Biolabs, Beijing), as shown in C of FIG. 15. Because Gal2GlcNAc2Man3GlcNAc2 (G2) has no terminal N-acetylglucosamine, the structure is unchanged by β-N-acetylglucosaminidase, but the exoglycosidase β1,4-galactosidase removes the two galactose residues to give GlcNAc2Man3GlcNAc2 (G0). When both exoglycosidases act together, galactose (Gal) and N-acetylglucosamine (GlcNAc) are removed in turn, leaving the Man3GlcNAc2 structure, which confirms that the expressed glycoform is correct.
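The logic of the exoglycosidase digestion can be illustrated with a toy model in which each enzyme removes only residues exposed at the non-reducing ends. This is a simplified sketch, not a structural model: the `Glycan` class is illustrative, and it assumes that each galactose caps exactly one antennal GlcNAc.

```python
from dataclasses import dataclass, replace


def _part(label: str, n: int) -> str:
    """Format one residue block of a glycoform name, e.g. 'Man3' or 'Gal'."""
    if n == 0:
        return ""
    return label if n == 1 else f"{label}{n}"


@dataclass(frozen=True)
class Glycan:
    gal: int              # terminal galactose residues
    antenna_glcnac: int   # GlcNAc residues on the antennae
    man: int              # mannose residues
    core_glcnac: int = 2  # chitobiose core

    def name(self) -> str:
        return (_part("Gal", self.gal) + _part("GlcNAc", self.antenna_glcnac)
                + _part("Man", self.man) + _part("GlcNAc", self.core_glcnac))


def galactosidase(g: Glycan) -> Glycan:
    """Remove all terminal galactose residues."""
    return replace(g, gal=0)


def hexosaminidase(g: Glycan) -> Glycan:
    """Remove antennal GlcNAc residues that are not capped by galactose."""
    exposed = max(g.antenna_glcnac - g.gal, 0)
    return replace(g, antenna_glcnac=g.antenna_glcnac - exposed)


if __name__ == "__main__":
    g2 = Glycan(gal=2, antenna_glcnac=2, man=3)
    print("hexosaminidase only:", hexosaminidase(g2).name())
    print("galactosidase only: ", galactosidase(g2).name())
    print("both enzymes:       ", hexosaminidase(galactosidase(g2)).name())
```

In this model the hexosaminidase has no effect on G2 until the galactose caps are removed, which mirrors the digestion pattern described for C of FIG. 15.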

<110> Academy of Military Medical Sciences, Academy of Military Sciences of the Chinese People's Liberation Army

<120> construction method of engineered yeast for glycoprotein preparation and strain thereof

<130> GNCLN200956

<160> 20

<170> PatentIn version 3.5

<210> 1

<211> 404

<212> PRT

<213> Artificial sequence

<400> 1

Met Ala Lys Ala Asp Gly Ser Leu Leu Tyr Tyr Asn Pro His Asn Pro

1 5 10 15

Pro Arg Arg Tyr Tyr Phe Tyr Met Ala Ile Phe Ala Val Ser Val Ile

20 25 30

Cys Val Leu Tyr Gly Pro Ser Gln Gln Leu Ser Ser Pro Lys Ile Asp

35 40 45

Tyr Asp Pro Leu Thr Leu Arg Ser Leu Asp Leu Lys Thr Leu Glu Ala

50 55 60

Pro Ser Gln Leu Ser Pro Gly Thr Val Glu Asp Asn Leu Arg Arg Gln

65 70 75 80

Leu Glu Phe His Phe Pro Tyr Arg Ser Tyr Glu Pro Phe Pro Gln His

85 90 95

Ile Trp Gln Thr Trp Lys Val Ser Pro Ser Asp Ser Ser Phe Pro Lys

100 105 110

Asn Phe Lys Asp Leu Gly Glu Ser Trp Leu Gln Arg Ser Pro Asn Tyr

115 120 125

Asp His Phe Val Ile Pro Asp Asp Ala Ala Trp Glu Leu Ile His His

130 135 140

Glu Tyr Glu Arg Val Pro Glu Val Leu Glu Ala Phe His Leu Leu Pro

145 150 155 160

Glu Pro Ile Leu Lys Ala Asp Phe Phe Arg Tyr Leu Ile Leu Phe Ala

165 170 175

Arg Gly Gly Leu Tyr Ala Asp Met Asp Thr Met Leu Leu Lys Pro Ile

180 185 190

Glu Ser Trp Leu Thr Phe Asn Glu Thr Ile Gly Gly Val Lys Asn Asn

195 200 205

Ala Gly Leu Val Ile Gly Ile Glu Ala Asp Pro Asp Arg Pro Asp Trp

210 215 220

His Asp Trp Tyr Ala Arg Arg Ile Gln Phe Cys Gln Trp Ala Ile Gln

225 230 235 240

Ser Lys Arg Gly His Pro Ala Leu Arg Glu Leu Ile Val Arg Val Val

245 250 255

Ser Thr Thr Leu Arg Lys Glu Lys Ser Gly Tyr Leu Asn Met Val Glu

260 265 270

Gly Lys Asp Arg Gly Ser Asp Val Met Asp Trp Thr Gly Pro Gly Ile

275 280 285

Phe Thr Asp Thr Leu Phe Asp Tyr Met Thr Asn Val Asn Thr Thr Gly

290 295 300

His Ser Gly Gln Gly Ile Gly Ala Gly Ser Ala Tyr Tyr Asn Ala Leu

305 310 315 320

Ser Leu Glu Glu Arg Asp Ala Leu Ser Ala Arg Pro Asn Gly Glu Met

325 330 335

Leu Lys Glu Lys Val Pro Gly Lys Tyr Ala Gln Gln Val Val Leu Trp

340 345 350

Glu Gln Phe Thr Asn Leu Arg Ser Pro Lys Leu Ile Asp Asp Ile Leu

355 360 365

Ile Leu Pro Ile Thr Ser Phe Ser Pro Gly Ile Gly His Ser Gly Ala

370 375 380

Gly Asp Leu Asn His His Leu Ala Tyr Ile Arg His Thr Phe Glu Gly

385 390 395 400

Ser Trp Lys Asp

<210> 2

<211> 462

<212> PRT

<213> Artificial sequence

<400> 2

Met Ser Thr Asp Ser Asn Leu Gly Tyr Gly Ile Ser Ile Ser Gly Gly

1 5 10 15

Ser Arg Ser Thr Gln Ser Leu Gly Thr Ser Arg Val Thr Pro Ser Arg

20 25 30

Ser Ala Asn His Glu Gly Lys Glu Asn Lys Ala Phe Ser Met Ile Ser

35 40 45

Pro Lys Lys Leu Ile Asn Lys Leu Ser Lys Ser Ser Val Ser Ser Asn

50 55 60

Asn Thr Ser Ser Ser Asn His Asp Ser Phe Val Asp Arg Lys Tyr Lys

65 70 75 80

Ile Glu Ile Glu Asn Ser Phe Ser Asp Arg Ser Val Ser Glu Val Asp

85 90 95

Leu Leu Glu Asp Ser Leu Asp Thr Thr Glu Gly Asp Ser Gly Glu Asn

100 105 110

Leu Val Ser Thr Pro Thr Gln Val Thr Leu Arg Pro Lys Arg Gly Asn

115 120 125

Ser Gln Asp Arg Asn Glu Asn Arg Val Leu Lys Glu Lys Glu Thr Ala

130 135 140

Val Arg Glu Ser Gln Arg His Thr Gly Phe Phe Thr Glu Ser Met Leu

145 150 155 160

Ser Pro Ser Asp Gly Ser Arg Gln Asp Thr Ser Asp Ser Pro Gly Ser

165 170 175

Ile Ser Ile Pro Thr Ala Glu Leu Ser Lys Lys Asn Leu Ser Asp Val

180 185 190

Ser Lys Ser Thr Ser Glu Asn Ser His Asn Arg Lys Trp Glu Ala Arg

195 200 205

Ser Ser Leu Leu Pro Glu Asn Leu Ser Ser Ile His Leu Asp Asp Ser

210 215 220

Pro Ile Glu Ile Tyr Glu Asp Ala Glu Glu Ile Ile Asp Glu Thr Val

225 230 235 240

Glu Glu Pro Arg Ser Ser Ile Pro Leu Gln Asn Glu Trp Glu Met Glu

245 250 255

Asp Thr Ile Leu Glu Gly Arg Leu Val Gln Ser Ala Ser Asp Pro Val

260 265 270

Ile Thr Ser Asn Asp Ile Ser Lys Glu Leu Arg Lys Ser Ile Ser Thr

275 280 285

Pro Ala Leu Thr His Ser Asp Leu Val Asp Phe Arg Lys Val Ile Pro

290 295 300

Gly Ser Ser His Tyr His Val Phe Thr Asp Pro Lys Ser Pro Phe Thr

305 310 315 320

Glu Asp Pro Ser Gln Leu Ala Tyr His Lys Ile Arg Asp Arg Asn Phe

325 330 335

Asp Ala His Tyr Ser Thr Asp Pro Ile Arg Leu Ser Ser Gly Ser Ser

340 345 350

Ser Glu Gly Ser Asp Glu Lys Asn Leu Leu Leu Gly Ser Arg Lys Pro

355 360 365

Ser Asp Pro Tyr Arg Leu Pro Tyr Glu Asp Glu Asp Gly Tyr Arg Phe

370 375 380

Trp Thr Lys Thr Pro Leu Asn Arg Glu Cys Pro Lys Arg Val Ala Leu

385 390 395 400

Trp Leu Leu Val Gly Ala Ile Leu Ala Pro Pro Val Trp Ile Met Met

405 410 415

Tyr Val Gly Phe Leu Asp Ser Ser Val Gly Arg Leu Pro Pro Lys Tyr

420 425 430

Arg Val Ile Ser Gly Val Leu Ala Leu Ser Met Ile Ile Leu Thr Ala

435 440 445

Met Gly Ile Ala Val Gly Phe Ala Tyr Gly Leu Asn Asn Arg

450 455 460

<210> 3

<211> 652

<212> PRT

<213> Artificial sequence

<400> 3

Met Phe Lys Glu Thr Ser Lys Asn Leu Phe Gly Ser Ile Asn Thr Phe

1 5 10 15

Asn Thr Val Glu Tyr Val Met Tyr Met Met Leu Leu Leu Thr Ala Tyr

20 25 30

Phe Leu Asn His Leu Leu His Ser Leu Asp Asn Ile Asn His Leu Val

35 40 45

Glu Ser Asp Val Asn Tyr Gln Leu Leu Gln Arg Val Thr Asn Lys Val

50 55 60

Lys Leu Phe Asp Glu Glu Ala Val Leu Pro Phe Ala Lys Asn Leu Asn

65 70 75 80

Arg Arg Thr Glu Arg Phe Asp Pro Arg Leu Pro Val Ala Ala Tyr Leu

85 90 95

Arg Ser Leu Gln Asp Gln Tyr Ser Glu Leu Pro Gln Gly Thr Asp Leu

100 105 110

Asn Asp Ile Pro Pro Leu Glu Val Ser Phe His Trp Asp Asp Trp Leu

115 120 125

Ser Leu Gly Ile Ala Ser Thr Phe Trp Asp Ala Phe Asp Asn Tyr Asn

130 135 140

Lys Arg Gln Gly Glu Asn Ala Ile Ser Tyr Glu Gln Leu Gln Ala Ile

145 150 155 160

Leu Val Asn Asp Leu Glu Asp Phe Ser Pro Tyr Thr Ala His Ile Leu

165 170 175

His Ser Asn Val Glu Val Tyr Lys Tyr Arg Thr Ile Pro Gln Lys Ile

180 185 190

Val Tyr Met Ser Asn Lys Gly Tyr Phe Glu Leu Leu Val Thr Glu Lys

195 200 205

Glu Lys Leu Ser Asn Glu Gly Leu Trp Ser Ile Phe His Gln Lys Gln

210 215 220

Gly Gly Leu Asn Glu Phe Ser Ser Leu Asn Leu Ile Glu Glu Val Asp

225 230 235 240

Ala Leu Asp Glu Ile Tyr Asp Ser Lys Gly Leu Pro Ala Trp Asp Pro

245 250 255

Pro Phe Pro Glu Glu Leu Asp Ala Ser Asp Glu Asp Leu Pro Phe Asn

260 265 270

Ala Thr Glu Glu Leu Ala Lys Val Glu Gln Ile Lys Glu Pro Lys Leu

275 280 285

Glu Asp Ile Phe Tyr Gln Glu Gly Leu Gln His Gly Ile Gln Thr Leu

290 295 300

Pro Ser Asp Ala Ser Val Tyr Phe Pro Val Asn Tyr Val Glu Asn Asp

305 310 315 320

Pro Gly Leu Gln Ser His His Leu His Phe Pro Phe Phe Ser Gly Met

325 330 335

Val Leu Pro Arg Glu Ile His Ser Ser Val His His Met Asn Lys Ala

340 345 350

Phe Phe Leu Phe Ala Arg Gln His Gly Tyr Val Val Trp Phe Phe Tyr

355 360 365

Gly Asn Leu Ile Gly Trp Tyr Tyr Asn Gly Asn Asn His Pro Trp Asp

370 375 380

Ser Asp Ile Asp Ala Ile Met Pro Met Ala Glu Met Ala Arg Met Ala

385 390 395 400

His His His Asn Asn Thr Leu Ile Ile Glu Asn Pro His Asp Gly Tyr

405 410 415

Gly Thr Tyr Leu Leu Thr Ile Ser Pro Trp Phe Thr Lys Lys Thr Arg

420 425 430

Gly Gly Asn His Ile Asp Gly Arg Phe Val Asp Val Lys Arg Gly Thr

435 440 445

Tyr Ile Asp Leu Ser Ala Ile Ser Ala Met His Gly Ile Tyr Pro Asp

450 455 460

Trp Val Arg Asp Gly Val Lys Glu Asn Pro Lys Asn Leu Ala Leu Ala

465 470 475 480

Asp Lys Asn Gly Asn Trp Tyr Leu Thr Arg Asp Ile Leu Pro Leu Arg

485 490 495

Arg Thr Ile Phe Glu Gly Ser Arg Ser Tyr Thr Val Lys Asp Ile Glu

500 505 510

Asp Thr Leu Leu Arg Asn Tyr Gly Asp Lys Val Leu Ile Asn Thr Glu

515 520 525

Leu Ala Asp His Glu Trp His Asp Asp Trp Lys Met Trp Val Gln Lys

530 535 540

Lys Lys Tyr Cys Thr Tyr Glu Glu Phe Glu Asp Tyr Leu Ser Ala His

545 550 555 560

Gly Gly Val Glu Tyr Asp Glu Asp Gly Val Leu Thr Leu Glu Gly Ala

565 570 575

Cys Gly Phe Glu Glu Val Arg Gln Asp Trp Ile Ile Thr Arg Glu Ser

580 585 590

Val Asn Leu His Met Lys Glu Trp Glu Ala Ile Gln Arg Asn Glu Ser

595 600 605

Thr Thr Glu Tyr Thr Ala Lys Asp Leu Pro Arg Tyr Arg Pro Asp Ser

610 615 620

Phe Lys Asn Leu Leu Asp Gly Val Ser Asn His Gly Asn Gly Asn Val

625 630 635 640

Gly Lys Ile Glu His Val Lys Leu Glu His Asn Asp

645 650

<210> 4

<211> 594

<212> PRT

<213> Artificial sequence

<400> 4

Met Arg Ile Arg Ser Asn Val Leu Leu Leu Ser Thr Ala Gly Ala Leu

1 5 10 15

Ala Leu Val Trp Phe Ala Val Val Phe Ser Trp Asp Asp Lys Ser Ile

20 25 30

Phe Gly Ile Pro Thr Pro Gly His Ala Val Ala Ser Ala Tyr Asp Ser

35 40 45

Ser Val Thr Leu Gly Thr Phe Asn Asp Met Glu Val Asp Ser Tyr Val

50 55 60

Thr Asn Ile Tyr Asp Asn Ala Pro Val Leu Gly Cys Tyr Asp Leu Ser

65 70 75 80

Tyr His Gly Leu Leu Lys Val Ser Pro Lys His Glu Ile Leu Cys Asp

85 90 95

Met Lys Phe Ile Arg Ala Arg Val Leu Glu Thr Glu Ala Tyr Ala Ala

100 105 110

Leu Lys Asp Leu Glu His Lys Lys Leu Thr Glu Glu Glu Lys Ile Glu

115 120 125

Lys His Trp Phe Thr Phe Tyr Gly Ser Ser Val Phe Leu Pro Asp His

130 135 140

Asp Val His Tyr Leu Val Arg Arg Val Val Phe Ser Gly Glu Gly Lys

145 150 155 160

Ala Asn Arg Pro Ile Thr Ser Ile Leu Val Ala Gln Ile Tyr Asp Lys

165 170 175

Asn Trp Asn Glu Leu Asn Gly His Phe Leu Asn Val Leu Asn Pro Asn

180 185 190

Thr Gly Lys Leu Gln His His Ala Phe Pro Gln Val Leu Pro Ile Ala

195 200 205

Val Asn Trp Asp Arg Asn Ser Lys Tyr Arg Gly Gln Glu Asp Pro Arg

210 215 220

Val Val Leu Arg Arg Gly Arg Phe Gly Pro Asp Pro Leu Val Met Phe

225 230 235 240

Asn Thr Leu Thr Gln Asn Asn Lys Leu Arg Arg Leu Phe Thr Ile Ser

245 250 255

Pro Phe Asp Gln Tyr Lys Thr Val Met Tyr Arg Thr Asn Ala Phe Lys

260 265 270

Met Gln Thr Thr Glu Lys Asn Trp Val Pro Phe Phe Leu Lys Asp Asp

275 280 285

Gln Glu Ser Val His Phe Val Tyr Ser Phe Asn Pro Leu Arg Val Leu

290 295 300

Asn Cys Ser Leu Asp Asn Gly Ala Cys Asp Val Leu Phe Glu Leu Pro

305 310 315 320

His Asp Phe Gly Met Ser Ser Glu Leu Arg Gly Ala Thr Pro Met Leu

325 330 335

Asn Leu Pro Gln Ala Ile Pro Met Ala Asp Asp Lys Glu Ile Trp Val

340 345 350

Ser Phe Pro Arg Thr Arg Ile Ser Asp Cys Gly Cys Ser Glu Thr Met

355 360 365

Tyr Arg Pro Met Leu Met Leu Phe Val Arg Glu Gly Thr Asn Phe Phe

370 375 380

Ala Glu Leu Leu Ser Ser Ser Ile Asp Phe Gly Leu Glu Val Ile Pro

385 390 395 400

Tyr Thr Gly Asp Gly Leu Pro Cys Ser Ser Gly Gln Ser Val Leu Ile

405 410 415

Pro Asn Ser Ile Asp Asn Trp Glu Val Thr Gly Ser Asn Gly Glu Asp

420 425 430

Ile Leu Ser Leu Thr Phe Ser Glu Ala Asp Lys Ser Thr Ser Val Val

435 440 445

His Ile Arg Gly Leu Tyr Lys Tyr Leu Ser Glu Leu Asp Gly Tyr Gly

450 455 460

Gly Pro Glu Ala Glu Asp Glu His Asn Phe Gln Arg Ile Leu Ser Asp

465 470 475 480

Leu His Phe Asp Gly Lys Lys Thr Ile Glu Asn Phe Lys Lys Val Gln

485 490 495

Ser Cys Ala Leu Asp Ala Ala Lys Ala Tyr Cys Lys Glu Tyr Gly Val

500 505 510

Thr Arg Gly Glu Glu Asp Arg Leu Lys Asn Lys Glu Lys Glu Arg Lys

515 520 525

Ile Glu Glu Lys Arg Lys Lys Glu Glu Glu Arg Lys Lys Lys Glu Glu

530 535 540

Glu Lys Lys Lys Lys Glu Glu Glu Glu Lys Lys Lys Lys Glu Glu Glu

545 550 555 560

Glu Glu Glu Glu Lys Arg Leu Lys Glu Leu Lys Lys Lys Leu Lys Glu

565 570 575

Leu Gln Glu Glu Leu Glu Lys Gln Lys Asp Glu Val Lys Asp Thr Lys

580 585 590

Ala Lys

<210> 5

<211> 644

<212> PRT

<213> Artificial sequence

<400> 5

Met Arg Thr Arg Leu Asn Phe Leu Leu Leu Cys Ile Ala Ser Val Leu

1 5 10 15

Ser Val Ile Trp Ile Gly Val Leu Leu Thr Trp Asn Asp Asn Asn Leu

20 25 30

Gly Gly Ile Ser Leu Asn Gly Gly Lys Asp Ser Ala Tyr Asp Asp Leu

35 40 45

Leu Ser Leu Gly Ser Phe Asn Asp Met Glu Val Asp Ser Tyr Val Thr

50 55 60

Asn Ile Tyr Asp Asn Ala Pro Val Leu Gly Cys Thr Asp Leu Ser Tyr

65 70 75 80

His Gly Leu Leu Lys Val Thr Pro Lys His Asp Leu Ala Cys Asp Leu

85 90 95

Glu Phe Ile Arg Ala Gln Ile Leu Asp Ile Asp Val Tyr Ser Ala Ile

100 105 110

Lys Asp Leu Glu Asp Lys Ala Leu Thr Val Lys Gln Lys Val Glu Lys

115 120 125

His Trp Phe Thr Phe Tyr Gly Ser Ser Val Phe Leu Pro Glu His Asp

130 135 140

Val His Tyr Leu Val Arg Arg Val Ile Phe Ser Ala Glu Gly Lys Ala

145 150 155 160

Asn Ser Pro Val Thr Ser Ile Ile Val Ala Gln Ile Tyr Asp Lys Asn

165 170 175

Trp Asn Glu Leu Asn Gly His Phe Leu Asp Ile Leu Asn Pro Asn Thr

180 185 190

Gly Lys Val Gln His Asn Thr Phe Pro Gln Val Leu Pro Ile Ala Thr

195 200 205

Asn Phe Val Lys Gly Lys Lys Phe Arg Gly Ala Glu Asp Pro Arg Val

210 215 220

Val Leu Arg Lys Gly Arg Phe Gly Pro Asp Pro Leu Val Met Phe Asn

225 230 235 240

Ser Leu Thr Gln Asp Asn Lys Arg Arg Arg Ile Phe Thr Ile Ser Pro

245 250 255

Phe Asp Gln Phe Lys Thr Val Met Tyr Asp Ile Lys Asp Tyr Glu Met

260 265 270

Pro Arg Tyr Glu Lys Asn Trp Val Pro Phe Phe Leu Lys Asp Asn Gln

275 280 285

Glu Ala Val His Phe Val Tyr Ser Phe Asn Pro Leu Arg Val Leu Lys

290 295 300

Cys Ser Leu Asp Asp Gly Ser Cys Asp Ile Val Phe Glu Ile Pro Lys

305 310 315 320

Val Asp Ser Met Ser Ser Glu Leu Arg Gly Ala Thr Pro Met Ile Asn

325 330 335

Leu Pro Gln Ala Ile Pro Met Ala Lys Asp Lys Glu Ile Trp Val Ser

340 345 350

Phe Pro Arg Thr Arg Ile Ala Asn Cys Gly Cys Ser Arg Thr Thr Tyr

355 360 365

Arg Pro Met Leu Met Leu Phe Val Arg Glu Gly Ser Asn Phe Phe Val

370 375 380

Glu Leu Leu Ser Thr Ser Leu Asp Phe Gly Leu Glu Val Leu Pro Tyr

385 390 395 400

Ser Gly Asn Gly Leu Pro Cys Ser Ala Asp His Ser Val Leu Ile Pro

405 410 415

Asn Ser Ile Asp Asn Trp Glu Val Val Asp Ser Asn Gly Asp Asp Ile

420 425 430

Leu Thr Leu Ser Phe Ser Glu Ala Asp Lys Ser Thr Ser Val Ile His

435 440 445

Ile Arg Gly Leu Tyr Asn Tyr Leu Ser Glu Leu Asp Gly Tyr Gln Gly

450 455 460

Pro Glu Ala Glu Asp Glu His Asn Phe Gln Arg Ile Leu Ser Asp Leu

465 470 475 480

His Phe Asp Asn Lys Thr Thr Val Asn Asn Phe Ile Lys Val Gln Ser

485 490 495

Cys Ala Leu Asp Ala Ala Lys Gly Tyr Cys Lys Glu Tyr Gly Leu Thr

500 505 510

Arg Gly Glu Ala Glu Arg Arg Arg Arg Val Ala Glu Glu Arg Lys Lys

515 520 525

Lys Glu Lys Glu Glu Glu Glu Lys Lys Lys Lys Lys Glu Lys Glu Glu

530 535 540

Glu Glu Lys Lys Arg Ile Glu Glu Glu Lys Lys Lys Ile Glu Glu Lys

545 550 555 560

Glu Arg Lys Glu Lys Glu Lys Glu Glu Ala Glu Arg Lys Lys Leu Gln

565 570 575

Glu Met Lys Lys Lys Leu Glu Glu Ile Thr Glu Lys Leu Glu Lys Gly

580 585 590

Gln Arg Asn Lys Glu Ile Asp Pro Lys Glu Lys Gln Arg Glu Glu Glu

595 600 605

Glu Arg Lys Glu Arg Val Arg Lys Ile Ala Glu Lys Gln Arg Lys Glu

610 615 620

Ala Glu Lys Lys Glu Ala Glu Lys Lys Ala Asn Asp Lys Lys Asp Leu

625 630 635 640

Lys Ile Arg Gln

<210> 6

<211> 488

<212> PRT

<213> Artificial sequence

<400> 6

Met Tyr His Leu Ala Pro Arg Lys Lys Leu Leu Ile Trp Gly Gly Ser

1 5 10 15

Leu Gly Phe Val Leu Leu Leu Leu Ile Val Ala Ser Ser His Gln Arg

20 25 30

Ile Arg Ser Thr Ile Leu His Arg Thr Pro Ile Ser Thr Leu Pro Val

35 40 45

Ile Ser Gln Glu Val Ile Thr Ala Asp Tyr His Pro Thr Leu Leu Thr

50 55 60

Gly Phe Ile Pro Thr Asp Ser Asp Asp Ser Asp Cys Ala Asp Phe Ser

65 70 75 80

Pro Ser Gly Val Ile Tyr Ser Thr Asp Lys Leu Val Leu His Asp Ser

85 90 95

Leu Lys Asp Ile Arg Asp Ser Leu Leu Lys Thr Gln Tyr Lys Asp Leu

100 105 110

Val Thr Leu Glu Asp Glu Glu Lys Met Asn Ile Asp Asp Ile Leu Lys

115 120 125

Arg Trp Tyr Thr Leu Ser Gly Ser Ser Val Trp Ile Pro Gly Met Lys

130 135 140

Ala His Leu Val Val Ser Arg Val Met Tyr Leu Gly Thr Asn Gly Arg

145 150 155 160

Ser Asp Pro Leu Val Ser Phe Val Arg Val Gln Leu Phe Asp Pro Asp

165 170 175

Phe Asn Glu Leu Lys Asp Ile Ala Leu Lys Phe Ser Asp Lys Pro Asp

180 185 190

Gly Thr Val Ile Phe Pro Tyr Ile Leu Pro Val Asp Ile Pro Arg Glu

195 200 205

Gly Ser Arg Trp Leu Gly Pro Glu Asp Ala Lys Ile Ala Val Asn Pro

210 215 220

Glu Thr Pro Asp Asp Pro Ile Val Ile Phe Asn Met Gln Asn Ser Val

225 230 235 240

Asn Arg Ala Met Tyr Gly Phe Tyr Pro Phe Arg Pro Glu Asn Lys Gln

245 250 255

Val Leu Phe Ser Ile Lys Asp Glu Glu Pro Arg Lys Lys Glu Lys Asn

260 265 270

Trp Thr Pro Phe Phe Val Pro Gly Ser Pro Thr Thr Val Asn Phe Val

275 280 285

Tyr Asp Leu Gln Lys Leu Thr Ile Leu Lys Cys Ser Ile Ile Thr Gly

290 295 300

Ile Cys Glu Lys Glu Phe Val Ser Gly Asp Asp Gly Gln Asn His Gly

305 310 315 320

Ile Gly Ile Phe Arg Gly Gly Ser Asn Leu Val Pro Phe Pro Thr Ser

325 330 335

Phe Thr Asp Lys Asp Val Trp Val Gly Phe Pro Lys Thr His Met Glu

340 345 350

Ser Cys Gly Cys Ser Ser His Ile Tyr Arg Pro Tyr Leu Met Val Leu

355 360 365

Val Arg Lys Gly Asp Phe Tyr Tyr Lys Ala Phe Val Ser Thr Pro Leu

370 375 380

Asp Phe Gly Ile Asp Val Arg Ser Trp Glu Ser Ala Glu Ser Thr Ser

385 390 395 400

Cys Gln Thr Ala Lys Asn Val Leu Ala Val Asn Ser Ile Ser Asn Trp

405 410 415

Asp Leu Leu Asp Asp Gly Leu Asp Lys Asp Tyr Met Thr Ile Thr Leu

420 425 430

Ser Glu Ala Asp Val Val Asn Ser Val Leu Arg Val Arg Gly Ile Ala

435 440 445

Lys Phe Val Asp Asn Leu Thr Met Asp Asp Gly Ser Thr Thr Leu Ser

450 455 460

Thr Ser Asn Lys Ile Asp Glu Cys Ala Thr Thr Gly Ser Lys Gln Tyr

465 470 475 480

Cys Gln Arg Tyr Gly Glu Leu His

485

<210> 7

<211> 652

<212> PRT

<213> Artificial sequence

<400> 7

Met Val Asp Leu Phe Gln Trp Leu Lys Phe Tyr Ser Met Arg Arg Leu

1 5 10 15

Gly Gln Val Ala Ile Thr Leu Val Leu Leu Asn Leu Phe Val Phe Leu

20 25 30

Gly Tyr Lys Phe Thr Pro Ser Thr Val Ile Gly Ser Pro Ser Trp Glu

35 40 45

Pro Ala Val Val Pro Thr Val Phe Asn Glu Ser Tyr Leu Asp Ser Leu

50 55 60

Gln Phe Thr Asp Ile Asn Val Asp Ser Phe Leu Ser Asp Thr Asn Gly

65 70 75 80

Arg Ile Ser Val Thr Cys Asp Ser Leu Ala Tyr Lys Gly Leu Val Lys

85 90 95

Thr Ser Lys Lys Lys Glu Leu Asp Cys Asp Met Ala Tyr Ile Arg Arg

100 105 110

Lys Ile Phe Ser Ser Glu Glu Tyr Gly Val Leu Ala Asp Leu Glu Ala

115 120 125

Gln Asp Ile Thr Glu Glu Gln Arg Ile Lys Lys His Trp Phe Thr Phe

130 135 140

Tyr Gly Ser Ser Val Tyr Leu Pro Glu His Glu Val His Tyr Leu Val

145 150 155 160

Arg Arg Val Leu Phe Ser Lys Val Gly Arg Ala Asp Thr Pro Val Ile

165 170 175

Ser Leu Leu Val Ala Gln Leu Tyr Asp Lys Asp Trp Asn Glu Leu Thr

180 185 190

Pro His Thr Leu Glu Ile Val Asn Pro Ala Thr Gly Asn Val Thr Pro

195 200 205

Gln Thr Phe Pro Gln Leu Ile His Val Pro Ile Glu Trp Ser Val Asp

210 215 220

Asp Lys Trp Lys Gly Thr Glu Asp Pro Arg Val Phe Leu Lys Pro Ser

225 230 235 240

Lys Thr Gly Val Ser Glu Pro Ile Val Leu Phe Asn Leu Gln Ser Ser

245 250 255

Leu Cys Asp Gly Lys Arg Gly Met Phe Val Thr Ser Pro Phe Arg Ser

260 265 270

Asp Lys Val Asn Leu Leu Asp Ile Glu Asp Lys Glu Arg Pro Asn Ser

275 280 285

Glu Lys Asn Trp Ser Pro Phe Phe Leu Asp Asp Val Glu Val Ser Lys

290 295 300

Tyr Ser Thr Gly Tyr Val His Phe Val Tyr Ser Phe Asn Pro Leu Lys

305 310 315 320

Val Ile Lys Cys Ser Leu Asp Thr Gly Ala Cys Arg Met Ile Tyr Glu

325 330 335

Ser Pro Glu Glu Gly Arg Phe Gly Ser Glu Leu Arg Gly Ala Thr Pro

340 345 350

Met Val Lys Leu Pro Val His Leu Ser Leu Pro Lys Gly Lys Glu Val

355 360 365

Trp Val Ala Phe Pro Arg Thr Arg Leu Arg Asp Cys Gly Cys Ser Arg

370 375 380

Thr Thr Tyr Arg Pro Val Leu Thr Leu Phe Val Lys Glu Gly Asn Lys

385 390 395 400

Phe Tyr Thr Glu Leu Ile Ser Ser Ser Ile Asp Phe His Ile Asp Val

405 410 415

Leu Ser Tyr Asp Ala Lys Gly Glu Ser Cys Ser Gly Ser Ile Ser Val

420 425 430

Leu Ile Pro Asn Gly Ile Asp Ser Trp Asp Val Ser Lys Lys Gln Gly

435 440 445

Gly Lys Ser Asp Ile Leu Thr Leu Thr Leu Ser Glu Ala Asp Arg Asn

450 455 460

Thr Val Val Val His Val Lys Gly Leu Leu Asp Tyr Leu Leu Val Leu

465 470 475 480

Asn Gly Glu Gly Pro Ile His Asp Ser His Ser Phe Lys Asn Val Leu

485 490 495

Ser Thr Asn His Phe Lys Ser Asp Thr Thr Leu Leu Asn Ser Val Lys

500 505 510

Ala Ala Glu Cys Ala Ile Phe Ser Ser Arg Asp Tyr Cys Lys Lys Tyr

515 520 525

Gly Glu Thr Arg Gly Glu Pro Ala Arg Tyr Ala Lys Gln Met Glu Asn

530 535 540

Glu Arg Lys Glu Lys Glu Lys Lys Glu Lys Glu Ala Lys Glu Lys Leu

545 550 555 560

Glu Ala Glu Lys Ala Glu Met Glu Glu Ala Val Arg Lys Ala Gln Glu

565 570 575

Ala Ile Ala Gln Lys Glu Arg Glu Lys Glu Glu Ala Glu Gln Glu Lys

580 585 590

Lys Ala Gln Gln Glu Ala Lys Glu Lys Glu Ala Glu Glu Lys Ala Ala

595 600 605

Lys Glu Lys Glu Ala Lys Glu Asn Glu Ala Lys Lys Lys Ile Ile Val

610 615 620

Glu Lys Leu Ala Lys Glu Gln Glu Glu Ala Glu Lys Leu Glu Ala Lys

625 630 635 640

Lys Lys Leu Tyr Gln Leu Gln Glu Glu Glu Arg Ser

645 650

<210> 8

<211> 789

<212> PRT

<213> Artificial sequence

<400> 8

Met Cys Gln Ile Phe Leu Pro Gln Asn Val Thr Arg Cys Ser Val Ser

1 5 10 15

Leu Leu Thr Met Ser Lys Thr Ser Pro Gln Glu Val Pro Glu Asn Thr

20 25 30

Thr Glu Leu Lys Ile Ser Lys Gly Glu Leu Arg Pro Phe Ile Val Thr

35 40 45

Ser Pro Ser Pro Gln Leu Ser Lys Ser Arg Ser Val Thr Ser Thr Lys

50 55 60

Glu Lys Leu Ile Leu Ala Ser Leu Phe Ile Phe Ala Met Val Ile Arg

65 70 75 80

Phe His Asn Val Ala His Pro Asp Ser Val Val Phe Asp Glu Val His

85 90 95

Phe Gly Gly Phe Ala Arg Lys Tyr Ile Leu Gly Thr Phe Phe Met Asp

100 105 110

Val His Pro Pro Leu Ala Lys Leu Leu Phe Ala Gly Val Gly Ser Leu

115 120 125

Gly Gly Tyr Asp Gly Glu Phe Glu Phe Lys Lys Ile Gly Asp Glu Phe

130 135 140

Pro Glu Asn Val Pro Tyr Val Leu Met Arg Tyr Leu Pro Ser Gly Met

145 150 155 160

Gly Val Gly Thr Cys Ile Met Leu Tyr Leu Thr Leu Arg Ala Ser Gly

165 170 175

Cys Gln Pro Ile Val Cys Cys Ser Asp Asn Arg Ser Leu Ile Ile Glu

180 185 190

Asn Ala Asn Val Thr Ile Ser Arg Phe Ile Leu Leu Asp Ser Pro Met

195 200 205

Leu Phe Phe Ile Ala Ser Thr Val Tyr Ser Phe Lys Lys Phe Gln Ile

210 215 220

Gln Glu Pro Phe Thr Phe Gln Trp Tyr Lys Thr Leu Ile Ala Thr Gly

225 230 235 240

Val Ser Leu Gly Leu Ala Ala Ser Ser Lys Trp Val Gly Leu Phe Thr

245 250 255

Val Ala Trp Ile Gly Leu Ile Thr Ile Trp Asp Leu Trp Phe Ile Ile

260 265 270

Gly Asp Leu Thr Val Ser Val Lys Lys Ile Phe Gly His Phe Ile Thr

275 280 285

Arg Ala Val Ala Phe Leu Val Val Pro Thr Leu Ile Tyr Leu Thr Phe

290 295 300

Phe Ala Ile His Leu Gln Val Leu Thr Lys Glu Gly Asp Gly Gly Ala

305 310 315 320

Phe Met Ser Ser Val Phe Arg Ser Thr Leu Glu Gly Asn Ala Val Pro

325 330 335

Lys Gln Ser Leu Ala Asn Val Gly Leu Gly Ser Leu Val Thr Ile Arg

340 345 350

His Leu Asn Thr Arg Gly Gly Tyr Leu His Ser His Asn His Leu Tyr

355 360 365

Glu Gly Gly Ser Gly Gln Gln Gln Val Thr Leu Tyr Pro His Ile Asp

370 375 380

Ser Asn Asn Gln Trp Ile Val Gln Asp Tyr Asn Ala Thr Glu Glu Pro

385 390 395 400

Thr Glu Phe Val Pro Leu Lys Asp Gly Val Lys Ile Arg Leu Asn His

405 410 415

Lys Leu Thr Ser Arg Arg Leu His Ser His Asn Leu Arg Pro Pro Val

420 425 430

Thr Glu Gln Asp Trp Gln Asn Glu Val Ser Ala Tyr Gly His Glu Gly

435 440 445

Phe Gly Gly Asp Ala Asn Asp Asp Phe Val Val Glu Ile Ala Lys Asp

450 455 460

Leu Ser Thr Thr Glu Glu Ala Lys Glu Asn Val Arg Ala Ile Gln Thr

465 470 475 480

Val Phe Arg Leu Arg His Ala Met Thr Gly Cys Tyr Leu Phe Ser His

485 490 495

Glu Val Lys Leu Pro Lys Trp Ala Tyr Glu Gln Gln Glu Val Thr Cys

500 505 510

Ala Thr Gln Gly Ile Lys Pro Leu Ser Tyr Trp Tyr Val Glu Thr Asn

515 520 525

Glu Asn Pro Phe Leu Asp Lys Glu Val Asp Glu Ile Val Ser Tyr Pro

530 535 540

Val Pro Thr Phe Phe Gln Lys Val Ala Glu Leu His Ala Arg Met Trp

545 550 555 560

Lys Ile Asn Lys Gly Leu Thr Asp His His Val Tyr Glu Ser Ser Pro

565 570 575

Asp Ser Trp Pro Phe Leu Leu Arg Gly Ile Ser Tyr Trp Ser Lys Asn

580 585 590

His Ser Gln Ile Tyr Phe Ile Gly Asn Ala Val Thr Trp Trp Thr Val

595 600 605

Thr Ala Ser Ile Ala Leu Phe Ser Val Phe Leu Val Phe Ser Ile Leu

610 615 620

Arg Trp Gln Arg Gly Phe Gly Phe Ser Val Asp Pro Thr Val Phe Asn

625 630 635 640

Phe Asn Val Gln Met Leu His Tyr Ile Leu Gly Trp Val Leu His Tyr

645 650 655

Leu Pro Ser Phe Leu Met Ala Arg Gln Leu Phe Leu His His Tyr Leu

660 665 670

Pro Ser Leu Tyr Phe Gly Ile Leu Ala Leu Gly His Val Phe Glu Ile

675 680 685

Ile His Ser Tyr Val Phe Lys Asn Lys Gln Val Val Ser Tyr Ser Ile

690 695 700

Phe Val Leu Phe Phe Ala Val Ala Leu Ser Phe Phe Gln Arg Tyr Ser

705 710 715 720

Pro Leu Ile Tyr Ala Gly Arg Trp Thr Lys Asp Gln Cys Asn Glu Ser

725 730 735

Lys Ile Leu Lys Trp Asp Phe Asp Cys Asn Thr Phe Pro Ser His Thr

740 745 750

Ser Gln Tyr Glu Ile Trp Ala Ser Pro Val Gln Thr Ser Thr Pro Lys

755 760 765

Glu Gly Thr His Ser Glu Ser Thr Val Gly Glu Pro Asp Val Glu Lys

770 775 780

Leu Gly Glu Thr Val

785

<210> 9

<211> 512

<212> PRT

<213> Artificial sequence

<400> 9

Glu Ala Glu Ala Tyr Pro Lys Pro Gly Ala Thr Lys Arg Gly Ser Pro

1 5 10 15

Asn Pro Thr Arg Ala Ala Ala Val Lys Ala Ala Phe Gln Thr Ser Trp

20 25 30

Asn Ala Tyr His His Phe Ala Phe Pro His Asp Asp Leu His Pro Val

35 40 45

Ser Asn Ser Phe Asp Asp Glu Arg Asn Gly Trp Gly Ser Ser Ala Ile

50 55 60

Asp Gly Leu Asp Thr Ala Ile Leu Met Gly Asp Ala Asp Ile Val Asn

65 70 75 80

Thr Ile Leu Gln Tyr Val Pro Gln Ile Asn Phe Thr Thr Thr Ala Val

85 90 95

Ala Asn Gln Gly Ile Ser Val Phe Glu Thr Asn Ile Arg Tyr Leu Gly

100 105 110

Gly Leu Leu Ser Ala Tyr Asp Leu Leu Arg Gly Pro Phe Ser Ser Leu

115 120 125

Ala Thr Asn Gln Thr Leu Val Asn Ser Leu Leu Arg Gln Ala Gln Thr

130 135 140

Leu Ala Asn Gly Leu Lys Val Ala Phe Thr Thr Pro Ser Gly Val Pro

145 150 155 160

Asp Pro Thr Val Phe Phe Asn Pro Thr Val Arg Arg Ser Gly Ala Ser

165 170 175

Ser Asn Asn Val Ala Glu Ile Gly Ser Leu Val Leu Glu Trp Thr Arg

180 185 190

Leu Ser Asp Leu Thr Gly Asn Pro Gln Tyr Ala Gln Leu Ala Gln Lys

195 200 205

Gly Glu Ser Tyr Leu Leu Asn Pro Lys Gly Ser Pro Glu Ala Trp Pro

210 215 220

Gly Leu Ile Gly Thr Phe Val Ser Thr Ser Asn Gly Thr Phe Gln Asp

225 230 235 240

Ser Ser Gly Ser Trp Ser Gly Leu Met Asp Ser Phe Tyr Glu Tyr Leu

245 250 255

Ile Lys Met Tyr Leu Tyr Asp Pro Val Ala Phe Ala His Tyr Lys Asp

260 265 270

Arg Trp Val Leu Ala Ala Asp Ser Thr Ile Ala His Leu Ala Ser His

275 280 285

Pro Ser Thr Arg Lys Asp Leu Thr Phe Leu Ser Ser Tyr Asn Gly Gln

290 295 300

Ser Thr Ser Pro Asn Ser Gly His Leu Ala Ser Phe Ala Gly Gly Asn

305 310 315 320

Phe Ile Leu Gly Gly Ile Leu Leu Asn Glu Gln Lys Tyr Ile Asp Phe

325 330 335

Gly Ile Lys Leu Ala Ser Ser Tyr Phe Ala Thr Tyr Asn Gln Thr Ala

340 345 350

Ser Gly Ile Gly Pro Glu Gly Phe Ala Trp Val Asp Ser Val Thr Gly

355 360 365

Ala Gly Gly Ser Pro Pro Ser Ser Gln Ser Gly Phe Tyr Ser Ser Ala

370 375 380

Gly Phe Trp Val Thr Ala Pro Tyr Tyr Ile Leu Arg Pro Glu Thr Leu

385 390 395 400

Glu Ser Leu Tyr Tyr Ala Tyr Arg Val Thr Gly Asp Ser Lys Trp Gln

405 410 415

Asp Leu Ala Trp Glu Ala Phe Ser Ala Ile Glu Asp Ala Cys Arg Ala

420 425 430

Gly Ser Ala Tyr Ser Ser Ile Asn Asp Val Thr Gln Ala Asn Gly Gly

435 440 445

Gly Ala Ser Asp Asp Met Glu Ser Phe Trp Phe Ala Glu Ala Leu Lys

450 455 460

Tyr Ala Tyr Leu Ile Phe Ala Glu Glu Ser Asp Val Gln Val Gln Ala

465 470 475 480

Asn Gly Gly Asn Lys Phe Val Phe Asn Thr Glu Ala His Pro Phe Ser

485 490 495

Ile Arg Ser Ser Ser Arg Arg Gly Gly His Leu Ala His Asp Glu Leu

500 505 510

<210> 10

<211> 445

<212> PRT

<213> Artificial sequence

<400> 10

Met Ser Leu Ser Leu Val Ser Tyr Arg Leu Arg Lys Asn Pro Trp Val

1 5 10 15

Asn Ile Phe Leu Pro Val Leu Ala Ile Phe Leu Ile Tyr Ile Ile Phe

20 25 30

Phe Gln Arg Asp Gln Ser Ser Val Ser Ala Leu Asp Gly Asp Pro Ala

35 40 45

Ser Leu Thr Arg Glu Val Ile Arg Leu Ala Gln Asp Ala Glu Val Glu

50 55 60

Leu Glu Arg Gln Arg Gly Leu Leu Gln Gln Ile Gly Asp Ala Leu Ser

65 70 75 80

Ser Gln Arg Gly Arg Val Pro Thr Ala Ala Pro Pro Ala Gln Pro Arg

85 90 95

Val Pro Val Thr Pro Ala Pro Ala Val Ile Pro Ile Leu Val Ile Ala

100 105 110

Cys Asp Arg Ser Thr Val Arg Arg Cys Leu Asp Lys Leu Leu His Tyr

115 120 125

Arg Pro Ser Ala Glu Leu Phe Pro Ile Ile Val Ser Gln Asp Cys Gly

130 135 140

His Glu Glu Thr Ala Gln Ala Ile Ala Ser Tyr Gly Ser Ala Val Thr

145 150 155 160

His Ile Arg Gln Pro Asp Leu Ser Ser Ile Ala Val Pro Pro Asp His

165 170 175

Arg Lys Phe Gln Gly Tyr Tyr Lys Ile Ala Arg His Tyr Arg Trp Ala

180 185 190

Leu Gly Gln Val Phe Arg Gln Phe Arg Phe Pro Ala Ala Val Val Val

195 200 205

Glu Asp Asp Leu Glu Val Ala Pro Asp Phe Phe Glu Tyr Phe Arg Ala

210 215 220

Thr Tyr Pro Leu Leu Lys Ala Asp Pro Ser Leu Trp Cys Val Ser Ala

225 230 235 240

Trp Asn Asp Asn Gly Lys Glu Gln Met Val Asp Ala Ser Arg Pro Glu

245 250 255

Leu Leu Tyr Arg Thr Asp Phe Phe Pro Gly Leu Gly Trp Leu Leu Leu

260 265 270

Ala Glu Leu Trp Ala Glu Leu Glu Pro Lys Trp Pro Lys Ala Phe Trp

275 280 285

Asp Asp Trp Met Arg Arg Pro Glu Gln Arg Gln Gly Arg Ala Cys Ile

290 295 300

Arg Pro Glu Ile Ser Arg Thr Met Thr Phe Gly Arg Lys Gly Val Ser

305 310 315 320

His Gly Gln Phe Phe Asp Gln His Leu Lys Phe Ile Lys Leu Asn Gln

325 330 335

Gln Phe Val His Phe Thr Gln Leu Asp Leu Ser Tyr Leu Gln Arg Glu

340 345 350

Ala Tyr Asp Arg Asp Phe Leu Ala Arg Val Tyr Gly Ala Pro Gln Leu

355 360 365

Gln Val Glu Lys Val Arg Thr Asn Asp Arg Lys Glu Leu Gly Glu Val

370 375 380

Arg Val Gln Tyr Thr Gly Arg Asp Ser Phe Lys Ala Phe Ala Lys Ala

385 390 395 400

Leu Gly Val Met Asp Asp Leu Lys Ser Gly Val Pro Arg Ala Gly Tyr

405 410 415

Arg Gly Ile Val Thr Phe Gln Phe Arg Gly Arg Arg Val His Leu Ala

420 425 430

Pro Pro Leu Thr Trp Glu Gly Tyr Asp Pro Ser Trp Asn

435 440 445

<210> 11

<211> 804

<212> PRT

<213> Artificial sequence

<400> 11

Met Ala Leu Phe Leu Ser Lys Arg Leu Leu Arg Phe Thr Val Ile Ala

1 5 10 15

Gly Ala Val Ile Val Leu Leu Leu Thr Leu Asn Ser Asn Ser Arg Thr

20 25 30

Gln Gln Tyr Ile Pro Ser Ser Ile Ser Ala Ala Phe Asp Phe Thr Ser

35 40 45

Gly Ser Ile Ser Pro Glu Gln Gln Val Ile Ser Glu Glu Asn Asp Ala

50 55 60

Lys Lys Leu Glu Gln Ser Ala Leu Asn Ser Glu Ala Ser Glu Asp Ser

65 70 75 80

Glu Ala Met Asp Glu Glu Ser Lys Ala Leu Lys Ala Ala Ala Glu Lys

85 90 95

Ala Asp Ala Pro Ile Gly Gly Gly Pro Ala Gly Met Arg Val Leu Val

100 105 110

Thr Gly Gly Ser Gly Tyr Ile Gly Ser His Thr Cys Val Gln Leu Leu

115 120 125

Gln Asn Gly His Asp Val Ile Ile Leu Asp Asn Leu Cys Asn Ser Lys

130 135 140

Arg Ser Val Leu Pro Val Ile Glu Arg Leu Gly Gly Lys His Pro Thr

145 150 155 160

Phe Val Glu Gly Asp Ile Arg Asn Glu Ala Leu Met Thr Glu Ile Leu

165 170 175

His Asp His Ala Ile Asp Thr Val Ile His Phe Ala Gly Leu Lys Ala

180 185 190

Val Gly Glu Ser Val Gln Lys Pro Leu Glu Tyr Tyr Asp Asn Asn Val

195 200 205

Asn Gly Thr Leu Arg Leu Ile Ser Ala Met Arg Ala Ala Asn Val Lys

210 215 220

Asn Phe Ile Phe Ser Ser Ser Ala Thr Val Tyr Gly Asp Gln Pro Lys

225 230 235 240

Ile Pro Tyr Val Glu Ser Phe Pro Thr Gly Thr Pro Gln Ser Pro Tyr

245 250 255

Gly Lys Ser Lys Leu Met Val Glu Gln Ile Leu Thr Asp Leu Gln Lys

260 265 270

Ala Gln Pro Asp Trp Ser Ile Ala Leu Leu Arg Tyr Phe Asn Pro Val

275 280 285

Gly Ala His Pro Ser Gly Asp Met Gly Glu Asp Pro Gln Gly Ile Pro

290 295 300

Asn Asn Leu Met Pro Tyr Ile Ala Gln Val Ala Val Gly Arg Arg Asp

305 310 315 320

Ser Leu Ala Ile Phe Gly Asn Asp Tyr Pro Thr Glu Asp Gly Thr Gly

325 330 335

Val Arg Asp Tyr Ile His Val Met Asp Leu Ala Asp Gly His Val Val

340 345 350

Ala Met Glu Lys Leu Ala Asn Lys Pro Gly Val His Ile Tyr Asn Leu

355 360 365

Gly Ala Gly Val Gly Asn Ser Val Leu Asp Val Val Asn Ala Phe Ser

370 375 380

Lys Ala Cys Gly Lys Pro Val Asn Tyr His Phe Ala Pro Arg Arg Glu

385 390 395 400

Gly Asp Leu Pro Ala Tyr Trp Ala Asp Ala Ser Lys Ala Asp Arg Glu

405 410 415

Leu Asn Trp Arg Val Thr Arg Thr Leu Asp Glu Met Ala Gln Asp Thr

420 425 430

Trp His Trp Gln Ser Arg His Pro Gln Gly Tyr Pro Asp Gly Thr Gly

435 440 445

Gly Gly Arg Asp Leu Ser Arg Leu Pro Gln Leu Val Gly Val Ser Thr

450 455 460

Pro Leu Gln Gly Gly Ser Asn Ser Ala Ala Ala Ile Gly Gln Ser Ser

465 470 475 480

Gly Glu Leu Arg Thr Gly Gly Ala Arg Pro Pro Pro Pro Leu Gly Ala

485 490 495

Ser Ser Gln Pro Arg Pro Gly Gly Asp Ser Ser Pro Val Val Asp Ser

500 505 510

Gly Pro Gly Pro Ala Ser Asn Leu Thr Ser Val Pro Val Pro His Thr

515 520 525

Thr Ala Leu Ser Leu Pro Ala Cys Pro Glu Glu Ser Pro Leu Leu Val

530 535 540

Gly Pro Met Leu Ile Glu Phe Asn Met Pro Val Asp Leu Glu Leu Val

545 550 555 560

Ala Lys Gln Asn Pro Asn Val Lys Met Gly Gly Arg Tyr Ala Pro Arg

565 570 575

Asp Cys Val Ser Pro His Lys Val Ala Ile Ile Ile Pro Phe Arg Asn

580 585 590

Arg Gln Glu His Leu Lys Tyr Trp Leu Tyr Tyr Leu His Pro Val Leu

595 600 605

Gln Arg Gln Gln Leu Asp Tyr Gly Ile Tyr Val Ile Asn Gln Ala Gly

610 615 620

Asp Thr Ile Phe Asn Arg Ala Lys Leu Leu Asn Val Gly Phe Gln Glu

625 630 635 640

Ala Leu Lys Asp Tyr Asp Tyr Thr Cys Phe Val Phe Ser Asp Val Asp

645 650 655

Leu Ile Pro Met Asn Asp His Asn Ala Tyr Arg Cys Phe Ser Gln Pro

660 665 670

Arg His Ile Ser Val Ala Met Asp Lys Phe Gly Phe Ser Leu Pro Tyr

675 680 685

Val Gln Tyr Phe Gly Gly Val Ser Ala Leu Ser Lys Gln Gln Phe Leu

690 695 700

Thr Ile Asn Gly Phe Pro Asn Asn Tyr Trp Gly Trp Gly Gly Glu Asp

705 710 715 720

Asp Asp Ile Phe Asn Arg Leu Val Phe Arg Gly Met Ser Ile Ser Arg

725 730 735

Pro Asn Ala Val Val Gly Arg Cys Arg Met Ile Arg His Ser Arg Asp

740 745 750

Lys Lys Asn Glu Pro Asn Pro Gln Arg Phe Asp Arg Ile Ala His Thr

755 760 765

Lys Glu Thr Met Leu Ser Asp Gly Leu Asn Ser Leu Thr Tyr Gln Val

770 775 780

Leu Asp Val Gln Arg Tyr Pro Leu Tyr Thr Gln Ile Thr Val Asp Ile

785 790 795 800

Gly Thr Pro Ser

<210> 12

<211> 1101

<212> PRT

<213> Artificial sequence

<400> 12

Met Leu Leu Thr Lys Arg Phe Ser Lys Leu Phe Lys Leu Thr Phe Ile

1 5 10 15

Val Leu Ile Leu Cys Gly Leu Phe Val Ile Thr Asn Lys Tyr Met Asp

20 25 30

Glu Asn Thr Ser Pro Ala Gly Val Glu Asp Gly Pro Lys Ser Ser Gln

35 40 45

Ser Asn Phe Ser Gln Gly Ala Gly Ser His Leu Leu Pro Ser Gln Leu

50 55 60

Ser Leu Ser Val Asp Thr Ala Asp Cys Leu Phe Ala Ser Gln Ser Gly

65 70 75 80

Ser His Asn Ser Asp Val Gln Met Leu Asp Val Tyr Ser Leu Ile Ser

85 90 95

Phe Asp Asn Pro Asp Gly Gly Val Trp Lys Gln Gly Phe Asp Ile Thr

100 105 110

Tyr Glu Ser Asn Glu Trp Asp Thr Glu Pro Leu Gln Val Phe Val Val

115 120 125

Pro His Ser His Asn Asp Pro Gly Trp Leu Lys Thr Phe Asn Asp Tyr

130 135 140

Phe Arg Asp Lys Thr Gln Tyr Ile Phe Asn Asn Met Val Leu Lys Leu

145 150 155 160

Lys Glu Asp Ser Arg Arg Lys Phe Ile Trp Ser Glu Ile Ser Tyr Leu

165 170 175

Ser Lys Trp Trp Asp Ile Ile Asp Ile Gln Lys Lys Asp Ala Val Lys

180 185 190

Ser Leu Ile Glu Asn Gly Gln Leu Glu Ile Val Thr Gly Gly Trp Val

195 200 205

Met Pro Asp Glu Ala Thr Pro His Tyr Phe Ala Leu Ile Asp Gln Leu

210 215 220

Ile Glu Gly His Gln Trp Leu Glu Asn Asn Ile Gly Val Lys Pro Arg

225 230 235 240

Ser Gly Trp Ala Ile Asp Pro Phe Gly His Ser Pro Thr Met Ala Tyr

245 250 255

Leu Leu Asn Arg Ala Gly Leu Ser His Met Leu Ile Gln Arg Val His

260 265 270

Tyr Ala Val Lys Lys His Phe Ala Leu His Lys Thr Leu Glu Phe Phe

275 280 285

Trp Arg Gln Asn Trp Asp Leu Gly Ser Val Thr Asp Ile Leu Cys His

290 295 300

Met Met Pro Phe Tyr Ser Tyr Asp Ile Pro His Thr Cys Gly Pro Asp

305 310 315 320

Pro Lys Ile Cys Cys Gln Phe Asp Phe Lys Arg Leu Pro Gly Gly Arg

325 330 335

Phe Gly Cys Pro Trp Gly Val Pro Pro Glu Thr Ile His Pro Gly Asn

340 345 350

Val Gln Ser Arg Ala Arg Met Leu Leu Asp Gln Tyr Arg Lys Lys Ser

355 360 365

Lys Leu Phe Arg Thr Lys Val Leu Leu Ala Pro Leu Gly Asp Asp Phe

370 375 380

Arg Tyr Cys Glu Tyr Thr Glu Trp Asp Leu Gln Phe Lys Asn Tyr Gln

385 390 395 400

Gln Leu Phe Asp Tyr Met Asn Ser Gln Ser Lys Phe Lys Val Lys Ile

405 410 415

Gln Phe Gly Thr Leu Ser Asp Phe Phe Asp Ala Leu Asp Lys Ala Asp

420 425 430

Glu Thr Gln Arg Asp Lys Gly Gln Ser Met Phe Pro Val Leu Ser Gly

435 440 445

Asp Phe Phe Thr Tyr Ala Asp Arg Asp Asp His Tyr Trp Ser Gly Tyr

450 455 460

Phe Thr Ser Arg Pro Phe Tyr Lys Arg Met Asp Arg Ile Met Glu Ser

465 470 475 480

His Leu Arg Ala Ala Glu Ile Leu Tyr Tyr Phe Ala Leu Arg Gln Ala

485 490 495

His Lys Tyr Lys Ile Asn Lys Phe Leu Ser Ser Ser Leu Tyr Thr Ala

500 505 510

Leu Thr Glu Ala Arg Arg Asn Leu Gly Leu Phe Gln His His Asp Ala

515 520 525

Ile Thr Gly Thr Ala Lys Asp Trp Val Val Val Asp Tyr Gly Thr Arg

530 535 540

Leu Phe His Ser Leu Met Val Leu Glu Lys Ile Ile Gly Asn Ser Ala

545 550 555 560

Phe Leu Leu Ile Gly Lys Asp Lys Leu Thr Tyr Asp Ser Tyr Ser Pro

565 570 575

Asp Thr Phe Leu Glu Met Asp Leu Lys Gln Lys Ser Gln Asp Ser Leu

580 585 590

Pro Gln Lys Asn Ile Ile Arg Leu Ser Ala Glu Pro Arg Tyr Leu Val

595 600 605

Val Tyr Asn Pro Leu Glu Gln Asp Arg Ile Ser Leu Val Ser Val Tyr

610 615 620

Val Ser Ser Pro Thr Val Gln Val Phe Ser Ala Ser Gly Lys Pro Val

625 630 635 640

Glu Val Gln Val Ser Ala Val Trp Asp Thr Ala Asn Thr Ile Ser Glu

645 650 655

Thr Ala Tyr Glu Ile Ser Phe Arg Ala His Ile Pro Pro Leu Gly Leu

660 665 670

Lys Val Tyr Lys Ile Leu Glu Ser Ala Ser Ser Asn Ser His Leu Ala

675 680 685

Asp Tyr Val Leu Tyr Lys Asn Lys Val Glu Asp Ser Gly Ile Phe Thr

690 695 700

Ile Lys Asn Met Ile Asn Thr Glu Glu Gly Ile Thr Leu Glu Asn Ser

705 710 715 720

Phe Val Leu Leu Arg Phe Asp Gln Thr Gly Leu Met Lys Gln Met Met

725 730 735

Thr Lys Glu Asp Gly Lys His His Glu Val Asn Val Gln Phe Ser Trp

740 745 750

Tyr Gly Thr Thr Ile Lys Arg Asp Lys Ser Gly Ala Tyr Leu Phe Leu

755 760 765

Pro Asp Gly Asn Ala Lys Pro Tyr Val Tyr Thr Thr Pro Pro Phe Val

770 775 780

Arg Val Thr His Gly Arg Ile Tyr Ser Glu Val Thr Cys Phe Phe Asp

785 790 795 800

His Val Thr His Arg Val Arg Leu Tyr His Ile Gln Gly Ile Glu Gly

805 810 815

Gln Ser Val Glu Val Ser Asn Ile Val Asp Ile Arg Lys Val Tyr Asn

820 825 830

Arg Glu Ile Ala Met Lys Ile Ser Ser Asp Ile Lys Ser Gln Asn Arg

835 840 845

Phe Tyr Thr Asp Leu Asn Gly Tyr Gln Ile Gln Pro Arg Met Thr Leu

850 855 860

Ser Lys Leu Pro Leu Gln Ala Asn Val Tyr Pro Met Thr Thr Met Ala

865 870 875 880

Tyr Ile Gln Asp Ala Lys His Arg Leu Thr Leu Leu Ser Ala Gln Ser

885 890 895

Leu Gly Val Ser Ser Leu Asn Ser Gly Gln Ile Glu Val Ile Met Asp

900 905 910

Arg Arg Leu Met Gln Asp Asp Asn Arg Gly Leu Glu Gln Gly Ile Gln

915 920 925

Asp Asn Lys Ile Thr Ala Asn Leu Phe Arg Ile Leu Leu Glu Lys Arg

930 935 940

Ser Ala Val Asn Thr Glu Glu Glu Lys Lys Ser Val Ser Tyr Pro Ser

945 950 955 960

Leu Leu Ser His Ile Thr Ser Ser Leu Met Asn His Pro Val Ile Pro

965 970 975

Met Ala Asn Lys Phe Ser Ser Pro Thr Leu Glu Leu Gln Gly Glu Phe

980 985 990

Ser Pro Leu Gln Ser Ser Leu Pro Cys Asp Ile His Leu Val Asn Leu

995 1000 1005

Arg Thr Ile Gln Ser Lys Val Gly Asn Gly His Ser Asn Glu Ala

1010 1015 1020

Ala Leu Ile Leu His Arg Lys Gly Phe Asp Cys Arg Phe Ser Ser

1025 1030 1035

Lys Gly Thr Gly Leu Phe Cys Ser Thr Thr Gln Gly Lys Ile Leu

1040 1045 1050

Val Gln Lys Leu Leu Asn Lys Phe Ile Val Glu Ser Leu Thr Pro

1055 1060 1065

Ser Ser Leu Ser Leu Met His Ser Pro Pro Gly Thr Gln Asn Ile

1070 1075 1080

Ser Glu Ile Asn Leu Ser Pro Met Glu Ile Ser Thr Phe Arg Ile

1085 1090 1095

Gln Leu Arg

1100

<210> 13

<211> 394

<212> PRT

<213> Artificial sequence

<400> 13

Met Leu Leu Thr Lys Arg Phe Ser Lys Leu Phe Lys Leu Thr Phe Ile

1 5 10 15

Val Leu Ile Leu Cys Gly Leu Phe Val Ile Thr Asn Lys Tyr Met Asp

20 25 30

Glu Asn Thr Ser Pro Ala Gly Ser Leu Val Tyr Gln Leu Asn Phe Asp

35 40 45

Gln Thr Leu Arg Asn Val Asp Lys Ala Gly Thr Trp Ala Pro Arg Glu

50 55 60

Leu Val Leu Val Val Gln Val His Asn Arg Pro Glu Tyr Leu Arg Leu

65 70 75 80

Leu Leu Asp Ser Leu Arg Lys Ala Gln Gly Ile Asp Asn Val Leu Val

85 90 95

Ile Phe Ser His Asp Phe Trp Ser Thr Glu Ile Asn Gln Leu Ile Ala

100 105 110

Gly Val Asn Phe Cys Pro Val Leu Gln Val Phe Phe Pro Phe Ser Ile

115 120 125

Gln Leu Tyr Pro Asn Glu Phe Pro Gly Ser Asp Pro Arg Asp Cys Pro

130 135 140

Arg Asp Leu Pro Lys Asn Ala Ala Leu Lys Leu Gly Cys Ile Asn Ala

145 150 155 160

Glu Tyr Pro Asp Ser Phe Gly His Tyr Arg Glu Ala Lys Phe Ser Gln

165 170 175

Thr Lys His His Trp Trp Trp Lys Leu His Phe Val Trp Glu Arg Val

180 185 190

Lys Ile Leu Arg Asp Tyr Ala Gly Leu Ile Leu Phe Leu Glu Glu Asp

195 200 205

His Tyr Leu Ala Pro Asp Phe Tyr His Val Phe Lys Lys Met Trp Lys

210 215 220

Leu Lys Gln Gln Glu Cys Pro Glu Cys Asp Val Leu Ser Leu Gly Thr

225 230 235 240

Tyr Ser Ala Ser Arg Ser Phe Tyr Gly Met Ala Asp Lys Val Asp Val

245 250 255

Lys Thr Trp Lys Ser Thr Glu His Asn Met Gly Leu Ala Leu Thr Arg

260 265 270

Asn Ala Tyr Gln Lys Leu Ile Glu Cys Thr Asp Thr Phe Cys Thr Tyr

275 280 285

Asp Asp Tyr Asn Trp Asp Trp Thr Leu Gln Tyr Leu Thr Val Ser Cys

290 295 300

Leu Pro Lys Phe Trp Lys Val Leu Val Pro Gln Ile Pro Arg Ile Phe

305 310 315 320

His Ala Gly Asp Cys Gly Met His His Lys Lys Thr Cys Arg Pro Ser

325 330 335

Thr Gln Ser Ala Gln Ile Glu Ser Leu Leu Asn Asn Asn Lys Gln Tyr

340 345 350

Met Phe Pro Glu Thr Leu Thr Ile Ser Glu Lys Phe Thr Val Val Ala

355 360 365

Ile Ser Pro Pro Arg Lys Asn Gly Gly Trp Gly Asp Ile Arg Asp His

370 375 380

Glu Leu Cys Lys Ser Tyr Arg Arg Leu Gln

385 390

<210> 14

<211> 1539

<212> DNA

<213> Artificial sequence

<400> 14

gaggctgaag cttatccaaa gccgggcgcc acaaaacgtg gatctcccaa ccctacgagg 60

gcggcagcag tcaaggccgc attccagacg tcgtggaacg cttaccacca ttttgccttt 120

ccccatgacg acctccaccc ggtcagcaac agctttgatg atgagagaaa cggctggggc 180

tcgtcggcaa tcgatggctt ggacacggct atcctcatgg gggatgccga cattgtgaac 240

acgatccttc agtatgtacc gcagatcaac ttcaccacga ctgcggttgc caaccaaggc 300

atctccgtgt tcgagaccaa cattcggtac ctcggtggcc tgctttctgc ctatgacctg 360

ttgcgaggtc ctttcagctc cttggcgaca aaccagaccc tggtaaacag ccttctgagg 420

caggctcaaa cactggccaa cggcctcaag gttgcgttca ccactcccag cggtgtcccg 480

gaccctaccg tcttcttcaa ccctaccgtc cggagaagtg gtgcatctag caacaacgtc 540

gctgaaattg gaagcctggt gctcgaatgg acacggttga gcgacctgac gggaaacccg 600

cagtatgccc agcttgcgca gaagggcgag tcgtatctcc tgaatccaaa gggaagcccg 660

gaggcatggc ctggcctgat tggaacgttt gtcagcacga gcaacggtac ctttcaggat 720

agcagcggca gctggtccgg cctcatggac agcttctacg agtacctgat caagatgtac 780

ctgtacgacc cggttgcgtt tgcacactac aaggatcgct gggtccttgc tgccgactcg 840

accattgcgc atctcgcctc tcacccgtcg acgcgcaagg acttgacctt tttgtcttcg 900

tacaacggac agtctacgtc gccaaactca ggacatttgg ccagttttgc cggtggcaac 960

ttcatcttgg gaggcattct cctgaacgag caaaagtaca ttgactttgg aatcaagctt 1020

gccagctcgt actttgccac gtacaaccag acggcttctg gaatcggccc cgaaggcttc 1080

gcgtgggtgg acagcgtgac gggcgccggc ggctcgccgc cctcgtccca gtccgggttc 1140

tactcgtcgg caggattctg ggtgacggca ccgtattaca tcctgcggcc ggagacgctg 1200

gagagcttgt actacgcata ccgcgtcacg ggcgactcca agtggcagga cctggcgtgg 1260

gaagcgttca gtgccattga ggacgcatgc cgcgccggca gcgcgtactc gtccatcaac 1320

gacgtgacgc aggccaacgg cgggggtgcc tctgacgata tggagagctt ctggtttgcc 1380

gaggcgctca agtatgcgta cctgatcttt gcggaggagt cggatgtgca ggtgcaggcc 1440

aacggcggga acaaatttgt ctttaacacg gaggcgcacc cctttagcat ccgttcatca 1500

tcacgacggg gcggccacct tgctcacgac gagttgtaa 1539

<210> 15

<211> 1338

<212> DNA

<213> Artificial sequence

<400> 15

atgtcacttt ctcttgtatc gtaccgccta agaaagaacc cgtgggttaa catttttcta 60

cctgttttgg ccatatttct aatatatata atttttttcc agagagatca atcttcagtc 120

agcgctctcg atggcgaccc cgccagcctc acccgggaag tgattcgcct ggcccaagac 180

gccgaggtgg agctggagcg gcagcgtggg ctgctgcagc agatcgggga tgccctgtcg 240

agccagcggg ggagggtgcc caccgcggcc cctcccgccc agccgcgtgt gcctgtgacc 300

cccgcgccgg cggtgattcc catcctggtc atcgcctgtg accgcagcac tgttcggcgc 360

tgcctggaca agctgctgca ttatcggccc tcggctgagc tcttccccat catcgttagc 420

caggactgcg ggcacgagga gacggcccag gccatcgcct cctacggcag cgcggtcacg 480

cacatccggc agcccgacct gagcagcatt gcggtgccgc cggaccaccg caagttccag 540

ggctactaca agatcgcgcg ccactaccgc tgggcgctgg gccaggtctt ccggcagttt 600

cgcttccccg cggccgtggt ggtggaggat gacctggagg tggccccgga cttcttcgag 660

tactttcggg ccacctatcc gctgctgaag gccgacccct ccctgtggtg cgtctcggcc 720

tggaatgaca acggcaagga gcagatggtg gacgccagca ggcctgagct gctctaccgc 780

accgactttt tccctggcct gggctggctg ctgttggccg agctctgggc tgagctggag 840

cccaagtggc caaaggcctt ctgggacgac tggatgcggc ggccggagca gcggcagggg 900

cgggcctgca tacgccctga gatctcaaga acgatgacct ttggccgcaa gggtgtgagc 960

cacgggcagt tctttgacca gcacctcaag tttatcaagc tgaaccagca gtttgtgcac 1020

ttcacccagc tggacctgtc ttacctgcag cgggaggcct atgaccgaga tttcctcgcc 1080

cgcgtctacg gtgctcccca gctgcaggtg gagaaagtga ggaccaatga ccggaaggag 1140

ctgggggagg tgcgggtgca gtatacgggc agggacagct tcaaggcttt cgccaaggct 1200

ctgggtgtca tggatgacct taagtcgggg gttccgagag ctggctaccg gggtattgtc 1260

accttccagt tccggggccg ccgtgtccac ctggcgcccc cactgacgtg ggagggctat 1320

gatcctagct ggaattag 1338

<210> 16

<211> 2397

<212> DNA

<213> Artificial sequence

<400> 16

atggccctct ttctcagtaa gagactgttg agatttaccg tcattgcagg tgcggttatt 60

gttctcctcc taacattgaa ttccaacagt agaactcagc aatatattcc gagttccatc 120

tccgctgcat ttgattttac ctcaggatct atatcccctg aacaacaagt catctctgag 180

gaaaatgatg ctaaaaaatt agagcaaagt gctctgaatt cagaggcaag cgaagactcc 240

gaagccatgg atgaagaatc caaggctctg aaagctgccg ctgaaaaggc agatgccccg 300

atcatgagag ttctggttac cggtggtagc ggttacattg gaagtcatac ctgtgtgcaa 360

ttactgcaaa acggtcatga tgtcatcatt cttgataacc tctgtaacag taagcgcagc 420

gtactgcctg ttatcgagcg tttaggcggc aaacatccaa cgtttgttga aggcgatatt 480

cgtaacgaag cgttgatgac cgagatcctg cacgatcacg ctatcgacac cgtgatccac 540

ttcgccgggc tgaaagccgt gggcgaatcg gtacaaaaac cgctggaata ttacgacaac 600

aatgtcaacg gcactctgcg cctgattagc gccatgcgcg ccgctaacgt caaaaacttt 660

atttttagct cctccgccac cgtttatggc gatcagccca aaattccata cgttgaaagc 720

ttcccgaccg gcacaccgca aagcccttac ggcaaaagca agctgatggt ggaacagatc 780

ctcaccgatc tgcaaaaagc ccagccggac tggagcattg ccctgctgcg ctacttcaac 840

ccggttggcg cgcatccgtc gggcgatatg ggcgaagatc cgcaaggcat tccgaataac 900

ctgatgccat acatcgccca ggttgctgta ggccgtcgcg actcgctggc gatttttggt 960

aacgattatc cgaccgaaga tggtactggc gtacgcgatt acatccacgt aatggatctg 1020

gcggacggtc acgtcgtggc gatggaaaaa ctggcgaaca agccaggcgt acacatctac 1080

aacctcggcg ctggcgtagg caacagcgtg ctggacgtgg ttaatgcctt cagcaaagcc 1140

tgcggcaaac cggttaatta tcattttgca ccgcgtcgcg agggcgacct tccggcctac 1200

tgggcggacg ccagcaaagc cgaccgtgaa ctgaactggc gcgtaacgcg cacactcgat 1260

gaaatggcgc aggacacctg gcactggcag tcacgccatc cacagggata tcccgatggt 1320

accggtggtg gacgtgacct ttctcgtctg ccacaactgg ttggagtttc tactccactg 1380

caaggtggat ctaactctgc tgctgcaatt ggtcaatcat ctggtgagct tcgtactgga 1440

ggtgctcgtc cccctccacc acttggtgct tcttcccagc cccgtccagg tggcgactcc 1500

agcccagtcg tggattctgg ccctggcccc gctagcaact tgacctcggt cccagtgccc 1560

cacaccaccg cactgtcgct gcccgcctgc cctgaggagt ccccgctgct tgtgggcccc 1620

atgctgattg agtttaacat gcctgtggac ctggagctcg tggcaaagca gaacccaaat 1680

gtgaagatgg gcggccgcta tgcccccagg gactgcgtct ctcctcacaa ggtggccatc 1740

atcattccat tccgcaaccg gcaggagcac ctcaagtact ggctatatta tttgcaccca 1800

gtcctgcagc gccagcagct ggactatggc atctatgtta tcaaccaggc gggagacact 1860

atattcaatc gtgctaagct cctcaatgtt ggctttcaag aagccttgaa ggactatgac 1920

tacacctgct ttgtgtttag tgacgtggac ctcattccaa tgaatgacca taatgcgtac 1980

aggtgttttt cacagccacg gcacatttcc gttgcaatgg ataagtttgg attcagccta 2040

ccttatgttc agtattttgg aggtgtctct gctctaagta aacaacagtt tctaaccatc 2100

aatggatttc ctaataatta ttggggttgg ggaggagaag atgacgacat ttttaacaga 2160

ttagttttta gaggcatgtc tatatctcgc ccaaatgctg tggtcgggag gtgtcgcatg 2220

atccgccact caagagacaa gaaaaatgaa cccaatcctc agaggtttga ccgaattgca 2280

cacacaaagg agacaatgct ctctgatggt ttgaactcac tcacctacca ggtgctggat 2340

gtacagagat acccattgta tacccaaatc acagtggaca tcgggacacc gagctaa 2397

<210> 17

<211> 3306

<212> DNA

<213> Artificial sequence

<400> 17

atgctgctta ccaaaaggtt ttcaaagctg ttcaagctga cgttcatagt tttgatattg 60

tgcgggctgt tcgtcattac aaacaaatac atggatgaga acacgtcgcc tgcaggcgtg 120

gaggatggtc cgaaaagttc acaaagcaat ttcagccaag gtgctggctc acatcttctg 180

ccctcacaat tatccctctc agttgacact gcagactgtc tgtttgcttc acaaagtgga 240

agtcacaatt cagatgtgca gatgttggat gtttacagtc taatttcttt tgacaatcca 300

gatggtggag tttggaagca aggatttgac attacttatg aatctaatga atgggacact 360

gaaccccttc aagtctttgt ggtgcctcat tcccataacg acccaggttg gttgaagact 420

ttcaatgact actttagaga caagactcag tatattttta ataacatggt cctaaagctg 480

aaagaagact cacggaggaa gtttatttgg tctgagatct cttacctttc aaagtggtgg 540

gatattatag atattcagaa gaaggatgct gttaaaagtt taatagaaaa tggtcagctt 600

gaaattgtga caggtggctg ggttatgcct gatgaagcta ctccacatta ttttgcctta 660

attgatcaac taattgaagg acatcagtgg ctggaaaata atataggagt gaaacctcgg 720

tccggctggg ctattgatcc ctttggacac tcaccaacaa tggcttatct tctaaaccgt 780

gctggacttt ctcacatgct tatccagaga gttcattatg cagttaaaaa acactttgca 840

ctgcataaaa cattggagtt tttttggaga cagaattggg atctgggatc tgtcacagat 900

attttatgcc acatgatgcc cttctacagc tatgacatcc ctcacacttg tggacctgat 960

cctaaaatat gctgccagtt tgattttaaa cgtcttcctg gaggcagatt tggttgtccc 1020

tggggagtcc ccccagaaac aatacatcct ggaaatgtcc aaagcagggc tcggatgcta 1080

ctagatcagt accgaaagaa gtcaaagctt tttcgaacca aagttctcct ggctccacta 1140

ggagatgatt tccgctactg tgaatacacg gaatgggatt tacagtttaa gaattatcag 1200

cagctttttg attatatgaa ttctcagtcc aagtttaaag ttaagataca gtttggaact 1260

ttatcagatt tttttgatgc gctggataaa gcagatgaaa ctcagagaga caagggccaa 1320

tcgatgttcc ctgttttaag tggagatttt ttcacttatg ccgatcgaga tgatcattac 1380

tggagtggct attttacatc cagacccttt tacaaacgaa tggacagaat catggaatct 1440

catttaaggg ctgctgaaat tctttactat ttcgccctga gacaagctca caaatacaag 1500

ataaataaat ttctctcatc atcactttac acggcactga cagaagccag aaggaatttg 1560

ggactgtttc aacatcatga tgctatcaca ggaactgcaa aagactgggt ggttgtggat 1620

tatggtacca gactttttca ttcgttaatg gttttggaga agataattgg aaattctgca 1680

tttcttctta ttgggaagga caaactcaca tacgactctt actctcctga taccttcctg 1740

gagatggatt tgaaacaaaa atcacaagat tctctgccac aaaaaaatat aataaggctg 1800

agtgcggagc caaggtacct tgtggtctat aatcctttag aacaagaccg aatctcgttg 1860

gtctcagtct atgtgagttc cccgacagtg caagtgttct ctgcttcagg aaaacctgtg 1920

gaagttcaag tcagcgcagt ttgggataca gcaaatacta tttcagaaac agcctatgag 1980

atctcttttc gagcacatat accgccattg ggactgaaag tgtataagat tttggaatca 2040

gcaagttcaa attcacattt agctgattat gtcttgtata agaataaagt agaagatagc 2100

ggaattttca ccataaagaa tatgataaat actgaagaag gtataacact agagaactcc 2160

tttgttttac ttcggtttga tcaaactgga cttatgaagc aaatgatgac taaagaagat 2220

ggtaaacacc atgaagtaaa tgtgcaattt tcatggtatg gaaccacaat taaaagagac 2280

aaaagtggtg cctacctctt cttacctgat ggtaatgcca agccttatgt ttacacaaca 2340

ccgccctttg tcagagtgac acatggaagg atttattcgg aagtgacttg cttttttgac 2400

catgttactc atagagtccg actataccac atacagggaa tagaaggaca gtctgtggaa 2460

gtttccaata ttgtggacat ccgaaaagta tataaccgtg agattgcaat gaaaatttct 2520

tctgatataa aaagccaaaa tagattttat actgacctaa atgggtacca gattcaacct 2580

agaatgacac tgagcaaatt gcctcttcaa gcaaatgtct atcccatgac cacaatggcc 2640

tatatccagg atgccaaaca tcgtttgaca ctgctctctg ctcagtcatt aggggtttcg 2700

agtttgaata gtggtcagat tgaagttatc atggatcgaa gactcatgca agatgataat 2760

cgtggccttg agcaaggtat ccaggataac aagattacag ctaatctatt tcgaatacta 2820

ctagaaaaaa gaagtgctgt taatacggaa gaagaaaaga agtcggtcag ttatccttct 2880

ctccttagcc acataacttc ttctctcatg aatcatccag tcattccaat ggcaaataag 2940

ttctcctcac ctacccttga gctgcaaggt gaattctctc cattacagtc atctttgcct 3000

tgtgacattc atctggttaa tttgagaaca atacagtcaa aggtgggcaa tgggcactcc 3060

aatgaggcag ccttgatcct ccacagaaaa gggtttgatt gtcggttctc tagcaaaggc 3120

acagggctgt tttgttctac tactcaggga aagatattgg tacagaaact tttaaacaag 3180

tttattgtcg aaagtctcac accttcatca ctatccttga tgcattcacc tcccggcact 3240

cagaatataa gtgagatcaa cttgagtcca atggaaatca gcacattccg aatccagttg 3300

aggtga 3306

<210> 18

<211> 1188

<212> DNA

<213> Artificial sequence

<400> 18

atgctgctta ccaaaaggtt ttcaaagctg ttcaagctga cgttcatagt tttgatattg 60

tgcgggctgt tcgtcattac aaacaaatac atggatgaga acacgtcgcc tgcaggctcc 120

ctggtgtacc agctgaactt tgatcagacc ctgaggaatg tagataaggc tggcacctgg 180

gccccccggg agctggtgct ggtggtccag gtgcataacc ggcccgaata cctcagactg 240

ctgctggact cacttcgaaa agcccaggga attgacaacg tcctcgtcat ctttagccat 300

gacttctggt cgaccgagat caatcagctg atcgccgggg tgaatttctg tccggttctg 360

caggtgttct ttcctttcag cattcagttg taccctaacg agtttccagg tagtgaccct 420

agagattgtc ccagagacct gccgaagaat gccgctttga aattggggtg catcaatgct 480

gagtatcccg actccttcgg ccattataga gaggccaaat tctcccagac caaacatcac 540

tggtggtgga agctgcattt tgtgtgggaa agagtgaaaa ttcttcgaga ttatgctggc 600

cttatacttt tcctagaaga ggatcactac ttagccccag acttttacca tgtcttcaaa 660

aagatgtgga aactgaagca gcaagagtgc cctgaatgtg atgttctctc cctggggacc 720

tatagtgcca gtcgcagttt ctatggcatg gctgacaagg tagatgtgaa aacttggaaa 780

tccacagagc acaatatggg tctagccttg acccggaatg cctatcagaa gctgatcgag 840

tgcacagaca ctttctgtac ttatgatgat tataactggg actggactct tcaatacttg 900

actgtatctt gtcttccaaa attctggaaa gtgctggttc ctcaaattcc taggatcttt 960

catgctggag actgtggtat gcatcacaag aaaacctgta gaccatccac tcagagtgcc 1020

caaattgagt cactcttaaa taataacaaa caatacatgt ttccagaaac tctaactatc 1080

agtgaaaagt ttactgtggt agccatttcc ccacctagaa aaaatggagg gtggggagat 1140

attagggacc atgaactctg taaaagttat agaagactgc agtgataa 1188

<210> 19

<211> 4921

<212> DNA

<213> Artificial sequence

<400> 19

ggcatacact attatcttat ctatattagt cgtcgccgtt gcttttggat cctcgtgtat 60

ctctggagca ttattcactg tggaagataa ttataatgtt tcattggaag ttgccatttt 120

gacagtttca ttgatggtct tgggtttctc cttgggtcca ttgttgtggt ctcctttatc 180

tgagcagatt ggaaggagat gggtttattt tatatccttg ggtctctaca caatttttaa 240

cattccttgc gctctatccc ctaatatcgg tggtctctta gtttgtcgat ttttgtgtgg 300

tgtttttagt tccagcgcac tttgtctggt tggtggttct atagctgaca tgcatccttc 360

tgaaacaaga ggtaaagcaa tcgcctattt tgcagcagct ccttatggtg gaccagttat 420

tggaccttta gtatgtggtt ggatcggtgt taaaaccaac agaatggatc ttatcttttg 480

ggtaaatatg ggatttgcag gatttatgtg gttactagtt gcctgcattc cagaaaccta 540

tcaaccagta attttaaaga accgagcaaa gaaattaaga atggagttga acaatcctaa 600

catcatgaca gagcaagaag ctaatccact aactttcaag gaattagtag ttacctgcct 660

ttataggcct cttatgtttg ttttcactga gcctgttttg gacatgatgt gtgtttacgt 720

ttgtcttatt tactcattgc tttatgcatt tttctttgca tacccagtta tatttaatga 780

gctttatggc tatgaagatg atttcatcgg cctgatgttg attccaatat tgataggagc 840

ctttttggcc ttagttacaa ctccaatttt ggaatccatg tacgtgaaaa tgtgtcaacg 900

aagaaaacca actcctgaag acagattggt aggagccatg attgggtctc ctttccctgc 960

aattgcccta tttattttgg gagcaacgtc ctacaagcat atcatttggg tcggtccagc 1020

atcttccggt atcgccttcg gttatggaat ggtactaatt tactactctt tgaataatta 1080

catcatcgac acctacgcca agtatgcagc tagtgctctg gcaacaaagg ttttcctgag 1140

gagtgctgga ggtgctgctt tcccactatt tactacacag atgtaccata aactagggct 1200

acagtgggcc agttggttgt tggcattcat ttcattagca atgattctca tcccattcgt 1260

tttctacatt tatggtgctc gtttgagggc caaaatgtgt aaagagaact acagtgagat 1320

gtgatgcatt aagaacaatc attcattaat ccttttcagc atatattatt tctaattaat 1380

tcatacttaa taacgaaaat atggtacctg ccctcacggt ggttacggtc taggaacgga 1440

acgtatctta gcatggttgt gcgacagatt cactgtgaaa gactgttcat tatacccacg 1500

tttcactggg agatgtaagc cttaggtgtt ttaccctgat tagataatac aataaccaac 1560

agaaatacga gaatctagac taatttcgat gattcatttt tctttttacc gcgctgcctc 1620

ttttggcaat tctttcacct atattctacc ttctctttcc ttttgttcta aacttattac 1680

cagctatcta tgtcgaatca agaagaaaga cttaaactgt ggggtggcag gtttactggg 1740

gctactgacc ccttgatgga tttgtataac gcttccttac cttacgacaa gaaaatgtac 1800

aaggtggatt tagaaggaac aaaagtttac actgagggcc tggagaaaat taatttgcta 1860

actaaagacg aactaagtga gattcatcgt ggtctcaaat tgattgaagc agagtgggca 1920

gaagggaagt ttgttgagaa gccaggggat gaggatattc acactgctaa tgaacgtcgc 1980

ttgggtgagt tgattggtcg tggaatctct ggtaaggttc ataccggaag gtctagaaat 2040

gatcaagttg ccactgatat gcggttgtat gtcagagaca atctaactca gttggctgac 2100

tatctgaagc agttcattca agtaatcatc aagagagctg aacaggaaat agacgtcttg 2160

atgcccggtt atactcactt gcaaagagct caaccaatca gatggtctca ctggttgagc 2220

atgtatgcta cctatttcac tgaagattat gagagactga atcaaatcgt taaaaggttg 2280

aacaaatccc cattgggagc tggagctttg gctggtcatc cttatggaat tgatcgtgaa 2340

tacattgctg agagattagg gtttgattct gttattggta attctttggc cgctgtttca 2400

gacagagatt ttgtagtcga aaccatgttc tggtcttcgt tgtttatgaa tcatatttct 2460

cgattctcag aagatttgat catttactcc actggagagt ttggatttat caagttggca 2520

gatgcttatt ctactggatc ttctctgatg cctcaaaaaa aaaacccaga ctctttggag 2580

ttattgaggg gtaaatctgg tagatgtttt ggggccttgg ctggtttcct catgtctatt 2640

aagtccattc cgtcaaccta taacaaagat atgcaagagg ataaggagcc tttatttgat 2700

actctaatca ctgtagagca ctcgattttg atagcatccg gtgtagtttc taccttgaac 2760

attgatgccg aacgaatgaa gaatgctcta actatggata tgctggctac agatcttgcc 2820

gactatttag ttagaagggg agttccattc agagaaactc accacatttc tggtgaatgt 2880

gtcagacaag ccgaggagtt gaacctttct ggtattgatc agttgtccct cgaacaattg 2940

aaatccattg actcccgttt tgaggctgat gtggcttcaa cgtttgactt tgaagccagt 3000

gttgaaaaaa gaactgccac cggaggaact tctaagactg ctgttttaaa gcaattggat 3060

gcactgaatg aaaagctaga gtcttgaagg ttttatactg agtttgttaa tgatacaata 3120

aactgttata gtacatacaa ttgaaactct cttatctata ctgggggacc ttctcgcaga 3180

atggtataaa tatctactaa ctgactgtcg tacggcctag gggtctcttc ttcgattatt 3240

tgcaggtcgg aacatccttc gtctgatgcg gatctcctga gacaaagttc acgggtatct 3300

agtattctat cagcataaat ggaggacctt tctaaactaa actttgaatc gtctccagca 3360

gcatcctcgc ataatccttt tgtcatttcc tctatgtcta ttgtcactgt ggttggcgca 3420

tcaagagtcg tccttctgta aaccggtaca gaattcctac cactagaagc ttgaaatggg 3480

gagggtttca gctttgtatc ccgatactgt gctttaaaaa gggagtccaa actgaaatct 3540

ttttcggaat cattggatga tacctctgta ttagatctcc tatgtatcgg tttcctcggg 3600

tagatagaac ttcactcatc aacattatga tctttgtcga aaagtatcaa ttgaaacatt 3660

gccgctctgg ctctttcctt ggtgtccgtg ttgtcgcttt caaaactcaa tttcttgata 3720

acatcataaa atccatcttt aattagcttc aacgctcttg atctaggtgc tcgcatcttc 3780

ttgaaatgtt catcggaagt tagctcattc aagtacccaa catttatttc ttcttcaata 3840

gtttccatat ccatttcaac atctgaatct tccagatctg aagatgtatc gtccttccat 3900

gttaagttgg taactatcca aatacatgat atcatcagat ctttatggaa agcggcccat 3960

tcggaggaga ccccttctat ttcttgtact aaaggagtct ccaataacat ataaatgaag 4020

tcgagcaatt cttgattaca aataatcatt gatctgttat cttcattaga ggccgcaaaa 4080

tggaccagga tataagtgat agcaagaata acctcataag tttctgattc ctttctttta 4140

ctaatgtcat cctcctttaa tgtggatgat aaactcttca aattttttaa tagaaaattc 4200

aaaaaatctt tatcatcgtg agcttttgct gtcgggtcgg aacagaatga ttgaatgatt 4260

ttgttcgaat agttaagagg accacaggac aagtttcgga taatattgaa tgctttttct 4320

tgaatttgca gatttgaaga ataacaaagt tcaaaaattc ttgataaagg aactttgtcc 4380

aaaaataatt ttttatcaat gatatcatcc ccgtaaaggt aatttctaag aattgataag 4440

gcattgttct ttaagaattc aaactcgttc tcttccgaaa caaaataaga caataccttc 4500

aggaagtctt cattgaaaac attttttttc aaagaactat attccaccac acaattggaa 4560

atgattccca aaactagcgc cttgagtgta atttcgtcat ccaataagga cgcacaagat 4620

tcattgtctc ttgaaaccaa aggcagtttc accagatccg tcaacgagtt tacaatgttt 4680

aaatctacta ggtaggttcg tagcaatggc gctgatcgtg acagtgagcg gaccaagtac 4740

agagcagacg tcgtgattat gtttgaaagc ttcaatatag taatgacatg atccaactga 4800

gcagcgtctc cgtgaaactc tttgtaaata ttaatatgac gagagacaag atccaccaca 4860

cattctgaaa caatgtcatg cttgataatt tcatctctgt attcctcgtt gttggacgtt 4920

a 4921

<210> 20

<211> 2023

<212> DNA

<213> Artificial sequence

<400> 20

ggtaccgcag tttaatcata gcccactgct aagccagaat tctaatatgt aactacgtac 60

ctttcctttt aataaatgat ctgtattttc cacctagtag cagatcaaat tgttcaactt 120

taagtctttg gtccctcaag cgagagaact tgcgatgaca ctcaggagtg ccataaaagc 180

cagaacctca aaaggactga tcggagctgt tattatagcc tcaataatat ttttcaccac 240

agtaaccttc tacgatgaaa gcaaaattgt cggcataata agagtttctg atacttatac 300

aggccatagc gctgtatctt caactttcaa tgcttcttcc gttgttagtg acaacaagat 360

caacggatat ggacttcctt tgattgacac ggaatcaaat agccgttatg aggatccaga 420

cgatatttcc attgaaaacg aattgcgcta tagaattgcc caatctacca aagaggaaga 480

aaacatgtgg aaactcgata ccactctcac ggaagcaagc ttgaaaatcc ccaacataca 540

gtcgtttgag ctgcagccgt tcaaagaaag acttgataat tcactttaca attctaagaa 600

cataggaaac ttttacttct atgacccaag gcttacattc tcagtttact tgaagtatat 660

caaggataaa ttggcctctg gaagcacaac aaatcttaca atacccttca actgggcaca 720

ttttagagat ttatcgtcac tgaatcctta tttggacata aaacaagaag ataaggtcgc 780

atgtgattac ttttatgaat caagtaataa agacaaacga aaacccacgg gtaactgtat 840

tgagtttaaa gatgttcgtg atgagcacct gatacagtat gggatttcat caaaagacca 900

tctacctggt ccttttattt taaagtcact tggaattccc atgcagcata cagccaagcg 960

actggaatca aatctttatc tattaaccgg tgcgccagtt ccacttgcgg ccgcacttta 1020

ctttcttggt attggaattc attgatgttc ccttgggatt atgatattga tgtgcaaatg 1080

ccaatcaaga gtttgaacaa tctatgtgct aacttcaacc aatcattaat aattgaggat 1140

cttactgaag gatattcttc ttttttcttg gattgcggat caagtatcac gcatagaaca 1200

aaaggcaaag gattaaactt cattgatgca agattcataa atgttgaaac aggcctttat 1260

atcgatatca ctggattaag taccagtcag tcagctcgac cgccaaggtt tagtaacgct 1320

tcgaagaaag atcctattta caattgcagg aataatcatt tctactctca taacaatata 1380

gcacctctca aatacacgtt gatggagggg gttcccagtt tcattcctca acagtatgaa 1440

gaaatattga gagaggagta tacaactggt ttgacttcga aacactacaa cggcaacttt 1500

tttatgactc aattgaattt gtggcttgaa agagatccaa tgctagcact tgtgccttca 1560

tccaaatacg aaattgaagg tggaggggtg gaccataaca agattatcaa gtctattctt 1620

gaactttcca acatcaaaaa attggaattg ttggatgata atcccgatat attagaggag 1680

gtgatcagga catacgaact gacttccatt caccataaag agatgcagta tctttccagt 1740

gtcaaaccag atggggacag gtccatgcag tcaaatgaca taaccagttc ttaccaggag 1800

tttctagcaa gtctgaagaa attccagcct ttacgcaaag atttgttcca atttgagcgg 1860

atagaccttt ctaagcatag aaaacagtga gcagccgttt tgcctaaaat gttccagaaa 1920

ctataggata aatatataca gtaatgaatt aggtgatgtt agcatttagt ccccaaaaat 1980

acctcgaatc tccagctcca tagcgcaaaa tctcggatct aga 2023
